Testing
Conventions for the test suite, coverage thresholds, and how to run a single test or the whole thing.
The test suite runs on bun test (no Jest, no Vitest). It uses the describe/it/expect API from bun:test and lives entirely under __tests__/.
Run
# All tests
bun test
# A single file
bun test __tests__/homeassistant/light-count.test.ts
# A pattern
bun test --test-name-pattern "control_light"
bunfig.toml auto-preloads test/setup.ts for every test, which sets HASS_TOKEN and JWT_SECRET to placeholder values so the config validation doesn’t trip on missing env vars.
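As a rough illustration of what such a preload file might contain, here is a minimal sketch. The placeholder values and the choice to leave already-set variables alone are assumptions for illustration; the project's actual test/setup.ts may differ.

```typescript
// Sketch of a preload file such as test/setup.ts.
// The values below are hypothetical placeholders, not real credentials.
const placeholderEnv: Record<string, string> = {
  HASS_TOKEN: "test-hass-token",
  JWT_SECRET: "test-jwt-secret-at-least-32-characters-long",
};

for (const [name, value] of Object.entries(placeholderEnv)) {
  // Only fill in what is missing, so a value exported in the shell still wins.
  process.env[name] ??= value;
}
```

Because the values are placeholders, tests should only rely on the variables being present, not on what they contain.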
Layout
__tests__/
├── homeassistant/ # mirrors src/tools/homeassistant/
│ └── lights.test.ts
├── tools/ # mirrors src/tools/
│ └── search-entities.test.ts
├── integration/ # end-to-end tests
│ └── mcp-roundtrip.test.ts
├── mcp/
├── security/
├── speech/
└── ...
The convention: mirror the src/ structure. A tool at src/tools/homeassistant/foo.tool.ts has its test at __tests__/homeassistant/foo.test.ts or at __tests__/tools/homeassistant/foo.test.ts. Both layouts exist, so match the convention already used in the directory you’re contributing to.
Patterns
Mocking the HA client
The HA client talks to a real WebSocket on a real HA instance, which isn’t available in CI. Most tests mock it. The standard pattern:
import { describe, expect, it, mock } from "bun:test";
const mockHassClient = {
  getStates: async () => [{ entity_id: "light.living_room", state: "on" }],
  callService: async () => ({}),
  // ... the methods your tool actually uses
};
const makeContext = () => ({
  hassClient: mockHassClient as any,
  logger: {
    info: () => {},
    warn: () => {},
    error: () => {},
    debug: () => {},
  } as any,
  requestId: "test",
});
as any is the common escape hatch — the ToolContext type is wide and you usually only need a few fields.
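If you also want to assert on what the tool asked Home Assistant to do, one option is a factory that builds a fresh context per test and records service calls. The sketch below is only an illustration: FakeHassClient, makeTestContext, the calls array, and the (domain, service, data) signature of callService are assumptions, not the project's real types.

```typescript
// Hypothetical shapes for illustration; the real HassClient and ToolContext
// types live in the project and are wider than this.
type ServiceCall = { domain: string; service: string; data: Record<string, unknown> };

interface FakeHassClient {
  getStates(): Promise<Array<{ entity_id: string; state: string }>>;
  callService(domain: string, service: string, data: Record<string, unknown>): Promise<unknown>;
}

// Builds a new context on every call so no state leaks between tests.
function makeTestContext(overrides: Partial<FakeHassClient> = {}) {
  const calls: ServiceCall[] = [];
  const hassClient: FakeHassClient = {
    getStates: async () => [{ entity_id: "light.living_room", state: "on" }],
    callService: async (domain, service, data) => {
      calls.push({ domain, service, data }); // record the call for later assertions
      return {};
    },
    ...overrides, // a test can replace individual methods
  };
  const noop = () => {};
  return {
    hassClient,
    logger: { info: noop, warn: noop, error: noop, debug: noop },
    requestId: "test",
    calls,
  };
}
```

A test would then call makeTestContext(), pass the result to the tool (casting where the real ToolContext type demands it), and check ctx.calls afterwards.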
Testing the tool directly
it("turns the light on at the requested brightness", async () => {
  const tool = new ControlLightTool();
  const result = await tool.execute(
    { entity_id: "light.living_room", action: "turn_on", brightness: 200 },
    makeContext(),
  );
  expect(result.state).toBe("on");
});
No HTTP, no server, no transport. The tool’s contract is: given a valid input and a context, produce a result. Test the contract.
Testing the transport
For tests that exercise the full HTTP or WebSocket surface, see __tests__/integration/. The pattern is to boot the server in-process, hit it with fetch or a WebSocket client, and assert on the response.
These tests are slower and more brittle. Use them sparingly — prefer tool-level tests for logic, integration tests for plumbing.
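The boot, request, assert, shut down cycle can be sketched as follows. This uses a stand-in server built on node:http rather than the project's real server, and startStandInServer, checkHealth, and the /health route are hypothetical names chosen for the example.

```typescript
import { createServer } from "node:http";
import type { AddressInfo } from "node:net";

// Hypothetical stand-in for the project's server; the real integration tests
// boot the actual MCP server instead.
function startStandInServer(): Promise<{ url: string; stop: () => Promise<void> }> {
  const server = createServer((req, res) => {
    if (req.url === "/health") {
      // "connection: close" avoids lingering keep-alive sockets at shutdown.
      res.writeHead(200, { "content-type": "application/json", connection: "close" });
      res.end(JSON.stringify({ status: "ok" }));
      return;
    }
    res.writeHead(404, { connection: "close" });
    res.end();
  });
  return new Promise((resolve) => {
    // Port 0 asks the OS for a free port, so parallel runs don't collide.
    server.listen(0, "127.0.0.1", () => {
      const { port } = server.address() as AddressInfo;
      resolve({
        url: `http://127.0.0.1:${port}`,
        stop: () => new Promise<void>((done) => server.close(() => done())),
      });
    });
  });
}

// Boots the server, makes one request, and always shuts the server down.
async function checkHealth(): Promise<{ status: number; body: unknown }> {
  const server = await startStandInServer();
  try {
    const res = await fetch(`${server.url}/health`);
    return { status: res.status, body: await res.json() };
  } finally {
    await server.stop();
  }
}
```

Binding to port 0 and stopping the server in a finally block are the two habits that tend to keep this style of test from interfering with other tests.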
Coverage
bunfig.toml declares the coverage thresholds:
[test]
coverage = true
coverageThreshold = { statements = 0.8, lines = 0.8, functions = 0.8, branches = 0.7 }
To see the report:
bun test --coverage
The command prints a per-file summary to the terminal. For the line-by-line view, open coverage/index.html in a browser if your setup generates an HTML report.
The thresholds are enforced in CI (see .github/workflows/). A PR that drops coverage below the thresholds fails the build. If you’re removing a feature, also remove its tests (otherwise coverage is artificially high and the next person gets a rude surprise). If you’re adding a feature, add a test.
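To make the threshold check concrete, here is a small sketch of the comparison using the numbers from bunfig.toml. It is an illustration of the arithmetic only, not Bun's implementation; metricsBelowThreshold is a hypothetical helper, and treating a value exactly at the threshold as passing is an assumption.

```typescript
type Metric = "statements" | "lines" | "functions" | "branches";
type Coverage = Record<Metric, number>;

// The thresholds declared in bunfig.toml, as fractions of 1.
const thresholds: Coverage = { statements: 0.8, lines: 0.8, functions: 0.8, branches: 0.7 };

// Returns the metrics whose measured coverage falls below the limit.
// Assumption: a value exactly equal to the limit counts as passing.
function metricsBelowThreshold(measured: Coverage, limits: Coverage = thresholds): Metric[] {
  return (Object.keys(limits) as Metric[]).filter((metric) => measured[metric] < limits[metric]);
}

// 79% line coverage misses the 80% limit; every other metric passes.
const failing = metricsBelowThreshold({
  statements: 0.85,
  lines: 0.79,
  functions: 0.9,
  branches: 0.7,
});
console.log(failing); // ["lines"]
```

A single metric under its limit is enough to fail, so a change that adds many untested branches can break the build even when line coverage looks healthy.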
Conventions
- One describe per file, named after the unit under test (describe("ControlLightTool", ...)).
- One it per behavior, with a sentence-style name (it("returns 0 when no lights are on", ...)).
- Use expect().toBe() for primitives, expect().toEqual() for objects/arrays.
- For async assertions, use await expect(...).resolves.toBe(...) or await expect(...).rejects.toThrow(...).
- Don’t use done callbacks. bun:test handles promises natively.
- Clean up with afterEach(() => mock.restore()) if you used mock().
Mocking fetch / ws
The HA client uses fetch for some calls and ws for the WebSocket. To mock them:
import { afterEach, it, mock } from "bun:test";
// Keep a reference to the real fetch so it can be restored after each test.
const originalFetch = globalThis.fetch;
afterEach(() => {
  globalThis.fetch = originalFetch;
});
it("...", async () => {
  globalThis.fetch = mock(() =>
    Promise.resolve(new Response(JSON.stringify({ ok: true }))),
  ) as any;
  // ...
});
For WebSocket, prefer mocking the HassClient directly — the ws library doesn’t have a clean mock surface.
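For the fetch half, the save, replace, and restore steps above can be wrapped in a small helper so each test states only the response it needs. This is a sketch under assumptions: stubFetch and its return shape are hypothetical, it does not use bun:test's mock(), and the URL in the usage lines is made up.

```typescript
type FetchCall = { url: string; init: RequestInit | undefined };

// Replaces globalThis.fetch with a stub that always answers with `body` as JSON.
// Returns the recorded calls and a restore() function for afterEach.
function stubFetch(body: unknown, status = 200) {
  const original = globalThis.fetch;
  const calls: FetchCall[] = [];
  globalThis.fetch = (async (input: RequestInfo | URL, init?: RequestInit) => {
    const url = input instanceof Request ? input.url : String(input);
    calls.push({ url, init }); // recorded synchronously, before the promise resolves
    return new Response(JSON.stringify(body), {
      status,
      headers: { "content-type": "application/json" },
    });
  }) as typeof fetch;
  return {
    calls,
    restore: () => {
      globalThis.fetch = original;
    },
  };
}

// Usage: stub, exercise the code under test, inspect the calls, restore.
const realFetch = globalThis.fetch;
const stub = stubFetch({ ok: true });
void globalThis.fetch("http://ha.local/api/states"); // hypothetical URL
stub.restore();
```

Calling stub.restore() from afterEach keeps the stub from leaking into the next test, the same way the originalFetch pattern does.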
Disabling tests temporarily
If you need to skip a test during development (e.g. it’s flaky in CI), use it.skip(...) or describe.skip(...). Don’t comment it out — skip is visible in the test report.
Common pitfalls
- bunfig.toml is read once. If you change the preloaded test/setup.ts, restart bun test.
- test/setup.ts sets placeholder env vars. Don’t write a test that asserts on a specific token value; read from the test’s own setup.
- Don’t share state between tests. Each test should construct its own tool instance and context. The HA client mock is cheap to recreate.
- The coverage report lags by one test run. If you just added a test and the coverage report is unchanged, run bun test --coverage again.
Next
- Adding a Tool — the full contribution flow.
- Architecture > Tool System — what the tests are testing.