📊 Observability & Telemetry
Operating a globally distributed algorithmic trading network demands extreme observability. A silent failure in an order loop or a transient API throttling event can lead to missed targets and unhedged exposures.
This guide outlines our telemetry architecture, covering Cloudflare Workers 100% request sampling, time-series SQL database queries, R2 logging, and automated alarm systems.
⚡ 1. 100% Request Head Sampling
To monitor traffic at Cloudflare’s edge compiler level, every worker in the Hoox monorepo enables native observability tracing inside its wrangler.jsonc:
{
"observability": {
"enabled": true,
"head_sampling_rate": 1, // Captures and logs 100% of all incoming requests
},
}
Metrics Tracked Automatically
Once enabled, Cloudflare captures and graphs these metrics near-instantaneously:
- Requests velocity: Total hits and requests/second rates.
- CPU execution duration (ms): The exact CPU time consumed by the isolate thread.
- Uncaught Exceptions: Crashes or code errors that bypass middlewares.
- Subrequest Counts: Outbound HTTP calls made to exchange APIs or other workers.
💾 2. Cloudflare Analytics Engine (SQL Telemetry)
While Cloudflare collects basic HTTP metrics, Hoox uses Cloudflare Analytics Engine inside analytics-worker to track trading-specific variables (e.g. execution slippage, exchange response delays, fee sums).
Under-the-Hood SQL Queries
The analytics-worker exposes a SQL interface to query the metrics dataset directly from your Next.js Dashboard:
/* Query 1: Calculate the 95th percentile latency of Bybit API order fills */
SELECT
quantiles(0.95)(double1) as p95_latency_ms,
count() as total_executions
FROM hoox_telemetry
WHERE blob1 = 'trade-worker'
AND blob2 = 'bybit'
AND timestamp >= now() - INTERVAL '1' HOUR
/* Query 2: Aggregate error velocities across all edge isolates */
SELECT
blob1 as worker_name,
count() as error_count
FROM hoox_telemetry
WHERE blob3 = '0' -- Success = false
AND timestamp >= now() - INTERVAL '6' HOUR
GROUP BY worker_name
📝 3. Logging & R2 Storage Offloading
Because transactional databases like SQLite D1 have write limits, Hoox implements a dual-tier logging policy:
- High-Value Data: Transaction statuses and fills are written to your D1
tradestable. - Verbose Telemetry: Full, verbose JSON payload exchanges, network headers, and WebSocket stream records are serialized and saved to R2 Storage using date-based key paths:
logs/bybit/BTCUSDT/2026-05-19/order-18049284739.json
This offloading reduces database write volumes by up to 75%, keeping your database compact, fast, and fully within free-tier limits.
🚨 4. Standardized Alarm Systems
If a critical error or execution failure occurs (e.g., an order is rejected due to insufficient margin, or the kill switch is flipped due to drawdown limits):
- The executing worker intercepts the exception using the shared error handler.
- Direct V8 Service Binding pings
telegram-workerinstantly. - You receive an emergency alert on your phone.
// Standard alarm notification trigger
if (orderResponse.status === "Rejected") {
await env.TELEGRAM_SERVICE.fetch("https://telegram-worker/alert", {
method: "POST",
headers: {
"Content-Type": "application/json",
"X-Internal-Auth-Key": env.INTERNAL_KEY_BINDING,
},
body: JSON.stringify({
chatId: env.TELEGRAM_CHAT_ID_DEFAULT,
message: `🚨 <b>Emergency Order Reject!</b>\nExchange: Bybit\nSymbol: BTCUSDT\nReason: Insufficient Margin`,
}),
});
}
Tip: Need to tail live worker logs to debug a custom strategy? Run
hoox logs tail trade-workerin your terminal to open a real-time WebSocket connection to Cloudflare’s global edge logging console!
🔗 Next Steps
- Debugging Runbook — Diagnose local and remote V8 exceptions using diagnostic profiles.
- Self-Healing & Repair — Recover from degraded workers and provision missing bindings.