Description
Datadog queries filtering by @project_id return zero Mercury hits today because Mercury never receives or logs the calling project's nanoId. Filtering by @request_id works (correlation succeeds), but agents and oncall can't pivot from a project to the Mercury logs that belong to it.
This threads project_id from Apollo → Thermos → Mercury at the Lambda invocation boundary, so every Mercury op (execute_action, text_to_request, poll_payload, setup_webhook, refresh_webhook, fetch_authz_entities, …) gets @project_id correlation automatically without per-op plumbing.
How it works
The plumbing was almost entirely in place already; only the Lambda-layer consumer was missing.
- Apollo (apps/apollo/src/common/lib/external/thermos.ts:89-97): already sets the x-client-project-id: <pr_nanoId> header on every Thermos call via the request interceptor, reading from authInfo.project.id. Unchanged.
- Thermos middleware (apps/thermos/middleware/logging.go:108): already reads the x-client-project-id header onto the request context as ProjectIDKey. Unchanged.
- Thermos Lambda boundary (apps/thermos/lib/awsutils/lambda.go): new injectProjectIDIntoPayload(ctx, payload) helper. It reads ProjectIDKey from the context and adds a top-level project_id field to the outgoing Lambda payload JSON. Called from both the Invoke (AWS) and InvokeWithHttp (local lambdad) paths.
- Mercury (companion PR ComposioHQ/mercury#23472): reads the top-level event["project_id"] and adds it to request_ctx so the Datadog sink emits @project_id on every log record.
Best-effort semantics
injectProjectIDIntoPayload is a log-correlation helper, not a correctness primitive. On any failure (missing/blank context value, "N/A" sentinel, invalid JSON, marshal error) it returns the payload unchanged. Log correlation must never break the underlying Lambda call.
On the identifier convention
Apollo already logs @project_id as the nanoId today — apps/apollo/src/common/lib/internal/logger.ts:124 ([DATADOG_TAGS.PROJECT_ID]: authInfo?.project.id). The user's explicit Datadog query was @project_id:pr_Gv-uXCTsDSJI (nanoId). This PR sticks with that convention. xGroup (autoId) remains the routing/grouping header; x-client-project-id (nanoId) remains the log-correlation identifier. Two headers, two purposes.
Backwards compatibility
- No wire-format additions in user-facing APIs. The header has been in flight from Apollo for some time already.
- Older Thermos receiving calls from newer Apollo: the header is silently ignored at the Lambda boundary. Mercury logs miss @project_id (today's behavior).
- Newer Thermos receiving calls from older Apollo: the context lacks the header value, so injectProjectIDIntoPayload no-ops. Mercury logs miss @project_id (today's behavior).
- The Mercury PR is independently a no-op when the metadata key is absent.
How did I test this PR
- 5 new table-driven tests in apps/thermos/lib/awsutils/lambda_test.go::TestInjectProjectIDIntoPayload covering: happy path with nanoId, missing context, empty string, "N/A" sentinel, invalid JSON best-effort fallback. All pass.
- cd apps/thermos && go build ./... && go vet ./... && go test ./handlers/... ./lib/... (full Thermos suite passes)
- cd apps/apollo && doppler run -- pnpm check-types (clean)
- cd apps/apollo && doppler run -- pnpm lint <changed files> (0 errors, 0 warnings)
- Companion Mercury PR has a dedicated test_project_id_logging.py (6 tests) for both ingress paths + the Datadog sink emit/omit semantics
E2E verification plan post-merge: once both PRs land and Mercury redeploys, fire a tool execution and verify https://app.datadoghq.com/logs?query=@project_id:pr_<id> returns Apollo + Thermos + Mercury log lines for the same request. Local E2E was not feasible — the sandbox's local Thermos requires a workerdb that isn't bootstrapped here.