Stack: Batch Revoke — Thermos side · PR 3 of 3
We're adding batch-revoke support to the platform: the ability to revoke every connected account in a scope (org / project / auth config / connected account) asynchronously, plus opt-in revoke-on-delete on the public DELETE endpoints. The work runs as a Temporal workflow on Thermos that calls Apollo's single-connection revoke one account at a time, durably recording each per-connection outcome.
This PR is the top of a 3-PR stack landing the Thermos side of that machinery
(the Apollo side — jobs table, public endpoints, DELETE cascade — follows
separately).
Stack (merge bottom-up):
- sarthak/batch-revoke-1-workerdb: revoke_job_item workerdb table + store
- sarthak/batch-revoke-2-workflow: RevokeBatchWorkflow + activities
- sarthak/batch-revoke-3-endpoints: enqueue/poll HTTP endpoints ← this PR

Base: sarthak/batch-revoke-2-workflow; review/merge the lower two first.
The batch-revoke primitive runs as a Temporal workflow on Thermos, but had no HTTP surface for a caller to start a scoped revoke job or read its outcome. These two endpoints are that surface — and the only new Thermos routes the feature needs, since retry reuses poll's failed-filter plus the enqueue path.
What:
- POST /api/jobs/revoke: stage the connection id list in the object store and start
  (or 409-echo the in-flight) RevokeBatchWorkflow; return
  thermos_job_id = encode(workflow_id, run_id).
- GET /api/jobs/revoke: resolve the run from thermos_job_id, derive status from
  the Temporal workflow, and page the revoke_job_item result ledger
  (filter=all|failed) once completed.
- Add the opaque keyset cursor codec in lib/batchrevoke (EncodeCursor /
  DecodeCursor) with table-driven tests; wire routes via SetupJobsRevokeRoutes
  and fold NewRouter into ProvideRouter.

How:
- Enqueue starts the workflow with ALLOW_DUPLICATE reuse + FAIL conflict policy; the
  lost-race case echoes the RunId carried on WorkflowExecutionAlreadyStarted, with
  no extra lookup.
- thermos_job_id is the self-describing encode(workflow_id, run_id), so poll
  resolves the exact run via a direct DescribeWorkflowExecution (NotFound past
  retention maps to completed), with no search attribute.
- Both handlers depend on AuthRefreshTemporalClient and the object store, and fail
  closed (503) when either is unavailable.

Automated: apps/thermos/lib/batchrevoke/cursor_test.go has table-driven tests for
the opaque keyset cursor codec (EncodeCursor ⇄ DecodeCursor round-trips, malformed
tokens rejected rather than silently mis-paging).
The enqueue/poll handlers themselves are not yet covered by automated tests (the
seam crosses Temporal + the object store + a real workerdb). I'll run the checklist
below locally first, then re-run the same checklist against staging after
deployment.
Enqueue/poll checklist (local, then staging):
- POST /api/jobs/revoke with a connection-id list → 200 + thermos_job_id;
  the id list is staged in the object store under workflow_id; RevokeBatchWorkflow
  starts.
- POST the same scope while running → 409 echoing the in-flight
  thermos_job_id (decodes to the same run); no second run, no duplicate work.
- POST the same scope after the previous run has completed → new run + new
  thermos_job_id (different run_id).
- GET /api/jobs/revoke?thermos_job_id=… while running → { status: running },
  no results.
- GET … once completed → { status: completed } + first page of
  revoke_job_item results.

Local: drive the two endpoints against a local Thermos + workerdb + localstack S3
object store.
Towards DASH-814