Description
The hourly PollingTriggerCleanerWorkflow activity CleanOldPollingTriggerRunsActivity has been failing on prod_cluster (service thermos-polling-triggers) with context deadline exceeded on every run for at least the past 6 hours. The failure was surfaced today by the cron-dd-errors Datadog poll: 50+ errors in 6h, caller workers/batched_polling_trigger_cleaner.go:64, message "Failed to get count of old polling trigger runs".
Root cause: GetOldPollingTriggerRunsInfo issues two upfront COUNT(*) queries against polling_trigger_runs (the 37.8M-row / 43GB-bloat table from the Feb 2026 incident). The activity's per-call DB timeout was 15s, far too tight for those scans. Recent timeout reconciliations (#9572 and b694e3a435 from 2026-05-11) only adjusted StartToCloseTimeout, leaving the per-call DB budget unchanged.
Fix: bump per-call DB timeout from 15s to 120s, matching webhook_trigger_cleaner.go which already uses 120s for its analogous cleanup COUNTs. StartToCloseTimeout stays at 300s.
How did I test this PR
cd apps/thermos && go build ./workers/... — clean
go vet ./workers/... — clean
go test -count=1 -run TestPollingTriggerCleaner -timeout 90s ./workers/ — ok github.com/composio/hermes/apps/thermos/workers 0.291s
Runtime verification on local thermos isn't feasible: this activity is driven by a production Temporal schedule against the 37.8M-row prod table, and the local registry DB has no equivalent data. The change is a one-line per-call timeout bump aligned with an existing precedent in the sibling cleaner.
Origin: cron-dd-errors / zen-cron-80a85847aa8b
Triggered by: Cron: Datadog Error Polling | Source: unknown
Session: https://zen-api-production-4c98.up.railway.app/dashboard/#/chat/zen-cron-80a85847aa8b