Weekly PII-scan cron found two real production PII leak patterns in mercury-logs (last 24h):
access_token leaks (~20 entries / 24h)
The outer log line at mercury/utils/http.py:397 (Error executing {method} {url}, status_code=..., message=..., | <traceback>) already calls _sanitize_url(url) on the URL argument. The traceback in the trailing {} goes through _sanitize_traceback(traceback.format_exc()), which previously redacted only plaintext emails. Inside the traceback, requests.exceptions.HTTPError embeds the failing URL a second time as part of its message string:
requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: https://graph.facebook.com/v24.0/act_xxx/insights?access_token=EAAg89W57ZC8cBQZBrCntArl...
That access_token= value was leaking verbatim into Datadog, repeated dozens of times per failing tool call.
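To make the gap concrete, here is a minimal sketch, assuming the previous traceback pass did nothing more than mask plaintext emails. EMAIL_RE, email_only_sanitize, the placeholder string, and the FAKE_TOKEN_123 value are illustrative assumptions, not the code in mercury/utils/http.py.

```python
import re

# Hypothetical stand-in for an email-only traceback pass (not the real helper).
EMAIL_RE = re.compile(r"[A-Za-z0-9._+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def email_only_sanitize(tb: str) -> str:
    """Mask plaintext emails and nothing else."""
    return EMAIL_RE.sub("[REDACTED_EMAIL]", tb)


# Shaped like the HTTPError message above, with a made-up token.
traceback_text = (
    "owner=alice@example.com\n"
    "requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: "
    "https://graph.facebook.com/v24.0/act_xxx/insights?access_token=FAKE_TOKEN_123\n"
)

out = email_only_sanitize(traceback_text)
email_masked = "alice@example.com" not in out
token_leaked = "access_token=FAKE_TOKEN_123" in out
print(email_masked, token_leaked)  # True True
```

The email is masked as intended, but the token passes through untouched because nothing in an email-only pass looks at query strings.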
URL-encoded emails (%40) in Google Calendar 404 tracebacks
Same code path. Calendar IDs that look like email addresses get URL-encoded by requests before the GET, so the URL inside the traceback looks like:
requests.exceptions.HTTPError: 404 Client Error: Not Found for url: https://www.googleapis.com/calendar/v3/calendars/neseatraveldesign%40gmail.com/events?...
urlsplit sees no query param matching email/etc, and _EMAIL_RE doesn't match %40, so the email survived both the URL sanitizer and the traceback sanitizer.
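Both leaks share a shape: the sensitive value sits inside a URL that the traceback repeats. A minimal sketch of a URL-aware traceback pass that would catch both is shown here. Everything in it is an assumption for illustration (URL_RE, URL_ENCODED_EMAIL_RE, SENSITIVE_PARAMS, the placeholder strings, and the function bodies); it is not the actual mercury/utils/http.py implementation.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# All names and patterns below are hypothetical.
SENSITIVE_PARAMS = frozenset({"access_token", "api_key", "token", "email"})
URL_RE = re.compile(r"https?://[^\s'\"<>]+")
EMAIL_RE = re.compile(r"[A-Za-z0-9._+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
URL_ENCODED_EMAIL_RE = re.compile(r"[A-Za-z0-9._+-]+%40[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def sanitize_url(url: str) -> str:
    """Redact sensitive query values and percent-encoded emails in the path."""
    parts = urlsplit(url)
    query = [
        (key, "[REDACTED]" if key.lower() in SENSITIVE_PARAMS else value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
    ]
    path = URL_ENCODED_EMAIL_RE.sub("[REDACTED_EMAIL]", parts.path)
    # safe="[]" keeps the placeholder readable instead of percent-encoding it.
    return urlunsplit(
        (parts.scheme, parts.netloc, path, urlencode(query, safe="[]"), parts.fragment)
    )


def sanitize_traceback(tb: str) -> str:
    """Run the URL pass first, then the residual email passes."""
    tb = URL_RE.sub(lambda m: sanitize_url(m.group(0)), tb)
    tb = URL_ENCODED_EMAIL_RE.sub("[REDACTED_EMAIL]", tb)
    return EMAIL_RE.sub("[REDACTED_EMAIL]", tb)


token_tb = (
    "requests.exceptions.HTTPError: 401 Client Error: Unauthorized for url: "
    "https://graph.facebook.com/v24.0/act_xxx/insights?access_token=FAKE_TOKEN_123&level=ad"
)
calendar_tb = (
    "requests.exceptions.HTTPError: 404 Client Error: Not Found for url: "
    "https://www.googleapis.com/calendar/v3/calendars/someone%40example.com/events?maxResults=10"
)

clean_token_tb = sanitize_traceback(token_tb)
clean_calendar_tb = sanitize_traceback(calendar_tb)
print(clean_token_tb)
print(clean_calendar_tb)
```

In this sketch the status text (401 Client Error, 404 Client Error) and non-sensitive parameters survive, while the token value and the encoded address are replaced. Re-encoding the query through parse_qsl and urlencode can normalise other parameters slightly, which is acceptable for a log line but worth knowing.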
- Add _URL_RE and run it across the traceback string before the email pass. Every URL-shaped substring is routed through _sanitize_url, so traceback URLs get the same redaction as the caller-side URL log.
- Add _URL_ENCODED_EMAIL_RE and apply it in both _sanitize_url (path-side %40 emails) and _sanitize_traceback (residual %40 emails outside URLs).
Per-log-line justification:
- mercury/utils/http.py:398 (logger.error("Error executing {} {}, status_code=...", ...)): same call site, but its 5th argument now has URLs inside the traceback redacted in addition to emails. Status text (401 Client Error, 404 Client Error) is preserved for debugging.
- mercury/tools/api/tool.py:295,315,402,474,586,656,669,725 (8x _sanitize_traceback(traceback.format_exc())): same call sites, no behavioral change for tracebacks without URLs (regression-tested via test_passthrough_when_no_email and test_preserves_non_url_text).
Found but not fixed in this PR (separate follow-up needed):
- apps/workspace/config.ts:11 contains a hardcoded Bearer token (fo1_uS_nEVGzkLlEBCj5pv2xAelMrd60kMXIoJ5J2LSB89U) for Fly.io's api.machines.dev. The auth scheme is marked is_deprecated: true, but rotating a real third-party API token is an integration-correctness change that needs Cortex-side validation, so it is out of scope for this log-sanitization-only cron PR.
- A log line of the form Already Preprocessed Metadata: {... 'Authorization': 'Bearer eyJ...', 'composio_api_key': 'ak_...'} is leaking full Atlassian JWTs and Composio API keys via mercury-logs. That string does not exist anywhere in current Mercury master (grep of mercury/, apps/, and the full git history all return nothing), so it must come from a Cortex-injected debug print or from an older deployment or bundled tool. Filing this for Cortex/platform follow-up; it cannot be fixed in this repo.
This is the 6th time _sanitize_url / _sanitize_traceback have been touched (PRs #18398, #20502, #21006, #21884, #22460, this one). PR #22460 added regression tests as a structural guard, which held this run: both helpers were intact at the start, and the fix is purely additive for new patterns. To keep concurrent PRs from degrading the helpers in the future, please add mercury/utils/http.py and tests/test_utils/test_http.py to CODEOWNERS so any change requires explicit security/platform review.
- pytest tests/test_utils/test_http.py -v -> 152 passed in 0.72s (7 new tests, all existing 145 still green).
- TestSanitizeTraceback::test_strips_access_token_from_embedded_url (uses the actual leaked Facebook URL prefix EAAg89W57ZC8cBQZBrCntArl... and asserts it does not survive)
- TestSanitizeTraceback::test_strips_url_encoded_email_in_traceback_url (uses the actual leaked googleapis.com calendar URL with neseatraveldesign%40gmail.com)
- TestSanitizeTraceback::test_strips_api_key_from_embedded_url, test_preserves_non_url_text
- TestSanitizeUrl::test_redacts_url_encoded_email_in_path
- TestSanitizationUtilitiesExist::test_helpers_are_importable updated to include new regexes
- ruff check mercury/utils/http.py tests/test_utils/test_http.py -> All checks passed.
- ruff format mercury/utils/http.py tests/test_utils/test_http.py -> clean.
Origin: cron-a3f8b1896f41 / zen-cron-f3311c18890d
Triggered by: rahul.lingala@composio.dev | Source: cron-a3f8b1896f41