Description
The learning pipeline was failing on ~28 sessions per day due to oversized prompts sent to OpenAI's GPT-5.2 API:
- 17 token limit exceeded errors: Sessions with huge payloads (up to 2.3M tokens against the 272K-token limit)
- 9 JSON parse errors: Control characters in session data causing invalid JSON in the API request body
- 2 string too long errors: Message content exceeding OpenAI's 10MB string limit
These failures waste API quota and add noise to the logs. The root cause is that flex_chat_completions_parse() had no pre-flight size checks: it sent every request regardless of size, so the errors surfaced only after the API rejected them.
Fix: Added three pre-flight checks to flex_chat_completions_parse():
- Message content size check — rejects messages > 10MB (OpenAI's string limit)
- Estimated token count check — rejects prompts > 250K estimated tokens (conservative estimate using chars/4, leaving headroom under the 272K context window)
- Control character sanitization — strips C0/C1 control characters (NUL, etc.) from message content before sending, preventing JSON serialization errors
These checks log a warning and return None early, matching the existing error-handling pattern. Sessions that are too large are simply skipped (workflow-only results are still saved when error analysis is skipped).
How did I test this PR
- Verified the fix handles all three observed error types from Datadog logs (24h scan)
- Ran ruff check and ruff format --check; all passing
- Reviewed the 28 error instances in Datadog to confirm they all fall under the three categories addressed
Origin: cron-5d55c321e47a / zen-cron-7bf66f8ab915
Triggered by: dhawal@composio.dev