PR Custody

Add LLM Observer skill — generic audit & optimization workflow for any LLM observability platform

@migetech✕ 1 failingchecks…add-llm-observer → master4 files · +720 −0updated 2w ago

▸Description

What this skill does

LLM Observer is a complete, platform-agnostic workflow for auditing, optimizing, and documenting LLM usage through any observability platform.

It guides the user through 7 structured modules:

Initial Setup — connect any app or Claude Code to the chosen platform (proxy, SDK wrapper, or MCP server)
Log & Trace Review — read dashboards, analyze agentic sessions, identify problematic requests
Metrics Analysis — convert raw data into actionable insights across cost, latency, quality, and efficiency
Prompt Optimization — iterate with real production data: one change at a time, measure before/after
Findings Documentation — generate a structured audit report
Claude Code Prompt Generation — turn audit findings into optimized system prompts and CLAUDE.md blocks
Replication — apply the same methodology to other platforms or SaaS services

Platforms covered

Helicone · LangSmith · Langfuse · Braintrust · W&B Weave · Phoenix/Arize · PromptLayer

Each platform has a dedicated setup guide in references/platforms.md with copy-paste code snippets for proxy, SDK wrapper, and MCP server integration methods.

Who uses this workflow

Anyone building or maintaining LLM-powered products who wants to:

Reduce API costs through prompt compression, caching, and model routing
Improve response quality with real production data (not playground guesses)
Debug latency issues in agentic Claude Code workflows
Document findings in a reproducible, team-shareable format

Why this skill fills a gap

Before creating this skill, I searched the skills.sh ecosystem and found zero existing skills for LLM observability, prompt auditing, or cost optimization workflows. This is the first skill in the ecosystem addressing this domain.

Skill structure

llm-observer/
├── SKILL.md                          # Main skill (7 modules)
└── references/
    ├── platforms.md                  # Setup guides for 7 platforms
    └── prompt-optimization.md        # Diagnostic framework + optimization techniques

The skill uses progressive disclosure: description loads on every session (~100 tokens), SKILL.md body loads on activation, references load only when the relevant module runs.

Origin

Built from real audit work on Helicone.ai, then generalized to cover the full ecosystem. Tested with Claude Sonnet in Cowork mode.

By @migetech

loading diff…