About

Who you're hiring.

Operator-built, not agency. I run the same routing stack on my own consumer app — Bailar, a Latin dance discovery app on iOS, Android, and bailar.site. The audit is the same playbook that cut my LLM bill ~60% versus the OpenAI-only baseline I started with.

Paul Plawin

I'm a solo founder. Bailar classifies ~40,000 events per month with text LLMs and runs ~10,000 vision-model calls per month for photo validation, hero ranking, and OCR. The whole stack is built and maintained by one person, which means every cent on the LLM bill is mine to feel.

The current routing — Groq → Cerebras → Anthropic for plain text, Vertex Gemini Flash for vision, Anthropic Sonnet for brand-visible writing, with retry/fallback on every chain — is the result of a year of trial and error. The audit is that learning, productized for other AI startups.

Why this audit exists

I burned a long weekend last spring on a Google Cloud free-tier suspension that was entirely my fault — multi-account quota stacking on Vertex during a scrape run. Re-architecting around it forced me to actually understand how the multi-provider routing worked: what quality you keep, what you concede, where the failover thresholds should land.

The pattern I came out with cuts LLM spend by 40–60% for most seed-to-Series-A AI startups. The blockers aren't technical; they're "we haven't had time to look at it." That's exactly the gap this audit fills: one engineer-week of someone else's time, fixed price, refund-backed.

What I won't do

No SaaS pitch
I won't try to sell you a recurring tool license. The audit is a fixed deliverable. You own the code. Done.
No 6-week agency timeline
If a routing audit takes a month, the consultant is padding. Mine is a week because that's the actual work.
No quality regression to chase savings
Cheaper providers that score worse on your traffic do not get recommended. Brand-visible writing stays on Anthropic for that reason.
No surprise fees
$5,000 is the audit. $15,000 is the optional turnkey implementation. There is no third invoice.

Reach me

Email: paul@aimargin.dev

LinkedIn: linkedin.com/in/paulplawin

Bailar: bailar.site (the consumer app whose stack the audit is based on)

Ready when you are

A 15-minute call to confirm fit before either side commits. No deck, no questionnaire — just a conversation about what your stack looks like today.

Book a 15-min discovery call
