LLM evaluation done wrong: why one eval setup can't answer three different questions
LLM evaluation in production is three different problems bundled into one confused setup. Here's how to separate them, and what each one actually needs.
Most DPDP guidance is written for compliance officers. This is the engineering version: schema migrations, consent state machines, retention jobs, and audit patterns for a defensible Indian SaaS codebase.
Three isolation levels, three distinct failure modes. Most Postgres deployments run at Read Committed without knowing it. Here is what each level permits and what upgrading actually costs.
ONDC crossed 218 million transactions in FY 2025-26. But mobility now drives over half of all orders, and retail — the segment the protocol was built to democratise — peaked in October 2024 and has been falling since.
India hosts the second-largest SaaS ecosystem outside the US. The raise-or-bootstrap question has a different answer in 2026 than it did in 2021. Here's the data behind the shift.
AI didn't kill the take-home coding assignment — it exposed what was always wrong with it. Five alternatives that actually predict whether someone can do the job.
Most engineering take-homes broke when AI tools arrived. But the format was already measuring the wrong thing. Here's how to redesign the rubric so the assessment holds up.
Legal is not reviewing every npm install — you are. Here is the practical check to run before adding a dependency, and the licence type that catches most SaaS teams off guard.
Three years after the GPT-4 wrapper wave, a handful of AI companies are thriving and most are gone. The split was not random — and the pattern tells you something useful about building on top of LLMs in 2026.
Giving an LLM access to your database is easy. The problem is that your application-layer RBAC is invisible when the model generates SQL. Here's where it goes wrong and how to fix it at the layer that enforces.
Conventional wisdom says hire when the motion is repeatable. But at 30 customers, most founders can't yet articulate what makes their deals close, and hiring into that gap costs far more than the salary.
The Redis licence change created a three-way choice. Most teams are making it based on benchmarks. The real decision factors are licensing risk, ecosystem backing, and cloud-provider alignment.