You don’t really know what’s happening in your app - until your data starts talking back.
In the early days, we were flying blind. Agent requests were going out. Results were coming back. Users were interacting. But we couldn’t see any of it in a way that mattered.
That’s when we brought in BigQuery - not because it was trendy, but because we needed a way to actually understand our product as it grew in complexity. This is the story of how observability became a core layer of how Anjin is evolving.
Why “Did It Work?” Isn’t Enough
Our platform revolves around modular AI agents doing complex tasks - sometimes independently, sometimes as part of chained flows. Success in that kind of environment isn’t binary. It’s not just “did it work?” It’s:
- How long did it take?
- What did the user expect vs what they got?
- Were they trying to game the system or just find the edges?
Without a deeper view, we were relying too much on intuition and not enough on actual behavior. And that meant we were designing features in the dark.
Enter BigQuery: Seeing the Whole System
We’ve started funneling anonymized agent activity and app events into BigQuery - just enough to surface patterns, spot drop-offs, and answer the kind of questions we didn’t know we needed to ask yet.
We’re tracking things like:
- Agent run volumes over time
- Domain swaps per user session
- Latency data for each agent type
- Credit burn patterns and feature bottlenecks
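For a rough sense of what one of these anonymized events looks like before it lands in BigQuery, here’s a minimal sketch in Python - the field names, the hashing scheme, and the salt handling are illustrative, not our production schema:

```python
import hashlib
import json
from datetime import datetime, timezone

def anonymize(user_id: str, salt: str = "illustrative-salt") -> str:
    """One-way hash so events can be grouped per user without storing the raw ID.
    (Salt value here is a placeholder, not how we actually manage secrets.)"""
    return hashlib.sha256((salt + user_id).encode()).hexdigest()[:16]

def build_agent_event(user_id: str, agent_type: str, latency_ms: int,
                      credits_spent: int) -> dict:
    """Shape one agent run as a flat row, ready to stream into an events table."""
    return {
        "user_hash": anonymize(user_id),
        "agent_type": agent_type,
        "latency_ms": latency_ms,
        "credits_spent": credits_spent,
        "ts": datetime.now(timezone.utc).isoformat(),
    }

event = build_agent_event("user-123", "summarizer", latency_ms=840, credits_spent=2)
print(json.dumps(event, indent=2))
```

In practice a row like this would go through BigQuery’s streaming-insert path; the point of the sketch is simply that nothing personally identifying ever leaves the app.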
And it’s paying off. We’ve already caught small bugs hiding inside high-frequency agent loops, and made UI changes based on where users were actually stalling - not where we assumed they would.
Want to dig into what this looks like technically? Hop into the community → happy to chat details.
Observability ≠ Analytics
This isn’t about dashboards for dashboards’ sake. This is about listening.
Observability at Anjin means tuning into the way agents behave in the wild. Not just logging what they do, but understanding why they do it, what it says about the user, and how we can respond in real time or in future releases.
It’s closer to telemetry than reporting. And it’s changing how we build.
We’re seeing:
- Which agents are high-frequency but low-output
- Where latency breaks UX expectations
- When feature limits (like domain caps) drive upgrades - or churn
These are design inputs, not just metrics.
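Most of these questions boil down to simple aggregations over the event stream. As a local stand-in for the kind of query we’d run in BigQuery, here’s the “where does latency break UX expectations” check in Python - the agent names, sample latencies, and 2-second budget are all made up for illustration:

```python
# Flag agent types whose p95 latency blows past a UX budget.
# A BigQuery APPROX_QUANTILES query does the same job at scale.
from statistics import quantiles

UX_BUDGET_MS = 2000  # illustrative threshold, not a real SLO

# Hypothetical latency samples (ms) per agent type
runs = {
    "summarizer": [420, 610, 500, 2900, 480, 530, 455, 610, 2400, 495],
    "domain_swap": [120, 140, 110, 135, 150, 125, 130, 145, 115, 138],
}

def p95(samples: list[int]) -> float:
    """95th percentile via statistics.quantiles (n=20 gives 5% cut points)."""
    return quantiles(samples, n=20)[-1]

slow_agents = {name: p95(ms) for name, ms in runs.items() if p95(ms) > UX_BUDGET_MS}
print(slow_agents)
```

An average would hide the summarizer’s long tail entirely; percentiles are what make the “high-frequency but low-output” and latency questions answerable at all.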
A Quick Note on the Stack
We’ve had a few folks ask, “What’s your stack behind all this?”
Let’s just say it’s modular, secure, and designed to give us full control - especially when it comes to roles, tokens, and execution paths. If you’ve read Inside Anjin #01, you know we’ve had our fair share of growing pains.
We’re happy to share more about how it’s wired - just ask.
What’s Next
- We’re holding monthly unstructured, human conversations with folks building, researching, and breaking ground in this space. Those conversations are starting to spark something we’re calling Dinner with AI. Stay tuned.
- We’re tuning our internal dashboards to surface real agent behavior to admins, not just devs.
- We’re starting to ask harder questions about what it means to be “real-time” in an agent-driven product - and how far we want to take that.
Final Thought: Listening Is a Feature
You can’t build a product people love without understanding how they use it.
BigQuery isn’t just a tool - it’s a feedback loop. It helps us listen to the agents, the system, and the users behind it all.
We’re still figuring it out. But the signal’s getting clearer every day.
Have questions? Want to see what we're seeing?
Join the community, or follow the full series starting with Inside Anjin #01.