Work update: Applied research at Intercom.
Much of my applied research work at Intercom focuses on understanding the boundaries of current AI technology. In production environments, it's not enough to know what AI can do; we also need to understand where it breaks, what tradeoffs we're making, and how to push those boundaries in practical ways. This work centers on two critical areas: (a) the limitations and tradeoffs inherent in AI agents, and (b) the competence and adaptability of open-source models for real-world production use cases.
The Open Source Revolution: Small Models, Big Impact
One of the most exciting trends in AI research is the democratization of language models through open-source alternatives. While large proprietary models have dominated headlines, there's growing evidence that smaller, open-source models can be remarkably effective for specific use cases. This raises fundamental questions about the tradeoffs between model size, computational cost, and performance.
In our recent work, we explore whether smaller LLMs can truly compete with their larger counterparts. The research examines the practical implications of choosing between “David” (smaller open-source models) and “Goliath” (large proprietary systems) for real-world applications. This work is particularly relevant for organizations looking to deploy AI solutions that balance performance with cost, latency, and control.
Read more: David vs Goliath: are small LLMs any good? (co-authored with Ramil Yarullin, September 2025)
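One common way to capture this tradeoff in practice is a cost/quality cascade: route each request to a small open-source model first, and escalate to a large proprietary model only when the small model is unsure. The sketch below is purely illustrative, not the method from the post; the model names, the stubbed inference call, and the confidence heuristic are all hypothetical.

```python
# Hypothetical cost/quality cascade between a small and a large model.

def call_model(model: str, prompt: str) -> tuple[str, float]:
    """Stand-in for a real inference call; returns (answer, confidence)."""
    if model == "small-oss-8b":
        return ("draft answer", 0.62)   # cheap, lower certainty
    return ("refined answer", 0.97)     # expensive, higher quality

def cascade(prompt: str, threshold: float = 0.8) -> tuple[str, str]:
    """Try the small model first; escalate when its confidence is low."""
    answer, conf = call_model("small-oss-8b", prompt)
    if conf >= threshold:
        return "small-oss-8b", answer          # handled cheaply
    answer, _ = call_model("large-proprietary", prompt)
    return "large-proprietary", answer         # escalate for quality

model_used, answer = cascade("Summarize this conversation.")
```

The single `threshold` knob makes the cost/performance tradeoff explicit: lowering it keeps more traffic on the cheap model at some risk to quality, raising it buys quality at higher cost and latency.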
The ACR Tradeoff: Designing Reliable AI Agents
As AI agents become more capable and autonomous, a fundamental challenge emerges: how do we balance agency (the agent's ability to act independently), control (our ability to oversee and intervene), and reliability (consistency and correctness of outcomes)? This trilemma is central to deploying AI agents in production environments where mistakes can have real consequences.
The Agency, Control, Reliability (ACR) framework provides a structured way to think about these tradeoffs. Understanding this balance is crucial for building AI systems that are both powerful and trustworthy—systems that can operate autonomously when appropriate while maintaining the necessary safeguards for critical applications.
Read more: The Agency, Control, Reliability (ACR) Tradeoff for Agents (April 2025)
Looking Forward
These research directions reflect a broader shift in applied AI: moving from pure performance optimization to building systems that are practical, controllable, and accessible. As we continue to push the boundaries of what AI agents can do, frameworks like ACR help us navigate the complex landscape of deployment decisions, while the open-source movement ensures that powerful AI capabilities remain accessible to a broader community.
Looking ahead, the next frontier of this applied research centers on model alignment: ensuring that AI systems not only perform well on benchmarks but also behave in ways that are aligned with human values, organizational goals, and real-world constraints. This represents a natural evolution from understanding what models can do to ensuring they do it in ways that are safe, reliable, and truly useful in production environments.