LLMs have demonstrated impressive capabilities across a broad set of tasks, ushering in a new era of agentic applications. Notably, there is a shift away from monolithic models toward AI systems and architectures that augment and complement LLMs with data and model retrieval, task coordination and planning, reasoning, reflection and learning, and eventually synthesis, resulting in innovative services. Such “compound” systems promise improved performance on complex tasks, greater flexibility and adaptability across applications, easier integration of existing models and data, and greater control and trust.
We are working toward a blueprint architecture for compound AI systems tailored to enterprises. Key factors we consider include:
(1) ensuring seamless integration into existing infrastructure through suitable touch points and interfaces,
(2) effectively orchestrating work within, and external to, the compound system with appropriate resource allocation, and
(3) maximizing utilization of the system in a cost-effective manner, where constraints such as latency, accuracy, cost, availability, and quality must be considered.
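To make factor (3) concrete, the following is a minimal sketch of how a compound system might pick one way of serving a request under latency, accuracy, cost, and availability constraints. It is an illustration under stated assumptions, not the blueprint's actual mechanism: the names `Candidate`, `Constraints`, and `select_candidate`, the fields they carry, and the "cheapest feasible option" policy are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Candidate:
    """Hypothetical option for serving a task (e.g. a model or agent pipeline)."""
    name: str
    latency_ms: float       # expected end-to-end latency
    accuracy: float         # expected accuracy in [0, 1]
    cost: float             # expected cost per request, arbitrary units
    available: bool = True  # whether the option can currently be used


@dataclass(frozen=True)
class Constraints:
    """Hypothetical per-request limits; all three must hold."""
    max_latency_ms: float
    min_accuracy: float
    max_cost: float


def select_candidate(
    candidates: Sequence[Candidate], constraints: Constraints
) -> Optional[Candidate]:
    """Return the cheapest available candidate that meets every constraint.

    Ties on cost are broken by lower latency. Returns None when no
    candidate is feasible, leaving the fallback policy to the caller.
    """
    feasible = [
        c for c in candidates
        if c.available
        and c.latency_ms <= constraints.max_latency_ms
        and c.accuracy >= constraints.min_accuracy
        and c.cost <= constraints.max_cost
    ]
    if not feasible:
        return None
    return min(feasible, key=lambda c: (c.cost, c.latency_ms))
```

Treating the constraints as hard filters and minimizing cost over what remains is only one possible policy; a weighted score over latency, accuracy, and cost would be an equally plausible alternative.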