Source: Marcus on AI
The gap between autonomous agent demos and real-world deployment is real. Agents hallucinate, let failures cascade from one step to the next, and lack meaningful error recovery, which makes them unreliable for high-stakes tasks where humans currently absorb the cost of failure. The issue isn't architectural but operational: current LLMs lack the deterministic reasoning and explicit state management that mission-critical systems require. Vendors and researchers continue to overstate capabilities to secure funding and attention. Until shipping products handle complex, unsupervised workflows with measurable SLAs, the category remains a capital-intensive placeholder rather than a solved problem.