Microsoft's Magentic-One Coordinates Task Completion Across Multiple AI Agents

By David Barry
Microsoft's new multi-agent framework aims to make AI agents more efficient in the digital workplace. Here's how.

On Nov. 7, Microsoft unveiled Magentic-One, a framework that coordinates multiple agents working together to handle intricate, multi-stage tasks across diverse contexts.

With this release, Microsoft aims to make Magentic-One the orchestrator of autonomous agents that work in coordination to achieve business results. Here is what that means for you and your business.

How Magentic-One Works

Magentic-One is built on an open-source framework called AutoGen and manages multi-step tasks across various domains, such as web navigation, file management and coding. It "represents a significant step towards developing agents that can complete tasks that people encounter in their work and personal lives," the researchers behind Magentic-One wrote.

The potential for the digital workplace is clear.

The modular design of the system facilitates collaboration across its specialized agents, allowing them to work together efficiently on complex projects. This collaborative framework is a significant advancement in AI functionality, enabling complex tasks to be executed autonomously through the coordinated efforts of different agents.

The platform is enabled by four main agents:

  • WebSurfer: This agent manages web browsing, including navigating websites, performing clicks and summarizing content to gather information from the internet.
  • FileSurfer: Responsible for managing local files, directories and folders, this agent organizes and retrieves necessary documents or data.
  • Coder: This agent writes and executes code, analyzes information from other agents and creates new projects based on the gathered data.
  • ComputerTerminal: This agent provides a command-line interface for executing programs written by the Coder agent.

There is also a central Orchestrator agent, which acts as a project manager, coordinating the activities of the specialized agents. It dynamically assigns tasks based on the requirements of a project and monitors progress. If an agent encounters an error or requires assistance, the Orchestrator can reassign tasks or adjust strategies to ensure successful completion.
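The dispatch-and-reassign behavior described above can be illustrated with a minimal Python sketch. This is not Magentic-One's or AutoGen's actual API: the `Orchestrator` class, its `run` method and the toy agent functions are hypothetical names, and the real system relies on a language model to plan and track progress rather than on a fixed list of steps.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical agent type: takes a task description, returns a result string.
Agent = Callable[[str], str]


class Orchestrator:
    """Toy coordinator: routes each step to a named agent and falls back
    to an alternate agent if the first one raises an error."""

    def __init__(self, agents: Dict[str, Agent]) -> None:
        self.agents = agents
        self.log: List[Tuple[str, str, str]] = []  # (agent, task, outcome)

    def run(self, plan: List[Tuple[str, str, str]]) -> List[str]:
        """Each plan step is (primary_agent, fallback_agent, task)."""
        results = []
        for primary, fallback, task in plan:
            for name in (primary, fallback):
                try:
                    result = self.agents[name](task)
                except Exception as exc:
                    self.log.append((name, task, f"error: {exc}"))
                    continue  # reassign the task to the next candidate
                self.log.append((name, task, "ok"))
                results.append(result)
                break
            else:
                # Neither candidate succeeded.
                raise RuntimeError(f"no agent could complete: {task}")
        return results


def web_surfer(task: str) -> str:
    raise TimeoutError("page did not load")  # simulated failure


def file_surfer(task: str) -> str:
    return f"FileSurfer handled: {task}"


def coder(task: str) -> str:
    return f"Coder handled: {task}"


orchestrator = Orchestrator(
    {"WebSurfer": web_surfer, "FileSurfer": file_surfer, "Coder": coder}
)
results = orchestrator.run([
    ("WebSurfer", "FileSurfer", "find the quarterly report"),
    ("Coder", "Coder", "summarize the figures"),
])
print(results)
print(orchestrator.log)
```

In this toy run the simulated WebSurfer failure is logged and the first task is handed to FileSurfer, which mirrors the reassignment idea in a very simplified form.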

Can It Help You?

To understand where Magentic-One or any other multi-agent AI system fits within the broader landscape of your enterprise technology, Dennis Perpetua, global CTO at Kyndryl, said it is helpful to think about their task capabilities in a vertical sense.

“What makes any agentic system useful is its ability to do more than just provide human exchanges or chat functionality based on information retrieval and/or generation,” he said. “Magentic-One is a generalist system that is more analogous to a robotic process automation (RPA) solution than Salesforce’s Agentforce, which is a vertical solution around CX, and GPT-4’s generative answer focus.”

While both are extensible and robust, they are anchored by the type of user and the purpose they serve. Magentic-One is “a bag of parts” that enables the creation of workflows capable of performing tasks with a level of reasoning not typically found in RPA solutions or tools like Apple’s Automator, which has been used for end-user workflow orchestration since its release in 2005.

Asked about the ethical implications of deploying an agent to manage other agents in the enterprise, Perpetua highlighted a significant concern with the Orchestrator or dispatcher model. These systems, he noted, require more controls than single-threaded task systems because they facilitate information exchanges between task workers, which could lead to unintended consequences.

“Currently, the most popular safeguard is to have a human in the loop for confirmation of a task,” he said. “This should be made a design standard when a task is classified as irreversible. Much like we acknowledge a term of service when we sign up for something, the permanence of a task should be considered before execution.”
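A human-in-the-loop safeguard of the kind Perpetua describes might look something like the following Python sketch. It is an illustration only: the `Task` class, the `irreversible` flag and the `execute` helper are hypothetical names, not part of Magentic-One, and how a task gets classified as irreversible is assumed to happen elsewhere.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Task:
    description: str
    irreversible: bool  # hypothetical flag set by whoever classifies the task


def execute(task: Task, action: Callable[[], str],
            confirm: Callable[[Task], bool]) -> str:
    """Run `action`, but require confirmation first when the task
    is classified as irreversible."""
    if task.irreversible and not confirm(task):
        return f"skipped (not approved): {task.description}"
    return action()


def console_confirm(task: Task) -> bool:
    """Interactive approver: asks a person at the terminal."""
    answer = input(f"Approve irreversible task '{task.description}'? [y/N] ")
    return answer.strip().lower() == "y"


def deny_all(task: Task) -> bool:
    """Stand-in approver that always declines, so the demo runs unattended."""
    return False


read_result = execute(Task("read sales.csv", False), lambda: "read ok", deny_all)
delete_result = execute(Task("delete sales.csv", True), lambda: "deleted", deny_all)
print(read_result)
print(delete_result)
```

Passing `console_confirm` instead of `deny_all` would put a person in front of every irreversible task, while reversible tasks run without interruption.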

As multi-agent systems are applied to additional use cases, the reliance on the orchestrator becomes increasingly significant. “The problem becomes the eventual desire to remove the human from the task approval process and to become autonomous,” he said.

This introduces a design limitation that will eventually need to be addressed. Specialized agents may need to evaluate tasks for permanence, sensitive private information usage and privilege to mitigate risks.
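As a rough sketch of what such a screening step could check, the Python example below flags a task for permanence, private data and privilege before it runs. Everything in it is an assumption made for illustration: the `TaskRequest` fields, the two privilege levels and the `risk_flags` function do not come from Magentic-One.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical privilege ordering, lowest to highest.
PRIVILEGE_LEVELS = {"user": 0, "admin": 1}


@dataclass
class TaskRequest:
    description: str
    reversible: bool
    touches_private_data: bool
    required_privilege: str  # "user" or "admin" in this sketch


def risk_flags(task: TaskRequest, agent_privilege: str = "user") -> List[str]:
    """Return the reasons a task should be escalated before execution."""
    flags = []
    if not task.reversible:
        flags.append("permanent")
    if task.touches_private_data:
        flags.append("sensitive-data")
    if PRIVILEGE_LEVELS[task.required_privilege] > PRIVILEGE_LEVELS[agent_privilege]:
        flags.append("privilege-escalation")
    return flags


risky = TaskRequest("email the customer list", False, True, "admin")
benign = TaskRequest("read a public web page", True, False, "user")
print(risk_flags(risky))
print(risk_flags(benign))
```

A task that comes back with no flags could proceed automatically, while any flagged task could be routed to a person or to a stricter policy.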

Magentic-One Benefits ... and Challenges

An AI system composed of multiple interacting, autonomous agents that can sense, learn, make decisions and act to achieve individual and collective goals holds great promise.

“Being open-source, Magentic-One allows developers and organizations to customize and extend its functionalities to meet specific needs,” explained Rogers Jeffrey Leo John, co-founder and CTO of DataChat, a no-code generative AI platform for instant analytics. This openness fosters innovation and community collaboration, and it accelerates adoption in the enterprise market by providing transparency and flexibility.

What sets Magentic-One apart is its approach to structure and usability, said Chris Dro, a software engineer and the founder of several companies. Unlike OpenAI's Swarm, which prioritizes flexibility, or Salesforce's Agentforce, which tightly integrates into a specific ecosystem, Magentic-One offers a modular, pre-configured setup, a design that helps ensure easy adoption and scalability without sacrificing performance.

Plus, Dro said, Microsoft has embedded robust safeguards into Magentic-One to ensure responsible use. These include red-teaming, access controls, human oversight and continuous monitoring, all of which mitigate risks, especially when managing sensitive data. “The framework excels at automating routine tasks while preserving human input for more complex situations. For example, roles focused on data entry or basic analysis can evolve into more strategic functions, supported by agents like WebSurfer and Coder.”

But implementing that framework in an enterprise setting presents significant difficulties.

One of those challenges is ensuring seamless interoperability with existing business systems and processes, John said. As a new framework, Magentic-One will require numerous integrations before it can be widely adopted within enterprises. There's also the fact that a multi-agent system like Magentic-One involves more interactions with LLMs than conventional AI systems do, resulting in increased computational costs.

And then there's the privacy risk: enterprises must ensure the framework is reliable and compatible with their security and privacy requirements.

The Role of AutoGenBench

AutoGenBench, developed by Microsoft Research, is a benchmarking tool designed for multi-agent AI systems like Magentic-One. It is part of Microsoft’s broader research into multi-agent architectures and is primarily utilized within the AutoGen framework. The tool tests and evaluates multi-agent systems, especially those that leverage large language models, and measures and compares the performance of agent-based workflows across a variety of tasks and scenarios, offering insights into their effectiveness, scalability and interaction dynamics.

AutoGenBench plays a crucial role in benchmarking multi-agent AI systems but, Perpetua cautions, it should not provide a false sense of security. Its current capabilities are designed to establish a baseline comparison against expected outcomes, helping to identify potential system drift. However, it has limited effectiveness in detecting malicious actions or unexpected behaviors during real-world exchanges, he said.
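The idea of a baseline comparison can be shown with a small Python sketch. It does not use AutoGenBench itself: the `success_rate` and `drifted_tasks` helpers are hypothetical, and the task results are made-up values chosen only to illustrate the arithmetic.

```python
from typing import Dict, List


def success_rate(outcomes: Dict[str, bool]) -> float:
    """Fraction of tasks whose outcome matched the expected answer."""
    return sum(outcomes.values()) / len(outcomes)


def drifted_tasks(baseline: Dict[str, bool], current: Dict[str, bool]) -> List[str]:
    """Tasks that passed in the baseline run but do not pass now."""
    return sorted(
        task for task, passed in baseline.items()
        if passed and not current.get(task, False)
    )


# Made-up illustrative results, not real benchmark data.
baseline = {"task-1": True, "task-2": True, "task-3": False, "task-4": True}
current = {"task-1": True, "task-2": False, "task-3": False, "task-4": True}

print(success_rate(baseline))            # 0.75
print(success_rate(current))             # 0.5
print(drifted_tasks(baseline, current))  # ['task-2']
```

A check like this only flags deviations from expected outcomes. An agent that reaches the expected answer while doing something harmful along the way would still pass, which is the limitation Perpetua points to.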

While the advancement of AutoGenBench demonstrates Microsoft’s commitment to expanding AI accessibility across various domains, the tool has significant limitations and cannot replace the need for strong skills and robust design standards among AI practitioners.

About the Author
David Barry

David is a European-based journalist of 35 years who has spent the last 15 following the development of workplace technologies, from the early days of document management, enterprise content management and content services. Now, with the development of new remote and hybrid work models, he covers the evolution of technologies that enable collaboration, communications and work and has recently spent a great deal of time exploring the far reaches of AI, generative AI and General AI.

Main image: adobe stock