Build, evaluate and run General Multi-Agent Assistance with ease
AWorld
python examples/gaia/run.py
AWorld (short for Agent World) bridges the gap between theoretical MAS (Multi-Agent System) capabilities and practical implementation in real-world applications, and guides you into the AGI World. GLHF! 🚀
agent: AI-powered components that autonomously make decisions, use tools, collaborate, and so on.
swarm: defines the topology of a multi-agent system.
environment: the runtime supporting communication among agents and tools.
task: a structure containing datasets, agents, tools, metrics, outputs, etc.
runner: completes a specific runnable workflow and obtains results.
With Python>=3.11:
pip install aworld
from aworld.config.conf import AgentConfig
from aworld.core.agent.base import Agent
from aworld.runner import Runners
if __name__ == '__main__':
    agent_config = AgentConfig(
        llm_provider="openai",
        llm_model_name="gpt-4o",
        # Set via environment variable or direct configuration
        # llm_api_key="YOUR_API_KEY",
        # llm_base_url="https://api.openai.com/v1"
    )
    search = Agent(
        conf=agent_config,
        name="search_agent",
        system_prompt="You are a helpful agent.",
        mcp_servers=["amap-amap-sse"]  # MCP server name for agent to use
    )
    # Run agent
    Runners.sync_run(input="Hotels within 1 kilometer of West Lake in Hangzhou",
                     agent=search)
Here is an MCP server config example.
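The name passed to `mcp_servers` has to resolve to an entry in an MCP configuration. As a rough sketch, assuming the widely used `mcpServers` JSON layout, such a config could be generated as follows. The exact file name, location, and keys AWorld reads are not stated here, so treat them as assumptions; the URL is a placeholder and `render_mcp_config` is a hypothetical helper.

```python
import json

# Hypothetical config using the common `mcpServers` layout. The keys that
# AWorld actually reads are an assumption, and the URL is a placeholder.
mcp_config = {
    "mcpServers": {
        "amap-amap-sse": {
            "url": "https://example.com/sse?key=YOUR_API_KEY",
        }
    }
}


def render_mcp_config(config: dict) -> str:
    """Serialize the config as pretty-printed JSON."""
    return json.dumps(config, indent=2)


if __name__ == "__main__":
    print(render_mcp_config(mcp_config))
```

The key under `mcpServers` is the same string the agent lists in `mcp_servers`, which is how the two are meant to line up in this sketch.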
Below are demonstration videos showcasing AWorld's capabilities across different agent configurations and environments.
Mode | Type | Demo |
---|---|---|
Single Agent | Browser use | ▶️ Watch Browser Demo on YouTube |
Single Agent | Phone use | ▶️ Watch Mobile Demo on YouTube |
Multi Agent | Cooperative Teams | ▶️ Watch Travel Demo on YouTube |
Multi Agent | Competitive Teams | ▶️ Watch Debate Arena on YouTube |
Multi Agent | Mixed of both Teams | Coming Soon 🚀 |
Here is a multi-agent example of running a level2 task from the GAIA benchmark:
from aworld.agents.gaia.agent import PlanAgent, ExecuteAgent
from aworld.config.common import Agents, Tools
from aworld.core.agent.swarm import Swarm
from aworld.core.task import Task
from aworld.config.conf import AgentConfig, TaskConfig
from aworld.dataset.mock import mock_dataset
from aworld.runner import Runners
import os
# Need OPENAI_API_KEY
os.environ['OPENAI_API_KEY'] = "your key"
# Optional endpoint settings, default `https://api.openai.com/v1`
# os.environ['OPENAI_ENDPOINT'] = "https://api.openai.com/v1"
# One sample for example
test_sample = mock_dataset("gaia")
# Create agents
plan_config = AgentConfig(
    name=Agents.PLAN.value,
    llm_provider="openai",
    llm_model_name="gpt-4o",
)
agent1 = PlanAgent(conf=plan_config)
exec_config = AgentConfig(
    name=Agents.EXECUTE.value,
    llm_provider="openai",
    llm_model_name="gpt-4o",
)
agent2 = ExecuteAgent(conf=exec_config, tool_names=[Tools.DOCUMENT_ANALYSIS.value])
# Create swarm for multi-agents
# define (head_node, tail_node) edge in the topology graph
# NOTE: the correct order is necessary
swarm = Swarm((agent1, agent2), sequence=False)
# Define a task
task = Task(input=test_sample, swarm=swarm, conf=TaskConfig())
# Run task
result = Runners.sync_run_task(task=task)
print(f"Time cost: {result['time_cost']}")
print(f"Task Answer: {result['task_0']['answer']}")
Time cost: 26.431413888931274
Task Answer: Time-Parking 2: Parallel Universe
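The `(head_node, tail_node)` ordering in the swarm definition can be pictured as a directed edge in a graph. The sketch below only illustrates that idea with the standard library's `graphlib`, using plain strings in place of agents. It is not how AWorld implements `Swarm`, and `execution_order` is a hypothetical helper.

```python
from graphlib import TopologicalSorter


def execution_order(edges: list[tuple[str, str]]) -> list[str]:
    """Return one valid run order for directed (head, tail) edges."""
    sorter = TopologicalSorter()
    for head, tail in edges:
        # `tail` depends on `head`, so `head` is scheduled first.
        sorter.add(tail, head)
    return list(sorter.static_order())


if __name__ == "__main__":
    # Planner hands work to the executor.
    print(execution_order([("plan_agent", "execute_agent")]))
    # Swapping the pair reverses the direction of the edge.
    print(execution_order([("execute_agent", "plan_agent")]))
```

Swapping the two names reverses the resulting order, which is why the order of agents in the tuple matters.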
AWorld uses a client-server architecture with three main components:
Client-Server Architecture: Similar to Ray, this architecture:
Agent/Actor:
Field | Type | Description |
---|---|---|
id | string | Unique identifier for the agent |
name | string | Name of the agent |
model_name | string | LLM model name of the agent |
_llm | object | LLM model instance based on model_name (e.g., "gpt-4", "claude-3") |
conf | BaseModel | Configuration inheriting from pydantic BaseModel |
trajectory | object | Memory for maintaining context across interactions |
tool_names | list | List of tools the agent can use |
mcp_servers | list | List of MCP servers the agent can use |
handoffs | list | Agent as tool; list of other agents the agent can delegate tasks to |
finished | bool | Flag indicating whether the agent has completed its task |
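To make the field table concrete, here is a minimal sketch of a record with the same field names, written with standard-library dataclasses. `AgentSketch` and `delegate` are hypothetical names used for illustration only. AWorld's real `Agent` class is configured through a pydantic-based `conf` and will differ; the `_llm` and `conf` fields are left out to keep the example self-contained.

```python
from __future__ import annotations

import uuid
from dataclasses import dataclass, field


@dataclass
class AgentSketch:
    """Illustrative stand-in mirroring the documented agent fields."""

    name: str
    model_name: str
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    trajectory: list = field(default_factory=list)  # context kept across steps
    tool_names: list[str] = field(default_factory=list)
    mcp_servers: list[str] = field(default_factory=list)
    handoffs: list[str] = field(default_factory=list)  # agents usable as tools
    finished: bool = False

    def delegate(self, target: str) -> bool:
        """Record a handoff only if `target` is an allowed delegate."""
        if target not in self.handoffs:
            return False
        self.trajectory.append(("handoff", target))
        return True


if __name__ == "__main__":
    planner = AgentSketch(
        name="plan_agent",
        model_name="gpt-4o",
        handoffs=["execute_agent"],
    )
    print(planner.delegate("execute_agent"))
    print(planner.delegate("unknown_agent"))
```

In this sketch, `handoffs` acts as an allow-list: delegation to an agent that is not listed is refused, and accepted handoffs are appended to `trajectory`.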
Environment/World Model: Various tools and models in the environment
Tools | Description |
---|---|
MCP Servers | AWorld seamlessly integrates a rich collection of MCP servers as agent tools |
browser | Controls web browsers for navigation, form filling, and interaction with web pages |
android | Manages Android device simulation for mobile app testing and automation |
shell | Executes shell commands for file operations and system interactions |
code | Runs code snippets in various languages for data processing and automation |
search | Performs web searches and returns structured results for information gathering and summary |
document | Handles file operations including reading, writing, and managing directories |
AWorld serves two complementary purposes:
✨ MCP Servers as Tools - Powerful integration of MCP servers providing robust tooling capabilities
🌐 Environment Multi-Tool Support:
🤖 AI-Powered Agents:
🎛️ Web Interface:
🧠 Benchmarks and Samples:
We warmly welcome developers to join us in building and improving AWorld! Whether you're interested in enhancing the framework, fixing bugs, or adding new features, your contributions are valuable to us.
For academic citations, or if you wish to contact us, please use the following BibTeX entry:
@software{aworld2025,
author = {Agent Team at Ant Group},
title = {AWorld: A Unified Agent Playground for Computer and Phone Use Tasks},
year = {2025},
url = {https://github.com/inclusionAI/AWorld},
version = {0.1.0},
publisher = {GitHub},
email = {chenyi.zcy at antgroup.com}
}
This project is licensed under the MIT License - see the LICENSE file for details.