【New Intelligence Introduction】 AI agents are a hot topic in large-model research. Users can bring in multiple LLM agents with different roles to work on real tasks. The agents interact dynamically, through both competition and collaboration, and can produce striking collective-intelligence effects. This article introduces CAMEL, a framework for mind interaction between large models, from a KAUST research team. CAMEL is the earliest well-known ChatGPT-based autonomous agent project and has been accepted at NeurIPS 2023, a top artificial intelligence conference.
"What magical trick makes us intelligent? The trick is that there is no trick. The power of intelligence stems from our vast diversity, not from any single, perfect principle." Marvin Minsky, The Society of Mind [1]
On the road to more advanced machine intelligence, large language models (LLMs), with ChatGPT as the best-known example, are a milestone that cannot be skipped. Through chat-based human-computer interaction, they have achieved impressive results on complex tasks across many fields.
As LLMs have developed, frameworks for interaction between AI agents have gradually emerged, especially in complex professional domains. Agents preset with role-playing prompts can take over the roles that human users would otherwise play in a task, and dynamic interaction between agents, whether collaborative or competitive, can often produce unexpected results. This is the line of work that Andrej Karpathy of OpenAI and others regard as "the most important frontier research direction leading to AGI."

The timeline of the development of this field is as follows[2]:
- 「CAMEL」 (Camel: Large Model Mind Interaction Framework) - Released on 2023.3.21
- 「AutoGPT」 - Released on 2023.3.30
- 「BabyAGI」 - Released on 2023.4.3
- 「Westworld」 simulation (Stanford Westworld Town) - Released on 2023.4.7
As the earliest well-known ChatGPT-based autonomous agent project, the KAUST team's CAMEL framework explores a new cooperative agent framework called role-playing. Role-playing effectively reduces the errors that arise during conversations between agents and so guides them to complete a variety of complex tasks. A human user only needs to enter a preliminary idea to start the whole process. CAMEL has been accepted at NeurIPS 2023, a top international artificial intelligence conference.
Paper link: https://ghli.org/camel.pdf
Project homepage: https://www.camel-ai.org/

The authors designed the CAMEL framework with flexible, modular components: implementations of different agents, prompt examples for a range of professional domains, and an AI data exploration framework. CAMEL can therefore serve as a base agent backend that makes it easier for AI researchers and developers to build applications for multi-agent systems, cooperative AI, game-theory simulation, social analysis, and AI ethics.
Specifically, to study the emergent capabilities of LLMs, the authors used two collaborative role-playing scenarios to generate two large instruction datasets, AI Society and AI Code, as well as two single-turn question-answering datasets, AI Math and AI Science.
1. CAMEL framework
The figure below shows the role-playing framework in CAMEL. A human user first formulates an idea or goal, for example: develop a trading bot for the stock market.
Two roles take part in this task: an AI assistant agent, which plays a Python programmer, and an AI user agent, which plays a stock trader.

The authors first set up a task specifier for CAMEL, which turns the input idea into a more detailed, concrete task. The AI assistant agent (AI Assistant) and the AI user agent (AI User) then collaborate through chat and complete the specified task step by step.
The collaboration is implemented through system-level messages. Let $P_A$ be the system message passed to the AI assistant agent and $P_U$ the system message passed to the AI user agent.
The two agents are then instantiated from two ChatGPT models $F_1$ and $F_2$, each conditioned on its system message, which yields the AI assistant agent $A \leftarrow F_1^{P_A}$ and the AI user agent $U \leftarrow F_2^{P_U}$.
Once the roles are assigned, the AI assistant agent and the AI user agent collaborate on the task in an instruction-following manner. Let $I_t$ be the user instruction message obtained at time $t$ and $S_t$ the solution given by the AI assistant agent. The set of dialogue messages obtained up to time $t$ is:

$$M_t = \{(I_0, S_0), \ldots, (I_t, S_t)\} = \{(I_i, S_i)\}\big|_{i=0}^{t}$$

At the next time step $t+1$, the AI user agent generates a new instruction from the dialogue history. The new instruction and the history are then passed to the AI assistant agent, which produces the solution for the new step:

$$I_{t+1} = U(M_t), \qquad S_{t+1} = A(M_t, I_{t+1})$$
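The turn-taking loop described in this section can be sketched in a few lines of Python. This is only an illustrative sketch, not CAMEL's actual API: the `RolePlay` class, the `Model` callable type, and the two stub functions are hypothetical names, and the stubs stand in for real LLM calls so that the example runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical type: a "model" maps (system message, history, latest input) to a reply.
Model = Callable[[str, List[Tuple[str, str]], str], str]


@dataclass
class RolePlay:
    assistant_system: str  # system message that fixes the assistant's role
    user_system: str       # system message that fixes the user's role
    assistant_model: Model
    user_model: Model
    history: List[Tuple[str, str]] = field(default_factory=list)

    def step(self) -> Tuple[str, str]:
        # The AI user reads the history so far and issues the next instruction.
        instruction = self.user_model(self.user_system, self.history, "")
        # The AI assistant sees the same history plus the new instruction.
        solution = self.assistant_model(self.assistant_system, self.history, instruction)
        # The (instruction, solution) pair is appended to the dialogue history.
        self.history.append((instruction, solution))
        return instruction, solution


def stub_user(system: str, history: List[Tuple[str, str]], _latest: str) -> str:
    # Stand-in for an LLM call: numbers each instruction by the turn index.
    return f"Instruction {len(history) + 1}"


def stub_assistant(system: str, history: List[Tuple[str, str]], instruction: str) -> str:
    # Stand-in for an LLM call: echoes the instruction it is answering.
    return f"Solution to: {instruction}"


session = RolePlay(
    assistant_system="You are a Python programmer.",
    user_system="You are a stock trader.",
    assistant_model=stub_assistant,
    user_model=stub_user,
)
for _ in range(3):
    session.step()
```

Replacing the two stubs with real chat-model calls would leave the control flow unchanged: the user agent always moves first, and both agents read from the same growing history.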
2. CAMEL Usage Examples
2.1. Collaborative role-playing
CAMEL's built-in collaborative role-playing framework can complete complex tasks through collaboration between agents, without requiring domain expertise from the human user. The figure below shows an example in which CAMEL develops a stock market trading bot: the AI assistant agent plays a Python programmer and the AI user agent plays a stock trader.

In the role-playing framework, each AI agent has expertise in a specific field. We only need to supply a prompt describing the original idea, and the two agents then work on it together. In the figure above, the user agent asks for the trading bot to be able to analyze the sentiment of comments about stocks. The assistant agent responds directly with a script that installs the Python libraries needed for sentiment analysis and stock trading.

As the task progresses, the user agent's instructions become more and more specific. In the figure above, the instruction is to define a function that uses the Yahoo Finance API to fetch the latest price of a given stock. The assistant agent directly generates code that meets this requirement.
2.2. Embodied agent
In earlier work, AI agents mostly simulated operations without actually interacting with the real world or using external tools. Current LLMs can already interact with the Internet and other tool APIs, and CAMEL accordingly provides embodied agents that can carry out operations in the physical world: browsing the Internet, reading documents, creating image, audio, and video content, and even executing code directly.

The figure above shows an example of CAMEL using the embodied agent to call the Stable Diffusion toolchain provided by HuggingFace to generate a camel family image. In this process, the embodied agent first infers all the animals included in the camel family, and then calls the diffusion model to generate an image and save it.
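One common way for an agent to "execute code directly" is to pull fenced code blocks out of the model's reply and run them. The sketch below illustrates that general idea only and is not taken from CAMEL's implementation: `extract_python_blocks`, `run_blocks`, and the sample `reply` are hypothetical, and the bare `exec` call is unsafe for untrusted output.

```python
import contextlib
import io
import re
from typing import List

FENCE = "`" * 3  # built dynamically so the sample reply can carry its own code fence


def extract_python_blocks(reply: str) -> List[str]:
    """Return the bodies of all fenced Python blocks found in a model reply."""
    pattern = re.compile(FENCE + r"python\n(.*?)" + FENCE, re.DOTALL)
    return [body.strip() for body in pattern.findall(reply)]


def run_blocks(reply: str) -> str:
    """Execute every Python block in the reply and return the captured stdout."""
    buffer = io.StringIO()
    namespace: dict = {}
    with contextlib.redirect_stdout(buffer):
        for code in extract_python_blocks(reply):
            # Assumption: the reply is trusted. A real system needs sandboxing here.
            exec(code, namespace)
    return buffer.getvalue()


# Made-up model reply used only to exercise the functions above.
reply = (
    "The camel family includes camels, llamas, and alpacas.\n"
    + FENCE + "python\n"
    + "animals = ['camel', 'llama', 'alpaca']\n"
    + "print(len(animals))\n"
    + FENCE
)
output = run_blocks(reply)
```

In a tool-calling setup such as the image-generation example, the executed block would call an external API instead of printing a number.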
2.3. Critic-in-the-loop
To make the role-playing framework more controllable, the authors also designed a critic-in-the-loop mechanism for CAMEL. Inspired by Monte Carlo Tree Search (MCTS), it implements tree-search-style decision logic for solving tasks and can incorporate human preferences. CAMEL sets up an intermediate critic agent that decides among the options proposed by the user agent and the assistant agent in order to complete the final task. The overall process is shown in the figure below.

Consider a scenario where we ask CAMEL to host a very specific research project discussion meeting. The theme of the research project is "Large Language Models". CAMEL can set the role of the user agent to a postdoctoral fellow, the role of the assistant agent to a doctoral student, and the role of the intermediate evaluation agent to a professor. The task instructs the doctoral student to help the postdoctoral fellow develop a research plan, which requires research on the ethics of large models.

After receiving the task, the postdoctoral agent first puts forward three proposals for the project, suggesting that it start with a survey of related work on the ethics of large models.
The professor then comments on the three proposals. He judges the second, which is to study algorithmic discrimination (bias) in large models, to be the most reasonable, and points out the shortcomings of the other two: proposal 1 lacks a clear structure and the research scope of proposal 3 is too narrow.

After the professor speaks, the doctoral student agent carries out more specific project planning, for example by listing relevant literature on the ethics and safety of large models and discussing how to conduct the research.
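The selection step in this example, where a critic picks one of several proposals and explains the choice, can be sketched as follows. This is a hedged illustration rather than CAMEL's implementation: `critic_step` and `longest_proposal_critic` are hypothetical names, and the toy critic simply prefers the longest proposal in place of an LLM or human judgment.

```python
from typing import Callable, List, Tuple

# Hypothetical type: a critic maps a list of proposals to (chosen index, feedback).
Critic = Callable[[List[str]], Tuple[int, str]]


def critic_step(proposals: List[str], critic: Critic) -> Tuple[str, str]:
    """Ask the critic to pick one proposal; return the chosen proposal and the feedback."""
    index, feedback = critic(proposals)
    if not 0 <= index < len(proposals):
        raise ValueError(f"critic chose option {index}, but only {len(proposals)} exist")
    return proposals[index], feedback


def longest_proposal_critic(proposals: List[str]) -> Tuple[int, str]:
    # Toy stand-in for an LLM or human critic: prefers the most detailed (longest) proposal.
    index = max(range(len(proposals)), key=lambda i: len(proposals[i]))
    return index, f"Option {index + 1} is the most detailed."


# Made-up proposals used only to exercise the functions above.
proposals = [
    "Survey related work.",
    "Study bias in model outputs and propose mitigations.",
    "Review one dataset.",
]
chosen, feedback = critic_step(proposals, longest_proposal_critic)
```

Because the critic is just a callable, the same loop works whether the decision comes from another agent or from a person typing an option number.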
3. Experimental results
The paper evaluates performance from three angles, using two gpt-3.5-turbo models as the experimental agents. The experiments use the four datasets generated with the CAMEL framework: AI Society and AI Code focus on the agents' dialogue quality, while AI Math and AI Science focus on their problem-solving ability.
3.1. Agent Evaluation
In this section, the authors randomly selected 100 tasks from the AI Society and AI Code datasets and compared the solutions produced by the CAMEL framework with those of a single gpt-3.5-turbo.
The results were evaluated in two ways. First, human participants cast 453 votes on which of the two solutions was more feasible. Second, the authors prompted GPT-4 to score the two solutions directly. The comparison is shown in the following table.

As the table shows, the solutions produced by the CAMEL framework are clearly preferred over those of a single gpt-3.5-turbo in both the human evaluation and the GPT-4 evaluation, and the two evaluations show highly consistent trends.
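A pairwise preference evaluation of this kind reduces to tallying votes into percentage shares. The sketch below shows that arithmetic under stated assumptions: the function name `preference_shares`, the label names, and the ten toy votes are all made up for illustration and are not the paper's data.

```python
from collections import Counter
from typing import Dict, Iterable


def preference_shares(votes: Iterable[str]) -> Dict[str, float]:
    """Convert a sequence of vote labels into percentage shares per label."""
    counts = Counter(votes)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no votes to tally")
    return {label: 100.0 * n / total for label, n in counts.items()}


# Toy votes, NOT the paper's results: each entry names the preferred solution,
# with "draw" (an assumed label) marking votes that rate both solutions equally.
toy_votes = ["camel"] * 6 + ["single"] * 3 + ["draw"] * 1
shares = preference_shares(toy_votes)
```

The same tally applies to the model-based evaluation if each GPT-4 judgment is first reduced to one label per task.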
3.2. Evaluation of ChatBot using GPT-4
In this part, the authors progressively fine-tuned the LLaMA-7B model on the four CAMEL-generated datasets and observed how well the model absorbed new knowledge as data from different domains (society, code, mathematics, and science) was injected step by step.
The authors started with the AI Society dataset so that the model could learn the common sense of human interaction and social dynamics. As AI Code and the other datasets were added, the model acquired knowledge of programming logic and syntax and broadened its understanding of scientific theories, empirical observations, and experimental methods.

The table above shows the model's results on 20 society tasks, 20 coding tasks, 20 mathematics tasks, and 60 science tasks. Each time a dataset is added, the model performs better on the task domain it was just trained on.
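The staged schedule can be written down as a list of cumulative dataset mixes, one per fine-tuning stage. This sketch assumes that each stage trains on all datasets added so far and that the order is society, code, mathematics, science, as the text suggests; the helper name `cumulative_stages` is hypothetical and no training is performed.

```python
from typing import List


def cumulative_stages(datasets: List[str]) -> List[List[str]]:
    """Return one dataset mix per stage, each stage adding the next dataset."""
    return [datasets[: i + 1] for i in range(len(datasets))]


# Assumed order of injection, following the domains listed in the text.
order = ["AI Society", "AI Code", "AI Math", "AI Science"]
stages = cumulative_stages(order)

for number, mix in enumerate(stages, start=1):
    # In a real pipeline, a fine-tuning run on `mix` would be launched here.
    print(f"stage {number}: {' + '.join(mix)}")
```

Evaluating the model after every stage on all four task domains is what makes the per-domain gains visible.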
3.3. HumanEval
To further evaluate the CAMEL framework on code-writing tasks, the authors ran experiments on two benchmarks, HumanEval and HumanEval+. The results are shown in the following table.

The table above shows that the model trained on CAMEL data not only far exceeds the LLaMA-7B model but also significantly exceeds the Vicuna-7B model, which indicates that the datasets generated with CAMEL are particularly effective at improving an LLM's handling of coding-related tasks.
4. CAMEL AI Open Source Community
It is worth mentioning that the CAMEL authors are building a comprehensive CAMEL AI open-source community. The community's GitHub repository has received more than 3,600 stars and covers the implementations of CAMEL's agents, data generation pipelines, data analysis tools, and the generated datasets, in support of research on AI agents and related topics. Many open-source enthusiasts are already contributing code.
It has been nine months since the first line of code was written for the CAMEL project. In that time, the CAMEL-AI.org open-source research and technology community has attracted more than 20 independent code contributors from KAUST, Cambridge, Sorbonne University, NUS, CMU, University of Chicago, Stanford, Duke University, Peking University, Shanghai Jiaotong University, Harbin Institute of Technology, Xidian University, Northeastern University, and Chengdu University of Information and Communications Technology, as well as from industry.
The community is looking for full-time/part-time/internship contributors, engineers, and researchers to join in learning and exploring how to push the boundaries of building an intelligent society. Outstanding contributors will have the opportunity to participate in the writing of papers on the framework and other research projects.
If you are interested in joining the CAMEL-AI.org community, you can send your resume to camel.ai.team@gmail.com or add WeChat ID CamelAIOrg for consultation!
References:
[1] Minsky M. The Society of Mind. Simon and Schuster, 1988.
[2] https://towardsdatascience.com/4-autonomous-ai-agents-you-need-to-know-d612a643fa92