In this article, you will learn the complete process of building an AI agent with function-calling capabilities, based on Meta's Llama 3 model and the open source UI library Gradio.
1. Introduction
Imagine you want to buy something. You visit an e-commerce website and use the search option to find what you need. If you have many things to buy, this process quickly becomes inefficient. Now, consider this scenario: you open an application, describe what you want in plain English, and press enter. You don't have to worry about searching and comparing prices, because the application handles all of that automatically. Cool, right? This is exactly what we are going to build in this article.
Let's look at some examples first.
Next, let's flesh out the features of this application. We will use the open source Llama 3 model developed by Meta, which has function-calling capabilities. The sample program in this article can also be implemented with version 3.1 of the model; according to Meta's announcement (https://ai.meta.com/blog/meta-llama-3-1/), the 3.1 models can use tools and functions more effectively.
Note: These models are multilingual, support a longer context length of 128K tokens, offer state-of-the-art tool use, and have stronger overall reasoning capabilities.
I will be using the Groq cloud platform for the development in this article, specifically its Llama 3 70B tool-use model. The initial workflow for this application includes an embedding model, a retriever, and two main tools: one for handling the user's purchase interests and one for cost-related questions. In summary, we need something similar to the components described in the figure below.
Now we have to choose a framework for orchestrating the LLM components. For this, I chose my all-time favorite production-grade open source AI framework: Haystack (https://haystack.deepset.ai/).
Now that we have what we need, let's get started with the key development work!
2. Loading and indexing data
Since we are using a RAG pipeline in this sample program, we first build a document indexing service backed by the in-memory vector database provided by Haystack. Note that each document in our vector database contains the following fields:
- Content: the text we use to perform similarity search
- Id: a unique identifier for the product
- Price: the product price
- URL: the product URL
When our RAG pipeline is called, the Content field is used for the vector search; all the other fields are stored as metadata in the vector database. Saving this metadata is critical, because it is what we present to the user in the front end.
Next, let’s see how to achieve this.
from haystack import Pipeline, Document
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.writers import DocumentWriter
from haystack.components.embedders import SentenceTransformersDocumentEmbedder
from haystack.components.generators import OpenAIGenerator
from haystack.utils import Secret
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.builders import PromptBuilder
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.dataclasses import ChatMessage
import pandas as pd

# Load product data from CSV
df = pd.read_csv("product_sample.csv")

# Initialize the in-memory document store
document_store = InMemoryDocumentStore()

# Convert product rows into Haystack Document objects
documents = [
    Document(
        content=item.product_name,
        meta={
            "id": item.uniq_id,
            "price": item.selling_price,
            "url": item.product_url,
        },
    )
    for item in df.itertuples()
]

# Create a pipeline for indexing the documents
indexing_pipeline = Pipeline()

# Add a document embedder using a Sentence Transformers model
indexing_pipeline.add_component(
    instance=SentenceTransformersDocumentEmbedder(model="sentence-transformers/all-MiniLM-L6-v2"),
    name="doc_embedder",
)

# Add a document writer to store the embedded documents in the document store
indexing_pipeline.add_component(instance=DocumentWriter(document_store=document_store), name="doc_writer")

# Connect the embedder's output to the writer's input
indexing_pipeline.connect("doc_embedder.documents", "doc_writer.documents")

# Run the indexing pipeline to embed and store the documents
indexing_pipeline.run({"doc_embedder": {"documents": documents}})
Great, we have completed the first step of our AI agent application. Now it's time to build the product identifier function tool. To better understand the main task of the product identifier, consider the following example.
The user's query is as follows:

I want to buy a camping boot, a charcoal and google pixel 9 back cover.
Now, let's take a look at the idealized workflow of the product identifier function.
First, we need to create a tool to analyze user queries and identify the products that the user is interested in. We can build such a tool using the following code snippet.
3. Building a User Query Analyzer
template = """
Understand the user query and list of products the user is interested in and return product names as list.
You should always return a Python list. Do not return any explanation.
Examples:
Question: I am interested in camping boots, charcoal and disposable rain jacket.
Answer: ["camping_boots","charcoal","disposable_rain_jacket"]
Question: Need a laptop, wireless mouse, and noise-cancelling headphones for work.
Answer: ["laptop","wireless_mouse","noise_cancelling_headphones"]
Question: {{ question }}
Answer:
"""
product_identifier = Pipeline()
product_identifier.add_component("prompt_builder", PromptBuilder(template=template))
product_identifier.add_component("llm", generator())
product_identifier.connect("prompt_builder", "llm")
Ok, now that we are halfway through our first function, it is time to complete the function by adding the RAG pipeline.
4. Creating a RAG Pipeline
template = """
Return product name, price, and url as a python dictionary.
You should always return a Python dictionary with keys price, name and url for single product.
You should always return a Python list of dictionaries with keys price, name and url for multiple products.
Do not return any explanation.
Legitimate Response Schema:
{"price": "float", "name": "string", "url": "string"}
Legitimate Response Schema for multiple products:
[{"price": "float", "name": "string", "url": "string"},{"price": "float", "name": "string", "url": "string"}]
Context:
{% for document in documents %}
product_price: {{ document.meta['price'] }}
product_url: {{ document.meta['url'] }}
product_id: {{ document.meta['id'] }}
product_name: {{ document.content }}
{% endfor %}
Question: {{ question }}
Answer:
"""
rag_pipe = Pipeline()
rag_pipe.add_component("embedder", SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2"))
rag_pipe.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store, top_k=5))
rag_pipe.add_component("prompt_builder", PromptBuilder(template=template))
rag_pipe.add_component("llm", generator())
rag_pipe.connect("embedder.embedding", "retriever.query_embedding")
rag_pipe.connect("retriever", "prompt_builder.documents")
rag_pipe.connect("prompt_builder", "llm")
After executing the above code, we have completed the construction of the RAG and query-analysis pipelines. Now it is time to convert them into a tool. For this, we can use a regular function declaration, as shown below; creating a tool for an AI agent is just like writing a Python function. You might now be asking:
How does the agent call this function?
The solution is simple: leverage the model-specific tool schema. We will incorporate this schema in a later step. For now, it is time to create a wrapper function that uses both the query analyzer and the RAG pipeline.
Let's first clarify the goals of this function.
Goal 1: Identify all products that the user is interested in and return them as a list.
Goal 2: For each identified product, retrieve up to five matching products and their metadata from the database.
5. Implementing the product identifier function
from ast import literal_eval

def product_identifier_func(query: str):
    """Identify products based on a given query and retrieve relevant details for each identified product.

    Args:
        query (str): A query string describing the products to identify.

    Returns:
        dict: A dictionary where the keys are product names and the values are the details of each product. Returns "No product found" if no product is identified.
    """
    # Ask the query analyzer pipeline to list the products mentioned in the query
    product_understanding = product_identifier.run({"prompt_builder": {"question": query}})

    try:
        # The LLM returns a Python-list literal, e.g. ["crossbow", "woodstock_puzzle"]
        product_list = literal_eval(product_understanding["llm"]["replies"][0])
    except Exception:
        return "No product found"

    results = {}
    for product in product_list:
        # Run the RAG pipeline once per identified product
        response = rag_pipe.run({"embedder": {"text": product}, "prompt_builder": {"question": product}})
        try:
            results[product] = literal_eval(response["llm"]["replies"][0])
        except Exception:
            results[product] = {}
    return results
At this point, we have completed building the first tool for our agent. Now, let's see if it works as expected.
query = "I want crossbow and woodstock puzzle"
#Execute function
product_identifier_func(query)
# {'crossbow': {'name': 'DB Longboards CoreFlex Crossbow 41" Bamboo Fiberglass '
# 'Longboard Complete',
# 'price': 237.68,
# 'url': 'https://www.amazon.com/DB-Longboards-CoreFlex-Fiberglass-Longboard/dp/B07KMVJJK7'},
# 'woodstock_puzzle': {'name': 'Woodstock- Collage 500 pc Puzzle',
# 'price': 17.49,
# 'url': 'https://www.amazon.com/Woodstock-Collage-500-pc-Puzzle/dp/B07MX21WWX'}}
Success! Note the schema of the returned output; its overall structure looks like this:
{
    "product_key": {
        "name": "string",
        "price": "float",
        "url": "string"
    }
}
This is exactly the schema we asked the RAG pipeline to generate. Next, let's build an optional utility function called find_budget_friendly_option.
def find_budget_friendly_option(selected_product_details):
    """Find the most budget-friendly option for each product category.

    Args:
        selected_product_details (dict): A dictionary where the keys are product categories and the values are either a single product-details dictionary or a list of them. Each product's details must contain a "price" key.

    Returns:
        dict: A dictionary where the keys are product categories and the values are the most budget-friendly product details for each category.
    """
    budget_friendly_options = {}
    for category, items in selected_product_details.items():
        if isinstance(items, list):
            # Multiple candidates: pick the one with the lowest price
            lowest_price_item = min(items, key=lambda x: x['price'])
        else:
            # Single candidate: use it as-is
            lowest_price_item = items
        budget_friendly_options[category] = lowest_price_item
    return budget_friendly_options
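Here is a quick illustration of how this utility behaves, using hypothetical data shaped like the output of product_identifier_func:

# Hypothetical example data; names, prices, and URLs are made up
sample = {
    "camping_boots": [
        {"price": 59.99, "name": "Trail Boot A", "url": "https://example.com/a"},
        {"price": 39.99, "name": "Trail Boot B", "url": "https://example.com/b"},
    ],
    "charcoal": {"price": 12.5, "name": "Charcoal Pack", "url": "https://example.com/c"},
}
find_budget_friendly_option(sample)
# {'camping_boots': {'price': 39.99, 'name': 'Trail Boot B', 'url': 'https://example.com/b'},
#  'charcoal': {'price': 12.5, 'name': 'Charcoal Pack', 'url': 'https://example.com/c'}}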
Let's now focus on the most critical aspect of this application: enabling the AI agent to use these functions as needed. As we discussed earlier, this is achieved through a model-specific tool schema, so we need to locate the schema format for our chosen model. Fortunately, it is documented in the Groq model card (https://huggingface.co/Groq/Llama-3-Groq-70B-Tool-Use). We just need to adapt it to our use case.
6. Finalizing the chat template
chat_template = '''<|start_header_id|>system<|end_header_id|>
You are a function calling AI model. You are provided with function signatures within <tools></tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions. For each function call return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:
<tool_call>
{"name": <function-name>,"arguments": <args-dict>}
</tool_call>
Here are the available tools:
<tools>
{
"name": "product_identifier_func",
"description": "To understand user interested products and its details",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "The query to use in the search. Infer this from the user's message. It should be a question or a statement"
}
},
"required": ["query"]
}
},
{
"name": "find_budget_friendly_option",
"description": "Get the most cost-friendly option. If selected_product_details has morethan one key this should return most cost-friendly options",
"parameters": {
"type": "object",
"properties": {
"selected_product_details": {
"type": "dict",
"description": "Input data is a dictionary where each key is a category name, and its value is either a single dictionary with 'price', 'name', and 'url' keys or a list of such dictionaries; example: {'category1': [{'price': 10.5, 'name': 'item1', 'url': 'http://example.com/item1'}, {'price': 8.99, 'name': 'item2', 'url': 'http://example.com/item2'}], 'category2': {'price': 15.0, 'name': 'item3', 'url': 'http://example.com/item3'}}"
}
},
"required": ["selected_product_details"]
}
}
</tools><|eot_id|><|start_header_id|>user<|end_header_id|>
I need to buy a crossbow<|eot_id|><|start_header_id|>assistant<|end_header_id|>
<tool_call>
{"id":"call_deok","name":"product_identifier_func","arguments":{"query":"I need to buy a crossbow"}}
</tool_call><|eot_id|><|start_header_id|>tool<|end_header_id|>
<tool_response>
{"id":"call_deok","result":{'crossbow': {'price': 237.68,'name': 'crossbow','url': 'https://www.amazon.com/crossbow/dp/B07KMVJJK7'}}}
</tool_response><|eot_id|><|start_header_id|>assistant<|end_header_id|>
'''
Now, only a few steps are left. Before doing anything else, let's test our agent.
## Testing the agent
from pprint import pprint

messages = [
    ChatMessage.from_system(chat_template),
    ChatMessage.from_user("I need to buy a crossbow for my child and Pokémon for myself."),
]

chat_generator = get_chat_generator()
response = chat_generator.run(messages=messages)
pprint(response)
## Response
{'replies': [ChatMessage(content='<tool_call>\n'
'{"id": 0, "name": "product_identifier_func", '
'"arguments": {"query": "I need to buy a '
'crossbow for my child"}}\n'
'</tool_call>\n'
'<tool_call>\n'
'{"id": 1, "name": "product_identifier_func", '
'"arguments": {"query": "I need to buy a '
'Pokemon for myself"}}\n'
'</tool_call>',
role=<ChatRole.ASSISTANT: 'assistant'>,
name=None,
meta={'finish_reason': 'stop',
'index': 0,
'model': 'llama3-groq-70b-8192-tool-use-preview',
'usage': {'completion_time': 0.217823967,
'completion_tokens': 70,
'prompt_time': 0.041348261,
'prompt_tokens': 561,
'total_time': 0.259172228,
'total_tokens': 631}})]}
At this point, we are about 90% done.
In the above response, you may have noticed that each tool call is wrapped in <tool_call> XML tags. We therefore need a mechanism to extract the tool_call objects.
import re
import json

def extract_tool_calls(tool_calls_str):
    # Pull out every JSON object wrapped in <tool_call>...</tool_call> tags
    json_objects = re.findall(r'<tool_call>(.*?)</tool_call>', tool_calls_str, re.DOTALL)
    result_list = [json.loads(obj) for obj in json_objects]
    return result_list

# Map tool names, as the model emits them, to the actual Python functions
available_functions = {
    "product_identifier_func": product_identifier_func,
    "find_budget_friendly_option": find_budget_friendly_option
}
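To see what the extractor produces, you can feed it a reply string in the same format as the test output above (the string here is illustrative):

# Illustrative reply string in the format the model produced earlier
sample_reply = '''<tool_call>
{"id": 0, "name": "product_identifier_func", "arguments": {"query": "I want crossbow"}}
</tool_call>'''
extract_tool_calls(sample_reply)
# [{'id': 0, 'name': 'product_identifier_func', 'arguments': {'query': 'I want crossbow'}}]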
With this step complete, we can directly access the agent's response whenever it calls a tool. The only thing outstanding now is getting the tool-call objects and executing the corresponding functions. Let's get this part done as well.
messages.append(ChatMessage.from_user(message))
response = chat_generator.run(messages=messages)

if response and "<tool_call>" in response["replies"][0].content:
    function_calls = extract_tool_calls(response["replies"][0].content)
    for function_call in function_calls:
        # Parse the function call information
        function_name = function_call["name"]
        function_args = function_call["arguments"]

        # Look up the corresponding function and call it with the given arguments
        function_to_call = available_functions[function_name]
        function_response = function_to_call(**function_args)

        # Append the function response to the message list via `ChatMessage.from_function`
        messages.append(ChatMessage.from_function(content=json.dumps(function_response), name=function_name))
        response = chat_generator.run(messages=messages)
Now, it's time to connect all of the previous components together into a complete chat application. For this, I chose Gradio, a powerful open source Python library for building interactive web UIs for machine learning models.
import gradio as gr

messages = [ChatMessage.from_system(chat_template)]
chat_generator = get_chat_generator()

def chatbot_with_fc(message, messages):
    messages.append(ChatMessage.from_user(message))
    response = chat_generator.run(messages=messages)

    while True:
        if response and "<tool_call>" in response["replies"][0].content:
            function_calls = extract_tool_calls(response["replies"][0].content)
            for function_call in function_calls:
                # Parse the function call information
                function_name = function_call["name"]
                function_args = function_call["arguments"]

                # Look up the corresponding function and call it with the given arguments
                function_to_call = available_functions[function_name]
                function_response = function_to_call(**function_args)

                # Append the function response to the message list via `ChatMessage.from_function`
                messages.append(ChatMessage.from_function(content=json.dumps(function_response), name=function_name))
                response = chat_generator.run(messages=messages)
        else:
            # Regular conversation turn: no tool call in the reply
            messages.append(response["replies"][0])
            break
    return response["replies"][0].content

def chatbot_interface(user_input, state):
    response_content = chatbot_with_fc(user_input, state)
    return response_content, state

with gr.Blocks() as demo:
    gr.Markdown("# AI Purchase Assistant")
    gr.Markdown("Ask me about products you want to buy!")
    state = gr.State(value=messages)
    with gr.Row():
        user_input = gr.Textbox(label="Your message:")
        response_output = gr.Markdown(label="Response:")
    user_input.submit(chatbot_interface, [user_input, state], [response_output, state])
    gr.Button("Send").click(chatbot_interface, [user_input, state], [response_output, state])

demo.launch()
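Running this script starts a local Gradio server and opens the app in your browser; if you need a temporary public link, for example to test from another device, Gradio supports demo.launch(share=True) out of the box.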
That's it! We have now successfully built an AI agent with function-calling capabilities based on the Llama 3 model. You can access its complete source code from the GitHub repository (https://github.com/Ransaka/ai-agents-with-llama3).
Additionally, you can access the dataset used in this article through the Kaggle link (https://www.kaggle.com/datasets/promptcloud/amazon-product-dataset-2020).
7. Conclusion
In summary, when building AI agent-based systems, it is important to consider the time required to complete tasks and the number of API calls (and tokens) each task consumes. A major challenge in this area is reducing hallucinations, which remains a very active research topic. There are no fixed rules for building LLM and agent systems; development teams must plan patiently and strategically to ensure the agent behaves as intended.
Finally, unless otherwise stated, all images in this article are provided by the author.