In this article, you will learn the complete process of building an AI agent with function-calling capabilities, based on Meta's Llama 3 model and the open source UI library Gradio.
1. Introduction
Imagine you want to buy something. You visit an e-commerce website and use the search option to find what you need. If you have many things to buy, this process quickly becomes inefficient. Now, consider this scenario: you open an application, describe what you want in plain English, and press enter. You don't have to worry about searching and comparing prices, because the application handles all of that automatically. Cool, right? This is exactly what we are going to build in this article.
Let's look at some examples first.
Next, let's flesh out the features of this application. We will use the open source Llama 3 model developed by Meta, which has function-calling capabilities. The sample program in this article can also be implemented with version 3.1 of the model; according to Meta's announcement (https://ai.meta.com/blog/meta-llama-3-1/), the 3.1 models can use tools and functions more effectively.
Note: These models are multilingual, support a longer context length of 128K tokens, offer state-of-the-art tool use, and have stronger overall reasoning capabilities.
I will be using the Groq cloud platform for the development in this article, specifically its Llama 3 70B tool-use model. The initial workflow for this application includes an embedding model, a retriever, and two main tools: one for handling the user's purchase interests and one for cost-related questions. In summary, we need something similar to the components described in the figure below.
Now we have to choose a framework for orchestrating the LLM components. For this, I chose my all-time favorite production-grade open source AI framework: Haystack (https://haystack.deepset.ai/).
Now that we have what we need, let's get started with the key development work!
2. Loading and indexing data
Since we are using a RAG pipeline in this sample program, we first build a document indexing service backed by the in-memory vector database provided by Haystack. Note that each document in our vector database contains the following fields:
- Content: the text we use to perform similarity search
- Id: a unique identifier for the product
- Price: the product price
- URL: the product URL
When our RAG pipeline is called, the Content field is used for the vector search; all the other fields are stored as metadata in the vector database. Saving this metadata is critical, because it is what we present to the user in the front end.
Next, let’s see how to achieve this.
from haystack import Pipeline, Document
from haystack.document_stores.in_memory import InMemoryDocumentStore
from haystack.components.writers import DocumentWriter
from haystack.components.embedders import SentenceTransformersDocumentEmbedder
from haystack.components.generators import OpenAIGenerator
from haystack.utils import Secret
from haystack.components.generators.chat import OpenAIChatGenerator
from haystack.components.builders import PromptBuilder
from haystack.components.embedders import SentenceTransformersTextEmbedder
from haystack.components.retrievers.in_memory import InMemoryEmbeddingRetriever
from haystack.dataclasses import ChatMessage
import pandas as pd

# Load product data from CSV
df = pd.read_csv("product_sample.csv")

# Initialize the in-memory document store
document_store = InMemoryDocumentStore()

# Convert product rows into Haystack Document objects
documents = [
    Document(
        content=item.product_name,
        meta={
            "id": item.uniq_id,
            "price": item.selling_price,
            "url": item.product_url,
        },
    )
    for item in df.itertuples()
]

# Create a pipeline for indexing the documents
indexing_pipeline = Pipeline()

# Add a document embedder using a Sentence Transformers model
indexing_pipeline.add_component(
    instance=SentenceTransformersDocumentEmbedder(model="sentence-transformers/all-MiniLM-L6-v2"),
    name="doc_embedder",
)

# Add a document writer to store the embedded documents in the document store
indexing_pipeline.add_component(instance=DocumentWriter(document_store=document_store), name="doc_writer")

# Connect the embedder's output to the writer's input
indexing_pipeline.connect("doc_embedder.documents", "doc_writer.documents")

# Run the indexing pipeline to embed and store the documents
indexing_pipeline.run({"doc_embedder": {"documents": documents}})
Great, we have completed the first step of our AI agent application. Now it's time to build the product identifier function tool. To better understand the main task of the product identifier, consider the following example.
The user's query is as follows:

I want to buy a camping boot, a charcoal and google pixel 9 back cover.
Now, let's take a look at the idealized workflow of the product identifier function.
First, we need to create a tool to analyze user queries and identify the products that the user is interested in. We can build such a tool using the following code snippet.
3. Building a User Query Analyzer
template = """
Understand the user query and list of products the user is interested in and return product names as list.
You should always return a Python list. Do not return any explanation.
Examples:
Question: I am interested in camping boots, charcoal and disposable rain jacket.
Answer: ["camping_boots","charcoal","disposable_rain_jacket"]
Question: Need a laptop, wireless mouse, and noise-cancelling headphones for work.
Answer: ["laptop","wireless_mouse","noise_cancelling_headphones"]
Question: {{ question }}
Answer:
"""
product_identifier = Pipeline()
product_identifier.add_component("prompt_builder", PromptBuilder(template=template))
product_identifier.add_component("llm", generator())
product_identifier.connect("prompt_builder", "llm")
Ok, now that we are halfway through our first function, it is time to complete the function by adding the RAG pipeline.
4. Creating a RAG Pipeline
template = """
Return product name, price, and url as a python dictionary.
You should always return a Python dictionary with keys price, name and url for single product.
You should always return a Python list of dictionaries with keys price, name and url for multiple products.
Do not return any explanation.
Legitimate Response Schema:
{"price": "float", "name": "string", "url": "string"}
Legitimate Response Schema for multiple products:
[{"price": "float", "name": "string", "url": "string"},{"price": "float", "name": "string", "url": "string"}]
Context:
{% for document in documents %}
product_price: {{ document.meta['price'] }}
product_url: {{ document.meta['url'] }}
product_id: {{ document.meta['id'] }}
product_name: {{ document.content }}
{% endfor %}
Question: {{ question }}
Answer:
"""
rag_pipe = Pipeline()
rag_pipe.add_component("embedder", SentenceTransformersTextEmbedder(model="sentence-transformers/all-MiniLM-L6-v2"))
rag_pipe.add_component("retriever", InMemoryEmbeddingRetriever(document_store=document_store, top_k=5))
rag_pipe.add_component("prompt_builder", PromptBuilder(template=template))
rag_pipe.add_component("llm", generator())
rag_pipe.connect("embedder.embedding", "retriever.query_embedding")
rag_pipe.connect("retriever", "prompt_builder.documents")
rag_pipe.connect("prompt_builder", "llm")
After executing the above code, we have completed the construction of the RAG and query-analysis pipelines. Now it is time to convert them into a tool. For this, we can use a regular function declaration, as shown below; creating a tool for an AI agent is just like writing a Python function. You might now be asking:
How does the agent call this function?
The solution is simple: leverage the model-specific tool schema. We will incorporate this schema in a later step. For now, it is time to create a wrapper function that uses both the query analyzer and the RAG pipeline.
Let's first clarify the goals of this function.
Goal 1: Identify all products that the user is interested in and return them as a list.
Goal 2: For each identified product, retrieve up to five matching products and their metadata from the database.
5. Implementing the product identifier function
from ast import literal_eval

def product_identifier_func(query: str):
    """Identify products based on a given query and retrieve relevant details for each identified product.

    Args:
        query (str): A query string describing the products to identify.

    Returns:
        dict: A dictionary where the keys are product names and the values are the details of each product. Returns "No product found" if no product is identified.
    """
    # Ask the query analyzer pipeline to list the products mentioned in the query
    product_understanding = product_identifier.run({"prompt_builder": {"question": query}})

    try:
        # The LLM returns a Python-list literal, e.g. ["crossbow", "woodstock_puzzle"]
        product_list = literal_eval(product_understanding["llm"]["replies"][0])
    except Exception:
        return "No product found"

    results = {}
    for product in product_list:
        # Run the RAG pipeline once per identified product
        response = rag_pipe.run({"embedder": {"text": product}, "prompt_builder": {"question": product}})
        try:
            results[product] = literal_eval(response["llm"]["replies"][0])
        except Exception:
            results[product] = {}
    return results
At this point, we have completed building the first tool for our agent. Now, let's see if it works as expected.
query = "I want crossbow and woodstock puzzle"
#Execute function
product_identifier_func(query)
# {'crossbow': {'name': 'DB Longboards CoreFlex Crossbow 41" Bamboo Fiberglass '
# 'Longboard Complete',
# 'price': 237.68,
# 'url': 'https://www.amazon.com/DB-Longboards-CoreFlex-Fiberglass-Longboard/dp/B07KMVJJK7'},
# 'woodstock_puzzle': {'name': 'Woodstock- Collage 500 pc Puzzle',
# 'price': 17.49,
# 'url': 'https://www.amazon.com/Woodstock-Collage-500-pc-Puzzle/dp/B07MX21WWX'}}
Success! Note the schema of the returned output; its overall structure looks like this:
{
    "product_key": {
        "name": "string",
        "price": "float",
        "url": "string"
    }
}
This is exactly the schema we asked the RAG pipeline to generate. Next, let's build an optional utility function called find_budget_friendly_option.
def find_budget_friendly_option(selected_product_details):
    """Find the most budget-friendly option for each product category.

    Args:
        selected_product_details (dict): A dictionary where the keys are product categories and the values are either a single product-details dictionary or a list of them. Each product's details must contain a "price" key.

    Returns:
        dict: A dictionary where the keys are product categories and the values are the most budget-friendly product details for each category.
    """
    budget_friendly_options = {}
    for category, items in selected_product_details.items():
        if isinstance(items, list):
            # Multiple candidates: pick the one with the lowest price
            lowest_price_item = min(items, key=lambda x: x['price'])
        else:
            # Single candidate: use it as-is
            lowest_price_item = items
        budget_friendly_options[category] = lowest_price_item
    return budget_friendly_options
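Here is a quick illustration of how this utility behaves, using hypothetical data shaped like the output of product_identifier_func:

# Hypothetical example data; names, prices, and URLs are made up
sample = {
    "camping_boots": [
        {"price": 59.99, "name": "Trail Boot A", "url": "https://example.com/a"},
        {"price": 39.99, "name": "Trail Boot B", "url": "https://example.com/b"},
    ],
    "charcoal": {"price": 12.5, "name": "Charcoal Pack", "url": "https://example.com/c"},
}
find_budget_friendly_option(sample)
# {'camping_boots': {'price': 39.99, 'name': 'Trail Boot B', 'url': 'https://example.com/b'},
#  'charcoal': {'price': 12.5, 'name': 'Charcoal Pack', 'url': 'https://example.com/c'}}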
Let's now focus on the most critical aspect of this application: enabling the AI agent to use these functions as needed. As we discussed earlier, this is achieved through a model-specific tool schema, so we need to locate the schema format for our chosen model. Fortunately, it is documented in the Groq model card (https://huggingface.co/Groq/Llama-3-Groq-70B-Tool-Use). We just need to adapt it to our use case.
6. Finalizing the chat template
chat_template = '''<|start_header_id|>system<|end_header_id|>
You are a function calling AI model. You are provided with function signatures within <tools></tools> XML tags. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions. For each function call return a json object with function name and arguments within <tool_call></tool_call> XML tags as follows:
<tool_call>
{"name": <function-name>,"arguments": <args-dict>}
</tool_call>
Here are the available tools:
<tools>
{
"name": "product_identifier_func",
"description": "To understand user interested products and its details",
"parameters": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "The query to use in the search. Infer this from the user's message. It should be a question or a statement"
}
},
"required": ["query"]
}
},
{
"name": "find_budget_friendly_option",
"description": "Get the most cost-friendly option. If selected_product_details has morethan one key this should return most cost-friendly options",
"parameters": {
"type": "object",
"properties": {
"selected_product_details": {
"type": "dict",
"description": "Input data is a dictionary where each key is a category name, and its value is either a single dictionary with 'price', 'name', and 'url' keys or a list of such dictionaries; example: {'category1': [{'price': 10.5, 'name': 'item1', 'url': 'http://example.com/item1'}, {'price': 8.99, 'name': 'item2', 'url': 'http://example.com/item2'}], 'category2': {'price': 15.0, 'name': 'item3', 'url': 'http://example.com/item3'}}"
}
},
"required": ["selected_product_details"]
}
}
</tools><|eot_id|><|start_header_id|>user<|end_header_id|>
I need to buy a crossbow<|eot_id|><|start_header_id|>assistant<|end_header_id|>
<tool_call>
{"id":"call_deok","name":"product_identifier_func","arguments":{"query":"I need to buy a crossbow"}}
</tool_call><|eot_id|><|start_header_id|>tool<|end_header_id|>
<tool_response>
{"id":"call_deok","result":{'crossbow': {'price': 237.68,'name': 'crossbow','url': 'https://www.amazon.com/crossbow/dp/B07KMVJJK7'}}}
</tool_response><|eot_id|><|start_header_id|>assistant<|end_header_id|>
'''
Now, only a few steps are left. Before doing anything else, let's test our agent.
## Testing the agent
from pprint import pprint

messages = [
    ChatMessage.from_system(chat_template),
    ChatMessage.from_user("I need to buy a crossbow for my child and Pokémon for myself."),
]

chat_generator = get_chat_generator()
response = chat_generator.run(messages=messages)
pprint(response)
## Response
{'replies': [ChatMessage(content='<tool_call>\n'
'{"id": 0, "name": "product_identifier_func", '
'"arguments": {"query": "I need to buy a '
'crossbow for my child"}}\n'
'</tool_call>\n'
'<tool_call>\n'
'{"id": 1, "name": "product_identifier_func", '
'"arguments": {"query": "I need to buy a '
'Pokemon for myself"}}\n'
'</tool_call>',
role=<ChatRole.ASSISTANT: 'assistant'>,
name=None,
meta={'finish_reason': 'stop',
'index': 0,
'model': 'llama3-groq-70b-8192-tool-use-preview',
'usage': {'completion_time': 0.217823967,
'completion_tokens': 70,
'prompt_time': 0.041348261,
'prompt_tokens': 561,
'total_time': 0.259172228,
'total_tokens': 631}})]}
At this point, we are about 90% done.
In the above response, you may have noticed that each tool call is wrapped in <tool_call> XML tags. We therefore need a mechanism to extract the tool_call objects.
import re
import json

def extract_tool_calls(tool_calls_str):
    # Pull out every JSON object wrapped in <tool_call>...</tool_call> tags
    json_objects = re.findall(r'<tool_call>(.*?)</tool_call>', tool_calls_str, re.DOTALL)
    result_list = [json.loads(obj) for obj in json_objects]
    return result_list

# Map tool names, as the model emits them, to the actual Python functions
available_functions = {
    "product_identifier_func": product_identifier_func,
    "find_budget_friendly_option": find_budget_friendly_option
}
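To see what the extractor produces, you can feed it a reply string in the same format as the test output above (the string here is illustrative):

# Illustrative reply string in the format the model produced earlier
sample_reply = '''<tool_call>
{"id": 0, "name": "product_identifier_func", "arguments": {"query": "I want crossbow"}}
</tool_call>'''
extract_tool_calls(sample_reply)
# [{'id': 0, 'name': 'product_identifier_func', 'arguments': {'query': 'I want crossbow'}}]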
With this step complete, we can directly access the agent's response whenever it calls a tool. The only thing outstanding now is getting the tool-call objects and executing the corresponding functions. Let's get this part done as well.
messages.append(ChatMessage.from_user(message))
response = chat_generator.run(messages=messages)

if response and "<tool_call>" in response["replies"][0].content:
    function_calls = extract_tool_calls(response["replies"][0].content)
    for function_call in function_calls:
        # Parse the function call information
        function_name = function_call["name"]
        function_args = function_call["arguments"]

        # Look up the corresponding function and call it with the given arguments
        function_to_call = available_functions[function_name]
        function_response = function_to_call(**function_args)

        # Append the function response to the message list via `ChatMessage.from_function`
        messages.append(ChatMessage.from_function(content=json.dumps(function_response), name=function_name))
        response = chat_generator.run(messages=messages)
Now, it's time to connect all of the previous components together into a complete chat application. For this, I chose Gradio, a powerful open source Python library for building interactive web UIs for machine learning models.
import gradio as gr

messages = [ChatMessage.from_system(chat_template)]
chat_generator = get_chat_generator()

def chatbot_with_fc(message, messages):
    messages.append(ChatMessage.from_user(message))
    response = chat_generator.run(messages=messages)

    while True:
        if response and "<tool_call>" in response["replies"][0].content:
            function_calls = extract_tool_calls(response["replies"][0].content)
            for function_call in function_calls:
                # Parse the function call information
                function_name = function_call["name"]
                function_args = function_call["arguments"]

                # Look up the corresponding function and call it with the given arguments
                function_to_call = available_functions[function_name]
                function_response = function_to_call(**function_args)

                # Append the function response to the message list via `ChatMessage.from_function`
                messages.append(ChatMessage.from_function(content=json.dumps(function_response), name=function_name))
                response = chat_generator.run(messages=messages)
        else:
            # Regular conversation turn: no tool call in the reply
            messages.append(response["replies"][0])
            break
    return response["replies"][0].content

def chatbot_interface(user_input, state):
    response_content = chatbot_with_fc(user_input, state)
    return response_content, state

with gr.Blocks() as demo:
    gr.Markdown("# AI Purchase Assistant")
    gr.Markdown("Ask me about products you want to buy!")
    state = gr.State(value=messages)
    with gr.Row():
        user_input = gr.Textbox(label="Your message:")
        response_output = gr.Markdown(label="Response:")
    user_input.submit(chatbot_interface, [user_input, state], [response_output, state])
    gr.Button("Send").click(chatbot_interface, [user_input, state], [response_output, state])

demo.launch()
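Running this script starts a local Gradio server and opens the app in your browser; if you need a temporary public link, for example to test from another device, Gradio supports demo.launch(share=True) out of the box.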
That's it! We have now successfully built an AI agent with function-calling capabilities based on the Llama 3 model. You can access its complete source code from the GitHub repository (https://github.com/Ransaka/ai-agents-with-llama3).
Additionally, you can access the dataset used in this article through the Kaggle link (https://www.kaggle.com/datasets/promptcloud/amazon-product-dataset-2020).
7. Conclusion
In summary, when building AI agent-based systems, it is important to consider the time required to complete tasks and the number of API calls (and tokens) each task consumes. A major challenge in this area is reducing hallucinations, which remains a very active research topic. There are no fixed rules for building LLM and agent systems; development teams must plan patiently and strategically to ensure the agent behaves as intended.
Finally, unless otherwise stated, all images in this article are provided by the author.