Agents#
AI agents are programs in which an AI model controls the workflow. This page discusses the conceptual ideas and best practices associated with implementing LLM agents.
Tools#
In agentic systems, tools are APIs or other specialized programs that the model can invoke as needed. Information about how these tools can be used is built into the system prompt, allowing the model to decide whether to use a tool.
For example, modern machine learning models aren’t very good at arithmetic, so we’ll provide a tool for adding numbers. It is literally a Python function:
def sum(a: int, b: int) -> int:
    """Summation of the given numbers."""
    return a + b
In the system prompt, we declare to the model that it can use this tool, and carefully describe the tool’s input and output. For the example under consideration, the tool-related part of the system prompt can look like:
...
You're supposed to use tools:
Name: sum; Input: int, int; Output: int; Description: "Summation of the given numbers."
...
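To make this concrete, here is a minimal sketch of how a host application might detect and dispatch such a tool call from the model’s reply. The `CALL:` convention and the reply string are assumptions for illustration, not part of any library API:

```python
import re

def sum(a: int, b: int) -> int:
    """Summation of the given numbers."""
    return a + b

# A hypothetical model reply that follows the declared tool format.
model_reply = "CALL: sum(2, 3)"

# Detect whether the model decided to invoke the tool and extract its arguments.
match = re.match(r"CALL:\s*sum\((\d+),\s*(\d+)\)", model_reply)
if match:
    a, b = int(match.group(1)), int(match.group(2))
    result = sum(a, b)  # 5
```

The key point is that the model only *describes* the call in text; the surrounding program is responsible for recognizing that description and actually executing the function.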
Actions#
Actions are the concrete steps an AI agent takes to interact with its environment.
The way agents provide commands to the environment differentiates their actions.
Type of Agent | Description
---|---
JSON Agent | The action to take is specified in JSON format.
Code Agent | The agent writes a code block that is executed in the environment.
Function-calling Agent | A subcategory of the JSON agent that has been fine-tuned to generate a new message for each action.
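As an illustration of the JSON-agent style, the action can be a small JSON object that the host application parses and dispatches. The field names below (`tool`, `arguments`) are a hypothetical schema chosen for this sketch:

```python
import json

# Hypothetical JSON action emitted by the model.
model_output = '{"tool": "sum", "arguments": {"a": 2, "b": 3}}'

# The host application maps tool names to callables and executes the action.
tools = {"sum": lambda a, b: a + b}

action = json.loads(model_output)
result = tools[action["tool"]](**action["arguments"])  # 5
```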
Actions are also differentiated by their purpose. Typical purposes are:
Purpose | Description
---|---
Information Gathering | Performing web searches, querying databases, or retrieving documents.
Tool Usage | Making API calls, running calculations, and executing code.
Environment Interaction | Manipulating digital interfaces or controlling physical devices.
Communication | Engaging with users via chat or collaborating with other agents.
Vanilla agent#
There are several frameworks for implementing LLM agents. In this section, we will implement an LLM agent with Hugging Face and raw Python to better understand the issues those frameworks solve.
For example, let’s consider the task of building a bot that can assist with basic arithmetic operations.
The following cell defines the Hugging Face InferenceClient, which we’ll use as an endpoint providing access to the model.
from huggingface_hub import InferenceClient
client = InferenceClient(model="meta-llama/Llama-4-Scout-17B-16E-Instruct")
These are the tools available to the agent we’re developing. Each function implements a basic arithmetic operation and is exposed to the model as a separate tool.
def sum(a: int, b: int) -> int:
    return a + b

def multiply(a: int, b: int) -> int:
    return a * b

def subtract(a: int, b: int) -> int:
    return a - b

def divide(a: int, b: int) -> float:
    return a / b
The following cell defines the messages template. The most important thing here is the system prompt which defines the tools and how they can be used.
system_message = """
You are a math helper bot. You will be provided with a question from a user.
It includes only summation, multiplication, subtraction and division.
To complete the tasks you have tools which are python functions you can call.
TOOLS:
- sum(a: int, b: int) -> int
- multiply(a: int, b: int) -> int
- subtract(a: int, b: int) -> int
- divide(a: int, b: int) -> float
If you have detected an arithmetic task in the question you have to answer:
```
CALL: <code of the function call>
```
""".strip()
messages_template = [
    {"role": "system", "content": system_message},
]
The following function implements the agent’s logic. If the model’s response matches the pattern expected for a tool invocation, the function calls the tool. To verify that the model actually invoked the tool rather than computing the answer by itself, a message is printed to stdout whenever a tool call is detected.
def process_request(request: str) -> str:
    output = client.chat_completion(
        messages=messages_template + [{"role": "user", "content": request}]
    )
    content = output.choices[0].message.content
    if content is None:
        return "No content generated"
    if content.startswith("CALL:"):
        print("Tool detected")
        # Extract the function call: drop the "CALL:" prefix
        # and any surrounding backtick fences.
        code = content.split("CALL:")[1].strip().strip("`").strip()
        ans = eval(code)
        return str(ans)
    else:
        return content
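Note that `eval` on model output executes arbitrary code, which is acceptable only in a toy example. One possible hardening (a sketch, not part of the implementation above) parses the generated call with the standard-library `ast` module and dispatches only whitelisted tools with literal arguments:

```python
import ast

# Whitelist of tools the model is allowed to invoke (hypothetical subset).
TOOLS = {"sum": lambda a, b: a + b, "divide": lambda a, b: a / b}

def safe_call(code: str):
    """Execute a model-generated call only if it is a whitelisted tool
    applied to literal arguments; raise ValueError otherwise."""
    tree = ast.parse(code, mode="eval")
    call = tree.body
    if not isinstance(call, ast.Call) or not isinstance(call.func, ast.Name):
        raise ValueError("Not a simple function call")
    if call.func.id not in TOOLS:
        raise ValueError(f"Unknown tool: {call.func.id}")
    # literal_eval rejects anything that is not a plain literal.
    args = [ast.literal_eval(arg) for arg in call.args]
    return TOOLS[call.func.id](*args)

safe_call("sum(22, 4)")  # 26
```

With this approach, a malicious or malformed generation such as `__import__('os')` raises an error instead of being executed.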
Consider the results of using the implemented “agent”.
process_request("What is 22 - 4?")
Tool detected
'18'
And the same kind of request, phrased a bit differently.
process_request("Multiply 3 by 7")
Tool detected
'21'
This is an example of when the tool is not needed.
print(process_request("What is the capital of France?"))
I'm a math helper bot, and I'm here to help with mathematical tasks. Unfortunately, your question about the capital of France is not a math-related question. However, I can tell you that the capital of France is Paris.
If you have a math question, I'd be happy to help! Please feel free to ask.