A step-by-step guide
OpenAI Assistant API
OpenAI has recently introduced new features built around an agent-like architecture, most notably the Assistants API. According to OpenAI:
The Assistants API allows you to build AI assistants within your own applications. An Assistant has instructions and can leverage models, tools, and files to respond to user queries. The Assistants API currently supports three types of tools: Code Interpreter, File Search, and Function calling.
While these advancements are promising, they still lag behind LangChain, which enables the creation of agent-like systems powered by LLMs with greater flexibility in processing natural language input and executing context-based actions.
However, this is only the beginning.
At a high level, interaction with the Assistant API can be envisioned as a loop:
- Given a user input, an LLM is called to determine whether to provide a response or take specific actions.
- If the LLM’s decision suffices to answer the query, the loop ends.
- If an action leads to a new observation, this observation is included in the prompt, and the LLM is called again.
- The loop then restarts.
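As a rough, framework-free illustration of this loop, here is a minimal sketch in plain Python. The llm and run_tool functions are hypothetical stand-ins, not part of any OpenAI SDK: the former returns either a final answer or a tool request, and the latter plays the role of the tool executor.

```python
# A minimal sketch of the agent loop described above.
# `llm` and `run_tool` are hypothetical stand-ins, not real API calls.

def llm(prompt: str) -> dict:
    # Toy decision rule: request a tool once, then answer using the observation.
    if "Observation:" in prompt:
        observation = prompt.rsplit("Observation: ", 1)[-1]
        return {"type": "answer", "content": f"The answer is {observation}."}
    return {"type": "action", "tool": "lookup", "input": "population of Paris"}

def run_tool(name: str, tool_input: str) -> str:
    # Hypothetical tool registry with a single canned tool.
    tools = {"lookup": lambda query: "about 2.1 million"}
    return tools[name](tool_input)

def agent_loop(user_input: str) -> str:
    prompt = user_input
    while True:
        decision = llm(prompt)
        if decision["type"] == "answer":  # the LLM's response suffices: stop
            return decision["content"]
        # Otherwise run the requested action and feed the observation back in
        observation = run_tool(decision["tool"], decision["input"])
        prompt += f"\nObservation: {observation}"

print(agent_loop("How many people live in Paris?"))
# prints "The answer is about 2.1 million."
```

A real implementation would replace llm with a model call and run_tool with actual tools, but the control flow stays the same.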
Unfortunately, despite the announced advantages, I found the API documentation lacking, especially regarding interactions with custom function calls and building apps with frameworks like Streamlit.
In this blog post, I will guide you through building an AI assistant using the OpenAI Assistant API with custom function calls, paired with a Streamlit interface, to help those interested in effectively using the Assistant API.
Use case: Tax Computation Assistant
In this blog post, I will demonstrate a simple example: an AI assistant capable of calculating tax based on a given revenue. LangChain users might immediately think of implementing this by creating an agent with a “tax computation” tool.
This tool would include the necessary computation steps and a well-designed prompt to ensure the LLM knows when to call the tool whenever a question involves revenue or tax.
However, the process works somewhat differently with the OpenAI Assistant API. While the code interpreter and file search tools can be used directly in a straightforward manner according to OpenAI’s documentation, custom tools require a slightly different approach. Consider, for contrast, how an assistant using the built-in code interpreter tool is created:
assistant = client.beta.assistants.create(
    name="Data visualizer",
    instructions="You are great at creating beautiful data visualizations. You analyze data present in .csv files, understand trends, and come up with data visualizations relevant to those trends. You also share a brief text summary of the trends observed.",
    model="gpt-4o",
    tools=[{"type": "code_interpreter"}],
)
Let’s break it down step by step. We aim to:
- Define a function that computes tax based on given revenue.
- Develop a tool using this function.
- Create an assistant that can access this tool and call it whenever tax computation is needed.
Tax Computation Function for Assistant Integration
Please note that the tax computation tool described in the following paragraph is designed as a toy example to demonstrate how to use the API discussed in the post. It should not be used for actual tax calculations.
Consider the following piecewise function, which returns the tax value for a given revenue. Note that the input is set as a string for simpler parsing:
def calculate_tax(revenue: str):
    try:
        revenue = float(revenue)
    except ValueError:
        raise ValueError("The revenue should be a string representation of a number.")
    if revenue <= 10000:
        tax = 0
    elif revenue <= 30000:
        tax = 0.10 * (revenue - 10000)
    elif revenue <= 70000:
        tax = 2000 + 0.20 * (revenue - 30000)
    elif revenue <= 150000:
        tax = 10000 + 0.30 * (revenue - 70000)
    else:
        tax = 34000 + 0.40 * (revenue - 150000)
    return tax
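As an aside, the same schedule can also be written as a bracket table, which makes the boundary values easy to sanity-check by hand (for example, a revenue of 50,000 falls in the third bracket: 2000 + 0.20 × 20,000 = 6,000). This is an equivalent sketch for checking, not the code the assistant uses:

```python
# Equivalent bracket-table formulation of the toy schedule above.
# Each entry is (upper bound of the bracket, marginal rate in that bracket).
BRACKETS = [
    (10_000, 0.0),
    (30_000, 0.10),
    (70_000, 0.20),
    (150_000, 0.30),
    (float("inf"), 0.40),
]

def tax_from_brackets(revenue: float) -> float:
    total, lower = 0.0, 0.0
    for upper, rate in BRACKETS:
        # Tax only the slice of revenue that falls inside this bracket
        total += rate * (min(revenue, upper) - lower)
        if revenue <= upper:
            return total
        lower = upper

print(tax_from_brackets(50_000))   # ≈ 6000.0
print(tax_from_brackets(150_000))  # ≈ 34000.0
```

Both formulations agree on every bracket boundary; the table form simply makes it easier to change rates later.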
Next, we define the assistant:
function_tools = [
    {
        "type": "function",
        "function": {
            "name": "calculate_tax",
            "description": "Get the tax for given revenue in euro",
            "parameters": {
                "type": "object",
                "properties": {
                    "revenue": {
                        "type": "string",
                        "description": "Annual revenue in euro",
                    }
                },
                "required": ["revenue"],
            },
        },
    }
]
# Define the assistant
assistant = client.beta.assistants.create(
    name="Assistant",
    instructions="",  # left empty here; the tool description guides the model
    tools=function_tools,
    model="gpt-4o",
)
Now, the essential point:
How does the assistant use the function when “calculate_tax” is called? This part is poorly documented by OpenAI, and many users get confused the first time they use it. To handle it, we need to define an EventHandler that manages the different events in the response stream, and specifically the event emitted when the “calculate_tax” tool is called.
# Method of the EventHandler class (requires `import ast` at module level)
def handle_requires_action(self, data, run_id):
    tool_outputs = []
    for tool in data.required_action.submit_tool_outputs.tool_calls:
        if tool.function.name == "calculate_tax":
            try:
                # Extract the revenue from the tool arguments (a JSON string)
                revenue = ast.literal_eval(tool.function.arguments)["revenue"]
                # Call the calculate_tax function to get the tax
                tax_result = calculate_tax(revenue)
                # Append the tool output in the required format
                tool_outputs.append({"tool_call_id": tool.id, "output": f"{tax_result}"})
            except ValueError as e:
                # Report failures through the "output" field as well: the API
                # only accepts "tool_call_id" and "output" keys here
                tool_outputs.append({"tool_call_id": tool.id, "output": f"Error: {e}"})
    # Submit all tool outputs at the same time
    self.submit_tool_outputs(tool_outputs)
The code above works as follows. For each tool call that requires action:
- Check if the function name is “calculate_tax”.
- Extract the revenue value from the tool parameters.
- Call the calculate_tax function with the revenue to compute the tax. (This is where the real interaction happens.)
- After processing all tool calls, submit the collected results.
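To see this dispatch logic in isolation, here is a hypothetical, SDK-free simulation. SimpleNamespace objects stand in for the tool-call objects the API delivers (same attribute shape: id, function.name, and function.arguments as a JSON string), and build_tool_outputs mirrors the loop inside handle_requires_action:

```python
import json
from types import SimpleNamespace

def calculate_tax(revenue: str) -> float:
    # Same toy schedule as above, collapsed for brevity.
    r = float(revenue)
    if r <= 10000:
        return 0.0
    if r <= 30000:
        return 0.10 * (r - 10000)
    if r <= 70000:
        return 2000 + 0.20 * (r - 30000)
    if r <= 150000:
        return 10000 + 0.30 * (r - 70000)
    return 34000 + 0.40 * (r - 150000)

def build_tool_outputs(tool_calls):
    # Mirrors the loop inside handle_requires_action: match the function name,
    # parse its arguments (a JSON string; the post uses ast.literal_eval, and
    # json.loads parses it too), run the function, and collect the outputs.
    tool_outputs = []
    for tool in tool_calls:
        if tool.function.name == "calculate_tax":
            revenue = json.loads(tool.function.arguments)["revenue"]
            tool_outputs.append({"tool_call_id": tool.id, "output": str(calculate_tax(revenue))})
    return tool_outputs

# A fake tool call with the same attribute shape the Assistants API delivers
fake_call = SimpleNamespace(
    id="call_1",
    function=SimpleNamespace(name="calculate_tax", arguments='{"revenue": "50000"}'),
)
print(build_tool_outputs([fake_call]))
```

In the real handler, the collected tool_outputs are submitted back to the run via self.submit_tool_outputs rather than returned.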
Talking to the assistant
You can now interact with the assistant by following the standard steps documented by OpenAI (which is why I will not go into much detail in this section):
- Create a thread: This represents a conversation between a user and the assistant.
- Add user messages: These can include both text and files, which are added to the thread.
- Create a run: Utilize the model and tools associated with the assistant to generate a response. This response is then added back to the thread.
The code snippet below demonstrates how to run the assistant in my specific use case: The code sets up a streaming interaction with an assistant using specific parameters, including a thread ID and an assistant ID. An EventHandler instance manages events during the stream. The stream.until_done() method keeps the stream active until all interactions are complete. The with statement ensures that the stream is properly closed afterward.
with client.beta.threads.runs.stream(
    thread_id=st.session_state.thread_id,
    assistant_id=assistant.id,
    event_handler=EventHandler(),
    temperature=0,
) as stream:
    stream.until_done()
Streamlit interface
While my post could end here, I’ve noticed numerous inquiries on the Streamlit forum (like this one) where users struggle to get streaming to work on the interface, even though it functions perfectly in the terminal. This prompted me to delve deeper.
To successfully integrate streaming into your app, you’ll need to extend the functionality of the EventHandler class mentioned earlier, specifically focusing on handling text creation, text deltas, and text completion. Here are the three key steps required to display text in the Streamlit interface while managing chat history:
- Handling Text Creation (on_text_created): Initiates and displays a new text box for each response from the assistant, updating the UI to reflect the status of preceding actions.
- Handling Text Delta (on_text_delta): Dynamically updates the current text box as the assistant generates text, enabling incremental changes without refreshing the entire UI.
- Handling Text Completion (on_text_done): Finalizes each interaction segment by adding a new empty text box, preparing for the next interaction. Additionally, it records completed conversation segments in chat_history.
For instance, consider the following code snippet for managing text deltas:
def on_text_delta(self, delta: TextDelta, snapshot: Text):
    """
    Handler for when a text delta is created
    """
    # Clear the latest text box
    st.session_state.text_boxes[-1].empty()
    # If there is new text, append it to the latest element in the assistant text list
    if delta.value:
        st.session_state.assistant_text[-1] += delta.value
    # Re-display the updated assistant text in the latest text box
    st.session_state.text_boxes[-1].info(st.session_state.assistant_text[-1])
This code accomplishes three main tasks:
- Clearing the Latest Text Box: Empties the content of the most recent text box (st.session_state.text_boxes[-1]) to prepare it for new input.
- Appending Delta Value to Assistant Text: If new text (delta.value) is present, it appends this to the ongoing assistant text stored in st.session_state.assistant_text[-1].
- Re-displaying Updated Assistant Text: Updates the latest text box to show the assistant text accumulated so far for the current response segment (st.session_state.assistant_text[-1]).
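To make the bookkeeping across all three handlers concrete without running a Streamlit app, here is a hypothetical, framework-free simulation: a plain dict stands in for st.session_state, and a tiny TextBox class stands in for Streamlit's placeholder elements. The names are illustrative only.

```python
# Framework-free sketch of the three handlers' state management.
# `session_state` and `TextBox` are stand-ins for the Streamlit objects.

class TextBox:
    """Mimics a Streamlit placeholder: empty() clears it, info() renders text."""
    def __init__(self):
        self.content = ""
    def empty(self):
        self.content = ""
    def info(self, text):
        self.content = text

session_state = {"assistant_text": [], "text_boxes": [], "chat_history": []}

def on_text_created():
    # Start a fresh text buffer and a fresh box for the new response
    session_state["assistant_text"].append("")
    session_state["text_boxes"].append(TextBox())

def on_text_delta(value):
    # Clear the latest box, accumulate the fragment, and re-render
    session_state["text_boxes"][-1].empty()
    if value:
        session_state["assistant_text"][-1] += value
    session_state["text_boxes"][-1].info(session_state["assistant_text"][-1])

def on_text_done():
    # Archive the finished segment and prepare an empty box for the next one
    session_state["chat_history"].append(session_state["assistant_text"][-1])
    session_state["assistant_text"].append("")
    session_state["text_boxes"].append(TextBox())

# Simulate one streamed response arriving in two chunks
on_text_created()
for chunk in ["The tax is ", "6000.0 euros."]:
    on_text_delta(chunk)
on_text_done()
```

After this run, chat_history holds the completed segment and the latest text box is empty, ready for the next interaction: exactly the lifecycle the three handlers implement in the Streamlit app.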
Conclusion
This blog post demonstrated how to use the OpenAI Assistant API and Streamlit to build an AI assistant capable of calculating tax.
I did this simple project to highlight the capabilities of the Assistant API, despite its less-than-clear documentation. My goal was to clarify ambiguities and provide some guidance for those interested in using the Assistant API. I hope this post has been helpful and encourages you to explore further possibilities with this powerful tool.
Due to space constraints, I have tried to avoid including unnecessary code snippets. However, if needed, please visit my GitHub repository to view the complete implementation.
Creating an Assistant with OpenAI Assistant API and Streamlit was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.