Building an Agentic Framework with o1 and GPT-4o


Multi-Agent Systems (MAS) and Agentic Libraries

Imagine that you are leading a project, but instead of doing everything yourself, you have a talented team of assistants, each specialising in a different aspect of the work. They collaborate effortlessly, sharing knowledge and skills to tackle challenges that are simply too complex for one person or even one AI to handle alone.

AGENTS! Well guess what...

This is the magic of Multi-Agent Systems (MAS), a dynamic ensemble of intelligent agents, each playing a unique role, working together to solve different problems.

Multi-Agent Systems are not just becoming important; they are on the brink of becoming essential. Their modular design and flexibility allow tasks to be distributed among multiple agents that work simultaneously and efficiently.

Think about a financial data analysis project, where one agent could handle data gathering and another could analyse it for trends, and yet a third could generate insightful reports. This teamwork not only speeds up the process but also allows each agent to focus on their specific tasks, ensuring a thorough and effective approach.

Spoiler alert, that’s what we did here.
You can find the code available on GitHub here.

But why o1 and GPT-4o?

So, why use both o1 and GPT-4o? o1 stands out for its ability to tackle complex, multi-step problems with advanced reasoning, making it ideal for coding challenges, scientific analysis, and deep mathematical tasks. It thinks through solutions, fact-checks itself, and refines its approach; it solved in minutes a coding issue that would otherwise have cost us days. In our framework, o1 manages agent interactions and ensures tasks are done with precision. For simpler tasks that require little to no reasoning, though, I wouldn't recommend it. In such cases, GPT-4o felt more apt and will save you a few bucks as well.

Meanwhile, GPT-4o handles routine tasks efficiently, creating a balanced system where o1 takes on complex problem-solving and GPT-4o delivers quick, reliable results.

In the next section, we’ll take a closer look at the architecture of this framework, exploring the roles and interactions of individual agents and how they collaborate to form a unified system. We implemented this in plain Python without relying on any pre-built agent libraries (sorry, LangChain).

Architecture Overview

To understand this agentic framework, let’s break down its architecture. At the core of this system lie two essential components, the MasterAgent and the Worker Agents.

The MasterAgent

The MasterAgent serves as the brain of the system. It is backed by o1’s advanced reasoning capabilities, which it uses to break down complex tasks into manageable sub-tasks. Unlike previous models, o1 takes an agentic approach that allows it to analyse the best way to tackle a question.

This lets the MasterAgent allocate sub-tasks to Worker Agents intelligently, ensuring smooth interactions and efficient collaboration among them. We were all amazed by how well o1 reasoned, especially with its ‘chain-of-thought’ process.

For instance, when the MasterAgent is asked to analyse a company's long-term growth potential, it doesn't just assign tasks randomly. Instead, it thoughtfully chooses the worker agents that are best suited for specific subtasks, like examining financial statements, analysing competitors, or looking into market trends.

Basically, it is the key player in the system.

The Worker Agents

Alongside the MasterAgent are the Worker Agents, which are essential for carrying out the specific tasks given to them. These agents use GPT-4o and the Perplexity tool to get their jobs done. Once they finish, they send all their results back to the MasterAgent.

Each Worker Agent is designed for a specialised function, such as conducting thorough market research, analysing financial statements, or evaluating competitors. They quickly gather relevant information from external sources, such as the Perplexity search tool, which has access to up-to-date information from the internet. This means that while the MasterAgent sets the overall strategy, the Worker Agents handle the detailed analyses.

For example, if the MasterAgent decides that a closer look at a company's financial health is needed, it will assign a Worker Agent specifically for analysing financial statements. This agent will then do its research, pull in the necessary data, and provide insights that contribute to the MasterAgent’s overall report.

This way, the MasterAgent and Worker Agents work in harmony: tasks are broken down so that each agent can focus on its strengths, which makes the system more efficient and accurate and ultimately leads to more valuable insights.
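The division of labour described in this section can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the framework's actual code: `run_master`, the worker names, and the stub workers are hypothetical, and the model calls are replaced by plain functions so the flow can be followed without API access.

```python
from typing import Callable, Dict, List

# A worker is modelled as a plain function: task description in, findings out.
# In the real framework this would wrap a GPT-4o call; here it is stubbed.
Worker = Callable[[str], str]


def run_master(plan: List[Dict[str, str]], workers: Dict[str, Worker]) -> str:
    """Dispatch each planned sub-task to its worker and merge the results."""
    sections = []
    for step in plan:
        worker = workers.get(step["Agent"])
        if worker is None:
            # Unknown agent names are recorded rather than silently dropped.
            sections.append(f"{step['Agent']}: no such worker")
            continue
        sections.append(f"{step['Agent']}: {worker(step['Task'])}")
    return "\n".join(sections)


# Stub workers standing in for model-backed agents (hypothetical names).
workers = {
    "FinancialStatementAgent": lambda task: f"done ({task})",
    "CompetitorAgent": lambda task: f"done ({task})",
}
plan = [
    {"Agent": "FinancialStatementAgent", "Task": "review the balance sheet"},
    {"Agent": "CompetitorAgent", "Task": "list main competitors"},
]
print(run_master(plan, workers))
```

The plan uses the same `Agent` and `Task` keys that the MasterAgent's JSON output uses later in this post, so the stubs could be swapped for real agents without changing the dispatch loop.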

Implementing and Configuring Agents

The true power of the framework is in its ability to set up agents for specific roles and tasks.

In this section, we’ll explore the processes involved in defining agent roles, assigning tasks, and configuring them for optimal performance, as well as how to maintain data quality, handle autonomy over time, and manage error-checking with feedback loops.

The MasterAgent directs the activities of specialised agents such as the AnalystAgent and WorkerAgent. This modular design enables the system to flexibly manage tasks based on user input. Here's how we set up and assign tasks dynamically:


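import json
import re


class MasterAgent:
    def __init__(self):
        self.name = "MasterAgent"
        self.model = "o1-mini"

    def run(self, user_prompt):
        prompt = MASTER_AGENT_PROMPT.replace("__COMPANY_NAME__", user_prompt)
        messages = [{"role": "user", "content": prompt}]
        response_message = call_openai(messages, client=openai_client, model=self.model)

        self.log("MasterAgent initial response received.")

        # Parse the JSON plan from the reply. The <output> tag name is an
        # assumption; match whatever delimiter MASTER_AGENT_PROMPT asks for.
        match = re.search(r'<output>(.*?)</output>', response_message.content, re.DOTALL)
        stock_info = json.loads(match.group(1)) if match else {}

        # Assign tasks to WorkerAgents based on parsed stock_info
        worker_responses = []
        for agent_info in stock_info.get("agents", []):
            agent_instance = self.instantiate_worker(agent_info.get("Agent"), agent_info.get("Task"))
            worker_response = agent_instance.run()
            worker_responses.append({
                "agent": agent_instance.name,
                "task": agent_info.get("Task"),
                "response": worker_response
            })

        # Hand the collected findings to the report step.
        return self.generate_report(worker_responses, messages)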

This setup shows the MasterAgent's ability to parse user inputs and assign tasks effectively, enhancing the overall efficiency of the agent system.
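The snippet calls `instantiate_worker` without showing it. One plausible way to write it is a small registry that maps agent names to classes and falls back to a generic worker. The sketch below is an assumption about how such a helper could look, not the project's implementation; `SPECIALISED_AGENTS` is a hypothetical name, and the classes are trimmed-down stand-ins.

```python
class WorkerAgent:
    """Generic worker; stands in for the GPT-4o-backed class shown later."""

    def __init__(self, name, task):
        self.name = name
        self.task = task


class AnalystAgent(WorkerAgent):
    """Specialised worker for financial data analysis."""

    def __init__(self, task):
        super().__init__("AnalystAgent", task)


# Hypothetical registry: maps agent names from the plan to specialised classes.
SPECIALISED_AGENTS = {"AnalystAgent": AnalystAgent}


def instantiate_worker(agent_name, task):
    """Return a specialised agent if one is registered, else a generic worker."""
    agent_cls = SPECIALISED_AGENTS.get(agent_name)
    if agent_cls is not None:
        return agent_cls(task)
    return WorkerAgent(agent_name, task)


agent = instantiate_worker("MarketTrendsAgent", "Summarise sector trends.")
print(type(agent).__name__, agent.name)
```

With this shape, adding a new specialised agent only means registering one more class; names the MasterAgent invents on the fly still get a generic worker.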

Managing Long-Term Autonomy

In applications that run for long periods, maintaining data integrity and consistency is paramount. The framework gives agents mechanisms that support autonomous operation over extended periods: periodic self-checks and systematic logging help keep agents aligned with their objectives and their outputs reliable.

Consider the role of the AnalystAgent, designed for long-term financial data analysis. By following the same data retrieval and report generation steps on every run, this agent helps keep its analyses relevant and accurate over time. Here's a glimpse into how the AnalystAgent operates:


class AnalystAgent:
    def __init__(self, task):
        self.task = task
        self.name = "AnalystAgent"
        self.model = "gpt-4o"

    def run(self, ticker_symbol):
        self.log(f"Fetching financial data for {ticker_symbol}...")
        finance_data_filename = f"{ticker_symbol}_finance_data.csv"
        ticker_data = get_ticker_data(ticker_symbol, "1y", finance_data_filename)

        if not ticker_data:
            self.log(f"Data retrieval failed for {ticker_symbol}.")
            return "No data available."

        self.log(f"Analyzing financial data for {ticker_symbol}.")

The AnalystAgent focuses on consistent data analysis while logging its progress to ensure transparency and accountability.
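The `self.log` calls in these snippets are not defined in the excerpts shown. One way to back them is Python's standard `logging` module, which adds timestamps and severity levels for free. The mixin below is a hedged sketch: `LoggingMixin` is a hypothetical name, and the trimmed `AnalystAgent` exists only to demonstrate it.

```python
import logging


class LoggingMixin:
    """Gives an agent a log() method that tags each message with the agent name."""

    name = "Agent"  # overridden per instance

    def log(self, message, level=logging.INFO):
        logging.getLogger(self.name).log(level, message)


class AnalystAgent(LoggingMixin):
    def __init__(self, task):
        self.task = task
        self.name = "AnalystAgent"


# Timestamped output makes it easier to audit long-running agents later.
logging.basicConfig(level=logging.INFO, format="%(asctime)s [%(name)s] %(message)s")
AnalystAgent("demo").log("Fetching financial data for JPM...")
```

Because each agent logs under its own logger name, log lines from different agents can be filtered or routed separately.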

Implementing Robust Error Handling and Feedback Loops

A key feature of the framework is its error handling, supported by feedback loops that let agents self-correct: an agent can revise its output when the initial result fails to meet quality benchmarks. Adding a Reviewer Agent can further improve output quality by providing extra oversight.

Take a look at the WorkerAgent, which utilizes a feedback mechanism to refine its responses through iterative checks:


import json
import re


class WorkerAgent:
    def __init__(self, name, task):
        self.name = name
        self.task = task
        self.model = "gpt-4o"

    def run(self, max_messages=3):
        messages = [{"role": "system", "content": WORKER_AGENT_PROMPT.replace("__TASK__", self.task)}]

        for _ in range(max_messages):
            response_message = call_openai(messages, client=openai_client, model=self.model, temperature=0.1)
            content = response_message.content
            # Record the assistant turn before any tool result so the history stays in order.
            messages.append({"role": "assistant", "content": content})
            # The <tool_call> tag name is an assumption; match whatever
            # delimiter WORKER_AGENT_PROMPT instructs the model to use.
            tool_call = re.search(r'<tool_call>(.*?)</tool_call>', content, re.DOTALL)
            if tool_call:
                tool_response = self.handle_tool_call(json.loads(tool_call.group(1)))
                messages.append({"role": "user", "content": tool_response})
            if "__end_conv__" in content.lower():
                break
        return self.generate_report(messages)

    def handle_tool_call(self, tool_call):
        if tool_call["tool_name"] == "perplexity_search":
            return perplexity_search(tool_call["arguments"]["query"])
        # Tell the model about unknown tools instead of returning None.
        return f"Unknown tool: {tool_call['tool_name']}"

In this example, the WorkerAgent runs up to three response cycles, calling the search tool and refining its answer until it signals completion with __end_conv__ or the cycle limit is reached.
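The Reviewer Agent idea mentioned above can be sketched as a draft-review loop. This is a minimal illustration under stated assumptions: `refine_with_review` and both stub functions are hypothetical, and in a real setup `draft_fn` and `review_fn` would each wrap a model call.

```python
from typing import Callable, Optional


def refine_with_review(
    draft_fn: Callable[[Optional[str]], str],
    review_fn: Callable[[str], Optional[str]],
    max_rounds: int = 3,
) -> str:
    """Regenerate a draft until the reviewer has no feedback or rounds run out.

    draft_fn(feedback) returns a draft; feedback is None on the first round.
    review_fn(draft) returns None to accept, or a feedback string to reject.
    """
    feedback = None
    draft = ""
    for _ in range(max_rounds):
        draft = draft_fn(feedback)
        feedback = review_fn(draft)
        if feedback is None:
            break
    return draft


# Stubs standing in for model calls; the text is placeholder content.
def fake_worker(feedback):
    return "Revenue grew by X%." if feedback else "Revenue grew."


def fake_reviewer(draft):
    return None if "%" in draft else "Include a number."


print(refine_with_review(fake_worker, fake_reviewer))
```

Capping the loop with `max_rounds` mirrors the `max_messages` limit in the WorkerAgent: the last draft is returned even if the reviewer never accepts, so a strict reviewer cannot stall the pipeline.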

Financial Analysis Use Case

The framework is designed to tackle various analytical challenges, with one of its most impactful applications being financial analysis. This use case showcases how the framework operates to provide in-depth insights into stock performance, market trends, and investment opportunities.

Overview of the Process

Working of the FinancialAnalyst Framework

When a user inputs the name of a company, let's say JPMorgan Chase, the MasterAgent comes into action. The first step involves retrieving the company's ticker symbol, which is a unique identifier for publicly traded companies. In this case, the ticker symbol for JPMorgan Chase is JPM. The MasterAgent returns the ticker symbol as part of its output, which is formatted as JSON. This is specified in a prompt that looks like this:




Output your thoughts here, some, thoughts, anything you can think of.
Like what you plan to do, how do you plan to use the agents, what are you thinking, etc.




ONLY OUTPUT AN ARRAY OF AGENTS YOU WANT TO INSTANTIATE. Task is the task you want the agent to perform. - Write 2 sentences of description of the task.
Only give very niche and specific tasks to the agents, so they can perform the task very well.

Provide the output in this format:

{
    "ticker_symbol": "TICKER_SYMBOL",  # Replace this with the actual ticker symbol
    "agents": [
        {"Agent": "AgentName", "Task": "TaskDescription"},
        {"Agent": "AgentName", "Task": "TaskDescription"},
        {"Agent": "AgentName", "Task": "TaskDescription"},
        {"Agent": "AgentName", "Task": "TaskDescription"},
        {"Agent": "AgentName", "Task": "TaskDescription"}
    ]
}



You must always follow the format.

Here’s a code snippet illustrating how the MasterAgent retrieves the ticker symbol:


import json
import re


class MasterAgent:
    def run(self, user_prompt):
        prompt = MASTER_AGENT_PROMPT.replace("__COMPANY_NAME__", user_prompt)
        messages = [{"role": "user", "content": prompt}]
        response_message = call_openai(messages, client=openai_client, model=self.model)
        self.log("MasterAgent initial response received.")

        # The <output> tag name is an assumption; match whatever delimiter
        # MASTER_AGENT_PROMPT asks the model to wrap its JSON in.
        match = re.search(r'<output>(.*?)</output>', response_message.content, re.DOTALL)
        if match is None:
            raise ValueError("MasterAgent response did not contain a JSON plan.")
        raw_output = match.group(1).replace('\n', ' ').replace('\r', '').strip()
        stock_info = json.loads(raw_output)
        ticker_symbol = stock_info.get("ticker_symbol")  # retrieves the symbol here
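Since every later step depends on this JSON, it can help to validate it before instantiating any agents. The following is a minimal sketch of such a check; `parse_plan` is a hypothetical helper rather than part of the original code, and the field names (`ticker_symbol`, `agents`, `Agent`, `Task`) are taken from the prompt format above.

```python
import json


def parse_plan(raw_output: str) -> dict:
    """Parse and validate the MasterAgent's JSON plan.

    Raises ValueError with a readable message when the plan is unusable.
    """
    try:
        plan = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        raise ValueError(f"Plan is not valid JSON: {exc}") from exc

    if not isinstance(plan, dict):
        raise ValueError("Plan must be a JSON object.")

    ticker = plan.get("ticker_symbol")
    if not isinstance(ticker, str) or not ticker.strip():
        raise ValueError("Plan is missing a ticker_symbol.")

    agents = plan.get("agents")
    if not isinstance(agents, list) or not agents:
        raise ValueError("Plan must list at least one agent.")
    for entry in agents:
        # Every entry needs both keys used by the dispatch loop.
        if not isinstance(entry, dict) or not {"Agent", "Task"} <= entry.keys():
            raise ValueError(f"Malformed agent entry: {entry!r}")
    return plan


example = '{"ticker_symbol": "JPM", "agents": [{"Agent": "AnalystAgent", "Task": "Analyse price trends."}]}'
print(parse_plan(example)["ticker_symbol"])
```

Failing early with a clear message is cheaper than discovering a malformed plan after several worker agents have already made paid model calls.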

Once the ticker symbol is obtained, the framework retrieves the stock information for the specified period, typically one year. This data is organized and stored as a CSV file in a dedicated directory called TickerData. For instance, the file created might be named JPM_finance_data.csv.


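class AnalystAgent:
    def run(self, ticker_symbol):
        try:
            finance_data_filename = f"{ticker_symbol}_finance_data.csv"
            print(f"[{self.name}] Fetching financial data for {ticker_symbol}...")
            self.ticker_data_file = get_ticker_data(ticker_symbol, "1y", finance_data_filename)
        except Exception as exc:
            # Report the failure and stop instead of analysing missing data.
            print(f"[{self.name}] Data retrieval failed for {ticker_symbol}: {exc}")
            return "No data available."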

The get_ticker_data() utility is built on the yfinance library, which offers a threaded and Pythonic way to download market data from Yahoo!® Finance.
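The body of `get_ticker_data()` is not shown in this post, so the version below is only a sketch of how such a helper could be written on top of yfinance's `Ticker.history`. The `out_dir` default follows the TickerData directory mentioned above; the `fetch` parameter is a hypothetical injection point added so the function can be exercised without network access.

```python
from pathlib import Path


def get_ticker_data(ticker_symbol, period, filename, out_dir="TickerData", fetch=None):
    """Download price history and save it as CSV; return the file path or None.

    `fetch` is a hypothetical hook for testing. By default yfinance is used.
    """
    if fetch is None:
        import yfinance as yf  # third-party: pip install yfinance

        def fetch(symbol, span):
            return yf.Ticker(symbol).history(period=span)

    frame = fetch(ticker_symbol, period)
    if frame is None or frame.empty:
        return None  # unknown ticker or no data for the period

    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    path = target / filename
    frame.to_csv(path)
    return str(path)
```

Returning the saved path (or `None` on failure) matches how the AnalystAgent snippets use the result: one stores it as `self.ticker_data_file`, the other tests it with `if not ticker_data`.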

After storing the relevant stock data, the framework employs the AnalystAgent to conduct a comprehensive analysis. The AnalystAgent utilizes the collected data to assess various critical metrics, including:

  • Historical Price Trends: Evaluating stock price movements over time.
  • Trading Volume: Analyzing the number of shares traded to gauge market interest.
  • Price-to-Earnings (P/E) Ratio: Comparing the company's share price with its earnings per share.
  • Moving Averages: Calculating short-term and long-term moving averages to identify trends.
  • Volatility: Measuring the stock's price fluctuations to understand risk.
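To make these metrics concrete, here is a small standard-library sketch of how they are commonly computed. It is illustrative only: in the framework itself the calculations are carried out by the model's code interpreter, the function names are hypothetical, the 252-day trading year is a common convention assumed here, and the prices are made up.

```python
import math
import statistics


def moving_average(prices, window):
    """Simple moving average; one value per full window."""
    if window <= 0 or window > len(prices):
        return []
    return [sum(prices[i - window:i]) / window for i in range(window, len(prices) + 1)]


def daily_returns(prices):
    """Fractional change between consecutive closing prices."""
    return [(curr - prev) / prev for prev, curr in zip(prices, prices[1:])]


def annualised_volatility(prices, trading_days=252):
    """Sample standard deviation of daily returns, scaled to a year."""
    returns = daily_returns(prices)
    if len(returns) < 2:
        return 0.0
    return statistics.stdev(returns) * math.sqrt(trading_days)


def pe_ratio(share_price, earnings_per_share):
    """Price-to-earnings ratio; None when EPS is not positive."""
    if earnings_per_share <= 0:
        return None
    return share_price / earnings_per_share


closes = [100.0, 102.0, 101.0, 105.0, 107.0]  # made-up prices for illustration
print(moving_average(closes, 3))
print(annualised_volatility(closes))
```

A short window reacts quickly to price changes while a long window smooths them out, which is why the two are usually compared to spot trend changes.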

Here’s how the AnalystAgent might analyze the data:


class AnalystAgent:
    def run(self, ticker_symbol):
        # Upload the CSV stored by the fetch step so the code interpreter can read it.
        with open(self.ticker_data_file, "rb") as data_file:
            file = client.files.create(file=data_file, purpose="assistants")

        assistant = client.beta.assistants.create(
            name="Data visualizer",
            description=FINANCIAL_PROMPT,
            model="gpt-4o",
            tools=[{"type": "code_interpreter"}],
            tool_resources={"code_interpreter": {"file_ids": [file.id]}},
        )

        # Start a thread whose first message carries the analysis prompt and the file.
        thread = client.beta.threads.create(
            messages=[
                {
                    "role": "user",
                    "content": FINANCIAL_DATA_ANALYSIS_PROMPT,
                    "attachments": [
                        {"file_id": file.id, "tools": [{"type": "code_interpreter"}]}
                    ],
                }
            ]
        )

        # Run the assistant on the thread and block until the run finishes.
        run = client.beta.threads.runs.create_and_poll(
            thread_id=thread.id,
            assistant_id=assistant.id,
            model="gpt-4o",
            tools=[{"type": "code_interpreter"}, {"type": "file_search"}],
        )


To achieve this, the AnalystAgent leverages the Assistants API, uploading the stock information file and requesting the generation of visualizations based on the predictions and trends identified in the analysis. This method ensures that the findings are not only presented in a numerical format but also visually represented, making the information more digestible for users.
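Once the run has finished, the generated charts still have to be pulled out of the thread. The sketch below shows one way this could be done with the OpenAI Python SDK; it is an assumption about the retrieval step, which the post does not show. `collect_image_file_ids` and `save_images` are hypothetical helpers, and the `.png` extension is assumed.

```python
def collect_image_file_ids(messages):
    """Return the file IDs of every image block in a list of thread messages.

    Expects objects shaped like Assistants API messages: each has a `content`
    list, and image blocks have type "image_file".
    """
    file_ids = []
    for message in messages:
        for block in message.content:
            if block.type == "image_file":
                file_ids.append(block.image_file.file_id)
    return file_ids


def save_images(client, thread_id, out_dir="."):
    """Download every chart produced in a thread (needs network access)."""
    messages = client.beta.threads.messages.list(thread_id=thread_id)
    paths = []
    for file_id in collect_image_file_ids(messages.data):
        path = f"{out_dir}/{file_id}.png"  # extension is an assumption
        client.files.content(file_id).write_to_file(path)
        paths.append(path)
    return paths
```

Keeping the ID extraction separate from the download makes the parsing logic easy to test with stand-in objects, without touching the API.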

The FINANCIAL_DATA_ANALYSIS_PROMPT is crafted to ensure the analysis is deeply quantitative and focused on generating insightful visualizations. Key aspects include:

  • Emphasis on Numerical Evidence: The AnalystAgent is instructed to base conclusions on numerical data, avoiding unnecessary explanations of the dataset’s columns.
  • Requirement for Visualisations: The prompt mandates the creation of at least two visualizations, each accompanied by descriptive titles and XML-formatted descriptions highlighting key observations.

These visualisations facilitate better comprehension of complex data, making it easier for users to interpret the findings at a glance.

For instance, a recent analysis of Sony's stock trends generated a visualization that illustrated the company's stock closing price fluctuations over the past year, complemented by an analysis of trading volume trends.

First, let's look at the stored CSV file for stock prices:

Figure 1 : Raw ticker data for SONY collected over the past year.

This is what the Agent sees, and after thorough analysis, it generates the visualisations below:

Figure 2 : Representation of SONY's Closing stock price over time
Figure 3 : SONY's stock volume varying with time

Reporting to the MasterAgent

Once the analysis and visualizations are complete, all results are compiled and sent back to the MasterAgent. The MasterAgent aggregates the information collected from various worker agents, synthesizing it into a comprehensive financial analysis report for the organization.


class MasterAgent:
    def generate_report(self, worker_responses, messages):
        report_content = "\n\n".join(
            f"Agent: {resp['agent']}\nTask: {resp['task']}\n\nWorker report: {resp['response']}\n"
            for resp in worker_responses
        )
        report_prompt = "Compile the following information into a cohesive report:\n" + report_content
        report_message = call_openai(
            messages + [{"role": "user", "content": report_prompt}],
            client=openai_client,
            model=self.model
        )
        self.log("Report generated.")
        with open("report.md", "w") as report_file:
            report_file.write(report_message.content)
        
        return report_message.content

This report includes not just the metrics and visualisations but also an in-depth examination of the latest market information, ensuring that it is reliable and insightful for users.

Investors, analysts, and stakeholders can rely on this report for informed decision-making, whether for investment strategies or simply for gaining a deeper understanding of the organization.

Benefits of the Multi-Agent Approach

The adoption of a multi-agent system brings numerous advantages, particularly in complex fields like financial analysis. This approach enhances the efficiency, accuracy, and scalability of data processing and decision-making. Consider these important perks:

  1. Modularity and Flexibility: A multi-agent framework allows for the development of distinct agents, each with specialized roles. This modularity enables the system to be flexible and adaptable. For instance, in our framework, the MasterAgent oversees the entire process while the AnalystAgent focuses on financial data analysis. If the user needs to analyze a new type of data, a new agent can be developed and integrated seamlessly.
  2. Parallel Processing: Multiple agents can operate simultaneously on different tasks, which significantly reduces processing time, particularly when handling large tasks.
  3. Scalability: As business needs grow, so can the multi-agent system. New agents can be added to address additional tasks or enhance existing functionalities without disrupting the entire framework.
  4. Cost-Effectiveness: By automating various analytical processes through specialized agents, organizations can save time and resources. The framework's efficiency in handling data analysis reduces the need for extensive human intervention, ultimately leading to cost savings. Automated reports and visualisations allow for quicker decision-making, making it a cost-effective solution for businesses seeking insights.
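The MasterAgent loop shown earlier runs its workers one after another. Because each worker spends most of its time waiting on network calls, a thread pool is one straightforward way to get the parallelism described in point 2. This is a sketch under that assumption; `run_workers_in_parallel` and `StubWorker` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def run_workers_in_parallel(workers, max_threads=4):
    """Run each worker's run() method on a thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=max_threads) as pool:
        futures = [pool.submit(worker.run) for worker in workers]
        # result() blocks until that worker is done and re-raises its errors.
        return [future.result() for future in futures]


class StubWorker:
    """Stand-in for a model-backed agent."""

    def __init__(self, name):
        self.name = name

    def run(self):
        return f"{self.name}: report"


print(run_workers_in_parallel([StubWorker("A"), StubWorker("B")]))
```

Results come back in the same order as the input list, so they can be paired with their tasks exactly as in the sequential loop.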

Challenges

While the multi-agent approach offers many advantages, it also comes with its share of challenges. It's important to understand these challenges for an effective implementation. The following are some challenges that we encountered during the project:

  1. Complexity of Coordination: Managing multiple agents can introduce complexity, particularly in ensuring that they work together seamlessly. In our case, ensuring that the MasterAgent accurately processes the findings from various worker agents required careful design of prompting and data handling mechanisms.
  2. Error Handling and Fault Tolerance: Implementing error handling and recovery mechanisms for multiple agents can be difficult. During the project, we had to develop strategies for gracefully handling agent failures, such as automatically reassigning tasks to other agents or implementing redundancy for critical processes.
  3. Testing and Debugging: Testing a multi-agent system can be more complex than traditional systems due to the interactions between agents. Identifying the source of issues or bugs can be challenging when multiple agents are involved. In our project, we had to implement unit tests for individual agents and integration tests for the entire system. However, debugging issues that arose from inter-agent interactions required extensive logging and analysis.
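The failure-handling strategy from point 2 can be sketched as a retry wrapper with an optional fallback. This is an illustrative pattern, not the project's code: `run_with_retry` is a hypothetical helper, and the fallback could be anything from a second agent to a cached answer.

```python
import time


def run_with_retry(task_fn, fallback_fn=None, attempts=3, delay_seconds=0.0):
    """Call task_fn until it succeeds; after `attempts` failures use fallback_fn."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return task_fn()
        except Exception as exc:  # broad on purpose: any agent failure triggers a retry
            last_error = exc
            if attempt < attempts:
                time.sleep(delay_seconds * attempt)  # linear back-off between tries
    if fallback_fn is not None:
        # Reassign the task, for example to another agent.
        return fallback_fn()
    raise RuntimeError(f"Task failed after {attempts} attempts") from last_error
```

A worker could then be invoked as `run_with_retry(agent.run, fallback_fn=backup_agent.run)`, where both agents are whatever objects the caller has at hand. The same wrapper is also easy to unit test with a deliberately flaky stub.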

Conclusion and Final Thoughts

As we wrap up, I can’t help but feel a sense of accomplishment and excitement about what we’ve created. This project has shown how powerful MAS can be in tackling complex analytical tasks.

Naturally, this journey had its share of challenges. From making sure the agents communicated smoothly to addressing data security issues, each obstacle taught us important lessons about system design and the need for flexibility.

I'm looking forward to enhancing this framework and seeing what else it can do beyond finance, but I leave that up to the readers. Feel free to use it for any purpose and let us know how it goes!

If you are someone who’s interested in integrating agents or enhancing your workflow with AI, feel free to reach out! You can book a call with us to discuss your needs. We’d love to help you on your development journey.
Happy coding!

Behind the Blog 👀
Shivam Mitter
Writer

The guy on coffee who can do AI/ML.

o1 was amazing, but for low-end tasks GPT-4o feels more apt to use
Pranav Patel
Editor

Good boi. He is a good boi & does ML/AI. AI Lead.