Optimising multi-model collaboration with graph-based orchestration
Orchestra — photographer Arindam Mahanta by unsplash
Integrating the capabilities of various AI models unlocks a symphony of potential, from automating complex tasks that require multiple abilities like vision, speech, writing, and synthesis to enhancing decision-making processes. Yet orchestrating these collaborations presents a significant challenge: managing the relations and dependencies between the models. Traditional linear approaches often fall short, struggling to handle diverse models and dynamic dependencies.
By translating your machine learning workflow into a graph, you gain a visualisation of how each model interacts and contributes to an overall outcome that combines natural language processing, computer vision, and speech models. In the graph approach, nodes represent models or tasks, and edges define the dependencies between them. This mapping offers several advantages: it identifies which models rely on the output of others, and it exposes independent tasks that can run in parallel. Additionally, the tasks can be executed with existing graph traversal strategies, such as breadth-first or depth-first, according to the task priorities.
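As a minimal sketch of this mapping, assume a hypothetical three-task workflow (the task names are illustrative and do not come from any library). The dependencies are stored as an adjacency dictionary and walked breadth-first, so tasks that land on the same level are candidates for parallel execution:

```python
# Hypothetical workflow: each key is a task, each value lists the tasks
# that consume its output (edges point from producer to consumer).
workflow = {
    "transcribe_audio": ["summarise"],
    "describe_image": ["summarise"],
    "summarise": [],
}


def breadth_first_levels(graph):
    """Group tasks into levels; tasks within one level are independent."""
    # Count incoming edges to find the tasks with no prerequisites.
    indegree = {node: 0 for node in graph}
    for consumers in graph.values():
        for consumer in consumers:
            indegree[consumer] += 1

    current = sorted(node for node, deg in indegree.items() if deg == 0)
    levels = []
    while current:
        levels.append(current)
        next_level = []
        for node in current:
            for consumer in graph[node]:
                indegree[consumer] -= 1
                # A consumer is ready once all of its producers are done.
                if indegree[consumer] == 0:
                    next_level.append(consumer)
        current = sorted(next_level)
    return levels


levels = breadth_first_levels(workflow)
print(levels)
```

Here the audio and image tasks share the first level, while the summary task waits on the second level for both of them.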
The road to harmonious collaboration between AI models is not without hurdles. Imagine conducting an orchestra in which each musician speaks a different language and each instrument operates independently. This mirrors the communication gaps that arise when integrating diverse AI models, which call for a framework to manage the relations between models and to track which input formats each model can receive.
From Theory to Practice: Expected Use Cases
The graph-based orchestration approach opens doors to exciting possibilities across various domains:
Collaborative tasks for drug discovery
Diagram of three models collaboration as part of data analysis task — image by author
Researchers can accelerate the drug discovery process with a sequence of AI-powered assistants, each designed for a specific task. Consider a three-step discovery mission. First, a language model scans vast scientific data to highlight potential protein targets strongly linked to specific diseases. Next, a vision model explains complex diagrams or images, providing detailed insights into the structures of the identified proteins. This visual analysis is crucial for understanding how potential drugs might interact with the protein. Finally, a third model integrates input from the language and vision models to predict how chemical compounds might affect the targeted proteins, offering researchers insights to steer the process efficiently.
Several challenges emerge when integrating the models into a complete pipeline. First, extracting relevant images from the scanned content and feeding them to the vision model isn’t as simple as it seems: an intermediate processor is needed between the text scan and vision tasks to filter the relevant images. Second, the analysis task itself must merge multiple inputs: the data scan output, the vision model’s explanation, and user-specified instructions. This requires a template that combines the information for the language model to process. The following sections describe how to use a Python framework to handle these complex relations.
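The two integration steps can be illustrated with a short sketch. Everything here is hypothetical: the template wording, the function names, and the caption-keyword filter are assumptions chosen for illustration, not part of any framework, and a real filter would likely need more than keyword matching.

```python
# Hypothetical template; the placeholder names are illustrative only.
ANALYSIS_TEMPLATE = (
    "Instructions: {instructions}\n"
    "Literature scan findings: {scan_output}\n"
    "Image explanations: {vision_output}\n"
)


def filter_relevant_images(images, keywords):
    """Intermediate processor: keep images whose caption mentions a keyword."""
    lowered = [keyword.lower() for keyword in keywords]
    return [
        image for image in images
        if any(keyword in image["caption"].lower() for keyword in lowered)
    ]


def build_analysis_prompt(instructions, scan_output, vision_output):
    """Merge the three inputs into one prompt for the final language model."""
    return ANALYSIS_TEMPLATE.format(
        instructions=instructions,
        scan_output=scan_output,
        vision_output=vision_output,
    )


# Placeholder data standing in for real model outputs.
images = [
    {"path": "fig1.png", "caption": "Protein binding site structure"},
    {"path": "fig2.png", "caption": "Funding sources by year"},
]
relevant = filter_relevant_images(images, ["protein", "structure"])
prompt = build_analysis_prompt(
    instructions="Rank candidate compounds.",
    scan_output="Target X is linked to the disease.",
    vision_output="The binding pocket is narrow.",
)
print([image["path"] for image in relevant])
print(prompt)
```

Keeping the filter and the template as separate functions means each one can be attached to a different point in the pipeline: the filter between the scan and vision steps, the template just before the final analysis step.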
Creative Content Generation
Diagram of four tasks to generate animation — image by author
Model collaboration can facilitate interactive content creation by integrating music composition, animation, and design models to generate animated scenes. For instance, in a graph-based approach, the first task plans a scene like a director and passes input to the music and image generation tasks. Finally, an animation model uses the output of the art and music models to generate a short video.
To optimise this process, music and graphics generation should run in parallel, since they are independent tasks and the music has no need to wait for the graphics to complete. We also need to address the input formats accepted by the animation task: some models, like Stable Video Diffusion, work with images only, so the music can be combined with the video using a post-processor.
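A minimal sketch of the parallel step, using Python's standard asyncio library. The two generator functions are hypothetical stand-ins for real model calls, with a short sleep simulating latency:

```python
import asyncio


async def generate_music(scene):
    # Stand-in for a call to a music model; the sleep simulates latency.
    await asyncio.sleep(0.2)
    return f"music for {scene}"


async def generate_image(scene):
    # Stand-in for a call to an image model.
    await asyncio.sleep(0.2)
    return f"image for {scene}"


async def run_scene(scene):
    # The two generators are independent, so they are awaited together
    # and their waiting time overlaps instead of adding up.
    music, image = await asyncio.gather(
        generate_music(scene),
        generate_image(scene),
    )
    # A downstream animation step would need both results, so it would
    # run only after this point.
    return {"music": music, "image": image}


result = asyncio.run(run_scene("sunrise over a city"))
print(result)
```

Because both coroutines wait concurrently, the total time is close to that of the slower task rather than the sum of the two.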
These examples provide just a glimpse of the potential of graph theory in model integration. The graph approach allows you to tailor multiple tasks to your specific needs and unlock innovative solutions.
Intelli Framework Orchestrates AI Models with Graph Theory
Tasks represented with a graph — image by author
Intelli is an open source Python module that orchestrates AI workflows by leveraging graph principles through three key components:
Agents act as representatives of your AI models. You define each agent by specifying its type (text, image, vision, or speech), its provider (openai, gemini, stability, mistral, etc.), and its mission.
Tasks are individual units within your AI workflow. Each task leverages an agent to perform a specific action and applies custom pre-processing and post-processing provided by the user.
Flow binds everything together, orchestrating the execution of your tasks, adhering to the dependencies you’ve established through the graph structure. Flow management ensures tasks are executed efficiently and in the correct order, enabling both sequential and parallel processing where possible.
Using the flow component to manage task relations as a graph provides several benefits when connecting multiple models. For a single task, however, this may be overkill, and a direct call to the model will be sufficient.
Scaling: As your project grows in complexity, adding more models and tasks requires repetitive code updates to account for data format mismatches and complex dependencies. The graph approach simplifies this: you define a new node representing the task, and the framework automatically resolves input/output differences to orchestrate the data flow.
Dynamic Adaptation: With traditional approaches, changes to a complex task can impact the entire workflow and require adjustments throughout. With the flow, adding, removing, or modifying connections is handled automatically.
Explainability: The graph gives a deeper understanding of your AI workflow by visualising how the models interact, and helps optimise the path taken through the tasks.
Note: the author participated in designing and developing the Intelli framework. It is an open source project with an Apache licence.
Getting Started
First, ensure you have Python 3.7+, as Intelli leverages the latest Python asyncio features, then install:
pip install intelli
Agents: The Task Executors
Agents in Intelli are designed to interface with a specific AI model. Each agent includes a unified input layer to access any model type and accepts a dictionary of custom parameters to pass to the model, such as the maximum size, temperature, and model version.
from intelli.flow.agents import Agent

# Define agents for various AI tasks
text_agent = Agent(
    agent_type="text",
    provider="openai",
    mission="write social media posts",
    model_params={"key": OPENAI_API_KEY, "model": "gpt-4"},
)
Tasks: The Building Blocks
Tasks represent individual units of work or operations to be performed by agents, and include the logic to handle the output of the previous task. Each task can be a simple operation like generating text or a more complex process, like analysing the sentiment of user feedback.
from intelli.flow.tasks import Task
from intelli.flow.input import TextTaskInput

# Define a task for text generation
task1 = Task(
    TextTaskInput("Create a post about AI technologies"),
    text_agent,
    log=True,
)
Processors: Tuned I/O
Processors add an extra layer of control by defining a custom pre-process for the task input and post-process for the output. The example below demonstrates creating a function to shorten the text output of the previous step before calling the image model.
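class TextProcessor:
    @staticmethod
    def text_head(text, size=800):
        # Keep only the first `size` characters of the text.
        return text[:size]

# image_agent is assumed to be an Agent defined with agent_type="image".
task2 = Task(
    TextTaskInput("Generate image about the content"),
    image_agent,
    pre_process=TextProcessor.text_head,
    log=True,
)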
Flow: Specifying the dependencies
Flow translates your AI workflow into a Directed Acyclic Graph (DAG) and leverages graph theory for dependency management. This lets you easily visualise the task relations and optimise the execution order of your tasks.
The map_paths dictates the task dependencies, guiding Flow to orchestrate the execution order and ensuring each task receives the necessary output from its predecessors.
Here’s how Flow navigates the nodes:
Mapping the Workflow: Flow constructs a DAG using tasks as nodes and dependencies as edges. This visual representation clarifies the task execution sequence and data flow.
Topological Sorting: The flow analyses the graph to determine the optimal execution order. Tasks without incoming dependencies are prioritised, ensuring each task receives necessary inputs from predecessors before execution.
Task Execution: The framework iterates through the sorted tasks, executing each with corresponding input. Based on the dependency map, inputs might come from previous task outputs and user-defined values.
Input Preparation: Before execution, the task applies any pre-processing functions defined for it, modifying the input data as needed, and then calls the assigned agent.
Output Management: The agent returns an output, which is stored in a dictionary with the task name as the key and returned to the user.
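The steps above can be sketched in plain Python. This is a simplified illustration of the general technique (Kahn's topological sort followed by sequential execution), not Intelli's actual implementation. The `run_flow` function, the layout of the `tasks` dictionary, and the shape given to `map_paths` here are assumptions made for the example:

```python
from collections import deque


def topological_order(map_paths):
    """Kahn's algorithm: order tasks so each one follows its predecessors."""
    indegree = {name: 0 for name in map_paths}
    for successors in map_paths.values():
        for successor in successors:
            indegree[successor] += 1

    # Tasks without incoming dependencies can start first.
    ready = deque(name for name, deg in indegree.items() if deg == 0)
    order = []
    while ready:
        name = ready.popleft()
        order.append(name)
        for successor in map_paths[name]:
            indegree[successor] -= 1
            if indegree[successor] == 0:
                ready.append(successor)

    if len(order) != len(map_paths):
        raise ValueError("cycle detected: the workflow is not a DAG")
    return order


def run_flow(tasks, map_paths):
    """Run each task after its predecessors; collect outputs by task name."""
    # Invert the map so each task can look up where its inputs come from.
    predecessors = {name: [] for name in map_paths}
    for name, successors in map_paths.items():
        for successor in successors:
            predecessors[successor].append(name)

    outputs = {}
    for name in topological_order(map_paths):
        task = tasks[name]
        # Input: the task's own prompt plus the outputs of earlier tasks.
        parts = [task["prompt"]] + [outputs[p] for p in predecessors[name]]
        text = " | ".join(parts)
        pre_process = task.get("pre_process")
        if pre_process is not None:
            text = pre_process(text)
        outputs[name] = task["agent"](text)
    return outputs


# Hypothetical stand-ins for model calls.
tasks = {
    "task1": {"prompt": "write a post", "agent": lambda t: f"post({t})"},
    "task2": {
        "prompt": "draw it",
        "agent": lambda t: f"image({t})",
        "pre_process": lambda t: t[:20],
    },
}
map_paths = {"task1": ["task2"], "task2": []}

outputs = run_flow(tasks, map_paths)
print(outputs)
```

The sketch runs tasks one at a time for clarity. A fuller version could dispatch every task whose dependencies are satisfied at the same moment, which is where the parallelism described earlier comes from.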
To visualise your flow as a graph:
flow.generate_graph_img()
The visual of the tasks and assigned agents — image by intelli graph function
Conclusion
Graph theory transforms the traditional linear approach to orchestrating AI models, enabling a symphony of collaboration between diverse models.
Frameworks like Intelli translate your workflow into a visual representation, where tasks become nodes and dependencies are mapped as edges, creating an overview of your entire process to automate complex tasks.
This approach extends to diverse fields requiring collaborative AI models, including scientific research, business decision automation, and interactive content creation. However, scaling effectively requires further refinement in managing the data exchange between the models.