Tag: tech

  • Demo AI Products Like a Pro

    Joseph George Lewis

    An intro-to-expert guide on using Gradio to demonstrate product value to both expert and non-technical audiences.

    Photo by Austin Distel on Unsplash

    We have all experienced at least one demo that has fallen flat. This is particularly a problem in data science, a field where a lot can go wrong on the day. Data scientists often have to balance challenges when presenting to audiences with varying experience levels. It can be challenging to both show the value and explain core concepts of a solution to a wide audience.

    This article aims to help you overcome those hurdles and share your hard work! We always work so hard to improve models, process data, and configure infrastructure. It’s only fair that we also work hard to make sure others see the value in that work. We will explore using the Gradio tool to share AI products. Gradio is an important part of the Hugging Face ecosystem. It’s also used by Google, Amazon and Facebook, so you’ll be in great company! Whilst we will use Gradio, a lot of the key concepts can be replicated in common alternatives like Streamlit with Python or Shiny with R.

    The importance of stakeholder/customer engagement in data science

    The first challenge when pitching is ensuring that you are pitching at the right level. To understand how your AI model solves problems, customers first need to understand what it does and what the problems are. They may have a PhD in data science, or they may never have heard of a model before. You don’t need to teach them linear algebra, nor should you walk them through a white paper on your solution. Your goal is to convey the value added by your solution to all audiences.

    This is where a practical demo comes in. Gradio is a lightweight open-source package for making practical demos [1]. It is well documented that live demos can feel more personal, and help to drive conversation and generate new leads [2]. Practical demos can be crucial in building trust and understanding with new users. Trust builds from seeing you use the tool or, even better, from users testing it with their own inputs. When users can try the tool themselves they know there is no “Clever Hans” [3] process going on and what they see is what they get. Understanding grows from users seeing the “if-this-then-that” patterns in how your solution operates.

    Then comes the flipside … everyone has been to a bad live demo. We have all sat through or made others sit through technical difficulties.

    But technical difficulties aren’t the only thing that give us reason to fear live demos. Some other common off-putting factors are:

    • Information dumping: Pitching to customers should never feel like a lecture. Adding demos that are inaccessible can give customers too much to learn too quickly.
    • Developing a demo: Demos can be slow to build and can actually slow down development. Regularly feeding back in “show and tells” is a particular problem for agile teams. Getting content for the show and tell can be an ordeal, especially if customers grow accustomed to a live demo.
    • Broken dependencies: If you are responsible for developing a demo, you might rely on some things staying constant. If they change, you’ll need to start again.

    Introducing Gradio

    Now to the technical part. Gradio is a framework for demonstrating machine learning/AI models, and it integrates with the rest of the Hugging Face ecosystem. The framework can be implemented using Python or JavaScript SDKs. Here, we will use Python. Before we build a demo of our own, an example Gradio app for named entity recognition is shown below:

    Image Source: Hugging Face Documentation [4]

    You can implement Gradio anywhere you currently work, and this is a key benefit of using the framework. If you are quickly prototyping code in a notebook and want instant feedback from stakeholders or colleagues, you can add a Gradio interface. In my experience, I have implemented Gradio demos in Jupyter and Google Colab notebooks. You can also implement Gradio as a standalone site, through a public link hosted on Hugging Face. We will explore deployment options later.

    Gradio demos help us solve the problems above, and get us over the fear of the live demo:

    • Information dumping: Gradio provides a simple interface that abstracts away a lot of the difficult information. Customers aren’t overloaded by having to work out how to interact with our tool and what the tool is, all at once.
    • Developing a demo: Gradio demos have the same benefits as Streamlit and Shiny. The demo code is simple and builds on top of Python code you have already written for your product. This means you can make changes quickly and get instant feedback. You can also see the demo from the customer’s point of view.
    • Broken dependencies: No framework will overcome complete project overhauls. Gradio is built to accommodate new data, new data types and even new models. The simplicity and range of allowed inputs/outputs mean that Gradio demos stay fairly stable. Not only that, but if you have many tools, many customers and many projects, the good news is that most of your demo code won’t change! You can just swap a text output for an image output and you’re all set to move from an LLM to Stable Diffusion!

    Step-by-step guide to creating a demo using Gradio

    The practical section of this article takes you from complete beginner to demonstration expert in Gradio. That being said, sometimes less is more: if you are looking for a really simple demo to highlight the impact of your work, by all means stick to the basics!

    For more information on alternatives like StreamLit, check out my earlier post:

    Building Lightweight Geospatial Data Viewers with StreamLit and PyDeck

    The basics

    Let’s start with a Hello World style example so that we can learn more about what makes up a Gradio demo. We have three fundamental components:

    1. Input variables: We provide any number of input variables, which users set using toggles, sliders or other input widgets in our demo.
    2. Function: The author of the demo writes a function which does the heavy lifting. This is where the code changes most between demos. The function transforms the input variables into the output that the user sees. This is where we can call a model, transform data or do anything else we may need.
    3. Interface: The interface combines the input variables, input widgets, function and output widgets into one demo.

    So let’s see how that looks in code form:
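
    A minimal “Hello World” sketch is below. The greeting function and its labels are illustrative assumptions, but the pattern of input, function and interface is the one described above:

    import gradio as gr

    # The function that does the (very light) heavy lifting.
    def greet(name):
        return f"Hello, {name}!"

    # The interface ties the "text" input widget, the function and the "text" output together.
    demo = gr.Interface(fn=greet, inputs="text", outputs="text")

    # Launch the demo (in a notebook this renders inline).
    demo.launch()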

    This gives us the following demo. Notice how the input and output are both of the text type as we defined above:

    Image Source: Image by Author

    Now that we understand the basic components of Gradio, let’s get a bit more technical.

    To see how we can apply Gradio to a machine learning problem, we will use the simplest algorithm we can: a linear regression. For this first example, we will build a linear regression using the California House Prices dataset. First, we update the basic code so that the function makes a prediction based on a linear model:
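
    A sketch of what that function might look like, assuming a scikit-learn linear regression fitted on two illustrative features (median income and average rooms); the exact features in the original demo may differ:

    import gradio as gr
    from sklearn.datasets import fetch_california_housing
    from sklearn.linear_model import LinearRegression

    # Fit a simple linear regression on two illustrative features.
    data = fetch_california_housing(as_frame=True)
    X = data.frame[["MedInc", "AveRooms"]]
    y = data.frame["MedHouseVal"]
    model = LinearRegression().fit(X, y)

    def predict_price(med_inc, ave_rooms):
        # The function turns the user's inputs into a single model prediction.
        prediction = model.predict([[med_inc, ave_rooms]])[0]
        return round(float(prediction), 2)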

    Then we update the interface so that the inputs and outputs match what we need. Note that we also use the Number type here as an input:
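
    A sketch of the matching interface, assuming the two Number inputs map onto the features used above:

    demo = gr.Interface(
        fn=predict_price,
        inputs=[gr.Number(label="Median income"), gr.Number(label="Average rooms")],
        outputs=gr.Number(label="Predicted house value"),
    )
    demo.launch()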

    Then we hit run and see how it looks:

    Image Source: Image by Author

    Why stop there? We can use Blocks in Gradio to make our demos even more complex, insightful and engaging.

    Controlling the interface

    Blocks are more or less exactly as described: they are the building blocks of Gradio applications. So far, we have only used the higher-level Interface wrapper. In the example below we will use Blocks, which follow a slightly different coding pattern. Let’s update the last example to use Blocks so that we can understand how they work:
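
    A minimal Blocks sketch of the same regression demo, assuming the predict_price function defined earlier; a button is used here purely as the trigger for the prediction (event listeners are covered in more detail later in the article):

    with gr.Blocks() as demo:
        # Each widget is declared explicitly instead of being created for us by Interface.
        med_inc = gr.Number(label="Median income")
        ave_rooms = gr.Number(label="Average rooms")
        output = gr.Number(label="Predicted house value")
        predict_btn = gr.Button("Predict")
        # Wire the inputs, the function and the output together.
        predict_btn.click(fn=predict_price, inputs=[med_inc, ave_rooms], outputs=output)

    demo.launch()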

    Instead of the inputs, function and interface we had before, we have now rolled everything back to its most basic form in Gradio. We no longer set up an Interface and ask it to add number inputs for us! Now we provide each individual Number input and one Number output ourselves. Building like this gives us much more control over the display.

    With this new control over the demo we can even add new tabs. Tabs enable us to control the user flow and experience. We can first explain a concept, like how our predictions are distributed. Then on the next tab, we have a whole new area to let users prompt the model for predictions of their own. We can also use tabs to mitigate technical difficulties. The first tab gives users a lot of information about model performance, all produced through functions that were implemented earlier. If the model code doesn’t run on the day, we still have something insightful to share. It’s not perfect, but it’s a lot better than a blank screen! A short sketch of this tabbed layout is shown after the note below.

    Note: This doesn’t mean we can hide technical difficulties behind tabs! We can just use tabs to give audiences something to go on if all else fails. Then reshare the demo when we resolve the technical issues.
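
    A minimal sketch of that two-tab layout; the quick histogram built here is a stand-in for whatever performance content your earlier functions produce:

    import matplotlib.pyplot as plt

    # Pre-compute a simple figure for the explanatory tab (the exact chart is an assumption).
    fig, ax = plt.subplots()
    ax.hist(model.predict(X), bins=30)
    ax.set_title("Distribution of predicted house values")

    with gr.Blocks() as demo:
        with gr.Tab("Model performance"):
            gr.Markdown("## How the predictions are distributed")
            gr.Plot(value=fig)
        with gr.Tab("Try the model"):
            med_inc = gr.Number(label="Median income")
            ave_rooms = gr.Number(label="Average rooms")
            output = gr.Number(label="Predicted house value")
            gr.Button("Predict").click(predict_price, [med_inc, ave_rooms], output)

    demo.launch()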

    Image Source: Image by Author

    Ramping up the complexity shows how useful Gradio can be to show all kinds of information! So far though we have kept to a pretty simple model. Let’s now explore how we would use Gradio for something a bit more complex.

    Gradio for AI Models and Images

    The next application will look at using Gradio to demonstrate generative AI. Once again, we will use Blocks to build the interface. This time the demo will have three core components:

    1. An intro tab explaining the limitations and the in-scope and out-of-scope uses of the model.
    2. An inspiration tab showing some images generated earlier.
    3. An interactive tab where users can submit prompts to generate images.

    In this blog we will just demo a pre-trained model. To learn more about Stable Diffusion models, including key concepts and fine-tuning, check out my earlier blog:

    Stable Diffusion: How AI converts text to images

    As this is a demo, we will start with the most difficult component, making sure we have the most time to deliver the hardest piece of work. The interactive tab is likely to be the most challenging, so we will start there. So that we have an idea of what we are aiming for, our demo page will end up looking something like this:

    Image Source: Image by Author. Stable Diffusion Images are AI Generated.

    To achieve this, the demo code will combine the two examples above. We will use Blocks, functions, inputs and buttons. Buttons enable us to work in a similar way to before, where we have inputs, outputs and functions. We use buttons as event listeners. Event listeners help to control our logic flow.

    Let’s imagine we are trying to start our demo. At runtime (as soon as the demo starts), we have no inputs. As we have no input, the model the demo uses has no prompt. With no prompt, the model cannot generate an image. This will cause an error. To overcome the error we use an event listener. The button listens for an event, in this case, a click of the button. Once it “hears” the event, or gets clicked, it then triggers an action. In this case, the action will be submitting a completed prompt to the model.

    Let’s review some new code that uses buttons and compare it to the previous interface examples:
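
    A sketch of that button-driven code, assuming a pre-trained Stable Diffusion pipeline from the diffusers library; the model id and the body of generate_images are illustrative, though the generate_images name is reused later in the article:

    import gradio as gr
    import torch
    from diffusers import StableDiffusionPipeline

    # Load a pre-trained pipeline once at start-up (the model id is an assumption).
    pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
    pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")

    def generate_images(prompt):
        # Turn the user's prompt into a single generated image.
        return pipe(prompt).images[0]

    with gr.Blocks() as demo:
        # Inputs and outputs are declared as objects rather than strings.
        prompt = gr.Text(label="Prompt")
        output = gr.Image(label="Generated image")
        generate_btn = gr.Button("Generate")
        # The button is the event listener: nothing runs until it is clicked.
        generate_btn.click(fn=generate_images, inputs=prompt, outputs=output)

    demo.launch()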

    The button code looks like the interface code, but there are some big conceptual changes:

    1. The button code uses Blocks. This is because, whilst we are using the button in a similar way to the Interface, we still need something to determine what the demo looks like.
    2. Input and output widgets are used as objects instead of strings. If you go back to the first example, our input was “text” of type string but here it is prompt of type gr.Text().
    3. We use button.click() instead of Interface.launch(). This is because the interface was our whole demo before. This time the event is the button click.

    This is how the demo ends up looking:

    Image Source: Image by Author. Stable Diffusion Images are AI Generated.

    Can you see how important an event listener is? It has saved us lots of work in making sure things happen in the right order. The beauty of Gradio is that we also get feedback on how long we will have to wait for images. The progress bar and timing information on the left are great for user feedback and engagement.

    The next part of the demo is sharing images we generated beforehand. These will serve as inspiration for customers, who will be able to see what is possible with the tool. For this we will implement another new output widget, a Gallery. The Gallery displays the images we generated earlier:
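
    A sketch of the inspiration tab, assuming a handful of placeholder prompts run through the generate_images() function ahead of the demo:

    # Pre-generate a few example images before the demo starts (the prompts are placeholders).
    example_prompts = [
        "a watercolour painting of a lighthouse at sunset",
        "a low-poly render of a mountain landscape",
    ]
    example_images = [generate_images(p) for p in example_prompts]

    with gr.Blocks() as demo:
        with gr.Tab("Inspiration"):
            # The Gallery widget lays the pre-generated images out in a grid.
            gr.Gallery(value=example_images, label="Generated earlier")
        with gr.Tab("Generate your own"):
            prompt = gr.Text(label="Prompt")
            output = gr.Image(label="Generated image")
            gr.Button("Generate").click(generate_images, prompt, output)

    demo.launch()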

    An important note: We actually make use of our generate_images() function from before. As we said above, all of these lightweight app libraries enable us to simply build on top of our existing code.

    The demo now looks like this; users are able to switch between the two core functionalities:

    Image Source: Image by Author. Stable Diffusion Images are AI Generated.

    Finally, we will tie everything together with a landing page for the demo. In a live or recorded demo the landing page gives us something to talk through. It’s useful but not essential. The main reason we include a landing page is for any users who will test the tool without us being present. This improves the accessibility of the tool and builds trust and understanding with users. If you need to be there every time customers use your product, it’s not going to deliver value.

    This time we won’t be using anything new. Instead, we will show the power of the Markdown() component. You may have noticed we have used some Markdown already. For those unfamiliar, Markdown is a lightweight way to express all kinds of information in formatted text. The code below has some ideas, but for your own demos, get creative and see how far you can take Markdown in Gradio:
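
    A sketch of a landing tab; the welcome copy is a placeholder, and the other tabs from the previous snippets would follow it:

    with gr.Blocks() as demo:
        with gr.Tab("Welcome"):
            # The landing page is plain Markdown: headings, emphasis, lists and links all work.
            gr.Markdown(
                """
                # Stable Diffusion demo
                **What this tool does:** turns a short text prompt into an image.

                **In scope:** concept art, mood boards, quick visual ideas.
                **Out of scope:** photorealistic images of real people.

                Head to the Inspiration tab for examples, or Generate your own to try a prompt.
                """
            )
        # ... the Inspiration and "Generate your own" tabs from the previous snippets go here.

    demo.launch()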

    Image Source: Image by Author

    The finished demo is below. Let me know what you think in the comments!

    Image Source: Image by Author. Stable Diffusion Images are AI Generated.

    Sharing with customers

    Whether you’re a seasoned pro or a pitching beginner, sharing the demo can be daunting. Building demonstrations and pitching are two very different skill sets. This article so far has helped you build your demo, and there are great resources online to help with pitching [5]. Let’s now focus on the intersection of the two: how you can share the demo you built, effectively.

    Bearing in mind your preferred style, a live demo is guaranteed to liven up your pitch (pun intended!). For a technical audience, we can launch our demo right in our notebook. This is useful for those who want to get into the code. I recommend sharing this way with new colleagues, senior developers and anyone looking to collaborate on or expand your work. If you are using an alternative to Gradio, I’d still recommend sharing your code at a high level with this audience. It can help bring new developers on board, or explain your latest changes to senior developers.

    An alternative is to present the live demo using just the “front end”. This can be done using the link provided when you run the demo, and it is how the screenshots so far have been taken. When you share this way, customers don’t have to get bogged down in code to see your demo. I’d recommend this for live non-technical audiences, new customers and agile feedback/show-and-tell sessions.

    The same link also allows us to share the demo with others, by setting the share parameter when we launch the demo:

    demo.launch(debug=True, share=True)

    This works well for users who can’t make the live session, or who want more time to experiment with the product. The link is available for 72 hours. Some caution is needed here, as demos shared this way are hosted publicly from your machine, so consider the security aspects of your system before sharing. One thing we can do to make this a bit more secure is to add password protection to the demo:

    demo.launch(debug=True, auth=('trusted_user', 'trusted123'))

    This adds a password pop-up to the demo.

    You can take this further with authorisation techniques. Examples include using Hugging Face or Google directly as OAuth identity providers [6]. Further protections can be put in place by blocking specific files and file paths on the host machine [6].

    This does not completely solve the security concerns with sharing this way. If you are looking to share privately, containerisation through a cloud provider may be a better option [7].

    For wider engagement, you may want to share your demo publicly to an online audience. This can be brilliant for finding prospective customers, building word of mouth or getting some feedback on your latest AI project. I have been sharing work publicly for feedback for years on Medium, Kaggle and GitHub. The feedback I have had has definitely improved my work over time.

    If you are using Gradio, demos can be publicly shared through Hugging Face. Hugging Face provides Spaces, which are used for sharing Gradio apps. Spaces provide a free platform to share your demo, though there are costs attached to GPU instances (ranging from $0.40 to $5 per hour). Documentation on sharing to Spaces is available [6]. The docs explain how you can:

    • Share to Spaces
    • Implement CI/CD for Spaces with GitHub Actions
    • Embed Gradio demos in your own website from Spaces

    Spaces are helpful for reaching a wider audience without worrying about resources, and they give prospective customers a permanent link. That makes it all the more important to include as much guidance as possible. Again, this is a public sharing platform on compute you do not own; for stricter security requirements, containerisation and dedicated hosting may be preferred. A particularly great example of a public Space is this Minecraft skin generator [8].

    Image Source: Nick088, Hugging Face [Stable Diffusion Finetuned Minecraft Skin Generator — a Hugging Face Space by Nick088]

    Additional considerations

    The elephant in the room in the whole AI community right now is, of course, LLMs. Gradio has plenty of components built with LLMs in mind, including agentic workflows and models as a service [9].

    It is also worth mentioning custom components. Custom components have been developed by other data scientists and developers as extensions on top of the Gradio framework. Some great examples are:

    Extensions are not unique to Gradio. If you choose to use Streamlit or Shiny to build your demo, there are great extensions to those frameworks as well:

    A final word on sharing work in an agile context: when sharing regularly through show and tells or feedback sessions, lightweight demos are a game changer. The ability to layer easily from MVP to final product really helps customers see their journey with your product.

    In summary, Gradio is a lightweight, open-source tool for sharing AI products. Some important security steps may need consideration depending on your requirements. I really hope you are feeling more prepared for your demos!

    If you enjoyed this article, please consider giving me a follow, sharing this article or leaving a comment. I write a range of content across the data science field, so please check out more on my profile.

    References

    [1] Gradio Documentation. https://www.gradio.app/

    [2] User Pilot Product Demos. https://userpilot.com/blog/product-demos/

    [3] Clever Hans Wikipedia. https://en.wikipedia.org/wiki/Clever_Hans

    [4] Gradio Named Entity Recognition App. Named Entity Recognition (gradio.app)

    [5] Harvard Business Review. What Makes a Great Pitch (hbr.org)

    [6] Gradio Deploying to Spaces. Sharing Your App (gradio.app).

    [7] Deploying Gradio to Docker. Deploying Gradio With Docker

    [8] Amazing Minecraft Skin Generator Example. Stable Diffusion Finetuned Minecraft Skin Generator — a Hugging Face Space by Nick088

    [9] Gradio for LLM. Gradio And Llm Agents



  • Text to Knowledge Graph Made Easy with Graph Maker

    Text to Knowledge Graph Made Easy with Graph Maker

    Rahul Nayak

    An open-source library for building knowledge graphs from text corpus using open-source LLMs like Llama 3 and Mixtral.

    Image generated by the Author using Adobe Photoshop

    In this article, I will share a Python library — the Graph Maker — that can create a Knowledge Graph from a corpus of text as per a given Ontology. The Graph Maker uses open-source LLMs like Llama3, Mistral, Mixtral or Gemma to extract the KG.

    We will go through the basics of ‘Why’ and ‘What’ of the Graph Maker, a brief recap of the previous article, and how the current approach addresses some of its challenges. I will share the GitHub repository at the end of this article.

    Introduction

    This article is a sequel to the article I wrote a few months ago about how to convert any text into a Graph.

    How to Convert Any Text Into a Graph of Concepts

    The article received an overwhelming response. The GitHub repository shared in the article has more than 180 forks and more than 900 stars. The article itself was read by more than 80K readers on Medium. Recently the article was cited in the following paper published by Prof Markus J. Buehler at MIT.

    Accelerating Scientific Discovery with Generative Knowledge Extraction, Graph-Based Representation, and Multimodal Intelligent Graph Reasoning

    This is a fascinating paper that demonstrates the gigantic potential of Knowledge Graphs in the era of AI. It demonstrates how KGs can be used, not only to retrieve knowledge but also to discover new knowledge. Here is one of my favourite excerpts from this paper.

    “For instance, we will show how this approach can relate seemingly disparate concepts such as Beethoven’s 9th symphony with bio-inspired materials science”

    These developments are a big reaffirmation of the ideas I presented in the previous article and encouraged me to develop the ideas further.

    I also received a great deal of feedback from fellow techies about the challenges they encountered while using the repository, along with suggestions for improving the idea. I incorporated some of these suggestions into the new Python package I share here.

    Before we discuss the working of the package — The Graph Maker — let us discuss the ‘Why’ and the ‘What’ of it.

    A Brief Recap

    We should probably start with ‘Why Graphs?’. We discussed this briefly in my previous article, so feel free to hop over to that article for a refresher. Here, let us briefly cover the key concepts that are relevant to our current discussion.

    TL;DR this section if you are already well versed in the lore of Knowledge Graphs.

    Here is an illustration that sums up the idea of Knowledge Graphs neatly.

    Source: https://arxiv.org/abs/2403.11996

    To create a KG, we need two pieces of information.

    1. Knowledge Base: This can be a corpus of text, a code base, a collection of articles, etc.
    2. Ontology: The categories of entities and the types of relationships between them that we care about. I am probably oversimplifying the definition of ontology here, but it works for our purpose.

    Here is a simple ontology:

    Entities: Person, Place
    Relationships:
    Person — related to → Person
    Person — lives in → Place
    Person — visits → Place

    Given these two pieces of information, we can build a KG from a text that mentions people and places. However, let’s say our knowledge base is about a clinical study of prescription drugs and their interactions. We might use a different ontology, where Compounds, Usage, Effects, Reactions etc. may form our ontology.

    In the previous article, we discussed how we can extract a Knowledge Graph using an LLM without supplying it with an ontology. The idea was to let the LLM discover the ontology best suited to the given corpus of text by itself.

    Although this approach lacks the rigour of the traditional methods of generating KGs, it has its merits. It can generate KGs with unstructured data more easily than traditional methods. The KGs that it generates are, in some sense, also unstructured. However, they are easier to build and are richer in information. They are well suited for GRAG (Graph Retrieval Augmented Generation) like applications.

    Why The Graph Maker?

    Let me list a few challenges and observations I received in the feedback for my previous article. It will help us understand the challenges in creating KGs with LLMs. Let us use the Wikipedia summary of the Lord of the Rings books as our example. One cannot help but love the Lord of the Rings, after all!

    Meaningful Entities

    Given a free run, the entities that the LLM extracts can be too diverse in their categories. It makes the mistake of marking abstract concepts as entities. For example, in the text “Bilbo Baggins celebrates his birthday and leaves the Ring to Frodo”, the LLM may extract “Bilbo Baggins celebrates his birthday” or “Celebrates his birthday” as an ‘Action’. But it may be more useful if it extracts “Birthday” as an ‘Event’.

    Consistent Entities

    It can also mistakenly mark the same entity differently in different contexts. For example:

    ‘Sauron’, ‘the Dark Lord Sauron’ and ‘the Dark Lord’ should not be extracted as different entities. Or, if they are extracted as different entities, they should be connected with an equivalence relationship.

    Resilience in parsing

    The output of LLMs is, by nature, non-deterministic. To extract the KG from a large document, we must split the corpus into smaller text chunks and then generate subgraphs for every chunk. To build a consistent graph, the LLM must output JSON objects that follow the given schema consistently for every subgraph. Missing even one may affect the connectivity of the entire graph adversely.

    Although LLMs are getting better at responding with well-formatted JSON objects, it is still far from perfect. LLMs with limited context windows may also generate incomplete responses.

    Categorisation of the Entities

    LLMs can err generously when recognising entities. This is a bigger problem when the context is domain-specific, or when the entities are not named in standard English. NER models can do better at that, but they too are limited to the data they are trained on. Moreover, they can’t understand the relations between the entities.

    To coerce an LLM to be consistent with categories is an art in prompt engineering.

    Implied relations

    Relations can be explicitly mentioned, or implied by the context. For example:

    “Bilbo Baggins celebrates his birthday and leaves the Ring to Frodo” implies the relationships:
    Bilbo Baggins → Owner → Ring
    Bilbo Baggins → heir → Frodo
    Frodo → Owner → Ring

    Here I think LLMs will, at some point in time, become better than any traditional method of extracting relationships. But as of now, this is a challenge that needs clever prompt engineering.

    The Graph Maker

    The graph maker library I share here improves upon the previous approach by travelling halfway between the rigour and the ease — halfway between the structure and the lack of it. It does remarkably better than the previous approach I discussed on most of the above challenges.

    As opposed to the previous approach, where the LLM is free to discover the ontology by itself, the graph maker tries to coerce the LLM to use a user-defined ontology.

    Here is how it works in 5 easy steps.

    1. Define the Ontology of your Graph

    The library understands the following schema for the Ontology. Behind the scenes, the Ontology is a pydantic model.

    ontology = Ontology(
        # Labels of the entities to be extracted. Each can be a string or an object, like the following.
        labels=[
            {"Person": "Person name without any adjectives. Remember a person may be referenced by their name or using a pronoun"},
            {"Object": "Do not add the definite article 'the' in the object name"},
            {"Event": "Event involving multiple people. Do not include qualifiers or verbs like gives, leaves, works etc."},
            "Place",
            "Document",
            "Organisation",
            "Action",
            {"Miscellaneous": "Any important concept that can not be categorised with any other given label"},
        ],
        # Relationships that are important for your application.
        # These are more like instructions for the LLM to nudge it to focus on specific relationships.
        # There is no guarantee that only these relationships will be extracted, but some models do a good job overall at sticking to these relations.
        relationships=[
            "Relation between any pair of Entities",
        ],
    )

    I have tuned the prompts to yield results that are consistent with the given ontology. I think it does a pretty good job at it. However, it is still not 100% accurate. The accuracy depends on the model we choose to generate the graph, the application, the ontology, and the quality of the data.

    2. Split the text into chunks.

    We can use as large a corpus of text as we want to create large knowledge graphs. However, LLMs have a finite context window right now. So we need to chunk the text appropriately and create the graph one chunk at a time. The chunk size that we should use depends on the model context window. The prompts that are used in this project eat up around 500 tokens. The rest of the context can be divided into input text and output graph. In my experience, smaller chunks of 200 to 500 tokens generate a more detailed graph.
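
    A minimal chunking sketch, treating whole words as a rough proxy for tokens; the chunk size of 300 and the placeholder source text are illustrative assumptions:

    def chunk_text(text, chunk_size=300):
        # Split the corpus into roughly equal word-count chunks.
        words = text.split()
        return [
            " ".join(words[i:i + chunk_size])
            for i in range(0, len(words), chunk_size)
        ]

    lotr_summary = "..."  # placeholder: the Wikipedia summary text of the Lord of the Rings books
    chunks = chunk_text(lotr_summary)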

    3. Convert these chunks into Documents.

    The Document is a pydantic model with the following schema:

    ## Pydantic document model
    from pydantic import BaseModel

    class Document(BaseModel):
        text: str
        metadata: dict

    The metadata we add to the document here is tagged to every relation that is extracted out of the document.

    We can add the context of the relation, for example the page number, chapter, the name of the article, etc., into the metadata. More often than not, each pair of nodes has multiple relations across multiple documents. The metadata helps contextualise these relationships.
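
    A sketch of wrapping the chunks from the previous step into Documents, assuming the Document model shown above is importable from the package; the metadata keys here are illustrative:

    from graph_maker import Document

    docs = [
        Document(text=chunk, metadata={"source": "LOTR Wikipedia summary", "chunk_id": i})
        for i, chunk in enumerate(chunks)
    ]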

    4. Run the Graph Maker.

    The Graph Maker directly takes a list of documents and iterates over each of them to create one subgraph per document. The final output is the complete graph of all the documents.

    Here is a simple example of how to achieve this.


    from graph_maker import GraphMaker, Ontology, GroqClient

    ## -> Select a Groq-supported model
    model = "mixtral-8x7b-32768"
    # model = "llama3-8b-8192"
    # model = "llama3-70b-8192"
    # model = "gemma-7b-it"  ## This is probably the fastest of all models, though a tad inaccurate.

    ## -> Initiate the Groq client.
    llm = GroqClient(model=model, temperature=0.1, top_p=0.5)
    graph_maker = GraphMaker(ontology=ontology, llm_client=llm, verbose=False)

    ## -> Create a graph out of a list of Documents.
    graph = graph_maker.from_documents(docs)
    ## result: a list of Edges.

    print("Total number of Edges", len(graph))
    ## 1503

    The Graph Maker runs each document through the LLM and parses the response to create the complete graph. The final graph is a list of edges, where every edge is a pydantic model like the following.

    from typing import Union
    from pydantic import BaseModel

    class Node(BaseModel):
        label: str
        name: str

    class Edge(BaseModel):
        node_1: Node
        node_2: Node
        relationship: str
        metadata: dict = {}
        order: Union[int, None] = None

    I have tuned the prompts so they generate fairly consistent JSONs now. In case the JSON response fails to parse, the graph maker also tries to manually split the JSON string into multiple strings of edges and then tries to salvage whatever it can.

    5. Save to Neo4j

    We can save the graph to Neo4j to create a RAG application, run network algorithms, or maybe just visualise the graph using Neo4j Bloom.

    from graph_maker import Neo4jGraphModel
    create_indices = False
    neo4j_graph = Neo4jGraphModel(edges=graph, create_indices=create_indices)
    neo4j_graph.save()

    Each edge of the graph is saved to the database as a transaction. If you are running this code for the first time, set `create_indices` to True. This prepares the database by setting up uniqueness constraints on the nodes.

    5.1 Visualise, just for fun if nothing else

    In the previous article, we visualised the graph using the networkx and pyvis libraries. Here, because we are already saving the graph to Neo4j, we can leverage Bloom directly to visualise the graph.

    To avoid repeating ourselves, let us generate a different visualisation from what we did in the previous article.

    Let’s say we would like to see how the relations between the characters evolve through the book.

    We can do this by tracking how the edges are added to the graph incrementally while the graph maker traverses through the book. To enable this, the Edge model has an attribute called ‘order’. This attribute can be used to add a temporal or chronological dimension to the graph.

    In our example, the graph maker automatically adds to every edge the sequence number of the text chunk (within the document list) that the edge was extracted from. So to see how the relations between the characters evolve, we just have to cross-section the graph by the order of the edges.
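
    A minimal sketch of one such cross-section, assuming `graph` is the list of Edge objects returned by from_documents(); the cut-off value is illustrative:

    def edges_up_to(graph, max_order):
        # Keep only the edges extracted from the first `max_order` chunks.
        return [edge for edge in graph if edge.order is not None and edge.order <= max_order]

    early_story = edges_up_to(graph, max_order=10)
    print("Edges extracted from the first ten chunks:", len(early_story))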

    Here is an animation of these cross-sections.

    Animation generated by the Author

    Graph and RAG

    The best application of this kind of KG is probably in RAG. There are umpteen articles on Medium on how to augment your RAG applications with Graphs.

    Essentially Graphs offer a plethora of different ways to retrieve knowledge. Depending on how we design the Graph and our application, some of these techniques can be more powerful than simple semantic search.

    At the most basic level, we can add embedding vectors to our nodes and relationships and run a semantic search against the vector index for retrieval. However, I feel the real power of graphs for RAG applications comes when we mix Cypher queries and network algorithms with semantic search.

    I have been exploring some of these techniques myself. I am hoping to write about them in my next article.

    The Code

    Here is the GitHub Repository. Please feel free to take it for a spin. I have also included an example Python notebook in the repository that can help you get started quickly.

    Please note that you will need to add your GROQ credentials in the .env file before you can get started.

    GitHub – rahulnyk/graph_maker

    Initially, I developed this codebase for a few of my pet projects. I feel it can be helpful for many more applications. If you use this library for your applications, please share it with me. I would love to learn about your use cases.

    Also if you feel you can contribute to this open source project, please do so and make it your own.

    I hope you find the graph maker useful. Thanks for reading.

    I am a learner of architecture (not the buildings… the tech kind). In the past, I have worked with Semiconductor modelling, Digital circuit design, Electronic Interface modelling, and the Internet of Things.

    Currently, Data and Consumer Analytics @ Walmart keeps me busy.

    Thanks



  • Boost employee productivity with automated meeting summaries using Amazon Transcribe, Amazon SageMaker, and LLMs from Hugging Face

    Boost employee productivity with automated meeting summaries using Amazon Transcribe, Amazon SageMaker, and LLMs from Hugging Face

    Mateusz Zaremba

    This post presents a solution to automatically generate a meeting summary from a recorded virtual meeting (for example, using Amazon Chime) with several participants. The recording is transcribed to text using Amazon Transcribe and then processed using Amazon SageMaker Hugging Face containers to generate the meeting summary. The Hugging Face containers host a large language model (LLM) from the Hugging Face Hub.


  • How Veritone uses Amazon Bedrock, Amazon Rekognition, Amazon Transcribe, and information retrieval to update their video search pipeline

    How Veritone uses Amazon Bedrock, Amazon Rekognition, Amazon Transcribe, and information retrieval to update their video search pipeline

    Tim Camara

    This post is co-written with Tim Camara, Senior Product Manager at Veritone. Veritone is an artificial intelligence (AI) company based in Irvine, California. Founded in 2014, Veritone empowers people with AI-powered software and solutions for various applications, including media processing, analytics, advertising, and more. It offers solutions for media transcription, facial recognition, content summarization, object […]


  • Information extraction with LLMs using Amazon SageMaker JumpStart

    Information extraction with LLMs using Amazon SageMaker JumpStart

    Pooya Vahidi

    Large language models (LLMs) have unlocked new possibilities for extracting information from unstructured text data. Although much of the current excitement is around LLMs for generative AI tasks, many of the key use cases that you might want to solve have not fundamentally changed. Tasks such as routing support tickets, recognizing customer intents from a […]


  • UK fintech raises £800M for AI that determines how much money you can borrow

    Ioanna Lykiardopoulou


    An AI scanning your bank transaction data entails a level of invasiveness that I find difficult to accept — let alone embrace for my own transaction information. But the technology could bring merits, at least in the lending world. Enter Abound. The London-based startup has just raised £800mn for its lending platform that uses AI to determine loan amounts. Dubbed Render, Abound’s AI analyses customers’ full bank transaction data (from income to spending details) to understand their individual financial situation — unlike traditional credit checks. Render then calculates the amount of money customers are able to pay back each month.…


  • Bottoms up: This German beer is made from recycled wastewater

    Siôn Geschwindt


    Reuse Brew is a classic German lager with a twist — it’s made from recycled wastewater.  The beer is the result of a tie-up between the south German city of Weissenburg, American water tech company Xylem, and the Technical University of Munich (TUM). Specifically, TUM’s Brewery and Beverage Technology department (why didn’t I study there?!).  While the idea of a sewage brew might be hard to swallow, Xylem ensures us that all the bad stuff is filtered out before the malt, hops, and yeast are added.  First a machine injects ozone into the wastewater. Then the sludge is blasted by…


  • Not just iPad — Mac gets AI-Enhanced Logic Pro & Final Cut Pro updates

    Not just iPad — Mac gets AI-Enhanced Logic Pro & Final Cut Pro updates

    Apple has updated its Logic Pro and Final Cut Pro creative software suites, enhancing their functionality with new AI-driven features for Mac users.

    Apple boosts Mac creativity with AI-Enhanced Logic Pro & Final Cut Pro

    At the iPad event, Apple highlighted new versions of Final Cut Pro and Logic Pro for iPad — but they sneakily updated the Mac versions too.

    The latest version of Final Cut Pro for Mac enhances editing speed through AI. Likewise, the updated Logic Pro, driven by artificial intelligence, introduces studio assistant tools that enhance the music production process, offering artists assistance precisely when it’s required.


  • Apple’s iPad upgrades march Lightning one step closer to death

    Apple’s iPad upgrades march Lightning one step closer to death

    After 11 years, 6 months, and 5 days of valiant charging and data service, the Lightning port is no longer on any iPad that Apple sells.

    You’ll never see it again — a Lightning port on an iPad

    It’s easy to be glad about the move to USB-C for Apple devices, because it’s (usually) faster, and because now USB-C is in the iPhone, the iPad, and a MacBook Pro can be charged over USB-C. Even the Siri Remote for Apple TV 4K is now USB-C.

    That has taken a long time — the iPad Pro moved from Lightning to USB-C in October 2018. It’s possible that Apple would have kept the Lightning port for even longer, at least on the iPhone, if it were not for the EU introducing laws mandating USB-C.
