When you build a chart with data that is not yours, you must cite the source from which you took it. Citing does justice to the work done by others and increases your credibility with the people who will look at the graph. A chart that cites its data source is more authoritative because it is based on verifiable data.
In this article, you will see four strategies for placing credits in a chart, although you can use your creativity to place them wherever you want:
Under the title/subtitle
Under the main chart
Under the next steps
Sideways
Under the title/subtitle
Placing the credits under the title/subtitle generates a sense of trust in the audience from the beginning of the story. The following figure shows an example of credits under the title/subtitle.
Image by Author
Use this placement if you want your audience to know the sources from the beginning of the story. Although this placement may generate trust, it could also be distracting since the audience may leave your story to search for the sources.
Under the Main Chart
Placing the credits under the main chart adds detail to the story’s main point and helps reinforce its essential message. The following figure shows an example of credits placed under the main chart.
Image by Author
Use this placement if you want to reinforce the main message of your chart.
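For example, if you build the chart in Python with matplotlib (an assumption on my part; the same idea applies in any charting tool), a credit line under the main chart can be added like this, with a hypothetical source:

import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(8, 5))
ax.plot([2019, 2020, 2021, 2022], [10, 12, 9, 15])
ax.set_title("Yearly values")

# Credit placed under the main chart, in a smaller, lighter font
fig.text(0.01, 0.01,
         "Source: Example Open Data Portal (hypothetical), https://example.org/data",
         ha="left", va="bottom", fontsize=8, color="gray")

plt.tight_layout(rect=(0, 0.04, 1, 1))  # leave room at the bottom for the credit
plt.show()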
Under the Next Steps
In this case, credit the sources at the end of your story as an appendix to the next steps, as shown in the following figure.
Image by Author
Use this placement if you want to reinforce the next steps of your story.
Sideways
Placing credits sideways means considering them as external to the main data story workflow. You can place credits on the left or right, as shown in the following figure.
Image by Author
Use this placement to keep credits external to your main data story workflow and keep the audience focused on the story.
Discussion
You have seen how to place credits in a chart. If you have read carefully, you noticed that in the last type of placement, sideways, I talked about the main data story workflow. But what is it exactly? Any chart, if done well, tells a story, and a story normally follows a flow from the opening scene to the final one.
Even a simple chart can tell a story, and a story in a chart has three main parts: a beginning, a main point, and a final part.
The beginning corresponds to the context used to frame the chart’s content. Usually, you can insert the context immediately under the title, but you can also put it in other parts, such as on the left of the chart. In general, insert the context before the main chart so the reader can read it before the graph.
Then, there is the main point that corresponds precisely to your chart.
Finally, there is the final part of the story, which corresponds to the next steps. Place the next steps at the bottom, i.e., below the chart or to the right, so the reader will read them after reading the other parts of the graph.
Summary
Congratulations! You have just learned four possible ways to place credits in your chart. You have the following options: under the title/subtitle, under the main chart, under the next steps, or sideways.
Choosing one solution over another depends on your aesthetic taste and the space you have in the chart. The important thing is that the credits do not hinder the normal flow of the story.
One small final tip: if you can, always add a link to the data source in the credits so that the reader can personally verify the integrity of the data you show in your chart.
That’s all for me today. Thank you, as always, for reading the article.
If you want to learn more about turning your raw chart into a story, you can use the DIKW (Data-Information-Knowledge-Wisdom) pyramid.
Over the past few years, thanks to my children’s creativity and work, I have developed a passion for origami. This passion inspired me to create a practical pyramid to keep on the desk as a gadget.
In the DIKW origami pyramid, you will find, in a compact way:
the steps from data to wisdom
how to use them to transform data into a story
how to use Generative AI to help you build the story
In the following video, I explain how to create a DIKW pyramid gadget and how to use it to turn data into a story. You can download the DIKW pyramid origami template here.
Using Gemini + Text to Speech + MoviePy to create a video, and what this says about what GenAI is rapidly becoming useful for
Like most everyone, I was flabbergasted by NotebookLM and its ability to generate a podcast from a set of documents. And then, I got to thinking: “how do they do that, and where can I get some of that magic?” How easy would it be to replicate?
Goal: Create a video talk from an article
I don’t want to create a podcast, but I’ve often wished I could generate slides and a video talk from my blog posts. Some people prefer paging through slides, others prefer to watch videos, and this would be a good way to meet them where they are. In this article, I’ll show you how to do this.
I am going to use Google Gemini Flash because (a) it is the least expensive frontier LLM today, (b) it’s multimodal in that it can read and understand images also, and (c) it supports controlled generation, meaning that we can make sure the output of the LLM matches a desired structure.
import pdfkit
import os
import google.generativeai as genai
from dotenv import load_dotenv
Note that I’m using Google Generative AI and not Google Cloud Vertex AI. The two packages are different. The Google one supports Pydantic objects for controlled generation; the Vertex AI one only supports JSON for now.
2. Get a PDF of the article
I used Python to download the article as a PDF, and upload it to a temporary storage location that Gemini can read:
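The download-and-upload code isn’t reproduced above; a minimal sketch, assuming a hypothetical article URL and using the File API from the google-generativeai package, might look like this:

import pdfkit
import google.generativeai as genai

url = "https://example.com/my-article"   # hypothetical article URL
pdfkit.from_url(url, "article.pdf")      # requires wkhtmltopdf to be installed

# Upload to temporary storage that Gemini can read (Gemini File API)
pdf_file = genai.upload_file(path="article.pdf")
print(pdf_file.uri)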
Unfortunately, something about Medium prevents pdfkit from getting the images in the article (perhaps because they are webm and not png …). So, my slides are going to be based on just the text of the article and not the images.
3. Create lecture notes in JSON
Here, the data format I want is a set of slides each of which has a title, key points, and a set of lecture notes. The lecture as a whole has a title and an attribution also.
from typing import List
from pydantic import BaseModel

class Slide(BaseModel):
    title: str
    key_points: List[str]
    lecture_notes: str

class Lecture(BaseModel):
    slides: List[Slide]
    lecture_title: str
    based_on_article_by: str
Let’s tell Gemini what we want it to do:
lecture_prompt = """
You are a university professor who needs to create a lecture to a class of undergraduate students.

* Create a 10-slide lecture based on the following article.
* Each slide should contain the following information:
  - title: a single sentence that summarizes the main point
  - key_points: a list of between 2 and 5 bullet points. Use phrases, not full sentences.
  - lecture_notes: 3-10 sentences explaining the key points in easy-to-understand language. Expand on the points using other information from the article.
* Also, create a title for the lecture and attribute the original article's author.
"""
The prompt is pretty straightforward — ask Gemini to read the article, extract key points and create lecture notes.
Now, invoke the model, passing in the PDF file and asking it to populate the desired structure:
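The call itself isn’t shown above; here is a minimal sketch, assuming the pdf_file handle from the upload step and the Lecture class and lecture_prompt defined earlier (the exact model name is also an assumption):

import google.generativeai as genai

model = genai.GenerativeModel(
    "gemini-1.5-flash-001",              # model name is an assumption
    system_instruction=[lecture_prompt]  # the prompt goes in as the system prompt
)

response = model.generate_content(
    [pdf_file],                          # the uploaded PDF from the previous step
    generation_config=genai.GenerationConfig(
        response_mime_type="application/json",  # ask for JSON ...
        response_schema=Lecture                 # ... matching the Pydantic schema
    ),
    stream=False                         # wait for the full response; no streaming
)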
We pass in the prompt as the system prompt, so that we don’t need to keep sending in the prompt with new inputs.
We specify the desired response type as JSON, and the schema to be a Pydantic object
We send the PDF file to the model and tell it to generate a response. We'll wait for it to complete (no need to stream).
The result is JSON, so extract it into a Python object:
import json
lecture = json.loads(response.text)
For example, this is what the 3rd slide looks like:
{
  'key_points': [
    'Silver layer cleans, structures, and prepares data for self-service analytics.',
    'Data is denormalized and organized for easier use.',
    'Type 2 slowly changing dimensions are handled in this layer.',
    'Governance responsibility lies with the source team.'
  ],
  'lecture_notes': 'The silver layer takes data from the bronze layer and transforms it into a usable format for self-service analytics. This involves cleaning, structuring, and organizing the data. Type 2 slowly changing dimensions, which track changes over time, are also handled in this layer. The governance of the silver layer rests with the source team, which is typically the data engineering team responsible for the source system.',
  'title': 'The Silver Layer: Data Transformation and Preparation'
}
4. Convert to PowerPoint
We can use the Python package pptx to create a Presentation with notes and bullet points. The code to create a slide looks like this:
from pptx import Presentation

presentation = Presentation()
for slidejson in lecture['slides']:
    slide = presentation.slides.add_slide(presentation.slide_layouts[1])
    title = slide.shapes.title
    title.text = slidejson['title']
    # bullets
    textframe = slide.placeholders[1].text_frame
    for key_point in slidejson['key_points']:
        p = textframe.add_paragraph()
        p.text = key_point
        p.level = 1
    # notes
    notes_frame = slide.notes_slide.notes_text_frame
    notes_frame.text = slidejson['lecture_notes']
The result is a PowerPoint presentation that looks like this:
The PowerPoint file that was generated from the keypoints and lecture notes. Screenshot by author.
Not very fancy, but definitely a great starting point for editing if you are going to give a talk.
5. Read the notes aloud and save audio
Well, we were inspired by a podcast, so let’s see how to create just an audio of someone summarizing the article.
We already have the lecture notes, so let’s create audio files of each of the slides.
Here’s the code to take some text, and have an AI voice read it out. We save the resulting audio into an mp3 file:
from google.cloud import texttospeech
def convert_text_audio(text, audio_mp3file):
    """Synthesizes speech from the input string of text."""
    tts_client = texttospeech.TextToSpeechClient()
    input_text = texttospeech.SynthesisInput(text=text)
    # Voice and encoding as described below; the exact parameters are an assumption
    voice = texttospeech.VoiceSelectionParams(
        language_code="en-US", ssml_gender=texttospeech.SsmlVoiceGender.FEMALE)
    audio_config = texttospeech.AudioConfig(
        audio_encoding=texttospeech.AudioEncoding.MP3)
    response = tts_client.synthesize_speech(
        input=input_text, voice=voice, audio_config=audio_config)
    # The response's audio_content is binary.
    with open(audio_mp3file, "wb") as out:
        out.write(response.audio_content)
        print(f"{audio_mp3file} written.")
What’s happening in the code above?
We are using Google Cloud’s text to speech API
Asking it to use a standard US accent female voice. If you were doing a podcast, you’d pass in a “speaker map” here, one voice for each speaker.
We then give it the input text and ask it to generate audio
We save the audio as an mp3 file. Note that the file extension has to match the audio encoding we requested.
Now, create audio by iterating through the slides, and passing in the lecture notes:
filenames = []
for slideno, slide in enumerate(lecture['slides']):
    text = f"On to {slide['title']}\n"
    text += slide['lecture_notes'] + "\n\n"
    filename = os.path.join(outdir, f"audio_{slideno+1:02}.mp3")
    convert_text_audio(text, filename)
    filenames.append(filename)
The result is a bunch of audio files. You can concatenate them if you wish using pydub:
import pydub

combined = pydub.AudioSegment.empty()
for audio_file in audio_files:
    audio = pydub.AudioSegment.from_file(audio_file)
    combined += audio
    # pause for 4 seconds
    silence = pydub.AudioSegment.silent(duration=4000)
    combined += silence
combined.export("lecture.wav", format="wav")
But it turned out that I didn’t need to. The individual audio files, one for each slide, were what I needed to create a video. For a podcast, of course, you’d want a single mp3 or wav file.
6. Create images of the slides
Rather annoyingly, there’s no easy way to render PowerPoint slides as images using Python. You need a machine with Office software installed to do that, which is not the kind of thing that’s easily automatable. Maybe I should have used Google Slides … Anyway, a simple way to render images is to use the Python Imaging Library (PIL):
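The rendering and video-assembly code isn’t reproduced here, so the following is only a sketch of the idea: draw each slide’s title and key points onto a blank image with PIL, then pair each image with its audio file and write the mp4 with MoviePy 1.x (fonts, sizes, and file names are assumptions):

from PIL import Image, ImageDraw, ImageFont
from moviepy.editor import AudioFileClip, ImageClip, concatenate_videoclips

def render_slide_image(slidejson, filename, size=(1280, 720)):
    """Draw the slide title and key points onto a plain white image."""
    img = Image.new("RGB", size, "white")
    draw = ImageDraw.Draw(img)
    font = ImageFont.load_default()   # a real deck would load a TrueType font here
    draw.text((40, 40), slidejson['title'], fill="black", font=font)
    for i, key_point in enumerate(slidejson['key_points']):
        draw.text((60, 120 + i * 60), f"- {key_point}", fill="black", font=font)
    img.save(filename)

clips = []
for slideno, slidejson in enumerate(lecture['slides']):
    image_file = f"slide_{slideno+1:02}.png"
    audio_file = f"audio_{slideno+1:02}.mp3"      # written in the previous step (path assumed)
    render_slide_image(slidejson, image_file)
    audio = AudioFileClip(audio_file)
    clips.append(ImageClip(image_file).set_duration(audio.duration).set_audio(audio))

concatenate_videoclips(clips, method="compose").write_videofile("lecture.mp4", fps=1)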
End result? We have four artifacts, all created automatically from the article.pdf:
lecture.json
lecture.mp4
lecture.pptx
lecture.wav
There’s:
a JSON file with keypoints, lecture notes, etc.
A PowerPoint file that you can modify. The slides have the key points, and the notes section of the slides has the “lecture notes”
An audio file consisting of an AI voice reading out the lecture notes
An mp4 movie (that I uploaded to YouTube) of the audio + images. This is the video talk that I set out to create.
Pretty cool, eh?
8. What this says about the future of software
We are all, as a community, probing around to find out what this really cool technology (generative AI) can be used for. Obviously, you can use it to create content, but the content it creates is good for brainstorming, not for using as-is. Three years of improvements in the tech have not solved the problem that GenAI generates blah content and not-ready-to-use code.
That brings us to some of the ancillary capabilities that GenAI has opened up. And these turn out to be extremely useful. There are four capabilities of GenAI that this post illustrates.
(1) Translating unstructured data to structured data
The Attention paper was written to solve the translation problem, and it turns out transformer-based models are really good at translation. We keep discovering use cases for this: not just Japanese to English, but also Java 11 to Java 17, text to SQL, text to speech, one database dialect to another, …, and now articles to audio scripts. This, it turns out, is the stepping stone to using GenAI to create podcasts, lectures, videos, etc.
All I had to do was to prompt the LLM to construct a series of slide contents (keypoints, title, etc.) from the article, and it did. It even returned the data to me in structured format, conducive to using it from a computer program. Specifically, GenAI is really good at translating unstructured data to structured data.
(2) Code search and coding assistance are now dramatically better
The other thing that GenAI turns out to be really good at is adapting code samples dynamically. I don’t write code for presentations, text-to-speech, or moviepy every day. Two years ago, I’d have been using Google search, getting Stack Overflow pages, and adapting the code by hand. Now, Google search gives me ready-to-incorporate code:
Google Search returning code samples, adapted to my specific query. Screenshot by author.
Of course, had I been using a Python IDE (rather than a Jupyter notebook), I could have avoided the search step completely — I could have written a comment and gotten the code generated for me. This is hugely helpful, and speeds up development using general purpose APIs.
(3) GenAI web services are robust and easy-to-consume
Let’s not lose track of the fact that I used the Google Cloud Text-to-Speech service to turn my audio script into actual audio files. Text-to-speech is itself a generative AI model (and another example of the translation superpower). The Google TTS service which was introduced in 2018 (and presumably improved since then) was one of the first generative AI services in production and made available through an API.
In this article, I used two generative AI models — TTS and Gemini — that are made available as web services. All I had to do was to call their APIs.
(4) It’s easier than ever to provide end-user customizability
I didn’t do this, but you can squint a little and see where things are headed. If I’d wrapped up the presentation creation, audio creation, and movie creation code in services, I could have had a prompt create the function calls to invoke these services as well. And I could have put a request-handling agent in front that would allow you to use text to change the look and feel of the slides or the voice of the person reading the video.
It becomes extremely easy to add open-ended customizability to the software you build.
Summary
Inspired by the NotebookLM podcast feature, I set out to build an application that would convert my articles to video talks. The key steps are to prompt an LLM to produce slide contents from the article, use another GenAI model to convert the audio script into audio files, and use existing Python APIs to put them together into a video.
This article illustrates four capabilities that GenAI is unlocking: translation of all kinds, coding assistance, robust web services, and end-user customizability.
I loved being able to easily and quickly create video lectures from my articles. But I’m even more excited about the potential that we keep discovering in this new tool we have in our hands.
Turns out Sascha Heyer wrote up how to use GenAI to generate a podcast, which is the exact NotebookLM use case. His approach is somewhat similar to mine, except that there is no video, just audio. In a cool twist, he uses his own voice as one of the podcast speakers!
Of course, here’s the video talk for this article, created using the technique shown in this article. Ideally, we would also pull out code snippets and images from the article, but this is a start …
A GCP-native, fully serverless implementation that you can replicate in minutes
After discussing Graph RAG conceptually, let’s bring it into production. This is how to productionize GraphRAG: completely serverless, fully parallelized to minimize inference and indexing times, and without ever touching a graph database (promise!).
In this article, I will introduce you to graphrag-lite, an end-to-end Graph RAG ingestion and query implementation. I published graphrag-lite as an OSS project to make your life easier when deploying graphrag on GCP. Graphrag-lite is Google Cloud-native and ready to use off the shelf. The code is designed in a modular manner, adjustable for your platform of choice.
Recap:
Retrieval Augmented Generation by itself does not describe any specific architecture or method. It only depicts the augmentation of a given generation task with an arbitrary retrieval method. The original RAG paper (Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks by Lewis et al.) compares a two-tower embedding approach with bag-of-words retrieval.
Modern Q&A systems differentiate between local and global questions. A local (extractive) question on an unstructured sample knowledge base might be “Who won the Nobel Peace Prize in 2023?”. A global (aggregative) question might be “Who are the most recent Nobel prize winners you know about?”. Text2embedding RAG leaves obvious gaps when it comes to global and structured questions. Graph RAG can close these gaps and it does that well! Via an abstraction layer, it learns the semantics of the knowledge graph communities. That builds a more “global” understanding of the indexed dataset. Here is a conceptual intro to Graph RAG to read up on.
The Graph RAG pipeline
A Graph RAG pipeline usually follows these steps:
Graph Extraction
This is the main ingestion step. Your LLM scans every incoming document with a prompt to extract relevant nodes and edges for your knowledge graph. You iterate multiple times over this prompt to ensure you catch all relevant pieces of information.
Graph Storage
You store the extracted nodes and edges in your data store of choice. Dedicated graph DBs are one option, but they are often tedious. Graph2nosql is a Python-based interface to store and manage knowledge graphs in Firestore or any other NoSQL DB. I open-sourced this project because I did not find any other comparable, knowledge-graph-native option on the market.
Community detection
Once you store your knowledge graph data, you use a community detection algorithm to identify groups of nodes that are more densely connected to each other than to the rest of the graph. In the context of a knowledge graph, the assumption is that dense communities cover common topics.
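Graphrag-lite delegates this to graph2nosql, but the idea can be sketched with networkx’s built-in Louvain implementation (the toy graph below is made up):

import networkx as nx

# Toy knowledge graph: two loosely connected topic clusters
G = nx.Graph()
G.add_edges_from([
    ("Gemini", "Google"), ("Google", "Vertex AI"), ("Gemini", "Vertex AI"),
    ("Firestore", "NoSQL"), ("Firestore", "GCP"), ("NoSQL", "GCP"),
    ("Google", "GCP"),   # weak bridge between the two clusters
])

# Louvain maximizes modularity: densely connected groups of nodes end up together
communities = nx.community.louvain_communities(G, seed=42)
for i, community in enumerate(communities):
    print(f"Community {i}: {sorted(community)}")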
Community report generation
You then instruct your LLM to generate a report for each graph community. These community reports help abstract across single topics to grasp wider, global concepts across your dataset. Community reports are stored along with your knowledge graph. This concludes the ingestion layer of the pipeline.
Map-Reduce for final context building
At query time you follow a map-reduce pattern to generate an intermediate response to the user query for every community report in your knowledge graph. You have the LLM also rate the relevance of each intermediate query response. Finally, you rank the intermediate responses by relevance and select the top n as context for your final response to the user.
Graph RAG step-by-step logic — Image by author
Graph Extraction
In the initial ingestion step, you instruct your LLM to encode your input document(s) as a graph. An extensive prompt instructs your LLM to first identify nodes of given types, and secondly edges between the nodes you identified. Just like with any LLM prompt, there is not one solution for this challenge. Here is the core of my graph extraction prompt, which I based on Microsoft’s OSS implementation:
-Goal-
Given a text document that is potentially relevant to this activity and a list of entity types, identify all entities of those types from the text and all relationships among the identified entities.

-Steps-
1. Identify all entities. For each identified entity, extract the following information:
   - entity_name: Name of the entity, capitalized
   - entity_type: One of the following types: [{entity_types}]
   - entity_description: Comprehensive description of the entity's attributes and activities
   Format each entity as ("entity"{tuple_delimiter}<entity_name>{tuple_delimiter}<entity_type>{tuple_delimiter}<entity_description>)

2. From the entities identified in step 1, identify all pairs of (source_entity, target_entity) that are *clearly related* to each other.
   For each pair of related entities, extract the following information:
   - source_entity: name of the source entity, as identified in step 1
   - target_entity: name of the target entity, as identified in step 1
   - relationship_description: explanation as to why you think the source entity and the target entity are related to each other
   - relationship_strength: a numeric score indicating strength of the relationship between the source entity and target entity
   Format each relationship as ("relationship"{tuple_delimiter}<source_entity>{tuple_delimiter}<target_entity>{tuple_delimiter}<relationship_description>{tuple_delimiter}<relationship_strength>)

3. Return output in English as a single list of all the entities and relationships identified in steps 1 and 2. Use **{record_delimiter}** as the list delimiter.
The extraction step determines which information will be reflected in your knowledge base, so you should use a rather powerful model such as Gemini 1.5 Pro. To further increase robustness, you can use the multi-turn (chat) version of Gemini 1.5 Pro and query the model to improve its results n times. Here is how I implemented the graph extraction loop in graphrag-lite:
if "YES" not in completion_check:
    print(f"+++++ Complete with completion check after round {round_i} +++++")
    break
First I make an initial call to the multi-turn model to extract nodes and edges. Second I ask the model to improve the previous extraction results several times.
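Expanded, that loop looks roughly like the sketch below, using the google-generativeai chat interface; the prompt texts, variable names, and round limit are assumptions rather than the exact graphrag-lite code:

import google.generativeai as genai

chat = genai.GenerativeModel("gemini-1.5-pro-001").start_chat()

# Initial extraction pass over the document (extraction_prompt and document_text assumed defined)
response = chat.send_message(extraction_prompt + "\n\n" + document_text)
extraction_result = response.text

# Ask the model to improve its own extraction up to n times
for round_i in range(max_gleaning_rounds):
    response = chat.send_message(
        "Many entities and relationships were missed. Add them, using the same format.")
    extraction_result += response.text

    # Ask the model whether anything is still missing; stop early if not
    completion_check = chat.send_message(
        "Are there still entities or relationships missing? Answer YES or NO.").text
    if "YES" not in completion_check:
        print(f"+++++ Complete with completion check after round {round_i} +++++")
        break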
In the graphrag-lite implementation, the extraction model calls are made by the front-end client. If you want to reduce client load you could outsource the extraction queries to a microservice.
Graph Storage
Once you extract the nodes and edges from a document, you need to store them in an accessible format. Graph databases are one way to go, but they can also be cumbersome. For your knowledge graph, you might be looking for something a little more lightweight. I thought the same, and because I did not find any comparable knowledge-graph-native library, I open-sourced graph2nosql. Graph2nosql is a simple, knowledge-graph-native Python interface. It helps store and manage your knowledge graph in any NoSQL DB, all without blowing up your tech stack with a graph DB or needing to learn Cypher.
Graph2nosql is designed with knowledge graph retrieval for Graph RAG in mind. The library is designed around three major datatypes: EdgeData, NodeData, and CommunityData. Nodes are identified by a uid. Edges are identified by source and destination node uids and an edge uid. Given that uids can be freely designed, the graph2nosql data model leaves space for any size of knowledge graph. You can even add text or graph embeddings. That allows embedding-based analytics, edge prediction, and additional text embedding retrieval (think hybrid RAG).
Graph2nosql is natively designed around Firestore.
from dataclasses import dataclass, field
from typing import Tuple

@dataclass
class NodeData:
    node_uid: str
    node_title: str
    node_type: str
    node_description: str
    node_degree: int
    document_id: str
    community_id: int | None = None  # community id based on source document
    edges_to: list[str] = field(default_factory=list)
    edges_from: list[str] = field(default_factory=list)  # in case of directed graph
    embedding: list[float] = field(default_factory=list)  # text embedding representing node e.g. combination of title & description

@dataclass
class CommunityData:
    title: str  # title of comm, None if not yet computed
    community_nodes: set[str] = field(default_factory=set)  # list of node_uid belonging to community
    summary: str | None = None  # description of comm, None if not yet computed
    document_id: str | None = None  # identifier for source knowledge base document for this entity
    community_uid: str | None = None  # community identifier
    community_embedding: Tuple[float, ...] = field(default_factory=tuple)  # text embedding representing community
    rating: int | None = None
    rating_explanation: str | None = None
    findings: list[dict] | None = None
To store your graph data via graph2nosql simply run the following code when parsing the results from your extraction step. Here is the graphrag-lite implementation.
from graph2nosql.graph2nosql.graph2nosql import NoSQLKnowledgeGraph
from graph2nosql.databases.firestore_kg import FirestoreKG
from graph2nosql.datamodel import data_model
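The storage call itself is not reproduced above. The sketch below only illustrates the shape of such code; the FirestoreKG constructor arguments and the add_node method name are assumptions on my part, not the documented graph2nosql API:

# Hypothetical setup; check the graph2nosql docs for the real constructor arguments
fskg = FirestoreKG(
    gcp_project_id="my-project",
    firestore_db_id="(default)",
    node_collection_id="nodes",
    edges_collection_id="edges",
    community_collection_id="communities",
)

for node in extracted_nodes:                      # parsed from the extraction output (assumed)
    fskg.add_node(                                # method name is an assumption
        node_uid=node["entity_name"],
        node_data=data_model.NodeData(
            node_uid=node["entity_name"],
            node_title=node["entity_name"],
            node_type=node["entity_type"],
            node_description=node["entity_description"],
            node_degree=0,
            document_id=document_id,
        ),
    )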
With all relevant nodes and edges stored in your Graph DB, you can start building the abstraction layer. One way of doing that is finding nodes that describe similar concepts and describe how they are connected semantically. Graph2nosql offers inbuilt community detection, for example, based on Louvain communities.
Depending on your extraction result quality, you will find zero-degree nodes in your knowledge graph. From experience, zero-degree nodes are often duplicates. Because graphrag-lite uses graph communities as its major abstraction layer, you should drop nodes without any edges. It would therefore also make sense to think about another duplicate/merge step and/or a node prediction step based on description and graph embeddings, to add edges that might have been missed in the extraction step. In graphrag-lite, I currently simply drop all zero-degree nodes.
# clean graph off all nodes without any edges
fskg.clean_zerodegree_nodes()

# generate communities based on cleaned graph
comms = kg.get_louvain_communities()
The GraphRAG pipeline described above requires numerous LLM calls per document ingestion and user query, for example to generate multiple community reports for every newly indexed document, or to generate intermediate responses for multiple communities at query time. If these calls are processed sequentially, the result is an awful user experience; especially at scale, users would have to wait minutes or even hours for a response to their query. Fortunately, if you frame your LLM prompts the right way, you can design them as "stateless workers". The power of stateless processing architectures is twofold: they are easy to parallelize, and they are easy to implement on serverless infrastructure. Combined, a parallelized and serverless architecture maximizes your throughput scalability and minimizes your cost for idle cluster setups.
In the graphrag-lite architecture I host both the community report generation and the intermediate query generation as serverless Cloud Run microservice workers. These are fed with messages via GCP’s serverless messaging queue PubSub.
graphrag-lite’s serverless and distributed ingestion & query pipeline — Image by author
Community report generation
After running the community detection you now know multiple sets of community member nodes. Each of these sets represents a semantic topic within your knowledge graph. The community reporting step needs to abstract across these concepts that originated in different documents within your knowledge base. I again built on the Microsoft implementation and added a function call for easily parsable structured output.
You are an AI assistant that helps a human analyst to perform general information discovery. Information discovery is the process of identifying and assessing relevant information associated with certain entities (e.g., organizations and individuals) within a network.
# Goal
Write a comprehensive report of a community, given a list of entities that belong to the community as well as their relationships and optional associated claims. The report will be used to inform decision-makers about information associated with the community and their potential impact. The content of this report includes an overview of the community's key entities, their legal compliance, technical capabilities, reputation, and noteworthy claims.
# Report Structure
The report should include the following sections:
- TITLE: community's name that represents its key entities - title should be short but specific. When possible, include representative named entities in the title.
- SUMMARY: An executive summary of the community's overall structure, how its entities are related to each other, and significant information associated with its entities.
- IMPACT SEVERITY RATING: a float score between 0-10 that represents the severity of IMPACT posed by entities within the community. IMPACT is the scored importance of a community.
- RATING EXPLANATION: Give a single sentence explanation of the IMPACT severity rating.
- DETAILED FINDINGS: A list of 5-10 key insights about the community. Each insight should have a short summary followed by multiple paragraphs of explanatory text grounded according to the grounding rules below. Be comprehensive.
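graphrag-lite's exact function-call setup is not reproduced here; the following is only a sketch of structured report generation using Gemini's controlled generation with a Pydantic schema, where the schema fields mirror the CommunityData dataclass above and the model choice, community_report_prompt, and community_context variables are assumptions:

from typing import List
from pydantic import BaseModel
import google.generativeai as genai

class Finding(BaseModel):
    summary: str
    explanation: str

class CommunityReport(BaseModel):
    title: str
    summary: str
    rating: float
    rating_explanation: str
    findings: List[Finding]

report_model = genai.GenerativeModel(
    "gemini-1.5-flash-001",                        # model choice is an assumption
    system_instruction=[community_report_prompt])  # the report prompt shown above

response = report_model.generate_content(
    [community_context],   # node & edge descriptions of one community (assumed variable)
    generation_config=genai.GenerationConfig(
        response_mime_type="application/json",
        response_schema=CommunityReport))
community_report = CommunityReport.model_validate_json(response.text)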
The community report generation also demonstrates the biggest challenge around knowledge graph retrieval. Theoretically, any document could add a new node to every existing community in the graph. In the worst-case scenario, you re-generate every community report in your knowledge base for each new document added. In practice, it is crucial to include a detection step that identifies which communities have changed after a document upload, so that new reports are generated only for the adjusted communities.
Because you need to re-generate multiple community reports for every document upload, you also face significant latency challenges if these requests run one after another. Thus you should outsource and parallelize this work to asynchronous workers. As mentioned before, graphrag-lite solves this with a serverless architecture. I use Pub/Sub as a message queue to manage work items and ensure processing. Cloud Run comes on top as a compute platform hosting stateless workers that call the LLM. For generation, they use the prompt shown above.
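Publishing those work items takes only a few lines with the Pub/Sub client library; here is a minimal sketch (project ID, topic name, and message fields are assumptions, not graphrag-lite's exact payload):

import json
from google.cloud import pubsub_v1

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path("my-project", "community-report-tasks")  # assumed names

def submit_report_task(community_uid: str, community_nodes: list[str]) -> None:
    """Queue one community-report generation job for a Cloud Run worker."""
    payload = json.dumps({
        "community_uid": community_uid,
        "community_nodes": community_nodes,
    }).encode("utf-8")
    future = publisher.publish(topic_path, data=payload)
    future.result()   # block until Pub/Sub acknowledges the message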
Finally, you reach query time. To generate your final response to the user, you first generate a set of intermediate responses (one per community report). Each intermediate response takes the user query and one community report as input. You then rate these intermediate responses by their relevance. Finally, you use the most relevant community reports and additional information, such as node descriptions of the relevant member nodes, as the final query context. Given a high number of community reports at scale, this again poses latency and cost challenges. As before, you should parallelize the intermediate response generation (the map step) across serverless microservices. In the future, you could significantly improve efficiency by adding a filter layer to pre-determine the relevance of a community report for a user query.
response_schema = {
    "type": "object",
    "properties": {
        "response": {
            "type": "string",
            "description": "The response to the user question as raw string.",
        },
        "score": {
            "type": "number",
            "description": "The relevance score of the given community report context towards answering the user question [0.0, 10.0]",
        },
    },
    "required": ["response", "score"],
}
The map-step microservice uses the following prompt:
---Role---
You are an expert agent answering questions based on context that is organized as a knowledge graph. You will be provided with exactly one community report extracted from that same knowledge graph.

---Goal---
Generate a response consisting of a list of key points that responds to the user's question, summarizing all relevant information in the given community report.
You should use the data provided in the community description below as the only context for generating the response. If you don't know the answer or if the input community description does not contain sufficient information to provide an answer respond "The user question cannot be answered based on the given community context.".
Your response should always contain the following elements:
- Query based response: A comprehensive and truthful response to the given user query, solely based on the provided context.
- Importance Score: An integer score between 0-10 that indicates how important the point is in answering the user's question. An 'I don't know' type of response should have a score of 0.
The response should be JSON formatted as follows: {{"response": "Description of point 1 [Data: Reports (report ids)]", "score": score_value}}
---Context Community Report---
{context_community_report}

---User Question---
{user_question}

---JSON Response---
The json response formatted as follows:
{{"response": "Description of point 1 [Data: Reports (report ids)]", "score": score_value}}
response:
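On the worker side, this prompt can be combined with the response_schema defined earlier. Since that schema is a plain JSON dict, here is a sketch using the Vertex AI SDK (which accepts dict schemas); the model name, project settings, and the MAP_STEP_PROMPT variable holding the template above are assumptions:

import json
import vertexai
from vertexai.generative_models import GenerationConfig, GenerativeModel

vertexai.init(project="my-project", location="us-central1")   # assumed project/region
map_model = GenerativeModel("gemini-1.5-flash-001")            # model choice is an assumption

def generate_intermediate_response(user_question: str, community_report: str) -> dict:
    """Produce one {response, score} pair for a single community report."""
    prompt = MAP_STEP_PROMPT.format(      # MAP_STEP_PROMPT holds the template shown above
        context_community_report=community_report,
        user_question=user_question)
    result = map_model.generate_content(
        prompt,
        generation_config=GenerationConfig(
            response_mime_type="application/json",
            response_schema=response_schema))   # the schema defined earlier
    return json.loads(result.text)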
Reduce-step for final user response
For a successful reduce-step, you need to store the intermediate response for access at query time. With graphrag-lite, I use Firestore as a shared state across microservices. After triggering the intermediate response generations, the client also periodically checks for the existence of all expected entries in the shared state. The following code extract from graphrag-lite shows how I submit every community report to the PubSub queue. After, I periodically query the shared state to check whether all intermediate responses have been processed. Finally, the end response towards the user is generated using the top-scoring community reports as context to respond to the user query.
class KGraphGlobalQuery:
    def __init__(self) -> None:
        # initialized with info on mq, knowledge graph, shared nosql state
        pass

    # orchestration method taking the natural language user query to produce and
    # return the final answer to the client (the method name below is assumed;
    # the original signature was lost in extraction)
    def __call__(self, user_query: str) -> str:
        comm_report_list = self._get_comm_reports()

        # pair user query with existing community reports
        query_msg_list = self._context_builder(
            user_query=user_query, comm_report_list=comm_report_list)

        # send pairs to pubsub queue for work scheduling
        for msg in query_msg_list:
            self._send_to_mq(message=msg)
        print("int response request sent to mq")

        # periodically query shared state to check for processing completion
        # & get intermediate responses
        intermediate_response_list = self._check_shared_state(
            user_query=user_query)

        # based on helpfulness build final context
        sorted_final_responses = self._filter_and_sort_responses(
            intermediate_response_list=intermediate_response_list)

        # get full community reports for the selected communities
        comm_report_list = self._get_communities_reports(sorted_final_responses)

        # generate & return final response based on final context community reports and nodes
        final_response_system = prompts.GLOBAL_SEARCH_REDUCE_SYSTEM.format(
            response_type="Detailled and wholistic in academic style analysis of the given information in at least 8-10 sentences across 2-3 paragraphs.")
Once all entries are found the client triggers the final user response generation given the selected community context.
Final Thoughts
Graph RAG is a powerful technique every ML Engineer should add to their toolbox. Every Q&A type of application will eventually arrive at the point that purely extractive, “local” queries don’t cut it anymore. With graphrag-lite, you now have a lightweight, cloud-native, and serverless implementation that you can rapidly replicate.
Despite these strengths, please note that in its current state, Graph RAG still consumes significantly more LLM input tokens than text2emb RAG. That usually comes with considerably higher latency and cost for queries and document indexing. Nevertheless, after experiencing the improvement in result quality, I am convinced that in the right use cases, Graph RAG is worth the time and money.
RAG applications will ultimately move in a hybrid direction. Extractive queries can be handled efficiently and correctly by text2emb RAG. Global abstractive queries might need a knowledge graph as an alternative retrieval layer. Finally, both methods underperform with quantitative and analytical queries. Thus a third text2sql retrieval layer would add massive value. To complete the picture, user queries could initially be classified between the three retrieval methods. Like this, every query could be grounded most efficiently with the right amount and depth of information.
I cannot wait to see where else this is going. Which alternative retrieval methods have you been working with?