Year: 2024

  • Building a RAG chain using LangChain Expression Language (LCEL)

    Roshan Santhosh

    Learning the building blocks of LCEL to develop increasingly complex RAG chains

    In this post, I will be going over the implementation of a self-evaluation RAG pipeline for question-answering using LangChain Expression Language (LCEL). The focus of this post is on the use of LCEL for building pipelines, not on the actual RAG and self-evaluation principles used, which are kept simple for ease of understanding.

    I will be covering the following topics:

    1. Basic initialization steps
    2. Development of different variations of the RAG pipeline of increasing complexity using LCEL
    3. Methods for extracting intermediate variables from an LCEL-scripted pipeline
    4. Reasons for using LCEL

    The Setup

    Before we jump into the development of the RAG chain, there are some basic setup steps that we need to perform. These include:

    Data Ingestion

    The data ingestion consists of two key steps (a sketch follows the list):

    1. Reading the text from the PDF
    2. Splitting the PDF text into chunks for input to the vector database
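
    A minimal sketch of this ingestion step, assuming a local file document.pdf (the file name and chunking parameters are illustrative, not from the original notebook):

    from langchain_community.document_loaders import PyPDFLoader
    from langchain_text_splitters import RecursiveCharacterTextSplitter

    # Read the PDF into a list of Document objects (one per page)
    docs = PyPDFLoader("document.pdf").load()

    # Split the page text into overlapping chunks for the vector database
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
    splits = splitter.split_documents(docs)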

    Prompt Templates

    We will be using different prompts for the question-answering and self-evaluation tasks. We will use three different prompt templates (illustrative definitions follow the list):

    1. qa_prompt: Basic prompt for the question-answering task
    2. qa_eval_prompt: Prompt for the evaluator model that takes a question-answer pair as input
    3. qa_eval_prompt_with_context: Similar to the above prompt, but additionally includes the context for the evaluation
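
    The exact prompt wording lives in the linked notebook; the following definitions are illustrative stand-ins that match the input variables each template expects:

    from langchain_core.prompts import ChatPromptTemplate

    # QA prompt: expects `context` and `question`
    qa_prompt = ChatPromptTemplate.from_template(
        "Answer the question based only on the following context:\n"
        "{context}\n\nQuestion: {question}"
    )

    # Evaluator prompt: expects `question` and `answer`, asks for a JSON verdict
    qa_eval_prompt = ChatPromptTemplate.from_template(
        "Grade the answer to the question as Correct or Incorrect. "
        "Respond in JSON with a single key 'grade'.\n"
        "Question: {question}\nAnswer: {answer}"
    )

    # Evaluator prompt with context: additionally expects `context`
    qa_eval_prompt_with_context = ChatPromptTemplate.from_template(
        "Given the retrieved context, grade the answer to the question as Correct "
        "or Incorrect. Respond in JSON with a single key 'grade'.\n"
        "Question: {question}\nAnswer: {answer}\nContext: {context}"
    )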

    Database Initialization

    We initialize a simple vector database using FAISS and OpenAI embeddings. For retrieval, we set k to 3 (return the top 3 chunks for a given query).
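
    A minimal sketch of this step, assuming the splits produced during ingestion and an OpenAI API key in the environment:

    from langchain_community.vectorstores import FAISS
    from langchain_openai import OpenAIEmbeddings

    # Embed the chunks and index them in FAISS
    vectorstore = FAISS.from_documents(splits, OpenAIEmbeddings())

    # Expose the index as a retriever returning the top 3 chunks per query
    retriever = vectorstore.as_retriever(search_kwargs={"k": 3})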

    RAG Development

    Simple QA RAG

    We start off with an example of a basic RAG chain that carries out the following steps:

    1. Retrieves the relevant chunks (splits of the PDF text) from the vector database based on the user’s question and merges them into a single string
    2. Passes the retrieved context text along with the question to the prompt template to generate the prompt
    3. Inputs the generated prompt to the LLM to generate the final answer

    Using LangChain Expression Language (LCEL), this RAG would be implemented as follows:

    rag_chain = (
        RunnableParallel(context=retriever | format_docs, question=RunnablePassthrough())
        | qa_prompt
        | llm
    )
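
    The chain above assumes a few supporting pieces that are defined in the accompanying notebook; illustrative stand-ins (the model choice is an assumption) would look like this:

    from langchain_core.runnables import RunnableParallel, RunnablePassthrough
    from langchain_openai import ChatOpenAI

    def format_docs(docs):
        # Merge the retrieved Document chunks into a single context string
        return "\n\n".join(doc.page_content for doc in docs)

    llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)  # illustrative model choice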

    The above code follows the pipe architecture, where the output of each element is used as the input to the next. The diagram below shows the flow of data: starting from the user’s input, it passes first through the RunnableParallel block, then through qa_prompt to generate the prompt. This prompt is then sent to the LLM to generate the final output.

    Basic LCEL input/output flow

    There are two key additions to this pipeline that are unique to LangChain:

    1. RunnableParallel: As the name suggests, this class provides the functionality to run multiple processes in parallel. As a result, the output of a RunnableParallel is a dict whose keys are the arguments provided during its initialization. In this case, the output has two keys: context and question.
      So why do we need this here? It’s required because the qa_prompt template requires two input values: the context and the question. Therefore we need to compute these values individually and then pass them together to the qa_prompt template (see the short demo after this list).
    2. RunnablePassthrough: This is a useful class when you want to pass the input through to the next stage without any modification. Essentially, it acts as an identity function that returns whatever is passed as its input.
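
    A quick sketch of this behavior in isolation (the question string is hypothetical):

    first_stage = RunnableParallel(
        context=retriever | format_docs,
        question=RunnablePassthrough(),
    )
    first_stage.invoke("What is X?")
    # -> {"context": "<merged text of the top 3 chunks>", "question": "What is X?"}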

    The flowchart for the above RAG would look like this:

    QA RAG with Self Evaluation I

    Building on the previous RAG chain, we now introduce new elements into the chain to implement the self-evaluation component.

    The self-evaluation component is again a pretty straightforward implementation. We take the answer provided by the first LLM and pass it to the evaluator LLM along with the question, asking it to provide a binary response (Correct/Incorrect).

    rag_chain = (
        RunnableParallel(context=retriever | format_docs, question=RunnablePassthrough())
        | RunnableParallel(answer=qa_prompt | llm | retrieve_answer, question=itemgetter("question"))
        | qa_eval_prompt
        | llm_selfeval
        | json_parser
    )
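
    This version assumes a few more supporting pieces; again, illustrative stand-ins (the evaluator model choice is an assumption):

    from operator import itemgetter
    from langchain_core.output_parsers import JsonOutputParser

    def retrieve_answer(message):
        # Extract the answer text from the chat model’s message object
        return message.content

    json_parser = JsonOutputParser()  # parses the evaluator’s JSON verdict
    llm_selfeval = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)  # illustrative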

    The first key difference is the addition of a second RunnableParallel component. This is required because, similar to the initial prompt for the QA task, the self-eval prompt also requires two inputs: the base LLM’s answer as well as the user’s question.

    So the output of the first RunnableParallel is the context text and the question, while the output of the second RunnableParallel is the LLM’s answer along with the question.

    NOTE: For the second RunnableParallel, we use itemgetter to retain only the question value from the previous input and propagate it forward. This is done instead of using RunnablePassthrough, which would pass on the full input (a dict with two keys), whereas we are only interested in passing on the question and not the context. Additionally, there is an issue of formatting: qa_eval_prompt expects a dict with a str -> str mapping, but using RunnablePassthrough would result in a str -> dict mapping.

    The flowchart for this RAG implementation would look like this:

    QA RAG with Self Evaluation II

    For this variation, we make a change to the evaluation procedure. In addition to the question-answer pair, we also pass the retrieved context to the evaluator LLM.

    To accomplish this, we add an additional itemgetter function in the second RunnableParallel to collect the context string and pass it to the new qa_eval_prompt_with_context prompt template.

    rag_chain = (
        RunnableParallel(context=retriever | format_docs, question=RunnablePassthrough())
        | RunnableParallel(
            answer=qa_prompt | llm | retrieve_answer,
            question=itemgetter("question"),
            context=itemgetter("context"),
        )
        | qa_eval_prompt_with_context
        | llm_selfeval
        | json_parser
    )
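
    With the pieces above in place, running the chain end to end is a single call (the question is hypothetical, and the verdict depends on the model):

    result = rag_chain.invoke("What does the document say about X?")
    # e.g. {"grade": "Correct"} (the evaluator’s parsed JSON verdict)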

    Implementation Flowchart:

    Retrieving intermediate variables

    One of the common pain points with a chain implementation like LCEL is the difficulty of accessing intermediate variables, which is important for debugging pipelines. We look at a few options for accessing any intermediate variables we are interested in through manipulations of the LCEL chain.

    Using RunnableParallel to carry forward intermediate outputs

    As we saw earlier, RunnableParallel allows us to carry multiple arguments forward to the next step in the chain. Here we use this ability of RunnableParallel to carry the required intermediate values all the way through to the end of the chain.

    In the example below, we modify the original self-eval RAG chain to output the retrieved context text along with the final self-evaluation output. The primary change is that we add a RunnableParallel object to every step of the process to carry forward the context variable.

    Additionally, we use the itemgetter function to clearly specify the inputs for the subsequent steps. For example, for the last two RunnableParallel objects, we use itemgetter('input') to ensure that only the input argument from the previous step is passed on to the LLM/JSON parser objects.

    rag_chain = (
        RunnableParallel(context=retriever | format_docs, question=RunnablePassthrough())
        | RunnableParallel(
            answer=qa_prompt | llm | retrieve_answer,
            question=itemgetter("question"),
            context=itemgetter("context"),
        )
        | RunnableParallel(input=qa_eval_prompt, context=itemgetter("context"))
        | RunnableParallel(input=itemgetter("input") | llm_selfeval, context=itemgetter("context"))
        | RunnableParallel(input=itemgetter("input") | json_parser, context=itemgetter("context"))
    )

    The output from this chain looks like the following:
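
    An illustrative output shape, under the assumption that the evaluator returns a JSON verdict with a single grade key (actual values depend on the document, question, and model):

    # {"input": {"grade": "Correct"},
    #  "context": "<merged text of the retrieved chunks>"}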

    A more concise variation:

    rag_chain = (
        RunnableParallel(context=retriever | format_docs, question=RunnablePassthrough())
        | RunnableParallel(
            answer=qa_prompt | llm | retrieve_answer,
            question=itemgetter("question"),
            context=itemgetter("context"),
        )
        | RunnableParallel(input=qa_eval_prompt | llm_selfeval | json_parser, context=itemgetter("context"))
    )

    Using Global variables to save intermediate steps

    This method essentially uses the principle of a logger: we introduce a new function that saves its input to a global variable, thus giving us access to the intermediate value through that global variable.

    context = None  # module-level variable that will hold the retrieved context

    def save_context(x):
        # Save the input to the global variable, then return it unchanged
        global context
        context = x
        return x

    rag_chain = (
        RunnableParallel(context=retriever | format_docs | save_context, question=RunnablePassthrough())
        | RunnableParallel(answer=qa_prompt | llm | retrieve_answer, question=itemgetter("question"))
        | qa_eval_prompt
        | llm_selfeval
        | json_parser
    )

    Here we define a global variable called context and a function called save_context that saves its input value to the global context variable before returning the same input. In the chain, we add the save_context function as the last step of the context retrieval stage.
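
    After a run, the saved value is available as a side effect of invoking the chain (the question is hypothetical):

    result = rag_chain.invoke("What does the document say about X?")
    print(context)  # the merged context string captured by save_context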

    This option allows you to access any intermediate steps without making major changes to the chain.

    Accessing intermediate variables using global variables

    Using callbacks

    Attaching callbacks to your chain is another common method for logging intermediate variable values. There’s a lot to cover on the topic of callbacks in LangChain, so I will be covering this in detail in a different post.

    Why use LCEL?

    The reasons for using LCEL are best explained by the authors of LangChain themselves in their official documentation.

    Of the points mentioned in the documentation, the following are some that I find especially useful:

    1. Input and output schemas: Will be covering this in detail in a different post
    2. Async support: As we move towards production applications, it becomes more important to have async functionality. LCEL pipelines allow for a seamless transition to async operations (see the sketch after this list).
    3. Optimized parallel execution
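
    A minimal sketch of that transition, assuming the rag_chain built earlier: every Runnable exposes ainvoke alongside invoke, so the same chain works in async code without changes (the question is hypothetical):

    import asyncio

    async def main():
        # Same chain as before, awaited instead of called synchronously
        result = await rag_chain.ainvoke("What does the document say about X?")
        print(result)

    asyncio.run(main())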

    Beyond these reasons, as a matter of personal preference, I feel that using LCEL helps improve the readability of your code and allows for cleaner implementations.

    Resources

    Full code notebook

    PDF text

    Images: All images are created by the author

    In addition to Medium, I share my thoughts, ideas and other updates on LinkedIn.



  • Lowest price: Apple’s M3 MacBook Air 16GB RAM, 512GB SSD, 10C GPU drops to $1,349

    Packed with upgrades, Apple’s latest M3 MacBook Air with 16GB RAM, 512GB of storage and a 10-core GPU is marked down to $1,349. AppleCare is discounted with an exclusive coupon as well, dropping the price down to $159.

    The exclusive $150 discount* is in effect with promo code APINSIDER at Apple Authorized Reseller Adorama. Celebrating 50 years in business, the retailer is marking down Apple’s latest MacBook Air with the M3 chip in every set of specs.

    Use code APINSIDER

    Both the 13-inch and 15-inch screen sizes are eligible for a coupon discount, but we’re especially fond of the 13-inch configuration with a 10-core GPU, 16GB RAM and a 512GB SSD. After the coupon is applied during Step 4 of checkout (Payments section), the price drops to $1,349 in your choice of finish — the best price available across Apple resellers, according to our M3 MacBook Air 13-inch Price Guide.


  • Gaming and AI are in Mac’s future, even with low memory capacities

    Apple is continuing to insist that users can still get a lot of work done with just 8GB of memory in a Mac, while also being bullish about improving the gaming experience and AI development on the platform.


    In November, Apple VP of worldwide product marketing Bob Borchers defended selling Macs with 8GB of memory, insisting that design improvements make 8GB as useful as 16GB in a PC. Months later, chiefs at the company are continuing to insist that small is beautiful, at least in terms of memory capacities.

    In an interview with IT Home, product marketing manager Evan Buyze said that 8GB of memory is suitable for many tasks, including web browsing, streaming, light photo and video editing, and some gaming.


  • Iran-Israel conflict, negative funding rates triggered $860m crypto sell-off

    Mohammad Shahidullah

    Historically, geopolitical instability tends to drive investors away from riskier assets. That’s what happened Friday with crypto as Iran-Israel tensions escalated.


  • Five Arrows buys private equity-backed Rimes

    Julius Mutunkei

    Private equity firm EQT AB has agreed to sell Rimes Technologies to Five Arrows, Rothschild & Co.’s alternative assets division. The valuation reportedly hovers around $852m.
