A framework for unlocking custom LLM solutions you’ll understand
Foreword
This article illustrates how Large Language Models (LLMs) are gradually adapted for custom use. It is meant to give people with no computer science background an easy-to-grasp analogy for how GPT and similar AI systems can be customized. Why the artwork? Bear with me, I hope you enjoy the journey.
Introduction
I will not start this article with an introduction on how ChatGPT, Claude and generative AI have transformed businesses and will forever change our lives, careers and businesses. This has been written many times (most notably by the GPTs themselves …). Instead, today I would like to focus on the question of how we can use a Large Language Model (LLM) for our specific, custom purposes.
In my professional and private life, I have tried to help people understand the basics of what Language AI can and cannot do, beginning with why and how we should do proper prompting (which is beyond the scope of this article), ranging all the way to what it means when managers claim that their company has its “own language model.” I feel there is a lot of confusion and uncertainty, in particular around the topic of adapting language models to your business requirements. So far, I have not come across an established framework that addresses this need.
In an attempt to provide an easy explanation that helps non-IT specialists understand the possibilities of language model customization, I came up with an analogy that took me back to my early days, when I worked as a bar piano player. Just like a language model, a bar piano player is frequently asked to play a variety of songs, often with an unspecific request or limited context: “Play it again, Sam…”
Meet Sam — the Bar Piano Player
Imagine you’re sitting in a piano bar in the lounge of a 5-star hotel, and there is a nice grand piano. Sam, the (still human) piano player, is performing. You’re enjoying your drink and wonder if Sam can also perform according to your specific musical taste. For the sake of our argument, Sam is actually a language model, and you’re the hotel (or business) owner wondering what Sam is able to do for you. The Escalation Ladder I present here is a framework that offers four levels, or approaches, to gradually shape Sam’s knowledge and capabilities to align with your unique requirements. With each level, the requirements get more specific, and so do the effort and cost of making Sam adapt.
The Escalation Ladder: From Prompting to Training Your Own Language Model
1. Prompting: Beyond the Art of Asking the Right Questions
The first thing you can do is quite simple, though not necessarily easy. You ask Sam to play a song that you’d like to hear. The more specific you are (the clearer your request, the better your wording and, depending on the number of drinks you’ve had at the hotel bar, your pronunciation), the better the result. As Voltaire famously said:
“Judge a man by his questions rather than by his answers.”
“Play some jazz” may or may not make Sam play what you had in mind. “Play Dave Brubeck’s original version of ‘Take Five’, playing the lead notes of the saxophone with your right hand while keeping the rhythmic pattern in the left” will get quite a specific result, presuming Sam has been given the right training.
What you do here is my analogy to prompting — the way we currently interact with general-purpose language models like GPT or Claude. While prompting is the most straightforward approach, the quality of the output relies heavily on the specificity and clarity of your prompts. It is for this reason that Prompt Engineering has become a profession, one which you probably would never have heard of only a few years ago.
Prompting makes a huge difference: it decides whether you get a poor, generic, or even false answer, or something you can actually work with. That’s why in my daily use of GPT and the like, I always take a minute to consider a proper prompt for the task at hand. My favorite course of action here is “role-based prompting”, where you give the model a specific role in your prompt, such as an IT expert, a data engineer or a career coach. (Again, we will not get into the depths of prompting, since it is beyond the scope of this article.)
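For the technically curious, here is a minimal sketch of what role-based prompting can look like in code, assuming OpenAI’s Python client with an API key set in the environment; the model name and the wording of the messages are illustrative examples only, not a recommendation.

```python
# pip install openai -- a minimal sketch; assumes OPENAI_API_KEY is set in your environment
from openai import OpenAI

client = OpenAI()

# Role-based prompting: the system message assigns the model a persona,
# the user message carries the actual, as-specific-as-possible request.
response = client.chat.completions.create(
    model="gpt-4o-mini",  # example model name; any chat-capable model works
    messages=[
        {"role": "system",
         "content": "You are an experienced jazz pianist and arranger."},
        {"role": "user",
         "content": "Explain how to arrange 'Take Five' for solo piano, "
                    "keeping the 5/4 rhythmic pattern in the left hand."},
    ],
)
print(response.choices[0].message.content)
```

The same request without the system message would still work; the role simply gives the model a consistent perspective to answer from.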
But prompting has its limits: You may not want to always explain the world within your prompts. It can be quite a tedious task to provide all the context in proper writing (even though chat-based language models are somewhat forgiving when it comes to spelling). And the output may still deviate from what you had in mind — in the hotel bar scenario, you still may not be happy with Sam’s interpretation of your favorite songs, no matter how specific your requests may have been.
2. Embedding or Retrieval-Augmented Generation (RAG): Provide Context-Relevant Data or Instructions
You have an idea. In addition to asking Sam to “play it again” (and prompting him specifically about what it is you want to hear), you remember you have the sheet music in your bag. So you put the sheets on the piano and ask Sam to play what’s written (provided you give him some incentive, say, $10 in cash).
In our analogy, our model now uses its inherent abilities to generate language output (Sam playing piano) and directs those abilities towards a specific piece of context (Sam playing a specific song).
This architectural pattern is referred to as Retrieval-Augmented Generation (RAG), where you provide the model with additional context or reference materials relevant to your domain. By incorporating these external sources and data, the model can generate more informed and accurate responses, tailored to your specific needs. In more technical terms, this involves preparing and cleaning textual context data, which is then transformed into embeddings (numerical vector representations) and properly indexed. When prompted, the model receives a relevant selection of this context data, chosen according to the content of the prompt.
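For readers who want to peek under the hood, the following is a minimal sketch of that retrieval step, assuming the open-source sentence-transformers library for the embeddings; the toy documents, the model name and the prompt wording are placeholders, not a production setup.

```python
# pip install sentence-transformers numpy -- a minimal retrieval sketch
import numpy as np
from sentence_transformers import SentenceTransformer

# Toy "knowledge base"; in practice these would be your cleaned, chunked documents.
documents = [
    "Take Five was composed by Paul Desmond and recorded by the Dave Brubeck Quartet.",
    "The hotel bar opens daily at 6 pm and features live piano music.",
    "Song requests can be made directly to the pianist.",
]

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small, widely used embedding model
doc_vectors = embedder.encode(documents, normalize_embeddings=True)  # the "index"

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents most similar to the query (cosine similarity)."""
    q = embedder.encode([query], normalize_embeddings=True)
    scores = (doc_vectors @ q.T).ravel()  # cosine similarity, since vectors are normalized
    return [documents[i] for i in np.argsort(-scores)[:k]]

query = "Who wrote Take Five?"
context = "\n".join(retrieve(query))

# The retrieved context is prepended to the prompt that goes to the language model.
augmented_prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(augmented_prompt)
```

In a real system the index would live in a vector database and the augmented prompt would be sent to the LLM, but the core idea is exactly this: retrieve first, then generate.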
It is the next step up the ladder since it requires some effort on your side (e.g., giving Sam $10) and can involve some serious implementation costs.
Now Sam plays your favorite tune — however, you are still not happy with the way he plays it. Somehow you want more swing, or a certain touch is missing. So you take the next step on our ladder.
3. Fine-tuning: Learning and Adapting to Feedback
This is where my analogy starts to get a bit shaky, especially when we’re taking the word “tuning” in our musical context literally. Here, we are not talking about tuning Sam’s piano. Instead, when thinking about fine-tuning in this context, I am referring to taking a considerable amount of time to work with Sam until he plays how we like him to play. So we basically give him piano lessons, providing feedback on his playing and supervising his progress.
Back to language models, one of the approaches here is referred to as reinforcement learning from human feedback (RLHF), and it fits well into our picture of a strict piano teacher. Fine-tuning takes the customization process further by adapting (i.e., tuning) the model’s knowledge and skills to a particular task or domain. Again, putting it a little more technically, what happens here is based on Reinforcement Learning, which has a Reward Function at its core. This reward dynamically adapts to the human feedback, which is often given as a human A/B judgement label comparing two textual outputs of the model for the same prompt.
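For the technically inclined, here is a toy sketch of the core of that reward function: a pairwise (Bradley-Terry style) loss that pushes the reward of the output a human preferred above the one they rejected. The tiny linear “reward head” and the random embeddings are stand-ins of my own; in a real RLHF pipeline the reward model is built on top of the language model itself and trained on many such comparisons.

```python
# pip install torch -- a toy illustration of the pairwise reward-model loss used in RLHF
import torch
import torch.nn as nn
import torch.nn.functional as F

# Stand-in reward head: maps a text embedding to a scalar reward.
reward_head = nn.Linear(768, 1)

# Dummy embeddings representing the two model outputs a human compared (the A/B label).
emb_chosen = torch.randn(1, 768)    # output the human preferred
emb_rejected = torch.randn(1, 768)  # output the human rejected

r_chosen = reward_head(emb_chosen)
r_rejected = reward_head(emb_rejected)

# Pairwise loss: push the preferred output's reward above the rejected one's.
loss = -F.logsigmoid(r_chosen - r_rejected).mean()
loss.backward()

# The trained reward model then provides the reward signal that reinforcement
# learning uses to nudge the language model towards human-preferred outputs.
```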
For this process, we need considerable (computational) resources, large amounts of curated data, and/or human feedback. This explains why it is already quite high on our escalation ladder, but it’s not the end yet.
What if we want Sam to play or do very specific musical things? For example, we want him to sing along — that would make Sam quite nervous (at least, that’s how this specific request made me feel, back in the day), because Sam has never been trained to sing and has never tried…
4. Custom Model Training: Breeding a New Expert
At the pinnacle of the Escalation Ladder we encounter custom model (pre-)training, where you essentially create a new expert from scratch, tailored to your exact requirements. This is also where my analogy might crumble (never said it was perfect!) — how do you breed a new piano player from scratch? But let’s stick to it anyway — let’s think about training Samantha, who has never played any music nor sung in her entire life. So we invest heavily in her education and skills, sending her to the top institutions where musicians learn what we want them to play.
Here we are nurturing a new language model from the ground up, instilling it with the knowledge and data necessary to perform in our particular domain. By carefully curating the training data and adjusting the model and its architecture, we can develop a highly specialized and optimized language model capable of tackling even the most proprietary tasks within your organization. In this process, the amount of training data and the number of parameters of current large language models can get quite staggering. For instance, rumours suggest that OpenAI’s GPT-4 has 1.76 trillion parameters. Hence, this approach often requires enormous resources and is beyond reach for many businesses today.
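To give a rough idea of what “from the ground up” means in code, here is a deliberately tiny sketch of pre-training a GPT-style model from a random initialization, using the Hugging Face transformers library. The configuration, the reuse of GPT-2’s tokenizer and the two toy sentences are my own illustrative assumptions; a real pre-training run involves billions of tokens, many GPUs and weeks of compute.

```python
# pip install torch transformers -- a deliberately tiny pre-training sketch
import torch
from transformers import GPT2Config, GPT2LMHeadModel, GPT2TokenizerFast

# A small GPT-style architecture, randomly initialized (no pre-trained weights).
config = GPT2Config(n_layer=4, n_head=4, n_embd=256)
model = GPT2LMHeadModel(config)

# Reusing GPT-2's tokenizer for convenience; a real project might train its own.
tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token

# Toy "corpus"; real pre-training uses vast amounts of curated domain text.
texts = [
    "Sam plays piano in the hotel bar every evening.",
    "Samantha studied composition and jazz performance for years.",
]
batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)

optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)

# One training step: the model learns to predict the next token in each sequence.
outputs = model(
    input_ids=batch["input_ids"],
    attention_mask=batch["attention_mask"],
    labels=batch["input_ids"],
)
outputs.loss.backward()
optimizer.step()
print(f"training loss: {outputs.loss.item():.3f}")
```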
Conclusion
Just like our journey from timidly asking Sam to play Dave Brubeck’s “Take Five” all the way to developing new talent: as we progress through each level of the Escalation Ladder, the effort and resources required increase significantly, but so does the level of customization and control we gain over the language model’s capabilities.
Of course, much like most frameworks, this one is not as clear-cut as I have presented it here. There can be hybrid or mixed approaches, and even the finest RAG implementation will still require you to do some proper prompting. However, by understanding and reminding yourself of this framework, I believe you can strategically determine the appropriate level of customization needed for your specific use cases. To unlock the full potential of Language AI, you will need to strike the right balance between effort and cost on the one hand and tailored performance on the other. The framework may also help bridge the communication gap between business and IT when it comes to Language AI model adaptation and implementation.
I hope you enjoyed meeting Sam and Samantha and adapting their abilities on the piano. I welcome you to comment, critique, or expand on what you think of this analogy in the comments below, or simply share this article with people who might benefit from it.
Notes and References:
This article has been inspired by this technical article on Retrieval Augmented Generation from Databricks.
All drawings are hand crafted with pride by the author 🙂