Open-Source Models, Temperature Scaling, Re-Ranking, and More: Don’t Miss Our Recent LLM Must-Reads
Feeling inspired to write your first TDS post? We’re always open to contributions from new authors.
New LLMs continue to arrive on the scene almost daily, and the tools and workflows they make possible proliferate even more quickly. We figured it was a good moment to take stock of some recent conversations on this ever-shifting terrain, and couldn’t think of a better way to do that than by highlighting some of our strongest articles from the past couple of weeks.
The lineup of posts we put together tackles both high-level questions and nitty-gritty problems, so whether you’re interested in AI ethics, the evolution of open-source technology, or innovative RAG approaches, we’re certain you’ll find something here to pique your interest. Let’s dive in.
- Shifting Tides: The Competitive Edge of Open Source LLMs over Closed Source LLMs
The initial wave of generative-AI tools was spearheaded by proprietary models like the ones released by OpenAI. Leonie Monigatti’s new article focuses on an emerging trend: the rise—and increasing dominance—of smaller open-source foundation models that are making a splash thanks to factors like data security, customizability, and cost.
- Chatbot Morality?
We know LLMs can produce hallucinations when asked for factual information; what happens when users start prompting them for ethics-focused advice? Eyal Aharoni and Eddy Nahmias present their latest research on this thorny question and the dangers inherent to the perception of morality in chatbots that “can imitate or synthesize human moral discourse in specific, controlled circumstances.”
- Can Recommendations from LLMs Be Manipulated to Enhance a Product’s Visibility?
E-commerce is an area that’s already susceptible to manipulation and questionable business practices. As Parul Pandey shows in her analysis of a recent paper, LLMs—with their power to produce text and other media rapidly and at scale—are already primed to exploit various loopholes and blind spots in this ecosystem.
- Temperature Scaling and Beam Search Text Generation in LLMs, for the ML-Adjacent
In a comprehensive, example-filled guide, Mike Cvet unpacks the concept of temperature in the context of generative-AI workflows: it’s a parameter that modifies the predictability of the model’s output sequences, and mastering its nuances can help practitioners use AI tools more effectively.
- How to Use Re-Ranking for Better LLM RAG Retrieval
After the initial excitement around retrieval-augmented generation, it became clear to many practitioners that RAG systems can usually benefit from more advanced refining methods. Dr. Leon Eversberg’s recent tutorial walks us through a workflow that leverages two-step retrieval (using open-source bi-encoders and cross-encoders) for better results.
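To make the temperature idea concrete, here is a minimal sketch of how sampling temperature reshapes a next-token distribution. The logit values are made up for illustration (real LLM vocabularies contain tens of thousands of tokens), but the mechanism is the standard one: logits are divided by the temperature before the softmax, so low temperatures sharpen the distribution and high temperatures flatten it.

```python
import math

def softmax_with_temperature(logits, temperature=1.0):
    """Scale logits by 1/temperature, then apply softmax.

    Lower temperature -> sharper distribution (more predictable output);
    higher temperature -> flatter distribution (more varied output).
    """
    scaled = [logit / temperature for logit in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token logits for three candidate tokens.
logits = [2.0, 1.0, 0.1]
cold = softmax_with_temperature(logits, temperature=0.5)  # peaked: top token dominates
hot = softmax_with_temperature(logits, temperature=2.0)   # flatter: more spread out
```

Sampling from `cold` almost always picks the top token, while sampling from `hot` gives the runners-up a real chance, which is exactly the predictability dial described above.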
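The two-step retrieval idea can also be sketched in a few lines. The scoring functions below are toy stand-ins (word overlap and phrase matching) for the open-source bi-encoder and cross-encoder models the tutorial actually uses; what matters is the pipeline shape: a fast first stage scores the whole corpus, and a slower, more accurate second stage re-ranks only a short candidate list.

```python
def bi_encoder_score(query: str, doc: str) -> float:
    # Toy stand-in for a bi-encoder: cheap Jaccard (set-overlap) similarity.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / max(len(q | d), 1)

def cross_encoder_score(query: str, doc: str) -> float:
    # Toy stand-in for a cross-encoder, which reads query and document
    # together: here it rewards exact query-bigram matches in the document.
    words = query.lower().split()
    bigrams = [" ".join(words[i:i + 2]) for i in range(len(words) - 1)]
    text = doc.lower()
    return sum(1.0 for b in bigrams if b in text)

def retrieve_and_rerank(query, corpus, first_stage_k=10, final_n=3):
    # Stage 1: score every document with the fast model; keep the top-k.
    candidates = sorted(corpus, key=lambda d: bi_encoder_score(query, d),
                        reverse=True)[:first_stage_k]
    # Stage 2: re-rank only those k candidates with the slower, more
    # accurate model, and return the final top-n.
    return sorted(candidates, key=lambda d: cross_encoder_score(query, d),
                  reverse=True)[:final_n]
```

In a real RAG system you would swap the stand-ins for embedding similarity and a trained cross-encoder, but the cost argument is the same: the expensive model only ever sees `first_stage_k` documents, not the whole corpus.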
As they always do, our authors branched out to many other topics in recent weeks, producing some top-notch articles; here’s a representative sample:
- Ending her excellent series on customer lifetime value on a high note, Katherine Munro offers a detailed overview of available predictive methods and what marketers and data scientists can expect from each.
- Every Sachin Date deep dive is a cause for celebration, and the latest is no exception: it’s a meticulous exploration of statistical convergence, told through the story of a 19th-century shipwreck.
- In her latest beginner-friendly guide, Srijanie Dey, PhD turns to Llama 3 and unpacks the nuances of its transformer architecture.
- Writing at the intersection of molecular biology, bioinformatics, and AI, Murto Hilali shows how he built a multi-classifier model that predicts the effects of mutations on protein interactions.
- If you’re considering a career transition from physics (and related fields) to data science, don’t miss Sara Nóbrega’s practical guide, based on her own journey and the lessons she’s learned along the way.
- For anyone taking their first steps in deep learning, Shreya Rao is back with a new, beginner-friendly, expertly illustrated primer on convolutional neural networks.
- The paper unveiling Kolmogorov-Arnold Networks (KANs) is barely two weeks old, but it’s already making major waves in the field. Theo Wolf’s debut TDS article helps us understand how KANs work and what the buzz is all about.
Thank you for supporting the work of our authors! We love publishing articles from new authors, so if you’ve recently written an interesting project walkthrough, tutorial, or theoretical reflection on any of our core topics, don’t hesitate to share it with us.
Until the next Variable,
TDS Team
Open-Source Models, Temperature Scaling, Re-Ranking, and More: Don’t Miss Our Latest LLM Must-Reads was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.