RAG Efficiency, Self-Learning Tips, the Business of AI, and Other January Must-Reads
By now we may have moved on from “Happy new year!” territory, but January’s jolt of energy and activity is still very much with us. We see it in the posts that have drawn the most readers and generated the liveliest conversations in recent weeks: they tend to focus on teaching oneself new skills, seeking out new opportunities, and gaining greater efficiency in established workflows.
Before we settle into the rhythm of a new month, let’s celebrate the most-read and most talked-about stories from the first few weeks of 2024. As you’ll see, most have a strong practical flavor—whether in implementing RAG or writing better-performing code, among other areas—so we hope you’re still feeling motivated to explore new topics and expand your data science and ML toolkit. Let’s dive in.
- How to Learn AI on Your Own (a Self-Study Guide)
For those of you who are curious about AI but haven’t had a chance to learn about it in a structured or formal way, Thu Vu’s self-guided roadmap, complete with recommended resources—our most popular article in January!—is one you shouldn’t miss.
- How I Became A Data Scientist — No CS Degree, No Bootcamp
Another hit among the driven, self-starter members of our community was Egor Howell’s personal account of his career path as a machine-learning-focused data professional; it offered many actionable insights for others who’d like to pursue a similar trajectory.
- Fine-Tune a Mistral-7b Model with Direct Preference Optimization
LLM optimization approaches continue to generate great interest among readers who are experimenting with cutting-edge workflows in their projects. Maxime Labonne has been among the ML professionals leading the charge in this area, including in his recent exploration of Direct Preference Optimization (DPO).
- How to Cut RAG Costs by 80% Using Prompt Compression
Retrieval-augmented generation probably needs no introduction at this point for anyone tinkering with LLMs. As Iulia Brezeanu shows in her recent article, though, there is still a lot of room for making this approach more cost-effective and sustainable for teams.
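To give a flavor of the core idea before you read the full article, here is a minimal, self-contained sketch of one simple form of prompt compression: pruning retrieved chunks with a crude relevance score before they reach the LLM. The `score` and `compress_context` helpers are our own illustrative inventions for this newsletter, not the more sophisticated, model-based method the article describes.

```python
# Minimal sketch of prompt compression for RAG: keep only the retrieved
# chunks most lexically similar to the query, so fewer tokens reach the LLM.
# The scoring heuristic and function names below are illustrative only.

def score(query: str, chunk: str) -> float:
    """Crude relevance score: fraction of query words present in the chunk."""
    q_words = set(query.lower().split())
    c_words = set(chunk.lower().split())
    return len(q_words & c_words) / max(len(q_words), 1)

def compress_context(query: str, chunks: list[str], budget_chars: int = 1000) -> str:
    """Greedily keep the highest-scoring chunks until the character budget is spent."""
    kept, used = [], 0
    for chunk in sorted(chunks, key=lambda c: score(query, c), reverse=True):
        if used + len(chunk) > budget_chars:
            continue
        kept.append(chunk)
        used += len(chunk)
    return "\n".join(kept)

retrieved = [
    "RAG pipelines retrieve documents and stuff them into the prompt.",
    "Unrelated trivia about the history of typewriters.",
    "Prompt compression drops low-relevance text to cut token costs.",
]
print(compress_context("how does prompt compression cut RAG costs", retrieved))
```

Even this toy version captures the trade-off at the heart of the article: every pruned chunk saves tokens (and money) but risks dropping context the model needs.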
- If you’re still taking your first steps in your work with large language models, Parul Pandey’s compilation of visual guides offers a patient and accessible introduction to the topic.
- To prevent your data project from running into memory-overflow issues, Siavash Yasini presents three useful tricks for writing more efficient classes in Python (for a taste of this kind of optimization, see the short sketch after this list).
- With the proliferation of sleek new visualization tools, Mike Clayton wonders whether Matplotlib should remain the top choice for data professionals generating static plots.
- How does the ChatGPT plugin perform, and what should ML practitioners keep in mind when using it? Livia Ellen shares her recent experiments and the insights she drew from them.
- As Barr Moses helpfully reminds us, “building a generative AI model that actually drives business value is hard” — but not impossible if you are aware of the most common pitfalls (and know how to avoid them).
- Ready to roll up your sleeves for some hands-on tinkering? Pye Sone Kyaw walks us through the process of running language and vision models on a Raspberry Pi.
- Don’t pack up your Raspberry Pi just yet—Dmitrii Eliuseev’s neat “weekend AI project” includes a full speech recognition workflow, executed entirely on the (very) compact computer.
- If you’re in a more theoretically minded mood these days, Stephanie Shen’s deep dive on Bayesian inference and its stakes for perception, reasoning, and decision-making will absolutely do the trick.
- Christopher Tao’s annotated roundup of the recent top 30 Python projects on GitHub offers a convenient window into the community’s collective mind as a new year begins to unfold.
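As promised above, here is a quick illustration of the kind of memory optimization Siavash Yasini’s piece is concerned with. This is a minimal sketch using `__slots__`, a classic trick for slimming down Python objects; we use it here purely as an example, and it may or may not be among the three tricks the article covers.

```python
import sys

class Point:
    """A regular class: each instance carries a dynamic __dict__."""
    def __init__(self, x: float, y: float):
        self.x, self.y = x, y

class SlottedPoint:
    """__slots__ replaces the per-instance __dict__ with fixed attribute storage."""
    __slots__ = ("x", "y")
    def __init__(self, x: float, y: float):
        self.x, self.y = x, y

p = Point(1.0, 2.0)
s = SlottedPoint(1.0, 2.0)

# A rough per-instance comparison: the regular object's footprint includes
# its attribute dict, while the slotted object has no __dict__ at all.
print("regular:", sys.getsizeof(p) + sys.getsizeof(p.__dict__))
print("slotted:", sys.getsizeof(s))
```

The savings look small per object, but they compound quickly when a pipeline creates millions of instances, which is exactly where memory-overflow issues tend to surface.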
Our latest cohort of new authors
Every month, we’re thrilled to see a fresh group of authors join TDS, each sharing their own unique voice, knowledge, and experience with our community. If you’re looking for new writers to explore and follow, just browse the work of our latest additions, including Omar Ali Sheikh, Brett A. Hurt, Zhaocheng Zhu, Mohamed Mamoun Berrada, Robert Dowd, Richard Tang, Theo Wolf, Han HELOIR, Ph.D. ☕️, Rhys cook, Andrew Lucas, Shafik Quoraishee, Karla Hernández, Omer Ansari, Tim Forster, Andrew Bowell, Harry Lu, Pye Sone Kyaw, Najib Sharifi, Josep Ferrer, Rohan Paithankar, Arne Rustad, Ian Stebbins, Thi-Lam-Thuy LE, Jan Jezabek, Ph.D., Raluca Diaconu, Tiffany Bogich, Ryu Sonoda, Yann-Aël Le Borgne, Aminata Kaba, Lorena Gongang, Yanli Liu, and Martina Ivaničová, among others.
Thank you for supporting the work of our authors! If you’re feeling inspired to join their ranks, why not write your first post? We’d love to read it.
Until the next Variable,
TDS Team