Agent Ecosystems, Data Integration, Open Source LLMs, and Other November Must-Reads
Feeling inspired to write your first TDS post? We’re always open to contributions from new authors.
Welcome to the penultimate monthly recap of 2024 — could we really be this close to the end of the year?! We’re sure that you, like us, are hard at work tying up loose ends and making a final push on your various projects. We have quite a few of those on our end, and one exciting update we’re thrilled to share with our community already is that TDS is now active on Bluesky. If you’re one of the many recent arrivals to the platform (or have been thinking about taking the plunge), we encourage you to follow our account.
What else is on our mind? All the fantastic articles our authors have published in recent weeks, inspiring our readers to learn new skills and explore emerging topics in data science and AI. Our monthly highlights cover a lot of ground—as they usually do—and provide multiple accessible entryways into timely technical topics, from knowledge graphs to RAG evaluation. Let’s dive in.
Monthly Highlights
- Agentic Mesh: The Future of Generative AI-Enabled Autonomous Agent Ecosystems
What will it take for autonomous agents to find each other, collaborate, interact, and transact in a safe, efficient, and trusted fashion? Eric Broda presents his exciting vision for the agentic mesh, a framework that will act as the seamless connective tissue for AI agents.
- Building Knowledge Graphs with LLM Graph Transformer
For anyone in the mood for a hands-on deep dive, Tomaz Bratanic’s latest technical guide walks us through the nitty-gritty details of LangChain’s implementation of graph construction with LLMs.
- Why ETL-Zero? Understanding the shift in Data Integration
“Instead of requiring the explicit extraction, transformation and loading of data in separate steps, as is traditionally the case, data should flow seamlessly between different systems.” Sarah Lea introduces a novel approach for creating a simplified ETL process with Python.
- Economics of Hosting Open Source LLMs
As LLM usage has skyrocketed in the past year or so, practitioners have increasingly asked themselves what the most efficient way to deploy these models might be. Ida Silfverskiöld offers a detailed breakdown of the various factors to consider and how different providers stack up when it comes to processing time, cold start delays, and CPU, memory, and GPU costs.
- How I Improved My Productivity as a Data Scientist with Two Small Habits
Sometimes, minor changes to your daily routine can have as much of an impact as a total workflow overhaul. Case in point: the new post by Philippe Ostiguy, M. Sc., where we learn about two seemingly non-work-related habits around rest and mental strength that have given Philippe’s productivity a major boost.
- A 6-Month Detailed Plan to Build Your Junior Data Science Portfolio
Whether you’re a freshly minted data scientist or a more seasoned professional looking for a new role, Sabrine Bendimerad’s blueprint for crafting a successful portfolio will give you concrete ideas and a realistic timeline for getting the job done.
- How to Reduce Python Runtime for Demanding Tasks
Everyone wants their code to run faster, but hitting a plateau is all but inevitable when dealing with particularly heavy workloads. Still, as Jiayan Yin shows in her highly actionable post, there may be GPU optimization options you haven’t yet taken advantage of to speed up your Python code.
- How to Create a RAG Evaluation Dataset From Documents
As Dr. Leon Eversberg explains in his recent tutorial, “by uploading PDF files and storing them in a vector database, we can retrieve this knowledge via a vector similarity search and then insert the retrieved text into the LLM prompt as additional context.” The result? A robust approach for evaluating RAG workflows and a lower chance of hallucinations.
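For readers who want a feel for the retrieval step quoted in the last highlight, here is a minimal Python sketch. It is not taken from Dr. Eversberg’s tutorial: the word-count "embedding" is a toy stand-in for a real embedding model, the in-memory list stands in for a vector database, and every function name and sample sentence is hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase word counts (assumed stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, store, k=1):
    """Return the k stored chunks most similar to the question."""
    query_vector = embed(question)
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question, context_chunks):
    """Insert the retrieved text into the prompt as additional context."""
    context = "\n".join(context_chunks)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


# Hypothetical document chunks; a list of (text, vector) pairs plays the
# role of the vector database.
chunks = [
    "the warranty covers battery defects for two years",
    "shipping takes five business days within europe",
]
store = [(chunk, embed(chunk)) for chunk in chunks]

question = "how long does the warranty cover the battery"
top = retrieve(question, store, k=1)
prompt = build_prompt(question, top)
```

In a real workflow, `embed` would call an embedding model and `store` would be a vector database, but the shape of the flow (embed, search by similarity, paste into the prompt) stays the same.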
Our latest cohort of new authors
Every month, we’re thrilled to see a fresh group of authors join TDS, each sharing their own unique voice, knowledge, and experience with our community. If you’re looking for new writers to explore and follow, just browse the work of our latest additions, including Jessica S, Tanner McRae, Ed Sandoval, Robert Corwin, Eric Colson, Joseph Ben, Marcus K. Elwin, Ro Isachenko, Michael Zakhary, Haim Barad, Elisa Yao, Mohamad Hamza, Eric Silberstein, Lorenzo Mezzini, David Teather, Diego Penilla, Daniel Klitzke, Iheb Rachdi, Aaron Beckley, Andrea Rosales, Bohumir Buso, Loizos Loizou, Omri Eliyahu Levy, Ohad Eytan, Julián Peller, Yan Georget, James Barney, Dima Sergeev, Pere Martra, and Gizem Kaya, among others.
Thank you for supporting the work of our authors! We love publishing articles from new authors, so if you’ve recently written an interesting project walkthrough, tutorial, or theoretical reflection on any of our core topics, don’t hesitate to share it with us.
Until the next Variable,
TDS Team
Autonomous Agent Ecosystems, Data Integration, Open Source LLMs, and Other November Must-Reads was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.