If you’re a regular reader of the Variable, you might have noticed that we stress—every week—that TDS is always open to contributions from new authors. And we mean it! Some of you might have seen this message and thought something along the lines of “great, I’d love to write an article!” but then wondered what kinds of posts would be a good fit, what topics our readers are interested in, and what types of experiences and skill sets are welcome.
This week’s Variable edition highlights some of our best recent articles. If you have no desire to become a TDS author, that’s totally fine; we hope you enjoy the reading as always. We’ve focused exclusively on posts by our most recent cohort of authors, though, in the hope that their work inspires you to give writing a try, too.
As you’ll see, TDS contributors come to us with a wide range of experience levels (from early learners to PhDs and industry veterans), interests, and writing styles. What unites them are strong storytelling skills and a desire to share their knowledge with a broader community. We hope (and are fairly certain) you’ll enjoy our weekly lineup.
- What Do Large Language Models “Understand”?
“When we attribute human-like abilities to LLMs, we fall into an anthropomorphic bias by likening their capabilities to our own. But are we also showing an anthropocentric bias by failing to recognize the capabilities that LLMs consistently demonstrate?” In one of the most thought-provoking articles we’ve read recently, Tarik Dzekman tackles the question of LLMs’ capacity to understand language, looking at the topic through a philosophy- and psychology-informed lens.
- Integrating LLM Agents with LangChain into VICA
“Our goal is to say goodbye to the robotic and awkward form-like experience within a chatbot, and say hello to personalized conversations with human-like assistance.” Ng Wei Cheng and Nicole Ren share practical insights and lessons learned from their extensive work on Singapore’s GovTech Virtual Intelligent Chat Assistant (VICA) platform.
- Text Vectorization Demystified: Transforming Language into Data
“For those of us who are aware of the machine learning pipeline in general, we understand that feature engineering is a very crucial step in generating good results from the model. The same concept applies in NLP as well.” Lakshmi Narayanan offers a thorough overview of text-vectorization approaches and weighs their respective advantages and limitations.
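To give you a taste of the territory the article covers, here’s a minimal sketch of one common vectorization approach, TF-IDF via scikit-learn (our own toy illustration, not code from the post):

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# A toy corpus: each document becomes one row of the TF-IDF matrix.
corpus = [
    "feature engineering is crucial in machine learning",
    "text vectorization turns language into numeric features",
]

vectorizer = TfidfVectorizer()        # default word-level tokenization
X = vectorizer.fit_transform(corpus)  # sparse matrix of shape (2, n_terms)

print(X.shape)
print(vectorizer.get_feature_names_out())
```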
- Leveraging Gemini-1.5-Pro-Latest for Smarter Eating
“It is worth noting here that with advancements in the world of AI, it is incumbent on data scientists to gradually shift from traditional deep learning to generative AI techniques in order to revolutionize their role.” Mary Ara presents an end-to-end project walkthrough that demonstrates how to do precisely that—in this case, through the creation of a calorie-tracking app that leverages a cutting-edge multimodal model.
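If you’d like a sense of what calling a multimodal Gemini model involves before reading the full walkthrough, here’s a bare-bones sketch using the google-generativeai SDK (our own illustration, not code from the article; the API key, prompt, and image path are placeholders):

```python
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # placeholder

# A multimodal request: a photo of a meal plus a text instruction.
model = genai.GenerativeModel("gemini-1.5-pro-latest")
meal_photo = Image.open("meal.jpg")  # placeholder image path

response = model.generate_content(
    ["Estimate the total calories in this meal and list the main items.",
     meal_photo]
)
print(response.text)
```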
- The Most Useful Advanced SQL Techniques to Succeed in the Tech Industry
“Although mastering basic and intermediate SQL is relatively easy, achieving mastery of this tool and wielding it adeptly in diverse scenarios is sometimes challenging.” Jiayan Yin aims to help data analysts and other practitioners bridge that skill gap with a comprehensive overview of the more advanced SQL techniques you should add to your querying toolkit.
- Fine-Tune the Audio Spectrogram Transformer with Hugging Face Transformers
“This process adapts the model’s capabilities to the unique characteristics of our dataset, such as classes and data distribution, ensuring the relevance of the results.” Writing at the intersection of machine learning and audio data, Marius Steger outlines a detailed workflow for fine-tuning the Audio Spectrogram Transformer (AST) on any audio-classification dataset.
- Algorithm-Agnostic Model Building with MLflow
“Consider this scenario: we have an sklearn model currently deployed in production for a particular use case. Later on, we find that a deep learning model performs even better. If the sklearn model was deployed in its native format, transitioning to the deep learning model could be a hassle because the two model artifacts are very different.” Mena Wang, PhD explains why it can sometimes make a lot of sense to work with algorithm-agnostic models—and shows how to get started in MLflow.
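The key enabler here is MLflow’s framework-agnostic pyfunc flavor: wrap whichever model you’re serving behind a uniform predict interface, and the deployment contract stays the same even if you later swap sklearn for a deep learning model. A minimal sketch of the pattern (our own illustration, not code from the article):

```python
import mlflow
from sklearn.linear_model import LogisticRegression


class AgnosticModel(mlflow.pyfunc.PythonModel):
    """Wraps any underlying model behind MLflow's uniform pyfunc interface."""

    def __init__(self, model):
        self.model = model

    def predict(self, context, model_input):
        # Consumers only ever see this signature, so the underlying
        # model (sklearn, deep learning, ...) can be swapped without
        # changing how the artifact is loaded or served.
        return self.model.predict(model_input)


sk_model = LogisticRegression().fit([[0.0], [1.0], [2.0]], [0, 1, 1])

with mlflow.start_run():
    mlflow.pyfunc.log_model(
        artifact_path="model",
        python_model=AgnosticModel(sk_model),
    )
```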
- A Fresh Look at Nonlinearity in Deep Learning
“But why do we need activation functions in the first place, specifically nonlinear activation functions? There’s a traditional reasoning, and also a new way to look at it.” Harys Dalvi unpacks the stakes of using a linear layer for the output of deep learning classifiers and the value we can gain by interpreting the consequences of linearity and nonlinearity in multiple ways.
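The traditional reasoning is easy to verify numerically: without a nonlinearity in between, stacked linear layers collapse into a single linear map. A quick check (our own sketch, not from the article):

```python
import numpy as np

rng = np.random.default_rng(0)

# Two "layers" as weight matrices (biases omitted for brevity).
W1 = rng.standard_normal((4, 3))
W2 = rng.standard_normal((2, 4))
x = rng.standard_normal(3)

# Stacking linear layers without an activation...
deep_linear = W2 @ (W1 @ x)

# ...is exactly equivalent to one layer with weights W2 @ W1.
single_layer = (W2 @ W1) @ x
print(np.allclose(deep_linear, single_layer))  # True

# A nonlinearity such as ReLU breaks that equivalence:
relu = lambda z: np.maximum(z, 0.0)
print(np.allclose(W2 @ relu(W1 @ x), single_layer))  # False (in general)
```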
Thank you for supporting the work of our authors! As we mentioned above, we love publishing articles from new authors, so if you’ve recently written an interesting project walkthrough, tutorial, or theoretical reflection on any of our core topics, don’t hesitate to share it with us.
Until the next Variable,
TDS Team