Feeling inspired to write your first TDS post? We’re always open to contributions from new authors.
When we think about problem-solving, our focus tends to be on the solving part: the powerful hack, a new magical tool, a few lines of code that make everything click into place. In reality, a lot has to happen for these final touches to work—from developing a solid understanding of what the problem actually is, to sketching out a workable process that ensures we find consistent success rather than just a temporary band-aid.
This week’s highlights stand out for their holistic approach to finding effective solutions to occasionally thorny challenges. They offer a glimpse into practitioners’ mindset as they explore their available resources (data, tools, and time, to name a few) and weigh the pros and cons of different workflows. We think they might just inspire you to view whatever project you’re working on at the moment from a new perspective. Enjoy your reading!
- Algorithmic Thinking for Data Scientists
For a thorough introduction to the benefits of algorithmic thinking (which entails “combining rigorous logic and creativity to frame, solve, and analyze problems, usually with the help of a computer”), don’t miss Chinmay Kakatkar’s excellent article. The focus is on writing efficient code, but you could apply the principles laid out here across a wide range of use cases.
- The Ultimate Guide to Finding Outliers in Your Time-Series Data (Part 1)
Detecting patterns and weeding out anomalies in your dataset remains an essential task for data scientists. Sara Nóbrega’s new guide is a broad, actionable resource that outlines several powerful techniques and zooms in on how you should choose the right one for the project you’re working on.
- Jet Sweep: Route Optimization to Visit Every NFL Team at Home
The traveling salesman problem is a classic optimization challenge; Sejal Dua presents an engaging walkthrough of its theoretical complexity, and introduces a few twists: we’re looking at NFL stadiums instead of sales routes, and using linear programming and geospatial data to generate the best possible itinerary to visit all of them.
- Solving a Resource-Planning Problem with Mathematical Programming and Column Generation
For his debut TDS article, Luis Fernando PÉREZ ARMAS, Ph.D. takes a stab at another famous optimization problem, the minimum vertex coloring problem (also known as the graph coloring problem), and dives deep into its real-world applications before showing how to solve it using mathematical programming and column generation.
- Data Disruptions to Elevate Entity Embeddings
“When categorical features have a lot of possible levels (‘high cardinality’), both modeling and analytics become tricky.” Valerie Carey starts her accessible explainer with a close look at entity embeddings as a potential answer to the challenge of high-cardinality features, and goes on to propose a stochastic regularization method to improve their generalizability in neural network models.
- Exploring RAG Applications Across Languages: Conversing with the Mishnah
There are, by now, many well-established workflows for building effective RAG systems; Shlomo Tannor ups the ante in his hands-on tutorial (another TDS debut!) by demonstrating how he built a multilingual app that allows English-speaking users to obtain information from the Mishnah, an ancient Rabbinic text originally written in Hebrew.
Looking for recommended reads on other topics? We hope so—here are some of our recent favorites:
- If you’re interested in the particular quirks of fine-tuning smaller transformer models, Ida Silfverskiöld’s project walkthrough offers a detailed overview.
- In a new series, Subarna Tripathi explores the emerging field of long-form visual understanding; part one focuses on representing video as graphs and on leveraging graph neural networks for downstream applications.
- How do multimodal image-text models perform image classification, image retrieval, and image captioning? Wei Yi’s beginner-friendly deep dive offers an illuminating look under these models’ hood.
- Taking several steps back from day-to-day implementation, Dusko Pavlovic invites us to reflect on the theoretical underpinnings of learning—and how they facilitate the rise of learning machines.
- Data science roles come with their own particular stressors and bottlenecks. Zijing Zhu, PhD shares helpful pointers on how to tackle them successfully—and become better data scientists along the way.
- If you’re new to reinforcement learning and would like to learn about this topic from the ground up, we highly recommend Angjelin Hila’s comprehensive primer.
Thank you for supporting the work of our authors! We love publishing articles from new authors, so if you’ve recently written an interesting project walkthrough, tutorial, or theoretical reflection on any of our core topics, don’t hesitate to share it with us.
Until the next Variable,
TDS Team
Multilingual RAG, Algorithmic Thinking, Outlier Detection, and Other Problem-Solving Highlights was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.