With May drawing to a close and summer right around the corner for those of us in the Northern Hemisphere, it’s time once again to look back at the standout articles we’ve published in the past month: those stories that resonated the most with learners and practitioners across a wide swath of data science and machine learning disciplines.
We were delighted to see a particularly eclectic lineup of posts strike a chord with our readers. It’s a testament to the diverse interests and experiences that TDS authors bring to the table, as well as to the increasing demand for well-rounded data professionals who can write clean code, stay up-to-date with the latest LLMs, and—while they’re at it—know how to tell a good story about (and through) their projects. Let’s dive right in.
Monthly Highlights
- Python One Billion Row Challenge — From 10 Minutes to 4 Seconds
Given Python's longstanding reputation for slowness, you'd think it wouldn't stand a chance at doing well in the popular "one billion row" challenge. Dario Radečić's viral post shows that with some flexibility and outside-the-box thinking, you can still squeeze impressive time savings out of your code.
- N-BEATS — The First Interpretable Deep Learning Model That Worked for Time Series Forecasting
Anyone who enjoys a thorough look into a model's inner workings should bookmark Jonte Dancker's excellent explainer on N-BEATS, the "first pure deep learning approach that outperformed well-established statistical approaches" for time-series forecasting tasks.
- Build a Data Science Portfolio Website with ChatGPT: Complete Tutorial
In a competitive job market, data scientists can’t afford to be coy about their achievements and expertise. A portfolio website can be a powerful way to showcase both, and Natassha Selvaraj’s patient guide demonstrates how you can build one from scratch with the help of generative-AI tools.
- A Complete Guide to BERT with Code
Why not take a step back from the latest buzzy model to learn about those precursors that made today's innovations possible? Bradney Smith invites us to go all the way back to 2018 (or several decades ago, in AI time) to gain a deep understanding of the groundbreaking BERT (Bidirectional Encoder Representations from Transformers) model.
- Why LLMs Are Not Good for Coding — Part II
Back in the present day, we keep hearing about the imminent obsolescence of programmers as LLMs continue to improve. Andrea Valenzuela's latest article serves as a helpful "not so fast!" interjection, focusing on LLMs' inherent limitations when it comes to staying up-to-date with the latest libraries and code functionalities.
- PCA & K-Means for Traffic Data in Python
What better way to round out our monthly selection than with a hands-on tutorial on a core data science workflow? In her debut TDS post, Beth Ou Yang walks us through a real-world example—traffic data from Taiwan, in this case—of using principal component analysis (PCA) and K-means clustering.
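As a rough illustration of the workflow that tutorial covers—reducing dimensionality with PCA before clustering with K-means—here is a minimal scikit-learn sketch. It uses synthetic data as a stand-in, since the Taiwan traffic dataset from the original post isn't reproduced here:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

# Synthetic stand-in for a traffic dataset: 500 observations, 10 features
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 10))

# Standardize first so PCA isn't dominated by high-variance features
X_scaled = StandardScaler().fit_transform(X)

# Project onto the first two principal components
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X_scaled)

# Cluster the reduced data into three groups
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
labels = kmeans.fit_predict(X_reduced)

print(X_reduced.shape)  # (500, 2)
print(len(set(labels))) # 3
```

The number of components and clusters here are arbitrary choices for the sketch; in practice you'd pick them by inspecting explained variance and a metric such as the silhouette score.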
KANs in the Spotlight
If we had to name the topic that created the biggest splash in recent weeks, KANs (Kolmogorov-Arnold Networks) would be an easy choice. Here are three excellent resources to help you get acquainted with this new type of neural network, introduced in a widely circulated paper.
- Kolmogorov-Arnold Networks: The Latest Advance in Neural Networks, Simply Explained
For a clear and accessible primer on KANs, you can't do better than Theo Wolf's easy-to-follow post.
- Kolmogorov-Arnold Networks (KANs) for Time Series Forecasting
Looking at KANs from the perspective of a more specialized use case, Marco Peixeiro shows how they can be applied in the context of time series forecasting.
- Understanding Kolmogorov–Arnold Networks (KAN)
Finally, for a more complete (but still reader-friendly) paper walkthrough, look no further than Hesam Sheikh’s debut TDS article.
Our Latest Cohort of New Authors
Every month, we’re thrilled to see a fresh group of authors join TDS, each sharing their own unique voice, knowledge, and experience with our community. If you’re looking for new writers to explore and follow, just browse the work of our latest additions, including Eyal Aharoni and Eddy Nahmias, Hesam Sheikh, Michał Marcińczuk, Ph.D., Alexander Barriga, Sasha Korovkina, Adam Beaudet, Gurman Dhaliwal, Ankur Manikandan, Konstantin Vasilev, Nathan Reitinger, Mandy Liu, Beth Ou Yang, Maicol Nicolini, Alex Shpurov, Geremie Yeo, W Brett Kennedy, Rômulo Pauliv, Ananya Bajaj, 林育任 (Yu-Jen Lin), Sumit Makashir, Subarna Tripathi, Yu-Cheng Tsai, Nika, Bradney Smith, Katia Gil Guzman, Miguel Dias, PhD, Bào Bùi, Baptiste Lefort, Sheref Nasereldin, Ph.D., Marcus Sena, Atisha Rajpurohit, Jonathan Bennion, Dunith Danushka, Bernd Wessely, Barna Lipics, Henam Singla, Varun Joshi and Gauri Kamat, and Yu Dong.
Thank you for supporting the work of our authors! We love publishing articles from new authors, so if you’ve recently written an interesting project walkthrough, tutorial, or theoretical reflection on any of our core topics, don’t hesitate to share it with us.
Until the next Variable,
TDS Team
Data Science Portfolios, Speeding Up Python, KANs, and Other May Must-Reads was originally published in Towards Data Science on Medium.