You might say 2023 was an eventful year for data scientists and ML professionals, but that wouldn’t quite capture the amount of hectic activity we’ve seen in the field in the past 12 months.
As much as we always aim to resist hype and hyperbole, we have to concede that yes, we’ve seen some dramatic changes in the way both practitioners and society at large view AI and its effects on our daily lives. The launch of ChatGPT in the final weeks of 2022 was far from the only factor in this transition, but it’s difficult to deny its role as both catalyst and symbolic focal point.
When we considered how we could take stock of our authors’ best and most popular work in 2023, then, turning to articles on LLMs in general—and on that one ubiquitous chatbot in particular—became a very natural choice. The selection of articles we present here is by no means exhaustive, but it does offer a representative sample of the articles that you, our readers, responded to the most—whether they’re the ones you couldn’t stop reading and sharing, or those that generated the most thoughtful conversations on TDS and beyond.
Before we dive into the articles that made the biggest splash in the past year, we’d like to take a moment to thank our entire community for your support. We owe a special debt of gratitude to our incredible authors, to our partners at Medium, to a dedicated group of volunteer Editorial Associates who generously offer us their expertise, and to our two former colleagues and extraordinary editors, Caitlin Kindig and Katherine Prairie.
- How ChatGPT Works: The Model Behind The Bot
In the least shocking development ever, Molly Ruby’s accessible and informative explainer became our most popular post of 2023. If you haven’t read it already, it’s not too late to catch up!
- Closed AI Models Make Bad Baselines
What direction will NLP research take in a post-ChatGPT world? Anna Rogers examines the current state of a rapidly changing field.
- Can ChatGPT Write Better SQL than a Data Analyst?
While the extent to which LLMs pose a threat to entire professions remains to be seen, Marie Truong took the time to survey ChatGPT’s coding skills soon after its launch.
- GPT Is an Unreliable Information Store
In a prescient look at AI hallucinations, Noble Ackerson zoomed in on the emerging risks of using LLMs as if they were reliable search engines.
- How to Convert Any Text Into a Graph of Concepts
Exploring the realm of new possibilities in NLP thanks to LLMs, Rahul Nayak offers a hands-on approach to converting a text corpus into a knowledge graph.
- Not All Rainbows and Sunshine: The Darker Side of ChatGPT
From baked-in bias to issues around privacy and plagiarism, Mary Reagan PhD unpacked some of the major risks that have emerged in the wake of ChatGPT’s ascendancy.
- Zero-ETL, ChatGPT, and the Future of Data Engineering
How will ChatGPT and similar tools affect day-to-day data-engineering workflows? Barr Moses shared insights on the future of the “post-modern data stack.”
- All You Need to Know to Build Your First LLM App
2023 was the year in which the process of building LLM-powered apps became meaningfully democratized, thanks in no small part to contributions like Dominik Polzer’s widely shared tutorial.
- GPT-4 vs. ChatGPT: An Exploration of Training, Performance, Capabilities, and Limitations
Months after releasing ChatGPT, OpenAI upped the ante with its latest model, GPT-4, and Mary Newhauser was quick to provide a thorough comparison of the two products.
- TimeGPT: The First Foundation Model for Time Series Forecasting
As the year progressed, we encountered more and more LLM solutions targeting specific use cases. Marco Peixeiro wrote a great explainer on TimeGPT, one such example of a customized foundation model.
- Mastering Customer Segmentation with LLM
The practical use cases for LLMs and the products they support continue to grow every day; Damian Gil outlined a promising direction for marketers and business strategists.
- Getting Started with LangChain: A Beginner’s Guide to Building LLM-Powered Applications
Alongside ChatGPT, LangChain emerged as a popular tool for builders of LLM-based products; Leonie Monigatti wrote the go-to resource for anyone interested in tinkering with it.
- New ChatGPT Prompt Engineering Technique: Program Simulation
Translating our needs and goals into language LLMs can decipher correctly remains a challenge. Giuseppe Scalamogna unveiled an innovative framework for more effective prompt design.
- How GPT Models Work
For a thorough and accessible primer on the math and theory behind GPT models, Beatriz Stollnitz’s deep dive remains a top-notch choice for beginners and more seasoned practitioners alike.
- How to Build an LLM from Scratch
If you prefer a more hands-on approach to your learning, Shawhin Talebi’s tutorial on building LLMs will take you from data curation to model evaluation—it’s worth exploring even if you’re not planning on creating the next Llama or Falcon model in your home office!
- RAG vs Finetuning — Which Is the Best Tool to Boost Your LLM Application?
As we learned about the limitations of pre-trained models, new approaches emerged for boosting their performance. Heiko Hotz delivered a useful comparison of the two leading options: fine-tuning and retrieval-augmented generation (RAG).
- Running Llama 2 on CPU Inference Locally for Document Q&A
The ability to “speak” with our own text documents, PDFs, and audio recordings has become a popular everyday use case for LLMs. Kenneth Leung’s step-by-step guide showed how we can create such a workflow on a local machine.
- A Gentle Intro to Chaining LLMs, Agents, and utils via LangChain
For anyone taking their first steps working with LLMs, Dr. Varshita Sher’s helpful and comprehensive tutorial on the building blocks of LangChain is an essential read.
- Large Language Models in Molecular Biology
Exploring LLMs’ potential to transform scientific research, Serafim Batzoglou’s eye-opening deep dive focused on their impact within molecular biology, with applications ranging from gene-structure prediction to pharmaceutical discovery.
Stay tuned! Throughout 2023 we’ve published countless excellent articles on a wide range of topics that go far beyond LLMs and ChatGPT. Next week, we’ll devote our final Variable edition of the year to standout posts on data science and programming skills, career paths, and special projects.
Thank you, once again, for supporting the work of our authors throughout 2023! If you’ve enjoyed the articles you read on TDS, consider becoming a Friend of Medium member: it’s a new membership level that offers your favorite authors bigger rewards for their top-notch writing.
Until the next Variable,
TDS Editors