What’s New in Computer Vision and Object Detection?

Feeling inspired to write your first TDS post? We’re always open to contributions from new authors.

Before we get into this week’s selection of stellar articles, we’d like to take a moment to thank all our readers, authors, and members of our broader community for helping us reach a major milestone: our follower count on Medium just reached…

We couldn’t be more thrilled, or more grateful to everyone who has supported us in making TDS the thriving, learning-focused publication it is. Here’s to more growth and exploration in the future!

Back to our regular business: we’ve chosen three recent articles as our highlights this week, focused on cutting-edge tools and approaches from the ever-exciting fields of computer vision and object detection. As multimodal models grow their footprint and use cases like autonomous driving, healthcare, and agriculture go mainstream, it’s never been more crucial for data and ML practitioners to stay up to speed with the latest developments. (If you’re more interested in other topics at the moment, we’ve got you covered! Scroll down for a handful of carefully picked recommendations on neuroscience, music and AI, environmentally conscious ML workflows, and more.)

Mastering Object Counting in Videos
Accurate object detection in videos comes with a host of new challenges when compared to the same process in static images. Lihi Gur Arie, PhD presents a clear and concise tutorial that shows how you can still accomplish it, and uses the fun example of counting moving ants on a tree to make her case.
Spicing Up Ice Hockey with AI: Player Tracking with Computer Vision
For anyone looking for a thorough and engaging project walkthrough, we strongly recommend Raul Vizcarra Chirinos’ writeup of his recent attempt to build a hockey-player tracker from (more or less) scratch. Using PyTorch, computer vision techniques, and a convolutional neural network (CNN), Raul developed a prototype that can follow players and collect basic performance statistics.
A Crash Course of Planning for Perception Engineers in Autonomous Driving
While we might still be years away from self-driving cars dominating our roads, researchers and industry players have made significant progress in recent years. Practitioners who’d like to expand their knowledge of planning and decision-making in the context of autonomous driving shouldn’t miss Patrick Langechuan Liu’s comprehensive “crash course” on the topic.

As promised, here are our recommended reads on other themes, questions, and challenges we thought you might enjoy exploring:

For his debut TDS article, Jonathan R. Williford, PhD draws fascinating connections between current work on multimodal transformers and the way our brains process visual information.
From being overly defensive about your weaknesses to not fully owning your projects, Mandy Liu reflects on the mistakes she’s made as a junior data scientist, and shares actionable advice for others who are just starting out.
Why is tracking so important in machine learning projects, and how should you go about implementing it effectively? Chayma Zatout has the answers in her MLOps primer.
If you’re curious to learn about a new, cutting-edge prompting framework, don’t miss Anand Subramanian’s thorough and practical introduction to Medprompt.
In her latest LLM-focused tutorial, Yanli Liu covers an alignment-optimization approach that combines two novel techniques: representation fine-tuning and ORPO (Odds Ratio Preference Optimization).
The growing environmental footprint of ML models is a timely and crucial topic; Sydney Nye presents a pragmatic guide that centers sustainable practices for model training and serving.
Working at the intersection of music analysis and AI, Emmanouil Karystinaios walks us through his research on perception-inspired graph convolution for music-understanding tasks.
Hesam Sheikh goes beyond the hype agentic-AI systems have generated, and offers a detailed, hands-on tutorial on building a team of AI agents to refine and customize job-application materials.
What happens when the prompt you give an LLM contains contradictory instructions? Yennie Jun explores this interesting conundrum in her experiments with AI “cognitive dissonance.”

Thank you for supporting the work of our authors! We love publishing articles from new authors, so if you’ve recently written an interesting project walkthrough, tutorial, or theoretical reflection on any of our core topics, don’t hesitate to share it with us.

Until the next Variable,

TDS Team

What’s New in Computer Vision and Object Detection? was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.
