Category: Technology

Netflix lawsuit sues VMware over virtual machine patents

The streaming giant says VMware is in violation of five patents, and demands to be paid.

Go Here to Read this Fast! Netflix lawsuit sues VMware over virtual machine patents

Originally appeared here:
Netflix lawsuit sues VMware over virtual machine patents

December 24, 2024
Let’s Learn a Little About Computer Vision via Sudoku

Brian Roepke

Solving Sudoku is a fun challenge for coding, and adding computer vision to populate the puzzle ties this with a popular ML technique

Continue reading on Towards Data Science »

Originally appeared here:
Let’s Learn a Little About Computer Vision via Sudoku

Go Here to Read this Fast! Let’s Learn a Little About Computer Vision via Sudoku

December 24, 2024
Mono to Stereo: How AI Is Breathing New Life into Music
Max Hilsdorf
Applications and techniques for AI mono-to-stereo upmixing

Image generated with DALL-E 3.

Mono recordings are a snapshot of history, but they lack the spatial richness that makes music feel truly alive. With AI, we can artificially transform mono recordings to stereo or even remix existing stereo recordings. In this article, we explore the practical use cases and methods for mono-to-stereo upmixing.

Mono and Stereo in the physical and digital world

Photo by J on Unsplash

When an orchestra plays live, sound waves travel from different instruments through the room and to your ears. This causes differences in timing (when the sound reaches your ear) and loudness (how loud the sound appears in each ear). Through this process, a musical performance becomes more than harmony, timbre, and rhythm. Each instrument sends spatial information, immersing the listener in a “here and now” experience that grips their attention and emotions.

Listen to the difference between the first snippet (no spatial information), and the second snippet (clear differences between left and right ear):

Headphones are strongly recommended throughout the article, but are not strictly necessary.

Exampe: Mono

<a href="https://medium.com/media/f484d5274ef70332f5920662eff2d556/href">https://medium.com/media/f484d5274ef70332f5920662eff2d556/href</a>

Example: Stereo

<a href="https://medium.com/media/cd587700468d0c34ec93986671b77d6f/href">https://medium.com/media/cd587700468d0c34ec93986671b77d6f/href</a>

As you can hear, the spatial information conveyed through a recording has a strong influence on the liveliness and excitement we perceive as listeners.

In digital audio, the most common formats are mono and stereo. A mono recording consists of only one audio signal that sounds exactly the same on both sides of your headphone earpieces (let’s call them channels). A stereo recording consists of two separate signals that are panned fully to the left and right channels, respectively.

Example of a stereo waveform consisting of two channels. Image by the author.

Now that we have experienced how stereo sound makes the listening experience much more lively and engaging and we also understand the key terminologies, we can delve deeper into what we are here for: The role of AI in mono-to-stereo conversion, also known as mono-to-stereo upmixing.

Use Cases for Mono-to-Stereo Upmixing

AI is not an end in itself. To justify the development and use of such advanced technology, we need practical use cases. The two primary use cases for mono-to-stereo upmixing are

1. Enriching existing music in mono format to a stereo experience.

Although stereo recording technology was invented in the early 1930s, it took until the 1960s for it to become the de-facto standard in recording studios and even longer to establish itself in regular households. In the late 50s, new movie releases still came with a stereo track and an additional mono track to account for theatres that were not ready to transition to stereo systems. In short, there are lots of popular songs that were recorded in mono. Examples include:
- Elvis Presley: Thats All Right
- Chuck Berry: Johnny Be Goode
- Duke Ellington: Take the “A” Train
<a href="https://medium.com/media/cb66090595c56d12637ec8d78b01ce30/href">https://medium.com/media/cb66090595c56d12637ec8d78b01ce30/href</a>

Even today, amateur musicians might publish their recordings in mono, either because of a lack of technical competence, or simply because they didn’t want to make an effort to create a stereo mix.

Mono-to-Stereo conversion lets us experience our favorite old recordings in a new light and also bring amateur recordings or demo tracks to live.

2. Improving or modernizing existing stereo mixes that appear sloppy or simply have fallen out of time, stylistically

Even when a stereo recording is available, we might still want to improve it. For example, many older recordings from the 60s and 70s were recorded in stereo, but with each instrument panned 100% to one side. Listen to “Soul Kitchen” by The Doors and notice how the bass and drums are panned fully to the left, the keys and guitar to the right, and the vocals in the centre. The song is great and there is a special aesthetic to it, but the stereo mix would likely not get much love from a modern audience.

<a href="https://medium.com/media/0ca3fa5add9afc69d7e7206962298039/href">https://medium.com/media/0ca3fa5add9afc69d7e7206962298039/href</a>

Technical limitations have affected stereo sound in the past. Further, stereo mixing is not purely a craft, it is part of the artwork. Stereo mixes can be objectively okay, but still fall out of time, stylistically. A stereo conversion tool could be used to create an alternate stereo version that aligns more closely with certain stylistic preferences.

How Mono-to-Stereo AI Works

Now that we discussed how relevant mono-to-stereo technology is, you might be wondering how it works under the hood. Turns out there are different approaches to tackling this problem with AI. In the following, I want to showcase four different methods, ranging from traditional signal processing to generative AI. It does not serve as a complete list of methods, but rather as an inspiration for how this task has been solved over the last 20 years.

Traditional Signal Processing: Sound Source Formation

Before machine learning became as popular as it is today, the field of Music Information Retrieval (MIR) was dominated by smart, hand-crafted algorithms. It is no wonder that such approaches also exist for mono-to-stereo upmixing.

The fundamental idea behind a paper from 2007 (Lagrange, Martins, Tzanetakis, [1]) is simple:

If we can find the different sound sources of a recording and extract them from the signal, we can mix them back together for a realistic stereo experience.

This sounds simple, but how can we tell what the sound sources in the signal are? How do we define them so clearly that an algorithm can extract them from the signal? These questions are difficult to solve and the paper uses a variety of advanced methods to achieve this. In essence, this is the algorithm they came up with:
1. Break the recording into short snippets and identify the peak frequencies (dominant notes) in each snippet
2. Identify which peaks belong together (a sound source) using a clustering algorithm
3. Decide where each sound source should be placed in the stereo mix (manual step)
4. For each sound source, extract its assigned frequencies from the signal
5. Mix all extracted sources together to form the final stereo mix.
Example of the user interface built for the study. The user goes through all the extracted sources and manually places them in the stereo mix, before resynthesizing the whole signal. Image taken from [1].

Although quite complex in the details, the intuition is quite clear: Find sources, extract them, mix them back together.

A Quick Workaround: Source Separation / Stem Splitting

A lot has happened since Lagrange’s 2007 paper. Since Deezer released their stem splitting tool Spleeter in 2019, AI-based source separation systems have become remarkably useful. Leading players such as Lalal.ai or Audioshake make a quick workaround possible:
1. Separate a mono recording into its individual instrument stems using a free or commercial stem splitter
2. Load the stems into a Digital Audio Workstation (DAW) and mix them together to your liking
This technique has been used in a research paper in 2011 (see [2]), but it has become much more viable since due to the recent improvements in stem separation tools.

The downside of source separation approaches is that they produce noticeable sound artifacts, because source separation itself is still not without flaws. Additionally, these approaches still require manual mixing by humans, making them only semi-automatic.

To fully automate mono-to-stereo upmixing, machine learning is required. By learning from real stereo mixes, ML system can adapt the mixing style of real human producers.

Machine Learning with Parametric Stereo

Photo by Zarak Khan on Unsplash

One very creative and efficient way of using machine learning for mono-to-stereo upmixing was presented at ISMIR 2023 by Serrà and colleagues [3]. This work is based on a music compression technique called parametric stereo. Stereo mixes consist of two audio channels, making it hard to integrate in low-bandwidth settings such as music streaming, radio broadcasting, or telephone connections.

Parametric stereo is a technique to create stereo sound from a single mono signal by focusing on the important spatial cues our brain uses to determine where sounds are coming from. These cues are:
1. How loud a sound is in the left ear vs. the right ear (Interchannel Intensity Difference, IID)
2. How in sync it is between left and right in terms of time or phase (Interchannel Time or Phase Difference)
3. How similar or different the signals are in each ear (Interchannel Correlation, IC)
Using these parameters, a stereo-like experience can be created from nothing more than a mono signal.

This is the approach the researchers took to develop their mono-to-stereo upmixing model:
1. Collect a large dataset of stereo music tracks
2. Convert the stereo tracks to parametric stereo (mono + spatial parameters)
3. Train a neural network to predict the spatial parameters given a mono recording
4. To turn a new mono signal into stereo, use the trained model to infer spatial parameters from the mono signal and combine the two to a parametric stereo experience
Currently, no code or listening demos seem to be available for this paper. The authors themselves confess that “there is still a gap between professional stereo mixes and the proposed approaches” (p. 6). Still, the paper outlines a creative and efficient way to accomplish fully automated mono-to-stereo upmixing using machine learning.

Generative AI: Transformer-based Synthesis

Stereo-Genration in Meta’s text-to-music model MusicGen. Image taken from another article by the author.

Now, we will get to the seemingly most straight-forward way to generate stereo from mono. Training a generative model to take a mono input and synthesizing both stereo output channels directly. Although conceptually simple, this is by far the most challenging approach from a technical standpoint. One second of high-resolution audio has 44.1k data points. Generating a three-minute song with stereo channels therefore means generating over 15 million data points.

With todays technologies such as convolutional neural networks, transformers, and neural audio codecs, the complexity of the task is starting to become managable. There are some papers who chose to generate stereo signal through direct neural synthesis (see [4], [5], [6]). However, only [5] train a model than can solve mono to stereo generation out of the box. My intuition is that there is room for a paper that builds a dedicated for the “simple” task of mono-to-stereo generation and focuses 100% on solving this objective. Anyone here looking for a PhD topic?

What Needs to Happen Next?

Photo by Samuel Spagl on Unsplash

To conclude this article, I want to discuss where the field of mono-to-stereo upmixing might be going. Most importantly, I noticed that research in this domain is very sparse, compared to hype topics such as text-to-music generation. Here’s what I think the research community should focus on to bring mono-to-stereo upmixing research to the next level:

1. Openly Available Demos and Code

Only few papers are released in this research field. This makes it even more frustrating that many of them do not share their code or the results of their work with the community. Several times have I read through a fascinating paper only to find that the only way to test the output quality of the method is to understand every single formula in the paper and implement the algorithm myself from scratch.

Sharing code and creating public demos has never been as easy as it is today. Researchers should make this a priority to enable the wider audio community to understand, evaluate, and appreciate their work.

2. Going All-In on Generative AI

Traditional signal processing and machine learning are fun, but when it comes to output quality, there is no way around generative AI anymore. Text-to-music models are already producing great-sounding stereo mixes. Why is there no easy to use, state-of-the-art mono-to-stereo upmixing library available?

From what I gathered in my research, building an efficient and effective model can be done with a reasonable dataset size and minimal to moderate changes to existing model architectures and training methods. My impression is that this is a low-hanging fruit and a “just do it!” situation.

3. Making Upmixing Automated, but Controllable

Once we have a great open-source upmixing model, the next thing we need is controllability. We shouldn’t have to pick between black-box “take-it-or-leave-it” neural generations or old-school, manual mixing based on source separation. I think we could have it both.

A neural mono-to-stereo upmixing model could be trained on a massive dataset and then finetuned to adjust its stereo mixes based on a user prompt. This way, musicians could customize the style of the generated stereo based on their personal preferences.

Conclusion

Effective and openly-accessible mono-to-stereo upmixing has the potential to breathe live into old recordings or amateur productions, while also allowing us to create alternate stereo mixes of our favorite songs.

Although there have been several attempts to solve this problem, no standard method has been established. By embracing recent development in GenAI, a new generation of mono-to-stereo upmixing models could be created that makes the technology more effective and more widely available in the community.

About Me

I’m a musicologist and a data scientist, sharing my thoughts on current topics in AI & music. Here is some of my previous work related to this article:
Find me on Medium and Linkedin!

References

[1] M. Lagrange, L. G. Martins, and G. Tzanetakis (2007): “Semiautomatic mono to stereo up-mixing using sound source formation”, in Audio Engineering Society Convention 122. Audio Engineering Society, 2007.

[2] D. Fitzgerald (2011): “Upmixing from mono-a source separation approach”, in 2011 17th International Conference on Digital Signal Processing (DSP). IEEE, 2011, pp. 1–7.

[3] J. Serrà, D. Scaini, S. Pascual, et al. (2023): “Mono-to-stereo through parametric stereo generation”: https://arxiv.org/abs/2306.14647

[4] J. Copet, F. Kreuk, I. Gat et al. (2023): “Simple and Controllable Music Generation” (revision from 30.01.2024). https://arxiv.org/abs/2306.05284

[5] Y. Zang, Y. Wang & M. Lee (2024): “Ambisonizer: Neural Upmixing as Spherical Harmonics Generation”. https://arxiv.org/pdf/2405.13428

[6] K.K. Parida, S. Srivastava & G. Sharma (2022): “Beyond Mono to Binaural: Generating Binaural Audio from Mono Audio with Depth and Cross Modal Attention”, in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2022, p. 3347–3356. Link

Mono to Stereo: How AI Is Breathing New Life into Music was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.
Originally appeared here:
Mono to Stereo: How AI Is Breathing New Life into Music

Go Here to Read this Fast! Mono to Stereo: How AI Is Breathing New Life into Music
December 24, 2024
How Bias and Variance Affect Your Model

Gustavo R Santos

Learn the concepts and the practice. How a model behaves in each case.

Continue reading on Towards Data Science »

Originally appeared here:
How Bias and Variance Affect Your Model

Go Here to Read this Fast! How Bias and Variance Affect Your Model

December 24, 2024
5 great Netflix shows to watch on Christmas

Blair Marnell

Celebrate the holidays with our selection of the five great Netflix shows to watch on Christmas.

Go Here to Read this Fast! 5 great Netflix shows to watch on Christmas

Originally appeared here:
5 great Netflix shows to watch on Christmas

December 24, 2024
The best horror games to play in 2025

Engadget

Are you tired of feeling safe and happy all the time? Is your daily life overrun by feelings of security, contentment and peace? Do you want an escape from all of the oppressive niceness around you? Well, look no further — these are the games for you.

Here, we’ve collected more than a dozen of the most evocative and disturbing horror games in recent memory. These selections cover a wide range of genres and styles, but each one comes with at least a tinge of unsettling terror. So take a peek, find your game, and prepare your skeleton for some fresh air because you’re about to jump out of your skin.

Check out our entire Best Games series including the best Nintendo Switch games, the best PS5 games, the best Xbox games, the best PC games and the best free games you can play today.

This article originally appeared on Engadget at https://www.engadget.com/gaming/best-horror-games-120029388.html?src=rss

Go Here to Read this Fast! The best horror games to play in 2025

Originally appeared here:
The best horror games to play in 2025

December 24, 2024
ISS astronauts enjoy a microgravity holiday with Brussels sprouts and more

Trevor Mogg

Despite the unusual conditions, crew on the ISS are still able to celebrate the holiday with traditional foods, though Christmas cookies can be problematic.

Go Here to Read this Fast! ISS astronauts enjoy a microgravity holiday with Brussels sprouts and more

Originally appeared here:
ISS astronauts enjoy a microgravity holiday with Brussels sprouts and more

December 24, 2024
The 25 best PC games you can play right now for 2025

Engadget

PC gamers have almost too many options when it comes to titles to play, which is a great problem to have. With decades of games to choose from (and the first port of call for most indie titles, too), the options are endless. You also get the perks of (nearly always flawless) backward compatibility and console-beating graphical performance — if you’ve got the coin for it when you’re building your perfect kit or picking up a high-powered gaming laptop.

The whole idea of what a gaming PC is and where you can play it is shifting, too, with the rise of handheld gaming PCs like the Steam Deck. We’ve tried to be broad with our recommendations here on purpose; here are the best PC games you can play right now.

Best PC games to play right now

Check out our entire Best Games series including the best Nintendo Switch games, the best PS5 games, the best Xbox games, the best PC games and the best free games you can play today.

This article originally appeared on Engadget at https://www.engadget.com/gaming/pc/the-best-pc-games-150000910.html?src=rss

Go Here to Read this Fast! The 25 best PC games you can play right now for 2025

Originally appeared here:
The 25 best PC games you can play right now for 2025

December 24, 2024
Unfair decisions by AI could make us indifferent to bad behaviour by humans

The Conversation

Artificial intelligence (AI) makes important decisions that affect our everyday lives. These decisions are implemented by firms and institutions in the name of efficiency. They can help determine who gets into college, who lands a job, who receives medical treatment and who qualifies for government assistance. As AI takes on these roles, there is a growing risk of unfair decisions – or the perception of them by those people affected. For example, in college admissions or hiring, these automated decisions can unintentionally favour certain groups of people or those with certain backgrounds, while equally qualified but underrepresented applicants get overlooked.…

This story continues at The Next Web

Go Here to Read this Fast! Unfair decisions by AI could make us indifferent to bad behaviour by humans

Originally appeared here:
Unfair decisions by AI could make us indifferent to bad behaviour by humans

December 24, 2024
These will be the most in-demand programming languages in 2025

Kirstie McDermott

Across Europe, skills shortages are emerging as a key challenge. The Council of the European Union says this is driven by demographic change, demand for new skillsets, and poor working conditions in some sectors. Adding to that, a recent report highlighted that around 42% of Europeans lack basic digital skills, including 37% of those in the workforce. The rapid advancement of AI is adding more pressure. While AI offers the EU a shot in the arm to strengthen the bloc’s innovation and competitiveness, there is still a gap between the skills required, and the skills available. 5 jobs to discover…

This story continues at The Next Web

Go Here to Read this Fast! These will be the most in-demand programming languages in 2025

Originally appeared here:
These will be the most in-demand programming languages in 2025

December 24, 2024

Cookie	Duration	Description
cookielawinfo-checkbox-analytics	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Analytics".
cookielawinfo-checkbox-functional	11 months	The cookie is set by GDPR cookie consent to record the user consent for the cookies in the category "Functional".
cookielawinfo-checkbox-necessary	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookies is used to store the user consent for the cookies in the category "Necessary".
cookielawinfo-checkbox-others	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Other.
cookielawinfo-checkbox-performance	11 months	This cookie is set by GDPR Cookie Consent plugin. The cookie is used to store the user consent for the cookies in the category "Performance".
viewed_cookie_policy	11 months	The cookie is set by the GDPR Cookie Consent plugin and is used to store whether or not user has consented to the use of cookies. It does not store any personal data.

Category: Technology

Applications and techniques for AI mono-to-stereo upmixing

Mono and Stereo in the physical and digital world

Use Cases for Mono-to-Stereo Upmixing

1. Enriching existing music in mono format to a stereo experience.

2. Improving or modernizing existing stereo mixes that appear sloppy or simply have fallen out of time, stylistically

How Mono-to-Stereo AI Works

Traditional Signal Processing: Sound Source Formation

A Quick Workaround: Source Separation / Stem Splitting

Machine Learning with Parametric Stereo

Generative AI: Transformer-based Synthesis

What Needs to Happen Next?

1. Openly Available Demos and Code

2. Going All-In on Generative AI

3. Making Upmixing Automated, but Controllable

Conclusion

About Me

References

Best PC games to play right now