Imagine that you have a song file—drums, guitar, bass, vocals, piano—and you want to rebalance it, bringing the voice down just a touch in the mix.
Or you want to turn a Lyle Lovett country-rock jam into a slamming club banger, and all that's standing between you and the booty-shaking masses is a clean copy of Lovett's voice without all those instruments mucking things up.
Or you recorded a once-in-a-lifetime, Stevie Nicks-meets-Ann Wilson vocal performance into your voice notes app... but your dog was baying in the background and your guitar was out of tune. Can you extract the magic and discard the rest?
Without access to the original recording project files or master tapes, jobs like these have always been difficult. Specialized software could extract specific sounds, usually the vocals, sometimes through crude techniques built around high- and low-passing the audio, but the results were never quite what one might wish. With the advent of machine learning, however, computers have gotten scarily good at reaching into dense audio mixes and surgically extracting complicated parts.
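To get a feel for just how crude the old approach was, here's a minimal sketch of filter-based "vocal extraction" in Python with NumPy and SciPy. The file name and cutoff frequencies are my own illustrative choices, not taken from any particular tool:

```python
# Pre-ML "vocal extraction": band-pass the mix around the vocal range
# with high- and low-pass Butterworth filters. Everything else living
# in that band (guitars, snare, keys) comes along for the ride.
import numpy as np
from scipy.io import wavfile
from scipy.signal import butter, sosfiltfilt

rate, mix = wavfile.read("mix.wav")        # hypothetical input file
mix = mix.astype(np.float64)
if mix.ndim == 2:                          # fold stereo to mono for simplicity
    mix = mix.mean(axis=1)

# High-pass around 200 Hz to shed bass and kick; low-pass around 4 kHz
# to shed cymbals and air. The cutoffs here are illustrative guesses.
sos = butter(4, [200, 4000], btype="bandpass", fs=rate, output="sos")
rough_vocal = sosfiltfilt(sos, mix)

wavfile.write("rough_vocal.wav", rate,
              np.clip(rough_vocal, -32768, 32767).astype(np.int16))
```

The flaw is audible immediately: any instrument that shares the vocal's frequency band survives the filtering, so the "vocal" you get still contains half the band.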
Apple's May 28 update to its flagship audio program, Logic Pro, shows just how far this tech has come—and how quickly it's advancing.
Put to the test
In 2024, Apple rolled out the Stem Splitter feature in Logic 11. Powered by AI and requiring Apple Silicon, Stem Splitter could "recover moments of inspiration from any audio file and separate nearly any mixed audio recording into four distinct parts: Drums, Bass, Vocals, and Other instruments, right on the device," Apple said. "With these tracks separated, it’s easy to apply effects, add new parts, or change the mix."
And it worked—but it worked best as long as you kept all the stems together (that is, if you simply rebalanced the track or added effects to one of the stems). If you wanted to isolate a single stem, though, you had to contend with some fairly gnarly audio artifacts.
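Why would recombined stems sound fine while a solo stem sounded rough? Because the separation errors in each stem are largely complementary: summed back together at unity gain, the stems reconstruct something very close to the original mix, and any one stem's artifacts hide under the other three. Here's a minimal sketch of that rebalancing workflow, assuming four exported stem files of equal length and sample rate (the file names and gain values are mine, not Logic's output format):

```python
# Rebalance a mix from separated stems: scale each stem, then sum.
# Artifacts in any single stem are masked by the other three.
import numpy as np
from scipy.io import wavfile

# Hypothetical stem exports; Logic's actual file naming will differ.
stems = {"vocals.wav": 0.8,   # pull the vocal down "just a touch"
         "drums.wav": 1.0,
         "bass.wav": 1.0,
         "other.wav": 1.0}

remix, rate = None, None
for path, gain in stems.items():
    rate, audio = wavfile.read(path)
    remix_part = audio.astype(np.float64) * gain
    remix = remix_part if remix is None else remix + remix_part

wavfile.write("remix.wav", rate,
              np.clip(remix, -32768, 32767).astype(np.int16))
```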
Consider an example from a song I've been working on. Here's a snippet of the full piece:
After running Logic's original Stem Splitter on the snippet, I was given four tracks: Vocals, Drums, Bass, and "Other." They all isolated their parts reasonably well, but check out the static and artifacting when you isolate the bass track:
The vocal track came out better, but it was still far from ideal:
Now, just over a year later, Apple has released a point update for Logic that delivers "enhanced audio fidelity" for Stem Splitter—along with support for new stems for guitar and piano.
The difference in quality is significant, as you can hear in the new bass track:
And the new vocal track, though still lacking the pristine fidelity of the original recording, is nevertheless greatly improved:
The ability to separate out guitars and pianos is also welcome, and it works well. Here's the piano part:
Pretty impressive leap in fidelity for a point release!
There are plenty of other stem-splitting tools, of course, and many have had a head start on Apple. With its new release, however, Apple has certainly closed the gap.
iZotope's RX 11, for instance, is a highly regarded (and expensive!) piece of software that can do wonders when it comes to repairing audio: reducing clicks, background noise, and sibilance.
It includes a stem-splitting feature that can produce four outputs (vocal, bass, drums, and other), and it produces usable audio—but I'm not sure I'd rank its output more highly than Logic's. Compare for yourself on the vocal and bass stems:
In any event, the AI/machine learning revolution has arrived in the music world, and the rapid rise in stem-splitting quality over just a few years shows what these systems are capable of when trained on enough data. I remain especially impressed by how the best stem splitters can extract not just a clean vocal but also its reverb/delay tail. Having access to the original recordings will always be better—but stem-splitting tech is improving quickly.