SAM Audio: Meta's new Multimodal Audio Separation Model


Meta has recently introduced SAM Audio, a unified multimodal model for audio separation. It lets users isolate specific sounds in music, speech, and general sound mixtures using intuitive text, visual, and time-span prompts. For those in the audio and video production industries, this could significantly change how audio is edited and manipulated.

Key Features of SAM Audio

SAM Audio is a state-of-the-art model that can separate audio through:

  • Text Prompts: Users can type in what they want to isolate, such as "guitar" or "vocals," and the model will focus on that specific sound.
  • Visual Prompts: By selecting a specific object in a video, users can have the model extract the sound associated with it. For example, you can isolate the sound of a passing train by simply selecting it on-screen.
  • Span Prompts: For even more precision, users can mark a specific duration or "span" within the audio waveform.
  • Multi-Prompting: Users can combine different types of prompts to create a more tailored and efficient workflow.
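
To make the prompt types above a little more concrete, here is a minimal Python sketch of how a combined prompt might be represented. It is not the SAM Audio API: the names SeparationPrompt and spans_to_samples are hypothetical, and the sample rate and clip length are assumed values chosen only for illustration. The one piece of real logic is converting a span given in seconds into sample positions within a clip.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SeparationPrompt:
    """Hypothetical container for the prompt types listed above (not the SAM Audio API)."""
    text: Optional[str] = None                      # e.g. "guitar" or "vocals"
    visual_point: Optional[Tuple[int, int]] = None  # (x, y) pixel selected in a video frame
    spans: List[Tuple[float, float]] = field(default_factory=list)  # (start, end) in seconds

    def kinds(self) -> List[str]:
        """Return the names of the prompt types that are set."""
        kinds = []
        if self.text:
            kinds.append("text")
        if self.visual_point is not None:
            kinds.append("visual")
        if self.spans:
            kinds.append("span")
        return kinds


def spans_to_samples(spans, sample_rate, total_samples):
    """Convert (start, end) spans in seconds to sample indices clamped to the clip length."""
    result = []
    for start_s, end_s in spans:
        if end_s <= start_s:
            raise ValueError(f"span end {end_s} must be after start {start_s}")
        start = max(0, int(round(start_s * sample_rate)))
        end = min(total_samples, int(round(end_s * sample_rate)))
        if start < end:  # drop spans that fall entirely outside the clip
            result.append((start, end))
    return result


# Multi-prompting: a text prompt combined with a span prompt.
prompt = SeparationPrompt(text="passing train", spans=[(1.5, 4.0)])
print(prompt.kinds())
# Assumed values: a 3-second clip at 16 kHz, so the span is clamped to the clip end.
print(spans_to_samples(prompt.spans, sample_rate=16_000, total_samples=48_000))
```

In this sketch, a span that runs past the end of the clip is trimmed rather than rejected, which is one reasonable choice for a waveform selection made by hand in an editor.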

Applications for Creators

SAM Audio is designed for a wide range of users, including musicians, audio engineers, video creators, and hobbyists. Its ability to accurately isolate speech from background noise or separate individual instruments in a complex musical piece offers unprecedented control.

The Future of Audio Editing

The technology behind SAM Audio will definitely turn heads in the audio industry. Given its versatility, I'm sure that this tech will soon be integrated into Digital Audio Workstations (DAWs) and Non-Linear Editors (NLEs), possibly within 2026. My bet is on DaVinci Resolve 😀! Whether it shows up as a native feature or a plugin, this tech is definitely here to help audio editors. For more information, visit https://ai.meta.com/samaudio/

GitHub: https://github.com/facebookresearch/sam-audio



Written by Prashant Mishra

Views on this website are my own and do not represent the opinions of any organisations I work with.


Chief Product Officer, Soundly | Founder, Pracific
Building audio products, communities, sonic experiences & educational initiatives. I promote budding talents & ideas 🚀

Audio Developer Conference (ADC) | Game Audio India | National Institute of Design | Music Hack Day India | Music Tech Community | Previously contributed to School of Video Game Audio