Clean audio tracks with misst: separate stems on Linux easily
I recently took part in a live event organized by Rockin’1000: a street concert with a band of eleven. No rehearsals, no click in headphones, no safety net. To help us prepare, the organization provided everything we needed: the twelve tracks of the setlist complete with scores, tabs, video tutorials for both guitars (right and left), and click tracks with voice prompts (“verse”, “bridge”, “chorus”, etc.) designed to avoid mistakes during individual study.
The problem? On stage I wouldn't have any of those cues, so I had to practice with “clean” tracks containing only the basic sounds: the versions with voice prompts were of no use.
This is where misst comes into play: an open source application that performs stem separation using modern AI models, including those published by Meta.
In this article we'll see what it is, how it works, how to install it, and how I used it to get perfect click tracks for a live show.
What is misst?

misst is an open source application for separating audio tracks (“stem separation”).
Its purpose is simple: take a stereo audio file and split it into isolated components, such as:
- vocals,
- drums,
- bass,
- other instruments,
- percussion,
- various accompaniments.
It's meant to be easy, lightweight and cross-platform, with a graphical interface, so you don't have to wrestle with complicated models or hard-to-manage dependencies.
Thanks to open source models developed by the community and, in some versions, models published by Meta, misst extracts vocals with surprising accuracy, often comparable to professional tools.
How to install misst on Linux
Unlike much modern audio software, misst is not available through Flatpak, as an AppImage, or as a precompiled package for Linux.
The official distribution provides installers for Windows only, while on Linux the only supported method is manual installation from source using Python and a virtual environment (venv).
Requirements
- Python 3.9 or higher
- FFmpeg installed on your system
- Git
- Optional: CUDA + NVIDIA GPU to accelerate stem separation
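Before starting, it can help to confirm that the requirements above are actually present. The following is a minimal sketch in POSIX shell: check_tool and check_requirements are hypothetical helper names of my own, not something shipped with misst, and nvidia-smi is used only as a rough hint that an NVIDIA driver is installed.

```shell
# Hypothetical helpers (not part of MISST): report which requirements are on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

check_requirements() {
    req_status=0
    check_tool python3 || req_status=1
    check_tool ffmpeg  || req_status=1
    check_tool git     || req_status=1
    # Python must be 3.9 or newer; the exit code of this one-liner tells us.
    if ! python3 -c 'import sys; sys.exit(0 if sys.version_info >= (3, 9) else 1)' 2>/dev/null; then
        echo "python3 is older than 3.9 (or not installed)"
        req_status=1
    fi
    # Optional: only matters if you want CUDA acceleration.
    check_tool nvidia-smi || echo "note: no NVIDIA tools found, expect CPU-only separation"
    return "$req_status"
}
```

After pasting the functions into a terminal, run check_requirements: every line starting with "missing" points to something to install before cloning the repository.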
Step-by-step installation
Open a terminal and follow these steps:
# 1. Clone the official repository
git clone https://github.com/Frikallo/MISST.git
cd MISST
# 2. Create a Python virtual environment
python3 -m venv venv
# 3. Activate the virtual environment
source venv/bin/activate
# 4. Install the dependencies
# For GPU (CUDA): requirements.txt
# For CPU-only: requirements-minimal.txt
pip install -r requirements-minimal.txt
# 5. Launch the GUI application
python3 MISSTapp.py
Important notes
- If you want to take advantage of the GPU, make sure you have a compatible CUDA version before installing the full dependencies.
- Installation via pip may take time, especially on systems without GPU acceleration.
- The app runs directly from the cloned repository: it is not installed system-wide.
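To see whether the GPU will actually be used, you can ask the deep learning library directly from inside the activated venv. This sketch assumes the requirements file installs PyTorch (importable as torch); that is an assumption on my part, so check requirements.txt if the import fails. gpu_status is a hypothetical helper name.

```shell
# Hypothetical helper: run inside the activated venv.
# Assumes the installed dependencies include PyTorch (module name: torch).
gpu_status() {
    python3 -c 'import torch; print("cuda" if torch.cuda.is_available() else "cpu")' 2>/dev/null \
        || echo "torch-not-installed"
}
```

Calling gpu_status prints cuda when a usable CUDA device is visible, cpu when PyTorch is installed but falls back to the processor, and torch-not-installed when the import fails.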
How I used misst for the Rockin'1000 project
The click tracks I received were perfect for preparing a concert played with a click in headphones, but in my case there was a big problem: a voice called out the sections of the song.
“Verse… two… three… four…”
“Bridge!”
“Guitar left, chorus!”
Great for avoiding mistakes at home, terrible for practicing for a live performance with no click in headphones.
Objective
Get a “neutral” click track, composed only of:
- metronome,
- any accents or non-vocal rhythmic indications.
Procedure with misst
- I import the click track into misst.
  I load the WAV/MP3 file exactly as provided by Rockin'1000.
- I select the Voice/Other model.
  It's ideal when you want to isolate vocal parts precisely.
- I run the separation.
  After a few seconds (or a couple of minutes, depending on the CPU) I have two files:
  - voice.wav – everything that the software considers voice,
  - no_voice.wav – everything else.
- I check the result.
  In most cases the spoken cues (“verse”, “bridge”) end up correctly in the vocal stem.
- I export only the “instrumental” track.
  The one without voice becomes my new “clean” click track.
- (Optional) I adjust the volume or recreate some accents with Audacity.
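The optional volume adjustment in the last step can also be done from the terminal with FFmpeg, which is already among the requirements. A minimal sketch, assuming the exported file is called no_voice.wav as in the procedure: boost_click is a hypothetical helper name and the 3 dB default is only an example value.

```shell
# Hypothetical helper: copy a stem while raising (or lowering) its volume.
# Usage: boost_click INPUT OUTPUT [GAIN_DB]
boost_click() {
    in_file=$1
    out_file=$2
    gain_db=${3:-3}   # example default: +3 dB; use a negative value to attenuate
    if [ ! -f "$in_file" ]; then
        echo "input not found: $in_file" >&2
        return 1
    fi
    # -n: never overwrite an existing output; volume=<N>dB is FFmpeg's gain filter.
    ffmpeg -hide_banner -n -i "$in_file" -filter:a "volume=${gain_db}dB" "$out_file"
}
```

For example, boost_click no_voice.wav click_clean.wav 4 writes a copy that is 4 dB louder. Watch for clipping if the original is already close to full scale.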
The result was impeccable: clean click tracks, no voice commands, no obvious artifacts, perfect for studying as if you were live.
What it does and what it doesn't do
What it does well
- Excellent vocal separation for click tracks, karaoke, rehearsal tracks.
- Simple and clear interface even for non-technical people.
- Runs from a Python virtual environment: nothing is installed system-wide.
- Open source: you can study it, modify it, contribute.
- Works offline: no uploading to cloud services.
What it doesn't do
- It is not an audio editor: no cuts, effects, normalization.
- It doesn't work miracles: some residual voices may remain if overlapped with instruments that occupy similar frequencies.
- It is not intended for extreme professional productions (but for everyday use it is enough).
- It does not replace more "scientific" software like Demucs from the command line, if you need the absolute maximum quality.
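For comparison, this is roughly what the Demucs command-line route looks like. A sketch under assumptions: Demucs is installed separately in a virtual environment with pip install demucs, and strip_voice and the separated_clicks output folder are hypothetical names of mine. The --two-stems=vocals option asks Demucs for just two files, the vocals and everything else.

```shell
# Hypothetical helper: split one track into vocals / no-vocals with Demucs.
# Usage: strip_voice TRACK [OUTPUT_DIR]
strip_voice() {
    track=$1
    out_dir=${2:-separated_clicks}   # example folder name
    if [ ! -f "$track" ]; then
        echo "track not found: $track" >&2
        return 1
    fi
    # --two-stems=vocals: produce vocals.wav and no_vocals.wav only.
    # -o: folder where Demucs writes its results.
    demucs --two-stems=vocals -o "$out_dir" "$track"
}
```

Looping over the whole setlist is then one line, for f in *.mp3; do strip_voice "$f"; done, and the no_vocals.wav file in each track's output folder plays the same role as no_voice.wav above.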
Conclusions
If you play, remix, produce or work with tracks that need editing, misst is one of the best open source tools available for Linux today.
It's light, easy, free and allows anyone to do voice separation without becoming an expert in AI models.
In my specific case, preparing for a Rockin'1000 live concert with no click in headphones, it was the perfect tool: quick and effective.
Definitely worth a place in your musical toolbox.



