I’ve recently done some talks for my school’s cybersecurity club, and now I want to edit them.

My actual video editing needs are very simple: I just need to clip parts of the video out, which, as I understand it, basically every editor can do.

However, my videos were recorded on my phone, and I don’t have a presentation mic or anything of the sort, so background noise, including people talking, has slipped in. From my understanding, filtering general noise out of audio is fairly easy, since human voices sit in a specific frequency range, and it can even be done “live” (during recording or during a game), but filtering out other voices is harder.

However, it seems that AI can do this:

https://scribe.rip/axinc-ai/voicefilter-targeted-voice-separation-model-6fe6f85309ea

It seems to only work on .wav audio files, though, meaning I would need to separate out the audio track first, convert it to WAV, and then merge it back into the video.
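For reference, here is a rough sketch of what that round trip might look like, assuming ffmpeg is installed and on the PATH. The function names, the file names, and the 16 kHz mono setting are my own assumptions, not something the linked article specifies, so check what the model actually expects before relying on them.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def extract_wav_cmd(video: Path, wav: Path, sample_rate: int = 16000) -> list[str]:
    """Build an ffmpeg command that writes the video's audio as mono 16-bit PCM WAV.

    The sample rate is an assumption; adjust it to whatever the model needs.
    """
    return [
        "ffmpeg", "-y",
        "-i", str(video),
        "-vn",                    # drop the video stream
        "-acodec", "pcm_s16le",   # uncompressed 16-bit PCM, i.e. plain WAV
        "-ar", str(sample_rate),  # resample
        "-ac", "1",               # downmix to mono
        str(wav),
    ]


def remux_cmd(video: Path, cleaned_wav: Path, output: Path) -> list[str]:
    """Build an ffmpeg command that pairs the original video with the cleaned audio."""
    return [
        "ffmpeg", "-y",
        "-i", str(video),
        "-i", str(cleaned_wav),
        "-map", "0:v:0",   # video from the first input
        "-map", "1:a:0",   # audio from the second input
        "-c:v", "copy",    # do not re-encode the video
        "-c:a", "aac",     # encode the cleaned audio for the container
        "-shortest",       # stop at the end of the shorter stream
        str(output),
    ]


def run(cmd: list[str]) -> None:
    """Run a command and raise if ffmpeg exits with an error."""
    subprocess.run(cmd, check=True)
```

Usage would be something like `run(extract_wav_cmd(Path("talk.mp4"), Path("talk.wav")))`, then running the voice filter on `talk.wav` to produce a cleaned file, then `run(remux_cmd(Path("talk.mp4"), Path("talk_clean.wav"), Path("talk_clean.mp4")))`. The voice separation step itself is not shown here.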

Before I go learning how to do this, I’m wondering if there is already an existing FOSS video editor, or a plugin for an editor, that lets me filter the video itself, or similar software that works on the audio of videos.

  • WastedJobe@feddit.de
    7 months ago

    Audio engineer here. I’m not sure Ardour can open video, but it’s a capable DAW and open source. Reaper is closed source, but it can open (and even render) pretty much any video format. To actually separate a single voice, you do need additional plugins though, no matter which DAW you’re using.
    I think iZotope RX could do it, but it is fairly expensive. I haven’t seen any open source audio tools that can do this at all. It is pretty much guaranteed to require some kind of machine learning, as parametrically separating by EQ or phase won’t work if you have only one source signal (even with two or more microphones, it would be really, really hard).
    A very good spectral editor might technically work, but it would take several days of manually deleting select frequencies at an almost single-sample level, and it would still sound bad, especially if the noise is nearly the same level as the signal.