How to use Voice Changer software effectively

I use a voice changer program when I’m livestreaming. It helps me sound more female, though some may disagree.

Trans-related subjects are not something I like talking about, but when it gets brought up, my viewers usually have an “Aha!” moment, because they weren’t totally sure – and in many cases they are surprised, because they couldn’t tell I used one at all.

It’s actually something I’d explored long before I became a Vtuber, and applying it to my avatar was a natural decision. But the opinions I found in the past were that it was always “obvious”, and that it was never possible to sound convincing.

So why does it work for me? There are many possible answers, and viewers often say that they just assumed I had a cheap microphone. But I think there’s also something behind how I selected and tuned my software in a way I thought best.

In this article, I’m going to avoid speaking in detail about why one would like to use a voice changer. Perhaps you think of Vtubers as virtual characters, and it adds to the suspension of disbelief. Perhaps you wish to conceal your voice for privacy. Or perhaps you are trans and you are uncomfortable with hearing your own voice.

Whether or not you agree with these or if they apply to you, what matters is that these reasons exist, and that I pursued them. And, I’m finally sharing some of what I did.

This guide has four sections:

  • The TL;DR Summary
  • Speech and Music Theory
  • General Advice
  • Software List

The TL;DR Summary

OBS Studio has had support for VST plugins for a couple of years now, and I currently use a plugin called RoVee in my audio filters, along with the integrated OBS noise filters and EasyQ – an equaliser – to clean it up before RoVee does its magic.

I used two setups prior to this. Originally, I used a dedicated program called Koigoe used by Japanese Vtubers to change my voice outside of OBS. Both RoVee and Koigoe have a critical feature compared to other voice changers: they have dials that set the formant and pitch modification manually, allowing for fine tuning.

My second setup was similar to my current one, except that I had a VST plugin called Voxengo Recorder and a virtual cable to output the audio to my desired programs. Currently, I use neither of these, so you can hear me unfiltered on Discord, and on other streams that I join via Discord.

That’s the whole thing, but let’s unpack these sentences.
RoVee with dials for Formant and Pitch.

VST plugins are a program-agnostic add-on system for Digital Audio Workstations (DAWs), which is the collective term for music and/or sound production programs. The plugins themselves are typically sound mixers and filters. This allows a producer to use whatever DAW they’re comfortable with – whether it’s Reason, FL Studio, or Studio One – and keep a fairly similar pipeline.

To use a VST, you typically need a DAW, as that’s the intended purpose – but there are also programs that take live audio and process it with VST plugin support, and OBS itself is now one of them.

To install a VST plugin, you put it in a specific folder on your C drive. After that, it’s accessible to every DAW and anything else that supports VST and looks in that folder. The folder is:

C:\Program Files (x86)\vstplugins for 32bit plugins, or
C:\Program Files\vstplugins for 64bit plugins.

Depending on whether the plugin is 32bit or 64bit, it will only be accessible in the matching 32bit or 64bit version of OBS. RoVee is only available in 32bit, so you’ll have to use the 32bit version of OBS Studio. Then, you can find it in your Audio Filters in the Audio Mixer window in OBS.
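If you are not sure whether a plugin landed in the right place, a small script can list what is sitting in each folder. This is only a rough sketch: the two paths are the ones named above, the function name `list_vst_plugins` is made up for illustration, and it assumes the plugins are ordinary `.dll` files, which is how VST2 plugins ship on Windows.

```python
from pathlib import Path

# The two folders named in this article. Other hosts may also look elsewhere.
VST_FOLDERS = {
    "32bit": Path(r"C:\Program Files (x86)\vstplugins"),
    "64bit": Path(r"C:\Program Files\vstplugins"),
}


def list_vst_plugins(folders):
    """Return {label: sorted .dll file names}; a missing folder gives an empty list."""
    found = {}
    for label, folder in folders.items():
        folder = Path(folder)
        if folder.is_dir():
            found[label] = sorted(p.name for p in folder.glob("*.dll"))
        else:
            found[label] = []
    return found


if __name__ == "__main__":
    for label, names in list_vst_plugins(VST_FOLDERS).items():
        print(label, names or "(no plugins found)")
```

A file name showing up here only tells you the file is in the folder. Whether it is a 32bit or 64bit build still decides which version of OBS will load it.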

And that’s mostly my current setup. Skip to the next section for some advice on how to tune it, or keep reading for what I used to use.

Stuff I Used Outside of OBS Studio

If you don’t use OBS for some reason, you can do the same thing with an external program. Most of my work ended up going through OBS anyway, but there are cases where you may want to use a filter in VoIP, or some other post-production.

To avoid setting up the same filters twice, I opted to use a program that outputs the live, processed audio from OBS, but I will outline how to do it without OBS.

To do this, you need a few different programs:

  • a set of virtual audio cables (if they’re not included in the other programs),
  • a sound mixer,
  • a VST-supported program, if using a VST voice changer,
  • or otherwise, another voice changer program.

There are a few different ways to configure this, but the overall pipeline is the same: You send your physical microphone into a middle-man program, and that program sends the processed sound out into your desired one, with a fake microphone.

Virtual Audio Cables add a set of “fake” speakers and microphones to your computer, allowing you to directly send live audio from one program to another. I used VB Audio Cable for this, but in some cases, a sound mixer program (including the one I suggest next) and Voice Changer programs will install a set of fake audio drivers. Check first.

For a Sound Mixer, a common suggestion is Voicemeeter Banana, from the same people who make VB Audio Cable. Since OBS has its own sound mixer, I considered this an unnecessary complication and only used the Virtual Cables. It’s up to you.

For a VST program… again, I have never used this myself, because OBS supports it. However, you can take a look at VSTHost, which is old, very lightweight, and commonly recommended for live artists.

To output the results of the VST filtering to another program, I used Voxengo Recorder, which was rather difficult to find. VSTHost should have the ability to output already, as do other VST programs, but I used OBS, which doesn’t have one.

A Voice Changer for Early Vtubers
Koigoe. Uhh…. I forgot to install Japanese support this time, but there’s enough English to work. ¯\_(ツ)_/¯

The original program I used for voice changing was called Koigoe, which I believe was used by early Japanese streamers. It has a bunch of useful features including wav file and batch processing, and a pitch monitor graph to observe your vocal range in real time.

While it has some presets, it also has sliders for you to set your own formant and pitch shift level. This lets you tune your voice more precisely, compared to other voice changer programs, which largely consist of presets.

However, Koigoe does have one fatal flaw that prevents long-term use: if you run the live microphone mode for too long, there’s a buffer overflow that garbles all the audio. You can prevent this by pressing stop/start every 10 to 15 minutes. This doesn’t occur when you are only processing audio files.

(As an interesting bit of trivia, my entire first month of streaming had garbled audio, and the YouTube playthrough is actually not the original VOD, but a reprocessed version.)

Speech and Music Theory

So I’ve mentioned Formant and Pitch Correction a couple times, but why are they so important?

If it wasn’t already evident, I never found a voice changer I fully liked. Instead, I found musical voice changers to function better. Since you are expected to tune these yourself, some basic theory knowledge is required.

Pitch

Most people should already know about pitch: It’s essentially the musical notes that are made by your voice. Pitch is something that applies both in human singing and in human speech, which means that using music software for speech changing is possible.

From exploring different programs that are explicitly about voice changing, I’ve found that they are usually full of unchangeable presets that don’t fully explain themselves or consider that everyone’s voice is different. Perhaps it’s because I only used the trial versions, but I never found one to my liking.

After finding Koigoe, I was able to experiment more directly with the correction values, and also looked at the pitches in my regular speech pattern at the same time.

A graph sample of my own singing.
From a Pitch Monitor Android App.

During regular speech, the pitches you make can range over an entire octave. With trial and error, and studying the graph, I pushed my voice to stay within the lowest part of contralto, which is the lowest female singing range.

Contralto singing range, with Middle C noted.
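To put numbers on a pitch graph like the one above, you can convert a frequency reading into a note name and work out how many semitones separate it from a target range. The sketch below uses the standard equal-temperament formula with A4 = 440 Hz. The contralto bounds (roughly F3 to F5) are a commonly quoted approximation that I am assuming here, and the 130 Hz reading is a made-up example value, not a measurement from this article.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Assumption: a commonly quoted contralto range of about F3 to F5,
# written as MIDI note numbers (middle C, C4, is 60).
CONTRALTO_LOW_MIDI = 53   # F3
CONTRALTO_HIGH_MIDI = 77  # F5


def hz_to_midi(freq_hz):
    """Convert a frequency in Hz to a (fractional) MIDI note number, A4 = 440 Hz."""
    return 69.0 + 12.0 * math.log2(freq_hz / 440.0)


def midi_to_name(midi_number):
    """Name of the nearest equal-tempered note, e.g. 60 -> 'C4'."""
    nearest = round(midi_number)
    return f"{NOTE_NAMES[nearest % 12]}{nearest // 12 - 1}"


def semitones_to_reach(current_hz, target_midi):
    """Semitones of upward shift needed to move current_hz to target_midi."""
    return target_midi - hz_to_midi(current_hz)


if __name__ == "__main__":
    speaking_hz = 130.0  # hypothetical reading from a pitch monitor
    print(midi_to_name(hz_to_midi(speaking_hz)))
    print(round(semitones_to_reach(speaking_hz, CONTRALTO_LOW_MIDI), 1))
```

The second number is a starting point for a pitch dial, not a recommendation. Speech wanders over a wide range, so the graph matters more than any single reading.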

Formant

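Formant refers to the specific texture of your voice; more precisely, formants are the resonant frequency peaks that the shape of your vocal tract gives to the sound. Timbre and resonance are related terms in sound and music theory, and they are the reason why we can tell the difference between two different instruments playing the same note.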

The human voice itself is a musical instrument, and changing the formant in addition to the pitch is important in the same way that changing the note on an instrument actually requires physical movement of the keys. You’re changing the texture, too.

That’s the layman’s explanation, and I have no idea how correct I am… but the point is, you need to shift the formant a certain percentage based on the semitone shift. The percentage, much like your vocal range, is something you need to tune yourself depending on what your voice is like – but you generally don’t need a large amount.
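One way to picture the relationship is as two frequency ratios: one for the pitch shift, and a smaller one for the formant shift. The sketch below uses the standard rule that a shift of n semitones multiplies frequency by 2^(n/12). Treating the formant shift as a fraction of the pitch shift is only my reading of the paragraph above, and `formant_fraction` is a hypothetical tuning knob, not a setting that RoVee or Koigoe exposes under that name.

```python
def pitch_ratio(semitones):
    """Frequency multiplier for a pitch shift of the given number of semitones."""
    return 2.0 ** (semitones / 12.0)


def formant_ratio(semitones, formant_fraction):
    """Frequency multiplier if formants move by only a fraction of the pitch shift.

    formant_fraction is a hypothetical knob between 0.0 (formants untouched)
    and 1.0 (formants move as far as the pitch does).
    """
    return 2.0 ** (semitones * formant_fraction / 12.0)


if __name__ == "__main__":
    shift = 4.0      # example pitch shift in semitones
    fraction = 0.3   # example value only; tune by ear
    print(pitch_ratio(shift), formant_ratio(shift, fraction))
```

Real plugins may label their dials in percent, cents, or arbitrary units, so treat these ratios as a way of thinking about the two dials rather than values to type in.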

General Advice

Less is More.

The farther you shift a sound, the more distortion is created. You can clean up some of this distortion in post-production, but in a live setting the only thing you can do is shift less, or not at all.

This is why I settled on contralto – but you can go higher if you want.

You need clean, studio-grade audio.

My first experiments that proved that it was possible to have an invisible voice changer were on video game dialogue clips, recorded in a studio setting. You don’t need a studio microphone to get this, or a studio, but knowing how to have good microphone audio is a prerequisite to having effective voice filtering.

Use noise filtration, position your microphone correctly, and use an equaliser to control your bass levels, so that the voice changer receives the purest sound sample.
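To make "controlling your bass levels" a little more concrete, here is a minimal sketch of a first-order high-pass filter, the simplest kind of low cut. It is not what OBS or EasyQ does internally. The function name and the 80 Hz cutoff are assumptions for illustration, and in practice you would use the equaliser's own controls rather than code.

```python
import math


def high_pass(samples, sample_rate, cutoff_hz):
    """First-order high-pass filter: attenuates content below cutoff_hz."""
    if not samples:
        return []
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    dt = 1.0 / sample_rate
    alpha = rc / (rc + dt)
    out = [samples[0]]
    for i in range(1, len(samples)):
        # Each output follows changes in the input, while any constant
        # (very low frequency) part decays away by a factor of alpha.
        out.append(alpha * (out[-1] + samples[i] - samples[i - 1]))
    return out


if __name__ == "__main__":
    rate = 48000
    # A constant offset stands in for unwanted low-frequency rumble.
    rumble = [1.0] * rate
    filtered = high_pass(rumble, rate, cutoff_hz=80.0)  # 80 Hz is an example value
    print(abs(filtered[-1]) < 1e-6)
```

The idea is that rumble and boominess below the voice are removed before the voice changer sees them, so it has less unwanted material to shift.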

Some people won’t like this, but the choice is yours.

Your voice is a foundation of your identity. Changing it, and having people learn that you are changing it, will get a bad reaction from some people. It is, objectively speaking, a deception from your natural voice. A lie.

You should think hard about why you would want to use it, and then follow it with conviction. It can be as simple as saying you are adding to your character – but don’t lose yourself.

Software List

Mixers/VST Hosts

Voicemeeter Banana
https://www.vb-audio.com/Voicemeeter/banana.htm
Mixer.

VSTHost
https://www.hermannseib.com/english/vsthost.htm
VST host, no mixing (unless you throw in a mixer VST)

Cantabile
https://www.cantabilesoftware.com/
Both a mixer and VST host.

VST Plugins

You can also try searching for “formant shifting” VST plugins online. Anything related to autotuning is what you should seek. The ones listed here are free.

RoVee
official: https://www.g200kg.com/jp/software/rovee.html
English mirror: http://www.vst4free.com/free_vst.php?id=1012
Formant/Pitch shifter.

EasyQ
http://www.vst4free.com/free_vst.php?plugin=EasyQ&id=949
Equaliser.

Voxengo Recorder
https://www.voxengo.com/product/recorder/
Outputs the audio in its location in the chain.

Other Utility Programs

VB Audio Cable
https://www.vb-audio.com/Cable/
Virtual Audio Cable.

Vocal Pitch Monitor
https://play.google.com/store/apps/details?id=com.tadaoyamaoka.vocalpitchmonitor
Self-explanatory. Android.

Voice Changer Programs

Koigoe
http://koigoemoe.g2.xrea.com/koigoe/koigoe.html
The only one here that’s actually a musical program with tuning dials. And free.
Has batch file processing and a pitch monitor.

MorphVox
https://screamingbee.com/morphvox-voice-changer
Also on Steam and tends to get mixed reviews.

AV Voice Changer Diamond
https://www.audio4fun.com/voice-changer.htm
Commonly said to be the “best” choice. Also the most expensive.

Voxal
https://www.nchsoftware.com/voicechanger/index.html

Voicemod
https://www.voicemod.net/
A fairly new development compared to everything else on the list.

4 thoughts on “How to use Voice Changer software effectively”

  1. Very neat, thank you so much!
    It’s a real bummer that RoVee may only work with the 32-bit OBS. I wonder if it’s possible to just copy the VST plugins to the alternative file location for 64-bit.

    I also downloaded the trial version of AV Voice Changer Diamond. It does seem to work well, and it is nice that you can point the output to various other programs if you’d like to. I’m still not sure it’s worth the cost, as I find it difficult to evaluate if it’s ‘better’ than the methods you mentioned, or if it’s just selling a lot of unneeded extras in a complete package that will get you the same result.

    Either way, thanks for listing out some options! Good to know about the buffer overflow problem in Koigoe too. I was about to go that route, but I would certainly forget to start/stop every ~15 minutes. The tip about limiting changes to contralto is interesting too. Getting set up to Vtube is certainly dizzying sometimes, but this is easily some of the best information I’ve found about voice changers for it, thanks again.

    1. RoVee was the only one I found that was free, unfortunately. But there are a lot of paid VSTs out there with active support, like Little AlterBoy. And yeah, I’m not really sure which is ‘better’ or not… but I settled on my choice because of what I wanted to have control over. As for Koigoe, you should keep it in mind if you don’t stream, since you’ll have more control when you’re editing.

  2. Hi Eiri, may I say, nice post. My avatar is an alien and my own voice just doesn’t match; well, that’s what I think anyway. I’m not trying to sound feminine, just to have a higher pitch. The problem I have is that the sound drastically reduces in quality, sounds tinny, and sometimes distorts. I’m not even changing the pitch that much, just by a couple of points, and it’s driving me crazy. Any ideas? I’m using a plugin that I have downloaded for OBS. I prefer this because I can set it up and leave it, rather than having to worry about yet another external program. Any help would be greatly appreciated.

    1. I’m sorry for neglecting your comment for so long. There isn’t a good software solution if the quality drops and sounds tinny. At a certain point, the audio needs to be of a quality you might hear in an audiobook: super clean and isolated, which is difficult to get live unless you have a home studio with well-tuned noise filtration.
