How we made version 1 voices sound even better in Synthesizer V Studio 2 Pro

Did you know that Synthesizer V Studio 2 Pro comes with a collection of upgraded voices?

All new voices offer significantly improved clarity and dynamics, and even the free compatibility updates for version 1’s voices sound slightly better than the original versions made for Synthesizer V Studio Pro.

The story dates back to 2018, when the Dreamtonics team was working on version 1.0 of Synthesizer V Studio. It was an interesting time: neural networks had just emerged as a viable alternative to sample concatenation for vocal synthesis. It seemed that deep learning might eventually help us achieve an unprecedented level of naturalness, but back then it just wasn’t possible to deliver the same kind of crisp, clean-sounding audio that concatenative synthesis could easily generate.

Looking back, we made a balanced decision at the outset by designing the first-generation Synthesizer V engine as a hybrid of concatenation and neural networks. The inspiration was drawn from state-of-the-art text-to-speech engines available at the time. A neural network was trained to do all of the high-level planning in terms of “what kind of sound to make for each consonant and vowel”. The engine would then perform a database search to find the sample that sounded the most similar to that target, and the selected samples would be joined together with cross-fades to form one continuous vocal phrase.
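To make the idea concrete, here is a minimal sketch of that plan, search, and cross-fade loop. It is only an illustration, not the actual Synthesizer V engine: the feature vectors stand in for the neural network's targets, and the names `nearest_sample`, `crossfade`, and `synthesize`, the Euclidean distance measure, and the linear fade are all assumptions made for this example.

```python
import math

def nearest_sample(target, database):
    """Return the database entry whose feature vector is closest to the target.

    Assumption: similarity is plain Euclidean distance between feature vectors.
    """
    return min(database, key=lambda entry: math.dist(entry["features"], target))

def crossfade(a, b, overlap):
    """Join two waveforms, linearly fading `a` out and `b` in over `overlap` samples."""
    overlap = min(overlap, len(a), len(b))
    if overlap == 0:
        return a + b
    mixed = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # fade-in weight for `b`, strictly between 0 and 1
        mixed.append((1 - w) * a[len(a) - overlap + i] + w * b[i])
    return a[:-overlap] + mixed + b[overlap:]

def synthesize(targets, database, overlap=2):
    """Pick the closest sample for each target and cross-fade them into one phrase."""
    phrase = list(nearest_sample(targets[0], database)["audio"])
    for target in targets[1:]:
        unit = list(nearest_sample(target, database)["audio"])
        phrase = crossfade(phrase, unit, overlap)
    return phrase

# Hypothetical two-entry "voice database" with toy audio.
database = [
    {"features": [0.0, 0.0], "audio": [1.0, 1.0, 1.0, 1.0]},
    {"features": [1.0, 1.0], "audio": [0.0, 0.0, 0.0, 0.0]},
]
# Hypothetical targets, as if produced by the planning network.
targets = [[0.1, 0.0], [0.9, 1.0]]
phrase = synthesize(targets, database)
```

In a real system the features would be acoustic descriptors and the search would cover thousands of samples, but the division of labor is the same: the network decides what to aim for, and the database supplies the audio.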

The decision had a few significant impacts on Synthesizer V Studio. Firstly, it was why our voices were named “voice databases”: they were literally databases containing thousands of voice samples, along with a neural network that served as their index. Secondly, from the very beginning, the editor itself was designed to support this kind of hybrid approach to vocal synthesis.

However, in the second half of 2020, a mere few months after launching, we found a way to make a fully neural network-based vocal synthesizer that sounded just as crisp and clear as our hybrid engine. In fact, this new approach became so successful that later 1.x versions all focused on the neural network part.

In the meantime, we realized that while a fully neural network-based synthesizer can generate amazingly realistic sounds, which method sounds better comes down to each user’s preference. Our original hybrid engine generated a “classical” voice that was realistic texturally, but robotic structurally. Even so, many of our users like the warm, malleable, and mechanical touch of the sounds from our first-generation engine, so we decided to keep it until the very last version of Synthesizer V Studio Pro, 1.11.2, released in the second half of 2024.

However, supporting two vastly different algorithms at the same time complicated the development of Synthesizer V Studio in several respects. A common pattern was the need for extra data conversion steps, during which some precision was lost. As a result, our neural network models ended up sounding just a little bit smoother inside Synthesizer V Studio than when running as a prototype. The difference is barely noticeable in most cases, but after listening to synthesized vocals for hours every day, our devs eventually became experts at spotting it.
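As a rough illustration of how a conversion step can smooth a signal, the sketch below round-trips a control curve through a coarser representation and back. It is a generic example, not the conversion Synthesizer V Studio performed: the block-averaging scheme, the factor of 2, and the names `downsample`, `upsample`, and `roughness` are all assumptions made for this example.

```python
def downsample(curve, factor):
    """Average each block of `factor` consecutive values into a single value."""
    blocks = [curve[i:i + factor] for i in range(0, len(curve), factor)]
    return [sum(block) / len(block) for block in blocks]

def upsample(coarse, factor, length):
    """Repeat each coarse value `factor` times, trimmed to `length` values."""
    return [coarse[i // factor] for i in range(length)]

def roughness(curve):
    """Sum of absolute frame-to-frame changes, a crude measure of fine detail."""
    return sum(abs(b - a) for a, b in zip(curve, curve[1:]))

# Hypothetical control curve with fast frame-to-frame variation.
original = [0.0, 1.0, 0.0, 1.0, 0.5, 0.5, 1.0, 0.0]
round_trip = upsample(downsample(original, 2), 2, len(original))

# The round trip keeps the overall level but loses the fine detail.
detail_before = roughness(original)
detail_after = roughness(round_trip)
```

The averaged curve has the same length and overall level as the original, but the fast variation is gone, which is the kind of subtle smoothing that only becomes obvious after a lot of careful listening.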

Thankfully, this loss of precision was fully resolved in Synthesizer V Studio 2 Pro, thanks to a complete rewrite of the engine undertaken when development started. We also managed to carry some of the characteristics of the original hybrid engine over into fully neural network-based models, while keeping the singing styles and timbre as close to the original recordings as possible. These re-trained voices are available as free compatibility updates for the concatenative synthesis-based version 1 voices. Of course, for those who might want to keep using the original hybrid engine, Synthesizer V Studio Pro version 1 is still available.

Find out more about Synthesizer V Studio 2 Pro, and try the software and any voice for 7 to 14 days.
