While it was already possible to make computers sing, it is extremely easy to tell a computer-generated voice from a human being singing. The output can be tweaked by hand to improve it, but that takes a lot of time and skill, and needing a human to make it work better defeats the whole purpose of a singing computer in the first place.
The Japanese researchers who solved this problem claimed they used an evolutionary process. They ran Vocaloid (an advanced program that can create computer-generated vocals) eight times, each time with random tweaks. A human producer then listened to all eight tracks and adjusted sliders in the software to rate how well each “frequency curve” modification performed in certain respects. The curves that produced the best results were kept and used to create a new set of tracks, which were then rated again. The process was repeated until the perfect set of frequency curves was found, which could then be applied to create the most realistic computer-generated singing voice.
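To make the loop described above more concrete, here is a minimal Python sketch of that kind of human-in-the-loop evolutionary search. It is not the researchers' code and does not call Vocaloid. The names `random_curve`, `mutate`, `crossover`, `evolve` and `simulated_listener` are hypothetical, and so are the 16 control points per curve and the single numeric score (the article describes several sliders). The only figure taken from the article is the population of eight. The producer is replaced by a stand-in function so the example can run on its own.

```python
import random

POPULATION_SIZE = 8   # the article mentions eight renditions per round
CURVE_POINTS = 16     # assumption: control points per "frequency curve"


def random_curve(rng):
    """Create one random candidate curve (hypothetical representation)."""
    return [rng.uniform(-1.0, 1.0) for _ in range(CURVE_POINTS)]


def mutate(curve, rng, strength=0.1):
    """Return a copy of the curve with small Gaussian tweaks on every point."""
    return [p + rng.gauss(0.0, strength) for p in curve]


def crossover(a, b, rng):
    """Mix two parent curves point by point."""
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]


def evolve(rate, generations=30, keep=3, seed=0):
    """Keep the best-rated curves each round and breed the rest from them.

    `rate` stands in for the human producer: it takes a curve and returns
    a score, where higher means "sounds more natural".
    """
    rng = random.Random(seed)
    population = [random_curve(rng) for _ in range(POPULATION_SIZE)]
    for _ in range(generations):
        ranked = sorted(population, key=rate, reverse=True)
        survivors = ranked[:keep]          # curves with the best ratings
        children = []
        while len(survivors) + len(children) < POPULATION_SIZE:
            a, b = rng.sample(survivors, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = survivors + children  # next set of tracks to rate
    return max(population, key=rate)


def simulated_listener(curve):
    """Stand-in for the producer: prefers curves near an arbitrary target."""
    target = [0.5] * CURVE_POINTS
    return -sum((p - t) ** 2 for p, t in zip(curve, target))


if __name__ == "__main__":
    best = evolve(simulated_listener)
    print("score of best curve:", simulated_listener(best))
```

In a real setup the `rate` function would be the slow part, since it means synthesising eight tracks and waiting for a person to listen to them, which is presumably why the population is kept so small.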
In the future we could have completely digital pop stars, or aging singers who sound young forever because their voices have been preserved digitally. It does make you wonder whether anybody could simply replicate such voices, which could lead to a new form of theft or impersonation. Maybe Microsoft Sam could use this technology to sound less like a joke.