
Microsoft AI Can Impersonate Your Voice With Just 3 Seconds Of Audio (greatgameindia.com)

According to Microsoft, a new AI named VALL-E can impersonate your voice with just 3 seconds of audio and can also match the speaker’s “emotional range” and tempo, making it a highly accurate type of mimicry.


5 Comments
stevencasteel
1 year ago

Yeah, I’ve seen a few papers on this.

Imagine what it can do with 10 minutes of audio.

Image generating AI like Stable Diffusion can make perfect renders of your face with just 8 training photos.

Deepfake video currently takes hundreds of images and a powerful graphics card but it’ll get optimized soon.

Presbyopia
1 year ago

Good if you have a condition which means you will lose your speech and need a machine to communicate. But we all know how else this will be used.

Rosey
1 year ago

No, it can’t. Not yet. So unplug from anything that can collect this data. The horse has bolted, I know.

stevencasteel
1 year ago
Reply to  Rosey

Yes, it can. There are audio demos right here, alongside the white paper:

https://valle-demo.github.io/

There are also several sites like Uberduck dot com that have been around for about a year.

CaliGirlWonders
1 year ago

That’s why “smart” phones were manufactured after the invasion of digitized audio and video. It’s all “invented” by the CONs to further control us. They can only mimic Nature and the Divine Order. They are NOT superior to us. They invert everything.