
AI voices are officially too realistic!


AI-generated voices are nothing new; they’ve been around for decades. That said, the digital voices we’ve experienced over the years wouldn’t exactly fool anyone. Now, though, I think digital voices have reached the point where they can be scary, and for a few specific reasons. So, have AI-generated voices finally become too realistic?

We’ve come a long way from the clinical, disjointed voices of the 2000s and 2010s, when Google Assistant and Alexa were about as good as it got. With the generative AI boom, however, came a huge push to make AI more realistic, and you can bet that had a profound effect on how much work companies put into their digital voices.

Now, think about the voices that OpenAI showed off when it launched GPT-4o. Right now, there are four voices on the platform. We also can’t forget about Google’s Gemini voice. While they all sound realistic, I don’t think we’d yet seen just how insane these voices could get. It wasn’t until I tried Google’s new tool that I realized digital voices might have crossed the threshold into realism.

NotebookLM showed me that digital voices are too realistic

In case you haven’t heard of it, Google released a product last year called NotebookLM. Think of it as an AI-assisted notebook: you upload sources and documents about a certain subject and keep track of the material, and Google’s AI reads through what you uploaded and extracts information from it.

Using this tool, you can ask questions about the material you uploaded. Think of it like using a chatbot trained only on the material you uploaded. Imagine uploading an entire textbook about physics, and being able to ask questions about the material in it.
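To make that idea concrete, here’s a minimal, purely hypothetical sketch of a “chatbot that only knows your uploaded material.” This is not NotebookLM’s actual implementation or API; it’s a toy keyword-retrieval example in Python (the document names and functions are invented for illustration) that only answers from the documents you give it.

```python
def build_index(documents):
    """Split "uploaded" documents into passages we can search later."""
    passages = []
    for name, text in documents.items():
        for paragraph in text.split("\n\n"):
            if paragraph.strip():
                passages.append((name, paragraph.strip()))
    return passages


def answer(question, passages):
    """Return the passage that overlaps most with the question, or admit ignorance."""
    q_words = set(question.lower().split())
    best = max(passages, key=lambda p: len(q_words & set(p[1].lower().split())))
    source, text = best
    if not q_words & set(text.lower().split()):
        return "I can only answer from the material you uploaded."
    return f"Based on '{source}': {text}"


# Usage: "upload" a source, then ask a question about it.
docs = {
    "physics_notes.txt": "Force equals mass times acceleration.\n\n"
                         "Energy is conserved in a closed system."
}
print(answer("What does force equal?", build_index(docs)))
```

The point of the toy is the constraint, not the retrieval method: every answer is grounded in your own material rather than in whatever the model learned elsewhere.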

While the platform itself is nothing new, there’s a new feature that Google dreamed up and is testing now: you can have Google generate a podcast-style discussion based on the information you uploaded. When I say podcast-style, I mean it’s meant to sound as though two people actually sat down at a microphone and recorded a real podcast.

The voices sound disturbingly realistic for several reasons. The sentences flow naturally, and the speakers’ cadence and inflection feel extremely human. Not only that, but Google even captured some of the little things that differentiate man from machine. I could hear breath noises and the “ums” and “likes” you hear when people talk in real life, and there was even an instance where one of the speakers had a false start on a word and corrected himself. Google even went so far as to have one of the speakers laugh.

It’s one thing to create a voice that sounds good when giving a direct response or reading from a script. However, it’s a whole other beast designing a voice that sounds like it’s having a human discussion. And Google nailed it.

During the podcast episode, one thing that stuck out to me was this:

Speaker #1: “So, the article specifically calls out two apps. USB Audio Pro and Musicalot. Have you heard of either of those?”

Speaker #2: “USB Audio Pro. That rings a bell. I think a friend of mine uses it.”

The voice literally referenced a friendship between one of the speakers and another person, and that’s just one example among many.

Google’s voice did the scariest thing…

Okay, so it’s good, but there are other good digital voices out there. What makes this one different? Well, it did probably the scariest thing an AI voice could do… it made me forget.

I uploaded one of my articles and had it create a discussion. NotebookLM spat out a 12½-minute mini podcast episode. I started listening, and the shock of it being an AI-generated discussion wore off. After a few minutes, I briefly forgot that I was listening to AI-generated voices. Maybe it was for a minute, maybe only 15 seconds, but Google has mastered the art of making voices sound grounded and realistic.

As you could guess, that scared the hell out of me. I knew that it was AI-generated, but it was so realistic that I actually forgot.

Final puzzle piece

Companies are trying their hardest to shove their AI products down our throats, and for several reasons. Sure, some are just trying to keep investors happy, but there are also misguided companies that would love for you to forget the value of human-made content. We’re already seeing platforms that generate entire videos for you, complete with an AI-generated avatar, an AI-generated script, and an AI-generated voice.

Not only that, but companies like Wix are advertising that users can create entire websites in minutes with AI. We also can’t forget about the AI dating apps. Hell, there’s even a social media app where the AI generates its own content and posts it by itself. We’re living in a world where we’re starting to forget the beauty of human creation, and what makes it worse is that people are actively endorsing this trend.

Now, with AI voices getting so good, this trend is only going to get worse. The thing is, people connect through speech; a warm, human-sounding voice can make a person bond with something. And that effect is only amplified as companies make their voices sound more personal and tailored to the individual.

Realistic voices are one of the final pieces of the puzzle needed to make a person fully connect with an AI. If you listen to an AI with a cold, janky-sounding voice, it’s a constant reminder that you’re talking to a robot. Once the voice becomes realistic, there’s a much higher chance you’ll start to see it as human.

So, what could happen down the road?

We’re at what feels like a tipping point in human-AI relations. Some people are already forming connections with AI; OpenAI even issued a statement urging people not to fall in love with ChatGPT. Do you know what’s messed up about that? Everyone old enough to form that kind of connection grew up in a more traditional world where the only interactions were human.

But with companies pushing the boundaries of how human AI can sound while shoving their products down our throats, what about the next generation, or the one after that? Imagine a child born tomorrow growing up in an increasingly AI-driven world. What will that child be like in 2040, when they’re a teenager? How many LLMs will have shaped that child’s life? Will that child understand how wrong AI-generated relationships are if they’ve been taught by a chatbot rather than a teacher?

Now that the voices sound so real, what’s the use of recording a podcast when you can just generate one? Sure, people today will stomp all over an AI-generated podcast, but think about how things will look in a few years, when AI is more normalized. Younger listeners who grew up around AI most likely won’t care. Rather than praising a group of podcasters, listeners will praise the model that was fed the data.

With AI voices sounding so realistic, humanity is one step closer to actually forgetting about humanity itself. Google has mastered the art of the voice, and we have no idea what sort of consequences will follow.
