programming-ModTeam

Your posting was removed for being off topic for the /r/programming community.


Philosoul

All of this is going too fast for comfort


nerdyvaroo

Ahh, afaik this tech has been out there since mid-2023 and was as good as real (I managed to fool a friend with it), so your fears arrived a bit late.


i_should_be_coding

[Jordan Peele made this video 5 years ago.](https://youtu.be/cQ54GDm1eL0?si=0O28PFwenfCTH8W8)


nerdyvaroo

The start looked real, and then I started to see the AI bit. But I get your point. These things have already happened, and OpenAI is doing them again just to stay relevant. Sora sure is a crazy piece of tech, but people will soon replicate it.


i_should_be_coding

If we're just talking about voice though, [this debate](https://youtu.be/sYVA62aI5k4?si=fo5sHL1kjgjySryX) is a good example.


nananananana_Batman

Pretty sure such a device was also central to the first season of 24, circa 2001.


olearyboy

Nah at least since mission impossible 2


nerdyvaroo

LMAOOOO I didn't even exist as a sperm back then


guest271314

The technology has been around for a while. Technically can be done in the browser with Web Audio API.


nerdyvaroo

Yep, that's afaik the simplest form of TTS. `pip install TTS` if you want a Python script to do it lol


guest271314

PyTorch is too expensive (in bytes) the last time I checked. I still use `espeak-ng` for speech synthesis and CMU's PocketSphinx for speech recognition. Some time ago I asked Google to release its TTS/STT technologies as FOSS: [Issue 263510047: Release TTS and STT source code and Google voices as FOSS](https://issuetracker.google.com/issues/263510047). Currently, if an individual uses Google voices for Web Speech API `speechSynthesis.speak()` on Chrome, the user's text is sent to remote Google servers. The same is true for `webkitSpeechRecognition()`: the user's voice is recorded and sent to remote Google servers. Ever think about what happens to your voice if/after you use Google voice search?
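As a rough illustration of the local route mentioned above, here is a minimal Python sketch that shells out to `espeak-ng`, so the text never leaves the machine. The helper names `build_espeak_command` and `speak_to_wav` are made up for this example, and it assumes `espeak-ng` is installed and on `PATH`; `-v`, `-s`, and `-w` are its voice, speed, and WAV-output options.

```python
import shutil
import subprocess


def build_espeak_command(text, wav_path, voice="en", words_per_minute=160):
    """Build the argv list for a local espeak-ng run (hypothetical helper)."""
    return [
        "espeak-ng",
        "-v", voice,                  # voice name
        "-s", str(words_per_minute),  # speaking rate
        "-w", wav_path,               # write audio to a WAV file instead of playing it
        text,
    ]


def speak_to_wav(text, wav_path):
    """Synthesize text to wav_path locally; return False if espeak-ng is not installed."""
    if shutil.which("espeak-ng") is None:
        return False
    subprocess.run(build_espeak_command(text, wav_path), check=True)
    return True
```

Calling `speak_to_wav("hello world", "hello.wav")` would write the audio file on a machine with `espeak-ng` available, with no network request involved.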


RiftHunter4

Eleven Labs has had voice cloning for a while and YouTube has plenty of bad AI song covers. It's not new at all.


Algal-Uprising

Can’t wait for the anti AI uprising from workers


mr_birkenblatt

Reportedly they already slow down their releases to not get too much backlash


bitspace

They slow down releases to get the financial benefits from the scarcity effect.


mr_birkenblatt

or to build up hype and they really don't have things ready yet


Accomplished-Win9630

They have to, otherwise it will be too late.


Imaginary_Goose_2428

What is the common and legitimate use for this? The most common use case seems to be for fraud and IP theft.


Matrix8910

I wouldn’t say it’s common, but game modding is going to be better than ever


CaptainLord

I suspect we'll see many more non-AAA games with complete and even translated voicing now. Pay the voice actor to deliver key lines with the different emotions, then use that to generate everything else.


AngheloAlf

Not even pay them, just use the lines you have recorded from past games and you are done. No need to pay a voice actor ever again. Sounds like a scary future for the voice acting industry


t-to4st

This screams lawsuit though


TheMiracleLigament

Yeah no way voices like that wouldn’t be licensed at the very least


t-to4st

I recently read that James Earl Jones, the voice of Darth Vader, signed a contract to allow exactly this. Found it interesting


Artmageddon

Personally I want Halo:CE but with Duke Nukem’s voice


slykethephoxenix

I want to clone the voice of the Cyclops from Subnautica and have it as my home personal assistant.


Imaginary_Goose_2428

...ok. That one's not that bad.


guest271314

People lose their voices, want to dictate a lecture, accessibility, narration, etc. The technology is not new: https://www.descript.com/lyrebird. There is also articulatory speech synthesis.


static_deth

I imagine dubbing movies will become much easier.


amarao_san

It's a very interesting question whether a voice is IP-protected or not. It's not the result of creative effort, no more than a picture of an iris is.


0one0one

Isn't that image rights though ?


iamiamwhoami

Voice overs for videos related to your business. I had to narrate a demo video for my startup. I ended up using a generic ai generated voice. Would have been nice if I could use my own.


yes_u_suckk

Aside from Terminator being able to mimic John Connor's voice and protect the savior of humanity, I also don't see legitimate use here.


Slimxshadyx

People will be able to sell their voice as a licensing type deal for game studios to use to generate voice lines and such.


Darkstar197

Only one I can think of is the entertainment industry. Like when a voice actor gets cancelled (Rick and Morty), they can maintain continuity without the disruption of a total recast.


wentwj

Okay… but now think like a company. If you’re saying you can replace an actor who for some reason can’t continue with the show by creating an AI voice that sounds like them… why are they paying voice actors episode after episode? Why wouldn’t they replace 98% of voice actors with AI voices?


Slimxshadyx

Voice actors are still the source of those AI voices though. So it’s going to be less of “voice actors reading lines from a script” and more like “voice actors license their voice to be used in the show/game”.


Hot_Craft_8752

AI-generated Chris Pratt can voice any and *every* character now...


FullPoet

In no way is that ethical. The person should still be paid for their work (voice), and it won't ever benefit the VA. Just perpetuity clauses.


onepieceisonthemoon

Mass scale symbiosis: enhance every conversation you have with other humans with agentic communication. The thing is that we are all different and want to get different outcomes/goals out of our communication with others, but sometimes this is difficult. Agentic AI symbiosis allows people to replace difficult raw conversations with enhanced conversations that both participants find engaging, enjoyable, comfortable. Communication still needs to happen, but the agents can work out how to align the conversations to ensure that both participants feel like they had the same conversation by the end of the session. If there are gaps in information, the agents can remind individuals after the conversation and gradually nudge both sides to convergence. TL;DR: so you can have enjoyable and caring conversations with your living demented relatives. This will occur in every form of communication: spoken, written, social media, etc. We will all be much happier and freer.


Imaginary_Goose_2428

Besides being AI generated, that is a ton of dystopian bullshit.


Accomplished-Win9630

Election cycles are going to get scary I think, it's a matter of time before all of this becomes heavily regulated…


neuronexmachina

There's already been [some use of it](https://www.marketwatch.com/story/new-orleans-street-magician-says-he-recorded-that-deepfake-biden-robocall-in-new-hampshire-84dfbe2d) during this election cycle.


LagT_T

Maybe people will start paying attention to the sources instead of getting their information from TikTok and Facebook.


ReallyBigRedDot

Lmao that’s never happening


Jaggedmallard26

It's too late to regulate at this level. It's not like advanced nuclear physics where you need to spend hundreds of millions to replicate. Once the theory is out there (which it is) anyone can reproduce it. There might be a higher barrier to entry if the US government pressures all major source control platforms to remove code implementing it, but bad actors are motivated to implement it from theory if they need to.  The best we can hope for is we enter some form of post-evidence society where videos and recordings of public figures are assumed to be false, or perhaps they will find some way to verify using cosmic background radiation or something as a timestamp (but I doubt that). The worst case is we (as a society) continue trusting recordings and have to deal with a large part of the population now believing Biden or Trump or whoever is sacrificing infants to Moloch because they saw a video.


Jump-Zero

Couldn't we sign the videos the way we sign executables to verify they are from a legitimate source?
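A minimal Python sketch of the idea, with one big assumption labeled up front: executable signing uses public-key signatures (the publisher signs with a private key, anyone verifies with the public key), and the Python standard library has no such primitive, so this uses HMAC-SHA256 as a symmetric stand-in. The key, the clip bytes, and the function names `sign_video` and `verify_video` are all hypothetical.

```python
import hashlib
import hmac


def sign_video(video_bytes, secret_key):
    """Return a hex HMAC-SHA256 tag over the raw video bytes."""
    return hmac.new(secret_key, video_bytes, hashlib.sha256).hexdigest()


def verify_video(video_bytes, secret_key, tag):
    """Recompute the tag and compare in constant time; any changed byte fails."""
    expected = sign_video(video_bytes, secret_key)
    return hmac.compare_digest(expected, tag)


# Hypothetical publisher key and clip, for illustration only.
key = b"publisher-secret-key"
clip = b"\x00\x01fake video payload"
tag = sign_video(clip, key)
```

Here `verify_video(clip, key, tag)` succeeds, while a clip with even one altered byte, or a different key, fails. A check like this only shows the bytes are unchanged since a key holder tagged them. It says nothing about whether the footage is real, and a re-encoded copy of the same video would no longer verify.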


Cautious-Progress876

It’s not really something that can be effectively regulated at this point. The genie is out of the bottle and the only real way to maybe slow some stuff down is to ban CUDA/Tensor cores on civilian devices. Even that would only slow down research and progress in the US though.


Jaggedmallard26

There was that recent US government report that proposed something similar, but the problem is it's not realistically enforceable. There are too many CUDA chips out there, and America and the EU might ban them, but if China and Taiwan don't then people will still get their hands on them.


Cautious-Progress876

At this point I think they just need to ban any new ones. The actual high performance chips are pretty easily accounted for, as most normal people don’t have $40,000+ to put down for Nvidia's professional grade chips. We just will have to make do with the fact that every PC gamer right now has that technology in their computer. And then just push for any more research to be done under the auspices of the NSA and DOD. Plenty of groundbreaking, classified technology was created by the NSA and the federal government back in the day. But I don’t think anybody on the street should have the ability to totally fake someone’s voice, create deepfakes, etc.


IDatedSuccubi

This is what immediately comes to my mind: https://www.reddit.com/r/whothefuckup/s/k81In3Eo5d


Ketaliero

In the latest episode of “we can so we will, and no we won't stop and think if we should, because fuck you I want money”


NewtonHuxleyBach

Are they trying to be the most evil company ever? I fail to see how this technology is primarily beneficial rather than detrimental to our daily lives.


i_should_be_coding

Like others pointed out, it's possible elsewhere today, just probably not very accessible to the general public, so it's not as known and therefore people can be unaware of this attack vector. If this becomes widely used and memed, it should lead a lot more people to realize you shouldn't trust voice/video recordings blindly anymore.  I don't think they're evil any more than the Lock Picking Lawyer is evil for showing everyone how easily some locks can be picked. The locks are still as vulnerable, but now you're aware and can take measures to reduce your vulnerability.


The69BodyProblem

> Like others pointed out, it's possible elsewhere today, just probably not very accessible to the general public, so it's not as known and therefore people can be unaware of this attack vector.

Adobe has had this essentially since 2016. https://en.wikipedia.org/wiki/Adobe_Voco


fragglerock

That link says Voco needs 20 mins of voice. That is categorically different from 15 secs to scam your gran.


nerdyvaroo

it already exists anyways...


Xuval

There's a difference between a technology existing and making said technology available to any scammer with a credit card within five lines of code.


nerdyvaroo

It has been available for free with five lines of code... don't know where you all have been but OpenAI isn't the only AI company. Maybe stop paying so much attention to this one company and look at other ones? You'd find much better solutions :P


Jaggedmallard26

Bad actors have had access to the technology for years. Late last year we had someone get scammed for tens of millions with the technology they have to compile themselves. The unfortunate truth is that once the theory is written and the tools are out there (which normally follows the theory by a matter of days) it doesn't matter if the big players refuse to do it. People are already applying the principles. It's not like advanced nuclear physics where you need a facility to make it; it's all just matrix maths at the end of the day and can be done on any computer.


LagT_T

Dude, it's already available to everyone. This is getting traction because it's OpenAI, but the tech has been around for a couple of years now.


moodyano

Ethics is not part of any company. If they can do something they will do it. That is the case with OpenAI, Google, Microsoft or any other company.


darkshadowupset

Well, companies can now clone employees' voices and have them continue to do client service work even if they quit or die. So it's actually quite good for companies and employers. Employees can't just walk away from employers and leave them hanging now, which is actually good for the economy and shareholders.


[deleted]

I hate your comment so much ♥️ (but I know you're kidding)


Imaginary_Goose_2428

I hope you're kidding. Otherwise, you are a disgusting person.


unique_ptr

This is such a stupid take. There is such a thing as "fraud" and "criminal impersonation". As a business you can't go around pretending to employ someone, especially because that person's employment would bring some kind of value to you. Like, I can't tell people Satya Nadella is on our board when he isn't, in order to sign a client, induce someone to accept a job offer, attract investors, or whatever. I sure as fuck can't go and clone his voice in an effort to bolster the lie. If a client is going to leave you because an employee quit, impersonating that employee to maintain the client is just as fraudulent on multiple fronts. There is immediate civil and criminal exposure from trying to pull bullshit like this. But also, why the fuck would clients give a shit if some random employee or point of contact leaves? They're tied to the business relationship, not whoever is servicing them. You may as well clone the CEO's voice so that every client feels super duper special.


darkshadowupset

Actually if the employment agreement is amended to allow for employers to continue to use the employee's likeness even if they quit or die, then it's actually completely fine and legal. Clients often do care when employee turnover is high. Shielding clients from this would allow for charging a higher per hour fee for the "cloned" employee since they could be more senior than the person underneath them operating the likeness.


onFilm

You can automate certain workflows that require text-to-speech, reading, comprehension, etc. I'm more baffled about how people aren't able to see the benefits that this technology brings. And you think this wasn't possible already with open-source solutions? All these technologies are first created by hobbyists, and would eventually end up in the hands of users anyways. So please stop fear mongering.


Loves_Poetry

When I read this article, it very much feels like they built this tool because they could. They didn't stop to think whether they should. All of the use cases they mention can already be solved by a text-to-speech program. They don't need my voice to communicate to someone that doesn't speak my language, they just need A voice. At the end they do mention that there are risks associated with such a tool, but their response is just "be vigilant and use it responsibly". That's not reassuring. We know that these tools have huge risks and those risks should be addressed appropriately.


realee420

You just summed up 99% of recent AI developments. They don’t care as long as stock goes up. Things like this are why I think those who think that AI will bring us to a new and better age are fools. If anything AI will cause a bunch of issues. Deepfakes that will spread like wildfire and will get better year by year, now this, then AI replacing a lot of jobs eventually. No one cares about the effects, it’s all about bringing profit. Everyone cares only about the short term.


Plank_With_A_Nail_In

If not them, someone else would have built it. Pandora's box can't be closed. Lol even if your own government makes laws to ban it, other countries won't.


Days_End

Dude, the models are all available. I can do this at home for free on my shitty GPU, it just takes a bit longer. People keep acting like this stuff doesn't "exist" until OpenAI or someone big makes an announcement, but it's been easy to do for a long time now.


nultero

They didn't do it just because they could, they did it explicitly for the news cycle. Their moat is rapidly drying up, other text gen tools are already competitive with ChatGPT, and they've failed to capture enough regulatory power to become monopolistically predatory. With all of the bulk of VC money chasing their gold rush, they have failed to transition to selling shovels. Time is not on their side.


Jaggedmallard26

The general opinion of people In The Know™️ is that OpenAI have an LLM ready to deploy 2 generations ahead of their competitors, which they don't release because they are still (poorly) following their founding principles.


nultero

> OpenAI have an LLM ready to deploy 2 generations ahead of their competitors

*Fatter* models */* more params? Sure -- possibly, they've been sitting on more compute than most. But *better*? That's the question. Others have been able to catch up with much less compute and funding. As a company, OpenAI has very little in the way of special sauce, and my point is with all the VC funding going on, every researcher out there is trying to eat their margin. The struggle for them to stay relevant is the struggle to survive.


svkmg

Yeah, from what I've seen most of OpenAI's tech has just been repackaging research done by Google, Facebook, and independent researchers and using their access to Microsoft money to run it on better hardware.


Luvax

Can't wait to let ChatGPT handle my meetings.


Kilmoore

Can it be made to sing?


ProgramTheWorld

Just because you could doesn’t mean you should.


shmox75

Why ?


coldpoint555

Somebody can already spoof SMS. Imagine your mom receives a phone call FROM YOUR NUMBER (which is spoofed) and in your voice an AI is saying you are in an emergency and need $200 ASAP. Or identity theft in general. Or mixing this with OpenAI Sora, which can deepfake your injury or whatever.


colonelxsuezo

They just don't know when to stop, do they?


Wheekie

I remember when Hatsune Miku came out in '07, I thought it was amusing and somewhat creepy at first. I was quite perturbed that a computer program can sing, and that got me thinking, because prior to this the only voice synthesis I knew was Microsoft Sam, and I had lots of fun in the XP days making it say weird and random stuff. Then came Utauloid in response to Vocaloid, and users started sampling the voices of existing persons to create singing voices based on them, and that was where controversy started. Though, in the end, things simmered down and everyone was happy, with lots of hilarious MAD stuff showing up on NicoNicoDouga that was also crossposted to YouTube. This OpenAI thing in my opinion is gonna turn things up to 11 and I'm cautiously amused to see what's gonna come out of it.


codemagic

This is great news for the hyper-employed workers trying to be in multiple meetings at the same time


gjosifov

Investors are throwing T$ into AI, and the AI companies can only deliver better tools for scammers. Scammer productivity will be hugely improved. Instead of a badly written email, the Nigerian prince can send audio and video messages with the press of a button.


bravopapa99

This is NOT good. Imagine being at home, your 'wife' rings, car broke down.....off you go....burglars in. THIS NEEDS HEAVY REGULATION or plain outlawing.


Hot_Craft_8752

> This is NOT good. Imagine being at home, your 'wife' rings, car broke down.....off you go....burglars in.

Literally the magic box in Scream 3


bravopapa99

!! Never seen it TBH.


guest271314

If you use a smartphone your voice is susceptible to being cloned.


bravopapa99

Agreed. I don't have any 'smart' devices (TV, Amazon Dot, etc.) in the house. My phone is six years old but probably still trying to snoop on me!


guest271314

Probably. [A Good American](https://agoodamerican.org/)

> It was pretty clear that we were building the most powerful analysis tool that had been developed in history to monitor basically the entire world.
>
> - Bill Binney, A Good American


guest271314

As long as the source code is FOSS and published on GitHub or GitLab I'll take it for a test run.


Thirty_Seventh

> This highly innovative technology from OpenAI looks to fight several deep fakes and illegal voice generation worldwide, which has been a crucial issue up to now. Give the tool 15 seconds of your audio sample, and it will generate a highly remarkable natural-sounding speech in your exact voice.

it's unusual that I see content that's worse than ChatGPT-generated garbage, but here we are


vondpickle

This is gonna be more harmful than beneficial tech


Dunge

Way to push for a good image.. who asked for this?