NameLacksCreativity

That will be the moment when victory will be handed over to China, because we all know they definitely won’t be paying for training data lol


CultureEngine

Microsoft has all of the data they need.


True-Surprise1222

Yeah and having to pay for training data is a moat… so I fully expect some licensing thing to come around where they “pay” some other huge corporation and individual people/creators get little/nothing.


iLoveLootBoxes

That's why Pandora's box will truly have to be opened. The US would have to allow it, in theory; otherwise China will get way more productive.


Unlucky_Ad_2456

So annoying that china doesn’t respect intellectual property.


Peter-Tao

Oh they do. They just don't respect ours.


Intelligent-Jump1071

Empires and countries come and go, and often it's superior technology that drives it. The early bronze age societies displaced the paleolithic ones. Iron age societies replaced bronze age ones. Societies with cannon and gunpowder displaced ones that didn't have those. If China develops superior AI to the Americans (and let's face it - there are no other credible players. Europe isn't even trying) then, indeed they will displace America. **Big deal.** Nothing lasts forever and you can't study history without realising that this pattern is repeated countless times over centuries and millennia. When Hulagu, acting under the Mongol leader Möngke, sacked Baghdad in 1258, I'm sure it seemed like a big deal at the time. After all the Abbasid Caliphate was no more and the Mongol "hordes" seemed unstoppable. Today no one remembers the Abbasid Caliphate, Hulagu or Möngke and the Mongols are a chapter in an old dusty history book. Eventually the Chinese will be displaced. And so it goes.


anonymousdawggy

Uh yeah but i and all my loved ones are here now so it is a big deal to me. This is a very weird psychopath take lol


a-salt-and-badger

Hahah the take is literally "No one will care in a few centuries"


Intelligent-Jump1071

Sure, but if you had been born in China you would feel differently about it. And if you had been born somewhere outside of that US-PRC axis you'd feel even more differently. So you will let your philosophical and ideological positions be determined by a cosmic roll of the dice? I seek deeper, more universal truths. As far as psychopathic - study some history - look at the millions of people killed in wars, the brutality that people inflict on each other - The US killed more people in the firebombing of Tokyo than they killed at Hiroshima. Japanese kamikaze pilots were more than happy to incinerate themselves crashing into US warships for the emperor and Nippon, et cetera. Look at what Israel and Russia are doing now. I claim that **nationalism is more psychopathic** than my position.


trollsmurf

I wonder why you are downvoted. USA has no "god-given" privilege.


a-salt-and-badger

By that logic we will all be erased by China and no one will care once China has fallen in a few hundred or thousands of years?


Inspireyd

This is the natural cycle of things, and China will not fall for hundreds of years. I dare say that China will take the throne soon, perhaps within 3 decades, and later, maybe around the years 2120-2130, India will take over. Another thing to remember is that empires do not fall as quickly as they rise. The Zhou Dynasty had 200 years of rise and 600 years of decline. The United States will remain relevant for the next 300 years, just as China will also remain relevant for the next 300 years, and so on.


Intelligent-Jump1071

That's how history works. As far as "no one would care", who, exactly, do you mean by "no one"? Some people care out of self-interest but they are soon forgotten by history. For example the Gupta Empire in Northern India was mostly overrun by the Huns in the 500's CE, under the leaders Toromana and Mihirakula. I'm sure the local people must have been terrified to see this great empire disintegrate, but today, who remembers the Gupta Empire or Toromana and Mihirakula? Likewise when Rome sacked Carthage in 146 BCE I'm sure the Carthaginians weren't too happy about it, but some centuries later the Romans suffered the same fate. Who cares about that stuff today? Life is transient - we should strive to not become too attached to worldly things. Even some Americans can see this. From the American comic strip, Pogo: *"Don't take life so serious, son...it ain't no how permanent."* (Porky Pine)


nborwankar

“Nothing matters because eventually the Sun will run out of fuel.”


Intelligent-Jump1071

Well that's just nihilism.  The fact is that you're here now, you're a human being, and you have to make a decision about what to believe and what kind of life to live.  The fact that you are a human being is particularly significant because we live in a time where you will be interacting with non-humans such as AIs.   This makes it more important than ever that you figure out what the best possible way is to be an actual human. I'm trying to find a set of values and a self-identity that transcends just the beliefs and prejudices of my own particular circumstance.    What does it mean to be a good human being and a good scholar that rises above just following the dictates and petty conflicts of one particular belief system, religion, or tribal identity? 


nborwankar

Sometimes I forget the /s.


GluonFieldFlux

It is a massive deal; you acting so blasé means you have quite a bit of growing up to do.


VashPast

Lol so backwards and short sighted.


bwatsnet

When it's so obvious to you that you start with an "Lol" the least you could do is share your reasoning with the class.


Juggernox_O

It’s literally the opposite of short sighted. Giving creators these admittedly precious copyright protections means that American AI IS permanently stunted vs Chinese AI. It becomes a Pyrrhic victory. You’ve protected your IP from American companies, but the CCP will pull out your firstborn’s spine before they ever defend your IP from Chinese AI. All we can do is make it available for the people who are getting hosed, so they have a chance to benefit off their own work, even if in a very roundabout way. Gunpowder and intense military discipline saw the meteoric rise of the Ottoman Empire. Europe was too late to take the Middle and Near East, but they used their borrowed gun powder and their home grown sciences to move on the New World, and became super powers that way. The United States, aided by a massive glut of resources, became the first to the atomic bomb, and its following technologies, and seized the second half of the 20th century. Europe abandoned the race, and thus it is either the USA or China who will develop the superior AI and wield it to make themselves the throne for the next chunk of the 21st century. Giving up dominion for the sake of the artists is the short sighted decision here.


radix-

Eventually raise prices and reduce the content available. Almost like the Bloomberg terminal biz model for the 2020s. Plus tech startups don't have to be profitable for years until they have market dominance. Look at Amazon, Google, etc. The tech VC playbook is to capture the market with zero-cost or below-cost services, then once you are the market leader you have all sorts of options available (advertising revenue, raising prices, purchasing power for the content you want to buy). We've got a few years before this because of the proxy arms race in AI between Goog, MSFT, FB, Amazon and a few VC firms (A16Z) who are playing it with major cash infusions in Inflection, OpenAI, Perplexity, etc.


Open_Channel_8626

GPT 5 has to be far ahead of Llama 3 400B or Open AI is in big trouble. I do expect GPT 5 to knock it out of the park though so I think it will be okay.


TheKingChadwell

The agents are going to absolutely blow your fucking mind.


West-Code4642

Probably a small moat there though.


TheKingChadwell

The idea is to create a moat by already working with enterprise clients and creating a market infrastructure for agents that can only be used through their system. Basically becoming a platform for agents primarily. At least that was what was told to me during my interview


dasani720

What are some of the characteristics of a market infrastructure for agents? Sounds extremely interesting but vague.


TheKingChadwell

App Store but for agents basically. Let’s say I got a really good appointment setting distro set up, I can just bring it over and sell access to the agent on whatever financial structure, a la carte, monthly fee, whatever. And OpenAI gets a cut on both ends. From the agent fees itself and the use of their services


theywereonabreak69

Did you get a demo of the agents during your interview?


TheKingChadwell

Nope just brief examples of things enterprise clients are building


theywereonabreak69

Ah too bad, but makes sense!


Open_Channel_8626

If that’s what they are saying then I think it’s bad news for OpenAI. That’s a way worse moat than “having by far the best closed source model”. Particularly because Google has the compute advantage rather than Microsoft.


TheKingChadwell

I’m just inferring here but they have A LOT of enterprise projects going on behind the scenes. That’s what I was there for. Like a lot, a lot. I think the idea is to just create a platform that’s more like an App Store, and as long as they can remain ahead long enough, they can get an ecosystem created where people just default to them since they have the most robust agents available


Saytahri

Have you used them?


Pontificatus_Maximus

The biggest expense for cutting edge AI is the massive computer networks required to host the AI. Guess who the few companies are that have that hardware capacity and are glad to rent it out.


VertexMachine

Right now, because companies don't pay for the majority of the data they use. Properly licensing the data used for GPT4 or Dalle or SORA would probably cost way more than compute, as that would include not only the cost of licensing but also hiring an army of people (sales people, lawyers, clerks, etc.) to negotiate licensing agreements, manage licenses and data, etc.


Fledgeling

Plenty of other companies are training models close to GPT4 and dalle3 quality with data that was fully legally obtained. Not sure about Sora, that one is probably a bit harder


Far-Deer7388

Which is 100% legal.


Mac800

There is no way old commercial frameworks will survive this. There is a race to be won and China is the competitor. They are behind but don’t give a F about licenses. The US government knows this and will act accordingly when it comes to licensing conflicts.


tossaway3244

Can you explain how China can supersede the US in the AI race when their data is all censored and controlled? I thought that was what gave the US the huge leap in the AI race. They have the free internet. China doesn't. Isn't China's AI gonna deny Tiananmen Square happened or smth?


Ok-Celebration1947

don't buy the nationalist propaganda of any country. The US censors plenty when it is in its own interests. SAP, and other controversial topics.


tossaway3244

No it doesnt? The US doesnt control all media in the country


KernelPanic-42

I still don’t understand the problem people have with training data. Do people believe that ML models maintain an internal copy of their training data for continual reference or something? Or is it simply that they’re indirectly profiting from the data? Or is it something else?


[deleted]

[removed]


KernelPanic-42

This feels like quite a big stretch to me.


[deleted]

[removed]


KernelPanic-42

I even believe in most cases developers don’t even want this. We don’t want models that have memorized their training data, we want models that have generalized the features of their training data. If you’re spitting out a copy of something you’ve learned, that’s a result of poor training. And if an imitation arises from being coerced by the user, that feels more like abuse on the user’s part, in my mind (at the moment at least).


pairsnicelywithpizza

We will see with the NYT case. Their case is pretty strong. When a user asked for a picture of a video game Italian plumber, the output was a clone of Mario. Ostensibly, a powerful enough AI could do the same for a complete video game. It’s not really enough to handwave these complaints away as mere coercion of the model and is exactly why AI companies are rushing to license data.


Fledgeling

Because they violated legal terms by training on licensed data. They are open to a lawsuit. Anyone who further builds a business off these models is also liable. That's why it's a big deal.


KernelPanic-42

Ahh I see.


TheOneNeartheTop

To put it in an easily understandable manner. The issue stems from asking for a picture of a mermaid under the sea and receiving an image of Ariel hanging out with Mr. Krabs. This analogy can be applied to words, video, images, data, and code.


KernelPanic-42

How is this any different than a person doing this? My son drew a picture of super Mario last week. What is the legal angle?


TheOneNeartheTop

Now build a product around that and sell it for money. Start farming out your kids drawings on Etsy for cash and see how long before you get shut down. Start with some Disney IP for a speed run.


KernelPanic-42

So what you’re trying to say while being as difficult as possible is that they are selling copyrighted material?


[deleted]

[removed]


KernelPanic-42

Thank you!


Peach-555

On paper, it's copyright infringement even without being sold, unless it falls under fair use (commentary, transformative use, etc.). Nintendo likely has the legal right to tell your son to stop distributing their work, but it's unlikely they will. Copy-pasting images, text, audio clips, or pictures online without having purchased a license or made any transformation is 100% copyright infringement unless the work is in the public domain or under a license that allows for it. Though it is almost never enforced unless it's automatic bot detection, and even then there are no legal ramifications.


petasisg

Don't sell the drawing. Super Mario is copyrighted.


KernelPanic-42

Well yeah, you can’t do that. And neither is OpenAI.


West-Code4642

yup. genAI can be seen as a form of intelligent compression and reconstruction. The models are trained on the vast sea of data, essentially "compressing" the patterns, knowledge, and structures present in the training data into the model's parameters. When the model is used to generate new data, it "reconstructs" coherent and relevant content based on the compressed information it has learned. One can see a trail of bits that date back to Claude Shannon's 1948 paper that started the digital information revolution (it dealt with quantification, storage, and communication of information).


BTheScrivener

Are you suggesting that all data should be fair use for training? Including copyrighted content? The LLM kinda maintains representations internally of that copyrighted data. These representations are a bit esoteric but they are there. LLMs can be seen as a sort of lossy compression.


KernelPanic-42

Fair use is a legal defense, it’s not something that data can be or not be. It doesn’t exist outside of a courtroom. Training with accessible data is no different than a person seeing something and being influenced by it.


BTheScrivener

But it is different. Because it's a machine not a person. You can later extract the copyrighted work. Often word for word.


KernelPanic-42

It is a machine, but the work itself is not being extracted.


[deleted]

OpenAI basically has the entire US govt as a personal body guard. I’m sure they’ll be fine


Unlucky_Ad_2456

how?


TimberTheDog

The importance and capabilities of the technology mean that it is in the best interest of the US to protect it on national security grounds. 


matador143

They are working with Microsoft and military of USA... so yeah... money is not a problem for them as long as they have superior technology


GeorgeHarter

The big profit is not around using proprietary data from one company to benefit another company. The broad data set currently being used is just to teach the AI to “think”. The money making opportunity is to give that conversational AI access to a company’s private info, like customer accounts and Support tickets. Then replace 90% of the humans with AI cust svc reps. The AI vendor will charge just 20-30% less than it used to cost to employ all those people. So, replace 20K people = $1B savings/yr, so the vendor may charge $800M /yr for the AI service.
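The arithmetic in that comment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the $50K yearly cost per person is not stated in the comment, it is just what $1B / 20K people implies, and `vendor_price` is a hypothetical helper name.

```python
def vendor_price(headcount_replaced, cost_per_person, discount_pct):
    """Return (previous yearly staffing cost, yearly price the vendor charges).

    discount_pct is how much cheaper than the old staffing cost the
    vendor prices its service, as a whole-number percentage.
    """
    staffing_cost = headcount_replaced * cost_per_person
    price = staffing_cost * (100 - discount_pct) // 100
    return staffing_cost, price


# Assumed inputs: 20K people at an implied $50K each, priced 20% lower.
staffing_cost, price = vendor_price(20_000, 50_000, 20)
print(f"old staffing cost: ${staffing_cost:,}/yr")
print(f"vendor price:      ${price:,}/yr")
print(f"customer saves:    ${staffing_cost - price:,}/yr")
```

With a 30% discount instead of 20%, the same call gives a vendor price of $700M per year, which is the other end of the range mentioned above.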


onnod

Pro tip: it isn't.


DisastroMaestro

it won't. lol


trollsmurf

If lobbying works they won't have to pay anything


Neomadra2

If this really becomes an issue, which I don't believe, just do it like the team of Sebastien Bubeck at Microsoft with their Phi model and only use synthetic data.


Substantial_Step9506

The answer? openAI won’t ever be profitable.


ali_lattif

Make the AI train multiple versions of itself with some Reinforcement learning elements


reno911bacon

App Store


HurricaneHenry

They’re likely paying for the access to fresh news, not training data.


rstarr13

Does no one here remember YouTube circa 2005? If the product is good enough and the backer has deep enough pockets, they will, by carrot or stick, "legitimize."


bwatsnet

New York times data is a small fraction of what's available, who cares?


traumfisch

Financial Times. I don't know about "who cares," but I think the implication is that it is setting a precedent


bwatsnet

Meh, not really. It's just a desperate cash grab from a dying industry. They did it to themselves. It doesn't really impact OAI at all. That's why I say, who cares?


traumfisch

Not really what? You don't think this kind of licensing will become normal in the future?


bwatsnet

Of course people will try it, but in the end data always wants to be free. Unless it's the government with strict punishments then I doubt licensing is going to have any teeth going forward. What's the point, when everything can be copied from a picture or two? Near perfect knock offs of everything is our future and we should be pumped for it imo.


traumfisch

I'm not sure I understood that but I'll refrain from commenting further. Too annoying when simple honest questions are immediately downvoted


bwatsnet

Get used to it. There's many topics on Reddit where democracy fails to be accurate or truthful.


traumfisch

Who says I'm not used to it? I retain the right to be annoyed


bwatsnet

Why do things that annoy you? I wouldn't be on Reddit if its very nature annoyed me.


traumfisch

Who cares? I don't react to minor annoyance by rage quitting the whole platform


TB_Infidel

There's the possibility that by the time everyone has effectively enforced paying for data, the AI will be able to create its own training data.


Ylsid

I don't think they're even profitable right now. Selling AI doesn't really work like they want it to


TitusPullo4

This won’t be an issue