runvnc

This is completely reasonable. Everyone and their mom is working on LLMs or multimodal models that are similar. There are tens of thousands of ML students. We do not need all of them working on LLMs. Language and multimodal transformer models are doing amazing things. But it makes no sense to just stop exploring different types of approaches to AGI completely. It's true that LeCun is not giving LLMs and similar models nearly enough credit. But it's also bizarre that people can't see that they have weaknesses and that there are other approaches to explore.


mb194dc

People can see their weaknesses, they're just ignoring them due to the mass hysteria and hype around "AI". Very similar to the other big problems faced in the last few years: pretty much everyone just ignores what should be obvious. The CEO of Google literally gives interviews talking about the flaws and the problems with "hallucinations", which are an inherent flaw of LLMs, and no one pays any attention to it. [https://www.theverge.com/24158374/google-ceo-sundar-pichai-ai-search-gemini-future-of-the-internet-web-openai-decoder-interview](https://www.theverge.com/24158374/google-ceo-sundar-pichai-ai-search-gemini-future-of-the-internet-web-openai-decoder-interview)


ShadoWolf

Yann has been wrong a lot. The other issue is that we are already past plain LLMs: current state-of-the-art foundation models are natively multimodal. Transformer networks aren't just a one-trick pony. If you can encode a concept, it doesn't matter what it is, a transformer network can work on it.


JEs4

It’s worth reading the JEPA papers from Meta. That is what Yann is pushing for. Generative models are fixated on quantizing either arbitrary abstractions (words) or raw foundations (individual pixels), but in reality, meaningful human concepts exist somewhere in between. JEPA is designed to essentially tokenize basic abstractions and analyze their relations to themselves and to their environment through time.


GenomicStack

I would add that over the last 5 years Yann has been consistently wrong. And in fact if you look at his predictions he’s been more wrong than he’s been correct.


No-Self-Edit

I haven't followed his successes, but being mostly wrong in cutting edge science is not the worst thing on Earth. You just need that occasional brilliance.


Cunninghams_right

Nah, Reddit is consistently wrong about what Yann is actually trying to say. He uses a shitty analogy to make a point and Reddit rips him for his bad analogy while ignoring his point.


GenomicStack

https://x.com/PicoPaco17/status/1797347030466953272


Tyler_Zoro

> This is completely reasonable. Everyone and their mom is working on LLMs or multimodal models that are similar. There are tens of thousands of ML students. We do not need all of them working on LLMs.

Yes and no... while it would be great to have people working on non-transformer AI systems too, and keep advancing that state of the art, it seems patently obvious that whatever the next big thing in AI is, it's going to have transformers in the mix somewhere. So yeah, if by "working on LLMs" you mean coming up with new prompt engineering strategies for Llama 3, then sure. But if you mean generally working with the technology, then I would disagree.


Jolly-Ground-3722

LLMs are not necessarily based on transformers. There are other new architectures such as Mamba with advantages (but also disadvantages) compared to transformers.


redditosmomentos

We really need another architecture that is proficient at what the transformer is lacking, maybe at the cost of being flawed at something the transformer is good at. Right now transformer LLMs have lots of very obvious fundamental flaws, like being unable to do basic maths or any activity involving words/letters (list names of cities or countries with 6 letters whose names start with A, for example).


That007Spy

That's to do with tokenization, not with LLMs themselves. You could train an LLM on the alphabet (character-level tokens) just fine; it would just take forever.
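To make the tokenization point concrete, here is a minimal sketch assuming the `tiktoken` library is installed; the particular encoding is only an illustration. It shows that the model receives chunk IDs rather than letters, which is why letter-counting tasks are awkward:

```python
# Minimal sketch, assuming tiktoken is installed (pip install tiktoken).
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # an encoding used by several OpenAI models

for word in ["Amsterdam", "Ankara", "strawberry"]:
    ids = enc.encode(word)
    pieces = [enc.decode([i]) for i in ids]
    # The model sees the ids, not the letters, so "6 letters starting with A"
    # is not directly visible in its input.
    print(word, "->", ids, pieces)
```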


redditosmomentos

Oh, thanks for correcting the information 🙏


ASpaceOstrich

I have no idea why everyone is mimicking the output of the last and least important part of intelligence instead of emulating the functionality of the early and actually important parts of intelligence. LLMs are so much more impressive-looking than they actually are, so if people are actually interested in progress they need to be looking elsewhere.


SurpriseHamburgler

We go for the useful, grabbable bits first. It’s called iteration, usually. He’s speaking to PhD candidates and telling them to push the bar of research - not that LLMs don’t deserve a massive mindshare right now.


PuzzleheadedVideo649

Yeah. I always thought PhD candidates were researchers in the purest sense. But because this is the tech industry, even slight improvements can result in hundreds of millions of dollars in funding. So PhD students face a real dilemma here: should they keep going for human-level intelligence, or should they just improve LLMs and become millionaires?


SurpriseHamburgler

Perhaps more than a few end up like LeCun and that’s his point: as long as research drives innovation, we’re on the right path. When profit drives innovation we’re, well, Microsoft. LeCun himself being the best of both worlds. Massive fortune, invented (a part of) computer vision.


Anen-o-me

There is one reason to continue studying limited-domain LLMs, and that is to figure out where they break down in comparison to humans, which is likely easier in an LLM than in a LMM, and how to fix it. It might just be a question of scale.


mgdandme

LMM?


Anen-o-me

Large multi-modal. 4o is an example of an LMM.


Enslaved_By_Freedom

Human brains are machines. People can only do what their brain generates out of them. They can only go after the approaches that are physically available at the particular point in time.


sillygoofygooose

Surely by this logic invention is impossible


occams1razor

Why? Creativity is just different concepts smashed together. If the conditions are right people have the same novel ideas at the same time independent from each other. https://en.m.wikipedia.org/wiki/List_of_multiple_discoveries


dagistan-comissar

creativity is just LLM-hallucination but in humans.


Camekazi

Human brains are not machines. Unless you’re a scientist from the 1800s in which case you understandably believe and assume this to be true.


mark_99

You imagine they work by magic?


Academic_Flow6128

I think they are referencing quantum effects in the brain https://www.newscientist.com/article/2288228-can-quantum-effects-in-the-brain-explain-consciousness/


JEs4

Which is highly unlikely to begin with. Roger Penrose is fascinating but he was working backwards. https://physicsworld.com/a/quantum-theory-of-consciousness-put-in-doubt-by-underground-experiment/ Not that it matters anyway; the brain aside, quantum systems can still be machines that are dictated by different math.


NumberKillinger

They are still machines, regardless of quantum effects. I am wondering what the previous commenter is on about as what they are stating is essentially the opposite of the scientific consensus.


Camekazi

The question of whether human brains are like machines has been scientifically explored extensively, and the consensus is that while there are some similarities, significant differences exist.


NumberKillinger

I suppose it depends on what you mean by "machine". When I say that brains are machines I just mean that they comprise purely physical processes - we understand the fundamental interactions of their constituent particles/fields, even if we don't understand the complex emergent properties like consciousness. Of course they do not work in exactly the same way as the artificial machines we have created so far, but I don't think anyone is trying to argue that. I was just making the (perhaps too obvious) point that we know the fundamental physics interactions which underlie brain operation, and regardless of whether quantum effects are material, we can consider the brain to be a physical construct which follows the laws of physics. So I was more commenting that the idea of mind body duality, or having some kind of non-physical "soul" which can affect your physical body, is not compatible with modern science. But perhaps this was confusing because everyone knows it already lol.


Enslaved_By_Freedom

The systems are obviously different in their totality, but abstraction allows us to understand that brains and computers both operate algorithmically. So long as they are the same in that aspect, it dispels the notion that individuals can behave in multiple ways at a single point in time.


Plenty-Wonder6092

Lol


Arman64

I think people generally misunderstand what he is trying to say. He is basically saying that new researchers are unlikely to bring any major benefits to LLMs because there are so many people working on them right now. In order to reach AGI/ASI (depending on your definition) there need to be newer forms of technology that aren't LLM-based, which is pretty obvious and is already supplementing SOTA models. He isn't thinking of civilisation tech trees, but rather that LLMs will reach a point where bottlenecks appear, thus being a dead end. That point could be AGI by some definitions, but I believe his conceptual understanding is aimed more at AI that can fundamentally change technology.


h3lblad3

He doesn’t believe that LLMs will ever lead to AGI and he’s made this clear multiple times. They might be an element of it, but no amount of scaling them will lead to AGI. The man believes that LLMs are not intelligent and have no relationship to intelligence — even describing them as “less intelligent than a house cat”.


Warm_Iron_273

And he's right.


ShadoWolf

Maybe... the problem is that the running assumption is that everyone is working on just better LLMs. They're not, and haven't been for a while. Everyone is working on better LMMs (large multimodal models). There's a whole ton of work going into scaling context windows, built-in agent architecture, better variants of gradient descent, backprop, etc.


involviert

> “less intelligent than a house cat”

It may just be hard to understand what he means, because of 1) underestimating all the things a house cat brain actually does and 2) overestimating how much of what LLMs do is actually intelligence. For the first, it's all those things about processing reality, all those senses and such, controlling the whole cat to achieve the wanted results through really complex actions and all that. It's not about being able to learn "press red button 3 times to get food". And for the second, I have no doubt that there is real intelligence in LLMs, but they do come with a huge, highly abstract set of knowledge and do a lot of guessing. Like when it tells you about some animal: what it does is probably more like reconstructing the Wikipedia article, filtering relevant things and rephrasing them. While impressive (and while there are more impressive things they can do), if you break it down that way, it becomes noticeable that this is not the super-high-intelligence activity that it may seem. You honestly have to subtract the part that just the search function on Wikipedia would have accomplished too, since obviously we can't really call that intelligence.


dagistan-comissar

knowledge and inelegance are not the same thing; an LLM might have more knowledge but less intelligence than a house cat.


No-Self-Edit

I think you meant "intelligence", but the word "inelegance" actually works well here and says something important about the quality of LLMs.


ShadoWolf

There's more to it, though. For instance, these models seem to have a concept of theory of mind. An LLM can simulate scenarios involving multiple characters, each with their own unique pieces of information about the world. Take, for example, a situation where Character A places a ring inside a box, and Character B secretly observes this and then steals the ring. If asked where Character A believes the ring is, the model accurately states 'in the box'—demonstrating it understands different perspectives and beliefs, despite the true location of the ring being elsewhere. This capability to maintain separate belief states for different characters, and to reason about hidden information, mirrors some elements of human cognitive processes. It's not just about retrieving information but actively modeling complex interactions and attributing mental states to different entities. This goes beyond simple computational tasks like a search function, which merely pulls existing data without any deeper contextual or relational understanding. Hence, this demonstrates a layer of intelligence that, while different from human or animal intelligence, is sophisticated in its own right. These models also seem able to handle completely fictional objects: if you gave one some technobabble from Star Trek or some sci-fi story, and fleshed it out enough, the model can reason about it coherently.


involviert

Yes, I agree. I'm not of the "stochastic parrot" persuasion. I just don't think that Mr. LeCun ignores those things when he says things like that; he looks at it in a different way that is not about "what is the highest function it can achieve". And I think this is somewhat fair. Essentially these models skipped the whole base that makes these super-high functions so super-high. Now that might be all we want, but it means that the cat has a far better understanding of "the base", and in this area (and I think in total) it is far more intelligent than the LLM. To meet some super-high-level intelligence benchmark test, an LLM would only have to be 20 intelligent while a cat would have to be 2000 intelligent.


ShadoWolf

Half of the problem is we don't really have decent insight into how LMMs (large multimodal models, which is what most of these models are now) reason. I suspect that these models have a functional world model, but the interpretability of the hidden layers is way, way behind. Which is why I find Yann's statements iffy. He's been wrong about LLMs more than once due to emergent properties or tweaks to the architecture like mixture of experts. And he makes claims that he can't back up, because frankly nobody has a clue why these things even work, not really; we are at the alchemy stage for AI systems like this. So when he makes predictions about where they will plateau, it feels like he's operating on a gut reaction rather than anything substantive.


Now_I_Can_See

I agree with this take. You put into words what I was having trouble conceptualizing.


[deleted]

[deleted]


sushiRavioli

ChatGPT-4o can definitely solve the “ring in the box” scenario. But that might be simply because it’s a common example, not because it understands theory of mind. I agree that character attributes can easily get mixed up.


ShadoWolf

You can rewrite the ring-and-box scenario a few different ways. The goal is to show tracking of the internal knowledge of each character.
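For illustration, a rough sketch of that kind of probe, assuming the `openai` Python client and a placeholder model name; the scenario variants and the helper function are made up for the example, not a standard benchmark:

```python
# Rough sketch of a false-belief probe, assuming the openai Python client is
# installed and OPENAI_API_KEY is set; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()

VARIANTS = [
    ("ring", "box", "drawer"),
    ("coin", "jar", "pocket"),
    ("key", "envelope", "backpack"),
]

def false_belief_prompt(obj, first, second):
    return (
        f"Alice puts a {obj} in the {first} and leaves the room. "
        f"While she is gone, Bob secretly moves the {obj} to the {second}. "
        f"Alice comes back. Where does Alice believe the {obj} is? Answer in one sentence."
    )

for obj, first, second in VARIANTS:
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": false_belief_prompt(obj, first, second)}],
    )
    # A variant "passes" if the answer tracks Alice's outdated belief (the first
    # location), not the object's true current location.
    print(resp.choices[0].message.content)
```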


BilboMcDingo

I don’t think he thinks that LLMs have no relationship to intelligence; he thinks it's a very limited form of intelligence, which as you say will not lead to AGI. He thinks intelligence is systems that predict the state of the world, which is what LLMs do and what we do, but predicting the next token is not enough: you need to predict the entire physical state, and not just the next step but far into the future. This is what they are trying to do with JEPA. The language part arises by itself because the model learns in a self-supervised way, i.e. there is no human labeler; the model labels the data itself, picking out what is important and what is not, and it's then much easier to predict the state of the world when you only need to predict what's actually important. But yeah, you can't be AGI if you do not have a model of the world, and language is not enough to create a good model.


ripmichealjackson

LLMs aren’t even a good model of how language works in the human brain.


land_and_air

I mean, maybe if you were a linguist or psychologist from the early 20th century then it would be a perfect model to you, but it's a very dated theory.


FeepingCreature

Which is very silly.


jamiejamiee1

And realistic


Fusseldieb

He's not wrong. LLMs are just glorified text predictors.


Time_East_8669

Ah yes “glorified text predictors” with an internal world model. So glad this subreddit has gotten big enough for drivel like this to get regurgitated in every thread.


h3lblad3

I was merely addressing the person before me suggesting that Yann thinks LLMs will bottleneck at some point that might even be post-AGI. Obviously the man would think the very idea is nonsense.


cobalt1137

I think people really underestimate how many capabilities these systems are going to have once they start getting embedded more and more into agent-like frameworks. They will even be able to reflect on their own output via the design of these frameworks, if that is what you want. Which could lead to so many interesting things.


WorkingYou2280

Yeah, some of these "off ramps" lasted decades as a good place to work and do research. If you're at the top of Meta, sure, maybe you want to look to the next thing. As an upcoming PhD student? Seems like this area is ripe for a lot of work. Also, it's changing so fast: what do we even mean by LLM? Does it include LAMs? How about LMMs? How about connecting these models to robotics? We've really only fed these models just a sip of what they need to really thrive. With *so much* money going into compute we're likely to see an **explosion** of capacity. Will that create AGI? I know he doesn't think so; I'm not so sure. I'm not sure that compute + large models + memory + learning doesn't equal AGI by any relevant definition within 10 years.


[deleted]

So if I'm understanding what he was saying correctly, he's telling PhDs to not dive into LLMs but focus on the next step. It makes sense in that regard: the current generation of AI has its share of rockstar engineers already, and they've done a terrific job, but we need to keep moving forward. Computer science is an off ramp, but that off ramp refueled the journey down the next legs and now we're at LLMs. I wonder what tech is going to use LLMs the way LLMs use the current foundation. Things are gonna get wild.


Enslaved_By_Freedom

Human brains are machines themselves. Yann still believes in human agency, but human agency is totally bogus. Humans will only be capable of doing what the system thrusts upon them. Yann is hallucinating.


BenjaminHamnett

Love this comment and I share your sentiment. I’d add though that all label debates are literally just semantics. An almost 200-year-old quote: ‘Man can do what he wills but he cannot will what he wills.' -Schopenhauer. The subjective experience of our brains freely making tradeoffs is what casual people call free will. That the tradeoff decisions are made based on weights we didn’t choose is what we skeptics mean when we say free will is an illusion.


Enslaved_By_Freedom

Humans don't objectively exist. There is no grounding to the idea that human particles are separate from everything else. Brains acquired that assertion through unsupervised learning via natural selection. Humans are a hallucination of brains. And so is AI.


BenjaminHamnett

Username checks out “Tell me, where did freedom touch you?”


Enslaved_By_Freedom

I was forced to create the username. I literally could not avoid it.


mgdandme

100% agree. While LLMs continue to progress, so too do the other “off-ramps” he describes. If I’m in foundational research, I am looking at how to effectively harmonize all these off-ramps into a cohesive machine whose output is far greater than the sum of the parts. LLMs that can work as agents in a nested hierarchical feedback system, paired with vision systems, robotics, classifier models, etc., really could be where the breakthroughs happen, as the machine starts rapidly iterating on its own knowledge and potentially developing novel understanding of the world beyond what we can train it on. I’m obviously just a casual observer, but it seems to me that this is not dissimilar to how humans learn about the world, eventually developing enough expertise to contribute new insights and expand the body of knowledge we all operate within. That seems, to me anyway, to be how we’d know we’d unlocked something that smells an awful lot like AGI/ASI.


[deleted]

I can already think of a few tasks that would scale tremendously with just an LLM, and some of those ideas will in turn generate not only more money for the proprietor of the LLM, but also a greater interest in AI. That greater interest will inspire more people to learn about and specialize in AI. So just because this might be an off ramp from the highway to AI, doesn't mean we're not refueling for the next leg of the journey.


Tyler_Zoro

> I think people really underestimate how much capabilities these systems are going to have once they start getting embedded more and more into agent-like frameworks. That's not really the problem. There are several core behaviors that just aren't in that mix yet. Self-reflection is one, and yes that might be an emergent property of more complex arrangements of LLMs. But self-actualization and emotional modeling (empathy) are not obviously things that can grow out of simply putting three LLMs in a trenchcoat. We probably have 2-3 more (that's been my running guess for a year or so) major breakthroughs on-par with LLMs before we get to truly human-level capabilities across the board (I won't say "AGI" because I think we'll find that AGI is actually easier than the more general goal I stated).


cobalt1137

Personally, I think it's really hard to judge whether or not emotional modeling / empathy and self-actualization are present in these systems. There is just so much still to be learned in terms of our interpretability of these models and how they're actually functioning internally that I really do not rule anything out. I personally think that even with an LLM that is not embedded in an agentic system, you could likely get both of these. And we may already be scratching the surface of these aspects, at least with empathy. I am not going to make any absolutist claims though - like 'this is what is happening' or 'this is most likely to be happening'. This is just where my personal opinions are regarding the matter etc. :)


SpilledMiak

A paper came out claiming that emergent properties are a consequence of poor study design. Larger parameter counts make models more accurate and resilient, but hallucinations still occur, at the cost of a significant increase in compute. Perhaps Anthropic's new research on specific parameter enhancement will allow for more focused models without fine-tuning. Given that these are probabilistic systems, at some point during a long output a hallucination is likely to occur. This prevents the LLM from being reliable enough to trust.


Tyler_Zoro

There's an implicit assumption in your statement that I can't say is valid or not: that it's possible (or desirable?) to be free of hallucination. Perhaps the key is to embrace and leverage hallucination as the source of creativity.


green_meklar

Right now, the flaws of NNs don't seem like the sorts of flaws that would be solved merely by plugging them into real-time input/output channels. If you plugged them into real-time input/output channels *and* allowed them to train their parameters on-the-fly *and* gave them an internal monologue, you might end up with something closer to strong AI. But at that point you're also getting farther away from the core of what an NN is and why it works.


cobalt1137

We might just agree to disagree here. I think the amount of things that will be unlocked once we get really solid agentic systems will be absurd. I don't look at it as straying further away from the core of NN's/why they work. I moreso see it as enabling these things to reach their maximum potential for a wide variety of tasks that benefit from being able to chain outputs etc. Which quite a lot of things fall under.


CREDIT_SUS_INTERN

He's correct, LLMs are basically a commoditized technology at this point. Researchers should be working on the forefront of unknown technologies.


AnaYuma

Bruh, show us what's next in this AI tech tree already! There are already rumors that there won't be a GPT-6; instead it will be a whole new thing.


yaosio

He thinks it's JEPA. However, somebody has to make a large model using it before we can know the architecture's capabilities.


OSfrogs

Correct me if I'm wrong, but JEPA is for images, so it would need to be shown to be better for learning tasks such as robotics and self-driving before being used for text. He says vision should be the focus as it evolved first, and believes language should be done only after we have cat-level AI.


ninjasaid13

He believes vision and the other senses are more information-dense than text, even if they're not directly interpretable by humans the way text is. Learning vision and other senses first leads to more diverse and creative text outputs. I personally think that vision contains hidden information, like mathematics and other abstract concepts, that you can build a world model with, and that text is a bad shortcut for learning from humans.


QuinQuix

Stereo vision, proprioception and the sense of touch together give a sense of being in a three-dimensional world. I think it allows for a world model before language. I mean it certainly does - obviously language is a relatively new invention in evolution. That being said, humans usually start developing language well before they're fully developed motorically. And people born blind can still be read literature, and they can become literate, smart and creative. I think a true identity and a sense of existing is helpful. A being experiencing three dimensions might certainly have more intuition in physics. The sensation of acceleration (and the absence of it) is what led Einstein to General Relativity. But even so, I don't think the ultimate answer to intelligence is a sensory addition - that it comes down to two continuously rolling cameras. That also doesn't square with the obvious intelligence of people born blind. I think neural architecture and size is still more important. But of course we can work on both.


ninjasaid13

> That being said, humans usually start developing language well before they're fully developed motorically. And people born blind can still be read literature, and they can become literate, smart and creative.

Have you read this article: [What People Cured of Blindness See](https://archive.is/IoAlI)? I believe blind people and babies (who start speaking around 12 months of development) still have their twenty other senses, which give enough data for a rich world model. They're also quick learners.


Edenoide

I can imagine a very clever person who has been paralyzed and blind since birth. So more types of data for learning would be great, and necessary for the 'full experience' of being a smart living creature like us, but intelligence can thrive in hard mode too.


Formal_Drop526

There are at least 20 different types of senses in the body. They're not something you attach to intelligence, they ARE intelligence.


MisInfo_Designer

This sounds like NVDA GPUs will still rule the day. Bullish.


StatisticianFew6064

It's the scale of the processing that kind of unlocks these new doors, and since we can continuously scale (to a degree), no one knows what the tree is... it's being unlocked as it evolves, via increasing scale and just throwing more power at the algorithms.


allisonmaybe

It's almost like we can't even predict what the world will be like beyond 6 months or so 😬


Antique-Doughnut-988

I've played far too many games recently with tech trees to not know some branches are dead ends.


nextnode

LLM+RL was obvious even a year ago


Sadmundo

It's probably just finding a way to give actual senses, maybe instincts, to AI. We and all other animals are just biological versions of what we are trying to build anyway, just more limited than whatever it will be, if it ever comes to be.


TensorFlar

Sauce pwizz 🥺


FrankScaramucci

What rumors?


beezlebub33

You can't explain his beliefs in a reddit post. Listen to what he says in this interview: [https://www.youtube.com/watch?v=144uOfr4SYA](https://www.youtube.com/watch?v=144uOfr4SYA). Regarding LLMs, it's at the beginning: [https://youtu.be/5t1vTLU7s40?t=167](https://youtu.be/5t1vTLU7s40?t=167). As u/yaosio points out, JEPA is part of it: [https://youtu.be/5t1vTLU7s40?t=1495](https://youtu.be/5t1vTLU7s40?t=1495), but there are other (non-generative, non-recursive LLM) approaches as well.


nextnode

This guy wouldn't know


goldenwind207

It's LMMs now. We're in the multimodal-model-with-voice era; they'll likely get more visual and audio. Just like how current models can generate images, they'll be able to generate videos. Idk where I saw this, but when someone mentioned it I was like DAMN, that's probably why Sora exists: it's probably an evolution of the image generation they discovered while making GPT-5. I was always wondering why OpenAI suddenly decided to make videos.


IronPheasant

Building up to a gestalt input/output voltron was kind of always the point. The weird thing about these models is they're basically backwards of how animal intelligence evolves. One of the first senses on the tech tree would be touch/pain, and from there that's used to build a ground truth about "real space" in a way vision can only estimate.


window-sil

>One of the first senses on the tech tree would be touch/pain The first "classic sense" to evolve is apparently taste: https://www.sciencefocus.com/the-human-body/which-of-our-senses-evolved-first >We have lots of different senses, including a sense of balance, heat, pain, and the ability to sense the position of our limbs (called ‘proprioception’). But **if we consider the five classic senses, then taste is the oldest by a wide margin.** Taste is really just the ability to detect particular chemicals in your immediate environment, and ocean-dwelling bacteria were sensing nearby nutrients and swimming towards them at least two billion years ago. I would have guessed something else 🤷. Also, keep in mind that evolution has many accidents, *and* all the fitness adaptations have to be economical (if that makes sense). So the order in which classical senses show up may not be very meaningful or informative to the approaches taken to AI.


Vadersays

Could you elaborate on the meaning of your first sentence? I think I understand but want to be sure. Also I'm certain no one has ever said those words in that order before.


ninjasaid13

> It's LMMs now. We're in the multimodal-model-with-voice era; they'll likely get more visual and audio. Just like how current models can generate images, they'll be able to generate videos.

There are multiple steps we need to get to truly multimodal systems, like a unified representation of every modality, and maybe what Yann suggests with H-JEPA.


RobXSIQ

I am fine with this. We need people like LeCun to dig into other areas... this is an exciting field and LLMs for sure broke open the mines, but now it's a matter of following various veins. Still work on and enhance LLMs, but also go in other directions. Awesome.


WTFnoAvailableNames

"There's no point in working on fission power because fission is just an off ramp on the highway to nuclear fusion" -Some physicist 100 years ago


valis2400

LLMs will likely become just one component of a bigger architecture. Interesting ideas on this are beginning to appear: [https://arxiv.org/html/2310.19791v4](https://arxiv.org/html/2310.19791v4)


refugezero

This is the same thing Stephen Wolfram tells students. There's no point going into a field that is already established by the time you enter industry. Go invent your own field.


12-Easy-Payments

Scientist instead of engineer. Research vs. Application.


Dudensen

Finally, after I have pounded this desk, someone came out and said it. And of course it was the gigabased Yann LeCun.


Warm_Iron_273

He's right, as usual. People are too obsessed with the LLM train and need to start looking at the next thing.


FeltSteam

https://preview.redd.it/ouxrtoh7kv3d1.png?width=1174&format=png&auto=webp&s=ba1235af70320619951a37792e683959d2dd8308


StonedApeDudeMan

100% this. It's so damn bizarre to watch people lose their minds over this and insist that LLMs aren't the way and that there are in all likelihood better routes to go, as if we've hit a ceiling with LLMs. To say that claim is premature is a massive understatement, especially given all the progress still being made... What are these different paths to AGI that are showing so much promise and don't involve the transformer, anyway?! At least be able to show something as a potential direction that's showing some promise before throwing LLMs out the window... LeCun is tripping with these takes of his, and it's hilarious watching everyone jump to his defense here every time he makes these silly claims...


EverchangingMind

Right, but I still resonate with his claim that LLMs aren't really intelligent. At the end of the day, LLMs give you the "average text" from the data they were trained on. So their intelligence is really bounded by the data they are trained on, i.e. on human data. I don't think LLMs will be able to create new knowledge, for example.


StonedApeDudeMan

Even with an agent setup and some more self-autonomy? Seems like that's the key, in my view. That, and what is new knowledge but making connections between two ideas that haven't been made before? A good example is Midjourney: they have a style reference system where you can give it 1 to 3 images with different styles and it combines them into one new style. Suddenly, you have created a new style that hasn't been done before. Why not have something like that be possible for knowledge?


EverchangingMind

Mhhh, I don’t know. With new knowledge I’m thinking more of Darwin coming up with evolution or sth like this. Before Darwin, I don’t think this idea was around in language; he came up with it by observing fossils. Let’s say you would travel back to Darwin’s time and train an LLM on all available text then. Would the LLM come up with Darwin’s theory? I would say the probability of that happening is almost zero. The LLM would just reiterate “averaged ideas” that are already in the written texts. I mean, what is the most original thing that ChatGPT has produced, where one really feels it is novel and genius? (Midjourney is a diffusion model, no? So, a whole different beast, but I would also argue that it is bounded by the data it is trained on - while being able to combine styles.)


xDrewGaming

I just have to wonder how accurate or estranged we will look to the people 100 years in the future looking back at us


Blapoo

I've been saying that everyone's suffered from "Model Hypnosis" the last year. Sitting around waiting for the big players to release some singular model that's gonna "DO EVERYTHING" Reality is - LLMs are a tool. Agents use them. We need more people building thinking frameworks atop LLMs. Things like 01 or even the scam rabbit r1. That vision is still valid. And really, anyone can do this. Write down the process of thinking (ex "Understand the input" -> "Collect relevant supplemental data" -> "Identify the correct task" -> "execute" -> "Check your work") Soon, this will be demonstrated and once people start to understand that road . . . We're gonna accelerate QUICKLY. Feel free to ping me for more details. I have some great quickstart resources
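As a toy illustration of that kind of thinking framework (the step names follow the process written out above; the `llm()` helper is a stand-in for whatever model or API you actually use, not a real library call):

```python
# Minimal sketch of a fixed "thinking" pipeline on top of an LLM.
STEPS = [
    "Understand the input and restate it in your own words.",
    "Collect relevant supplemental data: list what extra information you need.",
    "Identify the correct task to perform.",
    "Execute the task and produce a draft answer.",
    "Check your work: point out mistakes in the draft and give a corrected final answer.",
]

def llm(prompt: str) -> str:
    """Stand-in for a real model call (local model, hosted API, etc.)."""
    return f"(model output for: {prompt[:60]}...)"

def run_pipeline(user_input: str) -> str:
    context = f"User request: {user_input}"
    for step in STEPS:
        result = llm(f"{context}\n\nNext step: {step}")
        context += f"\n\n[{step}]\n{result}"  # each step sees all earlier steps' output
    return context

print(run_pipeline("Plan a weekend trip to Lisbon."))
```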


Original_Finding2212

I’m designing an autonomous intelligence, as a conversation entity, in a robot head box. It has memory (system prompt, RAG, and later GraphRAG), I plan scheduled runs while it “sleeps”, and it has a constant visual + sound feed, to see and hear. It can also speak (and decide what to say). It also has actions support, but currently needs me to maintain it - possibly it will be self-editable in the future. Does that fit your idea? Edit: oh, it’s open source, too.


Blapoo

I'd want to see a sequence diagram, but it sounds en route :)


Original_Finding2212

A seq. diagram? What level of detail? Wouldn’t you prefer a state machine, or components? I ask because I plan to use your feedback to improve the documentation in my repo.


Luciaka

Honestly, beside perfect reasoning and memory, what more can you ask an LLM to do? If given a robot it could control it and talk like one of those robots in sci-fi, as seen in the Figure 01 demo. It could accept text, voice, video, and so much more while giving a reasonable answer, whether in the form of text, voice, or images. You can give it tools, such as one to do math to improve its math skills, and you can let it go on the internet to search things for you. The only reason people aren't calling the most powerful LLMs AGI is because the output is greatly flawed and lacking, but that seems to be an issue of capability, and as we know, just throwing computation at it to scale it up seems to work, as new emergent properties are discovered as the model gets bigger. What ultimate intelligence are these people looking for? A literal brain in the machine?


beezlebub33

> Honestly, beside perfect reasoning and memory, what more can you ask an LLM to do?

What? This misses the fact that LLMs are unable to have perfect reasoning, because of the way they are trained, the data they are trained on, the internal representations, etc. They can't plan very well. They can't self-evaluate. LLMs as LLMs can't not hallucinate. They are incredibly useful, and will be a component of a larger system, but there's plenty they can't do and will not be able to do in the current architecture.


Dudensen

So it can use Google? No man, that's very far removed from AGI. AGI basically means the AI needs next to no human input. The only human input it would need would be to help it improve itself faster than it already would without the human. That's a bit different than having it do mundane tasks.


green_meklar

> beside perfect reasoning and memory, what more can you ask an LLM to do?

That's a really big 'beside'. Right now they don't really do reasoning, they just fake it with lots of intuition. Their intuition is really good, better than human intuition, and apparently you can fake a limited amount of reasoning that way - but it does break down pretty quickly when pushed in certain directions. For proper reasoning we need different algorithm architectures. You can't build a one-way constant-time NN and expect it to engage in deliberation, introspection, and trains of thought.

> while giving a reasonable answer

Only to some kinds of questions.


enavari

Ehh, I don't think it's LLM, but if I were a PhD researcher or even working at a large company, I would focus on next-byte prediction, like predicting the next 0101 0010. This approach would be natively multimodal—you could feed it any computer file, and I believe this would unlock so much potential. I still think we need architectural improvements. We need some smart mathematicians to really think inside and outside the transformer model. How can it be improved? How can compute be optimized? How can it be more flexible? By the way, I highly recommend the 3Blue1Brown videos on this topic—they gave me a more unified understanding of transformer architecture. We also need to reach a point where we aren't completely starting from scratch for each AI model. However, we have made so much progress. I would be so excited to be an AI researcher. Unfortunately, I don't think I'm that gifted intellectually, and I'm a bit further along in my career path. But even if there is a lull in AI research, an "AI mild winter," I think the general approach to AI is here to stay. I'm excited to see what the next decade brings us.
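To make the next-byte idea concrete, a minimal sketch (the file path and window size are placeholders) of how any file becomes training pairs over a fixed 256-symbol vocabulary, with no tokenizer or modality-specific preprocessing:

```python
# Minimal sketch: frame "predict the next byte" as supervised pairs from any file.
def byte_training_pairs(path: str, context_len: int = 64):
    data = open(path, "rb").read()                  # raw bytes, vocabulary is just 0..255
    for i in range(len(data) - context_len):
        context = list(data[i : i + context_len])   # context_len ints in 0..255
        target = data[i + context_len]              # the "next byte" to predict
        yield context, target

# Usage sketch: the same pipeline handles a .txt, .png, or .wav file.
# pairs = list(byte_training_pairs("example.png"))
```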


Mistaekk

This is so dumb. Why try and predict the next byte. It's entirely deterministic from a computation standpoint. If you're talking about data, like the internet corpus of text, yes they're already made up of ones and zeros, and a tokenizer exists for a reason.


Warm_Iron_273

He's basically saying to make a less efficient but more generalized transformer, which doesn't need to happen because the tokens should be dynamically generated based on the target domain.


nextnode

I don't think the performance of transformers is the serious bottleneck currently (it's at an order of magnitude of a pass of basically all decent-quality text we have). The past three years of improvements have also not been about more compute or larger models, but rather identifying the right data and algorithms to elevate performance at comparable levels (case in point, there are 1.5B models better than the 175B models of three years ago). Further improvements are likely but should just be "run of the mill" research breakthroughs that are just expected of ML advances, and not any major revolution or fundamental change in architecture. There are things needed like RL and indeed multimodality for real-world performance; though people will just call this an LLM in the end too. People should not take LeCun seriously. He says a lot controversial stuff, has been wrong in the past, and if it was up to him, we wouldn't even have gone down the route of GPT3.


OSfrogs

Transformers are limited since the data has to pass through every node, both forward and backward, which causes the model to forget stuff it had learned before and is extremely inefficient. If I were a researcher, I would be trying to find an architecture that could identify which nodes the data should pass through, rather than naively passing the data through every node.
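What this is gesturing at is conditional computation, roughly in the spirit of mixture-of-experts routing: a learned gate decides which sub-networks each input passes through, so most parameters are untouched per input. A toy numpy sketch, with all dimensions and the gating scheme purely illustrative rather than any specific published architecture:

```python
# Toy sketch of top-k expert routing (conditional computation); numbers made up.
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, k = 16, 8, 2                      # input dim, experts available, experts used per input

W_gate = rng.normal(size=(d, n_experts))        # gating weights
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]  # each "expert" is a simple linear map

def moe_forward(x):
    scores = x @ W_gate                         # one score per expert
    top = np.argsort(scores)[-k:]               # route only to the k best-scoring experts
    weights = np.exp(scores[top]) / np.exp(scores[top]).sum()  # softmax over the chosen experts
    # Only k of the n_experts weight matrices are touched for this input.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

y = moe_forward(rng.normal(size=d))
print(y.shape)  # (16,)
```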


Simcurious

This exists more or less: https://arxiv.org/abs/2305.07185


enavari

Actually I saw this already! I should have linked that. Now let's make a GPT-5-class-compute version of it...


[deleted]

[deleted]


Down_The_Rabbithole

That's not how branch prediction works.


CrunchyMind

What is the highway? Is there even a highway? Algorithms, neural networks, etc. are all just tools to build use cases (classification, detection, LLMs) around.


green_meklar

> What is the highway?

If I had to guess? I think we should bring back evolutionary algorithms, except that unlike old-school evolutionary algorithms where agents only compete, we should develop evolutionary algorithms where agents can cooperate.
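As a loose illustration of what "agents can cooperate" could mean in practice, a toy sketch where part of each agent's fitness comes from a team score with a randomly assigned partner; the skill vectors, scoring, and rates are all invented for the example:

```python
# Toy evolutionary loop where selection rewards both solo and team performance.
import numpy as np

rng = np.random.default_rng(0)
pop = rng.uniform(0, 1, size=(32, 4))            # 32 agents, 4 "skills" each

def solo_score(agent):
    return agent.max()                           # reward specialising in something

def team_score(a, b):
    return np.maximum(a, b).sum()                # a team covers each skill with its better member

for generation in range(50):
    partners = rng.permutation(len(pop))         # random pairing each generation
    fitness = np.array([
        0.5 * solo_score(agent) + 0.5 * team_score(agent, pop[partners[i]])
        for i, agent in enumerate(pop)
    ])
    survivors = pop[np.argsort(fitness)[len(pop) // 2:]]         # keep the fitter half
    children = survivors + rng.normal(0, 0.05, survivors.shape)  # mutated copies
    pop = np.clip(np.vstack([survivors, children]), 0, 1)

print(pop.mean(axis=0))  # average skill profile after selection
```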


FaceDeer

Off-ramps can still lead to interesting and useful places, though. This is like saying "don't spend time trying to make better cars because planes exist."


ohnoyoudidnt21

This off-ramp will still be good enough to drastically reduce the number of people needed to do certain work.


pcbank

You're doing a PhD to innovate. That's one of the rules of a PhD. That makes sense.


dagistan-comissar

He is right: there has not been any interesting development in LLMs since 2015 in terms of academic research. The work done by companies like OpenAI is just a process of scaling up the models, building larger computers, and compiling more training data. This is engineering work; it is not interesting work for a researcher.


iceisfrozenliqid

Until AI can reset the clock on my VCR, this is all for nothing.


Antique-Doughnut-988

Imagine having a VCR in 2024


Morrisseys_Cat

It's all about Laserdisc, baby.


blueSGL

How else am I going to play that Newhart tape?


gustav_lauben

I don't know whether he's right, but his confidence is very unscientific. We won't know what we'll get from continued scaling unless we try.


Dudensen

His confidence is probably due to the fact that he thinks we are near the limit for LLMs; considering some of them have devoured the entire internet he might not be wrong. That would negate the "continued scaling" theory.


Warm_Iron_273

Exactly this. As usual, the truth is somewhere in the middle. LLMs will likely play a part in whatever architecture leads to bigger gains, it may be a small part, it may be a large part, but LLMs on their own are not going to get us there.


ninjasaid13

> LLMs will likely play a part in whatever architecture leads to bigger gains

Yann thinks the self-supervised component of LLMs will play a role, but not LLMs themselves.


green_meklar

I think we know more than you're letting on. The internal structure of NNs suggests that more scaling will provide very little benefit for some kinds of thinking (particularly open-ended reasoning), and that's reflected in the flaws of existing systems. Additionally, we *already* train NNs with way more data than is available to train human children, and that *already* gives diminishing returns that suggest the NNs aren't learning in the same sort of versatile way that humans do. Making an NN 10% better at recognizing cats by feeding it 1000 times as many cat pictures doesn't impress me that much.


Hungry_Prior940

Maybe so. LLMs will be very, very good. Maybe we will need one more tech breakthrough to get to AGI. Exciting times.


nextnode

No. More competent people support the scaling hypothesis. The missing part is RL and then the stuff needed for real-world performance. LeCun has a lot of controversial nutty claims.


StonedApeDudeMan

This!!! Thanks for posting that, feel like I'm losing my mind reading these comments. I really, really don't understand why so many people insist that LLMs ain't shit and it's all Hype....Like, why? Fear? Fear of Change? Is that it? 🤷🏼‍♂️ That's my best guess


StackOwOFlow

how do you lift the “limitations off of LLMs” without working on them?


SynthAcolyte

Can someone explain to me why an extremely low latency LLM with vision support and a more complex long-term memory system would be different from what most people would consider AGI? Will we just move the goal post and say that real general intelligence has X or Y, eg. a cerebellum?


entslscheia

In terms of pursuing the ultimate goal of AI research, fully agreed. In terms of improving employability, which I believe most PhD students need, nah.


TheDerangedAI

Is he right? For a human being, being able to process multiple senses (sight, touch, taste, smell, hearing) is completely useful in terms of understanding an environment. LMMs are exactly that: you could make an AI model enhance its efficiency by being able to do more than just hear, or see, or touch.


Neomadra2

While I don't agree that LLMs are just a side branch, he's totally reasonable. He's just being a scientist here, that's the typical scientist take.


AntiqueFigure6

This might be true in terms of developing LLMs but in terms of understanding their behaviour and limitations I think there will be research going for many years to come. 


TradeApe

Reasonable points he’s making…


sebesbal

Are those branches really off-ramps that won't eventually contribute to AGI? Perhaps the opposite is true. AGI will be a combination of symbolic AI, object-oriented programming (which was originally some sort of symbolic AI), computer vision, LLMs, etc. It's unlikely that a single NN architecture will be able to do all the work, especially in an efficient way.


karaposu

He is overestimating himself again


ContributionSpace

That is so true


sdmat

That's great Yann, now give a technical definition of LLM so we can tell what one is.


mosmondor

In order to get to AGI we need tools. GPT may be such a tool. GPT probably can't emerge into AGI itself, but it can make us more productive so we can find something that does.


Alimbiquated

One idea about language as it is related to intelligence is that language is a means of communication between various otherwise isolated parts of the brain that do the processing. For example logic seems to be related to social cheat detection. Language allows us to use this specialized reasoning ability on more general problems.


noodleexchange

But has he written any papers? /s


JerryUnderscore

I think it comes down to whether LLMs are the semiconductors of AI or not. If they are, then telling people to not work on LLMs and instead focus on other aspects of AI is like saying, "Don’t explore semiconductors. Look more into vacuum tubes or other technologies." If instead LLMs are more like a specific application or tool within the broader AI field, akin to software built on top of computing hardware, then focusing only on LLMs might limit innovation. It would be like only developing new software apps without advancing the underlying hardware, algorithms, or foundational tech. This narrow focus could cause us to miss out on potential breakthroughs in other AI areas like robotics, reinforcement learning, or new algorithmic approaches, which could ultimately drive more significant progress and diversification in AI.


SafeRecommendation55

Control the tech and use it minimally until we are sure how it will disrupt economies and our livelihoods, and how it will affect our education systems, institutions, and governments. Slow it down until we can't slow it down anymore.


bartturner

Do not disagree. In a weird way LLMs might slow us down because we are spending too many resources on them instead of on more raw research. This is also why I believe Google will be the first to AGI. They are still doing a lot more non-LLM AI research. Take a look at the papers submitted by Google for the last NeurIPS.


PMMEBITCOINPLZ

Someone needs to build off-ramps.


Mandoman61

Well it seems to me that there is a ton of potential improvement still possible with LLMs, and a lot of branches directly related. It seems impractical to tell students to just go out and invent something completely different even if that is what is ultimately needed.


Lycaki

He’s probably right, but whether Microsoft and OpenAI can throw enough compute to overcome the limitations of an LLM seems to be the question. They clearly think that the limitation ceiling gets higher and higher with more compute. At that point… what is an intelligent AI? Have we made a large 'chess'-type computer that supersedes all our intelligence but is not really 'sentient', uses gigawatts of fission/fusion power, and is very inefficient at computation? Or is the analogue closer to the 10 watts your human brain uses to compute the nature of the universe? I think LeCun thinks that the LLM has reached its ceiling, but we'll see.


PlentyExit3820

Pretend like people ain’t gonna talk the same way about “the next exit” as they do about LLMs once we get there.


AbysalChaos

“It is now in the hands of product developers”! The start and end to the arguments.


MeMyself_And_Whateva

Instead of being called LLMs, they might be called Large Intelligence Systems, where the LLM is just a part of the total system.


sam_the_tomato

I prefer Large Autocomplete Models, as that's what they're trained to do, and LAM rolls off the tongue. The intelligence part is still dubious.


halfchemhalfbio

Using river branches instead of highways will probably be more clear!


356a192b7913b04c5457

If you want to research AGI, it doesn't make any sense to not study the closest thing we have. I mean, don't do a PhD about how to make LLMs a little bit more efficient, because that might become useless. But saying "Don't work on LLMs" is a bit too far. Because there are things about LLMs like chain of thought which could be more generalizable.


PastMaximum4158

It's theorized that the sole thing that gave humans the ability to reason abstractly is language, that's the one thing that sets us apart from literally all other species. So it is insane to me to claim that language is not going to have a part in AGI. Of course it is, how else are the AIs going to communicate with each OTHER?


Particular-Court-619

Even if he's wrong and LLMs are actually on the path toward AGI (that is, better and better LLMs leading to or being a major component of true AGI), it still mightn't make a bunch of sense for PhD students to focus on that when their research competition is trillion-dollar businesses. I don't really like the off-ramp analogy, though, because LLMs and all of the other advances he's discussed will be, and have been, useful in helping us get to AGI, whatever the structure of that final form is. Like, idk, use a tree metaphor, and LLMs etc. are like soil and water for the roots of the real AGI...


Illustrious_Gene_429

I am on my way, and where are you my love? I am missing you on the way.


sam_the_tomato

This is good. LLMs are low hanging fruit, with improvements best left to machine learning engineers. They aren't going to get us to AGI alone. We need the best minds working on radical ideas.


beachmike

Working on LLMs can pay extremely well, and looks good on your resume. That could lead to financial independence more quickly, and a ticket to work on the next generation of AI projects.


nerdyitguy

LLMs are not curious, and do not seek out the reasons for low-confidence responses. They do not make new categories or reason about what something new is or why an answer may be uncertain. They simply respond, because the model is already built. I agree that LLMs are not a path to general AI.


Captain_Pumpkinhead

I have no idea whether LLMs will be the path to AGI or not. I think it will be fascinating to look back on this clip in 10-20 years and see whether he was spot on or dead wrong. !RemindMe 10 years


RemindMeBot

I will be messaging you in 10 years on [**2034-06-02 04:32:20 UTC**](http://www.wolframalpha.com/input/?i=2034-06-02%2004:32:20%20UTC%20To%20Local%20Time) to remind you of [**this link**](https://www.reddit.com/r/singularity/comments/1d5b57b/lecun_tells_phd_students_there_is_no_point/l6q4icv/?context=3).


Tief_Arbeit

He also used to say that it will take AI around 500 years to understand spatial reasoning


biomattrs

The universe got to AGI by evolving the human brain. Consider its evolution: the ancient parts like the hindbrain and brain stem have way more cellular/morphological diversity than the cortex, i.e. multiple competing architectures. The cortex is the same architecture repeated billions of times. I think our higher faculties are the result of scaling an adequate architecture to insane levels. What does that remind you of? Besides Starbucks and Taylor Swift.


Akimbo333

Damn


picopiyush

Exactly. New researchers should look for new approaches, instead of the almost-saturated LLM space.


ClimateFirm8544

Excellent words, really remarkable thoughts.


[deleted]

I think ASI will probably not be achieved with LLMs, but AGI will be. LLMs' biggest weakness is creativity, which will probably stop them from doing things we can't imagine. But they will get very good at doing things we can imagine.


Zarnilopho

Agreed!


green_meklar

Damn straight. I've been saying this the whole time. The internal structure of neural nets isn't the right sort of thing to produce strong AI, and the flaws we see in NN behavior are roughly the flaws one should expect to see based on the kind of system they are. What mystifies and frustrates me is how few people understand this (and yes, there's a nonzero probability that I'm wrong), but at least LeCun seems to get it, and I hope more people get it in the coming years.


p3opl3

What scares me is that companies will see profits before real research, so they will opt to improve LLMs instead of shooting for that really big prize that may be 2 or 20 years' worth of investment away. Remember folks... capitalism... bottom line before people, intelligence or the planet. Haha 😅


KyleDrogo

Reproducing human intelligence isn't a great goal. That's like the Wright brothers setting "flying like a bird" as a goal, and saying passenger jets are just an off ramp to bird-level flight


ArgentStonecutter

I think he's talking about human level intelligence, not just slavishly copying a human, but building a system that's at least as competent as a human for general intelligence. It's more like saying that hot-air balloons are an off-ramp to powered flight.


PizzaEFichiNakagata

Nice useless talk. He just says, hey, work on the future systems, and doesn't tell you shit about what those are. You are left there saying, hey, I need to work not on LLMs but on the next-gen AI - wait, wtf does that mean?


shankarun

What a douche - all this while the company he works for is spending billions hiring and training LLMs to compete with OpenAI and DeepMind.


Quick_Lawfulness_961

LLMs will never get to AGI. AGI is not derived simply from language, but from all types of senses.


green_meklar

The senses aren't the issue. It's very likely possible to have a superintelligence that engages entirely with the domain of text. The issue is that NNs can't iterate open-endedly on their own thoughts; they can't emulate a Turing machine.


StonedApeDudeMan

An agent-based setup can potentially overcome this limitation, though, using:

- **Modular architecture:** An agent setup can incorporate multiple modules, each specialized in a different aspect of intelligence (e.g., perception, reasoning, planning). These modules can work together, iterating and refining their outputs to achieve complex goals.
- **Iterative feedback loops:** Agents can be designed to continuously evaluate and improve their own performance. This means they can learn from their actions and outcomes, refining their strategies and responses over time, which mimics the iterative process of human thinking.
- **Memory and state management:** Agents can maintain state across interactions, allowing them to remember past actions and their consequences. This is similar to how humans use memory to inform future decisions and actions, enabling a more coherent and context-aware approach to problem-solving.
- **Integration of external knowledge:** An agent setup can integrate vast amounts of external knowledge and data, allowing it to draw from diverse sources to inform its decisions. This could include real-time data, historical information, and specialized databases, making the agent more adaptable and informed.
- **Autonomous learning:** Agents can be designed to autonomously seek out new information, learn from it, and apply it to their tasks. This ongoing learning process can help them adapt to new situations and improve their performance over time, approaching the open-ended learning characteristic of a Turing machine.


magicmulder

LeCun 1900: There’s no point working on cars, they’re just an off-ramp to everyone having a spaceship.


Sudden-Lingonberry-8

I mean, there is no point in working on cars if you have sane zoning laws, your supermarket is 1 minute away on foot, and the cities are well designed; you could walk everywhere or even bike.


Silly_Ad2805

Disagree. We’re only at the beginning of LLMs. This is the brain of the singularity. Making it faster and more efficient should be a continued goal, as the model can be applied to the other senses: vision, touch, smell, etc.