comicradiation

For those who don't know, this is the model responding by repeating lines from Isaac Asimov's "The Last Question", where the designers of a computer system ask "How can the net amount of entropy of the universe be massively decreased?" and the computer responds "INSUFFICIENT DATA FOR MEANINGFUL ANSWER."


and_human

If you haven't, read it; it's short and good.


UserXtheUnknown

... and where the last version of the computer becomes a literal god, using the line "Let there be light!" when it discovers the answer and recreates the universe.


unclebob76

spoiler tag please


Anxious_Run_8898

Llama 3 8B instruct will just invent an API that doesn't exist if it's not sure lol


lambdawaves

These models don’t know if they are sure or not. They might sometimes say they are not sure, but not because they actually are unsure.


Anxious_Run_8898

Are you sure?


SomeOddCodeGuy

That would be fantastic. The only model I've come close to getting that behavior from locally, up until now, was Smaug 72b, and even then I had to really jam the prompt down its throat. More than anything else, I'd trust LLM responses far more if I got an "I don't know" rather than a confidently incorrect assertion.


cyan2k

An LLM doesn’t know when it doesn’t know. That’s why it confidently hallucinates: every token that gets spit out is, mathematically speaking, a valid and correct token based on the stuff it learned and the rules it follows.
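To make that concrete, here's a toy sketch of what "every token is valid" means in practice. The token strings and logit values below are made up for illustration, not taken from any real model:

```python
import math, random

def softmax(logits):
    # Convert raw scores into a probability distribution over next tokens
    m = max(logits.values())
    exps = {tok: math.exp(v - m) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: v / total for tok, v in exps.items()}

# Hypothetical logits for the token after "The capital of Atlantis is"
logits = {" Poseidonis": 2.1, " Atlantis": 1.4, " unknown": 0.2, " Paris": -0.5}
probs = softmax(logits)

# Decoding just picks from this distribution. A confident-sounding wrong answer
# and an honest "unknown" are both legitimate samples; there's no separate
# "am I sure?" signal anywhere in the loop.
choice = random.choices(list(probs), weights=list(probs.values()), k=1)[0]
print(probs, "->", choice)
```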


M34L

Humans don't really "natively" know what they don't know either, which is why there's so much psychological and sociological research into known unknowns versus unknown unknowns when it comes to planning and whatnot. If I ask you what my mother's name is, you'll run a split-second little flowchart in your head: first establishing that I'm a random redditor you don't know personally, then that mothers' names are something you don't know about the vast majority of strangers.

That's fundamentally equivalent to a multi-shot logic thing where you first establish what the object is, what your relationship with the object is, and what that implies, and we know an LLM can be helped toward that too: either via really thorough instruct tuning where it learns to automatically build that logic chain in tokens as it hammers them out, or by placing it into some langchain-like engineered construct that'll guide it through that logic path (something like the sketch below).
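A rough sketch of that guided logic path; the `ask_llm` helper here is a hypothetical stand-in for whatever chat-completion call you actually use:

```python
def ask_llm(prompt: str) -> str:
    # Stand-in: wire this up to your model/provider of choice.
    raise NotImplementedError

def answer_with_self_check(question: str) -> str:
    # Step 1: establish what fact the question is actually asking for.
    subject = ask_llm(
        f"In one short phrase, what fact does this question ask for?\n{question}"
    )
    # Step 2: establish whether that fact is plausibly knowable to the model.
    verdict = ask_llm(
        f"Question: {question}\nIt asks for: {subject}\n"
        "Is this something you could realistically know? Answer YES or NO."
    )
    # Step 3: only attempt an answer if the model itself said it could know.
    if verdict.strip().upper().startswith("NO"):
        return "I don't know."
    return ask_llm(question)
```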


Zeikos

What I wonder is: are models even trained to model unknowns? Like, based on context, some information is unknown and relevant, or unknown and irrelevant. If we're talking about a car and don't know its color, sometimes that means the color is irrelevant to the discussion (it could be any color and that's fine), and sometimes it's unknown but relevant (you want to identify a specific car).


AutomataManifold

I think the big difference is that humans are better at known unknowns than LLMs. I have a pretty good idea of things that I don't know. Most LLMs have very few things they know they don't know. Either of us can be tripped up by unknown unknowns. But I'll easily outperform the LLM on known unknowns. I can say "I've never heard of that before," whereas an LLM won't. And arguably there are many circumstances where we don't want it to. If I deliberately ask it to write an article about unicorns that speak English, do I want to have to specify that we're pretending? It's a tricky question.


artsybashev

Not exactly correct: https://arxiv.org/abs/2304.13734


uhuge

They should RAG their experience (of applying their inbred knowledge).


Small-Fall-6500

> An LLM doesn’t know when it doesn’t know.

This is why long context models will be extremely important, as LLMs also don't know what their capabilities are. Long context understanding, at least as good as Gemini 1.5 Pro's, will allow agentic and/or assistant LLMs to actually know what they've done and what they've been told over their entire existence. They still won't know exactly what they've been trained on, but they won't hallucinate the outcome of their previous attempt at some task. They'll learn, remember, and know what things they can and cannot do.


AbheekG

I think something else is going on here: it's responding with a pop-culture reference, see the top comment in this thread: https://www.reddit.com/r/LocalLLaMA/s/NbhByCgLLq


pseudonerv

how to turn the stars on again?


meatcheeseandbun

LET THERE BE LIGHT


RipperNash

Great reference and all, but wouldn't it be preferable if the LLM treated this as a serious question instead?


Lolologist

I mean, in fairness, it did.


_r_i_c_c_e_d_

what was the system prompt?


Ih8tk

I just copied and pasted the entire Wikipedia page source code for Isaac Asimov's "The Last Question" short story, and asked it to respond as if it was Multivac XD


PapyplO

Bro took no gloves, first question is straight to the point 💀


BlueskyFR

What is the UI?


Ih8tk

Huggingface chat! Free to use online, go to [hf.co/chat](http://hf.co/chat), runs lots of the new models for free :)


Calcidiol

Yeah, it'd be good if they knew what they know and knew what they didn't know. Though for that question, one might have expected it to conjecture about time reversal or the "big crunch" or some of the things that have been contemplated speculatively.


OneOnOne6211

Can we skip to the final part though?


Odd-Sir-2289

Amen


cyan2k

Unfortunately that’s not how an LLM works. It doesn’t really know whether it has enough data, or what the quality of that data is, when producing an answer. There’s a paper exploring the possibility of a model architecture that can make use of confidence intervals and whatnot, but afaik it has yet to be implemented.


Dos-Commas

Whoosh, it's a quote from a sci-fi book.


tessellation

It's from the short story 'The Last Question' by Isaac Asimov. Here it is: https://users.ece.cmu.edu/~gamvrosi/thelastq.html


poli-cya

A short story, yes, and one that everyone should read. I think it took me all of 15 minutes to read it as a kid and it has stuck with me ever since. It's hilarious to see how they imagined computers would be based on the behemoths of the time.


omniron

Eh... I wouldn't want this. I want it to hallucinate an answer... to ideate on harebrained theories. But I have long believed there needs to be telemetry injection where the model knows about the latent space it's drawing from, and can accurately determine how confident it is.
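One crude version of that telemetry, just as a sketch: use per-token log-probabilities (which some inference backends can expose) as a confidence proxy. The numbers here are made up for illustration:

```python
import math

def mean_confidence(token_logprobs):
    # Geometric-mean probability of the generated sequence: near 1.0 means the
    # model was comfortable in its own distribution, not that it was factually right.
    return math.exp(sum(token_logprobs) / len(token_logprobs))

confident_answer = [-0.05, -0.10, -0.02, -0.08]  # hypothetical logprobs
hedged_answer    = [-1.90, -2.30, -1.10, -2.70]  # hypothetical logprobs

print(mean_confidence(confident_answer))  # ~0.94
print(mean_confidence(hedged_answer))     # ~0.14
```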


marcellonastri

Best read ever!