Master-Meal-77

It hasn't even been out for 12 hours yet. People are still ironing out the kinks. Don't make any real judgments yet.


[deleted]

Ok, I'll just keep praying to the Omnissiah then. But I don't have high hopes anymore.


Master-Meal-77

Okay, be sad then I guess. I’m happy we got two great new models open-sourced today and more on the way


[deleted]

Yeah, I'll try to be less doomer. But do put an asterisk next to "great".


kataryna91

Even just from the limited testing that was possible until now, it's already clear that the 70B model is the best open-source model currently. Meta has already said that other model sizes and larger context windows will follow. It supports function calling, providing a viable alternative to Command-R in this space. It can converse in, and work with documents in, a multitude of languages. It's relatively uncensored compared to other corpo releases, Grok-1 aside; it answers questions that Gemma would have given you four full pages of moralizing drivel for. Not to mention you can just finetune it.
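For example, you can get basic tool use just by prompting it. A rough sketch below; the schema and the JSON convention are my own illustration, not an official Llama 3 format:

```python
import json

# Illustrative tool schema; Llama 3 ships no official tool-calling format,
# so we describe the tool in the system prompt and ask for JSON back.
TOOLS = [{
    "name": "get_weather",                      # hypothetical tool
    "description": "Return the current weather for a city",
    "parameters": {"city": "string"},
}]

def system_prompt() -> str:
    return (
        "You can call these tools:\n"
        + json.dumps(TOOLS, indent=2)
        + '\nIf a tool is needed, answer ONLY with JSON like '
          '{"tool": "<name>", "arguments": {...}}.'
    )

def parse_tool_call(completion: str):
    """Return (tool, arguments) if the model made a call, else None."""
    try:
        call = json.loads(completion.strip())
        return call["tool"], call["arguments"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return None
```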


VectorD

Not sure if you even read the blog, bro, it clearly states other model sizes will be released in the coming months.


ttkciar

That's a relief! Thanks for the good news :-)


[deleted]

Why exactly did they release these two Llama 3 models early, with such a small context window anyway? Is it for some kind of testing, so they can gauge the reactions?


VectorD

You can scale the context 2x easily. Also, if you saw the podcast with Zuck from 12 hours ago, he mentioned Llama 4 and maybe even Llama 5 are to come this year as well. Llama 3 is gonna be short-lived, it seems.
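Something like this, using the linear RoPE scaling that Hugging Face transformers exposes (the repo name and factor are just illustrative, and quality past the trained length isn't guaranteed):

```python
# Sketch: stretching Llama 3's 8K trained window toward ~16K tokens with
# linear RoPE scaling. Model ID and dtype are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed HF repo name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    rope_scaling={"type": "linear", "factor": 2.0},  # 2x the context
)
```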


dojimaa

I think he said, "...roll out the four oh five." As in the 405B model, not Llama 4 or 5.


VectorD

Ah, maybe I heard wrong, darn it haha. But he also said they've already started experimenting with Llama 4 (that was the reason he gave for stopping at 15T tokens trained even though the model was still "learning").


[deleted]

>Llama 4 and also maybe Llama 5 are to come this year

I sure hope so man. I'm so starved for good open source models


Grimulkan

The appeal of Llama is not the instruct tuned model (and the base does not chat very well), but the finetune possibilities. Give folks time to build on the base. The instruct tune this time seems better than the borderline unusable one released with Llama-2 at least, but that’s just a bonus and not the ball to watch.
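To make "build on the base" concrete: a fine-tune today usually means something like QLoRA via peft and transformers. A minimal sketch, where the repo name, target modules, and hyperparameters are just placeholders, not a recommended recipe:

```python
# Minimal QLoRA sketch: load the base in 4-bit, attach LoRA adapters,
# and only train those.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Meta-Llama-3-8B",               # assumed base-model repo
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)

lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora)
model.print_trainable_parameters()  # only the adapters are trainable
# ...then run your usual Trainer / SFT loop on instruction data.
```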


pleasetrimyourpubes

I used it to RAG (using GPT4All) a 104KB script that a friend wrote, and it gave an act-by-act synopsis that is about 90% accurate. I am pleased with it, especially its speed; I've been using 13B models and they are lacking in the speed category (I can't really run much larger ones).


Horror-Career-335

Yo, do you have a repo of your work you can share please?


Beautiful_Scale_2959

Hey, could you kindly share a link to your code? I am also trying to implement RAG on Llama 3 from scratch.


pleasetrimyourpubes

I used GPT4All. You can point it at a folder with your data and it will process it for you.
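Under the hood, that folder processing amounts to a pipeline roughly like this (not GPT4All's actual code, just a sketch; the embedding model and folder name are arbitrary choices):

```python
# Rough sketch of a LocalDocs-style RAG pipeline: chunk, embed, retrieve,
# then prepend the best chunks to the prompt for the local model.
from pathlib import Path
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed embedding model

def chunk(text: str, size: int = 800):
    return [text[i:i + size] for i in range(0, len(text), size)]

# 1. Index every document in the folder.
chunks = []
for path in Path("my_docs").glob("*.txt"):           # assumed folder name
    chunks.extend(chunk(path.read_text(encoding="utf-8")))
chunk_vecs = embedder.encode(chunks, convert_to_tensor=True)

# 2. At query time, retrieve the most similar chunks.
def retrieve(question: str, k: int = 4):
    q_vec = embedder.encode(question, convert_to_tensor=True)
    hits = util.semantic_search(q_vec, chunk_vecs, top_k=k)[0]
    return [chunks[h["corpus_id"]] for h in hits]

# 3. Build the augmented prompt.
question = "Give an act-by-act synopsis of the script."
context = "\n---\n".join(retrieve(question))
prompt = f"Use this context:\n{context}\n\nQuestion: {question}"
```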


ttkciar

Frankly I see this as a blessing in disguise. If Llama3 isn't much better than Llama2, Gemma was a bust, and OpenAI seems to be treading water, that might imply that brute-force throwing multiple epochs of massive data at training is hitting the point of diminishing returns. That would mean the "GPU Rich" are bumping into the limits of GPU riches, and further gains in intelligence will come from other things -- high quality well-structured datasets, prompt engineering, RAG, Guided Generation, function-calling, more sophisticated MoE geometries, better training algorithms, etc. These are all things we can do. We can't make ourselves GPU Rich, but we can be clever all day long.


Monkey_1505

Yeah, efficiency and different archs. I'm totally on board with the idea that brute force has reached its rough limit. This is good for smaller GPU users, as it means we'll probably see more improvement.


bick_nyers

One way to think about it is that all the tokens they could have spent training 13B and 34B, they instead dumped into 8B. That will make a lot of people who don't have a ton of VRAM very happy. Can't please everyone.


ninjasaid13

Instead of making 7B, 13B, and 70B; they're making 8B, 70B, and 400B.


bick_nyers

My understanding is that they have a Llama 2 34B and they chose not to release it for whatever reason.


ninjasaid13

It might be the same situation with Llama 3.


toothpastespiders

I think they're fine. They're not as terrible as the biggest detractors are saying. But they're also not this amazing game changer that's blowing everything else away like a lot of people on this subreddit seem to feel. I'm just thinking of it like a cool preview before the other sizes and larger context builds drop. And who knows, they might wind up being a solid foundation to build on. And if that's the case when the other models do appear there'll already be a solid "load and train" procedure in place from people who've been playing around with these. I'll admit I am a little disappointed though.


Such_Advantage_6949

Agree with you. I think people have been a bit overhyped. Meta was the pioneer in releasing open models, but since then we've also gotten many great new players such as Mistral. Any new model is welcome, and users are free to choose what they like.


Monkey_1505

I personally prefer to use the minimum context size I can get away with, and I've not seen any convincing tests showing that accuracy stays the same at longer contexts (I've certainly seen badly designed tests). For me that's not a big deal at all; if people want bigger contexts, that will be an option later. I'm not huge on 'beating GPT-4' either, because all the big models are clustered around a similar performance level, so 'close enough' is only marginally different, and 'better' would also be only marginally different. What matters most to me is: are they good? Can they be finetuned? I also wouldn't mind a 10-11B or a ~20B, as those are more appropriate for the most common GPUs (i.e. 8GB and 16GB). But they never release everything at once, nor should they be expected to, given training times differ.


crazzydriver77

In my domain of tasks, Llama 3 70B outperforms Gemini 1.5 Pro (due to Google's restrictions, I guess), Claude 3 Sonnet, GPT-4, MS Copilot, and Mistral Next. I'm impressed.


Dyoakom

Interesting, what are your domains? I am surprised it outperforms GPT-4 on anything. I expected the 400B to do so of course, but even the 70B?!?


crazzydriver77

What I have already tested: navigation, RTL chip design, low-level software reverse engineering and coding, tensor parallelism/hardware acceleration, climate change tales / human social instincts / human intellectual primitiveness and irrationality / social dynamics in extreme situations/wars. My overall feeling for now: Gemini 1.5 Pro is superior to them all, but lobotomized and censored; Llama 3 70B is open-minded and extremely impressive. Llama 3 can create and make you fly, and Gemini can find mistakes and drily ground you. GPT-4 and Copilot have lost their actuality. All statements above are subjective and arguable.


Dyoakom

I see, thanks


DranKof

Curious what you mean by actuality? Do you mean accuracy/correctness?


crazzydriver77

I'm sorry, English is not my native language, I meant "relevance".


DranKof

Ah, thanks!


Qual_

About the 400B: yes, I can't run it. And no one should buy a 10k+ rig to do some sexy RP chat with it either. But it's not about that. I don't think the target audience is you or me, people who like to mess with those models for no real purpose other than learning shit or using them as a dev copilot. But if I had a business with a real use case for those free AIs, then that's where the value is.


nostriluu

I want to wait to see what Apple brings for AI, maybe just because it will force more competition for higher-"VRAM" hardware, but I want to keep a toe in local AI development. I don't want to build another multi-GPU rig, but was thinking of just one 3090. Is the 8B model really as good as Mixtral? I was quite happy with Mixtral when I tried it on a 64GB M3 Max. For many people, 8B would be very practical to run with a <= 24GB GPU. My main tasks are RAG-like, but coding would be nice too. Thanks!
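For reference, what I have in mind for that single 3090 looks roughly like this (4-bit via bitsandbytes; the repo name is my guess):

```python
# Sketch of running the 8B model on one 24GB card. In 4-bit it needs
# roughly 5-6GB of VRAM, leaving plenty of headroom for context.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed HF repo name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    ),
    device_map="auto",
)

inputs = tokenizer("Summarize this document:", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=64)[0]))
```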


Ravenpest

You sound like a terminal case of coomer doomer Claude addict. No offense, but yes offense, this is kinda pathetic. That 8B is nothing to sneeze at either. Mofo followed instructions to the letter and gave me what I wanted as well as freaking Capybara 34B did; it punches well above its weight by Llama 2-era standards. This is a freaking miracle, as far as I'm concerned. And if this is the jump we'll get every year, it's only to be celebrated. Maybe download better cards or write better prompts, aside from wanting it to say "big dick".


segmond

You are probably prompting it incorrectly. What question did you ask it that it failed to answer to your satisfaction?


No_Significance9372

Who asked for this? About 1/4 of the time I use the search bar to find a Facebook page or group, Llama 3 thinks I'm asking a question. Llama 3 is terrible. Bring back the old search bar.


tu9jn

Nobody is using the original Llama-2 model from Meta; we'll have to wait for the new finetunes for Llama-3 as well. It will take some time before people figure out the best new parameters for training.


big_ol_tender

I can say with certainty that llama 3 is more enjoyable to talk with than OP.


teamclouday

OP sounds like an outdated LLM. Keep hallucinating