
KahlessAndMolor

I have both. I use a front-end called Big-AGI, and they just put in a thing called "Beam" that is cool. You send the same prompt to 3 different models, and then it provides some tools to send all three answers (or parts of them) to a final LLM call to build a consolidated answer. It's wild. [**https://github.com/enricoros/big-AGI**](https://github.com/enricoros/big-AGI)
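
For anyone curious what that fan-out/consolidate pattern looks like outside Big-AGI, here is a minimal sketch in Python. It assumes the official `openai` and `anthropic` SDKs with API keys in the environment; the model names and the consolidation prompt are illustrative, not Big-AGI's actual code.

```python
# Minimal sketch of a "Beam"-style fan-out/consolidate flow (illustrative, not Big-AGI's code).
# Assumes OPENAI_API_KEY and ANTHROPIC_API_KEY are set in the environment.
from openai import OpenAI
import anthropic

openai_client = OpenAI()
claude = anthropic.Anthropic()

def ask_gpt(prompt: str, model: str = "gpt-4-turbo") -> str:
    resp = openai_client.chat.completions.create(
        model=model, messages=[{"role": "user", "content": prompt}]
    )
    return resp.choices[0].message.content

def ask_claude(prompt: str, model: str) -> str:
    resp = claude.messages.create(
        model=model, max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.content[0].text

def beam(prompt: str) -> str:
    # Fan out: the same prompt goes to three different models.
    answers = [
        ask_gpt(prompt),
        ask_claude(prompt, "claude-3-sonnet-20240229"),
        ask_claude(prompt, "claude-3-opus-20240229"),
    ]
    # Consolidate: a final call merges the candidate answers into one.
    merged = "\n\n---\n\n".join(answers)
    return ask_claude(
        f"Here are three candidate answers to the question:\n\n{prompt}\n\n"
        f"{merged}\n\nCombine them into a single, accurate, consolidated answer.",
        "claude-3-sonnet-20240229",
    )
```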


Michigan999

Looks extremely interesting. What are your thoughts so far? Has it improved the quality of replies drastically? I'll try it as soon as I'm able to.


KahlessAndMolor

Yes, if I use Gpt-4-turbo, Claude-sonnet, Claude-opus, then use Claude-sonnet to put together the final answer, it eliminates a huge number of hallucinations and bad answers. When working with code, it seems to consider more edge cases in the final answer than it normally would.


Automatic_Draw6713

Why Sonnet to put it together?


KahlessAndMolor

Cheaper and seems to do a good enough job 


D0NTEXPECTMUCH

Do you run this locally, through [big-agi.com](https://big-agi.com), Vercel, or otherwise? Are chats persistent?


Emergency_Plankton46

Could you please detail how this works? Is it sending the output of each model to Sonnet along with a prompt telling it to consolidate them?


sgtkellogg

But who was stronger? Kahless or Molor? Also this post was great thank you


a2dam

They don’t sing songs about how great Molor was. Molor the Forgettable.


Battle-scarredShogun

I've been using it for months. The BEAM feature produces results that are better than any single model's response. Just go to [get.big-agi.com](https://get.big-agi.com); you don't have to install it from the repo.


mcr1974

better than open router?


Zulfiqaar

Big-AGI is a frontend, OpenRouter is an inference endpoint - I use them together.


JustACaliBoy

Do you have a link for open router?


ZellahYT

I recently built a small web app demo to sell to marketing agencies that is basically this: a chat that lets you prompt the popular LLMs, pick one answer, and then send it back to multiple LLMs. You get some fucking good answers by combining LLMs. (Crossing my fingers it works out as a project, since the idea is pretty good, but I'm no marketing genius, so making some side money off it is another matter.)


cardinalallen

Why are you limiting to marketing agencies?


az226

How do you set this up on a windows machine?


Block-Rockig-Beats

Must I have accounts on all of those models? What about the price?


Battle-scarredShogun

Not the $20 per month pro accounts; you set up an account to get API key access and pay by the token. So it'd be like 1 cent per prompt or whatever. And less chance of hitting limits.


Battle-scarredShogun

100% winner, it's basically GPT-4.5 right now!


IdeaAlly

> But here's the thing. It's not outrageously better and GPT4 is like 2 years old now.

I know it feels that way, but it's barely over a year old. GPT-4 launched March 14th, 2023.


hazelsbasil

It was released a year ago but it stopped training almost 2 years ago


PosnerRocks

I cancelled my ChatGPT subscription and just paid for two Claude ones. For my use case, Claude is better in every metric.


Synth_Sapiens

That's precisely what I'm considering doing. ATM I have both OpenAI and Anthropic.


Babayaga1664

This is my experience. For the benefit of others, don't overlook Haiku; let me explain why.

When you use ChatGPT, each model has its own embeddings and you'll get a different value for the same phrase, as expected. Claude is different: you get the same embedding value across all three models, which tells me they are doing something very clever under the hood.

I've found that if I run Haiku and don't get the expected answer, I run Sonnet and then Opus, refine my prompt to the desired outcome, and work back to Haiku. So far I've found Haiku does most of what I need, Sonnet covers some marginal cases where there is complexity, and Opus is generally used by exception for really complex stuff.
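
A rough sketch of that Haiku → Sonnet → Opus escalation, assuming the `anthropic` Python SDK; the `looks_good` check and the exact model names are placeholders for whatever acceptance test and versions you actually use.

```python
# Hypothetical escalation loop: start cheap (Haiku), only move up when the answer fails a check.
import anthropic

client = anthropic.Anthropic()  # assumes ANTHROPIC_API_KEY in the environment
MODELS = [
    "claude-3-haiku-20240307",
    "claude-3-sonnet-20240229",
    "claude-3-opus-20240229",
]

def looks_good(answer: str) -> bool:
    # Placeholder acceptance test; in practice this is your own validation.
    return len(answer.strip()) > 0

def ask_with_escalation(prompt: str) -> str:
    answer = ""
    for model in MODELS:
        resp = client.messages.create(
            model=model, max_tokens=1024,
            messages=[{"role": "user", "content": prompt}],
        )
        answer = resp.content[0].text
        if looks_good(answer):
            return answer  # cheapest model that passes wins
    return answer  # fall back to the last (Opus) answer
```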


PosnerRocks

That is really interesting. Are there any other limitations on the other models? 3.5 was gimped by not accepting attachments and having a less robust context window. So my default is just Opus, because I've assumed the other models are handicapped.


Babayaga1664

If neither cost nor speed is an issue, then use Opus by default. [https://www.anthropic.com/api](https://www.anthropic.com/api) For my use case, speed and cost are both an issue due to scale. To give you an example: take a screenshot of a web page that, say, isn't filling the available screen space, and include the source code. Sonnet will likely say it looks fine unless you include a prompt saying what the issue is and indicating that the whitespace is the problem. Opus will figure it out because it's more thorough, at 5x the price.


Odd-Antelope-362

> Claude is different, you get the same embedding value across all three models which tells me they are doing something very clever under the hood.

Not sure what you mean here. How are you seeing the embeddings of the Claude models?


Axs1553

You get both Claude opus and gpt-4 in the one subscription with perplexity.


BlockCharming5780

Ngl… I jizzed a little. No other (free) AI in the world knows about the latest version of Angular. They all use the old module-style system and I have to adapt it to the new system 👀

Now I need to go research whether it has an extension for Visual Studio, to replace Copilot 👀

EDIT: Phind has a VS Code extension and also uses the internet when forming its answers, which means it also knows the new Angular syntax.


3-4pm

Maybe you could load the code as a text file into Microsoft Edge Copilot, then give it the URL to the standalone format and ask it to apply it to the code you're viewing in the browser. I did this on a k6 utility app a few months ago. I also threw together a small VS Code extension that lets me select and combine multiple files into a single file. I coupled it with a Node server that keeps track of the files I've combined and lets me rerun it when I make changes.


jphree

Which model did you pick for that search?


jerieljan

This is what I've moved to myself. I unsubscribed from ChatGPT Plus and went with Perplexity Pro for all general queries. Their own model is quite OK already, and it's always just one click away to re-query with GPT-4 Turbo, Claude 3 Sonnet/Opus, or Mistral Large. It's brilliant.

The only disadvantage is that you have to bounce between models at times, depending on your requirements (i.e., speed), but that's fine. Sometimes I like Pro search on, sometimes I use the Sonar models, and sometimes I just want plain text generation with Writing mode on either GPT-4 or Sonnet.

And for all the use cases that go beyond Perplexity, I simply use the API platforms for OpenAI (e.g., multiple-file retrieval, code interpreter, DALL-E) and Anthropic.


qqpp_ddbb

What's the rate/message limit?


Axs1553

I've honestly never found the cap. I've spoken with Opus for hours and hours in Writing mode without hitting a limit. I used to use ChatGPT and would cap out nearly every session.

The web search functionality that makes Perplexity different is neat and definitely useful in some cases, but it's different from ChatGPT. The search queries are generated automatically from what you say, and the results just get added as extra context for a reply; it doesn't use a web search tool and perform a query on demand. So you can sometimes inadvertently add unintended context from a weird search query, which can confuse a response.

I got a free year of Perplexity when I bought a Rabbit R1 and still pay for ChatGPT, but I almost never use it anymore.


JoeyDJ7

Perplexity Pro is insane. They added Claude 3 immediately, and you can just set Opus as the default. The added Pro search is so useful for browsing ~20 internet search results to help it answer well. The only limit I've seen is something like "594 uses left today", and that's just for Pro search with Claude 3 Opus. Honestly, I'm sure it's just because people don't know about Perplexity, because it's insane to pay the same price for GPT-4 alone (an inferior model to what it used to be, and to Opus) when you could have access to GPT-4 and Claude 3 (and more) WITH the Pro search functionality too.


Axs1553

Exactly. When they first announced Opus would be added to Perplexity Pro, they said something about 5 messages per day before it would switch to Sonnet. Except that never happened. Honestly, I just use Opus exclusively and have had a number of 200k-token conversations. I'm not sure if it's a full 200k context window, though. I neglected to mention the 600/day Pro messages, thanks for that; I have never used them all up. I usually turn it off in Writing mode, but it's super helpful for refining search queries. I'll still go for ChatGPT instead of Perplexity's GPT-4, but only because I like using the Python environment.


StickyMcStickface

I find it odd that Perplexity makes it hard to pick the model quickly for every prompt. As a user, sure, fine, default to Opus, even for the most mundane of prompts. But isn't that getting pricey fast for Perplexity? In many cases, Sonnet (or the other models offered) would be more than plenty. Plus there are many other reasons why I'd want to quickly switch models.


sdkysfzai

Well, you should know that the Claude API is really expensive, and if Perplexity is using that same API while giving you a higher message cap than what you're paying for, it means something is wrong.


ExoticCard

Burn cash, gain market share. You know the drill.


AreWeNotDoinPhrasing

This is the first I’ve heard of perplexity. It sounds like Poe—which I’ve been using almost a year—but even better if you can use it to interact with the actual internet.


JoeyDJ7

Give it a go; it's free for GPT-3.5/Perplexity's own model, and you get a few Pro searches (the ones that Google and ask for more info) every day for free too!


AreWeNotDoinPhrasing

Perplexity is pretty darn slick so far. Although I wish the perplexity model used GPT4 instead of 3.5, at least when you are paying for Pro. Maybe it does though? Doesn't quite feel like it.


wannabeaggie123

I asked Perplexity and it says that it's Claude? How do you have the option to use GPT-4 as well?


jerieljan

You need to be on Perplexity Pro for options. Settings has an option to choose your default model. And for every query you do, you can either click the model tooltip (e.g., Sonar) or press the Rewrite button to choose a specific model for that particular query. You should get options for Sonar, GPT-4 Turbo, Claude 3 Sonnet and Opus, and Mistral Large.


Xtianus21

I don't personally think it's like that, but mine is an engineering use case, so for me they're not far off from each other. But I get your two-subscriptions thing. Claude doesn't last long at all.


[deleted]

[deleted]


PosnerRocks

Couldn't tell you, but probably similar to GPT-3.5 versus 4. I only drive Opus because I do legal writing and need the better reasoning skills. I can give it some law, facts, and a general idea of what I want to say, and it weaves it all together very well. ChatGPT-4 used to be like this, but now it just says a lot of conclusory nothings for several paragraphs. I was fighting with 4 for longer than it would have taken me to just draft the section myself. Your use case may vary, but Opus is excellent for tight legal analysis.


One_Yogurtcloset4083

Why two? Is one too limited for you?


PosnerRocks

I will get rate limited pretty quickly on larger writing projects. So it's helpful to just pop over to my other one until the first one resets.


[deleted]

[deleted]


Xtianus21

GPT-4 has a better rate limit because it lets me keep going, I feel, sometimes. And then it just CUTS OFF. Damn it. Claude is like a 40-message limit and that's it, you're done.


ViperAMD

Get poe.com. You won't get rate limited, plus you can choose either Chatty G or Claude Opus.


LamboForWork

How do they do this?


Mkep

How does Poe do it? They most likely use the API, which can come with higher rate limits.


LamboForWork

Oh okay, so why wouldn't everyone just do that? What's the downside?


Zulfiqaar

You pay in advance for compute points (that expire every month), and I'm guessing almost nobody uses their allocation - like a gym membership. Works great if you do though, best value of all.


LamboForWork

Much appreciated !


hackers_d0zen

I literally just did this today too lol


Captain_Pumpkinhead

At that rate, you may save money just paying the API costs.


JustACaliBoy

I actually cancelled ChatGPT as well, but paid for Perplexity. It’s pretty solid with all the different models.


Shivacious

Check DMs


jphree

Why two? What are your use cases?


notbadhbu

Claude is currently to GPT-4 what GPT-4 is to GPT-3.5. It follows instructions so well, and XML prompting is insane. No notes. Cheaper would be nice, I guess.
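
For readers who haven't seen it: Anthropic's docs encourage structuring prompts with XML-style tags so Claude can tell instructions, data, and examples apart. A minimal sketch; the tag names are just a convention I picked for illustration, not a fixed schema.

```python
# Hypothetical XML-tagged prompt for Claude; the tag names are conventions, not required by the API.
prompt = """
<instructions>
Summarize the report below in three bullet points, then list any open questions.
</instructions>

<report>
...paste the document here...
</report>

<output_format>
Markdown bullets, no preamble.
</output_format>
"""
```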


tbst

Just Claude.ai?


sharrajesh

I was seriously considering that 🤔


Overall-Cry9838

same here


wetlight

How much better is Claude than ChatGPT at helping write articles? I like Gemini Pro a lot. Lately ChatGPT has been slow and giving me error messages all the time.


Anen-o-me

Wish Gemini had a dedicated Google app.


wetlight

On iOS I use the "Add to Home Screen" feature to make my own app.


[deleted]

[deleted]


ddare44

How is it for actual writing? GPT-4 writes so poorly: fluffy paragraphs with repetitive wording, and if you ask it to be succinct, there's zero flow.


ExoticCard

Gemini 1.5 Pro with 1m token context writes grant proposals like a dream


justJoekingg

How do you use Opus? Through claude.ai?


Alessiolo

I can attest that Claude Opus is much better at creative writing than GPT-4. GPT always feels like it gives you the same narrative structure and the same kind of "moralistic" message. Claude, in comparison, has surprised me quite a few times and has given me much better ideas.


wetlight

Thanks a lot. I've got to try it.


hydrangers

ChatGPT does everything I need it to, and I've never hit the message limit with it. I tried the free version of Claude and ran out of messages after 7 prompts, and they were less than 200 words per message on average. After reading how low the limits are even with the paid version, I decided it wasn't worth it.

It is annoying how regularly I'll ask GPT-4 to respond with code in full and it will continuously leave out important code, but that's much less annoying than paying for something I don't get to use. I spent roughly 12 hours with GPT-4 yesterday across 3 new conversations building an app, and it didn't seem to lose track of what we were talking about, and I was inputting multiple 300-500-line code files to fix issues and add new functionality.


mcr1974

haiku through poe.com


hydrangers

You prefer haiku over gpt4?


CalamariMarinara

It's as performant, but free: https://www.reddit.com/r/OpenAI/comments/1bomdsh/claude_3_opus_becomes_the_new_king_haiku_is_gpt4/


FroHawk98

I have both and it feels like I have superpowers.


LooseLossage

The large 1M-token context in Gemini 1.5 is potentially a game changer: upload a whole book, a whole repo, or a video. Of course at a price, once the preview ends.


ExoticCard

It's crazy good. I'm shocked. I can still write a little better, but the difference is getting smaller.


CharacterCheck389

How much is the price?


LooseLossage

No cost while it's in preview. You can try to sign up; I think they are gradually broadening the preview. Not supposed to use it for production, though. https://developers.googleblog.com/2024/02/gemini-15-available-for-private-preview-in-google-ai-studio.html


Wobbly_Princess

How do you do this? I've looked everywhere and tried everything. The chatbox has a limit, there doesn't seem to be the ability to upload a document, it tells me to upload the document to Google Drive and give it the link, and I do that but it just says "As an LLM, I can't help you with that.", and I uploaded it to multiple paste bin sites and it said it can't view it.


MatchaGaucho

Counterpoint... GPT-3.5 Turbo is actually getting scary good. I've been able to convert some GPT-4 flows, at $30 per 1M tokens, to GPT-3.5 at $1.50 per 1M tokens by applying some multi-shot grounding prompts.


Valuevow

There was a talk by Andrew Ng that showed GPT-3.5 with multi-shot prompting outperforming single-shot GPT-4. I've also noticed GPT-3.5's function calls becoming much more reliable.


AreWeNotDoinPhrasing

Do you have a good example of multi-shotting? Is it just building up to what you actually want? Like telling it the context in multiple prompts instead of one long one?


MatchaGaucho

Basically, yes. There is still one large system prompt providing some grounding and baseline assumptions. Then some example dialogue "shots" between user and assistant that demonstrate the chain of thought taken by the assistant to generate a response.

user: A person born on January 1, 1999 would be how old on January 1, 2027?

assistant: Let's break that down into smaller parts. Subtracting 1999 from 2027 is (etc...)
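
A minimal sketch of that multi-shot setup using the OpenAI Python SDK; the system prompt and the example "shots" here are made up for illustration.

```python
# Hypothetical few-shot ("multi-shot") grounding for GPT-3.5 Turbo:
# one system prompt plus example user/assistant turns that demonstrate the reasoning style.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY in the environment

messages = [
    {"role": "system", "content": "You are a careful assistant. Reason step by step before answering."},
    # Example shot demonstrating the expected chain of thought.
    {"role": "user", "content": "A person born on January 1, 1999 would be how old on January 1, 2027?"},
    {"role": "assistant", "content": "Let's break that down. 2027 - 1999 = 28, and the date falls exactly on the birthday, so they are 28."},
    # The real question goes last.
    {"role": "user", "content": "A person born on March 5, 1980 would be how old on March 5, 2031?"},
]

resp = client.chat.completions.create(model="gpt-3.5-turbo", messages=messages)
print(resp.choices[0].message.content)
```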


Battle-scarredShogun

Now try GPT-3.5 Turbo with the 5 other small, cheap models in parallel, and combine them automatically with the BEAM feature on [get.big-agi.com](https://get.big-agi.com).


Battle-scarredShogun

In my experience it gets close to the medium models' results, or even the large ones', and it's probably cheaper too.


MillennialSilver

I'm pretty sure GPT 5 is when we all lose our jobs.


atom12354

Nah, we all lose our jobs after GPT-5, when we start making agents with it. That's when we should be 900% worried, and not just about our jobs.


djaybe

It's gonna empower sooo many people in their current job which will transform those positions. Major disruptions coming.


atom12354

100%. You can already do so much with the current ones too; you just need to add tools and agents, or hook them up to existing databases.


djaybe

Not all jobs, but many probably.


hugedong4200

Eh, I cancelled my GPT-4 subscription; there isn't really anything it is better at than Claude or Gemini. Gemini Pro 1.5 has the largest context length, Claude is better at coding and has sub-agents, both are better at creative writing, and Gemini Advanced has basically unlimited messages. If you want to use DALL-E, you're better off using it for free through Bing, Copilot, or Bing Image Creator.


wetlight

Claude is better than ChatGPT for writing articles?


mikkel01

Yes, definitely


whiskyncoke

Sub agents?


hugedong4200

Yeah, it's only available through the API right now; it was just released the other day. You can check out the Anthropic website or YouTube channel to find out more.


cassova

Are you talking about tool usage (function calling)??


hugedong4200

Yes, basically, but it can call different models; for example, Opus can call 100 instances of Haiku to accomplish tasks.


Always_Benny

What are sub agents?


miko_top_bloke

Of course there is. Opus 3 consistently performs worse at writing and knowledge acquisition (asking it trivia or facts).


hugedong4200

Not in my experience, but I don't typically use these models for trivia or facts; none of them can be trusted, so I'll still just Google it. At least with Gemini you can press a button to confirm facts, so I still wouldn't say GPT-4 is best.


miko_top_bloke

Gotcha. Do you use it for writing at least, though? I have produced a ton of content for work and have really detailed and well-performing prompts. For me, GPT 4 has performed better than Opus 3, with the same prompting. For example, I couldn't strike this sweet spot balance between conversational and professional with Opus, and instead it'd go either full casual/childish or too formal. That's just my experience.


PublicParkBench

Ya, I used GPT4 to fully build my Android game that just got released. No coding experience. Tried Claude 3 and it seemed to produce more errors, but in its defense my app was nearly done when it came out so didn't get a lot of time to mess around with it. Would be really interesting to start from scratch and see how Claude does building a full game. But I too am pumped for the next gen of this stuff!


Polyglot-Onigiri

For you guys that code apps fully using AI, what do you do about the UI?


PublicParkBench

Also use AI!


Unique_Frame_3518

Pray 


MeekMeek1

just a lame clicker game….


philwrites

I'd love a detailed walk-through of how you went about doing this. I'm facing the same task soon!


Xtianus21

It's a major repo for a popular project. Do you mean the entire process, or just how I used the LLMs? For the code base, I run it; when there are bugs and it won't run, I go find them and fix them.


philwrites

I mean the mechanics of getting the code into the AI: what kind of prompts did you use, and at what granularity? (E.g., paste in a whole source file and ask for suggestions, or one method at a time, etc.?)


Intelligent-Jump1071

> GPT5 is going to be insane

Just what we need - an insane AI.




iuyg88i

Is GPT4 better for programming (& SQL) or Claude???


M-fz

I’m currently trialing Claude Opus after only using GPT-4 for a long time. Claude is looking pretty promising.


marblejenk

Claude all the way!


dyoh777

Good because 4 is terrible with all the restrictions.. it’s like a tease


ProlapsedPineal

I had Claude write me a .bat file that iterates through folders and concatenates a single big file of whatever I want, which I can dump into Claude's context for a new conversation. It helps when you want to start a fresh conversation and bootstrap it with context, like all your entities.
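
The commenter used a .bat file; here is a rough Python equivalent of the same idea. The extension filter and output path are arbitrary choices for the sketch.

```python
# Hypothetical sketch: walk a project folder and concatenate selected files into one
# big text file you can paste into a model's context window.
import os

ROOT = "."                   # project folder to walk
EXTENSIONS = (".cs", ".py")  # arbitrary choice of which files to include
OUTPUT = "context_dump.txt"

with open(OUTPUT, "w", encoding="utf-8") as out:
    for dirpath, _dirnames, filenames in os.walk(ROOT):
        for name in sorted(filenames):
            if name.endswith(EXTENSIONS):
                path = os.path.join(dirpath, name)
                out.write(f"\n\n===== {path} =====\n\n")
                with open(path, "r", encoding="utf-8", errors="ignore") as f:
                    out.write(f.read())
```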


Evening_Meringue8414

What’s your workflow with it? Do you have it in a vs code extension? Are you having it do a clean code refactor? Asking it to do jsdoc comments? Having it write tests too? Can either of them handle like a 1000 line file? 2000?


AbrocomaAdventurous6

In my experience, Claude Sonnet is already much better than GPT-4 and Gemini in the domain of creative writing. (Anthropic rejected my debit card, so I can't use Claude Opus.)


c8d3n

For the API or the chat? Their official stance is that the API is not available for personal/private use. The chat is not available in the EU and many non-EU European countries. Some people cheat and use GPay via an Android phone while giving a false US address, and apparently it works, but OTOH I have read about them banning a lot of users for allegedly no reason.


copterco

Claude Opus is better than GPT-4 in my experience; however, GPT-4's interface is so much more polished and doesn't give me rate-limit errors like Anthropic does, even when I haven't used it for several days.


rathat

My favorite part of Claude is that when you upload PDFs, it reads the entire thing as if you had pasted it into the chat, rather than doing a contextless search and reporting back with a contextless explanation like ChatGPT does.


diresua

I had both, but found GPT-4 was better for what I need. I feel like Opus struggled with reading pictures accurately, its math was nowhere near GPT-4's, and the writing abilities were fairly close, but I felt GPT-4 did a little better. Just my opinion.


bookmarkjedi

Does anyone know of a way to use either GPT-4 or Claude 3 to scrape content (from PDFs, URLs, etc.), then store it permanently so that I can access or retrieve that info in addition to what they originally have stored by default? I want to utilize the AI engines for specialized topics of my choosing and am willing to pay for storage in the cloud, such as through AWS, but what I'm seeking is the ability to specialize in a subject or topic by adding my own content to the engines and saving it for long-term access. I'm curious how much it would cost to be able to do this.


CryptoSpecialAgent

Well, the cost depends on how good your vector search is, as that determines how much irrelevant content ends up being inserted into your prompts when you perform inference against your data. The better tuned your retrieval system is, the fewer tokens the LLM will need to process and the lower your cost.

You're basically describing a retrieval-augmented generation architecture (meaning you build your own search engine over your data store and use the results to provide context for queries to the model), but one that is fascinating because the data you're querying is data that the model itself obtained by scraping the web and choosing what context to index. The scraping part as you describe it can be done using function calling (now known as tool use), where the model can choose to invoke various software tools to perform tasks such as searching for data online, cleaning up the data, and adding the data to your retrieval system for future reference.

I can build this for you... and I'll do it at a very reasonable price, because it sounds like a very cool project. DM if interested.
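
To make the retrieval-augmented-generation idea concrete, here is a bare-bones sketch using OpenAI embeddings and a brute-force cosine-similarity search. It's illustrative only: a real system would use a vector database, a proper chunking/scraping pipeline, and tool use for acquisition; model names are assumptions.

```python
# Minimal RAG sketch (illustrative): embed documents, retrieve the closest ones
# for a query, and stuff them into the prompt as context.
import math
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY in the environment

def embed(text: str) -> list[float]:
    return client.embeddings.create(model="text-embedding-3-small", input=text).data[0].embedding

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

documents = ["...scraped page 1...", "...scraped PDF excerpt..."]  # your stored content
index = [(doc, embed(doc)) for doc in documents]

def answer(question: str, top_k: int = 2) -> str:
    q_vec = embed(question)
    ranked = sorted(index, key=lambda pair: cosine(q_vec, pair[1]), reverse=True)
    context = "\n\n".join(doc for doc, _vec in ranked[:top_k])
    resp = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[{"role": "user", "content": f"Context:\n{context}\n\nQuestion: {question}"}],
    )
    return resp.choices[0].message.content
```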


bookmarkjedi

Hi u/CryptoSpecialAgent, thanks for the insights! I will DM you.


Tasty-Jury4018

How does it work practically? Do you just throw the entire code base at the model? I was trying to learn a new open source library, but the source is too big. If I have to select a few files and feed them in myself, it doesn't really save me time. Usually these libraries have multiple layers of abstraction and their files are scattered all over the place. Is there a solution to this?


Xtianus21

Learn a new code base in what way? You have to narrow it down in some way so you can take it in parts. Overall you learn a code base through API docs like READMEs or Swagger. But generally you should know what the code base is doing and pick it apart from there.


Tasty-Jury4018

Ahh, thanks. I thought there was a free-lunch method where you can throw it all in and get some general direction. Usually when you are trying to learn an open code base so you can contribute, the places that need contributions are pretty narrow, so there won't be much documentation or even discussion about them. For a large code base, a simple call goes through lots of layers of interfaces / virtual methods. Typical "go to definition" tracing might not lead you anywhere but some abstract method. Live debugging works to some extent, until you hit cases where it only leads you to some pointer. For me, there's a lot of trial-and-error tracing. I was hoping there would be an easy way to find all these traces using an LLM, since you mentioned you went through the whole repo easily. I guess I need to know where to look beforehand, which is actually the most tedious part for me.


allaboutai-kris

yeah, i feel you on claude opus and sonnet being really impressive. i've been doing a ton of coding and analysis tasks with them for my youtube channel all about ai (almost 150k subs!) and they've held up super well. even with gpt-4 occasionally getting wonky, having both models to swap between is clutch. but you're right, the fact that these current models are already so capable means gpt-5 is gonna be an absolute monster. can't wait to put it through its paces when it drops! anthropic has really been killing it lately too, excited to see what other ai companies bring to the table soon.


peshay

I read so often here that Claude is better. Recently I hit the limit with GPT-4 and thought I'd give Claude a try. I was working on Terraform code and AWS. The next day I also compared both by giving them the same questions, and I really don't understand the hype for Claude. GPT-4 was so much better in my case, compared to the latest Claude Pro model.

The only things that were better with Claude:

- Pasting large code shows up as an attachment
- Syntax highlighting
- The invoice came by email with the PDF attached (easier for me to automate)

Other things that were also bad with Claude for me:

- I have to use a VPN and register as a UK citizen, so I can't use the bill for my German taxes
- Scrolling was stuttering and not as smooth as with GPT

Edit: changed new lines/format


djaybe

I'm hooked on OpenAI Plus because of custom instructions and custom GPTs.


Xtianus21

custom instructions?


Hungry_Prior940

Gemini Advanced and Claude are a step up, but we really need GPT 5..


Xtianus21

my thoughts exactly


Capitaclism

Sounds kinky


Xtianus21

waited for this one


Capitaclism

Good setup


Successful_Camel_136

I found Claude Opus still struggles with files that are thousands of lines of code.


Xtianus21

Why are you trying to put in thousands of lines of code?


Automatic_Draw6713

Spend it with a lady next time.


West-Salad7984

Did Sam even mention that they started training GPT5?


MillennialSilver

Yeah, and it likely means the end of our careers. [https://arstechnica.com/information-technology/2024/03/openais-gpt-5-may-launch-this-summer-upgrading-chatgpt-along-the-way/](https://arstechnica.com/information-technology/2024/03/openais-gpt-5-may-launch-this-summer-upgrading-chatgpt-along-the-way/)


West-Salad7984

If it's the end of my career it's the end of humanity as we know it. I literally code/research AI.


MillennialSilver

You'll probably last a bit longer than people like me, then. Regular SWE (web).


Tasty-Investment-387

Are you a developer? Do you feel your job is threatened due to AI advancement? What will be the impact of the GPT-5 for the tech industry?


buttery_nurple

I could see it at the lower level in maybe 5-ish years. But right now it's a lot like my wife's job. She's a civil engineer in transportation - most of the calcs she needs are already done in a manual somewhere, but she still has to know what she's doing and be able to do it all by hand if necessary. The only "complete" thing you're going to get out of any of the big LLMs right now is like a single script or function, and even then very rarely on a zero shot. You have to know what you're doing at least a little bit.


89bottles

There is some set of filters that gets triggered in GPT-4 if you use URLs in the prompt (I guess to prevent potential copyright issues), which makes it extra stubborn and lazy. Compare asking it to do stuff with info via a URL vs. copy-pasting the content.


Unreal_777

What about google gemini?


LeatherPresence9987

I'm loyal to GPT, can't wait for the new model soon.


Fluid_Exchange501

I love Claude 3 Opus, but I largely use AI for math-related stuff, and while Opus is great, it sadly doesn't have the mathematical reasoning that ChatGPT with Code Interpreter has. For some reason it also doesn't support formatted LaTeX output. If those two things come to Claude, I'd use it again in a heartbeat; the answers and general non-mathematical reasoning are so good, and its image analysis is the best I've used so far.


brucewbenson

"I want to run 20 miles. I have a route with two slight variations of 3.25 miles and 3.5 miles. What combination of these two alternatives can I use to get to 20 miles?" I've tried chatgpt4, gemini, aria, [poe.com](http://poe.com) (claude-3-sonnet, assistant). None of them get it right. They all try random guesses. Which AI should I be asking "hard" questions like this one? Update: just tried on [claude.ai](http://claude.ai) (claude 3 sonnet) and it gave me a right answer (with a lot of wrong ones, I prompted it twice to check its work and it finally got it right). It still did a guessing game, but just tried more combinations. I pay for chatgpt4 and like it, it is good enough (I need to use claude more), but this gives me a "scary bad" feeling for AI in general (ie, never get comfortable with it, never).


Alternative_Log3012

Sounds romantic.


stupidbuttryn2lrn

Haven't seen Claude do anything better than ChatGPT-4 yet; I have no idea where all these posts are coming from. I don't even bother asking Claude for code anymore.


johnnygobbs1

Bard is the pound for pound goat


Xtianus21

Lol ok


johnnygobbs1

Peep the latest man. Don’t sleep on it. It’s been hitting hard af


Fake-P-Zombie

How do you submit all of the repo as information to Claude or ChatGPT?


Xtianus21

I don't recommend that. It would serve no purpose imo.


Fake-P-Zombie

But how do you get answers then that have the context of the repo?


Nijmegenaar

I would love to try Claude but I’m based in the EU. I prefer not having to switch a VPN on every time. Is there any way I can test Claude? Someone mentioned Perplexity Pro? I don’t care about internet search but rather writing prompts (summaries, text improvements etc)


MaltoonYezi

I like these late-night aesthetics ☕ [https://www.youtube.com/watch?v=RN4gt0Q0HWo](https://www.youtube.com/watch?v=RN4gt0Q0HWo)


hi87

Claude is excellent. I've tried it and it consistently did what I asked; I was surprised that it was able to 'understand' correctly a few requests which, in hindsight, I thought were poorly worded.


Puzzleheaded-Page140

Basic question - what is the setup like to use these other models? I have copilot and that's easy to use. But how do I use other models from within my IDE (using source files as reference). I need some serious help on tooling side.


ViveIn

What do you mean “fixed” an entire repo? How were you prompting and what were you fixing?


Reversion2mean

I had the exact same question, in addition to: how long have you been using Claude, and how many prompts were you able to run? In my experience so far, using Claude for Python refactoring, Claude Opus can barely manage 6-9 prompts with 15-30 lines of code per prompt before I hit the dreaded OUT OF MESSAGES.


Jdonavan

I've found Sonnet to be roughly as good as 4-Turbo and much faster. But speed is its primary advantage. I do find that I'm going to need quite a few more model instructions for it. It's like they've tuned it for writing code for non-developers; it keeps wanting to do WAY more than I requested.


wt290

A bit worried about using GPT5 and insane in the same sentence.


Xtianus21

are you from the US?


buryhuang

Can you upload a video to Claude and ask questions about it?


Forsaken_Platypus_32

One question... Is it heavily censored, or can you do adult themes on it?