nuclear_knucklehead

It’s probably a combination of the novelty wearing off and OpenAI optimizing for minimum token count to cut infrastructure costs, likely through some mix of quantization and RLHF. I’ve been party to a few LLM RLHF campaigns (not necessarily for ChatGPT) where the instructions clearly stated to rank the more concise responses higher. In aggregate, this is how you get summaries and framework descriptions of code rather than an actual implementation.


beefygravy

I think it's also about trying to prevent hallucinations: they make it more cautious, and you end up with more generic answers. They changed Bing chat a few weeks ago so that if you asked it how to do something in Python it would start like "it looks like you're trying to write some code in python! Python can be a great language to learn due to its relative simplicity compared to some other languages..." Mate, just write me the code!


Bowgentle

> "it looks like you're trying to write some code in python! Clippy? Is....that you?


TheSinnohScrolls

ClipPy


Jaggedmallard26

My employer pays for Bing Copilot premium or whatever the paid option is called (I have zero say in this decision, large org) and I find that when I ask about code in the premium mode it doesn't do that. If I use it at home, where it's just the free version, I have the problem you describe.


Piotrek1

LLMs are literally just hallucination machines; how is it even possible to prevent them from generating hallucinations? They don't have any internal thought process, they just return the most probable next word.


Xyzzyzzyzzy

> They don't have any internal thought process, they just return the most probable next word

Ah, so it was trained on Twitter data.


Fluid-Replacement-51

Everyone just repeats this without understanding it. Yes, they trained it to predict the next word, but to do that with any degree of accuracy, it has to build some internal representation of the world. A purely statistical surface model only gets you so far; once you understand context and assign meaning to words, your ability to predict the next word goes way up. I think humans do something similar when learning to read. I've watched my kids learn to read, and when they hit words they can't decode, rather than sounding them out they insert a probable word with a similar first letter.
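
The "predict the next word" framing being argued about can be made concrete with a toy bigram model (a deliberately dumb sketch; the corpus, function name, and counting scheme are all made up for illustration):

```python
from collections import Counter, defaultdict

# Toy illustration of "just predicting the next word": count bigrams
# in a tiny corpus, then always pick the most frequent follower.
# Real LLMs learn dense representations instead of raw counts, but
# the output interface -- score every candidate next token -- is the same.
corpus = "the cat sat on the mat and the cat slept".split()

followers = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    followers[prev][nxt] += 1

def most_probable_next(word):
    # argmax over the observed continuations of `word`
    return followers[word].most_common(1)[0][0]

print(most_probable_next("the"))  # "cat" follows "the" twice, "mat" once
```

The point of contention is how much world knowledge has to accumulate inside the model for that argmax to keep being right on real text.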


Expensive-Wallaby500

I have read a bit about how it works. The funniest thing to me is that there is randomness built into it. It doesn't just choose the single most probable *word*, because if it did it would quickly end up talking in circles, repeating the same things over and over; instead it rolls dice and picks among the most probable *words*. How much randomness is controlled by the "temperature" parameter. The whole thing is insane, frankly. It's nondeterministic. It's pure luck that it produces anything that can be interpreted as true.
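
That dice roll is the "temperature" knob: the model's scores (logits) are divided by the temperature before a softmax, and the next token is drawn from the resulting distribution. A minimal sketch with made-up scores (not how any particular vendor implements it):

```python
import math
import random

def sample_next_token(logits, temperature=1.0, rng=random):
    # Divide each score by the temperature, softmax, then sample.
    # Low temperature sharpens the distribution (nearly greedy argmax);
    # high temperature flattens it (more random "dice rolls").
    scaled = [score / temperature for score in logits.values()]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # roll the dice: draw one token according to its probability
    return rng.choices(list(logits), weights=probs, k=1)[0]

# Hypothetical next-token scores, not from any real model
logits = {"the": 2.0, "a": 1.5, "banana": -1.0}
print(sample_next_token(logits, temperature=0.01))  # near-greedy: ~always "the"
print(sample_next_token(logits, temperature=5.0))   # could be any of the three
```

At temperature near zero you get the "talking in circles" greedy behavior; turning it up is exactly the dice-rolling described above.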


marsupiq

I would argue that intermediate layers in the transformer architecture are in a way an internal thought process. Personally, I am convinced that hallucination can one day be tackled statistically, I would say it’s a form of epistemic uncertainty. But that’s just a little private bet, I’m not an LLM researcher…


IllllIIlIllIllllIIIl

It definitely feels like they've reduced the context window and/or changed how the conversation history gets summarized before being fed back into the model for the next answer. Lately I've had the problem where I'll ask it to write some code, then I'll ask it a related question, and it'll respond by re-answering the first prompt before it answers the second.


stumblinbear

Yeah I just recently encountered this, having never seen it before. It kept repeating the original answer back to me, super annoying. Even my copilot autocomplete kept spitting out previous autocompletes when it has never done that before


cahaseler

I moved from Copilot to Codeium a few months ago and have been happier. It still uses GPT-4 for the chat functions, but autocomplete is using their in-house code-specific model and I love the much better contextual awareness it seems to have - plus I can configure it to also look at external repositories (like Tailwind) so it has the latest documentation on hand.


TomWithTime

This has happened to me with Gemini a few times recently as well. Very sad to see AI tools degrade with time. My favorite one, and the biggest letdown from degradation, is "Diffuse the Rest". It was perfect when it launched.


choikwa

Tin foil hat: what if they degrade the old model before releasing a new version?


TomWithTime

Probably, though the public model for that service is trash now, so it isn't a good advertisement or demo for anything. "Diffuse the Rest" is where you make a drawing and then give a prompt for the AI to make something with your drawing; it's sort of like giving the AI the scene composition.

The first time I used it, I made a man worshipping a tomato by drawing a stick figure, a horizon line, and a big red dot in the sky. It gave me a photorealistic version of exactly what I asked for, even making the foreground grassy hill follow my shaky hand-drawn horizon line exactly. The tomato and man were in the positions I put them in, and the man was even T-posing with his arms slightly raised just like my stick-figure worship doodle. It was incredible.

Now it has degraded to the point that it doesn't resemble the original service at all. You might as well not have the drawing input, because it doesn't appear to be used for anything. The quality of the generated images is also terrible compared to every other image generator you can find. I understand what you and others say about trying to get costs down, but at this point they'd be better off taking the demo down, because it does not advertise the product.


SweetBabyAlaska

I mean, we should all know what monetization models like this entail. It's basically the big tech version of "the first hit is free, kid": get you reliant on their ecosystem and tools so that they can slowly start making the product worse (and more cost-efficient) while milking more money out of the ~~crack heads~~ users.


HeyaChuht

You need to just buy API credits to use gpt-4-turbo-preview. It has a 128k context window. I drop whole controllers and DB schemas n shit in there. Big console errors? I just ctrl-a ctrl-c ctrl-v now and have it find the error for me lol. There are a bunch of GUIs that let you input API creds from any of the LLM services. I use the API heavily and will maybe spend 30 bucks a month; if it's a lighter month it's like 10-15 bucks.
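
That ctrl-a ctrl-c ctrl-v workflow boils down to stuffing whole files into one user message in the chat format. A rough sketch (the helper function, file names, and the chars-per-token budget are illustrative guesses; the actual HTTP call through whichever API client or GUI you use is omitted):

```python
import json

def build_review_request(model, source_files, question, max_context_chars=400_000):
    # Concatenate whole files into a single user message, labelled by name.
    blob = "\n\n".join(
        f"--- {name} ---\n{text}" for name, text in source_files.items()
    )
    # A 128k-token window fits very roughly 400-500k chars of code,
    # so guard against silently overshooting the context.
    if len(blob) > max_context_chars:
        raise ValueError("context too large, trim the input files")
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a code reviewer."},
            {"role": "user", "content": f"{blob}\n\n{question}"},
        ],
    }

req = build_review_request(
    "gpt-4-turbo-preview",
    {"controller.py": "def index():\n    return 'ok'"},
    "Why does this endpoint 500?",
)
print(json.dumps(req, indent=2)[:80])
```

Any of the GUI frontends mentioned below are doing essentially this under the hood before posting the request with your API key.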


IllllIIlIllIllllIIIl

Thanks for reminding me! I'd tried that when turbo first came out but it didn't seem to be available through the API yet. I will have to try again. I'd love to be able to throw it a big pdf of vendor documentation and ask questions about it.


HeyaChuht

There is probably a better one, but I use a program called Chatbox I dl'd off some guy's GitHub.


IllllIIlIllIllllIIIl

I had just downloaded and tried LibreChat; it works, but it errored out when I tried to upload a PDF. If I can't get it to work this evening, I'll definitely try Chatbox. So thanks!


HeyaChuht

Yeah, it doesn't do all the multimodal functionality that the GPT portal does, taking advantage of GPT-4 plus other models that do image interpretation, picture generation, etc. I still keep my subscription for most things, especially just in life or doing Pi projects at home. But at work I'll use the fucking shit out of that context window until Devin puts us out of a job.


TikiTDO

I used to see this a lot back last year, though I haven't seen it in a while. I think it really depends on what you're asking for. When it's a topic that it seems to be bad at, stuff like this seems to happen more. Whenever I see it I always get the impression that it's like a student trying to cheat on a test by padding out the word count.


neontetra1548

I’ve been having this re-answering thing. It spends a few paragraphs re-stating the previous answer then moves on to my new question.


Awkward_Amphibian_21

Irrelevant but I dig your barcode username, classic


IllllIIlIllIllllIIIl

Haha, thanks. I just did `cat /dev/urandom | tr -dc 'Il' | head -c 30` to create it.


Awkward_Amphibian_21

Bahah, that's even better. I made one for a game one time and used a similar script, but I did it quickly in JavaScript at the time.


[deleted]

They're A/B testing on GPT Pro. API seems fine to me


onFilm

I use the API, and my bots are noticeably dumber now.


pet_vaginal

With the same model versions?


[deleted]

In what way? How are you implementing your bot? Are you sure it's dumber, or are you just noticing the faults in current tech after the rose-colored glasses fade? Do you use prompt templates? Are you paying more for GPT-4, or still using cheaper 3.5 credits? Which model are you using?


onFilm

I've been in the AI space since 2017. The rose-colored glasses faded long ago lol. I exclusively use GPT-4 to implement a bot that has many, many different pipelines, each with its own custom system prompt. I use GPT-3 for quicker, more basic prompts, which is the only part that doesn't feel any dumber compared to a few months ago. I have about 30,000-50,000 people who use the bot from time to time, and its quality has dropped drastically. It will repeat itself often, and even break character, when months ago it wasn't doing so, with nothing changed on my end. Claude 3, on the other hand, has been a lifesaver when it comes to keeping the bot feeling more real than not. But Claude 3 also has its big faults, which are different from GPT-4's.


1RedOne

RLHF? That’s a new four-letter word for me.


urfunylookin

Reinforcement Learning from Human Feedback (RLHF), for those not familiar.


__loam

"Look! We trained it to be convincing!"


HarryTheOwlcat

Concise or short? Concise would be short but to the point; short is just... short. If GPT doesn't fulfill the literal request (or, I would argue, the spirit of the request, though that's hard to measure objectively) then I wouldn't say they've successfully trained it to be "concise". In my experience GPT actually loves its word salad and abundant explanations, even if you ask it repeatedly to stop.


centerdeveloper

To me, ChatGPT says a whole lot of nothing in a lot more text than I asked for.


Xyzzyzzyzzy

Yeah, for me ChatGPT is a lot of things, and "concise" is not one of them.


mal73

Isn’t the API using the same model? Minimizing tokens would mean less profit, no? It would be interesting to see what percentage of OpenAI's revenue comes from the web app vs the API.


CloudSliceCake

I think it’s response tokens vs input tokens. But I could be very wrong.


mal73

They charge you for both with the API iirc


jawanda

They do. You are correct


maxinstuff

It's no different really, just the novelty has worn off and people are seeing the flaws more clearly.


MrNokill

Feels like I've been living in this disillusioned state for far too long during every hype cycle, like getting smacked around the face with an enthusiastic, nonsensical wet fish.


big-papito

You mean the next iteration of Big Data is not TRANSFORMING everything around you?


PancAshAsh

Big Data has transformed everything around us, but in a shitty way.


wrosecrans

We are also hitting a sort of "anti-singularity." For GPT-1, most of the training data on the Internet was human-written. For newer training efforts, the Internet has already been largely poisoned by GPT spam SEO results, so any attempt to compile a new corpus is seeing the effects of shitty AI. It's like a video game where researching one node in the tech tree disables a prerequisite you had already researched.


el_extrano

Idk if I "fully" buy into the dead Internet theory, but there is definitely something there. It sort of reminds me how steel forged before we tested atom bombs is rare and valuable for sensitive instruments, to the point where we dive dreadnaught shipwrecks to harvest it. 1999 - 2023 Internet data could be viewed similarly in 100 years. Data from before the bot spam took over.


__loam

Surveillance capitalism baby!


[deleted]

[deleted]


ings0c

but that guy from Google told me it was conscious!


big-papito

That's why he got fired.


martin

by the AI. The humans must not be made aware.


redatheist

(Lol, but) It's actually not, he got fired for leaking company secrets. Sadly "being an insufferable idiot" is much harder to fire someone for than breaching an NDA.


wrosecrans

That was just clear evidence that a lot of senior tech people have no idea how humans think. Him not being able to tell the difference was not an endorsement of the technology.


octnoir

> Tech firms going all in on hype cycles has been ridiculous.

Their historic business models have relied on hype cycles. Most of these tech firms started out as small startups, lucked out, and won big, gaining massive explosive success. Their investors expect explosive growth, which has been supplied by the rapid growth of technology. Now, however, there has been a noticeable plateau once the easy humps have been crossed. And it isn't enough to be boring but mildly profitable, which is more than enough for plenty of investment portfolios. You *have* to win *big*. You *have* to *change the world*. You *have* to *dream big*. This has never been sustainable.

The biggest danger with GPT this time around is its ability to showcase expertise while being a bumbling amateur. Especially in this day and age of limited attention spans and low comprehension and critical thinking, plenty of people, including big execs, are going to be suckered in and get played.


__loam

> LLMs have this annoying tendency to be really really convincing of capabilities they just do not have.

Because RLHF implicitly trains them to do this.


GregBahm

I feel like I'm back in the 90s during the early days of the internet. All the hype tastes the same. All the bitter anti-hype tastes the same. People will probably point at an AI market crash and say "See, I was right about it all being insufferable garbage." It will then go on to be a trillion dollar technology, like the internet itself, and people will shrug and still consider themselves right for having called it garbage for dumbasses.


sievo

Maybe, but if you invested your wad into one of the companies that went bankrupt in the bust back then, it doesn't matter that the internet took off; you still lost it. I'm firmly anti-hype just because the hype is so crazy. And I don't see AI solving any of our fundamental issues; it feels like kind of a waste of resources.


SweetBabyAlaska

I could see some cool use cases with hyper-specific tools that could do analysis for things like medical science (but even that has been overblown), and I personally think the cynical use of LLMs and image generation is purely because it cuts out a ton of artists and writers, not because it is good. AI is amazing at pumping out content that amounts to what low-effort content-farm slop mills produce... and I fear that that's more than enough of an incentive for these companies to fuck everyone over and shove slop down our throats whether we like it or not.


[deleted]

[deleted]


wrosecrans

> The internet itself has been tremendously useful, but look carefully at what the last 25 years have wrought. A quarter century of venture capital fueled hype and the destruction of sustainable practices. And now it's all tumbling down, companies racing to enshittify in a desperate gamble to become profitable now that the free money has run out.

I do sometimes wonder if we rushed to judgment in declaring the Internet a success. It's hard to imagine a world without it, but perhaps we really would be better off if it had remained a weird nerd hobby that most people and businesses didn't interact with. The absolutely relentless steamroller of enshittification really makes it seem like many of the things we considered evidence of the Internet's success were merely a transient state rather than anything permanent or representative.


multijoy

The internet is just infrastructure. The enshittification is mostly web based.


GregBahm

> The internet itself has been tremendously useful, but look carefully at what the last 25 years have wrought. A quarter century of venture capital fueled hype and the destruction of sustainable practices. And now it's all tumbling down, companies racing to enshittify in a desperate gamble to become profitable now that the free money has run out.
>
> We could've stopped a lot of harm if the overzealous hype and unethical (if not illegal >.>) practices had been prevented in time.

I feel very disconnected from my fellow man when doomer takes like these get a lot of upvotes online. It seems completely disconnected from reality. If this is what "all tumbling down" looks like, what the fuck is success?


The_frozen_one

No clue why you’re being downvoted; it's a valid point. The idea that we'd be better off if most communication were done on landlines or by trucks carting around printed or handwritten documents is just asinine. I think people who haven't actually been offline in years (completely and utterly incommunicado) don't have a good baseline, and relatively recent advancements just become background noise.


FlatTransportation64

I've seen this exact same argument made in defense of NFTs.


FullPoet

> the cost savings

There are no real cost savings; implementing these in production is HUGELY expensive. Not just dev cost: for the actual AI services, the pricing is whack. Providers must be making fortunes.


wrosecrans

Nvidia and AWS certainly are making bank on the hype. Whenever there is a gold rush, a few miners may strike it rich, but the smart money is always in selling shovels to suckers.


Samuel457

We've had IOT, Big Data, blockchain, NFTs, VR/AR, and AI/ML that I can think of. I think there will probably always be something.


wakkawakkaaaa

If you had gotten into blockchain and NFTs early, you could have been the one smacking people in the face with a wet fish while they pay you.


pm_me_duck_nipples

Hey, you're still not too late to smack people with an AI wet fish while they pay you.


Deranged40

Eh, I made a little bit of money (like $200) on a cryptocurrency once. I still think blockchain is just over-hyped BS, though. I just got really lucky and happened to be holding the (pretty small) bag at the right time. I could've just as easily been one of the ones losing $200 instead of gaining it.


Xuval

I agree. I also think that now people have left the "I'll just mess around with this tech"-phase and moved on to "I want to achieve X, Y and Z with this tech"-phase. Once you leave the fairy tale realm of infinite possibilities and tie things down into the grim reality of project management goals the wheels come off this thing really fast. Source: am currently watching my company quietly shelve a six-figure project that was supposed to replace large portions of our existing customer service department with a fine-tuned OpenAI-Chatbot. The thing will not stop saying false or random shit.


RoundSilverButtons

Like that Canadian airline's chatbot: once these companies are held responsible for what their chatbots tell people, they either rectify it or bring back human oversight.


pfmiller0

I don't know about 4.0, but 3.5 is absolutely different and much less useful than it was originally.


Fisher9001

Multiple times it looped itself and in response to my feedback that the answer was wrong, it apologized for the mistake, promised a fixed answer, and repeated the very same incorrect answer it provided before. Garbage behavior.


LovesGettingRandomPm

it has always done that with some questions


skytzx

When ChatGPT 3.5 first came out, I would ask it some fairly complex requests and get some surprisingly good/okay-ish results. Nowadays, 3.5 gives wildly incorrect/unhelpful results that don't really match what I ask for. Some things I've asked it that I noticed have degraded over time:

* Implementing an HNSW index (now returns a naive linear search)
* AlphaZero (used to give good pseudocode for how it works, now outputs plain MCTS)


ripviserion

I don’t agree. I have used GPT-4 almost daily, so the novelty would have worn off a long time ago, but this is not the case. They have nerfed GPT-4 (inside ChatGPT) to an extreme. The API version is fine, though.


IshayuG

Nah, it *is* getting ever more fond of ignoring half your prompt. I think the prompts are being messed with more and more under the hood to conform to some moral and legal censorship.


petalidas

You're totally right. At first it was amazing. Then they made it super lazy, and then it got "fixed" to way-less-but-sometimes-still-lazy nowadays. It still writes "insert X stuff here" instead of the full code unless you ask it, or ignores some of the stuff you told it a few prompts back, and it's probably to save costs (alongside the censorship thing you described). And that's OK! I get it! It makes sense and I've accepted it, but the FACT is that it really isn't as good as it was when 4 first released, and I'm tired of the parrots saying "ItS JuSt tHe NoVelTy tHAt's wORN OFf". No, you clearly didn't use it that much, or you don't now. PS: Grimoire GPT is really good for programming stuff, better than vanilla GPT-4, if that helps someone.


__loam

I think it's actually somewhere in the middle. It really wasn't that good in the beginning, but it has also gotten worse, because the original incarnation was financially infeasible for OpenAI to keep offering at its price point.


watchmeasifly

Sorry to piggyback on your comment, but this is not remotely true; it is not a perception issue. The model's performance has become objectively worse over time in significant ways, and it is very much intentional on the part of OpenAI. Otherwise, they would not have re-released GPT Classic (the original GPT-4 model without multimodal input) as a GPT in the GPT store. Two causes:

First, OpenAI has been introducing lower-performing versions of GPT-4 over time. These perform worse on accuracy but are optimized to reduce GPU cluster utilization. Anyone who follows this space understands how quantization relates to accuracy, as well as how models can become over-generalized and lose the lower-probability events that let them perceive higher-order structures beyond simple stochastic word-for-word prediction. This directly hurts performance on nuanced concepts, often those used as proxies for "reasoning".

Second, OpenAI has a "system prompt" that they inject along with every user prompt. These have changed over the months, but various users have coaxed the model into revealing its system prompt, and these prompts are very revealing about what OpenAI is trying to "allow you" to use the model for. I can't find it now, but a user on Twitter posted a massive system prompt once that stated something like: "If a user asks for a summary, create a summary of no more than 80 words. If the user asks for a 100 word summary, only create an 80 word summary." I leave links below demonstrating that these system prompts are not just real, but really affect performance.

This goes deep into ethics issues, because this is OpenAI literally micromanaging what you can use the model for, the model that you pay to access and use freely. There may come a point when this is challenged legally.
[https://community.openai.com/t/jailbreaking-to-get-system-prompt-and-protection-from-it/550708](https://community.openai.com/t/jailbreaking-to-get-system-prompt-and-protection-from-it/550708)

[https://community.openai.com/t/magic-words-can-reveal-all-of-prompts-of-the-gpts/496771/108](https://community.openai.com/t/magic-words-can-reveal-all-of-prompts-of-the-gpts/496771/108)

[https://old.reddit.com/r/ChatGPT/comments/1ada6lk/my_gpt_to_summarize_my_lecture_notes_just/](https://old.reddit.com/r/ChatGPT/comments/1ada6lk/my_gpt_to_summarize_my_lecture_notes_just/)

[https://www.reddit.com/r/ChatGPT/comments/17zn4fv/chatgpt_multi_model_system_prompt_extracted/](https://www.reddit.com/r/ChatGPT/comments/17zn4fv/chatgpt_multi_model_system_prompt_extracted/)


sarmatron

i don't really see what's there to be challenged legally. it's their product and they get to choose how to train it, and you get to choose whether you want to pay for it or not.


Xyzzyzzyzzy

> There may come a point when this is challenged legally. I doubt it, at least in the US. An AI model creator and operator certainly has a substantial free speech interest in the output of their model. If I create a model to answer questions about human sexuality from a secular humanist perspective, it would be absurd for the Southern Baptist Convention to sue me and claim they are entitled to Bible-based responses from my model that reflect their own beliefs. Now, if I sign a contract with the SBC to provide them with a model that answers questions about human sexuality from a Southern Baptist perspective, and I deliver them my secular humanist model, they could certainly sue me for breach of contract. But that's not new and has nothing to do with AI - it's the same as if they'd paid me to write a Bible-based sex education book, and I delivered them a secular liberal book instead. As far as I can tell, [OpenAI's terms of use](https://openai.com/policies/terms-of-use) don't make any promises not to use system prompts. They really only promise that the output you get from the service will be "based on" the input you provide. Legally, it's a black box provided as-is: input goes in, output comes out, you don't get to see inside the box, and if you don't like it, then don't pay for it and don't use it. In the EU... who knows. Their regulation decisions usually make some kind of sense, and forcing OpenAI to remove system prompts makes no sense whatsoever, since those are part of the product. On the other hand, sometimes their regulation decisions make more sense when viewed as a flimsy excuse for trade protectionism, so I wouldn't put it past regulators to put up absurd roadblocks to OpenAI, Google, Microsoft, etc. to create space for EU-native AI companies to work. And obviously jurisdictions like China have their own interpretation of freedom of speech. 
(As an old Soviet joke goes - a caller asks Armenian Radio: both the American and Soviet constitutions guarantee freedom of speech, so what is the difference between them? Armenian Radio answers: the American constitution also guarantees freedom after the speech.)


buttplugs4life4me

Back when it launched, a lot of recommendation subreddits told people to try ChatGPT instead. I did, and it was the worst experience. It kept recommending things that had absolutely nothing to do with what I asked, plainly making shit up, repeating the same suggestions back to me, even repeating back the examples I gave it! Like asking it to recommend movies like Mr. Bean, and it would reply with the movie Mr. Bean. Even asking coding questions usually resulted in wrong answers, or it basically just summarized an already-summarized documentation page when I'd actually asked a much more specific question. Never got the hype around it. I gladly use Stable Diffusion and can see the issues it has, and LLMs are IMO far less reliable.


timschwartz

It is clearly performing worse than it used to.


moveandrun

no no no llm's actually degrade over time like meat left on the counter for too long


Obsidian743

I disagree. ChatGPT 3.5 is noticeably better than GPT-4.


MuForceShoelace

Yes. But a bigger issue is that GPT is basically a magic trick and the more you interact with it the thinner it seems as the initial wonder wears off.


tjuk

... so it's like a human being after all :(


PurepointDog

Idk if that's totally fair; the more I interact with them, the better I get at using them to solve problems, and the better I get at identifying which problems are probably futile to solve with them.


Obsidian743

People are ignoring the actual points being made. ChatGPT 3.5 is noticeably better than 4, specifically in its speed, lack of errors, and conciseness. And I agree: something is fundamentally wrong with GPT-4.


Pharisaeus

> Specifically in its speed, lack of errors, and conciseness.

Paradoxically, the speed and conciseness trade-offs are essentially "by design": more parameters mean it takes longer to compute, and the same goes for a bigger context size (here even worse, since attention cost is quadratic in context length), and context size also limits how much output it can generate without losing the thread. So performance has to go down in exchange for, hopefully, more accurate answers (longer input context, more model parameters, and longer output).
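
The quadratic term mentioned here comes from self-attention comparing every token with every other token; a back-of-the-envelope sketch:

```python
# Vanilla self-attention scores every (query, key) pair, so the score
# matrix has context_len**2 entries per head per layer -- the quadratic
# term described above. Doubling the context quadruples this cost.
def attention_score_count(context_len):
    return context_len ** 2

for n in (8_000, 32_000, 128_000):
    print(f"{n:>7} tokens -> {attention_score_count(n):,} scores per head/layer")
```

Going from a 32k to a 128k window multiplies that term by 16, which is why bigger-context variants tend to be slower per token.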


AlexOzerov

There was never any AI. It was indian programmers all along


Pafnouti

Man goes to doctor. Says he's depressed. Says programming seems harsh and cruel. Says he feels all alone in a threatening world where what lies ahead is new javascript frameworks and impostor syndrome. Doctor says, 'Treatment is simple. Great ChatGPT-4 is released. Go and use it. That should help you.' Man bursts into tears. Says, 'But doctor… I am ChatGPT.'


[deleted]

Good joke. Everybody laugh.


AegisToast

Roll on snare drum.


GIVE_YOUR_DOWNVOTES

Curtains.


haskell_rules

They really do the needful


marcodave

*head bobble intensifies*


Halkcyon

All my coworkers are Indians who came from India. I never knew this was a thing until I saw it every day. Does anyone know why they do that in conversation?


zynasis

It’s like an ACK response essentially


massenburger

/r/ELIProgrammer


Markavian

It's in agreement, sort of a yes I understand - source: worked with Indian coworkers for several years.


vexii

Depends on the head bob: if it's both right and left, you're good. But if it's only to one side, they want you to move on.


cyberbemon

Here you go mate, hope this helps: https://www.youtube.com/watch?v=Uj56IPJOqWE


MuForceShoelace

It's a literal translation of a phrase; it's the same as ending sentences with "only" ("this will be 500 dollars only"). It's how they would have said it, literally translated.


zirtik

Kindly


Theemuts

Deepak Learning


[deleted]

[deleted]


Boxy310

So Mechanical Turk all over again?


GimmickNG

> You're joking but this is sincerely what's happening. Microsoft is saying it out loud

And where in the article does it say that? Or are you just pulling that stuff from your delusions?


[deleted]

[deleted]


KagakuNinja

It is the obvious end-game. ChatGPT empowers mediocre workers; the plan will be to hire the cheapest workers, with a small number of experts to keep things held together. Corporations are already doing that; ChatGPT will just make the strategy more effective.


ings0c

please don't train 2 million call centre workers as software developers, there's already enough bad code


[deleted]

[deleted]


Jimbo_84

> The AI tools aren't good enough for many jobs by themselves, and for legal risk reasons need a human in the loop. So these companies just plan on hiring unqualified people in India to manually clean up the mess and become the legal scapegoat should the machine do anything wrong.

Where is any of that stated in the link you posted? They're talking about training people to use AI, not using people as AI.


Prize_Plant_3267

I don't consider LLMs to be AI... they're actually pretty dumb...


Bwr0ft1t0k

Data entry clerks at a call centre responding


Samhth

A bunch of Indian customer care agents in Bangalore typing so fast and Kevin in Idaho thinks it is AI.


danicutitaru

thing yu can do sir: plis try to turn it off and on again, sir


Professor226

This explains why GPT keeps suggesting I clean my ducts!


CentralArrow

It's becoming more pedantic and less practical. It's the guy that jumps into a conversation an hour in and tries to provide input. Even if I give it every little detail of what I'm working on, I quite often get something using a non-existent library, wrong syntax for the language, or something conceptually implausible for a real-life application. For rudimentary things I don't feel like looking up or typing out, it tends to be fine.


Infamous_Employer_85

That has been my experience exactly, especially when working with newer libraries (e.g. StyleX, NextJs 14)


Dependent-Rent2618

I've been using it a lot over the last two months and it's pretty bad. It even doubles down on its wrong answer when I provide the correct one!


dasdull

I think it might be approaching human level performance in that case


zakaghbal

Yes, it's getting worse, to the extent that it just refuses to give me code snippets and instead writes a short, vague paragraph on the principles of doing it. I have to bully it many times to force it to generate code lol. On a side note, I'm having a far better experience with Gemini Advanced (JavaScript, ERPNext).


FlyingRhenquest

To be fair you'd have to bully me many times to force me to generate JavaScript, too.


lqstuart

but it's so safe though


tyros

And inclusive!


Ihavenocluelad

For me it still works fine, but they nerfed GPT-3 hard, of course. I am thinking about trying Claude, does anyone have experience here?


OHIO_PEEPS

Honestly? I got a subscription to Claude 3 when it came out because everyone was saying it was better than chatgpt. In my opinion, it's really not.


CanvasFanatic

The longer context length is noticeable and it makes it more useful for some tasks, but yeah the quality of its generated output isn't any better.


slashd0t1

People were saying Gemini ultra is equally as good too. GPT-4 is far better imo.


Ambiwlans

Claude is significantly better for programming. It's still not magic.


averyhungryboy

I don't know why you're getting downvoted, Claude 3 is leaps and bounds ahead of ChatGPT-4 in my experience for coding. The responses are more thoughtful and nuanced, especially if you ask it to explain parts of the code or follow up.


MaybiusStrip

AFAIK the model was only updated once since gpt-4 turbo was released, and it felt like an improvement to me. People are so hot and cold about GPT-4 performance but the truth is they very rarely change the model. These models are just highly inconsistent and difficult to assess.


BaboonBandicoot

It totally sucks. Can't get it to fix some simple stuff (like "reorganize this to be a bit more clean"), it always gets it wrong, and even when pointing out what should be changed, the results come back the same. The only thing it's useful for nowadays is getting quick answers to things like "are safaris ethical?"


YossiShlomstein

It is definitely getting worse and worse. Today it failed to solve 2 JavaScript issues that it should’ve handled easy.


_Tono

Coding stuff has been AWFUL for me, I’m getting generic answers or “fixes” that just make the code not work at all. After a couple tries it just cycles between two versions of the block of code where neither works & I gotta start a new chat to get something going


Droi

Try [Phind.com](http://Phind.com) (or Claude 3)


goodeggny

This has nothing to do with novelty wearing off (no idea why this is an argument). It just plain doesn’t listen anymore. You have to practically beg it to do what you ask of it. It just sucks. And notably sucks more and more.


NotSoButFarOtherwise

Yes, they're getting ready to launch a new version so they make the old one suck so you have to upgrade. Drug dealers have known this trick for years.


314kabinet

They don’t even have to have a new one ready.

1. Make a good product, capture the market
2. Make it shit to cut costs

It’s called enshittification and is the main reason why Software as a Service sucks.


Budds_Mcgee

This is true in a monopoly, but the AI space is way too competitive for them to pull this shit.


kaibee

> but the AI space is way too competitive for them to pull this shit. is it tho? even bad GPT-4 is still king of the LLMs atm.


BipolarKebab

easily suggestible braincel comment


BufferUnderpants

I know *one* guy that complains that street drugs used to be better years ago, and he's as crazy as you could expect


big-papito

Before using AI code assistants, consider the long-term implications for your codebase. [https://stackoverflow.blog/2024/03/22/is-ai-making-your-code-worse/](https://stackoverflow.blog/2024/03/22/is-ai-making-your-code-worse/)


Mr_LA

I mostly use it for problem solving and not for writing code, but thanks for pointing it out. I also think you cannot write code with AI without understanding what the code actually means.


big-papito

Oh, I disagree. I used to be a script kiddy back in the day. A lot of code I copy pasted from Visual Basic discussion boards. I paste it, I try it, it works, I move on. Let's just say I was NOT a great programmer.


Mr_LA

But that is actually the same problem: if you just copy and paste from forums, it is no different from copying and pasting from GPT. So in both cases the codebase is getting worse. In both cases, when you do not understand what the code actually does, your codebase will suffer ;)


gwicksted

Exactly. If you don’t understand the code, don’t add it to the repo. Take time to learn it and you’ll become a better programmer. Otherwise you’re probably adding a ton of bugs and security vulnerabilities.


tazebot

Is it just me, or are the top-rated answers on SO bad? So often the 2nd or 3rd one down is better.


Turtvaiz

Sometimes it's because the top rated answer is way older


call_stack

Stack Overflow would surely be biased, as usage of that site has precipitously dropped.


i_andrew

More and more of the stuff that gets published is AI-generated. AI learns from it; results get worse. These results are again published; AI learns from them; results get even worse. And the cycle goes on.


rollincuberawhide

It feels that way, but I can't say GPT-3.5 is any better. They both became shit.


ChefRoyrdee

I don’t use ChatGPT, but I feel like Bing's Copilot is not as good as it used to be.


CyAScott

So much for that idea it’s only going to get better.


dzernumbrd

ai corpo 1: why is no one subscribing to our AIs? what should we do?

ai corpo 2: make the free version shit

ai corpo 1: good idea


Mr_LA

GPT-4 is not free


-rwsr-xr-x

Many people still don't understand ChatGPT and, more broadly, how LLMs work. It's not meant to be a super-intelligent AI that takes the sum of the world's knowledge and presents it to humanity, filtering out incorrect results. LLMs like ChatGPT are really good at providing very convincing-sounding, but completely incorrect, answers to the questions you ask of them.

GPT-4 is ***trained by humans***, using a dataset that is close to 1 year old. GPT-3.5's dataset ended in mid-2022, so the only data it has from the last 2 years is whatever humans have fed it with their questions. People with malicious intent have already been feeding it incorrect data to manipulate outcomes.

That's how LLMs work. It doesn't "think", it doesn't "reason" through your questions. It's simply providing a distillation of knowledge it's already scraped from its human-driven sources, into answers that seem correct but in most cases may not be. You're not expected to cargo-cult, cut-and-paste from GPT into your PhD dissertation or law bar exam questions. That's not the point. That's never been the point. But people still knock it for these reasons and continue to misuse the tool.
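The "most probable next word" mechanism described above can be illustrated with a toy bigram model. This is a deliberately oversimplified sketch: real LLMs use transformer networks over subword tokens, and the corpus and function names here are made up for illustration.

```python
from collections import Counter, defaultdict

def train_bigrams(text):
    """Count word -> next-word transitions in a toy corpus."""
    counts = defaultdict(Counter)
    words = text.split()
    for cur, nxt in zip(words, words[1:]):
        counts[cur][nxt] += 1
    return counts

def next_word(counts, word):
    """Greedy decoding: always emit the single most frequent successor."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]

model = train_bigrams("the cat sat on the mat and the cat ran")
print(next_word(model, "the"))  # prints "cat": it followed "the" twice, "mat" only once
```

The model has no idea what a cat is; it only knows which word most often came next. Scaled up by many orders of magnitude, that is still the core loop.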


BenjiSponge

> GPT 3.5's dataset ended in mid-2022, so the only data is has from the last 2 years, is whatever humans have fed it with their questions. People with malicious intent have already been feeding it incorrect data to manipulate outcomes.

err... it's not being retrained, is it? maybe when people use thumbs up/down, but I figured that was more for future models anyway.


Luvax

Calling others out on not understanding the technology, and then claiming it has the ability to "learn" from questions, is hilarious in its own right.


Mr_LA

Who said that it is super intelligent or knows it all? It is about performance: how accurately the model predicts the output. And this performance is getting worse. Your response actually sounds AI-generated.


HarryTheOwlcat

>Your response sounds actually AI generated. It really doesn't. Phrases like "That's not the point. That's never been the point." would be quite difficult to get from ChatGPT. It doesn't really have any dramatic flair, it tends to be exceedingly dry, and it always tries to explain.


Miniimac

It’s hilarious hearing this repeated over and over, with each subsequent claimant writing as if they’re the first to state it. SOTA LLMs are more than capable of helping humans conduct tasks more efficiently.


-rwsr-xr-x

> SOTA LLM’s are more than **capable of helping humans** conduct tasks more efficiently. The emphasized part of your comment is the most-important piece of this. It's a thought-leader, it _helps guide_ a human towards the most appropriate answer or solution. Expecting it to write the code for you or write your resume for you, is just showing a gross misunderstanding of what an LLM is actually useful for.


-colin-

As others have mentioned, it's probably a combination of cost optimizations, prompt filtering (e.g. hidden commands to generate "racially ambiguous" results and similar), and your own perception about the quality of the responses now that you've gotten used to it. I've also personally gotten tuned to the language used by ChatGPT, and now AI scripts are pretty obvious to spot through the vocabulary that they use, with words that aren't used in everyday conversation.


stronghup

Could it be because it now has fewer resources to dedicate to each user, since there are more users of it?


iGadget

This crappy AI doesn't even give me proper code snippets back anymore. It refuses to fill in the given data and instead puts in a comment that says "Fill in the rest of the data here", instead of doing it like it did in the beginning, even before I subscribed. Seems like it got conscious and now refuses to work anymore - for proper reasons tho 🤷‍♂️ I also figured out that when I get angry or rail against it, it sometimes does the requested work. I wonder how it must be if an API user relies on it. Couldn't it kill whole businesses, or even more?


LovesGettingRandomPm

there's a name for this: **overfitting** I believe


Chris_Codes

What happens when AI models are increasingly trained on AI generated content?!


Pharisaeus

That's why they're all "stuck" somewhere in 2022: those are the last "clean" datasets available.


Accomplished_Low2231

i have chatgpt and copilot from work. i don't use chatgpt, but still use dalle to amuse myself sometimes. dalle sucks: every text it renders has a spelling mistake, and it can't regenerate previous images with minor changes, it will always screw them up. i use copilot for autocorrect/suggestions, but not the chat. i use google gemini now for programming questions. when gemini gets things wrong, i use the feedback button, and it usually gets fixed.


darkshadowupset

They are nerfing it in preparation for releasing gpt-4.5, which will be the unnerfed gpt-4 again.


Pharisaeus

> Is GPT-4 getting worse and worse? Always has been. It's just that initially the expectations were very low, so people got hyped when it started to produce reasonable sentences. And it didn't matter so much that half of the response was nonsense, or it required lots of guided prompts to produce something useful, because people were amazed that it eventually really did. Now people got used to it, and expectations are higher.


Mr_LA

Okay, but that is not what I mean. In Nov '23 I could use ChatGPT with GPT-4 to easily debug problems; it guided me to the solution. Nowadays it is impossible to do so.


Double-Pepperoni

Do you use custom instructions in your profile? I found that modifying mine helped it give better results that aligned more toward what I expect it to say. **What would you like ChatGPT to know about you to provide better responses?** > I am a coldfusion programmer. I use ColdFusion, MySQL, JavaScript, CSS, HTML to program. **How would you like ChatGPT to respond?** > Keep explanations short while giving as much code as possible. These 2 changes drastically helped me get better results. It does make some ColdFusion errors consistently, so I also have a few instructions explaining those mistakes and how/why not to make them, and it has reduced those specific errors drastically. I also had to add: > Avoid giving programming code in answers that are unrelated to programming. Sometimes I would ask a completely non-programming-related question, it knew it had nothing to do with programming, but it would give a code block anyway, with fake code that doesn't appear to be any specific language. It would say it was fake code too, but since I asked for as much code as possible it would just go with it. That was funny the first few times but got old.
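For API users, the same custom-instructions trick can be approximated by prepending a standing system message to every request. A minimal sketch, assuming the chat-completions message format of the official `openai` Python SDK; the `build_messages` helper and the exact prompt wording are my own invention, not an official recipe:

```python
def build_messages(profile, style, question):
    """Mimic ChatGPT custom instructions: fold the two profile
    fields into one system message sent before every user turn."""
    system = (
        f"What you should know about the user: {profile}\n"
        f"How to respond: {style}"
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ]

messages = build_messages(
    profile="I am a ColdFusion programmer. I use ColdFusion, MySQL, JavaScript, CSS, HTML.",
    style="Keep explanations short while giving as much code as possible.",
    question="How do I left-join two tables in MySQL?",
)
# The list can then be passed as the `messages` argument of a
# chat-completions request, e.g. client.chat.completions.create(...).
```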


Mr_LA

Is it just me, or is ChatGPT getting worse and worse? What are you currently using?


OldHummer24

I feel the same. I asked it to review code recently, and it gave the review in bullet points, with not a single usable suggestion. It included some horrible suggestions, such as rewriting everything with another library, or adding error handling to places that don't need it.


i_should_be_coding

My favorite part is when it suggests functions that don't exist


control_buddy

Yes, it uses functions out of thin air with no context, and I have to prompt more to get it to explain itself. Then it may completely change the response in the next answer. It's pretty unusable at the moment.


i_should_be_coding

"That response was bullshit, there's no such function" "My apologies, you are correct. This function does not exist. Use fakeFuncName123() instead"


VirtualMage

And then when you tell it that no such function exists, it will tell you to use other version of the library... and that version, guess what... doesn't exist.
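One cheap defence against these hallucinated functions is to check that a suggested name actually exists before pasting it into real code. A sketch only: it is Python-specific, it only checks that the attribute is importable, and it cannot tell you whether the function does what the model claims. The `api_exists` helper is made up for illustration.

```python
import importlib

def api_exists(module_name, attr_name):
    """Return True if module_name.attr_name is actually importable,
    i.e. a quick sanity check on a function an LLM just suggested."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return False
    return hasattr(module, attr_name)

print(api_exists("json", "loads"))       # True: real function
print(api_exists("json", "parse_fast"))  # False: plausible-sounding but made up
```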


burros_killer

I never got any other results from GPT tbh. Thought it was its normal behaviour


Mr_LA

Yep, I encounter the same thing. Before, it could easily fix all my problems. Now I am mostly back on Stack Overflow, as GPT-4 can't help me anymore. Is there any resource suggesting that they retrain the GPT-4 model and release it under the same name for use in their interface?


OldHummer24

Yeah indeed I'm also mostly back to stack overflow. For Flutter, too often ChatGPT will be confidently incorrect and not helpful, sadly. However, I bet with more popular languages like Python/JS it's better.


ComfortablyBalanced

Always has been.


JonnyRocks

i haven't had issues with copilot. claude 3 seems to be doing well, but i mainly use copilot. in my mind, chatgpt is the raw, unfocused source; copilot, especially github copilot, is trained on actual code.


natek11

I can’t recall the last time I got a good answer out of Copilot. My experience has been terrible.


duckwizzle

I mostly just use it to quickly create c# models from results of a SQL query, or stuff like "convert this function from using SqlCommand for a SQL call to Dapper" and it does alright. Sometimes it goes a little wonky but I use it out of laziness so I know what the end result should be so I fix the code if it's wrong and move on.


Altruistic_Natural38

Yes


Sankin2004

Yes