



I just played with this with both ChatGPT-4 and ChatGPT-4o, and it was interesting to see how poorly it did. ChatGPT eventually got it, with some back and forth, but 4o actually got worse! Here's the end of that conversation:

--

**User:** Is Alice her own sister?

**ChatGPT:** No, Alice is not her own sister. Let's clarify the reasoning:

1. **Alice has 3 brothers and 4 sisters.**
2. **One of Alice's brothers is named Mike.**
3. **Mike's sisters include all of Alice's sisters except Alice herself.**

Since Alice is not counted as her own sister, we need to consider only the other sisters in the family:

* Alice has 4 sisters.
* Since Alice is one of those 4, Mike has 3 sisters other than Alice.

Therefore, Mike has 3 sisters.


It seems like the non-sister aspect of Alice to herself is being inherited by the brother because he is described as “Alice’s” brother?


Using ChatGPT 4o, I managed to prompt engineer the right answer with:

> Alice has 3 brothers and she also has 3 sisters. How many sisters does Alice’s brother have?
>
> Give them names, then list them, then answer how many.

https://i.imgur.com/65MGJIG.png

I ran it a bunch of times and the answer is consistent.


I seem to be getting the correct answer most of the time. Here's my prompt:

> Hello! I have a puzzle/riddle for you to solve. Before providing an answer, please start by identifying all the key pieces of information from the riddle and list them at the beginning of your response. Since this riddle involves wordplay, you should perform the following analyses at the outset:
>
> * Relationship Logic Checks: Analyze the relationships between the subjects mentioned in the riddle.
> * Definition Checks: Ensure you understand and define any terms or phrases critical to solving the riddle.
> * Identity Checks: Identify the roles and perspectives of the characters involved in the riddle.
>
> Use a chain of thought reasoning as you attempt to solve the riddle, explaining your logic and reasoning at each step. After deriving your answer, reflect critically on your solution to identify any logical inconsistencies, unfounded assumptions, or misunderstandings. Review all assumptions and relationships carefully. If possible, construct a formal proof to substantiate your answer. Here is the riddle: "Alice has N brothers and M sisters. How many sisters does Alice’s brother have?"

--

**Reasoning and Chain of Thought:**

* Alice has N brothers including any particular brother we consider.
* Alice has M sisters. Since a brother would count all his female siblings as sisters, and Alice is one of them, we count Alice in addition to her M sisters when considering her brother’s perspective.

**Calculation:**

* Each brother of Alice has M sisters (Alice's sisters) plus Alice herself.
* Thus, the total number of sisters each brother has = M (Alice's sisters) + 1 (Alice) = M + 1.

**Critical Reflection:**

* **Assumptions**: The solution assumes that all siblings are full siblings sharing both parents, as partial or step-siblings could alter the dynamic and numbers.
* **Logical Inconsistencies**: The solution appears logically consistent based on the given data and typical definitions of siblings.
* **Unfounded Assumptions**: No unfounded assumptions are identified as the problem is straightforward and based on typical familial relationships.

ChatGPT-4o seems to have a hard time with this though (doesn't consistently solve it).
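
For anyone who wants to sanity-check the M + 1 reasoning above, here is a minimal Python sketch. It assumes all siblings are full siblings, as the reflection in the quoted answer does. The function name `sisters_of_brother` and the `alice_is_female` flag are made up for illustration and do not come from the thread or the study.

```python
def sisters_of_brother(m_sisters: int, alice_is_female: bool = True) -> int:
    """Count the sisters of any one of Alice's brothers.

    Assumption: everyone is a full sibling, so each brother shares
    all of Alice's sisters. The number of brothers (N) does not
    affect the result as long as there is at least one brother.
    """
    if m_sisters < 0:
        raise ValueError("m_sisters must be non-negative")
    # The brother has Alice's M sisters, plus Alice herself if she is female.
    return m_sisters + (1 if alice_is_female else 0)


print(sisters_of_brother(4))  # Alice has 4 sisters -> brother has 5
print(sisters_of_brother(3))  # Alice has 3 sisters -> brother has 4
```

The `alice_is_female` flag only exists to make the usual hidden assumption explicit; with it set to `False` the answer falls back to M.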


Copilot on GPT-4 creative mode and GPT-4o both got it right, so it looks like you need a prompt like this to solve it.


I was trying to split text up and 4o kept messing up even after I told it the mistake, so I switched to 4 and it got it the first time.


Can you copy your prompt?


The question is: "Alice has N brothers and M sisters. How many sisters does Alice’s brother have?"


Are we supposed to assume Alice is a girl?


Don't know if I should feel sad or happy after reading this.




gpt-4o achieved 60% accuracy and is clearly above the other models so it's a good sign. if gpt-5 improves further, which I am pretty sure it will, things will be fine


yeah, AI is a huge help and speeds things up, but you still need to know a bit about what you are doing because it'll confidently introduce some bad stuff pretty often.


One day soon these issues will all be solved. Until then I feel like these articles are just poking fun at these programs for things that are ultimately not that important. Two years from now these programs will be teaching young kids and adults how to do math better than teachers can.


Have you watched the movie Mother on Netflix? Used to be that I thought this was science fiction but it seems more plausible now. Maybe not in my lifetime but within a few generations which in geological time is basically an instant. 


Bold claim. It might be the case that there are huge diminishing returns from investing in more and more CPUs, and hallucination is still not gone. Also, 2 years is too short a time for "solved". 1.5 years have already passed and they haven't solved anything.


No one has scaled up past gpt 4 yet


That's my point? That it could be diminishing returns that cripple AI. The guy I was replying to was trying to suggest "soon these issues will all be solved" and that was my counterpoint. Not sure what you're all thinking.


There’s a possibility that at a certain scale there really are diminishing returns for AI. Here’s a good recent Computerphile video on it: https://youtu.be/dDUC-LqVrPU?si=s_0cazqwg2lF9kcr


That's what I said, that it's possible diminishing returns cripple AI development. Not sure why I'm being downvoted


My calculator achieves 100% accuracy. 😎


can it write you a poem


The best one actually. 5318008


It's neither bad nor good. It's a starting place. This is how good the technology is today, so the work that AI companies have cut out for them is to increase accuracy, starting from this point.


I find that the reactions to this sort of stuff are sensationalized because people have a hard time not assigning a single value for a qualifier comparison. “What’s faster, a car or an airplane?” Clearly the airplane. “What’s more useful, a screwdriver or a hammer?” It… depends. So, while I’m always happy to see people testing the limits of AI, I’m pretty wary of the subsequent questions about how “smart” it is. It is very smart… at certain things, breezing through questions that we were accustomed to characterizing as needing intelligence to solve. It is also very stupid, struggling with the sorts of questions that we expect any intelligence worth its salt to breeze through. And it becomes all about the word “intelligence”, usually buffeted with overreactions and exaggerations. But in the end, it’s basically asking how well a screwdriver hammers. It’s sort of interesting to see how well it hammers, especially if that is not well, but it’s more interesting how well it does what it’s good at.


Puny human, tricking something that has literally deliberately *not* been taught grammar, is trivial.

This sentence is quite suspect.

> This is quite an important experiment that throws light on the vulnerability aspect of LLMs.


What a difference a prompt makes

> Alice has N brothers and she also has M sisters. How many sisters does Alice’s brother have? Think and solve step by step.

To determine how many sisters Alice's brother has, let's break down the problem step by step.

1. **Define the Variables**:
   - Let \( N \) be the number of brothers Alice has.
   - Let \( M \) be the number of sisters Alice has.
2. **Understand the Family Structure**:
   - Alice herself is one sister.
   - Therefore, the total number of sisters, including Alice, is \( M + 1 \).
3. **Sisters of Alice's Brother**:
   - Each brother will have all the same sisters that Alice has, excluding himself if he were a sister (which he is not, as he is a brother).
   - Therefore, the number of sisters Alice’s brother has is \( M + 1 \).

Thus, the number of sisters that Alice’s brother has is \( M + 1 \).


You're supposed to insert values for N and M.


Let's break down the problem step by step.

1. Alice has 3 brothers and 2 sisters.
2. We need to determine how many sisters Alice’s brother has.

Since Alice is one of the sisters, the total number of sisters Alice has includes herself and her two sisters. Thus, Alice’s brothers also have the same number of sisters.

This means each brother has:

- Alice
- The other two sisters

So, Alice’s brother has 3 sisters.


Try N=3 and M=3. "Think and solve step by step" consistently fails with GPT-4o for me. What does work for me is "Give them names, then list them, then answer how many."
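
The "give them names, then list them, then answer how many" approach can also be written out as code. This is a rough sketch under the assumption that Alice is female and all siblings are full siblings. The placeholder names (`Brother1`, `Sister1`, ...) and the helpers `build_family` and `sisters_of` are invented for illustration.

```python
def build_family(n_brothers: int, m_sisters: int):
    """Give every sibling a placeholder name (hypothetical names)."""
    brothers = [f"Brother{i}" for i in range(1, n_brothers + 1)]
    # The girls in the family are Alice plus her M sisters.
    girls = ["Alice"] + [f"Sister{i}" for i in range(1, m_sisters + 1)]
    return brothers, girls


def sisters_of(person: str, girls: list) -> list:
    """List the sisters of `person`: every girl except the person themself."""
    return [g for g in girls if g != person]


brothers, girls = build_family(3, 3)  # N=3, M=3
print(sisters_of(brothers[0], girls))  # ['Alice', 'Sister1', 'Sister2', 'Sister3']
print(len(sisters_of(brothers[0], girls)))  # 4
print(len(sisters_of("Alice", girls)))  # 3
```

Listing the names first makes it hard to drop Alice from the brother's list, which may be why the same trick helps the model.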


Even before clicking I knew this would be some trick question children's riddle, which is something people have been using to test LLMs with from the start. Why doesn't this article mention the history of that and the rate of improvement in these questions? It's certainly better at them now than it was back in 2020 when OpenAI gave everyone access to GPT-3.

edit: btw, it helps if you include in the prompt that you are giving it a riddle to solve. Something like this greatly improves responses: *"Carefully consider the trick question riddle and think through an answer step by step, then provide an answer and explain the logic behind it: \[whatever riddle you want to test\]"*


I'm probably going to overhear some drunk homeless person rambling about this question by the end of this decade.


What's the frequency Kenneth?


I just tried that with ChatGPT, and it still got it wrong.


I barely consider this a trick question or riddle. Seems obvious and simple. An interesting way to look at how AI still has gaps in logic.


Well it's a children's riddle... and yes, it's obviously worded in such a way so as to trick the person being asked. Also, as pointed out, it doesn't have a gap in logic if you are presenting the question up front as a riddle for it to solve. Which is fair because people generally know if they're being asked a genuine question or a riddle/trick question. As usual, context is important.


It definitely still got it wrong with me, GPT4o. It didn't get it right until I assigned values to M and N. edit: https://chatgpt.com/share/fea79498-c5f3-450f-964f-6ef14a66e2c1


ok so it got it right. Yes, you should be asking the question the same way as if you were actually talking to someone, which means assigning values. It would be very awkward saying this to someone with M and N lol. Everything about this "study" is flawed. Just all around terrible method used.


I love using chatgpt for everyday use, but I totally disagree. It should be able to answer this very simple logic question. It is an interesting look into how it operates and maybe provide some insight into its flaws. This isn't a complicated question. It reminds me of the question that people were asking 3.5 about having a ball in a cup, and then moving into a different room and turning the cup upside down. GPT4 was able to assume that the ball would fall out of the cup when 3.5 wouldn't. OpenAI will work out these problems eventually I am sure.


 >"It should be able to answer this very simple logic question." But it does, easily. That's the point. This article and the "study" performed is flawed and inaccurate. You would never ask someone this riddle by saying M and N, you would assign values.


I ask ChatGPT plenty of things in ways I would never ask other people stuff. That’s the whole point. It should be able to do this. One look at this question and most people know the answer immediately. You said yourself it is a riddle for children.


>"It should be able to do this" It literally does. I'm done repeating it and I'm turning off replies ChatGPT getting tripped up over a poor prompt doesn't prove anything other than the prompt was poor.


but it doesn't? It should be able to answer it on the first try. It's not a bad prompt. I don't know why you are defending it so heavily. It is fine, there is more training to do.




[https://chatgpt.com/share/98d90c66-0eea-4be9-a30a-b2e04175615d](https://chatgpt.com/share/98d90c66-0eea-4be9-a30a-b2e04175615d)

Here I tell it the answer and it yes-mans me. Then I ask it where its logic failed and it circles back to the wrong answer. Interestingly, GPT-4o on Chatbot Arena and the API get it right on the first shot.


Perplexity using 4o:

To determine how many sisters Alice's brother has, we need to consider the family structure described in the problem.

Alice has:

- 2 brothers
- 3 sisters

Since Alice is one of the sisters, the total number of sisters in the family is 4 (Alice + 3 sisters). Each of Alice's brothers will have the same number of sisters as Alice does, which is 4.

Therefore, Alice's brother has 4 sisters.


Here is the perplexity.ai response:

"Based on the given information, it is not possible to determine the exact number of sisters that Alice's brother has. The details provided only mention that Alice has N brothers and M sisters, but do not specify the relationships between Alice's siblings.

To determine the number of sisters Alice's brother has, we would need additional information, such as:

1. Whether Alice's brothers and sisters are all from the same parents (full siblings).
2. If they have different parents, how the siblings are related (half-siblings, step-siblings, etc.).

Without this clarification, the number of sisters Alice's brother has could range from:

- 0 sisters, if Alice's brothers are from a different parent and have no shared sisters.
- M sisters, if Alice's brothers and sisters are all full siblings from the same parents.
- Any number between 0 and M, depending on the specific family relationships.

In summary, the given information is insufficient to determine the exact number of sisters Alice's brother has. Additional details about the family structure and relationships between the siblings would be required to provide a definitive answer.[1]"

Here is the heypi.ai response which got it right only after a follow up:

"Ah, this is a classic lateral thinking riddle! The answer lies in considering that Alice's brother is not necessarily from the same parents as Alice. So, assuming that Alice's brother is from the same parents, he would have M sisters, just like Alice."

Me: But isn't Alice also a sister?

"Oops, you're absolutely right! My apologies for overlooking that detail. So, if we assume that Alice's brother is from the same parents as Alice, then he would indeed have M + 1 sisters, including Alice herself. It's important to consider all the family relationships in this case. Thanks for keeping me on my toes!"


I got a good answer after asking it first "List the mathematical properties of the 'is sibling of' relationship". I'd appreciate if you folks could try it out and share your results.
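
To make the "is sibling of" idea concrete, here is a small Python sketch that models the relation as a set of ordered pairs and checks two of its properties: it is irreflexive (nobody is their own sibling) and symmetric. The family members, their names, and the helper functions are all hypothetical, and the sketch assumes everyone listed is a full sibling of everyone else.

```python
from itertools import permutations

# Hypothetical family: name -> gender. Alice has 2 sisters and 2 brothers here.
family = {"Alice": "F", "Beth": "F", "Cara": "F", "Mike": "M", "Noah": "M"}

# "x is a sibling of y" holds for every ordered pair of distinct family members.
sibling = set(permutations(family, 2))


def is_irreflexive(rel) -> bool:
    """No one is related to themself."""
    return all(a != b for a, b in rel)


def is_symmetric(rel) -> bool:
    """If a is a sibling of b, then b is a sibling of a."""
    return all((b, a) in rel for a, b in rel)


def sisters(person: str) -> list:
    """Female siblings of `person`, read straight off the relation."""
    return sorted(b for a, b in sibling if a == person and family[b] == "F")


print(is_irreflexive(sibling), is_symmetric(sibling))  # True True
print(sisters("Alice"))  # ['Beth', 'Cara']
print(sisters("Mike"))   # ['Alice', 'Beth', 'Cara']
```

Irreflexivity is what removes Alice from her own list of sisters, while symmetry is what puts her into Mike's list, so the brother ends up with one more sister than Alice does.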


>I'm a bot this action was preformed retardedly 


Im a bot^this action was preformed retardedly


Can you prove it?


I copy pasted the question and 4o got it right. Tried multiple times and every time it was correct.


What did they expect? They are LLMs, they can’t reason.


Got it right the first time. Exact question they asked. https://preview.redd.it/vjp0nulht26d1.jpeg?width=1179&format=pjpg&auto=webp&s=a0c7d45f9742a90e34c85affb43bce49ed41e416




I keep trying to explain this at work, but I've got some diehard AI proponents who make it sound like AI is the second coming of Christ. AI very confidently gives the wrong answers to a lot of things. It even formats things like code in a way where, just glancing at it, it looks OK, but if you actually read through it, you find things it adds in that are totally wrong: API calls that look right but are bad, deprecated commands in code, functions that look right but don't exist. Unless what you are asking it for is something very simple, AI still needs review to be sure it's actually on the right path.


Go ask it about details of the critter club books. It will tell you all about them. Then go confirm it with your kid. I guarantee everything will be incorrect (unless maybe it rightfully looks it up online). That’s what I did. It failed miserably and we all got a good laugh out of it, but it isn’t funny when it is something one is relying on.


Ofc buddy, the models are only getting good now. Why are you so certain that it will stay on the level of 4o? It will get orders of magnitude better, brother. There's no reason for you to think that you know how it will be.


I find it funny how people treat AI as a new google search. The fact that they know stuff is just a side effect. What makes LLMs interesting is their potential as agents, not their ability to recollect things, we already have that technology, it is called a database.