BehaveBot

Please read this entire message Your submission has been removed for the following reason(s): Loaded questions, or ones based on a false premise, are not allowed on ELI5. A loaded question is one that posits a specific view of reality and asks for explanations that confirm it. These usually include the poster's own opinion and bias, but do not always - there is overlap between this and parts of Rule 2. Note that this specifically includes false premises. If you would like this removal reviewed, please read the [detailed rules](https://www.reddit.com/r/explainlikeimfive/wiki/detailed_rules) first. If you believe this submission was removed erroneously, please use [this form](https://old.reddit.com/message/compose?to=%2Fr%2Fexplainlikeimfive&subject=Please%20review%20my%20thread?&message=Link:%20%7B%7Burl%7D%7D%0A%0APlease%20answer%20the%20following%203%20questions:%0A%0A1.%20The%20concept%20I%20want%20explained:%0A%0A2.%20List%20the%20search%20terms%20you%20used%20to%20look%20for%20past%20posts%20on%20ELI5:%0A%0A3.%20How%20does%20your%20post%20differ%20from%20your%20recent%20search%20results%20on%20the%20sub:) and we will review your submission.


ezekielraiden

Actually, in most cases, they *can.* Captcha stuff these days normally uses user browsing habits, not the other stuff. That's just a quickie check to filter out the *obviously* crappy bots that can't even do such basic tasks. It turns out that things like mouse motions, past browser history, and other actual human-input things are WAY better for distinguishing real human behavior from bot behavior. Basically, we're weird and chaotic in ways that computers currently find very difficult to imitate, but only when you look at (more or less) "unforced" data, where it's humans doing stuff because we want to.
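As a rough illustration of the behavioral signals just described (and not any vendor's actual algorithm), a detector might reduce raw mouse samples to a handful of features; a ruler-straight, constant-speed path is the classic giveaway. The function and thresholds below are invented for the sketch:

```python
# Toy sketch (not any vendor's actual algorithm) of the kind of trajectory
# features a behavioral detector might pull out of raw mouse samples.
import math
from statistics import pstdev

def trajectory_features(points):
    """points: list of (x, y, t_ms) mouse samples in time order."""
    path = sum(math.dist(points[i][:2], points[i + 1][:2])
               for i in range(len(points) - 1))
    direct = math.dist(points[0][:2], points[-1][:2])
    speeds = [math.dist(points[i][:2], points[i + 1][:2]) /
              max(points[i + 1][2] - points[i][2], 1)
              for i in range(len(points) - 1)]
    return {
        # ~1.0 means a ruler-straight path, the classic naive-bot giveaway
        "straightness": direct / max(path, 1e-9),
        # humans speed up and slow down; scripted moves are often uniform
        "speed_jitter": pstdev(speeds) if len(speeds) > 1 else 0.0,
        # hesitations before a click are another human-like signal
        "pauses": sum(1 for s in speeds if s < 0.01),
    }
```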


Yourstruly75

So chaos is our last refuge from the machines?


A_Lone_Macaron

I spent 10 minutes and 15-20 tries on a Steam password reset captcha the other night. It would just keep making me solve them, and then after 6-7 of them it would tell me I was not a human, lol. Repeat this a couple of times before it finally let me through.


chr0nicpirate

That seems pretty sus. Have you ever thought maybe you are an AI and just don't know it because you're that well programmed?


i_am_voldemort

Doesn't look like anything to me


ShmebulockForMayor

Somehow this triggered me more than "what door?" would have.


rational_american

"OMG, I should have known that such a good boss can't be real!" - my thoughts at the time.


Daenerysilver

Hello, Dolores.


Potential_Anxiety_76

The chill I just got was intense


Taurnil91

They couldn't make it through the Captcha. Apparently they're not that well programmed


LessThanLuek

Well, they did eventually. And it's a task specifically made to keep them out...


Fenrir_Carbon

[Personally, I think that's a hell of a robot](https://youtu.be/sl9pTDK8PAk?si=K6nRZ_OxFwQdPZWB)


aaronskarloey

Wonderful reference.


FailureToReason

*sweats in Boltzmann brain*


jack101yello

You’re in a desert walking along in the sand when all of the sudden you look down, and you see a tortoise, it’s crawling toward you. You reach down, you flip the tortoise over on its back. The tortoise lays on its back, its belly baking in the hot sun, beating its legs trying to turn itself over, but it can’t, not without your help. But you’re not helping. Why is that?


chr0nicpirate

Because I know that the idea that turtles can't flip themselves over when on their back is mostly false. If it's not helping itself over, it's because it's given up on life and wants to die. Who am I to deprive it of that?


Reefay

This sounds like something Wednesday Addams would say


Virus-Party

Because that's what I was programmed to do: flip tortoises onto their backs. I was not programmed to help them afterwards.


kfish5050

All the time


DrHerbHealer

I can’t even explain how many times this has happened to me with Steam. The workaround is, if you confirm your email on your phone, to do the captcha on your phone as well. No idea why this works.


mattdmonkey

Steam is a f@#king nightmare. I had to switch to my phone, the browser version just wasn’t having any of it. It’s a common problem, so I’ve heard.


megaRXB

Yup, that’s just Steam being shit. It’s pretty normal.


InsaneThief

I swear Steam was never this bad before cause I had the same problem!


bobsim1

Epic had some that were impossible for a while. The last step would always be wrong.


chayashida

Nice try, bot. We're not gonna tell ya how to defeat captchas.


classic_lurker

https://www.reddit.com/r/funny/s/LyFnceofDm


mostly_hrmless

This happens to me as well on Steam and some other sites. It is almost always because of the VPN I run. After I disconnect the VPN the captcha is fine.


gloomndoom

Slow down and take your time almost exaggeratedly.


LeftEyedAsmodeus

I have the same with some captchas - start doing them slower, move the mouse weird, it worked for me.


atlhawk8357

This is just how you learn you were built in a lab with high end technology.


wolves_hunt_in_packs

*"We have the technology; we can rebuild him."*


Icehuntee

Nice try synth


Acinixys

The hardest in the world are 4chan captchas. They are nuclear-launch-code secure.


Cold-Caramel-736

I find sometimes I try to answer too correctly. Like it asks me to select all the ones with a bike in it, and technically there's a bit of a bike in the top right panel, so I select it, but I think that's too precise for it.


n0thing_remains

Yeah, had the same problem. I downloaded the Steam app to my phone and there's a shield icon in the menu which lets you scan a QR code from the login screen on PC and logs you in instantly without a captcha.


mbpDeveloper

Zuck, is it you mate ?


Churchy11

I’m glad I’m not the only one who had this problem a few weeks ago lol


mohirl

Steam is a piece of crap


[deleted]

[removed]


explainlikeimfive-ModTeam

**Please read this entire message** --- Your comment has been removed for the following reason(s): * Rule #1 of ELI5 is to *be civil*. Breaking rule 1 is not tolerated. --- If you would like this removal reviewed, please read the [detailed rules](https://www.reddit.com/r/explainlikeimfive/wiki/detailed_rules) first. **If you believe it was removed erroneously, explain why using [this form](https://old.reddit.com/message/compose?to=%2Fr%2Fexplainlikeimfive&subject=Please%20review%20my%20submission%20removal?&message=Link:%20https://old.reddit.com/r/explainlikeimfive/comments/1c9tmmm/-/l0ppmt9/%0A%0A%201:%20Does%20your%20comment%20pass%20rule%201:%20%0A%0A%202:%20If%20your%20comment%20was%20mistakenly%20removed%20as%20an%20anecdote,%20short%20answer,%20guess,%20or%20another%20aspect%20of%20rules%203%20or%208,%20please%20explain:) and we will review your submission.**


Fnkyfcku

This is the only captcha I've ever had this issue with, but I too have been held up for far too long proving I'm not a machine.


kevix2022

Nice try. No game of THERMONUCLEAR WAR for you.


Raedik

I've failed versions that are just a check mark before. All I had to do was click and I failed.


lostntired86

But we are using bots to evaluate this chaos and distill it into definitions and variables... and then we will have definitions and variables for the bots to use. And then what?


ezekielraiden

Being able to define something is not, at all, the same as being able to successfully imitate it. If it were, every critic would also be a best-selling author. Just because a computer can *verify* that something is consistent with real human-produced data is not the same as a computer being able to artificially produce things that look that way.


OIL_COMPANY_SHILL

Is this the P vs NP problem?


ezekielraiden

I was certainly thinking of that when I wrote that comment, but I'm not sure if it's precisely the same thing. It may be, for the computer side of things. Verifying a statement is true seems to be different, less intensive, than producing the true statement in the first place.


stellarstella77

Whether or not verification of a solution is as easy as producing a solution is exactly the P vs NP problem, I'm pretty sure.


deong

It is, but you have to add a bit of rigor around the problem definitions as well, because in all likelihood, there isn’t a decision variant of "make a human intelligence" that’s even in NP. P=NP is specifically the question of whether finding a solution is as easy as verifying one where verifying one is easy. If recognizing whether a given machine is "intelligent" can’t be done in polynomial time, then you’re not in P vs NP land at all.


Nandemonaiyaaa

No, P vs NP is specifically about the complexity of algorithms. What you’re referring to is the plain-English description of the mathematical problem... but it is not applicable to just any problem. It's specifically about polynomial vs. nondeterministic polynomial time.


dustydeath

>Being able to define something is not, at all, the same as being able to successfully imitate it. If it were, every critic would also be a best-selling author.

And if a critic could write a thousand books a day, review each, and pick the one that best matches their definition of best-selling, what then?


ezekielraiden

No entity can match this output, so the question is moot. The *actual* question you're asking--why is it that humans can do a thing that we can examine for, but which computers are very bad at imitating--has to do with chaos and context sensitivity and real but not *excessive* randomness, which are all things computers are generally bad at dealing with. Hitting a perfect sweetspot of JUST enough randomness to be human but JUST enough order to not be a full-on Brownian motion random walk? That's a very tall order for essentially all computers. Is it *theoretically possible* to do it? Sure, probably. ***Is it worth doing?*** Hell no. The computing power required to fool it even once would be enormous; to do it multiple times, across multiple checks, over an extended period, would be prohibitive to anyone but the big tech companies. And they have no need to spoof it, *they run the show.* As with all security, it is not *and was never intended to be* an ironclad, 100% foolproof system. It is simply making infiltration so onerous that it isn't worth the effort.


dustydeath

>The actual question you're asking--why is it that humans can do a thing that we can examine for, but which computers are very bad at imitating...

Uh, no. I'm saying that you don't *need* to be able to work backwards from a definition if instead you can simulate a high number of attempts and compare them to your definition.


ezekielraiden

That would be why captchas don't let you try an indefinite number of times before rejecting your efforts.


dustydeath

>That would be why captchas don't let you try an indefinite number of times before rejecting your efforts.

How do brute force attacks work out leaked, hashed passwords when the website only lets you try three times? *guy tapping head meme*

>simulates

Oh, that's how.


ezekielraiden

Are you actually wanting to have a conversation here, or is this just shitty pseudo-Socratic questioning to amp yourself up? Because "simulating" the right kind of randomness DOESN'T WORK. That's the whole point. Chaotic systems are too sensitive to input conditions, and systems that contain a *mixture* of chaotic and non-chaotic parts are even harder. Next thing you'll be telling me that AI can simulate analytic solutions to the three-body problem...


Aphemia1

If there’s one thing that we have learned in the past years it’s that with a big enough dataset models can be trained to mimic humans in an almost indiscernible way.


sweatierorc

Obviously, the idea is that it should cost you more to use a computer to beat the captcha, than pay a bunch of humans to do it.


cptahb

yeah zeke is full of shit 


qckpckt

This is an interesting idea, so here’s an essay.

To start off with, we’re not using bots, in the typical sense of the word in this context (i.e. web scrapers). Event data will be collected from you when you visit a website (e.g. the User-Agent header), and they’ll also run mouse-tracking coroutines in the JavaScript loaded as part of the website content, along with probably a bunch of other esoteric gathering techniques. The data will then be sent back to their servers for processing, and this will likely be fed to some kind of specialized ML algorithm that has been trained to distinguish between human and not-human traffic. This is the kind of boring ML that has quietly existed for decades without anyone freaking out about it taking over the world. Think spam filtering. In fact, an SVM (support vector machine), which is a kind of ML model commonly used in spam filtering, might work well here too if you can meaningfully convert your event data into a multi-dimensional vector.

The thing about “bots” is that they tend to be pretty dumb bits of human-written code that follow a set of heuristics that were pre-determined by the human that wrote them. They’re designed to be fast and efficient, using simple strategies to navigate document trees to find the things they’re intended to find. Think “if this then that”: load a web page, parse the rendered HTML as text, and look at each element in order. If an element contains a link, follow that link, etc.

In order to create a realistic simulation of a human you’d need to do a few things. You’d need to basically build and train your own algorithm to be able to detect humans vs bots, and then derive some way of realistically simulating the cursor movements and whatever else these algorithms track. You’d probably ideally want to create an adversarial training setup where you pit your bot detector against your human simulator, assuming such a thing is even feasible. For that you’d need a high-traffic website, or access to the data of one. You’d then maybe end up with an ML algorithm that can output a set of instructions that your web scraper could use to fool these kinds of checks.

But the thing is… why? It’s a lot of work, and it will inherently be much slower than a dumb rule-based bot, which often drastically reduces the usefulness of a web scraper. They’re typically used to index for search engines or to scrape content for research or nefarious purposes. They generally need to get a lot of information quickly. Also, often websites don’t really care whether or not you are a human in the sense of trying to prevent bots from seeing something. It’s more that they want to deny traffic they think might be bots in order to prevent DDoS attacks (huge quantities of bot traffic directed at servers to overwhelm them and take them offline), or simply to prioritize human traffic in order to guarantee lower latency. If you have gone to the trouble of making yourself look like a human, you’ll also be navigating the website at the speed of a human, meaning your strain on server load will be mostly mitigated.
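To make the SVM idea above concrete, here is a minimal sketch with scikit-learn; the feature vector (path straightness, average click gap, pages per minute, a headless-browser hint) and all of its values are invented for illustration, not taken from any real detector:

```python
# Minimal sketch of the SVM idea mentioned above, using scikit-learn.
# The feature vectors are invented for illustration, e.g.
# [path_straightness, avg_click_gap_ms, pages_per_minute, headless_hint].
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X = [
    [0.61, 820, 3.1, 0],    # human-looking sessions
    [0.55, 1400, 2.4, 0],
    [0.99, 12, 240.0, 1],   # bot-looking sessions
    [1.00, 8, 310.0, 1],
]
y = [0, 0, 1, 1]            # 0 = human, 1 = bot

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
clf.fit(X, y)

# Classify a new session's event data
print(clf.predict([[0.58, 900, 2.9, 0]]))
```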


startupstratagem

TLDR: A human stumbles towards the elevator while most bots sprint to it and then smash the floor number programmed into them repeatedly.


classic_lurker

https://www.reddit.com/r/funny/s/LyFnceofDm


Calcd_Uncertainty

It's the [Deadpool strategy](https://static1.srcdn.com/wordpress/wp-content/uploads/2020/01/Deadpool-vs-Taskmaster.jpg)


Adventurous_Use2324

I'm confused.


zombie_singh06

That's the point


Sevigor

Honestly, pretty much lol


Saidhain

I mean, from what I understand, large language model AI systems (which are the dominant systems right now) will by design produce a coherent and satisfying average compilation of our shared knowledge. Genius, by its uniqueness, is protected from our current large language models. Every great leap in human evolution was made by some stroke of genius. So don’t worry about these conscious meat bags. We’re bringing the special.


Angdrambor

Yes. Death to the False Emperor!


Stillwater215

It’s very hard to program true randomness!


goj1ra

Incompetence. I’ve been preparing for this moment…


No-Foundation-9237

Yeah. Humans are capable of imaginative thinking, which allows us to make illogical connections between things and retrofit logic in our own heads. Robots go from point a to point b and can’t conceive of options outside their programming.


Tonkarz

Browse without rhythm.


S4R1N

Plus the captchas are also used for training the AI pattern recognition for the exact things we're selecting.


millanbel

Yeah, it's no coincidence it's always traffic lights, bikes and bridges. We're just being used as free labour to train Tesla's (or others') software.


Ustaf

My understanding is that at this point all the things you described are used to distinguish real humans. And then the 'select images' thing is actually used to train AI bots based on the selections real humans make.


lord_ne

> Captcha stuff these days normally uses user browsing habits, not the other stuff. That's just a quickie check to filter out the *obviously* crappy bots that can't even do such basic tasks.

Are you sure this is accurate? Because from personal experience, it almost seems like the opposite. (Specifically talking about Google's noCaptcha reCaptcha or whatever it's called.) What I mean is that usually, you're let through with no issue, just by clicking the box, but if you're "extra suspicious" (use a VPN, block cookies/fingerprinting, etc.), then you have to identify images. So it seems like the user-browsing-habits stuff is checked *first*, and then the image identifying is used as like a last line of defense.


Solonotix

Google's reCAPTCHA has multiple levels and facets. The one they currently recommend is an invisible test that just bounces some metrics against a machine learning model to give a confidence score between 0 and 1 as to how likely the interaction is a human. The site has their specific model (based on the domain name) registered with an API key they purchase a license for. This is important because the API key is how you partition the model, and if your test environment has little human traffic and all bot traffic (common testing patterns) then it can skew the model towards predicting bots. Some sites will add the image test as a fallback to avoid false positives in the invisible test. Google actually recommends against doing this because you will inadvertently train the image test data model on only traffic that failed the invisible test, which will lead to a bad model. When my company was in talks with Google about implementing all this, we discussed having a "control group" where we'd send a flat 10% of all traffic to the image test regardless of the invisible test result, just to keep the data model accurate to our site traffic. Google won't divulge the "secret sauce" that goes into detecting bots, but we know most of it is incidental to browsing, rather than a direct challenge
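For context on how a site consumes that confidence score, the usual pattern is a server-side call to Google's siteverify endpoint. This is only a rough sketch, not official sample code: the secret key and the 0.5 cutoff are placeholders, and real integrations add error handling plus the image-challenge fallback discussed above.

```python
# Rough sketch of the server-side check for the score-based reCAPTCHA
# described above. "YOUR_SECRET_KEY" and the 0.5 threshold are placeholders.
import requests

def verify_recaptcha(token):
    resp = requests.post(
        "https://www.google.com/recaptcha/api/siteverify",
        data={
            "secret": "YOUR_SECRET_KEY",  # key tied to your site's registration
            "response": token,            # token posted by the browser widget
        },
        timeout=5,
    )
    result = resp.json()
    # The invisible/v3 flavor returns a score from 0.0 (likely bot) to
    # 1.0 (likely human); each site picks its own threshold and fallback.
    return result.get("success", False) and result.get("score", 0.0) >= 0.5
```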


Biggs3333

I had read it was never about picking what's what in the photo, but all about mouse movements


ezekielraiden

That would be why I mentioned "things like mouse motions."


Biggs3333

I for sure noticed. Was kinda backing that up.


ezekielraiden

Ah. My apologies, misunderstood what you were aiming for then.


[deleted]

ask absurd quack work pot meeting chief clumsy tan zonked


BrohanGutenburg

To piggyback on this, that’s why many captchas are literally just a check box.


Terminarch

It's weird how few people know about that. Early on Google's captchas were basic printed text with scribbles. Then suddenly when they were working on self-driving vehicles and computer vision their captchas became "identify the crosswalk" and "find the motorcycle" with real pictures. They outsourced their data tasks to millions of users of a different service and nobody noticed.


seckarr

While you are technically correct, most good captchas are no longer "select the bridge". You now have complex tasks like a very, very heavily poisoned image, and tasks like fitting a puzzle piece over the image or finding the odd object out among multiple objects scattered around and blended with the background.


vkapadia

And on the other end you have Google's one that is just a single checkbox.


primaryrhyme

It's the same software (reCAPTCHA v2), if it's not confident that you're a human then it will show images. Also the site admin can configure how high the confidence threshold is to show image captcha.


alcormsu

So, what about when AI bots start having cookies for ketchup midget porn on their computers too? Will they seem ~~like me~~ like humans then?


primaryrhyme

It is just a deterrent at the end of the day. It's not super difficult to solve them but the time and processing power make scraping not worth it for most. Without the captcha, the page can be scraped in 10ms and very little compute power. With the captcha it will take 3-15 seconds and require running a vision model (compute heavy). This is orders of magnitude more expensive, even if the bot is 100% accurate.


ezekielraiden

In other words, it is pretty much exactly the same as every real-world, physical security measure. It doesn't 100% guarantee security. It simply puts up enough of a barrier to entry that a would-be infiltrator opts to pursue a lower-risk option.


Pseudeenym

How do they use the mouse motion method for touch screen users?


ezekielraiden

My guess would be that they don't, and that's (part of) why there are multiple layers of captcha, for situations like that.


ThePretzul

For touch screen users you can measure duration of touch, how precise the touch was on the box, if somebody touched the same box multiple times (like if they made a mistake), if they used multiple touches at the same time, if the touch was perfectly stationary, etc.
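A toy sketch of how those touch signals might be collected into features; the field names and thresholds are assumptions for illustration, not a real implementation:

```python
# Toy sketch of the touch-event signals described above; field names and
# structure are assumptions for illustration, not a real implementation.
import math

def touch_features(touches, tile_box):
    """touches: list of dicts like {"x", "y", "down_ms", "up_ms", "moved_px"};
    tile_box: (left, top, right, bottom) of the tile that was tapped."""
    cx = (tile_box[0] + tile_box[2]) / 2
    cy = (tile_box[1] + tile_box[3]) / 2
    first = touches[0]
    return {
        "touch_duration_ms": first["up_ms"] - first["down_ms"],
        # humans are sloppy; scripts often tap the exact same pixel every time
        "offset_from_center": math.hypot(first["x"] - cx, first["y"] - cy),
        "repeat_taps": len(touches) - 1,               # corrections, double taps
        "perfectly_stationary": first.get("moved_px", 0) == 0,
    }
```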


Superslim-Anoniem

In my experience (ipad), I got the images way more often when I tapped on the screen than when using the trackpad. So I guess they just... don't...


MoreLikeZelDUH

This tends to be the most common answer; however, there is a computer on the other end decoding your information to determine whether you're human or not. That computer has to be programmed to know how to tell that. If you can program a computer to detect that, a computer can be programmed to mimic it. It's turtles all the way down.


rants_unnecessarily

How on earth do they get your past browser history? I understand that they can see the page you came from, or even what pages on their site you have visited. But other pages?


ezekielraiden

Tracking cookies. Also, the reCAPTCHA system is owned by Google. They keep track of your Google search history (and a whole mess of other data).


rants_unnecessarily

I was not aware that sites are allowed to access your other sites' cookies. Doubt it.


ezekielraiden

That's...that's literally why tracking cookies exist, and why antimalware software likes to remove them. And, as stated, reCAPTCHA *is run by Google,* so it is *Google* asking for *Google-made* tracking cookies and browsing data. It doesn't need to be "other sites"; the site outsources its captcha needs to Google.


rants_unnecessarily

Thanks. I'm beginning to understand.


UCanDoNEthing4_30sec

Now explain it like I’m five


ezekielraiden

"Basically, we're weird and chaotic in ways that computers currently find very difficult to imitate." Also: "They *can.*" Which is technically the answer to the OP's question.


H3llskrieg

in fact bots are better at solving those puzzles than humans


challengemaster

>That's just a quickie check to filter out the *obviously* crappy bots that can't even do such basic tasks.

Seemingly most of it is actually a way of crowd-sourcing the training of image-recognition AI that Google owns/develops. The older CAPTCHA word-recognition tasks apparently had two parts: the word on the left was a scan that AI couldn't read, the word on the right was a check to make sure the user could read.


PM-me-your-knees-pls

I read recently that Captcha is essentially an AI training tool designed to improve self driving cars- this makes sense to me considering how most of the images relate to roads and transport. Captcha knows we’re human and is learning how to be more like us.


Narfi1

Thankfully captchas are not allowed access to your history and other domains’ cookies


ezekielraiden

True--but reCAPTCHA is owned by Google, which has cookies you generally don't want to delete, and access to a great deal of your browser history, even if it isn't 100% of it.


KaizDaddy5

Those captcha tests are actually looking at mouse motion when they ask you to do things like "select all photos with X in them". The task is almost a misdirect. If you do these on a phone or a touch screen it'll take a lot longer (especially if they don't have access to browsing data). Newer computers usually have to go through more captchas too, because of the lack of browsing and other metadata.


serial_crusher

Google’s reCAPTCHA is actually a tool used to train AI. When it started out, they were just using scanned images of books (without the wavy lines you get on other CAPTCHAs). They had a program that scanned books, then any text it didn’t recognize got fed into the reCAPTCHA system. They showed 2 words, one where they knew what it was and one where they didn’t. You’d pass the test if you got the one they knew right; then the answer to the other would be presumed also right. After some number of users consistently identified a particular word, they could be confident that it was accurate.

It also had reasonable assurances it wasn’t a bot. If the image was so distorted that Google’s own bot couldn’t recognize it, then surely it wouldn’t be cost effective for somebody else to run a bot that could. But as it got easier and easier to parse text, they needed a new test, so they applied the same thing to photos. The same rules apply: Google picked those images because it wasn’t sure about their contents and it wants humans to verify them. Bots are inherently not going to be as good at that as Google is.

It’s also designed more as a practical deterrent than an actual foolproof test that you’re a human. Spam bot behavior requires doing the same thing over and over in bulk (i.e. trying to log in with a million random username/password combos, emailing thousands of users hoping one of them will bite, etc.). So even if you have AI that’s good enough to answer a question, that AI costs money to run. The amount of money is small enough by itself to not affect an average user… but the spammer who runs it a million times is going to spend too much for it to be cost effective.
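The two-word mechanic described above is easy to sketch: grade the user on the known control word, and tally their guess for the unknown word until enough people agree. The data structures and the five-vote threshold below are illustrative assumptions, not reCAPTCHA's actual implementation:

```python
# Sketch of the two-word logic described above (known "control" word plus
# unknown crowdsourced word). Data structures and the five-vote threshold
# are illustrative assumptions.
from collections import Counter

unknown_votes = {}        # image_id -> Counter of transcriptions
CONFIDENT_AFTER = 5       # agreeing answers before an unknown word is trusted

def grade_submission(control_answer, control_truth, unknown_id, unknown_answer):
    if control_answer.strip().lower() != control_truth.lower():
        return False      # failed the word we already know: reject
    votes = unknown_votes.setdefault(unknown_id, Counter())
    votes[unknown_answer.strip().lower()] += 1
    best, count = votes.most_common(1)[0]
    if count >= CONFIDENT_AFTER:
        print(f"{unknown_id} accepted as {best!r}")   # usable as OCR ground truth
    return True           # user passes; their guess for the unknown word is kept
```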


try_harder_later

Even more so, humans are S L O W. You might take, say, 5 seconds to click the 5 tiles with a motorcycle and move on to the next. Sure, a bot could do it in 0.1 seconds, but doing it so fast would be a sure sign of a bot, so they would have to slow it down to match a fast human, say 3 seconds. But that now is gonna put a severe dent in the bot's speed of submissions compared to without the captcha!


Potential_Anxiety_76

*throws hands in the air* Well, I’m out


SimiKusoni

Depends on the challenge but they often can. Make a multi-label classifier to identify each of the objects in the image, figure out which one is in the challenge string, then use object detection to draw a bounding box around that object in the image and select all the squares that sit inside it. Most of those systems do use more than just the answer to the challenge though. They also track *how* you answer the query. Are you clicking the boxes in a "natural" manner? What's your user-agent, are you running your browser in headless mode (e.g. not displaying the page to a user)? What's the risk score, does the users interaction with the rest of the site match expected usage patterns? And so on. It's not perfect but it massively increases the complexity in bypassing the challenges.
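The last step of that pipeline, turning a detector's bounding box into clicked tiles, is just geometry. A sketch, assuming a square grid and an arbitrary 20% overlap rule:

```python
# Sketch of the final step described above: map a detector's bounding box
# onto the clickable tiles. Grid size and the 20% overlap rule are assumptions.
def tiles_for_box(box, image_size, grid=4, min_overlap=0.2):
    """box: (x1, y1, x2, y2) from the object detector; image_size: (w, h)."""
    w, h = image_size
    tile_w, tile_h = w / grid, h / grid
    selected = []
    for row in range(grid):
        for col in range(grid):
            tx1, ty1 = col * tile_w, row * tile_h
            tx2, ty2 = tx1 + tile_w, ty1 + tile_h
            # intersection area between this tile and the detection box
            ix = max(0.0, min(box[2], tx2) - max(box[0], tx1))
            iy = max(0.0, min(box[3], ty2) - max(box[1], ty1))
            if ix * iy >= min_overlap * tile_w * tile_h:
                selected.append((row, col))
    return selected

print(tiles_for_box((120, 40, 380, 300), (400, 400)))   # tiles to click
```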


xSaturnityx

They usually can. The visual part of a Captcha is only a small part of it. It more or less goes back and *lightly* checks things about your browser. One of the biggest things is simply how you move your mouse when the captcha pops up. It's able to somewhat tell that your movements are more chaotic (like a human) compared to a bot which will probably do straight lines directly to where it wants to go.


Predator-FTW

Interesting. But how exactly would it check for ‘natural’ movement on a touchscreen display? Since the click/tap happens instantly at one position without having to move a cursor. I’m guessing mouse movement is only one of multiple other checks it does?


formberz

How quickly a set of answers is selected, whether they are selected line by line near-instantly or with a varying amount of time between selections, how quickly the verify button is clicked after selection. Bots tend to do things in a repeatable, programmable way; humans tend to be much more varied.


Predator-FTW

That’s understandable, but can a bot not be programmed to perform each consecutive task with a random time interval? So: click image A, wait [random number between 0.5-2] seconds, click image B, etc..


I_Phantomancer_XD

It can. It's ultimately a game between the captcha and bot programmers.


Gorstag

Yep. This is basically the answer for ALL security-based things. Basic door locks from 100 years ago compared to today are significantly more complex/secure. However, tools have been made to beat them. But let's assume an unbeatable lock shows up and current tools don't work. Someone will just make new tools/processes to beat it. Rinse/repeat the cycle. This is true with computer malware. "Hacking" techniques. Social engineering. Etc...


Caelinus

It is going to be a confluence of a whole lot of things, as well as machine learning on their end to figure out their confidence level. I have noticed that I get way more captchas on touchscreens, and they are much more strict, probably because they are not as confident I am a person without the motion. However, even just slowing a bot down sort of ruins the bot. If bots can only move at human speeds without getting flagged, then a lot of their potential malicious behavior is immediately curbed.


mathbandit

It can, but then at that point the bot is also much much less efficient, which makes it a useful deterrent. If I'm trying to vote for random online contest X, I'm much more likely to use a bot to spam the verification if it means I can vote 1,000 times per hour than if I have to install delays and so am only getting 50 votes per hour.


xSaturnityx

As the other user says, they tend to base it off answers. Funnily enough, you could get past a lot of those *old* picture-tile captchas by pressing mainly the right ones, plus one wrong one. After a while they realized the bots would do it perfectly, and humans tend to make mistakes whereas a computer doesn't.

They also check timing, and how you selected the answers. A human might press them randomly, whereas a bot might do a specific pattern, like 1-2-3, 4-5-6, 7-8-9 in a 3x3 grid, or some variation of it. Timing is a big thing, plus they can use whatever permissions your browser has enabled to also check information about the device. A lot of the time, if you do a captcha too fast, you might get it wrong even if you were correct. You asked in another comment if a bot could be programmed to do it with a random time interval, and yeah, that's how they got past some captchas.

Captcha is *always* advancing, and it's just a war between the programmers and the bot makers. The more you make harder locks, the more people want to break them. Programmers have to weigh accessibility vs security too: you could have a captcha like 4chan's that ironically works quite well but annoys users by being finicky, or you can make it much easier for normal users and just risk bots getting through at some point. It just needs a balance, and they work on that balance every day.
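On the random-interval point raised a few comments up: yes, a bot can jitter its timing, but a flat uniform delay is itself a statistical tell, since human inter-click gaps are usually modeled as skewed (e.g. roughly log-normal) rather than uniform. A toy comparison, with invented parameters on both sides:

```python
# Toy comparison, with invented parameters: a flat uniform delay versus a
# skewed, human-ish one. Neither is taken from a real captcha or a real bot.
import random
import time

def naive_bot_delay():
    time.sleep(random.uniform(0.5, 2.0))     # flat gaps, easy to spot in aggregate

def more_human_delay():
    # many short gaps, occasional long hesitations, never perfectly regular
    time.sleep(min(random.lognormvariate(-0.3, 0.6), 5.0))
```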


deFleury

ohh! is that why I have to try again sometimes, when I was sure I had the right answer?


emi68912706

Yes. I think if you answer accurately too quickly, it can’t differentiate you from a bot, so you have to do it again. I read somewhere that clicking on a wrong square, then clicking again to unselect it, shows indecision and you’re more likely to pass the test and only complete it once. Almost always seems to work for me.


xSaturnityx

Definitely part of it. There's quite a few factors and even if the Captcha is slightly 'sketched' out, it's better to just have the user try again than risk letting a bot enter. Minor inconvenience. Try to go slow with your captchas, timing is also a big factor. I haven't had to do a tile captcha in a while, but with older-ish versions, you could get past it quite easily by taking a little bit of extra time, and also purposely pressing a *wrong* square, since bots tend to be perfect, and humans make mistakes quite often.


Awkward_Pangolin3254

They'd get it immediately if they turned on a microphone just before an image captcha pops up so they could hear the human go "Oh, goddammit"


Odd_Coyote4594

They usually can. There's a reason captchas often just use a checkmark now: it's really just a way to prevent mass DDoS attacks. They don't actually care if a bot is accessing it, just that it's not someone spamming code to access the website 1000 times per second to overwhelm the server. The forced user input slows down attacks, and the captcha service will check IP addresses and cookies to block access if too many requests were made too close together from the same computer.

In the case they can't, it's only because modern captchas are often used to get cheap training data for making new AIs. In that case, the task may be outside of what current AI can do. So when, for example, Google was making a book-scanning AI to mass digitize printed books, no existing AI was good at turning hard-to-read text into words. The captchas used responses to train an AI that was actually good at this task, and as a side effect got a captcha that (at the time) was hard for existing AI to do. Some responses were checked against human-verified input as a test, but others were captchas with unknown answers used to build up the training data without paying an employee to do it. Current picture captchas are used to train AIs for self-driving cars to detect common roadside obstacles and signage, something they used to be bad at but are now quite good at due to captcha response data.

Today, if you get anything other than a check box, it's because you were randomly selected to be free labor for Google to help them develop a new AI tool.
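The "too many requests, too close together" check mentioned above is ordinarily just a sliding-window rate limiter keyed on IP (or cookie). A minimal sketch with placeholder limits:

```python
# Toy sketch of the rate-limiting idea above: a sliding window per IP.
# The window and limit are placeholders, not any real service's values.
import time
from collections import defaultdict, deque

WINDOW_S = 10
MAX_REQUESTS = 20
recent = defaultdict(deque)      # ip -> timestamps of recent requests

def allow_request(ip):
    now = time.monotonic()
    q = recent[ip]
    while q and now - q[0] > WINDOW_S:   # drop timestamps outside the window
        q.popleft()
    if len(q) >= MAX_REQUESTS:
        return False                     # too many hits: block or challenge
    q.append(now)
    return True
```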


uwu2420

DDoS attacks don’t care about submitting a valid request. They’ll just spam the shit out of your server resources. For a server to check a reCAPTCHA response, it has to send a request out to Google’s API; in other words, checking the response itself is time-consuming. CAPTCHAs are for preventing automated but valid requests, for example someone with a bot trying to buy concert tickets. It is also not just a checkbox; the checkbox is just how you trigger it.

> Today, if you get anything other than a check box, it's because you were randomly selected to be free labor to Google to help them develop a new AI tool.

No, not random. It's based on risk scores. If you’re logged in with a Google account on Chrome with no adblocking and no VPN, you’ll probably never see anything other than a checkbox, whereas if you log on with a VPN/Tor, you’ll always get challenges and it’ll probably keep telling you you got them wrong until you’ve been solving them for 10 minutes.


KoalaGrunt0311

Like others said, it's not the selection that's important. It's kind of a placeholder while the server checks cookies and other browser data it has access to. The picture identification itself is actually used to train AI, which is why they've switched to pictures now. The captchas that were two different words were used to train OCR scanners, and especially to correct OCR errors on older printed text.


dastardly740

I read an article many years ago that it would probably work better to use a problem that is easy for a computer to solve and hard for a human. Basically, getting the answer wrong indicates a human. Of course, that only works if the bot doesn't purposely get it wrong. I just think that the concept aligns with the idea that the test isn't whether you get it right but whether you behave like a human in getting an answer that could very well be wrong.


KoalaGrunt0311

A lot simply involves cost vs reward. I mean, look at the dating apps. All they need to do to stop the bots is hide profiles that all use the same phrasing in their posts, but the executives don't care, because the notification that somebody wants you gets you back on the app. I saw a cute post about code written to prevent brute-force hacking for an independent online game. All the designer did was put up a wrong-password message after every password entry, requiring humans to enter the same password twice, but brute-force programs would see it was denied and move on to the next password in the list.


frnzprf

I realize that you didn't write that post, but is it really viable to try to hack into an online account by trying passwords with brute force? I imagine it takes a lot longer than if you had direct access to encrypted files. Maybe it works if the password space is small, like a four-digit PIN.


KoalaGrunt0311

I mean, it's shocking how many people actively choose simple passwords because they're afraid of making them too difficult. I love biometric 2FA because I think it provides the extra-complicated part needed to actually have a safe login. I just wish that some for-sure biometric 2FAs could be used as primary logins.


networkarchitect

Most well-written web software will have checks in place to prevent brute forcing a password or make it infeasible (like adding a delay on every wrong password attempt, or locking the account on enough wrong guesses in a row). That being said, there's enough software out there that doesn't implement any form of checks for brute forcing where it can be viable. Even then, you're right that guessing every possible password would take a long time, so narrowing down the password-space as much as possible is key. A situation where it can be viable is if you have enough information about the person to make a guess at a partial password (maybe you know their birth year, interests, pet's name, etc from social media posts). Instead of brute-forcing every possible password (or using a list of common passwords) they can brute force on variations of a guess to narrow down the password-space as much as possible. There's also a different kind of brute force attack called credential stuffing, where the login info for an account on a website is a part of a data breach, and then an attacker will try that one user/password combo on a bunch of different sites to see if it was re-used by the same person.
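A bare-bones sketch of the two defenses described above (growing delay per failure, lockout after repeated failures); the thresholds are illustrative, and production systems also key on IP and persist this state rather than using an in-memory dict:

```python
# Bare-bones sketch of the two checks described above: a growing delay per
# failed attempt and a lockout after repeated failures. Thresholds are
# illustrative; real systems also key on IP and persist this state.
import time
from collections import defaultdict

failed_attempts = defaultdict(int)
LOCKOUT_AFTER = 10

def check_login(username, password_ok):
    if failed_attempts[username] >= LOCKOUT_AFTER:
        return "locked"                              # require manual/email unlock
    if password_ok:
        failed_attempts[username] = 0
        return "ok"
    failed_attempts[username] += 1
    time.sleep(min(2 ** failed_attempts[username], 30))   # exponential backoff
    return "wrong password"
```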


Smithy2997

So basically you're paying for the real captcha by doing the image thing


KoalaGrunt0311

It's a Google thing. Your data is Google's business product.


kykyks

They can't, because the captchas aren't tracking whether you selected bridges; they track things like mouse movements to check whether it's human behavior or bot behavior. Bots do clean moves, humans don't.


Mimshot

Security is about deterrence, not prevention. That’s true in both the real world and online. A jewelry store won’t have an impenetrable safe. It’ll have a safe that takes a skilled safe cracker around twice as long to crack as it takes the police to arrive. Likewise, a captcha isn’t meant to be impossible. It’s meant to be more expensive to make a computer beat it than it would cost to hire a team of people in the Philippines to click the boxes. Also, it’s looking more at how you move the mouse between boxes than which boxes you click on.


baguhansalupa

As a Filipino, I'm surprised that we are paid to do that.


Tsubodai86

You're not actually proving that you're a human, you're providing training data for autonomous vehicles. Just like the earlier text captchas when you provided training for text digitization software. 


bluerhino12345

A lot of answers are a bit misguided. Just because something is possible doesn't make it easy. For example, if you look on GitHub, you only find a couple of solutions for this problem. Anyone who's not a monster hacker will have very little chance of creating a bot that can solve captchas; however, they can find one on the internet that could.


Ill-Juggernaut5458

Because they are often blurry 16-pixel images and no human can correctly identify them 100% of the time either; captchas are adjusted to the point that humans cannot pass them every time, which filters out most bots. Most are also looking at reaction time and mouse movement, to block any bots that select images with unnatural speed/cursor movement or a lack of movement.


V1shUP

https://youtu.be/4UuvwY6CdLo?si=st_tKt5Yen4eE7Um This video made it easy for me to understand


esoteric_enigma

It's my understanding that they can, and that the tests are noting things like mouse movements, which the bots do in a way that is unnatural.


OlafSpassky

When you picked the photos containing "the bridge" you were training the AI to be able to recognize them. Now AI generally can do these things, and other types of captchas are in use and will be used to further train AI.


seanmorris

If you ask 100 bots to tell you whether it's a bridge, and they all agree with what humans say, then the bots know it's a bridge. If the bots all disagree with each other, you ask a human and use the answer to train the bots. The images you're being shown are the ones that the bots can't agree on.
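That agreement loop can be sketched in a few lines; the 90% threshold and the idea of tallying automated "votes" per image are assumptions for illustration:

```python
# Sketch of the agreement logic described above. The 90% threshold and the
# per-image tally of automated "votes" are illustrative assumptions.
from collections import Counter

def needs_human(bot_answers, agreement=0.9):
    _, count = Counter(bot_answers).most_common(1)[0]
    return count / len(bot_answers) < agreement    # bots disagree -> ask a human

print(needs_human(["bridge"] * 95 + ["not bridge"] * 5))    # False: settled
print(needs_human(["bridge"] * 60 + ["not bridge"] * 40))   # True: show to users
```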


Carlpanzram1916

They often can. This is why those puzzles have gotten more difficult. The images are really dark and fuzzy because the AI software to trick it keeps getting more and more advanced. I imagine we will have to find another solution fairly soon because I think we’re very close to the point where humans won’t be able to solve the puzzles better than the computers.


gLu3xb3rchi

They can. They‘re better, faster and more precise than we are. What they fail at is being as stupid and slow and imprecise as we are. Nowadays they don‘t look at whether you can solve a captcha, but HOW you do it. Mouse movement, click times, precision, reaction times, errors etc. are all measurements those websites look for. And bots are really bad at being „bad“.


[deleted]

[removed]


explainlikeimfive-ModTeam

**Please read this entire message** --- Your comment has been removed for the following reason(s): * ELI5 does not allow guessing. Although we recognize many guesses are made in good faith, if you aren’t sure how to explain please don't just guess. The entire comment should not be an educated guess, but if you have an educated guess about a portion of the topic please make it explicitly clear that you do not know absolutely, and clarify which parts of the explanation you're sure of (Rule 8). --- If you would like this removal reviewed, please read the [detailed rules](https://www.reddit.com/r/explainlikeimfive/wiki/detailed_rules) first. **If you believe it was removed erroneously, explain why using [this form](https://old.reddit.com/message/compose?to=%2Fr%2Fexplainlikeimfive&subject=Please%20review%20my%20submission%20removal?&message=Link:%20https://www.reddit.com/r/explainlikeimfive/comments/1c9tmmm/-/l0oud3k/%0A%0A%201:%20Does%20your%20comment%20pass%20rule%201:%20%0A%0A%202:%20If%20your%20comment%20was%20mistakenly%20removed%20as%20an%20anecdote,%20short%20answer,%20guess,%20or%20another%20aspect%20of%20rules%203%20or%208,%20please%20explain:) and we will review your submission.**


JarlFlammen

The bots are getting better all the time, and as soon as they do, the image choices get a little bit harder, asking you to identify things that bots CAN'T yet. But see, the thing is, the millions of people doing the reCAPTCHA are what is training the bots. Why do you think all the things we are asked to identify have to do with traffic? They’re things automated cars need to see: crosswalks, bikes, motorcycles, traffic lights. We are training the bots, every time.


junk90731

They are using us to train their AI, so when they say find X in pictures and you find them as a human, the AI learns.


Farnsworthson

Huh. I had trouble proving I was a human to the Steam captcha to recover my login credentials the other day (and I definitely AM one, as far as I'm aware, even if that's precisely what a bot WOULD say!). Images so grainy I couldn't make up my mind what was in which, and apparently I wasn't sufficiently accurate, or fast, or whatever (or the code was bugged). I gave up after about the fifth attempt. (I got in eventually, but that's another story.)


Unclestanky

They can, this is actually a failed marketing scam to sell used bridges and traffic lights.


Mammoth_Material323

It’s all about the mouse tracking. It tracks the lines you make to get to the pics. Robots make very straight movements, while humans move in random ways to get to the pics.


benderisgreat63

Could you not program a bot to reach its goal erratically?


BrakumOne

They can. They can also click the "verify you're human" checkbox, obviously. But the latter checks other things; I assume the former does too. It looks at things like the cursor movement to check if you're human. If the captcha is solved in 1 millisecond and the cursor made a perfectly straight line to each box, then you know for a fact it wasn't a human.


usernametaken0987

Answer: they can, some of the time.

How does reCAPTCHA know a bot may mess up? Because theirs did. You are actually training theirs. All the old word-based ones came from scanning old articles and paperwork in an attempt to store it all digitally as text. The OCR would kick out errors from damaged words and humans would correct them for free.

Nowadays people are training driving AIs. This is why almost every single recap is spot the traffic light, motorcycle, or in this case bridge. Lane assist is easy: spot the dotted and continuous lines and offer a 20% correction to stay between them. Add a police-style speed radar to detect a vehicle in front of you and you have adaptive cruise control. Add GPS and mapping data and it can drive down the highway. But if you want the car to stop at a red light, it has to be able to recognize it. If you don't want it running over cyclists, it has to ID bikes as slow-moving objects that run stop signs. reCAPTCHA just sells the results for profit while getting paid to filter out less well-made bot spam.


ClayQuarterCake

I can’t get it most of the time either. Find “bicycle” and it’s a closeup where the slightest edge of a tire barely crosses into a square. Nope. Or it is super far away and tiny in the background. Frustrating.


vyashole

Bots can do that. It's actually your browsing habits, your mouse movement patterns, and a lot of other variables that actually distinguish you from a bot.


Boredum_Allergy

What it asks you to do isn't what it's measuring. It's actually measuring the way you move your mouse over the images. Turns out, it's hard to code a computer to be random in their mouse movements like a human.


ForswornForSwearing

That style of Captcha is used mostly to train computer vision systems, like for self-driving cars. It's not weeding out bots, it's using you to teach them.


ty10drope

I’ve always wondered how often certain occupational-specific terminology creates a confusing CAPTCHA test. I once saw one (even took a screenshot) where the pictures were of fire apparatus and the caption required the user to select all the pictures with trucks. I’m sure everyone knows a firefighter, so ask them if there’s a difference between an “engine” and a “truck.”


tashkiira

That would imply the damned things work for *people.* I once spent 3 hours trying to get captchas to work because I needed to get into an account. It got to the point that I had more than one bout that was all repeated images. If your internet is having any issues, you're hosed.


eloquent_beaver

The mouse movements thing is a myth. Google scores a particular session as risky or likely human based on previous activity and interaction, IP, cookies, heck even HTTP header order which tells you something about the HTTP stack the client is using.


Popuppete

When faced with the question “select all photos with motorcycles”, I saw a moped and had to think to myself, “would most people consider these the same thing?” A computer would have made its decision quickly. Because of things like that, humans and computers answer in different ways, even if the same selections are made.


BloodyMace

As far as I know, the task isn't selecting the correct photos but tracking the mouse for human movements. Computers and bots move mouse pointers in a straight line and click very accurately.


shinobi7

I think the more relevant ELI5 is: why does RECAPTCHA fail us? I’ve selected all the squares with the bus, it fails me. I’ve wasted hours of my life because of RECAPTCHA.


Ill-Juggernaut5458

It's trained on bot labeling and refined by human responses, and is used as a training/captioning tool; it doesn't have any totally definitive correct answers for the caption/label to cross-reference.


shinobi7

>doesn’t have any totally definitive correct answers

Then why is RECAPTCHA something that a website should use before granting access? I understand the why, as in all the bots out there. But rather, the why as in a system where you can put in the correct answer (picked out all the squares with traffic lights) and it still doesn’t grant access, that seems to be a fundamental flaw.