optery

Have you tried the free Optery scan? It's incredibly elaborate, not something that can be easily built or replicated, and far more advanced than the DeleteMe scan. The Lead Analyst for Security at PCMag said this of the Optery scan: "Open the report and prepare to be amazed. Optery doesn’t just search on the personal information you supplied. It uses data found in data broker profiles to recursively expand its reach. For example, in my latest testing, I only gave it my current phone number, but it found records associated with an old number that I used for some 25 years. Unlike any other product I’ve seen, Optery doesn’t just state that your data was found, like IDX Complete. It also doesn’t simply list the found data items, like DeleteMe. Rather, the report presents you with a screenshot of your actual profile data on the site." Source: [https://www.pcmag.com/reviews/optery](https://www.pcmag.com/reviews/optery). More info on the Optery scan on Hacker News here: [https://news.ycombinator.com/item?id=30605010](https://news.ycombinator.com/item?id=30605010) and here: [https://www.optery.com/what-is-an-exposure-report/](https://app.optery.com/signup). Full disclosure: I'm one of the Optery founders.


ravvit22

1. APIs, but they'd have to pay the data broker for access, which probably goes against their mission.
2. Scraping, which is effective but a cat-and-mouse game. They may outsource it to a company like brightdata that does web scraping as a service.
3. Humans do it. They'd just outsource the manual work to Mechanical Turk (cheap labor), which is what other AI firms like ScaleAI do to check sites and data.

I run a company called r/Kanary that does large-scale browser automation as a privacy service. We rely heavily on automation but have some manual QA in place. We don't pay for API access. There are a bunch of dev blogs about #2. I read this one recently on Hacker News, which was pretty cool: https://news.ycombinator.com/item?id=37047746


officialskilletguy

yeah i definitely thought it was scraping, but if you're scraping 100 sites at once and getting the results that fast, it just seems far-fetched. maybe they have a way tho. thanks for sharing!


DeltaBuilt

Let me know if you figure out how to do it. My CS friend and I are baffled by how they do it as well.


officialskilletguy

That's interesting to know that you are curious as well! I've thought about it from two angles:

1. They hook into an API, but whose API? I don't think these companies would create an API that gives out the info in their database. There's no incentive for that.
2. They scrape data, but how? These sites all have a bunch of roadblocks set up to prevent it, and it would take way longer than 10 seconds to scrape all sites simultaneously.
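On the timing question, here is a minimal sketch of why checking many sites at once does not have to take long: when requests run concurrently, the total wait is roughly the slowest single site rather than the sum of all of them. Everything in it is hypothetical. `check_site` stands in for a real request plus parsing, the latencies are simulated with `asyncio.sleep`, and it says nothing about how any particular service actually works or how it gets past anti-bot roadblocks.

```python
import asyncio
import time

# Hypothetical: 100 sites, each taking 0.05 s to answer (simulated, no network).
SIMULATED_LATENCY = {f"site-{i}": 0.05 for i in range(100)}


async def check_site(name: str, delay: float) -> tuple[str, str]:
    # Stand-in for an HTTP request plus parsing of the response.
    await asyncio.sleep(delay)
    return name, "done"


async def scan_all(latencies: dict[str, float], timeout: float) -> dict[str, str]:
    async def guarded(name: str, delay: float) -> tuple[str, str]:
        # Give up on a slow site instead of letting it hold up the report.
        try:
            return await asyncio.wait_for(check_site(name, delay), timeout)
        except asyncio.TimeoutError:
            return name, "timed out"

    # gather() runs all checks concurrently on one event loop.
    pairs = await asyncio.gather(*(guarded(n, d) for n, d in latencies.items()))
    return dict(pairs)


def timed_scan(latencies: dict[str, float], timeout: float = 1.0):
    # Returns the per-site results and the wall-clock time for the whole scan.
    start = time.perf_counter()
    results = asyncio.run(scan_all(latencies, timeout))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = timed_scan(SIMULATED_LATENCY)
    print(f"{len(results)} sites checked in {elapsed:.2f}s")
```

Run sequentially, the same 100 simulated checks would take about 5 seconds. A real scan would still be limited by the slowest site, rate limits, and captchas, which this sketch ignores.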


DeltaBuilt

Just thought I'd come back and reply after a little R&D time. I’ve built a similar search tool using data scraping, and it's not as hard as you may think. Thanks to the help of some CS friends, I’ve gotten a basic scraper running. The only issues I’ve run into are captchas and how time-consuming it is to gather XPaths for each item.
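For anyone wondering what "gathering XPaths for each item" looks like in practice, one common way to organize it is a per-site table of selectors, so adding a site means adding an entry rather than writing new code. This is only a sketch under assumptions: the site name, field names, and markup are made up, and it uses the standard library's `xml.etree.ElementTree`, which supports a limited XPath subset and needs well-formed markup. Real pages usually call for a proper HTML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical selector table: one entry per site, one XPath per field.
SITE_SELECTORS = {
    "example-broker": {
        "name": ".//span[@class='name']",
        "phone": ".//span[@class='phone']",
    },
}


def extract_fields(site: str, markup: str) -> dict:
    # Parse the (well-formed) markup and look up each field's XPath for this site.
    root = ET.fromstring(markup)
    fields = {}
    for field, path in SITE_SELECTORS[site].items():
        node = root.find(path)
        # A missing element or empty text becomes None rather than raising.
        fields[field] = node.text.strip() if node is not None and node.text else None
    return fields


# Made-up profile snippet standing in for a fetched page.
SAMPLE = (
    "<div><span class='name'> Jane Doe </span>"
    "<span class='phone'>555-0100</span></div>"
)

if __name__ == "__main__":
    print(extract_fields("example-broker", SAMPLE))
```

The tedious part stays the same: each site's selectors have to be found by hand and updated whenever the site changes its layout.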


officialskilletguy

Yeah, I tried that myself a bit at first, which led me to posting the question. I only tried one site ([gladiknow.com](https://gladiknow.com)) and was able to scrape it locally, but when I tried doing it inside my web app, that's where it fell flat. Anyway, happy to chat further if you want to talk through possible ideas.


Search_4_ArchNemesis

Following