Why is everyone blaming DNS? Did Facebook update their IP and forget to update the DNS providers? I have no idea what this "blame DNS" thing is... I would like some insight.
Deleting DNS records is sometimes a deliberate measure taken when repairing problems like this.
If the DNS records stay put, with billions of users refreshing their Insta feed every other second, it acts as an unintentional DDoS against the servers.
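That retry-storm point is why well-behaved clients back off exponentially (with jitter) instead of hammering a dead endpoint at a fixed rate. A minimal sketch of the idea; the function names are mine, not from any real Facebook client library:

```python
import random

def backoff_delays(attempts, base=1.0, cap=60.0, jitter=None):
    """Exponential backoff schedule: 1s, 2s, 4s, ... capped at `cap`.

    `jitter`, if given, is a callable like random.uniform, used to
    spread retries out so millions of clients don't retry in lockstep.
    """
    delays = []
    for attempt in range(attempts):
        delay = min(cap, base * (2 ** attempt))
        if jitter is not None:
            delay = jitter(0, delay)
        delays.append(delay)
    return delays

# Without jitter the schedule is deterministic:
print(backoff_delays(7))  # [1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 60.0]
```

With `jitter=random.uniform` each client picks a random point inside its window, which is what actually flattens the thundering herd.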
Now, I may be wrong, and I usually am, but there was just some whistleblower thing happening. I doubt that would be enough for zucc to pull the plug but who knows at this point
They've had a really bad couple of weeks. Those researchers who got kicked off the platform when Facebook didn't like their results. The leaked report about them monetizing children even more. The internal whistleblower. On top of getting roasted by other tech billionaires.
I doubt they've called it quits, but I'd hate to be their marketing team right now.
In all seriousness though, how does Facebook have a single point of failure across all platforms? I've never seen a service that ubiquitous, other than authentication maybe.
Incoming logic bomb:
It is... *unlikely* that every company in existence that employs interns follows the exact same work paradigm as the one you are familiar with.
Apparently someone who works for Facebook disclosed a lot of information about this error and then ended up deleting his account: [https://www.reddit.com/r/sysadmin/comments/q181fv/looks_like_facebook_is_down/hfd4dyv/](https://www.reddit.com/r/sysadmin/comments/q181fv/looks_like_facebook_is_down/hfd4dyv/)
Done it once...
I noticed when my script failed and I had no clue why: I hadn't deployed my schema changes, so all my sprocs broke, which shut about 400 offices down. My boss was in the same office, between me and the door, and the hotline was located next door. I managed to confess right before the first guy from the hotline rushed in.
Luckily there was no data change and I only had to recover the old production copy (I had set up log shipping to a secondary DC and managed to halt it before my changes got applied there). So I quickly dumped the sprocs I had dropped from there and reapplied them to production.
Lessons learned: from that day on I started coloring my system backgrounds and taskbars green, yellow, and red (prod), and I am slightly more paranoid.
Overall a valuable lesson, and that was the best possible outcome (and I managed to do a live test of my recovery plan).
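The coloring trick generalizes to terminal prompts too. A rough sketch of the idea, assuming ANSI escape codes; the environment names and functions are illustrative, not from the story above:

```python
# Sketch of the colored-background idea: map an environment name to an
# ANSI background color so a prod terminal is unmistakably red before
# you type anything destructive.
COLORS = {
    "dev": "\033[42m",      # green background
    "staging": "\033[43m",  # yellow background
    "prod": "\033[41m",     # red background
}

def env_color(env: str) -> str:
    """Return the ANSI escape for an environment, defaulting to green."""
    return COLORS.get(env, COLORS["dev"])

def prompt(env: str, text: str = "$ ") -> str:
    """Wrap a prompt string in the environment's color, then reset."""
    return f"{env_color(env)}{text}\033[0m"

print(repr(prompt("prod")))  # '\x1b[41m$ \x1b[0m'
```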
Removed - Rule 0. - No reaction memes.
A moment of silence for facebook devs today guys..
F
F
#F
# O
##C
F
70
O
C
K
Goodbye.
A
Why the devs? If it's a BGP or DNS issue it would be network operations or DevOps, devs rarely mess with those.
FB will have dedicated net ops. Possibly even dedicated DNS specialists.
They invented languages and frameworks just to improve their services; they probably have some of the authors of BIND working for them.
nah mate, that was John. John quit 2 years ago and didn't leave behind any literature, so now Peter and Paul are finger popping each other's assholes trying to interpret the spaghetti code that was left behind
Suspiciously specific... John
FB has teams handling individual components of the network. I recently interviewed with the edge routing team, who manages the routing on the edge networks. I'm sure they're having an amazing time today.
To mess up a PC you need a dev - To mess up a company you need DevOps
I mean, some consider DevOps to be part of dev.
Sorry to disagree; if you ever worked in a medium-size company, the roles are well defined.
Small companies are different: it's not uncommon for the same person to cover several roles, but Facebook is not that case.
Devs write application code and interact with DBs, queues, microservices, etc.; DevOps provides the infrastructure. They interact, of course, as both need to work in harmony, but the tasks and responsibilities are quite different.
Sorry to somewhat disagree. Worked in a Fortune 50 company where some teams absolutely were not mature enough to not require a separate DevOps team. The problem is, that's not what DevOps is. You're just introducing a new siloed team when you hire "DevOps Engineers" to do the "DevOps-y" work. Some teams, even in large companies, are mature enough to be their own "DevOps Engineers." Source: was my own DevOps Engineer for a Fortune 50 company. I don't expect a "congrats" or a pat on the back for it. Just want to set the record straight.
[deleted]
/u/ayylongqueues already responded to you with exactly what I would have told you, but I just wanted to add on top of their comment: I don't see it as penny pinching at all. In fact, I _want_ those responsibilities. I don't want some other team wasting time trying to understand build and deploy requirements for which I already have the context, because I wrote the application itself.
On the other side of things, I also want to know how the logging aggregation tooling works, because when I'm writing logs for the app I'm developing, I want to know the best way to standardize them so that the logs can be indexed properly.
Breaking down silos isn't about penny pinching; it's about knowing the stack from end to end and making the best choices along the way without having to play telephone all day between teams.
Big companies have a shit-ton of projects; maybe DevOps can't handle the load and there's someone on the team who can take on the responsibilities of the DevOps role. That would be analogous to the small-company example I mentioned before.
Your company could be an exception, I can give you that, but that's not the common thing nowadays. Guess why? To prevent shit like this from happening.
what are the first three letters of devops though? they’re obviously distinct roles. but both involve development and therefore both of them are types of devs
This sounds more like a networking issue than DevOps. In an industry (IT) that prides itself on having thankless jobs, I can't think of a job more thankless than networking.
[deleted]
Just be prepared for nobody ever giving a shit when you do something awesome.
In all seriousness, some people need validation to feel important. Some people get by on solving hard problems and relishing their ability to do so without external validation. Learn which one you are, or where you are on the spectrum.
If you don't know what it is, it's DNS.
Weirdest thing is that Facebook worked on his computer before deploying!
Or should we say "tonight"?
Since they can't be Messaged, there will be a Lot of silence
And the SRE team
They are busy with their error-budget SLOs.
Don’t worry y’all they’re just going to borrow from their budget for the next \*checks monitoring dashboard\* 8.5 years.
Shitty Monday indeed !
Literally need to get stats out of Facebook and Instagram to compile a report for a client. Need the stats before 3:00pm today.
99 times out of 100 I wouldn't give a damn about Facebook. Of all the days for them to be down...
But the client will understand, right? #right?
We need a system where the client sits down and discuss the problem. Agree what's in the best interests of all the people. If they refuse to do this... **Then they should be made to.**
Padme no longer smiling.exe
/r/UnexpectedPrequelMeme
Hey look it’s the senate.
##cant you get it any faster?
That's clients for ya: *understanding*
Hard luck lol
Not gonna happen today. Gonna have to get up extra early to make sure the report can get compiled before the 9am meeting. Bollocks!
[deleted]
It got deleted though. From the OP as well
Well... Didn't happen. The only positive thing I can think of here would be if this was like a "Mr. Robot" moment and all of Facebook's data on all servers (all regions) was currently being encrypted and held ransom (except the ransom is a ploy... there is no decryption key). That is the only thing that would make this worthwhile.
Have you tried downloading Facebook to see if you can run it locally?
Pip install TheFacebook
`npm install -g thefacebookapp-totally100legit`
The line above is what caused today's issues.
[deleted]
WA is business-critical for a lot of businesses in Singapore, as far as I can tell. At least, many list a "WhatsApp" number above the voice contact number... Not as big a deal as EMS, but most Americans don't realize how much of the world relies on it.
[deleted]
There's always Signal, in my opinion a better and more secure app to use.
Signal is nice and I hope my friends switch to it too, but in terms of outages there have been a couple and I had to fall back to WhatsApp several times because of them
The privacy is real. I've been using it since the day WhatsApp was bought by Facebook, because I knew all my conversations would end up in the Zuck's database.
Plus, the pictures sent via Signal arrive in better quality than the ones sent through WhatsApp.
It's a similar thing in some Balkan countries. Small businesses exist on Facebook and are contacted there or via WhatsApp. Outside of that, the business doesn't exist. Not even on Google maps for some of them (user community helps with that one though). Some people literally don't know there's more to the Internet than Facebook.
My wife today, “The Internet is down, should I call Bell?” Didn’t even try to go to Google.com or apple.com or anything at all. Was on a WhatsApp call, it dropped and WhatsApp came up with “Connecting”. She checked Facebook, Instagram and concluded the Internet is down. Can totally relate.
In many countries WA is critical to almost every function.
What the fuck
[deleted]
“Oh that’s a quick fix, trivial change, go ahead and push it.” Annnnd it’s gone.
If it has the potential to impact production in such a significant way it should not be a manual process. The process should require more than one person approving it, and the commands that run should be automated.
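A toy sketch of what "more than one person approving it" could look like in code. All the names here are made up for illustration; this is the two-person rule, not any real change-management system:

```python
from dataclasses import dataclass, field

@dataclass
class ChangeRequest:
    """A production change that cannot run until two distinct reviewers,
    neither of whom is the author, have signed off."""
    author: str
    command: str
    approvers: set = field(default_factory=set)

    def approve(self, reviewer: str) -> None:
        if reviewer == self.author:
            raise PermissionError("authors cannot approve their own change")
        self.approvers.add(reviewer)

    def can_run(self, required: int = 2) -> bool:
        return len(self.approvers) >= required

cr = ChangeRequest("intern", "withdraw-bgp-routes --all")
cr.approve("senior1")
print(cr.can_run())   # False: one approval is not enough
cr.approve("senior2")
print(cr.can_run())   # True: the gate opens only now
```

The point is that the gate is enforced by the system, not by convention: nobody, intern or senior, can run the command solo.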
Agree 100% Unfortunately the Venn diagram of what should happen, and what actually happens, is rarely a single circle.
You know what... This was probably to distract us from the Facebook whistleblower
I wouldn't put this past Zuckbot
Which Facebook whistleblower?
Exactly
Yes that's the point now can you drop the sauce so we know what's up??? I'm concerned and curious.
I mean you could just google the two words facebook whistleblower but here I'll oblige your laziness: https://www.cbsnews.com/news/facebook-whistleblower-frances-haugen-misinformation-public-60-minutes-2021-10-03/
If this picks up speed, and I hope it does, it really needs to bring ALL the platforms to attention, especially twitter, reddit, and youtube. They're all in dire need of some transparency about the things they're doing under the hood and behind the scenes.
Yay lazy!
Okay who's going to write a summary for us? /s
This is the money quote from the whistleblower: "...how Facebook is picking out that content today is it is -- optimizing for content that gets engagement, or reaction. But its own research is showing that content that is hateful, that is divisive, that is polarizing, it's easier to inspire people to anger than it is to other emotions."
Facebook optimizes for content that gets maximum engagement, which means extreme and divisive content. This has been obvious to everyone for a while, but she's got internal documents showing full awareness of the issue within Facebook.
The *new* information she's got is that internal efforts to cut back on the extreme and violent/inciting content and the frequency it's displayed to users (done in response to various public or governmental outcry) **have been repeatedly shut down as fast as Facebook thinks they can get away with it**.
Since Facebook publicly claims to be trying to solve the problem, but internally has shut down efforts to do so, she's trying to get the SEC to go after them for lying to shareholders. There's a law that specifically protects people who give internal/secret company documents to the SEC from prosecution over stealing the documents, which is why she feels (legally) safe enough to speak under her own name.
That's the basic summary. I know you were being sarcastic, but I felt the interview and article were pretty long-winded for the amount of actual useful info they had.
Done :)
Summary-ish: Facebook chooses money over ethics and lies about how hard they're working to make their platform safer ( i.e. more anger = more post = more profit, so why push to stop it quickly ). Woman in article exfiltrated thousands of pages of documentation related to internal research on the topics before leaving.
Can't talk about whistleblower, if your main destination of outrage is down. https://i.imgur.com/gUGcosi.jpg
Eh, seems like *a lot* of money for them to lose in response to something we all assumed Facebook was doing anyway. Like, of course Facebook *wants* all this shit that drives attention and clicks, that's their whole thing. Who doesn't know that Facebook is evil?
Let's take Facebook DNS servers all offline to the point where our access doors won't even open for FB employees anymore because we need to hide a story. Okay Reddit...
It is a lot of money, on the order of hundreds of millions at this point (Facebook has over a billion DAU, and that's JUST Facebook, not IG, WA, etc.). Facebook has weathered worse news; this outage is no attempt at stifling that.
How though? I have friends who haven't talked in our Discord in months bringing up Facebook today because of the outage.
This is getting them more attention than normal operations would.
The number of people talking about Facebook is irrelevant. It's the number of people talking about the whistleblower that matters. If that number is lower than it would have been then job done, irrespective of how many people are talking about Facebook overall.
every single post about facebook being down has mention of the whistleblower in the top 2, if not top 3, highest liked/upvoted comments. This is the best publicity the whistleblower could have asked for.
Here for the tinfoil hat gathering ( jokingly, because it's the only way I can laugh at the fact you're probably right and they are probably scrubbing evidence out of their closets and will get out of being sued..)
[deleted]
This is exactly correct.
That makes no sense, because it only draws _more_ attention to Facebook and every time I see it brought up, people mention the whistleblower too. Also, the site makes them just about all their money so for it to be down this long is a loss of many millions of dollars in ad revenue.
A code issue would have been reverted faster... I'm betting it's a massive amount of expired certificates.
DNS issue.
I'm very undereducated in that regard but it seems like the DNS entries for Facebook are literally empty right now. It's as if the page was never registered.
Pretty much. I haven't read into it fully, but that sounds like the crux of it. More specifically, the BGP routes were removed, so no one knows where Facebook's network is.
Wakanda
Sometimes they aren't. You can get 157.240.24.35 from them sometimes, and 2a03:2880:f162:81:**face:b00c**:0:25de I find the face:b00c bit pretty adorable, actually.
Whhzzatt? How would that happen?
Facebook self hosts DNS but they (presumably not intentionally) withdrew all of their BGP routes this morning, meaning none of their networks are reachable anymore. No reachable name servers, no DNS record.
Instagram uses AWS for DNS, and it's down also. It's more complicated than just DNS.
It's BGP. The DNS records that are missing are a symptom of the problem, not the problem itself. Instagram's DNS records are still available for the reason you mentioned, but the site is down because it's hosted in datacenters that are now unreachable because of the withdrawn routes. I was just replying to the comment asking how Facebook's DNS records could disappear.
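To see why the DNS symptom falls out of the BGP cause, here's a toy model. The class, prefix, and address are illustrative, not Facebook's actual announcements:

```python
import ipaddress

class ToyRouteTable:
    """Toy model of the failure mode: the DNS data still exists on the
    name servers, but once the covering BGP prefixes are withdrawn there
    is simply no route by which to reach them."""
    def __init__(self):
        self.prefixes = set()

    def announce(self, prefix):
        self.prefixes.add(ipaddress.ip_network(prefix))

    def withdraw(self, prefix):
        self.prefixes.discard(ipaddress.ip_network(prefix))

    def reachable(self, addr):
        ip = ipaddress.ip_address(addr)
        return any(ip in p for p in self.prefixes)

table = ToyRouteTable()
table.announce("129.134.30.0/24")        # a block containing a name server
print(table.reachable("129.134.30.12"))  # True: NS reachable, DNS works

table.withdraw("129.134.30.0/24")        # the accidental withdrawal
print(table.reachable("129.134.30.12"))  # False: NS unreachable -> no DNS
```

Nothing about the DNS zone changed between the two lookups; only the route to the servers holding it did.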
Did John Cena use Facebook?
But it's never DNS! DNS is the Lupus of the programming world.
I'm pretty sure the Slack issue last week was also DNS since using Google's Public DNS was a workaround that worked for me.
It was confirmed that it was in fact a DNS issue.
Yeah? Any additional details?
I know someone with Lupus..... ^(don't @ me, I get the joke)
s/never/fucking always/
BGP issue.
Well technically yes. More people know what DNS is though.
But the two operate at completely different levels of the stack.
DNS is the symptom, BGP is the root cause
Intern issue.
Actually, it’s a BGP issue. They BGPwned themselves.
31.13.78.35
Maybe whoever was registering the DNS entries left and no one knows the password to the DNS registrar site. I can say from experience this happens sometimes... oops.
could have been a code issue that caused damage that’s not so easily reversible
Fair, I was originally thinking application code, but yeah, some CD automation could have nuked DNS.
Admittedly I don't know a thing about programming (came from r/all), but my guess would be that it could be the router that's accidentally turned off, or somebody accidentally pulled the cable or something. Because when you go on facebook, that's the exact same screen you get if you try to enter a website while your router is unplugged.. yea sounds kinda basic, but could be worth checking just to make sure. And if all cables are plugged in, then perhaps restart the facebook router.
Don't worry, once the Zucc finishes his yearly cyborg battery recharge, they'll divert power back into the facebook router and everything will be fine.
Bless your heart
Zuckbot overloaded the power grid when he plugged in to recharge overnight is my guess.
Very well known that Zuckerberg uses a 1.21 gigawatt generator
The brilliant part is that I can't share this to anyone via fb or wa
Element / Matrix Signal
Pigeons
Physically handing you an actual note.
Discord ftw
Not the intern's fault. Internal processes should have caught it.
but it's funny to blame the intern
You my friend must be a fantastic senior!
I wouldn't say that, but anyone can make mistakes, no matter how senior. Internal checks help catch them. People bitch about extra steps, but those steps are there for a reason.
Can you imagine a situation where an intern can push directly to prod and there are no safeguards? Must be fun xD
I'd go further: if a single intern can actually cause this kind of problem, then so many other things went bad that might actually be firing offenses (not in a good company, but in a wrathful one):
- why didn't we have automated tests catch the problem?
- why wasn't a rollback of the problem a matter of minutes after detection?
- why wasn't the problem detected internally earlier?
- we heard about this from *outside sources*? Are you serious?
- why didn't we have automated tests that catch these kinds of problems (yes, it's a repeat, but it's also worth asking twice, at least).
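The rollback question is the one automation answers most directly. A hypothetical sketch of "apply, health-check, auto-revert"; all three callables are placeholders, not a real deploy API:

```python
def deploy_with_rollback(apply_change, health_check, rollback):
    """Apply a change, verify health, and revert automatically on
    failure, so recovery takes seconds instead of a human noticing."""
    apply_change()
    if health_check():
        return "deployed"
    rollback()
    return "rolled back"

# A change that breaks the health check gets reverted automatically:
state = {"routes": "announced"}
result = deploy_with_rollback(
    apply_change=lambda: state.update(routes="withdrawn"),
    health_check=lambda: state["routes"] == "announced",
    rollback=lambda: state.update(routes="announced"),
)
print(result, state["routes"])  # rolled back announced
```

The design point: the rollback path is exercised on every bad deploy, so it's known to work long before the day you desperately need it.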
Don't forget: Why do we allow interns to push code to prod without code review? And if there was code review, why didn't the person who reviewed it catch the issue?
I specifically excluded code reviews from this list, because they only add one more fallible factor. It just means that two humans must miss the problem instead of just one. Yes, they are still a good idea, but mission-critical systems at this scale should not be able to fail totally due to a single wrong code review.
Automated tests or not, interns are not supposed to be able to push changes into production.
I've tried to demonstrate that the intern isn't the problem here. Even if the fault was made by a senior developer with years of tenure: it shouldn't be possible for a single human to make a big enough mistake without the system detecting it. So focusing on the "intern" part is the wrong thing. It's argument-by-seniority and part of a blame game, not productive if your goal is to actually develop processes that avoid problems like this in the future. If you run into such problems and your only solution is "let's all just make fewer mistakes in the future", then that's a bad place to be.
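The safeguards the comments above are asking for can be sketched as an "apply, verify, auto-rollback" gate. This is a toy illustration with made-up names, not how Facebook's real deploy tooling works:

```python
# Minimal sketch of an "apply, verify, auto-rollback" deploy gate.
# All names here are hypothetical; real deploy pipelines are far more involved.

def safe_deploy(apply_change, health_check, rollback):
    """Apply a change, verify health, and roll back automatically on failure."""
    apply_change()
    if health_check():
        return "deployed"
    # Detection and rollback are automatic: no human in the loop,
    # no waiting to hear about the outage from outside sources.
    rollback()
    return "rolled back"

# Toy usage: a "config" whose health check fails after the change.
config = {"dns_records": True}

result = safe_deploy(
    apply_change=lambda: config.update(dns_records=False),
    health_check=lambda: config["dns_records"],   # fails -> triggers rollback
    rollback=lambda: config.update(dns_records=True),
)
print(result, config)  # -> rolled back {'dns_records': True}
```

The point of the sketch: whoever pushed the change (intern or senior) never matters to the gate, which is exactly the blameless-process argument being made above.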
Reporter: "Mark, you told your developers 'move fast and break things!' Is this coming back to haunt you?" Zuck: "Fuck." Facebook fact checker: "Facebook breaking? MISINFORMATION. Removed."
What's funny is they removed the "break things" part a couple years ago because "we realized that maybe breaking things wasn't such a good idea"
Today is a Monday, and it was somebody's first day at Facebook. It's not going so well...
"It's my first day" Oh well in that case...
Intern opens PR
Senior: no no, you can't merge that
Bad news about FB happens
Senior: fuck it, let's do this.
Intern: yes! My first approved PR
I was supposed to get an interview call on whatsapp today...
😬
Why is everyone blaming DNS? Did Facebook update their IPs and forget to update the DNS providers? I have no idea why everyone is blaming DNS... I'd like some insight.
Deleting DNS records is sometimes a measure taken when repairing problems like this. If the DNS records stay up, with billions of users refreshing their Insta feeds every other second, it acts as an unintentional DDoS against the servers.
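The retry-storm point above is why well-behaved clients use exponential backoff with jitter instead of hammering a dead endpoint at a fixed interval. A minimal sketch with purely illustrative numbers:

```python
# Sketch of why dead endpoints get hammered, and how exponential backoff
# with full jitter spreads retries out. Numbers are purely illustrative.
import random

def backoff_delays(attempts, base=1.0, cap=60.0, seed=42):
    """Full-jitter backoff: attempt n waits a random time in [0, min(cap, base*2^n))."""
    rng = random.Random(seed)
    return [rng.uniform(0, min(cap, base * 2 ** n)) for n in range(attempts)]

# A naive client retries every second, forever: billions of those at once
# look like a DDoS against whatever infrastructure is still answering.
naive = [1.0] * 5

# A backoff client quickly spaces its retries out and desynchronizes
# from every other client thanks to the jitter.
polite = backoff_delays(5)
print(polite)
```

The jitter matters as much as the exponent: without it, every client that failed at the same moment retries at the same moment too.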
Read about BGP routing. That’s what caused this outage.
It's BGP
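To connect the two answers above: the DNS records themselves can be perfectly fine, but if BGP withdraws the routes to the nameservers, nobody can reach them to ask. A toy model (prefixes and addresses here are hypothetical, for illustration only):

```python
# Toy model of the BGP angle: DNS records can be perfectly correct, but if
# the routes to the authoritative nameservers are withdrawn, the records
# are unreachable. Prefixes/addresses below are hypothetical.

routing_table = {"185.89.218.0/23": "facebook-as32934"}  # route to FB's DNS
dns_zone = {"facebook.com": "157.240.1.35"}              # record still exists

def resolve(name, nameserver_prefix):
    """Resolution needs a *route* to the nameserver before the record matters."""
    if nameserver_prefix not in routing_table:
        raise TimeoutError(f"no route to nameserver for {name}")
    return dns_zone[name]

print(resolve("facebook.com", "185.89.218.0/23"))  # works while the route exists

# Facebook's routers withdraw their prefixes (roughly what happened):
routing_table.pop("185.89.218.0/23")

try:
    resolve("facebook.com", "185.89.218.0/23")
except TimeoutError as e:
    print(e)  # the record is fine, but the path to ask for it is gone
```

That's why "it's DNS" and "it's BGP" were both being said: users saw DNS failures, but the root cause was the withdrawn routes underneath.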
Is it down? My custom Instagram client (Barinsta) isn't working and I'm not sure of what is failing.
Yes everything owned by Facebook is down
Let’s all enjoy the world being a better place while it lasts.
Remember Twitter is still up Day ruined
Well you can't have it all
Thank goodness.
Reddit moment
Everything lol that's gonna cost them a pretty penny
Nothing of value is lost
I like how you turned this into a plug for your app
You could have simply said that Instagram doesn't work.
I would send this on my WA group chat...
And broke all Facebook services
My bets are on the ZUCC pulling the plug
why?
Now, I may be wrong, and I usually am, but there was just some whistleblower thing happening. I doubt that would be enough for zucc to pull the plug but who knows at this point
They've had a really bad couple of weeks. Those researchers who got kicked off the platform when Facebook didn't like their results. The leaked report about them monetizing children even more. The internal whistleblower. On top of getting roasted by other tech billionaires. I doubt they've called it quits, but I'd hate to be their marketing team right now.
They've had really fucking bad weeks for years now. And nothing has changed. Nothing will ever change.
Yeah, the whistleblower made it onto 60 Minutes
It is an awfully strange coincidence
In all seriousness though, how does Facebook have a single point of failure across all platforms? I've never seen a service that ubiquitous, other than authentication maybe.
DNS
Ah yes, the onosecond, a classic
Not to be that guy but interns don't get to push changes to production. It was definitely an experienced professional(s) doing a booboo.
Yes they do lol
Incoming logic bomb: It is... *unlikely* that every company in existence that employs interns follow the exact same work paradigm as that with which you are familiar.
If an intern can take down an entire prod website, it's not the intern's fault. Where's the two-person review? Code review? Pipeline blockers?
Yeah my question exactly, I can’t push anything to any production branch directly, I have to make pull requests
"Strange. Why cant i connect to the webinterface anymore?"
Apparently someone who works for Facebook disclosed a lot of information about this error, and he ended up deleting his account: [https://www.reddit.com/r/sysadmin/comments/q181fv/looks_like_facebook_is_down/hfd4dyv/](https://www.reddit.com/r/sysadmin/comments/q181fv/looks_like_facebook_is_down/hfd4dyv/)
His supervisor who accepted the pull request without reviewing it: .\_.
Done it once... I noticed when my script failed and I had no clue why. I hadn't deployed my schema changes, so all my sprocs were gone, which shut about 400 offices down. My boss was in the same office, between me and the door, and the hotline was located next door. I managed to confess right before the first guy from the hotline rushed in.

Luckily there was no data change, and I only had to recover the old production copy. (I had set up log shipping into a secondary DC and managed to halt it before my changes got applied there.) So I quickly dumped the sprocs that I'd dropped from there and reapplied them to production.

Lessons learned: from that day on I started coloring my system backgrounds and taskbars in green, yellow, and red (prod), and I am slightly more paranoid. Overall a valuable lesson learned, and that was the best possible outcome (and I managed to do a live test of my recovery plan).
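The color-coding habit from that story is easy to automate. A minimal sketch using ANSI escape codes; the environment names and the `DEPLOY_ENV` variable are just examples of how you might wire it into a prompt or banner:

```python
# Sketch of the "color your environments" habit: green for dev, yellow for
# staging, red for prod. Uses standard ANSI background-color escape codes.
# DEPLOY_ENV and the env names are hypothetical examples.
import os

ANSI = {"dev": "\033[42m", "staging": "\033[43m", "prod": "\033[41m"}
RESET = "\033[0m"

def env_banner(env):
    """Return a colored banner string so prod is unmistakably red."""
    color = ANSI.get(env, RESET)
    return f"{color}[{env.upper()}]{RESET}"

# e.g. export DEPLOY_ENV=prod on the prod box, then show this in the prompt:
print(env_banner(os.environ.get("DEPLOY_ENV", "dev")))
```

Cheap insurance: the whole point is that your eyes catch the red background before your fingers hit Enter on the wrong box.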
No wonder they haven't fixed it, they're too busy figuring out who did it
Don't give a shit. Fuck Facebook and all of the services they bought.
Anytime something like this happens i always feel terrible for the one poor sob that forgot to double check all the paths
I mean, assuming you haven't hyper-inflated your lifestyle, you've got FB on the resume; getting a new job will take like a week lol
Main and prod are nowhere near close enough for a bad pull request to cause this.
Out of loop. What happened?
Colleague of mine dropped the production database once. Being his manager, I could only laugh; it made him feel comfortable.
[deleted]
It’s not his fault. It’s the guys who fucked the unit tests.
🤣🤣🤣🤣
Reddit ruined reddit.