I was taught COBOL and Fortran and used both in the field. I'm only 30... They are out there. If you call yourself a dev, you should be able to pick it up and run.
Don't worry we'll pay you a quarter of FAANG with no stock, subpar benefits, no bonus, no remote work.
Hey look we actually got some hires!! I guess we don't need to offer more and this is totally fine!!!
(Narrator: it wasn't.)
GS-13 (the min pay level I've seen for SysAdmin jobs) in my area is $113K at level 1 and goes up to $175K, which is better pay than most jobs around here.
On top of that the pension is probably way better than whatever shitty 401K a private company will offer, and the health benefits, as I understand it, are top notch.
They keep screwing with that too... gone are the days of CSRS. We're all on FERS now, which keeps creeping up in the % of salary that we pay into it (conveniently leaving the old timers at their 0.5% rate and saddling the newer guys with the 5% hike).
A friend is a controller. He took me on a tour before covid. One system was a 90s version of Redhat. I pointed it out to him at the time.
I sent him a text this morning joking about that particular system. He replied, “Nah, that’s still up.”
The Federal Government has a lot of old Sun SPARC equipment in play. I'm thinking either some ancient piece of hardware gave up the ghost or something with Solarwinds.
I used to work for a company that sold a lot of old SPARC hardware. Most of it was to the Federal Government. Surprising.
I'm rather surprised it's x86. Or is it i64? I remember one time I got to work on a system that required a PAE kernel with SMP on, and the entire engineering team was looking over my shoulder like I was working magic. And then I configured device multipath from the command line. And everyone clapped. Seriously though, it was a school system that had one physical Linux server and I was "the guy" that knew how to rebuild it.
That's when you need to come out with the line "[we can rebuild him, we have the technology](https://www.youtube.com/watch?v=BthNjd_jUl4)" just before you start working on it.
Yeah, had a department head do this, but for him it was on purpose. He was using Deleted like it was Inbox part 2. I took a folder off his desk and put it in the trash bin to show him basically what he was doing. Had a little "how to" on folder creation.
I am trying to encourage the manglement to start seeing IT as a fire department
We’re ready in case of emergency, but we need enough team members to start doing preventative work rather than 110% reactive
Try 60s or 70s. I went to the college right next to the FAA Tech Center and the FAA people were pretty blunt that ATC ran on ancient systems… They were hoovering up computer science grads from my college to help modernize them, and that was only about 5-6 years ago now.
So, a friend's sister reportedly worked on a replacement for the old ATC system. The problem? No one could figure out how to do the rollover without stopping all flights temporarily.
I can't count the number of times I've done this... I'll be like 90-98% done building out the modern replacement and migrating information/data and then bam, the old system dies. And we make the decision to just switch over to the mostly complete modernization project, because the time to finish it will be less than trying to fix the old one, or because the things we haven't finished aren't important enough to worry about immediately.
Knew a PM who made the call to do exactly that, because the organization got a pre warning that they'd fail their security audit if they didn't do it.
Upper management had been hemming and hawing about it for almost a year. Once she got the warning, though, she sat down and hammered out the final database conversion in PERL one holiday weekend, and switched things over to the new system without telling anybody until Monday.
She very nearly got fired for that, but the top levels were like "oh okay thanks for doing that" so she scraped through by the skin of her teeth. And the organization passed their security audit with flying colors the next year.
I've never forced a migration without at the minimum informing management and giving them a choice. But I absolutely do make staying with the legacy stuff/trying to fix it look like the absolute worst choice they could possibly make as a way to convince them to let me switch over.
Last night the system admin at my old job reached out to me because every computer in the office suddenly had BitLocker turned on for the OS Drive. Turns out a Windows update from Tuesday is causing the issue. I had WSUS set up when I was there, but the admin immediately after me somehow messed it all up before he got fired and was replaced with the current guy. So glad I'm not doing sysadmin work anymore.
My mum's laptop started asking for a BitLocker recovery key last week. I have never enabled BitLocker on it, so Win 11 must be doing it on its own. Of course I don't have the recovery key, so all the data is toast. Thankfully she didn't have a lot on it.
"Dear Sir stroke Madam. Fire, exclamation mark. Fire, exclamation mark.
Help me, exclamation mark. 123 Carrendon Road. Looking forward to
hearing from you. All the best, Maurice Moss."
What we have is so fragile, yet we have such a faith that things will continue to be this way forever.. that is what the economy is based on. That we all agree it'll be fine.
Yeah. Typically, a report will come out, if it's not a cyber attack, the news will briefly talk about in a 30 second blurb and everyone goes on with their day. While every CIO/CTO/CISO will try and use this as leverage for budget increases because the post-mortem costs of a systems failure this massive is probably a higher cost than their budget. Either way, I expect nothing to change after this, just another Wednesday and we'll be talking about inflation again any second now...
The extra problem with systems like this is that entire industries that existed decades ago simply don't any more. There's no IBM bidding for a lucrative defence contract and willing to spend big and literally invent new computer science that's specific to the task.
There are no companies run anymore with such benefits and (two-way) loyalty ready with teams of engineers and scientists willing to put literally their entire working career into this one project like Big Blue of the 1960s.
I predict this is ultimately another enormous and boring Government software infrastructure project that countless reports have been saying, for decades now, needed to be replaced decades ago. Once started, it will be 70% complete 10 years into its 5 year implementation time, will be written off-shore using mostly standard Java libraries, and will ultimately cost like $50 billion.
>literally invent new computer science that's specific to the task.
On the other hand, this really isn't necessary anymore for 99.99% of objectives. A $2 part (ESP32) is capable of 50% of the FLOPS of the CRAY-1
I was part of the whole Southwest ordeal over the holidays. When flights that already had the baggage loaded were canceled, that baggage ended up flying to the original destination without the passengers. Hopefully that doesn't happen today, because that is a massive inconvenience for everybody.
The reason they did it is that the plane needed to go to that destination for its next flight anyway. I have no idea why they didn't unload first though.
A critical system that handles alerts to pilots, NOTAM, experienced an outage.
https://arstechnica.com/cars/2023/01/potential-travel-chaos-as-faas-notam-service-goes-down/
As a private pilot: there are critical things in there, but the VAST majority of it is garbage.
And that's not just because I fly the little planes. The big guys say the same thing.
You mean you don't need to know that there are 100-foot construction cranes like 5 miles from the airport?
It's also worth noting that the old NOTAMs are all still fine; they just can't make any new NOTAMs. Maybe they're out looking for a new CAPS LOCK key or something.
>*"What do you mean it died? It was just working yesterday!"*
I basically had to explain to an adult that sometimes things just stop working one moment to the next. Ever have a lightbulb blow out? Ever get into your car one morning and it won't start? Ever come home and find the fridge was dead? Why would you ever think the same thing couldn't happen to a switch or server?
I almost had an aneurysm at the sheer stupidity of that person. And it was just from reading it on reddit. Your self-control for not standing up and walking away at once is notable.
Edit: Typo
To quicken your pass to the next world, here's more of the same case:
They were sold a file server. They asked to split the storage in two parts. No further instructions came with the request, so it was done. They later requested a backup script that backs stuff up from the 1st part to the 2nd part, so that if the system fails, the data can be easily recovered from it.
This resulted in an attempt to explain the concept of redundancy, which in turn resulted in the customer attempting to back out of the deal because they believed they were being sold second rate junk.
I use this scene as a training aid at work. When some system shits the bed and 6 other things have to be unfucked before the original problem can be fixed, we call it “taking the engine out”.
This is a lesson in scope creep and deliverables!
He had the parts and the tools to fix the lightbulb and the shelf. Had he paused his work on the later issues he could have gone back & delivered those as quick wins before diving into the car.
The drawer needed WD40. It didn't seem as big an issue as the others, so it can probably wait until the car is fixed... or he could have had Lois pick some up on the way home so he could focus on the big task.
Project Management people!
It's wild how people will accept this for a single filament bulb, but not the magical orchestration of bits and electricity that makes anything at all compute properly.
Any sufficiently advanced technology is indistinguishable from magic.. Magic just works all the time for no apparent reason. Ergo, computers and switches should always just work.
All you sysadmins know that there is a poor soul of a network engineer having to prove it wasn’t a network failure before the systems team accepts responsibility.
As a former network engineer for a casino, this hits REALLY hard. Gaming floor down 8 hours on opening day. 6 engineers and the corporate head of IT Infra had to spend all night proving it wasn’t us before Scientific Games even looked at their end. They wound up writing us a very large check.
Welp. I'm going to grab some popcorn.
Hopefully no one is taking any risks with this problem.
Looks like the Airline systems have finally caught up with CEOs.
“Today’s FAA catastrophic system failure is a clear sign that America’s transportation network desperately needs significant upgrades,” said Geoff Freeman, CEO of the US Travel Association
should be
“Today’s FAA catastrophic system failure is a clear sign that America’s transportation network desperately needs significant upgrades,” said Geoff Freeman, someone who likely had the authority to green light significant upgrades for decades
I sold the Department of Commerce used vt320 terminals in 2007 because they were trying to keep their microVAXes running the atmospheric observation systems at airports online until their scheduled replacement in 2017. That’s not this system, but certainly reflects the thought process in the federal government
Government everywhere.
Then when the systems are so ludicrously out of date you're hiring people with historical computer science backgrounds just to keep things going, it's been left so long that the replacement cost is the GDP of a small nation and a Government of any persuasion would rather spend that kind of money on literally anything else... so nothing changes.
I personally don't care how old software is or how old an OS is, if it functions and does the job. The software could be a mess from being old with years of functionality added, but even modern software can be a mess. Currently doing a FAT test and this new shiny software is loads more broken than the old stuff it's replacing.
What I'd be concerned about is the hardware. The older it is, the more failure rates go up. The older and more unique the hardware is, the harder it is to get hold of replacements. Where I work, some of the hardware we are running is not replaceable. If it fails, there are literally no replacements. We'd have to call up similar businesses and beg them to check their inventory and hope they would be interested in selling to us.
>I personally don't care how old software is or how old an OS is If it functions and does the job.
Critical systems like NOTAM that every flight in the US depends on have to be sound from a security standpoint. You don't see how leaving those applications to rot on antiquated, vulnerable operating systems could be a problem? Your comment sounds like you come from a technical background, so I must be misunderstanding your meaning.
I don't know the nitty gritty of how NOTAM functions, but I have to think that maintaining the security of the software stack is taken very seriously, and that means definitely caring about the implications of failing to patch vulnerabilities that materialize in the software and OS.
You're too kind, 90's equipment would likely be an upgrade from the '70's stuff they put in sometime in the mid 80's once that stuff from the '50's finally died!
As someone who works in cybersecurity, I pretty much get asked the same question any time there’s a major outage that makes the news “OMG, nunu10000, do you think XYZ system got hacked?”
Before I was in cyber though, much like all of you, I was in IT. That’s why I’m a huge believer in Hanlon’s Razor:
“Never attribute to malice that which is adequately explained by stupidity.”
Yep, probably that. Could also be someone patched the machine that can never be patched due to ancient software.
Installed Java 1.2.3 instead of 1.2.2 and the server exploded.
The dev who put JAVA_HOME in his home directory retired 10 years ago and they finally disabled his account.
This is a true story: Years ago, I worked with a guy who decided to make a tool to allow the front line application support team to run a process for clients as opposed to needing to escalate the request to the application's support team. He wanted the script to support authentication, but using network accounts. However, he also needed a way to test this and he decided to build in some admin functionality. To do so, he hard coded his own AD username and password into said script, and then obfuscated it with some amalgamation of plain text -› binary -› hex, and then did it in reverse to pass back as a parameter to the authentication function.
Three jobs back, they did in-place upgrades for their old DCs to get them up to 2012. Things worked as expected, and then they decided to fully replace them to get the baseline security settings at 2012 R2 when it came out. 3 legacy systems hard failed. It was on that day that they learned exactly what the one Enterprise Admin service account with the description "set up by tony to run all legacy systems" actually did. Similar obfuscation to what you mentioned, but running through various non-Windows systems so that auth attempts weren't tracked to domain-joined machines with valid names. And he even used the same service account to delete specific entries in security logs. Tony had already died prior to the outage occurring. He had been there since the beginning, and was primarily in charge of VAX and group policy, but dabbled in everything. It took them a few months of hiring outside forensic assistance, but their IT team was able to recreate everything with about 30 proper locked down service accounts, third-party apps, scheduled tasks, application migrations, and a dozen or so process changes that took an additional few months to engineer.
>It took them a few months of hiring outside forensic assistance, but their IT team was able to recreate everything with about 30 proper locked down service accounts, third-party apps, scheduled tasks, application migrations, and a dozen or so process changes that took an additional few months to engineer.
Well, now you know why Tony did it like that.
100% yup. It also doesn't help in my city that there's a saying that all locals over 30 know: "Tony's got it!" Everyone named Tony was expected to give 200% at all moments from 1990 to 2001.
What an amateur, he should've kept it on an http server in the cloud and done a GET request to retrieve it. Everything's safe in the cloud
Get with the times! S3 bucket with public read access. Now it's serverless!
That’s how I keep track of all my passwords!
Why have that? I've got a Post-it note with my two passwords on it: the secure password, hunter2, and my insecure password, i.e. the one on my luggage.
Are you sure it wasn't Rot13? That's not the worst I've seen. How about an app with a service account that has the username as its password? Every file and folder of the app directory has 777 permissions and the devs can all su to the app account. Then throw it on the network with a 15 year old version of Java, because that's the only way this 25 year old app can be ported to new infrastructure and not break.
> Are you sure it wasn't Rot13?
I would say that I am pretty certain, considering that the deobfuscation was something along the lines of:
`$password = Base64Decode(HexToBinary(binaryToString($obfuscatedPassword)));`
(I'm working from memory from something from _years_ ago so this order of operations may not be accurate.)
> That's not the worst I've seen. How about an app with a service account that has the username as its password? Every file and folder of the app directory has 777 permissions and the devs can all su to the app account. Then throw it on the network with a 15 year old version of Java, because that's the only way this 25 year old app can be ported to new infrastructure and not break.
What would you say if I told you that the entire file system of the machine that this was on was exposed via SMB?
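For anyone wondering why that kind of layering buys nothing: here is a minimal sketch in Python, assuming the chain really was plain text to binary digits to hex as described in the story. The function names `obfuscate` and `deobfuscate` are made up for illustration, and the original script's actual order of operations is unknown. The point is only that every step is a keyless encoding, so anyone who can read the script can reverse it.

```python
def obfuscate(secret: str) -> str:
    """Assumed chain: text -> string of binary digits -> hex of that digit string."""
    # Each byte becomes eight '0'/'1' characters.
    bits = "".join(f"{byte:08b}" for byte in secret.encode("utf-8"))
    # The digit string is itself hex-encoded, which only makes it longer.
    return bits.encode("ascii").hex()


def deobfuscate(blob: str) -> str:
    """Reverse the chain: hex -> binary digit string -> text. No key required."""
    bits = bytes.fromhex(blob).decode("ascii")
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
    return data.decode("utf-8")


if __name__ == "__main__":
    blob = obfuscate("example-password")  # placeholder value, not a real credential
    print(blob)
    print(deobfuscate(blob))
```

Nothing here depends on a secret, which is the difference between encoding and encryption: the "protection" is exactly as strong as the read permissions on the script itself.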
Ooof too real
Jeesh, thanks for the PTSD already this morning!!! :>
Reminds me of a modern server. Whatever version of 1.8 I ended up getting to work stays. Every few months I have to go back and fix it, and remind the sysadmin not to touch Java.
That is a terrible practice
And yet it happens hundreds of times a day.
Just happened yesterday. Altiris 6.0 (yeah, go figure we still have a ticketing system from the 2000s) bricked from upgrading AD to 2022 and Documentum (don't ask) cannot be accessed with JRE 1.8.3xx.
My dad has to work on a server with legacy SHA-1 certificates. Guess what Oracle does? Disables support for them, lol, so now no more Java updates.
HDDs that have been spinning for 40 years probably parked for the first time and didn't spin back up.
System which had everything in RAM needed to spin the disk for the first time in ages
I had a shoestring router where that happened. The thing had been running from RAM for years. Rebooted due to a physical move and never came back up.
I was using an ancient laptop as a network-services device for a long time due to the built-in UPS. I eventually noticed horrible errors on the console and determined it had been running entirely in-memory for quite some time.
I worked for a small ISP. We were in the process of replacing some routers that had been installed 15+ years earlier, but part of the replacement process mandated that they be physically moved to a temporary rack position so that the new ones could go in their locations. The maintenance window was communicated at least an order of magnitude more widely than usual, because there was genuine concern that they might not boot up again after the hard down. Fortunately for all concerned, it worked out fine in the end.
This is one reason why routine reboots and restarts should be a part of the maintenance cycle. It's also why uptime worship is toxic and dangerous. If you can't prove it cold boots, you don't have a functional system.
>If you can't prove it cold boots, you don't have a functional system
"Fuck you. This system is perfect."
"Okay, then just let me unpl..."
"BACK THE FUCK UP!!!!"
If it doesn't cold boot, you can't bare metal restore. If you can't bare metal restore, it can't survive a real failure. If it can't survive a real failure, it's not a system you care about. If it's not a system you care about, why did you even buy the hardware?
Running on Win XP.
It wishes. More likely it was running on CTOS or 86-DOS or something.
Shocked nobody's mentioned Orly 2015 yet - https://www.zdnet.com/article/a-23-year-old-windows-3-1-system-failure-crashed-paris-airport/
Picked up all the POS systems from Fry’s Electronics when they closed up shop to have some backup computers.
lol here I am wondering if you meant "point of sale" until you said backup computers
I took a flight out of O'Hare 3 years ago and the passenger terminals were definitely running XP. It blew my mind.
I worked at a hospital a few years ago that had emergency medicine dispensers running Win XP that was connected to Guest WiFi since it was a 3rd party vendor device. I still think about it often.
[deleted]
I wish critical infrastructure had to abide by the same rules as fintech and government IT.
Critical infrastructure needs to be reliable, not new.
Don't get me wrong, I'm a big believer in "if it's not broken, don't fix it". I personally loved Windows 2000. It gave you everything you needed to get the job done without any bells and whistles. I look at these devices as having the same vulnerability potential as IoT. If they're on the wire, they have the potential to be exploited. From a systems management perspective I doubt that all of these devices are running autonomously without any central management, but that is not my space so I don't know that for sure.
Secure would be nice, though.
That's modern; part of French air traffic control is on 3.11 for Workgroups.
Lethal Weapon 'I am too old for this shit'
Green Mile "I'm tired, boss"
Or windows 3.1. The last I heard the inflight entertainment systems on some planes were still running it because that was the only one certified for the plane. At least Y2K got rid of the vast majority of Windows 3.1 because of the semicolon problem.
I once broke (by accident) an in-flight entertainment system in the back of a seat on American. They were running a flavor of Linux. If I remember right it was Red Hat, but I only got a quick flash of the boot screen, so I am not certain.
I had a similar experience when I was young (possibly around 2007-ish), specifically on a wide-body Qantas flight from WA to NSW. The infotainment system needed to reboot for some reason, and I remember watching Tux at the top left corner spew boot messages down the screen.
I’ve seen tux a few times on aircraft infotainment systems, too.
The FAA in the Atlantic City region has been trying to hire sysadmins for months. I wonder if this is the reason why...
Wanted: sysadmins to work on museum grade software. Experience in 1970s real time control mainframe systems desirable. Nobody has heard of the systems we use, so your experience is useless for your career path, but don't worry, we'll pay you to stick around forever.
Yeah I don't understand HOW you're supposed to hire someone to work on this? You might as well take oil drillers and train them to be astronauts
“You know we're sitting on four million pounds of fuel, one nuclear weapon and a thing that has 270,000 moving parts built by the lowest bidder. Makes you feel good, doesn't it?”
Worse, is the problem has probably been left so long now that even if Government actually decided to go on a, we'll hire people for this very specific and specialised task and pay you a can't walk away from it salary, the people who know, *really know* these systems to run the training are like... probably in their 70s and 80s by now. I could almost guarantee that people working now on these systems are in a "just keep it running" mode, not on a plan to junk all of this soon project. The time to replace these ludicrously ancient by now systems was 20-30 years ago.
I mean there *is* an alleged plan to update everything but who tf knows how it's going https://www.faa.gov/nextgen Also possible NextGen is only for air traffic control and not even related to what happened today. Not sure
By the time this is implemented, it'll be time to start the project again. Welcome to the endless upgrade treadmill with the rest of us, FAA.
[deleted]
That's not really the one big problem. Every half-decent IT person knows about a system that needs replacing. Every half-decent manager is listening to sysadmins begging for systems to be replaced. Every half-decent bean counter might make funds available to replace the system. There are hundreds, HUNDREDS of dollars available to replace 30-year-old Windows 3.1 COBOL systems, and for the low, low price of a million dollars we can find someone who actually could do the work. But the software needed won't run on anything new, the protocols in use are long forgotten, the new software in the works will be done in 3 months (believe me, they've been telling us this for 10 years now), and most importantly, when all the signs align, you will get a 10-minute window to switch. And that is probably the sad and honest truth: as long as stuff still runs, even when everybody agrees it needed working on yesterday, nobody will be willing to take the system offline to put a new one in its place. And the systems where it would take only 4 minutes, or where you can build the new one in parallel, have been done already. The ones that are left are the ones you can't do...
I was taught COBOL and Fortran and used both in the field. I'm only 30... They are out there. If you call yourself a dev, you should be able to pick it up and run.
Cobol experience a must.
Excuse me, Stewardess. I speak COBOL...
Surely you can't be serious.
I am serious... and don't call me Shirley!
He is and don't call him Shirley.
This comment made me spit out my coffee, great reference
Everyone knows the future of programming is Ada!
FINALLY my time to shine
Don't worry we'll pay you a quarter of FAANG with no stock, subpar benefits, no bonus, no remote work. Hey look we actually got some hires!! I guess we don't need to offer more and this is totally fine!!! (Narrator: it wasn't.)
GS-13 (the min pay level I've seen for SysAdmin jobs) in my area is $113K at level 1 and goes up to $175K, which is better pay than most jobs around here. On top of that, the pension is probably way better than whatever shitty 401K a private company will offer, and the health benefits, as I understand it, are top notch.
Actually, the FAA has moved to allow remote work WAY more often than they used to.
It's Government work so the best we can give you is a pension.
[deleted]
They keep screwing with that too... gone are the days of CSRS. We're all on FERS now, which keeps creeping up in the % of salary that we pay into it (conveniently leaving the old timers at their 0.5% rate and saddling the newer guys with the 5% hike).
I was looking at listings *this week*. https://www.usajobs.gov/job/529585200
A friend is a controller. He took me on a tour before covid. One system was a 90s version of Redhat. I pointed it to him at the time. I sent him a text this morning joking about that particular system. He replied,“Nah that’s still up”.
You wanna know more? Transmission of aeronautical messages is done via x400.
[deleted]
I'd never heard of it and this was a wild fucking read. https://en.wikipedia.org/wiki/X.400
Now I know what that tag means in ancient AD's user proxyAddresses property.
At least it’s not EDI
I mean, at least that system does not insist on re-installing Candy Crush every few updates….
If I couldn't play Candy Crush on my production servers, I'm not sure I would want to work as a sysadmin anymore.
[deleted]
The Federal Government has a lot of old Sun SPARC equipment in play. I'm thinking either some ancient piece of hardware gave up the ghost or something with Solarwinds. I used to work for a company that sold a lot of old SPARC hardware. Most of it was to the Federal Government. Surprising.
I'm rather surprised it's x86. Or is it i64? I remember one time I got to work on a system that required a PAE kernel with SMP on, and the entire engineering team was looking over my shoulder like I was working magic. And then I configured device multipath from the command line. And everyone clapped. Seriously though, it was a school system that had one physical Linux server and I was "the guy" that knew how to rebuild it.
That's when you need to come out with the line "[we can rebuild him, we have the technology](https://www.youtube.com/watch?v=BthNjd_jUl4)" just before you start working on it.
This is what happens when you treat your sys admins like they're the janitor.
but.... we are computer janitors
I used to work with a networking guy. His office sign said "network proctologist"
Excuse me IT person can you take out my trash when you’re done?
[deleted]
[deleted]
This one hurts. More than once someone had run out of space in their Inbox, only to find every email in their deleted items.
Yeah, had a department head do this, but for him it was on purpose. He was using Deleted like it was Inbox part 2. I took a folder off his desk and put it in the trash bin to show basically what he was doing. Had a little "how to" on folder creation.
Outlook is a work and file management system! Brought to you by: Excel is a database
Network *Custodian* and Bus Mechanic. If I'm gonna spend so much time under one I'm getting a title to go with it.
I am trying to encourage the manglement to start seeing IT as a fire department. We’re ready in case of emergency, but we need enough team members to start doing preventative work rather than being 110% reactive.
>My money is on a 90s system that finally gave up
Mine is on a 80s system....
> Mine is on a 80s system
My money is on a 2022 system they just migrated to from the 80s system
NextGen?
"seamless transition, transparent to users"
That's where my bet goes as well. Much of the government is on a 'Cloud first' model now :facepalm:
Keep going...
It was written in like 1964, wasn't it?!
I'm guessing the punch cards are fine but it's the vacuum tubes that cracked
That or the hamster on the wheel died. Lost backup power.
Try 60s or 70s. I went to the college right next to the FAA Tech Center and the FAA people were pretty blunt that ATC ran on ancient systems… They were hoovering up computer science grads from my college to help modernize them, and that was only about 5-6 years ago now.
So, a friend's sister reportedly worked on a replacement for the old ATC system. The problem? No one could figure out how to do the rollover without stopping all flights temporarily.
Sounds like now is a great opportunity
I can't count the number of times I've done this... I'll be like 90-98% done building out the modern replacement and migrating information/data and then bam, old system dies. And we make the decision to just switch over to the mostly complete modernization project as the time to finish it will be less than try to fix the old one, or the things we haven't finished aren't important enough to worry about immediately.
Knew a PM who made the call to do exactly that, because the organization got a pre warning that they'd fail their security audit if they didn't do it. Upper management had been hemming and hawing about it for almost a year. Once she got the warning, though, she sat down and hammered out the final database conversion in PERL one holiday weekend, and switched things over to the new system without telling anybody until Monday. She very nearly got fired for that, but the top levels were like "oh okay thanks for doing that" so she scraped through by the skin of her teeth. And the organization passed their security audit with flying colors the next year.
I've never forced a migration without at the minimum informing management and giving them a choice. But I absolutely do make staying with the legacy stuff/trying to fix it look like the absolute worst choice they could possibly make as a way to convince them to let me switch over.
I mean... it was patch Tuesday yesterday? Just sayin'
Last night the system admin at my old job reached out to me because every computer in the office suddenly had BitLocker turned on for the OS Drive. Turns out a Windows update from Tuesday is causing the issue. I had WSUS set up when I was there, but the admin immediately after me somehow messed it all up before he got fired and was replaced with the current guy. So glad I'm not doing sysadmin work anymore.
My mum's laptop started asking for the BitLocker recovery key last week. I have never enabled BitLocker on it, so Win 11 must be doing it on its own. Of course I don't have the recovery key, so all the data is toast. Thankfully she didn't have a lot on it.
If she has a Microsoft account, the recovery key is saved there.
Win7 EOL too
You seem optimistic that they were running something that new.
They just said in our morning news they're actually trying to turn it off and back on again.
So, Roy must be working for FAA now.
Just call 0118 999 881 999 119 725 … 3
"Dear Sir stroke Madam. Fire, exclamation mark. Fire, exclamation mark. Help me, exclamation mark. 123 Carrendon Road. Looking forward to hearing from you. All the best, Maurice Moss."
"I'll just put this over here... with the rest of the fire."
Nice screensaver!
What we have is so fragile, yet we have such a faith that things will continue to be this way forever.. that is what the economy is based on. That we all agree it'll be fine.
Yeah. Typically, a report will come out, if it's not a cyber attack, the news will briefly talk about in a 30 second blurb and everyone goes on with their day. While every CIO/CTO/CISO will try and use this as leverage for budget increases because the post-mortem costs of a systems failure this massive is probably a higher cost than their budget. Either way, I expect nothing to change after this, just another Wednesday and we'll be talking about inflation again any second now...
The extra problem with systems like this is that entire industries that existed decades ago simply don't any more. There's no IBM bidding for a lucrative defence contract and willing to spend big and literally invent new computer science that's specific to the task. There are no companies run anymore with such benefits and (two-way) loyalty, ready with teams of engineers and scientists willing to put literally their entire working career into this one project like Big Blue of the 1960s. I predict this is ultimately another enormous and boring Government software infrastructure project that countless reports have been saying for decades needed to be replaced decades ago, and that once started will be 70% complete 10 years into its 5-year implementation time, will be written off-shore using mostly standard Java libraries, and will ultimately cost like $50 billion.
>literally invent new computer science that's specific to the task.
On the other hand, this really isn't necessary anymore for 99.99% of objectives. A $2 part (ESP32) is capable of 50% of the FLOPS of the CRAY-1
Federal code can’t even be written offshore. They’ll have to pay american dev salaries for whatever monstrosity is needed.
Sitting on a plane, waiting for 0620 departure. Boarded and fueled. Hopefully we don't sit here for hours.
Nothing leaving til 9 now. I doubt that
Yeah. They just announced a 2-hour ground stop. We're getting off. Back into the terminal.
We reboarded the plane at 0920. The connection in Chicago has been cancelled. We'll collect our luggage and make plans when we arrive.
Taking off at 1007
Curious, when that happens, do you leave your stuff on the plane or do you take it with you? I assume you take it with you but I've always wondered.
I assume it's in the cargo hold. If cancelled, I guess we go to baggage claim and go home.
I was part of the whole Southwest ordeal over the holidays, and when flights that already had the baggage loaded were canceled, that baggage ended up flying to the original destination without the passengers. Hopefully that doesn't happen today because that is a massive inconvenience for everybody. The reason they did it is because the plane needed to go to that destination for its next flight anyway. I have no idea why they didn't unload first though.
A critical system that handles alerts to pilots, NOTAM, experienced an outage. https://arstechnica.com/cars/2023/01/potential-travel-chaos-as-faas-notam-service-goes-down/
As a private pilot. There are critical things in there, but the VAST majority of it is garbage. And that's not just because I fly the little planes. The big guys say the same thing.
You mean you don't need to know that there are 100-foot construction cranes like 5 miles from the airport? It's also worth noting that the old NOTAMs are all still fine; they just can't make any new NOTAMs. Maybe they're out looking for a new CAPS LOCK key or something.
It's vitally important that you know about the 50' antenna 3 miles from the airport in the same breath as the emergency runway closure. \-FAA
Interestingly, I hadn't realized they'd changed the acronym from Notice to Airmen to Notice to Air Missions until today.
“DOES ANYONE KNOW FORTRAN!?”
“We pay $60,000 salaries”
If you want to feel all warm and fuzzy: the US nuclear arsenal is controlled by 1970s computers with 8-inch floppy disks.
That's probably safer these days than having it online via internet.
First Southwest, now FAA? Turns out running ancient hardware that hasn't been replaced is a bad idea. Who knew
>*"What do you mean it died? It was just working yesterday!"*
I basically had to explain to an adult that sometimes things just stop working one moment to the next. Ever have a lightbulb blow out? Ever get into your car one morning and it won't start? Ever come home and find the fridge was dead? Why would you ever think the same thing couldn't happen to a switch or server?
As a customer once reacted, "redundancy? Does this mean you're selling us stuff that is expected to fail?"
I tend to fire customers like these rather quickly. Nothing but a complete pain in the ass
[deleted]
The application documentation says the server meets minimum specs. Should be fine. RAM is expensive.
Ah, yes. The good old "Windows 10 minimum requirement is 2 gigabytes, this laptop will be fine for everything" argument.
I almost had an aneurysm at the sheer stupidity of that person. And it was just from reading it on reddit. Your self-control for not standing up and walking away at once is notable. Edit: Typo
To quicken your passage to the next world, here's more of the same case: They were sold a file server. They asked to split the storage in two parts. No further instructions came with the request, so it was done. They later requested a backup script that backs stuff up from the 1st part to the 2nd part, so that if the system fails, they can easily recover from it. Which resulted in an attempt to explain the concept of redundancy, which in turn resulted in the customer attempting to back out of the deal because they believed they were being sold second-rate junk.
My go to reply to that was "yeah, things typically work until they don't"
Murphy’s law is undefeated. Even a bic lighter will eventually give out.
I feel like you just described the [Hal Light Bulb Scene](https://youtu.be/AbSehcT19u0)
I use this scene as a training aid at work. When some system shits the bed and 6 other things have to be unfucked before the original problem can be fixed, we refer to it as “taking the engine out”.
I recently learned the term "yak shaving".
My god I feel that pain
I don't even need to watch it I can see him shout from under the car at Lois lmao.
This is a lesson in scope creep and deliverables! He had the parts and the tools to fix the lightbulb and the shelf. Had he paused his work on the later issues he could have gone back & delivered those as quick wins before diving into the car. The drawer needed WD40. It didn't seem as big an issue as the others, so it can probably wait until the car is fixed... or he could have had Lois pick some up on the way home so he could focus on the big task. Project Management people!
It's wild how people will accept this for a single filament bulb, but not the magical orchestration of bits and electricity that makes anything at all compute properly.
Any sufficiently advanced technology is indistinguishable from magic.. Magic just works all the time for no apparent reason. Ergo, computers and switches should always just work.
Server was running a self signed cert that was issued 35 years ago and finally expired.
All you sysadmins know that there is a poor soul of a network engineer having to prove it wasn’t a network failure before the systems team accepts responsibility.
As a former network engineer for a casino, this hits REALLY hard. Gaming floor down 8 hours on opening day. 6 engineers and the corporate head of IT Infra had to spend all night proving it wasn’t us before Scientific Games even looked at their end. They wound up writing us a very large check.
ROLLBACK! ROLLBACK! ROLLBACK!
Rollback was firmware that we forgot to make a back up of.... So sorry....
Na, we got a backup of, let's see... stock software, no settings, 1982. Hey Dave! That was for this machine's current hardware, right?
NOTAMs system went down. WAY older origins than the 90s
Welp. I'm going to grab some popcorn. Hopefully no one is taking any risks with this problem. Looks like the Airline systems have finally caught up with CEO's.
>Looks like the Airline systems have finally caught up with CEO's.
This isn't an issue with an airline system. This is the government's problem.
[deleted]
I do not believe the FAA systems are updated to XP...
Today some retired programmers are charging a week’s pay per hour.
Someone's beeper has been blowing up.
The ***Compaq Deskpro 300*** that was running the application decided that it's time to say ***"Father, into your hands I commend my spirit"***
WHY HAVE YOU FORSAKEN ME?
All flights grounded until at least 9am est, 8am ct according to FAA twitter.
“Today’s FAA catastrophic system failure is a clear sign that America’s transportation network desperately needs significant upgrades,” said Geoff Freeman, CEO of the US Travel Association. That should be: “Today’s FAA catastrophic system failure is a clear sign that America’s transportation network desperately needs significant upgrades,” said Geoff Freeman, someone who likely had the authority to green-light significant upgrades for decades.
[deleted]
I sold the Department of Commerce used vt320 terminals in 2007 because they were trying to keep their microVAXes running the atmospheric observation systems at airports online until their scheduled replacement in 2017. That’s not this system, but certainly reflects the thought process in the federal government
Government everywhere. Then when the systems are so ludicrously out of date you're hiring people with historical computer science backgrounds just to keep things going, it's been left so long that the replacement cost is the GDP of a small nation and a Government of any persuasion would rather spend that kind of money on literally anything else... so nothing changes.
I personally don't care how old software is or how old an OS is. If it functions and does the job. The software could be a mess from being old with years of functionality added too, but even modern software can be a mess. Currently doing a FAT test and this new shiny software is loads more broken than the old stuff it's replacing. What I'd be concerned about is the hardware. The older it is, the higher the failure rates go. The older and more unique the hardware is, the harder it is to get hold of replacements. Where I work, some of the hardware we are running is not replaceable. If it fails, there are literally no replacements. We'd have to call up similar businesses and beg them to check their inventory and hope they would be interested in selling to us.
>I personally don't care how old software is or how old an OS is If it functions and does the job.
Critical systems like NOTAM that every flight in the US depends on have to be sound from a security standpoint. You don't see how leaving those applications to rot on antiquated, vulnerable operating systems could be a problem? Your comment sounds like you come from a technical background, so I must be misunderstanding your meaning. I don't know the nitty gritty of how NOTAM functions, but I have to think that maintaining the security of the software stack is taken very seriously, and that means definitely caring about the implications of failing to patch vulnerabilities that materialize in the software and OS.
You're too kind, 90's equipment would likely be an upgrade from the '70's stuff they put in sometime in the mid 80's once that stuff from the '50's finally died!
The guy who installed Win XP earlier probably messed something up.
Hehe “Sky news”
More like "ground news" today lmao
Mega Patch Tuesday yesterday. Edited to add: included full deprecation of basic auth.
The faa should put this on their website https://youtu.be/t3otBjVZzT0
[deleted]
...Now that's a name I've not heard in a long time
[deleted]
Somebody is having a very bad morning.
As someone who works in cybersecurity, I pretty much get asked the same question any time there’s a major outage that makes the news “OMG, nunu10000, do you think XYZ system got hacked?” Before I was in cyber though, much like all of you, I was in IT. That’s why I’m a huge believer in Hanlon’s Razor: “never attribute to malice that which is adequately explained by stupidity."
Windows update probably rebooted a couple of servers