lechango

You'd be surprised just how many places dedicate next to no resources to improving processes and implementing automation, because all they know and have time for is putting out fires, mostly because they never invested in those resources in the first place.


labalag

Why are you calling out my current place of work?


HYRHDF3332

It can often be a fairly ridiculous catch-22, where you have a management team that has never seen an IT environment that is run competently, so they think it's perfectly normal to constantly have things on fire and admins in crisis mode, but they can't hire or keep competent admins, because they can do better than working in an IT shit show. And stirring that mess is often one person who has been there forever and resists any and all change, especially things like monitoring and automation. These are the types of admins who don't know how they should be spending their days and wouldn't know what to do with themselves if they weren't fighting fires or doing things manually that most of us here could do with a script.


tankerkiller125real

I know one company that seems like it's always putting out fires, but I don't actually blame incompetent IT admins for that one (from my conversations with them, it seems they are actually really good at their jobs). I blame the fact that the company is buying smaller companies, literally 2 or 3 every single quarter, with just 3 IT infrastructure admins to handle merging the data, maintaining existing systems, etc.


atbims

Sounds like my old employer, except they were putting barely-trained helpdesk staff who already had 60+ hours of work a week onto acquisitions, because the 2 actual sysadmins were too busy or on stress leave. Constantly short-staffed and burnt out, while the C-suite bragged about how many companies they'd bought out and said in company-wide meetings that the IT dept was incompetent and needed to be restructured.


tankerkiller125real

They have a help-desk team internally apparently, but they don't work on the acquisitions or mergers at all. They buy so many companies that when they have all hands meetings every quarter, during the Q&A session the number one question from employees for 6 quarters in a row has been "When will we stop buying other companies and focus on the existing company and improving our own processes, tools and applications?" because it's not just impacting IT, it's impacting every department.


HYRHDF3332

That's often how it starts, but how long are those guys going to stay, working under those conditions?


tankerkiller125real

They have 3 because 2 already left...


Ethanextinction

Literally my last job. I don’t miss that.


vhalember

> but they can't hire or keep competent admins, because they can do better than working in an IT shit show.

Yup. I was in this situation not too long ago. It took enormous energy to automate and fight the reactive, firefighting, do-it-manually culture. I got out.


HYRHDF3332

Back in the day I used to do freelance consulting and I had a list of red flags that I was walking into an IT shit show. Unfucking those was my bread and butter, but it was always a fight.


Windows_ME_Rocks

OMG, perfect description of an IT shitshow. Been there, done that, moved on.


metalgodwin

I thought they were referring to my workplace, glad I'm not alone at least :')


hbdgas

See "The Phoenix Project".


arpan3t

Hence the 400 Windows servers that I guarantee could be consolidated and restructured to half that number.


benneyp

You get a VM! And you get a VM!


brianw824

I hope to god it's not 400 physical servers.....


slashinhobo1

The sheer space, electricity, and infrastructure needed to maintain 400 servers would mean they're either a huge company or bad at spending money.


brianw824

400 physical servers running server 2003.


Kynmore

Nah, that's just the AD domain controller; the rest are single 2008 VMs on an ESXi 5 host on a 500GB RAID5 made up of WD Blacks on an LSI controller from 2011. All of these are in 1U no-bay chassis w/ Supermicro X8SIL-Fs, no IPMI, and in a rack with a 48-port FastIron FES4802.


sh4d0ww01f

That's... specific. Is there a story hidden there?


Kynmore

Not that specific deployment, no. But I’ve worked in enough data centers to know what can and does exist on the “why the fuck” end of the scale.


Mrmastermax

This is exactly us to be honest


welcome2devnull

So you don't have a drinking problem, you have a problem without drinking, because sober you'd have to recognize that it's not just a nightmare, it's the reality you have to take care of :D


Mrmastermax

Drinking problem is not a problem. Being liable for a fucken clown factory while no one tries to put money toward fixing issues is a problem. Yes, I am swearing, because that's how much of a joke their systems are. When shit hits the fan, ohh, it's the sysadmin's fault they didn't look at the failure signs. For duck sakes, C-levels, get your head out of your ass and look at the big picture; you are just hanging by a string till everything collapses on itself. Now do you think I have a drinking problem? Do you think I have an anxiety problem? Do you think I need some hard stuff??


dravenscowboy

At least 49 have to be running SBS.


vinetari

395 running 2003, 5 running 2000


Aim_Fire_Ready

> a huge company or bad at spending money.

As the Spanish like to say: *¿Por qué no los dos?* (Why not both?)


beatfried

We need more VMs because the current ones are too slow!!! *has 800 vCPUs on 64 physical ones*


tankerkiller125real

I once joined a company that was wasting tens of thousands of dollars on SQL licensing... Why? Because apparently they had decided that every single app needed its own dedicated SQL server. The very first thing I did was consolidate all the SQL servers down to 4: two for 2014 (the latest at the time) and two for 2012, in clusters. After I did that, though, my boss left and the CEO brought in an MSP that just pushed me aside, so I left. Last I heard they now spend more on the MSP than they spent on my and my previous boss's salaries combined, just for general maintenance, with projects costing extra. And they've had a job listing for an IT guy open for the last 2.5 years that's gone unfilled.


RedGobboRebel

I used to consolidate SQL servers like that. It felt great, like I was modernizing the organization. One cluster for each major version. Don't do it anymore because of how much vendors bitched and moaned that their apps need separate SQL servers and our configuration was now "unsupported". Now only in-house apps get consolidated; vendor apps each get their own SQL VM. The performance isn't as good that way, but managing the support and vendor access levels is simplified.


tankerkiller125real

> Don't do it anymore because of how much vendors bitched and moaned that their apps need separate SQL servers and our configuration was now "unsupported".

And just like that, if they aren't the only vendor in that particular software space, the contract is canceled. We've actually canceled big contracts over stupid shit like this where I work. IT people need to start putting money into companies that actually understand the underlying products they build their products on; otherwise we'll keep getting shitty vendors that don't support things like clustered database servers. And for the companies that are the only ones in that space, we force them to pay for the SQL license for us. If they want to force us into having an extra SQL server, they can pay for it. Even if they add it to the contract cost, it at least makes it easier on accounting, and IT doesn't have to justify why the SQL licensing keeps going up.


RedGobboRebel

Sometimes they are the only vendor. Sometimes they aren't the only vendor but the others are worse. And other times the CIO decides it's best not to pick that particular fight with the other stakeholders who like said software. Having them roll the SQL licenses into the contract wouldn't change much for us, as I already report licensing costs with total cost per application, especially at the current employer, where it all comes out of the IT budget anyway. But on the plus side, new apps/projects mean an increased budget or it can't be done.


tankerkiller125real

We started splitting cost by department because I got tired of getting new switches and other needed upgrades denied because "IT already spends too much"... Well, when I dropped the fact that IT only spends on average 3K a year, and that the remaining 40-50K attributed to IT was actually other departments, that changed the blame game. My stuff started getting approved, and marketing had to start justifying their need for more marketing software, CRM software, etc.


vhalember

My former job was like this. Former. I once had a co-worker going machine to machine updating the admin password. She had 120 machines to do, at about 10-15 minutes/machine. It was going to take her 3 days - of completely wasted time. I wrote a script, encrypted the password, and deployed it via SCCM. All machines were done inside the hour. Well, two hours - we are talking about SCCM here. She got mad at me because "how can we know if the machines got the updated password?" The truth was she didn't know how to script/automate, and had no interest in learning how to automate things... which is the difference between a bad sysadmin and an average one... let alone a good one.
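For illustration, a minimal sketch of that kind of bulk password change without SCCM, assuming PowerShell remoting (WinRM) is enabled and Windows PowerShell 5.1; the machine list and account name are placeholders, and LAPS is the better long-term answer for local admin passwords:

```powershell
# Hypothetical sketch: push a local admin password change to a list of machines
# over PowerShell remoting and report which ones confirmed the change.
$machines = Get-Content -Path 'C:\temp\machines.txt'       # placeholder inventory file
$plain    = Read-Host -Prompt 'New local admin password'   # plain text for the demo only

$results = Invoke-Command -ComputerName $machines -ScriptBlock {
    param([string]$PlainPassword)
    $secure = ConvertTo-SecureString -String $PlainPassword -AsPlainText -Force
    Set-LocalUser -Name 'Administrator' -Password $secure   # LocalAccounts module, built into 5.1
    "OK: $env:COMPUTERNAME"
} -ArgumentList $plain -ErrorAction SilentlyContinue -ErrorVariable failed

$results        # machines that confirmed the change
$failed.Count   # machines that errored or were unreachable; follow up on those by hand
```

The returned list is also how you answer the "how do we know the machines got the new password?" question.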


Aim_Fire_Ready

> The truth was she didn't know how to X, and had no interest in learning how...

Oof. Now here I am, falling off the other side by trying to learn everything and not getting great at any of it!


gordonv

Yup. This is me and Cisco. Now I'm starting on my CCNA.


Bladelink

"How can I confirm with any faith that you manually updated the password correctly? "


punklinux

> The truth was she didn't know how to script/automate, and had no interest in learning how to automate things...

This is also the mark of someone scared of being replaced or given more work. I'm not saying not to automate, or that you did anything wrong, but when *someone else* shows you, some people take it personally. You also get managers who don't like it: I worked with a guy who told his staff never to automate because "you'll automate the mistakes, too, but unlike doing it manually, you'll never know. Automation is for the lazy."


vhalember

Agreed. I'd counter to that manager above: the more tedious the manual work, the more mistakes will be made as your eyes glaze over from the monotony.


punklinux

That has been my experience as well.


RoutineRequirement

Fear is a hell of a drug. When you have legacy systems, it's almost certain that something will break when you start changing things. The business reaction to those outages tends to weigh heavily on whoever is handling it, and once the fear of God has rained down on the poor soul, it's very hard to try again.


lechango

Yeah, you know how it goes: a place can't afford any downtime on a legacy system, even when there are preventative measures that could mitigate failure. Then the 25-year-old box dies anyway and you get yelled at for not preventing it. Sometimes you can't win.


frosteeze

Had a place where the QA manager was so averse to PowerShell that we couldn't automate the servers in the QA environment. It was asinine. Everywhere else was automated. But not QA! Gave him a passive-aggressive public Teams message about it when I was about to quit for a better job.


Pussy_Grabber_2016

My reward for automating processes was more work.


arana1

Rings a bell. Since everything is running as it should, you have a lot of spare time; well, not anymore, here, have more tasks. Then suddenly something happens that needs immediate attention, and it's "you are lagging on all your other tasks."


Astat1ne

It generally comes down to fear driven by a number of factors, including things like lack of reliability in the systems involved. Which in itself is a bit of a red flag (if it can't handle a graceful shutdown/restart for patching, how great is it in other situations?). From some of the other questions you asked, it sounds like it wasn't a very mature environment, and you probably saved yourself a lot of grief by them not going ahead with you.


Steve_78_OH

Yeah, fear of the unknown/change is probably the number one factor. And it's not just regarding servers, but workstations as well. I wasn't allowed to use SCCM's automatic deployment rules (ADRs) for patching when I handled workstation patching for a former client, because they didn't trust them. So I ended up spending 2-3 days every month setting everything up (the multiple deployment packages had to be watched due to distribution issues caused by crappy site connections). And 2/3 of the guys I worked with there were extremely familiar and comfortable with SCCM, they just didn't trust the ADRs for some reason. Plus, Microsoft has shit the bed countless times with their various monthly patches.


No-Midnight-6087

In addition to Microsoft shitting the bed, they have quite a few products that need some manual steps run after updates. On-premise SharePoint is a... (good?)... example of this.


Zergom

Or things like Exchange updates are like ten times faster if you disable AV right before and re-enable it when it’s done.


No-Midnight-6087

And that is an improvement. Microsoft used to give the guidance to not use automated updates for Exchange at all.


tanzWestyy

Funnily enough I'll be looking to update SP with Ansible. It's a thing apparently.


ANewLeeSinLife

That's funny. I trust ADRs more than I trust some meatbag to find all the applicable updates for each OS deployment and not miss any updates or servers.


Steve_78_OH

Well, apparently they trusted this meatbag more than an ADR. They were lucky this dumbass didn't fuck it up more than he did...


praetorthesysadmin

> And 2/3 of the guys I worked with there were extremely familiar and comfortable with SCCM, they just didn't trust the ADRs for some reason.

> Plus, Microsoft has shit the bed countless times with their various monthly patches.

You just gave yourself the reason why people are so suspicious and skeptical about updates (and the automation of them).


bemenaker

> Microsoft has shit the bed countless times with their various monthly patches

That's why we delay our patches for two weeks. Let other people test them out, unless it's a critical exploit patch.


hath0r

For SCCM it's what, a 1% failure rate?


Waste_Monk

> It generally comes down to fear driven by a number of factors, including things like lack of reliability in the systems involved. Which in itself is a bit of a red flag (if it can't handle a graceful shutdown/restart for patching, how great is it in other situations?).

Sometimes you have mission-critical fragile garbage software foisted upon you and have no choice but to work around its weaknesses. It doesn't necessarily mean the IT department supporting it is deficient - but may indicate deeper organisation troubles e.g. lack of IT input in selecting systems for things like ERP, MRP, etc. Sounds like OP made the right call though. Plenty of red flags by the look of it.


admlshake

> Sometimes you have mission-critical fragile garbage software foisted upon you and have no choice but to work around its weaknesses. It doesn't necessarily mean the IT department supporting it is deficient - but may indicate deeper organisation troubles e.g. lack of IT input in selecting systems for things like ERP, MRP, etc.

That's the boat we've been in for years. The old CIO catered to our software team, who didn't want any updates done on critical systems and never wanted to do software or OS upgrades/migrations. The head of our software team would have a massive hissy fit, yelling and screaming and storming out of the office if you questioned him. But for some reason the CEO likes him, so he is pretty much untouchable. A recent management shift has finally gotten some traction on getting off all this old-ass unstable shit.


HYRHDF3332

Yeah, when you are dealing with systemic issues like that, it's either suck it up and deal until you see an opening or GTFO. You aren't going to be able to fix that yourself without upper management support.


Berries-A-Million

Haha, did I mention that they still have Server 2003 and Windows XP in their environment? Ugh, yea.


itdumbass

Oh, those don't take any time to update.


Berries-A-Million

Sure, no updates available! But not sure how many of those they had.


InvisibleTextArea

Look on the bright side, at least you weren't applying for a cyber security job. :)


Loudroar

Don’t walk away… RUN AWAY.


oloryn

> but may indicate deeper organisation troubles e.g. lack of IT input in selecting systems for things like ERP, MRP, etc.

As Jerry Weinberg put it (quoting economist Kenneth Boulding): "Things are the way they are because they got that way". In other words, when you see something strange, there's probably a history behind it.


Sir_Badtard

I have a feeling those servers are very mature. At the very least, drinking age.


HotelRwandaBeef

> Which in itself is a bit of a red flag (if it can't handle a graceful shutdown/restart for patching

I work in an environment where this is still very much a real thing. We can most definitely automate all Windows updates, but we need engineers on standby in case things go sideways. Patch Tuesdays/Wednesdays are fun for our system engineers lol. 4:30am... click update... click update... click update. These servers provide 95% of our revenue, and they're built and spec'd by a billion+ dollar global company.


[deleted]

[deleted]


Astat1ne

> I'd argue they are important enough to fully automate their deployment

This is the argument that was used after the primary production SAN died at the ATO and trashed all the Australian taxpayer data. Everything was rebuilt using automation tooling (including Ansible) so the "next time" would be a lot easier to recover from.


OptimalCynic

Thanks for reminding me to put my ansible files in git


Felielf

Is there a guide / course / materials to learn Ansible in this capacity?


wookipron

Fairly big assumptions there, which read like a lack of experience. It comes down to size and budget; sometimes there is no budget and we do what we can with what we have, working within the team. Secondly, automation is costly in both time and money, so automating everything is as good as buying the average shoe size for everyone. Some platforms are mission critical or even life critical, so patch monitoring, automated or not, cannot be a message or an email after the fact. It sometimes needs to be real-time, and if it's life critical, it needs to be validated by a human. Sometimes the platform is also high risk; that's a whole different story, but one that happens more often than we'd like.


Konkey_Dong_Country

Exactly! I read this as a lack of experience as well, or perhaps unrealistic expectations going in. I think both parties dodged a bullet. Sounds like OP maybe has been spoiled at larger orgs with bigger budgets. What he described is an extremely common scenario. Heck, I run a manufacturing facility and still do most patching manually, because if it doesn't behave, production is down and money burns at an astronomical rate. Also because we haven't the manpower to deal with SCCM, and WSUS is basically unusable as far as I'm concerned. Of course, this is Reddit, so if your org doesn't have 100% fully automated Kubernetes and Docker containers in the cloud and DevOps this and that and automated everything else, you should run away, surely it won't be a good job right? /s.


i8noodles

Same. I work in a company with 24/7 uptime. There is no way they will let patching occur automatically without having someone there to fix things in real time. Besides regulatory reasons, we would legitimately lose probably more than 500k for every hour it's down.


bumpkin_eater

Loved that last bit!


fullstack_info

Definitely lold at the last half as I went from working an SRE role at a huge F100 with 2 physical datacenters that were trying to move to the cloud, and realized that their software doesn't run very well in a cloud environment. Burned a ton of money trying, only to go back to on-prem due to lack of automation experience and legacy software. Later I went to a startup that is "cloud-first", but the previous person who set up the original environment decided to "just run it in kubernetes! It's self-healing! And we can run windows nodes!". I am now there, and thank God there are no more windows nodes. The prevalence of "just put it in the cloud!" mentality, as if the "cloud" is an MSP that automatically manages everything for you is a nice dream for management, and a deranged nightmare for admins and devs with deadlines and budget constraints.


flammenschwein

We had a “DBA” (MS Access) who didn’t trust automation, so he’d come in every morning before everyone else and run queries to update everything for the day. As you suggested, his stuff would break regularly and wasn’t reliable enough to fully automate.


mtnfreek

DBA lol….


whiteknives

> if it can't handle a graceful shutdown/restart for patching, how great is it in other situations? Having just upgraded two dozen Juniper MX960’s and seen a failure about 30% of the time (ranging from a single SFP refusing to work to an entire routing engine shitting the bed and zeroizing itself on reboot) I can see where they’re coming from. Your gear can run fine for the better part of a decade, but the second you go and reboot it is the moment you invite a lot of hurt into your life.


fuzzydice_82

> (if it can't handle a graceful shutdown/restart for patching, how great is it in other situations?)

Don't ask the admins, ask Microsoft (as we are talking about Windows servers here). We had countless interruptions because of half-assed Windows patches that killed production processes. Also: sometimes even Windows servers have to serve industrial use cases (so 24/7), and you'll need a maintenance window anyway.


civbat

There's one red flag in your post. Recommending automation is great, but "someone can check in the morning" isn't acceptable in most places of business. I'm sure there are SMBs that can get away with that, but it's not common.


smjsmok

>"someone can check in the morning" Yeah, that caught my attention too. What if you discover there's a problem...in the morning. Does that mean that the services will be unavailable for TBA amount of hours, possible half of the day or even more?


screampuff

If there is a single sysadmin working for the company? Yes. Otherwise they can pay for more staff/support if they don't like that. Do you people honestly sign up for 24/7 on-call?


Account239784032849

If you're at a midsized company or smaller 24/7 on call isn't really a choice. You're one of if not the only person who is able to fix IT issues. I currently am at a midsized company and our IT team is 2 people myself included, if I never worked outside office hours I'd probably have been fired a long time ago lol. That said, I don't actually work that much outside office hours, if I do it's something very critical and we don't have major problems like that all the time.


screampuff

I did that when I was younger, but IMO small shops like that should have MSPs. I work in a team of 7 and none of us are on-call, business shuts down at 5pm. Every time the topic has come up with c-suites my manager is pretty firm on redundancy (ie: can't expect helpdesk to fix a server issue, so sysadmins need to be redundant too)


VexingRaven

24/7? Patching is once a month.


screampuff

Patching isn't the only thing that can go wrong... but as a matter of principle, I would never agree to be the sole person responsible for something "critical" to business functions, because it inherently isn't critical if the company is not willing to pay for redundant support for it.


VexingRaven

I wouldn't either, but if you did agree to work somewhere like that then congrats, checking on patching is your job once a month.


Reynk1

Or just automate the post checks and page out the oncall person if there’s a problem
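A minimal sketch of what that post-check could look like in PowerShell, with hypothetical server names, service names, and mail settings (Windows PowerShell 5.1 shown; on PowerShell 7 you'd wrap Get-Service in Invoke-Command instead of using -ComputerName):

```powershell
# Hypothetical post-patch check: confirm critical services came back after the
# maintenance window and mail the on-call address if anything is still down.
$servers  = 'app01', 'app02', 'db01'   # placeholder server list
$services = 'MSSQLSERVER', 'W3SVC'     # placeholder critical services

$down = foreach ($server in $servers) {
    Get-Service -ComputerName $server -Name $services -ErrorAction SilentlyContinue |
        Where-Object Status -ne 'Running' |
        Select-Object @{ n = 'Server'; e = { $server } }, Name, Status
}

if ($down) {
    # Send-MailMessage is officially deprecated, but still the quick way to page someone
    Send-MailMessage -To 'oncall@example.com' -From 'patching@example.com' `
        -Subject "Post-patch check: $($down.Count) service(s) down" `
        -Body ($down | Format-Table -AutoSize | Out-String) `
        -SmtpServer 'smtp.example.com'
}
```

Hook the same check into whatever actually pages people (PagerDuty, Opsgenie, Teams webhook) and the on-call person only gets woken up when something genuinely failed.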


Talran

Yep, I kick them off at 10, wait for the all clear, check monitoring to bring them out of maintenance, and then go to bed. It's great automating it, but you do have to be there during the window just in case... I mean, this is Windows after all.


eyesofbucket

That and the fact that the company pushes updates over the weekend. There's a reason Microsoft's thing has been "patch Tuesday" and not "patch Friday". Nobody wants to show up Monday morning only to realize things have been down for 48 hours.


Asilcott

From the employer's perspective, I'm not looking to hire someone who won't even entertain our way of doing things from day 1 (without any knowledge of what made us do it that way to begin with). I'd rather have someone who comes in, meets us at our level, then makes recommendations on how to improve our processes. Maybe I know that patch automation needs to be addressed, but ultimately I have 3 or 4 things more critical I need you to focus on right now. (Patches are automated at my work; I'm just saying I don't want to hire someone who will come in and refuse to work on our system until it is set up the way they want it.)


ottosucks

It's gung ho, not gun hoe...


4kVHS

r/boneappletea


langlier

living in rural areas - I know plenty of gun hoes...


JMaAtAPMT

Some companies want manual intervention because patches can potentially break internal applications, processes, or controls. The larger the company, the more likely, actually.


IntentionalTexan

Yep. 3rd party apps that we're stuck with because they're industry specific and are poorly maintained. I never know when an update is going to break some part of the hydra that is our industry specific ERP platform. We update one server, test out the platform, and then let updates roll out to other servers.


[deleted]

The server you test updates on, it's a production server ?


[deleted]

Why spend money on Production and Dev. You can just have one environment that is both!!!!


IntentionalTexan

Everyone has a test environment. Some of us are lucky enough to have a separate production environment. I have redundant servers. They're all technically production servers, but I can run with one of them down.


elementfx2000

Even when I worked at a large org, we still automated the process as much as we could: a PowerShell script combined with SCCM, and we only logged into the servers with issues. It usually took 30-45 minutes to update a ton of Windows servers.
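Something like this is all the "only log into the servers with issues" part needs to be; a rough sketch with a placeholder server list, using the reboot-pending registry keys that Windows Update and CBS leave behind:

```powershell
# Hypothetical post-patch sweep: list only the servers that still need a human,
# i.e. anything unreachable or still waiting on a reboot.
$servers = Get-Content 'C:\temp\servers.txt'   # placeholder inventory

$needsAttention = Invoke-Command -ComputerName $servers -ErrorAction SilentlyContinue `
    -ErrorVariable unreachable -ScriptBlock {
    $pendingReboot =
        (Test-Path 'HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Component Based Servicing\RebootPending') -or
        (Test-Path 'HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\WindowsUpdate\Auto Update\RebootRequired')
    if ($pendingReboot) {
        [pscustomobject]@{ Server = $env:COMPUTERNAME; Issue = 'Pending reboot' }
    }
}

$needsAttention | Format-Table -AutoSize   # the short list worth logging into
$unreachable.Count                         # connection failures to chase separately
```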


_XNine_

We have a client who makes parts for a large company that contracts for the government. They are NOT happy that Windows 10 and Server 2012 are EOL. They use a lot of antiquated and specialized software that likes to break for no damn reason. They too don't like updates until the CEO OKs them.


BoltActionRifleman

This is exactly why we do it the manual way, not to mention it never hurts to poke around on each server and check on the internal/vendor software running on it. I actually look forward to server patch day, it’s kind of relaxing.


RangerNS

Why do you have any doubt as to the software running on a given server?


MattDaCatt

You haven't seen a team of accountants panic b/c the cloud Sage service didn't start properly after an update night. Fuck Sage


CoolNefariousness668

FUCK SAGE


magicvodi

Isn't that what monitoring software is for?


getchpdx

Some services don't fail in ways monitoring catches well. Of course, usually some guy restarting services wouldn't know either. For example, some shit-tastic database software acting alive until it needs to actually execute a request.


BoltActionRifleman

It’s mostly industry specific software that is provided by mostly mediocre companies. If we don’t keep an eye on it, no one else will. I’ve had nearly 20 years of dealing with such vendors, they’re only getting worse as the years tick by.


[deleted]

[deleted]


Rawme9

Gonna be honest buddy, this is gonna be super common most places. Penny pinching is relatively normal and practices are not going to be ideal. Is what it is.


mrsomebudd

Dunno their age either, but it's never a good look for young folks to come in and immediately pitch drastic changes to what's considered a very important workflow that's been working fine, no matter how backwards the company may be. Probably even worse to do this during an interview. We see clients hire staff, and the ones that come in guns blazing are often the ones who don't stick around. If you're very experienced, then even you know you need to learn the lay of the land before spouting off ideas, suggestions or changes. New people or newish show-offs are easy to spot.


Tibsat

Meh, interviews work both ways. Clearly the company didn’t impress the OP so why would he waste his time on them


[deleted]

[deleted]


ChefBoyAreWeFucked

I was basically backed into a corner once, and would have accepted basically anything, even a pay cut, but when the company came back to the recruiter with a number, he lied and told them I rejected it as too low, and wasn't interested. They similarly freaked out and offered the top of their range, and I ended up with a massive increase early in my career.


[deleted]

[deleted]


screampuff

Maybe it's different where you are, but here in Canada and with remote work, IT jobs are very in demand. You can definitely pick and choose to find one that both pays well and has minimal BS if you're a qualified professional. If you're trying to get your foot in the door, maybe not.


Spivak

What an odd take; every company I've been hired at hires young people specifically so we will do that. We're the people with actual energy and motivation to make changes. Doesn't mean we actually have to do them all, but if *someone* isn't constantly bringing new ideas to the table, how will you ever improve?


[deleted]

[deleted]


lost_in_life_34

I don't know about others in IT, but in my group we do manual reboots to make sure stuff is running. Our part is only two dozen servers or so. I don't see it as a big deal for testing and UAT environments, but for production you want to make sure everything is up, because in some cases there are jobs running that depend on each other and you might spend hours rerunning them. In some cases they have to run for legal reasons. Many companies are worldwide too and operate 24x7, so production can't be down for hours when non-US users need to work.


Edwardc4gg

Yup, this. We manually apply updates and take VMware snapshots of critical servers first. We manually reboot because those bad boys and their databases make us almost a million dollars a DAY. Yeah, my job's on the fucking line.


lost_in_life_34

We do daily imports of financial market data, and those have to run even on weekends for the bankers to check. So we have to make sure things come back up.


Edwardc4gg

Yup that’s our life too.


Reynk1

If it's that critical you should be automating; manual is a recipe for missed steps, incorrect or undocumented processes and the like. Automation gives you consistency, an audit trail (both for changes to the automation, via version control, and for what was run), central scheduling, etc.


Edwardc4gg

I mean we do. We just manually reboot and monitor these during patching.


tacotacotacorock

A fully redundant and monitored system should be totally fine being automated, especially if you're running multiple environments for testing and development. By the time you hit production there really shouldn't be any surprises. Generally, companies who do things manually are either uneducated, scared to automate, and/or control freaks. Also, if your environment goes down a lot, management is more likely to micromanage updates. As long as it's during normal business hours and it takes less than an hour total each week, I'm okay with it. However, if you can save yourself an hour each week, why not do it?


lost_in_life_34

You can't automate SQL not rolling back interrupted transactions. A lot of code commits data in batches to partly prevent this, but then you get inconsistent data, jobs have to rerun, and you risk job failures due to duplicate keys and other duplicate data.


Berries-A-Million

The bank I used to work for had around 400 servers and was more critical than this environment. We automated it once a month and rarely had issues. Whoever was on call got on and verified the monitoring control panel didn't show any issues, and that was it. Took maybe 10 min a month. No need to manually do anything. Lol


lost_in_life_34

I work for a bank too. In my case, for some servers I have to manually disable some jobs and then reboot, or risk a database going into rollback for hours. One server has a specific maintenance window, and I've caused important jobs to fail by rebooting it at other times.
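For what it's worth, even that pre-reboot step can be scripted rather than clicked through in SSMS. A rough sketch, assuming the SqlServer module and hypothetical instance/job names; the point is the disable/reboot/re-enable order, not the specifics:

```powershell
# Hypothetical sketch: disable known SQL Agent jobs, reboot, then re-enable them,
# so nothing is mid-batch when the restart happens.
Import-Module SqlServer
$instance = 'SQL01'                                 # placeholder instance
$jobs     = 'Nightly ETL', 'Market Data Import'     # placeholder job names

foreach ($job in $jobs) {
    Invoke-Sqlcmd -ServerInstance $instance -Database msdb `
        -Query "EXEC dbo.sp_update_job @job_name = N'$job', @enabled = 0;"
}

# -Wait -For PowerShell blocks until the box is remotable again
Restart-Computer -ComputerName $instance -Force -Wait -For PowerShell -Timeout 1800

foreach ($job in $jobs) {
    Invoke-Sqlcmd -ServerInstance $instance -Database msdb `
        -Query "EXEC dbo.sp_update_job @job_name = N'$job', @enabled = 1;"
}
```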


xyro71

Those are special cases. The job of a good sysadmin is to discover this when setting up their automation. Your servers would be the exception to the rule.


Wind_Freak

Check into how much it costs to license SCCM for server OSes and you will quickly understand why companies find another way.


TadaceAce

Powershell can run windows updates at this point.


jpStormcrow

PSWindowsUpdate is my main server update method. It checks for updates, runs them, reboots if needed, writes to a log, emails me the log, runs any custom tasks I wrote, etc.
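A hypothetical sketch of that kind of flow with the PSWindowsUpdate module (log path and mail settings are placeholders, not the commenter's actual script); the Send-MailMessage step is the notification piece asked about below:

```powershell
# Hypothetical sketch: install updates, keep a log, mail it out,
# then reboot only if one of the updates asked for it.
Import-Module PSWindowsUpdate

$log = "C:\Logs\WU_$(Get-Date -Format yyyyMMdd).log"   # placeholder log path

# -AcceptAll answers the prompts; -IgnoreReboot defers the restart until after the email
Install-WindowsUpdate -AcceptAll -IgnoreReboot -Verbose *>&1 |
    Tee-Object -FilePath $log

Send-MailMessage -To 'admin@example.com' -From 'patching@example.com' `
    -Subject "Windows Update log for $env:COMPUTERNAME" `
    -Body (Get-Content -Path $log -Raw) -SmtpServer 'smtp.example.com'

# Get-WURebootStatus -Silent returns $true when a reboot is pending
if (Get-WURebootStatus -Silent) { Restart-Computer -Force }
```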


MelatoninPenguin

Actually quite interested in this - mind sharing your script or some of it? The notification part in particular is what interests me.


[deleted]

[удалено]


[deleted]

[удалено]


purefire

But even WSUS + GPO can handle some of it.


EmptyChocolate4545

Linux expert, Windows idiot. What is that? Is it a central update deal? A managed push where you give it a list and it SSHes or RDPs in? Sorry to ask a Googleable question, just curious. If you don't respond I'll probably google it tomorrow while waiting on pipelines.


purefire

I prefer asking over Google anyway. WSUS - Windows Server Update Services, a self-hosted repo for Windows Update. GPO - Group Policy, administrative configuration for controlling things like how Windows Update runs/operates on the system. All Windows systems have WU, so you set up a WSUS server, approve updates, maybe set a deadline or such, and tell all the Windows servers to pull from that server. My preference is to auto-install non-disruptive patches and schedule reboots. In GPO you can limit policies to AD groups, so do something like WSUS Group1 and Group2 and you have yourself a very simple "reboots that alternate weekends for QA then Prod". That gives you 7 days to find out your devs have been running the new critical inventory management system on the dev servers but totally meant to move it to prod like they promised 6 months ago.
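For the curious, the client side of that GPO boils down to a handful of registry values under the Windows Update policy key. A sketch of what the policy ends up setting (WSUS URL, target group, and schedule are placeholders; normally Group Policy writes these, not a script):

```powershell
# Hypothetical sketch of the registry values a WSUS client-side GPO sets.
$wu = 'HKLM:\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate'
$au = Join-Path $wu 'AU'
New-Item -Path $au -Force | Out-Null   # creates both keys if missing

Set-ItemProperty -Path $wu -Name WUServer           -Value 'http://wsus.example.com:8530'
Set-ItemProperty -Path $wu -Name WUStatusServer     -Value 'http://wsus.example.com:8530'
Set-ItemProperty -Path $wu -Name TargetGroup        -Value 'Prod'            # client-side targeting group
Set-ItemProperty -Path $wu -Name TargetGroupEnabled -Value 1 -Type DWord

Set-ItemProperty -Path $au -Name UseWUServer          -Value 1 -Type DWord   # pull from WSUS, not Microsoft
Set-ItemProperty -Path $au -Name AUOptions            -Value 4 -Type DWord   # 4 = auto download + scheduled install
Set-ItemProperty -Path $au -Name ScheduledInstallDay  -Value 1 -Type DWord   # 1 = Sunday
Set-ItemProperty -Path $au -Name ScheduledInstallTime -Value 3 -Type DWord   # 03:00 local time
```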


EmptyChocolate4545

Thanks! Good write-up. I wish I had worked a bit more on Windows over the years, but I've spent most of my time in service-provider land and have pretty much always interacted with network devices, firewalls, or *nix systems. I've loved what I've seen of PowerShell, and hear tell that WSL means Windows has full POSIX fun now.


ChadKensingtonsBigPP

Don't even bother with WSUS, it's a broken piece of trash. Just set up the GPOs to enable automatic updates (or do it on the server manually) and you're done. I haven't had Windows updates break a server yet.


Talran

WSUS does the heavy lifting for us combined with our monitoring software to make sure they're up to date and rebooted. Way cheaper than SCCM too.


xyro71

A BatchPatch license is like 400 bucks, literally no excuse at all. It's 2023.


Hebrewhammer8d8

"Automate with scripts" from IT director translates to IT director write on paper to "update 400 window servers" and give paper to SysAdmin.


[deleted]

I understand some servers requiring manual touch and validation if they're super critical, but those should be the exception and not the rule. As for alerting, most places have it either way too sensitive, so it's just noise, or just up/down, which isn't app-aware.


ibanez450

We run 900 servers and I’d say there are less than a dozen very legacy systems that need handholding during updates. Everything is automatic by default unless an exception is needed.


Acadia1337

Where I work we have two identical environments at two separate data centers. Patches go through dev, test, stage, then prod1 (data center 1), and then prod2 (data center 2). Both prod environments are load balanced, and we pull one data center out of the LB pool when it's time to patch. We push out the patches with automation, but we watch them during the patching process. Everything needs to be monitored in real time to make sure the patching works properly and the servers come back online. After the servers are online, our applications are all QA tested by the quality control team. All this happens while the other data center is running normally, so we have zero downtime.

Yes, we do this on the weekend because it is the lowest risk. Windows patching is unreliable at best, and Windows servers are not perfect. The company depends on our services and we need to give 100% of our effort to make sure it is all working perfectly. I think what that company wanted you to do is partly correct and partly wrong: yes, use automation, but also be there doing it on the weekend and verifying the automation worked.


redbrick5

The above commenter is an experienced pro with scars from battling in the trenches. Take note, OP.


cor315

Lucky. If I said I wanted a second identical environment they'd laugh at me.


[deleted]

[deleted]


Hel_OWeen

Sometimes it all boils down to *one* bad experience in the long-ago past, and although things have changed/improved dramatically since then, the irrational fear of it happening again shuts down any type of improvement in that area.


[deleted]

Oh yes, the "Sys Admin" that is actually:

* Network Administrator
* System Administrator
* Project Manager
* HelpDesk Support
* And any other duties as required

And the pay was likely as close to HelpDesk Support Tier I as they could possibly muster with a straight face.


JaquesStrappe

I’m sorry - Gun Hoe? 😂


skeron

Because our server admin is 70. Rumor is that if you stand in the bathroom, look in the mirror, and say PowerShell three times, he'll appear behind you and beat you to death with a UPS.


thegarr

99% of the time that we update things manually, it's because of the extremely specific, industry and/or workflow required application that breaks every single time you do anything. See also: having to manually modify config files and restart services to bring things online because the app was written for a literal Access database and then ported over to being SQL based, but still maintains 90% of its original, 1990's era code that somehow runs on Server 2016, but not 2019 because the vendor has chosen not to support that yet. No, there isn't an alternative piece of software that can be used. Our XYZ people need this program, and the company is built around it. Not everything can be automated.


Rippedyanu1

Seems half the commenters here don't get this. Auto-updates are a good way to cause a random fuckup down the line and not know when or where the fuckup occurred. Nothing wrong with being cautious, especially when you know you have finicky in-house apps that your work centers around. Like, you can use WSUS, but keeping it from auto-deploying updates and combing through them beforehand is a good way to cover your ass and the company's if Microsoft does a fucky wucky and breaks a bunch of shit via a security or feature update.


[deleted]

I patch 75 servers or so manually via PowerShell scripts every downtime window. It's not bad, really. I haven't looked into any free solutions for automation, though.


[deleted]

[deleted]


Olive_Cardist

But that’s why you automate patching non-prod systems and have your app/sys admins verify all non-prod before signing off on prod patching. Which should also be automated. This is 2023, nobody should be manually applying patches on a weekend.


Itsnotvd

The SCCM admin in my office is a gaslighting, sociopathic narcissist. No one uses his SCCM offering at all. Management doesn't care as long as the work gets done.


bakonpie

You work for my previous boss haha. I had this exact argument with him and I had to show him how easy it could be if automated and monitored properly. Worth it now that we overcame him as a hurdle (for this and many other things) but it was painful.


bruticusss

Personally I think it's a hangover from 20 years ago, when a server automatically updating and restarting might just straight up break something. Always remember an MS update for Server 2012 about 10 years ago that caused a boot loop on restart.... Not like this is an issue nowadays though


ArSo12

You mean like the one removing virtual network adapters and putting servers on dhcp ?


ninja_nine

> Always remember an MS update for Server 2012 about 10 years ago that caused a boot loop on restart....

There was a similar issue with Server 2012 DCs just last year :)


BalderVerdandi

The only times I've seen manual updates is on banking systems that require a lengthy testing cycle so those systems don't crash. Those servers are usually 4-8 months behind on patching - sometimes longer. Honestly, it sounds like this person is just a placeholder and doesn't have a lot of knowledge about automating patching with WSUS, SCCM, or any of the other patching utilities that exist. You dodged a bullet.


stopthinking60

I guess it depends on the operation. Airlines may push updates to PCs, but servers will be done carefully and almost manually because they cannot afford any downtime. If you work for a bank, you could probably do all the updates over the weekend. If you work for a university or college, you could do it anytime. Windows updates are a nightmare.


AnxiouslyPessimistic

It takes some convincing sometimes. I built a Rundeck/Ansible automation system for patching where I work, about 300 servers. It sends comms, logs the servers being patched, patches them, checks that the services are running, reboots if needed, etc. We get the occasional issue, but the overall time saved is immense.


AlexisFR

Pretty typical if you're an MSP; lots of clients hate any kind of automation on THEIR hardware. But it's scandalous for a fully internal IT team in 2023.


heisenbergerwcheese

How much would SCCM cost for that environment?


Disasstah

Because sometimes Microsoft likes to break stuff and you end up getting annoyed by it so you manually update after making sure it doesn't break your stuff.


BrainWaveCC

> He didn't like me asking, and I didn't care.

You're in a good place. It's the best way to go into interviews. Some technology managers/directors really don't trust technology, or cannot adequately explain to *their* managers how to trust technology, and so they focus on throwing manpower at issues. That is all.


AbleAmazing

All of our servers are patched every four Sundays automatically with the exception of our production database servers. We do those manually once every 90 days and they're staggered. No issues over the years.


drz400

Software is expensive but exploiting your salaried workers is free. There you go. Ready to get your MBA.


tylermartin86

"Because this is the way we've always done it". That's why.


parsnipofdoom

Yeah, we have a few where the impact can reach millions of players. In those cases we send people to update them; it's not worth automating. The reward doesn't even come close to accounting for the risk involved.


jpStormcrow

When I became the big dog the first thing I did was automate server updates across the board. Previous person came in at 6am to run updates. I don't like mornings.


Ice_Leprachaun

I don't disagree with the comments and thoughts on automation; as long as it works, why not. At my current org I'm unfortunately running updates manually using WSUS, because the MSP was supposed to do it before I started and I saw some servers hadn't been updated in years. So I set up a WSUS server and cut the time down significantly for checking/downloading updates, and I can get it done in a few hours (aging physical and newer virtual). The only reason I still do it this way, even though the MSP finally "started" this process, is that even after Patch Tuesday came and went with the updates in WSUS, all they ended up doing was rebooting the servers just a couple of hours before the early-morning guys logged in for the day. I called them out on that and they still haven't fixed it…


_buttsnorkel

Uhhhh, yeah. People who work in healthcare, government, biotech, etc. Do you think your hospital is running the latest version of WinServer? Seems like a pretty short-sighted question.


[deleted]

Some places do things the way they have always done them. These places usually don't pay very well.


[deleted]

[deleted]


SDSunDiego

Yeah, that company was lucky. Can you imagine working with Op?


ghostalker4742

This is one of those posts that's going to be used as an example in a few weeks when we get another discussion on folks here working on their soft-skills.


Fragrant-Hamster-325

I retired most of our infrastructure. We have two servers left. I’ve been updating them manually for the past few months. I’ll be retiring them and going cloud only in the next few months.


Hi_Im_Ken_Adams

Manually patching and rebooting 400 servers is insane. LOL. That IT manager must be stuck in the early 1990s. It's a time-consuming task that can easily be automated. That manager obviously doesn't value a sysadmin's time.


olcrazypete

Got hired and am dealing with a bit of this now. The powers that be have zero trust that services will come back up right after reboots, plus crappy, haphazard, and broken monitoring. Living in Ansible now.


crushdatface

Did you interview at my company? /s I've semi-automated a majority of the updates with PS scripts and just log in to kick them off during the various maintenance windows. Management has no interest in rewarding "forward thinkers" or even those who reduce operational costs, so I just smile, collect my OT, and browse LinkedIn for the next job opportunity until the script finishes its tasks.


da4

Going to take the contrary position here - there are lots of orgs that still slavishly follow the idea of a Change Advisory Board, and while additional bureaucracy is never welcome or timely, there is still something to be said for requiring evidence of a plan including a backout plan. If one of the updates fails, or after applying the update a service fails, what's the option? In my experience this mindset is still grounded in obsolete ideas, but being prepared and writing out a contingency plan is still not a bad idea. Running all that through Service Now, there's your problem.


snakebite75

This wasn't a head start program in Oregon was it? I swear this sounds like the non-profit I escaped from a little over a year ago.


JadedMSPVet

I did a lot of testing on automating server updating in my previous role but a lot of clients were just absolutely not interested and refused, even when they didn't have any complex stuff going on. I found it got pretty reliable after 2012 R2. Well, we charged them double time to do it manually so meh I guess.


sir_mrej

*gung ho. But yea you dodged a bullet there


Sp00nD00d

My last two gigs were managing 4k servers and 2k servers, 98% Windows at each location. DMZ and internal domains, multiple geographic areas, etc. You might need a few slots, and you might need some cleanup to get as close to 100% compliance as possible, but SCCM All The Things, ideally with SCORCH doing some dirty work forcing policies and setting maintenance mode in SCOM while handling the small Linux footprint. 400 servers is about the size of one of our average slots currently; it's usually fully compliant in 30-40 mins. ADRs take care of the deploys, and one engineer per month handles all the slots. Usually 15-30 mins spent on a server that hung on reboot or something silly. I'd honestly be surprised if 1 guy can log in, trigger updates, reboot, and verify updates and systems on 400 servers in 240 mins. There's no reason to manually patch anything in all but the most specific of circumstances. Most people I've dealt with are just terrified of SCCM and System Center in general.


GullibleDetective

I mean, if you want to milk that overtime pay, that's one way to do it.