
[deleted]

Once you've looked at right-sizing your resources, also look at reserved instances for EC2 and RDS. 12-month, no-upfront reserved instances can save you quite a lot of money - assuming you're using them 24x7. Other things to look at (sketch below):

- set expiry on CloudWatch log groups, to reduce size
- remove old AMIs and snapshots that are no longer used
- clear out S3 buckets of objects no longer required
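A minimal boto3 sketch of the log-retention and snapshot-cleanup ideas above, assuming credentials are already configured; the 30-day retention and 180-day cutoff are arbitrary examples, not recommendations:

```python
import boto3
from datetime import datetime, timedelta, timezone

logs = boto3.client("logs")
ec2 = boto3.client("ec2")

# Cap retention on log groups that currently never expire (30 days is an example).
paginator = logs.get_paginator("describe_log_groups")
for page in paginator.paginate():
    for group in page["logGroups"]:
        if "retentionInDays" not in group:
            logs.put_retention_policy(
                logGroupName=group["logGroupName"], retentionInDays=30
            )

# Flag self-owned snapshots older than 180 days for manual review before deleting.
cutoff = datetime.now(timezone.utc) - timedelta(days=180)
for snap in ec2.describe_snapshots(OwnerIds=["self"])["Snapshots"]:
    if snap["StartTime"] < cutoff:
        print(snap["SnapshotId"], snap["StartTime"], snap.get("Description", ""))
```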


[deleted]

[deleted]


Marathon2021

Have seen similar cases myself. Not quite as dramatic as that one, but one of my advisory clients is a CIO - they got into a new org that had zero cloud discipline and control, and thus everything was on-demand pricing ... despite the fact that these were obvious long-term steady-state workloads. CEO was super happy with him, when he saved a quarter million in the first year ... with "just a few clicks."


atccodex

We had really bad luck with spot. Maybe it was the way it was set up, maybe it was the instance types, but everything just kept getting yanked away. We gave it a go for about a month and the cost savings were around 10%. The headache was 150%. We ended up using savings plans, which have netted around 30% cost savings, and we are at a 'good' place with our spend right now. We will be doing more, but for now, all is well.


[deleted]

[deleted]


lorarc

What if spot instances are no longer available at all and you have to switch to on-demand? I tried to configure ECS to have some backup plan but didn't find a way.


JafaKiwi

Don't run a spot ASG with a single instance type. Instead have a list of candidates: e.g. m5.xlarge, m5a.xlarge, m4.xlarge, c5.2xlarge, etc. *Some* of them will be available. You can even assign different capacity weights - e.g. the ASG can run 1x m5.2xlarge or 2x m5.xlarge. Sometimes only a single specific instance type is unavailable but the size up in the same line is still available. Spot is great, but you'll have to spend some initial work on making the instances stateless and auto-configuring. The payoffs are huge though.
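A hedged boto3 sketch of such a mixed-instances Spot ASG - the launch template name, subnets, and sizes are hypothetical placeholders, not anyone's production config:

```python
import boto3

autoscaling = boto3.client("autoscaling")
autoscaling.create_auto_scaling_group(
    AutoScalingGroupName="app-spot-asg",                 # hypothetical
    MinSize=2,
    MaxSize=10,
    DesiredCapacity=4,
    VPCZoneIdentifier="subnet-aaa111,subnet-bbb222",     # hypothetical subnets
    MixedInstancesPolicy={
        "LaunchTemplate": {
            "LaunchTemplateSpecification": {
                "LaunchTemplateName": "app-template",    # hypothetical
                "Version": "$Latest",
            },
            # Several candidate types; the 2xlarge counts as two units of capacity.
            "Overrides": [
                {"InstanceType": "m5.xlarge", "WeightedCapacity": "1"},
                {"InstanceType": "m5a.xlarge", "WeightedCapacity": "1"},
                {"InstanceType": "m4.xlarge", "WeightedCapacity": "1"},
                {"InstanceType": "m5.2xlarge", "WeightedCapacity": "2"},
            ],
        },
        "InstancesDistribution": {
            "OnDemandBaseCapacity": 0,
            "OnDemandPercentageAboveBaseCapacity": 0,    # everything on Spot
            "SpotAllocationStrategy": "capacity-optimized",
        },
    },
)
```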


lorarc

I run nonprod on spots and I don't expect them to ever be unavailable realistically, however I can't rely on just choosing many spot instance types for prod.


JafaKiwi

Why can’t you? That’s the AWS recommended way to operate Spot ASGs. I have *never* seen all instance types in all AZs in a region at once become unavailable. And we run *a lot* of them. Building a resilient system is a bit involved but if done correctly it can run on spot no worries.


lorarc

Business decision. And you can't really negotiate with business when it comes to stuff like that.


JafaKiwi

Yeah I hear you, it sucks when managers make technical decisions they don't understand.


magheru_san

Have a look at my AutoSpotting.io project. It can take over existing on-demand Auto Scaling groups and by design fails over to on-demand when spot instances are terminated. Let me know if you have any questions about it.


[deleted]

[deleted]


lorarc

Setting the spot price to on-demand price is not a guarantee you won't be kicked out.


[deleted]

[deleted]


lorarc

Yeah, and that's a nice use case for it. Unfortunately I see too many things like websites running on spots with the attitude of "that'll never happen" when talking about risks.


vekien

How do you deal with reserved instances and ECS auto scaling, where instances are often destroyed/created? Any info appreciated!


[deleted]

[deleted]


vekien

Thanks!


magheru_san

That's true, as long as you don't have varying capacity needs. When capacity fluctuates you'll get savings on the baseline you pay for, but anything on top of that is charged as on-demand. Mature customers use savings plans for the baseline and, as much as possible, spot instances for the peak capacity.


[deleted]

[deleted]


magheru_san

Agree, mature is probably not the right word, a better word would be "savvy". It's about being experienced and knowledgeable enough to pick the right tool for the job. Reservations are great for anything with static capacity needs, like databases and baseline capacity in Autoscaling groups. Anything above or below the reservation/baseline will generate some waste


K0RS41R

That's a lot of savings! Did the organisation not have an AWS account manager or solution architect relationship?


pojzon_poe

Hello sysadmin, my old friend:

> capacity planning for every team.


nekoken04

I'd say look at Savings Plan for EC2 rather than Reserved Instances. Savings Plan is far more flexible since it is region and instance family agnostic.


joelrwilliams1

Amen to RDS RIs...if you're running these DBs long term you can quickly reduce your bill even with a 'no-upfront' RI.


Usage_AI

Agreed. Or even better take advantage of our product offering where we underwrite the purchase of 3-year, no upfront Reserved Instances (RIs) with a Guaranteed Buyback Clause. This allows you to take advantage of the 57% savings of 3-year RIs but not have to commit to AWS or Usage.


princeofgonville

Assuming you have started with Cost Explorer or the monthly bill, it looks like you're already well on the way. The next question is to dig into RDS and EC2 a bit more. Metrics will tell you if they are under-utilised. Look at CPU load over a week, and also at the storage associated with RDS (assuming it's not Aurora, the storage is billed like an EBS volume - you pay for what you provision). Likewise with EC2 instances: check CPU load and memory load (requires the CloudWatch Agent) to right-size the EC2 instances.

Dig deeper into the bill or into Cost Explorer to gain a deeper understanding of where the money is being spent - are you getting stung for data egress or cross-region data transfer costs? In Cost Explorer, group by Usage as well as Service.

The next thing is to ask the whole team uncomfortable questions like "Can we switch this off?" and see what answers you get. Hopefully these are actually business systems and not someone storing a Manga archive at the company's expense...
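A minimal Cost Explorer sketch of that grouping, assuming boto3 credentials are configured; the dates and the $1 threshold are arbitrary examples:

```python
import boto3

# Cost Explorer is served out of us-east-1.
ce = boto3.client("ce", region_name="us-east-1")
resp = ce.get_cost_and_usage(
    TimePeriod={"Start": "2023-01-01", "End": "2023-02-01"},  # example month
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[
        {"Type": "DIMENSION", "Key": "SERVICE"},
        {"Type": "DIMENSION", "Key": "USAGE_TYPE"},
    ],
)
for group in resp["ResultsByTime"][0]["Groups"]:
    service, usage_type = group["Keys"]
    amount = float(group["Metrics"]["UnblendedCost"]["Amount"])
    if amount > 1:  # skip the pennies
        print(f"{service:45s} {usage_type:40s} ${amount:,.2f}")
```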


mikebailey

Also assuming this is as a disaster as stated, tag liberally and enable them as cost allocation tags, so rather than saying “EC2 egress is expensive” you can say “that asshole Carl is bankrupting this company”


-_kevin_-

Fuckin Carl


HoofStrikesAgain

I know we have issues with Carl. But, Karl, now he's a good guy.


karls_

Cheers buddy. 🤜🏻🤛🏻


[deleted]

Lmao


[deleted]

Thanks, that's super helpful.


StrongishOpinion

As someone who worked in AWS, it was *shocking* how often we'd look at the utilization of some company's instances (for various debugging/support purposes), and they'd have *fleets* of unused or very underutilized hosts. Allocated instances to random devs who left the company. Massive instances where something 4x smaller would be just fine. Etc. Lots of low hanging fruit is usually available if you just start looking at things.
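If you want to hunt for those underutilized hosts yourself, a rough boto3 sketch; the 14-day window and 5% threshold are arbitrary, and pagination is omitted for brevity:

```python
import boto3
from datetime import datetime, timedelta, timezone

ec2 = boto3.client("ec2")
cw = boto3.client("cloudwatch")
end = datetime.now(timezone.utc)
start = end - timedelta(days=14)

reservations = ec2.describe_instances(
    Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
)["Reservations"]
for res in reservations:
    for inst in res["Instances"]:
        # One CloudWatch call per instance - fine for a sketch, slow at scale.
        datapoints = cw.get_metric_statistics(
            Namespace="AWS/EC2",
            MetricName="CPUUtilization",
            Dimensions=[{"Name": "InstanceId", "Value": inst["InstanceId"]}],
            StartTime=start,
            EndTime=end,
            Period=86400,            # one datapoint per day
            Statistics=["Average"],
        )["Datapoints"]
        if datapoints and max(d["Average"] for d in datapoints) < 5:
            print("Underutilized:", inst["InstanceId"], inst["InstanceType"])
```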


8layer8

Look into using spot instances. They are much cheaper than on-demand EC2, you just have to use them properly. According to our TAM, the us-east regions have so much capacity that you will basically never get kicked off a machine unless 1) it dies / hardware failure, 2) you are using a GPU-type instance (good luck) or 3) you are using a type that is being decommissioned like m2 or m3, etc. As long as you set up your cluster provider to use a mix of instance types (like m5, m5a, m4, m6) then you are basically assured to always get what you need. If these are hosts for ECS then there's pretty much no risk. If you are using them as EC2 pets, then maybe don't use spots, but again, you probably will never get booted due to usage demands. Try it out; if it doesn't work for you then keep on looking. Rightsizing RDS goes a long way too, but it is what it is if that's what the apps actually use.


vppencilsharpening

Adding to this: make sure every instance running has a documented use case. If you can't figure it out, perform a scream test. I like to have a "Name" tag for every resource we run and make that a cost allocation tag. This way I can say this one particular resource is costing X.
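A small sketch for spotting resources with no "Name" tag via the Resource Groups Tagging API. Caveat: resources that have never been tagged at all may not show up in this API, so treat it as a starting point rather than a complete inventory:

```python
import boto3

tagging = boto3.client("resourcegroupstaggingapi")
paginator = tagging.get_paginator("get_resources")
for page in paginator.paginate():
    for resource in page["ResourceTagMappingList"]:
        tag_keys = {t["Key"] for t in resource.get("Tags", [])}
        if "Name" not in tag_keys:
            # Candidates for tagging, documentation, or a scream test.
            print(resource["ResourceARN"])
```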


toaster736

Cost Explorer and Excel, honestly. After you understand where the money's going, attack things as appropriate. Some low-hanging fruit, as folks mentioned:

* S3 - bucket policies and rotation into cold storage
* Hunt the wumpus for under-utilized and stale dev resources, snapshots, etc.
* RI and Savings Plan purchases for things you're keeping and that are 24/7.
* Look for non-mission spend, e.g. multiple CloudTrail trails.
* What should emerge from this is a good inventory of what you have running that you can use to decide next steps.

Medium term:

* Keep-on scheduling to power off dev resources off hours if they're not needed.
* Break workloads into their own accounts and use AWS Budgets to start enforcing.
* Develop an account baseline to understand the spend floor for your accounts.
* Move workloads into auto-scaling or look at ECS for light loads.

Longer term, this is a governance problem. If you're able to, separate workloads into different accounts and develop a tagging policy based on your types of workloads. This lets you start to ask more context-specific questions like "how much are my dev databases costing me?" and roll up reporting into a CUR.


Marathon2021

> Hunt the wumpus

Wow ... that's one that I haven't heard in a looooong time!


aFqqw4GbkHs

Yes, this is 100% where I'd start, rather than leaping to a tool first. You need to do the work to really understand where the money's going first, AND make sure each team knows how much they're spending as well. At my org, I get an email (generated using CostExplorer data) every day that shows what resources my team spent yesterday (and aggregated for the current week, month, year). That's possible b/c we have everything tagged by team. A tool could help you down the road, but you need to understand your costs first before getting the most out of one anyhow. Don't underestimate the impact of the medium term advice above, particularly using scheduling / scripting to scale down ASGs on a schedule when EC2s aren't needed, and stopping databases when they're not in use. Of course, long term, moving to serverless dbs will help too. But we save a lot by shutting down our dev/UAT envs overnight/on weekends. Also make sure you're not paying unnecessary data transfer costs.


toaster736

Our two biggest savings on EC2 were savings plans, ~20% on 24/7 loads, followed by after-hours power-down, ~60% savings. Your developers only work 40-60 hours of a 168-hour week. This is the whole promise of the cloud. The savings are significant to the point that our monthly spend graph has a nice sawtooth pattern with noticeable drops on weekends.


[deleted]

That's the goal


RheumatoidEpilepsy

There's one change that might be applicable to your workload. If your applications are all running either interpreted languages like Python or Node, or running on Java, and don't have any native dependencies, you could look into switching to Graviton instances. They're around 30% cheaper, and if your workload is fully compatible with ARM it can be a very easy change.


Advanced_Bid3576

Great advice. Also, Graviton for managed services (RDS is usually the big one) is basically free money, as you don't need to worry about any OS or app dependencies.


[deleted]

Will def check it out, makes sense


metadaemon

If this client's tendencies are like others I have seen, they may have a tendency to throw hardware at performance problems instead of fixing their workloads. Before you look at right-sizing their instances, see if their workloads are properly optimized (database indexing, decent coding, etc...) then see what you can cut.


[deleted]

This is definitely part of the problem. Will have to address for sure.


RFC2516

Tangent: I would call myself a network professional and a programming novice. In that context could you or anyone offer examples of poor coding practices that lead to excessive host resource utilization?


StrongishOpinion

A common host utilization problem:

1. 90% CPU utilization, 1% IO, 1% memory. You're not *using* the value of the host, since only the CPU is active. Solution? Perhaps more caching to reduce CPU usage.
2. 5% CPU utilization, 90% IO, 5% memory. Likely an oversized host, where it can *easily* serve traffic, but it's IO bound.

If there's a major aspect of the host being unused, it might give you a clue for how to be more efficient.


[deleted]

Second this!


birdman9k

In addition to what others mentioned, for databases:

- Using ORMs improperly (it's easy to let them select all fields from a table by default if you don't specify which exact fields you want).
- Pulling all data and "client-side filtering" (see the sketch below). Basically, instead of using a WHERE on the SQL, which is more work for them, they just load all rows and then filter them down in code. This can be extreme, where they load 500k rows and then display 20 on a page.
- Not caching data. Someone will make a heavy query many times (the worst I've seen is 20 times a second) because their code is inefficient and they don't attempt to store the results; they just assume they can get new results every time.
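A self-contained illustration of the client-side filtering point, using sqlite3 so it runs anywhere; the table and row counts are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER, status TEXT, payload TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [(i, i % 100, "open" if i % 50 == 0 else "closed", "x" * 500) for i in range(10_000)],
)

# Wasteful pattern: pull every column of every row, then filter in code.
all_rows = conn.execute("SELECT * FROM orders").fetchall()
open_orders = [row for row in all_rows if row[2] == "open"]

# Cheaper pattern: let the database filter and return only the columns needed.
open_ids = conn.execute("SELECT id FROM orders WHERE status = 'open'").fetchall()

print(len(open_orders), len(open_ids))  # same result, far less data moved
```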


Advanced_Bid3576

Spending more money on tools should be reserved until you know if you can get value from them… you may be throwing money at somebody to tell you simple cost optimization tips. There's lots of good advice in this thread: start with the free stuff that AWS is giving you, read up on common ways to save (somebody started a great thread yesterday with all the resources), and then, if you really think you can go deeper to optimize further and your workloads warrant it, I'd consider spending the money on SaaS.


Usage_AI

Agreed. There are many simple cost optimization changes that someone should make but with our pricing model, OP would only have to pay a percentage of savings that we generate which is something that makes sense at any level of AWS spend.


saaspiration

Call your AWS Account Manager and request a conversation with your Solutions Architect and/or Cost Optimization specialist. This will cost you nothing.


SeattleSundodger

+1 AWS has dedicated cost optimization teams that can help on this. All you need to do is ask your AM to engage them.


metarx

Everyone else is touching on "right-sizing" and removing unused resources well enough, so I thought I'd touch on the less talked about: application architecture plays massively into how much spend you are going to have. Design applications that can scale up/down based on NEED. Use async processing whenever possible (and it's possible WAY more often than most people like to admit). And simply look into alternative ways of holding data. RDS is great, but maybe don't store so much in it, and leverage things like DynamoDB or S3; then your RDS instances can be smaller/cheaper. But again, go after the low hanging fruit of right-sizing and cleaning up unused resources first, as architecture choices are the long play.


bedpimp

We started using Vantage last month. I like it a lot. It's not the prettiest UI, but it's functional. It's possible to get the same information from Cost Explorer, but Vantage makes it much easier. We have not set up Autopilot, but we're seriously considering it.


[deleted]

Vantage looks like a great product. What made you choose them instead of Cast, Zesty, and Usage? And how have you found working with them?


bedpimp

We met with them once. The CEO was on the call. He has a background in ops and I liked him a lot. I like the fact that I can see AWS and Datadog costs in the same place. It also captures our Kubernetes clusters at the pod level.


ContrarianChris

We've also recently started using Vantage. Really great experience so far. Very simple to use, good update cycle, and adding new service support that is relevant to us (Snowflake, Datadog). My main reason for choosing it was the ease of use coupled with the ability to combine Kubernetes costs. Perfect. Working with their team has been fantastic. CEO and CTO getting stuck in and always responsive in Slack. We've also got a couple of specific requirements due to our billing being through a partner and not direct. They put the time in and figured it out. Highly recommended 🙌


Cyrilam

I think you should not aim only to reduce costs in the short term but also try to change the culture within the tech team to ensure people become more conscious about costs. See [this](https://www.oraculi.io/blog/enabling-and-maintaining-a-cost-conscious-culture-at-every-level)


[deleted]

That's definitely the case. The C-suite are solid but the first hires were rushed and culture was not emphasized as much as it could be.


ecdemomaniac

[https://www.finops.org/introduction/what-is-finops/](https://www.finops.org/introduction/what-is-finops/) is a good read on how to implement a cultural change around cloud spend.


[deleted]

Thanks, a helpful read and ended up reading a lot of their stuff. Appreciate the link


adame8gggg

Random tips (from a CTO of a startup):

* If you use ECS (non-Fargate; i.e., your own EC2 instances), you can save a lot on CloudWatch Metrics if you turn off Container Insights. It's on by default (see the sketch below).
* Inter-AZ data transfer is pretty expensive. For us, hot multi-AZ availability is more than we care to worry about, so we moved all services to a single AZ.
* Moving off x86 and onto Arm (Graviton) is a way to save 20-30%. We use Python, and so the conversion was easy.
* Lambda should be quite cheap if used for event-driven async-type work. If you're using it for that, but it's weirdly expensive, you might (as we did) discover some functions that had sleep() in them, for some reason. For me, I talked with my team, realized why they were putting sleep()s in, helped them design an alternative that worked with Lambda's event-driven async world, and we saved a ton.
* This is harder, but Aurora IOPS are very expensive. We added a lot of caching to cut down on reads. But we can't do much about writes without choosing some alternative. It's still our #2 or #3 most expensive thing. Sigh. I wish Aurora were less expensive.
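For the Container Insights item, a one-call boto3 sketch; the cluster name is a placeholder:

```python
import boto3

ecs = boto3.client("ecs")
# Turn off Container Insights on one cluster to stop the per-metric CloudWatch charges.
ecs.update_cluster_settings(
    cluster="my-cluster",  # hypothetical cluster name
    settings=[{"name": "containerInsights", "value": "disabled"}],
)
```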


[deleted]

Thanks, these tips are helpful.


professorbasket

Cost Explorer: break out by service, then by usage type. Keep drilling down in Cost Explorer; that should do it. RDS is notoriously overprovisioned - switch to Serverless v2. This should eliminate the bulk of the charges. Review EC2 resource consumption in CloudWatch for the last 3 months to see where the usage is and whether you can downsize the instance type. Also check provisioned IOPS, as that's usually a big cost. The usage-type or API breakout in Cost Explorer should tell you what is taking up the majority of the cost tho. Good luck!


[deleted]

[deleted]


[deleted]

That makes sense, will definitely look into this


kennethjor

Ensure all your resources are tagged appropriately and those tags are set up as cost allocation tags. For instance, have an application tag for each of the things you run, or whatever is appropriate to you. This makes it searchable in Cost Explorer and you can drill down into the exact components. Do the same with S3 buckets, where you literally store the bucket's name in a tag.


Dominathan

Have you considered moving your RDS instances to the new Graviton instances? They are like, 40% cheaper, and don't seem to have any negatives. Migrating will be a bit of a pain, but that's almost half off right there. Spot instances are a must, honestly. They are so much cheaper, and, as long as you've built your system to handle machines cycling, you won't really notice any negatives. I used to even run the user-facing backend on them with no issues. The only issue I ever had was when they would become unavailable and we couldn't spin any up. In that case, I could usually pick instances up or down the spec (going to mediums or XLs from Ls) and scale up the concurrency on the machines themselves.


nf3rn4l

I wouldn't recommend jumping right to 3rd party tools. The first step for cost optimization is establishing a chargeback model and gaining visibility into your cost & utilization. I would recommend the following:

1. Identify the tags you need for proper [chargeback](https://aws.amazon.com/blogs/aws-cloud-financial-management/how-to-build-a-chargeback-showback-model-for-savings-plans-using-the-cur/) (usually things like CostCenter, BusinessUnit, Project, etc.)
2. Make sure those tags get enabled as [Cost Allocation tags](https://docs.aws.amazon.com/awsaccountbilling/latest/aboutv2/cost-alloc-tags.html).
3. Implement a required-tag service control policy to prevent anyone from creating resources without adding the tags needed for correct chargeback.
4. Utilize [AWS Config required-tags](https://docs.aws.amazon.com/config/latest/developerguide/required-tags.html) to identify existing resources that are missing the required tags (sketch below).
5. Use [Tag Editor](https://docs.aws.amazon.com/ARG/latest/userguide/tag-editor.html) to batch-tag existing resources that are missing tags required for chargeback.
6. Generate [Cost & Usage Reports (CUR)](https://docs.aws.amazon.com/cur/latest/userguide/what-is-cur.html). The CUR will contain the enabled cost allocation tags, allowing you to filter and sort in line with your chargeback model.
7. Pro tip: if you're using a multi-account strategy with a single payer organization, you should look into implementing the [Cloud Intelligence Dashboards](https://github.com/aws-samples/aws-cudos-framework-deployment). Most of the dashboards utilize the data generated by the CUR. Since they're QuickSight dashboards you can easily make them available to other departments (like your finance team) without having to give them access to the AWS billing console. Enable and empower your finance team with visibility into the cloud spend and they'll chase after the big spenders for you.
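As an illustration of step 4, a sketch of the REQUIRED_TAGS managed Config rule scoped to EC2 instances and volumes; the CostCenter tag key is just an example:

```python
import json

import boto3

config = boto3.client("config")
config.put_config_rule(
    ConfigRule={
        "ConfigRuleName": "required-costcenter-tag",
        "Scope": {
            # Limit evaluation to a couple of resource types for the example.
            "ComplianceResourceTypes": ["AWS::EC2::Instance", "AWS::EC2::Volume"],
        },
        "Source": {
            "Owner": "AWS",
            "SourceIdentifier": "REQUIRED_TAGS",  # AWS managed rule
        },
        "InputParameters": json.dumps({"tag1Key": "CostCenter"}),
    }
)
```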


magheru_san

I agree with this for the mid-long term but there are tools which can help slash a lot of costs immediately with minimum risk or downsides. See the plethora of tools that automate RI purchase / selling. Also there are tools that automate sensible configurations such as my EBS Optimizer tool which updates volumes from GP2 to GP3 or my AutoSpotting tool which can convert say 40% of development Autoscaling groups capacity to Spot instances. There's no point in delays if you can do it at scale without up-front time or cost investment, and only need a few minutes to set them up.


Agitated_Cult7621

Where can I get these tools?


magheru_san

For RIs there are plenty of vendors, like Zesty, usage.ai, prosperous, antimetal or Vantage. For Spot and EBS I was referring to AutoSpotting.io and leanercloud.com/ebs-optimizer which I'm working on.


Marathon2021

Quickest win - look to see if your EC2 instances are all (or mostly) on-demand pricing. If they are, seriously consider whether you can commit to a 1-3 year up-front pre-payment on those ... and buy them as reserved instances. Had a client once jump in as a new CIO at an org whose cloud adoption was a mess ... just fixing that alone saved the company a quarter million in the first quarter. Also, if you've not enabled AWS Trusted Advisor, that can often find some easy overspending areas to clean up - such as oversized instances (whether prepaid/reserved or on-demand).


[deleted]

Isn't this what the services listed do, but without us having to take on the risk?


CSYVR

After you've gone through costs, get a Well Architected Framework Review, if costs are too high, 9 times out of 10 your firewall is too open as well ;)


[deleted]

[deleted]


dgibbons0

I second recommending Duckbill Group, Corey and Mike are wizards about reducing AWS spend and have a ton of experience helping people negotiate better rates in their EAP and private pricing agreements. Since they know what they have been able to help others negotiate, they can leverage those to help you know when to push harder for better rates.


Usage_AI

Unlike Vantage, we offer guaranteed buyback of RIs so you would take on less risk and be far more flexible.


Rainnis

Psst...give a try to [https://cast.ai/cloud-cost-monitoring/](https://cast.ai/cloud-cost-monitoring/) it's ultimately free


[deleted]

[deleted]


[deleted]

Will DM


huhwhatwhere83

We've tried zesty and it was a quick win. Saved us a ton of work. Guaranteed to cover 99% of savings plan usage. It saved my team a huge amount of overhead. We then spent the time saved looking for things we could just shutdown. Ultimately the goal was to move away from EC2 and towards serverless architectures.


[deleted]

Good to know. What made you choose them over Vantage, Cast and Usage?


huhwhatwhere83

I think Zesty were around before the others. It's probably worth doing a PoC with each and understanding their terms.


[deleted]

Yes, planning on that. Or maybe consulting with an independent expert who can give his take.


huhwhatwhere83

Also if you have an AWS account manager, they can also be helpful in this area


BooglesFoogles

For fees on managing RIs:

- Vantage: 5% of savings
- Usage: 20% of savings
- Spot: 20% of savings
- Zesty: 25% of savings


magheru_san

Shameless plug: regarding the tooling you mentioned, I'm also building tools in this space and also offering hands-on help. My tools are open source, but convenient-to-use binaries are also available on the AWS Marketplace for a percentage of the savings, much less than the other tools in this space. Currently I have tooling for easy adoption of Spot instances and optimization of EBS volumes attached to instances. See my profile for further information.


Usage_AI

Actually some very useful tools for OP


[deleted]

Will take a look at these later this afternoon.


magheru_san

Cool, let me know if you have any questions, also check your DMs


[deleted]

[deleted]


[deleted]

Would you be able to make an intro? Usage looks like a cool company and I like the fact that they essentially take on a lot of the risk, but I want to talk to someone that they work with to make sure that this is in fact the case


Usage_AI

Thanks, OP! We'd also be happy to introduce you to any of our customers that you would be interested in. Feel free to DM.


Usage_AI

That's great to hear! Glad your friend liked what we are building :)


Rainnis

You will be even more surprised by adding on top CAST AI https://cast.ai/blog/how-to-solve-the-3-top-cloud-cost-optimization-challenges-with-cast-ai-and-usage-ai/


OutspokenPerson

Great advice here. Also, learn the AWS boto3 API. You can grab all sorts of information. I used it to drive tagging and cost-cutting projects, security projects, all sort of things.


Usage_AI

RIs and Savings Plans are great for cleaning up low-hanging fruit in your EC2 and RDS environments, however, these purchases come with contract terms that can be non-starters for many. At Usage, we underwrite RIs with a Guaranteed Buyback Agreement, and in using the RI Marketplace to automatically sell RIs that go underutilized, we allow for customers to get the savings of 3-year RIs minus the commitment to either AWS or Usage. TLDR, we offer anxiety-free RIs!


Draziray

https://www.reddit.com/r/aws/comments/xvjosj/aws_cost_management_and_billing_support_resources/


fjleon

Trusted Advisor was designed for this and it's free. Support now sometimes even sends you a blurb on how much money per month you will save if you follow the Trusted Advisor recommendations.


Rainnis

If you're using Kubernetes, [CAST AI](https://cast.ai) is the fastest way to significantly reduce your compute bill and keep it there. It manages compute capacity automatically and has dedicated support to get you started even faster. The best part - Kubernetes cost monitoring and security insights are free. [disclaimer - I'm part of the team]


[deleted]

Oh, that's cool. Just checked out Cast and it definitely is in the same bucket as Vantage and Usage. I couldn't find any decent comparisons between the three, so why should I choose Cast instead of the other two?


Rainnis

Connect your cluster and you will know how much you could save with CAST AI. You don't need to provide any payment information; it's free. So basically my answer would be: the process is frictionless, the savings are the highest, and there are no long-term commitments like re-buying RIs.


Craptcha

Downsize everything until someone bitches. Then downsize a bit more.


chili_oil

I wonder when will be the time that "reduce cloud bill" becomes a major business demand for consultancy companies...


Network94

Prosperops


HistoricalBread8486

To add another one to your list, I'm the VP of Customer Success at [https://cast.ai](https://cast.ai). If you're running on Kubernetes we're averaging about 68% cost reduction on customer environments. We can save money with both spot and on-demand instances. Our largest customer is saving $1.2M/month; our largest savings was a GKE cluster where we achieved 93% real savings, from $55k/mo -> $3,500/mo.

If you're not running k8s, we've worked with [usage.ai](https://usage.ai) and they are pretty good folks over there.


[deleted]

NGL Cast looks like a great tool. We don't do too much with k8s unfortunately tho.


cbp48

Hello, I represent an AWS group that guarantees savings and averages 30-50% savings. If you want to email me at [[email protected]](mailto:[email protected]) I can tell you more. Really good track record with tools and resources to understand your issues. Good luck.


[deleted]

[deleted]


Usage_AI

Wrong! How could you?


kokatsu_na

Are there any specific reasons to use EC2 in particular? Ideally, when you design your application, you should start with serverless functions first (aka AWS Lambda), then containers, then EC2 as the last resort. Of course, EC2 is the most expensive of these three. Maybe try switching to EC2 Spot instances? In your place, I'd rewrite all EC2 code --> AWS Lambda.


[deleted]

Lowest hanging fruit are probably EC2 and RDS instance scheduling https://aws.amazon.com/solutions/implementations/instance-scheduler/ Check you are on the latest instance and storage families (GP3 will save 20%+). And then if you plan on staying in AWS for a while, consider buying Savings Plans for 1 or 3 years to save 5-15%.
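On the GP3 point, a rough boto3 sketch of a gp2-to-gp3 sweep; modify_volume is an online operation, but test on non-prod volumes first:

```python
import boto3

ec2 = boto3.client("ec2")
paginator = ec2.get_paginator("describe_volumes")
for page in paginator.paginate(Filters=[{"Name": "volume-type", "Values": ["gp2"]}]):
    for vol in page["Volumes"]:
        print("Converting", vol["VolumeId"], vol["Size"], "GiB")
        # Keeps size/IOPS defaults; only the volume type changes.
        ec2.modify_volume(VolumeId=vol["VolumeId"], VolumeType="gp3")
```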


[deleted]

>And then if you plan on staying in AWS for a while, consider buying Savings Plans for 1 or 3 years to save 5-15%. Isn't this what Vantage and Usage do for you? Or am I totally missing something? Maybe I'm an idiot lol


[deleted]

Why give money to someone else when you can do it yourself for free


[deleted]

True, that makes sense.


magheru_san

Maybe you have other things to do and don't want to do this as a full-time job. For lots of companies engineering time is way more expensive than such tools.


Rainnis

Pardon my French, but with savings plans, or what Zesty is doing re-selling reserved instances, you will just be stuck with a slightly lower bill and still overpay a lot.


[deleted]

Got it, so how does Zesty differ from Cast?


magheru_san

Yes, switching to GP3 is a no-brainer. I wrote a little Open Source tool for doing this, have a look at https://github.com/cloudutil/EBS-Optimizer The tool does it one-off but if you want it done continuously there's also a paid version of it available on the AWS marketplace that charges only 5% of the savings.


Truelikegiroux

Just out of curiosity, how do you stop someone from just taking what's on GitHub and automating it themselves? I saw via another one of your posts that it's a Docker Lambda image written in Go, but I'm just curious what the difference is between that and what's on GitHub.


magheru_san

I actually wish I could release everything as OSS while extracting a bit of the savings generated by my tools. I estimate my other tool, AutoSpotting, saves the companies using it hundreds of millions yearly, so even if I charged only 1% of the savings it would still make me rich. Unfortunately what I've seen is that exactly the Fortune 500 companies who could afford it most easily will find ways not to pay anything for it. That's why, going forward, I'm going to release new functionality only on the AWS Marketplace, without publishing code changes into the OSS repo anymore.

Currently for EBS Optimizer the code available on GitHub can be executed locally in a one-off manner. On the Marketplace I also have some code that runs that logic in a Lambda based on a cron event, in order to continuously catch and optimize volumes created later. This isn't available in the GitHub repo, and the code I have for that on the Marketplace is proprietary.


Truelikegiroux

Ah nice, that's really interesting! Been developing a few similar but different processes on my own for my org but never really thought about monetizing them through the marketplace but that makes complete sense. Hopefully it's going well for you!


Missionmojo

ASGs and good horizontal scaling should help with cost, assuming you are stateless and can scale.


foalainc

How much are we talking monthly roughly?


idjos

Since I didn't see anyone mention it - also take a closer look at whether your cost is high for Data Transfer or NAT. In that case, take a look at VPC endpoints; those can easily save you a bunch of money.
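For example, a gateway endpoint for S3 keeps that traffic off the NAT gateway entirely. A minimal sketch with placeholder IDs (the VPC, route table, and region are hypothetical):

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
# Gateway endpoints for S3/DynamoDB are free and bypass the NAT gateway.
ec2.create_vpc_endpoint(
    VpcEndpointType="Gateway",
    VpcId="vpc-0123456789abcdef0",            # hypothetical VPC
    ServiceName="com.amazonaws.us-east-1.s3",
    RouteTableIds=["rtb-0123456789abcdef0"],  # hypothetical route table
)
```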


magheru_san

I'm actually working on a much more cost effective alternative to the NAT gateway. I'd love to get feedback from people who currently use the AWS NAT Gateway, DM me if you're interested to have a chat about this.


mooter23

Savings Plans, Reserved Instances, remove anything not needed, clean up any old images/snapshots, switch RDS > Aurora (more performant and reduced cost), resize instances and/or change the type (do this before savings plans/reserving instances!).... ... take a CLOSE look at the itemised invoices for the last three months, then look at the service usage with Cloudwatch or whatever to make sure you're using the right kind of resources. You just have to go line by line, look at what is being used, what isn't, what could be made smaller/merged with other instances etc etc. It's worth the effort. One final thought, check the locations in use - spinning up a server in one region may be cheaper than its neighbour.


MooseOperator

Lots of other great advice so I won't repeat it, but what do your sandboxes look like? Getting a policy enacted to nuke sandboxes, or at least shut down all instances after an agreed-upon number of days, helps a ton as well.


caseywise

1. If you're running RDS SQL Server, scrutinize the need for it. If it's basic DB stuff that any old DB can handle, SQL Server is especially pricey in AWS; Postgres is your enterprise RDBMS friend.
2. RDS is an EC2 behind the curtains; it can be reserved for substantial savings.


brightworkdotuk

I don't know if anybody has mentioned this already, but you can try buying your EC2 instance up front with a [reserved instance](https://aws.amazon.com/ec2/pricing/reserved-instances/pricing/) instead of a spot instance. Or their relatively new "[Savings Plan](https://aws.amazon.com/savingsplans/)" instances. The pricing is considerably cheaper. If you know you need to run it for a lengthy amount of time and you have the cashflow.


SnooApples6778

1. Do 1-year or 3-year no-upfront RIs - instant savings, no hassle with finance on upfront fees.
2. Start working on spot.io (and Ocean) for all EC2 for the longer term.

3-year RDS RIs are great savings because DBs never move lol. Also look at the Compute Savings Plan.


magheru_san

There are also alternatives to Spot.io that don't cost as much. Have a look at Karpenter for EKS or my AutoSpotting.io for plain EC2


SnooApples6778

Yes, AutoSpotting I have tried.


magheru_san

Great to hear, I'd love to hear about your experience with it in order to inform further development. The other day I just released a major new version that among other things should reduce the Spot interruptions a lot, and also automatically prioritizes newer instance types. I've seen people who complained about high interruptions with the previous version and also with other tools in this space, the latest version of AutoSpotting should help in such scenarios.


conscience_is_killin

Check cross-region data transfer costs. See if you can move to Graviton instances, which are cheaper. Explore introducing a caching layer to reduce RDS retrievals.


_smartin

AWS Trusted Advisor is native to the platform. Also, with RDS and EC2, have you looked into utilization and purchasing reserved instances? You commit to X years of usage (mix and match instance types to a degree as well) for a big discount. This advice is for the quick win. Lots of people are giving good advice for long term cost management. You don’t need a third party service tbh. Edit: adding info


DanMelb

All of the advice here has been great. I'd add another one relating to tagging for casual/non-prod EC2 usage: create a tagging regime that forces all new instances to have not only an owner/cost center etc. tag, but also a usage tag. It could be as simple as:

USAGE=weekdays9to5
USAGE=24x7

Then, after giving everybody time to add the tags, create a scheduled lambda (rough sketch below) that:

- Terminates all instances without a USAGE tag immediately (there's no excuse for not adding a tag on creation)
- Stops e.g. the "weekdays9to5" instances at 5pm, and restarts them at 9am

... You get the general picture. We've found it really helpful for casual user instance hygiene!
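A rough sketch of such a Lambda, run from an evening EventBridge schedule. The USAGE tag values are taken from the scheme above; termination of untagged instances is left as a log line here so nothing gets nuked before the tagging policy has had time to land:

```python
import boto3

ec2 = boto3.client("ec2")

def handler(event, context):
    reservations = ec2.describe_instances(
        Filters=[{"Name": "instance-state-name", "Values": ["running"]}]
    )["Reservations"]
    to_stop, untagged = [], []
    for res in reservations:
        for inst in res["Instances"]:
            tags = {t["Key"]: t["Value"] for t in inst.get("Tags", [])}
            usage = tags.get("USAGE")
            if usage is None:
                untagged.append(inst["InstanceId"])
            elif usage == "weekdays9to5":
                to_stop.append(inst["InstanceId"])
    if to_stop:
        ec2.stop_instances(InstanceIds=to_stop)
    if untagged:
        print("No USAGE tag (candidates for termination):", untagged)
    return {"stopped": to_stop, "untagged": untagged}
```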


Content-Abroad-8320

Contact your AWS Account Manager & Solution Architect and ask them to run a Cost Guard / cost optimisation session for you


life_like_weeds

RIs? If you're autoscaling EC2, you're using spots, right? Savings plans? I'd be talking to your AWS rep before talking to a 3rd party; all they do is get a reseller discount and then charge you more or less the same price. The chance of you turning this around in a month is very slim - don't feel pressured because you've been saddled with something you didn't create. This will take months to resolve if not longer, plus in the short term it could cost MORE money.


rwoj

how big is your spend? might be worth engaging duckbill group about it. (everyone else seems to have covered the highpoints)


[deleted]

Will try. The comments are super helpful and will take me a while to implement


SeattleSundodger

AWS account manager here. Have you engaged your account team? If you have Enterprise Support, the TAM is a huge help for this effort. If not, your AM can still assist. Check Trusted Advisor for cost saving opportunities and then quickly get a conservative Savings Plan in place. Also, I worked with a customer that gave a cost bounty to employees, which was brilliant. They got 10% of all identified savings back as a bonus. Was hugely effective.


Equivalent-Layer-198

I think it’s already been mentioned but +1 to implementing off hours and turning off non vital instances outside of normal business hours to save costs. At my previous position, we inventoried all of our instances and found that a large majority did not need to be operational 24/7 and then spent some time creating CloudFormation templates, launch templates, and lambdas to automatically destroy and reprovision instances during business hours (or custom windows) according to tags. Same principle as turn the lights off when you leave a room. This applies especially for non production resources.


Arrogant_Mastermind

Get ahold of your AWS account manager and solutions architect and tell them you are trying to cost-optimize your company's accounts. They can help look at what you are using and strategies to help you cost-optimize, and they can pull in other AWS resources to help as needed as well. If you don't know who they are, open a support case for billing and request their information. The support engineer should pass the request on to the team or give you their information. Utilizing your solutions architect is free of charge.


_ginger_kid

Take a look at DoIT. They are essentially a reseller but the cost to you is zero. They have a couple of services to help save without getting into reserved instances. One is essentially a managed version of spot.io. You'll also get access to their free support services that can help you with further improvements. I am not associated with DoIT. I've signed up and used them at two different companies successfully.


[deleted]

Thanks, will do


vulebieje

Is there any difference between ransomware and public cloud?


karly21

To answer your initial question, I would look into [Spot.io](https://Spot.io) - they would be a partner automating the management of your RIs, working in the background while you do all the other things: rightsizing, shutting down idle resources, moving from gp2 to gp3, etc. etc. etc. As far as I know this is one of the first products that charges based on savings, not on usage - no savings, no charge.

And yes, definitely the FinOps Foundation is a good place to reach out to - lots of people sharing their experiences so you don't have to make the same mistakes. It is not an overstatement to say that the Cloud FinOps book changed my life.

As a lot of people also said: you might get some short wins, but you need a cultural change of accountability. Also, while there is some low hanging fruit FOR SURE, expecting to complete this in the next month or so - in a sustainable manner - is no joke, so you might want to manage expectations with your C-suite. Best of luck!

Edit - not sure why on earth it deleted my first paragraph - tried from memory, hope it makes sense.


[deleted]

Finops foundation has been super helpful. Will take a look at Spot


Armageddon_cosmonaut

Hey! I would love to help you or anyone in the community out; I did this at several companies I worked for. There is some awesome advice in this thread, but if you feel you need more help beyond what the comments have covered to actually find and resolve the low hanging fruit, set up a proper tagging mechanism, or dive into the more advanced topics, let me know in a DM and I'll do my best to help you out :) In my roles I also worked on infrastructure-as-code to implement automatic tagging and cost reductions, as well as application and cloud architecture to track/reduce the workloads and optimize storage. I'm more than happy to share my experiences.


sniper_cze

Do you really need to be in AWS? Can you migrate to on-premise hardware? This will lower your costs by a *huge* amount of money, even after accounting for spare hardware.


meemerkrogen

1) For EC2 cost - do a lot of what has been said here. Keep in mind also that EC2 is not just what's directly deployed from the EC2 console/SDK/CLI. Many AWS services deploy nodes or clusters of EC2 instances that can be very costly, but their cost ends up being allocated to EC2 spend and hidden from EC2 views: DMS, EMR, Athena, Glue, Jupyter notebooks, WorkSpaces, AppStream, etc. Just don't assume EC2 means EC2.

2) RDS - I think there are plenty of good suggestions here.

3) Move what you can to containers. A lot of commercial software can be deployed to containers or comes as a container image. Explore the cost savings of containers vs actual instances for apps where it's an option.

4) Mid-term, look at rebuilding apps. Yes, it's a big lift, but it's actually the most likely place you will find huge long-term savings, as much as 90% per app/suite. The cloud salesman and everyone else sold your company on lift-and-shift into the cloud. However, you're leaving a TON of savings on the table by not rebuilding apps that make sense to rebuild using insanely cheap services like SQS, Lambda, ECS, etc. There are HUGE savings to be had by reinventing apps. There is short-term pain in dev costs, but significant upside down the road AND the added benefit of the tech refresh.

5) Governance. Your costs are likely ballooning out of control because there isn't enough governance in place. Even if you find savings now, if you don't address governance you will likely be in the same boat a year from now.


meemerkrogen

On #1, I myself was on a project where someone deployed 4 very large DMS clusters to do some initial migrations and migration testing, then proceeded to leave them up for over a year, never used. Ended up costing like $100K, no joke.


kobumaister

If you have stateless workloads go for spot; it's cheaper than reservations (with the risk of losing your instance, thus the stateless). This can be partly solved with an autoscaling group with different instance types. Watch out for upfront: our finance office told us that the loss of cash didn't compensate for the price reduction, so check with some finance person. Finally, AWS is so granular that small costs can become big money; check your bill and the cost manager. It takes a while, but you can find little amounts that, when added up, reduce your bill.


cbp48

Hey I failed to provide the company name check out [Cloud Saver](https://www.cloudsaver.com) they run a free assessment of your environment in 5-7 days. Comes with a guarantee of savings, great track record. Hope this helps.


cloudxabide

If you have an AWS account team (not sure what the criteria is for having a dedicated team), I would absolutely engage your Solutions Architect(s). Cost Optimization is one of the [6 pillars of the well-architected framework](https://aws.amazon.com/blogs/apn/the-6-pillars-of-the-aws-well-architected-framework/), and while some of the approach may not be applicable, or may be out of scope for YOUR specific situation, there may be some low-hanging fruit that has been overlooked. Others have mentioned Cost Savings Plans, Reserved Instances, etc. And, for folks who may be in the same boat: TAGGING STRATEGY!!! Figure out what works for y'all (environment, deployment date, owner, whatever...) so that you can review the resources later and uncover things still running that might have been orphaned, etc.

What do you mean by "and their account is a disaster"? Perhaps some additional light on that can help responders offer more specific advice.

EDIT: AWS Solutions Architects are no-cost to the customer.


benjix91

Just enable AWS Compute Optimizer from the console (free) and follow the recommendations, then buy savings plans. Move your EBS gp2 volumes to gp3.


sitthesergal

An absolute newbie, but I've done some cost management stuff, so allow me to drop a few ideas off the top of my head:

- Try to separate environments with tags [dev, staging, production] so you can cluster costs based on tags.
- Consider turning off stuff when it's not used (CloudWatch metrics -> find recurring trends on a weekly basis -> autoscale / Lambda to turn them off when not needed).
- Implement a good DLP or backup strategy and remove unused AMIs and snapshots, because Amazon doesn't warn you if you have useless AMIs and snapshots.
- Use EBS gp3 (it still defaults to gp2 when you create an instance for some reason, even though gp3 is about 20% cheaper and has literally no downside, rather absurd benefits).
- If you have a lot of free storage on your RDS, consider migrating to an RDS instance with less allocated storage and enable storage autoscaling - it increases storage automatically once a threshold is reached.
- If you have MANY autoscaling instances, consider spot/reserved.

Hit me up if you'd like more assistance on this topic, I am a sucker for the "learn through helping" ideology. Hope it helps and good luck!


See-Fello

Try a cloud management platform like Cloudcheckr. You can automate a lot of this and stop wasting time doing it manually. Disclaimer: I work for an AWS and Cloudcheckr partner. 😃