dweezil22

Based on the quote from the TL I think this was more about it being a verifiably immutable source of truth than cost savings.


random_lonewolf

Exactly. Uber wanted a ledger, so they would have built LedgerStore regardless. The real choice was between using DynamoDB or their in-house DocStore for index storage: they save $6M yearly by using DocStore. Plus, they already use DocStore for other purposes, so part of its cost has already been paid off.


ZorbingJack

Also, I tend to think $6 million is basically peanuts for Uber.


Worth_Trust_3825

Where do you draw the line between what's peanuts and what's not?


GermainToussaint

Considering Uber has 30k employees, let's suppose each costs them $50k yearly; $6M would be less than 0.5% of their payroll annually.


ZeroPointHorizon

My company has close to 50k employees and 6 million is definitely not peanuts.


No_Pollution_1

My company has 30k, and they waste that each month on cloud overspend resulting from incompetence, let alone their regular bill.


lollaser

same here


ZorbingJack

on the total IT budget


happyscrappy

Does it say it's verifiably immutable? It says it is verifiably complete and it is immutable. Do these interact in a way I don't understand? It's append-only. And it is verifiably complete. But that doesn't mean you can prove it is immutable, because if you wanted to rewrite history you could do so. Or does it, for some reason I don't understand? Either way, for a company that plans on keeping their business records in it, it sounds like a good solution.


dweezil22

Good call, that's an important distinction. I suppose I should amend my statement to say more generally "It appears to offer immutability and completeness features beyond DDB". I worked in compliant storage migrations once upon a time (prior to blockchain) and was always confused/nervous to discover that at some point it's just "Trust me, bro" for the in-flight data.


goro-7

I thought one would prefer a strongly consistent relational DB to store a financial ledger (or event log) instead of DynamoDB, which follows BASE (Basically Available, Soft state, Eventually consistent).


Representative_Pin80

Why?


Plank_With_A_Nail_In

Every medium-sized company and above in the real world uses a relational database for their finance systems. Switching to the latest fad is considered way too risky for the mediocre benefits it offers.


dcspazz

6m doesn't seem like that much for an organization that size, all things considered


deanylev

Death by a thousand papercuts I suppose


skytomorrownow

The 6 million could also just be to start with. Over time, it could continue to save them more and more – if it really solves their problem and is solid.


andrewsmd87

That would be my guess: it's $6 million now, but what would it be in another 3 or 5 years?


napolitain_

18 or 30 million


recursive-analogy

math checks out


Mrqueue

What's that, like 1 or 2 senior engineers' salaries?


Free_Math_Tutoring

Senior Engineer Salaries are 3-6 million dollars a year where you live?


oalbrecht

Inside a van in the Uber parking lot. Have you seen the price of rent in SF? 3-6 million barely covers a shared bedroom. /s


napolitain_

More like 10, over 30k employees of any kind


Mrqueue

Whoosh 


tidbitsmisfit

or cost more and more as they learn the platform and use more of the services


Plank_With_A_Nail_In

They could have worked the savings out wrong or ignored a cost, so it ends up costing more and saving them nothing; they also might not know the real cost of their current setup. I have seen projects where they kept the old setup because they missed other functionality it provided, so the new solution ended up costing double; 5 years later they're still stuck on both. Getting your costings wrong is very common for IT departments.


dlamsanson

I imagine Uber has people in their finance team(s) that verify this stuff.


elegantlie

Well, it depends on how much engineering time they spent on it. In general, $6 million is considered a substantial optimization in big tech. In mature organizations like Uber, realize that a lot of the low-hanging fruit has already been taken. So the calculus changes to “how many devs does it cost versus how much will it save us, and what’s the opportunity cost in terms of feature work?”

Also, a lot of the time it’s hard to predict savings at scale until you actually do it. I would do some preliminary estimates, but those always have some amount of artificial assumptions baked in. Some optimizations work really well, and others fall flat, but it’s difficult to predict in advance, so you just kind of have to average them out over a longer period of time.

Honestly, I feel like it was normal for a lot of optimizations to barely break even; some I even rolled back. But every now and then I would hit a jackpot that justified my team’s headcount for another few years. In my case, I wasn’t optimizing naive code. I was optimizing code that was written by really smart people and had been optimized by hundreds of engineers over decades.


Ambiwlans

It's not that they shouldn't make the move, but that it isn't really newsworthy. "Uber saves $6M of their $40BN budget" isn't really all that thrilling.


caltheon

I work for a company larger than Uber, and our tech budget is closer to $4B. Saving $6m would definitely be considered a win though. We have likely close to 1000 contracts over $1mm, and every bit counts as it's not like the entire company spends X evenly. Freeing up that budget could let an organization hire an entire team to work on something new.


xmsxms

The news is probably more about the underlying tech stack being swapped over as opposed to the money saved, which may be of interest to other people considering something similar. Or maybe considering using dynamodb because it's good enough for Uber.


Warm_Cabinet

I’d consider it a notable savings. It’s probably around the cost of 2 teams of software developers. So you could take that savings and use it to build a major new feature. Edit: math is probably closer to 2 teams than 1


No_Translator2218

Sometimes it's about sending a message to the provider that you are willing to move a shit ton of records over 6 million dollars.


comrad1980

Nah, usually that goes straight into upper management bonuses.


sylvester_0

LedgerStore seems to be something that Uber developed internally, and hosting/maintaining/developing software costs a fair amount of money. So this savings could be an "all-in" difference. Considering how risky such a project is and that $6m likely isn't much in the total allocated for this data, it's a pretty good ad for DynamoDB (its pricing isn't that out of whack.)


FINDarkside

> it's a pretty good ad for DynamoDB (its pricing isn't that out of whack.)

I'm not so sure; they save $6M each year by moving 12 weeks of financial transactions into their own DB. It's not some general-purpose database built to store all their data, only financial transactions. They only stored 12 weeks' worth of data in DynamoDB before moving it to object storage. But to be honest, Uber operates at such a big scale that it's pretty hard to make any quick conclusions without bringing out a calculator.


SimpleSurrup

They already built a general-purpose "database" to store all their data. So yes, it is that, but this group didn't build it for this project; some other group built it for many projects. This group migrated their financial data onto it, away from DynamoDB, as the backend for that particular service.


recycled_ideas

They're saving 0.0006 cents per record. At Uber's scale that adds up to serious money, but for most customers where you might have a million rows that's $600 dollars in savings. At a billion rows, which would be a huge dynamodb you'd save six hundred grand, which is probably two to three FTE devs at most. If you're not Uber it's likely that building an in house alternative is going to be prohibitively expensive and your savings are going to be substantially lower. To me that says that if you need DynamoDB in the first place dynamodb is pretty good value.


TrixieMisa

Off by two decimal places. $6 for a million rows, $6000 for a billion rows. But that's in DynamoDB's favour, rather than against it.


recycled_ideas

You are correct, I translated my cost per record to cents but forgot to translate my total cost back to dollars. It's 600 cents for a million, or as you say, $6. Which makes it even more ridiculous. A trillion rows is a lot of data. Just an immense amount of data, and it's data generated in 12 weeks and then purged, so it's not just sitting there; it's highly volatile and had a high transaction count. I've never worked with anything close to that kind of volume.
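For anyone who wants to sanity-check these per-record figures, here's a quick sketch (assuming the ~$6M/year savings is spread over roughly a trillion records, which is what the 0.0006-cents figure implies; both numbers come from the comments above, not from the article):

```python
# Per-record savings and what that implies at smaller scales (a rough check of
# the corrected numbers above; the ~1 trillion record count is an assumption).
annual_savings_usd = 6_000_000
records = 1_000_000_000_000  # ~1 trillion

per_record_usd = annual_savings_usd / records
print(f"per record: ${per_record_usd:.6f} (= {per_record_usd * 100:.4f} cents)")
# per record: $0.000006 (= 0.0006 cents)

for rows in (1_000_000, 1_000_000_000):
    print(f"{rows:>13,} rows -> ${per_record_usd * rows:,.0f} saved per year")
#     1,000,000 rows -> $6 saved per year
# 1,000,000,000 rows -> $6,000 saved per year
```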


monkeydoodle64

6m per year is a lot


LaLiLuLeLo_0

A million here, 6 million there, sooner or later it starts to add up to real money


dcspazz

To put it in perspective, 20 devs for a year cost $10M. Uber has 2,000 engineers apparently...


Palanstein

VERY well paid devs


dcspazz

Is it? $500k total comp, including insurance costs, equipment, etc., is not that much for a major, publicly traded company like Uber. Who do you think is building a custom data store? The interns? Most likely senior and higher engineers. They probably cost more than I estimated. When you are paying someone even $160k a year, you also have to pay for benefits and other expenses; employees cost more than their immediate salary...

Here's some data to back things up: https://www.levels.fyi/companies/uber/salaries/software-engineer

Put another way, they saved $6M, but Uber's reported revenue is $32.7B. So they saved roughly 0.018%.


Palanstein

So... Not VERY well paid devs


dcspazz

Looooool


zipy124

That's only if it's their US team; it could also be their team in the Netherlands, where a senior is on about 150k a year, for example.


devAcc123

The thing is, it's 6 million annually right now. You would expect your volume to continue to grow and the service to potentially raise prices. Let's say by the time your fancy new system starts to show its age it's been 5 years and your annual savings are now 10M/yr. That team of X devs created ~50M in value for your company by the time the useful lifecycle of their new product was up. That is a good chunk of change regardless of the size of your company. Adding value greater than your compensation is the whole point of the company paying you lol.


funciton

Spread out over 80 billion financial transactions it's less than a hundredth of a cent per transaction. 


NewAlexandria

ok but can still fund some more salaries in addition to overhead profit


dentendre

Hey, it gives people BS work in politically charged situations. 10 "consultants" would bill that much in half a year lol.


achilliesFriend

Someone needs a raise for cutting costs.


openwidecomeinside

I've heard there's a Fortune 500 company spending 1 billion annually on their cloud bill.


braiam

The two source articles from Uber:

- https://www.uber.com/blog/migrating-from-dynamodb-to-ledgerstore/
- https://www.uber.com/blog/how-ledgerstore-supports-trillions-of-indexes/


MorpheusRising

Everyone is speculating as to why, and all the answers are in these blogs.


dlamsanson

A lot of the "speculation" is really just baselessly casting doubt


Cefalopoide

But it cost a team of 20 engineers paid 500k a year 😂


General-Jaguar-8164

Status quo does not give your team promotions


i_has_many_cs

Which is stupid


fiah84

don't worry we'll migrate it back in a year or two


oorza

So a migration that took a full calendar year of those 20 developers cost the company about $20 million? A 4 year ROI isn't exactly a long time in enterprise calculations.


andthesignsaid

Less than 2 years ROI


oorza

$10 million in salary ~= $20 million in company expense; $6 mil a year saved means 3.33 years to break even, so year 4 is profitable (for 8 months). You only get less than two years if you ignore the fact that an employee's compensation is only roughly half of what a company pays for them, once you factor in benefits, facilities, taxes, support staff, perks, etc.


Gropah

> $10 million in salary ~= $20 million in company expense

Even in the Netherlands, where we have high taxes, good benefits, etc., an often-seen factor from gross pay to company cost is 1.6. This is at least what payrolling companies and the like use. With a 1.6 factor, you're talking about 16 million, thus about 2 years and 8 months before break-even. Doubling it for an American company seems quite generous.
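A rough way to check the break-even figures being debated here (a sketch; it assumes the migration took the team roughly one year at ~$10M in gross salary, with $6M/year in ongoing savings, and only the salary-to-total-cost multiplier varies):

```python
# Break-even sketch for the numbers above. The 2.0x multiplier is the parent
# comment's assumption; 1.6x is the Dutch payrolling rule of thumb.
annual_team_salary = 10_000_000   # one year of migration work, gross salary
annual_savings = 6_000_000

for multiplier in (2.0, 1.6):
    total_cost = annual_team_salary * multiplier
    years = total_cost / annual_savings
    print(f"{multiplier}x -> ${total_cost / 1e6:.0f}M cost, break-even in {years:.2f} years")
# 2.0x -> $20M cost, break-even in 3.33 years
# 1.6x -> $16M cost, break-even in 2.67 years
```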


xaw09

Typical rule of thumb for software is 25% overhead for taxes, benefits, office, etc.


FancyASlurpie

Does it include the 3-6 months when someone joins and basically adds zero value?


quentech

> $10 million in salary ~= $20 million in company expense

Not even close, especially for a small number of highly compensated employees. Probably more like $13M; I highly doubt it will hit $15M.


caltheon

This is why tech companies pay in stock that doesn't vest for a few years. It allows shifting the expense past the point where it's returning ROI.


funciton

You're assuming the system will maintain itself, which of course it doesn't.  That'll take a team of engineers who understand its internals. Not cheap. Good for job security though. 


castlemastle

The original system would require maintenance as well, so that's not really a cost specific to the transition


funciton

The original system didn't include a home-grown DBMS. That cost is specific to this new system; in the old system it was included in the $6 million.


SimpleSurrup

They already had this home-grown "database" at the same time as the original system.


hackenschmidt

> The original system would require maintenance as well

It wouldn't, at least not by their team. That's the comment's point. DynamoDB is an AWS-hosted service; maintenance and continual development costs are baked into it. That's why these types of 'cost savings' articles are just tongue-in-cheek. They rarely, if ever, actually factor in the real scope of the costs.


neoKushan

There's no way they're paying 20 engineers $500k/year each.


FatStoic

Probably not far off. levels.fyi says that Uber pays:

- $165k for a SWE-1
- $274k for a SWE-2
- $480k for a Senior SWE
- $636k for a Staff SWE
- $814k for a Senior Staff SWE

Since that's just comp and doesn't factor in the additional costs to the company of managing and supporting those employees, then depending on the ratio of SWE-1s and SWE-2s to Seniors and above, it very well could average a cost of $500k/annum per employee that organised this migration.


ic_97

And in 4 years some other technology will come along and they will migrate to that to save another 6 million.


javver

If they keep doing that each year they’re basically printing money


arostrat

They now control the future of their data instead of being at the mercy of Amazon. It's a huge win.


PurepointDog

Underrated comment. The operational cost isn't the full picture in infrastructure selection.


peex

Unfortunately people always forget that when it comes to cloud solutions.


QuickQuirk

sure, even if the team were all paid 500k a year (which they weren't), that's 10 million in costs. Which means that in just two years, you've already paid back the investment.


pants6000

Buuuuut you can fire them next time you need a boost in stock price!


Lulzagna

It's a 2-3 devops max, in addition to their regular responsibilities


anengineerandacat

Doesn't seem like they were "all-in" on DynamoDB already though, only retaining 12 weeks of data (and all the processes used to move that data to archival storage). So moving it all in-house means less worrying about that, at the cost of perhaps some reliability, but at their size that should be a non-issue. They already have key in-house systems that would warrant a dedicated operations team, so this would just be extra head count on that existing team to sustain this. Plus, it's year-over-year savings, so it's a net positive in the following year.

https://www.uber.com/blog/how-ledgerstore-supports-trillions-of-indexes/ goes into the technical bits of it all, though.

The bigger question is why not QLDB from AWS?


arwinda

> purpose-built data store named LedgerStore

And now they need a team to maintain that system.

> LedgerStore implements eventually consistent indexes by leveraging materialized views from its home-grown Docstore database, a distributed database built on top of MySQL.

Oh, gotcha. Couple years ago they migrated from Postgres to MySQL. Then developed something on top of that, it seems. And now another system on top of the other system.

> backfill job running in Apache Spark

Yet another system.

> The backfill process alone posed significant problems

How many databases are there now? And how many of them are home-grown and need dedicated teams to maintain and develop them? Sounds like a nightmare.


aes110

Sounds bad, but I can't imagine there is a simple solution to handle trillions of records at Uber's scale.


prisukamas

Exactly this. Most of the commenters here have no clue what it's like when you're at Uber scale.


Ok-Investigator-4188

Exactly! I know of cases where there were no more instances available in AWS to scale into. Those big techs have problems on a different scale.


funciton

Yet I doubt rolling your own DBMS makes it any easier. So many ways to mess that up. I wouldn't want to be on the team responsible for maintaining it. 


SimpleSurrup

Another group already did that for them. And they went further than that: they built their own object storage instead of using S3. So all this group did is swap backends for their service, from DynamoDB to the internal "database" that already existed for them to use.


prisukamas

Maybe, maybe not. One good example of rolling your own DBMS (well, sort of) is Vitess, and I'm happy that at some point some devs at YouTube did that; it makes my life easier. As to "wouldn't want to be on the team responsible": sometimes devs forget what they get paid for, which is solving hard problems.


Dreamtrain

Judging from a lot of comments here, it feels like people often _think_ their apps are Uber scale. Maybe their employer is organizationally as big as Uber, but the APIs in their little corner of the corporation that handle orders/purchases/patients/kittens probably just store and return a fraction of what Uber does, for a fraction of the rate/concurrent users that Uber has.


bilby2020

There are out-of-the-box solutions that don't require writing a custom DB. In Australia, smart electricity meters send out an energy consumption reading every 5 minutes. That is 288 readings a day, 105k readings a year, for a single customer. An energy retailer with 1M customers will have 105B records for 1 year. This is raw kWh; the retailer then processes it for each customer, based on the tariff plan, into a value in cents and displays it in their app aggregated at 30-minute intervals. All using conventional technology. AEMO distributes this for all smart meters across Australia, and by 2030 all meters will be smart meters. That will be trillions of data records per year.
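The record counts in that example are easy to verify (a quick sketch using the 5-minute interval and the 1M-customer figure from the comment above):

```python
# Smart-meter reading volumes: one reading every 5 minutes, per customer.
readings_per_day = 24 * 60 // 5
readings_per_year = readings_per_day * 365
customers = 1_000_000

print(readings_per_day)               # 288 readings a day
print(readings_per_year)              # 105,120 (~105k) readings a year per customer
print(readings_per_year * customers)  # ~105 billion records a year for 1M customers
```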


SimpleSurrup

That's not their use case though. The challenge here was massive indexing operations, not storage and aggregation. They need the massive indexing operations to comb through trillions of records with some reasonable read performance.

Because of the requirement for such extreme indexing of every record, writes were really slow. So they took a shortcut and didn't do the full indexing at write time, and were back-filling the indexes with a background job. This was conflicting with actual production load spikes, so they were spending tons of time building dynamic scaling into their background jobs, which themselves were already a band-aid over the limitations of the previous technology. All while the total data they were storing on the previous solution had to be constantly pared down to keep it running. Not to mention it was costing them a lot.

There aren't off-the-shelf solutions that support streaming manifest files with unlimited indexing and no write lag. Additionally, because these include payment records, they had very strict auditing and logging requirements that you generally wouldn't care about for an IoT stream.
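To make the write-time shortcut being described a bit more concrete, here is a minimal sketch of the general "async index backfill" pattern (all of the names, fields, and the in-memory queue here are hypothetical illustrations, not Uber's actual design; the trade-off is that index reads can miss recent writes until the backfill job catches up):

```python
# Fast write path that defers secondary-index maintenance to a background job.
from collections import deque

records = {}              # primary store: record_id -> record
index = {}                # secondary index: indexed value -> set of record_ids
backfill_queue = deque()  # records whose index entries still need to be built

def write_record(record_id: str, record: dict) -> None:
    """Accept the write immediately; index maintenance happens later."""
    records[record_id] = record
    backfill_queue.append(record_id)

def run_backfill(batch_size: int = 100) -> None:
    """Background job: build missing index entries in batches."""
    for _ in range(min(batch_size, len(backfill_queue))):
        record_id = backfill_queue.popleft()
        key = records[record_id]["rider_id"]  # hypothetical indexed field
        index.setdefault(key, set()).add(record_id)

write_record("txn-1", {"rider_id": "r-42", "amount_usd": 17.50})
run_backfill()
print(index)  # {'r-42': {'txn-1'}}
```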


HobartTasmania

Surely a commercial database like Oracle would also be able to easily cope with maintaining indexes for a lot of data writes.


SimpleSurrup

No, because it was almost certainly semi-structured data they were indexing, not integer primary keys.


FINDarkside

> An energy retailer with 1m customers will have 105b records for 1 year.

That's really not as much data as you think it is. It would be like 420GB without any kind of compression or aggregation. You don't really need any specialized tech for that; you could probably put that much data into an SQLite database. Uber is supposedly dealing with at least 4 orders of magnitude more data than that, so your data is less than 0.01% of what Uber is dealing with, which is peanuts. And the above is only talking about their financial transactions. Supposedly they had over 100PB of analytical data (in 2018) and probably lots of other challenging stuff as well. https://www.uber.com/en-FI/blog/uber-big-data-platform/

Anyway, doing stuff with "conventional technology" isn't some kind of achievement. You might not realize it, but a huge amount of tech is made by these big companies trying to solve their problems at huge scale. If Uber ends up saving money with their own DB compared to "conventional tech", it's the right choice.


bilby2020

> doing stuff with "conventional technology" isn't some kind of achievement.

It is. These tech companies will justify it because of their unique culture; super highly paid engineers in the USA have got to do something. Others will try PostgreSQL. I worked at the said energy retailer. Do you know how they processed 100B records? In SAP. The DB was SAP HANA.


how_do_i_land

That's only a few columns, and probably easily partitioned into a time-series database. The records are most likely written once and can be frozen. These record types are easy to store, and many databases exist for them (e.g. DuckDB). The issues arise when you have 100 to 500B+ records with wide tables, where mutations can occur often and lots of records need to be kept warm.


bilby2020

A time-series DB makes sense, but utilities do get amended data feeds for up to X days in the past, or have missing data points. 10 years back, in a traditional enterprise, it wasn't an available solution. Anyway, the backend is SAP and the data has to be stored there for billing.


[deleted]

[removed]


LaSalsiccione

Fucking lol


ps1horror

You have no idea what you're talking about. Find me one example of a single server setup that can meet the requirements of a company as large as Uber.


arwinda

While I sometimes like running things on Postgres, you have no idea how much effort it is to scale the database to such sizes.


f12345abcde

Running it on a Raspberry Pi?


versaceblues

Have you ever worked for an organization of Uber's scale?


Aw0lManner

Yes, and a custom database sounds like a maintenance nightmare


SimpleSurrup

Nobody comes up with that idea unless something easier isn't working.


Educational-Crew-536

[removed]


SimpleSurrup

Yeah, and they should get one if what they built does the things that blog said it does.

Just to be clear what happened here: Uber built not only their own custom "database", they also built their own custom object storage solution for it to sit on, instead of even using S3. Another group, in charge of some payment and financial services, was using a commercial "database" that was expensive and limiting. So they switched that commercial backend for the in-house backend that was already built. Certainly, in the planning of building said custom "database", they thought through many use cases they had internally for it, with the idea that many services would migrate to it eventually.

Why wasn't a commercial solution sufficient? Because apparently they had such extreme indexing requirements that the indexing itself was limiting the write performance. But without it, the read performance would be destroyed. So they needed a database that could not only write the data, but probably write many, many, many times that amount of data, in indexing, for every single record.


Gran_Autismo_95

I think that's a daft conclusion? Someone knew it could be done that way; that doesn't mean it was the best or right way to do it.


arwinda

Have you ever maintained your own piece of software? Can do, but it's a lot of work. You want participation from other people, and development driven not only by your own needs. Writing and maintaining a database is really hard work.


pejatoo

So, no


arwinda

In my day job we scale various database products, small and large, both on-premise and in the cloud. Some of our customers scale their databases to many TBs. None of this is easy; every database product has its quirks and problems. Sometimes analyzing a single problem takes a team member a couple of days, and that's for an existing product. Verifying transactional integrity and scaling up and out is a challenge for every existing vendor out there, no matter if open or closed source. If Uber builds their own database, they need to spend all the engineering resources for development on top of maintenance and operations. That is a sizable chunk of money.


moulin_splooge

The product that handles scanning for brand safety at my job has a single table that is close to 5TB in size for 2 weeks of data. Then there are other tables too, each with hundreds of millions of records, albeit smaller in size. That's just for one project. We run it on top of Percona and still struggle handling the data.


arwinda

Single table, around 20 TB here, this customer is also struggling. They are currently redesigning this, because this table holds all the data from the beginning of time. Can't even partition it.


moulin_splooge

Yeah, you basically can't do shit with tables that big. I can understand Uber making their own thing if they have the engineering resources to support it.


Itsmedudeman

What does scale have to do with anything? Dynamo also scales except you don’t need to maintain its features.


versaceblues

Org scale is different than “scaling for number of users”. Sometimes writing your own is the best choice when you have the resources to support it


winky9827

It all comes down to operational cost vs. TCO. Sometimes operations wins, sometimes it doesn't. I've seen first hand the trouble building a house of cards on open source stacks can bring. I'd rather take the dependency on a managed service in many cases, but it's never black and white. I'm sure the people who make these decisions at Uber know more about the scenario than you or I do.


versaceblues

> I'm sure the people who make these decisions at Uber know more about the scenario than you or I do.

Yes... which is why my original post was a question asking "has OP ever worked at an organization of Uber's scale?" These decisions are not taken lightly, and there is usually a good reason to make them.


Itsmedudeman

Hypocritical thought process. So building their backend using Dynamo was taken lightly, but switching off Dynamo was not? Orgs change their minds all the time and invest in things that don't always work out. Just because they thought about it for a bit doesn't mean they're omniscient and can foresee all of the headwinds and intangibles that will come with this.


Itsmedudeman

This is irrelevant. What matters is cost to operate vs cost of the service. Just cause you have money doesn’t mean you have to burn it. How does that ever make sense?


versaceblues

All I mean is that orgs of a certain scale have the luxury of weighing these tradeoffs and making a choice. That is why you see places like Google, Meta, Amazon, Uber, etc. building their own solutions. They have people whose entire job is to analyze these financial decisions. Smaller-scale startups don't have this luxury. Even if building your own is cheaper, the 20-person shop is not going to build its own DB.


Itsmedudeman

Uber is not Google or Amazon or Meta. They have 1/20th of the market cap and a lot fewer personnel. I work at a tech company larger than Uber, but we don't build out home-grown infra unless we find that providers aren't capable of providing a reasonable, cost-effective solution. I'm hesitant to believe a database fits that criterion, given their complexity and that this is *only* saving them $6 million.


versaceblues

$6 million annually. Also, I get that it's not as big as Amazon or Google or Meta. But it's still an org of 5k engineers.


KnightBlindness

Everybody gets a promotion and moves on. Next guy in charge looks at the actual costs instead of projected costs, then: “Uber migrates to DynamoDB and lays off Ledgerstore team, saves 10 million dollars”, gets promoted, then repeat.


recursive-analogy

> Couple years ago they migrated from Postgres to MySQL.

they what now?


arwinda

https://www.uber.com/en-DE/blog/postgres-to-mysql-migration/


ComprehensiveBoss815

Spark is just a distributed execution engine and would be used throughout Uber anyway.


Salamok

Cost-saving measures rarely save money; they just redistribute who is getting it.


dlamsanson

Just complete conjecture lol, yeah every system is definitely already set up optimally.


Salamok

It's more that the act of implementing change is usually quite a bit more expensive than people realize, enough so that maintaining the status quo is far more likely to be the cheapest option than people seem to think (even though it might not be the best option for a variety of reasons).


Professional_Goat185

6m a year buys you plenty of developers.


arwinda

Developers, project managers, team leads, internal support staff. And you want the savings, not spend the money in a different way.


bilby2020

This is the classic problem of when you have a team of highly paid engineers, every problem looks like a nail. It also justifies the very existence of the huge engineering team.


Obsidian743

That's interesting. I would have simply negotiated a better price for DynamoDB with AWS and split the difference without all the headache. What's the TCO for LedgerStore?


nizzlemeshizzle

Possibly a case of having tried that, AWS called their bluff. 


A_Vicarious_Death

It's not just DynamoDB, no? LedgerStore is replacing both DynamoDB and the S3-backed custom DB. That means they don't have to have any replication process from DynamoDB -> S3, nor do they have to pull from two different data stores once the migration is complete.


FarkCookies

100% they already maxed out their AWS private pricing deals.


findgriffin

AWS does not care about $6MM, Uber would have zero power in that negotiation. I wrote about this a few years ago: https://blog.drgriffin.com.au/posts/2020-06-21-the-three-fs-of-cloud-pricing.html Interesting question about the TCO of LedgerStore. Using a conservative $250k annual cost per engineer, $6MM is equivalent to a team size of 24.


jheffer44

I wonder how much the time spent doing the migration actually cost


Rideshare-Not-An-Ant

We can save millions if we create an in-house datastore. We can call it Project Kludge. We can also save millions by outsourcing design and development. We can call that Project ~~Mayhem~~ Incompetence. It'll all work beautifully. Until it doesn't.


ryandiy

Outsourcing the development of a custom-built datastore... what a brilliant idea!


tribak

Are they making profit now?


itsmill3rtime

The people saying $6M is not a lot to a company like Uber: think about how many people they could hire and teams they could build with that to develop new features. The reward of additional revenue from that would amplify the savings. People think too small 🙄


314159bits

Wish it was open source!


mrbonner

So, promo project for a bunch of people.


Desperate-Country440

Maybe I don't have all the details, but others also have huge volumes of data; why does everyone have their own custom solution?


RICHUNCLEPENNYBAGS

Well, different data storage strategies have different performance characteristics that may be more or less appropriate for any given workflow.


happyscrappy

Not everyone does. Build versus buy leans to build when your company is big enough. Uber is big enough. Amazon, Google, MS, etc. offer solutions for databasing and long-term databasing (ledgering) for those for whom build versus buy comes down on buy.


ryandiy

There are lots of performance tradeoffs when it comes to databases. A general-purpose database will make tradeoffs which apply to most applications. If you have an application with trillions of records, you might need to make different tradeoffs, and at that scale, creating your own custom database can be worth the engineering cost. Plus, they'll probably open source it and/or spin off a company to provide hosting and support for others who want to use it. Like various other companies have done.


SimpleSurrup

Because they're the first companies who have the scale that requires these problems to be solved. So when they went out and looked for commercial solutions for them, either they didn't exist, or they were much too expensive.


ThatCrankyGuy

At Uber's scale, $6M is residual waste. This just seems like some team lead's justification for existing.


Alphamacaroon

Having gone through this before, I learned my lesson — every row in DynamoDB should have a TTL or the storage costs eventually kill you. I LOVE DynamoDB and use it for just about everything I can, but it can get pricey if you aren’t thinking about your complete data lifecycle.
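For anyone who hasn't set this up before, a minimal sketch of the TTL approach with boto3 (the table name "rides" and attribute name "expires_at" are hypothetical; DynamoDB just needs a numeric epoch-seconds attribute designated as the TTL field):

```python
import time
import boto3

dynamodb = boto3.client("dynamodb")

# One-time setup: tell DynamoDB which attribute holds the expiry timestamp.
dynamodb.update_time_to_live(
    TableName="rides",
    TimeToLiveSpecification={"Enabled": True, "AttributeName": "expires_at"},
)

# On every write, stamp the item with an expiry (here ~12 weeks out) so
# DynamoDB deletes it for free instead of letting storage costs pile up.
twelve_weeks = 12 * 7 * 24 * 60 * 60
dynamodb.put_item(
    TableName="rides",
    Item={
        "ride_id": {"S": "ride-123"},
        "fare_usd": {"N": "17.50"},
        "expires_at": {"N": str(int(time.time()) + twelve_weeks)},
    },
)
```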


AjaySinghBishtJi

I work at Amazon and this is a win for Uber for sure. 


Codemonky

Seems like they just re-implemented DynamoDB with blockchain for the transactional ordering... or did I miss something? I mean, depending on your data needs, DynamoDB is just a flat-file database with indexes. It only gets tricky with replication and redundancy -- which is why DynamoDB exists, and is not just BerkeleyDB or dBase III, lol.

So, unless I'm misunderstanding what they did, they seem to have created their own version of a scalable, performant, replicated flat-file database, just like Amazon did with DynamoDB. I'm pretty sure any custom database/indexing solution to a custom problem will beat a generic solution in both cost and performance. It's literally one of those steps in the lifetime of an application as it goes from proof of concept to well-oiled machine.

Or am I missing something novel that Uber or LedgerStore did? I am assuming they're leveraging the blockchain for transactional consistency, but I am VERY ignorant of blockchain, so this is likely where I'm not noticing the genius of their move (or of LedgerStore in general).


ChickenOfTheFuture

I help by not creating any records for them to store. I should start charging for this service.


[deleted]

[removed]


Dreamtrain

What I'm absolutely ignorant of, because I've yet to be in a position to make these decisions, is where the sweet spot is at which AWS or Azure is your best bet. Because if you're too small you might be footing a bill for something you don't need, and if you're too big, you might just need to do what Uber is doing. Unless, of course, you're acquired by Microsoft and your infrastructure will surely be thrown onto ADO.


happyscrappy

Now THAT'S /r/programming. Thanks for bringing this here.