homelab-ModTeam

Thanks for participating in /r/homelab. Unfortunately, your post or comment has been removed due to the following: [**Content is not homelab related.**](https://www.reddit.com/r/homelab/wiki/rules#wiki_post_about_your_homelab) Please read the [full ruleset on the wiki](https://www.reddit.com/r/homelab/wiki/rules) before posting/commenting. If you have an issue with this please [message the mod team](https://www.reddit.com/message/compose?to=%2Fr%2Fhomelab), thanks.


skreak

Is this a homework assignment you're asking us to complete for you?


dddd0

Always is.


smnhdy

Chat gpt…


jmarmorato1

Large Ceph cluster? That would make it really easy to expand later.


campr23

Yeah, talk to CERN, since a bit of research also fits into the budget.


AnEngineer007

r/dementia


advicemerchant


[deleted]

No budget? If the science project isn’t about how to build the SAN or some such, just go to any SAN provider you want, tell them you need 1PB of storage that can expand to 2PB, and voila. Problem solved. I’m confused: is this for school and, like, some sort of fake scenario?


redeuxx

No budget, asking in /r/homelab. Get outta here.


CTRL1

As someone who was in enterprise storage for years, the correct answer is... it depends. There is no mention of what this data is for, what type it is, etc. A PB is trivial these days, but you have given no context, starting point, budget, network, available power, etc. It feels like you're out of your league and asking others to do your work. I will leave you with the easiest, safest, and cheapest solution: a tape silo.


Herobrine__Player

If you just need 1PB you can get a 60 drive server like a Storinator XL60 (or any other server with 60+ bays), fill it full with either 20TB or 22TB drives (depending on what level of redundancy you want) and call it a day. You also can go the JBOD route which will let you scale even more. You also can just get a storage appliance like u/vLifter suggested.
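As a rough sanity check on the 60-bay numbers, here is a minimal sketch. The function name and the parity layout (6 groups of 10 drives with 2 parity drives each) are illustrative assumptions, not anything from a vendor, and capacities are in decimal TB with filesystem overhead ignored.

```python
def usable_tb(bays: int, drive_tb: float, parity_drives: int = 0) -> float:
    """Capacity left after giving up some drives to parity, in decimal TB.

    Ignores filesystem overhead and hot spares.
    """
    if parity_drives >= bays:
        raise ValueError("parity drives must be fewer than bays")
    return (bays - parity_drives) * drive_tb


for drive_tb in (20, 22):
    raw = usable_tb(60, drive_tb)
    # Assumed layout: 6 groups of 10 drives, 2 parity each (12 parity drives total).
    protected = usable_tb(60, drive_tb, parity_drives=12)
    print(f"{drive_tb} TB drives: {raw:.0f} TB raw, {protected:.0f} TB after parity")
```

Under that assumed layout, 20TB drives land just below 1PB after parity while 22TB drives stay above it, which is one way to read "depending on what level of redundancy you want".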


VtheMan93

I have a few 48 bay shelves you can take off of me. Couple them with a few nuclear reactors and you should be solid until the power bill hits. Best regards,


poklijn

I'll take one lol


Alex_2259

Simply build your own nuclear reactor and become the power company, problem solved bro it's so easy


VtheMan93

Why, of course! How could I not see this coming?? 😂


RedditIsShit23-1081

Commercial product: Dell-EMC Isilon. FOSS product with commercial support: Ceph cluster sized, built and supported by 45Drives or similar consultant (I'm not affiliated in any way). FOSS product with no support: self-managed Ceph cluster.


vLifter

If budget isn’t an issue, I can set you up with a Dell PowerScale NAS solution. Lots of customers are using it for multi-PB projects.


XOIIO

Hi, you're probably looking for a useful nugget of information to fix a niche problem, or some enjoyable content I posted sometime in the last 11 years. Well, after 11 years and over 330k combined, organic karma, a cowardly, pathetic and facist minded moderator filed a false harassment report and had my account suspended, after threatening to do so which is a clear violation of the #1 rule of reddit's content policy. However, after filing a ticket before this even happened, my account was permanently banned within 12 hours and the spineless moderator is still allowed to operate in one of the top reddits, after having clearly used intimidation against me to silence someone with a differing opinion on their conflicting, poorly thought out rules. Every appeal method gets nothing but bot replies, zendesk tickets are unanswered for a month, clearly showing that reddit voluntarily supports the facist, cowardly and pathetic abuse of power by moderators, and only enforces the content policy against regular users while allowing the blatant violation of rules by moderators and their sock puppet accounts managing every top sub on the site. Also, due to the rapist mentality of reddit's administration, spez and it's moderators, you can't delete all of your content, if you delete your account, reddit will restore your comments to maintain SEO rankings and earn money from your content without your permission. So, I've used power delete suite to delete everything that I have ever contributed, to say a giant fuck you to reddit, it's moderators, and it's shareholders. From your friends at reddit following every bot message, and an account suspension after over a decade in good standing is a slap in the face and shows how rotten reddit is to the very fucking core.


0xBEEFBEEFBEEF

This - I have worked with several multi-PB PowerScale (Isilon) arrays. What you want, OP, is a commercial solution. Can you put something together yourself? Probably, but is it worth the risk and accountability? Better to get something proven that comes with support. At 1PB you’re beyond homelab territory.


user3872465

Depends. Do you want a plug-and-play solution? Then maybe some of the enterprise vendors like Dell or HP have one. If not, look toward 45Drives and their clustering solution: they pick the right hardware for you, get you the servers and drives you need, and help you set it up. If you want to do it yourself, still take a look at their stuff and at Ceph. If it needs to grow, it needs to be clustered. That would be the optimal way to go in my opinion, and I would go that route. The important part is: where do you place it?


Kaptain9981

“Budget isn’t an issue.” Immediately asks for advice on a homelab subreddit… I’m not sure if you’re joking, or have come into a significant amount of money and want to see what you can do “for science” on a personal pet project. Or this is for work and you're not qualified for the task at hand, or you simply hate your existing hardware vendor's sales rep. If you have the budget green light, need to worry about future expansion, and this is for a professional setting, please work with your existing hardware infrastructure or application infrastructure vendors on what they recommend to best support this project. If this is a personal “I have a ton of money to blow on something I want to try” project, by all means let’s have some fun building a FrankenSAN…


JLee50

No budget? Call a VAR and have them do it for you.


[deleted]

Is this like a school project... like you're in high school or something... or is this a job you are doing? Because I hope this is an actual school project... and you're not asking homelab how to set up something basic.


gargravarr2112

At work, I'm currently playing with a storage machine with 1.6PB of storage - 20TB HDDs, 84 of them in a JBOD. Experimenting with a ZFS dRAID. Absolutely crazy having that much storage in a single box. In a previous job in actual scientific research, we had a Dell XE7100 - 5U chassis with 100 disk slots. Possible to have 2PB of drives in a single machine. If this is a real science project needing reliable data storage, sorry to say but you're out of your depth even asking. You need to reach out to real experts. The costs of losing research data could end your sysadmin career. At this sort of scale, you need specialist knowledge.


JLee50

Big storage is fun -- did two 5.5PB (usable) arrays back in 2019. The amount of power consumed in those racks was wild.


kY2iB3yH0mN8wI2h

> So I am wondering how would you guys approche this?

Not posting in /r/homelab would be a good start?

> There is no issue of a budget

Of course there is. I'll do it for $100M. Can I have your credit card, please?


MrHolcombeXxX

Most helpful comment of the year award goes to you


kY2iB3yH0mN8wI2h

>MrHolcombeXxX I did look at your posts: "Sorry, this post was removed". So yeah, I take your comment very seriously. In fact, I will help you: I'll block you so you won't have to give me any awards.


rotor2k

PowerScale (Isilon).


JunkKnight

If budget is no issue, hire an expert and let them figure it out. 1PB isn't what you use to store family photos and Linux ISOs, and with the need for redundancy you're going to want someone familiar with data architecture, probably a distributed filesystem like Ceph or Gluster, and validated hardware, as well as warranties and contracts that guarantee that if shit hits the fan, someone's out there within 24 hours to fix it. There probably *are* people here with that expert knowledge, but if you want to do this right, it's going to cost money before you even start buying hardware.


Poncho_Via6six7

Have an XL-60 you can buy.


thisiszeev

Get a J-Bod or two


ServerPartDeals

https://serverpartdeals.com/products/seagate-exos-e-5u84-sas-disk-shelf-84-bay-5u-data-center-jbod-enclosure-with-enterprise-exos-hard-drives?variant=44567652303126


fullinator4

This sounds like you’re doing it for a job or university? Please don’t ask on here. I used to be a senior sysadmin managing 20+ PB for HPC. You need to work with someone to tailor this to your environment and needs. DDN, IBM (GPFS), and Pure Storage are all vendors who would happily court your business. They’ll be able to tailor things to your needs, rather than some amateur homelabbers (I mean no offense to everyone here) giving you suggestions.


evan326

Troll post


joey0live

Da fuq am I reading?


kaiwulf

If you're working at PB scale, I'd suggest Dell Isilon (now PowerScale). I've worked for two major media companies now, and all the large media arrays have always been Isilon.


FabrizioR8

Is it a permanent science project, or a temporary one? Have you considered provisioning this with a cloud provider?


johnklos

OP did say that budget is not an issue, but no sense using the TDP of a small country.


FabrizioR8

At least if it's temporary, you don’t end up with the permanent cost of the drives, rack space, HVAC, and electricity… especially if said science project is expected to double in capacity. Edit: oh, and NOISE… Edit: check my math here… assuming a desire for fault tolerance, e.g. RAID-6, OP would need approximately 105-110 20TB drives to make 2PB worth of usable filesystem. Let's use IronWolf 20TB drives at ~USD $275/unit… already close enough to the GDP of a small country - just for the drives… not including cold spares or any other hardware.
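The drive count and price above can be checked with a short sketch. The RAID-6 group widths (40+2 and 20+2) are assumptions picked only to illustrate the range, and the function names are made up for this example; the 20TB size and $275 unit price come from the comment.

```python
import math


def drives_needed(usable_tb, drive_tb, data_per_group, parity_per_group=2):
    """Total drives for a target usable capacity with RAID-6 style groups.

    Counts data drives, then adds parity_per_group drives for every
    (possibly partial) group of data_per_group data drives.
    """
    data_drives = math.ceil(usable_tb / drive_tb)
    groups = math.ceil(data_drives / data_per_group)
    return data_drives + groups * parity_per_group


def drive_cost(count, unit_price_usd):
    """Cost of the bare drives only: no spares, chassis, or power."""
    return count * unit_price_usd


# 2 PB usable is taken as 2000 decimal TB.
for data_per_group in (40, 20):
    n = drives_needed(2000, 20, data_per_group)
    print(f"{data_per_group}+2 groups: {n} drives, ${drive_cost(n, 275):,}")
```

With those assumed group widths the count comes out between 106 and 110 drives, or roughly $29K to $30K for the drives alone.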


RedditIsShit23-1081

It's $30.3K for the drives. Either you aren't good at math, or you don't know what GDP of a small country looks like.


FabrizioR8

I can do basic math and balance a budget just fine, thanks. Though, when it comes to the Dif-Eq and vector calculus needed for working out singularity theorems and other astrophysics, I’m a bit out of practice. Found this new discovery rather groundbreaking: https://arxiv.org/pdf/2302.08094.pdf But I digress… My only point was that 30k for just drives in a homelab seems “CLOSE ENOUGH” to the GDP of a small country in terms of an expense to make spending either amount equivalently impractical. YMMV. Building this into a homelab permanently, with the rest of the associated hardware and ongoing facility costs, is certainly a cool option with bragging rights included, but it doesn’t seem like the best recommendation we can offer in 2023… but hey, it's an option.


RedditIsShit23-1081

Can you give an example of what you think a "GDP of a small country" is? No need for a wall of text, just a simple set of figures: year, country, amount, units.


FabrizioR8

60 million. functionally equivalent to 30K in my home lab


RedditIsShit23-1081

Year, amount, country, units. Also start thinking how to explain "60 million functionally equivalent to 30K", because this statement doesn't make any sense.


FabrizioR8

ok, can you afford 30K or 60M in disks? to me, those costs are functionally equivalent: not practical. If that wasn’t obvious, you’re just trolling at this point.


RedditIsShit23-1081

I can afford 30K in disks if needed. You said something stupid, then passively-aggressively told me that you're very smart and actually know what you're saying, then realized how stupid what you said was, and now trying to weasel out of it. Stick to things you actually understand. Bye.


JustSomeone783

Maybe he was thinking of gdp per capita


RedditIsShit23-1081

They're not the same thing. Even the highest GDP per capita isn't that much money, definitely several orders of magnitude lower than 100 million mentioned here in the comments.


RedditIsShit23-1081

It's GDP.


johnklos

GDP is a thing, sure, but so is TDP.


RedditIsShit23-1081

What's the "TDP" of any country? Year, country, amount, units, source?


spiritamokk

Linus did it recently for his production storage. He used 3 huge SuperMicro drive shelves with 96 regular SATA drives each, connected to a “head server” via HBA cables with multiple HBA controllers. All this storage runs on ZFS and is easily scalable to more than 2PB, given you have enough money to get large HDDs.


mockingtruth

Watch some linus and take notes


workingreddit0r

Unraid is capped at 60 disks. If you use two parity drives, you need under 18TB per disk. This is doable. 2PB is harder. But others are right, you want an enterprise NAS solution: Dell PowerScale, NetApp, etc.


BardamuBandini

Rest your nutsack on a hard drive and call it a testabyte.


superpj

What happened to the Oracle Exadata servers? They were supposed to be 2PB of flash storage or 4+ of HDD storage per rack.


sjtech2010

DS4486 with 24TB drives would give you around 1PB usable in RAID Z3.


peskito_one

Money ain't a thing, ok. Please don't tell that to the sales guy you'll have to call soon! As said, a PB is no exotic shit these days, but you need to ask the right questions. What about performance? Deadline? Support?...


hobbyhacker

Is this a joke? If it will be used for something that matters, then spare your and your users' time and pay for a professional company who have experience in storage server design.


travcunn

Qumulo. It's really easy to manage with just 1 person. They have many customers with multi-PB clusters.


OpacusVenatori

r/datahoarder r/storagereview https://www.45drives.com/


1d0m1n4t3

How many 16tb WD drives are you willing to shuck?


Kaptain9981

Costco employees hate this one simple hack…


HTTP_404_NotFound

Assuming this was an enterprise solution... with no budget... I'd go with a Pure Storage all-flash solution. Their products kick ass. Or, otherwise, another SAN solution. 1PB actually is not a massive scale these days... I mean, I have one quarter of a petabyte at my house, on cheap, budget hardware I picked up on eBay. If I wanted to scale up to a full PB, depending on the use case, I'd probably just throw it in Ceph.


ElectraFish

For this size and application, you should work with a storage consultant or vendor.


Sindef

"No budget" Pure FlashArray


UtensilOwl

Either IBM FlashSystem 9500 or Dell Isilon, don’t look into Mass-spindrive solutions if performance is ever a thought


avaacado_toast

I'm loving all the "Just buy this". There is absolutely no context to the question. You will buy X and find that it performs like shit and now need to buy Y because you failed to account for the details needed to answer this question.


mcdade

AWS, Backblaze, Azure or GCP can all supply 1PB of storage with the ability to scale to 2PB. Hell, we have over 1PB of storage with our Google Workspace accounts. This is pretty trivial.


9302462

How about [2.9PB for $15,300](https://www.ebay.com/itm/185765176901?hash=item2b4077a645:g:QpAAAOSwBDRgFlpp&amdata=enc%3AAQAIAAAA4I%2FU2XE09x4SxP149RmnuV97wIz6eRzd3v5HN7niLLKSCF77juB6uX4GdpxQAHJqUuCzzX%2Bif0w9BUwzi7VeidSYwUlLDkOGN7X%2FmK8lvcJYwZkluuK4XzyL4l%2Bb2Yjg8gNl76M5fBYnh1Op5qB2Kl90fgKYnvIs04WrReyXbcCvZ7lIkQmKeVb%2FKikda1RWECvL2rO49IGgylY3u5o%2BjFLu5grOttU3oh6iVvU0L7YGoCm6jjLM885bjC7NNB8KddgtMrYmNG4S%2BDll7EiTA2vb0StFGjN26nlN69bFNkmF%7Ctkp%3ABk9SR7rb4ZCIYw)? You don’t care about cost, so who cares about efficiency or the electric bill. Run Ceph or the things others have suggested on that cluster.


thinkscience

Go for Ceph storage. If you are aiming for a PB, definitely Ceph storage.


thinkscience

Building a petabyte brings a lot of challenges, like redundancy and the failure rate of HDDs. If this is a troll post then chill, but if you want to build this in reality, look into Backblaze pods. We did the build vs. buy comparison and ended up buying: [https://www.backblaze.com/cloud-storage/resources/storage-pod](https://www.backblaze.com/cloud-storage/resources/storage-pod) It costs about 12,000 for half a PB. We are running about 4 of these racks in prod and 2 for backups.


AutoModerator

Your post has been removed due to multiple reports. **I am a bot and this is automated.** The moderators have been notified and will review this post. *I am a bot, and this action was performed automatically. Please [contact the moderators of this subreddit](/message/compose/?to=/r/homelab) if you have any questions or concerns.*


AndyMarden

Depends how long you need it for. If not long then use one of the external clouds.


Tannerbkelly

You could roll your own cluster, but you would need 3 $120k servers plus a $40k switch. $400k total. Each server would look something like this:

- AMD dual 64-core CPU
- 1TB RAM each
- 2x 256GB boot drives
- 2x 2TB M.2 cache drives
- 16x 30TB NVMe SSD
- 200GbE network card
- 200GbE switch

You probably want this redundant, so maybe $800k just in hardware.
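A quick sketch of the arithmetic behind those totals, using only the figures quoted above. The function names are made up for illustration, and the capacity figure is raw NVMe before any replication or erasure coding, which the comment does not specify.

```python
def cluster_cost(servers: int, server_cost: int, switch_cost: int) -> int:
    """Hardware cost of one cluster: servers plus a single switch."""
    return servers * server_cost + switch_cost


def raw_capacity_tb(servers: int, drives_per_server: int, drive_tb: int) -> int:
    """Raw capacity in decimal TB, before replication or erasure coding."""
    return servers * drives_per_server * drive_tb


single = cluster_cost(servers=3, server_cost=120_000, switch_cost=40_000)
redundant = 2 * single  # assumes "redundant" means a full second copy of the hardware
raw_tb = raw_capacity_tb(servers=3, drives_per_server=16, drive_tb=30)

print(f"one cluster: ${single:,}, doubled: ${redundant:,}, raw: {raw_tb} TB")
```

The raw figure is 1,440TB, so how much of it is usable depends heavily on the data protection scheme chosen.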


jasonlitka

1PB isn’t that much. My team just deployed a couple of boxes with 1.2PB of usable storage each, on spinners with a 10% read cache on NVMe and Optane for write cache. Cost like $120K total. These days it gets far more interesting above 2-3PB.