The m1/m2 look like PowerEdge R630 equivalents with v3/v4 CPUs, and the m4 uses Scalable Xeons, which are one generation newer. All of them are great systems, especially when maxed out. The m4, being the newest, is probably the best all-around choice.
From a power point of view, they’re going to be “less” energy efficient than consumer DIY stuff, in that they’re meant to be highly dense systems run in a data centre with thousands of other similar servers, packing as much punch into as little space as possible.
Another thing I’d be wary about is noise… 1U means you’re stuck with itty bitty tiny fans that need to spin very quickly and make a lot of noise, should your components heat up. Again, that whole data centre high density thing… noise isn’t something they’re optimizing for.
What’s the best way to make an offsite backup of 42 TB at this point with 20 Mbps of bandwidth? It would take over 6 months to upload while maxing out my connection.
Maybe I could sneakernet an initial backup then incrementally replicate?
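To sanity-check that estimate, here’s the back-of-the-envelope math (assuming the full 42 TB and a steady, fully saturated 20 Mbit/s upload with no protocol overhead):

```python
# Rough transfer-time estimate: 42 TB over a 20 Mbit/s uplink.
data_bits = 42 * 10**12 * 8   # 42 TB expressed in bits
rate_bps = 20 * 10**6         # 20 Mbit/s upload, assumed fully saturated
seconds = data_bits / rate_bps
print(f"{seconds / 86400:.0f} days (~{seconds / (86400 * 30):.1f} months)")
# -> roughly 194 days, a bit over 6 months, before any protocol overhead
```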
Outside my depth, but I’ll give it a stab. Identify what data is important (is the full 42 TB needed?). Can the data be split into easier-to-handle chunks?
If it can, then I’d personally do an initial sneakernet to get the first set of data over, then mirror the differences on a regular basis.
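One way to handle the “mirror the differences” part after the initial sneakernet copy, assuming the offsite box is reachable over SSH and rsync is available on both ends (the host name, paths, and bandwidth cap below are made up), might look roughly like this:

```python
import subprocess

# Hypothetical source and destination; adjust to your own layout.
SRC = "/mnt/pool/data/"  # trailing slash: sync the contents, not the directory itself
DST = "backup@offsite.example.com:/mnt/backup/data/"

# After the initial sneakernet copy, rsync only transfers changed files.
cmd = [
    "rsync",
    "-a",              # archive mode: recurse, keep perms/times/links
    "--delete",        # drop files at the destination that were deleted locally
    "--partial",       # keep partially transferred files so a restart can resume
    "--bwlimit=2000",  # ~2000 KiB/s (~16 Mbit/s) so the 20 Mbit/s uplink isn't fully saturated
    SRC,
    DST,
]
subprocess.run(cmd, check=True)
```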
The first thing to do is check the SMART data to see if there are any failures. Then look at usage hours, spin-ups, and the pre-fail / old-age attributes to get a general idea of how worn the drive is and how long you could keep using it, depending on your risk acceptance.
If there are already several sectors reallocated and multiple spin-up failures, I’d probably return the drive.
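If you want to script that first pass, a rough sketch might look like this (assuming smartmontools is installed; /dev/sdX is a placeholder, and the attribute names are the common ATA ones, which can vary by vendor):

```python
import subprocess

DEVICE = "/dev/sdX"  # placeholder: the drive under test

# Attributes that give a quick read on wear and early failure signs.
WATCH = (
    "Reallocated_Sector_Ct",
    "Current_Pending_Sector",
    "Offline_Uncorrectable",
    "Power_On_Hours",
    "Start_Stop_Count",
)

# smartctl -H prints the overall health verdict, -A prints the attribute table.
out = subprocess.run(
    ["smartctl", "-H", "-A", DEVICE],
    capture_output=True, text=True,
).stdout

for line in out.splitlines():
    if "overall-health" in line or any(attr in line for attr in WATCH):
        print(line.strip())
```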
Apart from all the reliability stuff: I’d check the contents of the drive (on a safe machine). If it wasn’t wiped, you might want to notify the previous owner so they can change their passwords or notify customers about the leak (in compliance with local regulations), etc. Even if you don’t exploit that data, the merchants/dealers in the chain might already have.
My guess would be that it’s stored in some kind of non-volatile memory, e.g. an EEPROM. Not sure if anyone has ever tried it, but with the dedication of some hardware hackers it seems at least feasible. Reverse engineering or overriding the HDD’s firmware to return fake or manipulated values would be another approach.
I haven’t seen anything like that in the wild so far. What I have seen are manipulated USB sticks, though: ones advertising the wrong size (which can be tested with h2testw) or worse.
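h2testw is Windows-only; on Linux, f3 does the same job, and the underlying idea (fill the stick with known data, then read it back and verify) is simple enough to sketch yourself. A rough example, with the mount point as a placeholder; note it fills the stick with junk test files you’ll want to delete afterwards:

```python
import hashlib
import os

MOUNT = "/mnt/usb"        # placeholder: where the stick under test is mounted
CHUNK = 64 * 1024 * 1024  # 64 MiB per test file

def fill(path):
    """Write seeded pseudo-random files until the device reports it is full."""
    digests = []
    i = 0
    while True:
        # 32-byte SHA-256 digest repeated to fill one chunk; different per file.
        data = hashlib.sha256(str(i).encode()).digest() * (CHUNK // 32)
        try:
            with open(os.path.join(path, f"fill_{i:05d}.bin"), "wb") as f:
                f.write(data)
                f.flush()
                os.fsync(f.fileno())
        except OSError:  # typically ENOSPC: the stick is full (or claims to be)
            break
        digests.append(hashlib.sha256(data).hexdigest())
        i += 1
    return digests

def verify(path, digests):
    """Re-read every file; mismatches mean earlier data was silently overwritten."""
    bad = 0
    for i, want in enumerate(digests):
        with open(os.path.join(path, f"fill_{i:05d}.bin"), "rb") as f:
            got = hashlib.sha256(f.read()).hexdigest()
        if got != want:
            bad += 1
    return bad

if __name__ == "__main__":
    sums = fill(MOUNT)
    print(f"wrote {len(sums)} chunks, {verify(MOUNT, sums)} failed verification")
```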
I’ve bought used / refurbished drives (not sure which) with erased SMART data. Everything being all zeros was a clear sign of erased / tampered info. After running badblocks, some reallocated sectors showed up.
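For reference, the kind of run that surfaces those sectors could be scripted roughly like this (a read-only badblocks pass, so it’s safe on a drive with data; /dev/sdX is again a placeholder):

```python
import subprocess

DEVICE = "/dev/sdX"  # placeholder

# Non-destructive read-only surface scan; -s shows progress, -v is verbose.
subprocess.run(["badblocks", "-sv", DEVICE], check=False)

# Re-read SMART afterwards: a full surface read tends to push marginal sectors
# into Reallocated_Sector_Ct / Current_Pending_Sector if the drive is failing.
out = subprocess.run(["smartctl", "-A", DEVICE], capture_output=True, text=True).stdout
for line in out.splitlines():
    if "Reallocated_Sector_Ct" in line or "Current_Pending_Sector" in line:
        print(line.strip())
```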
I was surprised the prices aren’t even that much higher than single actuator drives of the same size. I might be picking a few of these up for my next capacity increase.
Speaking of archive.today: since yesterday or so, I’ve been getting nothing but Cloudflare challenge loops. I recall that maybe four or so years back, they were adamantly against Cloudflare, and if you used 1.1.1.1 for DNS, the site would refuse to load or would throw errors. I wonder what’s happening behind the scenes?
Those companies are such a drag on human society because they’re afraid for their bottom line. The music industry and Amazon would also have burned down the Library of Alexandria if they had deemed it a threat to their profits.
One easy solution might be to look into a self-hosted search engine. I’ve used mnogosearch in the past, which worked well for spidering a single domain, but it only created the database and didn’t have a web front end. Still, if you let it go crazy across your Nextcloud pages and add a search bar to your website, it could provide what you’re missing. They provided enough examples at the time for me to write my own search page pretty easily.
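If mnogosearch feels opaque, the core of a “write my own search page” setup is small enough to prototype. Here’s a rough stand-in using SQLite’s FTS5 full-text index rather than mnogosearch’s actual schema (table, column, and file names are made up, and it assumes your Python’s SQLite was built with FTS5, which is typical):

```python
import sqlite3

# Minimal full-text index: store page URL + extracted text, query with FTS5.
# A stand-in for whatever your crawler populates, not mnogosearch's real schema.
db = sqlite3.connect("site_index.db")
db.execute("CREATE VIRTUAL TABLE IF NOT EXISTS pages USING fts5(url, body)")

def add_page(url: str, body: str) -> None:
    """Index one crawled page (the crawling itself is out of scope here)."""
    db.execute("INSERT INTO pages (url, body) VALUES (?, ?)", (url, body))
    db.commit()

def search(query: str, limit: int = 10):
    """Return the best-matching URLs with a short highlighted snippet."""
    return db.execute(
        "SELECT url, snippet(pages, 1, '[', ']', '...', 8) "
        "FROM pages WHERE pages MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()

if __name__ == "__main__":
    add_page("https://example.com/notes", "How I make an offsite backup of 42 TB of data")
    for url, snip in search("backup offsite"):
        print(url, "->", snip)
```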
Thank you for this! I have sent this suggestion off to our web wizard; it looks extremely promising. We had wanted to attempt something like this but couldn’t find a foothold to get started!
Good luck! And don’t get stuck on the software I use; you may find something else that is better suited to your type of data. For example, if your content is wrapped up in PDFs or some kind of zipped files, then the best solution is one that can peer into those files to give search hits on the contained text. Of course, if your content is already fairly plain text, then pretty much any solution would work.