datahoarder


keefshape, in [Solved] How to backup an entire blog hosted on medium.com?

Pose this question to ChatGPT (GPT-3.5 or GPT-4) and ask it to help you write a script (Python, say) to do this. Feed it any errors you hit, and you can get there pretty quickly and learn along the way.

keefshape,

Lol, downvoted for…?

AuroraBorealis, in [Solved] How to backup an entire blog hosted on medium.com?

Does this actually modify the files when monolith embeds everything into one file?

greengnu, in Asking advice for home storage configuration

Traditional RAID stopped being the optimal choice once btrfs and ZFS came along.

If you plan on using matching drives, ZFS is recommended.

If you expect mismatched disks, btrfs will work.

If you are most worried about stability, get a computer with ECC memory.

If you are most worried about performance, use SSDs.

If you want a bunch of storage for cheap, use spinning disks (unless you exceed the 100TB capacity range).

Anticorp, in Too many users abused unlimited Dropbox plans, so they’re getting limits

Using what you’re offered is considered abuse now? Huh…

HobbitFoot,

Unlimited* plans are always sold on the idea that a sizeable part of the user base aren’t going to use an actual unlimited amount of the resource.

Unless there is a contract regarding a fee over a period of time, there isn’t that much that users can do to compel a service to offer a service they no longer want to offer.

splendoruranium,

Absolutely! But I don’t think that’s the point of contention here. The problem is the “abuse” rhetoric, since it’s not just incorrect but disingenuous to basically claim that the users did anything wrong here. They’re imposing limits because they miscalculated how many heavy users they could handle.
Again, that’s a completely reasonable move, but framing it as anything but a miscalculation on their part is just a dick move.

yote_zip, in Asking advice for home storage configuration

Are you buying the hardware for this setup, or do you already have it lying around? If you don’t have the hardware yet, I’d recommend avoiding external USB drives if at all possible, as speed and reliability will be hindered.

If you already have the hardware and want to use it, I’m not super confident recommending anything given my inexperience with this sort of setup, but I would probably try ZFS to minimize potential read/write issues from dodgy USB connections. ZFS checksums every block and verifies it on read, and as long as the pool has redundancy it will automatically repair the data even if a drive returns the wrong thing. ZFS will probably be cranky when used with USB drives, but it should still be possible.

If you’re already planning on RAID6, you could use RAIDZ2 as a roughly equivalent ZFS option, or a double mirror layout for increased speed and IOPS. RAIDZ2 is probably more resistant to disk failures, since you can lose any 2 disks without pool failure, whereas with a double mirror the wrong 2 disks failing can take out the pool. The traditional gripe about RAIDZ’s longer rebuild times being a vulnerable window is not relevant when your disks are only 2TB.

Note that you’ll likely want to limit ZFS’s ARC size if you’re pressed for memory on the Orange Pi, as by default it will try to use a lot of your memory to improve I/O efficiency. It should automatically release this memory if anything else needs it, but that’s not always perfect.
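To make the RAIDZ2 versus double mirror comparison concrete, here is a rough Python sketch that enumerates every possible set of failed disks. It assumes a 4-disk pool and reads “double mirror” as two striped 2-way mirrors; the disk names and the pairing are made up for illustration, and it models nothing beyond which disks are dead.

```python
from itertools import combinations

# Hypothetical 4-disk pool; names are placeholders, not real device paths.
DISKS = ["d0", "d1", "d2", "d3"]
# Assumed layout for the "double mirror": two striped 2-way mirrors.
MIRRORS = [("d0", "d1"), ("d2", "d3")]


def raidz2_survives(failed):
    # A single RAIDZ2 vdev tolerates the loss of any two of its disks.
    return len(failed) <= 2


def mirrors_survive(failed, mirrors=MIRRORS):
    # Striped mirrors are lost once every disk of any one mirror has failed.
    dead = set(failed)
    return all(not set(mirror) <= dead for mirror in mirrors)


def count_survivable(n_failed):
    # Returns (failure combinations, RAIDZ2 survivals, mirror survivals).
    cases = list(combinations(DISKS, n_failed))
    raidz2 = sum(raidz2_survives(case) for case in cases)
    mirror = sum(mirrors_survive(case) for case in cases)
    return len(cases), raidz2, mirror


if __name__ == "__main__":
    total, raidz2, mirror = count_survivable(2)
    print(f"2 failed disks: {total} combinations, "
          f"RAIDZ2 survives {raidz2}, striped mirrors survive {mirror}")
```

Under those assumptions, the two combinations that kill the mirrored pool are exactly the ones where both halves of the same mirror die.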

Another option you may consider is SnapRAID+MergerFS, which can be built in a pseudo-RAID5 or RAID6 fashion with 1 or 2 parity drives, but parity calculation is not real time and you have to explicitly schedule parity syncs (i.e. if a data disk fails, anything changed since your last sync will be vulnerable). You can use any filesystems you want underneath this setup, so XFS/Ext4/BTRFS are all viable options. This sort of setup doesn’t have ZFS’s licensing baggage and might be easier to set up on an Orange Pi, depending on what distro you’re running. One small benefit of this setup is that you can pull the disks at any time and files will be intact (there is no striping). If a catastrophic pool failure happens, your remaining disks will still have readable data for the files that they are responsible for.
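As a rough illustration of that unprotected window, the Python sketch below lists files whose modification time is newer than the last parity sync. The function name and the idea of passing the sync time as a Unix timestamp are assumptions for this example; SnapRAID’s own `snapraid diff` command is the authoritative way to see what has changed since the last sync, and mtime alone will miss deletions and moves.

```python
import os


def changed_since(root, last_sync_ts):
    """Return sorted paths under root modified after last_sync_ts (Unix seconds)."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Anything written after the last sync is not yet covered by parity.
            if os.path.getmtime(path) > last_sync_ts:
                changed.append(path)
    return sorted(changed)
```

The longer the list, the more you stand to lose if a data disk dies before the next sync, which is a decent argument for scheduling syncs often.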

In terms of performance: ZFS double mirror > ZFS RAIDZ2 > SnapRAID+MergerFS (only runs at the speed of the disk that has the file).

In terms of stability: ZFS RAIDZ2 >= ZFS double mirror > SnapRAID+MergerFS (lacks obsessive checksumming and parity is not realtime).

PigeonCatcher,

Thank you! By the way, I’ve heard that ZFS has some issues with growing a RAID array. Is that true?

constantokra,

If you want to be able to grow, check out mergerfs and SnapRAID. If you want to use a Pi and USB drives, it’s probably a better fit than ZFS and RAID arrays. It’s what I’m using and I’ve been really happy with it.

PigeonCatcher,

Thank you! Gonna check it out.

constantokra,

I’ve been using Linux for a long time, and I have a background in this kind of stuff, but it’s not my career and I don’t keep as current as I would if it were, so I’m going to give my point of view on this.

A ZFS array is probably the legit way to go. But there’s a huge caveat there. If you’re not working with this technology all the time, it’s really not more robust or reliable for you. If you have a failure in several years, you don’t want to rely on the fact that you set it up appropriately years ago, and you don’t want to have to relearn it all just to recover your data.

Mergerfs is basically just files on a bunch of disks. Each disk has the same directory structure and your files just exist in one of those directories on a single disk, and your mergerfs volume shows you all files on all disks in that directory. There are finer points of administration, but the bottom line is you don’t need to know a lot, or interact with mergerfs at all, to move all those files somewhere else. Just copy from each disk to a new drive and you have it all.
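The “just copy from each disk” step can be sketched in a few lines of Python. This is a minimal illustration, not a mergerfs tool: the function name and the paths in the usage note are hypothetical, and it assumes Python 3.8+ for `dirs_exist_ok`. If the same relative path exists on more than one disk, the copy from the later disk wins.

```python
import shutil
from pathlib import Path


def consolidate(branches, destination):
    """Merge the directory trees of several disks into one destination."""
    dest = Path(destination)
    for branch in branches:
        # Every disk carries the same directory structure, so copying each
        # tree onto the same destination reproduces the pooled view.
        shutil.copytree(branch, dest, dirs_exist_ok=True)
    # Return the relative paths of all copied files for a quick sanity check.
    return sorted(
        p.relative_to(dest).as_posix() for p in dest.rglob("*") if p.is_file()
    )
```

Something like `consolidate(["/mnt/disk1", "/mnt/disk2"], "/mnt/newdrive")` would then rebuild the pooled view on the new drive, with those mount points standing in for wherever your disks actually live.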

SnapRAID is just a snapshot. You can use it to recover your data if a drive fails. The commands are pretty simple, and relearning them isn’t going to be too hard several years down the road.

The best way isn’t always the best if you know you’re not going to keep current with the technology.

Borger, in [Solved] How to backup an entire blog hosted on medium.com?

Write a scraper using Python and Selenium or something similar. You may have to log in manually as part of it.
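A bare-bones version of such a scraper might look like the Python sketch below. Treat it as a starting point under loud assumptions: the function names are made up, it uses Firefox via Selenium, it pauses so you can log in by hand, and it only follows links found on the landing page, so infinite scroll or pagination would need extra work.

```python
from urllib.parse import urljoin, urlparse


def same_site_links(base_url, hrefs):
    """Keep absolute, de-duplicated links that stay on the blog's own host."""
    host = urlparse(base_url).netloc
    links = []
    for href in hrefs:
        if not href:
            continue
        # Resolve relative links and drop #fragments so duplicates collapse.
        absolute = urljoin(base_url, href).split("#")[0]
        if urlparse(absolute).netloc == host and absolute not in links:
            links.append(absolute)
    return links


def save_blog(base_url, out_dir):
    """Open a browser, wait for a manual login, then save each linked page."""
    # Imported lazily so same_site_links works without Selenium installed.
    from pathlib import Path
    from selenium import webdriver
    from selenium.webdriver.common.by import By

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    driver = webdriver.Firefox()
    try:
        driver.get(base_url)
        input("Log in in the browser window if needed, then press Enter...")
        anchors = driver.find_elements(By.TAG_NAME, "a")
        hrefs = [a.get_attribute("href") for a in anchors]
        for i, url in enumerate(same_site_links(base_url, hrefs)):
            driver.get(url)
            page = out / f"page_{i:04d}.html"
            page.write_text(driver.page_source, encoding="utf-8")
    finally:
        driver.quit()
```

The saved files are the rendered HTML only; images and stylesheets are not downloaded by this sketch.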

perviouslyiner, in Looking for data about company ownership network

Wikidata is a database-oriented version of Wikipedia intended for categorising relations between concepts.

For example, Google (www.wikidata.org/wiki/Q95) is “owned by” XXVI Holdings (www.wikidata.org/wiki/Q100292691) since 2017.

XXVI Holdings similarly shows that it is owned by Alphabet Inc (www.wikidata.org/wiki/Q20800404), which is owned by a list of people and entities with various different voting rights.

There is also the “Parent Organisation” relation (www.wikidata.org/wiki/Property:P749) which links Google to Alphabet directly.

These should all be computer-readable.
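As one way to read those relations by machine, here is a hedged Python sketch against the public Wikidata SPARQL endpoint. It looks up the “Parent Organisation” property (P749) mentioned above plus P127, which is my assumption for the property behind “owned by”; the function names and the User-Agent string are invented for the example.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "https://query.wikidata.org/sparql"


def ownership_query(qid):
    """Build a SPARQL query for the owners and parent organisations of one item."""
    # P127 ("owned by") is assumed here; P749 is the parent organisation property.
    return f"""
SELECT ?relation ?target ?targetLabel WHERE {{
  VALUES ?relation {{ wdt:P127 wdt:P749 }}
  wd:{qid} ?relation ?target .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
}}
""".strip()


def parse_bindings(payload):
    """Turn a SPARQL JSON result into (property id, item id, label) tuples."""
    rows = []
    for binding in payload["results"]["bindings"]:
        rows.append((
            binding["relation"]["value"].rsplit("/", 1)[-1],
            binding["target"]["value"].rsplit("/", 1)[-1],
            binding.get("targetLabel", {}).get("value", ""),
        ))
    return rows


def fetch_owners(qid):
    """Run the query over HTTP; needs network access, so it is not called here."""
    params = urllib.parse.urlencode({"query": ownership_query(qid), "format": "json"})
    request = urllib.request.Request(
        ENDPOINT + "?" + params,
        headers={"User-Agent": "ownership-sketch/0.1 (example script)"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_bindings(json.load(response))
```

Calling `fetch_owners("Q95")` would start from Google; repeating the call on each returned item ID walks up the ownership chain one hop at a time.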

rah, in Looking for data about company ownership network
Deebster, in Backblaze

I like them - fast enough and a good price, especially if you have public data you're happy to put behind Cloudflare for free egress.

Edit: Aaaand this morning I get an email saying prices are going up by 20% in October 🤦‍♂️

SamsonSeinfelder, in Music Industry sues Internet Archive

Those companies are such a drag on human society because they are afraid for their bottom line. The music industry and Amazon would also have burned down the Library of Alexandria if they had deemed it a threat to their profits.

chiisana, in archive.today: On the trail of the mysterious guerrilla archivist of the Internet

Speaking of archive.today: since yesterday or so, I’ve been getting nothing but Cloudflare challenge loops. I recall that maybe four or so years back they were adamantly against Cloudflare, and if you used 1.1.1.1 for DNS, the site would refuse to load or throw errors. I wonder what’s happening behind the scenes?

bravesilvernest, in archive.today: On the trail of the mysterious guerrilla archivist of the Internet

Well… that was dope. Hopefully they can continue on, the true datahoarder they are.

einsteinx2, in Dual Actuator Drives

I was surprised the prices aren’t even that much higher than single actuator drives of the same size. I might be picking a few of these up for my next capacity increase.

chiisana, in Was gifted some Primergy RX2530 M1 M2 and M4 - suitable for homelab in terms of energy consumption

The M1/M2 look like PowerEdge R630 equivalents with v3/v4 CPUs, and the M4 uses a Scalable Xeon, which is one generation newer. All of them are great systems, especially when maxed out. The M4, being the newest, is probably the best all-around choice.

From a power point of view, they’re gonna be “less” energy efficient than consumer DIY stuff, in that they’re meant to be high-density systems run in a data centre alongside thousands of similar servers, packing as much punch into as little space as possible.

Another thing I’d be wary about is noise… 1U means you’re stuck with itty bitty tiny fans that need to spin very quickly and make a lot of noise should your components heat up. Again, that whole data centre high density thing… noise isn’t something they’re optimizing for.

Wilshire, in We now have official presence here!

Thank you! It’s nice to see my favorite communities join Lemmy. I’ve been using Reddit less and less.
