datahoarder


Showroom7561, in Backblaze increases storage costs to 0.6ct/GB or 6$/TB, but offers free downloads

Free egress is fine and all, but as someone who uses cloud storage as a last resort, I’d want to pay less for storing that data, regardless of what it costs to get it back.

A 20% increase is a little bonkers. Do they give any reasons for this substantial increase? Computer storage prices have not gone up, from what I can see (they’ve gone way down from two years ago).

Car,

Backblaze is one of the OG “cheap cloud” storage providers.

They buy cheap stuff and develop cheap storage networks to charge cheap prices and stay in business. They publish entire papers on running cheap storage if you’re interested. It’s actually pretty interesting stuff.

They raised prices 20% (or $1). Hardware costs may have gone down that much, but I’m willing to bet their energy and rent prices haven’t. They’re subject to services inflation just as much as anybody else.

Frozen pizza prices in my area have increased more in the last year than their services.

Anticorp, in Too many users abused unlimited Dropbox plans, so they’re getting limits

Using what you’re offered is considered abuse now? Huh…

HobbitFoot,

Unlimited* plans are always sold on the idea that a sizeable part of the user base aren’t going to use an actual unlimited amount of the resource.

Unless there is a contract regarding a fee over a period of time, there isn’t that much that users can do to compel a service to offer a service they no longer want to offer.

splendoruranium,

> Unlimited* plans are always sold on the idea that a sizeable part of the user base aren’t going to use an actual unlimited amount of the resource.
>
> Unless there is a contract regarding a fee over a period of time, there isn’t that much that users can do to compel a service to offer a service they no longer want to offer.

Absolutely! But I don’t think that’s the point of contention here. The problem is the “abuse” rhetoric: it’s not just incorrect but disingenuous to claim that the users did something wrong here. They’re imposing limits because they miscalculated how many heavy users they could handle.
Again, that’s a completely reasonable move, but framing it as anything other than a miscalculation on their part is just a dick move.

splendoruranium, in What do you do with damaged drives?

I’m a bit baffled that this hasn’t popped up yet: Sell them on eBay.
Mark them as broken goods/scrap and reiterate that fact very clearly in the product description. Broken drives often sell for up to 1/3 of the value of a working one, no scamming needed.

I cannot tell you why that is, but my theory is that a lot of folk buy up broken drives in private sales in the hopes that the “broken” diagnosis is just user error and that the drive is actually fine. Knowing my users, that might actually be true in many cases.

Edit: I didn’t quite catch that you were not able to successfully overwrite your data. I guess that’s a point against selling it. Always encrypt your drives; that way you can always sell them when they break!
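
If anyone wants the one-time setup for that, a minimal LUKS sketch on Linux looks roughly like this (/dev/sdX is a placeholder, and luksFormat wipes whatever is on the drive):

cryptsetup luksFormat /dev/sdX           # encrypt the raw device (destroys existing data)
cryptsetup open /dev/sdX cryptdata       # unlock it as /dev/mapper/cryptdata
mkfs.ext4 /dev/mapper/cryptdata          # create a filesystem on the mapped device
mount /dev/mapper/cryptdata /mnt/data    # use it like any other disk

Once a drive like that dies, whatever is left on it is just ciphertext, so selling it as scrap is safe.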

vicfic,

Aah, that’s a pretty good idea. But I’m guessing it’s not the case for SSDs?

splendoruranium,

It absolutely is, at least from my observations!

MossyFeathers, in I just downloaded the entire Classic Chicago Television youtube channel on a whim

Nice! What’re you gonna do with them? Are you gonna upload them somewhere, or just hold onto them?

empireOfLove,

They still happily exist on YouTube, for now. So no point in re-hosting; they’ll get squirreled away into the Giant Hard Drive of Doom.

If something happens to the actual archive project in the near future, I’ll likely section them up into 20 GB pieces and post them out on a torrent someplace.

Appoxo,

Just upload it to archive.org before your backup dies. No need to hoard it for yourself.

empireOfLove,

Nah. IA doesn’t need to deal with this volume of shit and they already have enough of a hard time dealing with copyright trolls.

If this channel is impacted in the future, I’ll probably put out a few torrents with the videos and post them here.

NikkiNikkiNikki, in I just downloaded the entire Classic Chicago Television youtube channel on a whim

I plan to do this with a lot of the entertainment videos I watch. Considering how ban-happy some websites have been with content creators, being able to still see their craft after it’s gone is worthwhile.

Just need to buy a fuckton of storage though

empireOfLove,

Me too. There’s a couple channels I’ve downloaded in their entirety, but they’re nothing like the size of this one.

bela, in Google Books - colour images

I just spent a bit too much time making this (it was fun), so don’t even tell me if you’re not going to use it.

You can open up a desired book’s page, start this first script in the console, and then scroll through the book:


<span style="color:#323232;">let imgs = new Set();
</span><span style="color:#323232;">
</span><span style="color:#323232;">function cheese() {    
</span><span style="color:#323232;">  for(let img of document.getElementsByTagName("img")) {
</span><span style="color:#323232;">    if(img.parentElement.parentElement.className == "pageImageDisplay") imgs.add(img.attributes["src"].value);
</span><span style="color:#323232;">  }
</span><span style="color:#323232;">}
</span><span style="color:#323232;">
</span><span style="color:#323232;">setInterval(cheese, 5);
</span>

And once you’re done you may run this script to download each image:


<span style="color:#323232;">function toDataURL(url) {
</span><span style="color:#323232;">  return fetch(url).then((response) => {
</span><span style="color:#323232;">    return response.blob();
</span><span style="color:#323232;">  }).then(blob => {
</span><span style="color:#323232;">    return URL.createObjectURL(blob);
</span><span style="color:#323232;">  });
</span><span style="color:#323232;">}
</span><span style="color:#323232;">
</span><span style="color:#323232;">async function asd() {
</span><span style="color:#323232;">  for(let img of imgs) {
</span><span style="color:#323232;">    const a = document.createElement("a");
</span><span style="color:#323232;">    a.href = await toDataURL(img);
</span><span style="color:#323232;">    let name;
</span><span style="color:#323232;">    for(let thing of img.split("&amp;")) {
</span><span style="color:#323232;">      if(thing.startsWith("pg=")) {
</span><span style="color:#323232;">        name = thing.split("=")[1];
</span><span style="color:#323232;">        console.log(name);
</span><span style="color:#323232;">        break;
</span><span style="color:#323232;">      }
</span><span style="color:#323232;">    }
</span><span style="color:#323232;">    a.download = name;
</span><span style="color:#323232;">    document.body.appendChild(a);
</span><span style="color:#323232;">    a.click();
</span><span style="color:#323232;">    document.body.removeChild(a);
</span><span style="color:#323232;">  }
</span><span style="color:#323232;">}
</span><span style="color:#323232;">
</span><span style="color:#323232;">asd();
</span>

Alternatively you may simply run something like this to get the links:


<span style="color:#323232;">for(let img of imgs) {
</span><span style="color:#323232;">	console.log(img)
</span><span style="color:#323232;">}
</span>

There’s stuff you can tweak, of course, if it doesn’t quite work for you. It worked fine in my tests.

If you notice a page missing, you should be able to just scroll back to it and then download again to get everything. The first script just keeps collecting pages till you refresh the site, which also means you should refresh once you’re done downloading, as it eats CPU for breakfast.

Oh and NEVER RUN ANY JAVASCRIPT CODE SOMEONE ON THE INTERNET TELLS YOU TO RUN

perviouslyiner, in Looking for data about company ownership network

Wikidata is a database-oriented version of Wikipedia intended for categorising relations between concepts.

For example, Google (www.wikidata.org/wiki/Q95) is “owned by” XXVI Holdings (www.wikidata.org/wiki/Q100292691) since 2017.

XXVI Holdings similarly shows that it is owned by Alphabet Inc (www.wikidata.org/wiki/Q20800404), which is owned by a list of people and entities with various different voting rights.

There is also the “Parent Organisation” relation (www.wikidata.org/wiki/Property:P749) which links Google to Alphabet directly.

These should all be computer-readable.
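
For example, a minimal SPARQL query against the public Wikidata Query Service (query.wikidata.org) can follow the “owned by” (P127) chain from Google; the transitive path and the limit are just illustrative choices:

# Who owns Google (Q95), following the "owned by" (P127) chain transitively?
SELECT ?owner ?ownerLabel WHERE {
  wd:Q95 wdt:P127+ ?owner .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en". }
}
LIMIT 50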

yote_zip, in Asking advice for home storage configuration

Are you buying the hardware for this setup, or do you already have it lying around? If you don’t have the hardware yet, I’d recommend avoiding external USB drives if at all possible, as both speed and reliability will suffer.

If you already have the hardware and want to use it, I’m not super confident recommending anything given my inexperience with this sort of setup, but I would probably try to use ZFS to minimize any potential read/write issues with dodgy USB connections. ZFS checksums data in transit and will automatically repair and maintain it even if the drive gives you the wrong data. ZFS will probably be cranky when used with USB drives, but it should still be possible.

If you’re already planning on a RAID6, you could use a RAIDZ2 for a roughly equivalent ZFS option, or a double mirror layout for increased speed and IOPS. A RAIDZ2 is probably more resistant to disk failures, since you can lose any 2 disks without pool failure, whereas with a double mirror the wrong 2 disks failing can cause a pool failure. The traditional gripe about RAIDZ’s longer rebuild times being a vulnerable period is not relevant when your disks are only 2TB.

Note you’ll likely want to limit ZFS’s ARC size if you’re pressed for memory on the Orange Pi, as it will try to use a lot of your memory to improve I/O efficiency by default. It should automatically release this memory if anything else needs it, but it’s not always perfect.
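
For reference, a rough sketch of what that could look like on Linux; the pool name, device paths, and the 1 GiB ARC cap are just example values:

# Hypothetical 4-disk RAIDZ2 pool (use /dev/disk/by-id/ paths on a real system).
zpool create tank raidz2 /dev/sda /dev/sdb /dev/sdc /dev/sdd

# Cap the ARC at 1 GiB (value in bytes) so it doesn't crowd out the Orange Pi's RAM.
echo 1073741824 > /sys/module/zfs/parameters/zfs_arc_max

# To persist the cap across reboots, put this in /etc/modprobe.d/zfs.conf:
#   options zfs zfs_arc_max=1073741824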

Another option you may consider is SnapRAID+MergerFS, which can be built in a pseudo-RAID5 or RAID6 fashion with 1 or 2 parity drives. Parity calculation is not real-time, though: you have to explicitly schedule parity syncs, so if a data disk fails, anything changed since your last sync will be vulnerable. You can use any filesystems you want underneath this setup, so XFS/Ext4/BTRFS are all viable options.

This sort of setup doesn’t have ZFS’s licensing baggage and might be easier to set up on an Orange Pi, depending on what distro you’re running. One small benefit is that you can pull the disks at any time and the files on them will be intact (there is no striping). If a catastrophic pool failure happens, your remaining disks will still have readable data for the files they are responsible for.
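
To make that concrete, here’s a minimal sketch of the config; the disk names and mount points are all hypothetical:

# /etc/snapraid.conf -- two data disks plus one parity disk (pseudo-RAID5)
parity /mnt/parity1/snapraid.parity
content /var/snapraid/snapraid.content
content /mnt/disk1/snapraid.content
data d1 /mnt/disk1
data d2 /mnt/disk2

# /etc/fstab -- pool the data disks into one mount point with mergerfs
/mnt/disk1:/mnt/disk2  /mnt/storage  fuse.mergerfs  defaults,allow_other,category.create=mfs  0 0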

In terms of performance: ZFS double mirror > ZFS RAIDZ2 > SnapRAID+MergerFS (only runs at the speed of the disk that has the file).

In terms of stability: ZFS RAIDZ2 >= ZFS double mirror > SnapRAID+MergerFS (lacks obsessive checksumming and parity is not realtime).

PigeonCatcher,

Thank you! By the way, I’ve heard that ZFS has some issues with growing a RAID array. Is that true?

constantokra,

If you want to be able to grow, check out mergerfs and snapraid. If you’re planning to use a Pi and USB drives, it’s probably a better fit than ZFS and RAID arrays. It’s what I’m using and I’ve been really happy with it.

PigeonCatcher,

Thank you! Gonna check it out.

constantokra,

I’ve been using Linux for a long time, and I have a background in this kind of stuff, but it’s not my career and I don’t keep as current as I would if it were, so I’m going to give my point of view on this.

A ZFS array is probably the legit way to go. But there’s a huge caveat there. If you’re not working with this technology all the time, it’s really not more robust or reliable for you. If you have a failure in several years, you don’t want to rely on the fact that you set it up appropriately years ago, and you don’t want to have to relearn it all just to recover your data.

Mergerfs is basically just files on a bunch of disks. Each disk has the same directory structure and your files just exist in one of those directories on a single disk, and your mergerfs volume shows you all files on all disks in that directory. There are finer points of administration, but the bottom line is you don’t need to know a lot, or interact with mergerfs at all, to move all those files somewhere else. Just copy from each disk to a new drive and you have it all.
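
To illustrate with made-up paths: each disk just holds normal files in the same tree, and the mergerfs mount shows the union of them:

# On the individual disks (hypothetical layout):
/mnt/disk1/photos/2021/img_001.jpg
/mnt/disk2/photos/2021/img_002.jpg

# Through the mergerfs mount, both appear in one place:
/mnt/storage/photos/2021/img_001.jpg
/mnt/storage/photos/2021/img_002.jpg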

SnapRAID is basically a point-in-time parity snapshot. You can use it to recover your data if a drive fails. The commands are pretty simple, and relearning them isn’t going to be too hard several years down the road.
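
Roughly, the commands in question look like this (the disk name d1 is whatever you called it in snapraid.conf):

snapraid sync       # update parity to match the current data
snapraid scrub      # periodically re-check data and parity
snapraid status     # see what has changed since the last sync

# If disk d1 dies: replace it, mount the empty replacement at the old path, then:
snapraid -d d1 -l fix.log fix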

The best way isn’t always the best if you know you’re not going to keep current with the technology.

Borger, in [Solved] How to backup an entire blog hosted on medium.com?

Write a scraper using Python and Selenium or something. You may have to manually log in as part of it.
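
A rough sketch of the kind of thing I mean, using Selenium’s Python bindings; the blog URL and the link filter are hypothetical and will need adapting to the real page structure:

# Rough sketch only: the URL and the link filter are hypothetical and need adapting.
import time
from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Firefox()                 # or webdriver.Chrome()
driver.get("https://medium.com/@someblog")   # hypothetical blog URL
# (log in manually in the opened window here if the blog needs it)

# Scroll a few times so lazily loaded post links appear.
for _ in range(10):
    driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
    time.sleep(2)

# Collect candidate article links.
links = set()
for a in driver.find_elements(By.TAG_NAME, "a"):
    href = a.get_attribute("href")
    if href and "/@someblog/" in href:       # hypothetical filter; adjust to the blog
        links.add(href.split("?")[0])

# Save each article's rendered HTML.
for i, url in enumerate(sorted(links)):
    driver.get(url)
    time.sleep(3)                            # crude wait for the page to render
    with open(f"post_{i:03d}.html", "w", encoding="utf-8") as f:
        f.write(driver.page_source)

driver.quit()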

keefshape, in [Solved] How to backup an entire blog hosted on medium.com?

Pose this question to ChatGPT 3.5 or 4. Ask it to assist in making a (Python?) script to do this. Feed it the errors, and you can get there pretty quickly and learn along the way.

keefshape,

Lol, downvoted for…?

greengnu, in Asking advice for home storage configuration

RAID stopped being optimal now that Btrfs and ZFS exist.

If you plan on using matching drives, ZFS is recommended.

If you expect mismatched disks, Btrfs will work.

If you are most worried about stability, get a computer with ECC memory.

If you are most worried about performance, use SSDs.

If you want a bunch of storage for cheap, use spinning disks (unless you exceed the 100TB capacity range).

AuroraBorealis, in [Solved] How to backup an entire blog hosted on medium.com?

Does this actually modify the files when monolith embeds everything into one file?

oldfart, in [Solved] How to backup an entire blog hosted on medium.com?

Maybe some alternative frontend and then the regular methods like wget?

github.com/mendel5/alternative-front-ends#medium
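
If the frontend serves plain HTML, something along these lines may be enough (scribe.rip is just one example from that list, and the blog path is a placeholder; untested):

# Mirror a blog via an alternative frontend (example URL; adjust frontend and blog path).
wget --mirror --convert-links --adjust-extension --page-requisites --no-parent \
     --wait=1 "https://scribe.rip/@example-blog"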

lemann, in Do SSD failures follow the bathtub curve? Ask Backblaze

Interesting; from that data it seems SSD reliability so far isn’t too far off from HDDs (at least for Backblaze’s intensive cloud storage workload), despite having no moving parts…

Will be interesting to see how the reliability plays out as they build up their SSD inventory over the coming years

Car,

I agree. Consumer use cases of SSDs see a tremendous benefit, if only for accidental-damage reasons, but for enterprise data center use I would not have expected the same overall failure rates.
