“The more storage you have, the more stuff you accumulate.” — Alexis Stewart
Something seems very wrong about this whole idea.
There I was, minding my own business, reading about new web services, and it was all about data containers, docking, network interfaces, transport, routing, and other sorts of esoteric new technologies. And then it mentioned that the “data container” was 45 feet long and weighed 68,000 pounds, and the “routing” meant Google Maps.
Wait, what?
A quick review of the OSI 7-Layer Model of networking confirms that Layer 4 is defined as the “transport layer.” I had no idea that it had 18 wheels and diesel power. I must have missed that part of the specification. When some people say “transport,” they really mean transport.
Stop me if you’ve heard this one. You want to send a large file to your colleague sitting in the next cubicle. Normally, you’d attach the file to an email. Turns out, the file’s too large for your company’s server, so it gets blocked and your email never gets delivered. Of course, you don’t discover this for another hour, at which time your colleague is annoyed that you never sent the file you promised.
You can try breaking up the file into smaller parts, hoping the components will make it through. But then you’d need a file-splitting program, and so would your friend on the receiving end.
You could upload the file to some cloud storage service and then send the link. That might be quicker. But they’d probably make your coworker sign up for a free account, make up a new password, and verify it with a return email and a six-digit code via text message, all before you could use the “free” service. Annoying. Also not very secure.
In the end, you decide to just pass a dumb USB stick over the cubicle wall. The data’s only travelling 20 feet, and it’s just easier to physically pass it around instead of trying to second-guess all the modern high-tech alternatives. If there’d been a carrier pigeon in the office, or someone who could send smoke signals, you might have tried those options, too.
For all our high-tech, high-bandwidth, high-data-rate creations, sometimes it’s just easier to hand over a stack of papers and call it good. The venerable SneakerNet may never die.
Now multiply your email problem by… oh… a trillion. Let’s say you’ve got an exabyte of data (that’s a billion gigabytes) lying around, collecting dust, clogging up your servers. Your IT manager comes to you and decrees that you must either (a) back up the data to secure storage; (b) move it all to an offsite server farm; or (c) start renting space on Amazon AWS, Microsoft Azure, or some other third party. And you’ve got a year to make this happen.
No problem, right? A whole year, just to squirt some data down a network pipe? How hard can it be?
This is where bandwidth hits its head on the low doorway of physics.
Rather than procrastinate for 360 days before typing “cp -R /* /files/offsite,” you might want to get out your slide rule and calculate just how long it really takes to transfer a billion gigabytes over any real-world network. Spoiler: it takes a very long time. More than a year, in fact. Oops.
This is not a hypothetical problem. Quite a few companies have amassed terabytes, petabytes, and even exabytes of data, and they need someplace to put it. Actually, storing the data is not the problem. Accumulating that many bits takes a long time, so it’s easy to keep buying more hard drives and storage racks to always stay ahead of demand.
It’s transporting the data that’s hard. Once you decide that you need to move it all from Point A to Point B, you’ve got to herd all those bits to pastures new. How, exactly, do you do that?
In a big metal box, of course. What, you thought you’d use a network? Perish the thought. Networks are for wimps.
Here’s a quick, back-of-the-napkin calculation. Sending 50 terabytes (TB) of data over a 1 Gbps link would take more than four days. Levelling up and squirting 10,000 TB over a 10 Gbps link would take more than three months. Your exabyte, over that same 10 Gbps link, would take about 26 years, by which time your boss would likely have run out of patience. “The progress bar is at 17%, boss! Just a few more years!”
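If you don’t trust the napkin, the arithmetic is easy to script. Here’s a minimal Python sketch (the transfer_days helper is mine, not anyone’s product; it assumes decimal units, eight bits per byte, and the same 10 Gbps link for the exabyte case):

    def transfer_days(num_bytes, link_bps, utilization=1.0):
        # Convert bytes to bits, divide by effective throughput.
        seconds = num_bytes * 8 / (link_bps * utilization)
        return seconds / 86_400  # seconds per day

    TB, EB = 10**12, 10**18
    print(transfer_days(50 * TB, 1e9))        # ~4.6 days
    print(transfer_days(10_000 * TB, 10e9))   # ~93 days: about three months
    print(transfer_days(EB, 10e9) / 365)      # ~25.4 years: call it 26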
And those numbers all assume you’re using 100 percent of your network’s bandwidth. Doubtless your coworkers would be less than thrilled about you monopolizing the network for, well, forever. Realistically, you’ve got some (small) percentage of bandwidth available to you for this massive backup job. Better add a few more years to the estimates.
What’s the solution? Faster networks? Nope, bigger hard drives. Much bigger. Diesel-powered ones. On wheels.
“We’re going to need a bigger box,” says Andy Jassy, CEO of Amazon Web Services (AWS), speaking of their new product called Snowmobile.
Ridiculous as it sounds, Amazon (and others) discovered that the quickest way to transport large amounts of digital data is to physically pick it up in a big truck and haul it away, like so much shredded paper. AWS will happily rent you a Snowmobile 18-wheeler, complete with driver, diesel generator, outboard climate control, 45-foot containerized NAS, connection rack, and a whole bunch of really long fiber-optic cables. Snowmobile literally backs up (to) your network.
Once Snowmobile does its Vulcan mind-meld with your system, it can suck data out at the rate of 1 Tbps, filling the truck in 10 days. How much data fits in 45 feet? One hundred petabytes, says Amazon. That’s one-tenth of an exabyte. Got more than 100 petabytes to transport? No problem; multiple Snowmobiles can work in parallel, assuming your parking lot is big enough. Moving an exabyte from here to there is now just a six-month project, so you still have a little time to procrastinate.
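Amazon’s figures survive the same napkin treatment. A quick sketch, reusing the numbers above (decimal petabytes assumed):

    PB, EB = 10**15, 10**18
    truck_bytes = 100 * PB                        # one Snowmobile's capacity
    fill_days = truck_bytes * 8 / 1e12 / 86_400   # filled at 1 Tbps
    print(fill_days)                              # ~9.3 days: call it 10
    print(EB // truck_bytes)                      # 10 trucks per exabyte

Ten trucks per exabyte, at roughly ten days per fill; presumably the rest of the six months goes to driving, unloading, and paperwork.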
LOL … sneaker net will always be alive.
Now the really interesting part is making sure that, after 10 days at 1 Tbps, it’s all error-free and can be extracted error-free at the other end. That’s really hard when compressed files are encrypted and the media’s error rate amounts to more than one bad bit per storage block. That’s 10 to the minus how many digits?
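For what it’s worth, the commenter’s question has a napkin answer too, if we assume “error-free” means fewer than one expected flipped bit in a full 100-petabyte truckload:

    import math

    bits = 100 * 10**15 * 8              # 8e17 bits in a 100 PB truckload
    required_ber = 1 / bits              # for < 1 expected flipped bit
    print(required_ber)                  # ~1.25e-18
    print(math.floor(math.log10(required_ber)))  # -18

So: 10 to the minus 18, give or take a digit.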