$0.01 per GB per month storage
$0.01 per GB retrieval
Normal egress fees on top, so additional ~$0.10 per GB if you want to retrieve outside of Google Cloud.
Early deletion fee. Effectively just a minimum storage charge of 1 month.
It seems this is cheaper than Glacier and quite a bit simpler. The speed restrictions are interesting though.
Unless Glacier just changed their pricing, how is it cheaper and simpler? Glacier is also $.01/GB storage, but only $.09/GB retrieval, which is cheaper than google's $.12/GB. Glacier also comes with 1st GB retrieval free/mo.
Glacier pricing is surprisingly complicated, and the actual cost can be much higher than $0.01 per GB-month if you don't read the fine print.
The biggest gotcha is that you can only access 0.17% of the data you've stored in any given day without extra charges. So if you've stored 1000 GB, you can only access 1.7 GB per day for free.
The cost for going over your daily "retrieval allowance" can be large, because cost is driven by "peak retrieval rate". They find the highest hourly retrieval rate over the entire month, and charge the whole month's retrieval at the peak rate.
This can get expensive fast. Again, with 1000 GB stored, if you retrieve 200 GB over the course of 4 hours, your hourly retrieval rate is 200 GB / 4 hours = 50 GB per hour. Your free retrieval allowance is 1.7 GB per day / 24 hours = 0.07 GB / hour. So your excess retrieval is 49.93 GB per hour.
Amazingly, that 49.93 GB per hour for 4 hours is charged for all 720 hours in the month, so that's 49.93 * 720 * 0.01 = $359.49.
That's an astonishing $1.79 per GB just to retrieve one-fifth of the data from Glacier storage.
So Glacier only makes sense for true cold storage, where you are very unlikely to touch more than about 0.007% of it in any given hour.
Google's Nearline is much simpler and faster for the same storage cost -- and far cheaper if you need to access more than a tiny fraction of the data.
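Roughly, in Python (my own back-of-the-envelope reconstruction of the peak-rate billing described above, using the $0.01/GB-hour rate and 5%-per-month free allowance quoted in this thread, not AWS's actual billing code):

    # Rough model of Glacier's peak-rate retrieval billing (assumed, not official).
    def glacier_retrieval_fee(stored_gb, retrieved_gb, hours, rate_per_gb_hour=0.01):
        free_gb_per_hour = stored_gb * 0.05 / 30 / 24   # 5%/month allowance, prorated hourly
        peak_rate = retrieved_gb / hours                 # billing is driven by the peak hourly rate
        excess = max(peak_rate - free_gb_per_hour, 0)
        return excess * 720 * rate_per_gb_hour           # charged as if sustained for all 720 hours

    # The example above: 1000 GB stored, 200 GB pulled over 4 hours -> ~$359
    print(glacier_retrieval_fee(1000, 200, 4))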
Don't know about the OP, but we were bitten by this - badly - after our logic to throttle Glacier retrieval failed.
AWS has now added the ability to set a spending limit to avoid runaway retrievals. Nonetheless, from my point of view the ultra-complicated pricing scheme is nothing short of a disaster for Glacier as a product. I think it will continue to seriously impact its uptake, and Google is smart to exploit the opportunity to offer a simple, straightforward pricing scheme.
The use case is people who have a requirement to store data and will probably never need to access it, whether for regulatory requirements, legal requirements, etc. In that case paying a couple of hundred bucks for retrieval is fine, especially if they can access it at a high level of granularity.
edit: Also as others have pointed out, Nearline is limited in retrieval time as well, so the cost difference isn't nearly as large.
Why does Amazon charge like this? It seems like on the rare occasion that you need to send some person to go grab the tape/disk from storage and bring it online, Amazon would want you to get all the data you need and put it back in storage.
Incentivizing users to bring the data online once a day to trickle it out seems bad for all involved.
I think it makes most sense to think of it as paying for access to the tape robot.
If Amazon store your data in a tape archive (I don't know if they do, but they at least seem to have similar constraints), they can only access a small portion of the stored data at a time, so they need to control how often people request data.
They could just rate limit everyone, but this way allows people to pay for priority in an emergency while still discouraging everyday read requests.
The pricing makes more sense if you're a large user with data spanning several tapes than if you're in the single-terabyte range, but the low limit still discourages you from making requests casually, which helps them keep their SLA.
If they predict that you'll trickle out your small file, they can just read out everything on the first access and cache it online, so there's no extra trips to the archive for them.
Note: the following is speculation as best I know. I'm recalling from memory something I read on the internet, written by someone who did not have a direct source.
Glacier supposedly uses low-speed (5400 RPM) consumer drives, which they then clock down further to save energy. Any given 'rack' only has enough power for a few drives at a time; the rest are powered down.
To prevent multiple customers from all trying to pull their data out they needed to introduce a rate limiting system, which they did with this exorbitant pricing.
You assume Glacier uses tape storage... But the same principle still applies, they're trying to prevent people clogging up the network so when somebody does need to do an emergency restore there is lots of spare network capacity.
But if you retrieve the same 1000 GB from Glacier over the same 3 days that Google takes, you would pay roughly a $98 retrieval fee. Amazon's egress is about $30 cheaper ($0.09 vs $0.12/GB for 1000 GB), so for that scenario Glacier actually ends up somewhat more expensive, but it does allow you to retrieve the data much faster if you really have to, at a price.
So Google looks cheaper/simpler for the slow case, but the Amazon model is actually quite good for disaster recovery, where you may be in a situation where you "need it now at any cost".
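For what it's worth, here's that comparison worked out with the prices quoted in this thread (a simplified sketch; it ignores free tiers and the first-GB-free allowance):

    # Retrieving 1000 GB over 3 days, using the per-GB prices quoted in this thread.
    gb, hours = 1000.0, 72.0

    # Nearline: $0.01/GB retrieval + ~$0.12/GB egress
    nearline = gb * 0.01 + gb * 0.12

    # Glacier: peak-rate retrieval fee + $0.09/GB egress
    free_per_hour = gb * 0.05 / 30 / 24          # 5%/month allowance, prorated hourly
    peak = gb / hours                             # ~13.9 GB/hour
    glacier = max(peak - free_per_hour, 0) * 720 * 0.01 + gb * 0.09

    print(round(nearline, 2), round(glacier, 2))  # roughly 130 vs 190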
Glacier has additional fees depending on how fast you retrieve the data. Honestly I'm not sure what they are because I've never had to use it, but my impression is that these fees can be quite high.
They can be. It's possible to set a data retrieval policy on your account (either free-tier only, or capped at a fixed price point) so you can control the cost of your final bill if you want to retrieve faster than the free tier allows.
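If it helps, this is roughly what that looks like with boto3 (a sketch; the policy applies per account/region, and the 10 GB/hour cap here is just an illustrative number, so check the current Glacier docs before relying on it):

    import boto3

    glacier = boto3.client("glacier")

    # Option 1: never go past the free retrieval tier.
    glacier.set_data_retrieval_policy(
        accountId="-",  # "-" means the account that owns the credentials
        Policy={"Rules": [{"Strategy": "FreeTier"}]},
    )

    # Option 2: cap retrievals at a fixed rate (here ~10 GB/hour) to bound the bill.
    glacier.set_data_retrieval_policy(
        accountId="-",
        Policy={"Rules": [{"Strategy": "BytesPerHour", "BytesPerHour": 10 * 1024**3}]},
    )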
At least with Glacier I get the option of retrieving faster. Nearline is going to take close to 3 DAYS to retrieve my data if I only have 1TB stored (4MB/s per TB).
I wrote a backup program that uses Glacier, and the retrieval policies are nearly impossible to manage and explain. But the thing I really don't get is that you can use Amazon Import to read a hard drive into Glacier, but you can't use Amazon Export to get a drive full of data back out. You can with S3.
In a disaster situation, a company is going to want hard drives sent to them next day. As others have mentioned, this isn't a money thing, it's a time issue. But it isn't available with Glacier (probably not with Google either...)
If you're willing to pay more for faster retrieval I guess you could store e.g. 5TB of random data to get a fast "base speed"- cutting those 3 days down to 12 hours.
Or use Nearline for archival data as well as disaster recovery. You won't need to transfer the archival data for normal disaster recovery scenarios, so it'll be -- for that scenario -- extra data that boosts your speed, but it's still, on its own, a useful use of the service.
"Unless Glacier just changed their pricing, how is it cheaper and simpler? Glacier is also $.01/GB storage, but only $.09/GB retrieval, which is cheaper than google's $.12/GB. Glacier also comes with 1st GB retrieval free/mo."
As long as we are comparing effective pricing, which is the only number that matters with Glacier and "Google Nearline" (and, to some degree, S3) it should be noted that rsync.net PB-scale is 3.0 cents with no additional charges.
That is, the effective price, no matter what your use-case or traffic/usage, is 3.0 cents per GB.[1]
Of course, you do have to buy a petabyte of it ...
Quote:
"This is a Beta release of Nearline Storage. This feature is not covered by any SLA or deprecation policy and may be subject to backward-incompatible changes."
So should I believe in Google's goodwill? I would be fine trying out some services that are in Google beta. But my valuable data? They should have an SLA right from the start to gain users' trust.
If we place too many restrictions on how companies should offer preview releases, they'll just stop entirely. Are you suggesting that even those comfortable taking on risk to get a sneak peek should be forced to wait for general availability?
Praise be the skeptical! It allows the courageous easy advantages. SLAs are almost worthless. And, really, has Google ever retired something like this (i.e., not an acquisition, not Google Reader, etc)?
Reply to aros: The two main criteria I would have are: 1) not an acquisition and 2) important development/developer service. I can't think of any and didn't see any after a cursory look at the list.
I think SLAs are literally worthless since I don't think they encourage even slightly more effort in minimizing whatever issues the customer is concerned about. No one wants servers to go down, no one wants to shut down a service. That a few Google bucks might be on the line would have zero impact.
But if the 3 seconds becomes 6 seconds... Or the price goes up... Or they announce they're end-of-life'ing the product...
Just move your data somewhere else.
Sure, it'd be inconvenient (and maybe expensive) to move. So, you balance all of that out in your mind, and maybe this is the right service for you, and maybe it's not.
I'm not even sure it'd be "expensive to move". Are people _really_ considering using this (or Glacier/rsync.net/whatever) as their _only_ copy of their data? I can't imagine looking my boss/customers in the eye and saying "We're going to have multiple terabytes of mission critical business data, and it's all going to live _only_ on AWS/Google/cloud-service-du-jour!"
If I lose my AWS Glacier stored data (or Amazon bump the prices intolerably), I'll upload it to a competitor _from my local copies_...
Admittedly, I've only had to deal with storage topping out in the tens-of-terabytes range, so I've never needed to go beyond a dozen or two consumer-grade drives to keep a pair of rsynced copies locally - but I think the same kind of technique scales all the way out to building your own Backblaze-style storage pod if needed.
You're thinking about data that can't be lost, or your customers are screwed. Not all data is like that.
Log files come to mind. They're _nice_ to archive for a long time, but in many businesses, they're certainly not _critical_ to archive for a long time.
Intermediate files, too. You retain the original files in secured storage. But because the intermediate files are large and expensive to re-create, you keep them here in AWS.
In beta, you build the tools to use the service with your data, but don't use it for mission-critical data until you have an SLA, etc., that you are comfortable with for that purpose.
And if you aren't comfortable even investing development efforts against a beta, don't. Different customers have different risk tolerances, and the fact that an early access product doesn't meet yours doesn't mean it shouldn't be available for those whose risk tolerances it is suitable for.
Surely the whole point of not having an SLA (and indeed the whole point of calling it a "beta") is so that you don't trust it with your valuable data. I'm sure it will have an SLA soon enough.
It seems like this is why they have the clause that you quoted. They aren't yet comfortable guaranteeing reliability or backwards compatibility (i.e. what "beta" means). If these things are crucially important to you, there's a pretty good chance you shouldn't be using it for your "valuable data".
It seems most useful as a place for redundant, encrypted backup. You don't really need it to be reliable (or safe) to store an encrypted online version of ~500gb of family photos that you have on an external hard disk.
It's probably something that could be deployed in the "nice to have" category: use it as a cache for your offline data to provide quick recovery of things you would previously have needed to go all the way to offline to retrieve. But if something goes wrong, you still have the offline data to recover from.
You shouldn't be trusting your production data to anything in beta. The whole point of open beta testing is to pool the technical risk. So use it, but don't rely on it. If you can't afford to use it in parallel with some existing system, then don't use it.
Just a correction: the city name is Sao Paulo, with a 'u'. It has a tilde (São), but you can omit that if you want. The thing is that 'Paolo' with an 'o' isn't a Portuguese name (it's Italian). In English it's written the same as in Portuguese. I see this mistake sometimes, so I've gotten in the habit of reminding people; I wasn't calling you out, just trying to prevent future mistakes.
Actually there are not 7 regions for Google Cloud Storage but 3: US, EU, ASIA. You can get more specific, but that's still Alpha: ASIA-EAST1, US-CENTRAL1, US-CENTRAL2, US-EAST1, US-EAST2, US-EAST3, US-WEST1.
I understand the downvotes (they're very different services, and Dropbox prices can be very low because you're paying for capacity not usage - case in point I'm currently using 25 gigs but paying for a terabyte) but honestly - that's an interesting point... anyone want to write an API?
* Dropbox doesn't guarantee 11 9s like this service does - when I'm backing critical data up, I want to make sure it's _there_.
* Dropbox likely wouldn't take kindly to me storing 10PB, whereas that is what this service is designed for.
* I've got SLAs and guaranteed speeds with this, Dropbox isn't designed for me to suddenly download 10PB very quickly.
In this case, the durability refers to the loss of data/objects stored per year - if you're sending multiple PBs of data off to Glacier, you want to be able to retrieve them many years later. Even 5 9s would mean that 1 object out of 100,000 is lost every year, which is quite poor.
12 9s is new to me too. I've not worked on anything better than 5 9s (about 5 mins downtime/year), and Wikipedia only goes to 9. 9 9s comes to about 32ms of downtime a year. I can't think of anything that needs more than 5 9s, let alone 9.
This is a durability metric, not an availability one. This typically tells you the likelihood of losing a given object in a year. (5 9's would be quite poor for this, implying a loss of 1 object out of 100,000 every year)
In this case, it's not so much service uptime as data retention. If you're storing 5+ PB of data, even 5 9s of durability can mean a measurable amount of data lost per year.
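To put rough numbers on it (a back-of-the-envelope sketch; treating durability as a per-object, per-year probability is a simplification, and providers quote it in different ways):

    # Expected objects lost per year for a given number of "nines" of annual durability.
    def expected_losses(num_objects, nines):
        return num_objects * 10 ** -nines

    # With 1 billion objects stored:
    for nines in (5, 9, 11):
        print(nines, "nines ->", expected_losses(1_000_000_000, nines), "objects/year")
    # 5 nines -> 10000.0, 9 nines -> 1.0, 11 nines -> 0.01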
It's safe to assume this is going to be just like Glacier and S3/Google Storage, ie: unlimited.
Also, retrieval speeds increase with your data set size: "You should expect 4 MB/s of throughput per TB of data stored as Nearline Storage. This throughput scales linearly with increased storage consumption. For example, storing 3 TB of data would guarantee 12 MB/s of throughput, while storing 100 TB of data would provide users with 400 MB/s of throughput."
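That formula makes it easy to estimate a full-restore time for a given amount stored (a quick sketch using the documented 4 MB/s-per-TB figure; actual throughput may differ):

    # Estimate time to pull all data back out of Nearline, given 4 MB/s per TB stored.
    def full_restore_days(stored_tb):
        throughput_mb_s = 4.0 * stored_tb               # documented scaling
        total_mb = stored_tb * 1_000_000                # 1 TB ~= 1e6 MB
        return total_mb / throughput_mb_s / 86_400      # seconds -> days

    print(full_restore_days(1))     # ~2.9 days, and since throughput scales with data,
    print(full_restore_days(100))   # a *full* restore takes ~2.9 days at any size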
I was just interested as to what these storage systems are capable of supporting. While I'm confident they could all store 10 TB of Data (That's just barely Tier-2 of 6 for Amazon), I'm wondering if they have the back end capability to store 10 PB of data.
The issue with all the 9s is that it will take 3 hours to find out there is an issue, and another 3 hours to make another request. I bet Dropbox could fix their shit in less time.
Hm yeah, I must admit I have a hard time understanding the high pricing on cloud storage. I always end up comparing it to something like OVH storage servers, and cloud storage seems way overpriced.
Based on the prices I can see on OVH's website, you're not going to be able to hit 1c/gigabyte with many nines of durability. And that's ignoring the cost of operating the service yourself.
Total apples/oranges. You are comparing a consumer system focused on providing folder synchronization to a system that is meant to be used at scale (and is scalable).
As a former Arq user, I'm happy to pay BackBlaze $5/month for not ever thinking about the backup problem again, on a bunch of family machines (mostly Macs but a couple of Windows machines as well).
No thinking about installing software (e.g., Arq, and finding something equivalent for Windows), keeping that software up to date, checking that it's up and running properly, etc. (BackBlaze tells me when it hasn't been able to reach a given machine after some period.)
So, no, I think the SaaS backup industry has nothing to fear from cheap online storage, at least for ordinary folk. (Hackers are a vanishingly small segment of that market.)
Update (5 days later): It's added! If you're an Arq 4 user, pick "Check for Updates" from Arq's menu to get it. Go to Preferences, Destinations tab, and click the + button. Pick "Google Cloud Storage" for the destination type.
It's really great. I'm using it instead of Glacier for all my personal backups.
This is exactly what I want for Linux (headless), but I haven't been able to find anything like it. The closest thing I've found is attic, which, while amazing, requires a daemon to be running on the remote end (which is what makes checking, deduplication and deltas fast).
Unfortunately duplicity is very inefficient, it needs to reupload the full set every so often, which takes up a lot of bandwidth. I guess attic is my best bet in the foreseeable future.
> the last excuse for not using self-hosted, secure backup tools like Arq[1] disappears.
Hardly. I pay for support. I pay for liability. If something goes haywire with my backups, I have a phone number I can call. A number that isn't a software developer / sysadmin who has no idea why I'm calling about my mom's missing holiday photos.
"I pay for support. I pay for liability. If something goes haywire with my backups, I have a phone number I can call"
Google "{service} lost all of my data" and see what kind of support these people got. I get that it gives you warm fuzzies but when these companies loose all of your data, you have almost no recourse.
Sort of. I've called DropBox about missing data before. They actually were quite helpful and even helped recover most of it, despite the fact that I wasn't even a paying customer at the time.
A more likely scenario is that the data is not lost, but my mother (or whoever is calling them) simply has things misconfigured. These companies _do_ provide useful support. Support that they're not going to get if they backup to a developer oriented service directly.
Well, I would guess the number of people who need to backup 4T is rather small.
My thinking is that most people likely max out well below the 0.5T that the same $5 buys you on Google now, making that variant actually cheaper for them than Backblaze.
For those people it means paying less per month and getting privacy and better control (data retention!) in return. Should be a no-brainer.
Backblaze lets you add a personal encryption key. I guess they could log that key when you try to decrypt and restore, although I trust they don't. I suppose the NSA probably gets the key when it gets transmitted, but I don't really care. Am I missing something here? If Backblaze's encryption implementation is substantially worse, I may switch.
"Well, what do I have to hide?" is a bad argument with respect to whether the NSA should be doing the things it does, but with respect to whether I am going to spend money and effort to hide my photos and documents from them, I think it's a fair argument.
I care very much, however, whether hackers will have access to my files, as they can cause havoc with things like tax returns that the NSA won't (I mean, the government already has my tax returns....).
So I was mainly curious if there was some significant flaw in Backblaze's encryption that should worry me from the perspective of a non-nation state adversary.
Can anyone comment how Crashplan stacks up vs Backblaze here? I'm leaning towards it as it'll save the data for more than 30 days after deletion, and security's another factor in my choice.
Backblaze can probably afford to be $5/mo per computer because the average computer using it has less than 4TB; OTOH, if there is a more cost effective option for the low end, it won't make sense to use Backblaze on the low end, which will drive up Backblaze's sustainable per computer price.
I'm considering switching from Backblaze to Arq to back up personal files (photos, etc.) from my rMBP. Anybody have experience doing this and can recommend one vs. the other? I have a 500GB drive, probably only 300GB to back up. Backblaze is only $5/month and lets you encrypt, so I'm wondering if it's even worth switching.
I'm using Arq with DreamObjects currently (hope Google support will be added soon), backing up two laptops and a desktop, total backup size ~3T.
I can hardly recommend it highly enough.
The only minor niggles that I've run into were:
- If you have a very large directory (hundreds of thousands of files) then opening the backup-catalog can give you the beachball for minutes. However this really only happens for ginormous directories and is easily resolved by splitting the job into a few smaller ones.
- If you're low on disk space then the temp files that Arq creates during a backup can drive you over the edge and into a disk-full situation.
Other than that it has been rock-solid for me, the author really knows what he's doing.
How is using Arq+Nearline better than Backblaze (client and storage)? It's not really "self hosted" if you're still dumping the data to a cloud provider. Even if you configure Arq to use your own "cloud" (cost of that notwithstanding), I don't think it's an overall net improvement.
(I'm the developer behind Arq) With Arq you keep control of your stuff because it's encrypted before it leaves your computer. With Backblaze you can choose your own encryption key, but to restore your files you have to give them your key; they decrypt your files and leave them (unencrypted!) on their servers for you to download, which I think makes choosing your own encryption key pointless.
https://news.ycombinator.com/item?id=8169040
https://www.backblaze.com/backup-encryption.html
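To illustrate the general idea (this is not Arq's actual on-disk format, just a minimal sketch of client-side encryption, assuming the Python "cryptography" package is available):

    from cryptography.fernet import Fernet

    # The key never leaves your machine; the storage provider only ever sees ciphertext.
    key = Fernet.generate_key()        # keep this safe somewhere the provider can't reach
    f = Fernet(key)

    data = b"family photos, tax returns, ..."   # stand-in for the real backup data
    ciphertext = f.encrypt(data)                 # this is what gets uploaded

    # A restore downloads the ciphertext and decrypts it locally, with the same key.
    plaintext = f.decrypt(ciphertext)
    assert plaintext == data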
While I personally don't mind Google having all my backup data, we must appreciate the fact that in this post-Snowden world some people will justifiably refuse to go with the big USA-corp players like Amazon or Google.
rsync.net just saying: "Hey, we're not Google or Amazon" is already a big selling point for some.
I encourage you to email us and ask about the HN readers discount.
Further, Amazon S3 is a better comparison, as far as pricing goes - we're fully live, online, random access storage - not nearline or weirdline like google/glacier.
The cost of one dinner out is prohibitive? Just testing a backup solution alone would cost more than that, much less rolling your own, unless your time is worth considerably less than $40/hour.
Glacier is the other big offering from a comparable cloud vendor in this space, so, yeah. The performance claim is directed largely at Glacier, as is the consistent access one.
With a 3s average response time and presumably replication across multiple disks, you've plenty of scope to schedule requests in a way that they won't affect performance.
In 3s on one disk you can do a few hundred seeks and read up to ~300MB sequentially. They only need to do 1 seek and read 4MB.
And at 3MB/s per TB it will always take over 3 days to retrieve all of your data. The 2-5 second retrieval latency is irrelevant when you're going to be waiting for 3 days...
> You should expect 4 MB/s of throughput per TB of data stored as Nearline Storage.
But still, you are correct in that it will take about 1TB / 4MB/s = 2.9d to retrieve all your data.
If you need the ability to do a restore faster than this, then you need to pay more for storage, is all. For many of us, waiting 3d to recover from a catastrophic failure isn't a big deal.
I own a big wholesale telco that does tons of data center business and bandwidth. Of course, margins are super thin so we need great pricing. We are no Google, but we can achieve the same pricing, including the cost of bandwidth (that is 9-9s of durability and no spin-up time, which is where the 3-seconds come from)
Maybe I should start my own service to harness all this infrastructure with something like swift!
from the documentation: "You should expect 4 MB/s of throughput per TB of data stored as Nearline Storage. This throughput scales linearly with increased storage consumption."
That's a good observation. I wonder how it works for multiple objects that are significantly smaller than 1TB. If I'm streaming one at 4MBps and then after a bit, decide to start another object download, does the original slow to 2MBps?
The unusual thing about this market is that by constantly improving the price point of their product but keeping their profit margins the same, AWS has forced every competitor in it to actually compete at the limits of their capability - so any scheme like this that got any serious traction would presumably be self-undermining. The economists will have considered all of the unused provided capacity in the model before they priced it. Sorry to be miserable ;)
I don't know why you think so. AWS is not very price competitive. I make good money consulting on setup of more cost-effective alternatives to things like S3.
The large cloud providers sell on brand recognition, convenience and trust, not cost. They may compete with each other on cost to try to cannibalise each others markets, but they're not even close to pushing the envelope on cost efficiency in terms of the prices they offer.
Just wondering, what is more cost effective than S3 with the same reliability?
If you have your own servers I can see running zfs and replicating snapshots every few minutes to a remote machine but if you aren't, what else is competitive?
I do agree that a lot of AWS services aren't cost effective though, especially at any sort of non trivial scale.
Setting up your own with any of Ceph, Gluster, Riak, Swift or similar depending on your specific needs. Heck, I've got setups where we've been served well for years with a combination of inotifywait, grep and rsync (combined to trigger rsync instantly on modification events). It really depends on your access patterns, and there are lots of potential savings from making use of domain specific knowledge for your specific system.
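As a flavour of the low-tech end of that spectrum, here's roughly what the inotifywait-triggered rsync looks like (a minimal sketch assuming inotify-tools and rsync are installed; the paths, remote host and 5-second debounce are made up for illustration):

    import subprocess, time

    SRC = "/data/"                                  # hypothetical local directory
    DEST = "backup@mirror.example.com:/data/"       # hypothetical remote target

    # Stream filesystem events from the kernel via inotifywait (-m = monitor, -r = recursive).
    watcher = subprocess.Popen(
        ["inotifywait", "-m", "-r", "-e", "modify,create,delete,move", "--format", "%w%f", SRC],
        stdout=subprocess.PIPE, text=True,
    )

    last_sync = 0.0
    for line in watcher.stdout:
        # Debounce: don't re-run rsync more than once every few seconds.
        if time.time() - last_sync > 5:
            subprocess.run(["rsync", "-az", "--delete", SRC, DEST], check=False)
            last_sync = time.time()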
In general, you can beat AWS with 3x replication across multiple data centres even with renting managed servers, especially as your bandwidth use grows as AWS bandwidth prices are absolutely ridiculous (as in, anything from a factor of 5 to 20 above what you'll pay if you shop around and depending on your other requirements). Lease to own in a colo drops the price even further.
Anything from 1/3 to 1/2 of AWS costs is reasonable with relatively moderate bandwidth usage, with the cost differential generally increasing substantially the more you access the data.
Most people also don't have uniform storage needs. If you use your own setup, people tend to be able to cut substantially more in cost by reducing redundancy for data where it's not necessary etc.
See also the Backblaze article that mentioned Reed-Solomon - if your app is suitable for doing similar you can totally blow the S3 costs further out of the water that way.
Compared to what? Honest question - not trying to be cheeky.
If you are comparing against a self managed solution, are you factoring all costs into that equation (fully burdened labour costs, disposal, cooling, power, etc.).
For a reference point: The hardware cost for storage servers with full redundancy is in the order of $0.10 per GB (plus electricity, rack space etc.). Buying your own hardware and throwing it away every year is price competitive with Nearline (and you get full online storage, no three second delays and slow retrieval).
Of course there's good reasons not to roll your own solution until you reach a certain scale, but it's certainly possible to compete with Google and AWS on price.
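As a sanity check on that claim (a rough sketch; the $0.10/GB hardware figure and the one-year replacement cycle are the assumptions from the comment above, and it ignores power, rack space and labour):

    # Compare a buy-and-replace-yearly storage server against Nearline's storage price.
    hardware_per_gb = 0.10          # assumed all-in hardware cost, with redundancy
    lifetime_months = 12            # assumed replacement cycle

    own_per_gb_month = hardware_per_gb / lifetime_months
    nearline_per_gb_month = 0.01

    print(round(own_per_gb_month, 4), nearline_per_gb_month)   # ~0.0083 vs 0.01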
> For a reference point: The hardware cost for storage servers with full redundancy is in the order of $0.10 per GB (plus electricity, rack space etc.). Buying your own hardware and throwing it away every year is price competitive with Nearline (and you get full online storage, no three second delays and slow retrieval).
Well, it would be price competitive if you considered only the hardware costs and not, e.g., the labor cost involved.
Consider that there are labor costs around operating services on AWS as well, and it's generally more expensive by the hour, and labor cost of managing your own servers is fairly low.
I have plenty of servers sitting in racks that haven't been touched in 5+ years; if you were to operate on a "buy and throw out in a year" practice, my typical labour costs would average a couple of hours per new server for setup. Or if you rent managed servers, you don't ever need to touch (or see) the hardware.
Conversely, you can get bandwidth at a tiny fraction of the cost of AWS bandwidth, so the moment you actually transfer data to/from your setup, AWS gets progressively more expensive.
Our story with Amazon Glacier (50 TB): we are a company with 50 TB of data spread across multiple servers, NAS boxes, and some external drives. Most of the data needs to be kept/archived for at least 10 years, and it was becoming very hard to manage locally, so Glacier looked like a great solution for us. Well, not exactly.
We tried using it manually, via the APIs, scripts, etc. It was very hard, practically impossible.
So we tried CloudBerry to help us upload. It will upload, but it doesn't work well for huge data sets: with Glacier you can't easily search, list, or find the file you want to download, and it's not practical to manage the whole backup and millions of files manually. We also have millions of photos and needed a way to find them easily, e.g. via thumbnails.
So Glacier had too many restrictions: 3-5 hour restore times, the 5% restore quota, hard to use even with utilities like CloudBerry, no easy listing or search. It's not usable as-is, but the price was attractive.
We considered Seagate EVault, but it was expensive, had many hidden fees, and was too complicated for our case.
Then we tried another solution called Zoolz (http://www.zoolz.com). Zoolz does not use your own AWS account but their own, and from what I heard they have about 5 petabytes stored, so they get a massive restore quota (around 5% of that 5 PB). It's simple to use and they offer zero restore cost; it's like Mozy or CrashPlan, but for business and built on Glacier. They internally create thumbnails for your photos and store them on S3 so you get instant previews, and you can search and browse your data easily and instantly. They have servers and policies and a reasonable price: everything we wanted. We got the 50 TB for $12,000/year. That's more expensive than using Glacier by itself, which would cost around $6,000, but Glacier isn't practical for a company to use as standalone storage.
The only disadvantage is that when we need a file we have to wait about 4 hours to get it, which is fair; it's still faster than when we used to use tapes :). We tried restoring 1 TB with 1.2 million files and it took around 10 hours to complete, which was okay.
The transport for this service is HTTP? I imagine most competitors use that as well, right? How does encoding factor in here? I have to transcode my data to base64 in order to put it in or take it out, I assume? I "know" I'd only get billed for the data stored in its original octet/binary encoding, but what about the egress fees? Is encoded or decoded data the input for the billing?
Some of these Google/Amazon/Apple "Introducing Product X..." stories are starting to strike me as intellectually devoid, creeping into "free advertising" territory.
This particular story is about a Google service that is an extremely late entrant into the market (Amazon Glacier, etc) and offers almost nothing new. I'm not saying it's a bad product or not useful to folks, but it's not, by any means, groundbreaking. I'd much rather see the top spot of HN occupied by some startup's new idea or a researcher's new findings.
I also don't mean to imply that "big corporate" == bad. Certain products -- self driving cars, Space X automated landings, etc -- are absolutely worthy of our attention and discussion. I just hope that people would think twice before upvoting a story merely because it's from Google.