Question from ignorance: how do you get "petabytes of data" into the Google Cloud in a reasonable time? I find copying a mere few TB can take days, and that's on a local network, not over the internet.
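Back of the envelope, assuming a sustained 1 Gbit/s path (already optimistic for most internet links):

    # Rough transfer-time estimate; the 1 Gbit/s figure is an assumption, not a measurement.
    def transfer_hours(bytes_total, bits_per_second):
        return bytes_total * 8 / bits_per_second / 3600

    TB = 10**12
    PB = 10**15
    print(f"5 TB at 1 Gbit/s: {transfer_hours(5 * TB, 1e9):.1f} hours")   # ~11 hours
    print(f"1 PB at 1 Gbit/s: {transfer_hours(1 * PB, 1e9):.0f} hours")   # ~2222 hours, roughly 93 days

At that rate a single petabyte is about three months of continuous transfer, which is why the question matters.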
I don't work in this specific field, but during the first decade of this century I worked in broadcast video distribution.
At the time, UDP-based tools such as Aspera[1], Signiant[2] and FileCatalyst[3] were all the rage for punting large amounts of data over the public Internet.
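The usual reason those tools resort to UDP: a single TCP connection's throughput is roughly capped at window size divided by round-trip time, which bites hard on long, fat links. A rough sketch with assumed numbers:

    # Per-connection TCP throughput is roughly bounded by window_size / RTT.
    # The window sizes and the 150 ms RTT below are illustrative assumptions.
    def tcp_ceiling_mbit(window_bytes, rtt_seconds):
        return window_bytes * 8 / rtt_seconds / 1e6

    print(tcp_ceiling_mbit(64 * 1024, 0.150))        # ~3.5 Mbit/s with a 64 KiB window
    print(tcp_ceiling_mbit(4 * 1024 * 1024, 0.150))  # ~224 Mbit/s even with a 4 MiB window

Running your own reliability layer over UDP (or many parallel TCP streams) is one way around that ceiling.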
Aspera is the current winner in bioinformatics. The European Bioinformatics Institute and the US NCBI are both big users of it, mainly for INSDC (GenBank/ENA/DDBJ) and SRA (Short Read Archive) uploads.
For UniProt, a smaller dataset, we just use it to clone servers and data from Switzerland to the UK and US at 1 GB/s over the wide-area internet.
Jim Kent wrote a small program, parafetch - basically an FTP client that parallelizes transfers. It worked reasonably well, speeding things up maybe 10x. You can get it somewhere on the UCSC website in his software repository, though it involves compiling the C code.
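The trick generalizes beyond FTP. Here's a rough Python sketch of the same idea using HTTP byte-range requests - my own illustrative code, not parafetch itself, and it assumes the server honours Range headers:

    import concurrent.futures
    import urllib.request

    def fetch_range(url, start, end):
        # Fetch one byte range of the file.
        req = urllib.request.Request(url, headers={"Range": f"bytes={start}-{end}"})
        with urllib.request.urlopen(req) as resp:
            return start, resp.read()

    def parallel_fetch(url, size, n_parts=8):
        # Split [0, size) into n_parts ranges and fetch them concurrently.
        chunk = size // n_parts
        ranges = [(i * chunk, size - 1 if i == n_parts - 1 else (i + 1) * chunk - 1)
                  for i in range(n_parts)]
        buf = bytearray(size)
        with concurrent.futures.ThreadPoolExecutor(max_workers=n_parts) as pool:
            for start, data in pool.map(lambda r: fetch_range(url, *r), ranges):
                buf[start:start + len(data)] = data
        return bytes(buf)

The win comes from keeping several connections busy at once instead of letting one stream's window/RTT ceiling set the pace.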
Tanenbaum always forgot to include the time spent writing and reading the tapes. Typical 10 TB hard drives (which most people use for data interchange instead of tapes) only have ~100 MB/s of bandwidth (about the same as a 1 Gbit NIC).
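For those drive numbers the arithmetic works out to roughly a day per drive, per direction:

    # Sequential read or write of a full 10 TB drive at an assumed sustained 100 MB/s:
    seconds = 10e12 / 100e6   # 100,000 s
    print(seconds / 3600)     # ~27.8 hours to fill or drain one drive

So shipping drives only beats the wire once you've also paid that write time on one end and the read time on the other.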