One thing you can independently verify is that the backdoor string is still in their client. The installer starts transmitting data during the installation process, so I would not recommend installing it outside of a VM—just look at the files directly. The macOS installer has it in `Backblaze Installer.app/Contents/Resources/instfiles.zip/bztransmit`. The Windows installer is a self-extracting ZIP file, so just use unzip and look in `bztransmit.exe` and `bztransmit64.exe`.
$ strings bztransmit |grep BACKDOOR
DoHttpPostSyncHostInfo - BACKDOOR_prefer.xml file exists:
ERROR DoHttpPostSyncHostInfo - BACKDOOR_prefer.xml file could not be read:
ERROR DoHttpPostSyncHostInfo - BACKDOOR_prefer.xml file existed but less than 10 chars or could not be read:
ERROR DoHttpPostSyncHostInfo - BACKDOOR_prefer.xml file did not contain bz_cvt:
ERROR DoHttpPostSyncHostInfo - BACKDOOR_prefer.xml file contained bz_cvt but wrong num digits:
ERROR DoHttpPostSyncHostInfo - BACKDOOR_prefer.xml file did not contain bz_upload_url:
ERROR DoHttpPostSyncHostInfo - BACKDOOR_prefer.xml file contained bz_upload_url but did not start with http:
DoHttpPostSyncHostInfo - BACKDOOR_prefer.xml file exists and is valid and bz_cvt=
DoHttpPostSyncHostInfo - BACKDOOR_prefer.xml SUCCESSFULLY_swapped_in new_bz_cvt=
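If you would rather not unpack anything to disk, the same check can be sketched in Python. This is only a rough stand-in for `strings | grep`, not an analysis tool. The helper names `find_strings` and `scan_zip` are made up for illustration, and it assumes the archive is something Python's `zipfile` module can open, which I have not verified for the Windows self-extracting installer.

```python
import re
import sys
import zipfile

# Runs of 4 or more printable ASCII bytes, roughly what `strings` reports by default.
PRINTABLE_RUN = re.compile(rb"[\x20-\x7e]{4,}")


def find_strings(data, needle=b"BACKDOOR"):
    """Return the printable ASCII runs in `data` that contain `needle`."""
    return [
        match.group().decode("ascii")
        for match in PRINTABLE_RUN.finditer(data)
        if needle in match.group()
    ]


def scan_zip(zip_path, member_prefix="bztransmit", needle=b"BACKDOOR"):
    """Scan archive members whose file name starts with `member_prefix`.

    Members are read into memory only; nothing is extracted or executed.
    """
    hits = {}
    with zipfile.ZipFile(zip_path) as archive:
        for name in archive.namelist():
            base_name = name.rsplit("/", 1)[-1]
            if not base_name.startswith(member_prefix):
                continue
            found = find_strings(archive.read(name), needle)
            if found:
                hits[name] = found
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python scan.py path/to/instfiles.zip
    for member, lines in scan_zip(sys.argv[1]).items():
        for line in lines:
            print(f"{member}: {line}")
```

On macOS you would point it at the `instfiles.zip` inside the app bundle. On Windows, the installer itself would be the archive, if `zipfile` accepts it.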
What does the corresponding code do? I genuinely don’t know. My goal was to find backup software, not to do a security analysis. An easy-to-exploit root code execution vulnerability was enough for me to uninstall the software, submit a report as a professional courtesy, and go do something else.
Clearly it’s dumb to put the word BACKDOOR in your code if your goal is to plant a secret backdoor, but it’s also pretty dumb to use world-writable directories, disable host certificate verification, use magic hard-coded strings to “sign” updates, and implement data encryption in a way which requires the ‘private’ password to be sent to the server, so who knows. As I said in the tweet, even if it turns out to be innocuous, the optics are terrible and show a serious lack of good judgement on their part, especially given how much they claim to be security experts who care about their reputation[0].
> Were there any follow-ups from Backblaze?
No. The only “follow-up”, as it were, was to cancel their HackerOne public bug bounty programme. (Though this was a month ago, their web site still tells people to “visit our public bug bounty program managed through Hacker One”[2].) They have not communicated with me at all except for one tweet about that change[1]. I have seen no public statement from them acknowledging that this happened, or that they made mistakes, or that they have steps they plan to take to improve their internal software development practices.
[0] “We stand by our reputation as trustworthy, careful programmers who have worked in the security field for over a decade. […] we have LOTS of interest in keeping our reputations rock solid and utterly clean.” https://help.backblaze.com/hc/en-us/articles/217664798-Secur...
Thank you very much for your analysis and explanation. I was a happy customer and would recommend them to almost anyone, up until a few minutes ago.
There was always a nagging voice in the back of my head - "their backup encryption can't possibly be good" - but I wanted to believe it since their deal is so good, in terms of storage per dollar, and it's hard for me to afford otherwise right now. I did a bit of digging not too far back and their help pages go into a small amount of detail but ultimately just say "trust us". I trusted them to throw away my encryption password after a session, which is terrible, of course. After reading all of your work, I cannot even countenance trusting them anymore.
Ironically the first thing that comes to my mind for offsite backup is Backblaze B2... if you do all the encryption before it leaves your machine, then it doesn't matter what they do, and their storage is cheap. Rclone or similar means that their client software need not be trusted. It's up to whether you tolerate their business practices.
You mentioned in another comment that you are still looking for cloud backup software which ticks all the boxes - why did you not go with Rclone? On the surface it looks great but I haven't tried it yet.
> Ironically the first thing that comes to my mind for offsite backup is Backblaze B2... if you do all the encryption before it leaves your machine, then it doesn't matter what they do, and their storage is cheap. Rclone or similar means that their client software need not be trusted. It's up to whether you tolerate their business practices.
Business practices aside, the problem I personally have is: seeing how they are dishonest about their client security, how much can I trust that they’re honest about their durability or server-side security? Encrypting blocks myself ensures confidentiality, but how do I trust that data aren’t being silently lost or corrupted at rest? They hand-wave about how 11-9s durability doesn’t matter[0] in the same way they hand-wave about how zero-knowledge data privacy doesn’t matter, while simultaneously mentioning that they have, in fact, experienced unrecoverable data corruption in the past[1].
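The most I can see doing on the client side is detection. As a sketch only: record a SHA-256 digest of every encrypted block before upload, then periodically download the blocks again and compare. The function names and manifest layout here are my own invention, not something Rclone or any other tool is known to do this way, and this tells you about corruption after the fact rather than preventing it.

```python
import hashlib
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large blocks do not need to fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(block_dir):
    """Map each file's relative path to its digest, taken before upload."""
    block_dir = Path(block_dir)
    return {
        path.relative_to(block_dir).as_posix(): sha256_file(path)
        for path in sorted(block_dir.rglob("*"))
        if path.is_file()
    }


def verify(manifest, restored_dir):
    """Return the names that are missing or whose digest no longer matches."""
    restored_dir = Path(restored_dir)
    damaged = []
    for name, expected in manifest.items():
        candidate = restored_dir / name
        if not candidate.is_file() or sha256_file(candidate) != expected:
            damaged.append(name)
    return damaged
```

The manifest has to be kept somewhere other than the provider being checked, and a full verification pass means paying for a full download, so this is not much of an answer.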
> why did you not go with Rclone?
There’s a long tail of roll-your-own cloud backup tools like this that I didn’t really evaluate: Rclone, BorgBackup, restic, Duplicity, Tarsnap, etc. I just ran out of energy and really wanted to avoid something that required a bespoke setup. These tools might work for me, but they wouldn’t be something I could recommend to a non-technical user, and I’m not sure if any of them have daemons for performing continuous backup, so I’d have to figure out a separate solution for that which might be buggy.
In any case, it looks like rclone may have some of the same problems as Duplicacy and restic with using too much memory[2][3].
[1] “If a cosmic ray has thrown a bit, we ask your computer to retransmit that particular file. This rarely, rarely happens, but in our datacenter of quadrillions of trillions of files in 350 petabytes of customer data. It DOES occur once in a while.” https://help.backblaze.com/hc/en-us/articles/217664798-Secur...
This is pretty disappointing to see. I've been looking forward to using B2 instead of S3 in prod, purely from a cost perspective and only through their API, but if they are shipping stuff like this to run on other people's machines, it makes one wonder about what's running on their servers… Though I guess data that is uploaded via the API and is either encrypted beforehand or meant to be publicly available is still a pragmatic production use case, as long as you assume they are compromised when building stuff around it.
I’ve been happy using Carbon Copy Cloner for local backups for years. It may mostly be a wrapper around rsync, but it’s a very good one. :-)
For cloud backups, I’m still looking. My personal criteria are security (of course), good data integrity management, continuous backup, upload speed, and low resource utilisation. I was hoping to find something which was also simple enough that I could tell non-technical friends and family to just use <whatever>. What I found was:
Backblaze… well, you know.
Carbonite uses Java and is slow.
SpiderOak uses Python and is slow.
Duplicacy uses Go along with a local web interface for the consumer version. It currently consumes way too much memory during backups[0].
IDrive I never tried installing; I was put off by how it appears to use a fleet management model where configuration relies on visiting their web site and having stuff get pushed down to the client.
Acronis True Backup somehow manages to be 700 megabytes on disk. I thought they must be using CEF, but no, it seems to just contain a truly colossal amount of code along with a 100 megabyte help file.
At some point I stumbled across Arq, which looked like it could be a winner: they’ve published a couple CVEs on their site like a good company should; they have a monthly sub or you can just buy the software outright and use your own cheap cloud storage like S3 Glacier; they even publish the format of their backup files[1]. But then they went and released Arq 6 using CEF for the UI. It looks like they listened to feedback from users and will be going back to a native UI for Arq 7, and so I plan to take a serious look at this once it is released.
Am I misreading this pricing information? They charge $250 per terabyte per month? I know I didn’t specify cost effectiveness in my list but I did at least expect everything to be within the same order of magnitude in cost.
Were there any follow-ups from Backblaze?