Worked there, managed a lot of the teams. An agent is considered a node, so you could have X nodes per server, if that clears things up. Even an agent per JBOD. The teams are very smart about blast radius and sharding. When I left I had exabytes of data under my management, and I'm happy to chat if anyone wants to DM. Folks often forget how many users and services Apple has.
I'd absolutely love to chat and get a deeper understanding of that! Didn't see a way to contact you in your profile — what platform would you prefer to be DM'd on?
Perhaps you'd be better convinced with a service breakdown.
Breaking monoliths into service boundaries yields easier ownership, maintenance, migration, and resilience.
One "tiny" company with a few verticals can be composed of thousands of microservices, each handling its own dedicated objective: authentication, reverse proxy, API gateway, SMS, email, customer list, marketing email gateway, CMS for marketers on product X, feature flags, transaction histories, GDPR compliance handling, billing intelligence, various risk models, offline ML risk enrichment, etc. etc. Each will have its own data needs and replication / availability needs.
This Apple number might seem crazy, but I'm not fazed by it. I can picture it.
I can also picture it, but not really in the way you're outlining it.
It's a sad and very inefficient picture though. Apple does not need this much data processing. It's a grotesque amount per device. My most positive plausible interpretation is that maybe they're just wasting insane amounts of energy doing lots and lots of stupid analytics, as one tends to do.
I was also surprised by this. 300K nodes for a distributed DB is kind of crazy. I’ve worked with similar systems that stored much more than 100 PB with 10x fewer nodes.
Apple is using less than one TB per server…
But when you see the 1000s of clusters it starts to make sense. They probably have a Cassandra cluster as the default storage for any use case, and each one probably requires at least 3 nodes. They’re keeping the blast radius of any issue small while staying super redundant. It probably grew organically rather than through any central capacity management.
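The "thousands of clusters" framing makes the node count less surprising. A quick sanity check, using an assumed midpoint for "thousands" (the 5,000 figure is illustrative, not from the article):

```python
# Sanity check on "1000s of clusters" vs. 300k total nodes.
# The cluster count is an assumption; only total_nodes is from the article.
total_nodes = 300_000
clusters = 5_000  # assumed midpoint of "thousands of clusters"

avg_nodes_per_cluster = total_nodes / clusters
print(f"{avg_nodes_per_cluster:.0f} nodes per cluster on average")  # 60
```

Even at 5,000 clusters that's only ~60 nodes each, well within the range of an ordinary per-team Cassandra deployment.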
What you describe is best practice for older versions of Cassandra with older versions of the Oracle JVM on spinning disks. And at this time Apple already had a massive amount of Cassandra nodes. Back when 1TB disks were what we had started buying for our servers. Cassandra was designed to run on large numbers of cheap x86 boxes, unlike most other DBs where people had to spend hundreds of thousands or millions of dollars on mainframes and storage arrays to scale their DBs to the size they needed.
Half a TB per node, which during regular compaction can double. And if you went over, your CPU and disk spent so much time on overhead such as JVM garbage collection that your compaction processes backlogged, your node went slower and slower, your disk eventually filled up, and it fell over. Later things got better and you could use bigger nodes if you knew what you were doing and didn't trip over any of the hidden bottlenecks in your workload. Maybe even fixed in the last few versions of Cassandra 3.x and 4.0.
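The "half a TB per node" rule of thumb falls out of the compaction headroom math directly (the 1 TB disk size is an assumption about the hardware of that era, matching the comment below about 1TB disks):

```python
# Rough headroom math: size-tiered compaction can need space for both the
# input SSTables and the rewritten output at once, so the safe working set
# is roughly half the disk. Disk size here is an assumed era-typical 1 TB.
disk_tb = 1.0
compaction_factor = 2.0  # worst case: data temporarily doubles on disk

safe_data_tb = disk_tb / compaction_factor
print(f"safe working set: {safe_data_tb:.1f} TB per node")  # 0.5 TB
```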
What psaux mentioned makes more sense. A node == one Cassandra agent instead of a server.
Past 100k servers you start needing really intense automation just to keep the fleet up with enough spares.
If you’ve got say 10k servers it’s much more manageable
The fun thing is Cassandra was born at FB, but they don’t run any Cassandra clusters there anymore. You can use lots of cheap boxes, but at some point the failure rate of using so many boxes ends up killing the savings and the teams.
Yes, you can run multiple nodes on a single physical server. However, then you have the additional headache of ensuring that only one copy of data gets stored on that physical server, or else you can lose your data if that server dies. Similar to having multiple nodes backed by the same storage system, where you need to ensure losing a disk or volume doesn't lose two or more copies of data. Cassandra lets you organize your replicas into 'data centers', and some control inside a DC by allocating nodes to 'racks' (with some major gotchas when resizing, so not recommended!). Translating that into VMs running on physical servers and shared disk is (was?) not documented.
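The idea behind keeping replicas off the same physical server can be sketched as a simple rack-aware placement loop. This is a hedged illustration of the concept behind Cassandra's NetworkTopologyStrategy, not its real algorithm; the node and server names are hypothetical:

```python
# Sketch of rack-aware replica placement: walk the ring in token order and
# skip any node whose "rack" (here: physical server) already holds a copy,
# so losing one box never loses two replicas.
def place_replicas(ring_order, rack_of, rf=3):
    """ring_order: nodes in token order starting at the key's position.
    rack_of: node -> physical server. Returns rf nodes on distinct servers."""
    replicas, used_racks = [], set()
    for node in ring_order:
        if rack_of[node] in used_racks:
            continue  # this physical box already has a copy; skip it
        replicas.append(node)
        used_racks.add(rack_of[node])
        if len(replicas) == rf:
            break
    return replicas

# Two Cassandra processes per physical server (hypothetical layout):
rack_of = {"n1": "srv-a", "n2": "srv-a", "n3": "srv-b",
           "n4": "srv-b", "n5": "srv-c", "n6": "srv-c"}
chosen = place_replicas(["n1", "n2", "n3", "n4", "n5", "n6"], rack_of)
print(chosen)  # ['n1', 'n3', 'n5'] -- one copy per physical server
```

Without the rack check, n1 and n2 (same box) could both be picked, and a single server failure would take out two of three replicas.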
Interestingly, using the highest-capacity drives at any point in time would work even worse, since they spun slower and had slower sequential write speeds. If you could get them from your preferred vendor at all, which seemed to be several years after introduction for us!
Yeah, I could see that even outside of the context of the Cassandra virtual nodes primitives.
These could be K8s nodes, for example, that don't make full utilization of the underlying VM, which would completely make sense at AAPL scale.
For some background to other HN users out there, "virtual" nodes refer to logical nodes in distributed database software where # of virtual nodes >= physical nodes. This means if the size of data passes a certain threshold, physical nodes can redistribute the virtual nodes, reducing the amount of data shuffled across physical nodes (as opposed to a naive hash function that mods a key by a fixed number and requires all nodes to reshuffle data when a new node is added).
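The redistribution benefit described above can be sketched in a few lines. This is a toy illustration of vnode-style consistent hashing, not Cassandra's actual implementation (Cassandra defaults to 256 vnodes per node; 8 is used here for readability):

```python
# Toy sketch: each physical node owns many small token ranges (vnodes), so
# adding a server steals a few ranges from everyone instead of forcing the
# full reshuffle a naive mod-N hash would.
import hashlib

VNODES_PER_NODE = 8  # Cassandra's default is 256; small here for clarity

def token(key: str) -> int:
    """Hash a key (or vnode label) onto the ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

def build_ring(physical_nodes):
    """Map vnode tokens -> owning physical node."""
    ring = {}
    for node in physical_nodes:
        for i in range(VNODES_PER_NODE):
            ring[token(f"{node}-vnode-{i}")] = node
    return ring

def owner(ring, key: str) -> str:
    """A key belongs to the first vnode at or after its token (wrapping)."""
    t = token(key)
    tokens = sorted(ring)
    for vt in tokens:
        if vt >= t:
            return ring[vt]
    return ring[tokens[0]]  # wrap around the ring

keys = [f"user-{i}" for i in range(10_000)]
before = {k: owner(build_ring(["a", "b", "c"]), k) for k in keys}
after = {k: owner(build_ring(["a", "b", "c", "d"]), k) for k in keys}
moved = sum(before[k] != after[k] for k in keys)
print(f"{moved / len(keys):.0%} of keys moved")  # roughly the new node's share
```

With `key % N` style placement, going from 3 to 4 nodes would move about three quarters of the keys; here only the new node's share moves.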
Or it tells us something of how much data is being scooped up per device. Certainly when I look through the raw health data collected it’s quite alarming and I’m sure that’s just a drop in the ocean.
Well, Health data can be uploaded to iCloud (CloudKit), but it's End-to-End encrypted so not really a concern.
Unlike other data in iCloud, if you lose your devices you lose your HealthKit data. This is not true for photos or emails, for example - which you keep if you lose your devices.
Regardless of their current brand, Apple is the next big advertising giant and no amount of brand purity is going to change this. The data of Apple's users is simply of too high value for Apple to ignore forever.
Makes me think of that first decade (98-08) when Google actually wasn't being evil. Yeah, it's inevitable that Apple will turn to this when they can't grow any more simply by raising the prices of their devices. Perhaps they have reached that point about now with iPhone 14.
I’m not suggesting Apple is doing anything nefarious; it was simply an example of a large dataset. I assume they’re encrypting/scrubbing all your PII, as that’s their whole motto.
But unless they’re doing some voodoo magic, then yes, the data is leaving your device in some form. Hence why I can view every heartbeat (aggregated by the minute) since I put on the original Apple Watch in 2015, despite changing all my devices and only restoring from iCloud. Indeed, I expect it’s just part of your iCloud data storage.
Actually I just launched the health app now and I can see the app explicitly asks if you want to allow sharing your data with apple, so if you say ‘yes’ then you’re not only storing but allowing apple to query your data (minus PII).
It's also off-brand for Apple to join PRISM and comply with thousands of annual requests for supposedly-inaccessible iCloud data. Neither of you will ever be proven right until we look inside those servers though, so making any conclusive statements is a mistake. Apple designed Schrodinger's datacenter.
https://www.theverge.com/2022/1/28/22906071/apple-1-8-billio...
1.8B active devices / 300k nodes = (just) 6k devices per Cassandra node
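Putting the thread's figures together in one back-of-envelope calculation (300k nodes and ~100 PB from the article, 1.8B active devices from the Verge link above):

```python
# Back-of-envelope math on the figures quoted in this thread.
nodes = 300_000                # Cassandra nodes, per the article
data_pb = 100                  # ~100 PB total, per the article
devices = 1_800_000_000        # 1.8B active devices (Verge, Jan 2022)

tb_per_node = data_pb * 1000 / nodes   # using 1 PB = 1000 TB
devices_per_node = devices / nodes

print(f"{tb_per_node:.2f} TB per node")             # ~0.33 TB
print(f"{devices_per_node:,.0f} devices per node")  # 6,000
```

The ~0.33 TB figure is where the "less than one TB per server" observation upthread comes from, and it also lines up neatly with the old half-a-TB-per-node sizing rule.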