That is an interesting project! The idea of a graphical representation of infrastructure assets, with the ability to query it, sounds very useful, especially given that in the AWS Management Console every region is a separate realm and there is no single view of your resources. However, I don't agree with equating "infrastructure assets" with AWS -- AWS is the biggest cloud, but almost every business is moving towards multi-cloud / hybrid cloud these days. Just as it provides a "bird's eye view" of assets in AWS, this concept would be even more useful in a multi-cloud / cross-cloud infrastructure. If that is on the roadmap, it probably deserves a mention in the documentation, and if not, the docs should state that this project focuses on AWS only at this time.
Terraform uses a graph to keep track of infrastructure dependencies. You can export the graph and take a look (`terraform graph`).
I once built a Terraform provider for a custom Neo4j CMDB to be able to add metadata, in the form of additional relationships, to anything provisioned with TF.
This data was used for config inventory, job scheduling and a bunch of other automations/rules.
We broke the "cloud vendor" boundary this way and used it seamlessly against on-prem vSphere and AWS.
Never got around to open-sourcing it, unfortunately.
A graph for this type of cmdb/config data is really useful.
For every business actively doing multi-cloud, I'd really love to see their business-case argument for doing so. Over the past 8 years or so, I've talked to a large number of businesses that went multi-cloud to "avoid vendor lock-in" or to "get the best features of all clouds" and ended up regretting the choice, because they tend to get the worst of all worlds: networking issues, latency problems, tooling woes, and considerably larger tooling teams that are always playing catch-up.
> this concept would be even more useful in a multi-cloud / cross-cloud infrastructure.
Thanks for the comment! Cartography can definitely be extended to do multi-cloud, but we're focusing on AWS at the moment. At the same time, though, we didn't want to discourage others from looking at this by saying in the docs that it's solely an AWS project. There's nothing really stopping it from supporting other data sources.
Edit: we're planning on adding a GitHub module soon too.
I've been loving this project. Also a big shout-out to Beagle (Sony), which has been doing good work here recently: https://github.com/yampelo/beagle .
If you're curious about this stuff for more day-to-day, we (Graphistry) add some crazy GPU graph visual analytics + visual templating / automation for daily tasks + DB connectors. Think automatically mapping your daily splunk alerts across host/network/intel data, or seeing your entire data center, or going through a ton of DNS logs for bots/intel. Super fun space right now! Ex: http://labs.graphistry.com/graph/graph.html?dataset=PyGraphi...
The name is a bit confusing. Cartography sounds like something related to maps. Perhaps the authors should consider renaming this otherwise really awesome project?
Yes, because mapping has more than one meaning, e.g. an operation that associates each element of a given set (the domain) with one or more elements of a second set (the range).
It's still just one definition: a paper map relates a point on the piece of paper to a point in some other space (topography, subway stops, interstate routes, etc.). And if you relax the "carto" part of "cartography" (e.g. someone working on OSM or Google Maps is still doing cartography), then I think cartography as "the practice of constructing maps" is still an unambiguous and non-overloaded meaning, which applies to mapping infrastructure just as well.
In the non-free world you can get similar features with a CMDB (like HP or EMC CMDBs).
This kind of software is usually licensed by the number of elements it tracks... and the costs add up quickly (hosts, network ports, IP addresses, OSes, and applications each count as one element, and you end up tracking thousands of them).
Literally working on an extremely similar project, albeit to power security; seems inevitable that an open-source solution would've popped up. Glad to see this. Any exploration in doing this cross-cloud?
Good question. As is, this does not keep anything in sync.
To keep the graph in sync with changes in the account, simply set up a cronjob to run `cartography` whenever you would need a refresh. Each sync run should guarantee that you have the most up-to-date data.
Here's how a sync works: when the sync starts, set a variable called `update_tag` to the current time. Then, pull all the data from your AWS account(s) and create Neo4j nodes and their relationships, making sure to set their `lastupdated` fields to `update_tag`.
Finally, delete the leftover nodes and relationships (i.e. those that do not have up-to-date `lastupdated` fields). This way the data stays fresh, and you can see this in the [cleanup jobs](https://github.com/lyft/cartography/tree/master/cartography/...).
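In rough Python, assuming the `neo4j` driver and an illustrative `EC2Instance` label (this is a sketch of the pattern described above, not the actual Cartography code), it looks something like this:

```python
# Illustrative sketch of the sync-and-sweep pattern, not Cartography's actual code.
import time

from neo4j import GraphDatabase  # pip install neo4j

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))
update_tag = int(time.time())  # one tag per sync run


def upsert_instance(tx, instance_id):
    # MERGE creates the node if it's new, then refreshes its lastupdated stamp.
    tx.run(
        "MERGE (i:EC2Instance {id: $id}) SET i.lastupdated = $tag",
        id=instance_id,
        tag=update_tag,
    )


def cleanup(tx):
    # Anything not stamped in this run no longer exists in the account, so sweep it.
    tx.run(
        "MATCH (i:EC2Instance) WHERE i.lastupdated <> $tag DETACH DELETE i",
        tag=update_tag,
    )


with driver.session() as session:
    # Stand-in for the IDs you'd actually pull from the AWS APIs.
    for instance_id in ["i-0abc123", "i-0def456"]:
        session.execute_write(upsert_instance, instance_id)
    session.execute_write(cleanup)
```

The sweep at the end is what keeps the periodic approach self-correcting: anything that disappeared from the account simply never gets its `lastupdated` refreshed, so the cleanup catches it on the next run.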
Our approach requires us to stay as real-time as possible, so we're actually using CloudWatch Events to keep in sync -- the deletes become a little hard after that.
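Very roughly, an event-driven handler might look like this (hypothetical code; the event fields follow EC2 state-change notifications, but the graph schema is made up for illustration), which also shows why the delete path is the fragile part:

```python
# Hypothetical Lambda handler for an event-driven sync; illustrative only.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://neo4j.internal:7687", auth=("neo4j", "password"))


def handler(event, context):
    # EC2 "Instance State-change Notification" events carry these fields.
    instance_id = event["detail"]["instance-id"]
    state = event["detail"]["state"]
    with driver.session() as session:
        if state == "terminated":
            # Deletes only happen if the terminal event actually arrives and gets
            # processed; a missed or out-of-order event leaves a ghost node behind,
            # which is why a periodic sync-and-sweep is easier to keep correct.
            session.run("MATCH (i:EC2Instance {id: $id}) DETACH DELETE i", id=instance_id)
        else:
            session.run(
                "MERGE (i:EC2Instance {id: $id}) SET i.state = $state",
                id=instance_id,
                state=state,
            )
```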
I look forward to the progress of Cartography, though!
In an ideal world, yes. In this world, they REALLY should.
The downside of AWS providing something like this out of the box would be a complete lack of usability. There's a running not-even-a-joke that whatever UI an AWS service ships with is going to be, at best, an excuse for one. If you want to make good use of a feature/offering in their stack, you have to write your own tools for it.
Don't get me wrong, the APIs in AWS are, for the most part, quite nice to work with. [asterisk goes here] -- It's the stock UIs in their web console that are maddening and (IMO) barely fit for anything productive.
AWS Config provides essentially all of the nodes and edges required to build a nice resource graph, but unfortunately it's quite limited in the services it covers.
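For the resource types it does cover, pulling an inventory out of Config is only a few lines with boto3 (a rough sketch; the resource types listed are just examples, and anything Config doesn't record still needs direct per-service API calls):

```python
# Rough illustration: enumerate what AWS Config already knows about and treat
# each discovered resource as a candidate graph node.
import boto3

config = boto3.client("config")

# A few example resource types; Config supports many more, but not everything.
for resource_type in ["AWS::EC2::Instance", "AWS::EC2::SecurityGroup", "AWS::S3::Bucket"]:
    paginator = config.get_paginator("list_discovered_resources")
    for page in paginator.paginate(resourceType=resource_type):
        for resource in page["resourceIdentifiers"]:
            print(resource["resourceType"], resource["resourceId"])
```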
I built something similar for teaching people Docker in webcasts. It also integrates a little inspector, a terminal, hotkeys, and, for no good reason, procedural color theming. It's kind of an ugly code base because I built it for screencasting and then got a job that required a lot of attention, so I didn't end up using it much, but I like it.
Nice. We should connect. I've built this for security using a very similar stack (Neo4j, etc.). Where it's similar is the use of a graph to implement an ontology of any given system; where we diverge is that I've built a separate viz, collaboration, and management layer.
Ontologies never took off, I think, because they tried to encompass too many things, whereas graphs are a simple tool you can use to build them as part of a greater idea.
I've wanted to mock up something similar, but more along the lines of Lightbend Pipelines mixed with TouchDesigner. Or I guess a more interactive version of Netflix's Vizceral would be another way of looking at it. But having TouchDesigner's multi-tiered workspace within a node concept would be key.
I've been using JupiterOne recently, has anyone else seen it? It creates a Neo4j-like graph and CMDB from actual AWS data and many other sources. There's a free community version of the product too that I could test out in my lab.
We're building something very security-oriented if you'd like to try it and give your feedback (contact@cloudhawk.io). Thanks for mentioning JupiterOne, we'd never heard of it but it looks like an interesting product!
I'd go further and suggest that any existing, commonly understood word is a poor choice of name for a new thing, but many software projects seem hellbent on confounding their potential users along with people who have no great interest in the new thing.