As a data scientist or software engineer working with "medium/large" amounts of data, how do you visualize it?
Let's take a few examples:
- Working on an iPhone app dealing with signal processing, Fourier transforms, etc., how do you visualize your signal and frequencies?
- As a back-end engineer working with a directed graph data structure, how do you quickly visualize your graph? Are you interested in seeing your graph change step by step?
- How do you quickly visualize a massive number of data points as time series?
There are two kinds of charts: Charts designed to find information, and charts designed to sell information. The latter are often gorgeous and many-dimensional: Heatmaps, animated bubble charts, charts with time sliders, etc. And by all means, if selling the data is required, then sell it with the best tool for the job.
As for actually investigating the data, it's usually a lot of tables, lines and bars. They're simple to understand, and there's no cleverness in the visualization that might hide critical information.
To answer your questions, at Periscope I've seen:
1. A line graph of amplitude over time. You should see the frequency emerge clear as day. If you want to calculate frequency explicitly, you could overlay a second line with its own axis. Again, super simple, but gives you the answer directly.
2. I've seen a lot of fancy graph visualizations, but nothing that makes me happy. Depending on what you want to know about your graph, maybe a simple table with a structure like:
Or: A pivot table on top of this data, transforming the second node column into the table's horizontal axis, can also be useful.
3. OK, obviously I think Periscope is a great choice here. Loads of data analysts use it to visualize time series data on many tens/hundreds of billions of data points.
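For the graph case in point 2, here is a minimal sketch of the table-plus-pivot idea. It assumes the table is a plain edge list with one row per edge; the column names (`source`, `target`), the sample edges, and the helper names `edge_table` and `pivot` are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical directed graph stored as an edge list: (source node, target node).
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "A")]


def edge_table(edges):
    """Render the edge list as a two-column plain-text table."""
    rows = ["source | target", "-------+-------"]
    rows += [f"{src:<6} | {dst}" for src, dst in edges]
    return "\n".join(rows)


def pivot(edges):
    """Pivot the edge list: source nodes become rows, target nodes become
    columns, and each cell counts the edges from that source to that target."""
    counts = defaultdict(int)
    for src, dst in edges:
        counts[(src, dst)] += 1
    nodes = sorted({node for edge in edges for node in edge})
    matrix = [[counts[(src, dst)] for dst in nodes] for src in nodes]
    return nodes, matrix


print(edge_table(edges))
print()

nodes, matrix = pivot(edges)
print("    " + " ".join(nodes))  # target nodes along the horizontal axis
for src, row in zip(nodes, matrix):
    print(f"{src} | " + " ".join(str(cell) for cell in row))
```

The pivoted form is just an adjacency matrix, so missing or unexpected edges show up as cells you can scan by eye. It stops being readable once the graph has more than a few dozen nodes.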
That said, other good choices are: Excel, R/Stata/Matlab, gnuplot, Apache Pig. And for the data storage itself, IMO Amazon Redshift is unparalleled.
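Going back to the signal case in point 1: if you do want the frequency as a number rather than reading it off the amplitude line, a rough sketch could look like the one below. It uses a naive discrete Fourier transform from the standard library only, and the function name `dominant_frequency`, the 64 Hz sample rate, and the 5 Hz test tone are assumptions made for the example. On real data you would more likely reach for an FFT routine from a numerical library.

```python
import cmath
import math


def dominant_frequency(samples, sample_rate):
    """Return the frequency in Hz of the strongest non-DC bin of a naive DFT.

    This is O(n^2), which is fine for a quick look at a short window
    but too slow for long signals.
    """
    n = len(samples)
    best_bin, best_magnitude = 0, 0.0
    # Skip bin 0 (the constant offset) and stop at the Nyquist bin.
    for k in range(1, n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * t / n)
            for t, x in enumerate(samples)
        )
        if abs(coeff) > best_magnitude:
            best_bin, best_magnitude = k, abs(coeff)
    return best_bin * sample_rate / n


# Hypothetical test signal: one second of a 5 Hz sine sampled at 64 Hz.
sample_rate = 64
signal = [math.sin(2 * math.pi * 5 * t / sample_rate) for t in range(sample_rate)]

print(dominant_frequency(signal, sample_rate))
```

Computing this over a sliding window gives you the second line to overlay on the amplitude chart, on its own axis.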