Author here. I am seeing a lot of comments about how the graphs are not anchored...

vlmutolo · on Dec 15, 2021

> However, in the future I would pick a different visualization I think

I think the box plots were a good choice here. I quickly understood what I was looking at, which is a high compliment for any visualization. When it's done right it seems easy and obvious.

But the y-axis really needs to start at 0. It's the only way the reader will perceive the correct relative difference between the various measurements.

As an extreme example, if I have measurements [A: 100, B: 101, C: 105], and then scale the axes to "fit around" the data (maybe from 100 to 106 on the y axis), it will seem like C is 5x larger than B. In reality, it's only about 1.04x larger.

Leave the whitespace at the bottom of the graph if the relative size of the measurements matters (it usually does).
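
To put numbers on that example, here is a rough sketch in Python. The helper name `apparent_ratio` is made up for illustration; it just compares how tall two bars are drawn for a given axis minimum.

```python
def apparent_ratio(a, b, axis_min=0.0):
    """Ratio of drawn bar heights for values a and b when the axis starts at axis_min."""
    return (a - axis_min) / (b - axis_min)

values = {"A": 100, "B": 101, "C": 105}

# Axis anchored at 0: the drawn ratio equals the real ratio.
true_ratio = apparent_ratio(values["C"], values["B"])
# Axis cut off at 100: the drawn ratio is wildly exaggerated.
truncated_ratio = apparent_ratio(values["C"], values["B"], axis_min=100)

print(f"C/B with axis at 0:   {true_ratio:.3f}")
print(f"C/B with axis at 100: {truncated_ratio:.1f}")
```

With the axis at 0, C comes out about 4% taller than B; with the axis at 100, it is drawn five times taller.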

phkahler · on Dec 15, 2021

>> It's the only way the reader will perceive the correct relative difference...

Every day, the stock market either goes from the bottom of the graph to the top, or from the top all the way to the bottom. Sometimes it takes a wild excursion covering the whole graph and then retreats a bit toward the middle. Every day. Because the media likes graphs that dramatize even a 0.1 percent change.

KarlKemp · on Dec 15, 2021

No, the media just happens to sometimes share OP’s intent: to show a (small) absolute change. That change may or may not be as dramatic as the graph suggests in both visualizations: measured in Kelvin, your body temperature increasing by 8 K looks like a tiny bump when you anchor it at absolute zero. “You” being the generic “you”, because at 47 deg C body temperature, the other you is dead.

It will be visible if you work in Celsius, a unit that is essentially a cut-off Y axis to better fit the origin within the domains we use it for.
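
A quick sketch of how much of the axis that bump occupies under each anchor. The 37 °C baseline is an assumption picked for illustration, and `visible_fraction` is a hypothetical helper, not anything from the article.

```python
ZERO_C_IN_K = 273.15

def visible_fraction(start, end, axis_min):
    """Share of the plotted height (axis_min up to end) taken by the change start -> end."""
    return (end - start) / (end - axis_min)

# Assumed values: a 37 °C baseline and the 8 K rise from the comment.
start_c, rise = 37.0, 8.0

kelvin = visible_fraction(start_c + ZERO_C_IN_K, start_c + rise + ZERO_C_IN_K, axis_min=0.0)
celsius = visible_fraction(start_c, start_c + rise, axis_min=0.0)

print(f"anchored at 0 K:  {kelvin:.1%} of the axis")
print(f"anchored at 0 °C: {celsius:.1%} of the axis")
```

Under these assumptions the same rise takes up roughly 2.5% of a Kelvin axis and about 18% of a Celsius one.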

wbsss4412 · on Dec 16, 2021

The change still needs context.

We have an intuitive sense of what 30 degrees is, assuming it is in our preferred system of measurement.

A stock market graph really should be showing the percentage change, not some small absolute change that isn’t immediately understood by the typical layperson.

KarlKemp · on Dec 15, 2021

This notion about cut-off y-axes is the data visualization equivalent of “correlation is not causation”: it’s a valid point that’s easily understood, so everyone latches on to it and then uses it to prove their smartitude, usually with the intonation of revealing grand wisdom.

Meanwhile, there are plenty of practitioners who aren’t oblivious to the argument, but rather long past it: they know there are situations where it’s totally legitimate to cut the axis. Other times, they might resort to a logarithmic axis, which is yet another method of making the presentation more sensitive to small changes.

vlmutolo · on Dec 15, 2021

There are plenty of instances where it's appropriate to use a y-axis that isn't "linear starting at zero." That's why I specified that I was only talking about ways to represent relative differences (i.e. relative to the magnitude of the measurements).

In this case, when we're measuring the latency of requests, without any other context, it's safe to say that relative differences are the important metric and the graph should start at zero.

So while it's true that this isn't universally the correct decision, and it's probably true that people regurgitate the "start at zero" criticism regardless of whether it's appropriate, it does apply to this case.

remus · on Dec 15, 2021

I think these choices are more context specific than is often appreciated. For example

> if I have measurements [A: 100, B: 101, C: 105], and then scale the axes to "fit around" the data (maybe from 100 to 106 on the y axis), it will seem like C is 5x larger than B. In reality, it's only about 1.04x larger.

If you were interested in the absolute difference between the values then starting your axis at 0 is going to make it hard to read.

amenod · on Dec 15, 2021

It is however very rare that absolute differences matter; and even when they do, the scale should (often) be fixed. For example the temperatures:

[A: 27.0, B: 29.0, C: 28.0]

versus:

[A: 27.0, B: 27.2, C: 26.9]

If scale is fit to the min and max values, the charts will look the same.

Still, as a rule of thumb, when the Y axis doesn't start at 0, the chart is probably misleading. It is very rare that the absolute size of the measured quantity doesn't matter.
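
A small sketch of the two temperature series above, rescaled the way a fit-to-data axis would draw them. `fit_to_range` is a hypothetical helper standing in for whatever a charting library does when it fits the axis to the min and max.

```python
def fit_to_range(values):
    """Rescale values to 0..1, as an axis fitted to the data's min and max would draw them."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

wide = [27.0, 29.0, 28.0]    # spread of 2.0 degrees
narrow = [27.0, 27.2, 26.9]  # spread of about 0.3 degrees

for name, series in (("wide", wide), ("narrow", narrow)):
    scaled = [round(s, 2) for s in fit_to_range(series)]
    spread = round(max(series) - min(series), 1)
    print(f"{name}: spread {spread}, drawn heights {scaled}")
```

Both series end up spanning the full chart height from 0 to 1, so the roughly sevenfold difference in spread disappears from the picture unless the scale is fixed.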

Tyr42 · on Dec 15, 2021

Yeah, you should graph both starting at 0K right? You wouldn't want to mislead people into thinking something at 10C is ten times hotter than something at 1C.
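
For what it's worth, a tiny sketch of that ratio on an absolute scale. `kelvin_ratio` is a name invented here; the only fact it relies on is that 0 °C is 273.15 K.

```python
ZERO_C_IN_K = 273.15

def kelvin_ratio(hot_c, cold_c):
    """Ratio of two Celsius temperatures after converting both to Kelvin."""
    return (hot_c + ZERO_C_IN_K) / (cold_c + ZERO_C_IN_K)

naive = 10.0 / 1.0                   # ratio read straight off the Celsius numbers
absolute = kelvin_ratio(10.0, 1.0)   # ratio on a scale with a true zero

print(f"Celsius ratio: {naive:.1f}")
print(f"Kelvin ratio:  {absolute:.3f}")
```

On the Kelvin scale, 10C is only about 3% above 1C.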

sandgiant · on Dec 15, 2021

Indeed. And if they don't, you are probably better off normalizing your axis anyway.

eric_trackjs · on Dec 15, 2021

Agreed. Next time I'll make the text and other things a little larger too (the real graphs are actually quite large, I had to shrink them to fit the article formatting.) I'd already spent so much time on the article I didn't want to go back and redo the graphs (I didn't really think too many people would read it - it was a big surprise to see it on HN)

remram · on Dec 15, 2021

You might also want to actually put HTTP 1/2/3 side-by-side in each graph, and separate graphs by use case, rather than the current visualization, which puts use cases side by side and HTTP 1/2/3 in different graphs.

edit: like this: https://imgur.com/a/7Gvq59j

brnt · on Dec 15, 2021

> I am seeing a lot of comments about how the graphs are not anchored at 0.

Personal preference: for large offsets that makes sense. For small ones (~10% of max here) it seems unnecessary, or, to a suspicious mind, meant to hide something ;)

eric_trackjs · on Dec 15, 2021

Yep, I didn't even think about the suspicious angle when I did it. Mostly I was fiddling with how to draw box plots in D3 and that's what came out. Next time I will ensure a 0 axis!

ReactiveJelly · on Dec 15, 2021

Some of the India offsets are huge.

And the numbers are too small to read, and reading numbers on a graph is mixing System 1 and System 2 thinking, anyway.

I agree that the graphs would be better and still impressive even anchored to 0

posix_me_less · on Dec 15, 2021

Could it be simply because UDP packets may be treated with higher priority by all the middleman machines? UDP is used by IP phones, video conferences, etc.

digikata · on Dec 15, 2021

Because of the use of UDP, I wonder if the variation could end up being wider. There are some large outliers on the chart (though that could be an implementation maturity issue). Also I wonder if the routing environment will continue to be the same in terms of prioritization if http/3 becomes more common. There might be some motivation to slow down http/3 to prioritize traditional udp real-time uses.

rhplus · on Dec 15, 2021

Adding an 'axis break' is a great way to focus in on the range of interest while also highlighting the fact that it's not zero-based.

kzrdude · on Dec 15, 2021

The missing part of the axis, down to 0, is just 1/9th of the current length, so I think it's absolutely the wrong trade-off to cut the y-axis.

aquadrop · on Dec 15, 2021

Thanks for the article. But the goal of the graphs should be to show the level of change and let the graph speak for itself, whether it's high or low. If they were anchored at 0 it would actually let you see the visual difference, and for me personally it would be a "way that was mostly readable side by side".

Railsify · on Dec 15, 2021

But they all start at the same anchor value, I was immediately able to interpret them and did not feel misled.

maxmcd · on Dec 15, 2021

I wouldn't interpret those comments as accusation. It's in all our best interests to critique possibly misleading graphs, even when done so unintentionally.

morrbo · on Dec 15, 2021

I found the article very useful regardless so thank you

marcos100 · on Dec 15, 2021

The charts are ok for the purpose of visualizing the performance of the protocols. At least for me, they are side-by-side with the same min and max values and are easy to compare. The purpose is clear and starting from zero adds nothing of value.

People who think the bottom line is zero don't know how to read a chart.

What may be missing is a table of the statistics for comparing numbers, maybe with a latency number.