
Author here. I am seeing a lot of comments about how the graphs are not anchored at 0. The intent with the graphs was not to "lie" or "mislead" but to fit the data in a way that was mostly readable side by side.

The goal was to show the high-level change, in a glanceable way, not to get into individual millisecond comparisons. However, in the future I would pick a different visualization I think :)

The benchmarking has also come under fire. My goal was just to put the same site/assets on three different continents and retrieve them a bunch of times. No more, no less. I think the results are still interesting, personally. Clean room benchmarks are cool, but so are real world tests, imo.

Finally, there was no agenda with this post to push HTTP/3 over HTTP/2. I was actually skeptical that HTTP/3 made any kind of difference based on my experience with 1.1 to 2. I expected to write a post about "HTTP/3 is not any better than HTTP/2" and was frankly surprised that it was so much faster in my tests.



> However, in the future I would pick a different visualization I think

I think the box plots were a good choice here. I quickly understood what I was looking at, which is a high compliment for any visualization. When it's done right it seems easy and obvious.

But the y-axis really needs to start at 0. It's the only way the reader will perceive the correct relative difference between the various measurements.

As an extreme example, if I have measurements [A: 100, B: 101, C: 105], and then scale the axes to "fit around" the data (maybe from 100 to 106 on the y axis), it will seem like C is 5x larger than B. In reality, it's only 1.05x larger.

Leave the whitespace at the bottom of the graph if the relative size of the measurements matters (it usually does).
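
To make the arithmetic in that example concrete, here is a minimal Python sketch. The helper name `apparent_ratio` is made up for illustration, and it assumes plain bars drawn from the axis minimum up to each value; it compares the heights as drawn with the true ratio of the numbers.

```python
def apparent_ratio(a, b, axis_min=0.0):
    """Ratio of the drawn bar heights of a and b when the axis starts at axis_min."""
    if b <= axis_min:
        raise ValueError("b must lie above the axis minimum")
    return (a - axis_min) / (b - axis_min)


# Values taken from the example in the comment above.
measurements = {"A": 100, "B": 101, "C": 105}

# Axis truncated at 100: C's bar is drawn 5 units tall, B's bar 1 unit tall.
truncated = apparent_ratio(measurements["C"], measurements["B"], axis_min=100)

# Axis anchored at 0: drawn heights are proportional to the true values.
anchored = apparent_ratio(measurements["C"], measurements["B"], axis_min=0)

print(f"truncated axis:  C looks {truncated:.2f}x B")
print(f"zero-based axis: C looks {anchored:.2f}x B")
```

On the zero-based axis the C-to-B ratio comes out near 1.04; the 1.05x figure corresponds to C relative to A.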


>> It's the only way the reader will perceive the correct relative difference...

Every day, the stock market either goes from the bottom of the graph to the top, or from the top all the way to the bottom. Sometimes it takes a wild excursion covering the whole graph and then retreats a bit toward the middle. Every day. Because the media likes graphs that dramatize even a 0.1 percent change.


No, the media just happens to sometimes share OP’s intent: to show a (small) absolute change. That change may or may not be as dramatic as the graph suggests in both visualizations: measured in Kelvin, your body temperature increasing by 8 K looks like a tiny bump when you anchor it at absolute zero. “You” being the generic “you”, because at 47 deg C body temperature, the other you is dead.

It will be visible if you work in Celsius, a unit that is essentially a cut-off Y axis to better fit the origin within the domains we use it for.


The change still needs context.

We have an intuitive sense of what 30 degrees is, assuming it is in our preferred system of measurement.

A stock market graph really should be showing the percentage change, not some small absolute change that is not immediately understood by the typical layperson.
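
As a rough sketch of what "showing the percentage change" means in practice, the Python below converts an absolute move into a percentage. The index values are hypothetical numbers chosen to reproduce the 0.1 percent move mentioned upthread, not real market data, and `percent_change` is an illustrative helper name.

```python
def percent_change(old, new):
    """Percentage change from old to new, relative to old."""
    if old == 0:
        raise ValueError("percentage change is undefined for a zero baseline")
    return (new - old) / old * 100.0


# Hypothetical index levels: a 10-point move on a 10,000-point index.
open_level = 10_000.0
close_level = 10_010.0

change = percent_change(open_level, close_level)
print(f"absolute change: {close_level - open_level:+.0f} points")
print(f"relative change: {change:+.2f}%")
```

Plotted as a percentage on a fixed scale, such a day looks as small as it is, whereas an axis fitted to the day's range stretches it across the whole chart.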


This notion about cut-off y-axes is the data visualization equivalent of “correlation is not causation”: it’s a valid point that’s easily understood, so everyone latches on to it and then uses it to prove their smartitude, usually with the intonation of revealing grand wisdom.

Meanwhile, there are plenty of practitioners who aren’t oblivious to the argument, but rather long past it: they know there are situations where it’s totally legitimate to cut the axis. Other times, they might resort to a logarithmic axis, which is yet another method of making the presentation more sensitive to small changes.


There are plenty of instances where it's appropriate to use a y-axis that isn't "linear starting at zero." That's why I specified that I was only talking about ways to represent relative differences (i.e. relative to the magnitude of the measurements).

In this case, when we're measuring the latency of requests, without any other context, it's safe to say that relative differences are the important metric and the graph should start at zero.

So while it's true that this isn't universally the correct decision, and it's probably true that people regurgitate the "start at zero" criticism regardless of whether it's appropriate, it does apply to this case.


I think these choices are more context specific than is often appreciated. For example

> if I have measurements [A: 100, B: 101, C: 105], and then scale the axes to "fit around" the data (maybe from 100 to 106 on the y axis), it will seem like C is 5x larger than B. In reality, it's only 1.05x larger.

If you were interested in the absolute difference between the values then starting your axis at 0 is going to make it hard to read.


It is however very rare that absolute differences matter; and even when they do, the scale should (often) be fixed. For example the temperatures:

[A: 27.0, B: 29.0, C: 28.0]

versus:

[A: 27.0, B: 27.2, C: 26.9]

If scale is fit to the min and max values, the charts will look the same.

Still, as a rule of thumb, when the Y axis doesn't start at 0, the chart is probably misleading. It is very rare that the absolute size of the measured quantity doesn't matter.
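
A small Python sketch of the temperature example, assuming the axis is fitted exactly to the minimum and maximum of each series (the function name `fit_to_range` is illustrative). It shows that both series end up spanning the full height of the plot, so a 2-degree spread and a 0.3-degree spread look equally dramatic.

```python
def fit_to_range(values):
    """Map values to [0, 1] using their own min and max, as an auto-fitted axis does."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are equal; the fitted range is empty")
    return [(v - lo) / (hi - lo) for v in values]


# The two temperature series from the comment above, in order A, B, C.
wide = [27.0, 29.0, 28.0]
narrow = [27.0, 27.2, 26.9]

wide_pos = fit_to_range(wide)
narrow_pos = fit_to_range(narrow)

print("wide spread:  ", [round(p, 2) for p in wide_pos])
print("narrow spread:", [round(p, 2) for p in narrow_pos])
print("actual spreads:", max(wide) - min(wide), "vs", round(max(narrow) - min(narrow), 1))
```

The positions of the individual points differ, but in both charts B sits at the very top and the lowest point sits at the very bottom; only a fixed scale reveals that one spread is several times the other.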


Yeah, you should graph both starting at 0K right? You wouldn't want to mislead people into thinking something at 10C is ten times hotter than something at 1C.
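
For what it's worth, the numbers behind that quip can be checked with a short Python sketch. It assumes the standard 273.15 offset between Celsius and Kelvin; the variable names are illustrative.

```python
KELVIN_OFFSET = 273.15  # 0 degrees Celsius expressed in kelvin


def celsius_to_kelvin(celsius):
    """Convert a Celsius temperature to kelvin."""
    return celsius + KELVIN_OFFSET


cold_c, warm_c = 1.0, 10.0

# Ratio of the raw Celsius readings: an artifact of where Celsius puts its zero.
ratio_celsius = warm_c / cold_c

# Ratio on an absolute scale, where zero really means "no thermal energy".
ratio_kelvin = celsius_to_kelvin(warm_c) / celsius_to_kelvin(cold_c)

print(f"ratio of Celsius readings: {ratio_celsius:.2f}")
print(f"ratio of kelvin readings:  {ratio_kelvin:.3f}")
```

The Celsius ratio is 10, while on the absolute scale the ratio is only about 1.03, which is why ratios only make sense on a scale with a meaningful zero.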


Indeed. And if they don't you are probably better off normalizing your axis anyway.


Agreed. Next time I'll make the text and other things a little larger too (the real graphs are actually quite large, I had to shrink them to fit the article formatting.) I'd already spent so much time on the article I didn't want to go back and redo the graphs (I didn't really think too many people would read it - it was a big surprise to see it on HN)


You might also want to actually put HTTP 1/2/3 side-by-side in each graph, and separate graphs by use case, rather than the current visualization, which puts use cases side by side and HTTP 1/2/3 in different graphs.

edit: like this: https://imgur.com/a/7Gvq59j


> I am seeing a lot of comments about how the graphs are not anchored at 0.

Personal preference: for large offsets that makes sense. For small ones (~10% of max here) it seems unnecessary, or, to a suspicious mind, meant to hide something ;)


Yep, I didn't even think about the suspicious angle when I did it. Mostly I was fiddling with how to draw box plots in D3 and that's what came out. Next time I will ensure a 0 axis!


Some of the India offsets are huge.

And the numbers are too small to read, and reading numbers on a graph is mixing System 1 and System 2 thinking, anyway.

I agree that the graphs would be better and still impressive even anchored to 0


Could it be simply because UDP packets may be treated with higher priority by all the middleman machines? UDP is used by IP phones, video conferences, etc.


Because of the use of UDP, I wonder whether the variation could end up being wider. There are some large outliers on the chart (though that could be an implementation maturity issue). Also I wonder if the routing environment will continue to be the same in terms of prioritization if HTTP/3 becomes more common. There might be some motivation to slow down HTTP/3 to prioritize traditional UDP real-time uses.


Adding an 'axis break' is a great way to focus in on the range of interest while also highlighting the fact that it's not zero-based.


The missing part of the axis, down to 0, is just 1/9th of the current length, so I think it's absolutely the wrong trade-off to cut the y-axis.


Thanks for the article. But the goal of the graphs should be to show the level of change and let the graph speak for itself, whether it's high or low. If they were anchored at 0, it would actually let the reader see the visual difference, and for me personally that would be the "way that was mostly readable side by side".


But they all start at the same anchor value, I was immediately able to interpret them and did not feel misled.


I wouldn't interpret those comments as accusations. It's in all our best interests to critique possibly misleading graphs, even when they are misleading unintentionally.


I found the article very useful regardless so thank you


The charts are OK for the purpose of visualizing the performance of the protocols. At least for me, they are side by side with the same min and max values and are easy to compare. The purpose is clear and starting from zero adds nothing of value.

People who think the bottom line is zero don't know how to read a chart.

What is maybe missing is a table with the statistics to compare numbers, maybe with a latency number.



