> We dispatch instanced draw calls where each instance corresponds to a segment’s quad. The vertex shader finds all of the information it needs from the primitive offset and segment id of the quad it is working on.
Hm, any reason you're using instanced quads? From my understanding of how GPUs work, instancing only has big gains when you have a lot of vertex data in the draw call [0]. Instead of the instanced quads, have you considered making a vertex buffer containing two shapes: a + for the center/edges, and four quads for the corners? That way you could still do the whole thing in two draw calls, and you're not going through the slow instancing path. And if you have to blend the bg anyway (e.g. rgba background-color), you can just draw the whole vertex buffer in one go to skip the extra draw call.
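To make the suggestion concrete, here is a rough sketch of how such a vertex buffer could be laid out for a single rounded rectangle. It is only an illustration of the idea, not how the renderer under discussion works: the names `quad` and `rounded_rect_vertices` are made up for this example, the + is built from three non-overlapping quads so nothing is drawn twice, and the corner quads are assumed to be the only part that needs blending.

```python
def quad(x0, y0, x1, y1):
    """Two triangles (6 vertices) covering the rectangle (x0, y0)-(x1, y1)."""
    return [(x0, y0), (x1, y0), (x1, y1),
            (x0, y0), (x1, y1), (x0, y1)]


def rounded_rect_vertices(x, y, w, h, r):
    """Hypothetical helper: vertex list plus two (first, count) draw ranges.

    Assumes a single corner radius r with 2 * r <= min(w, h).
    """
    # The "+": a full-height centre strip plus the left and right edge strips.
    plus = (quad(x + r, y, x + w - r, y + h)
            + quad(x, y + r, x + r, y + h - r)
            + quad(x + w - r, y + r, x + w, y + h - r))
    # Four r-by-r corner quads; the fragment shader would round these off.
    corners = (quad(x, y, x + r, y + r)
               + quad(x + w - r, y, x + w, y + r)
               + quad(x, y + h - r, x + r, y + h)
               + quad(x + w - r, y + h - r, x + w, y + h))
    vertices = plus + corners
    opaque_range = (0, len(plus))              # draw without blending
    blended_range = (len(plus), len(corners))  # draw with blending
    return vertices, opaque_range, blended_range


vertices, opaque_range, blended_range = rounded_rect_vertices(0, 0, 100, 50, 10)
print(len(vertices), opaque_range, blended_range)
```

The two ranges map onto the two draw calls; if the background has to be blended anyway, drawing the full range `(0, len(vertices))` covers everything in one call.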
How much does anti-aliasing (e.g. of rounded corners) improve visual quality on modern high-DPI displays in any case? As far as I understand, macOS dropped it for font rendering.
If the contrast is high enough (such as black on white), then I find the difference very clearly visible on my 2× Surface Book display, and subtle but definitely present at lower contrasts. You’ve got to get a lot higher than what’s available on the mass market now before you’ll want to drop antialiasing altogether.
What macOS may have dropped is subpixel rendering—trebling probably-horizontal antialiasing precision by knowing the physical layout of subpixels. They won’t have dropped all antialiasing for font rendering, because that would be very visible.
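A toy sketch of the difference, in case it helps. It assumes a horizontal RGB stripe layout and coverage sampled at three times the horizontal pixel resolution; the function names are invented for this example, and real rasterisers also filter across neighbouring subpixels to reduce colour fringing, which is left out here.

```python
def grayscale_aa(coverage_3x):
    """Plain antialiasing: average each group of three samples into one value."""
    return [sum(coverage_3x[i:i + 3]) / 3
            for i in range(0, len(coverage_3x), 3)]


def subpixel_aa(coverage_3x):
    """Subpixel rendering: each sample drives one of the R, G, B stripes."""
    return [tuple(coverage_3x[i:i + 3])
            for i in range(0, len(coverage_3x), 3)]


# A glyph edge that ends one third of the way into the second pixel.
# The input length is assumed to be a multiple of three.
row = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0]
print(grayscale_aa(row))
print(subpixel_aa(row))
```

The grayscale version can only say "this pixel is about a third covered", while the subpixel version records which third, which is where the extra horizontal precision comes from.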
(Note: Firefox seems to be ruining the feColorMatrix, utterly misinterpreting alpha values above 1. This makes the greys disappear to white. And once I turn WebRender on, the second column doesn’t appear at all! I’d check for bug reports on Bugzilla and file one if I didn’t find one, but I should be in bed.)
[0] http://www.joshbarczak.com/blog/?p=667