Cool! I just want to point out that generating an irregular mesh can be quite hardware-unfriendly if you want to tessellate the terrain dynamically around a moving camera. The well-cited papers on geometry clipmaps highlight this. The point is that it can be more efficient to focus on feeding the GPU pipeline efficiently rather than on rendering as few triangles as possible. Irregular meshes thrash the cache if you retriangulate them all the time (since no data can be reused). A uniform grid has many advantages; irregular meshes are only great if you're just interested in rendering one static view!
Good point! I'm coming to this in the context of interactive maps, where pretty much all data is tiled. So we can use tiles for loading the terrain mesh & adjusting level of detail too while avoiding frequent GPU memory updates. But I'm a newbie in 3D rendering, so this is more of an exploratory/learning work than something to use in production — I've heard a lot about geometry clipmaps and would love to explore them too!
Would love to learn more, since it seems counter-intuitive — are there any studies / benchmarks on this for modern hardware? Why would a uniform mesh with, say, 10x more triangles perform better if it's not updated often (e.g. a few times a second)?
Mainly because GPU throughput is increasing much faster than CPU throughput. Spending cycles removing triangles doesn't matter if the GPU can just churn through them like nothing. In the context of a moving camera you want to reuse as much data as possible between terrain updates. With regular nested grids a ton of data can be reused, only the height needs updating as you move the grids around (and height can be reused between adjacent grids half the time if they have a power of two resolution relationship).
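To make the reuse point concrete, here's a minimal sketch (mine, not from any clipmap paper) of the power-of-two trick: with a 2:1 resolution step between nested grids, every other sample of the finer grid lands exactly on a coarse sample, so half the heights can be copied instead of fetched again. The `coarse_height` function is a hypothetical stand-in for a real heightmap lookup.

```python
# Sketch: with a power-of-two resolution relationship, every other sample
# of the finer grid coincides with a coarse sample, so its height can be
# reused rather than re-fetched.

def coarse_height(x):
    # stand-in for a real heightmap fetch (hypothetical data)
    return float(x * x % 7)

# coarse grid: samples every 2 units
coarse = {x: coarse_height(x) for x in range(0, 17, 2)}

# fine grid: samples every 1 unit; reuse heights where positions coincide
fine = {}
fetched = 0
for x in range(0, 17):
    if x in coarse:
        fine[x] = coarse[x]          # reused from the coarser grid, no fetch
    else:
        fine[x] = coarse_height(x)   # only the odd samples need a real fetch
        fetched += 1

print(fetched, len(fine))  # 8 17 — only half the fine samples needed a fetch
```

Toy numbers, but the ratio is the point: half the finer ring's height data comes for free from the coarser ring.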
But baking irregular meshes into a tree structure could be powerful too, I suppose; I'm not that familiar with algorithms involving them. All of this is highly dependent on the actual size of the terrain and the LOD requirements. Terrain rendering is a deep rabbit hole.
Yeah, it's a fine balance, and to me, as someone without years of experience in graphics programming, the GPU feels like such a black box: it's hard to predict what the actual bottleneck will be until you implement an approach fully, with real-world loads, and even then it's hard to assess, especially with such a variety of GPU performance characteristics across devices.
So whatever is simpler / easier to implement in a given system will play an important role, as will any additional things you want to do with the height data, like collision detection, querying, data analysis, etc. Anyway, it's a super-exciting topic to learn more about! Thanks a lot for all the thoughtful comments.
It'll come down to how frequently you're doing updates. A few times a second will certainly cost more than a prefab mesh with a texture fetch shader.
If you can precompute various LOD chunks for each tile and cache them, that's optimal under some circumstances, but I don't think any of those circumstances involve WebGL. A typical heightmap-based terrain renderer today will consist of several (3-5) prefab LOD meshes deformed by heightmap textures in the shader, like [1]. Swapping textures in and out of VRAM is cheaper than swapping geometry, even if that geometry is of lower resolution.
Modern GPUs are almost never bound by the number of triangles on screen. The bottleneck will be fill rate, bandwidth, and pipeline stalls, especially in WebGL. If you can throw down more triangles to do less of the other things, it's almost always worth it.
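The "prefab mesh + heightmap fetch" idea can be sketched on the CPU; in a real renderer the lookup below would happen in the vertex shader, and a small 2D array stands in for the height texture. Everything here is illustrative, not anyone's actual implementation.

```python
# One shared flat grid is reused for every tile; only the "texture" changes.
GRID = 4  # prefab grid of (GRID+1) x (GRID+1) vertices

def displace(heightmap):
    """Return the shared flat grid's vertices, displaced by a tile's heightmap."""
    n = len(heightmap) - 1
    verts = []
    for j in range(GRID + 1):
        for i in range(GRID + 1):
            u, v = i / GRID, j / GRID                  # 0..1 across the tile
            h = heightmap[round(v * n)][round(u * n)]  # nearest-sample fetch
            verts.append((u, v, h))
    return verts

tile_a = [[0] * 5 for _ in range(5)]    # flat tile
tile_b = [[10] * 5 for _ in range(5)]   # raised tile

# Same prefab geometry, two different heightmaps — only the fetch differs.
a = displace(tile_a)
b = displace(tile_b)
print(len(a), a[0][2], b[0][2])  # 25 0 10
```

That's why swapping textures is enough to show new terrain: the geometry buffer never changes.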
I have a toy mapbox tile renderer based on this approach. I'd love to know if you guys have plans to implement full 3d like google maps any time soon.
I would really love to see your toy Mapbox renderer — please share it, even if unpolished! We'd benefit a lot from fresh ideas on how to improve Mapbox GL performance.
GPUs are effectively “magic” and can do particular things hundreds or thousands of times faster than CPUs. Offloading a per-vertex or per-pixel task to the GPU, especially if doing so reduces bandwidth requirements, is almost always a big win.
I think reality is closer to this: GPUs are 20-50 times faster than CPUs at rasterization, and maybe 5-20 times faster at throughput computing.
For example, a consumer third-gen Ryzen with 12 cores peaks at 32 FLOPs/core/cycle. So at 3.8 GHz, that's a peak of ~1.4 SP TFLOPs (or 0.7 DP TFLOPs). But its 50-60 GB/s of memory bandwidth limits it somewhat.
Quick googling says the consumer Nvidia RTX 2080 peaks at 10 SP TFLOPs (or 0.314 DP TFLOPs, yes, less than half of the CPU example), with memory bandwidth at 448 GB/s.
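The CPU figure above follows from cores × FLOPs/core/cycle × clock; a quick sanity check of the arithmetic (the GPU figures are just the quoted specs):

```python
# Peak single-precision throughput = cores * FLOPs/core/cycle * clock.
cores, flops_per_cycle, clock_hz = 12, 32, 3.8e9
cpu_sp = cores * flops_per_cycle * clock_hz
print(cpu_sp / 1e12)        # ≈ 1.46 SP TFLOPs (halve for DP on this core)

gpu_sp = 10.0e12            # quoted RTX 2080 SP peak
print(gpu_sp / cpu_sp)      # GPU/CPU SP ratio, roughly 7x — within the 5-20x range
```

So even peak-vs-peak, the gap is well under two orders of magnitude, which is the point being made.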
GPUs win massively at rasterization, because they have huge memory bandwidth, a large array of texture samplers with hardware cache locality optimizations (like HW swizzling), texture compression, specialized hardware for z-buffer tests and compression, a ton of latency hiding hardware threads, etc.
But they're definitely not thousands or even hundreds of times faster.
Not very helpful if you need to do collision on the displaced mesh though, I would have thought, since a vertex shader only makes it "appear" as if the vertices have moved from their original positions?
I'm thinking of tracing for foot placement in conjunction with inverse kinematics for walking characters, or vehicle wheels traversing the terrain. I've run into this problem with UE4.
Would it be possible to either compute or approximate the vertex shader's displacement results on segments of the terrain that require collision detection? I bet it wouldn't be feasible for checking collisions of gun projectiles in a video game, but it might be feasible for something like making sure the player's model doesn't clip beneath the tessellated terrain.
UE4 has adaptive tessellation anyway, and you can scale the tessellation multiplier based on distance with a material function so that the landscape has more triangles closer to the camera and fewer further away. Beyond a certain distance (where the player wouldn't notice the detail anyway, so there's no point drawing it) you can scale the multiplier to zero to improve performance.
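A distance falloff like that can be sketched as a tiny function; the shape and the `max_mult`/`cutoff` parameters here are made up for illustration, not UE4's actual material graph:

```python
# Hypothetical distance-based tessellation falloff: full detail at the
# camera, linearly decreasing, and exactly zero at/after the cutoff.
def tess_multiplier(distance, max_mult=8.0, cutoff=5000.0):
    t = max(0.0, 1.0 - distance / cutoff)  # 1 at the camera, 0 past cutoff
    return max_mult * t

print(tess_multiplier(0.0), tess_multiplier(2500.0), tess_multiplier(9000.0))
# 8.0 4.0 0.0
```

Clamping to zero past the cutoff is what buys the performance: distant patches get no extra triangles at all.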
I'm not an expert, but honestly, in a game engine like UE4 where collision is important, it seems like the adaptive geometry tessellation approach is the way to go, at least from my initial experiments.
I was thinking of a somewhat different use case. I was thinking of a flight simulator where tessellation only occurred close to the camera, to provide the appearance of a more detailed mesh when viewed up close.
The problem is, with a quad mesh where each vertex is 20 or 30 meters apart, a plane flying nap-of-the-earth could end up clipping through the tessellated mesh without actually colliding with the non-tessellated terrain quad mesh being used for collision detection. Or worse, the player could collide with the terrain quad mesh while the tessellation makes it visually appear that they haven't hit the terrain.
The only two ways I see to solve this are to either not use tessellation up close, or to compute on the CPU the vertices produced by the tessellation and use that mesh for collision.
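A third option worth noting: if the displacement is a pure function of position, collision can evaluate that same function on the CPU instead of rebuilding the whole tessellated mesh. A minimal sketch, assuming plain bilinear interpolation over the coarse quad grid (a stand-in for whatever the real displacement does):

```python
# Keep collision and rendering in sync by evaluating on the CPU the same
# height function the GPU uses for displacement.
def terrain_height(x, z, heights, spacing=20.0):
    """Height at world (x, z), bilinearly interpolated from grid `heights`."""
    gx, gz = x / spacing, z / spacing
    i, j = int(gx), int(gz)
    fx, fz = gx - i, gz - j            # fractional position inside the quad
    h00, h10 = heights[j][i], heights[j][i + 1]
    h01, h11 = heights[j + 1][i], heights[j + 1][i + 1]
    return (h00 * (1 - fx) * (1 - fz) + h10 * fx * (1 - fz)
            + h01 * (1 - fx) * fz + h11 * fx * fz)

grid = [[0.0, 0.0], [0.0, 100.0]]        # 2x2 patch with one raised corner
print(terrain_height(10.0, 10.0, grid))  # 25.0 at the centre of the patch
```

As long as the shader and this query use the same interpolation, the plane collides exactly where the terrain appears to be.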
Geometry shaders seem like the bastard child of the graphics pipeline. They were a mess to implement in hardware, and the throughput isn't great when creating tons of triangles. The tessellation stages of the pipeline address this and are more suitable for arbitrary subdivision of triangles.
Not a question about the mesh generator, but if I want to create a (leaflet or mapbox) map that uses raster images as textures in vector maps, where do I start?
Did you leverage this approach to optimize terrain rendering/add detail on devices where hardware tessellation is not supported? Lovely algorithm in any case. [Edit: Just remembered the limitation is with WebGL itself, doh.]
This is awesome. I wrote a Terrain-RGB to greyscale PNG converter and would love to play around with this in a demo map for a 3D preview or something. How are you rendering the mesh in your notebook? Three.js?
Worked on my iPad, which is unusual for demos like that. And wicked fast! The article is beautifully written and the information presented with astonishing clarity. Well, well done.
Oh yes, quadtree-like structures come up all the time when you're dealing with maps. E.g. check out this algorithm I came up with for finding pole of inaccessibility of a polygon: https://github.com/mapbox/polylabel
Did my master's project on quadtrees, specifically distributed methods for solving differential equations. Using them for mapping and loading adjacent cells is a similar application, as a lot of my work was in trying to maintain locality when flattening the structure. I had a great time writing it up; they're fantastic structures with a lot of potential.
I love stuff like this. This is maybe the 5th or 6th time I've seen a 3D technique from the past become relevant for real-time rendering. The first one I saw was from the '50s, and someone just happened to explore it and it happened to run great on modern hardware. I bet there is a ton of stuff out there like this. I do wish there was more of a benchmarking standard: it's great that it now runs in 20ms on his laptop, but that says nothing about the actual computational requirements.
The easiest method would be downward "skirts" as mentioned above. A little more complicated one I'm thinking about is reconciling error maps on the border when an adjacent tile loads so that they split at the same point, and updating the corresponding mesh — should be fast enough to do on tile loads.
I think I've seen a rather pragmatic hack that has been used in Google Earth for their stitching of terrain: They simply add a kind of skirt at the boundary of each square chunk [0], so that there aren't any white/black pixels. There may be some artifacts between LoDs but they soon disappear when the next chunk gets more refined.
Edit: I think the Google terrain skirts were slanted, rather than falling straight down, so two neighbouring skirts intersect in a V.
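The straight-down variant of the skirt trick is easy to sketch: duplicate each border vertex, push the copy downward, and join the pairs into quads. The function and its `depth` parameter are my own illustration, not Google Earth's implementation:

```python
# Sketch of a terrain "skirt": for each edge vertex on a tile border, add
# a duplicate pushed downwards, then join the pairs into quads (two
# triangles each) so cracks between neighbouring tiles are hidden.
def make_skirt(border, depth=50.0):
    """border: list of (x, y, z) edge vertices in order along the tile edge."""
    lowered = [(x, y - depth, z) for (x, y, z) in border]
    tris = []
    for i in range(len(border) - 1):
        a, b = border[i], border[i + 1]
        a2, b2 = lowered[i], lowered[i + 1]
        tris.append((a, b, a2))   # upper triangle of the skirt quad
        tris.append((b, b2, a2))  # lower triangle
    return tris

edge = [(0.0, 12.0, 0.0), (10.0, 15.0, 0.0), (20.0, 11.0, 0.0)]
tris = make_skirt(edge)
print(len(tris))  # 4 triangles: two quads for three border vertices
```

A slanted skirt, as described in the edit above, would offset the lowered vertices outward as well as downward so that neighbouring skirts intersect in a V.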
If you are getting your data from an image, you already have a mesh (a quad mesh from pixels which could be broken down into triangles a few different ways).
Anybody know about best/easy/fast/interesting algorithms for doing the opposite of the post and "compressing" the data eliminating triangles for a prescribed error or some other metric?
That made it really clear that texture resolution trumps mesh resolution. Dialling down the mesh detail slider results in very little perceptual difference, but the blurry texture always stands out.
The non-textured demo actually looks better. Nothing that can't be fixed by stitching higher resolution sat imagery though!