This is very likely a platform-specific optimization. It
targets a very hot std::clamp call. However, the new code is
functionally near-identical to the libstdc++ implementation,
except that it is completely inlined. I can't say why my
version is faster, since I haven't read any assembly output.
More details are in the source code comments.
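For illustration only, a minimal sketch of what a clamp written out for inlining can look like. This is not the code from this change: the name clamp_inline and the by-value signature are assumptions made up for the example, and it only follows the documented std::clamp contract (lo must not be greater than hi).

```cpp
// Hypothetical helper, not the actual code from this change.
// Same contract as std::clamp: requires lo <= hi and returns v limited
// to the range [lo, hi]. Arguments are taken by value, and constexpr
// functions are implicitly inline, so the compiler is free to expand
// the comparisons directly at the call site.
template <typename T>
constexpr T clamp_inline(T v, T lo, T hi)
{
    return (v < lo) ? lo : (hi < v) ? hi : v;
}

// Compile-time checks of the three cases.
static_assert(clamp_inline(5, 0, 10) == 5, "value inside the range is unchanged");
static_assert(clamp_inline(-3, 0, 10) == 0, "value below the range maps to lo");
static_assert(clamp_inline(42, 0, 10) == 10, "value above the range maps to hi");
```

Whether something like this beats std::clamp depends on the compiler, flags, and target, so it should be measured on each platform rather than assumed.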
Two reasons:
- Makes it more straightforward to add brightmaps to the non-power-of-two rendering functions.
- Makes it easier to split off brightmap rendering. This hopefully improves performance, but I haven't thoroughly tested it.