> Regions that are flat color, for example (which occupy a nontrivial fraction of total area), can be rendered with no memory traffic at all, save the final output stage.
GPU rendering and compositing amount to the exact same thing most of the time. A "region of flat color", at least in principle, is just a 1x1 pixel texture that's "mapped" onto some sort of GPU-implemented surface that in turn is rendered onto the screen.
Hardware overlays merely accelerate the final rendering step; one can implement the exact same process either in hardware, or as a software-based "blitting" step.
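To make the equivalence concrete, here is a minimal CPU sketch (all names are illustrative, not from any real graphics API): filling a region with a solid color and stretching a 1x1 texture over the same region with nearest-neighbor sampling write exactly the same pixels, which is why "flat color region" and "trivially mapped texture" are the same thing to the hardware.

```rust
type Pixel = [u8; 4]; // RGBA

/// Fill a rectangular region of the framebuffer with a solid color.
fn fill_rect(
    fb: &mut [Pixel], width: usize,
    x0: usize, y0: usize, w: usize, h: usize, color: Pixel,
) {
    for y in y0..y0 + h {
        for x in x0..x0 + w {
            fb[y * width + x] = color;
        }
    }
}

/// "Blit" a texture onto the same region with nearest-neighbor sampling.
/// With a 1x1 texture, every sample hits the single texel.
fn blit_stretched(
    fb: &mut [Pixel], width: usize,
    x0: usize, y0: usize, w: usize, h: usize,
    tex: &[Pixel], tw: usize, th: usize,
) {
    for y in 0..h {
        for x in 0..w {
            let u = x * tw / w; // nearest-neighbor sample coordinates
            let v = y * th / h;
            fb[(y0 + y) * width + (x0 + x)] = tex[v * tw + u];
        }
    }
}

fn main() {
    let red: Pixel = [255, 0, 0, 255];
    let mut fb_a = vec![[0u8; 4]; 16 * 16];
    let mut fb_b = vec![[0u8; 4]; 16 * 16];

    fill_rect(&mut fb_a, 16, 2, 2, 8, 8, red);
    blit_stretched(&mut fb_b, 16, 2, 2, 8, 8, &[red], 1, 1);

    assert_eq!(fb_a, fb_b); // identical pixels either way
}
```

A hardware overlay does the `blit_stretched` half of this in the display controller during scanout instead of in a shader or CPU loop, but the resulting image is the same.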
This is good feedback: I need to be clear and avoid terminological confusion when I write that blog post.
Of course the compositor is using the GPU. The difference is entirely in how the capabilities of the GPU are exposed (or not) to the application. My thesis is that a compute kernel is ideal for 2D rendering workloads, because it lets the application express its scene graph in the most natural way, then evaluates it efficiently (in particular, avoiding global GPU memory traffic for intermediate textures) using the GPU's compute resources.
Of course you could, in theory, have a compositor API that lets you do region tracking for flat-color areas, and that would save some power, but no system I know of works that way.
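A minimal CPU sketch of the memory-traffic point (illustrative only, not any real compositor's or renderer's code): the compositor-style path materializes each layer as its own buffer and then blends buffer-by-buffer, while the fused, kernel-style path folds the whole layer stack per pixel in registers and writes only the final value. The images are identical; only the intermediate memory traffic differs.

```rust
#[derive(Clone, Copy)]
struct Layer {
    color: [f32; 4], // premultiplied RGBA
}

/// Source-over blend of `src` onto `dst` (both premultiplied alpha).
fn over(src: [f32; 4], dst: [f32; 4]) -> [f32; 4] {
    let a = src[3];
    [
        src[0] + dst[0] * (1.0 - a),
        src[1] + dst[1] * (1.0 - a),
        src[2] + dst[2] * (1.0 - a),
        src[3] + dst[3] * (1.0 - a),
    ]
}

/// Compositor-style path: render each layer into an intermediate
/// buffer ("texture"), then blend the buffers in order.
fn composite_buffered(layers: &[Layer], n_px: usize) -> Vec<[f32; 4]> {
    let mut out = vec![[0.0; 4]; n_px];
    for layer in layers {
        let buf = vec![layer.color; n_px]; // intermediate texture in memory
        for (o, s) in out.iter_mut().zip(buf.iter()) {
            *o = over(*s, *o);
        }
    }
    out
}

/// Fused kernel-style path: for each pixel, fold the entire layer
/// stack in registers; no intermediate buffer is ever written.
fn composite_fused(layers: &[Layer], n_px: usize) -> Vec<[f32; 4]> {
    (0..n_px)
        .map(|_| layers.iter().fold([0.0; 4], |acc, l| over(l.color, acc)))
        .collect()
}

fn main() {
    let layers = [
        Layer { color: [0.5, 0.0, 0.0, 0.5] },
        Layer { color: [0.0, 0.25, 0.0, 0.25] },
    ];
    let a = composite_buffered(&layers, 8);
    let b = composite_fused(&layers, 8);
    assert_eq!(a, b); // same image, very different memory traffic
}
```

On a GPU the fused path is what a compute kernel buys you: per-tile or per-pixel state stays in registers or shared memory instead of round-tripping through global memory as intermediate textures.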