Commit Graph

2 Commits

Author SHA1 Message Date
Elias Naur e6c31a02fd gpu: [compute] cache and re-use drawing operations from the previous frame
The compute renderer is more expensive to run than the old renderer on
low-end GPUs, and even more so on CPUs. To ensure good performance
regardless of the end-user device, this change implements automatic
re-use of content rendered in the frame before the current.

The basic idea is that every drawing operation (PaintOp), along with its
transform and clipping, can be hashed and efficiently looked up. A naïve
caching approach is then to rasterize every operation to separate
sections of several large texture atlases, turning a cache hit into a
very cheap texture copy.

However, for scenes with lots of overlapping operations, the resulting
texture memory from separating the operations would be much larger than
the memory for just the window framebuffer.

So instead of caching individual operations, this change caches layers,
which are sequences of drawing operations. It starts by putting all
operations into a single layer. Then, if the subsequent frame re-uses a
sub-sequence of that larger layer, it is split.

For example, consider a UI similar to the kitchen sample:

Hello, Gio

<Editor>

<Line Editor>

<Button> <Button> <Button>

<ProgressBar>

<Checkbox> <Toggle>

In the first frame, all of the drawing operations comprising the UI will
be stored and cached in a single layer. In the second frame the
progress bar will have moved and the renderer splits the UI into three
layers: layer A for everything up to (but not including) the progress
bar, layer B with just the progress bar, and layer C for the rest. Note
that nothing has been re-used yet. In the third frame, the progress bar
moves again, and this time layer A and C can be copied from the cache
only the progress bar needs redrawing through the compute programs.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-07-27 14:34:18 +02:00
Elias Naur d23514fd58 gpu: add compute implementation
The old renderer is still the default, so the new compute renderer will only be
used in the rare case the old renderer is not supported but the new is. That
happens on the Samsung J2 Prime and Moto C Android phones. Or set the
GIORENDERER environment variable to "forcecompute" to disable the old renderer:

$ GIORENDERER=forcecompute go run ...

Missing features:
- Gradients are not supported yet, and render as a solid color.
- Draw timers are not added, and profile.Events are not emitted.
- Stroked paths may in some cases appear corrupted because their clip
  outlines are not continuous when generated by Gio. Sebastien is
  working on a fix.
- The new renderer shares most CPU-side logic with the old renderer,
  resulting in several inefficient conversion steps between the old
  operations representation and the new. This is slower, but minimizes
  divergence in features and bugs between the two renderers.

Roadmap:
- The compute renderer supports features that Gio does not yet
exploit: stroked paths with round caps, transformations, lines,
cubic beziér curves.
- More stroke styles and maybe dashed strokes natively in shaders.
- Metal and Direct3D ports.

The most important feature is porting the renderer to run on the CPU. A
CPU renderer will both support Gio on devices with insufficient GPU
support, and allow us to remove the old renderer. Two renderers is twice
the maintenance but the feature set of the weakest implementation.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2020-12-31 17:21:35 +01:00