This change avoids the hard dependency on GPU support for sRGB encoded
textures in the compute renderer.
With this change and the previously added CPU fallback, Gio no longer
relies on any GPU functionality beyond the OpenGL ES 2.0 level.
Fixes gio#49
Fixes gio#154
Fixes gio#97
Fixes gio#36
Fixes gio#172
Signed-off-by: Elias Naur <mail@eliasnaur.com>
The compute renderer is more expensive to run than the old renderer on
low-end GPUs, and even more so on CPUs. To ensure good performance
regardless of the end-user device, this change implements automatic
re-use of content rendered in the previous frame.
The basic idea is that every drawing operation (PaintOp), along with its
transform and clipping, can be hashed and efficiently looked up. A naïve
caching approach is then to rasterize every operation to separate
sections of several large texture atlases, turning a cache hit into a
very cheap texture copy.
However, for scenes with lots of overlapping operations, the resulting
texture memory from separating the operations would be much larger than
the memory for just the window framebuffer.
So instead of caching individual operations, this change caches layers,
which are sequences of drawing operations. It starts by putting all
operations into a single layer. Then, if the subsequent frame re-uses a
sub-sequence of that larger layer, it is split.
For example, consider a UI similar to the kitchen sample:
Hello, Gio
<Editor>
<Line Editor>
<Button> <Button> <Button>
<ProgressBar>
<Checkbox> <Toggle>
In the first frame, all of the drawing operations comprising the UI will
be stored and cached in a single layer. In the second frame the
progress bar will have moved and the renderer splits the UI into three
layers: layer A for everything up to (but not including) the progress
bar, layer B with just the progress bar, and layer C for the rest. Note
that nothing has been re-used yet. In the third frame, the progress bar
moves again, and this time layers A and C can be copied from the cache;
only the progress bar needs redrawing through the compute programs.
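The splitting step can be illustrated with a minimal sketch. It assumes
each operation has already been reduced to a 64-bit hash, and it only
finds an unchanged prefix and suffix around one changed run; the layer
type and the splitLayers function are hypothetical names for
illustration, not the renderer's actual code. A layer marked unchanged
here becomes a cache hit only in the frame after the split.

```go
package main

import "fmt"

// layer is a run of consecutive drawing operations, identified here
// only by their hashes (hypothetical representation).
type layer struct {
	ops       []uint64
	unchanged bool // true if the run is identical to the previous frame
}

// splitLayers compares the op hashes of the previous and current frame
// and splits the current frame into up to three layers: the unchanged
// prefix, the changed middle, and the unchanged suffix.
func splitLayers(prev, cur []uint64) []layer {
	// Length of the common prefix.
	p := 0
	for p < len(prev) && p < len(cur) && prev[p] == cur[p] {
		p++
	}
	// Length of the common suffix, not overlapping the prefix.
	s := 0
	for s < len(prev)-p && s < len(cur)-p &&
		prev[len(prev)-1-s] == cur[len(cur)-1-s] {
		s++
	}
	var layers []layer
	if p > 0 {
		layers = append(layers, layer{ops: cur[:p], unchanged: true})
	}
	if mid := cur[p : len(cur)-s]; len(mid) > 0 {
		layers = append(layers, layer{ops: mid})
	}
	if s > 0 {
		layers = append(layers, layer{ops: cur[len(cur)-s:], unchanged: true})
	}
	return layers
}

func main() {
	// Hashes stand in for: label, editors, buttons | progress bar | checkbox, toggle.
	frame1 := []uint64{1, 2, 3, 40, 5, 6}
	frame2 := []uint64{1, 2, 3, 41, 5, 6} // only the progress bar changed
	for _, l := range splitLayers(frame1, frame2) {
		fmt.Println(l.ops, l.unchanged)
	}
}
```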
Signed-off-by: Elias Naur <mail@eliasnaur.com>
GPU APIs require that barrier() calls are dynamically uniform: for
every barrier in the code, either all shader invocations in a
workgroup call it, or none of them do.
Signed-off-by: Elias Naur <mail@eliasnaur.com>
The fill mode is now controlled by a SetFillMode command, not by flags
on each path segment and fill command.
Signed-off-by: Elias Naur <mail@eliasnaur.com>
We're about to change the last stage of the compute pipeline to only
accept images, not sampled textures. This change prepares materials
for pixel-aligned image copying by pre-rendering images to a texture,
applying transforms.
Signed-off-by: Elias Naur <mail@eliasnaur.com>
The old renderer is still the default, so the new compute renderer will only be
used in the rare case the old renderer is not supported but the new is. That
happens on the Samsung J2 Prime and Moto C Android phones. Alternatively, set
the GIORENDERER environment variable to "forcecompute" to disable the old
renderer:
$ GIORENDERER=forcecompute go run ...
Missing features:
- Gradients are not supported yet, and render as a solid color.
- Draw timers are not added, and profile.Events are not emitted.
- Stroked paths may in some cases appear corrupted because their clip
outlines are not continuous when generated by Gio. Sebastien is
working on a fix.
- The new renderer shares most CPU-side logic with the old renderer,
resulting in several inefficient conversion steps between the old
operations representation and the new. This is slower, but minimizes
divergence in features and bugs between the two renderers.
Roadmap:
- The compute renderer supports features that Gio does not yet
exploit: stroked paths with round caps, transformations, lines,
cubic Bézier curves.
- More stroke styles and maybe dashed strokes natively in shaders.
- Metal and Direct3D ports.
The most important feature is porting the renderer to run on the CPU. A
CPU renderer will both support Gio on devices with insufficient GPU
support, and allow us to remove the old renderer. Maintaining two renderers
doubles the maintenance work while limiting the feature set to that of
the weaker implementation.
Signed-off-by: Elias Naur <mail@eliasnaur.com>
The piet-gpu project is dual licensed under the Apache 2.0 and MIT licenses,
and the shaders themselves are also offered under the UNLICENSE terms. See
https://github.com/linebender/piet-gpu#license-and-contributions, as of commit
72e2dfab3da8ae1adf7a0fb056b71ccbc4cfa29a:
"The piet-gpu project is dual-licensed under both Apache 2.0 and MIT licenses.
In addition, the shaders are provided under the terms of the Unlicense. The
intent is for this research to be used in as broad a context as possible."
Signed-off-by: Elias Naur <mail@eliasnaur.com>
Reintroduce support for an offset in the stencil vertex so we can reuse
cached values if the only difference between transforms is the offset.
Split the current transform into a pure-offset part and the rest, and
use only the non-offset part as the cache key.
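A minimal sketch of the idea, assuming a plain 2D affine matrix: the
affine, point and cache types and the splitTransform function below are
hypothetical stand-ins, not Gio's actual types. Two transforms that
differ only in translation produce the same cache key, and the offset
is applied separately.

```go
package main

import "fmt"

// affine is a hypothetical 2D affine transform:
//
//	x' = a*x + b*y + tx
//	y' = c*x + d*y + ty
type affine struct {
	a, b, tx float32
	c, d, ty float32
}

type point struct{ x, y float32 }

// splitTransform separates t into its linear part (with zero
// translation) and its translation. Applying the linear part and
// then adding the offset is equivalent to applying t.
func splitTransform(t affine) (linear affine, offset point) {
	linear = affine{a: t.a, b: t.b, c: t.c, d: t.d}
	offset = point{t.tx, t.ty}
	return linear, offset
}

// cache maps the linear part of a transform to cached data; a string
// stands in for the real cached values.
type cache map[affine]string

func main() {
	c := cache{}
	t1 := affine{a: 2, d: 2, tx: 10, ty: 20}
	t2 := affine{a: 2, d: 2, tx: 30, ty: 5} // same scale, new offset

	k1, off1 := splitTransform(t1)
	c[k1] = "cached path data"

	k2, off2 := splitTransform(t2)
	_, hit := c[k2]
	fmt.Println(hit, off1, off2)
}
```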
Signed-off-by: Viktor <viktor.ogeman@gmail.com>
Add support for affine transformations. The key changes are outlined
below.
- Painting/clipping with rectangles is handled by, for complex
transforms, creating clipping paths representing the transformed
rectangle and using a larger bounding box. The Cover/Blit shaders are
updated correspondingly to correctly map texture coordinates from the
new bounding boxes.
- Since path splitting must happen on the CPU, the transforms must happen
on the CPU side as well, so offsets are removed from the shaders.
- Complex transforms will lead to different path splitting which means
that GPU arrays can no longer be cached if the transform has changed.
Thus the current transform is added as a key to the cache.
- Add a public API to op for setting Affine transformations.
There are a number of optimizations that could be explored further but
which are left out now:
- Caching CPU operations as well (e.g. path splitting and transforms),
not only the GPU arrays.
- Allow for re-use of cached GPU vertices if the transformation change
is a pure offset / scaling since the splitting is then the same.
Signed-off-by: Viktor <viktor.ogeman@gmail.com>
Safari's WebGL1 implementation (rightly) complains that first-class
array types are not supported as function result types. Define and
use a struct type instead.
While we're here, use const variables instead of functions.
Signed-off-by: Elias Naur <mail@eliasnaur.com>
We're forced by compatibility to encode an integer state into a
floating point. Make the implicit conversion from floating point to
integer more robust against GPUs with low precision floats.
Signed-off-by: Elias Naur <mail@eliasnaur.com>
OpenGL uses the [-1; 1] range for clip depths, Direct3D uses [0; 1].
Use toClipSpace to encapsulate the difference.
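As a rough illustration of the depth remapping, written in Go rather
than shader code: the vec4 type, the direct3D flag and this function's
signature are assumptions made for the sketch, and only the name
toClipSpace comes from the change itself.

```go
package main

import "fmt"

// vec4 is a hypothetical homogeneous clip-space position.
type vec4 struct{ x, y, z, w float32 }

// toClipSpace takes a position using the OpenGL depth convention
// (z/w in [-1, 1]) and converts it to the convention of the target API.
func toClipSpace(p vec4, direct3D bool) vec4 {
	if direct3D {
		// Remap z/w from [-1, 1] to [0, 1].
		p.z = (p.z + p.w) * 0.5
	}
	return p
}

func main() {
	near := vec4{z: -1, w: 1}
	far := vec4{z: 1, w: 1}
	fmt.Println(toClipSpace(near, true).z, toClipSpace(far, true).z)
	fmt.Println(toClipSpace(near, false).z, toClipSpace(far, false).z)
}
```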
Signed-off-by: Elias Naur <mail@eliasnaur.com>
Add the fboTextureTransform shader function for cancelling the
implied transformation between the fragments output by the fragment
shader and the (u, v) coordinates used to sample from them in a
later pass.
For OpenGL the transformation is the identity.
Signed-off-by: Elias Naur <mail@eliasnaur.com>
OpenGL supports casting from int to float during vertex array
reading. Direct3D doesn't. Since we're transpiling from GLSL, we can't
directly use the Direct3D builtin "asint". So that leaves using
"ivec2" instead of vec2.
Signed-off-by: Elias Naur <mail@eliasnaur.com>
Emulate them for the OpenGL ES backend because OpenGL ES 2.0 doesn't support
uniform buffers. The future Direct3D backend only supports uniform (constant)
buffers.
Signed-off-by: Elias Naur <mail@eliasnaur.com>
We're about to add Direct3D support, where shaders are written in
HLSL. Rather than write shaders twice (or more), convert them to
a GLSL variant understood by the glslcc cross-compiler and generate
the OpenGL ES 2.0 and HLSL variants. The HLSL output will be used by a
future change.
Signed-off-by: Elias Naur <mail@eliasnaur.com>