Commit Graph

51 Commits

Author SHA1 Message Date
Elias Naur 8cec7e04eb gpu,gpu/shaders: [compute] decode sRGB texels in shader when EXT_sRGB is missing
This change avoids the hard dependency on GPU support for sRGB encoded
textures in the compute renderer.

With this change and the previously added CPU fallback, Gio no longer
relies on any GPU functionality outside the OpenGL ES 2.0 level.

Fixes gio#49
Fixes gio#154
Fixes gio#97
Fixes gio#36
Fixes gio#172

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-07-29 09:05:55 +02:00
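To illustrate what decoding sRGB texels in a shader involves (commit
8cec7e04eb), the standard sRGB-to-linear transfer function is sketched
below in Go. The name srgbToLinear is hypothetical. The actual shader
code is not shown in this log, so this is only the textbook formula,
not necessarily what Gio's shaders do.

```go
package main

import (
	"fmt"
	"math"
)

// srgbToLinear converts one sRGB-encoded channel value in [0, 1] to
// linear light using the standard piecewise sRGB transfer function.
// Hypothetical helper; not taken from the Gio sources.
func srgbToLinear(c float64) float64 {
	if c <= 0.04045 {
		// Linear segment near black.
		return c / 12.92
	}
	// Power segment for the rest of the range.
	return math.Pow((c+0.055)/1.055, 2.4)
}

func main() {
	for _, c := range []float64{0, 0.5, 1} {
		fmt.Printf("sRGB %.2f -> linear %.4f\n", c, srgbToLinear(c))
	}
}
```

A shader would apply the same function to the red, green and blue
channels of each sampled texel and leave alpha untouched.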
Elias Naur b3a8c24334 gpu/internal/driver: rename TextureFormatSRGB to TextureFormatSRGBA
The format implies an alpha channel; name it accordingly.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-07-29 08:37:18 +02:00
Elias Naur ea38195e2e gpu: [compute] add CPU fallback
This change adds a CPU fallback for devices that neither support the
old renderer nor have GPU support for compute programs.

Most of the hard work is implemented in the gioui.org/cpu module. It
uses the SwiftShader project with light modification to output
statically compiled CPU .o files for each compute program.

The CPU fallback only covers Linux and Android on arm, arm64, amd64
architectures. There is no fundamental reason support can't be extended
to other platforms:

- macOS and iOS are probably easy, but it's likely that virtually every
  device has GPU support for compute shaders.
- Windows needs a Cgo-less port, or a build constraint to require a C
  compiler (Gio core doesn't).
- FreeBSD and OpenBSD are probably also easy to do because they're so
  similar to Linux.
- The 386 binaries didn't work properly in my tests, so fixes to
  SwiftShader are probably needed. However, I expect virtually every
  Intel device can run amd64 binaries.

Updates gio#49
Fixes gio#228

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-07-27 14:56:50 +02:00
Elias Naur 5197f637a7 gpu: [compute] compute and store clipping path hashes during construction
The hashes of the clipping paths that affect drawing operations are
computed and used to quickly determine that two operations are not
equal, the most likely outcome of a comparison.

However, for paths that are constructed once and cached, computing the
hash at every frame is wasteful. This is especially true for text, which
is both cached and also among the largest paths in a frame.

This change moves the hashing to op/clip.Path construction time, and
stores the hash in the ops list so it won't be re-computed at every use.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-07-27 14:34:18 +02:00
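A minimal sketch of hashing at construction time, as described in
commit 5197f637a7. The type cachedPath, the functions newPath and
maybeEqual, and the choice of FNV-1a are all assumptions made for
illustration; the log does not say which hash function Gio uses or how
the hash is laid out in the ops list.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// cachedPath is a hypothetical stand-in for encoded clipping path data.
type cachedPath struct {
	data []byte
	hash uint64 // computed once, when the path is constructed
}

// newPath hashes the path data a single time so that later frames can
// re-use the stored value instead of re-hashing the data.
func newPath(data []byte) cachedPath {
	h := fnv.New64a()
	h.Write(data) // hash.Hash.Write never returns an error
	return cachedPath{data: data, hash: h.Sum64()}
}

// maybeEqual returns false when the stored hashes prove that two paths
// differ. A true result still needs a full comparison of the data to
// rule out a hash collision.
func maybeEqual(a, b cachedPath) bool {
	return a.hash == b.hash
}

func main() {
	glyph := newPath([]byte("glyph outline data"))
	other := newPath([]byte("another outline"))
	fmt.Println(maybeEqual(glyph, glyph), maybeEqual(glyph, other))
}
```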
Elias Naur 88fb798cca gpu: [compute] speed up path comparisons with op keys
To re-use previously cached layers, the compute renderer must know
whether two drawing operations are equal. When two operations are not
equal, a fast hash comparison will most likely detect it. When two
operations are equal and have complicated clipping paths, the
comparison of the path data is expensive.

This change adds support for fast ops.Key comparisons, where two paths
are equal if their ops.Keys are. This is an optimization that kicks in
for text rendering, where glyph clipping shapes are re-used across
frames.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-07-27 14:34:18 +02:00
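The comparison order described in commit 88fb798cca can be sketched as
follows. The types opKey and pathOp, their fields, and the function
equalPaths are hypothetical; only the idea that equal keys imply equal
paths comes from the commit message.

```go
package main

import (
	"bytes"
	"fmt"
)

// opKey is a hypothetical identity for path data stored in an ops
// list. The zero value means "no key available".
type opKey struct {
	list    int // which ops list the path lives in
	offset  int // position of the path data in that list
	version int // bumped whenever the list is reset
}

// pathOp is a hypothetical drawing operation with a clipping path.
type pathOp struct {
	key  opKey
	hash uint64
	data []byte
}

// equalPaths tries the cheapest checks first.
func equalPaths(a, b pathOp) bool {
	// Same key: same cached path, so the data need not be compared.
	if a.key != (opKey{}) && a.key == b.key {
		return true
	}
	// Different hashes prove inequality without touching the data.
	if a.hash != b.hash {
		return false
	}
	// Last resort: the expensive byte-by-byte comparison.
	return bytes.Equal(a.data, b.data)
}

func main() {
	k := opKey{list: 1, offset: 64, version: 3}
	glyph := pathOp{key: k, hash: 42, data: []byte("outline")}
	fmt.Println(equalPaths(glyph, glyph))
}
```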
Elias Naur 4ab872e36a gpu: [compute] re-use layers that differ only in integer offsets
To re-use drawing operations common to two layers, every operation must
exactly match, including their transformations. However, layers that
differ only by an integer offset can be re-used because rendering does
not depend on the absolute integer offset. This is important in the very
common case of scrolling otherwise static UI content.

This change separates the integer offset from drawing operations and
relaxes the layer cache to match layers that differ only in integer
offsets.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-07-27 14:34:18 +02:00
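The idea in commit 4ab872e36a can be illustrated with a small sketch
that splits a translation into an integer offset and a fractional
remainder. The function splitOffset is hypothetical and works on plain
floats; the real separateTransform operates on Gio's transform type
and is not reproduced here.

```go
package main

import (
	"fmt"
	"math"
)

// splitOffset separates a translation (tx, ty) into an integer offset
// and a fractional remainder in [0, 1). Two layers whose fractional
// parts match can share cached pixels even if their integer offsets
// differ. Hypothetical helper for illustration.
func splitOffset(tx, ty float32) (fracX, fracY float32, offX, offY int) {
	fx := math.Floor(float64(tx))
	fy := math.Floor(float64(ty))
	return tx - float32(fx), ty - float32(fy), int(fx), int(fy)
}

func main() {
	// The same content before and after scrolling by 28 pixels.
	fx1, fy1, _, oy1 := splitOffset(100.25, 40.5)
	fx2, fy2, _, oy2 := splitOffset(100.25, 12.5)
	fmt.Println(fx1 == fx2 && fy1 == fy2, oy1-oy2)
}
```

In the scrolling case only the integer part changes between frames, so
a cache keyed on the remaining transform still matches.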
Elias Naur 318ddd0644 gpu: [compute] add function for separating integer offsets from transforms
Refactor only; separateTransform is needed in the following change.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-07-27 14:34:18 +02:00
Elias Naur e6c31a02fd gpu: [compute] cache and re-use drawing operations from the previous frame
The compute renderer is more expensive to run than the old renderer on
low-end GPUs, and even more so on CPUs. To ensure good performance
regardless of the end-user device, this change implements automatic
re-use of content rendered in the previous frame.

The basic idea is that every drawing operation (PaintOp), along with its
transform and clipping, can be hashed and efficiently looked up. A naïve
caching approach is then to rasterize every operation to separate
sections of several large texture atlases, turning a cache hit into a
very cheap texture copy.

However, for scenes with lots of overlapping operations, the resulting
texture memory from separating the operations would be much larger than
the memory for just the window framebuffer.

So instead of caching individual operations, this change caches layers,
which are sequences of drawing operations. It starts by putting all
operations into a single layer. Then, if the subsequent frame re-uses a
sub-sequence of that larger layer, it is split.

For example, consider a UI similar to the kitchen sample:

Hello, Gio

<Editor>

<Line Editor>

<Button> <Button> <Button>

<ProgressBar>

<Checkbox> <Toggle>

In the first frame, all of the drawing operations comprising the UI will
be stored and cached in a single layer. In the second frame the
progress bar will have moved and the renderer splits the UI into three
layers: layer A for everything up to (but not including) the progress
bar, layer B with just the progress bar, and layer C for the rest. Note
that nothing has been re-used yet. In the third frame, the progress bar
moves again, and this time layers A and C can be copied from the cache;
only the progress bar needs redrawing through the compute programs.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-07-27 14:34:18 +02:00
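The three-frame example from commit e6c31a02fd can be modelled with a
toy layer cache. Everything below is a simplified sketch under stated
assumptions: operations are represented by strings rather than hashed
PaintOps, the names renderer, split, layerKey and ui are hypothetical,
and the split only handles a single changed run between a common prefix
and a common suffix. The real renderer is more general.

```go
package main

import (
	"fmt"
	"strings"
)

// renderer is a toy model of the layer cache.
type renderer struct {
	prevOps []string        // operations of the previous frame
	cache   map[string]bool // layers rendered in the previous frame
}

// layerKey identifies a layer by the operations it contains.
func layerKey(layer []string) string { return strings.Join(layer, "\x00") }

// split cuts cur into the prefix shared with prev, the changed middle
// and the shared suffix, dropping empty parts.
func split(prev, cur []string) [][]string {
	p := 0
	for p < len(prev) && p < len(cur) && prev[p] == cur[p] {
		p++
	}
	s := 0
	for s < len(prev)-p && s < len(cur)-p && prev[len(prev)-1-s] == cur[len(cur)-1-s] {
		s++
	}
	var layers [][]string
	for _, l := range [][]string{cur[:p], cur[p : len(cur)-s], cur[len(cur)-s:]} {
		if len(l) > 0 {
			layers = append(layers, l)
		}
	}
	return layers
}

// frame renders ops and reports how many layers came from the cache.
func (r *renderer) frame(ops []string) (reused, total int) {
	layers := split(r.prevOps, ops)
	next := make(map[string]bool)
	for _, l := range layers {
		k := layerKey(l)
		if r.cache[k] {
			reused++ // cache hit: a cheap texture copy
		}
		next[k] = true
	}
	r.prevOps, r.cache = ops, next
	return reused, len(layers)
}

// ui returns the operations of the example UI for a progress value.
func ui(progress int) []string {
	return []string{"label", "editor", "line editor", "buttons",
		fmt.Sprintf("progress@%d", progress), "checkbox", "toggle"}
}

func main() {
	r := &renderer{}
	for i := 0; i < 3; i++ {
		reused, total := r.frame(ui(i))
		fmt.Printf("frame %d: %d of %d layers re-used\n", i+1, reused, total)
	}
}
```

Running the sketch reports 0 of 1 layers re-used in the first frame,
0 of 3 in the second and 2 of 3 in the third, matching the description
above.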
Elias Naur 938e51f111 gpu: [compute] clear viewport through glClear, not through compute
The performance difference is negligible, but the change is useful when
the compute pipeline can skip rendering to empty tiles.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-07-27 14:34:18 +02:00
Elias Naur 89ab5ebf4f gpu: [compute] unify resource cleanup
Rename all resource release methods to "Release", and release all
resources with a slice and loop.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-07-27 14:34:18 +02:00
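The cleanup pattern from commit 89ab5ebf4f, a common Release method
plus a slice and a loop, might look like the sketch below. The
interface resource and the types texture, buffer and counter are
hypothetical stand-ins, not Gio's actual types.

```go
package main

import "fmt"

// resource is anything that can be released. Hypothetical interface.
type resource interface {
	Release()
}

// counter records how many resources were released, for demonstration.
type counter struct{ released int }

type texture struct{ c *counter }

func (t texture) Release() { t.c.released++ }

type buffer struct{ c *counter }

func (b buffer) Release() { b.c.released++ }

// releaseAll releases every resource in one loop, replacing a list of
// differently named per-resource cleanup calls.
func releaseAll(resources []resource) {
	for _, r := range resources {
		r.Release()
	}
}

func main() {
	c := &counter{}
	releaseAll([]resource{texture{c}, buffer{c}, texture{c}})
	fmt.Println("released:", c.released)
}
```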
Elias Naur b87cbc04f3 gpu: [compute] add compute renderer specific decoding of ops
Until now, the two renderers have shared structures and code for
decoding drawing ops and converting them to GPU-friendly structures.

However, the decoder is tailored to the old renderer and uses
structures that poorly fit the new compute renderer.

This change copies the decoder and specializes the copy for the compute
renderer, avoiding a round-trip through the old renderer decoder.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-07-27 14:34:18 +02:00
Elias Naur 60a47e7de5 gpu/internal,internal/gl: add support for strided texture uploads
The CPU fallback of the compute renderer needs to upload subtextures
from a larger image.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-07-27 14:34:18 +02:00
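To show what a stride means when a subtexture is taken from a larger
image (commit 60a47e7de5), the sketch below addresses the rows of a
sub-rectangle explicitly and packs them on the CPU. The function
copySubImage and its parameters are hypothetical. A strided upload
would presumably let the driver read such rows in place; the driver
API itself is not shown in this log.

```go
package main

import "fmt"

// copySubImage packs the w×h rectangle at (x, y) of a larger image into
// a tightly packed buffer. stride is the distance in bytes between the
// starts of two consecutive source rows; bpp is the number of bytes per
// pixel. Hypothetical helper for illustration.
func copySubImage(src []byte, stride, bpp, x, y, w, h int) []byte {
	dst := make([]byte, 0, w*h*bpp)
	for row := 0; row < h; row++ {
		start := (y+row)*stride + x*bpp
		dst = append(dst, src[start:start+w*bpp]...)
	}
	return dst
}

func main() {
	// A 4x3 image with one byte per pixel, numbered 0..11.
	src := make([]byte, 12)
	for i := range src {
		src[i] = byte(i)
	}
	fmt.Println(copySubImage(src, 4, 1, 1, 1, 2, 2))
}
```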
Elias Naur 9188690e9e gpu: [compute] don't leak a texture if its framebuffer allocation fails
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-07-27 12:57:53 +02:00
Elias Naur d331f63d20 gpu: [compute] move material clip space transformation to the GPU
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-06-28 08:35:41 +02:00
Elias Naur 39775f555a gpu/internal/opengl: support sRGB emulation for embedded content
Programs such as gio-example/glfw rely on Gio drawing blending with
the framebuffer background. This change makes that blending work when
sRGB emulation is active.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-05-16 18:17:00 +02:00
Elias Naur 21c319ace5 gpu/internal/opengl,internal: move sRGB emulation to OpenGL driver
There is only one driver but several backends (EGL, WebGL).

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-05-16 10:44:45 +02:00
Elias Naur 82fff0178b gpu: [compute] generalize sizedBuffer to cover vertex buffers
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-05-04 21:35:39 +02:00
Elias Naur f655027110 gpu: [compute] add materials and blit timers to profiling output
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-04-27 16:49:06 +02:00
Elias Naur 53aad36ac1 gpu: [compute] move encoding to Collect
Collect is for converting ops to GPU commands, Frame is for actual
rendering. There's little practical difference, but the move makes it
easier to distinguish between conversion and rendering in profiles.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-04-27 16:49:06 +02:00
Elias Naur 157430a3d2 gpu: [compute] move timer initialization from Collect to Frame
GPU operations logically belong in the Frame method, and it's probably
best to keep them inside BeginFrame/EndFrame as well.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-04-27 16:49:06 +02:00
Elias Naur bc2c3db43e op/clip,gpu: move approximation of complex strokes to op/clip.Op.Add
Before this change, the two renderers both had special case code for
approximating strokes they don't support natively. This change moves
that conversion to clip.Op.Add, for several reasons:

- The compute renderer no longer needs fallback logic and caches for
  strokes it doesn't support.
- The approximation logic is slow. Moving it to clip.Op.Add will not
  speed it up, but will make the cost easier to spot in profiles. Until all
  strokes are supported natively, users can use macros to cache
  expensive strokes.
- Reduced garbage: Op.Add takes an op.Ops anyway, and can use that for
  storing the approximated stroke outline.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-23 19:37:45 +01:00
Elias Naur 0a4b6549da internal/stroke,gpu: move stroking of path data to package internal/stroke
Pure refactor, preparing for use in op/clip.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-23 15:42:43 +01:00
Elias Naur 7825bda8f8 internal/stroke,op/clip: don't import op/clip from internal/stroke
To avoid an import cycle in a future change, internal/stroke can no
longer import op/clip. Move required op/clip functionality to
internal/stroke and duplicate the remaining types.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-23 15:28:52 +01:00
Elias Naur 8c8d1dc16f internal/stroke,gpu: create internal package for stroke to path conversion
Complex strokes are not yet supported in either of the current renderers,
so they are converted to filled outlines in package gpu.

We're about to move that complexity up to the op/clip package, so we're
going to need the converter available from outside package gpu. This
change extracts the conversion code and related types to the separate,
internal package stroke.

No functional changes; a follow-up moves the stroke conversion.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-23 12:35:41 +01:00
Elias Naur 8750828c69 gpu,gpu/shaders: [compute] add alpha to output
Fixes the glfw example where Gio content is composited (alpha blended)
on top of custom content.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-22 16:27:26 +01:00
Elias Naur 1dde94d8dd gpu: [compute] use support for simple strokes
In the old renderer, all strokes are converted to filled paths. The new
renderer can draw simple strokes natively. Do that, and avoid the costly
conversions.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 21:46:32 +01:00
Elias Naur 13da40f601 gpu,op/clip: [compute] get rid of stroke vs fill flags
The fill mode is now controlled by a SetFillMode command, not by flags
on each path segment and fill command.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-19 21:46:31 +01:00
Elias Naur 8128d6371d gpu: [compute] clear material texture before reusing it
Otherwise the padding we leave around rendered materials may contain
content from reclaimed materials.

Fixes icon "shimmering" when the kitchen example is transforming.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-15 19:17:22 +01:00
Elias Naur 258033d0b0 gpu: eliminate gaps by ensuring consistent transformations
This is another attempt at fixing the issue described in [0]; the
previous attempt was reverted [1].

This change fixes the issue by tracking resolved transformations and
ensuring that all segments within a path share a single transformation.

[0] https://github.com/linebender/piet-gpu/issues/62
[1] https://gioui.org/commit/2b21b48a7c5c4451deb642c164548a134bb9ad06

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-15 14:05:15 +01:00
Elias Naur 2b21b48a7c gpu,gpu/shaders: revert attempt to fix path gaps
This is effectively a revert of [0], reintroducing the path gaps
described in [1]. A follow-up change will implement another attempt.

[0] https://gioui.org/commit/2feec23561cd84d6b8ddbab84a202df66b123208
[1] https://github.com/linebender/piet-gpu/issues/62

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-13 15:23:25 +01:00
Elias Naur 9e79cee447 op/clip,gpu,internal/scene: encode cubic bézier curves natively
The compute renderer supports cubic curves, so encode them as such.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-11 18:54:12 +01:00
Elias Naur a369c408f9 gpu: [compute] skip encoding roundtrip for path data
Since clip.Path now encodes paths in the format expected by
elements.comp, use that data directly instead of a roundtrip through
drawOps.buildVerts.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-11 18:54:12 +01:00
Elias Naur 9366fce0f3 internal/scene: extract compute shader encoding to a separate package
We're about to encode clip.Paths in a format compatible with the
compute renderer. This change extracts the encoding to a re-usable
package.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-11 18:54:12 +01:00
Elias Naur 2328ddfeca internal/byteslice: rename package unsafe
All functions left in the old package unsafe provide byte slice views
of other types. Rename the package accordingly and avoid a name clash
with the standard library package unsafe.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-11 11:27:02 +01:00
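A "byte slice view of another type", as mentioned in commit 2328ddfeca,
can be sketched like this. The function uint32Bytes is hypothetical and
uses unsafe.Slice, which requires Go 1.17 or later; the package's
actual functions and implementation are not shown in this log.

```go
package main

import (
	"fmt"
	"unsafe"
)

// uint32Bytes returns a byte slice that shares memory with s. No data
// is copied, so writes through either slice are visible in the other.
// Hypothetical helper; requires Go 1.17+ for unsafe.Slice.
func uint32Bytes(s []uint32) []byte {
	if len(s) == 0 {
		return nil
	}
	return unsafe.Slice((*byte)(unsafe.Pointer(&s[0])), len(s)*4)
}

func main() {
	words := []uint32{0, 0xFFFFFFFF}
	view := uint32Bytes(words)
	fmt.Println(len(view), view[4])

	// The view aliases the original slice.
	words[0] = 0xFFFFFFFF
	fmt.Println(view[0])
}
```

Such views are useful for handing typed data to APIs that accept raw
bytes, for example buffer uploads, without an intermediate copy.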
Elias Naur 4b377aa896 gpu: resize compute output when it becomes smaller
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-09 18:27:10 +01:00
Elias Naur 3a3ec711d3 gpu: [compute] cache rendered materials
This change tracks materials so that only the updated materials need to
be rendered.

Materials are likely cheap to render each frame, at least compared to
the rest of the compute pipeline. However, the CPU fallback must
transfer all changed materials to CPU memory, and a cache is a great
improvement over fetching all materials every frame.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-09 13:25:24 +01:00
Elias Naur 1b142c07e0 gpu: separate the construction and placing of material quads
We're about to cache the transformed materials. It's easier to do when
quads can be constructed before determining their atlas position.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-08 20:10:11 +01:00
Elias Naur c799452c57 gpu/internal/driver: rename gpu/backend
There are no longer any importers of package backend outside of
gioui.org/gpu. Move it internally, and rename it to the slightly more
specific "driver" while we're at it.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-06 14:27:34 +01:00
Elias Naur 25a19481e3 gpu,gpu/backend: don't assume constant output framebuffer
Return the output framebuffer from BeginFrame, to make it clear that
it may change between frames. Delete CurrentFramebuffer which is no
longer needed.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-03-02 20:43:59 +01:00
Elias Naur f973b3f384 gpu,gpu/backend: [compute] handle loss of buffer contents during download
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-02-24 18:48:03 +01:00
Elias Naur c849c5b77f gpu: [compute] use correct usage flags for output image
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-02-22 17:19:15 +01:00
Elias Naur 2feec23561 gpu: [compute] fix path gaps by eliminating redundant path points
See https://github.com/linebender/piet-gpu/issues/62 for description
of the issue. The fix is the Gio copy of the piet-gpu fix.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-02-18 10:30:05 +01:00
Elias Naur b5d21b209c gpu: [compute] use array type for scene elements
All scene elements have a fixed size in uint32s. Model them accordingly.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-02-18 10:30:05 +01:00
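Modelling fixed-size scene elements as an array type, as in commit
b5d21b209c, could look like the sketch below. The size elemWords, the
type element and the function encode are hypothetical; the real element
size and encoding are not given in this log.

```go
package main

import "fmt"

// elemWords is a hypothetical element size in uint32 words.
const elemWords = 4

// element is one scene element. Being an array, it has a fixed size,
// is comparable with ==, and is copied by value.
type element [elemWords]uint32

// encode appends elements to a flat word buffer, for example for
// uploading to a GPU buffer.
func encode(dst []uint32, elems ...element) []uint32 {
	for _, e := range elems {
		dst = append(dst, e[:]...)
	}
	return dst
}

func main() {
	a := element{1, 2, 3, 4}
	b := element{1, 2, 3, 4}
	fmt.Println(a == b, len(encode(nil, a, b)))
}
```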
Elias Naur c9a8265126 gpu: [compute] pre-transform images before rendering
We're about to change the last stage of the compute pipeline to only
accept images, not sampled textures. This change prepares materials
for pixel-aligned image copying by pre-rendering images to a texture,
applying transforms.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-02-18 10:30:05 +01:00
Elias Naur 8ec47dcae3 gpu: give compute.atlas a more precise name, reset atlas efficiently
Refactor only, in preparation for adding another atlas with pre-processed
materials.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-02-09 11:08:40 +01:00
Elias Naur 84b586ae6c gpu: don't automatically clear screen before rendering
Gio UI may be overlaid on top of custom graphics such as in the glfw example.
That will only work if Gio doesn't clear the screen (to white).

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-01-22 18:33:34 +01:00
Elias Naur 72a3248041 gpu: implement GPU profiling for compute
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-01-03 13:27:59 +01:00
Elias Naur 8662790f10 gpu: remove unused field
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-01-03 13:05:10 +01:00
Elias Naur bb9252f9d4 gpu: cache path data for compute
Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-01-02 20:18:05 +01:00
Elias Naur 23f710910f gpu: reclaim stale images in atlas texture before resizing
Issue found by Anthony Starks.

Signed-off-by: Elias Naur <mail@eliasnaur.com>
2021-01-02 18:28:41 +01:00