The Use Case
I wrote a custom UI framework on top of PyGame, a library for software rendering (drawing graphics on the CPU), to support my experiments while giving me a standard interactive layer built on event-driven paradigms similar to other UI frameworks.
The requirements were specific:
- It needed to be transparent - I didn't want my UI layer to add extra cost on top of standard software rendering, which meant no workarounds to get it to display custom canvases.
- It needed to be in Python - The main goal was to have an interactive layer ready to spin up for rapid experimentation. Python has a vast ecosystem of libraries and is fast to write - the UI layer needs to match that iteration speed.
Starting From Nothing
UI at its simplest.
The initial architecture focused on brutal simplicity. I persisted a flat list of components, placed manually by first sketching the layout in Photoshop, and every frame the engine ran a minimal loop:
- Hit-test: Compare the mouse coordinates and click state against the bounds of every component in the flat list, triggering the click/hover handlers of any component that passes the hit-test.
- Update: Run a global update() pass over every component, letting each update its private state consistently every frame.
- Render: Call the render() method on each component, relying on my Photoshop math to make sure each one renders at the right size and in the right position.
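In skeleton form, that per-frame loop looked something like this. This is an illustrative sketch, not the engine's actual code: the Component class and frame() helper are hypothetical stand-ins, and PyGame's event pump and surface blitting are elided.

```python
class Component:
    """Hypothetical base class: a rect placed by hand from Photoshop measurements."""
    def __init__(self, x, y, w, h):
        self.x, self.y, self.w, self.h = x, y, w, h
        self.clicks = 0

    def hit(self, mx, my):
        # Manual point-in-rect test against the hand-placed coordinates.
        return self.x <= mx < self.x + self.w and self.y <= my < self.y + self.h

    def on_click(self):
        self.clicks += 1

    def update(self):
        pass  # per-frame private state updates go here

    def render(self, surface):
        pass  # draw at (self.x, self.y) using the hardcoded numbers


def frame(components, mouse_pos, mouse_clicked, surface):
    mx, my = mouse_pos
    # 1. Hit-test every component in the flat list.
    for c in components:
        if mouse_clicked and c.hit(mx, my):
            c.on_click()
    # 2. Update every component's private state.
    for c in components:
        c.update()
    # 3. Render each component in list order.
    for c in components:
        c.render(surface)
```

The real loop would feed frame() the current mouse state from PyGame's event queue each iteration before flipping the display.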
This is very simple to write, but impractical for all but the most stylised or minimal UI layers. For a general-purpose tool, it would be ideal to offload some of the math to the engine and describe the UI through higher-level layout semantics, as opposed to manual pixel math.
The Family Tree
A reunion.
To achieve this, we can draw inspiration from real UI engines and model the UI as a tree of nodes instead of a flat list. Each node has a parent and one or more child nodes, which can each have their own children, and so on. I implemented an architecture where nodes are exclusively either layout-only or content-only, unlike something like HTML, where nodes can hold content and have children of their own. Less flexible, but simpler to implement.
Instead of a simple list iteration, this approach requires a depth-first traversal that recurses through all the nodes. This recursive nature is essential to how the layout engine works. Each layout node implements two key methods: a measure() method that measures and returns its rectangle size, and a distribute() method through which a child node is issued its final size and position.
This seems simple, but combined with the recursive tree traversal it results in a layout engine where calling measure() on a node calls measure() on its children, and so on, until intrinsic sizes bubble up and final positions can be distributed back down the tree.
This is an incredibly powerful paradigm, inspired by how real layout engines like those in Flutter and Jetpack Compose work. A crucial difference is that my layout engine only works with intrinsic sizing and does not support constraints. Practically, this means a parent cannot grow or shrink its children, which is a key requirement if you want responsive design or fluid layouts. While these weren't requirements for the initial version of the engine, they're things I'd like to revisit, especially after watching an excellent video on how Clay (a layout engine for C) works.
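To make the two-pass idea concrete, here's a minimal, self-contained sketch of intrinsic-only measure/distribute. Row and Text are hypothetical node types for illustration, not the engine's actual classes:

```python
class Text:
    """Content-only leaf node with a fixed intrinsic size."""
    def __init__(self, w, h):
        self.w, self.h = w, h
        self.rect = None

    def measure(self):
        return (self.w, self.h)

    def distribute(self, x, y, w, h):
        self.rect = (x, y, w, h)


class Row:
    """Layout-only node: children laid out left to right at their intrinsic sizes."""
    def __init__(self, *children):
        self.children = children
        self.rect = None

    def measure(self):
        # Measuring a Row measures all of its children first,
        # so intrinsic sizes bubble up the tree.
        sizes = [c.measure() for c in self.children]
        width = sum(w for w, _ in sizes)
        height = max((h for _, h in sizes), default=0)
        return (width, height)

    def distribute(self, x, y, w, h):
        self.rect = (x, y, w, h)
        # Final positions flow back down: each child is issued its slot.
        cx = x
        for c in self.children:
            cw, ch = c.measure()
            c.distribute(cx, y, cw, ch)
            cx += cw


# Layout pass: measure up, then distribute down.
root = Row(Text(40, 20), Row(Text(30, 10), Text(30, 30)))
w, h = root.measure()
root.distribute(0, 0, w, h)
```

Because there are no constraints, each child simply gets whatever size it asked for; a parent's only job is to sum and position.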
Refining the engine
Code snippet of what a simple form looks like, showcasing the nested box layouts with anchoring support.
With the core component API and layout abstraction nailed down, I finally reached a point where I could start designing components and simple test programs. I quickly ran into missing pieces I had taken for granted in other UI engines.
- Asynchronous support: One of the first GUIs I wrote involved a script that had to talk to an API, which would freeze the entire window. My solution was an abstraction over the standard threading library, where threads are tracked by the engine and callbacks run on the main thread upon completion. This reduces the surface area for race conditions while keeping the program responsive.
- Event listeners: Sometimes components need access to I/O events beyond the mouse. I added a system that globally emits events that can be subscribed to, similar to JavaScript APIs in the browser (and I ran into the same memory-leak problems).
- Performance optimisations: Software-rendered UIs can quickly slow down if not optimised correctly. I used flags to mark whether a component or a layout was dirty, and used Python's context manager API to provide a Pythonic way of updating components while handling the flags behind the scenes. Components are only redrawn, and layouts only recalculated, when the respective flag is set, minimising CPU usage to only when it's needed.
- UI Stages: Most UIs don't consist of a single "stage" of UI elements. Ideally, we want to navigate between "stages" (or "pages", as a browser calls them) depending on UI state. I implemented a stack-based navigation model similar to how mobile applications work, where you can push a stage onto a stack and return from it, or clear the entire stack and start fresh for destructive navigation.
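As an illustration of the dirty-flag pattern, a context manager can hide the bookkeeping entirely. This is a hypothetical sketch, assuming names like mutate() and render_if_needed() that aren't the engine's real API:

```python
from contextlib import contextmanager

class Component:
    """Hypothetical component that only redraws when marked dirty."""
    def __init__(self):
        self.dirty = True      # start dirty so the first frame draws
        self.redraws = 0
        self.state = {}

    @contextmanager
    def mutate(self):
        # Any state change made inside this block marks the component
        # dirty on exit, so callers never touch the flag directly.
        try:
            yield self.state
        finally:
            self.dirty = True

    def render_if_needed(self):
        # The render loop calls this every frame; actual drawing
        # only happens when the flag is set.
        if self.dirty:
            self.redraws += 1  # the real method would redraw to its surface
            self.dirty = False


label = Component()
label.render_if_needed()            # first frame: draws
label.render_if_needed()            # clean: skipped
with label.mutate() as state:
    state["text"] = "hello"         # exiting the block sets the flag
label.render_if_needed()            # dirty again: redraws
```

The try/finally means the flag is set even if the block raises, which keeps the UI consistent with whatever partial mutation happened.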
Beyond the basics
An actual screenshot - featuring the minimal hardcoded stylesheet that ended up inspiring the style of this website.
What I have now works fine for basic, experimental scripts where raw iteration speed matters more than maintenance, but ideally we'd bridge that gap and add more functionality. Here are a few more advanced ideas I'd like to explore in the future, inspired by real systems:
- Declarative API: Can we take the huge developer-experience improvement of moving from manual pixel math to automatic layout, and apply it to UI state? The program becomes a description of what you want to see for any given state, instead of a set of instructions to poke at the UI every time a variable changes. This requires either a fine-grained reactivity primitive (similar to SolidJS) or an optimised reconciler that diffs our UI tree against an ephemeral one created when state changes (like React).
- Composability: With the current API, my programs consist of big components that each do a whole task at once, render directly to surfaces, and store and manage their state opaquely. This is simple for the engine, but hard to manage for the developer. Modern frameworks are adopting a more functional, compositional API where many tiny UI primitives compose into something larger. Supporting this requires an overhaul of the event-handling system to support event bubbling, and optimisation of almost every aspect of the engine to handle moving the complexity into the UI tree.
- Custom styling: Right now, the engine relies on a hardcoded stylesheet of global style declarations referenced in each component's render method. Ideally, we'd combine this with a user-configurable styling API. Something similar to TailwindCSS utility classes would fit the "minimal" target perfectly - but applied directly in the renderer instead of being compiled to a file.
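To make the reactivity idea concrete, here's a minimal sketch of what a SolidJS-style fine-grained signal could look like in Python. None of this exists in the engine yet; Signal and create_effect are hypothetical names:

```python
class Signal:
    """A minimal fine-grained reactive value: setting it re-runs subscribers."""
    _active_effect = None  # the effect currently being (re)evaluated, if any

    def __init__(self, value):
        self._value = value
        self._subscribers = set()

    def get(self):
        # Reading inside an effect registers that effect as a dependency.
        if Signal._active_effect is not None:
            self._subscribers.add(Signal._active_effect)
        return self._value

    def set(self, value):
        self._value = value
        # Re-run every effect that read this signal.
        for effect in list(self._subscribers):
            effect()


def create_effect(fn):
    # Run fn once, tracking which signals it reads; re-run it on changes.
    def effect():
        prev = Signal._active_effect
        Signal._active_effect = effect
        try:
            fn()
        finally:
            Signal._active_effect = prev
    effect()
    return effect


# Usage: the effect re-runs automatically whenever a signal it read changes.
count = Signal(0)
log = []
create_effect(lambda: log.append(f"count = {count.get()}"))
count.set(1)  # log now holds "count = 0" and "count = 1"
```

In a UI context, the effect body would be a component's redraw, so state changes dirty exactly the components that depend on them, with no diffing needed.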
Conclusion
Ironically, this project started because I didn't want a UI. Existing solutions were opaque and required lots of boilerplate that often exceeded the actual scale of my projects. I just wanted clickable surfaces and a way to hack at the layers underneath. As the project grew, I ended up organically discovering how to construct simple abstractions through trying (and sometimes failing) to write my own, and why it's paradoxically anything but simple to do right.
While it’s far from perfect, writing it taught me more about UI systems than I ever would have learned by sticking to established solutions alone.
Read more about the high-performance video mosaic rendering and streaming engine I originally designed this UI library for.