I’ve written an article about the new graphics platform for Chromium called Ozone-GBM. I believe Ozone-GBM will play an important role in the Chromium and Linux graphics communities in general. I hope you enjoy the read :) Please share it.
Here’s the latest progress on the project we’re proudly working on:
A few weeks ago we released Ozone-Wayland and now we’d like to detail for you the development process and strategy behind it… ah, and the title is not developing Chromium, the browser; it’s developing Chromium, the project! You will understand why next.
There are three main projects involved here: Chromium, Wayland and Ozone-Wayland. In Chromium, there is a very large and geeky community that mainly produces the Chrome browser and Chrome OS. Very close to it is the Blink engine’s community, which interacts (and overlaps) a lot with Chromium’s; I can’t give exact numbers, but there is a huge number of people, vendors and commercial interests involved in Chromium and, more importantly for us, a lot of quality code being cooked in there for leveraging Web technologies in general.
On the other side, we have Wayland, the project that hosts the development of a Linux graphics system. Its goal is to give applications the best graphics performance that can be extracted from the hardware. The community there mainly produces the Wayland client/server libraries and Weston, the window server that hosts applications.
Now, somewhere in between Wayland’s and Chromium’s communities, the Ozone-Wayland community will start to grow. Ozone, Chromium’s meta-platform for supporting different windowing systems, has made it possible to build the whole Wayland support outside the Chromium and Wayland code-bases — basically, each project then has its own code base. It may take a moment of reflection, but it’s a wonderful way of organizing the development, because one community doesn’t need to step on the other. And that’s the main point because, roughly speaking, Web developers don’t want to know about core graphics and hardware details, while window system developers don’t want to know about Web technologies. With this kind of organization, Ozone-Wayland hosts the tough task of bridging Wayland graphics into Chromium, and in doing so also diminishes the burden of the two other big communities having to interact directly with each other.
It’s good to remember what each of these three projects is building. A quick guide follows:
– Chromium: the Blink Web engine, Content Shell (used for runtimes, browsers, etc.), the Chrome browser, the Ash UI, Chrome OS, among others.
– Wayland: libwayland-server, libwayland-client, Weston window compositor, Desktop Shell (a “toy” shell UI, for testing purposes).
– Ozone-Wayland: libozone-wayland, which relies on libwayland-client and links with all Chromium-based products. So Ozone-Wayland basically brings Wayland to any of the products that the Chromium project develops.
Web page and how-to.
A few people have asked me how to set up Ozone-Wayland. I’m not sure you guys noticed, but we’re hosting on GitHub something that we call our “Web page”, detailing a bit more how the Chromium code-base plays together with the Wayland-specific bits. In particular, there’s a how-to for people who want to give it a try and check out the development. There’s a small wiki as well.
We’re coordinating the development via GitHub’s issue tracking system and trying to avoid mailing lists so far — Welcome to the 21st century… let’s see how long we can stay there and avoid going back to the 20th again :)
We hang out on freenode.net, in the #ozone-wayland channel; every day, in whichever timezone you wish :)
The following message was sent out this morning — I’m copying it here and attaching a cute screenshot of my desktop :)
Ozone is a set of C++ classes in Chromium for abstracting different window systems on Linux. It provides abstractions for constructing the accelerated surfaces underlying the Aura UI framework, for assigning input devices and for event handling.
Today we are publicly launching Ozone-Wayland, the implementation of Chromium’s Ozone for supporting the Wayland graphics system. Different projects based on Chromium/Blink, like the Chrome browser, Chrome OS and others, can now be enabled to run on Wayland.
In particular, we have the Chrome browser and Content Shell enabled and running on Wayland. All the projects are under active development (and therefore unstable), but we hope to land fixes together with the open source community.
We’ll be posting updates in the following weeks detailing the solution and our ideas. Enjoy!
Let’s forget for a second about video drivers, whether they provide acceleration or not, and all the related issues with hardware support on Wayland. That is all solved. Let’s talk about the user interface (UI) and ways to customize it all over the computing continuum — from phones, tablets and TV boxes to desktop PCs, In-Vehicle Infotainment (IVI), aeroplane systems, among others.
(I’ve made a cheat sheet here also — Creative Commons Attribution 2.0 for both figures.)
On customization, the shell plugin comes first: changes there directly determine which UI paradigm will be used. Specifically, whoever implements the plugin protocol defines whether the UI is meant for phones, IVI, desktops, etc.
Probably the most important characteristic of the shell plugin is giving “roles” to surfaces, i.e. defining where and how they will be mapped on the screen. For example, if a client wants its surface mapped as a top-level window, or wants to resize it, then it’s up to the shell to expose these different surface roles, all according to the UI paradigm the shell itself provides.
It’s worth noting that the shell plugin doesn’t need to rely on any drawing library or graphics toolkit, because it doesn’t directly tackle drawing. Also, conceptually it’s mandatory to give roles to surfaces, so a shell plugin is a must (or at least a simple implementation of surface::configure).
A special shell client, through a special “private” protocol, can be used to set up basic UI elements that require special treatment. For example, in a desktop UI, widget elements such as the panel, dock, lockscreen and cursors need special treatment for their positioning, grabbing semantics and so forth.
On customization, different shell clients exposing different UI elements can be implemented on top of *the* *same* shell plugin. Some architectures will rather use one simple overlay client that also takes care of spawning and controlling the other basic UI applications.
The special client will probably want to rely on graphics toolkits.
Wayland clients use the Wayland core protocol and the protocol that the shell plugin has defined. The corollary is that a client always knows the UI paradigm (because of the shell plugin) and will *not* work across different paradigms. That doesn’t mean applications necessarily need to know their paradigm, though; only the middleware software that connects to Wayland (like the graphics toolkits) does.
A footnote about Canonical’s Mir
Canonical announced their new display server yesterday. There’s a section “Why Not Wayland / Weston?” where they claim:
“we consider the shell integration parts of the protocol as privileged and we’d rather avoid having any sort of shell behavior defined in the client facing protocol.”
and something similar was written here also:
“Wayland .. exposes privileged sections like the shell integration that we planned to handle differently, both for security reasons and as we wanted to decouple the way the shell works on top of the display server from the application-facing protocol”
so they would rather have:
“An outer-shell together with a frontend-firewall that allow us to port our display server to arbitrary graphics stacks and bind it to multiple protocols.”
First of all, there’s nothing privileged about the shell protocol Wayland exposes. wl_shell and wl_shell_surface (the “shell protocols”) are part of the Wayland core protocol, yes, but as I’ve explained in this post, it’s all customizable for whatever UI needs. Moreover, their usage is completely optional, and anyone can build a different shell and stack it with the rest of Wayland, just like the tablet-shell protocol does, for instance. It will still be Wayland and use the shiny libwayland for IPC.
Therefore I don’t think Canonical should justify their new project by claiming Wayland “does not fulfill .. requirements completely”. There are, in principle, no technical reasons Ubuntu cannot use Wayland. What they wrote there is a very, very mean excuse instead.
The Wayland 1.0 release is knocking on the door, and people keep asking “why Wayland if we’ve got X already”, or about things like performance, memory consumption, power savings and other kinds of advantages of having Wayland instead of X. Those are very important points to consider, of course, but for an individual actually programming the graphics system the answer should be straightforward: the Wayland API is damn small.
1. But who’s going to program Wayland or X?
The short answer is: very likely you won’t :) A more elaborate answer requires understanding what the graphics system “shell” is and what its components are; in other words, what the system layer is that sits on top of a core graphics system.
While the graphics system comprises a hardware abstraction, the shell can be thought of as an abstraction of that graphics system, one that application developers feel more comfortable writing their applications against; it is therefore the application software glue, offering convenience for the ordinary developer. Examples of shell components are widget library “toolkits”, game engines, window and decoration managers, Web runtimes, video processing libraries and so forth. In principle, developers of these kinds of components are the only ones who need to understand the graphics system API.
2. And what is the X API?
libxcb is the implementation of the X11 protocol. libxcb needs 19 functions to deal with IPC-related matters. The core protocol implementation and the libxcb protocol helpers export 195 functions altogether. The extensions developed over X’s 25 years of existence sum up to 26 in total, with 1064 functions for clients. Therefore the X11 client API has approximately 1278 entry points in total.
Raw data and how I collected it is here.
When we talk about a graphics system, we tend to think about the drawing APIs only. That’s a big mistake. The API is broader, encompassing for instance input methods, input devices, output devices, a bunch of graphics-related configuration aspects, testing and so on. In fact, X basically has two drawing APIs (the core protocol and XRender), and some systems building very modern interfaces are not even using them anymore, bypassing them via OpenGL ES and friends.
I reported about a year ago that some new systems don’t use the core X protocol and just use a few extensions instead. One could claim this is alright because the effective API would be smaller, but in my opinion, if things keep expanding outwards like they have been, we’re going to reach a point where the graphics system becomes unmaintainable. Moreover, it takes too long for a shell developer to learn that only a small subset of the API is actually needed. The flexibility of the X protocol, which lets developers add as many new extensions as desired, combined with the lack of a proper API deprecation mechanism, is definitely a problem to consider here.
3. So what is the Wayland API then?
The Wayland API has approximately 135 entry points in total, in its 0.99 version. libwayland itself exports only 19 functions, most of them related to IPC, event dispatching and so on, which are the main responsibilities of the library. The 14 interfaces consist of 102 functions, and usually a client application will also require some platform-specific routines, such as the EGL abstraction and some for the DRM driver model; these currently add 14 more functions.
Raw data here.
We have something we call “private protocols”, which describe more high-level interactions and a few special clients. Examples are the XWayland infrastructure, the desktop shell workspace and its panel bar, input methods where special care with device grabs is needed, and so on. One might count those APIs as well, but anyhow, Wayland has a small API after all.
Although X’s and Wayland’s intentions are both to sit between the applications and the kernel graphics layers, a direct comparison of the two systems is not fair in most cases; while X encompasses Wayland in numerous features, Wayland has a few other advantages. In particular, in this post I wanted to call attention to the big advantage the shell programmer has when creating components that aid modern interfaces, where only a small set of functions is actually needed with Wayland.
The X API is roughly ten times bigger than the Wayland one (about 1278 vs. 135 entry points). Here, I’ve only counted the number of functions exported to clients. I understand there could be different and more precise ways to measure how big a graphics system API is (e.g. counting the events received by clients, or Wayland’s interface listeners, or X’s window properties).
A rather cool feature of the Weston compositor is xwayland, which supports native X11 applications on Wayland. It’s quite an important feature because it provides compatibility with the “old” windowing system. Say you have an application written in Motif/Xt, or even something more “fancy” like a Web browser all tied to GTK2 and whatever dependencies; then you’d better not bother rewriting it for native Wayland or porting it to a modern toolkit — it should just work seamlessly. Hence, X on Wayland fits pretty well with our overall transition plan.
The architecture behind and the mechanisms are a little tricky though, let’s take a look.
Once Weston starts, it launches the xwayland module, which creates an X socket, adds it to the main Weston loop and waits for X clients to connect. When the first client connects, it triggers Weston to fork and exec an X server. Weston continues its normal execution but unregisters that socket from itself.
The X server, with the Wayland backend in it (xwayland), keeps listening to Weston via a special Wayland protocol interface. Weston binds that interface and announces back the socket that X clients will connect to (the xserver_send_listen_socket event) and the first X client that just connected (the xserver_send_client event). The idea is to hand over to X the responsibility for clients trying to connect, naturally. It’s worth mentioning that this lazy initialization method was intentionally designed to avoid extra lag at Weston start-up, and to avoid memory overhead when X11 applications are not being used. So the X server is started on demand, only when actually needed.
At this point, Weston also starts its own X window manager. In short, its main task is to proxy X applications built on the old WM standards, such as EWMH and the jurassic ICCCM, and plumb them into the shiny Wayland desktop shell interface. In other words, the idea is to map the different types of X windows onto Wayland surfaces (the xserver_set_window_id request) and, especially, to apply meaningful user-interface policies on X windows for the desktop shell, for instance maximizing a surface, or resizing/moving it around.
Other tasks the Weston X window manager performs are embedding a pretty decoration frame around windows and making sure that client-to-client communication, such as copy and paste (selection) and eventually drag and drop, works nicely. Note that the X protocol doesn’t define policy (the WMs do), and this is a challenge for Wayland, which does define it. So the Weston X WM has to come up with the right amount of salt to fit perfectly the policies already straightened out by Wayland’s desktop shell… hmm, way too philosophical.
All X windows created from then on are redirected to offscreen pixmaps and stored in DRM buffers (via the xwayland video driver); that’s how compositing works on Wayland. The idea is that an X client behaves very much like a regular Wayland client. Therefore, there are no protocol calls or any major tasks involved in xwayland, and everything happens seamlessly, with the protocol “conversion” penalty close to nil.
The architecture for input handling already looks good as well. At X init time, fake devices are created, the keyboard and the pointer, and the concept goes the other way around from window creation: Weston hands the input device capabilities over to Xorg. We’re still shaping the cursor settings, the complex logic of client and surface grabbing, among other features, but the basics are most definitely in place already.
So the video is there, demoing what we’ve got so far. You can see in practice all these rather cool building blocks I mentioned. It’s rather cool… especially for developers: one has to have the know-how of the X11 and Wayland protocols, Xorg and Weston internals, the X window manager standards, etc. Lots of fun!!!