the damn small Wayland API

The Wayland 1.0 release is knocking at the door and people keep asking “why Wayland if we already have X?”, or asking about performance, memory consumption, power savings and other advantages of having Wayland instead of X. Those are very important points to consider, of course, but for someone actually programming the graphics system the answer should be straightforward: the Wayland API is damn small.

1. But who’s going to program Wayland or X?

The short answer is: very likely you won’t :) A more elaborate answer requires understanding what the graphics system “shell” and its components are, or in other words, what the system layer that sits on top of a core graphics system is.

While the graphics system comprises a hardware abstraction, the shell can be thought of as an abstraction of that graphics system, one that application developers feel more comfortable writing their applications against – it is the application software glue, offering convenience for an ordinary developer. Examples of shell components are widget libraries (“toolkits”), game engines, window and decoration managers, Web runtimes, video processing libraries and so forth. In principle, developers of these kinds of components are the only ones who need to understand the graphics system API.

2. And what is the X API?

libxcb is an implementation of the X11 protocol. It needs 19 functions to deal with IPC-related stuff. The core protocol implementation and the libxcb protocol helpers export 195 functions altogether. The extensions developed over the 25 years of X’s existence number 26 in total, with 1064 functions for clients. Therefore the X11 client API has a total of approximately 1278 entry points.

Raw data and how I collected it is here.

When we talk about a graphics system, we like to think about the drawing APIs only. That’s a big mistake. The API is broader, encompassing for instance input methods, input devices, output devices, a bunch of graphics-related configuration aspects, testing and so on. In fact, X has basically two drawing APIs (the core protocol and Xrender), and some systems building very modern interfaces are not even using them anymore, bypassing them via OpenGL ES and friends.

I reported about a year ago that some new systems don’t use the core X protocol and just use a few extensions instead. One could claim that this is all right because the API in use would be smaller, but my opinion is that if things carry on expanding outwards like they have been, we’re going to get to a point where the graphics system becomes unmaintainable. Moreover, it takes too long for the shell developer to learn that just a small subset of the API is needed. The X protocol’s flexibility, which lets developers add as many new extensions as desired, combined with the lack of a proper API deprecation mechanism, is definitely a problem to consider here.

3. So what is the Wayland API then?

The Wayland API has a total of approximately 135 entry points in its 0.99 version. libwayland alone exports 19 functions, most of them related to IPC, event dispatching and so on, which are the main responsibilities of the library. The 14 interfaces consist of 102 functions, and a client application will usually require some platform-specific routines as well, such as the EGL abstraction and some for the DRM driver model; these currently add 14 more functions.

Raw data here.

We have something we call “private protocols”, which describe more high-level interactions and a few special clients. Examples are the XWayland infrastructure, the desktop shell workspace and its panel bar, input methods where special care for device grabs is needed, and so on. One might consider counting those APIs as well, but either way, Wayland has a small API.

Although X and Wayland are both intended to sit between the applications and the kernel graphics layers, a direct comparison of the two systems is not fair in most cases; while X covers far more features than Wayland, Wayland has a few other advantages. In particular, in this post I wanted to call attention to the big advantage the shell programmer has when creating components for modern interfaces: with Wayland, only a small set of functions is actually needed.

By these counts, the X API is roughly nine to ten times bigger than the Wayland one (1278 versus 135 entry points). Here, I’ve only counted the number of functions exported for clients. I understand that there could be different and more precise ways to tell how big a graphics system API is (e.g. counting events received by clients, or the number of Wayland interface listeners, or the window properties of X).
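
For anyone who wants to recheck the tallies, here is a minimal sketch in Python that only re-adds the exported-function counts quoted in this post. The dictionary labels and function names are my own shorthand for illustration, not names taken from libxcb or libwayland.

```python
# Exported-function counts as quoted in this post (labels are illustrative).
X11_CLIENT_API = {
    "libxcb IPC": 19,
    "core protocol + libxcb helpers": 195,
    "26 extensions": 1064,
}
WAYLAND_CLIENT_API = {
    "libwayland": 19,
    "14 interfaces": 102,
    "platform-specific (EGL, DRM)": 14,
}


def total_entry_points(groups):
    """Sum the per-group counts of exported functions."""
    return sum(groups.values())


def size_ratio(bigger, smaller):
    """How many times larger one API is than the other, by entry points."""
    return total_entry_points(bigger) / total_entry_points(smaller)


if __name__ == "__main__":
    print("X11 entry points:    ", total_entry_points(X11_CLIENT_API))
    print("Wayland entry points:", total_entry_points(WAYLAND_CLIENT_API))
    print("ratio: %.1f" % size_ratio(X11_CLIENT_API, WAYLAND_CLIENT_API))
```

Counting events, listeners or window properties instead would only need different numbers in the two dictionaries.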

X on Wayland

A rather cool feature of the Weston compositor is xwayland, which supports native X11 applications on Wayland. It’s quite an important feature because it gives compatibility with the “old” windowing system. Say you have an application written with Motif/Xt, or even something more “fancy” like a Web browser all tied to GTK2 and whatever dependencies: you needn’t bother rewriting it for native Wayland or porting it to a modern toolkit, since it should just work seamlessly. Hence, X on Wayland fits pretty well with our overall transition plan.

The architecture behind it and its mechanisms are a little tricky though, so let’s take a look.

Once Weston is started, it launches the xwayland module, which creates an X socket, adds it to the main Weston loop and waits for X clients to connect. When the first client connects, it triggers Weston to fork and exec an X server. Weston continues its normal execution but unregisters that socket from its own loop.

The X server, with the Wayland backend in it (xwayland), keeps listening to Weston via a special Wayland protocol interface. Weston binds that interface and announces back the socket that X clients will be connecting to (xserver_send_listen_socket event) and the first X client that has just connected (xserver_send_client event). The idea, naturally, is to hand over to X the responsibility for clients trying to connect. It’s worth mentioning that this lazy initialization was intentionally designed to avoid extra lag at Weston start-up and memory overhead when X11 applications are not being used. So the X server is started on demand, only when actually needed.
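
The on-demand start described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Weston’s code: the class name `LazyServerLauncher`, the `dispatch()` and `demo()` helpers and the `launch` callback (which stands in for the fork and exec of the X server plus the hand-over of the two sockets) are all made up for illustration.

```python
import os
import selectors
import socket
import tempfile


class LazyServerLauncher:
    """Watch a listening socket; start the server when the first client connects."""

    def __init__(self, socket_path, launch):
        self.launch = launch      # stand-in for "fork and exec one X server"
        self.launched = False
        self.first_client = None
        self.listener = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        self.listener.bind(socket_path)
        self.listener.listen(1)
        self.listener.setblocking(False)
        self.loop = selectors.DefaultSelector()
        self.loop.register(self.listener, selectors.EVENT_READ)

    def dispatch(self, timeout=0):
        """Run one iteration of the toy main loop."""
        for _key, _events in self.loop.select(timeout):
            self.first_client, _addr = self.listener.accept()
            # The compositor stops watching the socket; the server takes over.
            self.loop.unregister(self.listener)
            self.launched = True
            # Hand over both the listening socket and the first client.
            self.launch(self.listener, self.first_client)
        return self.launched

    def close(self):
        if self.first_client is not None:
            self.first_client.close()
        self.listener.close()
        self.loop.close()


def demo():
    """Connect one fake client and report what the launcher did."""
    log = []
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "x-socket")
        launcher = LazyServerLauncher(
            path, lambda listen, client: log.append("launched"))
        log.append("idle" if not launcher.dispatch() else "early launch")
        client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
        client.connect(path)          # the first X client shows up
        launcher.dispatch(timeout=1)
        launcher.dispatch()           # later iterations launch nothing more
        client.close()
        launcher.close()
    return log
```

The point of the design is visible in `dispatch()`: nothing is launched while the socket stays quiet, and once the first connection arrives the loop drops the socket, so the cost of the server is only paid when an X client actually exists.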

At this point, Weston also starts its own X window manager. In short, its main task is to proxy X applications built on the old WM standards, such as EWMH and the jurassic ICCCM, and plumb them into the shiny Wayland desktop shell interface. In other words, the idea is to map the different types of X windows onto Wayland surfaces (xserver_set_window_id request), and especially to pass meaningful user-interface policies for X windows on to the desktop shell, for instance maximizing a surface, or resizing and moving it around.

Other tasks the Weston X window manager performs are embedding a pretty decoration frame around windows and making sure that client-to-client communication such as copy and paste (selections), and eventually drag and drop, works nicely. Note that the X protocol doesn’t define policy, the WMs do, and this is a challenge for Wayland, which does define it. So the Weston X WM has to come up with the right amount of salt to fit the policies that were already straightened out by Wayland’s desktop shell… hmm, way too philosophical.

All X windows created from now on will be redirected to an offscreen pixmap stored in a DRM buffer (via the xwayland video driver); that’s how compositing works on Wayland. The idea is that an X client will behave very much like a regular Wayland client. Therefore, there are no extra protocol calls or any major task involved in xwayland and it all happens seamlessly, with the protocol “conversion” penalty close to nil.

The architecture for input handling looks good already as well. At X init time, fake devices are created, a keyboard and a pointer, and the concept goes the other way around from window creation: the input device capabilities are handed from Weston to Xorg. We’re still shaping the cursor settings and the complex logic of client and surface grabbing, among other features, but the basics are most definitely in place already.

So the video is there to demo what we’ve got so far. You can see in practice all these building blocks I mentioned. It’s rather cool, especially for developers; one has to have know-how of the X11 and Wayland protocols, Xorg and Weston internals, X window manager standards, etc. Lots of fun!!!

X Census (for 1.10)

What follows is the census of the 1.10 development window for the whole X infrastructure – raw numbers here. I did it in a similar way to the previous version. It’s worth mentioning that there’s almost no relation between the development cycles of the components listed below, which can lead to some misunderstanding. Anyway, it’s still a nice indicator to see and evaluate how the free desktop community behaved.

Numbers for X implementation (xserver, proto, lib and xcb repositories):

Processed 1258 csets from 93 developers
70 employers found
A total of 139275 lines added, 58982 removed (delta 80293)

Developers with the most changesets
Alan Coopersmith 243 (19.3%)
Gaetan Nadon 193 (15.3%)
Peter Hutterer 121 (9.6%)
Adam Jackson 94 (7.5%)
Jon TURNEY 43 (3.4%)
Keith Packard 37 (2.9%)
Jeremy Huddleston 36 (2.9%)
Jesse Adkins 34 (2.7%)
Pauli Nieminen 29 (2.3%)
Jamey Sharp 28 (2.2%)

Developers with the most changed lines
Matt Dew 57959 (35.7%)
Jeremy Huddleston 25002 (15.4%)
Fernando Carrijo 16739 (10.3%)
Gaetan Nadon 15750 (9.7%)
Alan Coopersmith 11850 (7.3%)
Adam Jackson 4273 (2.6%)
Keith Packard 2754 (1.7%)
Jesse Adkins 2516 (1.5%)
Peter Hutterer 2083 (1.3%)
James Jones 1876 (1.2%)

Developers with the most lines removed
Jeremy Huddleston 3726 (6.3%)
Adam Jackson 3617 (6.1%)
Jesse Adkins 2489 (4.2%)
Jamey Sharp 1497 (2.5%)
Søren Sandmann Pedersen 757 (1.3%)
James Cloos 187 (0.3%)
Adrian Bunk 184 (0.3%)
Tiago Vignatti 118 (0.2%)
Jon TURNEY 116 (0.2%)
Chris Wilson 72 (0.1%)

Developers with the most signoffs (total 1429)
Alan Coopersmith 315 (22.0%)
Peter Hutterer 191 (13.4%)
Gaetan Nadon 174 (12.2%)
Keith Packard 133 (9.3%)
Adam Jackson 96 (6.7%)
Jon TURNEY 52 (3.6%)
Jeremy Huddleston 36 (2.5%)
Jesse Adkins 34 (2.4%)
Pauli Nieminen 30 (2.1%)
Jamey Sharp 29 (2.0%)

Developers with the most reviews (total 882)
Alan Coopersmith 83 (9.4%)
Daniel Stone 78 (8.8%)
Peter Hutterer 76 (8.6%)
Julien Cristau 73 (8.3%)
Keith Packard 61 (6.9%)
Adam Jackson 49 (5.6%)
Mikhail Gusarov 41 (4.6%)
Jeremy Huddleston 38 (4.3%)
Colin Harrison 38 (4.3%)
Chase Douglas 35 (4.0%)

Developers with the most test credits (total 48)
Colin Harrison 16 (33.3%)
Gaetan Nadon 6 (12.5%)
Cyril Brulebois 5 (10.4%)
Alan Coopersmith 3 (6.2%)
Jeremy Huddleston 2 (4.2%)
Julien Cristau 1 (2.1%)
Aaron Plattner 1 (2.1%)
Luc Verhaegen 1 (2.1%)
Dirk Wallenstein 1 (2.1%)
Simon Thum 1 (2.1%)

Developers who gave the most tested-by credits (total 48)
Jon TURNEY 16 (33.3%)
Peter Hutterer 8 (16.7%)
Alan Coopersmith 7 (14.6%)
Dan Nicholson 4 (8.3%)
Michel Dänzer 2 (4.2%)
Gaetan Nadon 1 (2.1%)
Jeremy Huddleston 1 (2.1%)
Julien Cristau 1 (2.1%)
Aaron Plattner 1 (2.1%)
Luc Verhaegen 1 (2.1%)

Developers with the most report credits (total 21)
Julien Cristau 2 (9.5%)
Justin Mattock 2 (9.5%)
Peter Hutterer 1 (4.8%)
Aaron Plattner 1 (4.8%)
Cyril Brulebois 1 (4.8%)
Simon Thum 1 (4.8%)
Thierry Vignaud 1 (4.8%)
meng 1 (4.8%)
Sebastian Glita 1 (4.8%)
Bartosz Brachaczek 1 (4.8%)

Developers who gave the most report credits (total 21)
Peter Hutterer 7 (33.3%)
Julien Cristau 3 (14.3%)
Jamey Sharp 3 (14.3%)
Alan Coopersmith 2 (9.5%)
Eamon Walsh 2 (9.5%)
Michel Dänzer 1 (4.8%)
Gaetan Nadon 1 (4.8%)
Kristian Høgsberg 1 (4.8%)
Jesse Barnes 1 (4.8%)

Top changeset contributors by employer
Oracle 244 (19.4%)
Red Hat 225 (17.9%)
(employer name missing) 193 (15.3%)
Nokia 122 (9.7%)
Intel 46 (3.7%)
(employer name missing) 43 (3.4%)
Apple 36 (2.9%)
(employer name missing) 34 (2.7%)
NVidia 30 (2.4%)
(employer name missing) 28 (2.2%)

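Top lines changed by employer
(employer name missing) 57958 (35.7%)
Apple 27540 (17.0%)
(employer name missing) 16729 (10.3%)
(employer name missing) 16611 (10.2%)
Oracle 14567 (9.0%)
Red Hat 8089 (5.0%)
Intel 4574 (2.8%)
Nokia 3153 (1.9%)
(employer name missing) 2528 (1.6%)
(employer name missing) 2110 (1.3%)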

Employers with the most signoffs (total 1429)
Oracle 315 (22.0%)
Red Hat 293 (20.5%)
(employer name missing) 174 (12.2%)
Intel 144 (10.1%)
Nokia 127 (8.9%)
(employer name missing) 52 (3.6%)
Apple 36 (2.5%)
(employer name missing) 34 (2.4%)
NVidia 29 (2.0%)
(employer name missing) 29 (2.0%)

Employers with the most hackers (total 96)
Red Hat 8 (8.3%)
Nokia 8 (8.3%)
Intel 7 (7.3%)
Canonical 3 (3.1%)
VMWare 3 (3.1%)
Oracle 2 (2.1%)
NVidia 2 (2.1%)
(employer name missing) 1 (1.0%)
(employer name missing) 1 (1.0%)
Apple 1 (1.0%)

Development of X input drivers and input event processing tools (xf86-input-*, xkbcomp, xkeyboard-config repositories):

Processed 293 csets from 33 developers
29 employers found
A total of 34645 lines added, 26556 removed (delta 8089)

Developers with the most changesets
Peter Hutterer 152 (51.9%)
Sergey V. Udaltsov 32 (10.9%)
Alexandr Shadchin 21 (7.2%)
Alan Coopersmith 17 (5.8%)
Gaetan Nadon 12 (4.1%)
Trevor Woerner 6 (2.0%)
Nikolai Kondrashov 5 (1.7%)
Chase Douglas 4 (1.4%)
Simon Thum 4 (1.4%)
Joe Shaw 3 (1.0%)

Developers with the most changed lines
Sergey V. Udaltsov 30425 (79.0%)
Peter Hutterer 3377 (8.8%)
Alexandr Shadchin 1263 (3.3%)
Alan Coopersmith 806 (2.1%)
Chase Douglas 572 (1.5%)
Denis 'GNUtoo' Carikli 180 (0.5%)
Simon Thum 133 (0.3%)
Gaetan Nadon 110 (0.3%)
Bryce Harrington 99 (0.3%)
Nikolai Kondrashov 81 (0.2%)

Developers with the most lines removed
Alexandr Shadchin 1239 (4.7%)
Alan Coopersmith 739 (2.8%)
Chase Douglas 418 (1.6%)
Peter Hutterer 183 (0.7%)
Gaetan Nadon 61 (0.2%)
Peter Korsgaard 51 (0.2%)
Nikolai Kondrashov 35 (0.1%)
Jesse Adkins 16 (0.1%)
Javier Acosta 6 (0.0%)
Adam Jackson 6 (0.0%)

Developers with the most signoffs (total 304)
Peter Hutterer 197 (64.8%)
Alan Coopersmith 25 (8.2%)
Alexandr Shadchin 21 (6.9%)
Gaetan Nadon 11 (3.6%)
Trevor Woerner 6 (2.0%)
Nikolai Kondrashov 5 (1.6%)
Thomas Hellstrom 5 (1.6%)
Chase Douglas 4 (1.3%)
Simon Thum 4 (1.3%)
Joe Shaw 3 (1.0%)

Developers with the most reviews (total 126)
Trevor Woerner 37 (29.4%)
Alan Coopersmith 25 (19.8%)
Benjamin Tissoires 11 (8.7%)
Chris Bagwell 9 (7.1%)
Daniel Stone 9 (7.1%)
Chase Douglas 8 (6.3%)
Adam Jackson 7 (5.6%)
Cyril Brulebois 6 (4.8%)
Matt Turner 5 (4.0%)
Peter Hutterer 3 (2.4%)

Developers with the most test credits (total 25)
Alan Coopersmith 23 (92.0%)
Benjamin Tissoires 1 (4.0%)
Abdoulaye Walsimou Gaye 1 (4.0%)

Developers who gave the most tested-by credits (total 25)
Peter Hutterer 24 (96.0%)
Gaetan Nadon 1 (4.0%)

Developers with the most report credits (total 2)
Dave Airlie 2 (100.0%)

Developers who gave the most report credits (total 2)
Peter Hutterer 2 (100.0%)

Top changeset contributors by employer
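Red Hat 155 (52.9%)
(employer name missing) 32 (10.9%)
(employer name missing) 21 (7.2%)
Oracle 18 (6.1%)
(employer name missing) 12 (4.1%)
(employer name missing) 6 (2.0%)
Canonical 6 (2.0%)
(employer name missing) 5 (1.7%)
(employer name missing) 4 (1.4%)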
VMWare 3 (1.0%)

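Top lines changed by employer
(employer name missing) 30437 (79.1%)
Red Hat 4428 (11.5%)
(employer name missing) 1263 (3.3%)
Oracle 825 (2.1%)
Canonical 713 (1.9%)
(employer name missing) 180 (0.5%)
(employer name missing) 133 (0.3%)
(employer name missing) 113 (0.3%)
(employer name missing) 93 (0.2%)
(employer name missing) 59 (0.2%)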

Employers with the most signoffs (total 304)
Red Hat 199 (65.5%)
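Oracle 26 (8.6%)
(employer name missing) 21 (6.9%)
(employer name missing) 11 (3.6%)
Canonical 6 (2.0%)
(employer name missing) 6 (2.0%)
(employer name missing) 5 (1.6%)
VMWare 5 (1.6%)
(employer name missing) 4 (1.3%)
(employer name missing) 3 (1.0%)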

Employers with the most hackers (total 33)
Red Hat 3 (9.1%)
Oracle 2 (6.1%)
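Canonical 2 (6.1%)
(employer name missing) 1 (3.0%)
(employer name missing) 1 (3.0%)
(employer name missing) 1 (3.0%)
(employer name missing) 1 (3.0%)
VMWare 1 (3.0%)
(employer name missing) 1 (3.0%)
(employer name missing) 1 (3.0%)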

Numbers for userspace video drivers (libdrm, mesa and all xf86-video-*):

Processed 5223 csets from 131 developers
100 employers found
A total of 452414 lines added, 289531 removed (delta 162883)

Developers with the most changesets
Brian Paul 579 (11.1%)
Eric Anholt 512 (9.8%)
Vinson Lee 432 (8.3%)
Dave Airlie 357 (6.8%)
Marek Olšák 324 (6.2%)
Chia-I Wu 252 (4.8%)
José Fonseca 247 (4.7%)
Kenneth Graunke 210 (4.0%)
Luca Barbieri 210 (4.0%)
Ian Romanick 190 (3.6%)

Developers with the most changed lines
Brian Paul 70178 (13.0%)
Luca Barbieri 58946 (10.9%)
Kenneth Graunke 35433 (6.5%)
Chia-I Wu 34790 (6.4%)
Ian Romanick 30961 (5.7%)
Jerome Glisse 28641 (5.3%)
Eric Anholt 27906 (5.2%)
Christoph Bumiller 22352 (4.1%)
Dave Airlie 21625 (4.0%)
Alex Deucher 19210 (3.5%)

Developers with the most lines removed
Kenneth Graunke 19727 (6.8%)
Matt Turner 3052 (1.1%)
Henri Verbeet 1398 (0.5%)
Kristian Høgsberg 832 (0.3%)
Adam Jackson 248 (0.1%)
Jesse Adkins 161 (0.1%)
Nicolas Kaiser 43 (0.0%)
Andre Maasikas 34 (0.0%)
Pierre Allegraud 17 (0.0%)
Patrice Mandin 17 (0.0%)

Developers with the most signoffs (total 930)
Chris Wilson 181 (19.5%)
Jerome Glisse 99 (10.6%)
Brian Paul 82 (8.8%)
Dave Airlie 81 (8.7%)
Alex Deucher 59 (6.3%)
Tilman Sauerbeck 44 (4.7%)
Thomas Hellstrom 40 (4.3%)
Alan Coopersmith 30 (3.2%)
Jakob Bornecrantz 29 (3.1%)
Daniel Vetter 28 (3.0%)

Developers with the most reviews (total 69)
Jakob Bornecrantz 23 (33.3%)
Ian Romanick 10 (14.5%)
Eric Anholt 9 (13.0%)
Julien Cristau 6 (8.7%)
Mikhail Gusarov 4 (5.8%)
Brian Paul 2 (2.9%)
Alex Deucher 2 (2.9%)
Matt Turner 2 (2.9%)
José Fonseca 2 (2.9%)
Michel Dänzer 2 (2.9%)

Developers with the most test credits (total 6)
Guillermo S. Romero 1 (16.7%)
Michel Hermier 1 (16.7%)
Sitsofe Wheeler 1 (16.7%)
Bjørn Mork 1 (16.7%)
Michal Marek 1 (16.7%)
Manoj Iyer 1 (16.7%)

Developers who gave the most tested-by credits (total 6)
Chris Wilson 2 (33.3%)
Guillermo S. Romero 1 (16.7%)
Xiang, Haihao 1 (16.7%)
Xavier Chantry 1 (16.7%)
Jesse Barnes 1 (16.7%)

Developers with the most report credits (total 23)
Julien Cristau 2 (8.7%)
Matthias Hopf 2 (8.7%)
Jeff Chua 2 (8.7%)
Sitsofe Wheeler 1 (4.3%)
Bjørn Mork 1 (4.3%)
Michal Marek 1 (4.3%)
José Fonseca 1 (4.3%)
Daniel Vetter 1 (4.3%)
Cyril Brulebois 1 (4.3%)
Peter Clifton 1 (4.3%)

Developers who gave the most report credits (total 23)
Chris Wilson 19 (82.6%)
Xiang, Haihao 2 (8.7%)
Ian Romanick 1 (4.3%)
Kenneth Graunke 1 (4.3%)

Top changeset contributors by employer
VMWare 1582 (30.3%)
Intel 1292 (24.7%)
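Red Hat 546 (10.5%)
(employer name missing) 324 (6.2%)
LunarG 252 (4.8%)
(employer name missing) 210 (4.0%)
(employer name missing) 158 (3.0%)
AMD 156 (3.0%)
(employer name missing) 65 (1.2%)
(employer name missing) 60 (1.1%)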

Top lines changed by employer
Intel 132677 (24.5%)
VMWare 105087 (19.4%)
Red Hat 87373 (16.1%)
(employer name missing) 67407 (12.5%)
LunarG 38973 (7.2%)
(employer name missing) 22548 (4.2%)
AMD 19690 (3.6%)
(employer name missing) 14329 (2.6%)
richard@richard-desktop3.(none) 12426 (2.3%)
(employer name missing) 11676 (2.2%)

Employers with the most signoffs (total 930)
Intel 235 (25.3%)
Red Hat 215 (23.1%)
VMWare 159 (17.1%)
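AMD 59 (6.3%)
(employer name missing) 44 (4.7%)
Oracle 30 (3.2%)
(employer name missing) 28 (3.0%)
(employer name missing) 24 (2.6%)
(employer name missing) 18 (1.9%)
(employer name missing) 12 (1.3%)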

Employers with the most hackers (total 138)
Intel 17 (12.3%)
VMWare 13 (9.4%)
Red Hat 7 (5.1%)
Canonical 4 (2.9%)
Novell 2 (1.4%)
AMD 1 (0.7%)
(employer name missing) 1 (0.7%)
Oracle 1 (0.7%)
(employer name missing) 1 (0.7%)
(employer name missing) 1 (0.7%)

Pixman library (pixman):

Processed 223 csets from 15 developers
12 employers found
A total of 10985 lines added, 6139 removed (delta 4846)

Developers with the most changesets
Søren Sandmann Pedersen 124 (55.6%)
Siarhei Siamashka 64 (28.7%)
Andrea Canciani 11 (4.9%)
Dmitri Vorobiev 5 (2.2%)
Rolland Dudemaine 4 (1.8%)
Cyril Brulebois 2 (0.9%)
Jon TURNEY 2 (0.9%)
Liu Xinyun 2 (0.9%)
Maarten Bosmans 2 (0.9%)
Benjamin Otte 2 (0.9%)

Developers with the most changed lines
Søren Sandmann Pedersen 6335 (45.3%)
Siarhei Siamashka 3119 (22.3%)
Liu Xinyun 1318 (9.4%)
Jonathan Morton 721 (5.2%)
Andrea Canciani 586 (4.2%)
Dmitri Vorobiev 62 (0.4%)
Benjamin Otte 62 (0.4%)
Maarten Bosmans 56 (0.4%)
Rolland Dudemaine 32 (0.2%)
Mika Yrjola 7 (0.1%)

Developers with the most lines removed
Liu Xinyun 1318 (21.5%)
Maarten Bosmans 11 (0.2%)
Rolland Dudemaine 2 (0.0%)

Developers with the most signoffs (total 7)
Cyril Brulebois 2 (28.6%)
Jon TURNEY 2 (28.6%)
Liu Xinyun 1 (14.3%)
Alan Coopersmith 1 (14.3%)
Chen Miaobo 1 (14.3%)

Developers with the most reviews (total 1)
Matt Turner 1 (100.0%)

Developers with the most test credits (total 0)

Developers who gave the most tested-by credits (total 0)

Developers with the most report credits (total 0)

Developers who gave the most report credits (total 0)

Top changeset contributors by employer
Red Hat 126 (56.5%)
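Nokia 64 (28.7%)
(employer name missing) 11 (4.9%)
Movial 7 (3.1%)
(employer name missing) 4 (1.8%)
(employer name missing) 2 (0.9%)
(employer name missing) 2 (0.9%)
(employer name missing) 2 (0.9%)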
Intel 2 (0.9%)
Oracle 1 (0.4%)

Top lines changed by employer
Red Hat 7969 (57.0%)
Nokia 3156 (22.6%)
Intel 1318 (9.4%)
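Movial 805 (5.8%)
(employer name missing) 619 (4.4%)
(employer name missing) 57 (0.4%)
(employer name missing) 40 (0.3%)
(employer name missing) 7 (0.1%)
(employer name missing) 6 (0.0%)
(employer name missing) 2 (0.0%)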

Employers with the most signoffs (total 7)
Intel 2 (28.6%)
(employer name missing) 2 (28.6%)
(employer name missing) 2 (28.6%)
Oracle 1 (14.3%)

Employers with the most hackers (total 15)
Movial 3 (20.0%)
Red Hat 2 (13.3%)
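Intel 1 (6.7%)
(employer name missing) 1 (6.7%)
(employer name missing) 1 (6.7%)
Oracle 1 (6.7%)
Nokia 1 (6.7%)
(employer name missing) 1 (6.7%)
(employer name missing) 1 (6.7%)
(employer name missing) 1 (6.7%)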

The X11 conformance test suite (XTS), taken from Peter’s repository:

Processed 36 csets from 2 developers
2 employers found
A total of 3114 lines added, 3339 removed (delta -225)

Developers with the most changesets
Peter Hutterer 21 (58.3%)
Aaron Plattner 14 (38.9%)

Developers with the most changed lines
Peter Hutterer 3242 (90.0%)
Aaron Plattner 136 (3.8%)

Developers with the most lines removed
Peter Hutterer 264 (7.9%)

Developers with the most signoffs (total 35)
Peter Hutterer 21 (60.0%)
Aaron Plattner 14 (40.0%)

Developers with the most reviews (total 4)
Joe Kain 2 (50.0%)
Adam Cheney 2 (50.0%)

Developers with the most test credits (total 0)

Developers who gave the most tested-by credits (total 0)

Developers with the most report credits (total 0)

Developers who gave the most report credits (total 0)

Top changeset contributors by employer
Red Hat 21 (58.3%)
NVidia 14 (38.9%)

Top lines changed by employer
Red Hat 3402 (94.4%)
NVidia 200 (5.6%)

Employers with the most signoffs (total 35)
Red Hat 21 (60.0%)
NVidia 14 (40.0%)

Employers with the most hackers (total 2)
Red Hat 1 (50.0%)
NVidia 1 (50.0%)

X documentation (doc repository):

Processed 108 csets from 7 developers
7 employers found
A total of 28556 lines added, 212807 removed (delta -184251)

Developers with the most changesets
Alan Coopersmith 57 (52.8%)
Gaetan Nadon 45 (41.7%)
Matt Dew 2 (1.9%)
Peter Hutterer 1 (0.9%)
Samuel Thibault 1 (0.9%)
Jesse Adkins 1 (0.9%)
Marc Balmer 1 (0.9%)

Developers with the most changed lines
Gaetan Nadon 146676 (67.0%)
Matt Dew 59191 (27.0%)
Alan Coopersmith 6927 (3.2%)
Samuel Thibault 7 (0.0%)
Jesse Adkins 7 (0.0%)
Peter Hutterer 3 (0.0%)
Marc Balmer 3 (0.0%)

Developers with the most lines removed
Gaetan Nadon 129581 (60.9%)
Matt Dew 52738 (24.8%)
Alan Coopersmith 1932 (0.9%)
Jesse Adkins 7 (0.0%)

Developers with the most signoffs (total 109)
Alan Coopersmith 59 (54.1%)
Gaetan Nadon 47 (43.1%)
Jesse Adkins 1 (0.9%)
Peter Hutterer 1 (0.9%)
Samuel Thibault 1 (0.9%)

Developers with the most reviews (total 28)
Alan Coopersmith 17 (60.7%)
Gaetan Nadon 4 (14.3%)
Peter Hutterer 2 (7.1%)
Daniel Stone 1 (3.6%)
Dan Nicholson 1 (3.6%)
Julien Cristau 1 (3.6%)
Matt Turner 1 (3.6%)
Adam Jackson 1 (3.6%)

Developers with the most test credits (total 0)

Developers who gave the most tested-by credits (total 0)

Developers with the most report credits (total 0)

Developers who gave the most report credits (total 0)

Top changeset contributors by employer
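Oracle 57 (52.8%)
(employer name missing) 45 (41.7%)
(employer name missing) 2 (1.9%)
Red Hat 1 (0.9%)
(employer name missing) 1 (0.9%)
(employer name missing) 1 (0.9%)
(employer name missing) 1 (0.9%)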

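Top lines changed by employer
(employer name missing) 152747 (69.7%)
(employer name missing) 59195 (27.0%)
Oracle 7088 (3.2%)
(employer name missing) 7 (0.0%)
(employer name missing) 7 (0.0%)
Red Hat 3 (0.0%)
(employer name missing) 3 (0.0%)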

Employers with the most signoffs (total 109)
Oracle 59 (54.1%)
(employer name missing) 47 (43.1%)
(employer name missing) 1 (0.9%)
(employer name missing) 1 (0.9%)
Red Hat 1 (0.9%)

Employers with the most hackers (total 7)
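Oracle 1 (14.3%)
(employer name missing) 1 (14.3%)
(employer name missing) 1 (14.3%)
(employer name missing) 1 (14.3%)
Red Hat 1 (14.3%)
(employer name missing) 1 (14.3%)
(employer name missing) 1 (14.3%)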

About the Nokia and Microsoft alliance? I was deeply shocked, yes, but well, I guess I’m cool and over it now. I’m sure MeeGo is not dead by any means though… Nevertheless, Nokia’s contribution to X11 development will obviously be diminishing. It’s sad. Our Graphics Team was just feeling the first effects of the newly introduced culture of pushing whatever work we can (well, the parts we are allowed to) upstream, and now it has all been cut short. So, unfortunately, this won’t happen at the same volume anymore, and the numbers collected for 1.10 are definitely a landmark for Nokia.

Xorg server 1.9 minimal

That’s what I’m using for MeeGo now. Autoconf parameters, theeere we go:

--disable-static --disable-aiglx --disable-config-dbus --disable-config-hal --disable-dbe --disable-dga --disable-dpms --disable-dri --disable-glx --disable-glx-tls --disable-int10-module --disable-ipv6 --disable-screensaver --disable-secure-rpc --disable-tcp-transport --disable-vbe --disable-vgahw --disable-xdm-auth-1 --disable-xinerama --disable-xwin --disable-xaa --disable-xace --disable-xdmcp --disable-xf86vidmode --disable-xfree86-utils --disable-xnest --disable-xvmc --disable-libdrm --enable-config-udev --enable-dri2 --enable-null-root-cursor --enable-record --enable-unit-tests --enable-visibility --enable-xorg --with-sha1=libsha1

PS: stop using kdrive hardware servers (Xfbdev and variants). They are dead!

Linux Graphics for Small Devices at FISL

Last week I was in Brazil at the 11th International Free Software Forum (FISL) talking about Linux Graphics for Small Devices*. I tried to cover a bit of everything I’ve learned in the world I’ve been immersed in recently – I guess there isn’t much news for freedesktopers though. Anyway, everyone is very welcome to give any kind of feedback and comment on it. Just follow here.

*actually, two nights in Porto Alegre and two nights in Curitiba. It was great to see most of my friends!

adopt a child and make multi-card work on Linux

Previously, the message was for toolkits; now it targets new, upcoming developers… okay, if I wanted to be offensive I could say it targets vendor distributions that care about the desktop on Linux :)

I started hacking on X because the laboratory I was working in at my university was running an amazing project to deploy computer labs in all high schools of the state I was living in, in Brazil. It was a success, and all 2,100 schools used the multiseat computing model.

My work on this project began back in 2006 [0], and at that time I was trying to understand the situation of Linux with multiple graphics cards – which is only part of the work needed to make multiseat happen. The work proceeded but I could never actually push the patches to mainline. Afterwards, now at Nokia, I took up this work again, targeting some clean-up of the X server code. It mostly went upstream (see the VGA arbiter, libpciaccess and the current xserver code). But the code is buggy and a lot of work is still needed to make it work properly.

It seems that I have a son now, but he (or should it be she?) is a rebel baby and causes a lot of trouble. Rather, I’m mean and want to give him away!

I don’t care about multi-card development nowadays and, for some unknown reason, no one else cares either. But people use it a lot: try to mix old graphics cards with new cards… boom! Try to use multi-card with decent hw acceleration… boom! Try to hotplug graphics devices… no way! Hotswitch… hardly! Perform close to a single-card system… only in your dreams! Some guys have been kindly contributing patches for a while, and unfortunately our open-source community lacks the manpower to get them reviewed properly and eventually landed upstream. So here’s your big chance:


[0] BTW, I found the first patch I sent for X. It dates back to April 2006 and was against Xgl, GLX backend. Very funny :)

Scrutinizing X memory, part 2: what’s taking all that memory?

So here are some statistics of the running Xorg process. All the information was fetched from /proc/`pidof Xorg`/{smaps, status}. I also used a script found on the Web to parse and organize this information; Mikhail Gusarov has extended the script to produce very useful output.
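
To give an idea of what such parsing involves, here is a minimal sketch in Python. It is not the script mentioned above: the function names, the `kind()` heuristic (classifying a mapping from its permission bits and pathname) and the `SAMPLE` text with its invented numbers are all my own assumptions for illustration. Only the general layout of smaps (a header line per mapping followed by `Name: value kB` fields) and the `Vm*` lines of status are taken as given.

```python
import re
from collections import defaultdict

# Header of one mapping: "start-end perms offset dev inode [pathname]"
HEADER = re.compile(r"^[0-9a-f]+-[0-9a-f]+ (\S{4}) [0-9a-f]+ \S+ \d+\s*(.*)$")
FIELD = re.compile(r"^(\w+):\s+(\d+) kB$")

# Invented sample in smaps layout; the numbers are NOT measurements.
SAMPLE = """\
08048000-081f9000 r-xp 00000000 08:01 1001 /usr/bin/Xorg
Rss:                 100 kB
Private_Clean:         0 kB
Private_Dirty:         0 kB
081f9000-08205000 rw-p 001b0000 08:01 1001 /usr/bin/Xorg
Rss:                  20 kB
Private_Clean:         0 kB
Private_Dirty:        20 kB
09a00000-09c00000 rw-p 00000000 00:00 0 [heap]
Rss:                 300 kB
Private_Clean:         0 kB
Private_Dirty:       300 kB
b7400000-b7450000 r-xp 00000000 08:01 2002 /usr/lib/libcrypto.so
Rss:                  50 kB
Private_Clean:        10 kB
Private_Dirty:         0 kB
"""


def parse_smaps(text):
    """Return one dict per mapping with its perms, name and kB fields."""
    mappings = []
    for line in text.splitlines():
        header = HEADER.match(line)
        if header:
            perms, name = header.groups()
            mappings.append({"perms": perms, "name": name or "[anon]"})
            continue
        field = FIELD.match(line)
        if field and mappings:
            mappings[-1][field.group(1)] = int(field.group(2))
    return mappings


def kind(mapping):
    """Rough classification of a mapping (a heuristic, see the note above)."""
    if mapping["name"] in ("[anon]", "[heap]"):
        return "heap"
    if "x" in mapping["perms"]:
        return "code"
    if "w" in mapping["perms"]:
        return "data"
    return "rodata"


def rss_by(mappings, key):
    """Sum Rss (kB) over mappings grouped by key(mapping), largest first."""
    totals = defaultdict(int)
    for mapping in mappings:
        totals[key(mapping)] += mapping.get("Rss", 0)
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def private_rss(mappings, name_part):
    """Private (unshared) resident kB of mappings whose name contains name_part."""
    return sum(m.get("Private_Clean", 0) + m.get("Private_Dirty", 0)
               for m in mappings if name_part in m["name"])


def parse_status(text):
    """Extract the Vm* lines of /proc/PID/status as a {name: kB} dict."""
    matches = (FIELD.match(line) for line in text.splitlines())
    return {m.group(1): int(m.group(2))
            for m in matches if m and m.group(1).startswith("Vm")}
```

On a live system the input would be the contents of the two /proc files for the Xorg PID instead of `SAMPLE`; `rss_by(maps, kind)` then gives a breakdown by type of memory and `rss_by(maps, lambda m: m["name"])` a breakdown by binary or library.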

Xorg per se

Running just one standalone `Xorg -retro`. On my system it amounts to:
VmRSS: 5440 kB
VmSize: 13620 kB

from those 5440 kB of RSS:
3404 kB (63 %) come from code
1628 kB (30 %) come from malloc/mmap in anonymous memory (heap)
228 kB (4 %) come from other data mapped in memory
180 kB (3 %) come from rodata

from those same 5440 kB of RSS:
1628 kB (30 %) come from malloc/mmap in anonymous memory (heap) somewhere*
1200 kB (22 %) come from Xorg
628 kB (12 %) come from libc
316 kB (6 %) come from libcrypto
164 kB (3 %) come from libint10
136 kB (2.5%) come from libXfont
128 kB come from libxaa
120 kB come from libpixman
116 kB come from nv_drv
112 kB come from ld
102 kB come from libglx
100 kB come from swrast_dri
88 kB come from libfb
60 kB come from libpthread
48 kB come from evdev
xxx kB come from other libraries**

* just by looking into /proc/, there’s no way to determine whether the allocations came from the binary itself or from some DSO. I’ll definitely analyse this carefully in the near future using another approach.

** the input hotplug layer, which most systems are using today, is missing from these numbers. In other data I collected, I’ve seen dbus + hal taking 268 kB against an amazing 64 kB for libudev.

These measurements are not perfect; they are a snapshot of the memory when the server has just started. The footprint brought into memory at Xorg’s initialization time will differ a lot from that of regular usage during the rest of Xorg’s life, when it deals with clients and users interacting. For instance, libint10 is mapping 164 kB and it’s likely that, once paged out, it will never be brought back into memory again. Likewise, the heap portion will increase when clients start to allocate pixmaps on the server.

Even so, we can see some nice facts. From the first chart, we see that almost 2/3 of the RSS is used by instructions. Is that normal behaviour for a graphics server? I don’t know. In the other chart, we see a huge footprint from libcrypto. When not counting shared mappings (e.g. those also used by openssl), that library is using 88 kB of RSS for private mappings alone – sigh. We can probably replace it with another SHA1 implementation (in fact, we already have others inside the server) or use our built-in one. We also have libpthread, used by GLX, which is being built in even on systems that are not using it (e.g. Maemo on the N900). libXfont shows up as a surprise to me too, taking a considerable amount of memory. We’re probably able to tweak it a bit though.

the code being started

Another way to analyse Xorg is to gather information per code path and per module being started. So I first set a breakpoint in the InitOutput() function. Until InitOutput() is called:
VmRSS: 1728 kB
VmSize: 8788 kB

from 1728 kB in RSS:
1336 kB (77.3 %) come from code
132 kB (7.6 %) come from malloc/mmap in anonymous memory (heap)
144 kB (8.3 %) come from other data mapped in memory
116 kB (6.7 %) come from rodata

from 1728 kB in RSS:
436 kB (25.2 %) come from libc
328 kB (19 %) come from Xorg
316 kB (18.3 %) come from libcrypto

A breakpoint in InitOutput() captures the very first steps of Xorg initialization: command line processing, the OS layer being started and other basic routines. At this point, naturally, not much code inside Xorg has been executed yet, nor have any drivers been loaded. Therefore, almost half of the process’s memory usage (44 %) comes from the start-up of basic libraries such as libc, libcrypto, etc.

The next chart, from a breakpoint set at InitInput(), shows the moment when output is mostly done, i.e. the internal loader is initialized, the configuration has been parsed and the output drivers are already loaded. Until InitInput() is called:
VmRSS: 4436 kB
VmSize: 13724 kB

from 4436 kB in RSS:
3352 kB (75.6 %) come from code
676 kB (15.2 %) come from malloc/mmap in anonymous memory (heap)
228 kB (5.1 %) come from other data mapped in memory
180 kB (4 %) come from rodata

We see that the server’s RSS has jumped 2708 kB from the previous chart. In other words, 2708 kB, or 50% of the final 5440 kB, is used just for output initialization, and the remaining 1004 kB (18.4 %) will be used by the input initialization routines.
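
The per-stage figures are simple differences between the RSS snapshots, which a few lines of Python can reproduce. This is only a sketch over the numbers reported in this post; the checkpoint labels and the `stage_costs()` helper are made up, and it assumes the 5440 kB of the standalone run is the RSS once the server is fully started.

```python
# RSS in kB at each checkpoint, as reported in this post.
CHECKPOINTS = [
    ("until InitOutput()", 1728),
    ("until InitInput()", 4436),
    ("server fully started", 5440),   # assumption: the standalone figure
]


def stage_costs(checkpoints):
    """Return (label, kB added since the previous checkpoint, % of final RSS)."""
    final = checkpoints[-1][1]
    costs = []
    previous = 0
    for label, rss in checkpoints:
        added = rss - previous
        costs.append((label, added, 100.0 * added / final))
        previous = rss
    return costs


if __name__ == "__main__":
    for label, added, percent in stage_costs(CHECKPOINTS):
        print("%-22s +%5d kB  (%.1f %% of final RSS)" % (label, added, percent))
```

Adding more breakpoints to the list would split the same total into finer stages.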

Well, I’m already happy with these preliminary statistics. I guess we already have work to do just from looking into them. Now, I plan to investigate a bit further X’s heap creation and how efficiently X clients are using pixmaps.

As always, I appreciate any corrections, suggestions and improvements.

* this text was kindly reviewed by Mikhail Gusarov.

Scrutinizing X Memory, part 1: overview

This series of documents explores how memory is used by the Xorg server. It aims to eventually shrink the memory footprint of the server and its related components, like X clients, the modules being loaded and drivers. Embedded devices with constrained resources are the main focus here. All texts are mostly based on x86 and ARM architectures, under Linux 2.6.33 with Xorg from upstream.


One way to analyse aspects of the memory usage of a given program is to scrutinize its object data. Object data contains executable code and static data. Both are of little interest from the process memory management point of view, given that their layout is determined by the compiler and does not change during process execution. However, we can deduce some nice information about the object. For instance, from the Xorg object we are able to get some statistics about the code, identify its structure and point out architectural mistakes just by looking into it.

Besides the object itself, it is also important to see it in execution and how the dynamic allocations are performed on the stack and heap. So an analysis of the object file while it is running is valuable as well.

X file object

Consider the following sections of Xorg:

.text: contains the instructions executed by the CPU and all constant data – literals. While the program is being executed, pages are loaded into physical memory carrying instructions and literals.

The number of lines in the X code is huge, which in some way results in a huge .text segment. In my environment .text is 1833738 bytes (1.74 MB) when the compiler uses the third optimization level (-O3). As a very rough view, removing code means fewer instructions to execute and consequently less text and a smaller memory footprint. For instance, a single inclusion of fprintf will cost ~40 bytes of text in your object. Of course it’s not straightforward to cut code all over the server, but for a given device/environment we can customize it, as already discussed.

Besides code elimination, optimizing the code with the compiler’s size optimization (-Os) helps a lot too: 260 kB of RSS saved here, just by optimizing the X server. So we might consider this and also apply the same idea to DSOs. For instance, the size of the pixman library mapped in the server shrinks 30% when compiled with size optimization. Good job, compiler!

.data and .bss: static or global variables allocated at program startup.

If variables allocated at compile time are not initialized, then the BSS (Block Started by Symbol) grows; a larger BSS also means a larger VM (Virtual Memory) size, but not necessarily a larger RSS. The VM size is quite meaningless when measuring real memory usage. So I wouldn’t bother to analyse BSS, given that the RSS occupied by X is what I really care about.

On the other hand, the .data section grows when permanent variables are given initial values. And if these variables are accessed, they directly increase physical memory usage. A good habit here is to declare variables constant whenever possible, so that they go to the .text segment and the compiler might be able to perform optimizations.
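The remark above that a bigger BSS raises the VM size but not necessarily the RSS can be observed with any zero-filled mapping, since pages that are never touched are not backed by physical memory. The following sketch uses an anonymous mmap as a stand-in for a large uninitialized array (an analogy, not the Xorg BSS itself) and assumes a Linux /proc; the names vm_kb and demo are illustrative only.

```python
import mmap


def vm_kb(field):
    """Read one Vm* field (in kB) of this process from /proc/self/status."""
    with open("/proc/self/status") as status:
        for line in status:
            if line.startswith(field + ":"):
                return int(line.split()[1])
    raise KeyError(field)


def demo(size=32 * 1024 * 1024):
    """Map `size` zero-filled bytes, then write to every page.

    Returns (VmSize growth, VmRSS growth before touching, VmRSS growth after
    touching), all in kB, or None where the /proc fields cannot be read."""
    try:
        size0, rss0 = vm_kb("VmSize"), vm_kb("VmRSS")
    except (OSError, KeyError):
        return None
    region = mmap.mmap(-1, size, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)
    size1, rss1 = vm_kb("VmSize"), vm_kb("VmRSS")
    # Writing one byte per page forces the kernel to back it with real memory.
    for offset in range(0, size, mmap.PAGESIZE):
        region[offset] = 1
    rss2 = vm_kb("VmRSS")
    region.close()
    return size1 - size0, rss1 - rss0, rss2 - rss0
```

On a typical Linux system the first number is about the size of the mapping, the second stays close to zero and the third approaches the size of the mapping, which is why RSS rather than VM size is the figure tracked in this text.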

X dynamic allocations (stack, heap and friends)

Probably this is where there’s the most room for optimization. The heap grows in response to the program’s needs: a program like “ls” will not make a lot of demands on the heap (one hopes), while the heap of a running Xorg can grow in a truly amazing way. It shouldn’t be hard to profile all allocations done inside the server. Probably valgrind’s massif with a bunch of arguments gives this to us.

X clients are able to ask the server to allocate pixmaps in its own memory. This feature is one of the main reasons for the growing and shrinking of the server’s memory footprint. Because of that, it’s very common to see people getting confused, thinking there’s a leak in the server while it’s actually on the client side.

Besides heap allocations there’s also the stack, used to hold automatic variables and function data. I don’t think there’s much to track in stack memory, or many ways to save overall process memory there. But a good rule to remember is that allocation here is typically much faster than for dynamic storage (heap or free store), because a memory allocation on the stack involves only a pointer adjustment rather than more complex management.

The ideas above are just an overview of where we can start to work. I don’t believe there’s a unique and certain point where we can go and fix X memory usage. We should analyse the code and attack on all sides.

Next, I’ll analyse in depth each of the dynamic and static allocation methods discussed in this document, starting with some statistics on where X sucks more… memory :-P I’ll appreciate any kind of corrections/suggestions on these documents.

* this text was kindly reviewed by Ander Conselvan and Mikhail Gusarov.

Customization and true modularization of Xorg

For the first time ever, Xorg is being used on a single platform and for a given device only (other devices have used an X11 implementation, but with other non-canonical servers, such as kdrive-based ones – Tiny-X).

Previously Xorg was packaged to run on a huge number of OSes – mostly Linux and Unix-like distributions – with the characteristic of being architecture portable and able to run on a huge set of video and input devices. In terms of software, this means an extensive amount of code able to cover all of the above. But this is far from the needs of a small, single-platform device.

Some days ago, at #xorg-devel, Alan mentioned the following:

06:59 < alanc> vignatti: the whole point of Xorg is to drive video output - where else would you possibly sanely put that code?
07:00 < alanc> I think you're going overboard in the drive to remove all code from Xorg

Alan was referring to my previous comment about removing some video memory mapping code from the server… I understand (and respect a lot) his concerns but lemme put this right here: it’s not about removal of code; I don’t even care if the code is in xserver or not; what I do care about is the customization – or, more fancy, the true modularization [0] – of Xorg.

As discussed at the last X conference, we’re aiming to optionalize [1] a lot of components inside Xorg: distros would build all components, satisfying all supported devices and drivers, whereas constrained environments (such as maemo + n900) would use only a restricted set.

So recently I’ve been confusing people by actually trying to optionalize several components of the server. There are some straightforward modifications to the code, like turning off libdrm, vgahw or libxdmcp, but there are also other, more challenging ones, like the whole old-school mechanism to initialise cards, removing cursor support or even choosing whether or not we want the whole bus subsystem. Sometimes we’ll have to be careful not to step outside the protocol. But the truth is: currently the code is _very_ tied together all over the server. It’s not trivial to “get there”.

IMHO the plan traced at XDC looks perfectly clear and, while other display systems don’t seem suitable enough for us yet, I’ll keep digging in this direction.

[0] the modularization that happened in version 1.0 was about drivers moving outside the server.

[1] what would be a good word here?

multiseat with multiple X servers (or “the right way”)

So last week I posted on lkml an old patch that we had been carrying for a long time in the Linux community. It basically brings multiple (old) video card functionality back to Linux and the X server (and this time doing it the right and beautiful way). For the people who have been following multiseat implementations, this is a HUGE step: we will finally be able to discard the old and ugly hack (a mix of Xorg, several Xephyr servers + evdev) and go to a clean way, starting multiple X servers in parallel. Cool! Well, not that much, because it might take some time to land in your beloved distribution :)

It’s too early and I don’t know if it’s advisable to say this, but if you want to give it a try you basically have to get all X components, these X server patches, my libpciaccess and Dave’s kernel patchset. Again: it’s very unstable work!

If you’re interested in the technical explanations, you can follow the nice memo that Dave wrote about this.