What's going on with the AF_BUS Linux kernel patches, and my D-Bus in the kernel work based on my trip last week.
 
Finally, after all these years we'll know what happens if Linus gets run over by a bus
 
Interesting concept. It's nice to know that people are always looking at ways to change existing paradigms for the (hopefully) better.
 
Great to see that this stuff is in safe hands and will hopefully become a reality soon.
 
I guess the challenge with broadcast/multicast stuff is to know when to throw stuff away and decide a client just isn't keeping up/going to respond.
 
I didn't get how D-Bus is going to be implemented in the kernel. Can someone provide a link where I can get more information on that?
 
+David Alan Gilbert true, but we have plans for that, if a client isn't responding in a certain amount of time, we throw things away.  Yeah, garbage collection, fun...
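A minimal sketch of that "throw things away after a certain amount of time" idea, in userspace Python rather than kernel code. The class name `ExpiringQueue`, the `max_age` parameter and the injectable clock are all hypothetical choices for illustration; nothing here reflects the actual kernel design.

```python
import time
from collections import deque


class ExpiringQueue:
    """Per-client queue that discards messages older than max_age seconds."""

    def __init__(self, max_age, clock=time.monotonic):
        self.max_age = max_age
        self.clock = clock          # injectable so the policy can be tested
        self._items = deque()       # (enqueue_time, message), oldest first

    def push(self, message):
        self._items.append((self.clock(), message))

    def expire(self):
        """Drop messages the client did not pick up in time; return the count."""
        cutoff = self.clock() - self.max_age
        dropped = 0
        while self._items and self._items[0][0] < cutoff:
            self._items.popleft()
            dropped += 1
        return dropped

    def pop(self):
        """Return the oldest message that is still fresh, or None."""
        self.expire()
        return self._items.popleft()[1] if self._items else None
```

A real implementation would also have to decide whether the sender is told about the drop, which this sketch ignores.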

+Mladen Mijatov you didn't read the linked article, did you?
 
Hm, pretty interesting. And I like working with JS.

By the way, what does this mean for software that can change filetype associations (like file managers)? Would there basically be a Portal that tells the system that the app can access the ability to set these associations? And would knowing which apps are installed work through a Portal too? (wow, almost made a game reference there)
 
+Vincent Beers I don't know, that's higher up the stack than my work, ask on the gnome-os mailing list if you are curious.  But give us some time to actually implement all of this before asking such specific questions like that, otherwise it's just wasting everyone's time, as I'm sure you can agree.
 
Alright, I'll just hold off on asking for now.
 
I have been waiting for this stuff desperately. Looking forward to seeing it in the wild!
 
The only downside I see is what this will do for keeping applications running on other Unix-like systems. It wouldn't be the first time stuff has been pushed down others' throats by Linux & co., but at least this time it's a worthwhile something.
 
+Terry Poulin That's the job of the people developing those "other unix systems". And no, nobody has ever pushed anything down others' throats; it is free software, nobody forces people to do jackshit.
 
+Terry Poulin unfortunately, there are very few developers for those "other unix-like systems", but if they wish to implement this, there shouldn't be anything preventing that.  But to expect us to not create new things for Linux just because of the lack of resources for totally different projects is insane.
 
+Cristian Rodriguez and nobody forces developers to use those interfaces in the first place. As the popularity of an interface increases, so does the need for it to be implemented by other platforms. Perfect example: desktop environments. You should be able to use KDE on FreeBSD or Solaris fairly well, but the painfulness increases whenever an app's developer ignores the rest of the world, or ends up thinking, well, they're not forcing me to use it, but I don't want to code 3 ways to do this damn thing.

I remember that all the world is not a VAX, Ubuntu, or even built off Linux.

Sorry if I put you on the defensive hole 8-).
 
+Greg Kroah-Hartman no one expects you not to create new things for Linux. My worry is the people that will use those new things ;).
 
I do remember someone saying that userspace dbus allows arbitrarily large amounts of data to be queued, and moving it to kernel space could involve a quite large amount of locked memory.
 
+Dave Airlie yeah, handling some memory limits will be part of this code, we have some ideas about how to do that.
 
Not trying to fan any flames, but why not just adapt Binder?
 
I would turn that last question around and ask whether the transport is being designed on the technical merits of a good IPC mechanism, so that D-Bus (with that library) and Binder as well would be expected to be converted to it.
 
+Daniel Smith Go read the binder code and come back and say that with a straight face, I'll wait...

Seriously, my goal here is to be able to implement the binder protocol on top of this new code.  If I can pull it off, I will be very happy, and I'm sure that +Robert Love and the other Google engineers will be glad to not have to maintain that kernel code anymore as well.
 
This is great, I can't wait to see dbus implemented as a standard protocol inside the kernel. This, among the other benefits already mentioned, also has the benefit of guaranteeing a standard interface that should work out of the box for most systems.
As a developer, this is obviously a step in the right direction, although I don't really know how much it will weigh on the overall codebase of the kernel itself (maintainability, code size, etc. etc.).
It's obviously not my place to judge so I won't say anything really, just kudos to you +Greg Kroah-Hartman  and to all the other guys working on it! :)
 
+Dave Airlie userspace dbus has limits on all its queues etc., but in practice people set the limits "so high nobody hits them" because apps don't handle errors. (Which is understandable; handling errors would be an enormous amount of never-tested code in many cases.) Ideally the "so high nobody hits them" limits are still low enough to prevent Evil, but who knows. It isn't a very elegant approach.
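To make the trade-off concrete, here is a minimal sketch of a hard per-connection byte limit that reports an error to the sender instead of queueing without bound. `BoundedOutbox`, `QueueFull` and the 8-byte limit in the usage are hypothetical; dbus-daemon's real limits are configured differently, and the point above is that applications rarely handle this error path.

```python
class QueueFull(Exception):
    """Raised when a send would exceed the connection's byte limit."""


class BoundedOutbox:
    """Queue of pending messages capped at max_bytes of payload."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used = 0
        self.messages = []

    def send(self, payload: bytes):
        # Refuse the message rather than grow past the limit.
        if self.used + len(payload) > self.max_bytes:
            raise QueueFull(f"{self.used + len(payload)} > {self.max_bytes} bytes")
        self.messages.append(payload)
        self.used += len(payload)

    def receive(self):
        # Delivering a message frees its share of the budget.
        payload = self.messages.pop(0)
        self.used -= len(payload)
        return payload
```

Setting `max_bytes` "so high nobody hits it" makes `QueueFull` unreachable in practice, which is exactly the inelegant approach described above.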
 
+Greg Kroah-Hartman I apologize for never documenting the dbus-daemon semantics, I'm sure +Alexander Larsson was able to fill you in on the important stuff.... even worse a bunch of the tests (see bottom of bus/dispatch.c) are unit rather than integration tests that won't be able to run directly on the kernel implementation and would have to be "ported" to some other form.

Feel free to ask if you guys run into "wtf" moments and it's possible I will remember why it's like that. Of course it's possible there's no good reason. ;-) Also, there are quite a few other people by now who know the codebase, on the dbus mailing list.

Great to see you guys tackle this.
 
I do hope it is as generic as it was when people started discussing it. Having a better local broadcast/multicast system would be a boon. Being forced to use the dbus protocol, not so much, for the usual reasons.
 
Hi Greg, where can I find the design docs for it?
 
+Luca Barbato What issues are there with the dbus wire protocol? At least the data marshalling format is already implemented by Glib, Qt, and many other libraries.
 
I've been waiting for a solution that can unify binder and dbus on the kernel level since I first looked at the binder source code, but I also had not seen any attempt that looked like it would actually work. Thanks for taking this on Greg, you're a hero!
 
When implementing message passing in the kernel wouldn't it make more sense to implement the messaging API QNX provides? Is D-Bus really too slow when it uses Unix sockets?
 
I'm not sure I understand the part about AF_DBUS targeting rare performance needs for special environments. Does it mean that this isn't a commonly shared need, and that achieving AF_DBUS-like performance (scalability, low latency, ...) isn't an essential goal of the libdbus project?

Are there other motivations for this kernel port, besides bringing namespace awareness (arbitration of cross-namespace flows...) to dbus, which is already a great and useful goal indeed?

As for the Binder questions above: more generally, are there plans to expose some generic, low-level hooks/APIs for building different RPC mechanisms (in addition to the dbus APIs), i.e. something that Binder, OpenMPI, ØMQ, nanomsg, etc. can be built on top of? Or would the userland API be strictly restricted to what dbus (and possibly Binder) offers?
 
+Benjamin Pineau for the most part, dbus as used today is not that performance sensitive. See http://lists.freedesktop.org/archives/dbus/2012-March/015024.html

In many ways the reason dbus is useful is that it prioritizes things like reliability, lifecycle tracking, ordering, and defined semantics, over performance. I mean, the core design decision (to have a central daemon) is all about that. For dbus usage on the desktop, the most important thing is that it work reliably with a minimum of tricky stuff to get right on the part of app developers. The Linux desktop traditionally has a "swarm of processes" architecture and if all those processes had to handle hard problems like random messages getting dropped or reordered, then it just wouldn't work in practice.

But the dbus implementation is also slow in ways that don't come from design tradeoffs. i.e. lack of performance tuning work. It could be sped up a lot just in userspace with better code. And eliminating the central daemon through kernel work could of course speed it up even more.

Making it faster may make it useful in cases where it currently isn't.  However it's important (for the desktop use-case anyway) not to "cheat" by removing useful semantics.
 
Sounds similar to a cross between UNIX sockets and IP multicast.
 
Can't wait for this! We've built a tester using a "parameter server" which channels IO, message parameters and software command variables between various apps. It's currently implemented under QNX, with its lightning-fast IPC, but I'd love to port it to Linux. D-Bus was just traditionally too slow for this. Here's hoping...
 
I'm pretty sure Linux 4.0 or earlier will have it. :)
 
+Benjamin Pineau That's AF_BUS you're apparently referring to, not AF_DBUS.   Sorry to be pedantic; in the age of search engines, the difference matters.
 
+Frank Kusel Given that QNX was open-source until RIM (I mean "Blackberry") bought it, might Linux developers benefit from looking at old QNX IPC code?
 
+Benjamin Pineau Your question is the crux: "would the userland API be strictly restricted to what dbus (and possibly Binder) offers?"   +Greg Kroah-Hartman says in response to a question about binder, "my goal here is to be able to implement the binder protocol on top of this new code."    Presumably what has been implicitly decided is that IPC is a kernel-level function akin to memory management, so we should create a general-purpose IPC API.   The precedent is moving X11's memory management into the kernel by inventing DRM.   Perhaps the existence of DRM is what makes it possible for Wayland to finally displace X11.    Similar beneficial changes to userland IPC might occur if functionality that is common to binder, 0MQ, D-Bus, etc. were moved into kernel and standardized. +Dave Airlie +Kristian Høgsberg 
 
+Alison Chaiken QNX is microkernel based. The IPC is the fundamental comms mechanism for the (very small) kernel. I doubt the much larger and more complex Linux kernel would be suited to using something like this directly. I'm not an expert in that area though ...
 
How will you deal with the dependencies of this libdbus which would be in the kernel (if I understand right ..)
 
As I read "On top of this kernel feature, we will try to provide a "libdbus" interface that allows existing D-Bus users to work without ever knowing the D-Bus daemon was replaced on their system." That means outside the kernel, right?
 
This confuses me, and searching on Google confuses me even more. Let's wait and see. I'm sure Greg has a good idea behind it. Just like somebody used to say in some movie :D  "Trust me" ;)
 
+Greg Kroah-Hartman , correct me if I'm wrong, but I was explaining to someone that this change isn't as insane as some people make it out to be. My understanding is that the kernel already talks with, say, udev over netlink, which then broadcasts over dbus. All this change would do is cut out the middleman, correct? Not correct?
 
+Cameron Seader this work doesn't care what desktop you are using, but some of the larger GNOME-specific things that Alex talks about will care, but there's nothing keeping the other environments from taking advantage of the low-level plumbing we are doing here as well.

+Stephen Martin this has nothing to do with the existing udev/netlink connection that is being used, this is something different.  It is merely moving the dbus daemon into kernelspace, with a libdbus in userspace to keep the same interface that applications are used to using.
 
Great, it's clear to me now. Thanks Greg, thanks for your incredible work on maintaining the kernel.
 
"... the crazy automotive Linux developers ..."

I almost feel offended ... ;-).

A short boot time is a crucial requirement, sometimes even a legal requirement, for automotive ECUs. There is no compromise, and if you are not able to achieve that, you have to use a different OS or a different HW design. In almost every automotive project I have been involved in, we have been fighting for every millisecond, spending sleepless nights at the end of the project, optimizing, nearly hacking the code just to keep the boot time short.

"... extremely underpowered processors."

You should not look at the world just from a desktop-CPU point of view. We simply cannot justify the use of a more powerful and expensive CPU only because of the boot time when it ends up spending most of the time idling.

Anyway, I do look forward to future updates. I am just wondering whether some people from the automotive area are involved in the design or whether we will have to live in illegality, as with the old AF_BUS patches.
 
+Wink Saville the linked article said when they would be available  and where the docs currently are at.

+Vaclav Mocek Don't feel offended, push back on your crazy userspace libraries that are abusing dbus in this manner :)

Seriously, I understand the issues you have, and you have my sympathies, although I seriously question your hardware choices in your projects, but that's not my burden, it's yours...  The automotive people are aware of this work, and are involved, in a way, so don't worry, you aren't forgotten.
 
+Greg Kroah-Hartman Thanks.

An attempt to send thousands of D-Bus messages at boot time is a result of ridiculously poor design, it is definitely not a standard practice and probably nobody expects it could be fixed by your new "kernel dbus" code.
 
Really looking forward to it. Thanks.
 
+Alison Chaiken nice, when is your presentation, I'll be at ELC for 2 days if you want to talk in-person.

Also, your slide #16, what does "Not viewed that way by everyone..." mean?  Who doesn't want this new solution?
 
The reason I'm giving a talk is to have a chance to speak to a lot of developers about both the technical issues and how to ward off mutual misunderstanding.    I hope that people who disagree with me will show up at my presentation and argue in public.    We need to drag a lot of discussions out into the light to make progress.   I also proposed a panel on GPLv3 and "tivoization" in automotive for the Linux Collaboration Summit, as that's another topic that should be aired in public.   +Rudolf Streif +Richard Fontana +Karen Sandler +Matthew Garrett 
 
+Alison Chaiken *Very* interesting slide show! Really wish I could attend your talk. My background is more military avionics and industrial software, but I've been eyeing the vehicular stuff for a while. Looks like very interesting things are happening there!
 
+Alison Chaiken An interesting slide show, it is exactly what is needed to influence these non-automotive people a bit ;-).

Apart from the cultural reasons you mentioned in your slides, there are still technical drawbacks, and the not-fast-enough-under-some-circumstances IPC is probably one of them. It should be said that multimedia ECUs especially are very often small HPC computers with a high constant or mostly predictable load. They have exactly the same issues as the "real" HPC computers (memory bandwidth, synchronization), plus a real-time nature (guaranteed latency). The next big thing is probably AVB, which is significantly faster than the CAN bus. AVB streams of 150-200 Mbps are already used in current designs, and everything is precisely synchronised (IEEE 802.1AS). IPC and inter-core communication are then crucial issues: there are no big memory buffers, and in my experience any latency could mean that the data are lost without any possibility of recovery. Yes, as you write, QNX is a strong competitor here.

And having a faster Gnome would be a nice side effect ;-).
 
+Vaclav Mocek so, what does QNX do differently that Linux can't do here?  Any pointers to the interface their kernel uses that would work better for you?
 
+Greg Kroah-Hartman
"so, what does QNX do differently that Linux can't do here?  Any pointers to the interface their kernel uses that would work better for you?"

I can confirm that QNX is a successful competitor. There are plenty of Tier 1 manufacturers who have chosen QNX recently as a platform for some automotive projects. I do not think that QNX does something Linux cannot do, except that it is an RTOS. Other strong points might be the licence and various certifications. A simple comparison of interfaces is not an answer; I could write code using both APIs.

"Our goal is to provide a reliable multicast and point-to-point messaging system for the kernel, that will work quickly and securely."

Some sub-goals might be contradictory, and it will be your design decision which of them to give preference to.

I would prefer multicast capabilities (in the UDP sense), preserved ordering, a publish/subscribe model, low latency (old data == lost data in many cases), and low overhead for sending small messages (ratio of payload size to size of allocated internal structures).
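As a rough illustration of that wish list, the sketch below is a toy in-process publish/subscribe bus: one global sequence number preserves ordering across subscribers, and each subscriber gets a small bounded inbox so that stale data is overwritten rather than queued ("old data == lost data"). `Bus`, `subscribe`, `publish` and the `depth` parameter are hypothetical names, and this says nothing about what the kernel interface will look like.

```python
from collections import deque
from itertools import count


class Bus:
    """Toy bus: ordered multicast with bounded per-subscriber inboxes."""

    def __init__(self):
        self._seq = count(1)   # global sequence number preserves ordering
        self._subs = {}        # topic -> list of subscriber inboxes

    def subscribe(self, topic, depth=1):
        # A deque with maxlen silently discards the oldest entry when full.
        inbox = deque(maxlen=depth)
        self._subs.setdefault(topic, []).append(inbox)
        return inbox

    def publish(self, topic, payload):
        seq = next(self._seq)
        for inbox in self._subs.get(topic, ()):
            inbox.append((seq, payload))
        return seq
```

A subscriber that only cares about the current steering angle would use `depth=1`, while a logger could ask for a deeper inbox on the same topic.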

If you want to see a horrible but functional approach from a different world, look at OSEK/VDX, which is used in the host processors, especially the COM module:
http://portal.osek-vdx.org/index.php?option=com_content&task=view&id=9&Itemid=13
Just remember that this is probably inside your car :).
 
Indeed +Vaclav Mocek , QNX dominates currently shipping automobiles, as documented in slide 5 of my talk next week:  http://she-devel.com/Chaiken_ELC_2013.pdf   Similarly OSEK is what runs on most MCUs performing simpler control functions.   No one knowledgeable about automotive expects Linux to displace MCUs and RTOSes any time soon.
 
+Alison Chaiken
If there are real-time constraints, an RTOS has to be used, not Linux. OSEK is an example of an open automotive RTOS which is able to cope with hundreds of messages coming from buses in a real-time manner and which is widely used.

The current automotive 'multimedia' ECUs (camera systems, satnavs, not head units) usually consist of a 'host' processor and a 'main' processor. The host processor is connected to the CAN bus, runs an RTOS (OSEK) and offloads handling of all real-time events; the main processor does the actual job.

The number of messages passed to the main processor can still be high: steering angle, speed or PDC distance (4 sensors+) could be updated every 40-50ms, plus a lot of internal events such as new camera frames, data from GPS, gyroscopes, RDS etc.

Linux is not in as bad a position as it might seem here. The complexity of our SW has grown enormously during the last few years, as has the networking topology in cars, and I feel that the way I have designed applications so far is no longer possible. The fact that RTOSes mostly use only threads, sharing the same memory space without any protection, is a source of many delivery delays. A solution is to use a mature operating system and, where it makes sense, to split the application into isolated parts (processes).

I would need to handle ~400 events per second in time frame <~250us, the rest are interrupt driven things and given the architecture and the speed of processors planned for future projects, Linux can do that and a suitably designed messaging system would help a lot.
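A quick back-of-the-envelope check of those figures, assuming (this is an interpretation, not something stated above) that each event must be fully handled within 250us and that the events are spread roughly evenly:

```python
EVENTS_PER_SECOND = 400      # figure quoted above
DEADLINE_S = 250e-6          # assumed to be the per-event handling budget

# Average spacing between events if they arrive evenly.
mean_gap_s = 1 / EVENTS_PER_SECOND

# Fraction of one core consumed if every event uses its full budget.
worst_case_busy = EVENTS_PER_SECOND * DEADLINE_S

print(f"mean gap between events: {mean_gap_s * 1e3:.2f} ms")
print(f"worst-case busy fraction: {worst_case_busy:.0%}")
```

Under those assumptions the average load is modest; the hard part is guaranteeing the 250us bound on every single event, which depends on worst-case latency rather than throughput.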

Anyway, time to move on to something else.
 
+Vaclav Mocek The use of Linux with (for example) a Cortex-A and Cortex-M serving as "main" and "host" processor will certainly have some advantages.    For the camera case, the capabilities offered by V4L2 may be superior, for example.    TI seems to be thinking along these lines with its Jacinto/OMAP5 offering.
 
+Alison Chaiken
Heterogeneous SoCs sharing the same memory are just a pain. I would prefer N x A9 with NEON + GPU with OpenCL any time ...
 
+Vaclav Mocek, please don't say things like: "If there are real-time constraints, an RTOS has to be used, not Linux", because it doesn't mean anything and perpetuates the idea that only something like an RTOS or QNX is suitable for "proper real-time".

The thing is, when you buy into QNX, for example, you get a nice fluffy promise and no specifics. How many teams measure the latency for all the hardware and drivers they need to see if they behave deterministically, before slapping their application on top? In my experience, they simply buy the vendor's promise and trust that the vendor will hold their hand when the time comes.

In Linux land, there's no generic promise of all things to all people. Instead, we say things like: "it will probably work, but if it doesn't work you can always fix it yourself", or "it will probably work, but you need to look carefully at your exact requirements", or worst of all: "you probably need <another product>".

This isn't what people want to hear when they're short on time, short on skills, don't understand the exact requirements and are generally trying to minimise risk. The person who makes the decision doesn't want to see or touch code, they want a product that "works" and for someone to tell them that it will all be OK. 

For many of these cases, Linux will be better than OK once the hand-holding problem is solved.
 
In some previous posts here netlink has been mentioned. Why is dbus in the kernel a good idea when there is already netlink? Netlink provides one-to-one communication, and also broadcasts.
Maybe it's because I'm not aware of how things work in detail, but can you please explain where dbus does things better than netlink? Thanks in advance,

Stef Bon
 
+Stef Bon as far as I understand, netlink needs a port to identify a service, whereas dbus has a string identifier representation.
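A minimal sketch of that difference: netlink peers are addressed by a numeric port ID, while a D-Bus service is found by a well-known string name that the bus maps to whichever connection currently owns it. The `NameRegistry` class below and the name `org.example.Navigation` are hypothetical and only illustrate the lookup, not the real bus protocol.

```python
class NameRegistry:
    """Toy bus-name table: string names resolve to numeric connection IDs."""

    def __init__(self):
        self._next_id = 1
        self._owners = {}   # well-known name -> owning connection ID

    def connect(self):
        # Every connection gets a numeric ID, comparable to a netlink port.
        conn_id = self._next_id
        self._next_id += 1
        return conn_id

    def request_name(self, conn_id, name):
        # The first connection to ask for a name owns it.
        if name in self._owners:
            return False
        self._owners[name] = conn_id
        return True

    def resolve(self, name):
        # Clients address the service by name and never hard-code the ID.
        return self._owners.get(name)
```

With this indirection a client can keep using `org.example.Navigation` even if the service restarts and comes back with a different connection ID.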