For those who believe the systemd developers are reasonable and will listen to constructive criticism.....
How can writing out a lot of information prevent the system from booting?
Ted Lemon
It doesn't.   It prevents it from being usable once it's booted.

I think the problem is a particular developer, not systemd as a whole.   That guy really needs to lose his ego.
For better or for worse Kay and Lennart are two of the most important developers on the project.  As such, their attitudes and words have huge sway over the project.
Kinda reminds me of some folks a few years ago who decided to name their program "zoo."  For a while they paid no never mind to those who were saying "but there's this compressing and archiving program, written originally by Rahul Dhesi."  The difference is they eventually relented, and recognized there was...ummm..."prior art."

They don't even have the decency to use something like "systemd.debug=1"?
+Ted Lemon So the link Ted posted deliberately misstates the case? Given the snarky tone as well of that link, it seems more than one is behaving badly.
It's pretty clear systemd devs don't care to play nice. Sad story. Glad I'm not compelled to use their init.
If you feel you need to avoid mentioning systemd by name ("a specific userspace init program") when proposing kernel workarounds for it, you might be doing something wrong. :)  

[Edit: or they might be.]
The original systemd bug says the voluminous output really does break booting due to timeouts. So that's a bug in whatever is failing for too much output. Is it a kernel timeout or somewhere else?
The funny thing too is the insistence that namespaces should be used, coupled with the refusal to use them. It sounds as though the kernel has a bug... Flooding the ring buffer? Lockup with "too much" output? Iunno.
There are two definitions of "boot".  There is having the kernel come up and transferring control to /sbin/init.   Then there is actually coming up to a proper login prompt or login window.

I'm sure it's not a kernel timeout, but probably some kind of cascade failure where (say) journald got too much input, and then logged a debug message indicating that it had dropped some input, which caused more logging, which caused journald to get even more badly tangled up, etc.

It may have been that with certain kernel modules, the kernel debug logs plus the systemd debug logs were enough to cause journald to melt down.   (And maybe the systemd developers didn't have that particular hardware configuration, so they didn't see the cascade failure.)
+Paul Morgan Actually, the kernel does own the command line, both from a moral perspective, and because if necessary, we can filter debug from /proc/cmdline such that systemd never sees it....

So this is not a battle systemd can win....
Ah! Then is this not actually a journald bug, if you're right? Given that a huge problem with upstart was its near total absence of debugging output (in the early days) I'm pretty deaf to any idea that there is "too much". 
+Thomas Bushnell, BSG well, that bug is actively being worked on. If you read the rest of the LKML thread (see the rate limit patches that +Linus Torvalds is writing). But still, using "debug" as a way to spit out tons of output, especially back into the kernel, is not a nice thing to do by userspace. The passing of the kernel command line parameters is a "feature" and should not be abused.

Boris wasn't the only one to see this, and there were even Red Hat bugzillas that were filed (and ignored) about this as well. There are several things that root can do that can pretty much DoS the machine. Currently, crazily writing into kmsg is one of them, and it is a bug if a root process happens to do that.
Ted, there is no "moral perspective" here. I have no ax to grind, but the kernel guys here seem to be taking a low ground. It's as if they were pleased by the opportunity to start a war. Who cares who wins?
+Thomas Bushnell, BSG Oh, there's definitely such a thing as "too much".    Suppose ext4 tries logging information when a process writes to /var/log, and systemd starts writing to /var/log/messages, which causes more ext4 logging, which causes more writes to /var/log/messages, etc., etc.

This is why it's not just about having the ability to turn on debugging, but also to have fine-grained control over what debugging information is useful for solving your particular problem.

This is why I tend to use ftrace and tracepoints for debugging, and not use printk.   So to be fair to both sides, I don't find "debug" on the command line to be that great of a design.  It's just not fine grained enough.
Linus' response to that patch suggests that this isn't an isolated incident.  That's definitely not a war the systemd team is going to win ...
Note, that prune the command line patch was only sent to start a discussion. And because of that, it was one of the most successful patches I have ever written :-)
+Thomas Bushnell, BSG I agree that there were escalations on both sides.   But the kernel command line does belong to the kernel, since it was originally designed for use by the kernel, and the kernel controls what (if anything) the userspace can see of the kernel command line.
+Theodore Ts'o My first temptation would be to filter by priority, e.g. KERN_CRIT etc., if not adding more filtering power to make it manageable.
+Theodore Ts'o, I have the same sort of philosophical disagreement with the GNU GrUB folks, who will insist I should never use "vga=<VESAmodeID|ask>" despite it being defined and documented in the Linux source tree text file with Linux kernel boot time parameters for YEARS.

Come to think of it, when I was trying out Ubuntu I was perplexed why my boot halted when I used "panic=15".  Apparently at some time, the initramfs-tools were using this for something else.  Meh.  I just worked around that one with echo 15 > /proc/sys/kernel/panic as a local hook script in initramfs.  But I similarly thought, this is odd for something in userspace to reuse something which has been well established for the kernel.
+Steven Rostedt I think we probably disagree about whether it's OK to be inflammatory as a tactic for starting discussions or "making people think".
+Thomas Bushnell, BSG The fact that systemd is sending input into the kernel's ring buffer is its own questionable design decision.  I don't understand why they are doing that, instead of sending it directly to journald.   If we significantly rate limit how much crap people can pour into the dmesg buffer, that will force systemd to find a better way.

And I have a hint for the systemd folks.   If the goal is that they want to save logging information before journald has started, all you have to do is buffer the information in a userspace process, and once journald starts, or once the file system is mounted read/write, you can then dump to journald, or write it to a file.

I wrote a program to do exactly that in e2fsprogs --- it's called logsave(8), and it was designed to save the output of the fsck of the root file system, until the root file system can be remounted read/write.   Note that I did *not* spam the dmesg buffer from e2fsck, but instead saved it to the logsave process.

This is not hard....  I'm guessing this was the case of some systemd developer being lazy.
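The buffer-then-replay idea described above can be sketched in a few lines. This is not how logsave(8) itself is written, and it is not anything systemd does; it is only a minimal Python illustration under stated assumptions. The class name `EarlyLogBuffer` and the `sink` callable (standing in for "journald is up" or "the file system is now read/write") are hypothetical.

```python
from collections import deque


class EarlyLogBuffer:
    """Hold log lines in memory until a real sink becomes available.

    Hypothetical illustration only; `sink` is any callable taking one line.
    """

    def __init__(self, max_lines=1000):
        # Bounded on purpose: if early boot is very chatty, the oldest
        # lines are dropped instead of memory growing without limit.
        self._pending = deque(maxlen=max_lines)
        self._sink = None

    def log(self, line):
        # Before a sink is attached, keep the line in user space.
        if self._sink is None:
            self._pending.append(line)
        else:
            self._sink(line)

    def attach(self, sink):
        """Attach the real sink and replay buffered lines in order."""
        self._sink = sink
        while self._pending:
            sink(self._pending.popleft())
```

Usage would look like `buf.log("fsck: pass 1")` during early boot, then `buf.attach(write_to_log_file)` once the log destination exists, where `write_to_log_file` is whatever function the caller supplies. Nothing in this sketch touches /dev/kmsg.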
+Theodore Ts'o The kernel only owns /proc/cmdline in as much as it ignores what it does not understand.

if i pass `foo=bar` on the kernel cmdline, the kernel is free to ignore it. But the kernel is not allowed to alter what i passed as args.

if the kernel wants to say, "ignore everything after 5 MiB of /proc/cmdline", that's ok too. it's reasonable.

but neither kernel nor any other process should have the privilege of altering /proc/cmdline. it's go/no-go. pass/fail.

if a process swallows params, that's an alteration outside the purview and responsibility of said process.
+Theodore Ts'o sure, but it's absurd on a modern CPU to limit dmesg size or rate. It made sense back when dmesg code was written. Is it lazy that it still is written for a 1995 computer? Probably just sensible allocation of developer time. The same thinking may be true for the systemd peeps.
+Thomas Bushnell, BSG was I being inflammatory? I was trying to be polite. I basically was following akpm's approach on getting things done. That is, he once stated that he posts a very bad patch and threatens to submit it into mainline unless a real fix by the maintainers gets done. That's basically all I did.
+Thomas Bushnell, BSG The reason to throttle dmesg is because the dmesg buffer also doesn't belong to systemd.  The interface was originally designed for small amounts of information --- for example, "I'm starting to run test shared/127", so you can see how various kernel errors or BUG_ON's can be interleaved with certain external events (such as starting to run a particular test).  This is how we use it when running kernel tests inside Google, for example, and this is a proper use of /dev/kmsg.

Dmesg was never designed to be a syslog replacement, which appears to be how systemd is abusing it.   So the rate limit simply forces /dev/kmsg to be used the way it was originally intended.
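For readers unfamiliar with rate limiting, a token bucket is one common way to allow short bursts of messages while capping the sustained rate. The sketch below is a generic Python illustration, assuming a hypothetical `TokenBucket` class; it is not the kernel's rate limiting code and not the patch discussed on LKML.

```python
import time


class TokenBucket:
    """Allow up to `burst` messages at once, refilled at `rate` per second."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.burst = burst        # maximum tokens the bucket can hold
        self._clock = clock       # injectable so the limiter can be tested
        self._tokens = float(burst)
        self._last = clock()

    def allow(self):
        """Return True if one message may be emitted now, else False."""
        now = self._clock()
        # Refill in proportion to elapsed time, never above the burst size.
        self._tokens = min(self.burst,
                           self._tokens + (now - self._last) * self.rate)
        self._last = now
        if self._tokens >= 1.0:
            self._tokens -= 1.0
            return True
        return False
```

A writer would call `allow()` before each message and drop (or count) the message when it returns False, so a flood from one source cannot crowd out everything else in the buffer.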
+Paul Morgan That may be your opinion; but that and $3.00 will buy you a small latte at Starbucks.
+Paul Morgan, seems to me /proc/cmdline is a courtesy to userspace.  It could just as well reasonably swallow EVERYTHING except what it doesn't understand.  After all, that was its original purpose and use, to give the kernel boot time configurability, no?  The fact that it also manifests those strings to userspace to me is a bonus.
I like the suggestion there to halt the kernel if systemd is detected.
it's long-standing tradition that /proc/cmdline is a string. anybody that can read /proc/cmdline is free to interpret that string, for better or worse.

/proc/cmdline is somewhat special in the sense that it's the "root" for deterministic source-of-truth to start the host. if you allow a process (kernel or user-space) to swallow, ignore, or alter /proc/cmdline, then you need to provide a mechanism to pass args.

if you provide a mechanism to pass args, where does the rabbit-hole go?
how deep? do you have a mechanism to check the mechanism?

as an admin/engineer/whatever, i have the power and authority to specify kernel cmd-line params from a static config (even if that config is dynamically generated via a dynamic provisioning process). at some point, i provide a deterministic cmdline. 

there must be a "root" cmdline that is authoritative.

if not, we must recompile the kernel and every possible userspace that could possibly swallow params from /proc/cmdline. this is not a sustainable model.

do i need to create an overlay for the overlay?
+Paul Morgan let me point one thing out. The only one that can prune it, is the kernel. No userspace process can modify it. /proc/cmdline is a gift to userspace from the kernel, and if that gift is to be abused by userspace , then perhaps the kernel will take it away.

That said, I had no intention of having that patch actually applied. But it was rather a big hammer to point out that systemd is abusing it. Even +Linus Torvalds said that it was fine if systemd or any other process printed more debugging if "debug" was stated. But the core issue is that it caused systemd to basically freeze the system, and when this was reported to the systemd developers, their response was "tough". And they basically said they can do anything they want with the "debug" option in the cmdline output, because nobody owns it. Although you say the kernel does not own it, the kernel is the only one with the power to change it. I would say, that's ownership.
This level of disagreement on something this trivial to address is why we can't have nice things.
+Steven Rostedt 

> The only one that can prune it, is the kernel.


> /proc/cmdline is a gift to userspace from the kernel, and if that gift is to be abused by userspace , then perhaps the kernel will take it away.

negative. /proc/cmdline is not merely a gift. it's the root of boot-time.

> the core issue is that it caused systemd to basically freeze the system

that's a systemd problem. if systemd parses "foo", "debug", or "bar", and subsequently does the wrong thing, that's a systemd bug, not a kernel bug.

does systemd incorrectly parse `blahdebugblah` as `debug`?
does systemd incorrectly parse `debug=foo` as `debug`?
does systemd incorrectly parse anything?
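The three questions above are about exact token matching. A minimal Python sketch of a parser that gets those cases right follows, assuming the command line is plain whitespace-separated tokens (real command lines can contain quoted values, which this ignores). The function names are hypothetical, and this is not systemd's parser.

```python
def parse_cmdline(cmdline):
    """Split a command line into {name: value}; bare flags map to None."""
    params = {}
    for token in cmdline.split():
        # partition() splits on the first "=" only, so "a=b=c" keeps "b=c".
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params


def wants_debug(cmdline, name="debug"):
    """True only if `name` appears as a bare flag, matched as a whole token."""
    params = parse_cmdline(cmdline)
    return name in params and params[name] is None
```

With this sketch, `blahdebugblah` and `debug=foo` are not treated as `debug`, while a bare `debug` token is. A userspace program that wanted its own switch could pass a namespaced name such as `systemd.debug` (the spelling floated in this thread) as the `name` argument.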

that's a systemd problem, and distros are wise to cast a distrusting eye to anything and everything that poses as pid 1. pid1 is near the kernel in importance wrt robustness.

> And they basically said they can do anything they want with the "debug" option in the cmdline output, because nobody owns it

yep, systemd can do whatever they want in parsing /proc/cmdline.
and if they do irresponsible things, i will responsibly choose a pid 1 that does not act irresponsibly. (and by extension, distros that avoid irresponsible and arrogant behaviors).

> Although you say the kernel does not own it, the kernel is the only one with the power to change it. I would say, that's ownership.

with great power (ownership), comes great responsibility. there can be only one (root). that goes for both kernel and pid 1, regardless of what pid 1 happens to be.
+Steven Rostedt I've re-reviewed the original patch, and my fundamental problem with it is the juvenile verbiage in the commit message.  words have meaning.
+Paul Morgan anyway, this is all just bikeshedding. The real problem is that a bug was shown and the systemd developers wanted to ignore it. Their excuse was that "debug" on the kernel command line is generic and not for debugging the kernel only, and the kernel should use something else if it wanted (loglevel for example). The "hiding" of debug was only to point it out to them, that this was not an option, and they had better fix their shit or we may need to do something drastic.
+Paul Morgan really? I was just being descriptive and polite. I didn't swear, curse or even call anybody names. If you find me juvenile, I'll take that as a compliment, as I consider myself rather young at heart :-)
+Steven Rostedt 

> their excuse was that "debug" on the kernel command line is generic and not for debugging the kernel only

exactly. and since the kernel is first consumer of /proc/cmdline, it's their prerogative.

if anybody other than kernel wants to parse `debug`, then it's up to that "somebody" to properly namespace their parser to avoid collisions.

at $dayjob, we use args like `$dayjob_var=foo`. it works for now. if kernel collides with us, we'll change for upstream instead of using generic var and blaming upstream.

> If you find me juvenile, I'll take that as a compliment, as I consider myself rather young at heart :-)

+1 to that! count me in for young at heart :-)

the patch commit message reads...

>> we can keep the users from seeing stuff on it if we so choose

yep. and users can reject such statements.
+Greg Kroah-Hartman has submitted a patch to fix the debug thing.  Hopefully it gets approved and we can just move on.

These guys are going to meet next week, so likely they'll discuss the situation then.

That said, I think it's kind of silly.  There probably should be some discussion of namespace.  'Debug' should probably have 'kernel.debug', or 'systemd.debug'.

I do agree that just using 'debug' and then turning your system unusable is not a good default. It completely defeats the purpose of using 'debug', and it should be tuned to be at least usable.

But let's not kid ourselves here, this was a show of power. :P  With a coup de grace from Linus not accepting any of Kay's patches because he's a bad actor.
This whole talk of ownership is pointlessly hostile. If the kernel owns cmdline, does not user space own user space? Does systemd not have the right to block login as it pleases? Is not the kernel its library? The whole talk is absurd. And no, +Steven Rostedt, your message was not polite. I take you at your word that you intended politeness, but you did not succeed by a long shot. Pointing out whatever misdeeds your opponents may have done cannot justify your own. The whole language of your message indeed is about opponents, battles, ownership, and hostility. Given that you could just as easily have proposed for yourself the very cure you offered your enemy (namespaces) and rejected it only in the interests of maintaining a fight, your protestations ring hollow in my ears.
+Paul Morgan I think what you (and others) seem to miss is that the systemd people made the "debug" option that we introduced not just do something - but do something useless that actively broke other people's use of that option.

It doesn't matter who "owns" it, the fact is, they broke it.

Ok, fine. Bugs happen, and that's not what makes people upset.

What makes me (and others) upset is that when the bug is reported, with explanations and a suggestion for how to fix it, Kay just closed the bug-report, claiming it wasn't a bug.

Seriously? You want to debug kernel stuff, using the kernel command line command "debug" that makes the kernel more verbose, and now the systemd people say "sorry, we stole your thing and made it useless, and it's not a bug because you didn't call shot-gun".

Now, if this was an isolated incident, I personally would let it go. There are bad engineers out there, it's not worth worrying about. Ignore them and move on.

But this is not an isolated incident. This is how Kay has treated other bugs in the past. Literally months of stalling, closing bug-reports, and blaming other people and projects for problems that he caused, telling others how they should change their projects because he broke something, and obviously it can't be his fault.

And that is a problem. 
+Linus Torvalds 

> I think what you (and others) seem to miss is that the systemd people made the "debug" option that we introduced not just do something - but do something useless that actively broke other people's use of that option.

nope. i do not miss that. i argue in favor of kernel being the authoritative source of truth for params. that's precisely my point. with all respect to +Thomas Bushnell, BSG and my own statements about namespace, it's not the kernel's responsibility to namespace kernel params. It's the responsibility of the userspace parser to correctly interpret /proc/cmdline.

if anybody outside the kernel usurps a param for their own purposes, it's a bug in the process that parses /proc/cmdline. not the kernel.

if kernel introduces a new param, it's up to userspace to adapt.
+Thomas Bushnell, BSG Umm? When, exactly?

We have a very strict "don't break people's programs" policy. We've done it exactly so that people can upgrade kernels without having to worry about it. Bugs happen, but we do consider them bugs. Serious ones.

We have some cases when security issues means that "oops, we really have to change semantics", but we try to avoid it. And the graphics people have often forced their own nasty flag days (so X doesn't reliably work across kernels if you go back about five years or so), but I think they've also seen the error of their ways.

And you do realize that +Steven Rostedt's patch happened as a response to the fact that the systemd people didn't own up to their bug? So in order to make the option useful again, it needed to be controlled.

Now, admittedly I really didn't like that patch, and there's a different approach floating around as a patch that I wrote, although I don't have confirmation that it limits the problem sufficiently yet.

And it turns out that the adult in the systemd saga (who is also one of the very major kernel developers) is trying to get the original systemd bug fixed.
+Thomas Bushnell, BSG OK, politeness may be in the eye of the beholder. I'm rather blunt, and usually just say what I think, and that is considered rude by many people. I don't consider it rude, but many people do. People who know me, don't ask me questions they don't want to hear the answers to. Because I will give it to them. My wife never asks me "does this make me look fat?", although I don't know why, as she's very thin and I would honestly answer "no". But still, she knows that I don't even give false answers to those types of questions (It's amazing I've been married to her for over 20 years).

Anyway, for both you and +Sriram Ramkrishna, who say we could have simply added "kernel.debug" as we suggested that systemd should add "systemd.debug". Well, "debug" has been with the kernel longer than it has been in git (I looked it up, and it was in Linus's original commit). That is, "debug" as a kernel option has been around much longer than systemd has, and you expect us to say, "Oh damn, systemd now conflicts with the kernel "debug" option, we must bow down to the gods Kay and Lennart and change our ways". I'm sorry, but you don't want to hear my answer to that question.
+Thomas Bushnell, BSG _I_ care about responsibility, and so does every person in this thread. The fact is the kernel considers it a bug to break userspace. By the same token, userspace should strive to be responsible wrt kernelspace.
+Thomas Bushnell, BSG "it is absurd on modern CPU..."
Whatever you define as a "modern CPU". But Linux (unlike maybe other OS) is not only running on Desktop and Laptop systems. By numbers, the installations on such devices are probably only a small fraction of the whole installed base. Linux also runs on many small and embedded systems. So wasting resources just because "we have them on a small percentage of all systems and maybe a misbehaving tool might make use of them" does not seem very reasonable to me.
+Steven Rostedt I'm not saying you should bow down to anyone.  I'm just saying it's probably good to hash out the namespace. 

 I probably would leave the 'debug' alone as part of the kernel because it is legacy and it is a well known and used option and probably wouldn't make sense for the now.  But could be something to move to in the far future.

Adding the namespace just makes it clear.
+Sriram Ramkrishna I'm not against adding a kernel.debug as a separate parameter. I was just pissed that Kay would not realize this was his bug. The only way to get him to listen is to do something drastic like make "debug" disappear. Believe me, the inclusion of that patch wasn't the objective. A wake up call to Kay was.

BTW, I think we would have also been happy if Kay turned around and said, "oh we shouldn't spam the kernel that hard, we should rate-limit our output" and even without doing a "systemd.debug", this would never have escalated as high as it has. But instead he said "This is not a bug, won't fix". That is the issue here.

He said, "take it to the mailing list" and that is exactly what I did.
The kernel's command line is "generic"? Why would anybody think that? Oh, because you can look at its command line? No, that dog don't hunt. Because you can put things on the command line that the kernel will ignore? Okay, but what about the things the kernel acts on? Those things are not generic and you don't get to hang your semantics on top of them.

I'm getting the impression that the systemd folks are fairly incompetent. Good coders, maybe, but incompetent hackers.
+Linus Torvalds My comment was not meant to suggest in the least that I agree with them. Any high-profile maintainer of a widely-used bit of free software has to say no to people, and sometimes say no for months on end to the same thing. Sometimes people will get upset, rightly or wrongly, and describe them in terms similar to those you used about Kay. I worked for many years for a man, as you know, who believed the most important thing in any interaction was to make sure to know who was an enemy and who was a friend. The result is that despite succeeding in his life's work in an extraordinary fashion, he believes himself to be a failure and almost bereft of allies. The experience has made me profoundly allergic to that dynamic.

I'm not in a position to judge Kay or +Steven Rostedt, and certainly not on the basis of one interaction. Kay's manner on that bug did seem childish to me. But +Steven Rostedt's manner in his patch seemed equally childish. I'm grateful to you for your post, because it gave me the words to understand why I feel as I do about this case. When a child behaves badly, it is the mark of an adult to respond calmly, and the mark of a child to behave back on the basis of "he hit me first", "this is mine", "i have more power", "you'll never hit me again!" and the like. Of course +Steven Rostedt's was a response to a seemingly childish thing from Kay. But that doesn't make it any less a childish response to Kay.

I'm grateful, and not surprised, that the adults (such as you, certainly, and +Theodore Ts'o from my long acquaintance) will end up in the right place, and that there are adults among the systemd maintainers too, as you allude to.

+Steven Rostedt I don't know whether you're unaware or obstinate, or (if unaware) whether willingly or unwillingly so. Your decision to have a "blunt" style does not somehow excuse being childish or rude. It doesn't matter much whether you consider it rude or not, any more than you can decide to count small bits of dust as money and leave them at a restaurant as payment. That's stealing, and saying, "well, as far as I'm concerned, in my world view, that counts as real money," is irrelevant.

The only other person I've ever heard treat a technical dispute as a question about whether someone is being forced to bow down to someone else, by the way, was the aforementioned man I worked for for eight years: RMS. It is the mark of an adult (which, frankly, RMS never learned to be) to understand that letting people have their way for a time and routing around them is a productive strategy; it is the mark of a child to believe that one must preserve one's ego before all else.

+Marco Tedaldi This is about kernel debugging on systems with systemd. That's not tiny cell phones. 
The second way I must thank +Linus Torvalds for his language as being so helpful to my own understanding here of my reaction is this. There are adults among the systemd maintainers, and as you and others here insist, also folks who behave consistently as children, and there are people (like probably most of us, actually) who are sometimes adults and sometimes children.

That's what bothered me about +Theodore Ts'o's use of this example as evidence that the systemd maintainers are not reasonable and won't listen to criticism. It's not that at all. It's evidence at most that one of them was unreasonable and childish. And there are others among them, as you say, who are adults.

The same is true of the kernel maintainers. Some are consistently adults, some are consistently childish, the vast majority are sometimes one and sometimes another in varying proportions and varying times. An example of one person being childish cannot be evidence for what "the systemd maintainers" are like, any more than +Steven Rostedt's intemperate patch is evidence for what "the kernel maintainers" are like.

All here are individual people, not exemplars of group tendencies.
I think +Steven Rostedt explained himself fairly well.  The patch was simply to call attention to the behavior.  Kind of a move to bring someone back to the negotiation table when they aren't listening so to speak.

I agree this shouldn't be some kind of evidence that systemd folks are a bunch of immature children.  We can all be a little hard headed sometimes, especially if we agree in our own minds that we are correct.  I'm sure each of you have been in that situation at some point or another.
+Thomas Bushnell, BSG wow, that is the first time anyone has ever compared me to RMS. People who know me would find that rather amusing.

I like to do things to get things done. This is the reason why I'm rather blunt. I don't like to beat around the bush as I find that wastes time, and time is the most precious thing all humans have, as we have so little of it. If you find my tactics childish, so be it, my wife would probably agree with you.

But honestly, if I had tried a more "mature" route, I could guarantee you that I would still be arguing my case and nothing would be done. I may get a fix in a few months or so, or perhaps nothing at all. Again, that would be a waste of time.

Hindsight is 20/20, and looking back, I would have done it exactly the same way. Because I don't see any other way that would have been as productive.
+Thomas Bushnell, BSG, regarding your question of how can writing a lot of output prevent booting, I guess the following can be one way to cause it:

Have a parent process start a child process. Attach from parent to child process output (I guess in Linux land we call that a pipe?). Fail to process output in your parent process at all (or quickly enough), that is: cause the output buffer to fill, consequently blocking the child process.
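The "buffer fills, writer blocks" effect can be observed directly on a Unix-like system. The Python sketch below is only an illustration of a full pipe, not a reproduction of the systemd bug; the function name is hypothetical. It puts the write end in non-blocking mode so that the point where a normal writer would stall shows up as an error instead of a hang.

```python
import os


def bytes_until_pipe_blocks(chunk_size=4096):
    """Write into a pipe nobody reads; return the bytes accepted before
    a blocking writer would have stalled."""
    read_fd, write_fd = os.pipe()
    try:
        # Non-blocking mode turns "would block forever" into BlockingIOError.
        os.set_blocking(write_fd, False)
        written = 0
        chunk = b"x" * chunk_size
        while True:
            try:
                written += os.write(write_fd, chunk)
            except BlockingIOError:
                # The pipe buffer is full and nothing is draining it.
                return written
    finally:
        os.close(read_fd)
        os.close(write_fd)
```

The number returned depends on the system's pipe capacity. A child that writes with ordinary blocking I/O would simply stop at that point until the parent reads from the other end.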

I saw something like that happen in Windows 7 with java parent process. Turns out that you have to manually attach to child process (doesn't seem necessary for parent process written in other language - Java 7 has improved this weird behaviour) and read what it writes (well, you don't have to actually read every byte, just connect :) ). Of course fun thing happens when your parent process dies (you end it) and suddenly there is nothing to read output. Java really sometimes is painful to work with.
+Steven Rostedt There should be an analogue to Godwin's law for Linux. Something along the lines of "As the discussion grows longer, the probability of a comparison to RMS approaches 1".

Also, the issue was ignored for quite some time. The first, internal, report that I'm aware of was on 1st Dec 2013 (1036400, for RHers), and it was completely ignored (not a word from anyone except the reporter). Then, initially, some feedback was given from the systemd crowd: "kernel, please, use ignore_loglevel instead". Then silence. Then it got looked at thanks to +Tom Gundersen (who was kind enough to contact someone responsible after seeing my G+ posts, thanks again!), but the only outcome was "that's a huge benefit for users, but kernel guys are capable of using a longer command line option", CLOSED, WONTFIX (so it's actually not only Kay's modus operandi, at least on this example).

And then, finally, got some love, thanks to upstream discussion.

So this issue has a long background, and was discussed "at great length" by systemd folks. The outcome of the discussion was "WONTFIX", or at least seemed to be (and they've acted the same way).

So I don't really think that Kay's the (only) one to blame here. IIRC it wasn't even his commit that introduced it (I might be wrong, though), other core devs acted the same way, and it was discussed internally.
+Veaceslav Falico And yet, it's true: the kernel devs could use a namespaced option. The issue has everything to do with egos and bowing and theories about who owns what. Feh.

+Steven Rostedt You continue to think that your decision to be "blunt" somehow trumps the social fact of what that means. You want to be speedy, and if you run over people in the process, it's your policy? That makes it worse, not better.
I don't get the namespace issue. I have been working on various initsystems for some time, and we routinely reuse the kernel's commandline options and interpret them to also apply to us (in a consistent way of course).

To take a few, we (early user-space) read "root=", "rootflags=", "rootfstype=", "init=", "ip=", "ro", "rw", "quiet", and "debug". If this is a big no-no, we have a bigger problem than just 'debug'. My take is that in general it makes sense to reuse these options, but there may of course be exceptions.

Secondly, through all the hot air, it is a bit hard to tell if there is actually a bug here. I.e., if both systemd and the kernel are in debug mode (through whatever mechanism) does that prevent boot or login? Does it actually break/garble either debug output (I don't think so, but it is hard to tell exactly what people are claiming here)? Either of those would clearly be a problem regardless of the particular command-line options, so why don't we focus on solving that?

While I don't really respect the amount of vitriol coming from some people in this discussion, I did find +Steven Rostedt's original patch to be an amusing way to start the discussion (and I was half looking forward to adding dmesg parsing to systemd to extract the kernel commandline ;-) ).
I see comments above that when requested to discuss on the mailing list, that's what happened, but I'm struggling to see any discussion on the mailing list. Please correct me if I'm wrong.

It seems reasonable to me that, on a slightly subjective point, the bug should be closed and the discussion moved to the list. The outcome of that discussion could see the bug reopened, or a different fix/approach could be identified which warrants a different bug being opened, or a patch might just come out as a result.

Ultimately, the bug report seemed quite strange to me in that it presented a problem and then presupposed the correct solution (i.e. stopping systemd parsing "debug"). To me this isn't a solution at all, just papering over the cracks of one trigger case. If it fails with "debug" just now, then it will fail with "debug systemd.debug" tomorrow, so this is not a fix. You can call Kay all kinds of names for closing the bug report and requesting discussion on the mailing list all you like, but to me this seems like the correct action: more discussion on how best to solve the underlying issue is clearly needed (whether or not you feel systemd is abusing kmsg/dmesg or any other kind of construct can then be discussed there in a rational way rather than via passive-aggressive posts on social networks). Whether you agree with the closing of the bug before the discussion or not is a very minor technicality in this whole debate (one largely down to the preferences of the individual project in question) that has, quite frankly, blown up out of all proportion from people who should really know better.

Please discuss this issue on the systemd mailing list as requested. The fact this hasn't happened yet seems to suggest some other kind of desired outcome from the reporters. There is no need to spit the dummy. Have the discussion via a proper mechanism where more eyes can see the problem case and work on the solution.
+Tom Gundersen FWIW, I agree with you on the "interpreting the kernel commandline", when indeed needed.

The "bug" is that the previous, well-documented, well-described, widely-used etc. etc. workflow ("debug" in commandline) stopped being usable when running systemd, because of the huge number of useless (in the context of kernel debugging) messages from systemd.

I also don't really see any point in systemd doing this in the first place, actually. What's the scope? Someone wants to debug systemd early by adding just "debug" instead of "systemd.debug" because of... what?

And it's actually counter-productive for systemd debugging too - cause you end up having tons of kernel-related output which you don't always need. And if you really want kernel debug messages - you could always turn on the "debug" option along with "systemd.debug".

When these were pointed out in different BZs, the systemd team's reaction was, basically, "use loglevel", and closing them without any further input (please refer to my previous messages for examples). A lot of people (including myself) don't think that that's the way an issue should be handled; one might just say that's simple disrespect of the community. That might be an explanation of why there's so much vitriol.
+Colin Guthrie you have been presented with two and a half bugs:
- You are reacting to a keyword you do not own, and a fix has been proposed.
- You are spamming dmesg; hopefully, after this public spotlight, that has a fix now.
- Apparently you aren't testing the options on enough systems (otherwise the first two would have been noticed); it would be nice if the experience helps prevent them.
Also, about the namespaces and "kernel.debug" suggestions - ... really? It's, first of all, kernel command line, and while everyone's free to parse it (thus it's exported), and, more than that, free to add private options (which kernel will silently ignore) - it's kernel command line, and if there's no namespace - the default would be to assume that it's kernel namespace.

Suggesting to rename long-existing defaults from kernel commandline in favour of systemd's decision to use it for itself is... kinda fun, if said politely. :)

Anyway, I feel like I'm either being trolled here, or that I've lived under a rock enough time to miss the announcement that some userspace component was declared the centre of the universe :).
+Veaceslav Falico I think what we care about here is not kernel devs debugging the kernel, or systemd devs debugging systemd; we can all set whatever commandline options are necessary.

The question is what is useful for an end user to enable when "boot is broken". I'm not suggesting that the amount and kind of debug output systemd gives is ideal (there may be tweaks necessary). But it makes sense to me that if "something broke boot", the user should pass "debug" on the kernel command line and as a result get debug messages from anything that may have broken boot (basically the kernel, the initrd and init).

But hey, maybe I'm wrong, and users don't want this. However, even if we are wrong, I think the reaction to this as seen in the above and related discussions is completely inappropriate...
+Veaceslav Falico I don't get why the suggestion of "kernel.debug" sounds stupid. There is already "loglevel=" for the kernel-specific log level, so there is no need for anything else (just like there is "systemd.log_level=" for the systemd-specific log level). Now if there was no way to set just the kernel log level, I'd understand the fury, but as there is... shrug
+Tom Gundersen I said that it sounds fun. There is loglevel=, but there is (and was for a long time) "debug", which people use for obvious reasons (easier to remember, easier to understand/grasp, documented all over the place etc.).

The "fury" is not because of the actual change, but because of the way systemd's developers handle the requests.
+Veaceslav Falico I think that how anyone handles such requests is always going to be subjective. If you are aligned to the kernel side and have been using the "debug" option for kernel debugging you'll likely automatically assume it's being handled badly. Personally, as a distro guy, I find that if I can tell a user who is having problems with their boot to simply use the "debug" command line and I can get a log from them with all the relevant low-level info, then it makes it much easier to triage problems, so I think the "hijacking" here is a very useful thing from an end-user perspective. Ditto for also respecting "quiet", which also means we don't need to rework distro scripts and tools for handling boot entries. However, I do accept how it could be considered a "regression" from a kernel developer's POV. I don't think Kay handled it overly badly: he did ask that discussion be taken to the mailing list and that didn't happen, so from my perspective that's really bad "handling" of the bug by the reporters. I'm sure you too can see both sides. That said, if Kay had worded things differently and said: "I accept some of your points but can we discuss it further on the mailing list first and I'll close this bug for now until the outcome of that discussion" then perhaps this whole brouhaha could have been avoided, but I don't think the underlying meaning is overly different. People just tend to (IMO) overreact to rather blunt statements.

Ultimately this is something that happened in the last 24hrs or so (and a good chunk of it was during European night time). There hasn't really been sufficient time to see how either side has handled anything yet (other than the ridiculous amount of vehement rhetoric before any real discussion has taken place which only serves to make people defensive and really doesn't make for the best start for any reasonable discussion). Ultimately we need more time to pass before anyone can really judge how this problem case has been/is being "handled".
+Veaceslav Falico yeah, I'd just ignore the kernel.debug suggestion, probably a joke :)

I get that people who are used to a particular usage pattern don't like it to change (especially if they use it a lot). However, we should keep in mind the big-picture here, namely designing a system that makes the most possible sense for end-users.

"debug" is indeed easier to remember, grasp and document. Which is why I believe it should be what most people should be using, especially if they don't know what is wrong (systemd/initrd/kernel). We should implement this in such a way that the debug info printed is as relevant as possible to the majority of end users.

Now, we probably need to discuss some more precisely what to print. However, just saying "this option is mine, I had it first, your change is annoying me" is not going to be taken very seriously.

It was regrettable that it took some time to answer some of the bug reports, but the way +Kay Sievers answered the linked report was, I think, reasonable. He correctly identified this as a design issue, rather than an actual bug, and tried to redirect the discussion to the mailing list (at which point the kernel developers decided they'd rather "teach him a lesson" and show him who's boss).
+Colin Guthrie WRT "useful thing from an end-user perspective":

1) Do you see a huge difference between telling a user "add debug" and "add systemd.debug debug"? I don't think so. And I don't see any benefit in that, again. As a support guy with quite a few years of experience dealing with customers (you've started it! :) ) I can say there's no difference between "Hi, can you add 'debug'" and "Hi, can you add 'systemd.debug debug'". There is, however, a difference for people analysing it. So, while adding no benefit, it only adds headaches for support people, kernel devs, even casual systemd debuggers who'll try to turn on 'debug' to see why their target isn't started and see lots of kernel garbage.
2) Using the same logic all services that start on boot should also parse the "debug" flag. For example, nfs is quite complex in dependencies and start-up, so it might be useful for it to parse cmdline for debug - this way you can tell the user to add "debug" to cmdline and will get this too!

WRT How it was handled:
I think it was indeed handled badly, from the POV of a developer who has worked with BZs for the last 5+ years. The BZ is a perfect place to discuss things, and if needed the discussion can be moved to the mailing list. But in no way should the BZ be closed before it has been proven that it's not a bug. That attitude only says "I consider this not a bug, conversation here is over - feel free to write to the mailing list", as I (and, apparently, lots of other, a lot more experienced folks) read it.

WRT It happened in the last 24hrs:
That's wrong. First report was on 1st Dec 2013, then there were several other reports, which were treated the same way - either ignored or CLOSED WONTFIX. Please refer to my previous messages for links.
+Tom Gundersen I disagree that kernel command line param "debug" is easier to remember than "systemd.debug" in the context of... systemd debugging. :)

Again, debug was default for kernel-only debugging for a long time, and people are used to that. If they'll need to debug systemd - they'll man systemd and add "systemd.debug", that's easy enough.

Now the "debug", which used to work for years, is not working. It's a regression, that simple. Was working - doesn't work. :)

Your idea is really good - and it was discussed for years in different global support groups :) - to add a one-off param to get all debug on - for kernel, pid 1, started processes etc. It turned out (at least, from what I've heard last) a huge overkill and kinda useless, but still it can be proposed and discussed. all_debug_on_start_the_beast ? :)

However, the "debug" overriding wasn't that "all debug on" - it was just a useless (from my POV) overriding of a long-existing, long-working kernel param which broke a lot of scripts that a lot of people used.

WRT The handling of BZs - it wasn't only Kay, please refer to the public RH BZ, it seems to be how (some) folks from systemd handle it. And I, again :), don't agree with you that it's the proper way to handle BZs, please refer to my previous comment to +Colin Guthrie .
I can't see a huge difference but I do see a difference. After all, many users who are typing this in won't even know what systemd is and I've no doubt some people would type it in as system.debug (the two d's being conflated) which in turn will lead to increased round trip time after getting incomplete logs etc. That said, I don't want to blow this out of proportion. I don't think it's too hard to specify both, but I also don't think it's too hard to just use loglevel either when debugging a kernel-specific problem. Ultimately my opinion on the matter isn't strong on either side (and also my opinion doesn't really count for much!)

That said, I think we disagree about your take on the "one true workflow" for BZ. You may have your opinions on things but not everyone will agree. I'm involved in several projects where the BZ workflow and processes are very different. To each their own, we just have to fit in (and argue for workflow changes via appropriate channels when appropriate). Ultimately BZ has waaay fewer eyes on it than the mailing list and it really stifles multi-dev input unless everyone religiously follows BZ notifications and reads every bug, which is pretty unrealistic.

I suspect that at the end of the day, this is a fight not worth fighting on technical grounds as there are reasonable, but not overly strong, arguments on both sides. Some of the technical changes that come out of this will likely be good tho' (rate limiting stuff is generally sensible and makes things more robust generally, and if there are alternative ways to capture all logs and record them sensibly then this may well be nicer too). Sadly this is now just a pissing contest and I expect, eventually, that systemd will likely remove the debug piggy-backing because the kernel side tends to shout louder and use a more aggressive tone than systemd does and ultimately it'll just make for bad PR when the playing field is uneven. But time will tell.
+Veaceslav Falico I'm not talking about systemd debugging. I'm talking about an end-user who is debugging a system that does not boot. They have no idea why that might be, if it is the kernel, systemd or the initrd. They also have not been using 'debug' for years, they will most probably use it precisely once in their life, when told to over the phone.

It appears you are still arguing from the point of view of a kernel/systemd developer, which is just not what our main focus is.

So, why just init/initrd/kernel? Because these are the things that may break your boot. Once you have actually booted your system and some service doesn't work, you can use journalctl/syslog/whatever to get more details.

Unless there is a bug I'm missing, 'debug' has not 'stopped working', it is simply behaving a bit differently. This is like saying that a redesigned user-interface is broken, because the buttons are not in the same location on the screen as they used to be, so when you press where you used to the wrong thing happens...

What we are discussing is UI design, not a bug (again, unless I'm missing something). Also, I sense an "us" v. "them" thing going, but that's also missing the point. We want to design a full operating system, not just individual pieces, so we need to look at the full picture and decide what's the right thing to do on a global scale.

Regarding handling. Different projects have different ways to handle different sorts of issues. If the dev tells you that in their project they do design discussion on the mailinglist, I don't see why that's a problem. Also, if the dev thinks there is nothing to be done, closing the bug is a reasonable communication that he is not doing anything more about it. If you believe he has missed something (i.e., that it was a bug after all), then you can reopen it...
+Thomas Bushnell, BSG I see +Steven Rostedt's patch just as a "If you continue to abuse the toy we gave you with /proc/cmdline's debug param, we'll take away the toy".  Something adults sometimes have to do so children grow up and become adults.  AKA "Saying 'No' to them".
+Colin Guthrie about how hard it is - we seem to agree that both are quite on the same level. However, in one way (the current way) the default, long-existing method is broken and unusable, which is a regression, and should be fixed.

On the BZ - the systemd developers themselves could start a mailing list discussion, if they really wanted to. They, instead, chose to ignore and just close BZs.

On the "systemd will remove it because kernel side shouts louder" - sorry, this view is also kinda funny. By "shouting louder" do you mean filing BZs, trying to get ahold of developers, and trying to fix a regression in userspace by patching the kernel side because userspace doesn't care? :)

I think that we fundamentally disagree on the situation here. I see it as a regression of a well-known workflow, whilst you see this as someone "shouting loud" and a "useful thing" to break things that worked.

+Tom Gundersen They can be told over the phone to use "systemd.debug debug" the same way, nothing changes. However for a lot of people the overriding of "debug" changed their workflow, as "debug" became unusable for their scope.

I really don't get what we are arguing about. There was a perfectly fine use case, which systemd broke. That's a regression.

Boot can be broken by any stuck service, for example mounting the FS. So, mount should also start treating the 'debug' option, by your logic. It might also be broken by any other service that is essential for boot. In a lot of use cases it's network - so networkmanager should also be started with debug. Also, X (blank screen?). Also, a lot of others.

"Debug" indeed started to behave differently and, by behaving this way (outputting additional tons of unrelated messages), broke its initial usage, which everyone used till now - to debug the kernel. That's a bug, that's a regression.

All that talk about "designing a full operating system" is, of course, quite interesting, however here we have a pure regression - when systemd broke the usage of kernel's "debug" param.

Regarding the handling of BZs - so you're saying that closing several reports and sending people to discuss it on the mailing list is ok; however, I consider it to be not ok, because it can be perfectly well discussed in the BZ and, if needed, copied to the mailing list by the developers themselves.

He (and others) have missed one important thing - treating the community normally, instead of ignoring them. And by ignoring I mean both "ignoring the bug reports" and "closing the bug reports with a redirect". That's just derailing, instead of actually taking care of the situation.
+Veaceslav Falico I don't agree with your usage of the term 'regression'. Again, you are talking about a UI here, not an API. It is, as I said, like complaining about the placement of buttons in OpenOffice changing. This might be a bad thing, but calling it a regression is wrong.
+Veaceslav Falico perhaps we do fundamentally disagree, but I don't think we're that polar at the crux of it all. I guess I just attribute different weights to the usefulness of each outcome. I genuinely understand the workflow is affected here, but my workflows have changed so much over the last 10 or so years that I really don't attribute that much weight to such things vs. looking forward to what's most sensible generally.

I also meant no disrespect by the "shouting louder" comment. I just mean that, from past experience, those affected by changes in workflow tend to complain loudly, while those who may benefit from a change are generally silent as they are simply unaware of the whole debate (and that this doesn't really correlate to who is "most correct" in the actual debate, however you calculate "correctness"). Hopefully we at least agree on that distinction!

I doubt there is anything more I can add to this discussion tho', so will attempt to bow out now!
This seems like just a big storm in a teacup. Yes, some people's personalities can be grating, but at the end of the day I do expect the best technical solution will win. I see some good discussion about the various alternatives on the systemd mailing list.
The choice of communication tools is not neutral. USENET newsgroups are not owned, and many mailing lists operate in that tradition. The way you end an unproductive discussion is to let flamers burn themselves out, and then somebody the group respects steps in to say "please stop filling my mailbox."

Bugzilla is designed as an information tool for a single group to manage. Unlike mailing lists, it is owned. In the bad old days, any kind of persistent state on a mailing list had to be maintained and reposted by a particular person. This is how we got FAQs, and sometimes multiple warring FAQs and HOWTOs etc.

BZ is designed for single groups to be able to remember state, make decisions, and ship. It's a lot better than manually maintaining a bugfaq. But since it is an owned thing, it does not respond so well to the social mechanisms in place on mail.

My instinctual reaction to chronic BZ pathology is to call up the people-manager of the project in question and have a chat about how using all-caps terms shouting WONTFIX is hurting communication in the project-of-projects. I find it interesting I immediately assume there is a manager to yank the chain of. Linux hackers don't have managers that way; at least we tell ourselves that. But BZ sorta assumes there is that kind of backpressure mechanism.

If anybody is looking for a fun project (never believe a proposal billed as a "fun project"), there is a gap for distributed bug management communication/coordination tools. Our big hammer for letting different people have different opinions on state and sometimes reaching consensus is git. So distributed bug tracking looks like a git repo, although I bet the discussion part of it looks more like a mailing list. Ideally, all the flaming on a bug topic keeps on getting linked to; just because you killfile a topic doesn't mean people stop talking.
+Michael Chapman Ignoring bugs (in the RH BZ) and outright closing bugs (in the f.d.o BZ) and not taking responsibility for bugs is just a "grating personality" issue? I think it is an example of the lead developers of systemd not being able to work well with others.

+Thomas Bushnell, BSG FYI, Greg KH is much more of a kernel developer than a systemd developer. So to call him an example of a reasonable systemd developer is... amusing.
It's appalling to see people defending Kay's misbehavior.
+Theodore Ts'o Don't get me wrong, I'm not excusing the behaviour. I just don't think all the hoopla that's arisen out of it is going to do anything about it.
+Michael Chapman Just as the sanctions on Russia were designed so that Putin would see a cost to his bad behavior, the hoopla helps to call out Kay and the other systemd developers who discussed this internally and decided the correct answer was WONTFIX, and exact a social cost. Maybe it will be as ineffective as the sanctions against Russia. But it is still a worthy thing to do.
Urf... to me, sentences like "it's about the end user" or "end users care about writing "debug" only to debug their boot process as a whole" are ludicrous. Are we talking about inexperienced users that would somehow want to debug their boot process? Hint: they don't. They install something else or buy a mac or a pc with windows, if their linux box cannot even boot properly. They don't screw around with kernel command lines. They choose "boot in rescue mode" or "boot in diagnostics mode" or whatever entry the distro has put for them in the grub menu, at most. No, it's very much about power-users that use the kernel command line on a daily basis and do tens of reboots in a single day to debug either systemd XOR the kernel. The simple fact that you want to debug one while having the other behave normally (and as a sysadmin, I really want that) warrants two command-line options, that is all. And the other simple fact that "debug" is used by the kernel, because it was introduced to mean "hey kernel, please run in debug mode", means systemd should use something else. The fact that other user-space programs have used "debug" as a hint that some debugging is going on matters little.
Ted Lemon
+Tom Gundersen, Kay basically leapt up over the desk and went for my throat when I asked a positive question about some work you'd done that made the news.   I think you are used to him responding this way, so it doesn't seem so bad to you anymore, but most people who interact with him don't know him.  If +Theodore Ts'o  is responding this way, I think you have an indication that this behavior is damaging to the project.

If I hadn't been interested in what you had to say (as opposed to Kay) I would have just blown the discussion off at that point.   This kind of behavior kills open source projects.   It's also clear that a lot of people have a problem with Kay's behavior—my comment about what he said on your post is still getting +1s two days later.

Your responses to me on the thread suggested to me that maybe Kay was the exception rather than the rule, since you were helpful and engaged.   But if you defend his behavior here you're enabling his behavior.   You don't have to be the bad guy here, but if you really think his behavior is okay, it might be time for you to reconsider that.
The trouble, +Todd Vierling is that it's not clear that the problem with systemd is solely its developers. Anybody seeking to fork systemd would have a LOT of reputation-building to do.
Can you point out that "leapt up over the desk and went for your throat" comment +Ted Lemon? The only reply I saw from Kay to yourself was a joke with a "Let me Google that for you" link. Such links are clearly just jokes (and this is a social media platform after all!) and I've sent similar links to my Mum before (and plenty more to my friends) so I'm pretty sure you cannot be referring to that particular comment with the above description. Hence I'd like to know the actual comment you're referring to so I can judge appropriately.
+Ted Lemon to deescalate the conflict I prefer to focus on the facts rather than the style (as at least the facts we can all probably agree on and solve). In that light, I still defend that it was correct to close the bug report and redirect the discussion to the ML.
That's the comment. I understood it to be intended as humorous.   You've known your Mum all your life, so it's not surprising that you know what is appropriate to say to her, and what is not.   Saying something like that to a total stranger is incredibly inappropriate: it reads as a joke made at the expense of a n00b.

While the point that it is important to know your audience is certainly germane, and we can theorize that Kay simply delivered a well-meant joke poorly, no such excuse applies to his response to the bug Ted is talking about.
+Tom Gundersen, when person A attacks person B, the right response is not to de-escalate the conflict, because there is no conflict.   There's an attack.   In the context of the bug report, the problem with your proposed solution is that a person who is not, and does not want to be, a regular contributor to the project is not going to want to be on the mailing list.   So what you've actually said, and what they will hear, is "please shut up," not "please participate in this other way," even though you intended the latter, not the former.
The problem, +Tom Gundersen , is that it was a bug. Something worked, something got changed, something got broken. The person who made the change made a mistake and needs to rectify their error. Now, it's possible that, with negotiation, a better solution can be achieved. But by closing the bug WONTFIX, Kay made it clear that the negotiation was going to start with him taking no action.

Your continued defense of Kay is appalling, by the way. THIS is why we can't have good things in the Open Source community.
+Ted Lemon it was intended as a joke at your expense. LMGTFY is always abusive, only ever a joke when between close friends.
+Russell Nelson Personally I disagree. I've had random people I don't know send me LMGTFY links in the past and can't think when I've ever not taken them in good humour. I guess like many other things, such as what the best bugzilla workflow is, it's somewhat subjective. I personally thought +Ted Lemon overreacted (as this is not how I would react to such interactions), but again this is a very personal and subjective thing, and I don't for a second want to question Ted's right to feel that way!!

But I do think that to characterize a LMGTFY post, which Ted concedes he appreciated was meant as humorous (regardless of how it was received), as somehow being equivalent to "leaping across a desk and going for his throat" is a little disingenuous to say the least.

This thread has pretty much descended into a witch hunt against Kay, and deliberately misrepresenting interactions known to be intended as humour as somehow deliberately vindictive is adding to this incredibly negative tone.

I will not defend Kay's actions in all areas, but I think people here are guilty of massively overreacting, to the detriment of all concerned.
+Colin Guthrie I agree this thread is starting to lose its usefulness, so I'll close it for further comments.  If it turns out that Kay and Lennart engage in more bad behaviors, we can call them out in other G+ posts.   If this causes them to be more careful in how they interact with other projects and with their users, then all of this will be for the good.  And if they create further demonstrations of not playing well with others, maybe you'll better understand the perspective of folks who view this as but one example of a pattern of behavior and not a one-off on their part.