CVE-2013-1940: shaggy bug story

So +Peter Hutterer and I found a pretty low-risk security bug in the X server last week, and I decided to document how we found it, for my own posterity!

So +Maarten Lankhorst mentioned that GPU hotplug was broken when you have multiple X servers running, and I was just about to start writing the code to fix it. The problem is of course that the X server that is VT switched away takes control of the hotplugged USB device, as well as the one that currently owns the VT. So Peter and I were in the office and I decided to ask him what input hotplug does to stop this. He wasn't sure, but he thought it might not handle it so well, and he also realised it was a possible CVE for input devices.

So instead of reading the code, I started an X session, opened a terminal, left the cursor in it, VT switched away from the X server, plugged in a keyboard, typed some stuff, and VT switched back, and there were my keystrokes in the terminal window. So this could let someone steal your login details if they left a VT switched X server in the background and jumped to a gdm login screen.

So clearly there was nothing in the VT switched X server blocking new input devices being attached. Except then we read the code, and there was. Very explicitly, we don't enable new input devices on VT switched X servers. So we had found a bug, just not the bug we had been trying to find.

Head scratching, wtf, WTUF ensued.

So keystrokes from the device were getting buffered somewhere, but where could that be? When we VT switch back to the X server and it enables input devices, the evdev driver eventually calls xf86FlushInput to remove any queued input events for the device. Also, when we VT switch away we close all the file descriptors, and reopen them on VT switch back.

Except when evdev hotplugged something, it plumbed some pieces of the device in: it opened the file descriptor, for example, and kept it open. So that was hint one. We had an open fd, so maybe the kernel was putting stuff in there and we weren't flushing it. A few straces later, I located the flush code reading from the device and getting -EINVAL back (cue another wtf).

Investigating the flushing code turned up the cause. It was written back when devices were serial and read always returned whatever bytes it could. The kernel evdev driver won't allow partial reads of input events, and the X server flush code only used a 4-byte buffer as its flush buffer, so every read failed. So the X server was never actually flushing evdev devices. One 4->256 change to the size of a char array and it was all fixed.
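To make the failure mode concrete, here is a rough sketch of it in C. This is not the X server code and it doesn't touch a real device: fake_evdev_read, struct fake_evdev and the 24-byte FAKE_EVENT_SIZE are all made up for illustration, and they only model the read behaviour described above (a buffer smaller than one event gets EINVAL, otherwise you get whole events back).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Assumption: stand-in for sizeof(struct input_event); the real size
 * depends on the architecture. */
#define FAKE_EVENT_SIZE 24

/* Hypothetical model of a device queue: just a count of whole events. */
struct fake_evdev {
    size_t queued_events;
};

/* Models the read semantics described in the post: a buffer smaller than
 * one event is rejected with EINVAL, otherwise whole events are consumed. */
static ssize_t fake_evdev_read(struct fake_evdev *dev, void *buf, size_t count)
{
    (void)buf; /* the model does not produce event payloads */

    if (count < FAKE_EVENT_SIZE) {
        errno = EINVAL;
        return -1;
    }
    if (dev->queued_events == 0) {
        errno = EAGAIN;
        return -1;
    }

    size_t n = count / FAKE_EVENT_SIZE;
    if (n > dev->queued_events)
        n = dev->queued_events;
    dev->queued_events -= n;
    return (ssize_t)(n * FAKE_EVENT_SIZE);
}

/* Serial-era style flush: keep reading into a scratch buffer until read
 * stops returning data. Returns how many events are still queued. */
static size_t flush_with_buffer(struct fake_evdev *dev, size_t bufsize)
{
    char buf[256];

    assert(bufsize <= sizeof(buf));
    while (fake_evdev_read(dev, buf, bufsize) > 0)
        ; /* discard whatever was read */
    return dev->queued_events;
}
```

With ten events queued, flush_with_buffer(&dev, 4) fails on the very first read and leaves all ten sitting there, while flush_with_buffer(&dev, 256) drains the lot.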

What I find weird is that if we had read the code first, and I hadn't gone straight to just seeing what happened, we'd have decided there was no problem at all, and this would have lived on until someone noticed it some other way.