Do I get a prize for obfuscated (but efficient!) code?

I'd like to get rid of that loop that removes trailing slashes. It predicts horribly (i.e. the common case is "exactly once through the loop"). I could play with separate masks for NUL and '/' characters, but that runs into register pressure issues. I suspect I should just special-case the "one slash" thing and avoid the mispredict that way.
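
For illustration only, one way the "one slash" special case might look is to peel the first slash test out of the loop, so the loop branch is almost never taken. This is a sketch, not part of the posted patch: `skip_slashes` is a hypothetical helper name, and whether it actually avoids the mispredict depends on the compiler and the CPU.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not in the posted code): skip the slashes that
 * follow a path component, with the single-slash case handled before
 * the loop body can run.
 */
static inline size_t skip_slashes(const char *name, size_t len)
{
	/* Component ended at the NUL terminator: nothing to skip. */
	if (name[len] != '/')
		return len;
	len++;

	/* Common case is exactly one slash, so this loop rarely iterates. */
	while (name[len] == '/')
		len++;

	return len;
}
```

With this shape, the first test and the loop test are separate branches, each with its own mostly stable direction, instead of one loop branch that is taken once and then not taken.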

Pastebin:
http://pastebin.com/a83f6Ecx

Resulting inner loop:
http://pastebin.com/xPdifHNB
---
#define ONEBYTES 0x0101010101010101ul
#define SLASHBYTES 0x2f2f2f2f2f2f2f2ful
#define HIGHBITS 0x8080808080808080ul

/* Return the high bit set in the first byte that is a zero */
static inline unsigned long has_zero(unsigned long a)
{
	return ((a - ONEBYTES) & ~a) & HIGHBITS;
}

/*
 * Calculate the length and hash of the path component, and
 * return the beginning of the next one (or the pointer to the
 * final NUL character if none).
 */
static inline const char *hash_name(struct qstr *str, const char *name)
{
	unsigned long a, mask, hash, len;

	str->name = name;
	hash = a = 0;
	len = -sizeof(unsigned long);
	do {
		hash = (hash + a) * 11;
		len += sizeof(unsigned long);
		a = *(unsigned long *)(name+len);
		/* Do we have any NUL or '/' bytes in this word? */
		mask = has_zero(a) | has_zero(a ^ SLASHBYTES);
	} while (!mask);

	/* The mask *below* the first high bit set */
	mask = (mask - 1) & ~mask;
	mask >>= 7;
	hash += a & mask;
	str->hash = fold_hash(hash);

	/* Get the final path component length */
	len += ffz(mask) / 8;
	str->len = len;

	/* remove trailing slashes */
	while (name[len] == '/')
		len++;

	return name+len;
}
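
To see what the mask arithmetic above does on a single word, here is a small user-space sketch. It is an assumption-laden illustration, not the kernel code: it assumes a little-endian machine and a 64-bit word, uses `memcpy` for the load, and `first_terminator` is a made-up name. A byte-counting loop stands in for the kernel's `ffz(mask) / 8`.

```c
#include <stdint.h>
#include <string.h>

#define ONEBYTES   UINT64_C(0x0101010101010101)
#define SLASHBYTES UINT64_C(0x2f2f2f2f2f2f2f2f)
#define HIGHBITS   UINT64_C(0x8080808080808080)

/* High bit set in the lowest zero byte (higher bytes may be false positives) */
static inline uint64_t has_zero(uint64_t a)
{
	return ((a - ONEBYTES) & ~a) & HIGHBITS;
}

/*
 * Hypothetical helper: index (0..7) of the first NUL or '/' in the
 * eight bytes at p, or 8 if there is none. Assumes little-endian and
 * that at least eight bytes are readable at p.
 */
static unsigned int first_terminator(const char *p)
{
	uint64_t a, mask;
	unsigned int n = 0;

	memcpy(&a, p, sizeof(a));
	mask = has_zero(a) | has_zero(a ^ SLASHBYTES);
	if (!mask)
		return 8;

	/* Keep only the bits below the lowest marker bit ... */
	mask = (mask - 1) & ~mask;
	/* ... then drop the 7 leftover bits so whole bytes are 0xff. */
	mask >>= 7;

	/* Count the 0xff bytes: that is the component length in this word. */
	while (mask & 0xff) {
		n++;
		mask >>= 8;
	}
	return n;
}
```

For the eight bytes of "usr/bin" (including the NUL), the '/' at index 3 is the lowest marker, the byte mask ends up as 0x0000000000ffffff, and the helper returns 3. The same byte mask is what `hash += a & mask` uses to mix in only the bytes that belong to the component.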