Maybe everyone else knew that socket writes on O_NONBLOCK TCP sockets are smaller than writes on blocking sockets (after poll() says it's writable).  I didn't, and bugs followed:
Pettycoin. YA crazy bitcoin project. This project is maintained by Rusty Russell. Fascinating Socket Write Behaviour. 07 Aug 2014. A user (wow, I have a user!) reported several bugs, but the coolest was that dumbwallet would freeze talking to pettycoin.
I wonder how much space 'poll' actually promises to be writeable (if any).  Maybe one page?  I wonder if increasing SO_SNDBUF has any effect on the write that will succeed after a poll...
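For anyone who wants to poke at the SO_SNDBUF question, here is a minimal sketch of how the buffer size can be requested and read back. It does not answer whether a bigger buffer changes the size of the write that succeeds after poll(); it only shows the knob. The function name `sndbuf_before_after` and the 1 MiB request are made up for illustration, and the value the kernel reports back is system-dependent (on Linux it is typically adjusted and capped by system limits).

```python
import socket


def sndbuf_before_after(requested=1 << 20):
    """Return SO_SNDBUF as reported before and after requesting a new size."""
    # An unconnected TCP socket is enough to query and set the option.
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        before = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        # This is a request; the kernel may round, double, or cap it.
        s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested)
        after = s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
        return before, after
    finally:
        s.close()


if __name__ == "__main__":
    print(sndbuf_before_after())
```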
I didn't know it. At a past employer I had a quite large chunk of code doing non-blocking socket reads and writes. Fortunately all the reads and writes were rather small (< 10k). I never knowingly ran into this.

At what size writes were you seeing this problem?
This is one of those things that has you going "I really should have known that"
Ouch.  This was standard knowledge by my second year CS networking/unix course!  I wonder how many are making this mistake?
+Rusty Russell: the basic rule is that doing "poll()" on a blocking file descriptor is insane. 

The poll() return value says that you can read or write something, but it does not mean that you can read or write arbitrary amounts.  So a blocking operation will block, modulo other concerns.

This is perhaps more obvious - and easier to see - with reads. Imagine your socket got a packet of data - obviously it will now be readable. But if you do a blocking read of one megabyte, you'll still be blocking until something like a signal happens, because you didn't get a whole megabyte.

So "obviously" you cannot do a one megabyte read just because poll() said there was something to read. I bet that doesn't surprise you. You can do various random hacks, like using "SIOCINQ/FIONREAD" etc, and still use a blocking socket, so it can work, but it's not thread-safe and it's just plain stupid. Just use a nonblocking socket.

For writes, the exact same thing holds. Sure, you can try to play games by knowing socket buffer sizes and look at pending buffers with SIOCOUTQ etc, and say "ok, I can probably do a write of size X without blocking" even on a blocking file descriptor, but it's hacky, fragile and wrong.

So just do the sane thing. If you are using poll(), use a nonblocking file descriptor, and do reads/writes in a loop until you get an error.
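A rough sketch of the "poll() plus nonblocking fd, loop until you get an error" pattern described above, for the write side. This is only an illustration under stated assumptions: `send_all_nonblocking`, `drain` and `demo` are hypothetical names, and a local `socket.socketpair()` with a reader thread stands in for a real TCP peer.

```python
import select
import socket
import threading


def send_all_nonblocking(sock, data):
    """Send all of data on a nonblocking socket; return the number of send() calls."""
    sock.setblocking(False)
    poller = select.poll()
    poller.register(sock, select.POLLOUT)
    view = memoryview(data)
    calls = 0
    while view:
        poller.poll()                  # sleep until the kernel reports writable
        try:
            while view:                # keep writing until the kernel pushes back
                sent = sock.send(view)  # may accept fewer bytes than offered
                view = view[sent:]
                calls += 1
        except BlockingIOError:        # EAGAIN: buffer is full, go back to poll()
            pass
    return calls


def drain(sock, total, out):
    """Hypothetical peer: read until `total` bytes have arrived or EOF."""
    while len(out) < total:
        chunk = sock.recv(65536)
        if not chunk:
            break
        out.extend(chunk)


def demo(size=4 * 1024 * 1024):
    a, b = socket.socketpair()
    received = bytearray()
    reader = threading.Thread(target=drain, args=(b, size, received))
    reader.start()
    calls = send_all_nonblocking(a, bytes(size))
    reader.join()
    a.close()
    b.close()
    return len(received), calls


if __name__ == "__main__":
    print(demo())
```

The point of the inner loop is that the caller never assumes how much a single write will take: it writes until the kernel says EAGAIN and only then sleeps in poll() again.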
Hmmm, don't you always have to assume short writes (e.g. -EINTR because of a signal)? So the write loop would be the same even for blocking writes ...
+Rusty Russell: What's the advantage of reenabling blocking behaviour after a successful poll as opposed to a simple TEMP_FAILURE_RETRY style (glibc) loop? 
+Linus Torvalds I naively assumed you'd get a short read on a ready, blocking fd. I get short writes, and no signals in sight. It'd be a kind of feature if you could assume complete reads and writes on blocking sockets without signals.
The odd thing is that you get a short write for both the blocking and the non-blocking case, but they are different short writes.
That's essentially what you said at the top,  but no-one has really addressed that issue.
I understand a non-blocking write returning a short write, and I understand a blocking write blocking until it could write everything.  But why would a blocking write block until it can write some large amount, but not everything?
+Linus Torvalds my tiny test program here indicates that my understanding was correct: on TCP sockets I get a short read, rather than blocking on a 1MB read.

So now you can see why I expected write to behave the same way.
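The test program itself isn't shown in the thread, so the following is not it; it is just a small sketch of the same kind of experiment. It assumes a local stream `socket.socketpair()` as a stand-in for a TCP connection, and the name `blocking_short_read` is invented here. Both ends stay in blocking mode, 100 bytes are queued, and the reader asks for 1 MiB.

```python
import socket


def blocking_short_read(payload_len=100, request=1 << 20):
    """Queue a small payload, then do a large blocking recv(); return bytes read."""
    a, b = socket.socketpair()      # both ends are blocking by default
    try:
        a.sendall(b"x" * payload_len)
        data = b.recv(request)      # asks for far more than is queued
        return len(data)
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(blocking_short_read())
```

On a stream socket the recv() returns what is already queued rather than waiting for the full request, which is the short-read behaviour described above.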
+Tim Connors 5.2 does seem to imply it, but only if you read "completed" as "fully completed", which would then imply that we normally "fully complete" I/O on blocking sockets, which isn't true.
+Rusty Russell yeah, different file descriptor types have somewhat different semantics for "blocking" (with tty's being probably the most complex - think of all the timeout and "minimum character count" and line ending issues with termios).

So the exact details end up being complicated, but the basic rule of "poll/select goes together with nonblocking IO" still holds. As you saw, mixing poll and blocking IO can "work" but usually has various issues.

It's made more subtle by the fact that poll actually generally tries to help applications be efficient, so if I recall correctly (on my phone right now, no source code) poll will not return writable until the write queues are "sufficiently" empty (something like half empty). That way applications that are in a poll/write loop don't get woken up for each packet that gets sent and acknowledged, but instead get bigger poll sleeps and bigger writes.

So details like that, along with timing etc, make things much harder to predict. Blocking writes end up "almost working well" except when they don't.

With nonblocking IO, those subtleties all go away.
Oh, and even "nonblocking" IO will block for some things. Kernel locks, memory buffer allocations etc. So a nonblocking socket shouldn't block waiting for network traffic, but it might still block on paging or on the socket lock etc. So there is no guarantee of absolute 100% CPU time.
Sorry +Linus Torvalds you have it exactly wrong WRT reading.  Your "justification by symmetry" for write is based on a flawed premise.  Short reads are the norm on non-regular files, not some weird exception.

Turns out POSIX actually covers this: "The use of the O_NONBLOCK flag has no effect if there is some data available." (if not a FIFO or pipe) and "The value returned may be less than nbyte if ... file is a pipe or FIFO or special file and has fewer than nbyte bytes immediately available for reading."

But POSIX completely breaks the symmetry with write, where partial writes are effectively illegal unless O_NONBLOCK: "If the O_NONBLOCK flag is clear, a write request may cause the thread to block, but on normal completion it shall return nbyte." (FIFO/pipe) and, by exception-proves-the-rule, implied for others: "If the O_NONBLOCK flag is set,...If some data can be written without blocking the thread, write() shall write what it can and return the number of bytes written."

TL;DR: O_NONBLOCK has no effect on socket reads, except making them return -EAGAIN on empty.  O_NONBLOCK is required if you don't want socket writes to sleep.  And Linux is bug compatible with POSIX here.
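The two O_NONBLOCK effects in the TL;DR can be seen in a small sketch like the one below. Assumptions: a local stream `socket.socketpair()` stands in for a TCP connection, `nonblocking_edges` is a made-up name, and the 16 MiB payload is simply assumed to be larger than the socket's send buffer on the machine running it.

```python
import errno
import socket


def nonblocking_edges(big=16 * 1024 * 1024):
    """Return (empty read hit EAGAIN?, bytes accepted by one nonblocking send)."""
    a, b = socket.socketpair()
    a.setblocking(False)
    b.setblocking(False)
    try:
        # Reading an empty nonblocking socket fails immediately instead of sleeping.
        try:
            b.recv(4096)
            got_eagain = False
        except BlockingIOError as exc:
            got_eagain = exc.errno in (errno.EAGAIN, errno.EWOULDBLOCK)
        # Nobody is reading from b, so a large nonblocking send takes only
        # what fits in the buffers and returns a short count.
        written = a.send(bytes(big))
        return got_eagain, written
    finally:
        a.close()
        b.close()


if __name__ == "__main__":
    print(nonblocking_edges())
```

Exactly how many bytes the single send() accepts depends on the buffer sizes in effect, so the sketch only reports the count rather than predicting it.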
BTW, if you think that is crazy, what about this:

Normal blocking UDP socket. Poll says the socket is readable but then the read fails with -EAGAIN.