socket - yszheda/wiki GitHub Wiki
select
poll
The Problem with select()
int select(int numfd, fd_set * readfds, fd_set * writefds,
fd_set * exceptfds, struct timeval * tv);
The performance of select() is directly related to the value of the file descriptors being monitored. No other Linux system call depends on file descriptors' values. (The number of file descriptors monitored would be a more natural characteristic for performance to depend on.) As a specific example of this, select() performs very poorly once the file descriptors get large.
If fd_set were actually enlarged, each fd_set structure would measure at least 12K in size! Needless to say, handling fd_set structures of that size would have a noticeable performance impact!
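The dependence on descriptor values shows up in how select() is called. The following is a minimal sketch, assuming a Linux/POSIX system; the helper name wait_readable_select and its return convention are hypothetical, chosen only for illustration. The first argument is the highest descriptor plus one, and a descriptor at or above FD_SETSIZE cannot be placed in an fd_set at all.

```c
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper (not from the original notes): wait until fd is
 * readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable_select(int fd, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;

    /* An fd_set is a fixed-size bitmap; larger descriptors do not fit. */
    if (fd < 0 || fd >= FD_SETSIZE)
        return -1;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* nfds is "highest descriptor + 1", so the amount of bitmap the
     * kernel examines grows with the descriptor's value. */
    int rc = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return (rc > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}
```

Even with a single descriptor being watched, the call has to describe every bit position up to that descriptor's value.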
int poll(struct pollfd *ufds, unsigned int nfds, int timeout);
struct pollfd {
int fd; /* file descriptor */
short events; /* requested events */
short revents; /* returned events */
};
The number of struct pollfd items passed to it is going to have an effect on performance, which is reasonable, since the amount of work the system call is doing is directly related to the number of items in that array. However, the value of the individual file descriptors has no effect. This is a major win for poll() over select().
The other advantage to poll() is that the largest file descriptor it can handle is limited by the size of an int, not by the size of another structure (such as an fd_set). This lets poll() work with Linux kernels that allow hundreds of thousands of file descriptors per process, which is extremely useful in some cases.
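For comparison, here is a minimal sketch of waiting on one descriptor with poll(), assuming a Linux/POSIX system. The helper name wait_readable_poll and its return convention are hypothetical. The descriptor is stored as a plain int in the array, so its numeric value affects neither the size of the argument nor the work done.

```c
#include <poll.h>

/* Hypothetical helper (not from the original notes): wait until fd is
 * readable or timeout_ms elapses.
 * Returns 1 if POLLIN is reported, 0 on timeout or when only other
 * events (such as POLLHUP) are reported, -1 on error. */
int wait_readable_poll(int fd, int timeout_ms)
{
    struct pollfd pfd;

    pfd.fd = fd;          /* any valid descriptor value, no bitmap limit */
    pfd.events = POLLIN;  /* input: events we are interested in */
    pfd.revents = 0;      /* output: filled in by the kernel */

    /* The second argument is the number of array entries, not fd + 1. */
    int rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;
    if (rc == 0)
        return 0;
    return (pfd.revents & POLLIN) ? 1 : 0;
}
```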
Pipes
Linux considers writing to a pipe with no readers a pretty serious matter, since data could get lost. Rather than let a process blithely ignore error codes from write() and go about its business, Linux sends the process a signal called SIGPIPE. Unless the process has specifically asked to handle the SIGPIPE signal, Linux kills the process. The Broken pipe message is from the shell, which is telling you that the process died because of a broken pipe.
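A process that prefers an error code to being killed can ignore SIGPIPE, in which case write() fails with EPIPE. The following is a minimal sketch, assuming a Linux/POSIX system; the function name errno_for_write_without_reader is hypothetical.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Hypothetical demo (not from the original notes): write to a pipe whose
 * read end is closed. Returns the errno reported by write(), 0 if the
 * write unexpectedly succeeded, or -1 if setup failed. */
int errno_for_write_without_reader(void)
{
    int p[2];

    /* Ignore SIGPIPE so that write() returns an error instead of the
     * default action (process termination) taking place. */
    if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
        return -1;
    if (pipe(p) != 0)
        return -1;

    close(p[0]);                       /* the pipe now has no readers */

    ssize_t n = write(p[1], "x", 1);
    int err = (n < 0) ? errno : 0;     /* save errno before close() */

    close(p[1]);
    return err;
}
```

Note that signal(SIGPIPE, SIG_IGN) affects the whole process; code that only wants this behaviour for one socket may prefer the MSG_NOSIGNAL flag of send() on Linux.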
Poll vs Select
- poll() does not require that the user calculate the value of the highest-numbered file descriptor + 1.
- poll() is more efficient for large-valued file descriptors. Imagine watching a single file descriptor with the value 900 via select(): the kernel would have to check each bit of each passed-in set, up to the 900th bit.
- select()'s file descriptor sets are statically sized.
- With select(), the file descriptor sets are reconstructed on return, so each subsequent call must reinitialize them. The poll() system call separates the input (events field) from the output (revents field), allowing the array to be reused without change.
- The timeout parameter to select() is undefined on return. Portable code needs to reinitialize it. This is not an issue with pselect().
- select() is more portable, as some Unix systems do not support poll().
Epoll vs Select/Poll
- We can add and remove file descriptors while waiting.
- epoll_wait returns only the objects with ready file descriptors.
- epoll has better performance: the cost of a wait is roughly O(1) in the number of monitored descriptors, instead of the O(n) scan that select and poll perform.
- epoll can behave as level triggered or edge triggered (see man page).
- epoll is Linux specific, so it is not portable.
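A minimal sketch of the epoll calls behind the points above, assuming Linux. The function name count_ready_epoll is hypothetical, and the example registers just two descriptors to stay short. Registration happens once through epoll_ctl, and epoll_wait fills the output array only with entries that are ready.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical demo (not from the original notes): register two
 * descriptors for reading and return how many of them epoll_wait()
 * reports as ready within timeout_ms. Returns -1 on error. */
int count_ready_epoll(int fd_a, int fd_b, int timeout_ms)
{
    int fds[2] = { fd_a, fd_b };
    struct epoll_event ready[2];

    int epfd = epoll_create1(0);
    if (epfd < 0)
        return -1;

    for (int i = 0; i < 2; i++) {
        struct epoll_event ev = { 0 };

        ev.events = EPOLLIN;     /* level triggered; OR in EPOLLET for edge */
        ev.data.fd = fds[i];     /* returned to us with each ready event */
        if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i], &ev) < 0) {
            close(epfd);
            return -1;
        }
    }

    /* Only ready descriptors are written to `ready`; idle ones are not
     * reported, so the caller does not scan the full interest list. */
    int n = epoll_wait(epfd, ready, 2, timeout_ms);

    close(epfd);
    return n;
}
```

A long-running server would normally keep the epoll descriptor open across many epoll_wait calls rather than creating and closing it per call as this demo does.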
options
MSG_WAITALL
- http://man7.org/linux/man-pages/man2/recv.2.html
- Socket recv() hang on large message with MSG_WAITALL
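MSG_WAITALL asks recv() to block until the full request is satisfied, but per the man page the call may still return fewer bytes if a signal is caught, an error occurs, or the peer disconnects. The following is a minimal sketch of a read loop that accounts for this, assuming a stream socket on a Linux/POSIX system; the helper name recv_exact is hypothetical.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper (not from the original notes): read exactly len
 * bytes from a stream socket. Returns the number of bytes read, which is
 * less than len only if the peer closed the connection, or -1 on error. */
ssize_t recv_exact(int sock, void *buf, size_t len)
{
    size_t got = 0;

    while (got < len) {
        /* MSG_WAITALL asks the kernel to wait for the whole remainder,
         * yet a short return is still possible, hence the loop. */
        ssize_t n = recv(sock, (char *)buf + got, len - got, MSG_WAITALL);
        if (n < 0) {
            if (errno == EINTR)
                continue;        /* interrupted by a signal: retry */
            return -1;
        }
        if (n == 0)
            break;               /* peer closed the connection */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```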