Saturday, November 8, 2014

Unix Prog: I/O Multiplexing(1)

1. Why we need I/O Multiplexing
When we read from one descriptor and write to another, we can use blocking I/O in a loop.
 while((n = read(STDIN_FILENO, buf, BUFSIZ)) > 0)   
   if(write(STDOUT_FILENO, buf, n) != n)  
     exit(-1);  

What if we have to read from two descriptors? In this case, we can't do a blocking read on either descriptor, as data may appear on one descriptor while we're blocked in a read on the other.

telnet example:
The telnet command reads user input from the terminal and writes it to the network connection, where the telnetd daemon receives it. At the same time, telnet reads the daemon's output from the network connection and writes it to the terminal. So telnet has 2 inputs and 2 outputs. We can't do a blocking read on either of the inputs, as we never know which input will have data for us.

One solution: divide the telnet command into two processes, each responsible for one direction. But when one of the processes terminates, it has to notify the other, typically by sending a signal, which increases complexity.

Another solution: use 2 threads, each responsible for one direction. This requires us to deal with synchronization between the threads, which can add even more complexity.

Third solution: polling loop. Use nonblocking I/O in a single process by setting both descriptors nonblocking and issuing a read on the first descriptor. If data is present, we read it and process it. If there is no data to read, the call returns immediately. Then we do the same thing with the 2nd descriptor. After this, we sleep for a few seconds and then try to read from the first descriptor again. This wastes CPU time: most of the time there is no data to read, so the read calls accomplish nothing, and how long to sleep between rounds is a guess.
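The polling loop described above can be sketched roughly as follows. This is only an illustration under assumptions: the helper names set_nonblocking, try_read, and poll_two are made up for this post, the -2 return code is an arbitrary convention, and the one-second sleep is an arbitrary choice.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switch fd to nonblocking mode. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: one nonblocking read attempt.
 * Returns bytes read (> 0), 0 on end of file, -1 on a real error,
 * or -2 when there is simply no data yet (EAGAIN / EWOULDBLOCK). */
ssize_t try_read(int fd, char *buf, size_t len)
{
    ssize_t n = read(fd, buf, len);
    if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;
    return n;
}

/* Polling loop: try both descriptors (already nonblocking), sleep, repeat.
 * Stops after `rounds` passes, on end of file, or on error.
 * Returns the total number of bytes read. */
long poll_two(int fd1, int fd2, int rounds)
{
    char buf[4096];
    long total = 0;
    int fds[2] = { fd1, fd2 };

    for (int r = 0; r < rounds; r++) {
        for (int i = 0; i < 2; i++) {
            ssize_t n = try_read(fds[i], buf, sizeof buf);
            if (n > 0)
                total += n;        /* data was present: "process" it */
            else if (n == 0 || n == -1)
                return total;      /* end of file or real error */
            /* n == -2: nothing to read yet, move on to the next fd */
        }
        if (r + 1 < rounds)
            sleep(1);              /* the wasteful part: idle until next round */
    }
    return total;
}
```

Every round costs two system calls even when nothing has arrived, and data that shows up right after a round sits unread until the sleep ends.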

Fourth solution: asynchronous I/O. We ask the operating system to send the process a signal when a file descriptor is ready for I/O. But not every operating system supports this feature.

Better solution: I/O multiplexing. We build a list of the descriptors that we are interested in (usually more than one descriptor) and call a function that doesn't return until one of the descriptors is ready for I/O. On return from the function, we are told which descriptors are ready for I/O.
System calls for I/O multiplexing: poll, pselect, and select.

2. select and pselect
System Definitions:
 ubuntu@ip-172-31-23-227:~$ less /usr/include/x86_64-linux-gnu/sys/select.h  
 ......  
 /* Check the first NFDS descriptors each in READFDS (if not NULL) for read  
   readiness, in WRITEFDS (if not NULL) for write readiness, and in EXCEPTFDS  
   (if not NULL) for exceptional conditions. If TIMEOUT is not NULL, time out  
   after waiting the interval specified therein. Returns the number of ready  
   descriptors, or -1 for errors.  
   
   This function is a cancellation point and therefore not marked with  
   __THROW. */  
 extern int select (int __nfds, fd_set *__restrict __readfds,  
           fd_set *__restrict __writefds,  
           fd_set *__restrict __exceptfds,  
           struct timeval *__restrict __timeout);  
   
 #ifdef __USE_XOPEN2K  
 /* Same as above only that the TIMEOUT value is given with higher  
   resolution and a sigmask which is been set temporarily. This version  
   should be used.  
   
   This function is a cancellation point and therefore not marked with  
   __THROW. */  
 extern int pselect (int __nfds, fd_set *__restrict __readfds,  
           fd_set *__restrict __writefds,  
           fd_set *__restrict __exceptfds,  
           const struct timespec *__restrict __timeout,  
           const __sigset_t *__restrict __sigmask);  
 #endif  
 ......
 ubuntu@ip-172-31-23-227:~$ less /usr/include/linux/time.h
 ......  
 struct timespec {
         __kernel_time_t tv_sec;                 /* seconds */
         long            tv_nsec;                /* nanoseconds */
 };
 #endif

 struct timeval {
         __kernel_time_t         tv_sec;         /* seconds */
         __kernel_suseconds_t    tv_usec;        /* microseconds */
 };
 ......

Arguments we pass to select:
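1) Which descriptors we are interested in. (readfds, writefds, and exceptfds point to three descriptor sets of type fd_set, which hold one bit per descriptor: the descriptors to check for reading, for writing, and for pending exception conditions. Any of the three may be NULL.)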
2) How long we want to wait:
If timeout (the fifth argument) is NULL, select waits forever, that is, until a descriptor becomes ready or a signal is caught. If timeout->tv_sec == 0 and timeout->tv_usec == 0, it doesn't wait at all: all specified descriptors are tested and select returns immediately. If timeout->tv_sec != 0 || timeout->tv_usec != 0, it waits at most the specified number of seconds and microseconds.
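A minimal sketch of the three timeout cases, assuming we only care about one descriptor being readable. The helper name wait_readable is made up for illustration; the caller chooses the case by what it passes as timeout.

```c
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical helper: wait until fd is readable.
 *   timeout == NULL      -> wait forever
 *   timeout is {0, 0}    -> test and return immediately
 *   anything else        -> wait at most that long
 * Returns select's return value: 1 if readable, 0 on timeout, -1 on error.
 * Note: on Linux, select may modify *timeout, so reinitialize it
 * before reusing it for another call. */
int wait_readable(int fd, struct timeval *timeout)
{
    fd_set readfds;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    return select(fd + 1, &readfds, NULL, NULL, timeout);
}
```

For example, passing a struct timeval set to { 2, 500000 } waits at most 2.5 seconds, while passing NULL blocks until data arrives.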

When select returns, the kernel tells us:
1) The total count of descriptors that are ready (the return value)
2) If the return value is 0, the timeout expired before any descriptor became ready
3) If the return value is -1, an error occurred
4) Which descriptors are ready for each of the three conditions (readfds, writefds, and exceptfds are overwritten so that only the ready descriptors remain set)
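Putting the return values together, here is a rough sketch of waiting on two inputs at once, as in the telnet case. The function name read_from_ready is an assumption made up for this post; it checks the overwritten read set with FD_ISSET to find out which descriptor to read.

```c
#include <stddef.h>
#include <sys/select.h>
#include <unistd.h>

/* Hypothetical helper: block until fd1 or fd2 is readable, then read from
 * the first one found ready and store that descriptor in *which.
 * Returns what read() returns, or -1 if select() fails. */
ssize_t read_from_ready(int fd1, int fd2, char *buf, size_t len, int *which)
{
    fd_set readfds;
    int maxfd = fd1 > fd2 ? fd1 : fd2;

    /* select overwrites the set, so it must be rebuilt before every call. */
    FD_ZERO(&readfds);
    FD_SET(fd1, &readfds);
    FD_SET(fd2, &readfds);

    int ready = select(maxfd + 1, &readfds, NULL, NULL, NULL);
    if (ready == -1)
        return -1;                 /* error, for example EINTR */
    /* ready cannot be 0 here because no timeout was given */

    /* Only descriptors that are ready are still set in readfds. */
    *which = FD_ISSET(fd1, &readfds) ? fd1 : fd2;
    return read(*which, buf, len);
}
```

Calling this in a loop handles whichever input has data first, without polling and without a second process or thread.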

Based on the above information, we call the appropriate I/O function (read or write), knowing that it won't block.
1) A descriptor in the read set is considered ready if a read from that descriptor won't block
2) A descriptor in the write set is considered ready if a write to that descriptor won't block
3) A descriptor in the exception set is considered ready if an exception condition is pending on that descriptor
4) Descriptors for regular files are always reported as ready for reading, writing, and exception conditions.

Regarding the first argument, nfds (number of file descriptors): it tells the kernel how many descriptor bits need to be checked, and it should be the highest descriptor number in any of the three sets, plus 1. We could set it to FD_SETSIZE (normally 1024), but it is a waste for the kernel to check so many unused bits. For example, if the read set contains descriptors 0, 1, 2, the write set contains 2, 3, and the exception set contains 4, then nfds should be 5, meaning that the kernel checks the first 5 descriptors, fd0 to fd4.
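The nfds calculation can be folded into the code that fills the sets. A minimal sketch, assuming a made-up helper add_fds that tracks the highest descriptor seen so far:

```c
#include <stddef.h>
#include <sys/select.h>

/* Hypothetical helper: add `count` descriptors from `fds` to `set` and
 * return the updated nfds (highest descriptor seen so far, plus 1).
 * The caller must FD_ZERO the set first and start with nfds = 0. */
int add_fds(fd_set *set, const int *fds, size_t count, int nfds)
{
    for (size_t i = 0; i < count; i++) {
        FD_SET(fds[i], set);
        if (fds[i] + 1 > nfds)
            nfds = fds[i] + 1;
    }
    return nfds;
}
```

With the sets from the example (read 0, 1, 2; write 2, 3; exception 4), calling add_fds once per set and threading nfds through the calls ends with nfds == 5, which is the value to pass as the first argument of select.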
