Sunday, November 16, 2014

Unix Prog: Socket Descriptors

1. Unix Socket
A socket is an abstraction of a communication endpoint.
On Unix systems, socket descriptors are implemented as file descriptors.
Many of the functions that deal with file descriptors, such as read and write, also work with socket descriptors.

2. Create a socket descriptor.
System Call:
 ubuntu@ip-172-31-23-227:~$ less /usr/include/x86_64-linux-gnu/sys/socket.h  
 ......  
 /* Create a new socket of type TYPE in domain DOMAIN, using  
   protocol PROTOCOL. If PROTOCOL is zero, one is chosen automatically.  
   Returns a file descriptor for the new socket, or -1 for errors. */  
 extern int socket (int __domain, int __type, int __protocol) __THROW;  
 ......  

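The domain argument determines the nature of the communication, including the address format. Common domains are AF_INET (IPv4), AF_INET6 (IPv6), and AF_UNIX, also spelled AF_LOCAL (communication within one host).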
 ubuntu@ip-172-31-23-227:~$ less /usr/include/x86_64-linux-gnu/bits/socket.h  
 ......  
 #define AF_UNSPEC    PF_UNSPEC  
 #define AF_LOCAL    PF_LOCAL  
 #define AF_UNIX     PF_UNIX  
 ......  
 #define AF_INET     PF_INET  
 ......  
 #define AF_INET6    PF_INET6  
 ......  

The type argument determines the type of the socket, which in turn determines the communication characteristics.
 ubuntu@ip-172-31-23-227:~$ less /usr/include/x86_64-linux-gnu/bits/socket_type.h  
 ......  
 /* Types of sockets. */  
 enum __socket_type  
 {  
  SOCK_STREAM = 1,       /* Sequenced, reliable, connection-based  
                   byte streams. */  
 #define SOCK_STREAM SOCK_STREAM  
  SOCK_DGRAM = 2,        /* Connectionless, unreliable datagrams  
                   of fixed maximum length. */  
 #define SOCK_DGRAM SOCK_DGRAM  
  SOCK_RAW = 3,         /* Raw protocol interface. */  
 #define SOCK_RAW SOCK_RAW  
  SOCK_RDM = 4,         /* Reliably-delivered messages. */  
 #define SOCK_RDM SOCK_RDM  
  SOCK_SEQPACKET = 5,      /* Sequenced, reliable, connection-based,  
                   datagrams of fixed maximum length. */  
 #define SOCK_SEQPACKET SOCK_SEQPACKET  
  SOCK_DCCP = 6,        /* Datagram Congestion Control Protocol. */  
 #define SOCK_DCCP SOCK_DCCP  
  SOCK_PACKET = 10,       /* Linux specific way of getting packets  
                   at the dev level. For writing rarp and  
                   other similar things on the user level. */  
 #define SOCK_PACKET SOCK_PACKET  
   
  /* Flags to be ORed into the type parameter of socket and socketpair and  
    used for the flags parameter of paccept. */  
   
  SOCK_CLOEXEC = 02000000,   /* Atomically set close-on-exec flag for the  
                   new descriptor(s). */  
 #define SOCK_CLOEXEC SOCK_CLOEXEC  
  SOCK_NONBLOCK = 00004000   /* Atomically mark descriptor(s) as  
                   non-blocking. */  
 #define SOCK_NONBLOCK SOCK_NONBLOCK  
 };  
 ......  
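
The two flags at the end of the enum can be ORed into the type argument. Below is a minimal sketch of that, assuming Linux with glibc. The helper name socket_flag_bits and its return encoding are made up for illustration.

```c
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: create a socket with the given extra type flags and
   report what fcntl sees. Bit 0 of the result is set if O_NONBLOCK is on,
   bit 1 is set if FD_CLOEXEC is on. Returns -1 on error. */
int socket_flag_bits(int extra_flags)
{
    int fd = socket(AF_UNIX, SOCK_STREAM | extra_flags, 0);
    int fl, fdfl, bits = 0;

    if (fd < 0)
        return -1;
    fl = fcntl(fd, F_GETFL);    /* file status flags, includes O_NONBLOCK */
    fdfl = fcntl(fd, F_GETFD);  /* descriptor flags, includes FD_CLOEXEC */
    close(fd);
    if (fl < 0 || fdfl < 0)
        return -1;
    if (fl & O_NONBLOCK)
        bits |= 1;
    if (fdfl & FD_CLOEXEC)
        bits |= 2;
    return bits;
}
```

Passing the flags to socket sets them at creation time, which avoids a separate fcntl call after the descriptor already exists.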

The protocol argument is usually zero, which selects the default protocol for the given domain and socket type.
The default protocol for a SOCK_STREAM socket in the AF_INET communication domain is TCP.
The default protocol for a SOCK_DGRAM socket in the AF_INET communication domain is UDP.

Note: calling the "socket" system call is similar to calling "open". In both cases you get a file descriptor that can be used for I/O. When we are done with it, we call "close" to relinquish access to it.

Note: although a socket descriptor is a file descriptor, not every system call that takes a file descriptor works on a socket. For example, lseek fails on a socket, because a socket has no concept of a file offset.
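
A minimal sketch of the lseek case follows. The helper name lseek_errno_on_socket is made up for illustration. On Linux the call is expected to fail with ESPIPE, the same error a pipe gives.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: try lseek on a fresh socket and return the errno it
   produced, 0 if lseek unexpectedly succeeded, or -1 if no socket could be
   created. */
int lseek_errno_on_socket(void)
{
    int err;
    off_t pos;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    errno = 0;
    pos = lseek(fd, 0, SEEK_CUR);        /* ask for the current offset */
    err = (pos == (off_t)-1) ? errno : 0;
    close(fd);
    return err;
}
```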

3. Socket Type
1) SOCK_DGRAM Interface: no logical connection needs to exist between peers for them to communicate. A datagram socket provides a connectionless service.

A datagram is a self-contained message that carries the address of its recipient. Sending a datagram is analogous to mailing someone a letter. You can mail many letters, but you can't guarantee the order of delivery, and some might get lost along the way.
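
As a rough sketch of the letter analogy, the function below sends one UDP datagram over the loopback interface without ever calling connect. The destination address travels with each sendto call. The name udp_loopback_once, the 2-second receive timeout, and the use of 127.0.0.1 with a kernel-chosen port are assumptions made for this example.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical demo: returns the number of bytes received into buf,
   or -1 on error or timeout. */
ssize_t udp_loopback_once(const char *msg, char *buf, size_t buflen)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof(addr);
    struct timeval tv = { .tv_sec = 2, .tv_usec = 0 };
    ssize_t n = -1;
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);

    if (rx < 0 || tx < 0)
        goto out;

    /* Give the receiver an address: 127.0.0.1, port picked by the kernel. */
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;
    if (bind(rx, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;
    if (getsockname(rx, (struct sockaddr *)&addr, &alen) < 0)
        goto out;

    /* Datagrams may be lost, so do not wait forever for one. */
    if (setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv)) < 0)
        goto out;

    /* No connection: the recipient address is supplied with the message. */
    if (sendto(tx, msg, strlen(msg), 0,
               (struct sockaddr *)&addr, sizeof(addr)) < 0)
        goto out;

    n = recvfrom(rx, buf, buflen, 0, NULL, NULL);
out:
    if (rx >= 0)
        close(rx);
    if (tx >= 0)
        close(tx);
    return n;
}
```

Loss is unlikely on loopback, but the timeout is there because nothing in the datagram service promises delivery.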

2) SOCK_STREAM Interface: before exchanging data, you must set up a logical connection between your socket and the socket belonging to the peer you want to communicate with.

Messages contain no addressing information, because a point-to-point virtual connection exists between both ends, and the connection itself implies a particular source and destination.

With a SOCK_STREAM socket, applications are unaware of message boundaries. When we read data from a socket, the call might not return the same number of bytes that the sending process wrote. We will eventually get everything sent to us, but it might take several function calls.
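
Because one read may return only part of what was sent, stream code usually reads in a loop. The following is a minimal sketch of that pattern. The names read_full and stream_demo are made up for illustration, and a socketpair stands in for a real connection.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read until len bytes have arrived.
   Returns len on success, a short count if the peer closed early,
   or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    size_t got = 0;

    while (got < len) {
        ssize_t n = read(fd, (char *)buf + got, len - got);
        if (n == 0)
            break;              /* end of stream */
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal, retry */
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}

/* Hypothetical demo: two separate writes arrive as one byte stream. */
ssize_t stream_demo(char *buf, size_t len)
{
    int sv[2];
    ssize_t n = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    if (write(sv[0], "abc", 3) == 3 && write(sv[0], "def", 3) == 3) {
        close(sv[0]);           /* sender is done; reader sees EOF after 6 bytes */
        sv[0] = -1;
        n = read_full(sv[1], buf, len);
    }
    if (sv[0] >= 0)
        close(sv[0]);
    close(sv[1]);
    return n;
}
```

The receiver asks for six bytes and gets "abcdef". Nothing in the data says that it was written as two pieces.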

3) SOCK_SEQPACKET Interface: it is the same as a SOCK_STREAM socket, except that we get a message-based service instead of a byte-stream service. The amount of data received from a SOCK_SEQPACKET socket is the same amount as was written.

The OS keeps each message intact and hands it to us as a single unit.
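
A small sketch can show the preserved boundaries. The helper name first_read_size is made up for illustration, and the example assumes a system where AF_UNIX supports SOCK_SEQPACKET, as Linux does.

```c
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: write two 3-byte messages into one end of a socket
   pair of the given type, then return how many bytes the first read on the
   other end delivers. Returns -1 on error. */
ssize_t first_read_size(int type)
{
    int sv[2];
    char buf[64];               /* large enough to hold both messages */
    ssize_t n = -1;

    if (socketpair(AF_UNIX, type, 0, sv) < 0)
        return -1;
    if (write(sv[0], "abc", 3) == 3 && write(sv[0], "def", 3) == 3)
        n = read(sv[1], buf, sizeof(buf));
    close(sv[0]);
    close(sv[1]);
    return n;
}
```

With SOCK_SEQPACKET the first read returns 3 bytes, exactly one message, even though the buffer has room for both. With SOCK_STREAM the same read is free to return both messages merged together.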

4. Shut down socket
Communication on a socket is bidirectional. We can disable I/O in one or both directions with the shutdown function.

 ubuntu@ip-172-31-23-227:~$ less /usr/include/x86_64-linux-gnu/sys/socket.h  
 ......  
 /* Shut down all or part of the connection open on socket FD.  
   HOW determines what to shut down:  
    SHUT_RD  = No more receptions;  
    SHUT_WR  = No more transmissions;  
    SHUT_RDWR = No more receptions or transmissions.  
   Returns 0 on success, -1 for errors. */  
 extern int shutdown (int __fd, int __how) __THROW;  
 ......  

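close is similar to shutdown, but it deallocates the entire socket, and only when the last descriptor referring to it is closed. If the socket descriptor has been duplicated with dup, the socket is not deallocated until the last duplicate is closed. shutdown, by contrast, takes effect regardless of how many descriptors refer to the socket.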
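
The difference shows up once a descriptor has been duplicated. Below is a minimal sketch, assuming Linux and a socketpair in place of a real connection. The helper name peer_sees_eof is made up for illustration.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: duplicate one end of a socket pair, then close the
   original descriptor, optionally calling shutdown(SHUT_WR) first.
   Returns 1 if the peer then sees end of file, 0 if the peer sees nothing
   yet, or -1 on error. */
int peer_sees_eof(int use_shutdown)
{
    int sv[2];
    int copy;
    int result;
    char c;
    ssize_t n;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    copy = dup(sv[0]);
    if (copy < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    if (use_shutdown)
        shutdown(sv[0], SHUT_WR);   /* no more transmissions on this socket */
    close(sv[0]);                   /* copy still refers to the socket */

    /* Non-blocking receive on the peer: 0 means EOF, EAGAIN means no news. */
    n = recv(sv[1], &c, 1, MSG_DONTWAIT);
    if (n == 0)
        result = 1;
    else if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        result = 0;
    else
        result = -1;

    close(copy);
    close(sv[1]);
    return result;
}
```

With close alone, the duplicate keeps the socket alive and the peer notices nothing. With shutdown, the peer sees end of file even though a duplicate descriptor is still open.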
