Thursday, August 14, 2014

Unix Prog: File I/O(1)

1. File Descriptor
To the kernel, all open files are referred to by file descriptors.
We pass the file descriptor (returned by "open" or "creat") to "read" or "write" to identify the file.

By convention, UNIX shells use descriptor 0 for standard input, 1 for standard output, and 2 for standard error.

shell:
standard input/output/error file descriptors are defined inside unistd.h
 ubuntu@ip-172-31-23-227:~$ less /usr/include/unistd.h  
 ......  
 /* Standard file descriptors. */  
 #define STDIN_FILENO  0    /* Standard input. */  
 #define STDOUT_FILENO  1    /* Standard output. */  
 #define STDERR_FILENO  2    /* Standard error output. */  
 ......  
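
As a minimal sketch of how these constants are used, the helper below writes a string straight to a descriptor with "write". The name put_string is our own, chosen only for illustration.

```c
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: write a NUL-terminated string to descriptor fd.
 * Returns the number of bytes written, or -1 on error (errno is set). */
ssize_t put_string(int fd, const char *s)
{
    return write(fd, s, strlen(s));
}
```

Calling put_string(STDOUT_FILENO, "hello\n") sends the text to standard output, and passing STDERR_FILENO instead sends it to standard error.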

OPEN_MAX determines how many files one process can have open at the same time.
File descriptors range from 0 to OPEN_MAX - 1.

shell:
1) From posix1_lim.h, we can see that POSIX.1 defines _POSIX_OPEN_MAX as 20 or 16, depending on the version. This is only the minimum value that every POSIX.1-compliant system must support.
2) For the current implementation, we can call sysconf with _SC_OPEN_MAX to get the number of files a process is allowed to have open.
 ubuntu@ip-172-31-23-227:~$ less /usr/include/x86_64-linux-gnu/bits/posix1_lim.h  
 ......  
 /* Number of files one process can have open at once. */  
 #ifdef __USE_XOPEN2K  
 # define _POSIX_OPEN_MAX    20  
 #else  
 # define _POSIX_OPEN_MAX    16  
 #endif  
 ......  
 ubuntu@ip-172-31-23-227:~$ less /usr/include/x86_64-linux-gnu/bits/confname.h  
 ......  
   _SC_OPEN_MAX,  
 #define _SC_OPEN_MAX          _SC_OPEN_MAX  
 ......  
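
A short sketch of the sysconf call mentioned in 2) follows. The wrapper name open_file_limit is hypothetical, and the actual value depends on the system and on the resource limits of the process.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical wrapper: query the per-process open-file limit at run time.
 * sysconf returns -1 both on error and when the limit is indeterminate,
 * so errno is cleared first: -1 with errno == 0 means "no definite limit". */
long open_file_limit(void)
{
    errno = 0;
    return sysconf(_SC_OPEN_MAX);
}
```

If the call returns a positive value n, valid descriptors for the process are 0 through n - 1.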

2. open, creat, close, lseek, read, write system definitions
shell:
1) open definition
2) creat definition
3) close definition
4) lseek definition
5) read definition
6) write definition
 ubuntu@ip-172-31-23-227:~$ grep open /usr/include/fcntl.h  
 ......  
 extern int open (const char *__file, int __oflag, ...) __nonnull ((1));  
 ......  
 ubuntu@ip-172-31-23-227:~$ grep creat /usr/include/fcntl.h  
 ......  
 extern int creat (const char *__file, mode_t __mode) __nonnull ((1));  
 ......  
 ubuntu@ip-172-31-23-227:~$ grep close /usr/include/unistd.h  
 extern int close (int __fd);  
 ......  
 ubuntu@ip-172-31-23-227:~$ grep lseek /usr/include/unistd.h  
 ......  
 extern __off_t lseek (int __fd, __off_t __offset, int __whence) __THROW;  
 ......  
 ubuntu@ip-172-31-23-227:~$ grep read /usr/include/unistd.h  
 ......  
 extern ssize_t read (int __fd, void *__buf, size_t __nbytes) __wur;  
 ......  
 ubuntu@ip-172-31-23-227:~$ grep write /usr/include/unistd.h  
 ......  
 extern ssize_t write (int __fd, const void *__buf, size_t __n) __wur;  
 ......  
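
To see the calls working together, here is a small sketch that writes a message to a file, seeks back to the start, and reads it back. The function name round_trip and the permission bits 0644 are our own choices, not something taken from the headers above.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write msg to path, rewind, and read it back into buf.
 * Returns the number of bytes read, or -1 if any step fails. */
ssize_t round_trip(const char *path, const char *msg, char *buf, size_t bufsize)
{
    /* Create the file if needed and empty it if it already exists. */
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd == -1)
        return -1;

    size_t len = strlen(msg);
    ssize_t n = -1;

    /* After write the offset sits at the end, so rewind before reading. */
    if (write(fd, msg, len) == (ssize_t)len &&
        lseek(fd, 0, SEEK_SET) == 0)
        n = read(fd, buf, bufsize);

    close(fd);
    return n;
}
```

Note that "read" does not NUL-terminate the buffer; the caller has to use the returned byte count.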

3. open mode list
O_RDONLY: the file is opened for "read only"
O_WRONLY: the file is opened for "write only"
O_RDWR: the file is opened for "read and write"
Exactly one of the above macros must be specified.

On the current system, grep finds them all in asm-generic/fcntl.h; programs get them by including fcntl.h.
 ubuntu@ip-172-31-23-227:~$ sudo find /usr/include -name *.h | xargs grep O_RDONLY  
 /usr/include/asm-generic/fcntl.h:#define O_RDONLY    00000000  
 ......  
 ubuntu@ip-172-31-23-227:~$ sudo find /usr/include -name *.h | xargs grep O_WRONLY  
 /usr/include/asm-generic/fcntl.h:#define O_WRONLY    00000001  
 ......  
 ubuntu@ip-172-31-23-227:~$ sudo find /usr/include -name *.h | xargs grep O_RDWR  
 /usr/include/asm-generic/fcntl.h:#define O_RDWR     00000002  
 ......  
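
Because the three values are 0, 1 and 2 rather than separate bits, testing "flags & O_RDONLY" is always false. The sketch below uses fcntl with F_GETFL and the O_ACCMODE mask, which are standard POSIX but not shown in the output above, to recover the access mode of an open descriptor. The function name access_mode_name is hypothetical.

```c
#include <fcntl.h>
#include <stddef.h>

/* Hypothetical helper: describe the access mode of an open descriptor.
 * Returns NULL if fd is not a valid open descriptor. */
const char *access_mode_name(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return NULL;

    /* O_ACCMODE masks out everything except the access mode bits. */
    switch (flags & O_ACCMODE) {
    case O_RDONLY: return "read only";
    case O_WRONLY: return "write only";
    case O_RDWR:   return "read and write";
    default:       return "unknown";
    }
}
```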

O_APPEND: append to the end of the file on each write.
O_CREAT: create the file if it doesn't exist; this requires the third argument to open, which specifies the permission bits of the new file.
O_EXCL: generate an error if O_CREAT is also specified and the file already exists.
O_TRUNC: if the file exists and is successfully opened for write-only or read-write, truncate its length to 0.
O_NOCTTY: if the pathname refers to a terminal device, do not allocate the device as the controlling terminal for this process.
O_NONBLOCK: if the pathname refers to a FIFO, a block special file, or a character special file, set nonblocking mode for both the open and subsequent I/O.
O_DSYNC: have each write wait for the physical I/O to complete, but don't wait for the file attributes to be updated if they don't affect the ability to read the data just written.
O_RSYNC: have each read operation on the file descriptor wait until any pending writes for the same portion of the file are complete.
O_SYNC: have each write operation wait for all physical I/O to complete, including the I/O needed to update the file attributes.

On the current system, grep finds the following flags in asm-generic/fcntl.h as well; programs get them by including fcntl.h.
 ubuntu@ip-172-31-23-227:~$ sudo find /usr/include -name *.h | xargs grep O_APPEND  
 ......  
 /usr/include/asm-generic/fcntl.h:#define O_APPEND    00002000  
 ......  
 ubuntu@ip-172-31-23-227:~$ sudo find /usr/include -name *.h | xargs grep O_CREAT  
 ......  
 /usr/include/asm-generic/fcntl.h:#define O_CREAT        00000100    /* not fcntl */  
 ......  
 ubuntu@ip-172-31-23-227:~$ sudo find /usr/include -name *.h | xargs grep O_EXCL  
 ......  
 /usr/include/asm-generic/fcntl.h:#define O_EXCL     00000200    /* not fcntl */  
 ......  
 ubuntu@ip-172-31-23-227:~$ sudo find /usr/include -name *.h | xargs grep O_TRUNC  
 ......  
 /usr/include/asm-generic/fcntl.h:#define O_TRUNC        00001000    /* not fcntl */  
 ......  
 ubuntu@ip-172-31-23-227:~$ sudo find /usr/include -name *.h | xargs grep O_NOCTTY  
 ......  
 /usr/include/asm-generic/fcntl.h:#define O_NOCTTY    00000400    /* not fcntl */  
 ......  
 ubuntu@ip-172-31-23-227:~$ sudo find /usr/include -name *.h | xargs grep O_NONBLOCK  
 ......  
 /usr/include/asm-generic/fcntl.h:#define O_NONBLOCK   00004000  
 ......  
 ubuntu@ip-172-31-23-227:~$ sudo find /usr/include -name *.h | xargs grep O_DSYNC
 ......
 /usr/include/asm-generic/fcntl.h:#define O_DSYNC                00010000        /* used to be O_SYNC, see below */
 ......
 ubuntu@ip-172-31-23-227:~$ sudo find /usr/include -name *.h | xargs grep O_SYNC
 ......
 /usr/include/asm-generic/fcntl.h:#define O_SYNC         (__O_SYNC|O_DSYNC)
 ......
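
The flags are combined with bitwise OR on top of one access mode. As a sketch under our own naming (create_exclusive and append_and_size are hypothetical, as are the permission bits), the first function uses O_CREAT with O_EXCL to create a file only when it is absent, and the second shows that with O_APPEND a write lands at the end of the file even after seeking to the start.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create path only if it does not exist yet.
 * Returns 0 if created, 1 if it already existed, -1 on any other error. */
int create_exclusive(const char *path)
{
    /* O_CREAT needs the third argument: permission bits for the new file. */
    int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd == -1)
        return errno == EEXIST ? 1 : -1;
    close(fd);
    return 0;
}

/* Hypothetical helper: append msg to path.
 * Returns the file offset after the write (the new file size), or -1. */
off_t append_and_size(const char *path, const char *msg)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
    if (fd == -1)
        return -1;

    size_t len = strlen(msg);
    off_t size = -1;

    /* The seek to offset 0 does not change where the data is written:
     * O_APPEND moves the offset to the end before each write. */
    if (lseek(fd, 0, SEEK_SET) == 0 &&
        write(fd, msg, len) == (ssize_t)len)
        size = lseek(fd, 0, SEEK_CUR);

    close(fd);
    return size;
}
```

The check and the creation in create_exclusive happen in a single open call, which is why this combination is commonly used for lock files.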


4. File Name Truncation
If a file name exceeds the length specified by the macro NAME_MAX, some systems silently truncate the name, while others fail the call and set errno to ENAMETOOLONG.

Explanation:
1) limits.h defines NAME_MAX; on the current system it is 255 characters.
2) We can also pass _PC_NAME_MAX to pathconf or fpathconf (not sysconf, since the limit depends on the file system) to get the maximum length at run time.
3) On the current system, _POSIX_NO_TRUNC is defined. When it is in effect, the system does not silently truncate a name that is too long; instead, the call fails and errno is set to ENAMETOOLONG.
4) errno.h defines the ENAMETOOLONG macro.
 ubuntu@ip-172-31-23-227:~$ sudo find /usr/include -name *.h | xargs grep "#define NAME_MAX"  
 /usr/include/linux/limits.h:#define NAME_MAX     255    /* # chars in a file name */  
 ubuntu@ip-172-31-23-227:~$ sudo find /usr/include -name *.h | xargs grep "_PC_NAME_MAX"  
 /usr/include/x86_64-linux-gnu/bits/confname.h:  _PC_NAME_MAX,  
 /usr/include/x86_64-linux-gnu/bits/confname.h:#define  _PC_NAME_MAX          _PC_NAME_MAX  
 ubuntu@ip-172-31-23-227:~$ sudo find /usr/include -name *.h | xargs grep "_POSIX_NO_TRUNC"
 /usr/include/unistd.h:   _POSIX_NO_TRUNC                Pathname components longer than
 /usr/include/x86_64-linux-gnu/bits/posix_opt.h:#define  _POSIX_NO_TRUNC 1
 ubuntu@ip-172-31-23-227:~$ sudo find /usr/include -name *.h | xargs grep "ENAMETOOLONG"
 /usr/include/asm-generic/errno.h:#define        ENAMETOOLONG    36      /* File name too long */
 /usr/include/stdlib.h:   ENAMETOOLONG; if the name fits in fewer than PATH_MAX chars,
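
To check this behavior on a given system, the sketch below asks pathconf for the name limit of a directory and then tries to open a name one character longer. The function name probe_long_name is hypothetical. On a system that enforces _POSIX_NO_TRUNC, like the one shown here, we would expect ENAMETOOLONG; a system that truncates silently would more likely report ENOENT.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: try to open a file in dir whose name is one
 * character longer than the directory's NAME_MAX.
 * Returns the errno of the failed open, 0 if the open unexpectedly
 * succeeded, or -1 if the limit could not be determined. */
int probe_long_name(const char *dir)
{
    long name_max = pathconf(dir, _PC_NAME_MAX);
    if (name_max <= 0)
        return -1;

    size_t dir_len = strlen(dir);
    size_t name_len = (size_t)name_max + 1;

    /* dir + '/' + name + terminating NUL */
    char *path = malloc(dir_len + 1 + name_len + 1);
    if (path == NULL)
        return -1;

    memcpy(path, dir, dir_len);
    path[dir_len] = '/';
    memset(path + dir_len + 1, 'a', name_len);
    path[dir_len + 1 + name_len] = '\0';

    int result = 0;
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        result = errno;
    else
        close(fd);

    free(path);
    return result;
}
```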

