Sunday, August 31, 2014

Unix Prog: Standard I/O Efficiency

fileio1.c: -> io1.out
This program uses getc to read each input character and writes it to standard output with putc.
 #include<stdio.h>  
 #include<stdlib.h>  
 #include<unistd.h>  
   
 int main(int argc, char* argv[])  
 {  
  int c; /* int, not char: getc returns an int so EOF stays distinct from every byte value */  
  while((c = getc(stdin)) != EOF)  
   if(putc(c, stdout) == EOF) {  
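    fprintf(stderr, "output error!\n"); /* stdout is the stream that just failed */  
    exit(1);  
   }  
   
  if(ferror(stdin)) {  
   fprintf(stderr, "input error!\n"); /* keep the message out of the redirected output */  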
   exit(2);  
  }  
   
  exit(0);  
 }  
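
The loop condition depends on getc returning an int: EOF is a negative int that no byte value can equal, and that property is lost if the result is first narrowed to char. The following is a minimal sketch of the difference; the helper names count_bytes and char_compares_equal_to_eof are hypothetical and not part of the program above.

```c
#include <stdio.h>

/* Count the bytes in a stream, keeping the getc result in an int so that
 * the byte 0xFF is never mistaken for EOF. */
long count_bytes(FILE *fp)
{
    int c;
    long n = 0;

    while ((c = getc(fp)) != EOF)
        n++;
    return n;
}

/* Show what the comparison sees when a byte is narrowed to char first.
 * Where char is signed, 0xFF typically becomes -1 and compares equal to EOF,
 * ending the loop early. Where char is unsigned, no value ever compares
 * equal to EOF, so the loop would never end. */
int char_compares_equal_to_eof(unsigned char byte)
{
    char c = (char)byte;

    return c == EOF;
}
```

With an int variable, a file containing the bytes 'A', 0xFF, 'B' is counted as three bytes on any platform.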

fileio2.c: -> io2.out
This program uses fgetc to read each input character and writes it to standard output with fputc.
 #include<stdio.h>  
 #include<stdlib.h>  
 #include<unistd.h>  
   
 int main(int argc, char* argv[])  
 {  
  int c; /* int, not char: fgetc returns an int so EOF stays distinct from every byte value */  
  while((c = fgetc(stdin)) != EOF)  
   if(fputc(c, stdout) == EOF) {  
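    fprintf(stderr, "output error!\n"); /* stdout is the stream that just failed */  
    exit(1);  
   }  
   
  if(ferror(stdin)) {  
   fprintf(stderr, "input error!\n"); /* keep the message out of the redirected output */  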
   exit(2);  
  }  
   
  exit(0);  
 }  

fileio3.c: -> io3.out
This program uses fgets to read each input line and writes it to standard output with fputs.
 #include<stdio.h>  
 #include<stdlib.h>  
 #include<unistd.h>  
   
 int main(int argc, char* argv[])  
 {  
  char buf[BUFSIZ];  
  while(fgets(buf, BUFSIZ, stdin) != NULL)  
   if(fputs(buf, stdout) == EOF) {  
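    fprintf(stderr, "output error!\n"); /* stdout is the stream that just failed */  
    exit(1);  
   }  
   
  if(ferror(stdin)) {  
   fprintf(stderr, "input error!\n"); /* keep the message out of the redirected output */  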
   exit(2);  
  }  
   
  exit(0);  
 }  

shell:
1) List the file test.txt, a large file of about 130 MB.
2) Use the fileio1.c program (io1.out) to read input from test.txt and write it to output_1.txt.
3) Use the fileio2.c program (io2.out) to read input from test.txt and write it to output_2.txt.
4) Use the fileio3.c program (io3.out) to read input from test.txt and write it to output_3.txt.
 ubuntu@ip-172-31-23-227:~$ ls -lrt test.txt  
 -rw-rw-r-- 1 ubuntu ubuntu 130000000 Aug 31 14:30 test.txt  
 ubuntu@ip-172-31-23-227:~$ ./io1.out <test.txt >output_1.txt  
 ubuntu@ip-172-31-23-227:~$ ./io2.out <test.txt >output_2.txt  
 ubuntu@ip-172-31-23-227:~$ ./io3.out <test.txt >output_3.txt  
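
The session above runs each program but does not show the timings. Prefixing a command with the shell's time keyword (for example, time ./io1.out <test.txt >output_1.txt) reports real, user, and sys time. The same user and system figures can also be read from inside a program. The following is a minimal sketch, assuming getrusage(2) is available; the helper names tv_seconds, cpu_times, and report_cpu_times are hypothetical and not part of the programs above.

```c
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>

/* Convert a struct timeval to seconds. */
double tv_seconds(struct timeval tv)
{
    return (double)tv.tv_sec + (double)tv.tv_usec / 1e6;
}

/* Store the user and system CPU seconds used so far by this process.
 * Returns 0 on success, -1 if getrusage fails. */
int cpu_times(double *user, double *sys)
{
    struct rusage ru;

    if (getrusage(RUSAGE_SELF, &ru) != 0)
        return -1;
    *user = tv_seconds(ru.ru_utime);
    *sys = tv_seconds(ru.ru_stime);
    return 0;
}

/* Print the figures to stderr so they do not mix with data copied to stdout. */
void report_cpu_times(const char *label)
{
    double user, sys;

    if (cpu_times(&user, &sys) == 0)
        fprintf(stderr, "%s: user %.3fs, sys %.3fs\n", label, user, sys);
}
```

Calling report_cpu_times just before exit(0) in any of the three programs would print that run's user and system time.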

Summary of time consumption for the 3 approaches:
1) getc and putc take the most time, mainly user time, which is the time spent running the loop inside the program. Character-at-a-time I/O executes the loop once per byte (about 130 million iterations for this file), far more often than line-at-a-time I/O, so its user time is larger.
2) getc and putc still take much less time than calling the "read" system call with a buffer size of 1. They do not make a system call on every invocation: the stream buffer absorbs most calls, so the "read" and "write" system calls are made far less often.
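3) fgets and fputs are much faster than getc and putc. User time drops because the loop runs once per line instead of once per character. System time (kernel) should be about the same as with getc and putc, because the number of "read" and "write" system calls is set by the stream buffer size, not by which standard I/O function fills or drains the buffer. How fgets and fputs are implemented affects user time instead: a version built on getc and putc still handles one character at a time internally, while one built on memccpy copies a line in a single call and can be faster.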
4) In summary, standard I/O costs almost the same system time as calling read and write directly with a good buffer size. How the program is written, for example how many loop iterations it runs, can still change user time considerably (user time is the time spent in user space, in code the developer normally writes directly).
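
For comparison with points 2 and 4, the direct read/write approach can be sketched as below. This is a minimal sketch, not code from this post: the function name copy_fd and its return value (the number of read calls that returned data) are assumptions chosen for illustration.

```c
#include <stdlib.h>
#include <unistd.h>

/* Copy everything from in_fd to out_fd with read(2) and write(2), using a
 * caller-chosen buffer size. Returns the number of read() calls that
 * returned data, or -1 on error. */
long copy_fd(int in_fd, int out_fd, size_t bufsize)
{
    char *buf;
    long calls = 0;
    ssize_t n;

    if (bufsize == 0)
        return -1;
    buf = malloc(bufsize);
    if (buf == NULL)
        return -1;

    while ((n = read(in_fd, buf, bufsize)) > 0) {
        ssize_t off = 0;

        calls++;
        /* write() may accept fewer bytes than requested, so loop until done. */
        while (off < n) {
            ssize_t w = write(out_fd, buf + off, (size_t)(n - off));

            if (w < 0) {
                free(buf);
                return -1;
            }
            off += w;
        }
    }

    free(buf);
    return n < 0 ? -1 : calls;
}
```

Calling copy_fd(STDIN_FILENO, STDOUT_FILENO, 1) makes one read system call per byte, while passing a buffer of a few kilobytes makes one per block, which is roughly what the standard I/O stream buffer does on the program's behalf.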
