Saturday, August 23, 2014

Unix Prog: Files -- Size and Truncation

1. File sizes
The st_size member of the stat structure holds the size of the file in bytes.

fileio.c:
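 #include<stdio.h>
 #include<stdlib.h>
 #include<unistd.h>
 #include<sys/stat.h>

 int main(int argc, char* argv[])
 {
  struct stat statbuff;

  /* Require exactly one path argument. */
  if(argc != 2) {
   fprintf(stderr, "usage: %s <path>\n", argv[0]);
   exit(1);
  }

  /* lstat does not follow symbolic links, so a link reports its own size. */
  if(lstat(argv[1], &statbuff) < 0) {
   perror(argv[1]);
   exit(1);
  }

  /* st_size is an off_t, which may be wider than int. */
  printf("file size: %lld\n", (long long)statbuff.st_size);

  exit(0);
 }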

shell:
1) List the files in the current directory: three regular files, one symbolic link, and one directory. The output includes each file's size.
2) Run the program against fileio.c; st_size is 304.
3) Run the program against slnk; st_size is 8. For a symbolic link, st_size is the length of the path stored in the link, here the 8 characters of "fileio.c". The count does not include a terminating null byte.
4) Run the program against testdir; st_size is 4096. The size of a directory depends on the filesystem and is typically a multiple of some block or record size; it is not the length of the directory's name.
 ubuntu@ip-172-31-23-227:~$ ls -lrt  
 total 24  
 -rw-rw-r-- 1 ubuntu ubuntu  128 Aug 23 14:10 fileio.c~  
 -rw-rw-r-- 1 ubuntu ubuntu  304 Aug 23 14:40 fileio.c  
 -rwxrwxr-x 1 ubuntu ubuntu 10600 Aug 23 14:40 io.out  
 lrwxrwxrwx 1 ubuntu ubuntu   8 Aug 23 14:42 slnk -> fileio.c  
 drwxrwxr-x 2 ubuntu ubuntu 4096 Aug 23 14:42 testdir  
 ubuntu@ip-172-31-23-227:~$ ./io.out fileio.c  
 file size: 304  
 ubuntu@ip-172-31-23-227:~$ ./io.out slnk  
 file size: 8  
 ubuntu@ip-172-31-23-227:~$ ./io.out testdir  
 file size: 4096  

2. Files with holes
fileio.c:
 #include<stdio.h>  
 #include<stdlib.h>  
 #include<unistd.h>  
 #include<fcntl.h>  
 #include<sys/stat.h>  
   
 int main(int argc, char* argv[])  
 {  
  struct stat buf;  
   
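  /* The path "hole" is hard-coded; command-line arguments are ignored. */
  if(lstat("hole", &buf) < 0) {
   perror("hole");
   exit(1);
  }

  /* Cast to long long: off_t, blksize_t and blkcnt_t may be wider than int. */
  printf("st_size: %lld st_blksize: %lld st_blocks: %lld\n",
         (long long)buf.st_size, (long long)buf.st_blksize, (long long)buf.st_blocks);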
   
  exit(0);  
 }  


shell:
1) List the file hole; its size is 16396. The file contains a "hole": a range of offsets that was never written. A hole is created when a process seeks past the end of the file with lseek and then writes.
2) Copy hole to hole.cp with cat. When the read system call reads from a hole, it returns bytes with the value 0 for that range, so cat writes real zero bytes to hole.cp and the copy no longer contains a hole.
3) List hole and hole.cp. ls shows the same size for both, because it reports the st_size member of struct stat, which includes the hole.
4) Use du to show how much disk space each file occupies. hole occupies 8 blocks, because the hole does not take up disk space. hole.cp occupies 20 blocks, because its zero bytes are actually stored on disk.
5) Run the program with each file name to print st_size, st_blksize, and st_blocks. Because the program always calls lstat on "hole" and ignores its argument, both runs report the values for hole. On Linux, st_blocks counts 512-byte units, so 16 blocks is the same 8 KB that du reports for hole.
 ubuntu@ip-172-31-23-227:~$ ls -lrt hole  
 -rw-rw-r-- 1 ubuntu ubuntu 16396 Aug 23 14:56 hole  
 ubuntu@ip-172-31-23-227:~$ cat hole > hole.cp  
 ubuntu@ip-172-31-23-227:~$ ls -lrt hole*  
 -rw-rw-r-- 1 ubuntu ubuntu 16396 Aug 23 14:56 hole  
 -rw-rw-r-- 1 ubuntu ubuntu 16396 Aug 23 15:08 hole.cp  
 ubuntu@ip-172-31-23-227:~$ du hole  
 8    hole  
 ubuntu@ip-172-31-23-227:~$ du hole.cp  
 20   hole.cp  
 ubuntu@ip-172-31-23-227:~$ ./io.out hole
 st_size: 16396 st_blksize: 4096 st_blocks: 16
 ubuntu@ip-172-31-23-227:~$ ./io.out hole.cp
 st_size: 16396 st_blksize: 4096 st_blocks: 16

3. File Truncation

Definition:
 ubuntu@ip-172-31-23-227:~$ less /usr/include/unistd.h  
 ......  
 /* Truncate FILE to LENGTH bytes. */  
 # ifndef __USE_FILE_OFFSET64  
 extern int truncate (const char *__file, __off_t __length)  
    __THROW __nonnull ((1)) __wur;  
 ......  
 # ifndef __USE_FILE_OFFSET64  
 extern int ftruncate (int __fd, __off_t __length) __THROW __wur;  
 ......  

truncate and ftruncate set the size of a file to the specified length.
truncate takes the path name of the file.
ftruncate takes a file descriptor that is open for writing.

fileio.c:
 #include<stdio.h>  
 #include<stdlib.h>  
 #include<unistd.h>  
   
 int main(int argc, char* argv[])  
 {  
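  /* test1.txt is longer than 12 bytes, so it is cut down to 12. */
  if(truncate("test1.txt", 12) == -1) {
   perror("test1.txt");
   exit(1);
  }

  /* test2.txt is shorter than 12 bytes, so it is extended to 12. */
  if(truncate("test2.txt", 12) == -1) {
   perror("test2.txt");
   exit(1);
  }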
   
  exit(0);  
 }  

shell:
1) List the two files: test1.txt is 26 bytes and test2.txt is 7 bytes.
2) Print both files.
3) Run the program, which truncates both files to 12 bytes.
4) List the two files again; both are now 12 bytes.
5) Print both files again.

This example shows that, on this system, a file longer than the requested length of 12 loses everything after the first 12 bytes, while a file shorter than 12 bytes is extended to 12 bytes, with the added bytes set to 0. The zero bytes are not visible in the terminal, which is why test2.txt looks unchanged when printed.
 ubuntu@ip-172-31-23-227:~$ ls -lrt test*  
 -rw-rw-r-- 1 ubuntu ubuntu 26 Aug 23 15:24 test1.txt  
 -rw-rw-r-- 1 ubuntu ubuntu 7 Aug 23 15:24 test2.txt  
 ubuntu@ip-172-31-23-227:~$ cat test1.txt  
 Hello world!  
 HELLO WORLD!  
 ubuntu@ip-172-31-23-227:~$ cat test2.txt  
 Hello!  
 ubuntu@ip-172-31-23-227:~$ ./io.out  
 ubuntu@ip-172-31-23-227:~$ ls -lrt test*  
 -rw-rw-r-- 1 ubuntu ubuntu 12 Aug 23 15:25 test2.txt  
 -rw-rw-r-- 1 ubuntu ubuntu 12 Aug 23 15:25 test1.txt  
 ubuntu@ip-172-31-23-227:~$ cat test1.txt  
 Hello world!ubuntu@ip-172-31-23-227:~$ cat test2.txt  
 Hello!  
