Sunday, September 7, 2014

Unix Prog: Process Memory Layout

1. C Program Memory Layout


1) Text Segment: the machine instructions that the CPU executes. The text segment is sharable, so only one copy needs to be kept in memory for frequently used programs such as editors and compilers.

2) Initialized Data: variables that are defined outside any function and explicitly given an initial value. For example: int a = 10;

Note: only the text segment and the initialized data need to be stored in the program file on disk. The other parts are set up in memory when the program runs.

3) Uninitialized Data: variables that are defined outside any function and not given an initial value. For example: int a;

This data is not stored on disk. It is set up in memory when the program starts, and exec initializes it to 0 by default.

4) Stack: holds the automatic variables defined inside functions, along with the information that is saved each time a function is called. When a function returns, its information is popped off the stack.

5) Heap: where dynamic memory allocation usually takes place.

2. size Command

shell:
This indicates that, for the a.out file, the text segment is 1598 bytes, "data" (initialized global variables) is 568 bytes, and "bss" (uninitialized global variables) is 16 bytes. "dec" and "hex" are the total size in decimal and in hexadecimal.
 ubuntu@ip-172-31-23-227:~$ size ./a.out  
   text  data   bss   dec   hex filename  
   1598   568   16  2182   886 ./a.out  

3. Shared Libraries

Shared libraries remove the common routines from the executable file and maintain a single copy of each library routine in memory that all processes can reference.

This dramatically reduces the size of each executable file, but it adds some run-time overhead, either when the program is first executed or when each shared library function is first called.

shell example:
1) Compile proc.c without shared libraries (this is what -static means).
2) Run size: the executable is dramatically larger.
3) Compile proc.c with shared libraries (the default).
4) Run size: the executable is much smaller.
 ubuntu@ip-172-31-23-227:~$ gcc -g -static proc.c  
 ubuntu@ip-172-31-23-227:~$ size ./a.out  
   text  data   bss   dec   hex filename  
  782283  7532  9632 799447  c32d7 ./a.out  
 ubuntu@ip-172-31-23-227:~$ gcc -g proc.c  
 ubuntu@ip-172-31-23-227:~$ size ./a.out  
   text  data   bss   dec   hex filename  
   1598   568   16  2182   886 ./a.out  

This indicates that even a very simple C program still refers to many common library routines.

4. Memory Allocation

1) malloc: allocates a specified number of bytes of memory. The initial value of the memory is indeterminate.
2) calloc: allocates space for a specified number of objects of a specified size. The memory is initialized to all zero bits.
3) realloc: increases or decreases the size of a previously allocated area. When the size increases, the initial value of the space between the end of the old contents and the end of the new area is indeterminate.

Definition:
 ubuntu@ip-172-31-23-227:~$ less /usr/include/stdlib.h  
 ......  
 /* Allocate SIZE bytes of memory. */  
 extern void *malloc (size_t __size) __THROW __attribute_malloc__ __wur;  
   
 /* Allocate NMEMB elements of SIZE bytes each, all initialized to 0. */  
 extern void *calloc (size_t __nmemb, size_t __size)  
 __THROW __attribute_malloc__ __wur;  
   
 /* Re-allocate the previously allocated block in __ptr, making the new  
   block SIZE bytes long. */  
 /* __attribute_malloc__ is not used, because if realloc returns  
   the same pointer that was passed to it, aliasing needs to be allowed  
   between objects pointed by the old and new pointers. */  
 extern void *realloc (void *__ptr, size_t __size)  
 __THROW __attribute_warn_unused_result__;  
   
 /* Free a block allocated by `malloc', `realloc' or `calloc'. */  
 extern void free (void *__ptr) __THROW;  
 ......  

1) Although malloc, calloc, and realloc return a void* pointer, the memory they return is suitably aligned for any type of object.

For example, on many systems a double must start at a memory address that is a multiple of 8, so the pointer returned by these functions can safely be converted to a pointer to double.

2) free deallocates the space pointed to by ptr. The freed space is usually not returned to the kernel; instead, it is kept in a pool for later malloc calls.

3) realloc: if the caller grows the area and there is already enough free space after the end of the old area, realloc extends it in place. Otherwise, realloc allocates a new area, copies the old contents to it, and then frees the old area.

memory.c:
It allocates 100 bytes with malloc, then uses realloc to resize the area to 200 bytes, and lastly uses realloc again to resize it to 200000 bytes.
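 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>

 int main(int argc, char* argv[])
 {
  char *pc;

  /* Allocate the initial 100-byte area. */
  if((pc = (char*)malloc(100)) == NULL) {
   printf("malloc error!\n");
   exit(1);
  }

  strcpy(pc, "Hello world!");

  printf("%p, %s\n", pc, pc);

  /* Grow to 200 bytes. Assigning the result straight back to pc is
     acceptable here only because the program exits on failure. */
  if((pc = (char*)realloc(pc, 200)) == NULL) {
   printf("realloc error!\n");
   exit(2);
  }

  printf("%p, %s\n", pc, pc);

  /* Grow to 200000 bytes; the area may move to a new address. */
  if((pc = (char*)realloc(pc, 200000)) == NULL) {
   printf("realloc error!\n");
   exit(2);
  }

  printf("%p, %s\n", pc, pc);

  free(pc);
  return 0;
 }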

shell:
1) malloc allocates the memory; the starting address is 0x1d4c010.
2) realloc resizes the area to 200 bytes. Luckily there is enough space after the current area, so no data is moved and the starting address stays the same.
3) realloc resizes the area to 200000 bytes. This time there is not enough space after the current area, so a new area is allocated, all data is copied there, and the old area is deallocated. The starting address therefore changes.
 ubuntu@ip-172-31-23-227:~$ ./a.out  
 0x1d4c010, Hello world!  
 0x1d4c010, Hello world!  
 0x7ff872d99010, Hello world!  
