An overview of Linux processes
2013-08-14 15:41
Himanshuz.chd|July 10 2012
A process is one of the most fundamental concepts of the Linux operating system. This article focuses on the basics of Linux processes.
Process
A process is an instance of a program running in Linux. You have probably heard this basic definition before. Although it is simple enough to understand, let's elaborate a bit for beginners. Let's quickly create a hello world program in C:
#include <stdio.h>

int main(void)
{
    unsigned int i; /* loop counter for the busy-wait below */

    printf("\n Hello World\n");

    /* Simulate a wait for some time */
    for (i = 0; i < 0xFFFFFFFF; i++)
        ;

    return 0;
}
Compile the code above :
$ gcc -Wall hello_world.c -o hello_world
Run the executable :
$ ./hello_world
The command above executes the hello world program. Since the program waits for some time, quickly switch to another terminal and check for a process named 'hello_world' :
$ ps -aef | grep hello_world
himanshu 2260 2146 95 20:38 pts/0 00:00:13 ./hello_world
So we see that a process named 'hello_world' is running in the system. Now run the same program in parallel from two or three terminals and run the above command again. I ran the program in parallel from three different terminals, and here is the output of the above command :
$ ps -aef | grep hello_world
himanshu 2320 2146 99 20:43 pts/0 00:00:03 ./hello_world
himanshu 2321 2261 67 20:43 pts/1 00:00:02 ./hello_world
himanshu 2322 2287 72 20:43 pts/2 00:00:00 ./hello_world
So you see that each instance of the hello_world program created a separate process. Hence we say that a process is a running instance of a program.
Identifiers associated with a process
Each process has the following identifiers associated with it:
Process Identifier (PID)
Each process has a unique identifier associated with it, known as the process ID. At any given time this ID is unique across the system. For example, if you run the ps command on your Linux box, you will see something like:
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 19:43 ?        00:00:00 /sbin/init
root         2     0  0 19:43 ?        00:00:00 [kthreadd]
root         3     2  0 19:43 ?        00:00:00 [migration/0]
root         4     2  0 19:43 ?        00:00:00 [ksoftirqd/0]
root         5     2  0 19:43 ?        00:00:00 [watchdog/0]
root         6     2  0 19:43 ?        00:00:00 [migration/1]
root         7     2  0 19:43 ?        00:00:00 [ksoftirqd/1]
root         8     2  0 19:43 ?        00:00:00 [watchdog/1]
root         9     2  0 19:43 ?        00:00:00 [events/0]
root        10     2  0 19:43 ?        00:00:00 [events/1]
... ... ...
The above output is from my Linux box. The second column (PID) gives the process ID of the process described in that row. You may notice another similar-looking column, PPID. This gives the process ID of the parent of that process. Every process in a Linux system has a parent; the PPID of 0 shown in the first two rows indicates that those processes were created by the kernel itself rather than by another process.
User and group Identifiers (UID and GID)
The second category of identifiers associated with a process is the user and group identifiers. The user and group IDs can be further classified into :
Real user ID and real group ID
These identifiers give information about the user and group to which a process belongs. Any process inherits these identifiers from its parent process.
Effective user ID, effective group ID and supplementary group ID
Ever got an error like "Permission denied"? This is a common error, and it usually occurs when a process does not have sufficient permissions to carry out a task. These three IDs are used to determine the permissions that a process has for operations that require special permissions. Usually the effective user ID is the same as the real user ID. If it is different, the process is running with privileges other than the ones it has by default (i.e. inherited from its parent).
If a process is running with effective user ID '0', it has special privileges. Processes with an effective user ID of zero are known as privileged processes, as they are running as superuser. These processes bypass the permission checks that the kernel applies to unprivileged processes.
The init process
In Linux every process has a parent process. Now, one would ask whether there has to be some starting point, some process that is created first. Yes, there is a process known as 'init', which is the very first process that the Linux kernel creates after the system boots up. All user processes created from there on are children of this process, either directly or indirectly. The init process has special privileges in the sense that it cannot be killed. The only time it terminates is when the Linux system is shut down. The init process always has process ID 1 associated with it.
Zombie and orphan processes
Suppose there are two processes, one the parent and the other its child. In practice, there can be two scenarios:
The parent dies or gets killed before the child.
In the above scenario, the child process becomes an orphan process (as it has lost its parent). In Linux, the init process comes to the rescue of orphan processes and adopts them. This means that after a child has lost its parent, the init process becomes its new parent process.
The child dies and parent does not perform wait() immediately.
Whenever a child terminates, its termination status is made available to the parent through the wait() family of calls. So the kernel waits for the parent to retrieve the termination status of the child before it completely wipes out the child process. In a case where the parent does not immediately perform the wait() (in order to fetch the termination status), the terminated child process becomes a zombie process. A zombie process is one that is waiting for its parent to fetch its termination status. Although the kernel releases all the resources that the zombie process was holding before it terminated, some information, like its termination status, its process ID etc., is still stored by the kernel. Once the parent performs the wait() operation, the kernel clears this information too.
Daemon process
A process that needs to run for a long period of time and does not require a controlling terminal is usually programmed as a daemon process. For example, monitoring software such as a key-logger is usually programmed as a daemon process. A daemon process has no controlling terminal.
Memory layout of a process
A process can broadly be divided into the following segments :
Stack
The stack contains all the data that is local to a function, like variables, pointers etc. Each function call gets its own stack frame. Stack memory is dynamic in the sense that it grows with each function being called.
Heap
The heap segment contains memory that the program requests dynamically at run time, for example through malloc().
Data
All global and static variables become part of this segment.
Text
All the program instructions, hard-coded strings and constant values are part of this memory area.
If we extend the above hello world program to something like :
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int a;

int main(void)
{
    unsigned int i = 0;
    char *ptr = (char *)malloc(15);

    if (ptr == NULL)
        return 1;

    memset(ptr, 0, 15);
    memcpy(ptr, "Hello World", 11);
    printf("\n %s \n", ptr);

    /* Simulate a wait for some time */
    for (i = 0; i < 0xFFFFFFFF; i++)
        ;

    free(ptr);
    return 0;
}
In the example above :
- The variable 'a' goes into the data segment (specifically into the BSS segment, which contains all the uninitialized globals).
- The variables 'i' and 'ptr' lie on the stack segment. Each function call, like memset, memcpy, printf and free, gets its own stack frame once it is called.
- The constant values like "Hello World", '15', '11', '0', '0XFFFFFFFF' and all the instructions are part of the text segment.
- The 15 bytes of memory allocated by the malloc function are allocated on the heap. So the pointer 'ptr' holds the address of a memory location on the heap.
Note that each process has its own heap within its own virtual address space, so heap corruption affects only the process in which it occurs. Overuse of the heap, however, consumes system memory and might therefore affect other programs running in the system.
Linux process environment
The environment in Linux is a list of 'variable=value' pairs that is used for a variety of purposes. Programs, scripts, shells etc. use this information for their smooth operation. For example, the home directory of the user who is presently logged in can be accessed through the 'HOME' environment variable. The list of these environment variables along with their values can be viewed using the 'env' command. For example, on my Linux box I could see the following output of the env command :
ORBIT_SOCKETDIR=/tmp/orbit-himanshu
SSH_AGENT_PID=1653
TERM=xterm
SHELL=/bin/bash
XDG_SESSION_COOKIE=b8b52be9a0280f3c8b48fcf04d7ac5a3-1341925217.889152-1390765341
WINDOWID=62917358
GNOME_KEYRING_CONTROL=/tmp/keyring-6UEJQ4
GTK_MODULES=canberra-gtk-module
USER=himanshu
SSH_AUTH_SOCK=/tmp/keyring-6UEJQ4/ssh
DEFAULTS_PATH=/usr/share/gconf/gnome.default.path
SESSION_MANAGER=local/himanshu-laptop:@/tmp/.ICE-unix/1619,unix/himanshu-laptop:/tmp/.ICE-unix/1619
USERNAME=himanshu
XDG_CONFIG_DIRS=/etc/xdg/xdg-gnome:/etc/xdg
DESKTOP_SESSION=gnome
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
PWD=/home/himanshu
GDM_KEYBOARD_LAYOUT=us
LANG=en_IN
GNOME_KEYRING_PID=1601
MANDATORY_PATH=/usr/share/gconf/gnome.mandatory.path
GDM_LANG=en_IN
GDMSESSION=gnome
SPEECHD_PORT=7560
SHLVL=1
HOME=/home/himanshu
GNOME_DESKTOP_SESSION_ID=this-is-deprecated
LOGNAME=himanshu
XDG_DATA_DIRS=/usr/share/gnome:/usr/local/share/:/usr/share/
DBUS_SESSION_BUS_ADDRESS=unix:abstract=/tmp/dbus-AWvAHVEXeC,guid=62c39aae57aa4bfc10e80e444ffc2762
DISPLAY=:0.0
XAUTHORITY=/var/run/gdm/auth-for-himanshu-yxPNRW/database
COLORTERM=gnome-terminal
_=/usr/bin/env
So we can see that a long list of environment variables is available. A user can add an environment variable using the 'export' command. In C, the extern variable char **environ can be used to access this list in a program. Functions like getenv(), setenv() etc. are available to manipulate the process environment.
Manipulating Linux resource limits
Any process in Linux can get hold of resources like files, memory etc. As always, there is a limit to these resources per process. Each resource has a soft and a hard limit associated with it. The soft limit is the value currently enforced for a resource and can be changed, while the hard limit is the cap up to which the soft limit can be raised. Linux provides command line utilities like 'ulimit' (a shell built-in) to manipulate these resource limits. On the other hand, system calls like getrlimit() and setrlimit() can be used to work with these limits from within C code.