tofutu: Chapter 7: Process Scheduling

7.1 Scheduling Policy
- Based on time-sharing technique. CPU time is divided into slices, one for each runnable process.
- The scheduling policy is based on ranking process according to priority.
- Linux applies dynamic process priority assignment. If a process is denied for using CPU for a long time, its priority is increased dynamically.

Linux 把process 區分為三種
- Interactive process: spend lots of time waiting for events, example: shell, text editor
- Batch process: no need to be responsive, no user interaction, run in background
- Real-time process: have very stringent scheduling requirements. ex: video and sound application

"Linux 2.6 scheduler implements a sophisticated heuristic algorithm based on the past behavior of the processes to decide whether a given process should be considered as interactive or batch. " [ref1]

7.1.1 Preemption
- 當Process在執行時(TASK_RUNNING), 如果priority比現在current的process較高, current process 會被preempt然後context switch.
- 第二種情形是quantum expires

- Determine the quantum duration is critical for system performance. If the average quantum is too short, the system overhead caused by context switching is high.
- Quantum太長不會影響interactive program的performance, 因為interactive program有比較高的priority, 會preempt current running process.
- The rule of thumb adopted by Linux is choose a duration as long as possible, while keeping good system response time.

7.2 The scheduling algorithm
- 每個processor都有個swapper (PID=0) , 當CPU沒有process可執行時, 執行swapper
- 每個linux process is always scheduled according to one of the scheduling classes:
1. SCHED_FIFO: FIFO real-time process. 如果沒有更高優先權的real-time process, 此process會持續使用CPU. (Leave the process descr. in its current position in the runqueue)
2. SCHED_RR: put the process descr. at the end of the runqueue. 當被higher priority process interrupt, 就會從後面開始
3. SCHED_NORMAL: conventional, time-shared process.

7.2.1 Scheduling of Conventional Process
一個process會有static priority跟dynamic priority
Static
- static priority: 100(high) - 139(low)
- use nice() or setpriority() system calls
- base quantum time會根據不同的static priority來作調整
- 較高的priority有較大的quantum time
- ex: priority: 100, nice value: -20, base time quantum: 800ms
Dynamic
- 100 - 139
- dynamic priority = max( 100, min (static priority - bonus + 5, 139))
- bonus會根據過去的紀錄來決定, related to the average sleep time of the process.
- bonus range between 0 -10, greater than 5 is a premium that raises the dynamic priority.
- ex: sleep less than 100 ms: bonus = 0, greater than 1 second: bonus = 10

Starvation:
較高priority的process可能造成較低的process無法使用CPU.
"when a process finishes its time quantum, it can be replaced by a lower priority process whose time quantum has not yet been exhausted"
The scheduler keeps two set of runnable processes:

Active process: process自已的quantum還沒用完.
Expired process: the runnable process have exhausted their time quantum and are thus forbidden to run until all active processes expire. quantum用完後就放到expire list
from ref1: figure 7-1

每一個CPU有一個runqueue, 裡面包含了所有的runnable processes. 而每一個process只能存在於一個runqueue. Struct runqueue有兩個pointer分別指到active processes跟expired processes. active 跟expire的pointer會週期性的互相調換.
schedule()的功能: 從runqueue之中選一個新的process來執行, 如果沒有就執行idle process (swapper). Kernel藉由觀察runqueue的大小來決決定一個CPU的覆載量. 如果不平均, 就需要load balance...

Runqueue Balancing
Multiprocessor machines有這三種:
1. SMP (Symmetric Multi Processing)指的是多顆CPU有相同的能力, Kernel在處理時不續要特別針對不同特性的CPU做特別的處理
2. Hyper-threading: CPU上可執行多個thread, CPU內有多組的registers互相交替使用並模擬出多個虛擬的CPU給kernel使用
3. NUMA (Non Uniform Memory Access): 所有CPU對RAM的存取並不是相同的cost, 而是某部份的RAM分給CPU當他的local memory, 因此不需要考慮其他CPU會始用到此區記憶體, 而省去了synchronization overhead.

Linux利用scheduling Domain的概念來keep the CPUs balanced by the kernel.

相關的System Calls
Scheduling:
1. nice(), getpriority(), setpriority(): 設定process的static base priority, 如果沒有root權限, 只能降低priority
2. sched_getaffinity(), sched_setaffinity(): 設定此process要執行在那個CPU. cpu_allowed field in the process descriptor.

real-time process:
1. sched_getscheduler(), sched_setscheduler(): 回傳此process所使用的scheduling policy (FIFO, RR, or NORMAL)
2. sched_yield(): allow a process to relinquish the CPU voluntarily without being suspended.
3. sched_get_priority_min(), sched_get_priority_max():
4. sched_rr_get_interval():

Question:
1. 每個process有自己的scheduling policy?還是整個系統採用一種policy?

--- Appendix ---
from ref2
TASK_RUNNING:
means that a task is in a runnable state. It does not mean that a CPU is actually allocated. The task can wait until it is selected by the scheduler. This state guarantees that the process really is ready to run and is not waiting for an external event.
TASK_INTERRUPTIBLE:
is set for a sleeping process that is waiting for some event or other. When the kernel signals to the process that the event has occurred, it is placed in the TASK_RUNNING state and may resume execution as soon as it is selected by the scheduler.
TASK_UNINTERRUPTIBLE:
is used for sleeping processes disabled on the instructions of the kernel. They may not be woken by external signals, only by the kernel itself.
TASK_STOPPED:
indicates that the process was stopped on purpose — by a debugger, for example.

Reference:
1. Understanding the Linux Kernel, Chapter 7
2. Professional Linux Kernel Architecture

tofutu

2011年9月21日星期三

Chapter 7: Process Scheduling

沒有留言:

張貼留言

總網頁瀏覽量

2011年9月21日 星期三

Chapter 7: Process Scheduling

沒有留言:

張貼留言

總網頁瀏覽量

2011年9月21日星期三