How does SMP scheduling work in the Linux kernel? (ARM architecture)
2014-07-11 17:11
http://stackoverflow.com/questions/21182160/how-smp-schedule-work-in-linux-kernel-arm-architecture
ARM SMP systems support two types of interrupts: SPI (shared peripheral interrupt) and PPI (private peripheral interrupt). The PPI is a per-CPU interrupt source. A special case of the PPI for SMP is the SGI (software generated interrupt); this is a CPU-to-CPU interrupt used to signal from one CPU to another in the SMP world (Linux calls this an IPI, or inter-processor interrupt). See Note1.
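To make the three interrupt kinds concrete, here is a minimal sketch that classifies an interrupt ID by the ranges the GIC uses, as far as I know: IDs 0-15 for SGIs, 16-31 for PPIs and 32-1019 for SPIs (GICv2 numbering). The function name `classify_gic_interrupt` is made up for this illustration and is not a kernel API.

```python
def classify_gic_interrupt(irq_id: int) -> str:
    """Classify an interrupt ID using the GICv2 ID ranges (illustrative helper)."""
    if 0 <= irq_id <= 15:
        return "SGI"  # software generated: CPU-to-CPU signalling (IPI)
    if 16 <= irq_id <= 31:
        return "PPI"  # private to one CPU, e.g. a per-CPU timer
    if 32 <= irq_id <= 1019:
        return "SPI"  # shared peripheral, can be routed to any CPU
    raise ValueError(f"ID {irq_id} is outside the ranges handled by this sketch")


if __name__ == "__main__":
    for irq in (1, 29, 40):
        print(irq, classify_gic_interrupt(irq))
```

SGIs and PPIs are both banked per CPU, which is why the answer treats the SGI as a special case of the PPI.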
A PPI timer can be used to allow each CPU to use 'tickless scheduling'; that is, timer interrupts are scheduled based on knowledge of future timed events (google 'timing wheel', look at the NO_HZ documentation, etc.). The current Linux kernel (as of this writing) doesn't use this specific PPI timer for scheduling; it is only used as a time source for delay loops. Instead, the global PPI timer is used. This timer can interrupt each CPU selectively, but the register set is global to all CPUs. A particular CPU may schedule an interrupt for itself, with the time base being global.
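The idea of one global time base with a selective per-CPU interrupt can be sketched as a toy model. This is only an illustration of the concept described above, not the real register layout; the class `GlobalTimerModel` and its methods are hypothetical names.

```python
class GlobalTimerModel:
    """Toy model: one shared counter, one one-shot compare value per CPU."""

    def __init__(self, num_cpus: int):
        self.counter = 0                  # time base shared by all CPUs
        self.compare = [None] * num_cpus  # per-CPU deadline, None = not armed

    def arm(self, cpu: int, delta: int) -> None:
        """A CPU schedules an interrupt for itself, 'delta' counts from now."""
        self.compare[cpu] = self.counter + delta

    def advance(self, counts: int) -> list:
        """Advance the shared counter; return the CPUs whose deadline passed."""
        self.counter += counts
        fired = []
        for cpu, deadline in enumerate(self.compare):
            if deadline is not None and deadline <= self.counter:
                fired.append(cpu)
                self.compare[cpu] = None  # one-shot: disarm after firing
        return fired


if __name__ == "__main__":
    timer = GlobalTimerModel(num_cpus=4)
    timer.arm(cpu=1, delta=100)
    timer.arm(cpu=3, delta=250)
    print(timer.advance(100))  # only CPU 1 is interrupted
    print(timer.advance(150))  # later, only CPU 3 is interrupted
```

CPUs that never armed a deadline are never interrupted in this model, which is the property tickless operation relies on.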
The complication is that tasks must be migrated from one CPU to another in order to balance work among the CPUs. Also, the Linux kernel's core scheduler code is written for many CPUs and architectures, and some of them may not have these per-CPU interrupt sources. A definitive answer may depend on your kernel version and the scheduler used (or, more generally, the kernel configuration). Generally, a busy CPU will do the migration; other CPUs may wake on a timer tick just to see if a task in their set should run (maybe a migrated process). If NO_HZ is in effect, some CPUs may not wake at all; they will get an IPI in the case of migration.
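A toy model of that migration step may help. It is deliberately simplistic and is not the kernel's load-balancing algorithm: it moves one task from the longest run queue to the shortest and reports whether the target CPU needs an IPI because it is sleeping without a tick. The function `balance` and its arguments are names invented for this sketch.

```python
def balance(run_queues, sleeping_cpus):
    """Move one task from the busiest CPU to the least loaded one.

    run_queues    -- list of per-CPU task lists (modified in place)
    sleeping_cpus -- set of CPU numbers idling with no tick (NO_HZ style)
    Returns the CPU number that must be woken with an IPI, or None.
    """
    cpus = range(len(run_queues))
    busiest = max(cpus, key=lambda c: len(run_queues[c]))
    idlest = min(cpus, key=lambda c: len(run_queues[c]))

    # Only migrate if it actually reduces the imbalance.
    if len(run_queues[busiest]) - len(run_queues[idlest]) < 2:
        return None

    task = run_queues[busiest].pop()
    run_queues[idlest].append(task)

    # A CPU that still takes timer ticks will notice the task on its own;
    # a sleeping tickless CPU has to be told with an IPI.
    return idlest if idlest in sleeping_cpus else None


if __name__ == "__main__":
    queues = [["a", "b", "c"], [], ["d"]]
    print(balance(queues, sleeping_cpus={1}), queues)
```

Here the busy CPU does the work of migrating, matching the 'generally' case above; a real scheduler also weighs priorities, cache affinity and CPU topology.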
In any case, there is nothing ARM-specific in the CPU scheduling besides the clock source. It is possible for an ARM SMP system not to have a global PPI timer. In this case, every CPU may wake to service an interrupt, but the majority may go back to sleep immediately. This could happen on any system due to a bad timer/interrupt controller design or a bad system configuration. However, even in these cases, the code would not call into the scheduler except where needed.
See:
Linux Scheduler on SMP (which may be a duplicate, although the answer is not great IMO),
IBM's Completely Fair Scheduler article, and
O'Reilly's Linux kernel scheduler chapter.
Note1: This is actually GIC (generic interrupt controller) terminology. However, most ARM SMP systems use this interrupt controller. It is bundled with Cortex-A CPUs and came as an external soft component for some ARMv6 systems. It is possible for an ARM SMP system to use another controller, but that is probably extremely rare or non-existent.
Edit: There are two ARM on-chip timers; these are useful because every Cortex-A has them, unlike SoC vendor timers. One of them is used instead of a 'counting loop' for delays, which works better in the presence of interrupts. I don't think it is critical to understanding SMP scheduling; you may ignore that comment and just know that that source file is not used for scheduling. It was the first one I looked at. If you find it really distracting, I will remove that information.
See this paper on timing wheels; it is about IP/networking, but the concept of NO_HZ is similar, i.e. don't interrupt every 10 ms just to increment ticks. In the NO_HZ case, each CPU can set a future wake-up time based on the requests that drivers and sub-systems have made. For example, if work queued with schedule_work() needs to run in 175 ms, then the timer is set to that value for the CPU and we don't wake up 17 times (if the system tick is 10 ms), but just increment ticks by 17 when the timer fires. Some CPUs may also need a timeout to evict the current process and run another for multi-tasking, so the scheduler itself may set a timer.
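The arithmetic in that example can be worked through in a few lines. This is just a sketch of the bookkeeping, assuming the 10 ms tick and 175 ms delay used above; `ticks_elapsed` and `next_wakeup` are hypothetical helper names, not kernel functions.

```python
TICK_MS = 10  # system tick period assumed in the example above


def ticks_elapsed(sleep_ms: int, tick_ms: int = TICK_MS) -> int:
    """Whole ticks that pass during one long sleep (credited in one go)."""
    return sleep_ms // tick_ms


def next_wakeup(now_ms: int, deadlines_ms):
    """Earliest pending deadline after 'now', or None if nothing is pending."""
    future = [d for d in deadlines_ms if d > now_ms]
    return min(future) if future else None


if __name__ == "__main__":
    # Periodic tick: the CPU would be interrupted once per tick.
    # Tickless: it programs one wake-up and adds the missed ticks afterwards.
    print("ticks credited after 175 ms:", ticks_elapsed(175))
    print("next wake-up (ms):", next_wakeup(0, [400, 175, 900]))
```

With several pending requests the CPU only has to program the earliest one; after waking it repeats the calculation for whatever is left.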