fans-rt task scheduling, stack switching (4): detailed analysis of the tiny model
2015-07-11 23:32
The optimized Tiny model code:
;
; Copyright(C) 2013-2015, Fans-rt development team.
;
; All rights reserved.
;
; This is open source software.
; Learning and research can be unrestricted to modification, use and dissemination.
; If you need for commercial purposes, you should get the author's permission.
;
; Configuration:
; System global core stack NO
; The local core stack of general task YES
; The local core stack of kernel task YES
; The local user stack of general task NO
; Hardware supported task switch IRQ NO
; Hardware supported double stack NO
;
; date author notes
; 2015-06-25 JiangYong new file
; 2015-07-07 JiangYong rename to kboard_interrupt.s
; 2015-07-11 JiangYong code optimization
;
    INCLUDE kirq_define_enum.inc

    EXPORT  UsageFault_Handler
    EXPORT  BusFault_Handler
    EXPORT  MemManage_Handler
    EXPORT  HardFault_Handler
    EXPORT  SysTick_Handler
    EXPORT  PendSV_Handler
    EXPORT  SVC_Handler
    EXPORT  CORE_Switch2UserMode

    IMPORT  CORE_EnterIRQ
    IMPORT  CORE_LeaveIRQ
    IMPORT  CORE_TickHandler
    IMPORT  CORE_TaskScheduling
    IMPORT  CORE_HandlerLPC
    IMPORT  CORE_SwitchTask
    IMPORT  CORE_SetTaskStackPosition
    IMPORT  CORE_GetTaskStackPosition
    IMPORT  CORE_GetCoreStackPosition
    IMPORT  CORE_CheckMustbeSchedule

    PRESERVE8

    AREA    |.text|, CODE, READONLY
    ALIGN   4
    THUMB

CORE_Switch2UserMode    PROC
    MOV     R0, #0
    MSR     PRIMASK, R0                 ; Enable IRQ
    BX      LR                          ; Return
    ENDP

PendSV_Handler  PROC
    BX      LR                          ; PendSV is not supported
    ENDP

SVC_Handler     PROC
    CPSID   I                           ; Why disable IRQ? Guess!
    MOV     R0, SP                      ; R0 = address of stacked {R0-R3}
    PUSH    {R12, LR}                   ; Why push R12? Guess!
    MOV     R12, R0                     ; R12 = address of stacked {R0-R3}
    BL      CORE_EnterIRQ               ; Set the current interrupt nesting level
    CPSIE   I                           ; Enable IRQ
    LDMFD   R12, {R0-R3}                ; Reload R0-R3 as arguments for the service call
    BL      CORE_HandlerLPC             ; Call system service
    B       ST_L1                       ; Continue in the tail shared with SysTick_Handler
    ENDP

SysTick_Handler PROC
    CPSID   I                           ; Why disable IRQ? Guess!
    PUSH    {R12, LR}                   ; Why push R12?
    BL      CORE_EnterIRQ               ; Set the current interrupt nesting level
    BL      CORE_TickHandler            ; Increment the system tick
    CPSIE   I                           ; Enable IRQ
    BL      CORE_TaskScheduling         ; Select the next task to schedule
ST_L1
    CPSID   I                           ; Disable IRQ
    BL      CORE_LeaveIRQ               ; Update and return the interrupt nesting level
    CBNZ    R0, ST_LE                   ; Nesting level != 0: leave this interrupt
    BL      CORE_CheckMustbeSchedule    ; Check whether a reschedule is required
    CBZ     R0, ST_LE                   ; Must schedule = FALSE: leave this interrupt
    PUSH    {R4 - R11}                  ; Nesting level = 0 and must schedule = TRUE: save R4-R11
    MRS     R0, MSP                     ; R0 = core stack of the old task
    MRS     R1, PSP                     ; R1 = user stack of the old task
    BL      CORE_SwitchTask             ; CORE_SwitchTask(CoreStack, UserStack);
                                        ; No need to check the task permission
    BL      CORE_GetTaskStackPosition   ; R0 = user stack of the new task (not needed)
    MOV     R1, R0                      ; R1 = user stack of the new task (not needed)
    BL      CORE_GetCoreStackPosition   ; R0 = core stack of the new task
    MSR     PSP, R1                     ; Update user stack (not needed)
    MSR     MSP, R0                     ; Update core stack
    POP     {R4 - R11}                  ; Restore the new task's registers
ST_LE
    POP     {R12, LR}                   ; Restore R12 and LR (EXC_RETURN)
    CPSIE   I                           ; Enable IRQ
    BX      LR                          ; Return to the task's break point
    ENDP

HardFault_Handler   PROC
    B       .
    ENDP

MemManage_Handler   PROC
    B       .
    ENDP

BusFault_Handler    PROC
    B       .
    ENDP

UsageFault_Handler  PROC
    B       .
    ENDP

    ALIGN
    END
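In the code, the parts highlighted in red in the original post are the main differences between the first version and the optimized one: the R12 setup and PUSH at the entry of SVC_Handler and SysTick_Handler, the LDMFD that reloads R0-R3, the branch to the shared ST_L1 label, and the PUSH/POP of R4-R11. Why optimize this way? It becomes clear once we have analyzed the stack during execution. In the Tiny model every task has only one stack, so this article treats each task's stack as a kernel stack.
First, the stack state on entering the interrupt: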
; |-----------------------|
; | ........ |
; |-----------------------|
; | xPSR | <<= SP + 0x1C
; |-----------------------|
; | PC | <<= SP + 0x18
; |-----------------------|
; | LR | <<= SP + 0x14
; |-----------------------|
; | R12 | <<= SP + 0x10
; |-----------------------|
; | R3 | <<= SP + 0x0C
; |-----------------------|
; | R2 | <<= SP + 0x08
; |-----------------------|
; | R1 | <<= SP + 0x04
; |-----------------------|
; | R0 | <<= SP + 0x00
; |-----------------------|
; | ........ |
; |-----------------------|
; Stack image at interrupt entry, saved automatically by the CPU
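The eight words in the image above can be written down as a C struct, which makes the offsets easy to check. This is only a sketch: the names `HwExceptionFrame` and `hw_frame_offset` are made up for illustration and do not come from the fans-rt sources, and it assumes the basic frame without floating-point state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type: the eight words stacked by the core on exception
 * entry, lowest address (where SP points) first. */
typedef struct {
    uint32_t r0;
    uint32_t r1;
    uint32_t r2;
    uint32_t r3;
    uint32_t r12;
    uint32_t lr;    /* LR of the interrupted code */
    uint32_t pc;    /* return address */
    uint32_t xpsr;
} HwExceptionFrame;

/* Compile-time checks against the offsets drawn in the diagram. */
_Static_assert(offsetof(HwExceptionFrame, r12)  == 0x10, "R12 at SP + 0x10");
_Static_assert(offsetof(HwExceptionFrame, xpsr) == 0x1C, "xPSR at SP + 0x1C");
_Static_assert(sizeof(HwExceptionFrame) == 0x20, "frame is eight words");

/* Byte offset from SP of the index-th stacked word (0 = R0 ... 7 = xPSR). */
static size_t hw_frame_offset(unsigned index)
{
    assert(index < sizeof(HwExceptionFrame) / sizeof(uint32_t));
    return index * sizeof(uint32_t);
}
```

For example, `hw_frame_offset(4)` gives 0x10, the slot of R12 in the diagram.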
As the stack image shows, SP points exactly at the saved R0, so once the SVC_Handler entry has executed
MOV R0, SP ; R0 = address of stacked {R0-R3}
PUSH {R12, LR} ; Why push R12? Guess!
MOV R12, R0 ; R12 = address of stacked {R0-R3}
R12 holds the address of the first of the stacked R0-R3 registers. Before the call to CORE_HandlerLPC,
LDMFD R12, {R0-R3} ; Reload R0-R3 as arguments for the service call
BL CORE_HandlerLPC ; Call system service
restores R0-R3 so that they can be passed as arguments to CORE_HandlerLPC.
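In C terms, what the LDMFD / BL pair does could be sketched roughly as follows. The real signature of CORE_HandlerLPC is not shown in this article, so the `ServiceFn` type, `call_service_from_frame` and `demo_service` are assumptions used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the service entry point; the actual
 * prototype of CORE_HandlerLPC is not given in the article. */
typedef uint32_t (*ServiceFn)(uint32_t a0, uint32_t a1,
                              uint32_t a2, uint32_t a3);

/* Models "LDMFD R12, {R0-R3}" followed by the BL: frame points at the
 * stacked R0, so the four arguments are frame[0] .. frame[3]. */
static uint32_t call_service_from_frame(const uint32_t *frame,
                                        ServiceFn service)
{
    assert(frame != NULL && service != NULL);
    return service(frame[0], frame[1], frame[2], frame[3]);
}

/* Demonstration service only: adds its four arguments. */
static uint32_t demo_service(uint32_t a0, uint32_t a1,
                             uint32_t a2, uint32_t a3)
{
    return a0 + a1 + a2 + a3;
}
```

One thing to keep in mind: the listing keeps the frame address in R12 across the call to CORE_EnterIRQ. The ARM procedure call standard treats R12 as a scratch register that a callee is allowed to overwrite, so this relies on CORE_EnterIRQ leaving R12 untouched.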
After LR and R12 have been pushed, the stack looks like this:
; |-----------------------|
; | ........ |
; |-----------------------|
; | xPSR | <<= SP + 0x24
; |-----------------------|
; | PC | <<= SP + 0x20
; |-----------------------|
; | LR | <<= SP + 0x1C
; |-----------------------|
; | R12 | <<= SP + 0x18
; |-----------------------|
; | R3 | <<= SP + 0x14
; |-----------------------|
; | R2 | <<= SP + 0x10
; |-----------------------|
; | R1 | <<= SP + 0x0C
; |-----------------------|
; | R0 | <<= SP + 0x08
; |-----------------------|
; | LR | <<= SP + 0x04
; |-----------------------|
; | R12 | <<= SP + 0x00
; |-----------------------|
; | ........ |
; |-----------------------|
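; After LR and R12 are pushed
Before the change, SVC_Handler and SysTick_Handler pushed R4 and R0 respectively, and did not push R12. Why change to pushing R12? Because a task switch has to build the following break-point stack image: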
; |-----------------------|
; | ........ |
; |-----------------------|
; | xPSR | <<= SP + 0x54
; |-----------------------|
; | PC | <<= SP + 0x50
; |-----------------------|
; | LR | <<= SP + 0x4C
; |-----------------------|
; | R12 | <<= SP + 0x48
; |-----------------------|
; | R3 | <<= SP + 0x44
; |-----------------------|
; | R2 | <<= SP + 0x40
; |-----------------------|
; | R1 | <<= SP + 0x3C
; |-----------------------|
; | R0 | <<= SP + 0x38
; |-----------------------|
; | LR | <<= SP + 0x34
; |-----------------------|
; | R12 | <<= SP + 0x30
; |-----------------------|
; | R11 | <<= SP + 0x2C
; |-----------------------|
; | R10 | <<= SP + 0x28
; |-----------------------|
; | R9 | <<= SP + 0x24
; |-----------------------|
; | ........ |
; |-----------------------|
; | R0 | <<= SP + 0x00 (It's the old task break point)
; |-----------------------|
; | ........ |
; |-----------------------|
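; Break-point stack image
Once it is confirmed that scheduling is needed, R4-R11 are pushed to build the break-point stack image, and the R12 that was already pushed sits exactly at its correct position in that image. Compared with the version before the change, this saves pushing and popping R4/R0.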
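The offset arithmetic can be replayed with a few lines of C. This is a sketch that counts only the pushes visible in the optimized listing (the hardware frame, then PUSH {R12, LR}, then PUSH {R4 - R11}); the names `hw_word_offset` and `saved_r12_offset` are invented for illustration. With those counts, R11 lands at SP + 0x1C and the previously pushed R12 directly above it at SP + 0x20, which is the "already in the right place" property described above. The break-point diagram shows offsets 0x10 larger, which would be consistent with four extra low slots (R0 upward) that the optimized listing does not push; the sketch does not try to resolve that difference.

```c
#include <assert.h>
#include <stddef.h>

enum {
    WORD_BYTES        = 4,
    HW_FRAME_WORDS    = 8,  /* R0-R3, R12, LR, PC, xPSR stacked by the CPU */
    ENTRY_PUSH_WORDS  = 2,  /* PUSH {R12, LR} at handler entry */
    SWITCH_PUSH_WORDS = 8   /* PUSH {R4 - R11} before the switch */
};

/* Offset from the current SP of word hw_index of the hardware frame
 * (0 = R0 ... 7 = xPSR) after software_words have been pushed below it. */
static size_t hw_word_offset(unsigned hw_index, unsigned software_words)
{
    assert(hw_index < (unsigned)HW_FRAME_WORDS);
    return (size_t)(software_words + hw_index) * WORD_BYTES;
}

/* Offset from the current SP of the R12 pushed at handler entry, given
 * how many words were pushed after it. */
static size_t saved_r12_offset(unsigned words_pushed_later)
{
    return (size_t)words_pushed_later * WORD_BYTES;
}
```

With `ENTRY_PUSH_WORDS` alone, `hw_word_offset(0, ENTRY_PUSH_WORDS)` is 0x08 and `hw_word_offset(7, ENTRY_PUSH_WORDS)` is 0x24, matching the second diagram.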
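Optimization points:
1. The register pushed at interrupt entry changes from R4/R0 to R12, which removes the cost of pushing and popping R0/R4.
2. In SVC_Handler, the push and pop of R0-R3 is replaced by an LDMFD instruction addressed through R12; two register accesses replace four memory accesses.
3. Redundant code is removed: SVC_Handler and SysTick_Handler share the bottom-half code of the interrupt.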
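The decision flow of the shared bottom half (from ST_L1 to ST_LE) can be modelled in C as below. It is only a model of the two branches in the listing, not fans-rt code: `TailAction`, `st_l1_tail` and the `demo_*` stubs are hypothetical, and the two callbacks stand in for CORE_LeaveIRQ and CORE_CheckMustbeSchedule, whose real prototypes are not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum {
    LEAVE_NESTED,       /* still inside an outer interrupt */
    LEAVE_NO_SCHEDULE,  /* outermost interrupt, nothing to switch */
    SWITCH_TASK         /* outermost interrupt, switch stacks */
} TailAction;

typedef unsigned (*LeaveIrqFn)(void);      /* stands in for CORE_LeaveIRQ */
typedef bool     (*MustScheduleFn)(void);  /* stands in for CORE_CheckMustbeSchedule */

/* Mirrors the order of the checks in the listing: the schedule check is
 * only consulted when the nesting level has dropped to zero. */
static TailAction st_l1_tail(LeaveIrqFn leave_irq, MustScheduleFn must_schedule)
{
    assert(leave_irq != NULL && must_schedule != NULL);
    if (leave_irq() != 0) {
        return LEAVE_NESTED;        /* CBNZ R0, ST_LE */
    }
    if (!must_schedule()) {
        return LEAVE_NO_SCHEDULE;   /* CBZ R0, ST_LE */
    }
    return SWITCH_TASK;             /* PUSH {R4 - R11} ... POP {R4 - R11} */
}

/* Demonstration stubs only. */
static unsigned demo_nested(void)      { return 1; }
static unsigned demo_outermost(void)   { return 0; }
static bool     demo_schedule(void)    { return true; }
static bool     demo_no_schedule(void) { return false; }
```

Because both handlers end in this same sequence, SVC_Handler only needs a branch to ST_L1 instead of its own copy of the code.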