First step toward optimizing my C program
2010-11-03 20:42
Recently I wrote a C program that combines particle filtering and MCMC algorithms to estimate a model for high-frequency data. The program is a little slow, so I need to find out which parts could be sped up with the OpenMP library.
(1) time ./main
This reports the wall-clock time and the CPU time of the run.
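The same two quantities can also be measured around a single call from inside the program. The following is a minimal sketch, assuming a POSIX system where clock_gettime is available; busy_work, time_call, and elapsed_t are hypothetical names that are not part of the original program.

```c
#define _POSIX_C_SOURCE 199309L
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Elapsed wall-clock and CPU time of one call, in seconds. */
typedef struct {
    double wall;
    double cpu;
} elapsed_t;

/* Hypothetical stand-in for the real workload: a partial harmonic sum. */
static double busy_work(long n)
{
    double acc = 0.0;
    for (long i = 1; i <= n; i++)
        acc += 1.0 / (double)i;
    return acc;
}

/* Time work(n): CLOCK_MONOTONIC gives wall time, clock() gives CPU time. */
static elapsed_t time_call(double (*work)(long), long n, double *result)
{
    struct timespec t0, t1;
    elapsed_t e;

    assert(work != NULL && result != NULL);

    clock_gettime(CLOCK_MONOTONIC, &t0);
    clock_t c0 = clock();
    *result = work(n);
    clock_t c1 = clock();
    clock_gettime(CLOCK_MONOTONIC, &t1);

    e.wall = (double)(t1.tv_sec - t0.tv_sec)
           + (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
    e.cpu = (double)(c1 - c0) / CLOCKS_PER_SEC;
    return e;
}
```

Calling time_call(busy_work, 1000000, &r) returns both times for that one call. On Linux, clock() counts CPU time over all threads of the process, so once a loop runs in parallel the CPU time can exceed the wall-clock time, which is the same pattern that time ./main shows.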
(2) Compile main.c with GCC using the -pg option and run the program.
The working folder then contains a main.out file with the call counts and the time spent in each function of the source file.
Flat profile:
Each sample counts as 0.01 seconds.
  %   cumulative   self               self     total
 time   seconds   seconds     calls  ms/call  ms/call  name
 52.67     14.99    14.99     10100     1.48     1.83  sample
 12.26     18.48     3.49     10100     0.35     0.35  revsort
 11.81     21.84     3.36     10100     0.33     2.82  chunk
 11.14     25.01     3.17  20200000     0.00     0.00  distance
  6.50     26.86     1.85  20200000     0.00     0.00  simGBM
  5.62     28.46     1.60  20200000     0.00     0.00  pYcondX
  0.00     28.46     0.00       101     0.00   281.78  callik
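The two ms/call columns follow from the other columns: self ms/call is the self time divided by the number of calls, and total ms/call also includes the time spent in the function's children (taken from the call graph). A small sketch of that arithmetic, where ms_per_call is a hypothetical helper and not part of gprof:

```c
#include <assert.h>

/* Convert a time in seconds, spread over `calls` calls, to ms per call. */
static double ms_per_call(double seconds, long calls)
{
    assert(calls > 0);
    return seconds * 1000.0 / (double)calls;
}
```

For sample, ms_per_call(14.99, 10100) is about 1.48 (self), and ms_per_call(14.99 + 3.49, 10100) is about 1.83 (total, including the 3.49 s spent in revsort). For callik, all 28.46 s are spent in children, so ms_per_call(28.46, 101) is about 281.78.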
Call graph
granularity: each sample hit covers 4 byte(s) for 0.04% of 28.46 seconds
index % time    self  children       called         name
                3.36     25.10       10100/10100        callik [2]
[1]    100.0    3.36     25.10       10100          chunk [1]
               14.99      3.49       10100/10100        sample [4]
                3.17      0.00    20200000/20200000     distance [6]
                1.85      0.00    20200000/20200000     simGBM [7]
                1.60      0.00    20200000/20200000     pYcondX [8]
-----------------------------------------------
                0.00     28.46         101/101          main [3]
[2]    100.0    0.00     28.46         101          callik [2]
                3.36     25.10       10100/10100        chunk [1]
-----------------------------------------------
                                                    <spontaneous>
[3]    100.0    0.00     28.46                      main [3]
                0.00     28.46         101/101          callik [2]
-----------------------------------------------
               14.99      3.49       10100/10100        chunk [1]
[4]     64.9   14.99      3.49       10100          sample [4]
                3.49      0.00       10100/10100        revsort [5]
-----------------------------------------------
                3.49      0.00       10100/10100        sample [4]
[5]     12.3    3.49      0.00       10100          revsort [5]
-----------------------------------------------
                3.17      0.00    20200000/20200000     chunk [1]
[6]     11.1    3.17      0.00    20200000          distance [6]
-----------------------------------------------
                1.85      0.00    20200000/20200000     chunk [1]
[7]      6.5    1.85      0.00    20200000          simGBM [7]
-----------------------------------------------
                1.60      0.00    20200000/20200000     chunk [1]
[8]      5.6    1.60      0.00    20200000          pYcondX [8]
-----------------------------------------------
Index by function name
   [2] callik     [8] pYcondX    [7] simGBM
   [1] chunk      [5] revsort
   [6] distance   [4] sample
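In the call graph, chunk calls distance, simGBM, and pYcondX 20200000 times over 10100 invocations, that is 2000 times per call, which suggests a loop over particles inside chunk. The following is a minimal sketch of how such a loop might be parallelized with OpenMP. It assumes the iterations are independent; particle_weight and weigh_particles are hypothetical stand-ins, since the real function bodies are not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-particle kernel; stands in for the real
 * simGBM / pYcondX / distance computations. */
static double particle_weight(double x, double y)
{
    double d = x - y;
    return 1.0 / (1.0 + d * d);
}

/* Fill w[i] for every particle and return the sum of the weights.
 * Each iteration writes only its own w[i]; the running sum is
 * combined across threads by the reduction clause. */
static double weigh_particles(const double *x, double y, double *w, size_t n)
{
    double total = 0.0;

    assert(x != NULL && w != NULL);

    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < (long)n; i++) {
        w[i] = particle_weight(x[i], y);
        total += w[i];
    }
    return total;
}
```

With GCC the pragma takes effect when compiling with -fopenmp; without that option it is ignored and the loop runs serially with the same result. If the real loop draws random numbers, as a simulation step presumably does, each thread would need its own generator state. Note also that the three per-particle functions account for only about 6.62 s of the 28.46 s total, while sample alone has 14.99 s of self time, so the larger gain would probably have to come from sample.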