Halide Study Notes: Reading the Halide Tutorial Source, Part 4
2017-12-01 17:47
Halide Getting Started Tutorial, Lesson 04
// Halide tutorial lesson 4: Debugging with tracing, print, and print_when

// This lesson demonstrates how to follow what Halide is doing at runtime.

// On linux, you can compile and run it like so:
// g++ lesson_04*.cpp -g -I ../include -L ../bin -lHalide -lpthread -ldl -o lesson_04 -std=c++11
// LD_LIBRARY_PATH=../bin ./lesson_04

// On os x:
// g++ lesson_04*.cpp -g -I ../include -L ../bin -lHalide -o lesson_04 -std=c++11
// DYLD_LIBRARY_PATH=../bin ./lesson_04

// If you have the entire Halide source tree, you can also build it by
// running:
//    make tutorial_lesson_04_debugging_2
// in a shell with the current directory at the top of the halide
// source tree.

#include "Halide.h"
#include <stdio.h>
using namespace Halide;

int main(int argc, char **argv) {

    Var x("x"), y("y");

    // Printing out the value of Funcs as they are computed.
    {
        // We'll define our gradient function as before.
        Func gradient("gradient");
        gradient(x, y) = x + y;

        // And tell Halide that we'd like to be notified of all
        // evaluations.
        // Note: that is, trace every value the Func computes and stores.
        gradient.trace_stores();

        // Realize the function over an 8x8 region.
        printf("Evaluating gradient\n");
        Buffer<int> output = gradient.realize(8, 8);

        // This will print out all the times gradient(x, y) gets
        // evaluated.

        // Now that we can snoop on what Halide is doing, let's try our
        // first scheduling primitive. We'll make a new version of
        // gradient that processes each scanline in parallel.
        Func parallel_gradient("parallel_gradient");
        parallel_gradient(x, y) = x + y;

        // We'll also trace this function.
        parallel_gradient.trace_stores();

        // Things are the same so far. We've defined the algorithm, but
        // haven't said anything about how to schedule it. In general,
        // exploring different scheduling decisions doesn't change the code
        // that describes the algorithm.
        // Note: because Halide decouples the algorithm from the schedule,
        // changing the schedule does not affect the algorithm's description.

        // Now we tell Halide to use a parallel for loop over the y
        // coordinate. On Linux we run this using a thread pool and a task
        // queue. On OS X we call into grand central dispatch, which does
        // the same thing for us.
        // Note: this runs the for loop over y in parallel.
        parallel_gradient.parallel(y);

        // This time the printfs should come out of order, because each
        // scanline is potentially being processed in a different
        // thread. The number of threads should adapt to your system, but
        // on linux you can control it manually using the environment
        // variable HL_NUM_THREADS.
        // Note: because rows may be computed on different threads, the
        // trace output may appear out of order. The number of threads used
        // by parallel can be set with the environment variable HL_NUM_THREADS.
        printf("\nEvaluating parallel_gradient\n");
        parallel_gradient.realize(8, 8);
    }

    // Printing individual Exprs.
    {
        // trace_stores() can only print the value of a
        // Func. Sometimes you want to inspect the value of
        // sub-expressions rather than the entire Func. The built-in
        // function 'print' can be wrapped around any Expr to print
        // the value of that Expr every time it is evaluated.
        // Note: trace_stores() prints the values of a Func; the built-in
        // print function prints the value of an Expr.

        // For example, say we have some Func that is the sum of two terms:
        Func f;
        f(x, y) = sin(x) + cos(y);

        // If we want to inspect just one of the terms, we can wrap
        // 'print' around it like so:
        // Note: only the wrapped term, cos(y), is printed.
        Func g;
        g(x, y) = sin(x) + print(cos(y));

        printf("\nEvaluating sin(x) + cos(y), and just printing cos(y)\n");
        g.realize(4, 4);
    }

    // Printing additional context.
    {
        // print can take multiple arguments. It prints all of them
        // and evaluates to the first one. The arguments can be Exprs
        // or constant strings. This can be used to print additional
        // context alongside the value:
        // Note: extra text can be attached to a printed term when needed.
        Func f;
        f(x, y) = sin(x) + print(cos(y), "<- this is cos(", y, ") when x =", x);

        printf("\nEvaluating sin(x) + cos(y), and printing cos(y) with more context\n");
        f.realize(4, 4);

        // It can be useful to split expressions like the one above
        // across multiple lines to make it easier to turn on and off
        // printing certain values while debugging.
        Expr e = cos(y);
        // Uncomment the following line to print the value of cos(y)
        // e = print(e, "<- this is cos(", y, ") when x =", x);
        Func g;
        g(x, y) = sin(x) + e;
        g.realize(4, 4);
    }

    // Conditional printing
    {
        // Both print and trace_stores can produce a lot of output. If
        // you're looking for a rare event, or just want to see what
        // happens at a single pixel, this amount of output can be
        // difficult to dig through. Instead, the function print_when
        // can be used to conditionally print an Expr. The first
        // argument to print_when is a boolean Expr. If the Expr
        // evaluates to true, it returns the second argument and
        // prints all of the arguments. If the Expr evaluates to false
        // it just returns the second argument and does not print.
        // Note: to inspect one specific intermediate result, use the
        // conditional print function:
        //     print_when(bool_expr, expr, context...)
        // If bool_expr is true, it returns expr and prints expr together
        // with the context arguments; otherwise it only returns expr.
        Func f;
        Expr e = cos(y);
        e = print_when(x == 37 && y == 42, e, "<- this is cos(y) at x, y == (37, 42)");
        f(x, y) = sin(x) + e;
        printf("\nEvaluating sin(x) + cos(y), and printing cos(y) at a single pixel\n");
        f.realize(640, 480);

        // print_when can also be used to check for values you're not expecting:
        Func g;
        e = cos(y);
        e = print_when(e < 0, e, "cos(y) < 0 at y ==", y);
        g(x, y) = sin(x) + e;
        printf("\nEvaluating sin(x) + cos(y), and printing whenever cos(y) < 0\n");
        g.realize(4, 4);
    }

    // Printing expressions at compile-time.
    {
        // The code above builds up a Halide Expr across several lines
        // of code. If you're programmatically constructing a complex
        // expression, and you want to check the Expr you've created
        // is what you think it is, you can also print out the
        // expression itself using C++ streams:
        // Note: when writing a complex expression, you can send the Expr
        // itself to a C++ output stream and check on standard output
        // that it matches what you expected.
        Var fizz("fizz"), buzz("buzz");
        Expr e = 1;
        for (int i = 2; i < 100; i++) {
            if (i % 3 == 0 && i % 5 == 0) e += fizz*buzz;
            else if (i % 3 == 0) e += fizz;
            else if (i % 5 == 0) e += buzz;
            else e += i;
        }
        std::cout << "Printing a complex Expr: " << e << "\n";
    }

    printf("Success!\n");
    return 0;
}
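The print_when semantics described in the comments above (always return the value, print only when the condition holds) can be modelled in plain C++ without Halide. The sketch below is only an illustration of that contract: `print_when_model` and `count_prints` are hypothetical names, the helper works on ordinary `double` values rather than Halide `Expr` objects, and writing to `std::cerr` is an arbitrary choice made here.

```cpp
#include <cmath>
#include <iostream>
#include <string>

// Hypothetical stand-in for Halide's print_when, written for plain C++ values.
// It is NOT Halide's implementation; it only mirrors the behaviour described
// in the lesson comments: the value is always returned, and it is printed
// together with the context string only when the condition holds.
// If `printed` is non-null, it is incremented each time a line is emitted.
inline double print_when_model(bool condition, double value,
                               const std::string &context, int *printed) {
    if (condition) {
        std::cerr << value << " " << context << "\n";
        if (printed) ++*printed;
    }
    return value;
}

// Evaluate sin(x) + cos(y) over a width x height grid, printing cos(y) only
// at the single pixel (px, py). Returns how many lines were printed.
inline int count_prints(int width, int height, int px, int py) {
    int printed = 0;
    double sum = 0.0;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            double e = print_when_model(x == px && y == py,
                                        std::cos(static_cast<double>(y)),
                                        "<- this is cos(y) at the chosen pixel",
                                        &printed);
            sum += std::sin(static_cast<double>(x)) + e;
        }
    }
    (void)sum;  // the sum itself is not needed; only the print count is returned
    return printed;
}
```

Calling `count_prints(640, 480, 37, 42)` visits all 307,200 pixels but emits a single line, which is the point of conditional printing. Over a 4x4 grid the same condition never fires, so nothing is printed.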
Compile and run:
$ g++ lesson_04*.cpp -g -I ../include -L ../bin -lHalide -lpthread -ldl -o lesson_04 -std=c++11
$ ./lesson_04
Output:
Begin pipeline gradient.0()
Store gradient.0(0, 0) = 0
Store gradient.0(1, 0) = 1
Store gradient.0(2, 0) = 2
Store gradient.0(3, 0) = 3
Store gradient.0(4, 0) = 4
Store gradient.0(5, 0) = 5
Store gradient.0(6, 0) = 6
Store gradient.0(7, 0) = 7
Store gradient.0(0, 1) = 1
Store gradient.0(1, 1) = 2
Store gradient.0(2, 1) = 3
Store gradient.0(3, 1) = 4
Store gradient.0(4, 1) = 5
Store gradient.0(5, 1) = 6
Store gradient.0(6, 1) = 7
Store gradient.0(7, 1) = 8
Store gradient.0(0, 2) = 2
Store gradient.0(1, 2) = 3
Store gradient.0(2, 2) = 4
Store gradient.0(3, 2) = 5
Store gradient.0(4, 2) = 6
Store gradient.0(5, 2) = 7
Store gradient.0(6, 2) = 8
Store gradient.0(7, 2) = 9
Store gradient.0(0, 3) = 3
Store gradient.0(1, 3) = 4
Store gradient.0(2, 3) = 5
Store gradient.0(3, 3) = 6
Store gradient.0(4, 3) = 7
Store gradient.0(5, 3) = 8
Store gradient.0(6, 3) = 9
Store gradient.0(7, 3) = 10
Store gradient.0(0, 4) = 4
Store gradient.0(1, 4) = 5
Store gradient.0(2, 4) = 6
Store gradient.0(3, 4) = 7
Store gradient.0(4, 4) = 8
Store gradient.0(5, 4) = 9
Store gradient.0(6, 4) = 10
Store gradient.0(7, 4) = 11
Store gradient.0(0, 5) = 5
Store gradient.0(1, 5) = 6
Store gradient.0(2, 5) = 7
Store gradient.0(3, 5) = 8
Store gradient.0(4, 5) = 9
Store gradient.0(5, 5) = 10
Store gradient.0(6, 5) = 11
Store gradient.0(7, 5) = 12
Store gradient.0(0, 6) = 6
Store gradient.0(1, 6) = 7
Store gradient.0(2, 6) = 8
Store gradient.0(3, 6) = 9
Store gradient.0(4, 6) = 10
Store gradient.0(5, 6) = 11
Store gradient.0(6, 6) = 12
Store gradient.0(7, 6) = 13
Store gradient.0(0, 7) = 7
Store gradient.0(1, 7) = 8
Store gradient.0(2, 7) = 9
Store gradient.0(3, 7) = 10
Store gradient.0(4, 7) = 11
Store gradient.0(5, 7) = 12
Store gradient.0(6, 7) = 13
Store gradient.0(7, 7) = 14
End pipeline gradient.0()
Begin pipeline parallel_gradient.0()
Store parallel_gradient.0(0, 0) = 0
Store parallel_gradient.0(1, 0) = 1
Store parallel_gradient.0(2, 0) = 2
Store parallel_gradient.0(3, 0) = 3
Store parallel_gradient.0(4, 0) = 4
Store parallel_gradient.0(5, 0) = 5
Store parallel_gradient.0(6, 0) = 6
Store parallel_gradient.0(7, 0) = 7
Store parallel_gradient.0(0, 1) = 1
Store parallel_gradient.0(1, 1) = 2
Store parallel_gradient.0(2, 1) = 3
Store parallel_gradient.0(3, 1) = 4
Store parallel_gradient.0(4, 1) = 5
Store parallel_gradient.0(5, 1) = 6
Store parallel_gradient.0(6, 1) = 7
Store parallel_gradient.0(7, 1) = 8
Store parallel_gradient.0(0, 2) = 2
Store parallel_gradient.0(0, 3) = 3
Store parallel_gradient.0(1, 2) = 3
Store parallel_gradient.0(1, 3) = 4
Store parallel_gradient.0(0, 4) = 4
Store parallel_gradient.0(2, 3) = 5
Store parallel_gradient.0(2, 2) = 4
Store parallel_gradient.0(1, 4) = 5
Store parallel_gradient.0(3, 2) = 5
Store parallel_gradient.0(2, 4) = 6
Store parallel_gradient.0(3, 3) = 6
Store parallel_gradient.0(4, 2) = 6
Store parallel_gradient.0(4, 3) = 7
Store parallel_gradient.0(3, 4) = 7
Store parallel_gradient.0(5, 2) = 7
Store parallel_gradient.0(5, 3) = 8
Store parallel_gradient.0(4, 4) = 8
Store parallel_gradient.0(6, 2) = 8
Store parallel_gradient.0(6, 3) = 9
Store parallel_gradient.0(7, 2) = 9
Store parallel_gradient.0(5, 4) = 9
Store parallel_gradient.0(7, 3) = 10
Store parallel_gradient.0(0, 6) = 6
Store parallel_gradient.0(6, 4) = 10
Store parallel_gradient.0(0, 7) = 7
Store parallel_gradient.0(1, 6) = 7
Store parallel_gradient.0(1, 7) = 8
Store parallel_gradient.0(7, 4) = 11
Store parallel_gradient.0(2, 6) = 8
Store parallel_gradient.0(2, 7) = 9
Store parallel_gradient.0(3, 6) = 9
Store parallel_gradient.0(3, 7) = 10
Store parallel_gradient.0(4, 6) = 10
Store parallel_gradient.0(4, 7) = 11
Store parallel_gradient.0(5, 7) = 12
Store parallel_gradient.0(5, 6) = 11
Store parallel_gradient.0(0, 5) = 5
Store parallel_gradient.0(6, 7) = 13
Store parallel_gradient.0(6, 6) = 12
Store parallel_gradient.0(1, 5) = 6
Store parallel_gradient.0(7, 7) = 14
Store parallel_gradient.0(2, 5) = 7
Store parallel_gradient.0(7, 6) = 13
Store parallel_gradient.0(3, 5) = 8
Store parallel_gradient.0(4, 5) = 9
Store parallel_gradient.0(5, 5) = 10
Store parallel_gradient.0(6, 5) = 11
Store parallel_gradient.0(7, 5) = 12
End pipeline parallel_gradient.0()
1.000000
1.000000
1.000000
1.000000
0.540302
0.540302
0.540302
0.540302
-0.416147
-0.416147
-0.416147
-0.416147
-0.989992
-0.989992
-0.989992
-0.989992
1.000000 <- this is cos( 0 ) when x = 0
1.000000 <- this is cos( 0 ) when x = 1
1.000000 <- this is cos( 0 ) when x = 2
1.000000 <- this is cos( 0 ) when x = 3
0.540302 <- this is cos( 1 ) when x = 0
0.540302 <- this is cos( 1 ) when x = 1
0.540302 <- this is cos( 1 ) when x = 2
0.540302 <- this is cos( 1 ) when x = 3
-0.416147 <- this is cos( 2 ) when x = 0
-0.416147 <- this is cos( 2 ) when x = 1
-0.416147 <- this is cos( 2 ) when x = 2
-0.416147 <- this is cos( 2 ) when x = 3
-0.989992 <- this is cos( 3 ) when x = 0
-0.989992 <- this is cos( 3 ) when x = 1
-0.989992 <- this is cos( 3 ) when x = 2
-0.989992 <- this is cos( 3 ) when x = 3
-0.399985 <- this is cos(y) at x, y == (37, 42)
-0.416147 cos(y) < 0 at y == 2
-0.416147 cos(y) < 0 at y == 2
-0.416147 cos(y) < 0 at y == 2
-0.416147 cos(y) < 0 at y == 2
-0.989992 cos(y) < 0 at y == 3
-0.989992 cos(y) < 0 at y == 3
-0.989992 cos(y) < 0 at y == 3
-0.989992 cos(y) < 0 at y == 3
Evaluating gradient
Evaluating parallel_gradient
Evaluating sin(x) + cos(y), and just printing cos(y)
Evaluating sin(x) + cos(y), and printing cos(y) with more context
Evaluating sin(x) + cos(y), and printing cos(y) at a single pixel
Evaluating sin(x) + cos(y), and printing whenever cos(y) < 0
Printing a complex Expr: ((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((((1 + 2) + fizz) + 4) + buzz) + fizz) + 7) + 8) + fizz) + buzz) + 11) + fizz) + 13) + 14) + (fizz*buzz)) + 16) + 17) + fizz) + 19) + buzz) + fizz) + 22) + 23) + fizz) + buzz) + 26) + fizz) + 28) + 29) + (fizz*buzz)) + 31) + 32) + fizz) + 34) + buzz) + fizz) + 37) + 38) + fizz) + buzz) + 41) + fizz) + 43) + 44) + (fizz*buzz)) + 46) + 47) + fizz) + 49) + buzz) + fizz) + 52) + 53) + fizz) + buzz) + 56) + fizz) + 58) + 59) + (fizz*buzz)) + 61) + 62) + fizz) + 64) + buzz) + fizz) + 67) + 68) + fizz) + buzz) + 71) + fizz) + 73) + 74) + (fizz*buzz)) + 76) + 77) + fizz) + 79) + buzz) + fizz) + 82) + 83) + fizz) + buzz) + 86) + fizz) + 88) + 89) + (fizz*buzz)) + 91) + 92) + fizz) + 94) + buzz) + fizz) + 97) + 98) + fizz)
Success!
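The 64 Store lines of the serial gradient pipeline in the output above follow a fixed order: x runs from 0 to 7 for each y in turn. A plain C++ loop nest with y as the outer loop and x as the inner loop reproduces that order. The sketch below is a model written for these notes, not code generated by Halide; `Store`, `trace_gradient`, and `format_store` are hypothetical names.

```cpp
#include <string>
#include <vector>

// One traced store: the coordinates and the value written there.
struct Store {
    int x, y, value;
};

// Model of the unscheduled loop nest for gradient(x, y) = x + y.
// y is the outer loop and x the inner loop, so x varies fastest,
// matching the order of the serial trace shown in the output.
inline std::vector<Store> trace_gradient(int width, int height) {
    std::vector<Store> log;
    for (int y = 0; y < height; y++) {
        for (int x = 0; x < width; x++) {
            log.push_back({x, y, x + y});
        }
    }
    return log;
}

// Format a store in the same shape as the trace lines in the output.
inline std::string format_store(const Store &s) {
    return "Store gradient.0(" + std::to_string(s.x) + ", " +
           std::to_string(s.y) + ") = " + std::to_string(s.value);
}
```

For an 8x8 region, `trace_gradient(8, 8)` yields 64 entries from (0, 0) through (7, 7). The parallel_gradient trace contains the same 64 stores, but with rows interleaved, because each row may run on a different thread.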
Key takeaways on the debugging methods Halide provides:
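1. Func.trace_stores() traces the values a Func computes and stores at runtime.
2. Func.parallel(y) runs the computation on multiple threads along one dimension of the domain.
3. print() prints the value of the expression you are interested in.
4. print_when() prints the value only when the given condition is true; it can also be used to suppress output when the condition is false.
5. Sending a complex expression to a C++ output stream lets you check that it was constructed as expected.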
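For the last point, one way to check a printed Expr is to build the expected text independently and compare. Judging from the output of this lesson, each `e += term` on an Expr prints as `(e + term)`, and `fizz*buzz` prints as `(fizz*buzz)`. The sketch below reproduces that text with ordinary strings. It is an assumption-based model of the printed form, not Halide's printer, and `build_fizzbuzz_string` is a hypothetical helper.

```cpp
#include <string>

// Build the same expression as the lesson's loop, but as a plain string.
// Every added term wraps the running expression in one more pair of
// parentheses, mirroring the printed form seen in the lesson's output.
// `limit` is the exclusive upper bound of the loop (100 in the lesson).
inline std::string build_fizzbuzz_string(int limit) {
    std::string e = "1";
    for (int i = 2; i < limit; i++) {
        std::string term;
        if (i % 3 == 0 && i % 5 == 0) term = "(fizz*buzz)";
        else if (i % 3 == 0) term = "fizz";
        else if (i % 5 == 0) term = "buzz";
        else term = std::to_string(i);
        e = "(" + e + " + " + term + ")";
    }
    return e;
}
```

With `limit` set to 100 the loop adds 98 terms, so the result starts with 98 opening parentheses and ends with `+ 98) + fizz)`, the same shape as the expression printed by the lesson.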