
Writing Faster Python Code by Compiling with Cython

2015-06-19 10:46
Original source: http://www.behnel.de/cython200910/talk.html (the original text follows)

About myself

Passionate Python developer since 2002

after Basic, Logo, Pascal, Prolog, Scheme, Java, C, ...

CS studies in Germany, Ireland, France

PhD in distributed systems in 2007

Language design for self-organising systems

Darmstadt University of Technologies, Germany

Current occupations:

Employed by Senacor Technologies AG, Germany

IT transformations, SOA design, Java-Development, ...

»lxml« OpenSource XML toolkit for Python

http://codespeak.net/lxml/

»Cython«

Part 1: Intro to Cython

Part 1: Intro to Cython

Part 2: Building Cython modules

Part 3: Writing fast code

Part 4: Talking to other extensions

What is Cython?

Cython is the missing link
between the simplicity of Python
and the speed of C / C++ / Fortran.


What is Cython?

Cython is
an Open-Source project

http://cython.org

http://pypi.python.org/pypi/Cython

a Python compiler (almost)

an enhanced, optimising fork of Pyrex

an extended Python language for

writing fast Python extension modules

interfacing Python with C libraries

Major Cython Core Developers

Robert Bradshaw, Stefan Behnel, Dag Sverre Seljebotn

lead developers

Lisandro Dalcín

C/C++ portability and various feature patches

Kurt Smith, Danilo Freitas

Google Summer of Code 2009: Fortran/C++ integration

Greg Ewing

main developer and maintainer of Pyrex

many, many others - see

http://cython.org/

the mailing list archives of Cython and Pyrex

How to use Cython

you write Python code

Cython translates it into C code

your C compiler builds a shared library for CPython

you import your module into CPython

Cython has support for

distutils

optionally compile Python code from setup.py!

Cython does that for its own modules :-)

embedding the CPython runtime in an executable
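The last step of the workflow is an ordinary import, so calling code often cannot tell whether it received the compiled module or the plain .py file. A minimal sketch of one way to check, using only the standard library; the helper name is_compiled is hypothetical and not part of Cython:

```python
import importlib.machinery
import json


def is_compiled(module):
    """Return True if `module` was loaded from a shared library (.so/.pyd)."""
    path = getattr(module, "__file__", None) or ""
    # EXTENSION_SUFFIXES lists the file endings CPython accepts for
    # extension modules on this platform, e.g. ".so" or ".pyd".
    return path.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES))


# json is a pure-Python package in the standard library.
print(is_compiled(json))
```

Calling the same helper on a module built by Cython should report True, since the import then resolves to the shared library rather than to the source file.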

Example: compiling Python code

# file: worker.py

class HardWorker(object):
    u"Almost Sisyphos"
    def __init__(self, task):
        self.task = task
    def work_hard(self, repeat=100):
        for i in range(repeat):
            self.task()

def add_simple_stuff():
    x = 1+1

HardWorker(add_simple_stuff).work_hard()

Example: compiling Python code

compile with
$ cython worker.py

translates to ~1500 line .c file (Cython 0.11.3)

lots of portability #define's

different C compilers, Python versions, ...

tons of helpful C comments with Python code snippets

helps tracing your own code in generated sources

a lot of code that you don't want to write yourself

Portable Code

Cython compiler generates C code that compiles

with all major compilers (C and C++)

on all major platforms

in Python 2.3 through 3.1

Cython language syntax follows Python 2.6

optional Python 3 syntax support is on TODO list

get involved to get it quicker!

... the fastest way to port Python 2 code to Py3 ;-)

Python language feature support

most of Python 2 syntax is supported

top-level classes and functions

control structures: loops, with, try-except/finally, ...

object operations, arithmetic, ...

plus many Py3 features:

list/set/dict comprehensions

keyword-only arguments

extended iterable unpacking (a,b,*c,d = some_list)
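For readers who have not met these Py3 features, here is a short plain-Python 3 illustration of each one. The function connect and the sample data are made up for the example:

```python
def connect(host, *, timeout=10):
    """`timeout` is keyword-only: it cannot be passed positionally."""
    return (host, timeout)


some_list = [1, 2, 3, 4, 5]
a, b, *c, d = some_list          # c collects whatever is left in the middle

squares = [x * x for x in some_list]          # list comprehension
evens = {x for x in some_list if x % 2 == 0}  # set comprehension
index = {x: str(x) for x in some_list}        # dict comprehension

print(connect("example.org", timeout=3), c, evens)
```

Calling connect("example.org", 3) raises a TypeError, because everything after the bare * must be passed by keyword.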

Python features in work

Inner functions with closures
def factory(a,b):
    def closure_function(c):
        return a+b+c
    return closure_function

status: (hopefully) to be merged for 0.12

Planned Cython features

improved C++ integration (GSoC 2009)

e.g. function/operator overloading support

status: mostly there, to be finished and integrated

improved Fortran integration (GSoC 2009)

talking to Fortran code directly

status: mostly there, to be finished and integrated

native array data type with SIMD behaviour

status: large interest, implementation pending

... as usual: great ideas, little time

Currently unsupported

local/inner classes (~open)

lambda expressions (~easy)

generators (~needs work)

generator expressions (~easy)

with obvious optimisations, e.g.
set( x.a for x in some_list )  ==  { x.a for x in some_list }

... all certainly on the TODO list for 1.0.
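The optimisation mentioned for generator expressions relies on the two spellings producing the same set. A small plain-Python check of that equivalence; the Item record type is hypothetical and only stands in for "objects with an attribute a":

```python
from collections import namedtuple

Item = namedtuple("Item", "a")   # hypothetical record type with an attribute `a`
some_list = [Item(1), Item(2), Item(2), Item(3)]

via_genexp = set(x.a for x in some_list)   # generator expression fed to set()
via_setcomp = {x.a for x in some_list}     # set comprehension, no generator object

assert via_genexp == via_setcomp
```

Because the results match, a compiler is free to build the set directly and skip creating the intermediate generator.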

Speed

Cython generates very efficient C code:
PyBench: most benchmarks run 20-80% faster

conditions and loops run 5-8x faster than in Py2.6.2

overall about 30% faster for plain Python benchmark

obviously, real applications are different

PyPy's richards.py benchmark:

heavily class based scheduler

20% faster than CPython 2.6.2

Type declarations

Cython supports optional type declarations that
can be employed exactly where performance matters

let Cython generate plain C instead of C-API calls

make richards.py benchmark 5x faster than CPython

without Python code modifications :)

can make code 100 - 1000x faster than CPython

expect several 100 times in calculation loops

Part 2: Building Cython modules

Ways to build Cython code

To compile Python code (.py) or Cython code (.pyx)
you need:

Cython, Python and a C compiler

you can use:

distutils

setup.py script (likely required anyway)

pyximport

on-the-fly build + import (for experiments)

Sage notebook

web app that supports writing and running Cython code

cython source.pyx + manual C compilation

Example: distutils

A minimal setup.py script:
from distutils.core import setup
from distutils.extension import Extension
from Cython.Distutils import build_ext

ext_modules = [Extension("worker", ["worker.py"])]

setup(
    name = 'stupid little app',
    cmdclass = {'build_ext': build_ext},
    ext_modules = ext_modules
)

Run with
$ python setup.py build_ext --inplace

Example: pyximport

Build and import Cython code files (.pyx) on the fly
$ ls
worker.pyx
$ PYTHONPATH=. python
Python 2.6.2 (r262:71600, Apr 17 2009, 11:29:30)
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import pyximport
>>> pyximport.install()
>>> import worker
>>> worker
<module 'worker' from '~/.pyxbld/.../worker.so'>
>>> worker.HardWorker
<class 'worker.HardWorker'>
>>> worker.HardWorker(worker.add_simple_stuff).work_hard()

pyximporting Python modules

pyximport can also compile Python modules:

>>> import pyximport
>>> pyximport.install(pyimport = True)
>>> import shlex
[lots of compiler errors from different modules ...]
>>> help(shlex)

currently works for a few stdlib modules

falls back to normal Python import automatically

not production ready, but nice for testing :)
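Since pyximport is only there when Cython is installed, a script that wants on-the-fly builds can guard the import and carry on with plain Python otherwise. A minimal sketch; the flag name HAVE_PYXIMPORT is an assumption made for this example:

```python
try:
    import pyximport            # ships with Cython
except ImportError:
    pyximport = None            # Cython not installed: plain imports only

HAVE_PYXIMPORT = pyximport is not None

if HAVE_PYXIMPORT:
    # Registers an import hook that builds .pyx files on first import.
    pyximport.install()

import shlex                    # a normal import works either way

print("on-the-fly builds enabled:", HAVE_PYXIMPORT)
```

This keeps the experiment optional: the same script runs on machines without a C compiler or without Cython, just without the compiled speed-up.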

Writing executable programs

# file: hw.py

def hello_world():
    import sys
    print "Welcome to Python %d.%d!" % sys.version_info[:2]

if __name__ == '__main__':
    hello_world()

Compile, link and run:
$ cython --embed hw.py   # <- embed a main() function
$ gcc $CFLAGS -I/usr/include/python2.6 \
    -o hw hw.c -lpython2.6 -lpthread -lm -lutil -ldl
$ ./hw
Welcome to Python 2.6!

Part 3: Writing fast code

A simple example

Plain Python code:

# integrate_py.py

from math import sin

def f(x):
    return sin(x**2)

def integrate_f(a, b, N):
    dx = (b-a)/N
    s = 0
    for i in range(N):
        s += f(a+i*dx)
    return s * dx

Type declarations in Cython

Function arguments are easy
Python:
def f(x):
    return sin(x**2)

Cython:
def f(double x):
    return sin(x**2)

Type declarations in Cython

»cdef« keyword declares
variables with C or builtin types
cdef double dx, s

functions with C signatures
cdef double f(double x):
    return sin(x**2)

classes as 'builtin' extension types
cdef class MyType:
    cdef int field

Functions: def vs. cdef vs. cpdef

def func(int x):

part of the Python module API

Python call semantics

cdef int func(int x):

C signature

C call semantics

cpdef int func(int x):

Python wrapper around cdef function

C calls cdef function, Python calls wrapper

note: modified C signature!

Typed arguments and return values

def func(int x):

caller passes Python objects for x

function converts to int on entry

implicit return type always object

cdef int func(int x):

caller converts arguments as required

function receives C int for x

arbitrary return type, defaults to object

cpdef int func(int x):

wrapper converts

C callers convert arguments as required

Python callers pass and receive objects

A simple example: Cython

# integrate_cy.pyx

cdef extern from "math.h":
    double sin(double x)

cdef double f(double x):
    return sin(x**2)

cpdef double integrate_f(double a, double b, int N):
    cdef double dx, s
    cdef int i

    dx = (b-a)/N
    s = 0
    for i in range(N):
        s += f(a+i*dx)
    return s * dx

Overriding declarations in .pxd

Python integrate_py.py:

# integrate_py.py

from math import sin

def f(x):
    return sin(x**2)

def integrate_f(a, b, N):
    dx = (b-a)/N
    s = 0
    for i in range(N):
        s += f(a+i*dx)
    return s * dx

Cython integrate_py.pxd:

# integrate_py.pxd

cimport cython

cpdef double f(double x)

@cython.locals(
    dx=double, s=double, i=int)
cpdef integrate_f(
    double a, double b, int N)

The .pxd file used

# integrate_py.pxd

cimport cython

cpdef double f(double x)

cpdef double integrate_f(double a, double b, int N)

Overriding declarations in .pxd

advantage:

plain Python code

runs unchanged in Python interpreter

complete Python tool-chain available

Eclipse, pylint, 2to3, ...

drawback:

no access to C functions

cannot override from math import sin

Typing in Python syntax

http://wiki.cython.org/pure

from math import sin
import cython

@cython.locals(x=cython.double)
def f(x):
    return sin(x**2)

@cython.locals(a=cython.double, b=cython.double,
               N=cython.Py_ssize_t, dx=cython.double,
               s=cython.double, i=cython.Py_ssize_t)
def integrate_f(a, b, N):
    dx = (b-a)/N
    s = 0
    for i in range(N):
        s += f(a+i*dx)
    return s * dx
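The decorated module above still needs import cython to succeed when it runs uncompiled. One possible way to keep it runnable on machines without Cython is a small fallback, sketched here for Python 3. The stand-in class is an assumption made for this example and is not Cython's own API; it only mimics the three names the code uses:

```python
from math import sin

try:
    import cython                      # real module when Cython is installed
except ImportError:
    class cython:                      # hypothetical stand-in, NOT Cython's API
        double = float
        Py_ssize_t = int

        @staticmethod
        def locals(**types):
            return lambda func: func   # ignore the type hints


@cython.locals(x=cython.double)
def f(x):
    return sin(x ** 2)


@cython.locals(a=cython.double, b=cython.double, N=cython.Py_ssize_t,
               dx=cython.double, s=cython.double, i=cython.Py_ssize_t)
def integrate_f(a, b, N):
    dx = (b - a) / N
    s = 0.0
    for i in range(N):
        s += f(a + i * dx)
    return s * dx


print(integrate_f(0.0, 1.0, 1000))     # left Riemann sum of sin(x**2) on [0, 1]
```

Run as plain Python, the decorators do nothing and the function behaves like the untyped original; compiled with Cython, the same file would pick up the declared C types.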

Declaring Python types

Access to Python's builtins is heavily optimised

for ... in range()/list/tuple/dict

list.append(), list.reverse()

set([...]), tuple([...])

Further improvements in Cython 0.12

replacements for enumerate(), type()

dict([...]), unicode.encode(), list.sort()

Declaring Python types is often worth it!

Easy to add new optimisations

don't write prematurely optimised code, fix Cython!

Declaring Python types: dict

example: dict iteration

def filter_a(d):
    return { key : value
             for key, value in d.iteritems()
             if 'a' not in value }

import string
d = { s:s for s in string.ascii_letters }

print filter_a(d)

Declaring Python types: dict

simple change, ~30% faster:

def filter_a(dict d):       # <====
    return { key : value
             for key, value in d.iteritems()
             if 'a' not in value }

import string
d = { s:s for s in string.ascii_letters }

print filter_a(d)

drawback:

non-dict mapping arguments raise a TypeError
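To see what that drawback means for callers, the argument check can be imitated in plain Python 3. This is only an approximation written for illustration: the isinstance test stands in for the check Cython generates, whose exact rules (for example for dict subclasses) may differ.

```python
import string
from types import MappingProxyType


def filter_a(d):
    # Plain-Python stand-in for the check implied by `def filter_a(dict d)`.
    if not isinstance(d, dict):
        raise TypeError("expected dict, got %s" % type(d).__name__)
    return {key: value for key, value in d.items() if 'a' not in value}


d = {s: s for s in string.ascii_letters}
print(len(filter_a(d)))                 # the entry for 'a' is dropped

try:
    filter_a(MappingProxyType(d))       # a read-only mapping, not a dict
except TypeError as exc:
    print("rejected:", exc)
```

The untyped version would have accepted the read-only mapping, which is the generality that is traded away for speed.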

Think twice before you type



benchmark code before adding static types!
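A baseline measurement can be taken with the standard library before any types are added. A minimal sketch for Python 3, timing the untyped dict filter from the previous slides; the repeat and call counts are arbitrary choices:

```python
import string
import timeit


def filter_a(d):
    return {key: value for key, value in d.items() if 'a' not in value}


d = {s: s for s in string.ascii_letters}

# Best of 5 runs, 10000 calls each; taking the minimum reduces timer noise.
calls = 10000
runs = timeit.repeat(lambda: filter_a(d), repeat=5, number=calls)
best_per_call = min(runs) / calls
print("filter_a: %.2f microseconds per call" % (best_per_call * 1e6))
```

Repeating the same measurement on the compiled and on the typed variant shows whether a declaration actually pays off for the workload at hand.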

Classes

class MyClass(object):

Python class with __dict__

multiple inheritance

arbitrary Python attributes

Python methods

monkey-patchable etc.

cdef class MyClass(SomeSuperClass):

"builtin" extension type

single inheritance

only from other extension types!

fixed, typed fields

C-only access by default, or readonly/public

Python + C methods
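The contrast between an open __dict__ and a fixed set of fields can be felt in plain Python with __slots__. This is only a loose analogy chosen for illustration: slots are not typed and are not C fields, so it does not reproduce what a cdef class does. Both class names are hypothetical.

```python
class PlainPoint(object):
    """Regular Python class: instances carry a __dict__."""
    def __init__(self, x, y):
        self.x = x
        self.y = y


class SlottedPoint(object):
    """Fixed set of fields, loosely comparable to cdef class attributes."""
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


p = PlainPoint(1, 2)
p.label = "origin-ish"          # arbitrary new attribute: fine

q = SlottedPoint(1, 2)
try:
    q.label = "nope"            # no __dict__, so this raises AttributeError
except AttributeError:
    print("SlottedPoint rejects unknown attributes")
```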

cdef classes - when to use them?

Use cdef classes

when C attribute types are used

e.g. whenever wrapping C structs/pointers/etc.

when the need for speed beats Python's generality

Use Python classes

for bytes/tuple subtypes (PyVarObject)

for exceptions if Py<2.5 compatibility is required

when multiple inheritance is required

when users are allowed to monkey-patch

Part 4: Talking to other extensions

Talking to other extensions

Python 3 buffer protocol (available in Py2.6)

external C-APIs

Python 3 buffer protocol

Native support for new Python buffer protocol

PEP 3118

def inplace_invert_2D_buffer(
        object[unsigned char, 2] image):
    cdef int i, j
    for i in range(image.shape[0]):
        for j in range(image.shape[1]):
            image[i, j] = 255 - image[i, j]

can be supported for extension types in Py2.x

declared through .pxd files

Cython ships with numpy.pxd

array.pxd available (stdlib's array)
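The same buffer protocol is reachable from plain Python 3 through memoryview, which makes it easy to try the 2D indexing idea without compiling anything. A rough uncompiled counterpart of the function above, assuming a 2 x 3 image stored in a bytearray; the function name and the sample bytes are made up for the example:

```python
def inplace_invert_2d(image):
    """Invert every byte of a 2D unsigned-char buffer in place."""
    rows, cols = image.shape
    for i in range(rows):
        for j in range(cols):
            image[i, j] = 255 - image[i, j]


pixels = bytearray([0, 1, 2, 253, 254, 255])
# View the six bytes as a 2 x 3 array of unsigned chars without copying.
view = memoryview(pixels).cast("B", shape=[2, 3])
inplace_invert_2d(view)
print(list(pixels))
```

The memoryview shares memory with the bytearray, so the change is visible in pixels afterwards. The Cython version performs the same indexing but, thanks to the typed buffer declaration, can do it without creating Python objects for each element.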

Conclusion

Cython is a tool for

translating Python code to efficient C

easily interfacing to external C/C++/Fortran code

Use it to

speed up existing Python modules

concentrate on optimisations, not rewrites!

write C extensions for CPython

don't change the language just to get fast code!

wrap C libraries in Python

concentrate on the mapping, not the glue!

... but Cython is also

a great project

a very open playground for great ideas!

Cython
C-Extensions in Python
... use it, and join the project!
http://cython.org/