[Learning Exchange] [Shanghai Campus] Cython Tutorial: Partial Translation

Cython is used to speed up Python. Put simply, it is Python with C data types.

1. hello world
Create a file helloworld.pyx and add the test code to it:

print("Hello World")
Create a file setup.py and add the build code to it:

# setuptools replaces distutils, which was removed in Python 3.12
from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules=cythonize("helloworld.pyx")
)
Run the command:

python setup.py build_ext --inplace
Import the helloworld module in a test file:

import helloworld
Running it prints "Hello World".
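
As a rough aid for checking the build step above: build_ext --inplace normally leaves a compiled extension file next to helloworld.pyx, and its exact name depends on the platform and Python version. The sketch below only lists the filenames the import system would accept; the helper name candidate_extension_filenames is made up for illustration.

```python
import importlib.machinery


def candidate_extension_filenames(module_name):
    """List the filenames the import system accepts for a compiled module."""
    # EXTENSION_SUFFIXES holds the platform-specific suffixes (.so, .pyd, ...).
    return [module_name + suffix
            for suffix in importlib.machinery.EXTENSION_SUFFIXES]


print(candidate_extension_filenames("helloworld"))
```

If none of the printed names exists in the working directory after the build, import helloworld will not find the compiled module.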

2. pyximport: Cython Compilation for Developers
If your Cython module needs no extra C libraries or special build setup, you can use the pyximport module to load .pyx files directly on import, without running setup.py. Usage:

>>> import pyximport; pyximport.install()
>>> import helloworld
Hello World
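Note: using pyximport to build code on the end user's machine is not recommended, because the build then depends on the user's system. Ship precompiled binary packages in the wheel packaging format instead.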

3. Fibonacci fun
Create a file fib.pyx and define the function in it:

from __future__ import print_function

def fib(n):
    """Print the Fibonacci series up to n."""
    a, b = 0, 1
    while b < n:
        print(b, end=' ')
        a, b = b, a + b

    print()
Write the build script setup.py in the same way:

# setuptools replaces distutils, which was removed in Python 3.12
from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules=cythonize("fib.pyx"),
)
Build the C extension:

python setup.py build_ext --inplace
Call the function and check the result:

>>> import fib
>>> fib.fib(2000)
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597
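
To check the output above without building anything, the same loop can be written in plain Python so that it returns the series instead of printing it. This is only a sketch; fib_list is a name introduced here and is not part of the tutorial.

```python
def fib_list(n):
    """Return the Fibonacci numbers below n as a list (same loop as fib)."""
    result = []
    a, b = 0, 1
    while b < n:
        result.append(b)
        a, b = b, a + b
    return result


print(fib_list(2000))
```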

4. Primes
Create a file primes.pyx and define a function that finds primes:

def primes(int nb_primes):
    cdef int n, i, len_p
    cdef int p[1000]
    if nb_primes > 1000:
        nb_primes = 1000

    len_p = 0  # The current number of elements in p.
    n = 2
    while len_p < nb_primes:
        # Is n prime?
        for i in p[:len_p]:
            if n % i == 0:
                break

        # If no break occurred in the loop, we have a prime.
        else:
            p[len_p] = n
            len_p += 1
        n += 1

    # Let's return the result in a python list:
    result_as_list = [prime for prime in p[:len_p]]
    return result_as_list
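In the code above, the two lines

cdef int n, i, len_p
cdef int p[1000]
use cdef to declare local C variables. While the function runs, the primes found are stored in the C array p, and the second-to-last line copies them into a Python list (result_as_list).

Result: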

>>> import primes
>>> primes.primes(10)
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
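
The fixed-size buffer pattern described above (a preallocated array p, a fill counter len_p, and a final copy into a list) can be imitated in plain Python with the standard array module. This is only an illustrative model: primes_fixed_buffer and its capacity parameter are introduced here, and an array.array is not a C stack array.

```python
from array import array


def primes_fixed_buffer(nb_primes, capacity=1000):
    """Find primes using a preallocated buffer, mirroring the Cython version."""
    nb_primes = min(nb_primes, capacity)   # never write past the buffer
    p = array('i', [0] * capacity)         # stands in for `cdef int p[1000]`
    len_p = 0                              # number of slots filled so far
    n = 2
    while len_p < nb_primes:
        for i in p[:len_p]:
            if n % i == 0:
                break
        else:
            # The loop finished without break, so n is prime.
            p[len_p] = n
            len_p += 1
        n += 1
    # Copy only the filled part of the buffer into a Python list.
    return list(p[:len_p])


print(primes_fixed_buffer(10))
```

The capacity clamp plays the role of the nb_primes > 1000 check: asking for more primes than the buffer holds returns only as many as fit.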

5. Compiling a .py file directly with Cython
Create a file primes_py.py and define a Python function:

def primes_python(nb_primes):
    p = []
    n = 2
    while len(p) < nb_primes:
        # Is n prime?
        for i in p:
            if n % i == 0:
                break

        # If no break occurred in the loop
        else:
            p.append(n)
        n += 1
    return p
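Compile the Python code directly with Cython. In the comparisons that follow, primes_python_compiled refers to the Cython-compiled build of this function.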

# setuptools replaces distutils, which was removed in Python 3.12
from setuptools import setup
from Cython.Build import cythonize

setup(
    ext_modules=cythonize(['primes.pyx',     # Cython code file with primes() function
                           'primes_py.py'],  # Python code file with primes_python_compiled() function
                          annotate=True),    # enables generation of the html annotation file
)
Check that the Cython code and the directly compiled Python code give the same results:

>>> primes_python(1000) == primes(1000)
True
>>> primes_python_compiled(1000) == primes(1000)
True
Compare the running time of the three approaches:

python -m timeit -s 'from primes_py import primes_python' 'primes_python(1000)'
10 loops, best of 3: 23 msec per loop

python -m timeit -s 'from primes_py import primes_python_compiled' 'primes_python_compiled(1000)'
100 loops, best of 3: 11.9 msec per loop

python -m timeit -s 'from primes import primes' 'primes(1000)'
1000 loops, best of 3: 1.65 msec per loop
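Compiling the Python code directly with Cython makes it about 2x as fast as interpreted Python (11.9 msec vs. 23 msec), while the hand-written Cython code is roughly 14x as fast (1.65 msec vs. 23 msec).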
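
The same kind of measurement can be made from inside Python with the standard timeit module. The sketch below times only the interpreted primes_python function and then recomputes the ratios from the figures quoted above; best_time_ms and speedup are helper names introduced here, and the timings on another machine will differ.

```python
import timeit


def primes_python(nb_primes):
    p = []
    n = 2
    while len(p) < nb_primes:
        for i in p:
            if n % i == 0:
                break
        else:
            p.append(n)
        n += 1
    return p


def best_time_ms(func, arg, number=3, repeat=3):
    """Best per-call time in milliseconds over `repeat` runs of `number` calls."""
    timings = timeit.repeat(lambda: func(arg), number=number, repeat=repeat)
    return min(timings) / number * 1000.0


def speedup(baseline_ms, other_ms):
    """How many times faster `other` is than `baseline`."""
    return baseline_ms / other_ms


print("primes_python(200): %.3f msec" % best_time_ms(primes_python, 200))
print("compiled .py vs. Python: %.2fx" % speedup(23.0, 11.9))
print("hand-written .pyx vs. Python: %.2fx" % speedup(23.0, 1.65))
```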

6. Memory Allocation
For large or complex objects, you need to request and release memory by hand. The C functions malloc(), realloc() and free() can be brought into Cython with a cimport from libc.stdlib.

void* malloc(size_t size)
void* realloc(void* ptr, size_t size)
void free(void* ptr)
Example:

import random
from libc.stdlib cimport malloc, free

def random_noise(int number=1):
    cdef int i
    # allocate number * sizeof(double) bytes of memory
    cdef double *my_array = <double *> malloc(number * sizeof(double))
    if not my_array:
        raise MemoryError()

    try:
        ran = random.normalvariate
        for i in range(number):
            # store each sample in its own slot of the C array
            my_array[i] = ran(0, 1)

        # ... let's just assume we do some more heavy C calculations here to make up
        # for the work that it takes to pack the C double values into Python float
        # objects below, right after throwing away the existing objects above.

        return [x for x in my_array[:number]]
    finally:
        # return the previously allocated memory to the system
        free(my_array)
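
When testing a compiled function like random_noise, a pure-Python stand-in can serve as a reference for the shape of the result. The sketch below is such a stand-in under stated assumptions: random_noise_reference, its seed parameter and bytes_needed are introduced here and do not appear in the tutorial, and no manual memory management takes place.

```python
import random
import struct


def random_noise_reference(number=1, seed=None):
    """Return `number` samples from N(0, 1) as a list of Python floats."""
    rng = random.Random(seed)  # `seed` is a hypothetical extra, for repeatable tests
    return [rng.normalvariate(0, 1) for _ in range(number)]


def bytes_needed(number):
    """Mirror `number * sizeof(double)` from the malloc call."""
    # struct.calcsize('d') is the size in bytes of a native C double.
    return number * struct.calcsize('d')


print(len(random_noise_reference(5, seed=1)), bytes_needed(5))
```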
The same can be done with the CPython memory API that Cython exposes, PyMem_Malloc, PyMem_Realloc and PyMem_Free:

from cpython.mem cimport PyMem_Malloc, PyMem_Realloc, PyMem_Free
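Memory blocks can be managed with a try..finally block as in the example above. For large, long-lived blocks, a better approach is to tie the memory to a Python object and let the Python runtime manage its lifetime. A simple example follows: the memory is allocated when the object is created and freed when the object is reclaimed.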

from cpython.mem cimport PyMem_Malloc, PyMem_Realloc, PyMem_Free

cdef class SomeMemory:

    cdef double* data

    def __cinit__(self, size_t number):
        # allocate some memory (uninitialised, may contain arbitrary data)
        self.data = <double*> PyMem_Malloc(number * sizeof(double))
        if not self.data:
            raise MemoryError()

    def resize(self, size_t new_number):
        # Allocates new_number * sizeof(double) bytes,
        # preserving the current content and making a best-effort to
        # re-use the original data location.
        mem = <double*> PyMem_Realloc(self.data, new_number * sizeof(double))
        if not mem:
            raise MemoryError()
        # Only overwrite the pointer if the memory was really reallocated.
        # On error (mem is NULL), the originally memory has not been freed.
        self.data = mem

    def __dealloc__(self):
        PyMem_Free(self.data)  # no-op if self.data is NULL
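
The resize contract used above (existing content is preserved, and the stored pointer is replaced only after the new block exists) can be modelled in plain Python. This is a sketch for illustration only: DoubleBuffer is a name introduced here, it uses array.array rather than raw memory, and unlike PyMem_Malloc it zero-fills new slots.

```python
from array import array


class DoubleBuffer:
    """Pure-Python model of SomeMemory: the buffer lives as long as the object."""

    def __init__(self, number):
        # Zero-filled here; the C version leaves the memory uninitialised.
        self.data = array('d', [0.0] * number)

    def resize(self, new_number):
        # Build the new buffer first and assign it last, so self.data is
        # untouched if anything above the assignment raises.
        kept = self.data[:new_number]
        padding = array('d', [0.0] * (new_number - len(kept)))
        self.data = kept + padding


buf = DoubleBuffer(3)
buf.data[0] = 1.5
buf.resize(5)
print(list(buf.data))
```

There is no counterpart to __dealloc__ in this model, because the array is released by the Python runtime when the DoubleBuffer object goes away.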

7. Using dynamic arrays
An example that computes the edit distance:

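from libc.stdlib cimport malloc, free

def calculate_edit_distance(word1, word2):
    cdef int len1 = len(word1)
    cdef int len2 = len(word2)
    cdef int i, j, delta, result

    # Allocate a (len1 + 1) x (len2 + 1) table as an array of row pointers
    # (NULL checks on malloc are omitted to keep the example short).
    cdef int** dp = <int**> malloc((len1 + 1) * sizeof(int*))
    for i in range(len1 + 1):
        dp[i] = <int*> malloc((len2 + 1) * sizeof(int))

    # Base cases: distance between a prefix and the empty string.
    for i in range(len1 + 1):
        dp[i][0] = i
    for j in range(len2 + 1):
        dp[0][j] = j
    for i in range(1, len1 + 1):
        for j in range(1, len2 + 1):
            delta = 0 if word1[i - 1] == word2[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j - 1] + delta, min(dp[i - 1][j] + 1, dp[i][j - 1] + 1))
    result = dp[len1][len2]

    # Free every row first, then the array of row pointers.
    for i in range(len1 + 1):
        free(dp[i])
    free(dp)

    return result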
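
A plain-Python version of the same dynamic-programming table is handy as a reference when checking the compiled function. The sketch below follows the same recurrence with nested lists instead of malloc; edit_distance_reference is a name introduced here.

```python
def edit_distance_reference(word1, word2):
    """Levenshtein distance via the same dp table, using Python lists."""
    len1, len2 = len(word1), len(word2)
    dp = [[0] * (len2 + 1) for _ in range(len1 + 1)]

    # Base cases: distance between a prefix and the empty string.
    for i in range(len1 + 1):
        dp[i][0] = i
    for j in range(len2 + 1):
        dp[0][j] = j

    for i in range(1, len1 + 1):
        for j in range(1, len2 + 1):
            delta = 0 if word1[i - 1] == word2[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j - 1] + delta,  # substitute or match
                           dp[i - 1][j] + 1,          # delete from word1
                           dp[i][j - 1] + 1)          # insert into word1
    return dp[len1][len2]


print(edit_distance_reference("kitten", "sitting"))
```

Comparing calculate_edit_distance against this function on a handful of word pairs is a quick way to catch indexing mistakes in the C table.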

---------------------
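Author: koibiki
Source: CSDN
Original post: https://blog.csdn.net/koibiki/article/details/83069468
Copyright notice: this is an original article by the blogger. Please include a link to the original post when reposting.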
