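Cython speeds up Python code; it can be loosely described as Python with C data types.
1. Hello World
Create a file named helloworld.pyx and add the test code:
print("Hello World")
Create a file named setup.py containing the build code (setuptools is used here because distutils was removed in Python 3.12):
from setuptools import setup
from Cython.Build import cythonize
setup(
    ext_modules=cythonize("helloworld.pyx")
)
Run the command: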
python setup.py build_ext --inplace
Import the helloworld module in a test file:
import helloworld
Running it prints Hello World.
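After the build step it can be useful to confirm that the import really picked up the compiled extension and not a stray .py file of the same name. The following is a minimal sketch; `is_extension_module` is a hypothetical helper name, not part of Cython, and it only inspects the file suffix of the loaded module.

```python
import importlib.machinery


def is_extension_module(module):
    """Return True if `module` was loaded from a compiled extension file."""
    # Built-in modules may have no __file__ attribute at all.
    path = getattr(module, "__file__", None) or ""
    # EXTENSION_SUFFIXES lists the suffixes this interpreter accepts for
    # compiled modules (for example ".so" or ".pyd" variants).
    return path.endswith(tuple(importlib.machinery.EXTENSION_SUFFIXES))
```

Calling `is_extension_module(helloworld)` after `import helloworld` should return True once the build has succeeded.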
2. pyximport: Cython Compilation for Developers
If a Cython module needs no extra C libraries and no special build setup, the pyximport module can load .pyx files directly on import, without running setup.py. Usage:
>>> import pyximport; pyximport.install()
>>> import helloworld
Hello World
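Note: building code with pyximport at the point of use is not recommended, because the build then depends on the user's system. Distribute precompiled binary packages in the wheel packaging format instead.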
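One way to follow that advice during development is to prefer an already built module and fall back to pyximport only when none is found. The sketch below assumes Cython is installed whenever the fallback path is taken; `import_with_pyximport_fallback` is a hypothetical helper name introduced for illustration.

```python
import importlib


def import_with_pyximport_fallback(name):
    """Import `name`, compiling it through pyximport only if needed."""
    try:
        # Normal path: a prebuilt extension (or plain module) is available.
        return importlib.import_module(name)
    except ImportError:
        # Development path: install the .pyx import hook and retry.
        import pyximport
        pyximport.install()
        return importlib.import_module(name)
```

For example, `helloworld = import_with_pyximport_fallback("helloworld")` uses the compiled module when it exists and builds helloworld.pyx on the fly otherwise.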
3. Fibonacci fun
Create a file named fib.pyx and define the function in it:
from __future__ import print_function
def fib(n):
    """Print the Fibonacci series up to n."""
    a, b = 0, 1
    while b < n:
        print(b, end=' ')
        a, b = b, a + b
    print()
Write the build script setup.py in the same way:
from setuptools import setup
from Cython.Build import cythonize
setup(
    ext_modules=cythonize("fib.pyx"),
)
Build the C extension:
python setup.py build_ext --inplace
Call the function and check the result:
>>> import fib
>>> fib.fib(2000)
1 1 2 3 5 8 13 21 34 55 89 144 233 377 610 987 1597
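Because fib() prints its output, it is awkward to check programmatically. A minimal pure-Python sketch that returns the same series as a list is shown here; `fib_list` is a hypothetical name and is not part of the original module.

```python
def fib_list(n):
    """Return the Fibonacci numbers below n as a list."""
    result = []
    a, b = 0, 1
    while b < n:
        result.append(b)
        a, b = b, a + b
    return result
```

The printed output of `fib.fib(2000)` can then be compared against `fib_list(2000)`.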
4. Primes
Create a file named primes.pyx and define a function that finds prime numbers:
def primes(int nb_primes):
    cdef int n, i, len_p
    cdef int p[1000]
    if nb_primes > 1000:
        nb_primes = 1000
    len_p = 0  # The current number of elements in p.
    n = 2
    while len_p < nb_primes:
        # Is n prime?
        for i in p[:len_p]:
            if n % i == 0:
                break
        # If no break occurred in the loop, we have a prime.
        else:
            p[len_p] = n
            len_p += 1
        n += 1
    # Let's return the result in a python list:
    result_as_list = [prime for prime in p[:len_p]]
    return result_as_list
Here, the two lines
cdef int n, i, len_p
cdef int p[1000]
use cdef to declare C local variables. At run time the primes are stored in the C array p, and the second-to-last line copies them into a Python list (result_as_list). Because p holds at most 1000 entries, nb_primes is capped at 1000.
Result:
>>> import primes
>>> primes.primes(10)
[2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
5. Compiling a .py file directly with Cython
Create a file named primes_py.py and define a plain Python function:
def primes_python(nb_primes):
    p = []
    n = 2
    while len(p) < nb_primes:
        # Is n prime?
        for i in p:
            if n % i == 0:
                break
        # If no break occurred in the loop
        else:
            p.append(n)
        n += 1
    return p
Compile the Python code directly with Cython:
from setuptools import setup
from Cython.Build import cythonize
setup(
    ext_modules=cythonize(['primes.pyx',    # Cython code file with primes() function
                           'primes_py.py'], # Python code file with primes_python_compiled() function
                          annotate=True),   # enables generation of the html annotation file
)
Check that the Cython version and the directly compiled Python version return the same results (primes_python_compiled here denotes the Cython-compiled build of the same Python function):
>>> primes_python(1000) == primes(1000)
True
>>> primes_python_compiled(1000) == primes(1000)
True
Compare the running time of the three versions:
python -m timeit -s 'from primes_py import primes_python' 'primes_python(1000)'
10 loops, best of 3: 23 msec per loop
python -m timeit -s 'from primes_py import primes_python_compiled' 'primes_python_compiled(1000)'
100 loops, best of 3: 11.9 msec per loop
python -m timeit -s 'from primes import primes' 'primes(1000)'
1000 loops, best of 3: 1.65 msec per loop
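Compiling the unchanged Python code with Cython makes it about twice as fast as the interpreted version (23 ms vs. 11.9 ms), while the hand-written Cython version is roughly 14 times as fast (23 ms vs. 1.65 ms).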
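The same measurement can be made from a script with the standard timeit module. The sketch below is self-contained, so it times only the pure-Python function; `best_ms` and `speedup` are hypothetical helper names, and the compiled functions can be passed to `best_ms` in the same way once they are built.

```python
import timeit


def primes_python(nb_primes):
    """Pure-Python reference: return the first nb_primes primes."""
    p = []
    n = 2
    while len(p) < nb_primes:
        for i in p:
            if n % i == 0:
                break
        else:
            p.append(n)
        n += 1
    return p


def best_ms(func, arg, repeat=3, number=10):
    """Best per-call time of func(arg) in milliseconds."""
    times = timeit.repeat(lambda: func(arg), repeat=repeat, number=number)
    return min(times) / number * 1000.0


def speedup(baseline_ms, candidate_ms):
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


if __name__ == "__main__":
    print("primes_python(1000): %.2f ms per call" % best_ms(primes_python, 1000))
    # Ratios computed from the timings quoted in the text above.
    print("compiled .py speedup: %.2f" % speedup(23.0, 11.9))
    print("Cython .pyx speedup:  %.2f" % speedup(23.0, 1.65))
```

Taking the minimum over several repeats mirrors the "best of 3" figure that the timeit command line reports.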
6. Memory Allocation
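Large or complex objects need their memory to be requested and released by hand. The C functions malloc(), realloc(), and free() can be brought into Cython with a cimport from libc.stdlib. Their signatures are: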
void* malloc(size_t size)
void* realloc(void* ptr, size_t size)
void free(void* ptr)
Example:
import random
from libc.stdlib cimport malloc, free
def random_noise(int number=1):
    cdef int i
    # allocate number * sizeof(double) bytes of memory
    cdef double *my_array = <double *> malloc(number * sizeof(double))
    if not my_array:
        raise MemoryError()
    try:
        ran = random.normalvariate
        for i in range(number):
            my_array[i] = ran(0, 1)
        # ... let's just assume we do some more heavy C calculations here to make up
        # for the work that it takes to pack the C double values into Python float
        # objects below, right after throwing away the existing objects above.
        return [x for x in my_array[:number]]
    finally:
        # return the previously allocated memory to the system
        free(my_array)
The same can be done with PyMem_Malloc, PyMem_Realloc, and PyMem_Free, the CPython memory functions that Cython exposes through cpython.mem:
from cpython.mem cimport PyMem_Malloc, PyMem_Realloc, PyMem_Free
Large, long-lived blocks of memory can be managed with a try..finally block as in the example above. Another good approach is to tie the memory to the lifetime of a Python object: in the simple example below, the memory is allocated when the object is created and released when the object is deallocated.
from cpython.mem cimport PyMem_Malloc, PyMem_Realloc, PyMem_Free
cdef class SomeMemory:
    cdef double* data
    def __cinit__(self, size_t number):
        # allocate some memory (uninitialised, may contain arbitrary data)
        self.data = <double*> PyMem_Malloc(number * sizeof(double))
        if not self.data:
            raise MemoryError()
    def resize(self, size_t new_number):
        # Allocates new_number * sizeof(double) bytes,
        # preserving the current content and making a best-effort to
        # re-use the original data location.
        mem = <double*> PyMem_Realloc(self.data, new_number * sizeof(double))
        if not mem:
            raise MemoryError()
        # Only overwrite the pointer if the memory was really reallocated.
        # On error (mem is NULL), the original memory has not been freed.
        self.data = mem
    def __dealloc__(self):
        PyMem_Free(self.data)  # no-op if self.data is NULL
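When no raw C pointer is actually required, the standard-library array module gives a similar ownership model in plain Python: the object owns a contiguous buffer of C doubles and releases it when it is garbage-collected. The following is only a sketch for comparison, not a replacement for the Cython class above; `random_noise_py` is a hypothetical name.

```python
import random
from array import array


def random_noise_py(number=1):
    """Return `number` samples drawn from a standard normal distribution."""
    # array('d') stores C doubles contiguously; Python allocates the buffer
    # here and frees it automatically once `buf` is no longer referenced.
    buf = array('d', [0.0]) * number
    for i in range(number):
        buf[i] = random.normalvariate(0, 1)
    return list(buf)
```

No try..finally block is needed in this version because there is no manual free() to guarantee.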
7. Using dynamic arrays
An example that computes the edit distance between two words:
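from libc.stdlib cimport malloc, free
def calculate_edit_distance(word1, word2):
    cdef int len1 = len(word1)
    cdef int len2 = len(word2)
    cdef int i, j, delta, result
    # Allocate (len1 + 1) row pointers, then one row of (len2 + 1) ints each.
    # NULL checks are omitted here for brevity.
    cdef int** dp = <int**> malloc((len1 + 1) * sizeof(int*))
    for i in range(len1 + 1):
        dp[i] = <int*> malloc((len2 + 1) * sizeof(int))
    # Base cases: turning a prefix into the empty string costs its length.
    for i in range(len1 + 1):
        dp[i][0] = i
    for j in range(len2 + 1):
        dp[0][j] = j
    for i in range(1, len1 + 1):
        for j in range(1, len2 + 1):
            delta = 0 if word1[i - 1] == word2[j - 1] else 1
            # Cheapest of substitution, deletion, and insertion.
            dp[i][j] = min(dp[i - 1][j - 1] + delta, min(dp[i - 1][j] + 1, dp[i][j - 1] + 1))
    result = dp[len1][len2]
    # Free every row first, then the array of row pointers.
    for i in range(len1 + 1):
        free(dp[i])
    free(dp)
    return result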
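A pure-Python reference implementation of the same dynamic-programming recurrence is handy for cross-checking the Cython function on a few inputs. This is a minimal sketch; `edit_distance_py` is a hypothetical name, and nested lists take the place of the malloc-allocated rows.

```python
def edit_distance_py(word1, word2):
    """Levenshtein distance between word1 and word2."""
    len1, len2 = len(word1), len(word2)
    # dp[i][j] = distance between word1[:i] and word2[:j]
    dp = [[0] * (len2 + 1) for _ in range(len1 + 1)]
    for i in range(len1 + 1):
        dp[i][0] = i
    for j in range(len2 + 1):
        dp[0][j] = j
    for i in range(1, len1 + 1):
        for j in range(1, len2 + 1):
            delta = 0 if word1[i - 1] == word2[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j - 1] + delta,  # substitution or match
                           dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1)          # insertion
    return dp[len1][len2]
```

For any pair of words, `calculate_edit_distance(a, b)` from the compiled module should agree with `edit_distance_py(a, b)`.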
---------------------
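Author: koibiki
Source: CSDN
Original: https://blog.csdn.net/koibiki/article/details/83069468
Copyright notice: This is an original article by the blogger; please include a link to the original post when reposting.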