
Python Memory Management

2015-10-12 14:34
Just because some high-level languages handle memory automatically doesn't mean that memory leaks are impossible. A leak in C or C++ is created when you allocate memory using malloc or new and never deallocate it. In Python or Java, a memory leak is created when you accidentally leave a reference to some massive data structure hanging around, so the entire thing cannot be freed. The solution is to find this reference and get rid of it.
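To make the Python-style leak concrete, here is a minimal sketch (the cache dictionary and process function are invented for illustration): a module-level dictionary quietly keeps a huge list alive long after its work is done.

```python
cache = {}

def process(key):
    # Build a large temporary structure.
    big = [0] * 1_000_000
    cache[key] = big   # lingering reference: the list can never be freed
    return sum(big)

process("job-1")
# Even though process() has returned, the million-element list is still
# reachable through cache, so its memory stays allocated.
# The fix is to drop the reference:
del cache["job-1"]
```

Nothing here is a bug in the traditional sense; every object is still reachable, which is exactly why this kind of leak is hard to spot.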

I began looking for my memory leak by creating a minimal test that ran a big search. Next, I added "del reference" statements one variable at a time, trying to find which one was holding on to memory. I deleted all my variables and still my program's size did not change. Next, I added calls to force Python's cyclic garbage collector to run, in case I had circular references that were keeping the objects from being deleted. Python still consumed a gigabyte of memory. Finally, I turned to Google to see if anyone else had had a similar problem. I turned up a mailing list thread about Python never returning memory to the operating system. It turns out that this is a flaw in the Python interpreter. People work around it by spawning multiple processes or using data structures on disk. Then I was curious: why would Python choose to never free memory? Many hours later, I now know far more than I ever wanted to about how Python deals with memory.
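The cyclic-collector step above can be reproduced with a small sketch (the Node class is invented for illustration): two objects that reference each other are never freed by reference counting alone, but gc.collect() reclaims them.

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.other = None

# Build a reference cycle: each object keeps the other's refcount above zero.
a, b = Node(), Node()
a.other, b.other = b, a

probe = weakref.ref(a)   # lets us observe when the object is actually destroyed
del a, b                 # refcounts never reach zero because of the cycle

gc.collect()             # the cyclic collector detects and frees the cycle
print(probe() is None)   # True: the objects are gone
```

Removing the del statement for either name would keep the cycle reachable, and no amount of collecting would free it, which is the situation I was hunting for.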

Python does a lot of allocations and deallocations. All objects, including "simple" types like integers and floats, are stored on the heap. Calling malloc and free for each variable would be very slow. Hence, the Python interpreter uses a variety of optimized memory allocation schemes. The most important one is a malloc implementation called pymalloc, designed specifically to handle large numbers of small allocations. Any object that is smaller than 256 bytes uses this allocator, while anything larger uses the system's malloc. This implementation never returns memory to the operating system. Instead, it holds on to it in case it is needed again. This is efficient when the memory is reused within a short time, but wasteful if a long time passes before it is needed.
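The size threshold is easy to see with sys.getsizeof. A sketch, assuming the 256-byte cutoff described above (modern CPython raised it to 512 bytes):

```python
import sys

# Per the text, objects of 256 bytes or less are served from pymalloc's
# pools; larger allocations fall through to the system's malloc.
small = 1.5          # a float object is only a few dozen bytes
large = [0] * 1000   # the list's backing array is several kilobytes

print(sys.getsizeof(small))   # well under the threshold: pymalloc territory
print(sys.getsizeof(large))   # far over it: handed to the system allocator
```

Which allocator actually serves a given request is an interpreter internal, so this only shows the sizes involved, not the allocator itself.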


So how can Python's memory usage sometimes decrease if it never releases small objects? Common Python objects, such as integers, floats, and lists, maintain their own private memory pools, which are not shared with any other type. Integers and floats use a policy similar to pymalloc's: they never get released. Lists, on the other hand, are freed. Python keeps only a small, fixed number of unused list objects (currently 80), which limits the wasted memory. Most importantly, the actual array backing the list is freed immediately, which is a significant amount of memory for big lists.
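CPython's per-type caching of integers can be observed directly. A sketch using the small-integer cache, a closely related mechanism: the values -5 through 256 are preallocated once and shared, while larger values get a fresh object per allocation (an implementation detail of CPython, not a language guarantee).

```python
# CPython preallocates the integers -5 through 256 and hands out the
# same object every time one of those values is produced.
a = int("256")
b = int("256")
print(a is b)    # True: both names refer to the single cached object

# Values outside the cached range are allocated separately each time.
c = int("257")
d = int("257")
print(c is d)    # False: two distinct objects with equal values
```

The int("...") calls avoid literal constant folding, which would otherwise make the second comparison misleading in a single script.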

There are some serious inefficiencies in the interpreter's memory allocation policies. For example, applications like mine always hold on to the peak amount of memory, even though the average memory consumed is much lower. This can be an issue for long-running servers, such as Twisted, Zope, or Plone servers, or applications where memory is critical. For example, Plone recommends using a caching server with lots of RAM, so having Python release memory that is not currently being used would be helpful in that configuration. Another problem is that the private memory pools are not shared between types. For example, if you allocate and then free a large number of integers, the cached available memory cannot be used to allocate floats.
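The gap between what the interpreter has freed and what the process still holds can be observed with the standard tracemalloc module (added to Python long after this article was written; a sketch): Python-level accounting drops as soon as objects are deleted, even when the OS-level footprint stays at its peak.

```python
import tracemalloc

tracemalloc.start()

# Allocate a pile of small objects, then record the traced size.
data = [list(range(100)) for _ in range(1_000)]
current_before, _ = tracemalloc.get_traced_memory()

del data
current_after, peak = tracemalloc.get_traced_memory()

# The interpreter's own accounting shows the memory as freed, even though
# the process may still hold its peak allocation from the OS's point of view.
print(current_after < current_before)  # True
print(peak >= current_before)          # True

tracemalloc.stop()
```

This is why tools that read the process size from the operating system report a flat line while Python-level profilers show memory being released.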
I am attempting to rewrite Python's memory allocator, to try and make Python use memory more efficiently. I'll keep you posted on how it goes.

Original article: http://www.evanjones.ca/python-memory.html