Memory management is an integral part of working with computers. Python handles nearly all of its memory management behind the scenes, for better or for worse.
In this article, you will learn:
What memory management is and why it’s important
How the default Python implementation, CPython, is written in the C programming language
How the data structures and algorithms work together in CPython’s memory management to handle your data
Python abstracts away a lot of the gritty details of working with computers. This gives you the power to work on a higher level to develop your code without the headache of worrying about how and where all those bytes are getting stored.
You can begin by thinking of a computer’s memory as an empty book intended for short stories. There’s nothing written on the pages yet. Eventually, different authors will come along. Each author wants some space to write their story in.
Since they aren’t allowed to write over each other, they must be careful about which pages they write in. Before they begin writing, they consult the manager of the book. The manager then decides where in the book they’re allowed to write.
Since this book is around for a long time, many of the stories in it are no longer relevant. When no one reads or references the stories, they are removed to make room for new stories.
In essence, computer memory is like that empty book. In fact, it’s common to call fixed-length contiguous blocks of memory pages, so this analogy holds pretty well.
Memory management is the process by which applications read and write data. A memory manager determines where to put an application’s data. Since there’s a finite chunk of memory, like the pages in our book analogy, the manager has to find some free space and provide it to the application. This process of providing memory is generally called memory allocation.
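To make allocation a little more concrete, here is a minimal sketch using the standard library's tracemalloc module, which reports how much memory Python has allocated since tracing began. The variable names and the sizes (1,000 blocks of 1,000 bytes each) are arbitrary choices for illustration, and the exact byte counts will vary by Python version and platform.

```python
import tracemalloc

tracemalloc.start()
before = tracemalloc.get_traced_memory()[0]   # bytes currently traced

# Ask the memory manager for roughly 1 MB: 1,000 separate 1,000-byte objects.
blocks = [bytes(1000) for _ in range(1000)]
after_alloc = tracemalloc.get_traced_memory()[0]

# Drop the only reference so the memory can be handed back.
del blocks
after_free = tracemalloc.get_traced_memory()[0]
tracemalloc.stop()

print("allocated:", after_alloc - before, "bytes")
print("still held after del:", after_free - before, "bytes")
```

The first number should be a little over one million, since each object carries a small header in addition to its payload.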
Reference counting works by counting the number of times an object is referenced by other objects in the system. When references to an object are removed, the reference count for an object is decremented. When the reference count becomes zero, the object is deallocated.
Example:
x = 10
Now assign x to a second name and compare the identities of the two names:
x = 10
y = x
if id(x) == id(y):
    print("x and y refer to the same object")
Now, let’s change the value of x and observe the outcome.
x = 10
y = x
x += 1
if id(x) != id(y):
    print("x and y do not refer to the same object")
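The reference counting described earlier can be observed directly in CPython with sys.getrefcount. The sketch below is illustrative only: the names data, alias and holder are made up for the example, and the absolute counts depend on the interpreter version, so only the differences between them are meaningful. Note that getrefcount itself holds a temporary reference to its argument while it runs.

```python
import sys

data = []                           # a fresh list, referenced by the name `data`
base = sys.getrefcount(data)        # includes the temporary reference held by the call

alias = data                        # a second name for the same object
with_alias = sys.getrefcount(data)

holder = [data]                     # a container also counts as a reference
with_holder = sys.getrefcount(data)

del alias                           # removing references decrements the count
holder.clear()
after_cleanup = sys.getrefcount(data)

print("extra references from alias:", with_alias - base)
print("extra references from alias and holder:", with_holder - base)
print("extra references after cleanup:", after_cleanup - base)
```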
Garbage collection is the process by which the interpreter frees memory that is no longer in use, making it available for other objects.
Consider an object in memory that no reference points to any more, which means it is no longer in use. The virtual machine has a garbage collector that automatically deletes such an object from heap memory.
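Here is a small sketch of both situations in CPython, using the standard gc and weakref modules. The Story class is hypothetical and exists only for this illustration. The timing shown is a CPython implementation detail, and other Python implementations may reclaim objects later.

```python
import gc
import weakref

class Story:
    """Hypothetical class used only for this illustration."""
    def __init__(self, title):
        self.title = title
        self.sequel = None

# Case 1: a plain object is freed as soon as its last reference goes away.
story = Story("solo")
probe = weakref.ref(story)        # a weak reference does not keep the object alive
del story
freed_immediately = probe() is None

# Case 2: two objects that refer to each other never reach a count of zero.
gc.disable()                      # pause automatic collection so the effect is visible
first, second = Story("part 1"), Story("part 2")
first.sequel, second.sequel = second, first
cycle_probe = weakref.ref(first)
del first, second
alive_before_collect = cycle_probe() is not None

gc.collect()                      # run the cycle detector by hand
alive_after_collect = cycle_probe() is not None
gc.enable()

print("plain object freed immediately:", freed_immediately)
print("cycle alive before collect:", alive_before_collect)
print("cycle alive after collect:", alive_after_collect)
```

Reference counting alone handles the first case. The garbage collector exists mainly for the second, where objects keep each other alive even though the program can no longer reach them.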
There are two types of memory:
stack memory
heap memory
Stack memory is a type of memory storage that operates based on the LIFO (last-in-first-out) principle. It's used by the program to store temporary data such as function calls, method parameters, and local variables. The main advantage of stack memory is that it's fast and efficient because data is stored and retrieved in a predictable order.
Heap memory is a type of memory storage that's used to store objects and data that are dynamically allocated at runtime. The heap memory is shared by all parts of a program and can grow and shrink dynamically based on the program's needs. Unlike stack memory, the heap has no fixed size, and there's no guarantee on the order in which memory is allocated or freed. The main advantage of heap memory is that it allows for dynamic memory allocation, which makes it easier to manage memory usage in complex programs.
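As a rough illustration of the difference, the sketch below creates a list inside a function. The local name disappears when the call ends, but the list itself lives on the heap and survives for as long as something still refers to it. The function name write_story is hypothetical.

```python
def write_story():
    # `pages` is a local name that exists only while this call is active.
    pages = ["once", "upon", "a", "time"]
    return pages, id(pages)

story, id_inside = write_story()

# The local name `pages` is gone, but the object it referred to is not.
same_object = id(story) == id_inside
print("same object after the call returned:", same_object)
print(story)
```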
The following infographics show how Python memory management works under the hood.
Take a look at this code:
# create variables (int)
a = 10
b = 20
# integer representation
print('memory address for int (int)')
print(id(a))
print(id(b))
print()
# hex representation
m1 = hex(id(a))
m2 = hex(id(b))
print('memory address for int (hex)')
print(m1)
print(m2)
print()
# create variables (float)
c = 30.0
d = 40.1
print('memory address for float (int)')
print(id(c))
print(id(d))
print()
print('increment (b-a):', id(b)-id(a))
print('increment (d-c):', id(d)-id(c))
And the results:
memory address for int (int)
140729757787208
140729757787528
memory address for int (hex)
0x7ffe3338e448
0x7ffe3338e588
memory address for float (int)
1606861632816
1606861624432
increment (b-a): 320
increment (d-c): -8384
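One likely reason the two integers sit at nearby, evenly spaced addresses is that CPython creates small integer objects ahead of time and reuses them (commonly the range -5 to 256), while floats are created on demand. This is a CPython implementation detail rather than a language guarantee, so treat the sketch below as an observation, not a rule to rely on. The int("...") calls are there only to stop the compiler from merging identical literals.

```python
# Small integers: CPython hands back the same cached object each time.
small_a = int("10")
small_b = int("10")
shares_small = small_a is small_b

# Larger integers: each call builds a separate object with an equal value.
big_a = int("100000")
big_b = int("100000")
shares_big = big_a is big_b

print("small ints share one object:", shares_small)
print("large ints share one object:", shares_big)
print("large ints are still equal:", big_a == big_b)
```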
Memory management in Python differs from that of many other languages.
Python is dynamically typed
Python stores objects (data) on the heap
A variable is a reference to an object
When a variable is assigned a new value, it points to a different object; immutable objects such as integers are never changed in place
Several variables can point to the same object
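The points above can be seen in a few lines. This is a minimal sketch with made-up names, showing two variables sharing one mutable object and then one of them being rebound to an object of a different type.

```python
first = [1, 2]
second = first                 # both names now refer to the same list object
second.append(3)               # a change made through one name...

shared = first is second
first_contents = list(first)   # ...is visible through the other

# Dynamic typing: a name can be rebound to an object of any type.
first = "now a string"

print("shared one object:", shared)
print("contents seen through first:", first_contents)
print("second is unaffected by rebinding first:", second)
```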
The process looks like the table below.
Methods and variables are created in stack memory.
Objects and instance variables are created in heap memory.
A new stack frame is created on invocation of a function or method.
Stack frames are destroyed as soon as the function or method returns.
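The creation and destruction of stack frames can be observed with the standard inspect module. The sketch below counts the frames on the call stack before, during and after a chain of nested calls. The helper names frame_depth and nested are hypothetical, and inspect.currentframe relies on interpreter support that CPython provides but other implementations may not.

```python
import inspect

def frame_depth():
    """Count the frames currently on the call stack, including this one."""
    frame = inspect.currentframe()
    depth = 0
    while frame is not None:
        depth += 1
        frame = frame.f_back
    return depth

def nested(levels):
    # Each call adds one frame; the innermost call measures the stack.
    if levels == 0:
        return frame_depth()
    return nested(levels - 1)

base = frame_depth()     # depth when called directly
deep = nested(3)         # depth measured four calls further down
after = frame_depth()    # the extra frames are gone once the calls return

print("extra frames during nested calls:", deep - base)
print("extra frames after returning:", after - base)
```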
Here are some useful resources.