Python Pointers - revealed

Sushant Pupneja
9 min read · Feb 28, 2021


What are Pointers?

If you’ve ever worked with lower level languages like C or C++, then you’ve probably heard of pointers. Pointers allow you to create great efficiency in parts of your code. They also cause confusion for beginners and can lead to various memory management bugs, even for experts.

Pointers are widely used in C and C++. Essentially, they are variables that hold the memory address of another variable. Let's understand pointers with a simple real-world analogy.

Suppose your office building needs painting, so you look online for a company to do the job (custom_function). The company's initial requirement is that you ship the building to them; once the painting is done, they will ship it back. Shipping an entire office building back and forth is prohibitively expensive, so this arrangement makes no sense. Instead, you write the address of the building (the pointer) on an envelope and send it to the paint company. The company goes to that address and does the work in place, so there is nothing to ship back to you.

Why Doesn't Python Have Pointers?

Pointers seem to go against the Zen of Python. Pointers encourage implicit changes rather than explicit. Often, they are complex instead of simple, especially for beginners. Even worse, they beg for ways to shoot yourself in the foot, or do something really dangerous like read from a section of memory you were not supposed to.

Python tends to try to abstract away implementation details like memory addresses from its users.

Python often focuses on usability instead of speed. As a result, pointers in Python don’t really make sense.

Understanding pointers in Python requires a short detour into Python’s implementation details. Specifically, you’ll need to understand:

  1. Immutable vs mutable objects
  2. Python variables/names

Hold onto your memory addresses, and let’s get started.

Objects in Python

In Python, everything is an object. For proof, you can explore using isinstance():
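A short check along these lines shows the idea. The sample values and the function name foo are arbitrary choices for illustration:

```python
# Integers, lists, booleans, and even functions are all instances of object.
def foo():
    pass

checks = {
    "int": isinstance(1, object),
    "list": isinstance(list(), object),
    "bool": isinstance(True, object),
    "function": isinstance(foo, object),
}
print(checks)
```

Every entry in checks comes out as True, because every Python value is an instance of object.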

Each object contains at least three pieces of data:

  • Reference count
  • Type
  • Value

The reference count is used for memory management, and this is where most of the magic happens. The type is used at the CPython layer to ensure type safety at runtime, whereas the value is the actual value associated with the object.

Not all objects are the same, though. Understanding the difference between the types of objects really helps clarify the first layer of the onion that is pointers in Python.

Immutable vs Mutable Objects

In Python, there are two types of objects:

  1. Immutable objects can’t be changed. (Integer, Float, Complex, String, Boolean, Tuple, FrozenSets).
  2. Mutable objects can be changed. (Lists, Dictionaries, Sets).
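As a minimal sketch of what immutability means for an int, the snippet below keeps a second name (old, chosen here only for illustration) bound to the original object so it can be inspected after the += operation:

```python
x = 5
old = x              # keep a second name bound to the original object
x += 1               # builds a new int object and rebinds x to it

print(x, old)            # 6 5
print(x is old)          # False: x now names a different object
print(id(x) == id(old))  # False: the two objects have different ids
```

The object that held 5 was never modified; x was simply pointed at a different object.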

Even though the above code appears to modify the value of x, you’re getting a new object as a response.

The str type is also immutable:
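A small sketch of the same experiment with a string follows. The string values and the name original are assumptions made for the example. A second reference is kept on purpose, since comparing id() values of an object that has already been freed can be misleading:

```python
s = "real_python"
original = s          # second reference to the original string object
s += "_rocks"         # creates a new str; the original is left untouched

print(original)       # real_python
print(s)              # real_python_rocks
print(s is original)  # False

try:
    s[0] = "R"        # item assignment is not supported on str
except TypeError as exc:
    print("TypeError:", exc)
```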

Again, s ends up with a different memory address after the += operation.

Contrast that with a mutable object, like list:
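The list case can be sketched as follows, with my_list as an illustrative name:

```python
my_list = [1, 2, 3]
id_before = id(my_list)

my_list.append(4)     # mutates the existing list object in place
id_after = id(my_list)

print(my_list)                # [1, 2, 3, 4]
print(id_before == id_after)  # True: still the same object
```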

This code shows a major difference between the two types of objects. The list has an id when it is created. Even after 4 is appended, the list has the same id. This is because the list type is mutable.

With mutable and immutable objects out of the way, the next step on your journey to Python enlightenment is understanding Python’s variable ecosystem.

Understanding Variables

Python variables are fundamentally different than variables in C or C++. In fact, Python doesn’t even have variables.

Python has names, not variables.

Understanding this is important especially when you’re navigating the tricky subject of pointers in Python.

Variables in C

Let’s say you had the following code that defines the variable x:

int x = 2337;

This one line of code has several, distinct steps when executed:

  1. Allocate enough memory for an integer
  2. Assign the value 2337 to that memory location
  3. Indicate that x points to that value

In a simplified view of memory, x occupies one fixed location that holds the value 2337. If you later want to change the value of x, you can do the following:

x = 2338;

The above code assigns a new value (2338) to the variable x, thereby overwriting the previous value. This means that the variable x is mutable. In the updated memory layout, the same location simply holds the new value.

Notice that the location of x didn’t change, just the value itself. This is a significant point. It means that x is the memory location, not just a name for it.

If you then declare a second variable with int y = x;, a separate block of memory is allocated for y and the value of x is copied into it.

The memory layout now contains two independent locations, x and y, that happen to hold the same value.
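The C behavior described above can be approximated from Python with the ctypes module, which this article returns to in the "Real Pointers With ctypes" section. This is only a rough sketch of the C semantics, not C itself; ctypes.c_int stands in for a C int variable:

```python
import ctypes

x = ctypes.c_int(2337)            # a C int living at a fixed address
addr_before = ctypes.addressof(x)
x.value = 2338                    # overwrite the value in place
addr_after = ctypes.addressof(x)
print(addr_before == addr_after)  # True: same location, new value

y = ctypes.c_int(x.value)         # like `int y = x;`: copy into new storage
y.value = 2339                    # changing y does not affect x
print(x.value, y.value)           # 2338 2339
print(ctypes.addressof(x) == ctypes.addressof(y))  # False
```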

This is in stark contrast with how Python names work.

Names in Python:

Let’s take the equivalent code from the above C example and write it in Python:

>>> x = 2337

Much like in C, the above code is broken down into several distinct steps during execution:

  1. Create a PyObject
  2. Set the typecode to integer for the PyObject
  3. Set the value to 2337 for the PyObject
  4. Create a name called x
  5. Point x to the new PyObject
  6. Increase the refcount of the PyObject by 1
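Some of these steps can be observed from Python itself. The sketch below is CPython-specific: sys.getrefcount() reports an implementation detail, and its absolute value includes temporary references, so only the difference is meaningful. A plain object() is used for the refcount part because integer constants may be shared behind the scenes:

```python
import sys

x = 2337
print(type(x))                 # <class 'int'>
print(isinstance(x, object))   # True

obj = object()                 # a fresh object nobody else refers to
before = sys.getrefcount(obj)
alias = obj                    # bind a second name to the same object
after = sys.getrefcount(obj)
print(after - before)          # expected to be 1 on CPython
```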

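In memory, it might look something like this: the name x refers to a separate PyObject that stores the type, the value 2337, and the reference count.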

You can see that the memory layout is vastly different than the C layout from before. Instead of x owning the block of memory where the value 2337 resides, the newly created Python object owns the memory where 2337 lives. The Python name x doesn’t directly own any memory address in the way the C variable x owned a static slot in memory.

If you were to try to assign a new value to x, you could try the following:

>>> x = 2338

What’s happening here is different than the C equivalent, but not too different from the original bind in Python.

This code:

  • Creates a new PyObject
  • Sets the typecode to integer for the PyObject
  • Sets the value to 2338 for the PyObject
  • Points x to the new PyObject
  • Increases the refcount of the new PyObject by 1
  • Decreases the refcount of the old PyObject by 1
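A minimal sketch of name binding and rebinding follows. The second name y is added here for illustration so that the first object stays reachable:

```python
x = 2337
y = x            # no copy is made: y is bound to the same object as x
print(y is x)    # True

x = 2338         # rebinds x only; the object holding 2337 is untouched
print(x, y)      # 2338 2337
print(y is x)    # False
```

Unlike int y = x; in C, binding y = x in Python does not allocate new storage for a copy. Both names simply refer to one object until one of them is rebound.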

In memory, the name x now refers to the new PyObject that holds 2338.

This illustrates that x points to a reference to an object and doesn’t own the memory space as before. It also shows that the x = 2338 command is not an assignment in the C sense, but rather a binding of the name x to a reference.

In addition, the previous object (which held the 2337 value) is now sitting in memory with a ref count of 0 and will get cleaned up. In CPython, an object is normally reclaimed as soon as its reference count drops to 0.

Interned Objects in Python

Now that you understand how Python objects get created and names get bound to those objects, it's time to throw a wrench in the machinery. That wrench goes by the name of interned objects.

Suppose you have the following Python code:
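The steps listed below correspond to binding x to 300 and y to the result of 200 + 100. In a script, CPython may compute 200 + 100 at compile time and share the resulting constant, so this sketch adds two names (a and b, introduced only for the example) to force the addition to happen at run time. The outcome of the is check is an implementation detail:

```python
x = 300
a, b = 200, 100
y = a + b          # computed at run time, producing a separate int object

print(x == y)      # True: the values are equal
print(x is y)      # typically False on CPython: two distinct objects
```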

When x = 300 and y = 200 + 100 are entered as separate lines in the REPL, the line x is y returns False. If this is confusing, then don’t worry. Here are the steps that occur when this code is executed:

  1. Create Python object (300)
  2. Assign the name x to that object
  3. Create Python object (200)
  4. Create Python object (100)
  5. Add these two objects together
  6. Create a new Python object (300)
  7. Assign the name y to that object

Creating intermediate objects only to throw them away is somewhat wasteful. Still, you never have to worry about cleaning up these intermediate objects or even need to know that they exist! The joy is that these operations are relatively fast, and you never had to know any of those details until now.

The core Python developers, in their wisdom, also noticed this waste and decided to make a few optimizations. These optimizations result in behavior that can be surprising to newcomers:
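A sketch of the surprising case, using a small value that falls inside CPython's cached integer range. The names a and b are again only there to make the addition happen at run time:

```python
x = 20
a, b = 19, 1
y = a + b          # computed at run time, yet no new object is needed

print(x == y)      # True
print(x is y)      # True on CPython: 20 is a pre-created, shared object
```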

In this example, you see nearly the same code as before (for instance, x = 20 and y = 19 + 1), except this time the result is True. This is the result of interned objects. Python pre-creates a certain subset of objects in memory and keeps them in the global namespace for everyday use.

Which objects are interned depends on the implementation of Python. CPython 3.7 interns the following:

  1. Integer numbers between -5 and 256
  2. Strings that contain ASCII letters, digits, or underscores only
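Automatic string interning is an implementation detail, but sys.intern() lets you request it explicitly. The article does not use sys.intern(); it is shown here only as a hedged illustration, with strings built at run time so that they start out as separate objects:

```python
import sys

a = "".join(["real", "_", "python"])   # built at run time
b = "".join(["real", "_", "python"])   # an equal string, built separately

print(a == b)                          # True: same characters
print(sys.intern(a) is sys.intern(b))  # True: interning yields one shared object
```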

Simulating Pointers in Python

Just because pointers in Python don’t exist natively doesn’t mean you can’t get the benefits of using pointers. In fact, there are multiple ways to simulate pointers in Python. You’ll learn two in this section:

  1. Using mutable types as pointers
  2. Using custom Python objects

Using Mutable Types as Pointers

You’ve already learned about mutable types. Because these objects are mutable, you can treat them as if they were pointers to simulate pointer behavior:
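A minimal sketch of this idea is shown here. The function name increase_count comes from the discussion below, while count and frozen are illustrative names:

```python
def increase_count(x):
    """Increment the first element of a one-item container in place."""
    x[0] += 1

count = [2337]
increase_count(count)     # the caller's list is modified, nothing is returned
print(count[0])           # 2338

frozen = (2337,)
try:
    increase_count(frozen)    # tuples are immutable, so this fails
except TypeError as exc:
    print("TypeError:", exc)
```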

Here, increase_count(x) accesses the first element and increments its value by one. Using a list means that the end result appears to have modified the value. So pointers in Python do exist? Well, no. This is only possible because list is a mutable type.

Keep in mind, this only simulates pointer behavior and does not directly map to true pointers in C or C++. That is to say, these operations are more expensive than they would be in C or C++.

Let’s say you had an application where you wanted to keep track of every time an interesting event happened. One way to achieve this would be to create a dict and use one of the items as a counter.

>>> counters = {"func_calls": 0}
>>> def bar():
...     counters["func_calls"] += 1
...
>>> def foo():
...     counters["func_calls"] += 1
...     bar()
...
>>> foo()
>>> counters["func_calls"]
2

Using Python Objects

The dict option is a great way to emulate pointers in Python, but sometimes it gets tedious to remember the key name you used. This is especially true if you’re using the dictionary in various parts of your application. This is where a custom Python class can really help.

To build on the last example, assume that you want to track metrics in your application. Creating a class is a great way to abstract the pesky details:
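One possible shape for such a class is sketched below. The class name Metrics and the private _metrics dictionary are assumptions made for this example; func_calls and inc_func_calls() are the names used in the discussion that follows:

```python
class Metrics:
    """Wraps the counter dictionary so callers never touch key names."""

    def __init__(self):
        self._metrics = {"func_calls": 0}

    @property
    def func_calls(self):
        # Read-only view of the counter
        return self._metrics["func_calls"]

    def inc_func_calls(self):
        self._metrics["func_calls"] += 1


metrics = Metrics()

def bar():
    metrics.inc_func_calls()

def foo():
    metrics.inc_func_calls()
    bar()

foo()
print(metrics.func_calls)   # 2
```

Exposing func_calls as a property keeps it readable from anywhere while making the increment method the only intended way to change it.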

Here, you can access func_calls and call inc_func_calls() in various places in your applications and simulate pointers in Python. This is useful when you have something like metrics that need to be used and updated frequently in various parts of your applications.

Real Pointers With ctypes

Okay, so maybe there are pointers in Python, specifically CPython. Using the standard library's ctypes module, you can create real C-style pointers in Python. If you are unfamiliar with ctypes, then you can take a look at Extending Python With C Libraries and the “ctypes” Module.

The main reason to use this is to call a function in a C library that requires a pointer. Read more in Unlocking real pointers with the ctypes module.
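As a small, self-contained sketch that does not depend on any particular C library, ctypes.pointer() can create a pointer to a C integer and write through it:

```python
import ctypes

x = ctypes.c_int(2337)      # a C int managed by ctypes
ptr = ctypes.pointer(x)     # a real pointer to x's storage

ptr.contents.value += 1     # write through the pointer
print(x.value)              # 2338: x itself was changed
print(ptr[0])               # 2338: reading back through the pointer
```

When calling an actual C function that expects a pointer argument, ctypes.byref(x) is a lighter-weight way to pass the address of x.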

Conclusion

You now have a better understanding of the intersection between Python objects and pointers. Even though some of the distinctions between names and variables seem pedantic, fundamentally understanding these key terms expands your understanding of how Python handles variables.

You’ve also learned some excellent ways to simulate pointers in Python:

  • Utilizing mutable objects as low-overhead pointers
  • Creating custom Python objects for ease of use

These methods allow you to simulate pointers in Python without sacrificing the memory safety that Python provides.

Thanks for reading. If you still have questions, feel free to reach out either in the comments section or by email.
