The following is true for the CPython implementation!
All data in Python are objects.
>>> type(1)
<type 'int'>
>>> type('foo')
<type 'str'>
>>> type([])
<type 'list'>
But there are two different kinds of objects. Objects whose value can change are called mutable, and objects whose value can’t change are called immutable. For example, an integer like 1 and a string like 'foo' are both immutable objects. So if you type a = 'foo', a is pointing to an immutable object. You can change what a points to by assigning a new object to a, but you can’t change the value of the object 'foo'. Confused? Quick example:
>>> a = 'foo'
>>> id(a)
3074084016L
>>> a[0]
'f'
>>> a[0] = 'z'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
>>> a = 'zoo'
>>> id(a)
3074084088L
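The same point as a small script, in case the session above went by too fast. This is only a sketch: it is written for Python 3 (the sessions here are Python 2), and the names original and error are made up for the illustration.

```python
a = 'foo'
original = a            # a second name for the same string object

try:
    a[0] = 'z'          # strings do not support item assignment
except TypeError as exc:
    error = str(exc)

# "Changing" the string really means building a new one and rebinding the name.
a = 'z' + a[1:]

print(error)
print(a)                # zoo
print(original)         # foo, the old object was never modified
print(a is original)    # False, a now points to a different object
```

The name a moved to a new object, while the old 'foo' object stayed exactly as it was.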
For this to make sense you must know how data is stored in memory. One approach could be:
a = String() # Make a new string and assign the string to variable
a.value = 'foo' # Set the string to value 'foo'
To store a new string, the system now tries to allocate space in memory. If there is free memory, we get an address and can assign the value of the string. Every string you create will have its own address in memory. The more strings you create, the more memory you will consume, regardless of the value of the string. Not necessarily so in Python.
>>> a = 'foo'
>>> b = 'foo'
>>> c = 'foo'
>>> d = 'foo'
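Here only the first string object consumes memory. CPython interns string literals that look like identifiers, so when the interpreter meets 'foo' again it reuses the existing object and the variable points to the address of that object. This is an implementation detail rather than a language guarantee: strings built at runtime, or literals containing spaces, may well end up as separate objects.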
Check it:
>>> a = 'foo'
>>> b = 'foo'
>>> c = 'foo'
>>> d = 'foo'
>>> id(a)
3074084112L
>>> id(b)
3074084112L
>>> id(c)
3074084112L
>>> id(d)
3074084112L
As you can see, a, b, c and d are pointing to the same address. So a, b, c and d are equal: not only equal as strings, but identical as objects. CPython only shares objects like this for some immutable values, such as short strings and small integers, and it is safe precisely because such objects can never change. Mutable objects work like the first approach: each new object gets its own address and consumes memory. A list is an example of a mutable object.
>>> a = []
>>> b = []
>>> c = []
>>> d = []
>>> id(a)
3074110316L
>>> id(b)
3074111916L
>>> id(c)
3074111884L
>>> id(d)
3074112012L
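If comparing id() values by eye feels clumsy, the difference between "equal" and "the same object" in the two sessions above can be checked with == and is. A minimal sketch for Python 3; the names parts, s1, s2, l1 and l2 are just for illustration, and sys.intern is used because sharing of runtime-built strings is not something to rely on otherwise.

```python
import sys

# Build two equal strings at runtime so the compiler cannot merge them as constants.
parts = ['fo', 'o']
s1 = ''.join(parts)
s2 = ''.join(parts)

print(s1 == s2)                          # True, same value
print(sys.intern(s1) is sys.intern(s2))  # True, interning hands back one shared object

# Two list displays always create two separate objects.
l1 = []
l2 = []
print(l1 == l2)   # True, both are empty lists
print(l1 is l2)   # False, different objects
```

Whether s1 is s2 holds without the explicit sys.intern call depends on the interpreter, so compare values with == and keep is for identity checks.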
a, b, c and d are pointing to different addresses. A lot of words, but where is the pitfall?
The issue is that, even if you know how mutable objects behave and that every new object gets its own address, you can still get into serious trouble.
What do you think will happen here?
>>> a = []
>>> b = a
1. Because the list object is mutable, the value of a will be copied to b. So we have two lists with the same values, but different addresses.
2. a is assigned to a new list object and b is pointing to the object of a, so a and b are the same object.
Maybe this helps.
>>> a = []
>>> b = a
>>> id(a)
3074112332L
>>> id(b)
3074112332L
So if your answer was 1, you failed! And you would be in great trouble if you had done something like this:
>>> a = [5,4,3,2,1]
>>> tmp = a
>>> for index, value in enumerate(a):
...     a[index] = value * 2
...
>>> a
[10, 8, 6, 4, 2]
>>> tmp
[10, 8, 6, 4, 2]
Oh sh*t, your tmp variable just changed, too?! If you want to preserve the values of a list and modify the original one, don’t do it like this. Force a new object instead!
>>> a = [5,4,3,2,1]
>>> tmp = list(a)
>>> for index, value in enumerate(a):
...     a[index] = value * 2
...
>>> a
[10, 8, 6, 4, 2]
>>> tmp
[5, 4, 3, 2, 1]
>>> id(a)
3074110700L
>>> id(tmp)
3074112460L
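list(a) is not the only way to force a new list object. As a rough sketch for Python 3, here are three spellings that should all give an independent top-level copy; the variable names are invented for the example, and list.copy() does not exist in the Python 2 used in the sessions above.

```python
a = [5, 4, 3, 2, 1]

by_constructor = list(a)   # the approach from the session above
by_slice = a[:]            # a full slice builds a new list
by_method = a.copy()       # list.copy(), available in Python 3.3+

# Modify the original in place, as before.
for index, value in enumerate(a):
    a[index] = value * 2

print(a)                # [10, 8, 6, 4, 2]
print(by_constructor)   # [5, 4, 3, 2, 1]
print(by_slice)         # [5, 4, 3, 2, 1]
print(by_method)        # [5, 4, 3, 2, 1]
```

All three copy only the outer list, which is enough here because the elements are immutable integers.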
UPDATE:
For nested lists the example above does not work: list(a) only makes a shallow copy, so the inner lists are still shared between both lists. For nested lists, use the copy module.
Thanks to teferi!
>>> a = [[1], 2]
>>> tmp = list(a)
>>> a[0].append(3)
>>> tmp
[[1, 3], 2]
To do this kind of copying correctly, you should use the copy module. copy.deepcopy copies the nested objects as well.
>>> import copy
>>> a = [[1], 2]
>>> tmp = copy.deepcopy(a)
>>> a[0].append(3)
>>> tmp
[[1], 2]
>>> a
[[1, 3], 2]
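To see the two kinds of copy side by side, here is a small sketch for Python 3 that compares copy.copy (shallow) with copy.deepcopy on the same nested list. The names shallow and deep are only for illustration.

```python
import copy

a = [[1], 2]
shallow = copy.copy(a)      # new outer list, inner list still shared with a
deep = copy.deepcopy(a)     # new outer list and a new inner list

a[0].append(3)

print(a)                    # [[1, 3], 2]
print(shallow)              # [[1, 3], 2]
print(deep)                 # [[1], 2]
print(shallow[0] is a[0])   # True, the inner list is the same object
print(deep[0] is a[0])      # False, the inner list was copied too
```

A shallow copy is cheaper, so deepcopy is mostly worth it when the nested objects are mutable and you actually plan to change them.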