Chapter 6: Deep vs. shallow copy
The unexpected copy behavior
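In Python, assigning one variable to another via B = A creates neither a “deep” nor a “shallow” copy. It only binds a second name to the same object, so a change made through one name is visible through the other:
A = [4, 12, 24]
B = A  # no copy: B is just a second name for the same list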
print(f"A and B before changing B (only): {(A,B)}")
A and B before changing B (only): ([4, 12, 24], [4, 12, 24])
B[1] = -7
print(f"A and B after changing B (only): {(A,B)}")
A and B after changing B (only): ([4, -7, 24], [4, -7, 24])
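One way to confirm that A and B refer to a single object is to compare their identities. The following is a minimal sketch: the variable names mirror the example above, while the use of the is operator and id() is an addition for illustration.

```python
A = [4, 12, 24]
B = A  # plain assignment: both names refer to one list object

# 'is' tests object identity, not just equality of contents
same_object = A is B
same_id = id(A) == id(B)

B[1] = -7  # modify the list through the name B
print(same_object, same_id, A)  # True True [4, -7, 24]
```

Because there is only one list, it does not matter which name is used to modify it.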
Deep vs. shallow copy
- A shallow copy constructs a new compound object and then (to the extent possible) inserts references into it to the objects found in the original.
- A deep copy constructs a new compound object and then, recursively, inserts copies into it of the objects found in the original.
The difference between shallow and deep copying is only relevant for compound objects (objects that contain other objects, like lists or class instances).
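The two definitions can be illustrated with a nested list. This is a minimal sketch using copy.copy and copy.deepcopy from the standard library. The names original, shallow and deep are chosen for illustration only.

```python
from copy import copy, deepcopy

original = [[1, 2], [3, 4]]      # a list that contains other lists

shallow = copy(original)         # new outer list, inner lists are shared
deep = deepcopy(original)        # new outer list and new inner lists

original[0][0] = 99              # mutate an inner list
original.append([5, 6])          # change the outer list itself

print(shallow)  # [[99, 2], [3, 4]]
print(deep)     # [[1, 2], [3, 4]]
```

The shallow copy does not see the appended element, because its outer list is new, but it does see the change to the shared inner list. The deep copy sees neither change.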
To perform a deep copy, use the deepcopy function from the copy module:
from copy import deepcopy
C = deepcopy(B)  # C is a new list, independent of B
print(f"B and C before changing B (only): {(B,C)}")
B[1] = 898
print(f"B and C after changing B (only): {(B,C)}")
B and C before changing B (only): ([4, -7, 24], [4, -7, 24])
B and C after changing B (only): ([4, 898, 24], [4, -7, 24])
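The same distinction applies to class instances. The sketch below assumes a hypothetical class Measurement, which is not part of this chapter's examples, and compares copy.copy with copy.deepcopy on an instance that holds a list.

```python
from copy import copy, deepcopy

class Measurement:  # hypothetical class, for illustration only
    def __init__(self, label, values):
        self.label = label
        self.values = values  # a mutable list stored on the instance

m = Measurement("run-1", [4, 12, 24])
m_shallow = copy(m)      # new instance, but .values is the same list
m_deep = deepcopy(m)     # new instance with its own .values list

m.values[1] = -7         # mutate the list held by the original

print(m_shallow.values)  # [4, -7, 24]
print(m_deep.values)     # [4, 12, 24]
```

Only the deep copy is unaffected when an object stored inside the original instance is modified.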
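Defining a new variable by slicing a subset of an existing list also creates a new list object, so changing B afterwards does not affect D. Strictly speaking, a slice is a shallow copy. For a flat list of numbers, as here, it behaves just like a deep copy:
D = B[0:2]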
print(f"D and B before changing B (only): {(D,B)}")
B[0] = 99
print(f"D and B after changing B (only): {(D,B)}")
D and B before changing B (only): ([4, 898], [4, 898, 24])
D and B after changing B (only): ([4, 898], [99, 898, 24])
E = B + A  # <-- concatenation creates a new list, so later changes to A or B do not affect E
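The slice and concatenation examples above use a flat list of integers. As a hedged illustration of what happens with nested lists, the sketch below shows that both operations copy only the outer list. The names nested, part, joined and safe are hypothetical.

```python
from copy import deepcopy

nested = [[1, 2], [3, 4], [5, 6]]

part = nested[0:2]            # slice: new outer list, shared inner lists
joined = nested + nested      # concatenation: also a new outer list
safe = deepcopy(nested[0:2])  # fully independent copy of the slice

nested[0][0] = 99             # mutate an inner list
nested[1] = [0, 0]            # rebind a slot in the outer list

print(part)                  # [[99, 2], [3, 4]]
print(joined[0], joined[1])  # [99, 2] [3, 4]
print(safe)                  # [[1, 2], [3, 4]]
```

Rebinding a slot of the outer list leaves part and joined untouched, while mutating a shared inner list shows up in both. If the elements themselves are mutable, deepcopy is the safer choice.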
Outlook
We will be confronted with the deep-copy problem again, e.g., in the NumPy chapter.