Content
The Rule of The Big Three Copy Constructor Assignment Operator Destructors and Polymorphism <- Go BackUniversity of Michigan at Ann Arbor
Last Edit Date: 03/31/2023
Disclaimer and Term of Use:
We do not guarantee the accuracy and completeness of the summary content. Some of the course material may not be included, and some of the content in the summary may not be correct. You should use this file properly and legally. We are not responsible for any results from using this file
This personal note is adapted from Professor Amir Kamil, Andrew DeOrio, James Juett, Sofia Saleem, and Saquib Razak. Please contact us to delete this file if you think your rights have been violated.
This work is licensed under a Creative Commons Attribution 4.0 International License.
We need a deep copy if an object owns and manages any resources (ex: dynamic memory). If a constructor creates dynamic memory, we probably need the big three. If some of the members are pointers, we might need the big three.
Destructor
Copy Constructor
Copy regular members from other
Deep copy resources from other
Assignment Operator
Check for self-assignment
Free old resources
Copy regular members from the right hand side
Deep copy resources from the right hand side
return *this
All objects have the big three. If we do not provide a custom version, the compiler will provide implicitly defined versions.
Implicit destructor
Implicit copy constructor
Implicit assignment operator
Shallow Copy
A constructor is always called when a class-type object is created (except for C-style ADTs when the members are initialized directly).
Copy initialization of a class-type object also invokes a constructor, specifically the copy constructor for the class. The following is an example of explicitly defining a copy constructor:
1 class Example {
2 public:
3 Example(int x_in, double y_in)
4 : x(x_in), y(y_in) {}
5
6 Example(const Example &other)
7 : x(other.x), y(other.y) {
8 cout << "Example copy ctor: " << x << ", " << y << endl;
9 }
10
11 int get_x() const {
12 return x;
13 }
14
15 double get_y() const {
16 return y;
17 }
18
19 private:
20 int x;
21 double y;
22 };
The second constructor above (lines 6 to 9) is the copy constructor, and it takes a reference to an existing object as the parameter.
The parameter must be passed by reference – otherwise, a copy will be done to initialize the parameter, which itself will invoke the copy constructor, which will call the copy constructor to initialize its parameter, and so on.
The following a test code:
1 int main() {
2 Example e1(2, -1.3);
3 Example e2 = e1; // Example copy ctor: 2, -1.3
4 Example e3(e1); // Example copy ctor: 2, -1.3
5 }
Try it out:
The program invokes the copy constructor to initialize both e2
and e3
from e1
.
The C++ compiler provides an implicit copy constructor if a user-defined one is not present. The implicit copy constructor does a member-by-member copy.
In our example, the implicit copy is the same as we defined, which is as following:
Example(const Example &other)
: x(other.x), y(other.y) {}
However, the implicit copy constructor can result in a shallow copy in some cases such as when we have members as pointers. Take the UnsortedIntSet
as an example:
1 #include <iostream>
2 using namespace std;
3
4 const int DEFAULT_CAPACITY = 2;
5
6 class UnsortedIntSet {
7 public:
8 UnsortedIntSet()
9 : elts(new int[DEFAULT_CAPACITY]),
10 capacity(DEFAULT_CAPACITY),
11 elts_size(0) {}
12
13 ~UnsortedIntSet() {
14 delete[] elts;
15 }
16
17 void insert(int v) {
18 if (contains(v)) {
19 return;
20 }
21 if (elts_size == capacity) {
22 grow();
23 }
24 elts[elts_size] = v;
25 ++elts_size;
26 }
27
28 void remove(int v) {
29 if (!contains(v)) {
30 return;
31 }
32 elts[indexOf(v)] = elts[elts_size - 1];
33 --elts_size;
34 };
35
36 bool contains(int v) const{
37 return indexOf(v) != -1;
38 }
39
40 int size() const{
41 return elts_size;
42 }
43
44 void print(ostream &os) const {
45 os << "{" << " ";
46 if (elts_size > 0){
47 os << elts[0];
48 }
49 for(int i = 1; i < elts_size; ++i){
50 os << ", " << elts[i];
51 }
52 os << " }" << endl;
53 }
54
55 private:
56 int *elts;
57 int elts_size;
58 int capacity;
59 void grow() {
60 int *new_arr = new int[2 * capacity];
61 for (int i = 0; i < elts_size; i++){
62 new_arr[i] = elts[i];
63 }
64 capacity *= 2;
65 delete[] elts;
66 elts = new_arr;
67 }
68
69 int indexOf(int v) const{
70 for(int i = 0; i < elts_size; ++i){
71 if(elts[i] == v){
72 return i;
73 }
74 }
75 return -1;
76 }
77 };
78
79 ostream &operator<<(ostream &os, const UnsortedIntSet &s) {
80 s.print(os);
81 return os;
82 }
Note: The C++ compiler will use the implicit copy constructor for us.
The following is the test code:
1 int main() {
2 UnsortedIntSet s1;
3 for (int i = 0; i < 3; ++i) {
4 s1.insert(i);
5 }
6 UnsortedIntSet s2(s1);
7 cout << s1 << endl; // prints { 0, 1, 2 }
8 cout << s2 << endl; // prints { 0, 1, 2 }
9 s2.insert(5); // causes a grow
10 cout << s2 << endl; // prints { 0, 1, 2 5 }
11 cout << s1 << endl; // UNDEFINED BEHAVIOR
12 }
Try it out:
Note that both s1
and s2
point to the same elts
. As you may notice, since we have s2.insert(5)
, which creates a new array and deletes the old one. However, s1
accessed the old, deleted array, resulting in undefined behavior. The following two pictures show the idea.
Deep Copy
The fundamental problem is that the implicitly defined member-by-member copy is a shallow copy: it copies the pointer to the array of elements, rather than following it and copying the array as a whole.
This violates the representation invariant that each set has its own array. Instead, we need a deep copy, where we make a copy of the underlying resource rather than having the two sets share the same resource.
To obtain a deep copy, we need to provide a custom implementation of the copy constructor:
1 UnsortedIntSet(const UnsortedIntSet &other)
2 : elts(new int[other.capacity]),
3 capacity(other.capacity),
4 elts_size(other.elts_size) {
5 for (int i = 0; i < elts_size; ++i) {
6 elts[i] = other.elts[i];
7 }
8 }
Let's use the same test code to see how it works.
Try it out:
Rather than copying the elts
pointer, we initialize the new set's member to point to the start of a dynamic array of the same capacity as other
's. The members that don't represent resources are just copied directly (capacity
and elts_size
).
The body of the constructor copies each element from the old set to the new one. The following picture shows the idea.
Shallow Copy
Assignment is another situation where an object is copied; unlike initialization, however, assignment copies into an existing object rather than a new one. The following is an example:
1 int main() {
2 Example e1(2, -1.3);
3 Example e2(3, 4.1);
4 cout << e2.get_x() << endl; // prints 3
5 e2 = e1;
6 cout << e2.get_x() << endl; // prints 2
7 }
Try it out:
Like most operators, the assignment operator can be overloaded; the overload must be a member function of the type of the left-hand side. Like the copy constructor, the compiler provides an implicit definition of the assignment operator if a user-defined one is not present. Like the implicitly defined copy constructor, the implicit assignment operator performs a member-by-member copy.
The following is a definition of the overloaded assignment operator that just does a member-by-member copy:
Example & Example::operator=(const Example &rhs) {
x = rhs.x;
y = rhs.y;
return *this;
}
The function takes in an Example
by reference to const, corresponding to the right-hand operand of the assignment. The return value will be the left-hand-side object itself, and it needs to be returned by reference rather than by value (the latter would make a copy rather than returning the object itself). Thus, the return type is Example &
.
The two members are individually copied from the right-hand side. The left-hand-side object must be returned; we need to dereference the this
pointer to get to the object to which it is pointing.
Deep Copy
The following code defines a custom assignment operator:
1 UnsortedIntSet &operator=(const UnsortedIntSet &rhs) {
2 if (this == &rhs) { // self-assignment check
3 return *this;
4 }
5
6 delete[] elts; // delete old array
7
8 elts = new int[rhs.capacity]; // make new array of the required size
9 capacity = rhs.capacity; // shallow copy non-resources
10 elts_size = rhs.elts_size;
11
12 for (int i = 0; i < elts_size; ++i) { // copy over the elements
13 elts[i] = rhs.elts[i];
14 }
15
16 return *this; // return LHS object
17 }
The this
pointer points to the left-hand-side operand, while the parameter rhs
is an alias for the right-hand-side operand. We need to obtain the address of rhs
to compare to the address stored in the this
pointer.
If the two addresses are the same, then the two operands are the same object, so we immediately return the left-hand-side object. (We cannot return rhs
because it is declared as a reference to const
, while the return type is not.)
The following is the test code:
1 int main() {
2 UnsortedIntSet s1;
3 for (int i = 0; i < 3; ++i) {
4 s1.insert(i);
5 }
6 UnsortedIntSet s2 = s1;
7 cout << s1 << endl; // prints { 0, 1, 2 }
8 cout << s2 << endl; // prints { 0, 1, 2 }
9 s2.insert(5); // causes a grow
10 cout << s2 << endl; // prints { 0, 1, 2 5 }
11 cout << s1 << endl; // UNDEFINED BEHAVIOR
12 }
Try it out:
If the object is of class type, its destructor is run. However, a subtle issue arises when the static and dynamic type of the object do not match: does the destructor use static or dynamic binding? Consider the following example:
1 class Base {
2 public:
3 virtual void add(int x) = 0;
4
5 ~Base() {
6 cout << "Base dtor" << endl;
7 }
8 };
9
10 class Derived : public Base {
11 public:
12 void add(int x) override {
13 items.push_back(x);
14 }
15
16 ~Derived() {
17 cout << "Derived dtor" << endl;
18 }
19
20 private:
21 vector<int> items;
22 };
The destructor is statically bound, so ~Base()
is invoked rather than ~Derived()
.
This is problematic: even though Derived
itself is not managing dynamic memory, its member variable items is. Since the destructor for Derived
was not called, the members introduced by Derived
were also not destructed.
The solution is to use dynamic binding for the destructor instead by declaring it as virtual:
1 class Base {
2 public:
3 virtual void add(int x) = 0;
4
5 virtual ~Base() {
6 cout << "Base dtor" << endl;
7 }
8 };
9
10 class Derived : public Base {
11 public:
12 void add(int x) override {
13 items.push_back(x);
14 }
15
16 ~Derived() { // virtual implicitly inherited
17 cout << "Derived dtor" << endl;
18 }
19 private:
20 vector<int> items;
21 };
Try it out: