Contributors: Edward Zhang
Let's talk about copy constructors! If you're coming from Java, this might be one of the biggest changes
C++ has the notion of a copy constructor, abbreviated cctor
The cctor is a special type of constructor that creates a new object as a copy of some existing object
It's important to remember that in C/C++, almost everything is passed by value. This means that for better or worse, the cctor is quite frequently invoked, sometimes when you might not expect it.
For the rest of this chapter, let's use the following Node class as an example:
class Node{
public:
    int val;
    Node();
    Node(int val);
    Node(const Node& other); // This is the copy constructor
};
- The class has one public data member, int val
General syntax for copy constructor:
className(const className& other){
    // make a copy of each data member
}
We can see that the first parameter of a cctor is a REFERENCE to some existing object of the same type
- ... why must it be a reference? We'll come back to this near the end of the chapter
Here is what one implementation of the cctor could look like:
Node::Node(const Node& other){
    val = other.val;
}
The code is very straightforward! As its name implies, we simply construct a new object by making copies (whatever that means to you) of some existing object's data members
The next few sections will examine when the copy constructor is invoked. There are 3 main cases
The cctor is invoked when you try to initialize an object from another object of the same type
There are 2 ways to do this: direct initialization or copy initialization
Node a(333);
Node b(a); // direct initialization
Node c = a; // copy initialization
- The last line (Node c = a) may have surprised you. In Java, this would mean setting variable c to refer to existing Node object a
- In C++ however, this invokes the copy constructor. We'll see an example shortly that should clear up any confusion
Direct initialization vs copy initialization
Direct initialization: we ask the compiler to use function matching to figure out which constructor best matches the provided argument
For example:
// Compiler will look to see which constructor takes a single
// int argument -> Node(int val)
Node a(333);
// Compiler will look to see which constructor takes
// an argument of type Node -> Node(const Node& other), the cctor!
Node b(a);
Copy initialization: we ask the compiler to copy the right-hand operand into the object being created
- This will happen if you initialize variables using =
// Compiler knows to construct object "c" as a COPY of existing
// object "a"
Node c = a;
NOTE: you may be wondering what would happen if we split the above steps:
Node c; // default constructor (default initialization)
c = a; // copy-assignment
- The cctor is NOT called here. In another chapter we will look at the copy-assignment operator
- The copy-assignment operator does almost the same thing as a cctor, but just remember that by splitting the steps, you do not call the cctor
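To check which special member runs in each case, here is a minimal sketch that adds counters to a stripped-down Node. The counters copyCtorCalls and copyAssignCalls, the helper countCalls, and the copy-assignment operator shown here are assumptions added for illustration (the copy-assignment operator itself is covered in another chapter).

```cpp
#include <cassert>

// Hypothetical counters, added only to observe which member runs
static int copyCtorCalls = 0;
static int copyAssignCalls = 0;

class Node {
public:
    int val;
    Node() : val(0) {}
    Node(int val) : val(val) {}
    Node(const Node& other) : val(other.val) { ++copyCtorCalls; }
    Node& operator=(const Node& other) {
        val = other.val;
        ++copyAssignCalls;
        return *this;
    }
};

// Hypothetical helper: runs the three forms discussed in this section
void countCalls() {
    copyCtorCalls = 0;
    copyAssignCalls = 0;

    Node a(333);
    Node b(a);   // direct initialization -> cctor
    Node c = a;  // copy initialization   -> cctor
    Node d;      // default constructor
    d = a;       // copy-assignment, NOT the cctor

    assert(copyCtorCalls == 2);
    assert(copyAssignCalls == 1);
    assert(b.val == 333 && c.val == 333 && d.val == 333);
}
```

Calling countCalls() from main should pass all three assertions: two cctor calls (for b and c) and one copy-assignment (for d).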
Remember in Java, if we did something like:
Node a = new Node(333);
Node c = a;
c.val = 124;
System.out.println(a.val); // what gets printed?
- If we wrote this in Java, c is a reference to object a
- Thus, changing c's fields would also be reflected in a. In the above example, we would print 124
In C++:
Node a(333);
Node c = a; // cctor invoked
c.val = 124;
std::cout << a.val << std::endl; // what is printed now?
- c is NOT a reference to a. Instead, it is a brand new object made as a COPY of a (look back to the cctor implementation)
- Thus, changing c's fields has no impact at all on a
- In the above example, we will still print 333
Copy initialization also happens when you pass an object as an argument to a parameter of nonreference type
- whew, that's a mouthful! Look at the example below for clarity
void foo(Node x){
    std::cout << x.val << std::endl; // prints 333
    x.val = 124;
}
int main(){
    Node a(333);
    foo(a);
    std::cout << a.val << std::endl; // still prints 333
}
- We pass in a to function foo, but notice how the parameter x is a nonreference. Thus, manipulating x's fields in foo has absolutely no effect on a back in main
- But how did x get a's value of 333? Copy initialization!
- I.e., x was copy constructed from a for use in the function foo
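As a rough illustration of the point above, the sketch below counts cctor calls when the same object is passed to a nonreference parameter and then to a reference parameter. The counter copies and the functions byValue, byReference, and passingDemo are hypothetical names introduced only for this example.

```cpp
#include <cassert>

static int copies = 0;  // hypothetical counter of cctor calls

class Node {
public:
    int val;
    Node(int val) : val(val) {}
    Node(const Node& other) : val(other.val) { ++copies; }
};

// Nonreference parameter: x is copy constructed from the argument
void byValue(Node x) { x.val = 124; }

// Reference parameter: x is another name for the caller's object, no copy
void byReference(Node& x) { x.val = 124; }

// Hypothetical helper: returns the total number of cctor calls
int passingDemo() {
    copies = 0;
    Node a(333);

    byValue(a);
    assert(copies == 1);   // the parameter was copy constructed
    assert(a.val == 333);  // the caller's object is untouched

    byReference(a);
    assert(copies == 1);   // no additional copy
    assert(a.val == 124);  // the caller's object was modified
    return copies;
}
```

Only the call to byValue goes through the cctor; the reference parameter binds directly to a.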
Looking back at cctor code
Take a look again at the cctor declaration:
Node(const Node& other);
My question is, why must parameter other be a REFERENCE to some existing Node? What if the declaration were:
Node(const Node other);
Answer: if other were a nonreference parameter, other would be copy constructed from the argument. But that's what we're trying to define!
- Basically the cctor would need to call the cctor, leading to infinite recursion. For this reason, C++ compilers reject such a declaration outright
Finally, the cctor is invoked when you try to return a non-reference object value from a function
Node foo(){
    Node x(124);
    return x;
    // Conceptually returns an unnamed temporary copy of Node "x"
    // to the call site
}
int main(){
    // Conceptually the cctor is invoked twice in this program:
    // the result of foo() is returned as a copy of "x",
    // then we construct "a" as a copy of foo()'s result
    Node a = foo();
}
Note that the compiler is allowed to optimize during copy initialization, using copy elision and move semantics to eliminate unnecessary copies
- This is why you may write programs and sometimes not see the cctor get invoked when you might expect it
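One way to see this for yourself is to count cctor calls around a function that returns by value. This is only a sketch: the counter copies and the helper countReturnCopies are hypothetical, and the number you observe depends on your compiler and language standard, so the code checks only an upper bound.

```cpp
#include <cassert>

static int copies = 0;  // hypothetical counter of cctor calls

class Node {
public:
    int val;
    Node(int val) : val(val) {}
    Node(const Node& other) : val(other.val) { ++copies; }
};

Node foo() {
    Node x(124);
    return x;  // conceptually copies x into the return value
}

// Hypothetical helper: returns how many cctor calls actually happened
int countReturnCopies() {
    copies = 0;
    Node a = foo();  // conceptually copies the return value into a
    assert(a.val == 124);
    assert(copies <= 2);  // 2 with no elision at all; often 0 in practice
    return copies;
}
```

If countReturnCopies() returns 0, the compiler elided both conceptual copies and constructed a directly.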
This is important: if you do not define your own cctor, C++ will synthesize one for you
Specifically, the synthesized cctor will perform a shallow copy of all your class's data members
- ... this could be dangerous, as we'll see shortly
For example, let's make some updates to our Node class:
class Node{
public:
    int *rank;
    int val;
    Node(int val, int rank){
        this->val = val;
        this->rank = new int(rank);
    }
    // Node(const Node& other);
};
- We now have an extra data member rank, which has type int*
- Since we used new, rank is a pointer to an int allocated on the heap
- We also got rid of our own cctor
So now what, since we didn't define our own cctor? The synthesized cctor C++ makes for you would look something like this:
Node::Node(const Node& other){
    rank = other.rank;
    val = other.val;
}
- Shallow copy means that all data members would essentially be initialized like this:
dataMemberName = other.dataMemberName;
- In the above example, look especially carefully at how we shallow copied the int pointer. See any potential problems?
Let's assume that once an object falls out of scope, it will magically call delete or delete[] on any data members that are pointers to heap-allocated memory.
- See the chapter on destructors
In our Node class (second version), rank is a pointer to an int object on the heap
We also assumed the compiler was kind enough to supply us with the synthesized cctor, since we didn't define our own.
So anything wrong with this code?
int main(){
    Node a(333, 1);
    Node x(a); // ok, x is a copy of a
    // we're done right?
    // ...
}
Sorry, bad news: we just caused a double free
- By the way, a double free technically causes undefined behavior. Best case, your program crashes
How? Well, let's examine how the synthesized cctor actually copy constructed x
- Exercise for you: draw a memory diagram!
- Notice how a "shallow copy" of a pointer just makes the rank member of a and x point to the same object in the heap!
- Then when x falls out of scope, some magical apparatus will call delete on x.rank
- But when a falls out of scope, the magical apparatus will attempt to call delete on a.rank, which has already been freed!
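The sharing described above can be observed without triggering the double free. The sketch below uses the second version of Node and deliberately has no destructor, so the "magical apparatus" is not in play and the shared int is freed by hand exactly once. The helper sharesRank is a hypothetical name introduced for this example.

```cpp
#include <cassert>

// Second version of Node: no user-defined cctor, so the compiler
// synthesizes a memberwise (shallow) copy
class Node {
public:
    int* rank;
    int val;
    Node(int val, int rank) {
        this->val = val;
        this->rank = new int(rank);
    }
};

// Hypothetical helper: returns true if both objects share one heap int
bool sharesRank() {
    Node a(333, 1);
    Node x(a);  // synthesized cctor copies the pointer, not the int

    bool same = (a.rank == x.rank);
    *x.rank = 7;           // writing through x ...
    assert(*a.rank == 7);  // ... is visible through a

    delete a.rank;  // free the shared int exactly once;
                    // deleting x.rank as well would be a double free
    return same;
}
```

Because a.rank and x.rank hold the same address, any cleanup that deletes both would free the same int twice.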
To fix the bug in the last section, we just have to define our own copy constructor
We probably didn't want to copy the pointer value (i.e., an address). More likely we just wanted to copy the pointed-to object's value
So let's write our cctor:
Node::Node(const Node& other){
    val = other.val;
    rank = new int(*other.rank);
}
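Putting the pieces together, here is a minimal sketch of the class with a deep-copying cctor. Two things in it are assumptions that do not appear in the text above: the destructor, which stands in for the "magical" cleanup, and the deleted copy-assignment operator, which simply keeps the example small. The helper copiesAreIndependent is also a hypothetical name.

```cpp
#include <cassert>

class Node {
public:
    int* rank;
    int val;

    Node(int val, int rank) : rank(new int(rank)), val(val) {}

    // Deep copy: allocate a fresh int holding the same value
    Node(const Node& other) : rank(new int(*other.rank)), val(other.val) {}

    // Assumption: copy-assignment is disabled to keep the sketch small
    Node& operator=(const Node&) = delete;

    // Assumption: stands in for the "magical" cleanup described earlier
    ~Node() { delete rank; }
};

// Hypothetical helper: returns true if changing x leaves a untouched
bool copiesAreIndependent() {
    Node a(333, 1);
    Node x(a);

    assert(a.rank != x.rank);  // two separate heap objects
    assert(*x.rank == 1 && x.val == 333);

    *x.rank = 7;
    return *a.rank == 1;
}   // x and a are destroyed here; each deletes its own int
```

Each object now owns its own int, so destroying x and a frees two different allocations and the double free is gone.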