Defining new types¶
When developing complex programs, you only seldom have to to with int
s or
double
s directly. Most of the time you will work with objects that
correspond to real-world (or virtual-world) objects like persons, credit cards,
bank accounts, players, monsters, books, dates and so on. Dealing with these by
explicitly juggling with C++’s built-in types quickly becomes cumbersome and
error-prone.
For example, assume we are developing a game in which two player can move across
the screen and has a certain amount of points. To represent this player in our
program we could use two int
s for the x and y-coordinate of the position
and another one for the points:
#include <iostream>
void draw_player(int player_x, int player_y, int player_points)
{
std::cout
<< "Player at " << player_x << "," << player_y
<< " with " << player_points << " points.\n";
}
void next_round(unsigned& game_round)
{
++game_round;
std::cout << "Round " << game_round << ":\n";
}
int main()
{
auto game_round = 0u;
auto player_a_x = 0, player_a_y = 0, player_a_points = 0;
auto player_b_x = 0, player_b_y = 0, player_b_points = 0;
next_round(game_round);
draw_player(player_a_x, player_a_y, player_a_points);
draw_player(player_b_x, player_b_y, player_b_points);
next_round(game_round);
// Move player a right a bit:
player_a_x += 5;
draw_player(player_a_x, player_a_y, player_a_points);
}
This code suffers from several problems:
The player variables are not explicitly separated from each other or other unrelated variables.
The variables are are ordinary
int
s. E.g. nothing would prevent you from accidentally writing nonsensical statements like:// Oops, wrong order! draw_player(player_a_points, player_a_x, player_a_y); // Oops, players mixed! draw_player(player_a_x, player_a_y, player_b_points); // More nonsense: next_round(player_a_points); draw_player(player_b_x, player_b_y, game_round);Adding a new data component to a player involves changing all places where a player was used.
Simple compound types: struct
¶
By using the struct
construct and the .
(dot) operator, we can improve
our program as follows:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 | #include <iostream>
struct Player {
int x;
int y;
int n_points;
};
void draw_player(Player player)
{
std::cout
<< "Player at " << player.x << "," << player.y
<< " with " << player.n_points << " points.\n";
}
void next_round(unsigned& game_round)
{
++game_round;
std::cout << "Round " << game_round << ":\n";
}
int main()
{
auto game_round = 0u;
Player player_a = {}, player_b = {};
next_round(game_round);
draw_player(player_a);
draw_player(player_b);
next_round(game_round);
player_a.x += 5; // Move player to the right a bit.
draw_player(player_a);
}
|
First, we define the Player
structure in line 3 ‒ 6. It starts with the
keyword struct
followed by a name. Then come, inside curly braces {}
the
data members of the struct
: these are just explicitly typed variable
definitions, like with local variables. After the closing curly brace follows a
semicolon ;
(contrary to functions, where this is neither required nor
allowed).
This structure is its own type. It can be used quite similarly to built-in types
like int
: It can e.g. be used as function parameter type (line 8) or for
local variables (line 24). However, since it is a distinct type, you cannot
assign a Player
variable to an int
variable or vice versa, solving the
second of the aforementioned problems. To use variables of the type Player
the program above uses the following operations on the whole struct Player
object:
- Initialize all members to zero by using
= {}
(bool
s would becomefalse
; see line 24).- Assign players to each other (happens implicitly when passing the player by value to
draw_player
; see lines 8 and 27, 28 and 32).
However, in the end a struct
only consists of its data members, and hence
the most important operation of a struct
object is accessing its members.
This is done with the .
(dot) operator in lines 11, 12 and 31 of the above
program. An member access expression like player.x
can be used like a normal
variable: both reading the data member (e.g. writing it to std::cout
or
saving it into an int
variable) and writing to it (i.e. assigning to it
using =
or using a compound assignment operator like +=
) is possible: it
yields an lvalue. As a consequence of being an lvalue you can also bind this
to a reference like:
Player player = {};
int& rx = player.x;
rx = 3; // Now player.x is also 3.
player.x = 42; // Now rx is also 42.
Note that the two Player
variables are of course independent of each other,
just like the int
variables were in the initial program. This means that
changing data members of one player will not change the other player.
struct
initialization with {}
¶
In the above example programs we have always used = {}
to initialize all of
a struct
s members with zero. However, we can also choose our own values
for the data members by writing them inside the braces, separated by comma:
#include <iostream>
struct Player {
int x;
int y;
int n_points;
};
void print_player(Player player)
{
std::cout << player.x << ',' << player.y << ": " << player.n_points << '\n';
}
Player new_player(int n_points) {
return {0, 0, n_points};
}
int main()
{
Player player = {5, 3, 100};
print_player(player);
player = {1, 2, 3};
print_player(player);
Player player_b = {5};
print_player(player_b);
print_player({10, 7, 250});
print_player(new_player(310));
}
Output:
5,3: 100
1,2: 3
5,0: 0
1,2: 42
10,7: 250
0,0: 310
If you study the above program and its output you will also notice that
= {...}
can also be used for reassigning variables that have astruct
type.- If you specify less values in the braces than the struct has members, the rest is filled with zero/false. In fact the
= {}
syntax we have seen before is just a special case of this.- If you have a function with a parameter of a
struct
type, you can use{...}
to create a object of that type with the given values on the fly, withouth needing a variable.- Similarly, if you have a function with
struct
return type, you can use{...}
to return create an object of that type with the given values on the fly.
Naturally, in all cases where literals like 42
or 5
were used, you can
use arbitrary expressions like variables or computations of the right type.
const
and struct
s¶
Of course variables with struct
types can also be made const
, meaning
that neither the object as whole nor any of its data members may be modified:
const Player player = {5, 5, 100};
print_player(player);
//player = {3, 3, 500}; // Does not compile, player is const.
std::cout << player.x;
//player.x = 6; // Does not compile, player is const.
Nested structs¶
Like nearly everything in C++, struct
s can be nested too. For example, it
is probable that in a game, we will neeed x/y-positions not just for the player
but also for other things like monsters, doors, coins or whatever things are
lying around in the game’s virtual world. Thus it totally makes sense to break
out these positions from the Player
type into a new type Point
:
struct Point {
int x;
int y;
};
struct Player {
Point position;
int n_points;
};
Initialization and member access is naturally also nested then:
Player pl = {{1, 2}, 0};
std::cout << pl.position.y << '\n'; // Prints “2”.
std::cout << pl.n_points << '\n'; // Prints “0”.
Efficiently passing struct
s to functions¶
As explained in Background Information: The stack, passing an argument to a function
involves pushing it on the stack. While this is fine for small arguments like
int
s and double
s, most struct
types are much bigger. For
example, our Player
has at least the size of three int
s. We can check
the exact size with the uary sizeof
operator. This operator can be applied
to a type or to any expression and evaluates to the size of the (expression’s)
type measured in bytes [1]:
std::cout << sizeof(int) << '\n';
std::cout << sizeof(Player) << '\n';
Player player_a = {};
std::cout << sizeof(player_a) << '\n';
With the MSVC compiler, the Player has in fact twelve bytes. To avoid copying all these bytes to the stack for passing it to a function, we could use references, e.g.:
void print_player(Player& player) { /* ... */ }
Now only the address (4 bytes on 32 bit systems and 8 bytes on 64 bit systems)
is pushed on the stack. There are two problems though: First, we cannot pass
const Player
variables or expressions like {5, 5, 100}
(an rvalue) and
second, we could accidentally change the passed in variable. To avoid both
problems we use const
references:
void print_player(const Player& player) { /* ... */ }
This solves both of the above problems.
Of course you can also use const
references to builtin types like int
,
but this usually makes no sense and is in fact slower than just copying the
int
over because accessing the value through a reference is slower than
accessing it directly.
Encapsulation and member functions¶
struct
s can be more than simple data containers.
Member functions¶
So far, we have only had data members in struct
s, but it turns out that we
can also add functions to them. For example, we could write a member function
draw()
instead of a “free” function draw_player(const Player&)
like this:
#include <iostream>
struct Point {
int x;
int y;
};
struct Player {
Point position;
int n_points;
void draw() const
{
std::cout
<< "Player at " << position.x << "," << position.y
<< " with " << n_points << " points.\n";
}
};
int main()
{
Player player_a = {}, player_b = {{2, 3}, 1};
player_a.draw();
player_b.draw();
}
While data members like position
and n_points
are stored in each
Player
object, functions are not really a part of the object. They behave
like normal functions with the difference that you can only call them on a
particular object (“instance”) of a Player
/ struct
-type. Behind the
scenes, the member function receives a hidden additional argument. You can think
of it as having the type Player const&
if you write const
after the
function, as done above, or Player&
otherwise (note that this means that you
can only member functions marked as const
on const
objects). Then, when
a variable or function name appears in the member function’s body, the compiler
looks in these places:
- local variables and function parameters
- members of the hidden additional parameter
- global variables and free functions
Considering all this, what is the difference to the version with the free
function? If you were to look very closely at the machine code that is generated
from both version, you could see that they are in fact bit-by-bit identical. So
at runtime there is no difference. Now it seems to be only a matter of style,
but there is in fact a drawback: Later when we learn how to split a program into
multiple files, we will see that a struct
-definition must not span multiple
files. So without touching the file where the struct
is defined, it is
impossible to add new functions. With free functions, you can add more functions
for working with the struct
without changing the definition.
Encapsulation and access control¶
There is an important technique that only works with member functions: encapsulation (also called information hiding). By encapsulating members of a type, you restrict access to them to member functions of the type only. This has two advantages: First, you can often change the (now) internal representation of the type without having to change code that uses the type. Secondly, it allows you to enforce invariants on the members, that is to enforce that some conditions always hold.
For the first advantage, consider how, when we combined the x
and y
members of the player into a new Position
type, we also had to change
draw_player
and all other code using x
and y
. Were the position
encapsulated in the right way, we wouldn’t have had to change anything but the
Player
-struct
itself. The following example demonstrates this part
of encapsulation.
#include <iostream>
struct Point {
int x;
int y;
};
struct Player {
public:
int x() const { return m_position.x; }
void set_x(int x_) { m_position.x = x_; }
int y() const { return m_position.y; }
void set_y(int y_) { m_position.y = y_; }
int n_points() const { return m_n_points; }
void set_n_points(int n_points_) { m_n_points = n_points_; }
Point position() const { return m_position; }
private:
Point m_position;
int m_n_points;
};
void draw_player(Player const& player)
{
std::cout
<< "Player at " << player.x() << "," << player.y()
<< " with " << player.n_points() << " points.\n";
}
int main()
{
Player player;
player.set_x(1);
player.set_y(2);
player.set_n_points(10);
draw_player(player);
}
Everything after private:
is only accessible to member functions of the
struct
. On the other hand, everything after public:
is also accessible
to outside functions. These key words are called access control key
words. It is customary to not intend them (relative to the struct
itself).
Note
Access-control is per-type not per-object.
That is, having a member function like the following Player
:
void steal_points(Player& other) { m_n_points += other.m_n_points; }
in Player
and using it like this:
player_a.steal_points(player_b);
is completely OK.
Notice how instead of having a public data member n_points
, Player
now
has a private data member m_n_points
(the m_
is just a fairly customary
name prefix to mark private data members) and two function for accessing it,
namely int n_points()
to get the value and set_n_points(int)
to set it
(the parameter name n_points_
has a trailing underscore to distinguish it
from the n_points()
function). The former function is called a
getter and the latter a setter. Note that not every getters must
have a corresponding setter (or vice versa, though that is more unusual), e.g.
the position of a Player
is read-only because only a getter Point
position() const
is provided. On the other hand, there may be getters and
setters for “properties” that are not even directly backed by data members, as
is the case with the x
and y
properties of the player.
Note
Getters are often used in expressions involving more than one function call and thus should be side-effect free; see Side effects and their order.
Constructors¶
There is a problem with the previous example: it fails to realize the second
advantage of encapsulation, namely maintained invariants. Although we have not
explicitly stated any, it clearly undesirable to have a Player
where any
members are initialized. In fact, the situation regarding initialization with
this partially-encapsulated example is worse than before since we cannot use the
= { … }
syntax anymore because the data members are private (in fact this
aggregate initialization` cannot be used at all as soon as any members
of the initialized type are private). To remediate this, you can define a
special member function called the constructor. Constructors functions
have no return type and the same name as the type. They can have parameters but
don’t have to. A constructor that can be called without parameters is called a
default-constructor. If we add a default-constructor to our struct
Player
, it looks like this:
struct Player {
public:
Player():
m_position{0, 0},
m_n_points{0}
{
}
// ... getters and setters as before ...
private:
Point m_position;
int m_n_points;
};
The effect of this change is that every Player
variable will be
automatically initialized with position of (0, 0) and zero points.
Apart from the name, which is the same as the type’s, and the missing return
type, constructors have one more feature that normal functions don’t have:
Initializer lists. This comma-delimited list starts at the :
after the
argument list and the entries have the form data_member_name{initial_value}
.
In the case of Player
above, this is the same as if we had written:
Player()
{
m_position = {0, 0};
m_n_points = 0;
}
There are cases where this makes a differences, namely when the some data members have constructors themselves.
Instead of a default-constructor, we could also add a constructor that accepts the initial values as arguments:
Player(Position position_, int n_points_):
m_position{position_},
m_n_points{n_points_}
{
}
Now I can also explain where the initializer list behaves different from using
=
: E.g. if Point
had a constructor Point(int x, int y)
, then we
would have to use the initialization list, because if we leave it out, the
compiler will try to insert a call to the default constructor as if we had
written:
Player(Position position_, int n_points_):
m_position{}
{
m_position = position_;
m_points = n_points;
}
Since the constructor of m_position
requires two int
s as parameters,
however, this would be an error.
A quick glance at templates¶
One thing that is a bit limiting for the struct Point
is that it only has
int
s as the x
and y
components. If we only use the Point
for
one application and we only need int
, that may be fine. But in general, we
might need Point
s of double
or int
or maybe in one case we need to
optimize for memory and even want to use a Point
of signed char
. We
could just write a class for every case such as:
struct IntPoint { int x; int y; };
struct FloatPoint { float x; float y; };
struct DoublePoint { double x; double y; };
// ...
In the case of Point
this may even be feasible as the struct
is so tiny.
But now imagine we also wanted to provide some functions for Point
s, such
as:
IntPoint move_int_point(IntPoint p, IntPoint by)
{
return {p.x + by.x, p.y + by.y};
}
void print_int_point(IntPoint p)
{
std::cout << '(' << p.x << ',' << p.y << ')';
}
If we had \(n\) functions and \(m\) types of points, we would need to implement all of the \(n\) functions separately for all \(m\) types which results in \(m \cdot n\) functions. Clearly this does not scale up well.
Class templates¶
With the template mechanism of C++, struct
s we can instead write a struct
template (usually called a class-template) for Point
:
template <typename T>
struct Point {
T x;
T y;
};
The template
keyword tells the compiler that what follows is a template
definition and inside the angle brackets (which are used the usual less-than and
greater-than signs repurposed) come the template parameters. In this case the
template has as parameter a type (typename
) with the parameter name T
.
Inside the template definition, you can use T
just like a normal type.
To use this class template, we now need to specify the template parameter:
void print_point(Point<int> p)
{
std::cout << '(' << p.x << ',' << p.y << ')';
}
int main()
{
Point<int> mypoint = {2, 5};
print_point(mypoint)
}
This is very similar to functions. In fact, you could view the class template
Point
as a “type function” that takes a component-type as parameter and
“returns” a point-type. The big difference is that this “computation” is done by
the compiler and not when the program runs.
Function templates¶
Now we need to write only one Point
class-template instead of several
SomeTypePoint
classes, but we still would need \(m \cdot n\) functions. But
function templates come to the rescue!
template <typename T>
void print_point(Point<T> p)
{
std::cout << '(' << p.x << ',' << p.y << ')';
}
Transforming print_point
from a function to a function-template should look
extremely similar to the previous tranformation of Point
to a
class-template. However, when calling a function template, there is a catch: You
usually don’t need to specify the template parameter because the compiler
figures them out automatically by comparing the function argument with
parameter. That is, the main
function from before now looks exactly the same!
By using class- and function-templates, instead of \(m \cdot n\) functions,
we now need to write each function only once for all Point
types, greatly
improving maintainability and readability of the program.
Footnotes
[1] | Actually, the C++ standard defines sizeof to return the
size in multiples of char ‘s size (sizeof(char) is defined to be
1 ). Although the standard only specifies that a char must be at
least one byte large, in practice it has exactly one byte on all but some
exotic platforms. |