Contrary to popular belief, C is a strongly typed language; the problem is that C programming practice favours using weak typing.
What is a Type?
A type is something that has
- a set of values associated with it,
- a set of operations that can be performed on it,
- and a representation (how it looks at the machine level).
This sounds similar to objects from Object Oriented Programming (OOP) but objects have 3 additional properties over types:
- inheritance – child objects inherit the operations of the parent object
- extensibility – child objects can extend the parent object
- polymorphism – operations on an object don’t have to be bound at compile time, they can be determined at runtime.
Example of a Type
There are 3 commonly used temperature scales: Celsius, Fahrenheit and Kelvin.
Each has a definite minimum value (absolute zero) and an unspecified maximum value, and you can perform various operations on its values (addition, subtraction, scaling, conversion, averaging, etc.).
There is no problem in adding two °C values together, or two °F values together.
I am sure you would agree that it is an error to add a °C value to a °F value – since they are not the same type and the result is meaningless.
It would also be incorrect to assign a °F value to a °C variable. The proper way to assign a value of one type to another is by a conversion.
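As a minimal sketch of what "conversion" means here, the two functions below apply the standard formulas between the scales. The function names are hypothetical (the article does not define them), and plain float is assumed as the representation for now.

```c
#include <assert.h>

/* Hypothetical helper names; the article does not define them. */

/* Celsius to Fahrenheit: F = C * 9 / 5 + 32 */
float celsius_to_fahrenheit(float c)
{
    return c * 9.0f / 5.0f + 32.0f;
}

/* Fahrenheit to Celsius: C = (F - 32) * 5 / 9 */
float fahrenheit_to_celsius(float f)
{
    return (f - 32.0f) * 5.0f / 9.0f;
}

/* Usage: freezing and boiling points of water, which are exact in float. */
void conversion_example(void)
{
    assert(celsius_to_fahrenheit(0.0f) == 32.0f);
    assert(celsius_to_fahrenheit(100.0f) == 212.0f);
    assert(fahrenheit_to_celsius(212.0f) == 100.0f);
}
```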
The Common Way These Types Would be Coded in C
The typical C programmer, when coding types for °C and °F, would first determine the range of values and the precision required.
If we don’t care about fractions of a degree, then some sort of integer representation is fine. If fractions of a degree are important then we would consider using either a fixed point representation or a floating point representation.
Good programming practice dictates that we don’t use naked fundamental C types and instead use typedef to define the type because
- this makes the code more self documenting and
- if we need to change the representation at a later time, we only have to change it in one place instead of hunting through the code:
typedef float temperature_C;
typedef float temperature_F;
The Problem with the Normal Way of Defining a Type
Unfortunately, we haven’t defined any new types. We have only defined new names for an existing fundamental C type.
The code we write looks like it is using custom types:
/* declare variables */
temperature_C c;
temperature_F f;
Unfortunately, we can still mix types:
c = c + f;
and the compiler will not complain because it knows that c and f are both of type float. The compiler is unable to enforce the meaning of the types – that responsibility falls on the programmer who has to ensure the operation makes sense.
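One way to see that the two names are only aliases is to ask the compiler which type it sees. The sketch below assumes a C11 compiler and uses a generic selection; IS_PLAIN_FLOAT and the function names are hypothetical helpers introduced for illustration.

```c
#include <assert.h>

typedef float temperature_C;
typedef float temperature_F;

/* Hypothetical helper: evaluates to 1 when the expression's type is float. */
#define IS_PLAIN_FLOAT(x) _Generic((x), float: 1, default: 0)

/* Returns 1 because both typedef names denote the same type, float. */
int aliases_are_plain_float(void)
{
    temperature_C c = 38.0f;
    temperature_F f = 100.0f;
    return IS_PLAIN_FLOAT(c) && IS_PLAIN_FLOAT(f);
}

/* Usage: the mixed-scale sum is accepted and silently computed. */
void alias_example(void)
{
    temperature_C c = 38.0f;
    temperature_F f = 100.0f;
    c = c + f;                 /* meaningless, yet legal */
    assert(c == 138.0f);
    assert(aliases_are_plain_float());
}
```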
Imagine you are developing software for a baby bath monitor. One of the things the monitor does is check if the bath water is too hot (we don’t want to scald the baby). As a programmer you write a suitable function:
/* true and false come from <stdbool.h> */
_Bool is_bath_too_hot (temperature_F t)
{
    if (t > MAX_SAFE_TEMPERATURE)
    {
        return true;
    }
    return false;
}
The problem with this function is that while it looks like it takes a parameter of type temperature_F, in reality it will accept any floating point number, including a temperature_C. A MAX_SAFE_TEMPERATURE for bath water is 100°F (or 38°C). By accidentally passing a temperature_C to the function, you risk approving water that is dangerously hot: every Celsius reading up to 100 passes the check, and 100°C is the boiling point of water.
Most programmers would say that is just the nature of the language: there are a limited number of fundamental types, and it is the programmer’s responsibility to use those types responsibly.
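The bath monitor failure described above can be reproduced in a few lines. This is a sketch under assumptions: MAX_SAFE_TEMPERATURE is taken to be 100.0f (the 100°F limit mentioned in the text), and scalding_bath_is_approved is a hypothetical name for the faulty caller.

```c
#include <assert.h>
#include <stdbool.h>

typedef float temperature_C;
typedef float temperature_F;

#define MAX_SAFE_TEMPERATURE 100.0f /* assumed to be in degrees Fahrenheit */

bool is_bath_too_hot(temperature_F t)
{
    return t > MAX_SAFE_TEMPERATURE;
}

/* Hypothetical faulty caller: passes a Celsius reading where
   Fahrenheit is expected. The compiler sees float in both places. */
bool scalding_bath_is_approved(void)
{
    temperature_C bath = 60.0f;      /* 60 degrees C is 140 degrees F */
    return !is_bath_too_hot(bath);   /* compiles cleanly, wrong answer */
}

/* Usage: the same water is rejected only when the scale is right. */
void bath_example(void)
{
    assert(scalding_bath_is_approved());
    assert(is_bath_too_hot(140.0f));
}
```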
Creating Your Own Types in C
We are not limited to the fundamental data types in C; we can create our own data types.
Every struct and every union in C is a new data type. Even if the representation is the same, you cannot assign one struct to a different struct:
struct temperature_C { float value; };
struct temperature_F { float value; };
. . .
struct temperature_C c;
struct temperature_F f;
c = f; /* this will not compile because they are different types */
Of course, it is ugly to have to preface every use of the type with struct, so you are more likely to typedef it:
typedef struct { float value; } temperature_C;
typedef struct { float value; } temperature_F;
Now you can use them as though they were plain types:
temperature_C c;
temperature_F f;
c = f; /* this will not compile because they are different types */
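With distinct struct types, the only way to get from one scale to the other is an explicit conversion. The following is a rough sketch of how the bath check might look; to_fahrenheit and check_celsius_bath are hypothetical names, and the 100°F limit is taken from the earlier example.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { float value; } temperature_C;
typedef struct { float value; } temperature_F;

/* Hypothetical conversion helper: the explicit route from C to F. */
temperature_F to_fahrenheit(temperature_C c)
{
    temperature_F f = { c.value * 9.0f / 5.0f + 32.0f };
    return f;
}

bool is_bath_too_hot(temperature_F t)
{
    return t.value > 100.0f;   /* limit in degrees Fahrenheit */
}

bool check_celsius_bath(temperature_C bath)
{
    /* is_bath_too_hot(bath) would not compile: incompatible type. */
    return is_bath_too_hot(to_fahrenheit(bath));
}

/* Usage: 60 degrees C is now correctly rejected. */
void typed_bath_example(void)
{
    temperature_C scalding = { 60.0f };
    temperature_C pleasant = { 37.0f };
    assert(check_celsius_bath(scalding));
    assert(!check_celsius_bath(pleasant));
}
```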
Why You are Not Likely to Code This Way
All we have is a new type, but we have no operations on it (aside from assignment).
This means we have to define all the operations we want for this type. Since C does not allow operator overloading, this means we lose the convenience of doing:
a = b + c;
and, instead, have to write:
a = add(b, c);
Worse yet, C does not allow function overloading either, so all function names must be unique. To make each function name unique, we will probably resort to decorating each function name with the type name.
a = add_my_type(b, c);
None of the library functions will work with it. You can’t use log or sin or printf with these types – unless you write your own wrappers.
If type safety is important, this is the only way to go – aside from changing languages.
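To give an idea of what such a wrapper involves, here is a minimal sketch for formatted output. It assumes my_type wraps a float; format_my_type is a hypothetical name, and wrappers for functions such as log or sin would follow the same unwrap, call, rewrap pattern.

```c
#include <assert.h>
#include <stdio.h>

typedef struct { float value; } my_type;   /* assumed representation */

/* Hypothetical wrapper: unwraps the value and hands it to snprintf.
   Returns the number of characters the full output requires. */
int format_my_type(char *buf, size_t size, my_type a)
{
    return snprintf(buf, size, "%.1f", (double)a.value);
}

/* Usage: format a value into a small buffer. */
void wrapper_example(void)
{
    char buf[16];
    my_type t = { 37.5f };
    int n = format_my_type(buf, sizeof buf, t);
    assert(n == 4);            /* "37.5" */
    assert(buf[0] == '3' && buf[3] == '5');
}
```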
Performance Costs
Aside from wrapping up a fundamental type in a struct and creating all those new functions, what is the performance overhead?
There is none – if you use inline functions (available in C99 and later) and compile with optimization enabled.
The new type doesn’t take up any more space than the fundamental type it wraps up. The compiler doesn’t create extra code for it.
Writing
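/* static: in C99 a plain inline definition provides no external
   definition, so a call that is not inlined could fail to link */
static inline my_type add_my_type (my_type a, my_type b)
{
    my_type t;
    t.value = a.value + b.value;
    return t;
}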
gets compiled as a simple addition. There is no function call overhead, no extra instructions inserted by the compiler.
Using GCC version 4.4.1, there was no difference in the assembly output for these two declarations and operations:
temperature_C c, d;
c = add_temperature_C(c, d);
float a, b;
a = a + b;
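The size claim can be checked at compile time, and the arithmetic at run time. The sketch below assumes a C11 compiler; the C standard permits padding inside a struct, so the assertion documents an expectation about mainstream compilers, not a guarantee. wrapped_sum is a hypothetical helper. To compare the generated code yourself, one option is to compile with gcc -O2 -S and inspect the output.

```c
#include <assert.h>

typedef struct { float value; } temperature_C;

/* Compile-time check (C11): the build fails if the wrapper adds padding. */
_Static_assert(sizeof(temperature_C) == sizeof(float),
               "temperature_C is expected to be the same size as float");

static inline temperature_C add_temperature_C(temperature_C a, temperature_C b)
{
    temperature_C t;
    t.value = a.value + b.value;
    return t;
}

/* Hypothetical helper: adds through the wrapper and returns the raw
   result so it can be compared with a plain a + b. */
float wrapped_sum(float a, float b)
{
    temperature_C x = { a };
    temperature_C y = { b };
    return add_temperature_C(x, y).value;
}

/* Usage: the wrapped addition gives the same value as the raw one. */
void overhead_example(void)
{
    assert(wrapped_sum(1.5f, 2.25f) == 1.5f + 2.25f);
}
```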
Macros Make Life Easier
It is a lot of typing to create a new type and write fundamental operations for it. Fortunately, macros can save us a lot of work.
The macro new_type takes two parameters: name, which is the name of the new type, and type, which is the representation (int, short, char, float, etc.) for the type.
It creates
- the new type with the name specified
- an initializer
- add function
- subtract function
- multiply function
- divide function
- greater than comparison function
- greater than or equal to comparison function
- less than comparison function
- less than or equal to comparison function
- equality test function
- non-equality test function
- value function that returns the basic C type
/* The functions are static inline: in C99 a plain inline definition
   provides no external definition, so a call that is not inlined could
   fail to link. There is no backslash after the final brace, so the
   macro ends there. */
#define new_type(name, type) \
typedef struct \
{ \
    type value; \
} name; \
static inline name name##_(type v) \
{ \
    name t; \
    t.value = v; \
    return t; \
} \
static inline name add_##name(name a, name b) \
{ \
    name t; \
    t.value = a.value + b.value; \
    return t; \
} \
static inline name sub_##name(name a, name b) \
{ \
    name t; \
    t.value = a.value - b.value; \
    return t; \
} \
static inline name mul_##name(name a, name b) \
{ \
    name t; \
    t.value = a.value * b.value; \
    return t; \
} \
static inline name div_##name(name a, name b) \
{ \
    name t; \
    t.value = a.value / b.value; \
    return t; \
} \
static inline _Bool gt_##name(name a, name b) \
{ \
    return a.value > b.value; \
} \
static inline _Bool gte_##name(name a, name b) \
{ \
    return a.value >= b.value; \
} \
static inline _Bool lt_##name(name a, name b) \
{ \
    return a.value < b.value; \
} \
static inline _Bool lte_##name(name a, name b) \
{ \
    return a.value <= b.value; \
} \
static inline _Bool eq_##name(name a, name b) \
{ \
    return a.value == b.value; \
} \
static inline _Bool neq_##name(name a, name b) \
{ \
    return a.value != b.value; \
} \
static inline type value_##name(name a) \
{ \
    return a.value; \
}
Creating a temperature_C type is simply:
new_type(temperature_C, float);
Which gives you:
temperature_C
The new type.
temperature_C temperature_C_ (float v)
An initializing / casting function
temperature_C add_temperature_C (temperature_C a, temperature_C b)
Returns a + b.
temperature_C sub_temperature_C (temperature_C a, temperature_C b)
Returns a - b.
temperature_C mul_temperature_C (temperature_C a, temperature_C b)
Returns a * b.
temperature_C div_temperature_C (temperature_C a, temperature_C b)
Returns a / b.
_Bool gt_temperature_C (temperature_C a, temperature_C b)
Returns a > b.
_Bool gte_temperature_C (temperature_C a, temperature_C b)
Returns a >= b.
_Bool lt_temperature_C (temperature_C a, temperature_C b)
Returns a < b.
_Bool lte_temperature_C (temperature_C a, temperature_C b)
Returns a <= b.
_Bool eq_temperature_C (temperature_C a, temperature_C b)
Returns a == b.
_Bool neq_temperature_C (temperature_C a, temperature_C b)
Returns a != b.
float value_temperature_C (temperature_C a)
Returns the raw value of the type. This makes it easier to use the type with existing library functions like printf or, when the representation is an integer type, in a switch statement.
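To show how the generated functions fit together, here is a short usage sketch. It carries an abbreviated copy of the macro (initializer, add, greater than and value only) so that it stands alone, and it declares the functions static inline, an assumption made so the sketch links even when a call is not inlined. average_exceeds is a hypothetical function, not part of the macro.

```c
#include <assert.h>
#include <stdbool.h>

/* Abbreviated copy of the macro: four of the generated functions,
   declared static inline (an assumption of this sketch). */
#define new_type(name, type)                                   \
    typedef struct { type value; } name;                       \
    static inline name name##_(type v)                         \
    { name t; t.value = v; return t; }                         \
    static inline name add_##name(name a, name b)              \
    { name t; t.value = a.value + b.value; return t; }         \
    static inline bool gt_##name(name a, name b)               \
    { return a.value > b.value; }                              \
    static inline type value_##name(name a)                    \
    { return a.value; }

new_type(temperature_C, float)
new_type(temperature_F, float)

/* Hypothetical function: is the mean of two Celsius readings above
   a Celsius limit? Passing a temperature_F here would not compile. */
bool average_exceeds(temperature_C a, temperature_C b, temperature_C limit)
{
    temperature_C sum = add_temperature_C(a, b);
    temperature_C mean = temperature_C_(value_temperature_C(sum) / 2.0f);
    return gt_temperature_C(mean, limit);
}

/* Usage: (36 + 42) / 2 = 39, which is above a 38 degree limit. */
void macro_example(void)
{
    assert(average_exceeds(temperature_C_(36.0f), temperature_C_(42.0f),
                           temperature_C_(38.0f)));
}
```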