Wednesday, December 16, 2009

Values and Their Types






In C++, every value, at every moment in its lifetime (during program execution), is characterized by its type. C++ variables are associated with their types at the time of definition. The type describes three characteristics of the value:





  • the size of the values of that type in computer memory
  • the set of values that are legal for the type (the method of interpretation of the bit pattern that represents the value of that type)
  • the set of operations that are legal on the values of that type





For example, the values of type int on my machine are allocated four bytes, and the set of legal values ranges from -2,147,483,648 to +2,147,483,647. The set of legal operations includes assignment, comparisons, shifts, arithmetic operations, and some others. The values of the type TimeOfDay that I defined in the section "Classes," in Chapter 2, "Getting Started Quickly: A Brief Overview of C++," are allocated the size of two int values (unless the compiler adds more space to align values in memory for faster access). The set of TimeOfDay legal values is any combination of values for the first integer (from 0 to 23) and for the second integer (from 0 to 59). The set of TimeOfDay legal operations includes setTime(), displayTime(), and displayMilitaryTime(); it includes assignment but not comparisons. Sure, TimeOfDay components can be compared (they are integers, and the rules of int apply to them) but not the TimeOfDay values: You should distinguish between properties of the type and properties of its components. If the client code has to compare TimeOfDay values, class TimeOfDay has to support this by implementing functions such as isLater() or compareTime() or something like that. (Again, notice the client-server terminology I am using here.)
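
The distinction between the properties of a type and the properties of its components can be sketched in a few lines. This is only an illustration, assuming a C++11 or later compiler: the class below is a hypothetical stand-in for the Chapter 2 TimeOfDay, its member names are assumptions, and firstIsLater() is a helper invented for the example.

```cpp
// Hypothetical stand-in for the Chapter 2 class; member names are assumptions.
class TimeOfDay {
public:
    void setTime(int h, int m) { hours = h; minutes = m; }

    // The class itself must supply comparison; it is not built in.
    bool isLater(const TimeOfDay& other) const {
        if (hours != other.hours)
            return hours > other.hours;      // comparing int components is legal
        return minutes > other.minutes;      // same here
    }

private:
    int hours = 0;    // legal values: 0..23
    int minutes = 0;  // legal values: 0..59
};

// The size is that of two int values, possibly plus alignment padding.
static_assert(sizeof(TimeOfDay) >= 2 * sizeof(int),
              "two int components plus optional padding");

// Invented helper: true when the first time is later than the second.
bool firstIsLater(int h1, int m1, int h2, int m2) {
    TimeOfDay a, b;
    a.setTime(h1, m1);
    b.setTime(h2, m2);
    return a.isLater(b);   // writing a > b would not compile: no operator> exists
}
```

Client code calls isLater() and never touches the integer components directly; the comparison of components stays inside the class.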



Every C++ variable has to be defined by specifying the type of its values. The type also characterizes the values of constants, functions, and expressions. This means that you can combine typed values into expressions that give other typed values as results, and these values can be used in other expressions, and so on.
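
As a small illustration that expressions carry types of their own, the sketch below asks the compiler to confirm the type of a few expressions. It assumes a C++11 or later compiler (for decltype and static_assert); the function average() is hypothetical.

```cpp
#include <type_traits>

// Hypothetical function: its call expression has type double.
double average(int a, int b) { return (a + b) / 2.0; }

// int combined with int gives int.
static_assert(std::is_same<decltype(2 + 3), int>::value, "int + int is int");

// int combined with double gives double.
static_assert(std::is_same<decltype(2 + 3.0), double>::value, "int + double is double");

// A comparison gives bool, whatever the operand types are.
static_assert(std::is_same<decltype(2 < 3), bool>::value, "comparison is bool");

// A function call is an expression with the function's return type.
static_assert(std::is_same<decltype(average(1, 2)), double>::value, "call is double");
```

Each result can in turn be used as an operand in a larger expression, which has a type of its own.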



In most cases, the type is denoted by an identifier, that is, the type has a name (e.g., int or TimeOfDay). This is common and natural, but this is not the only way to define types. C++ allows so-called anonymous types that do not have specific names. These types are not common.
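
For the curious, here is what a type without a name can look like. This is a minimal sketch; the function and constant names are invented for the example.

```cpp
// An unnamed enumeration: it introduces constants, but the type has no name.
enum { HOURS_PER_DAY = 24, MINUTES_PER_HOUR = 60 };

int minutesPerDay() {
    return HOURS_PER_DAY * MINUTES_PER_HOUR;
}

int distanceSquaredFromOrigin(int px, int py) {
    // An unnamed struct: the variable origin has a type, but no other
    // variable of that type can be declared later, because the type has no name.
    struct { int x; int y; } origin = {0, 0};

    int dx = px - origin.x;
    int dy = py - origin.y;
    return dx * dx + dy * dy;
}
```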



Type names of built-in C++ types are reserved words: int, char, wchar_t, bool, float, double, and void; the modifiers short, long, signed, and unsigned produce variations on these types. In this list void denotes the absence of a value that can be manipulated in an expression. We use it to indicate that further use of the value in other expressions is not appropriate. For example, the function computeSquare() in the section "Functions and Function Calls," in Chapter 2, returns a value that can be used in expressions, whereas the function displayResults() in the same section cannot be used this way: It returns no value. If you try to use it incorrectly, the compiler will tell you that this is an error.





int a, b;
a = computeSquare(x,y) * 5; // this is legal C++
b = displayResults(PI*PI) * 5; // this is an error



Other languages do not need this special "type" because they distinguish between functions (which return values) and procedures (which do not). C++ inherited from C a single function syntax that serves as both. Logically, the absence of a specified return type should mean the absence of a return value; not so in C. To add insult to injury, the absence of a type specification in older C denotes the integer type, and the function is expected to return an integer value. Standard C++ does not allow the return type to be omitted at all, and a conforming compiler reports an error. Some older, pre-standard C++ compilers implemented a compromise instead. If you did not specify the return type, such a compiler did not go after you and did not demand that the function return an integer value (as a C compiler does); it assumed that you wanted the void return type.





displayResults(double y) // no return type: treated as void
{
    cout << "In that world, pi square is " << y << endl;
    cout << "Have a nice day!" << endl; // no return statement, no error reported
}



However, if you use this function as an operand in an expression, a compiler that accepts the missing return type assumes that you are following the old C convention and intend to return an integer. Because the function has no return statement, the value the caller receives at run time is junk. As they say, the compiler "does not second-guess the programmer," and the compile-time protection is lost.





b = displayResults(PI*PI) * 5; // not a syntax error



If you supply the return statement, the function with no return type is treated as if it returns an integer value.





displayResults(double y) // C++ assumes it is int
{
    cout << "In that world, pi square is " << y << endl;
    cout << "Have a nice day!" << endl;
    return 0; // no syntax error
}


The client code can use the return value as it sees fit.





b = displayResults(PI*PI) * 5; // this is legitimate



The use of int as a default return type goes back to the days when most C functions were designed to return values and saving the programmer three keystrokes was viewed as a nontrivial advantage. Avoid this practice. If the return type is integer, say int. If a function returns no value, denote return type as void.



ALERT



Always specify the return type of a function. If the function returns no value, specify type void. Do not rely on a compiler default.
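
The sketch below shows the recommended style with both kinds of functions. It assumes a C++11 or later compiler for the compile-time checks; sumOfSquares(), showResult(), and scaled() are hypothetical names used only for this illustration, not the Chapter 2 functions.

```cpp
#include <iostream>
#include <type_traits>

// Returns a value: the return type is stated explicitly as int.
int sumOfSquares(int x, int y) { return x * x + y * y; }

// Returns no value: the return type is stated explicitly as void.
void showResult(double y) {
    std::cout << "The result is " << y << std::endl;
}

// The compiler knows the type of each call expression.
static_assert(std::is_same<decltype(sumOfSquares(1, 2)), int>::value,
              "the call yields an int");
static_assert(std::is_void<decltype(showResult(1.0))>::value,
              "the call yields no value");

int scaled(int x, int y) {
    int a = sumOfSquares(x, y) * 5;   // legal: the call yields an int
    // int b = showResult(3.14) * 5;  // error: a void call cannot be an operand
    showResult(a);                    // legal: used as a statement only
    return a;
}
```

With the return types spelled out, the misuse in the commented line is rejected at compile time instead of producing junk at run time.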





The types defined by the program in addition to built-in C++ types are called user-defined types. I do not like this terminology, because users do not define types. A user is a person or an organization that uses the implemented system to achieve the stated objectives. It is the programmer who defines the type composition and the name of the type, similar to the type TimeOfDay in the section "Classes," in Chapter 2. This is why I prefer to call these types programmer-defined types.



Although different types in C++ can have different sizes, it is not unusual for values of different types to have the same size in memory. For different types, it is the interpretation of the bit pattern that distinguishes the values. For example, the bit pattern 01000001 is interpreted as the value 65 if it is stored in an integer variable; the same bit pattern is interpreted as the letter 'A' (in ASCII) if it is stored in a variable of character type.
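
The two interpretations can be tried out directly. This is a minimal sketch that assumes an ASCII-based character set; the function names asInteger() and asCharacter() are invented for the example.

```cpp
#include <bitset>
#include <string>

// Interprets an 8-bit pattern, written as text, as an integer value.
int asInteger(const std::string& bits) {
    return static_cast<int>(std::bitset<8>(bits).to_ulong());
}

// Interprets the same 8-bit pattern as a character (assumes ASCII).
char asCharacter(const std::string& bits) {
    return static_cast<char>(std::bitset<8>(bits).to_ulong());
}
```

Calling asInteger("01000001") gives 65, while asCharacter("01000001") gives 'A': the bits are the same, and only the type used to read them differs.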



In the old days, programmers had to know how to read binary numbers, octal numbers, hexadecimal numbers, ASCII codes, and EBCDIC codes, remember by heart the powers of 2 up to the 16th power (sometimes to the 20th or even 32nd power), understand one's-complement and two's-complement representation for negative numbers, and whatnot. Today, most programmers do not need that. Still, computer hardware is built in sizes that are multiples of 8 bits. A byte has 8 bits, a half-word has 16 bits, a word has 32 bits. On some machines, it is a word that has 16 bits, and a double word has 32 bits. This is why it is a good idea to know at least the ranges of values that can be stored in memory of different sizes.



So, 4 bits can represent 16 different combinations of bits (one hexadecimal digit). Usually, these 16 combinations are assigned to integer numbers from zero to 15. Similarly, 8 bits can represent 256 values (2 to the power of 8). These 256 bit combinations are assigned to integer numbers from zero to 255. What if we want to represent both positive and negative numbers, not just non-negative ones? We still have only 256 bit combinations at our disposal. The range from -128 to +128 would not do because this range has 257 values, not 256. The common solution is to represent numbers from -128 to +127.



Two bytes (16 bits) can represent 65,536 bit combinations (this magic number is 2 to the power of 16). For unsigned values, the range is from zero to 65,535. For signed values (positive and negative numbers) the range is from -32,768 (minus 2 to the power of 15) to +32,767 (2 to the power of 15, minus 1). Similarly, 32 bits (four bytes) can represent 4,294,967,296 values. For signed numbers, four bytes cover the range from -2,147,483,648 (minus 2 to the power of 31) to +2,147,483,647 (2 to the power of 31, minus 1). This is probably all that you should know about binary numbers.
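
These ranges do not have to be memorized; they follow from the number of bits. The sketch below computes them and checks the results against the limits the standard library reports. It assumes a C++11 or later compiler, a platform that provides the fixed-width types from <cstdint>, and two's-complement representation; the function names are invented for the example.

```cpp
#include <cstdint>
#include <limits>

// Number of distinct bit combinations for a given width (valid for bits < 64).
std::uint64_t combinations(int bits) {
    return std::uint64_t(1) << bits;
}

// Largest unsigned value: counting starts at zero, so it is combinations - 1.
std::uint64_t unsignedMax(int bits) {
    return combinations(bits) - 1;
}

// Two's-complement signed range: half of the combinations are negative.
std::int64_t signedMin(int bits) {
    return -static_cast<std::int64_t>(combinations(bits - 1));
}

std::int64_t signedMax(int bits) {
    return static_cast<std::int64_t>(combinations(bits - 1)) - 1;
}

// The library reports the same limits for the fixed-width types.
static_assert(std::numeric_limits<std::int8_t>::min() == -128, "8-bit minimum");
static_assert(std::numeric_limits<std::int8_t>::max() == 127, "8-bit maximum");
static_assert(std::numeric_limits<std::uint16_t>::max() == 65535, "16-bit unsigned maximum");
static_assert(std::numeric_limits<std::int32_t>::max() == 2147483647, "32-bit maximum");
```

For example, signedMin(16) and signedMax(16) give -32,768 and +32,767, the same limits quoted in the text.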






