Declaring a built-in value-type variable
The C# language is strongly typed: the type of every variable must be
specified when the variable is declared. There are two ways to declare a
built-in value-type variable: you can use the full struct name or the C#
alias. For instance, the following are identical:
// Using the full struct name
System.Int32 width;
// Using the C# alias
int width;
Both of these statements declare 32-bit integer variables.
Using the alias produces more compact, readable code and is the preferred
method, although the compiled IL code is identical. We can initialize a value
type when it is declared by including a value in the variable declaration
statement:
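A minimal sketch of such a declaration follows; the variable name and the
value 50 are illustrative assumptions rather than the original listing.

```csharp
using System;

static class InitDemo
{
    // Declares and initializes a 32-bit integer in a single statement.
    public static int GetWidth()
    {
        int width = 50; // 50 is the value supplied in the declaration
        return width;
    }

    static void Main()
    {
        Console.WriteLine(GetWidth()); // 50
    }
}
```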
Literal values
The right-hand side of the above statement is called a numeric literal. You
have to be a little careful if there is a type conflict. Integer literals
default to the smallest type into which the value will fit (int, uint, long,
ulong, in that sequence), and floating-point literals default to double.
If we wish to mark a literal as having a different data type from the default,
we can add a suffix to the literal to make its type explicit (these suffixes
are not case-sensitive):
| Type    | Suffix |
| uint    | u      |
| long    | l      |
| ulong   | ul     |
| float   | f      |
| decimal | m      |
For instance, the following statement won't compile:
This won't compile because the literal 23.7
is assumed to be of type double. In this case you would have
to use the following syntax:
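A minimal sketch of both cases, assuming the target type is float (the
variable and class names are hypothetical, and the decimal case is added
only for comparison):

```csharp
using System;

static class SuffixDemo
{
    public static float MakeFloat()
    {
        // float bad = 23.7;  // compile-time error: 23.7 is a double literal
        float good = 23.7F;   // the F suffix makes the literal a float
        return good;
    }

    public static decimal MakeDecimal()
    {
        decimal price = 23.7M; // the M suffix makes the literal a decimal
        return price;
    }

    static void Main()
    {
        Console.WriteLine(MakeFloat().GetType());   // System.Single
        Console.WriteLine(MakeDecimal().GetType()); // System.Decimal
    }
}
```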
All integral types can take values in either hex or decimal
notation. In the former case, the literal is prefixed with 0x;
for example:
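A brief sketch, with values chosen arbitrarily for illustration:

```csharp
using System;

static class HexDemo
{
    // Returns true if the hex and decimal spellings denote the same value.
    public static bool SameValue()
    {
        int fromHex = 0x1F;   // hexadecimal notation
        int fromDecimal = 31; // decimal notation
        return fromHex == fromDecimal;
    }

    static void Main()
    {
        Console.WriteLine(SameValue()); // True
    }
}
```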
Note that C# does not support octal literals.
Literals of type char are enclosed in single quotes:
As well as character literals, we can use Unicode values (four-digit hex
codes prefixed by \u) and variable-length hex values (prefixed by \x) to
represent char values. We can also cast integer values to char. The
following statements are all equivalent:
char c = 'X'; // Character literal
char c = '\u0058'; // Unicode value
char c = '\x58'; // Hex value
char c = (char)88; // Integer value cast to char
Finally, a char can also contain one of the following
escape sequences:
| Sequence | Description     | Hex Code |
| \'       | Single quote    | 0027     |
| \"       | Double quote    | 0022     |
| \\       | Backslash       | 005C     |
| \0       | Null            | 0000     |
| \a       | Alert           | 0007     |
| \b       | Backspace       | 0008     |
| \f       | Form feed       | 000C     |
| \n       | Newline         | 000A     |
| \r       | Carriage return | 000D     |
| \t       | Tab character   | 0009     |
| \v       | Vertical tab    | 000B     |
The unescaped literal characters ', \, and newline are not permitted in chars.
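The equivalence of the different char notations, and the hex codes of a
couple of escape sequences, can be checked with a short program. This is a
sketch only; the class and method names are hypothetical.

```csharp
using System;

static class CharDemo
{
    // Returns true if all four spellings denote the same character.
    public static bool AllEqual()
    {
        char a = 'X';       // character literal
        char b = '\u0058';  // Unicode value
        char c = '\x58';    // hex value
        char d = (char)88;  // integer value cast to char
        return a == b && b == c && c == d;
    }

    // Returns the numeric code of the tab escape sequence.
    public static int TabCode()
    {
        return '\t'; // char converts implicitly to int
    }

    static void Main()
    {
        Console.WriteLine(AllEqual());        // True
        Console.WriteLine(TabCode());         // 9
        Console.WriteLine('\'' == '\u0027');  // True
    }
}
```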
User-defined value types
The C# language allows the creation of user-defined value types. A
user-defined value type is defined as a struct, which derives implicitly
from the System.ValueType class. Structs are covered in detail in Chapter 8.
A user-defined value type can contain fields, properties, methods, and
events. If the user-defined type is boxed, it will have access to the
virtual methods defined in the System.ValueType and System.Object classes.
Boxing is described later in this chapter. Value types are by their nature
sealed: no other type can derive from them (this is implicit; the sealed
modifier can't be explicitly added to structs).
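As a rough illustration, the sketch below defines a hypothetical Point
struct (not a type from this chapter) with fields, a property, and an
overridden virtual method, and shows that assignment copies the value:

```csharp
using System;

// A hypothetical user-defined value type.
struct Point
{
    public int X; // fields
    public int Y;

    public Point(int x, int y) { X = x; Y = y; }

    // A read-only property computed from the fields
    public int ManhattanLength
    {
        get { return Math.Abs(X) + Math.Abs(Y); }
    }

    // Overrides a virtual method inherited from System.Object
    public override string ToString()
    {
        return "(" + X + ", " + Y + ")";
    }
}

// class Derived : Point { }  // would not compile: structs are implicitly sealed

static class StructDemo
{
    static void Main()
    {
        Point p = new Point(3, -4);
        Point copy = p; // assignment copies the whole value
        copy.X = 10;    // does not affect p

        Console.WriteLine(p);                    // (3, -4)
        Console.WriteLine(copy);                 // (10, -4)
        Console.WriteLine(p.ManhattanLength);    // 7
        Console.WriteLine(typeof(Point).BaseType); // System.ValueType
    }
}
```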
Reference types
The second major branch of the type hierarchy tree corresponds to reference
types. Reference-type variables contain a pointer to a location on the heap
where the object itself is stored. Because they only contain a reference,
not the actual values, objects passed into method calls through
reference-type variables will be affected by any changes made to them
through the parameter within the body of the method; in this respect they
are similar in some ways to reference parameters.
Let's look at what happens when a string variable is
declared and passed as a parameter into a method:
When the string variable s1 is declared, a value is pushed onto the stack
which points to a location on the heap; in the figure above the reference is
stored at address 1243044, and the actual string is stored at address
12262032 on the heap. When the string is passed into a method, a new
variable is declared on the stack (this time at location 1243032)
corresponding to the input parameter, and the value held in our reference
variable (the memory location on the heap) is passed into this new variable.
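The effect of copying the reference rather than the object can be sketched
as follows. Counter is a hypothetical class invented for this illustration;
it is used instead of a string so that the shared object can be modified.

```csharp
using System;

// A hypothetical reference type used only for this illustration.
class Counter
{
    public int Value;
}

static class ReferenceDemo
{
    // The parameter holds a copy of the caller's reference,
    // so it refers to the same object on the heap.
    public static void Increment(Counter c)
    {
        c.Value++;
    }

    // Reassigning the parameter only changes the local copy of the reference.
    public static void Replace(Counter c)
    {
        c = new Counter();
        c.Value = 100;
    }

    static void Main()
    {
        Counter counter = new Counter();

        Increment(counter);
        Console.WriteLine(counter.Value); // 1

        Replace(counter);
        Console.WriteLine(counter.Value); // 1 (the caller's object is untouched)
    }
}
```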
Reference types include pointer, interface, and self-describing types.
Pointer types are used to store the address of another object, and are only
permitted in unsafe code in C# (see Chapter 17). An interface defines the
contract which implementing classes or structs must adhere to (the methods,
properties, events, and indexers they must expose), but doesn't provide any
implementation for those members. Interfaces are described in more detail in
Chapter 9.
Self-describing reference types include class types and arrays. An array represents a set of
elements, which can be value or reference types. An array is a reference type
even if its elements are value types. Details on how to create and manipulate
arrays are provided in Chapter 6.
Classes are user-defined reference types (very similar in
functionality to structs). They can contain data members (fields, constants)
and function members (methods, properties, events, operators, constructors, and
destructors). According to object-oriented programming principles, a class will
define all of the functionality needed to manipulate its data members. Classes
are described in more detail in Chapter 7.
A delegate is a reference type that refers to a method,
similar to a function pointer in C++ (the chief difference is that delegates
include the object on which the method is
invoked). Delegates provide a type-safe and secure way to reference a method
defined in another object, and can refer to a static, virtual, or instance
method. Delegates are extensively used in event handling, and are discussed in
more detail in Chapter 15.
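A minimal sketch of a delegate referring first to a static method and then
to an instance method; the Formatter delegate and the Greeter class are
hypothetical names introduced only for this example.

```csharp
using System;

// Hypothetical delegate type: any method taking a string and returning a string.
delegate string Formatter(string input);

class Greeter
{
    private readonly string greeting;

    public Greeter(string greeting) { this.greeting = greeting; }

    // Instance method: the delegate remembers which Greeter to invoke it on.
    public string Greet(string name) { return greeting + ", " + name; }
}

static class DelegateDemo
{
    // Static method with a matching signature.
    public static string Shout(string text) { return text.ToUpperInvariant(); }

    static void Main()
    {
        Formatter f1 = new Formatter(Shout);                      // static method
        Formatter f2 = new Formatter(new Greeter("Hello").Greet); // instance method

        Console.WriteLine(f1("quiet"));   // QUIET
        Console.WriteLine(f2("Zachary")); // Hello, Zachary
    }
}
```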
Predefined reference types
There are two reference types that are given special treatment in the C#
language, and which have C# aliases like those for the predefined value
types. The first is the Object class (the C# alias is object, with a
lower-case o). This is the ultimate base class of all value and reference
types. Because all .NET types derive from Object, inheritance from Object is
assumed and does not have to be declared. The Object class defines methods
to compare two objects and to return the hash code, a Type object, and a
String representation of the object. A complete description of the Object
type can be found in Chapter 22. Because all types derive from Object, it is
possible to cast any type to an Object, even value types. This process of
casting value types to Object is known as boxing, and we will look at it in
more depth shortly. Boxing is also the magic which lies behind the apparent
paradox that all value types derive from a reference type.
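As a brief preview, and only as a sketch with hypothetical names, casting
an int to object and back looks like this:

```csharp
using System;

static class BoxingDemo
{
    // Boxes an int into an object, then unboxes it again.
    public static int RoundTrip(int n)
    {
        object boxed = n;         // boxing: the value is copied into an object
        int unboxed = (int)boxed; // unboxing: an explicit cast copies it back
        return unboxed;
    }

    static void Main()
    {
        int i = 42;
        object o = i; // o holds its own boxed copy of the value
        i = 7;        // changing i does not affect the boxed copy

        Console.WriteLine(o);           // 42
        Console.WriteLine(o.GetType()); // System.Int32
        Console.WriteLine(RoundTrip(i)); // 7
    }
}
```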
The other class given special treatment by the C# language is the String
class. A string represents an immutable sequence of Unicode characters. This
immutability means that once a string has been allocated on the heap, its
value will never change. If the value is altered, .NET creates an entirely
new String object, and assigns that to the variable. This means that in many
ways strings behave like value rather than reference types: if we pass a
string into a method, and alter the parameter's value within the method
body, that won't affect the original string (unless of course the parameter
is passed by reference). C# provides the alias string (with a lower-case s)
to represent the System.String class. If you're using String in your code,
the using System; line must be at the top of your code. Using the built-in
alias string means this line is no longer necessary.
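The value-like behavior of strings passed into methods can be sketched as
follows (the class and method names are hypothetical):

```csharp
using System;

static class StringParamDemo
{
    // Assigning to the parameter creates a new String and rebinds only
    // the local copy of the reference.
    public static void Change(string s)
    {
        s = s + " (changed)";
    }

    // With ref, the caller's variable itself is rebound to the new String.
    public static void ChangeByRef(ref string s)
    {
        s = s + " (changed)";
    }

    static void Main()
    {
        string text = "original";

        Change(text);
        Console.WriteLine(text); // original

        ChangeByRef(ref text);
        Console.WriteLine(text); // original (changed)
    }
}
```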
Strings can be created using the usual constructor
syntax or by using what is called a string literal. The following three
expressions are all valid ways to create a String object:
String strA = new String('A', 5); // Sets strA to "AAAAA"
string strB = "Zachary";
String strC = @"This line can wrap,
and can contain backslashes (\), etc.";
The first version uses one of the String class constructors (other overloads
can take a char array, or a pointer to a char array). The second and third
statements simply assign a string literal to the variable. If the @
character is placed before the literal text, it indicates that the string
will be read "verbatim": this allows the literal to span multiple lines and
to contain backslashes that are not treated as the start of an escape
sequence.
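The difference between regular and verbatim literals can be sketched as
follows; the file path is a made-up example, as are the class and method
names.

```csharp
using System;

static class VerbatimDemo
{
    // Regular literal: each backslash must be escaped.
    public static string Regular()
    {
        return "C:\\Temp\\notes.txt";
    }

    // Verbatim literal: backslashes are taken as written.
    public static string Verbatim()
    {
        return @"C:\Temp\notes.txt";
    }

    static void Main()
    {
        Console.WriteLine(Regular() == Verbatim()); // True

        // In a verbatim literal \n is two characters, not a newline.
        Console.WriteLine("a\nb".Length);  // 3
        Console.WriteLine(@"a\nb".Length); // 4

        // A double quote inside a verbatim literal is written twice.
        Console.WriteLine(@"Say ""hi""" == "Say \"hi\""); // True
    }
}
```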
The String class defines methods that can be used to concatenate two
strings. We can also use the + operator for this purpose:
String str = "Hello";
String str2 = str + " There";
We can extract a character at a given position in the String by using an
indexer syntax:
String str = "Bye Bye";
char firstChar = str[0];
Chapter 22 contains a complete description of the properties, methods, and
operators defined in the String class. Indexers are covered in Chapter 14.
Mutable strings
Because strings are immutable, every concatenation involves three separate
strings: the two operands and the newly allocated result. For example, take
the C# code:
string s1 = "Hello";
string s2 = s1 + " There";
string s3 = s2 + " John";
This code involves five separate strings in total, as can be seen from the
IL code it compiles to:
.locals init (string V_0, string V_1, string V_2)
IL_0000: ldstr "Hello"    // Load the string "Hello"
IL_0005: stloc.0          // Pop the value into variable V_0
IL_0006: ldloc.0          // Push V_0 onto the stack
IL_0007: ldstr " There"   // Load the string " There"
// Call the String::Concat() method
IL_000c: call string [mscorlib]System.String::Concat(string, string)
IL_0011: stloc.1          // Pop the return value into V_1
IL_0012: ldloc.1          // Push V_1 onto the stack
IL_0013: ldstr " John"    // Load the string " John"
// Call String::Concat() again
IL_0018: call string [mscorlib]System.String::Concat(string, string)
As well as the three hard-coded strings, two strings are created by the
calls to String::Concat(): each change to the string results in a new String
object being allocated on the heap.
It's possible to get round the immutability of strings by using a
StringBuilder object (this resides in the System.Text namespace, and is
covered in detail in Chapter 26). The StringBuilder represents a mutable
sequence of characters, and it can therefore be more efficient than strings
if we're performing a lot of concatenation. The equivalent C# code using a
StringBuilder would be:
StringBuilder sb = new StringBuilder("Hello");
sb.Append(" There");
sb.Append(" John");
This compiles to the IL code:
.locals init (class [mscorlib]System.Text.StringBuilder V_0)
IL_0000: ldstr "Hello"
IL_0005: newobj instance void [mscorlib]System.Text.StringBuilder::.ctor(string)
IL_000a: stloc.0
IL_000b: ldloc.0
IL_000c: ldstr " There"
IL_0011: callvirt instance class [mscorlib]System.Text.StringBuilder
         [mscorlib]System.Text.StringBuilder::Append(string)
IL_0016: pop
IL_0017: ldloc.0
IL_0018: ldstr " John"
IL_001d: callvirt instance class [mscorlib]System.Text.StringBuilder
         [mscorlib]System.Text.StringBuilder::Append(string)
In this case, we only have the three hard-coded strings, plus our single
instance of the StringBuilder class: the calls to StringBuilder::Append()
don't result in new instances of the StringBuilder. While there's no
perceptible performance gain in this instance, the difference is startling
when concatenation is repeated many times: in a test, 50,000 iterations of a
simple concatenation took on average 18 milliseconds with the StringBuilder,
but 423 milliseconds with standard strings!
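The details of that test are not given here, so the following is only a
sketch of how such a comparison might be set up. The loop body, the
appended text, and the use of System.Diagnostics.Stopwatch are all
assumptions, and the timings will vary from machine to machine.

```csharp
using System;
using System.Diagnostics;
using System.Text;

static class ConcatBenchmark
{
    // Repeated concatenation with ordinary strings.
    public static string WithString(int iterations)
    {
        string s = "";
        for (int i = 0; i < iterations; i++)
        {
            s += "x"; // allocates a new String on every pass
        }
        return s;
    }

    // The same work using a single StringBuilder.
    public static string WithStringBuilder(int iterations)
    {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < iterations; i++)
        {
            sb.Append("x"); // appends to the same StringBuilder instance
        }
        return sb.ToString();
    }

    static void Main()
    {
        const int iterations = 50000;

        Stopwatch watch = Stopwatch.StartNew();
        string a = WithString(iterations);
        watch.Stop();
        Console.WriteLine("string:        " + watch.ElapsedMilliseconds + " ms");

        watch = Stopwatch.StartNew();
        string b = WithStringBuilder(iterations);
        watch.Stop();
        Console.WriteLine("StringBuilder: " + watch.ElapsedMilliseconds + " ms");

        Console.WriteLine(a == b); // True: both approaches build the same text
    }
}
```

A single run like this is a rough indication only; averaging several runs,
as the figures quoted above do, gives a steadier comparison.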