A string is a sequence of bytes that may represent characters. Within a string, all the characters share a common coding representation. In some cases, it might be necessary to convert these characters to a different coding representation. This process is known as character conversion. (10)
Character conversion can occur when an SQL statement is executed remotely. Consider, for example, these two cases:
In either case, the string could have a different representation at the sending and receiving systems. Conversion can also occur during string operations on the same system.
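As a rough illustration of what conversion between a sending and a receiving system involves, the following Python sketch re-encodes a byte string from one code page to another. The `convert` helper and the choice of `cp037` (an EBCDIC code page) and `cp1252` are assumptions made for this example. The database manager performs the equivalent step itself, and this is not its actual implementation.

```python
def convert(data: bytes, source_page: str, target_page: str) -> bytes:
    """Re-encode bytes from one code page to another (hypothetical helper)."""
    # Interpret the bytes using the sender's code page ...
    text = data.decode(source_page)
    # ... then represent the same characters in the receiver's code page.
    return text.encode(target_page)


# The sender stores "ABC" in EBCDIC code page 037.
sent = "ABC".encode("cp037")

# The receiver expects Windows code page 1252.
received = convert(sent, "cp037", "cp1252")

print(sent.hex(), "->", received.hex())
```

The characters are unchanged by the conversion; only the bytes that represent them differ.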
The following list defines some of the terms used when discussing character conversion.
The following example shows how a typical character set might map to different code points in two different code pages.
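The same idea can be checked with a short Python sketch that looks up the code point assigned to one character in two code pages. The `code_points` helper and the two code pages chosen (`cp850` and `cp1252`, as named by Python's codec library) are assumptions for illustration only.

```python
def code_points(char: str, code_pages: list[str]) -> dict[str, int]:
    """Return the single-byte code point of char in each code page (hypothetical helper)."""
    return {page: char.encode(page)[0] for page in code_pages}


# U+00E9 is LATIN SMALL LETTER E WITH ACUTE.
points = code_points("\u00e9", ["cp850", "cp1252"])

for page, value in points.items():
    print(f"{page}: 0x{value:02X}")
```

One character is assigned a different code point in each of the two code pages.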
Even with the same encoding scheme, there are many different coded character sets, and the same code point can represent a different character in different coded character sets. Furthermore, a byte in a character string does not necessarily represent a character from a single-byte character set (SBCS). Character strings are also used for mixed and bit data. Mixed data is a mixture of single-byte, double-byte, or multi-byte characters. Bit data (columns defined as FOR BIT DATA, BLOBs, or binary strings) is not associated with any character set.
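A minimal Python sketch of these two points follows. It decodes one byte under two coded character sets, then encodes a mixed string to show that the byte count and the character count can differ. The `interpretations` helper and the code pages used (`cp1252`, `cp437`, and `shift_jis`) are assumptions chosen for illustration.

```python
def interpretations(raw: bytes, code_pages: list[str]) -> dict[str, str]:
    """Decode the same bytes under several code pages (hypothetical helper)."""
    return {page: raw.decode(page) for page in code_pages}


# One code point, two different characters.
same_byte = interpretations(b"\xe9", ["cp1252", "cp437"])
for page, char in same_byte.items():
    print(f"0xE9 in {page}: U+{ord(char):04X}")

# Mixed data: a single-byte character followed by a double-byte character.
mixed = "a\u3042".encode("shift_jis")
print(len("a\u3042"), "characters,", len(mixed), "bytes")

# Bit data has no character set, so it is kept as raw bytes and never decoded.
bit_data = bytes([0x00, 0xFF, 0xE9])
```

Because bit data carries no code page, no conversion applies to it.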
The database manager determines code page attributes for all character strings when an application is bound to a database. The potential code page attributes are:
Character string code page attributes are as follows:
A set of rules is used to determine the code page attributes for operations that combine string objects, such as the results of scalar operations, concatenation, or set operations. At execution time, code page attributes are used to determine any requirements for code page conversions of strings.
For more details on character conversion, see:
(10) Character conversion, when required, is automatic and is transparent to the application when it is successful. A knowledge of conversion is therefore unnecessary when all the strings involved in a statement's execution are represented in the same way. This is frequently the case for stand-alone installations and for networks within the same country. Thus, for many readers, character conversion may be irrelevant.