T-SQL: Data Types (String)
2021-05-23
1. Data Type Categories
- Numerics
- Exact numerics
- Approximate numerics
- Strings
- Character strings
- Unicode character strings
- Binary strings
- Date and time
- Other data types
2. Character strings
- char (fixed-size) and varchar (variable-size)
- text (to be deprecated)
when a UTF-8 enabled collation(归类) is used, these data types store the full range of Unicode character data and use the UTF-8 character encoding. If a non-UTF-8 collation is specified, then these data types store only a subset of characters supported by the corresponding code page of that collation.
翻译: 假如配置不对,就不一定支持全部的 Unicode character。
n 代表什么?
- n defines the string length in bytes (0-8,000). n never defines numbers of characters that can be stored.
- e.g., char(n), varchar(n), varchar(max)
Best Practice
If you use char or varchar, we recommend to:
- Use
char
when the sizes of the column data entries are consistent. - Use
varchar
when the sizes of the column data entries vary considerably. - Use
varchar(max)
when the sizes of the column data entries vary considerably, and the string length might exceed 8,000 bytes.
3. Unicode character strings
- nchar (fixed-size)
- nvarchar (variable-size)
- ntext (to be deprecated)
When a Supplementary Character (SC) enabled collation is used, these data types store the full range of Unicode character data and use the UTF-16 character encoding. If a non-SC collation is specified, then these data types store only the subset of character data supported by the UCS-2 character encoding.
翻译:假如配置是 SC enabled 就是 UTF-16 encoding,否则就是 UCS-2 encoding
n 代表什么?
- n defines the string size in byte-pairs and must be a value from 1 through 4,000.
- max indicates that the maximum storage size is
2^30-1
characters (2 GB) - e.g., nchar(n), nvarchar(n), nvarchar(max)
ISO synonyms
The ISO synonyms for nvarchar are national char varying and national character varying.
sysname
sysname is a system-supplied user-defined data type that is functionally equivalent to nvarchar(128), except that it is not nullable. sysname is used to reference database object names.
Best Practice
If you use nchar or nvarchar, we recommend to:
- Use nchar when the sizes of the column data entries are consistent.
- Use nvarchar when the sizes of the column data entries vary considerably.
- Use nvarchar(max) when the sizes of the column data entries vary considerably, and the string length might exceed 4,000 byte-pairs.
Recommend: Prefix with letter N
Prefix a Unicode character string constants with the letter N to signal UCS-2 or UTF-16 input, depending on whether an SC collation is used or not.
Without the N prefix, the string is converted to the default code page of the database that may not recognize certain characters.
When prefixing a string constant with the letter N, the implicit conversion will result in a UCS-2 or UTF-16 string if the constant to convert does not exceed the max length for the nvarchar string data type (4,000). Otherwise, the implicit conversion will result in a large-value nvarchar(max).
假如 length 超过了 4000,就会转化成 nvarchar(max).
e.g., select N'abcsdfs'
4. Binary strings (Binary data type)
- binary
- varbinary The ANSI SQL synonym for varbinary is binary varying.
- image (to be deprecated)
n 代表什么?
- n defines the storage size and must be a value from 1 through 8,000.
- max indicates that the maximum storage size is
2^31-1
bytes (2 GB) - e.g., binary(n), varbinary(n), varbinary(max)
Convert to binary or varbinary (not safe)
When other data types are converted to binary or varbinary, the data is padded or truncated on the left. Padding is achieved by using hexadecimal zeros.
Conversions between any data type and the binary data types are not guaranteed to be the same between versions of SQL Server.