Internet Character Sets and Encodings

Listed here are commonly used character sets (charsets) and their encoding standards.

                         Encoding standard
Set                      Original     UTF                          Unicode, UCS, ISO-10646
                         (RFC 1345)   (UCS Transformation Format)  (Universal Character Set)
ASCII (7 bit)            US-ASCII     UTF-7 (RFC 2152)             ISO-10646-UCS-Basic
Latin 1 (8 bit)          ISO-8859-1   UTF-8 (RFC 2279)             ISO-10646-Unicode-Latin1
Most languages (16 bit)  -            UTF-16 (RFC 2781)            ISO-10646-UCS-2
All languages (32 bit)   -            -                            ISO-10646-UCS-4
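
To make the differences between these encodings concrete, the following is a minimal Python sketch that encodes one sample string with several of the listed encodings and reports the byte count. The helper name `encoded_length` and the sample string are illustrative assumptions, not part of the original page. Python has no codecs named after the UCS-2 and UCS-4 labels, so `utf-16-be` and `utf-32-be` are used here as stand-ins; for characters in the 16-bit range they should produce the same bytes.

```python
# Illustrative sketch: compare byte counts of one string across encodings.
# SAMPLE and encoded_length are hypothetical names chosen for this example.

SAMPLE = "caf\u00e9"  # "cafe" with e-acute (U+00E9): in Latin 1, not in ASCII


def encoded_length(text, codec):
    """Return the number of bytes text occupies in codec.

    Returns None when the codec cannot represent every character.
    """
    try:
        return len(text.encode(codec))
    except UnicodeEncodeError:
        return None


if __name__ == "__main__":
    # utf-16-be and utf-32-be stand in for UCS-2 and UCS-4 (an assumption);
    # the explicit byte order avoids a leading byte order mark in the count.
    for codec in ("us-ascii", "iso-8859-1", "utf-8", "utf-16-be", "utf-32-be"):
        print(codec, encoded_length(SAMPLE, codec))
```

For this sample, US-ASCII cannot represent the accented letter, ISO-8859-1 needs 4 bytes, UTF-8 needs 5, and the 16 bit and 32 bit forms need 8 and 16 bytes respectively.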


This page changed 2003 May 10.