UTF-8

Posted on Jul 17, 2014

UTF-8 (UCS Transformation Format, 8-bit) is a variable-width encoding that can represent every character in the Unicode character set, using one to four bytes per character. It was designed for backward compatibility with ASCII and to avoid the complications of endianness and byte order marks that arise in UTF-16 and UTF-32. (Note added by Jan: endianness is the convention used to interpret the order of the bytes making up a data word when those bytes are stored in computer memory.) UTF-8 has become the dominant character encoding for the World Wide Web, accounting for more than half of all Web pages. The Internet Mail Consortium (IMC) recommends that all e-mail programs be able to display and create mail using UTF-8. UTF-8 is also increasingly used as the default character encoding in operating systems, programming languages, APIs, and software applications.
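
To make "variable-width" and "backward compatible with ASCII" concrete, here is a minimal Python sketch. The helper name utf8_lengths and the sample characters are chosen only for illustration and do not come from the referenced article; the encode calls are the standard str.encode method.

```python
def utf8_lengths(text):
    """Pair each character with the number of bytes in its UTF-8 encoding."""
    return [(ch, len(ch.encode("utf-8"))) for ch in text]

# Illustrative sample: U+0041 (A), U+00E9 (e with acute accent),
# U+20AC (euro sign), U+1F600 (an emoji outside the Basic Multilingual Plane).
sample = "A\u00e9\u20ac\U0001F600"
lengths = utf8_lengths(sample)  # byte counts are 1, 2, 3 and 4

# ASCII compatibility: pure ASCII text yields identical bytes in both encodings.
ascii_text = "hello"
same_bytes = ascii_text.encode("utf-8") == ascii_text.encode("ascii")

# Endianness: UTF-16 has two byte orders for the same character,
# whereas UTF-8 has a single byte sequence and needs no byte order mark.
utf16_le = "A".encode("utf-16-le")  # low byte first
utf16_be = "A".encode("utf-16-be")  # high byte first
utf8_a = "A".encode("utf-8")        # one byte, no ordering question
```

Because a UTF-8 byte sequence is defined byte by byte rather than in 16-bit or 32-bit words, the same bytes are valid on any machine, which is the endianness problem the encoding sidesteps.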

Ref.: Wikipedia – UTF-8
