ASCII and Unicode || Kenan Hançer Blog

Unicode is a global standard for character encoding and is the most commonly used character set today.

Basic ASCII Character Set

The basic ASCII character set uses 7 bits for each character, which gives 128 possible codes (0 to 127):

2⁷ = 128
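As a quick illustration of the 7-bit limit, here is a minimal Python sketch. The helper name `is_ascii` is made up for this example; it simply checks that every character's code is below 2⁷.

```python
# Basic ASCII uses 7 bits, so the valid codes are 0..127.
ascii_size = 2 ** 7


def is_ascii(text: str) -> bool:
    """Return True if every character's code point is below 128."""
    return all(ord(ch) < ascii_size for ch in text)


print(ascii_size)                # 128
print(ord("A"), bin(ord("A")))   # 65 0b1000001 (fits in 7 bits)
print(is_ascii("Hello"))         # True
print(is_ascii("Hançer"))        # False, "ç" is outside basic ASCII
```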

Extended ASCII Character Set

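The Extended ASCII character set uses 8 bits per character, adding 128 more codes (128 to 255) for a total of 256. Unlike basic ASCII, it is not a single standard: the meaning of the upper 128 codes depends on the code page in use (for example ISO 8859-1 or Windows-1252).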

2⁸ = 256
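The sketch below shows the arithmetic and, as an assumption-laden example, how one byte above 127 can decode to different characters. The three code pages are just ones that ship with Python's standard codecs, not ones named in this post.

```python
extended_size = 2 ** 8                  # 256 codes in total
extra_codes = extended_size - 2 ** 7    # 128 codes beyond basic ASCII

raw = bytes([0xE7])  # a single byte with the high (8th) bit set

# Decode the same byte with three different 8-bit code pages.
decoded = {name: raw.decode(name) for name in ("latin-1", "cp437", "iso8859_5")}

print(extended_size, extra_codes)
for name, ch in decoded.items():
    print(name, repr(ch))
```

Each code page gives a different character for the same byte, which is why 8-bit text is only readable if you know which code page produced it.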

Unicode

Unicode is the standard for representing the characters of all the world's languages. It assigns each character a unique number, called a code point, in the range U+0000 to U+10FFFF.

ASCII is a subset of Unicode: the first 128 Unicode code points are identical to the ASCII characters.
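The subset relationship can be checked directly. This is a small sketch; the function name `ascii_matches_unicode` is only for illustration.

```python
def ascii_matches_unicode() -> bool:
    """Check that codes 0..127 mean the same thing in ASCII, Unicode and UTF-8."""
    for code in range(128):
        byte = bytes([code])
        # Decoding the ASCII byte must give the Unicode code point with the same number.
        if byte.decode("ascii") != chr(code):
            return False
        # Encoding that code point as UTF-8 must give back the same single byte.
        if chr(code).encode("utf-8") != byte:
            return False
    return True


print(ascii_matches_unicode())  # True
```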

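The Unicode standard defines three encoding forms: UTF-8, UTF-16 and UTF-32. All three can represent every Unicode code point; they differ in the size of the code unit.

UTF-8 uses 8-bit code units and takes 1 to 4 bytes per character. ASCII characters take a single byte, so plain ASCII text is already valid UTF-8 (popular encoding used on the web).

UTF-16 uses 16-bit code units. A single unit covers the 65,536 code points of the Basic Multilingual Plane; characters outside it take two units, called a surrogate pair (used by Java and Windows).

UTF-32 uses one 32-bit code unit per character. A 32-bit unit could hold 4,294,967,296 values, but Unicode only uses code points up to U+10FFFF, which is enough for all known languages (UTF-8 is the usual text encoding on Linux and various Unix systems, where UTF-32 commonly appears as the in-memory wide-character type).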
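To compare the three encoding forms, the following sketch counts how many bytes a few sample characters take in each one. The helper `encoded_sizes` and the sample characters are chosen for illustration only.

```python
def encoded_sizes(ch: str) -> dict:
    """Return the number of bytes `ch` occupies in each encoding form."""
    # The "-le" codec variants are used so Python does not prepend a byte order mark.
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16": len(ch.encode("utf-16-le")),
        "utf-32": len(ch.encode("utf-32-le")),
    }


for ch in ["A", "ç", "€", "😀"]:
    print(f"U+{ord(ch):04X}", encoded_sizes(ch))
```

"A" takes 1, 2 and 4 bytes respectively, while the emoji takes 4 bytes in all three: UTF-8 and UTF-16 grow with the code point, and UTF-32 is always 4 bytes.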

Unicode advantages over ASCII

Many more languages, including all modern ones, can be represented in a single character set.

Documents are more portable, because each character has a unique code point in Unicode regardless of platform or language.
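As a rough illustration of both points, the sketch below puts three scripts in one string and checks which encodings can carry it. The sample text and the helper `can_encode` are assumptions made for this example.

```python
def can_encode(text: str, encoding: str) -> bool:
    """Return True if `text` can be represented in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


text = "Hançer Ελληνικά 日本語"  # Latin, Greek and Japanese in one string

print(can_encode(text, "ascii"))    # False
print(can_encode(text, "utf-8"))    # True

# The encoded bytes decode back to exactly the same text.
print(text.encode("utf-8").decode("utf-8") == text)  # True
```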
