Resources

Unicode

Unicode is a standard for the consistent encoding, representation, and handling of text expressed in most of the world's writing systems. In the end, it is a table that maps each letter, emoji, character, or symbol to a number; this number is called a code point.

Multibyte Characters

A multibyte character is a character whose encoding requires more than 1 byte. Ordinary strings (arrays of char) can contain multibyte characters, which makes them multibyte strings.

Wide Characters

A wide character is a computer character datatype that generally has a size greater than the traditional 8-bit (1-byte) character.

Encodings

An encoding tells us how to represent a code point in memory. There are many Unicode encodings:

UTF-8