Understanding character encoding is crucial for any developer. This guide traces the evolution from ASCII to Unicode and explains how modern systems handle text.
ASCII, first standardized in the 1960s, was one of the earliest widely adopted text encodings and laid the foundation for all modern text systems.
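ASCII is a 7-bit code (values 0–127): English letters, digits, basic symbols, and control codes.

    'A'.charCodeAt(0); // 65
    'a'.charCodeAt(0); // 97
    '0'.charCodeAt(0); // 48

As computing spread globally, ASCII's limitations became apparent: 128 values cannot cover accented letters or non-Latin scripts, so vendors and regions defined their own mutually incompatible extensions and national encodings.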
The same file could display correctly in one system but show gibberish in another, making international communication through computers extremely problematic.
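The effect is easy to reproduce. The following is a minimal sketch, assuming a runtime whose `TextDecoder` supports the legacy labels `windows-1252` and `windows-1251` (browsers do, and so do Node.js builds with full ICU). It decodes the same single byte under two different code pages:

```javascript
// One byte, two legacy code pages: the value 0xE9 has no fixed meaning by itself.
const bytes = new Uint8Array([0xe9]);

// Assumption: the runtime ships these legacy decoders (full ICU in Node.js).
const western = new TextDecoder("windows-1252").decode(bytes);  // Western European
const cyrillic = new TextDecoder("windows-1251").decode(bytes); // Cyrillic

console.log(western);  // é
console.log(cyrillic); // й
```

Nothing in the byte itself says which reading is correct, which is why files exchanged between systems with different default code pages turned into gibberish.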
Unicode solved the chaos by creating a unified system that assigns every character a single number, its code point, in the range U+0000 – U+10FFFF (~1.1 million code points).

    'A'.codePointAt(0); // 65 (U+0041)
    '你'.codePointAt(0); // 20320 (U+4F60)
    '🎉'.codePointAt(0); // 127881 (U+1F389)

Unicode defines the numbers (code points), but UTF encodings define how to store them in bytes:

| Encoding | Bytes per code point | Notes |
|---|---|---|
| UTF-8 | 1–4 | Compact, ASCII-compatible (web standard) |
| UTF-16 | 2 or 4 | Used by JS, Java, Windows |
| UTF-32 | 4 | Fixed length, large memory use |
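To make the "1–4 bytes" row concrete, here is a rough sketch of how UTF-8 packs a code point into bytes. The helper name `encodeUtf8` is hypothetical and written only for illustration; real code should use the built-in `TextEncoder`, which the sketch uses as a cross-check.

```javascript
// Hypothetical helper: encode one Unicode code point as UTF-8 bytes by hand.
// Surrogate values (U+D800 to U+DFFF) are not valid input and are not checked here.
function encodeUtf8(codePoint) {
  if (codePoint < 0x80) {
    return [codePoint]; // 0xxxxxxx
  }
  if (codePoint < 0x800) {
    // 110xxxxx 10xxxxxx
    return [0xc0 | (codePoint >> 6), 0x80 | (codePoint & 0x3f)];
  }
  if (codePoint < 0x10000) {
    // 1110xxxx 10xxxxxx 10xxxxxx
    return [
      0xe0 | (codePoint >> 12),
      0x80 | ((codePoint >> 6) & 0x3f),
      0x80 | (codePoint & 0x3f),
    ];
  }
  // 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
  return [
    0xf0 | (codePoint >> 18),
    0x80 | ((codePoint >> 12) & 0x3f),
    0x80 | ((codePoint >> 6) & 0x3f),
    0x80 | (codePoint & 0x3f),
  ];
}

// Cross-check against the platform's built-in UTF-8 encoder.
const expected = Array.from(new TextEncoder().encode("你"));
console.log(encodeUtf8("你".codePointAt(0)), expected); // both [228, 189, 160]
```

The first branch is why UTF-8 is ASCII-compatible: any code point below 128 is stored as the same single byte ASCII would use.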
The UTF-8 sizes of a few characters:

    'A'  // 1 byte  (ASCII range)
    'é'  // 2 bytes (accented Latin, U+00E9)
    '你' // 3 bytes (CJK characters)
    '🎉' // 4 bytes (Emoji)

JavaScript has some unique characteristics due to its UTF-16 foundation:
charCodeAt() returns a UTF-16 code unit (0–65535), so a character outside the Basic Multilingual Plane occupies two code units, called a surrogate pair.

    // Basic characters
    "A".charCodeAt(0); // 65 (same as ASCII)
    "你".charCodeAt(0); // 20320 (U+4F60)

    // Emoji requiring surrogate pairs
    "🎉".length; // 2 (two UTF-16 code units)
    "🎉".charCodeAt(0); // 55356 (high surrogate, 0xD83C)
    "🎉".charCodeAt(1); // 57225 (low surrogate, 0xDF89)

    // Use codePointAt() for proper Unicode handling
    "🎉".codePointAt(0); // 127881 (actual Unicode code point)

The Evolution Path: ASCII → Local chaos → Unicode (unified IDs) → UTF-8/16/32 (storage formats)
For modern developers, understanding this evolution helps explain why we sometimes encounter encoding issues and how to prevent them in modern applications.
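As one way to apply this in JavaScript, the sketch below works in code points instead of UTF-16 code units. All three helper names are hypothetical and exist only for illustration. The code also does not handle grapheme clusters (for example, a letter followed by a combining accent), which need more than code-point iteration.

```javascript
// Hypothetical helpers; the names are illustrative, not from any library.

// Count Unicode code points rather than UTF-16 code units.
// Array.from uses the string iterator, which yields whole code points.
function codePointLength(text) {
  return Array.from(text).length;
}

// Split a supplementary code point (above U+FFFF) into its UTF-16 surrogate pair.
function toSurrogatePair(codePoint) {
  const offset = codePoint - 0x10000;
  const high = 0xd800 + (offset >> 10);   // top 10 bits
  const low = 0xdc00 + (offset & 0x3ff);  // bottom 10 bits
  return [high, low];
}

// Keep the first `count` code points without cutting a surrogate pair in half.
function truncateByCodePoints(text, count) {
  return Array.from(text).slice(0, count).join("");
}

const party = "🎉";
const [high, low] = toSurrogatePair(party.codePointAt(0));

console.log(party.length);                   // 2
console.log(codePointLength(party));         // 1
console.log(high === party.charCodeAt(0));   // true
console.log(low === party.charCodeAt(1));    // true
console.log(truncateByCodePoints("a🎉b", 2)); // a🎉
```

A plain `slice(0, 2)` on `"a🎉b"` would keep the `a` plus only the high surrogate, leaving an unpaired code unit in the string, which is the kind of bug that working in code points avoids.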