Welcome to the Emoji Table Viewer
This guide explains how Unicode works and how to read the emoji data in this table. Understanding these concepts will help you explore the fascinating technical side of emojis!
What is Unicode?
Unicode is like a giant dictionary that assigns a unique number to every character, symbol, and emoji it covers. This ensures that a 😀 sent from an iPhone in Tokyo is still recognized as the same 😀 on a computer in New York, even though each platform draws it in its own artwork style.
Before Unicode, different computer systems used different character sets, which often led to garbled text when sharing files between systems.
Code Points Explained
Every Unicode character has a code point: its unique identification number, written in the format U+XXXX. The "U+" means "Unicode", and the digits after it are the character's ID in hexadecimal. A code point has four to six hex digits, and most emoji use five (for example U+1F600).
Examples:
- 😀 = U+1F600 (grinning face)
- ❤️ = U+2764 (red heart; the emoji form usually adds the variation selector U+FE0F)
- 🎉 = U+1F389 (party popper)
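As a rough illustration of the U+XXXX notation, the Python sketch below converts a character to its code point label and back. The helper names to_code_point and from_code_point are made up for this guide and are not part of the viewer.

```python
def to_code_point(char: str) -> str:
    """Format one character's code point as U+XXXX (padded to at least 4 hex digits)."""
    return f"U+{ord(char):04X}"


def from_code_point(label: str) -> str:
    """Turn a label such as 'U+1F600' back into the character it identifies."""
    return chr(int(label[2:], 16))


# Escapes are used instead of literal emoji so the source stays unambiguous.
print(to_code_point("\U0001F600"))  # U+1F600 (grinning face)
print(to_code_point("\u2764"))      # U+2764 (red heart, without the variation selector)
print(to_code_point("\U0001F389"))  # U+1F389 (party popper)
print(from_code_point("U+1F600") == "\U0001F600")  # True
```

Note that ord() accepts exactly one code point, so an emoji made of several code points has to be processed one character at a time.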
UTF-8 Encoding
A code point is just a number. To store or send it, a computer has to encode it as bytes (the computer's language). UTF-8 is the most widely used encoding for this conversion.
The UTF-8 column shows these byte sequences. Each \xNN stands for one byte written as two hexadecimal digits. A single code point takes 1 to 4 bytes depending on how large its number is, and emojis built from several code points need more bytes!
Examples:
- 😀 → \xF0\x9F\x98\x80 (4 bytes)
- © → \xC2\xA9 (2 bytes)
- A → \x41 (1 byte)
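The byte sequences above can be reproduced with Python's built-in str.encode. The sketch below is only an illustration: utf8_escaped and utf8_length are hypothetical helper names, and the viewer itself may compute this column differently. The second helper shows the standard UTF-8 rule that the byte count depends on the code point's range.

```python
def utf8_escaped(text: str) -> str:
    """Render the UTF-8 bytes of text in \\xNN notation, one group per byte."""
    return "".join(f"\\x{byte:02X}" for byte in text.encode("utf-8"))


def utf8_length(code_point: int) -> int:
    """Number of UTF-8 bytes needed for a single code point."""
    if code_point < 0x80:       # U+0000..U+007F (ASCII)
        return 1
    if code_point < 0x800:      # U+0080..U+07FF
        return 2
    if code_point < 0x10000:    # U+0800..U+FFFF
        return 3
    return 4                    # U+10000..U+10FFFF (most emoji)


print(utf8_escaped("\U0001F600"))  # \xF0\x9F\x98\x80
print(utf8_escaped("\u00A9"))      # \xC2\xA9
print(utf8_escaped("A"))           # \x41
print(utf8_length(0x1F600), utf8_length(0xA9), utf8_length(0x41))  # 4 2 1
```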
Simple vs Complex Characters
Some emojis are made from multiple code points combined together. The Count column shows how many code points make up each emoji:
Simple emojis (1 code point):
- 😀 = 1 code point = 4 bytes
- 🎯 = 1 code point = 4 bytes
Skin tone variants (2 code points):
- 👋🏻 = base wave + light skin tone = 2 code points = 8 bytes
- 🤝🏿 = base handshake + dark skin tone = 2 code points = 8 bytes
Complex sequences (3+ code points):
- 👨‍👩‍👧 = man + zero-width joiner + woman + zero-width joiner + girl = 5 code points = 18 bytes
- Subdivision flags such as England's = 🏴 (black flag) + 5 invisible tag characters + a cancel tag = 7 code points = 28 bytes
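The counts in this section can be checked with a few lines of Python, where a string is a sequence of code points, so len() matches the Count column. This is a minimal sketch. The function names and sample labels are chosen for this guide only, and the emoji are written as escapes so that invisible joiners and tag characters are not lost.

```python
def count_code_points(text: str) -> int:
    """len() of a Python string counts code points, not visible glyphs."""
    return len(text)


def count_utf8_bytes(text: str) -> int:
    """Size of the text once encoded as UTF-8."""
    return len(text.encode("utf-8"))


samples = {
    "grinning face": "\U0001F600",
    "waving hand, light skin tone": "\U0001F44B\U0001F3FB",
    # man + ZWJ (U+200D) + woman + ZWJ + girl
    "family": "\U0001F468\u200D\U0001F469\u200D\U0001F467",
    # black flag + tag letters g, b, e, n, g + cancel tag
    "flag of England": "\U0001F3F4\U000E0067\U000E0062\U000E0065\U000E006E\U000E0067\U000E007F",
}

for label, emoji in samples.items():
    print(f"{label}: {count_code_points(emoji)} code points, {count_utf8_bytes(emoji)} bytes")
# grinning face: 1 code points, 4 bytes
# waving hand, light skin tone: 2 code points, 8 bytes
# family: 5 code points, 18 bytes
# flag of England: 7 code points, 28 bytes
```

The zero-width joiner takes only 3 bytes, which is why the family sequence is 18 bytes and not 20.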
Reading the Table
Each column provides different information about the emoji:
- Code Point: The Unicode identifier (U+XXXX format)
- Count: How many code points make up this emoji
- Character: The actual emoji (click to copy!)
- UTF-8: The byte sequence computers use to store it
- Bytes: How many bytes the emoji takes up when stored as UTF-8
- Name: The official Unicode name
💡 Tip: Click any column header to sort the data! Try sorting by bytes to see the most complex emojis.
Detailed Examples
Example 1: Simple Emoji
😀
grinning face
Code Point: U+1F600
Count: 1 (just one code point)
UTF-8: \xF0\x9F\x98\x80
Bytes: 4 (typical for a single-code-point emoji)
Example 2: Skin Tone Variant
👋🏽
waving hand: medium skin tone
Code Points: U+1F44B U+1F3FD
Count: 2 (base gesture + skin tone modifier)
UTF-8: \xF0\x9F\x91\x8B\xF0\x9F\x8F\xBD
Bytes: 8 (4 bytes per code point)
Example 3: Complex Sequence
👨‍💻
man technologist
Code Points: U+1F468 U+200D U+1F4BB
Count: 3 (man + zero-width joiner + laptop)
UTF-8: \xF0\x9F\x91\xA8\xE2\x80\x8D\xF0\x9F\x92\xBB
Bytes: 11 (4 for the man + 3 for the zero-width joiner + 4 for the laptop)
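To tie the columns together, here is a hedged Python sketch that builds one table row for any emoji. The describe function and its dictionary keys are assumptions made for this guide, not the viewer's actual code. The Name value also differs from the table: it joins the per-code-point character names from Python's unicodedata module, whereas the table shows emoji names such as "man technologist".

```python
import unicodedata


def describe(emoji: str) -> dict:
    """Build a row with the same columns as the table (hypothetical layout)."""
    data = emoji.encode("utf-8")
    return {
        "Code Point": " ".join(f"U+{ord(ch):04X}" for ch in emoji),
        "Count": len(emoji),
        "Character": emoji,
        "UTF-8": "".join(f"\\x{byte:02X}" for byte in data),
        "Bytes": len(data),
        # Character names per code point; "UNKNOWN" if this Python build lacks one.
        "Name": " + ".join(unicodedata.name(ch, "UNKNOWN") for ch in emoji),
    }


# Example 3: man + zero-width joiner + laptop
row = describe("\U0001F468\u200D\U0001F4BB")
for column, value in row.items():
    print(f"{column}: {value}")
```

Running it on Example 3 should report 3 code points and 11 bytes, matching the values listed above.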