Unicode Converter


About Unicode

Unicode is a universal character encoding standard that assigns a unique number, called a code point, to each character it encodes. As of version 13.0 it defines more than 143,000 characters, including emoji, mathematical symbols, and characters from ancient scripts. This tool converts between text and the Unicode representations commonly used in programming and web development.

What is Unicode?

Unicode is a universal character encoding standard that provides a unique number (called a "code point") for every character in virtually every writing system in the world. First published in 1991, the standard had grown to over 143,000 characters covering 154 scripts, plus symbols and emoji, by version 13.0 (2020), and later versions continue to add more.

Each Unicode character is identified by a code point written as U+XXXX, where XXXX is the code point's value in hexadecimal, using four to six digits (U+0000 through U+10FFFF). For example, the letter "A" is U+0041, the Euro sign "€" is U+20AC, and the emoji "😀" is U+1F600.
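
The U+XXXX notation can be produced and parsed in a few lines. The following is a minimal Python sketch, not this tool's actual implementation; the function names `to_code_points` and `from_code_points` are hypothetical, and the parser assumes whitespace-separated tokens that each start with "U+".

```python
def to_code_points(text: str) -> list[str]:
    """Format each character as U+XXXX (uppercase hex, at least four digits)."""
    return [f"U+{ord(ch):04X}" for ch in text]


def from_code_points(notation: str) -> str:
    """Decode whitespace-separated U+XXXX tokens back into text."""
    chars = []
    for token in notation.split():
        if not token.upper().startswith("U+"):
            raise ValueError(f"expected a U+XXXX token, got {token!r}")
        # Strip the "U+" prefix and read the rest as a hexadecimal number.
        chars.append(chr(int(token[2:], 16)))
    return "".join(chars)


print(to_code_points("A€😀"))  # ['U+0041', 'U+20AC', 'U+1F600']
print(from_code_points("U+0041 U+20AC U+1F600"))  # A€😀
```

Padding to four digits with `:04X` matches the convention in the examples above: short values gain leading zeros, while code points beyond U+FFFF simply use five or six digits.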

This free online tool converts text to various Unicode representations and can decode Unicode back to readable text. It supports code points, decimal, hexadecimal, HTML entities, CSS escapes, and JavaScript escape sequences.

Supported Formats

Unicode Code Points (U+XXXX)

The standard way to represent Unicode characters. Example: A = U+0041

Used in documentation and standards

Decimal

The base-10 numeric value of the code point. Example: A = 65

Used in programming and character tables

Hexadecimal (0xXXXX)

The base-16 value of the code point with a 0x prefix. Example: A = 0x41

Common in programming languages

HTML Entities

Decimal numeric character references for HTML. Example: A = &#65;

Used in HTML/XML documents

CSS Escape

Backslash followed by the code point in hex (one to six digits). Example: A = \41

Used in CSS content property

JavaScript Escape

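\u followed by exactly 4 hex digits. Example: A = \u0041. Code points above U+FFFF need either a surrogate pair (two \u escapes) or the \u{...} form, for example \u{1F600}.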

Used in JavaScript strings
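
The six formats above all derive from the same integer. As a rough sketch (the function name `unicode_formats` and the dictionary keys are illustrative choices, not part of this tool), a single character could be rendered in every format like this:

```python
def unicode_formats(ch: str) -> dict[str, str]:
    """Render one character in each of the formats listed above."""
    cp = ord(ch)  # the code point as an integer
    if cp <= 0xFFFF:
        js = f"\\u{cp:04X}"      # classic 4-digit escape
    else:
        js = f"\\u{{{cp:X}}}"    # ES2015 code point escape, e.g. \u{1F600}
    return {
        "code_point": f"U+{cp:04X}",
        "decimal": str(cp),
        "hex": f"0x{cp:X}",
        "html_entity": f"&#{cp};",
        "css_escape": f"\\{cp:X}",
        "js_escape": js,
    }


for name, value in unicode_formats("A").items():
    print(f"{name}: {value}")
```

For "A" this prints U+0041, 65, 0x41, &#65;, \41, and \u0041, matching the examples in each entry. The sketch uses the `\u{...}` form for characters outside the 4-digit range; emitting a surrogate pair instead is an equally valid design choice.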

Common Unicode Ranges

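Basic Latin: U+0000 - U+007F
Latin Extended (Latin-1 Supplement, Latin Extended-A and -B): U+0080 - U+024F
Greek: U+0370 - U+03FF
Cyrillic: U+0400 - U+04FF
Arabic: U+0600 - U+06FF
CJK Unified Ideographs: U+4E00 - U+9FFF
Currency Symbols: U+20A0 - U+20CF
Emoji (most common; some emoji fall outside this range): U+1F300 - U+1F9FF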
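
A range table like this one maps directly onto a lookup. The sketch below hard-codes only the eight ranges listed above, so it is deliberately incomplete; `RANGES` and `range_name` are hypothetical names, and the labels are shortened for readability.

```python
from typing import Optional

# (first code point, last code point, label), taken from the list above.
RANGES = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x024F, "Latin Extended"),
    (0x0370, 0x03FF, "Greek"),
    (0x0400, 0x04FF, "Cyrillic"),
    (0x0600, 0x06FF, "Arabic"),
    (0x20A0, 0x20CF, "Currency Symbols"),
    (0x4E00, 0x9FFF, "CJK Unified"),
    (0x1F300, 0x1F9FF, "Emoji"),
]


def range_name(ch: str) -> Optional[str]:
    """Return the label of the listed range containing ch, or None."""
    cp = ord(ch)
    for first, last, label in RANGES:
        if first <= cp <= last:
            return label
    return None  # not covered by this short table


for ch in "AéΩЯ€中😀":
    print(f"U+{ord(ch):04X} {ch}: {range_name(ch)}")
```

Characters outside these eight ranges return `None`. A complete lookup would need the full block list published with the Unicode Character Database.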

Frequently Asked Questions

What's the difference between Unicode and UTF-8?

Unicode is the standard that defines code points for characters. UTF-8 is an encoding scheme that determines how those code points are stored as bytes. UTF-8 is variable-length, using one to four bytes per code point, and is the most common encoding on the web.
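
The variable length is easy to observe by encoding a few characters. This is a small Python illustration (the helper name `utf8_hex` is made up for the example):

```python
def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of ch as space-separated uppercase hex."""
    return ch.encode("utf-8").hex(" ").upper()


for ch in "Aé€😀":
    encoded = utf8_hex(ch)
    byte_count = len(ch.encode("utf-8"))
    print(f"U+{ord(ch):04X} -> {byte_count} byte(s): {encoded}")
# U+0041 -> 1 byte(s): 41
# U+00E9 -> 2 byte(s): C3 A9
# U+20AC -> 3 byte(s): E2 82 AC
# U+1F600 -> 4 byte(s): F0 9F 98 80
```

The code point stays the same regardless of encoding; only the byte sequence differs between UTF-8, UTF-16, and UTF-32.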

Why do some characters need surrogate pairs in JavaScript?

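JavaScript strings are sequences of 16-bit UTF-16 code units, and a single code unit can only represent code points up to U+FFFF. Characters above this (like many emoji) are stored as two code units called a surrogate pair: a high surrogate in the range U+D800 to U+DBFF followed by a low surrogate in the range U+DC00 to U+DFFF. This tool handles these automatically.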
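
The surrogate pair for a code point follows from standard UTF-16 arithmetic: subtract 0x10000, then split the remaining 20 bits into two 10-bit halves. The sketch below shows that calculation in Python rather than JavaScript; the function name `surrogate_pair` is hypothetical, and the code makes no claim about how this tool does it internally.

```python
def surrogate_pair(cp: int) -> tuple[int, int]:
    """Split a code point above U+FFFF into its UTF-16 high and low surrogates."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("only code points U+10000..U+10FFFF use surrogate pairs")
    offset = cp - 0x10000                # 20-bit value
    high = 0xD800 + (offset >> 10)       # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)      # bottom 10 bits
    return high, low


high, low = surrogate_pair(0x1F600)
print(f"\\u{high:04X}\\u{low:04X}")  # \uD83D\uDE00

# Cross-check against Python's own UTF-16 encoder.
assert "😀".encode("utf-16-be") == bytes([0xD8, 0x3D, 0xDE, 0x00])
```

So "😀" (U+1F600) can be written in a JavaScript string as the two escapes \uD83D\uDE00.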

When should I use HTML entities?

HTML entities are useful when you need to include special characters in HTML documents, especially characters that have special meaning in HTML (like <, >, &) or when you're unsure about the document's character encoding.
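
Both cases can be sketched with Python's standard library: `html.escape` covers the characters with special meaning in HTML, and the `xmlcharrefreplace` error handler turns anything outside ASCII into decimal entities for documents whose encoding is uncertain. The wrapper `escape_for_html` and its `ascii_only` parameter are illustrative names, not part of this tool.

```python
import html


def escape_for_html(text: str, ascii_only: bool = False) -> str:
    """Escape HTML-special characters; optionally entity-encode non-ASCII too."""
    escaped = html.escape(text)  # handles &, <, > and quote characters
    if ascii_only:
        # Replace every non-ASCII character with a decimal entity such as &#8364;
        escaped = escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")
    return escaped


print(escape_for_html("1 < 2 & 3"))             # 1 &lt; 2 &amp; 3
print(escape_for_html("5 €", ascii_only=True))  # 5 &#8364;
print(html.unescape("&#65; &lt; &#8364;"))      # A < €
```

When a page is served as UTF-8 with a correct charset declaration, only the first step is usually needed.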

Is my data sent to a server?

No. All conversion is done entirely in your browser using JavaScript. Your data never leaves your device, making this tool completely private.