Unicode Converter

About Unicode

Unicode is a universal character encoding standard that assigns a unique number (code point) to each character across the world's writing systems. As of version 13.0 it defines over 143,000 characters, including emoji, mathematical symbols, and characters from ancient scripts. This tool converts between text and the Unicode representations commonly used in programming and web development.

What is Unicode?

Unicode is a universal character encoding standard that provides a unique number (called a "code point") for every character in virtually every writing system in the world. First published in 1991, Unicode as of version 13.0 includes over 143,000 characters covering 154 scripts, along with symbols and emoji.

Each Unicode character is identified by a code point written in the form U+XXXX, where XXXX is a hexadecimal number of four to six digits. For example, the letter "A" is U+0041, the Euro sign "€" is U+20AC, and the emoji "😀" is U+1F600.
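As a rough illustration of the U+XXXX notation, the sketch below converts a string to its code points. The function name toCodePoints is an assumption made for this example and is not taken from the tool's source.

```javascript
// Convert each character of a string to U+XXXX notation.
// for...of iterates by code point rather than by UTF-16 code unit,
// so characters above U+FFFF are not split in two.
function toCodePoints(text) {
  const result = [];
  for (const ch of text) {
    const hex = ch.codePointAt(0).toString(16).toUpperCase();
    // Pad to at least four digits; longer values are left as they are.
    result.push("U+" + hex.padStart(4, "0"));
  }
  return result;
}

console.log(toCodePoints("A€😀").join(" ")); // U+0041 U+20AC U+1F600
```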

This free online tool converts text to various Unicode representations and can decode Unicode back to readable text. It supports code points, decimal, hexadecimal, HTML entities, CSS escapes, and JavaScript escape sequences.
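Decoding works in the opposite direction. The following is a minimal sketch, assuming the input is a string of U+XXXX tokens separated by anything that is not a hex digit. The name fromCodePoints is hypothetical, and the tool itself may parse its input differently.

```javascript
// Decode a string of U+XXXX tokens back to readable text.
// Assumption: tokens look like U+ followed by one to six hex digits.
// String.fromCodePoint throws a RangeError for values above 0x10FFFF.
function fromCodePoints(notation) {
  const tokens = notation.match(/U\+[0-9A-Fa-f]{1,6}/g) || [];
  return tokens
    .map((token) => String.fromCodePoint(parseInt(token.slice(2), 16)))
    .join("");
}

console.log(fromCodePoints("U+0048 U+0069 U+1F600")); // Hi😀
```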

Supported Formats

Unicode Code Points (U+XXXX)

The standard notation for referring to Unicode characters: U+ followed by the code point in hexadecimal. Example: A = U+0041

Used in documentation and standards

Decimal

The base-10 numeric value of the code point. Example: A = 65

Used in programming and character tables

Hexadecimal (0xXXXX)

Base-16 representation with prefix. Example: A = 0x41

Common in programming languages

HTML Entities

Decimal entities for HTML. Example: A = &#65;

Used in HTML/XML documents

CSS Escape

Backslash followed by the code point in hex (one to six digits). Example: A = \41

Used in CSS content property

JavaScript Escape

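\u followed by 4 hex digits. Example: A = \u0041. Characters above U+FFFF need either two such escapes (a surrogate pair) or the \u{XXXXX} form, for example \u{1F600}.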

Used in JavaScript strings
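The six formats above can be produced from a single code point. The sketch below is one possible way to do it; describeChar and its property names are made up for this example, and it assumes a non-empty input string, of which only the first character is used.

```javascript
// Build all six representations for the first character of a string.
// Assumption: text is non-empty.
function describeChar(text) {
  const cp = text.codePointAt(0);
  const ch = String.fromCodePoint(cp);
  const hex = cp.toString(16).toUpperCase();

  // A \uXXXX escape covers one UTF-16 code unit, so a character above
  // U+FFFF is written as two escapes (a surrogate pair).
  let jsEscape = "";
  for (let i = 0; i < ch.length; i++) {
    const unit = ch.charCodeAt(i).toString(16).toUpperCase();
    jsEscape += "\\u" + unit.padStart(4, "0");
  }

  return {
    codePoint: "U+" + hex.padStart(4, "0"),
    decimal: String(cp),
    hexadecimal: "0x" + hex,
    htmlEntity: "&#" + cp + ";",
    // In a stylesheet, a trailing space may be needed after the escape
    // when the next character is itself a hex digit.
    cssEscape: "\\" + hex,
    jsEscape: jsEscape,
  };
}

console.log(describeChar("A"));
console.log(describeChar("😀").jsEscape); // \uD83D\uDE00
```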

Common Unicode Ranges

Basic Latin

U+0000 - U+007F

Latin-1 Supplement and Latin Extended-A/B

U+0080 - U+024F

Greek

U+0370 - U+03FF

Cyrillic

U+0400 - U+04FF

Arabic

U+0600 - U+06FF

CJK Unified Ideographs

U+4E00 - U+9FFF

Currency Symbols

U+20A0 - U+20CF

Emoji (main range; some emoji lie outside it)

U+1F300 - U+1F9FF
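A range list like this can be turned into a simple lookup. The sketch below copies the eight ranges listed above into a table and reports which one a character falls into. RANGES and rangeOf are illustrative names, and anything outside the listed ranges is reported as "Other".

```javascript
// The ranges listed above, as inclusive [start, end] code point bounds.
const RANGES = [
  { name: "Basic Latin", start: 0x0000, end: 0x007f },
  { name: "Latin-1 Supplement and Latin Extended-A/B", start: 0x0080, end: 0x024f },
  { name: "Greek", start: 0x0370, end: 0x03ff },
  { name: "Cyrillic", start: 0x0400, end: 0x04ff },
  { name: "Arabic", start: 0x0600, end: 0x06ff },
  { name: "Currency Symbols", start: 0x20a0, end: 0x20cf },
  { name: "CJK Unified Ideographs", start: 0x4e00, end: 0x9fff },
  { name: "Emoji", start: 0x1f300, end: 0x1f9ff },
];

// Return the name of the listed range containing the first character.
function rangeOf(text) {
  const cp = text.codePointAt(0);
  const hit = RANGES.find((r) => cp >= r.start && cp <= r.end);
  return hit ? hit.name : "Other";
}

console.log(rangeOf("A")); // Basic Latin
console.log(rangeOf("€")); // Currency Symbols
console.log(rangeOf("😀")); // Emoji
```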

Frequently Asked Questions

What's the difference between Unicode and UTF-8?

Unicode is the standard that defines code points for characters. UTF-8 is an encoding scheme that determines how those code points are stored as bytes. UTF-8 is variable-length: each code point takes 1 to 4 bytes, and ASCII characters take exactly 1. It is the most common encoding on the web.
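To see the variable length in practice, the sketch below prints the UTF-8 bytes of a few characters using the standard TextEncoder API, which is available in browsers and recent Node.js versions. The helper name utf8Bytes is an assumption for this example.

```javascript
// Return the UTF-8 bytes of a string as two-digit hex strings.
function utf8Bytes(text) {
  const bytes = new TextEncoder().encode(text);
  return Array.from(bytes, (b) => b.toString(16).toUpperCase().padStart(2, "0"));
}

console.log(utf8Bytes("A").join(" "));         // 41           (1 byte)
console.log(utf8Bytes("\u00E9").join(" "));    // C3 A9        (2 bytes)
console.log(utf8Bytes("\u20AC").join(" "));    // E2 82 AC     (3 bytes)
console.log(utf8Bytes("\u{1F600}").join(" ")); // F0 9F 98 80  (4 bytes)
```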

Why do some characters need surrogate pairs in JavaScript?

JavaScript strings are sequences of 16-bit UTF-16 code units, and a single code unit can only represent code points up to U+FFFF. Characters above this (like many emoji) are stored as two code units called a surrogate pair, which is why such a character has a string length of 2. This tool handles these automatically.
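The surrogate pair for a code point follows from the UTF-16 arithmetic: subtract 0x10000, then split the remaining 20 bits into two 10-bit halves. The sketch below shows that calculation; toSurrogatePair is a name chosen for this example.

```javascript
// Compute the UTF-16 surrogate pair for a code point above U+FFFF.
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10ffff) {
    throw new RangeError("only U+10000 to U+10FFFF use a surrogate pair");
  }
  const offset = codePoint - 0x10000;       // 20-bit value
  const high = 0xd800 + (offset >> 10);     // top 10 bits
  const low = 0xdc00 + (offset & 0x3ff);    // bottom 10 bits
  return [high, low];
}

const [high, low] = toSurrogatePair(0x1f600);
console.log(high.toString(16), low.toString(16)); // d83d de00

// The same character measured in code units and in code points.
console.log("😀".length);      // 2
console.log([..."😀"].length); // 1
```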

When should I use HTML entities?

HTML entities are useful when you need to include special characters in HTML documents, especially characters that have special meaning in HTML (like <, >, &) or when you're unsure about the document's character encoding.
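As a sketch of both cases, the function below escapes the characters with special meaning in HTML using named entities and writes everything outside printable ASCII as a decimal entity. escapeHtml is a hypothetical helper written for this example, not the tool's own code, and real projects may prefer an established escaping library.

```javascript
// Escape HTML-significant characters and non-ASCII characters.
function escapeHtml(text) {
  const named = { "&": "&amp;", "<": "&lt;", ">": "&gt;", '"': "&quot;" };
  let out = "";
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (named[ch]) {
      out += named[ch];            // special meaning in HTML
    } else if (cp > 0x7e) {
      out += "&#" + cp + ";";      // safe under any document encoding
    } else {
      out += ch;
    }
  }
  return out;
}

console.log(escapeHtml('5 < 6 & "€"'));
// 5 &lt; 6 &amp; &quot;&#8364;&quot;
```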

Is my data sent to a server?

No. All conversion is done entirely in your browser using JavaScript. Your data never leaves your device, making this tool completely private.
