Unicode Decode: Convert Unicode Escape Sequences to Text
Convert \uXXXX Unicode escape sequences back to readable text characters. Perfect for reading escaped source code, JSON, and internationalised strings.
Try the free online tool
Runs entirely in your browser — no signup, no uploads.
When you encounter strings filled with \u0041\u0042\u0043 or similar sequences in source code, config files, or API responses, you are looking at Unicode escape sequences. Decoding them back to readable characters makes the content immediately understandable and is essential when debugging internationalised applications or inspecting obfuscated code.
Unicode decode converts \uXXXX sequences to their corresponding Unicode characters. It also handles surrogate pairs for supplementary characters (for example, \uD83D\uDE00 for 😀) and Python-style \UXXXXXXXX sequences. The result is the original human-readable text that the escape sequences represent.
This tool processes Unicode escape sequences in your browser without any server round-trip. It is useful for reading encoded strings in JSON logs, inspecting minified JavaScript, and verifying that internationalised strings contain the correct characters.
What Is Unicode Escape Decoding?
Unicode escape decoding converts \uXXXX sequences (and their variants) back to the actual Unicode characters they represent. For each escape sequence, the four hexadecimal digits are interpreted as a 16-bit value: either a code point in the Basic Multilingual Plane or one half of a surrogate pair. The decoder then emits the character that the Unicode standard assigns to that value.
Surrogate pairs in JavaScript and JSON (two consecutive \uXXXX sequences where the first is in the range \uD800-\uDBFF and the second is in the range \uDC00-\uDFFF) are combined to produce a single supplementary character. Python-style \UXXXXXXXX sequences with eight hex digits directly encode the full code point.
Most language runtimes decode Unicode escapes automatically when they parse a string literal. This tool provides the same capability in a visual, interactive format for debugging and inspection.
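As a rough illustration of the rules described in this section, the following Python sketch decodes \uXXXX escapes, surrogate pairs, and \UXXXXXXXX escapes. The names decode_unicode_escapes and ESCAPE_PATTERN are invented for this example and do not describe how the tool itself is implemented. Leaving lone surrogates and out-of-range values untouched is an assumed policy, one of several reasonable choices.

```python
import re

# One alternative per escape form. The surrogate-pair form is tried before the
# plain \uXXXX form so that a valid pair is consumed as a single match.
ESCAPE_PATTERN = re.compile(
    r"\\U([0-9a-fA-F]{8})"
    r"|\\u([dD][89abAB][0-9a-fA-F]{2})\\u([dD][c-fC-F][0-9a-fA-F]{2})"
    r"|\\u([0-9a-fA-F]{4})"
)


def decode_unicode_escapes(text):
    """Replace Unicode escapes in text with the characters they represent."""

    def replace(match):
        long_hex, high_hex, low_hex, short_hex = match.groups()
        if long_hex is not None:
            # \UXXXXXXXX: the eight digits are the code point itself.
            code_point = int(long_hex, 16)
        elif high_hex is not None:
            # High + low surrogate: combine into one supplementary code point.
            high, low = int(high_hex, 16), int(low_hex, 16)
            code_point = 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)
        else:
            code_point = int(short_hex, 16)
        # Assumed policy: leave lone surrogates and out-of-range values escaped.
        if code_point > 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
            return match.group(0)
        return chr(code_point)

    return ESCAPE_PATTERN.sub(replace, text)


print(decode_unicode_escapes(r"\u0041\u0042\u0043"))  # ABC
print(decode_unicode_escapes(r"\uD83D\uDE00"))        # the emoji U+1F600
print(decode_unicode_escapes(r"\U0001F600"))          # the same emoji
```

This sketch does not treat a doubled backslash (\\u0041) as an escaped backslash, so input that mixes both conventions would need extra handling.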
How to Use This Tool
Decoding Unicode escape sequences takes four short steps with this tool.
1. Paste the escaped string. Copy the string containing \uXXXX sequences from your source code, JSON, log file, or API response.
2. Click Decode. The tool identifies all Unicode escape sequences and replaces them with their corresponding characters.
3. Read the decoded output. The output shows the original text with all escape sequences resolved to readable characters.
4. Copy the result. Copy the decoded text for use in documentation, testing, or further analysis.
Common Use Cases
Unicode decode is particularly useful in internationalisation and debugging scenarios.
- Reading escaped strings in minified or obfuscated JavaScript to understand what text they contain.
- Decoding \uXXXX sequences in JSON API responses to verify localised text is correct.
- Inspecting Java .properties files that use Unicode escapes for non-ASCII characters.
- Decoding escaped content in LDAP distinguished names or Active Directory attributes.
- Reading escaped characters in database exports or migration scripts.
- Verifying that emoji and special symbols in internationalised applications are stored and transmitted correctly.
Tips and Best Practices
- When decoding strings from untrusted sources, be aware that the decoded characters may include invisible control characters, bidirectional override markers, or other characters with visual security implications.
- JSON.parse() in JavaScript automatically decodes all \uXXXX sequences in JSON strings, so you rarely need to manually decode JSON strings in application code.
- In Python, the codecs.decode(b'\\u4e2d\\u6587', 'unicode_escape') pattern can decode escaped strings programmatically.
- Zero-width spaces (\u200B), zero-width non-joiners (\u200C), and bidirectional markers (\u200F, \u200E) are invisible but can affect text rendering and processing; use this tool to reveal them.
- After decoding, if the output contains unexpected characters, compare the original code points to the Unicode character charts to understand what each character is.
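The tips above about invisible and bidirectional characters can also be checked programmatically. The sketch below uses Python's standard unicodedata module to list control and format characters in an already decoded string. The function name find_invisible_characters and the choice of the Cc and Cf categories are assumptions made for this example, not a complete security check.

```python
import unicodedata

# Unicode general categories treated as "invisible" for this sketch:
# Cc = control characters, Cf = format characters (zero-width and bidi marks).
INVISIBLE_CATEGORIES = {"Cc", "Cf"}


def find_invisible_characters(text):
    """Return (index, 'U+XXXX', name) for each control or format character."""
    findings = []
    for index, char in enumerate(text):
        if unicodedata.category(char) in INVISIBLE_CATEGORIES:
            # Control characters have no formal name, so supply a fallback.
            name = unicodedata.name(char, "<unnamed control character>")
            findings.append((index, f"U+{ord(char):04X}", name))
    return findings


sample = "pay\u200bpal\u200f"
for finding in find_invisible_characters(sample):
    print(finding)
# (3, 'U+200B', 'ZERO WIDTH SPACE')
# (7, 'U+200F', 'RIGHT-TO-LEFT MARK')
```

Ordinary tabs and newlines are also in category Cc, so a real check would probably exclude the whitespace it expects to see.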
Frequently Asked Questions
How do I decode \uXXXX sequences in JavaScript?
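JavaScript parses \uXXXX sequences natively in string literals. In a running script, you can use JSON.parse('"\\uXXXX"') to decode a single escape (note the double-escaping in source code). Alternatively, use a regex-based approach: str.replace(/\\u([0-9a-fA-F]{4})/g, (_, hex) => String.fromCharCode(parseInt(hex, 16))). Because JavaScript strings are sequences of UTF-16 code units, two consecutive surrogate escapes decoded this way combine into a single supplementary character.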
Why do some Unicode escape sequences decode to question marks or boxes?
A question mark or empty box means your font or rendering environment does not include a glyph for that character. The character was decoded correctly by the tool, but the display system cannot render it. Switch to a font like Noto or a terminal with full Unicode support to see the character.
What are surrogate pairs and why do some characters use two escape sequences?
Unicode code points above U+FFFF are outside the Basic Multilingual Plane. JavaScript strings use UTF-16 internally, which represents these code points as two 16-bit values called a surrogate pair. Each surrogate has its own \uXXXX escape. A surrogate pair decoder combines the two values to reconstruct the original supplementary character.
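The arithmetic behind surrogate pairs is short enough to show directly. The Python sketch below splits a supplementary code point into its two UTF-16 surrogates and combines them again; the function names are chosen for this example only.

```python
def to_surrogate_pair(code_point):
    """Split a code point in U+10000..U+10FFFF into (high, low) surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need a surrogate pair")
    offset = code_point - 0x10000      # a 20-bit value
    high = 0xD800 + (offset >> 10)     # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits
    return high, low


def from_surrogate_pair(high, low):
    """Combine a high and a low surrogate back into one code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


high, low = to_surrogate_pair(0x1F600)
print(f"\\u{high:04X}\\u{low:04X}")         # \uD83D\uDE00
print(hex(from_surrogate_pair(high, low)))  # 0x1f600
```

Each surrogate carries 10 bits of the offset, which is why exactly two escapes are enough to reach every code point up to U+10FFFF.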
Can Unicode escapes hide malicious content in source code?
Yes. Obfuscated malware and trojans sometimes use Unicode escapes to hide readable strings from code review. A related technique, the homograph attack, uses lookalike Unicode characters to create identifiers that appear identical to legitimate names. This decoder helps reveal what escaped strings actually contain.
Does this tool handle Python-style \N{CHARACTER NAME} sequences?
\N{CHARACTER NAME} sequences (like \N{SNOWMAN} for the snowman character) are Python-specific and not part of the JavaScript or JSON escape specifications. This tool focuses on \uXXXX and \UXXXXXXXX formats. For Python-specific escape handling, use the codecs module or the unicodedata.lookup() function.
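For reference, here is a minimal Python sketch of the two standard-library routes mentioned above. It assumes the escaped input is plain ASCII; the variable names are illustrative only.

```python
import codecs
import unicodedata

# Look up a character directly by its official Unicode name.
snowman = unicodedata.lookup("SNOWMAN")
print(f"U+{ord(snowman):04X}")  # U+2603

# The unicode_escape codec resolves \N{...} alongside \uXXXX in escaped input.
escaped = rb"\N{SNOWMAN} \u2603"
decoded = codecs.decode(escaped, "unicode_escape")
print(decoded == f"{snowman} {snowman}")  # True
```

The unicode_escape codec treats any non-escaped bytes as Latin-1, so input that already contains raw non-ASCII text may come out mangled.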
How do CSS Unicode escapes differ from JavaScript escapes?
CSS uses a backslash followed by one to six hexadecimal digits without the u prefix: \4E2D for the Chinese character with code point U+4E2D. The sequence ends after the sixth hex digit or at the first character that is not a hex digit; a single whitespace character directly after the escape is consumed as a delimiter. This format differs from JavaScript's \uXXXX, which always takes exactly four hex digits.
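To make the CSS rule concrete, the following Python sketch decodes hexadecimal CSS escapes only. The name decode_css_escapes is hypothetical, and the sketch deliberately ignores the other CSS escape form, in which a backslash precedes a non-hex character.

```python
import re

# Backslash, one to six hex digits, then at most one whitespace delimiter.
CSS_ESCAPE = re.compile(r"\\([0-9a-fA-F]{1,6})(?:\r\n|[ \t\n\r\f])?")


def decode_css_escapes(text):
    """Replace hexadecimal CSS escapes with the characters they represent."""

    def replace(match):
        code_point = int(match.group(1), 16)
        # CSS maps zero, surrogates, and out-of-range values to U+FFFD.
        if (
            code_point == 0
            or code_point > 0x10FFFF
            or 0xD800 <= code_point <= 0xDFFF
        ):
            return "\ufffd"
        return chr(code_point)

    return CSS_ESCAPE.sub(replace, text)


print(decode_css_escapes(r"\4E2D \6587"))  # 中文
print(decode_css_escapes(r"\26 B"))        # &B
print(decode_css_escapes(r"\000026B"))     # &B
```

The second and third calls show why the delimiter matters: without the space or the zero padding, the B would be read as another hex digit of the escape.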