
Decode escaped HTML in JavaScript

My co-worker escapes all the data sent to the backend with a Java tool. Characters are converted into numeric references like &#12345;, where the number is the character's Unicode code point in decimal. HTML character entities are used as well, so a character like "<" becomes &lt;.
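To make the escaped format concrete, here is a rough sketch of what such an escaping step might do. The function name escapeForTransport and its exact rules are my assumptions for illustration; the real Java tool may well behave differently, for example in which characters it leaves untouched.

```javascript
// Hypothetical stand-in for the backend's Java escaping tool (not its real code).
function escapeForTransport(text) {
    var named = { "<": "&lt;", ">": "&gt;", "&": "&amp;", "\"": "&quot;" };
    return text.replace(/[<>&"]|[^\x00-\x7f]/g, function (ch) {
        // Markup characters become named entities; anything non-ASCII
        // becomes a decimal reference to its UTF-16 code unit.
        return named[ch] || "&#" + ch.charCodeAt(0) + ";";
    });
}

console.log(escapeForTransport('a "\u4e2d" <b>'));
// a &quot;&#20013;&quot; &lt;b&gt;
```

Note that the quote is escaped as &quot; here, which matters for the JSON problem described later.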

The frontend has to decode the string sent back and also parse it into JSON objects. Being lazy, I chose a quick trick to unescape the string, like this:

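function getJson(data){
    // Let a detached <p> element decode the entities, then parse the text as JSON
    return $.parseJSON($('<p>').html(data).text());
}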

But I found that this code sometimes doesn't work. The data seemed to be cut off somewhere, which misled me into suspecting some sort of memory leak.

After many attempts, and with the help of JSON Lint, I found that the problem lay in the character ".

In the approach above, every entity is decoded into its character, which means &quot; is converted into ", but the returned JSON string uses " as its own string delimiter! So it is the decoded " that cuts off the string!
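The failure is easy to reproduce without a browser. In this small sketch a plain replace stands in for the DOM decoding, and the sample response and the helper tryParse are made up for illustration.

```javascript
// Response as the backend might return it: the JSON structure uses raw quotes,
// while the user-entered value still carries an escaped quote (&quot;).
var response = '{"title":"say &quot;hi&quot;"}';

// Stand-in for the DOM trick: decode the whole string before parsing.
var decoded = response.replace(/&quot;/g, '"');
// decoded is now {"title":"say "hi""} which is no longer valid JSON

function tryParse(text) {
    try {
        return JSON.parse(text);
    } catch (e) {
        return null; // parsing failed
    }
}

console.log(tryParse(decoded));        // null
console.log(tryParse(response).title); // say &quot;hi&quot;
```

The parser sees the decoded quote as the end of the string value, which is why the data looks as if it had been cut off.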

Then, I wrote something like this:
(Thanks to Convert HTML Character Entities back to regular text using javascript)

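function u2text(text){
    // Detached <p> element used to decode named entities such as &lt;
    var $p = $('<p>');

    return text.replace(/&(#(?:x[0-9a-f]+|\d+)|[a-z]+);?/gi, function($0, $1) {
        if ($1[0] === "#") {
            // Numeric reference: hexadecimal (&#x...;) or decimal (&#12345;)
            return String.fromCharCode($1[1].toLowerCase() === "x" ? parseInt($1.substr(2), 16) : parseInt($1.substr(1), 10));
        } else {
            // Named entity: decode via the DOM, but keep quotes JSON-safe
            return $p.html($0).text().replace(/"/g, "\\u0022");
        }
    });
}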

In this code, Chinese characters given as numeric Unicode references are converted with String.fromCharCode, while HTML character entities are decoded with the <p> element trick, and quotes get special treatment. Be careful: the quote character must be written as "\\u0022" in the JavaScript source, so that the six-character escape sequence \u0022 ends up in the JSON text. If you write "\u0022", JavaScript turns it straight back into a quote and you change nothing.
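The difference between the two spellings can be checked directly. This is only a sketch: the sample response is made up, and a plain replace stands in for the entity decoding.

```javascript
var escapedSource = "\\u0022"; // six characters: \ u 0 0 2 2
var plainQuote = "\u0022";     // one character: "

console.log(escapedSource.length); // 6
console.log(plainQuote === '"');   // true

// Replacing &quot; with the six-character sequence keeps the JSON valid;
// JSON.parse then turns \u0022 into a real quote inside the value.
var response = '{"title":"say &quot;hi&quot;"}';
var safe = response.replace(/&quot;/g, escapedSource);
console.log(JSON.parse(safe).title); // say "hi"
```

So the quote survives the round trip, but only becomes a real quote after the JSON has been parsed.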

Keep in mind that this decoding is dangerous, because text that was escaped for safety is turned back into raw characters. In my case the decoded data is handed over to a front-end template system such as Dust.js, which takes good care of the security problems.
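If the decoded data is ever written into the page without such a template system, it has to be escaped again first. A minimal sketch, assuming a hand-written helper (escapeHtml is my own name, not part of Dust.js or jQuery):

```javascript
// Hypothetical helper; a template engine would normally do this for you.
function escapeHtml(text) {
    var map = { "&": "&amp;", "<": "&lt;", ">": "&gt;", "\"": "&quot;", "'": "&#39;" };
    return String(text).replace(/[&<>"']/g, function (ch) {
        return map[ch];
    });
}

// A decoded value that would run script if inserted as raw HTML.
var decodedValue = '<img src=x onerror="alert(1)">';
console.log(escapeHtml(decodedValue));
// &lt;img src=x onerror=&quot;alert(1)&quot;&gt;
```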