RFC 8259 section 8.1 requires JSON text exchanged outside the scope of a closed ecosystem to be encoded using UTF-8, butRead more about proposal
JSON.stringifycan return strings including code points that have no representation in UTF-8 (specifically, surrogate code points U+D800 through U+DFFF). And contrary to the description of
JSON.stringify, such strings are not “in UTF-16” because “isolated UTF-16 code units in the range D800₁₆..DFFF₁₆ are ill-formed” per The Unicode Standard, Version 10.0.0, Section 3.4 at definition D91 and excluded from being “in UTF-16” per definition D89.
JSON.stringify method has been defined to return unformed Unicode strings if there are any lone surrogates in the input, for example below:
JSON.stringify('\uD800'); // '"�"'
But now, because Lone UTF-16 surrogates cannot be encoded as UTF-8, that’s why the proposal, well-formed JSON.stringify changed the JSON.stringify so that it generates escape sequences for lone surrogates, making its output valid Unicode (and representable in UTF-8) for example below:
JSON.stringify('\uD800'); // '"\\ud800"'
JSON.parse(stringified)still produces the same results as before.
Non-BMP characters still serialize to surrogate pairs.
JSON.stringify('𝌆') // → '"𝌆"' JSON.stringify('\uD834\uDF06') // → '"𝌆"'
Unpaired surrogate code units will serialize to escape sequences.
JSON.stringify('\uDF06\uD834') // → '"\\udf06\\ud834"' JSON.stringify('\uDEAD') // → '"\\udead"'
Thanks for reading, Share with love.