Per https://html.spec.whatwg.org/multipage/webappapis.html#creating-a-classic-script, the input source
is supposed to be a string, which https://infra.spec.whatwg.org/#string defines as follows:
A string is a sequence of unsigned 16-bit integers, also known as code units. A string is also known as a JavaScript string. Strings are denoted by double quotes and monospace font.
This is distinct from a "byte string" (i.e. AK::StringView/AK::DeprecatedString) and from a UTF-8 string (i.e. AK::String). Taken literally, the spec suggests the input should be a Span<u16>, which seems quite gross.
If we intend to implement the algorithm so that it takes a UTF-8 string instead, we should make that explicit with implementation-note comments. And if the algorithm is going to assume that its input is valid UTF-8, we should validate that somewhere: either before calling it, after calling it, or as part of another spec step, such as the one that calls https://tc39.es/ecma262/#sec-parse-script (see #17899).
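To make the distinction concrete, here is a rough sketch of what "validate, then hand the spec its code units" could look like. It is written against the standard library rather than AK so that it stands alone; `utf8_to_code_units` is a hypothetical name, not an existing AK or LibJS function, and where such a check should actually live is exactly the open question above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical helper (not an AK API): decodes `bytes` as UTF-8 and returns the
// equivalent sequence of 16-bit code units, i.e. an Infra-spec "string".
// Returns std::nullopt if the input is not well-formed UTF-8.
std::optional<std::u16string> utf8_to_code_units(std::string_view bytes)
{
    std::u16string out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        auto lead = static_cast<unsigned char>(bytes[i]);
        std::uint32_t code_point = 0;
        std::uint32_t minimum = 0; // smallest code point allowed for this length
        std::size_t length = 0;

        if (lead < 0x80) {
            code_point = lead; length = 1; minimum = 0;
        } else if ((lead & 0xE0) == 0xC0) {
            code_point = lead & 0x1F; length = 2; minimum = 0x80;
        } else if ((lead & 0xF0) == 0xE0) {
            code_point = lead & 0x0F; length = 3; minimum = 0x800;
        } else if ((lead & 0xF8) == 0xF0) {
            code_point = lead & 0x07; length = 4; minimum = 0x10000;
        } else {
            return std::nullopt; // stray continuation byte or invalid lead byte
        }

        if (bytes.size() - i < length)
            return std::nullopt; // sequence truncated at end of input

        for (std::size_t j = 1; j < length; ++j) {
            auto byte = static_cast<unsigned char>(bytes[i + j]);
            if ((byte & 0xC0) != 0x80)
                return std::nullopt; // expected a continuation byte
            code_point = (code_point << 6) | (byte & 0x3F);
        }

        if (code_point < minimum)
            return std::nullopt; // overlong encoding
        if (code_point >= 0xD800 && code_point <= 0xDFFF)
            return std::nullopt; // surrogates are not valid in UTF-8
        if (code_point > 0x10FFFF)
            return std::nullopt; // beyond the Unicode range

        if (code_point < 0x10000) {
            out.push_back(static_cast<char16_t>(code_point));
        } else {
            // Encode as a surrogate pair: two code units for one code point.
            assert(code_point <= 0x10FFFF);
            code_point -= 0x10000;
            out.push_back(static_cast<char16_t>(0xD800 + (code_point >> 10)));
            out.push_back(static_cast<char16_t>(0xDC00 + (code_point & 0x3FF)));
        }
        i += length;
    }
    return out;
}
```

Note that the byte length and the code unit length differ as soon as the source contains non-ASCII characters, so any step of the spec algorithm that indexes into or measures the string would behave differently depending on which representation we pass in.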