Some popular libraries allow code snippets in other languages to be embedded within JavaScript code. Users want to format these embedded code snippets within JavaScript to enhance the development experience.
Simply put, the idea is to extract the code snippets from template strings, format them with the respective language's formatter, and then insert the formatted result back into the template string.
We need to parse the entire template string and then format it based on the parsing results. However, a template string with interpolations is not valid CSS (using CSS as the example here). Therefore, we need to preprocess the interpolations so that the template string becomes something closer to valid CSS. We plan to replace each interpolation with a special string and then reinsert the interpolations after formatting.
To maximize parsing success, we chose to replace interpolations with grit metavariables. The reasoning behind this choice is explained in #3228 (comment).
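
The replace-then-reinsert step can be sketched as follows. This is a minimal illustration, not the proposed implementation: the function names and the `__interp_N__` placeholder spelling are hypothetical and are not the grit metavariable syntax, and the template is assumed to be already split into its static chunks.

```rust
/// Hypothetical placeholder spelling, chosen only so that it reads as a
/// plain identifier. It is NOT the grit metavariable syntax.
fn placeholder(index: usize) -> String {
    format!("__interp_{index}__")
}

/// Joins the static chunks of a template literal, putting a numbered
/// placeholder where each interpolation used to be.
fn replace_interpolations(chunks: &[&str]) -> String {
    let mut out = String::new();
    for (i, chunk) in chunks.iter().enumerate() {
        if i > 0 {
            // Between chunk i-1 and chunk i there was interpolation i-1.
            out.push_str(&placeholder(i - 1));
        }
        out.push_str(chunk);
    }
    out
}

/// Puts the original interpolation text back after formatting.
/// Assumes the formatter left every placeholder intact.
fn restore_interpolations(formatted: &str, interpolations: &[&str]) -> String {
    let mut out = formatted.to_string();
    for (i, original) in interpolations.iter().enumerate() {
        out = out.replace(&placeholder(i), original);
    }
    out
}
```

For a template such as `` `a { color: ${color}; margin: ${gap}; }` ``, the chunks would be `"a { color: "`, `"; margin: "`, and `"; }"`. The trailing `__` in the placeholder keeps `__interp_1__` from matching inside `__interp_10__`.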
Since the JavaScript formatter cannot directly format code in other languages, we need external formatters to handle that code. To achieve this, we designed a generic trait instead of relying on specific implementations, which maximizes the decoupling between the different language formatters.
    enum JsForeignLanguage {
        Css,
    }
    trait JsForeignLanguageFormatter {
        fn format(&self, language: JsForeignLanguage, source: &str) -> FormatResult<Document>;
    }
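
To illustrate the decoupling, here is a self-contained sketch of an implementor and a caller that only sees the trait. `Document` and `FormatResult` are replaced by simplified stand-ins so the example compiles on its own, and `ToyCssFormatter` and `format_template_content` are hypothetical names that only collapse whitespace; a real implementor would delegate to the CSS formatter.

```rust
// Simplified stand-ins: the real trait returns FormatResult<Document>
// from the formatter crate, not a plain String.
type Document = String;
type FormatResult<T> = Result<T, String>;

#[derive(Debug, Clone, Copy, PartialEq)]
enum JsForeignLanguage {
    Css,
}

trait JsForeignLanguageFormatter {
    fn format(&self, language: JsForeignLanguage, source: &str) -> FormatResult<Document>;
}

/// Hypothetical implementor used only for illustration.
struct ToyCssFormatter;

impl JsForeignLanguageFormatter for ToyCssFormatter {
    fn format(&self, language: JsForeignLanguage, source: &str) -> FormatResult<Document> {
        match language {
            // Collapse every run of whitespace into a single space.
            JsForeignLanguage::Css => {
                Ok(source.split_whitespace().collect::<Vec<_>>().join(" "))
            }
        }
    }
}

/// The JavaScript side depends only on the trait, never on a concrete
/// CSS formatter type.
fn format_template_content(
    formatter: &impl JsForeignLanguageFormatter,
    content: &str,
) -> FormatResult<Document> {
    formatter.format(JsForeignLanguage::Css, content)
}
```

Supporting another embedded language would then mean adding a variant to `JsForeignLanguage` and a new match arm in the implementor, without touching the JavaScript formatter itself.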
Then we can add a new parameter to the `format_node` function to pass in the formatter for other languages.
     pub fn format_node(
         options: JsFormatOptions,
    +    foreign_language_formatter: impl JsForeignLanguageFormatter,
         root: &JsSyntaxNode,
     ) -> FormatResult<Formatted<JsFormatContext>> {
         biome_formatter::format_node(
             root,
             JsFormatLanguage::new(options, foreign_language_formatter),
         )
     }
When formatting JavaScript files, we need to be aware of other languages' settings. For example, when formatting CSS code, we need to know the CSS formatter's settings.
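
One way to picture this is a foreign-language formatter value that captures the CSS settings when it is constructed, so the JavaScript formatter never has to read them itself. The sketch below is an assumption for illustration only: `CssSettings`, `ForeignFormatter`, and `format_css_declarations` are hypothetical names, and the "formatting" is a toy that re-indents declarations.

```rust
/// Hypothetical subset of the CSS formatter's settings.
#[derive(Debug, Clone)]
struct CssSettings {
    indent_width: usize,
}

/// Hypothetical formatter value built by whoever owns the workspace
/// settings and then handed to the JavaScript formatter.
struct ForeignFormatter {
    css: CssSettings,
}

impl ForeignFormatter {
    /// Toy formatting: one declaration per line, indented according to
    /// the captured CSS settings rather than the JavaScript ones.
    fn format_css_declarations(&self, source: &str) -> String {
        let indent = " ".repeat(self.css.indent_width);
        source
            .split(';')
            .map(str::trim)
            .filter(|decl| !decl.is_empty())
            .map(|decl| format!("{indent}{decl};\n"))
            .collect()
    }
}
```

With `indent_width: 2`, the input `"color: red;margin: 0"` becomes two lines, each indented by two spaces, regardless of how the surrounding JavaScript is indented.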
The LSP provides a feature called `format_range` that formats a selected range of a document. This feature relies on the `SourceMarker`s generated during the printing process. Generating a `SourceMarker` depends on the position information of tokens in the source code. This position information is contained in the following two `FormatElement`s:
biome/crates/biome_formatter/src/format_element.rs, lines 36 to 50 in ce00685
Since embedded languages are formatted by extracting, preprocessing, and then separately parsing and formatting them, the `source_position` in these two `FormatElement`s is inaccurate, and the entire template string is handled as a whole. Therefore, I recommend erasing these inaccurate `source_position`s. Erasing them is acceptable because the `format_range` function will still be able to find the `SourceMarker`s closest to the range start and end. If there is a need to format parts of the embedded code in the future, we can revisit this issue.
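
The effect of erasing the positions can be sketched with a simplified marker lookup. This is not the actual `format_range` code: the `SourceMarker` below is a stand-in that uses plain `u32` offsets, and `closest_markers` is a hypothetical helper that picks the nearest marker at or before the range start and the nearest marker at or after the range end.

```rust
/// Simplified stand-in: maps an offset in the source text to an offset
/// in the printed output.
#[derive(Debug, Clone, Copy, PartialEq)]
struct SourceMarker {
    source: u32,
    dest: u32,
}

/// Finds the marker closest to the range start (at or before it) and the
/// marker closest to the range end (at or after it).
fn closest_markers(
    markers: &[SourceMarker],
    start: u32,
    end: u32,
) -> (Option<SourceMarker>, Option<SourceMarker>) {
    let start_marker = markers
        .iter()
        .copied()
        .filter(|m| m.source <= start)
        .max_by_key(|m| m.source);
    let end_marker = markers
        .iter()
        .copied()
        .filter(|m| m.source >= end)
        .min_by_key(|m| m.source);
    (start_marker, end_marker)
}
```

If a template string spans source offsets 10 to 50 and has no markers inside it, a requested range of 20 to 30 resolves to the markers at 10 and 50, so the whole template string is formatted as one unit.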