In mkdocstrings, to format code, highlight it, and cross-ref names, we have to dance a bit:
str
are hidden behind unique ids, we lose some of the lexer capabilities)
It feels super convoluted (back and forth between Python code and Jinja templating) and inefficient (Griffe parsed the code with `ast`, then Black parses it again, then Pygments lexes it again...).
Furthermore, Pygments is "only" a lexer, so it cannot distinguish between the name of a parameter and a name used as a value, and potentially other similar cases. That means we don't get CSS classes distinct enough to let users style their code flexibly, based on semantics rather than tokenization.
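To illustrate the limitation, here is a minimal sketch that uses the standard library's `tokenize` module as a stand-in for a lexer (it is not Pygments, and the `name_tokens` helper is hypothetical). It shows that at the token level, a keyword parameter name and a name used as a value look identical.

```python
import io
import tokenize


def name_tokens(code: str) -> list[tuple[str, str]]:
    """Return (token type, text) pairs for every NAME token in `code`."""
    readline = io.StringIO(code).readline
    return [
        (tokenize.tok_name[tok.type], tok.string)
        for tok in tokenize.generate_tokens(readline)
        if tok.type == tokenize.NAME
    ]


# The first `value` is a parameter name, the second is a name used as a value,
# yet both come out as plain NAME tokens.
tokens = name_tokens("call(value=value)")
print(tokens)
```

A lexer only sees the token stream, so telling the two `value` tokens apart requires the kind of structural information a parser provides.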
I wonder if we couldn't pack all this (formatting + highlighting) directly in our expressions.
The highlighting part is easy: Griffe already parsed the code as an expression, so we have all the semantics associated with each part of the expression. For example, in a function call, we know that the `ExprName` instances used for the names of keyword parameters are parameter names, and not just names, so we could easily wrap them in `<span class="pn">...</span>` or whatever.
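As a rough sketch of the idea, the snippet below walks a standard library `ast` tree rather than Griffe's own expression classes, so it only approximates what Griffe could do. The `render` and `span` helpers and the `n` and `c` class names are hypothetical; only `pn` comes from the example above.

```python
import ast
from html import escape


def span(css_class: str, text: str) -> str:
    """Wrap escaped text in a span carrying the given CSS class."""
    return f'<span class="{css_class}">{escape(text)}</span>'


def render(node: ast.expr) -> str:
    """Render a tiny subset of expressions (names, constants, calls) as HTML."""
    if isinstance(node, ast.Name):
        return span("n", node.id)
    if isinstance(node, ast.Constant):
        return span("c", repr(node.value))
    if isinstance(node, ast.Call):
        parts = [render(arg) for arg in node.args]
        for keyword in node.keywords:
            if keyword.arg is None:
                # `**kwargs` unpacking has no parameter name to highlight.
                parts.append(f"**{render(keyword.value)}")
            else:
                # We know this is a parameter name, so it gets its own class.
                parts.append(f"{span('pn', keyword.arg)}={render(keyword.value)}")
        return f"{render(node.func)}({', '.join(parts)})"
    raise NotImplementedError(type(node).__name__)


html = render(ast.parse("call(value=value)", mode="eval").body)
print(html)
```

Here the two `value` names end up with different classes (`pn` and `n`), which is exactly the distinction a lexer cannot make.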
The formatting part, however, is probably much more complex. I don't pretend to be capable of writing something as good and efficient as Black or Ruff. But maybe there's a way? For now, Griffe expressions only handle single statements (type annotations, signatures, assignments). What happens when we start supporting arbitrary code in expressions?
We could consider never formatting code ourselves, and using a CST instead of an AST to preserve the existing formatting. Users would format the code themselves (there are plenty of tools to do that), and we would just keep the same formatting. However, with Griffe's ability to transform expressions (for example expression modernization and `Annotated` unwrapping), we will have to update the spacing anyway.
Maybe a combination of both would be enough? By preserving the original formatting, we drastically reduce the formatting complexity, so we could naively format just the transformed parts ourselves.
Benefit of retaining the original formatting: we respect the user's choices (comments enabling/disabling formatting, for example).
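The combined approach could look roughly like the sketch below. It relies on the positions recorded by the standard library `ast` module (Python 3.9+) instead of a real CST, and the `modernize_optional` helper is hypothetical. It assumes a single-line ASCII statement with no nested `Optional`, rewrites `Optional[X]` as `X | None` with naive spacing, and leaves every other character untouched.

```python
import ast


def modernize_optional(source: str) -> str:
    """Rewrite `Optional[X]` as `X | None`, keeping the rest of the text as is."""
    if "\n" in source or not source.isascii():
        # Column offsets are per-line UTF-8 byte offsets; keep the sketch simple.
        raise ValueError("only single-line ASCII sources are supported")
    edits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Subscript)
            and isinstance(node.value, ast.Name)
            and node.value.id == "Optional"
        ):
            inner = ast.get_source_segment(source, node.slice)
            # Only the transformed span gets (naively) reformatted.
            edits.append((node.col_offset, node.end_col_offset, f"{inner} | None"))
    # Apply edits from right to left so earlier offsets stay valid.
    for start, end, text in sorted(edits, reverse=True):
        source = source[:start] + text + source[end:]
    return source


result = modernize_optional("x:   Optional[ int ]  =  None  # fmt: skip")
print(result)
```

The unusual spacing around `:` and `=` and the trailing comment survive the transformation; only the rewritten annotation gets new spacing.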