-
Notifications
You must be signed in to change notification settings - Fork 180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add missing docstring normalization #1736
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not know the best way to distinguish these cases in odoc
Multi-line code blocks are `Code_block
in the AST and are handled in fmt_nestable_block_element
, immediately calling fmt_code_block
.
Inline code spans are `Code_span
.
I don't know the best implementation strategy. The issues I am trying to resolve are, I think, similar in nature to the following example. Formatting: val trms : t -> trm iter
(** [trms a] enumerates the maximal foreign or noninterpreted proper
subterms of [a]. Considering an arithmetic term as a polynomial,
[trms (c × (Σᵢ₌₁ⁿ cᵢ × Πⱼ₌₁ᵐᵢ Xᵢⱼ^pᵢⱼ))] is the sequence of monomials
[Πⱼ₌₁ᵐᵢ Xᵢⱼ^pᵢⱼ] for each [i]. If the arithmetic term is a monomial,
[trms (Πⱼ₌₁ᵐ Xⱼ^pⱼ)] is the sequence of factors [Xⱼ] for each [j]. *) yields val trms : t -> trm iter
(** [trms a] enumerates the indeterminate terms appearing in [a].
Considering an arithmetic term as a polynomial,
[trms (c × (Σᵢ₌₁ⁿ cᵢ × Πⱼ₌₁ᵐᵢ
Xᵢⱼ^pᵢⱼ))] is the sequence of terms [Xᵢⱼ] for each [i] and
[j]. *) while with this PR this docstring is unchanged, as desired. There seems to be something amiss where without this PR, the length of strings with UTF8 characters is not calculated correctly, where with this PR they are. @Julow, this behavior that I see doesn't match up with your explanation in the inline comment, so I'm confused. Do you see a different / better solution? |
Are you using 0.18.0 as a comparison ? I get no diff between this PR and the current master with your example but I do see it compared with 0.18.0. |
Hmm, I am literally comparing this PR with master. I'll dig some more to see if I can understand what is happening. |
Ugh, this is an instance of #1675. If you start with master and format
then it is unchanged. Switch to this PR and format and it produces:
|
The remaining question is: How should be change the formatting of code spans. This PR allows every spaces in a code span to break. This makes it more stable (there's only one stable formatting) but it changes Odoc's AST. (** [a
b] *) This is parsed as |
Hmm, this feels like another case of odoc just doing something different from what is in the ocaml manual, so I don't know. To my understanding, |
Without this docstrings with UTF8 withing `[]` have miscalculated lengths.
I tend to agree that |
I don't think that there is an odoc issue about this. I'm not sure of the constraints on the odoc side, and on the interop between ocamlformat and odoc. We are currently using this PR internally (this branch) with no issues, even though we do full docstring normalization. So I'm not sure what the observed problem just for odoc looks like. |
I would support normalizing - (** [trim " "] is [""] *)
+ (** [trim " "] is [""] *) I believe that this PR doesn't fix the original issue (which was fixed in #1673). Perhaps we should open an issue on Odoc ? |
In the original report I was mistaken to bring up utf length computation, that turned out not to be the issue. The issue in the comment above remains though, and is fixed by this PR. @Julow, are you saying that any change in ocamlformat that allows every space in a code span to break would cause problems with odoc? Are the odoc ast changes you mention something that are checked in ocamlformat analogous to how we check the main parsetree is preserved by formatting? I'm curious as we have been running with this PR in production for a while with no issues. |
Any change in whitespace/newlines in code spans would change the rendered documentation. Every strings in doc comments are normalized: https://github.com/ocaml-ppx/ocamlformat/blob/main/lib/Normalize.ml#L79-L83 which is wrong at least for code spans with the current implementation of Odoc. I believe it is reasonable to change Odoc to normalize line breaks (and the following spaces) in code spans (but not consecutive spaces). |
@Julow So is this something worth fixing in odoc? we will synchronize with odoc as necessary. Otherwise we should close this PR. |
I think it's worth clarifying Odoc's code span syntax independently from this PR. |
Without this docstrings with UTF8 withing
[]
have miscalculated lengths.This adds (some of?) the normalization needed for
[...]
docstring elements. Currently this change also incorrectly applies to preformatted code block elements{[...]}
. I do not know the best way to distinguish these cases in odoc, see #1735.