text-anchor prevents MathML interpretation #3295

Open
ozross opened this issue Oct 11, 2024 · 5 comments
Labels
Code Example Contains an illustrative code example, solution, or work-around Expected Behavior This is how MathJax works v3

Comments

@ozross

ozross commented Oct 11, 2024

Not sure whether this is a bug-report, feature-request, or whether I've failed to find configuration options that handle the situation correctly.

MathJax fails to process this part of a MathML block, containing an HTML <a href=".."> tag, which is otherwise handled correctly in modern browsers.

              <mtext> 
                <a xmlns="http://www.w3.org/1999/xhtml" href="#Sect.1">1</a>
              </mtext>
[Image: Screenshot 2024-10-12 at 9 05 26 am]

Please see this GitHub site for a fuller discussion.
Scroll to the bottom to find images of both with and w/o using MathJax.

Conceptually this is about presenting mathematical content containing text snippets, particularly with hyperlinks and/or other HTML markup, within <mtext> content of a MathML block.
It may require deeper analysis than just what is shown in this example.

@dpvc
Member

dpvc commented Oct 12, 2024

It is not clear what version of MathJax you are using (you didn't follow the instructions in the template for the issue you filed). So I can't tell if you are using v3 or v4. Version 3 doesn't implement HTML in MathML, but v4 does. However, because of the potential security issues of allowing unfiltered HTML in user-supplied expressions (either in TeX or MathML), MathJax v4 requires that you set a configuration option in order to enable this feature explicitly for MathML, and you have to load an extension for support in TeX. See the 4.0.0-alpha.1 release notes for details of this feature.

Note, however, that your MathML would be better as

<mtext href="#Sect.1">1</mtext>

without the need for an HTML island inside the MathML. If you are using v3, it would probably be possible to make a MathML input jax pre-filter that transforms the <mtext><a href="...">...</a></mtext> into <mtext href="...">...</mtext>.
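As a rough illustration of the transformation described above, here is a minimal sketch that works on the MathML source string. The function name `hoistAnchorHref` is hypothetical, the regular expression is a simplification (it assumes the anchor is the only child of the token element and carries no nested markup), and how such a function would be registered as a pre-filter on the MathML input jax is deliberately left out, since the thread does not spell out that hook.

```javascript
// Sketch only: rewrite <mtext><a href="...">text</a></mtext>
// as <mtext href="...">text</mtext> in a MathML source string.
// Covers the token elements mtext, mi, mo and mn; anything else is left untouched.
function hoistAnchorHref(mathml) {
  const pattern =
    /<(mtext|mi|mo|mn)((?:\s[^<>]*)?)>\s*<a\b[^<>]*?\shref=(["'])(.*?)\3[^<>]*>([^<]*)<\/a>\s*<\/\1>/g;
  // A replacer function is used so that '$' in an href is never treated specially.
  return mathml.replace(pattern, (match, tag, attrs, quote, href, text) =>
    `<${tag}${attrs} href=${quote}${href}${quote}>${text}</${tag}>`);
}

const source =
  '<mtext> <a xmlns="http://www.w3.org/1999/xhtml" href="#Sect.1">1</a> </mtext>';
console.log(hoistAnchorHref(source));
// prints: <mtext href="#Sect.1">1</mtext>
```

Only the `href` is carried over here; other attributes on the anchor (such as `id` or `class`) would need the same treatment if they matter.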

@dpvc dpvc added Expected Behavior This is how MathJax works v3 labels Oct 12, 2024
@ozross
Author

ozross commented Oct 23, 2024

Thanks for the tips about using v4 and (from the release notes) the mml: option allowHtmlInTokenNodes: true.
Here is the configuration that (after much testing) I'm now using.

<script>
  MathJax = {
    loader: {
      load: ['[tex]/noerrors', 'a11y/complexity', '[tex]/mathtools', '[tex]/html', '[tex]/textmacros', '[mml]/mml3']
    },
    options: {
      ignoreHtmlClass: 'tex2jax_ignore',
      processHtmlClass: 'tex2jax_process',
      enrichSpeech: 'deep',
      makeCollapsible: true,
      a11y: {
        highlight: 'None',
        backgroundColor: 'Blue',
        foregroundColor: 'Black',
        speech: true,
        subtitles: true
      }
    },
    tex: {
      packages: {'[+]': ['ams', 'noerrors', 'noundefined', 'mathtools', 'html', 'textmacros']},
      inlineMath: {'[+]': [['$', '$']]}
//      processEscapes: true
    },
	mml: {
		allowHtmlInTokenNodes: true
	},
    startup: {
      ready: () => {
        mtexthrefpre(); // set tabindex=-1 on math-anchors
//        mtexthref(); // run the hyperlink modification polyfill
        MathJax.startup.defaultReady();
       },
      pageReady: () => {
         return MathJax.startup.defaultPageReady().then(() => {
           hrefnotab(); //  set tabindex=-1 on math-anchors
           mjxmathrole(); // set math role on mjx-container
           mtexthrefpre(); // set tabindex=-1 on reconstructed math-anchors
         })
      }
    }
};
</script>
<script id="MathJax-script" async src="https://cdn.jsdelivr.net/npm/[email protected]/tex-mml-chtml.js"></script>

And here's a link to an example.

Notice that there are 3 polyfills (mjxmathrole, hrefnotab, mtexthrefpre), whose function I'll explain below in relation to using two Accessibility validators.

<script>
	// polyfill: give mjx-container 'math' role, so 'aria-label' is valid
	const mjxmathrole = function(){
		const maths = document.querySelectorAll('mjx-container>mjx-math');
		maths.forEach((mjx) => {
			mjx.parentNode.setAttribute('role','math'); // ARIA 1.1
	// try unhide - - bad idea ?
//			mjx.setAttribute("aria-hidden","false");
//			mjx.setAttribute("role","tree");
			});
	// similarly for SVG mode
		const svgm = document.querySelectorAll('mjx-container>svg>g>g[data-mml-node]');
		svgm.forEach((mjx) => { 
			mjx.parentNode.parentNode.parentNode.setAttribute('role','math'); // ARIA 1.1
			});
		};
	// polyfill: MathJax doesn't support hyperlink children of <mtext> or other tags
	const mtexthrefpre = function(){
		// MathJax screens the tab-order, so may be best to kill it first
		const mtextref = document.querySelectorAll('mtext>a,mi>a,mo>a,mn>a,mjx-html-holder>a,mtext>ref,mi>ref,mo>ref,mn>ref');
		mtextref.forEach(function(mta){
			mta.setAttribute('tabindex',"-1");
			});
		};
	// <a> in MathML is not interpreted directly prior to v4
	// move the href (and other) attribute(s) to the parent node
	// this polyfill is unnecessary with v.4 and  mml: {allowHtmlInTokenNodes: true}
	const mtexthref = function(){
		const attrs = ["href","id","class","aria-label","aria-braillelabel","aria-describedby","tabindex"]; // add more, if required
		const mtextref = document.querySelectorAll('mtext>a,mi>a,mo>a,mn>a,mtext>ref,mi>ref,mo>ref,mn>ref');
		mtextref.forEach(function(mta){
		attrs.forEach((at) => {
			if(mta.hasAttribute(at)){
				mta.parentNode.setAttribute(at,mta.getAttribute(at))};
			});
	// also move content to the parent node 
		mta.parentNode.textContent=mta.textContent;
	// remove the now-redundant node 
		mta.replaceWith();
		})};
// 
	// taborder  reverts with TeX->MML after collapse/expand - - is this fixable?
	const hrefnotab = function () {
		// remove links from tab-order, as masked by MathJax navigation
		const refs = document.querySelectorAll('mjx-mrow>a[href],g[data-mml-node]>a[href]');
		refs.forEach((mjx) => mjx.setAttribute("tabindex","-1"));
		// save old value with assistive-mml mode
		const mrefs = document.querySelectorAll('mtext[href],mi[href],mo[href],mn[href]');
		mrefs.forEach((mjx) => {
			const str = mjx.getAttribute("data-semantic-attributes");
			// getAttribute returns null when the attribute is absent,
			// so test for a non-empty string before appending
			if (str) {
				mjx.setAttribute("data-semantic-attributes",
				  str.concat(';',"tabindex:-1"));
				}
			});
		};
</script>

mjxmathrole sets role="math" on the outer mjx-container structures. Without this, the AInspector for Firefox validator objects to using aria-label on a tag with generic role.
Having role="math" seems like the best way to avoid this.

Now the Siteimprove validator complains about <a href=".."> tags occurring within the contents of MJX math containers. This is because the content has aria-hidden="true" on the first (or deeper) child node. This applies for both CHTML and SVG output modes and also for mjx-assistive-mml blocks. To resolve this I use polyfills as follows.

mtexthrefpre adds the attribute tabindex="-1" to anchors initially before MathJax runs, but also afterwards to catch any that have been generated by interpreting LaTeX and MathML inputs.

hrefnotab does the same when href=".." is specified on a MathML element, and also records this attribute value in the data-semantic-attributes under an mjx-assistive-mml block. (This may not really be necessary.)
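One caveat with recording the value this way: if hrefnotab runs more than once on the same element, plain concatenation appends `;tabindex:-1` each time. A minimal sketch of an idempotent variant follows. The helper name `withTabindexNote` is hypothetical, and the semicolon-separated `key:value` layout is only assumed from the string that the polyfill above builds.

```javascript
// Hypothetical helper: return the attribute string with "tabindex:-1" appended once.
// Returns null when there is nothing to extend, mirroring the polyfill's skip.
function withTabindexNote(current) {
  const entry = 'tabindex:-1';
  if (!current) return null;                       // attribute absent or empty
  if (current.split(';').includes(entry)) return current; // already recorded
  return `${current};${entry}`;
}

console.log(withTabindexNote('a:b'));              // prints: a:b;tabindex:-1
console.log(withTabindexNote('a:b;tabindex:-1'));  // prints: a:b;tabindex:-1
console.log(withTabindexNote(null));               // prints: null
```

Inside the forEach loop it could be used as `const next = withTabindexNote(mjx.getAttribute("data-semantic-attributes")); if (next !== null) mjx.setAttribute("data-semantic-attributes", next);`.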

An earlier polyfill, mtexthref, was designed to shift the href=".." attribute from the <a> tag to its MathML parent. It is no longer needed with allowHtmlInTokenNodes: true.

Using these polyfills, as in the configuration given above, all seems good, with Siteimprove reporting no errors.

That is, until one tries contracting and expanding mathematical expressions.
Do this in the 1st math block of the example linked earlier, which is generated from LaTeX source. Upon expanding, the tabindex="-1" (added via a polyfill) gets lost, and Siteimprove now complains.

[Image: MathJax-SImp-notabindex-fromTeX]

When the input is from MathML, this issue does not occur.
Presumably this is because the mtexthrefpre polyfill has added the tabindex attribute before the MathML code is captured and saved prior to any contraction/expansion taking place.
On the other hand, it would be the MathML generated from the LaTeX source which is saved, before a later usage of a polyfill has added the tabindex attribute. The attribute is then missing when the code is regenerated after expansion.
Can something be done about this?

BTW, this explanation can be tested somewhat, using a local copy of the example.
Comment-out the usage of mtexthrefpre in the ready function.
Then contract/expand in the 2nd and 3rd math blocks. You should find that the tabindex attribute does not return, just as with the 1st block.
For the 4th, 5th and 6th blocks, the tabindex is supplied in the MathML source itself.

On a side note, after turning on Braille generation and reloading the page, Siteimprove now complains about the aria-braillelabel attribute not being defined.
This is a Microsoft thing, yes? It isn't in any official WAI-ARIA release, but is in online MDN documentation. Since it can be turned on/off at will, I'm not really concerned about it.

@dpvc
Member

dpvc commented Oct 29, 2024

I think you are working harder than you need to, and are not taking advantage of the tools MathJax has for working with its output. A more straightforward way to handle this would be to use the renderActions configuration option to insert your own code into MathJax's rendering pipeline in order to modify the final HTML directly.

Here is an example configuration that does that.

MathJax = {
  mml: {
    allowHtmlInTokenNodes: true
  },
  options: {
    renderActions: {
      adjust: [
        250, 
        (doc) => {for (const math of doc.math) MathJax.config.adjustMath(math, doc)},
        (math, doc) => MathJax.config.adjustMath(math, doc)
      ]
    }
  },
  adjustMath(math, doc) {
    const adaptor = doc.adaptor;
    adaptor.setAttribute(math.typesetRoot, 'role', 'math');
    for (const a of adaptor.tags(math.typesetRoot, 'a')) {
      adaptor.setAttribute(a, 'tabindex', -1);
    }
  }
};

This sets up an action that comes after the math is typeset, inserted into the page, and has had the speech text attached. The adjust action gives two functions, one to call when the document is updated (via MathJax.typeset() for example), and one when an individual MathItem is updated (e.g., when expanding/contracting an expression). Both call the adjustMath() function to do the work. You can modify that function to do other tasks, if you need, but this one simply sets the role on the mjx-container element (the only one that doesn't already have a role), and sets the tabindex attribute on any anchors within the container. You don't need the more complicated queries that you are using for that.

Because this makes your code part of the MathJax pipeline, you will get the proper result for both the initial typeset and any subsequent typesetting, whether from additional math added to the page, a change of renderer, or an expression being collapsed or reopened. So the timing of these actions is taken care of automatically by using a render action.
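Since adjustMath() only touches the page through the adaptor, its logic can be tried outside a browser. The following is a minimal sketch under stated assumptions: `fakeAdaptor` and `makeNode` are hypothetical stand-ins written for this illustration (not MathJax code), implementing only the two adaptor calls used in the configuration above, `setAttribute` and `tags`. The function also takes the adaptor directly instead of the document, purely to keep the sketch small.

```javascript
// Same logic as adjustMath() in the configuration above,
// but receiving the adaptor directly (an adaptation for this sketch).
function adjustMath(math, adaptor) {
  adaptor.setAttribute(math.typesetRoot, 'role', 'math');
  for (const a of adaptor.tags(math.typesetRoot, 'a')) {
    adaptor.setAttribute(a, 'tabindex', '-1');
  }
}

// Hypothetical stand-in for the DOM adaptor: nodes are plain objects.
const fakeAdaptor = {
  setAttribute(node, name, value) { node.attributes[name] = String(value); },
  // Collect all descendants of the given kind, depth first.
  tags(node, name) {
    const found = [];
    const visit = (n) => {
      for (const child of n.children) {
        if (child.kind === name) found.push(child);
        visit(child);
      }
    };
    visit(node);
    return found;
  }
};

const makeNode = (kind, children = []) => ({kind, attributes: {}, children});

// A tiny tree shaped like typeset output containing one anchor.
const link = makeNode('a');
const root = makeNode('mjx-container', [
  makeNode('mjx-math', [makeNode('mjx-mtext', [link])])
]);

adjustMath({typesetRoot: root}, fakeAdaptor);
console.log(root.attributes.role, link.attributes.tabindex);
// prints: math -1
```

Running the function a second time on the same tree leaves the attributes unchanged, which is the property that matters when an expression is collapsed and reopened.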

Note that the handling of the aria labels and roles may be changing in the next beta release, as there are other problems in addition to the missing role on the container, and we are trying to work out a solution that works more reliably in more screen-reader/browser/OS combinations. So some of this may need to be modified in the future.

As for aria-braillelabel, that is a relatively recent addition to the ARIA standards (see the 1.3 draft), and may not be known to all the verification tools yet. A number of screen readers already have adopted it, however, so it is reasonable to use it now.

@dpvc dpvc added the Code Example Contains an illustrative code example, solution, or work-around label Oct 29, 2024
@dpvc
Member

dpvc commented Oct 29, 2024

PS, you might also want to include

<style>
mjx-container a * {
  text-decoration: inherit;
}
</style>

so that the usual underline will be used for the <mtext href=...> example that comes from the \href in the first TeX expression. Usually, underlines are not great for mathematical content, which is why MathJax doesn't do this by default, but in this case, you might prefer it.

@ozross
Author

ozross commented Nov 13, 2024

I really like it when you say I'm "working harder than I need to". :-)
Simpler answers are always what I'm looking for, and yours here works very well (of course).
Which part of the documentation would have allowed me to work out such an approach for myself?

My overall aim is to get the best configuration for MathJax to handle TeX/LaTeX coding stored in Tagged PDF documents, when such structured documents are "derived" to HTML.
The resulting HTML should match the rendering (LaTeX-based) in the original PDF, but enriched with extra features provided by MathJax; for accessibility, sub-expression expansion/contraction, etc.

There is a remaining problem with my example here, of an active link embedded in the math cases construction. That is, with the alternative text given as an aria-label at the <mjx-container> (or other level), then the presence of the hyperlink itself will not be detected by Assistive Technology, right?
After all, that is why the validator complains, and why we set tabindex=-1 to silence it.
In any case some functionality is indeed lost.

I don't see any easy way to avoid this, while maintaining the tree-like structure of the mathematical content.
If you have any ideas about this sort of thing, I'd be happy to try to implement them first in HTML, then encoded into a PDF so as to derive into that HTML construction.

All the best.
Ross
