DOM side-effects in Tasks #1

DrRataplan · 2019-02-12T14:26:59Z

Hi Debbie and Adam,

I really enjoyed your talk in Prague! It gave me a lot of ideas, especially because (as I mentioned during my presentation) we're facing a challenge: expressing the DOM forking model we at Fonto use for our commands in an XQuery model. We basically try out a new structure, validate it, and act accordingly.

I'm wondering how to best express that into your task structure.

Say I have the following thing I'd like to chain:

Look for a set of DOM nodes
Remove some of them
Report on the new situation

At first sight, this seems to mix very fluently with your Task proposal. I came up with this:

let $items := //item
return task:of("hello")
  ?bind(function ($id) as xs:boolean {
    (: Let's say that fonto:remove-items returns whether it removed something :)
    fonto:remove-items($items[@id = $id]//things-to-remove)
  })
  ?bind(function ($hasItemsBeenMutated) {
    if ($hasItemsBeenMutated) then
      (: return the new count :)
      ($items//things-to-remove => count()) || " thingies left"
    else
      "Nothing removed"
  })

Our main question is: how can $items represent two different data structures? This forces us either to greedily 'copy' the DOM when we make it a 'pure' value, or to make the engine aware of tasks, so that the DOM behaves differently depending on which tasks have been executed and in which task we are currently looking at the DOM.

This seems to make the Tasks framework unsuitable for modelling DOM-based side-effects. Have you thought of addressing this, or do you know of any ways to work around this problem?

adamretter · 2019-02-14T10:20:41Z

@DrRataplan Hi Martin, thanks for contacting us. I am glad you enjoyed the talk. I cannot speak about DOM manipulation directly; that is really Debbie's area of expertise, and I think she will respond to you as well.

However, if you want to thread multiple items through the chain, you can just make items an array of sequences. Although I guess you know that, and are looking for something else?
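
One possible reading of the "array of sequences" suggestion, sketched in plain XQuery 3.1 without the Tasks API: each step receives or returns an array whose members are whole sequences, so several values can travel through the chain together. The sample data and the names local:select and local:report are made up for illustration.

```xquery
xquery version "3.1";

(: Hypothetical sample data; the element names follow the thread's example. :)
declare variable $local:doc :=
  <items>
    <item id="hello"><things-to-remove/><things-to-remove/></item>
    <item id="other"><things-to-remove/></item>
  </items>;

(: Step 1: pack two sequences into one array so both can be passed on. :)
declare function local:select($id as xs:string) as array(*) {
  let $items := $local:doc/item
  return [ $items, $items[@id = $id]/things-to-remove ]
};

(: Step 2: unpack the members by position; each member is a whole sequence. :)
declare function local:report($state as array(*)) as xs:string {
  let $items := $state(1)
  let $candidates := $state(2)
  return
    count($candidates) || " of " || count($items//things-to-remove)
      || " thingies selected"
};

"hello" => local:select() => local:report()
```

With the sample data above this evaluates to "2 of 3 thingies selected". The array keeps the two sequences apart; concatenating them with a comma would flatten them into one sequence.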

DrRataplan · 2019-02-14T20:41:07Z

@adamretter, @deblock,

Correct, what I'm actually looking for is a way to 'formalize' DOM side-effects, and I don't want to resort to vendor-specific APIs. Especially now that there's an initiative that feels like such a good fit for what I'm doing!

What I'm basically trying to do (in terms of XQuery Update Facility) is to apply the pending update list (PUL) of the previously run tasks to the DOM/world that is active 'during' a task. After the task (chain) has finished, the PUL can be returned as usual, as you described at the end of paragraph 4.1.4 in your paper.

What this would mean is that a Path Expression would be 'bound' to the DOM state at a given time, which requires intimate knowledge of the XQuery engine at the least. Because the expression is merely bound, we would not have to disable optimizations like lazy evaluation: as long as we can still read from that DOM state, we can resolve the Path Expression. This may, however, interfere with variable inlining (in a functional language, referencing a variable is equivalent to inlining its expression, and vice versa).

Another approach would be to thread the things that we're going to edit as an array of sequences, like you suggested. What I like about this idea is that it does not require any knowledge of the engine. What I do not fancy is that (to allow me to 'edit' these elements):

I run into problems with where a Path Expression is placed versus 'when' it is (lazily) evaluated (see my example).
I could address the inconsistent paths by effectively (minimally) cloning the DOM and applying the pending updates there. This can be a function inside a task step. However, I will then run into problems with node identities: a node in an outer closure is not the same node in an inner closure. If they were the same node, it wouldn't make sense for an expression to yield different values for the same input. If $nodeA is $nodeB holds true, then surely $nodeA//things-to-remove except $nodeB//things-to-remove must be the empty sequence.
To address the identity problem, I would have to document that it is absolutely illegal to read from an outer closure when you're using tasks. This feels bad because there is (as far as I know) no way to enforce this.
Choosing which nodes to pass in the outer sequence could turn out to be tricky. You'll be tempted to pass the whole document as a 'world' to change, but what if we want to have side-effects on all the documents? I am unsure whether this is an issue, though: passing a node effectively gives one access to the whole new document anyway (fn:root($new-node)), and I don't even want to think about how things like fn:doc should behave.
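
The cloning approach and its node-identity consequence can be illustrated with the standard copy/modify expression of XQuery Update Facility, assuming a processor that supports it. This is only a sketch of the general behaviour, not of Fonto's forking model, and the sample element is made up.

```xquery
(: Hypothetical sample data standing in for the editor's document. :)
let $item :=
  <item id="hello"><keep/><things-to-remove/><things-to-remove/></item>

(: copy/modify applies the pending update list to the copy only,
   so $item itself never changes. :)
let $edited :=
  copy $clone := $item
  modify delete nodes $clone/things-to-remove
  return $clone

return (
  $item is $edited,                  (: false: the clone has a new identity :)
  count($item/things-to-remove),     (: 2: the outer variable still sees the old state :)
  count($edited/things-to-remove),   (: 0: only the clone reflects the deletion :)
  root($edited) is root($item)       (: false: the clone lives in its own tree :)
)
```

The last comparison shows why passing a single node hands over a whole new 'world': fn:root on the clone never leads back to the original tree.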

I acknowledge that having DOM side-effects in one way or another requires intimate knowledge and some help of the XQuery engine and the DOM implementation. I also think that they could be very useful. If we could express DOM side-effects, it would effectively allow one to write a framework for atomic transactions within XQuery. Besides that, having DOM side-effects would also address one of the key points that confuses someone who is just learning Update Facility: why can't I see my changes?

Does this make any sense to you? I am looking to find a way to express my problem in terms of your proposal, if that is possible.

deblock · 2019-02-15T16:51:48Z

Hi Martin,

Indeed, great to hear from you and to hear of your interest in Tasks. Your explanations are very useful, but yes, this all takes some thinking about to understand well enough...

I think somehow what we need to do is to include all of the DOM side-effecting actions inside the task chain. And if one action changes nodes in the DOM, then you actually need to pass those nodes through the chain. Does something like the following make sense?

let $items := //item, $thingies := count($items//things-to-remove)
return task:of("hello")
  ?bind(function ($id) as element(item)* {
    (:  Rather than returning whether it removed something, can fonto:remove-items actually return the
    new item with the things-to-remove removed?
    Or even better, have a 2-arg function which takes a sequence of items, and $id, and
    returns a new sequence of items, for which things-to-remove have been removed for the item with
    the given $id :)
    fonto:remove-items($items, $id)
  })
  ?bind(function ($itemsWithSomeThingsRemoved) {
    if (count($itemsWithSomeThingsRemoved//things-to-remove) ne $thingies) then
      (: return the new count :)
      ($itemsWithSomeThingsRemoved//things-to-remove => count()) || " thingies left"
    else
      "Nothing removed"
  })

I think basically it isn't safe to refer to $items again in the task chain if an earlier task has a side effect of changing those nodes.
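
A minimal sketch of what such a two-argument function could look like, assuming a processor with XQuery Update Facility support. local:remove-items is a hypothetical stand-in, not the actual fonto:remove-items, and the sample data is made up.

```xquery
(: Hypothetical stand-in for fonto:remove-items: returns a new sequence in
   which the item with the given id has lost its things-to-remove descendants. :)
declare function local:remove-items($items as element(item)*, $id as xs:string)
    as element(item)* {
  for $item in $items
  return
    if ($item/@id = $id) then
      copy $clone := $item
      modify delete nodes $clone//things-to-remove
      return $clone
    else
      $item
};

let $items := (
  <item id="hello"><things-to-remove/><things-to-remove/></item>,
  <item id="other"><things-to-remove/></item>
)
let $result := local:remove-items($items, "hello")
return (
  count($items//things-to-remove),   (: 3: the original sequence is untouched :)
  count($result//things-to-remove)   (: 1: only the returned sequence shows the removal :)
)
```

Because the original sequence still reports the old count, any later step that reads $items instead of the returned value would silently see the state from before the removal.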

adamretter · 2019-02-17T20:42:27Z

I am just wondering, why don't we start with a:

task:of(//item)

Also, are there existing functions for reading/writing the DOM in Fonto? If so, we could just encapsulate those as well.

DrRataplan · 2019-02-25T15:19:47Z

Hi @adamretter, @deblock,

Let me give you an example of a function we're trying to expose to XQuery: horizontal-insert. This function accepts a container element node, the new element we're trying to insert, and (optionally) a preferred offset at which to insert it. The mutation inserts the new element under the container at a valid offset, preferring the offset that is passed. It is special because it uses the schema and the 'current state' of the DOM to generate a new state. After the changes are applied, we can query them to take further actions. For more info, this mutation is documented at https://documentation.fontoxml.com/api/latest/insertnodehorizontal-16324515.html.

This mutation depends on the current state of the DOM and outputs a new state. I agree with @deblock that this should be seen as a task that does exactly that:

let $original-metadata := /metadata
return
task:of($original-metadata)
  ?bind(function ($metadata-element as element()) {
    (: Insert the new value, assume it worked. Let's ignore the selection for now :)
    fonto:horizontal-insert-node($metadata-element, <metadata-value/>)
  })
  ?bind(function ($mutated-metadata-element as element()) {
    (: We can work with the new value here :)
    trace($mutated-metadata-element/metadata-value)
  })

This will pose some difficulty, because we will have to optimize node cloning ($metadata-element should not be the same element as $mutated-metadata-element), and it may make it harder for us to add additional APIs, like reading from the selection, but that's a later concern.

In conclusion: I think that the tasks proposal is a good fit for our APIs. Nodes that will be changed should be passed through the chain. These passed nodes must behave as if they were clones of the nodes returned in the previous step.

Thanks for thinking along! This has been very helpful to me!

adamretter · 2019-02-26T10:24:57Z

Thanks @DrRataplan.

I think, though, that for all intents and purposes it won't matter whether you pass a mutable or an immutable element between steps: either way, the side-effect has been cleanly encapsulated and deferred until execution time (not evaluation time).
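
The difference between evaluation time and execution time can be sketched with a deliberately tiny stand-in for a task, written in plain XQuery 3.1. This is not the implementation from the Tasks proposal: local:task, local:of and the map keys 'run' and 'fmap' are assumptions made for this sketch only.

```xquery
xquery version "3.1";

(: A hypothetical stand-in for a task: a map holding a zero-argument function
   ('run') plus a combinator ('fmap') that composes another step without
   running anything. :)
declare function local:task($run as function() as item()*) as map(*) {
  map {
    'run': $run,
    'fmap': function ($step as function(item()*) as item()*) as map(*) {
      local:task(function () { $step($run()) })
    }
  }
};

(: Wraps a plain value so that it can start a chain. :)
declare function local:of($value as item()*) as map(*) {
  local:task(function () { $value })
};

(: Building the chain only composes functions; no step has been applied yet. :)
let $chain :=
  local:of(<metadata/>)
    ?fmap(function ($m) { <metadata>{ $m/node(), <metadata-value/> }</metadata> })
    ?fmap(function ($m) { count($m/metadata-value) })

(: The steps are applied only when 'run' is invoked. :)
return $chain?run()
```

With the sample input this evaluates to 1. Each step sees only the value handed to it by the previous step, which is why it makes no difference to the chain whether that value is a mutated node or a fresh copy.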
