Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consistent Select/Query interface with xmlquery #7

Open
Maldris opened this issue Mar 15, 2022 · 7 comments
Open

Consistent Select/Query interface with xmlquery #7

Maldris opened this issue Mar 15, 2022 · 7 comments
Labels
enhancement New feature or request

Comments

@Maldris
Copy link

Maldris commented Mar 15, 2022

In the sibling xmlquery package, the Node struct has public methods;
SelectElement(name string) *Node and
SelectElements(name string []*Node

jsonquery by comparison only has SelectElement.

Given both packages export the global function Find and FindOne both can obviously be implemented for JSON, as they already are.

I propose a change where jsonquery's SelectElement implementation mimics xmlquery's and internally calls FindOne, and similarly add SelectElements that internally calls Find.

To this end, both node types can be identified and used with an interface of

type Selector interface {
	InnerText() string
	SelectElement(name string) Selector
	SelectElements(name string) []Selector
}

Allowing basic manipulation of both file types to only differ during load.

The alternative to this being using Query and QueryAll to allow errors to be returned, instead of simply returning a nil result/panicking, or implementing both sets as methods in each package and allowing users to make the selection.

Is this proposal sound and within the intended scope of this project?
If so I will make the changes and a pull request.

@Maldris
Copy link
Author

Maldris commented Mar 15, 2022

My main concern with this change is that it changes the behaviour of jsonquery's SelectElement which currently only looks at immediate children with a matching field name, instead of running a full query that can look at the full child tree as xmlquery does.

As a result, this would be a breaking change. However, it would mean a consistent interface, and more predictable behaviour compared to the other query and select functions which all run a full query.

@zhengchun zhengchun added the enhancement New feature or request label Mar 16, 2022
@zhengchun
Copy link
Contributor

I forget why no SelectElements method. May be is JSON doesn't support Duplidate Keys.

the below is incorrect json structure.

{
	"aa": "sss",
	"aa": "sss"
}

In json the SelectElements is doesn't make sense, because we don't have same keys in the childs element. Let me know if I'm wrong. Thanks

@Maldris
Copy link
Author

Maldris commented Mar 16, 2022

TLDR:
while yes a query like //aa for your example json doesn't make sense, a query like //array/elements, //name or //cars//name on the example below do, and are already implemented at the package level. I'd like to add them as methods so that there is a common interface (and I am currently testing this on a fork).

The main issue with this being doing so changes SelectElement from only looking at siblings and immediate children to the whole subtree.


My main use case consideration was twofold;

A work project involves a large configuration driven file processor, historically it worked only with xml files, but now adding JSON files. Having an abstract interface for both would mean the logic remains the same, but with different loaders, making the whole application simpler than putting type switches everywhere.

To achieve this, the main case was queries like getting the elements of an array, or getting the presence of a keyword in sibling object properties.

Such as for the test json:

{
	"name":"John",
	"age":30,
	"cars": [
		{ "name":"Ford", "models":[ "Fiesta", "Focus", "Mustang" ] },
		{ "name":"BMW", "models":[ "320", "X3", "X5" ] },
		{ "name":"Fiat", "models":[ "500", "Panda" ] }
	]
}

running the query //cars//name to find all the car names, or //cars/element to get each array element so that the system can then range over those lists and run configured operations on each.

Given that Find/FindOne or Query/QueryAll are already implemented at the package level, and for xmlquery the methods just call them (passing the subject node as top) I'm unsure why the same pattern wasn't taken for this package?
yes certain classes of query such as the example you gave aren't valid, but other ones like the above example are.

I've forked both packages and implemented a common interface, I'm currently looking at improving the test coverage in jsonquery, then I'll be doing a test in a real system to make sure it works as intended with real documents rather than simple unit tests.

jsonquery fork
xmlquery fork

*Edited some typos

@zhengchun
Copy link
Contributor

You are right, I forgot the Array[] type in JSON. SelectElements can works for array type.

@Maldris
Copy link
Author

Maldris commented Mar 16, 2022

I'll be doing some testing tomorrow with the forked packages, so I'll report back if its working as intended, then we can make decisions on how to handle the breaking change to SelectElement and whether to merge.

@Maldris
Copy link
Author

Maldris commented Mar 17, 2022

ok, mixed bag results, I created the interface

type Selector interface {
    // data retrieval
    InnerText() string

    // structure query
    SelectElement(query string) Selector
    SelectElements(query string) []Selector
    Query(query string) (Selector, error)
    QueryAll(query string) ([]Selector, error)
    QuerySelector(query *xpath.Expr) Selector
    QuerySelectorAll(query *xpath.Expr) []Selector

    // ===================== //
    // currently not working //
    // ===================== //

    // signature differs between packages
    // OutputXML() string 
    // vs
    // OutputXML(self bool) string

    // only in jsonquery
    // ChildNodes() []*Node 
    // could be added to xmlquery easily

    // only in xmlquery
    // SelectAttr(name string) string
    // no analog in jsonquery
}

As you can tell by the comments there were a few issues, the different signatures and differing functions could be overcome without too much issue, but the main issue;

I'd forgotten that even if *Node satisfies the Selector interface []*Node will not match []Selector due to the way go handles slices and not making O(n) calls implicit (see a stack overflow answer that also links to a few examples in the comments).
As a result, I'd need to change the return types of each function to return the interface, which is not ideal as it masks the methods not in the interface, as well as field properties of the struct.

At this point, proxying the interface with a wrapper type is probably the best choice for my application, as a naïve example:

func ConvertToSelector(doc *jsonquery.Node) common.Selector {
	return selectorImpl{node: doc}
}

type selectorImpl struct {
	node *jsonquery.Node
}

func (n selectorImpl) InnerText() string {
	return n.node.InnerText()
}

func (n selectorImpl) SelectElements(query string) []common.Selector {
	res := n.node.SelectElements(query)
	result := make([]common.Selector, len(res))
	for i := range res {
		result[i] = selectorImpl{node: res[i]}
	}
	return result
}

// ...

The new method versions of the selector functions still help with this, but this does detract somewhat from the motivation

My current thought is to at least merge the forks and add the methods as they still have utility and can help other users, and improves test coverage.
But moving towards having a consistent interface as part of the package would be a larger longer term change, if still desirable.

Any thoughts?
otherwise I'll create the pull requests for the last pushed commits which only added the methods and additional test coverage and the discussion can move to specifically those changes in the pull request itself

@zhengchun
Copy link
Contributor

Thans for your suggest, but This change will breaking the previous xmlquery and jsonquery package. xmlquery and jsonquery are independent packages, and both are depend on xpath package.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants