Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for using wildcards in the IUBundleContainer #1061

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

laeubi
Copy link
Contributor

@laeubi laeubi commented Jan 20, 2024

Currently one has to specify an exact id for an IU to be used in a target but sometimes it is a bit cumbersome to do so. Also it can become quite annoying if items are added/removed to update the target.

This adds a new way of specify the id with wildcards that is enabled whenever the id contains an '*' (any number characters) or '?' (a single character), the container constructs a pattern and includes everything that matches this pattern to be considered.

Example: Selects all ius that start with jakarta. prefix

<target name="with-wildcard">
	<locations>
		<location includeAllPlatforms="false" includeConfigurePhase="true" includeMode="planner" includeSource="true" type="InstallableUnit">
			<repository location="my/repository"/>
			<unit id="jakarta.*" version="0.0.0"/>
		</location>
	</locations>
</target>

Example: Selects all features in a repository

<target name="with-wildcard">
	<locations>
		<location includeAllPlatforms="false" includeConfigurePhase="true" includeMode="planner" includeSource="true" type="InstallableUnit">
			<repository location="my/repository"/>
			<unit id="*.feature.group" version="0.0.0"/>
		</location>
	</locations>
</target>

@laeubi laeubi requested review from merks and HannesWell January 20, 2024 06:19
Currently one has to specify an exact id for an IU to be used in a
target but sometimes it is a bit cumbersome to do so. Also it can become
quite annoying if items are added/removed to update the target.

This adds a new way of specify the id with wildcards that is enabled
whenever the id contains an '*' (any number characters) or '?' (a single
character), the container constructs a pattern and includes everything
that matches this pattern to be considered.
@laeubi laeubi force-pushed the support_wildcard_in_iu branch from ec6a8ad to a13f7b8 Compare January 20, 2024 06:23
@laeubi
Copy link
Contributor Author

laeubi commented Jan 20, 2024

@merks @HannesWell this is a bit of experimental at the moment but already works quite well!

Some things that might need consideration

  • I think an IU id could theoretically contain any character but using * / ? is something I have never seen and won't expect. As an alternative one could require an additional attribute to be specified but this will require to pass more data around
  • There is not really a good way for the UI (except we allow to use the filter string as a unit match but that would be global somehow)
  • It is not possible to exclude items so its rather rough but for me workable as I often do not care about the exact number of IUs
  • I first considered allow to specify some kind of filters like Tycho does that's more flexible but also requires special knowledge from the user a simple pattern seem to be something every user can easily handle.

Alternative syntax would then be:

enable by attribute / element

<target name="with-wildcard">
	<locations>
		<location includeAllPlatforms="false" includeConfigurePhase="true" includeMode="planner" includeSource="true" type="InstallableUnit">
			<repository location="my/repository"/>
			<unit id="jakarta.*" version="0.0.0" pattern="true"/>
		</location>
	</locations>
</target>

or we simply use a new element:

<target name="with-wildcard">
	<locations>
		<location includeAllPlatforms="false" includeConfigurePhase="true" includeMode="planner" includeSource="true" type="InstallableUnit">
			<repository location="my/repository"/>
			<bundle matches="jakarta.*"/>
			<feature matches="*"/>
		</location>
	</locations>
</target>

using generic (match) expression

<target name="with-wildcard">
	<locations>
		<location includeAllPlatforms="false" includeConfigurePhase="true" includeMode="planner" includeSource="true" type="InstallableUnit">
			<repository location="my/repository"/>
			<unit>
			    <match>id ~= /jakarta.*/</match>
   			</unit>
		</location>
	</locations>
</target>

Of course one could even think about using a different name and omit the unit all together:

<target name="with-wildcard">
	<locations>
		<location includeAllPlatforms="false" includeConfigurePhase="true" includeMode="planner" includeSource="true" type="InstallableUnit">
			<repository location="my/repository"/>
			 <match>id ~= /jakarta.*/</match>
		</location>
	</locations>
</target>

Copy link

Test Results

  271 files   -     6    271 suites   - 6   47m 27s ⏱️ - 16m 24s
3 497 tests +    1  3 460 ✅ ±    0  36 💤 ± 0  1 ❌ +1 
7 898 runs   - 2 721  7 819 ✅  - 2 698  78 💤  - 24  1 ❌ +1 

For more details on these failures, see this check.

Results for commit a13f7b8. ± Comparison against base commit 76e5d1f.

@merks
Copy link
Contributor

merks commented Jan 20, 2024

I think <match>id ~= /jakarta.*/</match> is the most powerful/flexible. It corresponds most closely to what's possible via a org.eclipse.equinox.p2.metadata.IRequirement.getMatches(). If I understand correctly, I think this would allow expressing version ranges, which would be pretty darned nice...

@HannesWell
Copy link
Member

In general I really like the idea, this can significantly simplify including a group of units.

enable by attribute / element

<target name="with-wildcard">
	<locations>
		<location includeAllPlatforms="false" includeConfigurePhase="true" includeMode="planner" includeSource="true" type="InstallableUnit">
			<repository location="my/repository"/>
			<unit id="jakarta.*" version="0.0.0" pattern="true"/>
		</location>
	</locations>
</target>

That is in my opinion the best way as it ensures that existing targets continue to work as they are, even if they use unusual symbols like * or ?.
Furthermore it would even be possible to select the kind of pattern used. Simple ANT/glob style patterns or even more complex regex patterns, that would even allow to exclude IUs from being included (it would of course not be possible to exclude IUs included by another unit).

Alternatively we could use a new unit attribute like id-pattern as an alternative to id, but then one cannot choose the type of pattern.
Since such an element would then potentially match more than one unit the element name should actually be plural, but introducing new elements should be well thought out.

I think <match>id ~= /jakarta.*/</match> is the most powerful/flexible. It corresponds most closely to what's possible via a org.eclipse.equinox.p2.metadata.IRequirement.getMatches().

That's right, but since those matchers can be quite complex (what even the P2 doc mentions), I don't think we should make the full power available in targets, but spend the complexity budget of a user wisely, only when it brings enough benefit for the added complexity. Otherwise users will just continue using only the existing exact ids.

If I understand correctly, I think this would allow expressing version ranges, which would be pretty darned nice...

For version ranges we could just re-use the existing version attribute since [ ( ] ) , are characters not permitted in single versions.
I have also worked on version-ranges locally, just didn't have the time to finish this yet, but actually wanted to do that soon (unless this makes it obsolete).

But regarding versions, if a id-pattern is specified having an exact version is in most cases not useful (unless all units matched by the pattern are available with the same version), so either a no version (respectively 0.0.0 aka latest) or a version range should be specified.

  • There is not really a good way for the UI (except we allow to use the filter string as a unit match but that would be global somehow)

A text field in the UI to enter the pattern that shows all matches sound good and useful to me.

@merks
Copy link
Contributor

merks commented Jan 21, 2024

I agree that the matcher syntax not pretty to author, but would allow to express requirements on packages...

Given the units being resolved here are primarily plugins and features, those have these constraints:

Bundle symbolic name contains illegal characters. Legal characters are A-Z a-z 0-9 . _ -
Illegal value 'org.eclipse.oomph.?setup.core' for attribute 'id'. Legal token characters are "a-z", "A-Z", "0-9", "_". Tokens must be separated by "."

Also ?* are not valid characters in a group or artifact ID. So probably it's safe to just use these.

Using/allowing Java regular expressions would be much more powerful than glob-style patterns.

I think using a version (range or otherwise), in combination with an ID pattern is almost never all that sensible...

@laeubi
Copy link
Contributor Author

laeubi commented Jan 21, 2024

Yes glob pattern is rather limited and I just choose that for simplicity to test if it could work also the version is then quite questionable.

If we choose

<unit>
   <match>id ~= /jakarta.*/</match>
</unit>

I think one should simply assume id/version is always empty/ignored.

On the other hand, we already have code that can handle category/site.xml syntax so maybe one should simply allow all elements like here: https://tycho.eclipseprojects.io/doc/main/Category.html and instead of <site> choose <select> as the only root element under the repository.

This would allow to use the defined categories to display to the user and one could even reuse the category.xml editor for that purpose (even though it does not yet support the enhanced iu syntax).

@HannesWell
Copy link
Member

Given the units being resolved here are primarily plugins and features, those have these constraints:

Bundle symbolic name contains illegal characters. Legal characters are A-Z a-z 0-9 . _ -
Illegal value 'org.eclipse.oomph.?setup.core' for attribute 'id'. Legal token characters are "a-z", "A-Z", "0-9", "_". Tokens must be separated by "."

Also ?* are not valid characters in a group or artifact ID. So probably it's safe to just use these.

Thanks for the hint about permitted characters. With that in mind, I think the current suggestion to keep all elements as they are and use * and ? as markers for patterns. That's the simplest approach, that already adds much value.

Using/allowing Java regular expressions would be much more powerful than glob-style patterns.

Yes. To distinguish the type of pattern, we could also use the approach used in Tycho when filtering automatically added repo-references (which is actually just derived from Apache Common's Path-Matcher class):
By default relatively simple glob/ANT-style patterns are used, but one can still use regular-expressions, by wrapping them in %regex[<the-expression] and we added a similar way to negate the pattern via ![<the-expression].
With that all existing XML elements can be re-used.

I agree that the matcher syntax not pretty to author, but would allow to express requirements on packages...

That's right, but I have never felt the need for it yet at the TP level (but maybe that may come if it is possible).
But if we want the user a simpler way to satisfy the packages a project depends on, maybe it would be even more convenient to do it the Oomph-Targlet way and add an option to just add everything to the TP necessary to resolve all projects in the workspace?

I think using a version (range or otherwise), in combination with an ID pattern is almost never all that sensible...

Yep. Also given that in most cases (except Guava or the javax/jakarta bundles) there is only one version of a bundle in the TP personally I usually just use the latest/0.0.0 version for my dependencies and control their version implicitly by the Repo URL.

On the other hand, we already have code that can handle category/site.xml syntax so maybe one should simply allow all elements like here: https://tycho.eclipseprojects.io/doc/main/Category.html and instead of <site> choose <select> as the only root element under the repository.

That code is only in P2 or Tycho, isn't it? At least I can't see support for the more complex cases in PDE's standard category editor?

This would allow to use the defined categories to display to the user and one could even reuse the category.xml editor for that purpose (even though it does not yet support the enhanced iu syntax).

I'm not sure I understand that? Do you want to add units by category or just display them? Because the visual Target-Editor already shows the categories (just like for updates).

@laeubi
Copy link
Contributor Author

laeubi commented Jan 21, 2024

I don't think we should support regex / different styles that just adds to much complications confusion. If one needs extra control then the traditional way of defining a feature is the way to go, this is/was really meant to support things like "include all jetty http bundles".

At least I can't see support for the more complex cases in PDE's standard category editor?

The category editor does not support the IU element and its syntax, but any glob/regex/whatever pattern will not be supported either. In fact it would be more valuable to extend the category editor as this will immediately allow to edit category.xml as well.

Do you want to add units by category or just display them?

Display them by category, because you can't see the matched items otherwise.

@laeubi
Copy link
Contributor Author

laeubi commented Jan 23, 2024

@mickaelistria @akurtakov any thoughts on this?
I could even think about supporting both approaches...

@HannesWell
Copy link
Member

I don't think we should support regex / different styles that just adds to much complications confusion. If one needs extra control then the traditional way of defining a feature is the way to go, this is/was really meant to support things like "include all jetty http bundles".

If regex patterns are too complicated, we should really think twice about the (IMHO) even more complex units-filter approach.
Therefore I would be in favor to continue with the current approach, as this already adds much value.
If there is later a demand for more, other elements can still be supported.

At least I can't see support for the more complex cases in PDE's standard category editor?

The category editor does not support the IU element and its syntax, but any glob/regex/whatever pattern will not be supported either. In fact it would be more valuable to extend the category editor as this will immediately allow to edit category.xml as well.

That's right, but the IU-filter syntax is much more complex than just displaying multiple items matched by a pattern. I even think it will be quite difficult to represent it in a visual editor, maybe hardly possible (but maybe someone has good ideas).

@laeubi
Copy link
Contributor Author

laeubi commented Jan 23, 2024

If regex patterns are too complicated, we should really think twice about the (IMHO) even more complex units-filter approach.

regex patterns itself are not that complicated of course but supporting different syntax with modifiers and of course the error handling while with a glob you can't do anything "wrong" (maybe just not matching what you want)

That's right, but the IU-filter syntax is much more complex than just displaying multiple items matched by a pattern. I even think it will be quite difficult to represent it in a visual editor, maybe hardly possible (but maybe someone has good ideas).

The main difficulty is that we need to transform all items to IUs but there are already functions for this so it is possible but I think a simple text box would be suitable as a first step.

@laeubi
Copy link
Contributor Author

laeubi commented Apr 21, 2024

@HannesWell @merks I lost a bit track of this PR, should we proceed here or do we thing its not worth to change PDE in this regards?

@HannesWell
Copy link
Member

@HannesWell @merks I lost a bit track of this PR, should we proceed here or do we thing its not worth to change PDE in this regards?

I think it is definitively worth to have that. But I think we should keep this simple in the first step and just support wild-cards in IDs only with GLOB syntax. Having a good UI for that will probably be enough work already.

If we then think that's still not flexible enough, we can still add support for the full general IU syntax that P2 offers. But just a simple text block without any assistance and syntax checks is really hard to use if one just make the smallest mistake. I have made that experience already by myself for example with M2Es BND-instructions editor in the past.

Btw. I have also pushed a preliminary draft of my mentioned work to support of Version ranges in targets with:

Both PRs together will already give users many more opportunities without much more syntax to learn and new attributes or elements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants