Magic field wildcard support #383

Open
jimaek opened this issue Jun 30, 2023 · 5 comments

Comments

@jimaek
Member

jimaek commented Jun 30, 2023

How about we add wildcard support using * ?

It would be useful in cases where there is a full match but you don't want to be limited to it. E.g., there are Zenlayer probes and Zenlayer Inc. probes.
magic:zenlayer will only return 3 probes because it matches them fully.

A good solution would be magic:zenlayer* to match every single probe. Other companies have similar issues because their ASNs are not uniform and use random combinations of their company names.
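
A minimal sketch of what the proposed trailing * could look like, assuming case-insensitive glob matching against the network name. The function name match_wildcard and the sample names are hypothetical, not taken from the project's code. Note that fnmatchcase also treats ? and [ ] as special characters, which a real implementation might want to disable.

```python
from fnmatch import fnmatchcase


def match_wildcard(query: str, names: list[str]) -> list[str]:
    """Return the names that match a glob-style query, ignoring case."""
    pattern = query.lower()
    return [name for name in names if fnmatchcase(name.lower(), pattern)]


# Hypothetical network names, for illustration only.
networks = ["Zenlayer", "Zenlayer Inc.", "Vodafone Portugal"]

print(match_wildcard("zenlayer", networks))   # only the exact name
print(match_wildcard("zenlayer*", networks))  # every name starting with "zenlayer"
```

Without the *, only the exact name matches; with it, both Zenlayer variants are returned.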

@MartinKolarik
Member

It seems a little confusing because there's an implicit * already if there's no exact match. Maybe we could instead try to normalize the ASNs?
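
A rough sketch of the behaviour described here, assuming the implicit wildcard is a case-insensitive substring match that only applies when nothing matches exactly. The name match_magic and the sample data are hypothetical; the actual matching code may differ.

```python
def match_magic(query: str, names: list[str]) -> list[str]:
    """Prefer exact matches; fall back to partial matches if there are none."""
    q = query.lower()
    exact = [name for name in names if name.lower() == q]
    if exact:
        # An exact match hides every partial match.
        return exact
    # Assumed fallback: the implicit wildcard, modelled here as substring containment.
    return [name for name in names if q in name.lower()]


# Hypothetical network names, for illustration only.
networks = ["Zenlayer", "Zenlayer Inc."]

print(match_magic("zenlayer", networks))  # exact match only
print(match_magic("zenlay", networks))    # no exact match, so both are returned
```

This is why a shorter query can return more results than the full name does.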

@jimaek
Member Author

jimaek commented Jul 3, 2023

> It seems a little confusing because there's an implicit * already if there's no exact match

Internally yes, but it's not visible to the user.

> Maybe we could instead try to normalize the ASNs?

They are set by the providers, so I'm not sure how we could normalize them. It could be anything.

@MartinKolarik
Member

MartinKolarik commented Jul 3, 2023

Internally it's okay, but how do you explain it to the user? "We have xxx*, where * acts as a wildcard, but actually xxx will often act as a wildcard too."

It seems it would be cleaner to do something like we do for cities (removing the "city" suffix from "new york city") and remove the common legal form suffixes (or maybe find something even a little smarter) so that the returned names all look the same.

@MartinKolarik
Member

MartinKolarik commented Jul 3, 2023

Possible starting point: https://github.com/ProfoundNetworks/company_designator/blob/master/company_designator.yml. E.g.:

  1. Make a list of all abbr fields, sorted from the longest one.
  2. Find the first case-insensitive matching abbr field (if any) that's at the end of the ASN and remove it.
  3. If the name now ends with a comma (","), remove that as well.
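
The three steps above can be sketched roughly as follows. The ABBRS list is a small hypothetical sample standing in for the abbr fields of the linked file, and normalize_name is an illustrative name. The check that the suffix is preceded by a space or comma is an extra safeguard not mentioned in the steps, added so that a name like "Zinc" is not cut down to "Z".

```python
# Hypothetical sample of legal-form designators; the real list would be
# built from the abbr fields of the linked company_designator.yml.
ABBRS = ["Inc.", "Inc", "LLC", "Ltd.", "Ltd", "GmbH"]


def normalize_name(name: str, abbrs: list[str] = ABBRS) -> str:
    """Strip one trailing legal-form designator, then a trailing comma."""
    result = name.strip()
    lowered = result.lower()
    # Step 1: try the longest designators first so "Inc." wins over "Inc".
    for abbr in sorted(abbrs, key=len, reverse=True):
        # Step 2: case-insensitive match at the end of the name.
        if not lowered.endswith(abbr.lower()):
            continue
        head = result[: len(result) - len(abbr)]
        # Assumed safeguard: only strip when the suffix is a separate token.
        if head[-1:] in (" ", ","):
            result = head.rstrip()
            break
    # Step 3: drop a comma left behind, as in "Zenlayer, Inc".
    if result.endswith(","):
        result = result[:-1].rstrip()
    return result


print(normalize_name("Zenlayer Inc."))
print(normalize_name("Zenlayer, Inc"))
```

With this sample list, both spellings reduce to the same name, so a plain exact match would cover all of them.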

@MartinKolarik
Member

MartinKolarik commented Jul 3, 2023

One remaining problem that I see after checking our data is Vodafone, which has various national entities, e.g., Vodafone Portugal and Vodafone Germany. There I don't see a better solution right now... But regardless, I think we should aim to fix most of the cases primarily by normalizing the data and not expecting the users to use different query forms.
