escape characters in URI's #141

Open
lklic opened this issue Mar 9, 2021 · 2 comments


lklic commented Mar 9, 2021

We have a lot of escape characters in the URIs, for example:
https://artresearch.net/resource/?uri=https%3A%2F%2Fpharos.artresearch.net%2Fresource%2Fhertziana%2Fwork%2F08075675%252CT%252C002%252CT%252C002%252CT%252C073



mafragias commented Mar 26, 2021

That is because, in the MIDAS data, the IDs for the different levels can look like this:

<a5001>00001333,T,001,T</a5001>
<a5002>00000090,T,002,T,001</a5002>
<a5003>00012395,T,002,T,002,T,001</a5003>

and the comma in these IDs gets escaped in the URI.
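To illustrate where the `%252C` sequences come from, here is a minimal sketch in Python. It assumes (this is not confirmed in the thread) that the comma is percent-encoded once when the resource URI is built, and that the whole URI is then encoded a second time when it is passed as the `uri` query parameter. The ID is the one decoded from the URL reported above; the variable names are hypothetical.

```python
from urllib.parse import quote, unquote

# ID as decoded from the URL reported in this issue.
midas_id = "08075675,T,002,T,002,T,073"
base = "https://pharos.artresearch.net/resource/hertziana/work/"

# First pass (assumed): the comma in the ID becomes %2C.
resource_uri = base + quote(midas_id, safe="")

# Second pass (assumed): the full URI is encoded again as a query value,
# so every "%" from the first pass becomes "%25", turning %2C into %252C.
page_url = "https://artresearch.net/resource/?uri=" + quote(resource_uri, safe="")

print(resource_uri)
print(page_url)

# Decoding twice recovers the original comma-separated ID.
print(unquote(unquote(page_url)))
```

Under these assumptions, an ID without reserved characters would pass through the first encoding step unchanged, and only the unavoidable `%3A` and `%2F` of the outer encoding would remain.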

I don't think we should hash them and lose the ID; however, we could probably remove the comma during ETL.

@MinadakisNikos

We need to consider this because it has created many issues. We will most probably have to normalize the IDs by choosing a character other than the comma.
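A minimal sketch of what such a normalization step could look like in the ETL, in Python. The replacement character (`_`) and the function name `normalize_midas_id` are assumptions for illustration only; the thread does not settle on a character. The sketch refuses IDs that already contain the replacement character, so the original comma-separated ID can always be recovered.

```python
from urllib.parse import quote

# Hypothetical replacement character; "_" is unreserved in URIs,
# so it is never percent-encoded.
SEPARATOR = "_"

def normalize_midas_id(raw_id: str, separator: str = SEPARATOR) -> str:
    """Replace commas in a MIDAS ID with a URI-safe separator."""
    cleaned = raw_id.strip()
    if separator in cleaned:
        # Keep the mapping reversible: fail instead of producing an ambiguous ID.
        raise ValueError(f"ID already contains {separator!r}: {cleaned!r}")
    return cleaned.replace(",", separator)

# IDs taken from the MIDAS examples quoted earlier in this thread.
for raw in ("00001333,T,001,T", "00000090,T,002,T,001", "00012395,T,002,T,002,T,001"):
    normalized = normalize_midas_id(raw)
    # The normalized ID survives percent-encoding unchanged.
    print(raw, "->", normalized, "->", quote(normalized, safe=""))
```

Unlike hashing, this keeps the ID readable, and replacing the separator with a comma again restores the MIDAS form.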
