Performance on existing tagging datasets #6

Open
dennlinger opened this issue Apr 22, 2022 · 2 comments

Comments

@dennlinger

Hi Paul,
I'm a researcher working on temporal tagging, and I was wondering whether you have any information on Timexy's tagging performance on existing datasets, such as TempEval (for English) or KRAUTS (for German).

I'm also curious whether you could comment on which tags and temporal expressions your package handles well and where it might struggle (normalization, language-specific expressions, etc.), and how it compares with existing packages such as HeidelTime. Let me know if you'd like evaluation results for your future work; I'm happy to help with that (I'm busy this week and next, but free to contribute afterwards).

Best,
Dennis

@paulrinckens
Owner

Hi Dennis,

There is no evaluation of Timexy on public datasets yet.

Currently, temporal expression extraction and normalization are purely rule-based, which makes it easy to see and track which expressions are handled well. For expressions of type date, the existing Timexy language modules show transparently which patterns are covered.
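To illustrate what "rule-based" means here, a minimal standard-library sketch of one extraction-plus-normalization rule follows. It is not Timexy's actual rule set: the pattern `GERMAN_DATE` and the helper `extract_dates` are hypothetical names, and the rule only covers numeric German dates of the form DD.MM.YYYY.

```python
import re
from datetime import date

# Hypothetical rule (not taken from Timexy): numeric German dates, DD.MM.YYYY.
GERMAN_DATE = re.compile(r"\b(\d{1,2})\.(\d{1,2})\.(\d{4})\b")


def extract_dates(text):
    """Return (start, end, matched_text, iso_value) for each valid date."""
    results = []
    for m in GERMAN_DATE.finditer(text):
        day, month, year = (int(g) for g in m.groups())
        try:
            # Normalization step: map the surface form to an ISO 8601 value.
            iso = date(year, month, day).isoformat()
        except ValueError:
            continue  # skip impossible dates such as 31.02.2023
        results.append((m.start(), m.end(), m.group(0), iso))
    return results


matches = extract_dates("Das Treffen ist am 10.10.2023 in Berlin.")
for start, end, surface, iso in matches:
    print(start, end, surface, iso)
```

Because every covered expression corresponds to an explicit pattern like this one, coverage can be checked by reading the rules rather than by running a benchmark.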

That said, if you plan to evaluate Timexy on public datasets, your contribution would be very welcome.

Best,

Paul

@chlor

chlor commented Jun 8, 2023

Hi Paul,

What is the right configuration for using the Timexy language modules?

When I try to use Timexy to parse German date strings (e.g., 10.10.2023), I get an error like this:

Span could not be retrieved for annotation of type timexy for datestring 10.10.2023 with character offsets (14, 24). Skipping the match.

Do you have a code example for configuring the detection or annotation of German date strings?

Best, Christina
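One plausible reading of the message above, offered as an assumption rather than a confirmed diagnosis: the rule matched the date string, but its character offsets did not line up with token boundaries, so no span could be built. spaCy's `Doc.char_span` behaves this way in its default strict alignment mode, returning `None` when the offsets fall inside a token. The sketch below mimics that check with the standard library only; the whitespace tokenizer and the helper names `token_offsets` and `strict_char_span` are hypothetical stand-ins, and spaCy's real tokenizers split text differently.

```python
import re


def token_offsets(text):
    """Hypothetical whitespace tokenizer: (start, end) for each token."""
    return [(m.start(), m.end()) for m in re.finditer(r"\S+", text)]


def strict_char_span(text, start, end):
    """Return text[start:end] only if both offsets sit on token boundaries."""
    offsets = token_offsets(text)
    starts = {s for s, _ in offsets}
    ends = {e for _, e in offsets}
    if start in starts and end in ends:
        return text[start:end]
    return None  # misaligned offsets: no span can be retrieved


aligned = "Termin am 10.10.2023 um neun"
glued = "Termin am 10.10.2023."  # date and final period form a single token

# Offsets (10, 20) cover the date string in both sentences.
print(strict_char_span(aligned, 10, 20))
print(strict_char_span(glued, 10, 20))
```

If this is the cause, comparing the tokens the pipeline produces for the failing sentence with the reported offsets should show where the date is merged with or split from neighbouring characters.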
