Coreference resolution #5

Open
valeriobasile opened this issue Oct 21, 2016 · 2 comments

Comments

@valeriobasile
Owner

By default, Boxer's "resolve" option is turned off. This means that KNEWS misses a lot of coreferences, e.g. for "John is here. He is speaking" it should produce: John is the Agent of Speaking.
The "resolve" option works when using the online version of Boxer, but it doesn't work with the local version, and it should.

@roquelopez
Collaborator

Hi,
I compared the outputs of the online and local versions with the "resolve" option on, and both outputs are the same.

The reference resolution is OK. The "problem" is in the format of the XML generated by Boxer. In your example ("John is here. He is speaking"), Boxer knows that "He" refers to "John" (it keeps an attribute named variable, which has the same value for both tokens), but it still keeps other attributes for the token "He", such as symbol (symbol=male), type (type=n), etc.

I implemented a function that modifies the XML to replace those attributes with the ones of the target token. In the example, "He" ends up with the same attributes as "John".

Below are two examples of Boxer output before and after calling the new function.

I) Robert is driving the car. He is running in the street.
BEFORE
predicates = [{'token_end': 6, 'token_start': 6, 'symbol': 'male', 'sense': '2', 'variable': 'x1', 'type': 'n'}, ...]
namedentities = [...]

AFTER
predicates = [...]
namedentities = [{'token_end': 6, 'token_start': 6, 'symbol': 'robert', 'variable': 'x1', 'type': 'nam', 'class': 'per'}, ...]

II) The car is old. It was used by Peter.
BEFORE
predicates = [{'token_end': 5, 'token_start': 5, 'symbol': 'thing', 'sense': '12', 'variable': 'x1', 'type': 'n'}, ...]

AFTER
predicates = [{'token_end': 5, 'token_start': 5, 'symbol': 'car', 'sense': '0', 'variable': 'x1', 'type': 'n'}, ...]

However, this change apparently has no effect on the pipeline, since the other KNEWS components don't use the modified attributes. They only use the "variable" attribute, which acts like an ID for each token. The new function could still be useful for new KNEWS features.
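
For illustration, here is a minimal sketch of the kind of attribute replacement described above, working on the parsed predicates/namedentities lists rather than on the XML itself. It is not the actual KNEWS code: the function name copy_antecedent_attributes, the PRONOUN_SYMBOLS set, and the rule "the earliest non-pronoun entry with the same variable is the antecedent" are all assumptions made for the example.

```python
from copy import deepcopy

# Assumption: symbols that mark an unresolved pronoun ("he", "she", "it").
PRONOUN_SYMBOLS = {"male", "female", "thing"}
POSITION_KEYS = ("token_start", "token_end")


def copy_antecedent_attributes(predicates, namedentities):
    """Return new (predicates, namedentities) lists in which every pronoun
    entry takes the attributes of the entry it corefers with.

    Two entries are taken to corefer when they share the same 'variable'.
    The pronoun keeps its own token positions.
    """
    entries = [(e, "pred") for e in predicates]
    entries += [(e, "ne") for e in namedentities]

    # Assumption: the earliest non-pronoun entry per variable is the antecedent.
    antecedents = {}
    for entry, kind in sorted(entries, key=lambda pair: pair[0]["token_start"]):
        if kind == "pred" and entry.get("symbol") in PRONOUN_SYMBOLS:
            continue
        antecedents.setdefault(entry["variable"], (entry, kind))

    new_predicates = []
    new_namedentities = deepcopy(namedentities)
    for entry in predicates:
        target = antecedents.get(entry["variable"])
        if entry.get("symbol") in PRONOUN_SYMBOLS and target is not None:
            source, kind = target
            resolved = deepcopy(source)
            for key in POSITION_KEYS:
                resolved[key] = entry[key]
            # A pronoun resolved to a named entity moves to that list.
            if kind == "ne":
                new_namedentities.append(resolved)
            else:
                new_predicates.append(resolved)
        else:
            new_predicates.append(deepcopy(entry))
    return new_predicates, new_namedentities


# Hypothetical input shaped like example II; the 'car' entry is assumed,
# since the antecedent entries are elided ("...") in the outputs above.
predicates = [
    {"token_end": 1, "token_start": 1, "symbol": "car", "sense": "0", "variable": "x1", "type": "n"},
    {"token_end": 5, "token_start": 5, "symbol": "thing", "sense": "12", "variable": "x1", "type": "n"},
]
resolved_predicates, resolved_entities = copy_antecedent_attributes(predicates, [])
print(resolved_predicates[1])
```

Note that a symbol-based pronoun check like this would also skip a literal noun such as "thing" when looking for antecedents, so a real implementation would need a more reliable way to tell pronouns apart.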

@shankha117
@roquelopez do you know how to use the API to generate Boxer DRS output in XML format from a .txt file?
