xid creation in dataset #9

Open
aburns4 opened this issue Feb 22, 2021 · 2 comments
Comments

@aburns4

aburns4 commented Feb 22, 2021

Hi,

I was wondering if you could explain how you obtained the 'xid' values for the annotations in the dataset. Did you perform breadth or depth first search and number elements in the DOM according to the traversal? Were there any other specifications to count the elements, such as whether they were visible or not?

Thank you!

@ppasupat
Contributor

Hello. The xids were assigned in the order in which each opening tag appears in the document, which is equivalent to a depth-first (pre-order) traversal. Every tag gets an xid, including invisible ones.

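The dataset, which was processed with BeautifulSoup, uses the following loop:

# i is an integer counter initialized before the loop
for x in soup.body(True):     # Select all nodes: same as soup.body.find_all(True), every tag under <body> in document order
    x['data-xid'] = i
    i += 1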

The demo Chrome extension also does something similar in the injectXids function.
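
To illustrate the "order of the opening tag" numbering, here is a minimal sketch that uses Python's standard-library html.parser instead of BeautifulSoup, so it is not the code that produced the dataset. The class name XidNumberer, the helper number_tags, and the starting value of 0 are all assumptions made for the example; the thread does not say where the counter starts.

```python
from html.parser import HTMLParser


class XidNumberer(HTMLParser):
    """Number every tag under <body> in the order its opening tag appears."""

    def __init__(self, start=0):  # start=0 is an assumption, not from the dataset
        super().__init__()
        self.next_xid = start
        self.in_body = False
        self.assigned = []  # list of (xid, tag name) pairs

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            # <body> itself is not numbered, only the tags inside it.
            self.in_body = True
            return
        if self.in_body:
            # Visibility is never checked, so hidden tags are numbered too.
            self.assigned.append((self.next_xid, tag))
            self.next_xid += 1

    def handle_endtag(self, tag):
        if tag == "body":
            self.in_body = False

    # handle_data is not overridden, so text content is never numbered here.


def number_tags(html, start=0):
    """Hypothetical helper: return the (xid, tag) pairs for an HTML string."""
    parser = XidNumberer(start)
    parser.feed(html)
    parser.close()
    return parser.assigned


sample = (
    "<html><body>"
    '<div style="display:none"><p>Hi <b>there</b></p></div>'
    "<span>end</span>"
    "</body></html>"
)
for xid, tag in number_tags(sample):
    print(xid, tag)
```

In this sketch the hidden div is numbered first, then its descendants p and b, and only then the sibling span, which is the pre-order depth-first sequence. The text content ("Hi ", "there", "end") is passed over without a number, because only opening tags advance the counter.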

@aburns4
Author

aburns4 commented Feb 28, 2021

Hi, okay. What about text elements that do not have children? It doesn't seem they have xids in the data files.

Thank you again.
