Ask about the approach #8

Open
KienPM opened this issue May 1, 2020 · 8 comments

Comments


KienPM commented May 1, 2020

Thank you for sharing this great repository!
Could you share why you treat the article body as one long sequence instead of as separate sentences? Would it be possible to encode each sentence first and then use the sentence representation vectors to encode the article body?

Owner

wuch15 commented May 1, 2020

Of course, you can use other architectures such as HAN to process the news body, and the performance is usually slightly better. However, it usually requires more GPU memory or a smaller batch size.

Author

KienPM commented May 1, 2020

Yeah, I've tried encoding each sentence and then encoding the article body on a 2080 Ti GPU. I could only train with batch size = 1, and it took 25 s/step, so maybe something went wrong.
What does HAN stand for? Could you please share a reference? Thank you very much!

Author

KienPM commented May 1, 2020

HAN stands for Hierarchical Attention Network, right?

Owner

wuch15 commented May 1, 2020

Yeah, HAN means Hierarchical Attention Network (Yang et al., 2016). You can replace the LSTM with a CNN to speed up training.
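
For readers following the thread, here is a minimal sketch of the two-level idea behind HAN: pool word vectors into a sentence vector, then pool sentence vectors into a body vector. It is not the repository's code. It leaves out the contextual encoder (LSTM or CNN) and the learned projection that a full HAN applies before scoring, and the names `attention_pool`, `encode_body`, `word_query`, and `sentence_query` are hypothetical. In a trained model the query vectors are learned; here they are fixed toy values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(vectors, query):
    """Weighted average of `vectors`, with weights softmax(dot(v, query))."""
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def encode_body(body, word_query, sentence_query):
    """body: list of sentences, each sentence a list of word vectors."""
    # Level 1: one vector per sentence, pooled from its word vectors.
    sentence_vectors = [attention_pool(sentence, word_query) for sentence in body]
    # Level 2: one vector for the whole body, pooled from the sentence vectors.
    return attention_pool(sentence_vectors, sentence_query)


# Toy input (assumed values): 2 sentences of 2-dimensional word vectors.
body = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]],
]
# All-zero queries give uniform attention weights at both levels.
body_vector = encode_body(body, word_query=[0.0, 0.0], sentence_query=[0.0, 0.0])
print(body_vector)
```

Because the sentence level runs once per sentence, the work and the stored activations grow with the number of sentences times the sentence length, which is consistent with the memory cost mentioned above.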

Owner

wuch15 commented May 1, 2020

https://www.aclweb.org/anthology/N16-1174

Author

KienPM commented May 1, 2020

Yeup, thank you!
I've read some of your papers; they are awesome.

Owner

wuch15 commented May 1, 2020

In addition, I highly recommend using a shorter sentence length or fewer sentences. Although I believe that using the full news body is beneficial, it takes a large amount of GPU memory, and the improvement is usually marginal.
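
A small sketch of what that recommendation could look like in preprocessing, assuming the body is already tokenized into lists of integer token ids. The helper names, the padding id `0`, and the example limits are all assumptions for illustration, not values from the repository. The token count is only a rough proxy for activation memory.

```python
def truncate_and_pad(body, max_sentences, max_sentence_len, pad_id=0):
    """Clip a tokenized body to a fixed [max_sentences, max_sentence_len] grid."""
    grid = []
    for sentence in body[:max_sentences]:
        clipped = sentence[:max_sentence_len]
        # Right-pad short sentences so every row has the same length.
        grid.append(clipped + [pad_id] * (max_sentence_len - len(clipped)))
    # Add all-padding rows if the body has fewer sentences than the limit.
    while len(grid) < max_sentences:
        grid.append([pad_id] * max_sentence_len)
    return grid


def tokens_per_batch(batch_size, max_sentences, max_sentence_len):
    """Number of token positions the encoder processes per batch."""
    return batch_size * max_sentences * max_sentence_len


# Hypothetical token ids for a body with three sentences.
body = [[5, 8, 2, 9, 4], [7, 3], [6, 1, 1]]
grid = truncate_and_pad(body, max_sentences=2, max_sentence_len=4)
print(grid)  # [[5, 8, 2, 9], [7, 3, 0, 0]]

# Illustrative limits only: smaller limits shrink the per-batch input.
print(tokens_per_batch(32, 50, 30))  # 48000
print(tokens_per_batch(32, 20, 20))  # 12800
```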

Author

KienPM commented May 1, 2020

Yeah, thanks for your recommendation

2 participants