CL-SciSumm Competition: Code Collection (2017-2019)

This repository contains the code from our team’s participation in the CL-SciSumm competition in three consecutive years (2017, 2018, and 2019). Each year, we proposed different strategies for the three subtasks:
Task 1A: We selected more efficient features, applied negative sampling to address the imbalanced data issue, and implemented neural network models.
Task 1B: Facet identification was based on word probabilities learned from the training set.
Task 2: We organized sentences in logical order before selecting the key ones to generate a summary.

You can find more details about the competition on the official CL-SciSumm site.
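To illustrate two of the ideas above, here is a minimal Python sketch, not the code in this repository. It assumes Task 1A examples are (citance, candidate sentence) pairs labeled 1 or 0, and that Task 1B facets are scored with add-one smoothed word probabilities. The function names, the `ratio` and `alpha` parameters, and the facet labels in the usage example are hypothetical and chosen only for illustration.

```python
import math
import random
from collections import Counter, defaultdict


def downsample_negatives(pairs, labels, ratio=1.0, seed=0):
    """Keep every positive example and a random subset of the negatives.

    `ratio` (an assumed parameter) is the number of negatives kept per positive.
    """
    positives = [(p, y) for p, y in zip(pairs, labels) if y == 1]
    negatives = [(p, y) for p, y in zip(pairs, labels) if y == 0]
    keep = min(len(negatives), int(round(ratio * len(positives))))
    rng = random.Random(seed)
    sampled = positives + rng.sample(negatives, keep)
    rng.shuffle(sampled)
    return [p for p, _ in sampled], [y for _, y in sampled]


def train_facet_word_probs(examples, alpha=1.0):
    """Estimate smoothed P(word | facet) from (tokens, facet) training examples."""
    counts = defaultdict(Counter)
    vocab = set()
    for tokens, facet in examples:
        counts[facet].update(tokens)
        vocab.update(tokens)
    probs = {}
    for facet, counter in counts.items():
        total = sum(counter.values()) + alpha * len(vocab)
        probs[facet] = {w: (counter[w] + alpha) / total for w in vocab}
    return probs


def predict_facet(tokens, probs):
    """Return the facet with the highest summed log-probability over known words."""
    def score(facet):
        return sum(math.log(probs[facet][w]) for w in tokens if w in probs[facet])
    return max(probs, key=score)


if __name__ == "__main__":
    # Toy data: 2 positive pairs and 8 negative pairs.
    pairs = list(range(10))
    labels = [1, 1] + [0] * 8
    balanced_pairs, balanced_labels = downsample_negatives(pairs, labels, ratio=1.0)
    print(len(balanced_pairs), sum(balanced_labels))

    # Toy facet training set with illustrative facet names.
    facet_probs = train_facet_word_probs([
        (["we", "propose", "method"], "Method"),
        (["results", "show", "accuracy"], "Result"),
    ])
    print(predict_facet(["the", "method"], facet_probs))
```

Words that never appear in the training set are simply skipped at prediction time; a real system would likely need a more careful treatment of unseen words and of facet priors.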
Our papers for each year can be found here:
2019: https://ceur-ws.org/Vol-2414/paper19.pdf
2018: https://ceur-ws.org/Vol-2132/paper11.pdf
2017: https://ceur-ws.org/Vol-2002/njustclscisumm2017.pdf
For more information, see also: Ma, Shutian, Jin Xu, and Chengzhi Zhang. "Automatic identification of cited text spans: a multi-classifier approach over imbalanced dataset." Scientometrics 116 (2018): 1303-1330.
We first took part in the task in 2017 (see https://ceur-ws.org/Vol-2002/), and we performed well.
Feel free to explore the code and use it for your own research or competition preparation!