Olivia-fsm/NLP4Education


Literature Review: NLP for Education

NLP Tasks/Keywords

  • Summarization
  • Simplification
  • Question Answering
  • Question Generation
    • Free-answer Question
      • Factoid Question
      • Causal Question
    • MCQ
  • Retrieval
  • Domain-specific
  • Second-language Learning
  • Writing Assessment
  • Grammar/Spelling Error Correction
  • Peer Review
  • Dialogue
  • Interactive Evaluation
  • Active Learning

Category by Task

  • T1-Extraction
  • T2-AbsSummarization
  • T3-Simplification
  • T4-Enrichment
  • T5-Negation

Categorized by Domain

  • Language
  • Scientific
  • Math
  • Not-Specified

Previous Special-Topic Samples

Including Signed Languages in Natural Language Processing Kayo Yin, Amit Moryossef, Julie Hochgesang, Yoav Goldberg, Malihe Alikhani [ACL-2021]

Towards Faithfully Interpretable NLP Systems: How should we define and evaluate faithfulness? Alon Jacovi, Yoav Goldberg [ACL-2020]

Survey Paper

A Review on Question Generation from Natural Language Text Ruqing Zhang, Jiafeng Guo, Lu Chen, Yixing Fan, Xueqi Cheng [ACM Transactions on Information Systems, 2021]

Automatic question generation and answer assessment: a survey Bidyut Das, Mukta Majumder, Santanu Phadikar, Arif Ahmed Sekh [Research and Practice in Technology Enhanced Learning-2021]

A Systematic Review of Automatic Question Generation for Educational Purposes Ghader Kurdi, Jared Leo, Bijan Parsia, Uli Sattler, Salam Al-Emari [International Journal of Artificial Intelligence in Education-2020]

Putting Humans in the Natural Language Processing Loop: A Survey Zijie J. Wang, Dongjin Choi, Shenyu Xu, Diyi Yang [EACL-2021]

Dataset & Benchmark

Scisummnet: A large annotated corpus and content-impact models for scientific paper summarization with citation networks Michihiro Yasunaga, Jungo Kasai, Rui Zhang, Alexander R Fabbri, Irene Li, Dan Friedman, and Dragomir R Radev [AAAI-2019]

What Should I Learn First: Introducing LectureBank for NLP Education and Prerequisite Chain Learning Irene Li, Alexander R. Fabbri, Robert R. Tung, Dragomir R. Radev [AAAI-2019] [dataset]

Multi-Task Identification of Entities, Relations, and Coreference for Scientific Knowledge Graph Construction Yi Luan, Luheng He, Mari Ostendorf, Hannaneh Hajishirzi [EMNLP-2018]

LearningQ: A Large-Scale Dataset for Educational Question Generation Guanliang Chen, Jie Yang, G. Houben [ICWSM-2018] [code]

Educational Taxonomy

Evaluating the quality of learning: The SOLO taxonomy (Structure of the Observed Learning Outcome).

Taxonomy of educational objectives

Categorize by Purpose

Assist Human Reading Comprehension

[1976] Automatic question generation from text—An aid to independent study --- [*Improve independent study of reading materials]

[2009] Generating Instruction Automatically for the Reading Strategy of Self-Questioning --- [*mention Reading Strategy]

[2017] Learning to Ask: Neural Question Generation for Reading Comprehension --- [Seq2Seq]

Difficulty controllable question generation for reading comprehension

Difficulty-controllable Question Generation (DQG)

[2020] Exploring Artificial Jabbering for Automatic Text Comprehension Question Generation

Flaw analysis of existing AQG systems for promoting reading comprehension

[2021] [preprint] I Do Not Understand What I Cannot Define: Automatic Question Generation With Pedagogically-Driven Content Selection

The paper introduces a novel pedagogically meaningful content selection mechanism to find question-worthy sentences and answers in arbitrary textbook contents. The authors conducted an empirical evaluation study with educational experts, annotating 150 generated questions in six different domains. Results indicate a high linguistic quality of the generated questions. Furthermore, the evaluation results imply that the majority of the generated questions inquire about central information related to the given text and may foster text comprehension in specific learning scenarios.

Writing Support

[2012] G-Asks: An Intelligent Automatic Question Generation System for Academic Writing Support

This is the first project, to our knowledge, that contributes a system for generating content-specific questions that support writing. [When students are asked to write a literature review or an essay, the purpose is often not only to develop disciplinary communication skills, but also to learn and reason from multiple documents. This involves skills such as sourcing (i.e., citing sources as evidence to support arguments) and information integration (i.e., presenting the evidence in a cohesive and persuasive way).]

Knowledge Acquisition & Active Learning

[1998] Learning concepts by asking questions

Marvin uses concepts it has learned previously to learn new concepts. The program forms hypotheses about the concept being learned and tests the hypotheses by asking the trainer questions. Learning begins when the trainer shows Marvin an example of the concept to be learned. The program determines which objects in the example belong to concepts stored in memory. A description of the new concept is formed by using the information obtained from memory to generalize the description of the training example.

[2017] Question generation for language learning: From ensuring texts are read to supporting learning.

We want to broaden the perspective on the different functions questions can play in FLTL and discuss how automatic question generation can support the different uses. We discuss two types of questions serving this purpose and how they can be generated automatically, and we report on a crowd-sourcing evaluation comparing automatically generated questions to manually written questions targeting particle verbs, a challenging linguistic form for learners of English.

Knowledge Assessment

[2006] FAST: An automatic generation system for grammar tests

Manually designed patterns; grammar tests

[2010] Good Question! Statistical Ranking for Question Generation

Overgenerate-and-rank: overgenerate candidate questions, then rank them with a binary logistic regression classifier
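The overgenerate-and-rank idea can be sketched in a few lines: produce many candidate questions by rule, then order them with a logistic scorer. A minimal illustration; the toy features and hand-picked weights below are stand-ins for the trained classifier and much richer feature set used in the paper:

```python
import math

def features(question: str) -> list[float]:
    """Toy surface features; the paper used surface, syntactic,
    and vagueness features, among others."""
    tokens = question.split()
    return [
        float(len(tokens)),                              # length
        1.0 if question.endswith("?") else 0.0,          # well-formed ending
        1.0 if tokens and tokens[0].lower() in
            {"who", "what", "when", "where", "why", "how"} else 0.0,
    ]

def score(question: str, w=(-0.05, 1.2, 1.5), b=-1.0) -> float:
    """Logistic score in (0, 1); the weights are hand-picked stand-ins
    for coefficients a trained classifier would supply."""
    z = b + sum(wi * xi for wi, xi in zip(w, features(question)))
    return 1.0 / (1.0 + math.exp(-z))

def rank(candidates: list[str]) -> list[str]:
    """Overgenerate elsewhere, then rank candidates by classifier score."""
    return sorted(candidates, key=score, reverse=True)

candidates = [
    "What did the author propose?",
    "the author propose what",
    "Why was the method effective?",
]
best = rank(candidates)[0]
```

In the paper's setup, candidates come from syntactic transformation rules and the ranker is trained on human quality judgments of generated questions.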

[2012] Generating Grammar Exercises

Language Learning

[2013] Discriminative Approach to Fill-in-the-Blank Quiz Generation for Language Learners

[2016] Automatic generation of short answer questions for reading comprehension assessment

[2017] Automatic Generation of English Reference Question by Utilising Nonrestrictive Relative Clause

[2017] Factual open cloze question generation for assessment of learner’s knowledge

Automatic factual open cloze question generation system; sentences are considered informative based on part-of-speech tags and certain rules.

[1999] A web-based system for automatic language skill assessment: Evaling

Linguistic Exercise Design for Native French Language Assessment

[2018] Compiling Questions into Balanced Quizzes about Documents

Instruction/Feedback Generation

Adaptive Learning

Question Generation for Adaptive Education Megha Srivastava, Noah Goodman [ACL-2021]

[2018] Learning to Automatically Generate Fill-In-The-Blank Quizzes

MATH

[2017] Intelligent Math Tutor: Problem-Based Approach to Create Cognizance

We built a tool called Intelligent Math Tutor (IMT), which automatically generates mathematical word problems such that teachings from other subjects in a given curriculum can also be incorporated. Our tool is the first of its kind to explicitly blend knowledge from multiple dissociated subjects and use it to enhance the cognizance of its learners.

Programming

On the Use of Semantic-Based AIG to Automatically Generate Programming Exercises

NO PARTICULAR EDUCATIONAL PURPOSE

[2011] Automatic Question Generation using Discourse Cues

We present a system that automatically generates questions from natural language text using discourse connectives. We explore the usefulness of discourse connectives for Question Generation (QG), looking at the problem beyond the sentence level. Our work divides the QG task into content selection and question formation. Content selection consists of finding the relevant part of the text to frame a question from, while question formation involves sense disambiguation of the discourse connectives, identification of the question type, and applying syntactic transformations on the content. The system is evaluated manually for syntactic and semantic correctness.
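The two-stage split (content selection, then question formation) can be sketched as follows; the connective-to-question-type mapping and the single template are invented simplifications, and the real system also performs sense disambiguation and proper syntactic transformations:

```python
# Map discourse connectives to question types (illustrative subset).
CONNECTIVE_TO_QTYPE = {
    "because": "why",
    "since": "why",
    "so that": "why",
    "although": "yes/no",
}

def select_content(sentence: str):
    """Content selection: locate a discourse connective and split the
    sentence around it into a target clause and an answer clause."""
    low = sentence.lower().rstrip(".")
    for conn, qtype in CONNECTIVE_TO_QTYPE.items():
        if f" {conn} " in low:
            left, _, right = low.partition(f" {conn} ")
            return qtype, left, right
    return None

def form_question(sentence: str):
    """Question formation: turn the clause before the connective into a
    'why' question (only the 'why' type is sketched; no subject-aux
    inversion is applied in this toy)."""
    picked = select_content(sentence)
    if picked is None:
        return None
    qtype, left, right = picked
    if qtype == "why":
        return f"Why {left}?", right
    return None

q, answer = form_question("The match was cancelled because it rained heavily.")
# q      -> "Why the match was cancelled?"
# answer -> "it rained heavily"
```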

[2016] Infusing NLU into Automatic Question Generation

Deep Questions without Deep Understanding

An ontology-crowd-relevance workflow, consisting of first representing the original text in a low-dimensional ontology, then crowdsourcing candidate question templates aligned with that space, and finally ranking potentially relevant templates for a novel region of text.

[2018] Answer-focused and Position-aware Neural Question Generation

[2019] Key Phrase Extraction for Generating Educational Question-Answer Pairs [Proceedings of the Sixth ACM Conference on Learning @ Scale-2019]

Generates keyphrases as candidates first, then generates questions. No particular evaluation of educational value.

Automatically Generating Cause-and-Effect Questions from Passages. Katherine Stasaski, Manav Rathod, Tony Tu, Yunfang Xiao, and Marti A. Hearst. [ACL-2021]

We build a pipeline that extracts causal relations from passages of input text and feeds these as input to a state-of-the-art neural question generator. The extractor is based on prior work that classifies causal relations by linguistic category. This work results in a new, publicly available collection of cause-and-effect questions.

Improving Question Generation With to the Point Context Jingjing Li, Yifan Gao, Lidong Bing, Irwin King, Michael R. Lyu. [EMNLP-2019]

Harvesting paragraph-level question-answer pairs from wikipedia Xinya Du, Claire Cardie [ACL-2018]

Leveraging Context Information for Natural Question Generation Linfeng Song, Zhiguo Wang, Wael Hamza, Yue Zhang, Daniel Gildea [ACL-2018]

Paragraph-level Neural Question Generation with Maxout Pointer and Gated Self-attention Networks Yao Zhao, Xiaochuan Ni, Yuanyuan Ding, Qifa Ke [EMNLP-2018]

Capturing Greater Context for Question Generation Luu Anh Tuan, Darsh J Shah, Regina Barzilay [AAAI-2020]

How to Ask Good Questions? Try to Leverage Paraphrases Xin Jia, Wenjie Zhou, Xu SUN, Yunfang Wu [ACL-2020]

PathQG: Neural Question Generation from Facts Siyuan Wang, Zhongyu Wei, Zhihao Fan, Zengfeng Huang, Weijian Sun, Qi Zhang, Xuanjing Huang [EMNLP-2020]

Joint Passage Ranking for Diverse Multi-Answer Retrieval Sewon Min, Kenton Lee, Ming-Wei Chang, Kristina Toutanova and Hannaneh Hajishirzi [EMNLP-2021]

Improving Unsupervised Question Answering via Summarization-Informed Question Generation Chenyang Lyu, Lifeng Shang, Yvette Graham, Jennifer Foster, Xin Jiang and Qun Liu [EMNLP-2021]

Asking It All: Generating Contextualized Questions for any Semantic Role Valentina Pyatkin, Paul Roit, Julian Michael, Yoav Goldberg, Reut Tsarfaty and Ido Dagan [EMNLP-2021]

Perhaps PTLMs Should Go to School – A Task to Assess Open Book and Closed Book QA Manuel Ciosici, Joe Cecil, Dong-Ho Lee, Alex Hedges, Marjorie Freedman and Ralph Weischedel [EMNLP-2021]

Distractor Analysis and Selection for Multiple-Choice Cloze Questions for Second-Language Learners Lingyu Gao, Kevin Gimpel and Arnar Jensson. [BEA@ACL-2021]

Automatic Distractor Generation for Multiple Choice Questions in Standard Tests Zhaopeng Qiu, Xian Wu [CoLing-2020]

Predicting the Difficulty and Response Time of Multiple Choice Questions Using Transfer Learning Kang Xue, Victoria Yaneva, Christopher Runyon and Peter Baldwin. [BEA@ACL-2020]

On the Application of Transformers for Estimating the Difficulty of Multiple-choice Questions from Text Luca Benedetto, Giovanni Aradelli, Paolo Cremonesi, Andrea Cappelli, Andrea Giussani, and Roberto Turrin. [BEA@ACL-2020]

Using Linguistic Features to Predict the Response Process Complexity Associated with Answering Clinical MCQs Victoria Yaneva, Daniel Jurich, Le An Ha, and Peter Baldwin. [BEA@ACL-2021]

When Retriever-Reader Meets Scenario-Based Multiple-Choice Questions ZiXian Huang, Ao Wu, Yulin Shen, Gong Cheng and Yuzhong Qu [EMNLP-2021]

GooAQ: Open Question Answering with Diverse Answer Types Daniel Khashabi, Amos Ng, Tushar Khot, Ashish Sabharwal, Hannaneh Hajishirzi and Chris Callison-Burch [EMNLP-2021]


Distractor Generation

Distractor Generation for Multiple Choice Questions Using Learning to Rank Chen Liang, Xiao Yang, Neisarg Dave, Drew Wham, Bart Pursel, C. Lee Giles [NAACL-2018]

Evaluation

[2016] How do machine-generated questions compare to human-generated questions?

Bio-domain question generation evaluation. Results show no difference between AQG and human-generated questions.

[2017] Evaluation of automatically generated pronoun reference questions

Pronoun reference questions are multiple choice questions that ask test takers to choose the antecedent of a target pronoun in a reading passage from four options. The evaluation was performed from two perspectives: that of English teachers and that of English learners. Item analysis suggests that machine-generated questions achieve quality comparable to human-made questions. Correlation analysis revealed a strong correlation between the scores of machine-generated questions and those of human-made questions.
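Correlation analysis of this kind reduces to a Pearson coefficient over paired per-item scores; a self-contained sketch, with made-up scores standing in for the study's data:

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two paired score lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical per-item scores on machine- vs. human-made questions
machine = [0.62, 0.71, 0.55, 0.80, 0.66]
human   = [0.60, 0.75, 0.50, 0.82, 0.70]
r = pearson(machine, human)  # close to 1.0 indicates strong agreement
```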

Difficulty Controllability

[2016] Item Difficulty Analysis of English Vocabulary Questions

Advanced Subtask Formulation

Conversation/Feedback Generation

[2020] Review-based question generation with adaptive instance transfer and augmentation

[2020] [Generating clarifying questions in conversational search systems](https://dl.acm.org/doi/10.1145/3340531.3418513)

QA/QG as Interpretability Tool

[2020] Asking and answering questions to evaluate the factual consistency of summaries

Knowledge Graph

[2017] Knowledge Questions from Knowledge Graphs

Extractive Summarization

The Effect of Pretraining on Extractive Summarization for Scientific Documents Yash Gupta, Pawan Sasanka Ammanamanchi, Shikha Bordia, Arjun Manoharan, Deepak Mittal, Ramakanth Pasunuru, Manish Shrivastava, Maneesh Singh, Mohit Bansal, Preethi Jyothi [NAACL-2021]

Simplification

Metaphors in Text Simplification: To change or not to change, that is the question. Yulia Clausen and Vivi Nastase. [BEA@ACL-2019]

Text Simplification by Tagging. Kostiantyn Omelianchuk, Vipul Raheja, and Oleksandr Skurzhanskyi. [EACL-2021] [code]

Interpreting Neural CWI Classifiers’ Weights as Vocabulary Size Yo Ehara. [BEA@ACL-2020]

Content Enrichment

An Empirical Investigation of Neural Methods for Content Scoring of Science Explanations Brian Riordan, Sarah Bichler, Allison Bradford, Jennifer King Chen, Korah Wiley, Libby Gerard and Marcia C. Linn. [BEA@ACL-2020]

Argument Mining

"Sharks are not the threat humans are": Argument Component Segmentation in School Student Essays. Tariq Alhindi and Debanjan Ghosh. [EACL-2021]

Domain Knowledge

Equipping Educational Applications with Domain Knowledge. Tarek Sakakini, Hongyu Gong, Jong Yoon Lee, Robert Schloss, JinJun Xiong and Suma Bhat. [BEA@ACL-2019] [dataset]

Training and Domain Adaptation for Supervised Text Segmentation. Goran Glavaš, Ananya Ganesh, and Swapna Somasundaran. [EACL-2021]

Error Generation

Artificial Error Generation with Fluency Filtering. Mengyang Qiu and Jungyeul Park. [BEA@ACL-2019]

Learning Outcome

From Receptive to Productive: Learning to Use Confusing Words through Automatically Selected Example Sentences Chieh-Yang Huang, Yi-Ting Huang, Mei-Hua Chen and Lun-Wei Ku. [ACL-2019]

Interactive Analysis

Virtual Pre-Service Teacher Assessment and Feedback via Conversational Agents Debajyoti Datta, Maria Phillips, James P. Bywater, Jennifer Chiu, Ginger S. Watson, Laura Barnes, and Donald Brown. [BEA@ACL-2021]

B-Pref: Benchmarking Preference-Based Reinforcement Learning Kimin Lee, Laura Smith, Anca Dragan, Pieter Abbeel [NeurIPS-2021]

Towards Interactive Language Modeling Maartje ter Hoeve et al. [preprint]

Other Generation Tasks

Automated curriculum generation through setter-solver interactions Sebastien Racaniere, Andrew Lampinen, Adam Santoro, David Reichert, Vlad Firoiu, Timothy Lillicrap [ICLR-2020]

Future Direction

About

NLP for Education Purpose
