This repository contains a large-scale temporal question answering dataset designed for evaluating and training language models on temporal reasoning tasks. The dataset consists of question-answer pairs with a focus on temporal aspects, covering a wide range of events and entities from 1987 to 2023.
- Size: The dataset comprises 100,228,457 question-answer pairs, making it one of the largest temporal question answering datasets available.
- Question Types: Questions are labeled by difficulty as either easy or hard, with each level testing a different depth of temporal reasoning and understanding.
- Content: The dataset covers a diverse range of events and entities, sourced from Wikipedia and Wikidata, ensuring a rich and varied set of questions for evaluation.
- Metadata: Each question-answer pair includes additional metadata, such as entity/event IDs, question difficulty ratings, and temporal attributes, providing valuable information for analysis and model evaluation.
- Performance Evaluation: The dataset can be used to evaluate the performance of language models on temporal reasoning tasks, including across-time comparison, event/entity detection, and multi-hop reasoning.
- Fine-Tuning: Researchers can leverage this dataset for fine-tuning language models, enhancing their temporal reasoning capabilities and performance on similar tasks.
- Download: The dataset is available on Hugging Face.
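
Because each pair carries a difficulty rating, one natural way to use the dataset for evaluation is to report accuracy separately for easy and hard questions. The following is a minimal sketch of that idea, not the official evaluation script. The field names `question`, `answer`, and `difficulty` are assumptions about the schema, the exact-match scoring rule is a simplification, and the four records are made-up toy rows rather than samples from the dataset.

```python
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences do not count as errors."""
    return " ".join(text.lower().split())


def accuracy_by_difficulty(records, predictions):
    """Return exact-match accuracy per difficulty level.

    `records` is a list of dicts with (assumed) keys "answer" and "difficulty";
    `predictions` is a list of model answers in the same order.
    """
    if len(records) != len(predictions):
        raise ValueError("records and predictions must have the same length")

    correct = defaultdict(int)
    total = defaultdict(int)
    for record, prediction in zip(records, predictions):
        level = record["difficulty"]
        total[level] += 1
        if normalize(prediction) == normalize(record["answer"]):
            correct[level] += 1
    return {level: correct[level] / total[level] for level in total}


# Toy rows for illustration only; they are NOT taken from the dataset.
records = [
    {"question": "In which year did the Berlin Wall fall?",
     "answer": "1989", "difficulty": "easy"},
    {"question": "In which year did euro banknotes and coins enter circulation?",
     "answer": "2002", "difficulty": "easy"},
    {"question": "Which happened first: the fall of the Berlin Wall or the dissolution of the Soviet Union?",
     "answer": "the fall of the Berlin Wall", "difficulty": "hard"},
    {"question": "In which year did the host country of the 1992 Summer Olympics start using euro cash?",
     "answer": "2002", "difficulty": "hard"},
]

# Hypothetical model outputs, one per record.
predictions = ["1989", "2001", "The fall of the Berlin Wall", "2002"]

scores = accuracy_by_difficulty(records, predictions)
for level, score in sorted(scores.items()):
    print(f"{level}: {score:.2f}")
```

Splitting the score by difficulty makes it easier to see whether a model handles direct date lookups but struggles with comparisons or multi-hop questions. If the actual schema uses different column names or a numeric difficulty rating, only the dictionary keys in `accuracy_by_difficulty` need to change.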