To analyze the statistical features of both the raw and the refined datasets, Evaluator visualizes the data distributions (perplexity, language, etc.) in a user-friendly manner.
Additionally, Evaluator assesses texts in three ways: language identification with fastText, perplexity evaluation with KenLM, and text scoring with ChatGPT.
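As a rough illustration of the first two evaluations, the following minimal sketch tallies a language distribution and a binned perplexity distribution over a collection of texts. It is not Evaluator's actual implementation: the function names (`summarize`, `load_real_scorers`), the bin width, and the model paths are hypothetical, and the ChatGPT-based scoring is omitted. The scorers are passed in as plain callables so the aggregation logic can be checked without downloading any model; `load_real_scorers` shows how fastText and KenLM could be plugged in, assuming both packages and suitable model files are available.

```python
from collections import Counter
from typing import Callable, Iterable, Tuple


def perplexity_from_log10(log10_prob: float, n_words: int) -> float:
    """Convert a total log10 probability into a per-token perplexity.

    The token count is n_words + 1 so that the end-of-sentence symbol,
    which KenLM scores by default, is included.
    """
    return 10.0 ** (-log10_prob / (n_words + 1))


def summarize(
    texts: Iterable[str],
    detect_language: Callable[[str], str],
    log10_score: Callable[[str], float],
    bin_width: float = 100.0,  # hypothetical bin width for the histogram
) -> Tuple[Counter, Counter]:
    """Return (language counts, perplexity histogram keyed by bin index)."""
    lang_counts: Counter = Counter()
    ppl_hist: Counter = Counter()
    for text in texts:
        # fastText's predict() rejects newlines, so flatten each text first.
        line = text.replace("\n", " ").strip()
        if not line:
            continue  # skip empty texts
        lang_counts[detect_language(line)] += 1
        ppl = perplexity_from_log10(log10_score(line), len(line.split()))
        ppl_hist[int(ppl // bin_width)] += 1
    return lang_counts, ppl_hist


def load_real_scorers(lid_path: str, lm_path: str):
    """Build scorers from real models (paths are placeholders, not from the source)."""
    import fasttext  # third-party: language identification
    import kenlm     # third-party: n-gram language model

    lid = fasttext.load_model(lid_path)
    lm = kenlm.Model(lm_path)

    def detect(line: str) -> str:
        labels, _probs = lid.predict(line, k=1)
        # fastText labels look like "__label__en"; keep only the language code.
        return labels[0].replace("__label__", "", 1)

    # kenlm.Model.score returns the log10 probability of the whole sentence.
    return detect, lm.score
```

The two counters returned by `summarize` can then be handed to any plotting library to draw the language and perplexity distributions for the raw and refined datasets side by side.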