Update projects.md
bplank authored Aug 28, 2024
1 parent 48a4012 commit 41d7b79
Showing 1 changed file with 0 additions and 4 deletions.
4 changes: 0 additions & 4 deletions _pages/projects.md
@@ -192,10 +192,6 @@ What linguistic phenomena contribute to the prominence of entities in a document
Build an NLP model that predicts the prominent entities in a document and evaluate accordingly based on reference summaries.
Start with the CNN/DM dataset. Project can be extended to other genres and languages. Level: BSc or MSc.

- :hourglass_flowing_sand: *Understanding Political Party Manifestos*. Recent advancements in natural language processing have already changed adjacent fields like political science and communication research. A long-standing question in these and other social sciences is how to turn textual data into ecologically valid numerical representations. In the field of party communication, the question of how to turn party manifestos into numerical vector representations has been studied by the Manifesto Project for decades. With substantial human effort, the project codes thousands of party manifestos into several dozen political issue categories. This project aims to use recent advances in natural language inference and zero-shot classification to reproduce the human codings produced by the Manifesto Project. Level: MSc (could be adapted to BSc). References: Intro to the political science theory behind the Manifesto Project (Chapters 1-3): [Lemmer 2023](https://doi.org/10.25593/978-3-96147-671-8); Paper on Natural Language Inference: [Laurer et al. 2024](https://www.cambridge.org/core/journals/political-analysis/article/less-annotating-more-classifying-addressing-the-data-scarcity-issue-of-supervised-machine-learning-with-deep-transfer-learning-and-bertnli/05BB05555241762889825B080E097C27); [Manifesto Project website](https://manifesto-project.wzb.eu/).

- ~~*Characteristics of language between amateur and expert poets.*~~ Writing is an art - a beautiful and moving poem has various characteristics that readers relate to and that draw their minds into an imaginative tale. This project aims to better understand and characterize the writing styles of amateur and expert poets. The first step would be constructing a corpus of poems or prose by experienced and amateur writers from online sources, checking carefully for copyright. The data would have to be cleaned and preprocessed. Afterwards, various NLP techniques such as sentiment analysis or analysis of metaphors will be used to better understand and characterize various writing styles. If time allows, the corpus could be expanded across genres and time periods for a more comprehensive analysis of writing style. References: [Kao & Jurafsky 2015](https://aclanthology.org/2015.lilt-12.3.pdf), [Kao & Jurafsky 2012](https://aclanthology.org/W12-2502.pdf), [Gopidi & Alam 2019](https://aclanthology.org/W19-4702.pdf). Level: BSc or MSc


<a name="v4"/>
### V4: Human-centric Natural Language Understanding: Uncertainty, Perception, Cognition, Vision, Interpretability