Example project: A corpus of funny German #76

Open
alanakbik opened this issue Mar 10, 2023 · 2 comments
Labels
feature Have an idea on how to improve the code base? Come forward and let us know.


@alanakbik
Contributor

One way to test Fundus would be to execute a few example projects ourselves and see if Fundus is up to the task :)

Here is an idea for an example project: Make a corpus of funny German text

Steps:

  • select at least two sources (Titanic, Eulenspiegel etc.)
  • create parsers for these sources
  • create a corpus of at least 200 articles
  • save in JSON format for easy distribution
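
The last two steps could be sketched roughly as follows. This is only an illustration using the Python standard library, not Fundus code: the function names `build_corpus` and `save_corpus` and the article fields `source`, `title` and `text` are assumptions made for this example, and the articles are assumed to have already been collected by the parsers from the earlier steps.

```python
import json
from pathlib import Path
from typing import Dict, Iterable, List


def build_corpus(
    articles: Iterable[Dict[str, str]],
    min_size: int = 200,
    min_sources: int = 2,
) -> List[Dict[str, str]]:
    """Deduplicate articles and check the size and source requirements.

    Each article is assumed (hypothetically) to be a dict with the keys
    "source", "title" and "text".
    """
    seen = set()
    corpus = []
    for article in articles:
        # Treat the same title from the same source as a duplicate.
        key = (article["source"], article["title"])
        if key in seen:
            continue
        seen.add(key)
        corpus.append(article)

    sources = {article["source"] for article in corpus}
    if len(sources) < min_sources:
        raise ValueError(f"need at least {min_sources} sources, got {len(sources)}")
    if len(corpus) < min_size:
        raise ValueError(f"need at least {min_size} articles, got {len(corpus)}")
    return corpus


def save_corpus(corpus: List[Dict[str, str]], path: str) -> None:
    """Write the corpus to a single UTF-8 JSON file."""
    # ensure_ascii=False keeps German umlauts readable in the output file.
    Path(path).write_text(
        json.dumps(corpus, ensure_ascii=False, indent=2), encoding="utf-8"
    )
```

A single JSON file keeps distribution simple; for a much larger corpus, one JSON object per line might be easier to stream, but that is a design choice beyond what the steps above ask for.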
@Weyaaron
Collaborator

Seems like fun, I will attempt it :)

@MaxDall added the feature label on Apr 19, 2023
@Weyaaron
Collaborator

Weyaaron commented May 8, 2023

I would like to specify this a bit more: in particular, we should add a list of publishers we want to investigate. I started working on #185, but this list should be done like #54. @alanakbik, do you mind if I hijack the top comment for this one?
