DKPro BigData makes it easy to run UIMA-based natural language processing pipelines on a Hadoop cluster.

###Features
- Large-scale NLP processing using UIMA and Hadoop
- Store your corpora on a Hadoop filesystem and access them from local or distributed pipelines
- Find patterns in your textual data using adaptable collocation extraction

###Details
- Execute DKPro pipelines on a Hadoop cluster with minimal adaptation
- Read data stored on an HDFS filesystem using DKPro Collection Readers
- Read and write serialized CASes on HDFS

###Contributors:
- Hans-Peter Zorn
- Johannes Simon
- Martin Riedl
- Richard Eckart de Castilho
- Steffen Remus

##License
DKPro BigData is licensed under the Apache Software License (ASL), Version 2.0.

This project is a joint effort of UKP Lab and the Language Technology Group, Technical University of Darmstadt.