dkpro-bigdata

DKPro BigData enables the easy execution of UIMA-based natural language processing pipelines on a Hadoop cluster.

### Features

  • Large-scale NLP processing using UIMA and Hadoop

  • Store your corpora on a Hadoop filesystem and access them from local or distributed pipelines

  • Find patterns in your textual data using adaptable collocation extraction

### Details

  • Execute DKPro pipelines on a Hadoop cluster with minimal adaptation

  • Read data stored on an HDFS file system using DKPro Collection Readers

  • Read/write serialized CASes from HDFS

### Contributors

  • Hans-Peter Zorn

  • Johannes Simon

  • Martin Riedl

  • Richard Eckart de Castilho

  • Steffen Remus

## License

DKPro BigData is licensed under the Apache Software License (ASL), Version 2.0.

This project is a joint effort of UKP Lab and the Language Technology Group, Technical University of Darmstadt.