HanMiner Extension for RapidMiner

Understand unstructured text data without writing code! The HanMiner Extension for RapidMiner provides a fast, easy-to-use toolset for processing and mining text in Mandarin Chinese (Han language). It lets researchers and data analysts extract valuable information from text with no programming knowledge required. Use it to build your own workflows for public opinion monitoring, sentiment analysis, keyword extraction for word clouds, and more.

Features

Document reader/writer
Processing
- Word segmentation (tokenization)
- Filtering
  - Filter stopwords
  - Filter tokens
  - Filter documents
Feature Extraction
- Word Count
- Keyword extraction
- Vectorizer
  - Count Vectorizer
  - TfIdf Vectorizer
  - Doc2Vec
Analyzing
- Part-of-Speech (POS) Tagging
- Named Entity Recognition (NER)
Translation
- Simplified Chinese to Traditional Chinese
- Traditional Chinese to Simplified Chinese
Classification
- Document classification
- Sentiment Analysis
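
HanMiner exposes these features as drag-and-drop operators, so no code is needed to use them. For readers curious about what the stopword filter, Count Vectorizer, and TfIdf Vectorizer compute conceptually, here is a minimal Python sketch. It is not HanMiner's implementation: the sample documents, the stopword list, the function names, and the exact TF-IDF weighting (term frequency times `log(N / df)`) are all assumptions chosen for illustration, and the input is assumed to be already segmented into words.

```python
import math
from collections import Counter

# Hypothetical documents, assumed to be already segmented into tokens.
docs = [
    ["我", "喜欢", "这", "部", "电影"],
    ["我", "不", "喜欢", "这", "本", "书"],
    ["这", "部", "电影", "很", "好"],
]

# Illustrative stopword list (an assumption, not HanMiner's built-in list).
STOPWORDS = {"我", "这", "部", "本", "很"}


def filter_stopwords(tokens, stopwords=STOPWORDS):
    """Drop tokens that appear in the stopword set."""
    return [t for t in tokens if t not in stopwords]


def count_vectors(token_lists):
    """Return (vocabulary, per-document raw term counts)."""
    vocab = sorted({t for tokens in token_lists for t in tokens})
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append([counts[t] for t in vocab])
    return vocab, vectors


def tfidf_vectors(token_lists):
    """Return (vocabulary, per-document TF-IDF weights).

    Uses tf = count / document length and idf = log(N / df).
    Other tools may use smoothed or normalized variants.
    """
    vocab, counts = count_vectors(token_lists)
    n_docs = len(token_lists)
    # Document frequency: number of documents containing each term.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    vectors = []
    for row in counts:
        total = sum(row) or 1  # guard against empty documents
        vectors.append([(c / total) * w for c, w in zip(row, idf)])
    return vocab, vectors


filtered = [filter_stopwords(d) for d in docs]
vocab, counts = count_vectors(filtered)
_, tfidf = tfidf_vectors(filtered)
```

In this sketch a term that occurs in fewer documents receives a higher TF-IDF weight than an equally frequent term that occurs in many, which is why TF-IDF is often preferred over raw counts for keyword extraction.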

Getting Started

Run with IntelliJ

Clone this repository: https://github.com/joeyhaohao/rapidminer-HanMiner.git
Open the project with IntelliJ. Use Java 1.8 as the project SDK.
Build the project.
Run GuiLauncher under the source folder.

Acknowledgement

Some NLP models and functions in this project are provided by HanLP.
