KOMORAN is a KOrean MORphological ANalyzer implemented in Java.
- Pure Java
  - Because KOMORAN is implemented entirely in Java, it runs in any environment where Java is installed.
- Dependency Free
  - It relies only on its own libraries, so there are no dependency conflicts with external libraries.
- Lightweight
  - It needs little memory to run (approx. 50 MB), thanks in part to its own TRIE-based dictionary.
- Easy to Use
  - Analyzing a sentence takes only a few lines of code.
- Easy to manage dictionary
  - Dictionaries are human-readable plain text files, so they are easy to edit.
- New analysis results
  - Unlike other morphological analyzers, it can analyze morpheme units that contain spaces.
- On the KOMORAN website, you can check the analysis result for a sentence such as the one below.
- Input sentence: 대한민국은 민주공화국이다. ("The Republic of Korea is a democratic republic.")
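
To make the points about a trie dictionary, plain-text dictionary files, and entries containing spaces more concrete, here is a minimal sketch in plain Java. It is not KOMORAN's actual dictionary code: the class `TrieDictionarySketch`, its methods, and the `surface<TAB>tag` line format are all assumptions made for illustration, and the greedy longest-match lookup stands in for whatever search KOMORAN really performs.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of the KOMORAN API. */
class TrieDictionarySketch {

    /** One trie node: children keyed by character, plus an optional tag. */
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        String tag; // non-null only where a dictionary entry ends
    }

    private final Node root = new Node();

    /** Adds one plain-text line in the assumed "surface<TAB>tag" format. */
    void addLine(String line) {
        String[] parts = line.split("\t", 2);
        if (parts.length != 2 || parts[0].isEmpty()) {
            throw new IllegalArgumentException("expected surface<TAB>tag: " + line);
        }
        Node node = root;
        // Spaces are ordinary characters here, so multi-word entries just work.
        for (char c : parts[0].toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.tag = parts[1].trim();
    }

    /**
     * Scans the text left to right and, at each position, takes the longest
     * dictionary entry that starts there. Characters that start no entry
     * are skipped. Returns the hits as "surface/tag" strings.
     */
    List<String> findEntries(String text) {
        List<String> hits = new ArrayList<>();
        int pos = 0;
        while (pos < text.length()) {
            Node node = root;
            int bestEnd = -1;
            String bestTag = null;
            for (int i = pos; i < text.length(); i++) {
                node = node.children.get(text.charAt(i));
                if (node == null) {
                    break; // no entry continues with this character
                }
                if (node.tag != null) {
                    bestEnd = i + 1; // remember the longest entry so far
                    bestTag = node.tag;
                }
            }
            if (bestEnd < 0) {
                pos++;
                continue;
            }
            hits.add(text.substring(pos, bestEnd) + "/" + bestTag);
            pos = bestEnd;
        }
        return hits;
    }

    public static void main(String[] args) {
        TrieDictionarySketch dict = new TrieDictionarySketch();
        dict.addLine("바람\tNNG");                  // general noun
        dict.addLine("바람과 함께 사라지다\tNNP");   // proper noun with spaces
        System.out.println(dict.findEntries("바람과 함께 사라지다를 봤다"));
        System.out.println(dict.findEntries("바람이 분다"));
    }
}
```

Because the longest match wins, the title containing spaces is returned as a single unit instead of being cut at the first space, while the shorter entry still matches on its own in other sentences.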
Please refer to the 'Installation' document.
Please refer to the 'Morphological Analysis Blitz in 3 minutes' document.
- Analysis example
- Model training example
- Spark2 analysis example (in Scala)
- A demo page is available to test the performance of KOMORAN.
- Refer to the official KOMORAN documentation site to learn how to use KOMORAN.
- Please visit KOMORAN Slack to share usage and tips.
- In addition, there is a Simple API Server repository that you can run yourself.
- There is also a repository implemented in Python 3 by Hyunjoong Kim: KOMORAN3Py (/lovit/komoran3py)
@misc{komoran,
author = {Junsoo Shin and Junghwan Park and Geunho Lee},
title = {komoran},
publisher = {GitHub},
journal = {GitHub repository},
howpublished = {\url{https://github.com/shin285/KOMORAN}}
}
- Ihm, S. Y., Lee, J. H., & Park, Y. H. (2019). Skip-gram-KR: Korean Word Embedding for Semantic Clustering. IEEE Access. (IF: 3.557)
- Kwon, S., Ko, Y., & Seo, J. (2019). Effective vector representation for the Korean named-entity recognition. Pattern Recognition Letters, 117, 52-57. (IF: 1.952)
- Song, H. J., Choi, J. E., Lee, Y. K., Yoon, J. H., Kim, J. D., Park, C. Y., & Kim, Y. S. (2019). A Web Service for Evaluating the Level of Speech in Korean. Applied Sciences, 9(3), 594. (IF: 1.689)
- Han, K., Shim, H., & Yi, M. Y. (2018). A New Biomedical Passage Retrieval Framework for Laboratory Medicine: Leveraging Domain-specific Ontology, Multilevel PRF, and Negation Differential Weighting. Journal of Healthcare Engineering, 2018. (IF: 1.261)
- Edmiston, D., & Stratos, K. (2018). Compositional Morpheme Embeddings with Affixes as Functions and Stems as Arguments. In Proceedings of the Workshop on the Relevance of Linguistic Structure in Neural Architectures for NLP (pp. 1-5).