Description
Morpheme Tokenization.
Tokenize text into morphemes. The morphemepiece algorithm uses a lookup table to determine the morpheme breakdown of words, and falls back on a modified wordpiece tokenization algorithm for words not found in the lookup table.
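The lookup-then-fallback flow described above can be illustrated with a small base R sketch. This is not the package's implementation: `toy_lookup`, `toy_vocab`, `greedy_split()`, and `toy_tokenize()` are hypothetical names invented for this example, and the fallback shown is a plain greedy longest-match split standing in for the package's modified wordpiece algorithm.

```r
# Hypothetical lookup table mapping known words to their morphemes.
toy_lookup <- list(
  unhappiness = c("un", "happy", "ness"),
  walked = c("walk", "ed")
)

# Hypothetical vocabulary used only by the fallback.
toy_vocab <- c("re", "walk", "ing", "s", "un", "happy", "ness", "ed")

# Plain greedy longest-match-first split. This is an assumption made for
# illustration; the package's modified wordpiece fallback may differ.
greedy_split <- function(word, vocab, unk_token = "[UNK]") {
  pieces <- character(0)
  start <- 1L
  n <- nchar(word)
  while (start <= n) {
    end <- n
    found <- NA_character_
    # Try the longest remaining substring first, then shrink it.
    while (end >= start) {
      candidate <- substr(word, start, end)
      if (candidate %in% vocab) {
        found <- candidate
        break
      }
      end <- end - 1L
    }
    # No vocabulary entry matches at this position: give up on the word.
    if (is.na(found)) {
      return(unk_token)
    }
    pieces <- c(pieces, found)
    start <- end + 1L
  }
  pieces
}

# Use the lookup table when the word is present; otherwise fall back.
toy_tokenize <- function(word, lookup = toy_lookup, vocab = toy_vocab) {
  if (!is.null(lookup[[word]])) {
    return(lookup[[word]])
  }
  greedy_split(word, vocab)
}

toy_tokenize("unhappiness")  # "un" "happy" "ness" (from the lookup table)
toy_tokenize("rewalking")    # "re" "walk" "ing"   (from the fallback)
toy_tokenize("xyz")          # "[UNK]"             (no match at all)
```

The point of the two-stage design is that a lookup table can encode breakdowns that a purely string-based split would miss (here, "happi" mapping to "happy"), while the fallback still produces some output for words the table does not cover.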
README.md
morphemepiece
The goal of morphemepiece is to let you tokenize words into morphemes (the smallest units of meaning).
Installation
You can install the released version of morphemepiece from CRAN with:
install.packages("morphemepiece")
And the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("macmillancontentscience/morphemepiece")
Code of Conduct
Please note that the morphemepiece project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
Disclaimer
This is not an officially supported Macmillan Learning product.
Contact information
Questions or comments should be directed to Jonathan Bratt and Jon Harmon.