Package: morphemepiece 1.2.3
morphemepiece: Morpheme Tokenization
Tokenize text into morphemes. The morphemepiece algorithm uses a lookup table to determine the morpheme breakdown of words, and falls back on a modified wordpiece tokenization algorithm for words not found in the lookup table.
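The lookup-then-fallback idea in the description above can be illustrated with a small sketch in base R. It is a toy under stated assumptions, not the package's implementation: `toy_lookup`, `toy_vocab`, `toy_wordpiece()`, and `toy_tokenize()` are hypothetical names invented here, the `##` continuation marker is an assumed convention, and the fallback is plain greedy longest-match wordpiece rather than the modified variant the package uses.

```r
# Hypothetical toy data; none of these objects come from morphemepiece.
toy_lookup <- list(
  running = c("run", "##ing")  # irregular split that greedy matching would miss
)
toy_vocab <- c("walk", "jump", "##ing", "##ed", "##s")

# Plain greedy wordpiece: repeatedly take the longest vocabulary entry that
# matches the start of the remaining characters. Non-initial pieces are
# prefixed with "##". If no piece matches, the whole word becomes unk_token.
toy_wordpiece <- function(word, vocab, unk_token = "[UNK]") {
  pieces <- character(0)
  start <- 1L
  n <- nchar(word)
  while (start <= n) {
    matched <- FALSE
    for (end in seq(n, start)) {
      candidate <- substr(word, start, end)
      if (start > 1L) candidate <- paste0("##", candidate)
      if (candidate %in% vocab) {
        pieces <- c(pieces, candidate)
        start <- end + 1L
        matched <- TRUE
        break
      }
    }
    if (!matched) return(unk_token)
  }
  pieces
}

# Try the lookup table first; fall back to wordpiece for unknown words.
toy_tokenize <- function(word, lookup = toy_lookup, vocab = toy_vocab) {
  hit <- lookup[[word]]
  if (!is.null(hit)) return(hit)
  toy_wordpiece(word, vocab)
}

toy_tokenize("running")  # found in the lookup table: "run" "##ing"
toy_tokenize("jumping")  # wordpiece fallback: "jump" "##ing"
toy_tokenize("xyz")      # no matching pieces: "[UNK]"
```

The lookup table earns its place on words like "running", where a purely greedy split over the vocabulary would fail because of the doubled consonant.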
Authors:
morphemepiece_1.2.3.tar.gz
morphemepiece_1.2.3.zip (r-4.7), morphemepiece_1.2.3.zip (r-4.6), morphemepiece_1.2.3.zip (r-4.5)
morphemepiece_1.2.3.tgz (r-4.6-any), morphemepiece_1.2.3.tgz (r-4.5-any)
morphemepiece_1.2.3.tar.gz (r-4.7-any), morphemepiece_1.2.3.tar.gz (r-4.6-any)
morphemepiece_1.2.3.tgz (r-4.6-emscripten)
manual.pdf | manual.html
DESCRIPTION | NEWS
card.svg | card.png
morphemepiece/json (API)
# Install 'morphemepiece' in R:
install.packages('morphemepiece', repos = c('https://macmillancontentscience.r-universe.dev', 'https://cloud.r-project.org'))
Bug tracker: https://github.com/macmillancontentscience/morphemepiece/issues
Last updated from: bc071b1a03. Checks: 7 NOTE, 2 OK. Indexed: yes.
| Target | Result | Time |
|---|---|---|
| linux-devel-x86_64 | NOTE | 134 |
| source / vignettes | OK | 183 |
| linux-release-x86_64 | NOTE | 137 |
| macos-release-arm64 | NOTE | 89 |
| macos-oldrel-arm64 | NOTE | 97 |
| windows-devel | NOTE | 76 |
| windows-release | NOTE | 84 |
| windows-oldrel | NOTE | 82 |
| wasm-release | OK | 122 |
Exports: load_lookup, load_or_retrieve_lookup, load_or_retrieve_vocab, load_vocab, morphemepiece_cache_dir, morphemepiece_lookup, morphemepiece_tokenize, morphemepiece_vocab, prepare_vocab, set_morphemepiece_cache_dir
Dependencies: bit, bit64, cachem, cli, clipr, cpp11, crayon, digest, dlr, fastmap, fastmatch, fs, glue, hms, lifecycle, magrittr, memoise, morphemepiece.data, piecemaker, pillar, pkgconfig, prettyunits, progress, purrr, R6, rappdirs, readr, rlang, stringi, stringr, tibble, tidyselect, tzdb, utf8, vctrs, vroom, withr
Last update: 2021-10-26
Started: 2021-07-29
Last update: 2021-09-06
Started: 2021-07-29
Readme and manuals
Help Manual
| Help page | Topics |
|---|---|
| morphemepiece: Morpheme Tokenization | morphemepiece-package |
| Load a morphemepiece lookup file | load_lookup |
| Load a lookup file, or retrieve from cache | load_or_retrieve_lookup |
| Load a vocabulary file, or retrieve from cache | load_or_retrieve_vocab |
| Load a vocabulary file | load_vocab |
| Retrieve Directory for Morphemepiece Cache | morphemepiece_cache_dir |
| Tokenize Sequence with Morpheme Pieces | morphemepiece_tokenize |
| Format a Token List as a Vocabulary | prepare_vocab |
| Set a Cache Directory for Morphemepiece | set_morphemepiece_cache_dir |
