Package: piecemaker 1.0.2.9000

Jon Harmon

piecemaker: Tools for Preparing Text for Tokenizers

Tokenizers break text into pieces that are more usable by machine learning models. Many tokenizers share some preparation steps. This package provides those shared steps, along with a simple tokenizer.

Authors: Jon Harmon [aut, cre], Jonathan Bratt [aut], Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph]

# Install 'piecemaker' in R:
install.packages('piecemaker', repos = c('https://macmillancontentscience.r-universe.dev', 'https://cloud.r-project.org'))
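
A minimal usage sketch, assuming the package installed successfully. It uses two of the exported functions, prepare_text and prepare_and_tokenize, passing only the text argument so that the package defaults apply. The input strings are hypothetical and chosen only for illustration; the manual lists any additional arguments.

```r
# Assumes piecemaker is already installed (see the install command above).
library(piecemaker)

# Hypothetical input, chosen only for illustration.
text <- c("Some  text, with   extra spaces.", "More text!")

# Run only the shared preparation steps, using the package defaults.
prepared <- prepare_text(text)

# Prepare the text and split it into tokens in a single call.
tokens <- prepare_and_tokenize(text)

print(prepared)
print(tokens)
```

The individual steps (for example squish_whitespace or space_punctuation) are also exported, so they can presumably be applied one at a time when only part of the preparation is wanted.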

Bug tracker: https://github.com/macmillancontentscience/piecemaker/issues

Pkgdown/docs site: https://macmillancontentscience.github.io

3.48 score, 2 packages, 6 scripts, 262 downloads, 10 exports, 8 dependencies

Last updated from: b02c1a7492. Checks: 9 OK. Indexed: yes.

Target                 Result  Time
linux-devel-x86_64     OK      117
source / vignettes     OK      174
linux-release-x86_64   OK      113
macos-release-arm64    OK      174
macos-oldrel-arm64     OK      164
windows-devel          OK      80
windows-release        OK      84
windows-oldrel         OK      62
wasm-release           OK      104

Exports: prepare_and_tokenize, prepare_text, remove_control_characters, remove_diacritics, remove_replacement_characters, space_cjk, space_punctuation, squish_whitespace, tokenize_space, validate_utf8

Dependencies: cli, glue, lifecycle, magrittr, rlang, stringi, stringr, vctrs