Package: piecemaker 1.0.2.9000

Jon Harmon

piecemaker: Tools for Preparing Text for Tokenizers

Tokenizers break text into pieces that are more usable by machine learning models. Many tokenizers share some preparation steps. This package provides those shared steps, along with a simple tokenizer.
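
As a rough illustration of the kind of shared preparation described above, the following base-R sketch squishes whitespace, spaces out punctuation, and then splits on spaces. The helper names (`sketch_squish`, `sketch_space_punct`, `sketch_tokenize`) are hypothetical and are not the package's functions or implementation; the actual exports may handle Unicode and edge cases differently.

```r
# Illustrative base-R sketch only. These helpers are hypothetical and are
# NOT the piecemaker implementations.

# Collapse runs of whitespace into single spaces and trim both ends.
sketch_squish <- function(text) {
  trimws(gsub("\\s+", " ", text, perl = TRUE))
}

# Surround each (ASCII) punctuation character with spaces so that it
# ends up as its own piece after splitting.
sketch_space_punct <- function(text) {
  gsub("([[:punct:]])", " \\1 ", text)
}

# Prepare the text, then split it on single spaces.
# Returns a list with one character vector per input string.
sketch_tokenize <- function(text) {
  prepared <- sketch_squish(sketch_space_punct(text))
  strsplit(prepared, " ", fixed = TRUE)
}

tokens <- sketch_tokenize("Hello,   world! It's  fine.")
print(tokens[[1]])
```

Under these assumptions, the example string is split into the pieces `Hello`, `,`, `world`, `!`, `It`, `'`, `s`, `fine`, and `.`. Spacing punctuation before splitting is what lets a plain space tokenizer treat punctuation marks as separate pieces.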

Authors: Jon Harmon [aut, cre], Jonathan Bratt [aut], Bedford Freeman & Worth Pub Grp LLC DBA Macmillan Learning [cph]

piecemaker_1.0.2.9000.tar.gz
piecemaker_1.0.2.9000.zip (r-4.7), piecemaker_1.0.2.9000.zip (r-4.6), piecemaker_1.0.2.9000.zip (r-4.5)
piecemaker_1.0.2.9000.tgz (r-4.6-any), piecemaker_1.0.2.9000.tgz (r-4.5-any)
piecemaker_1.0.2.9000.tar.gz (r-4.7-any), piecemaker_1.0.2.9000.tar.gz (r-4.6-any)
piecemaker_1.0.2.9000.tgz (r-4.6-emscripten)

# Install 'piecemaker' in R:
install.packages('piecemaker', repos = c('https://macmillancontentscience.r-universe.dev', 'https://cloud.r-project.org'))

Bug tracker: https://github.com/macmillancontentscience/piecemaker/issues

Pkgdown/docs site: https://macmillancontentscience.github.io

3.48 score, 2 packages, 6 scripts, 282 downloads, 10 exports, 8 dependencies

Last updated from: b02c1a7492. Checks: 9 OK. Indexed: yes.

Target               Result  Time
linux-devel-x86_64   OK      120
source / vignettes   OK      194
linux-release-x86_64 OK      118
macos-release-arm64  OK      82
macos-oldrel-arm64   OK      93
windows-devel        OK      73
windows-release      OK      68
windows-oldrel       OK      65
wasm-release         OK      111

Exports: prepare_and_tokenize, prepare_text, remove_control_characters, remove_diacritics, remove_replacement_characters, space_cjk, space_punctuation, squish_whitespace, tokenize_space, validate_utf8

Dependencies: cli, glue, lifecycle, magrittr, rlang, stringi, stringr, vctrs