kgrams v0.1.2 on CRAN
Summary
Version 0.1.2 of my R package kgrams was just accepted by CRAN. The package provides tools for training and evaluating k-gram language models in R, with support for several probability smoothing techniques, perplexity computations, random text generation and more.
Short demo
library(kgrams)

# Get k-gram frequency counts from Shakespeare's "Much Ado About Nothing"
freqs <- kgram_freqs(kgrams::much_ado, N = 4)

# Build modified Kneser-Ney 4-gram model, with discount parameters D1, D2, D3.
mkn <- language_model(freqs, smoother = "mkn", D1 = 0.25, D2 = 0.5, D3 = 0.75)

# Sample sentences from the language model at different temperatures
set.seed(840)
sample_sentences(model = mkn, n = 3, max_length = 10, t = 1)
[1] "i have studied eight or nine truly by your office [...] (truncated output)"
[2] "ere you go : <EOS>"
[3] "don pedro welcome signior : <EOS>"
sample_sentences(model = mkn, n = 3, max_length = 10, t = 0.1)
[1] "i will not be sworn but love may transform me [...] (truncated output)"
[2] "i will not fail . <EOS>"
[3] "i will go to benedick and counsel him to fight [...] (truncated output)"
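A rough way to picture the effect of t is the base R sketch below. It assumes that temperature rescales each next-word probability as p^(1/t) before renormalizing, which is a common convention. The toy distribution and the helper apply_temperature are hypothetical illustrations and not part of kgrams.

```r
# Hypothetical next-word distribution (illustrative values, not kgrams output)
p <- c(the = 0.6, a = 0.3, benedick = 0.1)

# Assumed temperature rule: raise each probability to 1/t, then renormalize
apply_temperature <- function(p, t) {
  stopifnot(t > 0)
  w <- p^(1 / t)
  w / sum(w)
}

p_cold <- apply_temperature(p, t = 0.1) # mass concentrates on the likeliest word
p_unit <- apply_temperature(p, t = 1)   # distribution left unchanged
p_hot  <- apply_temperature(p, t = 10)  # distribution flattens towards uniform

# Compare the three rescaled distributions side by side
print(round(rbind(cold = p_cold, unit = p_unit, hot = p_hot), 3))

# Draw a few words from the flattened distribution
set.seed(840)
print(sample(names(p), size = 5, replace = TRUE, prob = p_hot))
```

Under this assumption, low temperatures make sampling nearly greedy, which would explain the repeated "i will ..." openings, while high temperatures give rare words almost the same chance as frequent ones.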
sample_sentences(model = mkn, n = 3, max_length = 10, t = 10)
[1] "july cham's incite start ancientry effect torture tore pains endings [...] (truncated output)"
[2] "lastly gallants happiness publish margaret what by spots commodity wake [...] (truncated output)"
[3] "born all's 'fool' nest praise hurt messina build afar dancing [...] (truncated output)"
NEWS
Overall Software Improvements ...