Neural Text Modelling with R package ruimtehol
Last week the R package ruimtehol was released on CRAN (https://github.com/bnosac/ruimtehol), allowing R users to easily build and apply neural embedding models on text data.
It wraps the ‘StarSpace’ library (https://github.com/facebookresearch/StarSpace), allowing users to calculate word, sentence, article, document, webpage, link and entity ‘embeddings’. By using these ‘embeddings’, you can perform text-based multi-label classification, find similarities between texts and categories, do collaborative-filtering-based as well as content-based recommendation, find relations between entities, calculate graph ‘embeddings’, and perform semi-supervised learning and multi-task learning on plain text. The techniques are explained in detail in the paper ‘StarSpace: Embed All The Things!’ by Wu et al. (2017), available at https://arxiv.org/abs/1709.03856.
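To make the idea of finding similarities between texts and categories more concrete, here is a minimal base R sketch. It does not use ruimtehol or StarSpace: the embedding values are made up, the helper names `embed_text` and `cosine_to_rows` are hypothetical, and averaging word vectors is a simplification of what a trained model does. It only illustrates that once words and labels sit in the same vector space, a text can be matched to a category by cosine similarity.

```r
# Toy 3-dimensional embeddings; all numbers are invented for illustration.
# Words and labels share one vector space, as in a StarSpace-style model.
word_emb <- rbind(
  goal   = c(0.9, 0.1, 0.0),
  match  = c(0.8, 0.2, 0.1),
  vote   = c(0.1, 0.9, 0.0),
  senate = c(0.0, 0.8, 0.2)
)
label_emb <- rbind(
  sports   = c(1.0, 0.0, 0.0),
  politics = c(0.0, 1.0, 0.0)
)

# Embed a text as the average of the embeddings of its unique known words
# (hypothetical helper, an assumption of this sketch).
embed_text <- function(text, emb) {
  tokens <- strsplit(tolower(text), "[^a-z]+")[[1]]
  tokens <- intersect(tokens, rownames(emb))
  if (length(tokens) == 0) stop("no known words in text")
  colMeans(emb[tokens, , drop = FALSE])
}

# Cosine similarity between a vector and each row of a matrix
# (hypothetical helper, an assumption of this sketch).
cosine_to_rows <- function(v, m) {
  as.vector(m %*% v) / (sqrt(rowSums(m^2)) * sqrt(sum(v^2)))
}

doc <- embed_text("The senate will vote today", word_emb)
scores <- cosine_to_rows(doc, label_emb)
names(scores) <- rownames(label_emb)

# The label whose embedding is closest to the text embedding.
best_label <- names(which.max(scores))
print(round(scores, 3))
print(best_label)
```

In a real model the word and label vectors are learned from labelled text rather than written by hand, but the final lookup step works along the same lines.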
You can get started with some common text analytical use cases by using the presentation we have built below. Enjoy!
[Embedded presentation (PDF): images/bnosac/blog/R_TextMining_Starspace.pdf]
If you like it, give it a star at https://github.com/bnosac/ruimtehol and if you need commercial support on text mining, get in touch.
Upcoming training schedule
You might also be interested in the following courses held in Belgium:
- 21-22/02/2018: Advanced R programming. Leuven (Belgium). Subscribe here
- 13-14/03/2018: Computer Vision with R and Python. Leuven (Belgium). Subscribe here
- 15/03/2019: Image Recognition with R and Python: Subscribe here
- 01-02/04/2019: Text Mining with R. Leuven (Belgium). Subscribe here