DATA 531 Text Analytics with Information Retrieval
Investigation of text mining tools using R, including bag-of-words models, and information retrieval using the term frequency-inverse document frequency (tf-idf) approach. Advanced topics such as document clustering are considered. A variety of text types are analyzed, ranging from tweets on Twitter to digitized books from Project Gutenberg.
Credits
4
Prerequisite
DATA 511 or permission of department chair.