Text frequency analysis - Process - R

December 15, 2018

This tutorial shows how to get the frequency distribution of the words used in a chunk of text. It is a simpler alternative to a more elaborate text mining post that involves automatic removal of stopwords (e.g. "the", "a", "and").


The script breaks the chunk of text into a dataframe of words, and these words are run through a text cleaning function that removes punctuation and symbols. The function can be modified to include text stemming if necessary; stemming reduces each word to its root form.
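As a rough illustration of the steps described above, here is a minimal sketch in base R. The sample text, the `clean_word` helper, and the column names are assumptions made for this example and are not taken from the original script. Stemming is left out, and stopwords are deliberately kept.

```r
# Hypothetical sample text; replace with your own chunk of text.
text <- "The cat sat on the mat. The dog sat, too!"

# Hypothetical cleaning helper (not the original post's function):
# lower-cases each word and strips punctuation/symbols.
clean_word <- function(x) {
  x <- tolower(x)
  gsub("[[:punct:]]+", "", x)
}

# Break the text into a data frame with one word per row,
# splitting on runs of whitespace.
words <- data.frame(
  word = strsplit(text, "\\s+")[[1]],
  stringsAsFactors = FALSE
)

# Clean every word, then drop entries that became empty.
words$word <- clean_word(words$word)
words <- words[words$word != "", , drop = FALSE]

# Count occurrences of each word and sort by descending frequency
# (ties are broken alphabetically).
freq <- as.data.frame(table(word = words$word), stringsAsFactors = FALSE)
names(freq) <- c("word", "n")
freq <- freq[order(-freq$n, freq$word), ]
rownames(freq) <- NULL

print(freq)
```

Note that stripping punctuation this way also removes apostrophes, so a word such as "don't" would be counted as "dont". To add stemming, one option would be to apply a stemmer inside `clean_word` before returning the result.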





©2017-2020 by DATA DOUBLE CONFIRM.

This site was designed with the
website builder. Create your website today.
Start Now