Text frequency analysis - Process - R

December 15, 2018

This tutorial shows how to get the frequency distribution of the words used in a chunk of text. It is a simpler alternative to a more elaborate text mining post that involves automatic removal of stopwords (e.g. "the", "a", "and").

 

The script breaks the chunk of text into a dataframe of words, and these words are run through a text cleaning function that removes punctuation and symbols. The function can be modified to include text stemming if necessary (stemming reduces words to their root forms).
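The original script is not reproduced here, so what follows is only a minimal sketch of the steps described above, written in base R. The sample text and the helper name `clean_word` are hypothetical, made up for illustration. The sketch also lower-cases words so that "The" and "the" are counted together, which is an assumption rather than something the post states. Stemming is left out.

```r
# Hypothetical sample text; the input used in the original post is not shown.
text <- "The cat sat on the mat. The dog, however, sat on the cat!"

# Hypothetical cleaning helper: lower-case each word, then strip
# punctuation and symbols (note: this also removes apostrophes).
clean_word <- function(w) {
  w <- tolower(w)
  gsub("[[:punct:]]", "", w)
}

# Break the text into a data frame with one word per row,
# splitting on runs of whitespace.
words <- data.frame(
  word = strsplit(text, "\\s+")[[1]],
  stringsAsFactors = FALSE
)

# Clean every word and drop any entries left empty after cleaning.
words$word <- clean_word(words$word)
words <- words[words$word != "", , drop = FALSE]

# Count occurrences of each word and sort by descending frequency
# (ties are broken alphabetically).
freq <- as.data.frame(table(word = words$word), stringsAsFactors = FALSE)
freq <- freq[order(-freq$Freq, freq$word), ]
print(freq)
```

Since stopwords are not removed in this simpler approach, common words such as "the" will usually sit at the top of the table.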

 

  

 

 

©2017-2019 by DATA DOUBLE CONFIRM.

This site was designed with the
.com
website builder. Create your website today.
Start Now