DATA DOUBLE CONFIRM

Text frequency analysis - Process - R

  • Writer: datadoubleconfirm
  • Dec 15, 2018

This tutorial shows how to get the frequency distribution of the words used in a chunk of text. It is a simpler alternative to a more elaborate text mining post that also auto-removes stopwords such as "the", "a", and "and".

The script breaks the chunk of text into a dataframe of words, then runs those words through a text cleaning function that removes punctuation and symbols. The function can be modified to include text stemming if necessary; stemming reduces each word to its root form.
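The steps described above can be sketched in base R roughly as follows. This is a minimal illustration and not the original script: the sample `text`, the helper name `clean_word`, and the column names `word` and `n` are all assumptions made for the example.

```r
# Hypothetical sample text; replace with your own chunk of text.
text <- "The cat sat on the mat. The mat was flat, and the cat was happy!"

# Hypothetical cleaning helper (name is an assumption):
# lowercases each word and strips punctuation/symbols.
clean_word <- function(x) {
  x <- tolower(x)
  gsub("[[:punct:]]", "", x)
}

# Break the text into a data frame with one word per row.
words <- data.frame(
  word = strsplit(text, "\\s+")[[1]],
  stringsAsFactors = FALSE
)

# Clean every word, then drop entries that became empty.
words$word <- clean_word(words$word)
words <- words[words$word != "", , drop = FALSE]

# Count occurrences and sort from most to least frequent
# (ties are broken alphabetically).
freq <- as.data.frame(table(words$word), stringsAsFactors = FALSE)
names(freq) <- c("word", "n")
freq <- freq[order(-freq$n, freq$word), ]
rownames(freq) <- NULL

print(freq)
```

Because no stopwords are removed in this sketch, common words such as "the" will usually sit at the top of the table. If stemming is wanted, one option is to call a stemmer inside `clean_word`, for example `SnowballC::wordStem()`, assuming that package is installed.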
