Word Counts in Pandas Data Frames

Suppose you are working with a pandas data frame and want the word frequencies of a text column, such as a column of reviews, as part of your exploratory analysis. The counting itself takes a single line of code. Let’s create the data frame:

import pandas as pd

df = pd.DataFrame({'mytext': ['I love Predictive Hacks!',
                              'How can I remove punctuations?',
                              'He said: "This is cool!".']})


                           mytext
0        I love Predictive Hacks!
1  How can I remove punctuations?
2       He said: "This is cool!".

Since we want word frequencies, it is better to convert the text to lowercase and remove the punctuation first. Otherwise, tokens such as "cool!" and "cool" would be counted as different words.

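# Lowercase the text and strip punctuation. regex=True is required because
# pandas 2.0 and later treat the pattern as a literal string by default.
df["mytext_new"] = df['mytext'].str.lower().str.replace(r'[^\w\s]', '', regex=True)

# Split each row into words, stack them into one Series, and count each word
new_df = df.mytext_new.str.split(expand=True).stack().value_counts().reset_index()

new_df.columns = ['Word', 'Frequency']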

            Word  Frequency
0              i          2
1            can          1
2         remove          1
3           cool          1
4             is          1
5            how          1
6           said          1
7           this          1
8   punctuations          1
9             he          1
10         hacks          1
11    predictive          1
12          love          1
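
As a variation on the approach above, here is a minimal sketch that uses `explode()` instead of `split(expand=True).stack()`. It assumes pandas 0.25 or later (where `explode()` is available), and the names `words` and `word_counts` are only illustrative. It avoids building a wide intermediate data frame with one column per word position, which may help when some rows are much longer than others.

```python
import pandas as pd

df = pd.DataFrame({
    "mytext": [
        "I love Predictive Hacks!",
        "How can I remove punctuations?",
        'He said: "This is cool!".',
    ]
})

# Lowercase, strip punctuation, then split each row into a list of words
words = (
    df["mytext"]
    .str.lower()
    .str.replace(r"[^\w\s]", "", regex=True)
    .str.split()   # each row becomes a list of words
    .explode()     # one word per row
)

# Naming the index and the counts explicitly keeps the column names
# the same regardless of how value_counts() labels its result
word_counts = (
    words.value_counts()
    .rename_axis("Word")
    .reset_index(name="Frequency")
)

print(word_counts)
```

The result should contain the same 13 distinct words as the table above, with "i" counted twice. The order of words that share the same frequency may differ between pandas versions.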

