Suppose you are working with a pandas DataFrame and want the word frequency of a text column, such as reviews,
as part of exploratory analysis. You can do it with one line of code. Let’s create the data frame:
import pandas as pd

df = pd.DataFrame({'mytext': ['I love Predictive Hacks!',
                              'How can I remove punctuations?',
                              'He said: "This is cool!".']})
df
mytext
0 I love Predictive Hacks!
1 How can I remove punctuations?
2 He said: "This is cool!".
Since we want the word frequency, it is better to convert the text to lowercase and remove the punctuation first, so that "Cool!" and "cool" count as the same word.
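# Lowercase the text and strip punctuation.
# regex=True is needed in pandas 2.0+, where str.replace matches literally by default.
df["mytext_new"] = df['mytext'].str.lower().str.replace(r'[^\w\s]', '', regex=True)

# Split into words, stack them into a single Series, and count each word.
new_df = df.mytext_new.str.split(expand=True).stack().value_counts().reset_index()
new_df.columns = ['Word', 'Frequency']
new_df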
Word Frequency
0 i 2
1 can 1
2 remove 1
3 cool 1
4 is 1
5 how 1
6 said 1
7 this 1
8 punctuations 1
9 he 1
10 hacks 1
11 predictive 1
12 love 1
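If you need this on more than one column, the same steps can be wrapped in a small helper. The sketch below is one possible way to do it, not part of the original recipe: the function name `word_frequency` is hypothetical, and it swaps `split(expand=True).stack()` for `split().explode()`, which avoids building a wide intermediate frame. Words with equal counts may appear in a different order than in the table above.

```python
import pandas as pd


def word_frequency(texts: pd.Series) -> pd.DataFrame:
    """Count how often each word occurs across a Series of strings.

    Hypothetical helper for illustration; not a pandas API.
    """
    words = (
        texts.str.lower()
        .str.replace(r"[^\w\s]", "", regex=True)  # strip punctuation
        .str.split()   # one list of words per row
        .explode()     # one word per row
        .dropna()      # drop rows that were empty or missing
    )
    counts = words.value_counts()  # sorted by frequency, descending
    # Build the result explicitly so the column names do not depend on the pandas version.
    return pd.DataFrame({"Word": counts.index, "Frequency": counts.to_numpy()})


df = pd.DataFrame({'mytext': ['I love Predictive Hacks!',
                              'How can I remove punctuations?',
                              'He said: "This is cool!".']})
freq = word_frequency(df['mytext'])
print(freq)
```

Calling `word_frequency(df['mytext'])` on the example data should give the same counts as above, with "i" as the only word that appears twice.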