Predictive Hacks

Word Counts in Pandas Data Frames

Assume that you work with a Pandas data frame and want the word frequencies of a text column, such as a column of reviews, as part of exploratory analysis. You can do it in a couple of lines of code. Let’s create the data frame:

import pandas as pd

df = pd.DataFrame({'mytext': ['I love Predictive Hacks!',
                              'How can I remove punctuations?',
                              'He said: "This is cool!".']})

df

                           mytext
0        I love Predictive Hacks!
1  How can I remove punctuations?
2       He said: "This is cool!".

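Since we want word frequencies, it is better to convert the text to lower case and remove the punctuation first. Note that pandas 2.0 and later treat the pattern passed to str.replace as a literal string by default, so we pass regex=True explicitly:

df["mytext_new"] = df['mytext'].str.lower().str.replace(r'[^\w\s]', '', regex=True)

Then we split each row into words, stack the words into a single Series, and count how often each word occurs:

new_df = df.mytext_new.str.split(expand=True).stack().value_counts().reset_index()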

new_df.columns = ['Word', 'Frequency'] 

new_df
            Word  Frequency
0              i          2
1            can          1
2         remove          1
3           cool          1
4             is          1
5            how          1
6           said          1
7           this          1
8   punctuations          1
9             he          1
10         hacks          1
11    predictive          1
12          love          1
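
If you need the same counts for several text columns, the steps above can be wrapped in a small helper. The following is only a sketch: the function name word_frequency is our own invention and not part of pandas, and it uses explode() in place of split(expand=True).stack(), which should give the same words without building a wide intermediate data frame.

```python
import pandas as pd


def word_frequency(texts: pd.Series) -> pd.DataFrame:
    """Hypothetical helper: build a Word/Frequency table from a Series of strings."""
    cleaned = (
        texts.fillna("")                           # treat missing text as empty
        .str.lower()                               # make the count case-insensitive
        .str.replace(r"[^\w\s]", "", regex=True)   # drop punctuation
    )
    # One word per row; rows that had no words become NaN and are dropped.
    words = cleaned.str.split().explode().dropna()
    counts = words.value_counts()
    # Name the index and the values explicitly so the column names do not
    # depend on the pandas version.
    return counts.rename_axis("Word").reset_index(name="Frequency")


df = pd.DataFrame({'mytext': ['I love Predictive Hacks!',
                              'How can I remove punctuations?',
                              'He said: "This is cool!".']})

freq = word_frequency(df['mytext'])
print(freq)
```

As in the table above, "i" comes first with a frequency of 2. The order of the words that are tied at 1 may differ from the output shown earlier.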
