How to centralize a pandas data frame

For recommender systems and collaborative filtering, it is a good strategy to centralize your data around 0 by subtracting the mean and then filling the missing values (NAs) with 0. After centering, 0 corresponds to the mean, so filling with 0 treats a missing value as an average one. Depending on your dataset and what you want to do, you can centralize by row or by column.

Centralize a pandas data frame by row

In this case, we want to subtract the row mean from each element in a row. Let’s see how we can do it with one line of code.

import pandas as pd
import numpy as np

df = pd.DataFrame({'ColA': [1, 2, 3], 'ColB': [4, 10, 12], 'ColC': [10, np.nan, 7]})
df

Centralize the data frame:

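# df.mean(axis=1) gives one mean per row (NaN values are skipped by default);
# axis=0 in sub() aligns those means with the row index
df_centralized = df.sub(df.mean(axis=1), axis=0)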
df_centralized
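
The intro mentions filling the NAs with 0 after centering, which the snippet above does not do. Here is a minimal sketch of that extra step, assuming the same example data frame; the names row_means and df_filled are ours and only for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'ColA': [1, 2, 3], 'ColB': [4, 10, 12], 'ColC': [10, np.nan, 7]})

# Row means skip NaN by default (skipna=True), so the second row
# is averaged over ColA and ColB only.
row_means = df.mean(axis=1)

# Subtract each row's mean from its elements, then replace the
# remaining NaN values with 0, i.e. treat them as "average".
df_filled = df.sub(row_means, axis=0).fillna(0)

print(df_filled)
print(df_filled.mean(axis=1))  # each row mean should be approximately 0
```

Because the NaN was excluded from the row mean, filling it with 0 afterwards leaves that row's mean at 0.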

Centralize a pandas data frame by column

Similarly, we can centralize it by subtracting the column mean from each element in a column.

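# df.mean(axis=0) gives one mean per column;
# axis=1 in sub() aligns those means with the column labels
df_centralized = df.sub(df.mean(axis=0), axis=1)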
df_centralized
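
As a quick sanity check, the column means of the centralized data frame should come out at roughly 0. The sketch below assumes the same example data and also shows that plain subtraction gives the same result here, since pandas aligns a Series with the data frame's columns by default. The variable names are ours and only for illustration:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'ColA': [1, 2, 3], 'ColB': [4, 10, 12], 'ColC': [10, np.nan, 7]})

# Column-wise centering, as in the snippet above.
col_centered = df.sub(df.mean(axis=0), axis=1)

# Equivalent shorthand: the Series returned by df.mean() is indexed by
# column name, and "-" aligns that index with the DataFrame columns.
col_centered_alt = df - df.mean()

# NaN values are skipped, so every column mean should be close to 0.
col_means_after = col_centered.mean(axis=0)

print(col_centered)
print(col_means_after)
```

The explicit df.sub(..., axis=1) form is arguably easier to read next to the row version, where the axis argument is required.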
