Predictive Hacks

Pandas Dataframes Basics: Reshaping Data

In this series of posts, we will show you the basics of DataFrames in pandas, one of the most widely used Python libraries for data science. The first post of the series is about reshaping data.


pd.pivot: Spread rows into columns

Example:

import pandas as pd

df = pd.DataFrame({
    "A": ['a', 'a', 'a', 'b', 'b', 'b'],
    "B": ['A', 'B', 'C', 'A', 'B', 'C'],
    "C": [4, 5, 6, 7, 8, 9]})

df
   A  B  C
0  a  A  4
1  a  B  5
2  a  C  6
3  b  A  7
4  b  B  8
5  b  C  9

df.pivot(index='A', columns='B', values='C')
B  A  B  C
A         
a  4  5  6
b  7  8  9
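
Note that pivot only reshapes the data; it does not aggregate. The following is a minimal sketch, assuming a hypothetical DataFrame called dup in which the pair ('a', 'A') appears twice. In that case pivot raises a ValueError, while pivot_table combines the duplicates with an aggregation function (summing is just one possible choice):

```python
import pandas as pd

# Hypothetical data: the pair ('a', 'A') appears twice.
dup = pd.DataFrame({
    "A": ["a", "a", "a", "b"],
    "B": ["A", "A", "B", "A"],
    "C": [1, 2, 3, 4],
})

# pivot cannot decide which value to keep for ('a', 'A'), so it raises.
try:
    dup.pivot(index="A", columns="B", values="C")
    pivot_failed = False
except ValueError:
    pivot_failed = True

# pivot_table aggregates the duplicates instead (here by summing them).
# Combinations that never occur, such as ('b', 'B'), become NaN.
table = dup.pivot_table(index="A", columns="B", values="C", aggfunc="sum")
print(pivot_failed)
print(table)
```

If your data has exactly one value per index/column pair, as in the example above, pivot is enough.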

pd.melt: Gather columns into rows

Example

df = pd.DataFrame({'A': [4, 7], 'B': [5, 8], 'C': [6, 9]})
df
   A  B  C
0  4  5  6
1  7  8  9

df.melt()
  variable  value
0        A      4
1        A      7
2        B      5
3        B      8
4        C      6
5        C      9
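
In practice you often want to keep one or more identifier columns fixed while melting the rest. Here is a small sketch, assuming a hypothetical id column and hypothetical output names (column, reading) that are not part of the example above. It also shows that pivot can undo the melt:

```python
import pandas as pd

# Hypothetical wide table with an identifier column.
wide = pd.DataFrame({"id": ["x", "y"], "A": [4, 7], "B": [5, 8], "C": [6, 9]})

# Keep 'id' as is and gather A, B, C into two columns.
long = wide.melt(id_vars="id", var_name="column", value_name="reading")
print(long)

# pivot reverses the operation: one row per id, one column per former column name.
back = long.pivot(index="id", columns="column", values="reading")
print(back)
```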

pd.concat: Combine DataFrames

Example

df1 = pd.DataFrame({
    "A": [1, 2, 3],
    "B": [4, 5, 6],
    "C": [7, 8, 9]})

df2 = pd.DataFrame({
    "A": [10, 11],
    "B": [12, 13],
    "C": [14, 15]})

print(df1)

print(df2)
   A  B  C
0  1  4  7
1  2  5  8
2  3  6  9

    A   B   C
0  10  12  14
1  11  13  15

pd.concat([df1, df2])
    A   B   C
0   1   4   7
1   2   5   8
2   3   6   9
0  10  12  14
1  11  13  15
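
Notice that the result keeps the original row labels, so 0 and 1 appear twice. As a short sketch of two common variations, ignore_index=True renumbers the rows, and axis=1 places the frames side by side instead of on top of each other. The _2 suffix is only our choice here to keep the column names distinct:

```python
import pandas as pd

df1 = pd.DataFrame({"A": [1, 2, 3], "B": [4, 5, 6], "C": [7, 8, 9]})
df2 = pd.DataFrame({"A": [10, 11], "B": [12, 13], "C": [14, 15]})

# Renumber the rows 0..4 instead of repeating the labels 0 and 1.
stacked = pd.concat([df1, df2], ignore_index=True)
print(stacked)

# Combine column-wise; rows are aligned on the index, so the row
# labelled 2 (missing in df2) gets NaN in the df2 columns.
side_by_side = pd.concat([df1, df2.add_suffix("_2")], axis=1)
print(side_by_side)
```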

df.explode: Transform each element of a list-like to a row

Example

df = pd.DataFrame({'A': [[1, 2, 3], [4, 5, 6]]})
df
           A
0  [1, 2, 3]
1  [4, 5, 6]

df.explode('A')
   A
0  1
0  2
0  3
1  4
1  5
1  6
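
As the output shows, explode repeats the original index for every element. The sketch below assumes a hypothetical extra column called key to show that the other columns are repeated as well, and uses reset_index(drop=True) to get a fresh index. The exploded column keeps the object dtype, so we convert it explicitly:

```python
import pandas as pd

# Hypothetical frame: 'key' is an extra column that is not list-like.
df = pd.DataFrame({"key": ["x", "y"], "A": [[1, 2, 3], [4, 5, 6]]})

# Each list element becomes its own row; 'key' is repeated alongside it.
exploded = df.explode("A").reset_index(drop=True)

# The exploded column is still of dtype object, so cast it to integers.
exploded["A"] = exploded["A"].astype(int)
print(exploded)
```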

df.stack: Stack columns into the index

Example

df = pd.DataFrame([[0, 1], [2, 3]],
                  index=['A', 'B'],
                  columns=['COL1', 'COL2'])
df
   COL1  COL2
A     0     1
B     2     3

df.stack()
A  COL1    0
   COL2    1
B  COL1    2
   COL2    3

df.unstack: Move an index level into columns

Example

import numpy as np

index = pd.MultiIndex.from_tuples([('A', 'col1'), ('A', 'col2'),
                                   ('B', 'col1'), ('B', 'col2')])
# Despite its name, df is a Series with a two-level index
df = pd.Series(np.arange(1.0, 5.0), index=index)
df
A  col1    1.0
   col2    2.0
B  col1    3.0
   col2    4.0

df.unstack()
   col1  col2
A   1.0   2.0
B   3.0   4.0
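
Since stack and unstack are inverse operations, applying one after the other should give back the original shape. A minimal sketch using the DataFrame from the stack example; the level argument selects which index level is moved into the columns (by default the innermost one):

```python
import pandas as pd

df = pd.DataFrame([[0, 1], [2, 3]],
                  index=["A", "B"],
                  columns=["COL1", "COL2"])

# Columns -> inner index level, then back again.
stacked = df.stack()
restored = stacked.unstack()
print(restored)

# Moving the outer level (level=0) instead yields the transposed layout:
# A and B become the columns, COL1 and COL2 the index.
swapped = stacked.unstack(level=0)
print(swapped)
```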

str.split(expand=True): Expand split strings into separate columns

Example

import pandas as pd

df = pd.DataFrame({"A": ['A B C', 'D E F', 'G H I']})
df
       A
0  A B C
1  D E F
2  G H I

print(df['A'].str.split(' ', expand=True))
   0  1  2
0  A  B  C
1  D  E  F
2  G  H  I
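
The new columns are simply numbered 0, 1, 2. As a short sketch, you can assign your own names afterwards (the names first, second and third are hypothetical), and the n parameter limits the number of splits if you only want to separate the first token from the rest:

```python
import pandas as pd

df = pd.DataFrame({"A": ["A B C", "D E F", "G H I"]})

# Split on spaces and give the resulting columns readable names.
parts = df["A"].str.split(" ", expand=True)
parts.columns = ["first", "second", "third"]  # hypothetical names
print(parts)

# n=1 performs at most one split: first token vs. everything after it.
first_rest = df["A"].str.split(" ", n=1, expand=True)
print(first_rest)
```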
