Assume that you are dealing with the following data frame:
import pandas as pd

df = pd.DataFrame({'MyDate': ['2020-03-11', '2021-04-26', '2021-01-17']})
df['MyDate'] = pd.to_datetime(df.MyDate)
df
Output:
MyDate
0 2020-03-11
1 2021-04-26
2 2021-01-17
Suppose you want to truncate each date to the first day of its month:
df['Truncated'] = df['MyDate'] + pd.offsets.MonthBegin(-1)
# OR
# df['Truncated'] = df['MyDate'] - pd.offsets.MonthBegin(1)
df
Output:
MyDate Truncated
0 2020-03-11 2020-03-01
1 2021-04-26 2021-04-01
2 2021-01-17 2021-01-01
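Before relying on this offset, it can be worth checking how it treats a date that already falls on the first of its month. The sketch below uses a small hypothetical series (`dates` and `shifted` are illustrative names, not part of the example above):

```python
import pandas as pd

# Hypothetical sample: the second date is already the first day of its month.
dates = pd.Series(pd.to_datetime(['2021-03-11', '2021-02-01']))

# Same offset arithmetic as above.
shifted = dates - pd.offsets.MonthBegin(1)
print(shifted)
```

Here the first value becomes 2021-03-01 as expected, but the second moves back to 2021-01-01 instead of staying at 2021-02-01.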
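Important note: this approach does NOT work when the date already falls on the first day of the month. In that case the offset moves it back to the first day of the previous month. For that reason, we provide other solutions too. One option is to convert the dates to monthly periods and back to timestamps: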
import pandas as pd

df = pd.DataFrame({'MyDate': ['2021-03-11', '2021-04-26', '2021-01-17', '2021-02-01']})
df['MyDate'] = pd.to_datetime(df.MyDate)
df['Truncated'] = df['MyDate'].dt.to_period('M').dt.to_timestamp()
print(df)
And we get:
MyDate Truncated
0 2021-03-11 2021-03-01
1 2021-04-26 2021-04-01
2 2021-01-17 2021-01-01
3 2021-02-01 2021-02-01
If you want a string in yyyy-mm format, you can work as follows:
df['yr_month'] = df['MyDate'].dt.strftime("%Y-%m")
df
MyDate yr_month
0 2020-03-11 2020-03
1 2021-04-26 2021-04
2 2021-01-17 2021-01
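One place such a year-month string can be handy is as a grouping key. The following is a small sketch with made-up dates (the data and the `counts` name are assumptions for illustration) that counts rows per month:

```python
import pandas as pd

# Hypothetical sample with two dates in the same month.
df = pd.DataFrame({'MyDate': pd.to_datetime(['2020-03-11', '2021-04-26', '2021-04-02'])})
df['yr_month'] = df['MyDate'].dt.strftime('%Y-%m')

# Count how many rows fall into each year-month bucket.
counts = df.groupby('yr_month').size()
print(counts)
```

Since the format puts the year first and zero-pads the month, the strings also sort in chronological order.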
You can use the same strftime approach to truncate the dates to the first day of the month. Note that the result is a column of strings, not datetimes:
df['Truncated'] = df['MyDate'].dt.strftime('%Y-%m-01')
print(df)
And we get:
MyDate Truncated
0 2021-03-11 2021-03-01
1 2021-04-26 2021-04-01
2 2021-01-17 2021-01-01
3 2021-02-01 2021-02-01
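Since `strftime` returns text, the truncated column may need converting back if you plan to do date arithmetic on it. A minimal sketch, assuming a small hypothetical frame and an illustrative `as_text` variable:

```python
import pandas as pd

# Hypothetical sample, including a date already on the first of the month.
df = pd.DataFrame({'MyDate': pd.to_datetime(['2021-03-11', '2021-02-01'])})

# strftime produces strings such as '2021-03-01'.
as_text = df['MyDate'].dt.strftime('%Y-%m-01')

# Parse the strings back into datetimes.
df['Truncated'] = pd.to_datetime(as_text)
print(df.dtypes)
```

After the conversion, both columns hold datetime values, and the date that was already on the first of the month is left unchanged.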