datetime#
In this section I will pay special attention to working with dates and times in Pandas.
import numpy as np
import pandas as pd

# Default series of random dates used by the examples in this section.
test_series = pd.Series(
    np.random.choice(
        pd.date_range("2021-01-01", "2022-12-31"),
        size=40
    ))
date_range#
pd.date_range creates a sequence of dates. By default it returns a DatetimeIndex, but you can easily convert it to other common array types.
Basic#
The following example generates consecutive daily dates and converts them to a list.
pd.date_range("2020-01-01", "2020-01-07").to_list()
[Timestamp('2020-01-01 00:00:00'),
Timestamp('2020-01-02 00:00:00'),
Timestamp('2020-01-03 00:00:00'),
Timestamp('2020-01-04 00:00:00'),
Timestamp('2020-01-05 00:00:00'),
Timestamp('2020-01-06 00:00:00'),
Timestamp('2020-01-07 00:00:00')]
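The conversions mentioned above can be sketched as follows. This is a minimal illustration rather than an exhaustive list; the variable names are only for this example.

```python
import numpy as np
import pandas as pd

# Build a DatetimeIndex covering one week, as in the example above.
week = pd.date_range("2020-01-01", "2020-01-07")

as_list = week.to_list()      # plain Python list of pandas Timestamp objects
as_numpy = week.to_numpy()    # NumPy array with a datetime64 dtype
as_series = week.to_series()  # Series whose index and values are both the dates

print(type(week).__name__)
print(len(as_list), as_numpy.dtype, as_series.shape)
```

to_series is the form used later in this section, because the dt accessor is only available on a Series.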
freq#
The freq parameter sets the step between consecutive dates in the result. The following DataFrame shows two options; the value passed to freq is given in each column name.
# Both ranges contain 10 dates, so they fit into one DataFrame.
pd.DataFrame({
    "Weeks 'W'" : pd.date_range("2021-01-01", "2021-03-07", freq="W"),
    "Days 'd'" : pd.date_range("2021-01-01", "2021-01-10", freq="d"),
})
| | Weeks 'W' | Days 'd' |
---|---|---|
0 | 2021-01-03 | 2021-01-01 |
1 | 2021-01-10 | 2021-01-02 |
2 | 2021-01-17 | 2021-01-03 |
3 | 2021-01-24 | 2021-01-04 |
4 | 2021-01-31 | 2021-01-05 |
5 | 2021-02-07 | 2021-01-06 |
6 | 2021-02-14 | 2021-01-07 |
7 | 2021-02-21 | 2021-01-08 |
8 | 2021-02-28 | 2021-01-09 |
9 | 2021-03-07 | 2021-01-10 |
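In the table above, the weekly dates start on 2021-01-03 rather than on the requested start date, because freq="W" steps from Sunday to Sunday. The sketch below checks this and shows two other aliases, "W-MON" (weeks anchored on Monday) and "2D" (every second day). The variable names are only for this example.

```python
import pandas as pd

# "W" is anchored on Sundays, so the first date is the first Sunday in range.
sundays = pd.date_range("2021-01-01", "2021-03-07", freq="W")

# "W-MON" anchors the weekly step on Mondays instead.
mondays = pd.date_range("2021-01-01", "2021-03-07", freq="W-MON")

# A number in front of the alias multiplies the step: every second day.
every_other_day = pd.date_range("2021-01-01", "2021-01-10", freq="2D")

print(set(sundays.day_name()))
print(set(mondays.day_name()))
print(every_other_day.to_list())
```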
Random dates#
This snippet generates a pandas.Series of random dates within a specified range. It is reused in the other examples, so I create test_series here; it serves as the default experimental variable for the other sections.
import pandas as pd
import numpy as np

start_date = '2021-01-01'
end_date = '2021-12-31'
num_dates = 10

# Sample num_dates random days (with replacement) between the two bounds.
test_series = pd.Series(
    np.random.choice(
        pd.date_range(start_date, end_date),
        size=num_dates
    ))
test_series
0 2021-03-24
1 2021-10-19
2 2021-08-04
3 2021-02-08
4 2021-04-10
5 2021-08-31
6 2021-06-22
7 2021-12-19
8 2021-09-20
9 2021-08-08
dtype: datetime64[ns]
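Because the dates are drawn at random, the output above changes on every run. If reproducible examples are needed, one option is to draw from a seeded NumPy generator. The sketch below assumes this approach; the helper random_dates and its seed parameter are hypothetical names introduced only for illustration.

```python
import numpy as np
import pandas as pd

def random_dates(start, end, n, seed=None):
    """Return a Series of n random days between start and end (inclusive)."""
    rng = np.random.default_rng(seed)                  # seeded generator
    candidates = pd.date_range(start, end).to_numpy()  # all days in the range
    return pd.Series(rng.choice(candidates, size=n))   # sample with replacement

first = random_dates("2021-01-01", "2021-12-31", 10, seed=42)
second = random_dates("2021-01-01", "2021-12-31", 10, seed=42)

# The same seed gives the same dates.
print(first.equals(second))
```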
Extracting components#
Extracting a component of a date (day, week, and so on) from a pandas series is a common task, so here I show some options. These components are usually available through the dt accessor of the series.
dt.day_of_week#
dt.day_of_week returns the day of the week for each date as a number: 0 for Monday, …, 6 for Sunday.
The following example shows this for the week in which this page was created.
week_range = pd.date_range("2023-08-28", "2023-09-03").to_series()
pd.DataFrame({
    "Original date" : week_range,
    "Day of the week" : week_range.dt.day_of_week
})
| | Original date | Day of the week |
---|---|---|
2023-08-28 | 2023-08-28 | 0 |
2023-08-29 | 2023-08-29 | 1 |
2023-08-30 | 2023-08-30 | 2 |
2023-08-31 | 2023-08-31 | 3 |
2023-09-01 | 2023-09-01 | 4 |
2023-09-02 | 2023-09-02 | 5 |
2023-09-03 | 2023-09-03 | 6 |
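If names are more convenient than numbers, the dt accessor also provides day_name. The sketch below applies it to the same week as above and uses the numeric values to flag weekends; the variable names are only for this example.

```python
import pandas as pd

week_range = pd.date_range("2023-08-28", "2023-09-03").to_series()

# English day names instead of the 0..6 codes.
names = week_range.dt.day_name()

# Codes 5 and 6 correspond to Saturday and Sunday.
is_weekend = week_range.dt.day_of_week >= 5

print(pd.DataFrame({"Name": names, "Weekend": is_weekend}))
```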
week of year#
dt.isocalendar().week#
Use dt.isocalendar().week to get the week number of each date in a series.
test_weeks = pd.date_range("2021-01-01", "2021-04-1", freq="W").to_series()
test_weeks.dt.isocalendar().week.rename("week number").to_frame()
| | week number |
---|---|
2021-01-03 | 53 |
2021-01-10 | 1 |
2021-01-17 | 2 |
2021-01-24 | 3 |
2021-01-31 | 4 |
2021-02-07 | 5 |
2021-02-14 | 6 |
2021-02-21 | 7 |
2021-02-28 | 8 |
2021-03-07 | 9 |
2021-03-14 | 10 |
2021-03-21 | 11 |
2021-03-28 | 12 |
Note: The first days of a year may belong to the last week (the 52nd or 53rd) of the previous year. The documentation for this function is not very extensive and does not describe in detail how the week number is calculated. In the next cell, I go through the days around the year boundary for several years. It turns out that a week belongs to the year that contains most of its days and is numbered accordingly, which matches the ISO 8601 week-numbering convention.
from datetime import datetime
from IPython.display import HTML, display

# Show the days around each New Year together with their week numbers.
for y in range(2012, 2017):
    next_y = y + 1
    days = pd.date_range(
        datetime(y, 12, 28),
        datetime(next_y, 1, 3), freq="d"
    ).to_series()
    display(HTML(f"<p style='font-size:150%'>======{y}-{next_y}======</p>"))
    display(pd.DataFrame({
        "Day": days,
        "Day of week": days.dt.day_of_week,
        "Week of year": days.dt.isocalendar().week
    }))
======2012-2013======
| | Day | Day of week | Week of year |
---|---|---|---|
2012-12-28 | 2012-12-28 | 4 | 52 |
2012-12-29 | 2012-12-29 | 5 | 52 |
2012-12-30 | 2012-12-30 | 6 | 52 |
2012-12-31 | 2012-12-31 | 0 | 1 |
2013-01-01 | 2013-01-01 | 1 | 1 |
2013-01-02 | 2013-01-02 | 2 | 1 |
2013-01-03 | 2013-01-03 | 3 | 1 |
======2013-2014======
| | Day | Day of week | Week of year |
---|---|---|---|
2013-12-28 | 2013-12-28 | 5 | 52 |
2013-12-29 | 2013-12-29 | 6 | 52 |
2013-12-30 | 2013-12-30 | 0 | 1 |
2013-12-31 | 2013-12-31 | 1 | 1 |
2014-01-01 | 2014-01-01 | 2 | 1 |
2014-01-02 | 2014-01-02 | 3 | 1 |
2014-01-03 | 2014-01-03 | 4 | 1 |
======2014-2015======
| | Day | Day of week | Week of year |
---|---|---|---|
2014-12-28 | 2014-12-28 | 6 | 52 |
2014-12-29 | 2014-12-29 | 0 | 1 |
2014-12-30 | 2014-12-30 | 1 | 1 |
2014-12-31 | 2014-12-31 | 2 | 1 |
2015-01-01 | 2015-01-01 | 3 | 1 |
2015-01-02 | 2015-01-02 | 4 | 1 |
2015-01-03 | 2015-01-03 | 5 | 1 |
======2015-2016======
| | Day | Day of week | Week of year |
---|---|---|---|
2015-12-28 | 2015-12-28 | 0 | 53 |
2015-12-29 | 2015-12-29 | 1 | 53 |
2015-12-30 | 2015-12-30 | 2 | 53 |
2015-12-31 | 2015-12-31 | 3 | 53 |
2016-01-01 | 2016-01-01 | 4 | 53 |
2016-01-02 | 2016-01-02 | 5 | 53 |
2016-01-03 | 2016-01-03 | 6 | 53 |
======2016-2017======
| | Day | Day of week | Week of year |
---|---|---|---|
2016-12-28 | 2016-12-28 | 2 | 52 |
2016-12-29 | 2016-12-29 | 3 | 52 |
2016-12-30 | 2016-12-30 | 4 | 52 |
2016-12-31 | 2016-12-31 | 5 | 52 |
2017-01-01 | 2017-01-01 | 6 | 52 |
2017-01-02 | 2017-01-02 | 0 | 1 |
2017-01-03 | 2017-01-03 | 1 | 1 |
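Since weeks here run from Monday to Sunday, "the year that holds most of the week's days" is the same as "the year that holds the week's Thursday". The sketch below checks that reading against the year column of dt.isocalendar() for every day in the period covered by the tables above. It is a sanity check under that assumption, not a description of how pandas computes the value internally.

```python
import pandas as pd

# Every day from the first to the last date shown in the tables above.
days = pd.date_range("2012-12-28", "2017-01-03", freq="D").to_series()
iso = days.dt.isocalendar()  # DataFrame with columns year, week, day

# Thursday of the Monday-to-Sunday week that each date falls in.
thursday = days + pd.to_timedelta(3 - days.dt.day_of_week, unit="D")

# The week's year should equal the calendar year of that Thursday.
rule_holds = bool((thursday.dt.year == iso["year"].astype("int64")).all())
print(rule_holds)

# 2016-01-01 is a Friday; its week's Thursday is 2015-12-31.
print(iso.loc[pd.Timestamp("2016-01-01"), ["year", "week"]].tolist())
```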
weekofyear#
A pandas Timestamp has a weekofyear attribute, which you can combine with the apply method of a series, as in the next example.
test_weeks = pd.date_range("2021-01-01", "2021-04-1", freq="W").to_series()
test_weeks.apply(lambda val: val.weekofyear).rename("week number").to_frame()
| | week number |
---|---|
2021-01-03 | 53 |
2021-01-10 | 1 |
2021-01-17 | 2 |
2021-01-24 | 3 |
2021-01-31 | 4 |
2021-02-07 | 5 |
2021-02-14 | 6 |
2021-02-21 | 7 |
2021-02-28 | 8 |
2021-03-07 | 9 |
2021-03-14 | 10 |
2021-03-21 | 11 |
2021-03-28 | 12 |
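The two outputs in this section are identical, so either approach can be used. The sketch below compares them directly; dt.isocalendar() works on the whole series at once, while apply calls the lambda once per element. The variable names are only for this example.

```python
import pandas as pd

test_weeks = pd.date_range("2021-01-01", "2021-04-01", freq="W").to_series()

# Element-wise: read the weekofyear attribute of each Timestamp.
via_apply = test_weeks.apply(lambda ts: ts.weekofyear)

# Vectorized: take the week column of isocalendar(), cast to a plain integer dtype.
via_accessor = test_weeks.dt.isocalendar().week.astype("int64")

same = bool((via_apply == via_accessor).all())
print(same)
```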