How to group data with pandas?


The pandas groupby method groups a DataFrame using a mapper or a Series of columns and returns a GroupBy object. For example, df.groupby("col") groups the rows by the values in the column named “col”. Combining grouping with aggregation functions helps you learn features of your dataset, such as the sum, mean, or median of each group. The sample code below may help.

Prepare data


import pandas as pd
# Data source: https://github.com/ziwangdeng/Data/blob/main/Vancouver_weather2010to2019_v01.csv
df = pd.read_csv('Vancouver_weather2010to2019_v01.csv')
df.columns
Index(['Unnamed: 0', 'Date/Time (LST)', 'Year', 'Month', 'Day', 'Time (LST)',
       'Temp (°C)', 'Dew Point Temp (°C)', 'Rel Hum (%)', 'Wind Dir (10s deg)',
       'Wind Spd (km/h)', 'Visibility (km)', 'Stn Press (kPa)', 'Weather'],
      dtype='object')

Group


df.groupby("Year")["Temp (°C)"].mean()
df.groupby("Year")["Temp (°C)"].median()
df.groupby("Year")["Temp (°C)"].min()
df.groupby("Year")["Temp (°C)"].max()
df.groupby("Year")["Temp (°C)"].std()
df.groupby("Year")["Temp (°C)"].var()
df.groupby("Year")["Temp (°C)"].quantile(0.95)
df.groupby("Year")["Temp (°C)"].size()
df.groupby("Year")["Temp (°C)"].count()
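# Aggregate without grouping; select numeric columns first, since 'sum' is not meaningful for text columns such as Weather
df[["Temp (°C)", "Rel Hum (%)"]].agg(['sum', 'min'])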
df.groupby("Year")["Temp (°C)"].agg(['sum','mean','median','max','std','var','min'])

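# Group by the first index level; with the default row index every row is its own group,
# so set a column such as Year as the index first
df.set_index("Year").groupby(level=0)["Temp (°C)"].sum()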
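
The difference between size() and count(), and the way agg() labels its results, is easier to see on a tiny table. The sketch below uses a hypothetical toy DataFrame (not the Vancouver file) that reuses the Year and Temp (°C) column names; the result labels mean_temp, max_temp, and n_valid are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the weather file; one temperature is missing.
toy = pd.DataFrame({
    "Year": [2010, 2010, 2010, 2011, 2011],
    "Temp (°C)": [1.0, 3.0, np.nan, 4.0, 8.0],
})

temp_by_year = toy.groupby("Year")["Temp (°C)"]

# size() counts every row in each group; count() skips missing values.
sizes = temp_by_year.size()
counts = temp_by_year.count()

# Named aggregation gives each result column an explicit label.
summary = temp_by_year.agg(mean_temp="mean", max_temp="max", n_valid="count")
print(sizes)
print(counts)
print(summary)
```

Here the 2010 group has three rows but only two valid temperatures, so size() and count() disagree for that year.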

Shift


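# Shift rows up by 2: each row takes the values from 2 rows later; the last 2 rows are filled with missing values
df.shift(-2, axis=0)
# Shift rows down by 2: each row takes the values from 2 rows earlier; the first 2 rows are filled with missing values
df.shift(2, axis=0)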
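
A common use of shift is comparing each row with its neighbour. The following is a minimal sketch on a hypothetical toy DataFrame (not the Vancouver file); the column names prev, next, and change are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical toy series of consecutive temperature readings.
toy = pd.DataFrame({"Temp (°C)": [1.0, 2.0, 4.0, 7.0]})

# shift(1) moves values down one row, so each row sees the previous value.
toy["prev"] = toy["Temp (°C)"].shift(1)
# shift(-1) moves values up one row, so each row sees the next value.
toy["next"] = toy["Temp (°C)"].shift(-1)
# Change from the previous row; the first row has no predecessor, so it is NaN.
toy["change"] = toy["Temp (°C)"] - toy["prev"]
print(toy)
```

The index stays the same in every case; only the values move, and the vacated positions are filled with NaN.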