It is easy to change the layout, sort, reindex, rename, and subset table data using pandas commands. The simple code below shows you how to do this easily. The functions include melt(),pivot,sort_values,rename,sort_index,drop,filter,query,iloc,loc,iat,at and drop_duplicates.
Table of Contents
Prepare data
#dowonload https://github.com/ziwangdeng/Data/blob/main/Vancouver_weather2010to2019_v00.csv
import pandas as pd
df=pd.read_csv('Vancouver_weather2010to2019_v00.csv')
cols=df.columns
df1=df[cols[:10]]
df2=df[cols[10:]]
ll=len(df)
df3=df.head(5000)
df4=df.tail(ll-5000)
df.columns
Index(['Unnamed: 0', 'Longitude (x)', 'Latitude (y)', 'Station Name',
'Climate ID', 'Date/Time (LST)', 'Year', 'Month', 'Day', 'Time (LST)',
'Temp (°C)', 'Temp Flag', 'Dew Point Temp (°C)', 'Dew Point Temp Flag',
'Rel Hum (%)', 'Rel Hum Flag', 'Precip. Amount (mm)',
'Precip. Amount Flag', 'Wind Dir (10s deg)', 'Wind Dir Flag',
'Wind Spd (km/h)', 'Wind Spd Flag', 'Visibility (km)',
'Visibility Flag', 'Stn Press (kPa)', 'Stn Press Flag', 'Hmdx',
'Hmdx Flag', 'Wind Chill', 'Wind Chill Flag', 'Weather'],
dtype='object')
Collect columns into rows
pd.melt(df).sample(n=10)
data:image/s3,"s3://crabby-images/90246/90246264f7a9de0a6cd0404912e10bb663af075e" alt=""
Spread rows into columns
df.pivot(columns='Month', values='Temp (°C)').sample(n=10)
data:image/s3,"s3://crabby-images/76adb/76adbe8e25b425bfe6d532688ccd9b8f3df293ce" alt=""
Append rows of DataFrames
pd.concat([df3,df4])
df3.append(df4,ignore_index=True)
df3.shape
(5000, 31)
df4.shape
(83392, 31)
pd.concat([df3,df4]).shape
(88392, 31)
Append columns of DataFrames
pd.concat([df1,df2], axis=1)
df1.shape
(88392, 10)
df2.shape
(88392, 21)
pd.concat([df1,df2], axis=1).shape
(88392, 31)
Drop columns from DataFrame
df.drop(columns=['Unnamed: 0', 'Longitude (x)', 'Latitude (y)', 'Station Name'])
Subset columns of DataFrames
Select columns whose name matches regular expression regex
df[['Longitude (x)', 'Latitude (y)', 'Station Name']]
#Select columns whose name matches regular expression regex
df.filter(regex='^T')
data:image/s3,"s3://crabby-images/ad9df/ad9df83f0d2edc70a2f6c2efffbbc9ee0fe48e40" alt=""
Select columns using iloc
df.iloc[:, [0, 10, 20, 25]]
data:image/s3,"s3://crabby-images/a2702/a2702c1f43ddbf90160e9380485913caccba4657" alt=""
Select columns between 2 columns
df.loc[:, 'Time (LST)':'Dew Point Temp Flag']
data:image/s3,"s3://crabby-images/162cc/162cc969cfccde40b3e23ec3d54094134fada371" alt=""
Order rows by values of a column
df.sort_values('Temp (°C)')
df.sort_values('Temp (°C)', ascending=True)
df.sort_values('Temp (°C)', ascending=False)
Subset rows
Extract rows that meet logical criteria.
df[df.Year > 2019]
Remove duplicate rows
df.drop_duplicates()
Select rows using query
df.query('Year >2019')
data:image/s3,"s3://crabby-images/89a4d/89a4d0213476252d12932dd4de2d3ab2286d576f" alt=""
df.query('Year > 2019 and Day < 10')
data:image/s3,"s3://crabby-images/613b6/613b6302d8adf55d021215b683307a8f792c0322" alt=""
Select rows based on index
df.iloc[10:20]
df.iloc[:10]
df.iloc[10:]
df.iloc[[1,5,100]]
Subset rows and columns
df.loc[df['Year'] > 2019, ['Time (LST)','Dew Point Temp (°C)']]
Access single value by index
df.iat[10, 5]
df.at[4, 'Year']
Set index with a column
df.set_index('Date/Time (LST)')
Set MultiIndex for Pandas DataFrame
df.set_index(['Date/Time (LST)','Weather'])
data:image/s3,"s3://crabby-images/3bc9a/3bc9aa14cc7b028064f2fa51bccdb90575d2cd7b" alt=""
Sort the index of a DataFrame
df.sort_index()
Reset index of DataFrame to row numbers, moving index to columns.
df.reset_index()
Rename the columns of a DataFrame
df.rename(columns = {'Temp (°C)':'Tas','Wind Spd (km/h)':'Wsp'})
data:image/s3,"s3://crabby-images/1ac72/1ac727c9a00d0b62c8aa179a9d45983ae335fbbc" alt=""