Based on my code: Canada_COVID19_cases_information.ipynb
I like this way of converting a string column to a date column in PySpark:
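from datetime import datetime
from pyspark.sql.functions import col, udf
from pyspark.sql.types import DateType  # data types
# Parse 'dd/mm/yyyy' strings into a DateType column
func = udf(lambda x: datetime.strptime(x, '%d/%m/%Y'), DateType())
df = df.withColumn('newDate', func(col('Date')))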
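One caveat with a bare strptime lambda: it raises on None, so a row with a null Date would fail the UDF. Below is a minimal sketch of a null-safe parser in plain Python. The name parse_dmy is hypothetical, and returning None for malformed strings is an assumption, not something the notebook does.

```python
from datetime import date, datetime
from typing import Optional


def parse_dmy(value: Optional[str]) -> Optional[date]:
    """Parse a 'dd/mm/YYYY' string; return None for missing or malformed input."""
    if value is None:
        return None
    try:
        return datetime.strptime(value.strip(), '%d/%m/%Y').date()
    except ValueError:
        # Assumption: unparseable strings become nulls instead of failing the job
        return None


# Quick check on a valid string, a null, and a string in the wrong format
parsed = [parse_dmy(s) for s in ['25/03/2020', None, '2020-03-25']]
```

It could be wrapped the same way as the lambda, e.g. udf(parse_dmy, DateType()).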
Calculate the number of days between two dates:
from pyspark.sql import functions as F
df = df.withColumn('startDay', F.lit('2020-01-01').cast('date'))
df = df.withColumn('Days_from_01_Jan', F.datediff(F.col('newDate'), F.col('startDay')))
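F.datediff(end, start) returns end minus start in whole days, so 1 January itself maps to 0, not 1. A small plain-Python sketch of the same arithmetic, handy for sanity-checking a few rows; days_from_start is a hypothetical helper name used only for illustration.

```python
from datetime import date


def days_from_start(d: date, start: date = date(2020, 1, 1)) -> int:
    """Number of days from start to d (negative if d is before start)."""
    return (d - start).days


checks = {
    '2020-01-01': days_from_start(date(2020, 1, 1)),
    '2020-01-31': days_from_start(date(2020, 1, 31)),
    '2020-03-01': days_from_start(date(2020, 3, 1)),  # 2020 is a leap year
}
```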
Convert a pandas column of date strings to datetime:
import pandas as pd
df['Date'] = pd.to_datetime(df['Date'])
Get today's date:
pd.Timestamp.today().date()
Example of building a date column from month and day columns of a pandas DataFrame (the year is fixed to 2020 here):
from datetime import date
casetable2["Date"] = casetable2.apply(lambda row: date(2020, int(row.Month), int(row.Day)), axis=1)
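A possible vectorized alternative to the row-wise apply: pd.to_datetime can assemble dates from a DataFrame whose columns are named year, month, and day. This is a sketch with made-up sample rows standing in for casetable2. Note that it produces datetime64 values, not datetime.date objects.

```python
import pandas as pd

# Made-up sample rows standing in for casetable2 (assumption for illustration)
casetable2 = pd.DataFrame({"Month": [3, 4], "Day": [15, 1]})

# pd.to_datetime expects columns named year, month, day
parts = casetable2[["Month", "Day"]].rename(columns={"Month": "month", "Day": "day"})
parts = parts.assign(year=2020)  # fixed year, as in the apply version

casetable2["Date"] = pd.to_datetime(parts)
```

If plain date objects are needed, casetable2["Date"].dt.date converts the column afterwards.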