If you have grid or point data and polygon shapefiles for longitude and longitude. Using geopandas and shapely it’s easy to find grids or points in specific polygons. import geopandas as gpd import pandas as pd import numpy as np import matplotlib.pyplot as plt from shapely.geometry import Point # US state polygon shape file US_states=gpd.GeoDataFrame.from_file(‘states_province_shapefile\cb_2018_us_state_20m\cb_2018_us_state_20m.shp’) ..
Category : Pandas
The following example is to find latitude and longitude of a city from the data worldcities (https://datahub.io/core/world-cities) and then based on the latitude and longitude to slice data from era5 data of this city. It also can be used to find data from other source spatial data. Answer other questions: How to read netcdf file ..
The global historical climate network daily data (ghcnd) is the widely used global climate data for all purposes. However, the data size is huge. If you only want to use data from a certain area, your best bet is to find the site information first, and then download the data for those weather stations directly. ..
Following is my python code to plot stock chart on a background image from local file. The code is not optimal, I will improve it later. Hope it be useful. The example is plot ABX.TO.csv (download from yahoo finance) data on a logo collection image (my file:/media/Data1/XIU/AACodes/NASDAQ_stocks.png). Code # plot stock data with a background-image ..
This example is used for fitting world covid-19 cases number import numpy as np import pandas as pd from datetime import datetime from lmfit import Minimizer, Parameters, report_fit import chart_studio.plotly as py import cufflinks as cf def Cauchy_cumulative_hazard_fit(x,loc,scale,decaybase): decayterm=np.power(decaybase,(x-loc)) decayterm[np.whe..
df2[‘date’] = df1[‘date’].values df2[‘hour’] = df1[‘hour’].values It’s better use inter join case.tail(3) caseCA caseON caseDaily CAcaseDailyON case_Date_province 2020-04-26 47864 15411 1598.0 498.0 2020-04-27 49499 15868 1635.0 457.0 2020-04-28 50982 16337 1483.0 469.0 test.tail(3) testCA testON testDailyCA testDailyON date_testing 2020-04-26 734824 229638 23570.0 12020.0 2020-04-27 765056 242188 30232.0 12550.0 2020-04-28 787612 253040 22556.0 10852.0 case_test = ..
import pandas as pd from plotly.offline import iplot import cufflinks cufflinks.go_offline() # Set global theme cufflinks.set_config_file(world_readable=True, theme=’pearl’) fig=df.iplot(asFigure=True, mode=’lines+markers’, size=6, secondary_y = ‘Increase’, secondary_y_title=’Increase’, xTitle=’Date’, yTitle=’Cases’, title=’Projected COVID-19 Cases in South Korea’, theme=’solar’) fig.show() import pandas as pd import chart_studio.plotly as py from ipywidgets import interact, interact_manual import cufflinks as cf @interact def plot_ProjectedSouthKereaCOVID19(): fig=output.iplot(asFigure=True, ..
import pandas as pd import numpy as np # model from lmfit import Minimizer, Parameters, report_fit #plot import chart_studio.plotly as py import ipywidgets as widgets from ipywidgets import interact, interact_manual import cufflinks as cf theCountry=’Canada’ threshhold=10 theData=confirmed_series_21[confirmed_series_21[theCountry]>threshhold] data=theData[theCountry] start_date= data.index[0] end_date= data.index[-1] dateData=pd.date_range(start=start_date,end=end_date) forecastDays=60 dateForecast= pd.date_range(start=end_date,periods=forecastDays+1)[1:] dateObsForecast=dateData.append(dateForecast) #dateObsForecast # define objective function: returns the array ..
output = pd.DataFrame({‘date’ : [],’Forecast’:[],’Cases’: [],’Fitting’:[],’Increase’:[]}) output[‘date’]=dateObsForecast output[‘Forecast’]=y1*last output[‘Cases’].iloc[:dataLen]=data.values*last output[‘Fitting’].iloc[:dataLen]=final.values*last output[‘Increase’].iloc[1:]=(y1[1:]-y1[:-1])*last output=output.set_ind..
Examples for pivot_table of Pandas and crosstab of Pyspark from my work directory:pyWorkDir/Bigdata/Pyspark/DataForYuanPei.ipynb pivot_table casepandas=indcases.toPandas() casetable1=pd.pivot_table(casepandas, values=’VALUE’, index=[“Case identifier number”], columns=[“Case information”], aggfunc=np.sum) crosstab casetable=casedf.crosstab(‘case_Date’,’province’) casetable=casetable.toPandas() casetable=casetable.sort_values(‘case_Date_province’) cumsum_casetable=casetable.set_index(‘case_Date_province’).cumsum() cumsum_casetable[‘CA’]=cumsum_casetable.sum(axis=1) casedftable=casedf.crosstab(‘case_Date’,’health_region’) health_region_table=casedftable.select([‘case_Date_health_region’,’Toronto’,’Montréal’,’Vancouver Coastal’,..