Category: Pandas


Example code for creating and adding new features to a data frame using feature-engine. It also answers the following questions:

Math features

Code

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from feature_engine.creation import MathFeatures
from feature_engine.creation import RelativeFeatures
from feature_engine.creation import CyclicalFeatures
# create range of ..
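As a rough illustration of the three kinds of features imported above, here is a pandas-only sketch. It does not call feature-engine; it only mimics the general idea of math, relative, and cyclical features. The toy data, the column names, and the period of 12 are assumptions made for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are invented for this sketch.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0],
    "b": [4.0, 5.0, 6.0],
    "month": [1, 6, 12],
})

# Math features: aggregate several columns row by row.
df["sum_a_b"] = df[["a", "b"]].sum(axis=1)
df["mean_a_b"] = df[["a", "b"]].mean(axis=1)

# Relative feature: one column expressed relative to a reference column.
df["a_div_b"] = df["a"] / df["b"]

# Cyclical features: map a periodic variable onto the unit circle so that
# month 12 ends up next to month 1. The period of 12 is an assumption.
period = 12
df["month_sin"] = np.sin(2 * np.pi * df["month"] / period)
df["month_cos"] = np.cos(2 * np.pi * df["month"] / period)
```

The sine and cosine columns are used together: either one alone maps two different months to the same value.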



Example code for creating features from time series data, such as lag features and window features. It also answers the following questions:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from feature_engine.timeseries.forecasting import LagFeatures
# create range of monthly dates
download_dates = pd.date_range(start='2019-01-01', end='2020-01-01', freq='MS')
# URL from ..
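To make the terms concrete, the following is a minimal pandas-only sketch of a lag feature and a window feature. It uses `shift` and `rolling` rather than feature-engine, and the `value` column with its counter data is an assumption made for illustration; only the monthly date range comes from the excerpt.

```python
import pandas as pd

# Same monthly date range as in the excerpt (13 month-start dates).
dates = pd.date_range(start="2019-01-01", end="2020-01-01", freq="MS")

# Hypothetical series: a simple counter stands in for real observations.
ts = pd.DataFrame({"value": list(range(len(dates)))}, index=dates)

# Lag feature: the value observed one period earlier.
ts["value_lag_1"] = ts["value"].shift(1)

# Window feature: mean of the three previous periods. Shifting first keeps
# the current observation out of its own window, which avoids leakage.
ts["value_window_3_mean"] = ts["value"].shift(1).rolling(window=3).mean()
```

The first rows of both new columns are NaN because there is no earlier data to fill them; those rows are usually dropped or imputed before training.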


Example code for the log, reciprocal, arcsin, and power transformers of feature-engine. You can also find answers to the following questions:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from feature_engine import transformation as vt
# Load dataset
# create range of monthly dates
download_dates = pd.date_range(start='2019-01-01', end='2020-01-01', freq='MS')
# ..
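The four transformations named above can be sketched with plain NumPy to show what each one computes. This is not feature-engine's implementation; the toy columns, the exponent of 0.5, and the arcsine square-root form of the arcsin transform are assumptions made for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "x" is strictly positive, "p" is a proportion in [0, 1].
df = pd.DataFrame({"x": [1.0, 4.0, 9.0], "p": [0.0, 0.25, 1.0]})

# Log transform: only defined for strictly positive values.
df["x_log"] = np.log(df["x"])

# Reciprocal transform: only defined for non-zero values.
df["x_reciprocal"] = 1.0 / df["x"]

# Power transform: the exponent 0.5 (square root) is an illustrative choice.
df["x_power"] = np.power(df["x"], 0.5)

# Arcsine square-root transform: intended for proportions between 0 and 1.
df["p_arcsin"] = np.arcsin(np.sqrt(df["p"]))
```

Each transform has a restricted input domain, so it is worth checking the range of a variable before choosing one.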



Example code for handling outliers with three methods of feature-engine.

Winsorizer

Caps maximum and/or minimum values of a variable at automatically determined values. [ref: https://feature-engine.readthedocs.io/en/latest/user_guide/outliers/Winsorizer.html]

Code

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from feature_engine.outliers import Winsorizer
# Load dataset
def load_titanic():
    data = pd.read_csv('https://www.openml.org/data/get_csv/16826755/phpMYEkMl')
    data = data.replace('?', ..
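The capping idea described above can be sketched in plain pandas. This is not the Winsorizer class itself; it assumes the common inter-quartile-range rule with a multiplier of 1.5 as the way the limits are determined, and the `fare` values are invented.

```python
import pandas as pd

# Hypothetical variable with one obvious outlier.
fare = pd.Series([1.0, 2.0, 3.0, 4.0, 5.0, 100.0], name="fare")

# Determine the caps from the data with the IQR proximity rule.
# The multiplier 1.5 is a conventional choice and an assumption here.
q1, q3 = fare.quantile(0.25), fare.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr

# Cap values outside the limits instead of dropping the rows.
capped = fare.clip(lower=lower, upper=upper)
```

In a modelling workflow the limits would be learned on the training set only and then applied unchanged to the test set.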


Example Python code for handling missing data (ref: Python Feature Engineering Cookbook). It also answers the following questions:

import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.impute import SimpleImputer
from feature_engine.imputation import MeanMedianImputer
from feature_engine.imputation import ArbitraryNumberImputer
from feature_engine.imputation import EndTailImputer
from feature_engine.imputation import CategoricalImputer
from feature_engine.imputation import RandomSampleImputer
from feature_engine.imputation import AddMissingIndicator
from feature_engine.imputation ..
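For orientation, here is a pandas-only sketch of three of the ideas behind the imputers listed above: a missing indicator, median imputation, and filling a categorical variable with a placeholder label. It does not use feature-engine or scikit-learn; the columns and the label "Missing" are assumptions for this example.

```python
import numpy as np
import pandas as pd

# Hypothetical data with gaps in a numerical and a categorical column.
df = pd.DataFrame({
    "age": [20.0, np.nan, 40.0, 60.0],
    "cabin": ["A", None, "B", None],
})

# Missing indicator: record where values were absent before filling them.
df["age_na"] = df["age"].isna().astype(int)

# Median imputation: replace missing numbers with the observed median.
age_median = df["age"].median()
df["age"] = df["age"].fillna(age_median)

# Categorical imputation: replace missing labels with a placeholder category.
df["cabin"] = df["cabin"].fillna("Missing")
```

With a train/test split, the median would be computed on the training set and reused for the test set, which is the bookkeeping the imputer classes take care of.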


Sample code for generating a multi-level treemap. This example also covers some pandas data-processing methods, such as:
How to create a pandas DataFrame?
How to append several DataFrames to construct a bigger DataFrame?
How to build a hierarchical DataFrame?

import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
..
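The three pandas questions above can be sketched without any plotting. The data, the column names (`group`, `item`, `value`, `id`, `parent`), and the two-level layout are assumptions for this example and are not taken from the original post. `pd.concat` is used for appending because `DataFrame.append` was removed in pandas 2.0.

```python
import pandas as pd

# Create DataFrames from dictionaries (hypothetical data).
fruit = pd.DataFrame({"group": ["fruit", "fruit"],
                      "item": ["apple", "pear"],
                      "value": [3, 2]})
veg = pd.DataFrame({"group": ["vegetable"],
                    "item": ["carrot"],
                    "value": [5]})

# Append several DataFrames into a bigger one.
leaves = pd.concat([fruit, veg], ignore_index=True)

# Build a hierarchical table: one row per node with its parent and value.
# Parent nodes carry the total of their children and have an empty parent.
parents = leaves.groupby("group", as_index=False)["value"].sum()
parents = parents.rename(columns={"group": "id"})
parents["parent"] = ""
children = leaves.rename(columns={"item": "id", "group": "parent"})
hierarchy = pd.concat(
    [parents[["id", "parent", "value"]], children[["id", "parent", "value"]]],
    ignore_index=True,
)
```

A table of this shape lines up with the labels, parents, and values that a treemap trace such as `go.Treemap` expects, although the original post may structure its data differently.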


The first step in using stock data for financial analysis is to download price data from Yahoo Finance. Below is a very simple example of downloading data.

import yfinance as yf
import pandas as pd
startDay = "2020-06-18"
endDay = "2020-06-20"
fnlist = 'nasdqIntheUSAMostActiveStocks.txt'  # symbols
fns = pd.read_csv(fnlist)
toPath = 'NASDQ/'
for fn0 in fns.Symbols:
    stockName = fn0.replace(' ', '')
    toFile = toPath + stockName + '.csv'
    toFile = toFile.replace(' ', '')
    data = yf.download(stockName, start=startDay, end=endDay) ..
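The loop above builds one output path per symbol by string concatenation. One possible way to isolate that step is a small helper based on `pathlib`; the function name `csv_path_for` and its default directory argument are hypothetical and not part of the original script.

```python
from pathlib import Path


def csv_path_for(symbol, out_dir="NASDQ"):
    """Return the CSV path for a ticker symbol (hypothetical helper).

    Spaces are removed from the symbol, as in the original loop.
    """
    clean = symbol.replace(" ", "")
    return Path(out_dir) / f"{clean}.csv"


# Example with made-up symbols; no network access is needed for this step.
paths = [csv_path_for(s) for s in ["AAPL ", " MSFT"]]
```

Inside the download loop, the returned path could then be passed to `data.to_csv(...)` after the `yf.download(...)` call, assuming the script saves each download to disk.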
