Example code about how to extract several date and time features from datetime variables with feature-engine. It can answer following questions: import numpy as np import pandas as pd import matplotlib.pyplot as plt from sklearn.model_selection import train_test_split from feature_engine import transformation as vt # create range of monthly dates download_dates = pd.date_range(start=’2019-01-01′, end=’2020-01-01′, freq=’MS’) # ..
Tag : worknotes
Example code for log,reciprocal,arcsin ,power transformers of feature-engine. You can find answer to the following question as well: import numpy as np import pandas as pd import matplotlib.pyplot as plt from sklearn.model_selection import train_test_split from feature_engine import transformation as vt # Load dataset # create range of monthly dates download_dates = pd.date_range(start=’2019-01-01′, end=’2020-01-01′, freq=’MS’) # ..
Example code for handling outlier with 3 methods of feature-engine. Winsorizer Caps maximum and/or minimum values of a variable at automatically determined values.[ref:https://feature-engine.readthedocs.io/en/latest/user_guide/outliers/Winsorizer.html] Code import numpy as np import pandas as pd import matplotlib.pyplot as plt from sklearn.model_selection import train_test_split from feature_engine.outliers import Winsorizer # Load dataset def load_titanic(): data = pd.read_csv(‘https://www.openml.org/data/get_csv/16826755/phpMYEkMl’) data = data.replace(‘?’, ..
The sample code shows you how to encode categorical data and answer the following questions: One hot encoder Replaces the categorical variable by a group of binary variables which take value 0 or 1, to indicate if a certain category is present in an observation. Example code import numpy as np import pandas as pd ..
Example python code for handling missing data (ref:Python feature engineering cookbook ). Also answer the following questions: import pandas as pd from sklearn.model_selection import train_test_split from sklearn.impute import SimpleImputer from feature_engine.missing_data_imputers import MeanMedianImputer from feature_engine.imputation import ArbitraryNumberImputer from feature_engine.imputation import EndTailImputer from feature_engine.imputation import CategoricalImputer from feature_engine.imputation import RandomSampleImputer from feature_engine.imputation import AddMissingIndicator from feature_engine.imputation ..
Finds files with a pattern in filenames from a directory or directories. Find one file. find /directory/ -name “tofindthisfile.txt” -print Find all files with pattern in filenames. find /directory/ -name “aa*bb*.png” -print Find files in several directories. find /directory1/ /direcotory2/ -name “aa*bb*.pngR..
Sample code for multiple-level treemap generation.This example also includes some methods on pandas data processing, such as: How to create a pandas dataframe? How to append several dataframe to construct a bigger dataframe? How to build a hierarchical dataframe? import pandas as pd import numpy as np import plotly.express as px import plotly.graph_objects as go ..
The following python program show you the method to draw a rectangular area in a map and mark state or province name on the map. Answer to other questions: How to add ocean, lake, river, coastline, province border to map? How to set map extent? How to set xtick or ytick intervals? # Ignore warnings ..