Basic Functions

recipes-package recipes
recipes: A package for computing and preprocessing design matrices.
recipe()
Create a recipe for preprocessing data
formula(<recipe>)
Create a formula from a prepared recipe
print(<recipe>)
Print a Recipe
summary(<recipe>)
Summarize a recipe
prep()
Estimate a preprocessing recipe
bake()
Apply a trained preprocessing recipe
juice()
Extract transformed training set
selections selection
Methods for selecting variables in step functions
has_role() has_type() all_outcomes() all_predictors() all_date() all_date_predictors() all_datetime() all_datetime_predictors() all_double() all_double_predictors() all_factor() all_factor_predictors() all_integer() all_integer_predictors() all_logical() all_logical_predictors() all_nominal() all_nominal_predictors() all_numeric() all_numeric_predictors() all_ordered() all_ordered_predictors() all_string() all_string_predictors() all_unordered() all_unordered_predictors() current_info()
Role Selection
add_role() update_role() remove_role()
Manually alter roles
update_role_requirements()
Update role-specific requirements
get_case_weights() averages() medians() variances() correlations() covariances() pca_wts() are_weights_used()
Helpers for steps with case weights
case_weights
Using case weights with recipes

Step Functions - Imputation

step_impute_bag() imp_vars()
Impute via bagged trees
step_impute_knn()
Impute via k-nearest neighbors
step_impute_linear()
Impute numeric variables via a linear model
step_impute_lower()
Impute numeric data below the threshold of measurement
step_impute_mean()
Impute numeric data using the mean
step_impute_median()
Impute numeric data using the median
step_impute_mode()
Impute nominal data using the most common value
step_impute_roll()
Impute numeric data using a rolling window statistic
step_unknown()
Assign missing categories to "unknown"

Step Functions - Individual Transformations

step_BoxCox()
Box-Cox transformation for non-negative data
step_bs()
B-spline basis functions
step_harmonic()
Add sin and cos terms for harmonic analysis
step_hyperbolic()
Hyperbolic transformations
step_inverse()
Inverse transformation
step_invlogit()
Inverse logit transformation
step_log()
Logarithmic transformation
step_logit()
Logit transformation
step_mutate()
Add new variables using dplyr
step_ns()
Natural spline basis functions
step_poly()
Orthogonal polynomial basis functions
step_poly_bernstein()
Generalized Bernstein polynomial basis
step_relu()
Apply (smoothed) rectified linear transformation
step_spline_b()
Basis splines
step_spline_convex()
Convex splines
step_spline_monotone()
Monotone splines
step_spline_natural()
Natural splines
step_spline_nonnegative()
Non-negative splines
step_sqrt()
Square root transformation
step_YeoJohnson()
Yeo-Johnson transformation

Step Functions - Discretization

step_discretize()
Discretize Numeric Variables
discretize() predict(<discretize>)
Discretize Numeric Variables
step_cut()
Cut a numeric variable into a factor

Step Functions - Dummy Variables and Encodings

step_bin2factor()
Create a factor from a dummy variable
step_count()
Create counts of patterns using regular expressions
step_dummy()
Create traditional dummy variables
step_dummy_extract()
Extract patterns from nominal data
step_dummy_multi_choice()
Handle levels in multiple predictors together
step_factor2string()
Convert factors to strings
step_indicate_na()
Create missing data column indicators
step_integer()
Convert values to predefined integers
step_novel()
Simple value assignments for novel factor levels
step_num2factor()
Convert numbers to factors
step_ordinalscore()
Convert ordinal factors to numeric scores
step_other()
Collapse infrequent categorical levels
step_percentile()
Percentile transformation
step_regex()
Detect a regular expression
step_relevel()
Relevel factors to a desired level
step_string2factor()
Convert strings to factors
step_unknown()
Assign missing categories to "unknown"
step_unorder()
Convert ordered factors to unordered factors

Step Functions - Date and Datetime

step_date()
Date feature generator
step_time()
Time feature generator
step_holiday()
Holiday feature generator

Step Functions - Interactions

step_interact()
Create interaction variables

Step Functions - Normalization

step_center()
Centering numeric data
step_normalize()
Center and scale numeric data
step_range()
Scaling numeric data to a specific range
step_scale()
Scaling numeric data

Step Functions - Multivariate Transformations

step_classdist()
Distances to class centroids
step_classdist_shrunken()
Compute shrunken centroid distances for classification models
step_depth()
Data depths
step_geodist()
Distance between two locations
step_ica()
ICA signal extraction
step_isomap()
Isomap embedding
step_kpca()
Kernel PCA signal extraction
step_kpca_poly()
Polynomial kernel PCA signal extraction
step_kpca_rbf()
Radial basis function kernel PCA signal extraction
step_mutate_at()
Mutate multiple columns using dplyr
step_nnmf()
Non-negative matrix factorization signal extraction
step_nnmf_sparse()
Non-negative matrix factorization signal extraction with lasso penalization
step_pca()
PCA signal extraction
step_pls()
Partial least squares feature extraction
step_ratio() denom_vars()
Ratio variable creation
step_spatialsign()
Spatial sign preprocessing

Step Functions - Filters

step_corr()
High correlation filter
step_filter_missing()
Missing value column filter
step_lincomb()
Linear combination filter
step_nzv()
Near-zero variance filter
step_rm()
General variable filter
step_select()
Select variables using dplyr
step_zv()
Zero variance filter

Step Functions - Row Operations

step_arrange()
Sort rows using dplyr
step_filter()
Filter rows using dplyr
step_lag()
Create a lagged predictor
step_naomit()
Remove observations with missing values
step_impute_roll()
Impute numeric data using a rolling window statistic
step_sample()
Sample rows using dplyr
step_shuffle()
Shuffle variables
step_slice()
Filter rows by position using dplyr

Step Functions - Others

step_intercept()
Add intercept (or constant) column
step_profile()
Create a profiling version of a data set
step_rename()
Rename variables by name using dplyr
step_rename_at()
Rename multiple columns using dplyr
step_window()
Moving window functions

Check Functions

check_class()
Check variable class
check_cols()
Check if all columns are present
check_missing()
Check for missing values
check_new_values()
Check for new values
check_range()
Check range consistency

Developer Functions

developer_functions
Developer functions for creating recipes steps
add_step() add_check()
Add a New Operation to the Current Recipe
detect_step()
Detect if a particular step or check is used in a recipe
fully_trained()
Check to see if a recipe is trained/prepared
.get_data_types()
Get types for use in recipes
names0() dummy_names() dummy_extract_names()
Naming Tools
prepper()
Wrapper function for preparing recipes within resampling
recipes_eval_select()
Evaluate a selection with tidyselect semantics specific to recipes
recipes_extension_check()
Checks that steps have all S3 methods
recipes_ptype()
Prototype of recipe object
recipes_ptype_validate()
Validate prototype of recipe object
recipes_names_predictors() recipes_names_outcomes()
Role indicators
update(<step>)
Update a recipe step

Tidy Methods

tidy(<step_BoxCox>) tidy(<step_YeoJohnson>) tidy(<step_arrange>) tidy(<step_bin2factor>) tidy(<step_bs>) tidy(<step_center>) tidy(<check_class>) tidy(<step_classdist>) tidy(<step_classdist_shrunken>) tidy(<check_cols>) tidy(<step_corr>) tidy(<step_count>) tidy(<step_cut>) tidy(<step_date>) tidy(<step_depth>) tidy(<step_discretize>) tidy(<step_dummy>) tidy(<step_dummy_extract>) tidy(<step_dummy_multi_choice>) tidy(<step_factor2string>) tidy(<step_filter>) tidy(<step_filter_missing>) tidy(<step_geodist>) tidy(<step_harmonic>) tidy(<step_holiday>) tidy(<step_hyperbolic>) tidy(<step_ica>) tidy(<step_impute_bag>) tidy(<step_impute_knn>) tidy(<step_impute_linear>) tidy(<step_impute_lower>) tidy(<step_impute_mean>) tidy(<step_impute_median>) tidy(<step_impute_mode>) tidy(<step_impute_roll>) tidy(<step_indicate_na>) tidy(<step_integer>) tidy(<step_interact>) tidy(<step_intercept>) tidy(<step_inverse>) tidy(<step_invlogit>) tidy(<step_isomap>) tidy(<step_kpca>) tidy(<step_kpca_poly>) tidy(<step_kpca_rbf>) tidy(<step_lag>) tidy(<step_lincomb>) tidy(<step_log>) tidy(<step_logit>) tidy(<check_missing>) tidy(<step_mutate>) tidy(<step_mutate_at>) tidy(<step_naomit>) tidy(<check_new_values>) tidy(<step_nnmf>) tidy(<step_nnmf_sparse>) tidy(<step_normalize>) tidy(<step_novel>) tidy(<step_ns>) tidy(<step_num2factor>) tidy(<step_nzv>) tidy(<step_ordinalscore>) tidy(<step_other>) tidy(<step_pca>) tidy(<step_percentile>) tidy(<step_pls>) tidy(<step_poly>) tidy(<step_poly_bernstein>) tidy(<step_profile>) tidy(<step_range>) tidy(<check_range>) tidy(<step_ratio>) tidy(<step_regex>) tidy(<step_relevel>) tidy(<step_relu>) tidy(<step_rename>) tidy(<step_rename_at>) tidy(<step_rm>) tidy(<step_sample>) tidy(<step_scale>) tidy(<step_select>) tidy(<step_shuffle>) tidy(<step_slice>) tidy(<step_spatialsign>) tidy(<step_spline_b>) tidy(<step_spline_convex>) tidy(<step_spline_monotone>) tidy(<step_spline_natural>) tidy(<step_spline_nonnegative>) tidy(<step_sqrt>) tidy(<step_string2factor>) 
tidy(<recipe>) tidy(<step>) tidy(<check>) tidy(<step_time>) tidy(<step_unknown>) tidy(<step_unorder>) tidy(<step_window>) tidy(<step_zv>)
Tidy the result of a recipe