The recipes package can be used to create design matrices for modeling and to conduct preprocessing of variables. It is meant to be a more extensive framework than R's formula method. Some differences between simple formula methods and recipes are that

  1. Variables can have arbitrary roles in the analysis beyond predictors and outcomes.

  2. A recipe consists of one or more steps that define actions on the variables.

  3. Recipes can be defined sequentially using pipes as well as being modifiable and extensible.
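As a rough illustration of roles and sequential definition, the sketch below assigns a non-predictor role to an identifier column. The data frame `cars_df`, its `model` column, and the role name "id" are made up for this example; `update_role()` and `summary()` are existing recipes functions, and the native pipe assumes R 4.1 or later.

```r
library(recipes)

# Hypothetical data: mtcars with its row names moved into a `model` column
cars_df <- mtcars
cars_df$model <- rownames(mtcars)
rownames(cars_df) <- NULL

# Start from a formula, then give `model` a role that is neither
# predictor nor outcome ("id" is an arbitrary label chosen here)
rec <- recipe(mpg ~ ., data = cars_df) |>
  update_role(model, new_role = "id")

# summary() lists each variable together with its current role
role_info <- summary(rec)
role_info
```

Variables with a role such as "id" stay in the data but are not picked up by selectors like `all_predictors()`.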

Basic Functions

The three main functions are recipe(), prep(), and bake().

recipe() defines the operations on the data and the associated roles. Once the preprocessing steps are defined, any parameters they require (for example, the means used for centering) are estimated from a training set using prep(). The bake() function then applies the estimated operations to a data set.
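The following is a minimal sketch of that three-function workflow, not a recommended analysis. The split of `mtcars` into `train` and `new_cars` is arbitrary and only for illustration.

```r
library(recipes)

# Illustrative split of a built-in data set (the row ranges are arbitrary)
train <- mtcars[1:20, ]
new_cars <- mtcars[21:32, ]

# 1. recipe(): declare the outcome, the predictors, and the steps
rec <- recipe(mpg ~ disp + hp + wt, data = train)
rec <- step_center(rec, all_numeric_predictors())
rec <- step_scale(rec, all_numeric_predictors())

# 2. prep(): estimate the means and standard deviations from the training set
prepped <- prep(rec, training = train)

# 3. bake(): apply the estimated operations
baked_train <- bake(prepped, new_data = NULL)     # the processed training set
baked_new <- bake(prepped, new_data = new_cars)   # new data, using training statistics
```

Because the statistics come from `train` only, the columns of `baked_new` will generally not have exactly zero mean or unit variance.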

Step Functions

These functions are used to add new actions to the recipe and have the naming convention "step_action". For example, step_center() centers the data to have a zero mean and step_dummy() is used to create dummy variables.
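For instance, the two steps mentioned above could be combined as in the sketch below, which uses the built-in `iris` data purely as an example. The native pipe assumes R 4.1 or later.

```r
library(recipes)

# Center the numeric predictors, then convert the factor Species
# into dummy (indicator) columns
rec <- recipe(Sepal.Length ~ ., data = iris) |>
  step_center(all_numeric_predictors()) |>
  step_dummy(Species)

# With no `training` argument, prep() uses the data given to recipe()
prepped <- prep(rec)
out <- bake(prepped, new_data = NULL)

names(out)
```

With the default settings, `step_dummy()` replaces `Species` with one indicator column per non-reference level, so the original factor column no longer appears in `out`.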

Author

Maintainer: Max Kuhn max@posit.co

Authors:

Other contributors:

  • Posit Software, PBC [copyright holder, funder]