step_holiday creates a specification of a
recipe step that will convert date data into one or more binary
indicator variables for common holidays.
    step_holiday(
      recipe,
      ...,
      role = "predictor",
      trained = FALSE,
      holidays = c("LaborDay", "NewYearsDay", "ChristmasDay"),
      columns = NULL,
      skip = FALSE,
      id = rand_id("holiday")
    )

    # S3 method for step_holiday
    tidy(x, ...)
| Argument | Description |
|---|---|
| recipe | A recipe object. The step will be added to the sequence of operations for this recipe. |
| ... | One or more selector functions to choose which variables will be used to create the new variables. The selected variables should have class |
| role | For model terms created by this step, what analysis role should they be assigned? By default, the function assumes that the new variable columns created from the original variables will be used as predictors in a model. |
| trained | A logical to indicate if the quantities for preprocessing have been estimated. |
| holidays | A character string that includes at least one holiday supported by the |
| columns | A character string of variables that will be used as inputs. This field is a placeholder and will be populated once |
| skip | A logical. Should the step be skipped when the recipe is baked by |
| id | A character string that is unique to this step to identify it. |
| x | A |
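
The holidays argument in the table above can be illustrated with a non-default value. The following is a minimal sketch, not an authoritative usage: the data frame `dates`, its column `day`, and the holiday name "USThanksgivingDay" are assumptions chosen for illustration, and the name is assumed to be one that the underlying holiday calendar accepts.

```r
library(recipes)

# Hypothetical example data: ten consecutive days in late November 2000
dates <- data.frame(day = as.Date("2000-11-20") + 0:9)

# Request a single non-default holiday (the name is an assumption)
rec <- recipe(~ day, data = dates) %>%
  step_holiday(day, holidays = "USThanksgivingDay")

rec <- prep(rec, training = dates)
baked <- bake(rec, new_data = dates)

# One indicator column is added per requested holiday,
# named <original column>_<holiday>
names(baked)
```

Each requested holiday yields its own indicator column, so a longer holidays vector widens the baked data accordingly.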
An updated version of recipe with the new step
added to the sequence of existing steps (if any). For the
tidy method, a tibble with columns terms (the columns
that will be affected) and holiday.
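
The tidy method described above can be tried on a small recipe. This sketch assumes a hypothetical data frame `dates` with a single date column `day`, and it uses the `number` argument of `tidy()` to pick the first step; the exact printed output may vary between versions of the package.

```r
library(recipes)

# Hypothetical example data: 41 consecutive days starting 2000-12-20
dates <- data.frame(day = as.Date("2000-12-20") + 0:40)

rec <- recipe(~ day, data = dates) %>%
  step_holiday(all_predictors())

# Before prep(), terms reflects the selector as it was written
tidy(rec, number = 1)

# After prep(), terms holds the resolved column names
trained <- prep(rec, training = dates)
holiday_info <- tidy(trained, number = 1)
holiday_info
```

With the default holidays, the trained result has one row per combination of selected column and holiday.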
Unlike other steps, step_holiday does
not remove the original date variables.
step_rm() can be used for this purpose.
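
As a rough sketch of that pattern, the recipe below chains step_rm() after step_holiday() so that only the indicator columns remain. The data frame `dates` and its column `day` are hypothetical names used for illustration.

```r
library(recipes)

# Hypothetical example data: 41 consecutive days starting 2000-12-20
dates <- data.frame(day = as.Date("2000-12-20") + 0:40)

rec <- recipe(~ day, data = dates) %>%
  step_holiday(day) %>%  # adds the holiday indicator columns
  step_rm(day)           # then drops the original date column

rec <- prep(rec, training = dates)
indicators_only <- bake(rec, new_data = dates)

# Only the holiday indicator columns should be left
names(indicators_only)
```

The order matters: step_rm() has to come after step_holiday(), because the date column must still exist when the indicators are computed.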
    library(recipes)
    library(lubridate)

    # 41 consecutive days starting 2000-12-20
    examples <- data.frame(someday = ymd("2000-12-20") + days(0:40))

    # Specify the recipe, estimate it, then apply it to the same data
    holiday_rec <- recipe(~ someday, examples) %>%
      step_holiday(all_predictors())

    holiday_rec <- prep(holiday_rec, training = examples)
    holiday_values <- bake(holiday_rec, new_data = examples)
    holiday_values
    #> # A tibble: 41 x 4
    #>    someday    someday_LaborDay someday_NewYearsDay someday_ChristmasDay
    #>    <date>                <dbl>               <dbl>                <dbl>
    #>  1 2000-12-20                0                   0                    0
    #>  2 2000-12-21                0                   0                    0
    #>  3 2000-12-22                0                   0                    0
    #>  4 2000-12-23                0                   0                    0
    #>  5 2000-12-24                0                   0                    0
    #>  6 2000-12-25                0                   0                    1
    #>  7 2000-12-26                0                   0                    0
    #>  8 2000-12-27                0                   0                    0
    #>  9 2000-12-28                0                   0                    0
    #> 10 2000-12-29                0                   0                    0
    #> # … with 31 more rows