step_geodist creates a specification of a recipe step that will calculate the distance between points on a map and a reference location.
step_geodist(
  recipe,
  lat = NULL,
  lon = NULL,
  role = "predictor",
  trained = FALSE,
  ref_lat = NULL,
  ref_lon = NULL,
  log = FALSE,
  name = "geo_dist",
  columns = NULL,
  skip = FALSE,
  id = rand_id("geodist")
)

# S3 method for step_geodist
tidy(x, ...)
recipe | A recipe object. The step will be added to the sequence of operations for this recipe. |
---|---|
lon, lat | Selector functions to choose which variables are affected by the step. See selections() for more details. |
role | For model terms created by this step, what analysis role should be assigned? By default, the function assumes that the resulting distance will be used as a predictor in a model. |
trained | A logical to indicate if the quantities for preprocessing have been estimated. |
ref_lon, ref_lat | Single numeric values for the location of the reference point. |
log | A logical: should the distance be transformed by the natural log function? |
name | A single character value to use for the new predictor column. If a column exists with this name, an error is issued. |
columns | A character string of variable names that will be populated (eventually) when the recipe is trained. |
skip | A logical. Should the step be skipped when the recipe is baked by bake()? |
id | A character string that is unique to this step to identify it. |
x | A step_geodist object. |
... | One or more selector functions to choose which variables are affected by the step. See selections() for more details. |
An updated version of recipe with the new step added to the sequence of existing steps (if any). For the tidy method, a tibble with columns echoing the values of lat, lon, ref_lat, ref_lon, name, and id.
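step_geodist will create a new column, named by the name argument ("geo_dist" by default), that holds the distance from each point to the reference location.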
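As a rough illustration of the calculation: the distances in the example output on this page are consistent with a plain Euclidean distance taken directly on the latitude and longitude values, so the result is in degrees rather than miles or kilometers. The sketch below assumes that formula. geo_dist_sketch() is a hypothetical helper written for this note, not a function from recipes, and the coordinates are made up.

```r
# Hypothetical helper (not part of recipes): Euclidean distance in the
# coordinates' own units, optionally transformed by the natural log.
# Assumption: this mirrors what the step computes, inferred from the
# example output rather than from the package source.
geo_dist_sketch <- function(lat, lon, ref_lat, ref_lon, log = FALSE) {
  d <- sqrt((lat - ref_lat)^2 + (lon - ref_lon)^2)
  if (log) log(d) else d
}

# Made-up coordinates, for illustration only.
pts <- data.frame(latitude = c(3, 0), longitude = c(4, 1))

# Distance of each point from a reference location at (0, 0).
pts$geo_dist <- geo_dist_sketch(
  pts$latitude, pts$longitude,
  ref_lat = 0, ref_lon = 0
)
pts$geo_dist
```

With log = TRUE the same helper returns the natural log of each distance, which corresponds to the log argument described above. Note that a point sitting exactly on the reference location has distance 0, whose log is -Inf.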
library(recipes)   # also attaches dplyr, which provides arrange()
library(modeldata)
data(Smithsonian)

# How close are the museums to Union Station?
near_station <- recipe( ~ ., data = Smithsonian) %>%
  update_role(name, new_role = "location") %>%
  step_geodist(lat = latitude, lon = longitude, log = FALSE,
               ref_lat = 38.8986312, ref_lon = -77.0062457) %>%
  prep(training = Smithsonian)

# Apply the trained recipe to the training set and sort by distance
bake(near_station, new_data = NULL) %>%
  arrange(geo_dist)
#> # A tibble: 20 x 4
#>    name                                               latitude longitude geo_dist
#>    <fct>                                                 <dbl>     <dbl>    <dbl>
#>  1 National Postal Museum                                 38.9     -77.0  0.00421
#>  2 Renwick Gallery                                        38.9     -77.0  0.0108
#>  3 National Museum of the American Indian                 38.9     -77.0  0.0161
#>  4 Smithsonian American Art Museum                        38.9     -77.0  0.0189
#>  5 National Air and Space Museum                          38.9     -77.0  0.0190
#>  6 National Portrait Gallery                              38.9     -77.0  0.0190
#>  7 Hirshhorn Museum and Sculpture Garden                  38.9     -77.0  0.0216
#>  8 Arthur M. Sackler Gallery                              38.9     -77.0  0.0228
#>  9 Arts and Industries Building                           38.9     -77.0  0.0230
#> 10 National Museum of Natural History                     38.9     -77.0  0.0232
#> 11 National Museum of African Art                         38.9     -77.0  0.0239
#> 12 Smithsonian Institution Building                       38.9     -77.0  0.0241
#> 13 Freer Gallery of Art                                   38.9     -77.0  0.0247
#> 14 National Museum of American History                    38.9     -77.0  0.0270
#> 15 National Museum of African American History and …      38.9     -77.0  0.0295
#> 16 Anacostia Community Museum                             38.9     -77.0  0.0514
#> 17 National Zoological Park                               38.9     -77.1  0.0552
#> 18 Steven F. Udvar-Hazy Center                            38.9     -77.4  0.440
#> 19 George Gustav Heye Center                              40.7     -74.0  3.49
#> 20 Cooper Hewitt, Smithsonian Design Museum               40.8     -74.0  3.58

# Summarize the first (and only) step of the recipe
tidy(near_station, number = 1)
#> # A tibble: 1 x 6
#>   latitude longitude ref_latitude ref_longitude name     id
#>   <chr>    <chr>            <dbl>         <dbl> <chr>    <chr>
#> 1 latitude longitude         38.9         -77.0 geo_dist geodist_lihkJ