Checks whether an object is a valid dataset and, if so, returns it with the appropriate madshapR::class attribute. This function is mainly used to validate inputs within other functions of the package, but it can also be called directly to check whether a dataset is valid.

as_dataset(object, col_id = NULL)

Arguments

object

A potential dataset object to be coerced.

col_id

An optional character string specifying the name(s) or position(s) of the column(s) used as identifiers.

Value

A data frame with madshapR::class 'dataset'.

Details

A dataset is a data table containing variables. A dataset object is a data frame and can be associated with a data dictionary. If no data dictionary is provided with a dataset, a minimal workable data dictionary will be generated as needed within relevant functions. Identifier variable(s) for indexing can be specified by the user. The id values must be non-missing and will be used in functions that require them. If no identifier variable is specified, indexing is handled automatically by the function.
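
The non-missing requirement on identifier values can be checked before calling as_dataset(). The following is a minimal base R sketch, not the package's own validation: the helper has_valid_id() and the two toy data frames are hypothetical and introduced only for illustration.

```r
# Hypothetical helper, not part of madshapR: returns TRUE when the
# identifier column exists in the data frame and has no missing values.
has_valid_id <- function(data, col_id) {
  stopifnot(is.data.frame(data), is.character(col_id), length(col_id) == 1)
  # && short-circuits, so the column is only inspected if it exists
  col_id %in% names(data) && !anyNA(data[[col_id]])
}

# Toy data invented for this illustration
complete_ids <- data.frame(id = c(1, 2, 3), age = c(52, 49, 43))
missing_ids  <- data.frame(id = c(1, NA, 3), age = c(52, 49, 43))

has_valid_id(complete_ids, "id")   # TRUE: column present, no NA
has_valid_id(missing_ids, "id")    # FALSE: one id value is NA
has_valid_id(complete_ids, "pid")  # FALSE: no such column
```

A check of this kind only covers the non-missing rule stated above. Whether as_dataset() applies further conditions to identifier columns is not specified here.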

Examples

{

# use madshapR_DEMO provided by the package
library(madshapR)
# dplyr provides glimpse(), used below to preview each dataset
library(dplyr)

###### Example 1: A dataset can have an id column specified as an attribute. 
dataset <- as_dataset(madshapR_DEMO$dataset_MELBOURNE, col_id = "id")
glimpse(dataset)

###### Example 2: Any data frame can be a dataset by definition.
glimpse(as_dataset(iris, col_id = "Species"))

}
#> Rows: 19
#> Columns: 6
#> $ id         <dbl> 377943, 497013, 927676, 995667, 21829, 209432, 272983, 5806…
#> $ Gender     <dbl> 2, 1, 1, 2, 2, 1, 2, 2, 2, 1, 1, 2, 2, 2, 1, 1, 2, 2, 2
#> $ BMI        <dbl> 22.10000, 18.50656, 24.57680, 15.71540, 18.55378, 15.93178,…
#> $ age        <dbl> 52, 49, 43, 59, 40, 47, -888, 53, 35, 40, 41, 34, 48, 43, -…
#> $ smo_status <dbl> 1, 2, 3, -77, NA, 2, -77, 2, 1, 1, NA, 3, 2, 1, 2, 1, NA, 1…
#> $ prg_curr   <dbl> 0, -77, -77, 1, 0, -77, 8, 0, 0, -77, -77, 1, 1, 9, -77, -7…
#> Rows: 150
#> Columns: 5
#> $ Species      <fct> setosa, setosa, setosa, setosa, setosa, setosa, setosa, s…
#> $ Sepal.Length <dbl> 5.1, 4.9, 4.7, 4.6, 5.0, 5.4, 4.6, 5.0, 4.4, 4.9, 5.4, 4.…
#> $ Sepal.Width  <dbl> 3.5, 3.0, 3.2, 3.1, 3.6, 3.9, 3.4, 3.4, 2.9, 3.1, 3.7, 3.…
#> $ Petal.Length <dbl> 1.4, 1.4, 1.3, 1.5, 1.4, 1.7, 1.4, 1.5, 1.4, 1.5, 1.5, 1.…
#> $ Petal.Width  <dbl> 0.2, 0.2, 0.2, 0.2, 0.2, 0.4, 0.3, 0.2, 0.2, 0.1, 0.2, 0.…