Checks whether an object is a valid dataset and returns it with the appropriate madshapR::class attribute. This function is mainly used to validate inputs within other functions of the package, but it can also be called on its own to check that a dataset is valid.

as_dataset(object, col_id = NULL)

Arguments

object

A potential dataset object to be coerced.

col_id

An optional character string specifying the name(s) or position(s) of the column(s) used as identifiers.

Value

A data frame with madshapR::class 'dataset'.

Details

A dataset is a data table containing variables. A dataset object is a data frame and can be associated with a data dictionary. If no data dictionary is provided with a dataset, a minimum workable data dictionary will be generated as needed within relevant functions. Identifier variable(s) for indexing can be specified by the user. The identifier values must be non-missing and will be used by functions that require them. If no identifier variable is specified, indexing is handled automatically by the function.
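
The rules above can be sketched in base R. This is only an illustration, not the package's implementation: `tag_as_dataset()` and the `toy` data frame are hypothetical names introduced here, and the attribute names are assumed from the ones shown on this page (`madshapR::class` and `madshapR::col_id`).

```r
# Hypothetical helper, NOT the madshapR implementation: it only mirrors the
# rules described in Details, using base R.
tag_as_dataset <- function(object, col_id = NULL) {
  # A dataset must be a data frame.
  if (!is.data.frame(object)) {
    stop("`object` must be a data frame.")
  }

  if (!is.null(col_id)) {
    # Accept column positions as well as column names.
    if (is.numeric(col_id)) {
      if (any(col_id < 1 | col_id > ncol(object))) {
        stop("Identifier position(s) out of range.")
      }
      col_id <- names(object)[col_id]
    }

    # Every identifier column must exist in the data frame.
    unknown <- setdiff(col_id, names(object))
    if (length(unknown) > 0) {
      stop("Unknown identifier column(s): ", paste(unknown, collapse = ", "))
    }

    # Identifier values must be non-missing.
    if (any(vapply(object[col_id], anyNA, logical(1)))) {
      stop("Identifier column(s) must not contain missing values.")
    }
  }

  # Attribute names are assumed from this page; when col_id is NULL,
  # no identifier attribute is stored.
  attr(object, "madshapR::class") <- "dataset"
  attr(object, "madshapR::col_id") <- col_id
  object
}

# Toy data invented for this sketch.
toy <- data.frame(part_id = c("ID001", "ID002"), height = c(191, 176))
tagged <- tag_as_dataset(toy, col_id = "part_id")
attributes(tagged)$`madshapR::col_id`
#> [1] "part_id"
```

Because the identifier is stored as an attribute and not as a separate object, the result remains an ordinary data frame. What as_dataset() does beyond these checks is not covered by this sketch.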

Examples

{

# use madshapR_examples provided by the package
library(madshapR)
library(dplyr)  # glimpse(), tibble(), %>%
library(fabR)   # add_index()

###### Example 1: A dataset can have an id column specified as an attribute. 
dataset <- as_dataset(madshapR_examples$`dataset_example`, col_id = "part_id")
print(attributes(dataset)$`madshapR::col_id`)
glimpse(dataset)

###### Example 2: Any data frame can be a dataset by definition.
dataset <- tibble(iris %>% add_index("my_index"))
dataset <- as_dataset(dataset, "my_index")
print(attributes(dataset)$`madshapR::col_id`)

}
#> [1] "part_id"
#> Rows: 50
#> Columns: 9
#> $ part_id   <chr> "ID001", "ID002", "ID003", "ID004", "ID005", "ID006", "ID007…
#> $ gndr      <dbl> 1, 2, 2, 2, 2, -77, 2, 2, -77, 1, 2, 1, 1, 1, 2, 1, 2, 1, 1,…
#> $ height    <dbl> 191, 176, 154, 167, 185, 171, 185, 171, 169, 179, 175, 150, …
#> $ weight_ms <dbl> 63, NA, NA, -88, NA, 57, NA, NA, 52, NA, 67, NA, NA, 59, 95,…
#> $ weight_dc <dbl> NA, 65, 121, NA, 45, NA, 58, 59, NA, 62, NA, 84, 82, NA, NA,…
#> $ dob       <chr> "3/22/1990", "8/15/2001", "12/17/1996", "6/13/1990", "12/17/…
#> $ prg_ever  <dbl> -7, 0, 2, 1, 8, -7, 9, 2, -7, -7, 1, -7, -7, -7, 0, -7, 0, -…
#> $ empty     <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, …
#> $ opentext  <chr> "All children, except one, grow up. They soon know that they…
#> [1] "my_index"