R/03-harmonized_data_evaluate.R
data_proc_elem_evaluate.Rd
Assesses the content and structure of a Data Processing Elements object and
generates reports of the results. This function can be used to evaluate data
structure, presence of specific fields, coherence across elements, and data
dictionary formats.
data_proc_elem_evaluate(data_proc_elem, taxonomy = NULL)
data_proc_elem: A Data Processing Elements object.
taxonomy: An optional data frame identifying a variable classification schema.
Value: A list of data frames containing assessment reports.
The Data Processing Elements object specifies the input elements and processing
algorithms used to generate harmonized variables in the DataSchema formats. It
also contains metadata used to generate documentation of the processing.
A Data Processing Elements object is a data frame with specific columns
used in data processing: dataschema_variable, input_dataset, input_variables,
Mlstr_harmo::rule_category, and Mlstr_harmo::algorithm.
To initiate processing, the first entry must be the creation of a harmonized
primary identifier variable (e.g., participant unique ID).
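The layout described above can be checked by hand with base R before calling the function. The following is a minimal sketch, not the package's own validation: `check_dpe_layout()` is a hypothetical helper, the toy rows are copied from the first entries of the example output, and reading "the first entry" as "the first row has rule category `id_creation`" is an assumption based on that output.

```r
# Columns named in the description of a Data Processing Elements object.
required_cols <- c(
  "dataschema_variable",
  "input_dataset",
  "input_variables",
  "Mlstr_harmo::rule_category",
  "Mlstr_harmo::algorithm"
)

# Hypothetical helper (not part of Rmonize): reports missing columns and
# whether the first row creates the identifier variable.
check_dpe_layout <- function(dpe) {
  missing_cols <- setdiff(required_cols, names(dpe))
  first_is_id <- nrow(dpe) > 0 &&
    !("Mlstr_harmo::rule_category" %in% missing_cols) &&
    identical(dpe[["Mlstr_harmo::rule_category"]][1], "id_creation")
  list(missing_columns = missing_cols, first_entry_is_id = first_is_id)
}

# Toy object modeled on the first two rows of the example output.
dpe <- data.frame(
  dataschema_variable = c("adm_unique_id", "adm_study_id"),
  input_dataset = c("dataset_study1", "dataset_study1"),
  input_variables = c("pid", "__BLANK__"),
  `Mlstr_harmo::rule_category` = c("id_creation", "paste"),
  `Mlstr_harmo::algorithm` = c("pid", "1"),
  check.names = FALSE,       # keep the "Mlstr_harmo::" column names intact
  stringsAsFactors = FALSE
)

result <- check_dpe_layout(dpe)
```

Note that `check.names = FALSE` is needed so that `data.frame()` does not rewrite the `Mlstr_harmo::` column names into syntactic ones.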
A taxonomy is a classification schema that can be defined for variable
attributes. A taxonomy is usually extracted from an Opal environment, and a
taxonomy object is a data frame that must contain at least the columns
taxonomy, vocabulary, and terms. Additional details about Opal taxonomies are
available online.
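As an illustration of the minimum shape described above, a taxonomy data frame could be built by hand as follows. This is only a sketch: the values are placeholders invented for the example and do not come from any real Opal taxonomy, and the column check is a plain base R test rather than the package's own validation.

```r
# Placeholder values only; real taxonomies are usually extracted from Opal.
taxonomy <- data.frame(
  taxonomy = c("Example_area", "Example_area"),
  vocabulary = c("Example_vocabulary", "Example_vocabulary"),
  terms = c("term_a", "term_b"),
  stringsAsFactors = FALSE
)

# The three columns a taxonomy object must contain at least.
required_taxonomy_cols <- c("taxonomy", "vocabulary", "terms")
missing_taxonomy_cols <- setdiff(required_taxonomy_cols, names(taxonomy))
has_required_cols <- length(missing_taxonomy_cols) == 0
```

A data frame of this shape is what would be supplied through the optional taxonomy argument of data_proc_elem_evaluate().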
{
# Use Rmonize_examples to run examples.
library(dplyr)
data_proc_elem <- Rmonize_examples$`Data_Processing_Element_no errors`
glimpse(data_proc_elem)
}
#> Rows: 45
#> Columns: 12
#> $ index <dbl> 1, 2, 3, 4, 5, 6, 7, 8, 9, 1, 2, 3, 4, 5,…
#> $ dataschema_variable <chr> "adm_unique_id", "adm_study_id", "sdc_age…
#> $ input_dataset <chr> "dataset_study1", "dataset_study1", "data…
#> $ input_variables <chr> "pid", "__BLANK__", "maternal_age", "civi…
#> $ input_information <chr> "Participant ID", NA, "Maternal age [QX1]…
#> $ input_format <chr> "text", NA, "integer", "1 = Single \r\n2 …
#> $ `Mlstr_harmo::status` <chr> "complete", "complete", "complete", "comp…
#> $ `Mlstr_harmo::status_detail` <chr> "identical", "identical", "identical", "c…
#> $ `Mlstr_harmo::rule_category` <chr> "id_creation", "paste", "direct_mapping",…
#> $ `Mlstr_harmo::algorithm` <chr> "pid", "1", "direct_mapping", "recode(1:2…
#> $ internal_comment <chr> NA, NA, NA, NA, "alc_c5 is NA if alc_c1 i…
#> $ `Mlstr_harmo::comment` <chr> NA, NA, NA, "The input category \"Single …
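
The example above only inspects the input object. A possible next step, sketched here under the assumption that the Rmonize package is installed, is to pass that object to data_proc_elem_evaluate() and list the reports it returns. The variable name `reports` is chosen for illustration, and no output is shown because the report names are not given in this page.

```r
reports <- NULL

# Run only when Rmonize is available, so the sketch does not fail otherwise.
if (requireNamespace("Rmonize", quietly = TRUE)) {
  library(Rmonize)
  data_proc_elem <- Rmonize_examples$`Data_Processing_Element_no errors`

  # The function returns a list of data frames containing assessment reports.
  reports <- data_proc_elem_evaluate(data_proc_elem)

  # Show which reports were generated.
  print(names(reports))
}
```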