R/05-harmonized_data_visualise.R
harmonized_dossier_visualize.Rd
Generates a visual report for a dataset in an HTML bookdown document. The
report provides figures and descriptive statistics for each variable to
facilitate the assessment of input data. Statistics and figures are generated
according to variable data type. The report can be used to help assess
data structure, coherence across elements, and taxonomy or
data dictionary formats. The summaries and figures provide additional
information about variable distributions and descriptive statistics.
The charts and tables are produced according to each variable's data type. The
variables can be grouped using the group_by parameter, which is a
(categorical) column in the dataset. The user may need to use as.factor() in
this context. To speed up the process (and allow objects to be reused in a
workflow), the user can supply a .summary_var, which is the output of the
function dataset_summarize() for the column(s) col and group_by. The summary
must have been generated with the same parameters to be usable.
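As a minimal sketch of the as.factor() step mentioned above, the snippet below converts a character column into a categorical one using base R only. The data frame and its column names (age, site) are made up for illustration and are not part of the package's demo data.

```r
# Toy dataset; the column names are illustrative only.
dataset <- data.frame(
  age = c(34, 51, 47, 29),
  site = c("north", "south", "north", "south"),
  stringsAsFactors = FALSE
)

# A character column is not categorical until it is converted to a factor.
dataset$site <- as.factor(dataset$site)

is.factor(dataset$site)  # TRUE
levels(dataset$site)     # "north" "south"
```

The converted column could then be named in group_by; whether a given function requires a factor rather than a character column should be checked against its own documentation.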
harmonized_dossier_visualize(
harmonized_dossier = NULL,
to,
taxonomy = NULL,
valueType_guess = FALSE,
pooled_harmonized_dataset = NULL,
.summary_pool = NULL,
.keep_files = TRUE
)
harmonized_dossier: List of tibble(s), each of them being a harmonized dataset.
to: A character string identifying the folder path where the bookdown report will be saved.
taxonomy: A tibble identifying the scheme used for variable classification.
valueType_guess: Whether the output should include a more accurate valueType that could be applied to the dataset. FALSE by default.
pooled_harmonized_dataset: A tibble identifying the pooled harmonized dataset.
.summary_pool: A list which is the summary of the variables.
.keep_files: Whether to keep the R Markdown files. TRUE by default. (Used for internal processes and programming.)
A bookdown folder containing files in the specified output folder. To
open the report in a browser, open 'docs/index.html' or use
open_visual_report().
A harmonized dossier must be a named list containing at least one data frame
or data frame extension (e.g. a tibble), each of them being a harmonized
dataset. It is generally the product of applying harmonization processing to a
dossier object. The name of each tibble will be used as the reference name of
the dataset. A harmonized dossier has four attributes:
harmonizR::class, which is "harmonized_dossier"; harmonizR::Dataschema
(provided by user); harmonizR::data processing elements; and
harmonizR::harmonized_col_id (provided by user), which refers to the column
in each dataset that identifies each unique observation/dataset combination.
This id column name is the same across the dataset(s), the DataSchema and
the data processing elements (created by using 'id_creation') and is used to
initiate the process of harmonization.
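The structural requirements above (a named list of data frames sharing one id column) can be sketched in base R. This is a hand-built illustration, not how the package constructs a harmonized dossier: the dataset names, columns, and values are invented, only two of the four attributes are set, and a list assembled this way is not guaranteed to be accepted by the package's functions.

```r
# Hypothetical datasets; names and columns are illustrative only.
harmonized_dossier <- list(
  study_A = data.frame(id = c("A1", "A2"), age = c(34, 51)),
  study_B = data.frame(id = c("B1", "B2", "B3"), age = c(47, 29, 62))
)

# Attribute names are taken from the description above; setting them by
# hand like this is an assumption made for illustration.
attr(harmonized_dossier, "harmonizR::class") <- "harmonized_dossier"
attr(harmonized_dossier, "harmonizR::harmonized_col_id") <- "id"

# Basic structural checks described in the text.
id_col <- attr(harmonized_dossier, "harmonizR::harmonized_col_id")
has_names <- !is.null(names(harmonized_dossier)) &&
  all(names(harmonized_dossier) != "")
all_data_frames <- all(vapply(harmonized_dossier, is.data.frame, logical(1)))
id_in_every_dataset <- all(vapply(
  harmonized_dossier,
  function(d) id_col %in% names(d),
  logical(1)
))

c(has_names, all_data_frames, id_in_every_dataset)
```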
A taxonomy is a classification scheme that can be defined for variable attributes. If defined, a taxonomy must be a data frame-like object. It must be compatible with (and is generally extracted from) an Opal environment. To work with certain functions, a valid taxonomy must contain at least the columns 'taxonomy', 'vocabulary', and 'terms'. In addition, the taxonomy may follow the Maelstrom Research taxonomy, in which case its content can be evaluated against elements specific to Maelstrom Research, such as naming convention restrictions, tagging elements, or scales. In this particular case, the tibble must also contain 'vocabulary_short', 'taxonomy_scale', 'vocabulary_scale' and 'term_scale' to work with some specific functions.
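The column requirements above can be checked with a few lines of base R. This is only a sketch: the taxonomy content below is invented, and the check covers column names only, not compatibility with an Opal environment.

```r
# Invented taxonomy content; only the column names come from the text above.
taxonomy <- data.frame(
  taxonomy = c("Example_taxonomy", "Example_taxonomy"),
  vocabulary = c("Example_vocabulary", "Example_vocabulary"),
  terms = c("term_1", "term_2"),
  stringsAsFactors = FALSE
)

# Columns required of any taxonomy, and the extra Maelstrom-specific ones.
required_cols <- c("taxonomy", "vocabulary", "terms")
maelstrom_cols <- c(
  "vocabulary_short", "taxonomy_scale", "vocabulary_scale", "term_scale"
)

missing_required <- setdiff(required_cols, names(taxonomy))
missing_maelstrom <- setdiff(maelstrom_cols, names(taxonomy))

length(missing_required) == 0   # TRUE: the minimal columns are present
length(missing_maelstrom) == 0  # FALSE: the Maelstrom-specific columns are absent
```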
The valueType is a property of a variable and is required in certain
functions to determine the handling of the variables. The valueType refers
to the OBiBa-internal type of a variable. It is specified in a data
dictionary in a column valueType and can be associated with variables as
attributes. Acceptable valueTypes include 'text', 'integer', 'decimal',
'boolean', 'datetime', and 'date'. The full list of OBiBa valueType
possibilities and their correspondence with R data types is available using
madshapR::valueType_list.
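As a rough illustration, the base R snippet below flags data dictionary entries whose valueType is not among the six values named above. The data dictionary rows are invented, and because the six values are only part of the full list, a flagged entry is not necessarily invalid; madshapR::valueType_list remains the reference.

```r
# The six valueTypes named in the text (not the full OBiBa list).
listed_value_types <- c(
  "text", "integer", "decimal", "boolean", "datetime", "date"
)

# Invented data dictionary rows; "intger" is a deliberate misspelling.
variables <- data.frame(
  name = c("age", "sex", "visit_date", "height"),
  valueType = c("integer", "text", "date", "intger"),
  stringsAsFactors = FALSE
)

# Keep the rows whose valueType is not one of the listed values.
not_listed <- variables[!(variables$valueType %in% listed_value_types), ]
not_listed$name  # "height"
```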
# \donttest{
pooled_harmonized_dataset <- DEMO_files_harmo$pooled_harmonized_dataset
summary_var_harmo <- DEMO_files_harmo$summary_var_harmo
to <- tempdir()
harmonized_dossier_visualize(
  pooled_harmonized_dataset = pooled_harmonized_dataset,
  .summary_pool = summary_var_harmo,
  to = to)
#> Error in dataset_visualize(dataset = pooled_harmonized_dataset, group_by = group_by, taxonomy = taxonomy, to = to, valueType_guess = valueType_guess, .keep_files = .keep_files, .summary_var = .summary_pool): unused arguments (to = to, .keep_files = .keep_files)
# To open the file in browser, you can also open 'to/docs/index.html'.
open_visual_report(to)
#> Warning: The `to` argument of `open_visual_report()` is deprecated as of madshapR 1.0.2.
#> ℹ Please use the `bookdown_path` argument of `bookdown_open()` instead.
# }