The user writes a config.yaml file describing the data products used in the code run. The example config.yaml file below describes a code run with three inputs:
disease/sars_cov2/SEINRD_model/parameters/static_params
disease/sars_cov2/SEINRD_model/parameters/rts
disease/sars_cov2/SEINRD_model/parameters/efoi
These inputs are listed in the register block, meaning that they are downloaded to the local data store from an external source, with their associated metadata stored in the local registry. fair run then automatically converts them into a read block. (When data products are already present in the data registry, list them in the read block instead.)
A code run usually also has outputs, which are listed in the write block. In the example below, our outputs are:
disease/sars_cov2/SEINRD_model/results/model_output
disease/sars_cov2/SEINRD_model/results/figure
SEINRDconfig.yaml:
run_metadata:
  default_input_namespace: BioSS
  description: SEINRD_model
  script: R -f inst/extdata/SEINRD.R --args ${{CONFIG_DIR}}
register:
- namespace: BioSS
  full_name: Biomathematics and Statistics Scotland
  website: https://ror.org/03jwrz939
- external_object: disease/sars_cov2/SEINRD_model/parameters/static_params
  namespace_name: BioSS
  root: https://raw.githubusercontent.com/
  path: FAIRDataPipeline/rSimpleModel/main/inst/extdata/static_params_SEInRD.csv
  title: Static parameters of the model
  description: Static parameters of the model
  unique_name: Simple model parameters - Static parameters of the model
  alternate_identifier_type: simple_model_params
  file_type: csv
  release_date: 2022-01-28T12:00
  version: 1.0.0
  primary: True
- external_object: disease/sars_cov2/SEINRD_model/parameters/rts
  namespace: BioSS
  root: https://raw.githubusercontent.com/
  path: FAIRDataPipeline/rSimpleModel/main/inst/extdata/Rt_beep.csv
  title: Values of Rt at time t
  description: Values of Rt at time t
  unique_name: Simple model parameters - Values of Rt at time t
  alternate_identifier_type: simple_model_params
  file_type: csv
  release_date: 2022-01-28T12:00
  version: 1.0.0
  primary: True
- external_object: disease/sars_cov2/SEINRD_model/parameters/efoi
  namespace: BioSS
  root: https://raw.githubusercontent.com/
  path: FAIRDataPipeline/rSimpleModel/main/inst/extdata/efoi_all_dates.csv
  title: External force of infection at time t
  description: External force of infection at time t
  unique_name: Simple model parameters - External force of infection
  alternate_identifier_type: simple_model_params
  file_type: csv
  release_date: 2022-01-28T12:00
  version: 1.0.0
  primary: True
write:
- data_product: disease/sars_cov2/SEINRD_model/results/model_output
  description: SEINRD model results
  file_type: csv
- data_product: disease/sars_cov2/SEINRD_model/results/figure
  description: SEINRD output plot
  file_type: pdf
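Before pulling or running anything, it can help to check which data products a config declares. The following is a minimal sketch and not part of the fair CLI: list_products is a hypothetical helper, and it assumes that every product is named on a line beginning with external_object: or data_product:, as in the example config.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the fair CLI): print the name of every
# data product declared in a config file.
list_products() {
  # Keep lines that declare a product: "external_object:" in the register
  # block, "data_product:" in read and write blocks. Then strip everything
  # up to and including the first colon, leaving only the product name.
  grep -E '^[[:space:]-]*(external_object|data_product):' "$1" |
    sed -E 's/^[^:]*:[[:space:]]*//'
}

# Usage: runs only if the example config exists in the working directory.
config=inst/extdata/SEINRDconfig.yaml
if [ -f "$config" ]; then
  list_products "$config"
fi
```

Run against the example config, this should list the three parameter products from the register block followed by the two results products from the write block.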
The submission script should call initialise() to set up the code run, then read in any input data using one of the read_*() functions (for internal file formats) or link_read() (for external file formats such as CSV files). The data can then be processed, or a model or analysis carried out, after which the results should be saved to the local data store via one of the write_*() functions or link_write(). When the code run is complete, finalise() should be called to register all metadata with the local registry.
fair pull
Using the CLI tool, fair pull identifies any data products listed in the register block of the config.yaml. These data products are downloaded to the local data store, and their associated metadata is registered in the local registry.
fair init --ci
fair pull inst/extdata/SEINRDconfig.yaml
#> FAIR repository is already initialised.
#> Updating registry from inst/extdata/SEINRDconfig.yaml
#> WARNING:FAIRDataPipeline.ConfigYAML:Remote registry pulls are not yet implemented
#> WARNING:FAIRDataPipeline.ConfigYAML:Remote registry pulls are not yet implemented
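Each external_object entry in the register block gives its source as a root and a path. A plausible reading, which is an assumption here rather than documented fair behaviour, is that the download location is the root followed by the path. The sketch below uses a hypothetical join_url helper to build that address so the source file can be inspected by hand.

```shell
#!/usr/bin/env bash
# Hypothetical helper: join the root and path fields of a register entry.
join_url() {
  # ${1%/} drops one trailing slash from the root; ${2#/} drops one leading
  # slash from the path, so exactly one slash separates the two parts.
  printf '%s/%s\n' "${1%/}" "${2#/}"
}

# Values copied from the static_params entry of the example config.
url=$(join_url "https://raw.githubusercontent.com/" \
  "FAIRDataPipeline/rSimpleModel/main/inst/extdata/static_params_SEInRD.csv")
echo "$url"

# To preview the file (needs network access), uncomment:
# curl -fsSL "$url" | head -n 5
```

Trimming the slashes on both sides keeps the result the same whether or not the root ends in a slash.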
The local registry should now contain three data products: disease/sars_cov2/SEINRD_model/parameters/static_params, disease/sars_cov2/SEINRD_model/parameters/rts, and disease/sars_cov2/SEINRD_model/parameters/efoi.
fair run
Again using the CLI tool, fair run performs the code run, as written in the submission script. In preparation for this, it translates the user-written config.yaml file for use by the Data Pipeline API. Any variables or wildcards specified by the user in the config file are cross-referenced with the registry, and any data products registered by fair pull are made available for the current code run to read.
fair run inst/extdata/SEINRDconfig.yaml
#> Updating registry from inst/extdata/SEINRDconfig.yaml
#>
#> R version 4.2.0 (2022-04-22) -- "Vigorous Calisthenics"
#> Copyright (C) 2022 The R Foundation for Statistical Computing
#> Platform: x86_64-pc-linux-gnu (64-bit)
#>
#> R is free software and comes with ABSOLUTELY NO WARRANTY.
#> You are welcome to redistribute it under certain conditions.
#> Type 'license()' or 'licence()' for distribution details.
#>
#> Natural language support but running in an English locale
#>
#> R is a collaborative project with many contributors.
#> Type 'contributors()' for more information and
#> 'citation()' on how to cite R or R packages in publications.
#>
#> Type 'demo()' for some demos, 'help()' for on-line help, or
#> 'help.start()' for an HTML browser interface to help.
#> Type 'q()' to quit R.
#>
#> > library(rSimpleModel)
#> > library(rDataPipeline)
#> > library(deSolve)
#> > library(ggplot2)
#> >
#> > # Read config directory from command line
#> > conf.dir <- commandArgs(trailingOnly=TRUE)[1]
#> >
#> > # Initialise code run
#> > config <- file.path(conf.dir, "config.yaml")
#> > script <- file.path(conf.dir, "script.sh")
#> > handle <- initialise(config, script)
#> ℹ Reading config.yaml from data store
#> ✔ Writing /home/runner/work/rSimpleModel/rSimpleModel/data_store/jobs/2022-06-21_14_30_24_873290//config.yaml to local registry
#> ✔ Writing /home/runner/work/rSimpleModel/rSimpleModel/data_store/jobs/2022-06-21_14_30_24_873290//script.sh to local registry
#> ✔ Writing FAIRDataPipeline/rSimpleModel to local registry
#> ✔ Writing new code_run to local registry
#> >
#> > # Read code run inputs
#> > static_params <- read.csv(link_read(handle, "disease/sars_cov2/SEINRD_model/parameters/static_params"))
#> ℹ Locating 'disease/sars_cov2/SEINRD_model/parameters/static_params'
#> > rts_params <- read.csv(link_read(handle, "disease/sars_cov2/SEINRD_model/parameters/rts"))
#> ℹ Locating 'disease/sars_cov2/SEINRD_model/parameters/rts'
#> > efoi_params <- read.csv(link_read(handle, "disease/sars_cov2/SEINRD_model/parameters/efoi"))
#> ℹ Locating 'disease/sars_cov2/SEINRD_model/parameters/efoi'
#> >
#> > # Run the model
#> > data <- initialise_SEINRD(rts_params, efoi_params, static_params)
#> > results <- ode(y = data$init_state,
#> +                times = data$time_length,
#> +                func = rSimpleModel::SEINRD_model,
#> +                parms = data$pars)
#> > g <- plot_SEINRD(results)
#> >
#> > # Save outputs to data store
#> > path <- link_write(handle, "disease/sars_cov2/SEINRD_model/results/model_output")
#> > write.csv(results, path)
#> >
#> > path <- link_write(handle, "disease/sars_cov2/SEINRD_model/results/figure")
#> > ggsave(path, g, width=20, height=5, units="cm", dpi=600)
#> >
#> > # Register code run in local registry
#> > finalise(handle)
#> ✔ Writing 'disease/sars_cov2/SEINRD_model/results/model_output' to local registry
#> ✔ Writing 'disease/sars_cov2/SEINRD_model/results/figure' to local registry
#> -> PATCH /api/code_run/1/ HTTP/1.1
#> -> Host: 127.0.0.1:8000
#> -> User-Agent: libcurl/7.68.0 r-curl/4.3.2 httr/1.4.3
#> -> Accept-Encoding: deflate, gzip, br
#> -> Accept: application/json, text/xml, application/xml, */*
#> -> Content-Type: application/json
#> -> Authorization: token d946655533485fed81ce0d9710815a6441f93adb
#> -> Content-Length: 304
#> ->
#> >> {
#> >>   "inputs": [
#> >>     "http://127.0.0.1:8000/api/object_component/1/",
#> >>     "http://127.0.0.1:8000/api/object_component/2/",
#> >>     "http://127.0.0.1:8000/api/object_component/3/"
#> >>   ],
#> >>   "outputs": [
#> >>     "http://127.0.0.1:8000/api/object_component/7/",
#> >>     "http://127.0.0.1:8000/api/object_component/8/"
#> >>   ]
#> >> }
#>
#> <- HTTP/1.1 200 OK
#> <- Date: Tue, 21 Jun 2022 14:30:36 GMT
#> <- Server: WSGIServer/0.2 CPython/3.9.13
#> <- Content-Type: application/json
#> <- Vary: Accept, Cookie
#> <- Allow: GET, PUT, PATCH, DELETE, HEAD, OPTIONS
#> <- X-Frame-Options: DENY
#> <- Content-Length: 675
#> <- X-Content-Type-Options: nosniff
#> <- Referrer-Policy: same-origin
#> <-
#> No encoding supplied: defaulting to UTF-8.
#> >
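In the run above, the script entry of the config ends in ${{CONFIG_DIR}}, and the R script reads its first command-line argument as the directory holding config.yaml and script.sh. The substitution itself is done by fair run. The sketch below only imitates its effect, assuming the placeholder is replaced by the job directory; expand_config_dir and the directory name are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace the first ${{CONFIG_DIR}} placeholder in a
# script template with a directory. This imitates, but is not, fair run.
expand_config_dir() {
  template=$1
  config_dir=$2
  placeholder='${{CONFIG_DIR}}'
  case $template in
    *"$placeholder"*)
      before=${template%%"$placeholder"*}   # text before the placeholder
      after=${template#*"$placeholder"}     # text after the placeholder
      printf '%s%s%s\n' "$before" "$config_dir" "$after"
      ;;
    *)
      # No placeholder present: return the template unchanged.
      printf '%s\n' "$template"
      ;;
  esac
}

# Illustrative directory only; the run above used a timestamped directory
# under data_store/jobs.
expand_config_dir 'R -f inst/extdata/SEINRD.R --args ${{CONFIG_DIR}}' \
  'data_store/jobs/example_job'
```

The single quotes around the template matter: they stop the shell from trying to expand ${{CONFIG_DIR}} itself before the helper sees it.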