Syntax of config.yml
The file config.yml contains the information needed to render and run an application on BioLib. This configuration defines the entry point of your application and what input arguments the user can set. When you edit an application using the graphical interface on BioLib, the config.yml file is updated automatically.
The config.yml must be a valid YAML file located at .biolib/config.yml. The following sections describe the fields you can set in the configuration file. Note: all fields marked with an asterisk (*) are required.
biolib_version *
Specifies the syntax version of the configuration file. Use version 2 for all new applications. This option exists to ensure backwards compatibility of applications.
biolib_version: 2
modules *
Applications on BioLib consist of one or more modules. A module defines where the containerized code is located and how it should be run. You must define at least one module called "main", which is the entry module of your application.
The example below shows how to define a module that runs a Docker image from Docker Hub.
modules:
  main:
    image: 'dockerhub://ncbi/blast:latest'
    command: efetch -db protein -format fasta -id P01349 > queries/P01349.fsa
    working_directory: /home/biolib/
    input_files:
      - COPY / /home/biolib/
    output_files:
      - COPY /home/biolib/queries/ /
image *
A module must define an image to run. To use a local Docker image for a module, set image to local-docker://$DOCKER_IMAGE:$TAG. The example below uses the latest version of a local Docker image called protein-predictor.
image: 'local-docker://protein-predictor:latest'
To use an image from Docker Hub in your module, use the syntax dockerhub://$REPO/$TAG:$VERSION. In the example below, $REPO is ncbi, $TAG is blast, and $VERSION is latest.
image: 'dockerhub://ncbi/blast:latest'
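The two image forms above share the shape scheme://name:tag. As a minimal sketch of how such a value could be taken apart, the snippet below splits it into its parts. The helper split_image_uri is hypothetical and not part of BioLib, and treating the tag as mandatory is an assumption drawn from the syntax shown above.

```python
# Hypothetical helper for illustration only; not part of BioLib.
KNOWN_SCHEMES = ("local-docker", "dockerhub")


def split_image_uri(image: str) -> tuple[str, str, str]:
    """Split 'scheme://name:tag' into (scheme, name, tag)."""
    scheme, sep, rest = image.partition("://")
    if not sep or scheme not in KNOWN_SCHEMES:
        raise ValueError(f"unsupported image URI: {image!r}")
    # The tag is whatever follows the last colon (assumed to be required).
    name, colon, tag = rest.rpartition(":")
    if not colon or not name or not tag:
        raise ValueError(f"expected name:tag after the scheme in {image!r}")
    return scheme, name, tag


print(split_image_uri("dockerhub://ncbi/blast:latest"))
print(split_image_uri("local-docker://protein-predictor:latest"))
```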
command
A command can be provided to specify what to run inside the image. When creating an application based on a Docker image, this field corresponds to calling docker run with the specified command.
The example below uses the command field to run an installed binary called efetch:
command: efetch -db protein -format fasta -id P01349 > queries/P01349.fsa
Another example could be running a Python script:
command: python3 script_to_run.py
working_directory
Specifies which directory the module.command will be run in. The path must be absolute, for example:
working_directory: /home/biolib/
input_files *
This field defines where to copy the input files that are sent by the user of the application. The field is defined as a list of COPY statements from the input file path to the input file destination. Each COPY statement has the following syntax: - COPY [SOURCE_PATH] [DESTINATION_PATH]. All paths must be absolute.
For example, if the code in the module expects the input to be in /data/input/, this can be configured as follows:
input_files:
- COPY / /data/input/
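The COPY syntax and the absolute-path rule can be checked mechanically. The following is a minimal sketch, assuming paths contain no spaces; parse_copy is a hypothetical helper and says nothing about how BioLib itself parses these statements.

```python
# Hypothetical helper for illustration only; not part of BioLib.
def parse_copy(statement: str) -> tuple[str, str]:
    """Parse 'COPY [SOURCE_PATH] [DESTINATION_PATH]' into (source, destination)."""
    parts = statement.split()
    if len(parts) != 3 or parts[0] != "COPY":
        raise ValueError(f"expected 'COPY SOURCE DESTINATION', got {statement!r}")
    _, source, destination = parts
    for path in (source, destination):
        # The documentation requires every path to be absolute.
        if not path.startswith("/"):
            raise ValueError(f"path must be absolute: {path!r}")
    return source, destination


print(parse_copy("COPY / /data/input/"))
```

The same check applies to any field that takes a list of COPY statements.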
output_files *
This field defines where to copy the output files, if any, after the module has been run. An output file could be, for example, a CSV file or a PNG image to show the user. The syntax is a list of COPY statements of the form: - COPY [SOURCE_PATH] [DESTINATION_PATH]. All paths must be absolute.
Two common use cases are to send either a single file or a folder back to the user. In the ncbi/blast example above, the command creates a file called /home/biolib/queries/P01349.fsa.
To send everything in the queries folder back to the user:
output_files:
- COPY /home/biolib/queries/ /
To send only the P01349.fsa file:
output_files:
- COPY /home/biolib/queries/P01349.fsa /
If no output files are generated by the module, an empty list can be defined in the following way:
output_files: [ ]
source_files
Source files are pushed to BioLib instead of being copied into your Docker image. The syntax is a list of COPY statements of the form: - COPY [SOURCE_PATH] [DESTINATION_PATH]. All paths must be absolute.
All the source files can be copied to a directory like /home/biolib/ in the following way:
source_files:
- COPY / /home/biolib/
If no source files are needed for the module, an empty list can be defined in the following way:
source_files: [ ]
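The required fields described so far are biolib_version, a module named main, and each module's image, input_files, and output_files. As a rough sketch, a script like the one below could report which of them are missing. It assumes the YAML has already been loaded into a Python dictionary by a YAML parser of your choice; missing_required_fields is a hypothetical helper, not a BioLib function.

```python
# Hypothetical validator for illustration only; not part of BioLib.
REQUIRED_MODULE_FIELDS = ("image", "input_files", "output_files")


def missing_required_fields(config: dict) -> list[str]:
    """Return dotted names of required fields that are absent from config."""
    missing = []
    if "biolib_version" not in config:
        missing.append("biolib_version")
    modules = config.get("modules") or {}
    if "main" not in modules:
        missing.append("modules.main")
    for name, module in modules.items():
        for field in REQUIRED_MODULE_FIELDS:
            # An empty list such as `output_files: [ ]` still counts as present.
            if field not in module:
                missing.append(f"modules.{name}.{field}")
    return missing


# Dictionary form of the modules example from this page.
config = {
    "biolib_version": 2,
    "modules": {
        "main": {
            "image": "dockerhub://ncbi/blast:latest",
            "command": "efetch -db protein -format fasta -id P01349 > queries/P01349.fsa",
            "working_directory": "/home/biolib/",
            "input_files": ["COPY / /home/biolib/"],
            "output_files": ["COPY /home/biolib/queries/ /"],
        }
    },
}

print(missing_required_fields(config))
```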
arguments
Specifies how input options and settings will be rendered to the user of the application, and how inputs will be parsed. The field should follow this structure:
arguments:
  - key: --data # required
    description: 'Input Dropdown' # required
    key_value_separator: ' ' # optional, default is ' '
    default_value: '' # optional, default is ''
    type: dropdown # required
    options:
      'This will be shown as option one': 'value1'
      'This will be shown as option two': 'value2'
    required: true # optional, default is true
Under type you have the following options:
text provides a text input field
file provides a file select where users can upload an input file
text-file provides both a text input field and a file select, allowing the user to supply either
hidden allows the application creator to provide a default input argument without it being shown to the end user
toggle provides a toggle switch where users can choose between two options
number provides a number input field
radio provides a "radio select" where users can select one of a number of prespecified options
dropdown provides a dropdown menu where users can select one of a number of prespecified options
multiselect provides a dropdown menu where users can select one or more prespecified options
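As a rough illustration of how the key, key_value_separator, and options fields could combine, the sketch below turns a selected option label into a command-line fragment. How BioLib actually assembles the final command is not specified here, so this is an assumption made for illustration only; render_selection is a hypothetical helper.

```python
# Hypothetical helper for illustration only; not part of BioLib.
def render_selection(argument: dict, selected_label: str) -> str:
    """Join the argument key and the value behind the selected option label."""
    value = argument["options"][selected_label]
    # The documented default separator is a single space.
    separator = argument.get("key_value_separator", " ")
    return f"{argument['key']}{separator}{value}"


# Dictionary form of the dropdown argument shown above.
argument = {
    "key": "--data",
    "description": "Input Dropdown",
    "key_value_separator": " ",
    "type": "dropdown",
    "options": {
        "This will be shown as option one": "value1",
        "This will be shown as option two": "value2",
    },
}

print(render_selection(argument, "This will be shown as option two"))
```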
sub_arguments
Allows you to specify arguments that are only rendered if a user chooses a particular option in the parent argument. For example, an application might allow the user to run one of two commands, where each of these commands needs different input arguments:
arguments:
  - key: --function
    description: 'Choose a function'
    key_value_separator: ''
    default_value: ''
    type: dropdown
    options:
      'Command A': a
      'Command B': b
    sub_arguments:
      a:
        - key: --argument_a
          description: "Argument A takes a file input"
          type: file
      b:
        - key: --argument_b
          description: 'Argument B takes a text input'
          type: text
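The lookup implied by sub_arguments can be sketched in a few lines: the value of the chosen option selects which list of sub-arguments is rendered. The snippet assumes the configuration has been loaded into a dictionary; visible_sub_arguments is a hypothetical helper, not a BioLib function.

```python
# Hypothetical helper for illustration only; not part of BioLib.
def visible_sub_arguments(argument: dict, selected_value: str) -> list[dict]:
    """Return the sub-arguments attached to the selected option value, if any."""
    return argument.get("sub_arguments", {}).get(selected_value, [])


# Dictionary form of the --function argument shown above.
function_argument = {
    "key": "--function",
    "description": "Choose a function",
    "type": "dropdown",
    "options": {"Command A": "a", "Command B": "b"},
    "sub_arguments": {
        "a": [{"key": "--argument_a", "description": "Argument A takes a file input", "type": "file"}],
        "b": [{"key": "--argument_b", "description": "Argument B takes a text input", "type": "text"}],
    },
}

# Choosing 'Command A' corresponds to the option value 'a'.
selected = function_argument["options"]["Command A"]
print([sub["key"] for sub in visible_sub_arguments(function_argument, selected)])
```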
remote_hosts
For your application to reach external servers, each hostname must be listed in config.yml as a remote host, as in the example below:
remote_hosts:
- blast.ncbi.nlm.nih.gov
Note: the end-user must allow each of these hostnames in their account settings in order to run your application.
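To check ahead of time whether a URL your code contacts is covered by remote_hosts, a comparison like the following may help. It is a sketch under the assumption that hostnames are matched exactly (no wildcards or subdomain matching); is_listed_host is a hypothetical helper and does not reflect how BioLib enforces the list.

```python
from urllib.parse import urlsplit


# Hypothetical helper for illustration only; not part of BioLib.
def is_listed_host(url: str, remote_hosts: list[str]) -> bool:
    """Return True if the URL's hostname appears verbatim in remote_hosts."""
    hostname = urlsplit(url).hostname
    return hostname is not None and hostname in remote_hosts


remote_hosts = ["blast.ncbi.nlm.nih.gov"]
print(is_listed_host("https://blast.ncbi.nlm.nih.gov/Blast.cgi", remote_hosts))
```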
main_output_file
Specifies the path of the output file to render when the application finishes. If the file name ends with .md, it is rendered as Markdown; otherwise, it is rendered as text.
main_output_file: output.md
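The rendering rule amounts to a check on the file extension. A minimal sketch follows, assuming the comparison is case-sensitive (the text above does not say either way); render_mode is a hypothetical helper.

```python
# Hypothetical helper for illustration only; not part of BioLib.
def render_mode(main_output_file: str) -> str:
    """Return 'markdown' for file names ending in .md, otherwise 'text'."""
    return "markdown" if main_output_file.endswith(".md") else "text"


print(render_mode("output.md"))
print(render_mode("results.csv"))
```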
description_file
Specifies the path to the README file, which will be rendered as the description on the application page. The default path is README.md.
description_file: README.md
license_file
Specifies the path to the license file, which will be visible to users at the bottom of the application page. The default path is LICENSE.
license_file: LICENSE
citation
Provide a citation using the following structure:
citation:
  entry_type: book
  author: Dr. John Doe
  title: Using config.yml
  publisher: BioLib Community Press
  year: '2020'
Note that the choice of entry_type determines which fields are required. For a complete list of the fields required by each entry type, see the BibTeX standard.
* Required fields
Still have a question?
If you have any questions that you can't find an answer to above, please reach out to the BioLib community.