Syntax of config.yml
The file config.yml contains the information needed to render and run an application on BioLib. This configuration
defines the entry to your application and what input arguments the user can set. When you edit an application using the
graphical interface on BioLib the config.yml file is automatically updated.
The config.yml must be a valid YAML file located at .biolib/config.yml. The following sections describe the fields
you can set in the configuration file. Note: all fields marked with an asterisk (*) are required.
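As a rough orientation before the field-by-field reference, the sketch below assembles the required fields into a single file. It is an illustration rather than a canonical template: the image name is borrowed from the local Docker example further down this page, and the results directory /home/biolib/output/ is a hypothetical path chosen for this example.

```yaml
# Minimal sketch of .biolib/config.yml
biolib_version: 2
modules:
    main:  # the entry module must be called "main"
        image: 'local-docker://protein-predictor:latest'
        command: python3 script_to_run.py  # optional field
        working_directory: /home/biolib/  # optional field
        input_files:
            - COPY / /home/biolib/
        output_files:
            # /home/biolib/output/ is a hypothetical results directory
            - COPY /home/biolib/output/ /
```

Optional top-level fields such as arguments or main_output_file can then be added as needed.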
biolib_version *
Specifies the syntax version of the configuration file. Use version 2 for all new applications. This option exists to ensure backwards compatibility of applications.
biolib_version: 2
modules *
Applications on BioLib consist of one or more modules. A module defines where the containerized code is located and how it should be run. You must define at least one module called "main", which is the entry module of your application.
The example below shows how to define a module to run a Docker image from Docker Hub.
modules:
    main:
        image: 'dockerhub://ncbi/blast:latest'
        command: efetch -db protein -format fasta -id P01349 > queries/P01349.fsa
        working_directory: /home/biolib/
        input_files:
            - COPY / /home/biolib/
        output_files:
            - COPY /home/biolib/queries/ /
image *
A module must define an image to run. To use a local Docker image for a module, set image
to local-docker://$DOCKER_IMAGE:$TAG.
The example below uses the latest version of a local Docker image called protein-predictor.
image: 'local-docker://protein-predictor:latest'
To use an image from Docker Hub in your module, use the syntax dockerhub://$REPO/$TAG:$VERSION.
image: 'dockerhub://ncbi/blast:latest'
command
A command can be provided to specify what to run inside the image.
When creating an application based on a Docker image, this field corresponds to calling docker run with the specified
command.
The example below uses the command field to run an installed binary called efetch:
command: efetch -db protein -format fasta -id P01349 > queries/P01349.fsa
Another example could be running a Python script:
command: python3 script_to_run.py
working_directory
Specifies which directory the module.command will be run in. The path must be absolute, for example:
working_directory: /home/biolib/
input_files *
This field defines where to copy the input files that are sent from the user of the application.
The field is defined as a list of COPY statements from the input file path to the input file destination.
The COPY statement has the following syntax: - COPY [SOURCE_PATH] [DESTINATION_PATH].
All paths must be absolute.
For example, if the code in the module expects its input to be in /data/input/, define the field as follows:
input_files:
    - COPY / /data/input/
output_files *
This field defines where to copy the output files, if any, after the module has been run. An output file could be, for example, a CSV file or a PNG image to show the user.
The syntax is a list of COPY statements of the form: - COPY [SOURCE_PATH] [DESTINATION_PATH].
All paths must be absolute.
Two common use cases are to send either a single file or a folder back to the user.
In the ncbi/blast example above the command creates a file called /home/biolib/queries/P01349.fsa.
To send everything in the queries folder back to the user:
output_files:
    - COPY /home/biolib/queries/ /
To send only the P01349.fsa file:
output_files:
    - COPY /home/biolib/queries/P01349.fsa /
If no output files are generated by the module, an empty list can be defined in the following way:
output_files: [ ]
arguments
Specifies how input options and settings will be rendered to the user of the application, and how inputs will be parsed. The field should follow this structure:
arguments:
    -   key: --data  # required
        description: 'Input Dropdown'  # required
        key_value_separator: ' '  # optional, default is ' '
        default_value: ''  # optional, default is ''
        type: dropdown  # required
        options:
            'This will be shown as option one': 'value1'
            'This will be shown as option two': 'value2'
        required: true  # optional, default is true
Under type you have the following options:
- text: provides a text input field
- textarea: provides a multi-line text input area
- file: provides a file select where users can upload an input file. You can optionally specify allowed_file_extensions to restrict file extensions.
- multifile: provides a file select where users can upload multiple input files. You can optionally specify allowed_file_extensions to restrict file extensions.
- drag-and-drop-file: provides a drag and drop area where users can upload a single file. You can optionally specify allowed_file_extensions to restrict file extensions.
- drag-and-drop-files: provides a drag and drop area where users can upload multiple files. You can optionally specify allowed_file_extensions to restrict file extensions.
- text-file: provides both a text input field and a file select, allowing the user to supply either. You can optionally specify allowed_file_extensions to restrict file extensions.
- number: provides a number input field
- sequence: provides a spreadsheet-styled input which checks for valid sequence characters and passes a FASTA file to your application. By default it checks for valid protein sequence characters. You can optionally specify a sequence_type ('protein', 'dna', or 'rna') to set the base allowed character set, and additional_sequence_characters to add extra characters beyond that base set (e.g., for ambiguity codes or gaps). Characters in additional_sequence_characters are case-sensitive. You can also set auto_uppercase_sequence to false to disable automatic uppercasing of sequences (enabled by default).
- table: provides a table input where users can enter structured data in rows and columns
- radio: provides a "radio select" where users can select one among a number of prespecified options
- dropdown: provides a dropdown menu where users can select one among a number of prespecified options
- multiselect: provides a dropdown menu where users can select one or more prespecified options
- multi-chain-compound: provides an input for multi-chain molecular compound structures. You can optionally specify max_chain_name_length to customize the maximum allowed length for chain names (default is 4 characters).
- toggle: provides a toggle switch where users can choose between two options. Note that the options need to be named 'on': 'value1' and 'off': 'value2'
- group: allows grouping of multiple related arguments together for better organization
- hidden: allows the application creator to provide a default input argument without it being shown to the end-user
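Several of the types above take extra fields. The sketch below shows how a few of them might be combined in one arguments list. The keys, descriptions and values are hypothetical, and writing additional_sequence_characters as a single string is an assumption about the expected format; only the field names themselves come from the descriptions above.

```yaml
arguments:
    # Sequence input limited to DNA, additionally allowing 'N' and '-'
    -   key: --sequences  # hypothetical key
        description: 'DNA sequences'
        type: sequence
        sequence_type: dna
        additional_sequence_characters: 'N-'  # assumed to be a plain string
        auto_uppercase_sequence: false
    # Toggle: the option names must be 'on' and 'off'
    -   key: --mode  # hypothetical key
        description: 'Fast mode'
        type: toggle
        options:
            'on': 'fast'
            'off': 'accurate'
    # Multi-chain compound input with longer chain names than the default of 4
    -   key: --compound  # hypothetical key
        description: 'Compound structure'
        type: multi-chain-compound
        max_chain_name_length: 8
    # Hidden argument: always passed with its default value, never rendered
    -   key: --threads  # hypothetical key
        description: 'Number of threads'
        type: hidden
        default_value: '4'
```

Quoting 'on' and 'off' is a deliberate choice here: YAML 1.1 parsers read the unquoted words as booleans, which would change the option names.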
sub_arguments: allows you to specify arguments that are only rendered if a user chooses a particular option in the
parent argument. For example, an application might allow the user to run one of two commands, each of which
needs different input arguments:
arguments:
    -   key: --function
        description: 'Choose a function'
        key_value_separator: ''
        default_value: ''
        type: dropdown
        options:
            'Command A': a
            'Command B': b
        sub_arguments:
            a:
                -   key: --argument_a
                    description: "Argument A takes a file input"
                    type: file
            b:
                -   key: --argument_b
                    description: 'Argument B takes a text input'
                    type: text
File Type Restrictions
For file input types (file, text-file, multifile, drag-and-drop-file, drag-and-drop-files), you can restrict which file extensions users can upload by specifying allowed_file_extensions:
arguments:
    -   key: --input-file
        description: 'Upload an image file'
        type: file
        allowed_file_extensions:
            - 'png'
            - 'jpg'
            - 'jpeg'
    -   key: --data-file
        description: 'Upload a data file'
        type: text-file
        allowed_file_extensions:
            - 'csv'
            - 'txt'
            - 'json'
remote_hosts
For your application to be able to reach external servers, each hostname must be specified in the config.yml
as a remote host, as in the example below:
remote_hosts:
    - blast.ncbi.nlm.nih.gov
Note: the end-user must allow each of these hostnames in their account settings in order to run your application.
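To connect this field with the module example earlier on this page, the sketch below declares a remote host for the efetch command. It rests on two assumptions that are not stated on this page: that remote_hosts sits at the top level of the file next to modules, and that efetch contacts the NCBI E-utilities server at eutils.ncbi.nlm.nih.gov.

```yaml
biolib_version: 2
modules:
    main:
        image: 'dockerhub://ncbi/blast:latest'
        command: efetch -db protein -format fasta -id P01349 > queries/P01349.fsa
        working_directory: /home/biolib/
        input_files:
            - COPY / /home/biolib/
        output_files:
            - COPY /home/biolib/queries/ /
# Assumed placement (top level) and assumed hostname for efetch requests
remote_hosts:
    - eutils.ncbi.nlm.nih.gov
```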
main_output_file
Specifies the path of the output file to render when the application finishes. If the file name ends with .md, it is
rendered as Markdown; otherwise, it is rendered as text.
main_output_file: output.md
description_file
Specifies the path to the README file, which will be rendered as the description on the application page. The default
path is README.md.
description_file: README.md
license_file
Specifies the path to the license file, which will be visible to users at the bottom of the application page. The
default path is LICENSE.
license_file: LICENSE
assets
Specifies a directory containing client-facing assets that should be included with your application. These files will be accessible to users of your application.
assets: data/
The assets directory should contain files that you want to make available to application users, such as reference data, example files, or other resources that enhance the application experience.
Important security note: Only include files in the assets directory that you want application users to be able to access. Do not include sensitive files, credentials, or any data that should remain private.
citation
Provide a citation using the following structure:
citation:
    entry_type: book
    author: Dr. John Doe
    title: Using config.yml
    publisher: BioLib Community Press
    year: '2020'
Note that the choice of entry_type has implications for which fields are required. For a complete list of which fields
are required for which entry types, refer to the BibTeX standard.
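To illustrate how the required fields change with entry_type, here is a sketch for a journal article. In BibTeX, an article entry requires author, title, journal and year. That BioLib accepts a journal key in the same way as the fields in the book example is an assumption, and the author, title and journal below are placeholders rather than a real publication.

```yaml
citation:
    entry_type: article
    author: Dr. Jane Doe  # placeholder author
    title: An example article title  # placeholder title
    journal: Journal of Example Studies  # placeholder journal, assumed field name
    year: '2021'
```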
* Required fields
Still have a question?
If you have any questions that you can't find an answer to above, please reach out to the BioLib community.