>>> job = samtools.cli(args='--help')
2021-08-18 11:41:39,085 | INFO : Loading package...
2021-08-18 11:41:39,384 | INFO : Loaded package: samtools/samtools
2021-08-18 11:41:40,252 | INFO : Computing...
Program: samtools (Tools for alignments in the SAM format)
Version: 1.10-31-g6e6d5f9 (using htslib 1.10.2-31-g4f60833)
Usage: samtools <command> [options]
  dict     create a sequence dictionary file
  faidx    index/extract FASTA
  fqidx    index/extract FASTQ
  index    index alignment
Save result files
To save the output files to a directory (in this case result_files/) run:
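For example, assuming job is the Job object returned by the samtools.cli() call above (the directory name result_files/ is just an example):

```python
# Save all result files from the job into the local directory 'result_files/'
job.save_files(output_dir='result_files')
```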
Working with Jobs
A Job is an object representing a single execution of an application. It holds progress information about the run as well as the result once the job completes.
When signed in, you can print a table of your jobs by running:
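A minimal sketch, assuming the biolib.show_jobs() helper with a count argument (explained below):

```python
import biolib

# Print a table of your 25 most recent jobs
biolib.show_jobs(count=25)
```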
where count refers to the number of jobs you want to show.
Retrieving a Job
To retrieve a Job in Python, call biolib.get_job() with the Job's ID. You can find the ID on the job overview page.
job = biolib.get_job(job_id)
To retrieve the status of a job in Python, call .get_status() on the job:
status = biolib.get_job(job_id).get_status()
print(status)
You can use this to determine if a job has completed or is still in progress.
If your Job is still running you can attach to its stdout and stderr by running:
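For example, assuming a stream_logs() method on the Job object:

```python
# Stream stdout and stderr of the running job until it finishes
job.stream_logs()
```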
This will print current output and keep streaming stdout and stderr until the job has finished.
Assuming a Job has completed, its outputs can be accessed by the following methods:
job.get_stdout()           # Returns stdout as bytes
job.get_stderr()           # Returns stderr as bytes
job.get_exit_code()        # Returns exit code of the application as an integer
job.save_files(output_dir) # Saves result files to 'output_dir'
.save_files() also supports glob filtering via the path_filter argument. To save all .pdb files from a result, run:
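For example, saving every .pdb file into a directory called pdb_files/ (the directory name is arbitrary):

```python
# Save only files matching the glob pattern '*.pdb' into 'pdb_files/'
job.save_files(output_dir='pdb_files', path_filter='*.pdb')
```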
Using Results Without Saving to Disk
Some applications output large files. To save disk space on your computer, you can interact with result files without writing them to disk.
To list the output files from a job:
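A minimal sketch, assuming a list_output_files() method on the Job object:

```python
# List the files produced by the job
print(job.list_output_files())
```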
To load a single file into memory, without saving it to disk, run:
my_csv_file = job.get_output_file('/my_file.csv')
To pass an output file to a library like Pandas or BioPython, run .get_file_handle() on the object:
import pandas as pd
my_dataframe = pd.read_csv(my_csv_file.get_file_handle())
Starting Multiple Jobs in Parallel
Pass blocking=False to cli() to get the Job back immediately, without waiting for the application to finish.
This enables parallelized workflows like the one below:
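A minimal sketch of such a workflow, assuming samtools was loaded as in the earlier example and that each Job exposes get_status() as described above (the arguments and job count are illustrative):

```python
import biolib

samtools = biolib.load('samtools/samtools')

# Start three jobs in parallel; blocking=False returns each Job immediately
jobs = [samtools.cli(args='--help', blocking=False) for _ in range(3)]

# Check on the jobs later, e.g. by polling their status
for job in jobs:
    print(job.get_status())
```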