A yaml (or yml) config file that describes the analysis you want to run and the type of report you want to generate.
You can provide any of the command line arguments via this config file; for instance, you can pass the input csv or ID string in via the configuration file.
Using this input option will allow the user to run very specific, elaborate reports again and again, without having to specify all arguments via the command line.
Note: if the same option is specified both in the config file and as a command line argument, the command line argument overrides the config file option.
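The precedence rule above can be pictured as a simple layered merge. The following is a minimal Python sketch, not civet's actual implementation: the function name `merge_options` and the dictionary keys are hypothetical, and it assumes that an option left unset on the command line is represented as `None`.

```python
def merge_options(defaults, config_file, command_line):
    """Combine option sources so that later sources win.

    Order of precedence (lowest to highest): defaults, config file, command line.
    All names here are illustrative assumptions, not civet internals.
    """
    merged = dict(defaults)
    merged.update(config_file)
    # Only options the user actually supplied on the command line (not None)
    # are allowed to override the config file.
    supplied = {key: value for key, value in command_line.items() if value is not None}
    merged.update(supplied)
    return merged


# Hypothetical values: the config file sets max_queries, the command line overrides it.
defaults = {"input_column": "name", "max_queries": 5000}
config_file = {"input_csv": "test.csv", "fasta": "test.fasta", "max_queries": 100}
command_line = {"max_queries": 250, "fasta": None}

options = merge_options(defaults, config_file, command_line)
print(options)
```

In this sketch `max_queries` ends up as the command line value, while `fasta` keeps the config file value because it was not given on the command line.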
civet -c config.yaml
# Input options:
input_csv: test.csv
fasta: test.fasta
# Output options:
datadir: `
Input csv file with an identifier column. By default civet will look for a column called `name` but this can be changed with the `-icol/--input-column` flag (or added to the input config file).
For more information on options to do with the input csv see Input column configuration.
A comma-separated set of ids that civet tries to match against the database. You can define which field to match against with -bcol/--background-id-column; by default civet matches against the `sequence_name` column.
A civet instance is then run with the sequences that match the search parameters. If the search hits more than the maximum number of query sequences (configurable with -mq/--max-queries) then you'll be prompted to define a more specific search.
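To make the ID string behaviour concrete, here is a minimal Python sketch of the idea described above. It is an illustration under stated assumptions rather than civet's own code: the function `match_ids` is hypothetical, matching is assumed to be an exact string comparison, and the background data is assumed to be a list of dictionaries, one per metadata row.

```python
def match_ids(id_string, background_rows, background_column="sequence_name", max_queries=5000):
    """Return background rows whose ID column matches one of the comma-separated ids.

    Raises ValueError if the search hits more rows than max_queries, mirroring
    the prompt to define a more specific search. Exact matching is an assumption.
    """
    wanted = {item.strip() for item in id_string.split(",") if item.strip()}
    hits = [row for row in background_rows if row[background_column] in wanted]
    if len(hits) > max_queries:
        raise ValueError(
            f"{len(hits)} matches exceed the maximum of {max_queries}; "
            "please define a more specific search"
        )
    return hits


# Hypothetical background metadata with the default sequence_name column.
background = [
    {"sequence_name": "seq_A", "country": "Wales"},
    {"sequence_name": "seq_B", "country": "Scotland"},
    {"sequence_name": "seq_C", "country": "Wales"},
]

queries = match_ids("seq_A, seq_C", background)
print([row["sequence_name"] for row in queries])
```

Passing a different `background_column` in this sketch plays the role of -bcol/--background-id-column, and a lower `max_queries` plays the role of -mq/--max-queries.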
Optional input fasta file with sequences not yet added to the background data. These will be added into the tree next to their closest sequence found in the background data.
If no csv or ID string is supplied, civet will use all fasta records in this file as the query set.
See Input sequences options for options associated with input sequence files.
If all sequences of interest are in the background tree and metadata, the user can opt to define a set of query sequences using the -fm/--from-metadata flag.
To do this, supply one or more column_name=match_string pairs. For example:
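As a rough illustration of how column_name=match_string pairs could be interpreted, the Python sketch below parses the pairs and filters metadata rows with them. It is not civet's implementation: the helper names are hypothetical, the column names and values are invented for the example, matching is assumed to be exact, and multiple pairs are assumed to be combined with AND.

```python
def parse_from_metadata(pairs):
    """Turn ['column=value', ...] into a {column: value} dictionary."""
    filters = {}
    for pair in pairs:
        column, separator, match_string = pair.partition("=")
        if not separator or not column or not match_string:
            raise ValueError(f"expected column_name=match_string, got {pair!r}")
        filters[column] = match_string
    return filters


def select_queries(rows, filters):
    """Keep rows where every filtered column equals its match string (assumed AND logic)."""
    return [
        row for row in rows
        if all(row.get(column) == value for column, value in filters.items())
    ]


# Hypothetical metadata rows and search pairs.
metadata = [
    {"sequence_name": "seq_A", "country": "Wales", "lineage": "B.1"},
    {"sequence_name": "seq_B", "country": "Scotland", "lineage": "B.1"},
    {"sequence_name": "seq_C", "country": "Wales", "lineage": "A"},
]

filters = parse_from_metadata(["country=Wales", "lineage=B.1"])
selected = select_queries(metadata, filters)
print([row["sequence_name"] for row in selected])
```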
Maximum number of queries that can be analysed. Default: 5000