NAME
FAStrGen.pl - Generate structures for Fatty Acyls (FA)
SYNOPSIS
FAStrGen.pl FAAbbrev|FAAbbrevFileName ...
FAStrGen.pl [-h, --help] [-m, --mode Abbrev | AbbrevFileName] [-p, --ProcessMode WriteSDFile | CountOnly] [-o, --overwrite] [-r, --root rootname] [-w, --workingdir dirname] <arguments>...
DESCRIPTION
Generate Fatty Acyls (FA) structures using compound abbreviations specified on a command line or in a CSV/TSV Text file. All the command line arguments represent either compound abbreviations or file name containing abbreviations. Use mode option to control the type of command line arguments.
A SD file, containing structures for all SP abbreviations along with ontological information, is generated as an output.
SUPPORTED ABBREVIATIONS
Current support for FA structure generation include these main classes and sub classes:
o Fatty Acids and Conjugates
. Methyl branched fatty acids
. Unsaturated fatty acids
. Hydroperoxy fatty acids
. Hydroxy fatty acids
. Oxo fatty acids
. Epoxy fatty acids
. Methoxy fatty acids
. Halogenated fatty acids
. Amino fatty acids
. Cyano fatty acids
. Nitro fatty acids
. Thia fatty acids
o Eicosanoids
o Fatty alcohols
o Fatty aldehydes
o Fatty amides
OPTIONS
- -h, --help
-
Print this help message
- -m, --mode Abbrev|AbbrevFileName
-
Controls interpretation of command line arguments. Two different methods are provided: specify compound abbreviations or a file name containing compound abbreviations. Possible values: Abbrev or AbbrevFileName. Default: Abbrev
In AbbrevFileName mode, a single line in CSV/TSV files can contain multiple compound abbreviations. The file extension determines delimiter used to process data lines: comma for CSV and tab for TSV. For files with TXT extension, only one compound abbreviation per line is allowed.
Wild card character, *, is also supported in compound abbreviations to generate straight chain and unsaturated fatty acids.
Examples:
Specific structures: "18:0" "20:4(5Z,8Z,11Z,14Z)"
"28:1(12Z)(2Me,4Me,6Me)"
"18:3(6Z,9Z,11E)(13OOH[S])"
"18:2(9E,11E)(13OH)"
"18:1(10E)(9Ke,10Ep)"
"16:1(5Z)(2OMe)" "7:1(2Z)(3Br)"
"18:2(9Z,12Z)(10NO2)" "16:2(10E,12Z)(1OH)"
"6:0(1CHO)" "12:0(1NH2)"
"20:2(5Z,13E)(9OH[S],11OH[R],15OH[S]){8a,12b}"
All possibilites: *:* or *
With wild card character, +/- can also be used for chain lengths to indicate even and odd lengths; additionally > and < qualifiers are also allowed to specify length requirements. Examples:
Odd number chains: "*-:*"
Even number chains: "*+:*"
Odd number chains with chain length longer than 18: "*->18:*"
Even number chains with chain length longer than 14: "*+>14:*"
- -o, --overwrite
-
Overwrite existing files
- -r, --root rootname
-
New file name is generated using the root: <Root>.sdf. Default for new file names: FAAbbrev.sdf, <AbbrevFilenName>.sdf, or <FirstAbbrevFileName>1To<Count>.sdf.
- -p, --ProcessMode WriteSDFile|CountOnly
-
Specify how abbreviations are processed: generate structures for specified abbreviations along with generating a SD file or just count the number of structures corresponding to specified abbreviations without generating any SD file. Possible values: WriteSDFile or CountOnly. Default: WriteSDFile.
It can take substantial amount of time for generating all the structures and writing out a SD file for abbreviations containing wild cards. CountOnly value of --ProcessMode option can be used to get a quick count of number of structures to be generated without writing out any SD file.
- -w, --workingdir dirname
-
Location of working directory. Default: current directory
EXAMPLES
Print this help message
Controls interpretation of command line arguments. Two different methods are provided: specify compound abbreviations or a file name containing compound abbreviations. Possible values: Abbrev or AbbrevFileName. Default: Abbrev
In AbbrevFileName mode, a single line in CSV/TSV files can contain multiple compound abbreviations. The file extension determines delimiter used to process data lines: comma for CSV and tab for TSV. For files with TXT extension, only one compound abbreviation per line is allowed.
Wild card character, *, is also supported in compound abbreviations to generate straight chain and unsaturated fatty acids.
Examples:
All possibilites: *:* or *
With wild card character, +/- can also be used for chain lengths to indicate even and odd lengths; additionally > and < qualifiers are also allowed to specify length requirements. Examples:
Even number chains: "*+:*"
Odd number chains with chain length longer than 18: "*->18:*"
Even number chains with chain length longer than 14: "*+>14:*"
Overwrite existing files
New file name is generated using the root: <Root>.sdf. Default for new file names: FAAbbrev.sdf, <AbbrevFilenName>.sdf, or <FirstAbbrevFileName>1To<Count>.sdf.
Specify how abbreviations are processed: generate structures for specified abbreviations along with generating a SD file or just count the number of structures corresponding to specified abbreviations without generating any SD file. Possible values: WriteSDFile or CountOnly. Default: WriteSDFile.
It can take substantial amount of time for generating all the structures and writing out a SD file for abbreviations containing wild cards. CountOnly value of --ProcessMode option can be used to get a quick count of number of structures to be generated without writing out any SD file.
Location of working directory. Default: current directory
On some systems, command line scripts may need to be invoked using perl -s FAStrGen.pl; however, all the examples assume direct invocation of command line script works.
To generate a FAStructures.sdf file containing a structure specified by a command line FA abbreviation for straight chain fatty acids, type:
To generate a FAStructures.sdf file containing a structure specified by a command line FA abbreviation for methyl branched fatty acids, type:
To generate a FAStructures.sdf file containing a structure specified by a command line FA abbreviation for unsaturated fatty acids, type:
To generate a FAStructures.sdf file containing a structure specified by a command line FA abbreviation for hydroperoxy fatty acids, type:
To generate a FAStructures.sdf file containing a structure specified by a command line FA abbreviation for hydroxy fatty acids, type:
To generate a FAStructures.sdf file containing a structure specified by a command line FA abbreviation for oxo fatty acids, type:
To generate a FAStructures.sdf file containing a structure specified by a command line FA abbreviation for epoxy fatty acids, type:
To generate a FAStructures.sdf file containing a structure specified by a command line FA abbreviation for methoxy fatty acids, type:
To generate a FAStructures.sdf file containing a structure specified by a command line FA abbreviation for halogenated fatty acids, type:
To generate a FAStructures.sdf file containing a structure specified by a command line FA abbreviation for amino fatty acids, type:
To generate a FAStructures.sdf file containing a structure specified by a command line FA abbreviation for Cyano fatty acids, type:
To generate a FAStructures.sdf file containing a structure specified by a command line FA abbreviation for nitro fatty acids, type:
To generate a FAStructures.sdf file containing a structure specified by a command line FA abbreviation for prostaglanins, type:
To generate a FAStructures.sdf file containing a structure specified by a command line FA abbreviation for fatty alcohols, type:
To generate a FAStructures.sdf file containing a structure specified by a command line FA abbreviation for fatty aldehydes, type:
To generate a FAStructures.sdf file containing a structure specified by a command line FA abbreviation for primary amides, type:
To enumerate straight chain and unsaturated fatty acids with commonly occurring chain lengths and generate FAStructures.sdf file, type:
or
AUTHOR
CONTRIBUTOR
SEE ALSO
CLStrGen.pl, GLStrGen.pl, GPStrGen.pl, SPStrGen.pl, STStrGen.pl, 
COPYRIGHT
Copyright (C) 2006-2017. The Regents of the University of California. All Rights Reserved.
LICENSE
Modified BSD License