p3-find-in-clusters¶
Identify Clusters Containing Input Features¶
p3-find-in-clusters.pl [options] realClusterFile
This script takes as input a list of features and compares it to a list of physical clusters (the output of p3-identify-clusters). If a features is within the bounds of a cluster, or within the gap distance of either end, it will be output with the cluster ID appended.
Parameters¶
The positional parameter is the name of the file containing the cluster information. This must be a tab-delimited file with the following columns– (1) cluster ID, (2) genome ID, (3) sequence ID, (4) start location, and (5) end location. This is the output format from p3-identify-clusters.
The standard input can be overriddn using the options in Input Options. The standard input should contain feature IDs in the key column (specified using the options in Column Options) plus the feature location and the ID of the containing sequence. The following additional options are supported.
maxGap
The maximum number of base pairs allowed between two features in the same cluster. The default is
2000
.
location
In index (1-based) or name of the column containing the feature location. The default is
location
.
sequence
The index (1-based) or name of the column containing the sequence ID. The defahult is
sequence_id
.