p3-generate-close-roles¶
Find Roles That Occur Close Together¶
p3-generate-close-roles.pl [options] <roles.tbl >pairs.tbl
This script is part of a pipeline to compute functionally-coupled roles. It takes a file of locations and roles, then outputs a file of pairs of roles with the number of times features containing those two roles occur close together on the chromosome. Such roles typically have related functions in a genome.
The input file must contain the following four fields.
1
genome ID
2
contig (sequence) ID
3
location in the sequence
4
functional role
The default script assumes the four columns are in that order. This can all be overridden with command-line options.
The input file must be sorted by genome ID and then by sequence ID within genome ID. Otherwise, the results will be incorrect. Use p3-sort to sort the file.
The location is a BV-BRC location string, either of the form start..end or complement(left..right).
Given a set of genome IDs in the file genomes.tbl, you can generate the proper file using the following pipe.
p3-get-genome-features --attr sequence_id --attr location --attr product <genomes.tbl | p3-function-to-role
(If BV-BRC does not yet have roles defined, you will need to use an additional command-line option on p3-function-to-role.)
Parameters¶
There are no positional parameters.
The standard input can be overriddn using the options in Input Options.
Additional command-line options are
genome
The index (1-based) or name of the column containing the genome ID. The default is
1.
sequence
The index (1-based) or name of the column containing the sequence ID. The default is
2.
location
The index (1-based) or name of the column containing the location string. The default is
3.
role
The index (1-based) or name of the column containing the role description. The default is
4.
maxGap
The maximum space between two features considered close. The default is
2000.
minOcc
The minimum number of occurrences for a pair to be considered significant. The default is
4.
Example¶
This command is shown in the tutorial p3_common_tasks.html
- p3-get-genome-features –eq feature_type,CDS –attr sequence_id –attr location –attr product <genomes.tbl | p3-function-to-role | p3-generate-close-roles
role1 role2 count Transposase, IS3/IS911 family Mobile element protein 33 Mobile element protein Mobile element protein 29 Lead, cadmium, zinc and mercury transporting ATPase (EC 3.6.3.3) (EC 3.6.3.5) Copper-translocating P-type ATPase (EC 3.6.3.4) 25 Potassium efflux system KefA protein Small-conductance mechanosensitive channel 13 …