p3-generate-close-roles
Find Roles That Occur Close Together
p3-generate-close-roles.pl [options] <roles.tbl >pairs.tbl
This script is part of a pipeline to compute functionally-coupled roles. It takes a file of locations and roles, then outputs a file of pairs of roles with the number of times features containing those two roles occur close together on the chromosome. Such roles typically have related functions in a genome.
The input file must contain the following four fields.
- 1 
genome ID
- 2 
contig (sequence) ID
- 3 
location in the sequence
- 4 
functional role
By default, the script assumes the four columns are in that order. Each column can be overridden with command-line options.
The input file must be sorted by genome ID and then by sequence ID within genome ID. Otherwise, the results will be incorrect. Use p3-sort to sort the file.
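One way to catch a badly ordered file before running the script is to confirm that each genome, and each sequence within a genome, occupies a single contiguous run of lines. The Python sketch below does that. It is an illustration, not part of the P3 tools: the function names are hypothetical, the rows are assumed to be already split into fields, and it checks contiguity only, which is a weaker condition than the sorted order required above.

```python
def first_split_key(keys):
    """Return the first key that reappears after its run has ended, else None."""
    seen = set()
    previous = None
    for key in keys:
        if key != previous:
            if key in seen:
                return key  # this key already had an earlier, separate run
            seen.add(key)
            previous = key
    return None


def check_grouping(rows, genome_col=0, sequence_col=1):
    """Return (split_genome, split_sequence); each is None when that level is contiguous.

    Column positions are 0-based here and default to the first two fields,
    matching the default column order described in the text.
    """
    rows = list(rows)
    split_genome = first_split_key(row[genome_col] for row in rows)
    split_sequence = first_split_key(
        (row[genome_col], row[sequence_col]) for row in rows
    )
    return split_genome, split_sequence


if __name__ == "__main__":
    # Hypothetical rows: genome g1 is interrupted by g2, so it is reported.
    sample = [
        ("g1", "c1", "1..100", "Role A"),
        ("g2", "c1", "1..100", "Role B"),
        ("g1", "c2", "1..100", "Role C"),
    ]
    print(check_grouping(sample))
```

If either value is not None, re-sort the file with p3-sort before feeding it to the script.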
The location is a BV-BRC location string, either of the form start..end or complement(left..right).
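To make the two location forms concrete, here is a minimal Python sketch that turns either one into a (left, right, strand) tuple. The function name and the "+"/"-" strand labels are hypothetical, and the sketch assumes that only the two forms named above occur; any other string is rejected rather than guessed at.

```python
import re

# Patterns for the two location forms described in the text.
_FORWARD = re.compile(r"^(\d+)\.\.(\d+)$")
_REVERSE = re.compile(r"^complement\((\d+)\.\.(\d+)\)$")


def parse_location(loc):
    """Return (left, right, strand) for a location string.

    strand is "+" for start..end and "-" for complement(left..right).
    """
    text = loc.strip()
    strand = "+"
    match = _FORWARD.match(text)
    if match is None:
        strand = "-"
        match = _REVERSE.match(text)
    if match is None:
        raise ValueError(f"unrecognized location string: {loc!r}")
    first, second = int(match.group(1)), int(match.group(2))
    # Normalize so that left <= right regardless of how the pair was written.
    return min(first, second), max(first, second), strand


if __name__ == "__main__":
    print(parse_location("100..250"))
    print(parse_location("complement(300..900)"))
```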
Given a set of genome IDs in the file genomes.tbl, you can generate the proper file using the following pipe.
p3-get-genome-features --attr sequence_id --attr location --attr product <genomes.tbl | p3-function-to-role
(If BV-BRC does not yet have roles defined, you will need to use an additional command-line option on p3-function-to-role.)
Parameters
There are no positional parameters.
The standard input can be overridden using the options in Input Options.
Additional command-line options are:
- genome
The index (1-based) or name of the column containing the genome ID. The default is 1.
- sequence
The index (1-based) or name of the column containing the sequence ID. The default is 2.
- location
The index (1-based) or name of the column containing the location string. The default is 3.
- role
The index (1-based) or name of the column containing the role description. The default is 4.
- maxGap
The maximum distance (in base pairs) between two features for them to be considered close. The default is 2000.
- minOcc
The minimum number of occurrences for a pair to be considered significant. The default is 4.
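The interaction of maxGap and minOcc can be sketched in Python as follows. This is not the script's actual implementation, and it rests on several assumptions that the text does not confirm: the gap is measured from the end of one feature to the start of the next, overlapping features count as close, each input row is treated as one role occurrence, and the two roles in a pair are stored in alphabetical order. The function and parameter names are hypothetical.

```python
from collections import Counter
from itertools import groupby


def span(loc):
    """Return (left, right) for 'start..end' or 'complement(left..right)'."""
    text = loc.strip()
    if text.startswith("complement(") and text.endswith(")"):
        text = text[len("complement("):-1]
    first, _, second = text.partition("..")
    a, b = int(first), int(second)  # raises ValueError on malformed input
    return min(a, b), max(a, b)


def close_role_pairs(rows, max_gap=2000, min_occ=4):
    """Count role pairs on nearby features; rows are (genome, sequence, location, role).

    Rows must already be grouped by genome and sequence, as the text requires.
    Returns (role_a, role_b, count) tuples, most frequent first.
    """
    counts = Counter()
    for _, group in groupby(rows, key=lambda r: (r[0], r[1])):
        # Sort the features on this sequence by left edge.
        feats = sorted((*span(r[2]), r[3]) for r in group)
        for i, (_, right_i, role_i) in enumerate(feats):
            for left_j, _, role_j in feats[i + 1:]:
                # Assumed gap definition: start of the later feature minus end
                # of the earlier one. Left edges only grow, so once the gap is
                # too large no later feature can be close either.
                if left_j - right_i > max_gap:
                    break
                counts[tuple(sorted((role_i, role_j)))] += 1
    return [(a, b, n) for (a, b), n in counts.most_common() if n >= min_occ]


if __name__ == "__main__":
    # Hypothetical input: roles A and B sit 50 bp apart in g1 and 10 bp apart in g2.
    demo = [
        ("g1", "c1", "1..100", "A"),
        ("g1", "c1", "complement(150..300)", "B"),
        ("g1", "c1", "5000..5100", "C"),
        ("g2", "c9", "10..90", "B"),
        ("g2", "c9", "100..200", "A"),
    ]
    print(close_role_pairs(demo, max_gap=2000, min_occ=2))
```

Raising max_gap admits more distant neighbors and so more pairs; raising min_occ trims pairs that co-occur only a few times.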
Example
This command is shown in the tutorial p3_common_tasks.html.
p3-get-genome-features --eq feature_type,CDS --attr sequence_id --attr location --attr product <genomes.tbl | p3-function-to-role | p3-generate-close-roles
role1	role2	count
Transposase, IS3/IS911 family	Mobile element protein	33
Mobile element protein	Mobile element protein	29
Lead, cadmium, zinc and mercury transporting ATPase (EC 3.6.3.3) (EC 3.6.3.5)	Copper-translocating P-type ATPase (EC 3.6.3.4)	25
Potassium efflux system KefA protein	Small-conductance mechanosensitive channel	13
…
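For downstream use, the output can be loaded into a lookup from each role to its coupled partners. The Python sketch below assumes the output is tab-delimited with the three-column header shown above; the function name is hypothetical. Tab splitting matters here because role names themselves contain commas and spaces.

```python
import csv
import io
from collections import defaultdict


def read_pairs(handle):
    """Map each role to a list of (partner_role, count) from a pairs file."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    next(reader, None)  # skip the role1/role2/count header line
    partners = defaultdict(list)
    for row in reader:
        if len(row) < 3:
            continue  # ignore blank or truncated lines
        role1, role2, count = row[0], row[1], int(row[2])
        partners[role1].append((role2, count))
        if role2 != role1:
            partners[role2].append((role1, count))
    return partners


if __name__ == "__main__":
    # Two rows taken from the example output, joined with tabs.
    text = (
        "role1\trole2\tcount\n"
        "Transposase, IS3/IS911 family\tMobile element protein\t33\n"
        "Mobile element protein\tMobile element protein\t29\n"
    )
    for partner, count in read_pairs(io.StringIO(text))["Mobile element protein"]:
        print(count, partner)
```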