PerlCyc Example Query
Example of starting up Pathway Tools in API mode from within Perl:
sub run_ptools_api_mode {
    # exec replaces the current Perl process, so nothing after this call runs.
    exec "~/pathway-tools/pathway-tools -api";
}
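Because exec never returns, a script that wants to start Pathway Tools and then run queries itself needs to launch the server in a child process. The following is a minimal sketch of that idea, not part of PerlCyc: the helper names start_ptools_in_background and wait_for_socket are hypothetical, and the socket path /tmp/ptools-socket is an assumption about where the API server listens.

```perl
use strict;
use warnings;

# Assumption: the Pathway Tools API server listens on this Unix socket.
my $SOCKET = "/tmp/ptools-socket";

# Hypothetical helper: poll until $path exists as a socket,
# or give up after $timeout seconds. Returns 1 on success, 0 on timeout.
sub wait_for_socket {
    my ($path, $timeout) = @_;
    my $deadline = time + $timeout;
    until (-S $path) {
        return 0 if time >= $deadline;
        sleep 1;
    }
    return 1;
}

# Hypothetical helper: run $command in a child process so the
# calling script keeps running. Returns the child's process ID.
sub start_ptools_in_background {
    my ($command) = @_;
    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    if ($pid == 0) {
        # Child: replace this process with Pathway Tools.
        exec($command) or die "cannot exec '$command': $!";
    }
    return $pid;
}

# Possible usage (commented out so that loading this file starts nothing):
# my $pid = start_ptools_in_background("~/pathway-tools/pathway-tools -api");
# wait_for_socket($SOCKET, 120) or die "Pathway Tools did not come up\n";
```

Polling for the socket is only one way to detect readiness; a fixed sleep would also work but wastes time on fast machines and may be too short on slow ones.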
Example query: Print the common name of each pathway in E. coli
use perlcyc;
# General form of the constructor: perlcyc->new("ORGID")
my $cyc = perlcyc -> new ("ECOLI");
my @pathways = $cyc -> all_pathways ();
foreach my $p (@pathways) {
    print $cyc -> get_slot_value ($p, "COMMON-NAME"), "\n";
}
Example query: Number of proteins in E. coli
use perlcyc;
my $cyc = perlcyc -> new ("ECOLI");
my @proteins = $cyc-> get_class_all_instances("|Proteins|");
my $protein_count = scalar(@proteins);
print "Protein count: $protein_count.\n";
Example query: Print IDs of all proteins with molecular weight between 10 and 20 kD and pI between 4 and 5.
use perlcyc;
my $cyc = perlcyc -> new ("ECOLI");
foreach my $p ($cyc->get_class_all_instances("|Proteins|")) {
    my $mw = $cyc->get_slot_value($p, "molecular-weight-kd");
    my $pI = $cyc->get_slot_value($p, "pi");
    if ($mw <= 20 && $mw >= 10 && $pI <= 5 && $pI >= 4) {
        print "$p\n";
    }
}
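The numeric comparisons above assume every protein has both slots filled. A minimal sketch of a more defensive range check follows, assuming a missing slot may come back as undef, an empty string, or the string "NIL"; the helper name in_range is hypothetical and not part of PerlCyc.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Hypothetical helper: returns 1 only when $value is a number
# within [$min, $max]. Undefined and non-numeric values (such as
# an empty string or "NIL") are treated as missing and return 0.
sub in_range {
    my ($value, $min, $max) = @_;
    return 0 unless defined $value && looks_like_number($value);
    return ($value >= $min && $value <= $max) ? 1 : 0;
}
```

With this helper the filter condition could read in_range($mw, 10, 20) && in_range($pI, 4, 5), which also avoids "isn't numeric" warnings when the script runs under use warnings.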
Example query: List all the transcription factors in E. coli, and the list of genes that each regulates:
use perlcyc;
my $cyc = perlcyc -> new ("ECOLI");
foreach my $p ($cyc->get_class_all_instances("|Proteins|")) {
    if ($cyc->transcription_factor_p($p)) {
        my $name = $cyc->get_slot_value($p, "common-name");
        my %genes = ();
        foreach my $tu ($cyc->regulon_of_protein($p)) {
            foreach my $g ($cyc->transcription_unit_genes($tu)) {
                $genes{$g} = $cyc->get_slot_value($g, "common-name");
            }
        }
        print "\n\n$name: ";
        print join " ", values %genes;
    }
}
Editing example: Add a link from each gene to the corresponding object in MY-DB (assuming the ID is the same in both databases)
use perlcyc;
my $cyc = perlcyc -> new ("HPY");
my @genes = $cyc->get_class_all_instances ("|Genes|");
foreach my $g (@genes) {
    $cyc->add_slot_value ($g, "DBLINKS", "(MY-DB \"$g\")");
}
$cyc->save_kb();
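The DBLINKS value in the loop above is a Lisp list written as a Perl string, so the inner quotes have to be escaped by hand. As a sketch of one way to keep that quoting in a single place, the hypothetical helper below (dblink_value is not part of PerlCyc) builds the string and escapes any double quote or backslash in the ID; the gene ID "b0001" in the comment is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: build a DBLINKS value such as (MY-DB "b0001").
# Backslashes and double quotes inside the ID are backslash-escaped
# so the result remains a single well-formed Lisp string.
sub dblink_value {
    my ($db, $id) = @_;
    (my $escaped = $id) =~ s/(["\\])/\\$1/g;
    return qq{($db "$escaped")};
}
```

The call in the loop would then become $cyc->add_slot_value($g, "DBLINKS", dblink_value("MY-DB", $g)).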
Examples of how to extend PerlCyc with functions available in the Lisp API:
package perlcyc;
sub slot_has_value_p {
    @_ == 3 or warn "slot_has_value_p: expected two arguments, received @_\n";
    my $self = shift;
    my $entity = shift;
    my $slotname = shift;
    my $frame = protectFrameName($entity);
    return $self->call_func_that_returns_boolean("slot-has-value-p \'$frame \'$slotname");
}

sub get_protein_sequence {
    @_ == 2 or warn "get_protein_sequence: expected one argument, received @_\n";
    my $self = shift;
    my $entity = shift;
    my $frame = protectFrameName($entity);
    return $self->call_func_that_returns_string("get-protein-sequence \'$frame");
}
Example using new extension:
## Load the PerlCyc Perl Module:
use perlcyc;
## Create a new PerlCyc object:
my $cyc = perlcyc->new("ECOLI"); ## Use PerlCyc to connect to PGDB 'ECOLI'
## Take the first protein in the list:
my $prot = ($cyc->get_class_all_instances("|Proteins|"))[0];
## Use our new PerlCyc function to access the protein sequence:
my $seq = $cyc -> get_protein_sequence ($prot);
## Declare the result outside the block so it stays visible afterwards:
my $weight;
## Check to see if the slot has a value before accessing the slot:
if ( $seq && $seq ne "NIL" && $cyc -> slot_has_value_p($prot,"molecular-weight-kd") ) {
    ## Fetch the weight:
    $weight = $cyc -> get_slot_value($prot, "molecular-weight-kd");
}