Tutorial

About Genes Like Me (formerly GeneDecks Partner Hunter)

Genes Like Me is a novel analysis tool which provides a similarity metric by highlighting shared descriptors between genes, based on the rich annotation within the GeneCards compendium of human genes.

Users supply a query gene, and the system finds putative functional paralogs, namely genes that are similar to the query gene based on combinatorial similarity of attribute annotations.

Genes Like Me Algorithm


Genes Like Me calculates similarity scores between each query gene and all remaining candidate genes in the GeneCards database for the 8 attributes listed in Table 1. For all attributes except Gene Ontology and sequence paralogy, the similarity score between a query gene and a candidate gene is calculated in the following manner:

  • Each descriptor score (DS) is the result of dividing the descriptor's rank by the log10 of its frequency in the database.
  • Descriptor ranks are each assigned the value of 1, except for those associated with the Gene Ontology (GO) attribute, which are assigned according to the descriptor's evidence code (Buza et al. 2008); for example, Inferred from Direct Assay (IDA) will receive a descriptor score of 5.
  • The attribute score (AS) is the sum of the descriptor scores for those descriptors shared by both the query gene and the candidate gene, divided by the sum of the descriptor scores for all descriptors associated with the query gene.
  • For the sequence paralogy attribute, a partner candidate that is also identified as a sequence paralog (SP) is assigned a value of 1 for this attribute, and 0 otherwise.
  • Gene expression data was mined from BioGPS (http://biogps.org/). The similarity score is the mean Pearson correlation (P.Corr) between all expression vectors for the query gene and the candidate gene. This improves the search for similar genes by expression pattern: it looks for vector correlations rather than exact matches of binary expression patterns, and is therefore less stringent.
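The scoring rules above can be illustrated with a minimal Python sketch. It is not the GeneCards implementation: the function names, the descriptor names and the frequencies are all hypothetical, and the handling of edge cases (a descriptor with frequency 1, a query gene with no descriptors, constant expression vectors) is an assumption, since the text does not specify it.

```python
import math
from itertools import product


def descriptor_score(frequency, rank=1.0):
    """DS = rank / log10(frequency of the descriptor in the database)."""
    # Assumption: frequency > 1. log10(1) is 0, and the text does not say
    # how a descriptor annotated to a single gene would be handled.
    if frequency <= 1:
        raise ValueError("frequency must be greater than 1")
    return rank / math.log10(frequency)


def attribute_score(query, candidate, frequency, rank=None):
    """AS = sum of DS over shared descriptors / sum of DS over query descriptors."""
    rank = rank or {}  # descriptors without an explicit rank default to 1
    ds = {d: descriptor_score(frequency[d], rank.get(d, 1.0)) for d in query}
    total = sum(ds.values())
    shared = sum(ds[d] for d in query & candidate)
    # Assumption: a query gene with no descriptors scores 0 for the attribute.
    return shared / total if total else 0.0


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def expression_similarity(query_vectors, candidate_vectors):
    """Mean Pearson correlation over all query/candidate vector pairs."""
    pairs = list(product(query_vectors, candidate_vectors))
    return sum(pearson(q, c) for q, c in pairs) / len(pairs)


# Hypothetical "Domains" descriptors and database frequencies.
frequency = {"Protein kinase": 1000, "SH2": 100, "SH3": 10}
query = {"Protein kinase", "SH2", "SH3"}
candidate = {"Protein kinase", "SH2"}

# Shared DS: 1/3 + 1/2; all query DS: 1/3 + 1/2 + 1/1.
print(attribute_score(query, candidate, frequency))  # ≈ 0.4545 (5/11)
```

Note how the rare descriptor (SH3, frequency 10) carries the largest descriptor score, so missing it costs the candidate more than missing a common descriptor would.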

Each attribute score is then multiplied by the weight assigned to that attribute, and the weighted attribute scores are summed to give the Genes Like Me score (PHS).
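As a rough sketch of this final step, the weighted sum could be written as follows in Python. The function name, the gene names, the attribute scores and the weights are all made up for illustration; the actual weights come from the tool's settings, and nothing here is taken from the GeneCards code.

```python
def genes_like_me_score(attribute_scores, weights):
    """Sum of attribute scores, each multiplied by its attribute weight."""
    # Assumption: every scored attribute has an entry in `weights`;
    # a missing weight raises KeyError rather than being guessed.
    return sum(weights[name] * score for name, score in attribute_scores.items())


# Hypothetical attribute weights.
weights = {"Domains": 2.0, "Sequence paralogy": 1.0, "Expression patterns": 0.5}

# Hypothetical attribute scores for two candidate genes.
candidates = {
    "GENE_A": {"Domains": 0.5, "Sequence paralogy": 1.0, "Expression patterns": 0.2},
    "GENE_B": {"Domains": 0.8, "Sequence paralogy": 0.0, "Expression patterns": 0.6},
}

# Rank candidates by descending score.
ranked = sorted(
    candidates,
    key=lambda gene: genes_like_me_score(candidates[gene], weights),
    reverse=True,
)
print(ranked)  # ['GENE_A', 'GENE_B'] (scores of about 2.1 and 1.9)
```

Raising a weight makes the corresponding attribute dominate the ranking, while a weight of 0 excludes that attribute from the score.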

Table 1


The attributes used in the Genes Like Me algorithm, with their contributing data sources.
Attribute (with its data sources listed beneath)
Sequence paralogy
  • Ensembl
  • HomoloGene
Domains
  • InterPro (Ensembl)
  • Blocks
Super Pathways
  • GeneCards
Expression patterns
  • BioGPS
Phenotypes
  • Mouse Genome Informatics (MGI)
Compounds
  • Tocris Bioscience
  • Human Metabolome Database (HMDB)
  • BitterDB
  • DrugBank
  • Novoseek (formerly Alma Knowledge Server)
  • PharmGKB
  • FDA Approved Drugs
  • DGIdb
  • ClinicalTrials
  • ApexBio
Disorders
  • MalaCards
  • Online Mendelian Inheritance in Man (OMIM)
  • UniProtKB
  • University of Copenhagen DISEASES
  • Novoseek (formerly Alma Knowledge Server)
  • GENATLAS
  • GeneTests (formerly GeneClinics)
  • The Breast Cancer Gene Database (BCGD)
Gene Ontology
  • Entrez Gene (National Center for Biotechnology Information - NCBI)
  • Ensembl
Content