Day-08 - SIFT, GRAIL and STRING

Post date: Jul 11, 2011 3:11:20 PM

I wouldn't put too much emphasis on these, but I did a quick SIFT and GRAIL analysis using only the roughly 476 missense SNPs reported in my 23andMe data (I got these by just loading all the data into PolyPhen-2). SIFT was roughly concordant with PolyPhen-2, and I made a spreadsheet of those SNPs reported by both PolyPhen-2 and SIFT as being deleterious. Some of the results are interesting (as reported in Day 7). The olfactory receptor findings are not surprising, and they stand out when I put these SNPs/genes into STRING (see image below): we see loads of connections between olfactory genes with missense mutations. Note, though, that I did not filter out common mutations here; this is the entire list of 476 SNPs.

For the SIFT and PolyPhen-2 analyses, I sorted by rarest in HapMap CEU first. Genes with <5% CEU frequency that are predicted to be probably damaging by both SIFT and PolyPhen-2 are below:

C13orf26

YSK4 Sps1/Ste20-related kinase homolog

ZKSCAN2 zinc finger with KRAB and SCAN domains 2

KIAA0564

LY6G5B lymphocyte antigen 6 complex, locus G5B

CPN2 carboxypeptidase N, polypeptide 2
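The filtering step behind the list above can be sketched in a few lines of Python. This is only a rough illustration, not the spreadsheet I actually used: the `Snp` record, its field names, and the prediction labels are assumptions, and the example rows are made-up placeholders rather than real calls from my data.

```python
from dataclasses import dataclass


@dataclass
class Snp:
    # Hypothetical record layout; real SIFT/PolyPhen-2 exports use other column names.
    rsid: str
    gene: str
    ceu_freq: float   # HapMap CEU allele frequency, as a fraction between 0 and 1
    sift: str         # assumed labels: "deleterious" or "tolerated"
    polyphen2: str    # assumed labels: "probably damaging", "possibly damaging", "benign"


def rare_and_damaging(snps, max_freq=0.05):
    """Keep SNPs below max_freq that both tools flag, rarest first."""
    hits = [
        s for s in snps
        if s.ceu_freq < max_freq
        and s.sift.lower() == "deleterious"
        and s.polyphen2.lower() == "probably damaging"
    ]
    return sorted(hits, key=lambda s: s.ceu_freq)


# Placeholder rows for illustration only (not real genotype data).
example = [
    Snp("rs_example_1", "GENE_A", 0.02, "deleterious", "probably damaging"),
    Snp("rs_example_2", "GENE_B", 0.30, "deleterious", "probably damaging"),
    Snp("rs_example_3", "GENE_C", 0.01, "tolerated", "probably damaging"),
    Snp("rs_example_4", "GENE_D", 0.004, "deleterious", "probably damaging"),
]

for snp in rare_and_damaging(example):
    print(snp.gene, snp.ceu_freq)
```

GENE_B drops out because it is common, and GENE_C because SIFT calls it tolerated; requiring agreement between both tools is a deliberately conservative choice.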

Now, GRAIL: (a) all SNPs, (b) rare SNPs only (13 SNPs < 5% freq).

(a) results

'olfactory' 'genome' 'mouse' 'subfamilies' 'pseudogenes' 'orthologous' 'receptors' 'intact' 'clusters' 'genes' 'locations' 'receptor' 'families' 'largest' 'repertoire' 'seattle' 'epithelium' 'avenue' 'tags' 'encode'

(b) results

'complement' 'database' 'resource' 'cloning' 'target' 'responds' 'adapted' 'tumorigenic' 'multiplex' 'contains' 'retrieval' 'online' 'biological' 'methodology' 'immediate' 'available' 'project' 'sets' 'proteomics' 'linking'

So nothing is screaming out here. For (b), which is more interesting because the mutations are rare, 2 genes had significant connections:

TOE1 0.0096856492 ZNF687(5), ZKSCAN2(52), VPS72(251), BAT4(615), BAT3(1476), SLC44A4(1655), BAT5(1937), DOM3Z(1979)

HPDL 0.034077166 C13orf26(43), ZNF687(76), ZSWIM5(801), VPS72(936), SLC44A4(1289), BAT4(1496)
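If you want to work with these two rows programmatically, a small parser is enough. The sketch below assumes each row is laid out as gene, p-value, then a comma-separated list of `GENE(number)` entries, and it treats the number in parentheses as a rank; that reading of the GRAIL output is my assumption, and `parse_grail_row` is just a name I made up.

```python
import re


def parse_grail_row(line):
    """Split one GRAIL result row into (gene, p_value, [(related_gene, rank), ...])."""
    gene, p_value, rest = line.split(None, 2)
    # Each related gene is written as NAME(number); the number is assumed to be a rank.
    related = [(name, int(rank)) for name, rank in re.findall(r"([\w.-]+)\((\d+)\)", rest)]
    return gene, float(p_value), related


rows = [
    "TOE1 0.0096856492 ZNF687(5), ZKSCAN2(52), VPS72(251), BAT4(615), "
    "BAT3(1476), SLC44A4(1655), BAT5(1937), DOM3Z(1979)",
    "HPDL 0.034077166 C13orf26(43), ZNF687(76), ZSWIM5(801), VPS72(936), "
    "SLC44A4(1289), BAT4(1496)",
]

parsed = [parse_grail_row(r) for r in rows]

# Genes that appear in the related-gene lists of both significant hits.
shared = set.intersection(*({name for name, _ in related} for _, _, related in parsed))
print(sorted(shared))
```

The overlap between the two lists (BAT4, SLC44A4, VPS72, ZNF687) is easier to spot this way than by eye.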

TOE1: Inhibits cell growth rate and cell cycle. Induces CDKN1A expression as well as TGF-beta expression. Mediates the inhibitory growth effect of EGR1 (specifically expressed in Schwann cells; induced by a wide variety of extracellular stimuli; involved in cell proliferation, macrophage differentiation, synaptic activation and long-term potentiation; coactivated by CREBBP).

HPDL: may have dioxygenase activity (Potential)

Can't immediately relate these to my phenotype, but good to record anyway(!)

We can also do a GRAIL GO analysis, though I wouldn't expect much to come out of this...

I think this gives you a taste of some of the interesting tools out there. STRING is particularly cool, I think, especially the ability to look for interactions between genes harboring potentially interesting mutations.
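For anyone who would rather script the STRING lookup than paste genes into the web form, here is a minimal sketch that only builds a request URL and makes no network call. I did everything above through the website, so treat this as an assumption-laden starting point: it relies on the public STRING REST endpoint of the form `/api/<format>/network`, with identifiers separated by carriage returns and 9606 as the human taxon ID, and `string_network_url` is a hypothetical helper name. Some of the older gene symbols in my list may not resolve without mapping them to current names first.

```python
from urllib.parse import urlencode


def string_network_url(genes, species=9606, fmt="tsv"):
    """Build a STRING network query URL for a list of gene symbols (no request is sent)."""
    # STRING expects multiple identifiers separated by a carriage return (%0D once encoded).
    query = urlencode({"identifiers": "\r".join(genes), "species": species})
    return f"https://string-db.org/api/{fmt}/network?{query}"


url = string_network_url(["TOE1", "HPDL", "ZKSCAN2", "LY6G5B"])
print(url)
```

Opening the printed URL in a browser, or fetching it with any HTTP client, should return the interaction table as tab-separated text.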