Table 1 Summary of RNA-seq data and assembly.

Table 2 Summary of RNA-seq data and assembly per tissue.

RNA-seq quality and functional annotation

We investigated the quality of our tissue-specific transcriptome through a series of similarity-based searches of our transcripts against sequences in the NCBI-nr database. As expected, the single largest category of top blast hits (blastx E-value cut-off 10^-3), corresponding to 25.3% of top blast hits, was to chelicerate protein-coding genes, followed by hits to other arthropod species (4.1%). Within the Arthropoda, hits within Hexapoda represent about 12% (Fig. 1A), while Ixodes scapularis is the species receiving the majority of hits (Fig. 1B).

Figure 1 Macrothele taxonomic distribution.

Overall, 2,619, 3,353 and 776 of the 6,696 identified transcripts have a GO, InterPro, or KEGG associated term, respectively (Table 2); in total, 4,978 of them (74.3%) have some functional annotation information. We analysed the distribution of GO terms (at GO level 2) across the 2,619 M. calpeiana transcript sequences with GO annotation. The most frequent GO terms in this sample are "metabolic" and "cellular processes" within the biological process domain (BP), and "binding" and "catalytic activities" within the molecular function domain (MF). The distribution of GO terms in the complete data set (2,619 GO terms; Fig. 2) and in the data set excluding singleton sequences (1,734 GO terms; Fig. S1) is not significantly different (two-tailed FET, P-value = 0.592 and 0.757 for BP and MF, respectively). Hence, we used the complete data set for further functional annotation analyses.

Figure 2 Distribution of the Gene Ontology (GO) terms associated with the complete set of M. calpeiana transcripts (2,619 transcripts with GO annotations over 6,696 sequences).

Tissue-specific expression

With our subtractive approach we aimed to enrich a number of tissue-specific transcripts.
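As a quick consistency check on the annotation counts reported in the functional annotation paragraph above, the following minimal Python sketch recomputes the share of the 6,696 transcripts carrying each type of annotation. Only the counts come from the text; the names `TOTAL_TRANSCRIPTS`, `annotated`, and `coverage_percent` are illustrative assumptions and not part of the original analysis.

```python
# Counts as reported in the text (see Table 2 of the source).
TOTAL_TRANSCRIPTS = 6696
annotated = {
    "GO": 2619,
    "InterPro": 3353,
    "KEGG": 776,
    "any annotation": 4978,
}


def coverage_percent(count, total=TOTAL_TRANSCRIPTS):
    """Return `count` as a percentage of `total`, rounded to one decimal."""
    return round(100.0 * count / total, 1)


# Percentage of all identified transcripts with each kind of annotation.
coverage = {source: coverage_percent(n) for source, n in annotated.items()}

if __name__ == "__main__":
    for source, pct in coverage.items():
        print(f"{source}: {pct}%")
```

The value computed for "any annotation" should agree with the 74.3% quoted in the text; the per-database percentages are derived here only for orientation and are not stated in the source.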
We detected 1,005 transcripts annotated as housekeeping (HK) genes (Table 2) and 789 transcripts with putative homology to 290 of the 458 members of the CEG dataset. Of the 789 transcripts with CEG homologs, 488 are also annotated as HK genes (Fig. S2 and Tables S3–S5). Although about 15% of the transcripts are HK and CEG genes, the largest proportion of them are located at the intersection of the Venn diagram (Figs. 3C and 3D), indicating that tissue-specific transcripts should reliably represent tissue-specific functions. After excluding these likely ubiquitously expressed genes, the remaining sample (n = 5,390 transcripts; 1,523 with GO annotation) exhibits the desired tissue-specific expression profile. In fact, the distributions of GO terms including (2,619 transcripts) or excluding (1,523 transcripts) HK/CEG genes are significantly different from each other (two-tailed P-value