Of note is the similarity of our results to those of a previous study investigating the performance of different predictive models for mammalian TFs.21 In the earlier study, 26 models were applied to predict binding sites for 66 different mouse TFs. The results indicated that models based on simple mononucleotide PWMs perform similarly to more complex models for most of the mouse TFs examined, but do not perform as well in some cases. Investigation of the underlying biological implications of the performance of these different models will be critical in future studies. In particular, expanding the studies to examine the performance of the MARZ algorithm on additional datasets for Group B TF binding, including in vivo ChIP-seq47 and in vitro PBMs,9 will be informative in further dissecting the key nucleotide interdependencies in the binding sites. Given that the sequences in ChIP peaks often contain binding sites for multiple TFs,48,49 it is also possible that some of the interdependencies detected by the MARZ algorithm represent overlapping sequences from two different binding sites. In future studies, it will therefore be important to explore other salient features of TF binding sites in CRMs,50 including their spatial arrangement,51–53 relative binding affinities,8,30,54 and the biophysical constraints of protein–DNA interactions,55 in combination with the application of gapped n-mer matrix models, in order to further refine overall predictive efficacy.

Methods

Data used

In this study, we investigate 15 TFs prominent in embryonic Drosophila development. These were chosen based on the degree of their characterization, the quality and quantity of the corresponding data, and the range of their spatial expression profiles (Table 1). For each TF, ChIP-chip data were obtained from MacArthur et al.,49 corresponding to the regions of DNA to which the given TF binds.
To reduce potential noise in the data, only the central 100 bp of each ChIP peak is considered. Any ChIP peak shorter than 100 bp is discarded, so all trimmed ChIP peaks are exactly 100 bp long. These trimmed ChIP peaks, combined with aligned sequences from in vivo binding data,45,56,57 are then used as the input to the MARZ algorithm.22

MARZ algorithm

The MARZ algorithm combinatorially analyzes all possible gapped n-mer matrices (where n ≤ 6) for each studied TF.
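The two steps described above, trimming each peak to its central 100 bp and enumerating gapped n-mers, can be illustrated with a minimal Python sketch. This is not the MARZ implementation. The function names, the rule for centering peaks of odd excess length, and the definition of a gapped pattern (n kept positions inside a window, with the first and last window positions always kept) are assumptions made for illustration only, as is the cap of 6 on n.

```python
from itertools import combinations

TRIM_WIDTH = 100  # trimmed peak length in bp, as stated in the text
MAX_N = 6         # assumed upper bound on n for gapped n-mers


def trim_peak(seq, width=TRIM_WIDTH):
    """Return the central `width` bp of a peak, or None if it is shorter.

    ASSUMPTION: when the excess length is odd, the extra base is dropped
    from the right-hand side.
    """
    if len(seq) < width:
        return None  # peaks shorter than the trim width are discarded
    start = (len(seq) - width) // 2
    return seq[start:start + width]


def gapped_patterns(n, span):
    """Yield tuples of n kept offsets inside a window of `span` positions.

    ASSUMPTION (hypothetical definition): the first and last positions of
    the window are always kept, so every pattern covers exactly `span` bp.
    """
    if n < 1 or span < n:
        return
    if n == 1:
        if span == 1:
            yield (0,)
        return
    # Choose the n - 2 interior positions between the two fixed ends.
    for inner in combinations(range(1, span - 1), n - 2):
        yield (0, *inner, span - 1)


def extract_gapped_nmers(seq, pattern):
    """List the gapped n-mers obtained by sliding `pattern` along `seq`."""
    span = pattern[-1] + 1
    return [
        "".join(seq[i + offset] for offset in pattern)
        for i in range(len(seq) - span + 1)
    ]


if __name__ == "__main__":
    peaks = ["ACGT" * 40, "ACGT" * 10]  # toy peaks of 160 bp and 40 bp
    trimmed = [t for t in map(trim_peak, peaks) if t is not None]
    print(len(trimmed), "peak(s) kept after trimming")
    for n in range(2, MAX_N + 1):
        # Hypothetical choice: allow at most two gap positions per pattern.
        count = sum(1 for span in range(n, n + 3)
                    for _ in gapped_patterns(n, span))
        print(f"n={n}: {count} patterns with up to 2 gap positions")
```

In a fuller pipeline, the gapped n-mers extracted from all trimmed peaks would be tallied into count matrices, one per pattern. How MARZ builds and scores those matrices is not specified in this section, so that step is left out of the sketch.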