MSigDB: relating TFs to genes

C3 collection: Motif gene sets

Gene sets representing potential targets of regulation by transcription factors or microRNAs. The sets consist of genes grouped by the short sequence motifs they share in their non-protein-coding regions. The motifs represent known or likely cis-regulatory elements in promoters and 3'-UTRs. These gene sets make it possible to link changes in an expression profiling experiment to a putative cis-regulatory element. The C3 collection is divided into two sub-collections: microRNA targets (MIR) and transcription factor targets (TFT).
 

sub-collection MIR: microRNA targets

These sets consist of genes sharing 7-nucleotide motifs in their 3' untranslated regions. Each 7-mer motif matches (is complementary to) the seed (bases 2 through 8) of a mature human microRNA (miRNA) catalogued in v7.1 of miRBase (October 2005).
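The seed-matching rule above can be sketched in a few lines of Python. This is an illustration only, not the pipeline used to build the MIR sets: the function names (seed_match_7mer, genes_with_motif), the RNA-alphabet input, the exact-substring search, and the toy UTR sequences are all assumptions made for the example.

```python
# Translation table for RNA base pairing (A-U, C-G).
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_match_7mer(mature_mirna: str) -> str:
    """Return the 7-mer expected in a 3' UTR for a mature miRNA.

    The seed is taken as bases 2 through 8 (1-based) of the miRNA, read
    5' to 3'. The target motif is its reverse complement.
    """
    rna = mature_mirna.upper().replace("T", "U")  # accept DNA or RNA letters
    if len(rna) < 8:
        raise ValueError("a mature miRNA needs at least 8 bases")
    seed = rna[1:8]  # 0-based slice covering bases 2..8
    return seed.translate(COMPLEMENT)[::-1]


def genes_with_motif(utrs: dict, motif: str) -> set:
    """Hypothetical helper: genes whose 3' UTR contains the motif exactly."""
    return {
        gene
        for gene, utr in utrs.items()
        if motif in utr.upper().replace("T", "U")
    }


if __name__ == "__main__":
    # Example miRNA sequence, used here purely for illustration.
    motif = seed_match_7mer("UGAGGUAGUAGGUUGUAUAGUU")
    # Toy 3' UTR sequences, invented for this sketch.
    toy_utrs = {"GENE_A": "AAACUACCUCGG", "GENE_B": "GGGGCCCC"}
    print(motif, sorted(genes_with_motif(toy_utrs, motif)))
```

Exact substring matching is the simplest possible choice here; it ignores any further filtering of target sites that the original sets may have applied.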
 

sub-collection TFT: transcription factor targets

These sets consist of genes that share upstream cis-regulatory motifs, which can function as potential transcription factor binding sites. We used two approaches to generate these motif gene sets.
  • Gene sets of "conserved instances" consist of the inferred target genes for each motif m among 174 motifs that are highly conserved in the promoters of four mammalian species (human, mouse, rat and dog). The motifs represent potential transcription factor binding sites and are catalogued in Xie X, Lu J, Kulbokas EJ, Golub TR, Mootha V, Lindblad-Toh K, Lander ES, Kellis M. Systematic discovery of regulatory motifs in human promoters and 3' UTRs by comparison of several mammals. Nature. 2005 Mar 17;434(7031):338-45. Each gene set consists of all human genes whose promoters contain at least one conserved instance of motif m, where a promoter is defined as the non-coding sequence contained within a 4-kilobase window centered at the transcription start site (TSS).
  • Mammalian transcriptional regulatory motifs were extracted from v7.4 of the TRANSFAC database (see the supplementary data of Xie et al.). Each gene set consists of all human genes whose promoters contain at least one conserved instance of the TRANSFAC motif, where a promoter is defined as the non-coding sequence contained within a 4-kilobase window centered at the transcription start site (TSS).
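
The promoter definition shared by both approaches (a 4-kilobase window centered at the TSS) and the "at least one conserved instance" membership rule can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the half-open coordinate convention, and the input layout are hypothetical; the conserved motif instances are taken as already computed; and the restriction to non-coding sequence within the window is not modelled.

```python
def promoter_window(tss: int, width: int = 4000) -> tuple:
    """Half-open interval [start, end) of `width` bases centered at the TSS.

    The start is clamped at 0 so windows near a chromosome start stay valid.
    """
    half = width // 2
    return max(0, tss - half), tss + half


def build_motif_gene_set(tss_by_gene: dict, conserved_hits: list,
                         width: int = 4000) -> set:
    """Genes with at least one conserved motif instance in their promoter.

    tss_by_gene:    gene name -> (chromosome, TSS position)
    conserved_hits: list of (chromosome, position) for one motif; assumed
                    to be pre-filtered for cross-species conservation.
    """
    gene_set = set()
    for gene, (chrom, tss) in tss_by_gene.items():
        start, end = promoter_window(tss, width)
        if any(c == chrom and start <= pos < end for c, pos in conserved_hits):
            gene_set.add(gene)
    return gene_set


if __name__ == "__main__":
    # Toy coordinates invented for this sketch.
    tss_by_gene = {
        "GENE_A": ("chr1", 10000),
        "GENE_B": ("chr1", 50000),
        "GENE_C": ("chr2", 10000),
    }
    hits = [("chr1", 11500), ("chr1", 53000)]
    print(sorted(build_motif_gene_set(tss_by_gene, hits)))
```

The linear scan over hits is fine for a toy example; a genome-scale version would more plausibly index the hits by chromosome and position first.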