
Email

jmzeng1314@163.com

gangli@umac.mo

Acknowledgement

TFmapper was built using the ENCODE datasets and the GEO datasets curated by CistromeDB.

TFmapper

The main purpose of TFmapper is to search all experimental ChIP-seq datasets and identify the trans-acting factors or histone modifications which show peaks at a gene of interest or a specified genomic region in a defined biological sample.

Tutorial

Please watch the tutorial video in HD (1080p) and in full screen for a better viewing experience.

The workflow diagram of TFmapper


1. Peak files in the BED (Browser Extensible Data) format are downloaded from GEO/Cistrome and ENCODE.

2. Peaks are annotated to genomic features (Promoter, TTS, 5’UTR, 3’UTR, Intron, Exon, Intergenic) using the software HOMER with GRCh38/hg38 for human and GRCm38/mm10 for mouse as the reference genomes.

3. The annotation results are stored in a MySQL database. To speed up query processing, the peaks are split by species, source, factor, and chromosome, so that each partition averages a few million rows.

4. For the client side, all the elements on the HTML page are built by R (Shiny), and result tables are created with the DataTables JavaScript library.

5. Results can be downloaded in the CSV or BED format, and peaks can be visualized directly in the WashU Epigenome Browser or the UCSC Genome Browser.
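
The splitting described in step 3 can be illustrated with a minimal sketch. It is not TFmapper's actual code or MySQL schema: the record fields (`species`, `source`, `factor`, `chrom`) and the function names are hypothetical, and plain Python dictionaries stand in for database tables. The point is only that a query touching one species, source, factor, and chromosome needs to scan a single small group rather than every peak.

```python
from collections import defaultdict


def partition_key(peak):
    """Return the (species, source, factor, chromosome) key for one peak record.

    The field names are hypothetical; they are not taken from TFmapper's schema.
    """
    return (peak["species"], peak["source"], peak["factor"], peak["chrom"])


def split_peaks(peaks):
    """Group peak records by partition key so a lookup scans only one group."""
    partitions = defaultdict(list)
    for peak in peaks:
        partitions[partition_key(peak)].append(peak)
    return dict(partitions)


# Made-up example records, for illustration only.
peaks = [
    {"species": "human", "source": "ENCODE", "factor": "CTCF",
     "chrom": "chr1", "start": 100, "end": 300},
    {"species": "human", "source": "ENCODE", "factor": "CTCF",
     "chrom": "chr2", "start": 500, "end": 700},
    {"species": "mouse", "source": "GEO", "factor": "H3K27ac",
     "chrom": "chr1", "start": 50, "end": 250},
]
partitions = split_peaks(peaks)
```

Looking up `partitions[("human", "ENCODE", "CTCF", "chr1")]` then returns only the peaks in that group.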



To search:

Users can query the database by 1) gene symbol or 2) genomic coordinates, using GRCh38/hg38 for human and GRCm38/mm10 for mouse.
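
As a rough illustration of a coordinate-based search, the sketch below parses a region string and keeps the peaks that overlap it. The `chrom:start-end` input format, the peak field names, and both function names are assumptions made for this example; the exact input format accepted by TFmapper is not specified here. Intervals are treated as half-open.

```python
import re


def parse_region(text):
    """Parse a 'chrom:start-end' string (assumed format) into (chrom, start, end)."""
    match = re.fullmatch(r"\s*(chr[\w.]+):([\d,]+)-([\d,]+)\s*", text)
    if match is None:
        raise ValueError(f"cannot parse region: {text!r}")
    chrom = match.group(1)
    # Thousands separators such as "1,000" are tolerated and stripped.
    start = int(match.group(2).replace(",", ""))
    end = int(match.group(3).replace(",", ""))
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, start, end


def overlaps(peak, region):
    """True if the peak shares at least one base with the half-open region."""
    chrom, start, end = region
    return peak["chrom"] == chrom and peak["start"] < end and peak["end"] > start


region = parse_region("chr1:1,000-2,000")
# Made-up peaks, for illustration only.
candidates = [
    {"chrom": "chr1", "start": 900, "end": 1100},   # overlaps the region
    {"chrom": "chr1", "start": 2000, "end": 2100},  # starts where the region ends
    {"chrom": "chr2", "start": 1000, "end": 1500},  # different chromosome
]
hits = [peak for peak in candidates if overlaps(peak, region)]
```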



Results table:

  • SampleID : Curated from GEO, CistromeDB, or ENCODE
  • Factors : The name of the trans-acting factor or histone modification
  • Visualization : Links to the UCSC Genome Browser and the WashU Epigenome Browser
  • Sequence : Lets the user retrieve the nucleotide sequence of the selected peak
  • Distance : From the summit of the selected peak to the promoter of the gene of interest
  • Score : Fold enrichment score calculated by MACS2
  • -log10(p value) : -log10(p value) calculated by MACS2
  • -log10(q value) : -log10(q value) calculated by MACS2
  • Attribute : Seven genomic features based on RefSeq annotations were assigned to each peak:
    • 1. Promoter: -1kb to +100bp of transcription start site (TSS)
    • 2. TTS region: -100 bp to +1kb of transcription termination site (TTS)
    • 3. Exon
    • 4. 5' UTR (untranslated region) Exon
    • 5. 3' UTR (untranslated region) Exon
    • 6. Intron
    • 7. Intergenic regions
  • Title : The title of the selected sample from GEO
  • Source Name : The detailed description of the selected sample from GEO
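
The Promoter and TTS windows listed under Attribute can be made concrete with a small sketch. This is not HOMER's implementation: it only checks the two distance windows quoted above, measured in the direction of transcription, and it assumes the window boundaries are inclusive. The function names and the example coordinates are hypothetical. Assigning the remaining features (exon, UTR, intron, intergenic) requires a full gene model and is not attempted.

```python
def signed_offset(position, anchor, strand):
    """Offset of position from anchor, positive in the direction of transcription."""
    return position - anchor if strand == "+" else anchor - position


def window_label(position, tss, tts, strand):
    """Return 'Promoter', 'TTS', or None using the windows quoted above.

    Promoter: -1 kb to +100 bp of the TSS.
    TTS:      -100 bp to +1 kb of the TTS.
    Inclusive boundaries are an assumption of this sketch.
    """
    if -1000 <= signed_offset(position, tss, strand) <= 100:
        return "Promoter"
    if -100 <= signed_offset(position, tts, strand) <= 1000:
        return "TTS"
    return None


# Hypothetical gene on the "+" strand: TSS at 10,000 and TTS at 20,000.
upstream = window_label(9500, tss=10000, tts=20000, strand="+")
in_gene_body = window_label(15000, tss=10000, tts=20000, strand="+")
past_the_end = window_label(20500, tss=10000, tts=20000, strand="+")
# Same gene body on the "-" strand: the TSS is now the higher coordinate.
minus_strand = window_label(20500, tss=20000, tts=10000, strand="-")
```

On the minus strand, a position at a higher coordinate than the TSS lies upstream of the gene, which is why the offset is flipped before the windows are tested.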


Direct download links

Users can download the results table as a CSV or BED file.

  • The CSV file contains all of the fields in the search results.
  • The BED file contains six fields:
    • Chrom: The name of the chromosome
    • chromStart: The starting position of the feature in the chromosome.
    • chromEnd: The ending position of the feature in the chromosome.
    • Factors: Name of the trans-acting factor or histone mark
    • Score: Fold enrichment score calculated by MACS2
    • Strand: "+" strand only
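
The six-field layout above can be sketched as a tab-separated line writer. The dictionary keys, the function name, and the two-decimal score formatting are assumptions made for illustration, not details of TFmapper's export. The UCSC BED convention uses 0-based, half-open coordinates; whether the exported files follow it exactly is not stated here.

```python
def to_bed_line(peak):
    """Format one peak record as a six-field, tab-separated BED line.

    Field order follows the list above: chrom, chromStart, chromEnd,
    Factors, Score, Strand. The strand is always "+".
    """
    fields = [
        peak["chrom"],
        str(peak["start"]),
        str(peak["end"]),
        peak["factor"],
        f"{peak['score']:.2f}",  # two decimals is an arbitrary choice for this sketch
        "+",
    ]
    return "\t".join(fields)


# Made-up record, for illustration only.
example = {"chrom": "chr1", "start": 100, "end": 300, "factor": "CTCF", "score": 8.5}
line = to_bed_line(example)
```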


To visualize multiple peaks:

When multiple peaks are selected, a link for visualization in the WashU Epigenome Browser will appear (red arrow 1).


Advanced Searching

The results can be further filtered by typing in the boxes (figure 1, red arrow 2). Clicking the box under Distance brings up a slider, which can be moved to narrow down the region.



Back to home page

Go back to the search results by clicking the Home button at the top left corner of the page.