# MySQL vs MongoDB

### Introduction

Relational databases led the field for decades, and during that time the choice was fairly obvious: MySQL, Oracle, or MS SQL, to name a few. They have served as the basis for countless enterprise applications, but modern apps demand more diversity and scalability. Non-relational databases such as MongoDB emerged to meet these requirements and to replace the existing relational environment.

This post originally appeared on DA-14 website. Read the ...

Aside from the occasional somatic mutation, the genome of every cell in an individual’s body is largely preserved. Yet different types of cells (and tissues, and organs) are incredibly diverse. The majority of that specialization is governed by epigenetic changes — histone modifications, DNA accessibility, and methylation — that influence when and how genes are expressed.

Our knowledge of the epigenome has lagged well behind our knowledge of the genome, partly because it’s been difficult to study. The application of next-gen sequencing to RNA libraries (RNA-Seq),...

# MSigDB relate TFs to Genes

## collection: Motif gene sets

Gene sets representing potential targets of regulation by transcription factors or microRNAs. The sets consist of genes grouped by short sequence motifs they share in their non-protein coding regions. The motifs represent known or likely cis-regulatory elements in promoters and 3'-UTRs. These gene sets make it possible to link changes in an expression profiling experiment to a putative cis-regulatory element. The C3 collection is divided into two sub-collections: microRNA targets (MIR) and transcription factor targets (TFT).
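
One common way to link a list of changed genes to a motif gene set is an over-representation test. The sketch below is only an illustration of that idea, not a method prescribed by MSigDB: the function names, the gene identifiers (`G1`, `G2`, ...) and the set names (`MOTIF_X`, `MOTIF_Y`) are all hypothetical, and the hypergeometric test is an assumed choice of statistic.

```python
from math import comb


def hypergeom_pvalue(universe_size, set_size, hits_size, overlap):
    """P(X >= overlap) when hits_size genes are drawn without replacement
    from a universe in which set_size genes belong to the gene set."""
    upper = min(set_size, hits_size)
    total = comb(universe_size, hits_size)
    tail = sum(
        comb(set_size, k) * comb(universe_size - set_size, hits_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def rank_motif_sets(changed_genes, motif_sets, universe):
    """Return (set name, overlap size, p-value) tuples, smallest p-value first."""
    universe = set(universe)
    changed = set(changed_genes) & universe
    results = []
    for name, genes in motif_sets.items():
        members = set(genes) & universe
        overlap = len(changed & members)
        p = hypergeom_pvalue(len(universe), len(members), len(changed), overlap)
        results.append((name, overlap, p))
    return sorted(results, key=lambda r: r[2])


# Hypothetical toy data: 8 genes measured, 3 of them changed in the experiment.
universe = {f"G{i}" for i in range(1, 9)}
motif_sets = {
    "MOTIF_X": {"G1", "G2", "G3", "G4"},
    "MOTIF_Y": {"G5", "G6", "G7", "G8"},
}
ranking = rank_motif_sets({"G1", "G2", "G3"}, motif_sets, universe)
```

Here the changed genes all fall inside `MOTIF_X`, so that set ranks first. A real analysis would also correct for testing many gene sets at once.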

# chromHMM

Core 15-state model

...
| State no. | Mnemonic | Description | Color name | Color code |
|---|---|---|---|---|

# dbNSFP

Database type:          variant
Number of records:      89,617,785
Distinct variants:      84,484,850
Reference genome hg18:  chr, hg18_pos, ref, alt
Reference genome hg19:  chr, pos, ref, alt

Field:                  chr
Type:                   string
Comment:                Chromosome number
Missing entries:        0
Unique Entries:         24

Field:                  pos
Type:                   integer
Comment:                physical position on the chromosome as to hg19 (1-based coordinate)
Missing entries:        0
Unique Entries:...
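
Because `pos` is a 1-based coordinate, it often has to be converted before it can be compared with 0-based, half-open intervals (the convention used by BED files, for example). The sketch below assumes a variant is identified by the hg19 fields listed above (`chr`, `pos`, `ref`, `alt`); the `VariantKey` type and the helper names are hypothetical and are not part of dbNSFP.

```python
from typing import NamedTuple


class VariantKey(NamedTuple):
    chrom: str  # the `chr` field, kept as a string (e.g. "1", "X")
    pos: int    # the `pos` field, 1-based
    ref: str    # reference allele
    alt: str    # alternate allele


def make_key(chrom, pos, ref, alt):
    """Build a key from raw text fields, casting the position to int."""
    return VariantKey(str(chrom), int(pos), ref, alt)


def to_zero_based_half_open(key):
    """Convert the 1-based position to a 0-based, half-open [start, end)
    interval that covers the reference allele."""
    start = key.pos - 1
    return start, start + len(key.ref)


snv = make_key("1", "100", "A", "G")
interval = to_zero_based_half_open(snv)  # covers exactly one base
```

A named tuple is hashable, so keys built this way can be used directly as dictionary keys when joining annotations from different sources.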

# C4A

C4A is part of the “complement” group. The term complement refers to the ability to kill bacteria and contribute to immune defenses. However, an excess of complement proteins can cause tissue damage and trigger an allergic reaction. C4A is an activation protein, which means it also activates the other complement proteins, raising their levels. The C3a, C4a, and C5a components are referred to as anaphylatoxins: they cause smooth muscle contraction, histamine release from mast cells, and enhanced vessel permeability. They also mediate inflammation and the generation of free ...

# Variant Consequences

For each variant that is mapped to the reference genome, we identify each Ensembl transcript that overlaps the variant. We then use a rule-based approach to predict the effects that each allele of the variant may have on the transcript. The set of consequence terms, defined by the Sequence Ontology (SO), that can currently be assigned to each combination of an allele and a transcript is shown in the table below. Note that each allele of each variant may have a different...
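
The first two steps, finding the overlapping transcripts and forming every allele and transcript combination, can be sketched as follows. This is a minimal illustration under stated assumptions, not Ensembl's implementation: the `Transcript` and `Variant` types, the identifiers `TX1` to `TX3`, and the 1-based inclusive coordinates are all hypothetical, and the rules that assign SO terms are left out.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Transcript:
    transcript_id: str
    start: int  # 1-based, inclusive (assumed convention)
    end: int    # 1-based, inclusive


@dataclass(frozen=True)
class Variant:
    start: int      # 1-based, inclusive
    end: int        # 1-based, inclusive
    alleles: tuple  # alternate alleles, e.g. ("A", "T")


def overlapping_transcripts(variant, transcripts):
    """Keep transcripts whose interval shares at least one base with the variant."""
    return [
        t for t in transcripts
        if t.start <= variant.end and variant.start <= t.end
    ]


def allele_transcript_pairs(variant, transcripts):
    """Every (allele, transcript) combination that a rule set would then classify."""
    return list(product(variant.alleles, overlapping_transcripts(variant, transcripts)))


transcripts = [
    Transcript("TX1", 100, 200),
    Transcript("TX2", 150, 300),
    Transcript("TX3", 400, 500),
]
variant = Variant(start=160, end=160, alleles=("A", "T"))
pairs = allele_transcript_pairs(variant, transcripts)
```

With two alleles and two overlapping transcripts, `pairs` holds four combinations, each of which could receive its own consequence term.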