Gene Expression: Microarray

Gene expression is a key determinant of cellular phenotypes. Microarrays have been the workhorse of gene expression studies for over a decade because they can probe the expression of many thousands of transcripts simultaneously. RNA-seq has many advantages over hybridization-based microarrays (see the RNA-seq Services page), but it is not yet a mature technology: while new standards are being developed, its library preparation biochemistry, sequencing platforms, computational pipelines, analysis methods, and statistical treatment are all changing rapidly. The microarray community, by contrast, has benefited enormously over the last decade from collaborating with bioinformatics and statistics experts to identify and account for the biases and statistical issues of microarrays.


The core currently offers the following bioinformatics analyses for microarray experiments using the Affymetrix GeneChip system:

  • Accession and analysis of publicly available data (e.g. GEO, ArrayExpress).
  • Formatting and uploading data to GEO.
  • Preprocessing: background correction, quantile normalization, and summarization using the RMA (Robust Multichip Average) expression measure described in Irizarry et al. (2003), Biostatistics 4:249-264.
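
To illustrate the quantile normalization step named above, here is a minimal sketch in plain Python. It is not the core's pipeline (which runs in R/Bioconductor) and it ignores the background correction and summarization steps of RMA. The function name `quantile_normalize`, the toy intensities, and the simple tie handling are assumptions made for the example.

```python
from statistics import mean

def quantile_normalize(arrays):
    """Quantile-normalize equal-length lists of log2 intensities (one list per array)."""
    n_probes = len(arrays[0])
    # Sort each array, then average across arrays at each rank
    # to build the shared reference distribution.
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [mean(s[r] for s in sorted_arrays) for r in range(n_probes)]

    normalized = []
    for a in arrays:
        # Rank the probes within this array; ties are broken by original
        # position, a simplification compared with production implementations.
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized

# Toy log2 intensities for three hypothetical arrays and four probes.
arrays = [
    [5.0, 2.0, 3.0, 4.0],
    [4.0, 1.0, 4.5, 2.0],
    [3.0, 4.0, 6.0, 8.0],
]
normalized = quantile_normalize(arrays)
```

After normalization every array has the same set of values, so differences in overall intensity distribution between arrays are removed while the rank order of probes within each array is preserved.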

Quality assessment:

  • Visualization of signal intensity distributions of each array using boxplots and density plots.
  • MA plots to visualize log intensity ratios (M) against average log intensities (A).
  • Principal components analysis to visualize the overall data (dis)similarity between arrays.
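
As a small illustration of what an MA plot displays, the sketch below computes the M and A coordinates for two arrays of log2 intensities. The function name `ma_values` and the example numbers are hypothetical; in practice one array is often compared against a pseudo-median reference array rather than against a single other array.

```python
def ma_values(x, y):
    """Return (M, A) pairs for two equal-length lists of log2 intensities.

    M is the log ratio (difference of log2 intensities);
    A is the average log2 intensity.
    """
    return [(xi - yi, (xi + yi) / 2) for xi, yi in zip(x, y)]

# Two hypothetical arrays measured on the same two probes.
array_1 = [8.0, 10.0]
array_2 = [6.0, 10.0]
points = ma_values(array_1, array_2)
```

For well-normalized arrays, the cloud of points is expected to be centered on M = 0 across the range of A.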


Differential expression:

  • Estimation of fold changes and standard errors using a linear model.
  • Empirical Bayes smoothing of standard errors.
  • Lists of top differentially expressed genes with fold changes, statistical significance, and multiple testing correction.
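
The page does not say which multiple testing correction is applied. As one common example, the sketch below implements the Benjamini-Hochberg false discovery rate adjustment in plain Python. The function name `bh_adjust` and the p-values are made up for illustration, and this is not presented as the core's actual method.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down to the smallest, scaling each by
    # m / rank and taking a running minimum so the result is monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical raw p-values for four genes.
raw = [0.01, 0.04, 0.03, 0.005]
adj = bh_adjust(raw)
```

Genes whose adjusted p-value falls below a chosen threshold (for example 0.05) would be reported as differentially expressed at that false discovery rate.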


Visualization:

  • Heatmaps and dendrograms.
  • Volcano plots to visualize statistical significance against fold change.

Biological context – See Pathway/Functional Analysis page.



Q: What software packages do you use in your microarray analysis pipeline?

The core uses a combination of custom-built and open-source application-specific software in the R statistical computing environment, including the arrayQualityMetrics, affy, and limma packages from Bioconductor.

Q: What about microarray platforms other than Affymetrix?

We will analyze data from any vendor and any platform. Analysis may take longer for obscure or custom platforms that lack a Bioconductor annotation package. Please read the About page and submit a consultation request form.

Q: How much will this cost me?

Bioinformatics costs are separate from microarray costs and are the same whether the data are generated here at the DNA Sciences Core or by an external vendor. Please read the About page and submit a consultation request form.