DRC Genomics and Epigenetics Core
The DRC Genomics and Epigenetics Core is a state-of-the-art facility that supports high-throughput genomic approaches. Microarray studies, high-throughput sequencing, and data acquisition and analysis require expensive instrumentation and reagents, as well as a highly skilled team of experts in specific components of the overall procedure. These technologies therefore lie beyond the scope of most individual laboratories. The DRC Genomics and Epigenetics Core, which comprises the Institute for Genomic Medicine (IGM) and VA/VMRF Microarray & NGS Facilities at UCSD, has been instrumental in providing DRC investigators access to these technologies. Both facilities have been in operation since 1999. These technologies enable investigators to identify signal transduction pathways, uncover new targets for therapeutic intervention, and reveal biomarkers for disease detection and prognosis. Genomics services are available to the DRC community at discounted rates.
1. High Throughput Sequencing Technologies: Illumina HiSeq2000, Roche’s GS FLX
2. Support for DNA library preparation and validation for High Throughput Sequencing (including RNA-seq, miRNA-seq, ChIP-seq)
3. Expression microarray technology: Affymetrix, Agilent, NimbleGen, and Illumina
4. MicroRNA analysis using Agilent, Affymetrix, and Exiqon microarray platforms
5. Bioinformatics support for assistance with experimental design and data analysis.
Chris Glass, Ph.D.
Kristen Jepsen, Ph.D.
IGM Genomics Center
Nicholas Webster, Ph.D.
VA Genechip Core
Sequencing Library Preparation
PacBio RS II SMRT Sequencing
VA/VMRF Microarray & NGS Core
Affymetrix and Agilent Micro Arrays
(858) 552-8585 x7100
DERC Transcriptional Genomics Core Capabilities
1. High Throughput Sequencing Technologies
The Core contains two Illumina HiSeq2000 instruments in addition to a Roche GS FLX+ System. The Illumina HiSeq2000 is based on the massively parallel sequencing of millions of fragments using a proprietary clonal single molecule array technology coupled to a novel reversible terminator-based sequencing chemistry.
One advantage of the HiSeq2000 is that two flow cells can be run simultaneously, independent of the run type, which increases sample throughput. For short sequence reads, the approach has proven highly robust and accurate. Applications in whole-genome association studies, expression analysis, and sequencing, in addition to genome-wide location studies, have been reported. Read lengths are currently up to 100 bp, and both single-end and paired-end sequencing are supported. The Core currently runs 50 bp single reads and 100 bp paired-end reads. Read lengths are expected to increase to >125 bp this year.
The GS FLX+ System combines long reads, exceptional accuracy and high throughput, making the system well suited for larger genomic projects. The GS FLX System has been used extensively for genomic discoveries in over 1,000 peer-reviewed publications. Our current read lengths are 400-500 bp with over 1.2 million reads. The system is being upgraded in June 2012 to allow 800-1000 bp reads. A broad selection of gaskets and Multiplex Identifiers (MIDs) enables multiple libraries to be sequenced in a single run. Roche’s GS Data Analysis software allows for rapid de novo genome assembly, reference mapping, and amplicon variant analysis.
Popular applications of the Illumina high-throughput, short-read sequencing technology include DNA-seq (whole exome), RNA-seq (transcriptome sequencing, expression profiling), miRNA-seq (small-RNA sequencing) and ChIP-seq. ChIP-sequencing, also known as ChIP-seq, has replaced earlier approaches based on microarray technology to analyze protein interactions with DNA. The approach combines chromatin immunoprecipitation (ChIP) with massively parallel DNA sequencing to identify binding sites of DNA-associated proteins. It can be used to precisely and cost-effectively map global binding sites for any protein of interest, and it does not depend on the availability of high-density tiling microarrays.
The Roche long-read GS FLX sequencing technology is particularly suitable for metagenomic studies of diversity in environmental microbial communities, such as those in the human gut and mouth, soil, coral reefs, deep sea thermal vents, and drinking water; analysis of viral load and the identification of rare viral variants in patient samples; analysis of Ig and TCR rearrangements in populations of T and B cells; candidate gene sequencing in patient cohorts; and complete sequencing of small-genome microbes. A combination of the Illumina high-density short reads with the Roche long-read scaffolds is recommended for whole-genome shotgun assembly of larger genomes.
2. Support for DNA library preparation and validation for High Throughput Sequencing (including RNA-seq, miRNA-seq, ChIP-seq)
The Core enables DERC investigators to advance the field by carrying out experiments using massively parallel sequencing technologies. The Core removes the technical, economic and informatics barriers that normally exist for entry to these technologies. One of the major bottlenecks in sample preparation for next-gen sequencing is library construction. Generation of unbiased and appropriately size-selected sequencing libraries requires technical skill and is essential for the success of all high-throughput sequencing experiments. This service is a critical function of both the BIOGEM and VA/VMRF Cores and removes the burden from the investigator. In addition to library construction, the Cores provide library validation services prior to DNA sequencing. In the case of ChIP-seq experiments, it is highly recommended that users perform a conventional ChIP assay using a known target gene as a positive control to ensure that enrichment has occurred.
Candidate gene amplification is also available using a Fluidigm Access Array System in the VA/VMRF Core. This system allows the simultaneous amplification of 48 targets from 48 samples in a microfluidic flow cell, providing 2304 PCR products for sequencing. This minimizes the cost of PCR reagents for sample prep. The system also allows the addition of MID tags so that samples can be multiplexed in a single sequencing run. The system can produce libraries suitable for sequencing on either the Illumina HiSeq or Roche GS FLX platforms.
Exome enrichment from individual patient samples can be performed in the VA/VMRF Core using either solution (Agilent or NimbleGen) or array (NimbleGen) enrichment. The enriched DNA provides a cheaper alternative to whole genome sequencing. It is suitable for sequencing on the Illumina HiSeq instrument but is not recommended for the Roche GS FLX.
For library validation, Agilent Bioanalyzer profiles are run after DNA fragmentation, end-repair/A-tailing, adapter ligation and PCR amplification. This provides an estimate of insert sizes and helps eliminate libraries that contain excessive adapter sequence beyond the insert. Q-PCR is used to titrate the library to ensure that adequate quantities of the library are sequenced to provide maximal depth of coverage.
3. Expression microarray technology: Affymetrix, Agilent, NimbleGen, and Illumina microarray platforms for analysis of large scale gene expression.
Microarray technology remains a viable and cost-effective means of analyzing transcriptomic changes. Commercial microarray platforms have been in existence for over a decade and the technology is mature and robust. The platforms each have different strengths and weaknesses; thus, optimum results are achieved by tailoring platforms to specific biological questions. Four microarray platforms are supported by the Transcriptional Genomics Core: Affymetrix and NimbleGen (at the VA/VMRF Microarray & NGS Core) and Agilent and Illumina (at BIOGEM).
On the Affymetrix platform, 25-mer oligonucleotide arrays are synthesized in situ on silicon chips using photolithographic masking techniques adapted from the semiconductor industry. One advantage of this system is that the fabrication method allows very high-density arrays to be produced. The current whole genome mRNA profiling chips have a feature size of 11 µm, allowing 1.4 million different oligonucleotides to be synthesized on a 1.6 cm² silicon chip. Another advantage of the single-color Affymetrix system is that all of the labeling, hybridization, scanning, and data extraction are standardized, so data generated by different cores can be readily compared.
The NimbleGen microarray platform consists of high-density arrays of isothermal long oligos synthesized in situ using digital-light processing photolithographic techniques. Comparison of probe performance versus probe length has shown that longer probes (45-85mer) provide superior signal over shorter (<30mer) probes. The advantage of the novel chemistry is that long oligonucleotides can be synthesized efficiently on a slide substrate. The superior specificity of long-oligo probes negates the need for mismatch probes. Each probe is optimized for GC content and length to ensure that its thermal hybridization properties are identical to those of the other probes on the array. As well as single arrays, these slides are offered in a multiplex option as a cost-effective solution for high sample throughput, enabling statistical analysis of replicates on the same slide. NimbleGen offers a 4-plex format (4x72K) with 72,000 probes per array and a 12-plex format (12x135K) with 135,000 probes per array. Users can hybridize four or twelve independent samples, or run replicate samples on a single slide and average the data for increased statistical confidence. For more detail please visit
Agilent microarrays rely on the in situ synthesis of 60-mer probes at or near the surface of the microarray slide by ink jet printing using phosphoramidite chemistry. Three features of Agilent microarrays make them a preferred choice for some applications. First, Agilent microarrays are typically run as ‘two-color’ experiments in which control RNA is labeled with one dye (e.g., Cy5) and test RNA is labeled with another dye (e.g., Cy3), allowing control and test samples to be directly compared by hybridization on the same array. As a consequence, fewer biological replicates are required to identify genes exhibiting significant changes in an experiment. Second, because Agilent microarrays are synthesized in situ, user-designed microarrays can be fabricated for relatively minimal additional setup costs. Third, the longer 60-mers are more tolerant of sequence mismatches, and are thus more suitable for the analysis of highly polymorphic regions. For more detail please visit
The Illumina Bead Array platform has two features that make it a preferred choice for some investigators. First, this system utilizes 50-mer oligonucleotides that are assembled at high redundancy (i.e. each probe is represented on an array on average at least 30 times). Second, the manufacturing process has enabled significant price reductions compared to other commercially available platforms. Dynamic range appears to be compressed in comparison to other platforms, but the precision of measurement resulting from the high level of probe redundancy enables this technology to be similar to other platforms with respect to capturing differentially regulated genes. For more detail please visit
4. MicroRNA analysis using Agilent, Invitrogen NCode, Affymetrix and Exiqon microarray platforms
Since their discovery in 1993, small non-coding RNAs (microRNA, scaRNA, and snoRNA) have emerged as a major component in the regulatory circuitry that underlies the development and physiology of complex organisms. As a result, it is becoming increasingly important to complement messenger RNA (mRNA) gene expression studies with miRNA analysis to understand the biological context of differentially expressed genes. These key functional gene products are estimated to regulate approximately 30 percent of all protein-coding genes. MicroRNAs (miRNAs) in particular have generated considerable interest in recent years and have been shown to regulate gene expression, thereby regulating important aspects of development and metabolism. Several unique physical attributes of miRNAs, including their small size, lack of poly-adenylated tails, and tendency to bind their mRNA targets with imperfect sequence homology, have made them elusive and challenging to study. Four commercial platforms are supported by the Transcriptional Genomics Core: Agilent, Invitrogen, Affymetrix and Exiqon. As with expression arrays, optimum results are achieved by tailoring platforms to specific biological questions.
Agilent miRNA profiling offers four advantages. First, the system offers five orders of magnitude of linear dynamic range. Second, it permits detection of miRNAs in amounts as low as 10 zmol (~6000 molecules). Third, Agilent miRNA array content is updated routinely to include the latest information from the Sanger miRBase database. Fourth, Agilent offers array customization. The BIOGEM Core is fully equipped to process Agilent microarrays.
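The sensitivity figure quoted above (10 zmol, roughly 6000 molecules) can be checked with a short unit conversion. The following is a minimal sketch; the function name `zmol_to_molecules` is hypothetical and chosen only for illustration.

```python
# Convert an amount in zeptomoles to an absolute number of molecules.
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI definition)
ZEPTO = 1e-21             # one zeptomole expressed in moles


def zmol_to_molecules(zmol: float) -> float:
    """Return the number of molecules in `zmol` zeptomoles."""
    return zmol * ZEPTO * AVOGADRO


if __name__ == "__main__":
    # 10 zmol corresponds to about 6000 molecules.
    print(round(zmol_to_molecules(10)))
```

The same conversion can be used to translate other detection limits quoted in molar units into approximate copy numbers.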
Life Technologies (Invitrogen) has developed the NCode™ Multi-species miRNA Microarray. Individual miRNA samples can be analyzed using Alexa Fluor® 3 and Alexa Fluor® 5 Capture Reagents enabling efficient profiling of miRNA expression patterns in various types of tissue, disease and developmental states, providing insight into their role in gene regulation.
Affymetrix GeneChip miRNA Arrays provide the most sensitive, accurate, and complete measurement of small non-coding RNA transcripts involved in gene regulation. These are the only arrays with 100 percent miRBase coverage of all organisms, along with snoRNA and scaRNA probe sets and probe sets unique to pre-miRNA hairpins. The very high density of the GeneChip arrays allows all miRNAs in all mammalian species to be synthesized on the same array.
The advantage of the Exiqon array is the use of LNA probes that show greater specificity and hybridization. Locked Nucleic Acid (LNA) is a conformationally restricted nucleic acid analogue, in which the ribose ring is "locked" with a methylene bridge connecting the 2’-O atom and the 4’-C atom. LNA nucleotides containing the six common nucleobases (T, C, G, A, U and mC) can form base-pairs with their complementary nucleotides according to standard Watson-Crick base pairing rules. LNA oligonucleotides can be used with great advantage in any application which demands high hybridization specificity with short target sequences.
The VA/VMRF Microarray & NGS Core is fully equipped to process Affymetrix, Life Technologies and Exiqon microarrays.
5. Bioinformatics support for assistance with experimental design and data analysis.
The DERC Transcriptional Genomics Core can provide bioinformatics support for experimental design and data analysis for sequencing and microarray experiments. Services for experimental design will include power analysis based on estimates of sample variance, biological effect, and technical characteristics of specific microarray platforms. Microarray and RNA-seq data generated by the Core are provided via Samba or FTP data transfer through the Core websites. For most users, the ‘primary’ data will consist of normalized intensity signals in the form of tab-delimited text files. These data can then be imported by the user into any one of a number of established microarray analysis programs.
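As a rough illustration of the kind of power analysis mentioned above, the sketch below estimates how many biological replicates per group are needed to detect a given expression change. It is not the Core's actual procedure. It assumes a two-group comparison of log2 expression values, equal variance in both groups, and a normal approximation to the two-sided test. The function name `samples_per_group` and the example values are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(sd: float, log2_fold_change: float,
                      alpha: float = 0.05, power: float = 0.8) -> int:
    """Replicates per group for a two-sided, two-group comparison.

    Normal approximation: n = 2 * ((z_(1-alpha/2) + z_power) * sd / delta)^2
    `sd` is the per-group standard deviation of log2 expression and
    `log2_fold_change` is the smallest effect worth detecting.
    """
    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)
    z_power = standard_normal.inv_cdf(power)
    n = 2 * ((z_alpha + z_power) * sd / log2_fold_change) ** 2
    return ceil(n)


if __name__ == "__main__":
    # Assumed inputs: sd of 0.5 log2 units, two-fold change (1 log2 unit).
    print(samples_per_group(sd=0.5, log2_fold_change=1.0))
    # A stricter per-gene threshold, as used when many genes are tested.
    print(samples_per_group(sd=0.5, log2_fold_change=1.0, alpha=0.001))
```

The normal approximation tends to underestimate the required sample size when the number of replicates is small, so the result is best read as a lower bound. Tightening `alpha` to account for testing thousands of genes raises the replicate count.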
The Cores offer access to a number of software packages for the analysis of microarray data. We currently have licenses to Agilent’s GeneSpring, DNA ArrayStar, GeneGo’s MetaCore and Biotique’s X-ray. Most of these are user-friendly and support analyses and visualizations including Venn diagrams, scatter plots, heat maps and line graphs for clustering, gene ontology trees, pathway enrichment, disease association, and alterations in RNA splicing.
For sequencing data, the Cores offer access to the Illumina Genome Studio software. This software suite has modules for DNA sequencing, RNA sequencing, ChIP sequencing, genotyping, gene expression, and methylation. For the long-read Roche sequence data, three GS software solutions are available for analyzing target region data, depending on the sequencing methods and experimental design. GS Genome Assembly software allows de novo assembly of reads into contigs and genomes. GS Amplicon Variant Analysis software aligns amplicon reads against a reference sequence, accurately detects and quantifies known variants in complex pools, and discovers novel variants. GS Reference Mapper software aligns reads to any reference genome and identifies genomic variations including SNPs, insertions, deletions and structural variations.
Basic data analysis is provided. In addition, the Cores provide more sophisticated bioinformatic services that typically start with false discovery rate analysis, which yields lists of genes exhibiting significant changes in expression (RNA analysis) or significant enrichment (ChIP-chip analysis). These lists are then used for secondary analysis of functionally enriched GO terms and pathways. This analysis is available at a recharge rate of $120/h.
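The false discovery rate step described above can be sketched as follows. The text does not say which procedure the Cores use, so this example assumes the widely used Benjamini-Hochberg adjustment. The gene names and p-values are hypothetical and serve only to show the mechanics.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical per-gene p-values from a differential expression test.
    genes = ["geneA", "geneB", "geneC", "geneD", "geneE"]
    p_values = [0.001, 0.008, 0.039, 0.041, 0.6]
    q_values = benjamini_hochberg(p_values)
    # Keep genes whose adjusted p-value is below a 5% FDR threshold.
    significant = [g for g, q in zip(genes, q_values) if q < 0.05]
    print(significant)
```

With these assumed inputs, only the first two genes pass the 5% threshold: the third and fourth have raw p-values below 0.05 but adjusted values just above it. The resulting list is the kind of input that would then feed the GO term and pathway enrichment analysis.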
The Transcriptional Genomics Core provides services to DERC investigators at a discounted rate. The recharge rates for genomic services performed by the Core are illustrated below. DERC users will obtain a 10% discount from these rates. Users are responsible for microarray purchase. Investigators typically order arrays with delivery designated to the core facility.