Sequencing

The past few weeks have been very busy. I was due to run my first sample set on the sequencer in January, but after a last-minute dropout by another grad student I was bumped up to the end of November/start of December run. Running the Illumina HiSeq (the sequencer) is an expensive undertaking, so to make it economical we fill the 8 sequencing lanes with samples from 8 different researchers, spreading the cost across projects. The trouble is that if somebody drops out, a replacement must be found quickly, otherwise the run is delayed and everybody suffers. So, with only four days' notice, I began to prepare my DNA libraries.

To sequence Restriction site Associated DNA (RAD) tags from many individuals at the same time and in the same lane, we need to make sure that the sequence reads from each sample are individually identifiable during the later analyses. To do this we follow the protocol of Peterson et al. (2012) [1], adapted by Dr Kim Andrews (Hawai’i Institute of Marine Biology, University of Hawai’i). For me this involved taking 72 samples after enzyme digestion, splitting them into 6 groups of 12, and ligating one of 12 unique barcodes to each sample within a group. Samples from different locations were randomised across the 6 groups. The samples in each group were then pooled, and one of 6 unique indices was added to each pool. This gives two levels of identification (index, then barcode) and allows us to pull out individual samples after all 6 pools are combined to make the final library.
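For anyone who prefers to see the two-level tagging written out, here is a rough sketch in Python. It is only an illustration of the idea, not the protocol itself: the sample names, the function name `assign_tags` and the numeric barcode/index labels are all made up for the example (real barcodes and indices are short DNA sequences), and the shuffle is a plain random one rather than anything balanced by location.

```python
import random
from itertools import product

N_BARCODES = 12  # barcodes ligated to individual samples within a group
N_INDICES = 6    # one index added to each pooled group


def assign_tags(sample_ids, n_barcodes=N_BARCODES, n_indices=N_INDICES, seed=0):
    """Give every sample a unique (index, barcode) pair (hypothetical helper)."""
    if len(sample_ids) > n_barcodes * n_indices:
        raise ValueError("more samples than barcode/index combinations")
    rng = random.Random(seed)
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)  # spread samples randomly across the pools
    # Every (index, barcode) combination, numbered from 1 for readability.
    combos = product(range(1, n_indices + 1), range(1, n_barcodes + 1))
    return {combo: sample for combo, sample in zip(combos, shuffled)}


# Placeholder sample names; 72 samples fill all 6 x 12 combinations.
samples = [f"sample_{i:02d}" for i in range(1, 73)]
layout = assign_tags(samples)

# A read carrying index 3 and barcode 7 maps back to exactly one sample.
print(layout[(3, 7)])
```

The point is simply that 6 indices multiplied by 12 barcodes gives 72 distinct combinations, so only 18 different tags are needed to tell 72 samples apart.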

This sounds simple enough, but there are quite a few more quality control and quantification steps involved, and doing all of this in a rush can be difficult. The whole process can be completed in 3 days if all goes perfectly, although I would advise taking longer. In any case, I had problems with pool 5 and pool 6, and despite 12- and 13-hour stints in the lab trying to recover these pools we eventually had to drop pool 5 altogether. However, we managed to get our completed library to the sequencing facility in time (just!) and we have just had our preliminary data back. My PhD Christmas present is 233 million reads of DNA. That should keep me busy in the new year!

The Illumina HiSeq next-gen sequencer at DBS Genomics

 

  1. Peterson, B. K., Weber, J. N., Kay, E. H., Fisher, H. S. & Hoekstra, H. E. Double Digest RADseq: An Inexpensive Method for De Novo SNP Discovery and Genotyping in Model and Non-Model Species. PLoS ONE 7(5), e37135 (2012).