Learn Metagenomics

Last spring I taught an introductory course on metagenomics at Princeton University. The lecture notes and lab exercises are now available online here. Check them out if you’re looking to learn more about metagenomics!


The Week of Mars: Is the red planet habitable?

The question of whether life can (or could) survive on Mars came to the forefront of popular media this month with NASA’s announcement of water currently flowing on Mars, Matt Damon “science-ing the sh!t out” of the red planet in the Hollywood release of The Martian, and Google’s adorable doodle of Mars drinking a large gulp of water. Together, these events have led to an increased interest in the habitability of Mars and, Continue reading

Plants in Space!

Today, Robert Ferl gave a seminar on space biology at the Princeton Environmental Geology and Geochemistry Seminar, and it was so cool that I wanted to post about it here! Robert is an expert on how plants respond to changes in gravity and has led experiments to study how Arabidopsis responds to parabolic flight campaigns (aka a vomit comet) and to life on the International Space Station. Now, it is pretty clear that zero g is probably one of the coolest “field sites” you could work at, but what really got me excited was the amount of knowledge we can gain from such a small plant and the sophisticated methods that were being used.

Continue reading

Sequencing Depth and World Cup Paninis

The World Cup is underway and I am back to the blog! I was recently inspired by the World Cup Paninis to write a post about sequencing depth. In sequencing, the problem we often face is whether enough sequences have been generated to be representative of a population. The tools we often use to determine whether we have sampled enough are rarefaction curves and the Chao estimate, but how many sequences would we need to generate in order to capture a 16S from every organism present in an environment? Continue reading
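One way to build intuition for the question above is the classic coupon collector problem, of which filling a sticker album is the textbook example. The sketch below is only an illustration under a strong assumption: every organism is equally abundant and every read yields exactly one usable 16S sequence. Real communities are uneven, so the number it returns is best read as an optimistic floor. The function names (`expected_reads_uniform`, `chao1`) are made up for this example. The second function uses the bias-corrected form of the Chao1 richness estimate, which stays defined even when no taxon is seen exactly twice.

```python
from collections import Counter


def expected_reads_uniform(n_taxa: int) -> float:
    """Expected number of reads needed to see each of n_taxa at least once.

    Assumes all taxa are equally abundant (coupon collector): n * H_n,
    where H_n is the n-th harmonic number.
    """
    return n_taxa * sum(1.0 / k for k in range(1, n_taxa + 1))


def chao1(counts) -> float:
    """Bias-corrected Chao1 richness estimate from per-taxon read counts.

    S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)), where F1 and F2 are the numbers
    of taxa observed exactly once and exactly twice.
    """
    observed = [c for c in counts if c > 0]
    freq = Counter(observed)
    f1, f2 = freq[1], freq[2]
    return len(observed) + f1 * (f1 - 1) / (2 * (f2 + 1))


if __name__ == "__main__":
    # Hypothetical community sizes, chosen only for illustration.
    for n in (10, 100, 1000):
        print(n, round(expected_reads_uniform(n)))
    # Hypothetical per-taxon counts from a small sample.
    print(chao1([1, 1, 2, 5, 10]))
```

Under the uniform assumption the expected number of reads grows roughly as n ln n, so the last few rare organisms dominate the cost, which is the same reason the last few stickers in an album are the hardest to get.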

De Bruijn Graph Assembly

When our lab got its first metagenomic dataset, the first thing we did was upload our QC-filtered and merged paired-end Illumina reads (mean length 160 bp) onto MG-RAST for annotation. However, when the annotations came back, some organisms whose genomes were known to be present in both our sample and the m5nr reference dataset were missing and, for those sequences that were annotated, the assigned e-values centered around 1e-10. In order to improve the annotation of our data, we decided to perform an assembly. Searching the literature, I found that a class of assemblers, called De Bruijn Graph assemblers, was the popular choice for assembly of short read metagenomic data; however, the intuition behind how these assemblers worked was a little less clear. Continue reading
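As a rough aid to that intuition, here is a toy sketch of the core idea: break reads into k-mers, link each k-mer's (k-1)-base prefix to its (k-1)-base suffix, and read a contig off an unbranched path. This is not how any particular assembler is implemented; it ignores sequencing errors, reverse complements, coverage, and nodes with several incoming edges. The function names and the example reads are invented for illustration.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Build a De Bruijn graph: nodes are (k-1)-mers, edges are distinct k-mers."""
    graph = defaultdict(set)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            # Each k-mer is an edge from its prefix to its suffix.
            graph[kmer[:-1]].add(kmer[1:])
    return graph


def walk_unbranched(graph, start):
    """Extend a contig from start while the current node has one outgoing edge."""
    contig = start
    node = start
    seen = {start}
    while len(graph.get(node, ())) == 1:
        nxt = next(iter(graph[node]))
        if nxt in seen:  # stop rather than loop forever on a cycle
            break
        contig += nxt[-1]  # each step adds one new base
        seen.add(nxt)
        node = nxt
    return contig


if __name__ == "__main__":
    # Two hypothetical overlapping reads.
    reads = ["ATGGC", "TGGCA"]
    graph = build_de_bruijn(reads, k=3)
    print(walk_unbranched(graph, "AT"))
```

The point of the structure is that overlapping reads never have to be compared pairwise: reads that share k-mers simply land on the same edges, and the overlap falls out of the graph.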

Alignment with Bowtie (and Bowtie2)

For those who only know a bow tie as something worn by hipsters or really fancy people: Bowtie is a very powerful bioinformatics tool with a diverse array of applications. Most often I use Bowtie to map RNA transcripts back to a known genome; however, you can also use Bowtie to assess how well your assembly performed, or in any instance where you want to find how many of your high-throughput sequences map back to a [longer] sequence or genomes of interest.

What makes Bowtie special is that it requires little RAM (it can easily run on your laptop) and is very fast, or as the creators of Bowtie declare, ultrafast (aligning more than 25 million reads to the human genome in 1 CPU hour). Continue reading

What we know so far…


A bacterial biofilm growing on nutrient rich fracture water 1.3 km below the surface in Beatrix Gold Mine, South Africa. NOTE: This image does not represent the bacterial communities I sample.

In order to study life kilometers below the Earth’s surface, subterraneauts travel underground through deep mine shafts around the globe. These scientists collect and analyze fracture waters that have been locked away for thousands of years, completely removed from the sun. The deepest and most well-studied mines are located in South Africa. Scientists who study these deep sites use a “prawn” or similar device (see below) attached to a borehole to sample water that hides meters beyond the mine’s walls.

One of the most recognized discoveries of deep subsurface research is the unprecedented identification of “an ecosystem of one”. Here, scientists performed a metagenomic study on a 2.8 km deep fracture water community and found that a novel bacterium, Candidatus Desulforudis audaxviator, accounted for >99.9% of the microbial community (Chivian et al., 2008). In order to survive on its own, the genome of D. audaxviator reveals Continue reading