Exploring MIQE-Like Standards for qPCR-Based Molecular Detection of eDNA

In the previous blog, we briefly touched on the MIQE standards: guidelines published to promote minimum standards for reporting qPCR data, such as data from clinical studies in which qPCR is used to detect small relative changes in gene expression within cells or tissues, or to quantify the amount of pathogen(s). As much of contemporary environmental DNA (eDNA) work is conducted using qPCR, it is only sensible that similar standards be developed for eDNA work, and that these standards should, it will be argued, extend beyond what is acceptable for clinical applications.

This somewhat reverses scientific convention, as clinical applications normally require an elevated burden of evidence to support any conclusions drawn from experimental data. However, because of the ecology of eDNA, its distribution in natural systems with myriad potential sources of generation and decay, and the hierarchical nature of its sampling, I believe eDNA presents a special case in which more rigorous standards should be applied than in clinical settings, so that we can have confidence in our results.

What are the chief differences between using qPCR to detect changes in gene expression or viral load, for instance, and using qPCR to detect eDNA? The figure below outlines the workflow for A) a gene expression qPCR analysis alongside B) a generic eDNA workflow. In A), an investigator has a sample of tissue, within which nucleic acid content is guaranteed. Imagine that they want to test the expression level of a gene that is hypothesized to play a positive role in alleviating stress in plants subject to an environmental stressor. In such studies, expression levels are compared against so-called reference genes, which are always expressed, so that any changes in the target gene can be assessed after normalization of expression levels. Although plants in a control plot should show little expression of the target gene, its transcripts would still be expected, albeit at reduced levels, in the plant tissue subject to nucleic acid extraction. In this case the extracted nucleic acid is messenger RNA (mRNA), which is converted to complementary DNA (cDNA) in a process called reverse transcription, so that the target is amenable to DNA-based qPCR. As such, one would expect to see a lot of cDNA if it were visualized using standard laboratory gel techniques. There will always be much more, and more stable, expression of the reference gene’s mRNA and, by extension, cDNA. In short: there is a reliable source of cDNA, which forms the template for the qPCR assays for both the target gene and the reference gene(s).

Generic workflows for conducting A) gene expression qPCR and B) eDNA qPCR

Consider, then, environmental DNA. Leaving aside the complex ecology of eDNA (the subject of another blog post, but see the excellent review by Barnes and Turner [2]), eDNA in a water body is largely ephemeral and present at much lower concentrations, with no guarantee that sampling will entrain DNA molecules or cellular debris into sampling tubes or onto filter papers. There is a significant source of observational error at this stage, which is largely absent in gene expression studies, although both workflows share procedural errors that can impact downstream qPCR success. Depending on factors including the volume of water sampled and the pore size of the filters, the amount of total eDNA collected may vary substantially; even so, if visualized on a gel, an eDNA extract is likely to contain far less target than cDNA derived from tissue-extracted mRNA.

Performing qPCR for either A) or B) requires taking a portion of the total cDNA or total eDNA extract for use as template in the reactions. The probability of subsampling this extract and failing to detect the target is higher for eDNA, owing both to the rarity of the target in the water sample (it may not have been collected at all) and to the relative rarity of the target molecule in the soup of total eDNA molecules. For eDNA we therefore have two sampling uncertainties, arising from the two sampling events (of the water body and of the total eDNA extract), which together increase the underlying statistical error.
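To make the two sampling events concrete, here is a minimal Python sketch. It assumes, purely for illustration, that target copies are Poisson-distributed at both stages, that capture and extraction are perfectly efficient, and that a single copy in the reaction is enough to be counted. The function names and all the numbers are hypothetical and do not come from any particular protocol.

```python
import math


def p_at_least_one(mean_copies: float) -> float:
    """Poisson probability that a draw contains one or more target copies."""
    return 1.0 - math.exp(-mean_copies)


def two_stage_detection(copies_per_litre: float, litres_sampled: float,
                        extract_ul: float, template_ul: float) -> tuple[float, float]:
    """Return (P(water sample holds >= 1 copy), P(qPCR reaction holds >= 1 copy)).

    Assumes perfect capture and extraction, so every copy in the sampled
    water ends up in the extract; real efficiencies are lower.
    """
    # Stage 1: sampling the water body.
    mean_in_sample = copies_per_litre * litres_sampled
    # Stage 2: subsampling the extract as qPCR template.
    mean_in_reaction = mean_in_sample * (template_ul / extract_ul)
    return p_at_least_one(mean_in_sample), p_at_least_one(mean_in_reaction)


# Hypothetical numbers: 5 copies/L, 1 L filtered, 100 uL extract, 5 uL template.
p_sample, p_reaction = two_stage_detection(5.0, 1.0, 100.0, 5.0)
print(f"P(copy in water sample) = {p_sample:.3f}")
print(f"P(copy in one reaction) = {p_reaction:.3f}")
```

With these made-up inputs the water sample almost certainly contains the target (about 0.99), yet any single reaction has only about a 0.22 chance of receiving a copy. The second sampling event, not the first, dominates the risk of a missed detection in this scenario.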

Bottom line: eDNA work needs to adopt more sensitive and, arguably, more specific assays than clinical applications. More specific? Well, many genes arise by way of duplication and subsequent divergence, but these events are localized to gene families. So when developing a qPCR assay for a gene with a number of evolutionarily similar orthologs and paralogs, the nucleotide divergence between these genes and all others in the genome is huge, and thus the risk of non-specific amplification of a non-target gene is attenuated. When using an eDNA marker, by contrast, that same stretch of DNA (e.g., the COI locus) is present in all non-target species, as it has been inherited from a common ancestor. Evolution will increase the number of nucleotide differences between species as generations pass, but more recently diverged taxa may not display enough between-species variation to design an appropriate assay; see the previous blog for an in-depth treatment of assay design and target specificity.

In the next blog, I shall discuss exactly how we estimate LOD and LOQ, further adopting MIQE standards, and how we optimize an assay so that it is rigorous, repeatable and reproducible. To do so, we need not only to optimize the assay for its target, but also to identify and account for factors that introduce variance into the system, most nefariously PCR inhibitors. I shall describe how we use MIQE-like guidance and synthetic internal positive control elements to determine the reliability of eDNA results. In future posts, we shall also discuss how the ecology of eDNA and assay performance in pilot trials can be used to optimize detection (and potentially quantification) in targeted eDNA studies.

[1] Bustin et al. (2009). The MIQE guidelines: minimum information for publication of quantitative real-time PCR experiments. Clinical Chemistry, 55, doi.org/10.1373/clinchem.2008.112797.

[2] Barnes and Turner (2016). The ecology of environmental DNA and implications for conservation genetics. Conservation Genetics, 17, 1-17.

[3] Doi et al. (2017). Environmental DNA analysis for estimating the abundance and biomass of stream fish. Freshwater Biology, 62, doi.org/10.1111/fwb.12846.

[4] Nevers et al. (2018). Environmental DNA (eDNA): a tool for quantifying the abundant but elusive round goby (Neogobius melanostomus). PLoS One, 13, doi.org/10.1371/journal.pone.0191720.

[5] Evans et al. (2016). Quantification of mesocosm fish and amphibian species diversity via eDNA metabarcoding. Molecular Ecology Resources, 16, doi.org/10.1111/1755-0998.12433.

[6] Hunter et al. (2016). Detection limits of quantitative and digital PCR assays and their influence in presence-absence surveys of eDNA. Molecular Ecology Resources, 17, doi.org/10.1111/1755-0998.12619.

[7] Forootan et al. (2017). Methods to determine limit of detection and limit of quantification in quantitative real-time PCR (qPCR). Biomolecular Detection and Quantification, 12, 1-6.

Detecting eDNA – The Importance of Assay Specificity and Sensitivity Part I: An Introduction to the MIQE Guidelines and DNA Sequence Database Generation & Curation for qPCR Assay Design

MIQE – a Brief Introduction

In 2009, Bustin and colleagues[1] codified a set of minimum requirements for the publication of quantitative real-time polymerase chain reaction (qPCR) data, primarily for use in clinical medicine and molecular biology, as a means of ensuring rigorous datasets that enable investigators to quantify, for instance, subtle changes in the expression of particular genes or in viral load. In gene expression studies, fractional changes in the production of intracellular messenger molecules called mRNAs, which convey the genetic information encoded in genes towards its final, proteinaceous form, need to be detected in order to test crucial hypotheses of physiological, medical and even evolutionary and ecological import. Bustin et al. set out the minimum information that must be reported to ensure continued robust detection of nucleic acids at low concentrations. These standards were termed the MIQE guidelines: Minimum Information for Publication of Quantitative Real-Time PCR Experiments.

We will discuss these guidelines and how they compare with eDNA standards in future posts, but for now we will highlight the chief difference between the studies that invoke MIQE and those that involve eDNA quantification using qPCR: concentrations of eDNA, depending upon the circumstances in which it is collected, are likely to be several orders of magnitude lower than the mRNA levels observed in many expression studies or the absolute levels of viral genomes present within tissues. Consequently, we need to modify the MIQE standards so that eDNA surveys are protected from accusations of lax standards that may result in significant Type I and Type II error (false positives and false negatives, respectively). Like ancient DNA (aDNA) before it, eDNA detection has to raise these standards given the much more ephemeral nature of the target molecules in contemporary natural systems.

The Importance of Specificity and Sensitivity

In common with aDNA and biomedical applications of qPCR, eDNA detection depends vitally on two touchstones of molecular detection: specificity (how likely is the assay to detect a non-target, which leads to Type I error if the assay is not wholly specific to the target(s)?) and sensitivity (how likely is the assay to detect extremely low concentrations of the target species, where failure to do so, if unquantified, leads to Type II error?)[2]. The very first step in minimizing these potential sources of error is to design extremely robust assays in silico, which depends crucially on having substantial genomic data from target and sympatric (co-distributed) non-target organisms with which to design highly discriminative assays. Genomic information is therefore paramount; one must collect and curate a significant body of genetic sequence information.
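As a rough illustration of what an in silico check can look like, the Python sketch below counts mismatches between a candidate primer and pre-aligned binding sites from target and non-target sequences. This is an assumption-laden toy, not the design pipeline described in these posts: the sequences are invented, the helper names `count_mismatches` and `screen_primer` are hypothetical, the two-mismatch threshold is arbitrary, and each site is assumed to be supplied in the same orientation as the primer.

```python
def count_mismatches(primer: str, site: str) -> int:
    """Count positions at which a primer differs from an aligned binding site."""
    if len(primer) != len(site):
        raise ValueError("primer and binding site must be aligned to equal length")
    return sum(p != s for p, s in zip(primer.upper(), site.upper()))


def screen_primer(primer, target_sites, nontarget_sites, min_nontarget_mismatches=2):
    """Summarise how well a primer separates target from non-target sequences."""
    # Within-species variation: the worst-matching target haplotype matters most.
    worst_target = max(count_mismatches(primer, s) for s in target_sites.values())
    # Between-species divergence: the closest non-target sets the specificity.
    closest_nontarget = min(count_mismatches(primer, s) for s in nontarget_sites.values())
    return {
        "max_target_mismatches": worst_target,
        "min_nontarget_mismatches": closest_nontarget,
        "passes": worst_target == 0 and closest_nontarget >= min_nontarget_mismatches,
    }


# Made-up 12-base sequences, for illustration only.
primer = "ACGTTGCAAGCT"
targets = {"target_haplotype_1": "ACGTTGCAAGCT", "target_haplotype_2": "ACGTTGCAGGCT"}
nontargets = {"sympatric_congener": "ACGATGCTAGTT"}
report = screen_primer(primer, targets, nontargets)
print(report)
```

In this invented case the primer is three mismatches away from the non-target, which helps specificity, but it also mismatches the second target haplotype, so the screen fails. That is the kind of sensitivity problem that only shows up when within-species variation has been sampled and curated.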

Information is Key – An Evolutionary and Population Genetics Rationale for Data Generation and Curation

How much genomic information is enough? And how can we account for high levels of within-species genetic variation, and low levels of genomic sequence divergence between closely related sister and/or cryptic species? These are all excellent questions, and each needs to be answered satisfactorily; otherwise one cannot in good conscience publish or make available a generic assay for the target species in question.

Coda

In this post, we have broadly discussed gathering and curating data for designing a species-specific assay to be generally deployed across a species’ range, or at least in the populations from which sequence data were generated. Here at PBI, we are also developing novel ways to tease apart even extremely closely related species with low levels of nucleotide variation. In a similar vein, we also hope to design population-specific assays that may be able to target individual evolutionarily significant units (ESUs) and management or conservation units (M/CUs)[9].

In the next blog, we shall develop further the required minimal standards of a reliable, robust and reputable eDNA qPCR assay, and how these requirements extend and embellish those of the MIQE guidelines. We shall treat in some detail the concepts LOD (limit of detection) and LOQ (limit of quantification) and ask: what are the crucial assay requisite parameters (CARP) for designing, optimizing and validating eDNA assays?

[1] Bustin et al. (2009). The MIQE guidelines: minimum information for publication of quantitative real-time PCR experiments. Clinical Chemistry, 55, doi: 10.1373/clinchem.2008.112797.

[2] Lahoz-Monfort et al. (2015). Statistical approaches to account for false positive errors in environmental DNA samples. Molecular Ecology Resources, 16, doi: 10.1111/1755-0998.12486.

[3] Hale et al. (2012). Sampling for microsatellite-based population genetic studies: 25 to 30 individuals per population is enough to accurately estimate allele frequencies. PLoS One, doi: 10.1371/journal.pone.0045170.

[4] Luo et al. (2018). Biparental inheritance of Mitochondrial DNA in humans. Proceedings of the National Academy of Sciences, doi: 10.1073/pnas.1810946115

[5] Bensasson et al. (2001). Mitochondrial pseudogenes: evolution’s misplaced witnesses. Trends in Ecology and Evolution, 16, 314-321.

[6] Smith & Smith (1996). Synonymous nucleotide divergence: what is “saturation”? Genetics, 142, 1033-1036.

[7] Felsenstein J (2003). Inferring Phylogenies, Sunderland (Sinauer).

[8] Crête-Lafrenière et al. (2012). Framing the Salmonidae family phylogenetic portrait: a more complete picture from increased taxon sampling. PLoS One, doi: 10.1371/journal.pone.0046662.

[9] Palsbøll et al. (2006). Identification of management units using population genetic data. Trends in Ecology and Evolution, 22, 11-16.

Using Environmental DNA (eDNA) to Monitor Aquatic Ecosystems

Environmental DNA monitoring – eDNA – is at the vanguard of a new wave of technologically advanced monitoring efforts. With roots in soil and paleoecology, eDNA was first used to detect a multicellular aquatic organism – the invasive American bullfrog Lithobates catesbeianus – in a landmark scientific paper by Ficetola et al. in 2008[1]. By applying the widely used, yet highly sensitive, polymerase chain reaction (PCR) – a reaction that ‘amplifies’ a specific DNA sequence in a sample only if the species’ DNA is present, in however small an amount – Ficetola and colleagues detected the frogs’ DNA in sediment filtered out of the water column. To understand why this works, a current working definition of eDNA, adopted by most ecologists, is illuminating: eDNA is “…genetic material obtained directly from environmental samples (soil, sediment, water, etc.) without any obvious signs of biological source material” (Thomsen & Willerslev 2015[2]). Examples of how eDNA is shed by an organism – e.g., here a midland painted turtle Chrysemys picta – are illustrated in the figure below.

Figure: Examples of how eDNA is shed by an organism, here a midland painted turtle (Chrysemys picta)

In the figure, DNA-containing cells are constantly or periodically shed from the internal linings of the turtle’s gut and reproductive system, through regurgitation of food, the replacement of skin cells and mucus, the egress of waste materials, and the release of sex cells (i.e., sperm and eggs). Once in the water, DNA is somewhat protected within cells. Eventually, cells are broken down and DNA is released into the aqueous environment, where it is found in solution. Although eDNA is depleted through a number of biological, chemical and physical processes, it will keep being replenished as long as the organism is still living in the vicinity. That is to say, the eDNA signal will be stronger the closer it is to its source, and will also increase when there are more individuals in a local population, if the volume of water remains the same. Therefore, an eDNA signal can be a reliable indicator of the target species’ presence in a given habitat.
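The balance between replenishment and depletion can be sketched with a very simple model. The Python snippet below assumes, as a simplification not taken from the studies cited here, a well-mixed water body in which each individual sheds eDNA at a constant rate and eDNA is lost by first-order decay, so that the concentration settles where shedding equals loss. The function name and every parameter value are hypothetical, and the model ignores distance from the source.

```python
def steady_state_copies_per_litre(n_individuals: int, shed_per_hour: float,
                                  decay_per_hour: float, volume_litres: float) -> float:
    """Equilibrium eDNA concentration when shedding balances first-order decay.

    Solves dC/dt = (n * shed) / V - decay * C = 0 for C.
    """
    return n_individuals * shed_per_hour / (decay_per_hour * volume_litres)


# Hypothetical pond: 1,000,000 L, each animal sheds 1,000 copies per hour,
# and 10% of the eDNA present is lost per hour.
few = steady_state_copies_per_litre(10, 1000.0, 0.1, 1e6)
many = steady_state_copies_per_litre(20, 1000.0, 0.1, 1e6)
print(f"10 individuals: {few:.2f} copies/L")
print(f"20 individuals: {many:.2f} copies/L")
```

Under these assumptions, doubling the number of individuals in the same volume of water doubles the expected concentration, which matches the qualitative argument above. Real systems depart from this because of flow, patchy distributions and variable shedding rates.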

What are the chief benefits of eDNA monitoring? First of all, there is no need to physically sample the species, thus minimising the disturbance associated with the targeted monitoring of live creatures that are sensitive to stress. Furthermore, as only water samples are taken, and by few people, the environmental footprint of the monitoring effort itself is reduced. Because PCR is a highly sensitive molecular biological assay, eDNA surveys tend to be much better at detecting rare or cryptic species than conventional methods (e.g., Schmelzle & Kinziger 2016[3]). As a result, species-specific eDNA surveys are also potentially much more cost-effective than conventional techniques. In addition, anyone can take a water sample by following simple instructions, which democratises monitoring and facilitates citizen science projects across the globe. Indeed, the current crop of companies that offer eDNA detection services are predicated on a model in which water samples are collected by lay and technical personnel for processing in a central laboratory.

However, as eDNA is a nascent technology, uncertainty exists over some of the conclusions drawn from early eDNA studies. These issues (i.e., sources of ‘error’) are under ongoing scrutiny by scientists, including here at Precision Biomonitoring, with the aim of minimising their impacts. As such, all potential sources of error, as they are currently understood, must be acknowledged and incorporated into any technical development (i.e., design, production and validation of species-specific PCR assays) or standard operating protocols for field surveys. For example, organisms are liable to move throughout their lifetimes, and seasonality has been shown to be a strong factor in eDNA detection success (de Souza et al. 2016[4]). It is imperative that surveys are conducted with a thorough knowledge of a species’ ecology, including insight into current distributions and habitat preferences; otherwise inadequate surveying will lead to a false negative result, i.e., inferring that a target is absent when it is actually present but undetected. Failure to account for false negatives can result in severe financial repercussions if infrastructure projects are subsequently halted, put on hold or abandoned due to the rediscovery of the target by an intrepid ecologist or member of the public. False negatives can also result from improper assay development, the underestimation of within-species genetic diversity at PCR amplification sites, and the currently disjointed process by which eDNA samples are handled by the majority of eDNA practitioners.

As noted previously, eDNA will decay if left exposed to natural processes. Collected eDNA is therefore at risk of post-sampling decay, as there is no mechanism for eDNA replenishment in the collection vessel; this reduces the eDNA signal and may result in a failure to obtain a positive PCR result. The risk of eDNA degradation during sampling, particularly on hot, sunny days, and in transit from the field to the laboratory is therefore highly significant. Inappropriate storage may also destroy eDNA (e.g., ice crystal formation during freezing may ‘shred’ DNA molecules). To compound matters further, despite the best efforts of contemporary laboratories, there remains a significant risk of false positive PCR results mediated by the transportation of aerosolised DNA particles among labs within buildings through ventilation pathways. Most eDNA practitioners seek to physically separate the processing of eDNA samples (e.g., filter papers or precipitated water samples) from downstream PCR detection, but even that is far from foolproof.
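To give a feel for how quickly a signal could erode once replenishment stops, the following Python sketch treats post-sampling decay as simple exponential loss with a fixed half-life. Both the exponential form and the 24-hour half-life are assumptions chosen for illustration; real decay rates depend on temperature, light, preservation method and the sample itself.

```python
def fraction_remaining(hours_elapsed: float, half_life_hours: float) -> float:
    """Fraction of the original eDNA left after a period with no replenishment."""
    if half_life_hours <= 0:
        raise ValueError("half_life_hours must be positive")
    return 0.5 ** (hours_elapsed / half_life_hours)


# Hypothetical half-life of 24 hours in an unpreserved sample.
for hours in (2, 24, 48):
    print(f"after {hours:>2} h: {fraction_remaining(hours, 24.0):.2f} of the eDNA remains")
```

Under this assumed half-life, a sample that spends two days in transit retains a quarter of its starting eDNA, which could push an already rare target below what a single reaction is likely to pick up.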

Here at Precision Biomonitoring, we are set to unveil a state-of-the-art platform that will seek to eliminate, or minimise, these sources of eDNA analytical and sampling error, through the eradication of transit stages to a central laboratory and the application of standard operating procedures. Moreover, we will further the cause of a democratised biomonitoring field in which no technical specialty is required to conduct sophisticated PCR-based species-specific assays. Our system, using bespoke PCR assays, will yield PCR eDNA results in real-time (< 2 hours from water sampling to PCR read-out), which can then be immediately disseminated to colleagues via the cloud.

It is our aim to give those working at the coalface of biodiversity monitoring (from professional ecologists to local citizen science projects) the power to conduct highly rigorous, and potentially highly coordinated, targeted eDNA surveys to better safeguard our world’s biodiversity heritage for all generations to come.

[1] Ficetola et al. (2008). Species detection using environmental DNA from water samples. Biology Letters, 4, 423-425.

[2] Thomsen & Willerslev (2015). Environmental DNA – An emerging tool in conservation for monitoring past and present biodiversity. Biological Conservation, 183, 4-18.

[3] Schmelzle & Kinziger (2016). Using occupancy modelling to compare environmental DNA to traditional field methods for regional-scale monitoring of an endangered aquatic species. Molecular Ecology Resources, 16, 895-908.

[4] de Souza et al. (2016). Environmental DNA (eDNA) detection probability is influenced by seasonal activity of organisms. PLoS One, doi: 10.1371/journal.pone.0165273