Abstract
Chimeras are a frequent artifact of the polymerase chain reaction and can be an underlying cause of erroneous taxonomic identifications, overestimated microbial diversity, and spurious sequences. However, little is known about regional effects on chimera formation. We therefore investigated chimera formation rates in different regions of phylogenetically important biomarker genes to test for such regional effects. An empirical study of chimera formation rates was performed using the Roche GS FLX™ system with sequences of the V1/V2/V3 and V4/V5 regions of the 16S rRNA gene and sequences of the nifH gene from a mock microbial community. The chimera formation rates for the 16S V1/V2/V3 region, the V4/V5 region, and the nifH gene were 22.1–38.5%, 3.68–3.88%, and 0.31–0.98%, respectively. Some amplicons from the V1/V2/V3 region were shorter than the typical length (~7–31%), reflecting incomplete extension. Conserved and hypervariable regions were identified in both the V1/V2/V3 and V4/V5 regions. Chimeric hot spots were located in parts of the conserved regions near the ends of the amplicons. The 16S V1/V2/V3 region had the highest chimera formation rate, likely because of long template lengths and incomplete extension. The amplicons of the nifH gene had the lowest frequency of chimera formation, most likely because of variation at the wobble positions of their triplet codons. Our results suggest that the main reasons for chimera formation are sequence similarity and premature termination of DNA extension near primer regions. Other housekeeping genes can be a good substitute for 16S rRNA genes in molecular microbial studies to reduce the effects of chimera formation.