MCPtaggR: R package for accurate genotype calling in reduced representation sequencing data by eliminating error-prone markers based on genome comparison

研究成果査読

抄録

Reduced representation sequencing (RRS) offers cost-effective, high-throughput genotyping platforms such as genotyping-by-sequencing (GBS). RRS reads are typically mapped onto a reference genome. However, mapping reads harbouring mismatches against the reference can potentially result in mismapping and biased mapping, leading to the detection of error-prone markers that provide incorrect genotype information. We established a genotype-calling pipeline named mappable collinear polymorphic tag genotyping (MCPtagg) to achieve accurate genotyping by eliminating error-prone markers. MCPtagg was designed for the RRS-based genotyping of a population derived from a biparental cross. The MCPtagg pipeline filters out error-prone markers prior to genotype calling based on marker collinearity information obtained by comparing the genome sequences of the parents of a population to be genotyped. A performance evaluation on real GBS data from a rice F2 population confirmed its effectiveness. Furthermore, our performance test using a genome assembly that was obtained by genome sequence polishing on an available genome assembly suggests that our pipeline performs well with converted genomes, rather than necessitating de novo assembly. This demonstrates its flexibility and scalability. The R package, MCPtaggR, was developed to provide functions for the pipeline and is available at https://github.com/tomoyukif/MCPtaggR.

本文言語English
論文番号dsad027
ジャーナルDNA Research
31
1
DOI
出版ステータスPublished - 2月 1 2024

ASJC Scopus subject areas

  • 医学一般

フィンガープリント

「MCPtaggR: R package for accurate genotype calling in reduced representation sequencing data by eliminating error-prone markers based on genome comparison」の研究トピックを掘り下げます。これらがまとまってユニークなフィンガープリントを構成します。

引用スタイル