A near-complete assembly of an Arabidopsis thaliana genome
Published: 2021-10-29 21:55:10



The genome sequence of Arabidopsis thaliana, a widely adopted model species, has greatly expedited molecular plant biology research. More than twenty years after the first release of the genome (Arabidopsis Genome Initiative, 2000), unresolved gap regions remain. These regions are presumably composed of highly repetitive sequences, such as telomeres, centromeres, 5S rDNA clusters, and nucleolar organizer regions (NORs) containing 45S rDNA. The near-complete Col-PEK assembly was obtained by combining long-read Nanopore (ONT), high-fidelity long-read PacBio HiFi, and short-read Illumina NovaSeq sequencing data. The Col-PEK assembly fills most of the gaps found in the TAIR10 assembly, including all five centromeres; only the NORs and the telomere at the end of NOR4 remain incomplete. The new assembly is 133.92 Mb, 14.77 Mb larger than the TAIR10 assembly. Most of the coding-gene annotations in Araport11 were lifted over to the new assembly with Liftoff, and a substantial number of homologous duplications of known genes were found. Repetitive-sequence and non-coding RNA annotations are also provided for the new assembly.


Use the interactive JBrowse browser to view the genome and its annotations. The genome assembly and annotation files are free to download.
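Readers who download the assembly may want to check the reported sizes themselves. The following is a minimal sketch, not the authors' pipeline: it sums sequence lengths in a FASTA file and reports the total in Mb, assuming 1 Mb = 1,000,000 bp. The function names and any file names used with it (for example `Col-PEK.fasta` or `TAIR10.fasta`) are hypothetical and are not taken from the download page.

```python
def fasta_lengths(path):
    """Return {sequence_name: length_in_bp} for a plain-text FASTA file."""
    lengths = {}
    name = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # Use the first whitespace-delimited token of the header as the name.
                fields = line[1:].split()
                name = fields[0] if fields else f"seq{len(lengths) + 1}"
                lengths[name] = 0
            elif name is not None:
                lengths[name] += len(line)
    return lengths


def total_mb(lengths):
    """Total assembly size in Mb (assumption: 1 Mb = 1e6 bp)."""
    return sum(lengths.values()) / 1e6


def size_difference_mb(new_lengths, old_lengths):
    """Size of the new assembly minus the old one, in Mb."""
    return total_mb(new_lengths) - total_mb(old_lengths)
```

Calling `fasta_lengths` on each downloaded file and passing the two results to `size_difference_mb` gives a figure to compare against the reported 14.77 Mb. The reported numbers imply a TAIR10 size of about 119.15 Mb (133.92 minus 14.77); a computed value may differ depending on which sequences each file includes, for example organellar genomes.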

