Microsatellite Identification Based on Genome Assembly Reveals Potential Markers of Macassar Ebony (Diospyros celebica Bakh.)
Abstract
Macassar ebony (Diospyros celebica Bakh.) is a tree species endemic to Sulawesi. To date, few comprehensive reports on its genome assembly have been published. In this study, we sequenced paired-end libraries on the Illumina HiSeq 4000 platform, generating 141.2 million paired-end reads (42.4 gigabases). The resulting assembly comprised 950,081 scaffolds with an N50 of 6,023. BUSCO analysis identified 183 (12.7%) complete and single-copy BUSCOs (S) and 9 (0.6%) complete and duplicated BUSCOs (D). We further identified 12,890 microsatellites (simple sequence repeats, SSRs) in the Macassar ebony genome: 14 dinucleotide, 12,090 trinucleotide, 780 tetranucleotide, and 6 pentanucleotide SSRs. This dataset is a valuable resource for assessing the genetic makeup of Macassar ebony in its natural habitats and for subsequent analyses of its genome.
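To illustrate how microsatellites can be counted by motif length in assembled scaffolds, the following is a minimal Python sketch. It is not the pipeline used in this study: the abstract does not name the SSR-detection tool or its settings, so the minimum repeat counts in `MIN_REPEATS`, the function names, and the toy sequence are all assumptions made for illustration. The sketch detects perfect (uninterrupted) repeats only.

```python
import re
from collections import Counter

# Hypothetical minimum repeat counts per motif length (assumption; the
# abstract does not state the thresholds actually used).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n-1,} demands at least
        # n tandem copies in total.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" or "ATAT", so each locus is counted in one class.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


def count_by_motif_length(hits):
    """Tally SSR loci by motif length (2 = di-, 3 = tri-, ...)."""
    return Counter(len(motif) for _, motif, _ in hits)


# Toy scaffold (illustrative, not real data): one (AT)7 and one (AAG)5 locus.
toy = "GGC" + "AT" * 7 + "CCGT" + "AAG" * 5 + "TTC"
toy_hits = find_ssrs(toy)
toy_counts = count_by_motif_length(toy_hits)
```

Applied to every scaffold of an assembly, a tally of this kind yields per-class counts like those reported above, where the four classes sum to the stated total (14 + 12,090 + 780 + 6 = 12,890).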
This work is licensed under a Creative Commons Attribution 4.0 International License.