Complete genome sequencing of nematode Aphelenchoides besseyi, an economically important pest causing rice white-tip disease

Abstract

Aphelenchoides besseyi is a seed-borne plant-parasitic nematode that causes severe rice yield losses worldwide. In the present study, the A. besseyi Anhui-1 strain isolated from rice in China was sequenced with a hybrid method combining PacBio long reads and Illumina short reads, and subsequently annotated using available transcriptome references. The genome assembly consists of 166 scaffolds totaling 50.3 Mb, with an N50 of 1.262 Mb and a maximum scaffold length of 9.17 Mb. A total of 16,343 genes were annotated in the genome, with 94 gene families expanded while 70 families contracted specifically in A. besseyi. Furthermore, gene function analysis demonstrated that the genes related to drought tolerance were enriched, and cellulase genes were horizontally acquired from eukaryotic origin. Our findings provide resources to interpret the biology, evolution, ecology, and functional diversities of Aphelenchoides spp. in the light of genomics.

Background

Aphelenchoides besseyi is a seed-borne plant-parasitic nematode (PPN) that parasitizes rice (Oryza sativa), strawberry (Fragaria grandiflora), and other plants belonging to 35 genera (Duncan and Moens 2013). A. besseyi can cause rice yield losses of up to 70% in some cases (Lin et al. 2005; Tulek and Cobanoglu 2010) and is considered one of the major PPNs in world crop production (Jones et al. 2013).

A. besseyi was first isolated from strawberries in the USA (Christie 1929). Later, Yokoo (1948) described A. oryzae from rice, but it was considered a junior synonym of A. besseyi because many morphological characters overlap (Allen 1952). Recently, molecular and phylogenetic analyses suggested that A. besseyi may be a species complex consisting of several cryptic species that are not well delimited morphologically (Oliveira et al. 2019; Xu et al. 2020). The species complex consists of three described species: A. pseudobesseyi parasitizing ornamental plants, A. oryzae parasitizing rice, and A. besseyi parasitizing strawberries (Subbotin et al. 2021). Since the identification of these species relies primarily on molecular tools and their morphology is nearly identical, we retain the species complex name A. besseyi in the present study.

Unlike most PPNs, which infect root tissues in the soil, A. besseyi feeds on the growing points of stems and leaves of seedlings, causing the disease called ‘white-tip’ (Perry and Moens 2013). A. besseyi has several characters that make it a useful model for studying nematode evolution and adaptation. First, it can survive in stored rice grains for several years through anhydrobiosis (Tiwari and Khare 2003) and can therefore be used to study adaptation to desiccation. Second, it is a facultative parasite that propagates on both fungi and plants (Perry and Moens 2013), which makes it an excellent subject for studying the evolution of plant parasitism. Third, A. besseyi is predominantly amphimictic, and males are usually abundant (Huang et al. 1979), but parthenogenetic reproduction has also been found in some populations (Nandini et al. 2001); thus, it can serve as an example for studying the mechanisms underlying reproductive mode.

Despite its importance in agriculture, the genome of A. besseyi has not yet been sequenced. The same is true of aphelench nematodes more broadly. Among the 453 valid aphelench species listed by Hunt (2008), genomic information was available only for Bursaphelenchus xylophilus, B. okinawaensis, B. mucronatus, and Aphelenchus avenae. More high-quality genomic data would provide valuable insights into the evolution of aphelenchs (Kikuchi et al. 2011; Wan et al. 2021).

In this study, both PacBio long reads and Illumina short reads were used to study the genome of A. besseyi. Protein-coding genes, non-coding genes, and transposable elements (TE) were predicted using the newly sequenced data together with the available RNA-seq transcriptome. Cellulase is an iconic gene that PPNs acquired through horizontal gene transfer (HGT), so we also investigated the possible origin of the cellulase genes of A. besseyi. So far, this new genome is the most contiguous and most completely annotated one for an aphelench species, and it could provide a robust reference for further analyses with important evolutionary and agro-economic implications.

Results

Genome features of A. besseyi

Our assembly of the A. besseyi Anhui-1 strain (NCBI BioProject: PRJNA901680) consists of 166 scaffolds totaling 50.3 Mb, with an N50 of 1.262 Mb and a maximum scaffold length of 9.17 Mb (Table 1). A total of 502 ncRNAs were identified, including 17 miRNA (62–148 bp), 148 rRNA (113–7556 bp), 61 snRNA (72–217 bp), and 276 tRNA (71–127 bp) (Additional file 1: Tables S1–S4). Genome completeness was assessed by mapping BUSCOs onto the genome assembly. The assembly recovers 78.2% of BUSCOs as complete, comprising 744 single-copy (75.8%) and 24 duplicated (2.4%) BUSCOs; a further 53 BUSCOs (5.4%) are fragmented and 161 (16.4%) are missing. The assembled genome is about half the size of that of the model nematode Caenorhabditis elegans (100 Mb), and it is the smallest genome known in aphelenchs (Table 1). The GC content of the assembly is 42.2%, which is similar to that of A. avenae (42.1%) and B. xylophilus (40.4%) but higher than that of B. okinawaensis (36.2%).
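As a reminder of what the N50 statistic reported above measures, the following minimal Python sketch computes it from a list of scaffold lengths. The function name `n50` and the toy lengths are illustrative assumptions; they are not the actual scaffold lengths of this assembly.

```python
def n50(lengths):
    """Return the N50: the length L such that scaffolds of length >= L
    together contain at least half of the total assembly length."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half the assembly is covered
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example (hypothetical lengths in bp, not the real scaffolds)
toy_lengths = [900, 500, 300, 200, 100]
print(n50(toy_lengths))
```

In the toy example the two longest scaffolds are needed to reach half of the 2000 bp total, so the N50 is the length of the second one.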

Table 1 The genome statistics of newly sequenced Aphelenchoides besseyi and other sequenced aphelenchs nematodes

Analyses of repetitive elements identified a total of 303 kb of tandem repeats, occupying 0.6% of the genome. The estimated TE content varies with the method used: it is largest with the de novo methods based on RepeatModeler and LTR_FINDER (9810 kb, 19.5% of the genome) and smallest with the Repbase database (Repbase TEs, 988 kb, 1.97% of the genome). After combining these methods and removing redundancy, a larger total TE length was recovered (Combined TEs, 10.4 Mb), accounting for 21.2% of the genome (Table 2 and Additional file 2).
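One way to picture the step of combining methods and removing redundancy is as a union of annotation intervals: overlapping calls from different tools are merged so that each base is counted only once. The sketch below illustrates that general idea under stated assumptions; it is not the authors' pipeline, and `merge_intervals`, `covered_length`, and the toy coordinates are hypothetical.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals
    that all lie on the same scaffold."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it instead of adding
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(interval) for interval in merged]


def covered_length(intervals):
    """Total number of bases covered after redundancy is removed."""
    return sum(end - start for start, end in merge_intervals(intervals))


# Hypothetical repeat calls from two methods on one 10 kb scaffold
de_novo_calls = [(100, 400), (1000, 1500)]
repbase_calls = [(350, 450), (2000, 2100)]
scaffold_size = 10_000

combined = covered_length(de_novo_calls + repbase_calls)
print(combined, f"{100 * combined / scaffold_size:.1f}% of the scaffold")
```

Because overlapping calls are merged, the combined length is at most the sum of the individual methods and at least the largest of them.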

Table 2 Summary of Aphelenchoides besseyi transposable elements (TE) annotation statistics based on different reference databases and methods

Gene annotation and comparison with other nematodes

The A. besseyi genome is predicted to encode 16,343 protein-coding genes, a number similar to that of B. xylophilus (15,860) but much lower than that of A. avenae (43,724) (Table 1). Of these annotated genes, 93.9% (15,348) can be assigned to orthogroups. A total of 452 species-specific orthogroups containing 2334 protein-coding genes were found in A. besseyi. Among the 44 examined species, the root-knot nematode Meloidogyne graminicola has the smallest gene number (10,895), while animal-parasitic species tend to have more genes; for example, the largest gene number was found in the insect-parasitic nematode Romanomermis culicivorax (48,376).

Of the 7495 annotated orthogroups in A. besseyi, most contain a single copy (5299 orthogroups), followed by those with two copies (1114 orthogroups), while orthogroups with more than two copies are relatively rare (1082 orthogroups). This is in line with B. xylophilus and most other PPNs, but differs from polyploid root-knot species such as M. arenaria and M. enterolobii, in which two copies are more abundant (Fig. 1a). In addition, the genome of A. besseyi shows a longer average gene length, with 3 kb being the most frequently recovered length (Additional file 3: Figure S1).
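The copy-number classes above can be tallied from an orthogroup table in a few lines. The sketch below is a minimal illustration that assumes orthogroups are available as a mapping from orthogroup ID to per-species gene lists; the data structure, species labels, and gene names are hypothetical. It also checks that the three reported classes add up to the reported total.

```python
from collections import Counter


def copy_number_classes(orthogroups, species):
    """Count orthogroups with one, two, or more than two copies in `species`."""
    classes = Counter()
    for genes_by_species in orthogroups.values():
        n_copies = len(genes_by_species.get(species, []))
        if n_copies == 1:
            classes["single"] += 1
        elif n_copies == 2:
            classes["two"] += 1
        elif n_copies > 2:
            classes["multiple"] += 1
    return classes


# Hypothetical toy table: orthogroup -> species -> gene IDs
toy_orthogroups = {
    "OG1": {"abes": ["g1"], "bxyl": ["b1"]},
    "OG2": {"abes": ["g2", "g3"]},
    "OG3": {"abes": ["g4", "g5", "g6"], "bxyl": ["b2"]},
    "OG4": {"bxyl": ["b3"]},  # absent from "abes", so not counted
}
toy_classes = copy_number_classes(toy_orthogroups, "abes")
print(dict(toy_classes))

# Consistency check on the counts reported in the text
reported = {"single": 5299, "two": 1114, "multiple": 1082}
print(sum(reported.values()))
```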

Fig. 1

The comparison of orthologous genes of Aphelenchoides besseyi and other closely related species. a The number of annotated genes in orthogroups and their copy numbers. b Orthologous genes shared among different species

Identifying homologous relationships among the sequences of different species is central to understanding evolution and diversity. Therefore, we compared the protein-coding gene families of A. besseyi, B. xylophilus, and the stem nematode Ditylenchus destructor. A total of 5277 orthogroups were resolved among the three species. The two aphelench nematodes A. besseyi and B. xylophilus share the largest number of orthogroups found in only two species (1059), followed by D. destructor and B. xylophilus (607), whereas only 201 orthogroups are shared exclusively by D. destructor and A. besseyi. With respect to unique genes, A. besseyi has fewer unique gene families (958) than B. xylophilus (1048) and D. destructor (1355) (Fig. 1b).
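The three-species comparison summarized in Fig. 1b amounts to partitioning orthogroup IDs into the seven regions of a three-set Venn diagram. A minimal sketch using Python sets follows; the function name and the toy orthogroup IDs are hypothetical and serve only to show the set operations involved.

```python
def three_way_partition(a, b, c):
    """Split the orthogroup IDs of three species into the seven Venn regions."""
    return {
        "all_three": a & b & c,
        "a_b_only": (a & b) - c,
        "a_c_only": (a & c) - b,
        "b_c_only": (b & c) - a,
        "a_only": a - b - c,
        "b_only": b - a - c,
        "c_only": c - a - b,
    }


# Hypothetical orthogroup IDs present in each species
abes = {"OG1", "OG2", "OG3", "OG5"}
bxyl = {"OG1", "OG2", "OG4", "OG6"}
ddes = {"OG1", "OG3", "OG4", "OG7"}

regions = three_way_partition(abes, bxyl, ddes)
for name, members in regions.items():
    print(name, sorted(members))
```

Every orthogroup falls into exactly one region, so the region sizes sum to the size of the union of the three sets.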

Phylogenetic placement and molecular dating

We identified 242 orthogroups that have single-copy genes in at least 50% of the 44 nematode species examined, and these genes were subsequently used for phylogeny reconstruction. As expected, A. besseyi is fully supported as sister to the pine wood nematode B. xylophilus (BS = 100), and together they form a basal clade within Tylenchomorpha (Fig. 2a). Molecular dating was then performed using 1126 orthogroups that have single-copy genes in at least 12 of 17 species. The results suggest that A. besseyi split from B. xylophilus on average 163.8 million years ago, similar to the split between the sedentary endoparasitic cyst and root-knot nematodes (160.3 million years ago) (Fig. 2b).
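The orthogroup selection described above can be sketched as an occupancy filter over a table of per-species gene counts. The exact rule used in the study is not spelled out, so the sketch below makes an assumption that is labeled in the code: an orthogroup is kept if no species has more than one copy and at least the given fraction of species has exactly one copy. The function name and toy counts are hypothetical.

```python
def select_single_copy(orthogroups, all_species, min_fraction=0.5):
    """Select orthogroups usable as phylogenetic markers.

    Assumed rule (not confirmed by the source): no species may have more
    than one copy, and at least `min_fraction` of species must have one.
    """
    selected = []
    for og_id, counts in orthogroups.items():
        if any(counts.get(sp, 0) > 1 for sp in all_species):
            continue  # a duplicated copy disqualifies the orthogroup
        n_present = sum(1 for sp in all_species if counts.get(sp, 0) == 1)
        if n_present >= min_fraction * len(all_species):
            selected.append(og_id)
    return selected


# Hypothetical gene counts per species for four toy species
species = ["sp_a", "sp_b", "sp_c", "sp_d"]
toy_counts = {
    "OG1": {"sp_a": 1, "sp_b": 1},                        # 2 of 4 species: kept
    "OG2": {"sp_a": 1, "sp_b": 2, "sp_c": 1},             # duplicated: dropped
    "OG3": {"sp_a": 1},                                   # 1 of 4 species: dropped
    "OG4": {"sp_a": 1, "sp_b": 1, "sp_c": 1, "sp_d": 1},  # all species: kept
}
markers = select_single_copy(toy_counts, species)
print(markers)
```

With 44 species and a 50% threshold, this rule would require a single copy in at least 22 species; with 17 species and a threshold of 12, `min_fraction` would be 12/17.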

Fig. 2

Phylogenetic placement and molecular dating of Aphelenchoides besseyi within Nematoda. a The tree was constructed from single-copy conserved orthologous groups obtained by comparing the protein sequences of 44 species. The bootstrap values are shown at nodes. b The molecular dating of A. besseyi. The estimated species divergence times (millions of years ago) are indicated as ranges at each branch node

The gene family expansion and function prediction

The analysis for gene family expansion and contraction reveals that 94 and 70 gene families are respectively expanded and contracted in A. besseyi (Additional file 4: Tables S1, S2), similar to B. xylophilus, which has 88 expanded and 69 contracted gene families (Fig. 3).

Fig. 3

The analysis of gene family expansion and contraction. The red and blue colors indicate the numbers of expanded and contracted gene families, respectively

To better evaluate the functional classification of the annotated genes, we performed functional analysis using the gene ontology (GO), eukaryotic orthologous groups (KOG), and Kyoto encyclopedia of genes and genomes (KEGG) databases (Figs. 4, 5 and Additional file 4: Tables S3–S5), as well as the NCBI NR and SwissProt databases. The NR search annotated 12,031 genes and the SwissProt search resulted in 7646 annotations; details for these two annotations are given in Additional file 4: Tables S6, S7.

Fig. 4

Functional annotations of the Aphelenchoides besseyi genes. a GO functional annotation. b KOG functional annotation

Fig. 5

Functional annotation of the Aphelenchoides besseyi genes by KEGG

A total of 8238 protein-coding genes were functionally annotated using the GO database (Additional file 4: Table S3). GO terms include biological process (BP), cellular component (CC), and molecular function (MF), comprising 22, 16, and 10 elements, respectively. The top three annotated BPs were the cellular process (GO:0009987), metabolic process (GO:0008152), and single-organism process (GO:0044699), in which 947, 942, and 733 genes were included, respectively. There were 495, 495, and 368 genes included in the top three CCs, including the cell (GO:0005623), cell part (GO:0044464), and membrane (GO:0016020), respectively. A total of 1025, 734, and 108 genes were included in the most annotated MFs: catalytic activity (GO:0003824), binding (GO:0005488), and transporter activity (GO:0005215), respectively (Fig. 4a).

For KOG, a total of 10,832 protein-coding genes involving 25 categories were annotated (Additional file 4: Table S4). Among them, 1802 (16.64%) genes were annotated as the general function, which was the most abundant category, followed by 1639 (15.13%) genes assigned in signal transduction mechanisms (Fig. 4b).

KEGG pathway analyses annotated 2834 protein-coding genes (Additional file 4: Table S5), and the main pathways were ‘global and overview maps’ for metabolism, ‘translation’ for genetic information processing, ‘signal transduction’ for environmental information processing, ‘transport and catabolism’ for cellular processes, and ‘aging’ for organismal systems (Fig. 5). Further analysis suggested that several metabolic pathways of A. besseyi differ from those of related species (Additional file 3: Figures S2–S4).

For vitamin B6 metabolism, A. besseyi is similar to B. xylophilus in lacking aldehyde oxidase (K00157) but bears threonine synthase (K01733), which is absent in Pristionchus pacificus, C. elegans and B. xylophilus. The potato rot nematode Ditylenchus destructor has a similar life cycle, being both mycophagous and plant parasitic. In comparison to D. destructor, A. besseyi is similar to the free-living P. pacificus and C. elegans in missing pyridoxal 5'-phosphate synthase pdxS subunit (K06215) and pyridoxine 5-phosphate synthase (K03474).

With respect to biotin, A. besseyi is similar to B. xylophilus in having 3-oxoacyl-[acyl-carrier protein] reductase (K00059), which is absent in the free-living P. pacificus and C. elegans. In comparison to D. destructor, A. besseyi lacks 8-amino-7-oxononanoate synthase (K00652), biotin synthase (K01012), and biotin-protein ligase (K01942).

Riboflavin metabolism is generally similar to that of B. xylophilus, except that riboflavin kinase (K00861) is absent. However, flavin prenyltransferase (K03186), ectonucleotide pyrophosphatase/phosphodiesterase family member 1/3 (K01513), and FAD synthetase (K00953) are present in A. besseyi. A total of seven proteins are missing in comparison to D. destructor: GTP cyclohydrolase II (K01497), diaminohydroxyphosphoribosylaminopyrimidine deaminase (K01498), 5-amino-6-(5-phosphoribosylamino) uracil reductase (K00082), 5-amino-6-(5-phospho-D-ribitylamino) uracil phosphatase (K22912), 3,4-dihydroxy 2-butanone 4-phosphate synthase (K02858), riboflavin kinase (K00861), and FMN hydrolase/5-amino-6-(5-phospho-D-ribitylamino) uracil phosphatase (K20860). In addition, ectonucleotide pyrophosphatase/phosphodiesterase family member 1/3 (K01513) and flavin prenyltransferase (K03186) were found in A. besseyi but not in D. destructor.
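Pathway comparisons of this kind reduce to set differences between the KEGG Orthology (KO) identifiers annotated in each species. The sketch below illustrates the operation with the riboflavin KOs named above. The sets are restricted to KOs whose status in both species is stated in the text, so they are not complete pathway annotations, and the function name `compare_ko` is a hypothetical helper.

```python
def compare_ko(ko_a, ko_b):
    """Return the KO identifiers shared by, or exclusive to, two species."""
    return {
        "shared": ko_a & ko_b,
        "only_a": ko_a - ko_b,
        "only_b": ko_b - ko_a,
    }


# Riboflavin-pathway KOs as stated in the text (partial sets, an assumption
# of this sketch: KOs shared by both species are not enumerated there)
abes_riboflavin = {"K01513", "K03186"}
ddes_riboflavin = {
    "K01497", "K01498", "K00082", "K22912",
    "K02858", "K00861", "K20860",
}

result = compare_ko(abes_riboflavin, ddes_riboflavin)
print("missing in A. besseyi:", sorted(result["only_b"]))
print("missing in D. destructor:", sorted(result["only_a"]))
```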

Genes related to drought tolerance

A. besseyi survives by remaining anhydrobiotic in the seed until planting; thus, we suspected that a series of drought tolerance/resistance genes might be involved. Indeed, we recovered significantly more transcription factors in A. besseyi than in the other studied species. In particular, 83 proteins are similar to the LysR family transcriptional regulator (LTTRs) of the bacterium Bradyrhizobium japonicum, and 7 proteins are similar to the 12-oxophytodienoate reductase (OxyR) present in Oryza sativa subsp. japonica. Interestingly, both LTTRs and OxyR are related to transcriptional regulation during the expression of drought tolerance/resistance genes (Additional file 4: Table S8).

Aphelenchoides besseyi horizontally acquired cellulase genes from eukaryotic origin

Cellulose is one of the major components of plant tissues. In this study, we found three cellulase genes in the A. besseyi genome; they encode endo-glucanases that belong to glycosyl hydrolase family 45 (GHF45). When searched with BLAST against the NCBI database, the best-hit homologs of the A. besseyi cellulases were from pine wood nematodes of the genus Bursaphelenchus and from fungi (Fig. 6). The phylogenetic tree showed that the A. besseyi and Bursaphelenchus cellulases cluster in one clade, while all fungal cellulases are in a separate clade. Aphelenchoides and Bursaphelenchus are closely related genera. Based on the limited data, we could not determine whether nematodes from the two genera acquired cellulases from the same origin, but it is likely that A. besseyi, like Bursaphelenchus species, gained its cellulases from a fungal source (Kikuchi et al. 2004).

Fig. 6

A phylogenetic tree of cellulases from Aphelenchoides besseyi and other organisms. Cellulases in A. besseyi and their homologs were used to construct a phylogenetic tree using the model WAG + I + G4

Discussion

Use of long-read sequence technologies to generate genomes in the plant-parasitic nematode

The first PPN genome was sequenced in 2008 using the Sanger method and BAC libraries (Abad et al. 2008). Later, with the development of high-throughput sequencing technologies and decreasing cost, a growing number of PPNs have been sequenced. Currently, genomes of 27 PPN species are available in GenBank (accessed on 01 July 2022). Approximately half of them were assembled from short reads generated on the Illumina platform, resulting in highly fragmented assemblies, e.g., 17,125 contigs in Subanguina moxae, 31,341–34,316 in Meloidogyne javanica, 129,028 in Rotylenchulus reniformis, and 5944 in Hoplolaimus columbus (Takeuchi et al. 2015; Szitenberg et al. 2017; Ma et al. 2021). The poor quality of these draft genomes reduces the reliability of downstream gene annotation and limits further sensitive studies, such as comparative genomics or population genomics at the species level. A typical example is A. avenae. This species is related to Aphelenchoides and Bursaphelenchus but has nearly three times more annotated genes (43,724 vs. 16,343 in A. besseyi and 15,860 in B. xylophilus). The assembly of A. avenae comprises 28,772 contigs; it is highly fragmented and includes a considerable number of duplications or even contaminations. Therefore, it is difficult to draw any solid conclusion from a dataset of this quality (Wan et al. 2021).

Long-read sequencing technologies, such as PacBio and Nanopore, have greatly advanced our ability to assemble high-quality animal genomes. With these technologies, nematode genome assemblies can be reduced to a few hundred scaffolds, with an N50 of several Mb, greater consensus accuracy, and a lower degree of sequencing bias (Amarasinghe et al. 2020). In the present study, we used a hybrid genome sequencing strategy that combines long reads (PacBio) with high-accuracy, low-cost Illumina short reads, which are used to correct the long-read assembly, to obtain a more complete and contiguous genome assembly. This resource will also pave the way for comparative genomics aimed at pinpointing the evolution of plant parasitism, the genomic basis of anhydrobiosis, and the mechanism underlying the switch of reproductive mode in this plant parasite.

More recently, during the revision of this manuscript, the A. besseyi complex was sequenced in an independent study (Lai et al. 2022). That study used a hybrid strategy in which Illumina HiSeq 2500 produced 150 bp paired-end reads, PacBio and Nanopore sequencing systems produced long reads, and, more importantly, Hi-C was used to generate a chromosome-level assembly. The sequenced populations of A. besseyi have genome sizes ranging from 44.7 to 47.4 Mb, slightly smaller than that of our sequenced population, and are amongst the smallest in clade IV. This method can also be used for genome sequencing of other PPNs.

Horizontally acquired cellulases in A. besseyi

The acquisition of plant cell-wall degrading genes through HGT is a landmark event in the evolution of PPNs (John et al. 2005; Haegeman et al. 2011; Kikuchi et al. 2017). Among those enzymes, cellulase is the most widespread; it is found in all known plant parasites and some free-living nematodes. However, cellulases from different GHFs have been found in nematodes, and those genes were gained independently from different origins. Although cellulases from most plant parasites are of bacterial origin (Danchin et al. 2010), fungi have also been shown to be potential donors of cellulases in nematodes, albeit less frequently than bacteria (Haegeman et al. 2011). Here, using the complete genome, we showed that A. besseyi, along with the pine wood nematodes of the genus Bursaphelenchus, had acquired cellulases of fungal origin, which belong to GHF45. This is in agreement with earlier studies in B. xylophilus and A. besseyi from the pre-sequencing era (Kikuchi et al. 2004, 2014; Palomares-Rius et al. 2014). However, due to the lack of data, we are not able to confirm whether Aphelenchoides and Bursaphelenchus gained cellulases through a single HGT event. Along with recent HGT studies in nematodes (Han et al. 2022) and insects (Xia et al. 2021), our data provide new insights into the adaptation of animals through HGT.

Conclusion

In this study, we sequenced A. besseyi isolated from rice in China using PacBio long reads and Illumina short reads. The assembly consists of 166 scaffolds totaling 50.3 Mb, with an N50 of 1.262 Mb and a maximum scaffold length of 9.17 Mb. A total of 16,343 genes were annotated in the genome, with 94 expanded and 70 contracted gene families. Further gene function analysis demonstrated that the transcription factors related to drought tolerance were enriched, and cellulase genes were horizontally acquired from eukaryotic origin.

Methods

Nematode culture and DNA isolation

A. besseyi was isolated from infected seeds of O. sativa subsp. japonica, cv. AnHui-1, as described in Xie et al. (2019). Nematodes were subsequently cultured on the fungus Botrytis cinerea at 25°C, starting from one male and one female, for 10 generations. The nematodes were collected and kept in double-distilled water for 24 h before further use. Genomic DNA was extracted from mixed stages.

DNA extraction and sequencing

High molecular weight DNA of A. besseyi for PacBio sequencing was extracted from ca. 50,000 individuals. DNA quantity and quality were assessed using a Qubit Fluorometer (ThermoFisher, Waltham, MA, USA) and a 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). DNA molecules were sheared into smaller fragments, and BluePippin (Sage Science, Beverly, MA, USA) was used to size-select DNA fragments of > 20 kb. Libraries were prepared using the SMRTbell Template Prep Kit-SPv3 following the manufacturer’s recommendations. Sequencing was performed on a PacBio Sequel platform at Gendenovo (Guangzhou, China). Illumina libraries were prepared using the Paired-End Sample Prep Kit (Illumina Inc., San Diego, CA) with an insert size of 500 bp. Sequencing was performed on an Illumina NovaSeq 6000 platform.

De novo assembly

De novo assembly was performed with PacBio long reads using MECAT (Xiao et al. 2017). The parameters ‘-n 50’ for mecat2pw and ‘Overlapper = mecat2asmpw’ for mecat2canu were used for assembly. Illumina reads were used to correct the PacBio long-read assembly using Pilon (Walker et al. 2014). To evaluate the accuracy of the genome assembly and sequencing, Illumina short reads were realigned to the genome assembly to obtain statistical indicators, including mapping rate, genome coverage, depth distribution, and the numbers of homozygous and heterozygous SNPs. Expressed Sequence Tags (ESTs) from A. besseyi were aligned to the genome assembly with BLAT to evaluate the integrity of the genome assembly. The BUSCO pipeline (Simão et al. 2015) was also run to evaluate the completeness of the genome assembly using nematoda_odb9 (with the number of species = 8).
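The BUSCO completeness figures follow directly from the four category counts. The short sketch below recomputes the percentages from the counts reported in the Results (744 single-copy, 24 duplicated, 53 fragmented, and 161 missing BUSCOs); the function name `busco_summary` is a hypothetical helper and not part of the BUSCO software.

```python
def busco_summary(single, duplicated, fragmented, missing):
    """Convert BUSCO category counts into percentages of the total set."""
    total = single + duplicated + fragmented + missing

    def pct(count):
        return round(100 * count / total, 1)

    return {
        "total": total,
        "complete": pct(single + duplicated),  # complete = single + duplicated
        "single": pct(single),
        "duplicated": pct(duplicated),
        "fragmented": pct(fragmented),
        "missing": pct(missing),
    }


# Counts as reported in the Results section
summary = busco_summary(single=744, duplicated=24, fragmented=53, missing=161)
print(summary)
```

The counts sum to 982 BUSCOs, and the complete fraction (single-copy plus duplicated) reproduces the 78.2% quoted for the assembly.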

Gene annotation

A hybrid strategy combining transcriptome-based, homology-based, and de novo annotation was adopted to predict gene structure. The de novo prediction was conducted using Augustus (Stanke et al. 2005) and GeneMark (Lukashin and Borodovsky 1998), both based on Hidden Markov Models. Reference sequences were then used to search for and annotate homologous genes in MAKER (Cantarel et al. 2008). RNA-Seq data were used for prediction by combining hisat2 alignment and StringTie (Pertea et al. 2016) assembly results to obtain predicted gene sets. Finally, MAKER (Cantarel et al. 2008) was used to integrate the predictions from the three above-mentioned methods into the final gene set.

We annotated genes using the NCBI NR, GO, SwissProt, KEGG, and KOG databases. The predicted protein-coding gene sequences were aligned against each database using BLAST 2.2.29+ (McGinnis et al. 2004) with an e-value threshold of less than 10^-5 for filtering, and the top 20 hits with the highest scores were selected.
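The filtering rule described above (e-value below 10^-5, then the 20 best-scoring hits per query) can be sketched as follows. The sketch assumes that hits are stored in the common 12-column tab-separated BLAST layout, with the e-value and bit score in the last two columns; the source does not state the output format, and `top_hits`, `make_row`, and the toy hits are hypothetical.

```python
from collections import defaultdict


def top_hits(lines, max_evalue=1e-5, max_hits=20):
    """Filter tab-separated BLAST hits by e-value and keep the
    best-scoring hits for each query."""
    by_query = defaultdict(list)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 12:
            continue  # skip blank or malformed lines
        query, subject = fields[0], fields[1]
        evalue, bitscore = float(fields[10]), float(fields[11])
        if evalue < max_evalue:
            by_query[query].append((bitscore, subject, evalue))
    # Sort each query's hits by bit score, highest first, and truncate
    return {
        query: sorted(hits, key=lambda hit: hit[0], reverse=True)[:max_hits]
        for query, hits in by_query.items()
    }


def make_row(query, subject, evalue, bitscore):
    """Hypothetical helper: build one 12-column row with placeholder
    values in the eight columns this sketch does not use."""
    return "\t".join([query, subject] + ["0"] * 8 + [str(evalue), str(bitscore)])


toy_hits = [
    make_row("geneA", "hit1", 1e-50, 200.0),
    make_row("geneA", "hit2", 1e-3, 40.0),   # fails the e-value filter
    make_row("geneA", "hit3", 1e-80, 310.0),
    make_row("geneB", "hit4", 2e-6, 55.0),
]
best = top_hits(toy_hits, max_hits=1)
print(best)
```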

For non-coding RNAs, rRNAs were predicted using RNAmmer (Lagesen et al. 2007), tRNAs were predicted with tRNAscan-SE (Lowe and Eddy 1997), and both snRNAs and miRNAs were predicted by comparison against the Rfam database (V13, http://rfam.xfam.org/).

Tandem Repeats Finder (Benson 1999) was used to predict tandem repeats. Three prediction methods were used to extract interspersed repeats: (1) signature-based prediction using Repbase (Jurka et al. 2005) through LTR_FINDER (Xu and Wang 2007), HelitronScanner (Xiong et al. 2014), MITE-Hunter (Han and Wessler 2010), and MGEscan-nonLTR (Rho and Tang 2009); (2) de novo construction using the programs PILER (Edgar et al. 2005), RepeatScout (Price et al. 2005), and RepeatModeler (Flynn et al. 2020); and (3) homology-based construction. RepeatMasker (Chen 2004) was used to predict the repeat sequences based on the repeat sequence databases constructed by the structure (signature) and de novo predictions.

Phylogeny, molecular dating, and gene function analysis

For phylogenetic analysis, ortholog identification and clustering were performed using OrthoFinder (Emms and Kelly 2019). MAFFT (Katoh and Standley 2013) was used to align the amino acid sequences in each orthogroup. Aligned sequences were concatenated, and a maximum-likelihood species tree was constructed using IQ-TREE (Nguyen et al. 2015) with 1000 bootstrap replications.

The divergence time between A. besseyi and 16 other nematodes was estimated using the MCMCtree program implemented in PAML (Yang 2007). Calibration times were obtained from the TimeTree database (http://www.timetree.org/). Gene family expansion and contraction were determined using CAFE (De Bie et al. 2006) based on gene family changes in the inferred phylogenetic history. Two methods were employed for function prediction. For the gene families, the GO terms were obtained through BLAST2GO (Conesa et al. 2005) searching against NCBI non-redundant database and using the Gene Set Enrichment Analysis tool in WormBase7 (https://wormbase.org).

To further analyze drought-related genes, the reference database was built using all available drought-resistant and drought-tolerant genes in UniProt (https://www.uniprot.org/). The annotated A. besseyi gene, together with other species, was used as a query. The genes were extracted using the BLASTP search for those showing > 50% similarity with > 30% identity.

Analysis of cellulase genes

To search for potential cellulase genes, multiple known nematode cellulases (Han et al. 2022) were used as queries against the A. besseyi genome. DIAMOND blastp with the ‘--more-sensitive’ option was used and resulted in five hits from the A. besseyi genome (Buchfink et al. 2021). These genes were manually examined in SMART (http://smart.embl-heidelberg.de/), and only three of them contain a cellulase domain, which belongs to the glycoside hydrolase (GH) family 45.

To investigate the potential origin of the cellulase genes in A. besseyi, we first searched the NCBI non-redundant database for homologs of the A. besseyi cellulases, using the amino acid sequences of the cellulase domain and the BLASTp algorithm. Matching sequences with e-values less than 1.25 × 10^-87 were collected, and pre-existing homologs from Aphelenchoides were manually removed. These sequences were clustered at a 90% identity threshold using cd-hit (Li and Godzik 2006), and the remaining 43 sequences, together with the three A. besseyi sequences from this study, were aligned using MAFFT (Katoh and Standley 2013). IQ-TREE was used to construct phylogenetic trees (Nguyen et al. 2015). A total of 541 substitution models were tested with 1000 ultrafast bootstraps (Kalyaanamoorthy et al. 2017; Hoang et al. 2018). Based on the Bayesian Information Criterion, WAG + I + G4 was identified as the best-fit model for the given data.
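Model selection by the Bayesian Information Criterion picks the model with the lowest BIC = k ln(n) - 2 ln(L), where k is the number of free parameters, n the number of alignment sites, and L the maximized likelihood. The sketch below illustrates the calculation only. The log-likelihoods, parameter counts, and alignment length are hypothetical values chosen for illustration and are not results from this study.

```python
import math


def bic(log_likelihood, n_params, n_sites):
    """Bayesian Information Criterion: k * ln(n) - 2 * ln(L)."""
    return n_params * math.log(n_sites) - 2.0 * log_likelihood


def best_model(candidates, n_sites):
    """Return the name of the candidate with the lowest BIC.

    `candidates` maps a model name to (log-likelihood, free parameters).
    """
    return min(
        candidates,
        key=lambda name: bic(candidates[name][0], candidates[name][1], n_sites),
    )


# Hypothetical fits for three candidate models on a 200-site alignment.
# With 46 sequences an unrooted tree has 2 * 46 - 3 = 89 branch lengths,
# so each extra model parameter adds one to that baseline.
toy_candidates = {
    "WAG": (-5200.0, 89),
    "WAG+G4": (-5100.0, 90),
    "WAG+I+G4": (-5090.0, 91),
}
print(best_model(toy_candidates, n_sites=200))
```

The penalty term k ln(n) means that an extra parameter is accepted only if it improves the log-likelihood by more than ln(n)/2, which is why richer models do not win automatically.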

Availability of data and materials

The datasets generated and/or analysed during the current study are available in the NCBI repository, https://www.ncbi.nlm.nih.gov/nuccore/?term=PRJNA901680.

Abbreviations

BP:

Biological process

CC:

Cellular component

ESTs:

Expressed sequence tags

GH:

Glycoside hydrolase

GHF45:

Glycosyl hydrolase family 45

GO:

Gene ontology

HGT:

Horizontal gene transfer

KEGG:

Kyoto encyclopedia of genes and genomes

KOG:

Eukaryotic orthologous groups

LTTRs:

LysR family transcriptional regulator

MF:

Molecular function

PPN:

Plant-parasitic nematode

TE:

Transposable elements

Acknowledgements

Not applicable.

Funding

This work was supported by grants from the National Natural Science Foundation of China (32001876) and China Agriculture Research System (Grant No. CARS-01-41).

Author information

Contributions

YLP and HLJ designed the research; JLX, FY, and WJY prepared the materials; XQ, ZDH, HLJ, and JLX analyzed the data and wrote the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Xue Qing.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Supplementary Information

Additional file 1: Table S1.

The statistics of predicted miRNA in Aphelenchoides besseyi. Table S2. The statistics of predicted rRNA in Aphelenchoides besseyi. Table S3. The statistics of predicted snRNA in Aphelenchoides besseyi. Table S4. The statistics of predicted tRNA in Aphelenchoides besseyi.

Additional file 2.

The tandem repeats and transposable elements in the genome of Aphelenchoides besseyi.

Additional file 3: Figure S1.

Gene length distribution of Aphelenchoides besseyi. Figure S2. The metabolism pathway for vitamin B6. Figure S3. The metabolism pathway for biotin. Figure S4. The metabolism pathway for riboflavin.

Additional file 4: Table S1.

The expanded gene family and corresponding gene family. Table S2. The annotation for the expanded genes. Table S3. Gene function prediction: GO analysis-Cellular component/molecular function/biological process. Table S4. Gene function prediction: KOG analysis. Table S5. Annotated KEGG pathway. Table S6. The annotated genes by NR search. Table S7. The annotated genes by SwissProt. Table S8. The comparison of genes related to tolerance/resistance in different nematode species.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.

About this article

Cite this article

Ji, H., Xie, J., Han, Z. et al. Complete genome sequencing of nematode Aphelenchoides besseyi, an economically important pest causing rice white-tip disease. Phytopathol Res 5, 5 (2023). https://doi.org/10.1186/s42483-023-00158-0

  • DOI: https://doi.org/10.1186/s42483-023-00158-0
