Identification of two distinct begomoviruses infecting Malvastrum coromandelianum

Malvastrum coromandelianum is a common weed plant frequently found around agricultural fields. Three virus isolates (Y249, Y278 and Y281) were obtained from M. coromandelianum with yellow vein symptoms in Honghe and Baoshan, Yunnan Province, China. Specific 500 bp products were amplified from total DNA extracts using universal primers for members of the genus Begomovirus. The complete viral genome sequences of both Y278 and Y281 were determined to be 2743 nucleotides, and that of Y249 was determined to be 2740 nucleotides. Sequence alignments and phylogenetic analyses support the proposal of creating new species in the genus Begomovirus, for which the name malvastrum yellow vein Baoshan virus (MaYVBsV) is proposed for Y278 and Y281, and malvastrum yellow vein Honghe virus (MaYVHhV) is proposed for Y249.


Background
The genus Begomovirus constitutes the largest genus in the family Geminiviridae, with over 400 species recognized by the International Committee on Taxonomy of Viruses (ICTV) (Zerbini et al. 2017). Begomoviruses infect economically important crops and cause serious damage to agriculture throughout the tropical and subtropical regions (Brown et al. 2015). Based on genome organization, begomoviruses can be further classified into monopartite and bipartite subgroups (Hanley-Bowdoin et al. 2013). The genome of bipartite begomoviruses contains two similarly sized single-stranded DNA (ssDNA) components, designated DNA A and DNA B. The DNA A component encodes six proteins involved in viral replication, encapsidation, transmission and pathogenesis (Fondong 2013). The DNA B component encodes two proteins, which participate in cell-to-cell and systemic spread throughout the host (Lazarowitz and Beachy 1999). The genome of monopartite begomoviruses comprises a single molecule that is similar to the DNA A component of bipartite begomoviruses. For a few begomoviruses, the DNA A/DNA component also encodes a V3 (Gong et al. 2021) or C5 (Li et al. 2015) protein involved in suppression of host RNA interference. In addition, two types of ssDNA molecules, referred to as alphasatellites and betasatellites, are frequently found to be associated with infection by monopartite begomoviruses (Briddon et al. 2018; Yang et al. 2019). Both alphasatellites and betasatellites are approximately half the size of the begomovirus components, and depend on their helper viruses for encapsidation and movement in plants (Zhou 2013). An alphasatellite has a single ORF coding for a replication initiator protein (alpha-Rep), and is capable of self-replication in host plants (Briddon et al. 2018). A betasatellite encodes a multifunctional βC1 protein involved in pathogenicity, suppression of gene silencing, and repression of other plant defense responses (Li et al. 2018). Recently, a novel βV1 gene, which is associated with symptom development, was identified in betasatellites (Hu et al. 2020).
Weeds serve as important intermediate hosts of plant viruses, and may contribute to disease epidemics. Several types of weeds have been identified as alternative hosts of geminiviruses (Yang et al. 2008; Papayiannis et al. 2011; Fiallo-Olive et al. 2012). Malvastrum coromandelianum is a weed commonly seen in Yunnan Province, China, and is also a reservoir host for geminiviruses (Zhou et al. 2003). In this study, we describe the molecular characterization of two distinct begomovirus species that infect M. coromandelianum in Yunnan, China.

Identification of begomoviruses in M. coromandelianum
Three M. coromandelianum leaf samples, Y249, Y278 and Y281, with yellow vein symptoms were collected in Yunnan Province. To test whether the symptoms were caused by geminivirus infection, degenerate primers PA and PB, universal for members of the genus Begomovirus, were used to amplify a conserved region within the DNA A/DNA component of begomoviruses. An amplicon of ~500 base pairs (bp) was obtained from each sample, suggesting that these samples were infected by begomoviruses. Based on the obtained sequences, primer pairs were designed to amplify the remaining part of the viral genome from each sample. The complete nucleotide sequences of both Y278 and Y281 are 2743 bp in length (Accession numbers FN386459 and FN386460, respectively). These two sequences are nearly identical to each other (99.5% identity), demonstrating that they are isolates of a single begomovirus species. Comparative sequence analysis of Y278 and Y281 with other begomoviruses showed that they share the highest sequence identities (88.4 and 88.5%) with pea leaf distortion virus (PLDV). The complete genome of Y249 is 2740 bp in length (Accession number FN552749). Sequence analysis showed that the Y249 genome shares the highest sequence identity (86.5%) with malvastrum yellow mosaic virus (MaYMV). These sequence identities are below the threshold value of 91% for species demarcation within the genus Begomovirus (Fauquet and Stanley 2003; Brown et al. 2015). According to the established principles of geminivirus taxonomy and nomenclature (Fauquet et al. 2008), the virus isolates Y278 and Y281 were named malvastrum yellow vein Baoshan virus (MaYVBsV), and the virus isolate Y249 was named malvastrum yellow vein Honghe virus (MaYVHhV) (Fig. 1).
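The species demarcation logic above can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the function names are hypothetical, and the identity calculation assumes two pre-aligned sequences and skips any column containing a gap, which is only one of several possible conventions. Only the 91% threshold and the identity values in the usage note are taken from the text.

```python
SPECIES_THRESHOLD = 91.0  # percent nucleotide identity, as cited in the text


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two aligned sequences of equal length.

    Assumption: columns in which either sequence has a gap ('-') are
    ignored (pairwise deletion).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def is_distinct_species(identity_pct: float,
                        threshold: float = SPECIES_THRESHOLD) -> bool:
    """True if the best identity to any known virus is below the threshold."""
    return identity_pct < threshold
```

With the values reported above, `is_distinct_species(88.5)` and `is_distinct_species(86.5)` both return `True`, whereas `is_distinct_species(99.5)` returns `False`, consistent with Y278 and Y281 being isolates of a single species.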

Genome organization of MaYVBsV and MaYVHhV
Both MaYVBsV and MaYVHhV have a canonical begomovirus genome arrangement (Fig. 1). In the virion-sense strand of the genome, ORF V1 encodes the viral coat protein (CP) (pfam 00844) of 256 amino acids (aa) in MaYVBsV and 257 aa in MaYVHhV. The putative proteins encoded by ORF V2 of MaYVBsV and MaYVHhV are 112 aa (12.9 kDa) and 116 aa (13.3 kDa), respectively, and are predicted to be the movement protein-like V2 protein (pfam 01524). MaYVHhV has an additional ORF, V3, located downstream of ORF V2, encoding a polypeptide with a molecular weight of 7.4 kDa. The function of ORF V3 remains unclear, although in silico analysis revealed a putative transmembrane helix at aa 13-35. In the complementary-sense strand, the C1 ORFs of both MaYVBsV and MaYVHhV code for putative 363-aa-long polypeptides with a predicted molecular mass of 41.2 kDa. These proteins were identified as the replication-associated protein (Rep) (pfam 00799), containing four  (Nash et al. 2011; Fondong 2013) (Fig. 2a). P-loop NTPase domains, which are involved in nucleotide binding, were identified at aa 217-264 in both proteins. ORF C2 of each virus encodes a putative protein of 150 aa, identified as the transcriptional activator protein (TrAP) (pfam 01440). Three known functional regions of C2 proteins were identified: a nuclear localization signal (NLS), a cysteine-rich zinc finger-like domain that confers DNA-binding activity, and an acidic C-terminal region required for transactivation activity (Fig. 2b). The putative proteins encoded by ORFs C3 and C4 of MaYVBsV and MaYVHhV are 134 aa and 143 aa, identified as the replication enhancer protein (REn) (pfam 01407) and the C4 protein (pfam 01492), respectively. The C5 ORFs potentially encode proteins of 167 aa for MaYVBsV and 208 aa for MaYVHhV.
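To make the relationship between ORF length and protein length concrete, the following Python sketch translates an ORF with the standard genetic code and handles complementary-sense ORFs by reverse-complementing first. It is an illustration under stated assumptions, not the annotation pipeline used here: the function names are hypothetical, the input is assumed to contain only A, C, G and T, and the ORF is assumed to start at its first codon.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# and the third position varying fastest.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def reverse_complement(seq: str) -> str:
    """Reverse complement, used to read complementary-sense ORFs (C1 to C5)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate_orf(orf: str) -> str:
    """Translate from the first codon up to (not including) the first stop."""
    orf = orf.upper()
    protein = []
    for i in range(0, len(orf) - 2, 3):
        aa = CODON_TABLE[orf[i:i + 3]]  # raises KeyError on non-ACGT input
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def orf_length_nt(protein_length_aa: int) -> int:
    """Nucleotide length of an ORF, counting the stop codon."""
    return 3 * (protein_length_aa + 1)
```

Under these assumptions, a 256-aa coat protein corresponds to an ORF of `orf_length_nt(256)`, that is 771 nt including the stop codon.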
The noncoding regions of both MaYVBsV and MaYVHhV are 272 nt in length and, as in other members of the genus Begomovirus, contain a predicted hairpin structure with the conserved nonanucleotide motif TAATATT/AC.
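Because the genome is circular, the nonanucleotide motif can span the point at which a linear sequence record begins and ends. The sketch below shows one way to locate it; the function name is hypothetical, the "/" in TAATATT/AC is treated as a position marker rather than a sequence character, and only the strand as given is searched.

```python
NONANUCLEOTIDE = "TAATATTAC"  # TAATATT/AC without the "/" marker


def find_motif_circular(genome: str, motif: str = NONANUCLEOTIDE) -> list[int]:
    """Return 0-based start positions of `motif` in a circular sequence."""
    seq = genome.upper()
    n = len(seq)
    if n == 0 or len(motif) > n:
        return []
    # Append the first len(motif) - 1 bases so that matches spanning
    # the end/start junction of the linear record are found.
    extended = seq + seq[:len(motif) - 1]
    hits = []
    start = extended.find(motif)
    while start != -1 and start < n:
        hits.append(start)
        start = extended.find(motif, start + 1)
    return hits
```

For a toy sequence such as `"ATTACGGGCCCGTAAT"`, the motif is split across the junction and is reported at position 12.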

Phylogenetic relationship of MaYVBsV and MaYVHhV with other geminiviruses
Amino acid sequence comparisons of the six viral proteins of MaYVBsV or MaYVHhV were performed with the 12 begomoviruses, which have the highest genome sequence identities with these two viruses (Table 1). MaYVBsV has the highest amino acid sequence    Table S1 genome sequences of geminiviruses. The trees showed that MaYVBsV and MaYVHhV always cluster with other begomoviruses (Fig. 3).

Discussion
Owing to their broad distribution and rapid propagation, weeds can survive in or around crop fields during the non-cropping season, which makes them important reservoir hosts for plant viruses. In addition, mutation and recombination in viral genomes may increase when weeds are infected with multiple virus species, which may raise the transmission rate of viruses and further broaden their host range. In our study, two distinct begomoviruses, MaYVBsV and MaYVHhV, were identified in M. coromandelianum. Sequence alignment analysis of the viral genomes and gene products showed that MaYVBsV and MaYVHhV are closely related to begomoviruses collected from different plant hosts, suggesting that these two viruses may be transmitted from M. coromandelianum to different types of crops and nursery plants. Despite our efforts, we failed to detect a DNA B component associated with MaYVBsV or MaYVHhV, suggesting that they are monopartite begomoviruses. Further study will focus on determining the infectivity and host range of MaYVBsV and MaYVHhV.

Conclusions
Our study reports the detection and characterization of two novel putative begomovirus species infecting M. coromandelianum plants. These results will facilitate the development of strategies for managing the spread of geminiviruses.

Plant materials
Virus isolates were collected from M. coromandelianum plants displaying yellow vein symptoms in Honghe (Y249) in 2003 and Baoshan (Y278 and Y281) in 2005, in Yunnan Province, China.

Sequence analysis
Sequences were assembled and analyzed with SnapGene®. Domains were analyzed using the Pfam database (http://pfam.xfam.org). Transmembrane helices were predicted using TMHMM 2.0 (http://www.cbs.dtu.dk/services/TMHMM/). Sequence alignments were performed using MUSCLE in MEGA X. Phylogenetic analyses were performed using the maximum likelihood method in MEGA X. The GenBank accession numbers of sequences analyzed in the study are listed in Additional file 1: Table S1.

Received: 4 February 2021; Accepted: 20 March 2021