Zika viruses encode 5′ upstream open reading frames affecting infection of human brain cells

Ribosome profiling reveals the presence of novel uORFs in the 5′ UTR of different ZIKV strains

We utilised Ribo-Seq, in combination with whole transcriptome sequencing (RNA-Seq), to investigate the translation of the ZIKV genome at sub-codon resolution. Ribo-Seq exploits the capacity of elongating ribosomes to protect ~30 nucleotides of messenger RNA (mRNA) from digestion during nuclease incubation of cell extracts13. Such ribosome-protected fragments (RPFs) are purified and deep sequenced, revealing the position of translating ribosomes on the mRNA at the time of harvesting with single-nucleotide precision14. African green monkey (Vero) cells and human glioblastoma–astrocytoma (U251) cells were infected at a multiplicity of infection (MOI) of three with the ZIKV American isolate PE243, representative of the Asian/American lineage15, and the ZIKV African isolate Dak84, a prototypic African isolate16. To preserve the positions of translating ribosomes upon cell lysis, infected cells at 24 h post-infection (h p.i.) were flash-frozen or pre-treated for 3 min with the translation inhibitor cycloheximide (CHX) before harvesting. CHX treatment is widely used in Ribo-Seq studies but can lead to the accumulation of 80S ribosomes on start codons and, in stressed cells, can induce the accumulation of RPFs in the 5′ region of coding sequences17. For this reason, cells were harvested by flash-freezing to avoid these potential biases (unless stated). Ribo-Seq and RNA-Seq libraries were prepared and deep sequenced as previously described18. In addition, quality control of the different libraries was also conducted as previously described18, and the data were deemed to be of high quality (Supplementary Figs. 1–6).

At 24 h p.i., the viral envelope (E) protein of each strain was robustly expressed in infected Vero and U251 cells, confirming that infection was well established (Fig. 1A). The Ribo-Seq (red) and RNA-Seq (green) read densities on the virus genome for infected Vero and U251 cells are illustrated in Fig. 1B and Supplementary Fig. 7A–C. Consistent with the translation of a single polyprotein from the genomic mRNA, read coverage across the main ORF was even, although there were localised variations in RPF density that may arise from technical biases (ligation, PCR and nuclease biases19) or, potentially, from ribosome pausing during translation20. Few RPFs were present in the 3′ UTR, consistent with the absence of translation in this region. The 3′ UTR RNA-Seq density was noticeably higher (65–70%) than across the upstream part of the genome in each cell line (Fig. 1B). Structured flavivirus 3′ UTRs resist degradation by the 5′–3′ Xrn1 host exonuclease, giving rise to non-coding subgenomic flavivirus RNAs (sfRNAs) that accumulate during infection21 and are linked to cytopathic and pathologic effects15. A sharp spike of RNA-Seq density was seen in the ZIKV American isolate PE243 (nt 10,478) and ZIKV African isolate Dak84 (nt 10,477) libraries (Supplementary Fig. 7D, E), consistent with the presence of a nuclease-resistant RNA structure at this position. Indeed, this location is two nucleotides upstream of the predicted 5′ end of RNA ‘stem-loop 2’ (SL2)21.

Fig. 1: ZIKV RNA synthesis and translation.

A Western blot analysis of ZIKV E protein and GAPDH in Vero and U251 cells infected with American isolate PE243 and African isolate Dak84 (MOI:3) for 24 h. GAPDH was used as a loading control. Molecular masses (kDa) are indicated on the left. Infections were performed in triplicate with similar results. B Map of the 10,807-nt ZIKV/Brazil/PE243/2015 genome. The 5′ and 3′ UTRs are in black, and the polyprotein ORF is in pale blue with subdivisions showing mature cleavage products. Histograms show the read densities, in reads per million mapped reads (RPM), of Ribo-Seq (red) and RNA-Seq (green truncated at 100 RPM for better visualisation) reads at 24 h p.i. (repeat 1) in Vero cells pre-treated with CHX. The positions of the 5′ ends of reads are plotted with a +12 nt offset to map (for RPFs) approximate P-site positions. Negative-sense reads are shown in dark blue below the horizontal axis. In light blue, the translational efficiency (TE) is calculated as the positive-sense Ribo-Seq/RNA-Seq ratio. Source data are provided as a Source Data file.

Fig. 2: ZIKV 5′ region.

The 5′ region of the ZIKV American isolate PE243 genome shows two non-AUG uORFs in Vero cells pre-treated with CHX (A), flash-frozen Vero cells (B, upper panel) and flash-frozen U251 cells (B, lower panel). Note that, in order to visualise RPFs across the uORFs properly, the y-axis has been truncated at 10 RPM for the Ribo-Seq samples for Vero cells and at 3 RPM for U251 cells, leaving some RPF counts, mainly for the main ORF, off-scale. C The 5′ region of the ZIKV African isolate Dak84 genome shows the African uORF in flash-frozen Vero (upper panel) or flash-frozen U251 cells (lower panel). Note that in infected Vero cells, the peak marked with a red asterisk, unlike all other peaks, has an unusual read-length distribution centred on 26 nt, and this is not seen in infected U251 cells. For U251 cells infected with Dak84, only RPFs with read lengths of 28 and 29 nt were plotted. Histograms show the positions of the 5′ ends of reads with a +12 nt offset to map the approximate P-site. Reads whose 5′ ends map to the first, second or third phases relative to codons in the polyprotein reading frame are indicated in blue, orange or purple, respectively. Capsid (C) denotes the initiation of the polyprotein (main ORF). D–F Bar charts of the percentage of ribosome-protected fragments (RPFs) in each phase relative to the polyprotein ORF. Regions with the least overlap (coordinates given in plot titles) were selected. Reads whose 5′ ends map to the −2, −1 or 0 phases are indicated in blue, orange or purple, respectively. CHX-treated Vero cells infected with American isolate PE243 (D); flash-frozen Vero (E, upper panel) and U251 (E, lower panel) cells infected with PE243; and flash-frozen Vero (F, upper panel) and U251 (F, lower panel) cells infected with African isolate Dak84.

Fig. 3: Analysis of ZIKV uORF translation.

A Scheme of the firefly luciferase (FF-Luc) reporter constructs, in which the 2A-FF-Luc cassette is positioned downstream of, and in frame with, uORF1, uORF2 or the main ORF. uORF1-2A-FF-Luc includes the complete 5′-UTR (107 nucleotides) plus 22 nucleotides of the polyprotein. In this construct, a tryptophan codon replaces the uORF1 stop codon (grey asterisk) to allow expression of the luciferase reporter. uORF2-2A-FF-Luc includes the complete 5′-UTR plus 89 nucleotides of the polyprotein, and the main ORF-2A-FF-Luc includes the complete 5′-UTR plus 87 nucleotides of the polyprotein. Differences in protein length are due to frame correction. B Relative FF-Luc activity for uORF1, uORF2 and main ORF normalised to Renilla luciferase (Ren-Luc), used as a transfection control. Transfected Vero cells were harvested at 30 h post-transfection (h p.t.). Translation of the main ORF WT reporter was set to 100%. C Scheme of the 5′ region of American isolate PE243 with initiation codons for uORF1, uORF2 and main ORF indicated in blue, orange and purple, respectively. Modified nucleotides in the different mutants are indicated by a colour-filled square associated with the mutant name, described in (D). E Relative FF-Luc activity of different mutants for uORF1, uORF2 and main ORF normalised to Ren-Luc as described in (B). American WT FF-Luc activities of uORF1, uORF2 and main ORF from (B) have been included for clarity. F Relative FF-Luc/Ren-Luc ratio of uORF1-, uORF2- and main ORF-2A-FF-Luc reporter mRNAs in Vero cells infected with American isolate PE243 (MOI:3, purple triangle) or mock-infected (red circle) at 6 h p.t. Cells were harvested at 24 h p.i. G Relative FF-Luc/Ren-Luc ratio of main ORF-2A-FF-Luc mutant reporters in Vero cells infected with PE243 (MOI:3) or mock-infected at 6 h p.t. Cells were harvested at 24 h p.i. Experiments were performed in triplicate with three biological replicates. In all cases, error bars represent standard errors. All t-tests were two-tailed and did not assume equal variance for the two populations being compared. All p values are from comparisons of the mutant with the respective non-mutated luciferase reporter (i.e., derived from the American wild-type) in the same ORF. Source data are provided as a Source Data file.

Fig. 4: The significance of uORF translation in virus infection.

A Summary table of SHAPE and RT-PCR data for the ZIKV uORF mutant viruses derived from the pCCI-SP6-American ZIKV infectious clone. B SHAPE RNA secondary structure of the 5′ region (first 180 nucleotides) of the American WT, African-like, uORF1-KO and uORF2-PTC mutant viruses. Nucleotides are colour-coded based on SHAPE reactivity. SLA stem-loop A, SLB stem-loop B and cHP capsid hairpin. C Time-course of infection of U251 cells with ZIKV mutant viruses (MOI 0.01) for 96 h. Plaque assays were performed on serial dilutions of the supernatant containing released virions. Values show the mean titres of three biological replicates. Error bars represent standard errors. PFU plaque-forming units. All t-tests were two-tailed and did not assume equal variance for the two populations being compared. All p values are from comparisons of the mutant virus with the American WT at the indicated time points. D Pie charts of competition assays of American WT with mutant viruses after 2 (p2), 4 (p4) and 6 (p6) sequential passages in U251 cells. Different proportions of each virus (50:50 and 90:10) were added as indicated in the first row (input 16 h p.i., passage 0). Chart area indicates the RNA proportion for each virus, as measured from sequencing chromatograms of RT-PCR products. Experiments were repeated independently eight times (Supplementary Table 6). E (left panels) Ribo-Seq read density in the 5′ region of the ZIKV genome at 24 h p.i. of flash-frozen Vero cells infected with American WT, uORF1-KO, uORF2-PTC1 and African-like viruses (MOI:3). Histograms show the positions of the 5′ ends of reads with a +12 nt offset to map the P-site as described in Fig. 2A–C. For these plots, the y-axis has been truncated at 18 RPM and only RPFs with a read length of 29 nt were plotted. The blue asterisk indicates the uORF1 initiation codon, and the red asterisk indicates the premature termination codon in uORF2. E (right panels) Bar charts of the percentage of RPFs in each phase, relative to the main ORF (plotted as described in Fig. 2D–F). Source data are provided as a Source Data file.

Fig. 5: ZIKV uORF1 interacts with the cytoskeleton.

A Subcellular fractionation analysis by western blot of Vero cells transfected for 48 h with plasmid pCAG (left panel) or pCAG expressing the ZIKV uORF1 protein fused with TAP-tag at the C-terminus (pCAG-ORF1-TAP, right panel). The total extract, cytosolic, membrane, nuclear, chromatin and cytoskeletal fractions were probed with antibodies against FLAG (to detect TAP-tag); GAPDH (cytosolic marker); ERp72 (membrane marker); Lamin A + C (nuclear marker); H2A (chromatin marker); and vimentin (cytoskeletal marker). Molecular masses (kDa) are indicated on the left. B Representative confocal images of Vero cells transfected for 36 h with pCAG (mock) or the pCAG-ORF1-TAP plasmids. Cells were stained with antibodies against FLAG (green) and different cytoskeletal proteins (i.e., actin, tubulin and vimentin in magenta). Nuclei were counter-stained with DAPI (blue). Images are a maximum projection of a Z-stack. Scale bars, 25 μm. Fluorescence intensity profiles of ORF1-TAP (green) and the different cytoskeletal markers (magenta) were obtained using ImageJ software, along the white straight line shown on the merged image crossing the representative cell. C AlphaFold2 prediction for ZIKV uORF1-encoded protein. D Representative confocal images of Vero cells transfected for 36 h with pCAG-ORF1-TAP-1X Pro. Cells were stained and analysed as described in (B). Experiments in (A, B and D) were repeated three times with similar results. Source data are provided as a Source Data file.

Fig. 6: ZIKV uORF1-encoded protein helps in the formation of the cytoskeletal cage during infection.

A Representative brightfield and confocal images of Vero cells infected with the American WT (Am WT), the African-like and the uORF1-KO mutant viruses (MOI:3) for 18 h (upper panel) and 24 h (lower panel). Cells were stained with antibodies against the viral E protein (green) and vimentin (magenta). Nuclei were counter-stained with DAPI (blue). Images are a maximum projection of a Z-stack. Scale bars, 25 μm. Fluorescence intensity profiles (right panel) of E protein (green), vimentin (magenta) and the nuclear staining (blue) were obtained using ImageJ software, along the white straight line shown on the merged image crossing the representative cell. B Quantification of the proportion of vimentin area by fluorescence microscopy versus overall cell area, measured by brightfield microscopy, in Vero-infected cells as shown in (A), and U251-infected cells (C) as shown in Supplementary Fig. 15. Each point represents a single cell (n = 30 per condition and biological replicate). Data are represented as mean ± SD from three biologically independent experiments. Statistical analysis was repeated measures two-way ANOVA. All p values are from comparisons of the different viruses at that specific time point and from each virus or mock at the two different time points. Source data are provided as a Source Data file.

In Ribo-Seq samples, the number of negative-sense reads (dark blue) was negligible14,20. In comparison, in RNA-Seq samples (Fig. 1B and Supplementary Fig. 7A–C), a low, uniform coverage of negative-sense reads was observed (~1% of positive-sense coverage), corresponding to negative-sense intermediates that act as templates for genome replication. The translational efficiency (TE) of each virus genome was calculated by applying a 15-nt running mean filter and dividing the number of Ribo-Seq reads by the number of RNA-Seq reads (light blue), revealing relatively even coverage across the genome. Strikingly, we found substantial TE within the 5′ UTR in both CHX-treated and flash-frozen cells (Fig. 2A–C). In the ZIKV American isolate PE243-infected cells, a prominent peak of RPF density was seen at nucleotide 25 of the 5′ UTR in Vero and U251 cells, coinciding with a non-canonical (CUG) initiation codon (Fig. 2A, B). RPFs mapped in the −2 phase along the length of the associated 29-codon upstream ORF (uORF1, in blue), which ends at nucleotide 111 (the 4th nucleotide of the polyprotein ORF). Additionally, RPFs mapped in the −1 phase to a second uORF (uORF2, in orange), which appears to be translated via a non-canonical (UUG) initiation codon at nucleotide 80 (Fig. 2A, B). This uORF2 is 77 codons in length and extends 202 nucleotides into the polyprotein ORF. Analysis of the mapping of Ribo-Seq reads to the ZIKV African isolate Dak84 5′ UTR (Fig. 2C) revealed a single uORF (African uORF), translated in the −1 frame. The African uORF initiates at nucleotide 25 (the same initiation codon used by uORF1) and terminates at nucleotide 309, at the stop codon equivalent to that of uORF2, encoding a putative protein of 94 amino acids.
In all African ZIKV isolates with available sequence data, uORF1 and uORF2 are present as a single ORF, which appears to have been split in two by the insertion of a uracil residue at position 81 in the 1966 Malaysian lineage that gave rise to the Asian/American strain22 (Supplementary Fig. 8).
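
As an illustration of the translational-efficiency calculation described above, the following minimal Python sketch normalises per-nucleotide read counts to RPM, smooths each track with a 15-nt running mean and divides Ribo-Seq by RNA-Seq. It is not the pipeline used in this study: the function names, the order of smoothing and division, the shrinking of the window at the sequence ends and the treatment of positions with zero RNA-Seq coverage are all assumptions made for the example.

```python
def rpm(counts, total_mapped_reads):
    """Scale raw per-position counts to reads per million mapped reads."""
    return [1e6 * c / total_mapped_reads for c in counts]


def running_mean(values, window=15):
    """Centred running mean; the window shrinks at the ends (assumption)."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smoothed.append(sum(values[lo:hi]) / (hi - lo))
    return smoothed


def translational_efficiency(ribo_rpm, rna_rpm, window=15):
    """Smoothed Ribo-Seq density divided by smoothed RNA-Seq density.

    Positions with no RNA-Seq coverage return None (assumption).
    """
    ribo_s = running_mean(ribo_rpm, window)
    rna_s = running_mean(rna_rpm, window)
    return [r / m if m > 0 else None for r, m in zip(ribo_s, rna_s)]


# Toy positive-sense tracks (hypothetical values, not data from this study).
ribo = [2.0] * 20
rna = [4.0] * 20
te = translational_efficiency(ribo, rna)
```

Smoothing both tracks before division, rather than smoothing the ratio, limits the influence of individual positions with very low RNA-Seq coverage; either order is compatible with the description in the text.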

To provide further support for the presence of ZIKV uORFs, we examined the phasing of RPFs in viral 5′ UTRs. For RPFs, mapping of the 5′ end positions to coding sequences (CDSs) characteristically reflects the triplet periodicity (herein referred to as ‘phasing’) of translational decoding14,18. We summarised phase relative to the polyprotein (main) ORF by plotting bar charts of the percentage of reads in each phase (Fig. 2D–F). To avoid the potential confounding effects of overlapping ORFs on these calculations, regions with the least overlap were selected. For uORF1, ribosomal phasing was measured over a 55 nucleotide region (position 25–79), which does not overlap another ORF. Here, a clear dominance of the –2 phase was seen (light blue), consistent with translation in the uORF1 frame (Fig. 2D, E). For uORF2, a short region of 27 nucleotides (position 80–107) was selected, to avoid the high ribosomal occupancy in the 0 phase corresponding to the main ORF that starts at position 108. Here, the majority of reads map to the −1 phase, consistent with uORF2 translation; however, the overlap with uORF1 is still evident from the increased read density in the −2 phase (Fig. 2D, E). For the African uORF (Fig. 2F), within the chosen region (position 25–106), the majority of reads are attributed to the −1 phase, supporting the expression of this fused uORF. As a positive control, phasing within the polyprotein ORF was assessed, in a region with no known overlapping ORFs (position 310/311–480), revealing a clear dominance of reads attributed to the 0 phase (Fig. 2D–F). Additionally, as shown in Supplementary Fig. 9, the length distribution of Ribo-Seq reads mapping to the ZIKV uORFs mirrored that of polyprotein-mapping RPFs, indicating that they are bona fide ribosome footprints.
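
The phasing summary can be illustrated with a short Python sketch, given here under stated assumptions rather than as the analysis code used for Fig. 2D–F. Phase is expressed relative to the main ORF start (position 108 in PE243, as stated above) and labelled 0, −1 or −2. Because the +12 nt offset is a multiple of three, the phase of a read's 5′ end and that of its inferred P-site are identical. The function names, the choice to apply the region filter to the offset position and the toy read positions are hypothetical.

```python
from collections import Counter


def phase(position, main_orf_start):
    """Phase of a genome position relative to the main ORF: 0, -1 or -2."""
    return -((main_orf_start - position) % 3)


def phase_percentages(read_5p_ends, main_orf_start, region, offset=12):
    """Percentage of reads in each phase within an inclusive region.

    read_5p_ends: 1-based genome positions of read 5' ends.
    region: (first, last) positions; the filter is applied to the
            offset (approximate P-site) position (assumption).
    """
    first, last = region
    counts = Counter()
    for pos in read_5p_ends:
        p_site = pos + offset
        if first <= p_site <= last:
            counts[phase(p_site, main_orf_start)] += 1
    total = sum(counts.values())
    return {ph: (100.0 * counts[ph] / total if total else 0.0)
            for ph in (-2, -1, 0)}


# Start codons reported in the text for PE243: uORF1 at 25, uORF2 at 80.
uorf1_phase = phase(25, 108)   # -2
uorf2_phase = phase(80, 108)   # -1

# Toy reads (hypothetical): three in the uORF1 frame, one in the -1 frame.
toy = phase_percentages([13, 16, 19, 14], main_orf_start=108, region=(25, 79))
```

With these definitions the uORF1 and uORF2 start codons fall in the −2 and −1 phases, respectively, matching the frames assigned to the two uORFs in the text.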

The translation of ZIKV uORFs can modulate main ORF expression

The presence of uORFs in mRNAs is often associated with the regulation of downstream gene expression23,24. To investigate the potential modulation of ZIKV main ORF expression by the 5′ uORFs, capped T7 RNA polymerase-derived synthetic reporter mRNAs were prepared in which American PE243 uORF1, uORF2 or a 5′ portion of the main ORF was placed upstream of, and in frame with, the Firefly luciferase (FF-Luc) reporter gene. The ZIKV and FF-Luc sequences are separated by the short, foot and mouth disease virus 2A autoprotease-encoding sequence that liberates the FF-Luc enzyme following expression in cells (Fig. 3A). RNAs were reverse transfected into Vero cells alongside a T7-derived RNA expressing Renilla luciferase (Ren-Luc) as transfection control. FF-Luc and Ren-Luc activities were measured at 30 h post-transfection (h p.t.), and translation efficiencies were determined after normalisation with main ORF (American wild-type, American WT) translation levels as 100% (Fig. 3B). Under these conditions, uORF1 and uORF2 expression levels were 0.80% and 4.13%, respectively (Fig. 3B). To assess whether translation of the uORFs could affect main ORF translation, mutations were introduced into uORF1 and uORF2 that were predicted to reduce or increase their expression (Fig. 3C–E). As shown in Fig. 3E, mutation of the uORF1 start codon from CUG to CUA (uORF1-KO, blue) led to a modest reduction in uORF1 expression (left panel), no change in uORF2 expression (middle panel) and a small increase in main ORF translation (right panel). Reducing uORF2 expression by changing the initiation codon from UUG to UUA (uORF2-KO, pink), or introducing a premature stop codon within uORF2 at residue number 6 (uORF2-PTC1, orange), reduced uORF2 translation by 50% (Fig. 3E, middle panel) and led to a slight decrease in main ORF expression (Fig. 3E, right panel). 
Replacing the uORF2 initiation codon with a canonical AUG codon (in purple) led to a substantial increase in uORF2 translation and prevented main ORF expression (Fig. 3E, middle and right panels). A fusion of uORF1 and uORF2 that recapitulated the African uORF (African-like; in green) did not significantly change uORF expression (Fig. 3E, middle panel) and led to a modest, albeit significant, increase in main ORF translation (Fig. 3E, right panel). Similar results were obtained in U251 cells transfected with main ORF-2A-FF-Luc mutants (Supplementary Fig. 10A).
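
The normalisation underlying these reporter measurements can be sketched in Python as follows. This is a minimal example under stated assumptions, not the analysis script used for Fig. 3: each well is represented as a (FF-Luc, Ren-Luc) pair, the reference is the mean FF-Luc/Ren-Luc ratio of the main ORF WT wells, and the function name and all luminescence values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def relative_activity(sample_wells, reference_wells):
    """Return (mean %, standard error) of FF/Ren ratios relative to a reference.

    sample_wells, reference_wells: lists of (firefly, renilla) readings.
    The mean reference ratio is defined as 100%.
    """
    ref_ratio = mean([ff / ren for ff, ren in reference_wells])
    rel = [100.0 * (ff / ren) / ref_ratio for ff, ren in sample_wells]
    sem = stdev(rel) / sqrt(len(rel)) if len(rel) > 1 else 0.0
    return mean(rel), sem


# Hypothetical luminescence readings (arbitrary units), not data from the study.
main_orf_wt = [(1000.0, 500.0), (1100.0, 550.0), (900.0, 450.0)]
uorf_reporter = [(50.0, 500.0), (50.0, 500.0), (50.0, 500.0)]

pct, sem = relative_activity(uorf_reporter, main_orf_wt)
```

Dividing by the co-transfected Ren-Luc signal first corrects for well-to-well differences in transfection efficiency, so that the percentages reflect reporter translation rather than RNA uptake.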

We went on to ask whether viral infection could influence the relative utilisation of main and uORFs. In these experiments, reporter mRNA-transfected cells were infected at 6 h p.t. with the American isolate PE243 (MOI:3) and harvested 24 h later. In the context of infection, the expression of the uORFs and main ORF relative to each other remained similar, although the total expression of each ORF increased significantly compared to uninfected cells (Fig. 3F). Notably, the raw luciferase values (Supplementary Table 1) were slightly lower in the presence of the virus, which may reflect some impairment of translation initiation as a result of the phosphorylation of the alpha subunit of the initiation factor 2 (p-eIF2α) during infection (Supplementary Fig. 10B). In relative terms, expression of the transfection control mRNA, Ren-Luc, in comparison to the ZIKV FF-Luc mRNAs, was reduced in the presence of ZIKV (Supplementary Table 1). This might indicate that the viral 5′ UTR arrangement selectively preserves the expression of the main and uORFs during infection, although this requires further investigation. The effect of uORF mutations on main ORF expression in infected cells was also tested (Fig. 3G). In all cases, except uORF2-KO, the relative expression of the main ORF was increased modestly. In conclusion, virus infection modestly and uniformly increased expression from upstream and main ORFs.

uORF translation modulates virus replication

To investigate the potential role of ZIKV uORFs in virus replication, a panel of viruses containing the mutations tested above was generated using reverse genetics of an American ZIKV infectious clone25, detailed in Fig. 4A. Given that structured RNA elements and long-range interactions in the 5′ and 3′ terminal regions of the ZIKV genome are essential for virus translation and replication26,27, we began by confirming that the 5′ end structures were retained in full-length RNA transcripts of the mutant viruses. Using selective 2′-hydroxyl acylation analysed by primer extension (SHAPE), we found that the structure of the 5′ UTR and the start of the main ORF of the mutant viruses closely matched that of the American WT infectious clone (Fig. 4B and Supplementary Fig. 11A). Two differences were observed: modelling of the African-like mutant virus indicated a slightly shorter third stem-loop (cHP), with loss of two base pairs at the bottom of the helix (Fig. 4B), and in the uORF2-AUG mutant, the internal loop in the centre of the second stem-loop (SLB, Supplementary Fig. 11A) was base-paired. Mutant viruses were also analysed for the stability of the introduced mutations. After five passages, RT-PCR analysis of intracellular viral RNA (initially infected at MOI 0.01 PFU/cell) revealed that all mutations were stable, with two exceptions. ZIKV uORF2-KO showed reversion to WT after passage 1, and we could not recover any virus following electroporation of the ZIKV uORF2-AUG mutant (Supplementary Fig. 11B). The latter observation may reflect reduced translation initiation at the main polyprotein AUG as a consequence of increased recognition of the uORF2-AUG start codon.
The rapid reversion to the WT sequence seen with the uORF2-KO virus may indicate a role for the uORF2 start codon in the translation of the viral polyprotein, but further experimental analysis will be required to confirm this, including the design and testing of alternative uORF2 knockout strategies.

To assess the growth and infectivity of the stable mutants, U251 cells were infected with sequence-verified American WT, African-like, uORF1-KO or uORF2-PTC1 viruses at low multiplicity (0.01 PFU/cell) in a multi-step growth experiment from 0 to 96 h p.i. As shown in Fig. 4C, from 48 h p.i. onwards, African-like and uORF1-KO mutant viruses reached significantly higher titres (~6-fold) than the American WT, whereas no difference was observed with the uORF2-PTC1 mutant. This phenotype was confirmed in competition assays in which cells were simultaneously infected with a defined ratio (50:50 and 90:10) of American WT virus:corresponding mutant virus to a final MOI of 0.01 PFU/cell (Fig. 4D; note that a ratio of 10:90 was used for the American WT:uORF2-PTC1 competition experiment as the uORF2-PTC1 virus showed somewhat slower replication in multi-step growth curves). Supernatants containing virus particles were harvested 72 h later and used to infect fresh cells at a dilution of 1:10,000. This regimen was repeated six times28, and intracellular RNA was harvested, reverse transcribed, and Sanger sequenced. Input corresponds to 16 h post-viral infection and represents the starting point. From passage 2, the African-like and uORF1-KO mutant viruses began to dominate the cultures (Fig. 4D), indicating that both mutant viruses can outcompete American WT. In contrast, the proportion of uORF2-PTC1 viral RNA was relatively unchanged throughout the course of the experiment (Fig. 4D, right panel), indicating no competition with the American WT virus. The phenotypic differences amongst viruses were also evident, albeit subtler, in Vero-infected cells (Supplementary Fig. 11C, D).

In an effort to detect the expression of uORF products in infected cells, antibodies to expressed proteins and peptides were prepared, but they lacked sensitivity and specificity. Therefore, we returned to ribosome profiling to examine translation of the uORFs in ZIKV mutant viruses (Fig. 4E and Supplementary Fig. 11E, F). As seen in Fig. 4E, reads corresponding to the translation of uORF1 (−2 frame) and uORF2 (−1 frame) were visible in the American WT virus, similar to PE243 (Fig. 2A, B). The uORF1-KO mutation, as expected, blocked translation of uORF1, with no reads observed at the initiation codon (blue asterisk), and the African-like virus with fused uORFs recapitulated the pattern of uORF translation of the African virus (as in Fig. 2C). These observations were further confirmed in phasing bar charts, with the highest proportion of reads in the predicted reading frames (Fig. 4E, right panels). Unexpectedly, reads corresponding to the uORF2 phase (Fig. 4E, red rectangle) were present after the premature termination codon in uORF2-PTC1 (Fig. 4E, red asterisk), supported by the high proportion of reads corresponding to the −1 phase in phasing plots (Fig. 4E, right panels). A potential explanation for these reads is −1 ribosomal frameshifting into uORF2 by ribosomes initiating at the main ORF start codon (Supplementary Fig. 12), and supportive evidence for this hypothesis is presented in Supplementary Note 1.

ZIKV uORF1- and African uORF-encoded proteins interact with intermediate filaments

Bioinformatic analyses suggest no homology of ZIKV uORFs to known proteins. Given their predicted sizes (uORF1, 3.1 kDa; uORF2, 8.4 kDa; and African uORF, 10.3 kDa), however, it is likely that some, or all, will have intrinsic biological activity. To examine their cellular localisation, we transiently expressed mCherry-tagged and TAP (Strep-Strep-FLAG)-/FLAG-tagged uORF variants in uninfected and infected (ZIKV American isolate PE243 and ZIKV African isolate Dak84) mammalian cells and performed subcellular fractionation (Fig. 5A and Supplementary Fig. 13). As observed in Supplementary Fig. 13C, D, uORF2- and African uORF-encoded proteins were mainly present in the cytoplasm, and there was no relocalisation upon infection. However, uORF1, tagged with mCherry or TAP, and in two different cell lines (Vero and U251 cells), appeared not only in the cytoplasm but also in the cytoskeletal fraction (Fig. 5A and Supplementary Fig. 13E–H). Note that uORF1-TAP also appeared in the nuclear fraction, but this was probably due to passive diffusion of this small protein through the nuclear membrane. To discern the specific cytoskeletal target of uORF1, we performed confocal analysis with different proteins marking the three types of cytoskeletal filaments: actin for microfilaments, tubulin for microtubules and vimentin for intermediate filaments (Fig. 5B). Whereas the fluorescence intensity profiles of actin and tubulin with uORF1-TAP did not merge (Fig. 5B, right panel), it was found that uORF1-TAP could form denser granular structures that co-aligned with an abnormal ‘collapsed’ vimentin (Fig. 5B), and this was not observed in mock-transfected cells. Subsequently, we investigated how uORF1 could trigger vimentin rearrangement. AlphaFold2 analysis29 suggested that the uORF1 peptide adopts an alpha-helical conformation with high confidence (Fig. 5C). 
To test whether this secondary structure was involved in the collapse of vimentin, we created a mutant version of uORF1-TAP with a helix-destabilising proline residue inserted into the middle of the α-helix (I14P, uORF1-TAP-1X Pro). As observed in Fig. 5D, uORF1-TAP-1X Pro was unable to form granular structures in the cytoplasm in comparison with the unmodified uORF1-TAP (Fig. 5B). In addition, the fluorescence intensity profiles of vimentin and this mutant uORF1 did not co-align (Fig. 5D, bottom panel). This suggests that the helical structure of the uORF1-encoded peptide might be responsible for the collapse of vimentin in transfected cells.

We went on to investigate whether the African uORF, which comprises the American uORF1 and uORF2, was also able to interact with intermediate filaments. Based on AlphaFold2 (Supplementary Fig. 14A), the uORF2-encoded peptide is likely to be intrinsically disordered, but in the African uORF, the N terminus was predicted, albeit with low confidence, to have an α-helical structure similar to uORF1. Plasmids encoding these proteins were transfected, and confocal analysis with different proteins marking cytoskeletal filaments was performed as described before. As observed in Supplementary Fig. 14B, the uORF2-encoded peptide localised in the cytoplasm and did not colocalise with any cytoskeletal marker, whereas the African uORF-encoded protein (Supplementary Fig. 14C) had a more granular perinuclear pattern and partially co-aligned with vimentin, similarly to uORF1 peptide, although in this case, intermediate filaments did not collapse.

ZIKV uORF1-encoded protein helps in the formation of the cytoskeletal cage during infection

ZIKV infection reorganises microtubules and intermediate filaments to form a cytoskeletal cage surrounding viral factories30,31. These factories are the sites of viral RNA replication and virion assembly, and cytoskeletal remodelling might partly contribute to concentrating these processes in a confined environment for high viral replication efficiency30. To test specificity, the cytoskeletal phenotypes of the uORF1-KO and African-like mutant viruses were compared to that of the American WT.

Vero cells previously synchronised in the G0/G1 phase were infected (MOI:3) with each virus, fixed at 18 and 24 h p.i. and stained for viral E protein and vimentin (Fig. 6A). At 18 h p.i., vimentin was perinuclear in all cells and radiated toward the cell periphery with a filamentous structure, although a denser arrangement was observed in infected cells (Fig. 6A, upper panels). At 24 h p.i., the cytoskeletal cage, as defined by the collapse of vimentin, was visible in American WT-infected cells (Fig. 6A, lower panels), but vimentin mostly remained in a perinuclear distribution in uORF1-KO- and African-like-infected cells. The compact aggregation of vimentin together with viral protein near the nucleus did not have a notable effect on cell size or overall morphology (Fig. 6A, brightfield panels). The fluorescence intensity profiles (Fig. 6A, right panels) indicate an accumulation of vimentin on one side of the nucleus in infected cells, which is more pronounced at 24 h p.i. An almost perfect co-alignment between the E protein and vimentin was observed in American WT-infected cells at this time point. However, this was not the case with the African-like and the uORF1-KO mutant viruses. We quantified the collapse of vimentin in infected Vero cells by measuring the area occupied by vimentin in relation to the total cell area at an early (18 h p.i.) and a later (24 h p.i.) time point, as previously published30. Thirty cells per condition were quantified in three different experiments (Fig. 6B). The area occupied by vimentin was reduced from 68% to 34% in American WT-infected cells, from 68% to 46% in African-like-infected cells, and only slightly (from 71% to 66%) in cells infected with the uORF1-KO mutant virus. A similar pattern was observed in infected U251 cells (Fig. 6C and Supplementary Fig. 15). Due to slower viral replication in these cells, vimentin was quantified at later time points (20 and 28 h p.i.).
These data indicate that expression of the uORF1 polypeptide likely contributes to the collapse of vimentin and thus to the formation of the cytoskeletal cage in infected cells. They also suggest that the African-like 5′ UTR arrangement is less efficient than the American WT at promoting the cytoskeletal cage, probably because of its reduced ability to collapse vimentin, as described above.
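The vimentin measurement reduces, per cell, to a ratio of two pixel counts. The following is a minimal sketch of that ratio, assuming each cell has already been segmented into two boolean masks; the function name and the mask inputs are hypothetical, and the segmentation and thresholding steps used in the study are not reproduced here.

```python
def vimentin_area_fraction(vimentin_mask, cell_mask):
    """Percentage of the cell area occupied by vimentin signal.

    Both arguments are hypothetical 2D lists of booleans with the same shape:
    True marks a pixel inside the vimentin signal or inside the cell outline.
    Vimentin pixels outside the cell outline are ignored.
    """
    cell_px = 0
    vim_px = 0
    for vim_row, cell_row in zip(vimentin_mask, cell_mask):
        for v, c in zip(vim_row, cell_row):
            if c:
                cell_px += 1
                if v:
                    vim_px += 1
    if cell_px == 0:
        raise ValueError("cell mask is empty")
    return 100.0 * vim_px / cell_px


# Toy example: an 8-pixel cell in which 3 pixels carry vimentin signal.
cell = [[True, True, True, True],
        [True, True, True, True]]
vimentin = [[True, True, False, False],
            [True, False, False, False]]
print(vimentin_area_fraction(vimentin, cell))
```

Averaging this value over the quantified cells at each time point would give per-condition percentages of the kind reported in Fig. 6B.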

ZIKV uORFs are involved in the infection of human cortical neurons and cerebral organoids

To assess whether expression of the different uORFs might influence the capacity of ZIKV to infect neural cells, 2D human induced pluripotent stem cell (hiPSC)-derived cortical neuronal mono-cultures and human 3D cerebral organoid slice cultures were infected with either American WT, uORF1-KO or the African-like ZIKV. Differentiated hiPSC-derived glutamatergic cortical neurons (CNs) (i3Neurons32) were infected at high MOI (10 PFU/cell based on titre in Vero cells) for 4 days. Released virions in the supernatant were quantified; from 48 h p.i. onwards, mutant virus titres were significantly lower than those of the American WT, reaching a 1.5- to 2-log10 reduction at 96 h p.i. (Fig. 7A), in sharp contrast to their increased replication in U251 cells (Fig. 4C). This was further confirmed by immunolabelling and quantification of ZIKV E protein (Fig. 7B and Supplementary Fig. 16A, B), which showed that infection with the African-like and the uORF1-KO mutant viruses was largely limited to single cells, suggesting little to no viral spread. Next, we used cerebral organoids to investigate infection in a human brain-like 3D tissue environment and to explore whether the African uORF or uORF1 is also required for the infectivity of other cell types. To do so, we infected cerebral organoid slices grown at the air–liquid interface (ALI-COs), which recapitulate cortical cell-type diversity, layering and neurodevelopmental milestones33,34. After 82 days in vitro (DIV), reflecting the first trimester of gestation, ALI-COs were infected with the American WT, the uORF1-KO and the African-like viruses at MOI 5. Virus inoculum was removed after 24 h and the ALI-COs were grown for a further 7 days before being fixed. Tissue sections (6 μm thick) taken at six different Z-axis positions in the ALI-COs were stained for viral E protein, and positive cells were quantified in relation to the total number of nuclei as previously described35 for cerebral organoids infected with SARS-CoV-2.
Approximately 47% of cells were positive for the E protein in the American WT virus infection experiment. This was significantly reduced in the African-like and uORF1-KO virus infection experiments, to ~33% and ~11%, respectively (Fig. 7C and Supplementary Table 2). This was further confirmed by IF (Fig. 7D–F and Supplementary Fig. 16), corroborating our findings on the involvement of the ZIKV 5′ UTR in the infection of human brain cells.
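Two simple quantities underlie the comparisons in this section: a difference in titre expressed in log10 units, and the percentage of E-positive cells among all nuclei. The sketch below illustrates both calculations; the function names and the numbers in the example are hypothetical and are not taken from the study's data.

```python
import math


def log10_reduction(reference_titre, mutant_titre):
    """Drop in titre of the mutant relative to the reference, in log10 units."""
    return math.log10(reference_titre) - math.log10(mutant_titre)


def percent_positive(positive_cells, total_nuclei):
    """Percentage of nuclei belonging to E-positive cells."""
    if total_nuclei == 0:
        raise ValueError("no nuclei counted")
    return 100.0 * positive_cells / total_nuclei


# Hypothetical titres: a 100-fold lower mutant titre is a 2-log10 reduction.
print(log10_reduction(1e6, 1e4))
# Hypothetical counts from a single image.
print(percent_positive(210, 450))
```

On this scale, a 1.5-log10 reduction corresponds to a roughly 32-fold lower titre and a 2-log10 reduction to a 100-fold lower titre.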

Fig. 7: ZIKV uORFs are involved in neural cell infection.

A Time-course of i3Neurons infected with ZIKV (MOI 10) for 96 h. TCID50 assays were performed with serial dilutions of the supernatant to measure virion release. Values show the mean titres from five biological replicates. Error bars represent standard errors. Statistical analysis was repeated measures two-way ANOVA on the log-transformed data. All p values are from comparisons of the mutant virus with the American WT at that specific time point. B Representative confocal images of i3Neurons infected with the American WT, the African-like and the uORF1-KO viruses (MOI:10) for 96 h. Cells were stained with antibodies against the viral E protein (green) and the mature neuron marker MAP2 (magenta). Nuclei were counter-stained with DAPI (blue). Images represent the maximum projection of a Z-stack. Scale bars, 25 μm. C Percentage of E+ cells in relation to total number of nuclei in ALI-COs infected with the American WT, the African-like and the uORF1-KO viruses (MOI:5) for 7 days. Thirty-three images per virus type at 20× magnification (~400–500 nuclei/image) were quantified for E-positive staining. Error bars represent standard errors. Statistical analysis was one-way ANOVA assuming a Gaussian distribution but not equal variances between the populations being compared. All p values are from comparisons of the mutant virus with the American WT. D–F Representative confocal images of infected ALI-COs, showing immunoreactivity for the viral E protein (green) and for different cellular markers (in magenta): nestin (D), GFAP (E) and MAP2 (F). Nuclei were counter-stained with DAPI (blue). Images represent the maximum projection of a Z-stack. Scale bars, 25 μm. G Percentage of E+ cells that are positive for nestin, GFAP and MAP2. Eleven images per virus type at 20× magnification (~400–500 nuclei/image) were quantified. Statistical analysis was performed as in (C). Four ALI-CO slices derived from two independent cerebral organoids were analysed for (C) and (G).
The number of fluorescent cells corresponding to each staining was measured with ImageJ software by splitting the different channels.

To examine the differential neural cell tropism of these mutant viruses, cryostat sections of ALI-COs were immunostained for a panel of cellular markers, including nestin (a marker for neural progenitor cells, NPCs), GFAP (a marker for the astroglial lineage, such as radial glial cells, glial progenitors and astrocytes), and MAP2 (a marker for mature neurons). As shown in Fig. 7D–G, ZIKV preferentially infected nestin- or GFAP-positive cells, whereas very few MAP2-positive cells were also positive for E protein. A similar infectivity pattern was seen with all viruses tested (~36% nestin+, ~30% GFAP+ and ~4% MAP2+, Supplementary Table 3), suggesting that the modifications to the ZIKV 5′ UTR do not affect neural cell tropism.

Together, these data indicate that the expression of ZIKV uORF1 promotes infection of NPCs and precursors of the astroglial lineage in cerebral organoids. These results also demonstrate that the African 5′ UTR arrangement impairs ZIKV infection of the developing brain, although to a lesser extent.

No detectable effect of ZIKV uORFs in the mosquito vector

ZIKV typically cycles between humans and Aedes mosquitoes; hence we also tested whether the uORFs were expressed during the infection of cells derived from the mosquito vector. Ribo-Seq analysis of Aedes albopictus (Ae. albopictus) C6/36 cells infected with the ZIKV American isolate PE243 for 24 h suggested that both uORFs were occupied by ribosomes during infection (Fig. 8A). We then compared the transmission dynamics of American WT, African-like and uORF1-KO viruses, derived from infectious clones, in female Aedes aegypti (Ae. aegypti). Mosquitoes were exposed to an artificial, infectious blood meal containing an expected virus titre of 2 × 10⁶ PFU/mL of blood. Actual titres ranged from 0.8 × 10⁶ to 3.6 × 10⁶ PFU/mL across the different experiments, and this uncontrolled variation was accounted for in the statistical analysis. At several defined time points after the infectious blood meal, we determined the rates of mosquito infection and systemic viral dissemination by RT-PCR and detected the presence of infectious ZIKV in saliva by focus-forming assay. We calculated infection prevalence as the proportion of blood-fed mosquitoes with a body infection (determined by RT-PCR, Fig. 8B, left panel), dissemination prevalence as the proportion of infected mosquitoes with a virus-positive head (determined by RT-PCR, Fig. 8B, middle panel) and transmission prevalence as the proportion of mosquitoes with a disseminated infection that had infectious saliva (determined by focus-forming assay, Fig. 8B, right panel), as previously described11,36. A total of 99, 128 and 234 individual mosquitoes were analysed in the first, second and third experiments, respectively. In each experiment, mosquitoes were collected at the same time points for all groups. In the first experiment, mosquitoes were collected on days 7, 14 and 21 post-exposure.
In the two following experiments, the collection time points were changed to days 7, 10, 14 and 17 to gather more informative data based on the results of the first experiment. The subsequent analysis combined the three experiments in a well-established statistical framework (logistic regression) that accounts for differences in sample size and does not require a full-factorial design (i.e., all time points need not be present in all experiments). The number of mosquitoes for each time point and each experiment is provided in Supplementary Table 4.
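The three prevalence measures are nested: each uses the positives of the previous step as its denominator. The sketch below makes that explicit for a hypothetical list of per-mosquito records. The record keys and function names are illustrative, and the Wilson score interval is an assumed choice for the 95% confidence interval of a proportion, since the interval method is not specified here. It does not reproduce the logistic regression used for the combined analysis.

```python
import math


def wilson_interval(successes, total, z=1.96):
    """Approximate 95% Wilson score interval for a proportion (assumed method)."""
    if total == 0:
        return None
    p = successes / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return (centre - half, centre + half)


def prevalences(mosquitoes):
    """Return (numerator, denominator) for each nested prevalence.

    `mosquitoes` is a hypothetical list of dicts with boolean keys
    'body', 'head' and 'saliva', one dict per blood-fed mosquito.
    """
    infected = [m for m in mosquitoes if m["body"]]
    disseminated = [m for m in infected if m["head"]]
    transmitting = [m for m in disseminated if m["saliva"]]
    return {
        "infection": (len(infected), len(mosquitoes)),
        "dissemination": (len(disseminated), len(infected)),
        "transmission": (len(transmitting), len(disseminated)),
    }


# Toy group: 10 blood-fed mosquitoes, 9 infected, 6 disseminated, 3 transmitting.
group = (
    [{"body": True, "head": True, "saliva": True}] * 3
    + [{"body": True, "head": True, "saliva": False}] * 3
    + [{"body": True, "head": False, "saliva": False}] * 3
    + [{"body": False, "head": False, "saliva": False}]
)
for name, (k, n) in prevalences(group).items():
    print(name, k, n, wilson_interval(k, n))
```

Because the denominators shrink at each step, the confidence intervals for transmission prevalence are typically the widest of the three.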

Fig. 8: ZIKV mutant viruses display similar transmission dynamics in mosquitoes in vivo.

A (left panel) Ribo-Seq reads in the 5′ region of the ZIKV genome at 24 h p.i. of CHX-treated C6/36 cells infected with PE243 (MOI:3). Histograms show the positions of the 5′ ends of reads with a +12 nt offset to map the approximate P-site as described in Fig. 2A–C. A (right panel) Bar charts of the percentage of RPFs translated in each frame in relation to the main ORF as described in Fig. 2D–F. B Prevalence of ZIKV infection (left), dissemination (middle) and transmission (right) over time in mosquitoes exposed to an infectious blood meal containing 2 × 10⁶ PFU/mL of virus. Infection prevalence is the proportion of blood-fed mosquitoes with a body infection (determined by RT-PCR), dissemination prevalence is the proportion of infected mosquitoes with a virus-positive head (determined by RT-PCR), and transmission prevalence is the proportion of mosquitoes with a disseminated infection and infectious saliva (determined by focus-forming assay). The data represent three separate experiments combined, colour-coded by different ZIKV mutants (total number of Ae. aegypti mosquitoes per experiment and time points are indicated in Supplementary Table 4). The size of the data points is proportionate to the number of mosquitoes tested, and the lines are the logistic fits of the time effect for each type of virus (ignoring the experiment effect on transmission prevalence in the visual representation). The vertical error bars are the 95% confidence intervals of the proportions.
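The P-site mapping and frame assignment mentioned in panel A can be illustrated with a short sketch. It assumes 1-based genome coordinates for the 5′ ends of reads and a known position for the first nucleotide of the main ORF start codon; the function names and the example coordinates are hypothetical and do not correspond to actual ZIKV positions.

```python
def p_site_frame(read_5p_end, main_orf_start, offset=12):
    """Frame (0, 1 or 2) of a read's approximate P-site relative to the main ORF.

    The P-site is approximated as the read's 5' end plus a fixed offset
    (+12 nt, as stated in the legend). Frame 0 is the main ORF frame.
    Python's % always returns a non-negative result here, so reads
    upstream of the main ORF start are handled too.
    """
    p_site = read_5p_end + offset
    return (p_site - main_orf_start) % 3


def frame_percentages(read_5p_ends, main_orf_start, offset=12):
    """Percentage of reads whose approximate P-site falls in each frame."""
    counts = [0, 0, 0]
    for pos in read_5p_ends:
        counts[p_site_frame(pos, main_orf_start, offset)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0, 0.0, 0.0]
    return [100.0 * c / total for c in counts]


# Hypothetical reads and a hypothetical main ORF start at position 100.
print(frame_percentages([88, 88, 89, 91], main_orf_start=100))
```

A uORF translated in a frame other than that of the main ORF would appear as an enrichment of reads in frame 1 or 2 over the corresponding region.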

The vast majority of mosquitoes became infected irrespective of the experiment, virus type and time point (mean: 90.5%; median: 90.9%; range: 66.7%–100%), and none of these variables had a statistically significant effect on infection prevalence (Supplementary Table 5). Both dissemination and transmission prevalence significantly increased over time (Fig. 8B), but there was no detectable difference between the different ZIKV mutant viruses (Supplementary Table 5). We conclude that the ZIKV mutant viruses have similar transmission dynamics in mosquitoes regardless of their 5′-UTR arrangement.