Publications of the Genome Analysis team (Linda Sperling)


  • O. Arnaiz, E. Van Dijk, M. Bétermier, M. Lhuillier-Akakpo, A. de Vanssay, S. Duharcourt, E. Sallet, J. Gouzy, and L. Sperling, "Improved methods and resources for paramecium genomics: transcription units, gene annotation and gene expression", BMC Genomics, vol. 18, no. 1, p. 483, 2017.
    Abstract: BACKGROUND: The 15 sibling species of the Paramecium aurelia cryptic species complex emerged after a whole genome duplication that occurred tens of millions of years ago. Given extensive knowledge of the genetics and epigenetics of Paramecium acquired over the last century, this species complex offers a uniquely powerful system to investigate the consequences of whole genome duplication in a unicellular eukaryote as well as the genetic and epigenetic mechanisms that drive speciation. High quality Paramecium gene models are important for research using this system. The major aim of the work reported here was to build an improved gene annotation pipeline for the Paramecium lineage. RESULTS: We generated oriented RNA-Seq transcriptome data across the sexual process of autogamy for the model species Paramecium tetraurelia. We determined, for the first time in a ciliate, candidate P. tetraurelia transcription start sites using an adapted Cap-Seq protocol. We developed TrUC, multi-threaded Perl software that, in conjunction with TopHat mapping of RNA-Seq data to a reference genome, predicts transcription units for the annotation pipeline. We used EuGene software to combine annotation evidence. The high quality gene structural annotations obtained for P. tetraurelia were used as evidence to improve published annotations for 3 other Paramecium species. The RNA-Seq data were also used for differential gene expression analysis, providing a gene expression atlas that is more sensitive than the previously established microarray resource. CONCLUSIONS: We have developed a gene annotation pipeline tailored for the compact genomes and tiny introns of Paramecium species. A novel component of this pipeline, TrUC, predicts transcription units using Cap-Seq and oriented RNA-Seq data. TrUC could prove useful beyond Paramecium, especially in the case of high gene density. Accurate predictions of 3' and 5' UTR will be particularly valuable for studies of gene expression (e.g. nucleosome positioning, identification of cis regulatory motifs). The P. tetraurelia improved transcriptome resource, gene annotations for P. tetraurelia, P. biaurelia, P. sexaurelia and P. caudatum, and Paramecium-trained EuGene configuration are available through ParameciumDB. TrUC software is freely distributed under a GNU GPL v3 licence.
    Keywords: ANGE, Autogamy, Cap-Seq, Ciliate, DBG, Differential gene expression, Gene annotation, MICMAC, RNA-Seq, TSS.

  • J. Gruchota, C. Denby Wilkes, O. Arnaiz, L. Sperling, and J. K. Nowak, "A meiosis-specific Spt5 homolog involved in non-coding transcription", Nucleic Acids Research, 2017.
    Abstract: Spt5 is a conserved and essential transcriptional regulator that binds directly to RNA polymerase and is involved in transcription elongation, polymerase pausing and various co-transcriptional processes. To investigate the role of Spt5 in non-coding transcription, we used the unicellular model Paramecium tetraurelia. In this ciliate, development is controlled by epigenetic mechanisms that use different classes of non-coding RNAs to target DNA elimination. We identified two SPT5 genes. One (SPT5v) is involved in vegetative growth, while the other (SPT5m) is essential for sexual reproduction. We focused our study on SPT5m, expressed at meiosis and associated with germline nuclei during sexual processes. Upon Spt5m depletion, we observed absence of scnRNAs, piRNA-like 25 nt small RNAs produced at meiosis. The scnRNAs are a temporal copy of the germline genome and play a key role in programming DNA elimination. Moreover, Spt5m depletion abolishes elimination of all germline-limited sequences, including sequences whose excision was previously shown to be scnRNA-independent. This suggests that in addition to scnRNA production, Spt5 is involved in setting some as yet uncharacterized epigenetic information at meiosis. Our study establishes that Spt5m is crucial for developmental genome rearrangements and necessary for scnRNA production.
    Keywords: ANGE, DBG.

  • F. Guérin, O. Arnaiz, N. Boggetto, C. Denby Wilkes, E. Meyer, L. Sperling, and S. Duharcourt, "Flow cytometry sorting of nuclei enables the first global characterization of Paramecium germline DNA and transposable elements", BMC Genomics, vol. 18, no. 1, p. 327, 2017.
    Abstract: BACKGROUND: DNA elimination is developmentally programmed in a wide variety of eukaryotes, including unicellular ciliates, and leads to the generation of distinct germline and somatic genomes. The ciliate Paramecium tetraurelia harbors two types of nuclei with different functions and genome structures. The transcriptionally inactive micronucleus contains the complete germline genome, while the somatic macronucleus contains a reduced genome streamlined for gene expression. During development of the somatic macronucleus, the germline genome undergoes massive and reproducible DNA elimination events. Availability of both the somatic and germline genomes is essential to examine the genome changes that occur during programmed DNA elimination and ultimately decipher the mechanisms underlying the specific removal of germline-limited sequences. RESULTS: We developed a novel experimental approach that uses flow cell imaging and flow cytometry to sort subpopulations of nuclei to high purity. We sorted vegetative micronuclei and macronuclei during development of P. tetraurelia. We validated the method by flow cell imaging and by high throughput DNA sequencing. Our work establishes the proof of principle that developing somatic macronuclei can be sorted from a complex biological sample to high purity based on their size, shape and DNA content. This method enabled us to sequence, for the first time, the germline DNA from pure micronuclei and to identify novel transposable elements. Sequencing the germline DNA confirms that the Pgm domesticated transposase is required for the excision of all ~45,000 Internal Eliminated Sequences. Comparison of the germline DNA and unrearranged DNA obtained from PGM-silenced cells reveals that the latter does not provide a faithful representation of the germline genome. CONCLUSIONS: We developed a flow cytometry-based method to purify P. tetraurelia nuclei to high purity and provided quality control with flow cell imaging and high throughput DNA sequencing. We identified 61 germline transposable elements including the first Paramecium retrotransposons. This approach paves the way to sequence the germline genomes of P. aurelia sibling species for future comparative genomic studies.
    Keywords: ANGE, DBG, Flow Cytometry, High throughput sequencing, ITm DNA transposons, Non-LTR retrotransposons, Programmed DNA elimination.

  • L. Shi, K. France, O. Arnaiz, and J. Cohen, "The Ciliary Protein IFT57 in the Macronucleus of Paramecium", The Journal of Eukaryotic Microbiology, 2017.
    Abstract: The intraflagellar transport IFT57 protein is essential for ciliary growth and maintenance. Also known as HIPPI, human IFT57 can be translocated to the nucleus via a molecular partner of the Huntingtin, Hip1, inducing gene expression changes. In Paramecium tetraurelia, we identified four IFT57 genes forming two subfamilies IFT57A/B and IFT57C/D arising from whole genome duplications. The depletion of proteins of the two subfamilies induced ciliary defects and IFT57A and IFT57C localized in basal bodies and cilia. We observed that IFT57A, but not IFT57C, is also present in the macronucleus and able to traffic toward the developing anlage during autogamy. Analysis of chimeric IFT57A-IFT57C-GFP-tagged proteins allowed us to identify a region of IFT57A necessary for nuclear localization. We studied the localization of the unique IFT57 protein of Paramecium caudatum, a species that diverged from Paramecium tetraurelia before the whole genome duplications. The Paramecium caudatum IFT57C protein was excluded from the nucleus. We also analyzed whether the overexpression of IFT57A in Paramecium could affect gene transcription as the human protein does in HeLa cells. The expression of some genes was indeed affected by overexpression of IFT57A, but the set of affected genes poorly overlaps the set of genes affected in human cells.
    Keywords: ANGE, BIOCELL, BIOCIL, cilia, DBG, IFT57/HIPPI, intraflagellar transport (IFT), Macronucleus, Paramecium.


  • C. Denby Wilkes, O. Arnaiz, and L. Sperling, "ParTIES: a toolbox for Paramecium interspersed DNA elimination studies", Bioinformatics (Oxford, England), vol. 32, no. 4, p. 599-601, 2016.
    Abstract: MOTIVATION: Developmental DNA elimination occurs in a wide variety of multicellular organisms, but ciliates are the only single-celled eukaryotes in which this phenomenon has been reported. Despite considerable interest in ciliates as models for DNA elimination, no standard methods for identification and characterization of the eliminated sequences are currently available. RESULTS: We present the Paramecium Toolbox for Interspersed DNA Elimination Studies (ParTIES), designed for Paramecium species, that (i) identifies eliminated sequences, (ii) measures their presence in a sequencing sample and (iii) detects rare elimination polymorphisms. AVAILABILITY AND IMPLEMENTATION: ParTIES is multi-threaded Perl software distributed under the GNU General Public Licence v3.
    Keywords: ANGE, Ciliophora Infections, DBG, DNA, Protozoan, Genome, Protozoan, Interspersed Repetitive Sequences, Paramecium, Protozoan Proteins, Software.


  • Q. Carradec, U. Götz, O. Arnaiz, J. Pouch, M. Simon, E. Meyer, and S. Marker, "Primary and secondary siRNA synthesis triggered by RNAs from food bacteria in the ciliate Paramecium tetraurelia", Nucleic Acids Research, vol. 43, no. 3, p. 1818-1833, 2015.
    Abstract: In various organisms, an efficient RNAi response can be triggered by feeding cells with bacteria producing double-stranded RNA (dsRNA) against an endogenous gene. However, the detailed mechanisms and natural functions of this pathway are not well understood in most cases. Here, we studied siRNA biogenesis from exogenous RNA and its genetic overlap with endogenous RNAi in the ciliate Paramecium tetraurelia by high-throughput sequencing. Using wild-type and mutant strains deficient for dsRNA feeding, we found that high levels of primary siRNAs of both strands are processed from the ingested dsRNA trigger by the Dicer Dcr1, the RNA-dependent RNA polymerases Rdr1 and Rdr2 and other factors. We further show that this induces the synthesis of secondary siRNAs spreading along the entire endogenous mRNA, demonstrating the occurrence of both 3'-to-5' and 5'-to-3' transitivity for the first time in the SAR clade of eukaryotes (Stramenopiles, Alveolates, Rhizaria). Secondary siRNAs depend on Rdr2 and show a strong antisense bias; they are produced at much lower levels than primary siRNAs and hardly contribute to RNAi efficiency. We further provide evidence that the Paramecium RNAi machinery also processes single-stranded RNAs from its bacterial food, broadening the possible natural functions of exogenously induced RNAi in this organism.
    Keywords: ANGE, DBG, Food Microbiology, High-Throughput Nucleotide Sequencing, Paramecium tetraurelia, RNA Interference, RNA, Bacterial, RNA, Small Interfering.

  • K. Maliszewska-Olejniczak, J. Gruchota, R. Gromadka, C. Denby Wilkes, O. Arnaiz, N. Mathy, S. Duharcourt, M. Bétermier, and J. K. Nowak, "TFIIS-Dependent Non-coding Transcription Regulates Developmental Genome Rearrangements", PLoS Genetics, vol. 11, no. 7, p. e1005383, 2015.
    Abstract: Because of their nuclear dimorphism, ciliates provide a unique opportunity to study the role of non-coding RNAs (ncRNAs) in the communication between germline and somatic lineages. In these unicellular eukaryotes, a new somatic nucleus develops at each sexual cycle from a copy of the zygotic (germline) nucleus, while the old somatic nucleus degenerates. In the ciliate Paramecium tetraurelia, the genome is massively rearranged during this process through the reproducible elimination of repeated sequences and the precise excision of over 45,000 short, single-copy Internal Eliminated Sequences (IESs). Different types of ncRNAs resulting from genome-wide transcription were shown to be involved in the epigenetic regulation of genome rearrangements. To understand how ncRNAs are produced from the entire genome, we have focused on a homolog of the TFIIS elongation factor, which regulates RNA polymerase II transcriptional pausing. Six TFIIS paralogs, representing four distinct families, can be found in the P. tetraurelia genome. Using RNA interference, we showed that TFIIS4, which encodes a development-specific TFIIS protein, is essential for the formation of a functional somatic genome. Molecular analyses and high-throughput DNA sequencing upon TFIIS4 RNAi demonstrated that TFIIS4 is involved in all kinds of genome rearrangements, including excision of ~48% of IESs. Localization of a GFP-TFIIS4 fusion revealed that TFIIS4 appears specifically in the new somatic nucleus at an early developmental stage, before IES excision. RT-PCR experiments showed that TFIIS4 is necessary for the synthesis of IES-containing non-coding transcripts. We propose that these IES+ transcripts originate from the developing somatic nucleus and serve as pairing substrates for germline-specific short RNAs that target elimination of their homologous sequences. Our study, therefore, connects the onset of zygotic non-coding transcription to the control of genome plasticity in Paramecium, and establishes for the first time a specific role of TFIIS in non-coding transcription in eukaryotes.
    Keywords: ANGE, Cell Lineage, DBG, Genome, germ cells, High-Throughput Nucleotide Sequencing, MICMAC, Paramecium tetraurelia, RNA Polymerase II, RNA, Long Noncoding, Transcription, Genetic, Transcriptional Elongation Factors.

  • D. Smedley, S. Haider, S. Durinck, L. Pandini, P. Provero, J. Allen, O. Arnaiz, M. H. Awedh, R. Baldock, G. Barbiera, P. Bardou, T. Beck, A. Blake, M. Bonierbale, A. J. Brookes, G. Bucci, I. Buetti, S. Burge, C. Cabau, J. W. Carlson, C. Chelala, C. Chrysostomou, D. Cittaro, O. Collin, R. Cordova, R. J. Cutts, E. Dassi, A. Di Genova, A. Djari, A. Esposito, H. Estrella, E. Eyras, J. Fernandez-Banet, S. Forbes, R. C. Free, T. Fujisawa, E. Gadaleta, J. M. Garcia-Manteiga, D. Goodstein, K. Gray, J. A. Guerra-Assunção, B. Haggarty, D.-J. Han, B. W. Han, T. Harris, J. Harshbarger, R. K. Hastings, R. D. Hayes, C. Hoede, S. Hu, Z.-L. Hu, L. Hutchins, Z. Kan, H. Kawaji, A. Keliet, A. Kerhornou, S. Kim, R. Kinsella, C. Klopp, L. Kong, D. Lawson, D. Lazarevic, J.-H. Lee, T. Letellier, C.-Y. Li, P. Lio, C.-J. Liu, J. Luo, A. Maass, J. Mariette, T. Maurel, S. Merella, A. M. Mohamed, F. Moreews, I. Nabihoudine, N. Ndegwa, C. Noirot, C. Perez-Llamas, M. Primig, A. Quattrone, H. Quesneville, D. Rambaldi, J. Reecy, M. Riba, S. Rosanoff, A. A. Saddiq, E. Salas, O. Sallou, R. Shepherd, R. Simon, L. Sperling, W. Spooner, D. M. Staines, D. Steinbach, K. Stone, E. Stupka, J. W. Teague, A. Z. Dayem Ullah, J. Wang, D. Ware, M. Wong-Erasmus, K. Youens-Clark, A. Zadissa, S.-J. Zhang, and A. Kasprzyk, "The BioMart community portal: an innovative alternative to large, centralized data repositories", Nucleic Acids Research, vol. 43, no. W1, p. W589-598, 2015.
    Abstract: The BioMart Community Portal is a community-driven effort to provide a unified interface to biomedical databases that are distributed worldwide. The portal provides access to numerous database projects supported by 30 scientific organizations. It includes over 800 different biological datasets spanning genomics, proteomics, model organisms, cancer data, ontology information and more. All resources available through the portal are independently administered and funded by their host organizations. The BioMart data federation technology provides a unified interface to all the available data. The latest version of the portal comes with many new databases that have been created by our ever-growing community. It also comes with better support and extensibility for data analysis and visualization tools. A new addition to our toolbox, the enrichment analysis tool is now accessible through graphical and web service interface. The BioMart community portal averages over one million requests per day. Building on this level of service and the wealth of information that has become available, the BioMart Community Portal has introduced a new, more scalable and cheaper alternative to the large data stores maintained by specialized organizations.
    Keywords: ANGE, Database Management Systems, DBG, Genomics, Humans, Internet, Neoplasms, proteomics.

Publications by topic before 2015

Paramecium genome

- Singh DP, Saudemont B, Guglielmi G, Arnaiz O, Goût J-F, Prajer M, Potekhin A, Przybòs E, Aubusson-Fleury A, Bhullar S, Bouhouche K, Lhuillier-Akakpo M, Tanty V, Blugeon C, Alberti A, Labadie K, Aury J-M, Sperling L, Duharcourt S & Meyer E (2014) Genome-defence small RNAs exapted for epigenetic mating-type inheritance. Nature 509: 447–452

- Arnaiz O, Mathy N, Baudry C, Malinsky S, Aury J-M, Denby Wilkes C, Garnier O, Labadie K, Lauderdale BE, Le Mouël A, Marmignon A, Nowacki M, Poulain J, Prajer M, Wincker P, Meyer E, Duharcourt S, Duret L, Bétermier M & Sperling L (2012) The Paramecium Germline Genome Provides a Niche for Intragenic Parasitic DNA: Evolutionary Dynamics of Internal Eliminated Sequences. PLoS Genet. 8: e1002984

- Arnaiz O & Sperling L (2011) ParameciumDB in 2011: new tools and new data for functional and comparative genomics of the model ciliate Paramecium tetraurelia. Nucleic Acids Res. 39: D632–636

- Jaillon O, Bouhouche K, Gout J-F, Aury J-M, Noel B, Saudemont B, Nowacki M, Serrano V, Porcel BM, Ségurens B, Le Mouël A, Lepère G, Schächter V, Bétermier M, Cohen J, Wincker P, Sperling L, Duret L & Meyer E (2008) Translational control of intron splicing in eukaryotes. Nature 451: 359–62

- Duret L, Cohen J, Jubin C, Dessen P, Goût J-F, Mousset S, Aury J-M, Jaillon O, Noël B, Arnaiz O, Bétermier M, Wincker P, Meyer E & Sperling L (2008) Analysis of sequence variability in the macronuclear DNA of Paramecium tetraurelia: a somatic view of the germline. Genome Res. 18: 585–96

- Aury J-M, Jaillon O, Duret L, Noel B, Jubin C, Porcel BM, Ségurens B, Daubin V, Anthouard V, Aiach N, Arnaiz O, Billaut A, Beisson J, Blanc I, Bouhouche K, Câmara F, Duharcourt S, Guigo R, ... Weissenbach J, Scarpelli C, Schachter V, Sperling L, Meyer E, Cohen J & Wincker P (2006) Global trends of whole-genome duplications revealed by the ciliate Paramecium tetraurelia. Nature 444: 171–8

Animal genomes

- Audit B, Baker A, Chen C-L, Rappailles A, Guilbaud G, Julienne H, Arach G, d’Aubenton-Carafa Y, Hyrien O, Thermes C & Arneodo A (2013) Multiscale analysis of genome-wide replication timing profiles using a wavelet-based signal-processing algorithm. Nat. Protoc. 8: 98–110

- Baker A, Audit B, Chen C-L, Moindrot B, Leleu A, Rappailles A, Guilbaud G, Vaillant C, Arach G, Mongelard F, d’Aubenton-Carafa Y, Hyrien O, Thermes C & Arneodo A (2012) Replication fork polarity gradients revealed by megabase-sized U-shaped replication timing domains in human cell lines. PLoS Comput. Biol. 8: e1002443

- Chen C-L, Duquenne L, Audit B, Guilbaud G, Rappailles A, Baker A, Huvet M, d’Aubenton-Carafa Y, Hyrien O, Arneodo A & Thermes C (2011) Replication-associated mutational asymmetry in the human genome. Mol. Biol. Evol. 28: 2327–2337

- Van Dijk EL, Chen C-L, d’Aubenton-Carafa Y, Gourvennec S, Kwapisz M, Roche V, Bertrand C, Silvain M, Legoix-Né P, Loeillet S, Nicolas A, Thermes C & Morillon A (2011) XUTs are a class of Xrn1-sensitive antisense regulatory non-coding RNA in yeast. Nature 475: 114–117

- Chen C-L, Rappailles A, Duquenne L, Huvet M, Guilbaud G, Farinelli L, Audit B, d’Aubenton-Carafa Y, Arneodo A, Hyrien O & Thermes C (2010) Impact of replication timing on non-CpG and CpG substitution rates in mammalian genomes. Genome Res. 20: 447–457
