Annotation ambiguities and annotation errors are a general challenge in genomics. While a reliable protein function assignment can be obtained by experimental characterization, this is expensive and time-consuming, and the number of such Gold Standard Proteins (GSP) with experimental support remains very low compared to proteins annotated by sequence homology, usually through automated pipelines. Even a GSP may give a misleading assignment when used as a reference: the homolog may be close enough to support isofunctionality, but the substrate of the GSP is absent from the species being annotated. In such cases, the enzymes cannot be isofunctional. Here, we examined a variety of such issues in halophilic archaea (class Halobacteria), with a strong focus on the model haloarchaeon
.
Annotated proteins of
were identified for which public databases tend to assign a function that is probably incorrect. In some cases, an alternative, probably correct, function can be predicted or inferred from the available evidence, but this has not been adopted by public databases because experimental validation is lacking. In other cases, a probably invalid specific function is predicted by homology, and while there is evidence that this assigned function is unlikely, the true function remains elusive. We listed 50 of those cases, each with detailed background information, so that a conclusion about the most likely biological function can be drawn. For reasons of brevity and comprehension, only the key aspects are listed in the main text, with detailed information being provided in a corresponding section of the Supplementary Materials.
Compiling, describing and summarizing these open annotation issues and functional predictions will benefit the scientific community in the general effort to improve the evaluation of protein function assignments and more thoroughly detail them. By highlighting the gaps and likely annotation errors currently in the databases, we hope this study will provide a framework for experimentalists to systematically confirm (or disprove) our function predictions or to uncover yet more unexpected functions.
While many aspects of archaeal cell biology remain relatively unexplored, systems biology approaches like mass spectrometry (MS) based proteomics offer an opportunity for rapid advances. Unfortunately, the enormous amount of MS data generated often remains incompletely analyzed due to a lack of sophisticated bioinformatic tools and field-specific biological expertise for data interpretation. Here we present the initiation of the Archaeal Proteome Project (ArcPP), a community-based effort to comprehensively analyze archaeal proteomes. Starting with the model archaeon Haloferax volcanii, we reanalyze MS datasets from various strains and culture conditions. Optimized peptide spectrum matching, with strict control of false discovery rates, facilitates identifying > 72% of the reference proteome, with a median protein sequence coverage of 51%. These analyses, together with expert knowledge in diverse aspects of cell biology, provide meaningful insights into processes such as N-terminal protein maturation, N-glycosylation, and metabolism. Altogether, ArcPP serves as an invaluable blueprint for comprehensive prokaryotic proteomics.
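The two summary figures quoted above (fraction of the proteome identified, median protein sequence coverage) can be illustrated with a minimal sketch. It assumes that coverage means the fraction of residues in a protein matched by at least one identified peptide, and that the median is taken over proteins with at least one peptide. The function names and toy sequences are hypothetical and are not taken from the ArcPP pipeline.

```python
from statistics import median


def sequence_coverage(protein: str, peptides: list[str]) -> float:
    """Fraction of residues in `protein` covered by at least one peptide."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue
        # Mark every occurrence of the peptide, including overlapping ones.
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein)


def median_coverage(proteome: dict[str, str],
                    identified: dict[str, list[str]]) -> float:
    """Median coverage over proteins with at least one identified peptide."""
    values = [
        sequence_coverage(seq, identified[name])
        for name, seq in proteome.items()
        if identified.get(name)
    ]
    return median(values) if values else 0.0


def fraction_identified(proteome: dict[str, str],
                        identified: dict[str, list[str]]) -> float:
    """Share of proteome entries with at least one identified peptide."""
    if not proteome:
        return 0.0
    hits = sum(1 for name in proteome if identified.get(name))
    return hits / len(proteome)
```

In a real analysis the peptide lists would come from peptide spectrum matches that have already passed the false discovery rate threshold; this sketch does not model that filtering step.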
Glycosylation is one of the most complex posttranslational protein modifications. Its importance has been established not only for eukaryotes but also for a variety of prokaryotic cellular processes, such as biofilm formation, motility, and mating. However, comprehensive glycoproteomic analyses are largely missing in prokaryotes. Here, we extend the phenotypic characterization of N-glycosylation pathway mutants in Haloferax volcanii and provide a detailed glycoproteome for this model archaeon through the mass spectrometric analysis of intact glycopeptides. Using in-depth glycoproteomic datasets generated for the wild-type (WT) and mutant strains as well as a reanalysis of datasets within the Archaeal Proteome Project (ArcPP), we identify the largest archaeal glycoproteome described so far. We further show that different N-glycosylation pathways can modify the same glycosites under the same culture conditions. The extent and complexity of the Hfx. volcanii N-glycoproteome revealed here provide new insights into the roles of N-glycosylation in archaeal cell biology.
For osmoadaptation the halophilic bacterium Halomonas elongata synthesizes as its main compatible solute the aspartate derivative ectoine. H. elongata does not rely entirely on synthesis but can accumulate ectoine by uptake from the surrounding environment with the help of the osmoregulated transporter TeaABC. Disruption of the TeaABC-mediated ectoine uptake creates a strain that is constantly losing ectoine to the medium. However, the efflux mechanism of ectoine in H. elongata is not yet understood. H. elongata possesses four genes encoding mechanosensitive channels all of which belong to the small conductance type (MscS). Analysis by qRT-PCR revealed a reduction in transcription of the mscS genes with increasing salinity. The response of H. elongata to hypo- and hyperosmotic shock never resulted in up-regulation but rather in down-regulation of mscS transcription. Deletion of all four mscS genes created a mutant that was unable to cope with hypoosmotic shock. However, the knockout mutant grew significantly faster than the wildtype at high salinity of 2 M NaCl, and most importantly, still exported 80% of the ectoine compared to the wildtype. We thus conclude that a yet unknown system, which is independent of mechanosensitive channels, is the major export route for ectoine in H. elongata.
Halovirus HF2 was the first member of the
genus to have its genome fully sequenced, which revealed two classes of intergenic repeat (IR) sequences: class I repeats of 58 bp in length, and class II repeats of 29 bp in length. Both classes of repeat contain AT-rich motifs that were conjectured to represent promoters. In the present study, nine IRs were cloned upstream of the
reporter gene, and all displayed promoter activity, providing experimental evidence for the previous conjecture. Comparative genomics showed that IR sequences and their relative genomic positions were strongly conserved among other members of the same virus genus. The transcription of HF2 was also examined by the reverse-transcriptase-PCR (RT-PCR) method, which demonstrated very long transcripts were produced that together covered most of the genome, and from both strands. The presence of long counter transcripts suggests a regulatory role or possibly unrecognized coding potential.
Plasmids PL6A and PL6B are both carried by the C23
strain of the square archaeon
, and are closely related (76% nucleotide identity), circular, about 6 kb in size, and display the same gene synteny. They are unrelated to other known plasmids and all of the predicted proteins are cryptic in function. Here we describe two additional PL6-related plasmids, pBAJ9-6 and pLT53-7, each carried by distinct isolates of
that were recovered from hypersaline waters in Australia. A third PL6-like plasmid, pLTMV-6, was assembled from metavirome data from Lake Tyrrell, a salt lake in Victoria, Australia. Comparison of all five plasmids revealed a distinct plasmid family with strong conservation of gene content and synteny, an average size of 6.2 kb (range 5.8-7.0 kb) and pairwise similarities of 61-79%. One protein (F3) was closely similar to a protein carried by betapleolipoviruses while another (R6) was similar to a predicted AAA-ATPase of His 1 halovirus (His1V_gp16). Plasmid pLT53-7 carried a gene for a FkbM family methyltransferase that was not present in any of the other plasmids. Comparative analysis of all PL6-like plasmids provided better resolution of conserved sequences and coding regions, confirmed the strong link to haloviruses, and showed that their sequences are highly conserved among examples from
isolates and metagenomic data that collectively cover geographically distant locations, indicating that these genetic elements are widespread.
The genome of Halobacterium strain 63‐R2 was recently reported and provides the opportunity to resolve long‐standing issues regarding the source of two widely used model strains of Halobacterium salinarum, NRC‐1 and R1. Strain 63‐R2 was isolated in 1934 from a salted buffalo hide (epithet “cutirubra”), along with another strain from a salted cow hide (91‐R6T, epithet “salinaria,” the type strain of Hbt. salinarum). Both strains belong to the same species according to genome‐based taxonomy analysis (TYGS), with chromosome sequences showing 99.64% identity over 1.85 Mb. The chromosome of strain 63‐R2 is 99.99% identical to the two laboratory strains NRC‐1 and R1, with only five indels, excluding the mobilome. The two reported plasmids of strain 63‐R2 share their architecture with plasmids of strain R1 (pHcu43/pHS4, 99.89% identity; pHcu235/pHS3, 100.0% identity). We detected and assembled additional plasmids using PacBio reads deposited at the SRA database, further corroborating that strain differences are minimal. One plasmid, pHcu190 (190,816 bp), corresponds to pHS1 (strain R1) but is even more similar in architecture to pNRC100 (strain NRC‐1). Another plasmid, pHcu229, assembled partially and completed in silico (229,124 bp), shares most of its architecture with pHS2 (strain R1). In deviating regions, it corresponds to pNRC200 (strain NRC‐1). Further architectural differences between the laboratory strain plasmids are not unique, but are present in strain 63‐R2, which contains characteristics from both of them. Based on these observations, it is proposed that the early twentieth‐century isolate 63‐R2 is the immediate ancestor of the twin laboratory strains NRC‐1 and R1.
The complete genomes of four Halobacterium salinarum strains were compared in detail. Two strains (91‐R6T and 63‐R2) were isolated in 1934 by Lochhead from cow and buffalo hides. From the results of these comparisons, we conclude that strain 63‐R2 is the immediate ancestor of the two widely used laboratory strains NRC‐1 and R1.
The clustered regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated (Cas) system provides adaptive and heritable immunity against foreign genetic elements in most archaea and many bacteria. Although this system is widespread and diverse with many subtypes, only a few species have been investigated to elucidate the precise mechanisms of defense against viruses or plasmids. Approximately 90% of all sequenced archaea encode CRISPR/Cas systems, but their molecular details have so far only been examined in three archaeal species: Sulfolobus solfataricus, Sulfolobus islandicus, and Pyrococcus furiosus. Here, we analyzed the CRISPR/Cas system of Haloferax volcanii using a plasmid-based invader assay. Haloferax encodes a type I-B CRISPR/Cas system with eight Cas proteins and three CRISPR loci for which the identity of protospacer adjacent motifs (PAMs) was unknown until now. We identified six different PAM sequences that are required upstream of the protospacer to permit target DNA recognition. This is only the second archaeon for which PAM sequences have been determined, and the first CRISPR group with such a high number of PAM sequences. Cells could survive the plasmid challenge if their CRISPR/Cas system was altered or defective, e.g. by deletion of the cas gene cassette. Experimental PAM data were supplemented with bioinformatics data on Haloferax and Haloquadratum.
Background: CRISPR/Cas systems allow archaea and bacteria to resist invasion by foreign nucleic acids.
Results: The CRISPR/Cas system in Haloferax recognized six different PAM sequences that could trigger a defense response.
Conclusion: The PAM sequence specificity of the defense response in type I CRISPR systems is more relaxed than previously thought.
Significance: The PAM sequence requirements for interference and adaptation appear to differ markedly.
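The requirement that a permitted PAM sits directly upstream of the protospacer can be sketched as a simple sequence check. This is an illustration only: the PAM length of three nucleotides is an assumption, the motifs in `EXAMPLE_PAMS` are placeholders (the abstract does not list the six motifs), and the function name is hypothetical.

```python
# Placeholder motifs for illustration; the abstract does not list the six PAMs.
EXAMPLE_PAMS = {"TTC", "ACT"}


def has_permitted_pam(target: str, protospacer_start: int,
                      pams: set[str], pam_len: int = 3) -> bool:
    """Return True if the bases directly upstream of the protospacer
    (on the strand given in `target`) form one of the permitted PAMs.

    `protospacer_start` is the 0-based index of the first protospacer base.
    `pam_len` defaults to 3, which is an assumption of this sketch.
    """
    if protospacer_start < pam_len or protospacer_start > len(target):
        # Not enough upstream sequence, or the index lies outside the target.
        return False
    upstream = target[protospacer_start - pam_len:protospacer_start].upper()
    return upstream in pams
```

In the plasmid-based invader assay described above, a target carrying a permitted PAM would be expected to trigger interference, whereas a target without one would not; the sketch only captures the sequence criterion, not the biological outcome.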
The archaeal cytoplasmic membrane provides an anchor for many surface proteins. Recently, a novel membrane anchoring mechanism involving a peptidase, archaeosortase A (ArtA), and C-terminal lipid attachment of surface proteins was identified in the model archaeon
ArtA is required for optimal cell growth and morphogenesis, and the S-layer glycoprotein (SLG), the sole component of the
cell wall, is one of the targets for this anchoring mechanism. However, how exactly ArtA function and regulation control cell growth and morphogenesis is still elusive. Here, we report that archaeal homologs to the bacterial phosphatidylserine synthase (PssA) and phosphatidylserine decarboxylase (PssD) are involved in ArtA-dependent protein maturation.
strains lacking either HvPssA or HvPssD exhibited motility, growth, and morphological phenotypes similar to those of an Δ
mutant. Moreover, we showed a loss of covalent lipid attachment to SLG in the Δ
mutant and that proteolytic cleavage of the ArtA substrate HVO_0405 was blocked in the Δ
and Δ
mutant strains. Strikingly, ArtA, HvPssA, and HvPssD green fluorescent protein (GFP) fusions colocalized to the midcell position of
cells, strongly supporting that they are involved in the same pathway. Finally, we have shown that the SLG is also recruited to the midcell before being secreted and lipid anchored at the cell outer surface. Collectively, our data suggest that haloarchaea use the midcell as the main surface processing hot spot for cell elongation, division, and shape determination.
The subcellular organization of biochemical processes in space and time is still one of the most mysterious topics in archaeal cell biology. Despite the fact that haloarchaea largely rely on covalent lipid anchoring to coat the cell envelope, little is known about how cells coordinate
synthesis and about the insertion of this proteinaceous layer throughout the cell cycle. Here, we report the identification of two novel contributors to ArtA-dependent lipid-mediated protein anchoring to the cell surface, HvPssA and HvPssD. ArtA, HvPssA, and HvPssD, as well as SLG, showed midcell localization during growth and cytokinesis, indicating that haloarchaeal cells confine phospholipid processing in order to promote midcell elongation. Our findings have important implications for the biogenesis of the cell surface.
Archaea play indispensable roles in global biogeochemical cycles, yet many crucial cellular processes, including cell-shape determination, are poorly understood. Haloferax volcanii, a model haloarchaeon, forms rods and disks, depending on growth conditions. Here, we used a combination of iterative proteomics, genetics, and live-cell imaging to identify mutants that only form rods or disks. We compared the proteomes of the mutants with wild-type cells across growth phases, thereby distinguishing between protein abundance changes specific to cell shape and those related to growth phases. The results identified a diverse set of proteins, including predicted transporters, transducers, signaling components, and transcriptional regulators, as important for cell-shape determination. Through phenotypic characterization of deletion strains, we established that rod-determining factor A (RdfA) and disk-determining factor A (DdfA) are required for the formation of rods and disks, respectively. We also identified structural proteins, including an actin homolog that plays a role in disk-shape morphogenesis, which we named volactin. Using live-cell imaging, we determined volactin's cellular localization and showed its dynamic polymerization and depolymerization. Our results provide insights into archaeal cell-shape determination, with possible implications for understanding the evolution of cell morphology regulation across domains.