Computational pipelines are commonplace in scientific research. However, most of the resources for constructing pipelines are heavyweight systems with graphical user interfaces. Ruffus is a library for the creation of computational pipelines. Its lightweight and unobtrusive design recommends it for use even for the most trivial of analyses. At the same time, it is powerful enough to have been used for complex workflows involving more than 50 interdependent stages. Availability and implementation: Ruffus is written in Python. Source code, a short tutorial, examples and a comprehensive user manual are freely available at http://www.ruffus.org.uk. The example program is available at http://www.ruffus.org.uk/examples/bioinformatics. Contact: ruffus@llew.org.uk
The mixture of different interested parties, from the elite locals, made up mostly of businessmen, the British Hongs, the expatriate civil servants and their local counterparts, the British Government and its Foreign Office gurus, the mainland Chinese and
Hong Kong is among the richest cities in the world. Yet over the past 15 years, living conditions for the average family have deteriorated despite a robust economy, ample budget surpluses and record labour productivity. Successive governments have been reluctant to invest in services for the elderly, the disabled, the long-term sick, and the poor, while education has become more elitist. The political system has helped to entrench a mistaken consensus that social spending is a threat to financial stability and economic prosperity. In this trenchant attack on government mismanagement, Leo Goodstadt traces how officials have created a ‘new poverty’ in Hong Kong and argues that their misguided policies are both a legacy of the colonial era and a deliberate choice by modern governments, and not the result of economic crises. This provocative book will be essential reading for anyone wishing to understand why poverty returned to Hong Kong in this century.
Structural variation is widespread in mammalian genomes and is an important cause of disease, but just how abundant and important structural variants (SVs) are in shaping phenotypic variation remains unclear. Without knowing how many SVs there are, and how they arise, it is difficult to discover what they do. Combining experimental with automated analyses, we identified 711,920 SVs at 281,243 sites in the genomes of thirteen classical and four wild-derived inbred mouse strains. The majority of SVs are less than 1 kilobase in size and 98% are deletions or insertions. The breakpoints of 160,000 SVs were mapped to base pair resolution, allowing us to infer that insertion of retrotransposons causes more than half of SVs. Yet, despite their prevalence, SVs are less likely than other sequence variants to cause gene expression or quantitative phenotypic variation. We identified 24 SVs that disrupt coding exons, acting as rare variants of large effect on gene function. One-third of the genes so affected have immunological functions.
Between 1935 and 1985, Hong Kong's growth seemed unstoppable. The economy flourished despite wars, revolution and Western protectionism to emerge as a world-class manufacturing exporter and an international financial centre. Yet, for bankers, these were t
Although protein has long been considered the building block of life, it is now apparent that it is only one of many functional products generated by the eukaryotic genome. Indeed, more of the human genome is transcribed into noncoding sequence than into protein-coding sequence. Nevertheless, whilst we have developed a deep understanding of the relationships between evolutionary constraint and function for protein-coding sequence, little is known about these relationships for noncoding transcribed sequence. This dearth of information is partially attributable to a lack of established non-protein-coding RNA (ncRNA) orthologs among birds and mammals within sequence and expression databases.
Here, we performed a multi-disciplinary study of four highly conserved and brain-expressed transcripts selected from a list of mouse long intergenic noncoding RNA (lncRNA) loci that generally show pronounced evolutionary constraint within their putative promoter regions and across exon-intron boundaries. We identify some of the first lncRNA orthologs present in birds (chicken), marsupials (opossum), and eutherian mammals (mouse), and investigate whether they exhibit conservation of brain expression. In contrast to conventional protein-coding genes, the sequences, transcriptional start sites, exon structures, and lengths for these noncoding genes are all highly variable.
The biological relevance of lncRNAs would be highly questionable if they were limited to closely related phyla. Instead, their preservation across diverse amniotes, their apparent conservation in exon structure, and similarities in their pattern of brain expression during embryonic and early postnatal stages together indicate that these are functional RNA molecules, of which some have roles in vertebrate brain development.
Accurate predictions of orthology and paralogy relationships are necessary to infer human molecular function from experiments in model organisms. Previous genome-scale approaches to predicting these relationships have been limited by their use of protein similarity and their failure to take into account multiple splicing events and gene prediction errors. We have developed PhyOP, a new phylogenetic orthology prediction pipeline based on synonymous rate estimates, which accurately predicts orthology and paralogy relationships for transcripts, genes, exons, or genomic segments between closely related genomes. We were able to identify orthologue relationships to human genes for 93% of all dog genes from Ensembl. Among 1:1 orthologues, the alignments covered a median of 97.4% of protein sequences, and 92% of orthologues shared essentially identical gene structures. PhyOP accurately recapitulated genomic maps of conserved synteny. Benchmarking against predictions from Ensembl and Inparanoid showed that PhyOP is more accurate, especially in its predictions of paralogy. Nearly half (46%) of PhyOP paralogy predictions are unique. Using PhyOP to investigate orthologues and paralogues in the human and dog genomes, we found that the human assembly contains 3-fold more gene duplications than the dog. Species-specific duplicate genes, or "in-paralogues," are generally shorter and have fewer exons than 1:1 orthologues, which is consistent with selective constraints and mutation biases based on the sizes of duplicated genes. In-paralogues have experienced elevated amino acid and synonymous nucleotide substitution rates. Duplicates possess similar biological functions for either the dog or human lineages. Having accounted for 2,954 likely pseudogenes and gene fragments, and after separating 346 erroneously merged genes, we estimated that the human genome encodes a minimum of 19,700 protein-coding genes, similar to the gene count of nematode worms.
PhyOP is a fast and robust approach to orthology prediction that will be applicable to whole genomes from multiple closely related species. PhyOP will be particularly useful in predicting orthology for mammalian genomes that have been incompletely sequenced, and for large families of rapidly duplicating genes.
In most countries, e-Identity card adoption has had little success. A notable exception is Hong Kong. This article examines this phenomenon and contrasts it with the lackluster performance of this technology elsewhere, in Europe in particular. It is suggested that several useful lessons can be learned from the Hong Kong experience, and some possible directions for future research into the nature and role of e-Identity cards are proposed.
Using a positional cloning approach supported by comparative genomics, we have identified a previously unreported gene, EYS, at the RP25 locus on chromosome 6q12 commonly mutated in autosomal recessive retinitis pigmentosa. Spanning over 2 Mb, this is the largest eye-specific gene identified so far. EYS is independently disrupted in four other mammalian lineages, including that of rodents, but is well conserved from Drosophila to man and is likely to have a role in the modeling of retinal architecture.
Variation in gene expression has been held responsible for the functional and morphological specialization of tissues. The tissue specificity of genes is known to correlate positively with gene evolution rates. We show here, using large data sets, that when a gene is expressed highly in a small number of tissues, its protein is more likely to be secreted and more likely to be mutated in genetic diseases with Mendelian inheritance. We find that secreted proteins are evolving at faster rates than nonsecreted proteins, and that their evolutionary rates are highly correlated with tissue specificity. However, the impact of secretion on evolutionary rates is countered by tissue-specific constraints that have been held constant over the past 75 million years. We find that disease genes are underrepresented among intracellular and slowly evolving housekeeping genes. These findings illuminate major selective pressures that have shaped the gene repertoires expressed in different mammalian tissues.