The CMS experiment collects and analyzes large amounts of data from high-energy particle collisions produced by the Large Hadron Collider (LHC) at CERN. This involves processing a huge amount of real and simulated data, which must be handled on batch-oriented platforms. The CMS Global Pool of computing resources provides more than 100K dedicated CPU cores and another 50K to 100K CPU cores from opportunistic resources for these kinds of tasks. Although production and event-processing analysis workflows are already managed by existing tools, there is still no user-friendly way, integrated with other CMS services, to submit final-stage condor-like analysis jobs, familiar to users of Tier-3 or local computing facilities, to these distributed resources. CMS Connect is a set of computing tools and services designed to augment existing services in the CMS physics community, focusing on these kinds of condor analysis jobs. It is based on the CI-Connect platform developed by the Open Science Grid and uses the CMS GlideInWMS infrastructure to transparently plug CMS global grid resources into a virtual pool accessed via a single submission machine. This paper describes the specific developments and deployment of CMS Connect beyond the CI-Connect platform that integrate the service with CMS-specific needs, including submission to specific sites, job accounting, and automated reporting to standard CMS monitoring resources, in a way that is effortless for users.
The CMS Global Pool, based on HTCondor and glideinWMS, is the main computing resource provisioning system for all CMS workflows, including analysis, Monte Carlo production, and detector data reprocessing activities. The total resources at Tier-1 and Tier-2 grid sites pledged to CMS exceed 100,000 CPU cores, while another 50,000 to 100,000 CPU cores are available opportunistically, pushing the needs of the Global Pool to higher scales each year. These resources are becoming more diverse in their accessibility and configuration over time. Furthermore, the challenge of running stably at ever higher scales while introducing new modes of operation such as multi-core pilots, together with the chaotic nature of physics analysis workflows, places a heavy strain on the submission infrastructure. This paper details some of the most important challenges to scalability and stability that the CMS Global Pool has faced since the beginning of LHC Run II and how they were overcome.
CMS computing operations during run 1
Adelman, J; Alderweireldt, S; Artieda, J ...
Journal of Physics: Conference Series, 01/2014, Volume 513, Issue 3. Journal article, peer reviewed, open access.
During the first run, CMS collected and processed more than 10B data events and simulated more than 15B events. Up to 100k processor cores were used simultaneously and 100PB of storage was managed. Each month petabytes of data were moved and hundreds of users accessed data samples. In this document we discuss the operational experience from this first run. We present the workflows and data flows that were executed, and we discuss the tools and services developed, and the operations and shift models used to sustain the system. Many techniques were followed from the original computing planning, but some were reactions to difficulties and opportunities. We also address the lessons learned from an operational perspective, and how this is shaping our thoughts for 2015.
The connection of diverse and sometimes non-Grid-enabled resource types to the CMS Global Pool, which is based on HTCondor and glideinWMS, has been a major goal of CMS. These resources range in type from a high-availability, low-latency facility at CERN for urgent calibration studies, called the CAF, to a local user facility at the Fermilab LPC, allocation-based computing resources at NERSC and SDSC, opportunistic resources provided through the Open Science Grid, commercial clouds, and others, as well as access to opportunistic cycles on the CMS High Level Trigger farm. In addition, we have provided the capability to give local users priority on resources at CMS sites beyond the WLCG pledge. Many of the solutions employed to bring these diverse resource types into the Global Pool have common elements, while some are very specific to a particular project. This paper details some of the strategies and solutions used to access these resources through the Global Pool in a seamless manner.
Purpose: This study aims to define which of the right ventricular myocardial deformation indices best correlates with the classic echocardiographic measurements and indices of right ventricular (RV) dysfunction in patients with stable chronic obstructive pulmonary disease (COPD). Patients and Methods: Ninety-one patients with stable COPD underwent clinical evaluation, spirometry, a 6-minute walk test, and echocardiographic examination. Patients were divided into two groups: “with RV dysfunction” (≥ 1 classic parameter) and “without RV dysfunction”. We used speckle tracking to estimate myocardial deformation. For all analyses, results were considered significant if p < 0.05. Results: The mean age across all participants was 65 ± 9 years, with 53% (48/91) being male. Patients in the group with RV dysfunction walked shorter distances and had higher estimated right ventricular systolic pressure (RVSP) and mean pulmonary arterial pressure (mPAP). The RV free wall longitudinal strain (RVFWLS) was the only deformation index that showed a significant correlation with all classic measurements and indices in the diagnosis of RV dysfunction (Wald test, 10.24; p < 0.01; odds ratio, 1.61). In the ROC curve analysis, an absolute value < 20% was the lowest cut-off point of this index for detection of RV dysfunction (AUC = 0.93, sensitivity 95.8%, specificity 88%). Conclusion: In COPD patients, RVFWLS is the myocardial deformation index that best correlates with classic echocardiographic parameters for the diagnosis of RV dysfunction using < 20% as a cut-off point.
Understanding mechanisms of tree mortality and the
dynamics of associated canopy gaps is relevant for robust estimates of
carbon balance in forests. We combined monthly RGB images acquired from an
unoccupied aerial vehicle with field surveys to identify gaps in an 18 ha plot
installed in an old-growth central Amazon forest. We measured the size and
shape of gaps and analyzed their temporal variation and correlation with
rainfall over a period of 28 months. We further described associated modes
of tree mortality (i.e., snapping, uprooting and standing dead) and branch
fall and quantified associated losses of biomass. In total, we detected 32
gaps, either in the images or in the field, ranging in area from 9 to 835 m². Relatively small gaps (< 39 m²) opened by branch fall
were the most frequent (11 gaps). Out of 18 gaps for which both field and
image data were available, three could not be detected remotely. Gaps
observed in the field but not captured on the imagery were relatively small
and mainly formed by the fall of branches from live and standing dead trees.
Our data show that ∼ 17 % of the tree-mortality and
branch-fall events only affected the lower canopy and the understory of the
forest and are likely neglected by top-of-the-canopy assessments.
Regardless of the detection method, the size distribution was best described
by a lognormal function for gaps starting from the smallest detected size (9 and 10 m² for field and imagery data, respectively), and by the
Weibull and power functions for gaps larger than 25 m². Properly
assessing associated confidence intervals requires larger sample sizes.
Repeated field measurements reveal that gap area does not differ
significantly among modes of tree mortality or branch fall in central Amazon
forests, with the last contributing the least to biomass loss. Predicting
mechanisms of gap formation based on associated area and biomass loss
remains challenging, which highlights the need for larger datasets. The rate
of gap area formation was positively correlated with the frequency of
extreme rainfall events, which may be related to a higher frequency of
storms propagating extreme rain and wind gusts. While remote sensing has proven to be an accurate and precise method for mapping gaps compared to field data (i.e., ground truth), it is important to note that our sample size was relatively small. Therefore, the extrapolation of
these results beyond our study region and landscape should be made
cautiously. Apart from improving landscape assessments of carbon balance,
regional information on gap dynamics and associated mechanisms of formation
is fundamental to addressing forest responses to altered disturbance regimes
resulting from climate change.
Classic adenoid cystic carcinomas (C-AdCCs) of the breast are rare, relatively indolent forms of triple-negative cancer, characterized by recurrent MYB or MYBL1 genetic alterations. Solid and basaloid adenoid cystic carcinoma (SB-AdCC) is considered a rare variant of AdCC yet to be fully characterized. Here, we sought to determine the clinical behavior and repertoire of genetic alterations of SB-AdCCs. Clinicopathologic data were collected on a cohort of 104 breast AdCCs (75 C-AdCCs and 29 SB-AdCCs). MYB expression was assessed by immunohistochemistry, and MYB-NFIB and MYBL1 gene rearrangements were investigated by fluorescence in situ hybridization. AdCCs lacking MYB/MYBL1 rearrangements were subjected to RNA-sequencing. Targeted sequencing data were available for 9 cases. The invasive disease-free survival (IDFS) and overall survival (OS) were assessed in C-AdCC and SB-AdCC. SB-AdCCs have higher histologic grade and more frequent nodal and distant metastases than C-AdCCs. MYB/MYBL1 rearrangements were significantly less frequent in SB-AdCC than in C-AdCC (3/14, 21% vs 17/20, 85%; P < 0.05), despite the frequent MYB expression (9/14, 64%). In SB-AdCCs lacking MYB rearrangements, CREBBP, KMT2C, and NOTCH1 alterations were observed in 2 of 4 cases. SB-AdCCs displayed a shorter IDFS than C-AdCCs (46.5 vs 151.8 months, respectively, P < 0.001), independent of stage. In summary, SB-AdCCs are a molecularly heterogeneous but clinically aggressive group of tumors. Less than 25% of SB-AdCCs display the genomic features of C-AdCC. Defining whether these tumors represent a single entity or a collection of different cancer types with a similar basaloid histologic appearance is warranted.
To determine risk factors for the development of long coronavirus disease 2019 (COVID-19) in healthcare personnel (HCP).
We conducted a case-control study among HCP who had confirmed symptomatic COVID-19 working in a Brazilian healthcare system between March 1, 2020, and July 15, 2022. Cases were defined as those having long COVID according to the Centers for Disease Control and Prevention definition. Controls were defined as HCP who had documented COVID-19 but did not develop long COVID. Multiple logistic regression was used to assess the association between exposure variables and long COVID during 180 days of follow-up.
Of 7,051 HCP diagnosed with COVID-19, 1,933 (27.4%) who developed long COVID were compared to 5,118 (72.6%) who did not. The majority of those with long COVID (51.8%) had 3 or more symptoms. Factors associated with the development of long COVID were female sex (OR, 1.21; 95% CI, 1.05-1.39), age (OR, 1.01; 95% CI, 1.00-1.02), and 2 or more SARS-CoV-2 infections (OR, 1.27; 95% CI, 1.07-1.50). Those infected with the SARS-CoV-2 δ (delta) variant (OR, 0.30; 95% CI, 0.17-0.50) or the SARS-CoV-2 o (omicron) variant (OR, 0.49; 95% CI, 0.30-0.78), and those receiving 4 COVID-19 vaccine doses prior to infection (OR, 0.05; 95% CI, 0.01-0.19) were significantly less likely to develop long COVID.
Long COVID can be prevalent among HCP. Acquiring >1 SARS-CoV-2 infection was a major risk factor for long COVID, while maintenance of immunity via vaccination was highly protective.
Mosaic mutations in normal tissues can occur early in embryogenesis and be associated with hereditary cancer syndromes when affecting cancer susceptibility genes (CSG). Their contribution to apparently sporadic cancers is currently unknown. Analysis of paired tumor/blood sequencing data of 35,310 patients with cancer revealed 36 pathogenic mosaic variants affecting CSGs, most of which were not detected by prior clinical genetic testing. These CSG mosaic variants were consistently detected at varying variant allelic fractions in microdissected normal tissues (n = 48) from distinct embryonic lineages in all individuals tested, indicating their early embryonic origin, likely prior to gastrulation, and likely asymmetrical propagation. Tumor-specific biallelic inactivation of the CSG affected by a mosaic variant was observed in 91.7% (33/36) of cases, and tumors displayed the hallmark pathologic and/or genomic features of inactivation of the respective CSGs, establishing a causal link between CSG mosaic variants arising in early embryogenesis and the development of apparently sporadic cancers.
Here, we demonstrate that mosaic variants in CSGs arising in early embryogenesis contribute to the oncogenesis of seemingly sporadic cancers. These variants can be systematically detected through the analysis of tumor/normal sequencing data, and their detection may affect therapeutic decisions as well as prophylactic measures for patients and their offspring.
Lynch syndrome is defined by germline pathogenic mutations involving DNA mismatch repair (MMR) genes and is linked with the development of MMR-deficient colon and endometrial cancers. Whether breast cancers developing in the context of Lynch syndrome are causally related to MMR deficiency (MMRd) remains controversial. Thus, we explored the morphologic and genomic characteristics of breast cancers occurring in individuals with Lynch syndrome.
A retrospective analysis of 20,110 patients with cancer who underwent multigene panel genetic testing was performed to identify individuals with a likely pathogenic/pathogenic germline variant in MLH1, MSH2, MSH6, or PMS2 who developed breast cancers. The histologic characteristics and IHC assessment of breast cancers for MMR proteins and programmed death-ligand 1 (PD-L1) expression were assessed on cases with available materials. DNA samples from paired tumors and blood were sequenced with Memorial Sloan Kettering-Integrated Mutation Profiling of Actionable Cancer Targets (≥468 key cancer genes). Microsatellite instability (MSI) status was assessed utilizing MSISensor. Mutational signatures were defined using SigMA.
A total of 272 individuals with Lynch syndrome were identified, 13 (5%) of whom had primary breast cancers. The majority of breast cancers (92%) were hormone receptor-positive tumors. Five (42%) of 12 breast cancers displayed loss of MMR proteins by IHC. Four (36%) of 11 breast cancers subjected to tumor-normal sequencing showed dominant MSI mutational signatures, high tumor mutational burden, and indeterminate (27%) or high MSISensor scores (9%). One patient with metastatic MMRd breast cancer received anti-PD1 therapy and achieved a robust and durable response.
A subset of breast cancers developing in individuals with Lynch syndrome are etiologically linked to MMRd and may benefit from anti-PD1/PD-L1 immunotherapy.