Most existing cross-domain 3D shape retrieval (CD3DSR) methods assume a single kind of query set (source domain), with all annotated query data following the same distribution. However, in practical scenarios, labeled query sets are typically collected from multiple sources. In such scenarios, single-source CD3DSR methods may fail because of the domain shift across different sources, and universal CD3DSR methods are needed. In this paper, we propose a novel universal unsupervised domain adaptation network (U²DAN). It mainly consists of two modules: cross-domain statistics alignment (CDSA) and source-domain feature adaptation (SDFA). First, we use a 2D CNN to encode the query and the 3D shapes from the gallery (target domain) to obtain visual features. To mix the features between each source-target domain pair, we introduce the margin disparity discrepancy (MDD) model to enforce domain alignment in an adversarial way. Since domain shifts also exist across different sources, which may degrade performance, we introduce two kinds of discriminators, a source-domain discriminator and a cycle cross-domain discriminator, to reduce source-domain bias. Further, since no 3D datasets are available for evaluating this setting, we construct two novel datasets: MS3DOR-1 for universal cross-dataset 3D shape retrieval (3D-to-3D) and MS3DOR-2 for universal cross-modal 3D shape retrieval (2D-to-3D). Extensive comparisons on the two datasets verify the effectiveness of U²DAN against state-of-the-art methods.
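The abstract names the CDSA module without detailing it. One common way to align cross-domain feature statistics, shown here purely as an illustrative stand-in (a CORAL-style second-order transform, not necessarily the paper's formulation), is to whiten the source features and re-color them with the target statistics:

```python
import numpy as np

def _sqrtm(m):
    vals, vecs = np.linalg.eigh(m)
    return vecs @ np.diag(np.sqrt(np.maximum(vals, 0.0))) @ vecs.T

def _inv_sqrtm(m, eps=1e-6):
    vals, vecs = np.linalg.eigh(m)
    return vecs @ np.diag(1.0 / np.sqrt(np.maximum(vals, eps))) @ vecs.T

def coral_align(src, tgt, eps=1e-6):
    """Whiten source features, then re-color them with target statistics
    (second-order alignment in the spirit of CORAL)."""
    d = src.shape[1]
    cs = np.cov(src, rowvar=False) + eps * np.eye(d)
    ct = np.cov(tgt, rowvar=False) + eps * np.eye(d)
    centered = src - src.mean(axis=0)
    return centered @ _inv_sqrtm(cs) @ _sqrtm(ct) + tgt.mean(axis=0)

rng = np.random.default_rng(0)
src = rng.normal(2.0, 2.0, size=(500, 3))   # toy source-domain features
tgt = rng.normal(-1.0, 0.5, size=(500, 3))  # toy target-domain features
aligned = coral_align(src, tgt)

# After alignment the source features share the target's mean and covariance.
print(np.allclose(aligned.mean(axis=0), tgt.mean(axis=0)))
```

On this toy data the transform matches the first two moments of the target exactly; the paper's method additionally enforces alignment adversarially via MDD.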
Most existing methods for 3D model classification and retrieval rely on a fully supervised training scheme, for which collecting and labeling 3D models across a wide range of categories is prohibitively expensive and time-consuming. How to make full use of existing known data to represent unknown data is therefore a crucial topic. Inspired by zero-shot learning in the 2D image domain, we propose a semantically guided projection method to classify and retrieve unseen 3D models by exploring the semantic relationship between seen and unseen 3D models. First, we exploit the multi-view information of 3D models to construct semantic attributes as prior knowledge for representing 3D models. Then, we learn bidirectional projections, from visual features to semantics and from semantics to visual features, which reduces the gap between the seen and unseen domains. Extensive experiments on zero-shot 3D model classification and retrieval on two popular datasets, ModelNet40 and ShapeNetCore55, demonstrate the effectiveness and superiority of the proposed method.
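The bidirectional-projection idea can be sketched on synthetic data. Everything below (attribute dimensions, the ridge-regression solver, the Gaussian toy data) is an illustrative assumption, not the paper's exact construction:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: 3 seen + 2 unseen classes, each with a 3-D semantic attribute
# vector; visual features are a fixed linear map of attributes plus noise.
attrs = rng.normal(size=(5, 3))
seen, unseen = [0, 1, 2], [3, 4]
M = rng.normal(size=(3, 6))

def sample(cls, n=40):
    return attrs[cls] @ M + 0.05 * rng.normal(size=(n, 6))

X = np.vstack([sample(c) for c in seen])                    # seen visual features
S = np.vstack([np.tile(attrs[c], (40, 1)) for c in seen])   # their semantics

# Bidirectional projections learned by ridge regression (closed form).
lam = 1e-3
W_vs = np.linalg.solve(X.T @ X + lam * np.eye(6), X.T @ S)  # visual -> semantic
W_sv = np.linalg.solve(S.T @ S + lam * np.eye(3), S.T @ X)  # semantic -> visual

# Zero-shot classification: project an unseen-class feature into semantic
# space and pick the nearest unseen attribute vector.
x = sample(3, n=1)
pred = unseen[int(np.argmin(np.linalg.norm(x @ W_vs - attrs[unseen], axis=1)))]

# Retrieval direction: synthesize visual prototypes for unseen classes.
protos = attrs[unseen] @ W_sv
print(pred, protos.shape)
```

Projecting in both directions lets the same learned maps serve classification (visual to semantic) and retrieval (semantic to visual).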
In this paper, we study the task of unsupervised 2D image-based 3D shape retrieval (UIBSR), which aims to retrieve unlabeled shapes (target domain) using labeled images (source domain). Previous works on UIBSR mainly focus on aligning the prototypes generated by the source labels and the predicted target pseudo-labels to reduce the cross-domain discrepancy. However, simply maintaining consistency between features may corrupt the original semantic information. Moreover, existing methods usually ignore the diversity of instances during the adaptation process, which reduces the discriminability of the features. To solve these problems, we propose the prototype-based semantic consistency (PSC) learning method, which explores semantic knowledge in both prototype-prototype and prototype-instance relationships in the probability space rather than the embedding space, preserving the structure of the semantic information. In addition, we propose a novel adversarial scheme between the feature extractor and the classifier to explore the characteristics of different instances, which further enhances the model's ability to learn robust representations. Extensive experiments on two challenging datasets demonstrate the superiority of our proposed method.
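Measuring consistency in probability space rather than embedding space can be sketched as follows; the toy data, the distance-softmax conversion, the temperature, and the KL objective are illustrative assumptions, not the paper's exact losses:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def prototypes(feats, labels, n_cls):
    return np.stack([feats[labels == c].mean(axis=0) for c in range(n_cls)])

def to_prob(x, protos, tau=1.0):
    # Turn distances to every prototype into a probability distribution.
    d = np.linalg.norm(x[:, None, :] - protos[None, :, :], axis=-1)
    return softmax(-d / tau)

def kl(p, q, eps=1e-8):
    return np.sum(p * np.log((p + eps) / (q + eps)), axis=-1)

rng = np.random.default_rng(0)
n_cls = 3
src = rng.normal(size=(90, 8)) + np.repeat(np.eye(3, 8) * 4, 30, axis=0)
y = np.repeat(np.arange(n_cls), 30)
tgt = src + 0.3 * rng.normal(size=src.shape)   # pseudo-labeled target stand-in
p_src = prototypes(src, y, n_cls)
p_tgt = prototypes(tgt, y, n_cls)

# Consistency is compared between probability vectors, not raw embeddings,
# so the relational structure among prototypes is what gets preserved.
loss = kl(to_prob(p_src, p_src), to_prob(p_tgt, p_src)).mean()
print(f"prototype-consistency loss: {loss:.6f}")
```

The same `to_prob` conversion applies to individual instances, giving the prototype-instance term of the consistency objective.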
Abstract
Background
Cancer-associated fibroblasts (CAFs) are critically involved in tumor progression by maintaining extracellular matrix (ECM) production and promoting tumor development. Cyclooxygenase-2 (COX-2) has been shown to promote ECM formation and tumor progression. However, the mechanisms of COX-2-mediated CAFs activation have not yet been elucidated. Therefore, we conducted this study to identify the effects and mechanisms of COX-2 underlying CAFs activation by tumor-derived exosomal miRNAs in lung adenocarcinoma (LUAD) progression.
Methods
As measures of CAFs activation, the expression of fibroblast activation protein-1 (FAP-1) and α-smooth muscle actin (α-SMA), the main CAFs markers, was detected by Western blotting and immunohistochemistry, and the expression of fibronectin (FN1) was used to assess ECM production by CAFs. Exosomes were extracted by ultracentrifugation, and exo-miRNAs were detected by qRT-PCR. We further elucidated the implicated mechanisms using online prediction software, luciferase reporter assays, co-immunoprecipitation, and experimental animal models.
Results
In vivo, a positive correlation was observed between COX-2 expression levels in the parenchyma and α-SMA/FN1 expression levels in the mesenchyma in LUAD. However, PGE2, one of the major products of COX-2, did not affect CAFs activation directly. COX-2 overexpression increased exo-miR-1290 expression, which promoted CAFs activation. Furthermore, Cullin3 (CUL3), a potential target of miR-1290, was found to suppress COX-2/exo-miR-1290-mediated CAFs activation and ECM production, consequently impeding tumor progression. CUL3 was identified to induce the ubiquitination and degradation of Nuclear Factor Erythroid 2-Related Factor 2 (NFE2L2, Nrf2), while exo-miR-1290 prevented Nrf2 ubiquitination and increased its protein stability by targeting CUL3. Additionally, we identified that Nrf2 directly binds the promoters of FAP-1 and FN1, enhancing CAFs activation by promoting FAP-1 and FN1 transcription.
Conclusions
Our data identify a new CAFs activation mechanism driven by exosomes derived from cancer cells that overexpress COX-2. Specifically, COX-2/exo-miR-1290/CUL3 is suggested as a novel signaling pathway mediating CAFs activation and tumor progression in LUAD. This finding therefore suggests a novel strategy for cancer treatment that may help tackle tumor progression in the future.
Generating a description for an image or video is termed the visual captioning task. It requires the model to capture the semantic information of visual content and translate it into syntactically and semantically correct human language. Connecting the research communities of computer vision (CV) and natural language processing (NLP), visual captioning presents the major challenge of bridging the gap between low-level visual features and high-level language information. Thanks to recent advances in deep learning, which are widely applied to the fields of visual and language modeling, visual captioning methods based on deep neural networks have demonstrated state-of-the-art performance. In this paper, we aim to present a comprehensive survey of existing deep learning-based visual captioning methods. According to the mechanism and technique adopted to narrow the semantic gap, we divide visual captioning methods into various groups. Representative categories in each group are summarized, and their strengths and limitations are discussed. Quantitative evaluations of state-of-the-art approaches on popular benchmark datasets are also presented and analyzed. Furthermore, we provide discussions on future research directions.
Visual dialog is an attractive vision-language task that predicts the correct answer given a question, the dialog history and an image. Although researchers have offered diversified solutions to connect text with vision, multi-modal information still interacts inadequately for semantic alignment. To solve this problem, we propose closed-loop reasoning with graph-aware dense interaction, aiming to discover cues through the dynamic structure of a graph and leverage them to benefit the dialog and image features. Moreover, we analyze the statistics of the linguistic entities hidden in dialogs to prove the reliability of the graph construction. Experiments are conducted on two VisDial datasets and indicate that our model achieves competitive results against previous methods. An ablation study and parameter analysis further demonstrate the effectiveness of our model.
Social media popularity prediction refers to using multi-modal content to predict the popularity of a post made by an internet user. It is an effective way to explore advanced forecasting trends and make more popularity-sensitive strategic decisions for the future. Existing methods attempt to explore various multi-modal features to solve this task, but they focus only on local information and lack a global understanding of the post's content. In this paper, we propose social media popularity prediction with caption (SMPC), a novel architecture that integrates the caption as a global representation into the existing multi-modal-feature-based popularity prediction method. To make good use of the generated captions, we process them at the word level, sentence level and length level, obtaining three kinds of caption features. To incorporate these caption features, we explore seven variants of the architecture, concatenating the features in all possible manners for feature fusion and training different combinations with CatBoost regression. Extensive experiments conducted on the Social Media Prediction Dataset (SMPD) show that the proposed approaches achieve competitive results against state-of-the-art models.
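The three caption-feature levels can be illustrated with simple stand-ins; the word-embedding table, the hashed bag-of-words sentence vector, and the length statistics below are hypothetical placeholders for the real caption-model features fed to the regressor:

```python
import numpy as np

def word_level(caption, emb, dim=4):
    """Mean of per-word embeddings (emb is a hypothetical word->vector table)."""
    vecs = [emb[w] for w in caption.split() if w in emb]
    return np.mean(vecs, axis=0) if vecs else np.zeros(dim)

def sentence_level(caption, dim=4):
    """Stand-in sentence vector: normalized hashed bag of words (a real
    system would use a sentence encoder here)."""
    words = caption.split()
    v = np.zeros(dim)
    for w in words:
        v[hash(w) % dim] += 1.0
    return v / max(len(words), 1)

def length_level(caption):
    """Length statistics: word count and mean word length."""
    words = caption.split()
    return np.array([len(words), float(np.mean([len(w) for w in words]))])

emb = {"a": np.ones(4), "dog": np.arange(4.0), "runs": np.full(4, 2.0)}
cap = "a dog runs"
fused = np.concatenate([word_level(cap, emb), sentence_level(cap), length_level(cap)])
print(fused.shape)  # (10,)
```

The fused vector is what a downstream regressor (CatBoost in the paper) would consume; the seven architecture variants correspond to different subsets of these concatenated features.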
The COVID-19 outbreak, designated a "pandemic" by the World Health Organization (WHO) on 11 March 2020, has spread worldwide rapidly. Each country implemented prevention and control strategies, mainly classified as SARS LCS (SARS-like containment strategy) or PAIN LMS (pandemic influenza-like mitigation strategy). The reasons for the variation in each strategy's efficacy in controlling COVID-19 epidemics were unclear and are investigated in this paper. On the basis of the daily number of confirmed local (imported) cases and onset-to-confirmation distributions for local cases, we initially estimated the daily number of local (imported) illness onsets by a deconvolution method for mainland China, South Korea, Japan and Spain, and then estimated the effective reproduction numbers R_t by using a Bayesian method for each of the four countries. China and South Korea adopted a strict SARS LCS to completely block the spread via lockdown, strict travel restrictions, and detection and isolation of patients, which led to persistent declines in the effective reproduction numbers. In contrast, Japan and Spain adopted a typical PAIN LMS to mitigate the spread via maintaining social distance, self-quarantine and isolation, etc., which reduced the R_t values but with oscillations around 1. These findings suggest that governments may need to consider multiple factors, such as the quantity of medical resources, the likely extent of the public's compliance with different intensities of intervention measures, and the economic situation, to design the most appropriate policies to fight COVID-19 epidemics.
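The effective-reproduction-number step can be sketched with a Cori-style renewal-equation estimator; the Gamma prior parameters, serial-interval weights, and toy case counts below are illustrative assumptions, not the authors' exact model or data:

```python
def estimate_rt(incidence, w, a=1.0, b=0.2):
    """Posterior-mean R_t under a Poisson renewal model with a Gamma prior
    (a Cori et al.-style estimator; a and b are illustrative hyperparameters).
    incidence: daily case counts; w: serial-interval weights w[1], w[2], ..."""
    rt = []
    for t in range(1, len(incidence)):
        # Total infectiousness: past cases weighted by the serial interval.
        lam = sum(incidence[t - s] * w[s] for s in range(1, min(t, len(w) - 1) + 1))
        rt.append((a + incidence[t]) / (b + lam) if lam > 0 else float("nan"))
    return rt

# Toy epidemic: exponential growth, then decline after an intervention.
cases = [5, 8, 13, 21, 34, 30, 24, 18, 13, 9]
w = [0.0, 0.2, 0.5, 0.3]          # serial interval on days 1..3
rt = estimate_rt(cases, w)
print(rt[3] > 1.0, rt[-1] < 1.0)  # True True: R_t > 1 while growing, < 1 while declining
```

A persistent decline in `rt` corresponds to the SARS LCS pattern described above, while oscillation of `rt` around 1 corresponds to the PAIN LMS pattern.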
Image-text retrieval is a fundamental and vital task in multimedia retrieval and has received growing attention since it connects heterogeneous data. Previous methods that perform well on image-text retrieval mainly focus on the interaction between image regions and text words. However, these approaches lack joint exploration of the characteristics and contexts of regions and words, which causes semantic confusion of similar objects and loss of contextual understanding. To address these issues, a dual-level representation enhancement network (DREN) is proposed to strengthen the characteristic and contextual representations through innovative block-level and instance-level representation enhancement modules, respectively. The block-level module focuses on mining the potential relations between multiple blocks within each instance representation, while the instance-level module concentrates on learning the contextual relations between different instances. To facilitate the accurate matching of image-text pairs, we propose graph correlation inference and weighted adaptive filtering to conduct local and global matching between image-text pairs. Extensive experiments on two challenging datasets (i.e., Flickr30K and MSCOCO) verify the superiority of our method for image-text retrieval.
•We propose an efficient 3D residual neural network for brain tumor segmentation.
•We propose a fusion loss function based on Dice and Cross-entropy.
•We introduce a concise but effective post-processing method.
•The evaluation is performed on the BRATS 2018 dataset.
•The results demonstrate that our method outperforms the state-of-the-art approaches.
Brain tumors are among the most aggressive and lethal cancers and lead to short life expectancy. A reliable and efficient automatic or semi-automatic segmentation method is therefore significant for clinical practice. In recent years, deep learning-based methods have achieved great success in brain tumor segmentation. However, due to limitations in parameter count and computational complexity, there is still much room for improvement. In this paper, we propose an efficient 3D residual neural network (ERV-Net) for brain tumor segmentation, which has low computational complexity and GPU memory consumption. In ERV-Net, a computation-efficient network, 3D ShuffleNetV2, is first utilized as the encoder to reduce GPU memory usage and improve the efficiency of ERV-Net, and a decoder with residual blocks (Res-decoder) is then introduced to avoid degradation. Furthermore, a fusion loss function composed of Dice loss and Cross-entropy loss is developed to address network convergence and data imbalance. Moreover, a concise and effective post-processing method is proposed to refine the coarse segmentation result of ERV-Net. Experimental results on the dataset of the multimodal brain tumor segmentation challenge 2018 (BRATS 2018) demonstrate that ERV-Net achieves the best performance, with Dice scores of 81.8%, 91.21% and 86.62% and Hausdorff distances of 2.70 mm, 3.88 mm and 6.79 mm for enhancing tumor, whole tumor and tumor core, respectively. ERV-Net also achieves high efficiency compared to state-of-the-art methods.
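The fusion loss can be sketched as a weighted sum of soft Dice and cross-entropy. The binary, NumPy-only formulation and the weight alpha below are illustrative simplifications of the paper's multi-class 3D setting:

```python
import numpy as np

def dice_loss(prob, target, eps=1e-6):
    """Soft Dice loss over a flattened binary segmentation map."""
    inter = (prob * target).sum()
    return 1.0 - (2.0 * inter + eps) / (prob.sum() + target.sum() + eps)

def ce_loss(prob, target, eps=1e-7):
    """Binary cross-entropy, averaged over voxels."""
    p = np.clip(prob, eps, 1.0 - eps)
    return float(-(target * np.log(p) + (1 - target) * np.log(1 - p)).mean())

def fusion_loss(prob, target, alpha=0.5):
    # Dice addresses class imbalance (foreground is rare in brain scans);
    # cross-entropy gives smoother gradients early in training. The weight
    # alpha is an illustrative choice, not the paper's value.
    return alpha * dice_loss(prob, target) + (1 - alpha) * ce_loss(prob, target)

target = np.array([0, 0, 1, 1, 1, 0], dtype=float)
good = np.array([0.05, 0.1, 0.9, 0.95, 0.8, 0.1])   # close to the target
bad = np.array([0.9, 0.8, 0.1, 0.2, 0.1, 0.7])      # near-inverted prediction
print(fusion_loss(good, target) < fusion_loss(bad, target))  # True
```

In practice the same composition is applied per class over 3D volumes and summed, with the network's softmax outputs in place of the toy probability vectors.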