This paper introduces a novel rotation-based framework for arbitrary-oriented text detection in natural scene images. We present the Rotation Region Proposal Networks, which are designed to generate inclined proposals with text orientation angle information. The angle information is then adapted for bounding box regression so that the proposals fit the text region more accurately in terms of orientation. The Rotation Region-of-Interest pooling layer is proposed to project arbitrary-oriented proposals onto a feature map for a text region classifier. The whole framework is built upon a region-proposal-based architecture, which ensures the computational efficiency of arbitrary-oriented text detection compared with previous text detection systems. We conduct experiments using the rotation-based framework on three real-world scene text detection datasets and demonstrate its superiority in terms of effectiveness and efficiency over previous approaches.
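To make the idea of an inclined proposal concrete, the sketch below converts a rotated box into its four corner points. The (cx, cy, w, h, angle) parameterization and the function name are assumptions chosen for illustration; the abstract does not specify how the framework encodes its proposals.

```python
import math

def rotated_box_corners(cx, cy, w, h, angle_deg):
    """Corners of a w-by-h rectangle centred at (cx, cy), rotated by angle_deg.

    Hypothetical helper: the parameterization is an assumption, not the paper's.
    """
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    # Corner offsets of the axis-aligned box, relative to its centre.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each offset about the centre, then translate.
    return [
        (cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
        for dx, dy in offsets
    ]

corners_flat = rotated_box_corners(0, 0, 4, 2, 0)
corners_tilted = rotated_box_corners(0, 0, 4, 2, 90)
```

With an angle of 0 the result is the usual axis-aligned box; any other angle yields an inclined rectangle with the same side lengths.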
Deep neural networks have demonstrated remarkable recognition results on video classification; however, great improvements in accuracy come at the expense of large amounts of computational resources. In this paper, we introduce LiteEval for resource-efficient video recognition. LiteEval is a coarse-to-fine framework that dynamically allocates computation on a per-video basis and can be deployed in both online and offline settings. Operating by default on low-cost features that are computed from images at a coarse scale, LiteEval adaptively determines on the fly when to read in more discriminative yet computationally expensive features. This is achieved through the interaction of a coarse RNN and a fine RNN, together with a conditional gating module that automatically learns when to use more computation conditioned on incoming frames. We conduct extensive experiments on three large-scale video benchmarks, FCVID, ActivityNet and Kinetics, and demonstrate, among other things, that LiteEval offers impressive recognition performance while using significantly less computation in both online and offline settings.
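The coarse-to-fine control flow can be illustrated with a toy loop. Everything here is an assumption made for illustration: the function names, the cost values, the hand-written gate, and the running average that stands in for the RNN state. LiteEval itself learns its gate and uses two RNNs, which this sketch does not reproduce.

```python
def run_video(frames, coarse_fn, fine_fn, gate_fn, coarse_cost=1.0, fine_cost=10.0):
    """Process frames with cheap features by default, expensive ones on demand."""
    state = 0.0        # toy running summary, standing in for a recurrent state
    total_cost = 0.0
    fine_used = 0
    for frame in frames:
        coarse = coarse_fn(frame)          # always computed
        total_cost += coarse_cost
        if gate_fn(coarse, state):         # gate decides whether to spend more
            feature = fine_fn(frame)
            total_cost += fine_cost
            fine_used += 1
        else:
            feature = coarse
        state = 0.5 * state + 0.5 * feature
    return state, total_cost, fine_used

# Hypothetical stand-ins: the gate fires when the cheap feature departs
# strongly from what has been seen so far.
state, total_cost, fine_used = run_video(
    frames=[0.1, 0.9, 0.2, 0.8],
    coarse_fn=lambda f: f,
    fine_fn=lambda f: 2 * f,
    gate_fn=lambda coarse, s: abs(coarse - s) > 0.5,
)
```

In this toy run the expensive path is taken for two of the four frames, so the total cost is below that of computing fine features for every frame.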
Effective visual representation plays an important role in scene classification systems. While many existing methods focus on generic descriptors extracted from the RGB color channels, we argue for the importance of depth context, since scenes are composed with spatial variability and depth is an essential component in understanding their geometry. In this letter, we present a novel depth representation for RGB-D scene classification based on a specifically designed convolutional neural network (CNN). In contrast to previous deep models that transfer from pretrained RGB CNN models, we train the model using multiviewpoint depth image augmentation to overcome the data scarcity problem. The proposed CNN framework contains dilated convolutions to expand the receptive field and a subsequent spatial pooling to aggregate multiscale contextual information. The combination of contextual design and multiviewpoint depth images is important for obtaining a more compact representation than directly using original depth images or off-the-shelf networks. Through extensive experiments on the SUN RGB-D dataset, we demonstrate that the representation outperforms recent state-of-the-art methods, and that combining it with standard CNN-based RGB features can lead to further improvements.
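How dilation expands the receptive field can be shown with a short calculation. The layer configurations below are hypothetical and are not taken from the letter; the formulas are the standard ones for stride-1 convolutions.

```python
def effective_kernel(kernel_size, dilation):
    """Span covered by one dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)

def receptive_field(layers):
    """Receptive field of stacked stride-1 conv layers given (kernel, dilation) pairs."""
    rf = 1
    for kernel_size, dilation in layers:
        rf += (kernel_size - 1) * dilation
    return rf

# Hypothetical stacks of three 3x3 layers, with and without dilation.
plain_rf = receptive_field([(3, 1), (3, 1), (3, 1)])
dilated_rf = receptive_field([(3, 1), (3, 2), (3, 4)])
```

With the same number of weights, the dilated stack covers a 15-pixel span per axis against 7 for the plain stack.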
Co3O4 nanoparticles-assembled microrods (Mic-Co3O4) were successfully synthesized from a Co-BTC precursor (BTC = 1,3,5-benzenetricarboxylic acid) and applied to efficient propane (C3H8) oxidation. Mic-Co3O4 shows a higher reaction rate of 4.14 μmol_C3H8 g_cat^−1 s^−1 at 250 °C, whereas only 1.18 μmol_C3H8 g_cat^−1 s^−1 is obtained over Co3O4 nanoparticles (Np-Co3O4) prepared via direct calcination of cobalt nitrate. Moreover, Mic-Co3O4 retains the original morphology of the Co-BTC MOF, and the retained pores enhance the microrod rigidity, hindering nanoparticle growth and thus resulting in superior thermal stability. After a 12 h durability test at 500 °C, the size of the Mic-Co3O4 nanoparticles increases only slightly, from 62 to 70 nm, whereas that of the Np-Co3O4 nanoparticles increases from 97 to 130 nm. Meanwhile, the calcination of the Co-BTC precursor can induce large amounts of surface Co2+, favoring activation of adsorbed oxygen species. This can promote oxygen mobility, which is helpful for total propane oxidation.
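The reported figures can be put side by side with a little arithmetic. The numbers are taken from the abstract above; the variable and function names are illustrative only.

```python
# Reaction rates at 250 °C, in μmol C3H8 per gram of catalyst per second.
rate_mic = 4.14   # Mic-Co3O4
rate_np = 1.18    # Np-Co3O4
rate_ratio = rate_mic / rate_np

def growth_percent(before_nm, after_nm):
    """Relative particle growth during the durability test, in percent."""
    return (after_nm - before_nm) / before_nm * 100

growth_mic = growth_percent(62, 70)    # Mic-Co3O4 nanoparticles
growth_np = growth_percent(97, 130)    # Np-Co3O4 nanoparticles
```

The microrods are therefore about 3.5 times as active, and their particles grow by roughly 13% against roughly 34% for the reference nanoparticles.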
With the recent development of deep learning, the regression, classification, and segmentation tasks of Computer-Aided Diagnosis (CAD) using Non-Contrast head Computed Tomography (NCCT) for spontaneous IntraCerebral Hematoma (ICH) have become popular in the field of emergency medicine. However, a few challenges remain, such as the time-consuming manual evaluation of ICH volume, the excessive cost demanded by patient-level predictions, and the requirement for high performance in both accuracy and interpretability. This paper proposes a multi-task framework consisting of upstream and downstream components to overcome these challenges. In the upstream, a weight-shared module is trained as a robust feature extractor that captures global features by performing multiple tasks (regression and classification). In the downstream, two heads are used for the two different tasks (regression and classification). The final experimental results show that the multi-task framework performs better than the single-task framework. Its good interpretability is also reflected in the heatmaps generated by Gradient-weighted Class Activation Mapping (Grad-CAM), a widely used model interpretation method, which are presented in subsequent sections.
Crowd counting, i.e., counting the number of people in a crowded visual space, is emerging as an essential research problem in public security. A key to designing a crowd counting system is to create a stable, accurate, and robust model, which requires processing the feature channels of the counting network. In this study, the authors present a featured channel enhancement (FCE) block for crowd counting. First, they use a feature extraction unit to obtain and encode the information of each channel. Then, they use a non-linear variation unit to process the encoded channel information. Finally, they normalise the data and apply it to each channel separately. With the FCE, positive characteristic channels can be enhanced and weak or negative channel information can be suppressed. The authors incorporate the FCE into two compact networks, evaluate them on standard benchmarks, and show that the proposed FCE achieves promising results.
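The three steps described above (per-channel extraction, a non-linear transform, normalisation and per-channel rescaling) can be sketched as a simple channel-reweighting function. This is a minimal sketch in the spirit of squeeze-and-excitation style gating, not the authors' FCE block: the mean pooling, the tanh non-linearity, the sigmoid normalisation, and all names are assumptions.

```python
import math

def channel_reweight(feature_map, weight=1.0, bias=0.0):
    """Rescale each channel of a feature map by a gate in (0, 1).

    feature_map: list of channels, each a 2D list of floats (hypothetical layout).
    """
    # Step 1: summarise each channel by its spatial mean.
    summaries = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    centre = sum(summaries) / len(summaries)
    # Step 2: non-linear transform of the centred summaries.
    activated = [math.tanh(weight * (s - centre) + bias) for s in summaries]
    # Step 3: normalise to (0, 1) and apply one gate per channel.
    gates = [1 / (1 + math.exp(-a)) for a in activated]
    scaled = [
        [[value * gate for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates

strong = [[2.0, 2.0], [2.0, 2.0]]
weak = [[0.0, 0.0], [0.0, 0.0]]
scaled, gates = channel_reweight([strong, weak])
```

The channel with the stronger response receives the larger gate, so it is attenuated less than the weak one.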
After the implementation of the 2- and 3-child policies, the rising proportion of advanced-age and high-risk pregnancies put enormous pressure on maternal and child health (MCH) services in China. This populous nation, with its increasing population flow, urgently required the support of large-scale information systems for management. Municipal MCH information systems were commonly applied in developed cities of the eastern provinces of China. However, implementation of provincial MCH information systems in relatively low-income areas is lacking. In 2020, the implementation of a regional maternal and child information system (RMCIS) in Inner Mongolia filled this gap.
This paper aimed to demonstrate the construction process and evaluate the implementation effect of an RMCIS in improving the regional MCH in Inner Mongolia.
We conducted a descriptive study for the implementation of an RMCIS in Inner Mongolia. Based on the role analysis and information reporting process, the system architecture design had 10 modules, supporting basic health care services, special case management, health support, and administration and supervision. Five-color management was applied for pregnancy risk stratification. We collected data on the construction cost, key characteristics of patients, and use count of the main services from January 1, 2020, to October 31, 2022, in Inner Mongolia. Descriptive analysis was used to demonstrate the implementation effects of the RMCIS.
The construction and implementation of the RMCIS cost CNY 8 million (US $1.1 million), with a duration of 13 months. Between 2020 and 2022, the system recorded 221,772 registered pregnant women, with a 44.75% early pregnancy registry rate and 147,264 newborns, covering 278 hospitals and 225 community health care centers in 12 cities. Five-color management of high-risk pregnancies resulted in 76,975 (45.45%) pregnancies stratified as yellow (general risk), 36,627 (21.63%) as orange (relatively high risk), 156 (0.09%) as red (high risk), and 3888 (2.30%) as purple (infectious disease). A scarred uterus (n=28,159, 36.58%), BMI≥28 (n=14,164, 38.67%), aggressive placenta praevia (n=32, 20.51%), and viral hepatitis (n=1787, 45.96%) were the top factors of high-risk pregnancies (yellow, orange, red, and purple). In addition, 132,079 pregnancies, including 65,018 (49.23%) high-risk pregnancies, were registered in 2022 compared to 32,466 pregnancies, including 21,849 (67.30%) high-risk pregnancies, registered in 2020.
The implementation of an RMCIS in Inner Mongolia achieved the provincial MCH data interconnection for basic services and obtained both social and economic benefits, which could provide valuable experience to medical administration departments, practitioners, and medical informatics constructors worldwide.
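The year-on-year shares quoted in the results can be reproduced directly from the reported counts. The sketch below uses only figures stated above; the helper name is illustrative, and the exchange rate is merely what the two reported cost figures imply.

```python
def percent(part, whole):
    """Share of part in whole, as a percentage rounded to two decimals."""
    return round(100 * part / whole, 2)

high_risk_share_2022 = percent(65_018, 132_079)   # reported as 49.23%
high_risk_share_2020 = percent(21_849, 32_466)    # reported as 67.30%

# Growth in registered pregnancies between 2020 and 2022.
registration_growth = round(132_079 / 32_466, 2)

# CNY-per-USD rate implied by CNY 8 million and US $1.1 million.
implied_rate = round(8_000_000 / 1_100_000, 2)
```

Registrations roughly quadrupled over the period while the high-risk share fell by about 18 percentage points.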
► This paper gives a simplified multi-class SVM by introducing a relaxed error bound. ► The method reduces the size of the resulting dual problem from l × k to l. ► The experiments demonstrate that the proposed method speeds up the training process. ► Meanwhile it maintains a competitive classification accuracy.
Support vector machine (SVM) was initially designed for binary classification. To extend SVM to the multi-class scenario, a number of classification models were proposed, such as the one by Crammer and Singer (2001). However, the number of variables in Crammer and Singer's dual problem is the product of the number of samples (l) by the number of classes (k), which produces a large computational complexity. This paper presents a simplified multi-class SVM (SimMSVM) that reduces the size of the resulting dual problem from l × k to l by introducing a relaxed classification error bound. The experimental results demonstrate that the proposed SimMSVM approach can greatly speed up the training process, while maintaining a competitive classification accuracy.
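The size reduction is easy to quantify. The sketch below counts dual variables for the two formulations as described in the abstract; the sample and class counts are hypothetical values chosen for illustration.

```python
def dual_variable_counts(num_samples, num_classes):
    """Number of dual variables: l * k for Crammer and Singer, l for SimMSVM."""
    return {
        "crammer_singer": num_samples * num_classes,
        "simmsvm": num_samples,
    }

# Hypothetical problem: 10,000 samples and 26 classes.
counts = dual_variable_counts(num_samples=10_000, num_classes=26)
reduction_factor = counts["crammer_singer"] // counts["simmsvm"]
```

The dual problem shrinks by a factor equal to the number of classes, so the saving grows with k.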
• MnO2 nanoparticles were encapsulated in spheres of Ce-Mn solid solution. • The package structure of the catalyst shows superior activity for toluene oxidation. • The synergistic effect between the shell material and the cavity nanoparticles is crucial. • The shell contributes to high water resistance and thermal stability.
Highly efficient catalysts are critical for the catalytic oxidation of volatile organic compounds (VOCs). MnO2 nanoparticles encapsulated in spheres of Ce-Mn solid solution (package structure) are controllably designed and applied to the catalytic oxidation of toluene, a representative VOC. Our study indicates that the obtained Ce1Mn2 (molar ratio of Ce:Mn = 1:2) catalyst displays much better toluene oxidation activity than pristine MnO2 and CeO2. Meanwhile, the shell of Ce-Mn solid solution contributes to superior thermal stability and resistance against 5 vol% H2O. The synergistic effect between the two oxides is maximized by the package structure, giving rise to a high BET surface area, good reducibility and fast oxygen mobility, which in turn lead to the outstanding catalytic performance of Ce1Mn2. Additionally, in situ DRIFTS results indicate that the intermediate species COO and OCH(O) are found during the catalytic oxidation of toluene over Ce1Mn2, which is mainly dominated by the MvK mechanism.
Layout analysis of a document image plays an important role in document content understanding and information extraction systems. While many existing methods focus on learning knowledge with convolutional networks directly from color channels, we argue for the importance of high-frequency structures in document images, especially edge information. In this paper, we present a novel document layout analysis framework with the Explicit Edge Embedding Network (E3Net). Specifically, the proposed network contains an edge embedding block and a dynamic skip connection block to produce detailed features, as well as a lightweight fully convolutional subnet as the backbone for the effectiveness of the framework. The edge embedding block is designed to explicitly incorporate the edge information from the document images. The dynamic skip connection block aims to learn both color and edge representations with learnable weights. In contrast to previous methods, we train the model with a synthetic document approach to overcome data scarcity. The combination of data augmentation and edge embedding is important for obtaining a more compact representation than directly using the training images with only color channels. We conduct experiments using the proposed framework on three document layout analysis benchmarks and demonstrate its superiority in terms of effectiveness and efficiency over previous approaches.