As urban traffic safety becomes increasingly important, real-time crosswalk detection is playing a critical role in the transportation field. However, existing crosswalk detection algorithms must be improved in terms of accuracy and speed. This study proposes a real-time crosswalk detector called X-CDNet based on YOLOX. Based on the ConvNeXt basic module, we designed a new basic module called Reparameterizable Sparse Large-Kernel (RepSLK) convolution that can be used to expand the model’s receptive field without adding inference time. In addition, we created a new crosswalk dataset called CD9K, which is based on realistic driving scenes augmented by techniques such as synthetic rain and fog. The experimental results demonstrate that X-CDNet outperforms YOLOX in terms of both detection accuracy and speed. X-CDNet achieves an AP50 of 93.3 and a real-time detection speed of 123 FPS.
•Proposed a new basic module, RepSLK, for constructing backbone and neck networks.
•Constructed a new real-time crosswalk detection model, X-CDNet.
•Established a new crosswalk detection dataset, CD9K.
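The abstract does not spell out how RepSLK is built, so the following is only a generic sketch of the idea behind structural reparameterization: because convolution is linear, a large-kernel branch and a parallel small-kernel branch can be merged into one kernel after training, and inference then costs a single convolution. The 1-D setting, the kernel values, and the function names `conv1d_same` and `merge_kernels` are illustrative assumptions, not part of X-CDNet.

```python
def conv1d_same(signal, kernel):
    """Cross-correlate signal with an odd-length kernel, zero-padded to keep the length."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(padded[i + j] * kernel[j] for j in range(len(kernel)))
        for i in range(len(signal))
    ]


def merge_kernels(large, small):
    """Zero-pad the small kernel to the size of the large one and add the two."""
    offset = (len(large) - len(small)) // 2
    merged = list(large)
    for j, w in enumerate(small):
        merged[offset + j] += w
    return merged


# Hypothetical values chosen only for illustration.
signal = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5]
large = [0.1, 0.0, 0.2, 0.0, 0.1]   # large-kernel branch
small = [0.5, -1.0, 0.5]            # parallel small-kernel branch

# Training-time form: two branches whose outputs are summed.
two_branch = [
    a + b
    for a, b in zip(conv1d_same(signal, large), conv1d_same(signal, small))
]

# Inference-time form: one convolution with the merged kernel.
single = conv1d_same(signal, merge_kernels(large, small))
```

Both forms give the same output up to floating-point rounding, which is why the extra branch adds no inference cost once it has been merged.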
In this study, we present a novel method for modeling the canopy surface of an umbrella. Our approach represents the area between the ribs on the canopy as a trimmed bilinear patch. Furthermore, we explore various differential geometric properties of the umbrella surface. We introduce a method for unfolding the canopy surface onto a plane, which can be used to fabricate a cardboard template for accurately cutting canopy fabrics. To validate our geometric modeling method, we apply it to several umbrella models, showcasing its practical application and benefits.
•A canopy surface between ribs is represented as a trimmed bilinear patch.
•A method to unfold the canopy surface onto a plane is proposed.
•Parameters that determine the shape of the canopy surface are defined.
•An umbrella was fabricated based on our methods.
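A bilinear patch interpolates four corner points with the weights (1-u)(1-v), u(1-v), (1-u)v and uv. The sketch below evaluates such a patch; the corner coordinates and the name `bilinear_patch` are hypothetical, and the trimming and unfolding steps described in the abstract are not shown.

```python
def bilinear_patch(p00, p10, p01, p11, u, v):
    """Evaluate a bilinear patch at parameters (u, v) in [0, 1] x [0, 1]."""
    weights = ((1 - u) * (1 - v), u * (1 - v), (1 - u) * v, u * v)
    corners = (p00, p10, p01, p11)
    return tuple(
        sum(w * c[axis] for w, c in zip(weights, corners))
        for axis in range(3)
    )


# Hypothetical corner points (x, y, z); they are not coplanar,
# so the resulting patch is a curved surface.
P00 = (0.0, 0.0, 1.0)
P10 = (1.0, 0.0, 0.5)
P01 = (0.0, 1.0, 0.5)
P11 = (1.0, 1.0, 0.2)

center = bilinear_patch(P00, P10, P01, P11, 0.5, 0.5)
```

At (u, v) = (0.5, 0.5) all four weights are 0.25, so `center` is the average of the corners.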
In aquaculture, feeding is a pivotal factor in both fish growth and cultivation costs. Intelligent feeding strategies are a prerequisite for maintaining fish health and minimizing costs, and accurate recognition of fish feeding behavior is the basis for such strategies. Existing recognition models rely on datasets with high redundancy and substantial noise, and their structures are intricate and inefficient to execute. To address these issues, this study introduces a two-stage framework for discriminating fish feeding behavior. In the first stage, a re-parameterizable multi-scale object detection model is established to obtain the spatial distribution of fish schools. This stage eliminates redundant data and noise, decouples the model’s training and inference processes, equivalently transforms the complex training-time network weights, and simplifies inference. An analysis of the dataset characteristics guides the optimization of the detector design and reduces both the model parameters and the computational requirements. In the second stage, a lightweight behavior recognition model is designed around the key characteristics of the spatial distribution of fish schools, enabling rapid and accurate identification of fish feeding behavior. Extensive experiments demonstrate that the proposed method achieves high recognition accuracy (Acc 83.33%) with few model parameters and computations (6.45M Params, 8.135G FLOPs). This provides a robust model foundation for industrial application of the algorithm and underscores the potential of the proposed method for advancing intelligent feeding strategies in aquaculture.
•A comprehensive dual-phase analytical framework is introduced for examining feeding patterns.
•An innovative method integrating multi-scale information fusion and lightweight reparameterization backbone is proposed.
•A classification model is presented to capture temporal and spatial complexities in fish feeding behaviors.
•Achieved 83.33% accuracy, providing compelling evidence of its proficiency in analyzing fish feeding behavior.
Advances in Variational Inference
Zhang, Cheng; Butepage, Judith; Kjellstrom, Hedvig ...
IEEE Transactions on Pattern Analysis and Machine Intelligence, 08/2019, Volume 41, Issue 8
Journal Article, Peer reviewed, Open access
Many modern unsupervised or semi-supervised machine learning algorithms rely on Bayesian probabilistic models. These models are usually intractable and thus require approximate inference. Variational inference (VI) lets us approximate a high-dimensional Bayesian posterior with a simpler variational distribution by solving an optimization problem. This approach has been successfully applied to various models and large-scale applications. In this review, we give an overview of recent trends in variational inference. We first introduce standard mean field variational inference, then review recent advances focusing on the following aspects: (a) scalable VI, which includes stochastic approximations, (b) generic VI, which extends the applicability of VI to a large class of otherwise intractable models, such as non-conjugate models, (c) accurate VI, which includes variational models beyond the mean field approximation or with atypical divergences, and (d) amortized VI, which implements the inference over local latent variables with inference networks. Finally, we provide a summary of promising future research directions.
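The optimization problem mentioned in the abstract is usually written as maximizing the evidence lower bound (ELBO). The standard identity is given here for orientation; the symbols (observations x, latent variables z, variational parameters λ) are notation chosen for this note and are not taken from the abstract.

```latex
\log p(x)
  = \underbrace{\mathbb{E}_{q_\lambda(z)}\big[\log p(x, z) - \log q_\lambda(z)\big]}_{\mathrm{ELBO}(\lambda)}
  + \mathrm{KL}\big(q_\lambda(z) \,\|\, p(z \mid x)\big),
\qquad
q_\lambda(z) = \prod_{i} q_{\lambda_i}(z_i) \quad \text{(mean field)}
```

Since log p(x) does not depend on λ and the KL term is non-negative, maximizing the ELBO over λ is equivalent to minimizing the KL divergence from the variational distribution to the exact posterior.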
•StocIP is converted into a pure network training paradigm for high-dimensional problems making full use of backpropagation and gradient-based optimization.
•Affine transformation is embedded in the network to convert statistical parameters of physical random vector into learnable weights and biases.
•Pre-embedded subnetwork implements the reparameterization for physical random vector by reformulating sampling as a differentiable transformation.
•Maximum mean discrepancy is employed as a distribution-free loss to quantify distribution discrepancy.
•StocIPNet has solid probabilistic interpretability due to theoretical equivalence to maximum likelihood estimation method.
The stochastic inverse problem (StocIP), which aims to align push-forward and observed output distributions by estimating probability distributions of unknown system inputs, often faces optimization challenges and the curse of dimensionality. A novel deep network called StocIPNet, which comprises an affine-embedded reparameterization subnetwork (ReparNet) and a complex system metamodeling subnetwork (MetaNet), is proposed to alleviate these issues. The ReparNet subnetwork embeds the affine transformation to convert the statistical parameters of the physical random vector into learnable weights and biases, effectively implementing the reparameterization trick by separating random and deterministic elements in the stochastic sampling operation to preserve differentiability. In parallel, the MetaNet subnetwork offers a computationally efficient alternative to time-consuming forward solvers, facilitating the generation of push-forward distributions. StocIPNet uses the kernel maximum mean discrepancy (MMD) as a distribution-free loss function that quantifies the discrepancy between push-forward and observed output distributions. Because ReparNet reformulates the sampling process as a differentiable transformation and the two subnetworks are combined seamlessly, the StocIP is recast as a pure network training paradigm that preserves differentiability. This allows direct modeling and efficient inference of uncertainty within the network using automatic differentiation, backpropagation and gradient-based optimization, and it eases scaling to high-dimensional problems. The proposed framework is shown theoretically to be equivalent to the maximum likelihood method, which gives it a solid probabilistic interpretation.
The proposed framework is applied to stochastic model updating of a numerical and an experimental structure, demonstrating its effectiveness and efficiency in treating high-dimensional StocIPs.
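The two ingredients named in the abstract, an affine reparameterization of the random input and a kernel MMD loss, can be sketched in a few lines. This is a one-dimensional illustration under stated assumptions, not StocIPNet: `toy_forward` is a hypothetical stand-in for MetaNet, the Gaussian kernel bandwidth of 1.0 and the sample size of 200 are arbitrary choices, and the parameters are compared by hand rather than learned by gradient descent.

```python
import math
import random


def reparameterize(mu, sigma, eps):
    """Affine map x = mu + sigma * eps; the randomness lives only in eps."""
    return [mu + sigma * e for e in eps]


def toy_forward(x):
    """Hypothetical forward model standing in for a trained metamodel."""
    return 2.0 * x + 1.0


def gaussian_kernel(a, b, bandwidth=1.0):
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared kernel MMD."""
    def mean_kernel(us, vs):
        total = sum(gaussian_kernel(u, v, bandwidth) for u in us for v in vs)
        return total / (len(us) * len(vs))
    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


rng = random.Random(0)

# Synthetic "observed" outputs generated from assumed true input statistics.
true_inputs = reparameterize(0.5, 0.25, [rng.gauss(0.0, 1.0) for _ in range(200)])
observed = [toy_forward(x) for x in true_inputs]

# Fixed base noise, reused for every candidate (mu, sigma).
eps = [rng.gauss(0.0, 1.0) for _ in range(200)]


def loss(mu, sigma):
    """MMD^2 between push-forward samples and the observed outputs."""
    pushed = [toy_forward(x) for x in reparameterize(mu, sigma, eps)]
    return mmd_squared(pushed, observed)


loss_near_truth = loss(0.5, 0.25)
loss_far = loss(2.0, 1.0)
```

Because `eps` is held fixed, the loss is a deterministic function of `mu` and `sigma`. That is the property which lets a gradient-based optimizer update the statistical parameters directly.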
The most puzzling aspect of the ‘strange metal’ behavior of correlated electron compounds is that the linear-in-temperature resistivity often extends down to low temperatures, lower than natural microscopic energy scales. We consider recently proposed deconfined critical points (or phases) in models of electrons in large dimension lattices with random nearest-neighbor exchange interactions. The criticality is in the class of Sachdev–Ye–Kitaev models, and exhibits a time reparameterization soft mode representing gravity in dual holographic theories. We compute the low temperature resistivity in a large M limit of models with SU(M) spin symmetry, and find that the dominant temperature dependence arises from this soft mode. The resistivity is linear in temperature down to zero temperature at the critical point, with a coefficient universally proportional to the product of the residual resistivity and the coefficient of the linear-in-temperature specific heat. We argue that the time reparameterization soft mode offers a promising and generic mechanism for resolving the strange metal puzzle.
Recent research has successfully adapted vision-based convolutional neural network (CNN) architectures for audio recognition tasks using Mel-Spectrograms. However, these CNNs have high computational costs and memory requirements, limiting their deployment on low-end edge devices. Motivated by the success of efficient vision models like InceptionNeXt and ConvNeXt, we propose AudioRepInceptionNeXt, a single-stream architecture. Its basic building block breaks down the parallel multi-branch depth-wise convolutions with descending scales of k×k kernels into a cascade of two multi-branch depth-wise convolutions. The first multi-branch consists of parallel multi-scale 1×k depth-wise convolutional layers, followed by a similar multi-branch employing parallel multi-scale k×1 depth-wise convolutional layers. This reduces the computational and memory footprint while separating time and frequency processing of Mel-Spectrograms. The large kernels capture global frequencies and long activities, while the small kernels capture local frequencies and short activities. We also reparameterize the multi-branch design during inference to further boost speed without losing accuracy. Experiments show that AudioRepInceptionNeXt reduces parameters and computations by more than 50% and improves inference speed by 1.28× over state-of-the-art CNNs like the Slow–Fast while maintaining comparable accuracy. It also learns robustly across a variety of audio recognition tasks.
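The saving from replacing a k×k depth-wise kernel with a 1×k followed by a k×1 kernel can be checked with simple arithmetic: per channel, k·k weights become 2·k. The sketch below counts weights only. The channel count, the kernel sizes (3, 7, 11), and the assumption that every branch sees all channels are illustrative choices, and biases are ignored. It does not reproduce the overall reduction reported in the abstract.

```python
def depthwise_square_params(channels, k):
    """Weights in one depth-wise k x k convolution (one filter per channel)."""
    return channels * k * k


def depthwise_cascade_params(channels, k):
    """Weights in a depth-wise 1 x k convolution followed by a k x 1 convolution."""
    return channels * (k + k)


channels = 64                 # hypothetical channel count
kernel_sizes = (3, 7, 11)     # hypothetical multi-scale branch sizes

square_total = sum(depthwise_square_params(channels, k) for k in kernel_sizes)
cascade_total = sum(depthwise_cascade_params(channels, k) for k in kernel_sizes)
```

Under these assumptions, `square_total` is 64 × (9 + 49 + 121) = 11456 and `cascade_total` is 64 × (6 + 14 + 22) = 2688. The gap widens as k grows, because the square kernel scales with k² and the cascade with 2k.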
Generative Adversarial Network (GAN) is a thriving generative model, and considerable effort has gone into enhancing its generation capabilities by designing a different adversarial framework (e.g., the discriminator and the generator) or redesigning the penalty function. Although existing models have been demonstrated to be very effective, their generation capabilities have limitations. Existing GAN variants either produce identical generated instances or generate low-quality simulation data when the training data are diverse and extremely limited (a dataset consists of a set of classes but each class holds several or even one single sample) or extremely imbalanced (a category holds a set of samples and other categories hold one single sample). In this paper, we present an innovative approach to tackle this issue, which combines a joint distribution with a reparameterization method to reparameterize the randomized space as a mixture model and learns the parameters of this mixture model along with those of the GAN. We term our approach Joint Distribution GAN (JDGAN). We show that JDGAN can not only generate high-quality, diverse simulation data, but also increase the overlapping area between the generating distribution and the raw data distribution. We conduct extensive experiments on the MNIST, CIFAR10 and Mass Spectrometry datasets, all using extremely limited amounts of data, to demonstrate that JDGAN achieves the smallest Fréchet Inception Distance (FID) score and produces diverse generated data.
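One common way to reparameterize a latent space as a mixture model is to draw a component index and then apply that component's affine map to standard normal noise. The sketch below assumes a diagonal Gaussian mixture; the abstract does not state JDGAN's exact construction, so the component count, the parameter values, and the function names are hypothetical. Only the affine step is differentiable with respect to the means and scales, and the discrete component draw is left as plain sampling.

```python
import random


def reparameterized_latent(component, eps, means, sigmas):
    """z = mu_k + sigma_k * eps for the chosen mixture component k."""
    return [
        m + s * e
        for m, s, e in zip(means[component], sigmas[component], eps)
    ]


def sample_latent(weights, means, sigmas, rng):
    """Draw a component index, then a latent vector from that component."""
    component = rng.choices(range(len(weights)), weights=weights, k=1)[0]
    eps = [rng.gauss(0.0, 1.0) for _ in means[component]]
    return component, reparameterized_latent(component, eps, means, sigmas)


# Hypothetical two-component mixture over a 2-D latent space.
means = [[-2.0, 0.0], [2.0, 1.0]]
sigmas = [[0.5, 0.5], [1.0, 0.25]]
weights = [0.5, 0.5]

rng = random.Random(0)
component, z = sample_latent(weights, means, sigmas, rng)
```

In a full model the entries of `means` and `sigmas` would be trainable and updated together with the generator, which is the joint learning the abstract describes.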
Image dehazing has advanced rapidly in recent years. Several deep learning techniques have demonstrated remarkable proficiency on homogeneous dehazing. Nonetheless, current dehazing approaches are generally formulated for homogeneous haze, an assumption that often fails in real-world scenarios because real haze dispersion is uncertain. In this paper, we propose a dehazing model named RepDehazeNet that combines a structurally reparameterized encoder–decoder subnet and a Full-Resolution Attention subnet. Specifically, structural reparameterization is introduced into the encoder–decoder subnet to strengthen the feature extraction of dehazed images and improve the feature extraction speed. RepDehazeNet is compared with seven SOTA models on different datasets in terms of PSNR, SSIM, parameter quantity, and inference time. Compared to the DW-GAN model, the proposed RepDehazeNet model reduces the number of parameters by 2.7 million and improves the inference speed by 90.3%, while achieving a PSNR 0.5 dB higher on the NH-Haze2021 dataset. The experimental results demonstrate that the proposed RepDehazeNet model can effectively improve both the real-time performance and the accuracy of dehazing synthesized and nonhomogeneous haze images.
•Structural reparameterization dehazenet: outstanding performance, faster speed.
•Replacing Tanh with ReLU leads to better results.
•Transfer learning addresses the problem of insufficient samples.
•Dual subnets method proves highly effective in datasets of different scales.
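The abstract does not say which layers RepDehazeNet fuses. A common ingredient of structural reparameterization is folding a batch-normalization layer into the preceding convolution, and the sketch below shows that step for a single channel with a scalar (1×1) weight. All parameter values and the names `conv_bn` and `fold_bn` are hypothetical.

```python
import math


def conv_bn(x, weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Training-time form: scalar convolution followed by batch normalization."""
    y = weight * x + bias
    return gamma * (y - mean) / math.sqrt(var + eps) + beta


def fold_bn(weight, bias, gamma, beta, mean, var, eps=1e-5):
    """Fold the BN statistics into an equivalent weight and bias."""
    scale = gamma / math.sqrt(var + eps)
    return weight * scale, (bias - mean) * scale + beta


# Hypothetical per-channel parameters and running statistics.
params = dict(weight=0.8, bias=0.1, gamma=1.5, beta=-0.2, mean=0.3, var=0.04)
fused_weight, fused_bias = fold_bn(**params)

inputs = [-1.0, 0.0, 0.5, 2.0]
train_time = [conv_bn(x, **params) for x in inputs]
inference = [fused_weight * x + fused_bias for x in inputs]
```

The fused form gives the same outputs with one multiply-add per input. Savings of this kind are one reason a reparameterized network can run faster at inference without retraining.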