  • On the Performance of new H...
    Gundawar, Atharva; Lodha, Srishti; Vijayarajan, V.; Iyer, Balaji; Prasath, V. B. Surya

    Neural Processing Letters, 12/2023, Volume 55, Issue 8
    Journal Article

    Over the past few decades, many new neural network architectures and deep learning (DL) models have been developed to tackle problems more efficiently, rapidly, and accurately. For classification problems, it is typical to use fully connected layers as the network head. The dense layers in such architectures have always remained the same: they apply a linear transformation, the sum of products of the output vectors with weight vectors plus a trainable bias. In this study, we explore a different mechanism for computing a neuron's output. By adding a new feature, a product of higher-order output vectors with their respective weight vectors, we transform the conventional linear function into higher-order functions involving powers of two and above. We compare and analyze the results obtained from six different transformation functions in terms of training and validation accuracies, on a custom neural network architecture and on two benchmark datasets for image classification (CIFAR-10 and CIFAR-100). While the dense layers perform better in all epochs with the new functions, the best performance is observed with a quadratic transformation function. Although the final accuracy achieved by the existing and new models remains the same, initial convergence to higher accuracies is consistently much faster with the proposed approach, significantly reducing the computational time and resources required. This model can improve the performance of any DL architecture that uses a dense layer, with markedly larger gains in large architectures that have a very high number of parameters and output classes.
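
    The abstract describes the mechanism but not its exact formulation. As a minimal, hypothetical sketch (the class name QuadraticDense and the split into two weight matrices are illustrative assumptions, not taken from the paper), a quadratic dense layer of the kind described might compute y = W1·x + W2·(x ⊙ x) + b, giving the elementwise square of the input its own trainable weights:

    ```python
    import torch
    import torch.nn as nn

    class QuadraticDense(nn.Module):
        """Hypothetical dense layer with an added elementwise-quadratic term:
        y = x @ W1.T + (x * x) @ W2.T + b  (an assumed reading of the abstract).
        """

        def __init__(self, in_features: int, out_features: int):
            super().__init__()
            self.linear = nn.Linear(in_features, out_features)                 # W1 and bias b
            self.quadratic = nn.Linear(in_features, out_features, bias=False)  # W2, no extra bias

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Linear term plus a term over the elementwise square of the input.
            return self.linear(x) + self.quadratic(x * x)

    # Example: a classifier head for CIFAR-100 (512 features -> 100 classes).
    layer = QuadraticDense(512, 100)
    logits = layer(torch.randn(32, 512))   # batch of 32 feature vectors
    print(logits.shape)                    # torch.Size([32, 100])
    ```

    Swapping such a layer in for a standard linear head is the kind of drop-in change the abstract suggests; the extra weight matrix adds parameters but, per the authors' results, can speed early convergence.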