March 17, 2021

Abstract S2

S2: Convolutional neural networks: beyond traditional solutions

Irina Perfilieva

Jan Platos

Jan Hula

University of Ostrava, Czech Republic

Abstract

There are many variants of the CNN architecture in the literature, but the overall structure of a CNN consists mainly of two parts. The first part performs feature extraction; the second part performs the regression and consists of fully connected layers and an output layer. Feature extraction is determined by the network architecture and its ability to learn a good representation of the data. In general, a good representation can be characterized as a sufficient statistic of the training data that is minimal and invariant to the future variability of the test data. However, there is still no comprehensive theory that explains how deep networks create optimal representations.

The second part of a CNN is determined by the choice of the loss function. In classification, the loss is usually the empirical cross-entropy, so training is prone to overfitting. This problem is usually addressed with regularization, which can be explicit (e.g., weight decay) or implicit in stochastic gradient descent. The choice of regularizer is also responsible for the network's ability to accommodate the aforementioned future variability; a schematic sketch of this two-part structure and its training objective follows the topic list below.

We solicit contributions that relate both to the development of representation theory and to advances in optimization and hardware that drive further progress in deep neural networks. Suggested topics include, but are not limited to:
  • weight initialization and weight evolution;
  • regularization and output-related feature extraction;
  • reduction of dimensionality and weight initialization;
  • feature representation of inputs;
  • global optimality in deep learning and the influence of activation functions and pooling operations.
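
To make the two-part structure described above concrete, the following minimal PyTorch sketch pairs a convolutional feature extractor (part one) with fully connected layers and an output layer (part two), trained with empirical cross-entropy under explicit L2 regularization (weight decay) in SGD. The sketch is illustrative only and not part of the original abstract: all layer sizes, the 28x28 single-channel input resolution, the number of classes, and the hyperparameters are assumptions.

    # Minimal sketch of the two-part CNN structure; all sizes and
    # hyperparameters below are illustrative assumptions.
    import torch
    import torch.nn as nn

    class TwoPartCNN(nn.Module):
        def __init__(self, num_classes: int = 10):
            super().__init__()
            # Part 1: feature extraction -- learns a representation of the input.
            self.features = nn.Sequential(
                nn.Conv2d(1, 16, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.MaxPool2d(2),                     # pooling operation
                nn.Conv2d(16, 32, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.MaxPool2d(2),
            )
            # Part 2: fully connected layers and an output layer.
            self.classifier = nn.Sequential(
                nn.Flatten(),
                nn.Linear(32 * 7 * 7, 64),           # assumes 28x28 inputs
                nn.ReLU(),
                nn.Linear(64, num_classes),          # output layer (logits)
            )

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            return self.classifier(self.features(x))

    model = TwoPartCNN()
    # Empirical cross-entropy loss; weight_decay adds explicit L2
    # regularization, while mini-batch SGD itself acts as an implicit one.
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, weight_decay=1e-4)

    # One illustrative training step on random data standing in for a batch.
    x = torch.randn(8, 1, 28, 28)                    # 8 grayscale images
    y = torch.randint(0, 10, (8,))                   # random class labels
    optimizer.zero_grad()
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()

Setting weight_decay to zero in the optimizer call leaves only the implicit regularization of stochastic gradient descent, which separates the two regularization effects discussed above.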