A Deep Learning Approach to Unsupervised Ensemble Learning and Crowdsourcing
The goal of this paper is to show that deep learning can be applied to unsupervised ensemble learning and crowdsourcing. First, the authors prove that the Dawid and Skene (DS) model, which assumes that the classifiers in an ensemble are conditionally independent given the true label, has an equivalent parameterization as a Restricted Boltzmann Machine (RBM) with a single hidden node. Second, they propose applying an RBM-based Deep Neural Network (DNN) to ensembles of classifiers that may strongly violate the conditional independence assumption of the DS model.
Dawid and Skene were the first to consider the setup in which the predictions of d classifiers on a set of n instances are combined into a single prediction, i.e., ensemble learning. The DS model makes two main assumptions: first, that the classifiers make independent errors conditional on the true label, and second, that each classifier's error probability is the same across all instances. Many works have tried to relax the second assumption, but few have addressed the first, conditional independence between the classifiers. The authors address this assumption by applying deep learning.
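Concretely, for binary labels the conditional independence assumption gives the DS model the following form (a sketch; the symbol ψ_i for the accuracy of classifier i, here taken as symmetric across classes for simplicity, is illustrative rather than the paper's exact notation):

$$ p(x_1,\dots,x_d) \;=\; \sum_{y \in \{-1,+1\}} p(y) \prod_{i=1}^{d} p(x_i \mid y), \qquad p(x_i = y \mid y) = \psi_i . $$

The product over classifiers inside the sum is exactly the conditional independence assumption: given the true label y, each classifier errs independently with its own rate.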
To show the equivalence of the DS model to an RBM with a single hidden node, the authors rely on a special case of a result proved by Chang (2015) that makes the parameters of the DS model identifiable. Given these parameters, the DS model is equivalent to an RBM with a single hidden node. Further, given at least three classifiers, the maximum-likelihood estimate λ_MLE of the RBM parameters is consistent, so the RBM posterior distribution converges to the true posterior probability of the DS model.
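The structural match can be seen from the RBM joint distribution with a single hidden node (a sketch, using {0,1}-valued hidden unit h and generic weights w_i, biases b_i, c; notation assumed, not taken from the paper):

$$ p(x, h) \;\propto\; \exp\!\Big( h \sum_{i=1}^{d} w_i x_i \;+\; \sum_{i=1}^{d} b_i x_i \;+\; c\, h \Big), \qquad p(h = 1 \mid x) \;=\; \sigma\!\Big( \sum_{i=1}^{d} w_i x_i + c \Big), $$

where σ(z) = 1/(1 + e^{-z}). Because the energy couples each visible unit x_i to h but not to other visible units, the x_i are conditionally independent given h, mirroring the DS structure with h playing the role of the true label.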
The next step is to construct the DNN. The authors use SVD to determine its architecture. Initially, an RBM with d hidden nodes (d being the number of classifiers) is trained, and the SVD of the resulting weight matrix is computed. Then m, the number of hidden units of the next layer, is set to the minimal number of singular values whose cumulative sum is at least 95% of the total sum. If 1 < m <= d, another RBM layer with m hidden units is added on top of the existing layer and trained. This process is repeated until m = 1.
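The layer-sizing rule above can be sketched as follows (a minimal illustration of the 95% cumulative-sum criterion; the function name and the exact threshold handling are assumptions, not the authors' code):

```python
import numpy as np

def choose_hidden_units(W, energy=0.95):
    """Pick the next layer's width from an RBM weight matrix W.

    Returns the minimal m such that the m largest singular values
    of W account for at least `energy` (e.g. 95%) of the total sum.
    """
    s = np.linalg.svd(W, compute_uv=False)  # singular values, descending
    cum = np.cumsum(s) / s.sum()            # cumulative fraction of the sum
    # first index where the cumulative fraction reaches the threshold
    m = int(np.searchsorted(cum, energy) + 1)
    return m
```

In the iterative construction, one would train an RBM, call `choose_hidden_units` on its weight matrix, stack a new RBM with that many hidden units if m > 1, and stop once m = 1, leaving a single top node that plays the role of the DS hidden label.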
The DNN is compared with several supervised and unsupervised models on both simulated and real-world datasets, including some in which conditional independence is strongly violated.
It is found that the DNN performs better than the compared models on both the simulated and real-world datasets. In some cases, the learned features in the last hidden layer of the DNN were perfectly uncorrelated even though the raw data contained correlated features.
Future work in this direction could analyze the SVD-based approach for determining the DNN architecture and extend the method to multiclass problems.