Support Vector Machine Active Learning for Image Retrieval
Image retrieval is a system for browsing, searching and retrieving search relevant images from a large database of digital images. Traditional image retrieval methods make use of some form of metadata associated with the images such as captions keywords etc so. In this case, the retrieval is actually performed on the annotation words rather than on the images themselves.
Hand-labelling images are prohibitively expensive, giving rise to a need for a way to allow users to convey their query concept to the database. Relevance feedback can be used to learn user query concepts. Under this method, the user is displayed a few image instances and asked to label them relevant or not. Based on the feedback from the user, another set of images are displayed and the process repeats. This is referred to as pool-based active learning. The goal is to build a learner that is able to learn how to choose informative images within the pool to display to the user for labelling. The key concept of active learning is for the learner to choose its next pool-query based on user feedback to previous pool-queries. The learner must meet two important design criteria. One, it must be able to learn targets accurately. Two, since users don’t provide a lot of feedback, it must be able to quickly grasp query concepts from a small set of labelled feedback instances.
The learner proposed in this paper is an SVMact (SVM active learner). It regards query concept learning as an SVM binary classification. SVmact learns through active learning and select the most informative instances to train the SVM classifier. Once the training is complete, SVMact returns the top k most relevant images to the user.
Till now, we have considered dealing with data that is linearly separable. When SVMact is designed to consider query concept learning to be equivalent to that of SVM binary classification, there is a need to figure out how to deal with data that is not linearly separable. The paper deals with non-linear data by using version space. This follows the assumption that non-linear data is linearly separable in another vector space. A version space is a set that contains all possible classifiers. Version space is defined as below. As the classifier needs to become more accurate with an increase in the number of feedback rounds, the version space needs to be reduced as much as possible.
It is not practical to compute the size of new space V-, V+ for an unlabelled instance in the pool. The way to approximating the size is to train SVM on a labelled image and then choose as the next query the pool instance that is closest to the hyperplane. The SVM unit vector is approximately in the centre of the version space. Distance of the unlabelled instance to the hyperplane is computed using the below function
SVMact outperforms a number of traditional query refinement techniques. The model maintains high accuracy while maintaining the quickness and high precision.