A comparison of active classification methods for content-based image retrieval

Content-based image retrieval (CBIR), also known as query by image content (QBIC) and content-based visual information retrieval (CBVIR) is the application of computer vision techniques to the image retrieval problem, that is, the problem of searching for digital images in large databases. (see this survey for a recent scientific overview of the CBIR field). Content based image retrieval is opposed to concept based approaches. In machine learning and statistics, classification is the problem of identifying which of a set of categories (sub-populations) a new observation belongs, on the basis of a training set of data containing observations (or instances) whose category membership is known. The individual observations are analyzed into a set of quantifiable properties, known as various explanatory variables, features, etc. These properties may variously be categorical (e.g. Information retrieval (IR) is the area of study concerned with searching for documents, for information within documents, and for metadata about documents, as well as that of searching structured storage, relational databases, and the World Wide Web. There is overlap in the usage of the terms data retrieval, document retrieval, information retrieval, and text retrieval, but each also has its own body of literature, theory, praxis, and technologies. Machine learning, a branch of artificial intelligence, is a scientific discipline concerned with the design and development of algorithms that allow computers to evolve behaviors based on empirical data, such as from sensor data or databases. A learner can take advantage of examples (data) to capture characteristics of interest of their unknown underlying probability distribution. Data can be seen as examples that illustrate relations between observed variables. A database is an organized collection of data, today typically in digital form. The data are typically organized to model relevant aspects of reality (for example, the availability of rooms in hotels), in a way that supports processes requiring this information (for example, finding a hotel with vacancies). The term database is correctly applied to the data and their supporting data structures, and not to the database management system (DBMS). This paper deals with content-based image indexing and category retrieval in general databases. Statistical learning approaches have been recently introduced in CBIR. Labelled images are considered as training data in learning strategy based on classification process. We introduce an active learning strategy to select the most difficult images to classify with only few training data. Experimentations are carried out on the COREL database. We compare seven classification strategies to evaluate the active learning contribution in CBIR. Source.


Яндекс.Метрика Рейтинг@Mail.ru Free Web Counter
page counter
Last Modified: April 29, 2014 @ 12:00 am