Let's say we have a sufficiently large image library. Consider the following image retrieval scenario:
A user is looking for an image for, say, his web site design. The simplest UI for this is a text search, so the user types some keywords and gets a set of images as a result.
The problem here is that the association of keywords with images is usually quite poor in such libraries. The difference between the vocabulary used by the contributors of images and the one used by the searcher is also an issue. Thus the user may not get results that match his needs well enough.
At this stage, CBIR (content-based image retrieval) can be used. It is a nice approach: the user selects which images from the result match his needs best (and, in some techniques, also which do not match at all), and the system returns a new result set that matches the user's needs better than the original one.
So in a few CBIR iterations the user finds what he needs.
Let me repeat it: first one step of text search, then a few steps of CBIR.
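To make the "few steps of CBIR" part concrete, here is a minimal sketch of one feedback step. It assumes every image is already described by a numeric feature vector, and it uses a Rocchio-style query update, which is just one well-known relevance feedback technique and not necessarily what any particular CBIR system does. The function names, the weights alpha, beta and gamma, and the toy feature vectors are all made up for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean(vectors, dim):
    """Component-wise mean; a zero vector if the list is empty."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def rocchio_update(query, relevant, non_relevant,
                   alpha=1.0, beta=0.75, gamma=0.15):
    """Move the query towards relevant images and away from non-relevant ones.

    The weights are illustrative defaults, not tuned values.
    """
    dim = len(query)
    pos = mean(relevant, dim)
    neg = mean(non_relevant, dim)
    return [alpha * q + beta * p - gamma * n
            for q, p, n in zip(query, pos, neg)]


def rank(features, query):
    """Return image ids ordered from most to least similar to the query."""
    return sorted(features,
                  key=lambda image_id: cosine(features[image_id], query),
                  reverse=True)


# Toy library: image id -> hypothetical 3-dimensional feature vector.
features = {
    "sunset.jpg": [0.9, 0.1, 0.0],
    "beach.jpg": [0.7, 0.3, 0.1],
    "forest.jpg": [0.1, 0.9, 0.2],
    "city.jpg": [0.2, 0.2, 0.9],
}

query = [0.5, 0.5, 0.5]  # neutral starting point
# The user marks forest.jpg as relevant and city.jpg as not relevant.
query = rocchio_update(query,
                       relevant=[features["forest.jpg"]],
                       non_relevant=[features["city.jpg"]])
print(rank(features, query))
```

Each iteration repeats the same two calls: the user marks images, the query is updated, and the library is ranked again.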
Now comes the obvious idea: what if we keep the keywords of the initial text search query as tags associated with the images the user marked as relevant? In that case not only the contributors but also the majority of users of such a library end up tagging its images, which improves the text search and possibly reduces the number of CBIR steps required.
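A rough sketch of how this could look, under my own assumptions: the library keeps a weighted tag index, and when a search session ends, the keywords of the initial query are added as tags to every image the user marked as relevant. The class ImageLibrary, its methods, and the lower weight given to user-derived tags are hypothetical choices made for this example, not part of any existing system.

```python
from collections import defaultdict


class ImageLibrary:
    """Toy in-memory tag index: tag -> {image_id: weight}."""

    def __init__(self):
        self.tag_index = defaultdict(dict)

    def add_tag(self, image_id, tag, weight=1.0):
        """Attach a tag to an image, accumulating weight on repeats."""
        tag = tag.lower()
        current = self.tag_index[tag].get(image_id, 0.0)
        self.tag_index[tag][image_id] = current + weight

    def text_search(self, keywords):
        """Return image ids ordered by the summed weight of matching tags."""
        scores = defaultdict(float)
        for keyword in keywords:
            matches = self.tag_index.get(keyword.lower(), {})
            for image_id, weight in matches.items():
                scores[image_id] += weight
        # Highest score first; ties broken by image id for a stable order.
        return sorted(scores, key=lambda image_id: (-scores[image_id], image_id))

    def record_feedback(self, keywords, relevant_ids, weight=0.5):
        """Store the initial query keywords as tags on the relevant images.

        User-derived tags get a lower weight than contributor tags here;
        that is an assumption of this sketch, not a requirement.
        """
        for image_id in relevant_ids:
            for keyword in keywords:
                self.add_tag(image_id, keyword, weight)


library = ImageLibrary()
library.add_tag("img1", "sunset")  # tagged by its contributor
library.add_tag("img2", "dusk")    # contributor used a different word

print(library.text_search(["sunset"]))  # img2 is not found yet

# The user searched for "sunset", reached img2 through CBIR feedback,
# and marked it as relevant, so the query keyword is stored on img2.
library.record_feedback(["sunset"], ["img2"])

print(library.text_search(["sunset"]))  # img2 now shows up as well
```

Because the weights accumulate, an image that many users confirm for the same keyword gradually rises in the text results, which is exactly where the CBIR steps would be saved.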
I do not work with CBIR myself, so I have not come across any publication describing such an idea so far, although I am pretty sure some must exist.