Hashing methods for image retrieval map image features to binary codes and perform image search by Hamming ranking. Since Hamming distances are discrete values, they cannot produce a fine-grained ranking among the large number of images that share the same distance. Some hashing methods try to tackle this drawback by learning weights for different hash functions or code bits. Several such methods have been proposed so far; I'll briefly summarize them.
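To see the problem concretely, here is a minimal NumPy sketch (random codes; all names are my own) counting how many database images are tied at each Hamming distance to a query. With 32-bit codes there are only 33 possible distance values, so ties are unavoidable at scale.

```python
import numpy as np

rng = np.random.default_rng(0)
n_db, n_bits = 100_000, 32

# Random binary codes standing in for a hashed image database.
db_codes = rng.integers(0, 2, size=(n_db, n_bits), dtype=np.uint8)
query = rng.integers(0, 2, size=n_bits, dtype=np.uint8)

# Hamming distance = number of differing bits.
dists = np.count_nonzero(db_codes != query, axis=1)

# Only 33 distinct distances are possible, so thousands of images
# share each value and cannot be ranked against each other.
values, counts = np.unique(dists, return_counts=True)
print(dict(zip(values.tolist(), counts.tolist())))
```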

Query adaptive ranking for image search (ICMR 2011)


Query adaptive ranking (I'll refer to it as QaRank) is, to the best of my knowledge, probably the first weighted hashing method. The pipeline of QaRank is shown as follows.
[Figure: the QaRank pipeline]

QaRank first extracts image features, then uses an existing hashing method to generate binary codes for the database and query images. In fact, existing weighted hashing methods don't care much about hash function learning; instead, they try to improve the performance of the generated binary codes. After the codes are generated, QaRank first learns class-specific weights for each image class in the database, then generates query-adaptive weights in the query stage.

For the class-specific weights, QaRank tries to learn hash weights that maximize the intra-class similarity while maintaining the inter-class relations. For the intra-class similarity, QaRank minimizes the following objective:
$$f(a_1,\cdots,a_k)=\sum_{i=1}^{k}\sum_{x\in X_i}{\Vert a_i\cdot x-a_i\cdot c^{(i)}\Vert^2}$$
$a_i$ is the weight vector to be learned for image class $i$, $x$ is any image in class $i$, and $c^{(i)}$ is the center of the binary codes in class $i$. Minimizing the above objective encourages samples within the same class to be more similar under the weighted codes. While this is very straightforward, the same goal has already been pursued in the hash function learning stage (this paper uses semi-supervised hashing), so why not just learn the hash functions and the weights simultaneously?
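Reading $a_i\cdot x$ as an element-wise weighting of the code bits (my interpretation of the notation), the intra-class term can be sketched as follows; `codes`, `labels`, and `weights` are hypothetical names.

```python
import numpy as np

def intra_class_objective(codes, labels, weights):
    """f(a_1,...,a_k): squared deviation of weighted codes from their
    weighted class centers. codes: (n, b) binary matrix, labels: (n,)
    class ids in 0..k-1, weights: (k, b) per-bit weights."""
    total = 0.0
    for i in range(weights.shape[0]):
        X_i = codes[labels == i]        # binary codes of class i
        c_i = X_i.mean(axis=0)          # class center c^(i)
        # a_i * x - a_i * c^(i) = a_i * (x - c^(i)) element-wise
        total += np.sum((weights[i] * (X_i - c_i)) ** 2)
    return total
```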

For the inter-class relations, QaRank tries to minimize the following term:
$$g(a_1,\cdots,a_k)=\sum_{i,j=1}^{k}{s_{ij}\Vert a_i\cdot c^{(i)}-a_j\cdot c^{(j)}\Vert^2}$$
where $s_{ij}$ is a proximity term between classes $i$ and $j$, computed in this paper from the similarity of the average BoW features. Intuitively, we expect similar classes to have similar codes, i.e., small Hamming distances between their centers. Finally, QaRank combines the two terms and minimizes the following objective:
$$\begin{aligned}
&\min_{a_1,\cdots,a_k} f(a_1,\cdots,a_k)+\lambda g(a_1,\cdots,a_k)\\
&\text{s.t.}\ a_i^T\mathbf{1}=1, \quad i=1,\cdots,k, \\
& a_i\geq 0, \quad i=1,\cdots,k
\end{aligned}$$
QaRank solves the above optimization problem with a quadratic programming scheme. After training, QaRank has the hash codes of the training images, as well as the corresponding weights $a_i$ for each image class.
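As a sketch of this training step, the combined objective can be minimized with a generic constrained solver; this uses SciPy's SLSQP rather than the paper's dedicated quadratic programming routine, and all variable names are my own assumptions.

```python
import numpy as np
from scipy.optimize import minimize

def learn_class_weights(codes, labels, centers, S, lam, k, n_bits):
    """Minimize f + lambda * g over the stacked weight matrix A (k x b),
    subject to a_i >= 0 and a_i^T 1 = 1 for every class i."""
    def objective(a_flat):
        A = a_flat.reshape(k, n_bits)
        f = sum(np.sum((A[i] * (codes[labels == i] - centers[i])) ** 2)
                for i in range(k))
        g = sum(S[i, j] * np.sum((A[i] * centers[i] - A[j] * centers[j]) ** 2)
                for i in range(k) for j in range(k))
        return f + lam * g

    # One sum-to-one equality constraint per class, plus non-negativity.
    cons = [{'type': 'eq',
             'fun': lambda a, i=i: a.reshape(k, n_bits)[i].sum() - 1.0}
            for i in range(k)]
    a0 = np.full(k * n_bits, 1.0 / n_bits)   # start from uniform weights
    res = minimize(objective, a0, method='SLSQP',
                   bounds=[(0.0, None)] * (k * n_bits), constraints=cons)
    return res.x.reshape(k, n_bits)
```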

In the query stage, QaRank generates query-adaptive weights for each query in a very simple way. It first ranks the database images by their Hamming distance to the query $x_q$; the top $k$ (set to 500) most similar images are then used to predict the label of the query image by fusing the weights of the corresponding classes. Once the query weights $a_q$ are generated, fine-grained image search can be performed with the weighted Hamming distance.
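Here is a minimal sketch of this query stage, assuming the fusion rule is a label-frequency-weighted average of the class weights (the paper's exact fusion may differ); all names are my own.

```python
import numpy as np

def query_adaptive_search(q, db_codes, db_labels, class_weights, top_k=500):
    """QaRank query stage: coarse Hamming ranking, weight fusion,
    then fine-grained re-ranking by weighted Hamming distance."""
    # 1) Coarse ranking by plain Hamming distance.
    ham = np.count_nonzero(db_codes != q, axis=1)
    top = np.argsort(ham)[:top_k]

    # 2) Fuse class weights of the top-k neighbors, weighting each
    #    class by how often it appears among them (my assumption).
    labels, counts = np.unique(db_labels[top], return_counts=True)
    a_q = (counts[:, None] * class_weights[labels]).sum(axis=0) / counts.sum()

    # 3) Fine-grained ranking: each differing bit costs its weight.
    weighted = ((db_codes != q) * a_q).sum(axis=1)
    return np.argsort(weighted)
```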

Query sensitive hash code ranking (CVPR 2012)

QsRank is another weighted hashing method, designed for PCA-based hashing schemes; it also develops an indexing structure and a search algorithm for $\epsilon$-neighbor search. QsRank first formulates the hashing function as:
$$H=\mathrm{sgn}(XW)$$
where $W\in \mathbb{R}^{d\times d}$ and the columns of $W$ are the eigenvectors of the data covariance matrix $X^TX$, i.e., a PCA-based approach.
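For reference, here is a minimal PCA-hashing sketch matching this formulation, keeping only the top `n_bits` eigenvectors (the usual choice when codes are shorter than the feature dimension):

```python
import numpy as np

def pca_hash(X, n_bits):
    """Project zero-centered data onto the top eigenvectors of the
    covariance matrix X^T X and binarize by sign."""
    Xc = X - X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(Xc.T @ Xc)
    W = eigvecs[:, np.argsort(eigvals)[::-1][:n_bits]]  # top directions
    return np.sign(Xc @ W), W
```

QsRank then computes the ranking score of hash code $h$ with respect to query $q$ as: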
$$QsRank(q,h,\epsilon)=\frac{\int_{y\in{S(h)\cap NN(q,\epsilon)}}p(y)dy}{\int_{y\in NN(q,\epsilon)}p(y)dy}$$
where $p(y)$ is a probability density function describing the distribution of $y$. The denominator is the probability of $y$ being an $\epsilon$-neighbor of query $q$, and the numerator is the probability of $y$ being mapped to hash code $h$ while also being an $\epsilon$-neighbor of $q$. By Bayes' rule, the above definition can be rewritten as:
$$\begin{aligned}
QsRank(q,h,\epsilon)&=\frac{P(y\in S(h),\,y\in NN(q,\epsilon))}{P(y\in NN(q,\epsilon))}\\
&=P(y\in S(h)\vert y\in NN(q,\epsilon))
\end{aligned}$$
The above formulation equals the fraction of $\epsilon$-neighbors of query $q$ that are mapped to hash code $h$. Thus QsRank measures the recall loss incurred if the data points mapped to $h$ are not retrieved.
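The paper evaluates this score with closed-form approximations; as a sanity check, here is a brute-force Monte Carlo sketch under my own simplifying assumption that $p(y)$ is uniform on the $\epsilon$-ball around $q$:

```python
import numpy as np

def qsrank_mc(q, h, eps, W, n_samples=100_000, seed=0):
    """Estimate QsRank(q, h, eps): the fraction of sampled
    eps-neighbors of q whose PCA hash code equals h (boolean vector)."""
    rng = np.random.default_rng(seed)
    d = q.shape[0]
    # Uniform samples from the eps-ball: random direction, radius
    # drawn with the r^(1/d) trick so density is uniform in volume.
    dirs = rng.normal(size=(n_samples, d))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    radii = eps * rng.random(n_samples) ** (1.0 / d)
    y = q + radii[:, None] * dirs
    # Fraction of neighbors whose code matches h.
    codes = (y @ W) >= 0
    return np.all(codes == h, axis=1).mean()
```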

QsRank further proposes approximations and efficient computation algorithms to speed up this calculation; I won't focus on those parts. A simple illustration of QsRank's idea is shown as follows:
[Figure: illustration of the QsRank idea]
In this figure, with the Hamming distance $H$ we have $H(q,11)=0$ and $H(q,01)=1$; with QsRank, the gap between the two codes becomes smaller. The intuition is that if the numbers of $\epsilon$-neighbors of $q$ that fall into buckets $01$ and $11$ are close, then the QsRank scores of the two codes should also be similar.

In my own words, QsRank uses the probability that the $\epsilon$-neighbors of query $q$ are mapped to hash code $h$ as the ranking score of hash code $h$.

Weighted hamming distance ranking (CVPR 2013)

The above QsRank is designed only for PCA-based hashing methods and $\epsilon$-neighbor search, which limits its applications; thus WhRank was proposed.


Comments