ISSN 2394-3777 (Print), ISSN 2394-3785 (Online). Available online at www.ijartet.com
International Journal of Advanced Research Trends in Engineering and Technology (IJARTET), Vol. 5, Issue 9, September 2018
Efficient Query Processing in Reverse Nearest Neighbor Search

Ms. Poonam Jain, MCA, Assistant Professor, Thakur College of Science and Commerce, Mumbai
Dr. S. K. Singh, Head of Department (Information Technology), Thakur College of Science and Commerce, Mumbai

Abstract: Reverse nearest neighbor (RNN) queries are useful in identifying objects that are of significant influence or importance. Existing methods rely on pre-computation of nearest neighbor distances, do not scale well with high dimensionality, or do not produce exact solutions. In this work we motivate and investigate the problem of reverse nearest neighbor search on high dimensional, multimedia data. We propose several heuristic algorithms to approximate the optimal solution and reduce the computational complexity. We propose exact and approximate algorithms that do not require precomputation of nearest neighbor distances and that can potentially prune off most of the search space. We demonstrate the utility of reverse nearest neighbor search by showing how it can help improve classification accuracy.

Keywords: Nearest Neighbor, Multidimensional Data, Heuristic Algorithm, Multiple Boolean Ranges.

I. INTRODUCTION

The nearest neighbor (NN) search has long been accepted as one of the classic data mining methods, and its role in classification and similarity search is well documented. Given a query object, nearest neighbor search returns the object in the database that is the most similar to the query object. Similarly, the k nearest neighbor (k-NN) search returns the k most similar objects to the query. NN and k-NN search problems have applications in many disciplines: information retrieval (find the most similar website to the query website), GIS (find the closest hospital to a certain location), and so on. Contrary to nearest neighbor search, less considered is the related but computationally much more complicated problem of reverse nearest neighbor (RNN) search. Our heuristic algorithms include Successive Selection, Greedy and Quasi-Greedy.

Given a query object, reverse nearest neighbor search finds all objects in the database whose nearest neighbor is the query object. Note that since the NN relation is not symmetric, the NN of a query object may differ from its RNN(s). Object B being the nearest neighbor of object A does not automatically make it the reverse nearest neighbor of A, since A need not be the nearest neighbor of B. Reverse nearest neighbor search has many practical applications in areas such as business impact analysis and customer profile analysis.
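To make this asymmetry concrete, the following is a minimal brute-force sketch of RNN search in Python. It is illustrative only: the Euclidean metric, the sample coordinates and the function names are our assumptions, not part of the algorithms proposed later, which avoid scanning the whole data set.

    import math

    def dist(a, b):
        # Euclidean distance between two equal-length tuples
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def nearest_neighbor(p, data):
        # nearest neighbor of p among data, excluding p itself
        return min((t for t in data if t != p), key=lambda t: dist(p, t))

    def rnn(q, points):
        # all points whose nearest neighbor, within points plus the query, is q
        data = points + [q]
        return [p for p in points if nearest_neighbor(p, data) == q]

    # B = (2, 0) is the nearest neighbor of A = (0, 0), yet B is not an RNN
    # of A, because A is not the nearest neighbor of B:
    print(rnn((0, 0), [(2, 0), (3, 0)]))   # [] : no point has A as its NN
    print(rnn((2, 0), [(0, 0), (3, 0)]))   # [(0, 0), (3, 0)]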
II. KNN FOR CLASSIFICATION

Let us see how to use KNN for classification. In this case, we are given some data points for training and a new unlabelled point for testing. Our aim is to find the class label for the new point. The algorithm behaves differently depending on k.

Case 1: k = 1 (Nearest Neighbor Rule)

This is the simplest scenario. Let x be the point to be labeled. Find the point closest to x and call it y. The nearest neighbor rule assigns the label of y to x. This seems too simplistic and sometimes even counter-intuitive. If you feel that this procedure will result in a huge error, you are right, but there is a catch: this reasoning holds only when the number of data points is not very large. If the number of data points is very large, then there is a very high chance that the labels of x and y are the same. An example might help. Say you have a (potentially) biased coin; you toss it a million times and get heads 900,000 times. Then your next call will most likely be heads. We can use a similar argument here.

Let me try an informal argument. Assume all the points lie in a D-dimensional space and the number of points is reasonably large. This means that the density of points is fairly high at any location; in other words, within any subspace there is an adequate number of points. Consider a point x in such a subspace, which also has many neighbors, and let y be its nearest neighbor. If x and y are sufficiently close, then the probability that x and y belong to the same class is fairly high, and by decision theory x and y receive the same class label. The book "Pattern Classification" by Duda and Hart has an excellent discussion of the nearest neighbor rule. One of their striking results is a fairly tight error bound for the nearest neighbor rule:

    P* <= P <= P* (2 - (c / (c - 1)) P*)

where P* is the Bayes error rate, c is the number of classes and P is the error rate of nearest neighbor. The result is indeed very striking (at least to me) because it says that if the number of points is fairly large, then the error rate of nearest neighbor is less than twice the Bayes error rate. Pretty cool for a simple algorithm like KNN. Do read the book for all the juicy details.

Case 2: k = K (k-Nearest Neighbor Rule)

This is a straightforward extension of 1-NN. Basically, we find the k nearest neighbors and do a majority vote. Typically k is odd when the number of classes is two. Say k = 5 and among the neighbors there are 3 instances of class C1 and 2 instances of class C2; KNN then labels the new point as C1, since C1 forms the majority. We follow a similar argument when there are multiple classes. One straightforward extension is not to give one vote to all the neighbors. A very common variant is weighted kNN, where each point has a weight, typically calculated from its distance. For example, under inverse distance weighting, each point has a weight equal to the inverse of its distance to the point being classified, so neighboring points get a higher vote than farther points. It is quite obvious that the accuracy may increase when you increase k, but the computational cost also increases.
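As a concrete illustration of the 1-NN, majority-vote and inverse-distance-weighted rules just described, here is a small self-contained Python sketch; the (point, label) data layout and the function names are our illustrative assumptions.

    import math
    from collections import Counter

    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    def knn_classify(x, training, k=1):
        # training: list of (point, label) pairs; k = 1 gives the 1-NN rule
        neighbors = sorted(training, key=lambda pl: dist(x, pl[0]))[:k]
        votes = Counter(label for _, label in neighbors)
        return votes.most_common(1)[0][0]   # majority vote

    def weighted_knn_classify(x, training, k=3, eps=1e-9):
        # inverse distance weighting: closer neighbors get a larger vote
        neighbors = sorted(training, key=lambda pl: dist(x, pl[0]))[:k]
        weights = Counter()
        for point, label in neighbors:
            weights[label] += 1.0 / (dist(x, point) + eps)
        return weights.most_common(1)[0][0]

    training = [((0, 0), "C1"), ((1, 0), "C1"), ((5, 5), "C2")]
    print(knn_classify((0.5, 0), training, k=3))   # "C1", 2 votes to 1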
III. RNN ALGORITHM

Our general approach for answering RNN queries proceeds in two steps. In the first step, a k-NN search is performed, returning the k closest data objects to the query; in the second step, these k objects are efficiently analyzed to answer the query. The value of k that should be used in the first step depends on the data set; based on the values of average recall we can choose the value of k we want to start with. The following subsections explain the second step in detail, with the necessary optimizations. The overall flow is sketched below.

Query object q, query parameter k
-> k-NN query on the index structure to find candidates for RNN(q)
-> efficiently eliminate some candidates by using local distance calculations
-> run a Boolean range query on the remaining candidates to determine the actual RNN
-> final RNN

Proposed Algorithm for RNN

Filtering step I

After the first step of the RNN algorithm we have the k-NN of the query point. However, some of them can be ruled out from further elaboration by local distance calculations performed within the k objects, i.e., the candidate set. The principle of the elimination is the fact that a point that does not have the query point as its nearest neighbor among the candidate set can never be the RNN of the query point in the whole data set. Therefore, a point that has another candidate as its closest neighbor among those k-NN points is eliminated from the search (note that only the k objects are involved in this test). We refer to this step as filtering step I. Our experiments show that this step can efficiently filter out a significant number of candidates.
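Filtering step I can be sketched as follows, assuming candidates holds the k-NN of q and dist is the metric used by the index (the names are ours). Note that only the k candidates are touched, never the full data set.

    def filter_step_one(q, candidates, dist):
        # A candidate p that has some other candidate closer to it than the
        # query point q can never be an RNN of q, so it is eliminated here.
        survivors = []
        for p in candidates:
            d_pq = dist(p, q)
            if all(dist(p, c) >= d_pq for c in candidates if c != p):
                survivors.append(p)
        return survivors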
After filtering step I, we have a set of candidate data points, each of which has the query point as its local NN (considering only the data points in the candidate set). For a point to be an RNN, we need to verify that the query point is its actual NN considering the whole data set. Two different approaches can be followed here. One crude approach is to find the global NN of each point and check whether it is the query point: if so, the point is an RNN; otherwise it is not. Another approach, which we refer to as filtering step II, is to run a new kind of query called the Boolean range query, covered in the following subsection.

Boolean Range Queries (Filtering step II)

A Boolean range query BRQ(q, r) returns true if the set {t ∈ S | d(q, t) < r} is not empty and returns false otherwise. A Boolean range query is thus a special case of a range query that returns either true or false depending on whether there is any point inside the given range. Boolean range queries can naturally be handled more efficiently than range queries, and this advantage is exploited in the proposed schemes. For each candidate point, we define a circular range with the candidate point as the center and the distance to the query point as the radius. If this Boolean range query returns false then the point is an RNN; otherwise it is not. The pseudo code for the Boolean range query is given below.

Procedure BooleanRangeQuery(Node n, Point q, radius)
BEGIN
  Input:  Node n to start the search;
          Point q, the query point;
          radius, the distance between the query point and the candidate point
  Output: TRUE  => at least one point inside the search region
          FALSE => no point inside the search region
  If n is a leaf node:
      if there exists a point p such that dist(p, q) <= radius, return TRUE;
  If n is an internal node, for each branch:
      if any node (MBR) intersects the search region such that at least one
      edge of the MBR is inside it, i.e., minmaxdist <= radius, return TRUE;
      else sort the intersecting MBRs w.r.t. some criterion such as mindist,
      traverse them in that order, and if any of the MBRs contains a point
      inside the region, return TRUE;
  return FALSE;
END

Boolean Range Query
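Filtering step II then reduces to one Boolean range query per surviving candidate. The sketch below is our reading of that step, assuming a brq(center, radius, exclude) callable over the whole data set; the exclude argument is our assumption, since the candidate itself trivially lies inside its own search region.

    def filter_step_two(q, candidates, brq, dist):
        # Candidate p is a true RNN of q iff no other data point lies
        # strictly inside the circle centered at p with radius dist(p, q).
        # q itself sits exactly on the boundary, so it is never strictly
        # inside, but we exclude it as well for safety.
        return [p for p in candidates
                if not brq(center=p, radius=dist(p, q), exclude={p, q})]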
Cases for a Boolean range query intersecting a region

Boolean Range Query versus Range Query

In a range query, typically multiple MBRs intersect the query region; especially when the number of dimensions is high, the number of MBRs intersecting the query region is also high. For range queries we have to access all the MBRs that intersect the search region. The main strength of the Boolean range query over the traditional range query is that even when multiple MBRs intersect the search region, we do not need to access all of them. The figure above shows five possible cases where a circular search region can overlap an MBR partially or fully. The search region is denoted by a circle and the MBRs are denoted by A, B, C, D and E.

1. MBR A is fully contained in the search region. This means that the search region contains at least one point.
2. Only one vertex of MBR B is in the search region.
3. Two of MBR C's vertices are in the search region. This means that an entire edge of the MBR is in the search region, which guarantees that at least one data point is in the search region.
4. No vertex of MBR D is in the search region.
5. The search region is fully contained in MBR E.

For a Boolean range query, if we can guarantee that at least one edge of an MBR is inside the search region, then we do not need any page access (cases A and C above). This can easily be checked by calculating the minmaxdist, defined as the minimum of the maximum possible distances between a point (here the candidate point, i.e., the center of the search region) and the faces of the intersecting MBR. If minmaxdist is less than or equal to the radius of the search region, then there is at least one point in the intersecting MBR, so no page access is needed. For example, we found that for the Isolet data set, 35% of intersecting MBRs have this property. For the other cases B, D and E we cannot say whether there is a point contained in both the search region and the MBR, so traversal of the corresponding branch may be necessary to answer the query if no relevant point has been found previously. When a range query intersects a number of MBRs, it has to retrieve all of them and make sure there is a data point in them: the answer of a range query is all the points that lie in the range. Even if the query asks only for the average of the data objects in a given range, i.e., an aggregation query, it needs to read all the data points within the MBRs and check whether they lie within the given range. A Boolean range query, however, can be answered without any page accesses, as described above, or with a minimal number of page accesses: if a single relevant point is found in one of the MBRs, then we do not have to look for other points and we can safely say that the corresponding candidate is not an RNN.
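The minmaxdist test can be computed from the MBR's corners alone, without any page access. The sketch below uses the standard MINMAXDIST formula for a point and an MBR given by its lower and upper corners; the function and parameter names are our assumptions.

    import math

    def minmaxdist(p, lo, hi):
        # Smallest distance from p within which the MBR [lo, hi] is
        # guaranteed to contain at least one data point, since every face
        # of a minimum bounding rectangle touches some data point.
        d = len(p)
        # farther edge coordinate of the MBR in each dimension
        rM = [lo[i] if p[i] >= (lo[i] + hi[i]) / 2 else hi[i] for i in range(d)]
        total = sum((p[i] - rM[i]) ** 2 for i in range(d))
        best = float("inf")
        for k in range(d):
            # nearer edge coordinate in dimension k
            rm = lo[k] if p[k] <= (lo[k] + hi[k]) / 2 else hi[k]
            best = min(best, total - (p[k] - rM[k]) ** 2 + (p[k] - rm) ** 2)
        return math.sqrt(best)

If minmaxdist(p, lo, hi) <= radius for some intersecting MBR, the Boolean range query can return true for candidate p with no page access at all, as in cases A and C.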
So, when multiple MBRs intersect the query region, a decision has to be made to maximize the chance of finding a point in an MBR, so that the query can be answered with a minimal number of MBR accesses. In choosing the MBRs, there are a number of possible strategies:

1. Sort the MBRs w.r.t. overlap with the search region and choose the MBR with maximum overlap.
2. Sort the MBRs w.r.t. mindist and choose the MBR with minimum mindist.
3. Choose randomly.

We tried all three possibilities experimentally and found that choices 1 and 2 are comparable, and both are better than choice 3. Most of the time, the set of MBRs retrieved into memory in the first phase of the algorithm is enough to guarantee that a point in the candidate set is not an RNN; in this case, no additional I/O is needed in the second phase. Therefore, almost without additional overhead on the k-NN algorithm, we can answer RNN queries.

Multiple Boolean Range Queries

In filtering step II we need to run multiple Boolean range queries, since multiple points remain after filtering step I. Because these candidates are very close to each other, multiple-query optimization becomes very natural and effective in this framework. Multiple queries are defined as a set of queries issued simultaneously, as opposed to single queries issued independently; a significant amount of research has investigated this approach for other kinds of queries. In our algorithm, a Boolean range query needs to be executed for each point in the candidate set. Since the radii of the queries are expected to be very close to each other (because the candidates are all in the k-NN of the query), executing the queries simultaneously reduces the I/O cost significantly. First we read a single page for the whole set of queries; the criterion for deciding which page to access first is to retrieve the page that has the largest number of intersections with the queries. The performance gain of this approach is discussed next, as sketched below.

K-Best Algorithm Graph
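The page-selection criterion for multiple Boolean range queries can be sketched as follows, assuming intersects(page, query) tests whether an MBR overlaps a query's circular search region (the names are illustrative):

    def next_page(pending_pages, queries, intersects):
        # Fetch first the page (MBR) that intersects the largest number of
        # pending Boolean range queries, so that a single I/O can answer,
        # or help answer, several of the queries at once.
        return max(pending_pages,
                   key=lambda page: sum(intersects(page, q) for q in queries))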
IV. CONCLUSION
In this paper, when we wish to find the objects similar to a query point, our search often needs to expand beyond the simple notion of the nearest neighbor or the k nearest neighbors. For example, we might be interested in objects that are not in the k-NN neighborhood of the query, but that find the query to be closest to them. Furthermore, in all the cases above, it is difficult to specify a range for similarity (as in range queries) or a desired number of matching objects to be returned (the k of k-NN). Multimedia data sets are typically very high dimensional, and many mining algorithms and dimensionality reduction techniques have been proposed to cope with the high dimensionality. To date, existing algorithms for exact RNN search work only for low-dimensional data. Some algorithms have been proposed for arbitrary dimensionality, but they rely on indices such as the R-tree or its variants, which do not scale well to high dimensionality. Our goal is thus to design algorithms that can scale to high dimensionality.