domingo, 12 de octubre de 2014

Algortihms, part I, Week 5: Kd-Tree

As it's become usual here is the task of the week:

"Write a data type to represent a set of points in the unit square (all points have x- and y-coordinates between 0 and 1) using a 2d-tree to support efficient range search (find all of the points contained in a query rectangle) and nearest neighbor search (find a closest point to a query point). 2d-trees have numerous applications, ranging from classifying astronomical objects to computer animation to speeding up neural networks to mining data to image retrieval."



Search & Insert Operations
The first step is to be able to build the kd-tree by inserting points to the corresponding node. This was easy because we implemented two Comparator<Point2D> classes one for the X_ORDER and another for the Y_ORDER and store a pointer to the corresponding comparator static object at each node (recall that we each node subdivides the space vertically or horizontally alternatively).

We wanted to get something like:


Now the one tricky point here is that before inserting a new node we need to check if the tree already contains that point. If we are careless we can find out that for a single insert opertion we are performing two searches one for the proper insert operation and another for the contains check. 

So one first optimization was to take advantage of the insert walk-down-tree operation to check at each node if the point is already in the tree.

Range Search & Nearest Neighbor Search
"The prime advantage of a 2d-tree over a BST is that it supports efficient implementation of range search and nearest neighbor search. Each node corresponds to an axis-aligned rectangle in the unit square, which encloses all of the points in its subtree. The root corresponds to the unit square; the left and right children of the root corresponds to the two rectangles split by the x-coordinate of the point at the root; and so forth."

As it always occur here in the programming assignments, it's not enough to code a working solution but it has to do it the faster it can.

Now Range Search didn't suppose any troubly troubles, but the Nearest Neighbor Search did!!
The NNS was a recursive function that had to perform as follows (from http://en.wikipedia.org/wiki/K-d_tree )


It took me a while since i hit the correct logic the function yet very simple though. 
Now i was passing the timing tests but not 3 of the Correctness tests the ones related to nearest. It seemed my nearest function wasn't returning the exact nearest neighbor in some special cases. So i started debugging the circle10.txt BST and see if i could reproduce the same error, and hell if i couldn't!!

Finally i realised that the problem was in an if-statement inside my nearest function, changed it, and solved it!!

Here are the results:
Compilation:  PASSED
Style:        FAILED
Findbugs:     No potential bugs found.
API:          PASSED

Correctness:  19/19 tests passed
Memory:       8/8 tests passed
Timing:       44/44 tests passed

Aggregate score: 100.00% [Correctness: 65%, Memory: 10%, Timing: 25%, Style: 0%]

And here is the video i did of my working solution:


And this is the end of the course in what concerns the programming assignments!!

Next week is the last one with only lectures and a no-coding exam, wish me luck!!


No hay comentarios:

Publicar un comentario