viernes, 26 de septiembre de 2014

Algorithms, part I week 3 : Collinear points

This last week has been a little bit tougher because of the requisites the assignment had to match.
This week' s exercise has been also very interesting because of the timig tests. 

This time the challenge wasn't much about coding an algorithm that works but also that works in the most efficient way this is, not all the possible solutions were fast enough and hence did not pass all the timing tests.

This was the task of the week:

"Write a program to recognize line patterns in a given set of points.

Computer vision involves analyzing patterns in visual images and reconstructing the real-world objects that produced them. The process in often broken up into two phases: feature detection and pattern recognition. Feature detection involves selecting important features of the image; pattern recognition involves discovering patterns in the features. We will investigate a particularly clean pattern recognition problem involving points and line segments. This kind of pattern recognition arises in many other applications such as statistical data analysis."

The Problem
Given a set of N distinct points in the plane, draw every (maximal) line segment that connects a subset of 4 or more of the points.

"For full credit, do not print permutations of points on a line segment (e.g., if you output pqrs, do not also output either srqp or prqs). Also, for full credit in Fast.java, do not print or plot subsegments of a line segment containing 5 or more points (e.g., if you output pqrst, do not also output either pqst or qrst); you should print out such subsegments in Brute.java."

The solution

The algorithm I designed was as follows:

1. Sort all the points in the array
2.  Find a set of k>=4 collinear points
3. Check if already processed (this means that this set is a permutation of an earlier set printed)
4. if not, then draw the segment with a single call from the first to the last point in the segment.

The tasks the algorithm had to perform were well determined, now the problem arises in how to accomplish those points.

First Approach

My first implementation did pass all the timing tests 17/17 but didn't satisfy all the Correctness tests:
Compilation:  PASSED
Style:        FAILED
Findbugs:     No potential bugs found.
API:          PASSED

Correctness:  32/36 tests passed
Memory:       1/1 tests passed
Timing:       17/17 tests passed

Raw score: 92.78% [Correctness: 65%, Memory: 10%, Timing: 25%, Style: 0%]
which means that the algorithm doesnt deal with all the possible cases and thus it's wrong.

The problem was in the step3 check. I was checking if all k collinear points with p were present in an ArrayList<Point> if they were all then the segment has already been printed and drawn. This wouldnt work all the time because there may be k points already in the list but that belong to a different new segment.

Second Approach

Despite the good raw score of 92.78% i wasnt satisfied so in this next step i wrote again the algorithm using an ArrayList<ArrayList<Point>> were each index corresponds to a different line segment that holds the array of all the collinear points of the set.

This solution passed all the Correctness tests but now was failing in the timing tests:
Compilation:  PASSED
Style:        FAILED
Findbugs:     No potential bugs found.
API:          PASSED

Correctness:  36/36 tests passed
Memory:       1/1 tests passed
Timing:       4/17 tests passed

Raw score: 80.88% [Correctness: 65%, Memory: 10%, Timing: 25%, Style: 0%]

The solution was too slow and the step responsible for this was number 3 the check operation.

Final Solution

I was a bit stuck until i realised that the check step number 3 could be performed by a simple binarySearch adding only a complexity of Lg N.
The line segment has been processed only if we can find a point P0 that appears in the ordered array before the point P. In another way: a point P has collinear points p1, p2, p3... pk that all occur after the point P. If there is a point p0 whose index in the array is less than the one from the point P currently being processed.
This, and the use of an extra array of size N gave the following by the grader:

Compilation:  PASSED
Style:        FAILED
Findbugs:     No potential bugs found.
API:          PASSED

Correctness:  36/36 tests passed
Memory:       1/1 tests passed
Timing:       17/17 tests passed

Raw score: 100.00% [Correctness: 65%, Memory: 10%, Timing: 25%, Style: 0%]

I did code three different solutions until I came in with this one with time complexity order of O(N^2 LgN). N times two mergeSorts of NLgN.

There is still a more efficient algorithm with O(N^2) based on the topological sweep.


No hay comentarios:

Publicar un comentario