Week 4

Our attempt to get better search results from the search engine.

WHAT HAVE WE DONE THIS WEEK?

One of our main goals this week was to speed up the lookup functionality. After many hours of trying and debugging, the multithreading attempt finally worked. We can now split the workload of the 53,000 tuples across four threads. As a result, the lookup is more than four times as fast as before: we are down to 4.5 hours from over 22 hours last week, which is a terrific improvement to our software.
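
To illustrate the idea of splitting the tuples across four threads, here is a minimal Python sketch. It is not our production code: the `lookup` function is a hypothetical stand-in for the real search-engine call, and the helper names are made up for this example. It also assumes the lookup spends most of its time waiting on the network, which is the case where threads help.

```python
from concurrent.futures import ThreadPoolExecutor


def lookup(customer):
    """Hypothetical stand-in for the real search-engine lookup of one tuple."""
    return f"result for {customer}"


def split_into_chunks(items, parts):
    """Split items into `parts` contiguous chunks of nearly equal size."""
    size, rest = divmod(len(items), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `rest` chunks take one extra item each.
        end = start + size + (1 if i < rest else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def lookup_chunk(chunk):
    """Process one chunk sequentially inside a single thread."""
    return [lookup(customer) for customer in chunk]


def lookup_all(customers, workers=4):
    """Run the lookup for all customers, one chunk per worker thread."""
    chunks = split_into_chunks(customers, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() keeps the chunk order, so results line up with the input.
        chunk_results = list(pool.map(lookup_chunk, chunks))
    return [result for chunk in chunk_results for result in chunk]
```

Calling `lookup_all(customers)` with the default of four workers mirrors the setup described above; each thread works through its own quarter of the list.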

Another goal this week was to find a way to get better search results from the search engine. As you can see in the picture above, we decided to add a probability calculation to our search logic. It helps us decide whether a URL is likely to be the official website of a customer. We also added a “Blacklist” approach to the search logic, so websites such as Facebook, Wikipedia or LinkedIn no longer appear in the results list.
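
The combination of blacklist and probability score could look roughly like the sketch below. The scoring formula is an assumption made up for this example (a bonus if the customer name appears in the host name, plus a smaller bonus for a high position in the search results); it is not the calculation shown in the picture. The function names and the blacklist entries beyond the three sites named above are hypothetical as well.

```python
from urllib.parse import urlparse

# Sites named in this post; the real list may contain more entries.
BLACKLIST = {"facebook.com", "wikipedia.org", "linkedin.com"}


def host_of(url):
    """Return the lower-cased host name of a URL, or '' if there is none."""
    return (urlparse(url).hostname or "").lower()


def is_blacklisted(url):
    """True if the host is a blacklisted domain or one of its subdomains."""
    host = host_of(url)
    return any(host == d or host.endswith("." + d) for d in BLACKLIST)


def score(url, customer_name, rank):
    """Assumed heuristic in [0, 1]; `rank` is the 0-based result position."""
    name = "".join(ch for ch in customer_name.lower() if ch.isalnum())
    compact_host = host_of(url).replace("-", "").replace(".", "")
    name_bonus = 0.7 if name and name in compact_host else 0.0
    rank_bonus = 0.3 / (rank + 1)
    return name_bonus + rank_bonus


def best_candidates(urls, customer_name):
    """Drop blacklisted URLs and sort the rest by descending score."""
    scored = [
        (score(url, customer_name, rank), url)
        for rank, url in enumerate(urls)
        if not is_blacklisted(url)
    ]
    return sorted(scored, reverse=True)
```

The first entry returned by `best_candidates` is then the most likely official website under this heuristic; a threshold on the score could decide whether it is trustworthy enough to accept automatically.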

In the meeting, Schurter pointed out that they prefer Google to Bing as their main search engine. We added the Google search API to our project, so Schurter can now choose which search engine to use.

WHAT ARE WE GOING TO DO NEXT?

Our software is well on its way to fulfilling all requirements, but there is still a lot of work to do before we can deliver it.

The multithreading approach has to be improved to make sure the threads run without problems. We are also discussing whether it makes sense to add more threads dynamically: the bigger the file gets, the more threads would handle the work.
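
One simple way to choose the number of threads from the file size is sketched below. Both constants are assumptions for illustration: 13,250 tuples per thread corresponds to the current setup (53,000 tuples on four threads), and the upper limit of 16 threads is an arbitrary cap that would have to be tuned, for example against the request limits of the search API.

```python
import math


def worker_count(num_tuples, tuples_per_worker=13_250, max_workers=16):
    """Return how many threads to start for a file with `num_tuples` rows.

    Both defaults are assumed values, not measured optima.
    """
    needed = math.ceil(num_tuples / tuples_per_worker)
    # Always use at least one thread and never more than the cap.
    return max(1, min(needed, max_workers))
```

With these defaults, a file of 53,000 tuples still gets four threads, while larger files get proportionally more until the cap is reached.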

We also have to improve and test the probability calculation. We believe there is still room for improvement. At the weekly status report, Prof. Marfurt pointed out that an algorithm such as “Soundex” or “Levenshtein distance” might deliver even better results. We will decide next week which approach works best for our solution.
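
For reference, here is a compact Python sketch of the two suggested algorithms. These are textbook versions (the standard dynamic-programming Levenshtein distance and American Soundex), not code from our project, and how they would feed into the probability calculation is still open.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits that turn a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if equal
            ))
        prev = cur
    return prev[-1]


_SOUNDEX_GROUPS = (
    ("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
    ("l", "4"), ("mn", "5"), ("r", "6"),
)
_SOUNDEX_CODES = {ch: digit for letters, digit in _SOUNDEX_GROUPS for ch in letters}


def soundex(word):
    """American Soundex code: first letter plus three digits."""
    letters = [ch for ch in word.lower() if ch.isalpha()]
    if not letters:
        return ""
    result = letters[0].upper()
    prev = _SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate letters with equal codes
        # Vowels and any other unmapped letter get the empty code.
        code = _SOUNDEX_CODES.get(ch, "")
        if code and code != prev:
            result += code
        prev = code
    return (result + "000")[:4]
```

Levenshtein distance measures spelling differences (`levenshtein("kitten", "sitting")` is 3), while Soundex groups names that sound alike in English (`soundex("Robert")` and `soundex("Rupert")` both give `R163`). Which of the two fits company names and URLs better is exactly what we still have to test.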

SCHURTER Inc04.pptx