The Pareto Principle

One variant of the Pareto Principle says that “finishing the last 20% of the job is likely to take 80% of your time”. This seems to apply to the last stores left to index and to the complexity of getting their locations. Last week was a battle, and while adding the Plus retail brand this week was a breeze, the Attent franchise chain is posing new challenges again. As the number of stores brought in per visitor drops, the effort goes up.

The Plus visitor

Plus is one of the bigger brands in the Netherlands, and you can easily tell by visiting its retail website, which looks quite professional. The store locator on their website looks like this:

plus-winkels

Luckily, there is a very easy JSON data feed coming into their search page that allows us to quickly gather their store locations in our database. The visitor is one of the simplest we have created so far.

As can be seen, the site exposes a simple JSON feed that has all the store information embedded in it. The big supermarket brands understand the value of easy access to store information: the easier it is shared, the more it is copied, the better the stores are found, and the more customers they get.

Using the JavaScript “eval” function demonstrates the power of JavaScript objects as Data Transfer Objects (DTOs) in web applications, because each DTO can simply be brought to life as needed. Note that “eval” is often regarded as “evil” and an anti-pattern, because ANY code injected into the response will be executed by our application without any escaping. But I did want to showcase at least once how easy and powerful it is to do it the wrong way …
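For comparison, here is a minimal sketch of the safer route, using JSON.parse instead of “eval”. The sample payload and its field names (name, city, lat, lng) and the helper parseStores are made up for illustration; they are not the actual Plus feed or the visitor code.

```javascript
// Hypothetical sample of a store feed; the real feed's fields are not shown in this post.
const body = '[{"name":"Plus Example","city":"Utrecht","lat":52.09,"lng":5.12}]';

// The "evil" way: eval runs whatever the response contains.
// const stores = eval('(' + body + ')');

// The safer way: JSON.parse only accepts JSON data and throws on anything else.
function parseStores(text) {
  try {
    const data = JSON.parse(text);
    // Only accept the shape we expect: a list of store objects.
    return Array.isArray(data) ? data : [];
  } catch (err) {
    // Not valid JSON (for example injected script code): ignore the response.
    return [];
  }
}

const stores = parseStores(body);
console.log(stores.length, stores[0].city);
```

If the response were script code rather than data, parseStores would return an empty list instead of executing it.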

The Attent visitor (Work-In-Progress)

The challenge with Attent becomes clear as soon as you take a look at their store locator:

attent-winkels

The site only allows searching for stores within a radius of 20 km around a city or postal code. And because only around 100+ stores are expected, how do we reliably find them all? Certainly, not every position searched will give a hit.

This needs a form of “intelligent brute force”, a pattern that we have already used for the Aldi visitor, but this time on a more local scale, because the search range is very limited. In preparation, we need to find out which cities or postal codes to enter in a query to get 100% coverage of the Netherlands. For this, I wrote a quick JavaScript script that does the following:

  1. Calculate the smallest rectangular bounding box around the Netherlands,
  2. Within the bounding box, iterate over the latitude and longitude in such a way that the areas of circles with a radius of 20 km cover all of the Netherlands,
  3. For each circle’s central latitude and longitude, get the closest matching postal code,
  4. For each valid Dutch postal code returned, perform a scripted search on the site.
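Steps 1 and 2 above could be sketched roughly as follows. This is not the script I actually used, just one simple way to do it: lay a grid of squares over the bounding box, each square small enough to fit inside a 20 km circle, and use the centre of each square as a circle centre. The bounding box values are approximate, the kilometres-per-degree constant is a flat-earth approximation, and the names NL_BOX and circleCenters are made up for this example.

```javascript
// Approximate length of one degree of latitude in km (assumption: flat-earth approximation).
const KM_PER_DEG_LAT = 111.32;

// Approximate bounding box around the Netherlands (assumed values).
const NL_BOX = { minLat: 50.75, maxLat: 53.55, minLon: 3.36, maxLon: 7.23 };

// Returns circle centres such that circles of radiusKm cover the whole box.
function circleCenters(box, radiusKm) {
  // A square with side r * sqrt(2) fits exactly inside a circle of radius r;
  // the factor 0.95 leaves a small margin for the approximations used here.
  const stepKm = radiusKm * Math.SQRT2 * 0.95;
  const dLat = stepKm / KM_PER_DEG_LAT;
  const centers = [];
  for (let south = box.minLat; south < box.maxLat; south += dLat) {
    // A degree of longitude gets shorter towards the pole, so size each row
    // by its southern edge, where the cells are widest in km.
    const kmPerDegLon = KM_PER_DEG_LAT * Math.cos(south * Math.PI / 180);
    const dLon = stepKm / kmPerDegLon;
    for (let west = box.minLon; west < box.maxLon; west += dLon) {
      centers.push({ lat: south + dLat / 2, lon: west + dLon / 2 });
    }
  }
  return centers;
}

const centers = circleCenters(NL_BOX, 20);
console.log(centers.length + ' circle centres to look up');
```

A square grid wastes some overlap compared to a hexagonal packing, but it is easy to reason about, and many centres will fall in the sea or across the border anyway and drop out in step 3.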

I did finish step 2, which gives a picture like this:

NL circles

Now I need an automatic mechanism that gives me back the postal codes for the circle centers … at least once. I tried to use Google Maps for that, but I seem to be exceeding my quota sooner than expected. Maybe I am sending too many requests at the same time; I need to figure out why it only returns information for some of the queries I perform.

One lesson already learned this week is that, while NodeJS is completely asynchronous and we are supposed to write our code to leverage that to the fullest, the websites we are visiting do not like being stressed that much to return their information. Hence, a good timeout inside a closure is an inevitable, even valuable, asset in your toolkit when applying NodeJS to other, non-NodeJS web interfaces.
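As an illustration, a timeout inside a closure could look something like the sketch below. The helper throttledEach and the postal codes are made up for this example, and the task here only logs instead of doing a real request.

```javascript
// Runs task(item, callback) for each item, one at a time,
// waiting delayMs after each task before starting the next one.
function throttledEach(items, delayMs, task, done) {
  let index = 0; // kept alive by the closure below

  function next() {
    if (index >= items.length) {
      if (done) done();
      return;
    }
    const item = items[index++];
    // The task signals completion through the callback; only then
    // do we schedule the next item, after a polite pause.
    task(item, function () {
      setTimeout(next, delayMs);
    });
  }

  next();
}

// Usage with placeholder postal codes; a real task would query the site here.
throttledEach(['1011', '3511'], 500, function (postalCode, callback) {
  console.log('searching around', postalCode);
  callback();
});
```

Because the next request is only scheduled after the previous one has called back, there is never more than one request in flight, and the delay is easy to tune per site.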

In the next post we will likely be finishing the Attent visitor, and hopefully one or two more, which puts us behind schedule by a week … because I forgot about the golden 80/20 rule to start with …
