Wednesday, September 23, 2009

LIES, DAMNED LIES AND STATISTICS

Statistics are in and of themselves not that valuable; it is the interpretation of statistics that makes them relevant. When such relevance is ignored, people are accused of "gaming" the system. One of the best examples is the large addition of articles to the Volapük Wikipedia. As a result it became a top 10 project by number of articles, to the dismay of many Wikipedians. This resulted in the devaluation of article counts as a measure for Wikipedia.

The issue with dismissing the Volapük approach is that the addition of these articles achieved its objective: the amount of traffic, and consequently the exposure of Volapük, increased significantly.

The top 30 Wikipedias get 98.31% of the traffic, and languages like Hindi and Bengali are not part of this. It has been said that we want our "other" languages to do well; it is even one of the "emerging strategic priorities". When growth of our projects is a priority, we should have tools that help us decide what to do. Instead of relying on rhetoric, we could rely on statistics. When we are to rely on numbers, the question becomes which numbers. Even more important is how we will use them.

The problem with statistics is that they look back at what has already happened, while our need is to energise people. Translatewiki.net shows who was most active in the last 7 days. This activity does not get us more traffic, but we know it gives people an incentive to do more, and we should use our numbers in this way.


When we want more traffic, our statistics should help us decide what to work on. Most obvious are the articles we do not have, the articles that are not found. At this moment our statistics do not tell us what people are looking for. Once we do know, we can write the new articles or create the redirect pages and keep our audience reading our project.
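To make this concrete, here is a minimal sketch of how such a "most wanted" list could be produced. It assumes we already have a plain list of the titles people asked for and a list of the titles that exist; both inputs and the function name `most_wanted` are hypothetical, and no particular log format is implied.

```python
from collections import Counter


def most_wanted(requested_titles, existing_titles, limit=10):
    """Count requests for titles that do not exist, most requested first.

    requested_titles: iterable of titles people asked for (hypothetical input)
    existing_titles:  iterable of titles the project already has
    """
    # Compare case-insensitively so "Volapük" and "volapük" count as the same title.
    existing = {title.casefold() for title in existing_titles}
    misses = Counter(
        title.casefold()
        for title in requested_titles
        if title.casefold() not in existing
    )
    # most_common returns (title, count) pairs sorted by count, descending.
    return misses.most_common(limit)
```

Each title at the top of the result is a candidate for a new article or a redirect.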

It is feasible to generate a list of the articles newly created in the previous month and sort them by the amount of traffic they generated. This helps people focus their attention. Such lists can be easily generated for the smaller projects, and it is exactly the smaller projects where these lists will have the most impact.
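As a rough sketch of the list described above, assuming we have the creation date of every article and a view count per article for the period (both given here as simple dictionaries, which is an assumption for illustration and not how the data is actually stored):

```python
from datetime import date


def top_new_articles(created_on, views, year, month, limit=10):
    """Return (title, views) for articles created in the given month,
    sorted by traffic, highest first.

    created_on: dict mapping title -> datetime.date of creation (hypothetical input)
    views:      dict mapping title -> number of page views (hypothetical input)
    """
    new_titles = [
        title
        for title, created in created_on.items()
        if created.year == year and created.month == month
    ]
    # Articles without any recorded views count as zero traffic.
    ranked = sorted(new_titles, key=lambda t: views.get(t, 0), reverse=True)
    return [(title, views.get(title, 0)) for title in ranked[:limit]]
```

For a small project the inputs are small enough that this runs instantly, which is part of why such lists are cheap to produce there.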

NB: this is just one approach that will improve our traffic; there are others :)
Thanks,
GerardM
