Sunday, August 28, 2016

#Wikidata - La Galería de las Mujeres de Costa Rica

#Marketing is something the #Wikimedia Foundation does not do. That does not mean that concepts like KPI are foreign to the WMF. Take the list from the English article "La Galería de las Mujeres de Costa Rica"; the women listed are "women who have broken gender stereotypes and advanced human rights principles".

A lot of effort goes into fighting for a diverse Wikipedia where women are given proper attention. If I were a marketing man, I would say that lists like this provide pointers to people who want to help. I would be happy with a list that shows all the people who currently have an article, and I would be ecstatic with an automatically updating list that shows all the missing articles.

The funny thing is that technically it is not that hard to produce. It is not even that hard to include the technology in MediaWiki. But it takes a marketing man to drive the point home that you have to engage people, and that it shows the quality of a Wikipedia project when we know where we are lacking and where we should concentrate.
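To give an idea of how little is needed, here is a minimal sketch in Python of such a "missing articles" list. It only builds a SPARQL query for the Wikidata Query Service and reads the labels from a response; it is not how any existing tool or MediaWiki does it. The function names are mine, and I am assuming that membership of a list like this is recorded with "award received" (P166), which may well be modelled differently for a given list.

```python
from typing import Dict, List

# Assumption: the public Wikidata Query Service endpoint would run the query.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def missing_articles_query(award_qid: str,
                           wiki_url: str = "https://en.wikipedia.org/") -> str:
    """Build a query for people with the given award who lack an article.

    Assumption: membership is stored as "award received" (P166).
    """
    if not (award_qid.startswith("Q") and award_qid[1:].isdigit()):
        raise ValueError("expected a Wikidata item id such as 'Q42'")
    return f"""
SELECT ?person ?personLabel WHERE {{
  ?person wdt:P166 wd:{award_qid} .
  FILTER NOT EXISTS {{
    ?article schema:about ?person ;
             schema:isPartOf <{wiki_url}> .
  }}
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en" . }}
}}
""".strip()


def labels_from_response(response: Dict) -> List[str]:
    """Pull the person labels out of a SPARQL JSON result."""
    return [row["personLabel"]["value"]
            for row in response["results"]["bindings"]]


# "Q99999999" is a placeholder, not the real item for any hall of fame.
print(missing_articles_query("Q99999999"))
```

Because the query is run every time the list is shown, the list updates itself: once somebody writes the article, the person drops off.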

Tuesday, August 23, 2016

#Wikidata - Colorado Women's Hall of Fame

There is a continuous effort underway in #Wikipedia to celebrate notable women. When women are seen as role models, it is obvious that they deserve attention.

The Colorado Women's Hall of Fame is an organisation that celebrates women and every year 10 more women are included. The article on the organisation includes a list and it includes many red links. So more can be done, not only in Wikipedia but also in Wikidata.

As Wikidata matures, SPARQL is now of sufficient quality that many of the tools developed by Magnus are transitioning to it. This takes time, and in the meantime some tools are discontinued or no longer fully function. Linked Items is one such tool. It creates a list of the items that are linked in a Wikipedia text, which is ideal when a text file full of wiki links exists: you copy in the links and it generates a list of Wikidata items for you. You then need to restrict the items that are used, and for this it was possible to use WDQ, the engine that could when SPARQL for Wikidata was a distant dream. Sadly, it does not work anymore.

A solution is to take the list of items and copy it into Petscan, the tool Magnus favours. It uses SPARQL and it is something of a Swiss army knife for data. When you are used to earlier tools like Autolist, many of your assumptions are wrong and it takes time to discover how the tool works. It does work, and that is why a large number of women are now known to be in the Colorado Women's Hall of Fame.

Sunday, August 14, 2016

#Wikidata - #quality is not abstract

There is a new "Request for Comments" on quality for Wikidata. It is an attempt to describe quality in a top down approach. It is about words, it is abstract and well, I wish them well.

Wikidata has qualities. When you understand Wikidata by what it is and what it does, you understand the not so abstract qualities it has. Its principal aim is to bring structure to the data that is in the Wikimedia projects.

The first quality that Wikidata brought was that it replaced the text based interwiki links. The improvement was important; in a short space of time the quality of these interwiki links improved and the associated number of edits went down. The quality of the interwiki links is not absolute but there has been no research on the follow up.

Interwiki links represent a connection between articles of Wikimedia projects that are about the same subject. Within a Wikipedia or a Wikisource there are links that are in essence similar to Wikidata statements. When a university is mentioned, the subject may be a student or staff member at that university, and once the statement has been made there is a reason for inclusion in categories. We can research the concurrence of such statements and wiki links. Quality improves when the concurrence improves.
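One way to make "concurrence" concrete, as a rough sketch and certainly not the only possible measure, is the share of wiki links in articles that are also the target of a statement on the corresponding item. The item ids and the function name below are made up for illustration.

```python
from typing import Dict, Set


def concurrence(wikilinks: Dict[str, Set[str]],
                statements: Dict[str, Set[str]]) -> float:
    """Share of wiki links that are also expressed as a statement.

    Both arguments map an item id to the set of item ids it points to:
    once through links in its article, once through its statements.
    """
    total = 0
    matched = 0
    for item, linked in wikilinks.items():
        stated = statements.get(item, set())
        total += len(linked)
        matched += len(linked & stated)
    # With no links at all there is nothing to compare.
    return matched / total if total else 0.0


# Hypothetical ids: the article links to a university and a city,
# but only the university is covered by a statement.
links = {"Q_person": {"Q_university", "Q_city"}}
stmts = {"Q_person": {"Q_university"}}
print(concurrence(links, stmts))  # 0.5
```

Tracking a number like this over time would show whether statements are catching up with the links that already exist in the articles.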

When enough data is available, it becomes possible to use Wikidata statements in templates. Templates and info boxes expect high quality data in Wikidata, and the available data is typically not good enough. When it is easy to add statements to wiki links and red links, the data in an info box will grow with the added statements.

We do need to work on the quality for our readers. This is done best by leveraging the data we have and engaging our communities, not only to link articles together but also to expand these links with the statements that bind them together.

Yes, we will have to solve abstract issues, but the reality is that they are not so abstract. Issues have their basis in what Wikidata is; we have to understand them in terms of what we hope to achieve: serving the world with the sum of all our available knowledge.

Monday, August 08, 2016

Is convergence between #Wikipedia and #Wikidata possible?

Wikidata is piggybacking on Wikipedia, I was told. This is true; much data is imported from any and all of the Wikipedias and thereby Wikidata changes for the better. It improves in quality and becomes much more than what any single Wikipedia has to offer. At the same time Wikidata is rather awkward in its use, and there has been too much thinking in terms of what people know and expect for their own project.

Perspectives evolve. I tend to think of Wikidata as not yet good enough for most purposes. It is incomplete and its quality is inconsistent when we consider statements about its items. The remedy is obvious; work on the areas that are relevant and where Wikidata can easily make a difference.

That is a fine road map for me, but Wikipedians also use Wikidata; they even need to use Wikidata. When they add an article about a person, the authority control data is served from Wikidata and they have to add the information to Wikidata if it is to show. So what can be done to make this easy so that the use of Wikidata and Wikipedia may converge?

One aspect that seems important is that Wikidata information needs to function in whatever edit mode is used. The biggest motivational handicap I found is that most of what I did does not have an effect. It is much more rewarding when effects are more noticeable. All wiki links in an article link to other articles that have items of their own. Why not have a toggle that shows these links either with or without their relations? For the brave hearts that take an interest it is cool; the others do not even have to notice.

When such links are annotated, they result in statements, and such statements may even imply categories or other subsequent functionality. Currently bots only harvest from the Wikipedias, but why not have them add to the Wikipedias in a predetermined way? It makes for a much more dynamic editing process and it will definitely improve quality.

What do you think?