Sunday, January 01, 2006

Happy New Year

I wish everyone joy and happiness. I hope that we will find the cooperation that will make WiktionaryZ the success that we hope for.

The next important deliverable is a "write-up" by Erik in which he explains what Wikidata is about, how it can be used, and how certain essentials are delivered. As much as an explanation, it will also be the mental exercise needed to get more programmers involved in Wikidata. Combine this with the amount of inline documentation that PHP provides and it will open up many possibilities to many programmers, inside the Wikimedia Foundation and out.

Let me be really clear about one thing: Wikidata is powerful stuff in its own right. In one way it is really great that its first implementation is so ambitious; when you can model this, you can model almost everything. In another it means that, as WiktionaryZ is dependent on Wikidata, it will move forward technically at the same speed. Given that the namespace manager will be in MediaWiki 1.6 and given that other infrastructure issues are addressed, things could not look more promising than they do.

For the language guys reading this: we are seriously considering "stealing" a page out of the LMF book. Particularly for lexicological information, we are going to have "Attributes" that are defined in a language-specific way and that are going to be defined in three places in particular: the Expression, the SynTrans and the DefinedMeaning level. This means that we are about to ditch the LexicalItem. This will do two things for us: it will make the core of WiktionaryZ more efficient, and it will allow us to be more language-specific from the start. As we will have Attributes that are conditional on other Attributes, and as this will be reflected in the user interface, I think this may reflect the core idea of LMF. Then again, I do not really know, as I do not really understand enough of LMF yet.
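To make the idea of Attributes on three levels a bit more concrete, here is a minimal sketch in Python. It is not the WiktionaryZ data design: the class fields, the rule table `CONDITIONAL_ATTRIBUTES` and the function `applicable_attributes` are all hypothetical names chosen for illustration, and the Dutch rules in it are only an example of one Attribute being conditional on another.

```python
from dataclasses import dataclass, field


@dataclass
class Expression:
    spelling: str
    language: str
    attributes: dict = field(default_factory=dict)


@dataclass
class DefinedMeaning:
    definition: str
    attributes: dict = field(default_factory=dict)


@dataclass
class SynTrans:
    # Assumption: a SynTrans ties one Expression to one DefinedMeaning.
    expression: Expression
    meaning: DefinedMeaning
    attributes: dict = field(default_factory=dict)


# Hypothetical, language-specific rules:
# (language, attribute, value) -> attributes that become applicable.
CONDITIONAL_ATTRIBUTES = {
    ("nl", "part_of_speech", "noun"): ["gender", "plural"],
    ("nl", "part_of_speech", "verb"): ["conjugation"],
}


def applicable_attributes(syntrans):
    """Return the attribute names a user interface could offer for this SynTrans."""
    names = ["part_of_speech"]
    language = syntrans.expression.language
    for (rule_language, attribute, value), extra in CONDITIONAL_ATTRIBUTES.items():
        if rule_language == language and syntrans.attributes.get(attribute) == value:
            names.extend(extra)
    return names


huis = Expression("huis", "nl")
meaning = DefinedMeaning("a building that people live in")
link = SynTrans(huis, meaning, {"part_of_speech": "noun"})
print(applicable_attributes(link))
```

Because the rules are keyed on the language, another language can offer a completely different set of Attributes without any change to the three core classes.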

Thanks,
GerardM

6 comments:

Anonymous said...

Could you clarify the "ditching LexicalItem" part? How are you going to characterize words/terms/lemmata?

GerardM said...

In WiktionaryZ we do not have words/terms/lemmata. We have Expression, SynTrans and DefinedMeaning; that is, once we have removed the LexicalItem from the data design.

Please have a look at the data design if this does not make sense to you. The functionality that is currently associated with the LexicalItem would be associated with SynTrans. This would allow for one thing: so far it was assumed that when a DefinedMeaning is associated with a noun, all other associated words would be nouns as well. This is not necessarily true. It would be a good default, but some languages use the same expression as both noun and verb. I do not have a name for such a part of speech, but it surely has one.

Thanks,
GerardM

Anonymous said...

What about alternate spellings of the same word (attested variations, not misspellings)? Will each of these get its own SynTrans entry? If so, these will need to be kept in sync, which looks like a maintenance burden.

GerardM said...

When a word exists on the level of SynTrans, there is not much to synchronise. This synchronisation is done courtesy of the relational database engine.

Yes, we very much want alternate spellings as well, and as many of them differ depending on the locale they come from, we want to have this locale information as an attribute.
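As a rough illustration of why there is little to synchronise, here is a small sketch using Python's built-in sqlite3 module. The table and column names are made up for this example and are not the actual WiktionaryZ schema; the assumption is that each spelling is its own Expression, linked through a SynTrans row to one shared DefinedMeaning, with the locale stored on that row.

```python
import sqlite3


def build_demo():
    """Create an in-memory database with two spellings of one meaning."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE defined_meaning (
            id INTEGER PRIMARY KEY,
            definition TEXT NOT NULL
        );
        CREATE TABLE expression (
            id INTEGER PRIMARY KEY,
            spelling TEXT NOT NULL,
            language TEXT NOT NULL
        );
        -- Hypothetical layout: locale kept as an attribute of the link.
        CREATE TABLE syntrans (
            expression_id INTEGER NOT NULL REFERENCES expression(id),
            meaning_id INTEGER NOT NULL REFERENCES defined_meaning(id),
            locale TEXT
        );
    """)
    conn.execute(
        "INSERT INTO defined_meaning VALUES (1, 'visual property such as red or blue')"
    )
    conn.executemany(
        "INSERT INTO expression VALUES (?, ?, ?)",
        [(1, "colour", "en"), (2, "color", "en")],
    )
    conn.executemany(
        "INSERT INTO syntrans VALUES (?, ?, ?)",
        [(1, 1, "en-GB"), (2, 1, "en-US")],
    )
    return conn


def spellings_for(conn, meaning_id):
    """Return (spelling, locale, definition) for every spelling of a meaning."""
    cursor = conn.execute(
        "SELECT e.spelling, s.locale, m.definition "
        "FROM syntrans s "
        "JOIN expression e ON e.id = s.expression_id "
        "JOIN defined_meaning m ON m.id = s.meaning_id "
        "WHERE s.meaning_id = ? ORDER BY e.spelling",
        (meaning_id,),
    )
    return cursor.fetchall()


conn = build_demo()
for row in spellings_for(conn, 1):
    print(row)
```

The definition is stored once, so editing it in `defined_meaning` is immediately visible through both spellings; nothing has to be copied between them.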

Thanks,
GerardM

Anonymous said...

WiktionaryZ? I hope the name hasn't been decided yet, because I tend to favour the other suggestion (Wiktionary^2). ;-)

Is there an "official" site where we can add our suggestions?

Anonymous said...

Talking of big words, one of my bosses at g3 creative used "bibulation" all the time. I still do not have a clue what it means, but it seemingly happened a lot on a night out.