Friday, May 11, 2007

Wikipedia internals

I've noticed that my post about eBay internals was very popular. So here is another post about a big and complex system: Wikipedia. It is a presentation from the MySQL Conference & Expo. A LAMP environment under heavy load... They use Apache Lucene for search. But here is what they say about Java:

Due to licensing issues Wikipedia did not run Java Virtual Machine from Sun Microsystems (non-free software is not matching free content ideals), so alternatives were chosen - at first GCJ-based solution, afterwards .net Lucene port was used on top of Mono .NET framework.
An interesting point of view... A *nix implementation of the .NET framework (sponsored by... not Microsoft but Novell) running a port of a well-known solution is considered better than using "native" Java on the JVM.

Another hint:
The major components for search are:
.....
....
Mono (or GCJ.. or JVM... depends on mood - we have support for all).
"Depends on mood" is great! :) It is still not clear to me what they actually use. But that is not a problem; I like Wikipedia and use it all the time. If a solution works, let it be.


Another question is: why Lucene? Why not Swish-e, for example? It is a nice product, and the Apache folks use it at http://search.apache.org.

3 comments:

Fernando Olmos said...

It's my first time checking a blog from your country, but yours is amazing... very well-done
congratulations

Check mine at http://boggito.blogspot.com

fernando

Stas Ostapenko said...

Hi Fernando!

I'm glad that you found my blog interesting. At least I'm trying to make it interesting :)

I had a look at your blog. I'm sure it's great, but I don't understand Spanish.

Good luck!

Anonymous said...

Well, I agree, but I think the post should have more info than it has.