I've been busy working on a new implementation of our beloved "webpin", which is actually a service for searching for packages in the insane number of repositories and packages we have: in the distribution, in all openSUSE Build Service repositories, as well as on Packman.
The thing is, it's a bit dated now, and its features are limited by the fact that it's using a relational database to perform search operations. I've been digging into Apache Solr quite a bit over the last few months (did I already mention that it totally rocks? :)) and I thought.. hmm.. why not use that for indexing packages/repositories?
So I just started out on a quick prototype, to see how well it suits the job and how well it performs. The results are quite stunning, to say the least, both in terms of performance (results take just a couple of milliseconds on a search index that includes openSUSE 11.1, 11.2, 11.3, all non-home: repositories in the OBS, as well as Packman for 11.1, 11.2 and 11.3.. that's.. quite a lot) and in terms of the quality of results. The latter is hardly a surprise, though, as Solr really excels at that: it's what it has been specifically designed and implemented for, after all.
So there it is: it's already completely functional, and consists of a Solr schema definition as well as a bunch of Perl scripts that do the crawling.
The next items on the TODO list are as follows:
After that, I shall probably implement an additional REST API that supports more features, as a wealth of more precise and/or complex search options are provided by Solr.
I will implement those (REST API and web user interface) in Java, given that there is a faster, native format for sending queries to Solr and fetching results from it. That being said, applications and web frontends that interact with Solr can be written in quite a lot of programming languages.
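To give an idea of how little it takes to talk to Solr from another language, here is a minimal sketch in Python that builds a query URL for Solr's standard /select handler over plain HTTP. The host, the "packages" core name and the "distribution" field are made up for illustration and are not taken from the actual prototype or its schema; only the q, fq, rows and wt request parameters are standard Solr.

```python
from urllib.parse import urlencode


def build_search_url(base_url, term, distribution=None, rows=10):
    """Build a URL for Solr's standard /select request handler."""
    params = [
        ("q", term),     # main query string
        ("rows", rows),  # maximum number of results to return
        ("wt", "json"),  # ask Solr for a JSON response
    ]
    if distribution is not None:
        # "distribution" is a hypothetical field name, not the real schema.
        # A filter query (fq) narrows the results without affecting scoring.
        params.append(("fq", f'distribution:"{distribution}"'))
    return base_url.rstrip("/") + "/select?" + urlencode(params)


if __name__ == "__main__":
    # Assumed local Solr instance with a core called "packages".
    print(build_search_url("http://localhost:8983/solr/packages",
                           "vim", distribution="openSUSE 11.3"))
```

Fetching that URL with any HTTP client would return the matches as JSON. Note that the sketch passes the search term through as-is, so characters that are special in Solr's query syntax would still need escaping in a real frontend.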
Once I have a prototype of the above, I'll let you know, and will ask for testing and feedback :)
If you're already interested in more information or want to help with development, please let me know (or just poke me on IRC).