Marcel Neuhausler's World: November 2006 Archives

November 24, 2006

Open source hardware, software and more for the holidays

The Open source gift guide:

"There are hundreds of gift guides this holiday season filled with junk you can buy - but a lot of time you actually don't own it, you can't improve upon it, you can't share it or make it better, you certainly can't post the plans, schematics and source code either. We want to change that, we've put together our picks of interesting open source hardware projects, open source software, services and things that have the Maker-spirit of open source. Some are kits, some are open software projects that you'll need to build hardware for before gifting, and some are just support for the projects/groups that do open source. Included in this guide are things you can get from the MAKE store too (we try and have as many open source goods as possible)."

(Via MAKE Magazine.)

.. simple hardware + internet .. definitely a powerful combination ..

November 11, 2006

Grails

"Grails aims to bring the 'coding by convention' paradigm to Groovy. It's an open-source web application framework that leverages the Groovy language and complements Java Web development. You can use Grails as a standalone development environment that hides all configuration details or integrate your Java business logic. Grails aims to make development as simple as possible and hence should appeal to a wide range of developers not just those from the Java community."

.. a pretty impressive "competitor" to Ruby on Rails! .. and it's all Java, easily deployable on any Java EE server or servlet container, like Tomcat .. I started by reading the excellent introduction by Harshad Oak ..

November 01, 2006

Krugle: The art of ranking code search results

The art of ranking code search results: "I am often asked the question - so how do you rank your source code results?

While the ranking of web page results has a well understood set of heuristics and algorithms, this is somewhat uncharted territory as far as ranking source code goes. For web pages most search engines use some version of link analysis to derive a static (independent of the query) score for each page, and then apply the run-time query text against the page content, inbound link anchor text, and other heuristics to come up with a final score and ranking for their hits.

But what does it mean to rank source code files? How does one say that file A which has the word ‘test’ in it should rank higher than file B which also contains the same text?

In our earlier attempts at this we tried things like just boosting files that were named ‘test’ to the top of the list - that soon got to be ridiculous, when we started seeing the top 20 results all named ‘test’.

Another approach we tried was boosting the repository - not all repositories are created equal… Well, we soon got into a state where the top results were from just one repository.

We have now settled down into something we think is more meaningful to us and our users.

The filename is taken into account, but so is the project: how active it has been, how big it is, and other project specific details. But unlike parsing web pages as just a stream of text, we do either full code parsing or some fuzzy parsing to extract meaningful syntactic elements from source code. For example, we know whether the word ‘test’ is in a comment versus it being a function call or a function definition.

So for the Krugle source code ranking recipe, we combine repository and project-level information to generate a static code file score, and then use syntactic information to boost function definitions over function calls, function calls over comment text, and so on.

In the office we still have passionate debates about what we ought to return for a general query such as ‘language:java’ - how does one rank something so generic? That, IMHO, is a user experience issue and not a ranking problem - we either need to detect these types of queries and generate alternative, meaningful results, or we need to convince our users that they shouldn’t be doing that.

Anyway, the above represents where we are after a year of work, but I’m sure it will continue to evolve. Let us know if it isn’t (or is) working for you - thanks!"

(Via Krugle Blog.)

.. I'm pretty sure that we will see more and more of these domain-specific search engines .. Krugle is a good example of that, as the article above shows .. they use domain-specific logic to improve search results, something a generic search engine just can't do ..
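
.. to make the recipe from the Krugle post a bit more concrete, here is a minimal Python sketch of the two-stage idea: a static, query-independent score per file built from repository and project information, then a query-time boost that prefers function definitions over function calls over comment text. This is not Krugle's actual code: every name, field and weight below (CodeFile, repo_quality, SYNTAX_BOOST, the 0.5 factors) is a hypothetical assumption chosen only to illustrate the shape of the approach ..

```python
import math
from dataclasses import dataclass, field

# Hypothetical boosts: definitions outrank calls, calls outrank comments.
SYNTAX_BOOST = {"definition": 3.0, "call": 2.0, "comment": 1.0}


@dataclass
class CodeFile:
    path: str
    repo_quality: float      # assumed 0..1 score for the hosting repository
    project_activity: float  # assumed 0..1 score for how active the project is
    project_size: int        # assumed: number of files in the project
    # term -> syntactic contexts in which the term occurs in this file
    matches: dict = field(default_factory=dict)


def static_score(f: CodeFile) -> float:
    """Query-independent score from repository and project information."""
    # Log-damp the size so huge projects do not swamp everything else.
    size_factor = math.log10(1 + f.project_size)
    return f.repo_quality + f.project_activity + 0.5 * size_factor


def query_score(f: CodeFile, term: str) -> float:
    """Static score scaled by the best syntactic context of the term."""
    kinds = f.matches.get(term, [])
    if not kinds:
        return 0.0
    best = max(SYNTAX_BOOST.get(kind, 0.0) for kind in kinds)
    # The filename counts, but only as a small bonus rather than an override.
    filename = f.path.rsplit("/", 1)[-1].lower()
    filename_bonus = 0.5 if term.lower() in filename else 0.0
    return static_score(f) * (best + filename_bonus)


def rank(files: list, term: str) -> list:
    """Return paths of matching files, best score first."""
    scored = [(query_score(f, term), f.path) for f in files]
    hits = [sp for sp in scored if sp[0] > 0]
    return [path for _, path in sorted(hits, key=lambda sp: sp[0], reverse=True)]


if __name__ == "__main__":
    files = [
        CodeFile("a/util.py", 0.5, 0.5, 99, {"test": ["comment"]}),
        CodeFile("b/util.py", 0.5, 0.5, 99, {"test": ["call"]}),
        CodeFile("c/util.py", 0.5, 0.5, 99, {"test": ["definition", "comment"]}),
    ]
    print(rank(files, "test"))
```

.. the filename is deliberately only a small additive bonus here, which is one way to avoid the "top 20 results all named 'test'" problem the post describes; a real engine would tune all of these weights against actual user queries ..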