List of products/projects/ideas that caught my eye (again) while attending the excellent QCon conference last week in San Francisco:
JetBrains MPS, DSLs, Microsoft Oslo + article, oAW, GridGain, YQL, Yahoo Pipes, LINQ, Yaws, mochiweb, JSON, CouchDB, Hypertable, Neo4J, OAuth, AtomPub, Httpbis, MagLev, MogileFS, hadoop, Gearman, Thrift, Bloom filter, Lucene
The art of ranking code search results: "I am often asked the question - so how do you rank your source code results ?
While the ranking of web page results has a well understood set of heuristics and algorithms, this is somewhat uncharted territory as far as ranking source code goes. For web pages most search engines use some version of link analysis to derive a static (independent of the query) score for each page, and then apply the run-time query text against the page content, inbound link anchor text, and other heuristics to come up with a final score and ranking for their hits.
But what does it mean to rank source code files? How does one say that file A which has the word ‘test’ in it should rank higher than file B which also contains the same text?
In our earlier attempts at this we tried things like just boosting files that were named ‘test’ to the top of the list - that soon got to be ridiculous, when we started seeing the top 20 results all named ‘test’.
Another approach we tried was boosting the repository - not all repositories are created equal… Well, we soon got into a state where the top results were from just one repository.
We have now settled down into something we think is more meaningful to us and our users.
The filename is taken into account, but so is the project: how active it has been, how big it is, and other project specific details. But unlike parsing web pages as just a stream of text, we do either full code parsing or some fuzzy parsing to extract meaningful syntactic elements from source code. For example, we know whether the word ‘test’ is in a comment versus it being a function call or a function definition.
So for the Krugle source code ranking recipe, we combine repository and project-level information to generate a static code file score, and then use syntactic information to boost function definitions over function calls, function calls over comment text, and so on.
In the office we still have passionate debates about what we ought to return for a general query such as ‘language:java’ - how does one rank something so generic? That, IMHO, is a user experience issue and not a ranking problem - we either need to detect these types of queries and generate alternative, meaningful results, or we need to convince our users that they shouldn’t be doing that.
Anyway, the above represents where we are after a year of work, but I’m sure it will continue to evolve. Let us know if it isn’t (or is) working for you - thanks!"
(Via Krugle Blog.)
.. I'm pretty sure that we will see more and more of these domain-specific search engines .. Krugle is a good example of that, as the article above proves .. they use specific domain logic to improve search results, something a generic search engine just can't do ..
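.. to make the recipe quoted above a bit more concrete, here is a minimal sketch in Python .. it is only my reading of the description, not Krugle's actual implementation: the field names, the 0-to-1 scaling of the project signals, and all weights are assumptions made up for illustration ..

```python
from dataclasses import dataclass, field

# Hypothetical boost factors: the post only gives the ordering
# (definition > call > comment), not the actual numbers.
SYNTAX_BOOST = {"definition": 3.0, "call": 2.0, "comment": 1.0}


@dataclass
class CodeFile:
    path: str
    project_activity: float  # assumed to be normalized to [0, 1]
    project_size: float      # assumed to be normalized to [0, 1]
    # Syntactic contexts in which the query term was found in this file.
    hit_kinds: list = field(default_factory=list)


def static_score(f: CodeFile) -> float:
    """Query-independent score built from project-level signals."""
    return 0.6 * f.project_activity + 0.4 * f.project_size


def final_score(f: CodeFile) -> float:
    """Scale the static score by the strongest syntactic hit in the file."""
    boost = max((SYNTAX_BOOST.get(k, 1.0) for k in f.hit_kinds), default=0.0)
    return static_score(f) * boost


def rank(files):
    """Return files ordered from best to worst match."""
    return sorted(files, key=final_score, reverse=True)


if __name__ == "__main__":
    hits = [
        CodeFile("notes.c", 0.5, 0.5, ["comment"]),
        CodeFile("runner.c", 0.5, 0.5, ["definition"]),
        CodeFile("main.c", 0.5, 0.5, ["call"]),
    ]
    for f in rank(hits):
        print(f.path, round(final_score(f), 2))
```

.. with equal project scores, the file that defines the function comes first, then the file that calls it, then the one that only mentions it in a comment .. a file with no hits at all gets a score of zero ..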
What a brilliant idea! Google Co-op lets you define your own search engine: you choose which sites get included in your own Google search.
"Harness the power of Google search to create a free Custom Search Engine that reflects your knowledge and interests. Specify the websites that you want searched - and integrate the search box and results into your own website."
.. and as a nice side effect for Google, it gets information filtering for free, and of course, some more revenue from their ads, shared partially with you.. just brilliant ..
..kind of hard for me .. but I have to say, this editor from Microsoft to create or edit blog posts looks pretty cool .. and it has an amazing preview mode! .. and it even works with Movable Type ..
"Here's a list of thirteen things you really should try with Flock. We're bragging, of course, but at the end of the list you'll also find a few warnings about things we're still working on."
.. that could become big .. still a little bit buggy .. but it shows how a web browser could look and act in the not so distant future .. an interesting front end to your blog, to del.icio.us, and your bookmarks .. and it's all based on the Mozilla/Firefox platform, therefore it runs on Windows, OS X, and Linux .. cool ..
information aesthetics: "inspired by Manovich's definition of information aesthetics, this weblog explores the symbiotic relationship between creative design and the field of information visualization, in an emergent multidisciplinary field what could be coined as 'creative information visualization'."
.. cool blog! .. many interesting examples of information visualization ..
Wired News: One Login to Bind Them All: "Between Friendster profiles, Flickr photo streams, LiveJournal blogs and del.icio.us bookmarks -- not to mention e-mailing, instant messaging and Skyping -- the much-ballyhooed 'social web' can feel like a slippery slope to multiple personality disorder. But if a still-under-development service called the GoingOn Network lives up to its hype, our online selves may soon enjoy a long-overdue digital reintegration."
.. who will own the central social network directory .. the fight is on ..
plazes.beta: "Plazes is the first global location-aware interaction and geo-information system, connecting you with the people and Plazes in your area and all over the world. It is the navigation system for your social life and it's absolutely free."
..interesting, have to check it out .. don't yet know what to think about it .. at least the design of the site looks cool :-) ..
"Using FeedMap you can geo-code your blog, browse already geo-coded blogs and search for blogs. Once geo-coded, you can get your own BlogMap location using a simple url that allows you to network with your local bloggers and much more!"
.. interesting, but I think they are missing some opportunities .. anyway, they are using Microsoft's Map service and not Google's ..
.. don't know if that will fly ..
HonorTags, as proposed by Dan Gillmor, should help readers find content they can trust, and help journalists, bloggers, podcasters and other creators build that trust within their communities. As a creator, you can tag the postings on your own blog or other site to indicate your intentions.
..that's the way to do it..
Backpack becomes a web service:
Backpack is not just for you to love, but for machines too. The brand new Backpack API makes it possible for other programs to easily talk to your backpack. That opens the door to Dashboard widgets, weblog integration, command-line tools, and much more.
We’ve created a forum to go with the API, too. Let us know of your creations and share them if you can. The API is not all finalized, so hold off with the nuclear reactor integration for a couple of weeks. But have fun experimenting today.
If you’re working with Ruby, have a look at this sample wrapper for the API.
(Via Backpack Weblog.)
Strategic Weaknesses of Folksonomies:
There has been some talk about technorati and del.icio.us and the issue of tag spam. But any system whether Wiki or tagging or other, that is open will suffer from the same basic strategic weakness...there are potential solutions though...
What are the contributing factors that allow spam to happen?
(Via Get Real.)
It was only a matter of time, but Google is apparently experimenting with embedding AdSense ads in RSS feeds. Yahoo has been experimenting with RSS ads since last fall. We imagine that by the end of the year, most serious publishers will have at least tried ads in their feeds (Forrester analyst Charlene Li hosted two standing-room only sessions on RSS ads at AdTech this week). This could significantly change the complexion of the RSS experience, which has been blessedly light on marketing to this point. On the other hand, here's hoping that this new money-making opportunity encourages more publishers to offer full-text RSS feeds. It'll also be interesting to see if the ad networks can figure out how to do a better job of contextually matching the ads to the content (why so many Bob Marley and mortgage ads in Fred Wilson's feed, for example?)
More coverage: Tony Gentile, Battelle, Technorati, Feedster.
(Via SiliconBeat.)
Want to know your online social reputation?:
Go to Opinity, which has just launched a Web site that tracks a user's reputation based on their behavior on classifieds and auctions sites, and on social networking Web sites. Apparently they want to partner with sites like eBay or Slashdot. We get the heebies when thinking about this stuff, but maybe it's just us -- or because of all the news lately about cookie and pie.
Anyway, these guys are serious. The San Jose-based start-up just raised $2.7 million in venture capital from SoftBank Ventures Korea, which led the round, and Solborn Venture Investment, Korea Investment and Valmore Partners, according to a story today in Venture Wire (sub req)...
'When you have established a reputation on one site, there's no way to transfer your reputation to another site,' [CEO Ted] Cho said. 'Opinity provides a way to efficiently transfer your reputation by verifying identities at Opinity.'
To build a consolidated reputation, users can create a profile that can also be reviewed and rated by third parties. While users can dispute reviews on themselves, they can't edit them. But a user can register separate reputations on Opinity for each of their identities. For instance, users can establish a reputation related to their eBay account and at the same time can also set up a different reputation for their online dating identity.
(Via SiliconBeat.)
Ten Ideas for Corporate RSS Feeds:
Here are 10 ideas for corporate RSS feeds to (mainly) external audiences. Most of these reasons are good ones for deploying RSS internally as well as part of your employee communications, knowledge management, content management, and other systems. [Cross-posted at Blogging Planet]
1) Email is an increasingly problematic communications tool due to the growth of spam and the overwhelming amount of email most businesspeople receive. More effective spam filters can also create a greater risk of missing important emails. Today, RSS offers a way for users to organize incoming information – on their terms (they have to actually subscribe to receive anything). While ads are increasingly entering RSS feeds, they remain relatively free from spam at this point. Therefore, organizations should consider offering RSS feeds for many different information categories. Ideas follow!
2) RSS is perfect for the online press room. Added to your newsroom, RSS provides a great channel for delivering press releases to the journalists and analysts who are covering your company without clogging up their email inboxes. You can also use this channel to deliver information that might not be worthy of a press release, but which you deem could be interesting to press/analysts nonetheless. For example, you can post information about an upcoming show your company is exhibiting at and offer interviews.
Some companies using feeds successfully for their newsrooms include:
3) Keep your partners informed. Add an RSS feed to your extranet or partner area and keep it populated with press releases, announcements, product detail, meetings, etc. This works great for user groups as well. Organizations doing this include National Public Radio, which uses RSS feeds in its extranet for station owners/managers and Genesys Telecommunications Labs which offers RSS feeds for its user group.
4) Keep your customers informed. Journalists and analysts aren't the only people who will subscribe to your news release feed. Customers are very likely to as well. You should ask yourself what kind of information your customers want, besides news. One likely target is product support information. Product tuning, specs, troubleshooting and security updates are just a few of the topics that companies like IBM, Oracle, Microsoft and UserLand Software provide in their RSS feeds.
5) Provide specific informational categories so people can just receive what they are most interested in. Most companies deploying RSS today are using it in newsrooms and for product support. Some also offer feeds for overall website changes, for new articles or white papers. If you have multiple products or services, having a feed for each product might make sense.
6) Make your resource centers/online libraries dynamic! Use RSS to inform audiences of new case studies, white papers, and presentations. By providing a feed specific to your library, people don't have to visit the website to see what's new.
7) Put your events to work for you online. Create an RSS feed for each event you plan, as well as a general event feed that keeps your audiences up to date on where and when your organization will appear. Populate the feeds with executive schedules, photos, onsite reporting, and news. You can even produce an audio podcast with interviews from the show floor.
8) Capture and publish the buzz. By setting up an RSS feed that captures and publishes everything that is being said about your organization online, you can keep your audiences up to date on the buzz in an automated, easy-to-manage manner. This also provides a great way for your employees and executives to listen to what people are saying about your organization. Now, clearly, this type of automated feed will also capture negative commentary, and may not be for everyone (do a manual feed in that case). But in the growing spirit of communications transparency, it might be a great way for your organization to acknowledge issues and address them publicly. You can easily capture feeds from Feedster, PubSub, or Technorati about your organization and make them available to internal and external audiences. Or, hire a company like Intelliseek to do buzz tracking for you.
9) Set up a feed for special promotions. Provide limited-time only product discounts, early-bird specials to events, prizes and more to key customer sets.
10) You can just as easily create private (password-protected) RSS feeds as public ones. These can be a great way to keep employees, partners, customers informed of company happenings, events, promotions, office closings, and other information you don't necessarily want widely available. You can use a feed for a final press release distribution 24 hours before it hits the wire, for example. [Update: This would be for internal audiences for final approval and/or executive knowledge, not to external audiences. You don't want to run afoul of SEC laws. Thanks to Bill French for pointing out this needed to be clarified.] Many content management and knowledge management software vendors are planning on adding RSS to their product suites in the near future.
(Via CEO Bloggers' Club.)
Google Video (Beta) Upload Program launches:
Long Tail of video? Meet Google: Google just launched their Video Upload Program.
It's crazy to think that searches on Google Video will now include random people's own video and hopefully you won't be limited by what you can search for and they highlight the good stuff. I suspect the main Google Video site will look less like google.com and more like Google News someday, as they showcase video of all types.
The coolest part is that you can even charge for downloads/viewings of video. I can't wait to see what comes out of this, perhaps Google was trying to one-up companies in the bittorrent and IP-based TV space before they could even get going.
(Via PVRblog.)
A directory of cool stuff for the podcasting community.
Teen Blogging: "Jupiter Research stats: '28% of teens keep blogs, the Web logs that are fast becoming a prominent alternative source of news and commentary, while only 16% of adults do the same.' (reported here)...."
(Via Get Real.)
'Podcasts' Catching on with iPod Owners - Survey: "WASHINGTON (Reuters) - The home-brewed audio programs known as 'Podcasts' are catching on with people who own iPods or other digital-music players, according to a survey released on Sunday."
(Via Reuters: Technology.)
weblogs.com as portal to MSN Spaces: "Scott Isaacs, an architect at Microsoft, explains the new weblogs.com listing for MSN Spaces. For me, it's fascinating to watch the idea percolate through the Spaces community. This kind of 'anchor' page is an essential part of the bootstrap of a blogging community."
(Via Scripting News.)
JotSpot Is Free for Open Source Projects: "JotSpot is now free for open source projects! We've been wanting to offer this for a while now, but we've been waiting for guest user (i.e. anonymous) support, which just arrived with the latest update. Jot benefits greatly from open source software, and this is one way we can give back. And because a JotSpot wiki is provided as a hosted service, open source projects not only get the bits free, they get the server hosting and management free as well -- the whole enchilada. (Note that we're still in beta, but this offer is through GA and beyond... ) You can go here and get your site in a minute or two. Just be ..."
(Via JotBlog.)
Ground-breaking BBC article on commercialization of...:
"Ground-breaking BBC article on commercialization of podcasting, with extensive quotes from Adam Curry and myself, and not surprisingly, we disagree. Adam says podcasting will kill radio. Nahhh. It'll become radio and vice versa. Airwaves are just another method of distribution. Same with satellites. What will change is who's talking and who's listening. Now the conversation will flow in all directions, with broadcasters listening to people they used to think of as 'audience.' Blogs changed the architecture of written-word-journalism in the same way. The BBC did miss that I did the early podcasts that were the inspiration for Curry and others, and continue to podcast, so it's not as if I'm on the sidelines, I'm in there, putting my ideas out, and helping inspire others to do the same."
(Via Scripting News.)
Podcast Shuffle: "Manton Reece: ‘Podcast Shuffle is an RSS feed with randomly selected podcasts. Each item is a direct link to the most recent enclosure for a particular show.’"
(Via ranchero.com.)
Inside Ranchero with Brent and Sheila Simmons: "The DrunkenBlog interview on NetNewsWire, MarsEdit, Apple’s dark years, frisbees, user interface, and coffee."
(Via ranchero.com.)
using RSS 2.0 enclosures to deliver application updates.
Six Apart redesign (plus a few thoughts from your author): "
TypePad - What’s a weblog? - Merlin Mann’s 43 Folders
The nice kids over at Six Apart have—in conjunction with the wildly talented MULE Design Studio—just launched a beautiful redesign of their site that consolidates their Movable Type, TypePad, and LiveJournal brands nicely.
They asked happy TypePad users like Julie Moos, mathowie, and me to each respond to the question, ‘What’s a Weblog?.’ Here’s a bit from my screed.
"The trick, if there is one, is to zero in on the thing that really makes you want to share your stuff with the world, and then go with it. Photos of fire hydrants? Video clips of on-air profanity? Haiku about Regis Philbin? Awesome, awesome, awesome. Just, please don’t make macramé.
Remember that the ‘kit’ is just there to get you started—that the ease of posting does not in any way parallel the sometimes painful act of creating. Your visitors crave a fresh voice that surprises them and makes them feel grand about the wonderful things people are making from that same basic set of tools. Give us a little of yourself, and leave that macramé in the basement.
(Via 43 Folders.)
First Look: Ubergroups: "I was turned onto Ubergroups yesterday (having completely missed other commentary on the product). In a nutshell, it is a social tools space for team-based project work, supporting real-time (IM, Chat, file transfer) and slow-time (blogs, file repository, Chat history,..."
(Via Get Real.)
"LiveMessage Blogger Edition is a free service that enables bloggers to drive traffic to their blog sites by alerting interested users of new messages. The system uses real-time networks which find a subscriber on the network and deliver a headline via a desktop alert, a cell phone message or a PDA message. E-mail alerts are also available, but the company has no plans for snailmail alerts."
This is a beta of "News Maps" from NewsIsFree. An image, created by a Java applet, shows a "technology news map" of current events in the tech sphere. Here's how NewsIsFree describes the site:
NewsKnowledge and The Hive Group have joined forces to bring you News Maps, visual maps of the NewsIsFree headline database. News Maps allow you to quickly scan dozens of news articles and instantly understand what's being reported all over the world. Each square in the News Map is an article. You can obtain additional detail on each article by moving your mouse over it. You can read an article by clicking on it. The Hive Group's Honeycomb algorithm organizes news headlines by source. Size and Color information indicate article age and popularity. You can easily filter and rearrange your results to view articles that meet certain criteria, or that contain certain text.
This kind of mapping isn't an entirely new idea. And the potential, for now, seems greater than the achievement. But this is an intriguing approach to making the daily information flood a little less intimidating and a little more manageable. I'll be watching with interest.
(Cross-posted to We the Media.)
[Dan Gillmor's eJournal]
LifeBlogger allows you to post your Nokia Lifeblog favorites to your blog.
Bloglines is a free service that makes it easy to keep up with your favorite blogs and newsfeeds.
audblog enables audio posting to your current blog site with any phone at any time from anywhere. It's easy and fun. Just 3 clicks and you are publishing audio to the web.
Weblog about Movable Type.
w.Bloggar: weblog client for Windows. Version 3.0 supports all Movable Type features (Title, Category).