
January 02, 2010

My PCM System

I just finished the first release of my PCM (personal content management) system. The system uses CouchDB, SimpleDB, and a few servlets written in Java to listen for, store, aggregate, and publish all the content I produce on Twitter, delicious, NyTimes People, this blog, and other sites. The nice thing is that I can still use the regular interface of all those sites to enter content, while in the background my system gathers that content automatically. The content is then stored twice, in a local database and in the "Amazon cloud". From there it gets published to the ticker page of my new blog: Neuhäusler Weekly.

.. and of course I already have a ton of ideas how I could improve the system :-) ..
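
The store-twice-then-publish flow described above could look roughly like the sketch below. It is not the actual PCM code: the `Item`, `MemoryStore`, `collect`, and `ticker` names are made up for illustration, and the two in-memory stores only stand in for the local database and the "Amazon cloud".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    source: str      # e.g. "twitter" or "delicious"
    published: str   # ISO 8601 timestamp, so string order is time order
    text: str


class MemoryStore:
    """Hypothetical stand-in for a real store (local database or cloud service)."""

    def __init__(self):
        self.items = {}

    def put(self, item):
        # Key on source + timestamp so re-polling a feed does not duplicate entries.
        self.items[(item.source, item.published)] = item


def collect(feeds, stores):
    """Write every item from every feed to every store."""
    for items in feeds.values():
        for item in items:
            for store in stores:
                store.put(item)


def ticker(store, limit=10):
    """Return the newest items first, as a ticker page would list them."""
    newest_first = sorted(store.items.values(), key=lambda i: i.published, reverse=True)
    return newest_first[:limit]


feeds = {
    "twitter": [Item("twitter", "2010-01-02T10:00:00Z", "PCM first release is out")],
    "delicious": [Item("delicious", "2010-01-01T09:30:00Z", "Bookmarked: CouchDB docs")],
}
local, cloud = MemoryStore(), MemoryStore()
collect(feeds, [local, cloud])
collect(feeds, [local, cloud])  # a second poll leaves the stores unchanged
```

Keying each item on source and timestamp is one simple way to make repeated polling harmless; a real system would more likely use the id each site assigns.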

December 31, 2009

Using Cassandra with Scala and Akka

Using Cassandra with Scala and Akka: "With all this talk about NoSQL and new programming languages, I thought I’d try getting Cassandra to work with Scala. Always being interested in productivity, I wanted to know how easy and concise an integration would be."

..via Code Monkeyism..

October 15, 2009

Hadoop ..

Various Hadoop-related links:

Slides of Hadoop World in NYC, "Cascading is a feature rich API for defining and executing complex, scale-free, and fault tolerant data processing workflows on a Hadoop cluster", and a library to aid writing Hadoop jobs in Clojure.

September 23, 2009

How I Learned to Stop Worrying and Love Using a Lot of Disk Space to Scale

"The best way to implement joins with BigTable is: don't. You--pause for dramatic effect--duplicate data instead of normalize it. *shudder*"

.. it so reminds me of Lotus Notes .. but it all makes sense now ..
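
A tiny illustration of "duplicate instead of normalize", using plain dictionaries rather than BigTable itself. All names here (`authors`, `write_post`, `rename_author`, and so on) are invented for the example.

```python
# Normalized: a post refers to its author by id, so rendering needs a second lookup.
authors = {"a1": {"name": "Ada"}}
posts_normalized = [{"id": "p1", "author_id": "a1", "title": "Hello"}]


def render_normalized(post):
    author = authors[post["author_id"]]  # the "join"
    return f'{post["title"]} by {author["name"]}'


# Denormalized: the author's name is copied into every post at write time.
def write_post(post_id, author_id, title):
    return {
        "id": post_id,
        "author_id": author_id,
        "author_name": authors[author_id]["name"],  # duplicated data
        "title": title,
    }


def render_denormalized(post):
    return f'{post["title"]} by {post["author_name"]}'  # one read, no join


def rename_author(author_id, new_name, posts):
    """The price of duplication: a rename has to touch every copy."""
    authors[author_id]["name"] = new_name
    for post in posts:
        if post["author_id"] == author_id:
            post["author_name"] = new_name


posts = [write_post("p1", "a1", "Hello")]
```

Reads get cheaper because each post carries everything needed to render it; writes get more expensive because every copy has to be kept in sync.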

August 12, 2009

NoSQL debrief

.. SQL still needed? .. I believe so .. but there are new database systems/concepts showing up in the Web2.0 space .. "braindump: NOSQL debrief" and "Notes from the NoSQL Meetup" ..

May 17, 2009

Cloudera

"Cloudera provides a training program aimed at producers and users of large volumes of data. These sessions, exercises and tutorials will teach you what you need to know to work more deeply with Hadoop, as well as related business intelligence platforms such as Hive."

.. they offer good tutorials online for free .. including a "Hadoop Training Virtual Machine" .. thanks Cloudera! ..

April 02, 2009

Amazon Elastic MapReduce

"Amazon Elastic MapReduce is a web service that enables businesses, researchers, data analysts, and developers to easily and cost-effectively process vast amounts of data. It utilizes a hosted Hadoop framework running on the web-scale infrastructure of Amazon Elastic Compute Cloud (Amazon EC2) and Amazon Simple Storage Service (Amazon S3)."

.. cheaper than running it on EC2 ..

March 26, 2009

AWS Toolkit for Eclipse

"The AWS Toolkit for Eclipse is a plug-in for the Eclipse Java IDE that makes it easier for developers to develop, deploy, and debug Java applications using Amazon Web Services."

.. a short video shows the full power of that "cloud-computing-IDE" .. pretty impressive ..

March 07, 2009

Hadoop Virtual Image

Hadoop Virtual Image Documentation: "Setting up a Hadoop cluster can be an all day job. However, if you want to experiment with the platform right now, we have created a virtual machine image with a preconfigured single node instance of Hadoop."

.. perfect starting point .. thanks to Google Code University ..

February 07, 2009

PNUTS - Platform for Nimble Universal Table Storage

"PNUTS, a massively parallel and geographically distributed database system for Yahoo!’s web applications." .. an interesting paper ..

December 18, 2008

REST Anti-Patterns

This article focuses on anti-patterns – "typical examples of attempted RESTful HTTP usage that create problems and show that someone has attempted, but failed, to adopt REST ideas."

.. pretty enlightening! ..

November 29, 2008

Google Search Appliance Virtual Edition

"Google Search Appliance virtual edition is a developer platform designed for the enterprise development community to build and test applications that use the Google Search Appliance. With this free software, you can try out the search capabilities of the Google Search Appliance. You can also use this software to write and test programs that integrate with the search appliance."

.. still not sure if I like the Google Search Appliance .. but at least there is now a way to try it out for free ..

November 21, 2008

CouchDB

"Apache CouchDB is a distributed, fault-tolerant and schema-free document-oriented database accessible via a RESTful HTTP/JSON API. Among other features, it provides robust, incremental replication with bi-directional conflict detection and resolution, and is queryable and indexable using a table-oriented view engine with JavaScript acting as the default view definition language.

CouchDB is written in Erlang, but can be easily accessed from any environment that provides means to make HTTP requests. There are a multitude of third-party client libraries that make this even easier for a variety of programming languages and environments."

.. bye-bye RelationalDBs ..

.. by the way, an interesting approach: "ElasticDB - (Elasticdrive + CouchDB)" ..
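
Since CouchDB is reachable from anything that can make HTTP requests, storing a document comes down to a PUT with a JSON body. The sketch below only builds the requests and does not send them. The base URL assumes a default local install, and the `ticker` database and `tweet-1` document id are hypothetical.

```python
import json
import urllib.request

BASE_URL = "http://localhost:5984"  # assumption: a CouchDB server on its default local port


def put_document(db, doc_id, doc):
    """Build (but do not send) the PUT request that stores a JSON document."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{db}/{doc_id}",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def get_document(db, doc_id):
    """Build the GET request that reads the same document back."""
    return urllib.request.Request(url=f"{BASE_URL}/{db}/{doc_id}", method="GET")


request = put_document("ticker", "tweet-1", {"source": "twitter", "text": "hello"})
# To actually send it against a running server (the database must already exist):
#     with urllib.request.urlopen(request) as response:
#         print(json.load(response))
```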

September 14, 2008

BOSS API Guide

"Yahoo! Search BOSS (Build your Own Search Service) is an initiative in Yahoo! Search to open up Yahoo!'s search infrastructure and enable third parties to build revolutionary search products leveraging their own data, content, technology, social graph, or other assets. This release includes Web, News, and Image Search as well as Spelling Suggestions."

.. definitely the right way to go .. especially after Google shut down their Search API ..

August 30, 2008

Zembly

"At zembly, you easily create and host social applications of all shapes and sizes, targeting the most popular social platforms on the web. And, you do it along with other people."

December 17, 2007

Amazon SimpleDB

"Amazon SimpleDB is a web service for running queries on structured data in real time. This service works in close conjunction with Amazon Simple Storage Service (Amazon S3) and Amazon Elastic Compute Cloud (Amazon EC2), collectively providing the ability to store, process and query data sets in the cloud."

"The data model used by Amazon SimpleDB makes it easy to input, manage and query your structured data. Developers organize their data-set into domains and can run queries across all of the data stored in a particular domain. Domains are collections of items that are described by attribute-value pairs."

.. SQL is getting old and boring .. it's time for a new database .. and an unconfirmed rumor says that the service is written in Erlang .. interesting ..
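
To make the data model from the quote concrete, here is a small in-memory sketch of a domain holding items described by attribute-value pairs. It is not the SimpleDB API: the `Domain` class and its methods are invented, and treating values as strings with several values allowed per attribute is an assumption of this sketch.

```python
from collections import defaultdict


class Domain:
    """In-memory illustration of a domain: item name -> attribute -> set of values."""

    def __init__(self, name):
        self.name = name
        self.items = defaultdict(lambda: defaultdict(set))

    def put(self, item_name, attribute, value):
        # Assumption: values are kept as strings and an attribute may hold several.
        self.items[item_name][attribute].add(str(value))

    def query(self, attribute, value):
        """Return the names of all items in this domain with attribute == value."""
        return sorted(
            name
            for name, attributes in self.items.items()
            if str(value) in attributes.get(attribute, set())
        )


content = Domain("content")
content.put("item-1", "source", "twitter")
content.put("item-1", "tag", "nosql")
content.put("item-1", "tag", "couchdb")
content.put("item-2", "source", "delicious")
content.put("item-2", "tag", "nosql")
```

Queries stay inside one domain, which matches the quote: there is no schema and nothing to join across.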

November 01, 2007

jSLP and Concierge

"jSLP is a pure Java implementation of SLP, the Service Location Protocol, as specified in RFC 2608. The API is derived from RFC 2614 with some modifications. The implementation runs on every Java2 VM, for instance, also on a J2ME CDC Profile. The footprint of less than 80 kBytes for the full version with SA, UA, and Daemon makes it very feasible for small and embedded devices."

"Concierge is an optimized OSGi R3 framework implementation with a file footprint of about 80 kBytes. This makes it ideal for mobile or embedded devices. Typically, these devices have VMs that are more focused on compactness and less optimized. For instance, purely interpreting VMs often kill the performance of existing OSGi framework implementations. The design of Concierge has been developed with respect to such platforms. Concierge uses resources in a very careful way and is able to provide significantly better performance in resource-constrained environments."

.. an interesting lightweight platform for small networked appliances .. projects from the "Information and Communication Systems Research Group" at the ETH in Switzerland ..

Chumby

Wikipedia: "Chumby is a consumer product created by Chumby Industries slated to go on sale in November of 2007. It is designed as an Open System, with schematics, PCB layouts and packaging/outerware designs available.

The primary intended use for a Chumby device is to play a set of user-customizable widgets, small Adobe Flash animations that deliver real-time information. The animations also have the ability to control and interact with the low-level hardware, thereby enabling functionality such as smart alarm clocks that bring the hardware out of sleep, and physical user interface features such as gesture recognition through squeezing the soft housing."

.. just got one .. probably the coolest open-source hardware / appliance / platform / gadget out there :-) ..

September 25, 2007

WSO2

"WSO2 is leading the way to build new open source platforms for Web services and SOA. We deliver integrated middleware stacks based on components developed at Apache, offering industry leading performance and convenience for customers."

.. a wide range of products for the WebService space .. including a Mashup-Server ..

July 26, 2007

Running Hadoop MapReduce on Amazon EC2 and Amazon S3

"AWS and Hadoop developer Tom White illustrates how to use Hadoop and Amazon Web Services together using a large collection of web access logs."

.. a powerful combination .. and an excellent article .. by the way, an AMI is available including instructions for the installation on EC2
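
The idea of running MapReduce over web access logs can be tried out without a cluster. The sketch below is plain Python, not Hadoop code and not the code from the article: it counts requests per path with separate map, shuffle, and reduce steps. The function names and the sample log lines are made up.

```python
from collections import defaultdict


def map_line(line):
    """Map phase: emit (path, 1) for one access-log line, or nothing if it is malformed."""
    parts = line.split('"')
    if len(parts) < 2:
        return []
    request = parts[1].split()  # e.g. ['GET', '/index.html', 'HTTP/1.1']
    if len(request) < 2:
        return []
    return [(request[1], 1)]


def shuffle(pairs):
    """Group all emitted values by key, as the framework does between the two phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key, values):
    """Reduce phase: sum the counts for one path."""
    return key, sum(values)


def run(lines):
    pairs = [pair for line in lines for pair in map_line(line)]
    return dict(reduce_counts(key, values) for key, values in shuffle(pairs).items())


LOG = [
    '10.0.0.1 - - [26/Jul/2007:10:00:00 +0000] "GET /index.html HTTP/1.1" 200 512',
    '10.0.0.2 - - [26/Jul/2007:10:00:01 +0000] "GET /about.html HTTP/1.1" 200 128',
    '10.0.0.1 - - [26/Jul/2007:10:00:02 +0000] "GET /index.html HTTP/1.1" 200 512',
    'garbage line',
]
hits = run(LOG)
```

On a real cluster the map and reduce functions run on many machines in parallel and the framework handles the shuffle; the logic per line stays this small.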

May 17, 2007

Best Tech Videos

Best Tech Videos .. no comment .. just enjoy ..

May 01, 2007

Thrift

"Thrift is a software framework for scalable cross-language services development. It combines a powerful software stack with a code generation engine to build services that work efficiently and seamlessly between C++, Java, Python, PHP, and Ruby. Thrift was developed at Facebook, and we are now releasing it as open source."

.. that definitely sounds interesting ..

Digg API

"The Digg Application Programming Interface (API) has been created to let users and partners interact programmatically with Digg. The API returns Digg data in a form that can be easily integrated into an application or a web site."

.. wouldn't be surprised if Google or Yahoo used the API to improve their news sites ;-) ..

April 02, 2007

12 Ways to Limit an API

12 Ways to Limit an API:

"The vast majority of the over 400 open APIs listed here have imposed some limitations on how much they can be used, certainly in the free use model. There are good reasons for this ranging from preventing abuse, controlling costs, or other business-driven reasons. Just over a year ago, in 7 ways to limit API use we looked at some of these. With twice as many APIs now listed it’s a good time to check back and see what other ways APIs get throttled. As a refresher, here’s the original list:

  • Time based limits: 1 call per second, Last.fm
  • Call Volume by Address: 5,000 queries per IP per day, Yahoo! Image Search
  • Call volume per-application: 10,000 queries per application per day, MSN Search
  • Return results volume: 10 results per query, the now deprecated Google Search, or 100 items returned per call, Tailrank, or 100 blogs per map, FeedMap
  • Data Transmission Volume: 120 packets of 1.6KB per minute, MSN Messenger
  • Formula: Monthly quotas based on various factors, Google AdWords
  • Kindness of strangers: ‘Please be gentle with Simpy’s server’, Simpy

There are over 40 variations of the above list, mostly differing in exact size of the limit. What happens if you hit one of these limits? It depends. Most return an error code and/or message, some unfortunately leave it undefined.

And here are a few more to keep in mind, some are more exact than others…

  • Per second and per month limits: Up to 1 call per second for up to 60,000 requests per month, Amazon Historical Pricing
  • Login limits: 250,000 logins per day or 2 million per month, AOL Instant Messenger, AIM
  • Varies within a range: determined by overall system load, GeoCoder.ca
  • As needed: ‘The servers will block excessive requests’, ISBNdb
  • Heads-up: not so much a limit as a request, ‘Let me know if you’re going to hit it hard’, Where’s Tim API"

(Via ProgrammableWeb.)

.. perfect overview, thanks John Musser / ProgrammableWeb! ..
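
Two of the limits from the list, the time-based limit and the call volume per address or application, are easy to sketch. This is only an illustration of the idea, not how any of the listed services implement it: the class names are invented and the injectable `clock` parameter exists just to make the limiter testable.

```python
import time


class MinIntervalLimiter:
    """Time-based limit: allow at most one call per `interval` seconds."""

    def __init__(self, interval=1.0, clock=time.monotonic):
        self.interval = interval
        self.clock = clock
        self.last_call = None

    def allow(self):
        now = self.clock()
        if self.last_call is not None and now - self.last_call < self.interval:
            return False  # the caller should answer with an error code and message
        self.last_call = now
        return True


class DailyQuota:
    """Call volume limit: at most `limit` calls per key (IP address or application id) per day."""

    def __init__(self, limit):
        self.limit = limit
        self.counts = {}

    def allow(self, key, day):
        used = self.counts.get((key, day), 0)
        if used >= self.limit:
            return False
        self.counts[(key, day)] = used + 1
        return True


quota = DailyQuota(limit=5000)
allowed = quota.allow("203.0.113.7", "2007-04-02")  # first call of the day for this address
```

Rejected calls are not counted against the quota here; whether they should be is a policy decision each API makes differently.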

March 30, 2007

Introducing the Yahoo! Mail Web Service

"The Yahoo! Mail Web Service is a big release for Yahoo! and the Internet, and it's only the beginning of what you'll be seeing from Yahoo!. Jump into our code samples for Java, .NET, PHP, Perl and Python, and build your dream mail app today, then be sure to give us feedback on your experience so we can continue to make the API even better. Be sure to leverage the online Yahoo! Mail Web Service support group where you can get help from the Yahoo! Mail Web Service team and your fellow developers. We can't wait to see what applications you will build when you add your imagination to the platform. Maybe you want to build an application that backs up Yahoo! mail targeted at a large number of Yahoo! users, or maybe you just want to add a niche feature that makes Yahoo! Mail better for your mom. For inspiration, we've gathered a few applications:

You can watch a short screencast here (or download the full video) with Ryan Kennedy as he explains more about the Yahoo! Mail Web Service in detail.

And Hack Day hacker Leah Culver demonstrates her Flickr Postcard hack which uses Yahoo! Mail Web Service too (full download here):

Chad Dickerson, Head of Yahoo! Developer Network"

(Via Yahoo! Web Services blog.)

.. interesting .. new-world access to old-world-mail-technology .. will be interesting to see what people will build with it ..

March 22, 2007

A demo of Pipes

"Pasha Sadri and Ed Ho gave YDN a complete walk-through of Pipes. Ed first showed us how to create a Pipe and shared some examples of clever ways to use it. Pasha then explained some of the thinking behind Pipes and how it was conceived.

The demo is also available on Yahoo! Video here."

(Via Yahoo! Web Services blog.)

March 15, 2007

Mounting Amazon S3 as a File System in Amazon EC2

"This tutorial discusses how to mount Amazon Simple Storage Service (Amazon S3) as a file system in an Amazon EC2 instance. If you use the mounted file system to store your data, it is automatically saved to Amazon S3. This gives the Amazon EC2 instance the permanent storage it needs, and it does so transparently. Applications that use the mounted file system work in the usual way and see the mounted file system as though it were the local hard drive. Behind the scenes, however, all the data is stored on Amazon S3."

.. as always with EC2, good idea, but quite a few steps to get it up and running .. but the communication between EC2 and S3 is free :-) ..

February 25, 2007

Free Hosting of YUI Files from Yahoo

Free Hosting of YUI Files from Yahoo: "Coinciding with this week’s release of YUI version 2.2.0, the one year anniversary of the YUI open-source release, and as announced at the YUI Party just moments ago, we’re opening up free YUI hosting from the Yahoo! network to all YUI implementers. If you’re using YUI for your own project, we’ll serve the files for you — gzipped, with good cache-control, using our state-of-the-art network, for free. You can count on these files being continuously available because they’re the same files, served by the same source, that we use for most YUI implementations at Yahoo!."

(Via Yahoo! User Interface Blog.)

.. another interesting Yahoo announcement .. don't know yet how that all fits together .. what is Yahoo's strategy? ..

February 09, 2007

Pipes by Yahoo

"Pipes is a free online service that lets you remix popular feed types and create data mashups using a visual editor. You can use Pipes to run your own web projects, or publish and share your own web services without ever having to write a line of code."

.. that sounds pretty interesting .. reminds me a lot of another similar service, openkapow ..

January 16, 2007

WSO2 WSAS

"WSO2 WSAS (Previously known as Tungsten) is an integrated Web services Platform which offers a complete middleware solution. It is a lightweight, high performing platform for Service Oriented Architectures, enabling business logic and applications. Bringing together a number of Apache Web services projects, WSO2 WSAS provides a secure, transactional and reliable runtime for deploying and managing Web services."

.. complicated abbreviation .. but an interesting web services application server and a good SOAP developer portal ..