Excellent talks on how each of the presenting companies approaches the design of its recommendation engine, based on the specifics of its market and users.
Here are my notes on their respective technology stacks. Hadoop, Hive, Memcached, and Java are used by all three.
1. Trulia: Todd Holloway on Trulia Suggest.
- R on each Hadoop Server
2. RichRelevance: John Jensen and Mike Sherman
Starting to deploy
3. Pandora: Eric Bieschke
- Python, Hadoop, and Hive for offline processing
- Memcached and Redis for nearline and online
- Java & PostgreSQL for online
Memcached: used as a key-value store in the sky, as long as you don’t care about losing data
Redis: “Persistent Memcached”
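
To make the "don't care about losing data" point concrete, here is a minimal sketch of a cache-aside read in Python. It is my own illustration, not code shown in any of the talks. `VolatileCache` is a hypothetical in-memory stand-in for a Memcached-style cache (bounded, silently evicting old entries), a plain dict stands in for a durable store, and `get_recommendations` and the `recs:` key prefix are made-up names.

```python
from collections import OrderedDict


class VolatileCache:
    """Hypothetical stand-in for a Memcached-style cache: bounded and lossy."""

    def __init__(self, max_items):
        self.max_items = max_items
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def set(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        while len(self._data) > self.max_items:
            # Silently drop the least recently used entry, as a full cache would.
            self._data.popitem(last=False)


def get_recommendations(user_id, cache, durable_store):
    """Cache-aside read: try the cache, fall back to the durable store on a miss."""
    key = f"recs:{user_id}"
    recs = cache.get(key)
    if recs is None:
        # Cache miss (never stored, or evicted): recompute or reload, then re-cache.
        recs = durable_store.get(user_id, [])
        cache.set(key, recs)
    return recs


if __name__ == "__main__":
    durable_store = {"u1": ["a", "b"], "u2": ["c"], "u3": ["d"]}  # made-up data
    cache = VolatileCache(max_items=2)
    for uid in ("u1", "u2", "u3"):
        get_recommendations(uid, cache, durable_store)
    # The entry for u1 has been evicted, yet the lookup still succeeds.
    print(get_recommendations("u1", cache, durable_store))
```

The design point is that every cached value can be rebuilt from somewhere else, so an eviction or a cache restart costs latency rather than correctness. A store that persists to disk, which is how Redis was characterized above, is the option when the cached value itself is the only copy.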