Notes from SF Data Mining Meetup: Recommendation Engines

Excellent talks on how each of the presenting companies approaches the design of its recommendation engine, based on the specifics of its market and users.

Recommendation Engines

Thursday, Apr 4, 2013, 6:30 PM

Pandora HQ
2101 Webster Street, Suite 1650 Oakland, CA

200 Data Scientists Went

6:30 – 7:00pm Social and Food
7:00 – 8:30pm Talks
8:30 – 9:00pm Social

We’re excited to have three sets of speakers:

1. Trulia: Todd Holloway will be giving a talk on Trulia Suggest.
2. Rich Relevance: John Jensen and Mike Sherman will be giving their perspectives on recommendation engines.
3. Pandora: Eric Bieschke will be giving his perspec…


Here are my notes on their respective technology stacks. Hadoop, Hive, Memcached, and Java are used by all three.

1. Trulia: Todd Holloway on Trulia Suggest.

  • Hadoop
  • Hive
  • R on each Hadoop Server
  • Memcached
  • Java

2. Rich Relevance: John Jensen and Mike Sherman

  • Hadoop
  • Hive
  • Pig
  • Crunch

Starting to deploy:

  • Kafka
  • Storm

3. Pandora: Eric Bieschke

  • Python, Hadoop, Hive: for offline processing
  • Memcached, Redis: for nearline and online
  • Java, PostgreSQL: for online

Memcached: used as a "key-value store in the sky", as long as you don’t care about losing data.

Redis: “Persistent Memcached”
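To make the "don't care about losing data" point concrete, here is a minimal sketch of the cache-aside pattern that usually goes with a Memcached-style cache. This is my own illustration, not code shown in any of the talks. `VolatileCache`, `get_recommendations`, the `recs:` key prefix, and the plain dict standing in for a durable store such as PostgreSQL are all hypothetical names. No real Memcached or Redis client is used.

```python
import time


class VolatileCache:
    """In-memory stand-in for a Memcached-style cache (hypothetical).

    Entries expire after a TTL and can be wiped at any time, so callers
    must always be able to rebuild a value from a durable source.
    """

    def __init__(self, ttl_seconds):
        self.ttl_seconds = ttl_seconds
        self._entries = {}  # key -> (expiry timestamp, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            # Expired entries behave exactly like missing ones.
            del self._entries[key]
            return None
        return value

    def set(self, key, value):
        self._entries[key] = (time.monotonic() + self.ttl_seconds, value)

    def flush(self):
        """Simulate a cache restart: everything is lost."""
        self._entries.clear()


def get_recommendations(user_id, cache, durable_store):
    """Cache-aside read: try the cache, fall back to the durable store."""
    key = f"recs:{user_id}"
    cached = cache.get(key)
    if cached is not None:
        return cached
    # Cache miss: read from the system of record, then repopulate the cache.
    recs = durable_store.get(user_id, [])
    cache.set(key, recs)
    return recs


if __name__ == "__main__":
    store = {"u1": ["song-a", "song-b"]}  # stand-in for a durable database
    cache = VolatileCache(ttl_seconds=60)
    print(get_recommendations("u1", cache, store))  # miss, then cached
    cache.flush()                                   # cache data is lost
    print(get_recommendations("u1", cache, store))  # still served
```

Losing the cache only costs an extra read against the durable store, which is why a volatile cache is acceptable in this role. A store with persistence removes the need for the rebuild step, which is presumably the sense of the "Persistent Memcached" description.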
