Notes from SF Data Mining Meetup: Recommendation Engines

Excellent talks on how each of the presenting companies approaches the design of its recommendation engine, based on the specifics of its market and users.

Recommendation Engines

Thursday, Apr 4, 2013, 6:30 PM

Pandora HQ
2101 Webster Street, Suite 1650 Oakland, CA

200 Data Scientists Went

6:30 – 7:00pm Social and Food
7:00 – 8:30pm Talks
8:30 – 9:00pm Social

We’re excited to have three sets of speakers:

1. Trulia: Todd Holloway will be giving a talk on Trulia Suggest.
2. Rich Relevance: John Jensen and Mike Sherman will be giving their perspectives on recommendation engines.
3. Pandora: Eric Bieschke will be giving his perspec…


Here are my notes on their respective technology stacks. Hadoop, Hive, Memcached, and Java are used by all three.

1. Trulia: Todd Holloway on Trulia Suggest.

  • Hadoop
  • Hive
  • R on each Hadoop Server
  • Memcached
  • Java

2. Rich Relevance: John Jensen and Mike Sherman

  • Hadoop
  • Hive
  • Pig
  • Crunch

Starting to deploy:

  • Kafka
  • Storm

3. Pandora: Eric Bieschke

  • Python, Hadoop, Hive: for offline processing
  • Memcached, Redis: for nearline and online
  • Java & PostgreSQL for online

Memcached: used as a “key-value store in the sky”, as long as you don’t care about losing data

Redis: “Persistent Memcached”
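
To make the "as long as you don’t care about losing data" caveat concrete, here is a minimal Python sketch of the cache-aside pattern. It is an illustration only, not something shown in the talks and not any of the speakers' code. It runs without a Memcached server: `VolatileCache` is a hypothetical in-process stand-in that, like a bounded cache, may silently drop entries, and `STORE`, `load_from_store`, and `get_recommendations` are likewise made-up names.

```python
from collections import OrderedDict


class VolatileCache:
    """Hypothetical in-process stand-in for a bounded cache that may drop entries."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None  # miss: never stored, or already evicted
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def set(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            # Silently evict the least recently used entry.
            self._items.popitem(last=False)


def get_recommendations(user_id, cache, load_from_store):
    """Cache-aside read: try the cache, fall back to the durable store on a miss."""
    recs = cache.get(user_id)
    if recs is None:
        recs = load_from_store(user_id)
        cache.set(user_id, recs)
    return recs


# Hypothetical durable store, standing in for a database or an offline batch output.
STORE = {"alice": ["song-1", "song-2"], "bob": ["song-3"], "carol": ["song-4"]}
store_reads = []  # records every trip to the durable store


def load_from_store(user_id):
    store_reads.append(user_id)
    return STORE[user_id]


cache = VolatileCache(capacity=2)
for user in ["alice", "bob", "alice", "carol", "bob"]:
    get_recommendations(user, cache, load_from_store)

print(store_reads)
```

The point of the sketch is that every cached value can be rebuilt from the durable store, so an eviction costs only an extra read. Data that cannot be rebuilt this way is the case where a persistent store is the safer choice.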
