On a related note, I'm curious how many here find that user ratings of new movies coincide with their own. Personally, I've been sticking to critic reviews, because the authors usually have a much larger base for comparison and more developed tastes than a casual moviegoer like me, and because user-submitted ratings like IMDb's can be highly inflated by waves of unusually high scores pouring in during the first few weeks of a release ("Best romantic comedy EVR, 10/10!"). Once the dust settles, the ratings tend to sober up and even out at a value I tend to agree with more.
When we started working on Filmgrain, what became clear was that popularity doesn't so much tell you whether a movie is good or bad; rather, it works remarkably well at highlighting the movies people care about right now. I know many of us check critic reviews to see if a movie is any good before going, but I don't think that's the norm. Many people just get excited about a movie and then go see it.
The other cool thing Filmgrain captures, distinct from quality, is insight into what people liked about a movie afterwards. For example, people were divided on whether The Great Gatsby was good, but they couldn't stop tweeting about the soundtrack. I'd never have known how much people cared if it weren't for this feed, and it's this type of meta information we want to bring forward where we can.
I think what you're describing is really a showcase of a pub/sub system, and you're probably using Redis for a feature that's outside its core concerns.
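For anyone unfamiliar with the pattern itself: pub/sub just fans messages out from publishers to whoever has subscribed to a named channel, with the two sides decoupled from each other. A toy in-process sketch (my own illustration, not how Redis or RabbitMQ implement it; all names are made up):

```python
from collections import defaultdict

class PubSub:
    """Toy in-process broker: callbacks registered per channel."""
    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, channel, callback):
        self.subscribers[channel].append(callback)

    def publish(self, channel, message):
        # Fan the message out to every subscriber of this channel;
        # the publisher never knows who (if anyone) is listening.
        for callback in self.subscribers[channel]:
            callback(message)

received = []
broker = PubSub()
broker.subscribe("new_tweets", received.append)
broker.publish("new_tweets", "gatsby soundtrack!")
broker.publish("other_channel", "ignored")  # no subscribers, silently dropped
```

The decoupling is the point: a real broker adds the hard parts on top of this (delivery over the network, durability, HA), which is exactly where the RabbitMQ-vs-Redis tradeoffs come in.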
Things like RabbitMQ do this very well. You can use HA mode with RabbitMQ and not have a SPOF.
Alternatively, you can use Kafka (for really big traffic) or Kestrel for the same thing. They're less sophisticated than RabbitMQ, but I see that as an advantage.
A Redis-only system is very interesting, but are you doing any kind of replication or clustering, and would you recommend it as a main datastore for mission-critical user data (accounts, profiles, purchases...)?
You can do it with shared storage and/or replication without worrying about data loss on a failure.
The key to scaling and maintaining HA with Redis is clustering, either built into the app or through something like nutcracker (https://github.com/twitter/twemproxy), and making sure you balance properly.
The performance of redis makes it very worthwhile to deal with the persistence/HA issues.
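The client-side sharding that a proxy like twemproxy does boils down to hashing each key to one of N backends, so every client agrees on which server owns which key. A minimal sketch of the idea (the server list and hash choice here are illustrative; twemproxy supports several hash and distribution modes):

```python
import zlib

# Hypothetical pool of Redis backends.
servers = ["redis-a:6379", "redis-b:6379", "redis-c:6379"]

def shard_for(key: str) -> str:
    # A stable hash (crc32 here) maps the same key to the same
    # backend on every call, from every client.
    return servers[zlib.crc32(key.encode()) % len(servers)]
```

One caveat with plain modulo hashing: adding or removing a server remaps most keys, which is why production proxies typically use consistent hashing (e.g. ketama) so that membership changes only move a small fraction of keys.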
Right now we don't have user accounts or anything mission-critical to store. (Everything we use Redis for right now is essentially app state, and we use it to manage cross-machine communication.)
When we have user accounts and other things like that, we'll have to think about the most reliable way to store them, which may be Redis or may mean bringing Postgres into the mix. I think there are a lot of benefits to keeping the variety of technologies down, so if we were to bring in another, it would need to offer something compelling that our current mix doesn't.
The major theme we were trying to highlight in this post is just how far Redis can take you if you're trying to build a component-based backend and need a way to handle the communication as well as the state of the system.
The feeder is written in Python with gevent (we'll talk more about that in part 2). Everything runs on Ubuntu at Linode. For Twitter, the library we used was Tweepy. Tweepy is cool but required some tweaking. We used Twitter's streaming/filter API.
We wrote our own tools to monitor our components. I suffer a little from "not written by me" syndrome, and our system is a little different. Keeper is the primary uptime monitor. We use Mandrill to send out emails and Twilio to send us SMS alerts.
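A Keeper-style uptime monitor can be sketched as components recording heartbeat timestamps and a checker flagging anything that hasn't reported recently. This is my own minimal illustration of the pattern, not Keeper's actual design; the names and the 30-second threshold are made up:

```python
import time

heartbeats = {}  # component name -> timestamp of its last check-in

def beat(component: str, now: float = None) -> None:
    # Each component calls this periodically while it's healthy.
    heartbeats[component] = time.time() if now is None else now

def stale(threshold: float = 30.0, now: float = None) -> list:
    # Components whose last heartbeat is older than the threshold
    # are the ones you'd page about via email or SMS.
    now = time.time() if now is None else now
    return [name for name, last in heartbeats.items()
            if now - last > threshold]

beat("feeder", now=0.0)
beat("web", now=95.0)
# At t=100 with a 30s threshold, only the feeder is overdue.
```

In practice the heartbeats would live somewhere shared (Redis keys with a TTL are a common trick), and the checker would hand stale names to whatever sends the alerts.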