How are they inefficient? I'm designing an app right now and am looking at diffe...

jaytaylor · on Oct 30, 2012

Tastypie is inefficient because, if the API returns 20 records and each record has 2 nested referenced records, TP ends up doing at least 60 queries: 1 query to hydrate each row plus 1 query per nested record. (Anecdotally, for me it's been more like 100 queries to get 10 records, due to redundant queries and other miscellaneous things happening.) TP does no bulk query optimization, and the result is /very/ high query volume.
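
To make that arithmetic concrete, here is a minimal sketch in plain Python. It is not Tastypie code: `CountingStore`, `make_store`, `fetch_per_row`, and `fetch_bulk` are hypothetical names, the "category" and "owner" relations are made up for illustration, and the toy store simply counts every lookup as one query.

```python
class CountingStore:
    """Toy stand-in for a database; every lookup call counts as one query."""

    def __init__(self, tables):
        self.tables = tables
        self.queries = 0

    def get(self, table, pk):
        self.queries += 1
        return self.tables[table][pk]

    def get_many(self, table, pks):
        # One "query" regardless of how many keys are requested.
        self.queries += 1
        return {pk: self.tables[table][pk] for pk in set(pks)}


def make_store(n=20):
    """Build n records, each with 2 nested references (hypothetical schema)."""
    tables = {
        "category": {i: {"id": i, "name": f"category-{i}"} for i in range(1, 5)},
        "owner": {i: {"id": i, "name": f"owner-{i}"} for i in range(1, 4)},
        "challenge": {
            i: {"id": i, "category_id": 1 + i % 4, "owner_id": 1 + i % 3}
            for i in range(1, n + 1)
        },
    }
    return CountingStore(tables)


def fetch_per_row(store, ids):
    """One query to hydrate each row, plus one per nested record."""
    out = []
    for pk in ids:
        row = store.get("challenge", pk)
        category = store.get("category", row["category_id"])
        owner = store.get("owner", row["owner_id"])
        out.append({"id": pk, "category": category["name"], "owner": owner["name"]})
    return out


def fetch_bulk(store, ids):
    """One query for the rows, then one per relation."""
    rows = store.get_many("challenge", ids)
    categories = store.get_many("category", [r["category_id"] for r in rows.values()])
    owners = store.get_many("owner", [r["owner_id"] for r in rows.values()])
    return [
        {
            "id": pk,
            "category": categories[rows[pk]["category_id"]]["name"],
            "owner": owners[rows[pk]["owner_id"]]["name"],
        }
        for pk in ids
    ]


if __name__ == "__main__":
    ids = list(range(1, 21))
    slow, fast = make_store(), make_store()
    assert fetch_per_row(slow, ids) == fetch_bulk(fast, ids)
    print(slow.queries)  # 60
    print(fast.queries)  # 3
```

Both functions return the same payload; only the number of round trips differs, and that difference is the whole point.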

I have cleaned up a lot of our Tastypie backend and reduced query volume on endpoints from 100+ queries down to just 2 or 3, but it was a tremendous amount of work, and the result is ugly from a complexity and architectural perspective.

I haven't found any Python-based API-in-a-box which I like.

OTOH, I've had much success with Scala + Play framework + Squeryl. This combination gives fine-grained, explicit control over what is happening, and the approach is generally very easy to scale. However, this setup has its own shortcomings; it's always a tradeoff, right?

With Scala + Play, you don't get an API-in-a-box, so you must build it out yourself. It isn't particularly hard, but it definitely doesn't come "for free".

TP does come "for free", but it's only free at first, because you will have to do complex and ugly hacks to make it scale.

shabble · on Oct 30, 2012

I'm currently in the middle of a very similar situation. It's fine for dev, but throwing hundreds of queries at a single request just isn't going to scale.

We've ended up with stuff like:

     class Meta(SomeResource.Meta):
         cls = Challenge
     
         # Relations followed by select_related (JOINed into the main query).
         select_related_fields = [
             'category', 'owner', 'company', 'bonus', 'reward',
         ]
         # Relations loaded by prefetch_related (one extra query per lookup).
         prefetch_related_fields = [
             'specialities', 'countries',
             'companies', 'products',
             'groups',
         ]
     
         queryset = (cls.objects
                     .select_related(*select_related_fields)
                     .prefetch_related(*prefetch_related_fields)
                     .all())

which takes some of the pain away, but going through and profiling exactly which prefetches & joins help is a major hassle.
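
One way to take some of the guesswork out of that profiling is to count the statements each variant actually issues. The sketch below uses only the standard library's `sqlite3` module rather than Django, so treat it as an analogy and not as the setup described above: the schema and the names `count_selects`, `make_db`, `names_lazy`, and `names_joined` are all hypothetical. Inside a Django project, `django.test.utils.CaptureQueriesContext` can play a similar role.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def count_selects(conn):
    """Collect every SELECT statement SQLite executes inside the block."""
    seen = []

    def on_statement(sql):
        if sql.lstrip().upper().startswith("SELECT"):
            seen.append(sql)

    conn.set_trace_callback(on_statement)
    try:
        yield seen
    finally:
        conn.set_trace_callback(None)  # stop tracing


def make_db(n=10):
    """In-memory database with n challenges, each pointing at a category."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE category (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE challenge (
            id INTEGER PRIMARY KEY,
            category_id INTEGER REFERENCES category(id)
        );
        """
    )
    conn.executemany(
        "INSERT INTO category VALUES (?, ?)",
        [(i, f"category-{i}") for i in range(1, 4)],
    )
    conn.executemany(
        "INSERT INTO challenge VALUES (?, ?)",
        [(i, 1 + i % 3) for i in range(1, n + 1)],
    )
    conn.commit()
    return conn


def names_lazy(conn):
    """One query for the rows, then one more per row for its category."""
    rows = conn.execute(
        "SELECT id, category_id FROM challenge ORDER BY id"
    ).fetchall()
    return [
        (cid, conn.execute(
            "SELECT name FROM category WHERE id = ?", (cat_id,)
        ).fetchone()[0])
        for cid, cat_id in rows
    ]


def names_joined(conn):
    """Single query with a JOIN, roughly what select_related asks for."""
    return conn.execute(
        "SELECT ch.id, c.name FROM challenge ch "
        "JOIN category c ON c.id = ch.category_id ORDER BY ch.id"
    ).fetchall()


if __name__ == "__main__":
    conn = make_db()
    with count_selects(conn) as lazy:
        names_lazy(conn)
    with count_selects(conn) as joined:
        names_joined(conn)
    print(len(lazy), "SELECTs without the join;", len(joined), "with it")
```

Wrapping each candidate queryset in a counter like this shows directly whether a given join or prefetch removes queries or just adds weight.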

Resources with relations with full=True are also brutal. Manually stuffing only the required extra data speeds things up greatly, but there's too much boilerplate for my liking. Having some level in-between 'full' and 'resource-uri only' would be nice, but I haven't seen any nice way to do it.
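
For what it's worth, the in-between level could look something like the following. This is only a sketch in plain Python over dicts, not a Tastypie API: `dehydrate_related`, its `mode` and `fields` parameters, and the `/api/v1/owner/` URI prefix are all made up for illustration.

```python
def dehydrate_related(related, base_uri, mode="uri", fields=()):
    """Serialize a related record at one of three levels of detail.

    mode="uri":     only the resource URI (cheapest)
    mode="partial": the URI plus an explicit whitelist of fields
    mode="full":    every field on the related record
    """
    uri = f"{base_uri}{related['id']}/"
    if mode == "uri":
        return uri
    if mode == "partial":
        return {"resource_uri": uri, **{name: related[name] for name in fields}}
    if mode == "full":
        return {**related, "resource_uri": uri}
    raise ValueError(f"unknown mode: {mode!r}")


if __name__ == "__main__":
    owner = {"id": 7, "name": "Ada", "email": "ada@example.com"}
    print(dehydrate_related(owner, "/api/v1/owner/"))
    # /api/v1/owner/7/
    print(dehydrate_related(owner, "/api/v1/owner/", "partial", ("name",)))
    # {'resource_uri': '/api/v1/owner/7/', 'name': 'Ada'}
```

The whitelist keeps the payload small without writing a separate hand-stuffed dehydrate method for every relation.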

izak30 · on Oct 30, 2012

These are the problems you have with any Django app as you learn to scale it; they're not inherent to Tastypie. They're why prefetch_related and select_related exist. You're Doing Things Right.

jaytaylor · on Oct 31, 2012

This isn't Right. IMO this is a sloppy mess of a way to develop a scalable API, and it only lets you barely eke by. There is no joy.

izak30 · on Oct 31, 2012

I wish that I had a better response for this. I don't. Scaling is hard. It's work, it's not usually magic, and you have to understand a lot about your system to get it right. Peppering in a few cache statements and figuring out the proper querysets isn't that bad. I know that I can't be seen as an impartial observer in this situation since I commit to Tastypie, but I got involved with the project because it did this stuff right. If you've got any suggestions for improvements, I'm all ears, and my email is in my profile.

dev360 · on Nov 1, 2012

+1 - couldn't agree more.

dev360 · on Nov 1, 2012

Uhm... sounds like you're doing a bunch of N+1 querying. This is generally fixed by overriding get_query_set in your manager classes to use select_related, and then pointing .objects at the overridden manager class inside your models.