Isn't the GROUP BY evaluated before the SELECT list though, e.g. in "SELECT MAX(t) FROM foo GROUP BY t"? I think to do it the way they suggest you'd probably need to materialize the map output first, with a temp table or a CTE like
WITH mapped AS (SELECT map() FROM crawl_table)
SELECT * FROM mapped GROUP BY reduce()
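For what it's worth, here is a rough Python sketch of that ordering: map every row first, group the mapped output by key, then reduce each group. The names map_fn, reduce_fn and run_mapreduce, the word-count job, and the sample rows are all made up for illustration; none of this comes from the paper or any real MapReduce API. Note that in this sketch the grouping is on the key the mapper emits, and reduce plays the role of the aggregate, which is only loosely what the pseudo-SQL says.

```python
from collections import defaultdict


def map_fn(record):
    """Hypothetical mapper: emit a (word, 1) pair for each word in a row."""
    for word in record.split():
        yield word.lower(), 1


def reduce_fn(key, values):
    """Hypothetical reducer: sum all the counts collected for one key."""
    return key, sum(values)


def run_mapreduce(records, mapper, reducer):
    # 1. Map phase: roughly the "WITH mapped AS (SELECT map() ...)" step.
    mapped = [pair for record in records for pair in mapper(record)]

    # 2. Shuffle phase: group mapped values by key (the GROUP BY).
    groups = defaultdict(list)
    for key, value in mapped:
        groups[key].append(value)

    # 3. Reduce phase: apply the reducer once per group (the aggregate).
    return dict(reducer(key, values) for key, values in groups.items())


# Made-up stand-in for crawl_table.
crawl_table = ["the quick fox", "the lazy dog", "The fox"]
result = run_mapreduce(crawl_table, map_fn, reduce_fn)
print(result)
```

The point is just that the mapped rows have to exist before anything can be grouped, which is why the single-SELECT version reads oddly.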
> To a first approximation, MR runs a single query:
> SELECT map() FROM crawl_table GROUP BY reduce()
Or you could read the entire Google MapReduce paper.