Apple is doing more than just voice search. It parses what the user is saying and either brings back an exact result or completes the action, just like Siri did when you wanted to book a hotel or restaurant.
The "jogging guy" portion of the video shows Siri juggling context to a surprising degree. I think there's a pretty clear line between that and voice search + instant actions.
It's not unlike the line between what Google does and what Wolfram Alpha does.