Well, "small" is relative. It's certainly not a bottleneck, but it's still between 17.4% to 34.2% of the total time. That's definitely still in range where optimizing could have a measurable impact on performance difference (depending on how much room there's left for optimization).
Right, but it's clearly not the situation I was describing, where cutting the time of the parse stages after tokenization to zero would only slightly decrease the total time.
100% / 2.92 ≈ 34.2%
100% / 5.76 ≈ 17.4%
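For what it's worth, here's a quick sketch of what those fractions imply for the best possible overall speedup if the parse stage were optimized. This is my own back-of-the-envelope Amdahl's-law calculation, not numbers from the original benchmark:

```python
def overall_speedup(stage_fraction: float, stage_speedup: float) -> float:
    """Amdahl's law: overall speedup when a stage taking `stage_fraction`
    of total time is made `stage_speedup` times faster."""
    return 1.0 / ((1.0 - stage_fraction) + stage_fraction / stage_speedup)

for fraction in (0.174, 0.342):
    # Best case: the parse stage's cost drops to zero.
    bound = 1.0 / (1.0 - fraction)
    print(f"parse = {fraction:.1%} of total -> at most {bound:.2f}x overall speedup")
```

So even in the best case of eliminating parsing entirely, the total time improves by roughly 1.21x to 1.52x, which is consistent with "measurable but not the bottleneck".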