A somewhat clearer way (imho) is /(\d)\D*$/ since it anchors to the end of the ...

claar · on June 4, 2014

According to the debugger at http://regex101.com/:

  /.*(\d)/

- Searches right-to-left, backtracking until it finds the match.

  /(\d)\D*$/

- Searches left-to-right: steps forward, backtracks, steps forward, backtracks, until it finds a match.

If you're looking for a match toward the end of a string, the .* version will be faster.

hamburglar · on June 4, 2014

Wow, regex101.com is very nice, but I never would have noticed that debugger pane if I hadn't gone looking for it after your comment. Incredibly useful tool.

Rynant · on June 4, 2014

This is not the same as finding the last match though. The parent's example will match '2' in '1 of 2 steps.'

ronaldx · on June 4, 2014

On the contrary, it does give the same result.

$ anchors to the end of the string, and \D* consumes the non-digits at the end so that \d can match the digit '2'.
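For anyone who wants to check this locally, here is a minimal sketch using Python's `re` module (an assumption; the thread does not name an engine) and the sample string from the comments above. Both patterns capture the same digit.

```python
import re

# Sample string taken from the thread.
text = "1 of 2 steps."

# Greedy prefix: .* runs to the end, then backs off until \d can match.
greedy_prefix = re.search(r".*(\d)", text)

# End-anchored: \d must be followed only by non-digits up to the end.
end_anchored = re.search(r"(\d)\D*$", text)

last_digit_a = greedy_prefix.group(1)
last_digit_b = end_anchored.group(1)

print(last_digit_a, last_digit_b)  # prints: 2 2
```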

Rynant · on June 4, 2014

Thanks, I see where I was wrong now.

In this case when finding the last match from the end, would the lazy quantifier reduce backtracking? e.g. /(\d)\D*?$/

icambron · on June 4, 2014

No, that would work very similarly to the greedy version. The backtracking happens because the \d gets matched to the '1' and the whole thing has to be rolled back when the $ attempts to match and instead finds '2' (this would happen again if there were more digits for \d to speculatively match on). So the backtracking is not caused by the laziness or greediness of the \D*; we really do want to gobble up all of the non-digits.
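A small sketch of the greedy and lazy variants side by side, assuming Python's `re` module (other engines may differ in how they get there). It only shows that the two end up with the same capture and the same overall match; it does not count backtracking steps.

```python
import re

text = "1 of 2 steps."

greedy = re.search(r"(\d)\D*$", text)
lazy = re.search(r"(\d)\D*?$", text)

# Both variants must still consume every trailing non-digit before $
# can succeed, so the results are identical.
same_capture = greedy.group(1) == lazy.group(1)
same_span = greedy.span() == lazy.span()

print(greedy.group(1), lazy.group(1), same_capture, same_span)
```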

On the two options generally:

    /(\d)\D*$/

is problematic if you have a lot of digits, while

    /.*(\d)/

is problematic if you have a lot of text after the last digit. Both could potentially be optimized by the engine to run right-to-left (the former because it's anchored to the end and the latter because it greedily matches to the beginning), and then both would do well. I'm not sure if that happens in practice.

Overall, I prefer the latter, both because I think it's clearer and because its perf characteristics hold up under a wider variety of inputs.
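One way to check the perf claims on your own inputs is to time both patterns. This is a rough sketch assuming Python's `re` and `timeit`; the helper name `time_patterns` and both test strings are made up for illustration, and the numbers will vary by engine and machine.

```python
import re
import timeit

GREEDY_PREFIX = re.compile(r".*(\d)")
END_ANCHORED = re.compile(r"(\d)\D*$")


def time_patterns(text, repeats=20):
    """Return the seconds each pattern takes for `repeats` searches of `text`."""
    return {
        "greedy_prefix": timeit.timeit(
            lambda: GREEDY_PREFIX.search(text), number=repeats
        ),
        "end_anchored": timeit.timeit(
            lambda: END_ANCHORED.search(text), number=repeats
        ),
    }


# Hypothetical inputs: one with many digits, one with a long tail
# of text after the only digit.
many_digits = "7 " * 10_000 + "end"
long_tail = "7" + " filler" * 10_000

for label, sample in (("many digits", many_digits), ("long tail", long_tail)):
    print(label, time_patterns(sample))
```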

Edit: how do you make literal asterisks on HN without having a space after them?

nkozyra · on June 4, 2014

In addition to the other explanation, the lazy quantifier is redundant here anyway since there should only be one $ in any given expression.

chinpokomon · on June 4, 2014

As others before me have said, this pattern works as expected. Putting it through regexper, you can visually see this.

http://www.regexper.com/#%2F(%5Cd)%5CD*%24%2F

ygra · on June 4, 2014

Technically, while they both capture the same digit, the match itself is different, including either everything before that digit or everything after it. But I tend to liberally use lookaround to keep the actual match clean myself; maybe others go more often for a capturing group. (Well, and not being able to use arbitrary-length lookaround in most engines might be a reason too.)
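To make the difference concrete, here is a sketch assuming Python's `re`, where lookahead may be variable-length (lookbehind may not). The lookahead version is one possible way to keep the match clean, not necessarily the form the commenter has in mind.

```python
import re

text = "1 of 2 steps."

greedy_prefix = re.search(r".*(\d)", text)
end_anchored = re.search(r"(\d)\D*$", text)
# Lookahead asserts "only non-digits until the end" without consuming them.
lookahead = re.search(r"\d(?=\D*$)", text)

whole_a = greedy_prefix.group(0)  # everything up to and including the digit
whole_b = end_anchored.group(0)   # the digit and everything after it
whole_c = lookahead.group(0)      # just the digit

print(repr(whole_a), repr(whole_b), repr(whole_c))
# prints: '1 of 2' '2 steps.' '2'
```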

abuzzooz · on June 4, 2014

Actually, both will match the '2' in '1 of 2 steps.'

perlgeek · on June 4, 2014

Yes, but that's harder to do for more complicated regexes, because you need to negate a regex (here \d => \D) for this trick.

If you have a complicated regex $r, you can only negate it with (?:(?!$r).)*, and in that case, .*$r is much easier to read :-)
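A sketch of that comparison, assuming Python's `re`. The sub-pattern `token` (two capital letters followed by two digits) and the sample text are made up for illustration; they stand in for the "complicated regex $r".

```python
import re

# Hypothetical stand-in for a more complicated sub-pattern $r.
token = r"[A-Z]{2}\d{2}"
text = "AB12 and CD34 end"

# Greedy prefix: short and readable.
greedy_prefix = re.compile(".*(" + token + ")")

# End-anchored: "any character that does not start another token",
# repeated up to the end of the string.
negated_tail = re.compile("(" + token + ")(?:(?!" + token + ").)*$")

last_a = greedy_prefix.search(text).group(1)
last_b = negated_tail.search(text).group(1)

print(last_a, last_b)  # prints: CD34 CD34
```

Note that the two forms are only interchangeable when a match of the sub-pattern cannot also begin partway through another match; a pattern like `\d+` would make the greedy prefix capture only the final digit.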