One important aspect of leetcode/coding contest problems is that they come with an input size constraint and a time limit.
You can use the two to figure out which time complexity a working solution needs to have. This narrows the search for a solution by quite a bit. Here's a blog post about this idea (going from the input constraint to the possible algorithm): https://www.infoarena.ro/blog/numbers-everyone-should-know
Other than that, understanding a set of frequently used data structures and algorithms helps a ton. Here's a short course from Stanford on preparing for coding contests: http://web.stanford.edu/class/cs97si/
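To make the constraint-to-complexity idea concrete, here's a rough sketch in Python. The budget of about 10^8 simple operations per time limit is my own rule-of-thumb assumption, not a number from the blog post, and the names `OPS_BUDGET` and `slowest_feasible` are made up for illustration. Real limits depend on the judge, the language and the constant factors.

```python
import math

# Assumption: about 10**8 simple operations fit in a typical time limit.
OPS_BUDGET = 10**8

# Candidate complexities, ordered from most to least expensive.
# Each entry estimates the operation count for an input of size n.
COMPLEXITIES = [
    ("O(n!)", lambda n: math.factorial(n) if n <= 20 else math.inf),
    ("O(2^n)", lambda n: 2**n if n <= 60 else math.inf),
    ("O(n^3)", lambda n: n**3),
    ("O(n^2)", lambda n: n**2),
    ("O(n log n)", lambda n: n * math.log2(max(n, 2))),
    ("O(n)", lambda n: n),
    ("O(log n)", lambda n: math.log2(max(n, 2))),
]


def slowest_feasible(n, budget=OPS_BUDGET):
    """Return the most expensive complexity class that still fits the budget."""
    for name, cost in COMPLEXITIES:
        if cost(n) <= budget:
            return name
    return None


if __name__ == "__main__":
    for n in (10, 20, 400, 1000, 10**5, 10**7, 10**9):
        print(n, slowest_feasible(n))
```

So if the statement says n <= 1000, a quadratic solution is probably what's expected, while n <= 10^5 pushes you towards O(n log n) or better.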
A lot of people reading the paper miss this. I guess it's not emphasized enough.
In the first paper, the self-play-trained policy has an Elo rating of about 1500, while darkforest2, a supervised policy from Facebook, is around the same, if not better. So self-play wasn't of much use the first time around. In the AlphaZero paper, by contrast, the self-play-trained policy has an Elo rating of about 3000.
Attention has been used in machine translation in various forms since 2014, as far as I know (Dzmitry Bahdanau, Kyunghyun Cho, Yoshua Bengio: Neural Machine Translation by Jointly Learning to Align and Translate).
It takes time and practice. You won't get the practice at work. I dealt with a network flow problem a while back and a URL spell-correction problem before that. In the last year I haven't done anything algorithmically challenging. Most of the time you're building systems that move data around.
A common mistake is that people learn about complex algorithms and data structures and remember their names, but then can't solve questions that involve just basic structures like vectors, stacks or binary search trees.
I can't learn by reading a book; I have to solve problems to really grasp a concept, so my advice is this:
Do topcoder.com/tc div 2 practice rooms, about 30 of them in a short time span. You'll see the solutions of other people and be able to learn from them. Also the level of div 2 problems is about the level of more difficult interview questions.
Picking up algorithm skills on the job is not that easy. You really have to spend some time, think problems through, play with them and internalize what you learn.
Your job in any company doesn't deal with algorithms that much, but it's very useful to know them for the rare opportunities that do occur, so that you make better choices.
As for YCombinator: in startups, when you hit scaling issues it means your product is good and you're already on the path to success, and you can hire someone with better fundamentals to help you deal with the load.
I too have had interviews with Google and was asked about big O notation. I also got questions about sorting during interviews with several other companies.
Not knowing fundamentals like this never helps. Many interviewers will spoon feed it to you, but you're not going to respond to questions with all your mental resources if you're spending all your energy trying to understand the context.
But it's not necessary. And it costs nothing other than time to fix it. And it may even be fun.