
For neural net gradient descent, automatic differentiation, etc., the widely used ReLU function has information-carrying derivatives at +0 and –0 if those are treated as infinitesimals.


Barely any information. After surviving ReLU, that signed zero is probably getting added to another value, and then oops, the information is gone. It sounds a lot worse than properly spaced values.
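
A quick sketch of what I mean, in plain Python floats (IEEE 754 doubles, default round-to-nearest). The helper `sign_bit` and the value `bias` are just made up for illustration; this is not any framework's actual ReLU or autodiff code:

```python
import math

def sign_bit(x: float) -> bool:
    """Return True if the IEEE 754 sign bit of x is set (hypothetical helper)."""
    return math.copysign(1.0, x) < 0.0

neg_zero = -0.0
pos_zero = 0.0

# The two zeros compare equal, but the sign bit still tells them apart.
zeros_equal = (neg_zero == pos_zero)
neg_has_sign = sign_bit(neg_zero)

# Add either zero to a nonzero value (an arbitrary stand-in for a bias or
# another activation) and the results are bit-for-bit the same.
bias = 0.25
sum_neg = neg_zero + bias
sum_pos = pos_zero + bias

# Even mixing the two zeros loses the sign: -0.0 + 0.0 is +0.0.
mixed = neg_zero + pos_zero

# Only -0.0 + -0.0 keeps the negative sign.
both_neg = neg_zero + neg_zero
```

So the sign only survives if everything it gets summed with is also a negative zero.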


sign = most important bit of information


If you were looking at the entire number line, sign would roughly be the most important part.

But you still have all the other numbers carrying sign info. This is only the sign of denormals, and that's way less valuable. Outside of particular equations, it ends up added to something else and disappears entirely. It would be way better to cut it and have either half the smallest existing positive value or double the largest existing value as a replacement. Or many other options.



