Standard tensor notation in PyTorch and other libraries is very indirect, and shapes are often documented only in comments.
I definitely find it helps to draw parts of your architecture as a tensor diagram, or perhaps to use a library like tensorgrad, which makes everything explicit.
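To illustrate the "shapes only in comments" problem, here is a minimal sketch of one way to turn those comments into checked code. The helper `check_shape` and the axis names are hypothetical, made up for this example; this is not tensorgrad's API or anything built into PyTorch. It only assumes the object has a `.shape` attribute, so stand-in objects are used instead of real tensors.

```python
from types import SimpleNamespace


def check_shape(tensor, spec, dims=None):
    """Check tensor.shape against a spec such as "batch seq d_model".

    `dims` maps axis names to sizes seen so far: new names are bound,
    known names must match. Purely numeric tokens such as "3" must match
    the axis size exactly. Returns an updated copy of the mapping.
    (Hypothetical helper for illustration only.)
    """
    dims = dict(dims or {})
    names = spec.split()
    shape = tuple(tensor.shape)
    if len(shape) != len(names):
        raise ValueError(f"expected {len(names)} axes ({spec!r}), got shape {shape}")
    for name, size in zip(names, shape):
        expected = int(name) if name.isdigit() else dims.setdefault(name, size)
        if expected != size:
            raise ValueError(
                f"axis {name!r}: expected {expected}, got {size} in shape {shape}"
            )
    return dims


# Stand-ins for real tensors: anything with a .shape attribute works.
x = SimpleNamespace(shape=(8, 128, 512))   # activations
w = SimpleNamespace(shape=(512, 2048))     # weight matrix

dims = check_shape(x, "batch seq d_model")
dims = check_shape(w, "d_model d_ff", dims)  # d_model must agree with x
```

The same call should work on a `torch.Tensor`, since it also exposes `.shape`. The point is just that a mismatch fails loudly at the line where the shape is declared, rather than sitting unnoticed in a stale comment.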