Using controllers isn't about accuracy; it's about occlusion and haptics.
Using only gestures means no haptic feedback, and the tracking gets de-synced as soon as the camera can't see what you're doing.
Take Apple's gestures, for example (tap your thumb and index finger together to click). As soon as your hand is rotated in such a way that your own fingers are hidden from the camera, your actions stop being applied. This will also happen if one hand crosses the other, if you rest your arms at your sides while standing, if you reach up for something, or even if you're resting on a couch with your knee up.