The brain is not a single universal neural network that does everything well. It...

vinay427 · on Oct 13, 2016

This is being done using various types of networks. See these slides on image captioning by Karpathy for an example using a CNN and RNN: http://cs.stanford.edu/people/karpathy/sfmltalk.pdf

Senji · on Oct 13, 2016

If we're going with a brain metaphor, what would be those neural networks' version of synesthesia?

empath75 · on Oct 13, 2016

Feeding mp3s to an image recognition neural net. And as soon as I typed that, I want to try it.

modeless · on Oct 13, 2016

Actually, in the architecture you described, if there is a planning net that's connected to an image net and an audio net, then rather than feeding audio to the image net, I think synesthesia would be better modeled by feeding the output of the audio net into the planning net's input that normally receives the image net's output. If that makes sense.

Senji · on Oct 13, 2016

Not the output. It would be several individual connections between intermediate layers of the different nets.
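
A rough sketch of that idea, in case it helps. Everything here is made up for illustration: the layer sizes, the random untrained weights, the `W_cross` matrix, and the `forward` helper are all hypothetical, and it only shows what "a few single connections between intermediate layers" could look like, not how anyone actually builds this.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

# Hypothetical toy sizes; nothing here is trained.
W_img1 = rng.normal(size=(16, 8))          # image input -> image hidden layer
W_img2 = rng.normal(size=(8, 4))           # image hidden layer -> image output
# Non-negative audio weights so the demo's hidden activations are nonzero.
W_aud1 = np.abs(rng.normal(size=(12, 8)))  # audio input -> audio hidden layer
W_aud2 = rng.normal(size=(8, 4))           # audio hidden layer -> audio output

# Sparse cross-wiring: only two single connections between hidden layers.
W_cross = np.zeros((8, 8))
W_cross[0, 3] = 0.5   # audio hidden unit 0 -> image hidden unit 3
W_cross[5, 1] = 0.5   # audio hidden unit 5 -> image hidden unit 1

def forward(image, audio, cross=True):
    h_aud = relu(audio @ W_aud1)
    # Leak audio activity into the image net's intermediate layer.
    lateral = h_aud @ W_cross if cross else np.zeros(8)
    h_img = relu(image @ W_img1 + lateral)
    return h_img @ W_img2, h_aud @ W_aud2

blank_image = np.zeros(16)   # nothing to see
sound = np.ones(12)          # but something to hear

img_out_plain, aud_out_plain = forward(blank_image, sound, cross=False)
img_out_syn, aud_out_syn = forward(blank_image, sound, cross=True)
```

With the cross connections off, a blank image gives an all-zero image output. With them on, the image net "sees" something purely because of the sound, while the audio net's own output is unchanged.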

dharma1 · on Oct 13, 2016

CNNs can actually be used for audio tasks too, by running them on spectrograms.
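
A minimal sketch of what that looks like, assuming plain NumPy and a hand-picked kernel instead of a trained network. The `spectrogram` and `conv2d_valid` helpers, the frame sizes, and the test tone are all my own illustrative choices: the audio becomes a 2D frequency-by-time array, and a convolution slides over it exactly as it would over an image.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram from a short-time Fourier transform."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Transpose so rows are frequency bins and columns are time frames.
    return np.abs(np.fft.rfft(frames, axis=1)).T

def conv2d_valid(image, kernel):
    """2D cross-correlation with no padding, as in a basic CNN layer."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

sample_rate = 8000                       # assumed sample rate in Hz
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)      # one second of a 1 kHz tone

spec = spectrogram(tone)                 # the "image" a CNN would consume

# Hand-written filter that responds to horizontal lines (steady tones).
kernel = np.array([[-1.0, -1.0, -1.0],
                   [ 2.0,  2.0,  2.0],
                   [-1.0, -1.0, -1.0]])
feature_map = np.maximum(conv2d_valid(spec, kernel), 0.0)  # conv + ReLU

peak_bin = int(np.argmax(spec.mean(axis=1)))
peak_hz = peak_bin * sample_rate / 256   # bin index -> frequency in Hz
```

A real model would learn its kernels and stack many layers, but the input handling is the same: once the audio is a spectrogram, it's just another 2D array.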

Senji · on Oct 14, 2016

It's how some guys defeated the first iteration of reCAPTCHA's audio mode. Then Google replaced it with something very annoying to use, even for humans.