The emoji out-of-band authentication is cool, but probably annoying. You have to read those emoji aloud to the other end, so they can check them against what they see. That could be a pain if the emoji are chosen randomly from the 2600 available emoji.
The idea comes from the STU-III secure phone, where there was a 2-digit number display to be read back by voice. It's one of the ways to detect a man-in-the-middle attack. If there's a MITM, the crypto bits one end sends differ from the bits the other end receives, because the MITM is re-encrypting, and this is detectable if you have some out-of-band channel for comparing them. A MITM would thus have to be able to fake the voice of the other party.
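To make the comparison step concrete, here's a rough Python sketch. It is not how the STU phone or any particular app does it; the function names, the `sas-demo` label, and the use of SHA-256 are all my own assumptions for illustration. Each end boils its session key down to a short code, and the two humans compare codes by voice.

```python
import hashlib
import secrets


def short_auth_string(session_key: bytes, digits: int = 2) -> str:
    """Derive a short decimal code from a session key (hypothetical scheme)."""
    # Hash the key with a fixed label so the code reveals nothing useful
    # about the key itself. The label is an arbitrary choice for this sketch.
    digest = hashlib.sha256(b"sas-demo" + session_key).digest()
    value = int.from_bytes(digest[:4], "big") % (10 ** digits)
    return f"{value:0{digits}d}"


def mitm_suspected(my_code: str, code_read_by_peer: str) -> bool:
    """The out-of-band check: the codes read aloud must match."""
    return my_code != code_read_by_peer


# No MITM: both ends hold the same key, so the codes always agree.
shared_key = secrets.token_bytes(32)
print(mitm_suspected(short_auth_string(shared_key),
                     short_auth_string(shared_key)))

# MITM: each end has unknowingly negotiated a different key with the
# attacker, so the codes disagree unless the attacker gets lucky.
key_with_alice = secrets.token_bytes(32)
key_with_bob = secrets.token_bytes(32)
print(mitm_suspected(short_auth_string(key_with_alice),
                     short_auth_string(key_with_bob)))
```

With only 2 digits, two unrelated keys still produce the same code about 1 time in 100, so a short code only bounds the attacker's odds per call. A real protocol also has to stop the attacker from trying many keys until the codes collide, which this sketch doesn't attempt.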
With techniques like this, you can make a MITM work arbitrarily hard to maintain the illusion that it's the other party. I've proposed some ways to do this for web pages.