"Each time you make a Voice Call on Telegram, a neural network learns from your and your device‘s feedback (naturally, it doesn’t have access to the contents of the conversation, it has only technical information such as network speed, ping times, packet loss percentage, etc.). The machine optimizes dozens of parameters based on this input, improving the quality of future calls on the given device and network."
I have the feeling that's there solely to feed the "AI-Powered" hype train, rather than actually being used as a genuine feedback loop to improve the service.
Logs of speed, ping times, and packet loss would likely be more useful in good old non-AI reports, to identify regional issues, peering opportunities, etc.
I don't think you need AI for VoIP calls that upgrade/degrade quality based on network health.
It's fundamentally very similar to the sorts of issues you end up with if you compress then encrypt. If the attacker can make some educated guesses about the plaintext prior to the compression, the compression ratio can be a very powerful tool in their arsenal.
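A minimal sketch of that leak, in the style of the CRIME/BREACH attacks: the attacker never sees the plaintext, only the length of the compressed (then encrypted) message, yet a guess that repeats part of the secret compresses noticeably better. The secret and guess strings here are made up for illustration.

```python
import zlib

# Hypothetical secret embedded in every message; the attacker cannot read it.
SECRET = "session_token=K7f3aQ9x"

def observed_length(guess: str) -> int:
    # The attacker injects `guess` next to the secret (e.g. via a reflected
    # parameter) and can only observe the ciphertext length -- which, with
    # most stream ciphers, equals the compressed plaintext length.
    return len(zlib.compress((SECRET + "&q=" + guess).encode()))

# A guess that duplicates the secret is replaced by a short DEFLATE
# back-reference; a same-length non-matching guess stays mostly literal.
matching = observed_length("session_token=K7f3aQ9x")
nonmatching = observed_length("qZ4mW8rT1nB6vC3xL0pJ5k")
print(matching, nonmatching)
```

Repeating this byte-by-byte lets an attacker recover the secret from compression ratios alone, which is why TLS-level compression was ultimately disabled.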
On the other hand, if you could do it, you'd probably have invented a convoluted speech-to-text system (where "text" is an index into a dictionary of words). Note that you would also likely lose things like inflection, voice, accent, etc. - so while it might work as a texting system with voice input, it would be a poor substitute for voice chat.
This is HN. A link to an example would be appreciated.
Edit:
To clarify, I work with audio codecs too, and can't really think of parameters (other than the compression level?) that would make much sense to adjust on the fly. If "AI" is used for more than just a buzzword here, then I imagine the answer must be quite interesting.
They probably adjust the incoming / outgoing buffer sizes (and therefore the audio delay, since it's live) to account for packet loss.
They might also prioritize traffic depending on how full your buffers are.
I can only assume YouTube and Netflix do similar parameter tweaks to optimize their video delivery based on the connection (totally filling the buffer to a max size all the time would waste bandwidth, but if the client has lots of packet loss it needs a larger safety net).
What sort of parameters are adjusted?