One of the annoying things about a living standard is that it is difficult to implement a conforming version, as any additional update means that you are no longer conforming.
Versioned standards allow you to know that you are compliant with a given version of the specification, and to track the changes between versions -- i.e. what additional functionality do I need to implement.
With "living standards" you need to track the date/commit you last checked and do a manual diff to work out what has changed.
AI in this sense means using Machine Learning (ML)/Neural Networks (NN) to convert the text (or phonemes) to audio.
There are effectively two approaches to voice synthesis: time-domain and pitch-domain.
In time-domain synthesis you are concatenating short waveforms together. These are variations of Overlap-Add: OLA [1], PSOLA [2], MBROLA [3], etc.
In pitch-domain synthesis, the analysis and synthesis happen in the pitch domain through the Fast Fourier Transform (visualized as a spectrogram [4]), often adjusted to the Mel scale [5] to better highlight the pitches and overtones. The TTS synthesizer then generates these pitches and converts them back to the time domain.
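As a rough illustration of the overlap-add idea, here is a minimal sketch in plain Python. The function names (`hann`, `split_frames`, `overlap_add`) and the frame/hop sizes are my own choices for the example, not taken from any particular synthesizer. It only shows the basic windowing and summing step; PSOLA and MBROLA additionally align the frames to pitch periods, which is not shown here.

```python
import math

def hann(n):
    """Periodic Hann window of length n; shifted copies sum to 1 at 50% overlap."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / n) for i in range(n)]

def split_frames(signal, frame_len, hop):
    """Cut the signal into windowed frames that start every `hop` samples."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frames.append([signal[start + i] * window[i] for i in range(frame_len)])
    return frames

def overlap_add(frames, hop):
    """Sum the frames into one signal, placing frame k at offset k * hop."""
    if not frames:
        return []
    frame_len = len(frames[0])
    out = [0.0] * (hop * (len(frames) - 1) + frame_len)
    for k, frame in enumerate(frames):
        start = k * hop
        for i, sample in enumerate(frame):
            out[start + i] += sample
    return out

# Example: a short sine wave split into half-overlapping frames and rebuilt.
signal = [math.sin(2 * math.pi * 3 * t / 64) for t in range(64)]
frames = split_frames(signal, frame_len=16, hop=8)
rebuilt = overlap_add(frames, hop=8)
```

With the same hop on both sides the interior of the signal is reconstructed unchanged. A concatenative synthesizer would instead pick the frames from a database of recorded units, or place them at a different spacing to change duration.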
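To make the Mel scale adjustment concrete, here is a small sketch of the Hz-to-mel conversion. It assumes the commonly used formula mel = 2595 * log10(1 + f / 700); other variants exist, and the helper names (`hz_to_mel`, `mel_to_hz`, `mel_spaced_frequencies`) are just for illustration.

```python
import math

def hz_to_mel(hz):
    """Convert a frequency in Hz to mels (2595 * log10(1 + f / 700) variant)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_spaced_frequencies(low_hz, high_hz, count):
    """Return `count` (>= 2) frequencies in Hz that are evenly spaced in mels."""
    low_mel = hz_to_mel(low_hz)
    high_mel = hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (count - 1)
    return [mel_to_hz(low_mel + i * step) for i in range(count)]

# Band centres between 0 Hz and 8 kHz: close together at low frequencies,
# spread further apart at high frequencies.
centres = mel_spaced_frequencies(0.0, 8000.0, 10)
```

The point of the warping is that the bands are narrow where pitch differences are easy to hear and wide where they are not, so a mel spectrogram spends its resolution on the region where the fundamental and lower overtones sit.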
The basic idea is to extract the formants (pitch bands for the fundamental frequency and overtones) and have models for these. Some techniques include:
Kotlin does have interop with Java, but it is limited by features that either do not exist in Java (non-nullable types) or behave differently in Java (records, etc.).
You have to explicitly annotate a Kotlin data class (with `@JvmRecord`) for it to be compiled as a Java record, due to the limitations Java has on records compared to data classes [1]. This is similar to adding nullable/not-null annotations in Java that are mapped to Kotlin's nullable/non-nullable types.
Where there is a clean 1-1 mapping and you are targeting the appropriate version of Java, the Kotlin compiler will emit the appropriate Java bytecode.
I view the relationship between Kotlin and Java like that between C++ and C.
The two-way interop is one of Kotlin's advantages as it makes it easier to port code from Java to Kotlin and to use existing Java libraries. For example, you don't have/need something like Scala's `asJava` and `asScala` mappers as the language/standard library does that mapping for you.
The interop isn't always perfect or clean due to the differences in the languages. But that's similar to writing virtual function tables in C -- you can do it, and have interop between C and C++ (such as with COM) but you often end up exposing internal details.
It's not just screen reader users. I use TTS to listen to text content, and the AI TTS voices I've tried have issues with skipping words or generating garbled output in sections.
I don't know if this is a data/transcription issue, an issue with noisy audio, or what.
And at the next election -- if the current polling stays consistent -- they are likely to get 15-85 seats, which is not enough for them to gain power. Even then they are unlikely to form a coalition, as Labour are not doing well in the polls currently and the gain in support for the Greens is largely coming at Labour's expense.
[3] (2021 https://arxiv.org/abs/2106.07889) UnivNet: A Neural Vocoder with Multi-Resolution Spectrogram Discriminators for High-Fidelity Waveform Generation
[1] https://search.nixos.org/packages?channel=unstable&query=mus...
[2] https://search.nixos.org/packages?channel=unstable&query=med...