Unicode includes visual characters (grapheme clusters) that are built from multiple code points.
This means questions (functions) like “how many characters do I have”, and indexing in general, are not as straightforward as they could be.
So storing each visual character’s group of code points together, as a single element, results in a simpler interface from a programming standpoint.
It is not necessarily as efficient, since either indirection or a larger per-character footprint (or both) is required, but it is simpler.
This is definitely the route I would go while implementing the other complexity. Undoing this “simplicity optimization” afterwards, if it ever proves necessary, would be a small price to pay for the development convenience.
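
As a rough illustration of the "one element per visual character" idea, here is a minimal Python sketch. The names `split_clusters` and `_extends_previous` are made up for this example, and the grouping rule (attach combining marks and zero-width-joiner sequences to the preceding character) is a deliberate simplification, not the full Unicode segmentation algorithm, so it will get some scripts and emoji sequences wrong.

```python
import unicodedata

ZWJ = "\u200d"  # zero-width joiner, used to glue emoji sequences together


def _extends_previous(ch):
    """Rough test (an assumption, not the full Unicode rules):
    does ch attach to the character before it?"""
    # Mn/Me/Mc are the combining-mark categories.
    return unicodedata.category(ch) in ("Mn", "Me", "Mc") or ch == ZWJ


def split_clusters(text):
    """Return a list with one string per (approximate) visual character."""
    clusters = []
    for ch in text:
        if clusters and (_extends_previous(ch) or clusters[-1].endswith(ZWJ)):
            clusters[-1] += ch   # same visual character, keep it in one element
        else:
            clusters.append(ch)  # start a new visual character
    return clusters


word = "cafe\u0301"              # "café" written with a combining acute accent
cells = split_clusters(word)
print(len(word), len(cells))     # 5 4
print(cells[3] == "e\u0301")     # True
```

With the text stored this way, `len(cells)` and `cells[i]` answer "how many characters" and "which character is at position i" directly. The cost is the one mentioned above: each element is a separate small string rather than a slot in one flat buffer.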