Remember when they claimed Yi had 200k context length despite it having 16k of usable context?
I remember, because I spent non-trivial effort trying to make it work for long-form technical summarization. My lackluster findings were validated by RULER.
https://github.com/hsiehjackson/RULER