Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

> Our foundation models are fine-tuned for users’ everyday activities, and can dynamically specialize themselves on-the-fly for the task at hand. We utilize adapters, small neural network modules that can be plugged into various layers of the pre-trained model, to fine-tune our models for specific tasks. For our models we adapt the attention matrices, the attention projection matrix, and the fully connected layers in the point-wise feedforward networks for a suitable set of the decoding layers of the transformer architecture.

>We represent the values of the adapter parameters using 16 bits, and for the ~3 billion parameter on-device model, the parameters for a rank 16 adapter typically require 10s of megabytes. The adapter models can be dynamically loaded, temporarily cached in memory, and swapped — giving our foundation model the ability to specialize itself on the fly for the task at hand while efficiently managing memory and guaranteeing the operating system's responsiveness.

This kind of sounds like Loras......



The article explicitly states they’re Loras.


I think it is just LoRA, you can call the LoRA weights as adapters


The A in LoRA stands for adapters


LoRA stands for "Low Rank Adaptation" btw.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: