Researchers propose a framework for customizing large language models (LLMs) for specific users or tasks.
The framework trains an additional branch of transformer blocks on the final-layer embeddings of a pretrained LLM, and uses a carry-on module to merge this branch with the base model.
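A minimal sketch of this idea is shown below, assuming a PyTorch setup. The names (`CarryOnBranch`, `merged_logits`, the additive merge rule, and the layer sizes) are illustrative assumptions, not the paper's actual implementation: a small stack of transformer blocks consumes the frozen base model's final-layer hidden states, and its output is merged with the base model's predictions.

```python
# Hedged sketch of a "carry-on" branch on top of a frozen base LLM.
# Class and function names are illustrative, not taken from the paper.
import torch
import torch.nn as nn

class CarryOnBranch(nn.Module):
    """A few extra transformer blocks trained on the base model's
    final-layer hidden states; the base LLM itself stays frozen."""
    def __init__(self, hidden_dim: int, vocab_size: int,
                 n_layers: int = 2, n_heads: int = 8):
        super().__init__()
        layer = nn.TransformerEncoderLayer(
            d_model=hidden_dim, nhead=n_heads, batch_first=True)
        self.blocks = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.lm_head = nn.Linear(hidden_dim, vocab_size)

    def forward(self, final_hidden: torch.Tensor) -> torch.Tensor:
        # final_hidden: (batch, seq_len, hidden_dim) from the pretrained LLM.
        # A causal mask keeps the extra blocks autoregressive.
        seq_len = final_hidden.size(1)
        causal_mask = nn.Transformer.generate_square_subsequent_mask(
            seq_len).to(final_hidden.device)
        return self.lm_head(self.blocks(final_hidden, mask=causal_mask))

def merged_logits(base_logits: torch.Tensor, carry_on_logits: torch.Tensor,
                  alpha: float = 1.0) -> torch.Tensor:
    # One plausible merge rule (an assumption): treat the carry-on branch's
    # logits as an additive offset to the base model's logits.
    return base_logits + alpha * carry_on_logits
```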
Multiple such layers, or LLMs specialized in different domains, can be combined to create a customized LLM for a new task.
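One simple way such a combination could look is sketched below; the weighted-sum scheme and the helper `combine_branches` are assumptions for illustration, not the paper's method.

```python
# Hedged sketch: combining several domain-specialized carry-on branches
# (e.g. one trained on legal text, one on medical text) by summing their
# weighted logit offsets on top of the base model's logits.
def combine_branches(base_logits: torch.Tensor,
                     branch_logits_list: list[torch.Tensor],
                     weights: list[float]) -> torch.Tensor:
    logits = base_logits.clone()
    for w, branch_logits in zip(weights, branch_logits_list):
        logits = logits + w * branch_logits
    return logits
```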
The proposed approach allows most of the training computation to be outsourced to inference nodes, reducing the memory and computation requirements of customization.
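The training step below illustrates why this is cheap, reusing `CarryOnBranch` and `merged_logits` from the earlier sketch. The Hugging Face model choice ("gpt2"), learning rate, and loss setup are assumptions: the base model runs under `torch.no_grad()` (so its forward pass could be served by inference nodes or precomputed), and only the small carry-on branch receives gradients.

```python
# Illustrative training step under the stated assumptions: the base LLM runs
# in inference mode, and only the carry-on branch is optimized locally.
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained("gpt2")
tok = AutoTokenizer.from_pretrained("gpt2")
base.eval()
for p in base.parameters():
    p.requires_grad_(False)  # base model stays frozen

branch = CarryOnBranch(hidden_dim=base.config.hidden_size,
                       vocab_size=base.config.vocab_size)
opt = torch.optim.AdamW(branch.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

batch = tok(["example domain-specific text"], return_tensors="pt")
with torch.no_grad():  # this pass could be outsourced to inference nodes
    out = base(**batch, output_hidden_states=True)
final_hidden = out.hidden_states[-1]

logits = merged_logits(out.logits, branch(final_hidden))
# Next-token prediction loss on the customization data
loss = loss_fn(logits[:, :-1].reshape(-1, logits.size(-1)),
               batch["input_ids"][:, 1:].reshape(-1))
loss.backward()
opt.step()
```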