techminis (a naukri.com initiative)

Medium · 4w read

Image Credit: Medium

My “Secret Sauce” for the Inaugural Singapore Nationwide AWS Large Language Models League (LLML)…

  • The Large Language Models League (LLML) was a fine-tuning competition organised by the Amazon Web Services (AWS) team to promote the skill of fine-tuning large language models using Llama-3-8B-Instruct.
  • The competition had two rounds: a preliminary round and the Grand Finale. In the preliminary round, participants fine-tuned the Llama-3-8B-Instruct model and were pitted against a Llama-3-70B-Instruct model. The top five finalists advanced to the Grand Finale on October 3rd for the showdown.
  • The Grand Finale had seven questions, judged by an LM judge (40%), a panel of five experts (40%), and the audience (20%), and participants had to generate their model responses within 60 seconds.
  • The author shares his experiences from trial-and-error fine-tuning attempts, balancing experimentation between hyperparameter tuning and dataset selection.
  • The author took a cautious approach to dataset size, choosing the 1,000 data points with the longest response token lengths, a subset he called SeaEval-1k, for the Singapore-context instruction-response task.
  • The author also experimented with Low-Rank Adaptation (LoRA) and varied the target_modules used for fine-tuning, focusing mainly on the epoch, learning_rate, lora_r, and lora_alpha hyperparameters.
  • Initial results suggested a positive correlation between the number of training epochs and performance. Setting lora_alpha to twice lora_r seemed to be the most commonly suggested ratio.
  • Prompt engineering played a crucial role in the Grand Finale. The author focused on generating long responses to maximize the LM judge's score while prioritizing less structure and more creativity on the final question.
  • The author emphasizes that luck played a vital role, and his insights are based on his trial-and-error fine-tuning attempts, which may not reflect universally optimal approaches.
  • The author thanks the Gen-C Generative AI Learning Community for hosting the workshop and the AWS team for organising and facilitating the competition.
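The two techniques the summary highlights — keeping only the 1,000 longest-response samples, and pairing lora_r with a lora_alpha of twice its value — can be sketched roughly as below. This is a minimal illustration, not the author's actual pipeline: the dataset field name (`response`), the whitespace token counter (standing in for a real tokenizer such as Llama-3's), and all concrete hyperparameter values (`lora_r = 16`, the target modules, epochs, learning rate) are assumptions for the sake of a self-contained example.

```python
def select_longest(dataset, k=1000, count_tokens=lambda s: len(s.split())):
    """Keep the k instruction-response pairs with the longest responses.

    `count_tokens` is a stand-in for a real tokenizer; whitespace splitting
    is used here only to keep the sketch self-contained. Applied to a
    Singapore-context dataset with k=1000, this would yield something like
    the "SeaEval-1k" subset described above.
    """
    return sorted(dataset,
                  key=lambda ex: count_tokens(ex["response"]),
                  reverse=True)[:k]


# Illustrative LoRA hyperparameters following the alpha = 2 * r ratio the
# author mentions; the specific values here are hypothetical, not the
# author's winning settings.
lora_r = 16
lora_config = {
    "r": lora_r,
    "lora_alpha": 2 * lora_r,                 # the commonly suggested 2x ratio
    "target_modules": ["q_proj", "v_proj"],   # assumed attention projections
    "num_train_epochs": 3,                    # more epochs correlated with better scores
    "learning_rate": 2e-4,
}

# Toy usage: the longer response survives the cut.
toy = [{"response": "short"},
       {"response": "a much longer response here"}]
top = select_longest(toy, k=1)
```

In a real run, `lora_config` would feed something like PEFT's `LoraConfig` and the selected subset would be tokenized with the base model's tokenizer before training.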
