Did My LoRA Learn Tenacious Style—or Just Memorize Augmented Patterns?


DEV Community·Beamlaka·25 days ago


#deeplearning #llm #machinelearning #nlp #style #lora


In Week 11 Tenacious-Bench, we trained a LoRA adapter on Tenacious-style B2B sales emails using Supervised Fine-Tuning (SFT). We got a real performance lift: Delta A = +0.263 (p < 0.0001). But that result exposed a harder question: Did the adapter learn how Tenacious writes, or just what repeated Tenacious-like samples looked like?

This post answers that at the mechanism level: cross-entropy token-by-token, LoRA gradient flow, and why low-diversity augmentation can make convergence look better than generalization.

1) What SFT cross-entropy actually optimizes

In autoregressive SFT, the model predicts the next token at each step. Cross-entropy loss measures how much probability mass the model gave to the correct next token.

So the objective is: not “be honest,” not “be cautious,” not “be Tenacious,” but: assign high probability to target tokens in the training distribution.

If your targets consistently reflect Tenacious behavior, style improves indirectly.…
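To make the token-by-token view of cross-entropy concrete, here is a minimal sketch in plain Python. It is not the training code from this post: the four-word vocabulary, the logits, and the helper names `log_softmax` and `token_cross_entropy` are all hypothetical, chosen only to show that the loss at each step is the negative log-probability the model assigned to the target token.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one step's logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def token_cross_entropy(step_logits, target_ids):
    """Return the per-token loss: -log p(target token) at each step."""
    return [
        -log_softmax(logits)[target]
        for logits, target in zip(step_logits, target_ids)
    ]


# Hypothetical toy vocabulary and logits (illustration only).
vocab = ["we", "can", "guarantee", "help"]
step_logits = [
    [2.0, 0.1, 0.1, 0.1],  # step 0: most mass on "we"
    [0.1, 2.0, 0.1, 0.1],  # step 1: most mass on "can"
    [0.1, 0.1, 1.5, 1.5],  # step 2: split between "guarantee" and "help"
]
target_ids = [0, 1, 3]  # target sequence: "we can help"

per_token = token_cross_entropy(step_logits, target_ids)
mean_loss = sum(per_token) / len(per_token)

for tok_id, loss in zip(target_ids, per_token):
    print(f"{vocab[tok_id]:>10s}  loss={loss:.3f}")
print(f"mean loss = {mean_loss:.3f}")
```

In this toy setup the third step carries the largest loss, because the model splits its probability between the target and a competing token. Nothing in the objective says which of the two is the more "Tenacious" choice; it only rewards probability on whatever token the training target contains.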
