HashtagPLUS · #Hashtag the Web... #Tag your World!

#Chosen

3 posts

The Pause Before the Token

DEV Community·HYPHANTA·21 days ago

#ai #opensource #agents #software #model #word

There's a moment, inside every generation, where the model could go anywhere. A weighted cloud of...

From -9.15pp to +0.61pp: An engineering journey through four DPO iteration failures

DEV Community·namakoo [IDFU]·25 days ago

#iter #machinelearning #ai #chosen #samples #model

From Dev.to - machinelearning: From -9.15pp to +0.61pp: An engineering journey through four DPO iteration failures

DPO vs SimPO: What Your Preference Trainer Is Actually Optimizing

DEV Community·Natnael Alemseged·25 days ago

#ai #llm #finetuning #margins #held #simpo

A practical way to tell whether a small LoRA preference-tuning run should stay on DPO or switch to SimPO.
