EmoNet: Speaker-Aware Transformers for Emotion Recognition — and What I’d Build Differently in 2026 | Towards Data Science

Towards Data Science·Biju Puthan Veetil·3 days ago

I submitted my MS thesis on Emotion Recognition in Conversation (ERC). The model, EmoNet, achieved a Weighted F1 of 39.18 on EmoryNLP. That was competitive with the public PapersWithCode leaderboard at the time, sitting between TUCORE-GCN_RoBERTa (39.24) and S+PAGE (39.14), and it improved over my chosen baseline, CoMPM, by +1.81 F1.

Two years later, I returned to see where the field is now. The leaderboard is unrecognizable. The top entries are no longer encoder-only models with clever attention heads; they’re LLaMA-2-7B-based systems with LoRA fine-tuning and retrieval-augmented prompting: InstructERC, CKERC, BiosERC, LaERC-S. The methods are different. The compute is different. The mindset is different.

And yet, when I read these new papers carefully, the core ideas I proposed in EmoNet show up inside them, just implemented at a different layer of the stack. This is the story of what I built, where it placed, and what I’d build now if I were starting over.…
