Computer Science: Theory and Application·/u/Shaweyy·4 days ago

Hi everyone! I'm an undergrad who just started my first Natural Language Processing course this semester, and I'm really enjoying it! In one of the early lectures, we were talking about the Levenshtein distance and other algorithms, and I was astonished to learn that most string distance functions are O(n*m) and get painfully slow. I thought to myself, "What if we represented each word as a vector instead of comparing raw character sequences?" Then we could just do a fast vector search using FAISS or similar libraries. I started tinkering a lot (way too much!) and almost missed an important deadline, but I was having a blast trying different approaches! I ended up building a working prototype: it encodes each dictionary word into a fixed-size vector using character frequencies, average positions, and what typically comes before and after each letter.…
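
To make the idea concrete, here is a minimal sketch of what such an encoding might look like. It is not the prototype from the post: the feature layout (four numbers per letter), the normalization, and the names `encode`, `distance`, and `nearest` are all assumptions made for illustration. The lookup is a plain brute-force scan in pure Python, standing in for the FAISS index mentioned above.

```python
import math
import string

ALPHABET = string.ascii_lowercase
IDX = {c: i for i, c in enumerate(ALPHABET)}


def encode(word):
    """Hypothetical encoder: 4 features per letter a-z, so 104 numbers total.

    For each letter: its count, its mean position (scaled to (0, 1]),
    and the mean code of the letters just before and just after it.
    """
    n = len(ALPHABET)
    counts = [0.0] * n
    pos_sum = [0.0] * n
    prev_sum = [0.0] * n
    next_sum = [0.0] * n

    # Assumption: ignore case and anything outside a-z.
    letters = [c for c in word.lower() if c in IDX]
    length = len(letters)

    for i, c in enumerate(letters):
        k = IDX[c]
        counts[k] += 1
        pos_sum[k] += (i + 1) / length
        if i > 0:
            prev_sum[k] += (IDX[letters[i - 1]] + 1) / n
        if i < length - 1:
            next_sum[k] += (IDX[letters[i + 1]] + 1) / n

    vec = []
    for k in range(n):
        c = counts[k]
        if c:
            vec.extend([c, pos_sum[k] / c, prev_sum[k] / c, next_sum[k] / c])
        else:
            vec.extend([0.0, 0.0, 0.0, 0.0])
    return vec


def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest(query, dictionary):
    """Brute-force nearest neighbour; a vector index would replace this scan."""
    q = encode(query)
    return min(dictionary, key=lambda w: distance(q, encode(w)))


words = ["hello", "world", "help"]
print(nearest("helo", words))
```

Each comparison here costs a fixed number of operations regardless of word length, which is the appeal over an O(n*m) edit distance. The trade-off is that the vector distance only approximates edit distance, so two words with the same letters in a different order can end up closer than expected.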
