What are people using for low-latency autocomplete in production? [P]

I’ve been looking into autocomplete/typeahead systems recently, especially in contexts where latency really matters (e.g. search-as-you-type or RAG pipelines).

From what I can tell, the main approaches are:

* Full search backends (Elasticsearch, Meilisearch, etc.)
* LLM-based suggestions (flexible but slow per keystroke)
* Simpler prefix / n-gram systems (fast but sometimes limited)

I’m trying to understand what people actually use in production when you need:

* very low latency
* reasonable suggestion quality
* minimal infra overhead

Are most systems still based on classical methods, or are people moving toward hybrid approaches (retrieval + reranking)?…
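
For concreteness, here is a rough sketch of what I mean by a "simpler prefix" system: an in-memory sorted list with a binary search for the prefix, and matches ranked by a popularity count. The class name `PrefixIndex`, the `suggest` method, and the sample phrases and counts are all made up for illustration, and this is just one way to do it, not a claim about what any production system runs.

```python
import bisect
from collections import Counter


class PrefixIndex:
    """Sorted-list prefix lookup; a sketch, not a production design."""

    def __init__(self, phrase_counts):
        # phrase_counts: mapping of phrase -> popularity count (hypothetical data).
        # Phrases are lowercased; counts of phrases that collide are summed.
        self._counts = Counter()
        for phrase, count in phrase_counts.items():
            self._counts[phrase.lower()] += count
        self._phrases = sorted(self._counts)

    def suggest(self, prefix, k=5):
        """Return up to k phrases starting with prefix, most popular first."""
        prefix = prefix.lower()
        if not prefix:
            return []
        # Binary search for the first phrase >= prefix.
        lo = bisect.bisect_left(self._phrases, prefix)
        # Phrases sharing the prefix are contiguous in sorted order; walk to the end.
        hi = lo
        while hi < len(self._phrases) and self._phrases[hi].startswith(prefix):
            hi += 1
        matches = self._phrases[lo:hi]
        # Rank by descending count, breaking ties alphabetically.
        matches.sort(key=lambda p: (-self._counts[p], p))
        return matches[:k]


index = PrefixIndex({
    "search engine": 50,
    "search as you type": 120,
    "season": 10,
    "sea": 5,
    "rag pipeline": 30,
})
print(index.suggest("sea", k=2))
```

The lookup does no I/O and no model call, which is why this kind of approach is fast per keystroke. The "sometimes limited" part is visible too: it only matches literal prefixes, so typos and mid-phrase matches get nothing.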