that “groups are remarkably intelligent, and are often smarter than the smartest people in them.” He was writing about decision-making, but the same principle applies to classification: get enough people to describe the same phenomenon and a taxonomy starts to emerge, even if no two people phrase it the same way. The challenge is extracting that signal from the noise. I had several thousand rows of free-text data and needed to do exactly that. Each row was a short natural-language annotation explaining why an automated security finding was irrelevant, which functions to use for a fix, or what coding practices to follow. One person wrote “this is test code, not deployed anywhere.” Another wrote “non-production environment, safe to ignore.” A third wrote “only runs in CI/CD pipeline during integration tests.” All three meant the same thing, but no two shared more than a word or two. The taxonomy was in there. I just needed the right tool to extract it.…