Most audience-analysis projects start with one deceptively simple question: which people are shared across these audiences, and which people are unique?

That question sounds like marketing. In practice, it is a data engineering problem.

If you are analyzing creators, competitors, communities, conferences, or niche accounts, the expensive mistake is usually the same: you hydrate every profile too early. You pull usernames, bios, avatars, follower counts, descriptions, and other profile fields before you know which users are actually worth inspecting.

For many social graph workflows, the better pattern is:

IDs first. Profiles later.

Collect the graph as raw IDs. Run set operations. Find the interesting segments. Hydrate only the users that survive that first pass.

This post walks through that pattern using Twitter/X-style follower data as the example.…
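To make the IDs-first idea concrete, here is a minimal Python sketch under some loud assumptions: the `audiences` data is made up, `fake_fetch_profiles` is a hypothetical stand-in for whatever profile lookup you actually use, and the batch size of 100 is an illustrative default rather than a limit taken from any real API. It shows the shape of the workflow: plain set operations on raw IDs, then hydration of only the surviving segment.

```python
from typing import Callable, Dict, Iterable, List, Set

# Hypothetical sample data: audience name -> set of raw follower IDs.
audiences: Dict[str, Set[int]] = {
    "creator_a": {101, 102, 103, 104},
    "creator_b": {103, 104, 105},
    "creator_c": {104, 106},
}


def shared_across_all(groups: Dict[str, Set[int]]) -> Set[int]:
    """IDs that appear in every audience."""
    if not groups:
        return set()
    return set.intersection(*groups.values())


def unique_to(name: str, groups: Dict[str, Set[int]]) -> Set[int]:
    """IDs that appear in one audience and in no other."""
    others: Set[int] = set()
    for other_name, ids in groups.items():
        if other_name != name:
            others |= ids
    return groups[name] - others


def hydrate(
    ids: Iterable[int],
    fetch_profiles: Callable[[List[int]], Dict[int, dict]],
    batch_size: int = 100,  # assumed value, not a documented API limit
) -> Dict[int, dict]:
    """Look up profiles in batches, only for the IDs passed in."""
    ordered = sorted(ids)
    profiles: Dict[int, dict] = {}
    for start in range(0, len(ordered), batch_size):
        batch = ordered[start:start + batch_size]
        profiles.update(fetch_profiles(batch))
    return profiles


def fake_fetch_profiles(batch: List[int]) -> Dict[int, dict]:
    """Hypothetical stand-in for a real profile lookup call."""
    return {uid: {"id": uid, "username": f"user_{uid}"} for uid in batch}


# First pass: sets only, no profile lookups.
shared = shared_across_all(audiences)
only_a = unique_to("creator_a", audiences)

# Second pass: hydrate just the segment that survived.
shared_profiles = hydrate(shared, fake_fetch_profiles)

print(sorted(shared))   # [104]
print(sorted(only_a))   # [101, 102]
print(shared_profiles)  # {104: {'id': 104, 'username': 'user_104'}}
```

In this toy data set, six distinct IDs were collected but only one profile lookup was needed. The same ratio is what makes the pattern attractive on real follower lists, where the segment worth inspecting is typically a small fraction of the graph.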