How useful are linguistic signals like hedging and passive voice for comparing news outlets?

Hi everyone,

I’m working on a metadata-only analysis of news coverage across major outlets, and I’d be interested in feedback from people with journalism/editorial experience.

The goal is **not** to rank outlets by truthfulness or say that one outlet is “better” than another. I’m trying to understand whether measurable linguistic signals can be useful for comparing reporting style over time.

The current analysis looks at 8 outlets from 2016–2026 and tracks two metrics:

**Hedging rate**
Share of sentences using uncertainty/speculative language, such as “may,” “might,” “could,” “reportedly,” or “allegedly.”

**Passive voice ratio**
Share of sentences detected as passive voice, used as a rough proxy for less direct agency or attribution structure.

The dataset is filtered to hard-news topics and excludes sports, entertainment, lifestyle, weather, and similar categories.…
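
To make the two metric definitions concrete, here is a minimal Python sketch of how per-sentence shares like these could be computed. It is an illustration under stated assumptions, not necessarily how the analysis does it: the hedge list is only the five example words quoted above, the sentence splitter and the passive-voice regex are crude stand-ins for a real tokenizer and parser, and every name in it (`HEDGE_TERMS`, `sentence_share`, `hedging_rate`, `passive_voice_ratio`) is hypothetical.

```python
import re

# Assumption: only the five example terms from the post; a real lexicon would be longer.
HEDGE_TERMS = ("may", "might", "could", "reportedly", "allegedly")
HEDGE_RE = re.compile(r"\b(?:" + "|".join(HEDGE_TERMS) + r")\b", re.IGNORECASE)

# Assumption: a crude passive heuristic, not a parser. It looks for a form of
# "be", optionally one -ly adverb, then a word ending in -ed or -en.
PASSIVE_RE = re.compile(
    r"\b(?:is|are|was|were|be|been|being)\s+(?:\w+ly\s+)?\w+(?:ed|en)\b",
    re.IGNORECASE,
)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def sentence_share(text, pattern):
    """Fraction of sentences containing at least one match of `pattern`."""
    sentences = split_sentences(text)
    if not sentences:
        return 0.0
    hits = sum(1 for s in sentences if pattern.search(s))
    return hits / len(sentences)


def hedging_rate(text):
    return sentence_share(text, HEDGE_RE)


def passive_voice_ratio(text):
    return sentence_share(text, PASSIVE_RE)


sample = (
    "The minister may resign next week. "
    "Two people were arrested on Monday. "
    "Police closed the road. "
    "Officials reportedly met in private."
)
print(hedging_rate(sample))         # 0.5  (sentences 1 and 4)
print(passive_voice_ratio(sample))  # 0.25 (sentence 2)
```

Even this toy version shows where the signals get noisy: the case-insensitive match would count the month “May” as a hedge, and the regex misses irregular participles (“was told”, “were shot”) while flagging adjectives like “was tired”. Both metrics are also counted per sentence, so outlets with different average sentence lengths are not directly comparable without some normalization.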