## The Problem

You have environmental data (pH, organic matter, nutrients) and biodiversity data (Shannon index). You want to know: which environmental factors significantly affect diversity?

The quickest way to answer this is scatter plots + linear regression. Here's a complete, runnable R workflow using real soil microbiome data, from a simple `lm()` call to a multi-panel publication-ready figure. Everything uses base R graphics, no ggplot2 required.

## Data

This dataset has 24 soil samples with 6 environmental factors and 2 diversity indices:

| Variable | Description | Unit |
|----------|-------------|------|
| pH | Soil acidity | - |
| AK | Available potassium | mg/kg |
| AP | Available phosphorus | mg/kg |
| AN | Available nitrogen | mg/kg |
| OM | Organic matter | g/kg |
| W | Water content | % |
| Shannon | Species diversity | - |
| PD | Phylogenetic diversity | - |

```r
# Read and inspect
df <- read.csv("sandian.csv", header = TRUE)
str(df)
head(df)
```

## Single Factor: pH → Shannon

```r
x <- df$pH
y <- df$Shannon
fit <- lm(y ~ x)
summary(fit)
```
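As a rough sketch of what can be done with the fitted model from the pH → Shannon step, the block below pulls the slope, R², and p-value out of `summary()` and draws the scatter plot with its regression line in base R. So that it runs without `sandian.csv`, it uses a synthetic stand-in data frame. The simulated values, the seed, and the variable names `slope`, `r2`, and `p_value` are assumptions for illustration and are not taken from the real dataset.

```r
# Synthetic stand-in for sandian.csv: hypothetical values, NOT the real data.
set.seed(42)
n <- 24
df <- data.frame(pH = runif(n, min = 5, max = 8))
df$Shannon <- 2 + 0.5 * df$pH + rnorm(n, sd = 0.4)

# Fit the single-factor model; the formula interface keeps column names
# in the coefficient table.
fit <- lm(Shannon ~ pH, data = df)
s <- summary(fit)

# Extract the quantities usually reported alongside the plot.
slope   <- coef(fit)[["pH"]]
r2      <- s$r.squared
p_value <- s$coefficients["pH", "Pr(>|t|)"]

# Scatter plot with the fitted line, base R graphics only.
plot(df$pH, df$Shannon, pch = 19,
     xlab = "pH", ylab = "Shannon index")
abline(fit, col = "red", lwd = 2)
legend("topleft", bty = "n",
       legend = sprintf("R^2 = %.2f, p = %.3g", r2, p_value))
```

To try this on the real data, replace the synthetic `df` with the `read.csv("sandian.csv", header = TRUE)` call from the Data section. The rest should work unchanged as long as the `pH` and `Shannon` columns are present.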