exploratory data analysis

There probably is no “one stop shop” for exploratory data analysis. I’d like to see some plots myself – box plots, cumulative distributions, time series – and why not put all those side by side with the relevant summary statistics? But still, this is a nice entry.
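As a rough idea of what that side-by-side view could look like, here is a minimal base R sketch. The function name eda_panel is made up for this post, and the built-in Nile series is just a stand-in for your own data.

```r
# Hypothetical helper (not from any package): three plots plus a line of
# summary statistics for one numeric variable.
eda_panel <- function(x, name = "x") {
  x <- as.numeric(x)
  x <- x[!is.na(x)]

  # Three panels in one row, with an outer bottom margin for the statistics
  old <- par(mfrow = c(1, 3), oma = c(2, 0, 0, 0))
  on.exit(par(old))

  boxplot(x, main = paste("Box plot:", name))
  plot(ecdf(x), main = paste("Cumulative distribution:", name))
  plot(x, type = "l", xlab = "observation order", ylab = name,
       main = paste("Series:", name))

  # Summary statistics printed underneath the plots
  stats <- c(n = length(x), mean = mean(x), sd = sd(x), median = median(x))
  mtext(paste(names(stats), signif(stats, 3), collapse = "   "),
        side = 1, outer = TRUE)

  invisible(stats)
}

# Nile ships with base R: annual flow of the Nile at Aswan
nile_stats <- eda_panel(Nile, "Nile flow")
```

The third panel only means something if the observations come in a meaningful order, as they do for a time series.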

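The describe() function from R’s psych package (Revelle, 2023) provides a comprehensive statistical summary of your dataset. Unlike R’s base summary() function, which reports the minimum, quartiles, median, mean and maximum, it also reports the number of observations, standard deviation, trimmed mean, median absolute deviation, range, skew, kurtosis and standard error. These extra metrics are particularly useful for data exploration and assumption checking.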
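A minimal sketch of the comparison, assuming the psych package is installed and using the built-in mtcars data as a stand-in. The skew_by_hand helper is my own illustration of what the skew column measures. It is not part of psych, and it may differ slightly from the value describe() reports depending on the skew type used.

```r
# mtcars ships with base R; used here purely as an example dataset
vars <- mtcars[, c("mpg", "hp", "wt")]

# Base R summary: min, quartiles, median, mean, max
print(summary(vars))

# psych::describe, only run if the package is available
# (install.packages("psych") if it is not)
if (requireNamespace("psych", quietly = TRUE)) {
  print(psych::describe(vars))
}

# Illustrative helper: third central moment divided by the cubed
# sample standard deviation. Values near 0 suggest a symmetric
# distribution; positive values suggest a long right tail.
skew_by_hand <- function(x) {
  x <- x[!is.na(x)]
  mean((x - mean(x))^3) / sd(x)^3
}

hp_skew <- skew_by_hand(vars$hp)
```

A clearly non-zero skew or kurtosis is a hint to look at the distribution before trusting methods that assume normality.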

All this works best for data points you assume are independent of each other in time and space, which is not how the actual universe tends to work. And yeah, I know there is a thing called machine learning, but I like to start simple and build up a picture of a data set in my simple human brain.
