Biased vs. Unbiased Standard Deviation and Variance #59

Open
minus27 opened this issue May 11, 2020 · 1 comment

minus27 commented May 11, 2020

Thanks for the jq recipes! jq is very powerful, but horribly underdocumented.

While looking at your standard deviation recipe, I threw the numbers into Google Sheets (and Excel) and the results did not match, which bothered me. Long story short, your recipe computes the biased standard deviation (and variance), while the Google Sheets and Excel functions calculate the unbiased standard deviation (and variance). While I do not fully appreciate correcting for biases, I am happy to know why the results did not match.
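For reference, the two definitions differ only in the divisor. A sketch in standard notation, where x_1 ... x_n are the input values and x̄ is their mean (the symbols are my own, not taken from the recipe):

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\sigma^2_{\text{biased}} = \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2, \qquad
s^2_{\text{unbiased}} = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2
```

The standard deviation is the square root of the variance in each case. Dividing by n - 1 (Bessel's correction) compensates for measuring deviations from the sample mean rather than the true mean. In Sheets and Excel, STDEV and VAR use the n - 1 form, while STDEVP and VARP use n.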

In your existing recipe, I did the following:

  • changed mean to simply def mean: add / length;
  • removed pow2 altogether and replaced its call with . * .
  • simplified your data with just an array of the same numbers

Here is where I ended up:

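# Arithmetic mean of an array of numbers.
def mean: add / length;
# Biased (population) variance: divide the sum of squared deviations by n.
def variance_biased: mean as $mean | map(. - $mean | . * .) | mean;
def stdev_biased: variance_biased | sqrt;
# Unbiased (sample) variance: divide by n - 1; needs at least two values.
def variance_unbiased: mean as $mean | map(. - $mean | . * .) | add / (length - 1);
def stdev_unbiased: variance_unbiased | sqrt;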
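
As a quick sanity check of these definitions, here is the arithmetic worked by hand for an illustrative input, [2, 4, 4, 4, 5, 5, 7, 9] (my own example numbers, not the data from the recipe):

```latex
\bar{x} = \frac{2+4+4+4+5+5+7+9}{8} = \frac{40}{8} = 5 \\
\sum_{i=1}^{8} (x_i - \bar{x})^2 = 9+1+1+1+0+0+4+16 = 32 \\
\text{biased: } \frac{32}{8} = 4, \quad \sqrt{4} = 2 \\
\text{unbiased: } \frac{32}{7} \approx 4.571, \quad \sqrt{32/7} \approx 2.138
```

So for this input stdev_biased should give 2 and stdev_unbiased roughly 2.138, with the latter being the figure that STDEV in Sheets or Excel reports.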

Owner

remy commented May 12, 2020

I'll update my post with biased - thank you 👍
