r - Calculate significance of correlation in grouped data with dplyr

I have grouped data for which I would like to compute several basic inferential statistics.

library(tidyverse)

df <- data.frame(x=runif(50, min = 0, max = 25),y=runif(50, min = 10, max = 25), group=rep(0:1,25))

df %>%
  group_by(group) %>%
  summarize(cor(x,y))

Here I can easily get the correlation, but I also need to check its statistical significance. Unfortunately, functions like cor.test() do not work directly inside summarize() in dplyr. Is there an easy workaround?

Could this be what you want?

df %>%
  group_by(group) %>%
  summarize(p_value = cor.test(x, y)[["p.value"]])

The thing is that cor.test() returns a list (an object of class "htest") and not a single value, so you need to pick out the element of the list that you are interested in, here p.value.
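If you want both the correlation coefficient and its p-value without calling cor.test() twice per group, one possible approach is to store the whole test result in a list column and pull the pieces out afterwards. This is only a sketch: the column names test, estimate and p_value are made up for illustration, and set.seed() is added just so the example is reproducible.

```r
library(dplyr)

set.seed(1)  # assumption: fixed seed, not part of the original question
df <- data.frame(
  x = runif(50, min = 0, max = 25),
  y = runif(50, min = 10, max = 25),
  group = rep(0:1, 25)
)

result <- df %>%
  group_by(group) %>%
  # Run cor.test() once per group and keep the full "htest" object
  summarize(test = list(cor.test(x, y))) %>%
  mutate(
    # Extract the correlation estimate and the p-value from each test
    estimate = vapply(test, function(t) unname(t$estimate), numeric(1)),
    p_value  = vapply(test, function(t) t$p.value, numeric(1))
  ) %>%
  select(-test)

result
```

The estimate column should match what cor(x, y) gives per group, since cor.test() uses Pearson's correlation by default.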
