r - Kolmogorov-Smirnov or a Chi-Square test for a distribution? -

February 15, 2015

I used model fitting to fit a negative binomial distribution to my discrete data. As a final step, it looks like I need to perform a Kolmogorov-Smirnov test to determine whether the model fits the data well. All the references I can find talk about using the test for normally distributed continuous data. Can someone tell me whether this can be done in R for data that is not normally distributed and is discrete? (Even a chi-square test should do, I'm guessing, but please correct me if I'm wrong.)

Update:

So I found that the vcd package contains a function goodfit that can be used for this purpose in the following way:

library(vcd)

# define data
data <- c(67, 81, 93, 65, 18, 44, 31, 103, 64, 19, 27, 57, 63, 25, 22, 150,
          31, 58, 93, 6, 86, 43, 17, 9, 78, 23, 75, 28, 37, 23, 108, 14, 137,
          69, 58, 81, 62, 25, 54, 57, 65, 72, 17, 22, 170, 95, 38, 33, 34, 68,
          38, 117, 28, 17, 19, 25, 24, 15, 103, 31, 33, 77, 38, 8, 48, 32, 48,
          26, 63, 16, 70, 87, 31, 36, 31, 38, 91, 117, 16, 40, 7, 26, 15, 89,
          67, 7, 39, 33, 58)

gf <- goodfit(data, type = "nbinomial", method = "minchisq")

plot(gf)

But after the gf <- ... step, R complains, saying:

Warning messages:
1: In pnbinom(q, size, prob, lower.tail, log.p) : NaNs produced
2: In pnbinom(q, size, prob, lower.tail, log.p) : NaNs produced
3: In pnbinom(q, size, prob, lower.tail, log.p) : NaNs produced

And when I plot, it complains:

Error in xy.coords(x, y, xlabel, ylabel, log) :
  'x' is a list, but does not have components 'x' and 'y'

I am not sure what is happening, because if I set data to the following:

data <- rnbinom(200, size = 1.5, prob = 0.8)

everything works fine. Any suggestions?

A KS test is for continuous variables only, plus you have to specify the distribution you are testing against. If you still wanted to do it, it would look like this:

ks.test(data, pnbinom, size=100, prob=0.8)

It compares the empirical cumulative distribution function of your data against the specified one (whether that makes sense depends on your data). You would have to choose the parameters size and prob based on theoretical considerations; the test is not valid if you estimate the parameters from the same data.
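To make the comparison concrete, here is a minimal sketch of the quantity a KS-type comparison looks at: the largest gap between the empirical CDF and a fully specified negative binomial CDF. The data are simulated, and size = 1.5 and prob = 0.8 are illustrative values assumed to be fixed in advance, not estimated from the sample. Because both functions are step functions that jump only at non-negative integers, it is enough to evaluate the gap at the integers 0, 1, ..., max(x).

```r
set.seed(1)

# Simulated counts; size and prob are illustrative assumptions
x <- rnbinom(200, size = 1.5, prob = 0.8)

# Empirical CDF of the sample
Fn <- ecdf(x)

# Evaluate both CDFs on the integer support and take the largest gap
k <- 0:max(x)
gap <- abs(Fn(k) - pnbinom(k, size = 1.5, prob = 0.8))
max_gap <- max(gap)
max_gap
```

This only illustrates the statistic. The p-value reported by ks.test() relies on a continuous reference distribution, so it should be read with caution for counts.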

Your problem with goodfit() might have to do with your data: are you sure these are counts? barplot(table(data)) does not look like it's approximately following a negative binomial distribution; compare, e.g., barplot(table(rnbinom(200, size = 1.5, prob = 0.8)))

Finally, I'm not sure whether the approach of doing a null-hypothesis test after fitting is appropriate. You may want to look into quantitative fit measures beyond / based on $\chi^2$, of which there are many (RMSEA, SRMR, ...).
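For reference, a chi-square goodness-of-fit check for binned counts can be sketched by hand in base R. This is not what goodfit() does internally. It is a rough illustration under stated assumptions: the data are simulated (size = 2 and mu = 10 are made up for the example), the parameters are estimated by the method of moments, and the bin boundaries are chosen arbitrarily. Reducing the degrees of freedom by the number of estimated parameters is the usual convention, but with moment estimates the chi-square reference distribution is only approximate.

```r
set.seed(42)

# Simulated counts; size = 2 and mu = 10 are illustrative assumptions
x <- rnbinom(500, size = 2, mu = 10)

# Method-of-moments estimates (only defined when var(x) > mean(x))
mu_hat   <- mean(x)
size_hat <- mu_hat^2 / (var(x) - mu_hat)

# Bin the counts: 0-4, 5-9, 10-14, 15-19, 20-29, 30 and above
breaks   <- c(-1, 4, 9, 14, 19, 29, Inf)
observed <- as.vector(table(cut(x, breaks)))

# Expected bin probabilities from the fitted negative binomial CDF
cdf      <- pnbinom(breaks[-1], size = size_hat, mu = mu_hat)
probs    <- diff(c(0, cdf))
expected <- length(x) * probs

# Pearson statistic; df = number of bins - 1 - 2 estimated parameters
stat    <- sum((observed - expected)^2 / expected)
df      <- length(observed) - 1 - 2
p_value <- pchisq(stat, df = df, lower.tail = FALSE)
```

Bins with very small expected counts should be merged before relying on the result.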
