How does fsrftest calculate the p-value?

Isaiah on 18 Dec 2023
Edited: Ive J on 8 Jan 2024
I am trying to understand how fsrftest works in MATLAB. From the documentation, I understand that it uses an F-test of a null hypothesis against an alternative hypothesis for each feature, and that the resulting p-value is then used to determine the feature's importance. As far as I can tell, the p-value is not compared with a significance level, so the function does not actually reject or fail to reject either hypothesis; it just uses the p-value to rank features.
My question is: how is the p-value calculated? Is the process the same as in ANOVA?

Accepted Answer

Ive J on 8 Jan 2024
Edited: Ive J on 8 Jan 2024
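At the end of the documentation page you can see that it uses -log(p) as the score to rank features, so no significance level is involved. And yes, it is the same as ANOVA (to be precise, it's a GLM). Note that the NumBins argument is used to bin continuous features. In the example below, the overall F-test p-value from fitlm for each binary predictor matches exp(-score) from fsrftest: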
n = 100; % sample size
data = table;
data.BMI = randi([18, 50], n, 1);
% bin BMI into two categories
med_bmi = median(data.BMI);
idx = data.BMI > med_bmi;
data.BMI(idx) = 1;
data.BMI(~idx) = 0;
data.Sex = randi([0, 1], n, 1);
data.Target = randn(n, 1);
mdl_bmi = fitlm(data(:, ["BMI", "Target"]))
mdl_bmi =
Linear regression model:
    Target ~ 1 + BMI

Estimated Coefficients:
                   Estimate       SE        tStat      pValue
                   _________    _______    ________    _______
    (Intercept)      0.04267    0.13963     0.30559    0.76056
    BMI            -0.067441    0.19746    -0.34153    0.73343

Number of observations: 100, Error degrees of freedom: 98
Root Mean Squared Error: 0.987
R-squared: 0.00119, Adjusted R-Squared: -0.009
F-statistic vs. constant model: 0.117, p-value = 0.733
mdl_sex = fitlm(data(:, ["Sex", "Target"]))
mdl_sex =
Linear regression model:
    Target ~ 1 + Sex

Estimated Coefficients:
                   Estimate      SE        tStat      pValue
                   ________    _______    ________    _______
    (Intercept)    -0.10768    0.14984    -0.71864    0.47407
    Sex             0.20462    0.19847       1.031    0.30509

Number of observations: 100, Error degrees of freedom: 98
Root Mean Squared Error: 0.983
R-squared: 0.0107, Adjusted R-Squared: 0.000635
F-statistic vs. constant model: 1.06, p-value = 0.305
% fsrftest scores are -log(p), so exp(-score) recovers the p-values (order: BMI, Sex)
[~, sc] = fsrftest(data, "Target", "NumBins", 2);
p = exp(-sc)
p = 1×2
0.7334 0.3051
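To make the ANOVA connection explicit, here is a minimal sketch (not necessarily how fsrftest is implemented internally) that computes the one-way ANOVA F-statistic by hand for a single two-level predictor and takes the upper-tail F probability as the p-value. The variable names, the fixed seed, and the simulated data are assumptions for illustration only. The manual value should equal the anova1 result; the fsrftest value is expected to agree as well, assuming a 0/1 predictor ends up in two bins with NumBins set to 2, as it did in the example above.

```matlab
% Sketch: one-way ANOVA p-value by hand for one binary predictor
rng(0);                               % fixed seed (assumption, for repeatability)
n = 100;
x = randi([0, 1], n, 1);              % hypothetical two-level predictor
y = randn(n, 1);                      % response, unrelated to x by construction

[grp, levels] = findgroups(x);        % group index per observation
k  = numel(levels);                   % number of groups
gm = splitapply(@mean,  y, grp);      % group means
ng = splitapply(@numel, y, grp);      % group sizes

ssb = sum(ng .* (gm - mean(y)).^2);   % between-group sum of squares
ssw = sum((y - gm(grp)).^2);          % within-group sum of squares
F   = (ssb / (k - 1)) / (ssw / (n - k));
p_manual = fcdf(F, k - 1, n - k, 'upper');   % upper-tail F probability

% Cross-checks against built-in routines
p_anova = anova1(y, x, 'off');        % one-way ANOVA without the figure
tbl = table(x, y, 'VariableNames', ["X", "Target"]);
[~, sc] = fsrftest(tbl, "Target", "NumBins", 2);
p_fsr = exp(-sc);                     % undo the -log(p) score

disp([p_manual, p_anova, p_fsr])
```

With only two groups this F-test is equivalent to the "F-statistic vs. constant model" that fitlm reports, which is why the fitlm and fsrftest p-values line up in the answer.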

Release

R2023b
