I would appreciate it if you could guide me on this problem:
I want to estimate the variance of a classification model, given a dataset.
I propose the following procedures:
A) Reserve a small subset (TST) of the dataset for variance estimation. From the rest, draw thousands of bootstrap samples. For each sample, do the following:
-A.1- train the model
-A.2- classify each observation in TST
Then find the ratio of correct labels to the number of bootstrap samples.
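
For concreteness, procedure A could be sketched roughly as follows. This is only an illustration under stated assumptions: the 1-D nearest-centroid classifier, the synthetic two-class data, and every function name are hypothetical stand-ins, and the ratio is read here as "per TST observation, the fraction of bootstrap models that label it correctly".

```python
import random
from statistics import mean, pvariance


def train_centroid(sample):
    """Hypothetical stand-in model: per-class mean of a single feature."""
    labels = {y for _, y in sample}
    return {lab: mean(x for x, y in sample if y == lab) for lab in labels}


def predict(centroids, x):
    """Assign x to the class whose centroid is nearest."""
    return min(centroids, key=lambda lab: abs(x - centroids[lab]))


def bootstrap_correct_ratio(train, tst, n_boot=1000, seed=0):
    """Return (per-observation correct ratio, per-model accuracy on TST)."""
    rng = random.Random(seed)
    correct = [0] * len(tst)
    accuracies = []
    for _ in range(n_boot):
        sample = rng.choices(train, k=len(train))        # draw with replacement
        model = train_centroid(sample)                   # step A.1
        hits = [predict(model, x) == y for x, y in tst]  # step A.2
        for i, hit in enumerate(hits):
            correct[i] += hit
        accuracies.append(sum(hits) / len(hits))
    ratios = [c / n_boot for c in correct]
    return ratios, accuracies


# Synthetic data (assumption): two overlapping Gaussian classes.
rng = random.Random(1)
data = [(rng.gauss(0.0, 1.0), 0) for _ in range(100)]
data += [(rng.gauss(2.0, 1.0), 1) for _ in range(100)]
rng.shuffle(data)
tst, train = data[:20], data[20:]

ratios, accuracies = bootstrap_correct_ratio(train, tst, n_boot=200)
print("per-observation correct ratio:", ratios)
print("spread of TST accuracy across models:", pvariance(accuracies))
```

A ratio near 0 or 1 means the bootstrap models almost always agree on that observation; a ratio near 0.5 means their predictions for it are unstable.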
B) Initially, I had considered dropping, in step A.1, any model that had a large training error. I did not see
any reason to include such models in estimating variance, as they would never be viable for any classification exercise.
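
The filtering idea in B could look something like the sketch below. The threshold classifier, the `max_train_error` cutoff, and all names are hypothetical, and training error is assumed to be measured on the bootstrap sample the model was fitted to. Note that once models are dropped, whatever is estimated afterwards describes the combined "train, then discard if the error is too large" procedure rather than the raw learner.

```python
import random


def fit_threshold(sample):
    """Hypothetical model: a cut-off at the mean of the single feature."""
    return sum(x for x, _ in sample) / len(sample)


def predict_threshold(threshold, x):
    """Label 1 if x lies above the cut-off, else 0."""
    return int(x > threshold)


def training_error(predict, model, sample):
    """Fraction of the training sample that the model mislabels."""
    return sum(predict(model, x) != y for x, y in sample) / len(sample)


def bootstrap_models(train, fit, predict, n_boot=1000,
                     max_train_error=None, seed=0):
    """Fit one model per bootstrap sample; optionally drop poor fits."""
    rng = random.Random(seed)
    kept, dropped = [], 0
    for _ in range(n_boot):
        sample = rng.choices(train, k=len(train))  # draw with replacement
        model = fit(sample)
        too_poor = (max_train_error is not None and
                    training_error(predict, model, sample) > max_train_error)
        if too_poor:
            dropped += 1
        else:
            kept.append(model)
    return kept, dropped


# Synthetic data (assumption): two overlapping Gaussian classes.
rng = random.Random(1)
train = [(rng.gauss(0.0, 1.0), 0) for _ in range(80)]
train += [(rng.gauss(2.0, 1.0), 1) for _ in range(80)]

kept_all, dropped_all = bootstrap_models(
    train, fit_threshold, predict_threshold, n_boot=200)
kept, dropped = bootstrap_models(
    train, fit_threshold, predict_threshold, n_boot=200, max_train_error=0.15)
print("kept without filter:", len(kept_all), "kept with filter:", len(kept))
```

Comparing the two runs shows how many bootstrap models the filter removes, which is worth reporting alongside any variance figure computed from the surviving models.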
I would appreciate it if you could help me understand:
- whether these are valid variance estimation procedures for classification
- whether there are published benchmark procedures (I spent many days searching and lost my way)
- how to estimate bias. I have not found any method; do you have any recommendations?
Bias is indicated; that is, I can detect the presence of bias but am unable to determine its magnitude.