# Another point against automatic selection?

This post describes a potential addition to the case against automatic variable selection [1,2]. Please consider two bottom-line questions up front.

1. Do any other empirical examples exist of automatic selection not being tractable for a series of regression equations involving transformed Y variables?
2. Do examples exist of other forms of multivariate regression or longitudinal data analysis producing curves similar to acceptability curves?

## Introduction to net-benefit regression
A useful way to present economic evaluation results is through cost-effectiveness acceptability curves, since they consider a range of ceiling ratios [3-5]. Regression methods for economic evaluation are proliferating [6-8], and net-benefit regression (NBR) can generate these curves using p-values for the coefficient on a binary X that represents intervention versus comparison groups [6]. NBR uses the net-benefit statistic as its dependent variable [9]. This statistic incorporates a constant called the ceiling ratio, or decision threshold, which represents a valuation of each unit of effect. Because consensus does not exist about ceiling ratio definitions [10,11], acceptability curves provide a useful method for presenting results [3,4] (and can have different shapes [5]). Generating an acceptability curve involves fitting a separate regression equation for each point along the curve, each with a different value of the ceiling ratio [6].
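
To make the mechanics concrete, here is a minimal sketch in Python of how an acceptability curve can be traced with NBR. It assumes the usual net monetary benefit form, NB_i = λ·E_i − C_i, where λ is the ceiling ratio, and it regresses NB on a single binary treatment indicator with no covariates. The data are simulated, the function name `nbr_prob_cost_effective` is my own, and the probability uses a normal approximation to the one-sided p-value, so treat it as an illustration rather than the procedure used in [6] or [21].

```python
import random
from statistics import NormalDist, mean


def nbr_prob_cost_effective(cost, effect, treated, ceiling_ratio):
    """Fit NB_i = a + b * treated_i by OLS; return (b, approx. P(b > 0)).

    With one binary regressor, the OLS slope b is the difference in mean
    net benefit between arms. The probability is 1 minus a one-sided
    p-value under a normal approximation (an assumption of this sketch).
    """
    # Net-benefit statistic for each participant at this ceiling ratio.
    nb = [ceiling_ratio * e - c for c, e in zip(cost, effect)]
    nb1 = [v for v, t in zip(nb, treated) if t == 1]
    nb0 = [v for v, t in zip(nb, treated) if t == 0]
    m1, m0 = mean(nb1), mean(nb0)
    b = m1 - m0
    # Pooled residual variance, as in OLS with homoskedastic errors.
    ssr = sum((v - m1) ** 2 for v in nb1) + sum((v - m0) ** 2 for v in nb0)
    s2 = ssr / (len(nb) - 2)
    se = (s2 * (1 / len(nb1) + 1 / len(nb0))) ** 0.5
    return b, NormalDist().cdf(b / se)


# Simulated trial data (all numbers are made up for illustration).
rng = random.Random(0)
n = 400
treated = [i % 2 for i in range(n)]
cost = [rng.gauss(100 + 50 * t, 30) for t in treated]
effect = [rng.gauss(0.50 + 0.10 * t, 0.20) for t in treated]

# One regression per ceiling ratio; each gives one point on the curve.
ceiling_ratios = [0, 250, 500, 750, 1000, 1500, 2000]
curve = [
    (lam, nbr_prob_cost_effective(cost, effect, treated, lam)[1])
    for lam in ceiling_ratios
]
for lam, p in curve:
    print(f"ceiling ratio {lam:>5}: P(cost-effective) = {p:.3f}")
```

Each point on the curve comes from its own regression, which is why the specification of the right-hand side matters: if it changes from one ceiling ratio to the next, the points are no longer estimates from comparable models.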

## My finding and question
My research illustrates a point that supplements other rationales for not using automatic selection [1,2] and that may contribute to current discussion [12]. Backward stepwise selection procedures generate different sets of independent variables for the equations at different points of the curve, at least in my data, so that each point is inconsistent with the others. The resulting curve resembles white noise rather than an interpretable irregularity [5]. I think that this finding relates to the net-benefit statistic involving a constant. Do any empirical examples exist of automatic selection not being tractable for a series of regression equations involving transformed Y variables in other contexts?
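
One possible mechanism can be sketched on simulated data. This is not a claim about the data behind [21]. If NB = λ·E − C, then the coefficient of any covariate in the NB equation is λ times its coefficient for effect minus its coefficient for cost, so which covariates look "significant" can change with λ. The Python sketch below runs a simple backward elimination at several ceiling ratios. The covariate names (`age`, `severity`, `noise`), the 0.05 threshold, the rule that the intercept and treatment indicator are always retained, and the normal approximation for p-values are all assumptions made for illustration.

```python
import random
from statistics import NormalDist


def invert(A):
    """Invert a small square matrix by Gauss-Jordan elimination."""
    k = len(A)
    M = [row[:] + [float(i == j) for j in range(k)] for i, row in enumerate(A)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        for r in range(k):
            if r != col:
                f = M[r][col]
                M[r] = [a - f * b for a, b in zip(M[r], M[col])]
    return [row[k:] for row in M]


def ols_pvalues(X, y):
    """Two-sided p-values for OLS coefficients (normal approximation)."""
    n, k = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    beta = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    resid = [yi - sum(b * v for b, v in zip(beta, r)) for r, yi in zip(X, y)]
    s2 = sum(e * e for e in resid) / (n - k)
    z = [beta[i] / (s2 * inv[i][i]) ** 0.5 for i in range(k)]
    return [2 * (1 - NormalDist().cdf(abs(v))) for v in z]


def backward_select(data, y, names, keep=("const", "treated"), alpha=0.05):
    """Drop the least significant covariate until all remaining p <= alpha."""
    active = list(names)
    while True:
        X = [[row[name] for name in active] for row in data]
        pvals = dict(zip(active, ols_pvalues(X, y)))
        candidates = [name for name in active if name not in keep]
        if not candidates:
            return active
        worst = max(candidates, key=pvals.get)
        if pvals[worst] <= alpha:
            return active
        active.remove(worst)


# Simulated data: age drives cost only, severity drives effect only.
rng = random.Random(1)
data, cost, effect = [], [], []
for i in range(500):
    row = {"const": 1.0, "treated": float(i % 2), "age": rng.gauss(0, 1),
           "severity": rng.gauss(0, 1), "noise": rng.gauss(0, 1)}
    cost.append(100 + 50 * row["treated"] + 20 * row["age"] + rng.gauss(0, 30))
    effect.append(0.5 + 0.1 * row["treated"] + 0.05 * row["severity"]
                  + rng.gauss(0, 0.2))
    data.append(row)

names = ["const", "treated", "age", "severity", "noise"]
ceiling_ratios = [0, 500, 1000, 2500, 5000]
selected = {}
for lam in ceiling_ratios:
    nb = [lam * e - c for c, e in zip(cost, effect)]
    selected[lam] = backward_select(data, nb, names)
    print(f"ceiling ratio {lam:>5}: retained {selected[lam]}")
```

In this setup the cost-related covariate tends to be retained at low ceiling ratios and the effect-related covariate at high ones, so the adjustment set behind each point of the curve need not be the same. Whether this is what happens in real data is exactly the open question.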

## Clues I have found
I have found that some monotonic transformations involve π as a constant [13], although π would not vary along the axis of a curve. Van Belle et al. [14] cite several examples of multivariate regression, such as systolic and diastolic blood pressure, but the outcomes in those examples are not linked by a transformation. Do examples exist of other forms of multivariate regression or longitudinal data analysis producing curves similar to acceptability curves?

## Relevance
Ioannidis argues that most published research findings are false, and that the risk is greater for studies with flexible methodologies [15]. Given the array of methods available for economic evaluation [3,4], and variation in how they are applied in practice [16-19], this point against automatic selection may increase in relevance as regression-based approaches proliferate. Past trends and perspectives on the uptake of new statistical methods in the medical literature [20] provide important context.

Thank you for any feedback. If this finding is new, I would like to publish it, followed by a corrected version of the main result from the research [21]. I did not use automatic selection for these acceptability curves, but there is a problem with my use of an instrumental variable (a separate topic). My revised conceptual framework also applies to my other paper [22] given its risk of bias due to bivariable screening [23].

## References

1. Harrell Jr FE (1996) What are some of the problems with stepwise regression? https://www.stata.com/support/faqs/statistics/stepwise-regression-problems/. Accessed 10/31/2018

2. Harrell Jr FE (2015) Regression Modeling Strategies: With Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis. Springer-Verlag, New York

3. Drummond MF, Sculpher MJ, Claxton K, Stoddart GL, Torrance GW (2015) Methods for the economic evaluation of health care programmes. 4th edn. Oxford University Press, Oxford

4. Neumann PJ, Sanders GD, Russell LB, Siegel JE, Ganiats TG (2017) Cost-effectiveness in health and medicine. 2nd edn. Oxford University Press, New York

5. Fenwick E, O’Brien BJ, Briggs A (2004) Cost‐effectiveness acceptability curves–facts, fallacies and frequently asked questions. Health Econ 13 (5):405-415

6. Hoch JS, Briggs AH, Willan AR (2002) Something old, something new, something borrowed, something blue: a framework for the marriage of health econometrics and cost‐effectiveness analysis. Health Econ 11 (5):415-430

7. Kreif N, Grieve R, Sadique MZ (2013) Statistical methods for cost‐effectiveness analyses that use observational data: A critical appraisal tool and review of current practice. Health Econ 22 (4):486-500

8. Willan AR, Briggs AH, Hoch JS (2004) Regression methods for covariate adjustment and subgroup analysis for non‐censored cost‐effectiveness data. Health Econ 13 (5):461-475

9. Stinnett AA, Mullahy J (1998) Net health benefits: a new framework for the analysis of uncertainty in cost-effectiveness analysis. Med Decis Making 18 (2 Suppl):S68-S80

10. Claxton K, Martin S, Soares M, Rice N, Spackman E, Hinde S, Devlin N, Smith PC, Sculpher M (2015) Methods for the estimation of the National Institute for Health and Care Excellence cost-effectiveness threshold. Health Technol Assess 19 (14):1

11. Shillcutt SD, Walker DG, Goodman CA, Mills AJ (2009) Cost effectiveness in low-and middle-income countries: A review of debates surrounding decision rules. Pharmacoeconomics 27 (11):903-917

12. Heinze G, Wallisch C, Dunkler D (2018) Variable selection–A review and recommendations for the practicing statistician. Biom J 60 (3):431-449

13. McCune B, Grace JB, Urban DL (2002) Chapter 9. Data transformations. In: Analysis of ecological communities. Wild Blueberry Media LLC

14. Van Belle G, Fisher LD, Heagerty PJ, Lumley T (2004) Biostatistics: a methodology for the health sciences, vol 519. John Wiley & Sons, Hoboken NJ

15. Ioannidis JP (2005) Why most published research findings are false. PLoS medicine 2 (8):e124

16. Walker D, Fox‐Rushby JA (2000) Economic evaluation of communicable disease interventions in developing countries: a critical review of the published literature. Health Econ 9 (8):681-698

17. Neumann PJ, Greenberg D, Olchanski NV, Stone PW, Rosen AB (2005) Growth and quality of the Cost–Utility literature, 1976–2001. Value Health 8 (1):3-9

18. Neumann PJ, Thorat T, Shi J, Saret CJ, Cohen JT (2015) The changing face of the cost-utility literature, 1990–2012. Value Health 18 (2):271-277

19. Neumann PJ, Thorat T, Zhong Y, Anderson J, Farquhar M, Salem M, Sandberg E, Saret CJ, Wilkinson C, Cohen JT (2016) A systematic review of cost-effectiveness studies reporting cost-per-DALY averted. PLoS One 11 (12):e0168512

20. Altman D, Goodman S (1994) Transfer of technology from statistical journals to the biomedical literature: Past trends and future predictions. JAMA 272:129-132

21. Shillcutt SD, LeFevre AE, Fischer-Walker CL, Taneja S, Black RE, Mazumder S (2017) Cost-effectiveness analysis of the diarrhea alleviation through zinc and oral rehydration therapy (DAZT) program in rural Gujarat India: an application of the net-benefit regression framework. Cost Eff Resour Alloc 15 (1):9

22. Shillcutt SD, LeFevre AE, Fischer Walker CL, Taneja S, Black RE, Mazumder S (2016) Economic costs to caregivers of diarrhoea treatment among children below 5 in rural Gujarat India: findings from an external evaluation of the DAZT programme. Health Policy Plan 31 (10):1411-1422

23. Sun G-W, Shook TL, Kay GL (1996) Inappropriate use of bivariable analysis to screen risk factors for use in multivariable analysis. J Clin Epidemiol 49 (8):907-916