Evidence grading - how to weight causal inference from retrospective vs prospective data capture?

A commonly cited paper for interpreting evidence from predictive biomarker studies is Simon et al., Use of Archived Specimens in Evaluation of Prognostic and Predictive Biomarkers. JNCI 2009; 101: 1446-52. The authors have a very good discussion of the differences between retrospective and prospective studies (see the section headed “Prospective vs Retrospective Studies: A Matter of Semantics”). However, I have found that some people - including some influential review bodies - classify most studies with retrospective data capture as Category D (see Table 1); it is unclear whether they assessed if there was a pre-specified protocol, if the study was registered, or if a statistical analysis plan was developed. I wonder whether people feel this landmark paper needs an update to provide clarification. Two of several topics that might be worth considering: is it possible to have a high-quality study where the data were collected in the past? And could an update describe some of the newer methods being advanced for assessing causal inference for predictive biomarkers/algorithms?

I don’t know how widespread this is in other areas, but GRADE is quite a standard in health care and public health. This paper outlines the current approach to grading the strength of the evidence: