Advice regarding analysis plan for a study

S_Chakraborty · December 8, 2022, 7:55am

Background: Diabetes is a common comorbidity in cancer patients, and we have observed that patients with diabetes tend to have a higher risk of side effects during chemoradiation for carcinoma of the cervix. In fact, quite a few of these patients have interruptions in chemotherapy related to electrolyte imbalances, infections, etc.

We are planning a cross-sectional analysis of patients with carcinoma of the cervix who undergo radiotherapy, to evaluate the association between the presence of diabetes and the acute toxicity profile. We have a potential case list of about 300-350 patients, of whom we anticipate about 20%-30% will have diabetes at baseline.

The problem: Diabetes is a comorbidity associated with multiple other factors, and some studies even postulate that these patients tend to have more aggressive disease. We therefore think that these patients may also be receiving radiotherapy to a larger volume simply because they present with a more advanced stage of the disease. Additionally, the effect of diabetes on the kidney enhances the nephrotoxic effects of chemotherapy. Given the cross-sectional design, it will probably be impossible to ascertain whether diabetes per se is causing the increased toxicity. However, if the study suggests that there is an association, we can plan a further cohort study evaluating a more rigorous supportive care program for such patients.

Given the biases inherent in the design, we thought of using a regression model to evaluate the association. The model of choice would be an ordinal regression model, given that toxicities are expressed in grades and have varied effects on the patient. The issue is that patients can have multiple toxicities, which often occur as symptom clusters (for example, diarrhea, vomiting, and hyponatremia usually cluster together). If we simply lump all toxicities together, the model will possibly lose a lot of power (for example, an ordinal regression where overall patient toxicity is graded on a 0 to 5 scale, where 0 represents no toxicity, 1 mild toxicity only, 2 moderate toxicities, 3 any severe toxicity, 4 any toxicity needing hospitalization, and 5 death related to toxicity).

If, on the other hand, we choose to add up the grades of the individual toxicities, we run into a situation where a patient who experiences a single grade 4 toxicity has the same score as a patient who has four grade 1 toxicity events. In essence, the sum does not capture the severity of the toxicity.
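To make the two options concrete, here is a minimal sketch in Python. The patient records and toxicity names are made up for illustration, and taking the worst grade is only a simplified stand-in for the lumped overall scale described above.

```python
def worst_grade(grades):
    """Lumped score: the highest grade among a patient's toxicities (0 if none)."""
    return max(grades.values(), default=0)


def summed_grade(grades):
    """Additive score: the sum of the grades of all toxicities."""
    return sum(grades.values())


# Hypothetical patients, invented purely for illustration.
patient_a = {"hyponatremia": 4}  # one grade 4 event
patient_b = {"diarrhea": 1, "vomiting": 1, "hyponatremia": 1, "dermatitis": 1}  # four grade 1 events

for label, grades in [("A", patient_a), ("B", patient_b)]:
    print(label, "worst grade:", worst_grade(grades), "summed grade:", summed_grade(grades))
```

The summed score cannot tell the two patients apart, while the worst-grade score separates them but ignores how many toxicities each patient had.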

I would like to know from the house how we can overcome this issue. Any advice on how to meaningfully reflect the toxicity burden would be helpful for a better model estimate.

f2harrell · December 8, 2022, 11:43pm

I don’t know how to meet your goals without doing a somewhat in-depth survey of 10-20 experts, in which you use some reliable process to score all the components, say by predicting an overall ordinal utility/outcome severity judged by the experts. Then you have a new ordinal or metric scale that makes the (probably) right tradeoffs.
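As a rough illustration of that idea, the sketch below fits weights that predict a mean expert severity rating from the number of toxicities a patient has at each grade. Everything in it is an assumption made for illustration: the profiles, the ratings, the 0-10 rating scale, and the use of ordinary least squares in place of an ordinal model. A real survey would need a more careful elicitation process and model.

```python
import numpy as np

# Hypothetical patient profiles shown to experts:
# columns = number of toxicities at grade 1, grade 2, grade 3, grade 4.
profiles = np.array([
    [1, 0, 0, 0],
    [4, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [2, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 2, 0, 1],
], dtype=float)

# Hypothetical mean expert severity rating (0-10) for each profile.
# These numbers are invented; they are not survey results.
expert_rating = np.array([1.0, 3.0, 2.5, 5.0, 8.0, 4.0, 6.0, 9.5])

# Least-squares weight per grade (no intercept), a simple stand-in for
# an ordinal model predicting the expert-judged overall severity.
weights, *_ = np.linalg.lstsq(profiles, expert_rating, rcond=None)


def burden_score(grade_counts):
    """Expert-weighted toxicity burden for a vector of counts per grade."""
    return float(np.asarray(grade_counts, dtype=float) @ weights)


single_grade4 = burden_score([0, 0, 0, 1])
four_grade1 = burden_score([4, 0, 0, 0])
print("weights per grade:", np.round(weights, 2))
print("one grade 4 event:", round(single_grade4, 2))
print("four grade 1 events:", round(four_grade1, 2))
```

With these made-up ratings, a single grade 4 event scores higher than four grade 1 events, which is the kind of tradeoff a simple sum of grades cannot express.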

S_Chakraborty · December 9, 2022, 2:10am

Thank you, Prof. Let us see if we can get this organized.