Inferences from summary data - without seeing the data for a Cox PH model, do chi square tests tell us anything?

DRG · July 24, 2023, 1:43pm

Hi folks - this might be a daft question. I asked something similar on Cross Validated yesterday and got a suggestion to post it here, so sorry in advance if this has been asked before. Consider a paper like this, which reports an association between artificial sweetener use and increased risk of cancer. The authors report using a Cox PH model for this predominantly female cohort, with a minimally adjusted model that has age and gender as covariates, and a more complicated one with many others. Focusing just on the former, the image shows some of the results they report:

The problem is, I don’t have access to the raw data, and I’m wondering about the limitations of what I can ascertain from summary statistics like the above. Without any raw data, I simply did a chi-squared test on the proportion of cancers per consumer group. This seems to suggest that the reported trend in the all-cancer subsection is driven by the lower consumer group: comparing non-consumers to higher consumers, the result is insignificant (p = 0.195). If this is the case, it might suggest a non-monotonic dose response, indicating either weird biology or spurious findings.
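For anyone wanting to reproduce this kind of check, here is a minimal sketch of a Pearson chi-squared test on a 2x2 table built only from event counts and group sizes. The function name `chi2_2x2` and the counts in the example call are hypothetical placeholders, not the paper's figures. The p-value uses the fact that, for one degree of freedom, the chi-squared tail probability equals `erfc(sqrt(x / 2))`. Note that a test like this ignores follow-up time, censoring and every covariate, all of which a Cox model uses.

```python
import math


def chi2_2x2(events_a, total_a, events_b, total_b):
    """Pearson chi-squared test (no continuity correction) for two groups.

    Returns (statistic, p_value) for the 2x2 table of events vs. non-events.
    """
    observed = [
        [events_a, total_a - events_a],
        [events_b, total_b - events_b],
    ]
    n = total_a + total_b
    row_totals = [total_a, total_b]
    col_totals = [events_a + events_b, n - events_a - events_b]

    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed[i][j] - expected) ** 2 / expected

    # With 1 degree of freedom the upper tail is erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


if __name__ == "__main__":
    # Hypothetical counts for illustration only: (cancers, group size).
    stat, p = chi2_2x2(10, 100, 20, 100)
    print(f"chi2 = {stat:.3f}, p = {p:.4f}")
```

To use it with published summary figures, pass the number of cases and the group size for each of the two consumer groups being compared.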

I also looked at breast cancer (presumably almost entirely female) to eliminate at least one variable (gender) and found the same pattern in the reported data, with the low consumption group driving any seeming effects. But this of course is just a basic analysis based on the figures I can crib. My question is: without access to the raw data behind a Cox PH model, can analysis of summary statistics like this tell you anything conclusive, or do you need the raw data itself to make inferences?

f2harrell · July 24, 2023, 1:48pm

“Insignificant” means less than you think, in general.

I’m not clear on what you think the value of an unadjusted test is here.

The most important issue is what is the set of variables used in the “fully adjusted” analysis, and was this set based on unbiased expert opinion? And did the authors get access to all of the potential sweetener use selection factors named by the experts?

DRG · July 24, 2023, 2:05pm

Absolutely agree with you re: significance, but the press coverage of this paper reports a hazard, which in part influenced IARC’s decision to label aspartame a Group 2B carcinogen. I have a whole host of personal misgivings about the methodology, but the reason I would check the unadjusted test is this: if the result isn’t spurious, I think one would expect a monotonic dose-response relationship, so that higher consumers should have higher risk if artificial sweeteners are driving the ostensible increase in cancers. So without any other information, I looked at these as unadjusted categoricals, and I wonder whether this lack of an apparent relationship can tell us anything, or whether we need the raw data to make inferences. Open to suggestions of course!
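A monotonic dose-response question on ordered groups can also be put as a single trend test rather than pairwise comparisons. The sketch below is a Cochran-Armitage trend test, which is a different technique from the pairwise chi-squared tests described above and is offered only as an illustration. The function name, the equally spaced scores and the example counts are all assumptions, and the test is just as unadjusted as the pairwise version.

```python
import math


def cochran_armitage_trend(events, totals, scores=None):
    """Cochran-Armitage test for a linear trend in proportions.

    events[i] and totals[i] are the case count and size of ordered group i.
    scores default to 0, 1, 2, ... (equally spaced, an assumption).
    Returns (z, two_sided_p).
    """
    if scores is None:
        scores = list(range(len(events)))
    n = sum(totals)
    p_bar = sum(events) / n  # pooled event proportion under the null

    # Score-weighted sum of observed minus expected events.
    t_stat = sum(s * (r - m * p_bar) for s, r, m in zip(scores, events, totals))

    # Null variance of t_stat.
    sum_ns = sum(m * s for s, m in zip(scores, totals))
    sum_ns2 = sum(m * s * s for s, m in zip(scores, totals))
    variance = p_bar * (1.0 - p_bar) * (sum_ns2 - sum_ns ** 2 / n)

    z = t_stat / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return z, p_value


if __name__ == "__main__":
    # Hypothetical counts for non-, lower and higher consumers.
    z, p = cochran_armitage_trend([10, 20, 30], [100, 100, 100])
    print(f"z = {z:.3f}, p = {p:.5f}")
```

A small p-value here only indicates a linear trend in crude proportions across the chosen scores; it says nothing about whether that trend survives adjustment.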

f2harrell · July 24, 2023, 3:37pm

Since aspartame use was not randomized, and people who use it are very likely to have different characteristics from those who don’t, and since selection bias can also be related to dose even among users, there is in my opinion no value in the unadjusted comparisons.
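To illustrate the point with a toy case, the sketch below compares a crude odds ratio with a Mantel-Haenszel odds ratio pooled over strata of a confounder. The counts are invented purely for illustration (an "older" and a "younger" stratum, with users concentrated among the older), and the function names are hypothetical. In this constructed example exposure has no effect within either stratum, yet the crude comparison shows a large apparent effect.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table.

    a: exposed cases, b: exposed non-cases,
    c: unexposed cases, d: unexposed non-cases.
    """
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled odds ratio over a list of (a, b, c, d) strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return numerator / denominator


def collapse(strata):
    """Sum the strata into one crude 2x2 table, ignoring the confounder."""
    return tuple(sum(cell) for cell in zip(*strata))


if __name__ == "__main__":
    # Invented counts: (exposed cases, exposed non-cases,
    #                   unexposed cases, unexposed non-cases).
    strata = [
        (80, 320, 20, 80),   # older: 20% risk whether exposed or not
        (5, 95, 20, 380),    # younger: 5% risk whether exposed or not
    ]
    print("crude OR:", odds_ratio(*collapse(strata)))
    print("Mantel-Haenszel OR:", mantel_haenszel_or(strata))
```

With summary tables alone there is no way to do this kind of stratification unless the paper reports counts by stratum, which is why the crude comparison cannot separate an exposure effect from selection.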

DRG · July 24, 2023, 4:14pm

Thank you for that! So I think my best bet is to request the raw data and then examine that to see what holds. Now onto the fun problem of trying to get authors to furnish you with their data!

f2harrell · July 24, 2023, 7:03pm

But first find out which variables were adjusted for and how the set was chosen.