Is there a way to create a numeric variable from measurements with different ranges?

Patricia_Nunes · November 10, 2021, 10:02pm

I am working with a dataset that contains limnological parameters. I stacked turbidity-related measures in one column to assess their relationship with a numerical response variable. This column is, therefore, formed of values in different units and consequently different ranges. Is there a way to standardize this column into a numerical variable called “Turbidity” that comprises all the measurements in different units?

I cannot simply pool the measurements in their different units, because that would create a range from 0.1 to 1000 in absolute numbers, while in some turbidity units a value of 10 is already high. I also don't want to use each unit separately, because some have only a few measurements, so I'm trying to make my dataset more robust.

Reprex

values	unit	abundance
0.5	x	500
10	x	30
50	y	50
100	y	100
30	z	20
60	z	60
500	z	80

pmbrown · November 11, 2021, 3:38pm

Unstack them, convert each to z-scores, then use the average of the z-scores (across the non-missing limnological params) as “turbidity”?

Patricia_Nunes · November 11, 2021, 4:33pm

That's exactly what I need. How can I convert the data to z-scores?

pmbrown · November 11, 2021, 4:38pm

What stats software are you using? In SAS it would be PROC STDIZE. Otherwise it could be done with a calculation within a DATA step.

Patricia_Nunes · November 11, 2021, 4:40pm

I am using R. I know that some multivariate statistical approaches do this, but I don't have enough data to use them.
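
For reference, here is a minimal base R sketch of the z-score step suggested above, using the reprex from the first post. It standardizes each value within its own unit, so every unit ends up with mean 0 and standard deviation 1. Because each row in the reprex has only one measurement, this gives the same result as unstacking and averaging the non-missing z-scores. The data frame name df and the new column name turbidity_z are made up for illustration.

```r
# Example data copied from the reprex in the first post
df <- data.frame(
  values    = c(0.5, 10, 50, 100, 30, 60, 500),
  unit      = c("x", "x", "y", "y", "z", "z", "z"),
  abundance = c(500, 30, 50, 100, 20, 60, 80)
)

# z-score within each unit: (value - unit mean) / unit standard deviation.
# ave() applies FUN to the values of each unit and returns the results
# in the original row order.
df$turbidity_z <- ave(df$values, df$unit,
                      FUN = function(v) (v - mean(v)) / sd(v))

df
```

Two caveats. A unit with only two measurements will always produce z-scores of about -0.71 and 0.71, so it carries little information, and a unit with a single measurement gives NA because its standard deviation is undefined. Standardizing within unit also assumes that the samples measured in each unit cover a comparable range of real turbidity conditions, which may not hold for this dataset.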