Is equivalence testing appropriate for comparing averaging methods based on the same data points?

I am conducting a study on the use of the recovery rate constant (k) of muscle metabolism, measured by near-infrared spectroscopy (NIRS) after exercise, to assess muscle mitochondrial capacity in vivo. The procedure involves applying a series of short ischemic cuff occlusions after muscle activation to measure the rate constant of recovery from a high to a low metabolic rate: an exponential function is fitted to 10-20 data points and k is estimated from the fit. Different NIRS variables, such as tissue saturation index (in %) or deoxygenated haemoglobin (in arbitrary units), can be used to obtain k.

As part of the study, I performed the measurement procedure twice on each individual within the same testing session. I am considering three different averaging methods for k:

  1. the mean of the two k values obtained from the two separate exponential functions (one per measurement procedure)

  2. the single k value obtained by fitting one exponential function to all the data points collected from both measurement procedures

  3. the k value obtained from a single exponential function constructed by taking the mean of the pairs of data points from both measurement procedures according to their order of acquisition within each measurement procedure (i.e., the mean of the first data point of both measurement procedures, followed by the mean of the second data point, etc.)
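For concreteness, the three methods can be sketched as follows. This is only an illustration under stated assumptions: the mono-exponential model `y = a + b*exp(-k*t)`, ordinary least squares, identical sampling times in both procedures, and the simulated values (15 points, true k = 0.03 per second, noise level) are all hypothetical and not taken from the study. The helper names `sse` and `fit_k` are made up for this example.

```python
import math
import random

def sse(t, y, k):
    """Residual sum of squares for y = a + b*exp(-k*t); a and b are solved in closed form."""
    x = [math.exp(-k * ti) for ti in t]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    return sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))

def fit_k(t, y, k_lo=0.002, k_hi=0.2, n_grid=200, iters=60):
    """Least-squares estimate of k: coarse grid search, then golden-section refinement."""
    step = (k_hi - k_lo) / (n_grid - 1)
    grid = [k_lo + i * step for i in range(n_grid)]
    best = min(grid, key=lambda k: sse(t, y, k))
    lo, hi = best - step, best + step
    g = (math.sqrt(5) - 1) / 2
    for _ in range(iters):
        c, d = hi - g * (hi - lo), lo + g * (hi - lo)
        if sse(t, y, c) < sse(t, y, d):
            hi = d  # minimum lies in [lo, d]
        else:
            lo = c  # minimum lies in [c, hi]
    return (lo + hi) / 2

# Hypothetical simulated data (assumption): same time points in both procedures.
rng = random.Random(1)
t = [10.0 * i for i in range(15)]

def simulate(k_true=0.03, y_end=0.5, delta=3.0, sd=0.03):
    return [y_end + delta * math.exp(-k_true * ti) + rng.gauss(0.0, sd) for ti in t]

y_a, y_b = simulate(), simulate()

k_mean = (fit_k(t, y_a) + fit_k(t, y_b)) / 2                     # method 1
k_pooled = fit_k(t + t, y_a + y_b)                               # method 2
k_pointwise = fit_k(t, [(a + b) / 2 for a, b in zip(y_a, y_b)])  # method 3
print(k_mean, k_pooled, k_pointwise)
```

Under these assumptions (equal weights, same model, same time points), the pooled and point-wise objectives differ only by a constant factor and an additive constant, so methods 2 and 3 should return the same k up to numerical tolerance. Method 1 will generally differ slightly because k enters the model nonlinearly. If the sampling times differ between procedures, or the fit is weighted, this need not hold.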

I would like to compare these methods to demonstrate equivalence. I was planning to test my hypothesis with an equivalence test (TOST procedure) for a paired-sample comparison of means (using only tissue saturation index data); however, I am not sure whether this approach is appropriate, since all three methods use the same data points to obtain k for each individual. Could this equivalence instead be shown analytically, by rearranging the equations that lead to k, without the need for a formal hypothesis test (i.e., by proving that these averaging methods are mathematically equivalent, if they indeed are)?
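To make the planned analysis concrete, here is a minimal sketch of a paired TOST on per-individual k values from two methods. Everything numeric in it is an assumption for illustration: the k values are invented, and the equivalence margin of 0.003 is a placeholder that would have to be justified on physiological grounds. The function names `t_sf` and `tost_paired` are hypothetical; the t-distribution tail is integrated numerically so that the sketch needs only the standard library.

```python
import math
import statistics

def t_sf(x, df, n=2000):
    """P(T > x) for Student's t, by Simpson integration of the pdf from 0 to |x|."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)
    pdf = lambda u: c * (1 + u * u / df) ** (-(df + 1) / 2)
    h = abs(x) / n  # n is even, as Simpson's rule requires
    s = pdf(0.0) + pdf(abs(x))
    for i in range(1, n):
        s += (4 if i % 2 else 2) * pdf(i * h)
    area = s * h / 3
    p = 0.5 - area if x >= 0 else 0.5 + area
    return min(1.0, max(0.0, p))

def tost_paired(x, y, margin):
    """Paired TOST p-value for H0: |mean(x - y)| >= margin (symmetric bounds)."""
    d = [xi - yi for xi, yi in zip(x, y)]
    n = len(d)
    mean_d = statistics.mean(d)
    se = statistics.stdev(d) / math.sqrt(n)
    df = n - 1
    p_lower = t_sf((mean_d + margin) / se, df)   # H0: mean_d <= -margin
    p_upper = t_sf(-(mean_d - margin) / se, df)  # H0: mean_d >= +margin
    return max(p_lower, p_upper)

# Invented k values (per second) for eight individuals; not real data.
k_method1 = [0.030, 0.028, 0.035, 0.031, 0.027, 0.033, 0.029, 0.032]
shift = [0.0005, -0.0003, 0.0002, -0.0004, 0.0001, 0.0003, -0.0002, 0.0000]
k_method2 = [a + s for a, s in zip(k_method1, shift)]

p_tost = tost_paired(k_method1, k_method2, margin=0.003)  # margin is a placeholder
print(p_tost)
```

Equivalence would be claimed when the returned p-value falls below the chosen alpha. Note that the test only looks at the within-individual differences, so it does not by itself account for the fact that both k values are derived from the same raw data points.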

I would also like to compare k values calculated using the same method but different variables (i.e., tissue saturation index and deoxygenated haemoglobin with two different blood volume corrections) on the same individuals. Since the tissue saturation index is calculated as the ratio between the oxygenated haemoglobin and the total haemoglobin [which is the sum of the oxygenated and (uncorrected) deoxygenated haemoglobin], should I be worried about the possible impact of mathematical coupling when interpreting equivalence test results?
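Written out, the relationship described above is as follows, where the symbols O2Hb (oxygenated haemoglobin) and HHb (uncorrected deoxygenated haemoglobin) are shorthand introduced here for illustration:

$$\mathrm{TSI} = 100 \times \frac{\mathrm{O_2Hb}}{\mathrm{O_2Hb} + \mathrm{HHb}}$$

The deoxygenated haemoglobin signal therefore appears in the denominator of the tissue saturation index, which is the source of the possible coupling between k values derived from the two variables.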

Any help is welcome.