A New Perspective on Medical Decision-Making with Combinatorial Game Theory

Dear members of the Data Methods community,

I’ve been exploring a somewhat unconventional lens for analyzing medical decision-making and wanted to share it with this community for your thoughts and feedback. I’ve written a series of blog posts on applying Conway’s Combinatorial Game Theory (CGT) to clinical scenarios.

The core idea is that the sequential nature of many treatment pathways can be effectively modeled as combinatorial games. Unlike traditional game theory, this approach focuses on games with perfect information, where players (for instance, a clinician and a disease process) move sequentially.

In this framework:

  • Game states represent the patient’s clinical condition at a specific point in time.
  • Moves correspond to the available treatment options or interventions.
  • Outcomes are framed as leading to long-term favorable or unfavorable health states.

This perspective offers a novel language and a set of mathematical tools to dissect and compare treatment pathways. For instance, I explore concepts like the “temperature” of a game, which can be used to quantify the urgency or immediate impact of a particular clinical decision. This allows for developing strategies based on which “game” (i.e., which clinical problem) is most pressing to address at any given time, potentially offering a more structured way to prioritize interventions, especially in patients with multiple comorbidities.

I believe this CGT framework could be a useful complement to existing quantitative methods like Markov decision processes, particularly for thinking critically about long-term sequential decision-making in chronic diseases. The posts also grapple with the inherent limitations of such a mathematical abstraction and the crucial importance of incorporating clinical uncertainty and patient preferences.

I outlined this in a series of Substack posts. Looking for contributors to make it into a white paper.
The link contains the 5th text, which comes after an introduction to surreal numbers and combinatorial games.

I look forward to a stimulating discussion and would be grateful for any insights or critiques you might have.

Best regards,

5 Likes

Very cool work. Interested to wade into it more.

When I think of a MDP, I think of a state, action, and final state - here I see a state, move, and outcome.

If I give the example of a state being cholesterol level, action being statin, and final state being final cholesterol, how does the combinatorial approach represent this?

1 Like

When I prescribe statins I do not consider myself to be aiming to lower cholesterol, I aim to prevent acute myocardial infarctions. Does that reframing better fit the proposed method?

2 Likes

Thank you very much, @samw235711 !

@Elias_Eythorsson comment is helpful.
To begin with, we need to choose a utility function. The default in my text is QALYs.
It is a nice one, because CGT handle games in which time/turns/initiative is somewhat interchangeable with points.

In CGT, an intuitive visualization is that of game trees.
The simplest case for the statin example is a binary tree.

The practitioner can choose the game in which statin is prescribed, which increases QALY (e.g. +0.4 QALY). Alternatively, the body can take it’s natural course and die earlier from a disease (e.g. acute myocardial infarction, +0 QALY). Let’s say statin provides 0.4 QALY.
The algebraic representation for this is { 0.4 | 0 }. The average value is 0.2.

Moving to a more complex one, we may consider sequences of decisions. For instance, in a infarction, we can consider a few branches. Instead of statin (+0.4QALY), we consider a new game, related to reperfusion. No interference will lead to poor outcome (e.g. -10 QALY), a ‘move’ by the disease process. Alternatively, we can interfere (e.g. ASA, Oxygen…), picking a new game, which is also a tree. In this new tree, also binary, we can try thrombolytics (e.g. -5 QALY), or stay with primary interventions only (-7 QALY).
{{-5|-7}|-10}. The average value here is less trivial.

Each decision scenario is represented through these trees, which can be added together and considered under CGT strategies.
For instance, we can keep several concomitant trees, deciding which move is more urgent based on concepts like temperature, cooling and freezing.

Interesting - is it possible to augment it? Because I see the possibilities as

statin, die
no statin die,
statin, live
no statin, live

Is it possible to take a weighted average based on probability?

Also note, within an MDP, MIs could be an event in a state, and QALY could be a function defined on these events

1 Like

Yes, it is possible.

We could add sets to each game, such as:
{{a b c | d e f} | {g h i | j l m}}

Then, triplets may represent each scenario and you could average probabilities considering a random (or semi-random) player. I had saved a white paper on these games, which I will link here later. They can assimilate the uncertainty of clinical studies in literature.
I imagine that there are several homologies between CGT trees and MDPs.

The interesting features of CGTs are those of temperature in such trees. Since the utilities are outlayed within a simetric (2 opposing players) and branched (tree) structure, we can propagate a sort of global prunning in the whole tree. This process eventually merges branches (at certain ‘temperatures’) and one can use this information to strategicaly prioritize certain decisions when considering several scenarios.

1 Like

I think I see, so it’s sort of like the superset contains the first in time and then the subsets are later in time? However the 4 outcomes given with statins are all in a sense at the same level (completely guessing here),

{statin live | statin die | no statin live | no statin die}

but I guess if the trees are based on binary decisions, one could think of it as like

{statin|no statin}
{{statin live | statin die}|{no statin live | no statin die}}

Would be glad to take a look at the paper you mentioned.

Interested how the pruning might apply to this simple case if possible - or is it more relevant to sequential decisions.

1 Like

Yes, precisely!

Here is the link: https://www.sciencedirect.com/science/article/abs/pii/S0020025504002725
And also https://webdocs.cs.ualberta.ca/~mmueller/ps/solvepm.pdf (cites the above)

On page 8:

  1. Probabilistic combinatorial game (PCG) model and the meta-search technique
    Let us define an outcome distribution as a finite set of ordered pairs of the
    form (S,p), where S is an integer score and p is the probability that score S will
    be the result, 0 < p < = 1, such that the sum of all the ps equals 1, i.e. an out-
    come distribution D = {(Si,pi)jSi 2 I, 0< pi < = 1, and Pn i1⁄41pi 1⁄4 1}. Next we define probabilistic combinatorial games recursively as follows:
  2. An outcome distribution is a PCG.
  3. A sum of PCGs is a PCG.
  4. {A1, A2, …, AnjB1, B2, …, Bm} is a PCG if A1, A2, …, An, B1, B2, …, Bm
    are PCGs. A1, A2, …, An are called the Left options and B1, B2, … Bm are
    called the Right options, just like in the ordinary combinatorial games.
    Intuitively, we can think of a PCG in the following way: the playing rules
    are the same as in ordinary combinatorial games [1] except when a sub-game
    reaches an outcome distribution, dice will be thrown to select an outcome of
    the sub-game according to the probability distribution. And the game contin-
    ues with the outcome of this sub-game known to both sides. If the final total
    outcome scores of all component sub-games is positive, it is a win for Left,
    otherwise it is a win for Right.

It is formally called ‘cooling’ due to the notion that a game may be seen as vibrating between two states (Left and Right) and that you cool it down until it freezes to a single numerical value. That works by discounting points for both players proportionally (and also propagating the effect along the nested games).That is, {a|b} becomes {a -1 |b +1} and {{a| b} | c} becomes {{a -1| b +1} -1| c +1}. We can cool a simple game, such as {3|0}, by 1 degree, making it {3-1|0 +1} = {2|1}. Then, cooling another half degree, {1.5|1.5}, which yields the average value ‘at this temperature’.

In the previous example, {{-5|-7}|-10} will cool to {{-5 -1 |-7 + 1} - 1|-10 +1} = {{-6|-6} - 1|-9} = {-7|-9}. The thrombolytics sub-game freezes at T=1, with value -6. Then, the cooled game {-7|-9} will freeze at {-8|-8}, and the game has value -8 at T=2. Hot games take longer to freeze and present a wider margin for the player that makes the first move. A simple strategy is the ‘hotstrat’, which just consists of playing at the hotest game available, then waiting for the adversary’s turn, recalculate temperatures and play at the new hotest game. There are others, such as sentestrat and thermostrat.

/ Combinatorial game theory in Health Sciences #03 - Cooling games and hot strategies

Some scenarios in which this may be applicable to prioritize decisions. ATLS’s classic “ABCDE” algorithm to treat trauma changed to “X ABCDE” to consider emerging ‘hot games’, but with CGT this could generalized to other aspects. The texts at Substack consider vaccination schedules, which is also a nice application. In-between decisions, we have to wait for a period, during which the patient can respond to immunization(s) and/or get infected in other ‘games’ that were previously available.

1 Like

I looked though the linked paper, and I have to still read your posts in full.

Still just going back to the statin example, suppose we have the 4 scenarios with utility die, statin = 0, utility no die statin = 20, utility die, no statin = 1, utility no die, no statin = 50,

it can be represented { {0|20} | {1|50}} where the most central | is whether to take statin and the two less central divide die or no die. It’s interesting tho - we sort of use this symbol | to represent a sort of branching of the tree. However, one branching is a human choice - statin or no statin - the other is nature - die or no die. It’s as if a human makes one decision, nature makes another.

In a sense the notation makes it so that the only thing that distinguishes these decision is time, because the symbol | is the same for both.

An mdp however seems to me like it says that these things are different. Often the policy pi(statin) v different from p(die|statin) - ie p != pi.

So in a sense I can see how maybe combinatorial framework would isolate time in a way perhaps an mdp does not - maybe an mdp has extra junk sort of, obfuscating the issue

2 Likes

@samw235711.
When a player no longer has options, that is the empty set. So the situation that you described might be: { { | 20} | { | 50} }.

There are some interesting advantages in CGT, for sure!

I am looking for collaborators to work on a ‘peer-reviewable’ white paper about this framework. Let me know if you are willing @samw235711 (and other forum members).