PEA Soup is pleased to host a Philosophy & Public Affairs discussion on Fabian Beigang’s “Reconciling Algorithmic Fairness Criteria”, with a précis by Kasper Lippert-Rasmussen.

Open access for the article can be found here:

We will turn things over now to Kasper:

I am honored to write this PEA Soup précis for Fabian Beigang’s wonderfully innovative and clear article on algorithmic fairness. In this article, Beigang proposes two novel criteria for algorithmic fairness: matched equalized odds and matched predictive parity. Beigang’s proposal is worth taking seriously. First, it articulates underlying fairness intuitions better than related fairness criteria that have been proposed in the literature. Second, unlike these criteria, it is possible to satisfy Beigang’s two fairness criteria simultaneously under realistic circumstances.

The topic of algorithmic fairness has received huge attention among academics in recent years. One factor that has contributed to this development is the ever-increasing use of algorithmic prediction instruments, as illustrated by the much-discussed case of COMPAS. Another factor is the intellectual conundrum that two intuitively appealing fairness criteria – equalized odds and predictive parity (more on both shortly) – are impossible to satisfy simultaneously under realistic conditions.

COMPAS (Correctional Offender Management Profiling for Alternative Sanctions) is a risk-prediction instrument used by some US courts to predict the likelihood that an offender will recidivate. Typically, in view of their assumed dangerousness, high-risk offenders receive a harsher punishment than low-risk offenders. The promise of predictive algorithms such as COMPAS is that they can make more accurate predictions because they can process the various pieces of information fed to them better than human predictors. However, in an investigative article published by ProPublica, COMPAS was in effect accused of being racially discriminatory. (1)

The basis for this accusation was that COMPAS had unequal false positive and unequal false negative rates across black and white offenders. More specifically, COMPAS was more likely to predict that black offenders who would not in fact recidivate would do so than in the case of similar white offenders (i.e., the false positive rate for black offenders was higher than for white offenders). Conversely, COMPAS was less likely to predict that white offenders who would in fact recidivate would do so than in the case of similar black offenders (i.e., the false negative rate was higher for white offenders than for black offenders). Since, from the offender’s perspective, a false positive is bad for you (e.g., you spend more time in prison even though you will not recidivate (or presumably would not have recidivated)), and a false negative seems good for you (you spend less time in prison), it seems natural to conclude, as the authors of the ProPublica article did, that COMPAS is unfairly biased against blacks.
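These two error rates are straightforward to compute from a prediction instrument’s confusion matrix. The sketch below uses invented counts purely for illustration (they are not ProPublica’s actual figures):

```python
# Illustrative, invented confusion-matrix counts for two groups.
# They are NOT ProPublica's actual COMPAS figures.

def error_rates(tp, fp, tn, fn):
    """Return (false positive rate, false negative rate)."""
    fpr = fp / (fp + tn)   # share of non-recidivists wrongly flagged high-risk
    fnr = fn / (fn + tp)   # share of recidivists wrongly flagged low-risk
    return fpr, fnr

# Group A: higher false positive rate; group B: higher false negative rate.
fpr_a, fnr_a = error_rates(tp=300, fp=200, tn=300, fn=100)
fpr_b, fnr_b = error_rates(tp=200, fp=100, tn=400, fn=200)

print(fpr_a, fnr_a)  # 0.4 0.25
print(fpr_b, fnr_b)  # 0.2 0.5
```

On these made-up numbers, group A fares worse on false positives and group B on false negatives, mirroring the kind of asymmetry the ProPublica analysis reported.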

In response to the criticism, Northpointe – the private company selling COMPAS – defended COMPAS, noting that the algorithm satisfies predictive parity across black and white offenders. Basically, this means that if white and black offenders are given the same risk score, they are equally likely to recidivate. One can see the intuitive force of this defense by imagining that COMPAS had also been afflicted with predictive disparity such that, say, a significantly smaller proportion of high-risk black offenders recidivated than did high-risk white offenders. In this situation, it would be natural to infer that black offenders were exposed to racial bias, e.g., being deemed more dangerous because of their race.

Many observers thought that ProPublica and Northpointe were both on to something. Specifically, some thought that both equalized odds and predictive parity are necessary conditions of algorithmic fairness. If so, why not just tweak COMPAS (and other similar risk prediction tools) to satisfy both conditions? Unfortunately, several theorists – including Jon Kleinberg et al. and Alexandra Chouldechova, after whom Beigang’s Kleinberg-Chouldechova Impossibility Theorem (p. 169) is named – have proved that it is impossible to satisfy both criteria under non-exceptional circumstances. Exceptional circumstances are circumstances in which the prediction algorithm is guaranteed to make correct predictions, or in which the base rate probabilities across the relevant groups – in this case, the prevalence of recidivism among black and among white offenders – are identical. Neither of these conditions was met in the COMPAS case. COMPAS makes many prediction errors, and black offenders have a higher recidivism rate than white offenders.
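The arithmetic behind the theorem can be checked in a few lines. By Bayes’ theorem, a predictor’s positive predictive value is fixed by its true positive rate, its false positive rate, and the group’s base rate; so if two groups share both error rates (equalized odds) but differ in base rate, their predictive values must differ. The numbers below are hypothetical:

```python
# A minimal numeric illustration of the conflict between equalized odds
# and predictive parity when base rates differ. All numbers are hypothetical.

def ppv(tpr, fpr, base_rate):
    """Positive predictive value via Bayes' theorem."""
    p = base_rate
    return tpr * p / (tpr * p + fpr * (1 - p))

# Same error rates for both groups, so equalized odds holds ...
tpr, fpr = 0.7, 0.3

# ... but the base rates differ, so predictive parity must fail:
ppv_high = ppv(tpr, fpr, base_rate=0.6)  # roughly 0.78
ppv_low = ppv(tpr, fpr, base_rate=0.4)   # roughly 0.61

print(round(ppv_high, 2), round(ppv_low, 2))
```

With identical error rates, the only way the two predictive values could coincide is for the base rates to be equal (or the predictor to be perfect) – which is exactly the “exceptional circumstances” clause of the theorem.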

What do we make of this situation? As Beigang (p. 170) points out, one answer is to conclude that perfect algorithmic fairness is unattainable, and that we need to find the right balance between reducing the deviation from compliance with one of the criteria and increasing deviation from compliance with the other. Many theorists have taken this position. However, there is another answer, which is this: Since it is impossible to satisfy equalized odds and predictive parity in standard situations, and since it should be possible to achieve perfect algorithmic fairness in general, perhaps the Kleinberg-Chouldechova Impossibility Theorem is a reason to critically reassess whether equalized odds and predictive parity are really necessary conditions for algorithmic fairness. This is the novel and, in my view, highly appealing response proposed by Beigang. In this spirit, he proposes that matched equalized odds and matched predictive parity capture the underlying intuitive concerns about fairness better than non-matched equalized odds and non-matched predictive parity. I will explain Beigang’s two criteria in a moment, but first I will say something about the underlying fairness concerns according to Beigang.

According to Beigang, “We can … understand equalized odds and predictive parity as criteria that prevent discriminatory outcomes in predictive models” (p. 171). This anti-discriminatory concern is what both algorithmic fairness criteria reflect. Let us consider each criterion in turn.

First, Beigang says that “[e]qualized odds can be interpreted as a criterion that prevents discriminatory outcomes by preventing systematic cognitive bias with regard to a sensitive characteristic. By systematic cognitive bias, we here mean misjudging how informative a certain trait is in predicting another trait” (p. 171). To see why this rationale does not justify the view that equalized odds is a necessary condition for algorithmic fairness:

“imagine a health insurance company that tries to predict the healthcare costs an individual incurs in a given year in order to decide how to set their customers’ premiums… imagine the company is trying to predict only whether an individual’s annual costs are above a certain threshold… Now imagine that in country C, citizens of religion R1 are, on average, younger than citizens of religion R2… Suppose that, upon examination, the predictions turn out to have a higher false positive rate for people of religion R1 than for people of religion R2.” (p. 172)

According to Beigang, we cannot conclude from this that the predictive model used by the insurance company has a discriminatory bias against people of religion R1. Suppose that the only predictor used is age, and that the predictive algorithm is “biased with regard to age, in that it overestimates how informative young age is of risky behavior, and hence increase health costs” (p. 173). Thus, equalized odds is violated in a way that seemingly disadvantages citizens of religion R1 in C. However, “the outcomes of this predictive model do not discriminate against people of religion R1” since “which religion a person has does not, in any sense, influence the predictions (or, for that matter, the error rates)” (p. 173). Imagine that there is a different country, D, in which the age profile of people from religions R1 and R2 is the opposite, i.e., citizens of religion R2 are on average younger. In D, the relevant predictive algorithm will disadvantage people of religion R2. However, the algorithm as such cannot both discriminate against and in favor of the citizens of the same religion (p. 174). This is not to say that the use of the algorithm is of no moral concern, e.g., its use might involve concerns of distributive justice. But, appealing to Eidelson, Beigang submits that such concerns are “conceptually distinct” from concerns about “(direct and structural) discrimination” (p. 174).

Second, like equalized odds, predictive parity can be violated even if there is no bias regarding a sensitive characteristic. To show this, Beigang imagines:

“a medical device that tests for a specific disease. Given a person has the disease, there is a 95 percent probability that the test turns out positive. When applied to a person who is healthy, there is a 5 percent probability that the test nonetheless turns out positive. There is no difference whatsoever in the likelihood of receiving an erroneous result, no matter whether a patient is male or female… But now imagine that … one in every 10 men has the disease, but only one in every 100 women does. Then the positive predictive value, that is, the probability of actually having the disease given that one receives a positive test result, is different for men and women. For men it is roughly 68 percent, whereas for women it is only about 16 percent” (pp. 174-175).

In this case, predictive parity is violated. Yet clearly the reason is not that predictors misjudge how informative gender is in predicting the specific disease status. Rather, the violation of predictive parity simply reflects that the prevalence of the disease differs across genders.
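The percentages Beigang quotes follow directly from Bayes’ theorem; as a quick check (a sketch, not code from the article):

```python
# Reproducing the arithmetic in Beigang's medical-test example:
# sensitivity 95%, false positive rate 5%,
# prevalence 1 in 10 for men vs 1 in 100 for women.

def positive_predictive_value(sensitivity, false_positive_rate, prevalence):
    """Probability of having the disease given a positive test (Bayes' theorem)."""
    p = prevalence
    return sensitivity * p / (sensitivity * p + false_positive_rate * (1 - p))

ppv_men = positive_predictive_value(0.95, 0.05, 0.10)
ppv_women = positive_predictive_value(0.95, 0.05, 0.01)

print(f"{ppv_men:.0%}, {ppv_women:.0%}")  # 68%, 16%
```

Identical error rates for everyone, yet a positive result means something very different in the two groups – exactly the predictive disparity Beigang describes.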

Accordingly, Beigang concludes that “the definitions of both equalized odds and predictive parity do not adequately explicate the underlying moral intuitions they were designed to capture” (p. 175).

What then does capture those intuitions? To answer this question, Beigang introduces the concept of matching from the literature on causal inference. If you want to make an inference regarding the causal effect of some factor, ideally, you want a randomized controlled trial where you randomly sort randomly selected subjects into two (sufficiently large) groups; one that is subjected to the intervention, the effect of which you want to test, and one that is not subjected to any intervention (setting aside the issue of placebo). The idea is that if you register a difference regarding the dependent variable between the two groups, you can infer that this is the causal effect of the intervention since the two groups of individuals only differ in terms of the relevant intervention.

Often, it is infeasible to conduct a randomized controlled trial. If you want to test the effect of a carcinogenic substance, your local ethics committee would not approve such an experiment. In cases where you have access to the relevant observational data, you could instead select a group of randomized individuals who have been subjected to the relevant causal variable in a non-experimental setting (e.g., they have inhaled the relevant carcinogenic substance) and then select a set of matching individuals who have not been subjected to the relevant causal variable (e.g., they have not inhaled the relevant carcinogenic substance). Here, “matching individuals” refers to individuals who are identical to the individuals in the first group on all variables other than the causal variable – “co-variates,” as Beigang calls them. This ensures that any difference between the two groups with respect to the effect variable is due to the causal variable whose effect we want to test.
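The core idea of matching can be sketched in a few lines. The example below performs exact matching on a single covariate; the data and names are invented for illustration, and real matching methods (e.g., propensity score matching) are considerably more sophisticated:

```python
# A toy sketch of exact matching on one covariate. Data and names are
# invented for illustration; real matching methods are far more elaborate.

def exact_match(treated, control, covariate):
    """Pair each treated unit with an unused control unit that has the same
    covariate value; units without a counterpart are simply dropped."""
    pairs, used = [], set()
    for t in treated:
        for i, c in enumerate(control):
            if i not in used and c[covariate] == t[covariate]:
                pairs.append((t, c))
                used.add(i)
                break
    return pairs

treated = [{"age": 30, "exposed": True}, {"age": 50, "exposed": True}]
control = [{"age": 50, "exposed": False}, {"age": 30, "exposed": False},
           {"age": 70, "exposed": False}]

matched = exact_match(treated, control, "age")
print([(t["age"], c["age"]) for t, c in matched])  # [(30, 30), (50, 50)]
```

After matching, the two groups agree on the covariate (age), so any remaining difference in outcomes can more plausibly be attributed to the exposure itself.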

Obviously, I am skipping over a lot of complications here, but the gist of Beigang’s idea is that instead of equalized odds being a necessary condition for algorithmic fairness, equalized odds across matched groups is. Similarly, instead of predictive parity being a necessary condition for algorithmic fairness, predictive parity across matched groups is.

First, consider equalized odds across matched groups with respect to Beigang’s religious group counterexample described above. By stipulation, “in country C, citizens of religion R1 are, on average, younger than citizens of religion R2”. Accordingly, if we want to assess whether the insurance company’s predictive model overestimates the informativeness of religion with respect to health care costs, the group of citizens of religion R1 and the group of citizens of religion R2 are not matched. To construct a match, we could compare a subset of citizens of religion R1 with a subset of citizens of religion R2, where the two groups are of the same age (on average, or for all relevant matching pairs, etc.). Since the predictive model is based on age information only, it will make the same predictions for matched groups of citizens of religion R1 and citizens of religion R2. Thus, in Beigang’s counterexample, equalized odds across matched groups will be satisfied.

Next, consider predictive parity across matched groups. In Beigang’s medical device test case, the groups of men and women were not matched in that the disease in question is more prevalent in men than in women. Thus, to see whether the test satisfies predictive parity across matched groups, we must compare a group of men and a group of women where the prevalence of the disease in question does not differ between the two groups. By stipulation, then, “there is no difference whatsoever in the likelihood of receiving an erroneous result, no matter whether a patient is male or female” and, accordingly, the test satisfies predictive parity across matched groups. Again, this fits well with Beigang’s contention that the medical test is not unfairly biased (p. 175).

Hence, the counterexamples that, arguably, defeat the view that equalized odds and predictive parity state necessary conditions for algorithmic fairness do not defeat the view that equalized odds and predictive parity across matched groups state necessary conditions for algorithmic fairness. If Beigang is right about the “underlying moral intuitions” that equalized odds and predictive parity were “designed to capture”, then, equalized odds across matched groups and predictive parity across matched groups seem to be strong candidates for necessary conditions of algorithmic fairness.

Having proposed his two algorithmic fairness criteria, Beigang revisits the Kleinberg-Chouldechova Impossibility Theorem, showing that “under perfect matching conditions” “matched equalized odds implies matched predictive parity” and vice versa (p. 185). This considerably weakens the significance of the Impossibility Theorem and is a welcome result for those who hope for perfect algorithmic fairness, even though this positive attitude should be tempered by the fact that in “many realistic situations… the conditions for matching are not ideal.”

As noted, I find Beigang’s revised algorithmic fairness criteria promising. However, in closing, I want to raise three points that might be conducive to discussion here on PEA Soup.

First, as we have seen, Beigang submits that “the definitions of both equalized odds and predictive parity do not adequately explicate the underlying moral intuitions they were designed to capture” (p. 175). That intuition, I take it, is that algorithmic fairness requires preventing “discriminatory outcomes in predictive models” (p. 171). To fully assess that claim, we need to know more about what this means. In the case of equalized odds, Beigang proposes that the intuition that the two criteria were “designed to capture” “can be interpreted as a criterion that prevents discriminatory outcomes by preventing systematic cognitive bias with regard to a sensitive characteristic,” e.g., by preventing overestimation of “the informativeness of a sensitive characteristic” that “could lead to disadvantageous predictions on the basis of an irrelevant (or less relevant than warranted) sensitive characteristic” (p. 171).

It would be good if Beigang could provide some more support for this interpretation. This is especially so because in the COMPAS case, for legal reasons, race was not an input into the algorithm. Accordingly, it must have been clear to theorists who favored equalized odds, partly in light of COMPAS, that whatever intuition this requirement was intended to capture, and which was flouted in COMPAS, it could not have been the view that “misjudging how informative [race] is in predicting [recidivism]” is to be avoided.
Regarding predictive parity, Beigang notes that this criterion is “often interpreted as a criterion that prohibits the meaning of a prediction from depending on a person’s sensitive characteristic, as doing so could incentivize discriminatory behavior” (p. 174). By this Beigang has in mind the fact that violation of predictive parity “provides a clear incentive” to decision makers to treat people differently, despite the fact that the predictive model makes the same prediction regarding these people (Kleinberg et al., p. 4). That sounds right. However, it is unclear that this interpretation is an interpretation of what it takes to prevent “discriminatory outcomes in predictive models.” One reason is that the “discriminatory behavior” that predictive disparity incentivizes is causally downstream from the predictions generated by the algorithm and not located in the “outcomes in predictive models” as such. One way to see this is to imagine that the incentivized “discriminatory behavior” simply counterbalances other biases the user of the predictive model has, thereby resulting in the decision maker producing non-discriminatory decisions overall despite the predictive disparity.
My second question concerns Beigang’s contention that “(direct and structural) discrimination is conceptually distinct from matters of distributive justice” (p. 174) and, thus, that, in the context of algorithmic decision-making, we can distinguish between “discriminatory predictions and unjust decisions” such that issues of distributive justice arising from algorithmic decision-making are irrelevant to the former issue (p. 174). Note, first, that the parenthesis “(direct and structural)” is a bit surprising here. It leaves out indirect discrimination, which surely is the more common contrast to direct discrimination. Moreover, when it comes to indirect discrimination, there is an issue about whether the wrong of indirect discrimination can be understood simply as a matter of distributive injustice. Some argue that victims of indirect discrimination are wronged, and that they may be wronged even if society as a whole does not become more distributively unjust because of the indirectly discriminatory practice in question. However, others seem to tie the two issues more closely together, such that if a practice really results in a mitigation of distributive injustice, it cannot amount to wrongful, indirect discrimination. When it comes to structural discrimination, the case for tying structural discrimination and distributive injustice closely together appears, if anything, no weaker than in the case of indirect discrimination, which makes it harder to see why Beigang refers to structural, but not to indirect, discrimination here.

More importantly, Beigang’s argument about equalized odds, and the suggestion that the rationale behind this criterion is to prevent “systematic cognitive bias with regard to a sensitive characteristic,” seems to make the very notion of indirect algorithmic discrimination via unequalized odds impossible. Perhaps this is the right way to go. However, again, I doubt that this reflects any intuitions on the part of those theorists who have adopted equalized odds as an algorithmic fairness criterion. Suppose black offenders have a higher risk of recidivism than white offenders because direct discrimination against blacks is widespread. As a result of this higher prevalence, non-recidivating black offenders face a higher risk of unjustifiably long incarceration than white offenders. On Beigang’s matched equalized odds criterion, COMPAS would not involve algorithmic unfairness since COMPAS would not involve any cognitive bias regarding the informativeness of race (even if the algorithm had been fed information about the offender’s race, no cognitive bias regarding race need have obtained), even if the decisions resulting from the use of COMPAS’s predictions might raise issues of distributive justice on Beigang’s view. However, I suspect that many theorists would say that in the light of the causal role discrimination plays in differential prevalence of recidivism, the issue is not just one of distributive justice but also one of indirect discrimination, or more specifically an issue of indirect algorithmic discrimination. To some extent, what is a matter of discrimination and what is a matter of distributive justice may be a matter of reasonable disagreement, or even a dispute over words. However, arguably, Beigang’s position seems revisionary and cannot credit itself with the virtue (if that is what it is) of capturing the intuitions friends of equalized odds have tried to capture.

Excluding the possibility of something like indirect algorithmic discrimination might seem problematic, regardless of whether doing so fits well with the designs of the friends of the equalized odds criterion. Suppose an employer uses a predictive algorithm to assess how economically attractive an applicant is. Suppose the predictor used is time spent on care work. This is a relevant predictor because while male and female applicants are equally well qualified in terms of technical skills, etc., women tend to take more parental leave, refuse to work late hours in the interest of their children, and stay home more often because of a sick child, and these characteristics are economically important. I wonder if Beigang’s position implies that the prediction algorithm does not indirectly discriminate against women because matched groups of men and women, i.e., men and women who spent the same number of hours on care work, are predicted to be equally economically attractive? If so, something needs to be said in response to the view that this is a reductio of Beigang’s view, since this case comes close to what many would consider a paradigm case of indirect gender discrimination.

In response, Beigang might appeal to a general skepticism about indirect discrimination. In Eidelson’s view, which Beigang quotes approvingly, indirect discrimination interventions are best thought of not as interventions to mitigate what is really discrimination, i.e., in Eidelson’s view, disrespectful differential treatment, but as interventions to promote distributive justice. However, it is at least a noteworthy implication that to distinguish between algorithmic fairness and issues of distributive justice, we must reject the very category of indirect discrimination.

Alternatively, Beigang might say that while the predictions made are not discriminatory, the decisions made based on these predictions are. However, if this distinction holds in the present case, one would like to know why the same should not be said about algorithmic predictions that do not satisfy equalized odds and predictive parity across matched groups, i.e., that they are not discriminatory even though decisions based on them may be. (A claim that has some force in the light of, say, philosophers’ toy examples such as Robinson Crusoe using his supercomputer to make predictions to avoid boredom, but that have no practical significance for the subjects whose behavior is being predicted.) And, of course, there might be other responses. However, something needs to be said in response to my second point.

My final point is not so much a challenge or a question. Rather, it is a conjecture that fits well with Beigang’s observation that it is sometimes impossible to construct matching groups. Consider the example above of matching groups of men and women who spend the same number of hours on care work. Given that gendered norms and gendered expectations explain why women on average spend more time on care work than men do, one would suspect that a matching group of men would systematically differ from other men. Moreover, men in the matching group also differ from women in the matching group in that, for example, they have the characteristic of being deviant in terms of their non-compliance with dominant gender norms and expectations, whereas the women in the matching group do not. Thus, the very fact that one makes sure that the two groups match each other in terms of care work, given the relevant background information about statistical differences between men and women and about gender norms, etc., means that one makes them mismatch in terms of whether they conform to gender norms. If so, it is difficult to say whether differential error rates between putative matching groups of men and women reflect that the algorithm in question overestimates “the informativeness of a sensitive characteristic” or simply that gender-norm-non-conformance is more prevalent among the putatively matching group of men than among the matching group of women. One cannot discuss everything in one article, and Beigang refers to some of the relevant literature (p. 179 footnote 23), so here I simply encourage him to offer his take on this issue in the context of his view that satisfying his two fairness criteria may not be feasible together under non-ideal circumstances. I think his article covers what it should cover and brilliantly so. Still, as always in philosophy, there is more to discuss.


13 Replies to “Fabian Beigang’s ‘Reconciling Algorithmic Fairness Criteria’. Précis by Kasper Lippert-Rasmussen”

  1. Thanks, Kasper, for a very helpful précis. And thanks, Fabian, for writing such a remarkable article. I think it’s especially impressive how you manage to integrate normative theory with statistical theory to achieve clarity. I learned a lot from engaging with it and I find the argument extremely plausible.

    I’d like to raise a question that, as with Kasper’s comments, touches upon the normative interpretation of the fairness criteria.

    When Beigang offers his interpretation of the normative rationales of the fairness criteria, he seems to both highlight a concern for “preventing discriminatory outcomes” and a concern for “preventing cognitive bias” (p. 171), even if the former plays the largest role. And granted, there is certainly a close, perhaps even conceptual, connection between those ideas. But I’m inclined to think they come apart at least in some cases.

    To illustrate: Suppose that Boss fires Employee because of their gender (‘gender’ is the motivating reason here). But Employee has also committed a serious professional offence for which firing them would be the only defensible response, a fact that Boss ignores. I’d like to think of this as a case of wrongful gender discrimination but without a biased outcome.

    What I think this case shows is that there is at least a conceptual difference between interpreting the fairness criteria as concerned with anti-discrimination and as concerned with anti-bias. And I wonder which interpretation Fabian would stick to, for the following reason: If the anti-bias rationale is deemed central, it’s less clear what to make of the causal part of Fabian’s proposal (since this is something he seems to lift from a conceptual analysis of discrimination). In the other direction, if Fabian goes with the anti-discrimination part, it’s now much less clear what to make of the concern for bias in outcomes (which arguably seems central to many working on algorithmic fairness).

    Thanks again for a wonderful paper.

  2. First off, a big thank you to Kasper Lippert-Rasmussen for this beautifully written précis of my paper. I’m truly honored that you devoted your time and thought to engage with my work. Your précis succinctly captures the essence of my argument and brings up several interesting critical points. I’d like to briefly respond to those points.

    As a first point, you noted that the argument supporting my interpretation of the two criteria could be made stronger. Also, you noted that COMPAS did not contain race as a variable, for which reason it seems unlikely that defenders of equalized odds had the avoidance of direct discrimination in mind when talking about fairness.

    I do take that point; maybe the interpretation is a weak point in my argument. Understanding equalized odds and predictive parity in this way is ultimately what I concluded is the most reasonable interpretation of fairness constraints on predictive models. Many of the papers proposing mathematical fairness constraints skip over the normative theory justifying their criteria rather quickly, if they provide any justification at all. The paper introducing equalized odds, for instance, simply states that the goal is to formalize the notion that the predictor “does not discriminate with respect to A”. Beyond that, not much is provided in terms of justifications or underlying normative theory. This, of course, makes it hard to find a strong evidential base in the literature for my interpretation.

    In another paper (“On the Advantages of Distinguishing Between Predictive and Allocative Fairness in Algorithmic Decision-Making” in Minds and Machines, 2022) I have argued at length for distinguishing between two aspects of fairness in software applications based on machine learning models: the component that predicts certain properties, and the component that makes decisions based on these properties. My conclusion was that the predictive component carries the risk of introducing direct discrimination with regard to a protected attribute, if the protected attribute (or proxies thereof) have an effect on the accuracy (however measured) of the predictions. On the other hand, the component involved with making decisions based on these predictions carries the risk of introducing indirect discrimination or distributive injustices. To my mind, equalized odds and predictive parity can only be interpreted as predictive fairness criteria, as they are both defined in terms of accuracy metrics for predictive models, namely error rates (in the case of equalized odds) and predictive value (in the case of predictive parity). In the light of this, interpreting equalized odds and predictive parity as fairness criteria imposed on the predictions made by an ML model, and hence as requiring that the signal contained in a protected attribute (or its proxies) is not overestimated (as the ML model does nothing other than try to find the signal for predicting some other property), seemed to me the most sensible interpretation of these criteria.

    Note also that an ML model’s prediction can lead to direct discrimination even if the protected attribute is not part of the input data, as there can be redundant encodings of the protected attributes (i.e., proxy variables that allow one to pick out the different groups with relatively high accuracy). This is often illustrated with the analogy to redlining, where postcodes known for having a predominantly Black population were excluded from banks’ services. Despite the fact that on the face of it this is a “colour-blind” procedure, it, so it is argued, constitutes direct discrimination. So I don’t think it follows from the fact that COMPAS did not include race as a variable that its critics could not have had direct discrimination in mind when analyzing it using equalized odds.

    I would answer in a similar vein to your second point about indirect discrimination. A key sentence is that the interpretation of equalized odds “make the very notion of indirect algorithmic discrimination via unequalized odds impossible”. Here, again, I would refer to the distinction between predictive fairness and the fairness of decisions based on these predictions. As indirect discrimination is generally defined along the lines of “an act that imposes a disproportionate disadvantage on the members of a certain group”, it clearly falls into the latter category (i.e. an unfair outcome at the level of decisions). So, I would argue that a violation of equalized odds can indeed lead to decisions that impose disproportionate disadvantages on some groups, but it is neither a necessary nor sufficient condition for that. As a consequence, I think it is more sensible to aim at identifying indirect discrimination at the level of decision-making rather than at the level of predictions.

    The challenge you raise towards the end provides an interesting example case. The answer to this case lies in the details of how the two matched groups ought to be constructed in order to allow for the application of causal inference methods. The co-variates on which the two groups should be matched should never include variables that are causally downstream of the putative causal variable. To compare this with a clearer-cut case: if we were to compare smokers and non-smokers in an observational study of the effect of smoking on mortality, we would not want to include lung cancer as a co-variate (i.e., we would not want to create two groups that have equal rates of lung cancer), because, of course, in most cases lung cancer is caused/explained by the fact that someone smokes. Including this causally downstream variable among the co-variates to match the groups on would distort the causal analysis and hence render its conclusions invalid. The same reasoning applies to your hypothetical case: since the amount of care work an individual does is at least in part explained by the individual’s gender, due to prevailing gender norms, this disqualifies it as a co-variate for matching. So I don’t think this is in fact a challenge to the two criteria proposed in my paper.
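    To make the smoking analogy concrete, here is a small numerical sketch in Python (all probabilities are invented for illustration, and smoking is assumed to affect mortality only via lung cancer). Comparing the groups directly recovers the causal effect, while “matching” on the downstream variable makes it vanish:

```python
# Toy structural model: smoking -> lung cancer -> death.
# All numbers are invented for illustration; smoking is assumed to
# affect death only through lung cancer.

P_CANCER = {"smoker": 0.30, "non-smoker": 0.05}  # P(cancer | smoking status)
P_DEATH = {True: 0.50, False: 0.10}              # P(death | cancer status)

def p_death(group):
    """Marginal P(death) for a group, averaging over cancer status."""
    pc = P_CANCER[group]
    return pc * P_DEATH[True] + (1 - pc) * P_DEATH[False]

# Correct comparison: do NOT condition on the downstream variable.
true_effect = p_death("smoker") - p_death("non-smoker")  # 0.22 - 0.12 = 0.10

# Distorted comparison: "match" the groups on lung cancer status.
# Within each cancer stratum, smokers and non-smokers die at the same
# rate, so the causal effect of smoking disappears entirely.
effect_with_cancer = P_DEATH[True] - P_DEATH[True]        # 0.0
effect_without_cancer = P_DEATH[False] - P_DEATH[False]   # 0.0

print(true_effect, effect_with_cancer, effect_without_cancer)
```

    The point generalizes: conditioning on any variable on the causal path between the putative cause and the effect removes exactly the part of the effect one is trying to measure.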

    Nonetheless, I do think this case highlights an important problem in applying causal inference methods to social questions, namely that one has to make certain causal assumptions to arrive at estimates of the causal effect, and it is often difficult to discern exactly what causal influence a given protected attribute has on some other variable of interest. In this sense, the argument in my paper might be more of conceptual than of practical relevance, as I think it might be very difficult to specify the right set of co-variates in any real-world situation.

  3. To respond to Lauritz Munch:

    Thanks for the comment and the interesting example. I very much agree with you that cognitive bias and discriminatory outcomes should be distinguished conceptually, and I probably didn’t do that sufficiently in this article. Yet I do think there is a clear relation between them: cognitive bias relative to a protected attribute will, if acted upon directly, result in discriminatory outcomes. Especially in an AI context, where decisions might be taken in an automated fashion on the basis of predictions by an underlying ML model, this connection seems to be fairly tight. But as mentioned above, I’ve tried to distinguish the two aspects of fairness conceptually in another paper, “On the Advantages of Distinguishing Between Predictive and Allocative Fairness in Algorithmic Decision-Making” (in Minds and Machines, 2022).

    I think the example you describe is an interesting one:

    “Suppose that Boss fires Employee because of their gender (‘gender’ is the motivating reason here). But Employee has also committed a bad professional offence where firing them would be the only defensible response which Boss ignores. I’d like to think of this as a case of wrongful gender discrimination but without a biased outcome.”

    This, I guess, is a case of the right decision for the wrong reasons. I would describe it slightly differently, though. I would say (assuming the boss fires the employee because they falsely think that being of that gender is associated with, say, lower productivity on the job) that the boss is biased with regard to gender and therefore makes a discriminatory decision, the outcome of which, however, is fair (from the perspective of a fair allocation of job opportunities, or something along those lines). In the ML context, again, I would say this can be analyzed by distinguishing between predictive and allocative fairness.

  4. Thanks Fabian for a rich and thought-provoking paper, and Kasper for an excellent précis to kick things off!

    I’m still getting my head around the “matched” versions of equalized odds and predictive parity, and in particular around the notion of a co-variate. In response to Kasper, you wrote, “The co-variates on which the two groups should be matched should never include variables that are causally downstream of the putative causal variable.”

    Does that mean that in a case involving race, the co-variates would not include e.g., wealth and income, since one’s race partly causes one’s level of wealth and income (given the unjust background conditions of society)?

    If so, I worry that the matched versions of equalized odds and predictive parity will still fall prey to the problem of infra-marginality. Suppose we’re doing loan approvals, and suppose (unrealistically) that members of one race are either filthy rich or utterly destitute, while members of another race are solidly middle class. For the loan in question (say, a million dollar home loan), the members of the former race are all clear (non-marginal) cases – they will either clearly repay it or clearly default, while the members of the other race are all difficult (marginal) cases – it’s tough to tell whether they’d repay it or not.

    Here, a natural algorithm that predicts loan repayment on the basis of wealth and income will violate ordinary equalized odds and predictive parity, since we’ll get a higher false positive rate and higher false negative rate for the racial group consisting of all marginal cases, and we’ll get a lower positive predictive value and lower negative predictive value for that group as well.

    But I’d claim that there needn’t be any unfairness or bias in this predictive algorithm (wealth and income are clearly very important to determining loan repayment!), even though there may be plenty of injustice in the background conditions of society, yielding the different distributions of wealth and income across racial groups.

    Now, since wealth and income aren’t legitimate co-variates here, I would have thought that matched equalized odds and matched predictive parity are equivalent to ordinary equalized odds and ordinary predictive parity, so that if the latter are violated (as they are in this case), so are the former. But then if you agree that there’s no unfairness or bias in the predictive algorithm, this would show that matched equalized odds and matched predictive parity aren’t necessary for fairness/lack of bias.

    (If you’re familiar with the coin flip case in my algorithmic fairness paper, think of there being one room where everyone has coins with extreme biases but symmetrically distributed about 0.5, along with another room where everyone has coins with middling biases but again symmetrically distributed about 0.5. And the algorithm predicts heads for everyone with a coin with labeled bias >0.5 and tails for everyone with a coin with labeled bias <0.5. And let people’s coins’ labeled biases be causally downstream of room membership.)
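    A quick Python computation of this two-room case (the particular bias values 0.9/0.1 and 0.6/0.4 are my own illustrative choices) bears out the claim: the same best-possible prediction rule yields unequal false positive rates and unequal positive predictive values across the rooms:

```python
# Two rooms of biased coins; the algorithm predicts heads iff the coin's
# labeled bias exceeds 0.5. Bias values are illustrative: extreme in room
# A, middling in room B, both symmetric about 0.5, equal numbers of each.
ROOMS = {"A": [0.9, 0.1], "B": [0.6, 0.4]}

def rates(biases):
    """False positive rate and positive predictive value for one room."""
    heads_and_predicted = sum(b for b in biases if b > 0.5)      # TP mass
    tails_and_predicted = sum(1 - b for b in biases if b > 0.5)  # FP mass
    tails_total = sum(1 - b for b in biases)
    fpr = tails_and_predicted / tails_total
    ppv = heads_and_predicted / (heads_and_predicted + tails_and_predicted)
    return fpr, ppv

fpr_a, ppv_a = rates(ROOMS["A"])  # FPR 0.1, PPV 0.9
fpr_b, ppv_b = rates(ROOMS["B"])  # FPR 0.4, PPV 0.6
print(fpr_a, ppv_a, fpr_b, ppv_b)
```

    Each individual coin gets the best prediction available given its bias, yet both equalized odds and predictive parity fail across the rooms.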

    What would you say about this example? Have I misunderstood what matching is, or misapplied it in this case? Or would you disagree with my intuitions about what fairness requires vis-a-vis making predictions in loan approval cases where racial groups differ in wealth and income, and where those features are causally downstream of race?


  5. Thanks Fabian for the great paper and everyone for participating in this discussion. I will latch on to something Fabian said in his reply to KLR’s (alleged) counterexample. Fabian argues that we should not match groups on a feature that is causally downstream of the putative causal variable. (This view also appears in the paper: “another important norm for choosing covariates is not to include any variables that are causally influenced by the causal variable. This might lead to an underestimation of the causal effect and thereby distort the analysis”, 178.) To take an extremely simple case of indirect causation, G->F->P->D (the group G causes a feature F that causes the prediction P that causes the decision D), we should *not* match groups to be equal in F.

    Now imagine a case in which there is religious persecution in country A, which causes people of a certain minority religion to leave the country (and is the only cause of emigration) and become the only population of immigrant workers in country B, where they also form a religious group distinct from the rest of the population. The rest of the story is identical to the story in Fabian’s own example in the paper (pp. 172–173): immigrants tend to be younger than the average population in B; age overestimates risk; and the immigrants (who merely happen to be of a different religious group) end up paying disproportionately higher prices, both in absolute terms and in comparison to groups of other religions analysed with the same algorithm. I would be curious to understand how Fabian reads my example, which is slightly different from the one in the paper. On one possible interpretation, this is a case of G->F->P->D, where
    G = religion
    F = age
    P = (overestimated) risky behavior
    D = higher insurance cost
    Here, Fabian could argue that we should not match the two groups by age, because age is causally downstream from religion.
    On the other hand, Fabian could argue that age cannot be caused by religion, not even indirectly. So the two groups should be matched.

    Interestingly, in my example, the existence of an immigrant *group* with a lower than average age is caused, indirectly, by religion. By hypothesis, had the immigrants been members of the majority religion, they would not have left the country. Had they not left the country, they would not have formed a group with a younger average age than any other group.

    I wonder what Fabian would say about this case. Fabian could argue that this should be considered a case in which age is causally downstream relative to religion. So we should *not* apply equalized odds to age-matched groups (as in his reply to KLR’s example). If we do not age-match, equalized odds is likely violated, so we should conclude that the algorithm is biased. But then what about the case Fabian himself discusses at pp. 172–173 of his article? This example (also involving a religious group that has a lower average age due to immigration) is used as an illustration of a case in which there is no bias and no discrimination. But Fabian’s own case looks very similar to the one I presented, apart from some details. It is true that, in Fabian’s example, religion is not the cause of immigration, while in my example it is. But is this difference in the causal role of religion between the two examples really so important morally? (If it isn’t, a possible critique of the matching method is that it fails to track our intuitions about what is morally important.)

    There is, however, another way Fabian could reply. Fabian could argue that, even though religion is the cause of the average age of the *group* in the example I present, it is coherent with the norm for choosing covariates for matching that we match by age nonetheless. He could maintain that, despite the different causal role of religion, the case I present is not fundamentally different from the one he discusses in the paper, where clearly there is no bias against any religious group. For in both cases, no *individual* age attribute is causally influenced by an individual’s own religion. After all, even in the story I tell, an individual’s age remains the same irrespective of their religion. An individual’s age is what it is, from the day that individual is born, and that is simply causally independent of the religion that person happens to have. Fabian’s methodological premise would be, then, that the norm for choosing covariates should exclude features that are causally downstream of the putative causal variable, but only at the individual level. Since age cannot be considered causally downstream of religion *at the individual level*, it is fine in my example to match groups by age. Hypothetically, when we match groups by age, we discover that equalized odds is satisfied, so there is no bias.

    I wonder whether this is, in fact, Fabian’s view on the matter. If it is, then it seems to leave open the possibility that there could be a form of bias against a religion *group* even when there is no bias against individuals. After all, it seems clear, in my example, that a group of people is mistreated because (indirectly) of the religious nature of that group. Religion does not play a direct role in the bias – it is not the ground of discrimination – but it clearly plays an indirect role: it indirectly makes it the case that the group receives the worse treatment it does by forcing emigration and the constitution of a group characterized by low average age.

    Thanks again for such a thought-provoking paper, and thanks in advance for your reply.


  6. Hi Brian,

    Many thanks for your comment. This is a great point, and I’m not sure I have a good answer to it.

    To reply to your question about matching: I think you’ve got things right. In your example, wealth and income should not be included, because they would be mediator variables that are partially influenced by the protected attribute. Generally, the idea of matching is to establish two (synthetic) groups of data points that are as similar as possible to one another in all properties other than the putative causal variable and anything on the causal path between the causal variable and the effect variable, to make sure any difference in the effect variable is due only to the putative causal variable. However, I think in the case of variables like wealth and income, we’d probably want to find a finer-grained description of the immediate causes of wealth and income, to be able to distinguish between those that are influenced by the protected attribute (for instance, less access to education) and those that aren’t (maybe something like diligence). This would of course be difficult in practice, but at least conceptually this would be the ideal scenario for matching-based causal inference.

    Now, this doesn’t really solve the problem of infra-marginality. At first glance it seems to me that matched equalized odds will not get around this problem, but I’ll have to think about that a bit more.

  7. Thanks Michele for this example. This is an interesting point, but I do think in that case I would probably bite the bullet and say there seems to be a morally relevant difference between the two cases: even though they do seem very similar, in one case the discrepancy in error rates is (partially) explained by a protected attribute and is hence unfair (in the sense of direct discrimination), while in the other it isn’t. But I admit, both cases are edge cases where intuitions might differ. A case where there is a slight (albeit not full) explanatory connection between a protected attribute and different error rates might strike us as unfair, even though not as unfair as one where there is a blatant explanatory connection. At the same time, a case where there is no explanatory connection but the disadvantages clearly fall onto one group might not strike us as a case of direct discrimination, but might yet seem morally problematic from the perspective of the resulting distributive patterns. I guess for every theory there will be edge cases of applicability, where things seem less clear-cut than elsewhere, and I think your example highlights that very elegantly for the theory of matched equalized odds.

  8. Thanks Fabian,
    In that case, I’d argue that even if the matched versions of equalized odds and predictive parity are reconcilable, we should still reject them for reasons having to do with infra-marginality.


  9. I should say that one might argue that the problem of infra-marginality isn’t a serious problem after all, and that predictive algorithms are sometimes required by fairness to be insensitive to the distinction between marginal and non-marginal cases. But I’m curious if you’d want to go that route in defence of matched equalized odds and predictive parity. Or would you say that while modifying equalized odds and predictive parity to incorporate matching gets around one problem (the impossibility results), it still doesn’t result in satisfactory criteria of algorithmic fairness because of the problem of infra-marginality?

  10. Yes, I think I would presumably argue for the latter. Whether the use of an ML model in a given situation is fair depends on so many contextual factors that it is probably impossible to identify universally applicable fairness criteria. By now I lean much more towards understanding fairness criteria as diagnostic tools (or heuristics) that can help identify fairness issues, rather than as strict conditions on ML models. So I would probably want to say that moving from equalized odds/predictive parity to their matched counterparts moves us one step closer towards more coherent diagnostic tools to detect unfairness, but there might be exceptions to those as well. Infra-marginality would be one such exception.

  11. Thanks. I agree that we should probably see most of these fairness criteria as diagnostic tools rather than strict necessary conditions on fairness. I’m not sure about calibration – that strikes me as the best candidate for a genuine necessary condition on fairness. (Or perhaps something even weaker, like Ben Eva’s base rate tracking.)

    But if we want to see those other criteria as merely diagnostic, there remains the question of what fairness really is, or else it’s unclear how to evaluate whether these other criteria are good tests for fairness or unfairness. Do you have a view on this?

    I’m a bit skeptical that predictive fairness is going to be as highly contextual as you suggest. Perhaps fairness in decisions is highly contextual, as the “payoff structure” will differ from case to case. (Even here, though, I think there may well be some underlying, context-invariant notion of fairness, along the lines of treating everyone’s welfare equally.) There’s also a worry that if fairness is highly contextual, it’ll be difficult to determine whether some criterion is even a good diagnostic tool, since the thing whose presence or absence we’re trying to detect will vary widely from case to case.

  12. Thanks Fabian for a great paper and Kasper for such a lucid summary, and to everybody for stimulating comments. I am not sure if I am too late to the party, but I wanted to share a couple thoughts, developing strands of ongoing conversations between Fabian and others.

    1. Brian and Fabian agree that fairness criteria are diagnostic, and as Brian notes, this leaves open what fairness in algorithmic decisions (or algorithmic predictions) really consists in. What are fairness criteria diagnostic of? I want to rehearse some intuitions that might explain why we find classification parity/equalized odds such a compelling criterion. First, violations of classification parity might signal a difference in the allocation of burdens and benefits across socially salient groups. This tracks the idea of “disproportionality” that many — even in mainstream media — talk about. So here I think the concerns associated with violations of classification parity are about distributive justice of some kind. Second, violations of classification parity signal differences in the treatment of individuals due to their group membership. This is where direct or indirect discrimination comes in. Third, violations of classification parity signal a difference in representation, or a symbolic difference, say that people of a certain group are more likely to exhibit one behavior rather than another. My sense is that among these three intuitions, Fabian’s matched version of equalized odds is meant to track differences in treatment across individuals due to group membership (say, race or gender). Is this right, Fabian? What about the other intuitions? Are they totally off, or still deserving of consideration in debates about algorithmic fairness?

    (An aside here: If equality in the treatment of individuals is the intuition matched equalized odds is supposed to track, however, then things like affirmative action or reparation or other forms of remedial justice would conflict with matched equalized odds, right? We think — or some of us think so, though of course not everyone agrees — that sometimes it is required of us to treat people differently because of their group differences, in order to remedy a prior injustice. How does matched equalized odds sit relative to such cases?)

    2. There is another intuition behind classification parity that Fabian explicitly mentions (and Kasper underscores in his summary), that is, preventing systematic cognitive bias. If this is the intuition we are trying to track with equalized odds, we would need to match individuals across all features used by the algorithm, including those that are downstream relative to the protected characteristic. My sense is that equality in the treatment of individuals is probably a better rationale for why we should match individuals while excluding features that are downstream relative to the protected characteristic. What do you think, Fabian, about this contrast between “equality in treatment for individuals” versus “preventing cognitive bias” as intuitive rationales for adopting matched equalized odds as a fairness criterion?

    3. Finally, I want to pick up on a point made by Michele about group fairness versus individual fairness criteria. Perhaps equalized odds or classification parity is meant to track intuitions about unfairness in treatment of groups, not necessarily of individuals. It could be that what we are concerned with is whether groups are treated differently from other groups, since this difference in treatment might have symbolic/representational value, might make a difference in how resources are allocated or might eventually lead to a difference in how individuals within a group are treated compared to individuals in another group. Where do you stand Fabian on this idea of fairness in the treatment of groups versus individuals? Should fairness criteria try to capture these kinds of group fairness intuitions as well?

    My overall point is that we should probably be clearer on the type of fairness intuitions we want to capture and this way we can better assess whether a proposed formal criterion of fairness does track them or not. There might be a number of competing intuitions we want to capture and equalized odds might fare well for some, but not others.

  13. Sorry for the delay in replies.

    Brian, I definitely agree with you on this point that there will be more context-dependence for fairness evaluations of ML-based decisions, and less for the predictions. Yet I am skeptical that it will be possible to capture in a single mathematical equation all the intuitions people have about what is to be considered fair with regard to predictions. We might be able to agree on a (presumably fairly weak) minimal necessary condition which holds universally and can be strengthened depending on context. I was hoping to bring the discussion a bit closer to this with the idea of replacing observed equalized odds/predictive parity with their matched counterparts. But, as you have pointed out, those also don’t seem to be necessary conditions, as there are some contexts within which even violations of matched equalized odds/predictive parity seem intuitively fair due to infra-marginality. Maybe there’s a way to further weaken the criteria to get to the truly necessary core – I’m not sure.

    Marcello, thanks for joining the discussion! Regarding 1, you’re right about the underlying rationale of matched EO/PP. But note that I am only concerned with predictive fairness, not with how decisions are made on the basis of these predictions. I think considerations of distributive justice and affirmative action are to be made at the level of decisions, not predictions. Say you use a predictive model to identify students with special needs at school in order to provide them with extra support. If we impose a distributive criterion at the level of predictions – for instance, to ensure that on average students with a learning disability are not predicted to have lower grades than others – this will make it impossible to provide the right students with the extra support. If, on the other hand, we only ensure that the predictions are unbiased and accurate, then on the basis of these predictions we can design the decision function such that it realizes the distributive criterion we want to implement – say, equality of opportunity or equal treatment. The same argument can be made for affirmative action. Regarding 2, I agree that equality in treatment for individuals is a good rationale for matched EO/PP, but I do think cognitive bias with regard to a protected attribute will lead to a failure to treat a person as an individual (understood in Eidelson’s sense), as this means that the prediction is made (partially) on the basis of a (usually) immutable attribute where this attribute is irrelevant (or at least not as relevant as the prediction implicitly takes it to be), instead of on the individual’s relevant attributes. Regarding 3, I would point to the decision side of fairness again, as I would understand these group fairness considerations as distributive concerns.
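    The division of labour described here – unbiased predictions, with distributive criteria implemented only in the decision function – can be sketched schematically in Python (the score, threshold, and adjustment values below are purely hypothetical):

```python
# Schematic separation of predictive and allocative components.
# The predictor reports each student's estimated support need as
# accurately as possible; the decision function then implements
# whatever distributive criterion we choose. Numbers are invented.

def predict_need(student):
    """Predictive component: an (assumed unbiased) need score in [0, 1].
    Stand-in for a trained ML model's output."""
    return student["need_score"]

def decide_support(student, threshold=0.5):
    """Allocative component: grants extra support above a need threshold.
    Distributive policies (e.g. a remedial adjustment for a disadvantaged
    group) belong here, not in the prediction."""
    score = predict_need(student)
    if student.get("disadvantaged_group"):
        threshold -= 0.1  # illustrative remedial adjustment
    return score >= threshold

students = [
    {"need_score": 0.45, "disadvantaged_group": True},
    {"need_score": 0.45, "disadvantaged_group": False},
]
print([decide_support(s) for s in students])  # [True, False]
```

    The prediction stays the same for both students; only the decision rule encodes the distributive policy, which is exactly where, on this view, such criteria should live.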
