Itai Sher: “How Perspective-based Aggregation Undermines the Pareto Principle”. Précis by Fred D’Agostino

and I welcome you to the discussion of Itai Sher’s recent “How perspective-based aggregation undermines the Pareto principle” (free access through July 3rd here). To kick off the discussion, we have a précis from Fred D’Agostino and reply from Sher. Please join us in the discussion!

Critical Précis to “How perspective-based aggregation undermines the Pareto principle” by Fred D’Agostino

Preferences, Pareto, and Collective Judgment and Decision

What could be more natural, indeed compelling, than the notion that there is a determinate collective choice that can safely be made when any given individual chooser belonging to the collective is either indifferent about the alternatives or, if not indifferent, then agrees in their preference for one specific alternative with all others who are not indifferent? The indifferent choosers have nothing to complain of when the alternative is implemented that is universally preferred by those who have a “strict preference” for that alternative and, of course, those latter choosers are happy indeed that their personally preferred alternative has been identified as the social choice for this group.

Although I have worded this notion a little differently than Itai Sher does, this is, for our purposes, close enough to the Pareto Principle which it is his business to comment on in his recent PPE paper, “How perspective-based aggregation undermines the Pareto principle” (vol. 19.2 (2020), pp. 182-205).

Of course, we already know, and Sher himself documents in another paper (the prepublication version of “Comparative Value and the Weight of Reasons”, https://www.hbs.edu/faculty/Shared%20Documents/conferences/2016-newe/Sher_Reasons_HBS.pdf), that there must be more to a technology of collective choosing than the Pareto Principle, which, in order to make a crucial point a little more transparent, I might rephrase as follows:

If, among a group of individuals seeking a solution to a social choice problem, all are either indifferent among the alternatives or strictly prefer one specific alternative X (the same for all who aren’t indifferent), then the alternative X is the legitimate social choice for this group.

After all, there will be many cases – perhaps the vast majority of cases – when the antecedent of the conditional is not satisfied, e.g. when those not indifferent among the alternatives disagree among themselves about which of the alternatives is their preferred alternative. In this case, if all we had to guide choice was the Pareto Principle, we’d have to shrug our shoulders. And, obviously, anything that we developed that would enable us to identify a social choice from such a “muddled” collection of individual preferences would have to be based on additional machinery. All this is familiar enough.
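For readers who like to see such a conditional spelled out, here is a minimal Python sketch of the rephrased principle. The function name `pareto_choice` and the use of `None` to mark indifference are illustrative assumptions, not anything taken from Sher's paper; the sketch also assumes that universal indifference leaves the principle silent, since no specific alternative X is then picked out.

```python
from typing import Optional, Sequence

# Hypothetical encoding: None stands for "indifferent among the alternatives".
INDIFFERENT = None


def pareto_choice(preferences: Sequence[Optional[str]]) -> Optional[str]:
    """Return the alternative licensed by the rephrased Pareto Principle,
    or None when the antecedent of the conditional is not satisfied."""
    strict = {p for p in preferences if p is not INDIFFERENT}
    # The antecedent holds only if all non-indifferent choosers name the same X.
    if len(strict) == 1:
        return next(iter(strict))
    # Disagreement (or universal indifference): the principle gives no guidance.
    return None


print(pareto_choice(["X", None, "X"]))  # X
print(pareto_choice(["X", "Y", None]))  # None
```

The second call is the "shrug our shoulders" case: the function returns nothing rather than a choice, which is why additional machinery is needed for muddled preference profiles.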

What Sher undertakes to show is that even when the antecedent of the conditional is satisfied, the consequent does not follow. I.e. that there are situations where the Pareto-preferred alternative is not the legitimate social choice for the group (whose members’ preferences satisfy the antecedent).

In fact, Sher’s approach represents a challenge – to the Pareto Principle in precisely the circumstances in which it is applicable and might seem compelling – in two variables. First of all, Sher thinks that it is not individuals’ preferences for alternatives that should be “aggregated” to develop a social choice, but, rather, their “perspectives”, or the bases, in values, principles, beliefs, and the like on which those preferences might be formed. Secondly, Sher thinks, though he does not particularly stress, that the implied unanimity requirement of the Pareto Principle (e.g. that none of the choosers strictly prefers some alternative other than X) need not be applied uniformly in situations of collective choice.

Sher positions his work in relation to earlier work by Philippe Mongin (“Spurious unanimity and the Pareto principle”, Economics & Philosophy, vol. 32.3 (1997), pp. 511–532), who has argued, as Sher puts it (p. 183), “that when people hold the same preferences for different reasons, then the Pareto principle is not compelling.” He, Sher, adds (ibid.): “The current article argues that it makes more sense to aggregate overall perspectives than preferences and that doing so helps us to understand why the Pareto principle is not always compelling.”

Sher’s approach is multi-modal. He produces a general argument but illustrates and strengthens that argument by the presentation of specific cases, rather different among themselves (though sharing some abstract architecture), each of which is intended by him to show that the Pareto-endorsed social choice is not the obviously best social choice when we probe beneath the individual preferences that are “aggregated” via Pareto. The device for these various probes is (esp. p. 187) “The Judge”, who considers, as impartially as possible, all the choice-relevant facts that we might have about the individual choosers and their circumstances of choice and who will, according to Sher’s analyses, rightly be willing to overturn or at least caution against the Pareto-endorsed social choice in certain circumstances, thus demonstrating that social choice, even when the antecedent of the Pareto Principle is satisfied, does not rest on either preferences per se or unanimity with respect to them.

I will focus, because I am already familiar with it, on one of Sher’s case studies … that of the so-called “Discursive Dilemma” (see the section on “Judgment Aggregation”, pp. 192-193 and the references given there), which, in his presentation (though atypically in one respect), arises as follows and that already illustrates how we might “probe beneath” individuals’ preferences or overall judgments. (The “abstract architecture” of this particular case is shared, I think, by Sher’s other four cases, and some points of general significance are well illustrated in this case.)

Suppose that we have three individuals {A, B, C}. Suppose that each of these individuals will be asked to decide yay-or-nay on some particular matter. Suppose, and this is the “underside” of collective decision-making that we reach by “probing beneath”, that each of these individuals deploys three different criteria, {C1, C2, C3} to decide whether their overall individual preference will be for “Yes” or for “No”. (We can stipulate, and Sher certainly seems to be assuming, that all three individuals use the same three criteria.)

In order to develop the Discursive Dilemma, we need to consider a particular array of individual judgments about criterion satisfaction, as illustrated below. (I will return to the question of the nature of this array.) In particular, we posit that, even if they all apply the same criteria, they form different judgments about whether a given criterion is indeed satisfied. For example, in Table 1 below, each of A and C judges that criterion C1 is satisfied, while B judges that it is not.

Table 1: Pareto

	C1	C2	C3	preference
A	satisfied	satisfied	not satisfied	No
B	not satisfied	satisfied	satisfied	No
C	satisfied	not satisfied	satisfied	No
Group				No
horizontal aggregation by individuals of judgments of criterion satisfaction according to a principle of unanimity / vertical aggregation based on Pareto unanimity

I mentioned that each individual has, in this example, reached an overall preference on the yay-or-nay issue, based on their own individual aggregation of judgments of criterion satisfaction. Clearly, each individual in this example, aggregating “horizontally”, has used a unanimitarian principle in forming an overall yay-or-nay preference. So long as one of the three criteria is judged not to have been satisfied, any of these individuals will conclude that the “No” answer is strictly preferred by them to the “Yes” answer.

In this example, it is assumed that the social choice is made by aggregating “vertically” using a unanimitarian criterion over the individuals’ overall preferences. (We’d still get a “No” result even if we used a majoritarian principle of social aggregation, as is often assumed in discussions of the Discursive Dilemma, but of course, we are purporting to test a unanimitarian Pareto Principle in this situation.) Since every individual agent has reached the overall judgment that “No” is strictly preferred to “Yes”, the antecedent of the Pareto conditional is satisfied and the conclusion from applying the Pareto Principle is that the social judgment is “No” about the matter in question and that this alternative should be socially preferred to the “Yes” alternative.
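The Table 1 reasoning can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the dictionary `JUDGMENTS` transcribes the table, while the function names `individual_preference` and `pareto_group_choice` are hypothetical labels for the horizontal and vertical steps.

```python
# Judgments of criterion satisfaction transcribed from Table 1 (True = "satisfied").
JUDGMENTS = {
    "A": {"C1": True,  "C2": True,  "C3": False},
    "B": {"C1": False, "C2": True,  "C3": True},
    "C": {"C1": True,  "C2": False, "C3": True},
}


def individual_preference(judgments):
    """Horizontal step: "Yes" only if this individual judges every criterion satisfied."""
    return "Yes" if all(judgments.values()) else "No"


def pareto_group_choice(preferences):
    """Vertical step: the unanimous overall preference, or None if preferences diverge."""
    distinct = set(preferences.values())
    return distinct.pop() if len(distinct) == 1 else None


preferences = {name: individual_preference(j) for name, j in JUDGMENTS.items()}
print(preferences)                       # {'A': 'No', 'B': 'No', 'C': 'No'}
print(pareto_group_choice(preferences))  # No
```

Each individual rejects exactly one criterion, so each arrives at "No", and the unanimous "No" is what the Pareto step then returns.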

But should it be?, Sher asks (as indeed others have before him). And here is where the two elements – preferences and unanimity – are both put under pressure.

First of all, why shouldn’t we, indeed mustn’t we, look beneath the overall preferences of the individuals to their reasons for having these preferences? After all, their preferences are grounded in their judgments about criterion satisfaction and so it’s really those judgments that are driving the bus. And when we look at the criterion-satisfaction judgments, what we find is that, for each criterion, a majority of individuals does believe that that criterion has been satisfied. So, if we thought it reasonable to aggregate individuals’ judgments about a specific shared criterion according to a majoritarian principle (a thought that presupposes that all three individuals use the same criteria), we would have it, as shown below, that against each criterion the group judgment is positive and hence, whatever the mechanism for aggregating across different criteria (be it unanimitarian even), the overall group judgment is “Yes”, not “No”, as it was when (Table 1 above) preferences were the aggreganda.

Table 2: The Judge

	C1	C2	C3	preference
A	satisfied	satisfied	not satisfied	No
B	not satisfied	satisfied	satisfied	No
C	satisfied	not satisfied	satisfied	No
Group Judgment	Satisfied	Satisfied	Satisfied	*Yes*
horizontal aggregation of group judgments about criterion satisfaction using the Pareto unanimity principle / vertical aggregation of individuals’ judgments of criterion satisfaction using a majoritarian principle

The difference between the Table 1 and the Table 2 results embodies what has sometimes been called the Discursive Dilemma: We seem to have decent enough grounds, in unanimous preferences, to support a “No” conclusion (Table 1) and we also seem to have decent enough grounds, in individuals’ supporting reasons for having the preferences they have, to support a “Yes” conclusion (Table 2). There has been, and Sher cites (p. 192), much discussion of this dilemma, to which, indeed, I have myself made a small contribution, though from a different angle than is common. (See my discussion of theory-choice at §§6.1.3-6.1.5 of my book Naturalizing Epistemology, Ashgate, 2003.)
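For comparison, the Table 2 reasoning attributed to The Judge can be sketched as follows. Again this is an illustrative sketch rather than Sher's own formalism: `group_judgment` and `judge_verdict` are hypothetical names, and the data simply transcribe the table.

```python
# Judgments of criterion satisfaction transcribed from Table 2 (True = "satisfied").
JUDGMENTS = {
    "A": {"C1": True,  "C2": True,  "C3": False},
    "B": {"C1": False, "C2": True,  "C3": True},
    "C": {"C1": True,  "C2": False, "C3": True},
}
CRITERIA = ("C1", "C2", "C3")


def group_judgment(criterion):
    """Vertical step: the group counts a criterion as satisfied if a majority says so."""
    votes = [judgments[criterion] for judgments in JUDGMENTS.values()]
    return sum(votes) > len(votes) / 2


def judge_verdict():
    """Horizontal step over the group row: "Yes" only if every criterion is group-satisfied."""
    group_row = {c: group_judgment(c) for c in CRITERIA}
    return group_row, ("Yes" if all(group_row.values()) else "No")


group_row, verdict = judge_verdict()
print(group_row)  # {'C1': True, 'C2': True, 'C3': True}
print(verdict)    # Yes
```

The same nine individual judgments that yield three "No" preferences yield a "Yes" here, because the order of aggregation (across individuals first, then across criteria) has been reversed.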

Of course, a dilemma is a dilemma because or to the extent that we are torn between the two horns … in this case between a particular social choice mechanism that gives a “No” answer and a different mechanism that gives a “Yes” answer. And one way of resolving that tension might involve discussion of the conditions under which one of these mechanisms might more appropriately be deployed than the other. A horses for courses approach, as we say in Australia. (I will return to this possibility.) But Sher’s argumentation requires, of course, that the “Yes” answer be, in this case, the determinately better answer. Otherwise, he does not have a counter-example to the social decision mechanism based on the Pareto Principle, and hence doesn’t have a basis, at least in terms of this argument, for the additional and for him focal contention, that it’s perspectives (embodied in this case in the individuals’ criterial judgments) that are important, rather than (overall) individual preferences … and, in addition, though perhaps less importantly, that we don’t need to be unanimitarians about all aspects of choice in order to make correct choices. As he says (p. 183, emphasis added), “The current article argues that it makes more sense to aggregate overall perspectives than preferences and that doing so helps us to understand why the Pareto principle is not always compelling.” Not: It makes sense to recognize both as legitimate bases for social choice, perhaps in different circumstances. But, rather: It makes more sense to aggregate perspectives than to aggregate preferences (and to reject unanimity as a universal principle of aggregation).

So why, according to Sher, is “Yes” the determinately better answer in the situation giving rise to the so-called Discursive Dilemma? Here’s what Sher has to say (p. 193):

Along the lines discussed in the voluminous literature on the doctrinal paradox, it may be reasonable for the neutral judge, who aggregates the views of the agents, to accept all three conditions [C1, C2, C3] because a majority of agents accept each of these conditions. The judge may then accept the yes decision on the basis of accepting [that C1, C2, C3 have also been satisfied] and also principle (4).

(4) A yes decision should be made if and only if three conditions hold [C1, C2, C3].

He continues:

Thus, the judge may go against the unanimous preferences of the agents. Whether this is reasonable depends on the details of the case, but it could be reasonable for the judge to aggregate in this way. The judgment aggregation literature discusses the relevant considerations in much more depth than I can hope to do here.

Notice the hesitancies: “may be reasonable”; “may go against”; “could be reasonable”. These may be indicative of Sher’s own lack of confidence in this particular case for his conclusion. In particular, two points are pressing here.

First of all, the reasonableness of The Judge’s acceptance of “Yes” depends on the reasonableness of the (partial) substitution of a majoritarian for a unanimitarian approach to “aggregation”. This isn’t perhaps spotlighted because of the way Sher presents the reasoning. In particular, principle (4) is a unanimitarian one, so it might seem that nothing more needs to be justified in the Table 2 reasoning than in the case of Pareto-motivated decision-making illustrated in Table 1. But there is, in the Table 2 reasoning, a second principle of aggregation, call it

(5) The group judgment about criterion satisfaction is “Satisfied” if a majority of individuals judge the criterion to be satisfied and “Not Satisfied” otherwise.

This is, then, the majoritarian (not unanimitarian) principle that enables us to get from “both A and C think that C1 has been satisfied” to the positive social judgment about C1 satisfaction and thereafter to the overall social judgment in favor of the “Yes” answer. Without it, with, instead, a unanimitarian principle for criterion satisfaction, we’d get a “No” result from aggregating individuals’ judgments about criterion-satisfaction, from which we would then get a “No” result when we aggregated across criteria. And in this case, there would be no dilemma, since each of the two approaches to aggregation would give the same “No” answer … and that it makes “more sense” to aggregate perspectives and to abandon Pareto would therefore not be established.
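How much work principle (5) does can be made visible by treating the vertical rule as a parameter. The sketch below uses a quota, which is an assumption of this illustration and not a device from the paper: with three individuals, a quota of 2 corresponds to the majoritarian principle (5) and a quota of 3 to a unanimitarian alternative. The names `group_judgment` and `overall_verdict` are hypothetical.

```python
# The individual judgments from the tables above (True = "satisfied").
JUDGMENTS = {
    "A": {"C1": True,  "C2": True,  "C3": False},
    "B": {"C1": False, "C2": True,  "C3": True},
    "C": {"C1": True,  "C2": False, "C3": True},
}
CRITERIA = ("C1", "C2", "C3")


def group_judgment(criterion, quota):
    """The group judges a criterion satisfied iff at least `quota` individuals do."""
    supporters = sum(judgments[criterion] for judgments in JUDGMENTS.values())
    return supporters >= quota


def overall_verdict(quota):
    """Principle (4): "Yes" if and only if the group judges all three criteria satisfied."""
    return "Yes" if all(group_judgment(c, quota) for c in CRITERIA) else "No"


print(overall_verdict(2))  # quota 2 = majority, as in principle (5): Yes
print(overall_verdict(3))  # quota 3 = unanimity over each criterion: No
```

Only the quota changes between the two calls, so the divergence from the Table 1 result rests entirely on the choice of a majoritarian rule for criterion satisfaction.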

So what’s the rationale for the use of this majoritarian principle of (vertical) aggregation in Table 2?

One possibility involves an assumption of “objectivity” about judgments of criterion satisfaction. (This assumption is fully and explicitly in play in another of Sher’s cases, that of “Mutually inconsistent beliefs”, pp. 190-192.) There is, on this possible reading, an objective truth of the matter when it comes to whether or not C1 has been satisfied, and each of A, B, and C is to be understood as making a judgment about that objective fact. If there is any divergence in their judgments, then that by itself establishes, as a matter of logic, that one or more of them is wrong in the judgment that they have made. On this reading, then, where there is divergence of judgment, we must, if treating the matter objectively, be prepared to exclude one or more of the expressed judgments, for they can’t all be correct.

This gets us somewhere, I think. We are already at the point where we can, modulo the assumption of objectivity, criticize the Pareto approach (of aggregating overall individual preferences) on the grounds that it permits both correct and incorrect judgments, willy-nilly, to play a role in the apparatus of judgment aggregation. (For at least some preferences in Table 1 must, on this reading, be based on incorrect judgments about criterion satisfaction.) Should we be able to show that the perspective aggregation procedure does not have this defect, we would be still further along. Indeed, perhaps all the way to Sher’s claim that it makes “more sense” to use this procedure.

Well, certainly, The Judge does exclude some individual judgments, using a majoritarian principle of aggregation as the basis for doing so. And this may seem reasonable. Knowing nothing more than “Not everybody can be right if they disagree among themselves”, it is not unreasonable to adopt a majoritarian principle as a principle for eliminating mistaken judgments. What’s the (formal) alternative? Requiring unanimity would be safer, but, assuming that Pareto is supposed to work the same way in both cases, that throws us back into a situation where, because the antecedent of the conditional is not satisfied, we have no principled basis for choice. And I guess it makes sense, if we cannot demand unanimity, to prefer a majoritarian principle to whatever else (of a purely formal kind) there might be.

Unfortunately, while a majoritarian principle of exclusion is reasonable, it does not, indeed cannot, guarantee a priori that this approach to aggregation never uses incorrect individual judgments about supposed matters of fact, as the Table 1 approach must do. That remains, a priori, a risk. For, surely, it is not precluded that the majority view about criterion satisfaction might be wrong, and that the minority judgment is the correct one. After all, the bare act of excluding, say, B’s judgment about C1 satisfaction, C’s about C2 satisfaction, and A’s about C3 satisfaction (as is done in the Table 2 reasoning) implies that each of them is unreliable as a judge of objective matters of fact. And this recognition surely carries with it a frisson of anxiety about the standing of even the social judgments that are supported by a majority of individual judgments (by concededly fallible judges).
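The size of that residual risk can be illustrated with a standard binomial calculation, under assumptions that go beyond anything in the text: each judge is taken to be correct about a given criterion independently and with the same probability p. Neither assumption is argued for here, and the function name `majority_correct_probability` is hypothetical.

```python
from math import comb


def majority_correct_probability(p, n=3):
    """Probability that a strict majority of n judges is correct, assuming
    (hypothetically) that each is independently correct with probability p."""
    needed = n // 2 + 1  # smallest number of judges that forms a strict majority
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(needed, n + 1))


for p in (0.6, 0.8, 0.95):
    print(p, round(majority_correct_probability(p), 4))
# 0.6 0.648
# 0.8 0.896
# 0.95 0.9928
```

On these assumptions the majority of three does better than any single judge whenever p exceeds one half, but it falls short of certainty for every p below 1, which is the formal counterpart of the "frisson of anxiety".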

Notice, in addition, that, whatever might be reasonable in certain cases (as in Sher’s analysis of “Mutually inconsistent beliefs”, for instance), the idea that judgments of criterion satisfaction are about objective matters of fact is not an innocent assumption, so that the attempt to ground Sher’s use of a majoritarian principle may be hostage to this assumption. Certainly, there are situations where it is perfectly possible for all of A, B, and C to be “right” in their judgments about the satisfaction of the criterion C1. In particular, they might have different “thresholds” for criterion satisfaction without any of them being in error, objectively, about where “the threshold” is. Perhaps B’s is higher than the ones characteristic of A’s and C’s judgments about this matter; that’s why B says “Not Satisfied”, whereas A and C both say “Satisfied”. And that’s all there is to it. In which case, it’s not clear what the grounds are for aggregating such “subjective” judgments vertically according to a majoritarian principle (as in Table 2) while in effect disallowing their aggregation (as in Table 1) horizontally, individual by individual, by precisely the individuals whose judgments they are. (Notice that the Dilemma again disappears if the principle of horizontal aggregation in the Table 1 reasoning is a majoritarian one.)
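The parenthetical remark can be checked directly. In the sketch below the horizontal rule that each individual uses to form an overall preference is made a parameter; the rule labels and the function names `preference` and `pareto` are illustrative assumptions, and the data transcribe Table 1.

```python
# The individual judgments from Table 1 (True = "satisfied").
JUDGMENTS = {
    "A": {"C1": True,  "C2": True,  "C3": False},
    "B": {"C1": False, "C2": True,  "C3": True},
    "C": {"C1": True,  "C2": False, "C3": True},
}


def preference(judgments, rule):
    """Horizontal step: an individual's overall preference under the given rule."""
    satisfied = sum(judgments.values())
    if rule == "unanimity":
        return "Yes" if satisfied == len(judgments) else "No"
    if rule == "majority":
        return "Yes" if satisfied > len(judgments) / 2 else "No"
    raise ValueError(f"unknown rule: {rule}")


def pareto(preferences):
    """Vertical step: the unanimous preference, or None if preferences diverge."""
    distinct = set(preferences)
    return distinct.pop() if len(distinct) == 1 else None


for rule in ("unanimity", "majority"):
    prefs = [preference(j, rule) for j in JUDGMENTS.values()]
    print(rule, prefs, pareto(prefs))
# unanimity ['No', 'No', 'No'] No
# majority ['Yes', 'Yes', 'Yes'] Yes
```

With a majoritarian horizontal rule every individual already prefers "Yes", so the Pareto step agrees with The Judge and no dilemma arises.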

Secondly, there is a question, which Sher takes up in passing later in the paper (pp. 200-1) under the heading of “decision rights”. Let me put the matter this way. Consider A, B, and C, as presented in terms of their preferences and underlying judgments of criterion satisfaction. Suppose that we use Table 1 reasoning to determine a social choice and ask for the reflective endorsement of that choice by A, B, and C. Since each of them has formed the preference for “No” over “Yes”, it is hard to imagine any of them protesting that their “decision rights” have been trampled on account of the social choice being determined to be “No”. Now consider using Table 2 reasoning to determine a “Yes” answer to the question of overall social choice. In this case, it is hard to imagine how any of A, B, and C could give their reflective endorsement to The Judge’s understanding of this choice situation. (I will return to this point.) It is hard, at this point, not to worry that Sher has made the very petard by which he hoists himself when he says (p. 201): “I have done nothing in this article to argue that the best alternative should be implemented if this interferes with people’s legitimate decision rights.” Or, in other words, it does not “make more sense” to use The Judge than to aggregate preferences … at least when The Judge’s “verdict” cannot be reflectively endorsed by the members of the decisional collective.

I said, earlier, that I thought we were getting somewhere … that we could see why it might be reasonable to use a majoritarian principle of vertical aggregation of perspectival elements. But it’s not clear to me that we’ve gotten all the way to: And this majoritarian approach involving perspectives is strictly superior to the unanimitarian approach involving preferences, which, it seems to me, is what Sher has to establish if he is to establish that it makes “more sense” to aggregate across perspectives. Sher’s preference for Table 2 over Table 1 seems to be hostage in this particular case to

the possibility that variation in the application of shared criteria might itself be legitimate, and
the shakiness of the justification of majoritarian aggregation even when such variation is excluded by the assumption of objectivism about judgments of criterion satisfaction, and
the likelihood that there can be no reflective endorsement by participants of the conclusion of the Table 2 reasoning.

Having said all that, Sher’s analyses (like Mongin’s and List’s) do enable us to see the depth and complexity of individual judgment and collective decision-making. And, in fairness, I have skipped over four of Sher’s case studies, each of which is illuminating in its own right, and is therefore commended to anyone with an interest in social judgment and decision.

In any event, there certainly is, on the individual side, more to it than just preferences; individuals have all sorts of factual and evaluative commitments that are relevant to the formation of preferences and that, indeed, may well sometimes supersede even those preferences that they may be inveigled to express. As Sher says (p. 200), “people are not adequately characterized by their preferences”. There is no reason, except tactically in relation to specific contexts, to treat preferences as basic and unanalysable. And discussion of the Discursive Dilemma has contributed to that realization.

Certainly, there is, in relation to aggregation, a variety of possible mechanisms, as explored and analysed by social choice theory, for example. And, as my examination of objective versus subjective was meant to show, how these various mechanisms might be deployed and with what intent is undoubtedly heavily contextual.

This brings me to a point that I skipped past earlier, namely, that there is an alternative framework for thinking about these matters, which I earlier designated as a horses for courses approach. Let me say a bit more. In doing so I will draw on a point Sher himself made in the prepublication version of the paper “Comparative Value and the Weight of Reasons”, where he says (p. 8, emphasis added):

Perhaps, the notion of aggregating people’s objectives was misguided to begin with. We might want to adopt an alternative conception according to which respecting people’s preferences and objectives is not to be done by aggregating them, but rather by setting up procedurally fair institutions that do not unnecessarily interfere with people’s choices, that distributes rights and freedoms according to certain principles, that make collective decisions democratically, and that encourage discussion and debate, so that people can persuade one another. The thought that the attitudes that we most want to respect in making social decisions are individual judgments backed by reasons might contribute force to this procedural alternative to aggregation.

This seems to me eminently reasonable. Indeed, I take it that the whole panoply of cases and more abstract argumentation that Sher deploys points squarely if perhaps unintentionally in this direction.

Let’s go back to the question of reflective endorsement. There are two ways in which we might seek it, and I have traded on one of them (perhaps unfairly) in reaching the (interim) conclusion that A, B, and C, in Sher’s (Discursive Dilemma) case study, could not or would not reflectively endorse the overriding (?) decision of The Judge. In particular, I have assumed that we just put it to them that “The Judge” has decided that they should really prefer “Yes” to “No”. But what if, instead of such a bald suggestion, we exposed them all to the perspectival underpinnings of their own preferences and invited them to discuss the relevant matters with one another?

For aggregation, we would substitute deliberation, where each of A, B, and C understood all the relevant facts about the underlying decisional elements and was prepared to discuss these facts with their fellows. Indeed, as someone who has run a lot of meetings where collective decisions have to be reached without too much residual grumpiness, I can report that such discussions can often be productive. Once B understands that both A and C have reached a different conclusion about C1 satisfaction than she has, she can reflectively reconsider that question. And, indeed, so can they, based on B’s “minority opinion”. Perhaps, in an effort to “resolve” matters, they will probe even more deeply to yet another underlying level of factual and evaluative elements that are relevant to judgment and decision in this situation, considering, for instance, the grounds for B’s minority judgment. And they will themselves either agree (or not) on the principles of aggregation, integration, synthesis or whatever … reflectively endorsing unanimitarian or majoritarian or indeed some other principle on a horses for courses basis. Sometimes, when considering the matter reflectively, a decisional collective will decide (quite properly given the circumstances) on unanimitarian aggregation of given preferences. Sometimes they will decide (and again properly) on the majoritarian aggregation of perspectives. Sometimes they will decide in one of these ways or another only after they have reconsidered their ex ante perspectives or preferences as a result of collective discussion. And so on. Indeed, commonly recurring situations may give rise, and empirically have given rise, to institutionalization of decisional technologies judged to be well adapted to those recurring situations. So, for example, on certain matters and in certain situations, we vote, using preferences and majoritarian decision mechanisms. In others, we deliberate, probing our individual perspectives, allowing them to be modified upon the discovery of other perspectival elements, and so on until some decisional equilibrium is reached. And so on.

Sher set out to undermine the use of the Pareto Principle even in those circumstances in which the antecedent of the conditional is satisfied. I’m not sure that the Discursive Dilemma case study that I’ve considered is adequate to that aspiration. But I am sure that Sher has pointed us towards a different way of thinking about collective judgment and decision making, one which he himself sketched in an earlier paper. It’s understandable that this is “the road not taken”. It’s the road that doesn’t lend itself as readily as social choice mechanisms do to what I earlier called “formal” analysis. Indeed, once we start thinking things through and adding facts about human cognitive capacities and the social psychology of deliberation (see my Naturalizing Epistemology, Palgrave, 2010), the matter of effective collective decision making becomes highly “empirical”; we leave the domain of proof and the case studies multiply and become internally complex and specific. But that, I think, is a road also worthy of exploration.

Fred D’Agostino, The University of Queensland

Reply to D’Agostino by Itai Sher

I want to thank Fred D’Agostino for his excellent commentary on my paper. In the opening section and elsewhere in his comment, D’Agostino describes the aim of my paper in illuminating terms, and the comment gives me some challenging questions to answer.

I would split D’Agostino’s reply to my article into three parts. First D’Agostino describes what my paper does. Then D’Agostino criticizes my analysis, specifically with regard to the example of the discursive dilemma. Then D’Agostino discusses a procedural alternative to aggregation that I raised briefly in the closing section of my paper, and which I also discussed in a previous paper.

Below, I will start by providing some background on some of the main ideas in my paper. Then I will discuss D’Agostino’s criticism with respect to the discursive dilemma. I will conclude with the procedural alternative to aggregation.

It seems to me that overall, D’Agostino and I don’t view the issues involved in the paper very differently, but we have a narrow disagreement about how the discursive dilemma fits into my argument and the cogency of my appeal to it.

Background: The Pareto Principle and the Judge’s Aggregation Problem

Let me start by outlining some of the main ideas in my paper that D’Agostino’s comments appeal to. First, let me give a statement of one version of the Pareto principle.

Pareto Principle. If everyone prefers X to Y, then X should be socially preferred to Y.

Another version of Pareto allows, in the antecedent, that some people may be indifferent between X and Y, but those who care prefer X.

The Pareto principle is a very important principle in normative economics, and part of the point of my paper is to criticize it. I should note that I am not, by any means, the first person to criticize Pareto, but it is a principle that economists are very reluctant to abandon.

In my paper, I imagine a judge who makes a judgement about whether an alternative X should be socially preferred to another alternative Y, knowing not only people’s preferences, but also broader facts about their perspectives. I take a person’s perspective to include their principles and values, their reasons for holding their preferences, their attitudes, their beliefs, their background knowledge, and so on. The judge is supposed to have a minimal attitude with regard to imposing her own substantive views and is supposed to try to base her judgements on the views and attitudes of the members of the group.

After deliberating, the judge will declare either that X is better than Y or that Y is better than X.

An important point to make about the judge is that the judge is viewed simply as making an evaluative judgement. The judge is not necessarily viewed as having the legitimate authority to implement the judgement, although it is possible that someone with legitimate authority might consult the judge. This distinction between an evaluative judgement and the legitimate authority to decide will be important when I discuss the procedural vs aggregation approaches below.

Suppose that the judge accepted the Pareto principle. Then one might expect the judge to adopt the following policy:

The Pareto Policy. If everyone prefers X to Y, then I (the judge) will declare that X is better than Y from the standpoint of the group.

One of the main points that I would like to emphasize in responding to D’Agostino’s comments is that if the judge adopts the Pareto policy, then in every situation in which everyone prefers X to Y and the judge knows this, all other considerations, such as the agents’ reasons, justifications, beliefs, etc., become irrelevant to the judge’s evaluative decision of whether to declare X better than Y. Thus, for the purpose of deciding whether to declare that X is better than Y, the judge need not even know anything about the broader aspects of the agents’ perspectives; when there is preference unanimity, knowing of this unanimity is sufficient.

My aim in the paper is to argue that the judge should not adopt the Pareto policy because information about agents’ broader perspectives is not irrelevant even when agents have unanimous preferences. This means first that the judge will consider information about the agents’ perspectives other than their preferences, even when their preferences are unanimous. Second, it means that considering this information will sometimes make a difference. That is, sometimes, this additional information will be sufficient to overturn the unanimous preferences of the agents in the judge’s verdict.

One might think that my conclusion, if true, will have limited importance because (i) people rarely have unanimous preferences—so Pareto is usually vacuous, and (ii) when they do have unanimous preferences, one would expect that usually the judge will make judgements in conformity with these unanimous preferences. As D’Agostino points out, because of (i), the Pareto principle has limited scope, and other principles are needed to make judgements when preferences are not unanimous.

However, I believe that the question is important, and its significance extends beyond the situation in which preferences are unanimous. First, the existence of potential Pareto improvements is more common than unanimity in pairs of alternatives under consideration, and these potential Pareto improvements restrict the overall social objective if one accepts the Pareto principle. Second, and related, Sen, in his article “Utilitarianism and Welfarism”, and Kaplow and Shavell, in their article “Any Non-Welfarist Method of Policy Assessment Violates the Pareto Principle”, both argued that the Pareto principle crowds out any non-preference-based moral considerations in social evaluations, such as those having to do with fairness, rights, or liberty. I think that a similar consideration applies in the context of my paper: the Pareto principle can crowd out considerations of people’s perspectives other than their preferences. If we say that, in the context of unanimous preferences, other aspects of people’s perspectives (their reasons, values, or principles) cannot ever be the basis to overturn unanimous preference, then this suggests that considerations of preference lexicographically dominate considerations coming from people’s reasons, values, or principles. In a section of my paper entitled “Against the two-step procedure”, I argue that when we represent people’s perspectives as being richer than just their preferences, then a commitment to Pareto is naturally accompanied by an aggregation procedure that only aggregates reasons, principles, and values through their influence on individual preferences, as opposed to one that considers these principles, reasons, values, etc., on their own, independently of preferences. This aggregation procedure has consequences beyond the case in which preferences are unanimous.

For the discussion below, it is also important to note that to refute the Pareto principle, it is sufficient to find a case in which X is unanimously preferred to Y but X is either socially indifferent to Y, or X is socially incomparable to Y; neither of these situations is consistent with the Pareto principle.

D’Agostino’s Criticism of My Use of the Discursive Dilemma

In my paper, I present a variety of examples, to illustrate a variety of ways in which broader aspects of a person’s perspective, and not just their preferences, can be relevant to aggregation. In his comment, D’Agostino focuses on the well-known example of the discursive dilemma from the judgement aggregation literature. The discursive dilemma is only one of several cases I consider in the paper, and I do not mean to rest my position on this example alone.

Recall that, in the example of the discursive dilemma in my paper, there are three agents and three premises C1, C2, and C3, and each agent believes that the conjunction of the three premises is necessary and sufficient for a Yes answer on the overall Yes-No decision. For each premise, two out of the three agents accept it, but no agent accepts all three premises, so they all prefer a No answer on the overall question. With reference to that example, D’Agostino writes:

Sher’s argumentation requires, of course, that the “Yes” answer be, in this case, the determinately better answer. Otherwise, he does not have a counter-example to the social decision mechanism based on the Pareto Principle, and hence doesn’t have a basis, at least in terms of this argument, for the additional and for him focal contention, that it’s perspectives (embodied in this case in the individuals’ criterial judgments) that are important, rather than (overall) individual preferences.

As I will explain below, I do not agree with D’Agostino’s assessment that my argument requires a determinate Yes answer.

Presumably, if I were to show that there was a determinate Yes answer, I would also have to show that the procedure that the judge should use is

(1) to decide on the truth of the premises C1, C2, and C3 by majority judgement, and then use the commonly shared view among the agents that the three criteria are necessary and sufficient for a Yes decision to conclude that the answer should be Yes.

The reason that I would have to show that the judge should use this particular procedure is that there is no other obvious procedure that leads to a Yes answer.
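As a minimal sketch of the contrast at issue, the following compares voting on the overall conclusion with voting premise by premise. The specific profile (which agent rejects which premise) is an assumption made for illustration; the text fixes only that each premise is accepted by two of the three agents and that no agent accepts all three. The function names are likewise hypothetical.

```python
# Assumed profile: each premise has two supporters, and no agent accepts all three.
profile = {
    "agent 1": {"C1": False, "C2": True, "C3": True},
    "agent 2": {"C1": True, "C2": False, "C3": True},
    "agent 3": {"C1": True, "C2": True, "C3": False},
}
PREMISES = ("C1", "C2", "C3")


def individual_verdict(judgements):
    """An agent favours Yes exactly when they accept every premise."""
    return all(judgements[c] for c in PREMISES)


def conclusion_based(profile):
    """Majority vote over the agents' overall Yes/No verdicts."""
    yes_votes = sum(individual_verdict(j) for j in profile.values())
    return 2 * yes_votes > len(profile)


def premise_based(profile):
    """Majority vote on each premise; Yes when every premise passes."""
    n = len(profile)
    return all(
        2 * sum(j[c] for j in profile.values()) > n
        for c in PREMISES
    )


conclusion_yes = conclusion_based(profile)
premise_yes = premise_based(profile)
print("conclusion-based:", "Yes" if conclusion_yes else "No")
print("premise-based:", "Yes" if premise_yes else "No")
```

Under this assumed profile the conclusion-based vote returns No (unanimously), while the premise-based vote returns Yes, which is the divergence the example turns on.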

As I mentioned above, I disagree that it is the burden of my argument to show that the answer is determinately Yes in the discursive dilemma. In the context of my overall article, the discursive dilemma functions as just one of several varied examples in which the overall perspectives of the agents, and not just their preferences, are relevant to the judge’s decision, even when preferences are unanimous. What I need for my argument is not that, in every example that I present, the judge should determinately decide against the unanimous preference. My aim in this article is rather to show that aspects of agents’ perspectives other than just their preferences should be considered, that other aspects of those perspectives can be relevant in a variety of ways and in a variety of contexts, and that a more holistic aggregation of agents’ overall perspectives will lead us to judge that a decision going against unanimous preferences will, in some instances, be better from the perspective of the group.

In fact, I do not believe, nor do I say in my article, that Yes is determinately the better answer in this case. It seems to me that it could be reasonable for the judge to use procedure (1), but it could also be reasonable for the judge to go with the unanimous preference of No for the overall decision. Perhaps more information about the situation might shed light on which of these two procedures is superior. Indeed, in a passage that D’Agostino quotes, I say exactly this (that it is ambiguous) about the judge going against unanimous preference: “Whether this is reasonable depends on the details of the case, but it could be reasonable for the judge to aggregate in this way.” For example, if we knew more about how and why the agents came to their conclusions on C1, C2, and C3 and the reliability of the procedure that they used, about whether they appealed to shared or disjoint evidence in doing so, and about their own attitudes about the reasonableness of aggregating views on the premises via taking the majority opinion of the agents, then we might be able to come to a determinate conclusion about whether the judge should go against the unanimous preferences of the group.

Notice that if one accepts the Pareto principle, and thinks that the judge should follow the Pareto policy, then all of the above information is irrelevant once the unanimous overall preference for No is known. It is irrelevant to the judge’s decision that the different agents arrived at their preferences by aggregating premises C1, C2, and C3; it is irrelevant to the judge’s decision that a majority vote on the premises would yield true for each, and that all subscribe to the principle that the premises are jointly necessary and sufficient for a Yes decision. Moreover, the additional facts that I mentioned in the preceding paragraph, about how and why the agents came to their views on the premises and their attitudes about different procedures for aggregating their opinions in the case that they disagree, would be irrelevant.

If one thinks that the additional information beyond the unanimous final preference for No over Yes is relevant in this case, then this already casts serious doubt on the Pareto principle and the Pareto policy, because if Pareto is granted, then such information would always be irrelevant for the judge’s verdict once the unanimous No preference is known.

There are, moreover, many variants of the discursive dilemma. Notice, for example, that in the particular example discussed in my paper and by D’Agostino, there is actually more than a majority for each of the premises; there is a supermajority of 2/3 for each premise. One might imagine that the rule is that we do not overturn a unanimous No decision unless there is a 2/3 supermajority in favor of each of the premises, and that a simple majority is not inherently sufficient. One could even impose a higher threshold than 2/3 to overturn a unanimous decision and construct a similar example in which this higher threshold were met. And one could vary or elaborate on any other features of the example. A proponent of Pareto would have to hold that in all such cases, we would go with the unanimous No decision on the outcome.

Before proceeding to a specific elaboration of the discursive dilemma according to which a Yes answer seems particularly plausible, I would also like to mention that D’Agostino’s contrast between a unanimitarian attitude to the premises (all must be true for a Yes decision) and a majoritarian attitude to the conditions under which the judge takes a premise to be true is a bit of an apples-to-oranges comparison. There may be independent reasons to require the conjunction for a Yes answer (see, for example, the reason given in the elaboration of the example that follows), and the procedure that we use to assign a truth value to each individual conjunct is a different kind of matter.

Also, it is important to distinguish being “unanimitarian” in the sense of one agent requiring the truth of all three premises for a Yes judgement from being “unanimitarian” in the sense of requiring unanimous preferences among the three agents to form a social preference to the same effect.

I will now attempt to provide a more detailed elaboration of the example, adding details that strengthen the case for a Yes decision by the judge. As I explain below, even with these additional details, I think that it is not completely determinate what the judge should do.

Let us suppose that the decision in question is whether or not to convict a person of a crime, and the three agents are jurors. Suppose that there is a statute that says that the crime has been committed if and only if conditions C1, C2, and C3 occur. This explains why each of the three agents will prefer a conviction (a Yes answer) only if all three occur.

Suppose that initially the probability that C1 is true is .5. Likewise, the probability that C2 is true is initially .5 and the probability that C3 is true is .5. Suppose also that these three probabilities that C1, C2, and C3 are true are independent.

Let us focus first on C1. Suppose that each agent does not directly know whether C1 is true or false, but each agent sees a signal that says either “C1 is true” or “C1 is false”. This signal is not perfect. Conditional on C1 being true, agent 1 sees “C1 is true” with probability p and sees “C1 is false” with probability 1-p. On the other hand, conditional on C1 being false, agent 1 sees “C1 is false” with probability p, and sees “C1 is true” with probability 1-p.

Now, using Bayes’ rule, one can show that the probability that agent 1 should assign to C1 being true if he sees that “C1 is true” is also p.
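The Bayes’ rule step can be written out explicitly. The notation s₁ for agent 1’s signal about C1 is introduced here for illustration and does not appear in the original; the prior of .5 and the signal accuracy p are as stated above.

```latex
\[
\Pr\bigl(C_1 \mid s_1 = \text{``C1 is true''}\bigr)
= \frac{\Pr\bigl(s_1 = \text{``C1 is true''} \mid C_1\bigr)\,\Pr(C_1)}
       {\Pr\bigl(s_1 = \text{``C1 is true''} \mid C_1\bigr)\,\Pr(C_1)
        + \Pr\bigl(s_1 = \text{``C1 is true''} \mid \neg C_1\bigr)\,\Pr(\neg C_1)}
= \frac{p \cdot 0.5}{p \cdot 0.5 + (1 - p) \cdot 0.5}
= p .
\]
```

The equal prior is what makes the posterior coincide with the signal accuracy: the factors of .5 cancel, leaving p / (p + (1 − p)) = p.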

Suppose that for each of C2 and C3, agent 1 also sees a similar signal, each of which features the same probability p. The signal for C1 is only influenced by the truth or falsity of C1 and is independent of C2 and C3. Likewise, the analogous fact holds for the signals for C2 and C3. If the agent sees the three signals “C1 is true”, “C2 is true”, and “C3 is true”, the agent should conclude that the probability of the conjunction of C1, C2, and C3 is p³. Now p³ approaches 1 as p approaches 1, so if p is sufficiently close to 1, then seeing the three signals “C1 is true”, “C2 is true”, and “C3 is true” would be a reasonable basis for the agent to conclude that the conjunction is likely true. So in this case, the agent will believe that all three conditions are likely true and would want to convict (a Yes choice). Again, if p is sufficiently close to 1, then if the agent sees the signal that one of the conditions is false, the agent can conclude that the condition is probably false, and hence will want to acquit (a No choice).

Agents 2 and 3 form beliefs in a similar way. We assume that they see signals, also involving the same probability p, and that both conditional on C1 being true and conditional on C1 being false, the signals seen by the three agents are independent of one another. Likewise for C2 and C3. All three agents will come to their decisions in the same way.

We may imagine that this is the procedure according to which the agents’ judgements in Tables 1 and 2 in D’Agostino’s comment (also the table on p. 192 of my article) are formed. Let us suppose that it turns out that each of the three agents has seen two positive signals and one negative signal, and each agent accepts two of the three conditions, but no agent accepts all three, as in the tables. If p is close to 1, this is an unlikely event, but it is possible.

Let us look at things from the standpoint of the judge. Suppose that the judge knows the opinions of all the agents, and they are as in the table.

One can show that if the judge uses Bayes’ rule, and the signals that the agents have received are as recorded in the tables, then the judge should also assign a probability p to each of the conditions C1, C2, and C3 and p³ to their conjunction. (This probability ends up being so simple because of the simplicity and symmetry of the example.)
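The claim can be checked with exact arithmetic. The sketch below is illustrative only: the function name `posterior` and the value p = 9/10 are assumptions chosen for the example, and the computation relies on the stated setup (prior .5 for each condition, signals correct with probability p, all signals conditionally independent, and two positive signals plus one negative signal on each condition).

```python
from fractions import Fraction


def posterior(p, positives, negatives, prior=Fraction(1, 2)):
    """Posterior that a condition is true given independent signals,
    each correct with probability p, of which `positives` report
    "true" and `negatives` report "false"."""
    like_true = p**positives * (1 - p) ** negatives
    like_false = (1 - p) ** positives * p**negatives
    return like_true * prior / (like_true * prior + like_false * (1 - prior))


p = Fraction(9, 10)  # illustrative accuracy; any 0 < p < 1 behaves the same way

# A single agent who sees one positive signal on a condition.
agent_posterior = posterior(p, positives=1, negatives=0)

# The judge, who sees two positive signals and one negative signal on it.
judge_posterior = posterior(p, positives=2, negatives=1)

# Independence across C1, C2, C3 lets the judge multiply the three posteriors.
judge_conjunction = judge_posterior**3

print(agent_posterior, judge_posterior, judge_conjunction)  # 9/10 9/10 729/1000
```

One positive and one negative signal of equal accuracy cancel, so the judge is left with the evidential force of a single positive signal on each condition, which is why the posterior is again p.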

As p approaches 1, the probability p³ that the judge should assign to the conjunction of C1, C2, and C3 also approaches 1 (given the configuration of the table). So, if p is sufficiently close to 1, the judge should assign a high probability to the conjunction of C1, C2, and C3. Then, given that (A) the agents all accept that the conjunction of C1, C2, and C3 is sufficient for conviction and (B) given his information, the judge can conclude that the probability of this conjunction is close to 1, it seems that the judge would be justified in supporting a vote to convict, and so in selecting Yes.

So, in this way of spelling out the discursive dilemma, it seems that the judge should determinately choose Yes.

There is however an important objection: If the agents themselves had all the information that the judge has—that is, if they knew not only their own signal, but also the signals of the other agents, then they would share the judge’s assessment, and hence they would all want to convict, so that the judge’s decision would not violate their unanimous preference. The most relevant preferences for the purpose of the example are the preferences that the agents would have if they had the same information that the judge has, not the preferences they would have on the basis of only their own signals. This is related to a point that D’Agostino makes at the end of his comment.

To avoid this problem, let us add a twist to the example. Suppose that each agent has confidence in their own signal, but not in the signals of the others. The agents do not really respect each other. Reinterpret what I have described above as pertaining not to the objective probabilities associated with the signals, but the model of probabilities that each agent has for their own signals. Each agent believes that the other agents receive signals that are almost completely uninformative—to be precise, the signals of the others are regarded as very slightly informative—and that the other agents mistakenly interpret their own signals as being highly informative. So agent 1 believes his own signal is highly informative, but believes that the signals of agents 2 and 3 are only very slightly informative; and symmetrically from the standpoint of the other agents.

Let us assume that each agent even knows what signals the others have received (i.e., whether they have seen “C1 is true” or “C1 is false”, etc.), but discounts those other signals so highly that they effectively base their judgements only on their own signal. Suppose finally that the agents all still accept all of the statistical independence assumptions made above, and so the judge adopts those as well.

The sort of situation described above is, in one respect, not an uncommon occurrence: people are often highly confident of their own information, while dismissing the information of others.

Now what does the judge do? There is some indeterminacy here, but suppose that the judge decides to weigh heavily each of the agents’ views about their own information, and discount their dismissal of the other agents. Then the judge may reason just as above and come to a Yes conclusion which is contrary to the unanimous preference of No for all the agents. And now there is no objective fact that the judge knows that the other agents do not know: Everyone knows what signals everyone else has received.

I concede that this case is not completely conclusive, because it rests on the attitude that the judge takes to the fact that the agents dismiss one another’s signals; the judge might instead have taken this dismissal more seriously. And judging in this way may be in conflict with the criterion of reflective endorsement of participants that D’Agostino brings up in his comment. But, at the same time, I think that treating each agent as an authority on their own information and not the information of others is reasonable, and so this way of elaborating the discursive dilemma example does present a strong case for an answer of Yes against the agents’ unanimous No.

I have elaborated the details of the discursive dilemma in one way such that I think it would be reasonable for the judge to select Yes. But suppose that we conclude that, either in this way of elaborating the case or in another, neither a No nor a Yes answer is determinately better. Suppose we conclude that both a No and a Yes verdict are permissible for the judge because each has a reasonable justification and it is indeterminate which one is better. This conclusion itself would amount to a counter-example to Pareto. That is because Pareto makes it obligatory to choose No (in an evaluative judgment of social betterness) when preference for No is unanimous, and we are concluding that instead both Yes and No are permissible, so that No is not obligatory, contrary to Pareto.

In my paper, I have also considered a number of other types of examples. I refer the reader to the first example in the section entitled “Higher Order Beliefs”, involving Ann and Bob (p. 188). In that example Ann is altruistic and Bob is not; Bob recognizes that others are morally important but does not factor this abstract ethical recognition into his preferences. There are two allocations of money, one which favors Ann, and another which favors Bob. Ann is indifferent because she weighs both her and Bob’s interests equally. Bob, on the other hand, prefers the allocation that favors him because he only takes his own interests into account when forming his preferences. Since Bob prefers the allocation that favors him, and Ann is indifferent, a version of the Pareto principle—a slightly different version than the above—says that the allocation that favors Bob should be socially preferred. However, I think there is no reason for the judge to prefer this allocation; if the allocation favoring Bob were preferred by the judge on the basis of the agents’ preferences, it would only be because Ann is considerate toward Bob and Bob is not considerate toward Ann, and that is not a good reason to favor the allocation preferred by Bob. So, I think that the judge should be indifferent between the two allocations. But this contradicts the version of the Pareto principle which says that if everyone is indifferent or prefers X to Y, and someone prefers X to Y, then X should be socially preferred. The example is presented in more detail on p. 188 of my paper, and, on p. 189, I present a slightly more complicated version of the example that gets rid of the indifference so that all agents have a strict preference, and in which the judge has a strict preference contrary to the unanimous strict preference of the agents. 
So I believe that this example—while not being a version of the discursive dilemma—satisfies the request that D’Agostino made in that it is a case in which I believe that the judge should determinately have an attitude that is at odds with what is recommended by Pareto.

The paper also contains a number of other examples illustrating these themes.

With regard to the paper’s main themes, I think there is a great deal about which D’Agostino and I agree. D’Agostino seems to agree with me that in the discursive dilemma (and presumably other settings) how we ought to aggregate the agents’ attitudes is highly contextual. Also, D’Agostino writes,

there certainly is, on the individual side, more to it than just preferences; individuals have all sorts of factual and evaluative commitments that are relevant to the formation of preferences and that, indeed, may well sometimes supersede even those preferences that they may be inveigled to express … There is no reason, except tactically in relation to specific contexts, to treat preferences as basic and unanalysable.

I strongly agree, and this in itself is a large part of the message of my paper.

Aggregation vs Procedural Approaches

D’Agostino refers to a quote from an earlier article in which I express skepticism about aggregation and suggest that instead of focusing on aggregation we should focus on procedurally fair institutions, in particular involving deliberation. This is consonant with the concluding section of my paper, in which I distinguish the aggregation exercise from the exercise of assigning decision rights. Above, I emphasized that the judge does not necessarily have the legitimate authority to make decisions, so that he cannot impose his view of the decision that best represents the group on the group against their wishes.

D’Agostino seems to endorse the procedural approach over the aggregation approach.

I, like D’Agostino, do have some skepticism about the aggregation exercise, but I think it is worth thinking about. There are really two questions that we can ask.

1. Which decision is best for a group taking into account people’s attitudes and views?
2. Which institutions should we adopt to actually make collective decisions in the face of disagreement?

The first question is one that we might discuss when we don’t have the power to impose a decision. It is, in my view, a question that can be worth discussing. However, the fact that we come to some conclusion on 1 does not automatically give us authority to impose that answer on people. Ultimately, legitimate authority to make decisions exists in the context of some set of institutions—institutions which ideally could amount to the answer to 2. If we have democratic institutions that involve public deliberation, then proposed answers to 1 could feature in the public debate. Indeed, as D’Agostino points out, debates about 1 would typically not occur just in the mind of a single “judge”, but would rather involve the various members of the group as they attempt to persuade one another. Note, finally, that 2 could itself fruitfully be the subject of abstract debate even when there is no authority to act.

So I think both questions are valuable but distinct.

In my view, it is a flaw of economics that, at a normative level, these two questions are not always conceptually distinguished in the correct way. It is sometimes assumed that there is some social objective, possibly in the form of a social welfare function, or criteria such as efficiency, that institutions and policies should be chosen so as to maximize this objective or satisfy these criteria, and that there is no separate normative question of the legitimate authority to decide. That is wrong; it is important to think about the latter question.

Focusing on the procedural level does not engage with the Pareto principle in the same way as does the aggregation exercise, because endorsing a procedure or an institution is not tantamount to judging that one outcome is better than another. If X is the outcome of an institutional arrangement that we endorse, that does not automatically amount to a revealed preference for X over Y, where Y is some feasible alternative that was not chosen.

There is however an indirect way in which the choice of institutions may conflict with Pareto. Suppose that there are two possible institutions A and B and two possible outcomes X and Y. It may be that A is a fairer and more legitimate procedure than B, A would lead to outcome X, B would lead to outcome Y, and Y Pareto dominates X. That is, an institution that we may regard as procedurally better may lead to an outcome that we regard as worse from the standpoint of preferences than an outcome that would have resulted from another less fair or legitimate institution.

In his article D’Agostino refers to this procedural approach as a “road not taken”. It is true that my paper focuses on the aggregation approach, but I will mention that an earlier draft of the paper had a much more extensive discussion of the procedural approach and it is an avenue that I think is very much worth exploring.

Finally, I do not agree with D’Agostino’s methodological conclusions about the procedural as opposed to the aggregation approach.

It’s the road that doesn’t lend itself as readily as social choice mechanisms do to what I earlier called “formal” analysis. Indeed, once we start thinking things through and adding facts about human cognitive capacities and the social psychology of deliberation (see my Naturalizing Epistemology, Palgrave, 2010), the matter of effective collective decision making becomes highly “empirical”; we leave the domain of proof and the case studies multiply and become internally complex and specific.

I think procedural considerations are amenable to formal methods. Social choice theory can be regarded as being about aggregation and social welfare, but voting theory is part of social choice theory and voting mechanisms are decision procedures. Axioms such as anonymity (voters are treated symmetrically) and neutrality (candidates are treated symmetrically) can be thought of as procedural fairness requirements. Game theory in general is applied to analysing institutions (although sometimes with the normative distortion I mentioned above). Moreover, assumptions about cognitive capacities and social psychology can be and are modelled in economic theory.

I am a methodological pluralist and I believe that when we are studying institutions, philosophical argumentation, formal modelling, and empirical methods and evidence are all important, and interact in interesting and fruitful ways.

To sum up, I am very grateful to D’Agostino for his commentary on my paper. I found it to be very insightful and thought provoking, and it caused me to reflect more deeply on the underlying issues.
