#### Getting the Mode of Categories with a Formula

I have pointed out that Excel’s MODE() function does not work when you supply it with text values as its arguments. Here is a method for getting the mode using a worksheet formula. It tells you which text value occurs most often in your data set. You’ll also see how to enter a formula that tells you how many instances of the mode exist in your data.

If you don’t want to resort to a pivot chart to get the mode of a group of text values, you can get their mode with the formula

=INDEX(A2:A21,MODE(MATCH(A2:A21,A2:A21,0)))

assuming that the text values are in A2:A21. (The range could occupy a single column, as in A2:A21, or a single row, as in A2:Z2. It will not work properly with a multi-row, multicolumn range such as A2:Z21.)

If you’re somewhat new to Excel, that formula isn’t going to make any sense to you at all. I structured it, I’ve been using Excel frequently since 1994, and I still have to stare at the formula and think it through before I see why it returns the mode. So if the formula seems baffling, don’t worry about it. It will become clear in the fullness of time, and in the meantime you can use it to get the modal value for any set of text values in a worksheet. Simply replace the range address A2:A21 with the address of the range that contains your text values.

Briefly, the components of the formula work as follows:

The MATCH() function returns the position in the array of values where each individual value first appears. The third argument to the MATCH() function, 0, tells Excel that in each case an exact match is required and the array is not necessarily sorted. So, for each instance of Ford in the array, MATCH() returns 1; for each instance of Toyota, it returns 2; for each instance of GM, it returns 4.

The results of the MATCH() function are used as the argument to MODE(). In this example, there are twenty values for MODE() to evaluate: some equal 1, some equal 2 and some equal 4. MODE() returns the most frequently occurring of those numbers.

The result of MODE() is used as the second argument to INDEX(). Its first argument is the array to examine. The second argument tells it how far into the array to look. Here, it looks at the first value in the array, which is Ford. If, say, GM had been the most frequently occurring text value, MODE() would have returned 4 and INDEX() would have used that value to find GM in the array.

##### Using an Array Formula to Count the Values

With the modal value (Ford, in this example) in hand, we still want to know how many instances there are of that mode. This section describes how to create the array formula that counts the instances.

Figure 2 also shows, in cell C2, the count of the number of records that belong to the modal value. This formula provides that count:

=SUM(IF(A2:A21=C1,1,0))

The formula is an array formula, and must be entered using the special keyboard sequence Ctrl+Shift+Enter. You can tell that a formula has been entered as an array formula if you see curly brackets around it in the formula box. If you array enter the prior formula, it will look like this in the formula box:

{=SUM(IF(A2:A21=C1,1,0))}

But don’t supply the curly brackets yourself. If you do, Excel interprets this as text, not as a formula.

Here’s how the formula works: As shown in Figure 2, cell C1 contains the value “Ford”. So the following fragment of the array formula tests whether values in the range A2:A21 equal the value “Ford”:

A2:A21=C1

Because there are 20 cells in the range A2:A21, the fragment returns an array of TRUE and FALSE values: TRUE when a cell contains “Ford” and FALSE otherwise. The array looks like this:

{TRUE;FALSE;TRUE;FALSE;FALSE;FALSE;TRUE;FALSE;TRUE;TRUE;

FALSE;FALSE;FALSE;TRUE;TRUE;FALSE;FALSE;TRUE;FALSE;FALSE}

Specifically, cell A2 contains “Ford” and so it passes the test: The first value in the array is therefore TRUE. Cell A3 does not contain “Ford” and so it fails the test: The second value in the array is therefore FALSE—and so on for all 20 cells.

Note

The array of TRUE and FALSE values is an intermediate result of this array formula (and of many others, of course). As such, it is not routinely visible to the user, who normally needs to see only the end result of the formula.

Now step outside that fragment, which, as we’ve just seen, resolves to an array of TRUE and FALSE values. The array is used as the first argument to the IF() function. Excel’s IF() function takes three arguments:

The first argument is a value that can be TRUE or FALSE. In this example, that’s each value in the array just shown, returned by the fragment A2:A21=C1.

The second argument is the value that you want the IF() function to return when the first argument is TRUE. In the example, this is 1.

The third argument is the value that you want the IF() function to return when the first argument is FALSE. In the example, this is 0.

The IF() function examines each of the values in the array to see if it’s a TRUE value or a FALSE value. When a value in the array is TRUE, the IF() function returns, in this example, a 1, and a 0 otherwise. Therefore, the fragment

IF(A2:A21=C1,1,0)

returns an array of 1’s and 0’s that corresponds to the first array of TRUE and FALSE values. That array looks like this:

{1;0;1;0;0;0;1;0;1;1;0;0;0;1;1;0;0;1;0;0}

A 1 corresponds to a cell in A2:A21 that contains the value “Ford” and a 0 corresponds to a cell in the same range that does not contain “Ford”. Finally, the array of 1’s and 0’s is presented to the SUM() function, which totals the values in the array. Here, that total is 8.