# The Cantor Set

In the last post, How long is infinity?, we described how to very carefully define the length (technically the measure) of any set of points on the number line. You cover up your set with intervals, then add up the lengths of the intervals. That gives you an upper bound for how long your set can be. The actual length is then the smallest upper bound you can get this way.

Using this definition, we proved the surprising result that any countable set has length zero.1

For the counting numbers $\{1, 2, 3, \cdots\}$ this isn’t so weird. After all, they’re nice and spread out, even if there are infinitely many of them.

But I want to emphasize again how weird it is that countable sets have zero length. For example, consider all the rational numbers (numbers that are fractions, like 2/3) between 0 and 1. They’re packed in there like sardines.

Now, if we want to estimate their length, we could count them, putting an interval of length 1/4 on the first rational number, an interval of length 1/8 on the second, then 1/16, then 1/32, etc..

In this way we could cover all of those rational numbers by intervals with total length $\frac14 + \frac 18+\frac1{16} + \cdots = \frac12$. Somehow, despite the rational numbers being everywhere inbetween 0 and 1, and us covering every single one with its own interval, somehow we only managed to cover half the numbers between 0 and 1!

Yeah, it’s pretty bizarre.

So, countably infinite sets don’t have any length, while many uncountably infinite sets, like the interval [0,1], all the numbers between 0 and 1, usually have positive length.

Does every uncountably infinite set have length?

Now seems a good point to mention that adding up uncountably many things behaves… badly.

When we had a countable set, you could argue that the total length had to be zero because each individual point has length zero, and $0+0+0+\cdots$ must clearly be zero, so the total length had to be zero.

But for an uncountable set, adding up lengths just doesn’t work.2

For instance, no matter how we try, the argument we used for countable sets to show that their measure is zero simply cannot work. To remind you, we put smaller and smaller intervals around each point of the countable set. As we made the sets smaller and smaller, we found we could get as small of total length as we wanted.

Now, suppose we try to put smaller and smaller intervals around the points of an uncountable set, like we did with a countable set. Thus, each point has an associated interval length. If we add up all those interval lengths, we are adding up uncountably many numbers that are each greater than zero.

Since we can add up sequences and sometimes get finite numbers, like $1+\frac12+\frac14+\frac18 + \cdots = 2$, this might not seem like a problem. But when you add up uncountably many positive numbers, you always get infinity!

The argument goes like this: split up the (uncountably many) intervals we are using to cover our uncountable set based on their length. Put all the intervals with lengths between, say, 1 and 1/2 into one group, and all those with lengths between 1/2 and 1/4 into another, and all those between 1/4 and 1/8 into another, etc..

We’ve now split our uncountably many intervals into countably many groups. (There are countably many groups since we can count them — here’s the first group, the second group, the third group, etc..) But uncountably infinite is larger than countably infinite, so we can’t fit the intervals into the groups nicely.

At least one of the groups has to have infinitely many intervals!

Each of these groups, though, has a minimum length. Even if the group with infinite intervals had lengths between $1/2^{1000}$ and $1/2^{1001}$, infinitely many of them still add up to infinite length.

So, we need to come up with some other way of showing an uncountable set has zero length.

Fortunately, our good friend Georg Cantor came up with a set that will help us out.

Cantor’s set3 is perhaps the simplest example of a fractal, by which we’ll, informally, mean a shape or set that is more or less self-similar as you zoom in closer.

To start, take the interval $[0,1]$, all the numbers between 0 and 1.

Then, remove the middle third of the set.

Now, we have two smaller intervals, each with length 1/3. From each of those, remove the middle third again.

Now, we have four intervals. We continue this process infinitely many times. What’s left over is the Cantor set!

If we calculate how much length was left at each step, we started with length 1. After the first step, we had 2 intervals of length 1/3, for a total of $2\cdot \frac13$. After the second step, we had $4 = 2^2$ intervals of length $1/9 = 1/3^2$, for a total of $2^2 \cdot \frac1{3^2} = \frac49$. Then $2^3 \cdot \frac1{3^3} = \frac{8}{27}$, and so on. After $n$ steps, the total remaining length would be $\left(\frac{2}{3}\right)^n$. As we do this process infinitely many times, this remaining length goes to zero.

Thus, the Cantor set has zero length!

Of course, at first glance, this set doesn’t seem like there’s much there. After all, we removed “everything,” right?

But let’s look a bit closer.

First of all, the endpoints of each interval along our process is in the Cantor set. For instance, the points 0, 1/3, 2/3, and 1 were never removed.

At first glance, these endpoints seem like the only points left. I mean, we keep on cutting out the heart of every remaining interval!

And, if the endpoints are all that’s left, that would only be countably many points, since we can count them. (Just order them up in the same order we cut out the intervals.)

But, if you look even closer, it turns out that the Cantor set does have uncountably many numbers in it. But to figure out why, we’ll need the ternary representation of the numbers.

The ternary expansion is the base 3 version of decimal expansion. For decimals, 0.012 represents 0 tenths, 1 hundredth, and 2 thousandths. In base three, the ternary expansion 0.012 represents 0 thirds, 1 ninth, and 2 twenty-sevenths. In other words, the ternary places are powers of 3 instead of powers of 10. In ternary expansions you only use the digits 0, 1, and 2.

Yeah, definitely weird the first time you see it.

But ternary expansions are perfect for the Cantor set. For instance, the points 0, 1/3, 2/3, and 1, in ternary, are represented by 0, 0.1, 0.2 and 1! In fact, the endpoints of the intervals are all numbers whose ternary representations stop after a finite number of digits.

So, what about those intervals we removed?

The first one was all the numbers between 1/3 and 2/3, or, in ternary, between 0.1 and 0.2. In other words, except for 0.1, we have removed all the numbers whose ternary expansions have the digit 1 in the first slot, the “thirds.”

The second set of intervals was [1/9, 2/9] and [7/9, 8/9]. In ternary, those are [0.01, 0.02] and [0.21, 0.22]. Thus, except for the endpoints 0.01 and 0.21, we have removed all the numbers whose ternary expansions have the digit 1 in the second slot, the “ninths.”

We can carefully continue this pattern. At each step, we are removing the numbers whose ternary expansions have ones in them.4 And these are the only numbers which we’re removing.

Thus, the Cantor set is all numbers whose ternary expansions have only the digits 0 and 2, except, perhaps, a terminal 1.

So what did we gain from all this complication?

In addition to the endpoints, which are all decimals that end, we also have a whole bunch of numbers that don’t end, like $0.0200022222002020202222\cdots$. We’ve found the extra points!

These numbers look a lot like normal decimals, just a bit restricted. In fact, they’re so similar, that, like normal decimals, there are uncountably many of them. You can even use the same argument we used in the post A bigger infinity for the normal decimals to show that the Cantor set has uncountably many numbers in it.

So, there we have it. Every countable set has length, or measure, zero, but uncountable sets can have length zero as well.5

Of course, any uncountable set is still much bigger than any countable set like the counting numbers $\{1, 2, 3, \cdots\}$, so it seems unfair to lump them together. So, one of these weeks, we’ll show that the Cantor set is different than a countable set. They may both have length zero, but it turns out that the Cantor set is not a zero dimensional set!

Figuring out what that even means will be half the fun!

But, before that, I want to talk about sets so weird that you can’t even measure them. And I don’t mean that their length is zero — That would still be measuring that they were length zero.

I mean sets so weird that you can’t make sense of what it means to measure them. They simply break our method of measuring sets.

<– Previous Post: How Long is Infinity?
First post in this series: How Long is Infinity?
–> Next Post: Non-measurable Sets That Go Bump in the Night

1. Briefly, a countable set is one that is infinitely big, but you can count, like the counting numbers, $\{1, 2, 3, \cdots\}$. A less obvious example is that all rational numbers (numbers like 2/3 or 17/28) are countable also. An uncountable set, which we’ll get to in a second, is a set with so many things in it that you can’t count them. An example would be the interval [0,1], all the numbers between 0 and 1. The fact that these are different is not at all obvious. If you’re interested in learning more about the different sizes of infinities, you could read the first few posts of this blog, starting with Infinity plus one
2. If you learn all this “measure theory” carefully, you’d find out that it is by definition that you can only add up the lengths of countably many sub-lengths. That part of the definition is exactly to avoid the contradictions with uncountable sets of the type we’ll describe here.
3. Really, there are many similar sets that are given this name, but we’ll start with the most common one.
4. Except, perhaps, as the very last digit. These, as we’ve mentioned represent some of the endpoints. Of course, recall that in base 10, $0.0999\cdots = 0.1$, and similar, so we could actually get rid of these terminal 1’s by replacing them with the digits $\cdots 0.02222\cdots$. Thus, we have no 1’s at all!
5. To be clear, “most” uncountable sets don’t have zero length. The Cantor set is one exception. Intervals like [0,1] are uncountable and have length.