In this course so far, we have focused primarily on one specific example of a countably additive measure, namely Lebesgue measure. This measure was constructed from a more primitive concept of *Lebesgue outer measure*, which in turn was constructed from the even more primitive concept of *elementary measure*.

It turns out that both of these constructions can be abstracted. In this set of notes, we will give the Carathéodory lemma, which constructs a countably additive measure from any abstract outer measure; this generalises the construction of Lebesgue measure from Lebesgue outer measure. One can in turn construct outer measures from another concept known as a pre-measure, of which elementary measure is a typical example.

With these tools, one can start constructing many more measures, such as Lebesgue-Stieltjes measures, product measures, and Hausdorff measures. With a little more effort, one can also establish the Kolmogorov extension theorem, which allows one to construct a variety of measures on infinite-dimensional spaces, and is of particular importance in the foundations of probability theory, as it allows one to set up probability spaces associated to both discrete and continuous random processes, even if they have infinite length.

The most important result about product measure, beyond the fact that it exists, is that one can use it to evaluate iterated integrals, and to interchange their order, provided that the integrand is either unsigned or absolutely integrable. This fact is known as the Fubini-Tonelli theorem, and is an absolutely indispensable tool for computing integrals, and for deducing higher-dimensional results from lower-dimensional ones.

We remark that these notes omit a very important way to construct measures, namely the Riesz representation theorem, but we will defer discussion of this theorem to 245B.

This is the final set of notes in this sequence. If time permits, the course will then begin covering the 245B notes, starting with the material on signed measures and the Radon-Nikodym-Lebesgue theorem.

** — 1. Outer measures and the Carathéodory extension theorem — **

We begin with the abstract concept of an outer measure.

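Definition 1 (Abstract outer measure) Let $X$ be a set. An *abstract outer measure* (or *outer measure* for short) is a map $\mu^*: 2^X \to [0,+\infty]$ that assigns an unsigned extended real number $\mu^*(E) \in [0,+\infty]$ to *every* set $E \subset X$ which obeys the following axioms:

- (Empty set) $\mu^*(\emptyset) = 0$.
- (Monotonicity) If $E \subset F$, then $\mu^*(E) \leq \mu^*(F)$.
- (Countable subadditivity) If $E_1, E_2, \ldots \subset X$ is a countable sequence of subsets of $X$, then $\mu^*(\bigcup_{n=1}^\infty E_n) \leq \sum_{n=1}^\infty \mu^*(E_n)$.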

Outer measures are also known as *exterior measures*.

Thus, for instance, Lebesgue outer measure $m^*$ is an outer measure (see Exercise 4 of Notes 1). On the other hand, Jordan outer measure $m^{*,(J)}$ is only finitely subadditive rather than countably subadditive and thus is not, strictly speaking, an outer measure; for this reason this concept is often referred to as *Jordan outer content* rather than *Jordan outer measure*.
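To see concretely why Jordan outer content fails countable subadditivity, one can look at the rationals in the unit interval. The following is a short sketch; the notation $m^{*,(J)}$ for Jordan outer content and the enumeration $q_1, q_2, \ldots$ are our labels for this illustration.

```latex
% Sketch: Jordan outer content is not countably subadditive.
% Enumerate the rationals in [0,1] as q_1, q_2, ...
% Each singleton is covered by arbitrarily short intervals, so
\[
m^{*,(J)}(\{q_n\}) = 0 \quad \text{for every } n .
\]
% A finite union of closed intervals covering the dense set Q \cap [0,1]
% must cover its closure [0,1], so
\[
m^{*,(J)}\Bigl( \bigcup_{n=1}^\infty \{q_n\} \Bigr)
  = m^{*,(J)}(\mathbf{Q} \cap [0,1]) = 1
  > 0 = \sum_{n=1}^\infty m^{*,(J)}(\{q_n\}).
\]
```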

Note that outer measures are weaker than measures in that they are merely countably subadditive, rather than countably additive. On the other hand, they are able to measure *all* subsets of $X$, whereas measures can only measure a $\sigma$-algebra of measurable sets.

In Definition 1 of Notes 1, we used Lebesgue outer measure together with the notion of an open set to define the concept of Lebesgue measurability. This definition is not available in our more abstract setting, as we do not necessarily have the notion of an open set. An alternative definition of measurability was put forth in Exercise 17 of Notes 1, but this still required the notion of a box or an elementary set, which is still not available in this setting. Nevertheless, we can modify that definition to give an abstract definition of measurability:

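Definition 2 (Carathéodory measurability) Let $\mu^*$ be an outer measure on a set $X$. A set $E \subset X$ is said to be *Carathéodory measurable* with respect to $\mu^*$ if one has

$$\mu^*(A) = \mu^*(A \cap E) + \mu^*(A \backslash E)$$

for *every* set $A \subset X$.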
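As a sanity check on this definition, here is a toy example that is not taken from the notes: the two-point set and the values assigned below are chosen purely for illustration. It shows an outer measure for which only the trivial sets pass the Carathéodory test.

```latex
% Hypothetical toy example: an outer measure on a two-point set.
\[
X = \{a,b\}, \qquad \mu^*(\emptyset) = 0, \qquad
\mu^*(\{a\}) = \mu^*(\{b\}) = \mu^*(X) = 1 .
\]
% Monotonicity and subadditivity can be checked by hand.
% Test the set E = {a} against A = X:
\[
\mu^*(X \cap \{a\}) + \mu^*(X \backslash \{a\})
  = \mu^*(\{a\}) + \mu^*(\{b\}) = 2 \neq 1 = \mu^*(X),
\]
% so {a} is not Caratheodory measurable, and by symmetry neither is {b}.
% The empty set and X always pass the test.
```

In this example the Carathéodory measurable sets are just $\emptyset$ and $X$, which do form a $\sigma$-algebra, and $\mu^*$ restricted to them is (trivially) a measure.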

Exercise 3 (Null sets are Carathéodory measurable) Suppose that $E$ is a null set for an outer measure $\mu^*$ (i.e. $\mu^*(E) = 0$). Show that $E$ is Carathéodory measurable with respect to $\mu^*$.

Exercise 4 (Compatibility with Lebesgue measurability) Show that a set $E \subset \mathbf{R}^d$ is Carathéodory measurable with respect to Lebesgue outer measure if and only if it is Lebesgue measurable. (Hint: one direction follows from Exercise 17 of Notes 1. For the other direction, first verify simple cases, such as when $E$ is a box, or when $E$ or $A$ are bounded.)

The construction of Lebesgue measure can then be abstracted as follows:

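Theorem 5 (Carathéodory lemma) Let $\mu^*: 2^X \to [0,+\infty]$ be an outer measure on a set $X$, let $\mathcal{B}$ be the collection of all subsets of $X$ that are Carathéodory measurable with respect to $\mu^*$, and let $\mu: \mathcal{B} \to [0,+\infty]$ be the restriction of $\mu^*$ to $\mathcal{B}$ (thus $\mu(E) := \mu^*(E)$ whenever $E \in \mathcal{B}$). Then $\mathcal{B}$ is a $\sigma$-algebra, and $\mu$ is a measure.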

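*Proof:* We begin with the $\sigma$-algebra property. It is easy to see that the empty set lies in $\mathcal{B}$, and that the complement of a set in $\mathcal{B}$ lies in $\mathcal{B}$ also. Next, we verify that $\mathcal{B}$ is closed under finite unions (which will make $\mathcal{B}$ a Boolean algebra). Let $E, F \in \mathcal{B}$, and let $A \subset X$ be arbitrary. By definition, it suffices to show that

$$\mu^*(A) = \mu^*(A \cap (E \cup F)) + \mu^*(A \backslash (E \cup F)). \qquad (1)$$

To simplify the notation, we partition $A$ into the four disjoint sets

$$A_{00} := A \backslash (E \cup F); \quad A_{10} := (A \backslash F) \cap E; \quad A_{01} := (A \backslash E) \cap F; \quad A_{11} := A \cap E \cap F$$

(the reader may wish to draw a Venn diagram here to understand the nature of these sets). Thus (1) becomes

$$\mu^*(A_{00} \cup A_{01} \cup A_{10} \cup A_{11}) = \mu^*(A_{01} \cup A_{10} \cup A_{11}) + \mu^*(A_{00}). \qquad (2)$$

On the other hand, from the Carathéodory measurability of $E$, one has

$$\mu^*(A_{00} \cup A_{01} \cup A_{10} \cup A_{11}) = \mu^*(A_{00} \cup A_{01}) + \mu^*(A_{10} \cup A_{11})$$

and

$$\mu^*(A_{01} \cup A_{10} \cup A_{11}) = \mu^*(A_{01}) + \mu^*(A_{10} \cup A_{11}),$$

while from the Carathéodory measurability of $F$ one has

$$\mu^*(A_{00} \cup A_{01}) = \mu^*(A_{00}) + \mu^*(A_{01});$$

putting these identities together we obtain (2). (Note that no subtraction is employed here, and so the arguments still work when some sets have infinite outer measure.)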

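Now we verify that $\mathcal{B}$ is a $\sigma$-algebra. As it is already a Boolean algebra, it suffices (see Exercise 6 below) to verify that $\mathcal{B}$ is closed with respect to countable disjoint unions. Thus, let $E_1, E_2, \ldots$ be a sequence of disjoint Carathéodory-measurable sets, and let $A$ be arbitrary. We wish to show that

$$\mu^*(A) = \mu^*(A \cap \bigcup_{n=1}^\infty E_n) + \mu^*(A \backslash \bigcup_{n=1}^\infty E_n).$$

In view of subadditivity, it suffices to show that

$$\mu^*(A) \geq \mu^*(A \cap \bigcup_{n=1}^\infty E_n) + \mu^*(A \backslash \bigcup_{n=1}^\infty E_n).$$

For any $N \geq 1$, $\bigcup_{n=1}^N E_n$ is Carathéodory measurable (as $\mathcal{B}$ is a Boolean algebra), and so

$$\mu^*(A) \geq \mu^*(A \cap \bigcup_{n=1}^N E_n) + \mu^*(A \backslash \bigcup_{n=1}^N E_n).$$

By monotonicity, $\mu^*(A \backslash \bigcup_{n=1}^N E_n) \geq \mu^*(A \backslash \bigcup_{n=1}^\infty E_n)$. Taking limits as $N \to \infty$, it thus suffices to show that

$$\mu^*(A \cap \bigcup_{n=1}^\infty E_n) \leq \lim_{N \to \infty} \mu^*(A \cap \bigcup_{n=1}^N E_n).$$

But by the Carathéodory measurability of $\bigcup_{n=1}^N E_n$, we have

$$\mu^*(A \cap \bigcup_{n=1}^{N+1} E_n) = \mu^*(A \cap \bigcup_{n=1}^N E_n) + \mu^*(A \cap E_{N+1} \backslash \bigcup_{n=1}^N E_n)$$

for any $N \geq 0$, and thus on iteration

$$\lim_{N \to \infty} \mu^*(A \cap \bigcup_{n=1}^N E_n) = \sum_{N=0}^\infty \mu^*(A \cap E_{N+1} \backslash \bigcup_{n=1}^N E_n).$$

On the other hand, from countable subadditivity one has

$$\mu^*(A \cap \bigcup_{n=1}^\infty E_n) \leq \sum_{N=0}^\infty \mu^*(A \cap E_{N+1} \backslash \bigcup_{n=1}^N E_n)$$

and the claim follows.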

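Finally, we show that $\mu$ is a measure. It is clear that $\mu(\emptyset) = 0$, so it suffices to establish countable additivity, thus we need to show that

$$\mu^*(\bigcup_{n=1}^\infty E_n) = \sum_{n=1}^\infty \mu^*(E_n)$$

whenever $E_1, E_2, \ldots$ are Carathéodory-measurable and disjoint. By subadditivity it suffices to show that

$$\mu^*(\bigcup_{n=1}^\infty E_n) \geq \sum_{n=1}^\infty \mu^*(E_n).$$

By monotonicity it suffices to show that

$$\mu^*(\bigcup_{n=1}^N E_n) = \sum_{n=1}^N \mu^*(E_n)$$

for any finite $N$. But from the Carathéodory measurability of $\bigcup_{n=1}^N E_n$ one has

$$\mu^*(\bigcup_{n=1}^{N+1} E_n) = \mu^*(\bigcup_{n=1}^N E_n) + \mu^*(E_{N+1})$$

for any $N \geq 0$, and the claim follows from induction.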

Exercise 6 Let $\mathcal{B}$ be a Boolean algebra on a set $X$. Show that $\mathcal{B}$ is a $\sigma$-algebra if and only if it is closed under countable *disjoint* unions, which means that $\bigcup_{n=1}^\infty E_n \in \mathcal{B}$ whenever $E_1, E_2, E_3, \ldots \in \mathcal{B}$ are a countable sequence of *disjoint* sets in $\mathcal{B}$.

Remark 7 Note that the above theorem, combined with Exercise 4, gives a slightly different way to construct Lebesgue measure from Lebesgue outer measure than the construction given in Notes 1. This is arguably a more efficient way to proceed, but is also less geometrically intuitive than the approach taken in Notes 1.

Remark 8 From Exercise 3 we see that the measure $\mu$ constructed by the Carathéodory lemma is automatically complete, in the sense that any sub-null set for $\mu$ (a subset of a null set for $\mu$) is also a null set.

Remark 9 In 245C we will give an important example of a measure constructed by Carathéodory’s lemma, namely the $d$-dimensional *Hausdorff measure* $\mathcal{H}^d$ on $\mathbf{R}^n$ that is good for measuring the size of $d$-dimensional subsets of $\mathbf{R}^n$.

** — 2. Pre-measures — **

In previous notes, we saw that finitely additive measures, such as elementary measure or Jordan measure, could be extended to a countably additive measure, namely Lebesgue measure. It is natural to ask whether this property is true in general. In other words, given a finitely additive measure $\mu_0: \mathcal{B}_0 \to [0,+\infty]$ on a Boolean algebra $\mathcal{B}_0$, is it possible to find a $\sigma$-algebra $\mathcal{B}$ refining $\mathcal{B}_0$, and a countably additive measure $\mu: \mathcal{B} \to [0,+\infty]$ that extends $\mu_0$?

There is an obvious necessary condition in order for $\mu_0$ to have a countably additive extension, namely that $\mu_0$ already has to be countably additive *within $\mathcal{B}_0$*. More precisely, suppose that $E_1, E_2, E_3, \ldots \in \mathcal{B}_0$ were disjoint sets such that their union $\bigcup_{n=1}^\infty E_n$ was also in $\mathcal{B}_0$. (Note that this latter property is not automatic as $\mathcal{B}_0$ is merely a Boolean algebra rather than a $\sigma$-algebra.) Then, in order for $\mu_0$ to be extendible to a countably additive measure, it is clearly necessary that

$$\mu_0(\bigcup_{n=1}^\infty E_n) = \sum_{n=1}^\infty \mu_0(E_n).$$

Using the Carathéodory lemma, we can show that this necessary condition is also sufficient. More precisely, we have

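Definition 10 (Pre-measure) A *pre-measure* on a Boolean algebra $\mathcal{B}_0$ is a finitely additive measure $\mu_0: \mathcal{B}_0 \to [0,+\infty]$ with the property that $\mu_0(\bigcup_{n=1}^\infty E_n) = \sum_{n=1}^\infty \mu_0(E_n)$ whenever $E_1, E_2, E_3, \ldots \in \mathcal{B}_0$ are disjoint sets such that $\bigcup_{n=1}^\infty E_n$ is in $\mathcal{B}_0$.

Exercise 11

- Show that the requirement that $\mu_0$ is finitely additive could be relaxed to the condition that $\mu_0(\emptyset) = 0$ without affecting the definition of a pre-measure.
- Show that the condition $\mu_0(\bigcup_{n=1}^\infty E_n) = \sum_{n=1}^\infty \mu_0(E_n)$ could be relaxed to $\mu_0(\bigcup_{n=1}^\infty E_n) \leq \sum_{n=1}^\infty \mu_0(E_n)$ without affecting the definition of a pre-measure.
- On the other hand, give an example to show that if one performs both of the above two relaxations at once, one starts admitting objects $\mu_0$ that are not pre-measures.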

Exercise 12 Without using the theory of Lebesgue measure, show that elementary measure (on the elementary Boolean algebra) is a pre-measure. (Hint: use Lemma 6 from Notes 1. Note that one has to also deal with co-elementary sets as well as elementary sets in the elementary Boolean algebra.)

Exercise 13 Construct a finitely additive measure $\mu_0: \mathcal{B}_0 \to [0,+\infty]$ that is not a pre-measure. (Hint: take $X$ to be the natural numbers, take $\mathcal{B}_0 = 2^{\mathbf{N}}$ to be the discrete algebra, and define $\mu_0$ separately for finite and infinite sets.)

Theorem 14 (Hahn-Kolmogorov theorem) Every pre-measure $\mu_0: \mathcal{B}_0 \to [0,+\infty]$ on a Boolean algebra $\mathcal{B}_0$ in $X$ can be extended to a countably additive measure $\mu: \mathcal{B} \to [0,+\infty]$.

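*Proof:* We mimic the construction of Lebesgue measure from elementary measure. Namely, for any set $E \subset X$, define the *outer measure* $\mu^*(E)$ of $E$ to be the quantity

$$\mu^*(E) := \inf \Bigl\{ \sum_{n=1}^\infty \mu_0(E_n) : E \subset \bigcup_{n=1}^\infty E_n; \ E_n \in \mathcal{B}_0 \hbox{ for all } n \Bigr\}.$$

It is easy to verify (cf. Exercise 4 of Notes 1) that $\mu^*$ is indeed an outer measure. Let $\mathcal{B}$ be the collection of all sets $E \subset X$ that are Carathéodory measurable with respect to $\mu^*$, and let $\mu$ be the restriction of $\mu^*$ to $\mathcal{B}$. By the Carathéodory lemma, $\mathcal{B}$ is a $\sigma$-algebra and $\mu$ is a countably additive measure.

It remains to show that $\mathcal{B}$ contains $\mathcal{B}_0$ and that $\mu$ extends $\mu_0$. Thus, let $E \in \mathcal{B}_0$; we need to show that $E$ is Carathéodory measurable with respect to $\mu^*$ and that $\mu^*(E) = \mu_0(E)$. To prove the first claim, let $A \subset X$ be arbitrary. We need to show that

$$\mu^*(A) = \mu^*(A \cap E) + \mu^*(A \backslash E);$$

by subadditivity, it suffices to show that

$$\mu^*(A) \geq \mu^*(A \cap E) + \mu^*(A \backslash E).$$

We may assume that $\mu^*(A)$ is finite, since the claim is trivial otherwise.

Fix $\varepsilon > 0$. By definition of $\mu^*$, one can find $E_1, E_2, \ldots \in \mathcal{B}_0$ covering $A$ such that

$$\sum_{n=1}^\infty \mu_0(E_n) \leq \mu^*(A) + \varepsilon.$$

The sets $E_n \cap E$ lie in $\mathcal{B}_0$ and cover $A \cap E$ and thus

$$\mu^*(A \cap E) \leq \sum_{n=1}^\infty \mu_0(E_n \cap E).$$

Similarly we have

$$\mu^*(A \backslash E) \leq \sum_{n=1}^\infty \mu_0(E_n \backslash E).$$

Meanwhile, from finite additivity we have

$$\mu_0(E_n \cap E) + \mu_0(E_n \backslash E) = \mu_0(E_n).$$

Combining all of these estimates, we obtain

$$\mu^*(A \cap E) + \mu^*(A \backslash E) \leq \mu^*(A) + \varepsilon;$$

since $\varepsilon > 0$ was arbitrary, the claim follows.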
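For readers who prefer to see the estimates of this step assembled in one place, the chain can be sketched as follows. Here $E_1, E_2, \ldots \in \mathcal{B}_0$ denotes the cover of $A$ chosen above; the symbols $\mu_0$, $\mu^*$, $\mathcal{B}_0$, $\varepsilon$ are our reading of the argument, not a quotation.

```latex
% Requires amsmath (align*, \text).
\begin{align*}
\mu^*(A \cap E) + \mu^*(A \backslash E)
  &\leq \sum_{n=1}^\infty \mu_0(E_n \cap E) + \sum_{n=1}^\infty \mu_0(E_n \backslash E)
     && \text{(the two covers)} \\
  &= \sum_{n=1}^\infty \bigl( \mu_0(E_n \cap E) + \mu_0(E_n \backslash E) \bigr)
     && \text{(unsigned series add termwise)} \\
  &= \sum_{n=1}^\infty \mu_0(E_n)
     && \text{(finite additivity of } \mu_0 \text{)} \\
  &\leq \mu^*(A) + \varepsilon
     && \text{(choice of the cover).}
\end{align*}
```

Only unsigned quantities are added here, so no cancellation of infinities can occur.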

Finally, we have to show that $\mu^*(E) = \mu_0(E)$. Since $E$ covers itself, we certainly have $\mu^*(E) \leq \mu_0(E)$. To show the converse inequality, it suffices to show that

$$\sum_{n=1}^\infty \mu_0(E_n) \geq \mu_0(E)$$

whenever $E_1, E_2, \ldots \in \mathcal{B}_0$ cover $E$. By replacing each $E_n$ with the smaller set $E_n \backslash \bigcup_{m=1}^{n-1} E_m$ (which still lies in $\mathcal{B}_0$, and still covers $E$), we may assume without loss of generality (thanks to the monotonicity of $\mu_0$) that the $E_n$ are disjoint. Similarly, by replacing each $E_n$ with the smaller set $E_n \cap E$ we may assume without loss of generality that the union of the $E_n$ is exactly equal to $E$. But then the claim follows from the hypothesis that $\mu_0$ is a pre-measure (and not merely a finitely additive measure).

Let us call the measure $\mu$ constructed in the above proof the *Hahn-Kolmogorov extension* of the pre-measure $\mu_0$. Thus, for instance, from Exercise 4, the Hahn-Kolmogorov extension of elementary measure (with the convention that co-elementary sets have infinite elementary measure) is Lebesgue measure. This is not quite the unique extension of $\mu_0$ to a countably additive measure, though. For instance, one could restrict Lebesgue measure to the Borel $\sigma$-algebra, and this would still be a countably additive extension of elementary measure. However, the extension is unique within its own $\sigma$-algebra:

Exercise 15 Let $\mu_0: \mathcal{B}_0 \to [0,+\infty]$ be a pre-measure, let $\mu: \mathcal{B} \to [0,+\infty]$ be the Hahn-Kolmogorov extension of $\mu_0$, and let $\mu': \mathcal{B}' \to [0,+\infty]$ be another countably additive extension of $\mu_0$. Suppose also that $\mu_0$ is *$\sigma$-finite*, which means that one can express the whole space $X$ as the countable union of sets $E_1, E_2, \ldots \in \mathcal{B}_0$ for which $\mu_0(E_n) < \infty$ for all $n$. Show that $\mu$ and $\mu'$ agree on their common domain of definition. In other words, show that $\mu(E) = \mu'(E)$ for all $E \in \mathcal{B} \cap \mathcal{B}'$. (Hint: first show that $\mu'(E) \leq \mu^*(E)$ for all $E \in \mathcal{B}'$.)

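Exercise 16 The purpose of this exercise is to show that the $\sigma$-finite hypothesis in Exercise 15 cannot be removed. Let $\mathcal{A}$ be the collection of all subsets in $\mathbf{R}$ that can be expressed as finite unions of half-open intervals $[a,b)$. Let $\mu_0: \mathcal{A} \to [0,+\infty]$ be the function such that $\mu_0(E) = +\infty$ for non-empty $E$ and $\mu_0(\emptyset) = 0$.

- Show that $\mu_0$ is a pre-measure.
- Show that $\langle \mathcal{A} \rangle$ is the Borel $\sigma$-algebra $\mathcal{B}[\mathbf{R}]$.
- Show that the Hahn-Kolmogorov extension $\mu: \mathcal{B}[\mathbf{R}] \to [0,+\infty]$ of $\mu_0$ assigns an infinite measure to any non-empty Borel set.
- Show that counting measure $\#$ (or more generally, $c\#$ for any $c \in (0,+\infty]$) is another extension of $\mu_0$ on $\mathcal{B}[\mathbf{R}]$.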

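Exercise 17 Let $\mu_0: \mathcal{B}_0 \to [0,+\infty]$ be a pre-measure which is $\sigma$-finite (thus $X$ is the countable union of sets in $\mathcal{B}_0$ of finite $\mu_0$-measure), and let $\mu: \mathcal{B} \to [0,+\infty]$ be the Hahn-Kolmogorov extension of $\mu_0$.

- Show that if $E \in \mathcal{B}$, then there exists $F \in \langle \mathcal{B}_0 \rangle$ containing $E$ such that $\mu(F \backslash E) = 0$ (thus $F$ consists of the union of $E$ and a null set). Furthermore, show that $F$ can be chosen to be a countable intersection $F = \bigcap_{n=1}^\infty F_n$ of sets $F_n$, each of which is a countable union $F_n = \bigcup_{m=1}^\infty F_{n,m}$ of sets $F_{n,m}$ in $\mathcal{B}_0$.
- If $E \in \mathcal{B}$ has finite measure (i.e. $\mu(E) < \infty$), and $\varepsilon > 0$, show that there exists $F \in \mathcal{B}_0$ such that $\mu(E \Delta F) \leq \varepsilon$.
- Conversely, if $E$ is a set such that for every $\varepsilon > 0$ there exists $F \in \mathcal{B}_0$ such that $\mu^*(E \Delta F) \leq \varepsilon$, show that $E \in \mathcal{B}$.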

** — 3. Lebesgue-Stieltjes measure — **

Now we use the Hahn-Kolmogorov extension theorem to construct a variety of measures. We begin with Lebesgue-Stieltjes measure.

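Theorem 18 (Existence of Lebesgue-Stieltjes measure) Let $F: \mathbf{R} \to \mathbf{R}$ be a monotone non-decreasing function, and define the left and right limits

$$F_-(x) := \sup_{y < x} F(y); \quad F_+(x) := \inf_{y > x} F(y),$$

thus one has $F_-(x) \leq F(x) \leq F_+(x)$ for all $x$. Let $\mathcal{B}[\mathbf{R}]$ be the Borel $\sigma$-algebra on $\mathbf{R}$. Then there exists a unique Borel measure $\mu_F: \mathcal{B}[\mathbf{R}] \to [0,+\infty]$ such that

$$\mu_F([a,b]) = F_+(b) - F_-(a), \quad \mu_F([a,b)) = F_-(b) - F_-(a), \quad \mu_F((a,b]) = F_+(b) - F_+(a), \quad \mu_F((a,b)) = F_-(b) - F_+(a) \qquad (3)$$

for all $-\infty < a < b < \infty$, and

$$\mu_F(\{a\}) = F_+(a) - F_-(a) \qquad (4)$$

for all $a \in \mathbf{R}$.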

*Proof:* (Sketch) For this proof, we will deviate from our previous notational conventions, and allow intervals to be unbounded, thus in particular including the half-infinite intervals $[a,+\infty)$, $(a,+\infty)$, $(-\infty,a]$, $(-\infty,a)$ and the doubly infinite interval $(-\infty,+\infty)$ as intervals.

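Define the $F$-volume $|I|_F \in [0,+\infty]$ of any interval $I$ to be the required value of $\mu_F(I)$ given by (3) (e.g., $|[a,b)|_F = F_-(b) - F_-(a)$), adopting the obvious conventions that $F_-(+\infty) = \sup_{y \in \mathbf{R}} F(y)$ and $F_+(-\infty) = \inf_{y \in \mathbf{R}} F(y)$, and also adopting the convention that the empty interval has zero $F$-volume, $|\emptyset|_F = 0$. Note that $F_-(+\infty)$ could equal $+\infty$ and $F_+(-\infty)$ could equal $-\infty$, but in all circumstances the $F$-volume $|I|_F$ is well-defined and takes values in $[0,+\infty]$, after adopting the obvious conventions to evaluate expressions such as $+\infty - (-\infty)$.

A somewhat tedious case check (Exercise!) gives the additivity property

$$|I \cup J|_F = |I|_F + |J|_F$$

whenever $I$, $J$ are disjoint intervals that share a common endpoint. As a corollary, we see that if an interval $I$ is partitioned into finitely many disjoint sub-intervals $I_1, \ldots, I_k$, we have $|I|_F = |I_1|_F + \ldots + |I_k|_F$.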
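To give a flavour of the case check, here are two of the cases written out, as a sketch only. We assume finite endpoints $a < b < c$ (so all one-sided limits of $F$ are finite and may be cancelled), and we write $|I|_F$ for the $F$-volume, with $F_\pm$ the one-sided limits of $F$.

```latex
% Case 1: I = [a,b) and J = [b,c], so that I \cup J = [a,c].
\[
|[a,b)|_F + |[b,c]|_F
  = \bigl( F_-(b) - F_-(a) \bigr) + \bigl( F_+(c) - F_-(b) \bigr)
  = F_+(c) - F_-(a) = |[a,c]|_F .
\]
% Case 2: I = [a,b] and J = (b,c), so that I \cup J = [a,c).
\[
|[a,b]|_F + |(b,c)|_F
  = \bigl( F_+(b) - F_-(a) \bigr) + \bigl( F_-(c) - F_+(b) \bigr)
  = F_-(c) - F_-(a) = |[a,c)|_F .
\]
```

In each case the one-sided limit at the shared endpoint $b$ appears once with each sign and cancels; the remaining cases (including unbounded intervals) follow the same pattern, with extra care when infinite values appear.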

Let $\mathcal{B}_0$ be the Boolean algebra generated by the (possibly infinite) intervals, then $\mathcal{B}_0$ consists of those sets that can be expressed as a finite union of intervals. (This is slightly larger than the elementary algebra, as it allows for half-infinite intervals such as $[0,+\infty)$, whereas the elementary algebra does not.) We can define a measure $\mu_0$ on this algebra by declaring

$$\mu_0(E) = |I_1|_F + \ldots + |I_k|_F$$

whenever $E = I_1 \cup \ldots \cup I_k$ is the disjoint union of finitely many intervals. One can check (Exercise!) that this measure is well-defined (in the sense that it gives a unique value to $\mu_0(E)$ for each $E \in \mathcal{B}_0$) and is finitely additive. We now claim that $\mu_0$ is a pre-measure: thus we suppose that $E \in \mathcal{B}_0$ is the disjoint union of countably many sets $E_1, E_2, \ldots \in \mathcal{B}_0$, and wish to show that

$$\mu_0(E) = \sum_{n=1}^\infty \mu_0(E_n).$$

By splitting up $E$ into intervals and then intersecting each of the $E_n$ with these intervals and using finite additivity, we may assume that $E$ is a single interval. By splitting up the $E_n$ into their component intervals and using finite additivity, we may assume that the $E_n$ are also individual intervals. By finite additivity, we have $\mu_0(E) \geq \sum_{n=1}^N \mu_0(E_n)$ for every $N$, so it suffices to show that

$$\mu_0(E) \leq \sum_{n=1}^\infty \mu_0(E_n).$$

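By the definition of $\mu_0(E)$, one can check that

$$\mu_0(E) = \sup_{K \subset E} \mu_0(K) \qquad (5)$$

where $K$ ranges over all compact intervals contained in $E$ (Exercise!). Thus, it suffices to show that

$$\mu_0(K) \leq \sum_{n=1}^\infty \mu_0(E_n)$$

for each compact sub-interval $K$ of $E$. In a similar spirit, one can show that

$$\mu_0(E_n) = \inf_{U \supset E_n} \mu_0(U)$$

where $U$ ranges over all open intervals containing $E_n$ (Exercise!). Using the $\varepsilon/2^n$ trick, it thus suffices to show that

$$\mu_0(K) \leq \sum_{n=1}^\infty \mu_0(U_n)$$

whenever $U_n$ is an open interval containing $E_n$. But by the Heine-Borel theorem, one can cover $K$ by a finite number of the $U_n$, hence by finite subadditivity

$$\mu_0(K) \leq \sum_{n=1}^N \mu_0(U_n)$$

for some finite $N$, and the claim follows.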

As $\mu_0$ is now verified to be a pre-measure, we may use the Hahn-Kolmogorov extension theorem to extend it to a countably additive measure $\mu$ on a $\sigma$-algebra $\mathcal{B}$ that contains $\mathcal{B}_0$. In particular, $\mathcal{B}$ contains all the elementary sets and hence (by Exercise 14 of Notes 3) contains the Borel $\sigma$-algebra. Restricting $\mu$ to the Borel $\sigma$-algebra we obtain the existence claim.

Finally, we establish uniqueness. If $\mu'$ is another Borel measure with the stated properties, then $\mu'(K) = |K|_F$ for every compact interval $K$, and hence by (5) and upward monotone convergence, one has $\mu'(I) = |I|_F$ for every interval $I$ (including the unbounded ones). This implies that $\mu'$ agrees with $\mu_0$ on $\mathcal{B}_0$, and thus (by Exercise 15, noting that $\mu_0$ is $\sigma$-finite) agrees with $\mu$ on Borel measurable sets.

Exercise 19 Verify the claims marked “Exercise!” in the above proof.

The measure $\mu_F$ given by the above theorem is known as the *Lebesgue-Stieltjes measure* of $F$. (In some texts, this measure is only defined when $F$ is right-continuous, or equivalently if $F = F_+$.)
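As a small worked illustration (the particular function below is our choice, not one made in the notes), consider a function with a single unit jump at the origin. We use the interval formulae of Theorem 18, namely $\mu_F([a,b]) = F_+(b) - F_-(a)$, $\mu_F((a,b]) = F_+(b) - F_+(a)$ and $\mu_F(\{a\}) = F_+(a) - F_-(a)$.

```latex
% Hypothetical example: identity plus a unit jump at 0.
\[
F(x) := x + 1_{[0,+\infty)}(x), \qquad F_-(0) = 0, \qquad F_+(0) = 1 ,
\]
% and F is continuous away from the origin, so F_-(x) = F_+(x) = F(x) for x \neq 0.
\[
\mu_F(\{0\}) = F_+(0) - F_-(0) = 1, \qquad
\mu_F((0,1]) = F_+(1) - F_+(0) = 2 - 1 = 1,
\]
\[
\mu_F([0,1]) = F_+(1) - F_-(0) = 2 - 0 = 2 = \mu_F(\{0\}) + \mu_F((0,1]).
\]
```

These values are what one would expect if $\mu_F$ is Lebesgue measure plus a unit point mass at the origin: the jump of $F$ contributes an atom, and the continuous part contributes length.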

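Exercise 20 Define a *Radon measure* on $\mathbf{R}$ to be a Borel measure $\mu$ obeying the following additional properties:

- (Local finiteness) $\mu(K) < \infty$ for every compact $K$.
- (Inner regularity) One has $\mu(E) = \sup_{K \subset E, K \hbox{ compact}} \mu(K)$ for every Borel set $E$.
- (Outer regularity) One has $\mu(E) = \inf_{U \supset E, U \hbox{ open}} \mu(U)$ for every Borel set $E$.

Show that for every monotone function $F: \mathbf{R} \to \mathbf{R}$, the Lebesgue-Stieltjes measure $\mu_F$ is a Radon measure on $\mathbf{R}$; conversely, if $\mu$ is a Radon measure on $\mathbf{R}$, show that there exists a monotone function $F: \mathbf{R} \to \mathbf{R}$ such that $\mu = \mu_F$.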

Radon measures will be studied in more detail in 245B.

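Exercise 21 (Near uniqueness) If $F, F': \mathbf{R} \to \mathbf{R}$ are monotone non-decreasing functions, show that $\mu_F = \mu_{F'}$ if and only if there exists a constant $C \in \mathbf{R}$ such that $F_+(x) = F'_+(x) + C$ and $F_-(x) = F'_-(x) + C$ for all $x \in \mathbf{R}$. Note that this implies that the values of $F$ at its points of discontinuity are irrelevant for the purposes of determining the Lebesgue-Stieltjes measure $\mu_F$; in particular, $\mu_F = \mu_{F_+} = \mu_{F_-}$.

In the special case when $F_+(-\infty) = 0$ and $F_-(+\infty) = 1$, $\mu_F$ is a probability measure, and $F_+(x) = \mu_F((-\infty,x])$ is known as the *cumulative distribution function* of $\mu_F$.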

Now we give some examples of Lebesgue-Stieltjes measure.

Exercise 22 (Lebesgue-Stieltjes measure, absolutely continuous case)

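- If $F$ is the identity function $F(x) = x$, show that $\mu_F$ is equal to Lebesgue measure $m$.
- If $F$ is monotone non-decreasing and absolutely continuous (which in particular implies that $F'$ exists and is absolutely integrable), show that $\mu_F = m_{F'}$ in the sense of Exercise 47 of Notes 3, thus

$$\mu_F(E) = \int_E F'(x)\ dx$$

for any Borel measurable $E$, and

$$\int_{\mathbf{R}} f(x)\ d\mu_F(x) = \int_{\mathbf{R}} f(x) F'(x)\ dx$$

for any unsigned Borel measurable $f: \mathbf{R} \to [0,+\infty]$.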

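In view of the above exercise, the integral $\int_{\mathbf{R}} f\ d\mu_F$ is often abbreviated $\int_{\mathbf{R}} f\ dF$, and referred to as the *Lebesgue-Stieltjes integral* of $f$ with respect to $F$. In particular, observe the identity

$$\int_{[a,b]} dF = F_+(b) - F_-(a)$$

for any monotone non-decreasing $F: \mathbf{R} \to \mathbf{R}$ and any $-\infty < a < b < +\infty$, which can be viewed as yet another formulation of the fundamental theorem of calculus.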
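The link with the classical fundamental theorem of calculus can be spelled out as follows. This is a sketch under the extra assumption (ours, for illustration) that $F$ is continuously differentiable, so that $F_+ = F_- = F$ and $F$ is absolutely continuous on compact intervals.

```latex
% Step 1: the Lebesgue-Stieltjes integral of the constant 1 over [a,b]
% is the measure of [a,b], evaluated via Theorem 18.
\[
\int_{[a,b]} dF = \int_{\mathbf{R}} 1_{[a,b]}\ d\mu_F = \mu_F([a,b]) = F_+(b) - F_-(a).
\]
% Step 2 (assuming F is C^1): mu_F has density F' with respect to Lebesgue
% measure (Exercise 22), and the one-sided limits coincide with F, so
\[
\int_a^b F'(x)\ dx = \mu_F([a,b]) = F(b) - F(a).
\]
```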

Exercise 23 (Lebesgue-Stieltjes measure, pure point case)

- If $H$ is the Heaviside function $H := 1_{[0,+\infty)}$, show that $\mu_H$ is equal to the Dirac measure $\delta_0$ at the origin (defined in Example 9 of Notes 3).
- If $F = \sum_n c_n J_n$ is a jump function (as defined in Definition 17 of Notes 5), show that $\mu_F$ is equal to the linear combination $\sum_n c_n \delta_{x_n}$ of delta functions (as defined in Exercise 22 of Notes 3), where $x_n$ is the point of discontinuity for the basic jump function $J_n$.

Exercise 24 (Lebesgue-Stieltjes measure, singular continuous case)

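- If $F$ is a monotone non-decreasing function, show that $F$ is continuous if and only if $\mu_F(\{x\}) = 0$ for all $x \in \mathbf{R}$.
- If $F$ is the Cantor function (defined in Exercise 46 of Notes 5), show that $\mu_F$ is a probability measure supported on the middle-thirds Cantor set $C$ (Exercise 10 from Notes 1) in the sense that $\mu_F(\mathbf{R} \backslash C) = 0$. The measure $\mu_F$ is known as *Cantor measure*.
- If $\mu$ is Cantor measure, establish the self-similarity properties $\mu(\frac{1}{3} \cdot E) = \frac{1}{2} \mu(E)$ and $\mu(\frac{1}{3} \cdot E + \frac{2}{3}) = \frac{1}{2} \mu(E)$ for every Borel-measurable $E \subset [0,1]$, where $\frac{1}{3} \cdot E := \{ \frac{1}{3} x: x \in E \}$.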

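Exercise 25 (Connection with Riemann-Stieltjes integral) Let $F: \mathbf{R} \to \mathbf{R}$ be monotone non-decreasing, let $[a,b]$ be a compact interval, and let $f: [a,b] \to \mathbf{R}$ be continuous. Suppose that $F$ is continuous at the endpoints $a, b$ of the interval. Show that for every $\varepsilon > 0$ there exists $\delta > 0$ such that

$$\Bigl| \sum_{i=1}^n f(t^*_i) (F(t_i) - F(t_{i-1})) - \int_{[a,b]} f\ dF \Bigr| \leq \varepsilon$$

whenever $a = t_0 < t_1 < \ldots < t_n = b$ and $t^*_i \in [t_{i-1}, t_i]$ for $1 \leq i \leq n$ are such that $\sup_{1 \leq i \leq n} |t_i - t_{i-1}| \leq \delta$. In the language of the Riemann-Stieltjes integral, this result asserts that the Lebesgue-Stieltjes integral extends the Riemann-Stieltjes integral.

Exercise 26 (Integration by parts formula) Let $F, G: \mathbf{R} \to \mathbf{R}$ be monotone non-decreasing and continuous. Show that

$$\int_{[a,b]} F\ dG = - \int_{[a,b]} G\ dF + F(b) G(b) - F(a) G(a)$$

for any compact interval $[a,b]$. (Hint: use Exercise 25.) This formula can be partially extended to the case when one or both of $F, G$ have discontinuities, but care must be taken when $F$ and $G$ are simultaneously discontinuous at the same location.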

** — 4. Product measure — **

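Given two sets $X$ and $Y$, one can form their Cartesian product $X \times Y = \{ (x,y): x \in X, y \in Y \}$. This set is naturally equipped with the coordinate projection maps $\pi_X: X \times Y \to X$ and $\pi_Y: X \times Y \to Y$ defined by setting $\pi_X(x,y) := x$ and $\pi_Y(x,y) := y$. One can certainly take Cartesian products $X_1 \times \ldots \times X_d$ of more than two sets, or even take an infinite product $\prod_{\alpha \in A} X_\alpha$, but for simplicity we will only discuss the theory for products of two sets for now.

Now suppose that $(X, \mathcal{B}_X)$ and $(Y, \mathcal{B}_Y)$ are measurable spaces. Then we can still form the Cartesian product $X \times Y$ and the projection maps $\pi_X: X \times Y \to X$ and $\pi_Y: X \times Y \to Y$. But now we can also form the pullback $\sigma$-algebras

$$\pi_X^*(\mathcal{B}_X) := \{ \pi_X^{-1}(E): E \in \mathcal{B}_X \} = \{ E \times Y: E \in \mathcal{B}_X \}$$

and

$$\pi_Y^*(\mathcal{B}_Y) := \{ \pi_Y^{-1}(E): E \in \mathcal{B}_Y \} = \{ X \times E: E \in \mathcal{B}_Y \}.$$

We then define the *product $\sigma$-algebra* $\mathcal{B}_X \times \mathcal{B}_Y$ to be the $\sigma$-algebra generated by the union of these two $\sigma$-algebras:

$$\mathcal{B}_X \times \mathcal{B}_Y := \langle \pi_X^*(\mathcal{B}_X) \cup \pi_Y^*(\mathcal{B}_Y) \rangle.$$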

This definition has several equivalent formulations:

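Exercise 27 Let $(X, \mathcal{B}_X)$ and $(Y, \mathcal{B}_Y)$ be measurable spaces.

- Show that $\mathcal{B}_X \times \mathcal{B}_Y$ is the $\sigma$-algebra generated by the sets $E \times F$ with $E \in \mathcal{B}_X$, $F \in \mathcal{B}_Y$. In other words, $\mathcal{B}_X \times \mathcal{B}_Y$ is the coarsest $\sigma$-algebra on $X \times Y$ with the property that the product of a $\mathcal{B}_X$-measurable set and a $\mathcal{B}_Y$-measurable set is always measurable.
- Show that $\mathcal{B}_X \times \mathcal{B}_Y$ is the coarsest $\sigma$-algebra on $X \times Y$ that makes the projection maps $\pi_X, \pi_Y$ both measurable morphisms (see Remark 8 from Notes 3).
- If $E \in \mathcal{B}_X \times \mathcal{B}_Y$, show that the sets $E_x := \{ y \in Y: (x,y) \in E \}$ lie in $\mathcal{B}_Y$ for every $x \in X$, and similarly that the sets $E^y := \{ x \in X: (x,y) \in E \}$ lie in $\mathcal{B}_X$ for every $y \in Y$.
- If $f: X \times Y \to [0,+\infty]$ is measurable (with respect to $\mathcal{B}_X \times \mathcal{B}_Y$), show that the function $f_x: y \mapsto f(x,y)$ is $\mathcal{B}_Y$-measurable for every $x \in X$, and similarly that the function $f^y: x \mapsto f(x,y)$ is $\mathcal{B}_X$-measurable for every $y \in Y$.
- If $E \in \mathcal{B}_X \times \mathcal{B}_Y$, show that the slices $E_x := \{ y \in Y: (x,y) \in E \}$ lie in a countably generated $\sigma$-algebra. In other words, show that there exists an at most countable collection $\mathcal{A} = \mathcal{A}_E$ of sets (which can depend on $E$) such that $\{ E_x: x \in X \} \subset \langle \mathcal{A} \rangle$. Conclude in particular that the number of distinct slices $E_x$ is at most $2^{\aleph_0}$, the cardinality of the continuum. (The last part of this exercise is only suitable for students who are comfortable with cardinal arithmetic.)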

Exercise 28

- Show that the product of two trivial $\sigma$-algebras (on two different spaces $X, Y$) is again trivial.
- (Exercise removed)
- Show that the product of two finite $\sigma$-algebras is again finite.
- Show that the product of two Borel $\sigma$-algebras (on two Euclidean spaces $\mathbf{R}^d, \mathbf{R}^{d'}$ with $d, d' \geq 1$) is again the Borel $\sigma$-algebra (on $\mathbf{R}^d \times \mathbf{R}^{d'} \equiv \mathbf{R}^{d+d'}$).
- Show that the product of two Lebesgue $\sigma$-algebras (on two Euclidean spaces $\mathbf{R}^d, \mathbf{R}^{d'}$ with $d, d' \geq 1$) is *not* the Lebesgue $\sigma$-algebra. (Hint: argue by contradiction and use Exercise 27(3).)
- However, show that the Lebesgue $\sigma$-algebra on $\mathbf{R}^{d+d'}$ is the completion of the product of the Lebesgue $\sigma$-algebras of $\mathbf{R}^d$ and $\mathbf{R}^{d'}$ with respect to $d+d'$-dimensional Lebesgue measure (see Exercise 26 of Notes 3 for the definition of completion of a measure space).
- This part of the exercise is only for students who are comfortable with cardinal arithmetic. Give an example to show that the product of two discrete $\sigma$-algebras is not necessarily discrete.
- On the other hand, show that the product of two discrete $\sigma$-algebras $2^X, 2^Y$ is again a discrete $\sigma$-algebra if at least one of the domains $X, Y$ is at most countably infinite.
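To make the definition of the product $\sigma$-algebra concrete, here is a toy computation. The two finite spaces are hypothetical and chosen for illustration; $\pi_X, \pi_Y$ are the coordinate projections and $\langle \cdot \rangle$ denotes the generated $\sigma$-algebra.

```latex
% Hypothetical example: a discrete factor times a trivial factor.
\[
X = \{1,2\}, \quad \mathcal{B}_X = 2^X ; \qquad
Y = \{a,b\}, \quad \mathcal{B}_Y = \{ \emptyset, Y \}.
\]
% Pullback sigma-algebras under the two projections:
\[
\pi_X^*(\mathcal{B}_X) = \{ \emptyset,\ \{1\} \times Y,\ \{2\} \times Y,\ X \times Y \},
\qquad
\pi_Y^*(\mathcal{B}_Y) = \{ \emptyset,\ X \times Y \}.
\]
% Their union is already a sigma-algebra, so it equals the product sigma-algebra:
\[
\mathcal{B}_X \times \mathcal{B}_Y
  = \{ \emptyset,\ \{1\} \times Y,\ \{2\} \times Y,\ X \times Y \}.
\]
```

In particular the singleton $\{(1,a)\}$ is not measurable here, and every slice $E_x$ of a measurable set is either $\emptyset$ or $Y$, in line with Exercise 27.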

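Now suppose we have two measure spaces $(X, \mathcal{B}_X, \mu_X)$ and $(Y, \mathcal{B}_Y, \mu_Y)$. Given that we can multiply together the sets $X$ and $Y$ to form a product set $X \times Y$, and can multiply the $\sigma$-algebras $\mathcal{B}_X$ and $\mathcal{B}_Y$ together to form a product $\sigma$-algebra $\mathcal{B}_X \times \mathcal{B}_Y$, it is natural to expect that we can multiply the two measures $\mu_X: \mathcal{B}_X \to [0,+\infty]$ and $\mu_Y: \mathcal{B}_Y \to [0,+\infty]$ to form a product measure $\mu_X \times \mu_Y: \mathcal{B}_X \times \mathcal{B}_Y \to [0,+\infty]$. In view of the “base times height formula” that one learns in elementary school, one expects to have

$$\mu_X \times \mu_Y(E \times F) = \mu_X(E) \mu_Y(F) \qquad (6)$$

whenever $E \in \mathcal{B}_X$ and $F \in \mathcal{B}_Y$.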

To construct this measure, it is convenient to make the assumption that both spaces are -finite.

Definition 29 ($\sigma$-finite) A measure space $(X, \mathcal{B}, \mu)$ is *$\sigma$-finite* if $X$ can be expressed as the countable union of sets of finite measure.

Thus, for instance, $\mathbf{R}^d$ with Lebesgue measure is $\sigma$-finite, as $\mathbf{R}^d$ can be expressed as the union of (for instance) the balls $B(0,n)$ for $n = 1, 2, 3, \ldots$, each of which has finite measure. On the other hand, $\mathbf{R}^d$ with counting measure is not $\sigma$-finite (why?). But most measure spaces that one actually encounters in analysis (including, clearly, all probability spaces) are $\sigma$-finite. It is possible to partially extend the theory of product spaces to the non-$\sigma$-finite setting, but there are a number of very delicate technical issues that arise and so we will not discuss them here.

As long as we restrict attention to the $\sigma$-finite case, product measure always exists and is unique:

Proposition 30 (Existence and uniqueness of product measure) Let $(X, \mathcal{B}_X, \mu_X)$ and $(Y, \mathcal{B}_Y, \mu_Y)$ be $\sigma$-finite measure spaces. Then there exists a unique measure $\mu_X \times \mu_Y$ on $\mathcal{B}_X \times \mathcal{B}_Y$ that obeys $\mu_X \times \mu_Y(E \times F) = \mu_X(E) \mu_Y(F)$ whenever $E \in \mathcal{B}_X$ and $F \in \mathcal{B}_Y$.

*Proof:* We first show existence. Inspired by the fact that Lebesgue measure is the Hahn-Kolmogorov extension of elementary (pre-)measure, we shall first construct an “elementary product pre-measure” to which we will then apply Theorem 14.

Let $\mathcal{B}_0$ be the collection of all finite unions

$$S := (E_1 \times F_1) \cup \ldots \cup (E_k \times F_k)$$

of Cartesian products of $\mathcal{B}_X$-measurable sets $E_1, \ldots, E_k$ and $\mathcal{B}_Y$-measurable sets $F_1, \ldots, F_k$. (One can think of such sets as being somewhat analogous to elementary sets in Euclidean space, although the analogy is not perfectly exact.) It is not difficult to verify that this is a Boolean algebra (though it is not, in general, a $\sigma$-algebra). Also, any set in $\mathcal{B}_0$ can be easily decomposed into a *disjoint* union of product sets $E_1 \times F_1, \ldots, E_k \times F_k$ of $\mathcal{B}_X$-measurable sets and $\mathcal{B}_Y$-measurable sets (cf. Lemma 2 (and Exercise 2) from the prologue). We then define the quantity $\mu_0(S)$ associated to such a disjoint union $S$ by the formula

$$\mu_0(S) := \sum_{j=1}^k \mu_X(E_j) \mu_Y(F_j)$$

whenever $S$ is the disjoint union of products $E_1 \times F_1, \ldots, E_k \times F_k$ of $\mathcal{B}_X$-measurable sets and $\mathcal{B}_Y$-measurable sets. One can show that this definition does not depend on exactly how $S$ is decomposed, and gives a finitely additive measure $\mu_0: \mathcal{B}_0 \to [0,+\infty]$ (cf. Exercise 2 from the prologue, and also Exercise 31 from Notes 3).

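Now we show that $\mu_0$ is a pre-measure. It suffices to show that if $S \in \mathcal{B}_0$ is the countable disjoint union of sets $S_1, S_2, \ldots \in \mathcal{B}_0$, then $\mu_0(S) = \sum_{n=1}^\infty \mu_0(S_n)$.

Splitting $S$ up into disjoint product sets, and restricting the $S_n$ to each of these product sets in turn, we may assume without loss of generality (using the finite additivity of $\mu_0$) that $S = E \times F$ for some $E \in \mathcal{B}_X$ and $F \in \mathcal{B}_Y$. In a similar spirit, by breaking each $S_n$ up into component product sets and using finite additivity again, we may assume without loss of generality that each $S_n$ takes the form $S_n = E_n \times F_n$ for some $E_n \in \mathcal{B}_X$ and $F_n \in \mathcal{B}_Y$. By definition of $\mu_0$, our objective is now to show that

$$\mu_X(E) \mu_Y(F) = \sum_{n=1}^\infty \mu_X(E_n) \mu_Y(F_n).$$

To do this, first observe from construction that we have the pointwise identity

$$1_E(x) 1_F(y) = \sum_{n=1}^\infty 1_{E_n}(x) 1_{F_n}(y)$$

for all $x \in X$ and $y \in Y$. We fix $x \in X$, and integrate this identity in $y$ (noting that both sides are measurable and unsigned) to conclude that

$$\int_Y 1_E(x) 1_F(y)\ d\mu_Y(y) = \int_Y \sum_{n=1}^\infty 1_{E_n}(x) 1_{F_n}(y)\ d\mu_Y(y).$$

The left-hand side simplifies to $1_E(x) \mu_Y(F)$. To compute the right-hand side, we use the monotone convergence theorem to interchange the summation and integration, and soon see that the right-hand side is $\sum_{n=1}^\infty 1_{E_n}(x) \mu_Y(F_n)$, thus

$$1_E(x) \mu_Y(F) = \sum_{n=1}^\infty 1_{E_n}(x) \mu_Y(F_n)$$

for all $x$. Both sides are measurable and unsigned in $x$, so we may integrate in $X$ and conclude that

$$\int_X 1_E(x) \mu_Y(F)\ d\mu_X(x) = \int_X \sum_{n=1}^\infty 1_{E_n}(x) \mu_Y(F_n)\ d\mu_X(x).$$

The left-hand side here is $\mu_X(E) \mu_Y(F)$. Using monotone convergence as before, the right-hand side simplifies to $\sum_{n=1}^\infty \mu_X(E_n) \mu_Y(F_n)$, and the claim follows.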

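Now that we have established that $\mu_0$ is a pre-measure, we may apply Theorem 14 to extend this measure to a countably additive measure $\mu_X \times \mu_Y$ on a $\sigma$-algebra containing $\mathcal{B}_0$. By Exercise 27(2), $\mu_X \times \mu_Y$ is a countably additive measure on $\mathcal{B}_X \times \mathcal{B}_Y$, and as it extends $\mu_0$, it will obey (6). Finally, to show uniqueness, observe from finite additivity that any measure $\mu_X \times \mu_Y$ on $\mathcal{B}_X \times \mathcal{B}_Y$ that obeys (6) must extend $\mu_0$, and so uniqueness follows from Exercise 15.

Remark 31 When $X$, $Y$ are not both $\sigma$-finite, then one can still construct at least one product measure, but it will, in general, not be unique. This makes the theory much more subtle, and we will not discuss it in these notes.

Example 32 From Exercise 22 of Notes 1, we see that the product $m^d \times m^{d'}$ of the Lebesgue measures $m^d, m^{d'}$ on $(\mathbf{R}^d, \mathcal{L}[\mathbf{R}^d])$ and $(\mathbf{R}^{d'}, \mathcal{L}[\mathbf{R}^{d'}])$ respectively will agree with Lebesgue measure $m^{d+d'}$ on the product space $\mathcal{L}[\mathbf{R}^d] \times \mathcal{L}[\mathbf{R}^{d'}]$, which as noted in Exercise 28 is a subalgebra of $\mathcal{L}[\mathbf{R}^{d+d'}]$. After taking the completion $\overline{m^d \times m^{d'}}$ of this product measure, one obtains the full Lebesgue measure $m^{d+d'}$.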

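Exercise 33 Let $(X, \mathcal{B}_X)$, $(Y, \mathcal{B}_Y)$ be measurable spaces.

- Show that the product of two Dirac measures on $(X, \mathcal{B}_X)$, $(Y, \mathcal{B}_Y)$ is a Dirac measure on $(X \times Y, \mathcal{B}_X \times \mathcal{B}_Y)$.
- If $X, Y$ are at most countable, show that the product of the two counting measures on $(X, 2^X)$, $(Y, 2^Y)$ is the counting measure on $(X \times Y, 2^{X \times Y})$.

Exercise 34 (Associativity of product) Let $(X, \mathcal{B}_X, \mu_X)$, $(Y, \mathcal{B}_Y, \mu_Y)$, $(Z, \mathcal{B}_Z, \mu_Z)$ be $\sigma$-finite measure spaces. We may identify the Cartesian products $(X \times Y) \times Z$ and $X \times (Y \times Z)$ with each other in the obvious manner. If we do so, show that $(\mathcal{B}_X \times \mathcal{B}_Y) \times \mathcal{B}_Z = \mathcal{B}_X \times (\mathcal{B}_Y \times \mathcal{B}_Z)$ and $(\mu_X \times \mu_Y) \times \mu_Z = \mu_X \times (\mu_Y \times \mu_Z)$.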

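Now we integrate using this product measure. We will need the following technical lemma. Define a *monotone class* in $X$ to be a collection $\mathcal{B}$ of subsets of $X$ with the following two closure properties:

- If $E_1 \subset E_2 \subset \ldots$ are a countable increasing sequence of sets in $\mathcal{B}$, then $\bigcup_{n=1}^\infty E_n \in \mathcal{B}$.
- If $E_1 \supset E_2 \supset \ldots$ are a countable decreasing sequence of sets in $\mathcal{B}$, then $\bigcap_{n=1}^\infty E_n \in \mathcal{B}$.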
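Every $\sigma$-algebra is a monotone class, but the converse fails. The following sketch (our example, not one given in the notes) exhibits a monotone class on the real line that is not even a Boolean algebra.

```latex
% Hypothetical example: the class of all intervals of the real line,
% where "interval" means any convex subset (including the empty set,
% single points, and unbounded intervals).
\[
\mathcal{I} := \{ I \subset \mathbf{R} : I \text{ is convex} \}.
\]
% An increasing union of convex sets is convex, and any intersection of
% convex sets is convex, so \mathcal{I} has both closure properties:
\[
I_1 \subset I_2 \subset \ldots \in \mathcal{I}
  \implies \bigcup_{n=1}^\infty I_n \in \mathcal{I},
\qquad
I_1 \supset I_2 \supset \ldots \in \mathcal{I}
  \implies \bigcap_{n=1}^\infty I_n \in \mathcal{I}.
\]
% But \mathcal{I} is not closed under complements or finite unions:
\[
[0,1] \in \mathcal{I}, \qquad \mathbf{R} \backslash [0,1] \notin \mathcal{I},
\qquad [0,1] \cup [2,3] \notin \mathcal{I}.
\]
```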

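Lemma 35 (Monotone class lemma) Let $\mathcal{A}$ be a Boolean algebra on $X$. Then $\langle \mathcal{A} \rangle$ is the smallest monotone class that contains $\mathcal{A}$.

*Proof:* Let $\mathcal{B}$ be the intersection of all the monotone classes that contain $\mathcal{A}$. Since $\langle \mathcal{A} \rangle$ is clearly one such class, $\mathcal{B}$ is a subset of $\langle \mathcal{A} \rangle$. Our task is then to show that $\mathcal{B}$ contains $\langle \mathcal{A} \rangle$.

It is also clear that $\mathcal{B}$ is a monotone class that contains $\mathcal{A}$. By replacing all the elements of $\mathcal{B}$ with their complements, we see that $\mathcal{B}$ is necessarily closed under complements.

For any $E \in \mathcal{A}$, consider the set $\mathcal{C}_E$ of all sets $F \in \mathcal{B}$ such that $F \backslash E$, $E \backslash F$, $F \cap E$, and $X \backslash (E \cup F)$ all lie in $\mathcal{B}$. It is clear that $\mathcal{C}_E$ contains $\mathcal{A}$; since $\mathcal{B}$ is a monotone class, we see that $\mathcal{C}_E$ is also. By definition of $\mathcal{B}$, we conclude that $\mathcal{C}_E = \mathcal{B}$ for all $E \in \mathcal{A}$.

Next, let $\mathcal{D}$ be the set of all $E \in \mathcal{B}$ such that $F \backslash E$, $E \backslash F$, $F \cap E$, and $X \backslash (E \cup F)$ all lie in $\mathcal{B}$ for all $F \in \mathcal{B}$. By the previous discussion, we see that $\mathcal{D}$ contains $\mathcal{A}$. One also easily verifies that $\mathcal{D}$ is a monotone class. By definition of $\mathcal{B}$, we conclude that $\mathcal{D} = \mathcal{B}$. Since $\mathcal{B}$ is also closed under complements, this implies that $\mathcal{B}$ is closed with respect to finite unions. Since this class also contains $\mathcal{A}$, which contains $\emptyset$, we conclude that $\mathcal{B}$ is a Boolean algebra. Since $\mathcal{B}$ is also closed under increasing countable unions, we conclude that it is closed under arbitrary countable unions, and is thus a $\sigma$-algebra. As it contains $\mathcal{A}$, it must also contain $\langle \mathcal{A} \rangle$.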

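Theorem 36 (Tonelli’s theorem, incomplete version) Let $(X, \mathcal{B}_X, \mu_X)$ and $(Y, \mathcal{B}_Y, \mu_Y)$ be $\sigma$-finite measure spaces, and let $f: X \times Y \to [0,+\infty]$ be measurable with respect to $\mathcal{B}_X \times \mathcal{B}_Y$. Then:

- The functions $x \mapsto \int_Y f(x,y)\ d\mu_Y(y)$ and $y \mapsto \int_X f(x,y)\ d\mu_X(x)$ (which are well-defined, thanks to Exercise 27) are measurable with respect to $\mathcal{B}_X$ and $\mathcal{B}_Y$ respectively.
- We have

$$\int_{X \times Y} f(x,y)\ d\mu_X \times \mu_Y(x,y) = \int_X \Bigl( \int_Y f(x,y)\ d\mu_Y(y) \Bigr)\ d\mu_X(x) = \int_Y \Bigl( \int_X f(x,y)\ d\mu_X(x) \Bigr)\ d\mu_Y(y).$$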

*Proof:* By writing the $\sigma$-finite space $X$ as an increasing union $X = \bigcup_{n=1}^\infty X_n$ of finite measure sets, we see from several applications of the monotone convergence theorem that it suffices to prove the claims with $X$ replaced by $X_n$. Thus we may assume without loss of generality that $X$ has finite measure. Similarly we may assume $Y$ has finite measure. Note from (6) that this implies that $X \times Y$ has finite measure also.

Every unsigned measurable function is the increasing limit of unsigned simple functions. By several applications of the monotone convergence theorem, we thus see that it suffices to verify the claim when $f$ is a simple function. By linearity, it then suffices to verify the claim when $f$ is an indicator function, thus $f = 1_S$ for some $S \in \mathcal{B}_X \times \mathcal{B}_Y$.

Let $\mathcal{C}$ be the set of all $S \in \mathcal{B}_X \times \mathcal{B}_Y$ for which the claims hold. From repeated applications of the monotone convergence theorem and the downward monotone convergence theorem (which is available in this finite measure setting) we see that $\mathcal{C}$ is a monotone class.

By direct computation (using (6)), we see that $\mathcal{C}$ contains as an element any product $S = E \times F$ with $E \in \mathcal{B}_X$ and $F \in \mathcal{B}_Y$. By finite additivity, we conclude that $\mathcal{C}$ also contains as an element any disjoint finite union $S = (E_1 \times F_1) \cup \ldots \cup (E_k \times F_k)$ of such products. This implies that $\mathcal{C}$ also contains the Boolean algebra $\mathcal{B}_0$ in the proof of Proposition 30, as such sets can always be expressed as the disjoint finite union of Cartesian products of measurable sets. Applying the monotone class lemma, we conclude that $\mathcal{C}$ contains $\langle \mathcal{B}_0 \rangle = \mathcal{B}_X \times \mathcal{B}_Y$, and the claim follows.
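As an illustration of how the interchange of integrals is used in practice, here is a classical computation. The choice of integrand and the parameters $0 < a < b$ are ours, not from the notes. The function $f(x,y) = e^{-xy}$ is unsigned and continuous (hence Borel measurable) on $(0,+\infty) \times [a,b]$, and both factors carry Lebesgue measure, which is $\sigma$-finite, so Tonelli's theorem applies.

```latex
% Requires amsmath (align*).
% Worked example: a Frullani-type integral evaluated via Tonelli's theorem.
\begin{align*}
\int_0^\infty \frac{e^{-ax} - e^{-bx}}{x}\ dx
  &= \int_0^\infty \Bigl( \int_a^b e^{-xy}\ dy \Bigr)\ dx
     && \text{(inner integral computed directly)} \\
  &= \int_a^b \Bigl( \int_0^\infty e^{-xy}\ dx \Bigr)\ dy
     && \text{(Tonelli: unsigned integrand)} \\
  &= \int_a^b \frac{dy}{y}
   = \log \frac{b}{a} .
\end{align*}
```

No absolute integrability needs to be checked in advance here: because the integrand is unsigned, the interchange is valid even if the common value were infinite.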

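Remark 37 Note that Tonelli’s theorem for sums (Theorem 2 from Notes 1) is a special case of the above result when $\mu_X, \mu_Y$ are counting measure. In a similar spirit, Corollary 15 from Notes 3 is the special case when just one of $\mu_X, \mu_Y$ is counting measure.

Corollary 38 Let $(X, \mathcal{B}_X, \mu_X)$ and $(Y, \mathcal{B}_Y, \mu_Y)$ be $\sigma$-finite measure spaces, and let $E \in \mathcal{B}_X \times \mathcal{B}_Y$ be a null set with respect to $\mu_X \times \mu_Y$. Then for $\mu_X$-almost every $x \in X$, the set $E_x := \{ y \in Y: (x,y) \in E \}$ is a $\mu_Y$-null set; and similarly, for $\mu_Y$-almost every $y \in Y$, the set $E^y := \{ x \in X: (x,y) \in E \}$ is a $\mu_X$-null set.

*Proof:* Applying the Tonelli theorem to the indicator function $1_E$, we conclude that

$$0 = \int_X \Bigl( \int_Y 1_E(x,y)\ d\mu_Y(y) \Bigr)\ d\mu_X(x) = \int_Y \Bigl( \int_X 1_E(x,y)\ d\mu_X(x) \Bigr)\ d\mu_Y(y)$$

and thus

$$0 = \int_X \mu_Y(E_x)\ d\mu_X(x) = \int_Y \mu_X(E^y)\ d\mu_Y(y),$$

and the claim follows.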

With this corollary, we can easily extend Tonelli’s theorem to the completion of the product space (see Exercise 26 of Notes 3 for the definition of completion):

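Theorem 39 (Tonelli’s theorem, complete version) Let $(X, \mathcal{B}_X, \mu_X)$ and $(Y, \mathcal{B}_Y, \mu_Y)$ be complete $\sigma$-finite measure spaces, and let $f: X \times Y \to [0,+\infty]$ be measurable with respect to $\overline{\mathcal{B}_X \times \mathcal{B}_Y}$. Then: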

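*Proof:* From Exercise 26 of Notes 3, every measurable set in $\overline{\mathcal{B}_X \times \mathcal{B}_Y}$ is equal to a measurable set in $\mathcal{B}_X \times \mathcal{B}_Y$ outside of a $\mu_X \times \mu_Y$-null set. This implies that the $\overline{\mathcal{B}_X \times \mathcal{B}_Y}$-measurable function $f$ agrees with a $\mathcal{B}_X \times \mathcal{B}_Y$-measurable function $\tilde f$ outside of a $\mu_X \times \mu_Y$-null set $E$ (as can be seen by expressing $f$ as the limit of simple functions). From Corollary 38, we see that for $\mu_X$-almost every $x \in X$, the function $y \mapsto f(x,y)$ agrees with $y \mapsto \tilde f(x,y)$ outside of a $\mu_Y$-null set (and is in particular measurable, as $(Y, \mathcal{B}_Y, \mu_Y)$ is complete); and similarly for $\mu_Y$-almost every $y \in Y$, the function $x \mapsto f(x,y)$ agrees with $x \mapsto \tilde f(x,y)$ outside of a $\mu_X$-null set and is measurable. The claim then follows by applying the incomplete version of Tonelli’s theorem to $\tilde f$.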

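Specialising to the case when $f$ is an indicator function $f = 1_E$, we conclude

Corollary 40 (Tonelli’s theorem for sets) Let $(X, \mathcal{B}_X, \mu_X)$ and $(Y, \mathcal{B}_Y, \mu_Y)$ be complete $\sigma$-finite measure spaces, and let $E \in \overline{\mathcal{B}_X \times \mathcal{B}_Y}$. Then: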

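Exercise 41 The purpose of this exercise is to demonstrate that Tonelli’s theorem can fail if the $\sigma$-finite hypothesis is removed, and also that product measure need not be unique. Let $(X, \mathcal{B}_X, \mu_X)$ be the unit interval $[0,1]$ with Lebesgue measure $m$ (and the Lebesgue $\sigma$-algebra $\mathcal{L}([0,1])$), and let $(Y, \mathcal{B}_Y, \mu_Y)$ be the unit interval $[0,1]$ with counting measure $\#$ (and the discrete $\sigma$-algebra $2^{[0,1]}$). Let $f := 1_E$ be the indicator function of the diagonal $E := \{ (x,x): x \in [0,1] \}$.

- Show that $f$ is measurable in the product $\sigma$-algebra.
- Show that $\int_X \left(\int_Y f(x,y)\ d\mu_Y(y)\right)\ d\mu_X(x) = 1$.
- Show that $\int_Y \left(\int_X f(x,y)\ d\mu_X(x)\right)\ d\mu_Y(y) = 0$.
- Show that there is more than one measure $\mu$ on $\mathcal{B}_X \times \mathcal{B}_Y$ with the property that $\mu(E \times F) = \mu_X(E) \mu_Y(F)$ for all $E \in \mathcal{B}_X$ and $F \in \mathcal{B}_Y$. (Hint: use the two different ways to perform a double integral to create two different measures.)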

Remark 42 If $f$ is not assumed to be measurable in the product space (or its completion), then of course the expression $\int_{X \times Y} f(x,y)\ d\overline{\mu_X \times \mu_Y}(x,y)$ does not make sense. Furthermore, in this case the remaining two expressions in (7) may become different as well (in some models of set theory, at least), even when $X$ and $Y$ are finite measure. For instance, let us assume the continuum hypothesis, which implies that the unit interval $[0,1]$ can be placed in one-to-one correspondence with the first uncountable ordinal $\omega_1$. Let $\prec$ be the ordering of $[0,1]$ that is associated to this ordinal, let $E := \{ (x,y) \in [0,1]^2: x \prec y \}$, and let $f := 1_E$. Then, for any $y \in [0,1]$, there are at most countably many $x$ such that $x \prec y$, and so $\int_{[0,1]} f(x,y)\ dx$ exists and is equal to zero for every $y$. On the other hand, for every $x \in [0,1]$, one has $x \prec y$ for all but countably many $y \in [0,1]$, and so $\int_{[0,1]} f(x,y)\ dy$ exists and is equal to one for every $x$, and so the last two expressions in (7) exist but are unequal. (In particular, Tonelli’s theorem implies that $E$ cannot be a Lebesgue measurable subset of $[0,1]^2$.) Thus we see that measurability in the product space is an important hypothesis. (There do however exist models of set theory (with the axiom of choice) in which such counterexamples cannot be constructed, at least in the case when $X$ and $Y$ are the unit interval with Lebesgue measure.)

Tonelli’s theorem is for the unsigned integral, but it leads to an important analogue for the absolutely convergent integral, known as Fubini’s theorem:

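Theorem 43 (Fubini’s theorem) Let $(X, \mathcal{B}_X, \mu_X)$ and $(Y, \mathcal{B}_Y, \mu_Y)$ be complete $\sigma$-finite measure spaces, and let $f: X \times Y \to \mathbf{C}$ be absolutely integrable with respect to $\overline{\mathcal{B}_X \times \mathcal{B}_Y}$. Then:

- For $\mu_X$-almost every $x \in X$, the function $y \mapsto f(x,y)$ is absolutely integrable with respect to $\mu_Y$, and in particular $\int_Y f(x,y)\ d\mu_Y(y)$ exists. Furthermore, the ($\mu_X$-almost everywhere defined) map $x \mapsto \int_Y f(x,y)\ d\mu_Y(y)$ is absolutely integrable with respect to $\mu_X$.
- For $\mu_Y$-almost every $y \in Y$, the function $x \mapsto f(x,y)$ is absolutely integrable with respect to $\mu_X$, and in particular $\int_X f(x,y)\ d\mu_X(x)$ exists. Furthermore, the ($\mu_Y$-almost everywhere defined) map $y \mapsto \int_X f(x,y)\ d\mu_X(x)$ is absolutely integrable with respect to $\mu_Y$.
- We have

$$\int_{X \times Y} f(x,y)\ d\overline{\mu_X \times \mu_Y}(x,y) = \int_X \left(\int_Y f(x,y)\ d\mu_Y(y)\right)\ d\mu_X(x) = \int_Y \left(\int_X f(x,y)\ d\mu_X(x)\right)\ d\mu_Y(y).$$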

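*Proof:* By taking real and imaginary parts we may assume that $f$ is real; by taking positive and negative parts we may assume that $f$ is unsigned. But then the claim follows from Tonelli’s theorem; note from (7) that $\int_X \left(\int_Y f(x,y)\ d\mu_Y(y)\right)\ d\mu_X(x)$ is finite, and so $\int_Y f(x,y)\ d\mu_Y(y) < \infty$ for $\mu_X$-almost every $x \in X$, and similarly $\int_X f(x,y)\ d\mu_X(x) < \infty$ for $\mu_Y$-almost every $y \in Y$.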

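Exercise 44 Give an example of a Borel measurable function $f: [0,1]^2 \to \mathbf{R}$ such that the integrals $\int_{[0,1]} f(x,y)\ dy$ and $\int_{[0,1]} f(x,y)\ dx$ exist and are absolutely integrable for all $x \in [0,1]$ and $y \in [0,1]$ respectively, and that $\int_{[0,1]} \left(\int_{[0,1]} f(x,y)\ dy\right)\ dx$ and $\int_{[0,1]} \left(\int_{[0,1]} f(x,y)\ dx\right)\ dy$ exist and are absolutely integrable, but such that these two iterated integrals are unequal. (Hint: adapt the example from Remark 2 of Notes 1.) Thus we see that Fubini’s theorem fails when one drops the hypothesis that $f$ is absolutely integrable with respect to the product space.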
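To see how an interchange of order can fail without absolute integrability, here is a minimal sketch in the discrete setting (counting measure on the natural numbers in both variables). The array `a` below is a standard example of this kind; whether it coincides with the example in Remark 2 of Notes 1 is an assumption. Each row and column has at most two non-zero entries, so the inner sums are finite and can be computed exactly.

```python
def a(n, m):
    """Signed array: +1 on the diagonal, -1 just to the right of it, 0 elsewhere."""
    if m == n:
        return 1
    if m == n + 1:
        return -1
    return 0


def row_sum(n):
    """Sum of a(n, m) over all m >= 0; only m = n and m = n + 1 contribute."""
    return sum(a(n, m) for m in range(n + 2))


def col_sum(m):
    """Sum of a(n, m) over all n >= 0; only n = m - 1 and n = m contribute."""
    return sum(a(n, m) for n in range(m + 1))


N = 50
rows_first = sum(row_sum(n) for n in range(N))  # every row sums to zero
cols_first = sum(col_sum(m) for m in range(N))  # only column 0 has a non-zero sum

# The absolute values are not summable: the N x N block already has 2N - 1 unit entries.
abs_block = sum(abs(a(n, m)) for n in range(N) for m in range(N))

print(rows_first, cols_first, abs_block)
```

The two iterated sums stay at 0 and 1 respectively however large `N` is taken, while the sum of absolute values grows without bound, which is exactly the hypothesis that Fubini's theorem needs and this array lacks.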

Remark 45 Despite the failure of Tonelli’s theorem in the non-$\sigma$-finite setting, it is possible to (carefully) extend Fubini’s theorem to the non-$\sigma$-finite setting, as the absolute integrability hypotheses, when combined with Markov’s inequality, can provide a substitute for the $\sigma$-finite property. However, we will not do so here, and indeed I would recommend proceeding with extreme caution when performing any sort of interchange of integrals or invoking of product measure when one is not in the $\sigma$-finite setting.

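Informally, Fubini’s theorem allows one to always interchange the order of two integrals, as long as the integrand is absolutely integrable in the product space (or its completion). In particular, specialising to Lebesgue measure, we have

$$\int_{\mathbf{R}^{d+d'}} f(x,y)\ d(x,y) = \int_{\mathbf{R}^d} \left(\int_{\mathbf{R}^{d'}} f(x,y)\ dy\right)\ dx = \int_{\mathbf{R}^{d'}} \left(\int_{\mathbf{R}^d} f(x,y)\ dx\right)\ dy$$

whenever $f: \mathbf{R}^{d+d'} \to \mathbf{C}$ is absolutely integrable. In view of this, we often write $dx\,dy$ (or $dy\,dx$) for $d(x,y)$.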

By combining Fubini’s theorem with Tonelli’s theorem, we can recast the absolute integrability hypothesis:

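Corollary 46 (Fubini-Tonelli theorem) Let $(X, \mathcal{B}_X, \mu_X)$ and $(Y, \mathcal{B}_Y, \mu_Y)$ be complete $\sigma$-finite measure spaces, and let $f: X \times Y \to \mathbf{C}$ be measurable with respect to $\overline{\mathcal{B}_X \times \mathcal{B}_Y}$. If

$$\int_X \left(\int_Y |f(x,y)|\ d\mu_Y(y)\right)\ d\mu_X(x) < \infty$$

(note the left-hand side always exists, by Tonelli’s theorem) then $f$ is absolutely integrable with respect to $\overline{\mathcal{B}_X \times \mathcal{B}_Y}$, and in particular the conclusions of Fubini’s theorem hold. Similarly if we use $\int_Y \left(\int_X |f(x,y)|\ d\mu_X(x)\right)\ d\mu_Y(y)$ instead of $\int_X \left(\int_Y |f(x,y)|\ d\mu_Y(y)\right)\ d\mu_X(x)$.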

The Fubini-Tonelli theorem is an indispensable tool for computing integrals. We give some basic examples below:

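Exercise 47 (Area interpretation of integral) Let $(X, \mathcal{B}, \mu)$ be a $\sigma$-finite measure space, and let $\mathbf{R}$ be equipped with Lebesgue measure $m$ and the Borel $\sigma$-algebra $\mathcal{B}[\mathbf{R}]$. Show that if $f: X \to [0,+\infty]$ is measurable, then the set $\{ (x,t) \in X \times \mathbf{R}: 0 \leq t \leq f(x) \}$ is measurable in $\mathcal{B} \times \mathcal{B}[\mathbf{R}]$, and

$$(\mu \times m)( \{ (x,t) \in X \times \mathbf{R}: 0 \leq t \leq f(x) \} ) = \int_X f(x)\ d\mu(x).$$

Similarly if we replace $\{ (x,t) \in X \times \mathbf{R}: 0 \leq t \leq f(x) \}$ by $\{ (x,t) \in X \times \mathbf{R}: 0 \leq t < f(x) \}$.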

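Exercise 48 (Distribution formula) Let $(X, \mathcal{B}, \mu)$ be a $\sigma$-finite measure space, and let $f: X \to [0,+\infty]$ be measurable. Show that

$$\int_X f(x)\ d\mu(x) = \int_{[0,+\infty]} \mu( \{ x \in X: f(x) \geq \lambda \} )\ d\lambda.$$

(Note that the integrand on the right-hand side is monotone and thus Lebesgue measurable.) Similarly if we replace $\{ x \in X: f(x) \geq \lambda \}$ by $\{ x \in X: f(x) > \lambda \}$.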
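The distribution formula can be checked by hand on a finite measure space, reading it as $\int_X f\ d\mu = \int_0^\infty \mu(\{ f \geq \lambda \})\ d\lambda$ (the displayed formula is an assumption of this sketch). The weights `mu` and values `f` below are hypothetical. Since $\lambda \mapsto \mu(\{ f \geq \lambda \})$ is then a step function, its integral is a finite sum over the distinct values of $f$.

```python
from fractions import Fraction

# Hypothetical finite measure space: point i has weight mu[i], and f takes the value f[i] there.
mu = [Fraction(1, 2), Fraction(2), Fraction(1, 3), Fraction(1), Fraction(5, 4)]
f = [Fraction(3), Fraction(0), Fraction(3), Fraction(7, 2), Fraction(1)]


def integral(f, mu):
    """Left-hand side: the integral of f against mu, a weighted sum on a finite space."""
    return sum(value * weight for value, weight in zip(f, mu))


def superlevel_measure(f, mu, lam):
    """mu({x : f(x) >= lam})."""
    return sum(weight for value, weight in zip(f, mu) if value >= lam)


def distribution_integral(f, mu):
    """Right-hand side: integrate lam -> mu({f >= lam}) over (0, infinity).

    Between consecutive distinct values v_prev < v of f, the integrand is
    constant and equal to mu({f >= v}), so the integral is a finite sum.
    """
    total = Fraction(0)
    previous = Fraction(0)
    for level in sorted({value for value in f if value > 0}):
        total += (level - previous) * superlevel_measure(f, mu, level)
        previous = level
    return total


print(integral(f, mu), distribution_integral(f, mu))
```

The same computation is the "horizontal slicing" picture of the integral, in contrast to the "vertical slicing" of the area interpretation.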

Exercise 49 (Approximations to the identity) Let $P: \mathbf{R}^d \to \mathbf{R}^+$ be a good kernel (see Exercise 26 from Notes 5), and let $P_t$ be the associated rescaled functions. Show that if $f: \mathbf{R}^d \to \mathbf{C}$ is absolutely integrable, then $f * P_t$ converges in $L^1$ norm to $f$ as $t \to 0$. (Hint: use the density argument. You will need an upper bound on $\| f * P_t \|_{L^1(\mathbf{R}^d)}$ which can be obtained using Tonelli’s theorem.)

** — 5. Application: the Rademacher differentiation theorem (Optional) — **

The Fubini-Tonelli theorem is often used in extending lower-dimensional results to higher-dimensional ones. We illustrate this by extending the one-dimensional Lipschitz differentiation theorem (Exercise 40 from Notes 5) to higher dimensions. We first recall some higher-dimensional definitions:

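Definition 50 (Lipschitz continuity) A function $f: X \to Y$ from one metric space $(X, d_X)$ to another $(Y, d_Y)$ is said to be *Lipschitz continuous* if there exists a constant $C > 0$ such that $d_Y(f(x), f(x')) \leq C d_X(x, x')$ for all $x, x' \in X$. (In our current application, $X$ will be $\mathbf{R}^d$ and $Y$ will be $\mathbf{R}$, with the usual metrics.)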

Exercise 51Show that Lipschitz continuous functions are uniformly continuous, and hence continuous. Then give an example of a uniformly continuous function that is not Lipschitz continuous.

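Definition 52 (Differentiability) Let $f: \mathbf{R}^d \to \mathbf{R}$ be a function, and let $x_0 \in \mathbf{R}^d$. For any $v \in \mathbf{R}^d$, we say that $f$ is *directionally differentiable* at $x_0$ in the direction $v$ if the limit

$$D_v f(x_0) := \lim_{h \to 0;\ h \in \mathbf{R} \backslash \{0\}} \frac{f(x_0+hv) - f(x_0)}{h}$$

exists, in which case we call $D_v f(x_0)$ the *directional derivative* of $f$ at $x_0$ in this direction. If $v = e_i$ is one of the standard basis vectors $e_1,\ldots,e_d$ of $\mathbf{R}^d$, we write $D_v f(x_0)$ as $\frac{\partial f}{\partial x_i}(x_0)$, and refer to this as the *partial derivative* of $f$ at $x_0$ in the $e_i$ direction.

We say that $f$ is *totally differentiable* at $x_0$ if there exists a vector $\nabla f(x_0) \in \mathbf{R}^d$ with the property that

$$\lim_{h \to 0;\ h \in \mathbf{R}^d \backslash \{0\}} \frac{|f(x_0+h) - f(x_0) - h \cdot \nabla f(x_0)|}{|h|} = 0,$$

where $v \cdot w$ is the usual dot product on $\mathbf{R}^d$. We refer to $\nabla f(x_0)$ (if it exists) as the *gradient* of $f$ at $x_0$.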

Remark 53 From the viewpoint of differential geometry, it is better to work not with the gradient vector $\nabla f(x_0) \in \mathbf{R}^d$, but rather with the derivative covector $df(x_0): \mathbf{R}^d \to \mathbf{R}$ given by $df(x_0): v \mapsto D_v f(x_0)$. This is because one can then define the notion of total differentiability without any mention of the Euclidean dot product, which allows one to extend this notion to other manifolds in which there is no Euclidean (or more generally, Riemannian) structure. However, as we are working exclusively in Euclidean space for this application, this distinction will not be important for us.

Total differentiability implies directional and partial differentiability, but not conversely, as the following three exercises demonstrate.

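Exercise 54 (Total differentiability implies directional and partial differentiability) Show that if $f: \mathbf{R}^d \to \mathbf{R}$ is totally differentiable at $x_0$, then it is directionally differentiable at $x_0$ in each direction $v \in \mathbf{R}^d$, and one has the formula

$$D_v f(x_0) = v \cdot \nabla f(x_0). \ \ \ \ \ (9)$$

In particular, the partial derivatives $\frac{\partial f}{\partial x_i}(x_0)$ exist for $i = 1,\ldots,d$ and

$$\nabla f(x_0) = \left( \frac{\partial f}{\partial x_1}(x_0), \ldots, \frac{\partial f}{\partial x_d}(x_0) \right). \ \ \ \ \ (10)$$

Exercise 55 (Continuous partial differentiability implies total differentiability) Let $f: \mathbf{R}^d \to \mathbf{R}$ be such that the partial derivatives $\frac{\partial f}{\partial x_i}: \mathbf{R}^d \to \mathbf{R}$ exist everywhere and are continuous. Then show that $f$ is totally differentiable everywhere, which in particular implies that the gradient is given by the formula (10) and the directional derivatives are given by (9).

Exercise 56 (Directional differentiability does not imply total differentiability) Let $f: \mathbf{R}^2 \to \mathbf{R}$ be defined by setting $f(0,0) := 0$ and $f(x_1,x_2) := \frac{x_1 x_2^2}{x_1^2 + x_2^2}$ for $(x_1,x_2) \in \mathbf{R}^2 \backslash \{(0,0)\}$. Show that the directional derivatives $D_v f(x)$ exist for all $x, v \in \mathbf{R}^2$ (so in particular, the partial derivatives exist), but that $f$ is not totally differentiable at the origin $(0,0)$.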
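The gap between directional and total differentiability can be seen numerically. The sketch below assumes that the function in Exercise 56 is $f(x_1,x_2) = x_1 x_2^2 / (x_1^2 + x_2^2)$ away from the origin, with $f(0,0) = 0$; this formula is an assumption of the sketch. It only evaluates difference quotients at the origin and is not a substitute for the proof the exercise asks for.

```python
from fractions import Fraction


def f(x1, x2):
    """Assumed example: f(0, 0) = 0 and f = x1 * x2^2 / (x1^2 + x2^2) elsewhere."""
    x1, x2 = Fraction(x1), Fraction(x2)
    if x1 == 0 and x2 == 0:
        return Fraction(0)
    return x1 * x2**2 / (x1**2 + x2**2)


def difference_quotient(v, h):
    """(f(0 + h v) - f(0)) / h for a direction v = (v1, v2) and a step h != 0."""
    h = Fraction(h)
    return (f(h * v[0], h * v[1]) - f(0, 0)) / h


# f is homogeneous of degree one, so the quotient does not depend on h
# and the directional derivative at the origin in the direction v is f(v).
d_e1 = difference_quotient((1, 0), Fraction(1, 1000))
d_e2 = difference_quotient((0, 1), Fraction(1, 1000))
d_diag = difference_quotient((1, 1), Fraction(1, 1000))

print(d_e1, d_e2, d_diag)
```

Both partial derivatives at the origin vanish, so a gradient there would have to be zero and formula (9) would force every directional derivative to vanish; the diagonal direction gives a non-zero value instead, so the map from directions to directional derivatives is not linear.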

Now we can state the Rademacher differentiation theorem.

Theorem 57 (Rademacher differentiation theorem) Let $f: \mathbf{R}^d \to \mathbf{R}$ be Lipschitz continuous. Then $f$ is totally differentiable at $x_0$ for almost every $x_0 \in \mathbf{R}^d$.

Note that the $d=1$ case of this theorem is Exercise 40 from Notes 5, and indeed we will use the one-dimensional theorem to imply the higher-dimensional one, though there will be some technical issues due to the gap between directional and total differentiability.

*Proof:* The strategy here is to first aim for the more modest goal of directional differentiability, and then find a way to link the directional derivatives together to get total differentiability.

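Let $v \in \mathbf{R}^d$. As $f$ is continuous, we see that in order for the directional derivative

$$D_v f(x) := \lim_{h \to 0;\ h \in \mathbf{R} \backslash \{0\}} \frac{f(x+hv) - f(x)}{h}$$

to exist, it suffices to let $h$ range in the dense subset $\mathbf{Q} \backslash \{0\}$ of $\mathbf{R} \backslash \{0\}$ for the purposes of determining whether the limit exists. In particular, $D_v f(x)$ exists if and only if

$$\limsup_{h \to 0;\ h \in \mathbf{Q} \backslash \{0\}} \frac{f(x+hv) - f(x)}{h} = \liminf_{h \to 0;\ h \in \mathbf{Q} \backslash \{0\}} \frac{f(x+hv) - f(x)}{h}.$$

From this we easily conclude that for each direction $v \in \mathbf{R}^d$, the set

$$E_v := \{ x \in \mathbf{R}^d: D_v f(x) \text{ does not exist} \}$$

is Lebesgue measurable in $\mathbf{R}^d$ (indeed, it is even Borel measurable). A similar argument reveals that $D_v f$ is a measurable function outside of $E_v$. From the Lipschitz nature of $f$, we see that $D_v f$ is also a bounded function.

Now we claim that $E_v$ is a null set for each $v$. For $v = 0$, the set $E_v$ is clearly empty, so we may assume $v \neq 0$. Applying an invertible linear transformation to map $v$ to $e_1$ (noting that such transformations will map Lipschitz functions to Lipschitz functions, and null sets to null sets) we may assume without loss of generality that $v$ is the basis vector $e_1$. Thus our task is now to show that $\frac{\partial f}{\partial x_1}(x)$ exists for almost every $x \in \mathbf{R}^d$.

We now split $\mathbf{R}^d$ as $\mathbf{R} \times \mathbf{R}^{d-1}$. For each $x_0 \in \mathbf{R}$ and $y_0 \in \mathbf{R}^{d-1}$, we see from the definitions that $\frac{\partial f}{\partial x_1}(x_0,y_0)$ exists if and only if the one-dimensional function $x \mapsto f(x,y_0)$ is differentiable at $x_0$. But this function is Lipschitz continuous (this is inherited from the Lipschitz continuity of $f$), and so we see that for each fixed $y_0 \in \mathbf{R}^{d-1}$, the set $\{ x_0 \in \mathbf{R}: (x_0,y_0) \in E_{e_1} \}$ is a null set in $\mathbf{R}$. Applying Tonelli’s theorem for sets (Corollary 40), we conclude that $E_{e_1}$ is a null set as required.

We would like to now conclude that $\bigcup_{v \in \mathbf{R}^d} E_v$ is a null set, but there are uncountably many $v$’s, so this is not directly possible. However, as $\mathbf{Q}^d$ is countable, we can at least assert that $E := \bigcup_{v \in \mathbf{Q}^d} E_v$ is a null set. In particular, for almost every $x_0 \in \mathbf{R}^d$, $f$ is directionally differentiable in every rational direction $v \in \mathbf{Q}^d$.

Now we perform an important trick, in which we interpret the directional derivative as a weak derivative. We already know that $D_v f$ is almost everywhere defined, bounded and measurable. Now let $g: \mathbf{R}^d \to \mathbf{R}$ be any function that is compactly supported and Lipschitz continuous. We investigate the integral

$$\int_{\mathbf{R}^d} D_v f(x) g(x)\ dx.$$

This integral is absolutely convergent since $D_v f$ is bounded and measurable, and $g$ is continuous and compactly supported, hence bounded. We expand this out as

$$\int_{\mathbf{R}^d} \lim_{h \to 0;\ h \in \mathbf{R} \backslash \{0\}} \frac{f(x+hv) - f(x)}{h} g(x)\ dx.$$

Note (from the Lipschitz nature of $f$) that the expression $\frac{f(x+hv) - f(x)}{h} g(x)$ is bounded uniformly in $h$ and $x$, and is also uniformly compactly supported in $x$ for $h$ in a bounded set. We may thus apply the Lebesgue dominated convergence theorem to pull the limit out of the integral to obtain

$$\lim_{h \to 0;\ h \in \mathbf{R} \backslash \{0\}} \int_{\mathbf{R}^d} \frac{f(x+hv) - f(x)}{h} g(x)\ dx.$$

Now, from translation invariance of the Lebesgue integral (Exercise 15) we have

$$\int_{\mathbf{R}^d} f(x+hv) g(x)\ dx = \int_{\mathbf{R}^d} f(x) g(x-hv)\ dx$$

and so (by the linearity of the Lebesgue integral) we may rearrange the previous expression as

$$\lim_{h \to 0;\ h \in \mathbf{R} \backslash \{0\}} \int_{\mathbf{R}^d} f(x) \frac{g(x-hv) - g(x)}{h}\ dx.$$

Now, as $g$ is Lipschitz, we know that $\frac{g(x-hv) - g(x)}{h}$ is uniformly bounded and converges pointwise almost everywhere to $-D_v g(x)$ as $h \to 0$. We may thus apply the dominated convergence theorem again and end up with the *integration by parts formula*

$$\int_{\mathbf{R}^d} D_v f(x) g(x)\ dx = - \int_{\mathbf{R}^d} f(x) D_v g(x)\ dx.$$

This formula moves the directional derivative operator $D_v$ from $f$ over to $g$. At present, this does not look like much of an advantage, because $g$ is the same sort of function that $f$ is. However, the key point is that we can choose $g$ to be whatever we please, whereas $f$ is fixed. In particular, we can choose $g$ to be a compactly supported, *continuously differentiable* function (such functions are Lipschitz from the fundamental theorem of calculus, as their derivatives are bounded). By Exercise 55, one has $D_v g = v \cdot \nabla g$ for such functions, and so

$$\int_{\mathbf{R}^d} D_v f(x) g(x)\ dx = - \int_{\mathbf{R}^d} f(x) (v \cdot \nabla g)(x)\ dx.$$

The right-hand side is linear in $v$, and so the left-hand side must be linear in $v$ also. In particular, if $v = (v_1,\ldots,v_d)$, then we have

$$\int_{\mathbf{R}^d} D_v f(x) g(x)\ dx = \sum_{j=1}^d v_j \int_{\mathbf{R}^d} D_{e_j} f(x) g(x)\ dx.$$

If we define the gradient candidate function

$$\nabla f(x) := (D_{e_1} f(x), \ldots, D_{e_d} f(x)) = \left( \frac{\partial f}{\partial x_1}(x), \ldots, \frac{\partial f}{\partial x_d}(x) \right)$$

(note that this function is well-defined almost everywhere, even though we don’t know yet whether $f$ is totally differentiable almost everywhere), we thus have

$$\int_{\mathbf{R}^d} (D_v f - v \cdot \nabla f)(x) g(x)\ dx = 0$$

for all compactly supported, continuously differentiable $g$. This implies (see Exercise 58 below) that $D_v f - v \cdot \nabla f$ vanishes almost everywhere, thus (by countable subadditivity) we have

$$D_v f(x) = v \cdot \nabla f(x) \ \ \ \ \ (12)$$

for almost every $x \in \mathbf{R}^d$ and every $v \in \mathbf{Q}^d$.

Let $x_0$ be such that (12) holds for all $v \in \mathbf{Q}^d$. We claim that this forces $f$ to be totally differentiable at $x_0$, which would give the claim. Let $f_1: \mathbf{R}^d \to \mathbf{R}$ be the modified function

$$f_1(h) := f(x_0+h) - f(x_0) - h \cdot \nabla f(x_0).$$

Our objective is to show that

$$\lim_{h \to 0;\ h \in \mathbf{R}^d \backslash \{0\}} \frac{|f_1(h)|}{|h|} = 0.$$

On the other hand, we have $f_1(0) = 0$, $f_1$ is Lipschitz, and from (12) we see that $D_v f_1(0) = 0$ for every $v \in \mathbf{Q}^d$.

Let $\varepsilon > 0$, and suppose that $h \in \mathbf{R}^d \backslash \{0\}$. Then we can write $h = ru$ where $r := |h|$ and $u := h/|h|$ lies on the unit sphere. This $u$ need not lie in $\mathbf{Q}^d$, but we can approximate it by some vector $v \in \mathbf{Q}^d$ with $|u - v| \leq \varepsilon$. Furthermore, by the total boundedness of the unit sphere, we can make $v$ lie in a finite subset $V_\varepsilon$ of $\mathbf{Q}^d$ that only depends on $\varepsilon$ (and on $d$).

Since $D_v f_1(0) = 0$ for all $v \in V_\varepsilon$, we see (by making $r$ small enough depending on $\varepsilon$) that we have

$$\left| \frac{f_1(rv) - f_1(0)}{r} \right| \leq \varepsilon$$

for all $v \in V_\varepsilon$, and thus

$$|f_1(rv)| \leq \varepsilon r.$$

On the other hand, from the Lipschitz nature of $f_1$, we have

$$|f_1(ru) - f_1(rv)| \leq C r |u - v| \leq C r \varepsilon$$

where $C$ is the Lipschitz constant of $f_1$. As $h = ru$, we conclude that

$$|f_1(h)| \leq (C+1) r \varepsilon.$$

In other words, we have shown that

$$\frac{|f_1(h)|}{|h|} \leq (C+1) \varepsilon$$

whenever $|h|$ is sufficiently small depending on $\varepsilon$. Letting $\varepsilon \to 0$, we obtain the claim.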

Exercise 58 Let $f: \mathbf{R}^d \to \mathbf{R}$ be a locally integrable function with the property that $\int_{\mathbf{R}^d} f(x) g(x)\ dx = 0$ whenever $g$ is a compactly supported, continuously differentiable function. Show that $f$ is zero almost everywhere. (Hint: if not, use the Lebesgue differentiation theorem to find a Lebesgue point $x$ of $f$ for which $f(x) \neq 0$, then pick a $g$ which is supported in a sufficiently small neighbourhood of $x$.)

** — 6. Infinite product spaces and the Kolmogorov extension theorem (optional) — **

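In Section 4 we considered the product of two sets, measurable spaces, or ($\sigma$-finite) measure spaces. We now consider how to generalise this concept to products of more than two such spaces. The axioms of set theory allow us to form a Cartesian product $X_A := \prod_{\alpha \in A} X_\alpha$ of any family $(X_\alpha)_{\alpha \in A}$ of sets indexed by another set $A$, which consists of the space of all tuples $x_A = (x_\alpha)_{\alpha \in A}$ indexed by $A$, for which $x_\alpha \in X_\alpha$ for all $\alpha \in A$. This concept allows for a succinct formulation of the axiom of choice (Axiom 3 from Notes 1), namely that an arbitrary Cartesian product of non-empty sets remains non-empty.

For any $\beta \in A$, we have the coordinate projection maps $\pi_\beta: X_A \to X_\beta$ defined by $\pi_\beta((x_\alpha)_{\alpha \in A}) := x_\beta$. More generally, given any $B \subset A$, we define the partial projections $\pi_B: X_A \to X_B$ to the partial product space $X_B := \prod_{\alpha \in B} X_\alpha$ by $\pi_B((x_\alpha)_{\alpha \in A}) := (x_\alpha)_{\alpha \in B}$. More generally still, given two subsets $C \subset B \subset A$, we have the partial subprojections $\pi_{C \leftarrow B}: X_B \to X_C$ defined by $\pi_{C \leftarrow B}((x_\alpha)_{\alpha \in B}) := (x_\alpha)_{\alpha \in C}$. These partial subprojections obey the composition law $\pi_{D \leftarrow C} \circ \pi_{C \leftarrow B} = \pi_{D \leftarrow B}$ for all $D \subset C \subset B \subset A$ (and thus form a very simple example of a category).

As before, given any $\sigma$-algebra $\mathcal{B}_\beta$ on $X_\beta$, we can pull it back by $\pi_\beta$ to create a $\sigma$-algebra

$$\pi_\beta^*(\mathcal{B}_\beta) := \{ \pi_\beta^{-1}(E_\beta): E_\beta \in \mathcal{B}_\beta \}$$

on $X_A$. One easily verifies that this is indeed a $\sigma$-algebra. Informally, $\pi_\beta^*(\mathcal{B}_\beta)$ describes those sets (or “events”, if one is thinking in probabilistic terms) that depend only on the $x_\beta$ coordinate of the state $x_A$, and whose dependence on $x_\beta$ is $\mathcal{B}_\beta$-measurable. We can then define the product $\sigma$-algebra

$$\prod_{\beta \in A} \mathcal{B}_\beta := \left\langle \bigcup_{\beta \in A} \pi_\beta^*(\mathcal{B}_\beta) \right\rangle.$$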
We have a generalisation of Exercise 27:

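Exercise 59 Let $((X_\alpha, \mathcal{B}_\alpha))_{\alpha \in A}$ be a family of measurable spaces. For any $B \subset A$, write $\mathcal{B}_B := \prod_{\beta \in B} \mathcal{B}_\beta$.

- Show that $\mathcal{B}_A$ is the coarsest $\sigma$-algebra on $X_A$ that makes the projection maps $\pi_\beta$ measurable morphisms for all $\beta \in A$.
- Show that for each $B \subset A$, $\pi_B$ is a measurable morphism from $(X_A, \mathcal{B}_A)$ to $(X_B, \mathcal{B}_B)$.
- If $E \in \mathcal{B}_A$, show that there exists an at most countable set $B \subset A$ and a set $E_B \in \mathcal{B}_B$ such that $E = \pi_B^{-1}(E_B)$. Informally, this asserts that a measurable event can only depend on at most countably many of the coordinates.
- If $f: X_A \to [0,+\infty]$ is $\mathcal{B}_A$-measurable, show that there exists an at most countable set $B \subset A$ and a $\mathcal{B}_B$-measurable function $f_B: X_B \to [0,+\infty]$ such that $f = f_B \circ \pi_B$.
- If $A$ is at most countable, show that $\mathcal{B}_A$ is the $\sigma$-algebra generated by the sets $\prod_{\beta \in A} E_\beta$ with $E_\beta \in \mathcal{B}_\beta$ for all $\beta \in A$.
- On the other hand, show that if $A$ is uncountable and the $\mathcal{B}_\alpha$ are all non-trivial, then $\mathcal{B}_A$ is *not* the $\sigma$-algebra generated by sets $\prod_{\beta \in A} E_\beta$ with $E_\beta \in \mathcal{B}_\beta$ for all $\beta \in A$.
- If $B \subset A$, $E \in \mathcal{B}_A$, and $x_B \in X_B$, show that the set $E_{x_B} := \{ x_{A \backslash B} \in X_{A \backslash B}: (x_B, x_{A \backslash B}) \in E \}$ lies in $\mathcal{B}_{A \backslash B}$, where we identify $X_B \times X_{A \backslash B}$ with $X_A$ in the obvious manner.
- If $B \subset A$, $f: X_A \to [0,+\infty]$ is $\mathcal{B}_A$-measurable, and $x_B \in X_B$, show that the function $f_{x_B}: x_{A \backslash B} \mapsto f(x_B, x_{A \backslash B})$ is $\mathcal{B}_{A \backslash B}$-measurable.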

Now we consider the problem of constructing a measure $\mu_A$ on the product space $X_A$. Any such measure $\mu_A$ will induce *pushforward measures* $\mu_B := (\pi_B)_* \mu_A$ on $X_B$ (introduced in Exercise 36 of Notes 3), thus

$$\mu_B(E_B) := \mu_A(\pi_B^{-1}(E_B))$$

for all $E_B \in \mathcal{B}_B$. These measures obey the compatibility relation

$$(\pi_{C \leftarrow B})_* \mu_B = \mu_C \ \ \ \ \ (13)$$

whenever $C \subset B \subset A$, as can be easily seen by chasing the definitions.

One can then ask whether one can reconstruct $\mu_A$ just from the projections $\mu_B$ to *finite* subsets $B$. This is possible in the important special case when the $\mu_B$ (and hence $\mu_A$) are probability measures, provided one imposes an additional inner regularity hypothesis on the measures $\mu_B$. More precisely:

Definition 60 (Inner regularity) A (metrisable) *inner regular measure space* $(X, \mathcal{B}, \mu, d)$ is a measure space $(X, \mathcal{B}, \mu)$ equipped with a metric $d$ such that

- Every compact set is measurable; and
- One has $\mu(E) = \sup_{K \subset E,\ K \text{ compact}} \mu(K)$ for all measurable $E$.

We say that $\mu$ is *inner regular* if it is associated to an inner regular measure space.

Thus for instance Lebesgue measure is inner regular, as are Dirac measures and counting measures. Indeed, most measures that one actually encounters in applications will be inner regular. For instance, any finite Borel measure on $\mathbf{R}^d$ (or more generally, on a locally compact, $\sigma$-compact space) is inner regular (see Exercise 12 of 245B Notes 12). Inner regularity is one of the axioms of a Radon measure, which we will discuss in more detail in 245B.

Remark 61One can generalise the concept of an inner regular measure space to one which is given by a topology rather than a metric; Kolmogorov’s extension theorem still holds in this more general setting, but requires Tychonoff’s theorem, which we will cover in 245B Notes 10. However, some minimal regularity hypotheses of a topological nature are needed to make the Kolmogorov extension theorem work, although this is usually not a severe restriction in practice.

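Theorem 62 (Kolmogorov extension theorem) Let $((X_\alpha, \mathcal{B}_\alpha), \mathcal{F}_\alpha)_{\alpha \in A}$ be a family of measurable spaces $(X_\alpha, \mathcal{B}_\alpha)$, equipped with a topology $\mathcal{F}_\alpha$. For each finite $B \subset A$, let $\mu_B$ be an inner regular probability measure on $\mathcal{B}_B := \prod_{\alpha \in B} \mathcal{B}_\alpha$ with the product topology $\mathcal{F}_B := \prod_{\alpha \in B} \mathcal{F}_\alpha$, obeying the compatibility condition (13) whenever $C \subset B \subset A$ are two nested finite subsets of $A$. Then there exists a unique probability measure $\mu_A$ on $\mathcal{B}_A$ with the property that $(\pi_B)_* \mu_A = \mu_B$ for all finite $B \subset A$.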

*Proof:* Our main tool here will be the Hahn-Kolmogorov extension theorem for pre-measures (Theorem 14), combined with the Heine-Borel theorem.

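Let $\mathcal{B}_0$ be the set of all subsets of $X_A$ that are of the form $\pi_B^{-1}(E_B)$ for some finite $B \subset A$ and some $E_B \in \mathcal{B}_B$. One easily verifies that this is a Boolean algebra that is contained in $\mathcal{B}_A$. We define a function $\mu_0: \mathcal{B}_0 \to [0,+\infty]$ by setting

$$\mu_0(E) := \mu_B(E_B)$$

whenever $E$ takes the form $\pi_B^{-1}(E_B)$ for some finite $B \subset A$ and $E_B \in \mathcal{B}_B$. Note that a set $E \in \mathcal{B}_0$ may have two different representations $E = \pi_B^{-1}(E_B) = \pi_{B'}^{-1}(E_{B'})$ for some finite $B, B' \subset A$, but then one must have $\pi_{B \leftarrow B \cup B'}^{-1}(E_B) = E_{B \cup B'}$ and $\pi_{B' \leftarrow B \cup B'}^{-1}(E_{B'}) = E_{B \cup B'}$, where $E_{B \cup B'} := \pi_{B \cup B'}(E)$. Applying (13), we see that

$$\mu_B(E_B) = \mu_{B \cup B'}(E_{B \cup B'})$$

and

$$\mu_{B'}(E_{B'}) = \mu_{B \cup B'}(E_{B \cup B'})$$

and thus $\mu_B(E_B) = \mu_{B'}(E_{B'})$. This shows that $\mu_0(E)$ is well defined. As the $\mu_B$ are probability measures, we see that $\mu_0(X_A) = 1$.

It is not difficult to see that $\mu_0$ is finitely additive. We now claim that $\mu_0$ is a pre-measure. In other words, we claim that if $E \in \mathcal{B}_0$ is the disjoint countable union $E = \bigcup_{n=1}^\infty E_n$ of sets $E_n \in \mathcal{B}_0$, then $\mu_0(E) = \sum_{n=1}^\infty \mu_0(E_n)$.

For each $N \geq 1$, let $F_N := E \backslash \bigcup_{n=1}^N E_n$. Then the $F_N$ lie in $\mathcal{B}_0$, are decreasing, and are such that $\bigcap_{N=1}^\infty F_N = \emptyset$. By finite additivity (and the finiteness of $\mu_0$), we see that it suffices to show that $\lim_{N \to \infty} \mu_0(F_N) = 0$.

Suppose this is not the case; then there exists $\varepsilon > 0$ such that $\mu_0(F_N) > \varepsilon$ for all $N$. As each $F_N$ lies in $\mathcal{B}_0$, we have $F_N = \pi_{B_N}^{-1}(G_N)$ for some finite sets $B_N \subset A$ and some $\mathcal{B}_{B_N}$-measurable sets $G_N$. By enlarging each $B_N$ as necessary we may assume that the $B_N$ are increasing in $N$. The decreasing nature of the $F_N$ then gives the inclusions

$$G_{N+1} \subset \pi_{B_N \leftarrow B_{N+1}}^{-1}(G_N).$$

By inner regularity, one can find a compact subset $K_N$ of each $G_N$ such that

$$\mu_{B_N}(K_N) \geq \mu_{B_N}(G_N) - \varepsilon/2^{N+1}.$$

If we then set

$$K'_N := \bigcap_{N'=1}^N \pi_{B_{N'} \leftarrow B_N}^{-1}(K_{N'})$$

then we see that each $K'_N$ is compact and

$$\mu_{B_N}(K'_N) \geq \mu_{B_N}(G_N) - \varepsilon/2 \geq \varepsilon/2.$$

In particular, the sets $K'_N$ are non-empty. By construction, we also have the inclusions

$$K'_{N+1} \subset \pi_{B_N \leftarrow B_{N+1}}^{-1}(K'_N)$$

and thus the sets $H_N := \pi_{B_N}^{-1}(K'_N)$ are decreasing in $N$. On the other hand, since these sets are contained in $F_N$, we have $\bigcap_{N=1}^\infty H_N = \emptyset$.

By the axiom of choice, we can select an element $x_N \in H_N$ from $H_N$ for each $N$. Observe that for any $N_0$, the point $\pi_{B_{N_0}}(x_N)$ will lie in the compact set $K'_{N_0}$ whenever $N \geq N_0$. Applying the Heine-Borel theorem repeatedly, we may thus find a subsequence $x_{N_{1,m}}$ of the $x_N$ for $m = 1,2,\ldots$ such that $\pi_{B_1}(x_{N_{1,m}})$ converges; then we can find a further subsequence $x_{N_{2,m}}$ of that subsequence such that $\pi_{B_2}(x_{N_{2,m}})$ converges, and more generally obtain nested subsequences $x_{N_{j,m}}$ for $m = 1,2,\ldots$ and $j = 1,2,\ldots$ such that for each $j$, the sequence $m \mapsto \pi_{B_j}(x_{N_{j,m}})$ converges.

Now we use the diagonalisation trick. Consider the sequence $x_{N_{m,m}} =: (y_{m,\alpha})_{\alpha \in A}$ for $m = 1,2,\ldots$. By construction, we see that for each $j$, $\pi_{B_j}(x_{N_{m,m}})$ converges to a limit as $m \to \infty$. This implies that for each $\alpha \in \bigcup_{j=1}^\infty B_j$, $y_{m,\alpha}$ converges to a limit $y_\alpha$ as $m \to \infty$. As $K'_j$ is closed, we see that $(y_\alpha)_{\alpha \in B_j} \in K'_j$ for each $j$. If we then extend $y_\alpha$ arbitrarily from $\alpha \in \bigcup_{j=1}^\infty B_j$ to $\alpha \in A$, then the point $y := (y_\alpha)_{\alpha \in A}$ lies in $H_j$ for each $j$. But this contradicts the fact that $\bigcap_{N=1}^\infty H_N = \emptyset$. This contradiction completes the proof that $\mu_0$ is a pre-measure.

If we then let $\mu_A$ be the Hahn-Kolmogorov extension of $\mu_0$, one easily verifies that $\mu_A$ obeys all the required properties, and the uniqueness follows from Exercise 15.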

The Kolmogorov extension theorem is a fundamental tool in the foundations of probability theory, as it allows one to construct a probability space to hold a variety of random processes $(X_t)_{t \in T}$, both in the discrete case (when the set of times $T$ is something like the integers $\mathbf{Z}$) and in the continuous case (when the set of times $T$ is something like $\mathbf{R}$). In particular, it can be used to rigorously construct a process for Brownian motion, known as the Wiener process. We will however not focus on this topic, which can be found in many graduate probability texts. But we will give one common special case of the Kolmogorov extension theorem, which is to construct product probability measures:

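Theorem 63 (Existence of product measures) Let $A$ be an arbitrary set. For each $\alpha \in A$, let $(X_\alpha, \mathcal{B}_\alpha, \mu_\alpha)$ be a probability space in which $X_\alpha$ is a locally compact, $\sigma$-compact metric space, with $\mathcal{B}_\alpha$ being its Borel $\sigma$-algebra (i.e. the $\sigma$-algebra generated by the open sets). Then there exists a unique probability measure $\mu_A = \prod_{\alpha \in A} \mu_\alpha$ on $(X_A, \mathcal{B}_A) := (\prod_{\alpha \in A} X_\alpha, \prod_{\alpha \in A} \mathcal{B}_\alpha)$ with the property that

$$\mu_A\left( \prod_{\alpha \in A} E_\alpha \right) = \prod_{\alpha \in A} \mu_\alpha(E_\alpha)$$

whenever $E_\alpha \in \mathcal{B}_\alpha$ for each $\alpha \in A$, and one has $E_\alpha = X_\alpha$ for all but finitely many of the $\alpha$.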

*Proof:* We apply the Kolmogorov extension theorem to the finite product measures $\mu_B := \prod_{\alpha \in B} \mu_\alpha$ for finite $B \subset A$, which can be constructed using the machinery in Section 4. These are Borel probability measures on a locally compact, $\sigma$-compact space and are thus inner regular by Exercise 12 of 245B Notes 12. The compatibility condition (13) can be verified from the uniqueness properties of finite product measures.

Remark 64 This result can also be obtained from the Riesz representation theorem, which we will cover in 245B Notes 12.

Example 65 (Bernoulli cube) Let $A := \mathbf{N}$, and for each $\alpha \in A$, let $(X_\alpha, \mathcal{B}_\alpha, \mu_\alpha)$ be the two-element set $X_\alpha = \{0,1\}$ with the discrete metric (and thus discrete $\sigma$-algebra) and the uniform probability measure $\mu_\alpha$. Then Theorem 63 gives a probability measure $\mu$ on the infinite discrete cube $X_A := \{0,1\}^{\mathbf{N}}$, known as the (uniform) *Bernoulli measure* on this cube. The coordinate functions $\pi_\alpha: X_A \to \{0,1\}$ can then be interpreted as a countable sequence of random variables taking values in $\{0,1\}$. From the properties of product measure one can easily check that these random variables are uniformly distributed on $\{0,1\}$ and are jointly independent. Informally, Bernoulli measure allows one to model an infinite number of “coin flips”. One can replace the natural numbers here by any other index set, and have a similar construction.
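For the Bernoulli cube, the finite-dimensional measures that feed into the Kolmogorov extension theorem are just uniform measures on finite cubes, and the compatibility condition (13) can be checked directly. The sketch below is a hypothetical encoding chosen for illustration: a finite index set `B` is a Python set of integers, and an event depending only on the coordinates in `B` is a set of 0/1 tuples listed in the order `sorted(B)`.

```python
from fractions import Fraction
from itertools import product


def mu(B, E_B):
    """Uniform probability measure on {0,1}^B of a set E_B of tuples indexed by sorted(B)."""
    return Fraction(len(E_B), 2 ** len(B))


def pullback(C, E_C, B):
    """Preimage of E_C under the coordinate projection {0,1}^B -> {0,1}^C, for C a subset of B."""
    B_sorted, C_sorted = sorted(B), sorted(C)
    positions = [B_sorted.index(c) for c in C_sorted]
    return {
        x
        for x in product((0, 1), repeat=len(B_sorted))
        if tuple(x[i] for i in positions) in E_C
    }


# The event "flip 0 comes up 1", described using the coordinates C = {0, 2}.
C = {0, 2}
E_C = {(1, 0), (1, 1)}

# The same event described using the larger finite coordinate set B = {0, 1, 2, 5}.
B = {0, 1, 2, 5}
E_B = pullback(C, E_C, B)

# Compatibility: pushing mu_B forward to C recovers mu_C on E_C.
print(mu(C, E_C), mu(B, E_B))

# A cylinder event fixing three flips has probability 2^{-3}.
three_heads = mu({0, 1, 2}, {(1, 1, 1)})
```

This is the finite shadow of the well-definedness argument for the pre-measure in the proof of Theorem 62: the value assigned to an event does not depend on which finite set of coordinates is used to describe it.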

Example 66 (Continuous cube) We repeat the previous example, but replace $\{0,1\}$ with the unit interval $[0,1]$ (with the usual metric, the Borel $\sigma$-algebra, and the uniform probability measure). This gives a probability measure on the infinite continuous cube $[0,1]^{\mathbf{N}}$, and the coordinate functions $\pi_\alpha: X_A \to [0,1]$ can now be interpreted as jointly independent random variables, each having the uniform distribution on $[0,1]$.

Example 67 (Independent gaussians) We repeat the previous example, but now replace $[0,1]$ with $\mathbf{R}$ (with the usual metric, and the Borel $\sigma$-algebra), and the normal probability distribution $d\mu_\alpha = \frac{1}{\sqrt{2\pi}} e^{-x^2/2}\ dx$ (thus $\mu_\alpha(E) = \frac{1}{\sqrt{2\pi}} \int_E e^{-x^2/2}\ dx$ for every Borel set $E$). This gives a probability space that supports a countable sequence of jointly independent gaussian random variables $\pi_\alpha$.

## 83 comments


21 April, 2018 at 3:10 am

AnonymousFor lemma 35,

Correct me if I am wrong, but is the second paragraph necessary: “It is also clear… closed under complements. ” As , the proof already shows is closed under complements, .

21 April, 2018 at 7:57 am

Terence TaoYes, one could arrange the argument in this fashion instead if one wished.

3 October, 2018 at 6:00 am

Nathanael SchillingI think there is a small mistake in the proof of the Kolomogorov Extension Theorem (Theorem 62) Here you write:

.

As potentially only has measure , and hence (which is contained in the inverse image under a projection of ) cannot have a larger measure.

[Corrected, thanks – T.]

22 January, 2021 at 6:22 am

AnonymousDear Professor Tao

Can I ask you about the following thing, please?:In (4) of Exercise 59. How can we get the at most countable set {B} and get the function {f_B}?

Now I just know that if want that {f = f_B \circ \pi_B},we will have that

{ (x_\alpha)_{\alpha \in A}} and { (x^’_\alpha)_{\alpha \in A}}are equal under the given function {f} if {x_\alpha = x^’_\alpha} for all {\alpha belong B}.

22 January, 2021 at 1:22 pm

AnonymousIn theorem 18, it seems that there is a typo in the definition of left and right limits of .

[Corrected, thanks – T.]

11 October, 2021 at 9:19 am
