Math/Vol1/README.md
2026-03-20 20:58:59 +00:00


Foundations

Mathematical logic (To add to as needed)

::: epigraph There are no facts, only interpretations.

Friedrich Nietzsche :::

In this section, we will introduce mathematical logic. This will give us the tools and basic building blocks to talk about mathematics formally. What do we mean by 'formally'? Modern mathematics is built on a bedrock of logic: given some statements which we take to be true, or which have already been proven true, what can we logically deduce must also be true, and what must be false? As an example, we are familiar with the idea of positive whole numbers, also called positive integers. We are also familiar with the idea of a positive whole number greater than 1 being prime when the only positive whole numbers that divide it are 1 and itself; for example, 2 is prime. From the facts that the positive whole numbers exist and there is at least one prime, we can logically deduce there must be infinitely many primes. We will see the proof of this later.

In this document we won't need the full tools of mathematical logic, as developing them would take us too far afield. Instead, we will only cover the key fundamentals we need and define some terms which will be used throughout.

Defining a definition

What is a definition? What does it mean to define something? Definitions are at the heart of mathematics; without them we wouldn't be able to do anything at all. A definition is a declaration that gives a formal name to an object, a class of objects, an idea, and so on. For example, we can define prime numbers. Such a definition might look something like this.

::: Def Definition 1. Definition of a prime number

Consider a positive whole number greater than 1. We say that this positive whole number is a prime number if the only positive whole numbers that divide it are itself and the number 1. :::

With this definition, whenever we refer to the idea of a prime number, we know that this prime number has exactly two distinct numbers that divide it, itself and 1. As we will see throughout this document, we can use a definition when making logical arguments. Definitions are the backbone of the setup to logical arguments: if we don't know about the objects we are arguing about, then we can't make any logical deductions or deduce the truth of mathematical statements. Now that we know what a definition is, we can start using it to lay the foundation for the rest of the document. For formality, we will make, somewhat paradoxically, a formal definition of a definition.

::: definition Definition 1. Definition

A definition is a statement which gives a formal name to a concept. :::

What is truth?

What is truth? In particular, what is mathematical truth? Loosely speaking, mathematical truth is based on the question of whether the premises entail the conclusion. That is to say, if we assume that a few statements are true, must the conclusion we are trying to reach also be true? This is rather vague at the moment because we haven't defined what we mean by true.

Logical statements and logical connectives

We will need a few definitions.

::: definition Definition 2. Declarative logical statement

We define a Declarative logical statement to be a statement that is either true or false. Here we are using the intuitive definition of true and false. :::

We need the definition of declarative logical statements in order to define what we mean by true and false. Again, somewhat paradoxically, we need a notion of true and false to say when a declarative logical statement is true. We shall ignore the paradoxical nature of these definitions.

::: definition Definition 3. Assignment of truth

Let P be a declarative logical statement, an assignment of truth is an interpretation of the statement P that sees P as either true or false. We write this as \delta\left(P\right).

If this assignment of truth \delta sees P as true we write \delta\left(P\right)=1 and we say that \delta interprets P as true. If this assignment of truth sees P as false we write \delta\left(P\right)=0 and we say that \delta interprets P as false. :::

These two definitions will allow us to build the foundations that we will need. It is first important to note that an assignment of truth is not an absolute assignment of the truth of a declarative logical statement. Different assignments of truth, and thus different interpretations, can give rise to different values of P being true or false. Now we have the building blocks to build more complex logical statements.

A first natural question is when does the truth of one logical statement imply the truth or falseness of another? Thinking about how this should work gives us a sense that something true should never imply something false, whereas something false can imply anything at all. Using this we define the logical implication operator.

::: definition Definition 4. Logical implication

Let P and Q be logical statements. We define the logical implication of the statements P and Q, written as P\Rightarrow Q, to have the following logical values

  $P$   $Q$   $P\Rightarrow Q$
  ----- ----- ------------------
  *1*   *1*   *1*
  *1*   *0*   *0*
  *0*   *1*   *1*
  *0*   *0*   *1*

: The truth table for the logical implication operator.

We read this as P implies Q, or if P then Q. :::

::: example Example 1. Let P = "The sky is overcast" and let Q = "The sun is not visible". We have by the truth table of logical implication that P\Rightarrow Q is true when

  1. P is true and Q is true

  2. P is false and Q is true

  3. Both P and Q are false.

In words we have P\Rightarrow Q is true in these circumstances

  1. If it is true the sky is overcast then the sun is not visible.

    That is, if the sky is overcast then the sun is not visible

  2. If it is false that the sky is overcast then the sun is not visible.

    That is, if the sky is not overcast then the sun is not visible.

  3. If it is false that the sky is overcast then the sun is visible.

    That is, if the sky is not overcast then the sun is visible.

In particular, case two could be true when it is nighttime: if it is nighttime, the sun is clearly not visible.

Let's look at these statements the other way, Q\Rightarrow P. We have that this is true when

  1. Q is true and P is true

  2. Q is false and P is true

  3. Both Q and P are false.

In words, we have Q\Rightarrow P is true in these circumstances

  1. If the sun is not visible then the sky is overcast

  2. If the sun is visible then the sky is overcast

  3. If the sun is visible then the sky is not overcast :::
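The truth table for logical implication can be reproduced mechanically. The following is a minimal sketch in Python, where the function name `implies` is our own choice and we assume the usual identification of P\Rightarrow Q with "not P, or Q".

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Logical implication: false only when p is true and q is false."""
    return (not p) or q


# One row (P, Q, P => Q) for each assignment of truth, written with 1 and 0.
rows = [(int(p), int(q), int(implies(p, q)))
        for p, q in product([True, False], repeat=2)]

for row in rows:
    print(*row)
```

Each printed row corresponds to one assignment of truth to P and Q, in the same order as the rows of the truth table in the definition.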

There is one definition that arises from logical implication that is occasionally useful in proving other statements.

::: definition Definition 5. Vacuous truth

Let P and Q be statements such that we have P\Rightarrow Q. Suppose that P is false, then by the definition of logical implication we have that P\Rightarrow Q is true. We say that P\Rightarrow Q is vacuously true. :::

::: example Example 2. The statement "All my children are goats" is vacuously true for someone who doesn't have any children. :::

It is often the case we have theorems in mathematics which are of the form P implies Q and Q implies P, that is two separate logical sentences can imply each other. This is the logical bi-conditional.

::: definition Definition 6. Logical Bi-conditional

Let P and Q be logical statements. We define the logical Bi-conditional of the statements P and Q, written P\Leftrightarrow Q, to have the following logical values

  $P$   $Q$   $P\Leftrightarrow Q$
  ----- ----- ----------------------
  *1*   *1*   *1*
  *1*   *0*   *0*
  *0*   *1*   *0*
  *0*   *0*   *1*

: The truth table for the logical Bi-conditional operator.

We read this as P if and only if Q, meaning P implies Q and Q implies P. :::

::: example Example 3. Let P = "A number is even" and let Q = "It is divisible by 2". By the truth table of the logical bi-conditional, P\Leftrightarrow Q is true when

  1. Both P and Q are true.

  2. Both P and Q are false.

That is in words we have P\Leftrightarrow Q when

  1. A number is even if and only if it is divisible by 2

  2. A number is not even if and only if it is not divisible by 2 :::
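The reading of P\Leftrightarrow Q as "P implies Q and Q implies P" can be checked against the truth table by trying every assignment of truth. This is a small sketch, with `implies` and `iff` being names we introduce for illustration.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Logical implication: false only when p is true and q is false."""
    return (not p) or q


def iff(p: bool, q: bool) -> bool:
    """Logical bi-conditional: true exactly when p and q have the same truth value."""
    return p == q


assignments = list(product([True, False], repeat=2))

# Compare the bi-conditional with the conjunction of the two implications.
agrees = all(iff(p, q) == (implies(p, q) and implies(q, p))
             for p, q in assignments)
print(agrees)
```

Because there are only four assignments of truth for two statements, checking them all is a complete verification rather than a sample.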

Now that we have the logical implication and logical bi-conditional, we can start defining more complex logical connectives. These are the logical conjunction, logical disjunction and logical negation.

::: definition Definition 7. Logical conjunction

Suppose we have two logical statements P and Q. We define logical conjunction, written as P\wedge Q, to be true if and only if P and Q are both true, that is to say the logical conjunction connective has the following truth table

  $P$   $Q$   $P\wedge Q$
  ----- ----- -------------
  *1*   *1*   *1*
  *1*   *0*   *0*
  *0*   *1*   *0*
  *0*   *0*   *0*

: The truth table for the logical conjunction operator.

Informally, we call this logical AND rather than logical conjunction. :::

::: example Example 4. Let $P =$"$x > 2$" and $Q =$"$x < 10$" and suppose that P and Q are true, then P\wedge Q is true and represents the expression 2<x<10. :::

::: example Example 5. Let P\wedge Q be the expression "Adam likes apples and oranges". We can break down P\wedge Q into the two separate logical sentences, P = "Adam likes apples" and Q = "Adam likes oranges". :::

::: definition Definition 8. Logical disjunction

Suppose we have two logical statements P and Q. We define logical disjunction, written as P\vee Q, to be true when either one of P and Q are true or both P and Q are true. This is to say the logical disjunction connective has the following truth table

  $P$   $Q$   $P\vee Q$
  ----- ----- -----------
  *1*   *1*   *1*
  *1*   *0*   *1*
  *0*   *1*   *1*
  *0*   *0*   *0*

: The truth table for the logical disjunction operator.

Informally, we call this logical OR rather than logical disjunction. :::
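Logical AND and logical OR correspond directly to the `and` and `or` operators found in most programming languages. As a brief sketch, the following Python snippet tabulates both connectives side by side; the variable name `table` is our own.

```python
from itertools import product

# Each row is (P, Q, P and Q, P or Q) for one assignment of truth.
table = [(p, q, p and q, p or q)
         for p, q in product([True, False], repeat=2)]

for p, q, conj, disj in table:
    print(int(p), int(q), int(conj), int(disj))
```

Note that `or` here is inclusive, matching the definition of logical disjunction: it is true when both statements are true.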

Logical propositions

Now that we have an idea about logical connectives we can consider more complex logical statements, in particular we now start to consider statements whose truth values can depend on a variable.

::: definition Definition 9. Variable

A variable is something that has a value that can change. :::

How can the truth value of a statement change depending on a variable? To answer this we need to introduce the idea of all the possible values that this variable can take.

::: definition Definition 10. Domain of Discourse

We define the Domain of Discourse to be the collection of all values that a variable can take. We will denote the domain of Discourse by \mathbb{D}. :::

::: definition Definition 11. Logical proposition

Let \mathbb{D} be a Domain of Discourse. We define a logical proposition, denoted by P\left(n_1,n_2,\dots ,n_k\right), to be a statement whose truth value depends on the variables n_1,n_2,\dots ,n_k, where each variable takes its values from the domain of discourse. :::

::: example Example 6. Let P\left(n\right) be the proposition denoted by

$$\begin{equation*} P\left(n\right) = n\text{ is an even number} \end{equation*}$$ where the domain of discourse is \mathbb{D}=\mathbb{N}, and \mathbb{N} denotes the positive whole numbers, i.e. 1,2,3,\dots.

We have for even n, say 2,4,6,8,\dots, that P\left(n\right) is true, and for odd n, say 1,3,5,7,\dots, that P\left(n\right) is false. :::

::: example Example 7. Let P\left(n,m\right) be the proposition denoted

$$\begin{equation*} P\left(n,m\right) = n>m \end{equation*}$$ that is, n is greater than m, where the domain of discourse \mathbb{D}=\mathbb{N} is again the positive whole numbers 1,2,3,\dots.

Suppose that n=2 and m=3, then P\left(n,m\right) is false, if n=45 and m=7 then P\left(n,m\right) is true. :::
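A logical proposition behaves much like a function that takes values from the domain of discourse and returns true or false. The sketch below models the two propositions from the examples as Python functions. The names `is_even` and `greater` are our own, and since a program cannot list all of \mathbb{N}, we assume a finite sample 1,\dots,10 of the domain.

```python
def is_even(n: int) -> bool:
    """The proposition P(n): n is an even number."""
    return n % 2 == 0


def greater(n: int, m: int) -> bool:
    """The proposition P(n, m): n > m."""
    return n > m


# A finite sample of the domain of discourse (assumption: 1..10 only).
domain = range(1, 11)

# The values of the sample for which P(n) is true.
evens = [n for n in domain if is_even(n)]

print(evens)
print(greater(2, 3), greater(45, 7))
```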

We see that logical propositions allow us to construct more complex logical statements and are the building blocks for the more complex Mathematical statements that we will be using.

Proof

Logic and truth are two of the cornerstones of Mathematics; the third is proof. Without proof we are unable to verify the truth of any mathematical statement. So what exactly is a proof?

::: definition Definition 12. Mathematical proof

Suppose we have some logical statements which are known or assumed to be true, and suppose we wish to see if some conclusion is true given these assumptions. We define a Mathematical proof to be an argument where we start from these assumptions and at each step logically deduce additional true statements until we have reached the conclusion. In other words, a Mathematical proof answers a simple question: do the assumptions entail this conclusion? :::

This isn't a truly rigorous definition of a mathematical proof, and one can define this rigorously in a course on mathematical logic. To do so here would be too much of a diversion; instead we will just keep in mind that a proof is a series of logical deductions from assumptions to a conclusion. When we have reached the conclusion, we use the special symbol \qed at the end of the proof to show that we are done.

There are many different types of proof that we will invoke throughout the rest of this document.

Direct Proof

The first type of proof we define is direct proof. We define a direct proof as follows.

::: definition Definition 13. Direct Proof

In a direct proof, the conclusion is logically established by using axioms, definitions and previously proven theorems. :::

We will give an example of direct proof.

::: example Example 8. In this example we will break down each step of a direct proof.

Here we will give the definitions we will be using and any assumptions which we will be using in the proof (i.e. previously proven theorems):

  1. We say a number is an integer if it is a whole number, such as -4,-3,54,8,0,2,7 and so on.

  2. We will also assume that adding and multiplying integers works as we were taught in school, for example 5+7=12, 2*14=28 etc.

  3. We say that an integer x is an even integer if it can be written as x=2*m where m is some integer.

We now move to the proof.

Suppose we have two such even integers, say x and y. We will use direct proof to show that x+y must also be even.

Proof:

Suppose we have two even integers x and y. By the definition of an even integer we have that x=2*n and y=2*m for some integers n and m. Now consider x+y, we have

$$\begin{equation*} x+y=2n+2m=2*\left(n+m\right) \end{equation*}$$ Now, n+m is the sum of two integers and so is itself an integer, say k=n+m. Hence we have that x+y=2*k, and so by the definition of an even integer x+y is even. This concludes the proof. $\qed$ :::
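The proof above holds for all even integers. A program can only spot-check finitely many cases, so the following sketch is an illustration of the statement and not a replacement for the proof. The sample range and the particular values n=3 and m=7 are arbitrary choices of ours.

```python
def is_even(x: int) -> bool:
    """An integer is even if it can be written as 2*m for some integer m."""
    return x % 2 == 0


# A finite sample of even integers, built directly from the definition x = 2*m.
evens = [2 * m for m in range(-5, 6)]

# Spot-check: every pairwise sum in the sample is even.
sums_are_even = all(is_even(x + y) for x in evens for y in evens)

# The witness from the proof: if x = 2n and y = 2m then x + y = 2k with k = n + m.
n, m = 3, 7
x, y = 2 * n, 2 * m
k = n + m

print(sums_are_even, x + y == 2 * k)
```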

Proof by contradiction

The second type of proof we define is proof by contradiction. This is a very powerful tool.

::: definition Definition 14. Proof by contradiction

Suppose we have a logical statement P that we wish to prove is true. Suppose that, assuming \neg P is true, we can derive another logical statement Q which is known to be false, or we can derive both Q and \neg Q. Then we must have that \neg P is false and P is true. :::

In other words, proof by contradiction states that if, when making an assumption, we can derive a false statement, then the assumption itself must have been invalid. We can justify proof by contradiction using the following truth table.

  $P$   $\neg P$   $\neg\neg P$   $\neg\neg P\Rightarrow P$
  ----- ---------- -------------- ---------------------------
  1     0          1              1
  0     1          0              1

: The truth table for proof by contradiction.

::: example Example 9. As with the example using direct proof, we will break down each step of a proof by contradiction.

Here we will give the definitions we will be using and any assumptions which we will be using in the proof (i.e. previously proven theorems):

  1. We say a number is a rational number if it is the ratio of two integers a and b where b\neq 0. Examples of rational numbers are \displaystyle \frac{1}{2},\frac{2}{3},-\frac{15}{8} and so on. We say a number is irrational if it is not rational.

  2. We say that a rational number \displaystyle \frac{a}{b} is in simplest form if the only number that divides both a and b is 1.

  3. Any rational number has a simplest form.

  4. We will also assume that adding and multiplying rational numbers works as we were taught in school, that is we have for two rational numbers \displaystyle \frac{a}{b} and \displaystyle \frac{c}{d} that

    $$\begin{equation*} \frac{a}{b}+\frac{c}{d}= \frac{ad+bc}{bd},\ \frac{a}{b}\frac{c}{d}=\frac{ac}{bd} \end{equation*}$$

  5. We say \sqrt{2} is the positive number which satisfies $\sqrt{2}\sqrt{2}=2$.

  6. We assume the definition of an even integer from the previous example

  7. If a*a=a^2 is an even integer, then so is $a$

We now move to the proof.

We will show that \sqrt{2} is an irrational number. This is to say that \sqrt{2} is not the ratio of two whole numbers a and b where \displaystyle \frac{a}{b} is in simplest form.

Proof:

Aiming for a proof by contradiction, suppose that \sqrt{2} is a rational number. As any rational number has a simplest form, we have that \displaystyle \sqrt{2}=\frac{a}{b} for some integers a,b with b\neq 0, where \displaystyle \frac{a}{b} is in simplest form. We have by assumption that \sqrt{2} is the number such that \sqrt{2}*\sqrt{2}=2. Hence we have that

$$\begin{equation*} \sqrt{2}\sqrt{2}=\frac{a}{b}\frac{a}{b}=\frac{a^2}{b^2}=2 \end{equation*}$$ where a^2=a*a and b^2 = b*b. We can multiply the above expression by b^2 on both sides to get

$$\begin{equation*} a^2=2b^2 \end{equation*}$$

By definition of an even integer we have that a^2 is even and so a must be even, that is a=2*k for some integer k. Hence we have that

$$\begin{equation*} a^2=\left(2k\right)^2=4k^2=2b^2 \end{equation*}$$ That is 4*k^2=2*b^2, which implies that b^2=2*k^2, that is b^2 is even and so b must be even. This is a contradiction, as we have that a is even and b is even and so they share a divisor of 2, contradicting the fact we assumed that \displaystyle\sqrt{2}=\frac{a}{b} was in simplest form.

Therefore, \sqrt{2} must be irrational. $\qed$ :::
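The key equation in the proof is a^2=2b^2. As an illustration only, we can search for positive integers a and b satisfying it up to some bound. A finite search cannot prove irrationality, but it shows what the proof rules out. The function name and the bound of 1000 are assumptions of ours.

```python
from math import isqrt


def rational_sqrt2_candidates(max_b: int) -> list:
    """Return all pairs (a, b) with 1 <= b <= max_b and a*a == 2*b*b."""
    found = []
    for b in range(1, max_b + 1):
        # isqrt gives the integer part of the square root, which is the
        # only positive integer that could possibly satisfy a*a == 2*b*b.
        a = isqrt(2 * b * b)
        if a * a == 2 * b * b:
            found.append((a, b))
    return found


print(rational_sqrt2_candidates(1000))
```

The search returns an empty list, in agreement with the proof: no fraction \frac{a}{b} with denominator up to the bound squares to exactly 2.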

Proof by contra-position

Another type of proof that we define is proof by contra-position, sometimes called proof by contra-positive.

::: definition Definition 15. Proof by contra-position

Suppose we have a logical statement P and we wish to show that P implies some other statement Q. We are able to show that P\Rightarrow Q if we can show that \neg Q\Rightarrow \neg P. :::

Proof by contra-position states that proving a statement of the form P\Rightarrow Q is the same as showing that \neg Q\Rightarrow\neg P. It is easier to see this from the truth table.

  $P$   $Q$   $\neg P$   $\neg Q$   $P\Rightarrow Q$   $\neg Q\Rightarrow\neg P$
  ----- ----- ---------- ---------- ------------------ ---------------------------
  1     0     0          1          0                  0
  1     1     0          0          1                  1
  0     0     1          1          1                  1
  0     1     1          0          1                  1

: The truth table for proof by contra-positive.
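The equivalence of P\Rightarrow Q and \neg Q\Rightarrow\neg P can be confirmed by recomputing both columns for every assignment of truth. This is a minimal sketch, with `implies` being a helper name we introduce.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Logical implication: false only when p is true and q is false."""
    return (not p) or q


# Each row is (P, Q, P => Q, not Q => not P).
rows = [(p, q, implies(p, q), implies(not q, not p))
        for p, q in product([True, False], repeat=2)]

# The last two columns agree in every row.
equivalent = all(direct == contra for _, _, direct, contra in rows)
print(equivalent)
```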

Maybe, to make it even clearer, we can use a worded example. Let P denote the statement "It is raining" and Q denote the statement "I wear my coat". Suppose that $P\Rightarrow Q$, that is, "If it is raining then I wear my coat". The contra-positive would be \neg Q\Rightarrow\neg P. In words this would be "If I don't wear my coat then it is not raining".

::: example Example 10. A more mathematical example can be seen now. We will let x be an integer and we will show that if x^2 is even then x is even. We will use proof by contra-position, so we will show that if x is not even then x^2 is not even.

  1. So, x not being even means x is odd. This means that x=2n+1 for some integer n.

  2. Now, we have x^2=\left(2n+1\right)^2=4n^2+4n+1=2\left(2n^2+2n\right)+1.

  3. Hence, we have shown that x^2 is of the form 2k+1 for some integer k.

  4. Therefore x^2 is odd.

Concluding the proof by contra-positive. :::

Sets and mappings

::: epigraph No one shall expel us from the paradise that Cantor has created for us.

David Hilbert :::

Sets

Introduction and basic definitions

We start with the most elementary definition, a Set or, less formally, a collection of 'objects'. This notion of an object is not very rigorous. What do we mean by an object? Do these objects really exist? In what way can one collection of objects differ from another?

These questions are at the foundation of Mathematics, and to justify the notions and hence tools we need would require a significant detour into the realm of Mathematical logic. The interested reader would find so-called Zermelo--Fraenkel set theory to be of interest in formalising the notion of a set; we will give a brief overview at the end of the section. To avoid the trip into Mathematical logic, we will instead define sets with a more 'hands on' approach.

::: definition Definition 16. Naive definition of a Set

A set is a collection of objects. We list the elements surrounded by curly brackets \{ \}. :::

This definition will make sense after we see some examples.

::: example Example 11. Let S=\left\{1,2,3,Dogs,Cats,Apples,Pears\right\}. Then S is a set. :::

::: example Example 12. Let S=\left\{"Foo", \left\{1,2,3,Dogs,Cats,Apples,Pears\right\}, Apples, Pears\right\}. Then S is a set. We note that the set from the previous example is now in this set. :::

It would be useful to talk about a particular object in some set S. For example, we can say that 1 is in the set from Example 11 above. We formalise this idea.

::: definition Definition 17. Element of a set

An object in a set is called an element of the set. :::

::: definition Definition 18. Set membership

Let S be a set and let x be an element of the set S. We say that x is a member of the set S and write x\in S. If y is some object which is not in the set S we write that y\not\in S. :::

::: example Example 13. Let S=\left\{1,2,3,Dogs,Cats,Apples,Pears\right\}. We have that 1\in S and Dogs\in S but we have that Blue\not\in S. :::

The above example shows a few interesting points. Dogs in English is used when we wish to talk about multiple dogs at once, so it would be absurd to deny that Dogs could itself be a set, for example Dogs=\left\{Lassie, Scooby-Doo, Snoopy, Blue\right\}. So we have that

$$\begin{equation*} S=\left\{1,2,3,\left\{Lassie, Scooby-Doo, Snoopy, Blue\right\},Cats,Apples,Pears\right\} \end{equation*}$$

Does this now mean that Blue\in S? The answer is no: Blue is not any one of the objects in S. However, there is an object in S that does contain Blue, namely Dogs. This shows that \in only looks one layer deep into \left\{\dots\right\}.
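The fact that membership only looks one layer deep can be seen with Python's built-in sets. This is a sketch under one assumption worth noting: Python requires the elements of a set to be hashable, so the inner set Dogs is modelled as a `frozenset`, and the remaining objects are modelled as integers and strings.

```python
# The inner set Dogs, as an immutable (hashable) set so it can be an element of S.
dogs = frozenset({"Lassie", "Scooby-Doo", "Snoopy", "Blue"})

# The outer set S, which has Dogs as one of its elements.
S = {1, 2, 3, dogs, "Cats", "Apples", "Pears"}

print(dogs in S)       # Dogs is an element of S
print("Blue" in S)     # Blue is not an element of S
print("Blue" in dogs)  # Blue is an element of Dogs
```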

One might wonder if it can ever be the case that a set contains itself, that is, a set like S=\left\{S\right\}. Again the answer is no. To see why, we need to define a new way of making sets, where the elements of the set are conditioned on some statement being true.

::: example Example 14. Suppose we want the set of all even integers then we have

$$\begin{equation*} S=\left\{x : x\text{ is an even integer}\right\} \end{equation*}$$ The $:$ symbol stands for "such that", so S reads "the elements x such that x is an even integer". :::
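Set-builder notation has a close analogue in Python's set comprehensions, where `if` plays the role of "such that". Since a program can only build finite sets, the sketch below assumes we restrict attention to the integers from -10 to 10.

```python
# The even integers x such that -10 <= x <= 10 (a finite stand-in for all even integers).
S = {x for x in range(-10, 11) if x % 2 == 0}

print(sorted(S))
```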

Returning to the question of whether a set can contain itself, consider the set

$$\begin{equation*} S=\left\{R: R\text{ is a set and }R\not\in R\right\} \end{equation*}$$ That is, S is the set of all sets R such that R does not contain itself. Now suppose that S\in S. By definition of S we must conclude that S\not\in S. Conversely, if S\not\in S then by definition of S we have that S\in S. Either way we reach a contradiction. This shows the flavour of the issues that arise from allowing a set to contain itself, so we shall revise our definition to not allow a set to contain itself.

::: definition Definition 19. Set

A set is a collection of objects such that none of the objects in the collection is the set itself. :::

Subsets and universal quantifiers

Given a set, we can talk about a smaller collection of the elements of the set, which we call a subset.

::: definition Definition 20. Subset

Let S be a set. If K is also a set such that for every x\in K we also have that x\in S then we say that K is a subset of S, and write K\subseteq S. We say that K is a proper subset of S if we have that K\subseteq S and K\neq S, and we write K\subset S. Hence \subseteq allows for the possibility that K=S, whereas \subset does not. We call \subseteq and \subset the set inclusion operators. :::

Conversely, we can also define the notion of a super-set. This isn't too useful for what we are doing, but it does sometimes appear in other texts, so it is worth mentioning now.

::: definition Definition 21. Super-set

Let S\subseteq T. We say that T is a super-set of the set S and we write this as T\supseteq S. :::

::: example Example 15. Let S=\left\{1,2,3,4,5,6\right\} then some subsets of S are \left\{1,2\right\}, \left\{4\right\} and $\left\{1,2,6\right\}$. :::
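Python's sets support the set inclusion operators directly: `<=` corresponds to \subseteq and `<` corresponds to \subset. The following short sketch checks the subsets from the example and the difference between the two operators.

```python
S = {1, 2, 3, 4, 5, 6}

# Each of the example subsets is contained in S.
examples = [{1, 2}, {4}, {1, 2, 6}]
all_subsets = all(K <= S for K in examples)

# S is a subset of itself, but not a proper subset of itself.
print(all_subsets, S <= S, S < S)
```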

With the idea of a subset we have our first proposition.

::: {#prop:TwosetsEqualIfContainedInEachOther .proposition} Proposition 1. Two sets are equal if and only if they are subsets of each other

Let X and Y be sets. We have that X=Y if and only if X\subseteq Y and Y\subseteq X.

Proof:

This is an if and only if proposition so we have to prove that given X=Y then X\subseteq Y and Y\subseteq X and then we need to show that given X\subseteq Y and Y\subseteq X, that X=Y.

\left(\Rightarrow\right): Suppose that X=Y then we have that X and Y have the same elements, in particular we have that every x\in X is also in Y so that X\subseteq Y. Likewise Y\subseteq X.

\left(\Leftarrow\right): Suppose that X\subseteq Y and Y\subseteq X. X\subseteq Y means that for every x\in X we have that x\in Y. Likewise Y\subseteq X means that for every x\in Y we have that x\in X. Hence we must have that the elements of X and Y are the same, that is X=Y. $\qed$ :::

There is also another property of subsets that is useful.

::: {#prop:SetInclusionTransitivityProp .proposition} Proposition 2. Set inclusion transitivity property

Let R,S and T be sets such that R\subseteq S and S\subseteq T. We have that $R\subseteq T$.

Proof:

Let R,S and T be sets such that R\subseteq S and S\subseteq T. Suppose that x\in R. By assumption we have that R\subseteq S and so x\in S. Likewise by assumption we have that S\subseteq T and so x\in T. Hence R\subseteq T.

The result follows. $\qed$ :::

A similar result holds if we replace subsets with proper subsets.

::: {#prop:ProperSetInclusionTransitivityProp .proposition} Proposition 3. Proper set inclusion transitivity property

Let R,S and T be sets such that R\subset S and S\subset T. We have that $R\subset T$.

Proof:

Let R,S and T be sets such that R\subset S and S\subset T. Suppose that x\in R. By assumption we have that R\subset S and so x\in S. Likewise by assumption we have that S\subset T and so x\in T. Hence R\subseteq T.

We must show that it is not possible for R=T. As R\subset S we have by definition that R\subseteq S and R\neq S, so there is some s\in S with s\not\in R. As S\subset T we have that s\in T. Hence s is an element of T which is not an element of R, so R\neq T and we conclude that R\subset T.

The result follows. $\qed$ :::

We can also make the following observation.

::: {#prop:ProperSetSubSetInclusionNotTransitivity .proposition} Proposition 4. Subset inclusion followed by proper set inclusion gives proper set inclusion

Let R,S and T be sets such that R\subseteq S and S\subset T. We have that $R\subset T$.

Proof:

Let R,S and T be sets such that R\subseteq S and S\subset T.

If R\neq S then R\subset S and so proposition [3](#prop:ProperSetInclusionTransitivityProp){reference-type="ref" reference="prop:ProperSetInclusionTransitivityProp"} applies. So suppose that R=S. Then, as S\subset T, we immediately have that R\subset T.

The result follows. $\qed$ :::

We will define what we truly mean by transitivity in the next chapter, right now it is more important to know that sets satisfy this property than why this property is named the way it is. As set inclusion is transitive, so is set equality.

::: proposition Proposition 5. Set equality transitivity property

Let R,S and T be sets such that R=S and S=T. We have that R=T.

Proof:

Let R,S and T be sets such that R=S and S=T. By equality of sets we have that R\subseteq S and S\subseteq R, likewise we also have that S\subseteq T and T\subseteq S. Now as R\subseteq S and S\subseteq T we must have by transitivity of set inclusion that R\subseteq T. Moreover as T\subseteq S and S\subseteq R we again have by transitivity that T\subseteq R. The result follows by equality of sets. $\qed$ :::

::: definition Definition 22. The empty-set

The empty-set is the set that contains no elements. It is denoted by \emptyset. :::

To make our lives a little easier we will introduce some notation.

::: definition Definition 23. Universal and existential quantifiers

Let S be any set. The universal quantifier \forall, meaning for all, allows us to talk about every element of S. We can condition the universal quantifier with a "such that", written $:$, in order to pick all the elements that satisfy a given condition.

The existential quantifier \exists tells us of the existence of an element in S. Just saying an element in a set exists is not particularly useful and so we normally combine \exists with a condition. :::

Some examples will help us here.

::: example Example 16. Consider the set \left\{1,2,3,4,5,\dots\right\}=\mathbb{N}, we call \mathbb{N} the natural numbers. Moreover, consider $S=\left\{1,2,3,4,5,6\right\}$.

  1. We have that \forall x\in S that x\in\mathbb{N}, that is every element of S is also an element of \mathbb{N}.

  2. We can apply the universal quantifier multiple times in a statement, for example

    $$\begin{equation*} \forall a\in\mathbb{N},\forall b\in\mathbb{N},\exists c\in\mathbb{N}:a+b=c \end{equation*}$$

  3. Let a,b\in\mathbb{N}, that is let a\in\mathbb{N} and let b\in\mathbb{N}. We say that a is divisible by b if \exists c\in\mathbb{N} such that a=bc, and we write this as b\mid a. We can then construct the set of all such c, which can be expressed by

    $$\begin{equation*} C=\left\{c\in\mathbb{N}:a=bc\right\} \end{equation*}$$ :::
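The quantifiers \forall and \exists correspond to Python's built-in `all` and `any` when the collection being quantified over is finite. The sketch below is written under that assumption. The helper names `in_naturals`, `divides` and `quotient_set` are our own, and the search for c is limited to 1,\dots,a, which is enough because a=bc with b\geq 1 forces c\leq a.

```python
def in_naturals(x) -> bool:
    """Membership test for the natural numbers 1, 2, 3, ..."""
    return isinstance(x, int) and x >= 1


S = {1, 2, 3, 4, 5, 6}

# "For all x in S, x is in N."
every_element_natural = all(in_naturals(x) for x in S)


def divides(b: int, a: int) -> bool:
    """b | a: there exists c in N such that a = b*c."""
    return any(a == b * c for c in range(1, a + 1))


def quotient_set(a: int, b: int) -> set:
    """The set C = {c in N : a = b*c}."""
    return {c for c in range(1, a + 1) if a == b * c}


print(every_element_natural, divides(3, 12), quotient_set(12, 3))
```

When b does not divide a, the set C is empty, for instance `quotient_set(7, 2)` is the empty set.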

The empty set has the interesting property that it is a subset of any set.

::: {#prop:EmptySetincontainedineveryset .proposition} Proposition 6. The empty-set is contained in every set

Let S be any set. Then $\emptyset\subseteq S$.

Proof:

We have that \emptyset\subseteq S means that every element of \emptyset is also contained in S. The definition of the empty set means that there are no elements in \emptyset. We can phrase this as the following statement

$$\begin{equation*} \forall x: x\in\emptyset\Rightarrow x\in S \end{equation*}$$ But x\in\emptyset is not true for any x, so

$$\begin{equation*} \forall x: x\in\emptyset\Rightarrow x\in S \end{equation*}$$

is vacuously true. It hence follows that the empty-set is contained in any set. $\qed$ :::
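Vacuous truth also appears in programming: Python's `all` returns true when it is given nothing to check. The following sketch illustrates this for the empty set, with S being an arbitrary example set of our choosing.

```python
S = {1, 2, 3}
empty = set()

# "For all x in the empty set, x is in S" has no cases to check, so it is true.
vacuous = all(x in S for x in empty)

# Python's subset operator agrees: the empty set is a subset of S.
print(vacuous, empty <= S)
```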

::: {#prop:EmptySetUnique .proposition} Proposition 7. The empty-set is unique

The empty-set is unique, that is there is only one distinct set which is the empty-set.

Proof:

Suppose that \emptyset and \emptyset' are two empty sets. By proposition [6](#prop:EmptySetincontainedineveryset){reference-type="ref" reference="prop:EmptySetincontainedineveryset"} we have that \emptyset\subseteq\emptyset', likewise \emptyset'\subseteq\emptyset. So by proposition [1](#prop:TwosetsEqualIfContainedInEachOther){reference-type="ref" reference="prop:TwosetsEqualIfContainedInEachOther"} we have that \emptyset=\emptyset'. Hence the empty-set is unique. $\qed$ :::

It would be nice to have more ways to construct sets. Two key ways to do this are with the union operation and intersection operation.

::: definition Definition 24. Union and intersection of sets

Let S and T be any two sets. We define the union of S and T, denoted by S\cup T, is the set

$$\begin{equation} S\cup T=\left{x: x\in S\text{ or } x\in T\right} \end{equation*}$$*

The intersection of S and T, denoted by S\cap T, is the set

$$\begin{equation} S\cap T = \left{x : x\in S\text{ and } x\in T\right} \end{equation*}$$*

If we have a finite number of sets, given by A_1, A_2, \dots, A_n then the union of all of these sets is denoted by

$$\begin{align*} \bigcup_{i=1}^n A_i \end{align*}$$

and the intersection is denoted by

$$\begin{align*} \bigcap_{i=1}^n A_i \end{align*}$$

Sometimes it is useful to take the union or intersection of all sets A satisfying one or more conditions, usually conditions involving other previously defined sets. The union is then denoted by

$$\begin{equation*} \bigcup_{\substack{\text{Condition 1 for } A \\ \text{Condition 2 for } A\\ \dots}} A \end{equation*}$$

and the intersection by

$$\begin{equation*} \bigcap_{\substack{\text{Condition 1 for } A \\ \text{Condition 2 for } A\\ \dots}} A \end{equation*}$$ :::

::: example Example 17. Let S=\left\{1,2,3,4,5,6\right\} and let T=\left\{2,4,5,6,7,8\right\}, we have that

$$\begin{align*} S\cup T &=\left\{1,2,3,4,5,6\right\}\cup \left\{2,4,5,6,7,8\right\}=\left\{1,2,3,4,5,6,2,4,5,6,7,8\right\}=\left\{1,2,3,4,5,6,7,8\right\}\\ S\cap T &=\left\{1,2,3,4,5,6\right\}\cap \left\{2,4,5,6,7,8\right\}=\left\{2,4,5,6\right\} \end{align*}$$

We note that the intermediate step for the union lists some elements more than once, for example there are two $2$'s. Repeated elements in a set are considered to be the same element so we don't write them, i.e. $\left\{2,2\right\}=\left\{2\right\}$ :::
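As an informal check, Example 17 can be reproduced with Python's built-in `set` type. This is only a sketch: the variable names are chosen here for illustration, and a Python `set` models finite collections of hashable values rather than sets in general.

```python
# Sets from Example 17, modelled with Python's built-in set type.
S = {1, 2, 3, 4, 5, 6}
T = {2, 4, 5, 6, 7, 8}

union = S | T          # elements that are in S or in T
intersection = S & T   # elements that are in S and in T

print(sorted(union))
print(sorted(intersection))

# Repeated elements collapse to a single element, mirroring {2, 2} = {2}.
print({2, 2} == {2})
```

The operators `|` and `&` are Python's notation for the union and the intersection of two sets.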

::: example Example 18. Let A_1=\left\{1,2,3\right\}, A_2=\left\{1,2,7,9\right\} and A_3=\left\{1,4,8,12\right\}. We have that the union of these sets is given by

$$\begin{align*} \bigcup_{i=1}^3 A_i&=A_1\cup A_2\cup A_3\\ &=\left\{1,2,3\right\}\cup \left\{1,2,7,9\right\}\cup \left\{1,4,8,12\right\}\\ &=\left\{1,2,3,4,7,8,9,12\right\} \end{align*}$$

The intersection of these sets is given by

$$\begin{align*} \bigcap_{i=1}^3 A_i&=A_1\cap A_2\cap A_3\\ &=\left\{1,2,3\right\}\cap \left\{1,2,7,9\right\}\cap \left\{1,4,8,12\right\}\\ &=\left\{1\right\} \end{align*}$$ :::
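The union and intersection of finitely many sets can be sketched in Python by folding the two-set operations over a list. The helper names `big_union` and `big_intersection` are hypothetical and chosen here for illustration. The intersection helper assumes a non-empty family, since this section does not define the intersection of no sets.

```python
from functools import reduce

def big_union(sets):
    """Union of a finite family of sets, starting from the empty set."""
    return reduce(lambda acc, s: acc | s, sets, set())

def big_intersection(sets):
    """Intersection of a finite, non-empty family of sets."""
    sets = list(sets)
    if not sets:
        raise ValueError("the family of sets must be non-empty")
    return reduce(lambda acc, s: acc & s, sets[1:], set(sets[0]))

# The three sets from Example 18.
family = [{1, 2, 3}, {1, 2, 7, 9}, {1, 4, 8, 12}]

print(sorted(big_union(family)))
print(sorted(big_intersection(family)))
```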

We make one useful definition about intersections

::: definition Definition 25. Disjoint sets

Let X and Y be sets. If we have that X\cap Y =\emptyset then we say that X and Y are disjoint sets. :::
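A minimal sketch of the disjointness test for finite sets follows. The function name `are_disjoint` is an assumption made for illustration. Python's own `set.isdisjoint` method performs the same test and is used only as a comparison.

```python
def are_disjoint(X, Y):
    """Return True when X and Y have no element in common, i.e. their intersection is empty."""
    return len(X & Y) == 0

print(are_disjoint({1, 2}, {3, 4}))
print(are_disjoint({1, 2}, {2, 3}))

# The built-in method gives the same answer on these inputs.
print({1, 2}.isdisjoint({3, 4}))
```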

Operations on sets
The union, the intersection and set inclusion

Before we continue we introduce three new ideas that will play a role throughout the rest of this paper.

::: definition Definition 26. Operation

An operation \circ acts on some inputs to produce an output or some outputs. :::

::: example Example 19. The union \cup and intersection \cap are examples of operations. These operators operate on two sets to produce a third. :::

::: definition Definition 27. Commutative operation

Let \circ be an operation that accepts two inputs, i.e. we have A\circ B for valid inputs A and B. We say that \circ is commutative if and only if $A\circ B = B\circ A$ for all valid inputs A and B. :::

::: example Example 20. Consider \mathbb{N}=\left\{1,2,3,4,5,\dots\right\}. We are familiar with the idea of addition of positive numbers, say 1+2=3. It is clear that the addition operation is commutative for \mathbb{N}, e.g. $1+2=3=2+1$ :::

::: definition Definition 28. Associative operation

Let \circ be an operation that accepts two inputs, i.e. we have A\circ B for valid inputs A and B. We say that \circ is associative if and only if \left(A\circ B\right)\circ C = A\circ\left(B\circ C\right) for all valid inputs A, B and C, where the operation in the brackets is computed first. :::

::: example Example 21. Again consider \mathbb{N}=\left\{1,2,3,4,5,\dots\right\}. The addition operator for \mathbb{N} is associative, e.g. $\left(1+2\right)+3=3+3=6=1+5=1+\left(2+3\right)$ :::

We note that we have not yet given a rigorous definition of addition; doing so requires mappings, which we consider later.
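The definitions of commutative and associative operations can be explored on a finite sample of inputs. The sketch below uses hypothetical helper names and Python's `operator` module. A finite check like this can refute commutativity or associativity by finding a counterexample, but it cannot prove either property for all of \mathbb{N}.

```python
from itertools import product
import operator

def is_commutative(op, sample):
    """Check a o b == b o a for every pair drawn from the sample."""
    return all(op(a, b) == op(b, a) for a, b in product(sample, repeat=2))

def is_associative(op, sample):
    """Check (a o b) o c == a o (b o c) for every triple drawn from the sample."""
    return all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(sample, repeat=3)
    )

sample = range(1, 8)  # a small sample of positive whole numbers

print(is_commutative(operator.add, sample), is_associative(operator.add, sample))
print(is_commutative(operator.sub, sample), is_associative(operator.sub, sample))
```

Subtraction is included as a contrast: on this sample it fails both checks, for instance 1-2 and 2-1 differ.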

We have the following proposition about the properties of intersections, unions and set inclusions.

::: {#prop:PropertiesOfUnionIntersectionSetinclusion .proposition} Proposition 8. Properties of intersection, union and set inclusion

Let A,B,C be sets. Then we have that the following properties are true

  1. $A\cap B = B\cap A$

  2. $A\cup B = B\cup A$

  3. $A\cap B\subseteq A$

  4. $A\subseteq A\cup B$

  5. $A\subseteq B \Rightarrow A\cap B = A$

  6. $A\subseteq B\Rightarrow A\cup B =B$

  7. $A\subseteq B \Rightarrow A\cap C \subseteq B\cap C$

  8. $A\subseteq B \Rightarrow A\cup C \subseteq B\cup C$

  9. $A\cap A = A$

  10. $A\cup A =A$

  11. $A\cap\left(B\cap C\right)=\left(A\cap B\right)\cap C$

  12. $A\cup\left(B\cup C\right)=\left(A\cup B\right)\cup C$

  13. $A\cap\left(B\cup C\right)=\left(A\cap B\right)\cup\left(A\cap C\right)$

  14. $A\cup\left(B\cap C\right)= \left(A\cup B\right) \cap \left(A\cup C\right)$

Proof:

  1. A\cap B = B\cap A:

    Let x\in A\cap B then x\in A and x\in B by the definition of the intersection. It is hence clear that x\in B\cap A. So we have A\cap B\subseteq B\cap A. Likewise if x\in B\cap A then x\in B and x\in A, so that x\in A\cap B. So B\cap A\subseteq A\cap B. It hence follows by proposition 1{reference-type="ref" reference="prop:TwosetsEqualIfContainedInEachOther"} that A\cap B = B\cap A.

  2. A\cup B = B\cup A:

    Let x\in A\cup B then x\in A or x\in B by the definition of the union. We hence have that x\in B\cup A. So we have A\cup B\subseteq B\cup A. Likewise if x\in B\cup A then x\in B or x\in A, so that x\in A\cup B. So B\cup A\subseteq A\cup B. It hence follows by proposition 1{reference-type="ref" reference="prop:TwosetsEqualIfContainedInEachOther"} that A\cup B = B\cup A.

  3. A\cap B\subseteq A:

    Let x\in A\cap B, then by the definition of the intersection x\in A and x\in B. Hence x\in A\cap B means that x\in A so that A\cap B\subseteq A.

  4. A\subseteq A\cup B:

    Let x\in A. By the definition of the union of two sets we have that y\in A\cup B if and only if y\in A or y\in B. Hence it follows that x\in A\cup B, so that $A\subseteq A\cup B$

  5. A\subseteq B \Rightarrow A\cap B = A:

    Let A\subseteq B and suppose that x\in A, then we have that x\in B as A\subseteq B. Hence x\in A\cap B. This holds for any choice of x\in A, so A\subseteq A\cap B. By property 3 we also have A\cap B\subseteq A. We conclude that if A\subseteq B then $A\cap B = A$

  6. A\subseteq B\Rightarrow A\cup B =B:

    Let A\subseteq B and let x\in A\cup B, then x\in A or x\in B. If x\in A then x\in B as A\subseteq B, so in either case x\in B and A\cup B \subseteq B. Conversely, by properties 4 and 2 we have B\subseteq B\cup A = A\cup B. Hence A\cup B = B.

  7. A\subseteq B \Rightarrow A\cap C \subseteq B\cap C:

    Suppose that A\subseteq B and let x\in A\cap C, then by definition x\in A and x\in C. Also we have that as A\subseteq B that x\in A gives x\in B. Hence x\in B\cap C. It follows that A\cap C\subseteq B\cap C.

  8. A\subseteq B \Rightarrow A\cup C \subseteq B\cup C:

    Suppose A\subseteq B and let x\in A\cup C. We have that x\in A or x\in C. If x\in A then as A\subseteq B we have that x\in B so that x\in B\cup C. If x\in C then clearly x\in B\cup C. Either way we have that A\cup C\subseteq B\cup C.

  9. A\cap A = A:

    Let x\in A, then by the definition of the intersection we have that y\in A\cap A if and only if y\in A and y\in A, hence x\in A\cap A, so that A\subseteq A\cap A. Now if x\in A\cap A we have by the definition of the intersection of two sets that x\in A and x\in A, so in particular x\in A. So A\cap A\subseteq A. Hence A\cap A = A.

  10. A\cup A =A:

    Let x\in A, then by the definition of the union of two sets, we have that y\in A\cup A if and only if y\in A or y\in A, hence x\in A\cup A so that A\subseteq A\cup A. Now suppose that x\in A\cup A, then again by the definition of the union we have that x\in A so that A\cup A\subseteq A. Hence A=A\cup A.

  11. A\cap\left(B\cap C\right)=\left(A\cap B\right)\cap C:

    Let A,B and C be sets. Consider A\cap\left(B\cap C\right), we have that x\in A\cap\left(B\cap C\right) means that x\in A and x\in B\cap C, likewise x\in B\cap C means that x\in B and x\in C. As x\in A, x\in B and x\in C, we have that x\in A\cap B and x\in C. Finally we have that x\in\left(A\cap B\right)\cap C so that A\cap\left(B\cap C\right)\subseteq \left(A\cap B\right)\cap C.

    Now consider \left(A\cap B\right)\cap C, if x\in\left(A\cap B\right)\cap C then x\in A\cap B and x\in C, also x\in A\cap B means that x\in A and x\in B. As x\in A, x\in B and x\in C, we have that x\in A and x\in B\cap C so that x\in A\cap\left(B\cap C\right). Hence \left(A\cap B\right)\cap C\subseteq A\cap\left(B\cap C\right).

    Hence $A\cap\left(B\cap C\right)=\left(A\cap B\right)\cap C$

  12. A\cup\left(B\cup C\right)=\left(A\cup B\right)\cup C:

    Let A,B and C be sets. Consider A\cup\left(B\cup C\right) and let x\in A\cup\left(B\cup C\right), we have that either x\in A or x\in\left(B\cup C\right). If x\in A then we have that x\in A\cup B so that x\in\left(A\cup B\right)\cup C. If x\in B\cup C then either x\in B or x\in C. If x\in B then x\in A\cup B so that x\in \left(A\cup B\right)\cup C. Otherwise x\in C and we have that x\in \left(A\cup B\right)\cup C. Hence we have that $A\cup\left(B\cup C\right)\subseteq\left(A\cup B\right)\cup C$

    Conversely let x\in\left(A\cup B\right)\cup C. We have that either x\in\left(A\cup B\right) or x\in C. If x\in\left(A\cup B\right) then either x\in A or x\in B, in either case we have that x\in A\cup\left(B\cup C\right). If x\in C then x\in A\cup\left(B\cup C\right). So that \left(A\cup B\right)\cup C\subseteq A\cup\left(B\cup C\right).

    Hence $A\cup\left(B\cup C\right)=\left(A\cup B\right)\cup C$

  13. A\cap\left(B\cup C\right)=\left(A\cap B\right)\cup\left(A\cap C\right):

    Let x\in A\cap\left(B\cup C\right), then we have that x\in A and x\in B\cup C. We have x\in B\cup C gives us that x\in B or x\in C. If x\in B then x\in A\cap B and so x\in\left(A\cap B\right)\cup\left(A\cap C\right). Likewise if x\in C then x\in A\cap C so x\in \left(A\cap B\right)\cup\left(A\cap C\right). Hence A\cap\left(B\cup C\right)\subseteq\left(A\cap B\right)\cup\left(A\cap C\right).

    For the opposite inclusion, let x\in\left(A\cap B\right)\cup\left(A\cap C\right) then we have that either x\in A\cap B or x\in A\cap C. If x\in A\cap B then x\in A and x\in B, so we hence have that x\in B\cup C so that x\in A\cap\left(B\cup C\right). Likewise if we have x\in A\cap C then x\in A and x\in C, so x\in B\cup C and x\in A\cap\left(B\cup C\right). Hence $\left(A\cap B\right)\cup\left(A\cap C\right)\subseteq A\cap\left(B\cup C\right)$

    So A\cap\left(B\cup C\right)=\left(A\cap B\right)\cup\left(A\cap C\right).

  14. A\cup\left(B\cap C\right)= \left(A\cup B\right) \cap \left(A\cup C\right):

    Let x\in A\cup\left(B\cap C\right) then either x\in A or x\in B\cap C. If x\in A then x\in A\cup B and x\in A\cup C, which is to say x\in\left(A\cup B\right)\cap\left(A\cup C\right). If x\in B\cap C then x\in B and x\in C, so it follows that x\in A\cup B and x\in A\cup C which is to say x\in\left(A\cup B\right)\cap\left(A\cup C\right). Hence A\cup\left(B\cap C\right)\subseteq \left(A\cup B\right) \cap \left(A\cup C\right).

    Now, suppose that x\in\left(A\cup B\right) \cap \left(A\cup C\right). We then have that x\in A\cup B and x\in A\cup C. Now x\in A\cup B gives x\in A or x\in B, also x\in A\cup C means that x\in A or x\in C. This gives us two possible outcomes. If x\in A then x\in A\cup\left(B\cap C\right). Suppose instead that x\not\in A, then we must have that x\in B and x\in C as x\in A\cup B and x\in A\cup C. Hence x\in B\cap C so x\in A\cup\left(B\cap C\right). In either case x\in A\cup\left(B\cap C\right), hence \left(A\cup B\right) \cap \left(A\cup C\right)\subseteq A\cup\left(B\cap C\right).

    So we have that A\cup\left(B\cap C\right)= \left(A\cup B\right) \cap \left(A\cup C\right).

The proposition now follows. $\qed$ :::
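The fourteen properties can be sanity-checked by brute force over every triple of subsets of a small universe. The sketch below is an illustration only and not a substitute for the proof. The helper names `powerset`, `implies` and `check_proposition_8` are chosen here, and `frozenset` is used so that subsets are immutable.

```python
from itertools import combinations, product

def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def implies(p, q):
    """Truth value of the implication p => q."""
    return (not p) or q

def check_proposition_8(universe):
    """Check properties 1 to 14 for every triple (A, B, C) of subsets."""
    subsets = powerset(universe)
    for A, B, C in product(subsets, repeat=3):
        checks = [
            (A & B) == (B & A),                         # 1
            (A | B) == (B | A),                         # 2
            (A & B) <= A,                               # 3
            A <= (A | B),                               # 4
            implies(A <= B, (A & B) == A),              # 5
            implies(A <= B, (A | B) == B),              # 6
            implies(A <= B, (A & C) <= (B & C)),        # 7
            implies(A <= B, (A | C) <= (B | C)),        # 8
            (A & A) == A,                               # 9
            (A | A) == A,                               # 10
            (A & (B & C)) == ((A & B) & C),             # 11
            (A | (B | C)) == ((A | B) | C),             # 12
            (A & (B | C)) == ((A & B) | (A & C)),       # 13
            (A | (B & C)) == ((A | B) & (A | C)),       # 14
        ]
        if not all(checks):
            return False
    return True

print(check_proposition_8({1, 2, 3}))
```

In Python, `<=` between sets tests the subset relation \subseteq.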

::: {#thm:EquivSubsetIntUnion .theorem} Theorem 1. Equivalence of Subsets with union and intersection

Let A,B be sets. The following are equivalent

  1. $A\subseteq B$

  2. $A\cap B = A$

  3. $A\cup B =B$

Proof:

Suppose A\subseteq B. By proposition 8{reference-type="ref" reference="prop:PropertiesOfUnionIntersectionSetinclusion"} we have that

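$$\begin{equation*} A=A\cap A \subseteq A\cap B\subseteq A \end{equation*}$$

Hence A=A\cap B.

Now suppose that A\cap B = A. By proposition 8{reference-type="ref" reference="prop:PropertiesOfUnionIntersectionSetinclusion"} we have that A\cap B = B\cap A\subseteq B, so that A=A\cap B\subseteq B. This shows 1 and 2 are equivalent.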

Suppose A\subseteq B. If x\in B then x\in A\cup B by the definition of the union, so that B\subseteq A\cup B. Now suppose that x\in A\cup B, then either x\in A or x\in B. If x\in B we are done. If x\in A then as A\subseteq B we have that x\in B. In either case x\in B, so that A\cup B\subseteq B.

Hence A\cup B = B.

Now suppose that A\cup B = B. Suppose that x\in A then x\in A\cup B =B so x\in B, hence A\subseteq B.

This shows the equivalence of 1 and 3.

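The equivalence of 2 and 3 now follows. Indeed, suppose that A\cap B =A then by the equivalence of 1 and 2 we know that A\subseteq B, so by the equivalence of 1 and 3 we know that A\cup B = B. Conversely, if A\cup B = B then A\subseteq B by the equivalence of 1 and 3, so A\cap B = A by the equivalence of 1 and 2. $\qed$ :::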
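As a quick illustration of the theorem, the three statements can be evaluated for every pair of subsets of a small universe and compared. This is a sketch with hypothetical helper names, and it only tests finitely many cases.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def theorem_1_holds(universe):
    """Check that the three statements agree for every pair (A, B) of subsets."""
    subsets = powerset(universe)
    for A in subsets:
        for B in subsets:
            statement_1 = A <= B          # A is a subset of B
            statement_2 = (A & B) == A    # A intersected with B equals A
            statement_3 = (A | B) == B    # A united with B equals B
            # The three truth values must all coincide.
            if not (statement_1 == statement_2 == statement_3):
                return False
    return True

print(theorem_1_holds({1, 2, 3, 4}))
```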

The complement of a set

It sometimes becomes useful to talk about the elements that are not in some set S. This only makes sense if S is contained inside some larger set.

::: definition Definition 29. Complement of a set

Let S be a set such that S\subseteq U for some set U. We define the complement of S, denoted by S^C, as the following set

$$\begin{equation*} S^C = \left\{x\in U:x\not\in S\right\} \end{equation*}$$

We can alternatively write S^C = U\setminus S, where \setminus is the set difference operation.

Moreover we can also consider the complement of a set B relative to some other set A, where both sets lie inside some larger set U, which is to say A\subseteq U and B\subseteq U. We have that

$$\begin{equation*} A\setminus B = \left\{x\in A : x\not\in B\right\} \end{equation*}$$

We call A\setminus B the set difference of A and B. :::

::: example Example 22. Let U=\left\{1,2,3,4,5,6\right\}, S=\left\{1,2,3,4,6\right\} and T=\left\{2,4,6\right\}. We have that S\subseteq U so that

$$\begin{align*} S^C&=\left\{x\in U:x\not\in S\right\}=\left\{5\right\}\\ T^C&=\left\{x\in U:x\not\in T\right\}=\left\{1,3,5\right\} \end{align*}$$

Also

$$\begin{align*} S\setminus T&=\left\{x\in S: x\not\in T\right\}=\left\{1,3\right\}\\ T\setminus S&=\left\{x\in T: x\not\in S\right\}=\emptyset \end{align*}$$ :::
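Example 22 can be reproduced in Python, where the `-` operator on sets is the set difference. The `complement` helper is a hypothetical name. It takes the surrounding set U explicitly, since a complement only makes sense relative to some larger set.

```python
def complement(S, U):
    """Complement of S inside U, i.e. the elements of U that are not in S."""
    if not S <= U:
        raise ValueError("S must be a subset of U")
    return U - S

# The sets from Example 22.
U = {1, 2, 3, 4, 5, 6}
S = {1, 2, 3, 4, 6}
T = {2, 4, 6}

print(complement(S, U))
print(sorted(complement(T, U)))
print(sorted(S - T))   # set difference S \ T
print(T - S)           # set difference T \ S, which is empty here
```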

An immediate result follows from the previous definitions of the complement of a set and set difference.

::: {#thm:DeMorgan .theorem} Theorem 2. De Morgan's laws

Let A and B be subsets of some universal set U. We have the complement laws

  1. $\left(A\cap B\right)^C=A^C\cup B^C$

  2. $\left(A\cup B\right)^C= A^C\cap B^C$

We also have the set difference laws

  1. $U\setminus\left(A\cap B\right)=\left(U\setminus A\right)\cup \left(U\setminus B\right)$

  2. $U\setminus\left(A\cup B\right)=\left(U\setminus A\right)\cap \left(U\setminus B\right)$

Proof:

We first prove the complement laws.

  1. \left(A\cap B\right)^C=A^C\cup B^C:

    Let x\in\left(A\cap B\right)^C, by the definition of the set complement we have that x\not\in \left(A\cap B\right). So by the definition of the intersection and x not being an element of A\cap B we have that x\not\in A or x\not\in B. Suppose that x\not\in A, then by the definition of set complement we have that x\in A^C so that x\in A^C\cup B^C. Likewise if x\not\in B then x\in B^C so that x\in A^C\cup B^C. Hence we have that \left(A\cap B\right)^C\subseteq A^C\cup B^C.

    Now suppose x\in A^C\cup B^C, then x\in A^C or x\in B^C. Suppose x\in A^C then x\not\in A so that x\not\in A\cap B hence x\in\left(A\cap B\right)^C. Likewise if x\in B^C then x\not\in B so x\not\in A\cap B so that x\in\left(A\cap B\right)^C. Thus $A^C\cup B^C\subseteq \left(A\cap B\right)^C$

    Hence \left(A\cap B\right)^C=A^C\cup B^C.

  2. \left(A\cup B\right)^C= A^C\cap B^C:

    Let x\in \left(A\cup B\right)^C, then we have that x\not\in A\cup B so x\not\in A and x\not\in B. This means that x\in A^C and x\in B^C which is to say x\in A^C\cap B^C. So \left(A\cup B\right)^C\subseteq A^C\cap B^C.

    Suppose x\in A^C \cap B^C then x\in A^C and x\in B^C. Now x\in A^C means that x\not\in A and x\in B^C means that x\not\in B, so x\not\in A and x\not\in B hence x\not\in A\cup B. Thus x\in\left(A\cup B\right)^C. Hence $A^C\cap B^C\subseteq\left(A\cup B\right)^C$

    Thus $\left(A\cup B\right)^C= A^C\cap B^C$

It is left to prove the set difference laws.

  1. U\setminus\left(A\cap B\right)=\left(U\setminus A\right)\cup \left(U\setminus B\right):

    Let x\in U\setminus\left(A\cap B\right) then by definition we have that x\in U and x\not\in A\cap B, which is to say that x\not\in A or x\not\in B with the possibility of being in neither. If x\not\in A then x\in U\setminus A and we clearly have x\in \left(U\setminus A\right)\cup \left(U\setminus B\right). Likewise if x\not\in B then x\in U\setminus B and again x\in \left(U\setminus A\right)\cup \left(U\setminus B\right). The case where x\not\in A and x\not\in B is covered by either argument. It follows that in every case x\in \left(U\setminus A\right)\cup \left(U\setminus B\right). Hence $U\setminus\left(A\cap B\right)\subseteq\left(U\setminus A\right)\cup \left(U\setminus B\right)$

    Now suppose that x\in \left(U\setminus A\right)\cup \left(U\setminus B\right) then by definition we have that x\in U\setminus A or x\in U\setminus B with the possibility of being in both. If x\in U\setminus A then x\in U and x\not\in A, hence x\not\in A\cap B. Likewise if x\in U\setminus B then x\in U and x\not\in B, so we again conclude that x\not\in A\cap B. In either case x\in U and x\not\in A\cap B, so by definition x\in U\setminus\left(A\cap B\right). We conclude that $\left(U\setminus A\right)\cup \left(U\setminus B\right)\subseteq U\setminus\left(A\cap B\right)$

    It follows that $U\setminus\left(A\cap B\right)=\left(U\setminus A\right)\cup \left(U\setminus B\right)$

  2. U\setminus\left(A\cup B\right)=\left(U\setminus A\right)\cap \left(U\setminus B\right):

    Suppose that x\in U\setminus\left(A\cup B\right) then x\in U and x\not\in A\cup B so x\not\in A and x\not\in B. Clearly then x\in U\setminus A and x\in U\setminus B so that x\in \left(U\setminus A\right)\cap \left(U\setminus B\right). So we have that U\setminus\left(A\cup B\right)\subseteq\left(U\setminus A\right)\cap \left(U\setminus B\right).

    Let x\in \left(U\setminus A\right)\cap \left(U\setminus B\right) then x\in U\setminus A and x\in U\setminus B which is to say that x\in U and x\not\in A and x\not\in B. Clearly x\not\in A and x\not\in B gives us that x\not\in A\cup B and so x\in U\setminus \left(A\cup B\right) by definition. This allows us to conclude that $\left(U\setminus A\right)\cap \left(U\setminus B\right)\subseteq U\setminus\left(A\cup B\right)$

    Hence $U\setminus\left(A\cup B\right)=\left(U\setminus A\right)\cap \left(U\setminus B\right)$

This proves the theorem. $\qed$ :::
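Both forms of the laws can be checked exhaustively on a small universal set. The following is a sketch under the assumption that U is finite. The names `powerset` and `de_morgan_holds` are chosen for illustration, and the complement of A is written as `U - A`.

```python
from itertools import combinations

def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def de_morgan_holds(universe):
    """Check both laws for every pair of subsets A, B of the universe U."""
    U = frozenset(universe)
    subsets = powerset(U)
    for A in subsets:
        for B in subsets:
            law_1 = (U - (A & B)) == ((U - A) | (U - B))
            law_2 = (U - (A | B)) == ((U - A) & (U - B))
            if not (law_1 and law_2):
                return False
    return True

print(de_morgan_holds({1, 2, 3, 4}))
```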

::: {#prop:AdditionComplement .proposition} Proposition 9. Additional properties of set complements and set differences

Let A, B and C be sets such that A\subseteq U, B\subseteq U and C\subseteq U. Moreover suppose U is not contained in any other set. Then we have that

  1. $A\cup A^C = U$

  2. $A\cap A^C =\emptyset$

  3. $\emptyset^C =U$

  4. $U^C=\emptyset$

  5. If A\subseteq B then $B^C\subseteq A^C$

  6. $\left(A^C\right)^C=A$

  7. $A\setminus B = A\cap B^C$

  8. $\left(A\setminus B\right)^C=A^C\cup B$

  9. $A^C\setminus B^C=B\setminus A$

  10. $\left(A\setminus B\right)\cap C = \left(A\cap C\right)\setminus\left(B\cap C\right)$

  11. $A\cap\left(B\setminus C\right) = \left(A\cap B\right)\setminus\left(A\cap C\right)$

  12. $\left(A\setminus B\right)\cap B=\emptyset$

  13. $\left(A\setminus B\right)\cap\left(A\cap B\right)=\emptyset$

Proof:

  1. A\cup A^C = U:

    Let x\in A\cup A^C then x\in A or x\in A^C. If x\in A then as A\subseteq U we have that x\in U. If x\in A^C then by the definition of the set complement we have that x\in U. Hence A\cup A^C\subseteq U.

    Conversely suppose that x\in U. We know that A\subseteq U so if x\in A we clearly have x\in A\cup A^C. So suppose x\not\in A then by definition of the set complement we have that x\in A^C so that x\in A\cup A^C. Hence U\subseteq A\cup A^C.

    So A\cup A^C=U.

  2. A\cap A^C =\emptyset:

    Let x\in A\cap A^C, then x\in A and x\in A^C, however x\in A^C means that x\not\in A. This contradicts the fact that x\in A, hence there is no element x\in U such that x\in A and x\in A^C.

    Hence A\cap A^C =\emptyset.

  3. \emptyset^C =U:

    By the definition of the empty set we have that \emptyset has no elements. The complement of the empty-set is

    $$\begin{equation*} \emptyset^C=\left\{x\in U:x\not\in\emptyset\right\} \end{equation*}$$

    Every element of \emptyset^C is by definition an element of U, so \emptyset^C\subseteq U.

    Conversely let x\in U, then x\not\in \emptyset as \emptyset has no elements, so x\in\emptyset^C. Hence U\subseteq \emptyset^C.

    It follows that \emptyset^C=U.

  4. U^C=\emptyset:

    By the definition of the set complement we have that

    $$\begin{equation*} U^C=\left\{y\in U:y\not\in U\right\} \end{equation*}$$

    This is clearly empty as no y can satisfy both y\in U and y\not\in U.

    Hence U^C=\emptyset.

  5. If A\subseteq B then B^C\subseteq A^C:

    Suppose that A\subseteq B. By proposition 8{reference-type="ref" reference="prop:PropertiesOfUnionIntersectionSetinclusion"} property 5 we have that A\cap B = A. It follows that \left(A\cap B\right)^C = A^C. Now by De Morgan's laws we have that \left(A\cap B\right)^C= A^C\cup B^C. Hence A^C\cup B^C = A^C, so by commutativity of the union B^C\cup A^C = A^C. Finally by theorem 1{reference-type="ref" reference="thm:EquivSubsetIntUnion"} we know that X\cup Y = Y if and only if X\subseteq Y for sets X and Y. Hence B^C\subseteq A^C.

  6. \left(A^C\right)^C=A:

    Let x\in \left(A^C\right)^C. By definition we have that

    $$\begin{equation*} \left(A^C\right)^C=\left\{x\in U : x\not\in A^C\right\} \end{equation*}$$

    Hence x\in \left(A^C\right)^C if and only if x\in U and x\not\in A^C. However for x\in U, x\not\in A^C means that x\in A. Hence $\left(A^C\right)^C\subseteq A$

    Suppose that x\in A, then x\not\in A^C, moreover by definition x\not\in A^C if and only if x\in \left(A^C\right)^C, hence A\subseteq \left(A^C\right)^C.

    Hence $\left(A^C\right)^C=A$

  7. A\setminus B = A\cap B^C:

    Let x\in A\setminus B. By definition A\setminus B is the set

    $$\begin{equation*} A\setminus B = \left\{y\in A:y\not\in B\right\} \end{equation*}$$

    Hence x\in A\setminus B means that x\in A and x\not\in B. As x\in A\subseteq U, we have that x\not\in B means that x\in B^C, so that x\in A\cap B^C. It follows that A\setminus B\subseteq A\cap B^C.

    Let x\in A\cap B^C, then x\in A and x\in B^C. x\in B^C means that x\not\in B, so by definition x\in A and x\not\in B means that x\in A\setminus B. Hence A\cap B^C\subseteq A\setminus B.

    Hence A\setminus B = A\cap B^C.

  8. \left(A\setminus B\right)^C=A^C\cup B:

    We know that A\setminus B = A\cap B^C by the previous property. Now by De Morgan's laws and property 6 we have that

    $$\begin{equation*} \left(A\setminus B\right)^C=\left(A\cap B^C\right)^C = A^C\cup \left(B^C\right)^C = A^C \cup B \end{equation*}$$

  9. A^C\setminus B^C=B\setminus A:

    We know that A^C\setminus B^C = A^C\cap \left(B^C\right)^C. Now, \left(B^C\right)^C=B hence A^C\cap \left(B^C\right)^C=A^C\cap B = B\cap A^C. Finally we know that B\cap A^C = B\setminus A by property 7.

    Hence A^C\setminus B^C=B\setminus A.

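  10. \left(A\setminus B\right)\cap C = \left(A\cap C\right)\setminus\left(B\cap C\right):

    Let x\in\left(A\setminus B\right)\cap C, then x\in A, x\not\in B and x\in C. Hence x\in A\cap C, and as x\not\in B we have that x\not\in B\cap C, so x\in\left(A\cap C\right)\setminus\left(B\cap C\right). Conversely let x\in\left(A\cap C\right)\setminus\left(B\cap C\right), then x\in A, x\in C and x\not\in B\cap C. As x\in C we must have x\not\in B, so x\in A\setminus B and hence x\in\left(A\setminus B\right)\cap C.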

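  11. A\cap\left(B\setminus C\right) = \left(A\cap B\right)\setminus\left(A\cap C\right):

    Let x\in A\cap\left(B\setminus C\right), then x\in A, x\in B and x\not\in C. Hence x\in A\cap B and x\not\in A\cap C, so x\in\left(A\cap B\right)\setminus\left(A\cap C\right). Conversely let x\in\left(A\cap B\right)\setminus\left(A\cap C\right), then x\in A, x\in B and x\not\in A\cap C. As x\in A we must have x\not\in C, so x\in B\setminus C and hence x\in A\cap\left(B\setminus C\right).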

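  12. \left(A\setminus B\right)\cap B=\emptyset:

    Suppose that x\in\left(A\setminus B\right)\cap B, then x\in A\setminus B, so x\not\in B, and also x\in B. This is a contradiction, so no such x exists and \left(A\setminus B\right)\cap B=\emptyset.

  13. \left(A\setminus B\right)\cap\left(A\cap B\right)=\emptyset:

    Suppose that x\in\left(A\setminus B\right)\cap\left(A\cap B\right), then x\in A\setminus B, so x\not\in B, and x\in A\cap B, so x\in B. This is a contradiction, so no such x exists and \left(A\setminus B\right)\cap\left(A\cap B\right)=\emptyset.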

The proposition now follows. $\qed$ :::

Cartesian Product

We now look to another method of constructing a set. This method differs from the union and intersection as it allows us to construct a set where the elements come in pairs, in particular these pairs are ordered.

::: definition Definition 30. Ordered pair

Let S and T be sets. Let s\in S and t\in T. We say that the tuple \left(s,t\right) is an ordered pair of an element in S and an element in T. :::

::: definition Definition 31. Cartesian product of two sets

Let S and T be sets. We define the Cartesian product of S and T, denoted S\times T to be the set of all ordered pairs of the form \left(s,t\right) where s\in S and t\in T. This is to say that

$$\begin{equation*} S\times T=\left\{\left(s,t\right):s\in S,t\in T\right\} \end{equation*}$$ :::

::: example Example 23. Let S=\left\{1,2,3\right\} and T=\left\{4,5,6\right\}. We have that

$$\begin{align*} S\times T&=\left\{\left(1,4\right),\left(1,5\right),\left(1,6\right),\left(2,4\right),\left(2,5\right),\left(2,6\right),\left(3,4\right),\left(3,5\right),\left(3,6\right)\right\}\\ T\times S&=\left\{\left(4,1\right),\left(4,2\right),\left(4,3\right),\left(5,1\right),\left(5,2\right),\left(5,3\right),\left(6,1\right),\left(6,2\right),\left(6,3\right)\right\} \end{align*}$$

This example shows that S\times T\neq T\times S in general. :::
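The Cartesian product of two finite sets can be sketched with `itertools.product` from the Python standard library, which yields every ordered pair as a tuple. The variable names below are chosen for illustration and the sets are those of Example 23.

```python
from itertools import product

S = [1, 2, 3]
T = [4, 5, 6]

# Every ordered pair (s, t) with s in S and t in T, collected into a set.
S_times_T = set(product(S, T))
T_times_S = set(product(T, S))

print(sorted(S_times_T))
print(sorted(T_times_S))

# The order inside each pair matters, so the two products differ.
print(S_times_T == T_times_S)
```

Swapping the two entries of every pair in S\times T gives exactly T\times S, which is one way to see why the two products have the same number of elements but are different sets.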

We can make repeated use of this idea; we just need to define an ordered $n$-tuple.

::: {#def:orderedNtuple .definition} Definition 32. Ordered $n$-tuple

Let S_1,S_2,\dots,S_n be sets. Let s_1\in S_1,s_2\in S_2,\dots,s_n\in S_n. We say that \left(s_1,s_2,\dots,s_n\right) is an ordered $n$-tuple of elements in S_1,S_2,\dots,S_n. :::

::: {#def:CartProductOfNSet .definition} Definition 33. Cartesian product of n sets

Let S_1,S_2,\dots,S_n be sets. We define the Cartesian product of S_1,S_2,\dots,S_n, denoted S_1\times S_2\times\dots\times S_n, to be the set of all ordered $n$-tuples of the form \left(s_1,s_2,\dots,s_n\right) where s_1\in S_1,s_2\in S_2,\dots,s_n\in S_n. This is to say that

$$\begin{equation*} S_1\times S_2\times\dots\times S_n=\left\{\left(s_1,s_2,\dots,s_n\right):s_1\in S_1,s_2\in S_2,\dots,s_n\in S_n\right\} \end{equation*}$$

If all the sets are equal to the same set S we denote this product by S^n. :::
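For finite sets, the product of n sets and the power S^n can be sketched with `itertools.product`, which accepts any number of arguments as well as a `repeat` keyword. The helper names `cartesian_product` and `cartesian_power` are hypothetical. Each element of the result is a flat Python tuple of length n.

```python
from itertools import product

def cartesian_product(*sets):
    """Set of all ordered n-tuples (s_1, ..., s_n) with s_i taken from the i-th set."""
    return set(product(*sets))

def cartesian_power(S, n):
    """Set of all ordered n-tuples with every entry taken from S."""
    return set(product(S, repeat=n))

triples = cartesian_product({1, 2}, {3}, {4, 5, 6})
print(sorted(triples))

bits = cartesian_power({0, 1}, 3)
print(sorted(bits))
```

In this finite setting the number of tuples is the product of the sizes of the factors, for example 2 * 1 * 3 = 6 for `triples`.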

We make the following observations

::: {#lem:CartEmpty .lemma} Lemma 1. Cartesian product is empty if and only if at least one of the sets in the product is empty

Let A and B be sets. We have that A\times B=\emptyset if and only if A=\emptyset or B=\emptyset.

Proof:

We argue by showing the equivalent statement that A\times B\neq\emptyset if and only if A\neq\emptyset and B\neq\emptyset. We have that A\times B\neq \emptyset if and only if there exists some \left(a,b\right)\in A\times B. By the definition of the Cartesian product, such a pair exists if and only if there exist a\in A and b\in B, which is to say A\neq\emptyset and B\neq\emptyset.

Negating both sides of this equivalence gives that A\times B=\emptyset if and only if A=\emptyset or B=\emptyset. $\qed$ :::
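The lemma can be illustrated on finite sets by comparing, for every pair of subsets of a small universe, whether the product is empty with whether one of the factors is empty. The helper names are chosen here for illustration, and the check covers only finitely many cases.

```python
from itertools import combinations, product

def powerset(universe):
    """All subsets of a finite universe, as frozensets."""
    items = list(universe)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def lemma_1_holds(universe):
    """Check that A x B is empty exactly when A is empty or B is empty."""
    subsets = powerset(universe)
    for A in subsets:
        for B in subsets:
            product_is_empty = len(set(product(A, B))) == 0
            some_factor_is_empty = len(A) == 0 or len(B) == 0
            if product_is_empty != some_factor_is_empty:
                return False
    return True

print(lemma_1_holds({1, 2, 3}))
print(set(product({1, 2}, set())))  # a product with the empty set has no pairs
```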

::: {#prop:CriterionForComOfCartProd .proposition} Proposition 10. Criterion for commutativity of the Cartesian product

Let A and B be sets. We have that A\times B = B\times A if at least one of the following holds.

  1. $A=B$

  2. A = \emptyset or B=\emptyset or $A=B=\emptyset$

Proof:

Let A and B be sets.

  1. A=B:

    Suppose that A=B then without loss of generality4 consider

    $$\begin{equation*} A\times B = A\times A = \left\{\left(a,a'\right):a\in A,a'\in A\right\} \end{equation*}$$

    Moreover

    $$\begin{equation*} B\times A = A\times A = \left\{\left(a,a'\right):a\in A,a'\in A\right\} \end{equation*}$$

    Hence we have that A\times B = B\times A.

  2. A = \emptyset or B=\emptyset or A=B=\emptyset:

    By lemma 1{reference-type="ref" reference="lem:CartEmpty"} we have that if A=\emptyset or B=\emptyset or A=B=\emptyset then A\times B=\emptyset =B\times A.

The proposition follows. $\qed$ :::

We have seen that the Cartesian product is not commutative in general, but what can we say about associativity?

::: example Example 24. Let A=\left\{1\right\}. Consider

$$\begin{align*} A\times\left(A\times A\right)&=A\times\left\{\left(1,1\right)\right\}=\left\{\left(1,\left(1,1\right)\right)\right\}\\ \left(A\times A\right)\times A &=\left\{\left(1,1\right)\right\}\times A = \left\{\left(\left(1,1\right),1\right)\right\} \end{align*}$$

Hence A\times\left(A\times A\right)\neq \left(A\times A\right)\times A. So in general the Cartesian product is not associative. :::
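Example 24 can be mirrored with nested Python tuples, which keep track of the bracketing in the same way as the nested ordered pairs above. This is a sketch with illustrative variable names. It models an ordered pair as a Python tuple, which is an assumption of the sketch rather than a definition taken from the text.

```python
from itertools import product

A = {1}

# A x (A x A): pairs whose second entry is itself a pair.
left = set(product(A, product(A, A)))

# (A x A) x A: pairs whose first entry is itself a pair.
right = set(product(product(A, A), A))

print(left)
print(right)
print(left == right)
```

The two results contain the same numbers but nest them differently, so they are different sets of ordered pairs.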

We have the following criterion for the associativity of the Cartesian product.

::: {#prop:CriterionForAssOfCartProd .proposition} Proposition 11. Criterion for associativity of the Cartesian product

Let A,B and C be sets. We have that A\times\left(B\times C\right)=\left(A\times B\right)\times C if and only if A=\emptyset or B=\emptyset or C=\emptyset.

Proof:

Suppose that A\times\left(B\times C\right)=\left(A\times B\right)\times C, we need to show one of A,B or C is empty.

Consider A\times\left(B\times C\right), we have that

$$\begin{equation*} A\times\left(B\times C\right)=A\times\left\{\left(b,c\right):b\in B,c\in C\right\}=\left\{\left(a,\left(b,c\right)\right):a\in A, \left(b,c\right)\in B\times C\right\} \end{equation*}$$

Now consider \left(A\times B\right)\times C, we have that

$$\begin{equation*} \left(A\times B\right)\times C=\left\{\left(a,b\right):a\in A,b\in B\right\}\times C=\left\{\left(\left(a,b\right),c\right):\left(a,b\right)\in A\times B, c\in C\right\} \end{equation*}$$

Hence for equality we need that a=\left(a,b\right) and \left(b,c\right)=c. However this is not possible as \left(a,b\right)\not\in A and \left(b,c\right)\not\in C. Hence one of the products must be empty, which implies that one of A,B or C is empty.

Now suppose that one of A,B or C is empty. Without loss of generality suppose that A=\emptyset, then by lemma 1{reference-type="ref" reference="lem:CartEmpty"} we know that A\times B=\emptyset and A\times\left(B\times C\right)=\emptyset. Also \left(A\times B\right)\times C=\emptyset\times C=\emptyset.

Hence we have that \left(A\times B\right)\times C=\emptyset=A\times\left(B\times C\right). $\qed$ :::

It is left to see how the Cartesian product interacts with unions, intersections and complements.

::: {#prop:CartProdUnIntComp .proposition} Proposition 12. Properties of Cartesian products, unions, intersections and complements

Let A,B,C and D be sets. We have the following properties

  1. $\left(A\cap B\right)\times\left(C\cap D\right) =\left(A\times C\right)\cap\left(B\times D\right)$

  2. $A\times\left(B\cap C\right)=\left(A\times B\right)\cap \left(A\times C\right)$

  3. $\left(A\times B\right)\cap\left(B\times A\right)=\left(A\cap B\right)\times\left(A\cap B\right)$

  4. $\left(A\cup B\right)\times\left(C\cup D\right) = \left(A\times C\right)\cup \left(B\times D\right)\cup\left(A\times D\right)\cup\left(B\times C\right)$

  5. $A\times\left(B\cup C\right) = \left(A\times B\right)\cup\left(A\times C\right)$

  6. $\left(B\cup C\right)\times A = \left(B\times A\right)\cup\left(C\times A\right)$

  7. If A\subseteq B and C\subseteq D then A\times C\subseteq B\times D. Moreover if A\neq\emptyset and C\neq\emptyset then

    $$\begin{equation*} A\times C\subseteq B\times D \iff A\subseteq B\text{ and } C\subseteq D \end{equation*}$$

  8. If A\subseteq B then $A\times C\subseteq B\times C$

  9. If C\subseteq D then $A\times C\subseteq A\times D$

  10. $A\times\left(B\setminus C\right)=\left(A\times B\right)\setminus\left(A\times C\right)$

  11. $\left(A\setminus B\right)\times C = \left(A\times C\right)\setminus\left( B\times C\right)$

  12. $\left(A\times B\right)\setminus\left(C\times D\right)=\left(A\times\left(B\setminus D\right)\right)\cup\left(\left(A\setminus C\right)\times B\right)$

  13. Suppose A\subseteq C and B\subseteq D and consider C\setminus A and D\setminus B. We have

    $$\begin{align*} \left(C\setminus A\right)\times D &= \left(C\times D\right)\setminus\left(A\times D\right)\\ C\times\left(D\setminus B\right) &=\left(C\times D\right)\setminus \left(C\times B\right) \end{align*}$$

Proof:

  1. \left(A\cap B\right)\times\left(C\cap D\right) =\left(A\times C\right)\cap\left(B\times D\right):

    Let \left(x,y\right)\in\left(A\cap B\right)\times\left(C\cap D\right), then by definition of the Cartesian product we have that \left(x,y\right)\in\left(A\cap B\right)\times\left(C\cap D\right) if and only if x\in A and x\in B and y\in C and y\in D. This holds if and only if \left(x,y\right)\in A\times C and \left(x,y\right)\in B\times D, which finally happens if and only if \left(x,y\right)\in \left(A\times C\right)\cap\left(B\times D\right).

  2. A\times\left(B\cap C\right)=\left(A\times B\right)\cap \left(A\times C\right):

    We know that A\cap A=A. By the previous property we have that

    $$\begin{equation*} A\times\left(B\cap C\right)=\left(A\cap A\right)\times\left(B\cap C\right)=\left(A\times B\right)\cap \left(A\times C\right) \end{equation*}$$

  3. \left(A\times B\right)\cap\left(B\times A\right)=\left(A\cap B\right)\times\left(A\cap B\right):

    By property 1 we have

    $$\begin{equation*} \left(A\times B\right)\cap\left(B\times A\right)=\left(A\cap B\right)\times \left(B\cap A\right) = \left(A\cap B\right)\times \left(A\cap B\right) \end{equation*}$$

  4. \left(A\cup B\right)\times\left(C\cup D\right) = \left(A\times C\right)\cup \left(B\times D\right)\cup\left(A\times D\right)\cup\left(B\times C\right):

    Let \left(x,y\right)\in \left(A\cup B\right)\times\left(C\cup D\right). By definition of the Cartesian product and the union of sets, this holds if and only if (x\in A or x\in B) and (y\in C or y\in D).

    Distributing "and" over "or", this occurs if and only if (x\in A and y\in C) or (x\in B and y\in C) or (x\in A and y\in D) or (x\in B and y\in D).

    By the definition of the Cartesian product, this holds if and only if \left(x,y\right)\in A\times C or \left(x,y\right)\in B\times C or \left(x,y\right)\in A\times D or \left(x,y\right)\in B\times D. Hence by the definition of the union of sets, this occurs if and only if \left(x,y\right)\in \left(A\times C\right)\cup \left(B\times D\right)\cup\left(A\times D\right)\cup\left(B\times C\right).

  5. A\times\left(B\cup C\right) = \left(A\times B\right)\cup\left(A\times C\right):

    We know A=A\cup A and so by the previous property we have that

    $$\begin{align*} A\times\left(B\cup C\right)&=\left(A\cup A\right)\times\left(B\cup C\right)\\ &=\left(A\times B\right)\cup \left(A\times C\right)\cup\left(A\times C\right)\cup\left(A\times B\right)\\ &=\left(A\times B\right)\cup\left(A\times C\right) \end{align*}$$

  6. \left(B\cup C\right)\times A = \left(B\times A\right)\cup\left(C\times A\right):

    Again A=A\cup A and so by property 4 we have

    $$\begin{align*} \left(B\cup C\right)\times A&=\left(B\cup C\right)\times\left(A\cup A\right)\\ &=\left(B\times A\right)\cup \left(B\times A\right)\cup\left(C\times A\right)\cup\left(C\times A\right)\\ &=\left(B\times A\right)\cup\left(C\times A\right) \end{align*}$$

  7. If A\subseteq B and C\subseteq D then A\times C\subseteq B\times D. Moreover if A\neq\emptyset and C\neq\emptyset then

    $$\begin{equation*} A\times C\subseteq B\times D \iff A\subseteq B\text{ and } C\subseteq D \end{equation*}$$

    Let A\subseteq B and C\subseteq D. If A=\emptyset or C=\emptyset then by lemma 1{reference-type="ref" reference="lem:CartEmpty"} we have A\times C=\emptyset and by proposition 6{reference-type="ref" reference="prop:EmptySetincontainedineveryset"} we have A\times C=\emptyset \subseteq B\times D.

    So suppose that A\neq\emptyset and C\neq\emptyset then lemma 1{reference-type="ref" reference="lem:CartEmpty"} gives A\times C\neq\emptyset. Let \left(x,y\right)\in A\times C, which holds if and only if x\in A and y\in C. We have A\subseteq B so x\in B and C\subseteq D so y\in D, hence \left(x,y\right)\in B\times D. Hence A\times C\subseteq B\times D.

    It is left to prove that if A\neq\emptyset and C\neq\emptyset and A\times C\subseteq B\times D, then A\subseteq B and C\subseteq D. Suppose A\times C\subseteq B\times D. If A=\emptyset then A\times C=\emptyset by lemma 1{reference-type="ref" reference="lem:CartEmpty"} and A\times C=\emptyset\subseteq B\times D irrespective of C, so C need not be a subset of D. Likewise if C=\emptyset then A\times C=\emptyset\subseteq B\times D irrespective of A so A need not be a subset of B.

    So suppose that A\neq\emptyset and C\neq\emptyset. Let x\in A be arbitrary. As C\neq\emptyset there exists some y\in C, so that \left(x,y\right)\in A\times C. We have that A\times C\subseteq B\times D and so \left(x,y\right)\in B\times D, giving x\in B. Hence A\subseteq B. The same argument, starting with an arbitrary y\in C and some x\in A, gives C\subseteq D.

    Hence for A\neq\emptyset and C\neq\emptyset, we have that A\subseteq B and C\subseteq D gives A\times C\subseteq B\times D and A\times C\subseteq B\times D gives A\subseteq B and C\subseteq D. Hence we have

    $$\begin{equation*} A\times C\subseteq B\times D\iff A\subseteq B\text{ and } C\subseteq D \end{equation*}$$

  8. If A\subseteq B then A\times C\subseteq B\times C:

    Let A be such that A\subseteq B. We have for any set C that C\subseteq C, hence by the previous property we know that

    $$\begin{equation*} A\subseteq B\text{ and } C\subseteq C\Rightarrow A\times C\subseteq B\times C \end{equation*}$$

  9. If C\subseteq D then A\times C\subseteq A\times D:

    Let C be such that C\subseteq D. We have that A\subseteq A and so by property 7 we have that

    $$\begin{equation*} A\subseteq A\text{ and } C\subseteq D\Rightarrow A\times C\subseteq A\times D \end{equation*}$$

  10. A\times\left(B\setminus C\right)=\left(A\times B\right)\setminus\left(A\times C\right):

    Let \left(x,y\right)\in A\times\left(B\setminus C\right) then we have that \left(x,y\right)\in A\times\left(B\setminus C\right) if and only if x\in A and y\in B\setminus C. y\in B\setminus C means that y\in B and y\not\in C. Thus, x\in A and y\in B and y\not\in C happens if and only if \left(x,y\right)\in A\times B and \left(x,y\right)\not\in A\times C. Hence by definition of the difference of two sets we have that \left(x,y\right)\in A\times B and \left(x,y\right)\not\in A\times C if and only if \left(x,y\right)\in \left(A\times B\right)\setminus\left(A\times C\right).

  11. \left(A\setminus B\right)\times C = \left(A\times C\right)\setminus\left( B\times C\right):

    Let \left(x,y\right)\in \left(A\setminus B\right)\times C then we have that \left(x,y\right)\in \left(A\setminus B\right)\times C if and only if x\in A\setminus B and y\in C, moreover x\in A\setminus B means that x\in A and x\not\in B. Hence x\in A and x\not\in B and y\in C occurs if and only if \left(x,y\right)\in A\times C and \left(x,y\right)\not\in B\times C. Hence by definition we have that \left(x,y\right)\in A\times C and \left(x,y\right)\not\in B\times C if and only if \left(x,y\right)\in\left(A\times C\right)\setminus\left( B\times C\right).

  12. \left(A\times B\right)\setminus\left(C\times D\right)=\left(A\times\left(B\setminus D\right)\right)\cup\left(\left(A\setminus C\right)\times B\right):

    Let \left(x,y\right)\in \left(A\times B\right)\setminus\left(C\times D\right). This holds if and only if \left(x,y\right)\in A\times B and \left(x,y\right)\not\in C\times D, which happens if and only if x\in A and y\in B and (x\not\in C or y\not\in D), since \left(x,y\right)\in C\times D requires both x\in C and y\in D. Hence either x\in A and y\in B and x\not\in C, or x\in A and y\in B and y\not\in D. In the first case we have that x\in A\setminus C and y\in B; in the second case we have x\in A and y\in B\setminus D.

    By the definition of the Cartesian product, x\in A\setminus C and y\in B holds if and only if \left(x,y\right)\in \left(A\setminus C\right)\times B, and x\in A and y\in B\setminus D holds if and only if \left(x,y\right)\in A\times \left(B\setminus D\right).

    Hence \left(x,y\right)\in \left(A\times B\right)\setminus\left(C\times D\right) if and only if \left(x,y\right)\in \left(A\setminus C\right)\times B or \left(x,y\right)\in A\times \left(B\setminus D\right), that is, if and only if \left(x,y\right)\in \left(A\times\left(B\setminus D\right)\right)\cup\left(\left(A\setminus C\right)\times B\right).

  13. Suppose A\subseteq C and B\subseteq D and consider C\setminus A and D\setminus B. We have

    $$\begin{align*} \left(C\setminus A\right)\times D &= \left(C\times D\right)\setminus\left(A\times D\right)\\ C\times\left(D\setminus B\right) &=\left(C\times D\right)\setminus \left(C\times B\right) \end{align*}$$

    Recall that C\setminus A=\left\{x: x\in C\text{ and } x\not\in A\right\}. Now we have by property 11 that

    $$\begin{equation*} \left(C\setminus A\right)\times D= \left(C\times D\right)\setminus \left(A\times D\right) \end{equation*}$$

    Likewise, by property 10 we have that

    $$\begin{equation*} C\times\left(D\setminus B\right)= \left(C\times D\right)\setminus \left(C\times B\right) \end{equation*}$$

Hence the result has been shown. $\qed$ :::
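For finite sets, several of the identities above can be checked numerically. The following is a minimal sketch, not a proof: the helper `cart` and the sets `A`, `B`, `C`, `D` are hypothetical choices made purely for illustration, and a passing check on one example only shows that the identities are consistent with that example.

```python
from itertools import product


def cart(X, Y):
    """Cartesian product of two finite sets, as a set of ordered pairs."""
    return set(product(X, Y))


# Small hypothetical sets chosen only for illustration.
A, B, C, D = {1, 2}, {2, 3}, {1, 3}, {3, 4}

# Property 4: (A ∪ B) x (C ∪ D) = (A x C) ∪ (B x D) ∪ (A x D) ∪ (B x C)
prop4 = cart(A | B, C | D) == cart(A, C) | cart(B, D) | cart(A, D) | cart(B, C)

# Property 5: A x (B ∪ C) = (A x B) ∪ (A x C)
prop5 = cart(A, B | C) == cart(A, B) | cart(A, C)

# Property 10: A x (B \ C) = (A x B) \ (A x C)
prop10 = cart(A, B - C) == cart(A, B) - cart(A, C)

# Property 11: (A \ B) x C = (A x C) \ (B x C)
prop11 = cart(A - B, C) == cart(A, C) - cart(B, C)

print(prop4, prop5, prop10, prop11)
```

Each flag is `True` for these sets, as the proposition predicts. Changing the sets is a quick way to build intuition before reading the proofs.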

Power Set

We make one final definition of an elementary operation for sets.

::: definition Definition 34. Power set

Let S be a set. We define the power set of the set S, denoted P\left(S\right) to be the set which contains all of the possible subsets of S. :::

::: example Example 25. Let S=\left\{1,2,3\right\} then we have that

$$\begin{equation*} P\left(S\right)=\left\{\emptyset,\left\{1\right\},\left\{2\right\},\left\{3\right\},\left\{1,2\right\},\left\{1,3\right\},\left\{2,3\right\},S\right\} \end{equation*}$$ :::
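The power set of a small finite set can be enumerated directly. The sketch below is one possible way to do this in Python; the function name `power_set` is a hypothetical choice, and subsets are stored as `frozenset` objects because ordinary Python sets cannot be elements of other sets.

```python
from itertools import combinations


def power_set(S):
    """Return the power set of a finite set S as a set of frozensets."""
    elements = list(S)
    return {
        frozenset(subset)
        for r in range(len(elements) + 1)      # subset sizes 0, 1, ..., |S|
        for subset in combinations(elements, r)
    }


P = power_set({1, 2, 3})
print(len(P))  # 8 subsets, matching the example above
```

The count of 8 agrees with the eight subsets listed in the example for S=\left\{1,2,3\right\}.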

Set Partitions

Recall the idea of disjoint sets, that is if X and Y are sets then X and Y are disjoint if X\cap Y=\emptyset. This is saying that X and Y have no elements in common. Now suppose we have a set S such that X\cup Y=S but X\cap Y=\emptyset. Then S is made of two distinct pieces. Of course there is nothing special about two pieces; S could be made of many pieces. We capture this idea in the next definition.

::: definition Definition 35. Partition of a set

Let S be a set and let \mathbb{S} be a set of subsets of S. We say that \mathbb{S} is a partition of S if the following hold.

  1. \forall S_1,S_2\in\mathbb{S} we have S_1\cap S_2=\emptyset whenever $S_1\neq S_2$

  2. Taking the union of every T\in\mathbb{S} gives us S, that is

    $$\begin{equation*} S=\bigcup_{T\in\mathbb{S}} T \end{equation*}$$

  3. \forall T\in\mathbb{S} we have that T\neq\emptyset.

If the number of sets in \mathbb{S} is finite with say n elements then we call \mathbb{S} an $n$-component partition :::

::: example Example 26. Let S=\left\{1,2,3,4\right\} and let S_1=\left\{2,4\right\} and S_2=\left\{1,3\right\}. Then S_1 and S_2 partition S. Interestingly we have that S_1^C=S_2 and S_2^C = S_1, so the complements of these sets still form a partition.

If instead we have S_3 = \left\{1\right\} and S_4=\left\{2,3,4\right\} then we also have a partition where the complements are also a partition. Now if S_5=\left\{2\right\}, S_6=\left\{1,3\right\} and S_7=\left\{4\right\} then S_5,S_6 and S_7 is a partition of S. :::
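The three conditions in the definition of a partition translate directly into a check for finite sets. The following is a minimal sketch under that finiteness assumption; the function name `is_partition` is hypothetical and not part of the text.

```python
def is_partition(parts, S):
    """Check the three conditions of the definition for a finite collection of sets."""
    parts = [set(p) for p in parts]
    # Condition 3: no part is empty.
    if any(len(p) == 0 for p in parts):
        return False
    # Condition 1: distinct parts are pairwise disjoint.
    for i, p in enumerate(parts):
        for q in parts[i + 1:]:
            if p != q and p & q:
                return False
    # Condition 2: the union of all parts is S.
    return set().union(*parts) == set(S)


S = {1, 2, 3, 4}
print(is_partition([{2, 4}, {1, 3}], S))          # True
print(is_partition([{2}, {1, 3}, {4}], S))        # True
print(is_partition([{1, 2}, {2, 3, 4}], S))       # False: the parts overlap in 2
print(is_partition([S - {2, 4}, S - {1, 3}], S))  # True: complements of the first partition
```

The last line takes the complements in S of the two sets from the first example and confirms that they again partition S.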

The fact that in the first two examples we had two sets partitioning S whose complements also partitioned S is not a coincidence.

::: proposition Proposition 13. Complements of 2-component partition is partition

Let S be a set and let A\subseteq S and B\subseteq S. We have that A and B form a $2$-component partition of S if and only if A^C and B^C form a $2$-component partition of S.

Proof:

\left(\Rightarrow\right): Suppose that A\subseteq S and B\subseteq S partition S. By definition we have that

  1. $A\cap B = \emptyset$

  2. $A\cup B = S$

  3. A\neq\emptyset and $B\neq \emptyset$

We need to show that A^C and B^C is a partition that is

  1. $A^C\cap B^C = \emptyset$

  2. $A^C\cup B^C = S$

  3. A^C\neq\emptyset and $B^C\neq \emptyset$

  1. A^C\cap B^C = \emptyset:

    As A\cup B = S we have on taking the complement of both sides that

    $$\begin{align*} A\cup B &= S\\ \left(A\cup B\right)^C &= S^C\\ A^C\cap B^C &= \emptyset \end{align*}$$

    So A^C\cap B^C = \emptyset.

  2. A^C\cup B^C = S:

    Likewise as A\cap B = \emptyset then on taking the complement of both sides we have that

    $$\begin{align*} A\cap B &= \emptyset\\ \left(A\cap B\right)^C &= \emptyset^C\\ A^C\cup B^C &= S \end{align*}$$

    So A^C\cup B^C = S.

  3. A^C\neq\emptyset and B^C\neq \emptyset:

    Suppose that A^C = \emptyset then by taking the complement of both sides we have that A=S which, as A\cap B=\emptyset, implies B=\emptyset. This is a contradiction as A and B partition S. Likewise if we suppose that B^C=\emptyset we will have to conclude that A=\emptyset which will be a contradiction. It thus follows that neither A^C nor B^C can be empty.

    Hence A^C\neq\emptyset and B^C\neq \emptyset.

It follows that A^C and B^C is a partition of $S$

\left(\Leftarrow\right): Suppose that A^C and B^C is a partition of S. We have that A^C\subseteq S and B^C\subseteq S. By the previous part we have that \left(A^C\right)^C and \left(B^C\right)^C is a partition of S. However \left(A^C\right)^C=A and \left(B^C\right)^C=B. Thus A and B is a partition of S.

The result now follows. $\qed$ :::

There are some additional results we can state about partitions that relate to the operations we can do on sets. We will require the following lemma.

::: lemma Lemma 2. Set difference and intersection are disjoint sets

Let S and T be two sets. We have that S\setminus T and S\cap T are disjoint sets, which is to say that

$$\begin{equation*} \left(S\setminus T\right)\cap \left(S\cap T\right)=\emptyset \end{equation*}$$

Proof:

Suppose that x\in \left(S\setminus T\right)\cap \left(S\cap T\right) then by definition x\in S\setminus T and x\in S\cap T. As x\in S\setminus T then we have that x\in S and x\not\in T, likewise as x\in S\cap T then x\in S and x\in T. It is clear that no such x can exist hence \left(S\setminus T\right)\cap \left(S\cap T\right)=\emptyset. :::

A brief look at Zermelo--Fraenkel set theory

At the start of this section we introduced the idea of Zermelo--Fraenkel set theory. This is the complete formalisation of set theory and the true bedrock of mathematics. The Zermelo--Fraenkel set theory axioms, henceforth referred to as ZF, are given as follows.

::: definition Definition 36. Zermelo--Fraenkel set theory axioms

The Zermelo-Fraenkel set theory axioms are the following.

  1. The axiom of extensionality:

    The axiom of extensionality asserts that two sets are equal if and only if they contain the same elements.

  2. The axiom of the empty-set:

    The axiom of the empty-set asserts that there exists a set which contains no elements

  3. The axiom of pairing:

    The axiom of pairing asserts that given any set A and any set B, there is a set C such that, given any set D, D is a member of C if and only if D is equal to A or D is equal to B. This is to say, given two sets, there is a set whose members are exactly the two given sets.

  4. The axiom of specification:

    The axiom of specification asserts that, given a set A and a condition, we can construct the subset of A consisting of exactly those elements of A which satisfy the condition.

  5. The axiom of unions:

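    The axiom of unions asserts that, given any set of sets, there is a set whose elements are exactly the elements of those sets. In particular, we can perform the union of two sets A and B.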

  6. The axiom of powers:

    The axiom of powers asserts that for any set S we can construct a set P\left(S\right) whose elements are all the possible subsets of S.

  7. The axiom of infinity:

    The axiom of infinity asserts that there is at least one infinite set A, that is at least one set with infinitely many elements. That is, we have a set A such that \emptyset\in A and if x\in A then the set x\cup\left\{x\right\} is also in A.

  8. The axiom of replacement:

    We will need the next section to fully understand this axiom. Informally, however, it asserts that given a set S we can form another set by replacing the elements of S with other sets according to any definite rule.

  9. The axiom of foundation:

    The axiom of foundation asserts that for every non-empty set S, there exists an element x\in S such that x and S are disjoint. This implies that no set can contain itself. :::

There is also one axiom which we have left off. This is the controversial axiom of choice.

::: definition Definition 37. The axiom of choice

Let S be a set of non-empty sets. The axiom of choice asserts that there is a way to pick an element of each of the sets in S. :::

With the axiom of choice we have the following

::: definition Definition 38. ZFC axioms

The axioms of ZF along with the axiom of choice gives us the ZFC axioms :::

We can already see that our "hands-on" approach to set theory has somewhat indirectly captured the essence of the ZF axioms. We can use the ZF axioms to prove in a truly rigorous way what we did with our "hands-on" approach. Although it is an interesting field of study in itself, we will not really need to use the ZF axioms, although occasionally we may rely on choice.

There is one other thing that needs bringing up: ZFC has one more component, as the axioms alone are not enough to prove anything. We need the notion of membership, that is being an element of a set. That is, we include the symbol \in along with the axioms, where \in takes on the meaning we defined earlier. With this we can in theory use ZFC to start proving and building up mathematics from the bedrock.

Mappings

Introduction and basic definitions

Now that we have the idea of a set, what can we use it for? Many areas of mathematics can be broken down into the theory of sets, in particular how we can get from one set to another. Without this idea we wouldn't be able to get very far at all. As an example, you may have seen, in a calculus course say, the idea of a function f\left(x\right), say f\left(x\right)=x^2 where x can be any number we choose. Say x=2 then f\left(2\right)=4. You may have also seen functions where we are not allowed to use any number we wish. For example, if we take f\left(x\right)=\sqrt{x} then we are only allowed non-negative numbers if we want to find an answer using the numbers we are familiar with, such as 1, 88.125, \pi,\sqrt{2} etc. We will denote the set of these familiar numbers by \mathbb{R}. The alert reader may now see how sets will come into play: to define in a rigorous way the ideas of f\left(x\right)=x^2 and other such functions, we need to consider what the allowable inputs are, which in turn determines the possible outputs. That is, if we have a set whose elements are inputs and we define some form of function, which we will now call a map, then we will get another set whose elements are what the inputs will be 'mapped' to.

::: definition Definition 39. Mapping

Let X and Y be sets. Suppose we have some rule or description, which we will denote by f, by which for each x\in X there is some element f\left(x\right)\in Y. We say that the rule (description) is a mapping or map or function from X to Y. We denote a mapping with the following notation

$$\begin{align*} f:X&\mathlarger{\mathlarger{\rightarrow}}Y\\ x&\mapsto f\left(x\right) \end{align*}$$

where the first line tells us what sets the mapping is between, and the bottom line tells us where each element x\in X gets mapped to. :::

::: definition Definition 40. Domain

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping between two sets X and Y. We say that the set X is the domain of the mapping f. The domain contains the elements which the map can act on. We can write this as

$$\begin{equation*} \mathop{\mathrm{Dom}}\left(f\right)=X \end{equation*}$$ :::

::: definition Definition 41. Co-Domain

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping between two sets X and Y. We say that the set Y is the co-domain of the mapping f. The co-domain contains the possible elements that the map can send elements of X to. We can write this as

$$\begin{equation*} \mathop{\mathrm{Cdm}}\left(f\right)=Y \end{equation*}$$ :::

We have some examples of mappings.

::: example Example 27. Let X=\left\{1,2,3\right\} and let Y=X. Define the map

$$\begin{align*} f:X&\mathlarger{\mathlarger{\rightarrow}}Y\\ x&\mapsto f\left(x\right)=x \end{align*}$$

To see what f does we will take each element of X one at a time. Starting with 1 we have that 1\mapsto f\left(1\right)=1, for 2 we have 2\mapsto f\left(2\right)=2 and finally 3\mapsto f\left(3\right)=3. Hence the map f takes an element of X and leaves it alone. A map which takes every element of its domain and leaves it alone is called an identity map, or if you prefer the do nothing at all map. :::

::: {#exmp:Mapping 1 .example} Example 28. Let X=Y=\mathbb{N}. Let f be the map given by

$$\begin{align*} f:X&\mathlarger{\mathlarger{\rightarrow}}Y\\ x&\mapsto f\left(x\right)=2x \end{align*}$$

It is clear to see that every element in the domain gets doubled, i.e. f\left(1\right)=2, f\left(2\right)=4, f\left(3\right)=6 and so on. :::

A map does not need to be given by an explicit mathematical formula.

::: example Example 29. Let A=\text{The set of all humans currently alive on planet earth}, from which it should be clear to see that \text{You}\in A 5 . Let B=\left\{0,1\right\}. Let f be the mapping given by

$$\begin{align*} f:A&\mathlarger{\mathlarger{\rightarrow}}B\\ a&\mapsto f\left(a\right)= \begin{cases} 1,\ \text{If } a \text{ has hair on their head}\\ 0,\ \text{If } a \text{ does not have hair on their head} \end{cases} \end{align*}$$

Then f is a map which indicates if a given person has hair on their head or not. :::

The above definition of a mapping can be made more general

::: definition Definition 42. Piecewise mapping

Let f:X\rightarrow Y be a mapping. We say that f is a piecewise mapping if we need multiple rules or descriptions to fully describe f. That is, we wish to define the mapping using different rules based on the input. If for each of these input ranges we define a mapping g_1,g_2,g_3,\dots then we can write the piecewise function as follows

$$\begin{align*} f:X&\rightarrow Y\\ x&\mapsto f\left(x\right)=\begin{cases} g_1\left(x\right),\ \text{Condition for }g_1\\ g_2\left(x\right),\ \text{Condition for }g_2\\ g_3\left(x\right),\ \text{Condition for }g_3\\ \dots \end{cases} \end{align*}$$ :::

::: example Example 30. Let f:\mathbb{N}\rightarrow\mathbb{N} be defined by

$$\begin{align*} f:\mathbb{N}&\rightarrow\mathbb{N}\\ x &\mapsto f\left(x\right) = \begin{cases} 2x,\ \text{If } x <5\\ 5x,\ \text{Otherwise} \end{cases} \end{align*}$$

We have that f\left(1\right)=2, f\left(2\right)=4 and so on up to f\left(4\right)=8, then f\left(5\right)=25 and so on. :::
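A piecewise mapping corresponds naturally to a function with a branch for each rule. The sketch below mirrors the example above in Python; it is only an illustration of the idea, with the domain restricted to a handful of inputs for printing.

```python
def f(x):
    """Piecewise rule from the example: double inputs below 5, otherwise multiply by 5."""
    if x < 5:
        return 2 * x
    return 5 * x


print([f(x) for x in range(1, 7)])  # [2, 4, 6, 8, 25, 30]
```

The jump from 8 to 25 between the inputs 4 and 5 is where the rule switches from the first case to the second.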

We make one more definition that will be useful throughout the rest of the text.

::: definition Definition 43. Closure of a mapping

Let X be a set. If we have a mapping f:X^n\rightarrow X, we say the mapping has closure on the set X, or we say that f is a closed mapping. :::

The image and pre-image

We now define a more technical notion of how a mapping f maps an element in the domain to the co-domain.

::: definition Definition 44. Image of an element

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping between two sets X and Y, and let x\in X be an element of the domain. We say that f\left(x\right)\in Y is the image of the element x. :::

Which in turn allows us to define the subset of the co-domain that the elements x\in X get mapped to.

::: {#def:ImageMapping .definition} Definition 45. Image of a mapping

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping between two sets X and Y. We define the set

$$\begin{equation*} \mathop{\mathrm{Image}}\left(f\right)=f\left(X\right)=\left\{f\left(x\right):x\in X\right\}\subseteq Y \end{equation*}$$

to be the image of the domain, sometimes called the range of f. That is, the image is the set of all possible outputs of the mapping f with the domain X.

Moreover, suppose that A\subseteq X then we define the image of the subset A to be

$$\begin{equation*} f\left(A\right)=\left\{f\left(x\right):x\in A\right\}\subseteq f\left(X\right)\subseteq Y \end{equation*}$$

That is we can consider the image of subsets of X. :::

::: example Example 31. Consider the mapping in example [28](#exmp:Mapping 1){reference-type="ref" reference="exmp:Mapping 1"}, we have that X=Y=\mathbb{N} and f is the map

$$\begin{align*} f:X&\mathlarger{\mathlarger{\rightarrow}}Y\\ x&\mapsto f\left(x\right)=2x \end{align*}$$

then we have that $\mathop{\mathrm{Image}}\left(f\right)=f\left(\mathbb{N}\right)=\left\{2x:x\in\mathbb{N}\right\}$ :::

::: example Example 32. Let f be an arbitrary mapping such that f:\emptyset\mathlarger{\mathlarger{\rightarrow}}Y for some set Y. What is \mathop{\mathrm{Image}}\left(f\right)? By definition 45{reference-type="ref" reference="def:ImageMapping"} of the image of a mapping, we have that

$$\begin{equation*} \mathop{\mathrm{Image}}\left(f\right)=\left\{f\left(x\right):x\in\emptyset\right\} \end{equation*}$$

However, we know that the empty set has no elements, so there are no elements for f to send anywhere, so \mathop{\mathrm{Image}}\left(f\right)=\emptyset. :::
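For a finite set of inputs, the image can be computed with a set comprehension that follows the definition term by term. This is a minimal sketch; the names `image` and `double` are hypothetical, and only a finite subset of \mathbb{N} is used since a program cannot enumerate all of \mathbb{N}.

```python
def image(f, A):
    """Image f(A) = {f(x) : x in A} of a finite set A under the mapping f."""
    return {f(x) for x in A}


def double(x):
    """The doubling map x -> 2x from the earlier example."""
    return 2 * x


print(image(double, {1, 2, 3}) == {2, 4, 6})  # True
print(image(double, set()) == set())          # True: the image of the empty set is empty
```

The second line matches the last example: with no inputs, the comprehension produces no outputs.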

Likewise we can ask which elements of the domain are mapped to a given element of the co-domain. This is called the pre-image.

::: definition Definition 46. Pre-image of an element

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping between two sets X and Y, and let y\in Y be an element of the co-domain. If x\in X is such that f\left(x\right)=y then we say that x is a pre-image of the element y, and we denote the set of all such x by f^{-1}\left(y\right). :::

Which in turn allows us to define the subset of the domain consisting of every element that gets mapped into a given subset of the co-domain.

::: {#def:PreImageMapping .definition} Definition 47. Pre-image of a mapping

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping between two sets X and Y. We define the set

$$\begin{equation*} \mathop{\mathrm{PreImage}}\left(f\right)=f^{-1}\left(Y\right)=\left\{x\in X:f\left(x\right)\in Y\right\}\subseteq X \end{equation*}$$

to be the pre-image of the co-domain. That is, the pre-image is the set of all inputs whose outputs lie in Y.

Moreover, suppose that B\subseteq Y then we define the pre-image of the subset B to be

$$\begin{equation*} f^{-1}\left(B\right)=\left\{x\in X:f\left(x\right)\in B\right\}\subseteq f^{-1}\left(Y\right)\subseteq X \end{equation*}$$ :::

::: example Example 33. Consider the mapping f:\mathbb{N}\rightarrow\mathbb{N} given by

$$\begin{align*} f:\mathbb{N}&\rightarrow\mathbb{N}\\ x&\mapsto f\left(x\right)=\frac{x}{2} \end{align*}$$

We have that \frac{x}{2} is a natural number only when x is an even number, so strictly speaking this rule only gives an output in \mathbb{N} for even inputs. Hence the pre-image must consist of the even numbers.

$$\begin{equation*} \mathop{\mathrm{PreImage}}\left(f\right)=f^{-1}\left(\mathbb{N}\right)=\left\{x\in\mathbb{N}:\frac{x}{2}\in\mathbb{N}\right\}=\left\{0,2,4,6,8,\dots\right\} \end{equation*}$$ :::

::: example Example 34. Consider the mapping f:\mathbb{N}\rightarrow\mathbb{N} given by

$$\begin{align*} f:\mathbb{N}&\rightarrow\mathbb{N}\\ x&\mapsto f\left(x\right)=x^2 \end{align*}$$

We have that the pre-image is given by

$$\begin{equation*} \mathop{\mathrm{PreImage}}\left(f\right)=\left\{x\in\mathbb{N}:x^2\in\mathbb{N}\right\}=\left\{0,1,2,3,4,\dots\right\}=\mathbb{N} \end{equation*}$$ :::
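The pre-image of a subset can be computed in the same spirit as the image, by filtering the domain. The sketch below assumes a finite domain; the names `preimage`, `image`, `square`, `X` and `B` are hypothetical and chosen only for illustration.

```python
def image(f, A):
    """Image f(A) = {f(x) : x in A} of a finite set A under the mapping f."""
    return {f(x) for x in A}


def preimage(f, X, B):
    """Pre-image f^{-1}(B) = {x in X : f(x) in B} for a finite domain X."""
    return {x for x in X if f(x) in B}


def square(x):
    """The squaring map x -> x^2 from the example above."""
    return x * x


X = set(range(6))   # hypothetical finite domain {0, 1, 2, 3, 4, 5}
B = {0, 1, 4, 7}    # 7 is not a square, so nothing in X maps to it

pre = preimage(square, X, B)
print(pre == {0, 1, 2})                          # True
print(image(square, pre) == B & image(square, X))  # True
```

The final check compares the image of the pre-image of B with the part of B that is actually reached by the mapping, which for this example is \left\{0,1,4\right\}.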

With these definitions we can make the following observations

::: {#prop:PropertyImagePreImage .proposition} Proposition 14. Properties of the image and pre-image

Let f:X\rightarrow Y be a mapping and let A\subseteq X and B\subseteq Y. We have that the following properties hold for the image and pre-image

  1. $f\left(X\right)\subseteq Y$

  2. $f\left(f^{-1}\left(Y\right)\right)=f\left(X\right)$

  3. $f\left(f^{-1}\left(B\right)\right)\subseteq B$

  4. $f\left(f^{-1}\left(B\right)\right)=B\cap f\left(X\right)$

  5. $f\left(f^{-1}\left(f\left(A\right)\right)\right)=f\left(A\right)$

  6. $f\left(A\right)=\emptyset\iff A=\emptyset$

  7. $B\subseteq f\left(A\right)\iff\exists C\subseteq A: f\left(C\right)=B$

  8. $f\left(X\setminus A\right)\subseteq f\left(A\right)\iff f\left(A\right)=f\left(X\right)$

  9. $f\left(X\right)\setminus f\left(A\right)\subseteq f\left(X\setminus A\right)$

  10. $f\left(A\cup f^{-1}\left(B\right)\right)\subseteq f\left(A\right)\cup B$

  11. $f\left(A\cap f^{-1}\left(B\right)\right)= f\left(A\right)\cap B$

Likewise the following properties hold for the pre-image

  1. $f^{-1}\left(Y\right)=X$

  2. $f^{-1}\left(f\left(X\right)\right)=X$

  3. $A\subseteq f^{-1}\left(f\left(A\right)\right)$

  4. Suppose that instead of the mapping f:X\rightarrow Y we consider a new mapping based on f, which we call \Bar{f}. We define \Bar{f} to be the mapping

    $$\begin{align*} \Bar{f}:A&\mathlarger{\mathlarger{\rightarrow}}Y\\ x&\mapsto \Bar{f}\left(x\right)=f\left(x\right) \end{align*}$$

    that is \Bar{f} maps every element a\in A to f\left(a\right). With this new mapping we have the following property

    $$\begin{equation*} \left(\Bar{f}\right)^{-1}\left(B\right)=A\cap f^{-1}\left(B\right) \end{equation*}$$

  5. $f^{-1}\left(f\left(f^{-1}\left(B\right)\right)\right)=f^{-1}\left(B\right)$

  6. $f^{-1}\left(B\right)=\emptyset\iff B\subseteq Y\setminus f\left(X\right)$

  7. $A\subseteq f^{-1}\left(B\right)\iff f\left(A\right)\subseteq B$

  8. $f^{-1}\left(Y\setminus B\right)\subseteq f^{-1}\left(B\right)\iff f^{-1}\left(B\right)=X$

  9. $f^{-1}\left(Y\setminus B\right)= X\setminus f^{-1}\left(B\right)$

  10. $A\cup f^{-1}\left(B\right)\subseteq f^{-1}\left(f\left(A\right)\cup B\right)$

  11. $A\cap f^{-1}\left(B\right)\subseteq f^{-1}\left(f\left(A\right)\cap B\right)$

Proof:

We start with the properties of the image.

  1. f\left(X\right)\subseteq Y:

    This holds by definition of the image.

  2. f\left(f^{-1}\left(Y\right)\right)=f\left(X\right):

    Let x\in f\left(f^{-1}\left(Y\right)\right) and recall the definition of the image and pre-image.

    $$\begin{align*} f\left(A\right)&=\left\{f\left(x\right):x\in A\right\}\subseteq f\left(X\right)\subseteq Y\\ f^{-1}\left(B\right)&=\left\{x\in X:f\left(x\right)\in B\right\}\subseteq f^{-1}\left(Y\right)\subseteq X \end{align*}$$

    We have that

    $$\begin{equation*} f\left(f^{-1}\left(Y\right)\right)=\left\{f\left(y\right):y\in f^{-1}\left(Y\right)\right\} \end{equation*}$$

    Hence x\in f\left(f^{-1}\left(Y\right)\right) means that x=f\left(y\right) for some y\in f^{-1}\left(Y\right), additionally we conclude that y\in X. Moreover by the definition of the pre-image we have that f^{-1}\left(Y\right)\subseteq X. It thus follows that x\in f\left(X\right) and so f\left(f^{-1}\left(Y\right)\right)\subseteq f\left(X\right).

    Now suppose that x\in f\left(X\right), that is x=f\left(x'\right) for some x'\in X. Now by definition of the pre-image as x'\in X with f\left(x'\right)\in Y we have that x'\in f^{-1}\left(Y\right). Hence by definition of the set f\left(f^{-1}\left(Y\right)\right) we must conclude that f\left(x'\right)\in f\left(f^{-1}\left(Y\right)\right), which is to say x\in f\left(f^{-1}\left(Y\right)\right). Hence f\left(X\right)\subseteq f\left(f^{-1}\left(Y\right)\right).

    It follows that f\left(f^{-1}\left(Y\right)\right)=f\left(X\right).

  3. f\left(f^{-1}\left(B\right)\right)\subseteq B:

    Suppose that x\in f\left(f^{-1}\left(B\right)\right) where B\subseteq Y. We hence have that x=f\left(b\right) for some b\in f^{-1}\left(B\right). By definition of the pre-image, b\in X with f\left(b\right)\in B, so x\in B and hence f\left(f^{-1}\left(B\right)\right)\subseteq B.

  4. f\left(f^{-1}\left(B\right)\right)=B\cap f\left(X\right):

    Let x\in f\left(f^{-1}\left(B\right)\right) then by property 3 we have that x\in B. Additionally as x\in f\left(f^{-1}\left(B\right)\right) and B\subseteq Y then f\left(f^{-1}\left(B\right)\right)\subseteq f\left(f^{-1}\left(Y\right)\right) and so x\in f\left(f^{-1}\left(Y\right)\right). Now by property 2 we have that f\left(f^{-1}\left(Y\right)\right)=f\left(X\right) thus x\in f\left(X\right) and so x\in B\cap f\left(X\right). It follows that f\left(f^{-1}\left(B\right)\right)\subseteq B\cap f\left(X\right).

    Now suppose that x\in B\cap f\left(X\right). By definition of f\left(X\right) we have x\in f\left(X\right) gives us that x=f\left(x'\right) where x'\in X, moreover we also have that x\in B. Now we have the set f\left(f^{-1}\left(B\right)\right) is given by

    $$\begin{equation} f\left(f^{-1}\left(B\right)\right)=\left{f\left(b\right):b\in f^{-1}\left(B\right)\right} \end{equation*}$$*

    As x=f\left(x'\right) and x\in B we have that f\left(x'\right)\in B, and so x'\in f^{-1}\left(B\right) by definition of the pre-image. Hence by definition of the image we have that x=f\left(x'\right)\in f\left(f^{-1}\left(B\right)\right). It follows that B\cap f\left(X\right)\subseteq f\left(f^{-1}\left(B\right)\right).

    Hence the result f\left(f^{-1}\left(B\right)\right)=B\cap f\left(X\right).

  5. f\left(f^{-1}\left(f\left(A\right)\right)\right)=f\left(A\right):

    By property 4 we have that

    $$\begin{equation} f\left(f^{-1}\left(f\left(A\right)\right)\right)=f\left(A\right)\cap f\left(X\right) \end{equation*}$$ as f\left(A\right)\subseteq Y. Finally f\left(A\right)\cap f\left(X\right)=f\left(A\right) as f\left(A\right)\subseteq f\left(X\right). The result follows.*

  6. f\left(A\right)=\emptyset\iff A=\emptyset:

    \left(\Rightarrow\right): Suppose that f\left(A\right)=\emptyset. By definition of the image we have that

    $$\begin{equation*} f\left(A\right)=\left\{f\left(x\right):x\in A\right\}=\emptyset \end{equation*}$$ Hence there can be no elements f\left(x\right) with x\in A. This can only occur if A=\emptyset, for if not then there is some x'\in A and so f\left(x'\right)\in f\left(A\right), contradicting the fact that f\left(A\right)=\emptyset. It follows that A=\emptyset.

    \left(\Leftarrow\right): Suppose that A=\emptyset, we have that the image of the empty set is given by

    $$\begin{equation} f\left(A\right)=f\left(\emptyset\right)=\left{f\left(x\right):x\in \emptyset\right}=\emptyset \end{equation*}$$*

    It follows that f\left(A\right)=\emptyset.

  7. B\subseteq f\left(A\right)\iff\exists C\subseteq A: f\left(C\right)=B:

    \left(\Rightarrow\right): Suppose that B\subseteq f\left(A\right). We show that \exists C\subseteq A: f\left(C\right)=B. So, suppose that x\in B then we have that x\in f\left(A\right) by assumption. By definition of the image we have that

    $$\begin{equation} f\left(A\right)=\left{f\left(x\right):x\in A\right} \end{equation*}$$*

    Hence we have x\in f\left(A\right) gives us that x=f\left(x'\right) for some x'\in A. We define the required set C as follows.

    $$\begin{equation*} C = \left\{x'\in A : f\left(x'\right)\in B\right\} \end{equation*}$$ That is, C is defined to be the set of those elements x'\in A such that f\left(x'\right)\in B. Clearly C\subseteq A as each x'\in C is by construction an element of A. We also have f\left(C\right)\subseteq B, since f\left(x'\right)\in B for every x'\in C. Conversely, each x\in B can be written as x=f\left(x'\right) for some x'\in A by the above, and then x'\in C as f\left(x'\right)=x\in B, so x\in f\left(C\right). Hence f\left(C\right)=B.

    \left(\Leftarrow\right): Suppose that \exists C\subseteq A: f\left(C\right)=B. As f\left(C\right)=B we have by the definition of the image that

    $$\begin{equation*} f\left(C\right)=\left\{f\left(x\right):x\in C\right\} \end{equation*}$$ Let x\in B, then x\in f\left(C\right) by assumption and so x=f\left(c\right) for some c\in C. Now C\subseteq A so c\in A. Hence x=f\left(c\right)\in f\left(A\right), and we must conclude that B\subseteq f\left(A\right), possibly being equal, for example if C=A.

    The result follows.

  8. f\left(X\setminus A\right)\subseteq f\left(A\right)\iff f\left(A\right)=f\left(X\right):

    \left(\Rightarrow\right): Suppose that f\left(X\setminus A\right)\subseteq f\left(A\right) and recall the definition of the complement of sets. We have that

    $$\begin{equation*} X\setminus A = \left\{x\in X: x\not\in A\right\} \end{equation*}$$ Now, A\subseteq X by hypothesis of the proposition, so f\left(A\right)\subseteq f\left(X\right). For the reverse inclusion let y\in f\left(X\right), so that y=f\left(x\right) for some x\in X. Either x\in A or x\in X\setminus A. If x\in A then y\in f\left(A\right) by definition of the image. If x\in X\setminus A then y belongs to the set

    $$\begin{equation*} f\left(X\setminus A\right)=\left\{f\left(x\right): x\in X\setminus A\right\}=\left\{f\left(x\right):x\in X\text{ and } x\not\in A\right\} \end{equation*}$$ and so y\in f\left(A\right) by assumption. In both cases y\in f\left(A\right), hence f\left(X\right)\subseteq f\left(A\right) and we conclude that f\left(A\right)=f\left(X\right).

    \left(\Leftarrow\right): Suppose that f\left(A\right)=f\left(X\right), by definition of the image we have that

    $$\begin{equation*} f\left(A\right)=\left\{f\left(a\right):a\in A\right\}=\left\{f\left(x\right):x\in X\right\}=f\left(X\right) \end{equation*}$$

    Now consider f\left(X\setminus A\right), this set is given by

    $$\begin{equation*} f\left(X\setminus A\right)=\left\{f\left(x\right): x\in X\setminus A\right\}=\left\{f\left(x\right):x\in X\text{ and } x\not\in A\right\} \end{equation*}$$ Every element of this set is of the form f\left(x\right) with x\in X, and so f\left(X\setminus A\right)\subseteq f\left(X\right)=f\left(A\right). Note that f\left(X\setminus A\right) need not be empty, the inclusion is all that is required. The result has now been shown.

  9. f\left(X\right)\setminus f\left(A\right)\subseteq f\left(X\setminus A\right):

    Let x\in f\left(X\right)\setminus f\left(A\right). By definition we have that

    $$\begin{equation} f\left(X\right)\setminus f\left(A\right)=\left{x\in f\left(X\right):x\not\in f\left(A\right)\right} \end{equation*}$$*

    Hence x\in f\left(X\right)\setminus f\left(A\right) gives us that x\in f\left(X\right) and x\not\in f\left(A\right). As x\in f\left(X\right) there exists y\in X such that x=f\left(y\right). If y\in A then x=f\left(y\right)\in f\left(A\right), contradicting x\not\in f\left(A\right), so y\not\in A, that is y\in X\setminus A. Hence it follows that x\in f\left(X\setminus A\right). That is f\left(X\right)\setminus f\left(A\right)\subseteq f\left(X\setminus A\right).

  10. f\left(A\cup f^{-1}\left(B\right)\right)\subseteq f\left(A\right)\cup B:

    Let x\in f\left(A\cup f^{-1}\left(B\right)\right). We recall the definition of the pre-image of a set, we have that

    $$\begin{equation*} f^{-1}\left(B\right)=\left\{x\in X:f\left(x\right)\in B\right\}\subseteq X \end{equation*}$$

    Hence the image f\left(A\cup f^{-1}\left(B\right)\right) is given by

    $$\begin{align*} f\left(A\cup f^{-1}\left(B\right)\right)&=\left\{f\left(y\right):y\in A\cup f^{-1}\left(B\right)\right\}\\ &=\left\{f\left(y\right):y\in A\text{ or } y\in f^{-1}\left(B\right)\right\}\\ &=\left\{f\left(y\right):y\in A\text{ or } \left(y\in X\text{ and } f\left(y\right)\in B\right)\right\} \end{align*}$$

    Now, x\in f\left(A\cup f^{-1}\left(B\right)\right) gives us that x=f\left(y\right) where either y\in A or y\in X with f\left(y\right)\in B. In the first case, where y\in A with x=f\left(y\right), by definition of the image we have that x\in f\left(A\right) and so x is clearly in the union f\left(A\right)\cup B. In the second case we have that x=f\left(y\right)\in B, and likewise x is in the union f\left(A\right)\cup B.

    Hence x\in f\left(A\right)\cup B and we have that f\left(A\cup f^{-1}\left(B\right)\right)\subseteq f\left(A\right)\cup B. Hence the result.

  11. f\left(A\cap f^{-1}\left(B\right)\right)= f\left(A\right)\cap B:

    Let x\in f\left(A\cap f^{-1}\left(B\right)\right), the image of A\cap f^{-1}\left(B\right) is given by

    $$\begin{align*} f\left(A\cap f^{-1}\left(B\right)\right)&=\left\{f\left(y\right):y\in A\cap f^{-1}\left(B\right)\right\}\\ &=\left\{f\left(y\right):y\in A\text{ and } y\in f^{-1}\left(B\right)\right\}\\ &=\left\{f\left(y\right):y\in A\text{ and } f\left(y\right)\in B\right\} \end{align*}$$

    Now x\in f\left(A\cap f^{-1}\left(B\right)\right) gives us that \exists y\in A with f\left(y\right)\in B such that x=f\left(y\right). As y\in A we have that x\in f\left(A\right), and as x=f\left(y\right)\in B we have that x\in B, and so x is in the intersection f\left(A\right)\cap B. Hence we have that f\left(A\cap f^{-1}\left(B\right)\right)\subseteq f\left(A\right)\cap B.

    Now suppose that x\in f\left(A\right)\cap B. We have that x\in f\left(A\right) and x\in B. From the first of these, having x\in f\left(A\right) means that \exists y\in A such that x=f\left(y\right). As x\in B we have that f\left(y\right)\in B and so y\in f^{-1}\left(B\right) by definition of the pre-image. Hence y is in the set A\cap f^{-1}\left(B\right) and so we have x=f\left(y\right)\in f\left(A\cap f^{-1}\left(B\right)\right), and therefore f\left(A\right)\cap B\subseteq f\left(A\cap f^{-1}\left(B\right)\right).

    The result f\left(A\cap f^{-1}\left(B\right)\right)= f\left(A\right)\cap B follows.
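
Several of the image properties above can be sanity-checked on a small finite example. The following is a minimal sketch, not part of the proof: the helper names `image` and `preimage`, and the particular sets and mapping, are assumptions chosen for illustration, with the mapping stored as a Python dictionary whose keys form the domain.

```python
def image(f, A):
    """Return f(A) = {f(x) : x in A} for a mapping stored as a dict."""
    return {f[x] for x in A}


def preimage(f, B):
    """Return f^{-1}(B) = {x in X : f(x) in B}, where X is the set of keys of f."""
    return {x for x in f if f[x] in B}


# Hypothetical mapping f : X -> Y, chosen to be neither injective nor surjective.
X = {1, 2, 3, 4}
Y = {"a", "b", "c", "d"}
f = {1: "a", 2: "a", 3: "b", 4: "c"}

A = {1, 3}
B = {"a", "d"}

# Property 2: f(f^{-1}(Y)) = f(X)
prop2 = image(f, preimage(f, Y)) == image(f, X)
# Property 4: f(f^{-1}(B)) = B ∩ f(X)
prop4 = image(f, preimage(f, B)) == B & image(f, X)
# Property 9: f(X) \ f(A) ⊆ f(X \ A)
prop9 = image(f, X) - image(f, A) <= image(f, X - A)
# Property 11: f(A ∩ f^{-1}(B)) = f(A) ∩ B
prop11 = image(f, A & preimage(f, B)) == image(f, A) & B

print(prop2, prop4, prop9, prop11)  # True True True True
```

Here "d" is not in the image of the mapping, so the image of the pre-image of B is strictly smaller than B, which is why property 3 is only an inclusion.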

We now turn our attention to the results for the pre-image.

  1. f^{-1}\left(Y\right)=X:

    By definition of the pre-image we have that

    $$\begin{equation} f^{-1}\left(Y\right)=\left{x\in X:f\left(x\right)\in Y\right}\subseteq X \end{equation*}$$*

    Clearly f^{-1}\left(Y\right)\subseteq X by definition. Now if x\in X then we must also clearly have f\left(x\right)\in Y and so X\subseteq f^{-1}\left(Y\right). Hence f^{-1}\left(Y\right)=X.

  2. f^{-1}\left(f\left(X\right)\right)=X:

    We have that the set f^{-1}\left(f\left(X\right)\right) is given by

    $$\begin{equation*} f^{-1}\left(f\left(X\right)\right)=\left\{x\in X: f\left(x\right)\in f\left(X\right)\right\} \end{equation*}$$

    It is hence clear that for any x\in f^{-1}\left(f\left(X\right)\right) we have x\in X, that is f^{-1}\left(f\left(X\right)\right)\subseteq X. Likewise if x\in X then f\left(x\right)\in f\left(X\right) by definition of the image, and so by the definition of f^{-1}\left(f\left(X\right)\right) we have that x\in f^{-1}\left(f\left(X\right)\right). That is X\subseteq f^{-1}\left(f\left(X\right)\right). The result follows.

  3. A\subseteq f^{-1}\left(f\left(A\right)\right):

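    Suppose that x\in A\subseteq X. By definition of the image we have that f\left(x\right)\in f\left(A\right). Hence x\in X with f\left(x\right)\in f\left(A\right), which by definition of the pre-image gives x\in f^{-1}\left(f\left(A\right)\right). It follows that A\subseteq f^{-1}\left(f\left(A\right)\right).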

  4. Suppose that instead of the mapping f:X\rightarrow Y we consider a new mapping based on f, which we call \Bar{f}. We define \Bar{f} to be the mapping

    $$\begin{align*} \Bar{f}:A&\mathlarger{\mathlarger{\rightarrow}}Y\\ x&\mapsto \Bar{f}\left(x\right)=f\left(x\right) \end{align*}$$

    that is \Bar{f} maps every element a\in A to f\left(a\right). With this new mapping we have the following property

    $$\begin{equation*} \left(\Bar{f}\right)^{-1}\left(B\right)=A\cap f^{-1}\left(B\right) \end{equation*}$$

    Let x\in \left(\Bar{f}\right)^{-1}\left(B\right). We have that \left(\Bar{f}\right)^{-1}\left(B\right) is given by

    $$\begin{equation} \left(\Bar{f}\right)^{-1}\left(B\right)=\left{x\in A:\Bar{f}\left(x\right)\in B\right} \end{equation*}$$*

    So x\in \left(\Bar{f}\right)^{-1}\left(B\right) gives that x\in A, moreover as \Bar{f}\left(x\right)\in B and \Bar{f} maps every x\in A to f\left(x\right) then \Bar{f}\left(x\right)=f\left(x\right)\in B. It follows that x\in f^{-1}\left(B\right) and so x\in A\cap f^{-1}\left(B\right). Thus \left(\Bar{f}\right)^{-1}\left(B\right)\subseteq A\cap f^{-1}\left(B\right).

    Now, suppose that x\in A\cap f^{-1}\left(B\right). As x\in A, \Bar{f}\left(x\right) is defined and \Bar{f}\left(x\right)=f\left(x\right). Now x\in f^{-1}\left(B\right) means that f\left(x\right)\in B, so \Bar{f}\left(x\right)=f\left(x\right)\in B and hence $x\in \left(\Bar{f}\right)^{-1}\left(B\right)$. Thus A\cap f^{-1}\left(B\right)\subseteq\left(\Bar{f}\right)^{-1}\left(B\right).

    Hence $\left(\Bar{f}\right)^{-1}\left(B\right)=A\cap f^{-1}\left(B\right)$.

  5. f^{-1}\left(f\left(f^{-1}\left(B\right)\right)\right)=f^{-1}\left(B\right):

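    This follows from property 4 of the image, f\left(f^{-1}\left(B\right)\right)=B\cap f\left(X\right). Indeed we have

    $$\begin{align*} f^{-1}\left(f\left(f^{-1}\left(B\right)\right)\right)&=f^{-1}\left(B\cap f\left(X\right)\right)\\ &=\left\{x\in X: f\left(x\right)\in B\text{ and } f\left(x\right)\in f\left(X\right)\right\}\\ &=\left\{x\in X: f\left(x\right)\in B\right\}\\ &=f^{-1}\left(B\right) \end{align*}$$

    where the third equality holds as f\left(x\right)\in f\left(X\right) for every x\in X.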

  6. f^{-1}\left(B\right)=\emptyset\iff B\subseteq Y\setminus f\left(X\right):

    \left(\Rightarrow\right): Suppose f^{-1}\left(B\right)=\emptyset, by definition of the pre-image we have

    $$\begin{equation} f^{-1}\left(B\right)=\left{x\in X:f\left(x\right)\in B\right}=\emptyset \end{equation*}$$*

    Hence the pre-image being empty means that there are no elements x\in X with f\left(x\right)\in B. Now the set Y\setminus f\left(X\right) is given

    $$\begin{equation} Y\setminus f\left(X\right)=\left{y\in Y: y\not\in f\left(X\right)\right} \end{equation*}$$*

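    Now let b\in B, so that b\in Y as B\subseteq Y. If b\in f\left(X\right) then b=f\left(x\right) for some x\in X, and then f\left(x\right)\in B gives x\in f^{-1}\left(B\right), contradicting f^{-1}\left(B\right)=\emptyset. Thus b\not\in f\left(X\right), that is b\in Y\setminus f\left(X\right), and so B\subseteq Y\setminus f\left(X\right).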

    \left(\Leftarrow\right): Suppose that B\subseteq Y\setminus f\left(X\right). We have that Y\setminus f\left(X\right) is precisely the set of y\in Y with y\not\in f\left(X\right). Suppose for a contradiction that there is some x\in f^{-1}\left(B\right), that is x\in X with f\left(x\right)\in B. Then f\left(x\right)\in Y\setminus f\left(X\right) and so f\left(x\right)\not\in f\left(X\right), contradicting the definition of the image as x\in X. Hence there is no such x and the pre-image of B is empty. This is to say f^{-1}\left(B\right)=\emptyset.

  7. A\subseteq f^{-1}\left(B\right)\iff f\left(A\right)\subseteq B:

    \left(\Rightarrow\right): Suppose that A\subseteq f^{-1}\left(B\right). Recall the definition of the image

    $$\begin{equation} f\left(A\right)=\left{f\left(x\right):x\in A\right} \end{equation*}$$*

    Let y\in f\left(A\right), so that y=f\left(a\right) for some a\in A. As A\subseteq f^{-1}\left(B\right) we have that a\in f^{-1}\left(B\right), and so by definition of the pre-image y=f\left(a\right)\in B. This holds for every y\in f\left(A\right), which gives f\left(A\right)\subseteq B.

    \left(\Leftarrow\right): Now, suppose that f\left(A\right)\subseteq B and let x\in A. By definition of the image f\left(x\right)\in f\left(A\right) and so f\left(x\right)\in B. As A\subseteq X we have that x\in X with f\left(x\right)\in B, and so by definition of the pre-image we have that x\in f^{-1}\left(B\right). This is to say we conclude that A\subseteq f^{-1}\left(B\right).

  8. f^{-1}\left(Y\setminus B\right)\subseteq f^{-1}\left(B\right)\iff f^{-1}\left(B\right)=X:

    Suppose that f^{-1}\left(Y\setminus B\right)\subseteq f^{-1}\left(B\right). We have that pre-image of Y\setminus B is given by

    $$\begin{equation} f^{-1}\left(Y\setminus B\right)=\left{x\in X: f\left(x\right) \in Y\setminus B\right}=\left{x\in X: f\left(x\right)\in Y \text{ and } f\left(x\right)\not\in B\right} \end{equation*}$$*

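    Hence by definition x\in f^{-1}\left(Y\setminus B\right) gives us that x\in X with f\left(x\right)\in Y and f\left(x\right)\not\in B, but then we can't have x\in f^{-1}\left(B\right) by the definition of the pre-image of B. Hence the inclusion f^{-1}\left(Y\setminus B\right)\subseteq f^{-1}\left(B\right) forces f^{-1}\left(Y\setminus B\right)=\emptyset. Now every x\in X has f\left(x\right)\in Y, so either f\left(x\right)\in B or f\left(x\right)\in Y\setminus B. As the second case cannot occur we have f\left(x\right)\in B for every x\in X, that is f^{-1}\left(B\right)=X. Note that this does not require B=Y, only that f\left(X\right)\subseteq B.

    Conversely suppose that f^{-1}\left(B\right)=X. As f^{-1}\left(Y\setminus B\right)\subseteq X by definition of the pre-image, we have that f^{-1}\left(Y\setminus B\right)\subseteq f^{-1}\left(B\right).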

  9. f^{-1}\left(Y\setminus B\right)= X\setminus f^{-1}\left(B\right):

    Suppose that x\in f^{-1}\left(Y\setminus B\right), then by definition we have that x\in X with f\left(x\right)\in Y and f\left(x\right)\not\in B. As f\left(x\right)\not\in B we have that x\not\in f^{-1}\left(B\right), and so x\in X\setminus f^{-1}\left(B\right).

    Conversely if x\in X\setminus f^{-1}\left(B\right) then f\left(x\right)\not\in B but by definition of f we have that f\left(x\right)\in Y and so x\in f^{-1}\left(Y\setminus B\right).

    It follows that f^{-1}\left(Y\setminus B\right)= X\setminus f^{-1}\left(B\right).

  10. A\cup f^{-1}\left(B\right)\subseteq f^{-1}\left(f\left(A\right)\cup B\right):

    Let x\in A\cup f^{-1}\left(B\right). We have that either x\in A or x\in f^{-1}\left(B\right). If x\in A then f\left(x\right)\in f\left(A\right) and so f\left(x\right)\in f\left(A\right)\cup B, the result follows on taking the pre-image as

    $$\begin{equation} f^{-1}\left(f\left(A\right)\cup B\right)=\left{x\in X: f\left(x\right)\in f\left(A\right)\cup B\right} \end{equation*}$$ This is to say that x\in f^{-1}\left(f\left(A\right)\cup B\right)=\left\{x\in X: f\left(x\right)\in f\left(A\right)\cup B\right\}.*

    Now if x\in f^{-1}\left(B\right) then we have by definition that f\left(x\right)\in B and by a similar argument to above we conclude that f\left(x\right)\in f\left(A\right)\cup B so that on taking the pre-image we conclude that x\in f^{-1}\left(f\left(A\right)\cup B\right)=\left\{x\in X: f\left(x\right)\in f\left(A\right)\cup B\right\}.

    Hence it follows that A\cup f^{-1}\left(B\right)\subseteq f^{-1}\left(f\left(A\right)\cup B\right).

  11. A\cap f^{-1}\left(B\right)\subseteq f^{-1}\left(f\left(A\right)\cap B\right):

    Suppose that x\in A\cap f^{-1}\left(B\right) then x\in A and x\in f^{-1}\left(B\right) and so f\left(x\right)\in B. As x\in A then f\left(x\right)\in f\left(A\right) and hence as f\left(x\right)\in f\left(A\right) and f\left(x\right)\in B then f\left(x\right)\in f\left(A\right)\cap B. The result follows on taking the pre-image.

    Hence $A\cap f^{-1}\left(B\right)\subseteq f^{-1}\left(f\left(A\right)\cap B\right)$

The proposition now follows. $\qed$ :::
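
The pre-image properties can also be checked exhaustively on a small example, by looping over every subset A of the domain and every subset B of the co-domain. This is only a sketch under stated assumptions: the helpers `image`, `preimage` and `subsets`, and the sample mapping, are hypothetical names chosen for illustration, and a finite check of this kind is evidence rather than a proof.

```python
from itertools import combinations


def image(f, A):
    """Return f(A) = {f(x) : x in A} for a mapping stored as a dict."""
    return {f[x] for x in A}


def preimage(f, B):
    """Return f^{-1}(B) = {x in X : f(x) in B}, where X is the set of keys of f."""
    return {x for x in f if f[x] in B}


def subsets(S):
    """Return a list of all subsets of the finite set S."""
    items = list(S)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]


# Hypothetical mapping f : X -> Y, neither injective nor surjective.
X = {1, 2, 3, 4}
Y = {"a", "b", "c", "d"}
f = {1: "a", 2: "a", 3: "b", 4: "c"}

checks = []
for A in subsets(X):
    # Property 3: A ⊆ f^{-1}(f(A))
    checks.append(A <= preimage(f, image(f, A)))
    for B in subsets(Y):
        # Property 7: A ⊆ f^{-1}(B) if and only if f(A) ⊆ B
        checks.append((A <= preimage(f, B)) == (image(f, A) <= B))
        # Property 10: A ∪ f^{-1}(B) ⊆ f^{-1}(f(A) ∪ B)
        checks.append(A | preimage(f, B) <= preimage(f, image(f, A) | B))
        # Property 11: A ∩ f^{-1}(B) ⊆ f^{-1}(f(A) ∩ B)
        checks.append(A & preimage(f, B) <= preimage(f, image(f, A) & B))

for B in subsets(Y):
    # Property 5: f^{-1}(f(f^{-1}(B))) = f^{-1}(B)
    checks.append(preimage(f, image(f, preimage(f, B))) == preimage(f, B))
    # Property 6: f^{-1}(B) = ∅ if and only if B ⊆ Y \ f(X)
    checks.append((preimage(f, B) == set()) == (B <= Y - image(f, X)))
    # Property 8: f^{-1}(Y \ B) ⊆ f^{-1}(B) if and only if f^{-1}(B) = X
    checks.append((preimage(f, Y - B) <= preimage(f, B)) == (preimage(f, B) == X))
    # Property 9: f^{-1}(Y \ B) = X \ f^{-1}(B)
    checks.append(preimage(f, Y - B) == X - preimage(f, B))

print(all(checks))  # True
```

In this example the subset B = {"a", "b", "c"} has pre-image equal to the whole domain even though B is not the whole co-domain, which illustrates why property 8 is stated in terms of the pre-image of B rather than B itself.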

Injective, surjective and bijective mappings

Armed with the examples we have seen we can make a few comments about mappings. Consider example [28](#exmp:Mapping 1){reference-type="ref" reference="exmp:Mapping 1"} where we have that X=Y=\mathbb{N} and f is the map

$$\begin{align*} f:X&\mathlarger{\mathlarger{\rightarrow}}Y\ x&\mapsto f\left(x\right)=2x \end{align*}$$

We have that for every x,y\in X with f\left(x\right)=f\left(y\right) that x=y, which is to say if the images of two elements agree, then the elements are in fact the same. This is clear to see: suppose that x,y\in X with f\left(x\right)=f\left(y\right), then we have that

$$\begin{align*} f\left(x\right)&=f\left(y\right)\ 2x&=2y\ x&=y \end{align*}$$

Another way of expressing this idea is that two distinct elements in the domain will have distinct images, we say a mapping with this property is an injective mapping. Now, if we consider \mathop{\mathrm{Image}}\left(f\right)\subseteq Y and consider the map

$$\begin{align*} g:X&\mathlarger{\mathlarger{\rightarrow}}\mathop{\mathrm{Image}}\left(f\right)\ x&\mapsto g\left(x\right)=2x \end{align*}$$ Then, for every y\in\mathop{\mathrm{Image}}\left(f\right), we have that there exists some element x\in X such that y=g\left(x\right). Again, we can show this. Let y\in\mathop{\mathrm{Image}}\left(f\right), then we need to show that \exists x\in X such that g\left(x\right)=y. Now

$$\begin{align*} y&=g\left(x\right)\ y&=2x\ \frac{y}{2}&=x \end{align*}$$

We hence will need to take \displaystyle x=\frac{y}{2}, however we first need to verify that \displaystyle x=\frac{y}{2}\in X. We note that y\in\mathop{\mathrm{Image}}\left(f\right) means that y=2k for some k\in\mathbb{N}, so

$$\begin{align*} x&=\frac{y}{2}\ x&=\frac{2k}{2}\ x&=k \end{align*}$$

As k\in\mathbb{N} and X=\mathbb{N} we have that x=k\in X, so we can rest safe in the knowledge that our choice for x indeed works. As a sanity check we have that

$$\begin{equation*} g\left(x\right)=2x=2\frac{y}{2}=y \end{equation*}$$

This choice of x works for any choice of y. Another way to express this idea is that every element in the co-domain of the mapping is the image of some element in the domain, we say a mapping with this property is a surjective mapping.

It is worth noting that the mapping g is both injective and surjective, this makes g a special type of mapping. If we take an element y\in\mathop{\mathrm{Image}}\left(f\right), then as g is surjective there is an element in the domain, say x, with the property that g\left(x\right)=y. Moreover, if a is any element of the domain with g\left(a\right)=y=g\left(x\right), then as g is injective we know that a=x. This means that each element of the image corresponds to exactly one element of the domain, so we can go between elements of the domain and elements of the image unambiguously. A mapping with this property is called a bijective mapping and the domain and image are said to be in bijection with each other.

We formalise these ideas now to a mapping between any two sets.

::: definition Definition 48. Injective, surjective and bijective maps

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping between two sets X and Y.

  1. We say that f is an injective mapping, sometimes called a one-to-one mapping, if

    $$\begin{equation} \forall x,y\in X,\ f\left(x\right)=f\left(y\right) \Rightarrow x=y \end{equation*}$$*

    That is, if f\left(x\right)=f\left(y\right) for x,y\in X then x=y. If we know that f is injective we can write the mapping as

    $$\begin{equation} f:X\mathlarger{\mathlarger{\hookrightarrow}}Y \end{equation*}$$*

    which is read as f is an injective mapping from X to Y.

  2. We say that f is a surjective mapping, sometimes called an onto mapping, if

    $$\begin{equation} \forall y\in Y,\exists x\in X: y=f\left(x\right) \end{equation*}$$*

    That is, for each y\in Y there exists some x\in X such that f\left(x\right)=y. If we know that f is surjective then we can write the mapping as

    $$\begin{equation*} f:X\mathlarger{\mathlarger{\twoheadrightarrow}}Y \end{equation*}$$

    which is read as f is a surjective mapping from X to $Y$.

  3. We say that f is a bijective mapping, sometimes called a one-to-one and onto mapping, if f is both injective and surjective. If we know that f is a bijection then we can write the mapping as

    $$\begin{equation} f:X% \mathlarger{\mathlarger{\hookrightarrow}}\mathrel{\mspace{-27.5mu}}\mathlarger{\mathlarger{\rightarrow}} Y \end{equation*}$$*

    which is read as f is a bijective mapping from X to Y. :::

We will look for additional examples of each type of mapping.

::: example Example 35. Let f:\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N} where f\left(x\right)=x. We will prove that f is a bijective mapping.

Proof:

To show f is bijective we show that f is injective and surjective. To see that f is an injection, suppose that f\left(x\right)=f\left(y\right) where x,y\in \mathbb{N}, the domain. Then we have that

$$\begin{align} f\left(x\right)&=f\left(y\right)\ x&=y \end{align*}$$ This shows f is injective as this holds for any choice of x,y\in \mathbb{N}. To see that f is surjective consider y\in\mathbb{N}, the co-domain, we show there exists an x\in\mathbb{N}, the domain, so that f\left(x\right)=y. We have*

$$\begin{align} y&=f\left(x\right)\ y&=x \end{align*}$$ so we take x=y. This works for every y\in\mathbb{N}, the co-domain, so f is surjective.*

As f is both injective and surjective it is by definition a bijective map, that is $f:\mathbb{N}% \mathlarger{\mathlarger{\hookrightarrow}}\mathrel{\mspace{-27.5mu}}\mathlarger{\mathlarger{\rightarrow}} \mathbb{N}$. $\qed$ :::

::: example Example 36. Let f:\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N} where

$$\begin{equation*} f\left(x\right)=\begin{cases} x, & \text{if } x \text{ is odd}\\ \frac{x}{2}, & \text{if } x \text{ is even} \end{cases} \end{equation*}$$

Is f injective? To see if it is we would need to show that f\left(x\right)=f\left(y\right) with x,y\in \mathbb{N} means that x=y. It becomes clear that there are x,y\in\mathbb{N} where this does not hold, for example f\left(1\right)=1 and f\left(2\right)=1 so f\left(1\right)=f\left(2\right) but 1\neq 2. This shows that f is not injective. Is f surjective? To see if it is we would need to show that \forall y\in\mathbb{N},\exists x\in\mathbb{N} such that y=f\left(x\right). Note that for every even input x=2k we have that \displaystyle f\left(x\right)=\frac{2k}{2}=k. So for any y\in\mathbb{N} we take x=2y, which is even, and then f\left(2y\right)=y. Hence every y\in\mathbb{N} is the image of some element of the domain, namely 2y. So f is surjective.

As f was not injective we have that f is not a bijection, so we have f:\mathbb{N}\mathlarger{\mathlarger{\twoheadrightarrow}}\mathbb{N}. :::

::: example Example 37. Let X=\left\{1,2\right\} and Y=\left\{3,4,5\right\} and define the map f:X\mathlarger{\mathlarger{\rightarrow}}Y by

$$\begin{equation} f\left(1\right)=3,\ f\left(2\right)=4 \end{equation*}$$*

Then it is clear that f is injective, as each input is mapped to a distinct output. More formally suppose that f\left(x\right)=f\left(y\right) where x,y\in X. We have that by the definition of the mapping f\left(1\right)=3,\ f\left(2\right)=4. In the first case we have f\left(x\right)=f\left(y\right)=3 and so x=y=1, likewise in the second case we have that f\left(x\right)=f\left(y\right)=4 and so x=y=2. This proves injectivity.

To see that f is not surjective, consider the image \mathop{\mathrm{Image}}\left(f\right)=\left\{f\left(x\right):x\in X\right\}=\left\{3,4\right\}\neq Y. So \exists y\in Y, namely y=5, such that \nexists x\in X with y=f\left(x\right).

It hence follows that f is not bijective, that is f:\left\{1,2\right\}\mathlarger{\mathlarger{\hookrightarrow}}\left\{3,4,5\right\}. :::

::: example Example 38. Let X=\left\{1,2,3\right\} and Y=\left\{4,5\right\} and define the map f:X\mathlarger{\mathlarger{\rightarrow}}Y by

$$\begin{equation} f\left(1\right)=4,\ f\left(2\right)=4,\ f\left(3\right)=5 \end{equation*}$$*

We have that f is not injective as f\left(1\right)=f\left(2\right)=4 but 1\neq 2. However we have that f is surjective as the image of f is \mathop{\mathrm{Image}}\left(f\right)=\left\{f\left(x\right):x\in X\right\}=\left\{4,5\right\}=Y.

By definition f is not bijective, hence f:\left\{1,2,3\right\}\mathlarger{\mathlarger{\twoheadrightarrow}}\left\{4,5\right\}. :::
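
For mappings between finite sets the definitions above translate directly into checks that can be run. The following is a minimal sketch applied to Examples 37 and 38; the function names `is_injective`, `is_surjective` and `is_bijective` are assumptions made for illustration, with each mapping stored as a dictionary and the co-domain passed separately, since a dictionary alone does not record it.

```python
def is_injective(f):
    """f is injective when no two inputs share an output."""
    values = list(f.values())
    return len(values) == len(set(values))


def is_surjective(f, codomain):
    """f is surjective when every element of the co-domain is an output."""
    return set(f.values()) == set(codomain)


def is_bijective(f, codomain):
    """f is bijective when it is both injective and surjective."""
    return is_injective(f) and is_surjective(f, codomain)


# Example 37: X = {1, 2}, Y = {3, 4, 5}
f37 = {1: 3, 2: 4}
Y37 = {3, 4, 5}

# Example 38: X = {1, 2, 3}, Y = {4, 5}
f38 = {1: 4, 2: 4, 3: 5}
Y38 = {4, 5}

print(is_injective(f37), is_surjective(f37, Y37), is_bijective(f37, Y37))
# True False False
print(is_injective(f38), is_surjective(f38, Y38), is_bijective(f38, Y38))
# False True False
```

Passing the co-domain explicitly reflects the fact that surjectivity depends on the choice of co-domain and not only on the rule of the mapping.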

We note that we can always construct a mapping g from f:X\rightarrow Y such that g:X\mathlarger{\mathlarger{\rightarrow}}\mathop{\mathrm{Image}}\left(f\right) is a surjection.

::: {#prob:RestOfCodomainToImageIsSurjective .proposition} Proposition 15. The restriction of a mapping's co-domain to its image is a surjective mapping

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping and consider \mathop{\mathrm{Image}}\left(f\right)=\left\{f\left(x\right):x\in X\right\}. Consider the following mapping

$$\begin{align} g:X&\mathlarger{\mathlarger{\rightarrow}}\mathop{\mathrm{Image}}\left(f\right)\ x&\mapsto f\left(x\right) \end{align*}$$*

Then g is a surjective map.

Proof:

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y and consider \mathop{\mathrm{Image}}\left(f\right)=\left\{f\left(x\right):x\in X\right\}. By the definition of the image of a mapping [45](#def:ImageMapping){reference-type="ref" reference="def:ImageMapping"} we have that \mathop{\mathrm{Image}}\left(f\right)\subseteq Y. Moreover, by the definition of the image of a map we have that y\in\mathop{\mathrm{Image}}\left(f\right) if and only if \exists x\in X such that y=f\left(x\right)=g\left(x\right). This holds for all y\in\mathop{\mathrm{Image}}\left(f\right), the co-domain of g, so g is a surjection. $\qed$ :::
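
As a small illustration of the proposition, the sketch below uses a finite stand-in for the doubling map seen earlier. The sets, the helper names `image_of` and `is_surjective`, and the variable `g_codomain` are assumptions made for this example only.

```python
def image_of(f):
    """Image(f) = {f(x) : x in X} for a mapping stored as a dict."""
    return set(f.values())


def is_surjective(f, codomain):
    """f is surjective when its image is the whole co-domain."""
    return image_of(f) == set(codomain)


# Hypothetical finite stand-in for x -> 2x with X = {1,...,5} and Y = {1,...,10}.
X = set(range(1, 6))
Y = set(range(1, 11))
f = {x: 2 * x for x in X}

# g has the same rule as f, but its co-domain is Image(f) rather than Y.
g = dict(f)
g_codomain = image_of(f)

print(is_surjective(f, Y))           # False
print(is_surjective(g, g_codomain))  # True
```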

In the proof we used the idea of restricting the co-domain of the function so that it was the image \mathop{\mathrm{Image}}\left(f\right) rather than Y, while leaving the domain X unchanged. In actuality we didn't restrict the co-domain at all but instead only considered those elements of the co-domain that actually get mapped to. It should be clear that the image \mathop{\mathrm{Image}}\left(f\right), the elements that actually get mapped to, depends only on the rule of the mapping and on the allowable inputs for the function, that is on the domain X. In many fields of mathematics it is sometimes desirable to restrict the domain X that is being worked with to a smaller subset A\subseteq X. A quick example of why this is useful, which we will see later, is inverse mappings. For now the key idea of an inverse map is to have a bijection between the domain and the image of a mapping, which enables us to unambiguously go between the two. Why is this useful?

For an example, suppose that you wanted to go on holiday abroad, then you'll need to convert your currency to the currency that is in use where you are going. Suppose that you use gold coins whereas the country you visit only uses silver coins. The exchange rate from gold coins to silver coins is given by the mapping E\left(x\right) = Ax^2, where the domain is the set of all the numbers that we are familiar with, that is \mathbb{R}, and A is some positive number.

Suppose we wish to convert 50 gold coins into the new currency, then we will have E\left(50\right)=A*50^2=2500A silver coins. Finally suppose that after our holiday we have some silver coins left over that we wish to convert back to gold coins, say 2500A-y where 0<y<2500A, how many gold coins will we get back?

To work this out we will need to find a way to go backwards from \mathop{\mathrm{Image}}\left(E\right) back to the domain. To do this we will solve g=Ax^2 for x, we have that

$$\begin{align*} g&=Ax^2\\ \frac{g}{A}&=x^2\\ x&=\pm\sqrt{\frac{g}{A}} \end{align*}$$ You may wonder where \pm came from and what it means. \pm stands for plus or minus and is used when we are unsure whether the number is positive or negative. It occurs here because for the numbers we are familiar with there are two numbers whose square is a given positive number, for example for the number 2 we have that \sqrt{2}\cdot\sqrt{2}=2 and \left(-\sqrt{2}\right)\left(-\sqrt{2}\right)=2.

So, going back to the currency problem. When we wish to convert our remaining silver coins back into gold coins we will get back

$$\begin{equation*} x=\pm\sqrt{\frac{2500A-y}{A}} \end{equation*}$$

This is a problem. Because the domain of E was any of the usual numbers, we don't know whether we should get back the positive or negative value, as both would have given us the silver coins we had remaining; perhaps on a more relatable note, we would find it very annoying if we got back from holiday and converted our positive money back only to end up with a negative amount of money. To overcome this problem we should ensure that the domain of E consists of only positive numbers, rather than any value. By doing so the negative square root value is no longer valid and we hence get back the correct amount of money.
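
The problem and its fix can be seen numerically. The following is a minimal sketch, assuming a hypothetical value of 2 for the constant A (called `RATE` below); the function names `exchange` and `exchange_back` are likewise assumptions made for illustration.

```python
import math

RATE = 2.0  # hypothetical value of the positive constant A


def exchange(x):
    """E(x) = A * x^2, the gold-to-silver conversion from the text."""
    return RATE * x ** 2


def exchange_back(s):
    """Silver-to-gold conversion, valid once the domain of E is restricted
    to non-negative numbers: only the non-negative square root is returned."""
    return math.sqrt(s / RATE)


# On the whole of R the mapping is not injective: 50 and -50 give the same silver.
print(exchange(50) == exchange(-50))  # True

# On the restricted domain the round trip recovers the original amount.
print(exchange_back(exchange(50)))  # 50.0
```

With the unrestricted domain there is no way to tell from 5000 silver coins alone whether the input was 50 or -50, which is exactly the ambiguity that the restriction removes.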

Although a simple example, this shows the importance of sometimes having to restrict the domain of a mapping. We define the idea of a restriction of the domain now.

::: definition Definition 49. Restriction of a mapping

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping between two sets X and Y. Let A\subseteq X be any subset of X. We define the restriction of f to A, denoted by \displaystyle \mathrel f\restriction_A, by the mapping

$$\begin{align*} \mathrel f\restriction_A:A&\mathlarger{\mathlarger{\rightarrow}}Y\\ x&\mapsto \mathrel f\restriction_A\left(x\right)=f\left(x\right) \end{align*}$$

In particular, restricting a mapping can only shrink the image, so that $\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right)\subseteq\mathop{\mathrm{Image}}\left(f\right)$. :::

Now that we have the idea of restricting a mapping we can see the following

::: {#prop;RestOfInjectionIsInjection .proposition} Proposition 16. Restriction of an injective mapping is injective

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping between two sets X and Y such that f is an injective mapping. Let A\subseteq X be any subset of X. We have that the restriction \mathrel f\restriction_A:A\rightarrow Y is an injective map. In particular we have that \mathrel f\restriction_A:A\rightarrow\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right) is an injection.

Proof:

To show that \mathrel f\restriction_A:A\rightarrow Y is injective we show that \mathrel f\restriction_A\left(x\right)=\mathrel f\restriction_A\left(y\right) for x,y\in A means that x=y. Suppose that \mathrel f\restriction_A is not an injective map, then we have that \exists x,y\in A with x\neq y such that \mathrel f\restriction_A\left(x\right)=\mathrel f\restriction_A\left(y\right). However A\subseteq X and so x,y\in X, and by definition of the restriction \mathrel f\restriction_A\left(x\right)=f\left(x\right) and \mathrel f\restriction_A\left(y\right)=f\left(y\right). Hence f\left(x\right)=f\left(y\right) with x\neq y, contradicting the fact that f is an injection.

We conclude that the restriction map \mathrel f\restriction_A must be injective.

Finally, by definition of \mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right) we have that

$$\begin{equation*} \mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right)=\left\{\mathrel f\restriction_A\left(x\right):x\in A\right\}\subseteq Y \end{equation*}$$

that is, the image is all the elements that \mathrel f\restriction_A maps elements of A to. As \mathrel{f\restriction_A}:A\rightarrow Y is an injection we must conclude that \mathrel f\restriction_A:A\rightarrow\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right) is an injection, for if not then the original restriction map could not have been an injection. $\qed$ :::

::: example Example 39. Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping where X=\left\{1,2,3,4,5,6\right\} and Y=\left\{7,8,9,10,11,12\right\} where

$$\begin{equation*} x\mapsto f\left(x\right)=x+6 \end{equation*}$$

Consider A\subseteq X where A=\left\{1,2,3\right\} and B\subseteq X with B=\left\{1,2,3,4,5\right\}, then A\subseteq B. We have that \mathrel f\restriction_A:A\mathlarger{\mathlarger{\rightarrow}}Y has the image \mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right)=\left\{7,8,9\right\} and we have that \mathrel f\restriction_B:B\mathlarger{\mathlarger{\rightarrow}}Y has the image $\mathop{\mathrm{Image}}\left(\mathrel f\restriction_B\right)=\left\{7,8,9,10,11\right\}$.

Hence under the two different restrictions we observe that \mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right)\subseteq\mathop{\mathrm{Image}}\left(\mathrel f\restriction_B\right). :::

From this example we have the following result.

::: {#prop:ImageOfSubsetIsSubsetOfImage .proposition} Proposition 17. The image of a subset is a subset of the image

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping of sets and let A,B\subseteq X where A\subseteq B, we have that

$$\begin{equation*} \mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right)\subseteq\mathop{\mathrm{Image}}\left(\mathrel f\restriction_B\right)\subseteq\mathop{\mathrm{Image}}\left(f\right) \end{equation*}$$

Proof:

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping of sets and let A,B\subseteq X where A\subseteq B. Consider the restriction mappings \mathrel f\restriction_A:A\mathlarger{\mathlarger{\rightarrow}}Y and \mathrel f\restriction_B:B\mathlarger{\mathlarger{\rightarrow}}Y. Let y\in\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right), then by definition we have that \exists x\in A such that \mathrel f\restriction_A\left(x\right)=y. As A\subseteq B we have that x\in A \Rightarrow x\in B and so \mathrel f\restriction_B\left(x\right)=f\left(x\right)=y, hence y\in\mathop{\mathrm{Image}}\left(\mathrel f\restriction_B\right). This shows that \mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right)\subseteq\mathop{\mathrm{Image}}\left(\mathrel f\restriction_B\right). To see the final inclusion, let y\in\mathop{\mathrm{Image}}\left(\mathrel f\restriction_B\right), so that \exists x\in B such that \mathrel f\restriction_B\left(x\right)=y. As B\subseteq X we have that x\in B \Rightarrow x\in X and so f\left(x\right)=\mathrel f\restriction_B\left(x\right)=y, hence y\in\mathop{\mathrm{Image}}\left(f\right).

This shows the result. $\qed$ :::

We conclude with the following observation.

::: {#prop:InjectiveMapToImageIsBijection .proposition} Proposition 18. Injective mapping to image is a bijection

Let f:X\mathlarger{\mathlarger{\hookrightarrow}}Y be an injective map between two sets X and Y. Let A\subseteq X be any subset of X, possibly being X itself. We have that the mapping \mathrel f\restriction_A:A\mathlarger{\mathlarger{\rightarrow}}\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right) is a bijection.

Proof:

Let f:X\mathlarger{\mathlarger{\hookrightarrow}}Y be an injective mapping and let A\subseteq X. By proposition 16{reference-type="ref" reference="prop;RestOfInjectionIsInjection"} we have that the mapping \mathrel f\restriction_A:A\rightarrow\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right) is an injection. Also, by proposition 15{reference-type="ref" reference="prob:RestOfCodomainToImageIsSurjective"} we have that \mathrel f\restriction_A:A\rightarrow\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right) is a surjection. Hence by definition we have that \mathrel f\restriction_A:A\rightarrow\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right) is a bijection. $\qed$ :::

Compositions of maps

We have seen how a mapping f takes elements in one set, the domain X, and sends them to the elements of another set, the image \mathop{\mathrm{Image}}\left(f\right)\subseteq Y of some co-domain Y. We can extend this idea so that the image \mathop{\mathrm{Image}}\left(f\right) and more generally the co-domain Y act as the domain for some other mapping g. This will allow us to consider some more interesting examples of mappings in general.

::: definition Definition 50. Composition of two mappings

Let f:X\rightarrow Y and g:Y\rightarrow Z be two mappings for some sets X,Y and Z. We define the composition map by

$$\begin{align*} g\circ f: X&\rightarrow Z\\ x&\mapsto g\left(f\left(x\right)\right) \end{align*}$$

That is, the mapping f is done first and then we apply the mapping g.

Additionally, let h:X\rightarrow X be a mapping from X to X, then if h is composed with itself we write h\circ h\left(x\right) = h\left(h\left(x\right)\right)=h^2\left(x\right). If h is composed with itself n times we write h^{n+1}\left(x\right). This is sometimes called the $n+1$-fold composition of h with itself. :::
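The order of composition, and the repeated composition of a map with itself, can be sketched in a few lines. This is only an illustration, assuming mappings are modelled as Python functions; `compose`, `iterate`, `add_one` and `double` are hypothetical names that do not appear in the text.

```python
def compose(g, f):
    """Return the composition "g after f": f is applied first, then g."""
    return lambda x: g(f(x))


def iterate(h, k):
    """Return the map that applies h a total of k times (k >= 1), written h^k in the text."""
    def h_k(x):
        for _ in range(k):
            x = h(x)
        return x
    return h_k


add_one = lambda x: x + 1   # plays the role of f
double = lambda x: 2 * x    # plays the role of g

print(compose(double, add_one)(3))  # 8, as (3 + 1) * 2
print(compose(add_one, double)(3))  # 7, as (3 * 2) + 1
print(iterate(double, 3)(1))        # 8, doubling applied three times
```

Swapping the arguments of `compose` changes the answer here, which already hints that the order in which two mappings are composed matters in general.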

::: example Example 40. Let f:\mathbb{N}\rightarrow\mathbb{N} and g:\mathbb{N}\rightarrow\mathbb{N} be maps such that

$$\begin{align*} f:\mathbb{N}&\rightarrow\mathbb{N}\\ x&\mapsto f\left(x\right)=x^2\\ g:\mathbb{N}&\rightarrow\mathbb{N}\\ x&\mapsto g\left(x\right)=x^3 \end{align*}$$ Then, for an arbitrary x\in\mathbb{N}, we have that

$$\begin{align*} g\circ f\left(x\right)&=g\left(f\left(x\right)\right)=g\left(x^2\right)=\left(x^2\right)^3=x^6\\ f\circ g\left(x\right)&=f\left(g\left(x\right)\right)=f\left(x^3\right)=\left(x^3\right)^2=x^6 \end{align*}$$

In this case g\circ f=f\circ g, and it does not matter in which way we compose the two mappings.

The ideas of injectivity and surjectivity also apply to compositions of maps. We will see if g\circ f is injective.

Recall that a mapping h:X\rightarrow Y is injective if h\left(x\right)=h\left(y\right) for x,y\in X means that x=y. So let x,y\in\mathbb{N} and consider g\circ f\left(x\right)=g\circ f\left(y\right). Then we have that

$$\begin{align*} g\circ f\left(x\right)&=g\circ f\left(y\right)\\ x^6&=y^6\\ x&=y \end{align*}$$

This makes sense as x^6,y^6\in\mathbb{N}, since x^6=x\cdot x\cdot x\cdot x\cdot x\cdot x is multiplication in \mathbb{N}; also, we can take the sixth root of x^6 without issue, and likewise for y^6. It is clear that the composition is not surjective: for example, 2\in\mathbb{N} does not have an element x\in\mathbb{N} such that x^6=2. If we were to include any possible positive number we would have x=\sqrt[6]{2}\approx 1.1224620483094.

Hence g\circ f is not bijective as it is not surjective. Likewise for f\circ g. :::

::: example Example 41. Consider the mappings f:\mathbb{N}\rightarrow\mathbb{N} and g:\mathbb{N}\rightarrow\mathbb{N} given by

$$\begin{align*} f:\mathbb{N}&\rightarrow\mathbb{N}\\ x&\mapsto f\left(x\right)=4x+2\\ g:\mathbb{N}&\rightarrow\mathbb{N}\\ x&\mapsto g\left(x\right)=\sqrt{x} \end{align*}$$

We have that

$$\begin{align*} g\circ f\left(x\right)&=g\left(f\left(x\right)\right)=g\left(4x+2\right)=\sqrt{4x+2}\\ f\circ g\left(x\right)&=f\left(g\left(x\right)\right)=f\left(\sqrt{x}\right)=4\sqrt{x}+2 \end{align*}$$

Unlike last time we have that g\circ f\neq f\circ g. Now is g\circ f injective? Let x,y\in\mathbb{N} and consider

$$\begin{align*} g\circ f\left(x\right)&=g\circ f\left(y\right)\\ \sqrt{4x+2}&=\sqrt{4y+2}\\ 4x+2&=4y+2\\ x&=y \end{align*}$$ So we have injectivity. We do not have surjectivity as, for example, with y=1\in\mathbb{N} we would need

$$\begin{align*} 1&=\sqrt{4x+2}\\ 1&=4x+2\\ -1&=4x\\ x&=-\frac{1}{4}\not\in\mathbb{N} \end{align*}$$

What about f\circ g? For injectivity let x,y\in\mathbb{N}, then

$$\begin{align*} f\circ g\left(x\right)&=f\circ g\left(y\right)\\ 4\sqrt{x}+2&=4\sqrt{y}+2\\ \sqrt{x}&=\sqrt{y}\iff x=y \end{align*}$$ hence we have injectivity. We do not have surjectivity, for example with y=1\in\mathbb{N} we have that

$$\begin{align*} 1&=4\sqrt{x}+2\\ -1&=4\sqrt{x}\\ -\frac{1}{4}&=\sqrt{x}\Rightarrow x\not\in\mathbb{N} \end{align*}$$ :::

::: example Example 42. Consider X=\left\{1,2,3\right\}, Y=\left\{4,5\right\} and Z=\left\{6\right\} and the mappings f:X\mathlarger{\mathlarger{\rightarrow}}Y and g:Y\mathlarger{\mathlarger{\rightarrow}}Z given by

$$\begin{equation*} f\left(1\right)=4,\ f\left(2\right)=4,\ f\left(3\right)=5 \end{equation*}$$

$$\begin{equation*} g\left(4\right)=6,\ g\left(5\right)=6 \end{equation*}$$

Finally, consider the composition map given by

$$\begin{align*} g\circ f:X&\mathlarger{\mathlarger{\rightarrow}}Z\\ x&\mapsto g\left(f\left(x\right)\right) \end{align*}$$

Clearly g\circ f is not injective as g\left(f\left(1\right)\right)=6 and g\left(f\left(2\right)\right)=6 but 1\neq 2. However the composition map is surjective as \mathop{\mathrm{Image}}\left(g\circ f\right)=\left\{6\right\}=Z. :::
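The finite example above can be checked mechanically. The following is a small sketch, assuming the mappings are stored as Python dictionaries; the helper names `is_injective` and `is_surjective` are hypothetical.

```python
X, Y, Z = {1, 2, 3}, {4, 5}, {6}

f = {1: 4, 2: 4, 3: 5}   # f : X -> Y
g = {4: 6, 5: 6}         # g : Y -> Z

# The composition sends x to g(f(x)).
g_after_f = {x: g[f[x]] for x in X}


def is_injective(m):
    """No two elements of the domain share a value."""
    return len(set(m.values())) == len(m)


def is_surjective(m, codomain):
    """Every element of the codomain is hit."""
    return set(m.values()) == set(codomain)


print(g_after_f)                    # {1: 6, 2: 6, 3: 6}
print(is_injective(g_after_f))      # False
print(is_surjective(g_after_f, Z))  # True
```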

We deduce an immediate result.

::: {#prop:DomainOfCompMapisDomainofFirstFunc .proposition} Proposition 19. Domain of composition mapping equals the domain of the first function

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y and g:Y\mathlarger{\mathlarger{\rightarrow}}Z be mappings. Consider the composite mapping g\circ f:X\mathlarger{\mathlarger{\rightarrow}}Z. We have that

$$\begin{equation*} \mathop{\mathrm{Dom}}\left(g\circ f\right)=\mathop{\mathrm{Dom}}\left(f\right) \end{equation*}$$

Proof:

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y and g:Y\mathlarger{\mathlarger{\rightarrow}}Z be mappings and consider the composite mapping g\circ f:X\mathlarger{\mathlarger{\rightarrow}}Z. We need to show that \mathop{\mathrm{Dom}}\left(g\circ f\right)=\mathop{\mathrm{Dom}}\left(f\right).

Let x\in\mathop{\mathrm{Dom}}\left(g\circ f\right), then g\left(f\left(x\right)\right) is well-defined with say z=g\left(f\left(x\right)\right) for some z\in Z. Hence for this to be well-defined we have that \exists y\in Y such that y=f\left(x\right) is well-defined. But then x\in\mathop{\mathrm{Dom}}\left(f\right), hence \mathop{\mathrm{Dom}}\left(g\circ f\right)\subseteq \mathop{\mathrm{Dom}}\left(f\right).

For the reverse inclusion, let x\in\mathop{\mathrm{Dom}}\left(f\right) then f\left(x\right)=y for some y\in Y. As g:Y\mathlarger{\mathlarger{\rightarrow}}Z is a mapping with domain Y, then \exists z\in Z such that g\left(y\right)=z. Hence we have that g\left(y\right)=g\left(f\left(x\right)\right)=z, so g\circ f is well-defined at x. Hence \mathop{\mathrm{Dom}}\left(f\right)\subseteq\mathop{\mathrm{Dom}}\left(g\circ f\right).

As we have that \mathop{\mathrm{Dom}}\left(g\circ f\right)\subseteq \mathop{\mathrm{Dom}}\left(f\right) and \mathop{\mathrm{Dom}}\left(f\right)\subseteq\mathop{\mathrm{Dom}}\left(g\circ f\right), then we conclude by proposition 1{reference-type="ref" reference="prop:TwosetsEqualIfContainedInEachOther"} that \mathop{\mathrm{Dom}}\left(g\circ f\right)=\mathop{\mathrm{Dom}}\left(f\right) as required. $\qed$ :::

These examples show something interesting. In the first example we note that f and g are both injective. Indeed we have for x,y\in\mathbb{N} that

$$\begin{align*} x^2&=y^2\Rightarrow x=y\\ x^3&=y^3\Rightarrow x=y \end{align*}$$

and the composition mappings g\circ f and f\circ g were both injective. In the last example we had that both f and g were surjective as

$$\begin{align*} \mathop{\mathrm{Image}}\left(f\right)&=\left\{f\left(x\right):x\in \left\{1,2,3\right\}\right\}=\left\{4,5\right\}=Y\\ \mathop{\mathrm{Image}}\left(g\right)&=\left\{g\left(x\right):x\in \left\{4,5\right\}\right\}=\left\{6\right\}=Z \end{align*}$$ and the composition map g\circ f was also surjective. This is not a coincidence, as we now prove.

::: {#prop: PropInjecSurjecBijecMapping .proposition} Proposition 20. Injectivity, surjectivity and bijectivity of composition mappings

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y and g:Y\mathlarger{\mathlarger{\rightarrow}}Z be mappings.

  1. If f and g are injective maps then so is $g\circ f$

  2. If f and g are surjective maps then so is $g\circ f$

  3. If f and g are bijective maps then so is $g\circ f$

Proof:

  1. If f and g are injective maps then so is g\circ f:

    Let f:X\mathlarger{\mathlarger{\hookrightarrow}}Y and g:Y\mathlarger{\mathlarger{\hookrightarrow}}Z be injective mappings, then by definition we have that

    $$\begin{align*} \forall a,b\in X,\ f\left(a\right)&=f\left(b\right)\Rightarrow a=b\\ \forall c,d\in Y,\ g\left(c\right)&=g\left(d\right)\Rightarrow c=d \end{align*}$$

    Consider the composition map

    $$\begin{align*} g\circ f:X&\mathlarger{\mathlarger{\rightarrow}}Z\\ x&\mapsto g\left(f\left(x\right)\right) \end{align*}$$ Let x,y\in X be such that g\circ f\left(x\right)=g\circ f\left(y\right), then we have that

    $$\begin{align*} g\left(f\left(x\right)\right)&=g\left(f\left(y\right)\right)\\ f\left(x\right)&=f\left(y\right),\ \text{as } g \text{ is an injective map}\\ x&=y,\ \text{as } f \text{ is an injective map} \end{align*}$$ As this works for every x,y\in X we have that g\circ f is injective.

  2. If f and g are surjective maps then so is g\circ f:

    Let f:X\mathlarger{\mathlarger{\twoheadrightarrow}}Y and g:Y\mathlarger{\mathlarger{\twoheadrightarrow}}Z be surjective mappings, then by definition we have that

    $$\begin{align*} \forall b\in Y,\exists a\in X: f\left(a\right)&=b\\ \forall d\in Z,\exists c\in Y: g\left(c\right)&=d \end{align*}$$

    Consider the composition map

    $$\begin{align*} g\circ f:X&\mathlarger{\mathlarger{\rightarrow}}Z\\ x&\mapsto g\left(f\left(x\right)\right) \end{align*}$$

    Let z\in Z, then \exists y\in Y such that g\left(y\right)=z, also \exists x\in X such that f\left(x\right)=y as both f and g are surjective, then we have that

    $$\begin{equation*} g\left(f\left(x\right)\right)=g\left(y\right)=z \end{equation*}$$ As this works for every z\in Z we have that g\circ f is surjective.

  3. If f and g are bijective maps then so is g\circ f:

    Let $f:X% \mathlarger{\mathlarger{\hookrightarrow}}\mathrel{\mspace{-27.5mu}}\mathlarger{\mathlarger{\rightarrow}} Y$ and $g:Y% \mathlarger{\mathlarger{\hookrightarrow}}\mathrel{\mspace{-27.5mu}}\mathlarger{\mathlarger{\rightarrow}} Z$ be bijective mappings, then by definition we have that f is an injection and a surjection so f satisfies

    $$\begin{align*} \forall a,b\in X,\ f\left(a\right)&=f\left(b\right)\Rightarrow a=b\\ \forall d\in Y,\exists c\in X: f\left(c\right)&=d \end{align*}$$

    Also g is an injection and surjection and so satisfies

    $$\begin{align*} \forall a,b\in Y,\ g\left(a\right)&=g\left(b\right)\Rightarrow a=b\\ \forall d\in Z,\exists c\in Y: g\left(c\right)&=d \end{align*}$$

    Consider the composition map

    $$\begin{align*} g\circ f:X&\mathlarger{\mathlarger{\rightarrow}}Z\\ x&\mapsto g\left(f\left(x\right)\right) \end{align*}$$

    By part 1. we have that g\circ f is an injection as f and g are injections, by part 2. we have that g\circ f is a surjection as f and g are surjections. Hence by definition as g\circ f is both injective and surjective it is bijective.

$\qed$ :::
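Although the proof above settles the matter, the statements can also be sanity-checked by brute force on small finite sets. The sketch below assumes mappings are dictionaries and enumerates every pair of mappings between sets with at most three elements; the function names are hypothetical and the check is an illustration, not a replacement for the proof.

```python
from itertools import product


def all_maps(domain, codomain):
    """Yield every mapping from domain to codomain as a dict."""
    domain = list(domain)
    for values in product(codomain, repeat=len(domain)):
        yield dict(zip(domain, values))


def is_injective(m):
    return len(set(m.values())) == len(m)


def is_surjective(m, codomain):
    return set(m.values()) == set(codomain)


def check_composition_properties(max_size=3):
    """Return True if parts 1 and 2 hold for all maps between sets of size 1..max_size."""
    for nx, ny, nz in product(range(1, max_size + 1), repeat=3):
        X, Y, Z = range(nx), range(ny), range(nz)
        for f in all_maps(X, Y):
            for g in all_maps(Y, Z):
                gf = {x: g[f[x]] for x in X}
                # Part 1: f, g injective implies g after f injective.
                if is_injective(f) and is_injective(g) and not is_injective(gf):
                    return False
                # Part 2: f, g surjective implies g after f surjective.
                if is_surjective(f, Y) and is_surjective(g, Z) and not is_surjective(gf, Z):
                    return False
    return True


print(check_composition_properties())  # True
```

Part 3 needs no separate check, since a bijection is by definition a mapping that is both injective and surjective.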

In a sense we can also deduce properties about the mappings f and g if we know something about the composition map g\circ f

::: {#prop:CompositeMapInectSurjectProp .proposition} Proposition 21. Properties of mappings from composite map

Let f:X\rightarrow Y and g: Y\mathlarger{\mathlarger{\rightarrow}}Z be mappings and consider the composite map g\circ f:X\mathlarger{\mathlarger{\rightarrow}}Z.

  1. If g\circ f:X\mathlarger{\mathlarger{\hookrightarrow}}Z is an injective map, then f:X\mathlarger{\mathlarger{\rightarrow}}Y is an injective map.

  2. If g\circ f:X\mathlarger{\mathlarger{\twoheadrightarrow}}Z is a surjective map, then g:Y\mathlarger{\mathlarger{\rightarrow}}Z is a surjective map.

Proof:

  1. If g\circ f:X\mathlarger{\mathlarger{\hookrightarrow}}Z is an injective map, then f:X\mathlarger{\mathlarger{\rightarrow}}Y is an injective map:

    Let g\circ f:X\mathlarger{\mathlarger{\hookrightarrow}}Z be an injective composite mapping, so that g\left(f\left(x\right)\right)=g\left(f\left(y\right)\right)\Rightarrow x=y for all x,y\in X. We need to show that \forall x,y\in X we have f\left(x\right)=f\left(y\right)\Rightarrow x=y.

    So indeed, suppose that for some x,y\in X that f\left(x\right)=f\left(y\right), then we have that

    $$\begin{align*} g\circ f\left(x\right)&=g\left(f\left(x\right)\right)\\ &=g\left(f\left(y\right)\right)\\ &=g\circ f\left(y\right) \end{align*}$$ Now, as g\circ f is an injective map we conclude that x=y, hence f\left(x\right)=f\left(y\right)\Rightarrow x=y. Hence f:X\mathlarger{\mathlarger{\rightarrow}}Y is an injection.

  2. If g\circ f:X\mathlarger{\mathlarger{\twoheadrightarrow}}Z is a surjective map, then g:Y\mathlarger{\mathlarger{\rightarrow}}Z is a surjective map:

    Let g\circ f:X\mathlarger{\mathlarger{\twoheadrightarrow}}Z be a surjective composite mapping, then \forall z\in Z, \exists x\in X: z=g\circ f\left(x\right). We need to show that \forall z\in Z,\exists y\in Y: z=g\left(y\right). Let z\in Z, then as g\circ f is surjective there is some x\in X such that z=g\circ f\left(x\right).

    Now, we have by proposition 19{reference-type="ref" reference="prop:DomainOfCompMapisDomainofFirstFunc"} that \mathop{\mathrm{Dom}}\left(g\circ f\right)=\mathop{\mathrm{Dom}}\left(f\right) and so x\in\mathop{\mathrm{Dom}}\left(f\right), so that f\left(x\right)\in\mathop{\mathrm{Image}}\left(f\right). This is to say \exists y\in Y: y=f\left(x\right) and hence z=g\left(y\right). As this can be done for any z\in Z we conclude that g:Y\mathlarger{\mathlarger{\twoheadrightarrow}}Z is a surjection.

$\qed$ :::
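The proposition says nothing about g when the composition is injective, nor about f when the composition is surjective, and indeed nothing more can be said in general. The following sketch gives one small example, assuming dictionaries as mappings; the sets and helper names are hypothetical choices for illustration.

```python
def is_injective(m):
    return len(set(m.values())) == len(m)


def is_surjective(m, codomain):
    return set(m.values()) == set(codomain)


X, Y, Z = {1}, {1, 2}, {1}
f = {1: 1}          # f : X -> Y, misses 2, so it is not surjective
g = {1: 1, 2: 1}    # g : Y -> Z, sends 1 and 2 to the same value, so it is not injective
gf = {x: g[f[x]] for x in X}

# The composition is a bijection from X to Z ...
print(is_injective(gf), is_surjective(gf, Z))  # True True
# ... and, as the proposition predicts, f is injective and g is surjective ...
print(is_injective(f), is_surjective(g, Z))    # True True
# ... yet g is not injective and f is not surjective.
print(is_injective(g), is_surjective(f, Y))    # False False
```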

The examples also allow us to deduce something about the image of composition mappings

:::: {#Prop:ImageOfCompMap .proposition} Proposition 22. The image of a composite mapping

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y and g:Y\mathlarger{\mathlarger{\rightarrow}}Z be mappings for sets X,Y and Z. Consider the composition mapping given by

$$\begin{align*} g\circ f:X&\mathlarger{\mathlarger{\rightarrow}}Z\\ x&\mapsto g\left(f\left(x\right)\right) \end{align*}$$

We have that \mathop{\mathrm{Image}}\left(g\circ f\right)=g\left(\mathop{\mathrm{Image}}\left(f\right)\right) where we recall the notation $f\left(X\right)=\left\{f\left(x\right):x\in X\right\}$.

Proof:

We have that

$$\begin{equation*} \mathop{\mathrm{Image}}\left(g\circ f\right)=\left\{g\left(f\left(x\right)\right):x\in X\right\} \end{equation*}$$

also, we have that

$$\begin{equation*} g\left(\mathop{\mathrm{Image}}\left(f\right)\right)=\left\{g\left(y\right):y\in\mathop{\mathrm{Image}}\left(f\right)\right\} \end{equation*}$$ Now, y\in\mathop{\mathrm{Image}}\left(f\right) means that y\in\left\{f\left(x\right):x\in X\right\}, hence y=f\left(x\right) for some x\in X, hence

$$\begin{equation*} g\left(\mathop{\mathrm{Image}}\left(f\right)\right)=\left\{g\left(f\left(x\right)\right):x\in X\right\} \end{equation*}$$

Hence the two definitions agree, that is \mathop{\mathrm{Image}}\left(g\circ f\right)=g\left(\mathop{\mathrm{Image}}\left(f\right)\right).

We do need to check the case of Y=\emptyset. If Y=\emptyset then g:\emptyset\mathlarger{\mathlarger{\rightarrow}}Z has no elements in its domain, so \mathop{\mathrm{Image}}\left(g\right)=\emptyset by the remark after the definition of the image of a function; that is, g is a mapping that maps nothing to nothing. Also X=\emptyset, so that f:\emptyset\mathlarger{\mathlarger{\rightarrow}}\emptyset, we prove this

::: lemma Lemma 3. Mapping into the empty set is valid if and only if the domain is empty

Let X be some set, then f:X\mathlarger{\mathlarger{\rightarrow}}\emptyset is a mapping if and only if X=\emptyset

Proof:

\left(\Rightarrow\right): Suppose that X\neq\emptyset then \exists s\in X, that is there is at least one element in X. As f is a mapping, f\left(s\right) must be an element of the co-domain, but the co-domain is empty so there is no element that s could be mapped to, hence f is not a well-defined mapping, so we conclude that X=\emptyset.

\left(\Leftarrow\right): Suppose that X=\emptyset, then f:\emptyset\mathlarger{\mathlarger{\rightarrow}}\emptyset holds as a mapping, mapping nothing to nothing. \qed :::

So the lemma shows f:\emptyset\mathlarger{\mathlarger{\rightarrow}}\emptyset. Hence

$$\begin{equation*} \mathop{\mathrm{Image}}\left(g\circ f\right)=\emptyset=g\left(\emptyset\right)=g\left(\mathop{\mathrm{Image}}\left(f\right)\right) \end{equation*}$$

As required. $\qed$ ::::

From the proposition we also deduce the following

::: proposition Proposition 23. Image of composite mapping is a subset of the image of the second function

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y and g:Y\mathlarger{\mathlarger{\rightarrow}}Z be mappings. Consider the composite mapping g\circ f:X\mathlarger{\mathlarger{\rightarrow}}Z. We have that

$$\begin{equation*} \mathop{\mathrm{Image}}\left(g\circ f\right)\subseteq\mathop{\mathrm{Image}}\left(g\right) \end{equation*}$$

Proof:

We know by proposition 22{reference-type="ref" reference="Prop:ImageOfCompMap"} that \mathop{\mathrm{Image}}\left(g\circ f\right)=g\left(\mathop{\mathrm{Image}}\left(f\right)\right) where

$$\begin{equation*} g\left(\mathop{\mathrm{Image}}\left(f\right)\right)=\left\{g\left(y\right):y\in\mathop{\mathrm{Image}}\left(f\right)\right\} \end{equation*}$$

Now, observe that \mathop{\mathrm{Image}}\left(f\right)\subseteq Y, and in particular \mathop{\mathrm{Image}}\left(f\right)\subseteq\mathop{\mathrm{Dom}}\left(g\right). Hence, with proposition 17{reference-type="ref" reference="prop:ImageOfSubsetIsSubsetOfImage"}, we deduce that

$$\begin{equation*} g\left(\mathop{\mathrm{Image}}\left(f\right)\right)=\left\{g\left(y\right):y\in\mathop{\mathrm{Image}}\left(f\right)\right\}\subseteq g\left(\mathop{\mathrm{Dom}}\left(g\right)\right)=\left\{g\left(y\right):y\in\mathop{\mathrm{Dom}}\left(g\right)\right\}=\mathop{\mathrm{Image}}\left(g\right) \end{equation*}$$

Hence we have \mathop{\mathrm{Image}}\left(g\circ f\right)=\left\{g\left(y\right):y\in\mathop{\mathrm{Image}}\left(f\right)\right\}\subseteq\mathop{\mathrm{Image}}\left(g\right), which is to say \mathop{\mathrm{Image}}\left(g\circ f\right)\subseteq\mathop{\mathrm{Image}}\left(g\right) as required. $\qed$ :::
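Both statements about the image of a composite mapping can be seen on a small example in which the inclusion is strict. This is a sketch under the assumption that mappings are dictionaries; the particular sets and the helper `image_of` are hypothetical.

```python
def image_of(m, subset):
    """Return {m(s) : s in subset}, the image of a subset of the domain under m."""
    return {m[s] for s in subset}


f = {1: 4, 2: 4, 3: 5}        # f : {1, 2, 3} -> {4, 5, 7}
g = {4: 6, 5: 8, 7: 9}        # g : {4, 5, 7} -> {6, 8, 9}
gf = {x: g[f[x]] for x in f}  # iterating over a dict yields its keys, the domain

image_f = image_of(f, f)      # {4, 5}
image_g = image_of(g, g)      # {6, 8, 9}
image_gf = image_of(gf, gf)   # {6, 8}

print(image_gf == image_of(g, image_f))  # True: Image(g after f) = g(Image(f))
print(image_gf <= image_g)               # True: a subset of Image(g)
print(image_gf == image_g)               # False: here the inclusion is strict
```

The element 7 is never reached by f, so the value 9 is in the image of g but not in the image of the composition.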

We have seen earlier that function composition need not be commutative, for example when f:\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N} with f\left(x\right)=4x+2 and g:\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N} with g\left(x\right)=\sqrt{x}. We saw that

$$\begin{align*} g\circ f\left(x\right)&=\sqrt{4x+2}\\ f\circ g\left(x\right)&=4\sqrt{x}+2 \end{align*}$$

What can we say about associativity and function composition?

::: example Example 43. Let f:\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N}, g:\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N} and h:\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N} where

$$\begin{align*} f:\mathbb{N}&\mathlarger{\mathlarger{\rightarrow}}\mathbb{N}\\ x&\mapsto f\left(x\right)=4x+2\\ g:\mathbb{N}&\mathlarger{\mathlarger{\rightarrow}}\mathbb{N}\\ x&\mapsto g\left(x\right)= x^2\\ h:\mathbb{N}&\mathlarger{\mathlarger{\rightarrow}}\mathbb{N}\\ x&\mapsto h\left(x\right) = \sqrt{x} \end{align*}$$

Consider the following

$$\begin{align*} h\circ\left(g\circ f\right)\left(x\right)&=h\left(g\left(f\left(x\right)\right)\right)=h\left(\left(4x+2\right)^2\right)=\sqrt{\left(4x+2\right)^2}=4x+2\\ \left(h\circ g\right)\circ f\left(x\right)&=\left(h\circ g\right)\left(f\left(x\right)\right)=h\left(g\left(4x+2\right)\right)=\sqrt{\left(4x+2\right)^2}=4x+2 \end{align*}$$ In this case the function composition is associative. :::

This is not a coincidence

::: proposition Proposition 24. Function composition is associative

Let f:W\mathlarger{\mathlarger{\rightarrow}}X, g:X\mathlarger{\mathlarger{\rightarrow}}Y and h:Y\mathlarger{\mathlarger{\rightarrow}}Z be mappings. We have that

$$\begin{equation*} h\circ\left(g\circ f\right)=\left(h\circ g\right)\circ f \end{equation*}$$

Proof:

Let f:W\mathlarger{\mathlarger{\rightarrow}}X, g:X\mathlarger{\mathlarger{\rightarrow}}Y and h:Y\mathlarger{\mathlarger{\rightarrow}}Z and consider the composite mappings h\circ\left(g\circ f\right):W\mathlarger{\mathlarger{\rightarrow}}Z and \left(h\circ g\right)\circ f:W\mathlarger{\mathlarger{\rightarrow}}Z.

Let w\in W, then we have that as f:W\mathlarger{\mathlarger{\rightarrow}}X is a mapping then \exists x\in X such that x=f\left(w\right), likewise as g:X\mathlarger{\mathlarger{\rightarrow}}Y, then \exists y\in Y such that y=g\left(x\right)=g\left(f\left(w\right)\right). Finally as h:Y\mathlarger{\mathlarger{\rightarrow}}Z is a mapping then \exists z\in Z such that z=h\left(y\right)=h\left(g\left(f\left(w\right)\right)\right).

Likewise, for the same w\in W we have x=f\left(w\right). Now as \left(h\circ g\right)\circ f\left(w\right)=\left(h\circ g\right)\left(f\left(w\right)\right) we need to see where h\circ g maps f\left(w\right). As h\circ g\left(x\right)=h\left(g\left(x\right)\right) we have that \left(h\circ g\right)\left(f\left(w\right)\right)=h\left(g\left(f\left(w\right)\right)\right). We know that g\left(f\left(w\right)\right)=y and h\left(g\left(f\left(w\right)\right)\right)=z, so both composite mappings send w to the same element z.

Hence h\circ\left(g\circ f\right)=\left(h\circ g\right)\circ f as w was an arbitrary element of W. $\qed$ :::
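The two bracketings can be compared numerically for the three mappings of the example with f(x)=4x+2, g(x)=x^2 and h(x)=\sqrt{x}. The sketch below assumes mappings are Python functions and uses `math.isqrt` as a stand-in for the square root, which is exact only because every input it receives here is a perfect square; `compose` is a hypothetical helper.

```python
import math


def compose(g, f):
    """Return "g after f": f is applied first, then g."""
    return lambda x: g(f(x))


f = lambda x: 4 * x + 2
g = lambda x: x * x
h = math.isqrt  # integer square root, exact on perfect squares

left = compose(h, compose(g, f))    # h after (g after f)
right = compose(compose(h, g), f)   # (h after g) after f

# Compare both bracketings with each other and with 4x + 2 on a range of inputs.
agree = all(left(x) == right(x) == 4 * x + 2 for x in range(1, 101))
print(agree)  # True
```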

Inverse mappings

With the theory of composite mappings now understood, we are in a position to try and understand how to undo a given map f:X\mathlarger{\mathlarger{\rightarrow}}Y. Why did we need to develop a theory of composite mappings? The idea comes from the fact that undoing a mapping should somehow be the same as never doing anything in the first place. This is to say, if we denote the inverse map by f^{-1} then we should expect that f^{-1}\circ f\left(x\right)=x; likewise the original mapping f somehow undoes f^{-1}, i.e. f\circ f^{-1}\left(y\right)=y where y is in the co-domain of f. As always in mathematics, examples will help to understand what's going on.

You may recall from a course in physics that an object thrown in a vacuum so that there is no air resistance, where only gravity acts has the following equation for its height

$$\begin{equation*} H\left(t\right)=V_0\sin\left(\theta\right)t-\frac{1}{2}gt^2 \end{equation*}$$ where V_0 is the object's launch velocity in metres per second m/s, \theta is the angle that the projectile is launched at from the horizontal, g is the acceleration due to gravity in metres per second squared m/s^2 and t is time in seconds s. Suppose the particle is launched with a velocity of 10 m/s at an angle of 45 degrees to the horizontal and we take g=9.8 m/s^2, then for example, the height above the origin of the projectile at t=1s is

$$\begin{equation*} H\left(1\right)=10\sin\left(45\right)\left(1\right)-\frac{1}{2}\left(9.8\right)\left(1\right)^2=5\sqrt{2}-\frac{49}{10}\approx 2.17\text{ m} \end{equation*}$$

Now suppose you are told that the maximum height is achieved at a time of \displaystyle t=\frac{25\sqrt{2}}{49}\approx 0.721 s which is \displaystyle h=\frac{125}{49}\approx 2.551 m. Considering time values \displaystyle 0<t<\frac{25\sqrt{2}}{49}\approx 0.721, find the time that the projectile was first at 2 m above the ground. In essence we need to take H\left(t\right) and somehow undo the process to find some t such that H\left(t\right)=2. How do we do this? Well, set H\left(t\right)=h then solve for t as follows

$$\begin{align*} h&=10\sin\left(45\right)t-\frac{1}{2}\left(9.8\right)t^2\\ h&=5\sqrt{2}t-\frac{49}{10}t^2\\ \frac{49}{10}t^2-5\sqrt{2}t+h&=0\\ 49t^2-50\sqrt{2}t+10h&=0 \end{align*}$$

Now, from school we have learnt the quadratic formula, applying this here we will get two answers for t

$$\begin{align*} t&=\frac{-\left(-50\sqrt{2}\right)\pm\sqrt{\left(-50\sqrt{2}\right)^2-4\left(49\right)\left(10h\right)}}{2\left(49\right)}\\ t&=\frac{50\sqrt{2}\pm\sqrt{5000-1960h}}{98} \end{align*}$$

Hence when h=2 we get the following times

$$\begin{align*} t&=\frac{50\sqrt{2}\pm\sqrt{5000-1960\left(2\right)}}{98}\\ t&=\frac{50\sqrt{2}\pm\sqrt{5000-3920}}{98}\\ t&=\frac{50\sqrt{2}\pm\sqrt{1080}}{98}\\ t&=\frac{50\sqrt{2}\pm 6\sqrt{30}}{98}\\ t&=\frac{25\sqrt{2}\pm 3\sqrt{30}}{49} \end{align*}$$

That is, \displaystyle t= \frac{25\sqrt{2}+3\sqrt{30}}{49}\approx 1.057 s or \displaystyle t= \frac{25\sqrt{2}-3\sqrt{30}}{49}\approx 0.386 s. This example illustrates a key point about inverse maps: when we undo a given map we should get back the original input. Thankfully in this case we were told that the projectile reaches its maximum height at a time of about 0.721 s, hence the value we are looking for is the smaller \displaystyle t= \frac{25\sqrt{2}-3\sqrt{30}}{49}\approx 0.386 s. In fact if we want to find the time the projectile was first at h m above the ground we will always take the smaller of the two values for t found. That is, we define a new map T given by

$$\begin{equation*} T\left(h\right)=\frac{50\sqrt{2}-\sqrt{5000-1960h}}{98} \end{equation*}$$

So, when the particle is launched with a velocity of 10 m/s at an angle of 45 degrees to the horizontal with g=9.8 m/s^2, and using our knowledge of the fact that the maximum height is achieved at a time of \displaystyle t=\frac{25\sqrt{2}}{49}\approx 0.721 s which is \displaystyle h=\frac{125}{49}\approx 2.551 m, the mapping

$$\begin{equation*} T\left(h\right)=\frac{50\sqrt{2}-\sqrt{5000-1960h}}{98} \end{equation*}$$

is the inverse to

$$\begin{equation*} H\left(t\right)=10\sin\left(45\right)t-\frac{1}{2}\left(9.8\right)t^2 \end{equation*}$$

Indeed, for example we have that

$$\begin{align*} H\circ T\left(h\right)&=H\left(T\left(h\right)\right)\\ &=H\left(\frac{50\sqrt{2}-\sqrt{5000-1960h}}{98}\right)\\ &=10\sin\left(45\right)\left(\frac{50\sqrt{2}-\sqrt{5000-1960h}}{98}\right)-\frac{1}{2}\left(9.8\right)\left(\frac{50\sqrt{2}-\sqrt{5000-1960h}}{98}\right)^2\\ &=5\sqrt{2}\left(\frac{50\sqrt{2}-\sqrt{5000-1960h}}{98}\right)-\frac{1}{2}\left(9.8\right)\frac{\left(50\sqrt{2}-\sqrt{5000-1960h}\right)^2}{98^2}\\ &=\frac{500}{98}-\frac{5\sqrt{2}\sqrt{5000-1960h}}{98}-\frac{1}{2}\left(9.8\right)\frac{\left(5000-100\sqrt{10000-3920h}+\left(5000-1960h\right)\right)}{98^2}\\ &=\frac{250}{49}-\frac{5\sqrt{10000-3920h}}{98}-\frac{1}{2}\frac{\left(5000-100\sqrt{10000-3920h}+\left(5000-1960h\right)\right)}{980}\\ &=\frac{250}{49}-\frac{5\sqrt{10000-3920h}}{98}-\frac{1}{2}\frac{\left(10000-100\sqrt{10000-3920h}-1960h\right)}{980}\\ &=\frac{250}{49}-\frac{5\sqrt{10000-3920h}}{98}-\frac{\left(5000-50\sqrt{10000-3920h}-980h\right)}{980}\\ &=\frac{250}{49}-\frac{5\sqrt{10000-3920h}}{98}-\frac{5000}{980}+\frac{50\sqrt{10000-3920h}}{980}+\frac{980h}{980}\\ &=\frac{250}{49}-\frac{5\sqrt{10000-3920h}}{98}-\frac{250}{49}+\frac{5\sqrt{10000-3920h}}{98}+h\\ &=h \end{align*}$$
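The algebra above can be cross-checked numerically. The following sketch assumes the launch speed of 10 m/s that the displayed formula for H uses, an angle of 45 degrees and g=9.8; the constant names and sample values are hypothetical choices. It also shows why the time values had to be restricted to the rising part of the flight.

```python
import math

V0 = 10.0          # launch speed in m/s, as used in the displayed formula for H
THETA_DEG = 45.0   # launch angle in degrees
G = 9.8            # acceleration due to gravity in m/s^2


def H(t):
    """Height of the projectile at time t."""
    return V0 * math.sin(math.radians(THETA_DEG)) * t - 0.5 * G * t ** 2


def T(h):
    """Smaller root of 49 t^2 - 50 sqrt(2) t + 10 h = 0: the first time at height h."""
    return (50 * math.sqrt(2) - math.sqrt(5000 - 1960 * h)) / 98


t_max = 25 * math.sqrt(2) / 49  # time of maximum height, about 0.721 s

# H after T returns the height we started with (heights below the maximum).
for h in (0.0, 0.5, 1.0, 2.0, 2.5):
    assert abs(H(T(h)) - h) < 1e-9

# T after H returns the time we started with, provided t is before t_max.
for t in (0.1, 0.3, 0.5, 0.7):
    assert abs(T(H(t)) - t) < 1e-9

print(round(T(2.0), 3))     # 0.386
print(round(T(H(1.0)), 3))  # 0.443, not 1.0, because t = 1 lies after t_max
```

For a time after the maximum, T returns the earlier time at which the projectile had the same height, so T only undoes H on the restricted domain.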

Again, we have this idea that inverse functions should somehow return any mapping back to where it started.

We can start to express this idea in terms of a so-called identity mapping.

::: definition Definition 51. Let \mathop{\mathrm{id}}_X:X\mathlarger{\mathlarger{\rightarrow}}X be a mapping from X to itself, so that

$$\begin{align*} \mathop{\mathrm{id}}_X:X&\mathlarger{\mathlarger{\rightarrow}}X\\ x&\mapsto\mathop{\mathrm{id}}_X\left(x\right)=x \end{align*}$$

We say that \mathop{\mathrm{id}}_X is the identity mapping on the set X. Suppose we also have a mapping \mathop{\mathrm{id}}_Y:Y\mathlarger{\mathlarger{\rightarrow}}Y with \mathop{\mathrm{id}}_Y\left(y\right)=y for every y\in Y, then \mathop{\mathrm{id}}_Y is the identity map on the set Y, and whenever X\neq Y it is clear that \mathop{\mathrm{id}}_X\neq \mathop{\mathrm{id}}_Y. :::

Indeed we can prove that these identity maps do nothing under function composition.

::: proposition Proposition 25. Composition with the identity mapping does nothing

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping and consider the identity maps \mathop{\mathrm{id}}_X:X\mathlarger{\mathlarger{\rightarrow}}X and \mathop{\mathrm{id}}_Y:Y\mathlarger{\mathlarger{\rightarrow}}Y. We have that

  1. $f\circ \mathop{\mathrm{id}}_X=f$

  2. $\mathop{\mathrm{id}}_Y\circ f=f$

Proof:

We simply need to compose the maps to see the desired results.

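  1. f\circ \mathop{\mathrm{id}}_X=f:

    Let x\in X then f\circ \mathop{\mathrm{id}}_X\left(x\right)=f\left(\mathop{\mathrm{id}}_X\left(x\right)\right)=f\left(x\right). As this holds for every x\in X, and both mappings have domain X and co-domain Y, we have f\circ \mathop{\mathrm{id}}_X=f.

  2. \mathop{\mathrm{id}}_Y\circ f=f:

    Let x\in X then \mathop{\mathrm{id}}_Y\circ f\left(x\right)=\mathop{\mathrm{id}}_Y\left(f\left(x\right)\right)=f\left(x\right). As this holds for every x\in X, and both mappings have domain X and co-domain Y, we have \mathop{\mathrm{id}}_Y\circ f=f.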

Hence the result follows. :::
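As a quick sanity check of the proposition on a small example, the sketch below models finite mappings as Python dictionaries. The helper names `compose` and `identity`, and the particular sets X and Y, are illustrative assumptions and do not come from the text.

```python
def compose(outer, inner):
    """Return the composite mapping outer∘inner as a dict (inner is applied first)."""
    return {x: outer[inner[x]] for x in inner}


def identity(s):
    """Return the identity mapping on the finite set s."""
    return {x: x for x in s}


# Hypothetical finite sets and a mapping f: X -> Y, chosen only for illustration.
X = {1, 2, 3}
Y = {"a", "b"}
f = {1: "a", 2: "b", 3: "a"}

f_after_id_X = compose(f, identity(X))  # f∘id_X
id_Y_after_f = compose(identity(Y), f)  # id_Y∘f

print(f_after_id_X == f, id_Y_after_f == f)
```

Both comparisons hold because two dictionaries are equal exactly when they have the same keys (the domain) and assign the same value to each key, which mirrors how equality of mappings is checked in the proof.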

For completeness we will prove some trivial properties about the identity mapping.

::: {#prop:IdentityMapProperties .proposition} Proposition 26. Properties of the identity mapping

Let \mathop{\mathrm{id}}_X:X\mathlarger{\mathlarger{\rightarrow}}X be the identity map on X. Then the following hold

  1. \mathop{\mathrm{id}}_X is an injective map

  2. \mathop{\mathrm{id}}_X is a surjective map

  3. \mathop{\mathrm{id}}_X is a bijective map

  4. $\mathop{\mathrm{id}}_X\circ\mathop{\mathrm{id}}_X=\mathop{\mathrm{id}}_X$

Proof:

  1. \mathop{\mathrm{id}}_X is an injective map:

    Let x,y\in X and suppose that \mathop{\mathrm{id}}_X\left(x\right)=\mathop{\mathrm{id}}_X\left(y\right). Then x=\mathop{\mathrm{id}}_X\left(x\right)=\mathop{\mathrm{id}}_X\left(y\right)=y, hence \mathop{\mathrm{id}}_X is injective.

  2. \mathop{\mathrm{id}}_X is a surjective map:

    Let y\in X. Taking x=y we have \mathop{\mathrm{id}}_X\left(x\right)=x=y, so y is the image of some element of X. As this works for every y\in X, \mathop{\mathrm{id}}_X is surjective.

  3. \mathop{\mathrm{id}}_X is a bijective map:

    By parts 1. and 2. we have that \mathop{\mathrm{id}}_X is injective and surjective and thus by definition is bijective.

  4. \mathop{\mathrm{id}}_X\circ\mathop{\mathrm{id}}_X=\mathop{\mathrm{id}}_X:

    Let x\in X and consider \mathop{\mathrm{id}}_X\circ\mathop{\mathrm{id}}_X\left(x\right)=\mathop{\mathrm{id}}_X\left(\mathop{\mathrm{id}}_X\left(x\right)\right)=\mathop{\mathrm{id}}_X\left(x\right)=x=\mathop{\mathrm{id}}_X\left(x\right).

\qed. :::

The identity mapping will allow us to define the idea of a left and right inverse of a mapping.

::: definition Definition 52. Left inverse

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping. We define g:Y\mathlarger{\mathlarger{\rightarrow}}X to be a left inverse of f if

$$\begin{equation*} g\circ f=\mathop{\mathrm{id}}_X \end{equation*}$$ :::

::: definition Definition 53. Right inverse

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping. We define g:Y\mathlarger{\mathlarger{\rightarrow}}X to be a right inverse of f if

$$\begin{equation*} f\circ g=\mathop{\mathrm{id}}_Y \end{equation*}$$ :::

::: example Example 44. Let f:\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N} be such that f\left(x\right)=x+1. Define the mapping g:\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N} by

$$\begin{equation*} g\left(x\right)=\begin{cases} x-1,\ \text{If }x\neq 0,\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$

Then g is a left inverse of f. Indeed we have that

$$\begin{equation*} g\circ f\left(x\right)=g\left(f\left(x\right)\right)=g\left(x+1\right)=\left(x+1\right)-1=x \end{equation*}$$ as x+1>0 for every x\in\mathbb{N}. Observe also that f is an injective map, indeed let x,y\in\mathbb{N} and suppose f\left(x\right)=f\left(y\right) then

$$\begin{align*} f\left(x\right)&=f\left(y\right)\\ x+1&=y+1\\ x&=y \end{align*}$$

It is also worth noting that g is not injective as we have g\left(1\right)=0=g\left(0\right) but 1\neq 0. We note that f is a right inverse of g, as the calculation above shows that g\circ f=\mathop{\mathrm{id}}_{\mathbb{N}}. :::
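The example can be checked numerically on an initial segment of the natural numbers. The sketch below is only a spot check on the sample 0 to 49, which is an arbitrary choice, and is not a proof. It also shows that g is not a right inverse of f, since f(g(0)) = 1.

```python
def f(x):
    """The mapping f(x) = x + 1 on the natural numbers."""
    return x + 1


def g(x):
    """The mapping g from the example: x - 1 when x is non-zero, otherwise 0."""
    return x - 1 if x != 0 else 0


sample = range(0, 50)  # arbitrary finite sample of natural numbers

# g∘f agrees with the identity on every sampled input.
left_ok = all(g(f(x)) == x for x in sample)

# f∘g fails to be the identity exactly where g "forgets" its input.
right_fail = [x for x in sample if f(g(x)) != x]

print(left_ok, right_fail)
```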

::: example Example 45. Let X=\mathbb{R} and define Y=\mathbb{R}^+=\left\{x\in\mathbb{R}:x\geq 0\right\}, the set of non-negative real numbers. Let f:\mathbb{R}\mathlarger{\mathlarger{\rightarrow}}\mathbb{R}^+ be defined by f\left(x\right)=x^2. We can define two possible right inverses of f. The first is given by g_1:\mathbb{R}^+\mathlarger{\mathlarger{\rightarrow}}\mathbb{R} where g_1\left(x\right)=\sqrt{x}. Indeed

$$\begin{equation*} f\circ g_1\left(x\right)=f\left(g_1\left(x\right)\right)=f\left(\sqrt{x}\right)=\left(\sqrt{x}\right)^2=x=\mathop{\mathrm{id}}_{\mathbb{R}^+}\left(x\right) \end{equation*}$$ The second, as you may have guessed, is given by g_2:\mathbb{R}^+\mathlarger{\mathlarger{\rightarrow}}\mathbb{R} where g_2\left(x\right)=-\sqrt{x} where likewise we have

$$\begin{equation*} f\circ g_2\left(x\right)=f\left(g_2\left(x\right)\right)=f\left(-\sqrt{x}\right)=\left(-\sqrt{x}\right)^2=x=\mathop{\mathrm{id}}_{\mathbb{R}^+}\left(x\right) \end{equation*}$$

We note that f is surjective. Let y\in\mathbb{R}^+ then f\left(x\right)=y\Rightarrow x^2=y\Rightarrow x=\pm\sqrt{y}\in\mathbb{R}, hence every element of \mathbb{R}^+ is mapped to by some input. It is clear that f is not injective as f\left(2\right)=4=f\left(-2\right).

Does f have a left inverse? By the definition of a left inverse we will need to find some g:\mathbb{R}^+\mathlarger{\mathlarger{\rightarrow}}\mathbb{R} such that g\circ f=\mathop{\mathrm{id}}_{\mathbb{R}}. So for each input x of f, g will have to send f\left(x\right) back to x, hence we require that f be injective, for if not then \exists x,y\in\mathbb{R} such that f\left(x\right)=f\left(y\right) with x\neq y and we have the problem where g could send f\left(x\right) back to either x or y, and if it is sent back to y then we don't have the identity mapping!

Now, f is not injective as we have seen that f\left(2\right)=4=f\left(-2\right), so if there were a left inverse g it wouldn't know where to send 4 back to, it could have been either 2 or -2. :::

::: example Example 46. Let X=\mathbb{R} and Y=\mathbb{R}\setminus\left\{0\right\}=\left\{x\in\mathbb{R}:x\neq 0\right\}. You may have seen the function e^x before, we shall consider this mapping, that is the mapping f:\mathbb{R}\mathlarger{\mathlarger{\rightarrow}}\mathbb{R}\setminus\left\{0\right\} given by f\left(x\right)=e^x=\exp\left(x\right). We can define a left inverse to f by g:\mathbb{R}\setminus\left\{0\right\}\mathlarger{\mathlarger{\rightarrow}}\mathbb{R} where g\left(x\right)=\ln\left|x\right|, where \ln is the natural logarithm, the logarithm to the base e. We take the absolute value because \ln is only defined for positive real numbers, while g must be defined on every element of \mathbb{R}\setminus\left\{0\right\}. We will discuss logarithms in more detail later but for now we can think of \ln\left(x\right)=y as asking the question e^y=x, that is, what value of y do we need to raise e to in order to get x. This g is indeed a left inverse of f as

$$\begin{equation*} g\circ f\left(x\right)=g\left(f\left(x\right)\right)=g\left(e^x\right)=\ln\left|e^x\right|=\ln\left(e^x\right)=x=\mathop{\mathrm{id}}_{\mathbb{R}}\left(x\right) \end{equation*}$$

Like in the previous example, we can ask the question: does f have a right inverse? By definition, for f to have a right inverse there needs to be a mapping g:\mathbb{R}\setminus\left\{0\right\}\mathlarger{\mathlarger{\rightarrow}}\mathbb{R} such that f\circ g=\mathop{\mathrm{id}}_{\mathbb{R}\setminus\left\{0\right\}}. So for each y\in\mathbb{R}\setminus\left\{0\right\} we need f to send g\left(y\right) back to y. This can only happen if every element of the co-domain of f has some input that generates it, that is f is a surjection. If this is not the case then there is some element y\in\mathbb{R}\setminus\left\{0\right\} that is not equal to f\left(x\right) for any x\in\mathbb{R}.

For example we have that \not\exists x\in\mathbb{R} such that e^x=-1. So f is not surjective, and in this case we are not able to define a right inverse. :::

We can generalise the last two examples to the next two propositions.

::: {#prop:LeftInverseIffInjective .proposition} Proposition 27. Condition for the existence of a left inverse

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping with X\neq\emptyset. We have that f has a left inverse g:Y\mathlarger{\mathlarger{\rightarrow}}X such that g\circ f=\mathop{\mathrm{id}}_X if and only if f is an injective mapping.

Proof:

\left(\Rightarrow\right): Suppose that f has a left inverse g:Y\mathlarger{\mathlarger{\rightarrow}}X such that g\circ f=\mathop{\mathrm{id}}_X. We know by proposition 26{reference-type="ref" reference="prop:IdentityMapProperties"} that \mathop{\mathrm{id}}_X is an injective mapping, moreover we know by proposition 21{reference-type="ref" reference="prop:CompositeMapInectSurjectProp"} that if a composite map g\circ f is injective then so is f. Hence as g\circ f = \mathop{\mathrm{id}}_X and \mathop{\mathrm{id}}_X is injective, we conclude that f is an injective map.

\left(\Leftarrow\right): Suppose that f is an injective map, then \forall x,y\in X we have that f\left(x\right)=f\left(y\right)\Rightarrow x=y. As X\neq\emptyset we can fix an element x_0\in X. We need to construct a map which acts as a left inverse to f.

Consider the map h:\mathop{\mathrm{Image}}\left(f\right)\mathlarger{\mathlarger{\rightarrow}}X, where we send y\in\mathop{\mathrm{Image}}\left(f\right) back to the element x\in X that it was mapped from, that is the x with f\left(x\right)=y. Such an x exists as y is in the image of f, and it is unique as f is injective, so h is a well-defined mapping. Now, define g as follows

$$\begin{align*} g:Y&\mathlarger{\mathlarger{\rightarrow}}X\\ y&\mapsto g\left(y\right)=\begin{cases} x_0,\ \text{If } y\in Y\setminus\mathop{\mathrm{Image}}\left(f\right)\\ h\left(y\right),\ \text{If } y\in\mathop{\mathrm{Image}}\left(f\right) \end{cases} \end{align*}$$

We note that if \mathop{\mathrm{Image}}\left(f\right) = Y then the first case never occurs, however if \mathop{\mathrm{Image}}\left(f\right) \subset Y then the first case is needed, and the fixed element x_0 is available as X\neq\emptyset.

Now with this g we have that for every x\in X

$$\begin{equation*} g\circ f\left(x\right)=g\left(f\left(x\right)\right)=h\left(f\left(x\right)\right)=x=\mathop{\mathrm{id}}_X\left(x\right) \end{equation*}$$

Hence g is indeed a left inverse of f.

The proposition now follows. $\qed$ :::
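The construction in the proof can be sketched for finite sets, with mappings stored as Python dictionaries. The function name `left_inverse`, the `default` parameter (playing the role of the fixed element of X used for points outside the image) and the example sets are all illustrative assumptions.

```python
def left_inverse(f, codomain, default):
    """Build g: codomain -> domain with g(f(x)) = x, assuming f is injective.

    Elements of the codomain outside the image of f are sent to `default`,
    which must be some fixed element of the domain.
    """
    if len(set(f.values())) != len(f):
        raise ValueError("f is not injective, so no left inverse exists")
    h = {y: x for x, y in f.items()}  # h: Image(f) -> X, undoing f on its image
    return {y: h.get(y, default) for y in codomain}


# Hypothetical example: an injective but not surjective mapping.
X = {0, 1, 2}
Y = {"a", "b", "c", "d"}
f = {0: "a", 1: "b", 2: "c"}

g = left_inverse(f, Y, default=0)
print(all(g[f[x]] == x for x in X), g["d"])
```

The value chosen for `default` does not affect the check g(f(x)) = x, which is why a left inverse is in general not unique when f is not surjective.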

::: {#prop:RightInverseIffSurjective .proposition} Proposition 28. Condition for the existence of a right inverse

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping with X\neq\emptyset. We have that f has a right inverse g:Y\mathlarger{\mathlarger{\rightarrow}}X such that f\circ g=\mathop{\mathrm{id}}_Y if and only if f is a surjective mapping.

Proof:

\left(\Rightarrow\right): Suppose that f has a right inverse g:Y\mathlarger{\mathlarger{\rightarrow}}X such that f\circ g=\mathop{\mathrm{id}}_Y. We know by proposition 26{reference-type="ref" reference="prop:IdentityMapProperties"} that \mathop{\mathrm{id}}_Y is a surjective mapping, moreover we know by proposition 21{reference-type="ref" reference="prop:CompositeMapInectSurjectProp"} that if a composite map f\circ g is surjective then so is f. Hence as f\circ g = \mathop{\mathrm{id}}_Y and \mathop{\mathrm{id}}_Y is surjective, we conclude that f is a surjective map.

\left(\Leftarrow\right): Suppose that f is a surjective map, then \forall y\in Y,\exists x\in X: f\left(x\right)=y. We need to construct a g:Y\mathlarger{\mathlarger{\rightarrow}}X such that f\circ g=\mathop{\mathrm{id}}_Y. For a given y there may be more than one such x with f\left(x\right)=y, if this is the case we pick, for that y, one of the possible choices of x. Hence we can define g\left(y\right)=x for every y\in Y, where x is the element chosen for y. Then we have that f\circ g\left(y\right)=f\left(g\left(y\right)\right)=f\left(x\right)=y=\mathop{\mathrm{id}}_Y\left(y\right) for every y\in Y, so f\circ g=\mathop{\mathrm{id}}_Y.

The proposition now follows. $\qed$ :::
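For finite sets the choice of a preimage in the proof can be made explicit. The sketch below stores mappings as dictionaries and always keeps the smallest preimage; that selection rule, the function name `right_inverse` and the example mapping are assumptions made purely for illustration, and any other choice would work equally well.

```python
def right_inverse(f, codomain):
    """Build g: codomain -> domain with f(g(y)) = y, assuming f is surjective."""
    g = {}
    for x in sorted(f):        # fixed order, so the choice of preimage is reproducible
        g.setdefault(f[x], x)  # keep the first preimage found for each output
    if set(g) != set(codomain):
        raise ValueError("f is not surjective, so no right inverse exists")
    return g


# Hypothetical example: squaring on a few integers, surjective onto Y.
f = {-2: 4, -1: 1, 0: 0, 1: 1, 2: 4}
Y = {0, 1, 4}

g = right_inverse(f, Y)
print(all(f[g[y]] == y for y in Y), g)
```

Replacing `sorted(f)` by `sorted(f, reverse=True)` would pick the non-negative preimages instead, giving a different right inverse of the same mapping.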

These two propositions give the following immediate results

::: {#LeftInverseOfInjectionIsSurjective .proposition} Proposition 29. Left inverse of injective mapping is a surjection

Let f:X\rightarrow Y be an injection with left inverse g:Y\rightarrow X. We have that g is a surjection.

Proof: Let f and g be as stated. Then by definition of a left inverse we have that g\circ f =\mathop{\mathrm{id}}_X. Moreover we have that the identity mapping \mathop{\mathrm{id}}_X is a surjection as it is bijective. We then have by proposition 21{reference-type="ref" reference="prop:CompositeMapInectSurjectProp"} that, as the composite g\circ f is surjective, g is a surjection. $\qed$ :::

::: {#RightInverseOfSurjecctionisInection .proposition} Proposition 30. Right inverse of surjective mapping is an injection

Let f:X\rightarrow Y be a surjection with right inverse g:Y\rightarrow X. We have that g is an injection.

Proof: Let f and g be as stated. Then by definition of a right inverse we have that f\circ g =\mathop{\mathrm{id}}_Y. Moreover we have that the identity mapping \mathop{\mathrm{id}}_Y is an injection as it is bijective. We then have by proposition 21{reference-type="ref" reference="prop:CompositeMapInectSurjectProp"} that, as the composite f\circ g is injective, g is an injection. $\qed$ :::

The ideas of a left and right inverse will allow us to construct the idea of a so-called two-sided inverse, that is an inverse which is both a left inverse and a right inverse. This will allow us to consider when a mapping can be inverted without regard to the order in which we compose the mappings. However there is one final result about left and right inverses that will be required in order to pave the way.

::: {#prop:BijectionHasLeftRightInverse .proposition} Proposition 31. Bijection has a left and right inverse

Let f:X\rightarrow Y be a bijective mapping. We have that there exists a left inverse g:Y\rightarrow X and there exists a right inverse h:Y\rightarrow X such that

$$\begin{align} g\circ f &= \mathop{\mathrm{id}}_X\ f\circ h&=\mathop{\mathrm{id}}_Y \end{align*}$$*

Proof:

Let f:X\rightarrow Y be a bijection. As f is a bijection we know that f is both injective and surjective. Now by proposition 27{reference-type="ref" reference="prop:LeftInverseIffInjective"} a left inverse exists if and only if f is an injective mapping. Likewise by proposition 28{reference-type="ref" reference="prop:RightInverseIffSurjective"} a right inverse exists if and only if f is a surjective mapping. Hence we have the existence of a left and right inverse, as required. $\qed$ :::

::: {#prop:LeftRightInverseImpliesBijection .proposition} Proposition 32. The existence of a left and right inverse implies a bijection

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping such that \exists g_1:Y\mathlarger{\mathlarger{\rightarrow}}X such that g_1\circ f=\mathop{\mathrm{id}}_X and \exists g_2:Y\mathlarger{\mathlarger{\rightarrow}}X such that f\circ g_2=\mathop{\mathrm{id}}_Y. We have that f is a bijection.

Proof:

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping such that \exists g_1:Y\mathlarger{\mathlarger{\rightarrow}}X such that g_1\circ f=\mathop{\mathrm{id}}_X and \exists g_2:Y\mathlarger{\mathlarger{\rightarrow}}X such that f\circ g_2=\mathop{\mathrm{id}}_Y. We have by proposition 27{reference-type="ref" reference="prop:LeftInverseIffInjective"} that as g_1 is a left inverse of f then f must be injective. Likewise by proposition 28{reference-type="ref" reference="prop:RightInverseIffSurjective"} that as g_2 is a right inverse of f then f must be surjective. It hence follows by definition that f is a bijective mapping. $\qed$ :::

These propositions are useful in proving the following.

::: proposition Proposition 33. Bijection if and only if left and right inverses exist

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping. We have that f is bijective if and only if \exists g_1:Y\mathlarger{\mathlarger{\rightarrow}}X such that g_1\circ f=\mathop{\mathrm{id}}_X and \exists g_2:Y\mathlarger{\mathlarger{\rightarrow}}X such that f\circ g_2=\mathop{\mathrm{id}}_Y.

Proof:

\left(\Rightarrow\right): Let f: X\rightarrow Y be a bijective mapping. By proposition 31{reference-type="ref" reference="prop:BijectionHasLeftRightInverse"}, f being a bijection gives the existence of a left and right inverse.

\left(\Leftarrow\right): Suppose we have a mapping f:X\rightarrow Y such that \exists g_1:Y\mathlarger{\mathlarger{\rightarrow}}X such that g_1\circ f=\mathop{\mathrm{id}}_X and \exists g_2:Y\mathlarger{\mathlarger{\rightarrow}}X such that f\circ g_2=\mathop{\mathrm{id}}_Y. Then f has both a left inverse and a right inverse, hence by proposition 32{reference-type="ref" reference="prop:LeftRightInverseImpliesBijection"} we have that f is a bijection.

The result is shown. $\qed$ :::

We have seen that if f:X\rightarrow Y is a bijection then f has both a left and a right inverse, likewise if these two inverses exist then we have that f is a bijection. This property is key to defining what we mean by the inverse to a bijective mapping.

::: definition Definition 54. Inverse

Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping. We say that the mapping g:Y\mathlarger{\mathlarger{\rightarrow}}X is an inverse6 of f if we have that g is both a left inverse and a right inverse for f. This is to say, g is an inverse of f if we have that

$$\begin{align} g\circ f&=\mathop{\mathrm{id}}_X\ f\circ g &=\mathop{\mathrm{id}}_Y \end{align*}$$*

We sometimes use the notation f^{-1} to denote the inverse. :::

::: example Example 47. Let f:\mathbb{R}^+\mathlarger{\mathlarger{\rightarrow}}\mathbb{R}^+ be such that f\left(x\right)=x^2, then we have that g:\mathbb{R}^+\mathlarger{\mathlarger{\rightarrow}}\mathbb{R}^+ with g\left(x\right)=\sqrt{x} is an inverse of f. Indeed

$$\begin{align*} g\circ f\left(x\right)&=g\left(f\left(x\right)\right)=g\left(x^2\right)=\sqrt{x^2}=x=\mathop{\mathrm{id}}_{\mathbb{R}^+}\left(x\right)\\ f\circ g\left(x\right)&=f\left(g\left(x\right)\right)=f\left(\sqrt{x}\right)=\left(\sqrt{x}\right)^2=x=\mathop{\mathrm{id}}_{\mathbb{R}^+}\left(x\right) \end{align*}$$ :::

::: example Example 48. The identity mapping \mathop{\mathrm{id}}_X:X\mathlarger{\mathlarger{\rightarrow}}X with \mathop{\mathrm{id}}_X\left(x\right)=x,\ \forall x\in X is its own inverse, indeed

$$\begin{equation*} \mathop{\mathrm{id}}_X\circ\mathop{\mathrm{id}}_X\left(x\right)=\mathop{\mathrm{id}}_X\left(\mathop{\mathrm{id}}_X\left(x\right)\right)=\mathop{\mathrm{id}}_X\left(x\right)=x=\mathop{\mathrm{id}}_X\left(x\right) \end{equation*}$$ :::

::: example Example 49. Let f:\left\{1,2\right\}\mathlarger{\mathlarger{\rightarrow}}\left\{a,b\right\} be such that f\left(1\right)=a and f\left(2\right)=b. We have that g:\left\{a,b\right\}\mathlarger{\mathlarger{\rightarrow}}\left\{1,2\right\} with g\left(a\right)=1 and g\left(b\right)=2 is an inverse to f. Indeed g is a right inverse of f, as

$$\begin{align*} f\left(g\left(a\right)\right)&=f\left(1\right)=a\\ f\left(g\left(b\right)\right)&=f\left(2\right)=b \end{align*}$$

It also follows that g is a left inverse of f, indeed

$$\begin{align*} g\left(f\left(1\right)\right)&=g\left(a\right)=1\\ g\left(f\left(2\right)\right)&=g\left(b\right)=2 \end{align*}$$ :::
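For a finite bijection stored as a dictionary, an inverse can be obtained by swapping each input with its output. The sketch below does this for the mapping in the example and checks both compositions. Representing the elements a and b as the strings "a" and "b" is an assumption of the sketch.

```python
# The mapping f from the example, with a and b represented as strings.
f = {1: "a", 2: "b"}

# Swap every (input, output) pair; this is only a mapping because f is a bijection.
g = {y: x for x, y in f.items()}

g_after_f = {x: g[f[x]] for x in f}  # g∘f, should be the identity on {1, 2}
f_after_g = {y: f[g[y]] for y in g}  # f∘g, should be the identity on {a, b}

print(g, g_after_f, f_after_g)
```

If f were not injective, two keys would collide when swapping and `g` would silently lose a pair, which is one way to see why only bijections have a two-sided inverse.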

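::: example Example 50. Let f:\mathbb{R}\mathlarger{\mathlarger{\rightarrow}}\mathbb{R}^+\setminus\left\{0\right\} be given by f\left(x\right)=e^x, where \mathbb{R}^+\setminus\left\{0\right\} is the set of strictly positive real numbers. We have that g:\mathbb{R}^+\setminus\left\{0\right\}\mathlarger{\mathlarger{\rightarrow}}\mathbb{R} where g\left(x\right)=\ln\left(x\right) is an inverse of f, as \ln\left(e^x\right)=x for every x\in\mathbb{R} and e^{\ln\left(x\right)}=x for every x>0. We exclude 0 from the co-domain as e^x is never equal to 0. :::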

We shall prove that the composition of a mapping and its inverse gives the identity mapping. Firstly, we will need to show the following propositions.

::: {#prop:MappingInjectiveSurjectiveIFFInverseIsMapping .proposition} Proposition 34. Mapping is injective and surjective if and only if the inverse is a mapping

Let f:X\rightarrow Y be a mapping. We have that f is a bijection if and only if f^{-1}, the inverse of f, is a mapping.

Proof:

\left(\Rightarrow\right): Let f:X\rightarrow Y be a bijection, then f is both surjective and injective. Let y\in Y, then as f is surjective we have that \exists x\in X such that f\left(x\right)=y, moreover by injectivity of f we have that there is only one such x which does this. Define g:Y\rightarrow X by

$$\begin{equation} g\left(y\right)=x \end{equation*}$$*

As y\in Y is an arbitrary element, it follows that

$$\begin{equation} \forall y\in Y:\exists x\in X : g\left(y\right)=x \end{equation*}$$ such that x is unique for a given y. That is g is a mapping. Now by the definition of g we have that*

$$\begin{equation} \forall y\in Y: f\left(g\left(y\right)\right)=y \end{equation*}$$ Now, let x\in X and let*

$$\begin{equation} x'=g\left(f\left(x\right)\right) \end{equation*}$$ then*

$$\begin{equation} f\left(x'\right)=f\left(g\left(f\left(x\right)\right)\right)=f\left(x\right) \end{equation*}$$ by the above. However, f is an injection so we have that x'=x and thus x=g\left(f\left(x\right)\right).*

It follows that f and g are inverse mappings of each other.

\left(\Leftarrow\right): Suppose that f:X\rightarrow Y is a mapping, moreover suppose that f^{-1}:Y\rightarrow X is also a mapping which is the inverse of f. We show that f must be a bijection.

  1. f is injective:

    Let x,y\in X and suppose that $f\left(x\right)=f\left(y\right)$

    $$\begin{align} f\left(x\right)&=f\left(y\right)\ f^{-1}\left(f\left(x\right)\right)&=f^{-1}\left(f\left(y\right)\right)\ \Rightarrow x&=y,\ \text{As } f^{-1} \text{ is the inverse of f} \end{align*}$$ Hence we have that f is injective.*

  2. f is surjective:

    Suppose that y\in Y. We then have that

    $$\begin{align*} y&\in Y\\ \Rightarrow f^{-1}\left(y\right)&\in X,\ \text{As } f^{-1} \text{ is the inverse of } f\\ \Rightarrow f\left(f^{-1}\left(y\right)\right)&=y,\ \text{By definition of an inverse mapping}\\ \Rightarrow \exists x\in X: f\left(x\right)&= y,\ \text{Where } x=f^{-1}\left(y\right) \end{align*}$$ Hence we have that f is surjective.

As f is both injective and surjective it is a bijection. $\qed$ :::

We can now show that the inverse of a bijective mapping is also a bijective mapping.

::: {#prop:InverseBijectionIsBijection .proposition} Proposition 35. Inverse of a bijective mapping is a bijective mapping

Let f:X\rightarrow Y be a bijective mapping. We have that f^{-1}:Y\rightarrow X, the inverse of f, is also a bijection.

Proof:

Let f:X\rightarrow Y be a bijective mapping. By definition of being a bijection we have that f is both injective and surjective. By proposition 34{reference-type="ref" reference="prop:MappingInjectiveSurjectiveIFFInverseIsMapping"} we have that f^{-1} is a mapping. Now the inverse of the inverse is the original mapping, that is

$$\begin{equation*} \left(f^{-1}\right)^{-1}=f \end{equation*}$$

as f satisfies both f\circ f^{-1}=\mathop{\mathrm{id}}_Y and f^{-1}\circ f=\mathop{\mathrm{id}}_X, which is exactly what it means for f to be an inverse of f^{-1}.

Now, f is a mapping, so the inverse of f^{-1} is a mapping. Applying proposition 34{reference-type="ref" reference="prop:MappingInjectiveSurjectiveIFFInverseIsMapping"} to f^{-1} we have that f^{-1} is a bijection. As required. $\qed$ :::

We can now see that the composition of a bijective mapping with its inverse must be the identity map.

::: {#prop:BijectionWithInverseIsIdentity .proposition} Proposition 36. Composition of bijective mapping with the inverses is the identity mapping

Let f:X\rightarrow Y be a bijective mapping, and let f^{-1}:Y\rightarrow X be the inverse mapping of f. We have that

$$\begin{align} f\circ f^{-1} &=\mathop{\mathrm{id}}_Y\ f^{-1}\circ f &= \mathop{\mathrm{id}}_X \end{align*}$$*

Proof:

Let f:X\rightarrow Y be a bijective mapping, with inverse given by f^{-1}:Y\rightarrow X. As f is bijective we have by proposition 35{reference-type="ref" reference="prop:InverseBijectionIsBijection"} that f^{-1} is a bijection. Let x\in X, then we have that

$$\begin{equation} \exists y\in Y: f\left(x\right)=y \Rightarrow f^{-1}\left(y\right)=x \end{equation*}$$*

Hence, we have that

$$\begin{align} f^{-1}\circ f\left(x\right)&=f^{-1}\left(f\left(x\right)\right),\ \text{By function composition}\ &=f^{-1}\left(y\right),\ \text{By above}\ &=x,\ \text{By above}\ &=\mathop{\mathrm{id}}_X\left(x\right),\ \text{By the definition of the identity map of } X \end{align*}$$*

We have that the domain of f^{-1}\circ f is clearly X, likewise the co-domain is X, which is the same as \mathop{\mathrm{id}}_X. Moreover \forall x\in X we have f^{-1}\circ f\left(x\right)=x=\mathop{\mathrm{id}}_X\left(x\right). So the mappings are equal.

Likewise, let y\in Y, then we have that

$$\begin{equation} \exists x\in X: f^{-1}\left(y\right)=x \Rightarrow f\left(x\right)=y \end{equation*}$$*

Hence, we have that

$$\begin{align} f\circ f^{-1}\left(y\right)&=f\left(f^{-1}\left(y\right)\right),\ \text{By function composition}\ &=f\left(x\right),\ \text{By above}\ &=y,\ \text{By above}\ &=\mathop{\mathrm{id}}_Y\left(y\right),\ \text{By the definition of the identity map of } Y \end{align*}$$*

We have that the domain of f\circ f^{-1} is clearly Y, likewise the co-domain is Y, which is the same as \mathop{\mathrm{id}}_Y. Moreover \forall y\in Y we have f\circ f^{-1}\left(y\right)=y=\mathop{\mathrm{id}}_Y\left(y\right). So the mappings are equal.

In both cases the composition yields the required identity mappings, as required. $\qed$ :::

The Natural numbers

::: epigraph The natural numbers are the work of God. All the rest is the work of mankind.

Leopold Kronecker (Paraphrased) :::

Constructing the Natural numbers

We now have enough tools and core theory to start building up from the foundations of mathematics. We do this using the ZFC axioms, although perhaps not with the complete rigour we should be using. We touched on these briefly in section 2.1.5{reference-type="ref" reference="subsubSec:ZFCAxioms"}. We will state them again.

  1. The axiom of extensionality:

    The axiom of extensionality asserts that two sets are equal if and only if they contain the same elements.

  2. The axiom of the empty-set:

    The axiom of the empty-set asserts that there exists a set which contains no elements

  3. The axiom of pairing:

    The axiom of pairing asserts that given any set A and any set B, there is a set C such that, given any set D, D is a member of C if and only if D is equal to A or D is equal to B. This is to say, given two sets, there is a set whose members are exactly the two given sets.

  4. The axiom of specification:

    The axiom of specification asserts that, given a set A, we can construct the subset of A whose elements are exactly those elements of A which satisfy a given condition.

  5. The axiom of unions:

    The axiom of unions asserts that we can perform the union of two sets A and B

  6. The axiom of powers:

    The axiom of powers asserts that for any set S we can construct a set P\left(S\right) whose elements are all the possible subsets of S.

  7. The axiom of infinity:

    The axiom of infinity asserts that there is at least one infinite set A, that is at least one set with infinitely many elements. That is we have a set A such that the \emptyset\in A and if x\in A then the set x\cup\left\{x\right\} is also in A.

  8. The axiom of replacement:

    We will need the next section to fully understand this axiom, however informally it asserts that for any set S, we can form another set by replacing the elements of S by other sets according to any definite rule.

  9. The axiom of foundation:

    The axiom of foundation asserts that for every non-empty set S, there exists an element x\in S such that x and S are disjoint. This also asserts that no set can contain itself.

We also recall that we include the symbol \in in the ZFC axioms, which allows us to talk about element inclusions in sets. In other words, ZFC defines a set of axioms that allow us to talk about sets and elements of sets. Next, we have that, formally speaking, ZFC is allowed to make statements about mappings. Finally, ZFC has the power to prove the results in the previous two sections on sets and mappings, so we will assume these as well. We will use these as the building blocks for building the natural numbers. How can we do this from the ZFC axioms?

As it stands right now ZFC only gives us the existence of the empty set, and of at least one set which contains infinitely many elements. We start with the empty set, a set which contains no elements, and use the ZFC axioms to build a new set which contains the empty set.

Our ultimate goal is to identify each natural number with the number of elements in some corresponding set. Hence naturally the empty set containing no elements would be identified with the number 0, and so on. The question is given that we only have the empty set, how can we build a new set? We can use the axiom of powers. This states that we can take any set S and construct a new set P\left(S\right) whose elements are the possible subsets of S. Applying this to the empty-set, a set which contains no elements and thus has no subsets except for itself, must give us P\left(\emptyset\right)=\left\{\emptyset\right\}. This is sufficient for what we need to do.

So, we have two sets, \emptyset and \left\{\emptyset\right\}. We shall identify \emptyset with 0 and \left\{\emptyset\right\} with 1.

::: {#def:Zero .definition} Definition 55. Zero

We define the number zero to be \emptyset. That is, we say Zero is a set that contains no elements. :::

::: {#def:One .definition} Definition 56. One

We define the number one to be \left\{\emptyset\right\}. That is, we say One is the set whose only element is \emptyset. :::

How do we define any more numbers? We can use the axiom of unions. This raises the question why not use the axiom of powers again? If we apply the axiom of powers to \left\{\emptyset\right\} we get the set

$$\begin{equation*} P\left(\left\{\emptyset\right\}\right)=\left\{\emptyset,\left\{\emptyset\right\}\right\} \end{equation*}$$ If we assume we already know what the natural numbers are, we could identify this with the number 2. However, a repeated application of the axiom of powers would give us

$$\begin{equation*} P\left(\left\{\emptyset,\left\{\emptyset\right\}\right\}\right)=\left\{\emptyset,\left\{\emptyset\right\},\left\{\left\{\emptyset\right\}\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\} \end{equation*}$$ Which we would identify with the number 4. Another application would give us a set of 2^4 elements, which we would identify with the number 16. Clearly, we are skipping numbers such as 3, 5, 7, 9 etc. We can't get additional numbers that aren't powers of 2. Instead, we can define an operation that will allow us to construct each number one at a time.

This operation uses the axiom of unions, and starts off with the number 1, which we recall is the set \left\{\emptyset\right\}. The axiom of pairing allows us to create a set that contains as elements any two sets that have already been created. Applying this to \left\{\emptyset\right\} with itself allows us to create the set \left\{\left\{\emptyset\right\}\right\}. Applying the axiom of unions to these two sets gives us

$$\begin{equation*} \left\{\emptyset\right\}\cup\left\{\left\{\emptyset\right\}\right\}=\left\{\emptyset,\left\{\emptyset\right\}\right\} \end{equation*}$$ This is in agreement with P\left(\left\{\emptyset\right\}\right), so we can identify this with the number 2. Note that the same operation applied to 0 gives \emptyset\cup\left\{\emptyset\right\}=\left\{\emptyset\right\}, which is the number 1. Now, applying the axiom of pairing to \left\{\emptyset,\left\{\emptyset\right\}\right\} with itself allows us to create the set \left\{\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}. Hence we can now apply this operation again on the set \left\{\emptyset,\left\{\emptyset\right\}\right\} to get

$$\begin{equation*} \left\{\emptyset,\left\{\emptyset\right\}\right\}\cup\left\{\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}=\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\} \end{equation*}$$ A set of 3 elements so we identify this with the number 3. We can keep doing this to build the Natural numbers. Let's make some definitions

::: definition Definition 57. The successor operation

Let x be a set. We define the successor operation, denoted by S to be given by

$$\begin{equation} S\left(x\right)= x\cup\left\{x\right\} \end{equation}$$ :::

We call this the successor function, as it is clear in the context of the Natural numbers that S\left(n\right)=n+1, but we shall prove this later.
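To compare the two constructions discussed above, the following sketch models sets as Python `frozenset` values, so that sets can themselves be elements of sets. The function names `power_set` and `successor` and the number of iterations are illustrative assumptions. It records how many elements each set has after repeatedly applying the axiom of powers, and after repeatedly applying the successor operation, starting from the empty set in both cases.

```python
from itertools import combinations


def power_set(s):
    """Return the set of all subsets of the frozenset s, each as a frozenset."""
    items = list(s)
    return frozenset(
        frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)
    )


def successor(x):
    """Return S(x) = x ∪ {x} for a frozenset x."""
    return x | frozenset({x})


by_powers = [frozenset()]     # ∅, P(∅), P(P(∅)), ...
by_successor = [frozenset()]  # ∅, S(∅), S(S(∅)), ...
for _ in range(4):
    by_powers.append(power_set(by_powers[-1]))
    by_successor.append(successor(by_successor[-1]))

power_sizes = [len(s) for s in by_powers]
successor_sizes = [len(s) for s in by_successor]
print(power_sizes)      # sizes jump: 0, 1, 2, 4, 16
print(successor_sizes)  # sizes increase by one each time: 0, 1, 2, 3, 4
```

In this model each set built by the successor operation is exactly the set of all the sets built before it, for instance `by_successor[3] == frozenset(by_successor[:3])`.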

This definition allows us to essentially make any finite number. This leads us to our first potential definition for the Natural numbers. We first need to define the idea of recursion.

We have the following proposition

::: {#prop:EqualSuccOp .proposition} Proposition 37. Equality of successor operation

Let a,b be sets. We have that S\left(a\right)=S\left(b\right) if and only if a=b.

Proof:

\left(\Rightarrow\right): Suppose that a,b are sets and S\left(a\right)=S\left(b\right). By definition of S we have that

$$\begin{equation} a\cup\left{a\right}=b\cup\left{b\right} \end{equation*}$$*

Now, as a\in S\left(a\right) and S\left(a\right)=S\left(b\right) then we have that a\in b\cup\left\{b\right\} and so a\in b or a=b. Similarly, as b\in S\left(b\right) we get that b\in a\cup\left\{a\right\} and so b\in a or b=a.

Now, if a=b we are done, so suppose a\neq b, then we have that a\in b and b\in a. Consider the set given by

$$\begin{equation*} X=\left\{a,b\right\} \end{equation*}$$

which can be constructed by the Axiom of pairing. Now as a\in b we have that a\in b\cap X, so b\cap X\neq\emptyset, and likewise as b\in a we have that a\cap X\neq\emptyset. This contradicts the Axiom of Foundation, since X would not contain an element that is disjoint from it. It follows that we can't have a\neq b, and we conclude that a=b.

\left(\Leftarrow\right): This is trivial by the definition of S. $\qed$ :::

There are a few extra properties about the successor function that we shall make use of

::: corollary Corollary 1. Successor mapping is injective

The successor function is injective, that is, for all sets a,b we have that

$$\begin{equation*} S\left(a\right)=S\left(b\right) \Rightarrow a=b \end{equation*}$$

Proof:

Suppose that a,b are arbitrary sets and that S\left(a\right)=S\left(b\right). By proposition 37 this holds if and only if a=b. Hence we have injectivity. $\qed$ :::

::: corollary Corollary 2. Empty-set is not the successor of any set

We have that \emptyset\neq S\left(a\right) for all sets a.

Proof:

Consider the definition of S\left(a\right) and suppose for contradiction that \emptyset= S\left(a\right). We have by definition of the successor mapping that

$$\begin{equation*} \emptyset=S\left(a\right)=a\cup\left\{a\right\} \end{equation*}$$ This is a contradiction, as a\in a\cup\left\{a\right\}, so a\cup\left\{a\right\} has at least one element, but the empty-set by definition has no elements. $\qed$ :::
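
Both corollaries can be spot-checked on a handful of finite sets. This is only a sanity check in Python, not a proof; it assumes sets are modelled by `frozenset`, and the names `successor`, `samples`, `injective_on_samples` and `never_empty` are hypothetical.

```python
from itertools import combinations


def successor(x: frozenset) -> frozenset:
    """Return S(x) = x ∪ {x}."""
    return x | frozenset({x})


# A few distinct sample sets: 0, 1, 2, ... built by repeated succession.
samples = [frozenset()]
for _ in range(6):
    samples.append(successor(samples[-1]))

# Corollary 1 on the samples: distinct sets have distinct successors.
injective_on_samples = all(
    successor(a) != successor(b) for a, b in combinations(samples, 2)
)

# Corollary 2 on the samples: a successor always contains its argument,
# so it is never the empty set.
never_empty = all(
    a in successor(a) and successor(a) != frozenset() for a in samples
)
```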

::: definition Definition 58. Recursive definition of a set

A set S is defined recursively if the elements of S are defined in terms of other elements x\in S. Moreover we have that there is some initial element x_0 which is used to define the other elements of the set. :::

::: definition Definition 59. First definition of the Natural numbers

We define the set \mathbb{N}, called the set of natural numbers, to be the set given by

$$\begin{equation} \mathbb{N}=\left\{x: x=\emptyset\text{ or } x=S\left(y\right)\text{ for some } y\in\mathbb{N}\right\} \end{equation}$$ :::

We have defined \mathbb{N} recursively in terms of elements of \mathbb{N}. As an example 2\in\mathbb{N} as 2=S\left(1\right) and likewise 1=S\left(0\right) and we know that 0 is really the same as \emptyset, which is the initial element of \mathbb{N} as defined above. This definition allows us to get any x\in\mathbb{N}, however it is not quite enough to get every element of \mathbb{N} at the same time. We know that there should be infinitely many natural numbers, indeed for any n\in\mathbb{N} we have also that n+1\in\mathbb{N}. In other words we have a chain of sets of increasing size, that is we have

$$\begin{align*} \mathbb{N}_0&=\emptyset=0\\ \mathbb{N}_1&=\left\{\emptyset\right\}=1\\ \mathbb{N}_2&=\left\{\emptyset,\left\{\emptyset\right\}\right\}=2\\ \mathbb{N}_3&=\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}=3 \end{align*}$$ Which satisfy \mathbb{N}_0\subset\mathbb{N}_1\subset\mathbb{N}_2\subset\mathbb{N}_3\subset\dots. So we see at each stage \mathbb{N}_n is a finite set of size n, and so our current definition of \mathbb{N} can only ever reach a finite n, although we can make this n arbitrarily large. To ensure we get every possible n at once we need to invoke the axiom of infinity.

  1. The axiom of infinity:

    The axiom of infinity asserts that there is at least one infinite set A, that is, at least one set with infinitely many elements. More precisely, there is a set A such that \emptyset\in A and if x\in A then the set x\cup\left\{x\right\} is also in A.

There is a useful definition that we can extract from the axiom of infinity.

::: definition Definition 60. Inductive set

Let A be a set and let f:A\rightarrow A be a mapping. We say that A is an inductive set if it satisfies the following two properties

  1. $\emptyset\in A$

  2. If x\in A then $f\left(x\right)\in A$

For now, we will be focused on the case where f=S, the successor mapping. :::

In light of the axiom of infinity we have a set that contains the infinitely many Natural numbers. This is nearly what we want, although it won't be the set of Natural numbers. This set could clearly have many, many more things than just the Natural numbers.

We can make a new definition, which will allow us to define \mathbb{N}. We will also be able to show that this definition makes \mathbb{N} the smallest inductive set that contains all of the Natural numbers.

::: definition Definition 61. The set $\mathbb{N}_S$

Let S be an inductive set. We define \mathbb{N}_S as follows

$$\begin{equation} \mathbb{N}_S=\bigcap_{\substack{A\subseteq S \\ A\text{ is inductive}}} A \end{equation}$$

This is well-defined by the axiom of specification, as being an inductive set is a definable property and the collection of all subsets of S is a set we can define. :::

We have that all of these sets \mathbb{N}_S are the same.

::: {#thm:EveryNsSetIsSame .theorem} Theorem 3. Every \mathbb{N}_S set is the same set

Let S and T be inductive sets, and define the sets \mathbb{N}_S and \mathbb{N}_T. We have that

$$\begin{equation} \mathbb{N}_S=\mathbb{N}_T \end{equation}$$

Proof:

By the axiom of extensionality we know that two sets are equal if and only if they contain the same elements. To see that \mathbb{N}_S and \mathbb{N}_T have the same elements consider the new set given by

$$\begin{equation*} C=\mathbb{N}_S\cap\mathbb{N}_T \end{equation*}$$

We recall from proposition 8 that for two sets A and B we have A\cap B\subseteq A. Hence it follows that

$$\begin{equation*} C=\mathbb{N}_S\cap\mathbb{N}_T\subseteq\mathbb{N}_S \end{equation*}$$ That is, every element of C is also an element of \mathbb{N}_S. Now recall the definition of \mathbb{N}_S,

$$\begin{equation*} \mathbb{N}_S=\bigcap_{\substack{A\subseteq S \\ A\text{ is inductive}}} A \end{equation*}$$ We know that C\subseteq \mathbb{N}_S. As S is itself an inductive subset of S, it is one of the sets in this intersection, so \mathbb{N}_S\subseteq S and we conclude that C\subseteq S.

Now, we know that S is an inductive set. Hence S satisfies the following

  1. $\emptyset\in S$

  2. If x\in S then $S\left(x\right)\in S$

If we can show that C is an inductive set we know that C was one of the sets we used to construct \mathbb{N}_S and hence \mathbb{N}_S\subseteq C, which will give the equality C=\mathbb{N}_S.

Now, to show that C is an inductive set we must show that

  1. $\emptyset\in C$

  2. If x\in C then $S\left(x\right)\in C$

  1. \emptyset\in C:

    We have that C=\mathbb{N}_S\cap\mathbb{N}_T and we have that

    $$\begin{align*} \mathbb{N}_S&=\bigcap_{\substack{A\subseteq S \\ A\text{ is inductive}}} A\\ \mathbb{N}_T&=\bigcap_{\substack{A\subseteq T \\ A\text{ is inductive}}} A \end{align*}$$ In the definitions of both \mathbb{N}_S and \mathbb{N}_T we have that these are the intersections of inductive sets, each of which contains \emptyset, and so \emptyset\in\mathbb{N}_S and \emptyset\in\mathbb{N}_T. It hence follows that as C=\mathbb{N}_S\cap\mathbb{N}_T we must have \emptyset\in C.

  2. If x\in C then S\left(x\right)\in C:

    Now suppose that x\in C. Like before we know that C=\mathbb{N}_S\cap\mathbb{N}_T, and by the definition of the intersection of two sets, it follows that x\in\mathbb{N}_S and x\in\mathbb{N}_T. Now we have that

    $$\begin{equation*} \mathbb{N}_S=\bigcap_{\substack{A\subseteq S \\ A\text{ is inductive}}} A \end{equation*}$$ hence as x\in\mathbb{N}_S we must have that x\in A for every inductive subset A of S. Each such A is an inductive set, and so by definition of an inductive set we have that S\left(x\right)\in A for every inductive subset A of S. Hence S\left(x\right)\in\mathbb{N}_S, and a similar argument shows that S\left(x\right)\in\mathbb{N}_T. It thus follows that S\left(x\right)\in C.

    As x\in C was arbitrary we must conclude that this holds for any x\in C.

Hence C is an inductive set.

Now, as C\subseteq S and C is an inductive set, it follows that C is one of the inductive sets in the definition of \mathbb{N}_S, and hence that \mathbb{N}_S\subseteq C. As we also have C\subseteq\mathbb{N}_S, the sets \mathbb{N}_S and C contain the same elements, so by the axiom of extensionality C=\mathbb{N}_S.

Likewise, a similar argument shows that C=\mathbb{N}_T. So it follows that \mathbb{N}_S = \mathbb{N}_T. $\qed$ :::

In light of this theorem we can now truly define \mathbb{N}.

::: definition Definition 62. The Natural numbers $\mathbb{N}$

Let S be an inductive set, and construct the set \mathbb{N}_S. The set \mathbb{N}_S is the set of Natural numbers, and by theorem 3 all such \mathbb{N}_S are the same, no matter the inductive set S. Hence we simply refer to the natural numbers by \mathbb{N}. :::

We identify the elements of \mathbb{N} not in terms of \emptyset, and sets of sets containing \emptyset, but instead by the more usual numerals that we use. We have already defined Zero and One, by definitions 55 and 56. The other numbers follow likewise, i.e.

$$\begin{align*} 0&=\emptyset\\ 1&=S\left(0\right)=\left\{\emptyset\right\}\\ 2&=S\left(1\right)=\left\{\emptyset,\left\{\emptyset\right\}\right\}\\ 3&=S\left(2\right)\\ 4&=S\left(3\right)\\ &\dots\\ n+1&=S\left(n\right) \end{align*}$$

We said that we can prove that \mathbb{N} is the smallest such inductive set that contains all the natural numbers, this is to say if A\subseteq\mathbb{N} is an inductive set we must have that A=\mathbb{N}. We thankfully do not need to prove this as the previous theorem gives this for free. This also gives us the following definition for a minimally inductive set, we make the definition in such a way that we argue about sets of inductive sets.

::: definition Definition 63. Minimally inductive set of sets

Let S be a set whose elements are also sets satisfying some condition, and let f:S\rightarrow S be a mapping. We say that S is minimally inductive if and only if the following holds

  1. S is an inductive set under the mapping $f$

  2. No proper subset of S is inductive under the mapping $f$ :::

One of the most powerful properties of the natural numbers is the principle of Induction. This tool is powerful in proving many statements on the Natural numbers. It works in a similar way to how an inductive set works7 . We show that the statement works for some base case, usually n=0, then we show that if it holds true for some n then it holds true for S\left(n\right)=n+1.

::: theorem Theorem 4. The principle of induction

Suppose we have a proposition P\left(n\right) about a Natural number n\in\mathbb{N}. Moreover, suppose that

  1. P\left(0\right) is true

  2. P\left(n\right) being true implies P\left(S\left(n\right)\right)=P\left(n+1\right) is true for any Natural number n.

If these two statements are true, we have that P\left(n\right) is true for any natural number n, and we say the proposition P\left(n\right) holds by the principle of mathematical induction.

Moreover we call P\left(0\right) the base case for induction and P\left(n\right) being true implies P\left(n+1\right) being true is the inductive step.

Proof:

Let P\left(n\right) be a proposition about a Natural number n\in\mathbb{N} such that P\left(n\right) satisfies

  1. P\left(0\right) is true

  2. P\left(n\right) being true implies P\left(S\left(n\right)\right)=P\left(n+1\right) is true for any Natural number n.

Consider the set given by

$$\begin{equation*} Q=\left\{n\in\mathbb{N}:P\left(n\right)\text{ is true}\right\} \end{equation*}$$ That is, Q is defined as the set of Natural numbers n such that P\left(n\right) is true, so clearly Q\subseteq\mathbb{N}. By hypothesis we know that P\left(0\right) is true, so 0\in Q. Also by hypothesis we know that if P\left(n\right) is true for some n\in\mathbb{N} then P\left(S\left(n\right)\right)=P\left(n+1\right) is also true, hence n\in Q implies S\left(n\right)\in Q. So Q is an inductive set with Q\subseteq\mathbb{N}, and as \mathbb{N} is the smallest inductive set we have that \mathbb{N}\subseteq Q. By the axiom of extensionality we have that Q=\mathbb{N}. Hence P\left(n\right) is true for every Natural number n\in\mathbb{N}. $\qed$ :::

Now that we have induction we can make a final definition that will be useful. This definition combines a few previously proven results into a convenient package. This package has the strength to prove the usual properties of the natural numbers, and is perhaps an easy way to remember the basis for deducing properties about the natural numbers.

::: definition Definition 64. The Peano axioms

We define the Peano axioms as follows. Let A be a set and consider the successor mapping on A, S: A\rightarrow A. If we have that

  1. A is an inductive set

    1. $\emptyset\in A$

    2. If x\in A then $S\left(x\right)\in A$

  2. S is an injective mapping.

  3. \forall x\in A we have that $\emptyset\neq S\left(x\right)$

  4. \forall B \subseteq A. If 0\in B and S\left(n\right)\in B for all n\in B then $B=A$

If A satisfies all of the above, then we say that A satisfies the Peano axioms and induces Peano arithmetic. :::

Properties of the natural numbers

Although we have constructed \mathbb{N} we haven't defined what we can do with this set. We know from our intuitions that we can define addition, a form of subtraction, multiplication and in some cases division. We also know that there is some notion of a Natural number being larger or smaller than another, of when two Natural numbers are equal, and so on. We will explore some of these properties so that we can start doing some form of Mathematics.

Equality of natural numbers

Firstly, it is important to define when two Natural numbers are equal, again as we have defined the natural numbers in terms of Sets, this just comes down to the axiom of extensionality.

::: definition Definition 65. Equality of natural numbers

Let n,m\in\mathbb{N} be two natural numbers. We define that two natural numbers are equal, denoted n=m if and only if n\subseteq m and m\subseteq n. This is simply the axiom of extensionality.

If we do not have n=m then we say that n and m are not equal and we denote this n\neq m. :::

This definition clearly makes sense as each natural number is a set.

::: example Example 51. We have that 1=1. Indeed by definition 0=\emptyset and 1=\left\{\emptyset\right\}. It is clear that \left\{\emptyset\right\}\subseteq \left\{\emptyset\right\} hence the axiom of extensionality gives us that \left\{\emptyset\right\}=\left\{\emptyset\right\}. That is $1=1$ :::

::: example Example 52. We have that 3=3. Indeed by construction we have that 3=\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}. It is clear that \left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}\subseteq\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}, hence the axiom of extensionality gives us that \left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}=\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}, i.e. $3=3$ :::

::: example Example 53. We have 1\neq 2. We have that 1=\left\{\emptyset\right\} and 2=\left\{\emptyset,\left\{\emptyset\right\}\right\}. Now \left\{\emptyset\right\}\subseteq \left\{\emptyset,\left\{\emptyset\right\}\right\} but \left\{\emptyset,\left\{\emptyset\right\}\right\}\not\subseteq \left\{\emptyset\right\}. :::

In particular in light of the definition of equality on the natural numbers if n=m and m=k we must have that n=k.

Inequality of natural numbers

We can also define what it means for natural numbers to not be equal. We make use of the notion of set inclusion. Recall that a set S is a subset of the set T, written S\subseteq T, if for every s\in S we have that s\in T, and that S is a proper subset of T, written S\subset T, if S\subseteq T and S\neq T. We will use the proper subset notation to define the so-called less than operator. This operation comes naturally from the definition of the natural numbers by the successor mapping. The successor function has the following chain of definitions for each n\in\mathbb{N}

$$\begin{align*} 0&=\emptyset\\ 1&=S\left(0\right)=\left\{\emptyset\right\}\\ 2&=S\left(1\right)=\left\{\emptyset,\left\{\emptyset\right\}\right\}\\ 3&=S\left(2\right)=\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}\\ 4&=S\left(3\right)=\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\},\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}\right\}\\ &\dots\\ n+1&=S\left(n\right) \end{align*}$$

From this chain of definitions and the axiom of foundation, 0=\emptyset is the minimal element of \mathbb{N}, and every natural number is contained in each one that comes after it. We can make the following definition which defines when one natural number is smaller than another.

::: definition Definition 66. Less than Operator

Let n,m\in\mathbb{N}. The less than operator, denoted by n<m and read as n is less than m, is defined as follows.

We have n<m if and only if n\subset m. Equivalently, the set that denotes the number n is an element of the set m. In the language of mathematical logic, we have that < is actually a logical proposition, given by

$$\begin{equation*} <\left(n,m\right)=\begin{cases} 1,\ \text{If } n\subset m\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$ :::

Recall that for predicates 0 indicates that the predicate is false and 1 indicates that the predicate is true.

::: example Example 54. We have that 2<3. Indeed 2=\left\{\emptyset,\left\{\emptyset\right\}\right\} and 3=\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}, and clearly

$$\begin{equation*} \left\{\emptyset,\left\{\emptyset\right\}\right\}\subset\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\} \end{equation*}$$ :::

We can combine the less than operator with the equality operator.

::: definition Definition 67. Less than or equal to operator

Let n,m\in\mathbb{N}. The less than or equal to operator, denoted by n\leq m, and read as n is less than or equal to m, is defined the same as n<m except we now allow for the situation that n=m. This is to say \leq is a logical proposition given by

$$\begin{equation*} \leq\left(n,m\right) = \begin{cases} 1,\ \text{If } n< m\\ 1,\ \text{If } n=m\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$

Where on the right-hand side of the definition we are talking about sets, and on the left-hand side we are talking about natural numbers, although we know these are the same thing. :::

::: example Example 55. We have that 2\leq 3. From the previous example, we know that 2<3. Moreover, we have that 3\leq 3 as 3=3. :::
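
The order defined above can be checked mechanically on small numerals. The following Python sketch assumes sets are modelled by `frozenset`, whose `<` operator is the proper subset test; the names `successor`, `numeral`, `less_than` and `less_or_equal` are illustrative and not from the text.

```python
def successor(x: frozenset) -> frozenset:
    """Return S(x) = x ∪ {x}."""
    return x | frozenset({x})


def numeral(k: int) -> frozenset:
    """Apply the successor operation k times to the empty set."""
    x = frozenset()
    for _ in range(k):
        x = successor(x)
    return x


def less_than(n: frozenset, m: frozenset) -> bool:
    """n < m if and only if n is a proper subset of m."""
    return n < m


def less_or_equal(n: frozenset, m: frozenset) -> bool:
    """n <= m if and only if n < m or n = m."""
    return less_than(n, m) or n == m


nums = [numeral(k) for k in range(6)]

# The proper subset order agrees with the usual order on 0, ..., 5.
order_matches = all(
    less_than(nums[i], nums[j]) == (i < j)
    for i in range(6) for j in range(6)
)

# On these numerals, being a proper subset coincides with being an element.
membership_matches = all(
    (nums[i] in nums[j]) == (i < j)
    for i in range(6) for j in range(6)
)
```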

We have defined less than and less than or equal to. We can define similar notions of greater than and greater than or equal to by reversing the roles of n and m, that is, by considering when m\subset n. Note that the condition n\not\subset m would not do here, as it also holds when n=m.

::: definition Definition 68. Greater than operator

Let n,m\in\mathbb{N}. The greater than operator, denoted by n>m and read as n is greater than m, is defined as follows.

We have n>m if and only if m\subset n, that is, if and only if m<n. Equivalently, the set that denotes the number m is an element of the set n. That is to say that > is a logical proposition, given by

$$\begin{equation*} >\left(n,m\right)=\begin{cases} 1,\ \text{If } m\subset n\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$ :::

Likewise, we can define the greater than or equal to operator.

::: definition Definition 69. Greater than or equal to operator

Let n,m\in\mathbb{N}. The greater than or equal to operator, denoted by n\geq m, and read as n is greater than or equal to m, is defined the same as n>m except we now allow for the situation that n=m. This is to say \geq is a logical proposition given by

$$\begin{equation*} \geq\left(n,m\right) = \begin{cases} 1,\ \text{If } n> m\\ 1,\ \text{If } n=m\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$ :::

Defining addition and multiplication on the Natural numbers

We can use the principle of induction to make definitions as well as a proof technique. We shall use induction now to make two definitions, in particular, we define two mappings that will allow us to start manipulating Natural numbers as we expect to. To do so it is enough to specify what the mapping does when 0 is given as an argument, and then to define what the mapping does when given S\left(n\right) as an argument in terms of what it does for n, for each n\in\mathbb{N}. This will make sense when we define these operations.

We first recall the Cartesian product of two sets. Let S and T be sets, the Cartesian product of S and T, denoted S\times T, is the set of all ordered pairs of the form \left(s,t\right) where s\in S and t\in T. This is to say that

$$\begin{equation*} S\times T=\left\{\left(s,t\right):s\in S,t\in T\right\} \end{equation*}$$

If S=T then we denote S\times T by S^2.

::: definition Definition 70. Addition on the Natural numbers

We define addition on the Natural numbers by the following mapping. Let +:\mathbb{N}^2\rightarrow\mathbb{N} be such that for all \left(m,n\right)\in\mathbb{N}^2 we have the following

$$\begin{align} +&:\mathbb{N}^2\mathlarger{\mathlarger{\rightarrow}}\mathbb{N}\\ \left(m,n\right)&\mapsto +\left(m,n\right)=\begin{cases} m+0=m,\ \text{If } n=0\\ m+S\left(n\right)=S\left(m+n\right),\ \text{If } n\neq 0 \end{cases} \end{align}$$

We will write +\left(m,n\right) as m+n. :::

In light of this definition, we can prove that 1+1=2

::: theorem Theorem 5. 1+1=2

We have that 1+1=2.

Proof:

We know that 1=S\left(0\right) and 2=S\left(S\left(0\right)\right). Hence, we are proving

$$\begin{equation*} S\left(0\right)+S\left(0\right)=S\left(S\left(0\right)\right) \end{equation*}$$

By the definition of the addition mapping, we know that \forall \left(m,n\right)\in\mathbb{N}^2 that

$$\begin{equation*} m+S\left(n\right)=S\left(m+n\right) \end{equation*}$$ In particular if n=0 we have \forall m that

$$\begin{equation*} m+S\left(0\right)=S\left(m+0\right) \end{equation*}$$

and so, taking m=S\left(0\right),

$$\begin{equation} \label{eq:OnePlusOneProofEq1} S\left(0\right)+S\left(0\right)=S\left(S\left(0\right)+0\right) \end{equation}$$ Moreover, by the definition of addition, we know that \forall m that if n=0 then

$$\begin{equation*} m+0=m \end{equation*}$$ Hence

$$\begin{align*} S\left(0\right)+0&=S\left(0\right)\\ \Rightarrow S\left(S\left(0\right)+0\right)&= S\left(S\left(0\right)\right)\\ \Rightarrow S\left(0\right)+S\left(0\right)&=S\left(S\left(0\right)\right) \end{align*}$$

This is to say, 1+1=2, as required. $\qed$ :::
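
The recursive definition of addition can be run directly on small numerals. This is a minimal Python sketch under stated assumptions: sets are modelled by `frozenset`, and the helper `predecessor` (not defined in the text) recovers n from S(n) as the element of S(n) with the most elements. The names `successor`, `numeral` and `add` are likewise illustrative.

```python
ZERO = frozenset()


def successor(x: frozenset) -> frozenset:
    """Return S(x) = x ∪ {x}."""
    return x | frozenset({x})


def numeral(k: int) -> frozenset:
    """Apply the successor operation k times to the empty set."""
    x = ZERO
    for _ in range(k):
        x = successor(x)
    return x


def predecessor(n: frozenset) -> frozenset:
    """For a non-zero numeral n = S(k), return k, its largest element."""
    return max(n, key=len)


def add(m: frozenset, n: frozenset) -> frozenset:
    """m + 0 = m and m + S(k) = S(m + k)."""
    if n == ZERO:
        return m
    return successor(add(m, predecessor(n)))


ONE = numeral(1)
TWO = numeral(2)
```

With these definitions `add(ONE, ONE)` unfolds exactly as in the proof: S(0)+S(0) becomes S(S(0)+0), which becomes S(S(0)).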

::: definition Definition 71. Multiplication on the Natural numbers

We define multiplication on the Natural numbers by the following mapping. Let *:\mathbb{N}\times\mathbb{N}\rightarrow\mathbb{N} be such that for all \left(m,n\right)\in\mathbb{N}\times\mathbb{N} we have the following

$$\begin{align} *&:\mathbb{N}\times\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N}\\ \left(m,n\right)&\mapsto *\left(m,n\right)=\begin{cases} m*0=0,\ \text{If } n=0\\ m*S\left(n\right)=m*n+m,\ \text{If } n\neq 0 \end{cases} \end{align}$$ We will write *\left(m,n\right) as m*n, or more compactly just as the juxtaposition $mn$ :::

As with addition we provide a proof that 2*2=4

::: theorem Theorem 6. 2*2=4

We have 2*2=4.

Proof:

We know that S\left(1\right)=2 and so by definition of multiplication we have that

$$\begin{equation*} 2*2=2*S\left(1\right)=2*1+2 \end{equation*}$$

Likewise we know that S\left(0\right)=1 and so by another application of the definition of multiplication we have that

$$\begin{equation*} 2*1+2=2*S\left(0\right)+2=2*0+2+2 \end{equation*}$$

Now 2*0=0 by definition and so we have that

$$\begin{equation*} 2*2=2*0+2+2=0+2+2=2+2 \end{equation*}$$ where 0+2=S\left(S\left(0+0\right)\right)=2 by two applications of the definition of addition.

It is left to show that 2+2 = 4. We use a similar proof to 1+1=2. As 4=S\left(S\left(2\right)\right)=S\left(S\left(S\left(S\left(0\right)\right)\right)\right) and 2=S\left(S\left(0\right)\right) we need to show that

$$\begin{equation*} S\left(S\left(0\right)\right)+S\left(S\left(0\right)\right)=S\left(S\left(S\left(S\left(0\right)\right)\right)\right) \end{equation*}$$

By the definition of addition we have that \forall\left(m,n\right)\in\mathbb{N}^2 that

$$\begin{equation*} m+S\left(n\right)=S\left(m+n\right) \end{equation*}$$

In particular we have that if n=0 then \forall m\in\mathbb{N} that

$$\begin{equation*} m+S\left(0\right)=S\left(m+0\right) \end{equation*}$$

So that

$$\begin{align*} S\left(S\left(0\right)\right)+S\left(S\left(0\right)\right)&=S\left(S\left(S\left(0\right)\right)+S\left(0\right)\right)\\ &=S\left(S\left(S\left(S\left(0\right)\right)+0\right)\right)\\ &=S\left(S\left(S\left(S\left(0\right)\right)\right)\right) \end{align*}$$

That is 2+2=4 and so the theorem is proved. $\qed$ :::
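
The recursive definition of multiplication can be executed in the same spirit. This is a hedged Python sketch, assuming sets are modelled by `frozenset`; the helper `predecessor`, which recovers n from S(n) as its largest element, and the names `successor`, `numeral`, `add` and `mul` are illustrative and not part of the text.

```python
ZERO = frozenset()


def successor(x: frozenset) -> frozenset:
    """Return S(x) = x ∪ {x}."""
    return x | frozenset({x})


def numeral(k: int) -> frozenset:
    """Apply the successor operation k times to the empty set."""
    x = ZERO
    for _ in range(k):
        x = successor(x)
    return x


def predecessor(n: frozenset) -> frozenset:
    """For a non-zero numeral n = S(k), return k, its largest element."""
    return max(n, key=len)


def add(m: frozenset, n: frozenset) -> frozenset:
    """m + 0 = m and m + S(k) = S(m + k)."""
    if n == ZERO:
        return m
    return successor(add(m, predecessor(n)))


def mul(m: frozenset, n: frozenset) -> frozenset:
    """m * 0 = 0 and m * S(k) = m * k + m."""
    if n == ZERO:
        return ZERO
    return add(mul(m, predecessor(n)), m)


TWO = numeral(2)
FOUR = numeral(4)
```

Evaluating `mul(TWO, TWO)` follows the same chain as the proof: 2*2 becomes 2*1+2, then 2*0+2+2, and the additions are then resolved by `add`.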

These two definitions are enough to prove every elementary property of addition and multiplication that we are familiar with. However to do so will require an upgrade to the idea of induction. This will allow us to perform induction on both the addition and multiplication mappings. Once we have done this we will have put the natural numbers on a firm logical basis. This idea is called double induction, or more clearly induction on two variables.

For example, we know from school that n+m=m+n for all natural numbers n and m. To show that this is true, we start by induction on n, so we have to show that m+0=0+m and then that \left(m+n=n+m\right) implies that \left(m+S\left(n\right)=S\left(n\right)+m\right), each of these will be proved by induction on m. This is the idea of double induction.

::: theorem Theorem 7. Double induction

Let P\left(m,n\right) be a proposition about a pair of natural numbers m,n\in\mathbb{N}. Moreover suppose that

  1. P\left(0,0\right) is true.

  2. P\left(0,n\right) being true implies that P\left(0,S\left(n\right)\right) is true.

  3. P\left(m,0\right) being true implies that P\left(S\left(m\right),0\right) is true

  4. For a given m\in\mathbb{N}, from the truth that P\left(m,x\right) is true for all x, and also that of P\left(S\left(m\right),n\right) for some n, we can infer that P\left(S\left(m\right),S\left(n\right)\right) is true.

If these statements are true, we have that P\left(m,n\right) is true for any natural numbers m,n\in\mathbb{N} and we say that the proposition P\left(m,n\right) holds by the principle of mathematical double induction.

Proof:

Let P\left(m,n\right) be a proposition about a pair of natural numbers m,n\in\mathbb{N}, which satisfies

  1. P\left(0,0\right) is true.

  2. P\left(0,n\right) being true implies that P\left(0,S\left(n\right)\right) is true.

  3. P\left(m,0\right) being true implies that P\left(S\left(m\right),0\right) is true

  4. For a given m\in\mathbb{N}, from the truth that P\left(m,x\right) is true for all x, and also that of P\left(S\left(m\right),n\right) for some n, we can infer that P\left(S\left(m\right),S\left(n\right)\right) is true.

Statements 1 and 2 are the base case and the inductive step for the proof of P\left(0,n\right) for all n\in\mathbb{N}. Likewise statements 1 and 3 are the base case and the inductive step for the proof of P\left(m,0\right) for all m\in\mathbb{N}.

Finally, statements 3 and 4 are the base case and inductive step for a proof, by induction on n, of the statement that if P\left(m,n\right) holds for all n, then P\left(S\left(m\right),n\right) holds for all n. Thus by induction on m we have that P\left(m,n\right) is true for all m and all n. $\qed$ :::

We can start proving the basic properties of \mathbb{N} that we are familiar with.

Closure properties of addition and multiplication

 
We show that addition and multiplication of natural numbers always produce a natural number.

::: theorem Theorem 8. The addition and multiplication mappings on the natural numbers are closed

For all n,m\in\mathbb{N}. We have that

  1. n+m\in\mathbb{N}.

  2. nm\in\mathbb{N}.

Proof:

  1. n+m\in\mathbb{N}:

    Let n,m\in\mathbb{N}. We need to show that

    1. $0+0\in\mathbb{N}$

    2. 0+n\in\mathbb{N} implies $0+S\left(n\right)\in\mathbb{N}$

    3. m+0\in\mathbb{N} implies $S\left(m\right)+0\in\mathbb{N}$

    4. For a fixed m\in\mathbb{N}: if m+x\in\mathbb{N} for all x\in\mathbb{N}, and S\left(m\right)+n\in\mathbb{N} for some n\in\mathbb{N}, then $S\left(m\right)+S\left(n\right)\in\mathbb{N}$

    1. 0+0\in\mathbb{N}:

      We have by the definition of addition that

      $$\begin{equation*} 0+0=0 \end{equation*}$$ which is clearly in \mathbb{N}.

    2. 0+n\in\mathbb{N} implies 0+S\left(n\right)\in\mathbb{N}:

      Now, suppose that 0+n\in\mathbb{N} for some n, we show that 0+S\left(n\right)\in\mathbb{N}.

      By the definition of addition we have that

      $$\begin{equation*} 0+S\left(n\right)=S\left(0+n\right) \end{equation*}$$ Now 0+n\in\mathbb{N} by assumption, and \mathbb{N} is an inductive set, therefore we have that S\left(0+n\right)\in\mathbb{N}. Hence 0+S\left(n\right)\in\mathbb{N}.

    3. m+0\in\mathbb{N} implies S\left(m\right)+0\in\mathbb{N}:

      Now, suppose that m+0\in\mathbb{N} for some m, we show that S\left(m\right)+0\in\mathbb{N}.

      By the definition of addition we have that

      $$\begin{equation*} S\left(m\right)+0=S\left(m\right)=S\left(m+0\right) \end{equation*}$$ Now m+0\in\mathbb{N} by assumption, and \mathbb{N} is an inductive set, therefore S\left(m+0\right)\in\mathbb{N}. Hence $S\left(m\right)+0\in\mathbb{N}$

    4. For a fixed m\in\mathbb{N}: if m+x\in\mathbb{N} for all x\in\mathbb{N}, and S\left(m\right)+n\in\mathbb{N} for some n\in\mathbb{N}, then S\left(m\right)+S\left(n\right)\in\mathbb{N}:

      Now suppose that m+x\in\mathbb{N} for all x\in\mathbb{N} and some fixed m\in\mathbb{N}, and suppose that S\left(m\right)+n\in\mathbb{N} where n is some fixed value. We show that S\left(m\right)+S\left(n\right)\in\mathbb{N}.

      As S\left(m\right)\in\mathbb{N} and S\left(n\right)\in\mathbb{N}, we can use the definition of addition. Doing so gives

      $$\begin{equation*} S\left(m\right)+S\left(n\right)=S\left(S\left(m\right)+n\right) \end{equation*}$$ By assumption S\left(m\right)+n\in\mathbb{N}, hence as \mathbb{N} is an inductive set we have that S\left(S\left(m\right)+n\right)\in\mathbb{N}. Therefore we must conclude that S\left(m\right)+S\left(n\right)\in\mathbb{N}.

    Hence by the principle of double induction we have that m+n\in\mathbb{N} for all m,n\in\mathbb{N}. That is, addition is closed.

  2. nm\in\mathbb{N}:

    Let n,m\in\mathbb{N}. We need to show that

    1. $0*0\in\mathbb{N}$

    2. 0*n\in\mathbb{N} implies $0*S\left(n\right)\in\mathbb{N}$

    3. m*0\in\mathbb{N} implies $S\left(m\right)*0\in\mathbb{N}$

    4. For a fixed m\in\mathbb{N}: if m*x\in\mathbb{N} for all x\in\mathbb{N}, and S\left(m\right)*n\in\mathbb{N} for some n\in\mathbb{N}, then $S\left(m\right)*S\left(n\right)\in\mathbb{N}$

    1. 0*0\in\mathbb{N}:

      We have by the definition of multiplication that

      $$\begin{equation*} 0*0=0 \end{equation*}$$ which is clearly in \mathbb{N}.

    2. 0*n\in\mathbb{N} implies 0*S\left(n\right)\in\mathbb{N}:

      Now, suppose that 0*n\in\mathbb{N} for some n, we show that 0*S\left(n\right)\in\mathbb{N}.

      By the definition of multiplication we have that

      $$\begin{equation*} 0*S\left(n\right)=0*n+0 \end{equation*}$$

      Now 0*n\in\mathbb{N} by assumption. Moreover we have proved that addition is closed, so 0*n+0\in\mathbb{N}, and therefore we have that $0*S\left(n\right)\in\mathbb{N}$

    3. m*0\in\mathbb{N} implies S\left(m\right)*0\in\mathbb{N}:

      Now, suppose that m*0\in\mathbb{N} for some m, we show that S\left(m\right)*0\in\mathbb{N}.

      By the definition of multiplication we have that

      $$\begin{equation*} S\left(m\right)*0=0 \end{equation*}$$ Hence, as 0\in\mathbb{N}, we have that S\left(m\right)*0\in\mathbb{N}.

    4. For some m\in\mathbb{N}. Suppose that m*x\in\mathbb{N} for all x\in\mathbb{N}, and S\left(m\right)*n\in\mathbb{N} for some n\in\mathbb{N} implies that S\left(m\right)*S\left(n\right)\in\mathbb{N}:

      Now suppose that m*x\in\mathbb{N} for all x\in\mathbb{N} and some fixed m\in\mathbb{N}, and suppose that S\left(m\right)*n\in\mathbb{N} where n is some fixed value, we show that S\left(m\right)*S\left(n\right)\in\mathbb{N}.

      So, we have that S\left(m\right)\in\mathbb{N} and S\left(n\right)\in\mathbb{N} we can use the definition of multiplication, doing so gives

      $$\begin{equation*} S\left(m\right)*S\left(n\right)=S\left(m\right)*n+S\left(m\right) \end{equation*}$$ By assumption S\left(m\right)*n\in\mathbb{N}, and we also have S\left(m\right)\in\mathbb{N}, so we must have S\left(m\right)*n+S\left(m\right)\in\mathbb{N} as addition is closed.

      Hence S\left(m\right)*S\left(n\right)\in\mathbb{N}.

    Hence by the principle of double induction we have that m*n\in\mathbb{N} for all m,n\in\mathbb{N}. That is, multiplication is closed.

Hence, we have that the addition and multiplication mappings are closed. $\qed$ :::
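The recursive definitions used in the proof above can be tried out numerically. The following is a minimal sketch, assuming that natural numbers are modelled by non-negative Python integers and that the successor map is modelled by adding one; the names `succ`, `add`, `mul` and `is_natural` are hypothetical and chosen only for illustration. A finite check of this kind illustrates the closure property but is of course not a proof.

```python
def succ(n: int) -> int:
    """Successor map S, modelled here as n + 1 (an assumption of this sketch)."""
    return n + 1


def add(m: int, n: int) -> int:
    """Addition by recursion on n: m + 0 = m and m + S(k) = S(m + k)."""
    if n == 0:
        return m
    return succ(add(m, n - 1))


def mul(m: int, n: int) -> int:
    """Multiplication by recursion on n: m * 0 = 0 and m * S(k) = m * k + m."""
    if n == 0:
        return 0
    return add(mul(m, n - 1), m)


def is_natural(x: object) -> bool:
    """A value counts as a natural number here if it is a non-negative int."""
    return isinstance(x, int) and x >= 0


# Check closure of both operations on a small finite range of inputs.
closed = all(
    is_natural(add(m, n)) and is_natural(mul(m, n))
    for m in range(8)
    for n in range(8)
)
print(closed)
```

The recursion in `add` and `mul` follows the second argument, which mirrors the way the induction in the proof peels off one successor at a time.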

Commutativity of addition and multiplication

 
This section proves that a+b=b+a and ab=ba for all a,b\in\mathbb{N}.

::: theorem Theorem 9. Addition and multiplication are commutative

For all a,b\in\mathbb{N} we have that

  1. $a+b=b+a$

  2. $ab=ba$

Proof:

  1. a+b=b+a:

    We argue by double induction. We need to show that

    1. $0+0=0+0$

    2. 0+n=n+0 implies $0+S\left(n\right)=S\left(n\right)+0$

    3. m+0=0+m implies $S\left(m\right)+0=0+S\left(m\right)$

    4. If m+x=x+m for all x\in\mathbb{N} and S\left(m\right)+n=n+S\left(m\right) for some n\in\mathbb{N}, then we have that $S\left(m\right)+S\left(n\right)=S\left(n\right)+S\left(m\right)$

    1. 0+0=0+0:

      This is trivial by definition of addition.

    2. 0+n=n+0 implies 0+S\left(n\right)=S\left(n\right)+0:

      Suppose that 0+n=n+0, we show that 0+S\left(n\right)=S\left(n\right)+0. By the definition of addition we have that

      $$\begin{equation*} 0+S\left(n\right)=S\left(0+n\right) \end{equation*}$$ We know by assumption that 0+n=n+0. Hence

      $$\begin{equation*} S\left(0+n\right)=S\left(n+0\right)=S\left(n\right)=S\left(n\right)+0 \end{equation*}$$

    3. m+0=0+m implies S\left(m\right)+0=0+S\left(m\right):

      Suppose that m+0=0+m, we show that S\left(m\right)+0=0+S\left(m\right). By the definition of addition we have that

      $$\begin{equation*} S\left(m\right)+0=S\left(m\right)=S\left(m+0\right) \end{equation*}$$ We know by assumption that m+0=0+m. Hence

      $$\begin{equation*} S\left(m+0\right)=S\left(0+m\right)=0+S\left(m\right) \end{equation*}$$

    4. If m+x=x+m for all x\in\mathbb{N} and S\left(m\right)+n=n+S\left(m\right) for some n\in\mathbb{N}, then we have that S\left(m\right)+S\left(n\right)=S\left(n\right)+S\left(m\right):

      Suppose m+x=x+m for all x\in\mathbb{N} and that S\left(m\right)+n=n+S\left(m\right) for some n\in\mathbb{N}, we show that S\left(m\right)+S\left(n\right)=S\left(n\right)+S\left(m\right).

      We have

      $$\begin{equation*} S\left(m\right)+S\left(n\right)=S\left(S\left(m\right)+n\right) \end{equation*}$$ Now we have by assumption that S\left(m\right)+n=n+S\left(m\right), hence

      $$\begin{equation*} S\left(S\left(m\right)+n\right)=S\left(n+S\left(m\right)\right)=S\left(S\left(n+m\right)\right) \end{equation*}$$

      Likewise, a similar chain of reasoning, using the assumption m+x=x+m with x=S\left(n\right), gives

      $$\begin{equation*} S\left(n\right)+S\left(m\right)=S\left(S\left(n\right)+m\right)=S\left(m+S\left(n\right)\right)=S\left(S\left(m+n\right)\right) \end{equation*}$$ Finally, we have that n+m=m+n by assumption (taking x=n), and so $S\left(S\left(n+m\right)\right)=S\left(S\left(m+n\right)\right)$.

    Hence by the principle of double induction we have that a+b=b+a for all a,b\in\mathbb{N}. That is, addition is commutative.

  2. ab=ba:

    We need to show that

    1. $0*0=0*0$

    2. 0*n=n*0 implies $0*S\left(n\right)=S\left(n\right)*0$

    3. m*0=0*m implies $S\left(m\right)*0=0*S\left(m\right)$

    4. If m*x=x*m for all x\in\mathbb{N} and S\left(m\right)*n=n*S\left(m\right) for some n\in\mathbb{N}, then we have that $S\left(m\right)*S\left(n\right)=S\left(n\right)*S\left(m\right)$

    1. 0*0=0*0:

      This is trivial by the definition of multiplication.

    2. 0*n=n*0 implies 0*S\left(n\right)=S\left(n\right)*0:

      Suppose that 0*n=n*0, we show that 0*S\left(n\right)=S\left(n\right)*0. We have by definition of multiplication that

      $$\begin{align*} 0*S\left(n\right)&=0*n+0\\ &=n*0+0,\ \text{By assumption}\\ &=0+0,\ \text{By definition of multiplication}\\ &=0,\ \text{By definition of addition}\\ &=S\left(n\right)*0,\ \text{By definition of multiplication} \end{align*}$$

    3. m*0=0*m implies S\left(m\right)*0=0*S\left(m\right):

      Suppose that m*0=0*m, we show that S\left(m\right)*0=0*S\left(m\right). We have by definition of multiplication that

      $$\begin{align*} 0*S\left(m\right)&=0*m+0\\ &=m*0+0,\ \text{By assumption}\\ &=0+0,\ \text{By definition of multiplication}\\ &=0,\ \text{By definition of addition}\\ &=S\left(m\right)*0,\ \text{By definition of multiplication} \end{align*}$$

    4. If m*x=x*m for all x\in\mathbb{N} and S\left(m\right)*n=n*S\left(m\right) for some n\in\mathbb{N}, then we have that S\left(m\right)*S\left(n\right)=S\left(n\right)*S\left(m\right):

      Suppose that m*x=x*m for all x\in\mathbb{N} and S\left(m\right)*n=n*S\left(m\right) for some n\in\mathbb{N}, we show S\left(m\right)*S\left(n\right)=S\left(n\right)*S\left(m\right). By definition of multiplication we have that

      $$\begin{equation*} S\left(m\right)*S\left(n\right)=S\left(m\right)*n+S\left(m\right)=n*S\left(m\right)+S\left(m\right)=n*m+n+S\left(m\right)=n*m+S\left(n+m\right) \end{equation*}$$

      Likewise, we have that $$\begin{equation*} S\left(n\right)*S\left(m\right)=S\left(n\right)*m+S\left(n\right)=m*S\left(n\right)+S\left(n\right)=m*n+m+S\left(n\right)=m*n+S\left(m+n\right) \end{equation*}$$ Now, we know that addition is commutative so we have that S\left(m+n\right)=S\left(n+m\right), moreover by assumption we have that n*m=m*n. Hence

      $$\begin{equation*} n*m+S\left(n+m\right)=m*n+S\left(m+n\right) \end{equation*}$$

    Hence by the principle of double induction we have that ab=ba for all a,b\in\mathbb{N}. That is, multiplication is commutative.

The result now follows. $\qed$ :::

We can now deduce further properties of addition and multiplication.

Associativity of addition

 
This section proves that a+\left(b+c\right)=\left(a+b\right)+c for all a,b,c\in\mathbb{N}.

::: theorem Theorem 10. Addition is associative

For all a,b,c\in\mathbb{N} we have that

$$\begin{equation*} a+\left(b+c\right)=\left(a+b\right)+c \end{equation*}$$

Proof: We can show this by induction. Let x,y\in\mathbb{N} be arbitrary, and let P\left(n\right) be the proposition given by

$$\begin{equation*} \left(x+y\right)+n=x+\left(y+n\right) \end{equation*}$$

For the base case we have n=0 and so

$$\begin{align*} \left(x+y\right)+0&=x+y,\ \text{By definition of addition}\\ &=x+\left(y+0\right),\ \text{By definition of addition} \end{align*}$$

Hence P\left(0\right) is true.

Now, suppose that P\left(n\right) is true, that is

$$\begin{equation*} \left(x+y\right)+n=x+\left(y+n\right) \end{equation*}$$ We show that P\left(S\left(n\right)\right) is also true, that is

$$\begin{equation*} \left(x+y\right)+S\left(n\right)=x+\left(y+S\left(n\right)\right) \end{equation*}$$

Now, we have that

$$\begin{align*} \left(x+y\right)+S\left(n\right)&=S\left(\left(x+y\right)+n\right),\ \text{By definition of addition}\\ &=S\left(x+\left(y+n\right)\right),\ \text{By the induction hypothesis}\\ &=x+S\left(y+n\right),\ \text{By definition of addition}\\ &=x+\left(y+S\left(n\right)\right),\ \text{By definition of addition} \end{align*}$$

Hence P\left(S\left(n\right)\right) is true.

It follows by mathematical induction that \forall a,b,c\in\mathbb{N} we have that a+\left(b+c\right)=\left(a+b\right)+c, that is addition is associative. $\qed$ :::

Multiplication distributes over addition

 
This section proves that a\left(b+c\right)=ab+ac and \left(a+b\right)c=ac+bc for all a,b,c\in\mathbb{N}.

::: theorem Theorem 11. Multiplication distributes over addition

For all a,b,c\in\mathbb{N} we have that

  1. $a\left(b+c\right)=ab+ac$

  2. $\left(b+c\right)a=ba+ca=ab+ac$

Proof:

We can be quick and prove both parts nearly simultaneously, as we have shown that multiplication is commutative. To do this we first show that for all a,b,c\in\mathbb{N} we have that a\left(b+c\right)=ab+ac.

Let a,b\in\mathbb{N} be arbitrary and we argue by induction on the proposition P\left(n\right) given by

$$\begin{equation*} a\left(b+n\right)=ab+an \end{equation*}$$

For the base case n=0 we have that

$$\begin{align*} a\left(b+0\right)&=ab,\ \text{By definition of addition}\\ &=ab+0,\ \text{By definition of addition}\\ &=ab+a*0,\ \text{By definition of multiplication} \end{align*}$$

Hence P\left(0\right) is true.

Now suppose that P\left(n\right) is true, that is to say

$$\begin{equation*} a\left(b+n\right)=ab+an \end{equation*}$$

We show that P\left(S\left(n\right)\right) is true, that is

$$\begin{equation*} a\left(b+S\left(n\right)\right)=ab+aS\left(n\right) \end{equation*}$$

Indeed, we have that

$$\begin{align*} a\left(b+S\left(n\right)\right)&=a\left(S\left(b+n\right)\right),\ \text{By definition of addition}\\ &=a\left(b+n\right)+a,\ \text{By definition of multiplication}\\ &=ab+an+a,\ \text{By assumption}\\ &=ab+aS\left(n\right),\ \text{By definition of multiplication} \end{align*}$$

Hence P\left(S\left(n\right)\right) is true.

It hence follows by the principle of mathematical induction that \forall a,b,c\in\mathbb{N} we have that a\left(b+c\right)=ab+ac.

Now, we have shown that a\left(b+c\right)=ab+ac, to see that \left(b+c\right)a=ba+ca=ab+ac we simply observe that

$$\begin{align*} \left(b+c\right)a&=a\left(b+c\right),\ \text{Multiplication is commutative}\\ &=ab+ac,\ \text{By part 1 of the theorem}\\ &=ba+ca,\ \text{Multiplication is commutative} \end{align*}$$

As required. $\qed$ :::

Associativity of multiplication

 
This section proves that a\left(bc\right)=\left(ab\right)c for all a,b,c\in\mathbb{N}.

::: theorem Theorem 12. For all a,b,c\in\mathbb{N} we have that $a\left(bc\right)=\left(ab\right)c$

Proof:

We again show this by induction. Let x,y\in\mathbb{N} be arbitrary, and let P\left(n\right) be the proposition given by

$$\begin{equation*} \left(xy\right)n=x\left(yn\right) \end{equation*}$$

For the base case we have n=0 and so

$$\begin{align*} \left(xy\right)*0&=0,\ \text{By definition of multiplication}\\ &=x*0,\ \text{By definition of multiplication}\\ &=x\left(y*0\right),\ \text{By definition of multiplication} \end{align*}$$

Hence P\left(0\right) is true.

Now, suppose that P\left(n\right) is true, that is

$$\begin{equation*} \left(xy\right)n=x\left(yn\right) \end{equation*}$$

We show that P\left(S\left(n\right)\right) is also true, that is

$$\begin{equation*} \left(xy\right)S\left(n\right)=x\left(yS\left(n\right)\right) \end{equation*}$$

Now, we have that

$$\begin{align*} \left(xy\right)S\left(n\right)&=\left(xy\right)n+xy,\ \text{Definition of multiplication}\\ &=x\left(yn\right)+xy,\ \text{By assumption}\\ &=xy+x\left(yn\right),\ \text{Addition is commutative}\\ &=x\left(y+\left(yn\right)\right),\ \text{Multiplication is distributive over addition}\\ &=x\left(\left(yn\right)+y\right),\ \text{Addition is commutative}\\ &=x\left(yS\left(n\right)\right),\ \text{Definition of multiplication} \end{align*}$$

Hence P\left(S\left(n\right)\right) is true.

Hence, it follows by the principle of mathematical induction that for all a,b,c\in\mathbb{N} we have that a\left(bc\right)=\left(ab\right)c. $\qed$ :::

The Zero and Identity laws

 
These two laws allow us to note that adding zero to any natural number n gives back n and multiplying n by 1 gives n.

::: theorem Theorem 13. The zero and Identity laws

Let n\in\mathbb{N}. We have that

  1. $n+0=n=0+n$

  2. $1*n=n=n*1$

Proof:

By commutativity, it is enough to only prove

  1. $n+0=n$

  2. $n*1=n$

  1. n+0=n:

    This is true by the definition of addition.

  2. n*1=n:

    We have by the definition of multiplication that

    $$\begin{equation*} n*1=n*S\left(0\right)=n*0+n=0+n=n \end{equation*}$$ Where the last equality comes from the zero law and the fact that addition is commutative.

The result follows. $\qed$ :::
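The laws proved so far (commutativity, associativity, distributivity, and the zero and identity laws) can be spot-checked on a finite range. The sketch below assumes naturals are modelled by non-negative Python integers, with the successor written as `+ 1`; the names `add`, `mul` and `laws_hold` are hypothetical. Passing the check is only evidence for small values and does not replace the induction arguments.

```python
from itertools import product


def add(m: int, n: int) -> int:
    """m + 0 = m and m + S(k) = S(m + k); S is modelled as '+ 1' (an assumption)."""
    return m if n == 0 else add(m, n - 1) + 1


def mul(m: int, n: int) -> int:
    """m * 0 = 0 and m * S(k) = m * k + m."""
    return 0 if n == 0 else add(mul(m, n - 1), m)


def laws_hold(bound: int) -> bool:
    """Check the arithmetic laws for every triple (a, b, c) below the bound."""
    for a, b, c in product(range(bound), repeat=3):
        checks = (
            add(a, b) == add(b, a),                           # addition commutes
            mul(a, b) == mul(b, a),                           # multiplication commutes
            add(a, add(b, c)) == add(add(a, b), c),           # addition is associative
            mul(a, add(b, c)) == add(mul(a, b), mul(a, c)),   # distributivity
            mul(a, mul(b, c)) == mul(mul(a, b), c),           # multiplication is associative
            add(a, 0) == a == add(0, a),                      # zero law
            mul(1, a) == a == mul(a, 1),                      # identity law
        )
        if not all(checks):
            return False
    return True


result = laws_hold(6)
print(result)
```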

The cancellation laws

 
These laws allow us to deduce that if a+b=a+c then we must have b=c, and that if a\neq 0 and ab=ac then b=c.

::: theorem Theorem 14. The cancellation laws

Let a,b,c\in\mathbb{N}. We have that

  1. If a+b=a+c then we have b=c.

  2. For a\neq 0, if ab=ac then we have that $b=c$

Proof:

  1. If a+b=a+c then we have b=c:

    We argue by induction, let b,c\in\mathbb{N} be arbitrary and let P\left(n\right) be the proposition given by

    $$\begin{equation*} n+b=n+c \Rightarrow b=c \end{equation*}$$

    For the base case P\left(0\right) this holds trivially, as 0+b=b and 0+c=c. Now suppose the proposition P\left(n\right) holds, that is

    $$\begin{equation*} n+b=n+c \Rightarrow b=c \end{equation*}$$

    We show that P\left(S\left(n\right)\right) holds, that is

    $$\begin{equation*} S\left(n\right)+b=S\left(n\right)+c \Rightarrow b=c \end{equation*}$$

    Now, we have that

    $$\begin{align*} S\left(n\right)+b&=S\left(n\right)+c\\ S\left(n+0\right)+b&=S\left(n+0\right)+c\\ n+S\left(0\right)+b&=n+S\left(0\right)+c\\ n+\left(S\left(0\right)+b\right)&=n+\left(S\left(0\right)+c\right),\ \text{By associativity}\\ \left(S\left(0\right)+b\right)&=\left(S\left(0\right)+c\right),\ \text{By hypothesis, as } b,c \text{ are arbitrary in } P\left(n\right)\\ b+S\left(0\right)&=c+S\left(0\right),\ \text{By commutativity}\\ S\left(b+0\right)&=S\left(c+0\right)\\ S\left(b\right)&=S\left(c\right) \end{align*}$$

    Hence we have b=c by proposition 37. So P\left(S\left(n\right)\right) is true.

    Hence by mathematical induction we have that if a+b=a+c we must have that b=c.

  2. For a\neq 0, if ab=ac then we have that b=c:

    We again argue by induction, let b,c\in\mathbb{N} be arbitrary and let P\left(n\right) be the proposition given by

    $$\begin{equation*} nb=nc\Rightarrow b=c \end{equation*}$$

    We start the induction at n=1, as the case n=0 is excluded by the hypothesis a\neq 0. The base case P\left(1\right) holds by the identity law, as 1b=b and 1c=c. Now suppose that P\left(n\right) holds, that is

    $$\begin{equation*} nb=nc\Rightarrow b=c \end{equation*}$$

    We show that P\left(S\left(n\right)\right) is true, that is

    $$\begin{equation*} S\left(n\right)b=S\left(n\right)c\Rightarrow b=c \end{equation*}$$

    Indeed we have that

    $$\begin{align*} S\left(n\right)b&=S\left(n\right)c\\ bS\left(n\right)&=cS\left(n\right),\ \text{By commutativity}\\ bn+b&=cn+c,\ \text{By definition of multiplication}\\ a+b&=a+c,\ nb=nc \text{ by assumption, so let } nb=nc=a \text{ for some } a\\ b&=c,\ \text{By the cancellation law for addition} \end{align*}$$

    Hence P\left(S\left(n\right)\right) is true.

    Hence by mathematical induction we have that for a\neq 0 if ab=ac we must have that b=c.

As required. $\qed$ :::

Summation and product notation

Now that we have a well-defined notion of addition and multiplication we can define a shorthand that avoids writing out long chains of additions (or multiplications). We will require the following mapping. Let s\in\mathbb{N}^{n+1} be an ordered $n+1$-tuple of Natural numbers where s=\left(s_0,s_1,s_2,\dots,s_n\right) and define \mathbb{N}_n=\left\{0,1,2,3,\dots,n\right\}. Let f:\mathbb{N}_n\rightarrow\mathbb{N} be a mapping defined by

$$\begin{align*} f:\mathbb{N}_n&\rightarrow\mathbb{N}\ i&\mapsto f\left(i\right) =s_i \end{align*}$$

This is to say that f simply gets the value of s_i which is an element of the ordered tuple s.

::: definition Definition 72. Summation notation

Let s\in\mathbb{N}^{n+1} be an ordered $n+1$-tuple of Natural numbers where s=\left(s_0,s_1,s_2,\dots,s_n\right) and define \mathbb{N}_n=\left\{0,1,2,3,\dots,n\right\}. Let f:\mathbb{N}_n\rightarrow\mathbb{N} be a mapping defined by

$$\begin{align*} f:\mathbb{N}_n&\rightarrow\mathbb{N}\\ i&\mapsto f\left(i\right) =s_i \end{align*}$$

We define the summation notation by

$$\begin{equation*} \sum_{i=0}^n f\left(i\right)=f\left(0\right)+f\left(1\right)+f\left(2\right)+\dots+f\left(n\right) \end{equation*}$$ This can also be written as

$$\begin{equation*} \sum_{i=0}^n s_i=s_0+s_1+s_2+\dots+s_n \end{equation*}$$

We call i the index of the summation, i=0 the starting index of the summation and n the ending index of the summation. More generally, the summation may start at any starting index a\in\mathbb{N}. In the case that s is empty we define the summation to be 0 and call such a summation an empty sum.

We can also define the summation over a subset of \mathbb{N}_n, which allows for starting the summation at a point other than i=0. Let T\subseteq\mathbb{N}_n. We can define the summation over the set T by

$$\begin{equation*} \sum_{i\in T} s_i \end{equation*}$$

If we have a mapping g:\mathbb{N}\rightarrow\mathbb{N} then we can define a summation over g by $$\begin{equation*} \sum_{i\in T} g\left(s_i\right) \end{equation*}$$

Finally, we can define a summation over a predicate P\left(i\right) for i\in T giving

$$\begin{equation*} \sum_{P\left(i\right)} g\left(s_i\right) \end{equation*}$$ which means to take the sum of the g\left(s_i\right) where i satisfies the predicate P. If the predicate is not satisfied by any i then the summation is also said to be an empty summation and is given a value of 0.

In light of the definition of a summation over a predicate, if a>n, where a is the lower index of summation and n the upper index of summation, then the sum is by definition equal to 0. That is to say

$$\begin{equation*} \sum_{i=a}^n s_i = 0 ,\ \text{If } a>n \end{equation*}$$ :::

::: example Example 56. Let s=\left(2,3,4,8\right)\in\mathbb{N}^4 then we have that

$$\begin{equation*} \sum_{i=0}^3 s_i = 2+3+4+8 = 17 \end{equation*}$$ :::

::: example Example 57. Let g\left(n\right)=n and let k=4 then we have that

$$\begin{equation*} \sum_{i=0}^{k-1} g\left(i\right) = \sum_{i=0}^3 i = 0+1+2+3 = 6 \end{equation*}$$ :::

::: example Example 58. Let s_1\in\mathbb{N} then we have

$$\begin{equation*} \sum_{i=1}^1 s_i = s_1 \end{equation*}$$ :::

::: example Example 59. Let g\left(n\right) = n*n and let T=\left\{2,6,11\right\}\subseteq\mathbb{N}_{11} then

$$\begin{equation*} \sum_{i\in T} g\left(i\right) = g\left(2\right)+g\left(6\right)+g\left(11\right)=2*2+6*6+11*11=4+36+121=161 \end{equation*}$$ :::

::: example Example 60. Let g\left(n\right) = n, let P\left(n\right) be the predicate such that

$$\begin{equation*} P\left(n\right)=\begin{cases} 1,\ \text{If } n=2,4,6\\ 0,\ \text{Otherwise } \end{cases} \end{equation*}$$ Let T=\left\{2,6,11\right\}\subseteq\mathbb{N}_{11} then the sum over the i\in T that satisfy P\left(i\right) is given by

$$\begin{equation*} \sum_{P\left(i\right)} i = 2+6=8 \end{equation*}$$ :::

::: example Example 61. Let f\left(n\right)= n+5. Consider the sum

$$\begin{equation*} \sum_{i=3}^6 f\left(i\right) = \left(3+5\right)+\left(4+5\right)+\left(5+5\right)+\left(6+5\right)=8+9+10+11=38 \end{equation*}$$

We can re-express this sum as

$$\begin{equation*} \sum_{i=0}^3 f\left(i+3\right) = \left(\left(0+3\right)+5\right)+\left(\left(1+3\right)+5\right)+\left(\left(2+3\right)+5\right)+\left(\left(3+3\right)+5\right)=38 \end{equation*}$$

We have re-indexed the sum into an equivalent form. :::
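The different forms of summation (over an index range, over a set, over a predicate, and after re-indexing) can be sketched in a few lines. This is an illustration only: the helper `summation` and the variable names are hypothetical, Python integers stand in for natural numbers, and the "is even" predicate is an extra example that does not appear in the text.

```python
from typing import Iterable


def summation(terms: Iterable[int]) -> int:
    """Add up the terms one at a time; an empty collection gives the empty sum 0."""
    total = 0
    for term in terms:
        total = total + term
    return total


s = (2, 3, 4, 8)

# Sum over the index range i = 0, ..., 3.
sum_tuple = summation(s[i] for i in range(0, 4))

# Sum of g(i) = i * i over the index set T.
T = {2, 6, 11}
sum_squares = summation(i * i for i in T)

# Sum over a predicate: only the i in T that are even (illustrative predicate).
sum_even = summation(i for i in T if i % 2 == 0)

# Re-indexing: f(i) for i = 3, ..., 6 equals f(i + 3) for i = 0, ..., 3.
def f(n: int) -> int:
    return n + 5

original = summation(f(i) for i in range(3, 7))
reindexed = summation(f(i + 3) for i in range(0, 4))

# Lower index greater than upper index: the range is empty, so the sum is 0.
empty = summation(s[i] for i in range(5, 4))

print(sum_tuple, sum_squares, sum_even, original, reindexed, empty)
```

Starting the running total at 0 is what makes the empty sum come out as 0 without a special case.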

We can make some observations about summation notation.

::: {#prop:summation_properties_naturals .proposition} Proposition 38. Properties of summation notation

Let n,m\in\mathbb{N} such that m<n. Let s,t\in\mathbb{N}^{n+1} and let c\in\mathbb{N}. In addition define A=\mathbb{N}_m and B=\mathbb{N}_n\setminus A=\left\{m+1,m+2,\dots,n\right\} so that A\cup B =\mathbb{N}_n. Let a\in \mathbb{N} with a\leq m be a lower index of summation. We have that the following properties hold.

  1. $\displaystyle \sum_{i=0}^n s_i = \sum_{i\in A} s_i + \sum_{i\in B} s_i = \sum_{i=0}^m s_i + \sum_{i=m+1}^n s_i$

  2. $\displaystyle \sum_{i=a}^n s_i = \sum_{i=a}^m s_i + \sum_{i=m+1}^n s_i$

  3. $\displaystyle\sum_{i=1}^n c = c*n$

  4. $\displaystyle\sum_{i=1}^n c*s_i = c*\sum_{i=1}^n s_i$

  5. $\displaystyle\sum_{i=1}^n s_i+t_i = \sum_{i=1}^n s_i + \sum_{i=1}^n t_i$

Proof:

  1. \displaystyle \sum_{i=0}^n s_i = \sum_{i\in A} s_i + \sum_{i\in B} s_i = \sum_{i=0}^m s_i + \sum_{i=m+1}^n s_i:

    We argue by induction on n, with m fixed. As m<n, the induction starts at n=m+1. Let P\left(n\right) be the proposition given by

    $$\begin{equation*} \sum_{i=0}^n s_i = \sum_{i\in A} s_i + \sum_{i\in B} s_i = \sum_{i=0}^m s_i + \sum_{i=m+1}^n s_i \end{equation*}$$

    For the base case P\left(m+1\right) we have that A=\mathbb{N}_m and B=\left\{m+1\right\}. Hence by the definition of summation

    $$\begin{equation*} \sum_{i=0}^{m+1} s_i = \sum_{i=0}^m s_i + s_{m+1} \end{equation*}$$ Likewise we have

    $$\begin{equation*} \sum_{i\in A} s_i + \sum_{i\in B} s_i = \sum_{i=0}^m s_i+\sum_{i=m+1}^{m+1} s_i = \sum_{i=0}^m s_i + s_{m+1} \end{equation*}$$

    So the base case holds. Now suppose that P\left(n\right) holds, that is

    $$\begin{equation*} \sum_{i=0}^n s_i = \sum_{i\in A} s_i + \sum_{i\in B} s_i = \sum_{i=0}^m s_i + \sum_{i=m+1}^n s_i \end{equation*}$$

    we need to show that

    $$\begin{equation*} \sum_{i=0}^{n+1} s_i = \sum_{i=0}^m s_i + \sum_{i=m+1}^{n+1} s_i \end{equation*}$$ also holds. By definition we have that

    $$\begin{equation*} \sum_{i=0}^{n+1} s_i =s_0+s_1+s_2+\dots+s_n+s_{n+1}=\sum_{i=0}^n s_i + s_{n+1} \end{equation*}$$

    Now we have that

    $$\begin{align*} \sum_{i=0}^{n+1} s_i &= \sum_{i=0}^n s_i + s_{n+1}\\ &= \sum_{i=0}^m s_i + \sum_{i=m+1}^n s_i + s_{n+1},\ \text{By the induction hypothesis}\\ &= \sum_{i=0}^m s_i + \sum_{i=m+1}^{n+1} s_i,\ \text{By definition} \end{align*}$$

    Hence P\left(n+1\right) holds and the result follows by induction.

  2. \displaystyle \sum_{i=a}^n s_i = \sum_{i=a}^m s_i + \sum_{i=m+1}^n s_i :

    This follows by a similar argument to part 1, with every sum starting at the index a instead of 0.

  3. \displaystyle\sum_{i=1}^n c = c*n:

    We argue by induction on n. Let P\left(n\right) be the proposition given by

    $$\begin{equation*} \sum_{i=1}^n c = c*n \end{equation*}$$

    For the base case $P\left(1\right)$

    $$\begin{equation*} \sum_{i=1}^1 c = c = c*1 \end{equation*}$$

    Now suppose that P\left(n\right) holds we need to show that P\left(n+1\right) holds, that is

    $$\begin{equation*} \sum_{i=1}^{n+1} c = c*\left(n+1\right) \end{equation*}$$

    We have that $$\begin{equation*} \sum_{i=1}^{n+1} c = \sum_{i=1}^n c + c = c*n+c=c*S\left(n\right)=c*\left(n+1\right) \end{equation*}$$

    The result follows by induction.

  4. \displaystyle\sum_{i=1}^n c*s_i = c*\sum_{i=1}^n s_i:

    We have by definition of summation that

    $$\begin{equation*} \sum_{i=1}^n c*s_i=c*s_1+c*s_2+\dots+c*s_n \end{equation*}$$

    Now as multiplication distributes over addition we have

    $$\begin{equation*} \sum_{i=1}^n c*s_i=c*s_1+c*s_2+\dots+c*s_n = c*\left(s_1+s_2+\dots+s_n\right)=c*\sum_{i=1}^n s_i \end{equation*}$$

  5. \displaystyle\sum_{i=1}^n s_i+t_i = \sum_{i=1}^n s_i + \sum_{i=1}^n t_i:

    We argue by induction. Let P\left(n\right) be the proposition given by

    $$\begin{equation*} \sum_{i=1}^n s_i+t_i = \sum_{i=1}^n s_i + \sum_{i=1}^n t_i \end{equation*}$$

    For the base case P\left(1\right) we have that

    $$\begin{equation*} \sum_{i=1}^1 s_i+t_i = s_1+t_1 \end{equation*}$$

    Likewise we have

    $$\begin{equation*} \sum_{i=1}^1 s_i + \sum_{i=1}^1 t_i = s_1+t_1 \end{equation*}$$ So the base case holds. Now suppose P\left(n\right) holds, we need to show that P\left(n+1\right) holds. By definition we have

    $$\begin{equation*} \sum_{i=1}^{n+1} s_i+t_i = \sum_{i=1}^n s_i+ t_i +s_{n+1}+t_{n+1} = \sum_{i=1}^n s_i + \sum_{i=1}^n t_i +s_{n+1} +t_{n+1} \end{equation*}$$ where the last equality holds by the induction hypothesis. Now addition is commutative so we get

    $$\begin{equation*} \sum_{i=1}^{n+1} s_i+ t_i = \sum_{i=1}^n s_i + \sum_{i=1}^n t_i +s_{n+1} +t_{n+1}= \sum_{i=1}^n s_i + s_{n+1} + \sum_{i=1}^n t_i + t_{n+1} = \sum_{i=1}^{n+1} s_i + \sum_{i=1}^{n+1} t_i \end{equation*}$$

    The result follows by induction.

$\qed$ :::
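The five properties of the proposition can be checked on concrete tuples. The sketch below is an illustration under stated assumptions: the helper `summation`, the tuples `s` and `t`, and the values of `n`, `m` and `c` are all hypothetical choices, and a numerical check on one example is not a substitute for the induction proofs.

```python
from typing import Sequence


def summation(s: Sequence[int], lo: int, hi: int) -> int:
    """Return s[lo] + ... + s[hi]; the sum is empty (value 0) when lo > hi."""
    total = 0
    for i in range(lo, hi + 1):
        total = total + s[i]
    return total


# Example data: tuples indexed from 0 to n, with m < n and a <= m.
s = (5, 2, 3, 4, 8, 1)
t = (7, 1, 0, 6, 2, 9)
n, m, a, c = 5, 2, 1, 7

# Property 1: split the sum from 0 to n at the index m.
split_from_zero = summation(s, 0, n) == summation(s, 0, m) + summation(s, m + 1, n)

# Property 2: the same split when the sum starts at a.
split_from_a = summation(s, a, n) == summation(s, a, m) + summation(s, m + 1, n)

# Property 3: summing the constant c from 1 to n gives c * n.
constant = summation((c,) * (n + 1), 1, n) == c * n

# Property 4: a constant factor can be pulled out of the sum.
scaled = summation(tuple(c * x for x in s), 1, n) == c * summation(s, 1, n)

# Property 5: the sum of s_i + t_i is the sum of the s_i plus the sum of the t_i.
termwise = (
    summation(tuple(x + y for x, y in zip(s, t)), 1, n)
    == summation(s, 1, n) + summation(t, 1, n)
)

print(split_from_zero, split_from_a, constant, scaled, termwise)
```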

The summation notation allows us to deduce an additional property of multiplication.

::: {#prop:NaturalsHaveNoZeroDivisors .proposition} Proposition 39. Product of two naturals being zero implies one of the numbers is zero

Let a,b\in\mathbb{N}. If ab=0 then at least one of a or b is zero.

Proof:

Let a,b\in\mathbb{N} and let ab=0. If b=0 there is nothing to show, so suppose that b\neq 0. Using the summation notation we have that

$$\begin{equation*} ab=\sum_{i=1}^b a= \underbrace{a+a+a+\dots+a}_{b\text{ times}} = 0 \end{equation*}$$

This holds for a=0 and only for a=0. Indeed, suppose that a\neq 0, then as the sum has at least one term

$$\begin{equation*} \sum_{i=1}^b a= \underbrace{a+a+a+\dots+a}_{b\text{ times}} > 0 \end{equation*}$$

A contradiction to the hypothesis. Hence a=0.

A similar argument holds for \displaystyle ab = \sum_{i=1}^a b. Finally if both a and b are zero the result is trivial.

The result has been shown. $\qed$ :::

A similar definition can be made for multiplication, called product notation

::: definition Definition 73. Product notation

Let s\in\mathbb{N}^{n+1} be an ordered $n+1$-tuple of Natural numbers where s=\left(s_0,s_1,s_2,\dots,s_n\right) and define \mathbb{N}_n=\left\{0,1,2,3,\dots,n\right\}. Let f:\mathbb{N}_n\rightarrow\mathbb{N} be a mapping defined by

$$\begin{align*} f:\mathbb{N}_n&\rightarrow\mathbb{N}\\ i&\mapsto f\left(i\right) =s_i \end{align*}$$

We define the product notation by

$$\begin{equation*} \prod_{i=0}^n f\left(i\right)=f\left(0\right)*f\left(1\right)*f\left(2\right)*\dots*f\left(n\right) \end{equation*}$$ This can also be written as

$$\begin{equation*} \prod_{i=0}^n s_i=s_0*s_1*s_2*\dots*s_n \end{equation*}$$

We call i the index of the product, i=0 the lower starting point of the product and n the upper point of the product. More generally, the product may start at any lower index a\in\mathbb{N}. In the case that s is empty we define the product to be 1 and call such a product an empty product.

We can also define the product over a subset of \mathbb{N}_n, which allows for starting the product at a point other than i=0. Let T\subseteq\mathbb{N}_n. We can define the product over the set T by

$$\begin{equation*} \prod_{i\in T} s_i \end{equation*}$$

If we have a mapping g:\mathbb{N}\rightarrow\mathbb{N} then we can define a product over g by $$\begin{equation*} \prod_{i\in T} g\left(s_i\right) \end{equation*}$$

Finally, we can define a product over a predicate P\left(i\right) for i\in T giving

$$\begin{equation*} \prod_{P\left(i\right)} g\left(s_i\right) \end{equation*}$$ which means to take the product of the g\left(s_i\right) where i satisfies the predicate P. If the predicate is not satisfied by any i then the product is also said to be an empty product and is given a value of 1. In light of the definition of a product over a predicate, if a>n, where a is the lower index of the product and n the upper point of the product, then the product is by definition equal to 1. That is to say

$$\begin{equation*} \prod_{i=a}^n s_i = 1 ,\ \text{If } a>n \end{equation*}$$ :::

::: example Example 62. Let s=\left(2,3,4,8\right)\in\mathbb{N}^4 then we have that

$$\begin{equation*} \prod_{i=0}^3 s_i = 2*3*4*8 = 192 \end{equation*}$$ :::

::: example Example 63. Let g\left(n\right)=n and let k=4 then we have that

$$\begin{equation*} \prod_{i=0}^{k-1} g\left(i\right) = \prod_{i=0}^3 i = 0*1*2*3 = 0 \end{equation*}$$ :::

::: example Example 64. Let s_1\in\mathbb{N} then we have

$$\begin{equation*} \prod_{i=1}^1 s_i = s_1 \end{equation*}$$ :::

::: example Example 65. Let g\left(n\right) = n*n and let T=\left\{2,6,11\right\}\subseteq\mathbb{N}_{11} then

$$\begin{equation*} \prod_{i\in T} g\left(i\right) = g\left(2\right)*g\left(6\right)*g\left(11\right)=\left(2*2\right)*\left(6*6\right)*\left(11*11\right)=4*36*121=17424 \end{equation*}$$ :::

::: example Example 66. Let g\left(n\right) = n, let P\left(n\right) be the predicate such that

$$\begin{equation*} P\left(n\right)=\begin{cases} 1,\ \text{If } n=2,4,6\\ 0,\ \text{Otherwise } \end{cases} \end{equation*}$$ Let T=\left\{2,6,11\right\}\subseteq\mathbb{N}_{11} then the product over the i\in T that satisfy P\left(i\right) is given by

$$\begin{equation*} \prod_{P\left(i\right)} i = 2*6=12 \end{equation*}$$ :::
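Product notation can be sketched in the same way as summation, with the running value starting at 1 so that the empty product comes out as 1. The helper name `prod_of` and the variables below are hypothetical, and Python integers stand in for natural numbers.

```python
from typing import Iterable


def prod_of(terms: Iterable[int]) -> int:
    """Multiply the terms one at a time; an empty collection gives the empty product 1."""
    result = 1
    for term in terms:
        result = result * term
    return result


s = (2, 3, 4, 8)

# Product over the index range i = 0, ..., 3.
prod_tuple = prod_of(s[i] for i in range(0, 4))

# Product of g(i) = i * i over the index set T.
T = {2, 6, 11}
prod_squares = prod_of(i * i for i in T)

# Lower index greater than upper index: the range is empty, so the product is 1.
prod_empty = prod_of(s[i] for i in range(5, 4))

print(prod_tuple, prod_squares, prod_empty)
```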

There are some immediate properties of product notation.

::: proposition Proposition 40. Properties of product notation

Let n,m\in\mathbb{N} such that m<n. Let s,t\in\mathbb{N}^{n+1} and let c\in\mathbb{N}. In addition define A=\mathbb{N}_m and B=\mathbb{N}_n\setminus A=\left\{m+1,m+2,\dots,n\right\} so that A\cup B =\mathbb{N}_n. Let a\in \mathbb{N} with a\leq m be a lower index of the product. We have that the following properties hold.

  1. $\displaystyle \prod_{i=0}^n s_i = \prod_{i\in A} s_i * \prod_{i\in B} s_i = \prod_{i=0}^m s_i * \prod_{i=m+1}^n s_i$

  2. $\displaystyle \prod_{i=a}^n s_i = \prod_{i=a}^m s_i * \prod_{i=m+1}^n s_i$

  3. $\displaystyle\prod_{i=1}^n s_it_i = \prod_{i=1}^n s_i * \prod_{i=1}^n t_i$

Proof:

  1. \displaystyle \prod_{i=0}^n s_i = \prod_{i\in A} s_i * \prod_{i\in B} s_i = \prod_{i=0}^m s_i * \prod_{i=m+1}^n s_i:

    We argue by induction on n, with m fixed. As m<n, the induction starts at n=m+1. Let P\left(n\right) be the proposition given by

    $$\begin{equation*} \prod_{i=0}^n s_i = \prod_{i\in A} s_i * \prod_{i\in B} s_i = \prod_{i=0}^m s_i * \prod_{i=m+1}^n s_i \end{equation*}$$

    For the base case P\left(m+1\right) we have that A=\mathbb{N}_m and B=\left\{m+1\right\}. Hence by the definition of the product

    $$\begin{equation*} \prod_{i=0}^{m+1} s_i = \prod_{i=0}^m s_i * s_{m+1} \end{equation*}$$ Likewise we have

    $$\begin{equation*} \prod_{i\in A} s_i * \prod_{i\in B} s_i = \prod_{i=0}^m s_i * \prod_{i=m+1}^{m+1} s_i = \prod_{i=0}^m s_i * s_{m+1} \end{equation*}$$

    So the base case holds. Now suppose that P\left(n\right) holds, that is

    $$\begin{equation*} \prod_{i=0}^n s_i = \prod_{i=0}^m s_i * \prod_{i=m+1}^n s_i \end{equation*}$$

    we need to show that

    $$\begin{equation*} \prod_{i=0}^{n+1} s_i = \prod_{i=0}^m s_i * \prod_{i=m+1}^{n+1} s_i \end{equation*}$$ also holds. By definition we have that

    $$\begin{equation*} \prod_{i=0}^{n+1} s_i =s_0*s_1*s_2*\dots*s_n*s_{n+1}=\prod_{i=0}^n s_i * s_{n+1} \end{equation*}$$

    Now we have that

    $$\begin{align*} \prod_{i=0}^{n+1} s_i &= \prod_{i=0}^n s_i * s_{n+1}\\ &= \prod_{i=0}^m s_i * \prod_{i=m+1}^n s_i * s_{n+1},\ \text{By the induction hypothesis}\\ &= \prod_{i=0}^m s_i * \prod_{i=m+1}^{n+1} s_i,\ \text{By definition} \end{align*}$$

    Hence P\left(n+1\right) holds and the result follows by induction.

  2. \displaystyle \prod_{i=a}^n s_i = \prod_{i=a}^m s_i * \prod_{i=m+1}^n s_i:

    A similar argument as in part 1 shows this.

  3. \displaystyle\prod_{i=1}^n s_it_i = \prod_{i=1}^n s_i \prod_{i=1}^n t_i:

    We argue by induction. Let P\left(n\right) denote the proposition

    $$\begin{equation*} \prod_{i=1}^n s_it_i = \prod_{i=1}^n s_i * \prod_{i=1}^n t_i \end{equation*}$$

    In the base case P\left(1\right) we have

    $$\begin{equation*} \prod_{i=1}^1 s_it_i=s_1t_1 \end{equation*}$$

    Likewise

    $$\begin{equation*} \prod_{i=1}^1 s_i * \prod_{i=1}^1 t_i=s_1t_1 \end{equation*}$$

    Which shows the base case. Now suppose P\left(n\right) is true, we show that P\left(n+1\right) is true. We have that

    $$\begin{align*} \prod_{i=1}^{n+1} s_it_i&=\prod_{i=1}^n s_it_i * s_{n+1}t_{n+1},\ \text{By definition}\\ &=\prod_{i=1}^n s_i * \prod_{i=1}^n t_i * s_{n+1}t_{n+1},\ \text{By the induction hypothesis}\\ &=\prod_{i=1}^n s_i * s_{n+1} * \prod_{i=1}^n t_i * t_{n+1},\ \text{Multiplication is commutative and associative}\\ &=\prod_{i=1}^{n+1} s_i * \prod_{i=1}^{n+1} t_i,\ \text{By definition} \end{align*}$$ The result follows by induction.

$\qed$ :::

Exponentiation

 
With the product notation defined we can define another operation called exponentiation

::: definition Definition 74. Exponentiation of Natural numbers

Let \left(m,n\right)\in\mathbb{N}\times\mathbb{N} and let \wedge:\mathbb{N}\times\mathbb{N}\rightarrow\mathbb{N}. We define the exponentiation of m by n to be the product of n copies of m

$$\begin{align*} \wedge:\mathbb{N}\times\mathbb{N}&\rightarrow\mathbb{N}\\ \left(m,n\right)&\mapsto \wedge\left(m,n\right)=\begin{cases} 1, & \text{if } n=0\text{ and } m=0\\ 1, & \text{if } n=0\text{ and } m\neq 0\\ \displaystyle \prod_{i=1}^n m = 1 * \prod_{i=1}^n m, & \text{if } n\neq 0 \end{cases} \end{align*}$$

We will write \wedge\left(m,n\right) as m^n. We say that m is the base and n is the exponent. We sometimes say that m has been raised to the power of n. In the case that n=0 we have an empty product, which by definition has a value of 1. In particular this covers the case n=0 and m=0, so that 0^0=1. :::

With the above definition, we make a quick remark. We know that an empty product has a value of 1 and as multiplication by 1 doesn't change the value we can write exponentiation as

$$\begin{equation*} \prod_{i=1}^n m = 1 * \prod_{i=1}^n m \end{equation*}$$ This makes it clear that exponentiation is multiplication of 1 by n copies of m.

::: example Example 67. Let n=2 and m=2. As 2=S\left(1\right) we have that $$\begin{equation*} \wedge\left(2,2\right)=\prod_{i=1}^2 2 = 2*2 = 4 \end{equation*}$$ :::

::: example Example 68. Let m=4 and n=1 then we have that $$\begin{equation*} \wedge\left(4,1\right)=4^1=4 \end{equation*}$$ :::

::: example Example 69. Let m=5 and n=0 then we have that $$\begin{equation*} \wedge\left(5,0\right)=5^0=1 \end{equation*}$$ :::

::: example Example 70. Let m=2 and n=7 then we have that $$\begin{equation*} \wedge\left(2,7\right)=2^7=128 \end{equation*}$$ :::
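
The definition can be mirrored directly in code. The following is a minimal sketch, assuming Python's built-in integers stand in for the natural numbers; the function name `power` is our own and is not part of the text. It starts from the empty product 1 and multiplies by m exactly n times.

```python
def power(m: int, n: int) -> int:
    """Return m^n for natural numbers m and n by repeated multiplication."""
    if m < 0 or n < 0:
        raise ValueError("m and n must be natural numbers")
    result = 1  # the empty product has the value 1
    for _ in range(n):  # multiply by m exactly n times
        result = result * m
    return result


print(power(2, 2))  # 4
print(power(4, 1))  # 4
print(power(5, 0))  # 1
print(power(2, 7))  # 128
print(power(0, 0))  # 1, the empty product
```

Starting the loop from 1 is what makes the case n=0 fall out of the general rule without a separate branch.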

As we have defined a new operation we should check that the operation is meaningful

::: theorem Theorem 15. Exponentiation is closed

For all n,m\in\mathbb{N} we have that

$$\begin{equation*} \wedge\left(n,m\right)\in\mathbb{N} \end{equation*}$$

Proof:

There are two cases to consider, m=0 and m\neq 0. When m=0 the operation is defined such that

$$\begin{equation*} \wedge\left(n,0\right)=1 \end{equation*}$$

which is in \mathbb{N}. When m\neq 0 then \wedge\left(n,m\right) is a product of m copies of n, and so \wedge\left(n,m\right)\in\mathbb{N} as multiplication in \mathbb{N} is closed. $\qed$ :::

We should also verify that the other properties that we have verified for addition and multiplication either hold or do not. For example we can find examples that show that exponentiation is not commutative.

::: {#prop:ExponentiationOfNaturalsIsNotCommutative .proposition} Proposition 41. Exponentiation is non-commutative

There exist n,m\in\mathbb{N} such that

$$\begin{equation*} \wedge\left(n,m\right)\neq\wedge\left(m,n\right) \end{equation*}$$

Proof:

Let n=3 and m=4 then we have that

$$\begin{align*} \wedge\left(3,4\right)&=81\\ \wedge\left(4,3\right)&=64 \end{align*}$$

from which it is clear that 81\neq 64. $\qed$ :::

::: {#prop:ExponentiationOfNaturalsIsNotAssociative .proposition} Proposition 42. Exponentiation is non-associative

There exist a,b,c\in\mathbb{N} such that

$$\begin{equation*} \left(a^{b}\right)^{c}\neq a^{\left(b^c\right)} \end{equation*}$$

Proof:

Let a=2, b=3 and c=4 then we have that

$$\begin{align*} \left(2^{3}\right)^{4}&=8^4=4096\\ 2^{\left(3^4\right)}&=2^{81}=2417851639229258349412352 \end{align*}$$

Clearly 4096\neq 2417851639229258349412352. $\qed$ :::

The non-associativity of exponentiation shows an important point: the order in which we do exponentiation can give drastically different results, and so the order in which the exponentiation should be done will depend on the context. We will bracket each case as required. There is an interesting property for the case \left(a^{b}\right)^{c} called the power law of exponentiation.
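
The two counterexamples can be checked numerically. This is a small sketch using Python's built-in `**` operator as a stand-in for \wedge; the variable names are our own. Note that Python itself reads an unbracketed `2 ** 3 ** 4` as `2 ** (3 ** 4)`, which is one example of a context fixing the order.

```python
# Non-commutativity: 3^4 and 4^3 differ.
three_to_four = 3 ** 4
four_to_three = 4 ** 3
print(three_to_four, four_to_three)  # 81 64

# Non-associativity: (2^3)^4 and 2^(3^4) differ.
left_bracketed = (2 ** 3) ** 4
right_bracketed = 2 ** (3 ** 4)
print(left_bracketed)   # 4096
print(right_bracketed)  # 2417851639229258349412352

# Python's ** groups from the right when no brackets are given.
print(2 ** 3 ** 4 == right_bracketed)  # True
```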

::: {#prop:ExponentiationOfNaturalsPowerLaw .proposition} Proposition 43. Power law of exponentiation

Let a,b,c\in\mathbb{N}. We have that

$$\begin{equation*} \left(a^{b}\right)^{c}=a^{bc} \end{equation*}$$

Proof:

By definition of exponentiation we have that

$$\begin{equation*} \left(a^{b}\right)^{c}=\prod_{i=1}^c a^b = \prod_{i=1}^c \left(\prod_{j=1}^b a\right) \end{equation*}$$

That is we are taking the product of c copies of \prod_{j=1}^b a. The product \prod_{j=1}^b a itself is the product of b copies of a. We can therefore express the above by

$$\begin{align*} \left(a^{b}\right)^{c}&=\underbrace{\prod_{j=1}^b a*\prod_{j=1}^b a*\dots*\prod_{j=1}^b a}_{c\text{ times}}\\ &=\underbrace{\underbrace{\left(a*a*\dots*a\right)}_{b\text{ times}}*\underbrace{\left(a*a*\dots*a\right)}_{b\text{ times}}*\dots*\underbrace{\left(a*a*\dots*a\right)}_{b\text{ times}}}_{c\text{ times}} \end{align*}$$

There are therefore b*c copies of a multiplied together, as we have c copies of \prod_{j=1}^b a. Hence we have that

$$\begin{equation*} \left(a^{b}\right)^{c}=\underbrace{a*a*a*\dots*a}_{bc\text{ times}}=\prod_{i=1}^{bc} a = a^{bc} \end{equation*}$$

As required. $\qed$ :::

There are some additional properties that we can deduce. Consider 2^m for m\in\mathbb{N}, we have for m=0,1,2,3 and 4 that 2^0=1, 2^1=2, 2^2=4, 2^3=8 and 2^4=16. Notice that multiplying any 2^m by 2=2^1 adds one to the power. In fact multiplying any 2^m by 4=2^2 adds two to the power. It looks like the powers add together, that is 2^m*2^n=2^{m+n}. We can show this is true for bases other than 2.

::: {#prop:ExponentsOfSameNaturalNumberBaseAdd .proposition} Proposition 44. Multiplying exponents of same base adds the powers

Let a,m,n\in\mathbb{N}. We have that

$$\begin{equation*} a^n*a^m=a^{n+m} \end{equation*}$$

Proof:

Let a,n,m\in\mathbb{N}. If n=0 and m\geq 0 then a^n = 1 and we have that a^n*a^m=1*a^m=a^m=a^{0+m}=a^{n+m}. Likewise for the case m=0 and n\geq 0. So suppose that n>0 and m>0. We have by definition of exponentiation that

$$\begin{align*} a^n*a^m=\prod_{i=1}^n a * \prod_{i=1}^m a&=\underbrace{a*a*\dots*a}_{n\text{ times}}*\underbrace{a*a*\dots*a}_{m\text{ times}}\\ &=\underbrace{a*a*\dots*a}_{n+m\text{ times}} =a^{n+m} \end{align*}$$

as required. $\qed$ :::

We also have the following result that combines multiplying two numbers and raising that result to a power. As an example consider \left(2*3\right)^2= 6^2=36. Now consider 2^2=4 and 3^2=9 and we clearly have 4*9=36. The powers can come through to each of the numbers of the multiplication.

::: {#prop:ExponentiationPowerOfProductIsProductOfPowers .proposition} Proposition 45. Power of product is product of powers

Let a,b,n\in\mathbb{N}. We have that

$$\begin{equation*} \left(a*b\right)^n=a^n*b^n \end{equation*}$$

Proof:

If n=0 then \left(a*b\right)^0=1 by definition and a^0*b^0=1*1=1. So suppose that n>0 then we have that

$$\begin{align*} \left(a*b\right)^n=\prod_{i=1}^n \left(a*b\right) &= \underbrace{a*b*a*b*\dots*a*b}_{n\text{ times}}\\ &= \left(\underbrace{a*a*\dots*a}_{n\text{ times}}\right)*\left(\underbrace{b*b*\dots*b}_{n\text{ times}}\right),\ \text{By commutativity and associativity of multiplication}\\ &= a^n*b^n \end{align*}$$

The proposition has been shown. $\qed$ :::
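
As a sanity check (not a proof), the three laws above can be tested by brute force on small inputs. This is a sketch assuming Python's `**` agrees with \wedge on the natural numbers, including the convention 0^0=1; the function name `check_exponent_laws` and the bound `limit` are our own.

```python
from itertools import product


def check_exponent_laws(limit: int) -> bool:
    """Test the three exponent laws for all a, b, c in {0, ..., limit - 1}."""
    for a, b, c in product(range(limit), repeat=3):
        if (a ** b) ** c != a ** (b * c):  # power law
            return False
        if a ** b * a ** c != a ** (b + c):  # same base: exponents add
            return False
        if (a * b) ** c != a ** c * b ** c:  # power of a product
            return False
    return True


print(check_exponent_laws(6))  # True
```

A finite check like this can only catch counterexamples; the propositions themselves are what establish the laws for every natural number.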

Subtraction

We can define an operation that will allow us to at least partially undo addition. To define this operation we need to make use of the less than operator.

::: definition Definition 75. Subtraction of natural numbers

Let n,m\in\mathbb{N} such that m\leq n. Let d\in\mathbb{N} such that n=m+d. We define subtraction by

$$\begin{equation*} d=n-m \end{equation*}$$

We call d the difference between n and m. :::
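
The definition says that n-m is the natural number d with n=m+d, and that it only makes sense when m\leq n. The sketch below follows that wording literally by searching for d; the function name `subtract` is our own, and Python integers stand in for the natural numbers.

```python
def subtract(n: int, m: int) -> int:
    """Return the d with n = m + d, defined only for naturals with m <= n."""
    if n < 0 or m < 0:
        raise ValueError("n and m must be natural numbers")
    if m > n:
        raise ValueError("n - m is not defined in the naturals when m > n")
    d = 0
    while m + d != n:  # search for the difference d
        d += 1
    return d


print(subtract(8, 4))  # 4
print(subtract(5, 5))  # 0
```

Calling `subtract(3, 5)` raises a `ValueError`, which reflects that subtraction only partially undoes addition.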

There is an immediate result from the definition of subtraction

::: {#prop:NaturalAddDifference .proposition} Proposition 46. $a+\left(b-c\right)=\left(a+b\right)-c$

Let a,b,c\in\mathbb{N} with b\geq c. We have that

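$$\begin{equation*} a+\left(b-c\right)=\left(a+b\right)-c \end{equation*}$$

Proof:

We argue by induction on b. Let P\left(n\right) denote the proposition that for all c\in\mathbb{N} with c\leq n

$$\begin{equation*} a+\left(n-c\right)=\left(a+n\right)-c \end{equation*}$$

For the base case n=0 we have by definition c=0 and so

$$\begin{equation*} a+\left(0-0\right)=a=\left(a+0\right)-0 \end{equation*}$$

Now suppose that P\left(n\right) holds, we show that P\left(n+1\right) is true, that is for all c\leq n+1

$$\begin{equation*} a+\left(\left(n+1\right)-c\right)=\left(a+\left(n+1\right)\right)-c \end{equation*}$$

There are two cases to consider. If c=n+1 then \left(n+1\right)-c=0 and, as a+\left(n+1\right)=\left(n+1\right)+a, we have \left(a+\left(n+1\right)\right)-c=a, so both sides are equal to a. Otherwise c\leq n and, writing n=c+\left(n-c\right), we have n+1=c+\left(\left(n-c\right)+1\right), which is to say \left(n+1\right)-c=\left(n-c\right)+1. Applying the same observation to a+n in place of n gives \left(\left(a+n\right)+1\right)-c=\left(\left(a+n\right)-c\right)+1. Hence

$$\begin{align*} a+\left(\left(n+1\right)-c\right)&=a+\left(\left(n-c\right)+1\right)\\ &=\left(a+\left(n-c\right)\right)+1\\ &=\left(\left(a+n\right)-c\right)+1,\ \text{By the induction hypothesis}\\ &=\left(\left(a+n\right)+1\right)-c\\ &=\left(a+\left(n+1\right)\right)-c \end{align*}$$

As required. $\qed$ :::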

We immediately see that subtraction is not commutative, that is a-b\neq b-a in general. In fact b-a is not even defined unless b\geq a, and when b>a the difference a-b is not defined, and vice versa. Likewise it is not associative as for example \left(8-4\right)-2=2 but 8-\left(4-2\right)=6. We do however retain the fact that multiplication is distributive over subtraction

::: proposition Proposition 47. Multiplication distributes over subtraction

Let a,b,c\in\mathbb{N} with b\geq c. We have that

  1. $a\left(b-c\right)=ab-ac$

  2. $\left(b-c\right)a=ba-ca=ab-ac$

Proof:

  1. a\left(b-c\right)=ab-ac:

    Let a\in\mathbb{N} be arbitrary. We argue by induction on n for the proposition P\left(n\right) given by

    $$\begin{equation*} a\left(n-m\right)=an-am \end{equation*}$$

    for all m\in\mathbb{N} with m\leq n. For the base case P\left(0\right) we have that n=m=0 and so

    $$\begin{equation*} a\left(0-0\right)=a*0=0=a*0-a*0 \end{equation*}$$

    showing the base case. Now suppose that P\left(n\right) holds, we show that P\left(n+1\right) is true, that is we show

    $$\begin{equation*} a\left(\left(n+1\right)-m\right)=a\left(n+1\right)-am \end{equation*}$$

    where m\leq \left(n+1\right). There are two cases to consider. If m=n+1 then we have

    $$\begin{equation*} a\left(\left(n+1\right)-m\right)=a*0=0=a\left(n+1\right)-am \end{equation*}$$

    Now suppose that m<\left(n+1\right), so that m\leq n and \left(n+1\right)-m=\left(n-m\right)+1. Then

    $$\begin{align*} a\left(\left(n+1\right)-m\right)&=a\left(\left(n-m\right)+1\right)\\ &=a\left(n-m\right)+a\\ &=\left(an-am\right)+a,\ \text{By the induction hypothesis}\\ &=\left(an+a\right)-am,\ \text{By proposition 46}\\ &=a\left(n+1\right)-am \end{align*}$$

    The result follows by induction.

  2. \left(b-c\right)a=ba-ca=ab-ac:

    As multiplication is commutative we have that

    $$\begin{align*} \left(b-c\right)a&=a\left(b-c\right)\\ &=ab-ac\\ &=ba-ca \end{align*}$$

The result follows. $\qed$ :::
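
The two propositions on subtraction can be spot-checked on small inputs. The following is a sketch only: `sub` is a hypothetical helper of our own that refuses to subtract when the result would leave the natural numbers, and `check_subtraction_rules` is likewise our own name.

```python
def sub(n: int, m: int) -> int:
    """Natural-number subtraction n - m, defined only when m <= n."""
    if m > n:
        raise ValueError("n - m is not defined when m > n")
    return n - m


def check_subtraction_rules(limit: int) -> bool:
    """Test a+(b-c) = (a+b)-c and a(b-c) = ab-ac for c <= b < limit."""
    for a in range(limit):
        for b in range(limit):
            for c in range(b + 1):  # only c <= b, so that b - c is defined
                if a + sub(b, c) != sub(a + b, c):
                    return False
                if a * sub(b, c) != sub(a * b, a * c):
                    return False
    return True


print(check_subtraction_rules(10))     # True
print(sub(sub(8, 4), 2), sub(8, sub(4, 2)))  # 2 6, so not associative
```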

The principle of strong induction

 
The final property of the natural numbers we shall look at is the principle of strong induction, although as we will see, this is actually equivalent to the usual induction. In the usual induction we assume that P\left(n\right) is true and show that P\left(n+1\right) is true. In strong induction we instead assume that P\left(n\right) is true for all n\leq k for some k\in\mathbb{N}, and we show that this implies that P\left(k+1\right) is true.

::: theorem Theorem 16. The principle of strong induction

Let P\left(n\right) be a proposition about a natural number n\in\mathbb{N}. Moreover, suppose that

  1. P\left(0\right) is true.

  2. \forall k\in\mathbb{N}:P\left(0\right)\wedge P\left(1\right)\wedge P\left(2\right)\wedge\dots\wedge P\left(k\right) all being true implies that P\left(k+1\right) is true.

If these two statements are true, we have that P\left(n\right) is true for any natural number n, and we say the proposition P\left(n\right) holds by the principle of strong mathematical induction.

Proof:

Define \Tilde{P}\left(n\right) to be the following proposition

$$\begin{equation*} \Tilde{P}\left(n\right)=P\left(0\right)\wedge P\left(1\right)\wedge P\left(2\right)\wedge\dots\wedge P\left(n\right) \end{equation*}$$

We show that \Tilde{P}\left(n\right) is true for all n\geq 0. By assumption 1. \Tilde{P}\left(0\right) is true as \Tilde{P}\left(0\right)=P\left(0\right). Now suppose that \Tilde{P}\left(n\right) is true for some n\in\mathbb{N}, that is

$$\begin{equation*} \Tilde{P}\left(n\right)=P\left(0\right)\wedge P\left(1\right)\wedge P\left(2\right)\wedge\dots\wedge P\left(n\right) \end{equation*}$$

is true, we show that \Tilde{P}\left(n+1\right) is true, that is

$$\begin{equation*} \Tilde{P}\left(n+1\right)=P\left(0\right)\wedge P\left(1\right)\wedge P\left(2\right)\wedge\dots\wedge P\left(n\right)\wedge P\left(n+1\right) \end{equation*}$$

is true. By assumption 2. we have that P\left(0\right)\wedge P\left(1\right)\wedge P\left(2\right)\wedge\dots\wedge P\left(n\right) being true implies that P\left(n+1\right) is true. Hence we have that

$$\begin{equation*} \Tilde{P}\left(n+1\right)=\Tilde{P}\left(n\right)\wedge P\left(n+1\right) \end{equation*}$$

is true.

Hence by the principle of mathematical induction we have that \Tilde{P}\left(n\right) is true for all n\geq 0. As \Tilde{P}\left(n\right) being true implies in particular that P\left(n\right) is true, we have that P\left(n\right) is true for all n\in\mathbb{N}. $\qed$ :::

As mentioned earlier, we said that strong induction and the usual induction are equivalent, we shall prove this. We used induction to prove strong induction so it is left to show that given the assumptions for strong induction, we can deduce the truth \forall n\in\mathbb{N} of the proposition P\left(n\right) only using induction.

::: theorem Theorem 17. Strong induction is equivalent to the usual induction

Suppose that the assumptions of strong induction hold. That is, suppose P\left(n\right) is a proposition about a natural number n\in\mathbb{N} and moreover suppose that

  1. P\left(0\right) is true.

  2. \forall k\in\mathbb{N}:P\left(0\right)\wedge P\left(1\right)\wedge P\left(2\right)\wedge\dots\wedge P\left(k\right) all being true implies that P\left(k+1\right) is true.

We have that the truth of P\left(n\right) for all n\in\mathbb{N} can be deduced using only regular induction.

Proof:

Let \Tilde{P}\left(n\right) be the proposition given by

$$\begin{equation*} \forall k\leq n\text{ we have } P\left(k\right) \text{ is true} \end{equation*}$$

We show by the principle of induction that

  1. \Tilde{P}\left(0\right) is true

  2. \Tilde{P}\left(n\right) being true implies \Tilde{P}\left(n+1\right) is true for any natural number n.

  1. \Tilde{P}\left(0\right) is true:

    To see this, we have that \Tilde{P}\left(0\right) is given by

    $$\begin{equation*} \forall k\leq 0\text{ we have } P\left(k\right) \text{ is true} \end{equation*}$$

    This clearly holds as the only natural number that is less than or equal to zero is zero. Hence P\left(0\right) is true and so \Tilde{P}\left(0\right).

  2. \Tilde{P}\left(n\right) being true implies \Tilde{P}\left(n+1\right) is true for any Natural number n:

    Suppose that \Tilde{P}\left(n\right) is true, that is

    $$\begin{equation*} \forall k\leq n\text{ we have } P\left(k\right) \text{ is true} \end{equation*}$$

    we show that \Tilde{P}\left(n+1\right) is true, that is

    $$\begin{equation*} \forall k\leq n+1\text{ we have } P\left(k\right) \text{ is true} \end{equation*}$$

    Let k\leq n+1 be a natural number, we have two cases to consider.

    1. If k<n+1 then we must have that k\leq n. As \Tilde{P}\left(n\right) is true by assumption, we conclude that P\left(k\right) is true.

    2. Now, the remaining case is k=n+1. As \Tilde{P}\left(n\right) is true we have that P\left(0\right)\wedge P\left(1\right)\wedge P\left(2\right)\wedge\dots\wedge P\left(n\right) are all true, and so by the second assumption of strong induction we conclude that P\left(n+1\right) is true.

    Hence, in both cases we conclude that P\left(k\right) is true, which is to say \Tilde{P}\left(n+1\right) is true.

Hence \Tilde{P}\left(n\right) is true for all n\in\mathbb{N} by mathematical induction, and in particular P\left(n\right) is true for all n\in\mathbb{N}. This is to say, strong induction can be proven using regular induction. $\qed$ :::

We have now shown the equivalence of regular and strong induction.
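
To see why the stronger hypothesis is convenient, consider a recursive computation that relies on the answer for arbitrary smaller inputs rather than just the previous one. The sketch below is an illustration of our own and is not taken from the text: `prime_factors` is a hypothetical helper that splits n\geq 2 into two smaller factors and recurses on both, so arguing that it terminates with a correct answer needs the claim for every smaller number, which is exactly the shape of strong induction.

```python
def prime_factors(n: int) -> list:
    """Return a list of primes whose product is n, for n >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    for d in range(2, n):
        if n % d == 0:
            # n = d * (n // d) with both factors strictly smaller than n,
            # so we rely on the result for arbitrary smaller inputs.
            return prime_factors(d) + prime_factors(n // d)
    return [n]  # no proper divisor was found, so n itself is prime


print(prime_factors(12))  # [2, 2, 3]
print(prime_factors(97))  # [97]
```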

The well-ordering principle

Consider the way we constructed the natural numbers, we started with one element 0=\emptyset, and built each element in turn by the successor function. That is

$$\begin{align*} 1&=S\left(0\right)=0\cup\left\{0\right\}\\ 2&=S\left(1\right)=1\cup\left\{1\right\}\\ 3&=S\left(2\right)=2\cup\left\{2\right\} \end{align*}$$

This is clearly constructing some form of ordering on \mathbb{N}, in particular we can consider this in two different ways. Firstly we can see the successor map under set inclusion, that is

$$\begin{equation*} 0\subset 1\subset 2\subset 3\subset 4\subset 5\subset\dots \end{equation*}$$

likewise we can consider this ordering in the more intuitive sense of the less than or equal to operator.

$$\begin{equation*} 0\leq 1\leq 2\leq 3\leq 4\leq 5\leq\dots \end{equation*}$$

This doesn't just hold for the entirety of \mathbb{N}. For example consider the set S=\left\{2,4,6,8\right\}, we have from the successor mapping that 2\subset 4\subset 6\subset 8, hence 2 is the smallest element of S, with respect to the inclusion of sets. We phrase this in the following theorem

::: {#thm:WOP .theorem} Theorem 18. Well-ordering principle

Let S\subseteq\mathbb{N} be a non-empty subset of \mathbb{N} with the possibility of being the entirety of \mathbb{N}. We have that \exists x\in S such that x is the smallest element of S with respect to set inclusion. This is to say \exists x\in S such that \forall y\in S we have x\subseteq y.

Proof:

If 0\in S then, as 0=\emptyset is by construction included in every natural number, 0 is the smallest element of S. It is therefore enough to show that any non-empty subset of \mathbb{N}\setminus\left\{ 0\right\} has a smallest element with respect to set inclusion. For this purpose we will define $M=\mathbb{N}\setminus\left\{ 0\right\}$

Let S\subseteq M and suppose that S has no smallest element with respect to set inclusion. We show that S must then be empty. Define T=\mathbb{N}\setminus S to be the complement of S in \mathbb{N}. We argue by strong induction that every natural number is in T.

As S\subseteq M we have that 0\not\in S and so 0\in T.

Now suppose that every k\in\mathbb{N} such that k\leq n is in T. If n+1\in S then it would be the smallest element of S, as every natural number less than n+1 is in the complement of S. This contradicts the assumption on S, hence n+1\in T. This implies that every element of \mathbb{N} is in T by strong induction.

It follows that S=\emptyset. Hence every non-empty subset of \mathbb{N} has a smallest element, which is the result. $\qed$ :::

We have shown in some sense that \mathbb{N} is well-ordered. We will see that the idea of well-ordering is an example of a so-called relation.
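
The example S=\left\{2,4,6,8\right\} can be reproduced with the set-theoretic construction of the natural numbers. The sketch below assumes Python `frozenset` values model the sets n=\left\{0,1,\dots,n-1\right\}; the names `von_neumann` and `smallest_by_inclusion` are our own.

```python
def von_neumann(n: int) -> frozenset:
    """Build the natural number n as a set, starting from 0 = empty set."""
    current = frozenset()  # 0 is the empty set
    for _ in range(n):
        current = current | frozenset([current])  # S(k) = k ∪ {k}
    return current


def smallest_by_inclusion(subset):
    """Return an x in subset with x ⊆ y for every y in subset, or None."""
    for x in subset:
        if all(x <= y for y in subset):  # <= on frozensets is ⊆
            return x
    return None


S = {von_neumann(k) for k in (2, 4, 6, 8)}
least = smallest_by_inclusion(S)
print(least == von_neumann(2))        # True
print(smallest_by_inclusion(set()))   # None, the empty set has no least element
```

The second call shows why the subset has to be non-empty for a smallest element to exist.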

Rules for the inequality operators

Now that we have a firm grasp of the natural numbers we can deduce some properties that relate to inequalities. In the natural numbers, there are a few results which can be deduced.

::: {#prop:InequalityNaturalNumbers .proposition} Proposition 48. Properties of inequalities for natural numbers

Let a,b,c\in\mathbb{N}. We have the following properties for inequalities

  1. a\leq b is the same as $b\geq a$

  2. a<b is the same as $b>a$

  3. If a\leq b and b\leq c then $a\leq c$

  4. If a<b and b\leq c then $a<c$

  5. If a\leq b and b<c then $a<c$

  6. If a< b and b<c then $a<c$

  7. If a\geq b and b\geq c then $a\geq c$

  8. If a>b and b\geq c then $a>c$

  9. If a\geq b and b>c then $a>c$

  10. If a>b and b>c then $a>c$

  11. If a<b then $a+c<b+c$

  12. If a\leq b then $a+c\leq b+c$

  13. If a>b then $a+c>b+c$

  14. If a\geq b then $a+c\geq b+c$

  15. If a<b and c\neq 0 then $ac<bc$

  16. If a\leq b then $ac\leq bc$

  17. If a>b and c\neq 0 then $ac>bc$

  18. If a\geq b then $ac\geq bc$

Proof:

  1. a\leq b is the same as b\geq a:

    Suppose that a\leq b then by definition of a\leq b we have that a\subseteq b. We then clearly have that b\not\subset a and so either b>a by definition or b=a. In other words b\geq a.

  2. a<b is the same as b>a:

    Similar to the first part. If a<b then by definition a is a strict subset of b, that is a\subset b. If a is a strict subset of b then b\not\subset a by definition of a subset. Hence b>a by definition of greater than.

  3. If a\leq b and b\leq c then a\leq c:

    Suppose that a\leq b and b\leq c. By definition, we have that a\subseteq b and b\subseteq c and so by proposition 2{reference-type="ref" reference="prop:SetInclusionTransitivityProp"} we have a\subseteq c which is to say a\leq c.

  4. If a<b and b\leq c then a<c:

    As a<b and b\leq c then a\subset b and b\subseteq c. Applying proposition 3{reference-type="ref" reference="prop:ProperSetInclusionTransitivityProp"} gives a\subset c and so a<c.

  5. If a\leq b and b<c then a<c:

    Similar to part 4. As a\leq b then a\subseteq b and likewise as b<c then b\subset c. Applying proposition 3{reference-type="ref" reference="prop:ProperSetInclusionTransitivityProp"} gives a\subset c and hence a<c.

  6. If a<b and b<c then a<c:

    Similar to part 4. and 5. As a<b then a\subset b and likewise as b<c then b\subset c. By proposition 4{reference-type="ref" reference="prop:ProperSetSubSetInclusionNotTransitivity"} we have that a\subset c and hence a<c.

  7. If a\geq b and b\geq c then a\geq c:

    By the first part of the proposition we have that a\geq b and b\geq c then a\geq c is the same as b\leq a and c\leq b then c\leq a, and so part 3. of the proposition applies.

  8. If a>b and b\geq c then a>c:

    Applying part 2. of this proposition to a>b and a>c and part 1. to b\geq c gives the equivalent statement c\leq b and b<a then c<a, and so part 5. of the proposition applies.

  9. If a\geq b and b>c then a>c:

    Applying part 1. of this proposition to a\geq b and part 2. to b>c and a>c gives the equivalent statement c<b and b\leq a then c<a, and so part 4. of the proposition applies.

  10. If a>b and b>c then a>c:

    Applying part 2. to a>b, b>c and a>c gives the equivalent statement c<b and b<a then c<a and so part 6. applies.

  11. If a<b then a+c<b+c:

    Suppose that a<b, then a\subset b. We argue by induction on c that \left(a+c\right)\subset \left(b+c\right).

    Let P\left(c\right) be the proposition given by

    $$\begin{equation*} \left(a+c\right)\subset \left(b+c\right) \end{equation*}$$

    For the base case c=0 we have a+0=a and b+0=b, and so a+0\subset b+0 by hypothesis. Hence P\left(0\right) is true.

    So suppose that P\left(c\right) is true, that is to say

    $$\begin{equation*} \left(a+c\right)\subset \left(b+c\right) \end{equation*}$$

    We need to show that P\left(c+1\right)=P\left(S\left(c\right)\right) is true. That is

    $$\begin{equation*} \left(a+S\left(c\right)\right)\subset \left(b+S\left(c\right)\right) \end{equation*}$$

    We know from the definition of addition that \forall m,n\in\mathbb{N}

    $$\begin{equation*} m+S\left(n\right)=S\left(m+n\right) \end{equation*}$$

    Hence a+S\left(c\right)=S\left(a+c\right) and b+S\left(c\right)=S\left(b+c\right), and so it is enough to show that

    $$\begin{equation*} S\left(a+c\right)\subset S\left(b+c\right) \end{equation*}$$

    By the induction hypothesis, we know that a+c\subset b+c. Let x,y\in\mathbb{N} with x=a+c and y=b+c, so that x\subset y. Then we have to show that

    $$\begin{equation*} S\left(x\right)\subset S\left(y\right) \end{equation*}$$

    As x and y are natural numbers with x\subset y, which is to say x<y, we have that x\in y and so \left\{x\right\}\subseteq y. Hence

    $$\begin{equation*} S\left(x\right)=x\cup\left\{x\right\}\subseteq y\subset y\cup\left\{y\right\}=S\left(y\right) \end{equation*}$$

    where the last inclusion is strict as y\in S\left(y\right) but y\not\in y. Hence P\left(S\left(c\right)\right)=P\left(c+1\right) holds.

    The result follows by induction. Therefore a+c\subset b+c for all c\in\mathbb{N} and therefore a+c<b+c.

  12. If a\leq b then a+c\leq b+c:

    Suppose that a\leq b. If a<b then by part 11. we have a+c<b+c. So suppose that a=b then we must have that a+c=b+c and so by definition a+c\leq b+c.

  13. If a>b then a+c>b+c:

    Applying part 2. of the proposition give the equivalent statement of b< a then b+c< a+c and so we can apply part 11.

  14. If a\geq b then a+c\geq b+c:

    Applying part 1. of the proposition give the equivalent statement of b\leq a then b+c\leq a+c and so we can apply part 12.

  15. If a<b and c\neq 0 then ac<bc:

    Suppose that a<b, then a\subset b. Note that c\neq 0 is required, as for c=0 we have a*0=0=b*0. We argue by induction on c\geq 1 that ac\subset bc.

    Let P\left(c\right) be the proposition given by

    $$\begin{equation*} \left(ac\right)\subset \left(bc\right) \end{equation*}$$

    For the base case c=1 we have a*1=a and b*1=b, and so a*1\subset b*1 by hypothesis. Hence P\left(1\right) is true.

    So suppose that P\left(c\right) is true for some c\geq 1, that is to say

    $$\begin{equation*} \left(ac\right)\subset \left(bc\right) \end{equation*}$$

    We need to show that P\left(c+1\right)=P\left(S\left(c\right)\right) is true. That is

    $$\begin{equation*} \left(aS\left(c\right)\right)\subset \left(bS\left(c\right)\right) \end{equation*}$$

    We know from the definition of multiplication that \forall m,n\in\mathbb{N}

    $$\begin{equation*} m*S\left(n\right)=m*n+m \end{equation*}$$

    Hence it is enough to show that

    $$\begin{equation*} ac+a\subset bc+b \end{equation*}$$

    By the induction hypothesis, we know that ac<bc and so by part 11. we conclude that ac+a<bc+a. As a<b, part 11. together with the commutativity of addition gives bc+a<bc+b. By part 6. we conclude that ac+a<bc+b, which is to say aS\left(c\right)\subset bS\left(c\right). Hence P\left(S\left(c\right)\right)=P\left(c+1\right) is true and the result follows by induction. Hence we conclude that ac\subset bc for all c\neq 0.

  16. If a\leq b then ac\leq bc:

    Suppose a\leq b. If c=0 then ac=0=bc and so ac\leq bc. Otherwise c\neq 0, and if a<b we apply part 15. to get ac<bc, while if a=b then ac=bc. In each case ac\leq bc.

  17. If a>b and c\neq 0 then ac>bc:

    Applying part 2. of the proposition gives the equivalent statement of b< a and c\neq 0 then bc<ac and so we apply part 15. of the proposition.

  18. If a\geq b then ac\geq bc:

    Applying part 1. of the proposition gives the equivalent statement of b\leq a then bc\leq ac and so we apply part 16. of the proposition.

The result has been shown. $\qed$ :::
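
The rules relating the order to addition and multiplication can be spot-checked on small inputs. This is a sketch using Python integers and their built-in comparisons in place of the set-theoretic definitions; the function name `check_order_rules` is our own. The strict multiplication rule is only checked for c\neq 0, since for c=0 both products are 0.

```python
from itertools import product


def check_order_rules(limit: int) -> bool:
    """Test the addition and multiplication rules for a, b, c < limit."""
    for a, b, c in product(range(limit), repeat=3):
        if a < b and not (a + c < b + c):
            return False
        if a <= b and not (a + c <= b + c):
            return False
        if a <= b and not (a * c <= b * c):
            return False
        if a < b and c != 0 and not (a * c < b * c):
            return False
    return True


print(check_order_rules(8))  # True
print(1 * 0 < 2 * 0)         # False: strictness is lost when c = 0
```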

Cardinality, countability, relations

::: epigraph God created infinity, and man, unable to understand infinity, had to invent finite sets.

Gian-Carlo Rota :::

Cardinality

In the previous chapter when constructing the natural numbers, we made continuous reference to the idea that the number 1=\left\{\emptyset\right\} is somehow the set that contains a single element, 3=\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\} somehow contains three individual elements. We can make this a rigorous definition, to do so we will use the idea we have been using all along. This is to say, the natural number n is a set that has n elements.

::: definition Definition 76. Cardinality of a natural number

We define the cardinality of a natural number n\in\mathbb{N}, which we will denote \left|n\right| to be the same as the identity mapping. That is to say

$$\begin{align*} \left|\cdot\right|:\mathbb{N}&\rightarrow\mathbb{N}\\ n&\mapsto\left|n\right|=n \end{align*}$$ :::

::: example Example 71. Consider 1 and 3 from before. We have that

$$\begin{equation*} \left|1\right|=\left|\left\{\emptyset\right\}\right|=1 \end{equation*}$$

and

$$\begin{equation*} \left|3\right|=\left|\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}\right|=3 \end{equation*}$$

Indeed, we have captured the essence of that intuitive idea that a natural number n is a set that has n elements. :::
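
The claim that the natural number n is a set with n elements can be checked on a model of the construction. This is a sketch assuming Python `frozenset` values represent the sets in question; the helper name `von_neumann` is our own.

```python
def von_neumann(n: int) -> frozenset:
    """Build the natural number n as a set, starting from 0 = empty set."""
    current = frozenset()  # 0 is the empty set
    for _ in range(n):
        current = current | frozenset([current])  # S(k) = k ∪ {k}
    return current


one = von_neumann(1)
three = von_neumann(3)
print(one == frozenset([frozenset()]))  # True, 1 = {∅}
print(len(one), len(three))             # 1 3
```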

Now that we have a notion of size for natural numbers, we can extend this idea to sets in general. In particular, how many elements does a given set have? To build this idea we will also be making use of mappings. For an example, suppose we have the set S=\left\{2,4,6,8\right\}. Intuitively we know that this is a set which has four elements, and by our definition above we know that \left|4\right|=\left|\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\},\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}\right\}\right|=4, so 4 is a set that contains 4 elements. Now, consider the mapping f:S\rightarrow 4, given by

$$\begin{align*} f\left(2\right)&=\emptyset\\ f\left(4\right)&=\left\{\emptyset\right\}\\ f\left(6\right)&=\left\{\emptyset,\left\{\emptyset\right\}\right\}\\ f\left(8\right)&=\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\} \end{align*}$$

With f being defined as it is, we conclude that f is a bijection. Hence we know that each element x\in S maps exactly to one and only one of the elements y\in 4. This somehow tells us that S is a set which has 4 elements, as 4 is a set which contains 4 elements. We can make this a definition of the size of a set, and in doing so define the notion of a finite and "Infinite" set

::: definition Definition 77. Cardinality of a set

Let S be a set.

  1. Suppose that n\in\mathbb{N}. We define the cardinality of the set S to be n if and only if there exists a bijective mapping f:S\rightarrow n. We write this

    $$\begin{equation*} \left|S\right|=n \end{equation*}$$

    If such a mapping exists we say that S is a finite set of size n. We recall that each element of n is nothing but a set whose elements are sets.

  2. Suppose that f:S\rightarrow\mathbb{N} is a bijective mapping. We say that the cardinality of the set S is infinite. Informally we denote this by \infty, but formally we say that \left|S\right|=\aleph_0=\left|\mathbb{N}\right|, where \aleph_0 is pronounced Aleph-Null.

  3. Suppose that f:S\rightarrow T is a bijective mapping. We say that the sets S and T have the same cardinality and write \left|S\right|=\left|T\right|. :::
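
The first part of the definition can be illustrated with the example S=\left\{2,4,6,8\right\}. In this sketch a Python `dict` plays the role of the mapping f, and the integers 0,1,2,3 are used as labels for the four elements of the set 4; the function name `is_bijection` is our own.

```python
def is_bijection(mapping: dict, domain: set, codomain: set) -> bool:
    """Check that mapping is a bijection from domain onto codomain."""
    if set(mapping) != set(domain):
        return False  # not defined on exactly the elements of the domain
    images = list(mapping.values())
    injective = len(set(images)) == len(images)  # no value is hit twice
    surjective = set(images) == set(codomain)    # every element is hit
    return injective and surjective


S = {2, 4, 6, 8}
four = {0, 1, 2, 3}            # labels for the elements of the set 4
f = {2: 0, 4: 1, 6: 2, 8: 3}   # the mapping from the example
g = {2: 0, 4: 0, 6: 2, 8: 3}   # not injective, so not a bijection

print(is_bijection(f, S, four))  # True, so |S| = 4
print(is_bijection(g, S, four))  # False
```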

This definition made reference to the idea of an "infinite" set. We know that the axiom of infinity gives us the existence of one infinite set. The infinite set that is defined by the axiom of infinity includes the natural numbers, but it also includes the so-called ordinal numbers. The natural numbers are what are called cardinal numbers, they refer to the size of collections of objects or the amount of some quantity, and they can also be used to list (enumerate) a collection. For example we can think of a race between 20 drivers. We must have that one driver comes first, another second, another third, and so on. Each driver can be listed using a number from 1 to 20 inclusive, alternatively a number from 0 to 19. When used in this way, the natural numbers order by enumeration the positions in which the drivers finished the race. Now, elements in the infinite set that are not natural numbers also have this property that they can be used to enumerate the finishing positions of race drivers, however to use them we would first have to go through every single natural number. The first such non-natural number ordinal is usually denoted \omega, and so to label something $\omega$-th, infinitely many things would have to come before.

This gets complicated quickly and as such we won't go into more details for now. Instead the idea that the natural numbers can be used for enumeration turns out to be a useful one, especially later down the line when we start considering sets like \mathbb{R}. For now, we are only interested in sets whose cardinality is either finite or \aleph_0 and we will continue the exploration of cardinality.

To continue this exploration we will need to relate the ideas of subsets to that of cardinality.

::: {#prop:ProperSubsetStrictlySmallarCard .proposition} Proposition 49. Proper subset of a finite set has strictly smaller cardinality

Let S and T be finite sets such that S\subset T, then we have that \left|S\right|<\left|T\right|.

Proof:

Let S and T be finite sets such that S\subset T with say \left|T\right|=n, we argue by induction on n, the cardinality of the set T.

Let P\left(n\right) be the proposition given by

$$\begin{equation*} \text{If }T\text{ is a finite set with } S\subset T \text{ and }\left|T\right|=n \text{ then } S\text{ is a finite set and } \left|S\right|<\left|T\right|=n \end{equation*}$$

We need to show that

  1. P\left(0\right) is true.

  2. If P\left(n\right) is true then P\left(n+1\right) is true.

  1. P\left(0\right) is true:

    We have that \left|T\right|=0 and so T=\emptyset. As T=\emptyset there are no proper subsets S\subset \emptyset, for if there were then T\neq\emptyset. Hence the base case is vacuously true.

  2. If P\left(n\right) is true then P\left(n+1\right) is true:

    Suppose that P\left(n\right) holds for some n\in\mathbb{N} which is the statement

    $$\begin{equation*} \text{If }T\text{ is a finite set with } S\subset T \text{ and }\left|T\right|=n \text{ then } S\text{ is a finite set and } \left|S\right|<\left|T\right|=n \end{equation*}$$ We need to show that P\left(n+1\right) also holds, that is we show that

    $$\begin{equation*} \text{If }T\text{ is a finite set with } S\subset T \text{ and }\left|T\right|=n+1 \text{ then } S\text{ is a finite set and } \left|S\right|<\left|T\right|=n+1 \end{equation*}$$

    So suppose that T is a finite set with \left|T\right|=n+1 for some n\in\mathbb{N} and that S\subset T. As S is a proper subset of T we know that \exists t\in T with t\not\in S. Hence we have that S\subseteq T\setminus\left\{t\right\}. We now need to show that $\left|T\setminus\left\{t\right\}\right|=n$.

    ::: lemma Lemma 4. Set of cardinality n+1 minus an element has cardinality $n$

    Let S be a finite set with cardinality n+1. Consider the set S\setminus\left\{s\right\} where s\in S is an arbitrary element of S. We have that \left|S\setminus\left\{s\right\}\right|=n

    Proof:

    We need to show that for the set S\setminus\left\{s\right\} that there exists a bijective mapping to a set of n elements. We know that S has cardinality n+1, hence there exists a bijection f:S\rightarrow n+1. We know by construction that n+1=n\cup\left\{n\right\}, hence we have that n=n+1\setminus\left\{n\right\}.

    Consider the mapping given by g defined as follows

    $$\begin{align*} g:S\setminus\left\{s\right\}&\rightarrow n=n+1\setminus\left\{n\right\}\\ x&\mapsto g\left(x\right)=\begin{cases} f\left(x\right) & \text{if }f\left(x\right)\neq \left\{n\right\}\\ f\left(s\right) & \text{if }f\left(x\right)=\left\{n\right\} \end{cases} \end{align*}$$

    This is to say g is a mapping that takes each x\in S and maps it to f\left(x\right) if f\left(x\right)\neq \left\{n\right\}\in n+1, that is if f doesn't map x to the removed element of the set n+1, otherwise if f does map an element x\in S to \left\{n\right\} then g maps x to whatever f takes the removed element s to.

    For example suppose that S=\left\{0,1,2\right\}, i.e we are considering the case n=2, let f:S\rightarrow 3 be the identity mapping, this is a bijection. Suppose we now consider S\setminus\left\{2\right\}=\left\{0,1\right\} and consider the mapping g:S\setminus\left\{2\right\}\rightarrow 2 given by

    $$\begin{align*} g:S\setminus\left\{2\right\}&\rightarrow 2=3\setminus\left\{2\right\}=\left\{\emptyset,\left\{\emptyset\right\}\right\}\\ x&\mapsto g\left(x\right)=\begin{cases} f\left(x\right) & \text{if }f\left(x\right)\neq \left\{2\right\}\\ f\left(2\right) & \text{if }f\left(x\right)=\left\{2\right\} \end{cases} \end{align*}$$ We have that g\left(0\right)=0=\emptyset and g\left(1\right)=1=\left\{\emptyset\right\}. We could have instead considered S\setminus\left\{1\right\}=\left\{0,2\right\} again with f being the identity mapping. We have that in this case g is the mapping given by

    $$\begin{align*} g:S\setminus\left\{1\right\}&\rightarrow 2=3\setminus\left\{2\right\}=\left\{\emptyset,\left\{\emptyset\right\}\right\}\\ x&\mapsto g\left(x\right)=\begin{cases} f\left(x\right) & \text{if }f\left(x\right)\neq \left\{2\right\}\\ f\left(1\right) & \text{if }f\left(x\right)=\left\{2\right\} \end{cases} \end{align*}$$ In this case we have that g\left(0\right)=0=\emptyset but g\left(2\right)=f\left(1\right)=1=\left\{\emptyset\right\}.

    Now, we need to show the general case where g is given by

    $$\begin{align*} g:S\setminus\left\{s\right\}&\rightarrow n=n+1\setminus\left\{n\right\}\\ x&\mapsto g\left(x\right)=\begin{cases} f\left(x\right) & \text{if }f\left(x\right)\neq \left\{n\right\}\\ f\left(s\right) & \text{if }f\left(x\right)=\left\{n\right\} \end{cases} \end{align*}$$ is a bijection.

    1. g is an injection:

      To see that g is an injection, suppose that x,y\in S\setminus\left\{s\right\} and that x\neq y. There are three cases to consider.

      1. f\left(x\right)\neq \left\{n\right\} and f\left(y\right)\neq\left\{n\right\}:

        We have by definition of the mapping g that f\left(x\right)=g\left(x\right) and f\left(y\right)=g\left(y\right). Moreover we know that f is a bijection and in particular an injection, hence as f\left(x\right)\neq f\left(y\right) we must have that g\left(x\right)\neq g\left(y\right)

      2. f\left(x\right)=\left\{n\right\}:

        By the definition of the mapping g we have that g\left(x\right)=f\left(s\right). Now, recall that y\in S\setminus\left\{s\right\}, thus it follows that y\neq s. Now, by the injectivity of f we have that f\left(y\right)\neq f\left(s\right)=g\left(x\right). Moreover by the injectivity of f we have that f\left(y\right)\neq \left\{n\right\}. It now follows by definition of g that

        $$\begin{equation*} g\left(y\right)=f\left(y\right)\neq f\left(s\right)=g\left(x\right) \end{equation*}$$ That is g\left(y\right)\neq g\left(x\right).

      3. f\left(y\right)=\left\{n\right\}:

        This is the same as f\left(x\right)=\left\{n\right\} except the roles of x and y are swapped, for completeness we give the details.

        By the definition of the mapping g we have that g\left(y\right)=f\left(s\right). Now, as x\in S\setminus\left\{s\right\} it follows that x\neq s. By the injectivity of f we have that f\left(x\right)\neq f\left(s\right)=g\left(y\right). Moreover by the injectivity of f we have that f\left(x\right)\neq \left\{n\right\}. It now follows by definition of g that

        $$\begin{equation*} g\left(x\right)=f\left(x\right)\neq f\left(s\right)=g\left(y\right) \end{equation*}$$ That is g\left(x\right)\neq g\left(y\right).

      This shows that g is an injection.

    2. g is a surjection:

      We need to show that \forall y\in n,\exists x\in S\setminus\left\{s\right\} such that g\left(x\right)=y. Let y\in n. We know that f is a bijection and in particular it is a surjection and so by definition we know we must have

      $$\begin{equation*} \forall y\in n+1,\exists x\in S : f\left(x\right)=y \end{equation*}$$

      Consider the definition of g. We know that g:S\setminus\left\{s\right\}\rightarrow n, hence to show that g is surjective we need to show that any y\in n has an element x'\in S\setminus\left\{s\right\} with g\left(x'\right)=y. As S\setminus\left\{s\right\} doesn't have the element s, we can't use x=s from the surjectivity of f to show surjectivity of g. That is, we need

      $$\begin{equation*} \forall y\in n=n+1\setminus\left\{n\right\},\exists x'\in S\setminus\left\{s\right\}: x'\neq s\text{ and } g\left(x'\right)=y \end{equation*}$$

      Finally, we need to consider f\left(s\right) and in particular the two cases of f\left(s\right)\neq\left\{n\right\} and f\left(s\right)=\left\{n\right\}, from the definition of g.

      1. f\left(s\right)\neq\left\{n\right\}:

        Suppose that f\left(s\right)\neq\left\{n\right\}. As f is a bijection we have that f is invertible, in particular we must have that f^{-1}\left(\left\{n\right\}\right)\neq s. There are two additional cases to consider now, f\left(s\right)=y=f\left(x\right) and f\left(s\right)\neq y=f\left(x\right).

        1. f\left(s\right)=y=f\left(x\right):

          Suppose that f\left(s\right)=y, by definition of g we have that

          $$\begin{equation*} g\left(f^{-1}\left(\left\{n\right\}\right)\right)=y \end{equation*}$$ as f^{-1}\left(\left\{n\right\}\right)\neq s. So let x'=f^{-1}\left(\left\{n\right\}\right).

        2. f\left(s\right)\neq y=f\left(x\right):

          Suppose that f\left(s\right)\neq y. By surjectivity of f there is some x\in S with f\left(x\right)=y. Hence f\left(s\right)\neq f\left(x\right) and so x\neq s. Moreover f\left(x\right)=y\neq\left\{n\right\} as y\in n, so g\left(x\right)=f\left(x\right)=y and we can simply take x'=x.

      2. f\left(s\right)=\left\{n\right\}:

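        Now suppose that f\left(s\right)=\left\{n\right\}. We know that \left\{n\right\}\not\in n and so, for the x\in S with f\left(x\right)=y given by surjectivity of f, we have that f\left(x\right)=y\neq \left\{n\right\}=f\left(s\right). Thus we conclude that x\neq s and g\left(x\right)=f\left(x\right)=y, so we let x'=x.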

      In each case we have found a valid choice for x' and so surjectivity has been shown.

    It follows that g:S\setminus\left\{s\right\}\rightarrow n is a bijection and by definition of set cardinality we conclude that S\setminus\left\{s\right\} has cardinality n. As required. \qed :::

    Now, by the lemma we have that T\setminus\left\{t\right\} is a set of cardinality n. If S=T\setminus\left\{t\right\} then \left|S\right|=n<n+1=\left|T\right| and S is finite by definition. Otherwise S is a proper subset of T\setminus\left\{t\right\}, and so the induction hypothesis applies to T\setminus\left\{t\right\}, that is S is a finite set with fewer than n elements. Moreover as n<n+1 it follows that S has fewer than n+1 elements.

    Hence P\left(n+1\right) holds.

The result now follows by induction. $\qed$ :::
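
The construction of g in lemma 4 can be tried out on small sets. The following is a minimal Python sketch, not part of the proof: it assumes the natural number n is modelled by `range(n)`, that a mapping is stored as a `dict`, and the helper names `remove_element_bijection` and `is_bijection_onto` are hypothetical names chosen for illustration.

```python
def remove_element_bijection(f, s):
    """Build g: S minus {s} -> {0, ..., n-1} from a bijection f: S -> {0, ..., n}.

    f is a dict whose keys are the elements of S; s is the element removed.
    """
    n = len(f) - 1  # |S| = n + 1
    g = {}
    for x, fx in f.items():
        if x == s:
            continue  # s is not in the domain of g
        # Keep f(x) unless it is the removed value n; in that case reuse f(s).
        g[x] = fx if fx != n else f[s]
    return g


def is_bijection_onto(g, codomain):
    """Check that the dict g is injective and hits every element of codomain."""
    values = list(g.values())
    return len(set(values)) == len(values) and set(values) == set(codomain)


# The example from the proof: S = {0, 1, 2}, f the identity, remove s = 1.
f = {0: 0, 1: 1, 2: 2}
g = remove_element_bijection(f, 1)
print(g)                               # {0: 0, 2: 1}
print(is_bijection_onto(g, range(2)))  # True
```

The printed mapping agrees with the worked example in the lemma, where g\left(0\right)=0 and g\left(2\right)=f\left(1\right)=1.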

This proposition has an immediate consequence.

::: {#lem:SubsetOfFiniteSetHasAtMostSameCard .lemma} Lemma 5. Subset of a finite set has at most the same cardinality

Let S and T be finite sets such that S\subseteq T, we have that \left|S\right|\leq\left|T\right|.

Proof:

There are two cases to consider. Firstly if S=T we have by definition that S and T have the same elements and therefore the identity map is a bijection between the two sets. Hence \left|S\right|=\left|T\right|. The final case is S\subset T which is simply proposition 49{reference-type="ref" reference="prop:ProperSubsetStrictlySmallarCard"}. $\qed$ :::

We defined the cardinality of a set S in terms of a bijective mapping from S to \mathbb{N}, although this doesn't mean we can't deduce things about cardinality for say injective mappings or surjective mappings. We will assume unless stated otherwise that the sets we are dealing with are finite.

::: {#prop:CardinalityOfFiniteInjectiveMap .proposition} Proposition 50. Cardinality of finite sets in an injective mapping

Let S and T be two finite sets, and suppose that f:S\rightarrow T is an injection. We have that

$$\begin{equation*} \left|S\right|\leq \left|T\right| \end{equation*}$$

Proof:

Suppose that f:S\rightarrow T is an injective mapping between finite sets with \left|S\right|=n and \left|T\right|=m. Now consider the mapping g:S\rightarrow\mathop{\mathrm{Image}}\left(f\right) given by g\left(x\right)=f\left(x\right). We have by proposition 18{reference-type="ref" reference="prop:InjectiveMapToImageIsBijection"} that an injective mapping to the image is a bijection and so by definition \left|\mathop{\mathrm{Image}}\left(f\right)\right|=n. Additionally by definition of the image of f we have that \mathop{\mathrm{Image}}\left(f\right)\subseteq T. It follows by lemma 5{reference-type="ref" reference="lem:SubsetOfFiniteSetHasAtMostSameCard"} that n=\left|S\right|=\left|\mathop{\mathrm{Image}}\left(f\right)\right|\leq\left|T\right|=m, that is \left|S\right|\leq \left|T\right|. As required. $\qed$ :::
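
As a quick sanity check of the argument, the following Python sketch compares the sizes of S, the image of f and T for one small injection. The sets and the mapping are made up for illustration, and `image` and `is_injective` are hypothetical helper names.

```python
def image(f):
    """Return the image of a mapping stored as a dict."""
    return set(f.values())


def is_injective(f):
    """A finite mapping is injective when no two keys share a value."""
    return len(image(f)) == len(f)


# An assumed example: S has three elements and T has five.
T = {0, 1, 2, 3, 4}
f = {"a": 3, "b": 0, "c": 4}  # f: S -> T with S = {"a", "b", "c"}

print(is_injective(f))                # True
print(len(f), len(image(f)), len(T))  # 3 3 5
```

Here the image has the same number of elements as S, and the image is a subset of T, which matches the chain of inequalities in the proof.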

::: {#prop:CardinalityOfFiniteSurjectiveMap .proposition} Proposition 51. Cardinality of finite sets in a surjective mapping

Let S,T be two finite sets, and suppose that f:S\rightarrow T is a surjection. We have that

$$\begin{equation*} \left|T\right|\leq \left|S\right| \end{equation*}$$

Proof:

Suppose that f:S\rightarrow T is a surjective mapping between finite sets with \left|S\right|=n and \left|T\right|=m. For each t\in T choose some x_t\in f^{-1}\left(\left\{t\right\}\right). Such an x_t exists because f is surjective and so by definition for any t\in T there is some s\in S such that f\left(s\right)=t.

Define X=\left\{x_t\in S:t\in T\right\}, that is X is the set of all such elements chosen from the pre-images above. Clearly X\subseteq S and so by lemma 5{reference-type="ref" reference="lem:SubsetOfFiniteSetHasAtMostSameCard"} we have that \left|X\right|\leq\left|S\right|.

Now, consider the restriction mapping \mathrel f\restriction_X:X\rightarrow T. We have for all t\in T that x_t\in X and that \mathrel f\restriction_X\left(x_t\right)=t, and so \mathrel f\restriction_X is surjective. Moreover, if we have some x_t,x_v\in X with \mathrel f\restriction_X\left(x_t\right)=t and \mathrel f\restriction_X\left(x_v\right)=v and t=v, then by definition we have that x_t=x_v, so \mathrel f\restriction_X is injective and hence a bijection. Hence by definition \left|T\right|=\left|X\right|\leq\left|S\right|. $\qed$ :::
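
The choice of one pre-image x_t for each t\in T can be sketched in Python. This is only an illustration under stated assumptions: the mapping is a finite `dict`, the rule "keep the first pre-image met" is one arbitrary way of choosing, and `choose_preimages` is a hypothetical helper name.

```python
def choose_preimages(f, T):
    """For each t in T pick one x_t with f(x_t) = t; f is a dict from S to T."""
    chosen = {}
    for s, t in f.items():
        chosen.setdefault(t, s)  # keep the first pre-image met for each t
    missing = set(T) - set(chosen)
    if missing:
        raise ValueError(f"f is not surjective; no pre-image for {missing}")
    return chosen


# An assumed surjection from S = {0, 1, 2, 3, 4} onto T = {"a", "b", "c"}.
f = {0: "a", 1: "b", 2: "a", 3: "c", 4: "b"}
T = {"a", "b", "c"}

x = choose_preimages(f, T)
X = set(x.values())                # the set X of chosen elements
restricted = {s: f[s] for s in X}  # f restricted to X

print(x)                      # {'a': 0, 'b': 1, 'c': 3}
print(len(T), len(X), len(f)) # 3 3 5
```

The restricted mapping hits each element of T exactly once, so X and T have the same number of elements, and X is a subset of S.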

If we have two sets of finite cardinality, what can we say about the Cartesian product? This should also have finite cardinality. Suppose we have a set S of cardinality n and a set T of cardinality m. The Cartesian product S\times T has elements of the form \left(s,t\right) for s\in S and t\in T. For a fixed element s_0\in S, every element t\in T gives a pair \left(s_0,t\right)\in S\times T, and there are precisely m elements of this form. We can do this for each element s\in S, of which there are n. Hence we expect the total number of elements in S\times T to be nm.

::: {#prop:CardinalityOfCartesianProduct .proposition} Proposition 52. Cardinality of the Cartesian product of finite sets

Let S and T be two finite sets with cardinalities \left|S\right|=n and \left|T\right|=m then

$$\begin{equation*} \left|S\times T\right| = \left|S\right|\left|T \right|=nm \end{equation*}$$

Proof:

If either one of \left|S\right|=0 or \left|T\right|=0 then S=\emptyset or T=\emptyset and so S\times T=\emptyset, giving \left|S\times T\right|=0=nm. So suppose that S and T are both non-empty with \left|S\right|=n and \left|T\right|=m. Let s\in S and define the following mapping

$$\begin{align*} f:T&\rightarrow\left\{s\right\}\times T\\ t&\mapsto f\left(t\right)=\left(s,t\right) \end{align*}$$

We show that f is a bijection. Indeed suppose that f\left(a\right)=f\left(b\right) where a,b\in T then \left(s,a\right)=\left(s,b\right) and as s is fixed we conclude that a=b which shows injectivity. Now let t\in\left\{s\right\}\times T then t=\left(s,t'\right) for some t'\in T but then clearly f\left(t'\right)=t and so \forall t\in \left\{s\right\}\times T,\exists t'\in T such that f\left(t'\right)=t. Hence f is surjective and therefore we have that f is a bijection. By proposition 50{reference-type="ref" reference="prop:CardinalityOfFiniteInjectiveMap"} as f is an injective mapping between finite sets then \left|T\right|\leq \left|\left\{s\right\}\times T\right|. Likewise by proposition 51{reference-type="ref" reference="prop:CardinalityOfFiniteSurjectiveMap"} we conclude that \left|\left\{s\right\}\times T\right|\leq \left|T\right| hence \left|T\right|=\left|\left\{s\right\}\times T\right|=m.

Now define the set K by

$$\begin{equation*} K=\left\{\left\{s\right\}\times T: s\in S\right\} \end{equation*}$$

Define the following mapping

$$\begin{align*} g:S&\rightarrow K\\ x&\mapsto g\left(x\right)=\left\{x\right\}\times T \end{align*}$$

We show that g is a bijection. Clearly if g\left(a\right)=g\left(b\right) then \left\{a\right\}\times T=\left\{b\right\}\times T and as T is a fixed non-empty set then a=b and injectivity holds. Now let k\in K then k=\left\{k'\right\}\times T where k'\in S then clearly g\left(k'\right)=k so surjectivity holds. Hence g is a bijection and so by a similar argument to the one used for the mapping f we conclude that $\left|S\right| = \left|K\right|=n$.

We now need to show that set K partitions S\times T. This is to say we need to show that

  1. \forall x,y\in K we have that x\cap y=\emptyset whenever $x\neq y$

  2. The union of the elements of K is S\times T, that is

    $$\begin{equation*} S\times T=\bigcup_{x\in K} x \end{equation*}$$

  3. \forall x\in K we have that $x\neq \emptyset$

  1. \forall x,y\in K we have that x\cap y=\emptyset whenever $x\neq y$

    We can make use of the fact that g is a bijection. Let x,y\in K with x\neq y. As g is a surjection we have x=g\left(s_1\right)=\left\{s_1\right\}\times T and y=g\left(s_2\right)=\left\{s_2\right\}\times T for some s_1,s_2\in S, and s_1\neq s_2 as x\neq y. Any element of x\cap y would have the form \left(s_1,t\right)=\left(s_2,t'\right) for some t,t'\in T, which forces s_1=s_2, a contradiction. It follows that x\cap y = \emptyset.

  2. The union of the elements of K is S\times T, that is

    $$\begin{equation*} S\times T=\bigcup_{x\in K} x \end{equation*}$$

    By definition we have that any x\in K has the form \left\{s\right\}\times T where s\in S. Let y\in \left\{s\right\}\times T then y=\left(s,t\right) for some t\in T and so y\in S\times T therefore

    $$\begin{equation*} \bigcup_{x\in K} x\subseteq S\times T \end{equation*}$$

    Likewise suppose that x\in S\times T then x = \left(s,t\right) for some s\in S and t\in T. This implies that x\in \left\{s\right\}\times T and as \left\{s\right\}\times T\in K then x\in\bigcup_{y\in K} y so that

    $$\begin{equation*} S\times T\subseteq\bigcup_{x\in K} x \end{equation*}$$

    It follows that

    $$\begin{equation*} S\times T=\bigcup_{x\in K} x \end{equation*}$$

  3. \forall x\in K we have that $x\neq \emptyset$

    Let x\in K then x=\left\{s\right\}\times T for some s\in S, and x\neq\emptyset as T\neq\emptyset. Hence \forall x\in K, x\neq \emptyset.

It follows that K partitions S\times T. Now K is a set containing n elements, K partitions S\times T, and each element of K is a set containing m elements. Hence the cardinality of S\times T is the sum of the cardinalities of the sets x\in K, which is nm. That is to say

$$\begin{equation*} \left|S\times T\right|=nm \end{equation*}$$ and the result is shown. $\qed$ :::
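
The partition argument can be checked on a small example. The Python sketch below uses made-up sets S and T and builds the slices \left\{s\right\}\times T directly; it then tests the three partition conditions and compares the count with nm. It is an illustration only, not a replacement for the proof.

```python
from itertools import product

# Assumed example sets with n = 3 and m = 2.
S = {"x", "y", "z"}
T = {0, 1}

pairs = set(product(S, T))                # the Cartesian product S x T
K = [{(s, t) for t in T} for s in S]      # the slices {s} x T, one per s in S

# Condition 1: distinct slices are disjoint.
pairwise_disjoint = all(
    K[i].isdisjoint(K[j]) for i in range(len(K)) for j in range(i + 1, len(K))
)
# Condition 2: the union of the slices is all of S x T.
covers = set().union(*K) == pairs
# Condition 3: no slice is empty (in fact each has m elements).
all_size_m = all(len(block) == len(T) for block in K)

print(pairwise_disjoint, covers, all_size_m)  # True True True
print(len(pairs), len(S) * len(T))            # 6 6
```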

Countability

::: definition Definition 78. Countable Set

Let S be a set. We say that S is a countable set if and only if there exists a subset T\subseteq\mathbb{N}, allowing for the possibility that T=\mathbb{N}, and a bijection f:S\rightarrow T.

If T is a finite subset of \mathbb{N} we say that S is a finitely countable set and thus countable. If T=\mathbb{N} we say that S is a countably infinite set. If S is not a finitely countable set or a countably infinite set we say that S is an uncountably infinite set. :::

Informally, a set S is finitely countable or countably infinite if we have some process by which we can enumerate each element of S, that is to say list out each element in some way. We can make the notion of an enumeration rigorous.

::: definition Definition 79. Enumeration

Let S be a finitely countable set with cardinality \left|S\right|=n and define \mathbb{N}_n=\left\{1,2,3,\dots,n\right\} for some n\in\mathbb{N}. We define an enumeration of S to be a bijective mapping f:\mathbb{N}_n\rightarrow S or a bijective mapping g:S\rightarrow\mathbb{N}_n.

If S is countably infinite we define an enumeration of S to be a bijective mapping f:\mathbb{N}\rightarrow S or a bijective mapping g:S\rightarrow\mathbb{N}. :::

It is clear that in either case if f is an enumeration of a countable set S then f^{-1} is also an enumeration of S.

::: proposition Proposition 53. Inverse of an enumeration mapping is an enumeration mapping

  1. Let S be a finitely countable set with cardinality \left|S\right|=n and enumeration f:\mathbb{N}_n\rightarrow S then f^{-1}:S\rightarrow\mathbb{N}_n is an enumeration of S where f and f^{-1} define the same enumeration of the elements of $S$

  2. Let S be a countably infinite set with enumeration f:\mathbb{N}\rightarrow S then f^{-1}:S\rightarrow\mathbb{N} is an enumeration of S where f and f^{-1} define the same enumeration of the elements of $S$

Proof:

  1. Let S be a finitely countable set with cardinality \left|S\right|=n and enumeration f:\mathbb{N}_n\rightarrow S then f^{-1}:S\rightarrow\mathbb{N}_n is an enumeration of S where f and f^{-1} define the same enumeration of the elements of S:

    As f is a bijection then it has an inverse f^{-1}:S\rightarrow\mathbb{N}_n which is also a bijection. Hence f^{-1} is an enumeration. To show that f and f^{-1} define the same enumeration of the elements of S we note that f\circ f^{-1}=\mathop{\mathrm{id}}_S and f^{-1}\circ f = \mathop{\mathrm{id}}_{\mathbb{N}_n}.

  2. Let S be a countably infinite set with enumeration f:\mathbb{N}\rightarrow S then f^{-1}:S\rightarrow\mathbb{N} is an enumeration of S where f and f^{-1} define the same enumeration of the elements of S:

    As f is a bijection then it has an inverse f^{-1}:S\rightarrow\mathbb{N} which is also a bijection. Hence f^{-1} is an enumeration. To show that f and f^{-1} define the same enumeration of the elements of S we note that f\circ f^{-1}=\mathop{\mathrm{id}}_S and f^{-1}\circ f = \mathop{\mathrm{id}}_{\mathbb{N}}.

The result is shown. $\qed$ :::
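
For a finitely countable set the two directions of an enumeration can be written down explicitly. The following Python sketch assumes a made-up three element set and stores f:\mathbb{N}_3\rightarrow S and its inverse as dictionaries; the names `f` and `f_inv` are chosen for illustration.

```python
# An assumed finitely countable set, listed in the order we wish to enumerate it.
S = ["red", "green", "blue"]

f = {i + 1: s for i, s in enumerate(S)}  # f: N_3 -> S, with N_3 = {1, 2, 3}
f_inv = {s: i for i, s in f.items()}     # f^{-1}: S -> N_3

# f o f^{-1} is the identity on S, and f^{-1} o f is the identity on N_3.
print(all(f[f_inv[s]] == s for s in f_inv))  # True
print(all(f_inv[f[i]] == i for i in f))      # True
print(f_inv["green"])                        # 2
```

Both dictionaries carry the same information: reading `f` tells us which element is in position i, and reading `f_inv` tells us the position of a given element.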

::: proposition Proposition 54. The natural numbers are countably infinite

We have that \mathbb{N} is a countably infinite set.

Proof:

To show that \mathbb{N} is countable we need to find a bijective mapping f:\mathbb{N}\rightarrow\mathbb{N}. We can clearly take \mathop{\mathrm{id}}_\mathbb{N}, that is the identity mapping on \mathbb{N}. That is to say

$$\begin{align*} \mathop{\mathrm{id}}_\mathbb{N}:\mathbb{N}&\rightarrow\mathbb{N}\\ x&\mapsto\mathop{\mathrm{id}}_\mathbb{N}\left(x\right)=x \end{align*}$$ As required. $\qed$ :::

We also have the following immediate result.

::: proposition Proposition 55. Any subset of \mathbb{N} is countable

Let S\subseteq\mathbb{N} then S is countable.

Proof:

Let S\subseteq\mathbb{N} and suppose that S is not finite, for if it is finite then by definition it is countable. As \mathbb{N} is well-ordered we have by theorem 18{reference-type="ref" reference="thm:WOP"} that S is well-ordered and so has a set inclusion minimal element, say s_0. As S is infinite, S\setminus\left\{s_0\right\} is non-empty. We will use this as the basis for induction.

Suppose we have chosen s_0,s_1,s_2,\dots,s_n\in S. As S is infinite the set S\setminus\left\{s_0,s_1,s_2,\dots,s_n\right\} is non-empty, so another application of the well-ordering principle means there is some set inclusion minimal element s_{n+1} with s_{n+1}\in S\setminus\left\{s_0,s_1,s_2,\dots,s_n\right\}. This holds for all n\in\mathbb{N}. Every s\in S has only finitely many elements of S below it and so s=s_k for some k\in\mathbb{N}. We conclude that S=\left\{s_0,s_1,s_2,\dots\right\} is countable by defining the bijective mapping

$$\begin{align*} f:\mathbb{N}&\rightarrow S\\ x&\mapsto f\left(x\right)=s_x \end{align*}$$

The result follows. $\qed$ :::
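
The repeated choice of a least remaining element can be mimicked by a generator. The Python sketch below assumes the subset S\subseteq\mathbb{N} is described by a membership test, and `enumerate_subset` is a hypothetical helper name. Scanning 0,1,2,\dots in order and keeping the members of S produces s_0,s_1,s_2,\dots in exactly the order used in the proof.

```python
from itertools import count, islice


def enumerate_subset(belongs):
    """Yield s_0 < s_1 < s_2 < ... for the subset of N described by belongs.

    Each value yielded is the least element of S not yielded so far.
    If S is finite the generator keeps searching forever once S is exhausted.
    """
    for k in count(0):
        if belongs(k):
            yield k


# An assumed infinite subset: the even natural numbers.
evens = enumerate_subset(lambda k: k % 2 == 0)
print(list(islice(evens, 5)))  # [0, 2, 4, 6, 8]
```

Here the mapping x\mapsto s_x of the proof sends 0 to 0, 1 to 2, 2 to 4, and so on.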

::: proposition Proposition 56. The empty-set is countable

We have that \emptyset is a countable set.

Proof:

The empty-set has cardinality 0 which is finite. $\qed$ :::

There are some results that can be deduced which give equivalent conditions for a set to be countable. Two of these results follow by definition of a countable set.

::: {#prop:EquivalelntDefinitionsOfCountable .proposition} Proposition 57. Equivalent definitions of a countable set

Let S be a set. The following hold.

  1. S is countable if and only if there is an injection f:S\rightarrow T for some subset $T\subseteq\mathbb{N}$

  2. S is countable if and only if S=\emptyset or there is a surjection f:T\rightarrow S for some subset $T\subseteq\mathbb{N}$

Proof:

  1. S is countable if and only if there is an injection f:S\rightarrow T for some subset T\subseteq\mathbb{N}:

    \left(\Rightarrow\right): Suppose that S is countable then by definition there is a bijection f:S\rightarrow T for some T\subseteq\mathbb{N}. As f is a bijection then f is an injection and we are done.

    \left(\Leftarrow\right): Suppose that there is an injection f:S\rightarrow T for some T\subseteq\mathbb{N}. Consider the mapping g:S\rightarrow\mathop{\mathrm{Image}}\left(f\right) given by g\left(x\right)=f\left(x\right). By proposition 15{reference-type="ref" reference="prob:RestOfCodomainToImageIsSurjective"} we have that g is a surjection, that is \forall y\in\mathop{\mathrm{Image}}\left(f\right) there is some x\in S such that g\left(x\right)=y. Moreover g is an injection as f is. It follows that g is a bijection. Therefore \left|S\right|=\left|\mathop{\mathrm{Image}}\left(f\right)\right| and as \mathop{\mathrm{Image}}\left(f\right)\subseteq T\subseteq\mathbb{N} we have that S is countable.

  2. S is countable if and only if S=\emptyset or there is a surjection f:T\rightarrow S for some subset T\subseteq\mathbb{N}:

    \left(\Rightarrow\right): Suppose that S is countable then there is a bijection f:T\rightarrow S and by definition is therefore a surjection.

    \left(\Leftarrow\right): If S=\emptyset then S has cardinality 0, which is finite, and therefore S is countable. So suppose that S\neq\emptyset and that f:T\rightarrow S is a surjection for some T\subseteq\mathbb{N}. By proposition 14{reference-type="ref" reference="prop:PropertyImagePreImage"} we have for any mapping g:X\rightarrow Y that the pre-image g^{-1}\left(Y\right)=X, therefore f^{-1}\left(S\right)=T. By assumption T\subseteq \mathbb{N} and is therefore either finite or countably infinite, possibly being \mathbb{N} itself. If T is finite then we have that \left|S\right|\leq\left|T\right| as f is surjective, and therefore \left|S\right| is finite and S is countable. So suppose that \left|T\right|=\aleph_0, then T is either a countably infinite proper subset of \mathbb{N} or \mathbb{N} itself.

    Let g:T\rightarrow\mathbb{N} be a bijection then g^{-1}:\mathbb{N}\rightarrow T is a bijection by proposition 35{reference-type="ref" reference="prop:InverseBijectionIsBijection"} and we have that f\circ g^{-1}:\mathbb{N}\rightarrow S is a surjection by proposition [20](#prop: PropInjecSurjecBijecMapping){reference-type="ref" reference="prop: PropInjecSurjecBijecMapping"}. It is left to show that f\circ g^{-1} being surjective implies S is countable. Proposition 28{reference-type="ref" reference="prop:RightInverseIffSurjective"} gives that f\circ g^{-1} being surjective means there exists a right inverse h such that h:S\rightarrow \mathbb{N}. By proposition 30{reference-type="ref" reference="RightInverseOfSurjecctionisInection"} we have that h is injective. It follows by part 1 that S is countable.

The result is shown. $\qed$ :::
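
The step from a surjection f:T\rightarrow S to an injection h:S\rightarrow\mathbb{N} can be made concrete for finite data. The Python sketch below is one possible right inverse, under the assumption that we pick the least pre-image of each element; `right_inverse` is a hypothetical helper name and the mapping is made up for illustration.

```python
def right_inverse(f):
    """Given a surjection f: T -> S with T a finite subset of N (as a dict),
    return h: S -> T where h(s) is the least t with f(t) = s."""
    h = {}
    for t in sorted(f):          # visit T in increasing order
        h.setdefault(f[t], t)    # the first t met for each s is the least one
    return h


# An assumed surjection from T = {0, 1, 2, 3} onto S = {"a", "b", "c"}.
f = {0: "a", 1: "b", 2: "a", 3: "c"}
h = right_inverse(f)

print(h)                                        # {'a': 0, 'b': 1, 'c': 3}
print(all(f[h[s]] == s for s in h))             # True: f o h is the identity on S
print(len(set(h.values())) == len(h))           # True: h is injective
```

Since distinct elements of S have disjoint pre-images, their least pre-images differ, which is why h is injective.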

::: proposition Proposition 58. Set is countable if cardinality of set equals cardinality of a countable set

Let S,T be sets such that \left|S\right|=\left|T\right| then if S is countable so is T.

Proof:

Suppose that S is countable. As \left|S\right|=\left|T\right| there exists a bijection f:S\rightarrow T, and in particular there exists a bijection g:T\rightarrow S. Now as S is countable there exists an injection h:S\rightarrow\mathbb{N}. As g is a bijection we have that g is an injection. The mapping h\circ g:T\rightarrow \mathbb{N} is an injection as h and g are. Hence as h\circ g is an injection it follows that T is countable by proposition 57{reference-type="ref" reference="prop:EquivalelntDefinitionsOfCountable"}. $\qed$ :::

Relations

Definition of a relation

So far we have seen a few notions that relate elements of a set to one another. An example that relates elements of a set is equality of natural numbers, two natural numbers are equal if and only if they are the same element. Another example that we have seen on the natural numbers is the less than operator <. A natural number x is less than y if and only if x\subset y. A more fundamental example of a relation is that of a mapping f:S\rightarrow T. We can consider a function as relating s\in S and t\in T through the pair \left(s,t\right) where f\left(s\right)=t.

In a sense, we have that the idea of relations is somehow as fundamental as sets and mappings, in fact we just described a mapping as some form of relation so the idea of relations is more fundamental than that of a mapping. Using the examples of the comparison operators on \mathbb{N} we can motivate a definition for a relation.

::: definition Definition 80. Relation

Let S be a set and consider the Cartesian product S\times S. A relation is a subset R\subseteq S\times S. We write an element \left(a,b\right)\in R as aRb or we also write a\sim b and we say that a relates to b. If \left(a,b\right)\not\in R we write a\slashed{R} b or we write $a\not\sim b$ :::

We can recast the ideas at the start of this section into the language of relations.

::: example Example 72. Consider equality on \mathbb{N}. We can define equality as a relation R\subseteq\mathbb{N}\times \mathbb{N} where a\sim b if and only if a\subseteq b and b\subseteq a. Explicitly we have that R is a subset of \mathbb{N}\times\mathbb{N} given by

$$\begin{equation*} R=\left\{\left(0,0\right),\left(1,1\right),\left(2,2\right),\dots\right\} \end{equation*}$$ :::

::: example Example 73. Consider the less than operator on \mathbb{N}. We have that the less than operator is a relation where a\sim b is given by a\subset b. To see this consider T=\left\{0,1,2\right\}. Then the less than relation on T is given by the relation

$$\begin{equation*} R=\left\{\left(0,1\right),\left(0,2\right),\left(1,2\right)\right\} \end{equation*}$$ :::

::: example Example 74. Let S=\left\{0,1\right\}\subseteq\mathbb{N} and let T=P\left(S\right) be the power set of S, given by

$$\begin{equation*} T=\left\{\emptyset,\left\{0\right\}, \left\{1\right\}, \left\{0,1\right\}\right\} \end{equation*}$$

Note that \left\{0,1\right\}=S, so T has four elements. We can define a relation R\subseteq T\times T by

$$\begin{align*} R = \{ &\left(\emptyset,\emptyset\right),\left(\emptyset,\left\{0\right\}\right),\left(\emptyset,\left\{1\right\}\right),\left(\emptyset,\left\{0,1\right\}\right),\left(\left\{0\right\},\left\{0\right\}\right),\\ &\left(\left\{0\right\},\left\{0,1\right\}\right),\left(\left\{1\right\},\left\{1\right\}\right),\left(\left\{1\right\},\left\{0,1\right\}\right),\left(\left\{0,1\right\},\left\{0,1\right\}\right)\} \end{align*}$$ This relation expresses inclusive subset inclusion, \subseteq, on P\left(S\right). :::

::: example Example 75. Let S=\left\{0,1,2\right\} and T=S. Define T\times T by

$$\begin{equation*} T\times T = \left\{\left(0,0\right),\left(0,1\right),\left(0,2\right),\left(1,0\right),\left(1,1\right),\left(1,2\right),\left(2,0\right),\left(2,1\right),\left(2,2\right)\right\} \end{equation*}$$ We can use the less than or equal to operator, \leq, to define a relation. We have that

$$\begin{equation*} R=\left\{\left(0,0\right),\left(0,1\right),\left(0,2\right),\left(1,1\right),\left(1,2\right),\left(2,2\right)\right\} \end{equation*}$$ :::
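
Since a relation is just a set of ordered pairs, the examples above can be generated mechanically. The Python sketch below builds the less than and less than or equal to relations on \left\{0,1,2\right\} and the subset relation on the power set of \left\{0,1\right\}. The variable names are chosen for illustration, and `frozenset` is used only because Python sets cannot contain ordinary sets.

```python
from itertools import combinations, product

T = {0, 1, 2}

# A relation on T is a subset of T x T, selected here by a condition on pairs.
less_than = {(a, b) for a, b in product(T, repeat=2) if a < b}
less_equal = {(a, b) for a, b in product(T, repeat=2) if a <= b}

print(sorted(less_than))   # [(0, 1), (0, 2), (1, 2)]
print(sorted(less_equal))  # [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)]

# The power set of S = {0, 1}, with each subset stored as a frozenset.
S = {0, 1}
power_set = [
    frozenset(c) for r in range(len(S) + 1) for c in combinations(sorted(S), r)
]
# For frozensets, a <= b tests whether a is a subset of b.
subset_rel = {(a, b) for a, b in product(power_set, repeat=2) if a <= b}

print(len(power_set), len(subset_rel))  # 4 9
```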

Reflexive Relation

All of the examples from the previous section, except the strictly less than example, share a common property. Each element is related to itself, that is in each example every element s\in S satisfies \left(s,s\right)\in R\subseteq S\times S. We formalise this in the following definition.

::: definition Definition 81. Reflexive relation

Let S be a set with a relation R\subseteq S\times S. We say that the relation R is reflexive if and only if \forall s\in S we have that \left(s,s\right)\in R. If there is an s\in S such that \left(s,s\right)\not\in R then we say that the relation is anti-reflexive. :::

We have given examples of reflexive relations and one example of an anti-reflexive relation. We give an additional example of an anti-reflexive relation.

::: example Example 76. We have for a,b\in\mathbb{N} that a=b if and only if a\subseteq b and b\subseteq a. If this doesn't hold then a\neq b and either one of a\subseteq b or b\subseteq a is true but not both. It follows that the relation a\sim b meaning a\neq b is anti-reflexive. This also implies that if a\neq b then either a<b or b<a. :::
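
Reflexivity is a condition that can be tested directly on a finite relation. The Python sketch below checks it for three relations on \left\{0,1,2\right\}; `is_reflexive` is a hypothetical helper name, and the final line checks the stronger statement that no element of the not equal relation is related to itself.

```python
def is_reflexive(S, R):
    """Return True when (s, s) is in R for every s in S."""
    return all((s, s) in R for s in S)


S = {0, 1, 2}
less_equal = {(a, b) for a in S for b in S if a <= b}
less_than = {(a, b) for a in S for b in S if a < b}
not_equal = {(a, b) for a in S for b in S if a != b}

print(is_reflexive(S, less_equal))  # True
print(is_reflexive(S, less_than))   # False
print(is_reflexive(S, not_equal))   # False

# In the not equal relation no element at all is related to itself.
print(all((s, s) not in not_equal for s in S))  # True
```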

The examples given so far have allowed us to see some examples of relations and one particular type of relation, a reflexive relation. Unfortunately only considering relations on elements of a single set S currently gives us few practical examples to work with. A simple extension to the idea of a relation can fix this.

::: definition Definition 82. Binary Relation

Let S and T be sets. We define a binary relation to be a subset R\subseteq S\times T. We write an element \left(s,t\right)\in R as sRt or write s\sim t and we say that s relates to t. If \left(s,t\right)\not\in R we write s\slashed{R} t or we write s\not\sim t. :::

We can extend the notion of a relation and binary relation to that of any finite Cartesian product.

::: definition Definition 83. $n$-ary Relation

Let S_1,S_2,S_3,\dots,S_n be sets. We define an $n$-ary relation to be a subset R\subseteq S_1\times S_2\times S_3\times\dots\times S_n=\mathbb{S}. An element of R has the form r=\left(r_1,r_2,r_3,\dots,r_n\right) and we say that the elements of r relate. We write this as $R\left(r\right)=R\left(r_1,r_2,r_3,\dots,r_n\right)$ :::

In light of these previous definitions we would like to extend the definition of a reflexive relation to binary and $n$-ary relations. To see how we could extend a reflexive relation to a binary relation suppose we have two sets S and T. The definition of a reflexive relation on a set Z is that \left(z,z\right)\in R_z\subseteq Z\times Z for every z\in Z, where R_z is the relation defined on Z. A natural way to extend this to S and T is to have either \left(s,s\right)\in R\subseteq S\times T or \left(t,t\right)\in R\subseteq S\times T where R is a binary relation for S and T. Hence for a reflexive binary relation to make sense we must have that s,t\in S\cap T and therefore the relation would have to be defined on S\cap T.

In the first case \left(s,s\right)\in R\subseteq S\times T we have by definition of an ordered tuple that \left(s,s\right)\in S\times T if and only if s\in S and s\in T. Likewise for \left(t,t\right)\in R\subseteq S\times T we must have t\in S and t\in T, which is to say s,t\in S\cap T. If S\neq T then there will exist at least one element s\in S with s\not\in T, or at least one element t\in T with t\not\in S; in this case it is not possible for a reflexive relation to exist.

::: definition Definition 84. Reflexive binary relation

Let S and T be sets with relation R\subseteq S\times T. We say that the relation R is reflexive if and only if S=T and \left(s,s\right)\in R for all s\in S. :::

A similar argument shows there can be no reflexive $n$-ary relation unless all of the sets that make the relation are the same. For example consider the sets X,Y and Z. The natural way to represent a reflexive relation R\subseteq X\times Y\times Z would be to have either \left(x,x,x\right)\in R, \left(y,y,y\right)\in R or \left(z,z,z\right)\in R where x\in X, y\in Y and z\in Z. If \left(x,x,x\right)\in R then by definition we must have x\in Y and x\in Z, likewise if \left(y,y,y\right)\in R then y\in X and y\in Z and finally if \left(z,z,z\right)\in R then z\in X and z\in Y. Any of the cases implies that x,y,z\in X\cap Y\cap Z.

::: definition Definition 85. Reflexive $n$-ary relation

Let S_1,S_2,S_3,\dots,S_n be sets with relation R\subseteq S_1\times S_2\times S_3\times\dots\times S_n. We say that the relation R is reflexive if and only if S_i=S_j for all $i,j\in\left\{1,2,3,\dots,n\right\}$ and \left(s,s,\dots,s\right)\in R for all s\in S_1. :::

This means when talking about a reflexive relation we only need to consider a single set.

An example of a binary relation is a mapping.

::: example Example 77. Let S=T=\mathbb{N} and define the mapping f:S\rightarrow T given by f\left(s\right)=s. We have that f defines a relation as we have that

$$\begin{equation*} R=\left\{\left(0,0\right),\left(1,1\right),\left(2,2\right),\left(3,3\right),\dots\right\}\subseteq\mathbb{N}\times\mathbb{N} \end{equation*}$$ :::

::: example Example 78. Let S=\left\{1,2\right\} and T=\left\{3,4\right\}. Define the mapping f:S\rightarrow T by f\left(1\right)=4 and f\left(2\right)=3. We have f defines a relation as

$$\begin{equation*} R=\left\{\left(1,4\right),\left(2,3\right)\right\}\subseteq S\times T \end{equation*}$$ :::

We can consider operators as relations by using the $n$-ary notion of a relation.

::: example Example 79. Let X=Y=Z=\mathbb{N}. We can consider the operator + as a mapping given by

$$\begin{align*} f:X\times Y &\rightarrow Z\\ \left(x,y\right)&\mapsto f\left(x,y\right) = x+y \end{align*}$$

A relation can be defined by f. A sample of this relation R looks as follows

$$\begin{equation*} R=\left\{\left(0,0,0\right), \left(0,1,1\right),\left(4,3,7\right),\left(3,4,7\right),\left(2,2,4\right),\dots\right\}\subseteq\mathbb{N}\times\mathbb{N}\times\mathbb{N} \end{equation*}$$

In general, R has the following definition

$$\begin{equation*} R=\left\{\left(x,y,x+y\right):x,y\in\mathbb{N}\right\} \end{equation*}$$

We note that as X=Y then for any x\in X we have x\in Y and likewise for any y\in Y we have that y\in X. We therefore have that both \left(x,y,x+y\right) and \left(y,x,y+x\right) belong to R, and these tuples share the same third entry as x+y=y+x. This reflects the fact that addition is commutative. :::

::: example Example 80. Let X=Y=Z=\mathbb{N}. We can consider the operator * as a mapping given by

$$\begin{align*} f:X\times Y &\rightarrow Z\\ \left(x,y\right)&\mapsto f\left(x,y\right) = xy \end{align*}$$

The relation defined by f looks as follows

$$\begin{equation*} R=\left\{\left(0,0,0\right), \left(0,1,0\right),\left(4,3,12\right),\left(3,4,12\right),\left(2,2,4\right),\dots\right\}\subseteq\mathbb{N}\times\mathbb{N}\times\mathbb{N} \end{equation*}$$

In general, R has the following definition

$$\begin{equation*} R=\left\{\left(x,y,xy\right):x,y\in\mathbb{N}\right\} \end{equation*}$$

As before, we have that as X=Y then for any x\in X we have x\in Y and likewise, for any y\in Y we have that y\in X. We, therefore, have that both \left(x,y,xy\right) and \left(y,x,yx\right) belong to R, and these tuples share the same third entry as xy=yx, again reflecting the fact that multiplication is commutative. :::

::: example Example 81. Let X=Y=Z=\mathbb{N}. We can consider the operator \wedge as a mapping given by

$$\begin{align*} f:X\times Y &\rightarrow Z\\ \left(x,y\right)&\mapsto f\left(x,y\right) = \wedge\left(x,y\right)=x^y \end{align*}$$

The relation defined by f looks as follows

$$\begin{equation*} R=\left\{\left(0,0,1\right), \left(0,1,0\right),\left(2,3,8\right),\left(8,2,64\right),\left(3,2,9\right),\dots\right\}\subseteq\mathbb{N}\times\mathbb{N}\times\mathbb{N} \end{equation*}$$

In general, R has the following definition

$$\begin{equation*} R=\left\{\left(x,y,x^y\right):x,y\in\mathbb{N}\right\} \end{equation*}$$

As before, we have that as X=Y then for any x\in X we have x\in Y and likewise, for any y\in Y we have that y\in X. We have that \left(x,y,x^y\right) and \left(y,x,y^x\right) both belong to R, but in general their third entries differ, for example \left(2,3,8\right)\in R while \left(3,2,9\right)\in R. This confirms that in general exponentiation is not commutative. :::
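The three operator examples can be compared side by side. The following is a minimal sketch, assuming a finite sample of \mathbb{N} and representing each operator as a set of triples; `relation_of` and `is_commutative` are hypothetical helper names introduced only for illustration.

```python
from itertools import product

N = range(6)  # finite sample of the naturals (assumption)

def relation_of(f):
    """Ternary relation {(x, y, f(x, y))} over the finite sample."""
    return {(x, y, f(x, y)) for x, y in product(N, repeat=2)}

def is_commutative(R):
    """True when (x, y, z) in R implies (y, x, z) in R."""
    return all((y, x, z) in R for (x, y, z) in R)

add = relation_of(lambda x, y: x + y)
mul = relation_of(lambda x, y: x * y)
power = relation_of(lambda x, y: x ** y)  # Python evaluates 0 ** 0 as 1

print(is_commutative(add), is_commutative(mul), is_commutative(power))
```

Swapping the first two entries of a triple while keeping the third is exactly the test that distinguishes addition and multiplication from exponentiation in the examples above.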

The last three examples expose another property that relations can have. For addition and multiplication it doesn't matter which way round the related elements are written, whereas for exponentiation it does. For a relation on a single set the corresponding property is that if x\sim y then we also have y\sim x. Such a relation is called symmetric.

::: definition Definition 86. Symmetric relation

Let S be a set with relation R\subseteq S\times S. We say that R is a symmetric relation if and only if \forall x,y\in S we have that xRy implies yRx, equivalently we can write R is symmetric if and only if x\sim y implies y\sim x. We say that R is an anti-symmetric relation if and only if \forall x,y\in S we have that xRy and yRx together imply x=y. :::

As with reflexive relations, we can show that extending the idea of a symmetric relation on a single set to multiple sets forces the sets to be the same.

Indeed suppose that S and T are sets with a relation R\subseteq S\times T. The natural extension for a symmetric relation would be \forall s\in S that sRt\Rightarrow tRs for t\in T. For tRs to make sense we need \left(t,s\right)\in S\times T, which implies that t\in S and s\in T and therefore s,t\in S\cap T.

::: definition Definition 87. Symmetric binary relation

Let S and T be sets with relation R\subseteq S\times T. We say that R is symmetric if and only if S=T and sRt implies tRs for all s,t\in S. :::

Likewise a similar argument holds for $n$-ary symmetric relations

::: definition Definition 88. Symmetric $n$-ary relation

Let S_1,S_2,S_3,\dots,S_n be sets with relation R\subseteq S_1\times S_2\times S_3\times\dots\times S_n. We say that the relation R is symmetric if and only if S_i=S_j for all $i,j\in\left\{1,2,3,\dots,n\right\}$ :::

The comparison operators on the naturals, less than, less than or equal to, greater than, and greater than or equal to, also give insight into another interesting property. The following examples will make this clearer.

::: example Example 82. Let S=T=\mathbb{N} and define x\sim y by x\leq y. Consider a,b,c\in\mathbb{N} with a=2, b=4 and c=6. We have that a\sim b as 2\leq 4 and we have that b\sim c as 4\leq 6, we clearly also have a\sim c as 2\leq 6. In general if we have a,b,c\in\mathbb{N} with a\leq b\leq c we have that a\sim b and b\sim c implies a\sim c. :::

::: example Example 83. Let S=T=\mathbb{N} and define x\sim y by x\geq y. Consider a,b,c\in\mathbb{N} with a=8, b=3 and c=1. We have that a\sim b as 8\geq 3 and we have that b\sim c as 3\geq 1, we also have a\sim c as 8\geq 1. More generally if we have a,b,c\in\mathbb{N} with a\geq b\geq c we have that a\sim b and b\sim c implies a\sim c. :::

::: example Example 84. Let S=T=\mathbb{N} and define x\sim y by x= y. Consider a,b,c\in\mathbb{N} with a=2, b=2 and c=2. We have that a\sim b as 2=2 and we have that b\sim c as 2=2, we also have a\sim c as 2=2. More generally if we have a,b,c\in\mathbb{N} with a= b= c we have that a\sim b and b\sim c implies a\sim c. :::

We see that with certain relations, if a\sim b is true and b\sim c is true then we can conclude that a\sim c is true. Such a relation is called a transitive relation.

::: definition Definition 89. Transitive relation

Let S be a set with relation R\subseteq S\times S. We say that R is a transitive relation if and only if \forall a,b,c\in S we have that if aRb and bRc then we have that aRc. :::
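Symmetry and transitivity can both be tested directly on a finite relation. The sketch below assumes a finite sample of \mathbb{N} and relations stored as sets of pairs; the function names `is_symmetric` and `is_transitive` are our own choices.

```python
def is_symmetric(R):
    """True when (x, y) in R implies (y, x) in R."""
    return all((y, x) in R for (x, y) in R)

def is_transitive(R):
    """True when (a, b) and (b, c) in R imply (a, c) in R."""
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)

S = range(6)  # finite sample of the naturals (assumption)
leq = {(x, y) for x in S for y in S if x <= y}
eq = {(x, y) for x in S for y in S if x == y}
neq = {(x, y) for x in S for y in S if x != y}

for name, R in [("<=", leq), ("==", eq), ("!=", neq)]:
    print(name, is_symmetric(R), is_transitive(R))
```

On this sample, less than or equal to is transitive but not symmetric, equality is both, and a\neq b is symmetric but not transitive, since 0\neq 1 and 1\neq 0 do not give 0\neq 0.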

We again consider if a transitive relation can be extended to multiple sets. Suppose that we have a binary relation R\subseteq S\times T for some sets S and T. The natural extension to make R a transitive relation is to have s\sim t and t\sim u implies s\sim u for s\in S, t\in S\cap T and u\in T. Here t must lie in both sets, as it appears on the right of s\sim t and on the left of t\sim u, but we need not have u\in S. As we aren't assuming anything else about the relation R there is nothing more we can deduce about a binary transitive relation.

::: definition Definition 90. Transitive binary relation

Let S and T be sets with relation R\subseteq S\times T. We say that R is transitive if and only if the set \tilde{R} given by

$$\begin{equation*} \tilde{R} = \left\{\left(x,z\right) \in S \times T: \exists y \in S \cap T: \left(x, y\right) \in R \wedge \left(y, z\right) \in R\right\} \end{equation*}$$ is a subset of R. :::

A definition can be made for a transitive $n$-ary relation, although the right formulation is not yet settled. One option is a pair-wise condition on the sets, making use of binary relations.

::: definition Definition 91. Transitive $n$-ary relation

Let S_1,S_2,S_3,\dots,S_n be sets with relation R\subseteq S_1\times S_2\times S_3\times\dots\times S_n. We say that the relation R is transitive if and only if the set \tilde{R} given by

$$\begin{equation*} \tilde{R}=\left\{\left(x,z\right)\in \right\} \end{equation*}$$ is non-empty. :::

Equivalence Relations

Of all the examples of relations we have seen so far there is one in particular that is special, the equality operator =. This relation is reflexive, symmetric and transitive.

::: {#prop:EqualityOnNaturalsIsEquivRelation .proposition} Proposition 59. The equality relation on the natural numbers is reflexive, symmetric and transitive

Let S=T=\mathbb{N} and for x,y\in\mathbb{N} define the relation x\sim y by x=y. We have that

  1. \sim is reflexive, that is \forall x\in\mathbb{N} we have $x\sim x$

  2. \sim is symmetric, that is \forall x,y\in\mathbb{N} we have $x\sim y\Rightarrow y\sim x$

  3. \sim is transitive, that is \forall x,y,z\in\mathbb{N} we have that if x\sim y and y\sim z then $x\sim z$

Proof:

  1. \sim is reflexive, that is \forall x\in\mathbb{N} we have x\sim x:

    Let x\in\mathbb{N}. By definition of equality we have for y,z\in\mathbb{N} that y=z if and only if y\subseteq z and z\subseteq y. As x\subseteq x, it is clear that x=x and therefore x\sim x, proving reflexivity.

  2. \sim is symmetric, that is \forall x,y\in\mathbb{N} we have x\sim y\Rightarrow y\sim x:

    Let x,y\in\mathbb{N} with x\sim y. We have that as x\sim y then x=y. By definition of equality we also have y=x and so y\sim x showing that \sim is symmetric.

  3. \sim is transitive, that is \forall x,y,z\in\mathbb{N} we have that if x\sim y and y\sim z then x\sim z:

    Let x,y,z\in\mathbb{N} such that x\sim y and y\sim z, then x=y and y=z. By definition of equality it follows that x=z and so x\sim z showing transitivity.

The result follows. $\qed$ :::

What does it mean for a relation to be reflexive, symmetric and transitive? In the case of equality on the natural numbers we see that reflexivity tells us that an element is equal to itself. Equality being symmetric tells us that if x=y then y=x, that is, it does not matter which way round we say the two numbers are equal. Finally transitivity tells us that if x=y and y=z we are able to deduce that x=z. In this context, equality being reflexive, symmetric and transitive allows us to identify which elements are equivalent. In the case of equality it is clear which elements are equivalent, the ones that are equal!

::: example Example 85. Consider X=Y=\mathbb{N} and define the relation R=\mathbb{N}\times\mathbb{N}. We have that R is reflexive as for any x\in\mathbb{N} we have that \left(x,x\right)\in R. Likewise R is symmetric as \forall x,y\in\mathbb{N} we have that \left(x,y\right)\in R\Rightarrow\left(y,x\right)\in R, as X=Y. Finally R is transitive as \forall x,y,z\in\mathbb{N} we have that \left(x,y\right)\in R and \left(y,z\right)\in R imply \left(x,z\right)\in R, since every pair of natural numbers belongs to R.

What does R being reflexive, symmetric and transitive mean? In this case R being reflexive, symmetric and transitive means that every x\in X and y\in Y are related and we can see R as a relation meaning "is an element of $\mathbb{N}$". This means that we have shown that X and Y are equivalent, which we already know by the fact we set X=Y=\mathbb{N}. :::

Based on the two examples we motivate the following definition.

::: definition Definition 92. Equivalence relation

Let S be a set and R\subseteq S\times S a relation. We say that R is an equivalence relation if and only if

  1. R is reflexive

  2. R is symmetric

  3. R is transitive :::
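The three conditions of the definition can be combined into a single check. This is a minimal sketch on a finite sample of \mathbb{N}, with the hypothetical helper `is_equivalence`; it is not a proof, only a brute-force test of the two examples above together with a relation that fails.

```python
def is_equivalence(S, R):
    """Check reflexivity, symmetry and transitivity of R on the finite set S."""
    reflexive = all((s, s) in R for s in S)
    symmetric = all((y, x) in R for (x, y) in R)
    transitive = all((a, d) in R for (a, b) in R for (c, d) in R if b == c)
    return reflexive and symmetric and transitive

S = range(5)  # finite sample of the naturals (assumption)
equality = {(x, x) for x in S}               # x ~ y meaning x = y
everything = {(x, y) for x in S for y in S}  # the full relation S x S
leq = {(x, y) for x in S for y in S if x <= y}

print(is_equivalence(S, equality))
print(is_equivalence(S, everything))
print(is_equivalence(S, leq))  # fails because <= is not symmetric
```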

Proposition 59{reference-type="ref" reference="prop:EqualityOnNaturalsIsEquivRelation"} is equivalent to saying that equality is an equivalence relation on \mathbb{N}. The two examples also show a disparity between the two equivalence relations. In the case of equality the relation R was a proper subset of \mathbb{N}\times\mathbb{N}, whereas in the second example R was equal to \mathbb{N}\times \mathbb{N}. This raises the question: what is different? We can answer this by looking at the set of elements that relate to a given element. Such a set is called an equivalence class.

::: definition Definition 93. Equivalence class

Let S be a set, let x\in S and let R be an equivalence relation on S. We define an equivalence class, denoted \left[x\right] to be the set

$$\begin{equation*} \left[x\right]=\left\{y\in S:xRy\right\} \end{equation*}$$

If the context doesn't make clear the relation we are referring to, we explicitly write \left[x\right]_R for the equivalence class of x under the relation R.

We say that an element y\in\left[x\right] is a representative of the equivalence class of $x$ :::

To get a feel for equivalence classes we consider the following non-mathematical example.

::: example Example 86. Consider the set X to be the set of all people currently alive. Define a relation, \sim, on X by

$$\begin{equation*} \forall\left(x,y\right)\in X\times X: x\sim y\iff x\text{ and }y\text{ were born in the same year} \end{equation*}$$

We have that \sim is an equivalence relation. Clearly x\sim x, as x was born in the same year as themselves. If x\sim y then x and y were born in the same year and clearly y\sim x. Now if x\sim y and y\sim z then x and y were born in the same year and y and z were born in the same year. This therefore means x and z were born in the same year so x\sim z, showing transitivity.

Now let x\in X and consider the equivalence class \left[x\right]_\sim. By definition of an equivalence class we have that

$$\begin{equation*} \left[x\right]_\sim=\left\{y\in X:x\sim y\right\} \end{equation*}$$

This means that the equivalence class \left[x\right]_\sim is the set of all people currently alive that were born in the same year as x. As X was the set of all currently alive people we have found a way to extract a subset of X whose members were all born in the same year. If we now pick another element of X, say a, such that x\not\sim a then by definition a was not born in the same year as x and \left[a\right]_\sim is another subset of X of currently alive people born in the same year. Moreover we have that \left[x\right]_\sim\neq\left[a\right]_\sim. We can do this for every element of X and get a collection of sets that correspond to all of the different years in which anyone currently alive could have been born. :::
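To make the example concrete, here is a small sketch with invented data. The names and birth years below are hypothetical and exist only for illustration, as do the helper names `related` and `equivalence_class`.

```python
# Hypothetical people and birth years, invented purely for illustration.
birth_year = {"Ada": 1983, "Ben": 1990, "Cleo": 1983, "Dev": 1990, "Eve": 2001}
X = set(birth_year)

def related(x, y):
    """x ~ y if and only if x and y were born in the same year."""
    return birth_year[x] == birth_year[y]

def equivalence_class(x):
    """The set [x] of all y in X with x ~ y."""
    return frozenset(y for y in X if related(x, y))

print(sorted(equivalence_class("Ada")))

# Collecting the class of every element gives one set per birth year.
classes = {equivalence_class(x) for x in X}
print(len(classes))
```

Using `frozenset` lets the classes themselves be collected into a set, so that two people born in the same year contribute the same class only once.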

The previous example has shown that we are able to construct a partition of a set S which has an equivalence relation \sim. We can prove this more generally, firstly we recall the definition of a set partition.

Let S be a set and let \mathbb{S} be a set of subsets of S. We say that \mathbb{S} is a partition of S if the following hold.

  1. \forall S_1,S_2\in\mathbb{S} we have S_1\cap S_2=\emptyset whenever S_1\neq S_2

  2. Taking the union of every T\in\mathbb{S} gives us S that is

    $$\begin{equation*} S=\bigcup_{T\in\mathbb{S}} T \end{equation*}$$

  3. \forall T\in\mathbb{S} we have that T\neq\emptyset.

Before we can show that the equivalence classes partition the set we must first show that there can be no empty equivalence class.

::: {#prop:EquivClassNonEmpty .proposition} Proposition 60. Equivalence class is non-empty

Let S be a set with an equivalence relation \sim. Let x\in S then we have that $\left[x\right]_\sim\neq\emptyset$

Proof:

Let S be a set with an equivalence relation \sim. By definition of an equivalence relation we have that \forall x,y,z\in S that

  1. \sim is reflexive, that is $x\sim x$

  2. \sim is symmetric, that is $x\sim y\Rightarrow y\sim x$

  3. \sim is transitive, that is x\sim y and y\sim z implies that $x\sim z$

Consider the equivalence class \left[x\right]_\sim. By definition of an equivalence class we know that

$$\begin{equation*} \left[x\right]_\sim=\left\{y\in S:x\sim y\right\} \end{equation*}$$

As \sim is reflexive we have that x\sim x and so x\in\left[x\right]_\sim and therefore \left[x\right]_\sim\neq\emptyset. $\qed$ :::

We can prove that an equivalence relation partitions the set it is defined on.

::: {#thm:EquivClassesOfRelationPartitionSet .theorem} Theorem 19. Equivalence classes of a relation partition the set

Let S be a set with an equivalence relation \sim. Let \mathbb{S} denote the equivalence classes of \sim for each s\in S. We have that \mathbb{S} is a partition of S.

Proof:

Let S be a set with an equivalence relation \sim and let \mathbb{S} be the set of equivalence classes of \sim for each s\in S. Let x\in S then x belongs to at least one equivalence class by proposition 60{reference-type="ref" reference="prop:EquivClassNonEmpty"}. We therefore have that

$$\begin{equation*} S=\bigcup_{x\in S}\left[x\right]_\sim \end{equation*}$$

It is left to show that if \left[x\right]_\sim\neq\left[y\right]_\sim for x,y\in S then we have that \left[x\right]_\sim\cap\left[y\right]_\sim=\emptyset. This is equivalent to saying that if \left[x\right]_\sim\cap\left[y\right]_\sim\neq\emptyset then \left[x\right]_\sim=\left[y\right]_\sim. So suppose that \left[x\right]_\sim\cap\left[y\right]_\sim\neq\emptyset then \left[x\right]_\sim\cap\left[y\right]_\sim has at least one element z. As z\in\left[x\right]_\sim we have by definition that x\sim z. Let a\in\left[x\right]_\sim be an arbitrary element of the equivalence class of x, so that x\sim a and, by symmetry, a\sim x. By transitivity of \sim we conclude that a\sim z. However as z\in\left[x\right]_\sim\cap\left[y\right]_\sim then we have that z\in\left[y\right]_\sim and so y\sim z. As \sim is symmetric we have z\sim y and again by transitivity we conclude that a\sim y, and so y\sim a by symmetry. Hence a\in\left[y\right]_\sim and so \left[x\right]_\sim\subseteq\left[y\right]_\sim.

A similar argument shows \left[y\right]_\sim\subseteq\left[x\right]_\sim and therefore we have that \left[x\right]_\sim=\left[y\right]_\sim. Finally we conclude that unequal equivalence classes are disjoint and, as each equivalence class is non-empty, the set of equivalence classes \mathbb{S} is a partition of S.

The result is shown. $\qed$ :::
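The theorem can be illustrated on small finite sets. The sketch below is only a brute-force check under stated assumptions: `quotient_set` and `is_partition` are hypothetical helper names, and the relation is passed in as a Python predicate.

```python
def quotient_set(S, related):
    """Collect the class {y in S : x ~ y} of every x in S."""
    return {frozenset(y for y in S if related(x, y)) for x in S}

def is_partition(S, blocks):
    """Check the three partition conditions: non-empty, covering, pairwise disjoint."""
    blocks = list(blocks)
    non_empty = all(len(b) > 0 for b in blocks)
    covers = set().union(*blocks) == set(S)
    disjoint = all(a.isdisjoint(b)
                   for i, a in enumerate(blocks) for b in blocks[i + 1:])
    return non_empty and covers and disjoint

S = {0, 1, 2, 3, 4}
by_equality = quotient_set(S, lambda x, y: x == y)  # five singleton classes
by_everything = quotient_set(S, lambda x, y: True)  # one class, S itself
by_leq = quotient_set(S, lambda x, y: x <= y)       # not an equivalence relation

print(is_partition(S, by_equality), is_partition(S, by_everything))
print(is_partition(S, by_leq))
```

The last case shows why the hypotheses matter: for less than or equal to, the sets of related elements overlap, so they do not form a partition.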

Construction of the Integers

::: epigraph The trouble with integers is that we have examined only the very small ones. Maybe all the exciting stuff happens at really big numbers, ones we can't even begin to think about in any very definite way.

Ronald Graham :::

We now have enough theory to consider extending the natural numbers \mathbb{N}. One reason to do this is to provide a completion to the idea of subtraction. Recall that n-m is defined in \mathbb{N} if and only if m\leq n. This is a limiting idea. For example, the idea of debt can't be explained using only \mathbb{N}. We know that if the balance on your bank account is negative then you owe money to someone, if your balance is positive you have money to spend. The natural numbers don't have a concept of "negative" or debt, we can only deal with "positive" values. To keep the financial institutions happy we should resolve this issue.

To do this we need to consider exactly what it is we want to achieve. Firstly we want to be able to define n-m for all n,m\in\mathbb{N}. Clearly, if n\geq m then such a number already exists in \mathbb{N}. Secondly, such a number n-m could have many different representations, for example, 6-2=4 and 5-1=4. We need a way to say that any of these different representations actually represents the same thing. Formally if we have a,b,c,d\in\mathbb{N} such that a-b=c-d then a-b and c-d represent the same number, this is equivalent to a+d=b+c. Thinking of - as a relation we can use the language of equivalence relations to solve this issue. That is, we want a relation where \left(a,b\right)\sim\left(c,d\right) if and only if a+d=b+c.

Defining the Integers

We start by recasting subtraction as an ordered tuple.

::: definition Definition 94. Subtraction as an ordered tuple

Let a,b\in\mathbb{N}. We define the subtraction as an ordered tuple \left(a,b\right)\in\mathbb{N}^2 to mean \left(a-b\right). We will call an element x\in\mathbb{N}^2 a subtraction tuple. We note that if a\geq b we have $\left(a-b\right)\in\mathbb{N}$ :::

From this we can define a relation

::: definition Definition 95. Relation on subtraction

Let \left(a,b\right),\left(c,d\right)\in\mathbb{N}^2 be subtraction tuples. We define the relation \sim such that \left(a,b\right)\sim\left(c,d\right) if and only if $a+d=b+c$ :::

We have that this relation is an equivalence relation.

::: proposition Proposition 61. Relation on subtraction ordered tuples is an equivalence relation

Let x,y\in\mathbb{N}^2 be subtraction tuples and define the relation x\sim y as above. We have that \sim is an equivalence relation.

Proof:

Let x,y,z\in\mathbb{N}^2 be subtraction tuples such that x=\left(a,b\right), y=\left(c,d\right) and z=\left(e,f\right). We need to show that \sim is an equivalence relation, that is

  1. \sim is reflexive

  2. \sim is symmetric

  3. \sim is transitive

  1. \sim is reflexive:

    We have that x=\left(a,b\right) and by definition of \sim we know that x\sim x if and only if a+b=b+a, which holds by commutativity of addition on the natural numbers. Hence x\sim x and \sim is reflexive.

  2. \sim is symmetric:

    We have that x=\left(a,b\right) and y=\left(c,d\right). Suppose that x\sim y then we have that a+d=b+c. By symmetry of equality on the natural numbers we have that a+d=b+c\Rightarrow b+c=a+d. By commutativity of addition on the natural numbers we have that b+c=a+d is the same as c+b=d+a. Hence we have that \left(c,d\right)\sim\left(a,b\right) by definition of \sim and so y\sim x showing that \sim is symmetric.

  3. \sim is transitive:

    We know that x=\left(a,b\right), y=\left(c,d\right) and z=\left(e,f\right). Now suppose that x\sim y and y\sim z then by definition we have that \left(a,b\right)\sim\left(c,d\right) and \left(c,d\right)\sim\left(e,f\right) and hence by definition of \sim we have a+d=c+b and c+f=e+d.

    Consider a+c+f we have

    $$\begin{equation*} a+c+f=a+e+d=a+d+e=c+b+e \end{equation*}$$

    That is to say a+c+f=c+b+e. We have by the cancellation laws on the natural numbers that a+f=b+e, which implies that \left(a,b\right)\sim\left(e,f\right), which is to say x\sim z. Hence transitivity has been shown.

It follows that \sim is an equivalence relation. $\qed$ :::
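The proposition can be sanity-checked numerically. The following sketch assumes a finite sample of \mathbb{N}^2 and a hypothetical helper `related` that encodes a+d=b+c; a finite check of this kind illustrates the result but does not replace the proof.

```python
from itertools import product

pairs = list(product(range(5), repeat=2))  # finite sample of N^2 (assumption)

def related(p, q):
    """(a, b) ~ (c, d) if and only if a + d == b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c

reflexive = all(related(p, p) for p in pairs)
symmetric = all(related(q, p)
                for p in pairs for q in pairs if related(p, q))
transitive = all(related(p, r)
                 for p in pairs for q in pairs for r in pairs
                 if related(p, q) and related(q, r))

print(reflexive, symmetric, transitive)
```

Note that only addition on the naturals is used, in keeping with the definition of the relation, so no subtraction is needed to compare two subtraction tuples.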

Now that we have shown that \sim is an equivalence relation we can solve the multiple representation problem by considering the equivalence classes of \mathbb{N}^2 with the relation \sim. Let x\in\mathbb{N}^2 with x=\left(a,b\right) then the equivalence class of x is given by

$$\begin{equation*} \left[x\right]_\sim=\left[\left(a,b\right)\right]_\sim=\left\{\left(c,d\right)\in\mathbb{N}^2 : \left(a,b\right)\sim\left(c,d\right)\right\} \end{equation*}$$

We know by theorem 19{reference-type="ref" reference="thm:EquivClassesOfRelationPartitionSet"} that the set of equivalence classes partitions \mathbb{N}^2 and that distinct equivalence classes are disjoint. This is to say if x,y\in\mathbb{N}^2 then we have that if \left[x\right]_\sim\cap\left[y\right]_\sim\neq\emptyset then \left[x\right]_\sim =\left[y\right]_\sim. This solves the multiple representation issue.

Let us have a look at some equivalence classes

::: example Example 87. Let x\in\mathbb{N}^2 with x=\left(1,3\right); by definition we have that x represents 1-3. Consider the equivalence class of x, \left[x\right]=\left\{y\in\mathbb{N}^2:x\sim y\right\} and let y\in\left[x\right]. We have that y=\left(c,d\right) and that 1+d=3+c. One possible y where this is true is given by y=\left(0,2\right), and y represents 0-2. As we have y\in\left[x\right] then we have that \left[x\right]=\left[y\right] so we shall take y to be the canonical representative of this equivalence class. :::
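One way to pick such a representative systematically is to subtract the smaller entry from both entries, so that at least one entry becomes 0. This is a sketch of that idea, not a definition from the text; `canonical` and `related` are hypothetical names, and the only subtractions used are ones already defined in \mathbb{N}.

```python
from itertools import product

def related(p, q):
    """(a, b) ~ (c, d) if and only if a + d == b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c

def canonical(pair):
    """Representative of the class of (a, b) with at least one zero entry."""
    a, b = pair
    m = min(a, b)          # m <= a and m <= b, so both differences exist in N
    return (a - m, b - m)

print(canonical((1, 3)), canonical((6, 2)), canonical((4, 4)))

# On a finite sample, two tuples relate exactly when their representatives agree.
sample = list(product(range(6), repeat=2))
consistent = all(related(p, q) == (canonical(p) == canonical(q))
                 for p in sample for q in sample)
print(consistent)
```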

Now that we have that the subtraction tuples are in equivalence classes we can consider the following. Suppose that a,b,c\in\mathbb{N} then what is a-\left(b-c\right)? For example if a=10, b=6 and c=3 then we have that 10-\left(6-3\right)=10-3=7. This is also the same as 10+3-6=13-6=7. This holds in general where we have that \left(a,b-c\right)\sim\left(a+c,b\right)

::: {#lem:NaturalMinusDifferenceOfNatural .lemma} Lemma 6. $\left(a,b-c\right)\sim\left(a+c,b\right)$

Let a,b,c\in\mathbb{N} with a> b\geq c. We have that

$$\begin{equation*} \left(a,b-c\right)\sim\left(a+c,b\right) \end{equation*}$$

Proof:

Let a,b,c\in\mathbb{N} be as given. By definition of \sim we have \left(x,y\right)\sim\left(u,v\right) if and only if x+v=u+y. We argue by contradiction, suppose that \left(a,b-c\right)\not\sim\left(a+c,b\right) then by definition we have that

$$\begin{align*} a+b&\neq a+c+\left(b-c\right)\\ b&\neq c+\left(b-c\right),\ \text{By the cancellation law}\\ b&\neq \left(c+b\right)-c,\ \text{By proposition }\ref{prop:NaturalAddDifference}\\ b&\neq\left(b+c\right)-c,\ \text{By commutativity}\\ b&\neq b+\left(c-c\right),\ \text{By proposition }\ref{prop:NaturalAddDifference}\\ 0&\neq \left(c-c\right),\ \text{By the cancellation law}\\ 0&\neq 0 \end{align*}$$

A contradiction. $\qed$ :::

By this lemma it follows that a-\left(b-c\right)=\left(a+c\right)-b.
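As a quick numerical check of the lemma and its consequence, the sketch below loops over a finite range of naturals with a>b\geq c. The bound 12 and the counter `checked` are arbitrary choices made for illustration.

```python
checked = 0
for a in range(12):
    for b in range(a):          # guarantees b < a
        for c in range(b + 1):  # guarantees c <= b, so b - c is a natural number
            # (a, b - c) ~ (a + c, b) means a + b == (a + c) + (b - c)
            assert a + b == (a + c) + (b - c)
            # the consequence a - (b - c) == (a + c) - b
            assert a - (b - c) == (a + c) - b
            checked += 1

print(checked)  # number of triples tested
```

Every subtraction in the loop has its second argument no larger than its first, so each one is already defined in \mathbb{N}.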

We now look at what the set of equivalence classes looks like. We make the following definition

::: {#def:QuotientSet .definition} Definition 96. Quotient set

Let S be a set with an equivalence relation \sim. Let x\in S and consider the equivalence class \left[x\right]_\sim. We define the quotient set of S, denoted by S/\sim by

$$\begin{equation*} S/\sim=\left\{\left[x\right]_\sim :x\in S\right\} \end{equation*}$$ :::

Why have we called the set of the equivalence classes a quotient set? We can see why with a few examples.

::: example Example 88. We reconsider the example where X is the set of all people currently alive with the relation \sim given by

$$\begin{equation*} \forall\left(x,y\right)\in X\times X: x\sim y\iff x\text{ and }y\text{ were born in the same year} \end{equation*}$$

We know that \sim is an equivalence relation and we know that each equivalence class is the set of all people currently alive born in a certain year. We can identify the quotient set X/\sim with the set of all of the possible years in which people currently alive were born. As an example suppose that person x\in X was born in 1983. Then by the definition of \sim we have that x\sim y if and only if y was also born in 1983 and that \left[x\right]_\sim is the equivalence class of all people born in 1983. As \left[x\right]_\sim\in X/\sim then \left[x\right]_\sim is the element of X/\sim that represents the year 1983. That is, the quotient set has taken the set X of all currently alive people and turned it into the set of all possible birth years. :::

::: example Example 89. Let X be the set of all possible cars and define the relation \sim such that x\sim y if and only if x and y are the same colour. We have that \sim is an equivalence relation. Reflexivity is clear as x is the same colour as itself, so x\sim x. Now if x\sim y then both x and y are the same colour and so y\sim x. Finally if x\sim y and y\sim z then x and y are the same colour and so are y and z so it follows that x\sim z.

Suppose now that x\in X, then the equivalence class \left[x\right]_\sim is the set of all cars that are the same colour as x. Hence the quotient set X/\sim can be identified with the set of all possible car colours. The quotient set has taken the set of all possible cars and turned it into the set of all possible car colours.

If we had a different relation R where xRy if and only if x and y have the same number of doors then R is also an equivalence relation and X/R would take all of the possible cars X and turn it into the set of all of the possible numbers of doors a car can have. :::

These examples show that the quotient set takes a set of objects S and extracts a given property defined by the equivalence relation \sim defined on S. How can we use the quotient set on the equivalence classes of the subtraction tuples?

We have that the quotient set \mathbb{N}^2/\sim is given by

$$\begin{equation*} \mathbb{N}^2/\sim=\left\{\left[x\right]_\sim:x\in\mathbb{N}^2\right\} \end{equation*}$$

What do these elements actually look like? Let \left(a,b\right)=x\in\mathbb{N}^2 and consider the equivalence class \left[x\right]_\sim. Firstly, in the naturals, we know that 0=0-0 and more generally that 0=a-a for any a\in\mathbb{N}. Hence 0\in\left[\left(0,0\right)\right].

Now, consider \left[\left(a,0\right)\right] then we would have that any \left(c,d\right)=y\in\left[\left(a,0\right)\right] is such that \left(a,0\right)\sim\left(c,d\right) if and only if a+d=c, that is a-0=c-d. Hence each a is equivalent to some subtraction tuple. Moreover each \left(a,0\right)=a\in\mathbb{N}, therefore we have a canonical representation for each element a\in\mathbb{N}. What happens if we have a tuple \left(a,b\right) where a<b? We can see that if \left(a,b\right)\sim\left(c,d\right) then a+d=c+b. For example we have that \left(0,3\right)\sim\left(1,4\right) which gives

$$\begin{equation*} 0-3=1-4 \Rightarrow 0+4=1+3 \Rightarrow 4=4 \end{equation*}$$

Similarly, \left(8,11\right)\sim\left(0,3\right), as 8-11=0-3 corresponds to 8+3=11+0.

Hence we can define a canonical representation for each \left(0,a\right) where a\in\mathbb{N}. We will write the element \left(0 ,a\right) as -a for each a\in\mathbb{N}. We have now defined the set of Integers.

::: definition Definition 97. Integers

Let \mathbb{N}^2 have the equivalence relation \sim defined by \left(a,b\right)\sim\left(c,d\right) if and only if a+d=b+c. We define the set of Integers, denoted \mathbb{Z}, as the quotient set \mathbb{N}^2/\sim. The set \mathbb{Z} has the form

$$\begin{equation} \mathbb{Z}=\left\{\dots,-4,-3,-2,-1,0,1,2,3,4,\dots\right\} \end{equation}$$ :::
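The construction can be sketched computationally. The following is a minimal illustration, assuming that non-negative Python `int` values stand in for \mathbb{N}; the function names `equivalent` and `canonical` are hypothetical and do not appear elsewhere in this document.

```python
# Sketch: integers as equivalence classes of pairs of naturals.
# Assumption: non-negative Python ints stand in for the naturals.

def equivalent(p, q):
    """(a, b) ~ (c, d) if and only if a + d == b + c."""
    a, b = p
    c, d = q
    return a + d == b + c

def canonical(p):
    """Return the representative of [p] of the form (n, 0) or (0, n)."""
    a, b = p
    if a >= b:
        return (a - b, 0)  # here a - b is a subtraction inside the naturals
    return (0, b - a)

print(equivalent((0, 3), (1, 4)))  # both pairs represent -3
print(canonical((8, 11)))          # canonical representative of -3
print(canonical((5, 1)))           # canonical representative of 4
```

Every pair reduces to exactly one canonical form, which is why the classes can be listed as \dots,-2,-1,0,1,2,\dots.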

We make two additional definitions based on the canonical form of the equivalence classes.

::: definition Definition 98. Positive Integer

Let a\in\mathbb{Z}. We say that a is a positive integer if and only if a\in\left[\left(b,0\right)\right] for some b\in\mathbb{N} with b\neq 0. :::

::: definition Definition 99. Negative Integer

Let a\in\mathbb{Z}. We say that a is a negative integer if and only if a\in\left[\left(0,b\right)\right] for some b\in\mathbb{N} with b\neq 0. :::

We can use these two definitions to define an occasionally useful idea.

::: definition Definition 100. Sign of an integer

Let x\in\mathbb{Z}. We define the sign of x, denoted by \mathop{\mathrm{sgn}}\left(x\right) to be the following function

$$\begin{align*} \mathop{\mathrm{sgn}}:\mathbb{Z}&\rightarrow\left\{-1,0,1\right\}\\ x&\mapsto\mathop{\mathrm{sgn}}\left(x\right)=\begin{cases} 1,\ \text{If } x\text{ is a positive integer}\\ -1,\ \text{If } x\text{ is a negative integer}\\ 0,\ \text{Otherwise} \end{cases} \end{align*}$$ :::

We also have the following, clear result

::: proposition Proposition 62. The natural numbers are a subset of the integers

We have that $\mathbb{N}\subseteq\mathbb{Z}$

Proof:

We have that the elements of the equivalence class \left[\left(x,0\right)\right] have the form x-0=x\in\mathbb{N}. Let a\in\mathbb{N} then we have that a\in\left[\left(a,0\right)\right]. This holds for every a\in\mathbb{N} and so \mathbb{N}\subseteq\mathbb{Z}. $\qed$ :::

We will let \left[\left(a,b\right)\right] be denoted by \left[a,b\right] and extend the operations of addition and multiplication to the integers by defining how they work on the equivalence classes.

Extending equality to the integers

Equality for the integers is easy to define.

::: definition Definition 101. Equality of integers

Let x,y\in\mathbb{Z} be two integer numbers. We define that two integers are equal, denoted x=y if and only if x\sim y. This is the same as saying both x and y belong to the same equivalence class. In the case where x\not\sim y, we say that x is not equal to y and write x\neq y. :::

Extending inequality operators to the integers

Inequality operators extend in a natural way.

::: definition Definition 102. Less than operator

Let x,y\in\mathbb{Z} where x\in\left[a,b\right] and y\in\left[c,d\right] for some a,b,c,d\in\mathbb{N}. The less than operator, denoted by x<y is defined by the logical proposition

$$\begin{equation*} <\left(x,y\right)=\begin{cases} 1,\ \text{If } a+d<b+c\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$

This can equivalently be expressed as

$$\begin{equation*} x<y \iff a+d<b+c \end{equation*}$$ :::

::: definition Definition 103. Less than or equal to operator

Let x,y\in\mathbb{Z} where x\in\left[a,b\right] and y\in\left[c,d\right] for some a,b,c,d\in\mathbb{N}. The less than or equal operator, denoted by x\leq y is defined by the logical proposition

$$\begin{equation*} \leq\left(x,y\right)=\begin{cases} 1,\ \text{If } a+d\leq b+c\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$

This can equivalently be expressed as

$$\begin{equation*} x\leq y \iff a+d\leq b+c \end{equation*}$$ :::

::: definition Definition 104. Greater than operator

Let x,y\in\mathbb{Z} where x\in\left[a,b\right] and y\in\left[c,d\right] for some a,b,c,d\in\mathbb{N}. The greater than operator, denoted by x>y is defined by the logical proposition

$$\begin{equation*} >\left(x,y\right)=\begin{cases} 1,\ \text{If } a+d>b+c\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$

This can equivalently be expressed as

$$\begin{equation*} x>y \iff a+d>b+c \end{equation*}$$ :::

::: definition Definition 105. Greater than or equal to operator

Let x,y\in\mathbb{Z} where x\in\left[a,b\right] and y\in\left[c,d\right] for some a,b,c,d\in\mathbb{N}. The greater than or equal to operator, denoted by x\geq y is defined by the logical proposition

$$\begin{equation*} \geq\left(x,y\right)=\begin{cases} 1,\ \text{If } a+d\geq b+c\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$

This can equivalently be expressed as

$$\begin{equation*} x\geq y \iff a+d\geq b+c \end{equation*}$$ :::
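The four order definitions translate directly into comparisons of natural numbers. The sketch below assumes pairs of non-negative Python `int` values as representatives; the function names are illustrative only.

```python
# Sketch: order on the integers, computed on representatives (a, b) and (c, d).
# Assumption: each integer is given by any pair of non-negative ints.

def less_than(x, y):
    """x < y if and only if a + d < b + c."""
    a, b = x
    c, d = y
    return a + d < b + c

def less_equal(x, y):
    """x <= y if and only if a + d <= b + c."""
    a, b = x
    c, d = y
    return a + d <= b + c

def greater_than(x, y):
    """x > y if and only if a + d > b + c, i.e. y < x."""
    return less_than(y, x)

def greater_equal(x, y):
    """x >= y if and only if a + d >= b + c, i.e. y <= x."""
    return less_equal(y, x)

# -3 < 2, checked with two different choices of representatives.
print(less_than((0, 3), (2, 0)))
print(less_than((1, 4), (7, 5)))
```

Both calls agree because the comparison only depends on the equivalence classes, not on the chosen pairs.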

Extending addition to the integers

We have an understanding of addition on the natural numbers, namely the recursive definition given by

$$\begin{align*} +&:\mathbb{N}^2\mathlarger{\mathlarger{\rightarrow}}\mathbb{N}\\ \left(m,n\right)&\mapsto +\left(m,n\right)=\begin{cases} m+0=m,\ \text{If } n=0\\ m+S\left(n\right)=S\left(m+n\right),\ \text{If } n\neq 0 \end{cases} \end{align*}$$

Now if we take a,b\in\mathbb{Z} with a,b being positive integers then we have that a\in\left[\left(a,0\right)\right] and b\in\left[\left(b,0\right)\right]. We then have that a+b will be in \left[\left(a+b,0\right)\right]. Now suppose that a,b\in\mathbb{Z} are negative integers, say a=-a' and b=-b' for some a',b'\in\mathbb{N}, then we have that a\in\left[\left(0,a'\right)\right] and b\in\left[\left(0,b'\right)\right]. Intuitively we know that -2+-3=-5 so we want these to add like in the positive integer case. This is to say that a+b will be in the class \left[\left(0,a'+b'\right)\right].

We can combine these two observations to define addition on the integers.

::: definition Definition 106. Addition on the Integers

Let x,y\in\mathbb{Z} with x=\left(a,b\right) and y=\left(c,d\right). We define addition on the integers by

$$\begin{equation} \left[a,b\right]+\left[c,d\right]=\left[a+c,b+d\right] \end{equation}$$ :::

To check this definition makes sense consider x=4,y=3. Both x and y belong to some equivalence class, for example x\in\left[\left(5,1\right)\right] and y\in\left[\left(8,5\right)\right]. Then we have that x+y=7 and

$$\begin{equation*} \left(5,1\right)+\left(8,5\right)=\left(5+8,1+5\right)=\left(13,6\right) \Rightarrow 13-6=7 \end{equation*}$$
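The check above can be reproduced mechanically. This is a small sketch under the assumption that pairs of non-negative Python `int` values represent integers; `add` and `canonical` are hypothetical helper names.

```python
# Sketch: componentwise addition of representatives, [a,b] + [c,d] = [a+c, b+d].

def add(p, q):
    a, b = p
    c, d = q
    return (a + c, b + d)

def canonical(p):
    """Reduce a pair to the form (n, 0) or (0, n)."""
    a, b = p
    return (a - b, 0) if a >= b else (0, b - a)

total = add((5, 1), (8, 5))   # representatives of 4 and 3
print(total)                  # (13, 6)
print(canonical(total))       # (7, 0), the canonical representative of 7
```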

Extending multiplication to the integers

We also extend multiplication to the integers. We have the definition of multiplication on the naturals given by

$$\begin{align*} *&:\mathbb{N}\times\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N}\\ \left(m,n\right)&\mapsto *\left(m,n\right)=\begin{cases} m*0=0,\ \text{If } n=0\\ m*S\left(n\right)=m*n+m,\ \text{If } n\neq 0 \end{cases} \end{align*}$$

As before, if we take x,y\in\mathbb{Z} with x,y being positive integers then we have that x\in\left[\left(x,0\right)\right] and y\in\left[\left(y,0\right)\right], and so x*y\in\left[\left(x*y,0\right)\right].

Suppose that x,y\in\mathbb{Z} with x=\left(a,b\right) and y=\left(c,d\right). We have that

$$\begin{align*} \left(a-b\right)\left(c-d\right)&=\left(a-b\right)c-\left(a-b\right)d\\ &=ac-bc-\left(ad-bd\right)\\ &=ac-bc+bd-ad\\ &= ac+bd-bc-ad\\ &=ac+bd-\left(ad+bc\right) \end{align*}$$ This suggests setting \left(a,b\right)*\left(c,d\right)=\left(ac+bd,ad+bc\right).

This will be the definition of multiplication of the integers.

::: definition Definition 107. Multiplication on the Integers

Let x,y\in\mathbb{Z} with x=\left(a,b\right) and y=\left(c,d\right). We define multiplication on the integers by

$$\begin{equation} \left[a,b\right]*\left[c,d\right]=\left[ac+bd,ad+bc\right] \end{equation}$$ :::
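As with addition, the definition can be tried on concrete representatives. The sketch below assumes pairs of non-negative Python `int` values; `mul` and `canonical` are illustrative names.

```python
# Sketch: multiplication of representatives, [a,b] * [c,d] = [ac+bd, ad+bc].

def mul(p, q):
    a, b = p
    c, d = q
    return (a * c + b * d, a * d + b * c)

def canonical(p):
    """Reduce a pair to the form (n, 0) or (0, n)."""
    a, b = p
    return (a - b, 0) if a >= b else (0, b - a)

print(canonical(mul((0, 2), (3, 0))))  # -2 * 3, gives (0, 6)
print(canonical(mul((1, 3), (2, 7))))  # -2 * -5, gives (10, 0)
```

The second call shows that the product of two negative integers lands in a class of the form \left[\left(n,0\right)\right], that is, a positive integer.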

Closure properties of addition and multiplication

As with the natural numbers we need to show that the operations of addition and multiplication are closed. Additionally we want to prove our claim at the start of this section that the integers allow us to completely perform subtraction.

::: theorem Theorem 20. Addition and multiplication on the integers are well-defined operators and closed

We have that \forall x,y\in\mathbb{Z} that

  1. $x+y\in\mathbb{Z}$

  2. $x*y\in\mathbb{Z}$

Proof:

  1. x+y\in\mathbb{Z}:

    We need to show that if \left(a,b\right)\sim\left(a',b'\right) and \left(c,d\right)\sim\left(c',d'\right) then \left(a+c,b+d\right)\sim\left(a'+c',b'+d'\right) as this will show equivalent elements produce the same result when added and therefore integer addition is well-defined.

    By definition, \left(a,b\right)\sim\left(a',b'\right) means that a+b'=a'+b, and likewise \left(c,d\right)\sim\left(c',d'\right) gives c+d'=c'+d.

    Adding these two equations, we have that

    $$\begin{align*} a+b'+c+d'&=a'+b+c'+d\\ a+c+b'+d'&=a'+c'+b+d\\ \Rightarrow \left(a+c,b+d\right)&\sim\left(a'+c',b'+d'\right) \end{align*}$$ Hence \left[\left(a+c,b+d\right)\right]=\left[\left(a'+c',b'+d'\right)\right] and so addition is well-defined.

    It is left to prove closure. Let x,y\in\mathbb{Z} with x=\left(a,b\right) and y=\left(c,d\right). By definition of integer addition we have that x+y=\left(a+c,b+d\right) and moreover we have a+c\in\mathbb{N} and b+d\in\mathbb{N}. Hence \left(a+c,b+d\right)\in\left[a+c,b+d\right] and therefore x+y\in\mathbb{Z} showing closure.

  2. x*y\in\mathbb{Z}:

    As with addition we need to show that if \left(a,b\right)\sim\left(a',b'\right) and \left(c,d\right)\sim\left(c',d'\right) then \left(a,b\right)*\left(c,d\right) \sim \left(a',b'\right)*\left(c',d'\right).

    We have that

    $$\begin{equation} \left(a,b\right)*\left(c,d\right)=\left(ac+bd,ad+bc\right)\iff ac+bd-\left(ad+bc\right) \end{equation}$$

    Now as \left(a,b\right)\sim\left(a',b'\right) then a+b'=b+a' and \left(c,d\right)\sim\left(c',d'\right) then c+d'=d+c'. Hence

    $$\begin{align*} ac+bd-\left(ad+bc\right)&=\left(ac-ad\right)+\left(bd-bc\right)\\ &=a\left(c-d\right)+b\left(d-c\right)\\ &=a\left(c'-d'\right)+b\left(d'-c'\right), \text{ By assumption as } c+d'=d+c'\Rightarrow c-d=c'-d'\\ &=ac'-ad'+bd'-bc'\\ &=\left(ac'-bc'\right)+\left(bd'-ad'\right), \text{ By commutativity of the Naturals}\\ &=c'\left(a-b\right)+d'\left(b-a\right)\\ &=c'\left(a'-b'\right)+d'\left(b'-a'\right), \text{ By assumption as } a+b'=b+a'\Rightarrow a-b=a'-b'\\ &=\left(c'a'-c'b'\right)+\left(d'b'-d'a'\right)\\ &=c'a'-c'b'+d'b'-d'a'\\ &=a'c'-b'c'+b'd'-a'd', \text{ By commutativity of the Naturals}\\ &=\left(a'c'+b'd'\right)-b'c'-a'd'\\ &=\left(a'c'+b'd'\right)-\left(a'd'+b'c'\right), \text{ By lemma \ref{lem:NaturalMinusDifferenceOfNatural}} \end{align*}$$

    This shows that multiplication is well-defined. It is left to show closure. Let x,y\in\mathbb{Z} with x=\left(a,b\right) and y=\left(c,d\right). By the definition of multiplication on the integers we have that x*y=\left(ac+bd,ad+bc\right) with ac+bd\in\mathbb{N} and ad+bc\in\mathbb{N}. Hence we conclude that \left(ac+bd,ad+bc\right)\in\left[ac+bd,ad+bc\right], and so by definition x*y\in\mathbb{Z}.

The result is shown. $\qed$ :::
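Well-definedness can also be spot-checked by brute force on small representatives. This is only a finite sanity check and not a substitute for the proof; it assumes pairs of non-negative Python `int` values, and the helper names are hypothetical.

```python
from itertools import product

# Sketch: equivalent inputs should give equivalent sums and products.

def equivalent(p, q):
    return p[0] + q[1] == p[1] + q[0]

def add(p, q):
    return (p[0] + q[0], p[1] + q[1])

def mul(p, q):
    a, b = p
    c, d = q
    return (a * c + b * d, a * d + b * c)

# All pairs (a, b) with 0 <= a, b <= 3.
pairs = list(product(range(4), repeat=2))

well_defined = all(
    equivalent(add(p, q), add(p2, q2)) and equivalent(mul(p, q), mul(p2, q2))
    for p, p2, q, q2 in product(pairs, repeat=4)
    if equivalent(p, p2) and equivalent(q, q2)
)
print(well_defined)
```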

Now that we have shown closure we can deduce an immediate property.

::: {#prop:multiplication_by_negative_one_for_integers .proposition} Proposition 63. Multiplication of an integer by $-1$

Let x\in\mathbb{Z} where x\in\left[a,b\right] for some a,b\in\mathbb{N}. We have that

  1. $-1*x = -1*\left(a,b\right)=\left(b,a\right)$

  2. $x*-1 = \left(a,b\right)*-1=\left(b,a\right)$

Proof:

  1. -1*x = -1*\left(a,b\right)=\left(b,a\right):

    We have that -1\in\left[0,1\right] and so

    $$\begin{align*} -1*x&=\left(0,1\right)*\left(a,b\right)\\ &=\left(0*a+1*b,0*b+1*a\right)\\ &=\left(b,a\right) \end{align*}$$

  2. x*-1 = \left(a,b\right)*-1=\left(b,a\right):

    Likewise we have

    $$\begin{align*} x*-1&=\left(a,b\right)*\left(0,1\right)\\ &=\left(a*0+b*1,a*1+b*0\right)\\ &=\left(b,a\right) \end{align*}$$

As required. $\qed$ :::

::: {#cor:multiplication_by_negative_one_changes_integer_sign .corollary} Corollary 3. Multiplication of a positive integer by -1 makes it a negative integer and multiplication of a negative integer by -1 makes it a positive integer

  1. If x is a positive integer then -1*x is a negative integer.

  2. If x is a negative integer then -1*x is a positive integer.

Proof:

By definition if x\in\mathbb{Z} is positive then x\in\left[a,0\right] for some a\in\mathbb{N} with a\neq 0. By proposition 63 we have that -1*x=\left(0,a\right)=x*-1, which is by definition a negative integer.

Likewise if x\in\mathbb{Z} is negative then x\in\left[0,a\right] for some a\in\mathbb{N} with a\neq 0. By proposition 63 we have that -1*x=\left(a,0\right)=x*-1, which is by definition a positive integer.

$\qed$ :::

Associativity of integer addition and multiplication

The associativity of addition and multiplication of the naturals also extends to the integers.

::: theorem Theorem 21. Let x,y,z\in\mathbb{Z}. We have that

  1. $x+\left(y+z\right)=\left(x+y\right)+z$

  2. $x\left(yz\right)=\left(xy\right)z$

Proof:

  1. x+\left(y+z\right)=\left(x+y\right)+z:

    Let x,y,z\in\mathbb{Z} be such that x=\left(a,b\right), y=\left(c,d\right) and z=\left(e,f\right) where a,b,c,d,e,f\in\mathbb{N} and we have that \left(a,b\right)\in\left[a,b\right], \left(c,d\right)\in\left[c,d\right] and \left(e,f\right)\in\left[e,f\right]. We have that

    $$\begin{align*} x+\left(y+z\right)&=\left(a,b\right)+\left(\left(c,d\right)+\left(e,f\right)\right)\\ &=\left(a,b\right)+\left(c+e,d+f\right)\\ &=\left(a+\left(c+e\right),b+\left(d+f\right)\right)\\ &=\left(\left(a+c\right)+e,\left(b+d\right)+f\right),\text{ By associativity of addition for natural numbers}\\ &=\left(a+c,b+d\right)+\left(e,f\right)\\ &=\left(\left(a,b\right)+\left(c,d\right)\right)+\left(e,f\right)\\ &=\left(x+y\right)+z \end{align*}$$

    Which shows associativity of addition.

  2. x\left(yz\right)=\left(xy\right)z:

    As with addition, let x,y,z\in\mathbb{Z} be such that x=\left(a,b\right), y=\left(c,d\right) and z=\left(e,f\right) where a,b,c,d,e,f\in\mathbb{N} and we have that \left(a,b\right)\in\left[a,b\right], \left(c,d\right)\in\left[c,d\right] and \left(e,f\right)\in\left[e,f\right]. We then have that

    $$\begin{align*} x\left(yz\right)&=\left(a,b\right)\left(\left(c,d\right)\left(e,f\right)\right)\\ &=\left(a,b\right)\left(ce+df,cf+de\right)\\ &=\left(a\left(ce+df\right)+b\left(cf+de\right),a\left(cf+de\right)+b\left(ce+df\right)\right)\\ &=\left(ace+adf+bcf+bde,acf+ade+bce+bdf\right)\\ &=\left(ace+bde+adf+bcf,acf+bdf+ade+bce\right),\ \text{By commutativity and associativity of addition for natural numbers}\\ &=\left(\left(ac+bd\right)e+\left(ad+bc\right)f,\left(ac+bd\right)f+\left(ad+bc\right)e\right)\\ &=\left(ac+bd,ad+bc\right)\left(e,f\right)\\ &=\left(\left(a,b\right)\left(c,d\right)\right)\left(e,f\right)\\ &=\left(xy\right)z \end{align*}$$

    Showing associativity of multiplication.

The result follows. $\qed$ :::

Commutativity of integer addition and multiplication

As with the naturals, addition and multiplication in the integers both satisfy commutativity.

::: theorem Theorem 22. Addition and multiplication are commutative

For all x,y\in\mathbb{Z} we have that

  1. $x+y=y+x$

  2. $xy=yx$

Proof:

  1. x+y=y+x:

    Let x,y\in\mathbb{Z}. By definition we have that x\in\left[a,b\right] and y\in\left[c,d\right] for some a,b,c,d\in\mathbb{N}. Let x=\left(a,b\right) and y=\left(c,d\right). We then have by definition of addition that

    $$\begin{align*} x+y&=\left(a,b\right)+\left(c,d\right)\\ &=\left(a+c,b+d\right)\\ &=\left(c+a,d+b\right),\ \text{By commutativity of addition for natural numbers}\\ &= \left(c,d\right)+\left(a,b\right)\\ &=y+x \end{align*}$$

    Showing commutativity holds for addition in the integers.

  2. xy=yx:

    Let x,y\in\mathbb{Z} by definition we have that x\in\left[a,b\right] and y\in\left[c,d\right] for some a,b,c,d\in\mathbb{N}. So let x=\left(a,b\right) and y=\left(c,d\right). By definition of multiplication we have

    $$\begin{align*} xy&=\left(a,b\right)\left(c,d\right)\\ &=\left(ac+bd,ad+bc\right)\\ &=\left(ca+db,da+cb\right), \text{By commutativity of multiplication of the naturals}\\ &=\left(ca+db,cb+da\right), \text{By commutativity of addition of the naturals}\\ &=\left(c,d\right)\left(a,b\right)\\ &=yx \end{align*}$$

    Showing commutativity for integer multiplication.

The result has been shown. $\qed$ :::

Multiplication distributes over addition

Another result that extends from the naturals is that multiplication distributes over addition.

::: theorem Theorem 23. Multiplication distributes over addition

For all x,y,z\in\mathbb{Z} we have that

  1. $x\left(y+z\right)=xy+xz$

  2. $\left(y+z\right)x=yx+zx=xy+xz$

Proof:

Let x,y,z\in\mathbb{Z} then x\in\left[a,b\right],y\in\left[c,d\right] and z\in\left[e,f\right] for some a,b,c,d,e,f\in\mathbb{N}.

So let x=\left(a,b\right), y=\left(c,d\right) and z=\left(e,f\right).

  1. x\left(y+z\right)=xy+xz:

    We have that

    $$\begin{align*} x\left(y+z\right)&=\left(a,b\right)\left(\left(c,d\right)+\left(e,f\right)\right)\\ &=\left(a,b\right)\left(c+e,d+f\right)\\ &=\left(a\left(c+e\right)+b\left(d+f\right),a\left(d+f\right)+b\left(c+e\right)\right)\\ &=\left(ac+ae+bd+bf,ad+af+bc+be\right)\\ &=\left(ac+bd+ae+bf,ad+bc+af+be\right)\\ &=\left(ac+bd,ad+bc\right)+\left(ae+bf,af+be\right)\\ &=\left(a,b\right)\left(c,d\right)+\left(a,b\right)\left(e,f\right)\\ &=xy+xz \end{align*}$$

  2. \left(y+z\right)x=yx+zx=xy+xz:

    Now that we have the previous part the proof of this part is quick. We have

    $$\begin{align*} \left(y+z\right)x&=x\left(y+z\right), \text{By commutativity of multiplication}\\ &=xy+xz, \text{By part }1.\\ &=yx+zx, \text{By commutativity of multiplication} \end{align*}$$

As required. $\qed$ :::

The Zero and Identity laws

The zero and identity laws from the naturals extend to the integers.

::: theorem Theorem 24. The zero and Identity laws

Let x\in\mathbb{Z}. We have that

  1. $x+0=x=0+x$

  2. $1*x=x=x*1$

Proof:

Let x\in\mathbb{Z} then we have that x=\left(a,b\right) for some $a,b\in\mathbb{N}$

  1. x+0=x=0+x:

    We have that 0\in\left[0,0\right]. Hence we have that

    $$\begin{equation*} x+0=\left(a,b\right)+\left(0,0\right)=\left(a+0,b+0\right)=\left(a,b\right)=\left(0+a,0+b\right)=\left(0,0\right)+\left(a,b\right)=0+x \end{equation*}$$

  2. x*1=x=1*x:

    As 1\in\left[1,0\right] then

    $$\begin{align*} x*1&=\left(a,b\right)*\left(1,0\right)\\ &=\left(a*1+b*0,b*1+a*0\right)\\ &=\left(a+0,b+0\right)\\ &=\left(a,b\right)=x\\ &=\left(1*a+0*b,0*a+1*b\right)\\ &=\left(1,0\right)*\left(a,b\right)\\ &=1*x \end{align*}$$

The result follows. $\qed$ :::
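The laws proved so far (associativity, commutativity, distributivity, and the zero and identity laws) can be spot-checked on small representatives. The sketch below is a finite sanity check only, assuming pairs of non-negative Python `int` values; all names are illustrative.

```python
from itertools import product

# Sketch: check the arithmetic laws on all pairs (a, b) with 0 <= a, b <= 2.

def equivalent(p, q):
    return p[0] + q[1] == p[1] + q[0]

def add(p, q):
    return (p[0] + q[0], p[1] + q[1])

def mul(p, q):
    a, b = p
    c, d = q
    return (a * c + b * d, a * d + b * c)

ZERO, ONE = (0, 0), (1, 0)
pairs = list(product(range(3), repeat=2))

laws_hold = all(
    equivalent(add(x, add(y, z)), add(add(x, y), z))                  # associativity of +
    and equivalent(mul(x, mul(y, z)), mul(mul(x, y), z))              # associativity of *
    and equivalent(add(x, y), add(y, x))                              # commutativity of +
    and equivalent(mul(x, y), mul(y, x))                              # commutativity of *
    and equivalent(mul(x, add(y, z)), add(mul(x, y), mul(x, z)))      # distributivity
    and equivalent(add(x, ZERO), x)                                   # zero law
    and equivalent(mul(x, ONE), x)                                    # identity law
    for x, y, z in product(pairs, repeat=3)
)
print(laws_hold)
```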

Extending subtraction to the integers

As we have a notion of subtraction on the naturals, we can ask about extending this to the integers. We defined subtraction on the naturals as follows. Let n,m\in\mathbb{N} such that m\leq n. Let d\in\mathbb{N} such that n=m+d. We define subtraction by

$$\begin{equation*} d=n-m \end{equation*}$$

Where we called d the difference between n and m. We also have the notion of a positive and negative integer. Recall that x\in\mathbb{Z} is a positive integer if and only if x\in\left[\left(b,0\right)\right] for some b\in\mathbb{N} with b\neq 0. Likewise x is a negative integer if and only if x\in\left[\left(0,b\right)\right] for some b\in\mathbb{N} with b\neq 0. In order to extend subtraction to the integers we need to consider a few things.

::: definition Definition 108. Negation of a natural number

Let x\in\mathbb{Z} so that x is a positive integer, i.e. a non-zero natural number. We define the negation of x, denoted -x by

$$\begin{equation} -x=-1*x=\left(0,1\right)*x \end{equation}$$

where \left(0,1\right)\in\left[\left(0,1\right)\right]. That is, \left(0,1\right) is a representative of the equivalence class \left[\left(0,1\right)\right], which consists of all the pairs that represent -1. :::

We can extend this result to include a general integer.

::: proposition Proposition 64. Negation of an integer

Let x\in\mathbb{Z} so that x\in\left[\left(a,b\right)\right] for some a,b\in\mathbb{N}. We have that

$$\begin{equation*} -1*x=-1*\left(a,b\right)=\left(b,a\right) \end{equation*}$$

Proof:

Let x\in\mathbb{Z} be as given by the hypothesis. We have that

$$\begin{align*} -1*x&=-1*\left(a,b\right)\\ &=\left(0,1\right)*\left(a,b\right)\\ &=\left(0*a+b*1,0*b+1*a\right)\\ &=\left(b,a\right) \end{align*}$$

As required. $\qed$ :::

In light of this, we can define subtraction for integers.

::: definition Definition 109. Integer subtraction

Let x,y\in\mathbb{Z}. We define the subtraction of y from x, denoted x-y by

$$\begin{equation} x-y=x+\left(-y\right)=x+\left(-1*y\right) \end{equation}$$ :::

We immediately get that subtraction is closed, from the fact that both addition and multiplication are closed. We do not have associativity of subtraction in general.

::: proposition Proposition 65. Integer subtraction is not associative

There exist x,y,z\in\mathbb{Z} such that

$$\begin{equation*} x-\left(y-z\right)\neq \left(x-y\right)-z \end{equation*}$$

Proof:

Let x=2, y=4 and z=6, we have x\in\left[2,0\right], y\in\left[4,0\right] and z\in\left[6,0\right], so we may take the representatives x=\left(2,0\right), y=\left(4,0\right) and z=\left(6,0\right). We have that

$$\begin{align*} x-\left(y-z\right)&=\left(2,0\right)-\left(\left(4,0\right)-\left(6,0\right)\right)\\ &=\left(2,0\right)-\left(\left(4,0\right)+\left(-1*\left(6,0\right)\right)\right)\\ &=\left(2,0\right)-\left(\left(4,0\right)+\left(0,6\right)\right)\\ &=\left(2,0\right)-\left(4,6\right)\\ &=\left(2,0\right)+\left(-1*\left(4,6\right)\right)\\ &=\left(2,0\right)+\left(6,4\right)\\ &=\left(8,4\right) \end{align*}$$

On the other side we have

$$\begin{align*} \left(x-y\right)-z&=\left(\left(2,0\right)-\left(4,0\right)\right)-\left(6,0\right)\\ &=\left(\left(2,0\right)+\left(-1*\left(4,0\right)\right)\right)-\left(6,0\right)\\ &=\left(\left(2,0\right)+\left(0,4\right)\right)-\left(6,0\right)\\ &=\left(2,4\right)-\left(6,0\right)\\ &=\left(2,4\right)+\left(-1*\left(6,0\right)\right)\\ &=\left(2,4\right)+\left(0,6\right)\\ &=\left(2,10\right) \end{align*}$$

Clearly \left(8,4\right)\neq \left(2,10\right). Indeed they are not even equivalent. Suppose that \left(8,4\right)\sim\left(2,10\right) then we have that 8+10=4+2. However 18\neq 6. $\qed$ :::
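The counterexample can be replayed step by step. The sketch below assumes pairs of non-negative Python `int` values as representatives; `neg` and `sub` are hypothetical names for negation and subtraction as defined above.

```python
# Sketch: subtraction as x + (-1 * y), where -1 * (a, b) = (b, a).

def equivalent(p, q):
    return p[0] + q[1] == p[1] + q[0]

def add(p, q):
    return (p[0] + q[0], p[1] + q[1])

def neg(p):
    a, b = p
    return (b, a)  # result of multiplying by the representative (0, 1) of -1

def sub(p, q):
    return add(p, neg(q))

x, y, z = (2, 0), (4, 0), (6, 0)
left = sub(x, sub(y, z))    # x - (y - z)
right = sub(sub(x, y), z)   # (x - y) - z
print(left, right)          # (8, 4) (2, 10)
print(equivalent(left, right))
```

The two results represent 4 and -8 respectively, so they lie in different equivalence classes.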

We can also immediately see the following result, which allows us to formally show that subtraction is an inverse to addition.

::: {#prop:IntegerAdditiveInverse .proposition} Proposition 66. Subtracting an integer from itself gives zero

Let x\in\mathbb{Z}. We have that

$$\begin{equation*} x-x=0 \end{equation*}$$

Proof:

Let x\in\mathbb{Z} where x\in\left[a,b\right] for some a,b\in\mathbb{N}. We have

$$\begin{align*} x-x&=\left(a,b\right)-\left(a,b\right)\\ &=\left(a,b\right)+\left(b,a\right)\\ &=\left(a+b,b+a\right) \end{align*}$$

It is left to show that \left(a+b,b+a\right)\sim\left(0,0\right). Indeed

$$\begin{equation*} \left(a+b\right)+0=\left(b+a\right)+0 \iff a+b=b+a \end{equation*}$$ which holds by commutativity of addition on the naturals.

The result is shown. $\qed$ :::

The cancellation laws

We can now deduce that the cancellation laws also extend to the integers.

::: theorem Theorem 25. The cancellation laws

Let x,y,z\in\mathbb{Z}.

  1. If x+y=x+z then we have y=z.

  2. For x\neq 0, if xy=xz then we have that $y=z$

Proof:

  1. If x+y=x+z then we have y=z:

    Let x,y,z\in\mathbb{Z}. We have that

    $$\begin{align*} x+y&=x+z\\ \Rightarrow -x+x+y&=-x+x+z,\ \text{Adding the negative of } x \text{ to both sides}\\ \Rightarrow \left(-x+x\right)+y&=\left(-x+x\right)+z,\ \text{Associativity of addition}\\ \Rightarrow 0+y&=0+z,\ \text{By proposition \ref{prop:IntegerAdditiveInverse}}\\ \Rightarrow y&=z \end{align*}$$

  2. For x\neq 0, if xy=xz then we have that y=z:

    Let x,y,z\in\mathbb{Z} where x\neq 0. Suppose that x\in\left[a,b\right], y\in\left[c,d\right] and z\in\left[e,f\right]. We have

    $$\begin{align*} xy&=\left(a,b\right)\left(c,d\right)=\left(ac+bd,ad+bc\right)\\ xz&=\left(a,b\right)\left(e,f\right)=\left(ae+bf,af+be\right) \end{align*}$$

    Now assume xy=xz then we have that \left(ac+bd,ad+bc\right)\sim\left(ae+bf,af+be\right) which is to say

    $$\begin{equation*} ac+bd+af+be=ae+bf+ad+bc \end{equation*}$$

    Observe that

    $$\begin{align*} ac+bd+af+be&=a\left(c+f\right)+b\left(d+e\right)\\ ae+bf+ad+bc&=a\left(e+d\right)+b\left(f+c\right) \end{align*}$$

    Which gives

    $$\begin{equation*} a\left(c+f\right)+b\left(d+e\right)=a\left(e+d\right)+b\left(f+c\right) \end{equation*}$$

    Since x\neq 0 we have a\neq b, so there are two cases to consider, a<b and a>b. Firstly suppose that a<b then we can write that b=a+h for some h>0, this is well-defined as a,b\in\mathbb{N}. We then have

    $$\begin{align*} a\left(c+f\right)+b\left(d+e\right)&=a\left(e+d\right)+b\left(f+c\right)\\ a\left(c+f\right)+\left(a+h\right)\left(d+e\right)&=a\left(e+d\right)+\left(a+h\right)\left(f+c\right)\\ a\left(c+f\right)+a\left(d+e\right)+h\left(d+e\right)&=a\left(e+d\right)+a\left(f+c\right)+h\left(f+c\right)\\ a\left(d+e\right)+h\left(d+e\right)&=a\left(e+d\right)+h\left(f+c\right),\text{ Cancelling }a\left(c+f\right)\\ h\left(d+e\right)&=h\left(f+c\right),\text{ Cancelling }a\left(d+e\right)\\ \left(d+e\right)&=\left(f+c\right),\text{ Cancelling }h \end{align*}$$

    Now as d+e=f+c we have that c-d=e-f\Rightarrow \left(c,d\right)\sim\left(e,f\right) which is the same as saying y=z.

    Now if a>b then we write b=a-h for some h>0, again being well-defined as a,b\in\mathbb{N}. Thus

    $$\begin{align*} a\left(c+f\right)+b\left(d+e\right)&=a\left(e+d\right)+b\left(f+c\right)\\ a\left(c+f\right)+\left(a-h\right)\left(d+e\right)&=a\left(e+d\right)+\left(a-h\right)\left(f+c\right)\\ a\left(c+f\right)+a\left(d+e\right)-h\left(d+e\right)&=a\left(e+d\right)+a\left(f+c\right)-h\left(f+c\right)\\ a\left(d+e\right)-h\left(d+e\right)&=a\left(e+d\right)-h\left(f+c\right),\text{ Cancelling }a\left(c+f\right)\\ -h\left(d+e\right)&=-h\left(f+c\right),\text{ Cancelling }a\left(d+e\right)\\ \left(f+c\right)&=\left(d+e\right),\text{ By adding each side to the other and cancelling }h \end{align*}$$

    As f+c=d+e we have, by similar logic to before, that $y=z$.

The result is shown. $\qed$ :::

Extending the summation and product notations to integers

Summation and product notation has been defined on the naturals. As with the theme of this section the notations extend in a natural way to integers. As before we need to define a few things.

Let z\in\mathbb{Z}^{n+m+1} be an ordered n+m+1 tuple of integers where z=\left(z_{-m},z_{-m+1},\dots,z_{-1},z_0,z_1,\dots, z_n\right) and define \mathbb{Z}_m^n=\left\{-m,-m+1,-m+2,\dots,-1,0,1,\dots,n-1,n\right\}. Define f:\mathbb{Z}_m^n\rightarrow\mathbb{Z} by

$$\begin{align*} f:\mathbb{Z}_m^n&\rightarrow \mathbb{Z}\\ i&\mapsto f\left(i\right)=z_i \end{align*}$$

As before, f simply returns the value of z_i from the ordered tuple z.

::: definition Definition 110. Summation notation for the integers

Let z\in\mathbb{Z}^{n+m+1} be an ordered n+m+1 tuple of integers where z=\left(z_{-m},z_{-m+1},\dots,z_{-1},z_0,z_1,\dots, z_n\right). Define \mathbb{Z}_m^n by \mathbb{Z}_m^n=\left\{-m,-m+1,-m+2,\dots,-1,0,1,\dots,n-1,n\right\}. Let f:\mathbb{Z}_m^n\rightarrow\mathbb{Z} be defined by

$$\begin{align*} f:\mathbb{Z}_m^n&\rightarrow\mathbb{Z}\\ i&\mapsto f\left(i\right)=z_i \end{align*}$$

We define the summation notation for integers by

$$\begin{equation*} \sum_{i=-m}^n f\left(i\right)=f\left(-m\right)+f\left(-m+1\right)+\dots+f\left(-1\right)+f\left(0\right)+f\left(1\right)+\dots+f\left(n\right) \end{equation*}$$

Alternatively this is written

$$\begin{equation*} \sum_{i=-m}^n z_i = z_{-m}+z_{-m+1}+\dots+z_{-1}+z_0+z_1+\dots+z_n \end{equation*}$$

We have that i is called the index of summation and that i=-m is the starting index of the summation, and n the ending index of the summation. If z=\emptyset then we define the summation to be 0 and call such a summation an empty sum.

We can also define the summation of some subset of \mathbb{Z}_m^n which allows for starting a summation at some starting point other than i=-m. Let T\subseteq\mathbb{Z}_m^n. We define the summation over the set T by

$$\begin{equation*} \sum_{i\in T} z_i \end{equation*}$$

If we have a mapping g:\mathbb{Z}\rightarrow\mathbb{Z} we can define a summation over g by

$$\begin{equation*} \sum_{i\in T} g\left(z_i\right) \end{equation*}$$

Finally we can define a summation over a predicate P\left(i\right) for i\in T by

$$\begin{equation*} \sum_{P\left(i\right)}g\left(z_i\right) \end{equation*}$$

where we take the sum of the g\left(z_i\right) for the i that satisfy the predicate P. We note that if we have k>n for some k\in\mathbb{Z} then the sum is empty, so that

$$\begin{equation*} \sum_{i=k}^n z_i=0 \end{equation*}$$ :::
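The summation notation with integer indices can be illustrated with a short sketch. It assumes the tuple z is stored as a Python `dict` keyed by the indices -m,\dots,n; the function name `summation` and its parameters `g` and `predicate` are hypothetical.

```python
# Sketch: summation over integer indices start..end, with an optional
# mapping g applied to each term and an optional predicate on the index.

def summation(z, start, end, g=lambda v: v, predicate=lambda i: True):
    return sum(g(z[i]) for i in range(start, end + 1) if predicate(i))

m, n = 2, 3
# z_{-2}, ..., z_3 = -3, -1, 1, 3, 5, 7
z = {i: 2 * i + 1 for i in range(-m, n + 1)}

print(summation(z, -m, n))                      # sum of all z_i
print(summation(z, 4, n))                       # start index above n: empty sum
print(summation(z, -m, n,
                g=lambda v: v * v,
                predicate=lambda i: i % 2 == 0))  # squares of z_i over even i
```

Here `range(start, end + 1)` is empty whenever the starting index exceeds the ending index, which matches the convention that such a sum is 0.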

The properties shown for summations with natural numbers also extend to the integer version.

::: proposition Proposition 67. Properties of summation notation

Let n,m\in\mathbb{Z} such that m<n. Let s,t\in\mathbb{Z}^{n+m+1} and let c\in\mathbb{Z}.

Let a,b\in\mathbb{Z} with -m<a<b<n. Define A=\left\{a,a+1,\dots,b-1,b\right\} and define

$$\begin{equation*} B=\mathbb{Z}_m^n\setminus A=\left\{-m,-m+1,\dots,a-1,b+1,\dots,n-1,n\right\} \end{equation*}$$ so that A\cup B =\mathbb{Z}_m^n. Let k\in \mathbb{Z} be the starting index of a summation such that k<n. We have the following properties hold.

  1. $\displaystyle \sum_{i=-m}^n s_i = \sum_{i\in A} s_i +\sum_{i\in B} s_i =\sum_{i=-m}^{-1} s_i + \sum_{i=0}^{n}s_i$

  2. $\displaystyle \sum_{i=k}^n s_i = \sum_{i=k}^d s_i + \sum_{i=d+1}^n s_i$ for any $d\in\mathbb{Z}$ with $k\leq d<n$

  3. \displaystyle\sum_{i=k}^n c*s_i = c*\sum_{i=k}^n s_i for some $c\in\mathbb{Z}$

  4. \displaystyle\sum_{i=k}^n c = c\left(n+1-k\right) for some $c\in\mathbb{Z}$

  5. $\displaystyle\sum_{i=k}^n s_i+t_i = \sum_{i=k}^n s_i + \sum_{i=k}^n t_i$

Proof:

  1. \displaystyle \sum_{i=-m}^n s_i = \sum_{i\in A} s_i +\sum_{i\in B} s_i =\sum_{i=-m}^{-1} s_i + \sum_{i=0}^{n}s_i:

    This follows by applying the definition. We have that

    $$\begin{align*} \sum_{i=-m}^n s_i&=s_{-m}+s_{-m+1}+s_{-m+2}+\dots+s_{-1}+s_0+s_1+\dots+s_{n-1}+s_{n}\\ &=\left(s_{-m}+s_{-m+1}+s_{-m+2}+\dots+s_{-1}\right)+\left(s_0+s_1+\dots+s_{n-1}+s_{n}\right)\\ &=\sum_{i=-m}^{-1} s_i + \sum_{i=0}^n s_i \end{align*}$$

    Additionally note that

    $$\begin{align*} \sum_{i=-m}^n s_i&=s_{-m}+s_{-m+1}+s_{-m+2}+\dots+s_{-1}+s_0+s_1+\dots+s_{n-1}+s_{n}\\ &= \left(s_{-m}+s_{-m+1}+s_{-m+2}+\dots+s_{a-2}+s_{a-1}\right)+\left(s_a+s_{a+1}+\dots+s_{b-1}+s_b\right)\\ &+\left(s_{b+1}+s_{b+2}+\dots+s_{n-1}+s_n\right)\\ &= \left(s_{-m}+s_{-m+1}+s_{-m+2}+\dots+s_{a-2}+s_{a-1}\right) + \left(s_{b+1}+s_{b+2}+\dots+s_{n-1}+s_n\right)\\ &+ \left(s_a+s_{a+1}+\dots+s_{b-1}+s_b\right)\\ &= \sum_{i\in B} s_i + \sum_{i\in A} s_i = \sum_{i\in A} s_i + \sum_{i\in B} s_i \end{align*}$$

  2. \displaystyle \sum_{i=k}^n s_i = \sum_{i=k}^d s_i + \sum_{i=d+1}^n s_i:

    The proof is similar to part 1, replacing -m by k.

  3. \displaystyle\sum_{i=k}^n c*s_i = c*\sum_{i=k}^n s_i for some $c\in\mathbb{Z}$

    We have by definition that

    $$\begin{equation*} \sum_{i=k}^n c*s_i=c*s_k+c*s_{k+1}+c*s_{k+2}+\dots+c*s_n \end{equation*}$$

    By multiplication distributing over addition we have

    $$\begin{equation*} \sum_{i=k}^n c*s_i=c*s_k+c*s_{k+1}+c*s_{k+2}+\dots+c*s_n=c*\left(s_k+s_{k+1}+\dots+s_n\right)=c*\sum_{i=k}^n s_i \end{equation*}$$

  4. \displaystyle\sum_{i=k}^n c = c\left(n+1-k\right) for some $c\in\mathbb{Z}$

    If n>0 and k\geq 0 then the result is the same as for natural numbers. So suppose that k<0. Consider the following set of the indices given by

    $$\begin{equation*} S=\left\{k,k+1,k+2,\dots,-1,0,1,\dots,n-1,n\right\} \end{equation*}$$

    We have that the cardinality of S is n+1-k. Indeed consider the following mapping

    $$\begin{align*} f:S&\rightarrow \mathbb{N}\\ s&\mapsto f\left(s\right)=s-k \end{align*}$$

    Define the mapping g:S\rightarrow\mathop{\mathrm{Image}}\left(f\right) by g\left(s\right)=f\left(s\right); we claim that g is a bijection. Suppose that g\left(x\right)=g\left(y\right) for some x,y\in S then

    $$\begin{align*} g\left(x\right)&=g\left(y\right)\\ x-k&=y-k\\ x&=y \end{align*}$$

    showing injectivity. Now as g is a mapping from S to the image of f we have by proposition 15 that g is surjective. Hence we conclude that g is a bijection.

    Now we have that

    $$\begin{align} \mathop{\mathrm{Image}}\left(f\right)&=\left{f\left(x\right):x\in S\right}\ &= \left{k-k,\left(k+1\right)-k,\left(k+2\right)-k,\dots,-1-k,0-k,1-k,\dots,\left(n-1\right)-k,n-k\right}\ &=\left{0,1,2,\dots,k-1,k,k-1,\dots,n-1-k,n-k\right} \end{align*}$$*

    Hence \left|S\right|=\left|\mathop{\mathrm{Image}}\left(f\right)\right|=n-k+1. Hence the sum is adding c to itself n+1-k times. This is to say

    $$\begin{equation} \sum_{i=k}^n c= c\left(n+1-k\right) \end{equation*}$$*

  5. \displaystyle\sum_{i=k}^n s_i+t_i = \sum_{i=k}^n s_i + \sum_{i=k}^n t_i:

    This follows by the definition. We have

    $$\begin{align*} \sum_{i=k}^n \left(s_i+t_i\right)&= \left(s_k+t_k\right)+\left(s_{k+1}+t_{k+1}\right)+\dots\\ &\quad+\left(s_{-1}+t_{-1}\right)+\left(s_{0}+t_{0}\right)+\left(s_{1}+t_{1}\right)+\dots+\left(s_{n-1}+t_{n-1}\right)+\left(s_{n}+t_{n}\right)\\ &=\left(s_k+s_{k+1}+\dots+s_{-1}+s_0+s_1+\dots+s_{n-1}+s_n\right)\\ &\quad+\left(t_k+t_{k+1}+\dots+t_{-1}+t_0+t_1+\dots+t_{n-1}+t_n\right)\\ &= \sum_{i=k}^n s_i + \sum_{i=k}^n t_i \end{align*}$$

$\qed$ :::
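As a numerical sanity check of properties 2 to 5 (not a substitute for the proof), the following Python sketch evaluates both sides of each identity for sequences whose index starts at a negative integer. The helper `int_sum`, the sample sequences `s` and `t`, and the chosen values of `m`, `n`, `c` and `d` are illustrative assumptions and do not come from the text.

```python
def int_sum(s, lo, hi):
    """Return s[lo] + s[lo+1] + ... + s[hi]; the empty sum (lo > hi) is 0."""
    total = 0
    for i in range(lo, hi + 1):
        total += s[i]
    return total


# Hypothetical data: s_i and t_i for i = -m, ..., n with m = 3 and n = 4.
m, n = 3, 4
s = {i: 2 * i + 1 for i in range(-m, n + 1)}
t = {i: i * i - 5 for i in range(-m, n + 1)}
c = 7   # an arbitrary integer constant
d = 1   # an arbitrary split point with -m <= d < n

# Property 2: splitting the index range at d.
split_ok = int_sum(s, -m, n) == int_sum(s, -m, d) + int_sum(s, d + 1, n)

# Property 3: a constant factor can be pulled out of the sum.
scaled = {i: c * s[i] for i in s}
scale_ok = int_sum(scaled, -m, n) == c * int_sum(s, -m, n)

# Property 4: summing the constant c over k = -m, ..., n gives c(n + 1 - k).
const = {i: c for i in s}
const_ok = int_sum(const, -m, n) == c * (n + 1 - (-m))

# Property 5: the sum of termwise sums is the sum of the sums.
added = {i: s[i] + t[i] for i in s}
add_ok = int_sum(added, -m, n) == int_sum(s, -m, n) + int_sum(t, -m, n)

print(split_ok, scale_ok, const_ok, add_ok)
```

Storing the sequence in a dictionary keyed by the index keeps the negative indices explicit, which mirrors the index set $\left\{-m,\dots,n\right\}$ used in the proposition.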

We make a similar definition for product notation.

::: definition Definition 111. Product notation for the integers

Let $z\in\mathbb{Z}^{n+m+1}$ be an ordered $\left(n+m+1\right)$-tuple of integers where $z=\left(z_{-m},z_{-m+1},\dots,z_{-1},z_0,z_1,\dots, z_n\right)$. Define $\mathbb{Z}_m^n$ by $\mathbb{Z}_m^n=\left\{-m,-m+1,-m+2,\dots,-1,0,1,\dots,n-1,n\right\}$. Let $f:\mathbb{Z}_m^n\rightarrow\mathbb{Z}$ be defined by

$$\begin{align*} f:\mathbb{Z}_m^n&\rightarrow\mathbb{Z}\\ i&\mapsto f\left(i\right)=z_i \end{align*}$$

We define the product notation for integers by

$$\begin{equation*} \prod_{i=-m}^n f\left(i\right)=f\left(-m\right)f\left(-m+1\right)\cdots f\left(-1\right)f\left(0\right)f\left(1\right)\cdots f\left(n\right) \end{equation*}$$

Alternatively this is written

$$\begin{equation*} \prod_{i=-m}^n z_i = z_{-m}z_{-m+1}\cdots z_{-1}z_0z_1\cdots z_n \end{equation*}$$

We call $i$ the index of the product, $i=-m$ the starting index of the product, and $n$ the ending index of the product. If $z$ has no entries (the empty tuple) then we define the product to be $1$ and call such a product an empty product.

We can also define the product of some subset of \mathbb{Z}_m^n which allows for starting a product at some starting point other than i=-m. Let T\subseteq\mathbb{Z}_m^n. We define the product over the set T by

$$\begin{equation} \prod_{i\in T} z_i \end{equation*}$$*

If we have a mapping g:\mathbb{Z}\rightarrow\mathbb{Z} we can define a product over g by

$$\begin{equation} \prod_{i\in T} g\left(z_i\right) \end{equation*}$$*

Finally we can define a product over a predicate P\left(i\right) for i\in T by

$$\begin{equation} \prod_{P\left(i\right)}g\left(z_i\right) \end{equation*}$$*

where we take the product of the $g\left(z_i\right)$ for the $i$ that satisfy the predicate $P$. We note that if we have $k>n$ for some $k\in\mathbb{N}$ then the product is empty, and so

$$\begin{equation*} \prod_{i=k}^n z_i=1 \end{equation*}$$ :::
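The definition can be mirrored in a few lines of Python. This is only a sketch: the function name `int_prod`, its `pred` and `g` parameters (standing in for the predicate $P$ and the mapping $g$), and the sample tuple `z` are assumptions made for illustration.

```python
def int_prod(z, lo, hi, pred=lambda i: True, g=lambda v: v):
    """Multiply g(z[i]) over the indices i = lo, ..., hi that satisfy pred.

    When no index qualifies (for example lo > hi) the loop body never runs
    and the empty product 1 is returned.
    """
    result = 1
    for i in range(lo, hi + 1):
        if pred(i):
            result *= g(z[i])
    return result


# Hypothetical tuple (z_{-2}, z_{-1}, z_0, z_1) stored by index.
z = {-2: 3, -1: -1, 0: 2, 1: 5}

full = int_prod(z, -2, 1)                                    # 3 * (-1) * 2 * 5
empty = int_prod(z, 2, 1)                                    # lower index > upper index
even_only = int_prod(z, -2, 1, pred=lambda i: i % 2 == 0)    # z_{-2} * z_0
squares = int_prod(z, -2, 1, g=lambda v: v * v)              # product of the squares

print(full, empty, even_only, squares)  # -30 1 6 900
```

Starting the accumulator at `1` is what makes the empty product equal to $1$, in the same way that starting a sum at `0` gives the empty sum.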

::: proposition Proposition 68. Properties of product notation

Let $n,m\in\mathbb{Z}$ such that $m<n$. Let $s,t\in\mathbb{Z}^{n+m+1}$ and let $c\in\mathbb{Z}$. Let $a,b\in\mathbb{Z}$ so that $-m<a<b<n$. Define $A=\left\{a,a+1,\dots,b-1,b\right\}$ and define

$$\begin{equation*} B=\mathbb{Z}_m^n\setminus A=\left\{-m,-m+1,\dots,a-1,b+1,\dots,n-1,n\right\} \end{equation*}$$ so that $A\cup B =\mathbb{Z}_m^n$. Let $k\in \mathbb{Z}$ be the lower index of the product.

We have that the following properties hold.

  1. $\displaystyle \prod_{i=-m}^n s_i = \prod_{i\in A} s_i \cdot \prod_{i\in B} s_i = \prod_{i=-m}^{-1} s_i \cdot \prod_{i=0}^n s_i$

  2. $\displaystyle \prod_{i=k}^n s_i = \prod_{i=k}^m s_i \cdot \prod_{i=m+1}^n s_i$

  3. $\displaystyle\prod_{i=k}^n s_it_i = \prod_{i=k}^n s_i \cdot \prod_{i=k}^n t_i$

Proof:

  1. $\displaystyle \prod_{i=-m}^n s_i = \prod_{i\in A} s_i \cdot \prod_{i\in B} s_i = \prod_{i=-m}^{-1} s_i \cdot \prod_{i=0}^n s_i$:

    This follows by the definition of the product. We have that

    $$\begin{align*} \prod_{i=-m}^n s_i &= s_{-m}s_{-m+1}s_{-m+2}\cdots s_{-1}s_0s_1\cdots s_{n-1}s_n\\ &=\left(s_{-m}s_{-m+1}s_{-m+2}\cdots s_{-1}\right)\left(s_0s_1\cdots s_{n-1}s_n\right)\\ &=\prod_{i=-m}^{-1}s_i\cdot\prod_{i=0}^n s_i \end{align*}$$

    Likewise we have

    $$\begin{align*} \prod_{i=-m}^n s_i &= s_{-m}s_{-m+1}s_{-m+2}\cdots s_{-1}s_0s_1\cdots s_{n-1}s_n\\ &= \left(s_{-m}s_{-m+1}s_{-m+2}\cdots s_{a-2}s_{a-1}\right)\left(s_as_{a+1}\cdots s_{b-1}s_b\right)\left(s_{b+1}s_{b+2}\cdots s_{n-1}s_n\right)\\ &= \left(s_{-m}s_{-m+1}s_{-m+2}\cdots s_{a-2}s_{a-1}\right)\left(s_{b+1}s_{b+2}\cdots s_{n-1}s_n\right)\left(s_as_{a+1}\cdots s_{b-1}s_b\right)\\ &=\prod_{i\in B}s_i \cdot \prod_{i\in A} s_i = \prod_{i\in A} s_i \cdot \prod_{i\in B} s_i \end{align*}$$

  2. $\displaystyle \prod_{i=k}^n s_i = \prod_{i=k}^m s_i \cdot \prod_{i=m+1}^n s_i$:

    The proof is similar to part 1. We replace $-m$ with $k$.

  3. $\displaystyle\prod_{i=k}^n s_it_i = \prod_{i=k}^n s_i \cdot \prod_{i=k}^n t_i$:

    Observe that, by the commutativity and associativity of integer multiplication,

    $$\begin{align*} \prod_{i=k}^n s_it_i&=s_{k}t_{k}s_{k+1}t_{k+1}s_{k+2}t_{k+2}\cdots s_{-1}t_{-1}s_{0}t_{0}s_{1}t_{1}\cdots s_{n-1}t_{n-1}s_{n}t_{n}\\ &=\left(s_{k}s_{k+1}s_{k+2}\cdots s_{-1}s_{0}s_{1}\cdots s_{n-1}s_{n}\right)\left(t_{k}t_{k+1}t_{k+2}\cdots t_{-1}t_{0}t_{1}\cdots t_{n-1}t_{n}\right)\\ &=\prod_{i=k}^n s_i \cdot \prod_{i=k}^n t_i \end{align*}$$

$\qed$ :::
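The three properties can be spot-checked numerically. The sketch below is an illustration under stated assumptions: the helper `int_prod`, the sequences `s` and `t`, and the values of `m`, `n`, `a`, `b` and the split point `d` are all hypothetical choices, with `a` and `b` picked so that $-m<a<b<n$.

```python
from math import prod


def int_prod(s, indices):
    """Product of s[i] over the given indices; an empty index list gives 1."""
    return prod(s[i] for i in indices)


# Hypothetical data indexed by i = -m, ..., n.
m, n = 3, 4
index_set = range(-m, n + 1)
s = {i: 2 * i + 1 for i in index_set}
t = {i: i - 6 for i in index_set}

# A = {a, ..., b} and B is the rest of the index set.
a, b = -1, 2
A = [i for i in index_set if a <= i <= b]
B = [i for i in index_set if i < a or i > b]

full = int_prod(s, index_set)

# Property 1: split over A and B, and split at the index 0.
prop1_sets = full == int_prod(s, A) * int_prod(s, B)
prop1_zero = full == int_prod(s, range(-m, 0)) * int_prod(s, range(0, n + 1))

# Property 2: split at an arbitrary point d inside the range.
d = 0
prop2 = full == int_prod(s, range(-m, d + 1)) * int_prod(s, range(d + 1, n + 1))

# Property 3: the product of termwise products is the product of the products.
st = {i: s[i] * t[i] for i in index_set}
prop3 = int_prod(st, index_set) == full * int_prod(t, index_set)

print(prop1_sets, prop1_zero, prop2, prop3)
```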

We can now extend the result of proposition 39{reference-type="ref" reference="prop:NaturalsHaveNoZeroDivisors"} to the integers. That is, if $ab=0$ for $a,b\in\mathbb{Z}$ then at least one of $a$ or $b$ is zero.

::: {#prop:IntegersHaveNoZeroDivisors .proposition} Proposition 69. Product of two integers being zero implies one of the numbers is zero

Let x,y\in\mathbb{Z}. If xy=0 then at least one of x or y is zero.

Proof:

Let $x,y\in\mathbb{Z}$ with $xy=0$. If $x=y=0$ then the result is trivial. So suppose that $x=\left(a,b\right)$ and $y=\left(c,d\right)$ for some $a,b,c,d\in\mathbb{N}$, and moreover suppose that $y\neq 0$, so that $c\neq d$. By the definition of integer multiplication and by assumption we have that

$$\begin{equation*} xy=\left(a,b\right)\left(c,d\right)=\left(ac+bd,ad+bc\right)=\left(0,0\right) \end{equation*}$$

We have that

$$\begin{align*} \left(ac+bd,ad+bc\right)=\left(0,0\right) &\iff ac+bd+0=ad+bc+0\\ &\Rightarrow ac+bd=ad+bc \end{align*}$$

Now suppose, without loss of generality, that $c>d$. Then $\exists p\in\mathbb{N}$ with $p\neq 0$ such that $d+p=c$. We hence have

$$\begin{align*} ac+bd&=ad+bc\\ a\left(d+p\right)+bd&=ad+b\left(d+p\right)\\ ad+ap+bd&=ad+bd+bp\\ ap&=bp\\ a&=b, \quad\text{by the cancellation laws for the natural numbers}\\ a+0&=b+0 \Rightarrow \left(a,b\right)=\left(0,0\right) \end{align*}$$

A similar argument applies for $c<d$.

Hence $x=0$. A similar argument assuming $x\neq 0$ shows that $y=0$. The result is shown. $\qed$ :::
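The argument can be checked by brute force on small representatives. In the sketch below an integer is modelled as a pair `(a, b)` of natural numbers standing for $a-b$, multiplication follows the rule $\left(a,b\right)\left(c,d\right)=\left(ac+bd,ad+bc\right)$ used in the proof, and a pair represents zero exactly when $a+0=b+0$. The function names `mul` and `is_zero` and the search bound are assumptions made for illustration.

```python
from itertools import product


def mul(x, y):
    """Multiply two integers given as pairs (a, b) and (c, d) of naturals."""
    (a, b), (c, d) = x, y
    return (a * c + b * d, a * d + b * c)


def is_zero(x):
    """(a, b) is equivalent to (0, 0) exactly when a + 0 == b + 0."""
    a, b = x
    return a == b


# All representatives (a, b) with 0 <= a, b <= 4.
pairs = list(product(range(5), repeat=2))

# Whenever the product represents zero, one of the factors must as well.
no_zero_divisors = all(
    is_zero(x) or is_zero(y)
    for x in pairs
    for y in pairs
    if is_zero(mul(x, y))
)

print(no_zero_divisors)
```

A finite search of this kind only illustrates the statement; the proof above is what covers every pair of integers.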

Extending the rules for inequalities to the integers

For the natural numbers, we were able to derive some rules for how inequalities behave; we can extend those results to the integers. Before we do so we have an additional consideration. As $\mathbb{N}\subset\mathbb{Z}$, we can view every non-zero $n\in\mathbb{N}$ as a positive integer in $\mathbb{Z}$. Hence for positive $a,b,c\in\mathbb{Z}$ the results from proposition 48{reference-type="ref" reference="prop:InequalityNaturalNumbers"} immediately extend to those integers.

To extend the results fully we need to consider negative integers as well. Consider $x=-3$ and $y=6$; clearly $x<y$. Now consider $-1\cdot x = 3$ and $-1\cdot y=-6$; we have that $-1\cdot x> -1\cdot y$. This can be shown in general.

::: {#prop:MultiplicationByNegativeOneFlipsInequalitySign .proposition} Proposition 70. Multiplication by -1 changes the inequality sign

Let x,y\in\mathbb{Z}. We have the following

  1. If x<y then $-x>-y$

  2. If x\leq y then $-x\geq -y$

  3. If x>y then $-x<-y$

  4. If x\geq y then $-x\leq-y$

Proof:

  1. If x<y then -x>-y:

    Let x,y\in\mathbb{Z} so that x<y. There are three cases to consider

    1. x\geq 0 and $y\geq 0$

    2. x<0 and $y\geq 0$

    3. x<0 and $y<0$

    1. x\geq 0 and y\geq 0:

      Suppose that x\geq 0 and y\geq 0 then x\in\left[\left(a,0\right)\right] for some a\in\mathbb{N} and y\in\left[\left(b,0\right)\right] for some b\in\mathbb{N}. As x<y then we must have a+0<b+0\Rightarrow a<b.

      We have that

      $$\begin{align} -x=-1x=-1\left(a,0\right)&=\left(0,a\right)\ -y=-1y=-1\left(b,0\right)&=\left(0,b\right) \end{align*}$$*

      Now, by proposition 48{reference-type="ref" reference="prop:InequalityNaturalNumbers"} part 2. we know that a<b is the same as b>a. Now we have -x>-y by definition of greater than for integers as we have

      $$\begin{equation} -x>-y \iff 0+b>a+0 \end{equation*}$$*

    2. x<0 and y\geq 0:

      Now suppose that $x<0$ and $y\geq 0$, then we have that $x\in\left[\left(0,a\right)\right]$ and $y\in\left[\left(b,0\right)\right]$ where $a,b\in\mathbb{N}$. We have that

      $$\begin{align*} -x=-1x=-1\left(0,a\right)&=\left(a,0\right)\\ -y=-1y=-1\left(b,0\right)&=\left(0,b\right) \end{align*}$$

      Now, by the definition of greater than for integers, we have

      $$\begin{equation*} -x>-y \iff a+b>0+0 \end{equation*}$$

      As $a,b\in\mathbb{N}$ and $x<0 \implies a> 0$, we conclude that $a+b\geq a > 0$ and so $-x>-y$.

    3. x<0 and y<0:

      Now suppose that x<0 and y< 0 then x\in\left[\left(0,a\right)\right] for some a\in\mathbb{N} and y\in\left[\left(0,b\right)\right] for some b\in\mathbb{N}. As x<y then we have that b<a, which is the same as a>b.

      We have that

      $$\begin{align} -x=-1x=-1\left(0,a\right)&=\left(a,0\right)\ -y=-1y=-1\left(0,b\right)&=\left(b,0\right) \end{align*}$$*

      Applying the definition of > to -x and -y gives

      $$\begin{equation} -x>-y \iff a>b \end{equation*}$$*

      Which we know to be true. Hence -x>-y.

    This shows part 1.

  2. If x\leq y then -x\geq -y:

    If x<y then we apply part 1. to get -x>-y from which it follows that -x\geq -y by definition. It is left to check when x=y. This is clear however as x=y\implies -x=-y and so -x\geq -y.

  3. If x>y then -x<-y:

    The proof of this part is similar to part 1. As in part 1. there are three cases to consider

    1. x\geq 0 and $y\geq 0$

    2. $x\geq 0$ and $y< 0$

    3. x<0 and $y<0$

    1. x\geq 0 and y\geq 0:

      Suppose that x\geq 0 and y\geq 0 then x\in\left[\left(a,0\right)\right] for some a\in\mathbb{N} and y\in\left[\left(b,0\right)\right] for some b\in\mathbb{N}. As x>y then we must have a+0>b+0\Rightarrow a>b.

      We have that

      $$\begin{align} -x=-1x=-1\left(a,0\right)&=\left(0,a\right)\ -y=-1y=-1\left(b,0\right)&=\left(0,b\right) \end{align*}$$*

      Now, by proposition 48{reference-type="ref" reference="prop:InequalityNaturalNumbers"} part 2. we know that a>b is the same as b<a. Now we have -x<-y by definition of less than for integers as we have

      $$\begin{equation} -x<-y \iff 0+b<a+0 \end{equation*}$$*

    2. x\geq 0 and y<0:

      Now suppose that x\geq 0 and y< 0 then we have that x\in\left[\left(a,0\right)\right] and y\in\left[\left(0,b\right)\right] where a,b\in\mathbb{N}. We have

      $$\begin{align} -x=-1x=-1\left(a,0\right)&=\left(0,a\right)\ -y=-1y=-1\left(0,b\right)&=\left(b,0\right) \end{align*}$$*

      Now, by the definition of less than for integers, we have

      $$\begin{equation*} -x<-y \iff 0+0<a+b \end{equation*}$$

      As $a,b\in\mathbb{N}$ and $y<0 \implies b> 0$, we conclude that $0<b\leq a+b$ and so $-x<-y$.

    3. x<0 and y<0:

      Now suppose that $x<0$ and $y< 0$, then $x\in\left[\left(0,a\right)\right]$ for some $a\in\mathbb{N}$ and $y\in\left[\left(0,b\right)\right]$ for some $b\in\mathbb{N}$. As $x>y$ we have that $0+b>a+0\Rightarrow b>a$, which is the same as $a<b$. We have that

      $$\begin{align} -x=-1x=-1\left(0,a\right)&=\left(a,0\right)\ -y=-1y=-1\left(0,b\right)&=\left(b,0\right) \end{align*}$$*

      Applying the definition of < to -x and -y gives

      $$\begin{equation} -x<-y \iff a+0<b+0 \Rightarrow a<b \end{equation*}$$*

      Which we know to be true. Hence -x<-y.

  4. If x\geq y then -x\leq-y:

    If $x>y$ we apply part 3. to get $-x<-y$, from which it follows that $-x\leq -y$ by definition. So instead suppose $x=y$; then $x=y\Rightarrow -x=-y$ and so by definition we have $-x\leq -y$.

The result is shown. $\qed$ :::
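The four parts can be checked on small representatives using the same pair model as in the proof. In this sketch an integer is a pair `(a, b)` of naturals standing for $a-b$, the order is $\left(a,b\right)<\left(c,d\right)\iff a+d<b+c$, and multiplying by $-1=\left(0,1\right)$ swaps the two components. The names `lt`, `le` and `neg` and the search bound are illustrative assumptions.

```python
from itertools import product


def lt(x, y):
    """(a, b) < (c, d) exactly when a + d < b + c."""
    (a, b), (c, d) = x, y
    return a + d < b + c


def le(x, y):
    """(a, b) <= (c, d) exactly when a + d <= b + c."""
    (a, b), (c, d) = x, y
    return a + d <= b + c


def neg(x):
    """-1 * (a, b) = (0, 1)(a, b) = (b, a)."""
    a, b = x
    return (b, a)


pairs = list(product(range(5), repeat=2))

# Part 1: x < y implies -x > -y, written as -y < -x.
part1 = all(lt(neg(y), neg(x)) for x in pairs for y in pairs if lt(x, y))
# Part 2: x <= y implies -x >= -y.
part2 = all(le(neg(y), neg(x)) for x in pairs for y in pairs if le(x, y))
# Part 3: x > y implies -x < -y.
part3 = all(lt(neg(x), neg(y)) for x in pairs for y in pairs if lt(y, x))
# Part 4: x >= y implies -x <= -y.
part4 = all(le(neg(x), neg(y)) for x in pairs for y in pairs if le(y, x))

print(part1, part2, part3, part4)
```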

This proposition will play a big role in the following proposition that extends the results for the rules of inequalities to the integers.

::: {#prop:InequalityIntegerNumbers .proposition} Proposition 71. Properties of inequalities for the integers

Let x,y,z,c\in\mathbb{Z}. We have the following properties for inequalities

  1. x<y is the same as y>x:

  2. x\leq y is the same as y\geq x:

  3. If x<y and y<z then x<z:

  4. If x\leq y and y<z then x<z:

  5. If x<y and y\leq z then x<z:

  6. If x\leq y and y\leq z then x\leq z:

  7. If x>y and y>z then x>z:

  8. If x\geq y and y>z then x>z:

  9. If x>y and y\geq z then x>z:

  10. If x\geq y and y\geq z then x\geq z:

  11. If x<y then x+z<y+z:

  12. If x\leq y then x+z\leq y+z:

  13. If x>y then x+z>y+z:

  14. If x\geq y then x+z\geq y+z:

  15. If $x<y$ and $z> 0$ then $xz<yz$:

  16. If x<y and z< 0 then xz>yz:

  17. If x\leq y and z\geq 0 then xz\leq yz:

  18. If x\leq y and z<0 then xz\geq yz:

  19. If $x>y$ and $z> 0$ then $xz>yz$:

  20. If x>y and z< 0 then xz<yz:

  21. If x\geq y and z\geq 0 then xz\geq yz:

  22. If x\geq y and z<0 then xz\leq yz:

Proof:

  1. x<y is the same as y>x:

    Let x,y\in\mathbb{Z} with x<y. Similar reasoning as in proposition 70{reference-type="ref" reference="prop:MultiplicationByNegativeOneFlipsInequalitySign"} can be used. As in the proposition, there are three cases to consider.

    1. x\geq 0 and $y\geq 0$

    2. x<0 and $y\geq 0$

    3. x<0 and $y<0$

    1. x\geq 0 and y\geq 0:

      Suppose x\geq 0 and y\geq 0 then x\in\left[\left(a,0\right)\right] and y\in\left[\left(b,0\right)\right] for some a,b\in\mathbb{N}. We have that x< y only holds if a<b, which is equivalent to b>a by proposition 48{reference-type="ref" reference="prop:InequalityNaturalNumbers"}. But by definition of > for integers, we have that

      $$\begin{equation} b>a \iff y>x \end{equation*}$$*

    2. x<0 and y\geq 0:

      Suppose that x<0 and y\geq 0, then x\in\left[\left(0,a\right)\right] and y\in\left[\left(b,0\right)\right] for some a,b\in\mathbb{N}. By definition of < we have that

      $$\begin{equation} x<y \iff 0+0 < a+b\implies y>x\iff a+b > 0 \end{equation*}$$*

      Now, x<0\implies a>0 and so we have that a+b\geq a > 0 and so y>x.

    3. x<0 and y<0:

      Now suppose that x<0 and y<0, it follows that x\in\left[\left(0,a\right)\right] and y\in\left[\left(0,b\right)\right] for some a,b\in\mathbb{N}. By definition of < we have that

      $$\begin{equation} x<y\iff b<a \implies y>x \iff a>b \end{equation*}$$*

      Hence, as b<a, we have that a>b and so y>x.

  2. x\leq y is the same as y\geq x:

    If x<y then we apply part 1. Otherwise, we have that x=y and so clearly y=x and hence y\geq x.

  3. If x<y and y<z then x<z:

    Suppose that x<y and y<z. There are four cases to consider.

    1. x\geq 0, y\geq 0 and $z\geq 0$

    2. x<0, y\geq 0 and $z\geq 0$

    3. x<0, y<0 and $z\geq 0$

    4. x<0, y<0 and $z<0$

    1. x\geq 0, y\geq 0 and z\geq 0:

      Suppose that x\geq 0, y\geq 0 and z\geq 0 then the result follows immediately by proposition 48{reference-type="ref" reference="prop:InequalityNaturalNumbers"} part 6. as x\geq 0, y\geq 0 and z\geq 0 gives x\in\left[\left(a,0\right)\right], y\in\left[\left(b,0\right)\right] and z\in\left[\left(c,0\right)\right] for some a,b,c\in\mathbb{N} and therefore x,y,z\in\mathbb{N}.

    2. x<0, y\geq 0 and z\geq 0:

      Now suppose that $x<0$, $y\geq 0$ and $z\geq 0$. We have that $x\in\left[\left(0,a\right)\right]$, $y\in\left[\left(b,0\right)\right]$ and $z\in\left[\left(c,0\right)\right]$ for some $a,b,c\in\mathbb{N}$. Now we have that

      $$\begin{align} x<0 &\iff a>0 \ y\geq 0 &\iff b\geq 0\ z\geq 0&\iff c\geq 0 \end{align*}$$*

      By assumption x<y and so we have that 0<a+b, moreover by assumption y<z and so we have b<c. We hence have that 0<a+b<a+c. Now as 0<a+c we have by definition of < that

      $$\begin{equation} 0<a+c\iff 0+0<a+c \iff \left(0,a\right)<\left(c,0\right) \iff x<z \end{equation*}$$*

    3. x<0, y<0 and z\geq 0:

      Now suppose that $x<0$, $y<0$ and $z\geq 0$. We have that $x\in\left[\left(0,a\right)\right]$, $y\in\left[\left(0,b\right)\right]$ and $z\in\left[\left(c,0\right)\right]$ for some $a,b,c\in\mathbb{N}$. Now we have that

      $$\begin{align} x<0 &\iff a>0 \ y<0 &\iff b>0\ z\geq 0&\iff c\geq 0 \end{align*}$$*

      By assumption x<y and so we have that b<a, moreover by assumption y<z and so we have 0<b+c. As b<a then we have 0<b+c<a+c, moreover we have by the definition of < that

      $$\begin{equation} 0<a+c\iff 0+0<a+c \iff \left(0,a\right)<\left(c,0\right) \iff x<z \end{equation*}$$*

    4. x<0, y<0 and z<0:

      Suppose that x<0, y< 0 and z<0. We have that x\in\left[\left(0,a\right)\right], y\in\left[\left(0,b\right)\right] and z\in\left[\left(0,c\right)\right] for some a,b,c\in\mathbb{N}. Observe

      $$\begin{align} x<0 &\iff a>0 \ y<0 &\iff b>0\ z<0 &\iff c>0 \end{align*}$$*

      As x<y we have that b<a, likewise as y<z we have that c<b, hence we have that c<b<a and so c<a. Hence by definition of < we have

      $$\begin{equation} c<a\iff 0+c<a+0\iff \left(0,a\right)<\left(0,c\right)\iff x<z \end{equation*}$$*

  4. If x\leq y and y<z then x<z:

    Suppose that x\leq y and y<z. If x<y then we apply part 3. So suppose that x=y, then we must have that y<z\iff x<z and hence the result.

  5. If x<y and y\leq z then x<z:

    As with part 4., suppose $x<y$ and $y\leq z$. If $y<z$ we apply part 3. We are then left with the case $y=z$, and hence we have that $x<y\iff x<z$.

  6. If x\leq y and y\leq z then x\leq z:

    Suppose that x\leq y and y\leq z, then if x<y and y<z we apply part 3. If x\leq y and y<z we apply part 4. If x<y and y\leq z we apply part 5. Hence we are left with the case where x=y and y=z. The result follows immediately.

  7. If x>y and y>z then x>z:

    By part 1. of the proposition we have that this is equivalent to y<x and z<y then z<x and so part 3. applies.

  8. If x\geq y and y>z then x>z:

    Applying part 2. to $x\geq y$ and part 1. to $y>z$ and $x>z$ gives the equivalent statement: if $z<y$ and $y\leq x$ then $z<x$, and so part 5. applies.

  9. If $x>y$ and $y\geq z$ then $x>z$:

    As with part 8., applying parts 1. and 2. gives the equivalent statement: if $z\leq y$ and $y<x$ then $z<x$, and so part 4. applies.

  10. If x\geq y and y\geq z then x\geq z:

    Solely applying part 2 of the proposition gives the statement y\leq x and z\leq y then z\leq x, so part 6. applies.

  11. If x<y then x+z<y+z:

    Suppose that x<y where x\in\left[\left(a,b\right)\right] and y\in\left[\left(c,d\right)\right] for some a,b,c,d\in\mathbb{N}. Let z\in\left[\left(e,f\right)\right]. By assumption we know that

    $$\begin{equation} x<y\iff a+d<b+c \end{equation*}$$*

    Now, we have that

    $$\begin{align*} x+z&=\left(a,b\right)+\left(e,f\right)=\left(a+e,b+f\right)\\ y+z&=\left(c,d\right)+\left(e,f\right)=\left(c+e,d+f\right) \end{align*}$$

    We want to show that $x+z<y+z$. By definition we have that

    $$\begin{equation*} x+z<y+z\iff \left(a+e\right)+\left(d+f\right)<\left(b+f\right)+\left(c+e\right) \end{equation*}$$

    Rearranging both sides, this inequality reads

    $$\begin{equation*} \underbrace{\left(a+d\right)}_{=j}+\underbrace{\left(e+f\right)}_{=k}<\underbrace{\left(b+c\right)}_{=l}+\underbrace{\left(f+e\right)}_{=k} \end{equation*}$$

    for some $j,k,l\in\mathbb{N}$. We see that $j<l$ as $a+d<b+c$. Hence by proposition 48{reference-type="ref" reference="prop:InequalityNaturalNumbers"} part 12. we have that $j<l\Rightarrow j+k<l+k$ and so $x+z<y+z$.

  12. If x\leq y then x+z\leq y+z:

    If x<y then the result follows from part 11. Otherwise, we have x=y and then x+z=y+z and so we have x+z\leq y+z.

  13. If x>y then x+z>y+z:

    As has been the case so far, applying part 1. gives us the statement y<x then y+z<x+z and so part 11. applies.

  14. If x\geq y then x+z\geq y+z:

    By part 2. we get the equivalent statement of y\leq x then y+z\leq x+z from which we can apply part 12.

  15. If $x<y$ and $z> 0$ then $xz<yz$:

    Suppose that $x<y$ where $x\in\left[\left(a,b\right)\right]$ and $y\in\left[\left(c,d\right)\right]$ for some $a,b,c,d\in\mathbb{N}$. Let $z\in\left[\left(e,0\right)\right]$ for some $e\in\mathbb{N}$ with $e>0$. As $x<y$ we have

    $$\begin{equation*} x<y\iff a+d<b+c \end{equation*}$$

    Now we have that

    $$\begin{align*} xz&=\left(a,b\right)\left(e,0\right)=\left(ae,be\right)\\ yz&=\left(c,d\right)\left(e,0\right)=\left(ce,de\right) \end{align*}$$

    Now, by definition we have

    $$\begin{equation*} xz<yz\iff ae+de<be+ce \iff e\underbrace{\left(a+d\right)}_{=m}<e\underbrace{\left(b+c\right)}_{=n} \end{equation*}$$

    As $m<n$ and $e>0$, the result now follows from proposition 48{reference-type="ref" reference="prop:InequalityNaturalNumbers"} part 16.

  16. If x<y and z< 0 then xz>yz:

    Suppose that x<y where x\in\left[\left(a,b\right)\right] and y\in\left[\left(c,d\right)\right] for some a,b,c,d\in\mathbb{N}. Let z\in\left[\left(0,e\right)\right] for some e\in\mathbb{N}. As x<y we have

    $$\begin{equation} x<y\iff a+d<b+c \end{equation*}$$*

    Now we have that

    $$\begin{align*} xz&=\left(a,b\right)\left(0,e\right)=\left(be,ae\right)\\ yz&=\left(c,d\right)\left(0,e\right)=\left(de,ce\right) \end{align*}$$

    Now, we want to show that $xz>yz$. By definition we have

    $$\begin{equation*} xz>yz \iff be+ce>ae+de\iff e\underbrace{\left(b+c\right)}_{=m}>e\underbrace{\left(a+d\right)}_{=n} \end{equation*}$$

    As $x<y$ gives $n<m$, and $z<0$ gives $e>0$, the result now follows from proposition 48{reference-type="ref" reference="prop:InequalityNaturalNumbers"} part 16.

  17. If $x\leq y$ and $z\geq 0$ then $xz\leq yz$:

    Suppose that $x\leq y$. If $z=0$ then $xz=0=yz$ and the result is trivial. If $z>0$ and $x<y$ we apply part 15. Otherwise, $x=y$ and the result is trivial.

  18. If x\leq y and z<0 then xz\geq yz:

    Likewise, if x<y then we apply part 16. So suppose that x=y then xz=yz and we, therefore, have xz\geq yz.

  19. If $x>y$ and $z> 0$ then $xz>yz$:

    Let $z> 0$. Applying part 1., we get the equivalent statement: if $y<x$ and $z> 0$ then $yz<xz$, to which we apply part 15.

  20. If x>y and z< 0 then xz<yz:

    Applying part 1., we get the equivalent statement: if $y<x$ and $z<0$ then $yz>xz$, to which we apply part 16.

  21. If x\geq y and z\geq 0 then xz\geq yz:

    Part 2. of this proposition gives the equivalent statement: if $y\leq x$ and $z\geq 0$ then $yz\leq xz$, and so part 17. applies.

  22. If x\geq y and z<0 then xz\leq yz:

    Now, part 2 gives us the expression y\leq x and z<0 then yz\geq xz and so we apply part 18.

The result has been shown. $\qed$ :::
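A handful of these rules can be spot-checked on a small range of integers. The sketch below uses Python's built-in integers rather than the pair construction, and the range `R` is an arbitrary assumption. The strict multiplication rules are tested with $z>0$ only, since for $z=0$ both products equal $0$ and no strict inequality can hold.

```python
from itertools import product

R = range(-4, 5)  # a small, arbitrary sample of integers
triples = list(product(R, repeat=3))

# Part 3: transitivity of <.
transitive = all(x < z for x, y, z in triples if x < y and y < z)

# Part 11: adding z to both sides preserves <.
additive = all(x + z < y + z for x, y, z in triples if x < y)

# Part 15 (strict, z > 0): multiplying by a positive z preserves <.
mult_pos = all(x * z < y * z for x, y, z in triples if x < y and z > 0)

# Part 16: multiplying by a negative z reverses <.
mult_neg = all(x * z > y * z for x, y, z in triples if x < y and z < 0)

# Part 17: with <= the case z = 0 is allowed.
mult_nonneg = all(x * z <= y * z for x, y, z in triples if x <= y and z >= 0)

# For z = 0 both products are 0, so only the non-strict rules apply.
zero_case = all(x * 0 == y * 0 for x, y in product(R, repeat=2))

print(transitive, additive, mult_pos, mult_neg, mult_nonneg, zero_case)
```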

The absolute value function

After the construction of the natural numbers, we explored the notion of cardinality. That was assigning a notion of size to a natural number. Recall the definition,

$$\begin{align*} \left|\cdot\right|:\mathbb{N}&\rightarrow\mathbb{N}\ n&\mapsto\left|n\right|=n \end{align*}$$

To extend this we consider the following. We know that a\in\mathbb{N} has a cardinality \left|a\right|=a as a\in\mathbb{N} refers to a set containing a elements. Unfortunately, the notion of a set containing a elements doesn't extend in a natural way to the integers. For example, what does it mean for a set to contain -3 elements? Instead, we need to re-think the notion of size.

Armed with subtraction we can re-cast this understanding of size into a more useful form. Consider for example $6-3=3$; we can interpret this expression as saying that the number 3 is 3 less than 6, or equivalently that the number 6 is 3 bigger than 3. Stated in another way, if we were to get a ruler and measure something to be 6 cm long and we wanted to cut it in half, we would measure the halfway point at 3 cm along from where we start measuring. That is to say, the halfway point would be 6 cm - 3 cm = 3 cm.

What we have done is rather than think about the number of elements, we have thought about things in terms of distances. This turns out to be a very powerful idea, there is an entire subject in mathematics which studies this idea of distances, formally called metrics, which we will see later. We have only considered the positive case so far, what about 3-6?

We know that 3-6=-3 and using similar logic this is saying that the number -3 is 6 away from 3, equivalently 3 is 6 more than -3.

We make a definition.

::: definition Definition 112. Distance function for integers

Let x,y\in\mathbb{Z}. Define the function d:\mathbb{Z}^2\rightarrow\mathbb{N} by

$$\begin{align*} d:\mathbb{Z}^2&\rightarrow\mathbb{N}\\ \left(x,y\right)&\mapsto d\left(x,y\right)=\begin{cases} x-y, & \text{if } x\geq y\\ -\left(x-y\right), & \text{if } x< y \end{cases} \end{align*}$$ :::

We must verify that this is well defined

::: {#prop:IntegerDistanceFuncWellDefined .proposition} Proposition 72. The distance function for the integers is well-defined

Let x,y\in\mathbb{Z}. We have that

$$\begin{equation*} d\left(x,y\right)=\begin{cases} x-y, & \text{if } x\geq y\\ -\left(x-y\right), & \text{if } x< y \end{cases} \end{equation*}$$

is well-defined.

Proof:

Let x,y\in\mathbb{Z}. There are two cases to consider x\geq y and x<y.

  1. x\geq y:

    Suppose that x\geq y, then by proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} part 14. we have

    $$\begin{equation} x\geq y \Rightarrow \left(x+\left(-y\right)\right) \geq \left(y+\left(-y\right)\right) \Rightarrow x-y \geq 0 \end{equation*}$$*

    Hence x-y\in\mathbb{N}.

  2. x<y:

    As $x<y$ we have by definition of $d$ that $d\left(x,y\right)=-\left(x-y\right)$. By part 11. of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} we have $x+\left(-y\right)<y+\left(-y\right)$, that is $x-y<0$. However we have that $-\left(x-y\right)=-1 \cdot \left(x-y\right)$ and so by part 16. of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} we have that $-1\cdot\left(x-y\right)>-1\cdot 0=0$, which is to say $-\left(x-y\right)\in\mathbb{N}$.

The result has been shown. $\qed$ :::
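The two cases of the distance function translate directly into code. This is a minimal sketch: the function name `d` follows the definition, while the test range is an arbitrary assumption. It confirms on that range that the value is never negative, which is the content of the proposition, and reproduces the examples $6-3$ and $3-6$ discussed earlier.

```python
def d(x: int, y: int) -> int:
    """Distance between two integers, following the two cases of the definition."""
    if x >= y:
        return x - y
    return -(x - y)


R = range(-10, 11)  # a small, arbitrary sample of integers

# On this range the value always lands in the natural numbers (it is >= 0).
lands_in_naturals = all(d(x, y) >= 0 for x in R for y in R)

print(d(6, 3), d(3, 6), lands_in_naturals)  # 3 3 True
```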

In light of the definition of the distance function, we can define the so-called absolute value function. This will give us a notion of the magnitude of an integer.

::: definition Definition 113. Absolute value function

Let x\in\mathbb{Z} we define the absolute value function, denoted by \left|x\right| by the function

$$\begin{equation*} \left|x\right|=d\left(x,0\right)=\begin{cases} x, & \text{if } x\geq 0\\ -x, & \text{if } x< 0 \end{cases} \end{equation*}$$ :::
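As a small illustration, the absolute value can be written as the distance from $0$ and compared against the two-case formula. The names `d` and `absolute_value` and the test range are assumptions made for this sketch; Python's built-in `abs` is used only as an independent cross-check.

```python
def d(x: int, y: int) -> int:
    """Distance between two integers, as in the distance function definition."""
    return x - y if x >= y else -(x - y)


def absolute_value(x: int) -> int:
    """|x| defined as the distance of x from 0."""
    return d(x, 0)


R = range(-10, 11)  # a small, arbitrary sample of integers

# |x| agrees with the two-case formula: x if x >= 0, and -x if x < 0.
matches_cases = all(absolute_value(x) == (x if x >= 0 else -x) for x in R)
# It also agrees with Python's built-in abs on this range.
matches_builtin = all(absolute_value(x) == abs(x) for x in R)

print(absolute_value(-3), absolute_value(3), matches_cases, matches_builtin)  # 3 3 True True
```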

With this definition, we have generalised the idea of "size" to the integers. That is the size of an integer is its distance from 0. We have the basic properties of the absolute value

::: proposition Proposition 73. Properties of the absolute value

Let x,y,z\in\mathbb{Z}. We have that the absolute value function has the following properties

  1. \left|x\right|\geq 0 for all $x\in\mathbb{Z}$

  2. $\left|x\right|=0\iff x=0$

  3. $\left|x-y\right|=0\iff x=y$

  4. $\left|xy\right|=\left|x\right|\left|y\right|$

  5. $\left|\left|x\right|\right|=\left|x\right|$

  6. $\left|-x\right|=\left|x\right|$

  7. $\left|x\right|\leq y \iff -y\leq x\leq y$

  8. \left|x\right|\geq y\iff x\leq -y or $x\geq y$

  9. $\left|x+y\right|\leq \left|x\right|+\left|y\right|$

  10. $\left|x-y\right|\leq\left|x-z\right|+\left|z-y\right|$

  11. $\left|x-y\right|\geq \left|\left|x\right|-\left|y\right|\right|$

  12. \left|\cdot\right| is not injective

  13. \left|\cdot\right| is not surjective

Proof:

  1. \left|x\right|\geq 0 for all x\in\mathbb{Z}:

    This follows by proposition 72{reference-type="ref" reference="prop:IntegerDistanceFuncWellDefined"}.

  2. \left|x\right|=0\iff x=0:

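    If $x\geq 0$ then $\left|x\right|=x$ by definition, and so $\left|x\right|=0$ if and only if $x=0$. If $x<0$ then $\left|x\right|=-x>0$, so neither $\left|x\right|=0$ nor $x=0$ holds. Hence $\left|x\right|=0$ if and only if $x=0$.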

  3. \left|x-y\right|=0\iff x=y:

    \left(\Rightarrow\right): Suppose that \left|x-y\right|=0. There are two cases to consider.

    Firstly if x\geq y, then by definition we have that \left|x-y\right|=x-y=0 from which we clearly have x=y. The other case is x<y from which we get \left|x-y\right|=-\left(x-y\right)=0. In other words, we have -1*\left(x-y\right)=0. Now by proposition 69{reference-type="ref" reference="prop:IntegersHaveNoZeroDivisors"} we know that for integers a,b that if ab=0, at least one of a or b is zero. As -1\neq 0 we conclude that x-y=0 from which we get x=y.

    \left(\Leftarrow\right): Suppose that x=y then x-y=0 and so \left|x-y\right|=0.

  4. \left|xy\right|=\left|x\right|\left|y\right|:

    Let x,y\in\mathbb{Z}. There are four cases to consider.

    1. x\geq 0 and $y\geq 0$

    2. x\geq 0 and $y<0$

    3. x<0 and $y\geq 0$

    4. x<0 and $y<0$

    1. x\geq 0 and y\geq 0:

      If x\geq 0 and y\geq 0 then xy\geq 0 and so \left|xy\right|=xy. Likewise \left|x\right|=x and \left|y\right|=y. Hence \left|xy\right|=\left|x\right|\left|y\right|.

    2. x\geq 0 and y<0:

      If $x\geq 0$ then $\left|x\right|=x$ by definition, and if $y<0$ then $\left|y\right|=-y$. Now $xy\leq 0$ as $x\geq 0$ and $y<0$, and so $\left|xy\right|=-xy$. Moreover, we have that

      $$\begin{equation} -xy=\left(-1\right)\left(x\right)\left(y\right)=\left(x\right)\left(-1\right)\left(y\right)=\left(x\right)\left(-y\right)=\left|x\right|\left|y\right| \end{equation*}$$*

      Hence we get $\left|xy\right|=\left|x\right|\left|y\right|$

    3. x<0 and y\geq 0:

      This is similar to the above but swapping the roles of x and y.

    4. x<0 and y<0:

      Suppose that $x<0$ and $y<0$, then we have that $\left|x\right|=-x$ and $\left|y\right|=-y$ by definition. Moreover, we have that $\left(-x\right)\left(-y\right) = xy$, and $xy>0$ as $-x>0$ and $-y>0$. Hence $\left|xy\right|=xy=\left(-x\right)\left(-y\right)=\left|x\right|\left|y\right|$.

  5. \left|\left|x\right|\right|=\left|x\right|:

    We have that \left|x\right|=x if x\geq 0 and -x if x<0.

    So if x\geq 0, we have

    $$\begin{equation} \left|\left|x\right|\right|=\left|x\right|=x=\left|x\right| \end{equation*}$$*

    Now if x<0 then

    $$\begin{equation} \left|\left|x\right|\right|=\left|-x\right|=\underbrace{-x}_{\text{As }-x>0}=\left|x\right| \end{equation*}$$*

  6. \left|-x\right|=\left|x\right|:

    As -x=-1 *x we have by part 4 that

    $$\begin{equation} \left|-x\right|=\left|-1x\right|=\left|-1\right|\left|x\right|=1\left|x\right|=\left|x\right| \end{equation*}$$*

  7. \left|x\right|\leq y \iff -y\leq x\leq y:

    \left(\Rightarrow\right): Suppose that \left|x\right|\leq y. If x\geq 0 then we get that \left|x\right|=x\leq y. From this, it is clear that -y\leq x\leq y as x\geq 0 and x\leq y \Rightarrow y \geq 0.

    Now if $x<0$, then $\left|x\right|=-x\leq y$. Clearly $x\leq -x$ as $x<0$, hence we conclude that $x\leq -x\leq y$. Now by part 18. of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} applied to $-x\leq y$ we have

    $$\begin{equation} \left(-1\right)\left(-x\right)\geq \left(-1\right)\left(y\right) \iff x\geq -y \end{equation}$$*

    Now x\geq -y is the same as -y\leq x and so we have -y\leq x\leq -x \leq y.

    Hence -y\leq x\leq y.

    \left(\Leftarrow\right): Suppose that -y\leq x\leq y. There are two cases to consider.

    1. $x\geq 0$

    2. $x<0$

    1. x\geq 0:

      Suppose $x\geq 0$, then $\left|x\right|=x$, and as $x\leq y$ we have $\left|x\right|\leq y$ (note that $y\geq x\geq 0$, so $\left|y\right|=y$). Moreover, we have that $-y\leq x$ is the same as $x\geq -y$, and part 22. of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} applied to $x\geq -y$ gives

      $$\begin{equation} \left(-1\right)\left(x\right)\leq \left(-1\right)\left(-y\right) \iff -x\leq y \end{equation}$$*

      We have that \left|-x\right|=\left|x\right| by part 6. Hence \left|-x\right|=\left|x\right|\leq \left|y\right|=y.

    2. x<0:

      Suppose $x<0$. By assumption $x\leq y$, so either $y\geq 0$ or $y< 0$. We can't have $y<0$: if $y<0$ then $-y>0>y$, but the assumption $-y\leq x\leq y$ gives $-y\leq y$, a contradiction. For example, taking $x=-4$ and $y=-2$ would give $2\leq -4\leq -2$.

      So suppose that $y\geq 0$. As $-y\leq x$ by assumption, we have that $x\geq -y$ and so part 22. of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} gives

      $$\begin{equation*} \left(-1\right)\left(x\right)\leq \left(-1\right)\left(-y\right) \iff -x\leq y \end{equation*}$$

      As $x<0$ we have $\left|x\right|=-x$, and hence $\left|x\right|\leq y$.

  8. \left|x\right|\geq y\iff x\leq -y or x\geq y:

    \left(\Rightarrow\right): Suppose that \left|x\right|\geq y. If x\geq 0 then \left|x\right|=x\geq y. So suppose that x<0 then by definition we have that \left|x\right|=-x and so -x\geq y and the result follows when applying part 22. of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"}.

    \left(\Leftarrow\right): Suppose that either x\leq -y or x\geq y. We have three cases to consider.

    1. $x\leq -y$

    2. $x\geq y$

    3. x\leq -y and $x\geq y$

    1. x\leq -y:

      Suppose that x\leq -y holds. If x\geq 0 then we have that -y\geq 0, hence y\leq 0. Moreover, by part 18. of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} we have that

      $$\begin{equation} \left(-1\right)\left(x\right)\geq \left(-1\right)\left(-y\right) \iff -x\geq y \end{equation}$$*

      Now part 6. applies and, since \left|-x\right|\geq -x by the definition of the absolute value, we see that \left|x\right|=\left|-x\right|\geq -x\geq y. This is to say \left|x\right|\geq y.

      Now suppose that x<0. Then as x\leq -y we have that either -y\geq 0 or -y<0. In the former case -y\geq 0 gives y\leq 0. Hence by part 18. of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} we conclude that

      $$\begin{equation*} \left(-1\right)\left(x\right)\geq \left(-1\right)\left(-y\right) \iff -x\geq y \end{equation*}$$

      As x<0 then -x\geq 0. The result follows when taking the absolute value.

      Now suppose that -y<0 then y>0. Following similar logic to the previous case, we see that

      $$\begin{equation*} \left(-1\right)\left(x\right)\geq \left(-1\right)\left(-y\right) \iff -x\geq y \end{equation*}$$

      The result again follows after taking the absolute value.

    2. x\geq y:

      This case is trivial.

    3. x\leq -y and x\geq y:

      Suppose that x\leq -y and x\geq y are both true. We know by the first case that x\leq -y gives \left|x\right|\geq y and x\geq y also implies \left|x\right|\geq y by the second case. Hence both inequalities being true at the same time implies the result \left|x\right|\geq y.

  9. \left|x+y\right|\leq \left|x\right|+\left|y\right|:

    Let x,y\in\mathbb{Z}. There are four cases to consider.

    1. x\geq 0 and $y\geq 0$

    2. x\geq 0 and $y\leq 0$

    3. x\leq 0 and $y\geq 0$

    4. x\leq 0 and $y\leq 0$

    1. x\geq 0 and y\geq 0:

      Suppose x\geq 0 and y\geq 0, then we have that

      $$\begin{equation} \left|x+y\right|=x+y=\left|x\right|+\left|y\right|\Rightarrow \left|x+y\right|\leq\left|x\right|+\left|y\right| \end{equation*}$$*

    2. x\geq 0 and $y\leq 0$

      By assumption we have that \left|x\right|=x and \left|y\right|=-y. We have two cases based on the absolute value, \left|x\right|\leq\left|y\right| and \left|x\right|\geq\left|y\right|.

      So suppose that \left|x\right|\leq\left|y\right| then by definition x\leq -y and so by part 12. of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} we have that

      $$\begin{equation} x\leq -y \Rightarrow x+y\leq 0 \end{equation*}$$*

      Moreover, as x\geq 0 then y\leq x+y\leq 0. Hence we have by the definition of the absolute value that

      $$\begin{equation*} \left|x+y\right|=-\left(x+y\right)\leq -y=\left|y\right| \end{equation*}$$

      as x\geq 0 and \left|y\right|=-y.

      In the case \left|x\right|\geq\left|y\right| we have by definition that x\geq -y and so x+y\geq 0. Additionally it is clear that x\geq x+y as y\leq 0 and \left|x\right|\geq\left|y\right|. Hence by definition of the absolute value we have that

      $$\begin{equation} \left|x+y\right|=x+y\leq x=\left|x\right| \end{equation*}$$*

      Now, it is clear to see that \left|x\right|\leq \left|x\right|+\left|y\right| and likewise \left|y\right|\leq \left|x\right|+\left|y\right|.

      We have hence shown that \left|x+y\right|\leq\left|x\right|+\left|y\right|.

    3. x\leq 0 and y\geq 0:

      This is similar to above, interchanging the roles of x and y.

    4. x\leq 0 and y\leq 0:

      Suppose that x\leq 0 and y\leq 0 then by definition we have that \left|x+y\right|=-\left(x+y\right)=-x-y. As x\leq 0 and y\leq 0 then we have that \left|x\right|=-x and \left|y\right|=-y which shows $\left|x+y\right|=\left|x\right|+\left|y\right|\leq\left|x\right|+\left|y\right|$

  10. \left|x-y\right|\leq\left|x-z\right|+\left|z-y\right|:

    We have that

    $$\begin{align*} \left|x-y\right|&=\left|x-\left(z-z\right)-y\right|\\ &=\left|x-z+z-y\right|\\ &\leq \left|x-z\right|+\left|z-y\right| \end{align*}$$

  11. \left|x-y\right|\geq \left|\left|x\right|-\left|y\right|\right|:

    We have that

    $$\begin{align*} \left|x\right|&=\left|\left(x-y\right)+y\right|\leq \left|x-y\right|+\left|y\right| \Rightarrow \left|x\right|-\left|y\right|\leq \left|x-y\right|\\ \left|y\right|&=\left|\left(y-x\right)+x\right|\leq \left|x-y\right|+\left|x\right| \Rightarrow \left|y\right|-\left|x\right|\leq \left|x-y\right| \end{align*}$$

    Hence we have

    $$\begin{align*} \left|y\right|-\left|x\right|\leq \left|x-y\right| &\iff -\left|x-y\right|\leq \left|x\right|-\left|y\right|\\ -\left|x-y\right|\leq \left|x\right|-\left|y\right|\leq \left|x-y\right| &\Rightarrow \left|\left|x\right|-\left|y\right|\right|\leq \left|x-y\right| \end{align*}$$

    where the final implication is part 7. Hence we have the result.

  12. \left|\cdot\right| is not injective:

    To see that the absolute value function is not injective consider \left|3\right|=\left|-3\right|. We have that \left|3\right|=3 and \left|-3\right|=3 but 3\neq -3.

  13. \left|\cdot\right| is not surjective:

    The absolute value function is not surjective, as there is no x\in\mathbb{Z} such that \left|x\right|=-1, for example.

This ends the proposition. $\qed$ :::
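
The inequalities in parts 7 to 11 can also be spot-checked numerically. The following is a minimal sketch, assuming Python's built-in `abs` as a stand-in for the absolute value on the integers; the function name `check_absolute_value_properties` and the search bound are illustrative choices and not part of the text. A finite search is only a sanity check and does not replace the proofs above.

```python
from itertools import product


def check_absolute_value_properties(bound: int = 6) -> bool:
    """Check parts 7 to 11 for all integers x, y, z in [-bound, bound]."""
    values = range(-bound, bound + 1)
    for x, y, z in product(values, repeat=3):
        # Part 7: |x| <= y  <=>  -y <= x <= y
        assert (abs(x) <= y) == (-y <= x <= y)
        # Part 8: |x| >= y  <=>  x <= -y or x >= y
        assert (abs(x) >= y) == (x <= -y or x >= y)
        # Part 9: triangle inequality
        assert abs(x + y) <= abs(x) + abs(y)
        # Part 10: triangle inequality through an intermediate point z
        assert abs(x - y) <= abs(x - z) + abs(z - y)
        # Part 11: reverse triangle inequality
        assert abs(x - y) >= abs(abs(x) - abs(y))
    return True


# Part 12: two different inputs share an output, so abs is not injective.
not_injective_witness = (abs(3) == abs(-3)) and (3 != -3)
```

Calling `check_absolute_value_properties()` returns `True` when no counterexample exists within the bound and raises an `AssertionError` otherwise.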

Extending exponentiation to the integers

We can extend the idea of exponentiation to include integers. We are now able to consider negative bases. In other words, expressions of the form \displaystyle x^n for x\in\mathbb{Z} with x<0. This extension is somewhat trivial and extends naturally from the definition of the naturals. We first look at the case where n\geq 0

::: definition Definition 114. Exponentiation of integer numbers

Let \mathbb{Z}^+=\left\{x\in\mathbb{Z}:x\geq 0\right\}. Let \left(x,n\right)\in\mathbb{Z}\times\mathbb{Z}^+ and let \wedge:\mathbb{Z}\times\mathbb{Z}^+\rightarrow\mathbb{Z}. We define the exponentiation of x by n to be the product of n copies of x

$$\begin{align*} \wedge:\mathbb{Z}\times\mathbb{Z}^+&\rightarrow\mathbb{Z}\\ \left(x,n\right)&\mapsto \wedge\left(x,n\right)=\begin{cases} 1,\ \text{If } x=0\text{ and } n=0\\ 1,\ \text{If } x\neq 0\text{ and } n=0\\ \displaystyle \prod_{i=1}^n x ,\ \text{If } n > 0 \end{cases} \end{align*}$$

We will write \wedge\left(x,n\right) as x^n. We say that x is the base and n is the exponent. We sometimes say that x has been raised to the power of n. In the case that n=0 the product has no factors, and such an empty product has by definition a value of 1. In particular this holds when x=0 and n=0. :::
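
The definition can be mirrored by a short loop. This is a sketch only: the name `int_pow` is hypothetical, and we assume that for x=0 and n>0 the product of n copies of 0 is read as 0. The accumulator starts at 1, which is exactly the empty product used for n=0.

```python
def int_pow(x: int, n: int) -> int:
    """Return x^n for an integer base x and an exponent n >= 0."""
    if n < 0:
        # Negative exponents are deliberately outside this definition.
        raise ValueError("only exponents n >= 0 are defined here")
    result = 1  # value of the empty product, covering n = 0
    for _ in range(n):
        result *= x  # multiply in one more copy of x
    return result
```

For instance `int_pow(-1, 3)` multiplies three copies of -1 together, and `int_pow(0, 0)` performs no multiplications at all and returns the empty product.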

We will explore this definition by first considering x=-1

$$\begin{align*} x=x^1&=-1\\ x*x=x^2&=-1*-1=1\\ x*x*x=x^3&=-1*-1*-1=-1\\ x*x*x*x=x^4&=-1*-1*-1*-1=1 \end{align*}$$

This leads to the following proposition.

::: proposition Proposition 74. Negative one to power of 2n is 1 Let n\in\mathbb{N}. We have that

$$\begin{equation} \left(-1\right)^{2n} = 1 \end{equation*}$$*

Proof:

We argue by induction on n. The base case is n=0 and by definition, we have that

$$\begin{equation*} \left(-1\right)^{2*0}=\left(-1\right)^{0}=1 \end{equation*}$$

Now suppose the result holds for some n=k, that is

$$\begin{equation} \left(-1\right)^{2k}=1 \end{equation*}$$*

We show that

$$\begin{equation} \left(-1\right)^{2*\left(k+1\right)}=1 \end{equation*}$$*

We have

$$\begin{align*} \left(-1\right)^{2\left(k+1\right)}&=\left(-1\right)^{2k+2}\\ &=\prod_{i=1}^{2k+2} \left(-1\right)\\ &=\prod_{i=1}^{2k} \left(-1\right) * \prod_{i=2k+1}^{2k+2} \left(-1\right)\\ &= 1 * \left(\left(-1\right)\left(-1\right)\right),\ \text{ By the inductive hypothesis}\\ &=1\left(1\right)=1 \end{align*}$$

Which shows the result. $\qed$ :::

This result generalises for any negative integer.

::: proposition Proposition 75. Negative integer to the power of 2n is positive

Let x\in\mathbb{Z} with x<0. Let n\in\mathbb{N}. We have that

$$\begin{equation*} x^{2n} \geq 1 \end{equation*}$$

Proof:

By definition we have

$$\begin{align*} x^{2n}&=\prod_{i=1}^{2n} x\\ &=\prod_{i=1}^{2n} \left(-1*-x\right)\\ &=\prod_{i=1}^{2n} \left(-1\right) * \prod_{i=1}^{2n}\left(-x\right)\\ &=1*\underbrace{\prod_{i=1}^{2n}\left(-x\right)}_{\geq 1} \geq 1 \end{align*}$$

As -x>0 because x<0. $\qed$ :::

We also note that exponentiation is neither commutative nor associative, just as it was not for the naturals. However, the following results do extend.

::: {#prop:IntegerExponentiationPowerLaw .proposition} Proposition 76. Power law of exponentiation for positive exponents

Let x\in\mathbb{Z} and let n,m\in\mathbb{N} with n\geq 0 and m\geq 0. We have that

$$\begin{equation} \left(x^n\right)^m = x^{nm} \end{equation*}$$*

Proof:

By the definition of exponentiation, we have that

$$\begin{equation} \left(x^n\right)^m=\prod_{i=1}^m x^n =\prod_{i=1}^m\left(\prod_{j=1}^n x\right) \end{equation*}$$*

Hence we have

$$\begin{align*} \left(x^n\right)^m&=\underbrace{\prod_{j=1}^n x * \prod_{j=1}^n x * \dots * \prod_{j=1}^n x}_{m\text{ times}}\\ &=\underbrace{\underbrace{x*x*\dots*x}_{n\text{ times}}*\underbrace{x*x*\dots*x}_{n\text{ times}}*\dots*\underbrace{x*x*\dots*x}_{n\text{ times}}}_{m\text{ times}} \end{align*}$$

Therefore, x appears as a factor n*m times in total. Which is to say

$$\begin{equation*} \left(x^n\right)^m = \underbrace{x*x*x*\dots*x}_{nm\text{ times}} = \prod_{i=1}^{nm} x = x^{nm} \end{equation*}$$

As promised. $\qed$ :::

::: {#prop:IntegerExponentiationOfSameBaseAddsPowers .proposition} Proposition 77. Multiplying exponents of the same base adds the powers

Let x\in\mathbb{Z} be a fixed integer and let n,m\in\mathbb{N}. We have that

$$\begin{equation} x^n x^m = x^{n+m} \end{equation}$$*

Proof:

Let x\in\mathbb{Z} and n,m\in\mathbb{N}. If n=0 or m=0 or both then the result is trivial, as x^0=1. So suppose that n>0 and m>0. We have by definition of exponentiation that

$$\begin{equation*} x^nx^m=\prod_{i=1}^n x * \prod_{i=1}^m x = \underbrace{x*x*\dots*x}_{n\text{ times}} * \underbrace{x*x*\dots*x}_{m\text{ times}}=\underbrace{x*x*\dots*x}_{n+m \text{ times}}=x^{n+m} \end{equation*}$$

As expected. $\qed$ :::

::: {#prop:IntegerExponentiationPowerOfProductIsProductOfPowers .proposition} Proposition 78. Power of product is product of powers

Let x,y\in\mathbb{Z} and n\in\mathbb{N}. Then

$$\begin{equation} \left(xy\right)^n=x^ny^n \end{equation*}$$*

Proof:

If n=0 then \left(x*y\right)^n=1 and clearly x^0*y^0=1. So let n>0 then we have

$$\begin{align*} \left(xy\right)^n=\prod_{i=1}^n xy &=\underbrace{xy*xy*\dots*xy}_{n\text{ times}}\\ &= \left(\underbrace{x*x*\dots*x}_{n\text{ times}}\right)\left(\underbrace{y*y*\dots*y}_{n\text{ times}}\right),\ \text{ By commutativity of multiplication}\\ &=x^ny^n \end{align*}$$

Showing the proposition. $\qed$ :::
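
Propositions 74 to 78 can be spot-checked on a small grid of bases and exponents. The sketch below assumes Python's built-in `**` on integers as a stand-in for the exponentiation defined above (in particular `0 ** 0` evaluates to `1` there as well); the name `check_power_laws` and the bounds are illustrative assumptions.

```python
from itertools import product


def check_power_laws(bound: int = 4, max_exp: int = 5) -> bool:
    """Spot-check Propositions 74 to 78 for small bases and exponents."""
    bases = range(-bound, bound + 1)
    exps = range(max_exp + 1)
    for n in exps:
        # Proposition 74: (-1)^(2n) = 1
        assert (-1) ** (2 * n) == 1
        for x in range(-bound, 0):
            # Proposition 75: an even power of a negative integer is positive
            assert x ** (2 * n) > 0
    for x, y in product(bases, repeat=2):
        for n, m in product(exps, repeat=2):
            assert (x ** n) ** m == x ** (n * m)    # Proposition 76
            assert x ** n * x ** m == x ** (n + m)  # Proposition 77
            assert (x * y) ** n == x ** n * y ** n  # Proposition 78
    return True
```

As with any finite search, a `True` result only says that no counterexample was found within the chosen bounds.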

The attentive reader may have noticed that we have only dealt with non-negative exponents so far in our extension of exponentiation to the integers. What about negative exponents? We can, loosely, justify why we can't yet consider negative exponents by considering proposition 77{reference-type="ref" reference="prop:IntegerExponentiationOfSameBaseAddsPowers"}. For a second suppose that instead of n,m\in\mathbb{N} we consider n,m\in\mathbb{Z}. In particular n=1 and m=-1, then we have that

$$\begin{equation*} x^1*x^{-1}=x^{1+\left(-1\right)}=x^0=1 \end{equation*}$$

Hence we have that when x^1 is multiplied by x^{-1} we get back to 1. Hence in a sense x^{-1} cancels with x. If we let x=2 we have x^1=2 and so x^1*x^{-1}=1 gives us the equation 2*x^{-1}=1. We intuitively know that \displaystyle x^{-1}=\frac{1}{2} which we know is not an integer. Hence if proposition 77{reference-type="ref" reference="prop:IntegerExponentiationOfSameBaseAddsPowers"} held for all integer powers we would have the implied existence of a new type of object. This object has the potential that when an integer is multiplied by the appropriate member of this new type of object, assuming such an object even exists, then integer multiplication is undone.

Construction of the Rationals

::: epigraph A man is like a fraction whose numerator is what he is and whose denominator is what he thinks of himself. The larger the denominator, the smaller the fraction.

Leo Tolstoy :::

We have now built a theory of integer numbers. One main reason for doing this was to be able to always undo subtraction. We still have a glaring issue at hand, however. How do we undo multiplication? For example, we are unable to express in mathematical language how many times one quantity goes into another. If we have 6 pints and 3 friends we know that each friend should get 2 pints as 3*2=6. In a sense we have that 2 goes into 6 a total of 3 times and 3 goes into 6 a total of 2 times. The integers don't have a concept of how many times one integer can go into another. This is what we call division and we write \displaystyle\frac{6}{2}=3 and \displaystyle\frac{6}{3}=2 for each situation respectively.

Thankfully the method used to construct the integers can be used again on the integers themselves to construct an even richer theory. As with the integers, we should consider what we want to do. We seek a way to undo the multiplication of integers. Consider a,b,c,d\in\mathbb{Z} with a=6, b=3, c=12 and d=6. With these values we intuitively know that \displaystyle\frac{6}{3}=2 and \displaystyle\frac{12}{6}=2. We also note that 6*6=36 and 3*12=36, that is, ad=bc. This gives us a clue on how to proceed. We have that \displaystyle\frac{6}{3} and \displaystyle\frac{12}{6} are hence similar. If we temporarily use the language of relations we have that \left(a,b\right)\sim\left(c,d\right).

Defining the Rationals

We proceed by defining division as an ordered tuple on integers

::: definition Definition 115. Division as an ordered tuple

Let a,b\in\mathbb{Z}. We define division as an ordered tuple \left(a,b\right)\in\mathbb{Z}^2 to mean \displaystyle\frac{a}{b}. We will call x\in\mathbb{Z}^2 a division tuple in this context. :::

Hence we can define the relation we considered above.

::: definition Definition 116. Relation for division

Let \left(a,b\right),\left(c,d\right)\in\mathbb{Z}^2 be division tuples. We define the relation \sim such that \left(a,b\right)\sim\left(c,d\right) if and only if $ad=bc$ :::

With this definition there is something we need to consider that we have heard since school: you can't divide by zero. That is, for any integer a, \displaystyle\frac{a}{0} is not defined.

Suppose that \left(a,0\right)\sim\left(c,d\right) for some a,c,d\in\mathbb{Z}. We have by definition of the relation that

$$\begin{equation*} \left(a,0\right)\sim\left(c,d\right)\iff ad=0*c = 0 \end{equation*}$$

By proposition 69{reference-type="ref" reference="prop:IntegersHaveNoZeroDivisors"} we have that either a=0 or d=0 or both.

If a=0 then we have \left(0,0\right)\sim\left(c,d\right)\iff 0=0, which holds for all c,d\in\mathbb{Z}. This means that every division tuple in \mathbb{Z}^2 would be equivalent to \left(0,0\right). Likewise if d=0 we get \left(a,0\right)\sim\left(c,0\right)\iff 0=0, meaning that all division tuples with second entry 0 would be equivalent to each other. Finally if both a=0 and d=0 then \left(0,0\right)\sim\left(c,0\right) as 0=0*c=0, and again every such division tuple would be equivalent.

This is a problem as this relation would imply that all elements are essentially the same9 . This is not a useful definition to be using so we will avoid this by not allowing b=0 in \left(a,b\right)\in\mathbb{Z}^2. We revise the definition

::: definition Definition 117. Division as an ordered tuple

Let a,b\in\mathbb{Z} with b\neq 0. We define division as an ordered tuple \left(a,b\right)\in\mathbb{Z}^2 to mean \displaystyle\frac{a}{b}. We will call x\in\mathbb{Z}^2 a division tuple in this context. :::

::: definition Definition 118. Relation for division

Let \left(a,b\right),\left(c,d\right)\in\mathbb{Z}^2 be division tuples where b\neq 0 and d\neq 0. We define the relation \sim such that \left(a,b\right)\sim\left(c,d\right) if and only if $ad=bc$ :::

We can show that this revised definition is an equivalence relation.

::: proposition Proposition 79. Relation for division ordered tuple is an equivalence relation

Let x,y,z\in\mathbb{Z}^2 be division tuples and define the relation x\sim y as above. We have that \sim is an equivalence relation.

Proof:

Let x,y,z\in\mathbb{Z}^2 be division tuples such that x=\left(a,b\right),y=\left(c,d\right) and z=\left(e,f\right). We show that \sim is an equivalence relation, in other words.

  1. \sim is reflexive

  2. \sim is symmetric

  3. \sim is transitive

  1. \sim is reflexive:

    We have that for x=\left(a,b\right) that x\sim x, as x\sim x if and only if ab=ba, which holds by commutativity of integer multiplication.

  2. \sim is symmetric:

    Suppose that x=\left(a,b\right) and y=\left(c,d\right). Suppose that x\sim y then we have that ad=bc. Hence bc=ad \Rightarrow cb=da by commutativity, and so \left(c,d\right)\sim\left(a,b\right), that is y\sim x.

  3. \sim is transitive:

    Suppose that x\sim y and y\sim z then by definition we have that ad=bc and cf=de. We have that

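    $$\begin{align*} ad&=bc\\ adf&=bcf\\ adf&=bde,\ \text{ As } cf=de\\ af&=be,\ \text{ As } d\neq 0 \end{align*}$$

    The final step cancels d, which is valid as d\neq 0 and the integers have no zero divisors: d\left(af-be\right)=0 forces af-be=0. Hence \left(a,b\right)\sim\left(e,f\right) and so x\sim z.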

It follows that \sim is an equivalence relation. $\qed$ :::
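
To see concretely why the second entry must be non-zero, the relation can be tested by brute force on a small grid of tuples. This is a sketch under assumptions: the names `related` and `is_equivalence` and the grid size are illustrative and do not come from the text. With zero second entries allowed, \left(0,0\right) is related to every tuple and transitivity breaks, for example \left(1,2\right)\sim\left(0,0\right)\sim\left(1,3\right) while \left(1,2\right)\not\sim\left(1,3\right).

```python
from itertools import product


def related(p, q):
    """(a, b) ~ (c, d) if and only if a*d == b*c."""
    (a, b), (c, d) = p, q
    return a * d == b * c


def is_equivalence(pairs):
    """Brute-force check of reflexivity, symmetry and transitivity."""
    reflexive = all(related(p, p) for p in pairs)
    symmetric = all(
        related(q, p) for p, q in product(pairs, repeat=2) if related(p, q)
    )
    transitive = all(
        related(p, r)
        for p, q, r in product(pairs, repeat=3)
        if related(p, q) and related(q, r)
    )
    return reflexive and symmetric and transitive


values = range(-3, 4)
# Division tuples as in the revised definition: second entry non-zero.
division_tuples = [(a, b) for a in values for b in values if b != 0]
# All tuples, including those with second entry zero.
all_tuples = [(a, b) for a in values for b in values]
```

On this grid `is_equivalence(division_tuples)` holds, whereas `is_equivalence(all_tuples)` fails because of the tuples with second entry zero.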

We can now turn our attention to the set \mathbb{Z}^2/\sim=\left\{\left[x\right]_\sim:x\in\mathbb{Z}^2\right\}.

::: definition Definition 119. Rationals

Let \mathbb{Z}^2 have the equivalence relation \sim defined by \left(a,b\right)\sim\left(c,d\right) if and only if ad=bc. We define the set of rational numbers, denoted \mathbb{Q}, as the quotient set \mathbb{Z}^2/\sim. The set has the form

$$\begin{equation*} \mathbb{Q}=\left\{\dots,-\frac{2}{3},-\frac{1}{3},-\frac{1}{2},0,\frac{1}{2},\frac{1}{3},\frac{2}{3},\dots\right\} \end{equation*}$$ :::

Extending equality to the rationals

As with the integers, it is easy to extend equality.

::: definition Definition 120. Equality of rationals

Let x,y\in\mathbb{Q} be two rational numbers. We define that two rationals are equal, denoted x=y if and only if x\sim y. That is x and y are in the same equivalence class. If x\not\sim y then we say that x is not equal to y and write x\neq y. :::

Extending inequality operators to the rationals

The inequality operators can be extended to the rationals in a natural way.

::: definition Definition 121. Less than operator

Let x,y\in\mathbb{Q} where x=\left[a,b\right] and y=\left[c,d\right] for some a,b,c,d\in\mathbb{Z} with b>0 and d>0. Such representatives always exist as \left(a,b\right)\sim\left(-a,-b\right). The less than operator, denoted by x<y, is defined by the logical proposition

$$\begin{equation*} <\left(x,y\right)=\begin{cases} 1,\ \text{If } ad<bc\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$

This can equivalently be expressed as

$$\begin{equation*} x<y \iff ad<bc \end{equation*}$$ :::

::: definition Definition 122. Less than or equal to operator

Let x,y\in\mathbb{Q} where x=\left[a,b\right] and y=\left[c,d\right] for some a,b,c,d\in\mathbb{Z} with b>0 and d>0. Such representatives always exist as \left(a,b\right)\sim\left(-a,-b\right). The less than or equal to operator, denoted by x\leq y, is defined by the logical proposition

$$\begin{equation*} \leq\left(x,y\right)=\begin{cases} 1,\ \text{If } ad\leq bc\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$

This can equivalently be expressed as

$$\begin{equation*} x\leq y \iff ad\leq bc \end{equation*}$$ :::

::: definition Definition 123. Greater than operator

Let x,y\in\mathbb{Q} where x=\left[a,b\right] and y=\left[c,d\right] for some a,b,c,d\in\mathbb{Z} with b>0 and d>0. Such representatives always exist as \left(a,b\right)\sim\left(-a,-b\right). The greater than operator, denoted by x>y, is defined by the logical proposition

$$\begin{equation*} >\left(x,y\right)=\begin{cases} 1,\ \text{If } ad>bc\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$

This can equivalently be expressed as

$$\begin{equation*} x>y \iff ad>bc \end{equation*}$$ :::

::: definition Definition 124. Greater than or equal to operator

Let x,y\in\mathbb{Q} where x=\left[a,b\right] and y=\left[c,d\right] for some a,b,c,d\in\mathbb{Z} with b>0 and d>0. Such representatives always exist as \left(a,b\right)\sim\left(-a,-b\right). The greater than or equal to operator, denoted by x\geq y, is defined by the logical proposition

$$\begin{equation*} \geq\left(x,y\right)=\begin{cases} 1,\ \text{If } ad\geq bc\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$

This can equivalently be expressed as

$$\begin{equation*} x\geq y \iff ad\geq bc \end{equation*}$$ :::
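
A short sketch of the comparison ad<bc follows. It assumes that both representatives are first normalised to have a positive second entry, using \left(a,b\right)\sim\left(-a,-b\right), because cross-multiplying by a negative integer reverses an integer inequality. The names `normalise`, `less_than` and `naive_less_than` are hypothetical, and `fractions.Fraction` from the standard library is used only as an independent reference.

```python
from fractions import Fraction
from itertools import product


def normalise(p):
    """Return a representative of the same class with positive second entry."""
    a, b = p
    if b == 0:
        raise ValueError("second entry must be non-zero")
    return (a, b) if b > 0 else (-a, -b)


def less_than(p, q):
    """x < y via a*d < b*c on representatives with b > 0 and d > 0."""
    (a, b), (c, d) = normalise(p), normalise(q)
    return a * d < b * c


def naive_less_than(p, q):
    """Cross-multiplication without normalising; unreliable if b*d < 0."""
    (a, b), (c, d) = p, q
    return a * d < b * c


# (1, -2) represents -1/2 and (1, 3) represents 1/3.
example = (less_than((1, -2), (1, 3)), naive_less_than((1, -2), (1, 3)))
```

Here `example` pairs the normalised answer with the naive one for the same two tuples; the two disagree, which is the reason for insisting on positive second entries before comparing.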

Extending addition to the rationals

We can extend addition to the rationals. To do so we need to consider how integers are represented in the rationals. As we know an element \left(a,b\right)\in\mathbb{Q} is going to represent \displaystyle\frac{a}{b}. So we can start by considering what an integer will look like. We know by the definition of the equivalence relation that for \left(a,b\right),\left(c,d\right)\in\mathbb{Z}^2 that

$$\begin{equation*} \left(a,b\right)\sim\left(c,d\right)\iff ad=bc \end{equation*}$$

Hence if we have for b=d=1 that

$$\begin{equation*} \left(a,1\right)\sim\left(c,1\right)\iff a=c \end{equation*}$$

Hence an integer can be represented in the rationals by an element of the form \left(k,1\right) for all k\in\mathbb{Z}. Therefore if x,y\in\mathbb{Z} they will have the representation x=\left(x_1,1\right) and y=\left(y_1,1\right) for some x_1,y_1\in\mathbb{Z}. Hence by integer addition, we have that

$$\begin{equation*} x+y=\left(x_1,1\right)+\left(y_1,1\right)=\left(x_1+y_1,1\right) \end{equation*}$$

Now what happens if a=c=1? From the definition of the equivalence relation we have that

$$\begin{equation*} \left(1,b\right)\sim\left(1,d\right)\iff d=b \end{equation*}$$

So we see that \left(1,b\right)\sim\left(1,d\right) means that intuitively \displaystyle\frac{1}{b}=\frac{1}{d}. The question now becomes what is \displaystyle\frac{1}{b}+\frac{1}{b}?

For example consider \displaystyle\frac{1}{2}+\frac{1}{2}=1, or \displaystyle\frac{1}{3}+\frac{1}{3}=\frac{2}{3}. It seems the result we need is that \displaystyle\frac{1}{b}+\frac{1}{b}=\frac{2}{b}. We hence have that

$$\begin{equation*} \left(1,b\right)+\left(1,b\right)=\left(2,b\right) \end{equation*}$$

Hence more generally we have that

$$\begin{equation*} \left(a,b\right)+\left(c,b\right)=\left(a+c,b\right) \end{equation*}$$

Now, from intuition, we know that for example \displaystyle\frac{1}{3}=\frac{2}{6}=\frac{1*2}{3*2}. In the language of the relation we have defined, we have that

$$\begin{equation*} \left(a,b\right)\sim\left(ad,bd\right) \end{equation*}$$

With these facts, we have enough to recover the definition of the addition of rational numbers we were told in school.

We have that

$$\begin{align*} \left(a,b\right)+\left(c,d\right)&\sim\left(ad,bd\right)+\left(bc,bd\right)\\ &\sim\left(ad+bc,bd\right) \end{align*}$$

Indeed, we have for example

$$\begin{equation*} \frac{1}{2}+\frac{1}{3}=\frac{1*3+2*1}{2*3}=\frac{5}{6} \end{equation*}$$

We make the required definition.

::: definition Definition 125. Addition on the Rationals

Let x,y\in\mathbb{Q} with x=\left[a,b\right] and y=\left[c,d\right] so that b\neq 0 and d\neq 0. We define addition on the rationals by

$$\begin{equation} x+y=\left[a,b\right]+\left[c,d\right]=\left[ad+bc,bd\right] \end{equation}$$ :::
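
The addition rule can be tried out directly on representatives. This is a minimal sketch: the names `add` and `related` are hypothetical, tuples `(a, b)` stand for representatives of \left[a,b\right], and `fractions.Fraction` from the standard library is assumed only as an independent reference for the expected values.

```python
from fractions import Fraction


def related(p, q):
    """(a, b) ~ (c, d) if and only if a*d == b*c."""
    (a, b), (c, d) = p, q
    return a * d == b * c


def add(p, q):
    """Add representatives: (a, b) + (c, d) = (a*d + b*c, b*d)."""
    (a, b), (c, d) = p, q
    if b == 0 or d == 0:
        raise ValueError("second entries must be non-zero")
    return (a * d + b * c, b * d)


half_plus_third = add((1, 2), (1, 3))  # 1/2 + 1/3
half_plus_half = add((1, 2), (1, 2))   # 1/2 + 1/2, a representative of 1
```

Note that `half_plus_half` is not literally the tuple `(1, 1)`; it is only related to it under \sim, which is why the result of addition has to be read as an equivalence class.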

Extending multiplication to the rationals

We can extend multiplication to the rationals as well. As with extending addition, we should consider how integers are represented in the rationals. As before an integer in the rationals is of the form \left(a,1\right) and given the definition from the integers we know we must have

$$\begin{equation*} \left(a,1\right)*\left(b,1\right)=\left(a*b,1\right) \end{equation*}$$

Now we need to answer the question of \left(1,b\right)*\left(1,d\right). Taking a similar approach as to addition we will consider some examples. We intuitively know that \displaystyle 1*\frac{1}{2}=\frac{1}{2}. This is to say that

$$\begin{equation*} \left(1,1\right)*\left(1,2\right)=\left(1,2\right) \end{equation*}$$

We also know that 2*2=4 and so we know \displaystyle\frac{4}{2}=2. In other words we must have that

$$\begin{equation*} \left(4,1\right)*\left(1,2\right)=\left(2,1\right)\sim\left(4,2\right) \end{equation*}$$

Now, suppose we have \displaystyle\frac{3}{2}=1.5, what is \displaystyle\frac{3}{2}*\frac{1}{3}? Again we know intuitively that 0.5+0.5+0.5=3(0.5)=1.5, hence we can write

$$\begin{equation*} \left(3,2\right)*\left(1,3\right)=\left(1,2\right)\sim\left(3,6\right) \end{equation*}$$

We can now see how to handle \left(1,b\right)*\left(1,d\right) and more generally \left(a,b\right)*\left(c,d\right). We make the definition.

::: definition Definition 126. Multiplication on the Rationals

Let x,y\in\mathbb{Q} with x=\left[a,b\right] and y=\left[c,d\right] so that b\neq 0 and d\neq 0. We define multiplication on the rationals by

$$\begin{equation} xy=\left[a,b\right]\left[c,d\right]=\left[ac,bd\right] \end{equation}$$ :::

Closure properties of addition and multiplication

As with the natural numbers and integers we need to show that the operations of addition and multiplication on the rationals are closed and well-defined.

::: theorem Theorem 26. Addition and multiplication on the rationals are well-defined operators and closed

We have that \forall x,y\in\mathbb{Q} that

  1. $x+y\in\mathbb{Q}$

  2. $x*y\in\mathbb{Q}$

Proof:

  1. x+y\in\mathbb{Q}:

    We must show that if \left(a,b\right)\sim\left(a',b'\right) and \left(c,d\right)\sim\left(c',d'\right) then we have

    $$\begin{equation} \left(ad+bc,bd\right)\sim\left(a'd'+b'c',b'd'\right) \end{equation*}$$*

    By definition we have that \left(a,b\right)\sim\left(a',b'\right) holds if and only if ab'=ba', likewise \left(c,d\right)\sim\left(c',d'\right) holds if and only if cd'=c'd. It is left to show \left(ad+bc,bd\right)\sim\left(a'd'+b'c',b'd'\right). By definition of the equivalence relation we have that

    $$\begin{equation} \left(ad+bc,bd\right)\sim\left(a'd'+b'c',b'd'\right) \iff \left(ad+bc\right)b'd'=bd\left(a'd'+b'c'\right) \end{equation*}$$*

    We have that

    $$\begin{align*} \left(ad+bc\right)b'd'&=adb'd'+bcb'd', \text{ As integer multiplication distributes over addition}\\ &=\left(ab'\right)\left(dd'\right)+\left(cd'\right)\left(bb'\right), \text{ By commutativity}\\ &=\left(ba'\right)\left(dd'\right)+\left(dc'\right)\left(bb'\right), \text{ By the equivalence relation}\\ &=\left(bd\right)\left(a'd'\right)+\left(bd\right)\left(b'c'\right), \text{ By commutativity}\\ &=bd\left(a'd'+b'c'\right), \text{ As integer multiplication distributes over addition} \end{align*}$$

    Which is what we wished to show. Hence addition is well-defined. It is left to show closure. Let x,y\in\mathbb{Q} with x=\left(a,b\right) and y=\left(c,d\right) so that b\neq 0 and d\neq 0. By definition of addition we have that

    $$\begin{equation} \left(a,b\right)+\left(c,d\right)=\left(ad+bc,bd\right) \end{equation*}$$*

    As ad+bc\in\mathbb{Z} and bd\in\mathbb{Z}, with bd\neq 0 because b\neq 0, d\neq 0 and the integers have no zero divisors, then \left(ad+bc,bd\right)\in\left[ad+bc,bd\right] and so x+y\in\mathbb{Q}.

  2. x*y\in\mathbb{Q}:

    As with addition we need to show that if \left(a,b\right)\sim\left(a',b'\right) and \left(c,d\right)\sim\left(c',d'\right) that

    $$\begin{equation} \left(ac,bd\right)\sim\left(a'c',b'd'\right) \end{equation*}$$*

    As \left(a,b\right)\sim\left(a',b'\right) holds if and only if ab'=ba', likewise \left(c,d\right)\sim\left(c',d'\right) holds if and only if cd'=c'd. It is left to show \left(ac,bd\right)\sim\left(a'c',b'd'\right), that is

    $$\begin{equation} \left(ac,bd\right)\sim\left(a'c',b'd'\right)\iff acb'd'=bda'c' \end{equation*}$$*

    We have

    $$\begin{align*} acb'd'&=\left(ab'\right)\left(cd'\right),\ \text{By commutativity}\\ &=\left(ba'\right)\left(c'd\right),\ \text{By the equivalence relation}\\ &=bda'c',\ \text{By commutativity} \end{align*}$$

    Showing that multiplication is well-defined. To show closure let x,y\in\mathbb{Q} with x=\left(a,b\right) and y=\left(c,d\right) so that b\neq 0 and d\neq 0 then by definition we have that

    $$\begin{equation} \left(a,b\right)\left(c,d\right)=\left(ac,bd\right) \end{equation}$$*

    From which it is clear that ac,bd\in\mathbb{Z} with bd\neq 0, as the integers have no zero divisors, so $x*y\in\mathbb{Q}$

The result is shown. $\qed$ :::
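
Well-definedness says that swapping a representative for a related one never changes the class of the result. The sketch below checks this exhaustively on a small grid; the names `related`, `add`, `mul` and `check_well_defined`, and the grid bound, are illustrative assumptions rather than anything fixed by the text.

```python
from itertools import product


def related(p, q):
    """(a, b) ~ (c, d) if and only if a*d == b*c."""
    (a, b), (c, d) = p, q
    return a * d == b * c


def add(p, q):
    """(a, b) + (c, d) = (a*d + b*c, b*d)."""
    (a, b), (c, d) = p, q
    return (a * d + b * c, b * d)


def mul(p, q):
    """(a, b) * (c, d) = (a*c, b*d)."""
    (a, b), (c, d) = p, q
    return (a * c, b * d)


def check_well_defined(bound: int = 3) -> bool:
    """Check well-definedness and closure for tuples with entries in [-bound, bound]."""
    values = range(-bound, bound + 1)
    tuples = [(a, b) for a in values for b in values if b != 0]
    # All pairs of representatives lying in the same class.
    same_class = [(p, p2) for p, p2 in product(tuples, repeat=2) if related(p, p2)]
    for (p, p2), (q, q2) in product(same_class, repeat=2):
        # Related inputs must give related outputs.
        assert related(add(p, q), add(p2, q2))
        assert related(mul(p, q), mul(p2, q2))
        # Closure: the second entry of each result is again non-zero.
        assert add(p, q)[1] != 0 and mul(p, q)[1] != 0
    return True
```

The design mirrors the proof: pick \left(a,b\right)\sim\left(a',b'\right) and \left(c,d\right)\sim\left(c',d'\right), then compare the two results under \sim rather than as literal tuples.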

Associativity of rational addition and multiplication

The associativity of addition and multiplication extends to the rationals.

::: theorem Theorem 27. Let x,y,z\in\mathbb{Q}. We have that

  1. $x+\left(y+z\right)=\left(x+y\right)+z$

  2. $x\left(yz\right)=\left(xy\right)z$

Proof:

  1. x+\left(y+z\right)=\left(x+y\right)+z:

    Let x,y,z\in\mathbb{Q} be such that x=\left(a,b\right), y=\left(c,d\right) and z=\left(e,f\right) where a,b,c,d,e,f\in\mathbb{Z} and we have that \left(a,b\right)\in\left[a,b\right], \left(c,d\right)\in\left[c,d\right] and \left(e,f\right)\in\left[e,f\right]. We have that

    $$\begin{align*} x+\left(y+z\right)&=\left(a,b\right)+\left(\left(c,d\right)+\left(e,f\right)\right)\\ &=\left(a,b\right)+\left(cf+de,df\right)\\ &=\left(adf+b\left(cf+de\right),bdf\right)\\ &=\left(adf+bcf+bde,bdf\right)\\ &=\left(\left(ad+bc\right)f+bde,bdf\right)\\ &=\left(\left(ad+bc\right)f+\left(bd\right)e,\left(bd\right)f\right),\text{ By associativity of multiplication for the integers}\\ &=\left(ad+bc,bd\right)+\left(e,f\right)\\ &=\left(\left(a,b\right)+\left(c,d\right)\right)+\left(e,f\right)\\ &=\left(x+y\right)+z \end{align*}$$

    Which shows associativity of addition.

  2. x\left(yz\right)=\left(xy\right)z:

    As with addition, let x,y,z\in\mathbb{Q} be such that x=\left(a,b\right), y=\left(c,d\right) and z=\left(e,f\right) where a,b,c,d,e,f\in\mathbb{Z} and we have that \left(a,b\right)\in\left[a,b\right], \left(c,d\right)\in\left[c,d\right] and \left(e,f\right)\in\left[e,f\right]. We then have that

    $$\begin{align*} x\left(yz\right)&=\left(a,b\right)\left(\left(c,d\right)\left(e,f\right)\right)\\ &=\left(a,b\right)\left(ce,df\right)\\ &=\left(ace,bdf\right)\\ &=\left(ac,bd\right)\left(e,f\right)\\ &=\left(\left(a,b\right)\left(c,d\right)\right)\left(e,f\right)\\ &=\left(xy\right)z \end{align*}$$

    Showing associativity of multiplication.

The result follows. $\qed$ :::

Commutativity of rational addition and multiplication

As with the naturals and integers, addition and multiplication in the rationals both satisfy commutativity.

::: theorem Theorem 28. Addition and multiplication are commutative

For all x,y\in\mathbb{Q} we have that

  1. $x+y=y+x$

  2. $xy=yx$

Proof:

  1. x+y=y+x:

    Let x,y\in\mathbb{Q}. By definition we have that x\in\left[a,b\right] and y\in\left[c,d\right] for some a,b,c,d\in\mathbb{Z}. Let x=\left(a,b\right) and y=\left(c,d\right). We then have by definition of addition that

    $$\begin{align*} x+y&=\left(a,b\right)+\left(c,d\right)\\ &=\left(ad+bc,bd\right)\\ &=\left(bc+ad,bd\right),\ \text{By commutativity of addition for the integers}\\ &=\left(cb+da,db\right),\ \text{By commutativity of multiplication for the integers}\\ &= \left(c,d\right)+\left(a,b\right)\\ &=y+x \end{align*}$$

    Showing commutativity holds for addition in the rationals.

  2. xy=yx:

    Let x,y\in\mathbb{Q} by definition we have that x\in\left[a,b\right] and y\in\left[c,d\right] for some a,b,c,d\in\mathbb{Z}. So let x=\left(a,b\right) and y=\left(c,d\right). By definition of multiplication we have

    $$\begin{align*} xy&=\left(a,b\right)\left(c,d\right)\\ &=\left(ac,bd\right)\\ &=\left(ca,db\right), \text{By commutativity of multiplication of the integers}\\ &=\left(c,d\right)\left(a,b\right)\\ &=yx \end{align*}$$

    Showing commutativity for rational multiplication.

The result has been shown. $\qed$ :::
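
The computations in the associativity and commutativity proofs never appeal to \sim: both sides reduce to the same tuple of integers. The following sketch makes that observation checkable on a small grid, comparing results with `==` on tuples rather than up to equivalence. The names `add`, `mul` and `check_associative_and_commutative` are hypothetical.

```python
from itertools import product


def add(p, q):
    """(a, b) + (c, d) = (a*d + b*c, b*d)."""
    (a, b), (c, d) = p, q
    return (a * d + b * c, b * d)


def mul(p, q):
    """(a, b) * (c, d) = (a*c, b*d)."""
    (a, b), (c, d) = p, q
    return (a * c, b * d)


def check_associative_and_commutative(bound: int = 2) -> bool:
    """Compare both sides of each law as literal tuples of integers."""
    values = range(-bound, bound + 1)
    tuples = [(a, b) for a in values for b in values if b != 0]
    for x, y, z in product(tuples, repeat=3):
        assert add(x, add(y, z)) == add(add(x, y), z)  # associativity of +
        assert mul(x, mul(y, z)) == mul(mul(x, y), z)  # associativity of *
        assert add(x, y) == add(y, x)                  # commutativity of +
        assert mul(x, y) == mul(y, x)                  # commutativity of *
    return True
```

Equality of tuples is stronger than equivalence under \sim, so a passing check here implies the corresponding statements for the classes on the tested grid.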

The Zero and Identity laws

The zero and identity laws from both the naturals and integers extend to the rationals. But first, we show the following result.

::: lemma Lemma 7. Representation of zero in the rationals

We have that 0=\left[0,a\right] for all a\in\mathbb{Z} with $a\neq 0$

Proof:

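Let x,y\in\left[0,a\right] with x=\left(0,a_1\right) and y=\left(0,a_2\right), where a_1\neq 0 and a_2\neq 0. We hence have that x\sim y as

$$\begin{equation*} \left(0,a_1\right)\sim\left(0,a_2\right)\iff 0*a_2=a_1*0 \iff 0=0 \end{equation*}$$

Where the final 0=0 is the zero of the integers, from which the result is clear. $\qed$ :::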

We take $\left(0,1\right)$ as the natural representation of $0$ in the rationals.

::: theorem Theorem 29. The zero and Identity laws

Let x\in\mathbb{Q}. We have that

  1. $x+0=x=0+x$

  2. $1x=x=x1$

Proof:

Let x\in\mathbb{Q} then we have that x=\left(a,b\right) for some $a,b\in\mathbb{Z}$

  1. x+0=x=0+x:

    We have that 0\in\left[0,1\right]. Hence we have that

    $$\begin{align*} x+0&=\left(a,b\right)+\left(0,1\right)\\ &=\left(a*1+b*0,b*1\right)\\ &=\left(a,b\right)=x\\ &=\left(1*a+0*b,1*b\right)\\ &=\left(0,1\right)+\left(a,b\right)\\ &=0+x \end{align*}$$

  2. x*1=x=1*x:

    As 1\in\left[1,1\right] then

    $$\begin{align*} x*1&=\left(a,b\right)\left(1,1\right)\\ &=\left(a*1,b*1\right)\\ &=\left(a,b\right)=x\\ &=\left(1*a,1*b\right)\\ &=\left(1,1\right)\left(a,b\right)\\ &=1*x \end{align*}$$

The result follows. $\qed$ :::

Multiplication distributes over addition

Yet another result that extends to the rationals is that multiplication distributes over addition.

::: theorem Theorem 30. Multiplication distributes over addition

For all x,y,z\in\mathbb{Q} we have that

  1. $x\left(y+z\right)=xy+xz$

  2. $\left(y+z\right)x=yx+zx=xy+xz$

Proof:

Let x,y,z\in\mathbb{Q} then x\in\left[a,b\right],y\in\left[c,d\right] and z\in\left[e,f\right] for some a,b,c,d,e,f\in\mathbb{Z}.

Let x=\left(a,b\right), y=\left(c,d\right) and z=\left(e,f\right).

  1. x\left(y+z\right)=xy+xz:

    We have that

    $$\begin{align*} x\left(y+z\right)&=\left(a,b\right)\left(\left(c,d\right)+\left(e,f\right)\right)\\ &=\left(a,b\right)\left(cf+ed,df\right)\\ &=\left(a\left(cf+ed\right),bdf\right)\\ &=\left(acf+aed,bdf\right),\ \text{By multiplication distributes over addition for the integers}\\ &=\left(acf+aed,bdf\right)\left(1,1\right),\ \text{By the identity law for the rationals}\\ &=\left(acf+aed,bdf\right)\left(b,b\right),\ \text{As }\left(1,1\right)\sim\left(b,b\right)\\ &=\left(\left(acf+aed\right)b,bdfb\right)\\ &=\left(acfb+aedb,bdfb\right),\ \text{By multiplication distributes over addition for the integers}\\ &=\left(acbf+aebd,bdbf\right),\ \text{By commutativity of integer multiplication}\\ &=\left(ac,bd\right)+\left(ae,bf\right)\\ &=\left(a,b\right)\left(c,d\right)+\left(a,b\right)\left(e,f\right)\\ &=xy+xz \end{align*}$$

  2. \left(y+z\right)x=yx+zx=xy+xz:

    Invoking the previous part of the proof we have that

    $$\begin{align*} \left(y+z\right)x&=x\left(y+z\right),\ \text{By commutativity of multiplication}\\ &=xy+xz,\ \text{By part }1.\\ &=yx+zx,\ \text{By commutativity of multiplication} \end{align*}$$

As required. $\qed$ :::
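The step in the proof that multiplies by $\left(b,b\right)$ is needed because the two sides of the distributive law are in general different representatives of the same class. The sketch below illustrates this with integer pairs. The helpers `add`, `mul` and `equiv` are hypothetical names introduced only for this illustration.

```python
# Hypothetical helpers: a rational is represented by an integer pair (a, b), b != 0.

def add(x, y):
    (a, b), (c, d) = x, y
    return (a * d + b * c, b * d)

def mul(x, y):
    (a, b), (c, d) = x, y
    return (a * c, b * d)

def equiv(x, y):
    # (a, b) ~ (c, d) if and only if ad = cb
    (a, b), (c, d) = x, y
    return a * d == c * b

x, y, z = (1, 2), (1, 3), (1, 4)
lhs = mul(x, add(y, z))          # x(y+z)
rhs = add(mul(x, y), mul(x, z))  # xy+xz
print(lhs, rhs)         # (7, 24) (14, 48)
print(lhs == rhs)       # False: the representatives differ by the factor b = 2
print(equiv(lhs, rhs))  # True: both lie in the same equivalence class
```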

Extending subtraction to the rationals

We can extend subtraction from the integers to the rationals. Recall that subtraction was defined for x,y\in\mathbb{Z} by

$$\begin{equation*} x-y=x+\left(-y\right)=x+\left(-1*y\right) \end{equation*}$$

That is to say subtraction was defined by adding the negation of y to x. We will use a similar idea to define subtraction on the rationals. Firstly we need to consider what it means to negate a rational number. To do so we need to define what it means for a rational number to be "positive" or "negative".

We know that any integer x can be expressed as a rational by \left(x,1\right) and so in this case \left(x,1\right) is positive if x is positive and \left(x,1\right) is negative if x is negative. Hence a general rational number \left(a,b\right) being positive or negative will depend on a and b being positive or negative. There are a few cases to consider.

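  1. Suppose that $a$ is positive and $b$ is positive. For $\left(a,b\right)\sim\left(c,d\right)$ with $c,d\in\mathbb{Z}$ and $d\neq 0$ we have

    $$\begin{equation*} ad=cb \end{equation*}$$

    As $a$ and $b$ are positive, $ad$ has the sign of $d$ and $cb$ has the sign of $c$. We are forced to conclude that $c$ and $d$ have the same sign, that is, both are positive or both are negative, for if not then the two sides of this equation would have different signs. For example $\left(1,2\right)\sim\left(-1,-2\right)$.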

  2. Suppose that a is positive and b is negative. Then as before we have that for \left(a,b\right)\sim\left(c,d\right) to be true that

    $$\begin{equation*} ad=cb \end{equation*}$$

    As b was negative then we have that cb is either positive or negative depending on c. If c is positive then cb is negative and so d must also be negative. Likewise if c is negative then cb is positive and d must be positive.

The cases for when a is negative and b is either positive or negative are similar. We can use this to make a definition for a positive and negative rational number.

::: definition Definition 127. Positive and negative rational number

Let $x\in\mathbb{Q}$ so that $x=\left(a,b\right)$ for some $a,b\in\mathbb{Z}$ with $b\neq 0$. We say that $x$ is a positive rational number if and only if $a$ and $b$ are non-zero and have the same sign. That is to say $x\in\mathbb{Q}$ is positive if and only if $\mathop{\mathrm{sgn}}\left(a\right)=\mathop{\mathrm{sgn}}\left(b\right)$ with $\mathop{\mathrm{sgn}}\left(a\right)\neq 0$ and $\mathop{\mathrm{sgn}}\left(b\right)\neq 0$, where $\mathop{\mathrm{sgn}}$ denotes the sign function of an integer.

If \mathop{\mathrm{sgn}}\left(a\right)\neq\mathop{\mathrm{sgn}}\left(b\right) and \mathop{\mathrm{sgn}}\left(a\right)\neq 0 and \mathop{\mathrm{sgn}}\left(b\right)\neq 0 then we have that x is a negative rational number.

Finally if $\mathop{\mathrm{sgn}}\left(a\right)= 0$ and $\mathop{\mathrm{sgn}}\left(b\right)\neq 0$ then we say that $x$ is neither positive nor negative. :::

We can summarise this definition using \mathop{\mathrm{sgn}} just like we did for the integers.

::: definition Definition 128. Sign of a rational number

Let x\in\mathbb{Q} where x=\left(a,b\right) with a,b\in\mathbb{Z} and b\neq 0. We define the sign of x, denoted by \mathop{\mathrm{sgn}}\left(x\right) to be the following function

$$\begin{align*} \mathop{\mathrm{sgn}}:\mathbb{Q}&\rightarrow\left\{-1,0,1\right\}\\ x&\mapsto\mathop{\mathrm{sgn}}\left(x\right)=\begin{cases} 1,\ \text{If } x\text{ is a positive rational number}\\ -1,\ \text{If } x\text{ is a negative rational number}\\ 0,\ \text{If } \mathop{\mathrm{sgn}}\left(a\right)=0 \end{cases} \end{align*}$$ :::
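The three cases of the sign definition can be collapsed into a single product of integer signs. The sketch below shows one way to do this, assuming a rational is given by an integer pair `(a, b)` with `b != 0`. The names `sgn_int` and `sgn_rat` are hypothetical and are not notation from the text.

```python
def sgn_int(n):
    """Sign of an integer: 1, 0 or -1."""
    return (n > 0) - (n < 0)

def sgn_rat(x):
    """Sign of the rational represented by the pair (a, b) with b != 0.

    The product sgn(a) * sgn(b) is 1 when the signs agree and are non-zero,
    -1 when they differ and are non-zero, and 0 when a = 0.
    """
    a, b = x
    if b == 0:
        raise ValueError("the second component of a rational must be non-zero")
    return sgn_int(a) * sgn_int(b)

print(sgn_rat((1, 2)), sgn_rat((-1, -2)))  # 1 1
print(sgn_rat((1, -2)), sgn_rat((-1, 2)))  # -1 -1
print(sgn_rat((0, 5)))                     # 0
```

The pairs `(1, 2)` and `(-1, -2)` are equivalent, as are `(1, -2)` and `(-1, 2)`, so the example also suggests that the sign does not depend on the chosen representative.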

Now that we have defined the notion of a positive and negative rational number we can consider what it means to negate a rational number. The definition follows immediately from the representation of -1 in \mathbb{Q} being \left(-1,1\right). Indeed for any x\in\mathbb{Q} with x=\left(a,b\right) we have

$$\begin{equation*} -x=-1*x=\left(-1,1\right)\left(a,b\right)=\left(-a,b\right) \end{equation*}$$

We make the formal definition.

::: definition Definition 129. Negation of a rational number

Let x\in\mathbb{Q}. We define the negation of x, denoted -x by

$$\begin{equation*} -x=-1*x=\left(-1,1\right)x \end{equation*}$$

where \left(-1,1\right)\in\left[\left(-1,1\right)\right]. That is \left(-1,1\right) is an element of the equivalence class \left[\left(-1,1\right)\right] which represents all possible elements that are -1. :::

We can now make a definition for subtraction for the rational numbers

::: definition Definition 130. Rational number subtraction

Let x,y\in\mathbb{Q}. We define the subtraction of y from x, denoted x-y by

$$\begin{equation*} x-y=x+\left(-y\right)=x+\left(-1*y\right) \end{equation*}$$ :::

We immediately get that subtraction is closed, from the fact that both addition and multiplication are closed. We do not have associativity of subtraction in general.

::: proposition Proposition 80. Rational number subtraction is not associative

There exist $x,y,z\in\mathbb{Q}$ such that

$$\begin{equation*} x-\left(y-z\right)\neq \left(x-y\right)-z \end{equation*}$$

Proof:

Let \displaystyle x=\frac{1}{2}, y=\frac{1}{4} and \displaystyle z=\frac{1}{6}, we have x\in\left[\left(1,2\right)\right], y\in\left[\left(1,4\right)\right] and z\in\left[\left(1,6\right)\right] so x=\left(1,2\right), y=\left(1,4\right) and z=\left(1,6\right) . We have that

$$\begin{align*} x-\left(y-z\right)&=\left(1,2\right)-\left(\left(1,4\right)-\left(1,6\right)\right)\\ &=\left(1,2\right)-\left(\left(1,4\right)+\left(-1*\left(1,6\right)\right)\right)\\ &=\left(1,2\right)-\left(\left(1,4\right)+\left(-1,6\right)\right)\\ &=\left(1,2\right)-\left(1*6+4*\left(-1\right),4*6\right)\\ &=\left(1,2\right)-\left(2,24\right)\\ &=\left(1,2\right)+\left(-1*\left(2,24\right)\right)\\ &=\left(1,2\right)+\left(-2,24\right)\\ &=\left(1*24+2*\left(-2\right),2*24\right)\\ &=\left(20,48\right) \end{align*}$$

On the other hand we have

$$\begin{align*} \left(x-y\right)-z&=\left(\left(1,2\right)-\left(1,4\right)\right)-\left(1,6\right)\\ &=\left(\left(1,2\right)+\left(-1*\left(1,4\right)\right)\right)-\left(1,6\right)\\ &=\left(\left(1,2\right)+\left(-1,4\right)\right)-\left(1,6\right)\\ &=\left(1*4+2*\left(-1\right),2*4\right)-\left(1,6\right)\\ &=\left(2,8\right)-\left(1,6\right)\\ &=\left(2,8\right)+\left(-1*\left(1,6\right)\right)\\ &=\left(2,8\right)+\left(-1,6\right)\\ &=\left(2*6+8*\left(-1\right),8*6\right)\\ &=\left(4,48\right) \end{align*}$$

It is left to show that $\left(20,48\right)\neq\left(4,48\right)$. Indeed to have $\left(20,48\right)=\left(4,48\right)$ we need $\left(20,48\right)\sim\left(4,48\right)$, which occurs if and only if $20*48=4*48$. However on the left-hand side $48$ is multiplied by $20$ and on the right-hand side $48$ is multiplied by $4$, so they clearly can not be equal.

The result is shown. $\qed$ :::
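The counterexample can be recomputed mechanically from the definitions of addition, negation and subtraction. The following is a minimal sketch using integer pairs. The helper names `add`, `neg`, `sub` and `equiv` are hypothetical and are introduced only for this check.

```python
# Hypothetical helpers: a rational is represented by an integer pair (a, b), b != 0.

def add(x, y):
    (a, b), (c, d) = x, y
    return (a * d + b * c, b * d)

def neg(x):
    # -x = (-1, 1)(a, b) = (-a, b)
    a, b = x
    return (-a, b)

def sub(x, y):
    # x - y = x + (-y)
    return add(x, neg(y))

def equiv(x, y):
    # (a, b) ~ (c, d) if and only if ad = cb
    (a, b), (c, d) = x, y
    return a * d == c * b

x, y, z = (1, 2), (1, 4), (1, 6)
left = sub(x, sub(y, z))    # x - (y - z)
right = sub(sub(x, y), z)   # (x - y) - z
print(left, right)
print(equiv(left, right))   # False: the two sides are different rationals
```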

As with subtraction with integers, we can now show that formally, subtraction is an inverse to addition.

::: {#prop:RationalAdditiveInverse .proposition} Proposition 81. Subtracting a rational number from itself gives zero

Let x\in\mathbb{Q}. We have that

$$\begin{equation*} x-x=0 \end{equation*}$$

Proof:

Let x\in\mathbb{Q} where x\in\left[\left(a,b\right)\right] for some a,b\in\mathbb{Z} and b\neq 0. We have

$$\begin{align*} x-x&=\left(a,b\right)-\left(a,b\right)\\ &=\left(a,b\right)+\left(-a,b\right)\\ &=\left(ab+b\left(-a\right),bb\right)\\ &=\left(ab-ba,bb\right)\\ &=\left(ab-ab,bb\right)\\ &=\left(0,bb\right) \end{align*}$$

It is left to show that $\left(0,bb\right)\sim\left(0,1\right)$. Indeed

$$\begin{equation*} 0*1=bb*0 \Rightarrow 0=0 \end{equation*}$$

The result is shown. $\qed$ :::

The cancellation laws

We can now deduce that the cancellation laws extend to the rational numbers.

::: {#thm:CancellationLawsForRationals .theorem} Theorem 31. The cancellation laws

Let x,y,z\in\mathbb{Q}.

  1. If x+y=x+z then we have y=z.

  2. For x\neq 0, if xy=xz then we have that $y=z$

Proof:

  1. If x+y=x+z then we have y=z:

    Let x,y,z\in\mathbb{Q}. We have that

    $$\begin{align*} x+y&=x+z\\ \Rightarrow -x+x+y&=-x+x+z,\ \text{Adding the negative of } x \text{ to both sides}\\ \Rightarrow \left(-x+x\right)+y&=\left(-x+x\right)+z,\ \text{Associativity of addition for the rationals}\\ \Rightarrow 0+y&=0+z,\ \text{By proposition 81}\\ \Rightarrow y&=z \end{align*}$$

  2. For x\neq 0, if xy=xz then we have that y=z:

    Let x,y,z\in\mathbb{Q} where x\neq 0. Suppose that x\in\left[\left(a,b\right)\right], y\in\left[\left(c,d\right)\right] and z\in\left[\left(e,f\right)\right]. We have

    $$\begin{align*} xy&=\left(a,b\right)\left(c,d\right)=\left(ac,bd\right)\\ xz&=\left(a,b\right)\left(e,f\right)=\left(ae,bf\right) \end{align*}$$

    Now suppose that $xy=xz$. Then we have that $\left(ac,bd\right)\sim\left(ae,bf\right)$, which is to say

    $$\begin{equation*} acbf=aebd \end{equation*}$$

    Observe that $a\neq 0$ as $x\neq 0$, and that $b\neq 0$ by definition of a rational number. Hence

    $$\begin{align*} &acbf=aebd\\ &a\left(cbf\right)=a\left(ebd\right)\\ &cbf=ebd,\ \text{By the cancellation laws for the integers}\\ &bcf=bed,\ \text{By commutativity of the integers}\\ &b\left(cf\right)=b\left(ed\right)\\ &cf=ed,\ \text{By the cancellation laws for the integers}\\ \Rightarrow&\left(c,d\right)\sim\left(e,f\right),\ \text{By definition of the equivalence relation} \end{align*}$$

    It hence follows that as \left(c,d\right)\sim\left(e,f\right) then $y=z$

The result is shown. $\qed$ :::

Defining multiplicative inverses and division

When we extended the naturals to the integers we were able to extend the notion of subtraction in such a way that we could undo any addition operation. We were not able to do the same for multiplication in general. For example if we have x*2=1 where 1,2,x\in\mathbb{Z} then there is no integer x that when multiplied by 2 gives 1.

What happens if we consider instead the situation where we have 1,2,x\in\mathbb{Q}? Let x=\left(a,b\right) for some a,b\in\mathbb{Z} with b\neq 0 and taking the natural representations for 1 and 2 of 1=\left(1,1\right) and 2=\left(2,1\right). We have that

$$\begin{align*} x*2&=1\\ \left(a,b\right)\left(2,1\right)&=\left(1,1\right)\\ \left(2a,b\right)&=\left(1,1\right)\\ \Rightarrow\left(2a,b\right)&\sim\left(1,1\right)\iff 2a=b \end{align*}$$

We don't seem to be in a better position than when we asked this question for $\mathbb{Z}$. However as $a,b$ were arbitrary, of course with $b\neq 0$, we are free to vary them. For example $a=1$ gives us $b=2$, $a=2$ gives $b=4$, $a=3$ yields $b=6$ and so on. We hence have that there is a family of possible values for $x$ which satisfy $x*2=1$ over the rational numbers, in particular we have $x=\left(a,2a\right)$ for $a\in\mathbb{Z}$ and $a\neq 0$. Moreover we clearly have

$$\begin{equation*} \left(a,2a\right)\sim\left(1,2\right)\iff 2a=2a \end{equation*}$$

Hence we have that \left(a,2a\right) somehow undoes multiplication by 2. Indeed consider 45*2=90. We have that

$$\begin{equation*} 90*\left(a,2a\right)=\left(90,1\right)\left(a,2a\right)=\left(90a,2a\right) \end{equation*}$$

Where we have \left(90a,2a\right)\sim\left(45,1\right) as 90a*1=45*2a \iff 90a = 90a. We can generalise this to x*y=1 for any y\in\mathbb{Q}. Indeed let x=\left(a,b\right) and y=\left(c,d\right) where a,b,c,d\in\mathbb{Z} and c\neq 0 and d\neq 0 then we have

$$\begin{align*} xy&=\left(a,b\right)\left(c,d\right)\\ &=\left(ac,bd\right)=\left(1,1\right)\\ \Rightarrow\left(ac,bd\right)&\sim\left(1,1\right)\iff bd=ac \end{align*}$$

This is a somewhat unsatisfactory conclusion as it doesn't tell us what a or b should actually be equal to in order for x*y=1, likewise, it doesn't tell us what c or d should be either.

Perhaps then we should consider a simpler setup. Suppose that $x\in\mathbb{Z}$ with $x\neq 0$. Is there a $y\in\mathbb{Q}$, where $y=\left(c,d\right)$ with $d\neq 0$, such that $x*y=1$? We have

$$\begin{equation*} xy=\left(x,1\right)\left(c,d\right)=\left(xc,d\right)=\left(1,1\right) \end{equation*}$$

Hence

$$\begin{equation*} \left(xc,d\right)\sim\left(1,1\right)\iff xc=d \end{equation*}$$

Hence y=\left(c,xc\right) satisfies this relation. However we can see that \left(c,cx\right)\sim\left(1,x\right). Hence for any integer x\neq 0 we have a solution to x*y=1 with y\in\mathbb{Q}. We call y a multiplicative inverse of x and x a multiplicative inverse of y.

::: definition Definition 131. Multiplicative inverse of an integer

Let x\in\mathbb{Z} be such that x\neq 0. Then there is a y\in\mathbb{Q} such that

$$\begin{equation*} xy=1=yx \end{equation*}$$

where $y=\left(1,x\right)$. We can write this as $\displaystyle y=\frac{1}{x}$ or $y=x^{-1}$. We sometimes say that $x^{-1}$ is a reciprocal of $x$ or a multiplicative inverse of $x$. :::

In light of this, we have the immediate result

::: {#prop:MultiplicativeInverseOfIntegerTimesInverseIsOriginalNumber .proposition} Proposition 82. Multiplicative inverse of an integer times its multiplicative inverse is the original number

Let $x\in\mathbb{Z}$ with $x\neq 0$ so that $x^{-1}\in\mathbb{Q}$, where $\displaystyle x^{-1}=\frac{1}{x}$ is the multiplicative inverse of $x$ in the rationals. The following result holds.

$$\begin{equation} xx^{-1}x = x \end{equation}$$

Proof:

By definition of a multiplicative inverse we have that

$$\begin{equation*} x*x^{-1}=x*\frac{1}{x}=\left(x,1\right)\left(1,x\right)=\left(x,x\right)\sim\left(1,1\right)=1 \end{equation*}$$

Hence as $x^{-1}$ is a multiplicative inverse for $x$ it follows that $x$ is a multiplicative inverse for $x^{-1}$ and so

$$\begin{equation*} x*x^{-1}*x=1*x=x \end{equation*}$$

As required. $\qed$ :::

Armed with this definition we can answer the original question. In order to find an x so that x*y=1 we have that we need to find a multiplicative inverse for c and a multiplicative inverse for \displaystyle d^{-1}=\frac{1}{d}. Clearly we have that \displaystyle c^{-1}=\frac{1}{c} and a multiplicative inverse for d^{-1} is simply d. Hence a candidate for x is given by x=\left(d,c\right). Indeed we have that

$$\begin{equation*} xy=\left(d,c\right)\left(c,d\right)=\left(cd,cd\right)\sim\left(1,1\right)=1 \end{equation*}$$

We can hence extend the idea of multiplicative inverses to the rationals.

::: definition Definition 132. Multiplicative inverse of a rational number

Let $x\in\mathbb{Q}$ such that $x=\left(a,b\right)$ with $a,b\in\mathbb{Z}$, $a\neq 0$ and $b\neq 0$. Then there is a $y\in\mathbb{Q}$ such that

$$\begin{equation*} xy=1=yx \end{equation*}$$

where $y=\left(b,a\right)$. The condition $a\neq 0$ is needed for $\left(b,a\right)$ to be a rational number. We write this as $\displaystyle y=\frac{b}{a}$ or as $\displaystyle x^{-1}=y=\frac{b}{a}$. We sometimes say that $x^{-1}$ is a reciprocal of $x$ or a multiplicative inverse of $x$. :::
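The definition amounts to swapping the two components of a representative. The sketch below checks this on integer pairs and also revisits the family $\left(a,2a\right)$ that solved $x*2=1$. The helpers `mul`, `equiv` and `inverse` are hypothetical names used only for this illustration.

```python
# Hypothetical helpers: a rational is represented by an integer pair (a, b), b != 0.

def mul(x, y):
    (a, b), (c, d) = x, y
    return (a * c, b * d)

def equiv(x, y):
    # (a, b) ~ (c, d) if and only if ad = cb
    (a, b), (c, d) = x, y
    return a * d == c * b

def inverse(x):
    """Multiplicative inverse (b, a) of (a, b); requires a != 0 and b != 0."""
    a, b = x
    if a == 0 or b == 0:
        raise ValueError("only non-zero rationals have a multiplicative inverse")
    return (b, a)

x = (3, -7)
print(inverse(x))                         # (-7, 3)
print(mul(x, inverse(x)))                 # (-21, -21)
print(equiv(mul(x, inverse(x)), (1, 1)))  # True

# Every member of the family (a, 2a) with a != 0 represents the inverse of 2 = (2, 1).
print(all(equiv((a, 2 * a), inverse((2, 1))) for a in (1, 2, 3, -5)))  # True
```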

A similar result holds as for proposition 82{reference-type="ref" reference="prop:MultiplicativeInverseOfIntegerTimesInverseIsOriginalNumber"}

::: {#prop:MultiplicativeInverseOfRationalTimesInverseIsOriginalNumber .proposition} Proposition 83. Multiplicative inverse of a rational number times its multiplicative inverse is the original number

Let x\in\mathbb{Q} with x=\left(a,b\right) and a,b\in\mathbb{Z} so that a\neq 0 and b\neq 0. Let x^{-1} denote the multiplicative inverse of x. The following result holds.

$$\begin{equation} xx^{-1}x = x \end{equation}$$

Proof:

By definition of a multiplicative inverse we have that

$$\begin{equation*} x*x^{-1}=1 \end{equation*}$$

Hence

$$\begin{equation*} x*x^{-1}*x=1*x=x \end{equation*}$$

As required. $\qed$ :::

We now have a solid grasp of undoing multiplication in the rational numbers. In fact we are now in a position to define the operation of division. Thanks to the work we have just done, and our original motivation for defining the rational numbers in the first place, there is little left to do: we use the idea of multiplicative inverses!

::: definition Definition 133. Division

Let a,b\in\mathbb{Z} so that b\neq 0. We define the division of a by b, denoted \displaystyle\frac{a}{b} by

$$\begin{equation*} \frac{a}{b}=a*b^{-1}=\left(a,1\right)\left(1,b\right)=\left(a,b\right) \end{equation*}$$ :::

We can extend the notion of division even further by considering $a,b\in\mathbb{Q}$ rather than $\mathbb{Z}$. At first it appears we have a problem: we defined the rationals using integers and division in terms of integers, so how could we possibly assign any meaning to an expression like $\displaystyle\frac{1}{\frac{1}{2}}$?

Consider for example the following

$$\begin{equation*} \frac{1}{\frac{1}{2}}*\frac{1}{2} \end{equation*}$$

If we suppose that the rule for multiplication that we defined extends to this situation then we get

$$\begin{equation*} \frac{1}{\frac{1}{2}}*\frac{1}{2}=\frac{1*1}{\frac{1}{2}*2}=\frac{1}{1}=1 \end{equation*}$$

In the context of the work we have just done we have that $\displaystyle \frac{1}{\frac{1}{2}}$ is a multiplicative inverse of $\displaystyle\frac{1}{2}$. However we know that $\displaystyle \frac{1}{2}$ has a multiplicative inverse of $2$. Does this mean that $\displaystyle \frac{1}{\frac{1}{2}}=2$? To answer this we need a deeper analysis of expressions of the form $\displaystyle \frac{1}{\frac{1}{a}}$.

We know from before that $\displaystyle\frac{1}{a}=a^{-1}$ for some non-zero $a\in\mathbb{Z}$. Hence we have that by definition $a^{-1}\in\mathbb{Q}$. Hence we are considering the expression

$$\begin{equation*} \frac{1}{\frac{1}{a}}=\frac{1}{a^{-1}} \end{equation*}$$

Therefore we know from the definition of the multiplicative inverse of a rational number that there is some y\in\mathbb{Q} so that

$$\begin{equation*} \frac{1}{a^{-1}}*y=1 \end{equation*}$$

By the definition we also know what $y$ must be, namely $\displaystyle y=\frac{a^{-1}}{1}=a^{-1}=\frac{1}{a}$. Hence we can justify our "temporary" assumption of extending the multiplication rule. We hence make the following deduction.

::: {#prop:OneDividedByMultiplicativeInverseOfInteger .proposition} Proposition 84. One divided by multiplicative inverse of an integer is the integer itself

Let $x\in\mathbb{Q}$ so that $\displaystyle x=\frac{1}{\frac{1}{a}}$ for some $a\in\mathbb{Z}$ with $a\neq 0$. We have that

$$\begin{equation*} \frac{1}{\frac{1}{a}}=a \end{equation*}$$

Proof:

Let x\in\mathbb{Q} be such that \displaystyle x=\frac{1}{\frac{1}{a}} for some non-zero a\in\mathbb{Z}. We know by definition that

$$\begin{equation*} \frac{1}{a}=a^{-1} \end{equation*}$$

where $a^{-1}\in\mathbb{Q}$ and therefore $\displaystyle x = \frac{1}{a^{-1}}$. Moreover this is still a rational number by definition and so there exists some rational $y$ so that

$$\begin{equation*} x*y=1 \end{equation*}$$

where $\displaystyle y=\frac{a^{-1}}{1}=a^{-1}$. It follows that $\displaystyle y=\frac{1}{a}$. Again by definition there is some $z\in\mathbb{Q}$ so that $y*z=1$ where $\displaystyle z=\frac{a}{1}=a$, that is to say $z$ is a multiplicative inverse of $y$.

We therefore have that

$$\begin{equation*} x*y=1=y*z \end{equation*}$$

Hence by theorem 31{reference-type="ref" reference="thm:CancellationLawsForRationals"} we have that x=z which is to say

$$\begin{equation*} \frac{1}{\frac{1}{a}}=a \end{equation*}$$

As required. $\qed$ :::

We hence get an immediate corollary

::: corollary Corollary 4. One divided by rational number

Let $x\in\mathbb{Q}$ be such that $\displaystyle x=\frac{a}{b}$ with $a\neq 0$ and $b\neq 0$. We have that

$$\begin{equation*} \frac{1}{x}=\frac{1}{\frac{a}{b}}=\frac{b}{a} \end{equation*}$$

Proof:

We have

$$\begin{equation*} \frac{1}{x}=\frac{1}{\frac{a}{b}}=\frac{1}{a*\frac{1}{b}}=\frac{1}{a*b^{-1}}=\frac{1}{a}*\frac{1}{b^{-1}}=\frac{1}{a}*b=\frac{b}{a} \end{equation*}$$

As required. $\qed$ :::

Extending the summation and product notations to the rationals

Summation and product notation have been defined on the naturals as well as the integers. We can extend the notation to include the rational numbers.

Let q\in\mathbb{Q}^{n+m+1} be an ordered n+m+1 tuple of rational numbers where

$$\begin{equation*} q=\left(q_{-m},q_{-m+1},\dots,q_{-1},q_0,q_1,\dots, q_n\right) \end{equation*}$$

Define \mathbb{Z}_m^n=\left\{-m,-m+1,-m+2,\dots,-1,0,1,\dots,n-1,n\right\} to be a set of indices and define f:\mathbb{Z}_m^n\rightarrow\mathbb{Q} by

$$\begin{align*} f:\mathbb{Z}_m^n&\rightarrow \mathbb{Q}\\ i&\mapsto f\left(i\right)=q_i \end{align*}$$

::: definition Definition 134. Summation notation for rational numbers

Let $q\in\mathbb{Q}^{n+m+1}$ be an ordered $n+m+1$ tuple of rational numbers where $q=\left(q_{-m},q_{-m+1},\dots,q_{-1},q_0,q_1,\dots, q_n\right)$. Define $\mathbb{Z}_m^n$ by $\mathbb{Z}_m^n=\left\{-m,-m+1,-m+2,\dots,-1,0,1,\dots,n-1,n\right\}$. Let $f:\mathbb{Z}_m^n\rightarrow\mathbb{Q}$ be defined by

$$\begin{align*} f:\mathbb{Z}_m^n&\rightarrow\mathbb{Q}\\ i&\mapsto f\left(i\right)=q_i \end{align*}$$

We define the summation notation for the rational numbers by

$$\begin{equation*} \sum_{i=-m}^n f\left(i\right)=f\left(-m\right)+f\left(-m+1\right)+\dots+f\left(-1\right)+f\left(0\right)+f\left(1\right)+\dots+f\left(n\right) \end{equation*}$$

Alternatively this is written

$$\begin{equation*} \sum_{i=-m}^n q_i = q_{-m}+q_{-m+1}+\dots+q_{-1}+q_0+q_1+\dots+q_n \end{equation*}$$

We have that $i$ is called the index of summation and that $i=-m$ is the starting index of the summation, and $n$ the ending index of the summation. If the tuple $q$ is empty then we define the summation to be $0$ and call the summation an empty sum.

We can also define the summation of some subset of $\mathbb{Z}_m^n$, which allows for starting a summation at some starting point other than $i=-m$. Let $T\subseteq\mathbb{Z}_m^n$. We define the summation over the set $T$ by

$$\begin{equation*} \sum_{i\in T} q_i \end{equation*}$$

If we have a mapping $g:\mathbb{Q}\rightarrow\mathbb{Q}$ we can define a summation over $g$ by

$$\begin{equation*} \sum_{i\in T} g\left(q_i\right) \end{equation*}$$

Finally we can define a summation over a predicate $P\left(i\right)$ for $i\in T$ by

$$\begin{equation*} \sum_{P\left(i\right)}g\left(q_i\right) \end{equation*}$$

where we take the sum of the $g\left(q_i\right)$ for the $i$ that satisfy the predicate $P$. We note that if we have $k>n$ for some $k\in\mathbb{N}$ then the sum is empty, so

$$\begin{equation*} \sum_{i=k}^n q_i=0 \end{equation*}$$ :::
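The different forms of the summation notation can be illustrated with exact rational arithmetic. The sketch below uses Python's standard `fractions.Fraction` type. The tuple values $q_i=\frac{1}{i+3}$, the helper `summation` and the chosen predicate are assumptions made purely for illustration.

```python
from fractions import Fraction

# Illustrative data (an assumption): m = 2, n = 3 and q_i = 1/(i + 3) for i = -2, ..., 3.
m, n = 2, 3
q = {i: Fraction(1, i + 3) for i in range(-m, n + 1)}

def summation(indices, g=lambda v: v):
    """Sum g(q_i) over the given indices; an empty index collection gives 0."""
    return sum((g(q[i]) for i in indices), Fraction(0))

full = summation(range(-m, n + 1))                    # sum over all of Z_m^n
even = summation(i for i in q if i % 2 == 0)          # sum over the predicate "i is even"
squares = summation(range(0, n + 1), lambda v: v * v) # sum of g(q_i) with g(v) = v^2, T = {0, ..., n}
empty = summation(range(5, n + 1))                    # starting index k = 5 > n: empty sum

print(full)   # 49/20
print(even)   # 23/15
print(empty)  # 0
```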

The usual properties shown for summations of integers also extend to the rational number version.

::: proposition Proposition 85. Properties of summation notation

Let n,m\in\mathbb{Z} such that m<n. Let s,t\in\mathbb{Q}^{n+m+1} and let c\in\mathbb{Q}.

Let $a,b\in\mathbb{Z}$ with $m<a<b<n$. Define $A=\left\{a,a+1,\dots,b-1,b\right\}$ and define

$$\begin{equation*} B=\mathbb{Z}_m^n\setminus A=\left\{-m,-m+1,\dots,a-1,b+1,\dots,n-1,n\right\} \end{equation*}$$

so that $A\cup B =\mathbb{Z}_m^n$. Let $k\in \mathbb{Z}$ be the starting index of the summation such that $k<n$, and let $d\in\mathbb{Z}$ with $k\leq d<n$. We have that the following properties hold.

  1. $\displaystyle \sum_{i=-m}^n s_i = \sum_{i\in A} s_i +\sum_{i\in B} s_i =\sum_{i=-m}^{-1} s_i + \sum_{i=0}^{n}s_i$

  2. $\displaystyle \sum_{i=k}^n s_i = \sum_{i=k}^d s_i + \sum_{i=d+1}^n s_i$

  3. \displaystyle\sum_{i=k}^n c*s_i = c*\sum_{i=k}^n s_i for some $c\in\mathbb{Q}$

  4. $\displaystyle\sum_{i=k}^n s_i+t_i = \sum_{i=k}^n s_i + \sum_{i=k}^n t_i$

Proof:

  1. \displaystyle \sum_{i=-m}^n s_i = \sum_{i\in A} s_i +\sum_{i\in B} s_i =\sum_{i=-m}^{-1} s_i + \sum_{i=0}^{n}s_i:

    The proof is the same as for the integer case. We give it again for completeness

    We have that

    $$\begin{align*} \sum_{i=-m}^n s_i&=s_{-m}+s_{-m+1}+s_{-m+2}+\dots+s_{-1}+s_0+s_1+\dots+s_{n-1}+s_{n}\\ &=\left(s_{-m}+s_{-m+1}+s_{-m+2}+\dots+s_{-1}\right)+\left(s_0+s_1+\dots+s_{n-1}+s_{n}\right)\\ &=\sum_{i=-m}^{-1} s_i + \sum_{i=0}^n s_i \end{align*}$$

    Additionally note that

    $$\begin{align*} \sum_{i=-m}^n s_i&=s_{-m}+s_{-m+1}+s_{-m+2}+\dots+s_{-1}+s_0+s_1+\dots+s_{n-1}+s_{n}\\ &= \left(s_{-m}+s_{-m+1}+s_{-m+2}+\dots+s_{a-2}+s_{a-1}\right)+\left(s_a+s_{a+1}+\dots+s_{b-1}+s_b\right)\\ &+\left(s_{b+1}+s_{b+2}+\dots+s_{n-1}+s_n\right)\\ &= \left(s_{-m}+s_{-m+1}+s_{-m+2}+\dots+s_{a-2}+s_{a-1}\right) + \left(s_{b+1}+s_{b+2}+\dots+s_{n-1}+s_n\right)\\ &+ \left(s_a+s_{a+1}+\dots+s_{b-1}+s_b\right)\\ &= \sum_{i\in B} s_i + \sum_{i\in A} s_i = \sum_{i\in A} s_i + \sum_{i\in B} s_i \end{align*}$$

  2. \displaystyle \sum_{i=k}^n s_i = \sum_{i=k}^d s_i + \sum_{i=d+1}^n s_i:

    The proof is similar to part 1, replacing -m by k.

  3. \displaystyle\sum_{i=k}^n c*s_i = c*\sum_{i=k}^n s_i for some $c\in\mathbb{Q}$

    We have by definition that

    $$\begin{equation*} \sum_{i=k}^n c*s_i=c*s_k+c*s_{k+1}+c*s_{k+2}+\dots+c*s_n \end{equation*}$$

    By multiplication distributing over addition we have

    $$\begin{equation*} \sum_{i=k}^n c*s_i=c*s_k+c*s_{k+1}+c*s_{k+2}+\dots+c*s_n=c*\left(s_k+s_{k+1}+\dots+s_n\right)=c*\sum_{i=k}^n s_i \end{equation*}$$

  4. \displaystyle\sum_{i=k}^n s_i+t_i = \sum_{i=k}^n s_i + \sum_{i=k}^n t_i:

    This follows by the definition. We have

    $$\begin{align*} \sum_{i=k}^n s_i+t_i&= \left(s_k+t_k\right)+\left(s_{k+1}+t_{k+1}\right)+\dots\\ &+\left(s_{-1}+t_{-1}\right)+\left(s_{0}+t_{0}\right)+\left(s_{1}+t_{1}\right)+\dots+\left(s_{n-1}+t_{n-1}\right)+\left(s_{n}+t_{n}\right)\\ &=\left(s_k+s_{k+1}+\dots+s_{-1}+s_0+s_1+\dots+s_{n-1}+s_n\right)\\ &+\left(t_k+t_{k+1}+\dots+t_{-1}+t_0+t_1+\dots+t_{n-1}+t_n\right)\\ &= \sum_{i=k}^n s_i + \sum_{i=k}^n t_i \end{align*}$$

$\qed$ :::
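The splitting, scaling and additivity properties can be spot-checked with exact arithmetic. The following sketch uses `fractions.Fraction`. The tuples `s` and `t`, the constant `c`, the indices `k` and `d`, and the helper `total` are all assumptions chosen for illustration, and a check on one example is of course not a proof.

```python
from fractions import Fraction

# Illustrative data (assumptions): indices -2, ..., 3 with s_i = i/2 and t_i = 1/(i + 3).
m, n = 2, 3
idx = range(-m, n + 1)
s = {i: Fraction(i, 2) for i in idx}
t = {i: Fraction(1, i + 3) for i in idx}
c = Fraction(3, 4)

def total(seq, lo, hi):
    """Sum seq[i] for i = lo, ..., hi; the sum is 0 when lo > hi."""
    return sum((seq[i] for i in range(lo, hi + 1)), Fraction(0))

k, d = -1, 1  # starting index k and splitting index d with k <= d < n

split_ok = total(s, k, n) == total(s, k, d) + total(s, d + 1, n)
scale_ok = sum((c * s[i] for i in range(k, n + 1)), Fraction(0)) == c * total(s, k, n)
add_ok = (sum((s[i] + t[i] for i in range(k, n + 1)), Fraction(0))
          == total(s, k, n) + total(t, k, n))

print(split_ok, scale_ok, add_ok)  # True True True
```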

We make a similar definition for product notation.

::: definition Definition 135. Product notation for the rational numbers

Let $q\in\mathbb{Q}^{n+m+1}$ be an ordered $n+m+1$ tuple of rational numbers where $q=\left(q_{-m},q_{-m+1},\dots,q_{-1},q_0,q_1,\dots, q_n\right)$. Define $\mathbb{Z}_m^n$ by $\mathbb{Z}_m^n=\left\{-m,-m+1,-m+2,\dots,-1,0,1,\dots,n-1,n\right\}$. Let $f:\mathbb{Z}_m^n\rightarrow\mathbb{Q}$ be defined by

$$\begin{align*} f:\mathbb{Z}_m^n&\rightarrow\mathbb{Q}\\ i&\mapsto f\left(i\right)=q_i \end{align*}$$

We define the product notation for the rational numbers by

$$\begin{equation*} \prod_{i=-m}^n f\left(i\right)=f\left(-m\right)*f\left(-m+1\right)*\dots*f\left(-1\right)*f\left(0\right)*f\left(1\right)*\dots*f\left(n\right) \end{equation*}$$

Alternatively this is written

$$\begin{equation*} \prod_{i=-m}^n q_i = q_{-m}*q_{-m+1}*\dots*q_{-1}*q_0*q_1*\dots*q_n \end{equation*}$$

We have that $i$ is called the index of the product and that $i=-m$ is the starting index of the product, and $n$ the ending index of the product. If the tuple $q$ is empty then we define the product to be $1$ and call the product an empty product.

We can also define the product of some subset of $\mathbb{Z}_m^n$, which allows for starting a product at some starting point other than $i=-m$. Let $T\subseteq\mathbb{Z}_m^n$. We define the product over the set $T$ by

$$\begin{equation*} \prod_{i\in T} q_i \end{equation*}$$

If we have a mapping $g:\mathbb{Q}\rightarrow\mathbb{Q}$ we can define a product over $g$ by

$$\begin{equation*} \prod_{i\in T} g\left(q_i\right) \end{equation*}$$

Finally we can define a product over a predicate $P\left(i\right)$ for $i\in T$ by

$$\begin{equation*} \prod_{P\left(i\right)}g\left(q_i\right) \end{equation*}$$

where we take the product of the $g\left(q_i\right)$ for the $i$ that satisfy the predicate $P$. We note that if we have $k>n$ for some $k\in\mathbb{N}$ then the product is empty, so

$$\begin{equation*} \prod_{i=k}^n q_i=1 \end{equation*}$$ :::

::: proposition Proposition 86. Properties of product notation

Let n,m\in\mathbb{Z} such that m<n. Let s,t\in\mathbb{Q}^{n+m+1} and let c\in\mathbb{Z}. Let a,b\in\mathbb{Z} so that m<a<b<n. Define A=\mathbb{Z}_a^b and define

$$\begin{equation*} B=\mathbb{Z}_m^n\setminus A=\left\{-m,-m+1,\dots,a-1,b+1,\dots,n-1,n\right\} \end{equation*}$$

so that $A\cup B =\mathbb{Z}_m^n$. Let $k\in \mathbb{Z}$ be the lower index of the product.

We have that the following properties hold.

  1. $\displaystyle \prod_{i=-m}^n s_i = \prod_{i\in A} s_i * \prod_{i\in B} s_i = \prod_{i=-m}^{-1} s_i * \prod_{i=0}^n s_i$

  2. $\displaystyle \prod_{i=k}^n s_i = \prod_{i=k}^m s_i * \prod_{i=m+1}^n s_i$

  3. $\displaystyle\prod_{i=k}^n s_i*t_i = \prod_{i=k}^n s_i * \prod_{i=k}^n t_i$

Proof:

  1. \displaystyle \prod_{i=-m}^n s_i = \prod_{i\in A} s_i *\prod_{i\in B} s_i = \prod_{i=-m}^{-1} s_i * \prod_{i=0}^n s_i:

    The proof is the same as for the integer case.

    We have that

    $$\begin{align*} \prod_{i=-m}^n s_i &= s_{-m}*s_{-m+1}*s_{-m+2}*\dots*s_{-1}*s_0*s_1*\dots*s_{n-1}*s_n\\ &=\left(s_{-m}*s_{-m+1}*s_{-m+2}*\dots*s_{-1}\right)*\left(s_0*s_1*\dots*s_{n-1}*s_n\right)\\ &=\prod_{i=-m}^{-1}s_i*\prod_{i=0}^n s_i \end{align*}$$

    Likewise we have

    $$\begin{align*} \prod_{i=-m}^n s_i &= s_{-m}*s_{-m+1}*s_{-m+2}*\dots*s_{-1}*s_0*s_1*\dots*s_{n-1}*s_n\\ &= \left(s_{-m}*s_{-m+1}*s_{-m+2}*\dots*s_{a-2}*s_{a-1}\right)*\left(s_a*s_{a+1}*\dots*s_{b-1}*s_b\right)\\ &*\left(s_{b+1}*s_{b+2}*\dots*s_{n-1}*s_n\right)\\ &= \left(s_{-m}*s_{-m+1}*s_{-m+2}*\dots*s_{a-2}*s_{a-1}\right) * \left(s_{b+1}*s_{b+2}*\dots*s_{n-1}*s_n\right)\\ &*\left(s_a*s_{a+1}*\dots*s_{b-1}*s_b\right)\\ &=\prod_{i\in B}s_i * \prod_{i\in A} s_i = \prod_{i\in A} s_i * \prod_{i\in B} s_i \end{align*}$$

  2. \displaystyle \prod_{i=k}^n s_i = \prod_{i=k}^m s_i * \prod_{i=m+1}^n s_i:

    The proof is similar to part 1. We replace -m with k.

  3. $\displaystyle\prod_{i=k}^n s_i*t_i = \prod_{i=k}^n s_i * \prod_{i=k}^n t_i$:

    Observe that

    $$\begin{align*} \prod_{i=k}^n s_i*t_i&=s_{k}*t_{k}*s_{k+1}*t_{k+1}*s_{k+2}*t_{k+2}*\dots*s_{-1}*t_{-1}*s_{0}*t_{0}*s_{1}*t_{1}*\dots*s_{n-1}*t_{n-1}*s_{n}*t_{n}\\ &=\left(s_{k}*s_{k+1}*s_{k+2}*\dots*s_{-1}*s_{0}*s_{1}*\dots*s_{n-1}*s_{n}\right)\\ &*\left(t_{k}*t_{k+1}*t_{k+2}*\dots*t_{-1}*t_{0}*t_{1}*\dots*t_{n-1}*t_{n}\right)\\ &=\prod_{i=k}^n s_i * \prod_{i=k}^n t_i \end{align*}$$

$\qed$ :::
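The product properties can be spot-checked in the same way as the summation properties. The sketch below uses `fractions.Fraction` together with `math.prod` from the Python standard library (Python 3.8 or later). The tuples `s` and `t`, the index `k` and the helper `product` are assumptions made only for this illustration.

```python
from fractions import Fraction
from math import prod

# Illustrative data (assumptions): indices -2, ..., 3 with s_i = (i + 4)/2 and t_i = 1/(i + 3).
m, n = 2, 3
idx = range(-m, n + 1)
s = {i: Fraction(i + 4, 2) for i in idx}
t = {i: Fraction(1, i + 3) for i in idx}

def product(seq, lo, hi):
    """Multiply seq[i] for i = lo, ..., hi; the product is 1 when lo > hi."""
    return prod((seq[i] for i in range(lo, hi + 1)), start=Fraction(1))

k = -1  # lower index of the product

split_ok = product(s, -m, n) == product(s, -m, -1) * product(s, 0, n)
pair_ok = (prod((s[i] * t[i] for i in range(k, n + 1)), start=Fraction(1))
           == product(s, k, n) * product(t, k, n))
empty = product(s, 5, n)  # lower index 5 > n: empty product

print(split_ok, pair_ok)  # True True
print(empty)              # 1
```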

We can now extend the result of proposition 39{reference-type="ref" reference="prop:NaturalsHaveNoZeroDivisors"} and proposition 69{reference-type="ref" reference="prop:IntegersHaveNoZeroDivisors"}. That is, if $ab=0$ for $a,b\in\mathbb{Q}$ then at least one of $a$ or $b$ is zero.

::: {#prop:RationalsHaveNoZeroDivisors .proposition} Proposition 87. Product of two rational numbers being zero implies one of the numbers is zero

Let x,y\in\mathbb{Q}. If xy=0 then at least one of x or y is zero.

Proof:

Let x,y\in\mathbb{Q} with xy=0. If x=y=0 then the result is trivial. So suppose that x=\left(a,b\right) and y=\left(c,d\right), moreover suppose y\neq 0, so that c\neq 0. By definition of rational number multiplication we have that

$$\begin{equation*} xy=\left(a,b\right)\left(c,d\right)=\left(ac,bd\right)=\left(0,1\right) \end{equation*}$$

As \left(ac,bd\right) and \left(0,1\right) represent the same rational number we have ac\cdot 1=0\cdot bd. Hence we must have ac=0. Therefore by proposition 69{reference-type="ref" reference="prop:IntegersHaveNoZeroDivisors"} we must have that either a=0 or c=0 or both. As c\neq 0 we have a=0 and so x=0. A similar argument assuming x\neq 0 shows that y=0. The result is shown. $\qed$ :::

Extending the rules for inequalities to the rationals

For the natural numbers and the integers, we have a theory of inequalities. These results extend to the rationals. Additionally, as rational numbers represent the division of integers there are some additional properties that now hold.

Before we do so we have an additional consideration. To extend the results fully we need to consider negative rational numbers as well. We follow a similar layout to the section on integer inequalities.

::: {#prop:MultiplicationByNegativeOneFlipsInequalitySignRational .proposition} Proposition 88. Multiplication by -1 changes the inequality sign

Let x,y\in\mathbb{Q}. We have the following

  1. If x<y then $-x>-y$

  2. If x\leq y then $-x\geq -y$

  3. If x>y then $-x<-y$

  4. If x\geq y then $-x\leq-y$

Proof:

Let x,y\in\mathbb{Q} so that \displaystyle x=\frac{a}{b} and \displaystyle y=\frac{c}{d} where b\neq 0 and d\neq 0.

  1. If x<y then -x>-y:

    Let x,y\in\mathbb{Q} so that x<y. By definition of < for the rationals we have that

    $$\begin{equation*} x<y\iff ad<bc \end{equation*}$$

    Applying proposition 70{reference-type="ref" reference="prop:MultiplicationByNegativeOneFlipsInequalitySign"} we have

    $$\begin{equation*} ad<bc\Rightarrow -ad>-bc \end{equation*}$$

    As \displaystyle -x=\frac{-a}{b} and \displaystyle -y=\frac{-c}{d}, this says \left(-a\right)d>b\left(-c\right). Hence -x>-y.

  2. If x\leq y then -x\geq -y:

    Let x,y\in\mathbb{Q} so that x\leq y. Applying the definition of \leq for the rationals gives

    $$\begin{equation*} x\leq y\iff ad\leq bc \end{equation*}$$

    Proposition 70{reference-type="ref" reference="prop:MultiplicationByNegativeOneFlipsInequalitySign"} gives

    $$\begin{equation*} ad\leq bc\Rightarrow -ad\geq -bc \end{equation*}$$

    Hence -x\geq -y.

  3. If x>y then -x<-y:

    Let x,y\in\mathbb{Q} so that x>y. By definition of > for the rationals we have that

    $$\begin{equation*} x>y\iff ad>bc \end{equation*}$$

    Proposition 70{reference-type="ref" reference="prop:MultiplicationByNegativeOneFlipsInequalitySign"} shows us that

    $$\begin{equation*} ad>bc\Rightarrow -ad<-bc \end{equation*}$$

    Hence -x<-y.

  4. If x\geq y then -x\leq-y:

    Let x,y\in\mathbb{Q} so that x\geq y. By definition of \geq for the rationals, we have that

    $$\begin{equation*} x\geq y\iff ad\geq bc \end{equation*}$$

    By proposition 70{reference-type="ref" reference="prop:MultiplicationByNegativeOneFlipsInequalitySign"} we have

    $$\begin{equation*} ad\geq bc\Rightarrow -ad\leq-bc \end{equation*}$$

    Hence -x\leq -y.

The result is shown. $\qed$ :::
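As a small illustration of part 1, we can compare two fractions by cross-multiplication and see the inequality reverse after negation. This is a sketch only: the function name `less_than` is hypothetical, and the sketch assumes that both denominators are positive so that the cross-multiplied comparison has the expected direction.

```python
from fractions import Fraction


def less_than(a, b, c, d):
    """Compare a/b < c/d by cross-multiplication, assuming b > 0 and d > 0."""
    if b <= 0 or d <= 0:
        raise ValueError("this sketch assumes positive denominators")
    return a * d < b * c


# x = 1/3 and y = 1/2, so x < y.
x_less_than_y = less_than(1, 3, 1, 2)

# -x = -1/3 and -y = -1/2; the claim -x > -y is the same as -y < -x.
negated_flips = less_than(-1, 2, -1, 3)

# The same comparison using the standard library's exact rationals.
library_agrees = Fraction(1, 3) < Fraction(1, 2) and -Fraction(1, 3) > -Fraction(1, 2)
```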

There is another lemma that will be useful for extending the rules of inequalities to the rationals.

::: {#lem:LargerRatMinusSmallIsPositive .lemma} Lemma 8. Strictly larger rational minus a smaller is positive

Let x,y\in\mathbb{Q}. We have that $x<y\iff y-x>0$

Proof:

\left(\Rightarrow\right): Let x,y\in\mathbb{Q}, then \displaystyle x=\frac{a}{b} and \displaystyle y=\frac{c}{d} for some a,b,c,d\in\mathbb{Z} and b\neq 0 and d\neq 0. As x<y then ad<bc. Now y-x is given by

$$\begin{align*} y-x&=y+\left(-1\cdot x\right)\\ &=\left(c,d\right)+\left(-a,b\right)\\ &=\left(cb-ad,bd\right) \end{align*}$$

Now, saying y-x is positive is the same as y-x>0. By definition of greater than, and the fact that 0\in\left[\left(0,1\right)\right], we would have that

$$\begin{equation*} \left(cb-ad\right)\cdot 1>0\cdot\left(bd\right) \Rightarrow bc-ad>0 \end{equation*}$$

Which is true as ad<bc. Hence $y-x>0$

\left(\Leftarrow\right): Suppose that y-x>0 where x,y\in\mathbb{Q}, with \displaystyle x=\frac{a}{b} and \displaystyle y=\frac{c}{d} for some a,b,c,d\in\mathbb{Z} and b\neq 0 and d\neq 0. We have that

$$\begin{equation*} y-x = \left(cb-ad,bd\right) \end{equation*}$$

Moreover, y-x>0 implies that

$$\begin{equation*} \left(cb-ad\right)\cdot 1>0\cdot\left(bd\right) \Rightarrow bc-ad>0 \end{equation*}$$

This is to say bc>ad, which by part 1 of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} is the same as ad<bc which is equivalent to saying that x<y.

As required. $\qed$ :::
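The equivalence in lemma 8 can be spot-checked on a finite grid of rationals. The grid `samples` and the helper `lemma_holds` below are hypothetical names used only for this sketch, and a finite check does not replace the proof above.

```python
from fractions import Fraction

# An arbitrary finite grid of rationals p/q with small numerators and denominators.
samples = [Fraction(p, q) for p in range(-6, 7) for q in range(1, 5)]


def lemma_holds(x, y):
    """True when the statements x < y and y - x > 0 are both true or both false."""
    return (x < y) == (y - x > 0)


all_pairs_agree = all(lemma_holds(x, y) for x in samples for y in samples)
```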

::: {#cor:LargerOrEqualRatMinusSmallIsPositive .corollary} Corollary 5. Larger or equal rational minus a smaller is positive

Let x,y\in\mathbb{Q}. We have that x\leq y\iff y-x>0 or y=x.

Proof:

\left(\Rightarrow\right): Suppose x\leq y. If x<y then lemma 8{reference-type="ref" reference="lem:LargerRatMinusSmallIsPositive"} applies. Otherwise x=y.

\left(\Leftarrow\right): Suppose that one of y-x> 0 or y=x holds. In the first case, y-x>0 implies x<y by lemma 8{reference-type="ref" reference="lem:LargerRatMinusSmallIsPositive"} and clearly we will have x\leq y. If x=y then we clearly also have x\leq y by definition. $\qed$ :::

We can now extend the properties of inequalities to the rationals.

::: {#prop:InequalityRationalNumbers .proposition} Proposition 89. Properties of inequalities for the rationals

Let x,y,z\in\mathbb{Q}. We have the following properties for inequalities

  1. x<y is the same as $y>x$

  2. x\leq y is the same as $y\geq x$

  3. If x<y and y<z then $x<z$

  4. If x\leq y and y<z then $x<z$

  5. If x<y and y\leq z then $x<z$

  6. If x\leq y and y\leq z then $x\leq z$

  7. If x>y and y>z then $x>z$

  8. If x\geq y and y>z then $x>z$

  9. If x>y and y\geq z then $x>z$

  10. If x\geq y and y\geq z then $x\geq z$

  11. If x<y then $x+z<y+z$

  12. If x\leq y then $x+z\leq y+z$

  13. If x>y then $x+z>y+z$

  14. If x\geq y then $x+z\geq y+z$

  15. If x<y and z>0 then $xz<yz$

  16. If x<y and z< 0 then $xz>yz$

  17. If x\leq y and z\geq 0 then $xz\leq yz$

  18. If x\leq y and z<0 then $xz\geq yz$

  19. If x>y and z>0 then $xz>yz$

  20. If x>y and z< 0 then $xz<yz$

  21. If x\geq y and z\geq 0 then $xz\geq yz$

  22. If x\geq y and z<0 then $xz\leq yz$

  23. If x<y and z>0 then $\displaystyle\frac{x}{z}<\frac{y}{z}$

  24. If x\leq y and z>0 then $\displaystyle\frac{x}{z}\leq\frac{y}{z}$

  25. If x>y and z>0 then $\displaystyle\frac{x}{z}>\frac{y}{z}$

  26. If x\geq y and z>0 then $\displaystyle\frac{x}{z}\geq\frac{y}{z}$

  27. If x<y and z<0 then $\displaystyle\frac{x}{z}>\frac{y}{z}$

  28. If x\leq y and z<0 then $\displaystyle\frac{x}{z}\geq\frac{y}{z}$

  29. If x>y and z<0 then $\displaystyle\frac{x}{z}<\frac{y}{z}$

  30. If x\geq y and z<0 then $\displaystyle\frac{x}{z}\leq\frac{y}{z}$

  31. If x<y and x>0 and y>0 then $\displaystyle \frac{1}{x}>\frac{1}{y}$

  32. If x<y and x<0 and y<0 then $\displaystyle \frac{1}{x}>\frac{1}{y}$

  33. If x\leq y and x>0 and y>0 then $\displaystyle \frac{1}{x}\geq \frac{1}{y}$

  34. If x\leq y and x<0 and y<0 then $\displaystyle \frac{1}{x}\geq \frac{1}{y}$

  35. If x>y and x>0 and y>0 then $\displaystyle \frac{1}{x}<\frac{1}{y}$

  36. If x>y and x<0 and y<0 then $\displaystyle \frac{1}{x}<\frac{1}{y}$

  37. If x\geq y and x>0 and y>0 then $\displaystyle \frac{1}{x}\leq \frac{1}{y}$

  38. If x\geq y and x<0 and y<0 then $\displaystyle \frac{1}{x}\leq \frac{1}{y}$

Proof:

Let x,y,z\in\mathbb{Q}. Let \displaystyle x=\frac{a}{b}, \displaystyle y=\frac{c}{d}, \displaystyle z=\frac{e}{f} for a,b,c,d,e,f\in\mathbb{Z} and b\neq 0, d\neq 0, f\neq 0.

  1. x<y is the same as y>x:

    Suppose that x<y then by definition we have ad<bc. Applying part 1. of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} gives bc>ad and so y>x.

  2. x\leq y is the same as y\geq x:

    If x<y then part 1. applies. Otherwise we have x=y, so y=x and clearly y\geq x.

  3. If x<y and y<z then x<z:

    Suppose that x<y and y<z then y-x>0 and z-y>0 by lemma 8{reference-type="ref" reference="lem:LargerRatMinusSmallIsPositive"}. Now we have that

    $$\begin{equation*} \left(y-x\right)+\left(z-y\right)=z-x>0 \end{equation*}$$

    as y-x and z-y are both greater than 0. Hence, as z-x>0, the same lemma gives x<z.

  4. If x\leq y and y<z then x<z:

    Suppose that x\leq y and y<z. If x<y then the previous part applies, so suppose not. Then x=y and so x<z.

  5. If x<y and y\leq z then x<z:

    Suppose that x<y and y\leq z. By lemma 8{reference-type="ref" reference="lem:LargerRatMinusSmallIsPositive"} we have that y-x>0, likewise by corollary 5{reference-type="ref" reference="cor:LargerOrEqualRatMinusSmallIsPositive"} we have that y\leq z means either z-y>0 or y=z.

    If z-y>0 then the result is the same as part 3. So suppose y=z then clearly x<z.

  6. If x\leq y and y\leq z then x\leq z:

    If x\leq y and y\leq z there are three cases to consider. If x<y then part 5. gives x<z. If x=y and y<z then part 4. gives x<z. In both of these cases we have x\leq z. Finally, we have the case x=y and y=z so clearly x=z so that x\leq z.

  7. If x>y and y>z then x>z:

    By part 1. this is equivalent to y<x and z<y then z<x so part 3. applies.

  8. If x\geq y and y>z then x>z:

    Using parts 1. and 2. gives us the equivalent expression y\leq x and z<y then z<x and so part 4 applies.

  9. If x>y and y\geq z then x>z:

    As with the previous part, applying parts 1. and 2. gives the statement y<x and z\leq y then z<x so part 5. applies.

  10. If x\geq y and y\geq z then x\geq z:

    Using part 2. gives us y\leq x and z\leq y then z\leq x so part 6. applies.

  11. If x<y then x+z<y+z:

    Suppose that x<y then y-x>0 by lemma 8{reference-type="ref" reference="lem:LargerRatMinusSmallIsPositive"}. Observe that

    $$\begin{align*} y-x&=y+\left(z-z\right)-x\\ &=\left(y+z\right)+\left(-z-x\right)\\ &=\left(y+z\right)-\left(x+z\right)>0 \end{align*}$$

    So \left(y+z\right)-\left(x+z\right)>0 and so by the same lemma we conclude that x+z<y+z.

  12. If x\leq y then x+z\leq y+z:

    If x<y then the previous part applies. Otherwise x=y and clearly x+z=y+z and so x+z\leq y+z.

  13. If x>y then x+z>y+z:

    Applying part 1. and then part 11. gives the equivalent result y<x then y+z<x+z.

  14. If x\geq y then x+z\geq y+z:

    Applying part 2. and then part 12. gives the equivalent result y\leq x then y+z\leq x+z.

  15. If x<y and z>0 then xz<yz:

    Suppose x<y then y-x>0 by lemma 8{reference-type="ref" reference="lem:LargerRatMinusSmallIsPositive"}. As z>0 and y-x>0 their product is positive, that is z\left(y-x\right)>0. Hence, by distributivity,

    $$\begin{equation*} z\left(y-x\right)=zy-zx=yz-xz>0 \Rightarrow xz<yz \end{equation*}$$

    where the final step uses the same lemma again. Note that z=0 must be excluded, as then xz=yz=0.

  16. If x<y and z< 0 then xz>yz:

    Suppose x<y, as z<0 \Rightarrow -z>0, then applying part 15. with -z gives -xz<-yz. Finally by proposition 88{reference-type="ref" reference="prop:MultiplicationByNegativeOneFlipsInequalitySignRational"} part 1 yields xz>yz.

  17. If x\leq y and z\geq 0 then xz\leq yz:

    If z=0 then xz=0=yz and the result holds. So suppose z>0 and x\leq y; there are two cases to consider. If x<y then part 15. applies. Otherwise x=y and clearly xz=yz giving xz\leq yz.

  18. If x\leq y and z<0 then xz\geq yz:

    Likewise, if x\leq y there are two cases. The case x<y is covered by part 16. Otherwise x=y gives xz=yz and again we have xz\geq yz.

  19. If x>y and z>0 then xz>yz:

    We have x>y is the same as y<x and so x-y>0. As z>0 and x-y>0 we have that z\left(x-y\right)>0. Therefore, by distributivity, we have zx-zy=xz-yz>0 and so yz<xz which is the same as xz>yz by part 1.

  20. If x>y and z< 0 then xz<yz:

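    We have x>y. Additionally, z<0\Rightarrow -z>0 so applying part 19. gives -xz>-yz and so by part 3. of proposition 88{reference-type="ref" reference="prop:MultiplicationByNegativeOneFlipsInequalitySignRational"} we conclude xz<yz.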

  21. If x\geq y and z\geq 0 then xz\geq yz:

    If z=0 then xz=0=yz and the result holds. So suppose z>0; there are two cases to consider. If x>y then we apply part 19. Otherwise x=y and xz=yz so that xz\geq yz.

  22. If x\geq y and z<0 then xz\leq yz:

    Again there are two cases to consider. If x>y then the result holds by part 20. Otherwise x=y and so xz=yz to give the result xz\leq yz.

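  23. If x<y and z>0 then \displaystyle\frac{x}{z}<\frac{y}{z}:

    This follows by part 15. applied with \displaystyle\frac{1}{z} in place of z, as z>0 gives \displaystyle\frac{1}{z}>0.

  24. If x\leq y and z>0 then \displaystyle\frac{x}{z}\leq\frac{y}{z}:

    This follows by part 17. applied with \displaystyle\frac{1}{z} in place of z.

  25. If x>y and z>0 then \displaystyle\frac{x}{z}>\frac{y}{z}:

    This follows by part 19. applied with \displaystyle\frac{1}{z} in place of z.

  26. If x\geq y and z>0 then \displaystyle\frac{x}{z}\geq\frac{y}{z}:

    This follows by part 21. applied with \displaystyle\frac{1}{z} in place of z.

  27. If x<y and z<0 then \displaystyle\frac{x}{z}>\frac{y}{z}:

    This follows by part 16. applied with \displaystyle\frac{1}{z} in place of z, as z<0 gives \displaystyle\frac{1}{z}<0.

  28. If x\leq y and z<0 then \displaystyle\frac{x}{z}\geq\frac{y}{z}:

    This follows by part 18. applied with \displaystyle\frac{1}{z} in place of z.

  29. If x>y and z<0 then \displaystyle\frac{x}{z}<\frac{y}{z}:

    This follows by part 20. applied with \displaystyle\frac{1}{z} in place of z.

  30. If x\geq y and z<0 then \displaystyle\frac{x}{z}\leq\frac{y}{z}:

    This follows by part 22. applied with \displaystyle\frac{1}{z} in place of z.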

  31. If x<y and x>0 and y>0 then \displaystyle \frac{1}{x}>\frac{1}{y}:

    Suppose that x<y then ad<bc. Moreover as x>0 we have that either a>0 and b>0 or a<0 and b<0. Likewise as y>0 then either c>0 and d>0 or c<0 and d<0. Hence there are four cases to consider.

    1. a>0 and b>0 and c>0 and $d>0$

    2. a>0 and b>0 and c<0 and $d<0$

    3. a<0 and b<0 and c>0 and $d>0$

    4. a<0 and b<0 and c<0 and $d<0$

    1. a>0 and b>0 and c>0 and d>0:

      Observe that

      $$\begin{align*} ad&<bc\\ a^{-1}ad&<a^{-1}bc,\ \text{By part 15. as } a^{-1}>0\\ d&<a^{-1}bc,\ \text{As multiplication of an element by its inverse is } 1\\ dc^{-1}&<a^{-1}bcc^{-1},\ \text{By part 15. as } c^{-1}>0\\ dc^{-1}&<a^{-1}b,\ \text{As multiplication of an element by its inverse is } 1\\ \frac{d}{c}&<\frac{b}{a},\ \text{By the definition of an inverse element} \end{align*}$$

      Hence \displaystyle \frac{d}{c}<\frac{b}{a} which is equivalent to \displaystyle \frac{b}{a}>\frac{d}{c}, which is to say \displaystyle\frac{1}{x}>\frac{1}{y}.

    2. a>0 and b>0 and c<0 and d<0:

      As c<0 and d<0 we have -c>0 and -d>0. Moreover \left(c,d\right) and \left(-c,-d\right) represent the same rational number y, since c\left(-d\right)=d\left(-c\right). Replacing c and d by -c and -d therefore reduces this case to the first one, and we again conclude that \displaystyle\frac{1}{x}>\frac{1}{y}.

    3. a<0 and b<0 and c>0 and d>0:

      The argument is similar to the previous one, swapping the roles of a,b,c and d.

    4. a<0 and b<0 and c<0 and d<0:

      This is similar to the first part. We give the full argument. As a<0, b<0, c<0 and d<0 then ad>0 and bc>0 and ad<bc. Hence we can see that

      $$\begin{align*} ad&<bc\\ a^{-1}ad&>a^{-1}bc,\ \text{By part 16. as } a^{-1}<0\\ d&>a^{-1}bc\\ dc^{-1}&<a^{-1}bcc^{-1},\ \text{By part 20. as } c^{-1}<0\\ dc^{-1}&<a^{-1}b\\ \frac{d}{c}&<\frac{b}{a} \end{align*}$$

      Giving the result.

  32. If x<y and x<0 and y<0 then \displaystyle \frac{1}{x}>\frac{1}{y}:

    This is similar to the previous part. Suppose that x<y then ad<bc. Moreover as x<0 we have that either a>0 and b<0 or a<0 and b>0. Likewise as y<0 then either c>0 and d<0 or c<0 and d>0. Hence there are four cases to consider.

    1. a>0 and b<0 and c>0 and $d<0$

    2. a>0 and b<0 and c<0 and $d>0$

    3. a<0 and b>0 and c>0 and $d<0$

    4. a<0 and b>0 and c<0 and $d>0$

    1. a>0 and b<0 and c>0 and d<0:

      As a>0 and b<0 and c>0 and d<0 then we have that ad<0 and bc<0 and ad<bc. We have that

      $$\begin{align*} ad&<bc\\ a^{-1}ad&<a^{-1}bc\\ d&<a^{-1}bc\\ dc^{-1}&<a^{-1}bcc^{-1}\\ dc^{-1}&<a^{-1}b\\ \frac{d}{c}&<\frac{b}{a} \end{align*}$$

      Giving $\displaystyle \frac{1}{x}>\frac{1}{y}$

    2. a>0 and b<0 and c<0 and d>0:

      As a>0 and b<0 we have -a<0 and -b>0. Moreover \left(a,b\right) and \left(-a,-b\right) represent the same rational number x, since a\left(-b\right)=b\left(-a\right). Replacing a and b by -a and -b therefore reduces this case to the fourth one, treated below, giving $\displaystyle \frac{1}{x}>\frac{1}{y}$

    3. a<0 and b>0 and c>0 and d<0:

      This time we have a<0 and b>0 and c>0 and d<0. As c>0 and d<0 we have -c<0 and -d>0, and \left(c,d\right) and \left(-c,-d\right) represent the same rational number y, since c\left(-d\right)=d\left(-c\right). Replacing c and d by -c and -d therefore reduces this case to the fourth one, treated below, giving the result.

    4. a<0 and b>0 and c<0 and d>0:

      Finally, a<0 and b>0 and c<0 and d>0 which gives ad<0 and bc<0 with ad<bc. Once again we have that

      $$\begin{align*} ad&<bc\\ a^{-1}ad&>a^{-1}bc\\ d&>a^{-1}bc\\ dc^{-1}&<a^{-1}bcc^{-1}\\ dc^{-1}&<a^{-1}b\\ \frac{d}{c}&<\frac{b}{a} \end{align*}$$

      Which concludes this part of the proposition.

  33. If x\leq y and x>0 and y>0 then \displaystyle \frac{1}{x}\geq \frac{1}{y}:

    If x<y then we apply part 31. Otherwise x=y and so \displaystyle \frac{1}{x}= \frac{1}{y} hence the result.

  34. If x\leq y and x<0 and y<0 then $\displaystyle \frac{1}{x}\geq \frac{1}{y}$

    Likewise if x<y we apply part 32. Otherwise x=y and \displaystyle \frac{1}{x}= \frac{1}{y} so the result is clear.

  35. If x>y and x>0 and y>0 then \displaystyle \frac{1}{x}<\frac{1}{y}:

    Applying part 1. the equivalent statement is: if y<x and x>0 and y>0 then \displaystyle \frac{1}{y}>\frac{1}{x}, so part 31. applies.

  36. If x>y and x<0 and y<0 then \displaystyle \frac{1}{x}<\frac{1}{y}:

    Likewise by part 1. this is the same as: if y<x and x<0 and y<0 then \displaystyle \frac{1}{y}>\frac{1}{x}, so part 32. applies.

  37. If x\geq y and x>0 and y>0 then \displaystyle \frac{1}{x}\leq \frac{1}{y}:

    If x>y then part 35 applies. Otherwise, x=y and the result is clear.

  38. If x\geq y and x<0 and y<0 then \displaystyle \frac{1}{x}\leq \frac{1}{y}:

    Finally, if x>y then we apply part 36. Otherwise x=y and we get the result.

The result has been shown.10 $\qed$ :::
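Several parts of the proposition can be spot-checked with exact rational arithmetic. The sketch below tests the multiplication rules (parts 15 to 18, taking z>0 in the strict case) and the reciprocal rules (parts 31 and 32) on a small grid. The names `SAMPLES`, `multiplication_rules_hold` and `reciprocal_rules_hold` are hypothetical, and the grid is an arbitrary choice made for illustration.

```python
from fractions import Fraction
from itertools import product as cartesian

# A small, arbitrary grid of sample rationals (an assumption of this sketch).
SAMPLES = [Fraction(p, q) for p in range(-4, 5) for q in range(1, 4)]


def multiplication_rules_hold():
    """Check parts 15 to 18 on every triple of samples."""
    for x, y, z in cartesian(SAMPLES, repeat=3):
        if x < y and z > 0 and not x * z < y * z:
            return False
        if x < y and z < 0 and not x * z > y * z:
            return False
        if x <= y and z >= 0 and not x * z <= y * z:
            return False
        if x <= y and z < 0 and not x * z >= y * z:
            return False
    return True


def reciprocal_rules_hold():
    """Check parts 31 and 32: reciprocals reverse the order when the signs agree."""
    for x, y in cartesian(SAMPLES, repeat=2):
        same_sign = (x > 0 and y > 0) or (x < 0 and y < 0)
        if x < y and same_sign and not 1 / x > 1 / y:
            return False
    return True


# With z = 0 both products vanish, so a strict inequality cannot survive.
strict_fails_at_zero = not (Fraction(1) * 0 < Fraction(2) * 0)
```

The last line shows why the strict rules need z>0 rather than z\geq 0: multiplying 1<2 by zero gives 0 on both sides.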

Extending exponentiation to the rational numbers

Recall the definition of exponentiation from the integers.

$$\begin{align*} \wedge:\mathbb{Z}\times\mathbb{Z}^+&\rightarrow\mathbb{Z}\\ \left(x,n\right)&\mapsto \wedge\left(x,n\right)=\begin{cases} 1,\ \text{If } x=0\text{ and } n=0\\ 1,\ \text{If } n=0\\ \displaystyle \prod_{i=1}^n x ,\ \text{If }x\neq 0\text{ and } n \geq 0 \end{cases} \end{align*}$$

where \mathbb{Z}^+=\left\{x\in\mathbb{Z}:x\geq 0\right\}. We noted in the section on extending exponentiation to the integers that we were unable to consider the case of negative exponents. By assuming that such exponents made sense we deduced that a new type of object must exist that undoes integer multiplication. As we have seen in this section, that object is a rational number. Indeed we showed in proposition 82{reference-type="ref" reference="prop:MultiplicativeInverseOfIntegerTimesInverseIsOriginalNumber"} that if x\in\mathbb{Z} with x\neq 0 then there is some x^{-1}\in\mathbb{Q} so that x\cdot x^{-1}=1=x^0. This suggests generalising proposition 77{reference-type="ref" reference="prop:IntegerExponentiationOfSameBaseAddsPowers"} to all integer exponents rather than only the non-negative ones. We hence generalise the definition of exponentiation and prove the results for all integer exponents.

::: definition Definition 136. Exponentiation of integer numbers

Let \left(x,y\right)\in\mathbb{Z}\times\mathbb{Z} and let \wedge:\mathbb{Z}\times\mathbb{Z}\rightarrow\mathbb{Q}. We define the exponentiation of x by y by $$\begin{align*} \wedge:\mathbb{Z}\times\mathbb{Z}&\rightarrow\mathbb{Q}\\ \left(x,y\right)&\mapsto \wedge\left(x,y\right)=\begin{cases} 1,\ \text{If } x=0\text{ and } y=0\\ 1,\ \text{If } y=0\\ \displaystyle \prod_{i=1}^y x ,\ \text{If }x\neq 0\text{ and } y \geq 0\\ \displaystyle \prod_{i=1}^{\left|y\right|} \frac{1}{x} ,\ \text{If }x\neq 0\text{ and } y < 0 \end{cases} \end{align*}$$ :::
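The definition translates directly into a short function that multiplies either x or its inverse \left|y\right| times. This is a sketch under stated assumptions: the name `power` is hypothetical, and the handling of a zero base with a non-zero exponent (returning 0 for positive exponents and raising an error for negative ones) is our assumption, as the definition above does not list that case.

```python
from fractions import Fraction


def power(x: int, y: int) -> Fraction:
    """Raise the integer x to the integer exponent y as an exact rational."""
    if y == 0:
        # Covers both "x = 0 and y = 0" and "y = 0" in the definition.
        return Fraction(1)
    if x == 0:
        # Assumption of this sketch: not covered by the definition above.
        if y < 0:
            raise ZeroDivisionError("0 has no multiplicative inverse")
        return Fraction(0)
    # Multiply by x for positive exponents and by 1/x for negative ones.
    factor = Fraction(x) if y > 0 else Fraction(1, x)
    result = Fraction(1)
    for _ in range(abs(y)):
        result *= factor
    return result
```

For example, `power(2, -3)` multiplies three copies of 1/2 together, in line with the last case of the definition.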

We can now extend the results shown in the section on integer exponentiation extension.

::: {#prop:IntegerExtensionExponentiationPowerLaw .proposition} Proposition 90. Power law of exponentiation for integer exponents

Let x\in\mathbb{Z} and let n,m\in\mathbb{Z}. We have that

$$\begin{equation*} \left(x^n\right)^m = x^{nm} \end{equation*}$$

Proof:

If n,m\geq 0 the result is the same as proposition 76{reference-type="ref" reference="prop:IntegerExponentiationPowerLaw"}. So we must consider the following cases

  1. n\geq 0 and $m<0$

  2. n< 0 and $m\geq 0$

  3. n< 0 and $m<0$

  1. n\geq 0 and m<0:

    By definition of integer exponentiation, we have that \displaystyle x^n=\prod_{i=1}^n x. Now applying the general definition of integer exponentiation we see that

    $$\begin{align*} \left(x^n\right)^m&=\prod_{i=1}^{\left|m\right|} \frac{1}{x^n}\\ &=\underbrace{\left(\frac{1}{x^n}\right)\left(\frac{1}{x^n}\right)\left(\frac{1}{x^n}\right)\dots\left(\frac{1}{x^n}\right)}_{\left|m\right|\text{ times}} \end{align*}$$

    Now, we know by definition of multiplication for rationals that \displaystyle\frac{1}{a}\cdot\frac{1}{b}=\frac{1}{ab} and so

    $$\begin{align*} \left(x^n\right)^m&=\underbrace{\left(\frac{1}{x^n}\right)\left(\frac{1}{x^n}\right)\left(\frac{1}{x^n}\right)\dots\left(\frac{1}{x^n}\right)}_{\left|m\right|\text{ times}}\\ &=\frac{1}{x^{n\left|m\right|}}\\ &=\prod_{i=1}^{n\left|m\right|} \frac{1}{x} =x^{nm} \end{align*}$$

    By definition.

  2. n< 0 and m\geq 0:

    As n<0 then we have that

    $$\begin{equation*} x^n=\prod_{i=1}^{\left|n\right|}\frac{1}{x}=\frac{1}{x^{\left|n\right|}} \end{equation*}$$

    We can now apply similar logic to the first part to conclude the result.

  3. n< 0 and m<0:

    Using similar logic to the two previous parts deduces the result.

As promised. $\qed$ :::

::: {#prop:IntegerExtensionExponentiationOfSameBaseAddsPowers .proposition} Proposition 91. Multiplying exponents of the same base adds the powers

Let x\in\mathbb{Z} be a fixed integer and let n,m\in\mathbb{Z}. We have that

$$\begin{equation} x^n x^m = x^{n+m} \end{equation}$$*

Proof:

If n,m\geq 0 the result is the same as proposition 77{reference-type="ref" reference="prop:IntegerExponentiationOfSameBaseAddsPowers"}, so we have to consider the following three cases

  1. n\geq 0 and $m<0$

  2. n< 0 and $m\geq 0$

  3. n< 0 and $m<0$

  1. n\geq 0 and m<0:

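    Let m=-k for some k\in\mathbb{Z} with k>0. By definition we know that \displaystyle x^m=x^{-k}=\prod_{i=1}^{k} \frac{1}{x}. Each factor \displaystyle\frac{1}{x} cancels one factor of x, as \displaystyle x\cdot\frac{1}{x}=1. If n\geq k this leaves n-k factors of x, and if n<k it leaves k-n factors of \displaystyle\frac{1}{x}. In both cases we have

    $$\begin{equation*} x^nx^m=x^n x^{-k}=x^{n+\left(-k\right)} \end{equation*}$$

    Which is equivalent to x^{n+m}.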

  2. n< 0 and m\geq 0:

    Like the previous part let n=-k for some k\in\mathbb{Z} with k>0 then we get

    $$\begin{equation} x^nx^m=x^{-k}x^{m}=x^{-k+m}=x^{n+m} \end{equation}$$

  3. n< 0 and m<0:

    Let n=-k and m=-j for k,j\in\mathbb{Z} with k>0 and j>0. Then

    $$\begin{equation} x^nx^m=x^{-k}x^{-j}=x^{-k+-j}=x^{n+m} \end{equation}$$

As required. $\qed$ :::

::: {#prop:IntegerExtensionExponentiationPowerOfProductIsProductOfPowers .proposition} Proposition 92. Power of product is product of powers

Let x,y\in\mathbb{Z} and n\in\mathbb{Z}. Then

$$\begin{equation*} \left(xy\right)^n=x^ny^n \end{equation*}$$

Proof:

If n=0 then \left(x*y\right)^n=1 and clearly x^0*y^0=1. So suppose n>0 then we have

$$\begin{align*} \left(xy\right)^n=\prod_{i=1}^n xy &=\underbrace{xy\,xy\dots xy}_{n\text{ times}}\\ &= \left(\underbrace{xx\dots x}_{n\text{ times}}\right)\left(\underbrace{yy\dots y}_{n\text{ times}}\right),\ \text{ By commutativity of multiplication}\\ &=x^ny^n \end{align*}$$

Finally, let n<0 then a similar argument shows that

$$\begin{equation*} \left(xy\right)^n=\frac{1}{\left(xy\right)^{\left|n\right|}}=\frac{1}{x^{\left|n\right|}y^{\left|n\right|}}=x^ny^n \end{equation*}$$

Showing the proposition. $\qed$ :::

We have extended integer exponentiation. What can we say about rational exponentiation? We can clearly extend the base of exponentiation to an arbitrary rational number. We have already used special cases of this when we considered denominators and numerators separately in the previous proofs. We formalise this to a fully general rational number. Firstly, we know that if n<0 then \displaystyle x^n=\frac{1}{x^{\left|n\right|}}. Additionally if x\in\mathbb{Z} with x\neq 0 then a multiplicative inverse of x in the rationals is given by \displaystyle x^{-1}=\frac{1}{x}. We combine the two into a general definition.

::: definition Definition 137. Exponentiation for negative indices

Let x\in\mathbb{Z} with x\neq 0. We extend exponentiation to negative exponents by setting, for n\in\mathbb{Z} with n>0,

$$\begin{equation*} x^{-n} = \left(x^{-1}\right)^n \end{equation*}$$

Clearly we have in general that $x^{-n}\in\mathbb{Q}$ :::

Now we can consider the more general case of \displaystyle\left(\frac{a}{b}\right)^n for a,b,n\in\mathbb{Z} and b\neq 0. We have the following proposition

::: proposition Proposition 93. Rational number raised to an integer exponent

Let x\in\mathbb{Q} with \displaystyle x=\frac{a}{b} and b\neq 0. Let n\in\mathbb{Z}. We have that

$$\begin{equation*} \left(\frac{a}{b}\right)^n=\frac{a^n}{b^n} \end{equation*}$$

Proof:

We have that

$$\begin{align*} \left(\frac{a}{b}\right)^n&=\left(ab^{-1}\right)^n\\ &= \underbrace{\left(a b^{-1}\right)\left(a b^{-1}\right)\dots \left(a b^{-1}\right)}_{n \text{ times}}\\ &=\underbrace{aaa\dots a}_{n \text{ times}}\underbrace{b^{-1}b^{-1}b^{-1}\dots b^{-1}}_{n \text{ times}}\\ &= a^n \left(b^{-1}\right)^n\\ &=a^nb^{-n}\\ &=\frac{a^n}{b^n} \end{align*}$$

As required. $\qed$ :::
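The identity can be checked with the standard library's exact rationals, which support integer exponents directly. The two function names below are hypothetical and serve only to keep the two sides of the identity apart. The sketch assumes a\neq 0 whenever n is negative, since otherwise the power is undefined.

```python
from fractions import Fraction


def power_of_quotient(a, b, n):
    """Left-hand side: (a/b)^n."""
    return Fraction(a, b) ** n


def quotient_of_powers(a, b, n):
    """Right-hand side: a^n / b^n, computed from the two integer powers."""
    return Fraction(a) ** n / Fraction(b) ** n


# Non-zero numerators and denominators of either sign, exponents from -4 to 4.
nonzero = [-5, -2, -1, 1, 3, 4]
identity_holds = all(
    power_of_quotient(a, b, n) == quotient_of_powers(a, b, n)
    for a in nonzero
    for b in nonzero
    for n in range(-4, 5)
)
```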

The rules of integer exponentiation extend when the base is rational.

::: {#propRationalExponentiationPowerLaw .proposition} Proposition 94. Power law of exponentiation for integer exponents

Let x\in\mathbb{Q} and let n,m\in \mathbb{Z}. We have that

$$\begin{equation*} \left(x^n\right)^m = x^{nm} \end{equation*}$$

Proof:

Let \displaystyle x=\frac{a}{b} with a,b\in\mathbb{Z} and b\neq 0. We have that

$$\begin{align*} \left(x^n\right)^m&=\left(\left(\frac{a}{b}\right)^n\right)^m\\ &=\left(\frac{a^n}{b^n}\right)^m\\ &=\left(a^nb^{-n}\right)^m\\ &=a^{nm}b^{-nm}\\ &=\frac{a^{nm}}{b^{nm}}\\ &=x^{nm} \end{align*}$$ $\qed$ :::

::: {#prop:RationalExponentiationOfSameBaseAddsPowers .proposition} Proposition 95. Multiplying exponents of the same base adds the powers

Let x\in\mathbb{Q} be a fixed rational number and let n,m\in\mathbb{Z}. We have that

$$\begin{equation} x^n x^m = x^{n+m} \end{equation}$$*

Proof:

Let \displaystyle x=\frac{a}{b} with a,b\in\mathbb{Z} and b\neq 0. Observe that

$$\begin{align*} x^nx^m&=\left(\frac{a}{b}\right)^n\left(\frac{a}{b}\right)^m\\ &=\frac{a^n}{b^n}\frac{a^m}{b^m}\\ &=\frac{a^na^m}{b^nb^m}\\ &=\frac{a^{n+m}}{b^{n+m}}\\ &=\left(\frac{a}{b}\right)^{n+m}\\ &=x^{n+m} \end{align*}$$

As required. $\qed$ :::

::: {#prop:RationalExponentiationPowerOfProductIsProductOfPowers .proposition} Proposition 96. Power of product is product of powers

Let x,y\in\mathbb{Q} and n\in\mathbb{Z}. Then

$$\begin{equation*} \left(xy\right)^n=x^ny^n \end{equation*}$$

Proof:

Let \displaystyle x=\frac{a}{b} with a,b\in\mathbb{Z} and b\neq 0 and let \displaystyle y=\frac{c}{d} with c,d\in\mathbb{Z} and d\neq 0. We have

$$\begin{align*} \left(xy\right)^n&=\left(\frac{a}{b}\frac{c}{d}\right)^n\\ &=\left(\frac{ac}{bd}\right)^n\\ &=\frac{\left(ac\right)^n}{\left(bd\right)^n}\\ &=\frac{a^n c^n}{b^n d^n}\\ &=\frac{a^n}{b^n}\frac{c^n}{d^n}\\ &=x^ny^n \end{align*}$$ $\qed$ :::
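The three exponent laws for a rational base and integer exponents can be spot-checked in the same way. The lists `bases` and `exponents` are arbitrary choices for this sketch. Zero is left out of the bases so that negative exponents are always defined.

```python
from fractions import Fraction

# Arbitrary non-zero rational bases and a symmetric range of integer exponents.
bases = [Fraction(p, q) for p in (-3, -1, 2, 5) for q in (1, 2, 7)]
exponents = range(-4, 5)

# (x^n)^m = x^(nm)
power_law = all(
    (x ** n) ** m == x ** (n * m) for x in bases for n in exponents for m in exponents
)

# x^n x^m = x^(n+m)
same_base_law = all(
    x ** n * x ** m == x ** (n + m) for x in bases for n in exponents for m in exponents
)

# (xy)^n = x^n y^n
product_law = all(
    (x * y) ** n == x ** n * y ** n for x in bases for y in bases for n in exponents
)
```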

What about rational exponents? Can we assign meaning to expressions of the form \displaystyle \wedge\left(\frac{a}{b},\frac{c}{d}\right)? We use a similar argument to the one we used when we considered extending integer exponentiation. Suppose that proposition 95{reference-type="ref" reference="prop:RationalExponentiationOfSameBaseAddsPowers"} holds for rational exponents. In particular we have for some x\in\mathbb{Q} that

$$\begin{equation*} x^{\frac{1}{2}}x^{\frac{1}{2}}=x^1 \end{equation*}$$

Now, suppose that x=2. We are hence saying that

$$\begin{equation*} 2^{\frac{1}{2}}2^{\frac{1}{2}}=2 \end{equation*}$$

If we suppose that \displaystyle 2^{\frac{1}{2}}\in\mathbb{Q} with say \displaystyle y=2^{\frac{1}{2}} we are saying that y^2=2. Unfortunately, there is no such rational y that satisfies this. Moreover, we lack the theory required to prove this at this time. This will be corrected in part {reference-type="ref" reference="part2"}.
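Although we cannot yet prove that no rational y satisfies y^2=2, we can at least search for one among fractions with small denominators. The function name `rational_square_roots_of_two` and the bound of 200 are hypothetical choices for this sketch. An empty search result is evidence only, not a proof.

```python
def rational_square_roots_of_two(max_denominator):
    """Return every pair (p, q) with 1 <= q <= max_denominator and (p/q)^2 = 2.

    Since 1 < y < 2 for any positive y with y^2 = 2, it is enough to try
    numerators p between q and 2q. The test p*p == 2*q*q is the equation
    (p/q)^2 = 2 with the denominators cleared, so it uses integers only.
    """
    hits = []
    for q in range(1, max_denominator + 1):
        for p in range(q, 2 * q + 1):
            if p * p == 2 * q * q:
                hits.append((p, q))
    return hits


candidates = rational_square_roots_of_two(200)
```

Here `candidates` is empty: no fraction with denominator at most 200 squares to exactly 2.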

Extending the absolute value function

When we constructed the integers we recast the notion of size into that of distance. This was achieved using the so-called absolute value function given by

$$\begin{equation*} \left|x\right|=d\left(x,0\right)=\begin{cases} x,\ \text{If } x\geq 0\\ -x,\ \text{If } x< 0 \end{cases} \end{equation*}$$

where

$$\begin{align*} d:\mathbb{Z}^2&\rightarrow\mathbb{N}\\ \left(x,y\right)&\mapsto d\left(x,y\right)=\begin{cases} x-y,\ \text{If } x\geq y\\ -\left(x-y\right),\ \text{If } x< y \end{cases} \end{align*}$$

Now that we have constructed the rational numbers we can consider how this idea extends. One thing that is clear from the definition of d for integers is that the smallest possible non-zero distance that can be achieved is 1, for example, d\left(2,1\right). However, consider

$$\begin{equation*} 1-\frac{1}{2}=\frac{1}{2} \end{equation*}$$

If this idea of distance is to extend to the rationals we will clearly have that distances smaller than 1 are now possible. In other words, the mapping for d when used with rational numbers can no longer map into \mathbb{N}. This is easily remedied by defining the following set.

::: definition Definition 138. Positive rationals

We define the set of positive rationals, which like \mathbb{Z}^+ includes zero, by

$$\begin{equation*} \mathbb{Q}^+=\left\{x\in\mathbb{Q}: x\geq 0\right\} \end{equation*}$$ :::

It is clear from the definitions for the integers how to extend the distance function and the absolute value function to the rationals.

::: definition Definition 139. Distance function for the rationals

Let x,y\in\mathbb{Q}. Define the function d:\mathbb{Q}^2\rightarrow\mathbb{Q}^+ by

$$\begin{align*} d:\mathbb{Q}^2&\rightarrow\mathbb{Q}^+\\ \left(x,y\right)&\mapsto d\left(x,y\right)=\begin{cases} x-y,\ \text{If } x\geq y\\ -\left(x-y\right),\ \text{If } x< y \end{cases} \end{align*}$$ :::

As before we prove that this distance function is well-defined.

::: {#prop:RationalDistanceFuncWellDefined .proposition} Proposition 97. The distance function for the rationals is well-defined

Let x,y\in\mathbb{Q}. We have that

$$\begin{equation*} d\left(x,y\right)=\begin{cases} x-y,\ \text{If } x\geq y\\ -\left(x-y\right),\ \text{If } x< y \end{cases} \end{equation*}$$

is well-defined.

Proof:

Let x,y\in\mathbb{Q}. There are two cases to consider x\geq y and x<y.

  1. x\geq y:

    Suppose that x\geq y, then by proposition 89{reference-type="ref" reference="prop:InequalityRationalNumbers"} part 14. we have

    $$\begin{equation*} x\geq y \Rightarrow \left(x+\left(-y\right)\right) \geq \left(y+\left(-y\right)\right) \Rightarrow x-y \geq 0 \end{equation*}$$

    Hence x-y\in\mathbb{Q}^+.

  2. x<y:

    As x<y we have by definition of d that d\left(x,y\right)=-\left(x-y\right) where we have that x-y<0. However we have that -\left(x-y\right)=-1 * \left(x-y\right) and so by part 16 of proposition 89{reference-type="ref" reference="prop:InequalityRationalNumbers"} we have that -1*\left(x-y\right)>0 which is to say $-\left(x-y\right)\in\mathbb{Q}^+$

The result has been shown. $\qed$ :::

We can now generalise the absolute value function.

::: definition Definition 140. Absolute value function

Let x\in\mathbb{Q}. We define the absolute value of x, denoted by \left|x\right|, by

$$\begin{equation*} \left|x\right|=d\left(x,0\right)=\begin{cases} x,\ \text{If } x\geq 0\\ -x,\ \text{If } x< 0 \end{cases} \end{equation*}$$ :::
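The distance function and the absolute value for the rationals can be written down exactly as in the two definitions. The names `distance` and `absolute_value` are hypothetical, and the sketch relies on the standard library's `Fraction` type for exact rational arithmetic.

```python
from fractions import Fraction


def distance(x, y):
    """d(x, y): x - y when x >= y, and -(x - y) when x < y."""
    return x - y if x >= y else -(x - y)


def absolute_value(x):
    """|x| defined as the distance from x to 0."""
    return distance(x, Fraction(0))


# A distance smaller than 1, which is impossible between distinct integers.
half = distance(Fraction(1), Fraction(1, 2))
```

The value `half` is the rational 1/2 from the example above, and swapping the two arguments of `distance` gives the same result.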

We have generalised the idea of "size" to the rationals. We can now also generalise the properties of the absolute value function explored in the construction of the integers.

::: proposition Proposition 98. Properties of the absolute value

Let x,y,z\in\mathbb{Q}. We have that the absolute value function has the following properties

  1. \left|x\right|\geq 0 for all $x\in\mathbb{Q}$

  2. $\left|x\right|=0\iff x=0$

  3. $\left|x-y\right|=0\iff x=y$

  4. $\left|xy\right|=\left|x\right|\left|y\right|$

  5. \displaystyle \left|\frac{x}{y}\right|=\frac{\left|x\right|}{\left|y\right|} with $y\neq 0$

  6. $\left|\left|x\right|\right|=\left|x\right|$

  7. $\left|-x\right|=\left|x\right|$

  8. $\left|x\right|\leq y \iff -y\leq x\leq y$

  9. \left|x\right|\geq y\iff x\leq -y or $x\geq y$

  10. $\left|x+y\right|\leq \left|x\right|+\left|y\right|$

  11. $\left|x-y\right|\leq\left|x-z\right|+\left|z-y\right|$

  12. $\left|x-y\right|\geq \left|\left|x\right|-\left|y\right|\right|$

  13. \left|\cdot\right| is not injective

  14. \left|\cdot\right| is not surjective

Proof:

  1. \left|x\right|\geq 0 for all x\in\mathbb{Q}:

    This follows by proposition 97{reference-type="ref" reference="prop:RationalDistanceFuncWellDefined"}.

  2. \left|x\right|=0\iff x=0:

    By definition \left|x\right| is either x or -x, and each of these is zero if and only if x=0.

  3. \left|x-y\right|=0\iff x=y:

    \left(\Rightarrow\right): Suppose that \left|x-y\right|=0. There are two cases to consider.

    Firstly if x\geq y, then by definition we have that \left|x-y\right|=x-y=0 from which we clearly have x=y. The other case is x<y from which we get \left|x-y\right|=-\left(x-y\right)=0. In other words, we have -1*\left(x-y\right)=0. Now by proposition 87{reference-type="ref" reference="prop:RationalsHaveNoZeroDivisors"} we know that for rationals a,b that if ab=0, at least one of a or b is zero. As -1\neq 0 we conclude that x-y=0 from which we get x=y.

    \left(\Leftarrow\right): Suppose that x=y then x-y=0 and so \left|x-y\right|=0.

  4. \left|xy\right|=\left|x\right|\left|y\right|:

    Let x,y\in\mathbb{Q}. There are four cases to consider.

    1. x\geq 0 and $y\geq 0$

    2. x\geq 0 and $y<0$

    3. x<0 and $y\geq 0$

    4. x<0 and $y<0$

    1. x\geq 0 and y\geq 0:

      If x\geq 0 and y\geq 0 then xy\geq 0 and so \left|xy\right|=xy. Likewise \left|x\right|=x and \left|y\right|=y. Hence \left|xy\right|=\left|x\right|\left|y\right|.

    2. x\geq 0 and y<0:

      If x\geq 0 then \left|x\right|=x by definition, and if y<0 then \left|y\right|=-y. Now xy\leq 0 as x\geq 0 and y<0, and so \left|xy\right|=-xy. Moreover, we have that

      $$\begin{equation*} -xy=\left(-1\right)\left(x\right)\left(y\right)=\left(x\right)\left(-1\right)\left(y\right)=\left(x\right)\left(-y\right)=\left|x\right|\left|y\right| \end{equation*}$$

      Hence we get $\left|xy\right|=\left|x\right|\left|y\right|$

    3. x<0 and y\geq 0:

      This is similar to the above but swapping the roles of x and y.

    4. x<0 and y<0:

      Suppose that x<0 and y<0, then we have that \left|x\right|=-x and \left|y\right|=-y by definition. Moreover, we have that -x*-y = xy. Hence $\left|xy\right|=xy=\left(-x\right)\left(-y\right)=\left|x\right|\left|y\right|$

  5. \displaystyle\left|\frac{x}{y}\right|=\frac{\left|x\right|}{\left|y\right|} with y\neq 0:

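    As y\neq 0 we have that \displaystyle x=\frac{x}{y}*y and so part 4. gives \displaystyle\left|x\right|=\left|\frac{x}{y}\right|\left|y\right|. By part 2. we have \left|y\right|\neq 0, and so dividing both sides by \left|y\right| gives the result.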

  6. \left|\left|x\right|\right|=\left|x\right|:

    We have that \left|x\right|=x if x\geq 0 and -x if x<0.

    So if x\geq 0, we have

    $$\begin{equation*} \left|\left|x\right|\right|=\left|x\right|=x=\left|x\right| \end{equation*}$$

    Now if x<0 then

    $$\begin{equation*} \left|\left|x\right|\right|=\left|-x\right|=\underbrace{-x}_{\text{As }-x>0}=\left|x\right| \end{equation*}$$

  7. \left|-x\right|=\left|x\right|:

    As -x=-1 *x we have by part 4 that

    $$\begin{equation*} \left|-x\right|=\left|-1x\right|=\left|-1\right|\left|x\right|=1\left|x\right|=\left|x\right| \end{equation*}$$

  8. \left|x\right|\leq y \iff -y\leq x\leq y:

    \left(\Rightarrow\right): Suppose that \left|x\right|\leq y. If x\geq 0 then we get that \left|x\right|=x\leq y. From this, it is clear that -y\leq x\leq y as x\geq 0 and x\leq y \Rightarrow y \geq 0.

    Now if x<0, then \left|x\right|=-x\leq y. Clearly x\leq -x as x<0 hence we conclude that x\leq -x\leq y. Now by part 18 of proposition 89{reference-type="ref" reference="prop:InequalityRationalNumbers"} we have

    $$\begin{equation*} \left(-1\right)\left(-x\right)\geq \left(-1\right)\left(y\right) \iff x\geq -y \end{equation*}$$

    Now x\geq -y is the same as -y\leq x and so we have -y\leq x\leq -x \leq y.

    Hence -y\leq x\leq y.

    \left(\Leftarrow\right): Suppose that -y\leq x\leq y. There are two cases to consider.

    1. $x\geq 0$

    2. $x<0$

    1. x\geq 0:

      Suppose x\geq 0, then clearly as x\leq y then \left|x\right|\leq \left|y\right|=y. Moreover, we have that -y\leq x is the same as x\geq -y, and part 22. of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} when applied to x\geq -y gives

      $$\begin{equation*} \left(-1\right)\left(x\right)\leq \left(-1\right)\left(-y\right) \iff -x\leq y \end{equation*}$$

      We have that \left|-x\right|=\left|x\right| by part 7. Hence \left|-x\right|=\left|x\right|\leq \left|y\right|=y.

    2. x<0:

      Suppose x<0. We can't have y<0, as then -y>0>x, which contradicts the assumption that -y\leq x. Hence y\geq 0.

      Now as -y\leq x by assumption we have that x\geq -y and so part 22. of proposition 89{reference-type="ref" reference="prop:InequalityRationalNumbers"} gives

      $$\begin{equation*} \left(-1\right)\left(x\right)\leq \left(-1\right)\left(-y\right) \iff -x\leq y \end{equation*}$$

      As x<0 we have \left|x\right|=-x by definition, and so we get that $\left|x\right|\leq y$

  9. \left|x\right|\geq y\iff x\leq -y or x\geq y:

    \left(\Rightarrow\right): Suppose that \left|x\right|\geq y. If x\geq 0 then \left|x\right|=x\geq y. So suppose that x<0 then by definition we have that \left|x\right|=-x and so -x\geq y and the result follows when applying part 22. of proposition 89{reference-type="ref" reference="prop:InequalityRationalNumbers"}.

    \left(\Leftarrow\right): Suppose that either x\leq -y or x\geq y. We have three cases to consider.

    1. $x\leq -y$

    2. $x\geq y$

    3. x\leq -y and $x\geq y$

    1. x\leq -y:

      Suppose that x\leq -y holds. By part 18. of proposition 89{reference-type="ref" reference="prop:InequalityRationalNumbers"} we have that

      $$\begin{equation*} \left(-1\right)\left(x\right)\geq \left(-1\right)\left(-y\right) \iff -x\geq y \end{equation*}$$

      If x<0 then \left|x\right|=-x by definition, and so \left|x\right|\geq y.

      Now suppose that x\geq 0, so that \left|x\right|=x. As x\leq -y we have that -y\geq 0 and hence y\leq 0. Therefore y\leq 0\leq x=\left|x\right|, which is to say \left|x\right|\geq y.

    2. x\geq y:

      This case is trivial, as by definition \left|x\right|\geq x for every x\in\mathbb{Q} and so \left|x\right|\geq x\geq y.

    3. x\leq -y and x\geq y:

      Suppose that x\leq -y and x\geq y are both true. We know by the first case that x\leq -y gives \left|x\right|\geq y and x\geq y also implies \left|x\right|\geq y by the second case. Hence both inequalities being true at the same time implies the result \left|x\right|\geq y.

  10. \left|x+y\right|\leq \left|x\right|+\left|y\right|:

    Let x,y\in\mathbb{Q}. There are four cases to consider.

    1. x\geq 0 and $y\geq 0$

    2. x\geq 0 and $y\leq 0$

    3. x\leq 0 and $y\geq 0$

    4. x\leq 0 and $y\leq 0$

    1. x\geq 0 and y\geq 0:

      Suppose x\geq 0 and y\geq 0, then we have that

      $$\begin{equation*} \left|x+y\right|=x+y=\left|x\right|+\left|y\right|\Rightarrow \left|x+y\right|\leq\left|x\right|+\left|y\right| \end{equation*}$$

    2. x\geq 0 and $y\leq 0$

      By assumption we have that \left|x\right|=x and \left|y\right|=-y. We have two cases based on the absolute value, \left|x\right|\leq\left|y\right| and \left|x\right|\geq\left|y\right|.

      So suppose that \left|x\right|\leq\left|y\right| then by definition x\leq -y and so by part 12. of proposition 89{reference-type="ref" reference="prop:InequalityRationalNumbers"} we have that

      $$\begin{equation*} x\leq -y \Rightarrow x+y\leq 0 \end{equation*}$$

      Moreover, as x\geq 0 then y\leq x+y\leq 0. Hence we have by the definition of the absolute value that

      $$\begin{equation*} \left|x+y\right|=-\left(x+y\right)\leq -y=\left|y\right| \end{equation*}$$

      as y\leq x+y.

      In the case \left|x\right|\geq\left|y\right| we have by definition that x\geq -y and so x+y\geq 0. Additionally it is clear that x\geq x+y as y\leq 0. Hence by definition of the absolute value we have that

      $$\begin{equation*} \left|x+y\right|=x+y\leq x=\left|x\right| \end{equation*}$$

      Now, it is clear to see that \left|x\right|\leq \left|x\right|+\left|y\right| and likewise \left|y\right|\leq \left|x\right|+\left|y\right|.

      We have hence shown that \left|x+y\right|\leq\left|x\right|+\left|y\right|.

    3. x\leq 0 and y\geq 0:

      This is similar to above, interchanging the roles of x and y.

    4. x\leq 0 and y\leq 0:

      Suppose that x\leq 0 and y\leq 0 then x+y\leq 0 and so by definition we have that \left|x+y\right|=-\left(x+y\right)=-x-y. As x\leq 0 and y\leq 0 then we have that \left|x\right|=-x and \left|y\right|=-y which shows $\left|x+y\right|=\left|x\right|+\left|y\right|\leq\left|x\right|+\left|y\right|$

  11. \left|x-y\right|\leq\left|x-z\right|+\left|z-y\right|:

    We have that

    $$\begin{align*} \left|x-y\right|&=\left|x-\left(z-z\right)-y\right|\\ &=\left|x-z+z-y\right|\\ &\leq \left|x-z\right|+\left|z-y\right| \end{align*}$$

  12. \left|x-y\right|\geq \left|\left|x\right|-\left|y\right|\right|:

    We have that

    $$\begin{align*} \left|x\right|&=\left|\left(x-y\right)+y\right|\leq \left|x-y\right|+\left|y\right| \Rightarrow \left|x\right|-\left|y\right|\leq \left|x-y\right|\\ \left|y\right|&=\left|\left(y-x\right)+x\right|\leq \left|x-y\right|+\left|x\right| \Rightarrow \left|y\right|-\left|x\right|\leq \left|x-y\right| \end{align*}$$

    where we have used that \left|y-x\right|=\left|x-y\right| by part 7. The second inequality is the same as -\left|x-y\right|\leq \left|x\right|-\left|y\right| and so we have

    $$\begin{equation*} -\left|x-y\right|\leq \left|x\right|-\left|y\right|\leq \left|x-y\right| \end{equation*}$$

    Hence by part 8. we have the result.

  13. \left|\cdot\right| is not injective:

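    This follows as the absolute value function was not injective for the integers. For example, \left|1\right|=\left|-1\right|=1 but 1\neq -1.

  14. \left|\cdot\right| is not surjective:

    This follows as the absolute value function was not surjective for the integers. For example, by part 1. there is no x\in\mathbb{Q} so that \left|x\right|=-1.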

As required. $\qed$ :::
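As a sanity check (not a proof), several of the properties above can be tested numerically. The sketch below assumes that Python's built-in `abs` on `fractions.Fraction` values agrees with the absolute value defined here; the sample set and the function name `check_absolute_value_properties` are arbitrary choices made for this illustration.

```python
from fractions import Fraction
from itertools import product

# A small, arbitrary sample of rationals (chosen only for illustration).
SAMPLES = sorted({Fraction(n, d) for n in range(-3, 4) for d in (1, 2, 3)})


def check_absolute_value_properties(values) -> bool:
    """Check parts 4, 8, 9, 10, 11 and 12 on every triple drawn from values."""
    for x, y, z in product(values, repeat=3):
        assert abs(x * y) == abs(x) * abs(y)             # part 4
        assert (abs(x) <= y) == (-y <= x <= y)           # part 8
        assert (abs(x) >= y) == (x <= -y or x >= y)      # part 9
        assert abs(x + y) <= abs(x) + abs(y)             # part 10
        assert abs(x - y) <= abs(x - z) + abs(z - y)     # part 11
        assert abs(x - y) >= abs(abs(x) - abs(y))        # part 12
    return True


print(check_absolute_value_properties(SAMPLES))
```

A failed assertion would point to a counterexample; passing on a finite sample only supports the proposition and does not replace the proof.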

Elementary Number Theory

Introduction

::: epigraph Mathematics is the queen of the sciences and Number Theory is the queen of mathematics.

Carl Friedrich Gauss :::

In the previous part, we went from having only the axioms of ZFC, the rules of logic and knowledge of mappings to building two types of numbers, the naturals and the integers. Unfortunately, we need to make a detour from constructing new objects. We need to start using the objects we have constructed to provide a guide on how to proceed with building more mathematical objects.

We will start with Number Theory. Number Theory primarily deals with the properties of the integers \mathbb{Z} as well as mappings defined on \mathbb{Z}. This includes properties about the operations on the integers, properties about the compositions and ways of expressing relationships between certain "types" of integers, solving equations involving the integers and more.

The applications of Number Theory to the modern world are numerous. One main example of the usage of Number Theory is encryption, the art of obfuscating information so that it can only be read by trusted individuals11 . We will later consider an example of encryption called RSA.

Additionally, the ideas that we will develop when studying Number Theory are key to providing crucial insights into other branches of mathematics. We will come to see that many of the key properties of the integers are also enjoyed by many other types of mathematical objects, especially in an abstract setting.

Divisibility

::: epigraph Now where there are no parts, neither extension, shape, nor divisibility is possible. And these monads are the true atoms of nature and, in a word, the elements of things.

Gottfried Leibniz :::

Definition of divisibility of integers

Although we have a concrete construction of the integers, we haven't even discussed some of their most basic properties! We know how to add, subtract and multiply them, but we don't know how to divide them without the rational numbers \mathbb{Q}. It is with \mathbb{Q} that we can hope to find a rule that says that \displaystyle\frac{a}{b}\in\mathbb{Z} for some a,b\in\mathbb{Z}.

Recall that in \mathbb{Q} we defined an equivalence relation \sim so that for \left(a,b\right),\left(c,d\right)\in\mathbb{Z}^2 we have that

$$\begin{equation*} \left(a,b\right)\sim\left(c,d\right)\iff ad=bc \end{equation*}$$

where we had b\neq 0 and d\neq 0. We also saw that \left(x,1\right)\in\left[\left(x,1\right)\right] represented an integer. Hence the question we are resolving is when does \left(a,b\right)\sim\left(x,1\right). We have that

$$\begin{equation*} \left(a,b\right)\sim\left(x,1\right)\iff a=bx \end{equation*}$$

That is b divides a and gives an integer if and only if a=bx. We make this our first formal definition in the field of Number Theory.

::: {#def:NT_Int_Div_def .definition} Definition 141. Integer divisibility

Let a,b\in\mathbb{Z} with b\neq 0. We say that a is divisible by b, or b divides a, written as b\mid a if and only if \exists c\in\mathbb{Z} so that a=bc. We say that b is a divisor of a.

If b does not divide a we write b\nmid a. :::

::: example Example 90. We have that 3\mid 6 as 6=3*2.

Observe that 2\nmid 3. Indeed there is no integer x so that 3=2x. :::
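The definition of divisibility can be mirrored directly in code. The following is a minimal sketch that searches for the integer c with a=bc; the function name `divides` is hypothetical, and the search bound relies on the observation that any such c satisfies \left|c\right|\leq\left|a\right| when b\neq 0.

```python
def divides(b: int, a: int) -> bool:
    """Return True when b | a, that is, when a = b*c for some integer c."""
    if b == 0:
        raise ValueError("the divisor b must be non-zero")
    # Any witness c must satisfy |c| <= |a|, so a finite search is enough.
    return any(a == b * c for c in range(-abs(a), abs(a) + 1))


print(divides(3, 6))  # True, since 6 = 3*2
print(divides(2, 3))  # False, no integer x gives 3 = 2x
```

The search is deliberately naive so that it follows the definition word for word.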

We make a definition based on the definition of divisibility. Namely based on if a number can be divided into two equal parts.

::: definition Definition 142. Even number

Let x\in\mathbb{Z}. We say that x is even if we have that 2\mid x. :::

This immediately gives another definition.

::: definition Definition 143. Odd number

Let x\in\mathbb{Z}. We say that x is odd if we have that 2\nmid x. :::

We can make another definition, based on divisibility.

::: definition Definition 144. Integer multiple

Let a,b\in\mathbb{Z} so that b\mid a. We say that a is a multiple of b. :::

There are two results that we can derive based on an even number, an odd number and integer multiples.

::: {#prop:NT_even_iff_2n .proposition} Proposition 99. Integer is even if and only if it is a multiple of 2

Let x\in\mathbb{Z}. We have that x is even if and only if x is a multiple of 2.

Proof:

\left(\Rightarrow\right): Suppose that x is even, then by definition we have that 2\mid x and so by the definition of divisibility we have that x=2c for some c\in\mathbb{Z}. By the definition of being an integer multiple we have that x is a multiple of 2.

\left(\Leftarrow\right): Suppose that x is a multiple of 2. By definition of being an integer multiple, we have that x=2r for some r\in\mathbb{Z}. Hence by the definition of divisibility, we have that 2\mid x and so by definition of an even number we have that x is even. $\qed$ :::

We can find a similar proposition for odd numbers. Observe that by the previous proposition that x being even means that x=2n for some integer n. Also, we have that 2n+2=2\left(n+1\right) is even, so what can we say about 2n+1?

::: proposition Proposition 100. Integer is odd if and only if it is not a multiple of 2

Let x\in\mathbb{Z}. We have that x is odd if and only if x is not a multiple of 2.

Proof:

The proof follows by the contrapositive, that is x is a multiple of 2 if and only if x is even, which is the previous proposition. $\qed$ :::

Hence we need to determine if 2n+1 is even or odd. We need to develop the theory of divisibility.

The definition of divisibility gives an immediate result. Namely that when considering the divisibility of integers we need only concern ourselves with positive integers, as negative integers will also be divisors. That is if b\mid a then so does -b.

::: {#prop:NT_PositiveAndNegativeDivisorsForIntsExist .proposition} Proposition 101. Integer dividing another implies negative integer also divides

Let a,b\in\mathbb{Z} with b\mid a. We also have that -b\mid a.

Proof:

Let a,b\in\mathbb{Z} with b\mid a. By definition of divisibility, we have that \exists c\in\mathbb{Z} so that a=bc. We know that -1*-1=1 and so we have that

$$\begin{equation*} a=bc=\left(-1*-1\right)bc=-b*-c \end{equation*}$$

As -c\in\mathbb{Z} then it follows by definition that -b\mid a. $\qed$ :::

Hence by proposition 101{reference-type="ref" reference="prop:NT_PositiveAndNegativeDivisorsForIntsExist"} we will restrict our view to positive divisors only, knowing that any results about a positive divisor will extend to negative divisors.

One clear divisor of any integer a is itself, that is a\mid a as a=a*1. We will find it interesting to consider the more non-trivial divisors of some integers. Hence we make the following definition

::: definition Definition 145. Proper divisor

Let a,b\in\mathbb{Z} with b\mid a. If we have that 0<b<a then we say that b is a proper divisor of a. :::

There are some clear results about divisibility.

::: {#prop:NT_divisibility_properties .proposition} Proposition 102. Properties of divisibility

Let a,b,c\in\mathbb{Z}. We have the following properties for divisibility

  1. a\mid b \Rightarrow a\mid bc for any $c\in\mathbb{Z}$

  2. a\mid b and b\mid c implies that $a\mid c$

  3. a\mid b and a\mid c implies that a\mid\left(bx+cy\right) for any $x,y\in\mathbb{Z}$

  4. a\mid b and b\mid a implies a=\pm b, that is either a=b or a=-b.

  5. a\mid b and a>0 and b>0 implies that a\leq b.

  6. If m\in\mathbb{Z} is such that m\neq 0 then a\mid b is true if and only if ma\mid mb.

  7. For all a\in\mathbb{Z} with a\neq 0 we have $a\mid 0$

Proof:

  1. a\mid b \Rightarrow a\mid bc for any c\in\mathbb{Z}:

    Suppose that a\mid b, then by definition there exists d\in\mathbb{Z} so that b=ad. Hence we have that

    $$\begin{equation*} bc=adc \Rightarrow a\mid bc \end{equation*}$$

    as dc\in\mathbb{Z}.

  2. a\mid b and b\mid c implies that a\mid c:

    Suppose that a\mid b and b\mid c, then by the definition of divisibility we have that b=ax and c=by for some x,y\in\mathbb{Z}. We hence see that

    $$\begin{equation*} c=axy \end{equation*}$$

    Hence as xy\in\mathbb{Z} then we conclude that a\mid c.

  3. a\mid b and a\mid c implies that a\mid\left(bx+cy\right) for any x,y\in\mathbb{Z}:

    Let a\mid b and a\mid c, then there are d,e\in\mathbb{Z} such that b=ad and c=ae. Now, let x,y\in\mathbb{Z} then we have that bx=adx and cy=aey and bx+cy=adx+aey=a\left(dx+ey\right). Hence a\mid\left(bx+cy\right).

  4. a\mid b and b\mid a implies a=\pm b, that is either a=b or a=-b:

    If a\mid b then \exists x\in\mathbb{Z} so that b=ax, likewise if b\mid a then \exists y\in\mathbb{Z} so that a=by. It follows that b=byx. As b\neq 0 we have that b=byx is true if and only if yx=1. Therefore either x=y=1 or x=y=-1.

    The result is clear after substituting y into a=by.

  5. a\mid b and a>0 and b>0 implies that a\leq b:

    If a\mid b then \exists c\in\mathbb{Z} so that b=ac. As a>0 and b>0 we must have c>0, and as c is an integer this gives c\geq 1. Hence b=ac\geq a*1=a, which is to say a\leq b.

  6. If m\in\mathbb{Z} is such that m\neq 0 then a\mid b is true if and only if ma\mid mb:

    \left(\Rightarrow\right): Let m\in\mathbb{Z} be non-zero and let a\mid b. By definition, there is some c\in\mathbb{Z} so that b=ac. Multiplying both sides by m gives

    $$\begin{equation*} bm=acm=amc \end{equation*}$$

    and so am\mid bm.

    \left(\Leftarrow\right): Suppose that am\mid bm, then again by the definition of divisibility we have that there is some c\in\mathbb{Z} so that bm=amc. By the cancellation law, we can cancel the m to get b=ac and the result follows.

  7. For all a\in\mathbb{Z} with a\neq 0 we have a\mid 0:

    Let a\in\mathbb{Z}, where a\neq 0. We have that 0=ka has the solution k=0 by part I proposition 69{reference-type="ref" reference="prop:IntegersHaveNoZeroDivisors"}. Hence a\mid 0.

As required. $\qed$ :::
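The seven properties can also be checked by brute force over a small range of integers. This is a sketch for building intuition rather than a proof; the helper `divides`, the function name `check_divisibility_properties` and the chosen bounds are assumptions of this example, and the remainder operator `%` is used as a shortcut for the existence of c in the definition.

```python
from itertools import product


def divides(b: int, a: int) -> bool:
    """b | a, with the convention that b = 0 never divides anything."""
    return b != 0 and a % b == 0


def check_divisibility_properties(bound: int = 6) -> bool:
    """Check parts 1 to 7 for all a, b, c with absolute value at most bound."""
    values = range(-bound, bound + 1)
    small = range(-2, 3)  # multipliers x, y, m used in parts 3 and 6
    for a, b, c in product(values, repeat=3):
        if divides(a, b):
            assert divides(a, b * c)                        # part 1
            if divides(b, c):
                assert divides(a, c)                        # part 2
            if divides(a, c):
                for x, y in product(small, repeat=2):
                    assert divides(a, b * x + c * y)        # part 3
            if divides(b, a):
                assert a == b or a == -b                    # part 4
            if a > 0 and b > 0:
                assert a <= b                               # part 5
        for m in small:
            if m != 0:
                assert divides(a, b) == divides(m * a, m * b)  # part 6
        if a != 0:
            assert divides(a, 0)                            # part 7
    return True


print(check_divisibility_properties())
```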

Part 3. of the previous proposition can be generalised. We will work with an example to see how this can be achieved.

::: example Example 91. Let a=2, b=16 and c=32. Clearly we have that a\mid b as 16=8*2 and likewise a\mid c as 32=16*2.

Now part 3. states that if a\mid b and a\mid c then we must have that a\mid\left(bx+cy\right) for any x,y\in\mathbb{Z}.

Indeed, for example, we can see that 2\mid\left(-5\left(16\right)+7\left(32\right)\right), as -5\left(16\right)+7\left(32\right)=-80+224=144. Now suppose that d=64 and say z=5. We can see that

$$\begin{equation*} -5\left(16\right)+7\left(32\right)+5\left(64\right)=144+320=464 \Rightarrow 2\mid\left(-5\left(16\right)+7\left(32\right)+5\left(64\right)\right) \end{equation*}$$ :::

We prove the general statement now.

::: {#prop:NT_Divisor_dividing_all_in_set_divides_linear_combination .proposition} Proposition 103. Divisor that divides a set of integers divides a combination of the set

Let a\in\mathbb{Z} and let S=\left\{b_1,b_2,b_3,\dots,b_n\right\} be a set of n integers where b_i\in\mathbb{Z} for each b_i. Moreover suppose that a\mid b_i for each b_i\in S. We have that

$$\begin{equation*} a\mid\sum_{i=1}^n b_i x_i \end{equation*}$$

for any x_i\in\mathbb{Z}.

Proof:

We argue by induction on n. The base case is n=2 which is shown in part 3. of proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"}. So suppose that the result holds for some k\geq 2, which is to say that if S=\left\{b_1,b_2,\dots,b_k\right\} and we have that a\mid b_i for each b_i\in S then

$$\begin{equation*} a\mid\sum_{i=1}^k b_i x_i \end{equation*}$$

We need to show that the result holds for k+1. That is if \Tilde{S}=S\cup \left\{b_{k+1}\right\} so that a\mid b_i for each b_i\in\Tilde{S} then

$$\begin{equation*} a\mid\sum_{i=1}^{k+1} b_i x_i \end{equation*}$$

So take \Tilde{S}=S\cup \left\{b_{k+1}\right\} so that a\mid b_i for each b_i\in\Tilde{S}. By applying part 1. of proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"} to each a\mid b_i we know that for all x_i\in\mathbb{Z} that a\mid b_ix_i.

Now, by the induction hypothesis we know that \forall b_i\in S that a\mid b_i and moreover we have that

$$\begin{equation*} a\mid\sum_{i=1}^k b_i x_i \end{equation*}$$

Let \displaystyle d=\sum_{i=1}^k b_i x_i, so that a\mid d by the induction hypothesis. Additionally we know that a\mid b_{k+1} and so by part 3. of proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"}, as d\in\mathbb{Z}, we have that

$$\begin{align*} a &\mid\left(1d + b_{k+1}x_{k+1}\right)\\ a &\mid\left(\sum_{i=1}^k b_i x_i + b_{k+1}x_{k+1}\right)\\ a &\mid\left(\sum_{i=1}^{k+1} b_i x_i\right) \end{align*}$$

Which implies the result holds for k+1 and hence for any n\in\mathbb{N} by induction. $\qed$ :::
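The proposition can be illustrated with the numbers from Example 91. The sketch below uses a hypothetical helper `linear_combination`; the coefficient lists are the ones from the example and are otherwise arbitrary.

```python
def linear_combination(bs, xs):
    """Return the sum of b_i * x_i over paired entries of bs and xs."""
    if len(bs) != len(xs):
        raise ValueError("bs and xs must have the same length")
    return sum(b * x for b, x in zip(bs, xs))


a = 2
bs = [16, 32, 64]   # each entry is divisible by a
xs = [-5, 7, 5]     # arbitrary integer coefficients

total = linear_combination(bs, xs)
print(total)           # 464
print(total % a == 0)  # True, so a divides the combination
```

Changing `xs` to any other integers should still give a total divisible by 2, which is what the proposition asserts.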

The greatest common divisor and the least common multiple

Now that we have a solid grasp of the basics of integer divisibility, we can start looking towards some applications. One immediate question is given a set of integers say

$$\begin{equation*} S=\left\{a_1,a_2,a_3,\dots,a_n\right\} \end{equation*}$$

What is the largest integer which divides each a_i\in S, and what is the smallest positive integer m so that m has each a_i\in S as a divisor? These two ideas are immediately useful when doing arithmetic with rational numbers. For example, consider trying to simplify the fraction \displaystyle\frac{525}{2925}. To simplify this we need to find the integers that multiply to make 525 and those that multiply to make 2925. If there are any in common then we know from the construction of the rationals that \displaystyle \frac{x}{x}=1 and in particular we have that \displaystyle\frac{xy}{xz}=\frac{y}{z}*\frac{x}{x}=\frac{y}{z}.

Likewise suppose we wanted to add \displaystyle\frac{1}{4} and \displaystyle\frac{1}{7}. It is true that by definition of addition, we would have

$$\begin{equation*} \frac{1}{4}+\frac{1}{7}=\frac{1*7+1*4}{7*4}=\frac{7+4}{7*4}=\frac{11}{28} \end{equation*}$$

The key stage was \displaystyle\frac{1*7+1*4}{7*4}, breaking this down we see that

$$\begin{equation*} \frac{1*7+1*4}{7*4}=\frac{1*7}{7*4}+\frac{1*4}{7*4} \end{equation*}$$

In other words, we are finding a multiple in common with 7 and 4 to turn the denominator into. It is therefore worthwhile to work out the theory of working out common divisors and common multiples.
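The two motivating calculations can be reproduced with Python's standard library. This sketch anticipates the greatest common divisor, which is only defined later in this section, by using `math.gcd`; it is meant to show where the theory is heading and is not part of the formal development.

```python
from fractions import Fraction
from math import gcd

# Simplify 525/2925 by cancelling the largest factor the two numbers share.
g = gcd(525, 2925)
print(g, 525 // g, 2925 // g)    # 75 7 39

# Fraction performs the same cancellation automatically.
print(Fraction(525, 2925))       # 7/39

# Adding 1/4 and 1/7 over the common denominator 28.
print(Fraction(1, 4) + Fraction(1, 7))  # 11/28
```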

We will start by working out common divisors, by first making a definition.

::: definition Definition 146. Common divisor

Let a,b,c\in\mathbb{Z} be non-zero integers. We say that c is a common divisor of a and b if c\mid a and c\mid b. :::

::: example Example 92. Consider the integers 35 and 25. The divisors of 35 are 1, 5 and 7 and 35, likewise the divisors of 25 are 1 and 5 and 25. The largest common divisor is therefore 5. :::

::: example Example 93. Consider the integers 24 and 54. Doing the same as before, we can see that the divisors of 24 are 1, 2, 3, 4, 6, 8, 12 and 24. Looking at the divisors of 54 we see that they are 1, 2, 3, 6, 9, 18, 27 and 54.

The common divisors of 24 and 54 are therefore 1, 2, 3 and 6, the largest of which is 6. :::

::: example Example 94. Consider the common divisors of 3 and 5. The divisors of 3 are simply 1 and 3, likewise the divisors of 5 are 1 and 5. The only common divisor is 1. :::
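The listing of divisors in the three examples above can be automated. The following sketch restricts attention to positive divisors of non-zero integers; the function names `positive_divisors` and `common_divisors` are hypothetical.

```python
def positive_divisors(n: int) -> list:
    """All positive divisors of a non-zero integer n, in increasing order."""
    if n == 0:
        raise ValueError("n must be non-zero")
    return [d for d in range(1, abs(n) + 1) if n % d == 0]


def common_divisors(a: int, b: int) -> list:
    """Positive integers that divide both a and b."""
    return [d for d in positive_divisors(a) if b % d == 0]


print(common_divisors(35, 25))  # [1, 5]
print(common_divisors(24, 54))  # [1, 2, 3, 6]
print(common_divisors(3, 5))    # [1]
```

In each case the last entry of the returned list is the largest common divisor of the pair.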

We can see from the previous examples that there was a largest, or greatest common divisor between the pairs of integers in each case. We can show that for any two integers, there is always a greatest common divisor.

::: {#thm:NT_gcd_exists .theorem} Theorem 32. The greatest common divisor of two integers exists

Let a,b\in\mathbb{Z} so that a\neq 0 or b\neq 0. Then there exists d\in\mathbb{Z} so that d is the largest possible common divisor, that is there is no g\in\mathbb{Z} with g>d so that g\mid a and g\mid b.

Proof:

Firstly, we note that as 1\mid a and 1\mid b, the largest possible common divisor is at least 1, proving existence. To show that there is the largest possible common divisor we must show that this divisor can't exceed some integer, say M, where M depends on a and b. Moreover by proposition 101{reference-type="ref" reference="prop:NT_PositiveAndNegativeDivisorsForIntsExist"} we only need to consider the case where a\geq 0 and b\geq 0.

So suppose that c\mid a and c\mid b for some c\geq 1. By part 5. of proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"} we have that as c\mid a then c\leq a, likewise as c\mid b then c\leq b. There are three possibilities to consider

  1. $a=b$

  2. Without any loss of generality we have $a<b$

  3. One of a=0 or b=0 but not both at the same time.

  1. a=b:

    In this case we easily take M to be the largest divisor of a, or equivalently b, then $c\leq M$

  2. Without any loss of generality we have a<b:

    Without loss of generality, we take a<b, if this is not the case we simply swap the roles of a and b. In this case, we take M to be the largest divisor so that M\leq a. For if we took a M so that M\leq b then by the fact a<b we could have the case that M>a a contradiction to the fact that c\leq a as c\mid a.

  3. One of a=0 or b=0 but not both at the same time:

    Suppose that a=0 and b\neq 0, then we have that for all M\in\mathbb{Z} that M\mid a, but as c\mid b then c\leq b and so we take M=b as b\mid b. Likewise if we assume b=0 and a\neq 0.

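In each case we found an M so that if c\mid a and c\mid b then c\leq M. Hence the common divisors of a and b are bounded above by M, and as 1 is a common divisor there is a largest one. $\qed$ :::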

We have shown that for any two integers a greatest common divisor always exists. We can make a formal definition.

::: definition Definition 147. Greatest common divisor

Let a,b\in\mathbb{Z} so that a\neq 0 and b\neq 0. Let d\in\mathbb{Z} be such that d\mid a and d\mid b. We say the largest value of d where d\mid a and d\mid b is the greatest common divisor of a and b, denoted d=\mathop{\mathrm{GCD}}\left(a,b\right), sometimes written \gcd\left(a,b\right) and in some texts simply by \left(a,b\right).

As a\mid 0 for any non-zero integer a, we define \mathop{\mathrm{GCD}}\left(a,0\right)=a, similarly \mathop{\mathrm{GCD}}\left(0,b\right)=b. :::

We will use the notation \mathop{\mathrm{GCD}} in this text and we will usually abbreviate saying the greatest common divisor to \mathop{\mathrm{GCD}}. Although we have proved that the greatest common divisor exists, we do not yet actually have a method of calculating what it is other than trying through trial and error. To see how we can attempt to construct a method of finding \mathop{\mathrm{GCD}} we should look to cases where integer division does not fail and to cases where it does fail.
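The trial and error approach mentioned above can be written down directly. This is a deliberately naive sketch and not the method the text goes on to develop; the function name `gcd_by_trial` is hypothetical, and returning the absolute value when one argument is zero is an assumption made here so that the result is never negative.

```python
def gcd_by_trial(a: int, b: int) -> int:
    """Largest positive common divisor of a and b, found by testing every candidate."""
    if a == 0 and b == 0:
        raise ValueError("at least one of a and b must be non-zero")
    if a == 0:
        return abs(b)
    if b == 0:
        return abs(a)
    # No common divisor can exceed the smaller of |a| and |b|.
    for d in range(min(abs(a), abs(b)), 0, -1):
        if a % d == 0 and b % d == 0:
            return d
    return 1  # unreachable in practice, since d = 1 always divides both


print(gcd_by_trial(35, 25))  # 5
print(gcd_by_trial(24, 54))  # 6
print(gcd_by_trial(3, 5))    # 1
```

The loop may need as many steps as the smaller of the two inputs, which is why a better method is worth looking for.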

::: example Example 95. It is clear that 2\nmid 3 as there is no integer x so that 3=2x. If we take x=1 we get the false equality of 3=2, if we take x=2 we get another false equality of 3=4. We observe however that 3=2*1+1. :::

::: example Example 96. Let a=25 and b=7. It is clear that b\nmid a. The first couple multiples of 7 are 7=7*1, 14=7*2, 21=7*3, 28=7*4 and so on. However, we can see that 25=7*3+4. :::

::: example Example 97. Let a=36 and b=12. Clearly b\mid a as 36=12*3. The first couple multiples of 12 are 12=12*1, 24=12*2, 36=12*3, 48=12*4 and so on. :::

::: example Example 98. This time, let a=8 and b=2. Then we have that 2\mid 8 as 8=2*4. In a similar way to the previous examples we see that 8=2*4+0. :::

If we let a,b\in\mathbb{Z} so that b\nmid a then, in the previous examples it seems that we can always find a multiple of b so that bx\leq a for some x\in\mathbb{Z} and in particular we have that

$$\begin{equation*} a=bx+\left(a-bx\right) \end{equation*}$$

In the case that b\mid a we can choose x so that a-bx=0, whereas when b\nmid a we have a-bx\neq 0 for every x\in\mathbb{Z}. Hence a-bx is a measure of how far off we are from having b\mid a. This is to say that if a-bx>0 then the multiple bx falls a little short of a and if a-bx<0 then the multiple bx is a little over a.

In general, we can see that any integer division can be viewed in this way, that is if a,b\in\mathbb{Z} we can see the result of a divided by b in the form a=qb+r for some q,r\in\mathbb{Z}.

::: {#thm:NT_divAlg .theorem} Theorem 33. The division algorithm

Let a,b\in\mathbb{Z} so that b> 0, then there exist q,r\in\mathbb{Z} with q,r being unique so that

$$\begin{equation*} a=bq+r \end{equation*}$$

where $0\leq r < b$

Proof:

There are three cases to consider

  1. $a=b$

  2. $a<b$

  3. $a>b$

  1. a=b:

    If a=b then b\mid a holds trivially and we see that a=1*b+0 where q=1 and r=0.

  2. a<b:

    If a<b then we also see that trivially we have that a=0*b+a where q=0 and r=a.

  3. a>b:

    This case is the meat of the theorem. To prove the division theorem we will argue by strong induction on a. The base case is a=1 where we either have a=b or a<b which have been dealt with. So now suppose that the result holds for every integer from 1 up to some k\geq 1. As in the base case, we only need to consider the case of k+1>b, or equivalently $b<k+1$

    As b<k+1 and b\geq 1 we have that 1\leq \left(k+1\right)-b\leq k and so by the induction hypothesis we have that there are integers q,r\in\mathbb{Z} so that

    $$\begin{equation*} \left(k+1\right)-b=bq+r \end{equation*}$$

    where 0\leq r< b. From this, we clearly get k+1=bq'+r where q'=1+q which shows the induction step. The result now follows by induction.

Now that the existence has been shown, it is left to show the uniqueness of q and r. So suppose that q_1,r_1 and q_2,r_2 are two such pairs that satisfy the conditions of the theorem, so that a=bq_1+r_1=bq_2+r_2. Firstly suppose that r_1\neq r_2; then, without loss of generality, r_1<r_2 so that 0<r_2-r_1<b, and subtracting the two expressions for a gives

$$\begin{equation*} r_2-r_1=b\left(q_1-q_2\right) \end{equation*}$$

which implies that b \mid\left(r_2-r_1\right). This is a contradiction to proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"} part 5. as this part implies that b\leq r_2-r_1. Therefore r_1=r_2 and from r_1=r_2 we have that 0=b\left(q_1-q_2\right) and by part {reference-type="ref" reference="part1"} proposition 69{reference-type="ref" reference="prop:IntegersHaveNoZeroDivisors"} as b>0 then q_1-q_2=0 giving q_1=q_2. $\qed$ :::

Based on this theorem we make a definition.

::: definition Definition 148. Quotient and remainder

Let a,b\in\mathbb{Z} so that b>0. We have by the division algorithm that

$$\begin{equation*} a=qb+r \end{equation*}$$

where q,r\in\mathbb{Z} and 0\leq r < b. We say that q is the quotient of the division and that r is the remainder. :::

In the theorem, we assumed that b>0. However by proposition 101{reference-type="ref" reference="prop:NT_PositiveAndNegativeDivisorsForIntsExist"} we know that negative divisors are also valid. To resolve this we reformulate theorem 33{reference-type="ref" reference="thm:NT_divAlg"} so that 0\leq r <\left|b\right|.

::: {#thm:NT_divAlg_ext .theorem} Theorem 34. The division algorithm (Extended)

Let a,b\in\mathbb{Z} so that b\neq 0, then there exist q,r\in\mathbb{Z} with q,r being unique so that

$$\begin{equation*} a=bq+r \end{equation*}$$

where $0\leq r < \left|b\right|$

Proof:

By the division algorithm, theorem 33{reference-type="ref" reference="thm:NT_divAlg"} we have for \left|a\right| and \left|b\right| that there exist unique q,r\in\mathbb{Z} so that

$$\begin{equation*} \left|a\right|=q\left|b\right|+r \end{equation*}$$

where 0\leq r<\left|b\right|. There are a few cases to consider.

  1. $r=0$

  2. r>0 and $a\geq 0$

  3. r>0 and $a<0$

  1. r=0:

    If r=0, then \left|a\right|=q\left|b\right| and so by the properties of the absolute value we have that a=\pm qb, hence a=b\left(\pm q\right) and we have the result.

  2. r>0 and a\geq 0:

    Now suppose r>0 and a\geq 0. We hence have that a=q\left|b\right|+r which gives

    $$\begin{align*} a&=bq+r,\ \text{if } b>0\\ a&=\left(-b\right)q+r,\ \text{if } b<0 \end{align*}$$

    The first is simply the first version of the division algorithm and the second can be written as a=b\left(-q\right)+r which gives the result.

  3. r>0 and a<0:

    Finally if r>0 and a<0 then we have

    $$\begin{equation*} -a=\left|b\right|q+r \Rightarrow a=-\left|b\right|q-r \end{equation*}$$

    This is a problem as it would give a negative remainder. We can employ a trick that doesn't change the value of a but allows us to express a=-\left|b\right|q-r in a more suitable form.

    $$\begin{align*} a&=-\left|b\right|q-r\\ a&=-\left|b\right|q+\left(\left|b\right|-\left|b\right|\right)-r\\ a&=-\left|b\right|q-\left|b\right|+\left(\left|b\right|-r\right)\\ a&=\left|b\right|\left(-1-q\right)+\left(\left|b\right|-r\right) \end{align*}$$

    By assumption we have that 0<r<\left|b\right| implies that 0<\left|b\right|-r<\left|b\right|, so we re-write the above as

    $$\begin{equation*} a=bq'+r' \end{equation*}$$

    where r'=\left|b\right|-r and q'=-1-q, if b>0 and for b<0 we write q'=1+q.

This completes the proof. $\qed$ :::
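
The extended division algorithm can be tried out numerically. The following is a minimal sketch in Python; the function name `division_algorithm` is our own choice and is not part of the text. It relies on Python's built-in `divmod`, whose remainder takes the sign of the divisor, and then shifts the remainder into the range 0\leq r<\left|b\right| in the same spirit as the trick used in case 3 of the proof.

```python
def division_algorithm(a, b):
    """Return (q, r) with a == b*q + r and 0 <= r < |b|, for b != 0."""
    if b == 0:
        raise ValueError("the divisor b must be non-zero")
    # divmod floors the quotient, so r has the same sign as b (or is 0).
    q, r = divmod(a, b)
    if r < 0:
        # Only happens when b < 0: add |b| to r and compensate in q.
        q, r = q + 1, r - b
    return q, r


for a, b in [(56, 24), (7, -2), (-7, 2), (-7, -2)]:
    q, r = division_algorithm(a, b)
    print(f"{a} = {b}*{q} + {r}")
```

In every case the printed remainder is non-negative and smaller than \left|b\right|, as the theorem requires.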

We can now go back to a problem from the first section, namely showing that 2n+1 must be odd

::: {#prop:NT_Odd_iff_2n+1 .proposition} Proposition 104. Integer is odd if and only if it is of the form $2n+1$

Let x\in\mathbb{Z}. We have that x is odd if and only if x=2n+1 for some n\in\mathbb{Z}.

Proof:

Suppose x\in\mathbb{Z}, then by the division algorithm we have that

$$\begin{equation*} x=2q+r \end{equation*}$$

where 0\leq r< \left|2\right|. Hence the only remainders possible are r=0 or r=1, and so either x=2q or x=2q+1. In the first case we have x=2q is even by definition. In the case x=2q+1, the uniqueness of the remainder gives 2\nmid x and so x can't be even by definition. It follows that x is odd exactly when x=2q+1 for some q\in\mathbb{Z}. $\qed$ :::

With this proposition and proposition 99{reference-type="ref" reference="prop:NT_even_iff_2n"} we can derive the evenness or oddness when adding or multiplying even or odd integers.

::: proposition Proposition 105. Even and oddness for addition and multiplication

Let x,y\in\mathbb{Z}. We have that

  1. If x is even and y is even then x+y is even and xy is even.

  2. If x is even and y is odd then x+y is odd and xy is even.

  3. If x is odd and y is even then x+y is odd and xy is even.

  4. If x is odd and y is odd then x+y is even and xy is odd.

Proof:

  1. If x is even and y is even then x+y is even and xy is even:

    Suppose that x and y are even, then by proposition 99{reference-type="ref" reference="prop:NT_even_iff_2n"} we have x=2n for some n\in\mathbb{Z} and y=2m for some m\in\mathbb{Z}. We have that x+y=2n+2m=2\left(n+m\right) hence x+y is even by proposition 99{reference-type="ref" reference="prop:NT_even_iff_2n"}. Likewise, we have that xy=2n*2m=2\left(2nm\right) and therefore xy is even.

  2. If x is even and y is odd then x+y is odd and xy is even:

    Suppose that x is even and y is odd. By proposition 99{reference-type="ref" reference="prop:NT_even_iff_2n"} we have that x=2n for some n\in\mathbb{Z} and by proposition 104{reference-type="ref" reference="prop:NT_Odd_iff_2n+1"} we have that y=2m+1 for some m\in\mathbb{Z}.

    We have x+y=2n+2m+1=2\left(n+m\right)+1 and so x+y is odd by proposition 104{reference-type="ref" reference="prop:NT_Odd_iff_2n+1"}. Additionally, xy=2n\left(2m+1\right)=2\left(2mn+n\right) and so by proposition 99{reference-type="ref" reference="prop:NT_even_iff_2n"} we have that xy is even.

  3. If x is odd and y is even then x+y is odd and xy is even:

    Similar to above, swapping the roles of x and y.

  4. If x is odd and y is odd then x+y is even and xy is odd:

    By proposition 104{reference-type="ref" reference="prop:NT_Odd_iff_2n+1"} we have that x=2n+1 for some n\in\mathbb{Z} and y=2m+1 for some m\in\mathbb{Z}.

    Now, x+y=\left(2n+1\right)+\left(2m+1\right)=2\left(n+m\right)+2=2\left(\left(n+m\right)+1\right). So by proposition 99{reference-type="ref" reference="prop:NT_even_iff_2n"} we have x+y is even.

    Finally, xy=\left(2n+1\right)\left(2m+1\right)=4nm+2n+2m+1=2\left(2nm+\left(n+m\right)\right)+1 and so by proposition 104{reference-type="ref" reference="prop:NT_Odd_iff_2n+1"} is odd.

As required. $\qed$ :::
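
As a sanity check, the four parity rules can be tested over a small range of integers. This is only an illustrative sketch in Python (the helper names `is_even` and `parity_rules_hold` are ours); it does not replace the proof, it simply confirms the table on finitely many cases.

```python
def is_even(n):
    """An integer is even exactly when 2 divides it."""
    return n % 2 == 0


def parity_rules_hold(x, y):
    """Check the parity rules for one pair (x, y).

    x + y should be even exactly when x and y have the same parity,
    and x * y should be even exactly when at least one of them is even.
    """
    sum_should_be_even = is_even(x) == is_even(y)
    product_should_be_even = is_even(x) or is_even(y)
    return (is_even(x + y) == sum_should_be_even
            and is_even(x * y) == product_should_be_even)


print(all(parity_rules_hold(x, y)
          for x in range(-10, 11) for y in range(-10, 11)))
```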

We now continue with our quest to find a method to compute the greatest common divisor. At first, it might seem that we haven't made much progress in finding a way to calculate the \mathop{\mathrm{GCD}}. However, consider the following examples.

::: example Example 99. Consider a=56 and b=24. By the division algorithm, we have that 56=2*24+8. Now what about a=24 and b=8? Again, by the division algorithm, we have that 24=3*8+0.

Now, the divisors of 56 are 1, 2, 4, 7, 8, 14, 28 and 56, the divisors of 24 are 1, 2, 3, 4, 6, 8, 12 and 24. The largest common divisor was 8, which was the remainder after the first use of the division algorithm. Likewise, it was the divisor in the second application of the division algorithm. :::

::: example Example 100. Consider a=4947 and b=1552. By the division algorithm, we have that 4947=3*1552+291. Applying the division algorithm to a=1552 and b=291 gives 1552=5*291+97. A third application of the division algorithm to a=291 and b=97 gives 291=3*97+0.

Unlike with the previous example, there may be potentially too many divisors for 4947 to list them out by trying each integer 0<x\leq 4947. The same is true for 1552. However, if we follow the same logic as the previous example we might suspect that 97 is the greatest common divisor, as by the division algorithm for a=4947 and b=97 we get 4947=51*97+0. Applying the division algorithm to a=1552 and b=97 gives 1552=16*97+0. :::

Based on these two examples we might be tempted to make a conjecture on how we can potentially calculate the \mathop{\mathrm{GCD}}. A further example is needed.

::: example Example 101. Let a=574 and b=34. By the division algorithm, we have that 574=16*34+30. Applying the algorithm again to a=34 and b=30 gives 34=1*30+4. Another application gives 30=7*4+2 and finally a last application gives 4=2*2.

Now, applying the division algorithm to 574 and 2 gives 574=287*2+0 and applying it to 34 and 2 gives 34=17*2+0. So we suspect that \mathop{\mathrm{GCD}}\left(574,34\right)=2. :::

If what we suspect is true, then repeated applications of the division algorithm might provide a way to compute the greatest common divisor of any two integers. We can provide more evidence that this must be the case by considering the examples in reverse.

::: example Example 102. Consider a=56 and b=24. We saw that applying the division algorithm twice gave us that

$$\begin{align*} 56&=2*24+8\\ 24&=3*8 \end{align*}$$

By substituting 24=3*8 into 56=2*24+8 we get

$$\begin{align*} 56&=2*24+8\\ 56&=2\left(3*8\right)+8\\ 56&=6*8+8\\ 56&=7*8 \end{align*}$$

And hence by the definition of divisibility 8\mid 56, likewise by 24=3*8 we have that 8\mid 24.

Now, suppose that d is a common divisor of 56 and 24. As d\mid 56 we have that d\mid\left(2*24+8\right), and as d\mid 24 it follows that d\mid 8. Clearly the largest d with d\mid 8 is 8 itself. :::

::: example Example 103. In the example where a=4947 and b=1552. We saw that applying the division algorithm three times gave us

$$\begin{align*} 4947&=3*1552+291\\ 1552&=5*291+97\\ 291&=3*97+0 \end{align*}$$

By substituting 291=3*97 into 1552=5*291+97 we get

$$\begin{align*} 1552&=5*291+97\\ 1552&=5\left(3*97\right)+97\\ 1552&=15*97+97\\ 1552&=16*97 \end{align*}$$

Which gives us that 97\mid 1552. Now substituting 1552=16*97 and 291=3*97 into 4947=3*1552+291 yields

$$\begin{align*} 4947&=3*1552+291\\ 4947&=3\left(16*97\right)+3*97\\ 4947&=48*97+3*97\\ 4947&=51*97 \end{align*}$$

Showing 97\mid 4947. Now as in the previous example, suppose that d is a common divisor of 4947 and 1552. As d\mid 4947 and d\mid 1552 then d\mid\left(3*1552+291\right) which gives d\mid 291. Applying similar logic, we see that as d\mid 1552 and d\mid 291 then d\mid\left( 5*291+97\right) and so d\mid 97. The largest such d satisfying this is d=97. :::

It is therefore clear that for integers a and b repeated applications of the division algorithm on b and the remainder r give a candidate for the greatest common divisor. Substituting this candidate back through the equations generated by each use of the division algorithm shows that it is the largest common divisor of a and b. Hence, informally, we have found the method for computing the \mathop{\mathrm{GCD}}! It is left to formalise this discovery.

From working the examples in reverse we have an important proposition that will be crucial for proving the result. Namely that the greatest common divisor of a and b is also equal to the greatest common divisor of b and r where r is the remainder from the division algorithm.

::: {#prop:NT_GCD_of_ints_is_GCD_of_divisor_and_remainder .proposition} Proposition 106. $\mathop{\mathrm{GCD}}\left(a,b\right)=\mathop{\mathrm{GCD}}\left(b,r\right)$

Let a,b\in\mathbb{Z} so that b\neq 0. By the division algorithm we have that a=qb+r where q,r\in\mathbb{Z} and 0\leq r<\left|b\right|.

We have that

$$\begin{equation*} \mathop{\mathrm{GCD}}\left(a,b\right)=\mathop{\mathrm{GCD}}\left(b,r\right) \end{equation*}$$

Proof:

Let d=\mathop{\mathrm{GCD}}\left(a,b\right). By definition of the greatest common divisor, we have that d\mid a and d\mid b. By the division algorithm we have that a=qb+r where q,r\in\mathbb{Z} and 0\leq r<\left|b\right|.

Hence as d\mid a then d\mid\left(qb+r\right). Now, as r=a-qb then d\mid r. Hence by definition of the greatest common divisor, we must have that d\leq\mathop{\mathrm{GCD}}\left(b,r\right) as d is a common divisor of b and r.

Now suppose that g=\mathop{\mathrm{GCD}}\left(b,r\right) then g\mid b and g\mid r. However, by proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"} part 3. as g\mid b and g\mid r then \forall x,y\in\mathbb{Z} we have that g\mid\left(bx+yr\right). In particular, we have that g\mid\left(qb+r\right). But if g\mid\left(qb+r\right) then as a= qb+r we have that g\mid a.

Therefore we have that g\leq \mathop{\mathrm{GCD}}\left(a,b\right). Combining the two directions gives us that

$$\begin{align*} d&=\mathop{\mathrm{GCD}}\left(a,b\right)\leq \mathop{\mathrm{GCD}}\left(b,r\right)\\ g&=\mathop{\mathrm{GCD}}\left(b,r\right)\leq \mathop{\mathrm{GCD}}\left(a,b\right) \end{align*}$$

That is, d\leq g and g\leq d which is true if and only if d=g. Which is to say \mathop{\mathrm{GCD}}\left(a,b\right)=\mathop{\mathrm{GCD}}\left(b,r\right). As required. $\qed$ :::
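
The proposition can be checked on the earlier examples without assuming any method for computing the \mathop{\mathrm{GCD}}. The sketch below, in Python, finds the greatest common divisor by listing all common divisors, exactly as in example 99; the function names `common_divisors` and `gcd_by_listing` are illustrative choices of ours, and we assume a>0 and b>0 so that Python's `%` gives the remainder of the division algorithm.

```python
def common_divisors(a, b):
    """Positive common divisors of a and b (not both zero), by trial division."""
    bound = max(abs(a), abs(b))
    return [d for d in range(1, bound + 1) if a % d == 0 and b % d == 0]


def gcd_by_listing(a, b):
    """Greatest common divisor taken directly from the definition."""
    return max(common_divisors(a, b))


for a, b in [(56, 24), (4947, 1552), (574, 34)]:
    r = a % b  # remainder from the division algorithm, since b > 0
    print(a, b, r, gcd_by_listing(a, b), gcd_by_listing(b, r))
```

For each pair the last two printed values agree, as the proposition predicts.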

We are almost ready to formalise the process of computing the greatest common divisor. The last step to show is that repeatedly applying the division algorithm doesn't result in a process that never ends. We have for integers a and b that the division algorithm gives a=qb+r where 0\leq r<\left|b\right|. Another application applied to b and r would give b=q'r+\Tilde{r} where we have 0\leq \Tilde{r}<\left|r\right|<\left|b\right|.

Clearly then, applying multiple stages of the division algorithm will always cause the remainder at each stage to decrease, and by the condition that 0\leq r <\left|b\right| this process ultimately will give a remainder of 0. For if not, we would have an endless strictly decreasing chain of positive integers below \left|b\right|, which is impossible. We formally prove this result.

::: {#prop:NT_EuclidAlgor_Terminates .proposition} Proposition 107. Remainders from multiple applications of division algorithm decrease to $0$

Let a,b\in\mathbb{Z} with b\neq 0. Consider the result of the division algorithm on a,b, i.e

$$\begin{equation*} a=qb+r,\ 0\leq r< \left|b\right| \end{equation*}$$

Likewise consider applying the division algorithm to b and r to get

$$\begin{equation*} b=\Tilde{q}r+\Tilde{r},\ 0\leq \Tilde{r} < r \end{equation*}$$

If we continually apply this process we have that the remainder is eventually zero.

Proof:

By proposition 106{reference-type="ref" reference="prop:NT_GCD_of_ints_is_GCD_of_divisor_and_remainder"}, we know that \mathop{\mathrm{GCD}}\left(a,b\right)=\mathop{\mathrm{GCD}}\left(b,r\right) where r is the remainder from the division algorithm and 0\leq r < \left|b\right|.

Applying the division algorithm to b and r gives us again, by proposition 106{reference-type="ref" reference="prop:NT_GCD_of_ints_is_GCD_of_divisor_and_remainder"} that \mathop{\mathrm{GCD}}\left(b,r\right)=\mathop{\mathrm{GCD}}\left(r,r_1\right) where 0\leq r_1 < \left|r\right|.

Continuing in this fashion for n applications we get the chain of inequalities

$$\begin{equation*} 0\leq r_n <\left|r_{n-1}\right|<\left|r_{n-2}\right|<\dots <\left|r_2\right|<\left|r_1\right|<\left|r\right| \end{equation*}$$

Now, for any integers x,y\in\mathbb{Z}, where x\geq 0 and y\geq 0, we have that the largest value of x so that x<y is given by x=y-1. Hence, in the chain of inequalities for the remainder, the smallest decrease from one remainder to the next is 1 and hence there can only be at most r such decreases. If there were more than r decreases, then at the $n$-th application we would have r_n<0 a contradiction to the division algorithm.

This bounds the length of the chain of inequalities to be at most r and therefore we eventually get to 0 as required. $\qed$ :::
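
To see the chain of remainders concretely, the following Python sketch records each remainder produced by repeated division. The name `remainder_chain` is our own, and we assume b>0 so that `a % b` is the non-negative remainder from the division algorithm.

```python
def remainder_chain(a, b):
    """Remainders r, r_1, r_2, ... from repeated division, assuming b > 0."""
    chain = []
    while b != 0:
        a, b = b, a % b   # divide, then continue with (divisor, remainder)
        chain.append(b)
    return chain


print(remainder_chain(574, 34))     # the remainders from example 101
print(remainder_chain(4947, 1552))  # the remainders from example 100
```

Each list is strictly decreasing and ends in 0, and its length never exceeds the bound discussed in the proof.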

We can now formalise the process for computing the greatest common divisor using repeated applications of the division algorithm.

::: {#thm:NT_EuclidAlgor .theorem} Theorem 35. The Euclidean algorithm

Let a,b\in\mathbb{Z} so that b\neq 0, and suppose that \left|a\right|\geq \left|b\right|. Let x,y\in\mathbb{Z} so that x=a and y=b. We have that following these steps computes the greatest common divisor of a and b.

  1. Let d=\mathop{\mathrm{GCD}}\left(x,y\right). If y=0 then d=\left|x\right| and there is nothing more to do.

  2. Otherwise, y\neq 0 so use the division algorithm to write x=qy+r where 0\leq r <\left|y\right|.

  3. Replace x by y and y by r; by the division algorithm the new values satisfy \left|x\right|>\left|y\right|.

  4. Go back to step 1. and repeat until y=0.

Following these steps gives us that d=\mathop{\mathrm{GCD}}\left(a,b\right) is the absolute value of x after these steps have been performed. This is to say we have that $d=\mathop{\mathrm{GCD}}\left(a,b\right)=\left|x\right|$

Proof:

Let a,b\in\mathbb{Z} be as stated in the theorem. Let x=a and y=b. By the division algorithm we know that a=qb+r for some q,r\in\mathbb{Z} where 0\leq r<\left|b\right|. Moreover by proposition 106{reference-type="ref" reference="prop:NT_GCD_of_ints_is_GCD_of_divisor_and_remainder"} we have that \mathop{\mathrm{GCD}}\left(a,b\right)=\mathop{\mathrm{GCD}}\left(b,r\right).

By this proposition, we are therefore looking for the value of \mathop{\mathrm{GCD}}\left(b,r\right). By proposition 107{reference-type="ref" reference="prop:NT_EuclidAlgor_Terminates"} we know that the chain of remainders that are generated by repeatedly using the division algorithm must eventually be 0. Hence at some point, we are computing \mathop{\mathrm{GCD}}\left(r_n,0\right) after some step n. The value of \mathop{\mathrm{GCD}}\left(r_n,0\right)=r_n. Which is the required greatest common divisor. $\qed$ :::
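
The steps of the Euclidean algorithm translate almost directly into a loop. Below is a minimal Python sketch under the following assumptions: the name `euclid_gcd` is ours, and we use Python's `%` operator in place of the division algorithm. When y<0 the value `x % y` is not the non-negative remainder of the theorem, but its absolute value is still smaller than \left|y\right|, so the loop still terminates, and we take an absolute value at the end so the result is positive.

```python
def euclid_gcd(a, b):
    """Greatest common divisor of a and b (not both zero) by repeated division."""
    x, y = a, b
    while y != 0:
        # Replace (x, y) by (y, remainder of x divided by y).
        x, y = y, x % y
    return abs(x)


print(euclid_gcd(56, 24))
print(euclid_gcd(4947, 1552))
print(euclid_gcd(574, 34))
```

These reproduce the values 8, 97 and 2 that we suspected in examples 99, 100 and 101.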

Theorem 35{reference-type="ref" reference="thm:NT_EuclidAlgor"} has shown that we can calculate the greatest common divisor for any integers a,b\in\mathbb{Z} where b\neq 0. With this theorem, we can now assume that whenever d=\mathop{\mathrm{GCD}}\left(a,b\right) is stated we know the value of d by applying this algorithm. We can now consider properties of the \mathop{\mathrm{GCD}}. One such example is \mathop{\mathrm{GCD}}\left(ma,mb\right) for some m\in\mathbb{Z}. Clearly if d=\mathop{\mathrm{GCD}}\left(a,b\right) then md\mid ma and md\mid mb, so md is a common divisor of ma and mb. As we will see, it turns out that for m>0 we in fact have md=\mathop{\mathrm{GCD}}\left(ma,mb\right). Another property is a particular application of proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"} part 3.

We know from part 3. that if a\mid b and a\mid c then for all integers x,y\in\mathbb{Z} that a\mid\left(bx+cy\right). Now suppose that d=\mathop{\mathrm{GCD}}\left(a,b\right), then by definition we have that d\mid a and d\mid b then d\mid\left(ax+by\right) for any x,y\in\mathbb{Z}. By the definition of divisibility, we have that ax+by=cd for some c\in\mathbb{Z}. The question now is, is it possible to have c=1?

As it turns out the answer is yes.

::: {#thm:NT_bezout_id .theorem} Theorem 36. Bézout's Identity

Let a,b\in\mathbb{Z} so that b\neq 0 and consider d=\mathop{\mathrm{GCD}}\left(a,b\right). Then, there exists x,y\in\mathbb{Z} so that

$$\begin{equation*} d=ax+by \end{equation*}$$

Proof:

Let a,b\in\mathbb{Z} be as given and let d=\mathop{\mathrm{GCD}}\left(a,b\right). By proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"} part 3. we have that as d\mid a and d\mid b then we have that for all x,y\in\mathbb{Z} that d\mid\left(ax+by\right).

Let S denote the set of all such ax+by, that is

$$\begin{equation*} S=\left\{ax+by:x,y\in\mathbb{Z}\right\} \end{equation*}$$

Now, it is clear that there are s\in S where s<0 and s\in S where s>0. Moreover, we clearly have 0\in S as we can take x=0 and y=0.

Now consider the set \Tilde{S} given by

$$\begin{equation*} \Tilde{S}=\left\{s\in S: s>0\right\} \end{equation*}$$

We have by definition of \Tilde{S} that \forall s \in \Tilde{S} that s>0 and so \Tilde{S}\subset\mathbb{N}. Hence by the well-ordering principle, theorem 18{reference-type="ref" reference="thm:WOP"}, there is a smallest element, say \Bar{s}. By definition of being an element of \Tilde{S} we have that \Bar{s}=ax_0+by_0 for some x_0,y_0\in\mathbb{Z}, where x_0,y_0 each have a fixed value.

We show that \Bar{s}\mid a and \Bar{s}\mid b. Suppose instead that \Bar{s}\nmid a, then by the division algorithm we have that a=q\Bar{s}+r where 0<r<\left|\Bar{s}\right|. It hence follows that

$$\begin{align*} a&=q\Bar{s}+r\\ r&=a-q\Bar{s}\\ r&=a-q\left(ax_0+by_0\right)\\ r&=a-qax_0-qby_0\\ r&=a\left(1-qx_0\right)+b\left(-qy_0\right) \end{align*}$$

This gives us that r\in\Tilde{S}, as r>0 and r is of the form ax+by. We know by the division algorithm that 0<r<\left|\Bar{s}\right|, hence r<\Bar{s}, which contradicts \Bar{s} being the smallest element of \Tilde{S}. Meaning that \Bar{s}\nmid a is false so it must be the case that \Bar{s}\mid a. A similar argument shows that \Bar{s}\mid b.

Now, we have that d=\mathop{\mathrm{GCD}}\left(a,b\right) and so a=md and b=nd for some n,m\in\mathbb{Z}. Moreover, we have that \Bar{s}=ax_0+by_0. So we have that

$$\begin{align*} \Bar{s}&=ax_0+by_0\\ \Bar{s}&=\left(md\right)x_0+\left(nd\right)y_0\\ \Bar{s}&=d\left(mx_0+ny_0\right) \end{align*}$$

Hence by the definition of divisibility, we conclude that d\mid\Bar{s}. Applying part 5. of proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"} we have that d\leq \Bar{s}. But \Bar{s} is itself a common divisor of a and b, and d is the greatest common divisor of a and b, so we can't have d< \Bar{s}; it follows that d=\Bar{s} as required. $\qed$ :::
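
The proof above shows that suitable x and y exist but does not tell us how to find them. One common way to compute them, which is not the argument used in the proof, is to carry two extra pairs of coefficients through the Euclidean algorithm (often called the extended Euclidean algorithm). The Python sketch below is written under that assumption; the function name `bezout` and its variable names are our own.

```python
def bezout(a, b):
    """Return (d, x, y) with d = GCD(a, b) > 0 and d == a*x + b*y.

    Assumes a and b are not both zero. Throughout the loop we keep
    old_r == a*old_x + b*old_y and r == a*x + b*y.
    """
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:
        # Flip all signs so that the returned divisor is positive.
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


d, x, y = bezout(56, 24)
print(d, x, y, 56 * x + 24 * y)
```

For a=56 and b=24 this gives d=8 with x=1 and y=-2, and indeed 56*1+24*\left(-2\right)=8.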

We now note the more standard properties of the greatest common divisor.

::: {#prop:NT_GCD_properties .proposition} Proposition 108. Properties of the greatest common divisor

Let a,b\in\mathbb{Z} with b\neq 0. We have the following properties of the \mathop{\mathrm{GCD}} hold.

  1. $\mathop{\mathrm{GCD}}\left(a,a\right)=a$

  2. $\mathop{\mathrm{GCD}}\left(a,b\right)=\mathop{\mathrm{GCD}}\left(b,a\right)$

  3. Let D be the set of all common divisors of a and b. then \forall d\in D we have that $d\mid\mathop{\mathrm{GCD}}\left(a,b\right)$

  4. We have that \mathop{\mathrm{GCD}}\left(a,b\right) is the smallest positive value of ax+by where x,y\in\mathbb{Z}, so that $\mathop{\mathrm{GCD}}\left(a,b\right)=ax+by$

  5. Let m\in\mathbb{Z} with m>0, then $\mathop{\mathrm{GCD}}\left(am,bm\right)=m\mathop{\mathrm{GCD}}\left(a,b\right)$

  6. If d\mid a and d\mid b where d\in\mathbb{Z} and d>0 then $\displaystyle\mathop{\mathrm{GCD}}\left(\frac{a}{d},\frac{b}{d}\right)=\frac{1}{d}\mathop{\mathrm{GCD}}\left(a,b\right)$

  7. If \mathop{\mathrm{GCD}}\left(a,b\right)=d then $\displaystyle\mathop{\mathrm{GCD}}\left(\frac{a}{d},\frac{b}{d}\right)=1$

Proof:

  1. \mathop{\mathrm{GCD}}\left(a,a\right)=a:

    Clearly, we have that a\mid a. Now by proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"} part 5. We have that if a\mid a with a>0 then a\leq a. Hence a is the largest such divisor so $\mathop{\mathrm{GCD}}\left(a,a\right)=a$

  2. \mathop{\mathrm{GCD}}\left(a,b\right)=\mathop{\mathrm{GCD}}\left(b,a\right):

    This is trivial. If d=\mathop{\mathrm{GCD}}\left(a,b\right) then d is the largest common divisor of a and b.

  3. Let D be the set of all common divisors of a and b. then \forall d\in D we have that d\mid\mathop{\mathrm{GCD}}\left(a,b\right):

    Let D be defined as above, then

    $$\begin{equation*} D=\left\{x\in\mathbb{Z}: x>0\text{ and } x\mid a \text{ and } x\mid b\right\} \end{equation*}$$

    Then by definition of D we have that every d\in D is a common divisor of a and b. By Bézout's identity, theorem 36{reference-type="ref" reference="thm:NT_bezout_id"}, there exist x,y\in\mathbb{Z} so that \mathop{\mathrm{GCD}}\left(a,b\right)=ax+by. As d\mid a and d\mid b, proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"} part 3. gives d\mid\left(ax+by\right), that is d\mid\mathop{\mathrm{GCD}}\left(a,b\right).

  4. We have that \mathop{\mathrm{GCD}}\left(a,b\right) is the smallest positive value of ax+by where x,y\in\mathbb{Z}, so that \mathop{\mathrm{GCD}}\left(a,b\right)=ax+by:

    This follows from the proof of theorem 36{reference-type="ref" reference="thm:NT_bezout_id"}. For if it were not we would have a contradiction.

  5. Let m\in\mathbb{Z} with m>0, then \mathop{\mathrm{GCD}}\left(am,bm\right)=m\mathop{\mathrm{GCD}}\left(a,b\right):

    By the previous part we have that \mathop{\mathrm{GCD}}\left(a,b\right) is the smallest positive element of the set

    $$\begin{equation*} S=\left\{ax+by:x,y\in\mathbb{Z}\right\} \end{equation*}$$

    Let s\in S denote the smallest positive such ax+by, that is s=ax+by and s=\mathop{\mathrm{GCD}}\left(a,b\right).

    As s=\mathop{\mathrm{GCD}}\left(a,b\right) then s\mid a and s\mid b. As s\mid a then a=ks for some k\in\mathbb{Z} and so am=k\left(ms\right) which is to say ms\mid am. Likewise as s\mid b then b=ls for some l\in\mathbb{Z} and hence bm=l\left(ms\right) giving ms\mid bm.

    Now as s=ax+by then we have that ms=m\left(ax+by\right)=a\left(mx\right)+b\left(my\right). Moreover, as s\in S is the smallest positive such ax+by and m>0, then m\left(ax+by\right) will be the smallest positive element of the set

    $$\begin{equation*} \Tilde{S}=\left\{amx+bmy:x,y\in\mathbb{Z}\right\} \end{equation*}$$

    Hence we have that amx+bmy=\mathop{\mathrm{GCD}}\left(am,bm\right)=ms=m*\mathop{\mathrm{GCD}}\left(a,b\right).

  6. If d\mid a and d\mid b where d\in\mathbb{Z} and d>0 then $\displaystyle\mathop{\mathrm{GCD}}\left(\frac{a}{d},\frac{b}{d}\right)=\frac{1}{d}\mathop{\mathrm{GCD}}\left(a,b\right)$

    Let a,b,d\in\mathbb{Z} so that d\mid a and d\mid b. As d\mid a then we have that \displaystyle\frac{a}{d}\in\mathbb{Z}, likewise as d\mid b then \displaystyle\frac{b}{d}\in\mathbb{Z}. The result now follows by applying the previous part.

  7. If \mathop{\mathrm{GCD}}\left(a,b\right)=d then \displaystyle\mathop{\mathrm{GCD}}\left(\frac{a}{d},\frac{b}{d}\right)=1:

    This follows by the previous part.

Concluding the proof. $\qed$ :::
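
Several of these properties are easy to confirm numerically. The sketch below uses Python's standard `math.gcd` and assumes a, b and m are all positive; the function name `check_gcd_properties` is ours and the check is an illustration on particular values, not a proof.

```python
from math import gcd


def check_gcd_properties(a, b, m):
    """Check parts 1, 2, 3, 5, 6 and 7 for positive integers a, b and m."""
    g = gcd(a, b)
    # All positive common divisors of a and b, found by trial division.
    common = [d for d in range(1, min(a, b) + 1) if a % d == 0 and b % d == 0]
    return all([
        gcd(a, a) == a,                                       # part 1
        gcd(a, b) == gcd(b, a),                               # part 2
        all(g % d == 0 for d in common),                      # part 3
        gcd(a * m, b * m) == m * g,                           # part 5
        all(gcd(a // d, b // d) == g // d for d in common),   # part 6
        gcd(a // g, b // g) == 1,                             # part 7
    ])


print(check_gcd_properties(56, 24, 5))
print(check_gcd_properties(4947, 1552, 3))
```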

We have talked a lot about the greatest common divisor but nothing about the least common multiple. As with common divisors, we start by making a definition of a common multiple.

::: definition Definition 149. Common multiple

Let a,b,m\in\mathbb{Z} so that a\mid m and b\mid m. We say that m is a common multiple of a and b. :::

::: example Example 104. Let a=2, b=4 and m=8. We have that 2\mid 8 and 4\mid 8 and so 8 is a common multiple of 2 and 4. In fact, 4 is also a common multiple of 2 and 4. :::

::: example Example 105. Let a=4 and b=14. Listing multiples of 4 we have 4, 8, 12, 16, 20, 24, 28, 32 and so on. Doing a similar procedure for 14 we see we have 14, 28, 42 and so on. We see that 28 is a common multiple of 4 and 14. :::

::: example Example 106. Consider a=24 and b=54. Listing the first ten multiples of a and b we have

$$\begin{align*} &24,\ 48,\ 72,\ 96,\ 120,\ 144,\ 168,\ 192,\ 216,\ 240,\ \dots\\ &54,\ 108,\ 162,\ 216,\ 270,\ 324,\ 378,\ 432,\ 486,\ 540,\ \dots \end{align*}$$

The first common multiple is 216. Interestingly, we saw that \mathop{\mathrm{GCD}}\left(a,b\right) was 6. We have that 216*6=1296 and 24*54=1296. :::

::: example Example 107. We observe for any integer a that a\mid 0, as 0=am with m=0; indeed, by proposition 69{reference-type="ref" reference="prop:IntegersHaveNoZeroDivisors"}, if 0=am for some m\in\mathbb{Z} then we must have either a=0 or m=0. Hence 0 can be argued to be a common multiple of any integers a and b. This result is not particularly useful. :::

These examples indicate that a common multiple always exists. In fact, there is always a smallest common multiple

::: {#thm:NT_lcm_exists .theorem} Theorem 37. The least common multiple of two integers exists

Let a,b\in\mathbb{Z} where a>0 and b>0. We have that \exists m\in\mathbb{Z} with m>0 so that m is the smallest common multiple of a and b. That is m is the smallest such integer so that a\mid m and b\mid m.

Proof:

We first prove that a non-trivial common multiple of a and b exists. That is some m\neq 0, as 0 can be viewed as a common multiple of any two integers a,b. Clearly ab is a common multiple of a and b as a\mid ab and b\mid ab. Hence a non-trivial common multiple exists.

It is left to show that there is a minimal common multiple. Let S be the set of all positive common multiples of a and b. By the well-ordering principle, S has a smallest element as S\subset\mathbb{N}. The result follows. $\qed$ :::

We can now make a formal definition. However, first, we can note that the restriction of a>0 and b>0 is not needed.

::: corollary Corollary 6. Let a,b\in\mathbb{Z}, where a\neq 0 and b\neq 0. We have that \exists m\in\mathbb{Z} with m>0 so that m is the smallest common multiple of a and b. That is, m is the smallest such integer so that a\mid m and b\mid m.

Proof:

The proof is similar to theorem 37{reference-type="ref" reference="thm:NT_lcm_exists"}. We have that ab is a common multiple of a and b, as is -ab. As a\neq 0 and b\neq 0, exactly one of ab>0 or -ab>0 holds. Let S be the set of all positive common multiples of a and b, which is therefore non-empty. Then the well-ordering principle gives us that S has a smallest element. $\qed$ :::

::: definition Definition 150. Least common multiple

Let a,b\in\mathbb{Z} so that a\neq 0 and b\neq 0. We say that the smallest positive value m so that a\mid m and b\mid m is the least common multiple of a and b, denoted m=\mathop{\mathrm{LCM}}\left(a,b\right), sometimes written \mathop{\mathrm{lcm}}\left(a,b\right). :::

It is important to note why we say that the least common multiple is positive. If we allowed a negative least common multiple, say -m, then for all n\in\mathbb{Z} with n>0 we have that -nm is a smaller common multiple than -m and so we could always find a smaller such multiple.

As with the greatest common divisor, we need a way to compute the least common multiple. We should look again at the example where a=24 and b=54. We saw that the first, smallest, common multiple was 216, and that the greatest common divisor was 6. We also noted that the product ab=1296 which is also the product 216*6. We should look to more examples to see if this holds in other cases.

::: example Example 108. Let a=14 and b=21. Using the method of writing multiples out we have

$$\begin{align*} &14,\ 28,\ 42,\ 56,\ \dots\\ &21,\ 42,\ 63,\ 84,\ \dots \end{align*}$$

So the smallest positive common multiple is 42. Now, \mathop{\mathrm{GCD}}\left(14,21\right)=7. Finally, 14*21=294 and 7*42=294.

Hence we have that \displaystyle \mathop{\mathrm{LCM}}\left(14,21\right)=\frac{14*21}{\mathop{\mathrm{GCD}}\left(14,21\right)}.

In general we might expect that $\displaystyle \mathop{\mathrm{LCM}}\left(a,b\right)=\frac{ab}{\mathop{\mathrm{GCD}}\left(a,b\right)}$ :::

::: example Example 109. Let a=6 and b=36. Using our expected result, we have that \displaystyle \mathop{\mathrm{LCM}}\left(a,b\right)=\frac{a*b}{\mathop{\mathrm{GCD}}\left(a,b\right)}. So computing \mathop{\mathrm{GCD}}\left(a,b\right) we see that \mathop{\mathrm{GCD}}\left(a,b\right)=6 and so we suspect that \displaystyle\mathop{\mathrm{LCM}}\left(6,36\right)=\frac{6*36}{6}=36. Writing out the multiples of both 6 and $36$

$$\begin{align*} &6,\ 12,\ 18,\ 24,\ 30,\ 36,\ 42,\ \dots\\ &36,\ 72,\ 108,\ \dots \end{align*}$$

So the smallest common multiple is indeed 36. :::

We have enough evidence to postulate and prove the following theorem.

::: {#thm:NT_LCM_by_GCD_is_product .theorem} Theorem 38. Least common multiple by greatest common divisor equals product

Let a,b\in\mathbb{Z} so that a> 0 and b> 0. We have that

$$\begin{equation*} \mathop{\mathrm{GCD}}\left(a,b\right)\mathop{\mathrm{LCM}}\left(a,b\right)=ab \end{equation*}$$

Proof:

Let d=\mathop{\mathrm{GCD}}\left(a,b\right), then by definition we have that d\mid a so by proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"} part 1. implies that d\mid ac for any c\in\mathbb{Z} and in particular d\mid ab. Hence by the definition of divisibility, there exists n\in\mathbb{Z} so that ab=dn.

Now as d\mid a then there is an integer u so that a=du, likewise as d\mid b then there is an integer v so that b=dv. Hence we have that

$$\begin{align*} dn&=dub \Rightarrow n=ub,\ \text{by the cancellation law for the integers}\\ dn&=adv \Rightarrow n=av,\ \text{by the cancellation law for the integers} \end{align*}$$

Hence as n=ub we have that b\mid n and likewise as n=av we have that a\mid n. Moreover, n>0 as a, b and d are all positive. Hence it follows that n is a positive common multiple of a and b. We need to show that n is the smallest such multiple, so that \mathop{\mathrm{LCM}}\left(a,b\right)=n.

So, let S denote the set of positive common multiples of a and b and let s\in S be a common multiple of a and b. By definition of a common multiple, we have that there exists some k_1,k_2\in\mathbb{Z} so that s=ak_1 and s=bk_2.

Now, by Bézout's identity we have that \exists x,y\in\mathbb{Z} so that

$$\begin{equation} \mathop{\mathrm{GCD}}\left(a,b\right)=d=ax+by \end{equation*}$$*

Now, consider sd, we have that

$$\begin{align*} sd&=s\left(ax+by\right)\\ &=sax+sby\\ &=\left(bk_2\right)ax+\left(ak_1\right)by\\ &=abk_2x+abk_1y\\ &=ab\left(k_2x+k_1y\right)\\ &=dn\left(k_2x+k_1y\right)\\ s&=n\left(k_2x+k_1y\right),\ \text{by the cancellation law for the integers} \end{align*}$$

Now \left(k_2x+k_1y\right)\in\mathbb{Z} and so we have that n\mid s. Now by proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"} part 5. we have that n\leq s. As s\in S was arbitrary, and n is itself an element of S, we have that n is the smallest element of S, which exists by the well-ordering principle. That is, n is the smallest positive common multiple of a and b, and so by definition \mathop{\mathrm{LCM}}\left(a,b\right)=n.

Hence we have that ab=dn=\mathop{\mathrm{GCD}}\left(a,b\right)\mathop{\mathrm{LCM}}\left(a,b\right). As required. \qed. :::

We can now justify the following corollary to compute the least common multiple.

::: {#cor:NT_lcm_formula .corollary} Corollary 7. Least common multiple is product divided by greatest common divisor

Let a,b\in\mathbb{Z} so that a>0 and b>0. We have that

$$\begin{equation} \mathop{\mathrm{LCM}}\left(a,b\right)=\frac{ab}{\mathop{\mathrm{GCD}}\left(a,b\right)} \end{equation*}$$*

Proof:

By theorem 38{reference-type="ref" reference="thm:NT_LCM_by_GCD_is_product"} we have that

$$\begin{equation} \mathop{\mathrm{GCD}}\left(a,b\right)\mathop{\mathrm{LCM}}\left(a,b\right)=ab \end{equation}$$*

Let d=\mathop{\mathrm{GCD}}\left(a,b\right), then by definition we have that d\mid a and d\mid b so that d\mid ab. Hence \displaystyle\frac{ab}{d}\in\mathbb{Z}. As d>0, dividing both sides of the equation above by d gives \displaystyle\mathop{\mathrm{LCM}}\left(a,b\right)=\frac{ab}{d}, as required. $\qed$ :::
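As a quick numerical check of the corollary, the following Python sketch compares the formula with the method of writing multiples out. The function names `lcm` and `lcm_by_listing` are our own, chosen purely for illustration, and `math.gcd` is the greatest common divisor function from Python's standard library.

```python
from math import gcd


def lcm(a: int, b: int) -> int:
    """Least common multiple of positive integers a and b, computed as ab / GCD(a, b)."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    # GCD(a, b) divides ab, so integer division is exact here.
    return a * b // gcd(a, b)


def lcm_by_listing(a: int, b: int) -> int:
    """Walk through the multiples of a until one is also a multiple of b."""
    m = a
    while m % b != 0:
        m += a
    return m


# The pairs used in the examples of this section.
for a, b in [(24, 54), (14, 21), (6, 36)]:
    assert lcm(a, b) == lcm_by_listing(a, b)
    assert gcd(a, b) * lcm(a, b) == a * b
    print(a, b, gcd(a, b), lcm(a, b))
```

For the pairs above, both methods agree with the values 216, 42 and 36 found in the examples.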

We can now show some similar results to proposition 108{reference-type="ref" reference="prop:NT_GCD_properties"}

::: {#prop:NT_LCM_properties .proposition} Proposition 109. Properties of the least common multiple

Let a,b\in\mathbb{Z} with a>0 and b>0. The following properties of the \mathop{\mathrm{LCM}} hold.

  1. $\mathop{\mathrm{LCM}}\left(a,a\right)=a$

  2. $\mathop{\mathrm{LCM}}\left(a,b\right)=\mathop{\mathrm{LCM}}\left(b,a\right)$

  3. Let M be the set of all positive common multiples of a and b. Then \forall m\in M we have that $\mathop{\mathrm{LCM}}\left(a,b\right)\mid m$

  4. We have that \mathop{\mathrm{LCM}}\left(a,b\right) is the greatest \displaystyle \frac{ab}{ax+by} where \mathop{\mathrm{GCD}}\left(a,b\right)=ax+by.

Proof:

  1. \mathop{\mathrm{LCM}}\left(a,a\right)=a:

    As \mathop{\mathrm{GCD}}\left(a,a\right)=a and a*a=a^2, we have by corollary 7{reference-type="ref" reference="cor:NT_lcm_formula"} that

    $$\begin{equation} \mathop{\mathrm{LCM}}\left(a,a\right)=\frac{aa}{\mathop{\mathrm{GCD}}\left(a,a\right)}=\frac{a^2}{a}=a \end{equation}$$*

  2. \mathop{\mathrm{LCM}}\left(a,b\right)=\mathop{\mathrm{LCM}}\left(b,a\right):

    This follows as \mathop{\mathrm{GCD}}\left(a,b\right)=\mathop{\mathrm{GCD}}\left(b,a\right) and integer multiplication is commutative, this is to say

    $$\begin{equation} \mathop{\mathrm{LCM}}\left(a,b\right)=\frac{ab}{\mathop{\mathrm{GCD}}\left(a,b\right)}=\frac{ba}{\mathop{\mathrm{GCD}}\left(b,a\right)}=\mathop{\mathrm{LCM}}\left(b,a\right) \end{equation*}$$*

  3. Let M be the set of all positive common multiples of a and b. Then \forall m\in M we have that \mathop{\mathrm{LCM}}\left(a,b\right)\mid m:

    Let M be the set of all positive common multiples of a and b, and let m\in M. In the proof of theorem 38{reference-type="ref" reference="thm:NT_LCM_by_GCD_is_product"} we showed that \displaystyle n=\frac{ab}{\mathop{\mathrm{GCD}}\left(a,b\right)} divides every positive common multiple of a and b, and that n=\mathop{\mathrm{LCM}}\left(a,b\right). Hence \mathop{\mathrm{LCM}}\left(a,b\right)\mid m for every m\in M.

  4. We have that \mathop{\mathrm{LCM}}\left(a,b\right) is the greatest \displaystyle \frac{ab}{ax+by} where \mathop{\mathrm{GCD}}\left(a,b\right)=ax+by:

    By proposition 108{reference-type="ref" reference="prop:NT_GCD_properties"} part 4. we have that \mathop{\mathrm{GCD}}\left(a,b\right)=ax+by for some x,y\in\mathbb{Z} is the smallest such ax+by. Hence

    $$\begin{equation} \mathop{\mathrm{LCM}}\left(a,b\right)=\frac{ab}{\mathop{\mathrm{GCD}}\left(a,b\right)} \end{equation*}$$*

    will be the greatest such fraction. For if not, then either there are x_0,y_0\in\mathbb{Z} so that ax_0+by_0<ax+by, a contradiction to part 4. of proposition 108{reference-type="ref" reference="prop:NT_GCD_properties"}, or there are x_1,y_1\in\mathbb{Z} with ax_1+by_1>ax+by, in which case by part 35. of proposition 89{reference-type="ref" reference="prop:InequalityRationalNumbers"} we have that

    $$\begin{equation} \frac{ab}{ax_1+by_1}<\frac{ab}{ax+by} \end{equation*}$$*

Concluding the proof. $\qed$ :::

Prime and co-prime numbers

::: epigraph God may not play dice with the universe, but something strange is going on with the prime numbers.

Paul Erdős :::

So far we have been building a theory of divisibility. This theory has allowed us to define what it means to be an odd or an even integer, to know when one integer divides another, and to compute the largest common divisor of two integers. Where do we go from here? One question we could ask is: how many divisors does a given integer have?

The divisor function

We start with the following definition.

::: definition Definition 151. The Divisor function

Let x\in\mathbb{Z}. We define \sigma:\mathbb{Z}\rightarrow\mathbb{Z} by

$$\begin{align*} \sigma:\mathbb{Z}&\mathlarger{\mathlarger{\rightarrow}}\mathbb{Z}\\ x&\mapsto \sigma\left(x\right)=\sum_{d\mid x} 1 \end{align*}$$

Here we are summing over all of the positive divisors d of x, so that each positive divisor d of x adds one to the total. In other words, \sigma\left(x\right) counts the positive divisors of x. :::

Rather than work with explicit examples we will provide a table of the first 20 integers.

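| $x$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $\sigma\left(x\right)$ | 1 | 2 | 2 | 3 | 2 | 4 | 2 | 4 | 3 | 4 | 2 | 6 | 2 | 4 | 4 | 5 | 2 | 6 | 2 | 6 |

: The divisor function for the integers 1\leq x\leq 20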
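The values in the table can be reproduced with a short Python sketch. Here `divisor_count` is a name of our own choosing and stands in for \sigma; it simply tests every candidate d with 1\leq d\leq x, which is enough for small x but is not meant to be efficient.

```python
def divisor_count(x: int) -> int:
    """Count the positive divisors d of a positive integer x."""
    if x < 1:
        raise ValueError("x must be a positive integer")
    # Add one to the total for every d between 1 and x that divides x.
    return sum(1 for d in range(1, x + 1) if x % d == 0)


# Rebuild the table for 1 <= x <= 20.
table = {x: divisor_count(x) for x in range(1, 21)}
print(table)

# The integers in the table with exactly two divisors.
exactly_two = [x for x, count in table.items() if count == 2]
print(exactly_two)
```

The second list printed contains exactly the integers 2, 3, 5, 7, 11, 13, 17 and 19.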

There are a few things to note from this table. Firstly the only integer with a single divisor is 1. Secondly, there are many examples of integers having only 2 divisors. These are 2, 3, 5, 7, 11, 13, 17 and 19. As 1 is a divisor of every integer we can conclude that the other divisor in the case of \sigma\left(x\right)=2 must be x itself.

What about the case when \sigma\left(x\right)>2? Looking at 6 we see the divisors are 1, 2, 3 and 6 itself, and from the table \sigma\left(2\right)=\sigma\left(3\right)=2. Moreover, we have that 6=2*3.

Similarly with 12 we have that the divisors are 1, 2, 3, 4, 6 and 12. Again, we have that \sigma\left(2\right)=\sigma\left(3\right)=2. Now, as 12=2*6 and 6=2*3 then we have that 12=2*2*3. In both cases, we have seen that a number x with \sigma\left(x\right)>2 can be written as a product of integers with exactly 2 divisors. We can ask: does this hold in general? To answer this we need to make some definitions.

Prime numbers

With the remarks of the previous section, we give a special name to any integer x where \sigma\left(x\right)=2.

::: definition Definition 152. Prime number

Let x\in\mathbb{Z} with x\geq 2. We say that x is a prime number, or simply that x is prime, if and only if \sigma\left(x\right)=2. In other words, we say that x is prime, if and only if the only two distinct positive divisors of x are 1 and itself. If x is not prime we say that x is composite. :::

We noted that there were many x\in\mathbb{Z} with \sigma\left(x\right)=2. A natural question that arises is: are there infinitely many such x, or only finitely many? To answer this we need to see how primes and divisibility interact. We first have to make another definition based on the greatest common divisor of two integers. We show some examples to motivate this new definition.

::: example Example 110. Let a=6 and b=35. By the Euclidean algorithm, we see that

$$\begin{align*} 35&=5\left(6\right)+5\\ 6&=5+1\\ 5&=5\left(1\right) \end{align*}$$

Hence \mathop{\mathrm{GCD}}\left(a,b\right)=1. :::

::: example Example 111. Let a=2 and b=3. By the Euclidean algorithm, we see that

$$\begin{align*} 3&=2+1\\ 2&=2\left(1\right) \end{align*}$$

Hence \mathop{\mathrm{GCD}}\left(a,b\right)=1. We note that a and b are prime. :::

::: example Example 112. Let a=4 and b=9. By the Euclidean algorithm, we see that

$$\begin{align*} 9&=2\left(4\right)+1\\ 4&=4\left(1\right) \end{align*}$$

Hence \mathop{\mathrm{GCD}}\left(a,b\right)=1. :::

We see that there are integers a,b\in\mathbb{Z} so that \mathop{\mathrm{GCD}}\left(a,b\right)=1. Meaning that they have no common divisors other than 1. This situation turns out to happen enough in Number Theory to warrant a definition.

::: definition Definition 153. Co-prime Integers

Let a,b\in\mathbb{Z}. We say that a is co-prime to b, or a and b are co-prime, or a and b are relatively prime, if and only if \mathop{\mathrm{GCD}}\left(a,b\right)=1. :::
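The definition can be tested mechanically. The sketch below is one possible Python version of the Euclidean algorithm used in the examples above, together with a co-primality test. The names `gcd_euclid` and `are_coprime` are hypothetical and are not taken from anywhere else in this document.

```python
def gcd_euclid(a: int, b: int) -> int:
    """Greatest common divisor by repeated division with remainder."""
    while b != 0:
        # Replace (a, b) by (b, r), where r is the remainder of a divided by b.
        a, b = b, a % b
    return abs(a)


def are_coprime(a: int, b: int) -> bool:
    """Return True when GCD(a, b) = 1."""
    return gcd_euclid(a, b) == 1


# The pairs from the three examples above, followed by a pair that is not co-prime.
for a, b in [(6, 35), (2, 3), (4, 9), (30, 105)]:
    print(a, b, gcd_euclid(a, b), are_coprime(a, b))
```

Note that neither 4 nor 9 is prime, yet the pair is co-prime, so being co-prime is a property of the pair and not of the individual integers.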

We have some immediate results.

::: {#prop:NT_Bezout_coprime .proposition} Proposition 110. Bézout's Identity for co-prime integers

Let a,b\in\mathbb{Z} so that \mathop{\mathrm{GCD}}\left(a,b\right)=1. We have that \exists x,y\in\mathbb{Z} so that

$$\begin{equation} 1=ax+by \end{equation*}$$*

Proof:

This immediately follows from theorem 36{reference-type="ref" reference="thm:NT_bezout_id"}. $\qed$ :::

::: proposition Proposition 111. Distinct prime numbers are co-prime

Let p,q\in\mathbb{Z} so that p and q are prime and p\neq q. We have that \mathop{\mathrm{GCD}}\left(p,q\right)=1.

Proof:

Let p,q\in\mathbb{Z} so that p and q are prime and p\neq q. As p is prime, the only positive divisors of p are 1 and p; likewise the only positive divisors of q are 1 and q. As p\neq q, the only positive divisor common to both p and q is 1, so that \mathop{\mathrm{GCD}}\left(p,q\right)=1 by definition. $\qed$ :::

::: {#cor:NT_PrimeNotDividing_Integer_implies_coprime .corollary} Corollary 8. Prime not dividing integer implies co-prime

Let a,p\in\mathbb{Z} where p is prime. If p\nmid a then $\mathop{\mathrm{GCD}}\left(a,p\right) = 1$

Proof:

Let a,p\in\mathbb{Z} where p is prime and where p\nmid a. Suppose that \mathop{\mathrm{GCD}}\left(a,p\right)=d for some d\in\mathbb{Z}. By definition of the greatest common divisor, we have that d\mid p and by definition of a prime, we have that either d=1 or d=p. But if d=p then p\mid a by definition of the greatest common divisor, contradicting the assumption that p\nmid a. Hence d=1. $\qed$ :::

::: proposition Proposition 112. Product of co-prime integers is equal to their least common multiple

Let a,b\in\mathbb{Z} with a>0 and b>0 so that \mathop{\mathrm{GCD}}\left(a,b\right)=1. We have that ab=\mathop{\mathrm{LCM}}\left(a,b\right).

Proof:

Let a,b\in\mathbb{Z} be as given in the proposition. We have by corollary 7{reference-type="ref" reference="cor:NT_lcm_formula"} that

$$\begin{equation} \mathop{\mathrm{LCM}}\left(a,b\right)= \frac{ab}{\mathop{\mathrm{GCD}}\left(a,b\right)} \end{equation*}$$*

As a and b are co-prime, we have \mathop{\mathrm{GCD}}\left(a,b\right)=1, hence the result. $\qed$ :::

::: {#prop:NT_Bezout_coef_coprime .proposition} Proposition 113. Coefficients in Bézout's identity are co-prime

Let a,b\in\mathbb{Z} with d=\mathop{\mathrm{GCD}}\left(a,b\right) so that by Bézout's identity we have \exists x,y\in\mathbb{Z} so that

$$\begin{equation} d=ax+by \end{equation*}$$*

We have that $\mathop{\mathrm{GCD}}\left(x,y\right)=1$

Proof:

Let a,b\in\mathbb{Z} with d=\mathop{\mathrm{GCD}}\left(a,b\right). By Bézout's identity we have that there exists x,y\in\mathbb{Z} so that

$$\begin{equation} d=ax+by \end{equation*}$$*

Now, dividing by d gives

$$\begin{equation} 1=\frac{a}{d}x+\frac{b}{d}y \end{equation*}$$*

as d\mid a and d\mid b. Hence we have that 1=k_1x+k_2y where \displaystyle k_1=\frac{a}{d} and \displaystyle k_2=\frac{b}{d} are integers. Now any common divisor of x and y divides k_1x+k_2y=1, and so must be 1 or -1. Hence \mathop{\mathrm{GCD}}\left(x,y\right)=1 and so by definition x and y are co-prime. $\qed$ :::

With some basic results out of the way, we can start seeing more meaningful consequences of defining prime and co-prime numbers. One of the first things we should do is see how primes divide other integers.

::: example Example 113. Let n=10, we have that 2\mid 10 and \sigma\left(2\right)=2, hence 2 is prime. Moreover 10=2*5 and clearly 2\mid 2. :::

::: example Example 114. Let n=4, clearly 4=2*2 and so 2\mid 4. Moreover, 2\mid 2. :::

::: example Example 115. Let n=14=2*7. Both 2 and 7 are prime and so 2\mid 14 and 7\mid 14. :::

Then, if a prime p divides n=ab we seem to have that either p\mid a or p\mid b.

::: {#lem:NT_Euclid .lemma} Lemma 9. Euclid's Lemma

Let a,b\in\mathbb{Z} and let p\in\mathbb{Z} be prime. Suppose that p\mid ab. We have that either p\mid a or p\mid b.

Proof:

Let p\mid ab. Suppose that p\nmid b. As the only divisors of p are 1 and itself then we have that \mathop{\mathrm{GCD}}\left(p,b\right)=1 by corollary 8{reference-type="ref" reference="cor:NT_PrimeNotDividing_Integer_implies_coprime"}. Now by proposition 110{reference-type="ref" reference="prop:NT_Bezout_coprime"} we have that \exists x,y\in\mathbb{Z} so that

$$\begin{equation} 1=px+by \end{equation*}$$*

Multiplying by a gives a=apx+aby and as p\mid apx and p\mid ab we have that p\mid a. Likewise if p\nmid a. $\qed$ :::

This result generalises to products of more than two integers.

::: {#lem:NT_Euclid_general .lemma} Lemma 10. Generalised Euclid's lemma

Let p\in\mathbb{Z} be prime. Let n\in\mathbb{Z} be such that

$$\begin{equation} n=\prod_{i=1}^m a_i \end{equation*}$$*

where a_i\in\mathbb{Z} for each i. Suppose that p\mid n, then there exists an i\in\mathbb{N} so that p\mid a_i.

Proof:

We argue by induction on m. The base case is m=2 which follows by Euclid's lemma. So suppose the result holds for some k\geq 2, that is, if n is such that

$$\begin{equation} n=\prod_{i=1}^k a_i \end{equation*}$$*

then there is some i\in\mathbb{N} so that p\mid a_i. We show that if n is such that

$$\begin{equation} n=\prod_{i=1}^{k+1} a_i \end{equation*}$$*

then there is some i\in\mathbb{N} so that p\mid a_i. So suppose that p\mid n, then

$$\begin{equation} p\mid\prod_{i=1}^{k+1} a_i \end{equation*}$$*

We have that

$$\begin{align*} p\mid&\prod_{i=1}^{k+1} a_i \\ p\mid&\left(\prod_{i=1}^{k} a_i\right) a_{k+1} \end{align*}$$

By Euclid's lemma, lemma 9{reference-type="ref" reference="lem:NT_Euclid"}, we have that either \displaystyle p\mid\prod_{i=1}^{k} a_i or p\mid a_{k+1}. In the first case, the induction hypothesis gives some i\in\mathbb{N} with 1\leq i \leq k so that p\mid a_i. In the second case we can take i=k+1. In either case there is some 1\leq i\leq k+1 so that p\mid a_i. The result now follows by induction. $\qed$ :::

With Euclid's lemma, we can prove a very famous theorem, namely that there is no x\in\mathbb{Q} so that x^2=2. We first need a definition, based on co-prime integers.

::: definition Definition 154. Reduced fraction

Let x\in\mathbb{Q} where \displaystyle x=\frac{a}{b} and b\neq 0. We say that x is a reduced fraction, or a fraction in its lowest terms if \mathop{\mathrm{GCD}}\left(a,b\right)=1. :::

We give some examples.

::: example Example 116. Let \displaystyle x=\frac{1}{2}. As \mathop{\mathrm{GCD}}\left(1,2\right)=1 we have that x is a reduced fraction. :::

::: example Example 117. Let \displaystyle x=\frac{3}{6}. We can compute that \mathop{\mathrm{GCD}}\left(3,6\right)=3, hence we have that 3\mid 3 and 3\mid 6. We hence can write

$$\begin{equation*} x=\frac{3}{6}=\frac{3*1}{3*2}=\frac{1}{2} \end{equation*}$$

And as \mathop{\mathrm{GCD}}\left(1,2\right)=1 we can conclude x is now in its lowest terms. :::

We can now show the theorem.

::: {#thm:NT_Root2Irrational .theorem} Theorem 39. No rational exists whose square is $2$

We have that \not\exists x\in\mathbb{Q} with x^2=2.

Proof:

Suppose instead that there is some x\in\mathbb{Q} with x^2=2, where \displaystyle x=\frac{a}{b} with a,b\in\mathbb{Z} and b\neq 0. Moreover assume that x is a reduced fraction, i.e. \mathop{\mathrm{GCD}}\left(a,b\right)=1. We can make this assumption as otherwise we can cancel the common factors of a and b until the fraction is reduced, without changing the value of x.

We have that

$$\begin{align*} x^2&=2\\ \frac{a^2}{b^2}&=2\\ a^2&=2b^2 \end{align*}$$

Hence by the definition of divisibility, we have 2\mid a^2 and so by Euclid's lemma we have that 2\mid a as 2 is prime. So write a=2k for some k\in\mathbb{Z}. Then we have that

$$\begin{align*} a^2&=2b^2\\ \left(2k\right)^2&=2b^2\\ 4k^2&=2b^2\\ 2k^2&=b^2 \end{align*}$$

Hence 2\mid b^2 and again by Euclid's lemma we have that 2\mid b. We have a contradiction, as 2\mid a and 2\mid b implies that \mathop{\mathrm{GCD}}\left(a,b\right)\geq 2, and so x can't have been a reduced fraction. As every rational can be written as a reduced fraction, we can conclude that there is no rational x so that x^2=2. $\qed$ :::

This raises the question: if there is no rational x whose square is 2, then what exactly is x? Unfortunately, we are not quite ready to answer this question in a satisfying way. All we can say is that we have seen a hint of a new type of number, one that we can define but not yet study in more detail.

::: definition Definition 155. Irrational number

If we have x\not\in\mathbb{Q}, then we say that x is irrational. In other words, x is irrational if and only if x cannot be written as \displaystyle \frac{a}{b} for any a,b\in\mathbb{Z} with b\neq 0. :::

Clearly, if S denotes the set of irrational numbers then by theorem 39{reference-type="ref" reference="thm:NT_Root2Irrational"} we have that S\neq\emptyset. Perhaps then it makes sense, for now, to consider which x\in\mathbb{Q} are such that x^2=y where y\in\mathbb{Z}, or more restrictively, which x\in\mathbb{Z} are such that x^2=y where y\in\mathbb{Z}.

Before we start answering this question, we note one useful result by generalising Euclid's lemma from the prime case to the co-prime case.

::: {#lem:NT_Euclid_co_primes .lemma} Lemma 11. Euclid's lemma for co-primes

Let a,b,c\in\mathbb{Z} and suppose that c\mid ab and \mathop{\mathrm{GCD}}\left(b,c\right)=1. We have that c\mid a.

Proof:

Let a,b,c\in\mathbb{Z} be such that c\mid ab and \mathop{\mathrm{GCD}}\left(b,c\right)=1. As \mathop{\mathrm{GCD}}\left(b,c\right)=1, we have by proposition 110{reference-type="ref" reference="prop:NT_Bezout_coprime"} that there exist integers x,y\in\mathbb{Z} so that

$$\begin{equation} bx+cy=1 \end{equation*}$$*

On multiplication by a we have that abx+acy=a. Clearly c\mid abx and c\mid acy and so c\mid a as required. $\qed$ :::

There is a useful application of this lemma.

::: {#exam:NT_solutions_to_ax_plus_by .example} Example 118.

Let a,b\in\mathbb{Z} and let d=\mathop{\mathrm{GCD}}\left(a,b\right). We know by Bézout's identity that \exists x,y\in\mathbb{Z} so that

$$\begin{equation} ax+by=d \end{equation*}$$*

The theorem for Bézout's identity, theorem 36{reference-type="ref" reference="thm:NT_bezout_id"}, doesn't state anything about there not being another pair x',y' so that

$$\begin{equation} ax'+by'=d \end{equation*}$$*

For example, consider a=30 and b=105, then \mathop{\mathrm{GCD}}\left(a,b\right)=15 and we have that 15=-3*30+1*105, i.e. x=-3 and y=1 in this case. We could also have x=-10 and y=3 as -10*30+3*105=-300+315=15.

So supposing that a,b\in\mathbb{Z} and d=\mathop{\mathrm{GCD}}\left(a,b\right) we know that \exists x,x',y, y'\in\mathbb{Z} with

$$\begin{align*} ax+by&=d\\ ax'+by'&=d \end{align*}$$

Can we find a relation between the pair x and y and the pair x' and y'? As d\mid a then there exists a'\in\mathbb{Z} so that a=a'd and likewise as d\mid b then there exists b'\in\mathbb{Z} so that b=b'd. Hence we see that

$$\begin{align*} ax+by&=d\\ a'dx+b'dy&=d\\ a'x+b'y&=1 \end{align*}$$

Now, as a'x+b'y=1, any common divisor of a' and b' must divide 1, so we can deduce that a' and b' are co-prime. Now, we have that

$$\begin{equation} ax+by=d=ax'+by' \end{equation*}$$*

So, re-arranging we see that

$$\begin{align*} ax-ax'&=by'-by\\ a\left(x-x'\right)&=b\left(y'-y\right) \end{align*}$$

Dividing by d gives

$$\begin{equation} a'\left(x-x'\right)=b'\left(y'-y\right) \end{equation*}$$*

Now, a'\mid b'\left(y'-y\right) and a' and b' are co-prime, so by Euclid's lemma for co-primes, lemma 11{reference-type="ref" reference="lem:NT_Euclid_co_primes"}, we have that a'\mid\left(y'-y\right). We therefore have that \exists k\in\mathbb{Z} so that

$$\begin{equation} y'-y=a'k \Rightarrow y'=y+a'k \end{equation*}$$*

But as y'-y=a'k we have that

$$\begin{align*} a'\left(x-x'\right)&=b'\left(a'k\right)\\ x-x'&=b'k\\ x'&=x-b'k \end{align*}$$

Therefore, we can conclude that

$$\begin{align*} x'&=x-\frac{b}{d}k\\ y'&=y+\frac{a}{d}k \end{align*}$$

where k\in\mathbb{Z}. To check this is the case we return to the example of a=30 and b=105 where we had that \mathop{\mathrm{GCD}}\left(a,b\right)=15. We saw that x=-3 and y=1. Using these values in the equations above we get

$$\begin{align*} x'&=-3-\frac{105}{15}k \Rightarrow x'=-3-7k\\ y'&=1+\frac{30}{15}k \Rightarrow y'=1+2k \end{align*}$$

Using k=1 gives us the alternative solution we saw of x'=-10 and y'=3. :::
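The family of solutions found in this example can be explored numerically. The following Python sketch assumes a>0 and b>0 and uses the extended Euclidean algorithm to find one pair x,y, then generates further pairs from the formulas for x' and y'. The names `extended_gcd` and `bezout_family` are our own and are used only for illustration.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (d, x, y) with d = GCD(a, b) and d = a*x + b*y, for positive a and b."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        # Each remainder is kept in the form a*x + b*y throughout.
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


def bezout_family(a: int, b: int, ks) -> list[tuple[int, int]]:
    """Pairs (x', y') = (x - (b/d)k, y + (a/d)k) for each k in ks."""
    d, x, y = extended_gcd(a, b)
    return [(x - (b // d) * k, y + (a // d) * k) for k in ks]


a, b = 30, 105
d, x, y = extended_gcd(a, b)
print(d, x, y)
for xp, yp in bezout_family(a, b, range(-2, 3)):
    # Every generated pair should satisfy a*x' + b*y' = d.
    assert a * xp + b * yp == d
    print(xp, yp)
```

For a=30 and b=105 this sketch starts from the pair x=-3, y=1, and the pair x'=-10, y'=3 appears for k=1, matching the values computed by hand above.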

From Euclid's lemma for co-primes we have deduced the full set of pairs x,y\in\mathbb{Z} satisfying ax+by=d where d=\mathop{\mathrm{GCD}}\left(a,b\right).

We now return to the problem at hand. We wish to consider the elements of x\in\mathbb{Z} so that x^2=y where y\in\mathbb{Z}. As is the theme of this section we will do some exploratory examples.

::: example Example 119. Let x\in\mathbb{Q} be such that \displaystyle x=\frac{2}{1}, then \displaystyle x^2=\frac{4}{1}=4\in\mathbb{Z}. In particular, we have that 4=2*2=2^2. :::

::: example Example 120. Consider \displaystyle x=\frac{10}{1}=10. Clearly x^2=100\in\mathbb{Z}. We have that

$$\begin{equation*} 100=2*50=2*2*25=2*2*5*5=2^2*5^2 \end{equation*}$$ :::

::: example Example 121. We generalise the example of there being no x\in\mathbb{Q} so that x^2=2. We will show that for a prime p\in\mathbb{Z}, there is no x\in\mathbb{Q} so that x^2=p. So suppose there is such an x, that is \displaystyle x=\frac{a}{b} so that a,b\in\mathbb{Z} and b\neq 0 and moreover suppose that x is a reduced fraction, which is to say \mathop{\mathrm{GCD}}\left(a,b\right)=1. We then have that

$$\begin{equation} x^2=\frac{a^2}{b^2}=p \Rightarrow a^2=pb^2 \end{equation*}$$*

Hence p\mid a^2. Hence by Euclid's lemma, we have that p\mid a. Hence let a=pk for some k\in\mathbb{Z}. We then have that

$$\begin{align*} a^2&=pb^2\\ \left(pk\right)^2&=pb^2\\ p^2k^2&=pb^2\\ pk^2&=b^2 \end{align*}$$

Therefore p\mid b^2 and so by Euclid's lemma we have that p\mid b, a contradiction to the assumption that x was a fraction in reduced form. :::

This last example shows that for any prime p there is no rational number x with x^2=p. We also saw an example where x^2=p^2, namely when p=2, and an example of a product of primes satisfying x^2=p^2*q^2 for some primes p and q. It seems therefore that the question of which x\in\mathbb{Z} satisfy x^2=y for some integer y is deeply connected to primes. In particular, in the examples we have seen the powers of the primes are even. We need more examples before we can make a claim.

::: example Example 122. Consider x=4, we have that x^2=16 and 16 is not prime as \sigma\left(16\right)=5, with divisors 1, 2, 4, 8 and 16. However, we have that 16=2^4 and we know that 2 is prime. :::

::: example Example 123. Let y=3^2*5^4=5625, a product of primes. We can see that we can take x=3*5^2=75. :::

With these examples, we can see that to answer the question of which x\in\mathbb{Z} are such that x^2=y for some y\in\mathbb{Z}, it is enough to consider the structure of the primes that make up y. This leads us to, perhaps, the most important theorem of elementary Number Theory.

::::: {#thm:NT_FTOA .theorem} Theorem 42. The fundamental theorem of arithmetic

Let n\in\mathbb{Z} be such that n\geq 2. We have that n can be expressed as a product of one or more primes. This product is unique up to the order of the primes. That is to say, we have that

$$\begin{equation*} n=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k} \end{equation*}$$

where p_i are the primes and e_i are the powers for the prime p_i. Here uniquely up to the order of the primes means that, for example, 6=2*3=3*2 are considered the same product.

Proof:

There are two parts to this theorem, firstly we must show that every integer n\geq 2 is expressible as a product of primes. Secondly that this product is unique up to the ordering of the primes.

As a result, we will break this theorem down into two sub-theorems.

::: {#thm:NT_FOTA_EveryIntIsProductOfPrimes .theorem} Theorem 40. Every integer greater than one is expressible as a product of primes

Let n\in\mathbb{Z} be such that n>1. We have that

$$\begin{equation*} n=p_1p_2p_3\dots p_k \end{equation*}$$

where p_i are the primes.

Proof:

We argue by strong induction on n. The base case is n=2, which is itself prime, so the base case is immediate. So suppose that for some k\geq 2 the result holds for every integer m with 2\leq m\leq k, that is, every such m can be written as a product of primes. We show that n=k+1 can be written as a product of primes.

If k+1 is itself prime we are done, so suppose not, then \sigma\left(k+1\right)>2 and so there are some factors, say a and b so that k+1=ab, where 2\leq a < k+1 and 2\leq b < k+1. However, this means that we have 2\leq a \leq k and 2\leq b \leq k and so by the induction hypothesis we can write a and b as a product of primes. But then ab will be a product of primes and so k+1 is a product of primes.

The result follows by induction. \qed :::

::: {#thm:NT_FOTA_PrimeProdUnique .theorem} Theorem 41. The product of primes expression for an integer is unique

Let n\in\mathbb{Z} be such that n\geq 2. We have that the expression for n as a product of primes is unique.

Proof:

Let n\in\mathbb{Z} be as given. Suppose that n has two different representations into a product of primes, that is

$$\begin{align*} n&=p_1p_2p_3\dots p_r\\ n&=q_1q_2q_3\dots q_s \end{align*}$$

where without loss of generality we suppose that r\leq s. Moreover, without loss of generality suppose that we have the primes in ascending order, that is, p_1\leq p_2\leq p_3\leq\dots\leq p_r and that q_1\leq q_2\leq q_3\leq\dots\leq q_s.

Now as p_1\mid q_1q_2q_3\dots q_s we have by the generalised Euclid's lemma that p_1\mid q_i for some 1\leq i \leq s. As q_i is prime and p_1\neq 1, we must have p_1=q_i, and therefore p_1\geq q_1 as the primes are in ascending order. Likewise, as q_1\mid p_1p_2p_3\dots p_r, then q_1\mid p_j for some 1\leq j\leq r, so that q_1=p_j\geq p_1. As p_1\geq q_1 and q_1\geq p_1 we must have that p_1=q_1. Hence we have

$$\begin{align*} p_1p_2p_3\dots p_r&=q_1q_2q_3\dots q_s\\ p_1p_2p_3\dots p_r&=p_1q_2q_3\dots q_s\\ p_2p_3\dots p_r&=q_2q_3\dots q_s \end{align*}$$

This process can be repeated for each prime p_j for the remaining 2\leq j\leq r. Now if r<s we will eventually get to

$$\begin{equation*} 1=q_{r+1}q_{r+2}q_{r+3}\dots q_s \end{equation*}$$

However the only divisors of 1 are 1 and -1, hence none of the q_i for r+1\leq i\leq s can be prime, a contradiction. So r=s, and repeating the process above we must have that p_i=q_i for 1\leq i\leq s. Hence the two expressions for n are equal, giving us uniqueness. \qed :::


The fundamental theorem of arithmetic now follows from theorem 40{reference-type="ref" reference="thm:NT_FOTA_EveryIntIsProductOfPrimes"} and theorem 41{reference-type="ref" reference="thm:NT_FOTA_PrimeProdUnique"}. The final result involving the powers of primes is trivial to see. Suppose that n is a product of primes given by

$$\begin{equation} n=p_1p_2p_3\dots p_k \end{equation*}$$*

We will have that some of the p_i will be the same and others will not, if we combine the primes that are equal then we will get that, after re-labelling so that k is once again the largest index that appears,

$$\begin{equation*} n=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k} \end{equation*}$$

The result is shown. $\qed$ :::::

This theorem is of great importance. It ultimately allows us to deal with problems of divisibility by recasting them into statements about the primes that make the integer. We make a quick definition.

::: definition Definition 156. Prime factorisation of an integer

Let n\in\mathbb{Z} where n=p_1^{e_1}*p_2^{e_2}*p_3^{e_3}*\dots*p_k^{e_k}. We say that the expression for n is the prime factorisation of n, or simply the factorisation of $n$ :::
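To make the definition concrete, here is a minimal Python sketch that computes a prime factorisation by trial division. Trial division is our own choice of method for this illustration and is not a method given in the text; the names `prime_factorisation` and `is_perfect_square` are likewise hypothetical. The second function checks the observation from the earlier examples that the powers of the primes are all even when the integer is a square.

```python
def prime_factorisation(n: int) -> dict[int, int]:
    """Return {p: e} so that n is the product of p**e over the dictionary, for n >= 2."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        # Divide out p as many times as possible. Composite p never divide n here,
        # because their prime factors have already been removed.
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # Whatever is left over is a single prime larger than the square root.
        factors[n] = factors.get(n, 0) + 1
    return factors


def is_perfect_square(n: int) -> bool:
    """True when every power in the prime factorisation of n is even."""
    return all(e % 2 == 0 for e in prime_factorisation(n).values())


print(prime_factorisation(5625))
print(prime_factorisation(100))
print(is_perfect_square(16), is_perfect_square(8))
```

For 5625 the sketch recovers the powers 2 and 4 on the primes 3 and 5, in agreement with 5625=3^2*5^4.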

We have shown that any integer greater than 1 can be factored into a product of primes. A natural question we can now ask, and answer, is how many primes there are. Could it be the case that the set of primes is finite, if very large? We will show that the number of primes is infinite.

::: theorem Theorem 43. Number of primes is infinite

We have that the number of primes is infinite.

Proof:

We will argue by contradiction. Suppose that there are only a finite number of primes, say

$$\begin{equation*} P=\left\{p_1,p_2,p_3,\dots,p_n\right\} \end{equation*}$$

where we have that p_i<p_j for i<j and 1\leq i,j\leq n, i.e p_1=2, p_2=3 etc. Let N be the integer

$$\begin{equation} N=\left(p_1p_2p_3\dots p_n\right)+1 \end{equation*}$$*

Clearly, N is not prime as otherwise we would have N\in P but N>p_n, which would be a contradiction. So N is composite and by the fundamental theorem of arithmetic, we have that N has a factorisation into primes. However, none of the p_i divide N, as dividing N by any p_i leaves a remainder of 1. So no prime in P can appear in the prime factorisation of N, contradicting the assumption that P contains every prime. Hence P can't be a finite set and the number of primes must be infinite. $\qed$ :::
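The construction in the proof can be tried on a short list of primes. The Python sketch below takes the first six primes as a stand-in for the supposedly complete set P, forms N, and finds the smallest prime factor of N by trial division. The helper name `smallest_prime_factor` is our own, and `math.prod` is the product function from Python's standard library.

```python
from math import prod


def smallest_prime_factor(n: int) -> int:
    """Smallest prime dividing n, for n >= 2, found by trial division."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p
        p += 1
    # No divisor up to the square root, so n itself is prime.
    return n


P = [2, 3, 5, 7, 11, 13]  # pretend, for illustration, that these are all the primes
N = prod(P) + 1

# Dividing N by any prime in P leaves remainder 1.
print([N % p for p in P])

q = smallest_prime_factor(N)
print(N, q, N // q, q in P)
```

Here N=30031=59*509, so N is not itself prime, but its prime factors 59 and 509 are both missing from the list P, exactly as the proof predicts.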

The fundamental theorem of arithmetic can be used to recast some previous results for the greatest common divisor. We start with a result for integers being co-primes.

::: {#prop:NT_co-prime_iff_no_common_primes .proposition} Proposition 114. Greatest common divisor is 1 if and only if no-common prime in factorisation

Let a,b\in\mathbb{Z} with b\neq 0. We have that \mathop{\mathrm{GCD}}\left(a,b\right)=1 if and only if a and b share no common primes in their factorisations.

Proof:

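Suppose first that a and b share a common prime p in their factorisations. Then p\mid a and p\mid b, so that \mathop{\mathrm{GCD}}\left(a,b\right)\geq p>1. Conversely, suppose that d=\mathop{\mathrm{GCD}}\left(a,b\right)>1. By theorem 40{reference-type="ref" reference="thm:NT_FOTA_EveryIntIsProductOfPrimes"} there is a prime p with p\mid d, and as d\mid a and d\mid b we have that p\mid a and p\mid b. By the uniqueness of prime factorisations, p appears in the factorisations of both a and b. Hence \mathop{\mathrm{GCD}}\left(a,b\right)=1 if and only if a and b share no common primes in their factorisations. $\qed$ :::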

We can compute the greatest common divisor by considering the prime factorisations of a and b. To do so we need a helpful result.

::: {#prop:NT_express_primes_in_common_basis .proposition} Proposition 115. Expression for integers as powers of same primes

Let a,b\in\mathbb{Z} with a\geq 2 and b\geq 2. Consider the prime factorisations of a and b given by

$$\begin{align*} a&=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_n^{e_n}\\ &=\prod_{\substack{p_i\mid a \\ p_i\text{ is prime}}} p_i^{e_i}\\ b&=q_1^{f_1}q_2^{f_2}q_3^{f_3}\dots q_m^{f_m}\\ &=\prod_{\substack{q_i\mid b \\ q_i\text{ is prime}}} q_i^{f_i} \end{align*}$$

where n need not be equal to m. We have that there exist prime numbers

$$\begin{equation*} t_1<t_2<t_3<\dots <t_v \end{equation*}$$

So that

$$\begin{align*} a&=t_1^{g_1}t_2^{g_2}t_3^{g_3}\dots t_v^{g_v}\\ b&=t_1^{h_1}t_2^{h_2}t_3^{h_3}\dots t_v^{h_v} \end{align*}$$

Proof:

Let a,b\in\mathbb{Z} be as given. We have that

$$\begin{align*} a&=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_n^{e_n}\\ b&=q_1^{f_1}q_2^{f_2}q_3^{f_3}\dots q_m^{f_m} \end{align*}$$

In particular, let A=\left\{p_1,p_2,p_3,\dots,p_n\right\} and let B=\left\{q_1,q_2,q_3,\dots,q_m\right\}. We can therefore define the set T=A\cup B, where we have

$$\begin{equation} T=\left{t_1,t_2,\dots,t_v\right} \end{equation*}$$*

where clearly v\leq \left(n+m\right). We now need a way to pick the primes in the factorisation of a, and b, from the set T.

Define \iota_A and \iota_B as follows

$$\begin{align*} \iota_A:A&\mathlarger{\mathlarger{\rightarrow}}T\\ x&\mapsto \iota_A\left(x\right)=x \end{align*}$$

$$\begin{align*} \iota_B:B&\mathlarger{\mathlarger{\rightarrow}}T\\ x&\mapsto \iota_B\left(x\right)=x \end{align*}$$

That is, \iota_A and \iota_B simply map elements of either A or B to the same element in T. Using these mappings we can see that

$$\begin{align*} a&=\prod_{i=1}^n p_i^{e_i}\\ &=\prod_{i=1}^n \iota_A\left(p_i\right)^{e_i}\\ &=\prod_{p_i\in A} p_i^{e_i}\\ &=\prod_{p_i\in A} p_i^{e_i} * \prod_{t_j\in T\setminus A} t_j^0\\ &=\prod_{t_j\in T} t_j^{g_j}, \text{ where } g_j =\begin{cases} e_i, & \text{if } t_j=p_i \text{ for some } p_i\in A\\ 0, & \text{if } t_j\not\in A \end{cases}\\ &= t_1^{g_1}t_2^{g_2}t_3^{g_3}\dots t_v^{g_v} \end{align*}$$

Likewise, for b we have

$$\begin{align*} b&=\prod_{j=1}^m q_j^{f_j}\\ &=\prod_{j=1}^m \iota_B\left(q_j\right)^{f_j}\\ &=\prod_{q_j\in B} q_j^{f_j}\\ &=\prod_{q_j\in B} q_j^{f_j} * \prod_{t_i\in T\setminus B} t_i^0\\ &=\prod_{t_i\in T} t_i^{h_i}, \text{ where } h_i =\begin{cases} f_j, & \text{if } t_i=q_j \text{ for some } q_j\in B\\ 0, & \text{if } t_i\not\in B \end{cases}\\ &= t_1^{h_1}t_2^{h_2}t_3^{h_3}\dots t_v^{h_v} \end{align*}$$

Hence, a and b have been expressed as a product of the same set of primes, where possibly one or more of the powers in either of the products could be zero. As required. $\qed$ :::
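The construction in the proof can be tried out numerically. The following is a minimal Python sketch, not part of the original text: the helper names `prime_factorisation` and `common_basis` are hypothetical, and trial division is assumed as a simple (not efficient) way to obtain the factorisations.

```python
def prime_factorisation(n: int) -> dict[int, int]:
    """Return {prime: exponent} for an integer n >= 2, found by trial division."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:  # whatever is left after removing the small factors is itself prime
        factors[n] = factors.get(n, 0) + 1
    return factors


def common_basis(a: int, b: int) -> tuple[list[int], list[int], list[int]]:
    """Return (T, g, h): the sorted primes of T = A union B and the exponents of a and b over T."""
    fa, fb = prime_factorisation(a), prime_factorisation(b)
    primes = sorted(set(fa) | set(fb))       # t_1 < t_2 < ... < t_v
    g = [fa.get(t, 0) for t in primes]       # exponent 0 when t is not in A
    h = [fb.get(t, 0) for t in primes]       # exponent 0 when t is not in B
    return primes, g, h


primes, g, h = common_basis(90, 525)
print(primes, g, h)  # [2, 3, 5, 7] [1, 2, 1, 0] [0, 1, 2, 1]
```

Here 90 and 525 are both written over the same primes 2, 3, 5 and 7, with a zero exponent wherever a prime does not divide the number.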

In other words, proposition 115{reference-type="ref" reference="prop:NT_express_primes_in_common_basis"} says that given the prime factorisations of a and b, we can always construct a single set containing every prime that divides a or b and use this set to express the factorisations of both a and b. Why is this useful? It allows us to find the greatest common divisor of two integers by simply looking at the primes, and the powers of those primes, that the two integers have in common. We can use some examples to express this idea. The reader is encouraged to also try these examples using the Euclidean algorithm to verify.

::: example Example 124. Let a=2*3^2*5=90 and b=3*5^2*7=525. We have that the \mathop{\mathrm{GCD}}\left(a,b\right)=15. Additionally, we know that 15=3*5. The common primes of a and b are 3 and 5. :::

::: example Example 125. Let a=5*11=55 and b=2*7=14. We have that the \mathop{\mathrm{GCD}}\left(a,b\right)=1. Moreover, a and b have no common primes. :::

::: example Example 126. Let a=7*11*13=1001 and b=7*11*17=1309. We have that the \mathop{\mathrm{GCD}}\left(a,b\right)=77, as the primes in common are 7 and 11. :::

::: example Example 127. Let a=2*3^4=162 and b=3^3*5=135. We have that the \mathop{\mathrm{GCD}}\left(a,b\right)=27, as the only prime power in common is 3^3. :::

We now show that looking at the primes in common is sufficient to get the greatest common divisor. To aid in the notation we make a definition.

::: definition Definition 157. The minimum function for integers

Let a,b\in\mathbb{Z}. We define the minimum function, denoted \min\left(a,b\right) by

$$\begin{align*} \min:\mathbb{Z}^2&\rightarrow\mathbb{Z}\\ \left(a,b\right)&\mapsto\min\left(a,b\right)=\begin{cases} a, & \text{if } a\leq b\\ b, & \text{if } b\leq a \end{cases} \end{align*}$$ :::

::: {#prop:NT_gcd_can_be_computed_by_primes .proposition} Proposition 116. Greatest common divisor from prime factorisation

Let a,b\in\mathbb{Z} with a\geq 2 and b\geq 2. By proposition 115{reference-type="ref" reference="prop:NT_express_primes_in_common_basis"} we know that there exists a set of primes

$$\begin{equation*} T=\left\{t_1,t_2,t_3,\dots,t_v\right\} \end{equation*}$$

so that the prime factorisations of a and b are given by

$$\begin{align*} a&=\prod_{i=1}^v t_i^{e_i}\\ b&=\prod_{i=1}^v t_i^{f_i} \end{align*}$$

We have that the greatest common divisor \mathop{\mathrm{GCD}}\left(a,b\right) is given by

$$\begin{equation*} \mathop{\mathrm{GCD}}\left(a,b\right)=t_1^{\min\left(e_1,f_1\right)}t_2^{\min\left(e_2,f_2\right)}t_3^{\min\left(e_3,f_3\right)}\dots t_v^{\min\left(e_v,f_v\right)} \end{equation*}$$

Proof:

Let a,b\in\mathbb{Z} be as given and suppose that we have expressed a and b in accordance with proposition 115{reference-type="ref" reference="prop:NT_express_primes_in_common_basis"}. This is to say we have a set T of primes so that

$$\begin{equation*} T=\left\{t_1,t_2,t_3,\dots,t_v\right\} \end{equation*}$$

and the prime factorisations of a and b are given by

$$\begin{align*} a&=\prod_{i=1}^v t_i^{e_i}\\ b&=\prod_{i=1}^v t_i^{f_i} \end{align*}$$

Let d=\mathop{\mathrm{GCD}}\left(a,b\right) and let \displaystyle D=t_1^{\min\left(e_1,f_1\right)}t_2^{\min\left(e_2,f_2\right)}t_3^{\min\left(e_3,f_3\right)}\dots t_v^{\min\left(e_v,f_v\right)}. We need to show that d=D. To do so we show

  1. $\displaystyle D\leq d$

  2. $\displaystyle d\leq D$

Then the result follows from the fact that for n,m\in\mathbb{Z}, if n\leq m and m\leq n then n=m. For ease of notation, let \sigma_i=\min\left(e_i,f_i\right) for 1\leq i\leq v.

  1. \displaystyle D\leq d:

    We have by definition of the minimum that \sigma_i\leq e_i and \sigma_i\leq f_i. Hence \exists k_i, l_i\in\mathbb{Z} with k_i\geq 0 and l_i\geq 0 so that

    $$\begin{align*} e_i&=\sigma_i+k_i\\ f_i&=\sigma_i+l_i \end{align*}$$

    Hence, we can express a as

    $$\begin{align*} a&=\prod_{i=1}^v t_i^{e_i}\\ &=\prod_{i=1}^v t_i^{\sigma_i+k_i}\\ &=\prod_{i=1}^v t_i^{\sigma_i}t_i^{k_i}\\ &=\prod_{i=1}^v t_i^{\sigma_i}\prod_{i=1}^v t_i^{k_i}\\ &=D*\prod_{i=1}^v t_i^{k_i} \end{align*}$$

    Therefore, as \displaystyle \prod_{i=1}^v t_i^{k_i} \in\mathbb{Z} we have that D\mid a. A similar argument for b shows that D\mid b. Hence D is a common divisor of a and b and so by definition we have that D\leq d.

  2. d\leq D:

    To show that d\leq D we will show that d\mid D; then by proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"} part 5. we will have that d\leq D. We start from the reverse divisibility: as D is a common divisor of a and b, and every common divisor of a and b divides the greatest common divisor, we have that D\mid d. By the definition of divisibility there is some k\in\mathbb{Z} so that

    $$\begin{equation*} d=Dk \end{equation*}$$

    As d>0 and D>0 we have k\geq 1, so k is either 1 or has a factorisation into primes by the fundamental theorem of arithmetic. Now, k could have primes in common with T, hence we can take the powers of those primes and absorb them into the factorisation of D, this is to say we have that

    $$\begin{align*} d&=Dk\\ &=t_1^{\sigma_1}t_2^{\sigma_2}t_3^{\sigma_3}\dots t_v^{\sigma_v}k\\ &=t_1^{\lambda_1}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}k' \end{align*}$$

    where \lambda_i\geq\sigma_i is the new exponent of t_i after we absorb the powers of t_i that divide k, and k' is the product of the remaining prime powers of k, so that no t_i divides k'.

    To get the result we want we need to show two things.

    1. $k'=1$

    2. \lambda_i\leq \sigma_i for all $1\leq i\leq v$

    1. k'=1:

      Suppose for a contradiction that k'\neq 1. As d>0 and D>0 we must have that k>0, which means that k'>0 and hence k'>1. By the fundamental theorem of arithmetic k' has a factorisation into primes, say

      $$\begin{equation*} k'=q_1^{r_1}q_2^{r_2}q_3^{r_3}\dots q_c^{r_c} \end{equation*}$$

      Moreover, no q_j=t_i as k' has no common primes with t_1^{\lambda_1}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}. Pick one of the primes in k', say q=q_j, then we have that q\mid d. Moreover, we have that d\mid a as d=\mathop{\mathrm{GCD}}\left(a,b\right), hence we must have that q\mid a. Hence q is one of the primes in the factorisation of a, all of which are among the t_i, a contradiction. Therefore k'=1.

    2. \lambda_i\leq \sigma_i for all 1\leq i\leq v:

      Suppose for a contradiction that \lambda_i>\sigma_i for some 1\leq i\leq v. Without loss of generality take i=1, for if this is not the case re-label the primes. Now by definition of \sigma_1 we have that \sigma_1=\min\left(e_1,f_1\right) and so we must have that either \sigma_1=e_1 or \sigma_1=f_1. Without loss of generality let \sigma_1=e_1 as the case where \sigma_1=f_1 is similar.

      We, therefore, have that \lambda_1>e_1. Now, as d is a divisor of a there is an s\in\mathbb{Z} so that ds=a, where s>0 as both a and d are positive. Now, comparing the prime factorisations of ds and a, and using k'=1, we have that

      $$\begin{equation*} st_1^{\lambda_1}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}=t_1^{e_1}t_2^{e_2}t_3^{e_3}\dots t_v^{e_v} \end{equation*}$$

      Dividing by \displaystyle t_1^{e_1} we get that

      $$\begin{equation*} st_1^{\lambda_1-e_1}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}=t_1^{e_1-e_1}t_2^{e_2}t_3^{e_3}\dots t_v^{e_v} \end{equation*}$$

      Where clearly \displaystyle t_1^{e_1-e_1}=1. So this can be re-written as

      $$\begin{equation*} st_1^{\lambda_1-e_1}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}=t_2^{e_2}t_3^{e_3}\dots t_v^{e_v} \end{equation*}$$

      As \lambda_1>e_1, we have \lambda_1-e_1>0 and so t_1 divides the left-hand side of the equation. But then t_1 must also divide the right-hand side, and so by the fundamental theorem of arithmetic t_1 would appear in its prime factorisation. It does not, a contradiction. So \lambda_i\leq\sigma_i for all 1\leq i\leq v.

    Therefore $d\leq D$

    As D\leq d and d\leq D we must have that d=D. As required. $\qed$ :::
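The formula of the proposition can be checked against the Euclidean algorithm on the earlier examples. Below is a minimal Python sketch under stated assumptions: `prime_factorisation` and `gcd_from_factorisations` are hypothetical helper names, trial division is assumed for the factorisation, and `math.gcd` from the standard library plays the role of the Euclidean algorithm.

```python
from math import gcd, prod


def prime_factorisation(n: int) -> dict[int, int]:
    """Return {prime: exponent} for an integer n >= 2, found by trial division."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:  # the remaining cofactor is itself prime
        factors[n] = factors.get(n, 0) + 1
    return factors


def gcd_from_factorisations(a: int, b: int) -> int:
    """Product of t ** min(e, f) over every prime t dividing a or b."""
    fa, fb = prime_factorisation(a), prime_factorisation(b)
    primes = set(fa) | set(fb)  # the set T of the proposition
    return prod(t ** min(fa.get(t, 0), fb.get(t, 0)) for t in primes)


# The four worked examples, each compared with the standard library result.
for a, b in [(90, 525), (55, 14), (1001, 1309), (162, 135)]:
    print(a, b, gcd_from_factorisations(a, b), gcd(a, b))
```

A prime that divides only one of the two integers contributes a factor of t^0 = 1, which is why only the primes in common survive in the product.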

Proposition 116{reference-type="ref" reference="prop:NT_gcd_can_be_computed_by_primes"} allows us to compute the greatest common divisor by considering the prime factorisations, rather than using the Euclidean algorithm. Unfortunately, we now have a new problem, how do we compute the prime factorisation of an integer? Thankfully to answer this question we have to answer the original question posed, what x\in\mathbb{Z} are such that x^2=y for some y\in\mathbb{Z}? Clearly if x\in\mathbb{Z} then x has some prime factorisation, say

$$\begin{equation*} x=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k} \end{equation*}$$

So that

$$\begin{align*} x^2&=\left(p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k}\right)\left(p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k}\right)\\ &=\left(p_1^{e_1}p_1^{e_1}\right)\left(p_2^{e_2}p_2^{e_2}\right)\left(p_3^{e_3}p_3^{e_3}\right)\dots\left(p_k^{e_k}p_k^{e_k}\right)\\ &=p_1^{2e_1}p_2^{2e_2}p_3^{2e_3}\dots p_k^{2e_k}=y \end{align*}$$

For each prime p_i, the power of that prime is now of the form 2e_i and therefore the power is even. We make this fact a definition.

::: {#def:NT_square_number .definition} Definition 158. Square number

Let y\in\mathbb{Z} where y>0, if there exists an x\in\mathbb{Z} so that

$$\begin{equation*} x^2=y \end{equation*}$$

Then we say that y is a square number. :::

In light of the above discussion, we have the following result.

::: {#prop:NT_square_number_iff_prime_exonents_even .proposition} Proposition 117. Square number if and only if prime factorisation has even powers

Let x\in\mathbb{Z}. We have that x is a square number if and only if the prime factorisation of x only contains even prime powers. This is to say that each prime p_i in the factorisation of x has an exponent of the form 2e_i.

Proof:

\left(\Rightarrow\right): Suppose that x is a square number, by definition there exists y\in\mathbb{Z} so that y^2=x. Let the prime factorisation of y be

$$\begin{equation*} y=q_1^{f_1}q_2^{f_2}q_3^{f_3}\dots q_k^{f_k} \end{equation*}$$

We then have that

$$\begin{equation*} x=y^2=q_1^{2f_1}q_2^{2f_2}q_3^{2f_3}\dots q_k^{2f_k} \end{equation*}$$

Hence all the prime factors of x have an exponent of the form 2f_i making them even.

\left(\Leftarrow\right): Suppose that the prime factorisation of x has prime factors which only have even powers, that is

$$\begin{equation*} x=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k} \end{equation*}$$

As each e_i is even we have that \displaystyle \frac{e_i}{2}\in\mathbb{Z}. Define y to be

$$\begin{equation*} y=p_1^{e_1/2}p_2^{e_2/2}p_3^{e_3/2}\dots p_k^{e_k/2} \end{equation*}$$

Where clearly y\in\mathbb{Z}. We then have that

$$\begin{align*} y^2&=\left(p_1^{e_1/2}p_2^{e_2/2}p_3^{e_3/2}\dots p_k^{e_k/2}\right)\left(p_1^{e_1/2}p_2^{e_2/2}p_3^{e_3/2}\dots p_k^{e_k/2}\right)\\ &=\left(p_1^{e_1/2}p_1^{e_1/2}\right)\left(p_2^{e_2/2}p_2^{e_2/2}\right)\left(p_3^{e_3/2}p_3^{e_3/2}\right)\dots \left(p_k^{e_k/2}p_k^{e_k/2}\right)\\ &=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k}=x \end{align*}$$

Hence as x=y^2 for some y\in\mathbb{Z} we conclude that x is a square number. $\qed$ :::
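The even-exponent criterion can be compared with a direct test on small cases. The following Python sketch is illustrative only: the function names are hypothetical, trial division is assumed for the factorisation, x=1 is treated as a square via the empty product (an assumption of this sketch), and `math.isqrt` is used as an independent check.

```python
from math import isqrt


def prime_factorisation(n: int) -> dict[int, int]:
    """Return {prime: exponent} for an integer n >= 2, found by trial division."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:  # the remaining cofactor is itself prime
        factors[n] = factors.get(n, 0) + 1
    return factors


def is_square_by_exponents(x: int) -> bool:
    """True when every exponent in the prime factorisation of x >= 1 is even."""
    if x == 1:  # assumption of this sketch: 1 = 1^2 has the empty factorisation
        return True
    return all(e % 2 == 0 for e in prime_factorisation(x).values())


def is_square_directly(x: int) -> bool:
    """True when the integer square root of x squares back to x."""
    return isqrt(x) ** 2 == x


print(is_square_by_exponents(144), is_square_by_exponents(72))  # True False
print(all(is_square_by_exponents(x) == is_square_directly(x) for x in range(1, 500)))  # True
```

For instance 144 = 2^4 * 3^2 has only even exponents, whereas 72 = 2^3 * 3^2 does not.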

We also have an immediate proposition.

::: {#prop:NT_product_of_sqaure_numbers_is_sqaure_number .proposition} Proposition 118. Product of two square numbers is a square number

Let x,y\in\mathbb{Z} be square numbers. We have that xy is a square number.

Proof:

Let x,y\in\mathbb{Z} be square numbers. We have by definition that \exists a,b\in\mathbb{Z} so that

$$\begin{align*} a^2&=x\\ b^2&=y \end{align*}$$

Now, consider the product xy, we have

$$\begin{equation*} xy=a^2b^2=\left(ab\right)^2 \end{equation*}$$

Hence by definition, xy is a square number. $\qed$ :::

With proposition 117{reference-type="ref" reference="prop:NT_square_number_iff_prime_exonents_even"} we can finally answer the question of what x\in\mathbb{Z} are such that x^2=y for some y\in\mathbb{Z}. It is those x\in\mathbb{Z} so that x^2 is a square number! At first, this doesn't seem too useful as we can clearly take any n\in\mathbb{Z} and see that n^2\in\mathbb{Z}. However, the real meaning of this result is actually the converse, given some n\in\mathbb{Z} we can see if there is an x\in\mathbb{Z} so that x^2=n. With this, we make a definition

::: definition Definition 159. Square root function

Let x\in\mathbb{Z} be a positive square number. We define the square root function, denoted by \sqrt{}, as follows

$$\begin{align*} \sqrt{}:\mathbb{Z}&\rightarrow\mathbb{Z}\\ x&\mapsto \sqrt{x}=\begin{cases} n, & \text{if } n\geq 0 \text{ and } n^2=x\\ \text{undefined}, & \text{otherwise} \end{cases} \end{align*}$$

That is, we define the square root of an integer x to be the non-negative integer n that when squared gives x. :::

In light of this definition, we have the following result.

::: {#prop:NT_root_of_product_is_product_of_roots .proposition} Proposition 119. Square root of product is product of square roots

Let x,y\in\mathbb{Z} be square numbers. We have that

$$\begin{equation*} \sqrt{xy}=\sqrt{x}\sqrt{y} \end{equation*}$$

Proof:

Let x,y be as given. By proposition 118{reference-type="ref" reference="prop:NT_product_of_sqaure_numbers_is_sqaure_number"} we have that xy is a square number and so \sqrt{xy} is well-defined. We need to show that

$$\begin{equation*} \sqrt{xy}=\sqrt{x}\sqrt{y} \end{equation*}$$

By definition, let \sqrt{xy}=n, where n^2=xy. Additionally, let \sqrt{x}=a where a^2=x and \sqrt{y}=b where b^2=y. Now, we have that

$$\begin{equation*} \left(\sqrt{x}\sqrt{y}\right)^2=\left(ab\right)^2=a^2b^2=xy=n^2=\left(\sqrt{xy}\right)^2 \end{equation*}$$

As n^2=\left(ab\right)^2 and, taking square roots to be non-negative, both n\geq 0 and ab\geq 0, we have that n=ab. Hence we have that \sqrt{xy}=\sqrt{x}\sqrt{y} as required. $\qed$ :::

The idea of a square number actually generalises, meaning the question of what x\in\mathbb{Z} are such that x^2=y for some y\in\mathbb{Z} can be generalised to the question of what x\in\mathbb{Z} are such that x^n=y for some y\in\mathbb{Z} and a given n\in\mathbb{N}.

The generalisation works very similarly to how we got to square numbers. As before let x\in\mathbb{Z} which has a factorisation

$$\begin{equation*} x=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k} \end{equation*}$$

Now, consider x^n, the factorisation is given by

$$\begin{equation*} x^n=p_1^{ne_1}p_2^{ne_2}p_3^{ne_3}\dots p_k^{ne_k} \end{equation*}$$

Hence the power of each prime p_i is of the form ne_i. This is the defining characteristic for the next definition.

::: definition Definition 160. $n$-th power number

Let y\in\mathbb{Z} with y>0 and let n\in\mathbb{N}, if there exists an x\in\mathbb{Z} so that

$$\begin{equation*} x^n=y \end{equation*}$$

We say that y is an $n$-th power number, namely the $n$-th power of x. We have already seen the case of n=2 where y is called a square number. For n=3 we call y a cube number. For n\geq 4, there is no special term, hence the definition using the terminology of $n$-th power. :::

The next step is to prove a proposition analogous to proposition 117{reference-type="ref" reference="prop:NT_square_number_iff_prime_exonents_even"}.

::: {#prop:NT_nth_power_number_iff_prime_exonents_multiple_of_n .proposition} Proposition 120. $n$-th power number if and only if prime factorisation has powers that are multiples of n

Let x\in\mathbb{Z}. We have that x is an $n$-th power number if and only if the prime factorisation of x only contains prime powers that are a multiple of n. This is to say that each prime p_i in the factorisation of x has an exponent of the form ne_i.

Proof:

\left(\Rightarrow\right): Suppose that x is an $n$-th power number, by definition there exists y\in\mathbb{Z} so that y^n=x. Let the prime factorisation of y be

$$\begin{equation*} y=q_1^{f_1}q_2^{f_2}q_3^{f_3}\dots q_k^{f_k} \end{equation*}$$

We then have that

$$\begin{equation*} x=y^n=q_1^{nf_1}q_2^{nf_2}q_3^{nf_3}\dots q_k^{nf_k} \end{equation*}$$

Hence all the prime factors of x have an exponent of the form nf_i, meaning each prime power is a multiple of n.

\left(\Leftarrow\right): Suppose that the prime factorisation of x has prime factors whose powers are all multiples of n, that is

$$\begin{equation*} x=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k} \end{equation*}$$

As each e_i is a multiple of n we have that \displaystyle \frac{e_i}{n}\in\mathbb{Z}. Define y to be

$$\begin{equation*} y=p_1^{e_1/n}p_2^{e_2/n}p_3^{e_3/n}\dots p_k^{e_k/n} \end{equation*}$$

Where clearly y\in\mathbb{Z}. We then have that

$$\begin{align*} y^n&=\prod_{i=1}^n\left(p_1^{e_1/n}p_2^{e_2/n}p_3^{e_3/n}\dots p_k^{e_k/n}\right)\\ &=\prod_{j=1}^k\left(\prod_{i=1}^n\left(p_j^{e_j/n}\right)\right)\\ &=\prod_{j=1}^k\left(p_j^{e_j}\right)\\ &=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k}=x \end{align*}$$

Hence as x=y^n for some y\in\mathbb{Z} we conclude that x is an $n$-th power number. $\qed$ :::
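The \left(\Leftarrow\right) direction is constructive: dividing every exponent by n produces the root y. A minimal Python sketch of that construction follows; the names `prime_factorisation` and `nth_root_from_exponents` are hypothetical, trial division is assumed, and x=1 is handled separately as the empty product.

```python
from typing import Optional


def prime_factorisation(n: int) -> dict[int, int]:
    """Return {prime: exponent} for an integer n >= 2, found by trial division."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:  # the remaining cofactor is itself prime
        factors[n] = factors.get(n, 0) + 1
    return factors


def nth_root_from_exponents(x: int, n: int) -> Optional[int]:
    """Return y with y**n == x built from the exponents e_i / n, or None if some e_i is not a multiple of n."""
    if x == 1:  # assumption of this sketch: 1 has the empty factorisation
        return 1
    factors = prime_factorisation(x)
    if any(e % n != 0 for e in factors.values()):
        return None
    y = 1
    for p, e in factors.items():
        y *= p ** (e // n)  # the factor p^(e/n) from the proof
    return y


print(nth_root_from_exponents(1728, 3))  # 12, since 1728 = 2^6 * 3^3
print(nth_root_from_exponents(72, 2))    # None, since 72 = 2^3 * 3^2
```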

As before, there is an immediate proposition.

::: proposition Proposition 121. Product of two $n$-th power numbers is an $n$-th power number

Let x,y\in\mathbb{Z} be $n$-th power numbers. We have that xy is an $n$-th power number.

Proof:

Let x,y\in\mathbb{Z} be $n$-th power numbers. By definition, we have that \exists a,b\in\mathbb{Z} so that

$$\begin{align*} a^n&=x\\ b^n&=y \end{align*}$$

We have

$$\begin{equation*} xy=a^nb^n=\left(ab\right)^n \end{equation*}$$

Giving the result. $\qed$ :::

We can now generalise the square root function.

::: definition Definition 161. $n$-th root function

Let x\in\mathbb{Z} be a positive $n$-th power number. We define the $n$-th root function, denoted by \displaystyle \sqrt[n]{}, as follows

$$\begin{align*} \sqrt[n]{}:\mathbb{Z}&\rightarrow\mathbb{Z}\\ x&\mapsto \sqrt[n]{x}=\begin{cases} m, & \text{if } m\geq 0 \text{ and } m^n=x\\ \text{undefined}, & \text{otherwise} \end{cases} \end{align*}$$

That is, we define the $n$-th root of an integer x to be the non-negative integer m that when raised to the power of n gives x. :::
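The $n$-th root of this definition can also be found without factorising x. The sketch below is one possible approach, assumed here for illustration rather than taken from the text: a binary search over candidate roots, with the hypothetical name `integer_nth_root` and `None` standing in for "undefined".

```python
from typing import Optional


def integer_nth_root(x: int, n: int) -> Optional[int]:
    """Return the positive integer m with m**n == x, or None when x is not an n-th power."""
    if x < 1 or n < 1:
        raise ValueError("x and n must be positive")
    lo, hi = 1, x
    while lo <= hi:  # binary search: m -> m**n is increasing for m >= 1
        mid = (lo + hi) // 2
        power = mid ** n
        if power == x:
            return mid
        if power < x:
            lo = mid + 1
        else:
            hi = mid - 1
    return None


print(integer_nth_root(27, 3))  # 3
print(integer_nth_root(10, 2))  # None
# Root of a product equals the product of the roots for square numbers:
print(integer_nth_root(36 * 49, 2) == integer_nth_root(36, 2) * integer_nth_root(49, 2))  # True
```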

The integers modulo n

::: epigraph Mathematicians call it "the arithmetic of congruences." You can think of it as clock arithmetic

John Derbyshire :::

So far in the study of the divisibility of integers, we have considered what it means for an integer a to divide another integer b, namely we have that a\mid b if there is some c\in\mathbb{Z} such that ac=b. We now explore the implications of the case where a\nmid b, in particular, we look at the remainders from the division algorithm.

Remainders after division

Recall that for a,b\in\mathbb{Z} we have that a\mid b if \exists c\in\mathbb{Z} so that b=ac. When this is not the case we have that a\nmid b. By the division algorithm, when a\nmid b there are q,r\in\mathbb{Z} with 0<r<\left|a\right| so that

$$\begin{equation*} b=qa+r \end{equation*}$$

The question is, what are the possible values for r? The division algorithm gives us lower and upper bounds on the valid values of r but does not say anything about whether it can take all values in this range. Is only a small subset of this range valid? What happens if we allow b to be an arbitrary integer rather than some fixed integer? Some exploratory examples will be helpful.

::: example Example 128. Let a=2 and consider some b>a. We have by the division algorithm that

$$\begin{equation*} b=2q+r \end{equation*}$$

Hence r can only be one of 0 or 1. Now if r=0 then we must have that b is an even number and if r=1 we must have that b is an odd number. Then as b is an arbitrary integer we must have that dividing any integer by 2 will give us all of the possible remainders as an integer x\in\mathbb{Z} is either even or odd. :::

::: example Example 129. Let a=3 and consider some b>a. By the division algorithm we have that r is either 0, 1 or 2. Like before, if r=0 then b is a multiple of 3 so that b=3q.

Now, suppose b is a multiple of 3. We have that b+3 is also a multiple of 3 as

$$\begin{equation*} b+3=3q+3=3\left(q+1\right) \end{equation*}$$

So, as b is a multiple of 3 and b+3 is a multiple of 3 then these will give a remainder r=0 by the division algorithm. What can we say about b+1 and b+2? Using b+1 in the division algorithm with 3 gives

$$\begin{equation*} b+1=3q+1 \end{equation*}$$

as b=3q. Hence the remainder is 1. Likewise using b+2 in the division algorithm with 3 gives a remainder of 2. As b was an arbitrary integer, we can conclude that the possible remainders when dividing an arbitrary integer by 3 are 0, 1 and 2. All of the possibilities are realised for the division of an arbitrary integer. :::

::: example Example 130. Let a=4 and consider some b>a. The division algorithm gives the possible range of remainders of 0, 1, 2 and 3. Like the previous example, we see that if the remainder is 0 then b is a multiple of 4, so similarly b+4 is a multiple of 4. Looking at b+1, b+2 and b+3 we see by the division algorithm that

$$\begin{align*} b+1&=4q+1\\ b+2&=4q+2\\ b+3&=4q+3 \end{align*}$$

So dividing an arbitrary integer by 4 will give a remainder in the range 0 to 3 inclusive. :::

These examples suggest that dividing an arbitrary integer b by some a\in\mathbb{Z} with a>0 always gives a remainder r with 0\leq r<\left|a\right|, and that every value in this range occurs as a remainder for some b. This is true not just for the examples above but for every a>0. We can prove this but first make an important observation.

::: {#cor:NT_integer_minus_remainder_is_divisable .corollary} Corollary 9.

Let a,b\in\mathbb{Z} with b>a. Consider the division algorithm for b divided by a, that is we have

$$\begin{equation*} b=qa+r \end{equation*}$$

for some q,r\in\mathbb{Z} and 0\leq r<\left|a\right|. We have that a\mid\left(b-r\right).

Proof:

Let a,b\in\mathbb{Z} be as given by the hypothesis. The division algorithm applied to a dividing b gives

$$\begin{equation*} b=qa+r \end{equation*}$$

This gives \left(b-r\right)=qa and so by the definition of divisibility we have that a\mid\left(b-r\right). As required. $\qed$ :::
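Both the exploratory examples and the corollary are easy to check by machine for small values. The Python sketch below is illustrative: the function names are hypothetical, and it relies on the fact that for a positive divisor Python's built-in `divmod` returns a remainder r with 0 \leq r < a, matching the division algorithm.

```python
def remainders_realised(a: int, start: int, stop: int) -> list[int]:
    """Sorted remainders of b divided by a over start <= b < stop, assuming a > 0."""
    return sorted({b % a for b in range(start, stop)})


def divides_after_removing_remainder(a: int, b: int) -> bool:
    """Check that a divides b - r, where r is the remainder of b divided by a (a > 0)."""
    q, r = divmod(b, a)  # for a > 0 this gives b == q*a + r with 0 <= r < a
    return b == q * a + r and (b - r) % a == 0


for a in (2, 3, 4):
    print(a, remainders_realised(a, -20, 20))
# 2 [0, 1]
# 3 [0, 1, 2]
# 4 [0, 1, 2, 3]
print(all(divides_after_removing_remainder(4, b) for b in range(-20, 20)))  # True
```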

Corollary 9{reference-type="ref" reference="cor:NT_integer_minus_remainder_is_divisable"} is deceptively simple: this corollary provides the bedrock for the rest of this section.

Congruences and residues (Modular arithmetic)

We are now able to define the main topic behind this section. Corollary 9{reference-type="ref" reference="cor:NT_integer_minus_remainder_is_divisable"} tells us that when we divide b by a, then the difference between b and the remainder r is always divisible by a. This is to say a\mid\left(b-r\right). Now suppose we have another integer c so that when c is divided by a the remainder is also r. We then also have a\mid\left(c-r\right), in a sense b and c are similar when divided by a. That is they give the same remainder after division. We use this to make the definitions.

::: definition Definition 162. Congruences, congruent number and residue number

Let a,b,n\in\mathbb{Z} so that n>0. If we have that a and b have the same remainder when divided by n we say that a and b are congruent modulo n. This is denoted by

$$\begin{equation*} a\equiv b \ (\mathrm{mod}\ n) \end{equation*}$$

We call b a residue of a modulo n. We usually say that a is congruent to b modulo n. We define a congruence to capture the notion of congruent numbers and residue numbers. We call the number n the modulus of the congruence.

If a is not congruent to b, equivalently if b is not a residue of a we write a\not\equiv b\ (\mathrm{mod}\ n). :::

We can make use of corollary 9{reference-type="ref" reference="cor:NT_integer_minus_remainder_is_divisable"} to connect division to congruences.

::: {#prop:NT_congruent_iff_difference_is_divisible .proposition} Proposition 122. Congruent if and only if the difference is divisible by modulus

Let a,b,n\in\mathbb{Z} and fix n\geq 1. We have that a\equiv b\ (\mathrm{mod}\ n) if and only if n\mid\left(a-b\right).

Proof:

By the division algorithm, we have that

$$\begin{align*} a&=qn+r\\ b&=q'n+r' \end{align*}$$

for some q,q',r,r'\in\mathbb{Z} where 0\leq r<\left|n\right| and 0\leq r'<\left|n\right|. Hence we have that

$$\begin{equation*} a-b=\left(q-q'\right)n+\left(r-r'\right) \end{equation*}$$

where -n<r-r'<n.

\left(\Rightarrow\right): Suppose that a\equiv b\ (\mathrm{mod}\ n). By definition of congruences, we have that a and b share the same remainder when divided by n. Hence r=r' and so r-r'=0 so that

$$\begin{equation*} a-b=\left(q-q'\right)n \end{equation*}$$

which implies that n\mid\left(a-b\right).

\left(\Leftarrow\right): Now suppose that n\mid\left(a-b\right). Rearranging the expression for a-b above gives

$$\begin{equation*} \left(a-b\right)-\left(q-q'\right)n=\left(r-r'\right) \end{equation*}$$

where -n<r-r'<n. As n divides both a-b and \left(q-q'\right)n, it divides the left-hand side and hence n\mid\left(r-r'\right). The only integer strictly between -n and n which is divisible by n is 0, since any non-zero multiple of n has absolute value at least n. Hence r-r'=0 which implies that r=r'. So by the definition of a congruence, we have that a\equiv b\ (\mathrm{mod}\ n). $\qed$ :::
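The two descriptions of congruence, equal remainders and a difference divisible by n, can be compared exhaustively on a small range. This is a minimal Python sketch with hypothetical function names; it relies on Python's `%` operator returning a remainder in the range 0 to n-1 for a positive modulus n.

```python
def same_remainder(a: int, b: int, n: int) -> bool:
    """Congruence by definition: a and b leave the same remainder on division by n."""
    return a % n == b % n


def difference_divisible(a: int, b: int, n: int) -> bool:
    """Congruence by the proposition: n divides a - b."""
    return (a - b) % n == 0


def criteria_agree(n: int, bound: int) -> bool:
    """Compare both criteria for every pair a, b with -bound <= a, b <= bound."""
    values = range(-bound, bound + 1)
    return all(
        same_remainder(a, b, n) == difference_divisible(a, b, n)
        for a in values
        for b in values
    )


print(criteria_agree(7, 30))  # True
print(same_remainder(100, 2, 7), difference_divisible(100, 2, 7))  # True True
```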

Proposition 122{reference-type="ref" reference="prop:NT_congruent_iff_difference_is_divisible"} gives us a bridge, allowing us to translate statements about divisibility into statements about congruences and vice versa. To get used to working with congruences we will use some examples.

::: example Example 131. Suppose you were asked: given that today is a Monday, what day will it be in 100 days' time? We can make use of the fact that days repeat in a 7-day cycle. By the division algorithm, we have that

$$\begin{equation*} 100=14\left(7\right)+2 \end{equation*}$$

That is to say, 100\equiv 2\ (\mathrm{mod}\ 7). So we know that if the current day is a Monday, then in 100 days' time the day will be a Wednesday. :::

::: example Example 132. The previous example can also be used to calculate some time in the future. Suppose we are using a $24$-hour clock and we know that it is 1 PM, what would be the time in 164 hours?

To find this we make use of the fact that a $24$-hour clock repeats every 24 hours. So we need to compute the remainder of 164 when divided by 24. The division algorithm gives

$$\begin{equation*} 164=6\left(24\right)+20 \end{equation*}$$

Hence 164\equiv 20\ (\mathrm{mod}\ 24). Now on a $24$-hour clock, 1 PM is equal to 13. So the time 164 hours later will be given by 13+20=33. We have a problem, 33 is not on a $24$-hour clock! To find what time this is we need to find the remainder when 33 is divided by 24. We can quickly see that

$$\begin{equation*} 33=1\left(24\right)+9 \end{equation*}$$

So 33\equiv 9\ (\mathrm{mod}\ 24). Hence, 164 hours after 1 PM the time will be 9 on the $24$-hour clock, that is 9 AM. :::
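The two calendar and clock examples reduce to taking a remainder, which can be sketched in Python as follows. The list `DAYS` and the functions `day_after` and `hour_after` are hypothetical names introduced only for this illustration.

```python
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def day_after(start_day: str, days_ahead: int) -> str:
    """Day of the week reached days_ahead days after start_day, working modulo 7."""
    return DAYS[(DAYS.index(start_day) + days_ahead) % 7]


def hour_after(start_hour: int, hours_ahead: int) -> int:
    """Hour on a 24-hour clock reached hours_ahead hours after start_hour, working modulo 24."""
    return (start_hour + hours_ahead) % 24


print(day_after("Monday", 100))  # Wednesday
print(hour_after(13, 164))       # 9
```

Reducing 13+164 in one step gives the same answer as first reducing 164 to 20 and then reducing 33 to 9, as was done by hand above.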

::: example Example 133. Let a,n\in\mathbb{Z} with n\geq 1. We have that n\mid\left(a-a\right) and so by proposition 122{reference-type="ref" reference="prop:NT_congruent_iff_difference_is_divisible"} we have that a\equiv a\ (\mathrm{mod}\ n). :::

::: example Example 134. Let a=8, b=11 and n=5. Using the division algorithm we can see that

$$\begin{align*} 8&\equiv 3\ (\mathrm{mod}\ 5)\\ 11&\equiv 1\ (\mathrm{mod}\ 5) \end{align*}$$

Now, consider a+b=19. By the division algorithm, we see that

$$\begin{equation*} 19=3\left(5\right)+4 \end{equation*}$$

So that 19\equiv 4\ (\mathrm{mod}\ 5), and 4=3+1 is the sum of the residues of 8 and 11. It would seem that we can add congruences together and the result makes sense. :::

The last two examples hint at some properties of congruences which should be investigated. In particular, a\equiv a\ (\mathrm{mod}\ n) is one criterion of being an equivalence relation. Do the other properties for being an equivalence relation hold? This is to say if a\equiv b\ (\mathrm{mod}\ n) do we have that b\equiv a\ (\mathrm{mod}\ n) and if a\equiv b\ (\mathrm{mod}\ n) and b\equiv c\ (\mathrm{mod}\ n) do we have a\equiv c\ (\mathrm{mod}\ n)? As it turns out the answer is yes.

::: {#prop:NT_congruences_form_equivalence_relation .proposition} Proposition 123. Congruences are an equivalence relation

Let a,b,n\in\mathbb{Z} so that n is fixed and n\geq 1. Consider the relation \sim_n where

$$\begin{equation*} a\sim_n b \iff a\equiv b\ (\mathrm{mod}\ n) \end{equation*}$$

We have that \sim_n is an equivalence relation. That is

  1. $a\equiv a\ (\mathrm{mod}\ n)$

  2. If a\equiv b\ (\mathrm{mod}\ n) then $b\equiv a\ (\mathrm{mod}\ n)$

  3. If a\equiv b\ (\mathrm{mod}\ n) and b\equiv c\ (\mathrm{mod}\ n) then $a\equiv c\ (\mathrm{mod}\ n)$

Proof:

  1. a\equiv a\ (\mathrm{mod}\ n):

    As n\mid\left(a-a\right), by proposition 122{reference-type="ref" reference="prop:NT_congruent_iff_difference_is_divisible"} we have that a\equiv a\ (\mathrm{mod}\ n).

  2. If a\equiv b\ (\mathrm{mod}\ n) then b\equiv a\ (\mathrm{mod}\ n):

    Suppose that a\equiv b\ (\mathrm{mod}\ n) then n\mid\left(a-b\right). By the definition of divisibility, we have that \exists k\in\mathbb{Z} so that a-b=kn. Multiplying both sides by -1 gives b-a=\left(-k\right)n. So again by the definition of divisibility, we have that n\mid\left(b-a\right) and so b\equiv a\ (\mathrm{mod}\ n).

  3. If a\equiv b\ (\mathrm{mod}\ n) and b\equiv c\ (\mathrm{mod}\ n) then a\equiv c\ (\mathrm{mod}\ n):

    Suppose that a\equiv b\ (\mathrm{mod}\ n) and b\equiv c\ (\mathrm{mod}\ n), then n\mid\left(a-b\right) and n\mid\left(b-c\right). Property 3. of proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"} then gives us that n\mid\left(\left(a-b\right)+\left(b-c\right)\right).

    Clearly \left(a-b\right)+\left(b-c\right)=a-c and so n\mid\left(a-c\right) which is to say a\equiv c\ (\mathrm{mod}\ n).

As required. $\qed$ :::

We now know that congruences form an equivalence relation, one for each n\geq 1. So we can consider the equivalence classes that are formed by congruences. Let a\in\mathbb{Z}, what does the equivalence class \left[a\right]_n look like?

Recall that \left[a\right]_n is given by

$$\begin{equation*} \left[a\right]_n=\left\{x\in\mathbb{Z}: a\sim_n x\right\}=\left\{x\in\mathbb{Z}:a\equiv x\ (\mathrm{mod}\ n)\right\} \end{equation*}$$

That is, the equivalence class \left[a\right]_n is the set of integers that are congruent to a modulo n. Equivalently, as \sim_n is an equivalence relation, we have that a\equiv x\ (\mathrm{mod}\ n) is the same as x\equiv a\ (\mathrm{mod}\ n). Hence we can view \left[a\right]_n as the set of integers x that give the same remainder as a when divided by n. This is to say

$$\begin{equation*} \left[a\right]_n=\left\{x\in\mathbb{Z}:x\equiv a\ (\mathrm{mod}\ n)\right\} \end{equation*}$$

For example, suppose that a=0, then we have that

$$\begin{align*} \left[0\right]_n&=\left\{x\in\mathbb{Z}:x\equiv 0\ (\mathrm{mod}\ n)\right\}\\ &=\left\{\dots,-2n,-n,0,n,2n,\dots\right\} \end{align*}$$

Likewise, when a=1 we have that

$$\begin{align*} \left[1\right]_n&=\left\{x\in\mathbb{Z}:x\equiv 1\ (\mathrm{mod}\ n)\right\}\\ &=\left\{\dots,1-2n,1-n,1,1+n,1+2n,\dots\right\} \end{align*}$$

That is, the equivalence class of 0 modulo n is simply the set of multiples of n, and the equivalence class of 1 modulo n consists of the integers that are one more than a multiple of n.

How many congruence classes are there for a given n? Clearly, by the division algorithm, there are at most n such classes. We can show that there are exactly n classes.

::: {#prop:NT_congruence_equiv_class_count .proposition} Proposition 124. Number of equivalence classes for congruence equivalence relation

Let a,b,n\in\mathbb{Z} so that n\geq 1 and consider the relation \sim_n given by

$$\begin{equation} a\sim_n b\iff a\equiv b\ (\mathrm{mod}\ n) \end{equation*}$$*

We have that there are n equivalence classes for the relation \sim_n, one for each possible remainder.

Proof:

By the division algorithm applied to n dividing a we have for a,b,n\in\mathbb{Z} with n\geq 1 that

$$\begin{equation} a=qn+r \end{equation*}$$*

for some unique q,r\in\mathbb{Z} and 0\leq r < \left|n\right|. Hence the possible remainders are in the set

$$\begin{equation*} R=\left\{0,1,2,\dots,n-1\right\} \end{equation*}$$

as n\geq 1. Firstly we show that no two distinct i,j\in R are congruent modulo n. So suppose, WLOG, that 0\leq i<j<n, then 0<j-i<n. As no positive integer smaller than n is a multiple of n, we have that n\nmid\left(j-i\right) and so j\not\equiv i\ (\mathrm{mod}\ n). As the choice of i,j was arbitrary we conclude that no two distinct elements of R are congruent. It was shown in the proof of theorem 19{reference-type="ref" reference="thm:EquivClassesOfRelationPartitionSet"} that unequal equivalence classes are disjoint. Hence \left[i\right]_n\neq\left[j\right]_n for all distinct i,j\in R, so there are at least n classes.

Now, it is left to show that every a\in\mathbb{Z} belongs to the equivalence class of some r\in R, so that there are no further classes. This is clear upon rewriting the result from the division algorithm as

$$\begin{equation} a-r=qn \end{equation*}$$*

which gives that a\equiv r\ (\mathrm{mod}\ n). $\qed$ :::

It follows immediately by theorem 19{reference-type="ref" reference="thm:EquivClassesOfRelationPartitionSet"} that the equivalence classes modulo n partition \mathbb{Z}. Additionally we have that \left[a\right]_n=\left[b\right]_n if and only if a\equiv b\ (\mathrm{mod}\ n).
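To see the partition concretely, here is a small sketch that sorts a range of integers into their classes modulo n. The helper name `classes_modulo` is hypothetical. It relies on the fact that, for n\geq 1, Python's `%` operator returns the remainder r with 0\leq r<n even for negative integers, which matches the division algorithm used in the proof.

```python
from collections import defaultdict


def classes_modulo(n, integers):
    """Group the given integers by their remainder on division by n."""
    classes = defaultdict(list)
    for a in integers:
        classes[a % n].append(a)  # a % n lies in 0..n-1 when n >= 1
    return dict(classes)


parts = classes_modulo(4, range(-8, 8))
for r in sorted(parts):
    print(r, parts[r])
# 0 [-8, -4, 0, 4]
# 1 [-7, -3, 1, 5]
# 2 [-6, -2, 2, 6]
# 3 [-5, -1, 3, 7]
```

Every integer in the range lands in exactly one of the four lists, as the proposition predicts.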

As we have shown that \sim_n is an equivalence relation we can define the quotient set, as in definition 96{reference-type="ref" reference="def:QuotientSet"}.

::: definition Definition 163. The integers modulo $n$

Let n\in\mathbb{Z} with n\geq 1. We define the quotient set \mathbb{Z}_n to be

$$\begin{equation*} \mathbb{Z}_n=\left\{\left[a\right]_n:a\in\mathbb{Z}\right\} \end{equation*}$$

By proposition 124{reference-type="ref" reference="prop:NT_congruence_equiv_class_count"} we know there are n such sets which correspond to all the possible remainders when an integer a is divided by n. Hence we can explicitly write

$$\begin{equation*} \mathbb{Z}_n=\left\{\left[0\right]_n,\left[1\right]_n,\left[2\right]_n,\dots,\left[n-1\right]_n\right\} \end{equation*}$$

We can take a canonical representative of each class; for example, for the class \left[0\right]_n we take the canonical representative to be 0. Identifying each class with its canonical representative, we can write \mathbb{Z}_n more cleanly as

$$\begin{equation*} \mathbb{Z}_n=\left\{0,1,2,\dots,n-1\right\} \end{equation*}$$ :::

As hinted by an example and because we have defined arithmetic on \mathbb{Z}, the next natural question is how does arithmetic work in \mathbb{Z}_n? Recall the example where we had a=8, b=11 and n=5, so that

$$\begin{align*} 8&\equiv 3\ (\mathrm{mod}\ 5)\\ 11&\equiv 1\ (\mathrm{mod}\ 5) \end{align*}$$

We computed a+b=19 and found that

$$\begin{equation*} 19\equiv 4\ (\mathrm{mod}\ 5) \end{equation*}$$

which agrees with the sum of the residues, 3+1=4.

What about multiplication? We know that 8*8=64 and we can see that 64\equiv 4\ (\mathrm{mod}\ 5). Multiplying the residue of 3 with itself we get 3*3=9 from which we see 9\equiv 4\ (\mathrm{mod}\ 5). Similarly, we can see that subtraction makes sense in \mathbb{Z}_n. We know that 11-8=3 so clearly 11-8\equiv 3\ (\mathrm{mod}\ 5). Subtracting the residues gives 1-3=-2. At first, this seems to be a problem, as we seem to be saying that 3\equiv -2\ (\mathrm{mod}\ 5). However, a quick review of the definition of congruences tells us that this is correct. We know that a\equiv b\ (\mathrm{mod}\ n) if and only if n\mid\left(a-b\right), and in our case we indeed have that 5\mid\left(3-\left(-2\right)\right) as 3-\left(-2\right)=5.

We can make the idea of addition, subtraction and multiplication rigorous.

::: {#prop:NT_operations_on_congruences .proposition} Proposition 125. Addition, subtraction and multiplication of congruences

Let a,b,c,d,n\in\mathbb{Z} so that a\equiv b\ (\mathrm{mod}\ n) and c\equiv d\ (\mathrm{mod}\ n). We have that

  1. $\left(a+c\right)\equiv \left(b+d\right)\ (\mathrm{mod}\ n)$

  2. $\left(a-c\right)\equiv \left(b-d\right)\ (\mathrm{mod}\ n)$

  3. $\left(ac\right)\equiv \left(bd\right)\ (\mathrm{mod}\ n)$

Proof:

Let a,b,c,d,n\in\mathbb{Z} be as given by the hypothesis. As a\equiv b\ (\mathrm{mod}\ n) we have by proposition 122{reference-type="ref" reference="prop:NT_congruent_iff_difference_is_divisible"} that n\mid\left(a-b\right) and so by the definition of divisibility we have that a-b=kn for some k\in\mathbb{Z}. Likewise, as c\equiv d\ (\mathrm{mod}\ n) we have by proposition 122{reference-type="ref" reference="prop:NT_congruent_iff_difference_is_divisible"} that n\mid\left(c-d\right) and so by the definition of divisibility we have that c-d=ln for some l\in\mathbb{Z}.

In particular, we have that a=b+kn and c=d+ln. It follows that

$$\begin{align*} a+c&=\left(b+kn\right)+\left(d+ln\right)\\ &=\left(b+d\right)+\left(kn+ln\right)\\ &=\left(b+d\right)+n\left(k+l\right)\\ &\Rightarrow n\mid\left(\left(a+c\right)-\left(b+d\right)\right)\\ &\Rightarrow \left(a+c\right)\equiv \left(b+d\right)\ (\mathrm{mod}\ n) \end{align*}$$

Likewise, for subtraction we have

$$\begin{align*} a-c&=\left(b+kn\right)-\left(d+ln\right)\\ &=\left(b-d\right)+\left(kn-ln\right)\\ &=\left(b-d\right)+n\left(k-l\right)\\ &\Rightarrow n\mid\left(\left(a-c\right)-\left(b-d\right)\right)\\ &\Rightarrow \left(a-c\right)\equiv \left(b-d\right)\ (\mathrm{mod}\ n) \end{align*}$$

Finally, for multiplication, we see that

$$\begin{align*} ac&=\left(b+kn\right)\left(d+ln\right)\\ &=bd+bln+dkn+kln^2\\ &=bd+n\left(bl+dk+kln\right)\\ &\Rightarrow n\mid\left(\left(ac\right)-\left(bd\right)\right)\\ &\Rightarrow \left(ac\right)\equiv \left(bd\right)\ (\mathrm{mod}\ n) \end{align*}$$

As required. $\qed$ :::

This proposition provides the backbone of showing that the operations of addition, subtraction and multiplication are well-defined on \mathbb{Z}_n.
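As a quick sanity check of the three rules, the sketch below reuses the running example with 8\equiv 3 and 11\equiv 1 modulo 5. The helper `congruent` is our own name; it simply tests whether n divides the difference.

```python
def congruent(a: int, b: int, n: int) -> bool:
    """True when n divides a - b, that is, when a is congruent to b modulo n."""
    return (a - b) % n == 0


n = 5
a, b = 8, 3    # 8 is congruent to 3 modulo 5
c, d = 11, 1   # 11 is congruent to 1 modulo 5

checks = {
    "sum": congruent(a + c, b + d, n),
    "difference": congruent(a - c, b - d, n),
    "product": congruent(a * c, b * d, n),
}
print(checks)  # {'sum': True, 'difference': True, 'product': True}
```

A numerical check of this kind is of course no substitute for the proof; it only illustrates the statement for one choice of values.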

::: definition Definition 164. Addition, subtraction and multiplication on $\mathbb{Z}_n$

Let a,b,n\in\mathbb{Z} with n\geq 1. We define addition, subtraction and multiplication on \mathbb{Z}_n by

  1. $\left[a\right]_n+\left[b\right]_n=\left[a+b\right]_n$

  2. $\left[a\right]_n-\left[b\right]_n=\left[a-b\right]_n$

  3. $\left[a\right]_n\left[b\right]_n=\left[ab\right]_n$ :::

We prove these are well-defined.

::: {#prop:NT_addition_subtraction_multiplication_Zn_well_defined .proposition} Proposition 126. Addition, subtraction and multiplication on \mathbb{Z}_n is well-defined and closed

Let n\in\mathbb{Z} so that n\geq 1. We have that addition, subtraction and multiplication of equivalence classes are well-defined and closed. This is to say \forall x,y\in\mathbb{Z} we have that

  1. $\left[x\right]_n+\left[y\right]_n=\left[x+y\right]_n\in\mathbb{Z}_n$

  2. $\left[x\right]_n-\left[y\right]_n=\left[x-y\right]_n\in\mathbb{Z}_n$

  3. $\left[x\right]_n\left[y\right]_n=\left[xy\right]_n\in\mathbb{Z}_n$

Proof:

Suppose that a\in\left[x\right]_n and b\in\left[y\right]_n. By definition, we have that a\equiv x\ (\mathrm{mod}\ n) and b\equiv y\ (\mathrm{mod}\ n).

By proposition 125{reference-type="ref" reference="prop:NT_operations_on_congruences"} we have that

  1. $a+b\equiv x+y\ (\mathrm{mod}\ n)$

  2. $a-b\equiv x-y\ (\mathrm{mod}\ n)$

  3. $ab\equiv xy\ (\mathrm{mod}\ n)$

So that a+b\in\left[x+y\right]_n, a-b\in\left[x-y\right]_n and ab\in\left[xy\right]_n, showing the operations are well-defined. Closure is immediate in each case. $\qed$ :::
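Well-definedness says that the result does not depend on which representatives we pick. The sketch below, with representatives chosen by us purely for illustration, takes three representatives of \left[2\right]_6 and three of \left[5\right]_6 and confirms that every pairing lands in the same class.

```python
from itertools import product

n = 6
reps_x = [2, 8, -4]    # all congruent to 2 modulo 6
reps_y = [5, 11, -1]   # all congruent to 5 modulo 6

# Collect the canonical representative of each result; a well-defined
# operation must produce exactly one value per set.
sums = {(a + b) % n for a, b in product(reps_x, reps_y)}
differences = {(a - b) % n for a, b in product(reps_x, reps_y)}
products = {(a * b) % n for a, b in product(reps_x, reps_y)}

print(sums, differences, products)  # {1} {3} {4}
```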

We now have a well-defined idea of arithmetic on \mathbb{Z}_n. A poor student or a particularly clever dog will realise immediately that we have missed out on some operations that were defined on \mathbb{Z}. In this section for example we defined what integer division means. What about exponentiation?

We will first look at exponentiation. Thankfully there isn't much work to do as we can make use of the definition of multiplication for \mathbb{Z}_n. We can see that

$$\begin{align*} \left(\left[a\right]\right)^2&=\left[a\right]\left[a\right]=\left[aa\right]=\left[a^2\right]\\ \left(\left[a\right]\right)^3&=\left[a\right]^2\left[a\right]=\left[a^2a\right]=\left[a^3\right]\\ \left(\left[a\right]\right)^4&=\left[a\right]^3\left[a\right]=\left[a^3a\right]=\left[a^4\right]\\ &\dots \end{align*}$$

So clearly exponentiation is well-defined. Now, what about division? We expect that to get a well-defined definition for division in \mathbb{Z}_n it should respect the definition of divisibility for the integers. Herein lies the problem: division over \mathbb{Z} is not well-defined, for example, 3\nmid 2, so it is clear there is no equivalence class for this case. What about the cases where division over \mathbb{Z} is well-defined? This is our definition of being congruent so we can't extend to division of congruences this way either.

However, recall that we have defined the idea of a multiplicative inverse. In particular, for x\in\mathbb{Z} such that x\neq 0, we said that y\in\mathbb{Q} is a multiplicative inverse of x if

$$\begin{equation*} xy=1=yx \end{equation*}$$

Perhaps then, we might hope to recover some notion of division modulo n by using multiplicative inverses. Such a definition, of course, would have to respect congruences. So for x\in\mathbb{Z}_n with x\not\equiv 0\ (\mathrm{mod}\ n), we are looking for y\in\mathbb{Z}_n so that x*y\equiv 1 \ (\mathrm{mod}\ n). To start, it would be wise to look at multiplication for a few small values of n\geq 2, to get a feel for what we are looking for.

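| * | 0 | 1 |
|---|---|---|
| 0 | 0 | 0 |
| 1 | 0 | 1 |

: The multiplication table for n=2

| * | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 2 |
| 2 | 0 | 2 | 1 |

: The multiplication table for n=3

| * | 0 | 1 | 2 | 3 |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 2 | 3 |
| 2 | 0 | 2 | 0 | 2 |
| 3 | 0 | 3 | 2 | 1 |

: The multiplication table for n=4

| * | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 2 | 3 | 4 |
| 2 | 0 | 2 | 4 | 1 | 3 |
| 3 | 0 | 3 | 1 | 4 | 2 |
| 4 | 0 | 4 | 3 | 2 | 1 |

: The multiplication table for n=5

| * | 0 | 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 2 | 3 | 4 | 5 |
| 2 | 0 | 2 | 4 | 0 | 2 | 4 |
| 3 | 0 | 3 | 0 | 3 | 0 | 3 |
| 4 | 0 | 4 | 2 | 0 | 4 | 2 |
| 5 | 0 | 5 | 4 | 3 | 2 | 1 |

: The multiplication table for n=6

| * | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1 | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
| 2 | 0 | 2 | 4 | 6 | 1 | 3 | 5 |
| 3 | 0 | 3 | 6 | 2 | 5 | 1 | 4 |
| 4 | 0 | 4 | 1 | 5 | 2 | 6 | 3 |
| 5 | 0 | 5 | 3 | 1 | 6 | 4 | 2 |
| 6 | 0 | 6 | 5 | 4 | 3 | 2 | 1 |

: The multiplication table for n=7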
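Tables of this kind are tedious to compute by hand but easy to generate. A minimal sketch follows, in which the function name `multiplication_table` is our own: the entry in row x and column y is the canonical representative of \left[xy\right]_n.

```python
def multiplication_table(n: int) -> list[list[int]]:
    """Row x, column y holds (x * y) reduced modulo n, for 0 <= x, y < n."""
    return [[(x * y) % n for y in range(n)] for x in range(n)]


for row in multiplication_table(4):
    print(row)
# [0, 0, 0, 0]
# [0, 1, 2, 3]
# [0, 2, 0, 2]
# [0, 3, 2, 1]
```

A row contains a 1 exactly when the corresponding element has a multiplicative inverse, which is the pattern examined next.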

What do these tables tell us? Starting with the case n=2, we see that only x\equiv 1\ (\mathrm{mod}\ 2) has a multiplicative inverse, namely y\equiv 1\ (\mathrm{mod}\ 2). For n=3, we see that if x\equiv 1\ (\mathrm{mod}\ 3) then we can take y\equiv 1\ (\mathrm{mod}\ 3) and likewise if x\equiv 2\ (\mathrm{mod}\ 3) then we can take y\equiv 2\ (\mathrm{mod}\ 3).

Things get a little more complicated for n=4. We see that if x\equiv 1\ (\mathrm{mod}\ 4) then we take y\equiv 1\ (\mathrm{mod}\ 4) and if x\equiv 3\ (\mathrm{mod}\ 4) we take y\equiv 3\ (\mathrm{mod}\ 4). What about x\equiv 2\ (\mathrm{mod}\ 4)? Looking at the table we see that x*1\equiv 2\ (\mathrm{mod}\ 4) and x*3\equiv 2\ (\mathrm{mod}\ 4), and finally x*2\equiv 0\ (\mathrm{mod}\ 4). A disaster! We have that 2 does not have a multiplicative inverse modulo 4. Hence not all elements of \mathbb{Z}_4 have a multiplicative inverse. A similar situation occurs for the case n=6, for example, the row for x\equiv 3\ (\mathrm{mod}\ 6) shows only 3 and 0 can be results.

So our quest of being able to define some notion of division for \mathbb{Z}_n in general appears to be at an end.

That being said, the situation looks more promising in the cases of n=5 and n=7. For \mathbb{Z}_5 we have the following multiplicative inverses, for those x\not\equiv 0\ (\mathrm{mod}\ 5)

| x | x^{-1} |
|---|--------|
| 1 | 1 |
| 2 | 3 |
| 3 | 2 |
| 4 | 4 |

: The elements x\not\equiv 0\ (\mathrm{mod}\ 5) and their respective multiplicative inverses

Likewise for the elements x\not\equiv 0\ (\mathrm{mod}\ 7) for \mathbb{Z}_7 we have

| x | x^{-1} |
|---|--------|
| 1 | 1 |
| 2 | 4 |
| 3 | 5 |
| 4 | 2 |
| 5 | 3 |
| 6 | 6 |

: The elements x\not\equiv 0\ (\mathrm{mod}\ 7) and their respective multiplicative inverses

We saw similar situations for \mathbb{Z}_2 and \mathbb{Z}_3, so what do \mathbb{Z}_2, \mathbb{Z}_3, \mathbb{Z}_5, \mathbb{Z}_7 have in common? The thing they have in common is that the modulus is a prime! Does this result hold for all primes? If so, why? If not, why not and what primes does it fail for?

We also saw cases in \mathbb{Z}_4 and \mathbb{Z}_6 where certain elements did have a multiplicative inverse. For example in \mathbb{Z}_4 we saw x\equiv 1\ (\mathrm{mod}\ 4) had the multiplicative inverse of 1, similarly we saw x\equiv 3\ (\mathrm{mod}\ 4) had a multiplicative inverse of 3. In \mathbb{Z}_6 we can see that 1 has the inverse of 1, and 5 has an inverse of 5. So what is special in the case where n is not prime that allows some elements to have an inverse?

In the case of \mathbb{Z}_4 the elements which had an inverse, 1 and 3, are co-prime to 4. Likewise in \mathbb{Z}_6 the elements that had an inverse were 1 and 5, which are again co-prime to 6. When n is prime, we make the trivial observation that all non-zero elements of \mathbb{Z}_n are co-prime to n, for if not then they share a common prime factor with n, which can only be n itself, and no non-zero element of \mathbb{Z}_n is a multiple of n. It seems we have recovered our original goal, that is to say, it looks like it is the case that an element x\in\mathbb{Z}_n for n\geq 2 has a multiplicative inverse if \mathop{\mathrm{GCD}}\left(x,n\right)=1. In fact, this is an if-and-only-if statement, as we now prove.

::: {#prop:NT_modulo_inverse_iff_coprime_with_modulus .proposition} Proposition 127. Existence of inverse element in $\mathbb{Z}_n$

Let n\in\mathbb{Z} with n\geq 2. Let x\in\mathbb{Z}_n. The multiplicative inverse of x in \mathbb{Z}_n exists if and only if $\mathop{\mathrm{GCD}}\left(x,n\right)=1$.

Proof:

\left(\Rightarrow\right): Let x\in\mathbb{Z}_n have an inverse y\in\mathbb{Z}_n. We therefore have that

$$\begin{equation} xy\equiv 1\ (\mathrm{mod}\ n) \end{equation*}$$*

By the definition of congruences, we therefore have that xy=1+kn for some k\in\mathbb{Z}. Let d=\mathop{\mathrm{GCD}}\left(x,n\right). As d is the greatest common divisor of x and n we have that d\mid x and d\mid n so d\mid xy-kn. But xy-kn=1 so d\mid 1. Clearly \mathop{\mathrm{GCD}}\left(x,n\right)\geq 1 and so we conclude that d=1.

\left(\Leftarrow\right): Suppose that \mathop{\mathrm{GCD}}\left(x,n\right)=1. We have by Bézout's Identity (theorem 36{reference-type="ref" reference="thm:NT_bezout_id"}) that \exists a,b\in\mathbb{Z} so that

$$\begin{equation} ax+bn=1 \end{equation*}$$*

Modulo n, we get that ax\equiv 1\ (\mathrm{mod}\ n) and so a is the inverse element of x in \mathbb{Z}_n.

As required. $\qed$ :::
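The second half of the proof is constructive: the Bézout coefficient of x is the inverse. The sketch below follows that idea using the extended Euclidean algorithm. The function names `extended_gcd` and `inverse_mod` are our own, and returning `None` when no inverse exists is a design choice made for this illustration.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, s, t) with g = GCD(a, b) = s*a + t*b, for a, b >= 0."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def inverse_mod(x: int, n: int):
    """Return the inverse of x in Z_n, or None when GCD(x, n) != 1."""
    g, s, _ = extended_gcd(x % n, n)
    if g != 1:
        return None
    return s % n  # reduce the Bezout coefficient to its canonical representative


print(inverse_mod(3, 7))  # 5
print(inverse_mod(2, 5))  # 3
print(inverse_mod(2, 4))  # None
```

The first two outputs agree with the inverse tables for \mathbb{Z}_7 and \mathbb{Z}_5, and the third reflects that 2 has no inverse modulo 4.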

::: {#cor:NT_all_modulo_inverses_if_n_prime .corollary} Corollary 10. All non-zero elements of \mathbb{Z}_n have a multiplicative inverse if n is prime

Let p be prime. We have that all the non-zero elements of \mathbb{Z}_p have a multiplicative inverse.

Proof:

By corollary 8{reference-type="ref" reference="cor:NT_PrimeNotDividing_Integer_implies_coprime"}, if p is a prime and p\nmid a for some a\in\mathbb{Z} then \mathop{\mathrm{GCD}}\left(a,p\right)=1. Now suppose that x\in\mathbb{Z}_p is non-zero, so that 0<x<p. In particular p\nmid x. As this is true for every non-zero x\in\mathbb{Z}_p then \mathop{\mathrm{GCD}}\left(x,p\right)=1 and so each non-zero x\in\mathbb{Z}_p has a multiplicative inverse by proposition 127{reference-type="ref" reference="prop:NT_modulo_inverse_iff_coprime_with_modulus"}. $\qed$ :::
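Both the proposition and the corollary can be checked by brute force for small moduli. In the sketch below, whose function names are ours, the invertible elements are found once by searching for a partner y and once by the greatest common divisor test, and the two lists are compared.

```python
from math import gcd


def units_by_search(n: int) -> list[int]:
    """Non-zero x in Z_n for which some y in Z_n satisfies x*y = 1 modulo n."""
    return [x for x in range(1, n) if any((x * y) % n == 1 for y in range(1, n))]


def units_by_gcd(n: int) -> list[int]:
    """Non-zero x in Z_n with GCD(x, n) = 1."""
    return [x for x in range(1, n) if gcd(x, n) == 1]


print(units_by_search(6))  # [1, 5]
print(units_by_search(7))  # [1, 2, 3, 4, 5, 6]
agree = all(units_by_search(n) == units_by_gcd(n) for n in range(2, 40))
print(agree)  # True
```

For the prime modulus 7 every non-zero element appears, while for 6 only the elements co-prime to 6 do.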

We have now recovered a definition of division of the congruence classes of \mathbb{Z}_n. Now that modular arithmetic is on a solid footing, what can we use it for? One immediate use case is solving problems about divisibility.

::: example Example 135. We will show that 6\mid a\left(a+1\right)\left(a+2\right) for every integer a. We observe that the possible residues of a modulo 6 are 0, 1, 2, 3, 4 and 5. It is enough to check that the product is congruent to zero modulo 6 for each of these residues.

When a\equiv 0\ (\mathrm{mod}\ 6) we see that a+1\equiv 1\ (\mathrm{mod}\ 6) and a+2\equiv 2\ (\mathrm{mod}\ 6). So that

$$\begin{equation*} a\left(a+1\right)\left(a+2\right)\equiv 0\cdot 1\cdot 2 \equiv 0\ (\mathrm{mod}\ 6) \end{equation*}$$

Now, when a\equiv 1\ (\mathrm{mod}\ 6) we see that a+1\equiv 2\ (\mathrm{mod}\ 6) and a+2\equiv 3\ (\mathrm{mod}\ 6), giving

$$\begin{equation*} a\left(a+1\right)\left(a+2\right)\equiv 1\cdot 2\cdot 3 \equiv 0\ (\mathrm{mod}\ 6) \end{equation*}$$

as 1*2*3=6, which is congruent to zero modulo 6. For the remaining residues we see, with an abuse of notation for brevity, that

| $a$ | $a\ (\mathrm{mod}\ 6)$ | $\left(a+1\right)\ (\mathrm{mod}\ 6)$ | $\left(a+2\right)\ (\mathrm{mod}\ 6)$ | $a\left(a+1\right)\left(a+2\right)\ (\mathrm{mod}\ 6)$ |
|-----|-----|-----|-----|-----|
| $2$ | $2$ | $3$ | $4$ | $24\equiv 0$ |
| $3$ | $3$ | $4$ | $5$ | $60\equiv 0$ |
| $4$ | $4$ | $5$ | $0$ | $0\equiv 0$ |
| $5$ | $5$ | $0$ | $1$ | $0\equiv 0$ |

: The residues of a\ (\mathrm{mod}\ 6) for a\geq 2, the values of each term and their resultant multiplication modulo $6$

We can see that the product is always zero modulo 6. As each product is always congruent to zero modulo 6 then a\left(a+1\right)\left(a+2\right)\equiv 0\ (\mathrm{mod}\ 6) which implies 6\mid a\left(a+1\right)\left(a+2\right). :::
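The case analysis in this example is exactly the kind of finite check a computer does well. A short sketch, with names of our own choosing, evaluates the product for each residue and then spot-checks a range of integers, including negative ones.

```python
def product_mod_6(a: int) -> int:
    """Return a(a+1)(a+2) reduced modulo 6."""
    return (a * (a + 1) * (a + 2)) % 6


residue_results = {r: product_mod_6(r) for r in range(6)}
print(residue_results)  # {0: 0, 1: 0, 2: 0, 3: 0, 4: 0, 5: 0}

# Spot-check a window of integers; by the argument above the six
# residues already cover every case.
spot_check = all(product_mod_6(a) == 0 for a in range(-50, 51))
print(spot_check)  # True
```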

The astute reader may notice that this feels longer than a proof that uses only the definition of divisibility. The astute reader would be correct. In fact we have that 6\mid m for some m\in\mathbb{Z} if and only if 2\mid m and 3\mid m.

Indeed, suppose that 6\mid m then m=6n for some n\in\mathbb{Z}, moreover 6=2*3 so m=2*3*n which implies that 2\mid m and 3\mid m. Conversely, if 2\mid m and 3\mid m then 6 clearly divides m as 2 and 3 will appear at least once in the prime factorisation of m. So why did we bother with congruences? By first doing the longer calculations and then the shorter proof, we have seen a hint at a possible generalisation to the theory!

That is, for n\in\mathbb{Z} with n>0 and prime factorisation

$$\begin{equation*} n=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k} \end{equation*}$$

we might expect that a\equiv b\ (\mathrm{mod}\ n) if and only if a\equiv b\ (\mathrm{mod}\ p_i^{e_i}) for each i=1,2,\dots, k. We can prove this.

::: proposition Proposition 128. Congruent modulo n if and only if congruent modulo each prime power in the factorisation of n

Let n\in\mathbb{Z} so that n>0 and n has a prime factorisation given by

$$\begin{equation} n=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k} \end{equation*}$$*

Let a,b\in\mathbb{Z}. We have that a\equiv b\ (\mathrm{mod}\ n) if and only if a\equiv b\ (\mathrm{mod}\ p_i^{e_i}) for each i where $i=1,2,\dots, k$

Proof:

Let n\in\mathbb{Z} be as given in the hypothesis. We have that

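$$\begin{align*} a\equiv b\ (\mathrm{mod}\ n) &\iff n\mid\left(a-b\right)\\ &\iff p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k}\mid\left(a-b\right)\\ &\iff p_i^{e_i}\mid\left(a-b\right),\ \text{for each } i=1,2,\dots, k\\ &\iff a\equiv b\ (\mathrm{mod}\ p_i^{e_i}),\ \text{for each } i=1,2,\dots, k \end{align*}$$

The third equivalence holds because the prime powers p_1^{e_1},\dots,p_k^{e_k} are pairwise co-prime, so their product divides a-b exactly when each of them does.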

As required. $\qed$ :::
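The proposition can be illustrated numerically. The sketch below factors n into prime powers by trial division, which is a simple method chosen for brevity here, and compares the two sides of the equivalence. The names `prime_power_factors` and `congruent` are hypothetical helpers.

```python
def prime_power_factors(n: int) -> list[int]:
    """Return the prime powers p_i^{e_i} whose product is n, for n >= 1."""
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            q = 1
            while n % p == 0:  # strip every copy of p, accumulating p^e in q
                n //= p
                q *= p
            factors.append(q)
        p += 1
    if n > 1:                  # whatever remains is a single prime
        factors.append(n)
    return factors


def congruent(a: int, b: int, m: int) -> bool:
    return (a - b) % m == 0


n = 360
powers = prime_power_factors(n)
print(powers)  # [8, 9, 5]

agree = all(
    congruent(a, b, n) == all(congruent(a, b, q) for q in powers)
    for a in range(0, 800, 7)
    for b in (0, 1, 5)
)
print(agree)  # True
```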

Another use of congruences is in cryptography, which is a field of study of taking messages and encoding (obfuscating) them in such a way that only the person the message was intended for can read it. This is especially true for the RSA14 encryption method. We already have some of the mathematical machinery required to explore how this method of cryptography works, namely prime numbers and congruences. On the other hand, we still lack some important theory. If cryptography is the field of encoding messages so that only the person the message was intended for can read it, then there is some method that encodes the message and a method that decodes the message using some information known to both the sender and recipient. This means that using this information the recipient will have some method of finding out the original message! We look at this idea in more detail.

Diophantine equations and Polynomials

::: epigraph I had a Polynomial once. My Doctor removed it.

Micheal Grant :::

We start with a definition that we have seen numerous times so far but have not formally defined. That of an equation.

::: definition Definition 165. Equation

An equation is a mathematical statement that states that two expressions are equal. :::

This seems simple enough, but what does it mean? Unfortunately, this depends on the situation, different situations will have a different meaning of what a statement is. Thankfully, we have seen equations already throughout the text so this abstract definition is familiar to us. For example

$$\begin{equation*} 1+1=2 \end{equation*}$$

is an equation. So is \mathop{\mathrm{GCD}}\left(a,b\right)=d, in a similar vein we have from Bézout's Identity that d=ax+by is also an equation. So why define something if it is really this simple? Simply put, we can use the idea of an equation in a more complex way. For example,

$$\begin{equation*} 1+x=2 \end{equation*}$$

says that 1 plus x is equal to 2 but we don't know what x is. However, we can see that

$$\begin{align*} 1+x&=2\\ x&=1 \end{align*}$$

That is, we see that x=1, this is an equation! This is where the power of an equation starts to show its worth. If we have a problem where we don't know the value of some quantity of interest, we might be able to work out what that quantity is. We have seen more complex examples of equations, for example x^2=2 which we have shown has no value of x\in\mathbb{Q} where it is true.

Hence, equations that contain a value, or maybe multiple values that we don't know but want to know, are important. This section is focused on looking at such equations. We make another couple of definitions for when an equation contains a value we don't know.

::: definition Definition 166. Variable

A variable is a value that is allowed to be changed either freely or restricted by some constraint or equation. A variable can be taken to be any meaningful value, either inside or outside of some set S. The context of the statement under study usually makes it clear where the variable belongs. :::

::: definition Definition 167. Indeterminate variable

An indeterminate variable is a variable whose value has not been specified. As with a variable, it could be inside or outside of some set S. :::

::: definition Definition 168. An unknown variable

An unknown variable, or simply an unknown, is a variable whose value is unknown but we wish to find its value. As before, this unknown variable is to be taken as a member of a set S. If a value for the unknown variable can be found, we call it a solution to the equation. :::

For example, the equation 5x+1=2 would have x as the indeterminate variable, if we were solving for x then x would be the unknown variable as well. The equation 2x+5y=6 has two indeterminate variables, x and y. We can potentially have many indeterminate variables in an equation. Moreover, in many problems, we will have a certain type of variable whose value can vary but is not the unknown that we are looking to solve for. We define this type of variable as well.

::: definition Definition 169. Coefficient

A variable which can vary but is not the variable that is being solved for is called a coefficient, or a parameter of the equation. :::

So, let's start simply and consider the simplest equation possible with one unknown variable and two coefficients.

$$\begin{equation*} x+a=b \end{equation*}$$

This is simple to solve for the unknown x, simply take a from both sides to give x=b-a. So for example if we let a,b\in\mathbb{Z} say with a=5 and b=3, then we see that x\in\mathbb{Z} with x=3-5=-2. This is also true if we take a,b\in\mathbb{Q}. A more complex form of the above equation is

$$\begin{equation*} ax+b=c \end{equation*}$$

Now we hit a problem if we are looking for a solution x\in\mathbb{Z}. Firstly, we have that ax=c-b, but then, for a\neq 0, a solution x\in\mathbb{Z} can occur if and only if a\mid\left(c-b\right). If we look for a solution where x\in\mathbb{Q} then no such problem occurs. Therefore, the set that we are looking for solutions in is crucial in solving equations. With our current theory, the situation gets more hopeless the more complicated the equation becomes. For example, if we consider the equation

$$\begin{equation*} 4x^2+2x+3=0 \end{equation*}$$

Does this equation have solutions in \mathbb{Z}? How about \mathbb{Q}? Additionally, what happens if we have more than one equation or more than one unknown? For example, consider the two equations given by

$$\begin{align*} 4x+2y&=6\\ -2x+5y&=7 \end{align*}$$

How do we solve equations like this? This section aims to answer questions like these. We make a final definition, a special case for when we only seek integer solutions.

::: definition Definition 170. Diophantine equation

An equation for which the solutions have to be integers is called a Diophantine equation15 . :::

Linear Diophantine equations

Linear equations with two variables

We start where the previous section left off, by looking at the simplest type of equation that can be solved.

::: definition Definition 171. Linear equation of a single indeterminate variable

Let S be a set. We say an equation is a linear equation in a single variable x if it has the form

$$\begin{equation} ax+b=c \end{equation*}$$*

for some coefficients a,b,c\in S and an indeterminate variable x. In particular as this equation only has one indeterminate variable we say it is a single-variable linear equation. :::

We have already seen that solutions to this equation exist in \mathbb{Z} if and only if a\mid\left(c-b\right), and a solution always exists if we want x\in\mathbb{Q}. Things are a bit more interesting if we introduce a second variable.

::: definition Definition 172. Linear equation of two indeterminate variables

Let S be a set. We say an equation is a linear equation in two variables x,y if it has the form

$$\begin{equation} ax+by=c \end{equation*}$$*

for some coefficients a,b,c\in S and indeterminate variables x and y. :::

We have seen this type of equation before, in Bézout's Identity (Theorem 36{reference-type="ref" reference="thm:NT_bezout_id"}). In Bézout's Identity, we have that the greatest common divisor, d, of two integers a,b can be expressed as

$$\begin{equation*} ax+by=d \end{equation*}$$

for some x,y\in\mathbb{Z}. This gives us examples of already solved equations, but what about the other way? Given an equation of the form

$$\begin{equation*} ax+by=c \end{equation*}$$

with a,b,c\in\mathbb{Z} given, can we find integer values for x and y? That is, we are considering ax+by=c to be a Diophantine equation. If the reader is sufficiently alert, they will notice that by mentioning Bézout's Identity we are hinting that it will be crucial to finding the solutions.

We know of one case, namely if \mathop{\mathrm{GCD}}\left(a,b\right)=d and c=d then a solution is found by the Euclidean algorithm. Now if c were a multiple of d can we find solutions? Recall proposition 108{reference-type="ref" reference="prop:NT_GCD_properties"} part 4. We have that d=\mathop{\mathrm{GCD}}\left(a,b\right) is the smallest positive integer that can be written in the form ax+by. Given that d is the smallest such integer, we can show that the other integers of this form are exactly the multiples of d.

::: {#prop:NT_bezout_extension .proposition} Proposition 129. Integer has form ax+by if it is a multiple of the greatest common divisor of a and $b$

Let a,b\in\mathbb{Z} and d=\mathop{\mathrm{GCD}}\left(a,b\right). Let c\in\mathbb{Z}. We have that

$$\begin{equation} c=ax+by \end{equation*}$$*

for some x,y\in\mathbb{Z} if and only if d\mid c, which is to say c is a multiple of $d$.

Proof:

\left(\Rightarrow\right): Clearly if c=ax+by then as d=\mathop{\mathrm{GCD}}\left(a,b\right) we have by proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"} part 3 that d\mid c.

\left(\Leftarrow\right): Suppose that c=de for some e\in\mathbb{Z}. By Bézout's Identity, we have that \exists u,v\in\mathbb{Z} so that

$$\begin{equation} d=au+bv \end{equation*}$$*

where d=\mathop{\mathrm{GCD}}\left(a,b\right). Multiplying both sides by e we get

$$\begin{equation} c=aue+bve=ax+by \end{equation*}$$*

Hence x=ue and y=ve.

As required. $\qed$ :::
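To build some confidence in the statement, the following sketch compares, for a=6 and b=10, the integers of the form ax+by found by exhaustive search over a small box of x and y with the multiples of d=\mathop{\mathrm{GCD}}\left(6,10\right)=2. The function name `representable`, the search bound and the window are our own illustrative choices, and the search is not the constructive method used in the proof.

```python
from math import gcd


def representable(a: int, b: int, bound: int) -> set[int]:
    """All values a*x + b*y with -bound <= x, y <= bound."""
    r = range(-bound, bound + 1)
    return {a * x + b * y for x in r for y in r}


a, b = 6, 10
d = gcd(a, b)
window = range(-20, 21)

values = representable(a, b, 10)
hits = [c for c in window if c in values]
multiples = [c for c in window if c % d == 0]

print(hits == multiples)  # True
```

Within this window the two lists coincide: every multiple of 2 is reached and no odd integer is.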

Armed with this proposition we can find the solutions to the Diophantine equation ax+by=c.

::: {#prop:NT_solutions_to_two_var_linear_diophantine_equation .proposition} Proposition 130. Solutions to the Diophantine equation $ax+by=c$

Let a,b,c\in\mathbb{Z} be such that

$$\begin{equation} ax+by=c \end{equation*}$$*

for the indeterminate variables x,y and let d=\mathop{\mathrm{GCD}}\left(a,b\right). We have that there are solutions so that x,y\in\mathbb{Z} if and only if d\mid c.

Moreover, there are infinitely many solutions where the solutions are given by

$$\begin{align*} x&=x_0+\frac{bn}{d}\\ y&=y_0-\frac{an}{d} \end{align*}$$

where x_0,y_0\in\mathbb{Z} is one solution and n\in\mathbb{Z}.

Proof:

The existence of a solution is given by proposition 129{reference-type="ref" reference="prop:NT_bezout_extension"}. It is left to show that the suggested solutions x,y are solutions and that there are infinitely many solutions. This follows the argument in example 118{reference-type="ref" reference="exam:NT_solutions_to_ax_plus_by"}. We give the argument again to refresh the reader's memory.

Let x_0,y_0\in\mathbb{Z} be a solution, then we have that

$$\begin{equation} ax_0+by_0=c \end{equation*}$$*

For any n\in\mathbb{Z} let

$$\begin{align*} x&=x_0+\frac{bn}{d}\\ y&=y_0-\frac{an}{d} \end{align*}$$

We then have that \displaystyle\frac{bn}{d}\in\mathbb{Z} as d\mid b by definition of the greatest common divisor, likewise for \displaystyle\frac{an}{d} as d\mid a. Hence, we have that

$$\begin{align*} ax+by&=a\left(x_0+\frac{bn}{d}\right)+b\left(y_0-\frac{an}{d}\right)\\ &=ax_0+a\frac{bn}{d}+by_0-b\frac{an}{d}\\ &=ax_0+\frac{abn}{d}+by_0-\frac{abn}{d}\\ &=ax_0+by_0=c \end{align*}$$

Hence x,y is a solution. Moreover, as n\in\mathbb{Z} is any integer we have shown that there are infinitely many solutions. It is left to show that these are the only solutions.

Let x,y\in\mathbb{Z} be any solution to ax+by=c, and let x_0,y_0\in\mathbb{Z} be a particular solution. Hence

$$\begin{equation*} ax+by=ax_0+by_0 \end{equation*}$$

Subtracting ax_0+by_0 from both sides gives

$$\begin{align*} ax+by-ax_0-by_0&=0\\ a\left(x-x_0\right)+b\left(y-y_0\right)&=0 \end{align*}$$

Now, as d=\mathop{\mathrm{GCD}}\left(a,b\right) then we have that d\mid a and d\mid b so that

$$\begin{align*} \frac{a}{d}\left(x-x_0\right)+\frac{b}{d}\left(y-y_0\right)&=0\\ \frac{a}{d}\left(x-x_0\right)&=-\frac{b}{d}\left(y-y_0\right) \end{align*}$$

If a=b=0, we are done so suppose not. Then one of a or b is non-zero. Without loss of generality, suppose that a\neq 0. By proposition 108{reference-type="ref" reference="prop:NT_GCD_properties"} we have that if \mathop{\mathrm{GCD}}\left(a,b\right)=d then \displaystyle\mathop{\mathrm{GCD}}\left(\frac{a}{d},\frac{b}{d}\right)=1, which by definition of co-prime integers means that \displaystyle\frac{a}{d} and \displaystyle\frac{b}{d} are co-prime.

By Euclid's lemma for co-primes (lemma 11{reference-type="ref" reference="lem:NT_Euclid_co_primes"}) we have that \displaystyle\frac{a}{d} \mid-\left(y-y_0\right). Hence there is some n\in\mathbb{Z} so that

$$\begin{equation} -\left(y-y_0\right)=n\frac{a}{d} \end{equation*}$$*

Which is to say

$$\begin{equation} y=y_0-\frac{an}{d} \end{equation*}$$*

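Substituting this into \displaystyle\frac{a}{d}\left(x-x_0\right)=-\frac{b}{d}\left(y-y_0\right) and cancelling the non-zero factor \displaystyle\frac{a}{d} gives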

$$\begin{equation} x=x_0+\frac{bn}{d} \end{equation*}$$*

As required. $\qed$ :::
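The family of solutions is easy to generate once one solution is known. The sketch below assumes a particular solution is supplied by the caller; the equation 6x+10y=8 with x_0=3, y_0=-1 is our own example, and `solution_family` is a hypothetical helper name.

```python
from math import gcd


def solution_family(a, b, x0, y0, ns):
    """Return (x0 + (b/d)n, y0 - (a/d)n) for each n in ns, where d = GCD(a, b)."""
    d = gcd(a, b)
    if d == 0:
        raise ValueError("a and b must not both be zero")
    # d divides both a and b, so the integer divisions below are exact.
    return [(x0 + (b // d) * n, y0 - (a // d) * n) for n in ns]


a, b, c = 6, 10, 8
x0, y0 = 3, -1  # particular solution: 6*3 + 10*(-1) = 8
family = solution_family(a, b, x0, y0, range(-2, 3))
print(family)  # [(-7, 5), (-2, 2), (3, -1), (8, -4), (13, -7)]
print(all(a * x + b * y == c for x, y in family))  # True
```

Here d=2, so consecutive solutions differ by 5 in x and by 3 in y.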

Linear equations with more than two variables

A natural question to ask now is what happens when we have more than two indeterminate variables? For example ax+by+cz=e? We can take some inspiration from the two variable case.

Recall that for ax+by=c with d=\mathop{\mathrm{GCD}}\left(a,b\right) that there are solutions with x,y\in\mathbb{Z} if and only if d\mid c. More importantly, we have that if d=\mathop{\mathrm{GCD}}\left(a,b\right) then we can express d by d=ax+by for some x,y\in\mathbb{Z} by Bézout's Identity. Moreover by proposition 103{reference-type="ref" reference="prop:NT_Divisor_dividing_all_in_set_divides_linear_combination"} we have that for a set of n integers S=\left\{b_1,b_2,b_3,\dots,b_n\right\}, if a\mid b_i for each b_i\in S then, for any integers x_1,x_2,\dots,x_n,

$$\begin{equation*} a\mid\sum_{i=1}^n b_i x_i \end{equation*}$$

This hints at an extension to Bézout's Identity, given a suitable extension to the definition of the greatest common divisor for more than two inputs. Hence, our goal is to build this suitable extension to the greatest common divisor. We will start by looking at some exploratory examples before moving on with the generalisation.

::: example Example 136. Let a=2, b=4 and c=6. What is \mathop{\mathrm{GCD}}\left(a,b,c\right)? Clearly, by inspection, we have that 2 is the largest divisor of a,b and c. In particular we have that \mathop{\mathrm{GCD}}\left(2,4\right)=2 and \mathop{\mathrm{GCD}}\left(2,6\right)=2. In other words, we have that

$$\begin{equation} \mathop{\mathrm{GCD}}\left(2,4,6\right)=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(2,4\right),6\right) \end{equation*}$$*

Equivalently, we could have first considered \mathop{\mathrm{GCD}}\left(4,6\right)=2 and then \mathop{\mathrm{GCD}}\left(2,2\right)=2 so we have

$$\begin{equation} \mathop{\mathrm{GCD}}\left(2,4,6\right)=\mathop{\mathrm{GCD}}\left(2,\mathop{\mathrm{GCD}}\left(4,6\right)\right) \end{equation*}$$* :::

::: example Example 137. Let a=3, b=6 and c=30. What is \mathop{\mathrm{GCD}}\left(a,b,c\right)? Breaking this problem down we have that \mathop{\mathrm{GCD}}\left(3,6\right)=3, \mathop{\mathrm{GCD}}\left(3,30\right)=3 and \mathop{\mathrm{GCD}}\left(6,30\right)=6. As the greatest common divisor must divide all of the numbers we must conclude that \mathop{\mathrm{GCD}}\left(3,6,30\right)=3. :::

::: example Example 138. Let a=3, b=5 and c=7. As a,b and c are all prime we clearly see that $\mathop{\mathrm{GCD}}\left(a,b,c\right)=1$ :::

::: example Example 139. Let a=14, b=35, c=7 and d=5. We again break this down. We see that

$$\begin{align*} \mathop{\mathrm{GCD}}\left(14,35\right)&=7\\ \mathop{\mathrm{GCD}}\left(14,7\right)&=7\\ \mathop{\mathrm{GCD}}\left(14,5\right)&=1\\ \mathop{\mathrm{GCD}}\left(35,7\right)&=7\\ \mathop{\mathrm{GCD}}\left(35,5\right)&=5\\ \mathop{\mathrm{GCD}}\left(7,5\right)&=1 \end{align*}$$

Again the greatest common divisor must divide all of the inputs a,b,c and d, and so it must divide each of the pairwise values above. The smallest of these values is 1 so \mathop{\mathrm{GCD}}\left(14,35,7,5\right)=1. :::

In these examples, we made use of the fact that the greatest common divisor of several numbers must divide each of the inputs, and hence must divide the greatest common divisor of every pair of inputs, so it cannot exceed the smallest pairwise value. We then looked at all of the possible pairs of inputs and, in each example, the smallest pairwise value that occurred also divided every input, so it was the greatest common divisor. This is consistent with the two variable version of the \mathop{\mathrm{GCD}} that we have already developed. This was shown explicitly in the first example with

$$\begin{equation*} \mathop{\mathrm{GCD}}\left(2,4,6\right)=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(2,4\right),6\right)=\mathop{\mathrm{GCD}}\left(2,\mathop{\mathrm{GCD}}\left(4,6\right)\right) \end{equation*}$$

Hence an immediate property that we can deduce is that the \mathop{\mathrm{GCD}} is associative, in the sense that computing the \mathop{\mathrm{GCD}} of three numbers is equivalent to computing the \mathop{\mathrm{GCD}} of two of the inputs with the remaining input.

::: proposition Proposition 131. \mathop{\mathrm{GCD}} is associative

Let a,b,c\in\mathbb{Z}. We have that

$$\begin{equation} \mathop{\mathrm{GCD}}\left(a,\mathop{\mathrm{GCD}}\left(b,c\right)\right)=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a,b\right),c\right) \end{equation*}$$*

Proof:

Let x=\mathop{\mathrm{GCD}}\left(a,\mathop{\mathrm{GCD}}\left(b,c\right)\right) and y=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a,b\right),c\right). We need to show that x\mid y and y\mid x, from which we can conclude that x=y.

As x=\mathop{\mathrm{GCD}}\left(a,\mathop{\mathrm{GCD}}\left(b,c\right)\right) then by definition of the greatest common divisor, we have that x\mid a and x\mid\mathop{\mathrm{GCD}}\left(b,c\right). Moreover as x\mid\mathop{\mathrm{GCD}}\left(b,c\right) then again by definition of the greatest common divisor we have that x\mid b and x\mid c.

As x\mid a and x\mid b then x\mid\mathop{\mathrm{GCD}}\left(a,b\right) and likewise x\mid c so x\mid\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a,b\right),c\right) by definition and so x\mid y. The proof that y\mid x is similar.

As x\mid y and y\mid x and x>0 and y>0 we conclude that x=y as required. $\qed$ :::
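
The proposition can also be sanity-checked numerically. The sketch below is only an illustration and proves nothing: it assumes Python's standard `math.gcd` as the two-input greatest common divisor and compares both bracketings over a small, arbitrarily chosen range of integers.

```python
from itertools import product
from math import gcd

# Exhaustively compare both bracketings on a small range of integers.
values = range(-6, 7)
for a, b, c in product(values, repeat=3):
    assert gcd(a, gcd(b, c)) == gcd(gcd(a, b), c)

# The two bracketings from the first exploratory example.
left = gcd(2, gcd(4, 6))
right = gcd(gcd(2, 4), 6)
```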

To extend our definition of the greatest common divisor to more than two inputs, we will use the definition of the \mathop{\mathrm{GCD}} given by the decomposition of primes. That is to say, given a,b\in\mathbb{Z}, we know that there exists a set of primes

$$\begin{equation*} T=\left\{t_1,t_2,\dots,t_v\right\} \end{equation*}$$

So that a and b can be represented by a prime factorisation of primes t_i\in T. That is

$$\begin{align*} a&=\prod_{i=1}^v t_i^{e_i}\\ b&=\prod_{i=1}^v t_i^{f_i} \end{align*}$$

We then have that the greatest common divisor is given by

$$\begin{equation*} \mathop{\mathrm{GCD}}\left(a,b\right)=t_1^{\min\left(e_1,f_1\right)}t_2^{\min\left(e_2,f_2\right)}t_3^{\min\left(e_3,f_3\right)}\dots t_v^{\min\left(e_v,f_v\right)} \end{equation*}$$

Firstly, we will extend the result of proposition 115{reference-type="ref" reference="prop:NT_express_primes_in_common_basis"} to the case of n integers, the proof is similar to proposition 115{reference-type="ref" reference="prop:NT_express_primes_in_common_basis"}.

::: {#prop:NT_General_express_primes_in_common_basis .proposition} Proposition 132. Expression of set of integers as powers of same primes

Let S=\left\{a_1,a_2,a_3,\dots,a_n\right\} be such that a_i\in\mathbb{Z} and a_i>2 for 1\leq i\leq n. For each a_i let its prime factorisation be denoted by

$$\begin{equation*} \mathlarger{a_i=\prod_{\substack{p_{\left(i,k\right)}\mid a_i \\ p_{\left(i,k\right)}\text{ is prime}}} p_{\left(i,k\right)}^{e_{\left(i,k\right)}}} \end{equation*}$$

where \left(i,k\right) is an index tuple with i denoting the integer a_i and k denoting the $k$-th element of $a_i$'s prime factorisation. Then there exists a set of primes

$$\begin{equation*} T=\left\{t_1,t_2,t_3,\dots,t_v\right\} \end{equation*}$$

with t_1<t_2<t_3<\dots <t_v so that

$$\begin{equation} \mathlarger{a_i=\prod_{j=1}^v t_{j}^{f_{\left(i,j\right)}}} \end{equation*}$$*

for each 1\leq i\leq n.

Proof:

Let each a_i be as given. That is,

$$\begin{equation*} \mathlarger{a_i=\prod_{\substack{p_{\left(i,k\right)}\mid a_i \\ p_{\left(i,k\right)}\text{ is prime}}} p_{\left(i,k\right)}^{e_{\left(i,k\right)}}} \end{equation*}$$

Let A_i=\left\{p_{\left(i,k\right)} : p_{\left(i,k\right)} \text{ appears in the prime factorisation of } a_i\right\}, that is each A_i denotes the set of the prime factors that appear in a_i. We can therefore take T to be

$$\begin{equation} T=\bigcup_{i=1}^n A_i \end{equation*}$$*

so that

$$\begin{equation*} T=\left\{t_1,t_2,t_3,\dots,t_v\right\} \end{equation*}$$

where \displaystyle v\leq \sum_{i=1}^n \left|A_i\right|. It is now left to show that we can pick the primes in the factorisations of the a_i from T. Define the mapping \iota_{A_i} by

$$\begin{align*} \iota_{A_i}:A_i&\rightarrow T\\ x&\mapsto\iota_{A_i}\left(x\right)=x \end{align*}$$

We have that \iota_{A_i} maps the elements of A_i to the same element in T. Therefore, we have for some a_i that

$$\begin{align*} a_i&=\prod_{j=1}^k p_{\left(i,j\right)}^{e_{\left(i,j\right)}}\\ &=\prod_{j=1}^k \iota_{A_i}\left(p_{\left(i,j\right)}\right)^{e_{\left(i,j\right)}}\\ &=\prod_{p_{\left(i,j\right)}\in A_i} p_{\left(i,j\right)}^{e_{\left(i,j\right)}}\\ &=\prod_{p_{\left(i,j\right)}\in A_i} p_{\left(i,j\right)}^{e_{\left(i,j\right)}}\prod_{t_l\in T\setminus A_i} t_l^0\\ &=\prod_{t_l\in T} t_l^{g_l}, \text{ where } g_l =\begin{cases} e_{\left(i,j\right)},\ &\text{If } t_l=p_{\left(i,j\right)}\\ 0, &\text{If } t_l\not\in A_i \end{cases}\\ &=t_1^{g_1}t_2^{g_2}t_3^{g_3}\dots t_v^{g_v} \end{align*}$$

Which expresses a_i in terms of the primes in T as required. $\qed$ :::
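
To make the construction of T concrete, here is a minimal Python sketch under stated assumptions: the function names `prime_factorisation` and `common_basis` are hypothetical, factorisation is done by simple trial division, and every input is assumed to be an integer of at least 2. It forms T as the union of the prime factors and pads each factorisation with zero exponents, mirroring the proof.

```python
def prime_factorisation(n):
    """Return {prime: exponent} for an integer n >= 2 by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:  # whatever remains is itself prime
        factors[n] = factors.get(n, 0) + 1
    return factors


def common_basis(integers):
    """Return (T, exponents) for a list of integers >= 2.

    T is the sorted union of all prime factors, and exponents[i][j] is the
    power of T[j] in integers[i], taken to be 0 when T[j] does not appear.
    """
    factorisations = [prime_factorisation(a) for a in integers]
    T = sorted(set().union(*factorisations))
    exponents = [[f.get(t, 0) for t in T] for f in factorisations]
    return T, exponents
```

For instance, `common_basis([12, 45, 14])` gives T as `[2, 3, 5, 7]` with exponent rows `[2, 1, 0, 0]`, `[0, 2, 1, 0]` and `[1, 0, 0, 1]`.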

The final ingredient required before we can extend the \mathop{\mathrm{GCD}} is to extend the minimum function to multiple inputs. This is a straightforward extension.

::: definition Definition 173. General minimum function for integers

Let S=\left(a_1,a_2,a_3,\dots,a_n\right)\in\mathbb{Z}^n be an $n$-tuple of integers. We define the minimum function on S by

$$\begin{align*} \min:\mathbb{Z}^n&\rightarrow\mathbb{Z}\\ S&\mapsto\min\left(S\right)=\begin{cases} a_1,\ &\text{If } n=1\\ \min\left(a_1,a_2\right),\ &\text{If } n=2\\ \min\left(\min\left(a_1,a_2,a_3,\dots,a_{n-1}\right),a_n\right),\ &\text{If } n\geq 3 \end{cases} \end{align*}$$ :::

We need to show that this is well-defined.

::: {#prop:NT_general_min_on_integers_is_well_defined .proposition} Proposition 133. General minimum function for the integers is well-defined

Let S=\left(a_1,a_2,a_3,\dots,a_n\right)\in\mathbb{Z}^n be an $n$-tuple of integers. We have that \min\left(S\right) is well-defined.

Proof:

We argue by induction on n. The base case is n=1 for which the result is trivial, likewise the case n=2 is trivial. So suppose the result holds for some k\geq 2, then we have that

$$\begin{equation} \min\left(\min\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_k\right) \end{equation*}$$*

is well-defined. We show that

$$\begin{equation} \min\left(\min\left(a_1,a_2,a_3,\dots,a_{k}\right),a_{k+1}\right) \end{equation*}$$*

is well-defined. Evaluating the inner \min\left(a_1,a_2,a_3,\dots,a_{k}\right) we have by definition that

$$\begin{equation} \min\left(a_1,a_2,a_3,\dots,a_{k}\right)=\min\left(\min\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_k\right) \end{equation*}$$*

Which by hypothesis is well-defined. Hence \min\left(a_1,a_2,a_3,\dots,a_{k}\right)=m for some m\in\mathbb{Z}. Hence we have that

$$\begin{equation} \min\left(\min\left(a_1,a_2,a_3,\dots,a_{k}\right),a_{k+1}\right)=\min\left(m,a_{k+1}\right) \end{equation*}$$*

Which is well-defined. Hence by induction, we have that the general minimum function on the integers is well-defined. $\qed$ :::

We also have the following proposition.

::: {#prop:NT_general_min_function_on_integers_is_associative .proposition} Proposition 134. The general minimum function is associative

Let S=\left(a_1,a_2,a_3,\dots,a_n\right)\in\mathbb{Z}^n be an $n$-tuple of integers. We have that

$$\begin{equation} \min\left(\min\left(a_1,a_2,a_3,\dots,a_{n-1}\right),a_n\right)=\min\left(a_1,\min\left(a_2,a_3,\dots,a_{n-1},a_n\right)\right) \end{equation*}$$*

Proof:

We argue by induction on n. The case n=1 has nothing to prove. Likewise for n=2, so we shall show it holds for n=3. That is

$$\begin{equation} \min\left(\min\left(a_1,a_2\right),a_3\right)=\min\left(a_1,\min\left(a_2,a_3\right)\right) \end{equation*}$$*

There are 6 cases to consider.

  1. $a_1\leq a_2\leq a_3$

  2. $a_1\leq a_3\leq a_2$

  3. $a_2\leq a_1\leq a_3$

  4. $a_2\leq a_3\leq a_1$

  5. $a_3\leq a_1\leq a_2$

  6. $a_3\leq a_2\leq a_1$

  1. a_1\leq a_2\leq a_3:

    We have that

    $$\begin{align*} \min\left(\min\left(a_1,a_2\right),a_3\right)&=\min\left(a_1,a_3\right)=a_1\\ \min\left(a_1,\min\left(a_2,a_3\right)\right)&=\min\left(a_1,a_2\right)=a_1 \end{align*}$$

  2. a_1\leq a_3\leq a_2:

    $$\begin{align*} \min\left(\min\left(a_1,a_2\right),a_3\right)&=\min\left(a_1,a_3\right)=a_1\\ \min\left(a_1,\min\left(a_2,a_3\right)\right)&=\min\left(a_1,a_3\right)=a_1 \end{align*}$$

  3. a_2\leq a_1\leq a_3:

    $$\begin{align*} \min\left(\min\left(a_1,a_2\right),a_3\right)&=\min\left(a_2,a_3\right)=a_2\\ \min\left(a_1,\min\left(a_2,a_3\right)\right)&=\min\left(a_1,a_2\right)=a_2 \end{align*}$$

  4. a_2\leq a_3\leq a_1:

    $$\begin{align*} \min\left(\min\left(a_1,a_2\right),a_3\right)&=\min\left(a_2,a_3\right)=a_2\\ \min\left(a_1,\min\left(a_2,a_3\right)\right)&=\min\left(a_1,a_2\right)=a_2 \end{align*}$$

  5. a_3\leq a_1\leq a_2:

    $$\begin{align*} \min\left(\min\left(a_1,a_2\right),a_3\right)&=\min\left(a_1,a_3\right)=a_3\\ \min\left(a_1,\min\left(a_2,a_3\right)\right)&=\min\left(a_1,a_3\right)=a_3 \end{align*}$$

  6. a_3\leq a_2\leq a_1:

    $$\begin{align*} \min\left(\min\left(a_1,a_2\right),a_3\right)&=\min\left(a_2,a_3\right)=a_3\\ \min\left(a_1,\min\left(a_2,a_3\right)\right)&=\min\left(a_1,a_3\right)=a_3 \end{align*}$$

Hence the base case is shown. Now suppose that the proposition holds for some k\geq 3, that is

$$\begin{equation*} \min\left(\min\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_k\right)=\min\left(a_1,\min\left(a_2,a_3,\dots,a_{k-1},a_k\right)\right) \end{equation*}$$

we show that it holds for k+1, i.e.

$$\begin{equation} \min\left(\min\left(a_1,a_2,a_3,\dots,a_{k}\right),a_{k+1}\right)=\min\left(a_1,\min\left(a_2,a_3,\dots,a_{k},a_{k+1}\right)\right) \end{equation*}$$*

We have by evaluating the inner minimum of the left-hand side we get

$$\begin{equation} \min\left(a_1,a_2,a_3,\dots,a_{k}\right)=\min\left(\min\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_{k}\right) \end{equation*}$$*

And so by the induction hypothesis, we have that

$$\begin{align*} \min\left(\min\left(a_1,a_2,a_3,\dots,a_{k}\right),a_{k+1}\right)&=\min\left(\min\left(\min\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_{k}\right),a_{k+1}\right)\\ &=\min\left(\min\left(a_1,\min\left(a_2,a_3,\dots,a_{k-1},a_{k}\right)\right),a_{k+1}\right),\ \text{Induction hypothesis} \end{align*}$$

As \min\left(a_2,a_3,\dots,a_{k-1},a_{k}\right) is well-defined by proposition 133{reference-type="ref" reference="prop:NT_general_min_on_integers_is_well_defined"} then \min\left(a_2,a_3,\dots,a_{k-1},a_{k}\right)=M say where M\in\mathbb{Z}. Therefore, writing M in place of \min\left(a_2,a_3,\dots,a_{k-1},a_{k}\right) for ease of reading and applying the case n=3 to a_1, M and a_{k+1}, we have

$$\begin{align*} \min\left(\min\left(a_1,\min\left(a_2,a_3,\dots,a_{k-1},a_{k}\right)\right),a_{k+1}\right)&=\min\left(\min\left(a_1,M\right),a_{k+1}\right)\\ &=\min\left(a_1,\min\left(M,a_{k+1}\right)\right)\\ &=\min\left(a_1,\min\left(\min\left(a_2,a_3,\dots, a_{k-1},a_{k}\right),a_{k+1}\right)\right)\\ &=\min\left(a_1,\min\left(a_2,a_3,\dots,a_{k},a_{k+1}\right)\right) \end{align*}$$

The result now follows by induction. $\qed$ :::

Proposition 134{reference-type="ref" reference="prop:NT_general_min_function_on_integers_is_associative"} is a useful proposition: it allows us to discard the cumbersome notation of the definition of the general minimum function on the integers. That is to say, we can now simply, and more easily, write

$$\begin{equation*} \min\left(a_1,a_2,a_3,\dots,a_n\right) \end{equation*}$$
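
The recursive definition and the associativity result can be mirrored directly in code. The following Python sketch is illustrative only; the names `general_min` and `general_min_right` are hypothetical. The first function follows the three cases of the definition, while the second nests the minimum on the right, so comparing the two on sample tuples checks the statement of the proposition on those inputs.

```python
def general_min(values):
    """Minimum of a non-empty tuple, following the recursive definition."""
    n = len(values)
    if n == 1:
        return values[0]
    if n == 2:
        a, b = values
        return a if a <= b else b
    # n >= 3: reduce to the first n - 1 entries, then compare with the last
    return general_min((general_min(values[:-1]), values[-1]))


def general_min_right(values):
    """Right-nested form: min(a_1, min(a_2, ..., a_n))."""
    if len(values) == 1:
        return values[0]
    return general_min((values[0], general_min_right(values[1:])))
```

Both functions agree with each other, and with Python's built-in `min`, on any non-empty tuple of integers that we try, as the proposition predicts.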

For convenience, we also define the minimum function for a subset of n integers.

::: definition Definition 174. General minimum function for a subset of integers

Let A=\left\{a_1,a_2,a_3,\dots,a_n\right\}\subset\mathbb{Z} be a subset of n integers. Let S=\left(a_1,a_2,a_3,\dots,a_n\right)\in A^n. We define the minimum of the set of integers A by

$$\begin{equation} \min\left(A\right)=\min\left(S\right)=\min\left(a_1,a_2,a_3,\dots,a_n\right) \end{equation*}$$*

That is, we simply take the element of A^n which corresponds to the set. :::

::: example Example 140. Let A=\left\{2,3\right\}. We have that

$$\begin{equation*} A^2=\left\{\left(2,2\right), \left(2,3\right), \left(3,2\right),\left(3,3\right)\right\} \end{equation*}$$

We have that S=\left(2,3\right)\in A^2 and

$$\begin{equation} \min\left(A\right)=\min\left(S\right)=\min\left(2,3\right)=2 \end{equation*}$$* :::

We have all the ingredients required to extend the \mathop{\mathrm{GCD}} function. We use a method similar to how we extended the minimum function.

::: definition Definition 175. Generalised greatest common divisor

Let S=\left(a_1,a_2,a_3,\dots,a_n\right)\in\mathbb{Z}^n be an $n$-tuple of integers. We define the greatest common divisor function on S by

$$\begin{align*} \mathop{\mathrm{GCD}}:\mathbb{Z}^n&\rightarrow\mathbb{Z}\\ S&\mapsto\mathop{\mathrm{GCD}}\left(S\right)=\begin{cases} a_1,\ &\text{If } n=1\\ \mathop{\mathrm{GCD}}\left(a_1,a_2\right),\ &\text{If } n=2\\ \mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{n-1}\right),a_n\right),\ &\text{If } n\geq 3 \end{cases} \end{align*}$$ :::

We show that this is well-defined.

::: {#prop:NT_general_gcd_on_integers_is_well_defined .proposition} Proposition 135. Generalised greatest common divisor function for the integers is well-defined

Let S=\left(a_1,a_2,a_3,\dots,a_n\right)\in\mathbb{Z}^n be an $n$-tuple of integers. We have that \mathop{\mathrm{GCD}}\left(S\right) is well-defined.

Proof:

The argument is by induction on n. The case n=1 is immediate from the definition, and the case n=2 is well-defined by theorem 32{reference-type="ref" reference="thm:NT_gcd_exists"}. Now suppose the result is true for some k\geq 2, that is

$$\begin{equation} \mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_k\right) \end{equation*}$$*

is well-defined. We show that

$$\begin{equation} \mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k}\right),a_{k+1}\right) \end{equation*}$$*

is well-defined. Evaluating the inner \mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k}\right) we have by definition that

$$\begin{equation} \mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k}\right)=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_k\right) \end{equation*}$$*

Which by hypothesis is well-defined. Hence \mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k}\right)=d for some d\in\mathbb{Z}. Hence we have that

$$\begin{equation} \mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k}\right),a_{k+1}\right)=\mathop{\mathrm{GCD}}\left(d,a_{k+1}\right) \end{equation*}$$*

Which is well-defined. The result now follows by induction. $\qed$ :::

As with the minimum function, to avoid cumbersome notation we can show that the generalised greatest common divisor is associative.

::: {#prop:NT_general_gcd_on_integers_is_associative .proposition} Proposition 136. Generalised \mathop{\mathrm{GCD}} is associative

Let S=\left(a_1,a_2,a_3,\dots,a_n\right)\in\mathbb{Z}^n be an $n$-tuple of integers. We have that

$$\begin{equation} \mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{n-1}\right),a_n\right)=\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(a_2,a_3,\dots,a_{n-1},a_n\right)\right) \end{equation*}$$*

Proof:

We argue by induction on n. The cases of n=1 and n=2 are trivial, so we show it holds for n=3.

Let x=\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(a_2,a_3\right)\right) and y=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2\right),a_3\right). We need to show that x\mid y and y\mid x, from which we can conclude that x=y.

As x=\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(a_2,a_3\right)\right) then by definition of the greatest common divisor, we have that x\mid a_1 and x\mid\mathop{\mathrm{GCD}}\left(a_2,a_3\right). Moreover, as x\mid\mathop{\mathrm{GCD}}\left(a_2,a_3\right) then again by definition of the greatest common divisor we have that x\mid a_2 and x\mid a_3.

As x\mid a_1 and x\mid a_2 then x\mid\mathop{\mathrm{GCD}}\left(a_1,a_2\right) and likewise x\mid a_3 so x\mid\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2\right),a_3\right) by definition and so x\mid y. The proof that y\mid x is similar.

As x\mid y and y\mid x and x>0 and y>0 we conclude that x=y as required.

Now suppose the result is true for some k\geq 3. That is

$$\begin{equation} \mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_k\right)=\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(a_2,a_3,\dots,a_{k-1},a_k\right)\right) \end{equation*}$$*

we show that

$$\begin{equation} \mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k}\right),a_{k+1}\right)=\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(a_2,a_3,\dots,a_{k},a_{k+1}\right)\right) \end{equation*}$$*

Evaluation of the inner \mathop{\mathrm{GCD}} of the left-hand side yields

$$\begin{equation*} \mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k}\right)=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_{k}\right) \end{equation*}$$

So by the induction hypothesis, we have that

$$\begin{align*} \mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k}\right),a_{k+1}\right)&=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_{k}\right),a_{k+1}\right)\\ &=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(a_2,a_3,\dots,a_{k-1},a_{k}\right)\right),a_{k+1}\right),\ \text{By hypothesis} \end{align*}$$

As \mathop{\mathrm{GCD}}\left(a_2,a_3,\dots,a_{k-1},a_{k}\right) is well-defined by proposition 135{reference-type="ref" reference="prop:NT_general_gcd_on_integers_is_well_defined"}, we have \mathop{\mathrm{GCD}}\left(a_2,a_3,\dots,a_{k-1},a_{k}\right)=d with d\in\mathbb{Z}. Hence we have

$$\begin{align*} \mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(a_2,a_3,\dots,a_{k-1},a_{k}\right)\right),a_{k+1}\right)&=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,d\right),a_{k+1}\right)\\ &=\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(d,a_{k+1}\right)\right)\\ &=\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_2,a_3,\dots,a_{k-1},a_{k}\right),a_{k+1}\right)\right)\\ &=\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(a_2,a_3,\dots,a_{k},a_{k+1}\right)\right) \end{align*}$$

As required. $\qed$ :::

As with the minimum function, we can now simply write

$$\begin{equation*} \mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{n-1},a_n\right) \end{equation*}$$

Likewise for convenience, we define the \mathop{\mathrm{GCD}} function for a subset of n integers.

::: definition Definition 176. General greatest common divisor function for a subset of integers

Let A=\left\{a_1,a_2,a_3,\dots,a_n\right\}\subset\mathbb{Z} be a subset of n integers. Let S=\left(a_1,a_2,a_3,\dots,a_n\right)\in A^n. We define the \mathop{\mathrm{GCD}} of the set of integers A by

$$\begin{equation} \mathop{\mathrm{GCD}}\left(A\right)=\mathop{\mathrm{GCD}}\left(S\right)=\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_n\right) \end{equation*}$$*

That is, we simply take the element of A^n which corresponds to the set. :::

::: example Example 141. Let A=\left\{2,3\right\}. We have that

$$\begin{equation*} A^2=\left\{\left(2,2\right), \left(2,3\right), \left(3,2\right),\left(3,3\right)\right\} \end{equation*}$$

We have that S=\left(2,3\right)\in A^2 and

$$\begin{equation} \mathop{\mathrm{GCD}}\left(A\right)=\mathop{\mathrm{GCD}}\left(S\right)=\mathop{\mathrm{GCD}}\left(2,3\right)=1 \end{equation*}$$* :::
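
The generalised greatest common divisor can be sketched in Python as a fold of the two-input version over the tuple. This is an illustration under assumptions, not a definitive implementation: the names `general_gcd` and `general_gcd_right` are hypothetical, and Python's standard `math.gcd` stands in for the two-input greatest common divisor.

```python
from functools import reduce
from math import gcd


def general_gcd(values):
    """Left-nested form: GCD(GCD(...GCD(a_1, a_2)..., a_{n-1}), a_n).

    For a single input this returns a_1 itself, matching the definition.
    """
    return reduce(gcd, values)


def general_gcd_right(values):
    """Right-nested form: GCD(a_1, GCD(a_2, ..., a_n))."""
    values = list(values)
    if len(values) == 1:
        return values[0]
    return gcd(values[0], general_gcd_right(values[1:]))
```

On the inputs of the exploratory examples, `general_gcd((2, 4, 6))` is 2 and `general_gcd((14, 35, 7, 5))` is 1, and the right-nested version returns the same values.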

We can now finally generalise the computation of the greatest common divisor from the prime factorisation of the inputs.

::: {#prop:NT_general_gcd_can_be_computed_by_primes .proposition} Proposition 137. Generalised version of the greatest common divisor from prime factorisation

Let S=\left\{a_1,a_2,a_3,\dots,a_n\right\}\subset\mathbb{Z} be a set of integers so that at least one a_i\neq 0 for 1\leq i\leq n. By proposition 132{reference-type="ref" reference="prop:NT_General_express_primes_in_common_basis"}, we know that there exists a set of primes

$$\begin{equation*} T=\left\{t_1,t_2,t_3,\dots,t_v\right\} \end{equation*}$$

so that for each a_i we have prime factorisations given by

$$\begin{equation*} \mathlarger{a_i=\prod_{j=1}^v t_{j}^{f_{\left(i,j\right)}}} \end{equation*}$$

for 1\leq i\leq n. For each 1\leq j\leq v define the set

$$\begin{equation*} P_j=\left\{f_{\left(i,j\right)} : 1\leq i\leq n\right\} \end{equation*}$$

We have that the greatest common divisor \mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_n\right) is given by

$$\begin{equation} \mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_n\right)=t_1^{\min\left(P_1\right)}t_2^{\min\left(P_2\right)}t_3^{\min\left(P_3\right)}\dots t_v^{\min\left(P_v\right)} \end{equation*}$$*

Proof:

The proof is similar to that of proposition 116{reference-type="ref" reference="prop:NT_gcd_can_be_computed_by_primes"}. Let S=\left\{a_1,a_2,a_3,\dots,a_n\right\}\subset\mathbb{Z} be as given so by proposition 132{reference-type="ref" reference="prop:NT_General_express_primes_in_common_basis"} we have a set of primes

$$\begin{equation*} T=\left\{t_1,t_2,t_3,\dots,t_v\right\} \end{equation*}$$

so that for each a_i we have prime factorisations given by

$$\begin{equation} \mathlarger{a_i=\prod_{j=1}^v t_{j}^{f_{\left(i,j\right)}}} \end{equation*}$$*

Now, let d=\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_n\right) and let D = t_1^{\min\left(P_1\right)}t_2^{\min\left(P_2\right)}t_3^{\min\left(P_3\right)}\dots t_v^{\min\left(P_v\right)}, we show that d\leq D and D\leq d. Define \sigma_j=\min\left(\left\{f_{\left(i,j\right)}: 1\leq i\leq n\right\}\right) for 1\leq j\leq v.

  1. D\leq d:

    By the definition of the minimum, we have that \sigma_j\leq f_{\left(i,j\right)} for each 1\leq i\leq n. Hence, for each i and j there exists k_{\left(i,j\right)}\in\mathbb{Z} with k_{\left(i,j\right)}\geq 0 so that

    $$\begin{equation} f_{\left(i,j\right)} = \sigma_j + k_{\left(i,j\right)} \end{equation*}$$*

    So that a_i can be expressed as

    $$\begin{align*} a_i&=\prod_{j=1}^v t_j^{f_{\left(i,j\right)}}\\ &=\prod_{j=1}^v t_j^{\sigma_j+k_{\left(i,j\right)}}\\ &=\prod_{j=1}^v t_j^{\sigma_j} t_j^{k_{\left(i,j\right)}}\\ &=\prod_{j=1}^v t_j^{\sigma_j} \prod_{j=1}^vt_j^{k_{\left(i,j\right)}}\\ &= D\prod_{j=1}^vt_j^{k_{\left(i,j\right)}} \end{align*}$$

    As a_i was arbitrary this argument holds for each 1\leq i\leq n. Hence, we have that D\mid a_i for each i, so D is a common divisor of each a_i. We conclude that D\leq d.

  2. d\leq D:

    As D is a common divisor of each a_i and d is their greatest common divisor, we have that D\mid d, so \exists k\in\mathbb{Z} so that

    $$\begin{equation*} d=Dk \end{equation*}$$

    Now, k has a factorisation into primes by the fundamental theorem of arithmetic. Moreover, k could have primes in common with D, so we can take those primes that are in common with D and k and place them into the factorisation of D. That is

    $$\begin{align*} d&=Dk\\ d&=t_1^{\sigma_1}t_2^{\sigma_2}t_3^{\sigma_3}\dots t_v^{\sigma_v}k\\ d&=t_1^{\lambda_1}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}k' \end{align*}$$

    Where \lambda_j are the new values for each prime after extracting the primes in common with D and k into D. k' are the primes that are not in common. We need to show that

    1. $k'=1$

    2. \lambda_j\leq \sigma_j for all $1\leq j\leq v$

    1. k'=1:

      Suppose for a contradiction that k'\neq 1. As d>0 and D>0 then k>0 and so k'>0. Now as k'\neq 1 we have k'>1 and so by the fundamental theorem of arithmetic we have that k' has a factorisation into primes, say

      $$\begin{equation} k'=q_1^{r_1}q_2^{r_2}q_3^{r_3}\dots q_c^{r_c} \end{equation*}$$*

      Now, no q_l=t_j as k' has no primes in common with t_1^{\lambda_1}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}. Pick one of the primes in k', say q=q_l then q\mid d. Now as d=\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_n\right) we have d\mid a_i and so q\mid a_i for each a_i. This is a contradiction as then q is one of the primes t_j. We conclude that $k'=1$.

    2. \lambda_j\leq \sigma_j for all 1\leq j\leq v:

      Suppose for a contradiction that \lambda_j>\sigma_j for some 1\leq j\leq v. Without loss of generality, take j=1, for if not re-label the primes.

      By definition of \sigma_1, we have that \sigma_1=\min\left(\left\{f_{\left(i,1\right)}: 1\leq i\leq n\right\}\right), so \sigma_1=f_{\left(i,1\right)} for some i. Without loss of generality take i=1, as the cases for the other values of i are similar. We have that \sigma_1=f_{\left(1,1\right)} and so \lambda_1>f_{\left(1,1\right)}. As d is the greatest common divisor of the a_i we have d\mid a_1, so there is an s\in\mathbb{Z} so that ds=a_1 where s>0 as both a_1 and d are.

      Comparing the prime factorisations, we get that

      $$\begin{equation} st_1^{\lambda_1}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}=t_1^{f_{\left(1,1\right)}}t_2^{f_{\left(1,2\right)}}t_3^{f_{\left(1,3\right)}}\dots t_v^{f_{\left(1,v\right)}} \end{equation}$$*

      Dividing by \displaystyle t_1^{f_{\left(1,1\right)}} we get that

      $$\begin{equation} st_1^{\lambda_1-f_{\left(1,1\right)}}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}=t_1^{f_{\left(1,1\right)}-f_{\left(1,1\right)}}t_2^{f_{\left(1,2\right)}}t_3^{f_{\left(1,3\right)}}\dots t_v^{f_{\left(1,v\right)}} \end{equation}$$*

      Where clearly \displaystyle t_1^{f_{\left(1,1\right)}-f_{\left(1,1\right)}}=1. So this can be re-written as

      $$\begin{equation} st_1^{\lambda_1-f_{\left(1,1\right)}}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}=t_2^{f_{\left(1,2\right)}}t_3^{f_{\left(1,3\right)}}\dots t_v^{f_{\left(1,v\right)}} \end{equation}$$*

      As \lambda_1>f_{\left(1,1\right)} then \lambda_1-f_{\left(1,1\right)}>0 and so t_1 divides the left-hand side of the equation. As t_1 divides the left-hand side it must also divide the right-hand side, and therefore, by the fundamental theorem of arithmetic, it must appear in the factorisation on the right-hand side. It does not appear in the factorisation on the right-hand side, which is a contradiction. It follows that \lambda_j\leq\sigma_j for all $1\leq j\leq v$.

    Therefore we conclude that d\leq D.

    As d\leq D and D\leq d we have that d=D and the result is shown. $\qed$ :::

These last few results were somewhat technical. To show that our new generalised \mathop{\mathrm{GCD}} works we give an example.

::: example Example 142. We compute \mathop{\mathrm{GCD}}\left(54,78,35,144,50\right). By inspection of each of the numbers we have that

$$\begin{align*} 54&=2\cdot 3^3\\ 78&=2\cdot 3\cdot 13\\ 35&=5\cdot 7\\ 144&=2^4\cdot 3^2\\ 50&=2\cdot 5^2 \end{align*}$$

Hence, the set of primes T is given by

$$\begin{equation*} T=\left\{2,3,5,7,13\right\} \end{equation*}$$

Now, by the proposition, we know that

$$\begin{equation} \mathop{\mathrm{GCD}}\left(54,78,35,144,50\right)=t_1^{\min\left(P_1\right)}t_2^{\min\left(P_2\right)}t_3^{\min\left(P_3\right)}t_4^{\min\left(P_4\right)}t_5^{\min\left(P_5\right)} \end{equation*}$$*

Where P_j will be the powers of the prime t_j that appear in the factorisation of each of the inputs. Taking t_1=2, t_2=3, t_3=5, t_4=7 and t_5=13 we have

$$\begin{align*} P_1&=\left\{1,1,0,4,1\right\}=\left\{0,1,4\right\}\\ P_2&=\left\{3,1,0,2,0\right\}=\left\{0,1,2,3\right\}\\ P_3&=\left\{0,0,1,0,2\right\}=\left\{0,1,2\right\}\\ P_4&=\left\{0,0,1,0,0\right\}=\left\{0,1\right\}\\ P_5&=\left\{0,1,0,0,0\right\}=\left\{0,1\right\} \end{align*}$$

From which it is clear that the minimum of every P_j is 0. So that

$$\begin{equation} \mathop{\mathrm{GCD}}\left(54,78,35,144,50\right)=1 \end{equation*}$$* :::
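
The computation in this example can be reproduced with a short Python sketch. It is a sketch under assumptions rather than a definitive implementation: the names `prime_factorisation` and `gcd_from_primes` are hypothetical, trial division is used for the factorisations, and every input is assumed to be an integer of at least 2. The result is compared against a fold of Python's standard `math.gcd`.

```python
from functools import reduce
from math import gcd


def prime_factorisation(n):
    """Return {prime: exponent} for an integer n >= 2 by trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:  # whatever remains is itself prime
        factors[n] = factors.get(n, 0) + 1
    return factors


def gcd_from_primes(integers):
    """GCD of integers >= 2 as the product of t_j ** min(P_j)."""
    factorisations = [prime_factorisation(a) for a in integers]
    T = sorted(set().union(*factorisations))
    result = 1
    for t in T:
        P = {f.get(t, 0) for f in factorisations}  # the set P_j of exponents
        result *= t ** min(P)
    return result


inputs = [54, 78, 35, 144, 50]
by_primes = gcd_from_primes(inputs)
by_folding = reduce(gcd, inputs)
```

Both `by_primes` and `by_folding` evaluate to 1, in agreement with the example.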

With a generalised \mathop{\mathrm{GCD}} function, we can extend Bézout's Identity.

::: {#thm:NT_general_bezout_idenity .theorem} Theorem 44. Generalised Bézout's Identity

Let S=\left\{a_1,a_2,a_3,\dots,a_n\right\}\subset\mathbb{Z} be a set of integers so that at least one a_i\neq 0 for 1\leq i\leq n. Consider d=\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots ,a_n\right). Then, for 1\leq i\leq n, \exists x_i\in\mathbb{Z} so that

$$\begin{equation*} d=a_1x_1+a_2x_2+a_3x_3+\dots+a_nx_n=\sum_{i=1}^n a_ix_i \end{equation*}$$

Proof:

Let S be as given by the hypothesis and let d=\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots ,a_n\right). By definition, we have that as d\mid a_i for each 1\leq i\leq n then by proposition 103{reference-type="ref" reference="prop:NT_Divisor_dividing_all_in_set_divides_linear_combination"} we have that

$$\begin{equation*} d\mid\sum_{i=1}^n a_ix_i \end{equation*}$$

for any x_i\in\mathbb{Z}. Define the set G by

$$\begin{equation*} G=\left\{\sum_{i=1}^n a_ix_i : x_i\in\mathbb{Z}\right\} \end{equation*}$$

Clearly, there are both positive and negative elements in G, additionally 0\in G by taking each x_i=0. Define \Tilde{G} by

$$\begin{equation*} \Tilde{G}=\left\{g\in G: g>0\right\} \end{equation*}$$

It follows that \Tilde{G} is a non-empty subset of the positive integers and so by the well-ordering principle it has a smallest element \Tilde{g} of the form

$$\begin{equation*} \Tilde{g}=\sum_{i=1}^n a_ix_i \end{equation*}$$

We must show that \Tilde{g}\mid a_i for each i. Suppose for contradiction and without loss of generality that \Tilde{g}\nmid a_1. By the division algorithm, we have that

$$\begin{equation*} a_1=q\Tilde{g}+r \end{equation*}$$

with 0<r<\left|\Tilde{g}\right|. Therefore

$$\begin{align*} a_1&=q\Tilde{g}+r\\ r&=a_1-q\Tilde{g}\\ r&=a_1-q\sum_{i=1}^n a_ix_i\\ r&=a_1-\left(qa_1x_1+q\sum_{i=2}^n a_ix_i\right)\\ r&=a_1-qa_1x_1-q\sum_{i=2}^n a_ix_i\\ r&=a_1\left(1-qx_1\right)-q\sum_{i=2}^n a_ix_i\\ r&=a_1\left(1-qx_1\right)+\sum_{i=2}^n a_i\left(-qx_i\right) \end{align*}$$

Which shows that r\in G and, as r>0, that r\in\Tilde{G}. Moreover, as 0<r<\left|\Tilde{g}\right| we have r<\Tilde{g}, which contradicts the minimality of \Tilde{g}, so \Tilde{g}\mid a_1. The same argument shows that \Tilde{g}\mid a_i for each i.

As d=\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots ,a_n\right) we have that a_i=m_id for some m_i\in\mathbb{Z}, for each 1\leq i\leq n. Combining this with the expression for \Tilde{g} shows that

$$\begin{align*} \Tilde{g}&=\sum_{i=1}^n a_ix_i\\ \Tilde{g}&=\sum_{i=1}^n \left(m_i d\right)x_i\\ \Tilde{g}&=d\sum_{i=1}^n m_ix_i \end{align*}$$

So d\mid\Tilde{g} and, as both are positive, we have that d\leq\Tilde{g}. Conversely, \Tilde{g} is a common divisor of every a_i and d is the greatest common divisor, so \Tilde{g}\leq d. Hence d=\Tilde{g} as required. $\qed$ :::
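The proof above shows that the coefficients x_i exist without saying how to find them. The following is a minimal sketch, assuming Python, of one way to compute them: it folds the two-integer extended Euclidean algorithm across the list, which is a different route from the well-ordering argument used in the proof. The function names `extended_gcd` and `bezout_coefficients` are illustrative and are not used elsewhere in this document.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) >= 0 and a*x + b*y == g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        # Each pair keeps the invariant a*x + b*y == r.
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


def bezout_coefficients(values):
    """Return (d, xs) with d = gcd(values) and sum(a_i * x_i) == d."""
    d, xs = 0, []
    for a in values:
        g, s, t = extended_gcd(d, a)
        # d was a combination of the earlier values, so rescale those
        # coefficients by s and give the new value the coefficient t.
        xs = [s * x for x in xs] + [t]
        d = g
    return d, xs


d, xs = bezout_coefficients([54, 78, 35, 144, 50])
print(d, xs)
```

The coefficients returned are one valid choice among many; the theorem only asserts that at least one such choice exists.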

We are now at the end of a long road. We can now, partially, generalise proposition 130{reference-type="ref" reference="prop:NT_solutions_to_two_var_linear_diophantine_equation"} to the n variable case, namely we state the requirement for solutions to exist.

::: definition Definition 177. Linear equation of n indeterminate variables

Let S be a set. We say an equation is a linear equation in $n$ variables if it has the form

$$\begin{equation*} a_1x_1+a_2x_2+a_3x_3+\dots+a_nx_n=c \end{equation*}$$

for some coefficients a_i\in S and c\in S and n indeterminate variables x_i, where 1\leq i\leq n. :::

::: {#prop:NT_existence_of_solutions_to_n_var_linear_diophantine_equation .proposition} Proposition 138. Existence of solutions to n variable linear Diophantine equation

Let S=\left\{a_1,a_2,a_3,\dots,a_n\right\}\subset\mathbb{Z} be such that

$$\begin{equation} a_1x_1+a_2x_2+a_3x_3+\dots+a_nx_n=c \end{equation*}$$*

for the indeterminate variable x_i with 1\leq i\leq n. Let d=\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_n\right). We have that there are solutions so that each x_i\in\mathbb{Z} if and only if d\mid c.

Proof:

\left(\Rightarrow\right): If \displaystyle c= \sum_{i=1}^n a_ix_i for some x_i\in\mathbb{Z} then by proposition 103{reference-type="ref" reference="prop:NT_Divisor_dividing_all_in_set_divides_linear_combination"} we have that d\mid c.

\left(\Leftarrow\right): Suppose that d\mid c then \exists e\in\mathbb{Z} so that c=de. By the generalised Bézout's Identity, for each i we have that \exists y_i\in\mathbb{Z} so that

$$\begin{equation*} d=\sum_{i=1}^n a_iy_i \end{equation*}$$

where d=\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_n\right). Multiplying both sides by e we see that

$$\begin{equation*} c=e\sum_{i=1}^n a_iy_i=\sum_{i=1}^n a_i\left(ey_i\right) \end{equation*}$$

Hence we may take each x_i=ey_i.

The result is shown. $\qed$ :::
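The criterion d\mid c can be compared against a direct search on a small case. The sketch below, assuming Python, does this for the coefficients 15, 9 and 27; the search bound of 3 and the function names are assumptions chosen for illustration, and a bounded search is only a sanity check, not a proof.

```python
from functools import reduce
from itertools import product
from math import gcd


def solvable(coeffs, c):
    """Criterion from the proposition: integer solutions exist iff d | c.

    Assumes at least one coefficient is non-zero, so that d > 0.
    """
    d = reduce(gcd, coeffs)
    return c % d == 0


def found_by_search(coeffs, c, bound=3):
    """Look for a solution with every variable in [-bound, bound]."""
    box = range(-bound, bound + 1)
    return any(
        sum(a * x for a, x in zip(coeffs, xs)) == c
        for xs in product(box, repeat=len(coeffs))
    )


coeffs = (15, 9, 27)
for c in range(-9, 10):
    print(c, solvable(coeffs, c), found_by_search(coeffs, c))
```

For this choice of coefficients and range of c the two columns agree, since every solvable right-hand side happens to have a solution inside the search box.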

Unlike proposition 130{reference-type="ref" reference="prop:NT_solutions_to_two_var_linear_diophantine_equation"}, we did not show that there are infinitely many solutions and what form they take. Recall that we used example 118{reference-type="ref" reference="exam:NT_solutions_to_ax_plus_by"} to find the general form of the solutions for the two-variable case. We shall see if we can do the same for multiple variables.

::: example Example 143. Consider the three-variable Diophantine equation

$$\begin{equation} 15x+9y+27z=9 \end{equation*}$$*

Clearly \mathop{\mathrm{GCD}}\left(15,9,27\right)=3 and 3\mid 9 so integer solutions exist. How can we find one?

One idea that might give a solution is to try and reduce this to a two-variable equation. How can this be done? We can see that

$$\begin{equation} 15x+9y+27z=3\left(5x+3y\right)+27z=9 \end{equation*}$$*

As 5x+3y\in\mathbb{Z} for x,y\in\mathbb{Z} we will denote this by v\in\mathbb{Z}, that is v=5x+3y. As \mathop{\mathrm{GCD}}\left(5,3\right)=1 and 1 divides every integer, the equation 5x+3y=v has integer solutions for any v\in\mathbb{Z}. So first let v=1 to give

$$\begin{equation} 5x+3y=1 \end{equation*}$$*

The Euclidean algorithm shows that x=2 and y=-3 is a particular solution. Hence the general solutions will be given by

$$\begin{align*} x&=2+\frac{3n}{1}=2+3n\\ y&=-3-\frac{5n}{1}=-3-5n \end{align*}$$

In particular, the general solutions satisfy 5x+3y=1 and so the general solutions to 5x+3y=v will be given by

$$\begin{align*} x&=v\left(2+3n\right)\\ y&=v\left(-3-5n\right) \end{align*}$$

Now, consider the remaining equation given by

$$\begin{equation} 3v+27z=9 \end{equation*}$$*

By the Euclidean algorithm, we see that v=3 and z=0 is a particular solution, with the general solutions given by

$$\begin{align*} v&=3+\frac{27k}{3}=3+9k\\ z&=-0-\frac{3k}{3}=-k \end{align*}$$

Hence we have that

$$\begin{align*} x&=v\left(2+3n\right)\\ y&=v\left(-3-5n\right)\\ v&=3+9k\\ z&=-k \end{align*}$$

As we have an expression for v we can substitute it into the expressions for x and y to give

$$\begin{align*} x&=\left(3+9k\right)\left(2+3n\right)\\ y&=\left(3+9k\right)\left(-3-5n\right)\\ z&=-k \end{align*}$$

where n,k\in\mathbb{Z}. This is a general solution to 15x+9y+27z=9. Indeed, if we substitute these back into the original equation we get

$$\begin{align*} 15x+9y+27z&=15\left(\left(3+9k\right)\left(2+3n\right)\right)+9\left(\left(3+9k\right)\left(-3-5n\right)\right)+27\left(-k\right)\\ &=15\left(3+9k\right)\left(2+3n\right)+9\left(3+9k\right)\left(-3-5n\right)-27k\\ &=\left(3+9k\right)\left(15\left(2+3n\right)+9\left(-3-5n\right)\right)-27k\\ &=\left(3+9k\right)\left(30+45n+\left(-27-45n\right)\right)-27k\\ &=\left(3+9k\right)\left(3\right)-27k\\ &=9+27k-27k=9 \end{align*}$$ :::
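The substitution check in the example can also be run numerically. The sketch below, assuming Python, evaluates the two-parameter family for a small range of n and k; the range and the names `family_143` and `satisfies` are illustrative assumptions. It confirms that each member of the family solves the equation. It does not show that every solution arises this way: the triple (0,1,0) also satisfies 15x+9y+27z=9, but z=0 forces k=0 and then x=3\left(2+3n\right) is never 0, so it is not a member of the family.

```python
from itertools import product


def family_143(n, k):
    """Family from the example: v = 3 + 9k, x = v(2 + 3n), y = v(-3 - 5n), z = -k."""
    v = 3 + 9 * k
    return v * (2 + 3 * n), v * (-3 - 5 * n), -k


def satisfies(x, y, z):
    return 15 * x + 9 * y + 27 * z == 9


# Sample the family on a small grid of parameter values.
members = {family_143(n, k) for n, k in product(range(-5, 6), repeat=2)}
all_members_are_solutions = all(satisfies(*m) for m in members)

print(all_members_are_solutions)
print(satisfies(0, 1, 0), (0, 1, 0) in members)
```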

::: {#exam:NT_solution_to_linear_diophantine_by_subs .example} Example 144.

We consider the same example again, but we will find a different solution method. So consider the three-variable equation given by

$$\begin{equation} 15x+9y+27z=9 \end{equation*}$$*

We will express x in terms of y and z. We have

$$\begin{equation} x=\frac{9-9y-27z}{15} \end{equation*}$$*

Observe that we can express 9 and 27 in terms of 15. We have

$$\begin{align*} 9&=15-6\\ 27&=2\cdot 15-3 \end{align*}$$

Hence, we can split the expression for x up into the parts that we know are divisible by 15 and the parts that may or may not be divisible by $15$

$$\begin{align*} x&=\frac{9-9y-27z}{15}\\ &=\frac{\left(15-6\right)-\left(15-6\right)y-\left(2\left(15\right)-3\right)z}{15}\\ &=1-y-2z+\frac{-6+6y+3z}{15} \end{align*}$$

As we seek x\in\mathbb{Z} then as 1-y-2z\in\mathbb{Z} we will also require that \displaystyle \frac{-6+6y+3z}{15}\in\mathbb{Z}. Let \displaystyle \frac{-6+6y+3z}{15}=s where s\in\mathbb{Z}. Then we have

$$\begin{equation} x=1-y-2z+s \end{equation*}$$*

Now, we have that

$$\begin{equation*} 15s=-6+6y+3z \end{equation*}$$

We repeat the above process to get y in terms of z and s. Doing so gives

$$\begin{align*} 15s&=-6+6y+3z\\ 6y&=15s+6-3z \end{align*}$$

As before, we can express 15 and 3 in terms of 6 to get

$$\begin{align*} 15&=2\cdot 6+3\\ 3&=6-3 \end{align*}$$

So that,

$$\begin{align*} 6y&=15s+6-3z\\ 6y&=\left(2\left(6\right)+3\right)s+6-\left(6-3\right)z\\ y&=\frac{\left(2\left(6\right)+3\right)s+6-\left(6-3\right)z}{6}\\ y&=2s+1-z+\frac{3s+3z}{6} \end{align*}$$

As we need y\in\mathbb{Z} then we require \displaystyle \frac{3s+3z}{6}\in\mathbb{Z} say \displaystyle \frac{3s+3z}{6}=t. Then we have

$$\begin{equation} y=2s+1-z+t \end{equation*}$$*

Finally, we have that

$$\begin{equation} 6t=3s+3z \end{equation*}$$*

Which can be solved directly for z to give z=2t-s. Substituting the value of z in y gives

$$\begin{align*} y&=2s+1-z+t\\ y&=2s+1-\left(2t-s\right)+t\\ y&=3s+1-t \end{align*}$$ And on substitution of this y value and z into x we get

$$\begin{align*} x&=1-y-2z+s\\ x&=1-\left(3s+1-t\right)-2\left(2t-s\right)+s\\ x&=1-3s-1+t-4t+2s+s\\ x&=-3t \end{align*}$$

Hence, a general solution is given by,

$$\begin{align*} x&=-3t\\ y&=3s+1-t\\ z&=2t-s \end{align*}$$

for s,t\in\mathbb{Z}. This is indeed a general solution as

$$\begin{align*} 15x+9y+27z&=15\left(-3t\right)+9\left(3s+1-t\right)+27\left(2t-s\right)\\ &=-45t+27s+9-9t+54t-27s\\ &=9 \end{align*}$$ :::
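As a sanity check on the parametrisation x=-3t, y=3s+1-t, z=2t-s, the sketch below, assuming Python, searches a small box for every integer solution and tries to recover parameters s and t for each one. The inverse map `to_parameters` is an assumption of this sketch, obtained by rearranging x=-3t and z=2t-s; the box size is arbitrary.

```python
from itertools import product


def from_parameters(s, t):
    """Parametrisation found in the example."""
    return -3 * t, 3 * s + 1 - t, 2 * t - s


def to_parameters(x, y, z):
    """Candidate parameters: t from x = -3t, then s from z = 2t - s."""
    t = -x // 3
    s = 2 * t - z
    return s, t


box = range(-10, 11)
solutions = [
    p for p in product(box, repeat=3)
    if 15 * p[0] + 9 * p[1] + 27 * p[2] == 9
]

# Every solution in the box should be reproduced by its recovered parameters.
recovered = all(from_parameters(*to_parameters(*p)) == p for p in solutions)
print(len(solutions), recovered)
```

Within the box searched, every solution is reproduced from its recovered parameters, which is consistent with this family describing all integer solutions.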

Of course, there was nothing special about using only three variables.

::: example Example 145. Consider the four-variable Diophantine equation

$$\begin{equation} 55a+35b-77c+144d=1 \end{equation*}$$*

As \mathop{\mathrm{GCD}}\left(55,35,77,144\right)=1 then integer solutions exist. We will use a similar method to example 144{reference-type="ref" reference="exam:NT_solution_to_linear_diophantine_by_subs"}, with some details omitted for brevity.

Expressing, a in terms of b,c and d we get that

$$\begin{equation} a=\frac{1-35b+77c-144d}{55} \end{equation*}$$*

Noting that

$$\begin{align*} 35&=55-20\\ 77&=55+22\\ 144&=2\cdot 55+34 \end{align*}$$

We can express a as

$$\begin{align*} a&=\frac{1-35b+77c-144d}{55}\\ &=\frac{1-\left(55-20\right)b+\left(55+22\right)c-\left(2\left(55\right)+34\right)d}{55}\\ &=-b+c-2d+\frac{1+20b+22c-34d}{55} \end{align*}$$

We require that \displaystyle \frac{1+20b+22c-34d}{55}\in\mathbb{Z} say with \displaystyle \frac{1+20b+22c-34d}{55} = u for u\in\mathbb{Z}. We therefore have that

$$\begin{equation} 55u=1+20b+22c-34d \end{equation*}$$*

Expressing, b in terms of c,d and u we get

$$\begin{align} b=\frac{55u-1-22c+34d}{20} \end{align*}$$*

We see that

$$\begin{align*} 55&=2\cdot 20+15\\ 22&=20+2\\ 34&=20+14 \end{align*}$$

Hence,

$$\begin{equation} b=2u-c+d+\frac{15u-1-2c+14d}{20} \end{equation*}$$*

Set \displaystyle \frac{15u-1-2c+14d}{20}=v where v\in\mathbb{Z}. Then

$$\begin{equation} 20v=15u-1-2c+14d \end{equation*}$$*

Solving c in terms of d,u and v gives

$$\begin{equation} c=\frac{15u-1-20v+14d}{2} \end{equation*}$$*

where we get

$$\begin{equation} c=7u-10v+7d+\frac{u-1}{2}=7u-10v+7d+x \end{equation*}$$*

where \displaystyle x=\frac{u-1}{2}\in\mathbb{Z}. We seem to have hit a problem: we still don't have an expression for d. However, suppose d\in\mathbb{Z} is arbitrary, can we recover a general solution with this assumption?

Firstly, we will express a,b and c in terms of u,v,x and d. We get that

$$\begin{align*} d&\in\mathbb{Z}\\ c&=7u-10v+7d+x\\ b&=-6d-5u+11v-x\\ a&=11d+13u-21v+2x \end{align*}$$

Observe that,

$$\begin{align*} 55a&=55\left(11d+13u-21v+2x\right)=605d+715u-1155v+110x\\ 35b&=35\left(-6d-5u+11v-x\right)=-210d-175u+385v-35x\\ 77c&=77\left(7u-10v+7d+x\right)=539d+539u-770v+77x \end{align*}$$

Hence

$$\begin{align*} 55a+35b-77c+144d&=605d+715u-1155v+110x\\ &-210d-175u+385v-35x\\ &-\left(539d+539u-770v+77x\right)\\ &+144d\\ &=0d+u+0v-2x=u-2x \end{align*}$$

However, we know that \displaystyle x=\frac{u-1}{2} so that 2x=u-1 so

$$\begin{equation*} u-2x=u-\left(u-1\right)=1 \end{equation*}$$

Hence

$$\begin{align*} a&=11d+13u-21v+2x\\ b&=-6d-5u+11v-x\\ c&=7u-10v+7d+x\\ u&=2x+1\\ v&,x,d\in\mathbb{Z} \end{align*}$$

Gives a general solution for v,x,d\in\mathbb{Z}, where u=2x+1 is determined by x. :::
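The expressions for a, b and c can be checked numerically. The sketch below, assuming Python, uses the relation 2x=u-1 from the example to set u=2x+1, leaving v, x and d free; the function name `solution_145` and the sampled range are illustrative assumptions.

```python
from itertools import product


def solution_145(v, x, d):
    """Solution from the example, with u fixed by the relation 2x = u - 1."""
    u = 2 * x + 1
    a = 11 * d + 13 * u - 21 * v + 2 * x
    b = -6 * d - 5 * u + 11 * v - x
    c = 7 * u - 10 * v + 7 * d + x
    return a, b, c, d


def lhs(a, b, c, d):
    return 55 * a + 35 * b - 77 * c + 144 * d


# Evaluate the left-hand side on a small grid of parameter values.
checks = [
    lhs(*solution_145(v, x, d))
    for v, x, d in product(range(-4, 5), repeat=3)
]
print(set(checks))
```

If u were chosen independently of x, the left-hand side would evaluate to u-2x rather than 1, which is why the sketch ties the two together.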

It is interesting to note that, although the previous example is written using the four integer variables u, v, x and d, the requirement that 2x=u-1 means that only three of them are arbitrary. In the case of three variables, we were able to find solutions requiring only two arbitrary integer variables. Does the other method also give a general solution requiring three arbitrary integer variables?

::: example Example 146. Consider again

$$\begin{equation} 55a+35b-77c+144d=1 \end{equation*}$$*

We have that \mathop{\mathrm{GCD}}\left(55,35\right)=5 so that

$$\begin{equation*} 55a+35b-77c+144d=5\left(11a+7b\right)-77c+144d=1 \end{equation*}$$ We have that as 11a+7b\in\mathbb{Z} we can replace this with a variable, say u so that we get the equation

$$\begin{equation*} 11a+7b=u \end{equation*}$$

By the Euclidean algorithm, we see for u=1 that a=2 and b=-3 is a particular solution, with the general solutions being given by

$$\begin{align*} a&=2+\frac{7x}{1}=2+7x\\ b&=-3-\frac{11x}{1}=-3-11x \end{align*}$$

Hence the general solution to 11a+7b=u is given by

$$\begin{align*} a&=u\left(2+7x\right)\\ b&=u\left(-3-11x\right) \end{align*}$$

Now, the original four-variable equation is the three-variable equation

$$\begin{equation} 5u-77c+144d=1 \end{equation*}$$*

We have that \mathop{\mathrm{GCD}}\left(-77,144\right)=1. So replace -77c+144d with a variable, say v so that

$$\begin{equation*} -77c+144d=v \end{equation*}$$

By the Euclidean algorithm, for v=1 a particular solution is c=43 and d=23 and the general solution is

$$\begin{align*} c&=43+\frac{144y}{1}=43+144y\\ d&=23-\frac{-77y}{1}=23+77y \end{align*}$$

So that a solution to -77c+144d=v is given by

$$\begin{align*} c&=v\left(43+144y\right)\\ d&=v\left(23+77y\right) \end{align*}$$

This turns the three-variable equation into a two-variable equation given by

$$\begin{equation} 5u+v=1 \end{equation*}$$*

which clearly has a particular solution of u=0 and v=1, with general solutions given by

$$\begin{align*} u&=0+\frac{z}{1}=z\\ v&=1-\frac{5z}{1}=1-5z \end{align*}$$

Therefore, we have

$$\begin{align*} a&=u\left(2+7x\right)\\ b&=u\left(-3-11x\right)\\ c&=v\left(43+144y\right)\\ d&=v\left(23+77y\right)\\ u&=z\\ v&=1-5z \end{align*}$$

So, substituting u and v where required yields

$$\begin{align*} a&=z\left(2+7x\right)\\ b&=z\left(-3-11x\right)\\ c&=\left(1-5z\right)\left(43+144y\right)\\ d&=\left(1-5z\right)\left(23+77y\right) \end{align*}$$

Where x,y,z\in\mathbb{Z}. We verify that this is a general solution. We have

$$\begin{align*} 55a+35b-77c+144d&=55z\left(2+7x\right)+35z\left(-3-11x\right)-77\left(1-5z\right)\left(43+144y\right)+144\left(1-5z\right)\left(23+77y\right)\\ &=z\left(55\left(2+7x\right)+35\left(-3-11x\right)\right)+\left(1-5z\right)\left(-77\left(43+144y\right)+144\left(23+77y\right)\right)\\ &=z\left(110+385x-105-385x\right)+\left(1-5z\right)\left(-77\left(43+144y\right)+144\left(23+77y\right)\right)\\ &=5z+\left(1-5z\right)\left(-3311-11088y+3312+11088y\right)\\ &=5z+\left(1-5z\right)=1 \end{align*}$$

Hence

$$\begin{align*} a&=z\left(2+7x\right)\\ b&=z\left(-3-11x\right)\\ c&=\left(1-5z\right)\left(43+144y\right)\\ d&=\left(1-5z\right)\left(23+77y\right) \end{align*}$$

where x,y,z\in\mathbb{Z} is a general solution. :::

Hence we expressed the $4$-variable linear Diophantine equation in terms of three arbitrary variables and found expressions for $3$-variable linear Diophantine equations in terms of two arbitrary parameters. Is this always the case, and if so does it hold for any number of variables? The answer to this question can be found by considering the method of replacing parts of the $n$-variable linear Diophantine equation with variables.

For example, for the $3$-variable case we have

$$\begin{equation*} ax+by+cz=d \end{equation*}$$

Suppose that \mathop{\mathrm{GCD}}\left(a,b,c\right)\mid d. After a potential factoring of ax+by=g_1\left(a'x+b'y\right) where g_1=\mathop{\mathrm{GCD}}\left(a,b\right), we can replace a'x+b'y with a variable, say u so that

$$\begin{equation*} a'x+b'y=u \end{equation*}$$

As we have factored out the greatest common divisor we will have \mathop{\mathrm{GCD}}\left(a',b'\right)=1 by proposition 108{reference-type="ref" reference="prop:NT_GCD_properties"} part 7. Hence we can solve a'x+b'y=1 by the Euclidean algorithm and get a general solution

$$\begin{align*} x&=u\left(x_0+b'n\right)\ y&=u\left(y_0-a'n\right) \end{align*}$$

for some n\in\mathbb{Z}. As we have seen, this turns the $3$-variable equation into a $2$-variable equation given by

$$\begin{equation*} g_1u+cz=d \end{equation*}$$

Which is solvable as \mathop{\mathrm{GCD}}\left(g_1,c\right)\mid d.

This will have a general solution of

$$\begin{align*} u&=u_0+\frac{cm}{\mathop{\mathrm{GCD}}\left(g_1,c\right)}\ z&=z_0-\frac{g_1m}{\mathop{\mathrm{GCD}}\left(g_1,c\right)}\ \end{align*}$$

Hence a general form of the general solution to the $3$-variable case is given by

$$\begin{align*} x&=\left(u_0+\frac{cm}{\mathop{\mathrm{GCD}}\left(g_1,c\right)}\right)\left(x_0+b'n\right)\ y&=\left(u_0+\frac{cm}{\mathop{\mathrm{GCD}}\left(g_1,c\right)}\right)\left(y_0-a'n\right)\ z&=z_0-\frac{g_1m}{\mathop{\mathrm{GCD}}\left(g_1,c\right)}\ \end{align*}$$

Here the arbitrary variables are n and m.

Now, in the $4$-variable case we have

$$\begin{equation*} aw+bx+cy+dz=e \end{equation*}$$

Suppose that \mathop{\mathrm{GCD}}\left(a,b,c,d\right)\mid e. As before, after a potential factoring of aw+bx=g_1\left(a'w+b'x\right) where g_1=\mathop{\mathrm{GCD}}\left(a,b\right), we can replace a'w+b'x with a variable, say u so that

$$\begin{equation*} a'w+b'x=u \end{equation*}$$

We can solve a'w+b'x=1 by the Euclidean algorithm and get a general solution for any u

$$\begin{align*} w&=u\left(w_0+b'n\right)\ x&=u\left(x_0-a'n\right) \end{align*}$$

for some n\in\mathbb{Z}. This turns the $4$-variable equation into a $3$-variable equation given by

$$\begin{equation*} g_1u+cy+dz=e \end{equation*}$$

Which is solvable as \mathop{\mathrm{GCD}}\left(g_1,c,d\right)\mid e. Two choices can be made here: replacing g_1u+cy or cy+dz.

Taking the first choice, we have after a potential factoring g_1u+cy=g_2\left(g_1'u+c'y\right) where g_2=\mathop{\mathrm{GCD}}\left(g_1,c\right), we set g_1'u+c'y=v, solving g_1'u+c'y=1 by the Euclidean algorithm, we get a general solution for any v

$$\begin{align*} u&=v\left(u_0+c'm\right)\ y&=v\left(y_0-g_1'm\right) \end{align*}$$

We are now left with a $2$-variable equation given by

$$\begin{equation*} g_2v+dz=e \end{equation*}$$

Again, solvable because \mathop{\mathrm{GCD}}\left(g_2,d\right)\mid e. This has a general solution given by

$$\begin{align*} v&=v_0+\frac{dk}{\mathop{\mathrm{GCD}}\left(g_2,d\right)}\ z&=z_0-\frac{g_2k}{\mathop{\mathrm{GCD}}\left(g_2,d\right)}\ \end{align*}$$

Hence, we have a general solution given by.

$$\begin{align*} w&=\left(v_0+\frac{dk}{\mathop{\mathrm{GCD}}\left(g_2,d\right)}\right)\left(u_0+c'm\right)\left(w_0+b'n\right)\ x&=\left(v_0+\frac{dk}{\mathop{\mathrm{GCD}}\left(g_2,d\right)}\right)\left(u_0+c'm\right)\left(x_0-a'n\right)\ y&=\left(v_0+\frac{dk}{\mathop{\mathrm{GCD}}\left(g_2,d\right)}\right)\left(y_0-g_1'm\right)\ z&=z_0-\frac{g_2k}{\mathop{\mathrm{GCD}}\left(g_2,d\right)}\ \end{align*}$$

Alternatively, suppose we took the choice of cy+dz, after a potential factoring cy+dz=g_2\left(c'y+d'z\right) where g_2=\mathop{\mathrm{GCD}}\left(c,d\right). Setting c'y+d'z=v and solving c'y+d'z=1 by the Euclidean algorithm, we get a general solution for any v given by

$$\begin{align*} y&=v\left(y_0+d'm\right)\ z&=v\left(z_0-c'm\right)\ \end{align*}$$

Which now gives the $2$-variable equation

$$\begin{equation*} g_1u+g_2v=e \end{equation*}$$

Solutions exist as \mathop{\mathrm{GCD}}\left(g_1,g_2\right)\mid e. This has a general solution

$$\begin{align*} u&=u_0+\frac{g_2k}{\mathop{\mathrm{GCD}}\left(g_1,g_2\right)}\\ v&=v_0-\frac{g_1k}{\mathop{\mathrm{GCD}}\left(g_1,g_2\right)} \end{align*}$$

Hence, a different general solution to the original $4$-variable equation is given by

$$\begin{align*} w&=\left(u_0+\frac{g_2k}{\mathop{\mathrm{GCD}}\left(g_1,g_2\right)}\right)\left(w_0+b'n\right)\\ x&=\left(u_0+\frac{g_2k}{\mathop{\mathrm{GCD}}\left(g_1,g_2\right)}\right)\left(x_0-a'n\right)\\ y&=\left(v_0-\frac{g_1k}{\mathop{\mathrm{GCD}}\left(g_1,g_2\right)}\right)\left(y_0+d'm\right)\\ z&=\left(v_0-\frac{g_1k}{\mathop{\mathrm{GCD}}\left(g_1,g_2\right)}\right)\left(z_0-c'm\right) \end{align*}$$

It seems there are a few things to show. Firstly, is it possible to show that the solution to an $n$-variable linear Diophantine equation can be expressed in terms of n-1 variables? Secondly, do the general solutions have the form of being a product of 2 variable solutions?

Given an $n$-variable linear Diophantine equation, replacing two variables with a single variable turns any $n$-variable equation into an equation with n-1 variables. Eventually, this process terminates when, after some number of replacements, we get to a $2$-variable equation.

We have therefore a strong understanding of solving Linear Diophantine equations in any number of variables.

Polynomials

Previously, we have seen how to handle linear equations in multiple variables. That is equations of the form

$$\begin{equation*} a_1x_1+a_2x_2+\dots+a_nx_n=b,\quad a_i,x_i,b\in\mathbb{Z} \end{equation*}$$

A natural question to ask is can we extend this to non-linear equations? For example, we have defined what we mean by a square number (definition 158{reference-type="ref" reference="def:NT_square_number"}). That is, a number y\in\mathbb{Z} is square if \exists x\in\mathbb{Z} so that x^2=y. If we consider x as a variable for a moment, then we have seen many examples of solving this type of equation. We studied this when finding what integer numbers were squares. We can ask the question, what happens if we combine this x^2 variable with just the variable x, for example, what values for x\in\mathbb{Z} or \mathbb{Q} would satisfy

$$\begin{equation*} x^2+x=2 \end{equation*}$$

We can go further than simply x^2. For example, we can consider

$$\begin{equation*} x^n=\prod_{i=1}^n x \end{equation*}$$

and combine variables of this form however we wish and multiply them by constants, for example.

$$\begin{equation*} x^8+15x^7-8x^3+2x^2+x+5=0 \end{equation*}$$

We will want to study equations of this form.

::: definition Definition 178. Monomial

Let X be a variable and let a\in S for some set S\neq\emptyset. We define a monomial to be an expression of the form

$$\begin{equation} aX^n \end{equation*}$$*

where n\in\mathbb{Z} with $n\geq 0$ :::

From a monomial, we define a so-called polynomial

::: definition Definition 179. Polynomial

Let S be a set and let n\in\mathbb{Z} with n\geq 0. Let X be a variable. We define a polynomial to be an expression of the form

$$\begin{equation} P\left(X\right)=a_nX^n+a_{n-1}X^{n-1}+a_{n-2}X^{n-2}+\dots+a_1X+a_0 \end{equation*}$$*

Where a_0,a_1,\dots,a_n\in S are called the coefficients of the polynomial. We say that X is an indeterminate variable and we say that P\left(X\right) is a polynomial in X with coefficients in S.

Here we are formally using the + operation associated with S between the terms in the polynomial. :::

We can, of course, replace X with a particular value to evaluate the polynomial.

::: definition Definition 180. Evaluation of a polynomial

Let S\neq\emptyset be a set and let P\left(X\right) be a polynomial with coefficients in S. Let s\in S. We define the evaluation of the polynomial P at s by

$$\begin{equation} P\left(s\right)=a_ns^n+a_{n-1}s^{n-1}+a_{n-2}s^{n-2}+\dots+a_1s+a_0 \end{equation*}$$* :::

::: example Example 147. Let S=\mathbb{Z} and define P\left(X\right) by

$$\begin{equation} P\left(X\right)=2X^2-3X+5 \end{equation*}$$*

What is P\left(1\right)?

On substituting X=1 we see

$$\begin{equation} P\left(1\right)=2\left(1\right)^2-3\left(1\right)+5=2-3+5=4 \end{equation*}$$* :::
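Evaluation is easy to mechanise. The sketch below, assuming Python, stores the coefficients in increasing order of power, (a_0, a_1, \dots, a_n), which is a representation chosen for this sketch. It uses Horner's rule, a standard rearrangement of the sum in the definition rather than a literal term-by-term evaluation; the function name `evaluate` is illustrative.

```python
def evaluate(coeffs, s):
    """Evaluate a_0 + a_1*s + ... + a_n*s**n using Horner's rule.

    coeffs is the sequence (a_0, a_1, ..., a_n), lowest power first.
    """
    result = 0
    for a in reversed(coeffs):
        # Rewrites the sum as (...((a_n)*s + a_{n-1})*s + ...)*s + a_0.
        result = result * s + a
    return result


# P(X) = 2X^2 - 3X + 5 has coefficient tuple (5, -3, 2).
print(evaluate((5, -3, 2), 1))
```

Horner's rule uses n multiplications for a polynomial of degree n, rather than computing each power of s separately.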

It will be useful to describe the set of all polynomials whose coefficients lie in some set S.

::: definition Definition 181. Set of all polynomials with coefficients in a set $S$

Let S\neq\emptyset. We define the set of all polynomials whose coefficients are in S by the set

$$\begin{equation*} S\left[X\right]=\left\{\sum_{i=0}^n s_iX^i: n\in\mathbb{N}\text{ and } s_i\in S\right\} \end{equation*}$$

We define the polynomials to be the elements of this set and write P\in S\left[X\right], with the understanding that P actually means P\left(X\right). :::

From the definition of a polynomial, we have many choices that we can make that allow us to create a polynomial. We can modify the coefficients a_i for 0\leq i\leq n however we wish, so long as they are all in the set S. In particular, the choices we can make are clearly dependent on the value of n we can pick. The value of n is an important property of polynomials.

::: definition Definition 182. Degree of a polynomial

Let P\in S\left[X\right]. We define the degree of the polynomial P to be the largest n\in\mathbb{Z} so that the coefficient of X^n is not equal to zero.

We write \deg\left(P\right)=n to mean the degree of the polynomial P is n. :::

::: example Example 148. Let S=\mathbb{Z} and define P\left(X\right) by

$$\begin{equation} P\left(X\right)=2X^2-3X+5 \end{equation*}$$*

We see that the largest n for which the coefficient of X^n is non-zero is 2 so \deg\left(P\right)=2. :::

The astute reader might ask the following. Suppose that P is given by

$$\begin{equation*} P=0+0X+0X^2+0X^3+\dots+0X^n \end{equation*}$$

what is the degree of P? On one hand, by our definition, we can't assign it a degree! There are no non-zero coefficients in the polynomial! On the other hand, we intuitively know that the above polynomial represents a meaningful polynomial, especially for the theory we are attempting to develop. It is not clear how to resolve this problem for now. Perhaps, developing the theory as much as we can without it will make it clear how to resolve this issue.
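The definition of degree, together with the difficulty just described, can be sketched in code. The following assumes Python and a coefficient sequence ordered from the constant term upwards. Returning `None` for the all-zero case is a choice made only for this sketch, mirroring the fact that the definition assigns no degree there; it is not a resolution of the question raised above.

```python
def degree(coeffs):
    """Largest index i with coeffs[i] != 0, or None if every coefficient is zero.

    coeffs is the sequence (a_0, a_1, ..., a_n), lowest power first.
    """
    for i in range(len(coeffs) - 1, -1, -1):
        if coeffs[i] != 0:
            return i
    return None


print(degree((5, -3, 2)))    # 2X^2 - 3X + 5
print(degree((1, 1, 1, 0)))  # trailing zero coefficients do not raise the degree
print(degree((0, 0, 0)))     # no non-zero coefficient to pick out
```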

Defining addition between two polynomials

We can define how to add two polynomials together. To do so we need to recast how we see a polynomial, and to do so recall the definition of the Cartesian product of n sets 33{reference-type="ref" reference="def:CartProductOfNSet"}.

Let S_1,S_2,\dots,S_n be sets. We define the Cartesian product of S_1,S_2,\dots,S_n, denoted S_1\times S_2\times\dots\times S_n to be the set of all ordered $n$-tuples of the form \left(s_1,s_2,\dots,s_n\right) where s_1\in S_1,s_2\in S_2,\dots, s_n\in S_n. This is to say that

$$\begin{equation*} S_1\times S_2\times\dots\times S_n=\left\{\left(s_1,s_2,\dots,s_n\right):s_1\in S_1,s_2\in S_2,\dots, s_n\in S_n\right\} \end{equation*}$$

In particular, if all of the sets are the same we denote this by S^n. We can use this idea to define a polynomial of degree n as a tuple.

Firstly, observe that we can write a polynomial P\left(X\right) as

$$\begin{align*} P\left(X\right)&=a_nX^n+a_{n-1}X^{n-1}+a_{n-2}X^{n-2}+\dots+a_1X+a_0\ &= a_0+a_1X+a_2X^2+a_3X^3+\dots+a_{n-1}X^{n-1}+a_nX^n\ &=\sum_{i=0}^n a_i X^i \end{align*}$$

That is we can express P as the sum of products of coefficients in S and the corresponding power of the indeterminate variable X.

Now, we have that \deg\left(P\right)=n so in order to have the correct sized tuple we must consider the Cartesian product of S with itself n+1 times, that is

$$\begin{equation*} S^{n+1}=\prod_{i=0}^n S \end{equation*}$$

As each a_i\in S for 0\leq i\leq n we have that the tuple a=\left(a_0,a_1,a_2,\dots,a_{n-1},a_n\right)\in S^{n+1}. This is the correspondence we need.

::: definition Definition 183. Polynomial as an $n+1$-tuple

Let S\neq\emptyset and let n\in\mathbb{Z} with n\geq 0 so that \deg\left(P\right)=n where

$$\begin{equation} P\left(X\right)=a_nX^n+a_{n-1}X^{n-1}+a_{n-2}X^{n-2}+\dots+a_1X+a_0 \end{equation*}$$*

We can view a polynomial as an element of the set S^{n+1}, say a with the form

$$\begin{equation} a=\left(a_0,a_1,a_2,\dots,a_{n-1},a_n\right) \end{equation*}$$*

More simply, we can write

$$\begin{equation} P=\left(a_0,a_1,a_2,\dots,a_{n-1},a_n\right) \end{equation*}$$*

where the powers of X are implicit. :::

This definition has an immediate consequence, it enables us to have a representation for each X^n for any n\geq 0. For example, we see that

$$\begin{align*} P\left(X\right)=1=X^0 &\iff a=\left(1\right)\ P\left(X\right)=X &\iff a=\left(0,1\right)\ P\left(X\right)=X^2 &\iff a=\left(0,0,1\right)\ P\left(X\right)=X^3 &\iff a=\left(0,0,0,1\right)\ &\dots \end{align*}$$

This allows us to build an understanding of how to properly define addition of two polynomials. Suppose we have

$$\begin{align*} P\left(X\right)&=1+X+X^2\ Q\left(X\right)&=4-3X+X^2+X^3 \end{align*}$$

Where the coefficients of P and Q are elements of \mathbb{Z}. We see that P=\left(1,1,1\right) and Q=\left(4,-3,1,1\right). Firstly, we have that P has fewer entries in its tuple than Q. We can account for this by noting that P\left(X\right)=1+X+X^2=1+X+X^2+0X^3 and so an alternative representation of P is given by P=\left(1,1,1,0\right).

Now, consider the terms in both P and Q which are associated with X^0 i.e. P_0\left(X\right)=1 and Q_0\left(X\right)=4. As these are simply elements of \mathbb{Z} we would expect that P_0\left(X\right)+Q_0\left(X\right)=1+4=5 and so for the sum to have the tuple form \left(5\right).

Considering the terms in both P and Q which are associated with X^1, P_1\left(X\right)=X and Q_1\left(X\right)=-3X, we would then expect that P_1\left(X\right)+Q_1\left(X\right)=X-3X=-2X.

We can continue this process for the other terms X^2 and X^3 to get

$$\begin{align*} P_0\left(X\right)+Q_0\left(X\right)&=1+4=5\ P_1\left(X\right)+Q_1\left(X\right)&=X-3X=-2X\ P_2\left(X\right)+Q_2\left(X\right)&=X^2+X^2=2X^2\ P_3\left(X\right)+Q_3\left(X\right)&=0X^3+X^3=X^3\ \end{align*}$$

This would then suggest that P\left(X\right)+Q\left(X\right)=5-2X+2X^2+X^3. Or, expressing this in tuple form, we have

$$\begin{equation*} \left(1,1,1,0\right)+\left(4,-3,1,1\right)=\left(5,-2,2,1\right) \end{equation*}$$

That is, the addition of two tuples representing polynomials is done by doing an "element-wise" addition of the tuples. There are a few things that would need to be considered for this to become the foundation for defining addition for polynomials.

Firstly, we observed that P_0\left(X\right)+Q_0\left(X\right) made sense as this represents integer addition. If we picked our coefficients from say \mathbb{N}, we would not be able to consider P_0\left(X\right)-Q_0\left(X\right); in fact, this holds for each P_i-Q_i. It would therefore be useful to have closure of addition, and additionally a notion of subtraction of the elements of S. This puts a restriction on what the set S can be, for example, it is clear that S\neq\mathbb{N} as subtraction is not closed in \mathbb{N}.

This also means our definition of polynomials is going to depend on the underlying set that the coefficients come from. It is therefore a wise idea to, at least temporarily, distinguish between when we are talking about polynomial addition and when we are talking about the addition of the elements of the set S.

We will use +_S when talking about addition between the elements of the set S, and we will use \oplus_S for the polynomial addition16 .

Furthermore, we had that P was of a lesser degree than Q, \deg\left(P\right)=2 and \deg\left(Q\right)=3. This poses no real issue as we can always extend an $m$-tuple to an $n$-tuple, for m<n. Indeed, suppose we have an element s\in S^m and we want to extend it to an element of S^n where m<n, then we can use the following map

$$\begin{align*} f:S^m&\mathlarger{\mathlarger{\rightarrow}}S^n\ (s_1,s_2,\dots,s_m)&\mapsto f\left((s_1,s_2,\dots,s_m)\right)=(s_1,s_2,\ldots,s_m, \underbrace{0, 0, \dots, 0}_{n-m \text{ times}}) \end{align*}$$

Or more simply, append n-m $0$s to the element of S^m. We provide a general definition.

::: definition Definition 184. Polynomial tuple extension map

Let S be a set containing an element 0 and let n,m\in\mathbb{Z} so that 1\leq m\leq n. We define the polynomial tuple extension map by

$$\begin{align*} E_m^n:S^m&\mathlarger{\mathlarger{\rightarrow}}S^n\\ (s_1,s_2,\dots,s_m)&\mapsto E_m^n\left((s_1,s_2,\dots,s_m)\right)=(s_1,s_2,\ldots,s_m, \underbrace{0, 0, \dots, 0}_{n-m \text{ times}}) \end{align*}$$

Here, we are using the notation E_m^n to indicate this extends an $m$-tuple to an n-tuple. :::

This means that given two polynomials expressed in their tuple forms, we can always extend the one with the fewer elements so that they share the same number of elements. From this, we can define polynomial addition.

::: definition Definition 185. Polynomial addition

Let S be a set and let P,Q\in S\left[X\right] so that \deg\left(P\right)=n and \deg\left(Q\right)=m so that without loss of generality we have that m\leq n and

$$\begin{align*} P&=\left(p_0,p_1,p_2,\dots, p_{n-1},p_n\right)\\ Q&=\left(q_0,q_1,q_2,\dots, q_{m-1},q_m\right) \end{align*}$$

Furthermore, suppose that S is endowed with an addition +_S such that +_S is well-defined and closed.

We define the addition of P and Q, where Q is first extended to an $(n+1)$-tuple by E_{m+1}^{n+1}, by

$$\begin{align*} \oplus_S:S^{n+1}\times S^{n+1}&\mathlarger{\mathlarger{\rightarrow}}S^{n+1}\\ \left(P, E_{m+1}^{n+1}\left(Q\right)\right)&\mapsto\oplus_S\left(P,E_{m+1}^{n+1}\left(Q\right)\right)=\left(p_0+_S q_0,\dots,p_m+_S q_m, p_{m+1}+_S 0,\dots, p_n+_S 0\right) \end{align*}$$ :::

::: proposition Proposition 139. Polynomial addition is well-defined and closed

Let S be a set and let P,Q\in S\left[X\right] with \deg\left(P\right)=n and \deg\left(Q\right)=m, where without loss of generality m\leq n, and

$$\begin{align*} P&=\left(p_0,p_1,p_2,\dots, p_{n-1},p_n\right)\\ Q&=\left(q_0,q_1,q_2,\dots, q_{m-1},q_m\right) \end{align*}$$

Furthermore, suppose that S is endowed with an addition +_S such that +_S is well-defined and closed. We have that the polynomial addition of P and Q, denoted P\oplus_S Q is well-defined and closed.

Proof:

This is immediate. By the definition of the polynomial addition, we have that

$$\begin{equation*} P\oplus_S Q=\left(p_0+_S q_0,p_1+_S q_1, p_2+_S q_2,\dots, p_{n-1}+_S 0, p_n+_S 0\right) \end{equation*}$$

As +_S is well-defined and closed, we have that p_i+_S q_i\in S for 0\leq i\leq m. Moreover, for m<i\leq n we have that p_i+_S 0\in S, again because +_S is closed. Hence, all the entries in the tuple given by P\oplus_S Q are elements of S so that P\oplus_S Q\in S^{n+1}. $\qed$ :::

::: {#lem:NT_Polynomial_degree_addition .lemma} Lemma 12. Degree of polynomial from polynomial addition

Let P,Q\in S\left[X\right]. Then

$$\begin{equation*} \deg\left(P\oplus_S Q\right)\leq\max\left(\deg\left(P\right),\deg\left(Q\right)\right) \end{equation*}$$

Proof:

The result is immediate if \deg\left(P\right)=\deg\left(Q\right), so suppose not, and without loss of generality suppose that \deg\left(P\right)>\deg\left(Q\right) where \deg\left(P\right)=n and \deg\left(Q\right)=m. Then as tuples we have that

$$\begin{align*} P&=\left(p_0,p_1,p_2,\dots,p_{n-1},p_n\right)\\ Q&=\left(q_0,q_1,q_2,\dots,q_{m-1},q_m\right) \end{align*}$$

As \deg\left(Q\right)<\deg\left(P\right) we use the tuple extension mapping E_{m+1}^{n+1} on Q. This mapping only appends zeros, so we have that \deg\left(E_{m+1}^{n+1}\left(Q\right)\right)\leq n. Hence

$$\begin{equation*} \deg\left(P\oplus_S Q\right)\leq \deg\left(P\oplus_S E_{m+1}^{n+1}\left(Q\right)\right)\leq n = \max\left(\deg\left(P\right),\deg\left(Q\right)\right) \end{equation*}$$

$\qed$ :::
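
The bound in the lemma can be strict, which a small numerical check makes visible. The sketch below assumes integer coefficients; the helper names `degree` and `poly_add` are hypothetical, and `degree` returns `None` for the zero polynomial because its degree has not been fixed at this point.

```python
def degree(p):
    """Index of the last nonzero coefficient; None for the zero polynomial."""
    for i in range(len(p) - 1, -1, -1):
        if p[i] != 0:
            return i
    return None  # the degree of the zero polynomial is left undefined here


def poly_add(p, q):
    """Entrywise addition after padding the shorter tuple with zeros."""
    size = max(len(p), len(q))
    p = tuple(p) + (0,) * (size - len(p))
    q = tuple(q) + (0,) * (size - len(q))
    return tuple(a + b for a, b in zip(p, q))


P = (1, 0, 2)    # 1 + 2X^2
Q = (0, 3)       # 3X
R = (4, 0, -2)   # 4 - 2X^2

print(degree(poly_add(P, Q)))  # 2: the bound is attained
print(degree(poly_add(P, R)))  # 0: the leading terms cancel, so the bound is strict
```

The second case shows why the lemma states an inequality rather than an equality: when both polynomials have the same degree, their leading coefficients may cancel.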

We are getting an idea for our problem with the polynomial given by

$$\begin{equation*} P=0+0X+0X^2+0X^3+\dots+0X^n \end{equation*}$$

If we want Lemma 12 to be consistent, we should define the degree of P so that it is no larger than the degree of any other polynomial. In particular, for c\in S, the constant polynomial Q=c with Q\in S\left[X\right] has degree 0, so we must have that \deg\left(P\right)< \deg\left(Q\right)=0. This still doesn't fully answer the question: which negative integer should we take for the degree of P? Maybe, once we have a definition for the multiplication of polynomials, it will provide further insight.

Now, given a potential candidate for defining the addition of two polynomials, we can also consider a potential candidate for defining the subtraction of two polynomials. As before, we take inspiration from \mathbb{Z}.

We have shown that the addition of integers is closed and well-defined. Additionally, for every x\in\mathbb{Z} there exists y\in\mathbb{Z} such that x+y=0. In particular, we take y=-x so that the expression becomes x-x=0. A sensible definition for polynomial subtraction should also respect these properties; subtracting two polynomials should give another polynomial. This raises a question: suppose P\in S\left[X\right], what is P-P?

We know that in \mathbb{N}, \mathbb{Z} and \mathbb{Q}, x-x should be 0 for every element x, but what does it mean for 0 to be an element of S and, by extension, of S\left[X\right]? In particular, is it the same 0 as for \mathbb{N}, \mathbb{Z} and \mathbb{Q}?

On the other hand, we know that for any x in \mathbb{N}, \mathbb{Z} and \mathbb{Q} we have x+0=x=0+x. A similar sort of element of S would be useful and clearly plays an important role in defining a similar element for S\left[X\right]. This idea is general enough, assuming we have a well-behaved +_S, that we can apply it to a set S.

::: definition Definition 186. Additive Identity of a set $S$

Let S be a set so that there is an operation +_S:S^2\rightarrow S such that +_S is closed and well-defined. Let e\in S. If we have that \forall s \in S that s+_S e=s, then we say that e is a right additive identity element of S.

Similarly, if \forall s \in S we have that e+_Ss=s, then we say that e is a left additive identity element of S.

If we have that \forall s\in S that e+_S s=s=s+_S e, we simply call e an additive identity element.

If we need to be clear which set the additive identity belongs to, we will write $e_S$. :::

It is an immediate consequence of the definition that the additive identity element is unique.

::: proposition Proposition 140. The additive identity element of a set S is unique

Let S be a set so that there is an operation +_S:S^2\rightarrow S such that +_S is closed and well-defined. Let e,f\in S be additive identity elements of S.

We have that e=f.

Proof:

Let S and +_S:S^2\rightarrow S be as given, and let e,f\in S be additive identity elements of S.

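Since f is a right additive identity, we have that e+_S f=e, and since e is a left additive identity, we have that e+_S f=f. Hence

$$\begin{equation*} e=e+_S f=f \end{equation*}$$

As +_S is well-defined, the value of e+_S f is unique, so we have that e=f as required. $\qed$ :::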

From this, we can immediately identify that 0 in \mathbb{N}, \mathbb{Z} and \mathbb{Q} is unique.
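
For a finite set, the definition of an additive identity can be checked by brute force. The sketch below is only an illustration: the function `identities`, the choice of the integers modulo 6, and the operation `right_proj` are all hypothetical examples, not taken from the text.

```python
def identities(elements, op):
    """Return all two-sided identity elements of (elements, op)."""
    return [e for e in elements
            if all(op(e, s) == s and op(s, e) == s for s in elements)]


Z6 = range(6)


def add_mod6(a, b):
    """Addition modulo 6, a closed and well-defined operation on Z6."""
    return (a + b) % 6


def right_proj(a, b):
    """a * b = b: every element is a left identity, none is a right identity."""
    return b


print(identities(Z6, add_mod6))    # [0]
print(identities(Z6, right_proj))  # []
```

The second operation shows that left identities on their own need not be unique; the uniqueness argument uses that one element is a left identity and the other a right identity.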

We have resolved one part of this problem: in \mathbb{N}, \mathbb{Z} and \mathbb{Q}, we have x-x=0 for every element x, and we have answered what it means for "$0$" to be in S. But what does it mean for -x\in S given x\in S? Noting that x-x=x+_S\left(-x\right), this is precisely what it means for x to be invertible in S, at least with respect to +_S. As with the additive identity of S, this idea is also general enough to apply to a more general set S.

::: definition Definition 187. Additive inverse of an element of a set $S$

Let S be a set so that there is an operation +_S:S^2\rightarrow S such that +_S is closed and well-defined, and suppose that S has an additive identity element e. Let s\in S.

If we have that \exists x\in S such that s+_S x=e, then we say that x is a right additive inverse element of s in S.

Similarly, if \exists x\in S such that x+_S s=e, then we say that x is a left additive inverse element of s in S.

If we have that \exists x\in S such that x+_S s=e=s+_S x, we simply call x an additive inverse element of s in S. :::

As with the additive identity element, we have an immediate consequence that the inverse of an element s\in S is unique.

::: proposition Proposition 141. The additive inverse element of an element of S is unique

Let S be a set so that there is an operation +_S:S^2\rightarrow S such that +_S is closed, well-defined and associative, and suppose that S has an additive identity element e. Let s\in S be an arbitrary element of S that has an additive inverse.

We have that the additive inverse of s is unique.

Proof:

Let S and +_S:S^2\rightarrow S be as given, and let s\in S be an arbitrary element of S and suppose that s has two additive inverses x and y.

By the definitions of the additive identity and the additive inverse, together with the associativity of +_S, we have that

$$\begin{align*} x&=x+_S e\\ &=x+_S\left(s+_S y\right)\\ &=\left(x+_S s\right)+_S y\\ &=e+_S y\\ &=y \end{align*}$$

Hence x=y as required. $\qed$ :::
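
Additive inverses in a finite set can be found by brute force in the same way. The sketch below is hedged: `inverses`, the integers modulo 6, and the operation `star` are hypothetical examples. Addition modulo 6 is associative and every element has exactly one inverse, whereas `star` is a deliberately non-associative operation in which an element has two inverses.

```python
def inverses(s, elements, op, e):
    """All two-sided inverses of s with respect to the identity e."""
    return [x for x in elements if op(s, x) == e and op(x, s) == e]


Z6 = range(6)


def add_mod6(a, b):
    """Addition modulo 6 (associative, with identity 0)."""
    return (a + b) % 6


for s in Z6:
    print(s, inverses(s, Z6, add_mod6, 0))  # exactly one inverse for each s


def star(a, b):
    """Hypothetical operation on {0, 1, 2}: 0 is the identity and the
    product of any two non-identity elements is 0. It is not associative."""
    if a == 0:
        return b
    if b == 0:
        return a
    return 0


print(inverses(1, range(3), star, 0))  # [1, 2]
```

For `star` we have (1 * 1) * 2 = 2 but 1 * (1 * 2) = 1, so associativity fails, and this is exactly what allows the element 1 to have two distinct inverses.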

It would also be useful to undo the addition of polynomials via polynomial subtraction. The only requirement is that every element of S has an additive inverse with respect to +_S. In particular, as we are using a well-defined and closed operation on S, that is +_S, we have gained a definition of subtraction for free! Using -_S to denote subtraction in S, that is a-_S b=a+_S\left(-b\right), we have

$$\begin{align*} \ominus_S:S^{n+1}\times S^{n+1}&\mathlarger{\mathlarger{\rightarrow}}S^{n+1}\\ \left(P, E_{m+1}^{n+1}\left(Q\right)\right)&\mapsto\ominus_S\left(P,E_{m+1}^{n+1}\left(Q\right)\right)=\left(p_0-_S q_0,p_1-_S q_1, p_2-_S q_2,\dots, p_{n-1}-_S 0, p_n-_S 0\right) \end{align*}$$
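
A sketch of this subtraction for integer coefficients follows. The name `poly_sub` is hypothetical, and Python's built-in `-` on integers stands in for -_S.

```python
def poly_sub(p, q):
    """Entrywise subtraction after padding the shorter tuple with zeros."""
    size = max(len(p), len(q))
    p = tuple(p) + (0,) * (size - len(p))
    q = tuple(q) + (0,) * (size - len(q))
    return tuple(a - b for a, b in zip(p, q))


P = (1, 2, 3, 4)  # 1 + 2X + 3X^2 + 4X^3
Q = (5, 6, 7)     # 5 + 6X + 7X^2

print(poly_sub(P, Q))  # (-4, -4, -4, 4)
print(poly_sub(P, P))  # (0, 0, 0, 0)
```

The second call answers the earlier question for this particular S: P-P is the tuple whose entries are all the additive identity of S.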

Given a notion of subtraction, we can also define what it means for two polynomials to be equal. Firstly, recall what it means for

::: definition Definition 188. Equality of Polynomials

Let P,Q\in S\left[X\right] where \deg\left(P\right)=n and \deg\left(Q\right)=m where without loss of generality m\leq n.

We say that P and Q are equal as polynomials, written P=Q, if and only if

$$\begin{equation*} P\ominus_S Q = 0 = \left(\underbrace{0,0,0,\dots, 0,0}_{n+1 \text{ times}}\right) \end{equation*}$$

That is, if the difference between the two is the zero polynomial. :::
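
Definition 188 translates directly into a check that every entry of the difference is zero. The sketch below assumes integer coefficients; `poly_sub` and `poly_equal` are hypothetical names.

```python
def poly_sub(p, q):
    """Entrywise subtraction after padding the shorter tuple with zeros."""
    size = max(len(p), len(q))
    p = tuple(p) + (0,) * (size - len(p))
    q = tuple(q) + (0,) * (size - len(q))
    return tuple(a - b for a, b in zip(p, q))


def poly_equal(p, q):
    """P = Q exactly when every entry of P - Q is zero."""
    return all(c == 0 for c in poly_sub(p, q))


print(poly_equal((1, 2), (1, 2)))        # True
print(poly_equal((1, 2), (1, 3)))        # False
print(poly_equal((1, 2), (1, 2, 0, 0)))  # True: padding with zeros changes nothing
```

The last line illustrates that a tuple and its extension by zeros are equal in the sense of this definition.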

We can therefore define the following relation.

::: definition Definition 189. :::

It is immediate that a polynomial of degree n therefore has a unique representation as an $(n+1)$-tuple.

Defining multiplication between two polynomials

We can use the same idea of the $(n+1)$-tuples to define multiplication of polynomials. Recall that we observed that we can express the indeterminate X, and powers of it, as follows

$$\begin{align*} P\left(X\right)=1=X^0 &\iff a=\left(1\right)\\ P\left(X\right)=X &\iff a=\left(0,1\right)\\ P\left(X\right)=X^2 &\iff a=\left(0,0,1\right)\\ P\left(X\right)=X^3 &\iff a=\left(0,0,0,1\right)\\ &\dots \end{align*}$$

Intuitively, we want X^2=X*X, X^3=X^2*X and so on. That is

$$\begin{align*} X*X=\left(0,1\right)*\left(0,1\right)&=\left(0,0,1\right)=X^2\\ X^2*X=\left(0,0,1\right)*\left(0,1\right)&=\left(0,0,0,1\right)=X^3\\ &\dots \end{align*}$$

What about more complex expressions? Say X*\left(X+X^2\right). The answer to this would depend on whether multiplication is distributive over addition with respect to the indeterminate, and additionally on whether multiplication is commutative! For now, let us assume that this is the case.

It seems therefore that multiplication by X has the effect of "shifting" the entries of the tuple one place to the right.
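
The shifting behaviour can be sketched as prepending a zero to the coefficient tuple. This assumes, as above, that multiplication distributes over addition; the name `times_x` is hypothetical and integer coefficients are used for illustration.

```python
def times_x(p, zero=0):
    """Multiply by X: shift every coefficient one place to the right."""
    return (zero,) + tuple(p)


X = (0, 1)
print(times_x(X))           # (0, 0, 1), that is X^2
print(times_x(times_x(X)))  # (0, 0, 0, 1), that is X^3

# X * (X + X^2): the tuple of X + X^2 is (0, 1, 1)
print(times_x((0, 1, 1)))   # (0, 0, 1, 1), that is X^2 + X^3
```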


  1. We are clearly not talking about sunsets ↩︎

  2. If we are being logical and don't want to get soaked before we get to our destination. ↩︎

  3. By exist we mean in the abstract sense. ↩︎

  4. Without loss of generality means we have made a choice in the proof which allows us to consider a single case as the other cases have the same argument just with the notation changed to reflect the different choice. ↩︎

  5. Unless you are either not a human or somehow reading this in some unknown form of existence ↩︎

  6. We will first need to prove that in order to speak of the inverse of a mapping that we will need the left and right inverses to be equal ↩︎

  7. Hence the similar names. ↩︎

  8. Hopefully not all at once! ↩︎

  9. We can think of this as some sort of singularity ↩︎

  10. Phew! ↩︎

  11. Until someone manages to find a way to get past the elegant mathematics of the encryption scheme! ↩︎

  12. If there is only one theorem you learn when studying Number Theory, it has to be this one! ↩︎

  13. I prefer this way of thinking. ↩︎

  14. RSA stands for Rivest--Shamir--Adleman ↩︎

  15. Named after the 3rd-century mathematician Diophantus of Alexandria ↩︎

  16. When we have fully defined polynomial addition, we will go with the usual convention of just using + to denote addition ↩︎