Foundations
Mathematical logic (To add to as needed)
::: epigraph There are no facts, only interpretations.
Friedrich Nietzsche :::
In this section, we will introduce mathematical logic. This will give us
the tools and basic building blocks to be able to talk about mathematics
formally. What do we mean by 'formally'? Modern mathematics is
built on a bedrock of logic: given some statements which
we take to be true or have already proven true, what can we
logically deduce must also be true, and what must be false? As an
example, we are familiar with the idea of positive whole numbers, also
called positive integers; we are also familiar with the idea of a
positive whole number being prime when the only other positive whole
numbers that divide it are 1 and itself, for example, 2 is prime.
From the facts that the positive whole numbers exist and there is at
least one prime, we can logically deduce there must be infinitely many
primes. We will see the proof of this later.
In this document we won't need the full tools of mathematical logic; developing them would take us too far afield. Instead, we will cover only the key fundamentals we need and define some terms which will be used throughout.
Defining a definition
What is a definition? What does it mean to define something? Definitions are at the heart of mathematics; without them we wouldn't be able to do anything at all. A definition is a declaration that gives a formal name to an object, a class of objects, an idea, and so on. For example, we can define prime numbers; such a definition might look something like this.
::: Def Definition 1. Definition of a prime number
Consider a positive whole number greater than 1. We say that this positive whole
number is a prime number if the only positive whole numbers that
divide it are itself and the number 1.
:::
With this definition, whenever we refer to a prime number, we
know that it has exactly two distinct positive whole
numbers that divide it, itself and 1. As we will see throughout this
document, we can use a definition when making logical arguments.
Definitions are the backbone of the setup to logical arguments:
if we don't know about the objects we are arguing about then we can't
make any logical deductions, or deduce the truth of mathematical
statements. Now that we know what a definition is, we can start using it
to lay the foundation for the rest of the document. For formality, we
will make, somewhat paradoxically, a formal definition of a definition.
::: definition Definition 1. Definition
A definition is a statement which gives a formal name to a concept. :::
What is truth?
What is truth? In particular, what is mathematical truth? Loosely speaking, mathematical truth is based on the question: does the premise entail the conclusion? That is to say, if we assume that a few statements are true, must the conclusion we are trying to reach also be true? This is rather vague at the moment because we haven't defined what we mean by true.
Logical statements and logical connectives
We will need a few definitions.
::: definition Definition 2. Declarative logical statement
We define a declarative logical statement to be a statement that is either true or false. Here we are using the intuitive definition of true and false. :::
We need the definition of a declarative logical statement to define what we mean by true and false; yet, again somewhat paradoxically, we need a notion of true and false to define what we mean when a declarative logical statement is true. We shall ignore the circular nature of these definitions.
::: definition Definition 3. Assignment of truth
Let $P$ be a declarative logical statement. An assignment of truth is
an interpretation of the statement $P$ that sees $P$ as either true or
false. We write this as $\delta\left(P\right)$.
If this assignment of truth $\delta$ sees $P$ as true we write
$\delta\left(P\right)=1$ and we say that $\delta$ interprets $P$ as
true. If this assignment of truth sees $P$ as false we write
$\delta\left(P\right)=0$ and we say that $\delta$ interprets $P$ as
false.
:::
These two definitions will allow us to build the foundations that we
will need. It is first important to note that an assignment of truth is
not an absolute assignment of the truth of a declarative logical
statement. Different assignments of truth, and thus different
interpretations, can give rise to different values of P being true or
false. Now, we have the building blocks to build more complex logical
statements.
A first natural question is: when does the truth of one logical statement imply the truth or falseness of another? Thinking about how this should work gives us a sense that something true should never imply something false, whereas something false can imply anything at all. Using this we define the logical implication operator.
::: definition Definition 4. Logical implication
Let $P$ and $Q$ be logical statements. We define the logical
implication of the statements $P$ and $Q$, written as $P\Rightarrow Q$,
to have the following logical values

| $P$ | $Q$ | $P\Rightarrow Q$ |
|:---:|:---:|:----------------:|
| *1* | *1* | *1* |
| *1* | *0* | *0* |
| *0* | *1* | *1* |
| *0* | *0* | *1* |

: The truth table for the logical implication operator.

We read this as $P$ implies $Q$, or if $P$ then $Q$.
:::
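The truth table for logical implication can be reproduced mechanically. The following is a minimal sketch, assuming we model the truth values 1 and 0 by Python's `True` and `False`; the names `implies` and `rows` are our own and purely illustrative.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Logical implication: false only when p is true and q is false."""
    return (not p) or q


# Enumerate every assignment of truth to (P, Q), in the row order of the table.
rows = [(int(p), int(q), int(implies(p, q)))
        for p, q in product([True, False], repeat=2)]

for row in rows:
    print(*row)
```

Each printed row is one row of the table: the values of $P$, $Q$ and $P\Rightarrow Q$.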
::: example
Example 1. Let $P$ = "The sky is overcast" and let $Q$ = "The sun
is not visible". We have by the truth table of logical implication that
$P\Rightarrow Q$ is true when

- $P$ is true and $Q$ is true
- $P$ is false and $Q$ is true
- Both $P$ and $Q$ are false.

In words, $P\Rightarrow Q$ is true in these circumstances

- It is true that the sky is overcast and the sun is not visible.
- It is false that the sky is overcast and the sun is not visible.
  That is, the sky is not overcast and the sun is not visible.
- It is false that the sky is overcast and the sun is visible.
  That is, the sky is not overcast and the sun is visible.

In particular, case two could hold when it is nighttime: if it is nighttime the sun is clearly not visible.[^1]
Let's look at these statements the other way, $Q\Rightarrow P$. We have
that it is true when

- $Q$ is true and $P$ is true
- $Q$ is false and $P$ is true
- Both $Q$ and $P$ are false.

In words, $Q\Rightarrow P$ is true in these
circumstances

- The sun is not visible and the sky is overcast.
- The sun is visible and the sky is overcast.
- The sun is visible and the sky is not overcast.
:::
There is one definition that arises from logical implication that is occasionally useful in proving other statements.
::: definition Definition 5. Vacuous truth
Let $P$ and $Q$ be statements and consider the implication $P\Rightarrow Q$.
Suppose that $P$ is false. Then by the definition of logical implication
we have that $P\Rightarrow Q$ is true. We say that $P\Rightarrow Q$ is
vacuously true.
:::
::: example Example 2. The statement "All my children are goats" is vacuously true for someone who doesn't have any children. :::
It is often the case we have theorems in mathematics which are of the
form P implies Q and Q implies P, that is two separate logical
sentences can imply each other. This is the logical bi-conditional.
::: definition Definition 6. Logical Bi-conditional
Let $P$ and $Q$ be logical statements. We define the logical
Bi-conditional of the statements $P$ and $Q$, written
$P\Leftrightarrow Q$, to have the following logical values

| $P$ | $Q$ | $P\Leftrightarrow Q$ |
|:---:|:---:|:--------------------:|
| *1* | *1* | *1* |
| *1* | *0* | *0* |
| *0* | *1* | *0* |
| *0* | *0* | *1* |

: The truth table for the logical Bi-conditional operator.

We read this as $P$ if and only if $Q$, meaning $P$ implies $Q$ and $Q$
implies $P$.
:::
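The reading "$P$ implies $Q$ and $Q$ implies $P$" can be checked against the truth table by enumerating all four assignments of truth. This is a small sketch under the assumption that truth values are modelled by Python booleans; `implies`, `iff` and `agree` are illustrative names only.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Logical implication: false only when p is true and q is false."""
    return (not p) or q


def iff(p: bool, q: bool) -> bool:
    """Bi-conditional: true exactly when p and q have the same truth value."""
    return p == q


# Compare the bi-conditional with the conjunction of both implications
# on every assignment of truth to (P, Q).
agree = all(iff(p, q) == (implies(p, q) and implies(q, p))
            for p, q in product([True, False], repeat=2))
print(agree)
```

If `agree` is true, the two descriptions of the bi-conditional coincide on every row of the table.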
::: example
Example 3. Let $P$ = "A number is even" and let $Q$ = "It is
divisible by 2". By the truth table of the logical bi-conditional,
$P\Leftrightarrow Q$ is true when

- Both $P$ and $Q$ are true.
- Both $P$ and $Q$ are false.

That is, in words, $P\Leftrightarrow Q$ is true when

- A number is even and it is divisible by 2.
- A number is not even and it is not divisible by 2.
:::
Now that we have the logical implication and logical bi-conditional, we can start defining more logical connectives. These are the logical conjunction, logical disjunction and logical negation.
::: definition Definition 7. Logical conjunction
Suppose we have two logical statements $P$ and $Q$. We define logical
conjunction, written as $P\wedge Q$, to be true if and only if $P$ and
$Q$ are both true, that is to say the logical conjunction connective has
the following truth table

| $P$ | $Q$ | $P\wedge Q$ |
|:---:|:---:|:-----------:|
| *1* | *1* | *1* |
| *1* | *0* | *0* |
| *0* | *1* | *0* |
| *0* | *0* | *0* |

: The truth table for the logical conjunction operator.

Informally, we call this logical AND rather than logical conjunction. :::
::: example
Example 4. Let $P =$"$x > 2$" and $Q =$"$x < 10$" and suppose that
P and Q are true, then P\wedge Q is true and represents the
expression 2<x<10.
:::
::: example
Example 5. Let $P\wedge Q$ be the expression "Adam likes apples and
oranges". We can break down $P\wedge Q$ into the two separate logical
sentences, $P$ = "Adam likes apples" and $Q$ = "Adam likes oranges".
:::
::: definition Definition 8. Logical disjunction
Suppose we have two logical statements $P$ and $Q$. We define logical
disjunction, written as $P\vee Q$, to be true when at least one of $P$ and
$Q$ is true, that is, when exactly one of them is true or both are true. This is to say the logical
disjunction connective has the following truth table

| $P$ | $Q$ | $P\vee Q$ |
|:---:|:---:|:---------:|
| *1* | *1* | *1* |
| *1* | *0* | *1* |
| *0* | *1* | *1* |
| *0* | *0* | *0* |

: The truth table for the logical disjunction operator.

Informally, we call this logical OR rather than logical disjunction. :::
Logical propositions
Now that we have an idea about logical connectives we can consider more complex logical statements, in particular we now start to consider statements whose truth values can depend on a variable.
::: definition Definition 9. Variable
A variable is something that has a value that can change. :::
How can the truth value of a statement change depending on a variable? To answer this we need to introduce the idea of all the possible values that this variable can take.
::: definition Definition 10. Domain of Discourse
We define the Domain of Discourse to be the collection of all values
that a variable can take. We will denote the domain of Discourse by
\mathbb{D}.
:::
::: definition Definition 11. Logical proposition
Let $\mathbb{D}$ be a Domain of Discourse. We define a logical
proposition, denoted by $P\left(n_1,n_2,\dots ,n_k\right)$, to be a
declarative logical statement that depends on the variables $n_1,n_2,\dots ,n_k$,
each of which takes its value from the domain of discourse $\mathbb{D}$.
:::
::: example
Example 6. Let $P\left(n\right)$ be the proposition
$$\begin{equation*}
P\left(n\right) = n\text{ is an even number}
\end{equation*}$$ where the domain of discourse is $\mathbb{D}=\mathbb{N}$ and
$\mathbb{N}$ denotes the positive whole numbers, i.e. $1,2,3,\dots$.
We have for even $n$, say $2,4,6,8,\dots$, that $P\left(n\right)$ is
true, and for odd $n$, say $1,3,5,7,\dots$, that $P\left(n\right)$ is
false.
:::
::: example
Example 7. Let $P\left(n,m\right)$ be the proposition
$$\begin{equation*}
P\left(n,m\right) = n>m
\end{equation*}$$ that is, $n$ is greater than $m$, where the domain of
discourse $\mathbb{D}=\mathbb{N}$ is again the positive whole numbers
$1,2,3,\dots$.
Suppose that $n=2$ and $m=3$, then $P\left(n,m\right)$ is false; if
$n=45$ and $m=7$ then $P\left(n,m\right)$ is true.
:::
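A logical proposition behaves much like a function from the domain of discourse to truth values. The sketch below models the two propositions from the examples above as Python functions; the names `p_even`, `p_greater` and `domain` are our own, and the finite range is only a stand-in for $\mathbb{N}$, which cannot be enumerated in full.

```python
def p_even(n: int) -> bool:
    """P(n): n is an even number."""
    return n % 2 == 0


def p_greater(n: int, m: int) -> bool:
    """P(n, m): n is greater than m."""
    return n > m


# Assumption: a finite slice of the positive whole numbers stands in for N.
domain = range(1, 11)

# The values in the domain for which P(n) is true.
evens = [n for n in domain if p_even(n)]
print(evens)

# The two evaluations from the second example.
print(p_greater(2, 3), p_greater(45, 7))
```

The same proposition is true for some values of its variables and false for others, which is exactly what separates it from a plain declarative statement.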
We see that logical propositions allow us to construct more complex logical statements and are the building blocks for the more complex Mathematical statements that we will be using.
Proof
Logic and truth are two of the cornerstones of Mathematics; the third is proof. Without proof we are unable to verify the truth of any mathematical statement. So what exactly is a proof?
::: definition Definition 12. Mathematical proof
Suppose we have some logical statements which are known or assumed to be true, and suppose we wish to see if some conclusion is true given these assumptions. A Mathematical proof starts from these assumptions and at each step logically deduces additional true statements until the conclusion has been reached. In other words, a Mathematical proof can be broken down into a simple question: do the assumptions entail this conclusion? :::
This isn't a truly rigorous definition of a mathematical proof, and one
can define this rigorously in a course on mathematical logic. To do so
here would be too much of a diversion, instead we will just keep in our
minds that a proof is a series of logical deductions from assumptions to
a conclusion. When we have reached the conclusion we use a special
symbol. We use the symbol \qed at the end of a proof to show that we
are done.
There are many different types of proof that we will invoke throughout the rest of this document.
Direct Proof
The first type of proof we define is direct proof. We define a direct proof as follows.
::: definition Definition 13. Direct Proof
In a direct proof, the conclusion is logically established by using axioms, definitions and previously proven theorems. :::
We will give an example of direct proof.
::: example Example 8. In this example we will break down each step of a direct proof.
Here we give the definitions and assumptions (i.e. previously proven theorems) which we will use in the proof:

- We say a number is an integer if it is a whole number, such as $-4,-3,54,8,0,2,7$ and so on.
- We will also assume that adding and multiplying integers works as we were taught in school, for example $5+7=12$, $2*14=28$ etc.
- We say that an integer $x$ is an even integer if it can be written as $x=2*m$ where $m$ is some integer.

We now move to the proof.
Suppose we have two such even integers, say $x$ and $y$. We will use
direct proof to show that $x+y$ must also be even.
Proof:
Suppose we have two even integers $x$ and $y$. By the definition of an
even integer we have that $x=2*n$ and $y=2*m$ for some integers $n$ and
$m$. Now consider $x+y$, we have
$$\begin{equation*}
x+y=2n+2m=2*\left(n+m\right)
\end{equation*}$$ Now, $n+m$ is the sum of two integers and so is an
integer, say $k=n+m$. Hence we have that $x+y=2*k$, and so by the definition of
an even integer $x+y$ is even. This concludes the proof. $\qed$
:::
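A finite check is not a proof, but it can be a useful sanity check of the statement just proven. The sketch below tests the claim on a small range of even integers and also confirms the witness $k=n+m$ used in the proof; the range and the names `evens`, `sums_are_even` and `witness_ok` are arbitrary choices of ours.

```python
# A small sample of even integers x = 2*m, including negatives and zero.
evens = [2 * m for m in range(-10, 11)]

# The claim: the sum of any two even integers is even.
sums_are_even = all((x + y) % 2 == 0 for x in evens for y in evens)

# The witness from the proof: with x = 2n and y = 2m we have x + y = 2(n + m).
witness_ok = all(x + y == 2 * (x // 2 + y // 2) for x in evens for y in evens)

print(sums_are_even, witness_ok)
```

The proof covers every pair of even integers at once, whereas the code can only ever cover the pairs it enumerates.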
Proof by contradiction
The second type of proof we define is proof by contradiction. This is a very powerful tool.
::: definition Definition 14. Proof by contradiction
Suppose we have a logical statement $P$ whose truth we wish to
establish. Suppose that, assuming $\neg P$ is true, we
can derive another logical statement $Q$ which is known to be false, or
we can derive both $Q$ and $\neg Q$. Then we must have that $\neg P$ is
false and $P$ is true.
:::
In other words, proof by contradiction states that if, when making an assumption, we can derive a false statement, then the assumption itself must have been invalid. We can justify proof by contradiction using the following truth table.
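
| $P$ | $\neg P$ | $\neg\neg P$ | $\neg\neg P\Rightarrow P$ |
|:---:|:--------:|:------------:|:-------------------------:|
| 1 | 0 | 1 | 1 |
| 0 | 1 | 0 | 1 |

: The truth table for proof by contradiction.
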
::: example Example 9. As with the example using direct proof, we will break down each step of a proof by contradiction.
Here we give the definitions and assumptions (i.e. previously proven theorems) which we will use in the proof:

- We say a number is a rational number if it is the ratio of two integers $a$ and $b$ where $b\neq 0$. Examples of rational numbers are $\displaystyle \frac{1}{2},\frac{2}{3},-\frac{15}{8}$ and so on. We say a number is irrational if it is not rational.
- We say that a rational number $\displaystyle \frac{a}{b}$ is in simplest form if the only positive whole number that divides both $a$ and $b$ is $1$.
- Any rational number has a simplest form.
- We will also assume that adding and multiplying rational numbers works as we were taught in school, that is, for two rational numbers $\displaystyle \frac{a}{b}$ and $\displaystyle \frac{c}{d}$ we have
  $$\begin{equation*} \frac{a}{b}+\frac{c}{d}= \frac{ad+bc}{bd},\ \frac{a}{b}\frac{c}{d}=\frac{ac}{bd} \end{equation*}$$
- We say $\sqrt{2}$ is the positive number which satisfies $\sqrt{2}\sqrt{2}=2$.
- We assume the definition of an even integer from the previous example.
- If $a*a=a^2$ is an even integer, then so is $a$.

We now move to the proof.
We claim that $\sqrt{2}$ is an irrational number. This is to say that
$\sqrt{2}$ is not the ratio of two whole numbers $a$ and $b$ where
$\displaystyle \frac{a}{b}$ is in simplest form.
Proof:
Aiming for a proof by contradiction, suppose that $\sqrt{2}$ is a
rational number, written in simplest form. This is to say we have that
$\displaystyle \sqrt{2}=\frac{a}{b}$ for some integers $a,b$ with $b\neq 0$, where
$\displaystyle \frac{a}{b}$ is in simplest form. We have by
assumption that $\sqrt{2}$ is the number such that
$\sqrt{2}*\sqrt{2}=2$. Hence we have that
$$\begin{equation*}
\sqrt{2}\sqrt{2}=\frac{a}{b}\frac{a}{b}=\frac{a^2}{b^2}=2
\end{equation*}$$ where $a^2=a*a$ and $b^2 = b*b$. We can multiply the
above expression by $b^2$ on both sides to get
$$\begin{equation*} a^2=2b^2 \end{equation*}$$
By definition of an even integer we have that $a^2$ is even and so $a$
must be even, that is $a=2*k$ for some integer $k$. Hence we have that
$$\begin{equation*}
a^2=\left(2k\right)^2=4k^2=2b^2
\end{equation*}$$ That is, $4*k^2=2*b^2$, which implies that $b^2=2*k^2$;
that is, $b^2$ is even and so $b$ must be even. This is a contradiction,
as we have that $a$ is even and $b$ is even and so they share a divisor
of $2$, contradicting the fact we assumed that
$\displaystyle\sqrt{2}=\frac{a}{b}$ was in simplest form.
Therefore, $\sqrt{2}$ must be irrational. $\qed$
:::
Proof by contra-position
Another type of proof that we define is proof by contra-position, sometimes called proof by contra-positive.
::: definition Definition 15. Proof by contra-position
Suppose we have a logical statement P and we wish to show that P
implies some other statement Q. We are able to show that
P\Rightarrow Q if we can show that \neg Q\Rightarrow \neg P.
:::
Proof by contra-position states that proving a statement of the form
P\Rightarrow Q is the same as showing that \neg Q\Rightarrow\neg P.
It is easier to see this from the truth table.
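
| $P$ | $Q$ | $\neg P$ | $\neg Q$ | $P\Rightarrow Q$ | $\neg Q\Rightarrow\neg P$ |
|:---:|:---:|:--------:|:--------:|:----------------:|:-------------------------:|
| 1 | 0 | 0 | 1 | 0 | 0 |
| 1 | 1 | 0 | 0 | 1 | 1 |
| 0 | 0 | 1 | 1 | 1 | 1 |
| 0 | 1 | 1 | 0 | 1 | 1 |

: The truth table for proof by contra-positive.
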
Maybe, to make it even clearer, we can use a worded example. Let $P$
denote the statement "It is raining" and $Q$ denote the statement "I
wear my coat". We have that $P\Rightarrow Q$.[^2] The contra-positive
would be $\neg Q\Rightarrow\neg P$. In words this would be "If I don't
wear my coat then it is not raining".
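The equivalence of $P\Rightarrow Q$ and $\neg Q\Rightarrow\neg P$ can also be confirmed by enumerating every assignment of truth. The following is a minimal sketch with Python booleans standing in for truth values; `implies`, `table` and `equivalent` are illustrative names of our own.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Logical implication: false only when p is true and q is false."""
    return (not p) or q


# Columns: P, Q, P => Q, (not Q) => (not P).
table = [(int(p), int(q), int(implies(p, q)), int(implies(not q, not p)))
         for p, q in product([True, False], repeat=2)]

# The last two columns agree on every row.
equivalent = all(row[2] == row[3] for row in table)

for row in table:
    print(*row)
print(equivalent)
```

Because the two columns agree on every row, proving either implication establishes the other.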
::: example
Example 10. A more mathematical example can be seen now. We will
let $x$ be an integer and we will show that if $x^2$ is even then $x$ is
even. We will use proof by contra-position, so we will show that if $x$
is not even then $x^2$ is not even.

- So, $x$ not being even means $x$ is odd. This means that $x=2n+1$ for some integer $n$.
- Now, we have $x^2=\left(2n+1\right)^2=4n^2+4n+1=2\left(2n^2+2n\right)+1$.
- Hence, we have shown that $x^2$ is of the form $2k+1$ for the integer $k=2n^2+2n$.
- Therefore $x^2$ is odd.

This concludes the proof by contra-positive.
:::
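As a sanity check of the steps above (not a replacement for the proof), we can test both the contra-positive and the original implication on a finite range of integers, and confirm the algebraic identity used in the second step. The range and the variable names are arbitrary choices of ours.

```python
xs = range(-50, 51)

# Contra-positive: if x is odd then x^2 is odd.
contrapositive_holds = all((x * x) % 2 == 1 for x in xs if x % 2 == 1)

# Original statement: if x^2 is even then x is even.
original_holds = all(x % 2 == 0 for x in xs if (x * x) % 2 == 0)

# The identity from the proof: for x = 2n + 1, x^2 = 2(2n^2 + 2n) + 1.
identity_holds = all(
    (2 * n + 1) ** 2 == 2 * (2 * n * n + 2 * n) + 1 for n in range(-25, 25)
)

print(contrapositive_holds, original_holds, identity_holds)
```

Both implications hold on the sample, as the equivalence between a statement and its contra-positive says they must.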
Sets and mappings
::: epigraph No one shall expel us from the paradise that Cantor has created for us.
David Hilbert :::
Sets
Introduction and basic definitions
We start with the most elementary definition, a Set or, less formally, a collection of 'objects'. This notion of an object is not very rigorous: what do we mean by an object? Do these objects really exist?[^3] In what way can one collection of objects differ from another?
These questions are at the foundation of Mathematics, and to justify the notions and hence tools we need would require a significant detour into the realm of Mathematical logic. The interested reader would find so-called Zermelo--Fraenkel set theory to be of interest in formalising the notion of a set; we will give a brief overview at the end of the section. To avoid the trip into Mathematical logic, we will instead define sets with a more 'hands on' approach.
::: definition Definition 16. Naive definition of a Set
A set is a collection of objects. We list the elements surrounded by
curly brackets \{ \}.
:::
This definition will make sense after we see some examples
::: example
Example 11. Let S=\left\{1,2,3,Dogs,Cats,Apples,Pears\right\}.
Then S is a set.
:::
::: example
Example 12. Let
S=\left\{"Foo", \left\{1,2,3,Dogs,Cats,Apples,Pears\right\}, Apples, Pears\right\}.
Then S is a set. We note that the set from the previous example is now
in this set.
:::
It would be useful to talk about a particular object in some set $S$.
For example we can say that $1$ is in the set from Example 11 above.
We formalise this idea.
::: definition Definition 17. Element of a set
An object in a set is called an element of the set. :::
::: definition Definition 18. Set membership
Let S be a set and let x be an element of the set S. We say that
x is a member of the set S and write x\in S. If y is some object
which is not in the set S we write that y\not\in S.
:::
::: example
Example 13. Let S=\left\{1,2,3,Dogs,Cats,Apples,Pears\right\}. We
have that 1\in S and Dogs\in S but we have that Blue\not\in S.
:::
The above example shows a few interesting points. Dogs in English is
used when we wish to talk about multiple dogs at once, so it would be
absurd to deny that $Dogs$ could itself be a set, for example
$Dogs=\left\{Lassie, Scooby-Doo, Snoopy, Blue\right\}$. So we have that
$$\begin{equation*} S=\left\{1,2,3,\left\{Lassie, Scooby-Doo, Snoopy, Blue\right\},Cats,Apples,Pears\right\} \end{equation*}$$
Does this now mean that $Blue\in S$? The answer is no: $Blue$ is not
any one of the objects in $S$; however, there is an object in $S$ that
does contain $Blue$, namely $Dogs$. This shows that $\in$ only looks
one layer deep into $\left\{\dots\right\}$.
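The "one layer deep" behaviour of $\in$ can be illustrated with Python's built-in sets. This is only a sketch: the objects are modelled as strings, and the inner set is a `frozenset` because Python requires set elements to be hashable. The variable names are our own.

```python
# Dogs modelled as a set that is itself an element of S.
dogs = frozenset({"Lassie", "Scooby-Doo", "Snoopy", "Blue"})
S = {1, 2, 3, dogs, "Cats", "Apples", "Pears"}

blue_in_S = "Blue" in S        # membership does not look inside dogs
dogs_in_S = dogs in S          # dogs itself is an element of S
blue_in_dogs = "Blue" in dogs  # Blue is an element of dogs

print(blue_in_S, dogs_in_S, blue_in_dogs)
```

Membership is tested against the elements of `S` only, never against the elements of those elements.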
One might wonder if it can ever be the case that a set contains itself,
that is, a set like $S=\left\{S\right\}$. Again the answer is no. To see
why, we need to define a new way of making sets, where the elements of
the set are conditioned on some statement being true.
::: example Example 14. Suppose we want the set of all even integers. Then we have
$$\begin{equation*}
S=\left\{x : x\text{ is an even integer}\right\}
\end{equation*}$$ The $:$ symbol stands for "such that", so $S$ reads "the
elements $x$ such that $x$ is an even integer".
:::
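Set-builder notation has a close analogue in Python's set comprehensions. The sketch below builds the even integers within a finite range; the range `domain` is an assumption of ours, since the full set of even integers is infinite and cannot be listed.

```python
# Assumption: a finite window of the integers stands in for all integers.
domain = range(-6, 7)

# Reads: "the elements x of the domain such that x is an even integer".
S = {x for x in domain if x % 2 == 0}

print(sorted(S))
```

The condition after `if` plays the role of the statement following the $:$ symbol.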
Returning to the question of whether a set can contain itself, consider the set
$$\begin{equation*}
S=\left\{R: R\text{ is a set and }R\not\in R\right\}
\end{equation*}$$ That is, $S$ is the set of all sets $R$ such that $R$
does not contain itself. Now suppose that $S\in S$. By
definition of $S$ we must conclude that $S\not\in S$. Conversely, if
$S\not\in S$ then by definition of $S$ we have that $S\in S$. Either way
we reach a contradiction. This shows the flavour of the issues that arise from allowing a set to contain
itself, so we shall revise our definition to not allow a set to
contain itself.
::: definition Definition 19. Set
A set is a collection of objects such that none of the objects in the collection is the set itself. :::
Subsets and universal quantifiers
Given a set, we can talk about a smaller collection of the elements of the set, which we call a subset.
::: definition Definition 20. Subset
Let $S$ be a set. If $K$ is also a set such that for every $x\in K$ we
also have that $x\in S$ then we say that $K$ is a subset of $S$, and
write $K\subseteq S$. We say that $K$ is a proper subset of $S$ if we
have that $K\subseteq S$ and $K\neq S$; we denote a proper subset by
$K\subset S$. Hence $\subseteq$ allows for the possibility that $K=S$, whereas $\subset$ does not. We
call $\subseteq$ and $\subset$ the set inclusion operators.
:::
Conversely, we can also define the notion of a super-set. This isn't too useful for what we are doing but it does sometimes appear in other texts, so it is worth mentioning now.
::: definition Definition 21. Super-set
Let S\subseteq T. We say that T is a super-set of the set S and
we write this as T\supseteq S.
:::
::: example
Example 15. Let $S=\left\{1,2,3,4,5,6\right\}$, then some subsets of
$S$ are $\left\{1,2\right\}$, $\left\{4\right\}$ and
$\left\{1,2,6\right\}$.
:::
With the idea of a subset we have our first proposition
::: {#prop:TwosetsEqualIfContainedInEachOther .proposition} Proposition 1. Two sets are equal if and only if they are subsets of each other
Let X and Y be sets. We have that X=Y if and only if
X\subseteq Y and Y\subseteq X.
Proof:
This is an if and only if proposition so we have to prove that given
X=Y then X\subseteq Y and Y\subseteq X and then we need to show
that given X\subseteq Y and Y\subseteq X, that X=Y.
\left(\Rightarrow\right): Suppose that X=Y then we have that X
and Y have the same elements, in particular we have that every
x\in X is also in Y so that X\subseteq Y. Likewise
Y\subseteq X.
\left(\Leftarrow\right): Suppose that X\subseteq Y and
Y\subseteq X. X\subseteq Y means that for every x\in X we have
that x\in Y. Likewise Y\subseteq X means that for every x\in Y we
have that x\in X. Hence we must have that the elements of X and Y
are the same, that is X=Y. $\qed$
:::
There is also another property of subsets that is useful.
::: {#prop:SetInclusionTransitivityProp .proposition} Proposition 2. Set inclusion transitivity property
Let $R,S$ and $T$ be sets such that $R\subseteq S$ and $S\subseteq T$.
We have that $R\subseteq T$.
Proof:
Let R,S and T be sets such that R\subseteq S and S\subseteq T.
Suppose that x\in R. By assumption we have that R\subseteq S and so
x\in S. Likewise by assumption we have that S\subseteq T and so
x\in T. Hence R\subseteq T.
The result follows. $\qed$ :::
A similar result holds if we replace subsets with proper subsets.
::: {#prop:ProperSetInclusionTransitivityProp .proposition} Proposition 3. Proper set inclusion transitivity property
Let $R,S$ and $T$ be sets such that $R\subset S$ and $S\subset T$. We
have that $R\subset T$.
Proof:
Let $R,S$ and $T$ be sets such that $R\subset S$ and $S\subset T$.
Suppose that $x\in R$. By assumption we have that $R\subset S$ and so
$x\in S$. Likewise by assumption we have that $S\subset T$ and so
$x\in T$. Hence $R\subseteq T$.
We must show that it is not possible for $R=T$. Aiming for a contradiction,
suppose that $R=T$. Then $T=R\subseteq S$, and by assumption $S\subseteq T$, so by proposition
[1](#prop:TwosetsEqualIfContainedInEachOther){reference-type="ref"
reference="prop:TwosetsEqualIfContainedInEachOther"} we have that $S=T$.
This contradicts $S\neq T$, which holds as $S\subset T$. Hence $R\neq T$ and so
$R\subset T$.
The result follows. $\qed$ :::
We can also make the following observation.
::: {#prop:ProperSetSubSetInclusionNotTransitivity .proposition} Proposition 4. A subset of a proper subset is a proper subset
Let $R,S$ and $T$ be sets such that $R\subseteq S$ and $S\subset T$. We
have that $R\subset T$.
Proof:
Let $R,S$ and $T$ be sets such that $R\subseteq S$ and $S\subset T$.
If $R\neq S$ then $R\subset S$ and so proposition
[3](#prop:ProperSetInclusionTransitivityProp){reference-type="ref"
reference="prop:ProperSetInclusionTransitivityProp"} applies. So suppose
that $R=S$. Then, as $S\subset T$, we have that $R=S\subseteq T$, and
$S\neq T\implies R\neq T$ as
$R=S$. Hence $R\subset T$.
The result follows. $\qed$ :::
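The propositions on set inclusion proven so far can be sanity-checked on a small example by enumerating every subset of a three-element set. This is a finite check under our own choice of universe, not a proof; Python's `<=` and `<` on sets correspond to $\subseteq$ and $\subset$, and the helper `powerset` is an illustrative name of ours.

```python
from itertools import combinations


def powerset(universe):
    """All subsets of a finite set, as frozensets."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


subsets = powerset({1, 2, 3})
triples = [(R, S, T) for R in subsets for S in subsets for T in subsets]

# X = Y if and only if X is a subset of Y and Y is a subset of X.
equal_iff_mutual = all((X == Y) == (X <= Y and Y <= X)
                       for X in subsets for Y in subsets)

# Transitivity of subset inclusion and of proper subset inclusion.
subset_transitive = all(R <= T for R, S, T in triples if R <= S and S <= T)
proper_transitive = all(R < T for R, S, T in triples if R < S and S < T)

# A subset of a proper subset is a proper subset.
mixed_case = all(R < T for R, S, T in triples if R <= S and S < T)

print(equal_iff_mutual, subset_transitive, proper_transitive, mixed_case)
```

Every check passing on this universe is consistent with the propositions, which of course hold for all sets.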
We will define what we truly mean by transitivity in the next chapter, right now it is more important to know that sets satisfy this property than why this property is named the way it is. As set inclusion is transitive, so is set equality.
::: proposition Proposition 5. Set equality transitivity property
Let R,S and T be sets such that R=S and S=T. We have that
R=T.
Proof:
Let $R,S$ and $T$ be sets such that $R=S$ and $S=T$.
By equality of sets we have that $R\subseteq S$ and
$S\subseteq R$, likewise we also have that $S\subseteq T$ and
$T\subseteq S$. Now as $R\subseteq S$ and $S\subseteq T$ then we must
have by transitivity of set inclusion that $R\subseteq T$. Moreover as
$T\subseteq S$ and $S\subseteq R$ we again have by transitivity that
$T\subseteq R$. The result follows by equality of sets. $\qed$
:::
::: definition Definition 22. The empty-set
The empty-set is the set that contains no elements. It is denoted by
\emptyset.
:::
To make our lives a little easier we will introduce some notation
::: definition Definition 23. Universal and existential quantifiers
Let $S$ be any set. The universal quantifier $\forall$, meaning for
all, allows us to talk about every element of $S$. We can condition the
universal quantifier with a such that, written $:$, in order to pick all the
elements that satisfy a given condition.
The existential quantifier $\exists$, meaning there exists, tells us of the existence of an
element in $S$. Just saying an element in a set exists is not
particularly useful and so we normally combine $\exists$ with a
condition.
:::
Some examples will help us here.
::: example
Example 16. Consider the set
$\left\{1,2,3,4,5,\dots\right\}=\mathbb{N}$; we call $\mathbb{N}$ the
natural numbers. Moreover, consider $S=\left\{1,2,3,4,5,6\right\}$.

- We have that $\forall x\in S$, $x\in\mathbb{N}$; that is, every element of $S$ is also an element of $\mathbb{N}$.
- We can apply quantifiers multiple times in a statement, for example
  $$\begin{equation*} \forall a\in\mathbb{N},\forall b\in\mathbb{N},\exists c\in\mathbb{N}:a+b=c \end{equation*}$$
- Let $a,b\in\mathbb{N}$, that is let $a\in\mathbb{N}$ and let $b\in\mathbb{N}$. Then we can construct the following set. We say that $a$ is divisible by $b$ if $\exists c\in\mathbb{N}$ such that $a=bc$, and we write this as $b\mid a$. The set of all such $c$ can be expressed by
  $$\begin{equation*} C=\left\{c\in\mathbb{N}:a=bc\right\} \end{equation*}$$
:::
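Over a finite set, the quantifiers $\forall$ and $\exists$ correspond to Python's built-in `all` and `any`. The sketch below mirrors the three items of the example; the finite range `N` is an assumption of ours standing in for $\mathbb{N}$, and the values of `a` and `b` are arbitrary.

```python
# Assumption: a finite slice of the natural numbers stands in for N.
N = range(1, 101)
S = {1, 2, 3, 4, 5, 6}

# "For all x in S, x is in N."
every_element_in_N = all(x in N for x in S)

# "For all a, for all b, there exists c in N with a + b = c",
# restricted to a, b <= 50 so that a + b stays inside the finite slice.
sums_exist = all(any(a + b == c for c in N)
                 for a in range(1, 51) for b in range(1, 51))

# Divisibility: b divides a if there exists c in N with a = b * c.
a, b = 12, 3
C = {c for c in N if a == b * c}
b_divides_a = any(a == b * c for c in N)

print(every_element_in_N, sums_exist, C, b_divides_a)
```

The restriction to a finite slice is the only reason the bounds on `a` and `b` are needed; over all of $\mathbb{N}$ the statement holds without them.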
The empty set has the interesting property that it is a subset of any set.
::: {#prop:EmptySetincontainedineveryset .proposition} Proposition 6. The empty-set is contained in every set
Let $S$ be any set. Then $\emptyset\subseteq S$.
Proof:
We have that $\emptyset\subseteq S$ means that every element of
$\emptyset$ is also contained in $S$. We can phrase this as
the following statement
$$\begin{equation*}
\forall x: x\in\emptyset\Rightarrow x\in S
\end{equation*}$$ The definition of the empty set
means that there are no elements in $\emptyset$, so $x\in\emptyset$ is not true for any $x$ and the statement above
is vacuously true. It hence follows the empty-set is contained in any set. $\qed$ :::
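Python's set operations behave in the same vacuous way, which gives a quick illustration of the proposition. In this sketch `empty` and `S` are illustrative names and `S` is an arbitrary example set.

```python
empty = set()
S = {1, 2, 3}

# Subset test: the empty set is a subset of S (and of itself).
empty_subset_of_S = empty <= S
empty_subset_of_empty = empty <= empty

# "For all x in the empty set, x is in S" has nothing to check,
# so all(...) over no elements is True: a vacuous truth.
vacuous = all(x in S for x in empty)

print(empty_subset_of_S, empty_subset_of_empty, vacuous)
```

The call to `all` never evaluates its condition, mirroring the fact that $x\in\emptyset$ is never true.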
::: {#prop:EmptySetUnique .proposition} Proposition 7. The empty-set is unique
The empty-set is unique, that is there is only one distinct set which is the empty-set.
Proof:
Suppose that $\emptyset$ and $\emptyset'$ are two empty sets. By
proposition
[6](#prop:EmptySetincontainedineveryset){reference-type="ref"
reference="prop:EmptySetincontainedineveryset"} we have that
$\emptyset\subseteq\emptyset'$, likewise $\emptyset'\subseteq\emptyset$.
So by proposition
[1](#prop:TwosetsEqualIfContainedInEachOther){reference-type="ref"
reference="prop:TwosetsEqualIfContainedInEachOther"} we have that
$\emptyset=\emptyset'$. Hence the empty-set is unique. $\qed$
:::
It would be nice to have more ways to construct sets. Two key ways to do this are with the union operation and intersection operation.
::: definition Definition 24. Union and intersection of sets
Let $S$ and $T$ be any two sets. The union of $S$ and $T$,
denoted by $S\cup T$, is the set
$$\begin{equation*} S\cup T=\left\{x: x\in S\text{ or } x\in T\right\} \end{equation*}$$
The intersection of $S$ and $T$, denoted by $S\cap T$, is the set
$$\begin{equation*} S\cap T = \left\{x : x\in S\text{ and } x\in T\right\} \end{equation*}$$
If we have a finite number of sets, given by $A_1, A_2, \dots,
A_n$, then the union of all of these sets is denoted by
$$\begin{align*} \bigcup_{i=1}^n A_i \end{align*}$$
and the intersection is denoted by
$$\begin{align*} \bigcap_{i=1}^n A_i \end{align*}$$ Sometimes it is useful to define a union or intersection of multiple sets given some condition or multiple conditions, usually when the conditions involve other previously defined sets. This is denoted as
$$\begin{equation*} \bigcup_{\substack{\text{Condition 1 for } A \\ \text{Condition 2 for } A\\ \dots}} A \end{equation*}$$ for the union and for the intersection
$$\begin{equation*} \bigcap_{\substack{\text{Condition 1 for } A \\ \text{Condition 2 for } A\\ \dots}} A \end{equation*}$$
:::
::: example
Example 17. Let S=\left\{1,2,3,4,5,6\right\} and let
T=\left\{2,4,5,6,7,8\right\}, we have that
$$\begin{align*} S\cup T &=\left\{1,2,3,4,5,6\right\}\cup \left\{2,4,5,6,7,8\right\}=\left\{1,2,3,4,5,6,2,4,5,6,7,8\right\}=\left\{1,2,3,4,5,6,7,8\right\}\\ S\cap T &=\left\{1,2,3,4,5,6\right\}\cap \left\{2,4,5,6,7,8\right\}=\left\{2,4,5,6\right\} \end{align*}$$
We note that when we list the elements of the union some elements appear more than once, for example there are two $2$'s. Repeated elements in a set are considered to be the same element so we don't write them, i.e. $\left\{2,2\right\}=\left\{2\right\}$.
:::
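The computation in Example 17 can be reproduced with Python's built-in `set` type. This is only an illustrative sketch: the variable names are our own, and `|` and `&` are Python's operators for union and intersection.

```python
# Sets from Example 17.
S = {1, 2, 3, 4, 5, 6}
T = {2, 4, 5, 6, 7, 8}

union = S | T          # elements that are in S or in T
intersection = S & T   # elements that are in S and in T

print(sorted(union))         # [1, 2, 3, 4, 5, 6, 7, 8]
print(sorted(intersection))  # [2, 4, 5, 6]

# Repeated elements collapse: {2, 2} is the same set as {2}.
print({2, 2} == {2})         # True
```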
::: example
Example 18. Let A_1=\left\{1,2,3\right\},
A_2=\left\{1,2,7,9\right\} and A_3=\left\{1,4,8,12\right\}. We have
that the union of these sets is given by
$$\begin{align*} \bigcup_{i=1}^3 A_i&=A_1\cup A_2\cup A_3\\ &=\left\{1,2,3\right\}\cup \left\{1,2,7,9\right\}\cup \left\{1,4,8,12\right\}\\ &=\left\{1,2,3,4,7,8,9,12\right\} \end{align*}$$
The intersection of these sets is given by
$$\begin{align*} \bigcap_{i=1}^3 A_i&=A_1\cap A_2\cap A_3\\ &=\left\{1,2,3\right\}\cap \left\{1,2,7,9\right\}\cap \left\{1,4,8,12\right\}\\ &=\left\{1\right\} \end{align*}$$
:::
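A union or intersection over a finite family of sets can be sketched in Python as follows. The list name `family` is our own choice; `set().union(*family)` and `set.intersection(*family)` are standard methods of Python's `set` type.

```python
# Sets from Example 18.
A1 = {1, 2, 3}
A2 = {1, 2, 7, 9}
A3 = {1, 4, 8, 12}
family = [A1, A2, A3]

# Union of every set in the family, starting from the empty set.
big_union = set().union(*family)
# Intersection of every set in the family (the family must be non-empty).
big_intersection = set.intersection(*family)

print(sorted(big_union))         # [1, 2, 3, 4, 7, 8, 9, 12]
print(sorted(big_intersection))  # [1]
```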
We make one useful definition about intersections.
::: definition Definition 25. Disjoint sets
Let X and Y be sets. If we have that X\cap Y =\emptyset then we
say that X and Y are disjoint sets.
:::
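As a small sketch of Definition 25, two Python sets are disjoint exactly when their intersection is the empty set. The sets `X`, `Y` and `Z` below are made up for illustration; `isdisjoint` is a standard method of Python's `set` type.

```python
X = {1, 3, 5}
Y = {2, 4, 6}
Z = {5, 6}

print(X & Y == set())   # True: X and Y share no element
print(X.isdisjoint(Y))  # True: the same check using the built-in method
print(X.isdisjoint(Z))  # False: 5 lies in both X and Z
```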
Operations on sets
The union, the intersection and set inclusion
Before we continue we introduce three new ideas that will play a role throughout the rest of this paper.
::: definition Definition 26. Operation
An operation \circ acts on some inputs to produce an output or some
outputs.
:::
::: example
Example 19. The union \cup and intersection \cap are examples
of operations. These operators operate on two sets to produce a third.
:::
::: definition Definition 27. Commutative operation
Let \circ be an operation that accepts two inputs, i.e. we have
A\circ B for valid inputs A and B. We say that \circ is
commutative if and only if $A\circ B = B\circ A$ for all valid inputs A and B.
:::
::: example
Example 20. Consider \mathbb{N}=\left\{1,2,3,4,5,\dots\right\}.
We are familiar with the idea of addition of positive numbers, say
1+2=3. It is clear that the addition operation is commutative for
\mathbb{N}, e.g. $1+2=3=2+1$
:::
::: definition Definition 28. Associative operation
Let \circ be an operation that accepts two inputs, i.e. we have
A\circ B for valid inputs A and B. We say that \circ is
associative if and only if
\left(A\circ B\right)\circ C = A\circ\left(B\circ C\right) for all valid
inputs A, B and C, where the operation in the brackets is computed first.
:::
::: example
Example 21. Again consider
\mathbb{N}=\left\{1,2,3,4,5,\dots\right\}. The addition operator for
\mathbb{N} is associative, e.g.
$\left(1+2\right)+3=3+3=6=1+5=1+\left(2+3\right)$
:::
We note that we have not defined a rigorous notion of addition. Doing so requires us to consider mappings, which we do later.
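The definitions of commutativity and associativity can be tested by brute force on a finite sample of inputs. The sketch below uses helper names of our own (`is_commutative`, `is_associative`), and it uses subtraction purely as an illustrative non-example that does not appear in the text. A check on finitely many inputs is evidence rather than a proof, although a single failing case is enough to disprove the property.

```python
from itertools import product
import operator

def is_commutative(op, samples):
    """Check op(a, b) == op(b, a) for every pair drawn from samples."""
    return all(op(a, b) == op(b, a) for a, b in product(samples, repeat=2))

def is_associative(op, samples):
    """Check op(op(a, b), c) == op(a, op(b, c)) for every triple from samples."""
    return all(
        op(op(a, b), c) == op(a, op(b, c))
        for a, b, c in product(samples, repeat=3)
    )

numbers = range(1, 6)  # the sample 1, 2, 3, 4, 5

print(is_commutative(operator.add, numbers))  # True
print(is_associative(operator.add, numbers))  # True
print(is_commutative(operator.sub, numbers))  # False, since 1 - 2 != 2 - 1
```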
We have the following proposition about the properties of intersections, unions and set inclusions.
::: {#prop:PropertiesOfUnionIntersectionSetinclusion .proposition} Proposition 8. Properties of intersection, union and set inclusion
Let A,B,C be sets. Then we have that the following properties are
true
- $A\cap B = B\cap A$
- $A\cup B = B\cup A$
- $A\cap B\subseteq A$
- $A\subseteq A\cup B$
- $A\subseteq B \Rightarrow A\cap B = A$
- $A\subseteq B\Rightarrow A\cup B =B$
- $A\subseteq B \Rightarrow A\cap C \subseteq B\cap C$
- $A\subseteq B \Rightarrow A\cup C \subseteq B\cup C$
- $A\cap A = A$
- $A\cup A =A$
- $A\cap\left(B\cap C\right)=\left(A\cap B\right)\cap C$
- $A\cup\left(B\cup C\right)=\left(A\cup B\right)\cup C$
- $A\cap\left(B\cup C\right)=\left(A\cap B\right)\cup\left(A\cap C\right)$
- $A\cup\left(B\cap C\right)= \left(A\cup B\right) \cap \left(A\cup C\right)$
Proof:
- A\cap B = B\cap A: Let x\in A\cap B, then x\in A and x\in B by the definition of the intersection. It is hence clear that x\in B\cap A. So we have A\cap B\subseteq B\cap A. Likewise if x\in B\cap A then x\in B and x\in A, so that x\in A\cap B. So B\cap A\subseteq A\cap B. It hence follows by proposition 1{reference-type="ref" reference="prop:TwosetsEqualIfContainedInEachOther"} that A\cap B = B\cap A.
- A\cup B = B\cup A: Let x\in A\cup B, then x\in A or x\in B by the definition of the union. We hence have that x\in B\cup A. So we have A\cup B\subseteq B\cup A. Likewise if x\in B\cup A then x\in B or x\in A, so that x\in A\cup B. So B\cup A\subseteq A\cup B. It hence follows by proposition 1{reference-type="ref" reference="prop:TwosetsEqualIfContainedInEachOther"} that A\cup B = B\cup A.
- A\cap B\subseteq A: Let x\in A\cap B, then by the definition of the intersection x\in A and x\in B. Hence x\in A\cap B means that x\in A, so that A\cap B\subseteq A.
- A\subseteq A\cup B: Let x\in A. By the definition of the union of two sets we have that y\in A\cup B if and only if y\in A or y\in B. Hence it follows that $x\in A\cup B$, so that A\subseteq A\cup B.
- A\subseteq B \Rightarrow A\cap B = A: Let A\subseteq B and suppose that x\in A, then we have that x\in B as A\subseteq B. Hence x\in A\cap B. This holds for any choice of x\in A, so A\subseteq A\cap B. We also have A\cap B\subseteq A by property 3. We conclude that if A\subseteq B then $A\cap B = A$.
- A\subseteq B\Rightarrow A\cup B =B: Let A\subseteq B. Observe that B\subseteq B, so that by properties 8 and 10 we have A\cup B\subseteq B\cup B= B, that is to say A\cup B \subseteq B. Now B\subseteq A\cup B by properties 2 and 4. Hence A\cup B = B.
- A\subseteq B \Rightarrow A\cap C \subseteq B\cap C: Suppose that A\subseteq B and let x\in A\cap C, then by definition x\in A and x\in C. As A\subseteq B, we have that x\in A gives x\in B. Hence x\in B\cap C. It follows that A\cap C\subseteq B\cap C.
- A\subseteq B \Rightarrow A\cup C \subseteq B\cup C: Suppose A\subseteq B and let x\in A\cup C. We have that x\in A or x\in C. If x\in A then as A\subseteq B we have that x\in B so that x\in B\cup C. If x\in C then clearly x\in B\cup C. Either way we have that A\cup C\subseteq B\cup C.
- A\cap A = A: Let x\in A, then by the definition of the intersection we have that y\in A\cap A if and only if y\in A and y\in A, hence x\in A\cap A. So that A\subseteq A\cap A. Now if x\in A\cap A we have by definition of the intersection of two sets that x\in A and x\in A, so the force of deductive logic then drives one to the conclusion that x\in A. So A\cap A\subseteq A. Hence A\cap A = A.
- A\cup A =A: Let x\in A, then by the definition of the union of two sets, we have that y\in A\cup A if and only if y\in A or y\in A, hence x\in A\cup A so that A\subseteq A\cup A. Now suppose that x\in A\cup A, then again by the definition of the union we have that x\in A so that A\cup A\subseteq A. Hence A=A\cup A.
- A\cap\left(B\cap C\right)=\left(A\cap B\right)\cap C: Let A, B and C be sets. Consider A\cap\left(B\cap C\right), we have that x\in A\cap\left(B\cap C\right) means that x\in A and x\in B\cap C, likewise x\in B\cap C means that x\in B and x\in C. Now as x\in A and x\in B and x\in C we have that x\in A\cap B and x\in C. Finally we have that x\in\left(A\cap B\right)\cap C so that A\cap\left(B\cap C\right)\subseteq \left(A\cap B\right)\cap C.
  Now consider \left(A\cap B\right)\cap C, if x\in\left(A\cap B\right)\cap C then x\in A\cap B and x\in C, also x\in A\cap B means that x\in A and x\in B. As x\in A and x\in B and x\in C we have that x\in A and x\in B\cap C so that x\in A\cap\left(B\cap C\right). Hence \left(A\cap B\right)\cap C\subseteq A\cap\left(B\cap C\right).
  Hence $A\cap\left(B\cap C\right)=\left(A\cap B\right)\cap C$
- A\cup\left(B\cup C\right)=\left(A\cup B\right)\cup C: Let A, B and C be sets. Consider A\cup\left(B\cup C\right) and let x\in A\cup\left(B\cup C\right), we have that either x\in A or x\in\left(B\cup C\right). If x\in A then we have that x\in A\cup B so that x\in\left(A\cup B\right)\cup C. If x\in B\cup C then either x\in B or x\in C. If x\in B then x\in A\cup B so that x\in \left(A\cup B\right)\cup C. Otherwise x\in C and we have that x\in \left(A\cup B\right)\cup C. Hence we have that $A\cup\left(B\cup C\right)\subseteq\left(A\cup B\right)\cup C$
  Conversely let x\in\left(A\cup B\right)\cup C. We have that either x\in\left(A\cup B\right) or x\in C. If x\in\left(A\cup B\right) then either x\in A or x\in B, in either case we have that x\in A\cup\left(B\cup C\right). If x\in C then x\in A\cup\left(B\cup C\right). So that \left(A\cup B\right)\cup C\subseteq A\cup\left(B\cup C\right).
  Hence $A\cup\left(B\cup C\right)=\left(A\cup B\right)\cup C$
- A\cap\left(B\cup C\right)=\left(A\cap B\right)\cup\left(A\cap C\right): Let x\in A\cap\left(B\cup C\right), then we have that x\in A and x\in B\cup C. We have x\in B\cup C gives us that x\in B or x\in C. If x\in B then x\in A\cap B and so x\in\left(A\cap B\right)\cup\left(A\cap C\right). Likewise if x\in C then x\in A\cap C so x\in \left(A\cap B\right)\cup\left(A\cap C\right). Hence A\cap\left(B\cup C\right)\subseteq\left(A\cap B\right)\cup\left(A\cap C\right).
  For the opposite inclusion, let x\in\left(A\cap B\right)\cup\left(A\cap C\right) then we have that either x\in A\cap B or x\in A\cap C. If x\in A\cap B then x\in A and x\in B, so we hence have that x\in B\cup C so that x\in A\cap\left(B\cup C\right). Likewise if we have x\in A\cap C then x\in A and x\in C, so x\in B\cup C and x\in A\cap\left(B\cup C\right). Hence $\left(A\cap B\right)\cup\left(A\cap C\right)\subseteq A\cap\left(B\cup C\right)$
  So A\cap\left(B\cup C\right)=\left(A\cap B\right)\cup\left(A\cap C\right).
- A\cup\left(B\cap C\right)= \left(A\cup B\right) \cap \left(A\cup C\right): Let x\in A\cup\left(B\cap C\right) then either x\in A or x\in B\cap C. If x\in A then x\in A\cup B and x\in A\cup C, which is to say x\in\left(A\cup B\right)\cap\left(A\cup C\right). If x\in B\cap C then x\in B and x\in C, so it follows that x\in A\cup B and x\in A\cup C which is to say x\in\left(A\cup B\right)\cap\left(A\cup C\right). Hence A\cup\left(B\cap C\right)\subseteq \left(A\cup B\right) \cap \left(A\cup C\right).
  Now, suppose that x\in\left(A\cup B\right) \cap \left(A\cup C\right). We then have that x\in A\cup B and x\in A\cup C. Now x\in A\cup B gives x\in A or x\in B, also x\in A\cup C means that x\in A or x\in C. This gives us two possible outcomes. If x\in A then x\in A\cup\left(B\cap C\right). Suppose instead that x\not\in A, then we must have that x\in B and x\in C as x\in A\cup B and x\in A\cup C. Hence x\in B\cap C so x\in A\cup\left(B\cap C\right). In both cases x\in A\cup\left(B\cap C\right), hence \left(A\cup B\right) \cap \left(A\cup C\right)\subseteq A\cup\left(B\cap C\right).
  So we have that A\cup\left(B\cap C\right)= \left(A\cup B\right) \cap \left(A\cup C\right).
The proposition now follows. $\qed$ :::
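The fourteen properties of Proposition 8 can be sanity-checked by brute force over every triple of subsets of a small set. The sketch below is an illustration and not a substitute for the proof. The helper names `subsets` and `proposition_8_holds` and the choice of the set \left\{1,2,3\right\} are our own assumptions. In Python, `<=` on sets tests set inclusion, and an implication P \Rightarrow Q is written as `(not P) or Q`.

```python
from itertools import combinations, product

def subsets(universe):
    """Return every subset of universe as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def proposition_8_holds(A, B, C):
    """Evaluate the fourteen properties, in order, for one triple of sets."""
    return all([
        A & B == B & A,
        A | B == B | A,
        A & B <= A,
        A <= A | B,
        (not (A <= B)) or (A & B == A),
        (not (A <= B)) or (A | B == B),
        (not (A <= B)) or (A & C <= B & C),
        (not (A <= B)) or (A | C <= B | C),
        A & A == A,
        A | A == A,
        A & (B & C) == (A & B) & C,
        A | (B | C) == (A | B) | C,
        A & (B | C) == (A & B) | (A & C),
        A | (B & C) == (A | B) & (A | C),
    ])

all_subsets = subsets({1, 2, 3})  # the 8 subsets of {1, 2, 3}
checked = all(proposition_8_holds(A, B, C)
              for A, B, C in product(all_subsets, repeat=3))
print(checked)  # True
```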
::: {#thm:EquivSubsetIntUnion .theorem} Theorem 1. Equivalence of Subsets with union and intersection
Let A,B be sets. The following are equivalent
- $A\subseteq B$
- $A\cap B = A$
- $A\cup B =B$
Proof:
Suppose A\subseteq B. By proposition
8{reference-type="ref"
reference="prop:PropertiesOfUnionIntersectionSetinclusion"} we have
that
$$\begin{equation*}
A=A\cap A \subseteq A\cap B\subseteq A
\end{equation*}$$ Hence A=A\cap B.
Now suppose that A\cap B = A. By properties 1 and 3 of the same proposition
we have A=A\cap B\subseteq B, so A\subseteq B. This shows 1 and 2
are equivalent.
Suppose A\subseteq B. If x\in B then
x\in A\cup B, so that B\subseteq A\cup B. Suppose that
x\in A\cup B, then either x\in A or x\in B. If x\in B we are
done. If x\in A then as
A\subseteq B we have that x\in B. In either case x\in B, so that A\cup B\subseteq B.
Hence A\cup B = B.
Now suppose that A\cup B = B. Suppose that x\in A then
x\in A\cup B =B so x\in B, hence A\subseteq B.
This shows the equivalence of 1 and 3.
The equivalence of 2 and 3 now follows. Indeed, suppose that
A\cap B =A then by the equivalence of 1 and 2 we know that
A\subseteq B, also by the equivalence of 1 and 3 we know that
A\cup B = B. $\qed$
:::
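Theorem 1 says that the three statements are either all true or all false for any given pair of sets. A brute-force sketch of this over all pairs of subsets of \left\{1,2,3\right\} is given below. The helper `subsets` and the choice of underlying set are our own assumptions.

```python
from itertools import combinations, product

def subsets(universe):
    """Return every subset of universe as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def statements(A, B):
    """Truth values of the three statements of Theorem 1 for the pair (A, B)."""
    return (A <= B, A & B == A, A | B == B)

# The statements are equivalent when they always share one truth value.
equivalent = all(len(set(statements(A, B))) == 1
                 for A, B in product(subsets({1, 2, 3}), repeat=2))
print(equivalent)  # True
```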
The complement of a set
It sometimes becomes useful to talk about the elements that are not in
some set S. This only makes sense if S is contained inside some
larger set.
::: definition Definition 29. Complement of a set
Let S be a set such that S\subseteq U for some set U. We define
the complement of S, denoted by S^C, as the following set
$$\begin{equation*} S^C = \left\{x\in U:x\not\in S\right\} \end{equation*}$$
We can alternatively write S^C = U\setminus S, where \setminus is
the set difference operation.
Moreover we can also consider the complement of a set B relative
to some other set A, again occurring inside some larger set U, which
is to say A\subseteq U and B\subseteq U. We have that
$$\begin{equation*} A\setminus B = \left\{x\in A : x\not\in B\right\} \end{equation*}$$
We call A\setminus B the set difference of A and B.
:::
::: example
Example 22. Let U=\left\{1,2,3,4,5,6\right\},
S=\left\{1,2,3,4,6\right\} and T=\left\{2,4,6\right\}. We have that
S\subseteq U and T\subseteq U so that
$$\begin{align*} S^C&=\left\{x\in U:x\not\in S\right\}=\left\{5\right\}\\ T^C&=\left\{x\in U:x\not\in T\right\}=\left\{1,3,5\right\} \end{align*}$$
Also
$$\begin{align*} S\setminus T&=\left\{x\in S: x\not\in T\right\}=\left\{1,3\right\}\\ T\setminus S&=\left\{x\in T: x\not\in S\right\}=\emptyset \end{align*}$$
:::
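Example 22 can be reproduced in Python, where `-` is the set difference operator. Python has no built-in notion of a universal set, so in this sketch the complement is computed as a difference from `U`. The variable names are our own.

```python
# Sets from Example 22.
U = {1, 2, 3, 4, 5, 6}
S = {1, 2, 3, 4, 6}
T = {2, 4, 6}

S_complement = U - S  # complement of S inside U
T_complement = U - T  # complement of T inside U

print(S_complement)          # {5}
print(sorted(T_complement))  # [1, 3, 5]
print(sorted(S - T))         # [1, 3]
print(T - S)                 # set(), Python's notation for the empty set
```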
An immediate result follows from the previous definitions of the complement of a set and set difference.
::: {#thm:DeMorgan .theorem} Theorem 2. De-Morgan's laws
Let A and B be subsets of some universal set U. We have the
complement laws
- $\left(A\cap B\right)^C=A^C\cup B^C$
- $\left(A\cup B\right)^C= A^C\cap B^C$
We also have the set difference laws
- $U\setminus\left(A\cap B\right)=\left(U\setminus A\right)\cup \left(U\setminus B\right)$
- $U\setminus\left(A\cup B\right)=\left(U\setminus A\right)\cap \left(U\setminus B\right)$
Proof:
We first prove the complement laws.
- \left(A\cap B\right)^C=A^C\cup B^C: Let x\in\left(A\cap B\right)^C, by the definition of the set complement we have that x\not\in \left(A\cap B\right). So by the definition of the intersection and x not being an element of A\cap B we have that x\not\in A or x\not\in B. Suppose that x\not\in A, then by the definition of set complement we have that x\in A^C so that x\in A^C\cup B^C. Likewise if x\not\in B then x\in B^C so that x\in A^C\cup B^C. Hence we have that \left(A\cap B\right)^C\subseteq A^C\cup B^C.
  Now suppose x\in A^C\cup B^C, then x\in A^C or x\in B^C. Suppose x\in A^C then x\not\in A so that x\not\in A\cap B hence x\in\left(A\cap B\right)^C. Likewise if x\in B^C then x\not\in B so x\not\in A\cap B so that x\in\left(A\cap B\right)^C. Thus $A^C\cup B^C\subseteq \left(A\cap B\right)^C$
  Hence \left(A\cap B\right)^C=A^C\cup B^C.
- \left(A\cup B\right)^C= A^C\cap B^C: Let x\in \left(A\cup B\right)^C, then we have that x\not\in A\cup B so x\not\in A and x\not\in B. This means that x\in A^C and x\in B^C which is to say x\in A^C\cap B^C. So \left(A\cup B\right)^C\subseteq A^C\cap B^C.
  Suppose x\in A^C \cap B^C then x\in A^C and x\in B^C. Now x\in A^C means that x\not\in A and x\in B^C means that x\not\in B, so x\not\in A and x\not\in B hence x\not\in A\cup B. Thus x\in\left(A\cup B\right)^C. Hence $A^C\cap B^C\subseteq\left(A\cup B\right)^C$
  Thus $\left(A\cup B\right)^C= A^C\cap B^C$
It is left to prove the set difference laws.
- U\setminus\left(A\cap B\right)=\left(U\setminus A\right)\cup \left(U\setminus B\right): Let x\in U\setminus\left(A\cap B\right) then by definition we have that x\in U and x\not\in A\cap B, which is to say that x\not\in A or x\not\in B with the possibility of being in neither. If x\not\in A then x\in \left(U\setminus A\right) and we clearly have x\in \left(U\setminus A\right)\cup \left(U\setminus B\right). Likewise if x\not\in B then x\in \left(U\setminus B\right) and again x\in \left(U\setminus A\right)\cup \left(U\setminus B\right). Both arguments apply in the case where x\not\in A and x\not\in B. It follows that in every case x\in \left(U\setminus A\right)\cup \left(U\setminus B\right). Hence $U\setminus\left(A\cap B\right)\subseteq\left(U\setminus A\right)\cup \left(U\setminus B\right)$
  Now suppose that x\in \left(U\setminus A\right)\cup \left(U\setminus B\right) then by definition we have that x\in U\setminus A or x\in U\setminus B with the possibility of being in both. If x\in U\setminus A then x\in U and x\not\in A. Hence x\not\in A\cap B, likewise if x\in U\setminus B then x\in U and we again conclude that x\not\in A\cap B. As x\in U in either case, we have by definition that x\in U\setminus\left(A\cap B\right). We conclude that $\left(U\setminus A\right)\cup \left(U\setminus B\right)\subseteq U\setminus\left(A\cap B\right)$
  It follows that $U\setminus\left(A\cap B\right)=\left(U\setminus A\right)\cup \left(U\setminus B\right)$
- U\setminus\left(A\cup B\right)=\left(U\setminus A\right)\cap \left(U\setminus B\right): Suppose that x\in U\setminus\left(A\cup B\right) then x\in U and x\not\in A\cup B so x\not\in A and x\not\in B. Clearly then x\in U\setminus A and x\in U\setminus B so that x\in \left(U\setminus A\right)\cap \left(U\setminus B\right). So we have that U\setminus\left(A\cup B\right)\subseteq\left(U\setminus A\right)\cap \left(U\setminus B\right).
  Let x\in \left(U\setminus A\right)\cap \left(U\setminus B\right) then x\in U\setminus A and x\in U\setminus B which is to say that x\in U and x\not\in A and x\not\in B. Clearly x\not\in A and x\not\in B gives us that x\not\in A\cup B and so x\in U\setminus \left(A\cup B\right) by definition. This allows us to conclude that $\left(U\setminus A\right)\cap \left(U\setminus B\right)\subseteq U\setminus\left(A\cup B\right)$
  Hence $U\setminus\left(A\cup B\right)=\left(U\setminus A\right)\cap \left(U\setminus B\right)$
This proves the theorem. $\qed$ :::
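Both complement laws can be checked by brute force for every pair of subsets of a small universal set. The sketch below assumes U=\left\{1,2,3\right\} and uses helper names of our own (`subsets`, `complement`). Since the complement is computed as `U - X`, the same check also covers the set difference laws.

```python
from itertools import combinations, product

U = frozenset({1, 2, 3})  # an assumed small universal set

def subsets(universe):
    """Return every subset of universe as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def complement(X):
    """Complement of X relative to the universal set U."""
    return U - X

de_morgan_holds = all(
    complement(A & B) == complement(A) | complement(B)
    and complement(A | B) == complement(A) & complement(B)
    for A, B in product(subsets(U), repeat=2)
)
print(de_morgan_holds)  # True
```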
::: {#prop:AdditionComplement .proposition} Proposition 9. Additional properties of set complements and set differences
Let A, B and C be sets such that A\subseteq U, B\subseteq U
and C\subseteq U. Moreover suppose U is not contained in any other
set. Then we have that
- $A\cup A^C = U$
- $A\cap A^C =\emptyset$
- $\emptyset^C =U$
- $U^C=\emptyset$
- If A\subseteq B then $B^C\subseteq A^C$
- $\left(A^C\right)^C=A$
- $A\setminus B = A\cap B^C$
- $\left(A\setminus B\right)^C=A^C\cup B$
- $A^C\setminus B^C=B\setminus A$
- $\left(A\setminus B\right)\cap C = \left(A\cap C\right)\setminus\left(B\cap C\right)$
- $A\cap\left(B\setminus C\right) = \left(A\cap B\right)\setminus\left(A\cap C\right)$
- $\left(A\setminus B\right)\cap B=\emptyset$
- $\left(A\setminus B\right)\cap\left(A\cap B\right)=\emptyset$
Proof:
- A\cup A^C = U: Let x\in A\cup A^C then x\in A or x\in A^C. If x\in A then as A\subseteq U we have that x\in U. If x\in A^C then by the definition of set complements we have that x\in U and x\not\in A, so in particular x\in U. Hence A\cup A^C\subseteq U.
  Conversely suppose that x\in U. If x\in A we clearly have x\in A\cup A^C. So suppose x\not\in A then by definition of the set complement we have that x\in A^C so that x\in A\cup A^C. Hence U\subseteq A\cup A^C.
  So A\cup A^C=U.
- A\cap A^C =\emptyset: Let x\in A\cap A^C, then x\in A and x\in A^C, however x\in A^C means that x\not\in A. This contradicts the fact that x\in A, hence there are no elements x\in U so that x\in A and x\in A^C.
  Hence A\cap A^C =\emptyset.
- \emptyset^C =U: By the definition of the empty set we have that \emptyset has no elements. The complement of the empty-set is
  $$\begin{equation*} \emptyset^C=\left\{x\in U:x\not\in\emptyset\right\} \end{equation*}$$
  Every element of \emptyset^C is by definition an element of U. So \emptyset^C\subseteq U.
  Conversely let x\in U, then x\not\in \emptyset as \emptyset has no elements, so x\in\emptyset^C hence U\subseteq \emptyset^C.
  It follows that \emptyset^C=U.
- U^C=\emptyset: By the definition of set complement we have that
  $$\begin{equation*} U^C=\left\{y\in U:y\not\in U\right\} \end{equation*}$$
  This is clearly empty as no such y can satisfy y\in U and y\not\in U.
  Hence U^C=\emptyset.
- If A\subseteq B then B^C\subseteq A^C: Suppose that A\subseteq B. By proposition 8{reference-type="ref" reference="prop:PropertiesOfUnionIntersectionSetinclusion"} property 5 we have that A\cap B = A. It follows that \left(A\cap B\right)^C = A^C. Now by De-Morgan's laws we have that \left(A\cap B\right)^C= A^C\cup B^C. Hence A^C\cup B^C = A^C, which is to say B^C\cup A^C = A^C. Finally by theorem 1{reference-type="ref" reference="thm:EquivSubsetIntUnion"} we know that X\cup Y = Y if and only if X\subseteq Y for sets X and Y. Hence B^C\subseteq A^C.
- \left(A^C\right)^C=A: Let x\in \left(A^C\right)^C. By definition we have that
  $$\begin{equation*} \left(A^C\right)^C=\left\{x\in U : x\not\in A^C\right\} \end{equation*}$$
  Hence x\in \left(A^C\right)^C if and only if x\not\in A^C. However x\not\in A^C means that x\in A. Hence $\left(A^C\right)^C\subseteq A$
  Suppose that x\in A, then x\not\in A^C, moreover by definition x\not\in A^C if and only if x\in \left(A^C\right)^C, hence A\subseteq \left(A^C\right)^C.
  Hence $\left(A^C\right)^C=A$
- A\setminus B = A\cap B^C: Let x\in A\setminus B, then by definition we have that A\setminus B is the set
  $$\begin{equation*} A\setminus B = \left\{y\in A:y\not\in B\right\} \end{equation*}$$
  Hence x\in A\setminus B means that x\in A and x\not\in B. We have that x\not\in B means that x\in B^C. So that x\in A\cap B^C. It follows that A\setminus B\subseteq A\cap B^C.
  Let x\in A\cap B^C, then x\in A and x\in B^C. Now x\in B^C means that x\not\in B, so by definition x\in A and x\not\in B means that x\in A\setminus B. Hence A\cap B^C\subseteq A\setminus B.
  Hence A\setminus B = A\cap B^C.
- \left(A\setminus B\right)^C=A^C\cup B: We know that A\setminus B = A\cap B^C by the previous property. Now by De-Morgan's laws we have that
  $$\begin{equation*} \left(A\setminus B\right)^C=\left(A\cap B^C\right)^C = A^C\cup \left(B^C\right)^C = A^C \cup B \end{equation*}$$
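- A^C\setminus B^C=B\setminus A: We know that A^C\setminus B^C = A^C\cap \left(B^C\right)^C by property 7. Now, \left(B^C\right)^C=B by property 6, hence A^C\cap \left(B^C\right)^C=A^C\cap B = B\cap A^C. Finally we know that B\cap A^C = B\setminus A by property 7.
  Hence A^C\setminus B^C=B\setminus A.
- \left(A\setminus B\right)\cap C = \left(A\cap C\right)\setminus\left(B\cap C\right): By property 7 and De-Morgan's laws we have that \left(A\cap C\right)\setminus\left(B\cap C\right)=\left(A\cap C\right)\cap\left(B^C\cup C^C\right). Distributing the intersection over the union gives \left(A\cap C\cap B^C\right)\cup\left(A\cap C\cap C^C\right). The second set is empty by property 2, so we are left with A\cap B^C\cap C=\left(A\setminus B\right)\cap C.
- A\cap\left(B\setminus C\right) = \left(A\cap B\right)\setminus\left(A\cap C\right): By property 7 and De-Morgan's laws we have that \left(A\cap B\right)\setminus\left(A\cap C\right)=\left(A\cap B\right)\cap\left(A^C\cup C^C\right)=\left(A\cap B\cap A^C\right)\cup\left(A\cap B\cap C^C\right). The first set is empty by property 2, so we are left with A\cap\left(B\cap C^C\right)=A\cap\left(B\setminus C\right).
- \left(A\setminus B\right)\cap B=\emptyset: By property 7 we have that \left(A\setminus B\right)\cap B=\left(A\cap B^C\right)\cap B = A\cap\left(B^C\cap B\right)=A\cap\emptyset=\emptyset, where B^C\cap B=\emptyset by property 2.
- \left(A\setminus B\right)\cap\left(A\cap B\right)=\emptyset: We have that \left(A\setminus B\right)\cap\left(A\cap B\right)=\left(\left(A\setminus B\right)\cap B\right)\cap A=\emptyset\cap A=\emptyset by the previous property.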
The proposition now follows. $\qed$ :::
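A selection of the identities in Proposition 9 can be sanity-checked by brute force. The sketch below assumes the small universal set U=\left\{1,2,3\right\}. The helper names `subsets`, `complement` and `selected_identities_hold` are our own, and the check is an illustration rather than a proof.

```python
from itertools import combinations, product

U = frozenset({1, 2, 3})  # an assumed small universal set
EMPTY = frozenset()

def subsets(universe):
    """Return every subset of universe as a frozenset."""
    items = sorted(universe)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def complement(X):
    """Complement of X relative to the universal set U."""
    return U - X

def selected_identities_hold(A, B, C):
    """Evaluate a selection of the identities for one triple of subsets of U."""
    return all([
        A | complement(A) == U,
        A & complement(A) == EMPTY,
        (not (A <= B)) or (complement(B) <= complement(A)),
        complement(complement(A)) == A,
        A - B == A & complement(B),
        complement(A - B) == complement(A) | B,
        complement(A) - complement(B) == B - A,
        (A - B) & C == (A & C) - (B & C),
        (A - B) & B == EMPTY,
        (A - B) & (A & B) == EMPTY,
    ])

checked = all(selected_identities_hold(A, B, C)
              for A, B, C in product(subsets(U), repeat=3))
print(checked)                                       # True
print(complement(EMPTY) == U, complement(U) == EMPTY)  # True True
```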
Cartesian Product
We now look to another method of constructing a set. This method differs from the union and intersection as it allows us to construct a set where the elements come in pairs, in particular these pairs are ordered.
::: definition Definition 30. Ordered pair
Let S and T be sets. Let s\in S and t\in T. We say that the
tuple \left(s,t\right) is an ordered pair of an element in S and an
element in T.
:::
::: definition Definition 31. Cartesian product of two sets
Let S and T be sets. We define the Cartesian product of S and
T, denoted S\times T to be the set of all ordered pairs of the form
\left(s,t\right) where s\in S and t\in T. This is to say that
$$\begin{equation*} S\times T=\left\{\left(s,t\right):s\in S,t\in T\right\} \end{equation*}$$
:::
::: example
Example 23. Let S=\left\{1,2,3\right\} and
T=\left\{4,5,6\right\}. We have that
$$\begin{align*}
S\times T&=\left\{\left(1,4\right),\left(1,5\right),\left(1,6\right),\left(2,4\right),\left(2,5\right),\left(2,6\right),\left(3,4\right),\left(3,5\right),\left(3,6\right)\right\}\\
T\times S&=\left\{\left(4,1\right),\left(4,2\right),\left(4,3\right),\left(5,1\right),\left(5,2\right),\left(5,3\right),\left(6,1\right),\left(6,2\right),\left(6,3\right)\right\}
\end{align*}$$ This example shows that S\times T\neq T\times S in
general.
:::
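Example 23 can be reproduced with `itertools.product` from Python's standard library, which yields the ordered pairs as tuples. The variable names below are our own.

```python
from itertools import product

# Sets from Example 23.
S = {1, 2, 3}
T = {4, 5, 6}

S_times_T = set(product(S, T))  # all pairs (s, t) with s in S and t in T
T_times_S = set(product(T, S))  # all pairs (t, s) with t in T and s in S

print(len(S_times_T))          # 9
print((3, 6) in S_times_T)     # True
print((6, 3) in S_times_T)     # False: the order within a pair matters
print(S_times_T == T_times_S)  # False
```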
We can make repeated use of this idea, we just need to define an ordered $n$-tuple.
::: {#def:orderedNtuple .definition} Definition 32. Ordered $n$-tuple
Let S_1,S_2,\dots,S_n be sets. Let
s_1\in S_1,s_2\in S_2,\dots,s_n\in S_n. We say that
\left(s_1,s_2,\dots,s_n\right) is an ordered $n$-tuple of elements
in S_1,S_2,\dots,S_n.
:::
::: {#def:CartProductOfNSet .definition}
Definition 33. Cartesian product of n sets
Let S_1,S_2,\dots,S_n be sets. We define the Cartesian product of
S_1,S_2,\dots,S_n, denoted S_1\times S_2\times\dots\times S_n, to be
the set of all ordered $n$-tuples of the form
\left(s_1,s_2,\dots,s_n\right) where
s_1\in S_1,s_2\in S_2,\dots,s_n\in S_n. This is to say that
$$\begin{equation*} S_1\times S_2\times\dots\times S_n=\left\{\left(s_1,s_2,\dots,s_n\right):s_1\in S_1,s_2\in S_2,\dots,s_n\in S_n\right\} \end{equation*}$$
If all the sets are the same set S we denote this by S^n.
:::
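The product of n sets can be sketched with the same standard-library function, which accepts any number of input sets and also a `repeat` argument for the case S^n. The sets `S1`, `S2` and `S3` below are made up for illustration.

```python
from itertools import product

S1 = {0, 1}
S2 = {"a", "b"}
S3 = {"x", "y", "z"}

# S1 x S2 x S3 as a set of ordered 3-tuples.
triples = set(product(S1, S2, S3))
print(len(triples))             # 12, that is 2 * 2 * 3
print((0, "a", "x") in triples) # True

# S1^3 = S1 x S1 x S1, using the repeat argument.
cube = set(product(S1, repeat=3))
print(len(cube))                # 8
print((0, 1, 0) in cube)        # True
```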
We make the following observations
::: {#lem:CartEmpty .lemma} Lemma 1. Cartesian product is empty if and only if at least one of the sets in the product is empty
Let A and B be sets. We have that A\times B=\emptyset if and only
if A=\emptyset or B=\emptyset.
Proof:
We argue as follows. By the definition of a non-empty set we have that
A\times B\neq \emptyset if and only if
\exists\left(a,b\right)\in A\times B. Now, by the definition of a
Cartesian product we have that \left(a,b\right)\in A\times B if and
only if a\in A and b\in B, so such a pair exists if and only if
\exists a\in A and \exists b\in B, which is to say
A\neq\emptyset and B\neq\emptyset.
Hence A\times B\neq \emptyset if and only if A\neq\emptyset and
B\neq\emptyset. Negating both statements, A\times B=\emptyset if and
only if A=\emptyset or B=\emptyset. $\qed$
:::
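Lemma 1 can be illustrated in Python: a product with an empty factor has no pairs, while a product of two non-empty sets always contains at least one pair. The helper `cartesian` is a name of our own, built on `itertools.product`.

```python
from itertools import product

def cartesian(A, B):
    """Return the Cartesian product of A and B as a set of pairs."""
    return set(product(A, B))

print(cartesian(set(), {1, 2}) == set())  # True: the first factor is empty
print(cartesian({1, 2}, set()) == set())  # True: the second factor is empty
print(cartesian({1}, {2}))                # {(1, 2)}: both factors are non-empty
```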
::: {#prop:CriterionForComOfCartProd .proposition} Proposition 10. Criterion for commutativity of the Cartesian product
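Let A and B be sets. We have that A\times B = B\times A if and only if
at least one of the following holds.
- $A=B$
- A = \emptyset or B=\emptyset or $A=B=\emptyset$
Proof:
Let A and B be sets. We first show that each condition gives A\times B = B\times A.
- A=B: Suppose that A=B then without loss of generality (see footnote 4) we may replace B by A to get
  $$\begin{equation*} A\times B = A\times A = \left\{\left(a,a'\right):a\in A,a'\in A\right\} \end{equation*}$$
  Moreover
  $$\begin{equation*} B\times A = A\times A = \left\{\left(a,a'\right):a\in A,a'\in A\right\} \end{equation*}$$
  Hence we have that A\times B = B\times A.
- A = \emptyset or B=\emptyset or A=B=\emptyset: By lemma 1{reference-type="ref" reference="lem:CartEmpty"} we have that if A=\emptyset or B=\emptyset or A=B=\emptyset then A\times B=\emptyset =B\times A.
Conversely suppose that A\times B = B\times A and that A\neq\emptyset and B\neq\emptyset. Let a\in A and b\in B. Then \left(a,b\right)\in A\times B = B\times A, so a\in B and b\in A. As a and b were arbitrary we have A\subseteq B and B\subseteq A, so A=B by proposition 1{reference-type="ref" reference="prop:TwosetsEqualIfContainedInEachOther"}.
The proposition follows. $\qed$
:::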
We have seen that the Cartesian product is not commutative in general, but what can we say about associativity?
::: example
Example 24. Let A=\left\{1\right\}. Consider
$$\begin{align*} A\times\left(A\times A\right)&=A\times\left\{\left(1,1\right)\right\}=\left\{\left(1,\left(1,1\right)\right)\right\}\\ \left(A\times A\right)\times A &=\left\{\left(1,1\right)\right\}\times A = \left\{\left(\left(1,1\right),1\right)\right\} \end{align*}$$
Hence
A\times\left(A\times A\right)\neq \left(A\times A\right)\times A. So
in general the Cartesian product is not associative.
:::
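Example 24 can be reproduced with nested calls to `itertools.product`, where nested tuples play the role of nested ordered pairs. The names `left`, `right` and `flat` are our own.

```python
from itertools import product

A = {1}

left = set(product(A, product(A, A)))   # A x (A x A)
right = set(product(product(A, A), A))  # (A x A) x A
flat = set(product(A, A, A))            # the set of ordered 3-tuples, A^3

print(left)           # {(1, (1, 1))}
print(right)          # {((1, 1), 1)}
print(flat)           # {(1, 1, 1)}
print(left == right)  # False: the pairs are nested differently
```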
We have the following criterion for the associativity of the Cartesian product.
::: {#prop:CriterionForAssOfCartProd .proposition} Proposition 11. Criterion for associativity of the Cartesian product
Let A,B and C be sets. We have that
A\times\left(B\times C\right)=\left(A\times B\right)\times C if and
only if A=\emptyset or B=\emptyset or C=\emptyset.
Proof:
Suppose that
A\times\left(B\times C\right)=\left(A\times B\right)\times C, we need
to show one of A,B or C is empty.
Consider A\times\left(B\times C\right), we have that
$$\begin{equation*} A\times\left(B\times C\right)=A\times\left\{\left(b,c\right):b\in B,c\in C\right\}=\left\{\left(a,\left(b,c\right)\right):a\in A, \left(b,c\right)\in B\times C\right\} \end{equation*}$$
Now consider \left(A\times B\right)\times C, we have that
$$\begin{equation*} \left(A\times B\right)\times C=\left\{\left(a,b\right):a\in A,b\in B\right\}\times C=\left\{\left(\left(a,b\right),c\right):\left(a,b\right)\in A\times B, c\in C\right\} \end{equation*}$$
Hence for an element \left(a,\left(b,c\right)\right) of the first set to
equal an element \left(\left(a',b'\right),c'\right) of the second we need
that a=\left(a',b'\right) and \left(b,c\right)=c'. However this is not
possible as \left(a',b'\right)\not\in A and \left(b,c\right)\not\in C.
Hence the two sets can only be equal if both products are empty, which by
lemma 1{reference-type="ref" reference="lem:CartEmpty"} implies that one
of A,B or C is empty.
Now suppose that one of A,B or C is empty. Without loss of
generality suppose that A=\emptyset, then by lemma
1{reference-type="ref" reference="lem:CartEmpty"} we
know that A\times B=\emptyset and
A\times\left(B\times C\right)=\emptyset. Also
\left(A\times B\right)\times C=\emptyset\times C=\emptyset.
Hence we have that
\left(A\times B\right)\times C=\emptyset=A\times\left(B\times C\right),
so the product is associative in this case. $\qed$
:::
It is left to see how the Cartesian product interacts with unions, intersections and complements.
::: {#prop:CartProdUnIntComp .proposition} Proposition 12. Properties of Cartesian products, unions, intersections and complements
Let A,B,C and D be sets. We have the following properties
- $\left(A\cap B\right)\times\left(C\cap D\right) =\left(A\times C\right)\cap\left(B\times D\right)$
- $A\times\left(B\cap C\right)=\left(A\times B\right)\cap \left(A\times C\right)$
- $\left(A\times B\right)\cap\left(B\times A\right)=\left(A\cap B\right)\times\left(A\cap B\right)$
- $\left(A\cup B\right)\times\left(C\cup D\right) = \left(A\times C\right)\cup \left(B\times D\right)\cup\left(A\times D\right)\cup\left(B\times C\right)$
- $A\times\left(B\cup C\right) = \left(A\times B\right)\cup\left(A\times C\right)$
- $\left(B\cup C\right)\times A = \left(B\times A\right)\cup\left(C\times A\right)$
- If A\subseteq B and C\subseteq D then A\times C\subseteq B\times D. Moreover if A\neq\emptyset and C\neq\emptyset then
  $$\begin{equation*} A\times C\subseteq B\times D \iff A\subseteq B\text{ and } C\subseteq D \end{equation*}$$
- If A\subseteq B then $A\times C\subseteq B\times C$
- If C\subseteq D then $A\times C\subseteq A\times D$
- $A\times\left(B\setminus C\right)=\left(A\times B\right)\setminus\left(A\times C\right)$
- $\left(A\setminus B\right)\times C = \left(A\times C\right)\setminus\left( B\times C\right)$
- $\left(A\times B\right)\setminus\left(C\times D\right)=\left(A\times\left(B\setminus D\right)\right)\cup\left(\left(A\setminus C\right)\times B\right)$
- Suppose A\subseteq C and B\subseteq D and consider C\setminus A and D\setminus B. We have
  $$\begin{align*} \left(C\setminus A\right)\times D &= \left(C\times D\right)\setminus\left(A\times D\right)\\ C\times\left(D\setminus B\right) &=\left(C\times D\right)\setminus \left(C\times B\right) \end{align*}$$
Proof:
- \left(A\cap B\right)\times\left(C\cap D\right) =\left(A\times C\right)\cap\left(B\times D\right): By the definition of the Cartesian product we have that \left(x,y\right)\in\left(A\cap B\right)\times\left(C\cap D\right) if and only if x\in A and x\in B and y\in C and y\in D. This holds if and only if \left(x,y\right)\in A\times C and \left(x,y\right)\in B\times D, and finally this happens if and only if \left(x,y\right)\in \left(A\times C\right)\cap\left(B\times D\right).
- A\times\left(B\cap C\right)=\left(A\times B\right)\cap \left(A\times C\right): We know that A\cap A=A. By the previous property we have that
  $$\begin{equation*} A\times\left(B\cap C\right)=\left(A\cap A\right)\times\left(B\cap C\right)=\left(A\times B\right)\cap \left(A\times C\right) \end{equation*}$$
- \left(A\times B\right)\cap\left(B\times A\right)=\left(A\cap B\right)\times\left(A\cap B\right): By property 1 we have
  $$\begin{equation*} \left(A\times B\right)\cap\left(B\times A\right)=\left(A\cap B\right)\times \left(B\cap A\right) = \left(A\cap B\right)\times \left(A\cap B\right) \end{equation*}$$
- \left(A\cup B\right)\times\left(C\cup D\right) = \left(A\times C\right)\cup \left(B\times D\right)\cup\left(A\times D\right)\cup\left(B\times C\right): By the definition of the Cartesian product and the union of sets we have that \left(x,y\right)\in \left(A\cup B\right)\times\left(C\cup D\right) if and only if (x\in A or x\in B) and (y\in C or y\in D). This will occur if and only if ((x\in A or x\in B) and y\in C) or ((x\in A or x\in B) and y\in D).
  This in turn occurs if and only if (x\in A and y\in C) or (x\in B and y\in C) or (x\in A and y\in D) or (x\in B and y\in D).
  By the definition of the Cartesian product this holds if and only if \left(x,y\right)\in A\times C or \left(x,y\right)\in A\times D or $\left(x,y\right)\in B\times C$ or \left(x,y\right)\in B\times D. Hence by the definition of the union of two sets, this occurs if and only if \left(x,y\right)\in \left(A\times C\right)\cup \left(B\times D\right)\cup\left(A\times D\right)\cup\left(B\times C\right).
- A\times\left(B\cup C\right) = \left(A\times B\right)\cup\left(A\times C\right): We know A=A\cup A and so by the previous property we have that
  $$\begin{align*} A\times\left(B\cup C\right)&=\left(A\cup A\right)\times\left(B\cup C\right)\\ &=\left(A\times B\right)\cup \left(A\times C\right)\cup\left(A\times C\right)\cup\left(A\times B\right)\\ &=\left(A\times B\right)\cup\left(A\times C\right) \end{align*}$$
-
\left(B\cup C\right)\times A = \left(B\times A\right)\cup\left(C\times A\right):Again
A=A\cup Aand so by property 4 we have$$\begin{align} \left(B\cup C\right)\times A&=\left(B\cup C\right)\times\left(A\cup A\right)\ &=\left(B\times A\right)\cup \left(B\times A\right)\cup\left(C\times A\right)\cup\left(C\times A\right)\ &=\left(B\times A\right)\cup\left(C\times A\right) \end{align*}$$*
-
If
$A\subseteq B$ and $C\subseteq D$ then $A\times C\subseteq B\times D$. Moreover if $A\neq\emptyset$ and $C\neq\emptyset$ then $$\begin{equation*} A\times C\subseteq B\times D \iff A\subseteq B\text{ and } C\subseteq D \end{equation*}$$:
Let
$A\subseteq B$ and $C\subseteq D$. If $A=\emptyset$ or $C=\emptyset$ then by lemma 1{reference-type="ref" reference="lem:CartEmpty"} we have $A\times C=\emptyset$ and by proposition 6{reference-type="ref" reference="prop:EmptySetincontainedineveryset"} we have $A\times C=\emptyset \subseteq B\times D$.
So suppose that $A\neq\emptyset$ and $C\neq\emptyset$, then lemma 1{reference-type="ref" reference="lem:CartEmpty"} gives $A\times C\neq\emptyset$. Let $\left(x,y\right)\in A\times C$, so that $x\in A$ and $y\in C$. We have $A\subseteq B$ so $x\in B$ and $C\subseteq D$ so $y\in D$, hence $\left(x,y\right)\in B\times D$. Hence $A\times C\subseteq B\times D$.
It is left to prove that if $A\neq\emptyset$ and $C\neq\emptyset$ and $A\times C\subseteq B\times D$, then $A\subseteq B$ and $C\subseteq D$. The assumption that both sets are non-empty is needed here. If $A=\emptyset$ then $A\times C=\emptyset$ by lemma 1{reference-type="ref" reference="lem:CartEmpty"} and $A\times C=\emptyset\subseteq B\times D$ irrespective of $C$, so $C$ need not be a subset of $D$. Likewise if $C=\emptyset$ then $A\times C=\emptyset\subseteq B\times D$ irrespective of $A$, so $A$ need not be a subset of $B$.
So suppose that $A\neq\emptyset$, $C\neq\emptyset$ and $A\times C\subseteq B\times D$. Let $x\in A$ be arbitrary. As $C\neq\emptyset$ there exists some $y\in C$, and then $\left(x,y\right)\in A\times C\subseteq B\times D$, so $x\in B$. Hence $A\subseteq B$. Likewise let $y\in C$ be arbitrary. As $A\neq\emptyset$ there exists some $x\in A$, and then $\left(x,y\right)\in A\times C\subseteq B\times D$, so $y\in D$. Hence $C\subseteq D$.
Hence for $A\neq\emptyset$ and $C\neq\emptyset$, we have that $A\subseteq B$ and $C\subseteq D$ gives $A\times C\subseteq B\times D$, and $A\times C\subseteq B\times D$ gives $A\subseteq B$ and $C\subseteq D$. Hence we have $$\begin{equation*} A\times C\subseteq B\times D\iff A\subseteq B\text{ and } C\subseteq D \end{equation*}$$
-
If
$A\subseteq B$ then $A\times C\subseteq B\times C$: Let
$A$ be such that $A\subseteq B$. We have for any set $C$ that $C\subseteq C$, hence by the previous property we know that $$\begin{equation*} A\subseteq B\text{ and } C\subseteq C\Rightarrow A\times C\subseteq B\times C \end{equation*}$$
-
If
$C\subseteq D$ then $A\times C\subseteq A\times D$: Let
$C$ be such that $C\subseteq D$. We have that $A\subseteq A$ and so by property 7 we have that $$\begin{equation*} A\subseteq A\text{ and } C\subseteq D\Rightarrow A\times C\subseteq A\times D \end{equation*}$$
-
$A\times\left(B\setminus C\right)=\left(A\times B\right)\setminus\left(A\times C\right)$: Let
$\left(x,y\right)\in A\times\left(B\setminus C\right)$. This holds if and only if $x\in A$ and $y\in B\setminus C$, where $y\in B\setminus C$ means that $y\in B$ and $y\not\in C$. Now, $x\in A$ and $y\in B$ and $y\not\in C$ happens if and only if $\left(x,y\right)\in A\times B$ and $\left(x,y\right)\not\in A\times C$. Hence by definition of the difference of two sets, this holds if and only if $\left(x,y\right)\in \left(A\times B\right)\setminus\left(A\times C\right)$. -
$\left(A\setminus B\right)\times C = \left(A\times C\right)\setminus\left( B\times C\right)$: Let
$\left(x,y\right)\in \left(A\setminus B\right)\times C$. This holds if and only if $x\in A\setminus B$ and $y\in C$, where $x\in A\setminus B$ means that $x\in A$ and $x\not\in B$. Now, $x\in A$ and $x\not\in B$ and $y\in C$ occurs if and only if $\left(x,y\right)\in A\times C$ and $\left(x,y\right)\not\in B\times C$. Hence by definition this holds if and only if $\left(x,y\right)\in\left(A\times C\right)\setminus\left( B\times C\right)$. -
$\left(A\times B\right)\setminus\left(C\times D\right)=\left(A\times\left(B\setminus D\right)\right)\cup\left(\left(A\setminus C\right)\times B\right)$: Let
$\left(x,y\right)\in \left(A\times B\right)\setminus\left(C\times D\right)$. This holds if and only if $\left(x,y\right)\in A\times B$ and $\left(x,y\right)\not\in C\times D$, which happens if and only if $x\in A$ and $y\in B$ and ($x\not\in C$ or $y\not\in D$). This means that either $x\in A$ and $y\in B$ and $x\not\in C$, or $x\in A$ and $y\in B$ and $y\not\in D$. In the first case we have that $x\in A\setminus C$ and $y\in B$, in the second case we have $x\in A$ and $y\in B\setminus D$. Now by the definition of the Cartesian product, $x\in A\setminus C$ and $y\in B$ gives us that $\left(x,y\right)\in \left(A\setminus C\right)\times B$, and $x\in A$ and $y\in B\setminus D$ gives us $\left(x,y\right)\in A\times \left(B\setminus D\right)$.
Each of these steps can be reversed, from which we deduce that $\left(x,y\right)\in \left(A\times B\right)\setminus\left(C\times D\right)$ if and only if $\left(x,y\right)\in \left(A\times \left(B\setminus D\right)\right)\cup\left(\left(A\setminus C\right)\times B\right)$. -
Suppose
$A\subseteq C$ and $B\subseteq D$ and consider $C\setminus A$ and $D\setminus B$. We have $$\begin{align*} \left(C\setminus A\right)\times D &= \left(C\times D\right)\setminus\left(A\times D\right)\\ C\times\left(D\setminus B\right) &=\left(C\times D\right)\setminus \left(C\times B\right) \end{align*}$$
Recall that
$C\setminus A=\left\{x: x\in C\text{ and } x\not\in A\right\}$. Now we have by property 11 that $$\begin{equation*} \left(C\setminus A\right)\times D= \left(C\times D\right)\setminus \left(A\times D\right) \end{equation*}$$
Likewise, by property 10 we have that
$$\begin{equation*} C\times\left(D\setminus B\right)= \left(C\times D\right)\setminus \left(C\times B\right) \end{equation*}$$
Hence the result has been shown. $\qed$ :::
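
As a sanity check rather than a proof, a few of these Cartesian product identities can be tested on small finite sets. The sketch below is in Python; the helper `cart` and the particular sets `A`, `B`, `C`, `D` are illustrative choices, not part of the text above.

```python
from itertools import product


def cart(X, Y):
    """Cartesian product of two finite sets, as a set of ordered pairs."""
    return set(product(X, Y))


# Arbitrary small sets chosen only for illustration.
A, B, C, D = {1, 2, 3}, {2, 3}, {3, 4}, {1, 4}

# (A u B) x (C u D) splits into four products.
union_ok = cart(A | B, C | D) == cart(A, C) | cart(B, D) | cart(A, D) | cart(B, C)

# A x (B \ C) = (A x B) \ (A x C).
diff_ok = cart(A, B - C) == cart(A, B) - cart(A, C)

# (A x B) \ (C x D) = (A x (B \ D)) u ((A \ C) x B).
prod_diff_ok = cart(A, B) - cart(C, D) == cart(A, B - D) | cart(A - C, B)

print(union_ok, diff_ok, prod_diff_ok)
```

A check of this kind only confirms the identities for the chosen sets; the general statements still rest on the element-wise arguments in the proof.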
Power Set
We make one final definition of an elementary operation for sets.
::: definition Definition 34. Power set
Let S be a set. We define the power set of the set S, denoted
P\left(S\right) to be the set which contains all of the possible
subsets of S.
:::
::: example
Example 25. Let S=\left\{1,2,3\right\} then we have that
$$\begin{equation*} P\left(S\right)=\left\{\emptyset,\left\{1\right\},\left\{2\right\},\left\{3\right\},\left\{1,2\right\},\left\{1,3\right\},\left\{2,3\right\},S\right\} \end{equation*}$$ :::
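
For a finite set the power set can be listed mechanically. The following Python sketch is one way to do so; the function name `power_set` is our own, and subsets are stored as `frozenset` objects because Python sets cannot contain ordinary mutable sets.

```python
from itertools import chain, combinations


def power_set(s):
    """Return the power set of a finite set as a set of frozensets."""
    items = list(s)
    # Collect the subsets of size 0, 1, ..., len(items).
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return {frozenset(sub) for sub in subsets}


P = power_set({1, 2, 3})
print(len(P))  # 8
```

The count of 8 matches the eight subsets listed in the example for $S=\left\{1,2,3\right\}$.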
Set Partitions
Recall the idea of disjoint sets, that is if X and Y are sets then
X and Y are disjoint if X\cap Y=\emptyset. This is saying that X
and Y have no elements in common. Now suppose we have a set S such
that X\cup Y=S but X\cap Y=\emptyset. Then S is made of two
distinct pieces. Of course there is nothing special about S being made
of only two pieces, it could be made of many pieces. We capture
this idea in the next definition.
::: definition Definition 35. Partition of a set
Let S be a set and let \mathbb{S} be a set of subsets of
S. We say that \mathbb{S} is a partition of S if the following
hold.
-
$\forall S_1,S_2\in\mathbb{S}$ we have $S_1\cap S_2=\emptyset$ whenever $S_1\neq S_2$.
-
Taking the union of every $T\in\mathbb{S}$ gives us $S$, that is $$\begin{equation*} S=\bigcup_{T\in\mathbb{S}} T \end{equation*}$$
-
$\forall T\in\mathbb{S}$ we have that $T\neq\emptyset$.
If the number of sets in $\mathbb{S}$ is finite with say $n$ elements
then we call $\mathbb{S}$ an $n$-component partition.
:::
::: example
Example 26. Let S=\left\{1,2,3,4\right\} and let
S_1=\left\{2,4\right\} and S_2=\left\{1,3\right\}. Then S_1 and
S_2 partition S. Interestingly we have that S_1^C=S_2 and
S_2^C = S_1, so the complements of these sets still form a partition.
If instead we have S_3 = \left\{1\right\} and
S_4=\left\{2,3,4\right\} then we also have a partition where the
complements are also a partition. Now if S_5=\left\{2\right\},
S_6=\left\{1,3\right\} and S_7=\left\{4\right\} then S_5,S_6 and
S_7 is a partition of S.
:::
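
The three conditions in the definition of a partition can be checked directly for finite sets. The Python sketch below does this; the function name `is_partition` is hypothetical, and duplicate blocks are collapsed first so that the collection behaves like a set of subsets.

```python
def is_partition(blocks, s):
    """Check the three partition conditions for a finite collection of sets."""
    # Treat the collection as a set of subsets, so duplicates collapse.
    blocks = list({frozenset(b) for b in blocks})
    nonempty = all(len(b) > 0 for b in blocks)
    pairwise_disjoint = all(
        not (blocks[i] & blocks[j])
        for i in range(len(blocks))
        for j in range(i + 1, len(blocks))
    )
    covers = frozenset().union(*blocks) == frozenset(s)
    return nonempty and pairwise_disjoint and covers


S = {1, 2, 3, 4}
print(is_partition([{2, 4}, {1, 3}], S))       # True
print(is_partition([{2}, {1, 3}, {4}], S))     # True
print(is_partition([{1, 2}, {2, 3, 4}], S))    # False, the blocks overlap
```

The first two calls correspond to partitions appearing in the example above, while the third fails because the two blocks share the element 2.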
The fact that in the first two cases we had two sets partitioning S
where the complements also partitioned S is not a coincidence.
::: proposition Proposition 13. Complements of 2-component partition is partition
Let S be a set and let A\subseteq S and B\subseteq S. We have that
A and B form a $2$-component partition of S
if and only if A^C and B^C form a $2$-component partition of S.
Proof:
\left(\Rightarrow\right): Suppose that A\subseteq S and
B\subseteq S partition S. By definition we have that
-
$A\cap B = \emptyset$
-
$A\cup B = S$
-
$A\neq\emptyset$ and $B\neq \emptyset$
We need to show that A^C and B^C is a partition that is
-
$A^C\cap B^C = \emptyset$
-
$A^C\cup B^C = S$
-
$A^C\neq\emptyset$ and $B^C\neq \emptyset$
-
$A^C\cap B^C = \emptyset$: As
$A\cup B = S$ we have on taking the complement of both sides that $$\begin{align*} A\cup B &= S\\ \left(A\cup B\right)^C &= S^C\\ A^C\cap B^C &= \emptyset \end{align*}$$
So
$A^C\cap B^C = \emptyset$.
-
$A^C\cup B^C = S$: Likewise as
$A\cap B = \emptyset$ then on taking the complement of both sides we have that $$\begin{align*} A\cap B &= \emptyset\\ \left(A\cap B\right)^C &= \emptyset^C\\ A^C\cup B^C &= S \end{align*}$$
So
$A^C\cup B^C = S$.
-
$A^C\neq\emptyset$ and $B^C\neq \emptyset$: Suppose that
$A^C = \emptyset$ then by taking the complement of both sides we have that $A=S$, which together with $A\cap B=\emptyset$ implies $B=\emptyset$, which is a contradiction as $A$ and $B$ partition $S$. Likewise if we suppose that $B^C=\emptyset$ we will have to conclude that $A=\emptyset$ which will be a contradiction. It thus follows that neither $A^C$ nor $B^C$ can be empty. Hence
$A^C\neq\emptyset$ and $B^C\neq \emptyset$.
It follows that $A^C$ and $B^C$ is a partition of $S$.
\left(\Leftarrow\right): Suppose that A^C and B^C is a partition
of S. We have that A^C\subseteq S and B^C\subseteq S. By the
previous part we have that \left(A^C\right)^C and \left(B^C\right)^C
is a partition of S. However \left(A^C\right)^C=A and
\left(B^C\right)^C=B. Thus A and B is a partition of $S$.
The result now follows. $\qed$ :::
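
The equivalence in the proposition can be checked exhaustively on a small set. The Python sketch below runs over every pair of subsets of a four-element set and compares the two sides; the set `S` and the helper `is_two_partition` are illustrative assumptions, and complements are taken relative to `S`.

```python
from itertools import combinations

S = frozenset({1, 2, 3, 4})


def is_two_partition(a, b, universe):
    """True when a and b are non-empty, disjoint, and together cover the universe."""
    return bool(a) and bool(b) and not (a & b) and (a | b) == universe


# Every subset of S, as frozensets.
subsets = [frozenset(c) for r in range(len(S) + 1) for c in combinations(S, r)]

# For every pair (A, B), compare the test on (A, B) with the test on (A^C, B^C).
agree = all(
    is_two_partition(a, b, S) == is_two_partition(S - a, S - b, S)
    for a in subsets
    for b in subsets
)
print(agree)  # True
```

This confirms the statement only for this particular `S`; the proof above is what covers an arbitrary set.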
There are some additional results we can state about partitions that relate to the operations we can do on sets. We will require the following lemma.
::: lemma Lemma 2. Set difference and intersection are disjoint sets
Let S and T be two sets. We have that S\setminus T and S\cap T
are disjoint sets, which is to say that
$$\begin{equation} \left(S\setminus T\right)\cap \left(S\cap T\right)=\emptyset \end{equation*}$$*
Proof:
Suppose that x\in \left(S\setminus T\right)\cap \left(S\cap T\right)
then by definition x\in S\setminus T and x\in S\cap T. As
x\in S\setminus T then we have that x\in S and x\not\in T,
likewise as x\in S\cap T then x\in S and x\in T. It is clear that
no such x can exist hence
\left(S\setminus T\right)\cap \left(S\cap T\right)=\emptyset.
:::
A brief look at Zermelo--Fraenkel set theory
At the start of this section we introduced the idea of Zermelo--Fraenkel set theory. This is the complete formalisation of set theory and the true bedrock of mathematics. The Zermelo--Fraenkel set theory axioms, henceforth referred to as ZF, are given as follows.
::: definition Definition 36. Zermelo--Fraenkel set theory axioms
The Zermelo-Fraenkel set theory axioms are the following.
-
The axiom of extensionality:
The axiom of extensionality asserts that two sets are equal if and only if they contain the same elements.
-
The axiom of the empty-set:
The axiom of the empty-set asserts that there exists a set which contains no elements.
-
The axiom of pairing:
The axiom of pairing asserts that given any set $A$ and any set $B$, there is a set $C$ such that, given any set $D$, $D$ is a member of $C$ if and only if $D$ is equal to $A$ or $D$ is equal to $B$. This is to say, given two sets, there is a set whose members are exactly the two given sets.
-
The axiom of specification:
The axiom of specification asserts that given any set $A$ and any condition, we can construct the subset of $A$ whose elements are exactly those elements of $A$ which satisfy the given condition.
-
The axiom of unions:
The axiom of unions asserts that we can perform the union of two sets $A$ and $B$. More generally, for any set of sets there is a set whose elements are exactly the elements of its members.
-
The axiom of powers:
The axiom of powers asserts that for any set $S$ we can construct a set $P\left(S\right)$ whose elements are all the possible subsets of $S$.
-
The axiom of infinity:
The axiom of infinity asserts that there is at least one infinite set $A$, that is at least one set with infinitely many elements. That is, we have a set $A$ such that $\emptyset\in A$ and if $x\in A$ then the set $x\cup\left\{x\right\}$ is also in $A$.
-
The axiom of replacement:
We will need the next section to fully understand this axiom, however informally it asserts that for any set $S$ we can form another set by replacing the elements of $S$ by other sets according to any definite rule.
-
The axiom of foundation:
The axiom of foundation asserts that for every non-empty set $S$, there exists an element $x\in S$ such that $x$ and $S$ are disjoint. This also implies that no set can contain itself. :::
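
The step $x\mapsto x\cup\left\{x\right\}$ used in the axiom of infinity can be imitated for the first few sets it produces. The Python sketch below models hereditarily finite sets with `frozenset`; the name `successor` is our own, and of course no program can build the infinite set whose existence the axiom asserts.

```python
def successor(x):
    """Return x ∪ {x} for a finite set modelled as a frozenset."""
    return x | frozenset({x})


empty = frozenset()          # the empty set
one = successor(empty)       # {∅}
two = successor(one)         # {∅, {∅}}
three = successor(two)       # {∅, {∅}, {∅, {∅}}}

print(len(empty), len(one), len(two), len(three))  # 0 1 2 3
print(empty in one, one in two, two in three)      # True True True
```

Each set produced contains all of the previous ones as elements, which is why repeating the step never terminates.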
There is also one axiom which we have left off. This is the controversial axiom of choice.
::: definition Definition 37. The axiom of choice
Let S be a set of non-empty sets. The axiom of choice asserts that
there is a way to pick an element of each of the sets in S.
:::
With the axiom of choice we have the following
::: definition Definition 38. ZFC axioms
The axioms of ZF along with the axiom of choice gives us the ZFC axioms :::
We can already see that our "hands-on" approach to set theory has somewhat indirectly captured the essence of the ZF axioms. We can use the ZF axioms to prove in a truly rigorous way what we did with our "hands-on" approach. Although this is an interesting field of study in itself, we will not really need to use the ZF axioms, although occasionally we may rely on choice.
There is one other thing that needs bringing up. ZFC has one more
component, as the axioms alone are not enough to prove anything. We need
the notion of membership, that is being an element of a set. That is we
include the symbol \in along with the axioms, where \in takes on the
meaning we defined earlier. With this we can in theory use ZFC to start
proving and building up mathematics from the bedrock.
Mappings
Introduction and basic definitions
Now that we have the idea of a set, what can we use it for? Many areas of
mathematics can be broken down into the theory of sets, in particular
how we can get from one set to another. Without this idea we wouldn't be
able to get very far at all. As an example, you may have seen, in a
calculus course say, the idea of a function f\left(x\right),
say f\left(x\right)=x^2 where x can be any number we choose. Say
x=2 then f\left(2\right)=4. You may have also seen functions where
we are not allowed to use any number we wish. For example, if we take
f\left(x\right)=\sqrt{x} then we are only allowed non-negative numbers if
we want to find an answer using the numbers we are familiar with, such
as 1, 88.125, \pi,\sqrt{2} etc. This set we will denote by
\mathbb{R}. The alert reader may now see how sets will come into play.
To define in a rigorous way the ideas of f\left(x\right)=x^2 and other
such functions, we need to consider what the allowable inputs are, which
once done will give us the possible outputs. That is if we have a set
whose elements are inputs and we define some form of function, which we
will now call a map, then we will get another set whose elements are
what the inputs will be 'mapped' to.
::: definition Definition 39. Mapping
Let X and Y be sets. Suppose we have some rule or description,
which we will denote by f, by which for each x\in X there is exactly one
element f\left(x\right)\in Y. We say that the rule (description) is a
mapping or map or function from X to Y. We denote a mapping with the
following notation
$$\begin{align*} f:X&\mathlarger{\mathlarger{\rightarrow}}Y\\ x&\mapsto f\left(x\right) \end{align*}$$
where the first line tells us what sets the mapping is between, and the
bottom line tells us where each element x\in X gets mapped to.
:::
::: definition Definition 40. Domain
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping between
two sets X and Y. We say that the set X is the domain of the
mapping f. The domain contains the elements which the map can act on.
We can write this as
$$\begin{equation*} \mathop{\mathrm{Dom}}\left(f\right)=X \end{equation*}$$ :::
::: definition Definition 41. Co-Domain
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping between
two sets X and Y. We say that the set Y is the Co-domain of the
mapping f. The co-domain contains the possible elements that the map
can send elements of X to. We can write this as
$$\begin{equation*} \mathop{\mathrm{Cdm}}\left(f\right)=Y \end{equation*}$$ :::
We have some examples of mappings.
::: example
Example 27. Let X=\left\{1,2,3\right\} and let Y=X. Define the
map
$$\begin{align*} f:X&\mathlarger{\mathlarger{\rightarrow}}Y\\ x&\mapsto f\left(x\right)=x \end{align*}$$
To see what f does we will take each element of X one at a time.
Starting with 1 we have that 1\mapsto f\left(1\right)=1, for 2 we
have 2\mapsto f\left(2\right)=2 and finally
3\mapsto f\left(3\right)=3. Hence the map f takes an element of X
and leaves it alone. A map which takes every element of its domain and
leaves it alone is called an identity map, or if you prefer the do
nothing at all map.
:::
::: {#exmp:Mapping 1 .example}
Example 28. Let X=Y=\mathbb{N}. Let f be the map given by
$$\begin{align*} f:X&\mathlarger{\mathlarger{\rightarrow}}Y\\ x&\mapsto f\left(x\right)=2x \end{align*}$$
It is clear to see that every element in the domain gets doubled, i.e.
f\left(1\right)=2, f\left(2\right)=4, f\left(3\right)=6 and so
on.
:::
A map does not need to be given by an explicit mathematical formula.
::: example
Example 29. Let
A=\text{The set of all humans currently alive on planet earth}, from
which it should be clear to see that \text{You}\in A. Let
B=\left\{0,1\right\}. Let f be the mapping given by
$$\begin{align*} f:A&\mathlarger{\mathlarger{\rightarrow}}B\\ a&\mapsto f\left(a\right)= \begin{cases} 1, & \text{if } a \text{ has hair on their head}\\ 0, & \text{if } a \text{ does not have hair on their head} \end{cases} \end{align*}$$
Then f is a map which indicates if a given person has hair on their
head or not.
:::
The above definition of a mapping can be made more general.
::: definition Definition 42. Piecewise mapping
Let f:X\rightarrow Y be a mapping. We say that f is a piecewise
mapping if we need multiple rules or descriptions to fully describe f.
That is, we wish to define the mapping using different rules based on the
input. If for each of these input ranges we define a mapping
g_1,g_2,g_3,\dots then we can write the piecewise function as follows
$$\begin{align*} f:X&\rightarrow Y\\ x&\mapsto f\left(x\right)=\begin{cases} g_1\left(x\right), & \text{Condition for }g_1\\ g_2\left(x\right), & \text{Condition for }g_2\\ g_3\left(x\right), & \text{Condition for }g_3\\ \dots \end{cases} \end{align*}$$ :::
::: example
Example 30. Let f:\mathbb{N}\rightarrow\mathbb{N} be defined by
$$\begin{align*} f:\mathbb{N}&\rightarrow\mathbb{N}\\ x &\mapsto f\left(x\right) = \begin{cases} 2x, & \text{if } x <5\\ 5x, & \text{otherwise} \end{cases} \end{align*}$$
We have that f\left(1\right)=2, f\left(2\right)=4 and so on up to
f\left(4\right)=8, then f\left(5\right)=25 and so on.
:::
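
A piecewise mapping translates naturally into a function with a conditional. The short Python sketch below mirrors the rule "double the input when it is less than 5, otherwise multiply it by 5"; the name `f` simply follows the example, and Python integers stand in for the naturals.

```python
def f(x: int) -> int:
    """Piecewise map: double inputs below 5, multiply all others by 5."""
    if x < 5:
        return 2 * x
    return 5 * x


print([f(x) for x in range(1, 7)])  # [2, 4, 6, 8, 25, 30]
```

Each branch plays the role of one of the mappings $g_1, g_2, \dots$ from the definition, and the `if` test is the condition attached to it.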
We make one more definition that will be useful throughout the rest of the text.
::: definition Definition 43. Closure of a mapping
Let X be a set. If we have a mapping f:X^n\rightarrow X, then
we say the mapping has closure on the set X, or we say that f is a
closed mapping.
:::
The image and pre-image
We now define a more technical notion of how a mapping f maps an
element in the domain to the co-domain.
::: definition Definition 44. Image of an element
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping between
two sets X and Y, and let x\in X be an element of the domain. We
say that f\left(x\right)\in Y is the image of the element x.
:::
This in turn allows us to define the subset of the co-domain that the
elements x\in X get mapped to.
::: {#def:ImageMapping .definition} Definition 45. Image of a mapping
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping between
two sets X and Y. We define the set
$$\begin{equation*}
\mathop{\mathrm{Image}}\left(f\right)=f\left(X\right)=\left\{f\left(x\right):x\in X\right\}\subseteq Y
\end{equation*}$$ to be the image of the domain, sometimes called the
range of f. That is the image is the set of all possible outputs of
the mapping f with the domain X.
Moreover, suppose that A\subseteq X then we define the image of the
subset A to be
$$\begin{equation*} f\left(A\right)=\left\{f\left(x\right):x\in A\right\}\subseteq f\left(X\right)\subseteq Y \end{equation*}$$
That is we can consider the image of subsets of X.
:::
::: example
Example 31. Consider the mapping in example
[28](#exmp:Mapping 1){reference-type="ref" reference="exmp:Mapping 1"},
we have that X=Y=\mathbb{N} and f is the map
$$\begin{align*} f:X&\mathlarger{\mathlarger{\rightarrow}}Y\\ x&\mapsto f\left(x\right)=2x \end{align*}$$
then we have that $\mathop{\mathrm{Image}}\left(f\right)=f\left(\mathbb{N}\right)=\left\{2x:x\in\mathbb{N}\right\}$ :::
::: example
Example 32. Let f be an arbitrary mapping such that
f:\emptyset\mathlarger{\mathlarger{\rightarrow}}Y for some set Y.
What is \mathop{\mathrm{Image}}\left(f\right)? By
definition 45{reference-type="ref"
reference="def:ImageMapping"} of the image of a mapping, we have that
$$\begin{equation*} \mathop{\mathrm{Image}}\left(f\right)=\left\{f\left(x\right):x\in\emptyset\right\} \end{equation*}$$
However, we know that the empty set has no elements, so there are no
elements for f to map, so
\mathop{\mathrm{Image}}\left(f\right)=\emptyset.
:::
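
For a finite subset of the domain, the image can be computed with a set comprehension. In the Python sketch below the helper `image` and the map `double` are illustrative names; a finite range stands in for the naturals, so this only samples the doubling map.

```python
def image(f, A):
    """Image f(A) = {f(x) : x in A} of a finite set A under a map f."""
    return {f(x) for x in A}


def double(x):
    return 2 * x


print(sorted(image(double, range(4))))  # [0, 2, 4, 6]
print(image(double, set()))             # set()
```

The second call illustrates the empty-domain case: with no inputs to map, the image is empty.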
Likewise, we can ask which elements of the domain are mapped to a given element of the co-domain. This is called the pre-image.
::: definition Definition 46. Pre-image of an element
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping between
two sets X and Y, and let y\in Y be an element of the co-domain.
If x\in X is such that f\left(x\right)=y then we say that x is a
pre-image of the element y and we denote this f^{-1}\left(y\right).
:::
This in turn allows us to define the subset of the domain whose
elements get mapped into a given subset of the co-domain.
::: {#def:PreImageMapping .definition} Definition 47. Pre-image of a mapping
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping between
two sets X and Y. We define the set
$$\begin{equation*} \mathop{\mathrm{PreImage}}\left(f\right)=f^{-1}\left(Y\right)=\left\{x\in X:f\left(x\right)\in Y\right\}\subseteq X \end{equation*}$$ to be the pre-image of the co-domain. That is the pre-image is the set of all possible inputs that give the given outputs.
Moreover, suppose that B\subseteq Y then we define the pre-image of
the subset B to be
$$\begin{equation*} f^{-1}\left(B\right)=\left\{x\in X:f\left(x\right)\in B\right\}\subseteq f^{-1}\left(Y\right)\subseteq X \end{equation*}$$ :::
::: example
Example 33. Consider the mapping
f:\mathbb{N}\rightarrow\mathbb{R} given by
$$\begin{align*} f:\mathbb{N}&\rightarrow\mathbb{R}\\ x&\mapsto f\left(x\right)=\frac{x}{2} \end{align*}$$
and consider the subset \mathbb{N}\subseteq\mathbb{R} of the co-domain.
We have that \frac{x}{2} is a natural number only when x is
an even number, hence the pre-image of \mathbb{N} must consist of the even numbers.
$$\begin{equation*} f^{-1}\left(\mathbb{N}\right)=\left\{x\in\mathbb{N}:\frac{x}{2}\in\mathbb{N}\right\}=\left\{0,2,4,6,8,\dots\right\} \end{equation*}$$ :::
::: example
Example 34. Consider the mapping
f:\mathbb{N}\rightarrow\mathbb{N} given by
$$\begin{align} f:\mathbb{N}&\rightarrow\mathbb{N}\ x&\mapsto f\left(x\right)=x^2 \end{align*}$$*
We have that the pre-image is given by
$$\begin{equation*} \mathop{\mathrm{PreImage}}\left(f\right)=\left\{x\in\mathbb{N}:x^2\in\mathbb{N}\right\}=\left\{0,1,2,3,4,\dots\right\}=\mathbb{N} \end{equation*}$$ :::
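
Pre-images of subsets can be computed in the same spirit for a finite domain. In the Python sketch below the helper `preimage` is a hypothetical name, and the finite set `X` is only a stand-in for an initial segment of the naturals.

```python
def preimage(f, X, B):
    """Pre-image f^{-1}(B) = {x in X : f(x) in B} for a finite domain X."""
    return {x for x in X if f(x) in B}


X = set(range(10))          # finite stand-in for the naturals
naturals = set(range(100))  # enough naturals to hold every output used below

# Halving lands in the naturals only for even inputs.
evens = preimage(lambda x: x / 2, X, naturals)
print(sorted(evens))  # [0, 2, 4, 6, 8]

# Squaring always lands in the naturals, so every input is kept.
print(preimage(lambda x: x * x, X, naturals) == X)  # True
```

Note that the pre-image is defined through membership of $f\left(x\right)$ in $B$, so no inverse function is ever needed to compute it.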
With these definitions we can make the following observations.
::: {#prop:PropertyImagePreImage .proposition} Proposition 14. Properties of the image and pre-image
Let f:X\rightarrow Y be a mapping and let A\subseteq X and
B\subseteq Y. We have that the following properties hold for the image
and pre-image
-
$f\left(X\right)\subseteq Y$
-
$f\left(f^{-1}\left(Y\right)\right)=f\left(X\right)$
-
$f\left(f^{-1}\left(B\right)\right)\subseteq B$
-
$f\left(f^{-1}\left(B\right)\right)=B\cap f\left(X\right)$
-
$f\left(f^{-1}\left(f\left(A\right)\right)\right)=f\left(A\right)$
-
$f\left(A\right)=\emptyset\iff A=\emptyset$
-
$B\subseteq f\left(A\right)\iff\exists C\subseteq A: f\left(C\right)=B$
-
$f\left(X\setminus A\right)\subseteq f\left(A\right)\iff f\left(A\right)=f\left(X\right)$
-
$f\left(X\right)\setminus f\left(A\right)\subseteq f\left(X\setminus A\right)$
-
$f\left(A\cup f^{-1}\left(B\right)\right)\subseteq f\left(A\right)\cup B$
-
$f\left(A\cap f^{-1}\left(B\right)\right)= f\left(A\right)\cap B$
Likewise the following properties hold for the pre-image
-
$f^{-1}\left(Y\right)=X$
-
$f^{-1}\left(f\left(X\right)\right)=X$
-
$A\subseteq f^{-1}\left(f\left(A\right)\right)$
-
Suppose that instead of the mapping
$f:X\rightarrow Y$ we consider a new mapping based on $f$, which we call $\Bar{f}$. We define $\Bar{f}$ to be the mapping $$\begin{align*} \Bar{f}:A&\mathlarger{\mathlarger{\rightarrow}}Y\\ x&\mapsto \Bar{f}\left(x\right)=f\left(x\right) \end{align*}$$
that is
$\Bar{f}$ maps every element $a\in A$ to $f\left(a\right)$. With this new mapping we have the following property $$\begin{equation*} \left(\Bar{f}\right)^{-1}\left(B\right)=A\cap f^{-1}\left(B\right) \end{equation*}$$
-
$f^{-1}\left(f\left(f^{-1}\left(B\right)\right)\right)=f^{-1}\left(B\right)$
-
$f^{-1}\left(B\right)=\emptyset\iff B\subseteq Y\setminus f\left(X\right)$
-
$A\subseteq f^{-1}\left(B\right)\iff f\left(A\right)\subseteq B$
-
$f^{-1}\left(Y\setminus B\right)\subseteq f^{-1}\left(B\right)\iff f^{-1}\left(B\right)=X$
-
$f^{-1}\left(Y\setminus B\right)= X\setminus f^{-1}\left(B\right)$
-
$A\cup f^{-1}\left(B\right)\subseteq f^{-1}\left(f\left(A\right)\cup B\right)$
-
$A\cap f^{-1}\left(B\right)\subseteq f^{-1}\left(f\left(A\right)\cap B\right)$
Proof:
We start with the properties of the image.
-
f\left(X\right)\subseteq Y:This holds by definition of the image.
-
$f\left(f^{-1}\left(Y\right)\right)=f\left(X\right)$: Let
$x\in f\left(f^{-1}\left(Y\right)\right)$ and recall the definition of the image and pre-image. $$\begin{align*} f\left(A\right)&=\left\{f\left(x\right):x\in A\right\}\subseteq f\left(X\right)\subseteq Y\\ f^{-1}\left(B\right)&=\left\{x\in X:f\left(x\right)\in B\right\}\subseteq f^{-1}\left(Y\right)\subseteq X \end{align*}$$
We have that
$$\begin{equation*} f\left(f^{-1}\left(Y\right)\right)=\left\{f\left(y\right):y\in f^{-1}\left(Y\right)\right\} \end{equation*}$$
Hence
$x\in f\left(f^{-1}\left(Y\right)\right)$ means that $x=f\left(y\right)$ for some $y\in f^{-1}\left(Y\right)$. By the definition of the pre-image we have that $f^{-1}\left(Y\right)\subseteq X$, so $y\in X$. It thus follows that $x\in f\left(X\right)$ and so $f\left(f^{-1}\left(Y\right)\right)\subseteq f\left(X\right)$.
Now suppose that
$x\in f\left(X\right)$, that is $x=f\left(x'\right)$ for some $x'\in X$. Now by definition of the pre-image, as $x'\in X$ with $f\left(x'\right)\in Y$ we have that $x'\in f^{-1}\left(Y\right)$. Hence by definition of the set $f\left(f^{-1}\left(Y\right)\right)$ we must conclude that $f\left(x'\right)\in f\left(f^{-1}\left(Y\right)\right)$, which is to say $x\in f\left(f^{-1}\left(Y\right)\right)$. Hence $f\left(X\right)\subseteq f\left(f^{-1}\left(Y\right)\right)$.
It follows that
$f\left(f^{-1}\left(Y\right)\right)=f\left(X\right)$. -
$f\left(f^{-1}\left(B\right)\right)\subseteq B$: Suppose that
$x\in f\left(f^{-1}\left(B\right)\right)$ where $B\subseteq Y$. We hence have that $x=f\left(b\right)$ for some $b\in f^{-1}\left(B\right)$. By the definition of the pre-image, $b\in X$ and $f\left(b\right)\in B$, that is $x\in B$, and so $f\left(f^{-1}\left(B\right)\right)\subseteq B$. -
$f\left(f^{-1}\left(B\right)\right)=B\cap f\left(X\right)$: Let
$x\in f\left(f^{-1}\left(B\right)\right)$ then by property 3 we have that $x\in B$. Additionally, as $B\subseteq Y$ we have $f^{-1}\left(B\right)\subseteq f^{-1}\left(Y\right)$, so $f\left(f^{-1}\left(B\right)\right)\subseteq f\left(f^{-1}\left(Y\right)\right)$ and so $x\in f\left(f^{-1}\left(Y\right)\right)$. Now by property 2 we have that $f\left(f^{-1}\left(Y\right)\right)=f\left(X\right)$, thus $x\in f\left(X\right)$ and so $x\in B\cap f\left(X\right)$. It follows that $f\left(f^{-1}\left(B\right)\right)\subseteq B\cap f\left(X\right)$.
Now suppose that
$x\in B\cap f\left(X\right)$. By definition of $f\left(X\right)$, $x\in f\left(X\right)$ gives us that $x=f\left(x'\right)$ where $x'\in X$, moreover we also have that $x\in B$. Now the set $f\left(f^{-1}\left(B\right)\right)$ is given by $$\begin{equation*} f\left(f^{-1}\left(B\right)\right)=\left\{f\left(b\right):b\in f^{-1}\left(B\right)\right\} \end{equation*}$$
We have that
$f\left(x'\right)=x\in B$ and so $x'\in f^{-1}\left(B\right)$, hence by definition of the image we have that $x=f\left(x'\right)\in f\left(f^{-1}\left(B\right)\right)$. It follows that $B\cap f\left(X\right)\subseteq f\left(f^{-1}\left(B\right)\right)$.
Hence the result
$f\left(f^{-1}\left(B\right)\right)=B\cap f\left(X\right)$. -
$f\left(f^{-1}\left(f\left(A\right)\right)\right)=f\left(A\right)$: By property
4 we have that $$\begin{equation*} f\left(f^{-1}\left(f\left(A\right)\right)\right)=f\left(A\right)\cap f\left(X\right) \end{equation*}$$ as
$f\left(A\right)\subseteq Y$. Finally $f\left(A\right)\cap f\left(X\right)=f\left(A\right)$ as $f\left(A\right)\subseteq f\left(X\right)$. The result follows. -
$f\left(A\right)=\emptyset\iff A=\emptyset$: $\left(\Rightarrow\right)$: Suppose that $f\left(A\right)=\emptyset$. By definition of the image we have that $$\begin{equation*} f\left(A\right)=\left\{f\left(x\right):x\in A\right\} \end{equation*}$$ By set equality we must have that
$f\left(A\right)=\left\{f\left(x\right):x\in A\right\}=\emptyset$. Hence there can be no elements $f\left(x\right)$ where $x\in A$, which can only occur if $A=\emptyset$, for if not then $f\left(A\right)$ has at least one element $f\left(x'\right)$ for some $x'\in A$, contradicting the fact that $f\left(A\right)=\emptyset$. It follows that $A=\emptyset$.
$\left(\Leftarrow\right)$: Suppose that $A=\emptyset$, we have that the image of the empty set is given by $$\begin{equation*} f\left(A\right)=f\left(\emptyset\right)=\left\{f\left(x\right):x\in \emptyset\right\}=\emptyset \end{equation*}$$
It follows that
$f\left(A\right)=\emptyset$. -
$B\subseteq f\left(A\right)\iff\exists C\subseteq A: f\left(C\right)=B$: $\left(\Rightarrow\right)$: Suppose that $B\subseteq f\left(A\right)$. We show that $\exists C\subseteq A: f\left(C\right)=B$. We define the required set $C$ as follows. $$\begin{equation*} C = \left\{x'\in A: f\left(x'\right)\in B\right\} \end{equation*}$$ That is
$C$ is defined to be those elements $x'\in A$ such that $f\left(x'\right)\in B$. Clearly $C\subseteq A$ as each $x'\in C$ is by construction an element of $A$, and $f\left(C\right)\subseteq B$ by construction of $C$. Conversely, suppose that $x\in B$, then $x\in f\left(A\right)$ by assumption, so by the definition of the image $x=f\left(x'\right)$ for some $x'\in A$. As $f\left(x'\right)=x\in B$ we have $x'\in C$ and so $x\in f\left(C\right)$. Hence $B\subseteq f\left(C\right)$ and so $f\left(C\right)=B$.
$\left(\Leftarrow\right)$: Suppose that $\exists C\subseteq A: f\left(C\right)=B$ and let $x\in B$. As $f\left(C\right)=B$ we have by the definition of the image that $$\begin{equation*} f\left(C\right)=\left\{f\left(x\right):x\in C\right\} \end{equation*}$$ that is
$x\in f\left(C\right)$ gives $x=f\left(c\right)$ for some $c\in C$. Now $C\subseteq A$ so $c\in A$. Hence $x\in f\left(A\right)$, hence we must conclude that $B\subseteq f\left(A\right)$, possibly being equal if $C=A$.
The result follows.
-
$f\left(X\setminus A\right)\subseteq f\left(A\right)\iff f\left(A\right)=f\left(X\right)$: $\left(\Rightarrow\right)$: Suppose that $f\left(X\setminus A\right)\subseteq f\left(A\right)$ and recall the definition of the complement of sets. We have that $$\begin{equation*} X\setminus A = \left\{x\in X: x\not\in A\right\} \end{equation*}$$ Now,
$A\subseteq X$ by hypothesis of the proposition, so $f\left(A\right)\subseteq f\left(X\right)$. For the reverse inclusion let $y\in f\left(X\right)$, so that $y=f\left(x\right)$ for some $x\in X$. If $x\in A$ then $y\in f\left(A\right)$. If $x\not\in A$ then $x\in X\setminus A$, so $y\in f\left(X\setminus A\right)\subseteq f\left(A\right)$ by assumption. In either case $y\in f\left(A\right)$, hence $f\left(X\right)\subseteq f\left(A\right)$, that is we must have $f\left(A\right)=f\left(X\right)$.
$\left(\Leftarrow\right)$: Suppose that $f\left(A\right)=f\left(X\right)$. By definition of the image we have that $$\begin{equation*} f\left(X\setminus A\right)=\left\{f\left(x\right): x\in X\setminus A\right\}=\left\{f\left(x\right):x\in X\text{ and } x\not\in A\right\} \end{equation*}$$ As every
$x\in X\setminus A$ is in particular an element of $X$, we conclude that $f\left(X\setminus A\right)\subseteq f\left(X\right)=f\left(A\right)$. Hence $f\left(X\setminus A\right)\subseteq f\left(A\right)$. The result has now been shown. -
f\left(X\right)\setminus f\left(A\right)\subseteq f\left(X\setminus A\right):

Let x\in f\left(X\right)\setminus f\left(A\right). By definition we have that
$$\begin{equation*} f\left(X\right)\setminus f\left(A\right)=\left\{x\in f\left(X\right):x\not\in f\left(A\right)\right\} \end{equation*}$$
Hence x\in f\left(X\right)\setminus f\left(A\right) gives us that x\in f\left(X\right) and x\not\in f\left(A\right). That is, \exists y\in X such that x=f\left(y\right), and y\not\in A, for otherwise we would have x=f\left(y\right)\in f\left(A\right). This is y\in X\setminus A. Hence it follows that x\in f\left(X\setminus A\right). That is f\left(X\right)\setminus f\left(A\right)\subseteq f\left(X\setminus A\right).
-
f\left(A\cup f^{-1}\left(B\right)\right)\subseteq f\left(A\right)\cup B:

Let x\in f\left(A\cup f^{-1}\left(B\right)\right). This is our first usage of the pre-image of a set so we recall the definition, we have that
$$\begin{equation*} f^{-1}\left(B\right)=\left\{x\in X:f\left(x\right)\in B\right\}\subseteq X \end{equation*}$$
Hence the image f\left(A\cup f^{-1}\left(B\right)\right) is given by
$$\begin{align*} f\left(A\cup f^{-1}\left(B\right)\right)&=\left\{f\left(y\right):y\in A\cup f^{-1}\left(B\right)\right\}\\ &=\left\{f\left(y\right):y\in A\text{ or } y\in f^{-1}\left(B\right)\right\}\\ &=\left\{f\left(y\right):y\in A\text{ or } \left(y\in X \text{ and } f\left(y\right)\in B\right)\right\} \end{align*}$$
Now, x\in f\left(A\cup f^{-1}\left(B\right)\right) gives us that either \exists y\in A with x=f\left(y\right), or \exists y\in X with x=f\left(y\right) and f\left(y\right)\in B. In the first case, by definition of the image we have that x\in f\left(A\right) and so x is clearly in the union f\left(A\right)\cup B. In the second case we have that x=f\left(y\right)\in B, so likewise x is in the union f\left(A\right)\cup B.

Hence x\in f\left(A\right)\cup B and we have that f\left(A\cup f^{-1}\left(B\right)\right)\subseteq f\left(A\right)\cup B. Hence the result.
-
f\left(A\cap f^{-1}\left(B\right)\right)= f\left(A\right)\cap B:

Let x\in f\left(A\cap f^{-1}\left(B\right)\right), the image of A\cap f^{-1}\left(B\right) is given by
$$\begin{align*} f\left(A\cap f^{-1}\left(B\right)\right)&=\left\{f\left(y\right):y\in A\cap f^{-1}\left(B\right)\right\}\\ &=\left\{f\left(y\right):y\in A\text{ and } y\in f^{-1}\left(B\right)\right\}\\ &=\left\{f\left(y\right):y\in A\text{ and } f\left(y\right)\in B\right\} \end{align*}$$
Now x\in f\left(A\cap f^{-1}\left(B\right)\right) gives us that \exists y\in A with x=f\left(y\right) and f\left(y\right)\in B, where it is the same element y in both conditions. Hence we clearly have that x\in f\left(A\right) and x\in B, and so x is in the intersection f\left(A\right)\cap B. Hence we have that f\left(A\cap f^{-1}\left(B\right)\right)\subseteq f\left(A\right)\cap B.

Now suppose that x\in f\left(A\right)\cap B. We have that x\in f\left(A\right) and x\in B. From the first of these, x\in f\left(A\right) means that \exists y\in A such that x=f\left(y\right). As x\in B we then have f\left(y\right)=x\in B, which by definition of the pre-image means that y\in f^{-1}\left(B\right). Hence y is in the set A\cap f^{-1}\left(B\right) and so we have x=f\left(y\right)\in f\left(A\cap f^{-1}\left(B\right)\right) and therefore f\left(A\right)\cap B\subseteq f\left(A\cap f^{-1}\left(B\right)\right).

The result f\left(A\cap f^{-1}\left(B\right)\right)= f\left(A\right)\cap B follows.
We now turn our attention to the results for the pre-image.
-
f^{-1}\left(Y\right)=X:

By definition of the pre-image we have that
$$\begin{equation*} f^{-1}\left(Y\right)=\left\{x\in X:f\left(x\right)\in Y\right\}\subseteq X \end{equation*}$$
Clearly f^{-1}\left(Y\right)\subseteq X by definition. Now if x\in X then we must also have f\left(x\right)\in Y, as Y is the co-domain of f, and so X\subseteq f^{-1}\left(Y\right). Hence f^{-1}\left(Y\right)=X.
-
f^{-1}\left(f\left(X\right)\right)=X:

The set f^{-1}\left(f\left(X\right)\right) is given by
$$\begin{equation*} f^{-1}\left(f\left(X\right)\right)=\left\{x\in X: f\left(x\right)\in f\left(X\right)\right\} \end{equation*}$$
It is hence clear that for any x\in f^{-1}\left(f\left(X\right)\right) we have x\in X, that is f^{-1}\left(f\left(X\right)\right)\subseteq X. Likewise if x\in X then clearly f\left(x\right)\in f\left(X\right) and so by the definition of f^{-1}\left(f\left(X\right)\right) we have that x\in f^{-1}\left(f\left(X\right)\right). That is X\subseteq f^{-1}\left(f\left(X\right)\right). The result follows.
-
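A\subseteq f^{-1}\left(f\left(A\right)\right):

Suppose that x\in A\subseteq X. By definition of the image we have that f\left(x\right)\in f\left(A\right). As x\in X and f\left(x\right)\in f\left(A\right), the definition of the pre-image gives x\in f^{-1}\left(f\left(A\right)\right). Hence A\subseteq f^{-1}\left(f\left(A\right)\right), giving the result.
-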
Suppose that instead of the mapping f:X\rightarrow Y we consider a new mapping based on f, which we call \Bar{f}. We define \Bar{f} to be the mapping
$$\begin{align*} \Bar{f}:A&\mathlarger{\mathlarger{\rightarrow}}Y\\ x&\mapsto \Bar{f}\left(x\right)=f\left(x\right) \end{align*}$$
that is, \Bar{f} maps every element a\in A to f\left(a\right). With this new mapping we have the following property
$$\begin{equation*} \left(\Bar{f}\right)^{-1}\left(B\right)=A\cap f^{-1}\left(B\right) \end{equation*}$$
Let x\in \left(\Bar{f}\right)^{-1}\left(B\right). We have that \left(\Bar{f}\right)^{-1}\left(B\right) is given by
$$\begin{equation*} \left(\Bar{f}\right)^{-1}\left(B\right)=\left\{x\in A:\Bar{f}\left(x\right)\in B\right\} \end{equation*}$$
So x\in \left(\Bar{f}\right)^{-1}\left(B\right) gives that x\in A. Moreover, as \Bar{f}\left(x\right)\in B and \Bar{f} maps every x\in A to f\left(x\right), we have \Bar{f}\left(x\right)=f\left(x\right)\in B. It follows that x\in f^{-1}\left(B\right) and so x\in A\cap f^{-1}\left(B\right). Thus \left(\Bar{f}\right)^{-1}\left(B\right)\subseteq A\cap f^{-1}\left(B\right).

Now, suppose that x\in A\cap f^{-1}\left(B\right). As x\in A, the value \Bar{f}\left(x\right) is defined. Now x\in f^{-1}\left(B\right) means that f\left(x\right)\in B, and as \Bar{f} maps any x\in A to f\left(x\right) we have that \Bar{f}\left(x\right)=f\left(x\right)\in B and so $x\in \left(\Bar{f}\right)^{-1}\left(B\right)$. Thus A\cap f^{-1}\left(B\right)\subseteq \left(\Bar{f}\right)^{-1}\left(B\right).

Hence $\left(\Bar{f}\right)^{-1}\left(B\right)=A\cap f^{-1}\left(B\right)$
-
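f^{-1}\left(f\left(f^{-1}\left(B\right)\right)\right)=f^{-1}\left(B\right):

Applying property 3., A\subseteq f^{-1}\left(f\left(A\right)\right), to the set A=f^{-1}\left(B\right)\subseteq X gives
$$\begin{equation*} f^{-1}\left(B\right)\subseteq f^{-1}\left(f\left(f^{-1}\left(B\right)\right)\right) \end{equation*}$$
For the reverse inclusion let x\in f^{-1}\left(f\left(f^{-1}\left(B\right)\right)\right), so that f\left(x\right)\in f\left(f^{-1}\left(B\right)\right). By definition of the image, f\left(x\right)=f\left(x'\right) for some x'\in f^{-1}\left(B\right), that is f\left(x'\right)\in B. Hence f\left(x\right)\in B and so x\in f^{-1}\left(B\right). It follows that
$$\begin{equation*} f^{-1}\left(f\left(f^{-1}\left(B\right)\right)\right)=f^{-1}\left(B\right) \end{equation*}$$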
-
f^{-1}\left(B\right)=\emptyset\iff B\subseteq Y\setminus f\left(X\right):

\left(\Rightarrow\right): Suppose f^{-1}\left(B\right)=\emptyset, by definition of the pre-image we have
$$\begin{equation*} f^{-1}\left(B\right)=\left\{x\in X:f\left(x\right)\in B\right\}=\emptyset \end{equation*}$$
Hence the pre-image being empty means that there are no elements x\in X with f\left(x\right)\in B. Now the set Y\setminus f\left(X\right) is given by
$$\begin{equation*} Y\setminus f\left(X\right)=\left\{y\in Y: y\not\in f\left(X\right)\right\} \end{equation*}$$
Let b\in B\subseteq Y. If we had b\in f\left(X\right) then b=f\left(x\right) for some x\in X, giving f\left(x\right)\in B, which cannot happen. Thus b\not\in f\left(X\right), that is b\in Y\setminus f\left(X\right), and so B\subseteq Y\setminus f\left(X\right).

\left(\Leftarrow\right): Suppose that B\subseteq Y\setminus f\left(X\right). We have that Y\setminus f\left(X\right) is precisely the set of y\in Y with y\not\in f\left(X\right). If there were some x\in f^{-1}\left(B\right), then f\left(x\right)\in B\subseteq Y\setminus f\left(X\right) and so f\left(x\right)\not\in f\left(X\right), contradicting the fact that f\left(x\right)\in f\left(X\right) for every x\in X. Hence the pre-image of B is empty. This is to say f^{-1}\left(B\right)=\emptyset.
-
A\subseteq f^{-1}\left(B\right)\iff f\left(A\right)\subseteq B:

\left(\Rightarrow\right): Suppose that A\subseteq f^{-1}\left(B\right). Recall the definition of the image
$$\begin{equation*} f\left(A\right)=\left\{f\left(x\right):x\in A\right\} \end{equation*}$$
Let y\in f\left(A\right), so that y=f\left(a\right) for some a\in A. By assumption a\in f^{-1}\left(B\right), and so y=f\left(a\right)\in B. This holds for every y\in f\left(A\right), which gives f\left(A\right)\subseteq B.

\left(\Leftarrow\right): Now, suppose that f\left(A\right)\subseteq B and let x\in A. By definition of the image f\left(x\right)\in f\left(A\right), and so f\left(x\right)\in B. As A\subseteq X we have that x\in X, and so by definition of the pre-image we have that x\in f^{-1}\left(B\right). This is to say we conclude that A\subseteq f^{-1}\left(B\right).
-
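f^{-1}\left(Y\setminus B\right)\subseteq f^{-1}\left(B\right)\iff f^{-1}\left(B\right)=X:

\left(\Rightarrow\right): Suppose that f^{-1}\left(Y\setminus B\right)\subseteq f^{-1}\left(B\right). We have that the pre-image of Y\setminus B is given by
$$\begin{equation*} f^{-1}\left(Y\setminus B\right)=\left\{x\in X: f\left(x\right) \in Y\setminus B\right\}=\left\{x\in X: f\left(x\right)\in Y \text{ and } f\left(x\right)\not\in B\right\} \end{equation*}$$
Hence x\in f^{-1}\left(Y\setminus B\right) gives us that f\left(x\right)\not\in B, but then we can't have x\in f^{-1}\left(B\right) by the definition of the pre-image of B. So the two pre-images have no elements in common, and the inclusion can only hold if f^{-1}\left(Y\setminus B\right)=\emptyset. Now every x\in X has f\left(x\right)\in Y, so either f\left(x\right)\in B or f\left(x\right)\in Y\setminus B. As the second case cannot occur, we have f\left(x\right)\in B for every x\in X, that is f^{-1}\left(B\right)=X. Note that this does not require B=Y, only that f\left(X\right)\subseteq B.

\left(\Leftarrow\right): Suppose that f^{-1}\left(B\right)=X. Then f\left(x\right)\in B for every x\in X, so there is no x\in X with f\left(x\right)\in Y\setminus B. That is f^{-1}\left(Y\setminus B\right)=\emptyset, and the empty set is a subset of any set, so f^{-1}\left(Y\setminus B\right)\subseteq f^{-1}\left(B\right).
-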
f^{-1}\left(Y\setminus B\right)= X\setminus f^{-1}\left(B\right):

Suppose that x\in f^{-1}\left(Y\setminus B\right), then by definition we have that x\in X with f\left(x\right)\in Y and f\left(x\right)\not\in B. As f\left(x\right)\not\in B we have x\not\in f^{-1}\left(B\right), and so x\in X\setminus f^{-1}\left(B\right).

Conversely if x\in X\setminus f^{-1}\left(B\right) then f\left(x\right)\not\in B, but by definition of f we have that f\left(x\right)\in Y, so f\left(x\right)\in Y\setminus B and hence x\in f^{-1}\left(Y\setminus B\right).

It follows that f^{-1}\left(Y\setminus B\right)= X\setminus f^{-1}\left(B\right).
-
A\cup f^{-1}\left(B\right)\subseteq f^{-1}\left(f\left(A\right)\cup B\right):

Let x\in A\cup f^{-1}\left(B\right). We have that either x\in A or x\in f^{-1}\left(B\right). If x\in A then f\left(x\right)\in f\left(A\right) and so f\left(x\right)\in f\left(A\right)\cup B. The result follows on taking the pre-image, as
$$\begin{equation*} f^{-1}\left(f\left(A\right)\cup B\right)=\left\{x\in X: f\left(x\right)\in f\left(A\right)\cup B\right\} \end{equation*}$$
This is to say that x\in f^{-1}\left(f\left(A\right)\cup B\right).

Now if x\in f^{-1}\left(B\right) then we have by definition that f\left(x\right)\in B, and by a similar argument to above we conclude that f\left(x\right)\in f\left(A\right)\cup B, so that on taking the pre-image we conclude that x\in f^{-1}\left(f\left(A\right)\cup B\right).

Hence it follows that A\cup f^{-1}\left(B\right)\subseteq f^{-1}\left(f\left(A\right)\cup B\right).
-
A\cap f^{-1}\left(B\right)\subseteq f^{-1}\left(f\left(A\right)\cap B\right):

Suppose that x\in A\cap f^{-1}\left(B\right), then x\in A and x\in f^{-1}\left(B\right) and so f\left(x\right)\in B. As x\in A then f\left(x\right)\in f\left(A\right), and hence f\left(x\right)\in f\left(A\right)\cap B. By the definition of the pre-image this says that x\in f^{-1}\left(f\left(A\right)\cap B\right). Hence $A\cap f^{-1}\left(B\right)\subseteq f^{-1}\left(f\left(A\right)\cap B\right)$

The proposition now follows. $\qed$ :::
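For finite sets, several of the image and pre-image properties proved above can be checked by brute force. The following is a minimal sketch in Python, assuming a mapping is stored as a dictionary from domain elements to their images. The names `image`, `preimage`, `subsets` and `check_properties`, and the example sets `X`, `Y` and map `f`, are illustrative choices and not part of the text. Such a check is no substitute for the proofs, it only tests the statements on one small example.

```python
from itertools import combinations

def image(f, A):
    """Image f(A) = {f(x) : x in A} of a subset A of the domain."""
    return {f[x] for x in A}

def preimage(f, B):
    """Pre-image f^{-1}(B) = {x in X : f(x) in B}; the domain X is the key set of f."""
    return {x for x in f if f[x] in B}

def subsets(S):
    """All subsets of the finite set S."""
    items = list(S)
    return [set(c) for r in range(len(items) + 1) for c in combinations(items, r)]

# Hypothetical example: a map f: X -> Y in which "c" is never mapped to.
X = {1, 2, 3, 4}
Y = {"a", "b", "c"}
f = {1: "a", 2: "a", 3: "b", 4: "b"}

def check_properties(f, X, Y):
    """Check a selection of the properties for every A subset of X and B subset of Y."""
    for A in subsets(X):
        assert (image(f, A) == set()) == (A == set())
        assert image(f, X) - image(f, A) <= image(f, X - A)
        assert A <= preimage(f, image(f, A))
        for B in subsets(Y):
            assert image(f, A | preimage(f, B)) <= image(f, A) | B
            assert image(f, A & preimage(f, B)) == image(f, A) & B
            assert preimage(f, Y - B) == X - preimage(f, B)
            assert (preimage(f, B) == set()) == (B <= Y - image(f, X))
            assert (A <= preimage(f, B)) == (image(f, A) <= B)
            assert preimage(f, image(f, preimage(f, B))) == preimage(f, B)
    return True
```

Calling `check_properties(f, X, Y)` runs through all 16 subsets of `X` and all 8 subsets of `Y` and raises an `AssertionError` if any of the listed statements fails for this particular map.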
Injective, surjective and bijective mappings
Armed with the examples we have seen we can make a few comments about
mappings. Consider example [28](#exmp:Mapping 1){reference-type="ref"
reference="exmp:Mapping 1"} where we have that X=Y=\mathbb{N} and f is
the map
$$\begin{align*} f:X&\mathlarger{\mathlarger{\rightarrow}}Y\\ x&\mapsto f\left(x\right)=2x \end{align*}$$
We have that for every x,y\in X with f\left(x\right)=f\left(y\right)
that x=y, which is to say if the images of two elements
agree, then the elements are in fact the same. This is clear to see:
suppose that x,y\in X with f\left(x\right)=f\left(y\right), then we
have that
$$\begin{align*} f\left(x\right)&=f\left(y\right)\\ 2x&=2y\\ x&=y \end{align*}$$
Another way of expressing this idea is that two distinct elements in the
domain will have distinct images, we say a mapping with this property is
an injective mapping. Now, if we consider
\mathop{\mathrm{Image}}\left(f\right)\subseteq Y and consider the map
$$\begin{align*}
g:X&\mathlarger{\mathlarger{\rightarrow}}\mathop{\mathrm{Image}}\left(f\right)\\
x&\mapsto g\left(x\right)=2x
\end{align*}$$ Then, for every
y\in\mathop{\mathrm{Image}}\left(f\right), we have that there exists
some element x\in X such that y=g\left(x\right). Again, we can show
this. Let y\in\mathop{\mathrm{Image}}\left(f\right), then we need to
show that \exists x\in X such that g\left(x\right)=y. Now
$$\begin{align*} y&=g\left(x\right)\\ y&=2x\\ \frac{y}{2}&=x \end{align*}$$
We hence need to take \displaystyle x=\frac{y}{2}, however we
first need to verify that \displaystyle x=\frac{y}{2}\in X. We note
that y\in\mathop{\mathrm{Image}}\left(f\right) means that y=2k for
some k\in\mathbb{N}, so
$$\begin{align*} x&=\frac{y}{2}\\ x&=\frac{2k}{2}\\ x&=k \end{align*}$$
As x=k and k\in\mathbb{N}=X, we can rest safe in the
knowledge that our choice for x indeed works. As a sanity check we
have that
$$\begin{equation*} g\left(x\right)=2x=2\frac{y}{2}=y \end{equation*}$$
This choice of x works for any choice of y. Another way to express
this idea is that every element in the co-domain of the mapping is the image
of some element in the domain, we say a mapping with this property is a
surjective mapping.
It is worth noting that the mapping g is both injective and
surjective, this makes g a special type of mapping. As g is surjective,
every element y\in\mathop{\mathrm{Image}}\left(f\right) is of the form
y=g\left(x\right) for at least one element x in the domain. Moreover, if
a is another element in the domain with the property that
g\left(a\right)=g\left(x\right), then as g is injective we know
that a=x, so there is exactly one such x. This means that we can go
between elements of the domain and elements of the image in an
unambiguous way. A mapping with this property is
called a bijective mapping and the domain and image are said to be in
bijection with each other.
We formalise these ideas now to a mapping between any two sets.
::: definition Definition 48. Injective, surjective and bijective maps
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping between
two sets X and Y.
- We say that f is an injective mapping, sometimes called a one-to-one mapping, if
  $$\begin{equation*} \forall x,y\in X,\ f\left(x\right)=f\left(y\right) \Rightarrow x=y \end{equation*}$$
  That is, if f\left(x\right)=f\left(y\right) for x,y\in X then x=y. If we know that f is injective we can write the mapping as
  $$\begin{equation*} f:X\mathlarger{\mathlarger{\hookrightarrow}}Y \end{equation*}$$
  which is read as f is an injective mapping from X to Y.
- We say that f is a surjective mapping, sometimes called an onto mapping, if
  $$\begin{equation*} \forall y\in Y,\exists x\in X: y=f\left(x\right) \end{equation*}$$
  That is, for each y\in Y there exists some x\in X such that f\left(x\right)=y. If we know that f is surjective then we can write the mapping as
  $$\begin{equation*} f:X\mathlarger{\mathlarger{\twoheadrightarrow}}Y \end{equation*}$$
  which is read as f is a surjective mapping from X to Y.
- We say that f is a bijective mapping, sometimes called a one-to-one and onto mapping, if f is both injective and surjective. If we know that f is a bijection then we can write the mapping as
  $$\begin{equation*} f:X\mathlarger{\mathlarger{\hookrightarrow}}\mathrel{\mspace{-27.5mu}}\mathlarger{\mathlarger{\rightarrow}} Y \end{equation*}$$
  which is read as f is a bijective mapping from X to Y.
:::
We now look at additional examples of each type of mapping.
::: example
Example 35. Let
f:\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N} where
f\left(x\right)=x. We will prove that f is a bijective mapping.
Proof:
To show f is bijective we show that f is injective and surjective.
To see that f is an injection, suppose that
f\left(x\right)=f\left(y\right) where x,y\in\mathbb{N}, the domain. Then we
have that
$$\begin{align*}
f\left(x\right)&=f\left(y\right)\\
x&=y
\end{align*}$$ This shows f is injective as this holds for any choice
of x,y\in \mathbb{N}. To see that f is surjective consider
y\in\mathbb{N}, the co-domain, we show there exists an
x\in\mathbb{N}, the domain, so that f\left(x\right)=y. We have
$$\begin{align*}
y&=f\left(x\right)\\
y&=x
\end{align*}$$ so we take x=y. This works for every y\in\mathbb{N},
the co-domain, so f is surjective.
As f is both injective and surjective it is by definition a bijective
map, that is $f:\mathbb{N}%
\mathlarger{\mathlarger{\hookrightarrow}}\mathrel{\mspace{-27.5mu}}\mathlarger{\mathlarger{\rightarrow}}
\mathbb{N}$. $\qed$
:::
::: example
Example 36. Let
f:\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N} where
$$\begin{equation*} f\left(x\right)=\begin{cases} x, & \text{if } x \text{ is odd}\\ \frac{x}{2}, & \text{if } x \text{ is even} \end{cases} \end{equation*}$$
Is f injective? To see if it is we would need to show that
f\left(x\right)=f\left(y\right) with x,y\in \mathbb{N} means that
x=y. It becomes clear that there are x,y\in\mathbb{N} where this
does not hold, for example f\left(1\right)=1 and f\left(2\right)=1
so f\left(1\right)=f\left(2\right) but 1\neq 2. This shows that f
is not injective. Is f surjective? To see if it is we would need to
show that \forall y\in\mathbb{N},\exists x\in\mathbb{N} such that
y=f\left(x\right). Note that for every even input x=2k we have that
\displaystyle f\left(x\right)=\frac{2k}{2}=k. So for any
y\in\mathbb{N} we can take x=2y, and then f\left(2y\right)=y, so every
y\in\mathbb{N} is the image of 2y. So f is surjective.
As f was not injective we have that f is not a bijection, so we
have
f:\mathbb{N}\mathlarger{\mathlarger{\twoheadrightarrow}}\mathbb{N}.
:::
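The two claims of this example can be spot-checked numerically on a finite part of the natural numbers. The following is a small sketch in Python, assuming that the natural numbers start at 1 and using a hypothetical cut-off `N`, which is an illustrative choice and not part of the example. A finite check like this cannot prove surjectivity on all of the natural numbers, it only illustrates the argument that 2y is always mapped to y.

```python
def f(x):
    """The map of the example: x if x is odd, x/2 if x is even."""
    return x if x % 2 == 1 else x // 2

N = 20                          # hypothetical cut-off for the finite check
domain = range(1, 2 * N + 1)    # inputs 1, 2, ..., 2N

# Pairs x < y with the same image: any such pair shows f is not injective.
collisions = [(x, y) for x in domain for y in domain if x < y and f(x) == f(y)]

# Every y in 1..N is the image of the even number 2y.
hits_every_y = all(f(2 * y) == y for y in range(1, N + 1))

print(collisions[:3])
print(hits_every_y)
```

The pair `(1, 2)` appears among the collisions, matching the counterexample in the text.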
::: example
Example 37. Let X=\left\{1,2\right\} and Y=\left\{3,4,5\right\}
and define the map f:X\mathlarger{\mathlarger{\rightarrow}}Y by
$$\begin{equation*} f\left(1\right)=3,\ f\left(2\right)=4 \end{equation*}$$
Then it is clear that f is injective, as each input is mapped to a
distinct output. More formally suppose that
f\left(x\right)=f\left(y\right) where x,y\in X. We have that by the
definition of the mapping f\left(1\right)=3,\ f\left(2\right)=4. In
the first case we have f\left(x\right)=f\left(y\right)=3 and so
x=y=1, likewise in the second case we have that
f\left(x\right)=f\left(y\right)=4 and so x=y=2. This proves
injectivity.
To see that f is not surjective, consider the image
\mathop{\mathrm{Image}}\left(f\right)=\left\{f\left(x\right):x\in X\right\}=\left\{3,4\right\}\neq Y.
So \exists y\in Y, namely y=5, such that \nexists x\in X with
y=f\left(x\right).
It hence follows that f is not bijective, that is
f:\left\{1,2\right\}\mathlarger{\mathlarger{\hookrightarrow}}\left\{3,4,5\right\}.
:::
::: example
Example 38. Let X=\left\{1,2,3\right\} and Y=\left\{4,5\right\}
and define the map f:X\mathlarger{\mathlarger{\rightarrow}}Y by
$$\begin{equation*} f\left(1\right)=4,\ f\left(2\right)=4,\ f\left(3\right)=5 \end{equation*}$$
We have that f is not injective as
f\left(1\right)=f\left(2\right)=4 but 1\neq 2. However we have that
f is surjective as the image of f is
\mathop{\mathrm{Image}}\left(f\right)=\left\{f\left(x\right):x\in X\right\}=\left\{4,5\right\}=Y.
By definition f is not bijective, hence
f:\left\{1,2,3\right\}\mathlarger{\mathlarger{\twoheadrightarrow}}\left\{4,5\right\}.
:::
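For mappings between finite sets the three properties can be tested directly from the definitions. The sketch below, in Python, assumes a mapping is stored as a dictionary whose keys form the domain, with the co-domain passed in separately since a dictionary does not record it. The function names `is_injective`, `is_surjective` and `is_bijective` are illustrative. The two maps are those of the two preceding finite examples.

```python
def is_injective(f):
    """f(x) = f(y) implies x = y: no value is taken twice."""
    values = list(f.values())
    return len(values) == len(set(values))

def is_surjective(f, codomain):
    """Every element of the co-domain is the image of some element of the domain."""
    return set(f.values()) == set(codomain)

def is_bijective(f, codomain):
    """Both injective and surjective."""
    return is_injective(f) and is_surjective(f, codomain)

# f: {1,2} -> {3,4,5} with f(1)=3, f(2)=4
f_a = {1: 3, 2: 4}
Y_a = {3, 4, 5}

# f: {1,2,3} -> {4,5} with f(1)=4, f(2)=4, f(3)=5
f_b = {1: 4, 2: 4, 3: 5}
Y_b = {4, 5}

print(is_injective(f_a), is_surjective(f_a, Y_a), is_bijective(f_a, Y_a))
print(is_injective(f_b), is_surjective(f_b, Y_b), is_bijective(f_b, Y_b))
```

The first map is injective but not surjective and the second is surjective but not injective, so neither is a bijection, in agreement with the examples.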
We note that we can always construct a mapping g from
f:X\rightarrow Y such that
g:X\mathlarger{\mathlarger{\rightarrow}}\mathop{\mathrm{Image}}\left(f\right)
is a surjection.
::: {#prob:RestOfCodomainToImageIsSurjective .proposition} Proposition 15. The restriction of a mapping's co-domain to its image is a surjective mapping
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping and
consider
\mathop{\mathrm{Image}}\left(f\right)=\left\{f\left(x\right):x\in X\right\}.
Consider the following mapping
$$\begin{align*} g:X&\mathlarger{\mathlarger{\rightarrow}}\mathop{\mathrm{Image}}\left(f\right)\\ x&\mapsto f\left(x\right) \end{align*}$$
Then g is a surjective map.
Proof:
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y and consider
\mathop{\mathrm{Image}}\left(f\right)=\left\{f\left(x\right):x\in X\right\}.
By the definition of the image of a mapping
[45](#def:ImageMapping){reference-type="ref"
reference="def:ImageMapping"} we have that
\mathop{\mathrm{Image}}\left(f\right)\subseteq Y. Moreover, by the
definition of the image of a map we have that
y\in\mathop{\mathrm{Image}}\left(f\right) if and only if
\exists x\in X such that y=f\left(x\right)=g\left(x\right). This holds for all
y\in\mathop{\mathrm{Image}}\left(f\right), the co-domain of g, so g is a surjection.
$\qed$
:::
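The proposition can be illustrated on a small finite example. This is a sketch in Python under the assumption that a mapping is stored as a dictionary; the map `f`, the co-domain `Y` and the helper names are hypothetical and chosen only for illustration.

```python
def image(f):
    """Image(f) = {f(x) : x in X}, where X is the key set of the dictionary."""
    return set(f.values())

def is_surjective(f, codomain):
    """For every y in the co-domain there is some x in the domain with f(x) = y."""
    return all(any(f[x] == y for x in f) for y in codomain)

# Hypothetical map f: {1,2,3} -> {"a","b","c"}; "c" is never mapped to.
f = {1: "a", 2: "a", 3: "b"}
Y = {"a", "b", "c"}

# g has the same rule as f, but its co-domain is taken to be Image(f).
g = dict(f)
g_codomain = image(f)

print(is_surjective(f, Y))            # f need not be surjective onto Y
print(is_surjective(g, g_codomain))   # g is surjective onto Image(f)
```

The rule of the map is unchanged; only the set used as the co-domain differs between the two checks.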
In the proof we used the idea of restricting the co-domain of the
function so that it was the image
\mathop{\mathrm{Image}}\left(f\right) rather than Y, while leaving
the domain X unchanged. In actuality we did not change how the mapping
acts at all, we only considered those elements of the co-domain that
actually get mapped to. It should be clear that the image
\mathop{\mathrm{Image}}\left(f\right), the elements that actually get
mapped to, depends only on the allowable inputs for the function, that
is, only on the domain X. In many fields of mathematics it is
sometimes desirable to restrict the domain X that is being worked with
to a smaller subset A\subseteq X. A quick example of
why this is useful, which we will see later, is inverse
mappings. For now the key idea of an inverse map is to be able to create
a bijection between the domain and co-domain of a mapping to enable us
to unambiguously go between the two. Why is this useful?
For an example, suppose that you want to go on holiday abroad, then
you'll need to convert your currency to the currency that is in use
where you are going. Suppose that you use gold coins whereas the country you
visit only uses silver coins. The exchange rate from gold coins to silver
coins is given by the following mapping E\left(x\right) = Ax^2 where
the domain is the set of all the numbers that we are familiar with, that
is \mathbb{R}, and A is some positive number.
Suppose we wish to convert 50 gold coins into the new currency, then
we will have E\left(50\right)=A\cdot 50^2=2500A silver coins. Finally
suppose that after our holiday we have some silver coins left over that
we wish to convert back to gold coins, say 2500A-y where 0<y<2500A,
how many gold coins will we get back?
To work this out we will need to find a way to go backwards from
\mathop{\mathrm{Image}}\left(E\right) back to the domain. To do this
we will solve g=Ax^2 for x, we have that
$$\begin{align*}
g&=Ax^2\\
\frac{g}{A}&=x^2\\
x&=\pm\sqrt{\frac{g}{A}}
\end{align*}$$ You may wonder where \pm came from and what it means.
\pm stands for plus or minus and is used when we are unsure whether
the number is positive or negative. It occurs here because for the
numbers we are familiar with there are two numbers whose square is a
given positive number, for example both \sqrt{2} and -\sqrt{2} square
to 2, as \sqrt{2}\cdot\sqrt{2}=2 and
\left(-\sqrt{2}\right)\left(-\sqrt{2}\right)=2.
So, going back to the currency problem. When we wish to convert our remaining silver coins back into gold coins we will get back
$$\begin{equation*} x=\pm\sqrt{\frac{2500A-y}{A}} \end{equation*}$$
This is a problem: because the domain of E was any of the usual
numbers we don't know whether we should get back the positive or
negative value, as both would have given us the silver coins we had
remaining; perhaps on a more relatable note, we would find it very
annoying if we got back from holiday and converted our positive money
back only to end up with a negative amount of money. To overcome this
problem we should ensure that the domain of E consists of only
non-negative numbers, rather than any value. By doing so the negative square
root value is no longer valid and we hence get back the correct amount
of money.
Although a simple example, this shows the importance of sometimes having to restrict the domain of a mapping. We define the idea of a restriction of the domain now.
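The currency example can be made concrete with a short numerical sketch in Python. The value `A = 2.0` is a hypothetical exchange constant picked only for illustration, and `E_inverse` is an illustrative name for the conversion back on the restricted domain of non-negative numbers.

```python
import math

A = 2.0  # hypothetical positive exchange constant, not fixed by the text

def E(x):
    """Gold coins to silver coins: E(x) = A * x^2."""
    return A * x ** 2

def E_inverse(g):
    """Silver coins back to gold coins on the restricted domain x >= 0.

    Only the non-negative square root is returned, which is what
    restricting the domain of E to non-negative numbers amounts to.
    """
    return math.sqrt(g / A)

# On all of the real numbers E is not injective: 50 and -50 give the same amount.
print(E(50) == E(-50))

# On the restricted domain the round trip recovers the original amount.
print(E_inverse(E(50)))
```

With the domain restricted, `E_inverse(E(x))` returns `x` (up to floating-point rounding) for every non-negative `x`, whereas without the restriction there would be no way to tell 50 from -50 after converting.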
::: definition Definition 49. Restriction of a mapping
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping between
two sets X and Y. Let A\subseteq X be any subset of X. We define
the restriction of f to A, denoted by
\displaystyle \mathrel f\restriction_A, by the mapping
$$\begin{align*} \mathrel f\restriction_A:A&\mathlarger{\mathlarger{\rightarrow}}Y\\ x&\mapsto f\restriction_A\left(x\right)=f\left(x\right) \end{align*}$$
In particular, restricting a mapping will cause the image to change so that $\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right)\subseteq\mathop{\mathrm{Image}}\left(f\right)$ :::
Now that we have the idea of restricting a mapping we can see the following.
::: {#prop;RestOfInjectionIsInjection .proposition} Proposition 16. Restriction of an injective mapping is injective
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping between
two sets X and Y such that f is an injective mapping. Let
A\subseteq X be any subset of X. We have that the restriction
\mathrel f\restriction_A:A\rightarrow Y is an injective map. In
particular we have that
\mathrel f\restriction_A:A\rightarrow\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right)
is an injection.
Proof:
To show that \mathrel f\restriction_A:A\rightarrow Y is
injective we show that
\mathrel f\restriction_A\left(x\right)=\mathrel f\restriction_A\left(y\right)
for x,y\in A means that x=y. Suppose that \mathrel f\restriction_A
is not an injective map, then we have that \exists x,y\in A with
x\neq y such that
\mathrel f\restriction_A\left(x\right)=\mathrel f\restriction_A\left(y\right).
However A\subseteq X and so x,y\in X, and as the restriction agrees with
f on A we have
f\left(x\right)=f\left(y\right) with x\neq y, contradicting the fact
that f is an injection.
We conclude that the restriction map \mathrel f\restriction_A must be
injective.
Finally, by definition of
\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right) we have
that
$$\begin{equation*} \mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right)=\left\{\mathrel f\restriction_A\left(x\right):x\in A\right\}\subseteq Y \end{equation*}$$
that is the image is all the elements \mathrel f\restriction_A will
map elements of A to. Changing the co-domain from Y to this image does
not change the value of \mathrel f\restriction_A at any element of A, so
as \mathrel{f\restriction_A}:A\rightarrow Y is
an injection we must conclude that
\mathrel f\restriction_A:A\rightarrow\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right)
is an injection. $\qed$
:::
::: example
Example 39. Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a
mapping where X=\left\{1,2,3,4,5,6\right\} and
Y=\left\{7,8,9,10,11,12\right\} where
$$\begin{equation*} x\mapsto f\left(x\right)=x+6 \end{equation*}$$
Consider A\subseteq X where A=\left\{1,2,3\right\} and
B\subseteq X with B=\left\{1,2,3,4,5\right\}, then A\subseteq B.
We have that
\mathrel f\restriction_A:A\mathlarger{\mathlarger{\rightarrow}}Y has
the image
\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right)=\left\{7,8,9\right\}
and we have that
\mathrel f\restriction_B:B\mathlarger{\mathlarger{\rightarrow}}Y has
the image
$\mathop{\mathrm{Image}}\left(\mathrel f\restriction_B\right)=\left\{7,8,9,10,11\right\}$.
Hence under the two different restrictions we observe that
\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right)\subseteq\mathop{\mathrm{Image}}\left(\mathrel f\restriction_B\right).
:::
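The restriction of a finite mapping and the images computed in this example can be reproduced in a few lines. This is a sketch in Python assuming a mapping is stored as a dictionary; `restrict` and `image` are illustrative helper names.

```python
def restrict(f, A):
    """Restriction of f to A: same rule, domain cut down to the subset A."""
    if not set(A) <= set(f):
        raise ValueError("A must be a subset of the domain of f")
    return {x: f[x] for x in A}

def image(f):
    """Image(f) = {f(x) : x in the domain}."""
    return set(f.values())

# The map of the example: f(x) = x + 6 on X = {1,...,6}.
f = {x: x + 6 for x in range(1, 7)}
A = {1, 2, 3}
B = {1, 2, 3, 4, 5}

f_A = restrict(f, A)
f_B = restrict(f, B)

print(sorted(image(f_A)))   # [7, 8, 9]
print(sorted(image(f_B)))   # [7, 8, 9, 10, 11]
print(image(f_A) <= image(f_B) <= image(f))
```

Restricting to a smaller subset can only shrink the image, which is the chain of inclusions observed in the example.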
From this example we have the following result.
::: {#prop:ImageOfSubsetIsSubsetOfImage .proposition} Proposition 17. The image of a subset is a subset of the image
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping of sets
and let A,B\subseteq X where A\subseteq B, we have that
$$\begin{equation*} \mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right)\subseteq\mathop{\mathrm{Image}}\left(\mathrel f\restriction_B\right)\subseteq\mathop{\mathrm{Image}}\left(f\right) \end{equation*}$$
Proof:
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping of sets
and let A,B\subseteq X where A\subseteq B. Consider the restriction
mappings
\mathrel f\restriction_A:A\mathlarger{\mathlarger{\rightarrow}}Y and
\mathrel f\restriction_B:B\mathlarger{\mathlarger{\rightarrow}}Y. Let
y\in\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right), then
by definition we have that \exists x\in A such that
\mathrel f\restriction_A\left(x\right)=y. As A\subseteq B we have
that x\in A \Rightarrow x\in B and so
\mathrel f\restriction_B\left(x\right)=y, hence
y\in\mathop{\mathrm{Image}}\left(\mathrel f\restriction_B\right). This
shows that
\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right)\subseteq\mathop{\mathrm{Image}}\left(\mathrel f\restriction_B\right).
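To see the final inclusion let
y\in\mathop{\mathrm{Image}}\left(\mathrel f\restriction_B\right), so that
\exists x\in B with \mathrel f\restriction_B\left(x\right)=y. As
B\subseteq X we have x\in B \Rightarrow x\in X and so
f\left(x\right)=y, hence y\in\mathop{\mathrm{Image}}\left(f\right).
This shows the result. $\qed$ :::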
We conclude with the following observation.
::: {#prop:InjectiveMapToImageIsBijection .proposition} Proposition 18. Injective mapping to image is a bijection
Let f:X\mathlarger{\mathlarger{\hookrightarrow}}Y be an injective map
between two sets X and Y. Let A\subseteq X be any subset of X
possibly being X itself. We have that the mapping
\mathrel f\restriction_A:A\mathlarger{\mathlarger{\rightarrow}}\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right)
is a bijection.
Proof:
Let f:X\mathlarger{\mathlarger{\hookrightarrow}}Y be an injective
mapping and let A\subseteq X. By proposition
[16](#prop;RestOfInjectionIsInjection){reference-type="ref"
reference="prop;RestOfInjectionIsInjection"} we have that the mapping
\mathrel f\restriction_A:A\rightarrow\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right)
is an injection. Also, by proposition
[15](#prob:RestOfCodomainToImageIsSurjective){reference-type="ref"
reference="prob:RestOfCodomainToImageIsSurjective"}, applied to the mapping \mathrel f\restriction_A:A\rightarrow Y, we have that
\mathrel f\restriction_A:A\rightarrow\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right)
is a surjection. Hence by definition we have that
\mathrel f\restriction_A:A\rightarrow\mathop{\mathrm{Image}}\left(\mathrel f\restriction_A\right)
is a bijection. $\qed$
:::
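One practical consequence of this proposition is that an injective map between finite sets can be "run backwards" on its image. The following Python sketch assumes a mapping is stored as a dictionary; `inverse_on_image` is an illustrative name, and the example map is a hypothetical injective map chosen for illustration.

```python
def inverse_on_image(f):
    """Build the map Image(f) -> X sending f(x) back to x.

    This is only well defined when f is injective, so a repeated
    value raises an error instead of silently overwriting an entry.
    """
    inverse = {}
    for x, y in f.items():
        if y in inverse:
            raise ValueError(f"not injective: {inverse[y]} and {x} both map to {y}")
        inverse[y] = x
    return inverse

# Hypothetical injective map f: {1,2,3} -> {10,20,30,40}.
f = {1: 10, 2: 20, 3: 30}
back = inverse_on_image(f)

# Going forwards and then backwards returns every element of the domain.
print(all(back[f[x]] == x for x in f))
# Going backwards and then forwards returns every element of the image.
print(all(f[back[y]] == y for y in back))
```

The co-domain element 40 has no entry in `back`, reflecting that the bijection is between the domain and the image, not the whole co-domain.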
Compositions of maps
We have seen how a mapping f takes elements in one set, the domain
X, and sends them to the elements of another set, the image
\mathop{\mathrm{Image}}\left(f\right)\subseteq Y of some co-domain
Y. We can extend this idea so that the image
\mathop{\mathrm{Image}}\left(f\right) and more generally the co-domain
Y act as the domain for some other mapping g. This will allow us to
consider some more interesting examples of mappings in general.
::: definition Definition 50. Composition of two mappings
Let f:X\rightarrow Y and g:Y\rightarrow Z be two mappings for some
sets X,Y and Z. We define the composition map by
$$\begin{align*} g\circ f: X&\rightarrow Z\\ x&\mapsto g\left(f\left(x\right)\right) \end{align*}$$
That is, the mapping f is done first and then we apply the mapping
g.
Additionally, let h:X\rightarrow X be a mapping from X to X. If
h is composed with itself we write
h\circ h\left(x\right) = h\left(h\left(x\right)\right)=h^2\left(x\right). If h is
composed with itself n times we write h^{n+1}\left(x\right). This is
sometimes called the $n+1$-fold composition of h with itself.
:::
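The definition can be sketched in code. The following is a minimal illustration rather than part of the formal development: the helper names `compose` and `iterate` are our own, and the two sample maps x\mapsto x^2 and x\mapsto x^3 are chosen purely for illustration. Note that `iterate(h, n)` applies h a total of n times, so it corresponds to h^n in the notation above.

```python
from typing import Callable, TypeVar

X = TypeVar("X")
Y = TypeVar("Y")
Z = TypeVar("Z")


def compose(g: Callable[[Y], Z], f: Callable[[X], Y]) -> Callable[[X], Z]:
    """Return the composition 'g after f', i.e. x -> g(f(x)); f is applied first."""
    return lambda x: g(f(x))


def iterate(h: Callable[[X], X], n: int) -> Callable[[X], X]:
    """Return the map that applies h exactly n times (n = 0 gives the identity)."""
    def h_n(x: X) -> X:
        for _ in range(n):
            x = h(x)
        return x
    return h_n


def square(x: int) -> int:   # illustrative map x -> x^2
    return x ** 2


def cube(x: int) -> int:     # illustrative map x -> x^3
    return x ** 3


cube_after_square = compose(cube, square)   # x -> (x^2)^3
square_after_cube = compose(square, cube)   # x -> (x^3)^2

print(cube_after_square(2), square_after_cube(2))  # 64 64
print(iterate(square, 3)(2))                       # ((2^2)^2)^2 = 256
```

Representing a composition as a new function keeps the order of application explicit: the second argument of `compose` is always the map applied first.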
::: example
Example 40. Let f:\mathbb{N}\rightarrow\mathbb{N} and
g:\mathbb{N}\rightarrow\mathbb{N} be maps such that
$$\begin{align*}
f:\mathbb{N}&\rightarrow\mathbb{N}\\
x&\mapsto f\left(x\right)=x^2\\
g:\mathbb{N}&\rightarrow\mathbb{N}\\
x&\mapsto g\left(x\right)=x^3
\end{align*}$$
Then we have, for an arbitrary x\in\mathbb{N}, that
$$\begin{align*} g\circ f\left(x\right)&=g\left(f\left(x\right)\right)=g\left(x^2\right)=\left(x^2\right)^3=x^6\\ f\circ g\left(x\right)&=f\left(g\left(x\right)\right)=f\left(x^3\right)=\left(x^3\right)^2=x^6 \end{align*}$$
In this case g\circ f=f\circ g, and it does not matter in which way
we compose the two mappings.
The ideas of injectivity and surjectivity also apply to compositions of
maps. Let us check whether g\circ f is injective.
Recall that a mapping h:X\rightarrow Y is injective if
h\left(x\right)=h\left(y\right) for x,y\in X implies that x=y. So
let x,y\in\mathbb{N} and suppose that
g\circ f\left(x\right)=g\circ f\left(y\right). Then we have that
$$\begin{align*} g\circ f\left(x\right)&=g\circ f\left(y\right)\\ x^6&=y^6\\ x&=y \end{align*}$$
This makes sense as x^6,y^6\in\mathbb{N}, since
x^6=x\cdot x\cdot x\cdot x\cdot x\cdot x is a product of elements of
\mathbb{N}, and we can take the sixth root of x^6 without issue;
likewise for y^6. The composition is not surjective: for example, for
2\in\mathbb{N} there is no element
x\in\mathbb{N} such that x^6=2. If we were to allow any
positive real number we would have x=\sqrt[6]{2}\approx 1.1224620483094.
Hence g\circ f is not bijective as it is not surjective. Likewise for
f\circ g.
:::
::: example
Example 41. Consider the mappings
f:\mathbb{N}\rightarrow\mathbb{N} and
g:\mathbb{N}\rightarrow\mathbb{N} given by
$$\begin{align*} f:\mathbb{N}&\rightarrow\mathbb{N}\\ x&\mapsto f\left(x\right)=4x+2\\ g:\mathbb{N}&\rightarrow\mathbb{N}\\ x&\mapsto g\left(x\right)=\sqrt{x} \end{align*}$$
We have that
$$\begin{align*} g\circ f\left(x\right)&=g\left(f\left(x\right)\right)=g\left(4x+2\right)=\sqrt{4x+2}\\ f\circ g\left(x\right)&=f\left(g\left(x\right)\right)=f\left(\sqrt{x}\right)=4\sqrt{x}+2 \end{align*}$$
Unlike last time we have that g\circ f\neq f\circ g. Now is
g\circ f injective? Let x,y\in\mathbb{N} and consider
$$\begin{align*}
g\circ f\left(x\right)&=g\circ f\left(y\right)\\
\sqrt{4x+2}&=\sqrt{4y+2}\\
4x+2&=4y+2\\
x&=y
\end{align*}$$
So we have injectivity. We do not have surjectivity as,
for example, with y=1\in\mathbb{N} we would need
$$\begin{align*} 1&=\sqrt{4x+2}\\ 1&=4x+2\\ -1&=4x\\ x&=-\frac{1}{4}\not\in\mathbb{N} \end{align*}$$
What about f\circ g? For injectivity let x,y\in\mathbb{N}, then
$$\begin{align*}
f\circ g\left(x\right)&=f\circ g\left(y\right)\\
4\sqrt{x}+2&=4\sqrt{y}+2\\
\sqrt{x}&=\sqrt{y}\\
x&=y
\end{align*}$$
hence we have injectivity. We do not have surjectivity:
for example, with y=1\in\mathbb{N} we would need
$$\begin{align*} 1&=4\sqrt{x}+2\\ -1&=4\sqrt{x}\\ -\frac{1}{4}&=\sqrt{x} \end{align*}$$
which is impossible as \sqrt{x}\geq 0, so there is no such x\in\mathbb{N}. :::
::: example
Example 42. Consider X=\left\{1,2,3\right\},
Y=\left\{4,5\right\} and Z=\left\{6\right\} and the mappings
f:X\mathlarger{\mathlarger{\rightarrow}}Y and
g:Y\mathlarger{\mathlarger{\rightarrow}}Z given by
$$\begin{equation*} f\left(1\right)=4,\ f\left(2\right)=4,\ f\left(3\right)=5 \end{equation*}$$
$$\begin{equation*} g\left(4\right)=6,\ g\left(5\right)=6 \end{equation*}$$
Finally, consider the composition map given by
$$\begin{align} g\circ f:X&\mathlarger{\mathlarger{\rightarrow}}Z\\ x&\mapsto g\left(f\left(x\right)\right) \end{align}$$
Clearly g\circ f is not injective as
g\left(f\left(1\right)\right)=6 and g\left(f\left(2\right)\right)=6
but 1\neq 2. However the composition map is surjective as
\mathop{\mathrm{Image}}\left(g\circ f\right)=\left\{6\right\}=Z.
:::
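For maps between finite sets these checks are mechanical. As a small sketch, the maps of Example 42 can be stored as Python dictionaries; the helper names `is_injective` and `is_surjective` are our own and simply restate the definitions.

```python
def is_injective(mapping: dict) -> bool:
    """A finite map is injective when no two inputs share an output."""
    values = list(mapping.values())
    return len(values) == len(set(values))


def is_surjective(mapping: dict, codomain: set) -> bool:
    """A finite map is surjective when its image is the whole co-domain."""
    return set(mapping.values()) == set(codomain)


# The maps of Example 42, with X = {1, 2, 3}, Y = {4, 5}, Z = {6}.
f = {1: 4, 2: 4, 3: 5}
g = {4: 6, 5: 6}
Z = {6}

# The composition: apply f first, then g.
g_after_f = {x: g[f[x]] for x in f}

print(g_after_f)                    # {1: 6, 2: 6, 3: 6}
print(is_injective(g_after_f))      # False
print(is_surjective(g_after_f, Z))  # True
```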
We deduce an immediate result.
::: {#prop:DomainOfCompMapisDomainofFirstFunc .proposition} Proposition 19. Domain of composition mapping equals the domain of the first function
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y and
g:Y\mathlarger{\mathlarger{\rightarrow}}Z be mappings. Consider the
composite mapping g\circ f:X\mathlarger{\mathlarger{\rightarrow}}Z. We
have that
$$\begin{equation*} \mathop{\mathrm{Dom}}\left(g\circ f\right)=\mathop{\mathrm{Dom}}\left(f\right) \end{equation*}$$
Proof:
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y and
g:Y\mathlarger{\mathlarger{\rightarrow}}Z be mappings and consider the
composite mapping g\circ f:X\mathlarger{\mathlarger{\rightarrow}}Z. We
need to show that
\mathop{\mathrm{Dom}}\left(g\circ f\right)=\mathop{\mathrm{Dom}}\left(f\right).
Let x\in\mathop{\mathrm{Dom}}\left(g\circ f\right), then
g\left(f\left(x\right)\right) is well-defined with say
z=g\left(f\left(x\right)\right) for some z\in Z. Hence for this to
be well-defined we have that \exists y\in Y such that
y=f\left(x\right) is well-defined. But then
x\in\mathop{\mathrm{Dom}}\left(f\right), hence
\mathop{\mathrm{Dom}}\left(g\circ f\right)\subseteq \mathop{\mathrm{Dom}}\left(f\right).
For the reverse inclusion, let
x\in\mathop{\mathrm{Dom}}\left(f\right) then f\left(x\right)=y for
some y\in Y. As g:Y\mathlarger{\mathlarger{\rightarrow}}Z is a
mapping with domain Y, then \exists z\in Z such that
g\left(y\right)=z. Hence we have that
g\left(y\right)=g\left(f\left(x\right)\right)=z. Hence
\mathop{\mathrm{Dom}}\left(f\right)\subseteq\mathop{\mathrm{Dom}}\left(g\circ f\right).
As we have that
\mathop{\mathrm{Dom}}\left(g\circ f\right)\subseteq \mathop{\mathrm{Dom}}\left(f\right)
and
\mathop{\mathrm{Dom}}\left(f\right)\subseteq\mathop{\mathrm{Dom}}\left(g\circ f\right),
then we conclude by proposition
1{reference-type="ref"
reference="prop:TwosetsEqualIfContainedInEachOther"} that
\mathop{\mathrm{Dom}}\left(g\circ f\right)=\mathop{\mathrm{Dom}}\left(f\right)
as required. $\qed$
:::
These examples show something interesting. In the first example we note
that f and g are both injective. Indeed we have for
x,y\in\mathbb{N} that
$$\begin{align*} x^2&=y^2\Rightarrow x=y\\ x^3&=y^3\Rightarrow x=y \end{align*}$$
and the composition mappings g\circ f and f\circ g were both
injective. In the last example both f and g were
surjective as
$$\begin{align*}
\mathop{\mathrm{Image}}\left(f\right)&=\left\{f\left(x\right):x\in \left\{1,2,3\right\}\right\}=\left\{4,5\right\}=Y\\
\mathop{\mathrm{Image}}\left(g\right)&=\left\{g\left(x\right):x\in \left\{4,5\right\}\right\}=\left\{6\right\}=Z
\end{align*}$$
and the composition map g\circ f was also surjective.
This is not a coincidence, as we now prove.
::: {#prop: PropInjecSurjecBijecMapping .proposition} Proposition 20. Injectivity, surjectivity and bijectivity of composition mappings
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y and
g:Y\mathlarger{\mathlarger{\rightarrow}}Z be mappings.
1. If f and g are injective maps then so is $g\circ f$.
2. If f and g are surjective maps then so is $g\circ f$.
3. If f and g are bijective maps then so is $g\circ f$.
Proof:
1. If f and g are injective maps then so is g\circ f:
   Let f:X\mathlarger{\mathlarger{\hookrightarrow}}Y and g:Y\mathlarger{\mathlarger{\hookrightarrow}}Z be injective mappings, then by definition we have that
   $$\begin{align*} \forall a,b\in X,\ f\left(a\right)&=f\left(b\right)\Rightarrow a=b\\ \forall c,d\in Y,\ g\left(c\right)&=g\left(d\right)\Rightarrow c=d \end{align*}$$
   Consider the composition map
   $$\begin{align*} g\circ f:X&\mathlarger{\mathlarger{\rightarrow}}Z\\ x&\mapsto g\left(f\left(x\right)\right) \end{align*}$$
   Let x,y\in X be such that g\left(f\left(x\right)\right)=g\left(f\left(y\right)\right). Then we have that
   $$\begin{align*} g\left(f\left(x\right)\right)&=g\left(f\left(y\right)\right)\\ f\left(x\right)&=f\left(y\right),\ \text{as } g \text{ is an injective map}\\ x&=y,\ \text{as } f \text{ is an injective map} \end{align*}$$
   As this works for every x,y\in X we have that g\circ f is injective.
2. If f and g are surjective maps then so is g\circ f:
   Let f:X\mathlarger{\mathlarger{\twoheadrightarrow}}Y and g:Y\mathlarger{\mathlarger{\twoheadrightarrow}}Z be surjective mappings, then by definition we have that
   $$\begin{align*} \forall b\in Y,\exists a\in X: f\left(a\right)&=b\\ \forall d\in Z,\exists c\in Y: g\left(c\right)&=d \end{align*}$$
   Consider the composition map
   $$\begin{align*} g\circ f:X&\mathlarger{\mathlarger{\rightarrow}}Z\\ x&\mapsto g\left(f\left(x\right)\right) \end{align*}$$
   Let z\in Z, then \exists y\in Y such that g\left(y\right)=z, and also \exists x\in X such that f\left(x\right)=y, as both f and g are surjective. Then we have that
   $$\begin{equation*} g\left(f\left(x\right)\right)=g\left(y\right)=z \end{equation*}$$
   As this works for every z\in Z we have that g\circ f is surjective.
3. If f and g are bijective maps then so is g\circ f:
   Let $f:X\mathlarger{\mathlarger{\hookrightarrow}}\mathrel{\mspace{-27.5mu}}\mathlarger{\mathlarger{\rightarrow}} Y$ and $g:Y\mathlarger{\mathlarger{\hookrightarrow}}\mathrel{\mspace{-27.5mu}}\mathlarger{\mathlarger{\rightarrow}} Z$ be bijective mappings, then by definition we have that f is an injection and a surjection, so f satisfies
   $$\begin{align*} \forall a,b\in X,\ f\left(a\right)&=f\left(b\right)\Rightarrow a=b\\ \forall d\in Y,\exists c\in X: f\left(c\right)&=d \end{align*}$$
   Also g is an injection and a surjection and so satisfies
   $$\begin{align*} \forall a,b\in Y,\ g\left(a\right)&=g\left(b\right)\Rightarrow a=b\\ \forall d\in Z,\exists c\in Y: g\left(c\right)&=d \end{align*}$$
   Consider the composition map
   $$\begin{align*} g\circ f:X&\mathlarger{\mathlarger{\rightarrow}}Z\\ x&\mapsto g\left(f\left(x\right)\right) \end{align*}$$
   By part 1. we have that g\circ f is an injection as f and g are injections, and by part 2. we have that g\circ f is a surjection as f and g are surjections. Hence by definition, as g\circ f is both injective and surjective, it is bijective.
$\qed$ :::
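Proposition 20 can also be sanity-checked by brute force on small finite sets. The sketch below is only an illustration and not a substitute for the proof: it enumerates every pair of maps between small sets and confirms that the first two parts of the proposition hold for each pair. The helper names (`all_maps`, `check_proposition`) and the choice of sets are our own.

```python
from itertools import product


def all_maps(domain, codomain):
    """Yield every map domain -> codomain as a dict."""
    domain = list(domain)
    for images in product(list(codomain), repeat=len(domain)):
        yield dict(zip(domain, images))


def is_injective(mapping: dict) -> bool:
    values = list(mapping.values())
    return len(values) == len(set(values))


def is_surjective(mapping: dict, codomain) -> bool:
    return set(mapping.values()) == set(codomain)


def check_proposition(X, Y, Z) -> int:
    """Check parts 1 and 2 for every f: X -> Y and g: Y -> Z.

    Returns the number of pairs (f, g) examined; raises AssertionError
    if a counterexample is found.
    """
    pairs = 0
    for f in all_maps(X, Y):
        for g in all_maps(Y, Z):
            g_after_f = {x: g[f[x]] for x in f}
            if is_injective(f) and is_injective(g):
                assert is_injective(g_after_f)
            if is_surjective(f, Y) and is_surjective(g, Z):
                assert is_surjective(g_after_f, Z)
            pairs += 1
    return pairs


# Sizes 2 -> 3 -> 4 admit injective maps; sizes 3 -> 2 -> 2 admit surjective ones.
print(check_proposition(range(2), range(3), range(4)))  # 9 * 64 = 576 pairs
print(check_proposition(range(3), range(2), range(2)))  # 8 * 4 = 32 pairs
```

Part 3 needs no separate check, since a map is bijective exactly when it is both injective and surjective.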
In a sense we can also deduce properties about the mappings f and g
if we know something about the composition map g\circ f.
::: {#prop:CompositeMapInectSurjectProp .proposition} Proposition 21. Properties of mappings from composite map
Let f:X\rightarrow Y and g: Y\mathlarger{\mathlarger{\rightarrow}}Z
be mappings and consider the composite map
g\circ f:X\mathlarger{\mathlarger{\rightarrow}}Z.
1. If g\circ f:X\mathlarger{\mathlarger{\hookrightarrow}}Z is an injective map, then f:X\mathlarger{\mathlarger{\rightarrow}}Y is an injective map.
2. If g\circ f:X\mathlarger{\mathlarger{\twoheadrightarrow}}Z is a surjective map, then g:Y\mathlarger{\mathlarger{\rightarrow}}Z is a surjective map.
Proof:
1. If g\circ f:X\mathlarger{\mathlarger{\hookrightarrow}}Z is an injective map, then f:X\mathlarger{\mathlarger{\rightarrow}}Y is an injective map:
   Let g\circ f:X\mathlarger{\mathlarger{\hookrightarrow}}Z be an injective composite mapping, so that g\left(f\left(x\right)\right)=g\left(f\left(y\right)\right)\Rightarrow x=y for all x,y\in X. We need to show that \forall x,y\in X we have f\left(x\right)=f\left(y\right)\Rightarrow x=y.
   So indeed, suppose that for some x,y\in X we have f\left(x\right)=f\left(y\right), then we have that
   $$\begin{align*} g\circ f\left(x\right)&=g\left(f\left(x\right)\right)\\ &=g\left(f\left(y\right)\right)\\ &=g\circ f\left(y\right) \end{align*}$$
   Now, as g\circ f is an injective map we conclude that x=y, hence f\left(x\right)=f\left(y\right)\Rightarrow x=y. Hence f:X\mathlarger{\mathlarger{\rightarrow}}Y is an injection.
2. If g\circ f:X\mathlarger{\mathlarger{\twoheadrightarrow}}Z is a surjective map, then g:Y\mathlarger{\mathlarger{\rightarrow}}Z is a surjective map:
   Let g\circ f:X\mathlarger{\mathlarger{\twoheadrightarrow}}Z be a surjective composite mapping, so that \forall z\in Z, \exists x\in X: z=g\circ f\left(x\right). We need to show that \forall z\in Z,\exists y\in Y: z=g\left(y\right). Let z\in Z, then as g\circ f is surjective there is some x\in X such that z=g\circ f\left(x\right).
   Now, we have by proposition 19{reference-type="ref" reference="prop:DomainOfCompMapisDomainofFirstFunc"} that \mathop{\mathrm{Dom}}\left(g\circ f\right)=\mathop{\mathrm{Dom}}\left(f\right) and so x\in\mathop{\mathrm{Dom}}\left(f\right), so that f\left(x\right)\in\mathop{\mathrm{Image}}\left(f\right). This is to say \exists y\in Y: y=f\left(x\right) and hence z=g\left(y\right). As this can be done for any z\in Z we conclude that g:Y\mathlarger{\mathlarger{\twoheadrightarrow}}Z is a surjection.
$\qed$ :::
The examples also allow us to deduce something about the image of composition mappings.
:::: {#Prop:ImageOfCompMap .proposition} Proposition 22. The image of a composite mapping
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y and
g:Y\mathlarger{\mathlarger{\rightarrow}}Z be mappings for sets X,Y
and Z. Consider the composition mapping given by
$$\begin{align*} g\circ f:X&\mathlarger{\mathlarger{\rightarrow}}Z\\ x&\mapsto g\left(f\left(x\right)\right) \end{align*}$$
We have that
\mathop{\mathrm{Image}}\left(g\circ f\right)=g\left(\mathop{\mathrm{Image}}\left(f\right)\right)
where we recall the notation
$f\left(X\right)=\left\{f\left(x\right):x\in X\right\}$.
Proof:
We have that
$$\begin{equation*} \mathop{\mathrm{Image}}\left(g\circ f\right)=\left\{g\left(f\left(x\right)\right):x\in X\right\} \end{equation*}$$
also, we have that
$$\begin{equation*}
g\left(\mathop{\mathrm{Image}}\left(f\right)\right)=\left\{g\left(y\right):y\in\mathop{\mathrm{Image}}\left(f\right)\right\}
\end{equation*}$$
Now, y\in\mathop{\mathrm{Image}}\left(f\right) means
that y\in\left\{f\left(x\right):x\in X\right\}, hence
y=f\left(x\right) for some x\in X, hence
$$\begin{equation*} g\left(\mathop{\mathrm{Image}}\left(f\right)\right)=\left\{g\left(f\left(x\right)\right):x\in X\right\} \end{equation*}$$
Hence the two definitions agree, that is
\mathop{\mathrm{Image}}\left(g\circ f\right)=g\left(\mathop{\mathrm{Image}}\left(f\right)\right).
We do need to check the case of Y=\emptyset. If Y=\emptyset then we
note that \mathop{\mathrm{Image}}\left(g\right)=\emptyset by the
remark after the definition of the image of a function. So
g:\emptyset\mathlarger{\mathlarger{\rightarrow}}\emptyset, i.e. g
has no elements in its domain and no elements in its co-domain and
so is a mapping that maps nothing to nothing. Also
f:\emptyset\mathlarger{\mathlarger{\rightarrow}}\emptyset, which we
now prove.
::: lemma Lemma 3. Mapping from empty set to some co-domain set is valid if and only if co-domain is empty
Let Y be some set, then
f:\emptyset\mathlarger{\mathlarger{\rightarrow}}Y is a mapping if and
only if Y=\emptyset
Proof:
\left(\Rightarrow\right): Suppose that Y\neq\emptyset then
\exists s\in Y, that is there is at least one element in Y, but the
domain is empty so there are no elements that could be mapped to s,
hence f is not a well-defined mapping, so we conclude that
Y=\emptyset.
\left(\Leftarrow\right): Suppose that Y=\emptyset, then
f:\emptyset\mathlarger{\mathlarger{\rightarrow}}\emptyset holds as a
mapping, mapping nothing to nothing. $\qed$
:::
So the lemma shows
f:\emptyset\mathlarger{\mathlarger{\rightarrow}}\emptyset. Hence
$$\begin{equation*} \mathop{\mathrm{Image}}\left(g\circ f\right)=\emptyset=g\left(\emptyset\right)=g\left(\mathop{\mathrm{Image}}\left(f\right)\right) \end{equation*}$$
As required. $\qed$ ::::
From the proposition we also deduce the following.
::: proposition Proposition 23. Image of composite mapping is a subset of the image of the second function
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y and
g:Y\mathlarger{\mathlarger{\rightarrow}}Z be mappings. Consider the
composite mapping g\circ f:X\mathlarger{\mathlarger{\rightarrow}}Z. We
have that
$$\begin{equation*} \mathop{\mathrm{Image}}\left(g\circ f\right)\subseteq\mathop{\mathrm{Image}}\left(g\right) \end{equation*}$$
Proof:
We know by proposition 22{reference-type="ref"
reference="Prop:ImageOfCompMap"} that
\mathop{\mathrm{Image}}\left(g\circ f\right)=g\left(\mathop{\mathrm{Image}}\left(f\right)\right)
where
$$\begin{equation*} g\left(\mathop{\mathrm{Image}}\left(f\right)\right)=\left\{g\left(y\right):y\in\mathop{\mathrm{Image}}\left(f\right)\right\} \end{equation*}$$
Now, observe that \mathop{\mathrm{Image}}\left(f\right)\subseteq Y,
and in particular
\mathop{\mathrm{Image}}\left(f\right)\subseteq\mathop{\mathrm{Dom}}\left(g\right).
Hence, with proposition
17{reference-type="ref"
reference="prop:ImageOfSubsetIsSubsetOfImage"}, we deduce that
$$\begin{equation*} g\left(\mathop{\mathrm{Image}}\left(f\right)\right)=\left\{g\left(y\right):y\in\mathop{\mathrm{Image}}\left(f\right)\right\}\subseteq g\left(\mathop{\mathrm{Dom}}\left(g\right)\right)=\left\{g\left(y\right):y\in\mathop{\mathrm{Dom}}\left(g\right)\right\}=\mathop{\mathrm{Image}}\left(g\right) \end{equation*}$$
Hence we have
\mathop{\mathrm{Image}}\left(g\circ f\right)=\left\{g\left(y\right):y\in\mathop{\mathrm{Image}}\left(f\right)\right\}\subseteq\mathop{\mathrm{Image}}\left(g\right),
which is to say
\mathop{\mathrm{Image}}\left(g\circ f\right)\subseteq\mathop{\mathrm{Image}}\left(g\right)
as required. $\qed$
:::
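Propositions 22 and 23 are easy to observe on a small finite example. In the sketch below the maps f and g are hypothetical and chosen so that the inclusion in Proposition 23 is strict; the helper `image` is our own name for the set of outputs of a map on a given subset of its domain.

```python
def image(mapping: dict, subset=None) -> set:
    """Image of `subset` under `mapping`; the whole domain is used when subset is None."""
    inputs = mapping.keys() if subset is None else subset
    return {mapping[x] for x in inputs}


# Hypothetical maps f: {1, 2, 3} -> {4, 5} and g: {4, 5} -> {6, 7}.
f = {1: 4, 2: 4, 3: 4}
g = {4: 6, 5: 7}
g_after_f = {x: g[f[x]] for x in f}

# Proposition 22: Image(g after f) equals g(Image(f)).
print(image(g_after_f) == image(g, image(f)))  # True

# Proposition 23: Image(g after f) is a subset of Image(g), here a strict one.
print(image(g_after_f) <= image(g))            # True
print(image(g_after_f), image(g))              # {6} {6, 7}
```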
We have seen earlier that function composition need not be commutative,
for example when
f:\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N} with
f\left(x\right)=4x+2 and
g:\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N} with
g\left(x\right)=\sqrt{x}. We saw that
$$\begin{align*} g\circ f\left(x\right)&=\sqrt{4x+2}\\ f\circ g\left(x\right)&=4\sqrt{x}+2 \end{align*}$$
What can we say about associativity and function composition?
::: example
Example 43. Let
f:\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N},
g:\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N} and
h:\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N} where
$$\begin{align*} f:\mathbb{N}&\mathlarger{\mathlarger{\rightarrow}}\mathbb{N}\\ x&\mapsto f\left(x\right)=4x+2\\ g:\mathbb{N}&\mathlarger{\mathlarger{\rightarrow}}\mathbb{N}\\ x&\mapsto g\left(x\right)= x^2\\ h:\mathbb{N}&\mathlarger{\mathlarger{\rightarrow}}\mathbb{N}\\ x&\mapsto h\left(x\right) = \sqrt{x} \end{align*}$$
Consider the following
$$\begin{align*} h\circ\left(g\circ f\right)\left(x\right)&=h\left(g\left(f\left(x\right)\right)\right)=h\left(\left(4x+2\right)^2\right)=\sqrt{\left(4x+2\right)^2}=4x+2\\ \left(h\circ g\right)\circ f\left(x\right)&=\left(h\circ g\right)\left(f\left(x\right)\right)=\left(h\circ g\right)\left(4x+2\right)=\sqrt{\left(4x+2\right)^2}=4x+2 \end{align*}$$
In this case the function composition is associative. :::
This is not a coincidence.
::: proposition Proposition 24. Function composition is associative
Let f:W\mathlarger{\mathlarger{\rightarrow}}X,
g:X\mathlarger{\mathlarger{\rightarrow}}Y and
h:Y\mathlarger{\mathlarger{\rightarrow}}Z be mappings. We have that
$$\begin{equation*} h\circ\left(g\circ f\right)=\left(h\circ g\right)\circ f \end{equation*}$$
Proof:
Let f:W\mathlarger{\mathlarger{\rightarrow}}X,
g:X\mathlarger{\mathlarger{\rightarrow}}Y and
h:Y\mathlarger{\mathlarger{\rightarrow}}Z and consider the composite
mappings
h\circ\left(g\circ f\right):W\mathlarger{\mathlarger{\rightarrow}}Z
and
\left(h\circ g\right)\circ f:W\mathlarger{\mathlarger{\rightarrow}}Z.
Let w\in W. As
f:W\mathlarger{\mathlarger{\rightarrow}}X is a mapping,
\exists x\in X such that x=f\left(w\right); likewise as
g:X\mathlarger{\mathlarger{\rightarrow}}Y is a mapping, \exists y\in Y such
that y=g\left(x\right)=g\left(f\left(w\right)\right). Finally as
h:Y\mathlarger{\mathlarger{\rightarrow}}Z is a mapping,
\exists z\in Z such that
z=h\left(y\right)=h\left(g\left(f\left(w\right)\right)\right).
By the definition of composition we therefore have
h\circ\left(g\circ f\right)\left(w\right)=h\left(\left(g\circ f\right)\left(w\right)\right)=h\left(g\left(f\left(w\right)\right)\right)=z.
Likewise, for the same w\in W we have x=f\left(w\right). As
\left(h\circ g\right)\circ f\left(w\right)=\left(h\circ g\right)\left(f\left(w\right)\right)
we need to see where h\circ g maps f\left(w\right). As
h\circ g\left(x\right)=h\left(g\left(x\right)\right) we have that
\left(h\circ g\right)\left(f\left(w\right)\right)=h\left(g\left(f\left(w\right)\right)\right),
and we know that g\left(f\left(w\right)\right)=y and
h\left(g\left(f\left(w\right)\right)\right)=z.
Hence both composites send w to z, and so
h\circ\left(g\circ f\right)=\left(h\circ g\right)\circ f as w
was an arbitrary element of W. $\qed$
:::
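As a quick numerical illustration of associativity, the maps of Example 43 can be composed in both orders of bracketing and compared. This is only a sketch: the helper `compose` is our own, and `math.isqrt` (the integer square root, exact on perfect squares) is used as a stand-in for the square root so that all values stay in the natural numbers.

```python
import math


def compose(g, f):
    """Return the composition 'g after f', i.e. x -> g(f(x))."""
    return lambda x: g(f(x))


def f(x: int) -> int:
    return 4 * x + 2


def g(x: int) -> int:
    return x ** 2


h = math.isqrt  # stand-in for the square root; exact on perfect squares

left = compose(h, compose(g, f))    # h after (g after f)
right = compose(compose(h, g), f)   # (h after g) after f

# Both bracketings agree with 4x + 2 on every sampled input.
print(all(left(x) == right(x) == 4 * x + 2 for x in range(100)))  # True
```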
Inverse mappings
With the theory of composite mappings now understood, we are in a
position to try and understand how to undo a given map
f:X\mathlarger{\mathlarger{\rightarrow}}Y. Why did we need to develop
a theory of composite mappings? The idea comes from the fact that
undoing a mapping should somehow be the same as never doing anything in
the first place. This is to say, if we denote the inverse map by
f^{-1} then we should expect that f^{-1}\circ f\left(x\right)=x,
likewise the original mapping f somehow undoes f^{-1} i.e
f\circ f^{-1}\left(y\right)=y where y is in the co-domain of f. As
always in mathematics, examples will help to understand what is going on.
You may recall from a course in physics that an object thrown in a vacuum, so that there is no air resistance and only gravity acts, has the following equation for its height
$$\begin{equation*}
H\left(t\right)=V_0\sin\left(\theta\right)t-\frac{1}{2}gt^2
\end{equation*}$$
where V_0 is the object's launch velocity in metres
per second m/s, \theta is the angle that the projectile is launched
at from the horizontal, g is the acceleration due to gravity in metres per second$^2$ m/s^2
and t is time in seconds s. Suppose the particle is launched with a
velocity of 10 m/s at an angle of 45 degrees to the horizontal and
we take g=9.8 m/s^2, then for example, the height above the origin of
the projectile at t=1s is
$$\begin{equation*} H\left(1\right)=10\sin\left(45\right)\left(1\right)-\frac{1}{2}\left(9.8\right)\left(1\right)^2=5\sqrt{2}-\frac{49}{10}\approx 2.17 \text{ m} \end{equation*}$$
Now suppose you are told that the maximum height is achieved at a time
of \displaystyle t=\frac{25\sqrt{2}}{49}\approx 0.721 s which is
\displaystyle h=\frac{125}{49}\approx 2.551 m. Considering time values
\displaystyle 0<t<\frac{25\sqrt{2}}{49}\approx 0.721, find the time
that the projectile was first at 2 m above the ground. In essence we
need to take H\left(t\right) and somehow undo the process to find some
t such that H\left(t\right)=2. How do we do this? We set
H\left(t\right)=h and then solve for t as follows
$$\begin{align*} h&=10\sin\left(45\right)t-\frac{1}{2}\left(9.8\right)t^2\\ h&=5\sqrt{2}t-\frac{49}{10}t^2\\ \frac{49}{10}t^2-5\sqrt{2}t+h&=0\\ 49t^2-50\sqrt{2}t+10h&=0 \end{align*}$$
Now, from school we have learnt the quadratic formula, applying this
here we will get two answers for t
$$\begin{align*} t&=\frac{-\left(-50\sqrt{2}\right)\pm\sqrt{\left(-50\sqrt{2}\right)^2-4\left(49\right)\left(10h\right)}}{2\left(49\right)}\\ t&=\frac{50\sqrt{2}\pm\sqrt{5000-1960h}}{98} \end{align*}$$
Hence when h=2 we get the following times
$$\begin{align*} t&=\frac{50\sqrt{2}\pm\sqrt{5000-1960\left(2\right)}}{98}\\ t&=\frac{50\sqrt{2}\pm\sqrt{5000-3920}}{98}\\ t&=\frac{50\sqrt{2}\pm\sqrt{1080}}{98}\\ t&=\frac{50\sqrt{2}\pm 6\sqrt{30}}{98}\\ t&=\frac{25\sqrt{2}\pm 3\sqrt{30}}{49} \end{align*}$$
That is,
\displaystyle t= \frac{25\sqrt{2}+3\sqrt{30}}{49}\approx 1.057 s or
\displaystyle t= \frac{25\sqrt{2}-3\sqrt{30}}{49}\approx 0.386 s. This
example illustrates a key point about inverse maps: when we undo a given
map we should get back the original input. Thankfully in this case we
were told that the ball reaches its maximum height at a time of
about 0.721 s, hence the value we are looking
for is the smaller one,
\displaystyle t= \frac{25\sqrt{2}-3\sqrt{30}}{49}\approx 0.386 s. In
fact if we want to find the time the projectile was first at h m above
the ground we will always take the smaller of the two values for t
found. That is, we define a new map T given by
$$\begin{equation*} T\left(h\right)=\frac{50\sqrt{2}-\sqrt{5000-1960h}}{98} \end{equation*}$$
So when the particle is launched with a velocity of 10 m/s at an
angle of 45 degrees to the horizontal with g=9.8 m/s^2, and using our
knowledge of the fact that the maximum height is achieved at a time of
\displaystyle t=\frac{25\sqrt{2}}{49}\approx 0.721 s which is
\displaystyle h=\frac{125}{49}\approx 2.551 m, the mapping
$$\begin{equation*} T\left(h\right)=\frac{50\sqrt{2}-\sqrt{5000-1960h}}{98} \end{equation*}$$
is the inverse to
$$\begin{equation*} H\left(t\right)=10\sin\left(45\right)t-\frac{1}{2}\left(9.8\right)t^2 \end{equation*}$$
Indeed, for example we have that
$$\begin{align*}
H\circ T\left(h\right)&=H\left(T\left(h\right)\right)\\
&=H\left(\frac{50\sqrt{2}-\sqrt{5000-1960h}}{98}\right)\\
&=10\sin\left(45\right)\left(\frac{50\sqrt{2}-\sqrt{5000-1960h}}{98}\right)-\frac{1}{2}\left(9.8\right)\left(\frac{50\sqrt{2}-\sqrt{5000-1960h}}{98}\right)^2\\
&=5\sqrt{2}\left(\frac{50\sqrt{2}-\sqrt{5000-1960h}}{98}\right)-\frac{1}{2}\left(9.8\right)\frac{\left(50\sqrt{2}-\sqrt{5000-1960h}\right)^2}{98^2}\\
&=\frac{500}{98}-\frac{5\sqrt{2}\sqrt{5000-1960h}}{98}-\frac{1}{2}\left(9.8\right)\frac{\left(5000-100\sqrt{10000-3920h}+\left(5000-1960h\right)\right)}{98^2}\\
&=\frac{250}{49}-\frac{5\sqrt{10000-3920h}}{98}-\frac{1}{2}\frac{\left(5000-100\sqrt{10000-3920h}+\left(5000-1960h\right)\right)}{980}\\
&=\frac{250}{49}-\frac{5\sqrt{10000-3920h}}{98}-\frac{1}{2}\frac{\left(10000-100\sqrt{10000-3920h}-1960h\right)}{980}\\
&=\frac{250}{49}-\frac{5\sqrt{10000-3920h}}{98}-\frac{\left(5000-50\sqrt{10000-3920h}-980h\right)}{980}\\
&=\frac{250}{49}-\frac{5\sqrt{10000-3920h}}{98}-\frac{5000}{980}+\frac{50\sqrt{10000-3920h}}{980}+\frac{980h}{980}\\
&=\frac{250}{49}-\frac{5\sqrt{10000-3920h}}{98}-\frac{250}{49}+\frac{5\sqrt{10000-3920h}}{98}+h\\
&=h
\end{align*}$$
Again, we have this idea that inverse functions should somehow return any mapping back to where it started.
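The projectile computation can be checked numerically. The sketch below assumes the launch speed of 10 m/s that appears in the formula for the height, an angle of 45 degrees and g=9.8 m/s^2; the Python names `H` and `T` mirror the two maps and the sample values are our own. It confirms that H undoes T on heights up to the maximum, and that T undoes H on times before the maximum height is reached.

```python
import math

V0 = 10.0                  # launch speed in m/s, as used in the formula for H
THETA = math.radians(45)   # launch angle
G = 9.8                    # acceleration due to gravity in m/s^2


def H(t: float) -> float:
    """Height of the projectile at time t."""
    return V0 * math.sin(THETA) * t - 0.5 * G * t ** 2


def T(h: float) -> float:
    """First time at which the projectile is at height h (0 <= h <= 125/49)."""
    return (50 * math.sqrt(2) - math.sqrt(5000 - 1960 * h)) / 98


# H after T returns the height we started with.
heights = [0.0, 0.5, 1.0, 2.0, 2.5]
print(all(math.isclose(H(T(h)), h, abs_tol=1e-9) for h in heights))  # True

# T after H returns the time, provided t is before the time of maximum height.
times = [0.0, 0.1, 0.3, 0.5, 0.7]
print(all(math.isclose(T(H(t)), t, abs_tol=1e-9) for t in times))    # True

print(round(T(2.0), 3))  # 0.386
```

The restriction to times before the maximum height matters: for a later time, T returns the earlier of the two times at which the same height is reached.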
We can start to express this idea in terms of a so-called identity mapping.
::: definition
Definition 51. Let
\mathop{\mathrm{id}}_X:X\mathlarger{\mathlarger{\rightarrow}}X be a
mapping from X to itself, so that
$$\begin{align*} \mathop{\mathrm{id}}_X:X&\mathlarger{\mathlarger{\rightarrow}}X\\ x&\mapsto\mathop{\mathrm{id}}_X\left(x\right)=x \end{align*}$$
We say that \mathop{\mathrm{id}}_X is the identity mapping on the set
X. Suppose we also have a mapping
\mathop{\mathrm{id}}_Y:Y\mathlarger{\mathlarger{\rightarrow}}Y defined in the same way, then
\mathop{\mathrm{id}}_Y is the identity map on the set Y, and if
X\neq Y it is clear that \mathop{\mathrm{id}}_X\neq \mathop{\mathrm{id}}_Y.
:::
Indeed we can prove that these identity maps do nothing under function composition.
::: proposition Proposition 25. Composition with the identity mapping does nothing
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping and
consider the identity maps
\mathop{\mathrm{id}}_X:X\mathlarger{\mathlarger{\rightarrow}}X and
\mathop{\mathrm{id}}_Y:Y\mathlarger{\mathlarger{\rightarrow}}Y. We
have that
1. $f\circ \mathop{\mathrm{id}}_X=f$
2. $\mathop{\mathrm{id}}_Y\circ f=f$
Proof:
We simply need to compose the maps to see the desired results.
1. f\circ \mathop{\mathrm{id}}_X=f:
   Let x\in X, then f\circ \mathop{\mathrm{id}}_X\left(x\right)=f\left(\mathop{\mathrm{id}}_X\left(x\right)\right)=f\left(x\right). As this holds for every x\in X we have f\circ \mathop{\mathrm{id}}_X=f.
2. \mathop{\mathrm{id}}_Y\circ f=f:
   Let x\in X, then \mathop{\mathrm{id}}_Y\circ f\left(x\right)=\mathop{\mathrm{id}}_Y\left(f\left(x\right)\right)=f\left(x\right). As this holds for every x\in X we have \mathop{\mathrm{id}}_Y\circ f=f.
Hence the result follows. :::
For completeness we will prove some trivial properties about the identity mapping.
::: {#prop:IdentityMapProperties .proposition} Proposition 26. Properties of the identity mapping
Let \mathop{\mathrm{id}}_X:X\mathlarger{\mathlarger{\rightarrow}}X be
the identity map on X. Then the following hold
1. \mathop{\mathrm{id}}_X is an injective map
2. \mathop{\mathrm{id}}_X is a surjective map
3. \mathop{\mathrm{id}}_X is a bijective map
4. $\mathop{\mathrm{id}}_X\circ\mathop{\mathrm{id}}_X=\mathop{\mathrm{id}}_X$
Proof:
1. \mathop{\mathrm{id}}_X is an injective map:
   Let x,y\in X and suppose that \mathop{\mathrm{id}}_X\left(x\right)=\mathop{\mathrm{id}}_X\left(y\right). As \mathop{\mathrm{id}}_X\left(x\right)=x and \mathop{\mathrm{id}}_X\left(y\right)=y we have x=y, hence \mathop{\mathrm{id}}_X is injective.
2. \mathop{\mathrm{id}}_X is a surjective map:
   Let y\in X. Taking x=y we have \mathop{\mathrm{id}}_X\left(x\right)=x=y, so y is mapped to by some x\in X. As this works for every y\in X, \mathop{\mathrm{id}}_X is surjective.
3. \mathop{\mathrm{id}}_X is a bijective map:
   By parts 1. and 2. we have that \mathop{\mathrm{id}}_X is injective and surjective and thus by definition is bijective.
4. \mathop{\mathrm{id}}_X\circ\mathop{\mathrm{id}}_X=\mathop{\mathrm{id}}_X:
   Let x\in X and consider \mathop{\mathrm{id}}_X\circ\mathop{\mathrm{id}}_X\left(x\right)=\mathop{\mathrm{id}}_X\left(\mathop{\mathrm{id}}_X\left(x\right)\right)=\mathop{\mathrm{id}}_X\left(x\right). As this holds for every x\in X the result follows.
$\qed$
:::
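The behaviour of the identity map under composition can be seen directly on a finite example. In this sketch maps are stored as dictionaries; the sets X and Y, the map f and the helper names `identity` and `compose` are hypothetical choices of our own.

```python
def identity(domain) -> dict:
    """The identity map on a finite set, stored as a dict x -> x."""
    return {x: x for x in domain}


def compose(g: dict, f: dict) -> dict:
    """The composition 'g after f' for dict-based maps: apply f first, then g."""
    return {x: g[f[x]] for x in f}


X = {1, 2, 3}
Y = {"a", "b"}
f = {1: "a", 2: "b", 3: "a"}   # hypothetical map f: X -> Y

id_X = identity(X)
id_Y = identity(Y)

print(compose(f, id_X) == f)        # True: f after id_X is f
print(compose(id_Y, f) == f)        # True: id_Y after f is f
print(compose(id_X, id_X) == id_X)  # True: id_X after id_X is id_X
```

Two dictionaries compare equal exactly when they have the same keys and the same value at each key, which matches equality of maps with a common domain.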
The identity mapping will allow us to define the idea of a left and right inverse of a mapping.
::: definition Definition 52. Left inverse
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping. We define
g:Y\mathlarger{\mathlarger{\rightarrow}}X to be a left inverse of
f if
$$\begin{equation*} g\circ f=\mathop{\mathrm{id}}_X \end{equation*}$$ :::
::: definition Definition 53. Right inverse
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping. We define
g:Y\mathlarger{\mathlarger{\rightarrow}}X to be a right inverse of
f if
$$\begin{equation*} f\circ g=\mathop{\mathrm{id}}_Y \end{equation*}$$ :::
::: example
Example 44. Let
f:\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N} be such
that f\left(x\right)=x+1. Define the mapping
g:\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N} by
$$\begin{equation*} g\left(x\right)=\begin{cases} x-1, & \text{if }x\neq 0,\\ 0, & \text{otherwise} \end{cases} \end{equation*}$$
Then g is a left inverse of f. Indeed we have that
$$\begin{equation*}
g\circ f\left(x\right)=g\left(f\left(x\right)\right)=g\left(x+1\right)=\left(x+1\right)-1=x
\end{equation*}$$
as x+1>0 for every x\in\mathbb{N}. Observe also
that f is an injective map, indeed let x,y\in\mathbb{N} and suppose
f\left(x\right)=f\left(y\right) then
$$\begin{align*} f\left(x\right)&=f\left(y\right)\\ x+1&=y+1\\ x&=y \end{align*}$$
It is also worth noting that g is not injective as we have
g\left(1\right)=0=g\left(0\right) but 1\neq 0. We note that f is
a right inverse of g as the calculation above shows.
:::
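The claims of Example 44 can be checked on a finite range of inputs. The sketch below uses the same f and g as the example, with the natural numbers taken to include 0 as the example does; the sampled range is an arbitrary choice of our own.

```python
def f(x: int) -> int:
    """The map x -> x + 1 on the natural numbers."""
    return x + 1


def g(x: int) -> int:
    """The map sending x to x - 1 when x is non-zero, and 0 to 0."""
    return x - 1 if x != 0 else 0


# g after f is the identity on every sampled input, so g is a left inverse of f.
print(all(g(f(x)) == x for x in range(1000)))  # True

# f after g is not the identity: 0 is sent to 1, so g is not a right inverse of f.
print(f(g(0)))                                 # 1

# g is not injective: 0 and 1 share the output 0.
print(g(0), g(1))                              # 0 0
```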
::: example
Example 45. Let X=\mathbb{R} and define
Y=\mathbb{R}^+=\left\{x\in\mathbb{R}:x\geq 0\right\}, the set of
non-negative real numbers. Let
f:\mathbb{R}\mathlarger{\mathlarger{\rightarrow}}\mathbb{R}^+ be
defined by f\left(x\right)=x^2. We can define two possible right
inverses of f. The first is given by
g_1:\mathbb{R}^+\mathlarger{\mathlarger{\rightarrow}}\mathbb{R} where
g_1\left(x\right)=\sqrt{x}. Indeed
$$\begin{equation*}
f\circ g_1\left(x\right)=f\left(g_1\left(x\right)\right)=f\left(\sqrt{x}\right)=\left(\sqrt{x}\right)^2=x=\mathop{\mathrm{id}}_{\mathbb{R}^+}\left(x\right)
\end{equation*}$$
The second, as you may have guessed, is given by
g_2:\mathbb{R}^+\mathlarger{\mathlarger{\rightarrow}}\mathbb{R} where
g_2\left(x\right)=-\sqrt{x}, where likewise we have
$$\begin{equation*} f\circ g_2\left(x\right)=f\left(g_2\left(x\right)\right)=f\left(-\sqrt{x}\right)=\left(-\sqrt{x}\right)^2=x=\mathop{\mathrm{id}}_{\mathbb{R}^+}\left(x\right) \end{equation*}$$
We note that f is surjective. Let y\in\mathbb{R}^+ then
f\left(x\right)=y\Rightarrow x^2=y\Rightarrow x=\pm\sqrt{y}\in\mathbb{R},
hence every element of the co-domain of f is mapped to by some input. It is clear that
f is not injective as f\left(2\right)=4=f\left(-2\right).
Does f have a left inverse? By the definition of a left inverse we
will need to find some
g:\mathbb{R}^+\mathlarger{\mathlarger{\rightarrow}}\mathbb{R} such
that g\circ f=\mathop{\mathrm{id}}_{\mathbb{R}}. So for each input x of f, g will have
to send f\left(x\right) back to x, hence we might require that f
be injective, for if not then \exists x,y\in\mathbb{R} such that
f\left(x\right)=f\left(y\right) with x\neq y and we have the problem
where g could send f\left(x\right) back to either x or y, and if
it is sent back to y then we don't have the identity mapping!
Now, f is not injective as we have seen that
f\left(2\right)=4=f\left(-2\right), so if there were a left inverse
g it wouldn't know where to send 4 back to, it could have been
either 2 or -2.
:::
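The behaviour in Example 45 can be checked numerically on a handful of sample points. The following is a minimal sketch, not a proof: the sample values are an arbitrary choice, and floating-point rounding means equality is tested with a tolerance.

```python
import math

def f(x: float) -> float:
    """The squaring map f: R -> R^+ from Example 45."""
    return x * x

def g1(y: float) -> float:
    """First right inverse: the non-negative square root."""
    return math.sqrt(y)

def g2(y: float) -> float:
    """Second right inverse: the non-positive square root."""
    return -math.sqrt(y)

# Sample points of R^+ (an arbitrary, hypothetical choice).
samples = [0.0, 0.25, 1.0, 2.0, 9.0]

# f o g1 and f o g2 both act as the identity on the samples,
# up to floating-point rounding (hence math.isclose).
right_inverse_ok = all(
    math.isclose(f(g(y)), y, abs_tol=1e-12)
    for y in samples
    for g in (g1, g2)
)

# Neither g1 nor g2 is a left inverse of f:
# g1 sends f(-2) = 4 to 2, and g2 sends f(2) = 4 to -2.
left_inverse_fails = (g1(f(-2.0)) != -2.0) and (g2(f(2.0)) != 2.0)

print(right_inverse_ok, left_inverse_fails)
```

Both flags come out true here, which matches the example: the two square roots undo f from the right, but no single choice can recover both 2 and -2 from 4.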
::: example
Example 46. Let X=\mathbb{R} and
Y=\mathbb{R}\setminus\left\{0\right\}=\left\{x\in\mathbb{R}:x\neq 0\right\}.
You may have seen the function e^x before, we shall consider this
mapping, that is the mapping
f:\mathbb{R}\mathlarger{\mathlarger{\rightarrow}}\mathbb{R}\setminus\left\{0\right\}
given by f\left(x\right)=e^x=\exp\left(x\right). We can define a left
inverse to f by
g:\mathbb{R}\setminus\left\{0\right\}\mathlarger{\mathlarger{\rightarrow}}\mathbb{R}
where g\left(x\right)=\ln\left(\left|x\right|\right), where \ln is
the natural logarithm, the logarithm to the base e. We take the absolute
value because \ln is only defined for positive inputs, whereas g must be
defined on all of \mathbb{R}\setminus\left\{0\right\}. We will discuss
logarithms in more detail later but for now we can think of
\ln\left(x\right)=y as asking the question e^y=x, that is, what value of
y do we need to raise e to in order to get x. This g is indeed a left
inverse of f, as e^x>0 for every x\in\mathbb{R} and so
$$\begin{equation*} g\circ f\left(x\right)=g\left(f\left(x\right)\right)=g\left(e^x\right)=\ln\left(\left|e^x\right|\right)=\ln\left(e^x\right)=x=\mathop{\mathrm{id}}_{\mathbb{R}}\left(x\right) \end{equation*}$$
Like in the previous example, we can ask the question: does f have a
right inverse? By definition, for f to have a right inverse there
needs to be a mapping
g:\mathbb{R}\setminus\left\{0\right\}\mathlarger{\mathlarger{\rightarrow}}\mathbb{R}
such that
f\circ g=\mathop{\mathrm{id}}_{\mathbb{R}\setminus\left\{0\right\}}. So
for each
y\in\mathbb{R}\setminus\left\{0\right\} we need f to send
g\left(y\right) back to y. This can only happen if every element of the codomain of f
has some input that generates it, that is f is a surjection. If this
is not the case then there is some element
y\in\mathbb{R}\setminus\left\{0\right\} that is not equal to
f\left(x\right) for any x\in\mathbb{R}, and so no choice of g\left(y\right) can satisfy f\left(g\left(y\right)\right)=y.
For example, \not\exists x\in\mathbb{R} such that
e^x=-1. So f is not surjective, and in this case we are not
able to define a right inverse.
:::
We can generalise the last two examples to the next two propositions.
::: {#prop:LeftInverseIffInjective .proposition} Proposition 27. Condition for the existence of a left inverse
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping with
X\neq\emptyset. We have that f has a left inverse
g:Y\mathlarger{\mathlarger{\rightarrow}}X such that
g\circ f=\mathop{\mathrm{id}}_X if and only if f is an injective
mapping.
Proof:
\left(\Rightarrow\right): Suppose that f has a left inverse
g:Y\mathlarger{\mathlarger{\rightarrow}}X such that
g\circ f=\mathop{\mathrm{id}}_X. We know by proposition
26{reference-type="ref"
reference="prop:IdentityMapProperties"} that \mathop{\mathrm{id}}_X is
an injective mapping, moreover we know by proposition
21{reference-type="ref"
reference="prop:CompositeMapInectSurjectProp"} that if a composite map
g\circ f is injective then so is f. Hence as
g\circ f = \mathop{\mathrm{id}}_X and \mathop{\mathrm{id}}_X is
injective, we conclude that f is an injective map.
\left(\Leftarrow\right): Suppose that f is an injective map, then
\forall x,y\in X we have that
f\left(x\right)=f\left(y\right)\Rightarrow x=y. We need
to construct a map which acts as a left inverse to f. As X\neq\emptyset we can fix some element x_0\in X.
Consider the following map
h:\mathop{\mathrm{Image}}\left(f\right)\mathlarger{\mathlarger{\rightarrow}}X,
where we send y\in\mathop{\mathrm{Image}}\left(f\right) back to the
element that it was mapped from. This element is unique as f is injective, so h is well defined and h\left(f\left(x\right)\right)=x for every x\in X. Now, define g as follows
$$\begin{align*} g:Y&\mathlarger{\mathlarger{\rightarrow}}X\\ y&\mapsto g\left(y\right)=\begin{cases} x_0, & \text{if } y\in Y\setminus\mathop{\mathrm{Image}}\left(f\right)\\ h\left(y\right), & \text{if } y\in\mathop{\mathrm{Image}}\left(f\right) \end{cases} \end{align*}$$
We note that if \mathop{\mathrm{Image}}\left(f\right) = Y then we do
not need to consider the first case,
however if \mathop{\mathrm{Image}}\left(f\right) \subset Y then the first case is needed,
and the assumption X\neq\emptyset guarantees that there is an x_0 to use.
Now with this g we have, for every x\in X, that
$$\begin{equation*} g\circ f\left(x\right)=g\left(f\left(x\right)\right)=h\left(f\left(x\right)\right)=x=\mathop{\mathrm{id}}_X\left(x\right) \end{equation*}$$
Hence g is indeed a left inverse of f.
The proposition now follows. $\qed$ :::
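For finite sets, the construction in the proof of Proposition 27 can be carried out directly. The sketch below is illustrative only: mappings are represented as Python dictionaries, and the names `left_inverse`, `codomain` and `default` are hypothetical, with `default` playing the role of the fixed element of X used for points outside the image.

```python
def left_inverse(f: dict, codomain: set, default):
    """Build g: Y -> X with g(f(x)) = x for every x in X.

    Elements of Image(f) are sent back to their unique preimage;
    every other element of Y is sent to `default`, a fixed element of X.
    """
    if len(set(f.values())) != len(f):
        raise ValueError("f is not injective, so no left inverse exists")
    h = {y: x for x, y in f.items()}  # h: Image(f) -> X
    return {y: h.get(y, default) for y in codomain}

# Hypothetical finite example: X = {1, 2, 3}, Y = {'a', 'b', 'c', 'd'}.
f = {1: "a", 2: "b", 3: "c"}
Y = {"a", "b", "c", "d"}
g = left_inverse(f, Y, default=1)

# g o f is the identity on X; 'd' is outside the image and goes to the default.
print(all(g[f[x]] == x for x in f), g["d"])
```

Any element of X would serve as `default`; the choice only affects g on points that f never reaches, which is why a left inverse is generally not unique.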
::: {#prop:RightInverseIffSurjective .proposition} Proposition 28. Condition for the existence of a right inverse
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping with
X\neq\emptyset. We have that f has a right inverse
g:Y\mathlarger{\mathlarger{\rightarrow}}X such that
f\circ g=\mathop{\mathrm{id}}_Y if and only if f is a surjective
mapping.
Proof:
\left(\Rightarrow\right): Suppose that f has a right inverse
g:Y\mathlarger{\mathlarger{\rightarrow}}X such that
f\circ g=\mathop{\mathrm{id}}_Y. We know by proposition
26{reference-type="ref"
reference="prop:IdentityMapProperties"} that \mathop{\mathrm{id}}_Y is
a surjective mapping, moreover we know by proposition
21{reference-type="ref"
reference="prop:CompositeMapInectSurjectProp"} that if a composite map
f\circ g is surjective then so is f. Hence as
f\circ g = \mathop{\mathrm{id}}_Y and \mathop{\mathrm{id}}_Y is
surjective, we conclude that f is a surjective map.
\left(\Leftarrow\right): Suppose that f is a surjective map, then
\forall y\in Y,\exists x\in X: f\left(x\right)=y. We need to construct
a g:Y\mathlarger{\mathlarger{\rightarrow}}X such that
f\circ g=\mathop{\mathrm{id}}_Y. For a given y\in Y there may be more
than one x such that
f\left(x\right)=y; if this is the case we pick for that y one of the
possible choices of x and call it x_y. Hence we can define g\left(y\right)=x_y for
every y\in Y, and then we have that
$f\circ g\left(y\right)=f\left(g\left(y\right)\right)=f\left(x_y\right)=y=\mathop{\mathrm{id}}_Y\left(y\right)$
The proposition now follows. $\qed$ :::
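The choice made in the proof of Proposition 28 can be made explicit for finite sets. This is a minimal sketch under stated assumptions: mappings are dictionaries, the function name `right_inverse` is hypothetical, and the rule "pick the smallest preimage" is just one arbitrary way of choosing.

```python
def right_inverse(f: dict, codomain: set) -> dict:
    """Build g: Y -> X with f(g(y)) = y by picking, for each y,
    one x with f(x) = y (here: the smallest such x)."""
    g = {}
    for y in codomain:
        preimages = [x for x, fx in f.items() if fx == y]
        if not preimages:
            raise ValueError(f"{y!r} has no preimage, so f is not surjective")
        g[y] = min(preimages)  # any choice works; min makes it deterministic
    return g

# Squaring restricted to a finite sample of integers (hypothetical data).
f = {-2: 4, -1: 1, 0: 0, 1: 1, 2: 4}
Y = {0, 1, 4}
g = right_inverse(f, Y)

# f o g is the identity on Y.
print(all(f[g[y]] == y for y in Y))
```

Replacing `min` by `max` gives a different right inverse of the same f, mirroring the two square roots in Example 45.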
These two propositions give the following immediate results
::: {#LeftInverseOfInjectionIsSurjective .proposition} Proposition 29. Left inverse of injective mapping is a surjection
Let f:X\rightarrow Y be an injection with left inverse
g:Y\rightarrow X. We have that g is a surjection.
Proof: Let f and g be as stated. Then by definition of a left
inverse we have that g\circ f =\mathop{\mathrm{id}}_X. Moreover we
have the identity mapping \mathop{\mathrm{id}}_X is a surjection as it
is bijective. We then have by proposition
21{reference-type="ref"
reference="prop:CompositeMapInectSurjectProp"} that g is a surjection.
$\qed$
:::
::: {#RightInverseOfSurjecctionisInection .proposition} Proposition 30. Right inverse of surjective mapping is an injection
Let f:X\rightarrow Y be a surjection with right inverse
g:Y\rightarrow X. We have that g is an injection.
Proof: Let f and g be as stated. Then by definition of a right
inverse we have that f\circ g =\mathop{\mathrm{id}}_Y. Moreover we
have the identity mapping \mathop{\mathrm{id}}_Y is an injection as it
is bijective. We then have by proposition
21{reference-type="ref"
reference="prop:CompositeMapInectSurjectProp"} that g is an injection.
$\qed$
:::
The ideas of a left and right inverse will allow us to construct the idea of a so-called two-sided inverse, that is an inverse which is both a left inverse and a right inverse. This will allow us to consider when a mapping can be inverted without regard to the order in which we compose the mappings. However, there is one final result about left and right inverses that will be required in order to pave the way.
::: {#prop:BijectionHasLeftRightInverse .proposition} Proposition 31. Bijection has a left and right inverse
Let f:X\rightarrow Y be a bijective mapping. We have that there
exists a left inverse g:Y\rightarrow X and there exists a right
inverse h:Y\rightarrow X such that
$$\begin{align*} g\circ f &= \mathop{\mathrm{id}}_X\\ f\circ h&=\mathop{\mathrm{id}}_Y \end{align*}$$
Proof:
Let f:X\rightarrow Y be a bijection. As f is a
bijection we know that f is both injective and surjective. Now by
proposition 27{reference-type="ref"
reference="prop:LeftInverseIffInjective"} a left inverse exists if
and only if f is an injective mapping. Likewise by proposition
28{reference-type="ref"
reference="prop:RightInverseIffSurjective"} we have that a right inverse
exists if and only if f is a surjective mapping. Hence we have the
existence of a left and right inverse. As required. $\qed$
:::
::: {#prop:LeftRightInverseImpliesBijection .proposition} Proposition 32. The existence of a left and right inverse implies a bijection
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping such that
\exists g_1:Y\mathlarger{\mathlarger{\rightarrow}}X such that
g_1\circ f=\mathop{\mathrm{id}}_X and
\exists g_2:Y\mathlarger{\mathlarger{\rightarrow}}X such that
f\circ g_2=\mathop{\mathrm{id}}_Y. We have that f is a bijection.
Proof:
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping such that
\exists g_1:Y\mathlarger{\mathlarger{\rightarrow}}X such that
g_1\circ f=\mathop{\mathrm{id}}_X and
\exists g_2:Y\mathlarger{\mathlarger{\rightarrow}}X such that
f\circ g_2=\mathop{\mathrm{id}}_Y. We have by proposition
27{reference-type="ref"
reference="prop:LeftInverseIffInjective"} that as g_1 is a left
inverse of f then f must be injective. Likewise by proposition
28{reference-type="ref"
reference="prop:RightInverseIffSurjective"} we have that as g_2 is a right
inverse of f then f must be surjective. It hence follows by
definition that f is a bijective mapping. $\qed$
:::
These propositions are useful in proving the following.
::: proposition Proposition 33. Bijection if and only if left and right inverses exist
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping. We have
that f is bijective if and only if
\exists g_1:Y\mathlarger{\mathlarger{\rightarrow}}X such that
g_1\circ f=\mathop{\mathrm{id}}_X and
\exists g_2:Y\mathlarger{\mathlarger{\rightarrow}}X such that
f\circ g_2=\mathop{\mathrm{id}}_Y.
Proof:
\left(\Rightarrow\right): Let f: X\rightarrow Y be a bijective
mapping. By proposition
31{reference-type="ref"
reference="prop:BijectionHasLeftRightInverse"} we have that f being a
bijection gives the existence of a left and right inverse.
\left(\Leftarrow\right): Suppose we have a mapping f:X\rightarrow Y
such that \exists g_1:Y\mathlarger{\mathlarger{\rightarrow}}X such
that g_1\circ f=\mathop{\mathrm{id}}_X and
\exists g_2:Y\mathlarger{\mathlarger{\rightarrow}}X such that
f\circ g_2=\mathop{\mathrm{id}}_Y. Then f has both a left inverse
and a right inverse, hence by proposition
32{reference-type="ref"
reference="prop:LeftRightInverseImpliesBijection"} we have that f is a
bijection.
The result is shown. $\qed$ :::
We have seen that if f:X\rightarrow Y is a bijection then f has both
a left and a right inverse, likewise if these two inverses exist then we
have that f is a bijection. This property is key to defining what we
mean by the inverse to a bijective mapping.
::: definition Definition 54. Inverse
Let f:X\mathlarger{\mathlarger{\rightarrow}}Y be a mapping. We say
that the mapping g:Y\mathlarger{\mathlarger{\rightarrow}}X is an
inverse6 of f if we have that g is both a left inverse and a
right inverse for f. This is to say, g is an inverse of f if we
have that
$$\begin{align*} g\circ f&=\mathop{\mathrm{id}}_X\\ f\circ g &=\mathop{\mathrm{id}}_Y \end{align*}$$
We sometimes use the notation f^{-1} to denote the inverse.
:::
::: example
Example 47. Let
f:\mathbb{R}^+\mathlarger{\mathlarger{\rightarrow}}\mathbb{R}^+ be
such that f\left(x\right)=x^2, then we have that
g:\mathbb{R}^+\mathlarger{\mathlarger{\rightarrow}}\mathbb{R}^+ with
g\left(x\right)=\sqrt{x} is an inverse of f. Indeed
$$\begin{align*} g\circ f\left(x\right)&=g\left(f\left(x\right)\right)=g\left(x^2\right)=\sqrt{x^2}=x=\mathop{\mathrm{id}}_{\mathbb{R}^+}\left(x\right)\\ f\circ g\left(x\right)&=f\left(g\left(x\right)\right)=f\left(\sqrt{x}\right)=\left(\sqrt{x}\right)^2=x=\mathop{\mathrm{id}}_{\mathbb{R}^+}\left(x\right) \end{align*}$$ :::
::: example
Example 48. The identity mapping
\mathop{\mathrm{id}}_X:X\mathlarger{\mathlarger{\rightarrow}}X with
\mathop{\mathrm{id}}_X\left(x\right)=x,\ \forall x\in X is its own
inverse, indeed
$$\begin{equation*} \mathop{\mathrm{id}}_X\circ\mathop{\mathrm{id}}_X\left(x\right)=\mathop{\mathrm{id}}_X\left(\mathop{\mathrm{id}}_X\left(x\right)\right)=\mathop{\mathrm{id}}_X\left(x\right)=x \end{equation*}$$ :::
::: example
Example 49. Let
f:\left\{1,2\right\}\mathlarger{\mathlarger{\rightarrow}}\left\{a,b\right\}
be such that f\left(1\right)=a and f\left(2\right)=b. We have that
g:\left\{a,b\right\}\mathlarger{\mathlarger{\rightarrow}}\left\{1,2\right\}
with g\left(a\right)=1 and g\left(b\right)=2 is an inverse to f.
Indeed we have that g is a right inverse of f, as
$$\begin{align*} f\left(g\left(a\right)\right)&=f\left(1\right)=a\\ f\left(g\left(b\right)\right)&=f\left(2\right)=b \end{align*}$$
It also follows that g is a left inverse of f, indeed
$$\begin{align*} g\left(f\left(1\right)\right)&=g\left(a\right)=1\\ g\left(f\left(2\right)\right)&=g\left(b\right)=2 \end{align*}$$ :::
::: example
Example 50. Let
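f:\mathbb{R}\mathlarger{\mathlarger{\rightarrow}}\mathbb{R}^+\setminus\left\{0\right\} be given
by f\left(x\right)=e^x, where \mathbb{R}^+\setminus\left\{0\right\} is the set of strictly positive real numbers. We have that
g:\mathbb{R}^+\setminus\left\{0\right\}\mathlarger{\mathlarger{\rightarrow}}\mathbb{R} where
g\left(x\right)=\ln\left(x\right) is an inverse of f.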
:::
We shall prove that the composition of a mapping and its inverse gives the identity mapping. Firstly, we will need to show the following propositions.
::: {#prop:MappingInjectiveSurjectiveIFFInverseIsMapping .proposition} Proposition 34. Mapping is injective and surjective if and only if the inverse is a mapping
Let f:X\rightarrow Y be a mapping. We have that f is a bijection if
and only if f^{-1}, the inverse of f, is a mapping.
Proof:
\left(\Rightarrow\right): Let f:X\rightarrow Y be a bijection, then
f is both surjective and injective. Let y\in Y, then as f is
surjective we have that \exists x\in X such that f\left(x\right)=y,
moreover by injectivity of f we have that there is only one such x
which does this. Define g:Y\rightarrow X by
$$\begin{equation*} g\left(y\right)=x \end{equation*}$$
As y\in Y is an arbitrary element, it follows that
$$\begin{equation*}
\forall y\in Y:\exists x\in X : g\left(y\right)=x
\end{equation*}$$ such that x is unique for a given y. That is g
is a mapping. Now by the definition of g we have that
$$\begin{equation*}
\forall y\in Y: f\left(g\left(y\right)\right)=y
\end{equation*}$$ Now, let x\in X and let
$$\begin{equation*} x'=g\left(f\left(x\right)\right) \end{equation*}$$ then
$$\begin{equation*}
f\left(x'\right)=f\left(g\left(f\left(x\right)\right)\right)=f\left(x\right)
\end{equation*}$$ by the above. However, f is an injection so we have
that x'=x and thus x=g\left(f\left(x\right)\right).
It follows that f and g are inverse mappings of each other.
\left(\Leftarrow\right): Suppose that f:X\rightarrow Y is a
mapping, moreover suppose that f^{-1}:Y\rightarrow X is also a mapping
which is the inverse of f. We show that f must be a bijection.
-
f is injective: Let
x,y\in X and suppose that f\left(x\right)=f\left(y\right). Then $$\begin{align*} f\left(x\right)&=f\left(y\right)\\ \Rightarrow f^{-1}\left(f\left(x\right)\right)&=f^{-1}\left(f\left(y\right)\right)\\ \Rightarrow x&=y,\ \text{as } f^{-1} \text{ is the inverse of } f \end{align*}$$ Hence we have that
f is injective.
-
f is surjective: Suppose that
y\in Y. We then have that $$\begin{align*} y&\in Y\\ \Rightarrow f^{-1}\left(y\right)&\in X,\ \text{as } f^{-1} \text{ is a mapping from } Y \text{ to } X\\ \Rightarrow f\left(f^{-1}\left(y\right)\right)&=y,\ \text{by definition of an inverse mapping}\\ \Rightarrow \exists x\in X: f\left(x\right)&= y,\ \text{where } x=f^{-1}\left(y\right) \end{align*}$$ Hence we have that
f is surjective.
As f is both injective and surjective it is a bijection. $\qed$
:::
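On finite sets, Proposition 34 can be illustrated by swapping the pairs of a mapping and checking whether the result is still a mapping. The sketch below is only an illustration on three hand-picked examples; the helper names `inverse_relation`, `is_mapping` and `is_bijection` are hypothetical.

```python
def inverse_relation(f: dict) -> set:
    """Swap every pair (x, f(x)) to (f(x), x)."""
    return {(y, x) for x, y in f.items()}

def is_mapping(relation: set, domain: set) -> bool:
    """A relation is a mapping on `domain` when every element of
    `domain` occurs as a first coordinate exactly once."""
    firsts = [a for a, _ in relation]
    return set(firsts) == domain and len(firsts) == len(domain)

def is_bijection(f: dict, codomain: set) -> bool:
    """Injective (no repeated values) and surjective (values fill the codomain)."""
    values = list(f.values())
    injective = len(set(values)) == len(values)
    surjective = set(values) == codomain
    return injective and surjective

Y = {"a", "b"}
examples = {
    "bijective": {1: "a", 2: "b"},
    "not injective": {1: "a", 2: "a", 3: "b"},
    "not surjective": {1: "a"},
}

# For each example the two tests agree: the swapped relation is a
# mapping Y -> X exactly when f is a bijection.
for name, f in examples.items():
    print(name, is_bijection(f, Y), is_mapping(inverse_relation(f), Y))
```

In the non-injective case the swapped relation sends "a" to two different elements, and in the non-surjective case it sends "b" nowhere, which are the two ways a relation can fail to be a mapping.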
We can now show that the inverse of a bijective mapping is also a bijective mapping.
::: {#prop:InverseBijectionIsBijection .proposition} Proposition 35. Inverse of a bijective mapping is a bijective mapping
Let f:X\rightarrow Y be a bijective mapping. We have that
f^{-1}:Y\rightarrow X, the inverse of f, is also a bijection.
Proof:
Let f:X\rightarrow Y be a bijective mapping. By definition of being a
bijection we have that f is both injective and surjective. By
proposition
34{reference-type="ref"
reference="prop:MappingInjectiveSurjectiveIFFInverseIsMapping"} we have
that f^{-1} is a mapping. Now it is clear that the inverse of the
inverse is the original mapping, that is
$$\begin{equation*} \left(f^{-1}\right)^{-1}=f \end{equation*}$$
Now, f is a bijection and thus is a mapping. So f^{-1} is a mapping
whose inverse, namely f, is also a mapping, and applying proposition
34{reference-type="ref"
reference="prop:MappingInjectiveSurjectiveIFFInverseIsMapping"} to f^{-1} we have
that f^{-1} is a bijection. As required. $\qed$
:::
We can now see that the composition of a bijective mapping with its inverse must be the identity map.
::: {#prop:BijectionWithInverseIsIdentity .proposition} Proposition 36. Composition of bijective mapping with the inverses is the identity mapping
Let f:X\rightarrow Y be a bijective mapping, and let
f^{-1}:Y\rightarrow X be the inverse mapping of f. We have that
$$\begin{align*} f\circ f^{-1} &=\mathop{\mathrm{id}}_Y\\ f^{-1}\circ f &= \mathop{\mathrm{id}}_X \end{align*}$$
Proof:
Let f:X\rightarrow Y be a bijective mapping, with inverse given by
f^{-1}:Y\rightarrow X. As f is bijective, by proposition
35{reference-type="ref"
reference="prop:InverseBijectionIsBijection"} we have that f^{-1} is a
bijection. Let x\in X, then we have that
$$\begin{equation*} \exists y\in Y: f\left(x\right)=y \Rightarrow f^{-1}\left(y\right)=x \end{equation*}$$
Hence, we have that
$$\begin{align*} f^{-1}\circ f\left(x\right)&=f^{-1}\left(f\left(x\right)\right),\ \text{By function composition}\\ &=f^{-1}\left(y\right),\ \text{By above}\\ &=x,\ \text{By above}\\ &=\mathop{\mathrm{id}}_X\left(x\right),\ \text{By the definition of the identity map of } X \end{align*}$$
We have that the domain of f^{-1}\circ f is clearly X, likewise the
co-domain is X, which is the same as \mathop{\mathrm{id}}_X.
Moreover \forall x\in X we have
f^{-1}\circ f\left(x\right)=x=\mathop{\mathrm{id}}_X\left(x\right). So
the mappings are equal.
Likewise, let y\in Y, then we have that
$$\begin{equation*} \exists x\in X: f^{-1}\left(y\right)=x \Rightarrow f\left(x\right)=y \end{equation*}$$
Hence, we have that
$$\begin{align*} f\circ f^{-1}\left(y\right)&=f\left(f^{-1}\left(y\right)\right),\ \text{By function composition}\\ &=f\left(x\right),\ \text{By above}\\ &=y,\ \text{By above}\\ &=\mathop{\mathrm{id}}_Y\left(y\right),\ \text{By the definition of the identity map of } Y \end{align*}$$
We have that the domain of f\circ f^{-1} is clearly Y, likewise
the co-domain is Y, which is the same as \mathop{\mathrm{id}}_Y.
Moreover \forall y\in Y we have
f\circ f^{-1}\left(y\right)=y=\mathop{\mathrm{id}}_Y\left(y\right).
So the mappings are equal.
In both cases the composition yields the required identity mappings, as required. $\qed$ :::
The Natural numbers
::: epigraph The natural numbers are the work of God. All the rest is the work of mankind.
Leopold Kronecker (Paraphrased) :::
Constructing the Natural numbers
We now have enough tools and core theory to start building up from the foundations of mathematics. We do this using the ZFC axioms, although perhaps not with the complete rigour we should be using. We touched on these briefly in section 2.1.5{reference-type="ref" reference="subsubSec:ZFCAxioms"}. We will state them again.
-
The axiom of extensionality:
The axiom of extensionality asserts that two sets are equal if and only if they contain the same elements.
-
The axiom of the empty-set:
The axiom of the empty-set asserts that there exists a set which contains no elements
-
The axiom of pairing:
The axiom of pairing asserts that given any set
A and any set B, there is a set C such that, given any set D, D is a member of C if and only if D is equal to A or D is equal to B. This is to say, given two sets, there is a set whose members are exactly the two given sets.
-
The axiom of specification:
The axiom of specification asserts that we can construct a set which satisfies a given condition, so long as this condition is not inherently contradictory.
-
The axiom of unions:
The axiom of unions asserts that we can perform the union of two sets
A and B.
-
The axiom of powers:
The axiom of powers asserts that for any set
S we can construct a set P\left(S\right) whose elements are all the possible subsets of S.
-
The axiom of infinity:
The axiom of infinity asserts that there is at least one infinite set
A, that is at least one set with infinitely many elements. That is, we have a set A such that \emptyset\in A and if x\in A then the set x\cup\left\{x\right\} is also in A.
-
The axiom of replacement:
We will need the next section to fully understand this axiom, however informally it asserts that we can take some set
S and form another set by replacing the elements of S by other sets according to any definite rule.
-
The axiom of foundation:
The axiom of foundation asserts that for every non-empty set
S, there exists an element x\in S such that x and S are disjoint. This also implies that no set can contain itself.
We also recall that we include the symbol \in in the ZFC axioms, which
allows us to talk about element inclusions in sets. In other words, ZFC
defines a set of axioms that allow us to talk about sets and elements of
sets. Next, we have that, formally speaking, ZFC is allowed to make
statements about mappings. Finally, ZFC has the power to prove
the results on sets and mappings from the previous two sections,
so we will assume these as well. We will use these as the building blocks
for building the natural numbers. How can we do this from the ZFC
axioms?
As it stands right now, ZFC only gives us the existence of the empty set and of at least one set which contains infinitely many elements. We start with the empty set, a set which contains no elements, and use the ZFC axioms to build a new set which contains the empty set.
Our ultimate goal is to identify each natural number with the number of
elements in some corresponding set. Hence naturally the empty set
containing no elements would be identified with the number 0, and so
on. The question is given that we only have the empty set, how can we
build a new set? We can use the axiom of powers. This states that we can
take any set S and construct a new set P\left(S\right) whose
elements are the possible subsets of S. Applying this to the
empty-set, a set which contains no elements and thus has no subsets
except for itself, must give us
P\left(\emptyset\right)=\left\{\emptyset\right\}. This is sufficient
for what we need to do.
So, we have two sets, \emptyset and \left\{\emptyset\right\}. We
shall identify \emptyset with 0 and \left\{\emptyset\right\} with
1.
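The power set of a small finite set can be computed directly, which makes it easy to check that P\left(\emptyset\right)=\left\{\emptyset\right\}. The following is a minimal sketch, assuming sets are modelled by Python's `frozenset` (so that sets can be elements of other sets); the function name `power_set` is hypothetical.

```python
from itertools import chain, combinations

def power_set(s: frozenset) -> frozenset:
    """Return the set of all subsets of s, each as a frozenset."""
    elems = list(s)
    subsets = chain.from_iterable(
        combinations(elems, r) for r in range(len(elems) + 1)
    )
    return frozenset(frozenset(c) for c in subsets)

empty = frozenset()
p1 = power_set(empty)  # the set whose only element is the empty set
p2 = power_set(p1)
p3 = power_set(p2)
p4 = power_set(p3)

# Number of elements at each stage of repeatedly taking power sets.
sizes = [len(empty), len(p1), len(p2), len(p3), len(p4)]
print(p1 == frozenset([empty]), sizes)
```

The sizes grow as 0, 1, 2, 4, 16, since a set with n elements has 2^n subsets.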
::: {#def:Zero .definition} Definition 55. Zero
We define the number zero to be \emptyset. That is, we say Zero is a
set that contains no elements.
:::
::: {#def:One .definition} Definition 56. One
We define the number one to be \left\{\emptyset\right\}. That is, we
say One is the set whose only element is \emptyset.
:::
How do we define any more numbers? We can use the axiom of unions. This
raises the question why not use the axiom of powers again? If we apply
the axiom of powers to \left\{\emptyset\right\} we get the set
$$\begin{equation*}
P\left(\left\{\emptyset\right\}\right)=\left\{\emptyset,\left\{\emptyset\right\}\right\}
\end{equation*}$$ If we assume we already know what the natural numbers
are, we could identify this with the number 2. However, a repeated
application of the axiom of powers would give us
$$\begin{equation*}
P\left(\left\{\emptyset,\left\{\emptyset\right\}\right\}\right)=\left\{\emptyset,\left\{\emptyset\right\},\left\{\left\{\emptyset\right\}\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}
\end{equation*}$$ which we would identify with the number 4. Another
application would give us a set of 2^4=16 elements, which we would identify with the number
16. Clearly, we are skipping numbers such as 3,5,7,9 etc. We can't
get additional numbers that aren't powers of 2. Instead, we can define
an operation that will allow us to construct each number one at a time.
This operation uses the axioms of pairing and unions, and starts off with the number
1, which we recall is the set
\left\{\emptyset\right\}. The axiom of pairing allows us to create a set that
contains as elements any two sets that have already been created.
Applying this to \left\{\emptyset\right\} with itself allows us to create the set
\left\{\left\{\emptyset\right\}\right\}. Applying the axiom of unions to
these two sets gives us
$$\begin{equation*}
\left\{\emptyset\right\}\cup\left\{\left\{\emptyset\right\}\right\}=\left\{\emptyset,\left\{\emptyset\right\}\right\}
\end{equation*}$$ This is in agreement with
P\left(\left\{\emptyset\right\}\right), so we can identify this with
the number 2. Similarly, applying the axiom of pairing to
\left\{\emptyset,\left\{\emptyset\right\}\right\}
with itself allows us to create the set
\left\{\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}.
Hence we can now apply this operation again on the set
\left\{\emptyset,\left\{\emptyset\right\}\right\} to get
$$\begin{equation*}
\left\{\emptyset,\left\{\emptyset\right\}\right\}\cup\left\{\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}=\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}
\end{equation*}$$ A set of 3 elements, so we identify this with the
number 3. Note that the same operation applied to 0 gives
\emptyset\cup\left\{\emptyset\right\}=\left\{\emptyset\right\}, which is 1.
We can keep doing this to build the Natural numbers. Let's
make some definitions.
::: definition Definition 57. The successor operation
Let x be a set. We define the successor operation, denoted by S to
be given by
$$\begin{equation} S\left(x\right)= x\cup\left\{x\right\} \end{equation}$$ :::
We call this the successor function, as it is clear in the context of
the Natural numbers that S\left(n\right)=n+1, but we shall prove this
later.
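The successor operation is short enough to write out and apply by hand. The sketch below models sets with Python's `frozenset`, which is an assumption of the illustration rather than part of the construction; the names `zero`, `one`, `two` and `three` are just labels for the first few results.

```python
def successor(x: frozenset) -> frozenset:
    """S(x) = x ∪ {x}."""
    return x | frozenset([x])

zero = frozenset()        # ∅
one = successor(zero)     # {∅}
two = successor(one)      # {∅, {∅}}
three = successor(two)    # {∅, {∅}, {∅, {∅}}}

# Each number is the set of all the numbers built before it.
print(three == frozenset([zero, one, two]), len(three))
```

Each application adds exactly one new element, namely the set it was applied to, so the number of elements goes up by one at every step.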
This definition allows us to essentially make any finite number. This leads us to our first potential definition for the Natural numbers. We first need to define the idea of recursion.
We have the following proposition
::: {#prop:EqualSuccOp .proposition} Proposition 37. Equality of successor operation
Let a,b be sets. We have that S\left(a\right)=S\left(b\right) if
and only if a=b.
Proof:
\left(\Rightarrow\right): Suppose that a,b are sets and
S\left(a\right)=S\left(b\right). By definition of S we have that
$$\begin{equation*} a\cup\left\{a\right\}=b\cup\left\{b\right\} \end{equation*}$$
Now, as a\in S\left(a\right) and S\left(a\right)=S\left(b\right)
then we have that a\in b\cup\left\{b\right\} and so a\in b or a=b.
Similarly, as b\in S\left(b\right) we get that
b\in a\cup\left\{a\right\} and so b\in a or b=a.
Now, if a=b we are done, so suppose a\neq b, then we have that
a\in b and b\in a. Consider the set given by
$$\begin{equation*} X=\left\{a,b\right\} \end{equation*}$$
which can be constructed by the Axiom of pairing. Now as a\in b we
have that b\cap \left\{a,b\right\}\neq\emptyset and likewise as
b\in a we have a\cap \left\{a,b\right\}\neq\emptyset. This
contradicts the Axiom of Foundation, X does not contain an element
that is disjoint from it. It follows that we can't have a\neq b and
conclude that a=b.
\left(\Leftarrow\right): This is trivial by the definition of S.
$\qed$
:::
There are a few extra properties about the successor function that we shall make use of
::: corollary Corollary 1. Successor mapping is injective
Let a,b be sets. We have that the successor function is injective,
that is for all sets a,b we have that
$$\begin{equation*} S\left(a\right)=S\left(b\right) \Rightarrow a=b \end{equation*}$$
Proof:
Suppose that a,b are arbitrary sets and that
S\left(a\right)=S\left(b\right), by proposition
37{reference-type="ref"
reference="prop:EqualSuccOp"} this holds if and only if a=b. Hence we
have injectivity. $\qed$
:::
::: corollary Corollary 2. Empty-set is not the successor of any set
We have that \emptyset\neq S\left(a\right) for all sets a.
Proof:
Consider the definition of S\left(a\right) and suppose for
contradiction that \emptyset= S\left(a\right). We have by definition
of the successor mapping that
$$\begin{equation*}
\emptyset=S\left(a\right)=a\cup\left\{a\right\}
\end{equation*}$$ This is a contradiction, as a\in a\cup\left\{a\right\}, so
a\cup\left\{a\right\} has at least one element, namely a, but the
empty-set by definition has no elements. $\qed$
:::
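Proposition 37 and its two corollaries can be spot-checked on a few small sets. This is a sanity check on a hypothetical, hand-picked family of sets modelled with `frozenset`, not a substitute for the proofs.

```python
from itertools import combinations

def successor(x: frozenset) -> frozenset:
    """S(x) = x ∪ {x}."""
    return x | frozenset([x])

zero = frozenset()
one = successor(zero)
two = successor(one)
odd = frozenset([one])  # {{∅}}, a set that is not one of our numbers

# Five pairwise distinct sets to test on.
family = [zero, one, two, odd, successor(odd)]

# Distinct sets have distinct successors (injectivity on this family).
injective_on_family = all(
    successor(a) != successor(b) for a, b in combinations(family, 2)
)

# S(a) always contains a, so it is never the empty set.
contains_argument = all(a in successor(a) for a in family)
never_empty = all(successor(a) != frozenset() for a in family)

print(injective_on_family, contains_argument, never_empty)
```

Python's `frozenset` values are built from the bottom up and so can never contain themselves, which is the computational counterpart of the axiom of foundation used in the proof.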
::: definition Definition 58. Recursive definition of a set
A set S is defined recursively if the elements of S are defined in
terms of other elements x\in S. Moreover we have that there is some
initial element x_0 which is used to define the other elements of the
set.
:::
::: definition Definition 59. First definition of the Natural numbers
We define the set \mathbb{N}, called the set of natural numbers, to
be the set given by
$$\begin{equation} \mathbb{N}=\left\{x: x=\emptyset\text{ or } x=S\left(y\right)\text{ for some } y\in\mathbb{N}\right\} \end{equation}$$ :::
We have defined \mathbb{N} recursively in terms of elements of
\mathbb{N}. As an example 2\in\mathbb{N} as 2=S\left(1\right) and
likewise 1=S\left(0\right) and we know that 0 is really the same as
\emptyset, which is the initial element of \mathbb{N} as defined
above. This definition allows us to get any x\in\mathbb{N}, however it
is not quite enough to get every element of \mathbb{N} at the same
time. We know that there should be infinitely many natural numbers,
indeed for any n\in\mathbb{N} we have also that n+1\in\mathbb{N}. In
other words we have a chain of sets of increasing size, that is we have
$$\begin{align*}
\mathbb{N}_0&=\emptyset=0\\
\mathbb{N}_1&=\left\{\emptyset\right\}=1\\
\mathbb{N}_2&=\left\{\emptyset,\left\{\emptyset\right\}\right\}=2\\
\mathbb{N}_3&=\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}=3
\end{align*}$$ which satisfy
\mathbb{N}_0\subset\mathbb{N}_1\subset\mathbb{N}_2\subset\mathbb{N}_3\subset\dots.
So we see at each stage \mathbb{N}_n is a finite set of size n and
so our current definition of \mathbb{N} can only
ever reach a finite n, although we can make this n arbitrarily
large. To ensure we get every possible n at once we need to invoke the
axiom of infinity.
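The chain of finite stages can be generated for the first few values of n to see the pattern. This is a minimal sketch with sets modelled as `frozenset`; the helper name `von_neumann` and the cut-off at six stages are arbitrary choices for the illustration.

```python
def successor(x: frozenset) -> frozenset:
    """S(x) = x ∪ {x}."""
    return x | frozenset([x])

def von_neumann(n: int) -> frozenset:
    """Build the set identified with n by applying S to ∅ n times."""
    x = frozenset()
    for _ in range(n):
        x = successor(x)
    return x

stages = [von_neumann(k) for k in range(6)]

# The k-th stage has exactly k elements.
sizes = [len(s) for s in stages]

# Each stage is a proper subset of the next (`<` on frozensets) ...
strictly_increasing = all(a < b for a, b in zip(stages, stages[1:]))
# ... and is also an element of the next.
each_stage_in_next = all(a in b for a, b in zip(stages, stages[1:]))

print(sizes, strictly_increasing, each_stage_in_next)
```

However large the cut-off is made, the loop only ever produces finitely many finite sets, which is the limitation the paragraph above describes.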
-
The axiom of infinity:
The axiom of infinity asserts that there is at least one infinite set
A, that is at least one set with infinitely many elements. That is, we have a set A such that \emptyset\in A and if x\in A then the set x\cup\left\{x\right\} is also in A.
There is a useful definition that we can extract from the axiom of infinity.
::: definition Definition 60. Inductive set
Let A be a set and let f:A\rightarrow A be a mapping. We say that
A is an inductive set if it satisfies the following two properties
-
$\emptyset\in A$
-
If
x\in A then $f\left(x\right)\in A$
For now, we will be focused on the case where f=S, the successor
mapping.
:::
In light of the axiom of infinity we have a set that contains the infinitely many Natural numbers. This is nearly what we want, although it won't be the set of Natural numbers. This set could clearly have many, many more things than just the Natural numbers.
We can make a new definition, which will allow us to define
\mathbb{N}. We will also be able to show the fact this definition
defines \mathbb{N} to be the smallest such inductive set that contains
all of the Natural numbers.
::: definition Definition 61. The set $\mathbb{N}_S$
Let S be an inductive set. We define \mathbb{N}_S as follows
$$\begin{equation} \mathbb{N}_S=\bigcap_{\substack{A\subseteq S \\ A\text{ is inductive}}} A \end{equation}$$
This is well-defined by the axiom of specification: being an inductive
set is a definable property and the collection of all subsets of S is a set we
can define.
:::
We have that all of these sets \mathbb{N}_S are the same.
::: {#thm:EveryNsSetIsSame .theorem}
Theorem 3. Every \mathbb{N}_S set is the same set
Let $S$ and $T$ be inductive sets, and define the sets $\mathbb{N}_S$ and
$\mathbb{N}_T$. We have that
$$\begin{equation} \mathbb{N}_S=\mathbb{N}_T \end{equation}$$
Proof:
By the axiom of extensionality we know that two sets are equal if and
only if they contain the same elements. To see that $\mathbb{N}_S$ and
$\mathbb{N}_T$ have the same elements, consider the new set given by
$$\begin{equation*} C=\mathbb{N}_S\cap\mathbb{N}_T \end{equation*}$$
We recall from proposition
8{reference-type="ref"
reference="prop:PropertiesOfUnionIntersectionSetinclusion"} that for two
sets $A$ and $B$ we have $A\cap B\subseteq A$. Hence it follows that
$$\begin{equation*}
C=\mathbb{N}_S\cap\mathbb{N}_T\subseteq\mathbb{N}_S
\end{equation*}$$ That is, every element of $C$ is also an element of
$\mathbb{N}_S$. Now recall the definition of $\mathbb{N}_S$,
$$\begin{equation*}
\mathbb{N}_S=\bigcap_{\substack{A\subseteq S \\ A\text{ is inductive}}} A
\end{equation*}$$ As $S$ is itself one of the inductive sets in this
intersection, we have $\mathbb{N}_S\subseteq S$, and since
$C\subseteq \mathbb{N}_S$ we conclude that $C\subseteq S$.
Now, we know that $S$ is an inductive set. Hence $S$ satisfies the
following
-   $\emptyset\in S$
-   If $x\in S$ then $S\left(x\right)\in S$, where $S\left(x\right)$ denotes the successor of $x$
If we can show that $C$ is an inductive set, then $C$ is one of
the sets we used to construct $\mathbb{N}_S$ and hence
$\mathbb{N}_S\subseteq C$, which will give the equality
$C=\mathbb{N}_S$.
Now, to show that $C$ is an inductive set we must show that
-   $\emptyset\in C$
-   If $x\in C$ then $S\left(x\right)\in C$
We take each in turn.
-   $\emptyset\in C$: We have that $C=\mathbb{N}_S\cap\mathbb{N}_T$, where
    $$\begin{align*} \mathbb{N}_S&=\bigcap_{\substack{A\subseteq S \\ A\text{ is inductive}}} A\\ \mathbb{N}_T&=\bigcap_{\substack{A\subseteq T \\ A\text{ is inductive}}} A \end{align*}$$
    Both $\mathbb{N}_S$ and $\mathbb{N}_T$ are intersections of inductive sets, each of which contains $\emptyset$, and so $\emptyset\in\mathbb{N}_S$ and $\emptyset\in\mathbb{N}_T$. As $C=\mathbb{N}_S\cap\mathbb{N}_T$ it follows that $\emptyset\in C$.
-   If $x\in C$ then $S\left(x\right)\in C$: Suppose that $x\in C$. As before we know that $C=\mathbb{N}_S\cap\mathbb{N}_T$, and by the definition of the intersection of two sets it follows that $x\in\mathbb{N}_S$ and $x\in\mathbb{N}_T$. Now we have that
    $$\begin{equation*} \mathbb{N}_S=\bigcap_{\substack{A\subseteq S \\ A\text{ is inductive}}} A \end{equation*}$$
    hence as $x\in\mathbb{N}_S$ we must have that $x\in A$ for every inductive subset $A$ of $S$. Each such $A$ is an inductive set, and so by the definition of an inductive set we have that $S\left(x\right)\in A$ for every inductive subset $A$ of $S$. Hence $S\left(x\right)\in\mathbb{N}_S$, and a similar argument shows that $S\left(x\right)\in\mathbb{N}_T$. It thus follows that $S\left(x\right)\in C$. As $x\in C$ was arbitrary, this holds for any $x\in C$.
Hence $C$ is an inductive set.
Now, as $C\subseteq S$ and $C$ is an inductive set, it
follows that $C$ is one of the inductive sets in the definition of
$\mathbb{N}_S$. Hence $\mathbb{N}_S\subseteq C$. As we also have
$C\subseteq\mathbb{N}_S$, the sets $\mathbb{N}_S$ and $C$
contain the same elements, and so by the axiom of extensionality $C=\mathbb{N}_S$.
Likewise, a similar argument shows that $C=\mathbb{N}_T$. So it
follows that $\mathbb{N}_S = \mathbb{N}_T$. $\qed$
:::
In light of this theorem we can now truly define \mathbb{N}.
::: definition Definition 62. The Natural numbers $\mathbb{N}$
Let S be an inductive set, and construct the set \mathbb{N}_S. The
set \mathbb{N}_S is the set of Natural numbers and by theorem
3{reference-type="ref"
reference="thm:EveryNsSetIsSame"} no matter the inductive set S we
have that all such \mathbb{N}_S are the same. Hence we simply refer to
the natural numbers by \mathbb{N}.
:::
We identify the elements of $\mathbb{N}$ not in terms of $\emptyset$
and sets of sets containing $\emptyset$, but instead by the more usual
numerals that we use. We have already defined Zero and One, by
definitions 55{reference-type="ref" reference="def:Zero"}
and 56{reference-type="ref" reference="def:One"}. The other
numbers follow likewise, i.e.
$$\begin{align*} 0&=\emptyset\\ 1&=S\left(0\right)=\left\{\emptyset\right\}\\ 2&=S\left(1\right)=\left\{\emptyset,\left\{\emptyset\right\}\right\}\\ 3&=S\left(2\right)\\ 4&=S\left(3\right)\\ &\dots\\ n+1&=S\left(n\right) \end{align*}$$
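The chain of definitions above can be mirrored on a computer. The following is a minimal Python sketch, assuming that sets are modelled by `frozenset` values; the helper names `successor` and `numeral` are illustrative choices and do not come from the text.

```python
def successor(x: frozenset) -> frozenset:
    """Return S(x) = x ∪ {x}, the successor of the set x."""
    return x | frozenset({x})


def numeral(n: int) -> frozenset:
    """Build the set representing n by applying S to the empty set n times."""
    current = frozenset()  # 0 is the empty set
    for _ in range(n):
        current = successor(current)
    return current


ZERO = numeral(0)
ONE = numeral(1)
TWO = numeral(2)
THREE = numeral(3)

# Each numeral is exactly the set of all the numerals that come before it.
print(ONE == frozenset({ZERO}))
print(TWO == frozenset({ZERO, ONE}))
print(THREE == frozenset({ZERO, ONE, TWO}))
# The set representing n has n elements.
print(len(numeral(4)))
```

Only finitely many numerals can ever be built this way, which matches the earlier remark that the axiom of infinity is needed to collect all of them into one set.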
We said that we can prove that \mathbb{N} is the smallest such
inductive set that contains all the natural numbers, this is to say if
A\subseteq\mathbb{N} is an inductive set we must have that
A=\mathbb{N}. We thankfully do not need to prove this as the previous
theorem gives this for free. This also gives us the following definition
for a minimally inductive set, we make the definition in such a way that
we argue about sets of inductive sets.
::: definition Definition 63. Minimally inductive set of sets
Let $S$ be a set whose elements are also sets satisfying some
condition, and let $f:S\rightarrow S$ be a mapping. We say that $S$ is
minimally inductive if and only if the following holds
-   $S$ is an inductive set under the mapping $f$
-   No proper subset of $S$ is inductive under the mapping $f$
:::
One of the most powerful properties of the natural numbers is the
principle of Induction. This tool is powerful in proving many statements
on the Natural numbers. It works in a similar way to how an inductive
set works7 . We show that the statement works for some base case,
usually $n=0$, and then we show that if it holds true for some $n$ then it
also holds true for $S\left(n\right)=n+1$.
::: theorem Theorem 4. The principle of induction
Suppose we have a proposition $P\left(n\right)$ about a Natural number
$n\in\mathbb{N}$. Moreover, suppose that
-   $P\left(0\right)$ is true
-   $P\left(n\right)$ being true implies $P\left(S\left(n\right)\right)=P\left(n+1\right)$ is true for any Natural number $n$.
If these two statements are true, we have that $P\left(n\right)$ is
true for any natural number $n$, and we say the proposition
$P\left(n\right)$ holds by the principle of mathematical induction.
Moreover we call $P\left(0\right)$ the base case for induction, and
the statement that $P\left(n\right)$ being true implies $P\left(n+1\right)$ being true is
the inductive step.
Proof:
Let $P\left(n\right)$ be a proposition about a Natural number
$n\in\mathbb{N}$ such that $P\left(n\right)$ satisfies
-   $P\left(0\right)$ is true
-   $P\left(n\right)$ being true implies $P\left(S\left(n\right)\right)=P\left(n+1\right)$ is true for any Natural number $n$.
Consider the set given by
$$\begin{equation*}
Q=\left\{n\in\mathbb{N}:P\left(n\right)\text{ is true}\right\}
\end{equation*}$$ That is, $Q$ is defined as the set of Natural numbers
$n$ such that $P\left(n\right)$ is true, so clearly
$Q\subseteq\mathbb{N}$. By hypothesis we know that $P\left(0\right)$ is
true, so $0\in Q$. Also by hypothesis we know that if $P\left(n\right)$
is true for some $n\in\mathbb{N}$ then
$P\left(S\left(n\right)\right)=P\left(n+1\right)$ is also true, so
$n\in Q$ implies $S\left(n\right)\in Q$. Hence $Q$ is an inductive subset of $\mathbb{N}$. As $\mathbb{N}$ is the smallest inductive set, we have
$\mathbb{N}\subseteq Q$, and so by the axiom of extensionality we have
that $Q=\mathbb{N}$. Hence $P\left(n\right)$ is true for every Natural
number $n\in\mathbb{N}$. $\qed$
:::
Now that we have induction we can make a final definition that will be useful. This definition combines a few previously proven results into a convenient package. This package is strong enough to prove the usual properties of the natural numbers, and is perhaps an easy way to remember the basis for deducing those properties.
::: definition Definition 64. The Peano axioms
We define the Peano axioms as follows. Let $A$ be a set and consider
the successor mapping on $A$, $S: A\rightarrow A$. Suppose that
-   $A$ is an inductive set, that is
    -   $\emptyset\in A$
    -   If $x\in A$ then $S\left(x\right)\in A$
-   $S$ is an injective mapping.
-   For all $x\in A$ we have that $\emptyset\neq S\left(x\right)$
-   For all $B \subseteq A$, if $0\in B$ and $S\left(n\right)\in B$ for all $n\in B$, then $B=A$
If $A$ satisfies all of the above, then we say that $A$ satisfies the
Peano axioms and induces Peano arithmetic.
:::
Properties of the natural numbers
Although we have constructed \mathbb{N} we haven't defined what we can
do with this set. We know from our intuitions that we can define
addition, a form of subtraction, multiplication and in some cases
division. We also know that there is some notion of a Natural number
being larger or smaller than another, of two Natural numbers being equal,
and so on. We will explore some of these properties so that we can start
doing some form of Mathematics.
Equality of natural numbers
Firstly, it is important to define when two Natural numbers are equal, again as we have defined the natural numbers in terms of Sets, this just comes down to the axiom of extensionality.
::: definition Definition 65. Equality of natural numbers
Let n,m\in\mathbb{N} be two natural numbers. We define that two
natural numbers are equal, denoted n=m if and only if n\subseteq m
and m\subseteq n. This is simply the axiom of extensionality.
If we do not have n=m then we say that n and m are not equal and
we denote this n\neq m.
:::
This definition clearly makes sense as each natural number is a set.
::: example
Example 51. We have that 1=1. Indeed by definition 0=\emptyset
and 1=\left\{\emptyset\right\}. It is clear that
\left\{\emptyset\right\}\subseteq \left\{\emptyset\right\} hence the
axiom of extensionality gives us that
\left\{\emptyset\right\}=\left\{\emptyset\right\}. That is $1=1$
:::
::: example
Example 52. We have that $3=3$. Indeed by construction we have that
$3=\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}$.
It is clear that
$\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}\subseteq\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}$,
hence the axiom of extensionality gives us that
$\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}=\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}$,
i.e. $3=3$.
:::
::: example
Example 53. We have $1\neq 2$. We have that
$1=\left\{\emptyset\right\}$ and
$2=\left\{\emptyset,\left\{\emptyset\right\}\right\}$. Now
$\left\{\emptyset\right\}\subseteq \left\{\emptyset,\left\{\emptyset\right\}\right\}$
but
$\left\{\emptyset,\left\{\emptyset\right\}\right\}\not\subseteq \left\{\emptyset\right\}$, so the two sets are not equal.
:::
In particular in light of the definition of equality on the natural
numbers if n=m and m=k we must have that n=k.
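The equality test by mutual inclusion can be checked mechanically on small cases. The following is a minimal Python sketch, assuming natural numbers are modelled as nested `frozenset` values; the names `successor`, `numeral` and `nat_equal` are hypothetical helpers introduced only for this illustration.

```python
def successor(x: frozenset) -> frozenset:
    """Return S(x) = x ∪ {x}."""
    return x | frozenset({x})


def numeral(n: int) -> frozenset:
    """Build the set representing n by applying S to the empty set n times."""
    current = frozenset()
    for _ in range(n):
        current = successor(current)
    return current


def nat_equal(n: frozenset, m: frozenset) -> bool:
    """n = m if and only if n ⊆ m and m ⊆ n."""
    return n.issubset(m) and m.issubset(n)


one, two, three = numeral(1), numeral(2), numeral(3)

print(nat_equal(three, three))                 # both inclusions hold
print(one.issubset(two), two.issubset(one))    # only one inclusion holds
print(nat_equal(one, two))
```

The second line of output shows why $1\neq 2$: one inclusion holds and the other fails, exactly as in the example above.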
Inequality of natural numbers
We can also define what it means for natural numbers to not be
equal. We make use of the notion of set inclusion. Recall that a set S
is a subset of the set T, written S\subseteq T, if for every
s\in S we have that s\in T and that S is a proper subset of T,
written S\subset T if S\subseteq T and S\neq T. We will use the
proper subset notation to define the so-called less than operator. This
operation comes naturally from the definition of the natural numbers by
the successor mapping. The successor function has the following chain of
definitions for each n\in\mathbb{N}
$$\begin{align*} 0&=\emptyset\\ 1&=S\left(0\right)=\left\{\emptyset\right\}\\ 2&=S\left(1\right)=\left\{\emptyset,\left\{\emptyset\right\}\right\}\\ 3&=S\left(2\right)=\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}\\ 4&=S\left(3\right)=\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\},\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}\right\}\\ &\dots\\ n+1&=S\left(n\right) \end{align*}$$
From this chain of definitions and the axiom of foundation,
$0=\emptyset$ is the minimal element of $\mathbb{N}$ with respect to set membership, and
every natural number is contained in each one that comes after it. We can make
the following definition which defines when one natural number is
smaller than another.
::: definition Definition 66. Less than Operator
Let $n,m\in\mathbb{N}$. The less than operator, denoted by $n<m$ and
read as $n$ is less than $m$, is defined as follows.
We have $n<m$ if and only if $n\subset m$. For the natural numbers as constructed above, this is the same as saying that the set that denotes the
number $n$ is an element of the set $m$. In the language of mathematical
logic, we have that $<$ is actually a logical proposition, given
by
$$\begin{equation*} <\left(n,m\right)=\begin{cases} 1,\ \text{If } n\subset m\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$
:::
Recall that for predicates 0 indicates that the predicate is false and
1 indicates that the predicate is true.
::: example
Example 54. We have that 2<3. Indeed
2=\left\{\emptyset,\left\{\emptyset\right\}\right\} and
3=\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\},
and clearly
$$\begin{equation*} \left\{\emptyset,\left\{\emptyset\right\}\right\}\subset\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\} \end{equation*}$$
:::
We can combine the less than operator with the equality operator.
::: definition Definition 67. Less than or equal to operator
Let n,m\in\mathbb{N}. The less than or equal to operator, denoted by
n\leq m, and read as n is less than or equal to m, is defined the
same as n<m except we now allow for the situation that n=m. This is
to say \leq is a logical proposition given by
$$\begin{equation*} \leq\left(n,m\right) = \begin{cases} 1,\ \text{If } n< m\\ 1,\ \text{If } n=m\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$
Where on the right-hand side of the definition we are talking about sets, and on the left-hand side we are talking about natural numbers, although we know these are the same thing.
:::
::: example
Example 55. We have that 2\leq 3. From the previous example, we
know that 2<3. Moreover, we have that 3\leq 3 as 3=3.
:::
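The less than and less than or equal to operators can be tried out on small cases. The following is a minimal Python sketch, assuming natural numbers are modelled as nested `frozenset` values, for which Python's `<` is the proper subset test; the names `less_than` and `less_or_equal` are hypothetical helpers for this illustration.

```python
def successor(x: frozenset) -> frozenset:
    """Return S(x) = x ∪ {x}."""
    return x | frozenset({x})


def numeral(n: int) -> frozenset:
    """Build the set representing n by applying S to the empty set n times."""
    current = frozenset()
    for _ in range(n):
        current = successor(current)
    return current


def less_than(n: frozenset, m: frozenset) -> bool:
    """n < m if and only if n is a proper subset of m."""
    return n < m  # on frozensets, < means proper subset


def less_or_equal(n: frozenset, m: frozenset) -> bool:
    """n <= m if and only if n < m or n = m."""
    return less_than(n, m) or n == m


two, three = numeral(2), numeral(3)
print(less_than(two, three), less_than(three, three))
print(less_or_equal(two, three), less_or_equal(three, three))

# For these numerals, proper inclusion and membership agree on small cases.
agree = all(
    less_than(numeral(i), numeral(j)) == (numeral(i) in numeral(j))
    for i in range(5)
    for j in range(5)
)
print(agree)
```

The final check is only a finite experiment over the numerals $0$ to $4$, not a proof that proper inclusion and membership coincide.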
We have defined less than and less than or equal to. We can define
similar notions of greater than and greater than or equal to by
reversing the roles of $n$ and $m$, that is, by considering when $m\subset n$.
::: definition Definition 68. Greater than operator
Let $n,m\in\mathbb{N}$. The greater than operator, denoted by $n>m$ and
read as $n$ is greater than $m$, is defined as follows.
We have $n>m$ if and only if $m\subset n$, that is, if and only if $m<n$. Equivalently, the set that
denotes the number $m$ is an element of the set $n$. That is to say
that $>$ is a logical proposition, given by
$$\begin{equation*} >\left(n,m\right)=\begin{cases} 1,\ \text{If } m\subset n\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$
:::
Likewise, we can define the greater than or equal to operator.
::: definition Definition 69. Greater than or equal to operator
Let n,m\in\mathbb{N}. The greater than or equal to operator, denoted
by n\geq m, and read as n is greater than or equal to m, is
defined the same as n>m except we now allow for the situation that
n=m. This is to say \geq is a logical proposition given by
$$\begin{equation*} \geq\left(n,m\right) = \begin{cases} 1,\ \text{If } n> m\\ 1,\ \text{If } n=m\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$
:::
Defining addition and multiplication on the Natural numbers
We can use the principle of induction to make definitions as well as a
proof technique. We shall use induction now to make two definitions, in
particular, we define two mappings that will allow us to start
manipulating Natural numbers as we expect to. To do so it is enough
to specify what the mapping does when $0$ is given as an argument, and
then to define what the mapping does when given $S\left(n\right)$ as an
argument in terms of its value at $n$, for each $n\in\mathbb{N}$.
This will make sense when we define these operations.
We first recall the Cartesian product of two sets. Let $S$ and $T$ be
sets. The Cartesian product of $S$ and $T$, denoted $S\times T$, is the
set of all ordered pairs of the form $\left(s,t\right)$ where $s\in S$
and $t\in T$. This is to say that
$$\begin{equation*} S\times T=\left\{\left(s,t\right):s\in S,t\in T\right\} \end{equation*}$$
If $S=T$ then we denote $S\times T$ by $S^2$.
::: definition Definition 70. Addition on the Natural numbers
We define addition on the Natural numbers by the following mapping. Let
$+:\mathbb{N}^2\rightarrow\mathbb{N}$ be such that for all
$\left(m,n\right)\in\mathbb{N}^2$ we have the following
$$\begin{align} +&:\mathbb{N}^2\mathlarger{\mathlarger{\rightarrow}}\mathbb{N}\\ \left(m,n\right)&\mapsto +\left(m,n\right)=\begin{cases} m+0=m,\ \text{If } n=0\\ m+S\left(n\right)=S\left(m+n\right),\ \text{If } n\neq 0 \end{cases} \end{align}$$
Here the first case gives the value when the second argument is $0$, and the second case gives the value when the second argument is a successor $S\left(n\right)$.
We will write $+\left(m,n\right)$ as $m+n$.
:::
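The recursive definition of addition translates directly into a recursive function. The following is a minimal Python sketch, assuming natural numbers are modelled as nested `frozenset` values. The helper `predecessor` is an assumption of this sketch rather than something defined in the text: for a non-zero numeral $S\left(n\right)=\left\{0,1,\dots,n\right\}$ it recovers $n$ as the union of the elements.

```python
def successor(x: frozenset) -> frozenset:
    """Return S(x) = x ∪ {x}."""
    return x | frozenset({x})


def numeral(n: int) -> frozenset:
    """Build the set representing n by applying S to the empty set n times."""
    current = frozenset()
    for _ in range(n):
        current = successor(current)
    return current


def predecessor(n: frozenset) -> frozenset:
    """Return the k with S(k) = n, for a non-zero numeral n."""
    if not n:
        raise ValueError("0 is not the successor of any natural number")
    # Every element of S(k) is a subset of k, and k itself is an element,
    # so the union of all the elements is k.
    return frozenset().union(*n)


def add(m: frozenset, n: frozenset) -> frozenset:
    """m + 0 = m and m + S(k) = S(m + k)."""
    if not n:  # second argument is 0
        return m
    return successor(add(m, predecessor(n)))


print(add(numeral(1), numeral(1)) == numeral(2))
print(add(numeral(2), numeral(3)) == numeral(5))
```

Each recursive call peels one successor off the second argument, so the recursion always terminates at the base case $m+0=m$.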
In light of this definition, we can prove that 1+1=2
::: theorem Theorem 5. 1+1=2
We have that 1+1=2.
Proof:
We know that $1=S\left(0\right)$ and $2=S\left(S\left(0\right)\right)$.
Hence, we are proving
$$\begin{equation*} S\left(0\right)+S\left(0\right)=S\left(S\left(0\right)\right) \end{equation*}$$
By the definition of the addition mapping, we know that
$\forall \left(m,n\right)\in\mathbb{N}^2$ that
$$\begin{equation*}
m+S\left(n\right)=S\left(m+n\right)
\end{equation*}$$ In particular if $n=0$ we have $\forall m$ that
$$\begin{equation*} m+S\left(0\right)=S\left(m+0\right) \end{equation*}$$
and so, taking $m=S\left(0\right)$,
$$\begin{equation}
\label{eq:OnePlusOneProofEq1}
S\left(0\right)+S\left(0\right)=S\left(S\left(0\right)+0\right)
\end{equation}$$ Moreover, by the definition of addition, we know that
$\forall m$ that if $n=0$ then
$$\begin{equation*} m+0=m \end{equation*}$$ Hence
$$\begin{align*} S\left(0\right)+0&=S\left(0\right)\\ \Rightarrow S\left(S\left(0\right)+0\right)&= S\left(S\left(0\right)\right)\\ \Rightarrow S\left(0\right)+S\left(0\right)&=S\left(S\left(0\right)\right) \end{align*}$$
That is to say, $1+1=2$, as required. $\qed$
:::
::: definition Definition 71. Multiplication on the Natural numbers
We define multiplication on the Natural numbers by the following
mapping. Let $*:\mathbb{N}\times\mathbb{N}\rightarrow\mathbb{N}$ be such
that for all $\left(m,n\right)\in\mathbb{N}\times\mathbb{N}$ we have the
following
$$\begin{align}
*&:\mathbb{N}\times\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N}\\
\left(m,n\right)&\mapsto *\left(m,n\right)=\begin{cases}
m*0=0,\ \text{If } n=0\\
m*S\left(n\right)=m*n+m,\ \text{If } n\neq 0
\end{cases}
\end{align}$$ We will write $*\left(m,n\right)$ as $m*n$, or more
compactly just as the juxtaposition $mn$.
:::
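Multiplication can be sketched in the same way, with the recursion calling addition at each step. The following is a minimal Python sketch under the same assumptions as before: natural numbers are nested `frozenset` values, and `predecessor`, `add` and `multiply` are hypothetical helper names chosen for this illustration.

```python
def successor(x: frozenset) -> frozenset:
    """Return S(x) = x ∪ {x}."""
    return x | frozenset({x})


def numeral(n: int) -> frozenset:
    """Build the set representing n by applying S to the empty set n times."""
    current = frozenset()
    for _ in range(n):
        current = successor(current)
    return current


def predecessor(n: frozenset) -> frozenset:
    """Return the k with S(k) = n, for a non-zero numeral n."""
    if not n:
        raise ValueError("0 is not the successor of any natural number")
    return frozenset().union(*n)


def add(m: frozenset, n: frozenset) -> frozenset:
    """m + 0 = m and m + S(k) = S(m + k)."""
    return m if not n else successor(add(m, predecessor(n)))


def multiply(m: frozenset, n: frozenset) -> frozenset:
    """m * 0 = 0 and m * S(k) = m * k + m."""
    if not n:  # second argument is 0
        return frozenset()
    return add(multiply(m, predecessor(n)), m)


print(multiply(numeral(2), numeral(2)) == numeral(4))
print(multiply(numeral(3), numeral(0)) == numeral(0))
print(multiply(numeral(3), numeral(4)) == numeral(12))
```

The inputs are kept small on purpose: comparing deeply nested sets becomes slow quickly, so this representation is for illustration rather than computation.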
As with addition we provide a proof that 2*2=4
::: theorem Theorem 6. 2*2=4
We have 2*2=4.
Proof:
We know that $S\left(1\right)=2$ and so by definition of multiplication
we have that
$$\begin{equation*} 2*2=2*S\left(1\right)=2*1+2 \end{equation*}$$
Likewise we know that $S\left(0\right)=1$ and so by another application
of the definition of multiplication we have that
$$\begin{equation*} 2*1+2=2*S\left(0\right)+2=2*0+2+2 \end{equation*}$$
Now $2*0=0$ by definition, and $0+2=S\left(S\left(0+0\right)\right)=2$ by two applications of the definition of addition, and so we have that
$$\begin{equation*} 2*2=2*0+2+2=0+2+2=2+2 \end{equation*}$$
It is left to show that $2+2 = 4$. We use a similar proof to $1+1=2$.
As
$4=S\left(S\left(2\right)\right)=S\left(S\left(S\left(S\left(0\right)\right)\right)\right)$
and $2=S\left(S\left(0\right)\right)$ we need to show that
$$\begin{equation*} S\left(S\left(0\right)\right)+S\left(S\left(0\right)\right)=S\left(S\left(S\left(S\left(0\right)\right)\right)\right) \end{equation*}$$
By the definition of addition we have that
$\forall\left(m,n\right)\in\mathbb{N}^2$ that
$$\begin{equation*} m+S\left(n\right)=S\left(m+n\right) \end{equation*}$$
In particular, if $n=0$ we have $\forall m\in\mathbb{N}$ that
$$\begin{equation*} m+S\left(0\right)=S\left(m+0\right) \end{equation*}$$
So that
$$\begin{align*} S\left(S\left(0\right)\right)+S\left(S\left(0\right)\right)&=S\left(S\left(S\left(0\right)\right)+S\left(0\right)\right)\\ &=S\left(S\left(S\left(S\left(0\right)\right)+0\right)\right)\\ &=S\left(S\left(S\left(S\left(0\right)\right)\right)\right) \end{align*}$$
That is $2+2=4$ and so the theorem is proved. $\qed$
:::
These two definitions are enough to prove every elementary property of addition and multiplication that we are familiar with. However to do so will require an upgrade to the idea of induction. This will allow us to perform induction on both the addition and multiplication mappings. Once we have done this we will have put the natural numbers on a firm logical basis. This idea is called double induction, or more clearly induction on two variables.
For example, we know from school that n+m=m+n for all natural numbers
n and m. To show that this is true, we start by induction on n, so
we have to show that m+0=0+m and then that \left(m+n=n+m\right)
implies that \left(m+S\left(n\right)=S\left(n\right)+m\right), each of
these will be proved by induction on m. This is the idea of double
induction.
::: theorem Theorem 7. Double induction
Let $P\left(m,n\right)$ be a proposition about a pair of natural
numbers $m,n\in\mathbb{N}$. Moreover suppose that
-   $P\left(0,0\right)$ is true.
-   $P\left(0,n\right)$ being true implies that $P\left(0,S\left(n\right)\right)$ is true.
-   $P\left(m,0\right)$ being true implies that $P\left(S\left(m\right),0\right)$ is true.
-   For a given $m\in\mathbb{N}$, from the truth of $P\left(m,x\right)$ for all $x$, and also that of $P\left(S\left(m\right),n\right)$ for some $n$, we can infer that $P\left(S\left(m\right),S\left(n\right)\right)$ is true.
If these statements are true, we have that $P\left(m,n\right)$ is true
for any natural numbers $m,n\in\mathbb{N}$ and we say that the
proposition $P\left(m,n\right)$ holds by the principle of mathematical
double induction.
Proof:
Let $P\left(m,n\right)$ be a proposition about a pair of natural
numbers $m,n\in\mathbb{N}$, which satisfies
-   $P\left(0,0\right)$ is true.
-   $P\left(0,n\right)$ being true implies that $P\left(0,S\left(n\right)\right)$ is true.
-   $P\left(m,0\right)$ being true implies that $P\left(S\left(m\right),0\right)$ is true.
-   For a given $m\in\mathbb{N}$, from the truth of $P\left(m,x\right)$ for all $x$, and also that of $P\left(S\left(m\right),n\right)$ for some $n$, we can infer that $P\left(S\left(m\right),S\left(n\right)\right)$ is true.
Statements 1 and 2 are the base case and the inductive step for the
proof of $P\left(0,n\right)$ for all $n\in\mathbb{N}$. Likewise
statements 1 and 3 are the base case and the inductive step for the
proof of $P\left(m,0\right)$ for all $m\in\mathbb{N}$.
Finally, statements 3 and 4 are the base case and the inductive step
for a proof, by induction on $n$, of the statement that if
$P\left(m,n\right)$ holds for all $n$, then
$P\left(S\left(m\right),n\right)$ holds for all $n$. Together with $P\left(0,n\right)$ for all $n$, induction on $m$
then gives that $P\left(m,n\right)$ is true for all $m$ and $n$. $\qed$
:::
We can start proving the basic properties of \mathbb{N} that we are
familiar with.
Closure properties of addition and multiplication
We show that adding or multiplying two natural numbers always
produces a natural number.
::: theorem Theorem 8. The addition and multiplication mappings on the natural numbers are closed
For all $n,m\in\mathbb{N}$ we have that
-   $n+m\in\mathbb{N}$.
-   $nm\in\mathbb{N}$.
Proof:
-   $n+m\in\mathbb{N}$: Let $n,m\in\mathbb{N}$. We argue by double induction, so we need to show that
    -   $0+0\in\mathbb{N}$
    -   $0+n\in\mathbb{N}$ implies $0+S\left(n\right)\in\mathbb{N}$
    -   $m+0\in\mathbb{N}$ implies $S\left(m\right)+0\in\mathbb{N}$
    -   For a given $m\in\mathbb{N}$, if $m+x\in\mathbb{N}$ for all $x\in\mathbb{N}$ and $S\left(m\right)+n\in\mathbb{N}$ for some $n\in\mathbb{N}$, then $S\left(m\right)+S\left(n\right)\in\mathbb{N}$
    We take each of these in turn.
    -   $0+0\in\mathbb{N}$: We have by the definition of addition that
        $$\begin{equation*} 0+0=0 \end{equation*}$$
        which is clearly in $\mathbb{N}$.
    -   $0+n\in\mathbb{N}$ implies $0+S\left(n\right)\in\mathbb{N}$: Suppose that $0+n\in\mathbb{N}$ for some $n$, we show that $0+S\left(n\right)\in\mathbb{N}$. By the definition of addition we have that
        $$\begin{equation*} 0+S\left(n\right)=S\left(0+n\right) \end{equation*}$$
        Now $0+n\in\mathbb{N}$ by assumption, and $\mathbb{N}$ is an inductive set, therefore $S\left(0+n\right)\in\mathbb{N}$. Hence $0+S\left(n\right)\in\mathbb{N}$.
    -   $m+0\in\mathbb{N}$ implies $S\left(m\right)+0\in\mathbb{N}$: Suppose that $m+0\in\mathbb{N}$ for some $m$, we show that $S\left(m\right)+0\in\mathbb{N}$. By the definition of addition we have that
        $$\begin{equation*} S\left(m\right)+0=S\left(m\right)=S\left(m+0\right) \end{equation*}$$
        Now $m+0\in\mathbb{N}$ by assumption, and $\mathbb{N}$ is an inductive set, therefore $S\left(m+0\right)\in\mathbb{N}$. Hence $S\left(m\right)+0\in\mathbb{N}$.
    -   The fourth statement: Suppose that $m+x\in\mathbb{N}$ for all $x\in\mathbb{N}$ and some fixed $m\in\mathbb{N}$, and suppose that $S\left(m\right)+n\in\mathbb{N}$ where $n$ is some fixed value. We show that $S\left(m\right)+S\left(n\right)\in\mathbb{N}$. As $S\left(m\right)\in\mathbb{N}$ and $S\left(n\right)\in\mathbb{N}$ we can use the definition of addition, which gives
        $$\begin{equation*} S\left(m\right)+S\left(n\right)=S\left(S\left(m\right)+n\right) \end{equation*}$$
        By assumption $S\left(m\right)+n\in\mathbb{N}$, and as $\mathbb{N}$ is an inductive set we have that $S\left(S\left(m\right)+n\right)\in\mathbb{N}$. Therefore $S\left(m\right)+S\left(n\right)\in\mathbb{N}$.
    Hence by the principle of double induction we have that $m+n\in\mathbb{N}$ for all $m,n\in\mathbb{N}$. That is, addition is closed.
-   $nm\in\mathbb{N}$: Let $n,m\in\mathbb{N}$. We again argue by double induction, so we need to show that
    -   $0*0\in\mathbb{N}$
    -   $0*n\in\mathbb{N}$ implies $0*S\left(n\right)\in\mathbb{N}$
    -   $m*0\in\mathbb{N}$ implies $S\left(m\right)*0\in\mathbb{N}$
    -   For a given $m\in\mathbb{N}$, if $m*x\in\mathbb{N}$ for all $x\in\mathbb{N}$ and $S\left(m\right)*n\in\mathbb{N}$ for some $n\in\mathbb{N}$, then $S\left(m\right)*S\left(n\right)\in\mathbb{N}$
    We take each of these in turn.
    -   $0*0\in\mathbb{N}$: We have by the definition of multiplication that
        $$\begin{equation*} 0*0=0 \end{equation*}$$
        which is clearly in $\mathbb{N}$.
    -   $0*n\in\mathbb{N}$ implies $0*S\left(n\right)\in\mathbb{N}$: Suppose that $0*n\in\mathbb{N}$ for some $n$, we show that $0*S\left(n\right)\in\mathbb{N}$. By the definition of multiplication we have that
        $$\begin{equation*} 0*S\left(n\right)=0*n+0 \end{equation*}$$
        Now $0*n\in\mathbb{N}$ by assumption, and we have proved that addition is closed, so $0*n+0\in\mathbb{N}$. Therefore $0*S\left(n\right)\in\mathbb{N}$.
    -   $m*0\in\mathbb{N}$ implies $S\left(m\right)*0\in\mathbb{N}$: Suppose that $m*0\in\mathbb{N}$ for some $m$, we show that $S\left(m\right)*0\in\mathbb{N}$. By the definition of multiplication we have that
        $$\begin{equation*} S\left(m\right)*0=0 \end{equation*}$$
        Hence as $0\in\mathbb{N}$ we have that $S\left(m\right)*0\in\mathbb{N}$.
    -   The fourth statement: Suppose that $m*x\in\mathbb{N}$ for all $x\in\mathbb{N}$ and some fixed $m\in\mathbb{N}$, and suppose that $S\left(m\right)*n\in\mathbb{N}$ where $n$ is some fixed value. We show that $S\left(m\right)*S\left(n\right)\in\mathbb{N}$. As $S\left(m\right)\in\mathbb{N}$ and $S\left(n\right)\in\mathbb{N}$ we can use the definition of multiplication, which gives
        $$\begin{equation*} S\left(m\right)*S\left(n\right)=S\left(m\right)*n+S\left(m\right) \end{equation*}$$
        By assumption $S\left(m\right)*n\in\mathbb{N}$, and $S\left(m\right)\in\mathbb{N}$, so $S\left(m\right)*n+S\left(m\right)\in\mathbb{N}$ as addition is closed. Hence $S\left(m\right)*S\left(n\right)\in\mathbb{N}$.
    Hence by the principle of double induction we have that $m*n\in\mathbb{N}$ for all $m,n\in\mathbb{N}$. That is, multiplication is closed.
Hence, we have that the addition and multiplication mappings are closed. $\qed$
:::
Commutativity of addition and multiplication
This will prove that for all a,b\in\mathbb{N} that a+b=b+a and
ab=ba.
::: theorem Theorem 9. Addition and multiplication are commutative
For all a,b\in\mathbb{N} we have that
-
$a+b=b+a$
-
$ab=ba$
Proof:
-   $a+b=b+a$: We argue by double induction. We need to show that
    -   $0+0=0+0$
    -   $0+n=n+0$ implies $0+S\left(n\right)=S\left(n\right)+0$
    -   $m+0=0+m$ implies $S\left(m\right)+0=0+S\left(m\right)$
    -   If $m+x=x+m$ for all $x\in\mathbb{N}$ and $S\left(m\right)+n=n+S\left(m\right)$ for some $n\in\mathbb{N}$, then we have that $S\left(m\right)+S\left(n\right)=S\left(n\right)+S\left(m\right)$
    We take each of these in turn.
    -   $0+0=0+0$: This is trivial by definition of addition.
    -   $0+n=n+0$ implies $0+S\left(n\right)=S\left(n\right)+0$: Suppose that $0+n=n+0$, we show that $0+S\left(n\right)=S\left(n\right)+0$. By the definition of addition we have that
        $$\begin{equation*} 0+S\left(n\right)=S\left(0+n\right) \end{equation*}$$
        We know by assumption that $0+n=n+0$. Hence
        $$\begin{equation*} S\left(0+n\right)=S\left(n+0\right)=S\left(n\right)=S\left(n\right)+0 \end{equation*}$$
    -   $m+0=0+m$ implies $S\left(m\right)+0=0+S\left(m\right)$: Suppose that $m+0=0+m$, we show that $S\left(m\right)+0=0+S\left(m\right)$. By the definition of addition we have that
        $$\begin{equation*} S\left(m\right)+0=S\left(m\right)=S\left(m+0\right) \end{equation*}$$
        We know by assumption that $m+0=0+m$. Hence
        $$\begin{equation*} S\left(m+0\right)=S\left(0+m\right)=0+S\left(m\right) \end{equation*}$$
    -   The fourth statement: Suppose $m+x=x+m$ for all $x\in\mathbb{N}$ and that $S\left(m\right)+n=n+S\left(m\right)$ for some $n\in\mathbb{N}$, we show that $S\left(m\right)+S\left(n\right)=S\left(n\right)+S\left(m\right)$. We have
        $$\begin{equation*} S\left(m\right)+S\left(n\right)=S\left(S\left(m\right)+n\right) \end{equation*}$$
        Now we have by assumption that $S\left(m\right)+n=n+S\left(m\right)$, hence
        $$\begin{equation*} S\left(S\left(m\right)+n\right)=S\left(n+S\left(m\right)\right)=S\left(S\left(n+m\right)\right) \end{equation*}$$
        Likewise, using the assumption $m+x=x+m$ with $x=S\left(n\right)$, a similar chain of reasoning gives
        $$\begin{equation*} S\left(n\right)+S\left(m\right)=S\left(S\left(n\right)+m\right)=S\left(m+S\left(n\right)\right)=S\left(S\left(m+n\right)\right) \end{equation*}$$
        Finally, we have that $m+n=n+m$ by assumption, and so $S\left(S\left(n+m\right)\right)=S\left(S\left(m+n\right)\right)$.
    Hence by the principle of double induction we have that $a+b=b+a$ for all $a,b\in\mathbb{N}$. That is, addition is commutative.
-   $ab=ba$: We again argue by double induction. We need to show that
    -   $0*0=0*0$
    -   $0*n=n*0$ implies $0*S\left(n\right)=S\left(n\right)*0$
    -   $m*0=0*m$ implies $S\left(m\right)*0=0*S\left(m\right)$
    -   If $m*x=x*m$ for all $x\in\mathbb{N}$ and $S\left(m\right)*n=n*S\left(m\right)$ for some $n\in\mathbb{N}$, then we have that $S\left(m\right)*S\left(n\right)=S\left(n\right)*S\left(m\right)$
    We take each of these in turn.
    -   $0*0=0*0$: This is trivial by the definition of multiplication.
    -   $0*n=n*0$ implies $0*S\left(n\right)=S\left(n\right)*0$: Suppose that $0*n=n*0$, we show that $0*S\left(n\right)=S\left(n\right)*0$. We have by definition of multiplication that
        $$\begin{align*} 0*S\left(n\right)&=0*n+0\\ &=n*0+0,\ \text{By assumption}\\ &=0+0,\ \text{By definition of multiplication}\\ &=0,\ \text{By definition of addition}\\ &=S\left(n\right)*0,\ \text{By definition of multiplication} \end{align*}$$
    -   $m*0=0*m$ implies $S\left(m\right)*0=0*S\left(m\right)$: Suppose that $m*0=0*m$, we show that $S\left(m\right)*0=0*S\left(m\right)$. We have by definition of multiplication that
        $$\begin{align*} 0*S\left(m\right)&=0*m+0\\ &=m*0+0,\ \text{By assumption}\\ &=0+0,\ \text{By definition of multiplication}\\ &=0,\ \text{By definition of addition}\\ &=S\left(m\right)*0,\ \text{By definition of multiplication} \end{align*}$$
    -   The fourth statement: Suppose that $m*x=x*m$ for all $x\in\mathbb{N}$ and $S\left(m\right)*n=n*S\left(m\right)$ for some $n\in\mathbb{N}$, we show $S\left(m\right)*S\left(n\right)=S\left(n\right)*S\left(m\right)$. By definition of multiplication and of addition, regrouping the sums, we have that
        $$\begin{equation*} S\left(m\right)*S\left(n\right)=S\left(m\right)*n+S\left(m\right)=n*S\left(m\right)+S\left(m\right)=n*m+n+S\left(m\right)=n*m+S\left(n+m\right) \end{equation*}$$
        Likewise, using the assumption $m*x=x*m$ with $x=S\left(n\right)$, we have that
        $$\begin{equation*} S\left(n\right)*S\left(m\right)=S\left(n\right)*m+S\left(n\right)=m*S\left(n\right)+S\left(n\right)=m*n+m+S\left(n\right)=m*n+S\left(m+n\right) \end{equation*}$$
        Now, we know that addition is commutative so we have that $S\left(m+n\right)=S\left(n+m\right)$, moreover by assumption we have that $n*m=m*n$. Hence
        $$\begin{equation*} n*m+S\left(n+m\right)=m*n+S\left(m+n\right) \end{equation*}$$
    Hence by the principle of double induction we have that $ab=ba$ for all $a,b\in\mathbb{N}$. That is, multiplication is commutative.
The result now follows. $\qed$
:::
We can also now deduce further properties of addition and multiplication.
Associativity of addition
We now prove that for all $a,b,c\in\mathbb{N}$ we have
$a+\left(b+c\right)=\left(a+b\right)+c$.
::: theorem Theorem 10. Addition is associative
For all a,b,c\in\mathbb{N} we have that
$$\begin{equation*} a+\left(b+c\right)=\left(a+b\right)+c \end{equation*}$$
Proof: We show this by induction. Let $x,y\in\mathbb{N}$ be arbitrary, and let $P\left(n\right)$ be the proposition given by
$$\begin{equation*} \left(x+y\right)+n=x+\left(y+n\right) \end{equation*}$$
For the base case we have $n=0$ and so
$$\begin{align*} \left(x+y\right)+0&=x+y,\ \text{By definition of addition}\\ &=x+\left(y+0\right),\ \text{By definition of addition} \end{align*}$$
Hence $P\left(0\right)$ is true.
Now, suppose that $P\left(n\right)$ is true, that is
$$\begin{equation*} \left(x+y\right)+n=x+\left(y+n\right) \end{equation*}$$
We show that $P\left(S\left(n\right)\right)$ is also true, that is
$$\begin{equation*} \left(x+y\right)+S\left(n\right)=x+\left(y+S\left(n\right)\right) \end{equation*}$$
Now, we have that
$$\begin{align*} \left(x+y\right)+S\left(n\right)&=S\left(\left(x+y\right)+n\right),\ \text{By definition of addition}\\ &=S\left(x+\left(y+n\right)\right),\ \text{By the induction hypothesis}\\ &=x+S\left(y+n\right),\ \text{By definition of addition}\\ &=x+\left(y+S\left(n\right)\right),\ \text{By definition of addition} \end{align*}$$
Hence $P\left(S\left(n\right)\right)$ is true.
Since $x$ and $y$ were arbitrary, it follows by mathematical induction that for all $a,b,c\in\mathbb{N}$
we have that $a+\left(b+c\right)=\left(a+b\right)+c$, that is, addition
is associative. $\qed$
:::
Multiplication distributes over addition
We now prove that for all $a,b,c\in\mathbb{N}$ we have
$a\left(b+c\right)=ab+ac$ and $\left(a+b\right)c=ac+bc$.
::: theorem Theorem 11. Multiplication distributes over addition
For all a,b,c\in\mathbb{N} we have that
-   $a\left(b+c\right)=ab+ac$

-   $\left(b+c\right)a=ba+ca=ab+ac$

Proof:
As we have shown that multiplication is commutative, we can prove both parts nearly simultaneously. We first show that
for all $a,b,c\in\mathbb{N}$ we have that $a\left(b+c\right)=ab+ac$.
Let $a,b\in\mathbb{N}$ be arbitrary; we argue by induction on the
proposition $P\left(n\right)$ given by
$$\begin{equation*} a\left(b+n\right)=ab+an \end{equation*}$$
For the base case $n=0$ we have that
$$\begin{align*} a\left(b+0\right)&=a\left(b\right),\ \text{By definition of addition}\\ &=ab\\ &=ab+0,\ \text{By definition of addition}\\ &=ab+a*0,\ \text{By definition of multiplication} \end{align*}$$
Hence $P\left(0\right)$ is true.
Now suppose that $P\left(n\right)$ is true, that is to say
$$\begin{equation*} a\left(b+n\right)=ab+an \end{equation*}$$
We show that $P\left(S\left(n\right)\right)$ is true, that is
$$\begin{equation*} a\left(b+S\left(n\right)\right)=ab+aS\left(n\right) \end{equation*}$$
Indeed, we have that
$$\begin{align*} a\left(b+S\left(n\right)\right)&=a\left(S\left(b+n\right)\right),\ \text{By definition of addition}\\ &=a\left(b+n\right)+a,\ \text{By definition of multiplication}\\ &=\left(ab+an\right)+a,\ \text{By assumption}\\ &=ab+\left(an+a\right),\ \text{Addition is associative}\\ &=ab+aS\left(n\right),\ \text{By definition of multiplication} \end{align*}$$
Hence $P\left(S\left(n\right)\right)$ is true.
It hence follows by the principle of mathematical induction that
for all $a,b,c\in\mathbb{N}$ we have that $a\left(b+c\right)=ab+ac$.
Now that we have shown $a\left(b+c\right)=ab+ac$, to see that
$\left(b+c\right)a=ba+ca=ab+ac$ we simply observe that
$$\begin{align*} \left(b+c\right)a&=a\left(b+c\right),\ \text{Multiplication is commutative}\\ &=ab+ac,\ \text{By part 1 of the theorem}\\ &=ba+ca,\ \text{Multiplication is commutative} \end{align*}$$
As required. $\qed$ :::
Associativity of multiplication
We now prove that for all $a,b,c\in\mathbb{N}$ we have
$a\left(bc\right)=\left(ab\right)c$.
::: theorem
Theorem 12. For all a,b,c\in\mathbb{N} we have that
$a\left(bc\right)=\left(ab\right)c$
Proof:
We again show this by induction. Let x,y\in\mathbb{N} be arbitrary,
and let P\left(n\right) be the proposition given by
$$\begin{equation*} \left(xy\right)n=x\left(yn\right) \end{equation*}$$
For the base case we have $n=0$ and so
$$\begin{align*} \left(xy\right)*0&=0,\ \text{By definition of multiplication}\\ &=x*0,\ \text{By definition of multiplication}\\ &=x\left(y*0\right),\ \text{By definition of multiplication} \end{align*}$$
Hence $P\left(0\right)$ is true.
Now, suppose that $P\left(n\right)$ is true, that is
$$\begin{equation*} \left(xy\right)n=x\left(yn\right) \end{equation*}$$
We show that $P\left(S\left(n\right)\right)$ is also true, that is
$$\begin{equation*} \left(xy\right)S\left(n\right)=x\left(yS\left(n\right)\right) \end{equation*}$$
Now, we have that
$$\begin{align*} \left(xy\right)S\left(n\right)&=\left(xy\right)n+xy,\ \text{By definition of multiplication}\\ &=x\left(yn\right)+xy,\ \text{By assumption}\\ &=x\left(yn+y\right),\ \text{Multiplication is distributive over addition}\\ &=x\left(yS\left(n\right)\right),\ \text{By definition of multiplication} \end{align*}$$
Hence $P\left(S\left(n\right)\right)$ is true.
Since $x$ and $y$ were arbitrary, it follows by the principle of mathematical induction that for
all $a,b,c\in\mathbb{N}$ we have that
$a\left(bc\right)=\left(ab\right)c$. $\qed$
:::
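The laws proved so far can be spot-checked on small values. The following is a minimal sketch, assuming we model $\mathbb{N}$ by Python's non-negative integers and use `n - 1` only as a stand-in for "the number whose successor is $n$"; the names `S`, `add`, `mul` and `laws_hold` are illustrative and not part of the text. A finite check like this is no substitute for the induction proofs above.

```python
def S(n: int) -> int:
    """Successor map on the naturals (modelled by non-negative ints)."""
    return n + 1


def add(m: int, n: int) -> int:
    """Recursive addition: m + 0 = m and m + S(k) = S(m + k)."""
    return m if n == 0 else S(add(m, n - 1))


def mul(m: int, n: int) -> int:
    """Recursive multiplication: m * 0 = 0 and m * S(k) = m * k + m."""
    return 0 if n == 0 else add(mul(m, n - 1), m)


def laws_hold(bound: int) -> bool:
    """Check commutativity, associativity and distributivity below `bound`."""
    r = range(bound)
    return all(
        mul(a, b) == mul(b, a)
        and add(a, add(b, c)) == add(add(a, b), c)
        and mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
        and mul(a, mul(b, c)) == mul(mul(a, b), c)
        for a in r for b in r for c in r
    )


print(laws_hold(5))  # True
```

The bound is kept small because the recursive definitions are deliberately naive and mirror the definitions rather than aim for speed.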
The Zero and Identity laws
These two laws allow us to note that adding zero to any natural number
n gives back n and multiplying n by 1 gives n.
::: theorem Theorem 13. The zero and Identity laws
Let n\in\mathbb{N}. We have that
-   $n+0=n=0+n$

-   $1*n=n=n*1$

Proof:
By commutativity, it is enough to prove only

-   $n+0=n$

-   $n*1=n$

We take each in turn.

-   $n+0=n$: This is true by the definition of addition.

-   $n*1=n$: We have by the definition of multiplication that
    $$\begin{equation*} n*1=n*S\left(0\right)=n*0+n=0+n=n \end{equation*}$$
    where the last equality comes from the zero law and the fact that addition is commutative.

The result follows. $\qed$ :::
The cancellation laws
These laws allow us to deduce that if a+b=a+c then we must have b=c,
and if a\neq 0 that ab=ac gives b=c
::: theorem Theorem 14. The cancellation laws
Let a,b,c\in\mathbb{N}. We have that
-   If $a+b=a+c$ then we have $b=c$.

-   For $a\neq 0$, if $ab=ac$ then we have that $b=c$.

Proof:

-   If $a+b=a+c$ then we have $b=c$: We argue by induction. Let $P\left(n\right)$ be the proposition that for all $b,c\in\mathbb{N}$
    $$\begin{equation*} n+b=n+c \Rightarrow b=c \end{equation*}$$
    For the base case $P\left(0\right)$ this holds trivially, as $0+b=b$ and $0+c=c$. Now suppose the proposition $P\left(n\right)$ holds, that is
    $$\begin{equation*} n+b=n+c \Rightarrow b=c \end{equation*}$$
    We show that $P\left(S\left(n\right)\right)$ holds, that is
    $$\begin{equation*} S\left(n\right)+b=S\left(n\right)+c \Rightarrow b=c \end{equation*}$$
    Now, we have that
    $$\begin{align*} S\left(n\right)+b&=S\left(n\right)+c\\ S\left(n+0\right)+b&=S\left(n+0\right)+c,\ \text{By definition of addition}\\ \left(n+S\left(0\right)\right)+b&=\left(n+S\left(0\right)\right)+c,\ \text{By definition of addition}\\ n+\left(S\left(0\right)+b\right)&=n+\left(S\left(0\right)+c\right),\ \text{By associativity}\\ S\left(0\right)+b&=S\left(0\right)+c,\ \text{By hypothesis, as } P\left(n\right) \text{ holds for arbitrary } b,c\\ b+S\left(0\right)&=c+S\left(0\right),\ \text{By commutativity}\\ S\left(b+0\right)&=S\left(c+0\right),\ \text{By definition of addition}\\ S\left(b\right)&=S\left(c\right) \end{align*}$$
    Hence we have $b=c$ by Proposition [37](#prop:EqualSuccOp){reference-type="ref" reference="prop:EqualSuccOp"}. So $P\left(S\left(n\right)\right)$ is true.

    Hence by mathematical induction we have that if $a+b=a+c$ we must have that $b=c$.
-   For $a\neq 0$, if $ab=ac$ then we have that $b=c$: Fix $a\neq 0$, so that $a=S\left(k\right)$ for some $k\in\mathbb{N}$. We first note that $a*S\left(j\right)\neq 0$ for every $j\in\mathbb{N}$, since
    $$\begin{equation*} a*S\left(j\right)=a*j+a=a*j+S\left(k\right)=S\left(a*j+k\right) \end{equation*}$$
    and a successor is never $0$. We now argue by induction on $b$. Let $P\left(n\right)$ be the proposition that for all $c\in\mathbb{N}$
    $$\begin{equation*} a*n=a*c\Rightarrow n=c \end{equation*}$$
    For the base case $P\left(0\right)$, suppose that $a*0=a*c$. Then $a*c=0$, so by the note above $c$ is not a successor and hence $c=0$. Now suppose that $P\left(n\right)$ holds and that $a*S\left(n\right)=a*c$. By the note above $a*c\neq 0$, so $c\neq 0$ and $c=S\left(j\right)$ for some $j\in\mathbb{N}$. Then
    $$\begin{align*} a*n+a&=a*j+a,\ \text{By definition of multiplication}\\ a+a*n&=a+a*j,\ \text{By commutativity}\\ a*n&=a*j,\ \text{By the cancellation law for addition}\\ n&=j,\ \text{By the induction hypothesis}\\ S\left(n\right)&=S\left(j\right)=c \end{align*}$$
    Hence $P\left(S\left(n\right)\right)$ is true.

    Hence by mathematical induction we have that for $a\neq 0$ if $ab=ac$ we must have that $b=c$.

As required. $\qed$
:::
Summation and product notation
Now that we have a well-defined notion of addition and multiplication we
can define a shorthand that is useful for avoiding writing out long
chains of additions (or multiplications). We will
require the following mapping. Let $s\in\mathbb{N}^{n+1}$ be an ordered
$(n+1)$-tuple of natural numbers, where
$s=\left(s_0,s_1,s_2,\dots,s_n\right)$, and define
$\mathbb{N}_n=\left\{0,1,2,\dots,n\right\}$. Let
$f:\mathbb{N}_n\rightarrow\mathbb{N}$ be the mapping defined by
$$\begin{align*} f:\mathbb{N}_n&\rightarrow\mathbb{N}\\ i&\mapsto f\left(i\right) =s_i \end{align*}$$
That is to say, $f$ simply returns the value $s_i$, the element of the
ordered tuple $s$ at position $i$.
::: definition Definition 72. Summation notation
Let $s\in\mathbb{N}^{n+1}$ be an ordered $(n+1)$-tuple of natural numbers,
where $s=\left(s_0,s_1,s_2,\dots,s_n\right)$, and define
$\mathbb{N}_n=\left\{0,1,2,\dots,n\right\}$. Let
$f:\mathbb{N}_n\rightarrow\mathbb{N}$ be the mapping defined by
$$\begin{align*} f:\mathbb{N}_n&\rightarrow\mathbb{N}\\ i&\mapsto f\left(i\right) =s_i \end{align*}$$
We define the summation notation by
$$\begin{equation*} \sum_{i=0}^n f\left(i\right)=f\left(0\right)+f\left(1\right)+f\left(2\right)+\dots+f\left(n\right) \end{equation*}$$
This can also be written as
$$\begin{equation*} \sum_{i=0}^n s_i=s_0+s_1+s_2+\dots+s_n \end{equation*}$$
We call $i$ the index of the summation, $0$ the starting (lower) index
of the summation and $n$ the ending (upper) index of the summation. If
there are no terms to add, we define the summation to be $0$ and call
such a summation an empty sum.
We can also define the summation over a subset of $\mathbb{N}_n$, which
allows the summation to start at an index other than $i=0$.
Let $T\subseteq\mathbb{N}_n$. We define the summation over the set $T$
by
$$\begin{equation*} \sum_{i\in T} s_i \end{equation*}$$
If we have a mapping $g:\mathbb{N}\rightarrow\mathbb{N}$ then we can
define a summation over $g$ by
$$\begin{equation*} \sum_{i\in T} g\left(s_i\right) \end{equation*}$$
Finally, we can define a summation over a predicate $P\left(i\right)$
for $i\in T$, giving
$$\begin{equation*} \sum_{P\left(i\right)} g\left(s_i\right) \end{equation*}$$
which means to take the sum of the $g\left(s_i\right)$ over those
$i\in T$ that satisfy the predicate $P$. If the predicate is not
satisfied by any $i$ then the summation is also said to be an empty
summation and is given the value $0$.
In light of the summation over a predicate, if $a>n$, where $a$ is the
lower index of summation and $n$ the upper index of summation, then no
index $i$ satisfies $a\leq i\leq n$ and the sum is by definition equal
to $0$. That is to say
$$\begin{equation*} \sum_{i=a}^n s_i = 0,\ \text{if } a>n \end{equation*}$$ :::
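The forms of summation in the definition can be sketched in code. This is an illustrative sketch only: the function names `summation` and `summation_over` are assumptions made here, tuples are indexed from $0$ as in the definition, and the set version sums $g\left(i\right)$ over the indices $i\in T$ that satisfy a predicate.

```python
from typing import Callable, Iterable, Sequence


def summation(s: Sequence[int], lower: int, upper: int) -> int:
    """Return s[lower] + ... + s[upper]; the empty sum (lower > upper) is 0."""
    total = 0
    for i in range(lower, upper + 1):
        total += s[i]
    return total


def summation_over(
    T: Iterable[int],
    g: Callable[[int], int],
    P: Callable[[int], bool] = lambda i: True,
) -> int:
    """Sum g(i) over the i in T satisfying the predicate P (0 if none do)."""
    total = 0
    for i in T:
        if P(i):
            total += g(i)
    return total


s = (2, 3, 4, 8)
print(summation(s, 0, 3))                           # 17
print(summation(s, 3, 1))                           # 0, an empty sum
print(summation_over({2, 6, 11}, lambda i: i * i))  # 161
```

Starting the running total at $0$ is what makes the empty sum come out as $0$ without a special case.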
::: example
Example 56. Let $s=\left(2,3,4,8\right)\in\mathbb{N}^4$. Then we
have that
$$\begin{equation*} \sum_{i=0}^3 s_i = 2+3+4+8 = 17 \end{equation*}$$ :::
::: example
Example 57. Let $g\left(n\right)=n$ and let $k=4$. Then we have
that
$$\begin{equation*} \sum_{i=0}^{k-1} g\left(i\right) = \sum_{i=0}^3 i = 0+1+2+3 = 6 \end{equation*}$$ :::
::: example
Example 58. Let $s_1\in\mathbb{N}$. Then we have
$$\begin{equation*} \sum_{i=1}^1 s_i = s_1 \end{equation*}$$ :::
::: example
Example 59. Let $g\left(n\right) = n*n$ and let
$T=\left\{2,6,11\right\}\subseteq\mathbb{N}_{11}$. Then
$$\begin{equation*} \sum_{i\in T} g\left(i\right) = g\left(2\right)+g\left(6\right)+g\left(11\right)=2*2+6*6+11*11=4+36+121=161 \end{equation*}$$ :::
::: example
Example 60. Let $g\left(n\right) = n$ and let $P\left(n\right)$ be the
predicate such that
$$\begin{equation*} P\left(n\right)=\begin{cases} 1,\ \text{If } n=2,4,6\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$
Let $T=\left\{2,6,11\right\}\subseteq\mathbb{N}_{11}$. The $i\in T$
that satisfy $P\left(i\right)$ are $2$ and $6$, so the sum is given
by
$$\begin{equation*} \sum_{P\left(i\right)} i = 2+6=8 \end{equation*}$$ :::
::: example
Example 61. Let $f\left(n\right)= n+5$. Consider the sum
$$\begin{equation*} \sum_{i=3}^6 \left(i+5\right) = \left(3+5\right)+\left(4+5\right)+\left(5+5\right)+\left(6+5\right)=8+9+10+11=38 \end{equation*}$$
We can re-express this sum as
$$\begin{equation*} \sum_{i=0}^3 \left(\left(i+3\right)+5\right) = \left(\left(0+3\right)+5\right)+\left(\left(1+3\right)+5\right)+\left(\left(2+3\right)+5\right)+\left(\left(3+3\right)+5\right)=38 \end{equation*}$$
We have re-indexed the sum into an equivalent form. :::
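The re-indexing in the last example can be checked numerically. The sketch below assumes the mapping $f\left(n\right)=n+5$ from that example; the variable names `original` and `shifted` are illustrative only.

```python
def f(n: int) -> int:
    """The mapping f(n) = n + 5 from the example."""
    return n + 5


# Sum over i = 3, 4, 5, 6.
original = sum(f(i) for i in range(3, 7))

# Shift the index down by 3: sum over i = 0, 1, 2, 3 of f(i + 3).
shifted = sum(f(i + 3) for i in range(0, 4))

print(original, shifted)  # 38 38
```

Lowering both limits by $3$ and adding $3$ to the index inside the summand leaves every term, and hence the total, unchanged.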
We can make some observations about summation notation.
::: {#prop:summation_properties_naturals .proposition} Proposition 38. Properties of summation notation
Let $n,m\in\mathbb{N}$ such that $m<n$. Let $s,t\in\mathbb{N}^{n+1}$ and
let $c\in\mathbb{N}$. In addition define $A=\mathbb{N}_m$ and
$B=\mathbb{N}_n\setminus A=\left\{m+1,m+2,\dots,n\right\}$ so that
$A\cup B =\mathbb{N}_n$. Let $a\in \mathbb{N}$ with $a\leq m$ be a lower index of
summation. We have that the following properties hold.

-   $\displaystyle \sum_{i=0}^n s_i = \sum_{i\in A} s_i + \sum_{i\in B} s_i = \sum_{i=0}^m s_i + \sum_{i=m+1}^n s_i$

-   $\displaystyle \sum_{i=a}^n s_i = \sum_{i=a}^m s_i + \sum_{i=m+1}^n s_i$

-   $\displaystyle\sum_{i=1}^n c = c*n$

-   $\displaystyle\sum_{i=1}^n c*s_i = c*\sum_{i=1}^n s_i$

-   $\displaystyle\sum_{i=1}^n \left(s_i+t_i\right) = \sum_{i=1}^n s_i + \sum_{i=1}^n t_i$

Proof:

-   $\displaystyle \sum_{i=0}^n s_i = \sum_{i\in A} s_i + \sum_{i\in B} s_i = \sum_{i=0}^m s_i + \sum_{i=m+1}^n s_i$: By the definitions of $A$ and $B$ the sums over $A$ and $B$ are exactly the sums from $0$ to $m$ and from $m+1$ to $n$, so only the equality of the first and last expressions needs proof. Fix $m$; we argue by induction on $n$, starting at $n=m+1$. Let $P\left(n\right)$ be the proposition given by
    $$\begin{equation*} \sum_{i=0}^n s_i = \sum_{i=0}^m s_i + \sum_{i=m+1}^n s_i \end{equation*}$$
    For the base case $P\left(m+1\right)$ we have $B=\left\{m+1\right\}$ and, by definition,
    $$\begin{equation*} \sum_{i=0}^{m+1} s_i = \sum_{i=0}^m s_i + s_{m+1} = \sum_{i=0}^m s_i + \sum_{i=m+1}^{m+1} s_i \end{equation*}$$
    So the base case holds. Now suppose that $P\left(n\right)$ holds; we need to show that
    $$\begin{equation*} \sum_{i=0}^{n+1} s_i = \sum_{i=0}^m s_i + \sum_{i=m+1}^{n+1} s_i \end{equation*}$$
    also holds. By definition we have that
    $$\begin{equation*} \sum_{i=0}^{n+1} s_i =s_0+s_1+s_2+\dots+s_n+s_{n+1}=\sum_{i=0}^n s_i + s_{n+1} \end{equation*}$$
    Now we have that
    $$\begin{align*} \sum_{i=0}^{n+1} s_i &= \sum_{i=0}^n s_i + s_{n+1}\\ &= \sum_{i=0}^m s_i + \sum_{i=m+1}^n s_i + s_{n+1},\ \text{By the induction hypothesis}\\ &= \sum_{i=0}^m s_i + \sum_{i=m+1}^{n+1} s_i,\ \text{By definition} \end{align*}$$
    Hence $P\left(n+1\right)$ holds and the result follows by induction.

-   $\displaystyle \sum_{i=a}^n s_i = \sum_{i=a}^m s_i + \sum_{i=m+1}^n s_i$: This follows by the same argument as in part 1, with every sum starting at $a$ instead of $0$.

-   $\displaystyle\sum_{i=1}^n c = c*n$: We argue by induction on $n$. Let $P\left(n\right)$ be the proposition given by
    $$\begin{equation*} \sum_{i=1}^n c = c*n \end{equation*}$$
    For the base case $P\left(1\right)$ we have
    $$\begin{equation*} \sum_{i=1}^1 c = c = c*1 \end{equation*}$$
    Now suppose that $P\left(n\right)$ holds; we need to show that $P\left(n+1\right)$ holds, that is
    $$\begin{equation*} \sum_{i=1}^{n+1} c = c*\left(n+1\right) \end{equation*}$$
    We have that
    $$\begin{equation*} \sum_{i=1}^{n+1} c = \sum_{i=1}^n c + c = c*n+c=c*S\left(n\right)=c*\left(n+1\right) \end{equation*}$$
    The result follows by induction.

-   $\displaystyle\sum_{i=1}^n c*s_i = c*\sum_{i=1}^n s_i$: We have by definition of summation that
    $$\begin{equation*} \sum_{i=1}^n c*s_i=c*s_1+c*s_2+\dots+c*s_n \end{equation*}$$
    Now as multiplication distributes over addition we have
    $$\begin{equation*} \sum_{i=1}^n c*s_i=c*s_1+c*s_2+\dots+c*s_n = c*\left(s_1+s_2+\dots+s_n\right)=c*\sum_{i=1}^n s_i \end{equation*}$$

-   $\displaystyle\sum_{i=1}^n \left(s_i+t_i\right) = \sum_{i=1}^n s_i + \sum_{i=1}^n t_i$: We argue by induction. Let $P\left(n\right)$ be the proposition given by
    $$\begin{equation*} \sum_{i=1}^n \left(s_i+t_i\right) = \sum_{i=1}^n s_i + \sum_{i=1}^n t_i \end{equation*}$$
    For the base case $P\left(1\right)$ we have that
    $$\begin{equation*} \sum_{i=1}^1 \left(s_i+t_i\right) = s_1+t_1 \end{equation*}$$
    Likewise we have
    $$\begin{equation*} \sum_{i=1}^1 s_i + \sum_{i=1}^1 t_i = s_1+t_1 \end{equation*}$$
    So the base case holds. Now suppose $P\left(n\right)$ holds; we need to show that $P\left(n+1\right)$ holds. By definition and the induction hypothesis we have
    $$\begin{equation*} \sum_{i=1}^{n+1} \left(s_i+t_i\right) = \sum_{i=1}^n \left(s_i+ t_i\right) +s_{n+1}+t_{n+1} = \sum_{i=1}^n s_i + \sum_{i=1}^n t_i +s_{n+1} +t_{n+1} \end{equation*}$$
    Now addition is commutative so we get
    $$\begin{equation*} \sum_{i=1}^{n+1} \left(s_i+ t_i\right) = \sum_{i=1}^n s_i + s_{n+1} + \sum_{i=1}^n t_i + t_{n+1} = \sum_{i=1}^{n+1} s_i + \sum_{i=1}^{n+1} t_i \end{equation*}$$
    The result follows by induction.

$\qed$ :::
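The properties above can be spot-checked on concrete data. The sketch below is only a numerical sanity check under assumed sample values (the lists `s`, `t` and the numbers `n`, `m`, `c` are made up for illustration), and the helper `sigma` is a hypothetical name for the sum from a lower to an upper index.

```python
def sigma(seq, lower, upper):
    """Return seq[lower] + ... + seq[upper]; empty sums are 0."""
    return sum(seq[i] for i in range(lower, upper + 1))


# Illustrative sample data with indices 0..n and m < n.
s = [6, 4, 1, 7, 3, 9]
t = [5, 2, 8, 5, 6, 1]
n, m, c = 5, 2, 3

# Splitting a sum at the index m.
split_ok = sigma(s, 0, n) == sigma(s, 0, m) + sigma(s, m + 1, n)
# Summing a constant c over i = 1..n gives c * n.
const_ok = sum(c for _ in range(1, n + 1)) == c * n
# A constant factor can be pulled out of the sum.
scale_ok = sigma([c * x for x in s], 1, n) == c * sigma(s, 1, n)
# The sum of termwise sums is the sum of the sums.
add_ok = (
    sigma([x + y for x, y in zip(s, t)], 1, n)
    == sigma(s, 1, n) + sigma(t, 1, n)
)

print(split_ok, const_ok, scale_ok, add_ok)  # True True True True
```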
The summation notation allows us to deduce an additional property of multiplication.
::: {#prop:NaturalsHaveNoZeroDivisors .proposition} Proposition 39. Product of two naturals being zero implies one of the numbers is zero
Let a,b\in\mathbb{N}. If ab=0 then at least one of a or b is
zero.
Proof:
Let $a,b\in\mathbb{N}$ with $ab=0$, and suppose for a contradiction that
$a\neq 0$ and $b\neq 0$. By the properties of summation notation we
have that
$$\begin{equation*} ab=\sum_{i=1}^b a= \underbrace{a+a+a+\dots+a}_{b\text{ times}} \end{equation*}$$
As $b\neq 0$ this sum has at least one term, and as $a\neq 0$ every term is non-zero, so
$$\begin{equation*} \sum_{i=1}^b a= \underbrace{a+a+a+\dots+a}_{b\text{ times}} > 0 \end{equation*}$$
This contradicts the hypothesis that $ab=0$. Hence at least one of $a$
or $b$ is zero. If both $a$ and $b$ are zero the result is trivial.
The result has been shown. $\qed$
:::
A similar definition can be made for multiplication, called product notation.
::: definition Definition 73. Product notation
Let $s\in\mathbb{N}^{n+1}$ be an ordered $(n+1)$-tuple of natural numbers,
where $s=\left(s_0,s_1,s_2,\dots,s_n\right)$, and define
$\mathbb{N}_n=\left\{0,1,2,\dots,n\right\}$. Let
$f:\mathbb{N}_n\rightarrow\mathbb{N}$ be the mapping defined by
$$\begin{align*} f:\mathbb{N}_n&\rightarrow\mathbb{N}\\ i&\mapsto f\left(i\right) =s_i \end{align*}$$
We define the product notation by
$$\begin{equation*} \prod_{i=0}^n f\left(i\right)=f\left(0\right)*f\left(1\right)*f\left(2\right)*\dots*f\left(n\right) \end{equation*}$$
This can also be written as
$$\begin{equation*} \prod_{i=0}^n s_i=s_0*s_1*s_2*\dots*s_n \end{equation*}$$
We call $i$ the index of the product, $0$ the lower
index of the product and $n$
the upper index of the product. If there are no factors to multiply, we
define the product to be $1$ and call such a product an empty product.
We can also define the product over a subset of $\mathbb{N}_n$, which
allows the product to start at an index other than $i=0$.
Let $T\subseteq\mathbb{N}_n$. We define the product over the set $T$
by
$$\begin{equation*} \prod_{i\in T} s_i \end{equation*}$$
If we have a mapping $g:\mathbb{N}\rightarrow\mathbb{N}$ then we can
define a product over $g$ by
$$\begin{equation*} \prod_{i\in T} g\left(s_i\right) \end{equation*}$$
Finally, we can define a product over a predicate $P\left(i\right)$ for
$i\in T$, giving
$$\begin{equation*} \prod_{P\left(i\right)} g\left(s_i\right) \end{equation*}$$
which means to take the product of the
$g\left(s_i\right)$ over those $i\in T$ that satisfy the predicate $P$. If the
predicate is not satisfied by any $i$ then the product is also said to
be an empty product and is given the value $1$. In light of the
product over a predicate, if $a>n$, where $a$ is the lower
index of the product and $n$ the upper index of the product, then the product
is by definition equal to $1$. That is to say
$$\begin{equation*} \prod_{i=a}^n s_i = 1,\ \text{if } a>n \end{equation*}$$ :::
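As with summation, the product notation can be sketched in a few lines. The function name `product` is an assumption made for illustration; Python's standard library offers `math.prod` (Python 3.8 and later), which also returns $1$ for an empty input.

```python
import math
from typing import Sequence


def product(s: Sequence[int], lower: int, upper: int) -> int:
    """Return s[lower] * ... * s[upper]; the empty product (lower > upper) is 1."""
    result = 1
    for i in range(lower, upper + 1):
        result *= s[i]
    return result


s = (2, 3, 4, 8)
print(product(s, 0, 3))  # 192
print(product(s, 2, 1))  # 1, an empty product
print(math.prod([]))     # 1
```

Starting the running result at $1$ plays the same role for products that starting at $0$ plays for sums.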
::: example
Example 62. Let $s=\left(2,3,4,8\right)\in\mathbb{N}^4$. Then we
have that
$$\begin{equation*} \prod_{i=0}^3 s_i = 2*3*4*8 = 192 \end{equation*}$$ :::
::: example
Example 63. Let $g\left(n\right)=n$ and let $k=4$. Then we have
that
$$\begin{equation*} \prod_{i=0}^{k-1} g\left(i\right) = \prod_{i=0}^3 i = 0*1*2*3 = 0 \end{equation*}$$ :::
::: example
Example 64. Let $s_1\in\mathbb{N}$. Then we have
$$\begin{equation*} \prod_{i=1}^1 s_i = s_1 \end{equation*}$$ :::
::: example
Example 65. Let $g\left(n\right) = n*n$ and let
$T=\left\{2,6,11\right\}\subseteq\mathbb{N}_{11}$. Then
$$\begin{equation*} \prod_{i\in T} g\left(i\right) = g\left(2\right)*g\left(6\right)*g\left(11\right)=\left(2*2\right)*\left(6*6\right)*\left(11*11\right)=4*36*121=17424 \end{equation*}$$ :::
::: example
Example 66. Let $g\left(n\right) = n$ and let $P\left(n\right)$ be the
predicate such that
$$\begin{equation*} P\left(n\right)=\begin{cases} 1,\ \text{If } n=2,4,6\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$
Let $T=\left\{2,6,11\right\}\subseteq\mathbb{N}_{11}$. The $i\in T$
that satisfy $P\left(i\right)$ are $2$ and $6$, so the product is given
by
$$\begin{equation*} \prod_{P\left(i\right)} i = 2*6=12 \end{equation*}$$ :::
There are some immediate properties of product notation.
::: proposition Proposition 40. Properties of product notation
Let $n,m\in\mathbb{N}$ such that $m<n$. Let $s,t\in\mathbb{N}^{n+1}$ and
let $c\in\mathbb{N}$. In addition define $A=\mathbb{N}_m$ and
$B=\mathbb{N}_n\setminus A=\left\{m+1,m+2,\dots,n\right\}$ so that
$A\cup B =\mathbb{N}_n$. Let $a\in \mathbb{N}$ with $a\leq m$ be a lower index of
the product. We have that the following properties hold.

-   $\displaystyle \prod_{i=0}^n s_i = \prod_{i\in A} s_i * \prod_{i\in B} s_i = \prod_{i=0}^m s_i * \prod_{i=m+1}^n s_i$

-   $\displaystyle \prod_{i=a}^n s_i = \prod_{i=a}^m s_i * \prod_{i=m+1}^n s_i$

-   $\displaystyle\prod_{i=1}^n s_i*t_i = \prod_{i=1}^n s_i * \prod_{i=1}^n t_i$

Proof:

-   $\displaystyle \prod_{i=0}^n s_i = \prod_{i\in A} s_i *\prod_{i\in B} s_i = \prod_{i=0}^m s_i * \prod_{i=m+1}^n s_i$: By the definitions of $A$ and $B$ the products over $A$ and $B$ are exactly the products from $0$ to $m$ and from $m+1$ to $n$, so only the equality of the first and last expressions needs proof. Fix $m$; we argue by induction on $n$, starting at $n=m+1$. Let $P\left(n\right)$ be the proposition given by
    $$\begin{equation*} \prod_{i=0}^n s_i = \prod_{i=0}^m s_i * \prod_{i=m+1}^n s_i \end{equation*}$$
    For the base case $P\left(m+1\right)$ we have $B=\left\{m+1\right\}$ and, by definition,
    $$\begin{equation*} \prod_{i=0}^{m+1} s_i = \prod_{i=0}^m s_i * s_{m+1} = \prod_{i=0}^m s_i * \prod_{i=m+1}^{m+1} s_i \end{equation*}$$
    So the base case holds. Now suppose that $P\left(n\right)$ holds; we need to show that
    $$\begin{equation*} \prod_{i=0}^{n+1} s_i = \prod_{i=0}^m s_i * \prod_{i=m+1}^{n+1} s_i \end{equation*}$$
    also holds. By definition we have that
    $$\begin{equation*} \prod_{i=0}^{n+1} s_i =s_0*s_1*s_2*\dots*s_n*s_{n+1}=\prod_{i=0}^n s_i * s_{n+1} \end{equation*}$$
    Now we have that
    $$\begin{align*} \prod_{i=0}^{n+1} s_i &= \prod_{i=0}^n s_i * s_{n+1}\\ &= \prod_{i=0}^m s_i * \prod_{i=m+1}^n s_i * s_{n+1},\ \text{By the induction hypothesis}\\ &= \prod_{i=0}^m s_i * \prod_{i=m+1}^{n+1} s_i,\ \text{By definition} \end{align*}$$
    Hence $P\left(n+1\right)$ holds and the result follows by induction.

-   $\displaystyle \prod_{i=a}^n s_i = \prod_{i=a}^m s_i * \prod_{i=m+1}^n s_i$: A similar argument as in part 1, with every product starting at $a$ instead of $0$, shows this.

-   $\displaystyle\prod_{i=1}^n s_i*t_i = \prod_{i=1}^n s_i * \prod_{i=1}^n t_i$: We argue by induction. Let $P\left(n\right)$ denote the proposition
    $$\begin{equation*} \prod_{i=1}^n s_i*t_i = \prod_{i=1}^n s_i * \prod_{i=1}^n t_i \end{equation*}$$
    In the base case $P\left(1\right)$ we have
    $$\begin{equation*} \prod_{i=1}^1 s_i*t_i=s_1*t_1 \end{equation*}$$
    Likewise
    $$\begin{equation*} \prod_{i=1}^1 s_i * \prod_{i=1}^1 t_i=s_1*t_1 \end{equation*}$$
    which shows the base case. Now suppose $P\left(n\right)$ is true; we show that $P\left(n+1\right)$ is true. We have that
    $$\begin{align*} \prod_{i=1}^{n+1} s_i*t_i&=\left(\prod_{i=1}^n s_i*t_i\right) * s_{n+1}*t_{n+1}\\ &=\prod_{i=1}^n s_i*\prod_{i=1}^n t_i * s_{n+1}*t_{n+1},\ \text{By the induction hypothesis}\\ &=\left(\prod_{i=1}^n s_i*s_{n+1}\right)*\left(\prod_{i=1}^n t_i* t_{n+1}\right),\ \text{By commutativity and associativity}\\ &=\prod_{i=1}^{n+1} s_i*\prod_{i=1}^{n+1} t_i \end{align*}$$
    The result follows by induction.

$\qed$ :::
Exponentiation
With the product notation defined we can define another operation called
exponentiation.
::: definition Definition 74. Exponentiation of Natural numbers
Let $\left(m,n\right)\in\mathbb{N}\times\mathbb{N}$ and let
$\wedge:\mathbb{N}\times\mathbb{N}\rightarrow\mathbb{N}$. We define the
exponentiation of $m$ by $n$ to be the product of $n$ copies of $m$
$$\begin{align*} \wedge:\mathbb{N}\times\mathbb{N}&\rightarrow\mathbb{N}\\ \left(m,n\right)&\mapsto \wedge\left(m,n\right)=\begin{cases} 1,\ \text{If } n=0\text{ and } m=0\\ 1,\ \text{If } n=0\text{ and } m\neq 0\\ \displaystyle \prod_{i=1}^n m = 1 * \prod_{i=1}^n m,\ \text{If } n\neq 0 \end{cases} \end{align*}$$
We will write $\wedge\left(m,n\right)$ as $m^n$. We say
that $m$ is the base and $n$ is the exponent. We sometimes say that $m$
has been raised to the power of $n$. In the case that $n=0$ and $m=0$ we
have a vacuous product and so an empty product, which by definition has a
value of $1$.
:::
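The definition translates directly into a loop that multiplies $1$ by $n$ copies of $m$. This is a minimal sketch; the function name `power` is an assumption, and Python's built-in `**` operator follows the same convention that $0^0=1$ for integers.

```python
def power(m: int, n: int) -> int:
    """Return m^n as 1 multiplied by n copies of m (so m^0 = 1, including 0^0)."""
    result = 1  # the empty product
    for _ in range(n):
        result *= m
    return result


print(power(2, 2))  # 4
print(power(4, 1))  # 4
print(power(5, 0))  # 1
print(power(0, 0))  # 1
print(power(2, 7))  # 128
```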
With the above definition, we make a quick remark. We know that an empty
product has a value of 1 and as multiplication by 1 doesn't change
the value we can write exponentiation as
$$\begin{equation*}
\prod_{i=1}^n m = 1 * \prod_{i=1}^n m
\end{equation*}$$ This makes it clear that exponentiation is
multiplication of 1 by n copies of m.
::: example
Example 67. Let $m=2$ and $n=2$. Then we have that
$$\begin{equation*} \wedge\left(2,2\right)=\prod_{i=1}^2 2 = 2*2 = 4 \end{equation*}$$
:::
::: example
Example 68. Let $m=4$ and $n=1$. Then we have that
$$\begin{equation*} \wedge\left(4,1\right)=4^1=4 \end{equation*}$$
:::
::: example
Example 69. Let $m=5$ and $n=0$. Then we have that
$$\begin{equation*} \wedge\left(5,0\right)=5^0=1 \end{equation*}$$
:::
::: example
Example 70. Let $m=2$ and $n=7$. Then we have that
$$\begin{equation*} \wedge\left(2,7\right)=2^7=128 \end{equation*}$$
:::
As we have defined a new operation we should check that the operation is meaningful.
::: theorem Theorem 15. Exponentiation is closed
For all n,m\in\mathbb{N} we have that
$$\begin{equation*} \wedge\left(n,m\right)\in\mathbb{N} \end{equation*}$$
Proof:
There are two cases to consider, $m=0$ and $m\neq 0$. When $m=0$ the
operation is defined such that
$$\begin{equation*} \wedge\left(n,0\right)=1 \end{equation*}$$
which is in $\mathbb{N}$. When $m\neq 0$,
$\wedge\left(n,m\right)$ is a product of $m$ copies of $n$, which is in $\mathbb{N}$ as multiplication in $\mathbb{N}$
is closed. $\qed$
:::
We should also verify that the other properties that we have verified for addition and multiplication either hold or do not. For example we can find examples that show that exponentiation is not commutative.
::: {#prop:ExponentiationOfNaturalsIsNotCommutative .proposition} Proposition 41. Exponentiation is non-commutative
There exist n,m\in\mathbb{N} such that
$$\begin{equation*} \wedge\left(n,m\right)\neq\wedge\left(m,n\right) \end{equation*}$$
Proof:
Let $n=3$ and $m=4$. Then we have that
$$\begin{align*} \wedge\left(3,4\right)&=3^4=81\\ \wedge\left(4,3\right)&=4^3=64 \end{align*}$$
from which it is clear that $\wedge\left(3,4\right)\neq\wedge\left(4,3\right)$, as $81\neq 64$. $\qed$
:::
::: {#prop:ExponentiationOfNaturalsIsNotAssociative .proposition} Proposition 42. Exponentiation is non-associative
There exist a,b,c\in\mathbb{N} such that
$$\begin{equation*} \left(a^{b}\right)^{c}\neq a^{\left(b^c\right)} \end{equation*}$$
Proof:
Let $a=2$, $b=3$ and $c=4$. Then we have that
$$\begin{align*} \left(2^{3}\right)^{4}&=8^4=4096\\ 2^{\left(3^4\right)}&=2^{81}=2417851639229258349412352 \end{align*}$$
Clearly $4096\neq 2417851639229258349412352$. $\qed$
:::
The non-associativity of exponentiation shows an important point: the
order in which we perform exponentiations can give drastically different
results, and so the order in which the exponentiation should be done will
depend on the context. We will bracket each case as required. There
is an interesting property for the case $\left(a^{b}\right)^{c}$ called
the power law of exponentiation.
::: {#prop:ExponentiationOfNaturalsPowerLaw .proposition} Proposition 43. Power law of exponentiation
Let a,b,c\in\mathbb{N}. We have that
$$\begin{equation*} \left(a^{b}\right)^{c}=a^{bc} \end{equation*}$$
Proof:
By definition of exponentiation we have that
$$\begin{equation*} \left(a^{b}\right)^{c}=\prod_{i=1}^c a^b = \prod_{i=1}^c \left(\prod_{j=1}^b a\right) \end{equation*}$$
That is, we are multiplying together $c$ copies of $\prod_{j=1}^b a$. The
product $\prod_{j=1}^b a$ is itself the product of $b$ copies of $a$.
We can therefore express the above by
$$\begin{align*} \left(a^{b}\right)^{c}&=\underbrace{\prod_{j=1}^b a*\prod_{j=1}^b a*\dots*\prod_{j=1}^b a}_{c\text{ times}}\\ &=\underbrace{\underbrace{\left(a*a*\dots*a\right)}_{b\text{ times}}*\underbrace{\left(a*a*\dots*a\right)}_{b\text{ times}}*\dots*\underbrace{\left(a*a*\dots*a\right)}_{b\text{ times}}}_{c\text{ times}} \end{align*}$$
There are therefore $b*c$ copies of $a$ in the product, as we
need $c$ copies of $\prod_{j=1}^b a$. Hence we have that
$$\begin{equation*} \left(a^{b}\right)^{c}=\underbrace{a*a*a*\dots*a}_{b*c\text{ times}}=\prod_{i=1}^{b*c} a = a^{bc} \end{equation*}$$
As required. $\qed$ :::
There are some additional properties that we can deduce. Consider $2^m$
for $m\in\mathbb{N}$. We have for $m=0,1,2,3$ and $4$ that
$2^0=1$, $2^1=2$, $2^2=4$, $2^3=8$ and $2^4=16$. Notice that multiplying
any $2^m$ by $2=2^1$ adds one to the power. In fact multiplying any $2^m$ by
$4=2^2$ adds two to the power. It looks like the powers add together, for
example $2^m*2^n=2^{m+n}$. We can show this is true for bases other than
$2$.
::: {#prop:ExponentsOfSameNaturalNumberBaseAdd .proposition} Proposition 44. Multiplying exponents of same base adds the powers
Let a,m,n\in\mathbb{N}. We have that
$$\begin{equation*} a^n*a^m=a^{n+m} \end{equation*}$$
Proof:
Let $a,n,m\in\mathbb{N}$. If $n=0$ and $m\geq 0$ then $a^n = 1$ and we
have that $a^n*a^m=1*a^m=a^m=a^{0+m}=a^{n+m}$. Likewise for the case
$m=0$ and $n\geq 0$. So suppose that $n>0$ and $m>0$. We have by
definition of exponentiation that
$$\begin{align*} a^n*a^m=\prod_{i=1}^n a * \prod_{i=1}^m a&=\underbrace{a*a*\dots*a}_{n\text{ times}}*\underbrace{a*a*\dots*a}_{m\text{ times}}\\ &=\underbrace{a*a*\dots*a}_{n+m\text{ times}} =a^{n+m} \end{align*}$$ as required. $\qed$ :::
We also have the following result that combines multiplying two numbers
and raising that result to a power. As an example consider
$\left(2*3\right)^2= 6^2=36$. Now consider $2^2=4$ and $3^2=9$; we
clearly have $4*9=36$. The power can be applied to each of the
factors of the product separately.
::: {#prop:ExponentiationPowerOfProductIsProductOfPowers .proposition} Proposition 45. Power of product is product of powers
Let a,b,n\in\mathbb{N}. We have that
$$\begin{equation*} \left(a*b\right)^n=a^n*b^n \end{equation*}$$
Proof:
If $n=0$ then $\left(a*b\right)^0=1$ by definition and $a^0*b^0=1*1=1$. So
suppose that $n>0$. Then we have that
$$\begin{align*} \left(a*b\right)^n=\prod_{i=1}^n a*b &= \underbrace{a*b*a*b*\dots*a*b}_{n\text{ times}}\\ &= \left(\underbrace{a*a*\dots*a}_{n\text{ times}}\right)*\left(\underbrace{b*b*\dots*b}_{n\text{ times}}\right),\ \text{By commutativity and associativity of multiplication}\\ &= a^n*b^n \end{align*}$$
The proposition has been shown. $\qed$ :::
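The exponent laws above, and the counterexamples to commutativity and associativity, can be spot-checked with Python's built-in integer power operator `**`. This is a finite check on small values only, not a proof; the function name `exponent_laws_hold` is an assumption made for illustration.

```python
def exponent_laws_hold(bound: int) -> bool:
    """Check the power law, the same-base law and the power-of-product law
    for all naturals below `bound`."""
    r = range(bound)
    return all(
        (a ** b) ** c == a ** (b * c)         # power law
        and a ** b * a ** c == a ** (b + c)   # same base: exponents add
        and (a * b) ** c == a ** c * b ** c   # power of a product
        for a in r for b in r for c in r
    )


print(exponent_laws_hold(6))         # True
print(3 ** 4, 4 ** 3)                # 81 64
print((2 ** 3) ** 4, 2 ** (3 ** 4))  # 4096 2417851639229258349412352
```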
Subtraction
We can define an operation that will allow us to at least partially undo addition. To define this operation we need to make use of the less than operator.
::: definition Definition 75. Subtraction of natural number
Let n,m\in\mathbb{N} such that m\leq n, and let d\in\mathbb{N} be such
that n=m+d. We define subtraction by
$$\begin{equation*} n-m=d \end{equation*}$$
We call d the difference between n and m.
:::
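The definition can be read as a search for the number d with n=m+d. The following is a minimal sketch of that reading, assuming Python integers stand in for natural numbers; the function name `subtract` is hypothetical and the upward search is chosen only to mirror the definition, not for efficiency.

```python
def subtract(n: int, m: int) -> int:
    """Return the d with n = m + d; only defined when m <= n."""
    if m > n:
        raise ValueError(f"{n} - {m} is not defined in the natural numbers")
    d = 0
    while m + d != n:  # search upwards for the difference
        d += 1
    return d
```

For instance `subtract(8, 3)` finds d=5 because 8=3+5, while `subtract(3, 8)` raises an error as no natural number d satisfies 3=8+d.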
There is an immediate result from the definition of subtraction.
::: {#prop:NaturalAddDifference .proposition} Proposition 46. $a+\left(b-c\right)=\left(a+b\right)-c$
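Let a,b,c\in\mathbb{N} with b\geq c. We have that
$$\begin{equation*} a+\left(b-c\right)=\left(a+b\right)-c \end{equation*}$$
Proof:
We argue by induction on b. Let P\left(n\right) denote the proposition that for every c\leq n
$$\begin{equation*} a+\left(n-c\right)=\left(a+n\right)-c \end{equation*}$$
For the base case n=0 the condition c\leq 0 forces c=0 and so
$$\begin{equation*} a+\left(0-0\right)=a=\left(a+0\right)-0 \end{equation*}$$
Now suppose that P\left(n\right) holds, we show that
P\left(n+1\right) is true, that is for every c\leq n+1
$$\begin{equation*} a+\left(\left(n+1\right)-c\right)=\left(a+\left(n+1\right)\right)-c \end{equation*}$$
If c=n+1 then \left(n+1\right)-c=0 and the left-hand side is a. As a+\left(n+1\right)=\left(n+1\right)+a, the definition of subtraction gives \left(a+\left(n+1\right)\right)-\left(n+1\right)=a, so both sides agree. Otherwise c\leq n. Writing d=n-c we have n+1=\left(c+d\right)+1=c+\left(d+1\right) and so \left(n+1\right)-c=\left(n-c\right)+1. The same argument applied to a+n in place of n gives \left(\left(a+n\right)+1\right)-c=\left(\left(a+n\right)-c\right)+1. Hence
$$\begin{align*} a+\left(\left(n+1\right)-c\right)&=a+\left(\left(n-c\right)+1\right)\\ &=\left(a+\left(n-c\right)\right)+1\\ &=\left(\left(a+n\right)-c\right)+1\\ &=\left(\left(a+n\right)+1\right)-c\\ &=\left(a+\left(n+1\right)\right)-c \end{align*}$$
where the third equality is the induction hypothesis. As required. $\qed$ :::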
We immediately see that subtraction is not commutative, that is
a-b\neq b-a in general. In fact b-a is not even defined unless b\geq a,
but then a-b is not defined unless a=b, and vice versa. Likewise it is not
associative, as for example \left(8-4\right)-2=2 but
8-\left(4-2\right)=6. We do however retain the fact that
multiplication distributes over subtraction.
::: proposition Proposition 47. Multiplication distributes over subtraction
Let a,b,c\in\mathbb{N} with b\geq c. We
have that
-
$a\left(b-c\right)=ab-ac$
-
$\left(b-c\right)a=ba-ca=ab-ac$
Proof:
-
a\left(b-c\right)=ab-ac:
Let a\in\mathbb{N} be arbitrary. We argue by induction on n, where P\left(n\right) is the proposition that for every m\leq n
$$\begin{equation*} a\left(n-m\right)=an-am \end{equation*}$$
For the base case P\left(0\right) we have that n=m=0 and so
$$\begin{equation*} a\left(0-0\right)=a*0=0=a*0-a*0 \end{equation*}$$
showing the base case. Now suppose that P\left(n\right) holds, we show that P\left(n+1\right) is true, that is we show
$$\begin{equation*} a\left(\left(n+1\right)-m\right)=a\left(n+1\right)-am \end{equation*}$$
where m\leq n+1. There are two cases to consider. If m=n+1 then we have
$$\begin{equation*} a\left(\left(n+1\right)-m\right)=a*0=0=a\left(n+1\right)-am \end{equation*}$$
Now suppose that m<n+1, so that m\leq n and \left(n+1\right)-m=\left(n-m\right)+1. Then
$$\begin{align*} a\left(\left(n+1\right)-m\right)&=a\left(\left(n-m\right)+1\right)\\ &=a\left(n-m\right)+a\\ &=\left(an-am\right)+a\\ &=\left(an+a\right)-am\\ &=a\left(n+1\right)-am \end{align*}$$
where the third equality is the induction hypothesis and the fourth follows from proposition 46 together with the commutativity of addition. The result follows by induction.
-
\left(b-c\right)a=ba-ca=ab-ac:
As multiplication is commutative we have that
$$\begin{align*} \left(b-c\right)a&=a\left(b-c\right)\\ &=ab-ac\\ &=ba-ca \end{align*}$$
The result follows. $\qed$ :::
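The behaviour of subtraction described above can be checked on small cases. The following is a rough sketch, assuming Python integers stand in for natural numbers; the helper `sub` is a hypothetical name for the partial subtraction, which refuses to return a value when the difference is not a natural number.

```python
def sub(n: int, m: int) -> int:
    """Partial subtraction on the natural numbers: defined only when m <= n."""
    if m > n:
        raise ValueError("difference is not a natural number")
    return n - m


# Subtraction is not associative: the two bracketings differ.
assert sub(sub(8, 4), 2) == 2
assert sub(8, sub(4, 2)) == 6

# Check a + (b - c) = (a + b) - c and a(b - c) = ab - ac whenever b >= c.
for a in range(7):
    for b in range(7):
        for c in range(b + 1):
            assert a + sub(b, c) == sub(a + b, c)
            assert a * sub(b, c) == sub(a * b, a * c)
            assert sub(b, c) * a == sub(b * a, c * a)
```

Note that `sub(a * b, a * c)` never raises in the loop, since c\leq b gives ac\leq ab.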
The principle of strong induction
The final property of the natural numbers we shall look at is the
principle of strong induction, a version of induction that is sometimes
more convenient to use, although as we will see it is actually
equivalent to the usual induction. Instead of assuming that
P\left(n\right) is true and showing that P\left(n+1\right) is true, we
assume that P\left(n\right) is true for all n\leq k for some
k\in\mathbb{N}, and we show that this implies that
P\left(k+1\right) is true.
::: theorem Theorem 16. The principle of strong induction
Let P\left(n\right) be a proposition about a natural number
n\in\mathbb{N}. Moreover, suppose that
-
P\left(0\right) is true.
-
\forall k\in\mathbb{N}: P\left(0\right)\wedge P\left(1\right)\wedge P\left(2\right)\wedge\dots\wedge P\left(k\right) all being true implies that P\left(k+1\right) is true.
If these two statements are true, we have that P\left(n\right) is
true for any natural number n, and we say the proposition
P\left(n\right) holds by the principle of strong mathematical
induction.
Proof:
Define \Tilde{P}\left(n\right) to be the following proposition
$$\begin{equation*} \Tilde{P}\left(n\right)=P\left(0\right)\wedge P\left(1\right)\wedge P\left(2\right)\wedge\dots\wedge P\left(n\right) \end{equation*}$$
We show by the usual induction that \Tilde{P}\left(n\right) is true for all n\geq 0. By assumption 1.
\Tilde{P}\left(0\right) is true as
\Tilde{P}\left(0\right)=P\left(0\right). Now suppose that
\Tilde{P}\left(n\right) is true for some n\in\mathbb{N}, that is
$$\begin{equation*}
\Tilde{P}\left(n\right)=P\left(0\right)\wedge P\left(1\right)\wedge P\left(2\right)\wedge\dots\wedge P\left(n\right)
\end{equation*}$$ is true. We show that \Tilde{P}\left(n+1\right) is
true, that is
$$\begin{equation*} \Tilde{P}\left(n+1\right)=P\left(0\right)\wedge P\left(1\right)\wedge P\left(2\right)\wedge\dots\wedge P\left(n\right)\wedge P\left(n+1\right) \end{equation*}$$
is true. By assumption 2. the truth of
P\left(0\right)\wedge P\left(1\right)\wedge P\left(2\right)\wedge\dots\wedge P\left(n\right)
implies that P\left(n+1\right) is true. Hence we have that
$$\begin{equation*} \Tilde{P}\left(n+1\right)=\Tilde{P}\left(n\right)\wedge P\left(n+1\right) \end{equation*}$$ is true.
Hence by the principle of mathematical induction we have that
\Tilde{P}\left(n\right) is true for all n\geq 0. As \Tilde{P}\left(n\right) includes P\left(n\right), it follows that P\left(n\right) is true for all n\in\mathbb{N}. $\qed$
:::
As mentioned earlier, strong induction and the usual
induction are equivalent, and we shall now prove this. We used induction to
prove strong induction, so it is left to show that given the assumptions
for strong induction, we can deduce the truth of the proposition
P\left(n\right) \forall n\in\mathbb{N} using only induction.
::: theorem Theorem 17. Strong induction is equivalent to the usual induction
Suppose that the assumptions of strong induction hold. That is, suppose
P\left(n\right) is a proposition about a natural number
n\in\mathbb{N} and moreover suppose that
-
P\left(0\right) is true.
-
\forall k\in\mathbb{N}: P\left(0\right)\wedge P\left(1\right)\wedge P\left(2\right)\wedge\dots\wedge P\left(k\right) all being true implies that P\left(k+1\right) is true.
We have that the truth of P\left(n\right) for all n\in\mathbb{N}
can be deduced using only regular induction.
Proof:
Let \Tilde{P}\left(n\right) be the proposition given by
$$\begin{equation*} \forall k\leq n\text{ we have } P\left(k\right) \text{ is true} \end{equation*}$$
We show by the principle of induction that
-
\Tilde{P}\left(0\right) is true.
-
\Tilde{P}\left(n\right) being true implies \Tilde{P}\left(n+1\right) is true for any natural number n.
-
\Tilde{P}\left(0\right) is true:
We have that \Tilde{P}\left(0\right) is given by
$$\begin{equation*} \forall k\leq 0\text{ we have } P\left(k\right) \text{ is true} \end{equation*}$$
The only natural number that is less than or equal to zero is zero, so \Tilde{P}\left(0\right) is exactly the statement that P\left(0\right) is true, which holds by assumption 1.
-
\Tilde{P}\left(n\right) being true implies \Tilde{P}\left(n+1\right) is true for any natural number n:
Suppose that \Tilde{P}\left(n\right) is true, that is
$$\begin{equation*} \forall k\leq n\text{ we have } P\left(k\right) \text{ is true} \end{equation*}$$
We show that \Tilde{P}\left(n+1\right) is true, that is
$$\begin{equation*} \forall k\leq n+1\text{ we have } P\left(k\right) \text{ is true} \end{equation*}$$
Let k\leq n+1 be a natural number. We have two cases to consider.
-
If k<n+1 then we must have that k\leq n, and so P\left(k\right) is true as \Tilde{P}\left(n\right) is true by assumption.
-
The remaining case is k=n+1. As \Tilde{P}\left(n\right) is true, we have P\left(0\right)\wedge P\left(1\right)\wedge P\left(2\right)\wedge\dots\wedge P\left(n\right) all being true, and so by assumption 2. of strong induction we conclude that P\left(n+1\right) is true.
Hence, in both cases P\left(k\right) is true, and we conclude the truth of \Tilde{P}\left(n+1\right).
Hence \Tilde{P}\left(n\right) is true for all n by mathematical induction, and in particular P\left(n\right) is true for all n. Which is to say, strong induction can be proven using regular induction. $\qed$ :::
We have now shown the equivalence of regular and strong induction.
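To see where the stronger hypothesis is useful, consider an illustration that is not taken from the text above: every natural number n\geq 2 can be written as a product of primes. The following is a minimal sketch, assuming Python integers and a hypothetical helper `prime_factors`; the recursion calls itself on two smaller numbers that are usually not n-1, which is exactly the situation where we need P\left(k\right) for all smaller k rather than only for the predecessor. Here the induction starts at 2 rather than 0.

```python
def prime_factors(n: int) -> list:
    """Return a list of primes whose product is n, for n >= 2."""
    if n < 2:
        raise ValueError("defined for n >= 2 only")
    for d in range(2, n):
        if n % d == 0:
            # n = d * (n // d) with both factors strictly smaller than n:
            # this is where the strong hypothesis P(2), ..., P(n-1) is used.
            return prime_factors(d) + prime_factors(n // d)
    return [n]  # no proper divisor exists, so n is itself prime
```

For example `prime_factors(12)` splits 12 as 2*6 and then 6 as 2*3, neither of which is the predecessor of 12.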
The well-ordering principle
Consider the way we constructed the natural numbers: we started with one
element 0=\emptyset, and built each element in turn by the successor
function. That is
$$\begin{align*} 1&=S\left(0\right)=0\cup\left\{0\right\}\\ 2&=S\left(1\right)=1\cup\left\{1\right\}\\ 3&=S\left(2\right)=2\cup\left\{2\right\} \end{align*}$$
This is clearly constructing some form of ordering on \mathbb{N}, and
we can consider this ordering in two different ways. Firstly we can
see the successor map under set inclusion, that is
$$\begin{equation*}
0\subset 1\subset 2\subset 3\subset 4\subset 5\subset\dots
\end{equation*}$$ likewise we can consider this ordering in the more
intuitive sense of the less than or equal to operator.
$$\begin{equation*}
0\leq 1\leq 2\leq 3\leq 4\leq 5\leq\dots
\end{equation*}$$ This doesn't just hold for the entirety of
\mathbb{N}. For example consider the set S=\left\{2,4,6,8\right\}.
We have from the successor mapping that 2\subset 4\subset 6\subset 8, hence 2 is
the smallest element of S with respect to the inclusion of sets. We
phrase this in the following theorem.
::: {#thm:WOP .theorem} Theorem 18. Well-ordering principle
Let S\subseteq\mathbb{N} be a non-empty subset of \mathbb{N}, with the
possibility of being the entirety of \mathbb{N}. We have that
\exists x\in S such that x is the smallest element of S with
respect to set inclusion. This is to say \exists x\in S such that
\forall y\in S we have x\subseteq y.
Proof:
As 0=\emptyset is by construction included in every natural number, any
subset of \mathbb{N} that contains 0 has 0 as its smallest element. It is
therefore enough to show that any non-empty subset of
\mathbb{N}\setminus\left\{ 0\right\} has a smallest element with respect
to set inclusion. For this purpose we will
define $M=\mathbb{N}\setminus\left\{ 0\right\}$.
Let S\subseteq M be a subset that has no smallest element with respect to set
inclusion. We show that S=\emptyset, arguing by strong induction starting from 1.
Define T=M\setminus S to be the complement of S in M.
As S has no smallest element with respect to inclusion,
1\not\in S, otherwise it would be the smallest
element of S, as 1\subseteq y for every y\in M. Hence 1\in T.
Now suppose that every k\in M such that k\leq n is in T. If
n+1\in S then it would be the smallest element of S, as every element of M less than
n+1 is in the complement of S. Hence n+1\in T. This implies that
every element of M is in T by strong induction.
It follows that S=\emptyset. Hence every non-empty subset of M has a smallest element, which is the result. $\qed$
:::
We have shown in some sense that \mathbb{N} is well-ordered. We will
see that the idea of well-ordering is an example of a so-called
relation.
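The example S=\left\{2,4,6,8\right\} can be reproduced with sets directly. The following is a minimal sketch, assuming Python `frozenset` objects model the set-theoretic natural numbers; the names `von_neumann` and `least_element` are ours and not part of the text.

```python
def von_neumann(n: int) -> frozenset:
    """Build the natural number n as a set, starting from 0 = empty set."""
    current = frozenset()
    for _ in range(n):
        # Successor step: S(k) is k together with the extra element k.
        current = current | frozenset({current})
    return current


def least_element(subset):
    """Return the x in subset that is included in every y in subset."""
    for x in subset:
        if all(x <= y for y in subset):  # <= is set inclusion for frozensets
            return x
    raise ValueError("the empty set has no least element")


S = {von_neumann(k) for k in (2, 4, 6, 8)}
assert least_element(S) == von_neumann(2)
```

The search in `least_element` only terminates because the subset passed in is finite; the theorem itself makes a claim about arbitrary non-empty subsets, which no finite check can establish.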
Rules for the inequality operators
Now that we have a firm grasp of the natural numbers, we can deduce some properties of the inequality operators on them.
::: {#prop:InequalityNaturalNumbers .proposition} Proposition 48. Properties of inequalities for natural numbers
Let a,b,c\in\mathbb{N}. We have the following properties for
inequalities, referred to in the proof as parts 1. to 18. in the order listed.
-
a\leq b is the same as b\geq a
-
a<b is the same as b>a
-
If a\leq b and b\leq c then a\leq c
-
If a<b and b\leq c then a<c
-
If a\leq b and b<c then a<c
-
If a<b and b<c then a<c
-
If a\geq b and b\geq c then a\geq c
-
If a>b and b\geq c then a>c
-
If a\geq b and b>c then a>c
-
If a>b and b>c then a>c
-
If a<b then a+c<b+c
-
If a\leq b then a+c\leq b+c
-
If a>b then a+c>b+c
-
If a\geq b then a+c\geq b+c
-
If a<b and c>0 then ac<bc
-
If a\leq b then ac\leq bc
-
If a>b and c>0 then ac>bc
-
If a\geq b then ac\geq bc
Proof:
-
a\leq b is the same as b\geq a:
Suppose that a\leq b, then by definition of a\leq b we have that a\subseteq b. We then clearly have that b\not\subset a and so either b>a by definition or b=a. In other words b\geq a.
-
a<b is the same as b>a:
Similar to the first part. If a<b then by definition a is a strict subset of b, that is a\subset b. If a is a strict subset of b then b\not\subset a by definition of a subset. Hence b>a by definition of greater than.
-
If a\leq b and b\leq c then a\leq c:
Suppose that a\leq b and b\leq c. By definition, we have that a\subseteq b and b\subseteq c and so by proposition 2{reference-type="ref" reference="prop:SetInclusionTransitivityProp"} we have a\subseteq c, which is to say a\leq c.
-
If a<b and b\leq c then a<c:
As a<b and b\leq c then a\subset b and b\subseteq c. Applying proposition 3{reference-type="ref" reference="prop:ProperSetInclusionTransitivityProp"} gives a\subset c and so a<c.
-
If a\leq b and b<c then a<c:
Similar to part 4. As a\leq b then a\subseteq b and likewise as b<c then b\subset c. Applying proposition 3{reference-type="ref" reference="prop:ProperSetInclusionTransitivityProp"} gives a\subset c and hence a<c.
-
If a<b and b<c then a<c:
Similar to parts 4. and 5. As a<b then a\subset b and likewise as b<c then b\subset c. By proposition 4{reference-type="ref" reference="prop:ProperSetSubSetInclusionNotTransitivity"} we have that a\subset c and hence a<c.
-
If a\geq b and b\geq c then a\geq c:
By the first part of the proposition, the statement is the same as: if c\leq b and b\leq a then c\leq a, and so part 3. of the proposition applies.
-
If a>b and b\geq c then a>c:
Applying part 2. of this proposition to a>b and a>c and part 1. to b\geq c gives the equivalent statement: if c\leq b and b<a then c<a, and so part 5. of the proposition applies.
-
If a\geq b and b>c then a>c:
Applying part 1. of this proposition to a\geq b and part 2. to b>c and a>c gives the equivalent statement: if c<b and b\leq a then c<a, and so part 4. of the proposition applies.
-
If a>b and b>c then a>c:
Applying part 2. to a>b, b>c and a>c gives the equivalent statement: if c<b and b<a then c<a, and so part 6. applies.
-
If a<b then a+c<b+c:
Suppose that a<b, then a\subset b. We argue by induction on c that \left(a+c\right)\subset \left(b+c\right).
Let P\left(c\right) be the proposition given by
$$\begin{equation*} \left(a+c\right)\subset \left(b+c\right) \end{equation*}$$
For the base case c=0 we have a+0=a and b+0=b, and a\subset b by hypothesis. Hence P\left(0\right) is true.
So suppose that P\left(c\right) is true, that is to say
$$\begin{equation*} \left(a+c\right)\subset \left(b+c\right) \end{equation*}$$
We need to show that P\left(c+1\right)=P\left(S\left(c\right)\right) is true. That is
$$\begin{equation*} \left(a+S\left(c\right)\right)\subset \left(b+S\left(c\right)\right) \end{equation*}$$
We know from the definition of addition that \forall m,n\in\mathbb{N}
$$\begin{equation*} m+S\left(n\right)=S\left(m+n\right) \end{equation*}$$
Hence it is enough to show that
$$\begin{equation*} S\left(a+c\right)\subset S\left(b+c\right) \end{equation*}$$
Let x,y\in\mathbb{N} with x=a+c and y=b+c, so that x\subset y by the induction hypothesis. Since x and y are natural numbers, x\subset y means that x\in y, and so S\left(x\right)=x\cup\left\{x\right\}\subseteq y. Moreover y\subset y\cup\left\{y\right\}=S\left(y\right) as y\not\in y. Hence
$$\begin{equation*} S\left(x\right)\subseteq y\subset S\left(y\right) \end{equation*}$$
that is S\left(a+c\right)\subset S\left(b+c\right), which is to say a+S\left(c\right)\subset b+S\left(c\right).
Hence P\left(S\left(c\right)\right)=P\left(c+1\right) holds.
The result follows by induction. Therefore a+c\subset b+c for all c\in\mathbb{N} and therefore a+c<b+c.
-
If a\leq b then a+c\leq b+c:
Suppose that a\leq b. If a<b then by part 11. we have a+c<b+c. So suppose that a=b, then we must have that a+c=b+c. In either case a+c\leq b+c by definition.
-
If a>b then a+c>b+c:
Applying part 2. of the proposition gives the equivalent statement that b<a implies b+c<a+c, and so we can apply part 11.
-
If a\geq b then a+c\geq b+c:
Applying part 1. of the proposition gives the equivalent statement that b\leq a implies b+c\leq a+c, and so we can apply part 12.
-
If a<b and c>0 then ac<bc:
Suppose that a<b, then a\subset b. We argue by induction on c\geq 1 that ac\subset bc. The restriction c>0 is needed, as for c=0 we have a*0=0=b*0.
Let P\left(c\right) be the proposition given by
$$\begin{equation*} \left(ac\right)\subset \left(bc\right) \end{equation*}$$
For the base case c=1 we have a*1=a and b*1=b, and a<b by hypothesis. Hence P\left(1\right) is true.
So suppose that P\left(c\right) is true for some c\geq 1, that is to say
$$\begin{equation*} \left(ac\right)\subset \left(bc\right) \end{equation*}$$
We need to show that P\left(c+1\right)=P\left(S\left(c\right)\right) is true. That is
$$\begin{equation*} \left(aS\left(c\right)\right)\subset \left(bS\left(c\right)\right) \end{equation*}$$
We know from the definition of multiplication that \forall m,n\in\mathbb{N}
$$\begin{equation*} mS\left(n\right)=mn+m \end{equation*}$$
Hence we need to show that
$$\begin{equation*} ac+a\subset bc+b \end{equation*}$$
By the induction hypothesis, we know that ac<bc and so by part 11. we conclude that ac+a<bc+a. Likewise, as a<b, part 11. together with the commutativity of addition gives bc+a<bc+b. By part 6. we conclude that ac+a\subset bc+b, which is to say aS\left(c\right)\subset bS\left(c\right). Hence P\left(S\left(c\right)\right)=P\left(c+1\right) is true and the result follows by induction. Hence we conclude that ac\subset bc for all c>0.
-
If a\leq b then ac\leq bc:
Suppose a\leq b. If c=0 then ac=0=bc. If a<b and c>0 we apply part 15. Otherwise, we have that a=b and so by definition ac=bc. In every case ac\leq bc.
-
If a>b and c>0 then ac>bc:
Applying part 2. of the proposition gives the equivalent statement that b<a and c>0 implies bc<ac, and so we apply part 15. of the proposition.
-
If a\geq b then ac\geq bc:
Applying part 1. of the proposition gives the equivalent statement that b\leq a implies bc\leq ac, and so we apply part 16. of the proposition.
The result has been shown. $\qed$ :::
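A selection of these properties can be spot-checked by brute force. The following is a rough sketch, assuming Python integers with their built-in ordering stand in for the natural numbers ordered by set inclusion; the function name `check_inequalities` is hypothetical. The last assertion records that the strict multiplicative property relies on c>0, since for c=0 both sides are 0.

```python
from itertools import product


def check_inequalities(limit: int) -> bool:
    """Test a selection of the properties for all a, b, c below limit."""
    for a, b, c in product(range(limit), repeat=3):
        if a <= b and b <= c and not (a <= c):
            return False
        if a < b and b <= c and not (a < c):
            return False
        if a < b and not (a + c < b + c):
            return False
        if a <= b and not (a * c <= b * c):
            return False
        if a < b and c > 0 and not (a * c < b * c):
            return False
    return True


assert check_inequalities(8)
# With c = 0 the strict inequality fails: 1 < 2 but 1*0 < 2*0 is false.
assert not (1 * 0 < 2 * 0)
```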
Cardinality, countability, relations
::: epigraph God created infinity, and man, unable to understand infinity, had to invent finite sets.
Gian-Carlo Rota :::
Cardinality
In the previous chapter when constructing the natural numbers, we made
continual reference to the idea that the number
1=\left\{\emptyset\right\} is somehow the set that contains a single
element, and that
3=\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}
somehow contains three individual elements. We can make this a rigorous
definition, and to do so we will use the idea we have been using all along.
This is to say, the natural number n is a set that has n elements.
::: definition Definition 76. Cardinality of a natural number
We define the cardinality of a natural number n\in\mathbb{N}, which
we will denote \left|n\right|, to be given by the identity mapping.
That is to say
$$\begin{align*} \left|\cdot\right|:\mathbb{N}&\rightarrow\mathbb{N}\\ n&\mapsto\left|n\right|=n \end{align*}$$ :::
::: example
Example 71. Consider 1 and 3 from before. We have that
$$\begin{equation*} \left|1\right|=\left|\left\{\emptyset\right\}\right|=1 \end{equation*}$$ and
$$\begin{equation*} \left|3\right|=\left|\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}\right|=3 \end{equation*}$$
Indeed, we have captured the essence of the intuitive idea that a
natural number n is a set that has n elements.
:::
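The idea that the natural number n is a set with n elements can be checked directly. The following is a minimal sketch, assuming Python `frozenset` objects model the set-theoretic construction; the helper name `von_neumann` is ours.

```python
def von_neumann(n: int) -> frozenset:
    """Build the natural number n as a set, starting from 0 = empty set."""
    current = frozenset()
    for _ in range(n):
        # Successor step: S(k) is k together with the extra element k.
        current = current | frozenset({current})
    return current


# 1 is the set whose only element is the empty set.
assert von_neumann(1) == frozenset({frozenset()})

# The set n has exactly n elements.
for n in range(8):
    assert len(von_neumann(n)) == n
```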
Now that we have a notion of size for natural numbers, we can extend
this idea to sets in general. In particular, how many elements does a
given set have? To build this idea we will also be making use of
mappings. For an example, suppose we have the set
S=\left\{2,4,6,8\right\}. Intuitively we know that this is a set which
has four elements, and by our definition above we know that
\left|4\right|=\left|\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\},\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\}\right\}\right|=4,
that is, 4 is a set that contains 4 elements. Now, consider the mapping
f:S\rightarrow 4, given by
$$\begin{align*} f\left(2\right)&=\emptyset\\ f\left(4\right)&=\left\{\emptyset\right\}\\ f\left(6\right)&=\left\{\emptyset,\left\{\emptyset\right\}\right\}\\ f\left(8\right)&=\left\{\emptyset,\left\{\emptyset\right\},\left\{\emptyset,\left\{\emptyset\right\}\right\}\right\} \end{align*}$$
With f being defined as it is, we conclude that f is a bijection.
Hence we know that each element x\in S maps to exactly one of the
elements y\in 4, and each y\in 4 is mapped to by exactly one x\in S.
This tells us that S is a set
which has 4 elements, as 4 is a set which contains 4 elements. We
can make this a definition of the size of a set, and in doing so define
the notion of a finite and an "infinite" set.
::: definition Definition 77. Cardinality of a set
Let S be a set.
-
Suppose that n\in\mathbb{N}. We define the cardinality of the set S, denoted by \left|S\right|, to be n if and only if there exists a bijective mapping f:S\rightarrow n. We write this
$$\begin{equation*} \left|S\right|=n \end{equation*}$$
If such a mapping exists we say that S is a finite set of size n. We recall that each element of n is nothing but a set whose elements are sets.
-
Suppose that f:S\rightarrow\mathbb{N} is a bijective mapping. We say that the cardinality of the set S is infinite. Informally we denote this by \infty, but formally we say that \left|S\right|=\aleph_0=\left|\mathbb{N}\right|, where \aleph_0 is pronounced Aleph-Null.
-
Suppose that T is a set and f:S\rightarrow T is a bijective mapping. We say that the sets S and T have the same cardinality and write \left|S\right|=\left|T\right|. :::
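The finite case of the definition can be illustrated with the set S=\left\{2,4,6,8\right\} and the mapping f from before. The following is a minimal sketch, assuming a Python `dict` models the mapping and `frozenset` objects model the natural numbers as sets; the names `von_neumann` and `is_bijection` are ours.

```python
def von_neumann(n: int) -> frozenset:
    """Build the natural number n as a set, starting from 0 = empty set."""
    current = frozenset()
    for _ in range(n):
        current = current | frozenset({current})
    return current


def is_bijection(mapping: dict, domain, codomain) -> bool:
    """Check that mapping is a bijection from domain onto codomain."""
    values = list(mapping.values())
    defined_everywhere = set(mapping) == set(domain)
    injective = len(set(values)) == len(values)
    surjective = set(values) == set(codomain)
    return defined_everywhere and injective and surjective


S = {2, 4, 6, 8}
four = von_neumann(4)
# f sends 2, 4, 6, 8 to the elements 0, 1, 2, 3 of the set 4.
f = {2: von_neumann(0), 4: von_neumann(1), 6: von_neumann(2), 8: von_neumann(3)}
assert is_bijection(f, S, four)  # hence |S| = 4
```

A mapping that sends two elements of S to the same element of 4 would fail the injectivity check, and so would not witness \left|S\right|=4.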
This definition made reference to the idea of an "infinite" set. We know
that the axiom of infinity gives us the existence of one infinite set.
The infinite set that is defined by the axiom of infinity includes the
natural numbers but it also includes the so-called ordinal numbers. The
natural numbers are what are called cardinal numbers: they refer to the
size of collections of objects or the amount of some quantity. They can
also be used to list (enumerate) a collection. For example we can think
of a race between 20 drivers. We must have that one driver comes
first, another second, another third, and so on. Each driver can be
listed using a number from 1 to 20 inclusive, or alternatively a number
from 0 to 19. When used in this way, the natural numbers order by
enumeration the positions in which the drivers finished the race. Now,
elements in the infinite set that are not natural numbers also have
this property that they can be used to enumerate the finishing positions
of race drivers, however to use them we would first have to go through
every single natural number. The first such non-natural number
ordinal is usually denoted \omega, and so to label something
$\omega$-th, infinitely many things would have to come before it.
This gets complicated quickly and as such we won't go into more details
for now. Instead the idea that the natural numbers can be used for
enumeration turns out to be a useful one, especially later down the line
when we start considering sets like \mathbb{R}. For now, we are only
interested in sets whose cardinality is either finite or \aleph_0 and
we will continue the exploration of cardinality.
To continue this exploration we will need to relate the ideas of subsets to that of cardinality.
::: {#prop:ProperSubsetStrictlySmallarCard .proposition} Proposition 49. Proper subset of a finite set has strictly smaller cardinality
Let S and T be finite sets such that S\subset T, then we have
that \left|S\right|<\left|T\right|.
Proof:
Let S and T be finite sets such that S\subset T with say
\left|T\right|=n, we argue by induction on n, the cardinality of the
set T.
Let P\left(n\right) be the proposition given by
$$\begin{equation*} \text{If }T\text{ is a finite set with } S\subset T \text{ and }\left|T\right|=n \text{ then } S\text{ is a finite set and } \left|S\right|<\left|T\right|=n \end{equation*}$$
We need to show that
-
P\left(0\right) is true.
-
If P\left(n\right) is true then P\left(n+1\right) is true.
-
P\left(0\right) is true:
We have that \left|T\right|=0 and so T=\emptyset. As T=\emptyset there are no proper subsets S\subset \emptyset, for if there were then T\neq\emptyset. Hence the base case is vacuously true.
-
If P\left(n\right) is true then P\left(n+1\right) is true:
Suppose that P\left(n\right) holds for some n\in\mathbb{N}, which is the statement
$$\begin{equation*} \text{If }T\text{ is a finite set with } S\subset T \text{ and }\left|T\right|=n \text{ then } S\text{ is a finite set and } \left|S\right|<\left|T\right|=n \end{equation*}$$
We need to show that P\left(n+1\right) also holds, that is we show that
$$\begin{equation*} \text{If }T\text{ is a finite set with } S\subset T \text{ and }\left|T\right|=n+1 \text{ then } S\text{ is a finite set and } \left|S\right|<\left|T\right|=n+1 \end{equation*}$$
So suppose that \left|T\right|=n+1 for some n\in\mathbb{N} and that S\subset T. As S is a strict subset of T we know that \exists t\in T with t\not\in S. Hence we have that S\subseteq T\setminus\left\{t\right\}. We need to now show that $\left|T\setminus\left\{t\right\}\right|=n$.
::: lemma Lemma 4. Set of cardinality n+1 minus an element has cardinality $n$
Let S be a finite set with cardinality n+1. Consider the set S\setminus\left\{s\right\} where s\in S is an arbitrary element of S. We have that \left|S\setminus\left\{s\right\}\right|=n.
Proof:
We need to show that for the set S\setminus\left\{s\right\} there exists a bijective mapping to n. We know that S has cardinality n+1, hence there exists a bijection f:S\rightarrow n+1. We know by construction that n+1=n\cup\left\{n\right\}, hence we have that n=\left(n+1\right)\setminus\left\{n\right\}, that is n is obtained from n+1 by removing the element n.
Consider the mapping g defined as follows
$$\begin{align*} g:S\setminus\left\{s\right\}&\rightarrow n=\left(n+1\right)\setminus\left\{n\right\}\\ x&\mapsto g\left(x\right)=\begin{cases} f\left(x\right) & \text{if }f\left(x\right)\neq n\\ f\left(s\right) & \text{if }f\left(x\right)=n \end{cases} \end{align*}$$
This is to say g is a mapping that takes each x\in S\setminus\left\{s\right\} and maps it to f\left(x\right) if f\left(x\right)\neq n, that is if f doesn't map x to the removed element of the set n+1. Otherwise, if f does map x to n, then g maps x to whatever f takes the removed element s to. Note that if f\left(s\right)=n then, as f is injective, no x\in S\setminus\left\{s\right\} has f\left(x\right)=n and g is simply the restriction of f.
For example suppose that S=\left\{0,1,2\right\}, i.e. we are considering the case n=2, and let f:S\rightarrow 3 be the identity mapping, which is a bijection. Suppose we now consider S\setminus\left\{2\right\}=\left\{0,1\right\} and the mapping g:S\setminus\left\{2\right\}\rightarrow 2 given by
$$\begin{align*} g:S\setminus\left\{2\right\}&\rightarrow 2=3\setminus\left\{2\right\}=\left\{\emptyset,\left\{\emptyset\right\}\right\}\\ x&\mapsto g\left(x\right)=\begin{cases} f\left(x\right) & \text{if }f\left(x\right)\neq 2\\ f\left(2\right) & \text{if }f\left(x\right)=2 \end{cases} \end{align*}$$
We have that g\left(0\right)=0=\emptyset and g\left(1\right)=1=\left\{\emptyset\right\}. We could have instead considered S\setminus\left\{1\right\}=\left\{0,2\right\}, again with f being the identity mapping. In this case g is the mapping given by
$$\begin{align*} g:S\setminus\left\{1\right\}&\rightarrow 2=3\setminus\left\{2\right\}=\left\{\emptyset,\left\{\emptyset\right\}\right\}\\ x&\mapsto g\left(x\right)=\begin{cases} f\left(x\right) & \text{if }f\left(x\right)\neq 2\\ f\left(1\right) & \text{if }f\left(x\right)=2 \end{cases} \end{align*}$$
In this case we have that g\left(0\right)=0=\emptyset but g\left(2\right)=f\left(1\right)=1=\left\{\emptyset\right\}.
Now, we need to show that in the general case the mapping g defined above is a bijection.
-
g is an injection:
To see that g is an injection, suppose that x,y\in S\setminus\left\{s\right\} and that x\neq y. There are three cases to consider.
-
f\left(x\right)\neq n and f\left(y\right)\neq n:
We have by definition of the mapping g that f\left(x\right)=g\left(x\right) and f\left(y\right)=g\left(y\right). Moreover we know that f is a bijection and in particular an injection, hence f\left(x\right)\neq f\left(y\right) and so g\left(x\right)\neq g\left(y\right).
-
f\left(x\right)=n:
By the definition of the mapping g we have that g\left(x\right)=f\left(s\right). Now, recall that y\in S\setminus\left\{s\right\}, thus it follows that y\neq s. By the injectivity of f we have that f\left(y\right)\neq f\left(s\right)=g\left(x\right). Moreover, as x\neq y, the injectivity of f gives f\left(y\right)\neq f\left(x\right)=n. It now follows by definition of g that
$$\begin{equation*} g\left(y\right)=f\left(y\right)\neq f\left(s\right)=g\left(x\right) \end{equation*}$$
That is g\left(y\right)\neq g\left(x\right).
-
f\left(y\right)=n:
This is the same as the case f\left(x\right)=n except the roles of x and y are swapped, for completeness we give the details. By the definition of the mapping g we have that g\left(y\right)=f\left(s\right). As x\in S\setminus\left\{s\right\} it follows that x\neq s, and so by the injectivity of f we have that f\left(x\right)\neq f\left(s\right)=g\left(y\right). Moreover, as x\neq y, the injectivity of f gives f\left(x\right)\neq f\left(y\right)=n. It now follows by definition of g that
$$\begin{equation*} g\left(x\right)=f\left(x\right)\neq f\left(s\right)=g\left(y\right) \end{equation*}$$
That is g\left(y\right)\neq g\left(x\right).
This shows that g is an injection.
-
g is a surjection:
We need to show that \forall y\in n,\exists x'\in S\setminus\left\{s\right\} such that g\left(x'\right)=y. Let y\in n, so that in particular y\neq n. We know that f is a bijection and in particular it is a surjection, so there exists x\in S with f\left(x\right)=y. As S\setminus\left\{s\right\} doesn't contain the element s, we can't use x directly when x=s. We therefore consider the two cases f\left(s\right)\neq n and f\left(s\right)=n from the definition of g.
-
f\left(s\right)\neq n:
Suppose that f\left(s\right)\neq n. As f is a bijection we have that f is invertible, and in particular we must have that f^{-1}\left(n\right)\neq s. There are two additional cases to consider now, f\left(s\right)=y and f\left(s\right)\neq y.
-
f\left(s\right)=y:
Let x'=f^{-1}\left(n\right), which lies in S\setminus\left\{s\right\} as f^{-1}\left(n\right)\neq s. By definition of g we have that
$$\begin{equation*} g\left(f^{-1}\left(n\right)\right)=f\left(s\right)=y \end{equation*}$$
-
f\left(s\right)\neq y:
We have that f\left(x\right)=y\neq f\left(s\right) and so x\neq s. Moreover f\left(x\right)=y\neq n, hence g\left(x\right)=f\left(x\right)=y and we can simply take x'=x.
-
f\left(s\right)=n:
Now suppose that f\left(s\right)=n. We know that n\not\in n and so f\left(x\right)=y\neq n=f\left(s\right). Thus we conclude that x\neq s and g\left(x\right)=f\left(x\right)=y, so we let x'=x.
In each case we have found a valid choice for x' and so surjectivity has been shown.
It follows that $g:S\setminus\left\{s\right\}\rightarrow n$ is a bijection and by definition of set cardinality we conclude that $S\setminus\left\{s\right\}$ has cardinality $n$. As required. $\qed$
:::

Now, by the lemma we have that $T\setminus\left\{t\right\}$ is a set of cardinality $n$. If $S=T\setminus\left\{t\right\}$ then $\left|S\right|=n<n+1=S\left(n\right)$ and so $S$ is finite by definition. Otherwise we must have that $S$ is a proper subset of $T\setminus\left\{t\right\}$, so the induction hypothesis applies, that is $S$ is a finite set with fewer than $n$ elements. Moreover as $n<n+1$ it follows that $S$ has fewer than $n+1$ elements. Hence $P\left(n+1\right)$ holds.
The result now follows by induction. $\qed$ :::
This proposition has an immediate consequence.
::: {#lem:SubsetOfFiniteSetHasAtMostSameCard .lemma} Lemma 5. Subset of a finite set has at most the same cardinality
Let S and T be finite sets such that S\subseteq T. Then
\left|S\right|\leq\left|T\right|.
Proof:
There are two cases to consider. Firstly, if S=T then by definition
S and T have the same elements and therefore the identity map
is a bijection between the two sets. Hence
\left|S\right|=\left|T\right|. The final case is S\subset T, which
is simply proposition
49{reference-type="ref"
reference="prop:ProperSubsetStrictlySmallarCard"}. $\qed$
:::
We defined the cardinality of a finite set S in terms of a bijective mapping
from S to a natural number n\in\mathbb{N}, although this doesn't mean we can't deduce
things about cardinality from, say, injective mappings or surjective
mappings. We will assume unless stated otherwise that the sets we are
dealing with are finite.
::: {#prop:CardinalityOfFiniteInjectiveMap .proposition} Proposition 50. Cardinality of finite sets in an injective mapping
Let S and T be two finite sets, and suppose that f:S\rightarrow T
is an injection. We have that
$$\begin{equation*} \left|S\right|\leq \left|T\right| \end{equation*}$$
Proof:
Suppose that f:S\rightarrow T is an injective mapping between finite
sets with \left|S\right|=n and \left|T\right|=m. Now consider the
mapping g:S\rightarrow\mathop{\mathrm{Image}}\left(f\right) given by
g\left(s\right)=f\left(s\right). We have by proposition
18{reference-type="ref"
reference="prop:InjectiveMapToImageIsBijection"} that an injective
mapping to the image is a bijection and so by definition
\left|\mathop{\mathrm{Image}}\left(f\right)\right|=n. Additionally by
definition of the image of f we have that
\mathop{\mathrm{Image}}\left(f\right)\subseteq T. As both sets are
finite, lemma 5 gives
n=\left|S\right|=\left|\mathop{\mathrm{Image}}\left(f\right)\right|\leq\left|T\right|=m,
that is \left|S\right|\leq \left|T\right|. As required. $\qed$
:::
::: {#prop:CardinalityOfFiniteSurjectiveMap .proposition} Proposition 51. Cardinality of finite sets in a surjective mapping
Let S,T be two finite sets, and suppose that f:S\rightarrow T is a
surjection. We have that
$$\begin{equation*} \left|T\right|\leq \left|S\right| \end{equation*}$$
Proof:
Suppose that f:S\rightarrow T is a surjective mapping between finite
sets with \left|S\right|=n and \left|T\right|=m. For each t\in T
choose one element x_t\in f^{-1}\left(\left\{t\right\}\right). Such an
x_t exists because f is surjective, so by definition for any t\in T there is
some s\in S such that f\left(s\right)=t.
Define X=\left\{x_t\in S:t\in T\right\}, that is X is the set of
the elements chosen from the pre-images above. Clearly X\subseteq S
and so by lemma
5{reference-type="ref"
reference="lem:SubsetOfFiniteSetHasAtMostSameCard"} we have that
\left|X\right|\leq\left|S\right|.
Now, consider the restriction mapping \mathrel f\restriction_X:X\rightarrow T. We
have for all t\in T that x_t\in X and that
\mathrel f\restriction_X\left(x_t\right)=t, and so
\mathrel f\restriction_X is surjective. Moreover, if we have some
x_t,x_v\in X with \mathrel f\restriction_X\left(x_t\right)=t and
\mathrel f\restriction_X\left(x_v\right)=v and t=v, then
x_t=x_v because exactly one element was chosen for each element of T.
Hence \mathrel f\restriction_X is injective and therefore a
bijection, so by definition
\left|T\right|=\left|X\right|\leq\left|S\right|. $\qed$
:::
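The construction in the proof, choosing one pre-image $x_t$ for each $t\in T$, can be sketched in code. This is only an illustration under stated assumptions: the sets are small finite Python sets, the helper name `choose_preimages` is hypothetical, and the choice made is simply the smallest pre-image.

```python
def choose_preimages(f, domain, codomain):
    """For each t in codomain pick one x_t in domain with f(x_t) == t.

    Hypothetical helper: it picks the smallest such x_t, which assumes
    the elements of `domain` can be sorted.
    """
    chosen = {}
    for x in sorted(domain):
        # setdefault keeps the first (smallest) pre-image found for f(x)
        chosen.setdefault(f(x), x)
    if set(chosen) != set(codomain):
        raise ValueError("f is not a surjection onto the codomain")
    return chosen


def f(s):
    """An example surjection from S onto T."""
    return s % 3


S = {0, 1, 2, 3, 4, 5}
T = {0, 1, 2}

x = choose_preimages(f, S, T)  # x[t] plays the role of x_t
X = set(x.values())            # X is a subset of S

print(x)                            # {0: 0, 1: 1, 2: 2}
print(len(T) == len(X) <= len(S))   # True
```

Restricting `f` to `X` gives a bijection onto `T`, which is why the sizes of `X` and `T` agree.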
If we have two sets of finite cardinality, what can we say about the
Cartesian product? This should also have finite cardinality. Suppose we have
a set S of cardinality n and a set T of cardinality m. The
Cartesian product S\times T has elements of the form
\left(s,t\right) for s\in S and t\in T. For a fixed element s_0\in S
the pair \left(s_0,t\right) is in S\times T for every t\in T, and there
are precisely m elements of this form. We can do this for each
element s\in S, of which there are n. Hence we
expect the total number of elements in S\times T to be nm.
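The counting argument above can be checked on a small case. The following is a minimal sketch, assuming finite sets represented as Python sets. The helper name `cartesian_product` and the sample sets are chosen here purely for illustration.

```python
from itertools import product


def cartesian_product(s, t):
    """Return S x T as a set of ordered pairs (s, t)."""
    return set(product(s, t))


S = {"a", "b", "c"}  # |S| = n = 3
T = {0, 1}           # |T| = m = 2

pairs = cartesian_product(S, T)

# Group the pairs by their first coordinate: one block {s} x T per s in S.
blocks = {s: {p for p in pairs if p[0] == s} for s in S}

print(len(pairs))                                      # 6
print(all(len(b) == len(T) for b in blocks.values()))  # True
```

Each of the $n$ blocks has $m$ elements, which is the informal reason to expect $nm$ elements in total.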
::: {#prop:CardinalityOfCartesianProduct .proposition} Proposition 52. Cardinality of the Cartesian product of finite sets
Let S and T be two finite sets with cardinalities \left|S\right|=n and
\left|T\right|=m. Then
$$\begin{equation*} \left|S\times T\right| = \left|S\right|\left|T \right|=nm \end{equation*}$$
Proof:
If either one of \left|S\right|=0 or \left|T\right|=0 then
S=\emptyset or T=\emptyset and so S\times T=\emptyset, which gives
\left|S\times T\right|=0=nm. So suppose that S and T are both
non-empty with \left|S\right|=n and \left|T\right|=m. Let
s\in S and define the following mapping
$$\begin{align*} f:T&\rightarrow\left\{s\right\}\times T\\ t&\mapsto f\left(t\right)=\left(s,t\right) \end{align*}$$
We show that f is a bijection. Indeed suppose that
f\left(a\right)=f\left(b\right) where a,b\in T then
\left(s,a\right)=\left(s,b\right) and as s is fixed we conclude that
a=b which shows injectivity. Now let t\in\left\{s\right\}\times T
then t=\left(s,t'\right) for some t'\in T but then clearly
f\left(t'\right)=t and so
\forall t\in \left\{s\right\}\times T,\exists t'\in T such that
f\left(t'\right)=t. Hence f is surjective and therefore we have that
f is a bijection. By proposition
50{reference-type="ref"
reference="prop:CardinalityOfFiniteInjectiveMap"} as f is an injective
mapping between finite sets then
\left|T\right|\leq \left|\left\{s\right\}\times T\right|. Likewise by
proposition
51{reference-type="ref"
reference="prop:CardinalityOfFiniteSurjectiveMap"} we conclude that
\left|\left\{s\right\}\times T\right|\leq \left|T\right| hence
\left|T\right|=\left|\left\{s\right\}\times T\right|=m.
Now define the set K by
$$\begin{equation*} K=\left\{\left\{s\right\}\times T: s\in S\right\} \end{equation*}$$
and define the following mapping
$$\begin{align*} g:S&\rightarrow K\\ x&\mapsto g\left(x\right)=\left\{x\right\}\times T \end{align*}$$
We show that g is a bijection. If
g\left(a\right)=g\left(b\right) then
\left\{a\right\}\times T=\left\{b\right\}\times T and, as T is
non-empty, comparing first coordinates gives a=b, so injectivity holds. Now let k\in K then
k=\left\{k'\right\}\times T where k'\in S then clearly
g\left(k'\right)=k so surjectivity holds. Hence g is a bijection and
so, by the same argument as for the mapping f, we conclude that
$\left|S\right| = \left|K\right|=n$.
We now need to show that the set $K$ partitions $S\times T$. This is to say
we need to show that

- $\forall x,y\in K$ we have that $x\cap y=\emptyset$ whenever $x\neq y$,
- $S\times T=\bigcup_{x\in K} x$,
- $\forall x\in K$ we have that $x\neq \emptyset$.

We take each condition in turn.

- $\forall x,y\in K$ we have that $x\cap y=\emptyset$ whenever $x\neq y$: Let $x,y\in K$ with $x\neq y$, say $x=\left\{s_1\right\}\times T$ and $y=\left\{s_2\right\}\times T$. As $g\left(s_1\right)=x\neq y=g\left(s_2\right)$ we must have $s_1\neq s_2$. Every element of $x$ has first coordinate $s_1$ and every element of $y$ has first coordinate $s_2$, so it follows that $x\cap y = \emptyset$.
- $S\times T=\bigcup_{x\in K} x$: By definition any $x\in K$ has the form $\left\{s\right\}\times T$ where $s\in S$. Let $y\in \left\{s\right\}\times T$ then $y=\left(s,t\right)$ for some $t\in T$ and so $y\in S\times T$, therefore
  $$\begin{equation*} \bigcup_{x\in K} x\subseteq S\times T \end{equation*}$$
  Likewise suppose that $y\in S\times T$ then $y = \left(s,t\right)$ for some $s\in S$ and $t\in T$. This implies that $y\in \left\{s\right\}\times T$ and as $\left\{s\right\}\times T\in K$ then $y\in\bigcup_{x\in K} x$ so that
  $$\begin{equation*} S\times T\subseteq\bigcup_{x\in K} x \end{equation*}$$
  It follows that
  $$\begin{equation*} S\times T=\bigcup_{x\in K} x \end{equation*}$$
- $\forall x\in K$ we have that $x\neq \emptyset$: Let $x\in K$ then $x=\left\{s\right\}\times T$ for some $s\in S$, and $x\neq\emptyset$ as $T\neq\emptyset$. Hence $\forall x\in K$ we have $x\neq \emptyset$.

It follows that K partitions S\times T. Now K is a set
containing n elements and each element
of K is a set containing m elements. As K partitions S\times T, the cardinality of
S\times T is the sum of the cardinalities of the sets x\in K, which
is nm. That is to say
$$\begin{equation*} \left|S\times T\right|=nm \end{equation*}$$ and the result is shown. $\qed$ :::
Countability
::: definition Definition 78. Countable Set
Let S be a set. We say that S is a countable set if
and only if there exists a bijection f:S\rightarrow T for some
T\subseteq\mathbb{N}, allowing for the possibility that T=\mathbb{N}.
If T is a finite subset of \mathbb{N} we say that S is a finitely
countable set and thus countable. If T=\mathbb{N} we say that S is a
countably infinite set. If S is not a finitely countable set or a
countably infinite set we say that S is an uncountably infinite set.
:::
Informally, a set S is finitely countable or countably infinite if we
have some process by which we can enumerate each element of S, that
is to say list out each element in some way. We can make the notion
of an enumeration rigorous.
::: definition Definition 79. Enumeration
Let S be a finitely countable set with cardinality \left|S\right|=n
for some n\in\mathbb{N} and define
\mathbb{N}_n=\left\{1,2,3,\dots,n\right\}. We define an enumeration of S to be a bijective
mapping f:\mathbb{N}_n\rightarrow S or a bijective mapping
g:S\rightarrow\mathbb{N}_n.
If S is a countably infinite set we define an enumeration of S to be
a bijective mapping f:\mathbb{N}\rightarrow S or a bijective mapping
g:S\rightarrow\mathbb{N}.
:::
It is clear that in either case if f is an enumeration of a
countable set S then f^{-1} is also an enumeration of S.
::: proposition Proposition 53. Inverse of an enumeration mapping is an enumeration mapping
- Let $S$ be a finitely countable set with cardinality $\left|S\right|=n$ and enumeration $f:\mathbb{N}_n\rightarrow S$. Then $f^{-1}:S\rightarrow\mathbb{N}_n$ is an enumeration of $S$, where $f$ and $f^{-1}$ define the same enumeration of the elements of $S$.
- Let $S$ be a countably infinite set with enumeration $f:\mathbb{N}\rightarrow S$. Then $f^{-1}:S\rightarrow\mathbb{N}$ is an enumeration of $S$, where $f$ and $f^{-1}$ define the same enumeration of the elements of $S$.

Proof:

- Finitely countable case: As $f$ is a bijection it has an inverse $f^{-1}:S\rightarrow\mathbb{N}_n$ which is also a bijection. Hence $f^{-1}$ is an enumeration. To show that $f$ and $f^{-1}$ define the same enumeration of the elements of $S$ we note that $f^{-1}\circ f=\mathop{\mathrm{id}}_{\mathbb{N}_n}$ and $f\circ f^{-1} = \mathop{\mathrm{id}}_S$, so $f$ sends $i$ to $s$ exactly when $f^{-1}$ sends $s$ to $i$.
- Countably infinite case: As $f$ is a bijection it has an inverse $f^{-1}:S\rightarrow\mathbb{N}$ which is also a bijection. Hence $f^{-1}$ is an enumeration. To show that $f$ and $f^{-1}$ define the same enumeration of the elements of $S$ we note that $f^{-1}\circ f=\mathop{\mathrm{id}}_{\mathbb{N}}$ and $f\circ f^{-1} = \mathop{\mathrm{id}}_S$, so $f$ sends $i$ to $s$ exactly when $f^{-1}$ sends $s$ to $i$.

The result is shown. $\qed$ :::
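To make the statement concrete, the following sketch builds an enumeration of a small finite set and its inverse. It assumes the mappings are stored as Python dictionaries. The helper names `enumerate_set` and `inverse` are hypothetical and only serve the illustration.

```python
def enumerate_set(elements):
    """Return an enumeration f: {1,...,n} -> S as a dict, in listing order."""
    return {i: x for i, x in enumerate(elements, start=1)}


def inverse(f):
    """Return the inverse of a bijection given as a dict."""
    inv = {value: key for key, value in f.items()}
    if len(inv) != len(f):
        raise ValueError("mapping is not injective, so it has no inverse")
    return inv


S = ["x", "y", "z"]      # a listing of the elements of S without repeats
f = enumerate_set(S)     # f: N_3 -> S
f_inv = inverse(f)       # f^{-1}: S -> N_3

print(f)      # {1: 'x', 2: 'y', 3: 'z'}
print(f_inv)  # {'x': 1, 'y': 2, 'z': 3}
```

Composing the two dictionaries in either order returns the starting element, which is the sense in which they describe the same listing of $S$.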
::: proposition Proposition 54. The natural numbers are countably infinite
We have that \mathbb{N} is a countably infinite set.
Proof:
To show that \mathbb{N} is countably infinite we need to find a bijective
mapping f:\mathbb{N}\rightarrow\mathbb{N}. We can clearly take
\mathop{\mathrm{id}}_\mathbb{N}, that is the identity mapping on
\mathbb{N}. That is to say
$$\begin{align*} \mathop{\mathrm{id}}_\mathbb{N}:\mathbb{N}&\rightarrow\mathbb{N}\\ x&\mapsto\mathop{\mathrm{id}}_\mathbb{N}\left(x\right)=x \end{align*}$$ As required. $\qed$ :::
We also have the following immediate result.
::: proposition
Proposition 55. Any subset of \mathbb{N} is countable
Let S\subseteq\mathbb{N} then S is countable.
Proof:
Let S\subseteq\mathbb{N} and suppose that S is not finite, for if
it is finite then by definition it is countable. As \mathbb{N} is well-ordered we
have by theorem 18{reference-type="ref" reference="thm:WOP"}
that S is well-ordered and so has a set inclusion minimal element, say
s_0. As S is infinite, S\setminus\left\{s_0\right\} is non-empty. We will
use this as the basis for induction.
Suppose we have chosen s_0,s_1,s_2,\dots,s_n with
s_n\in S\setminus\left\{s_0,s_1,s_2,\dots,s_{n-1}\right\}. As S is
infinite, S\setminus\left\{s_0,s_1,s_2,\dots,s_n\right\} is non-empty, so another
application of the well-ordering principle means there is some set
inclusion minimal element s_{n+1} with
s_{n+1}\in S\setminus\left\{s_0,s_1,s_2,\dots,s_n\right\}. This holds
for all n\in\mathbb{N} and so we conclude that
S=\left\{s_0,s_1,s_2,\dots\right\} is countable by defining the
bijective mapping
$$\begin{align*} f:\mathbb{N}&\rightarrow S\\ x&\mapsto f\left(x\right)=s_x \end{align*}$$
The result follows. $\qed$ :::
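The inductive construction of $s_0,s_1,s_2,\dots$ can be sketched as a generator that repeatedly produces the least element not yet listed. This is an illustration only. It assumes that $S$ is infinite and that membership in $S$ can be decided by a function, and the names `enumerate_subset` and `is_even` are hypothetical.

```python
from itertools import count, islice


def enumerate_subset(is_member):
    """Yield s_0, s_1, s_2, ... for S = {k in N : is_member(k)}.

    The natural numbers are scanned in increasing order, so each value
    yielded is the least element of S that has not been listed before.
    Assumes S is infinite. For a finite S the loop never terminates.
    """
    for k in count(0):
        if is_member(k):
            yield k


def is_even(k):
    return k % 2 == 0


first_five = list(islice(enumerate_subset(is_even), 5))
print(first_five)  # [0, 2, 4, 6, 8]
```

The position of a value in the generated sequence is its index $x$ in the mapping $f\left(x\right)=s_x$.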
::: proposition Proposition 56. The empty-set is countable
We have that \emptyset is a countable set.
Proof:
The empty-set has cardinality 0 which is finite. $\qed$
:::
There are some results that can be deduced which give equivalent conditions for a set to be countable. Two of these results follow by definition of a countable set.
::: {#prop:EquivalelntDefinitionsOfCountable .proposition} Proposition 57. Equivalent definitions of a countable set
Let S be a set. The following hold.
- $S$ is countable if and only if there is an injection $f:S\rightarrow T$ for some subset $T\subseteq\mathbb{N}$.
- $S$ is countable if and only if $S=\emptyset$ or there is a surjection $f:T\rightarrow S$ for some subset $T\subseteq\mathbb{N}$.

Proof:

- $S$ is countable if and only if there is an injection $f:S\rightarrow T$ for some subset $T\subseteq\mathbb{N}$:

    $\left(\Rightarrow\right)$: Suppose that $S$ is countable then by definition there is a bijection $f:S\rightarrow T$ for some $T\subseteq\mathbb{N}$. As $f$ is a bijection then $f$ is an injection and we are done.

    $\left(\Leftarrow\right)$: Suppose that there is an injection $f:S\rightarrow T$ for some $T\subseteq\mathbb{N}$. Consider the mapping $g:S\rightarrow\mathop{\mathrm{Image}}\left(f\right)$ given by $g\left(x\right)=f\left(x\right)$. By proposition 15{reference-type="ref" reference="prob:RestOfCodomainToImageIsSurjective"} we have that $g$ is a surjection, that is $\forall y\in\mathop{\mathrm{Image}}\left(f\right)$ there is some $x\in S$ such that $g\left(x\right)=y$. As $f$ is an injection so is $g$, and it follows that $g$ is a bijection. Therefore $\left|S\right|=\left|\mathop{\mathrm{Image}}\left(f\right)\right|$ and as $\mathop{\mathrm{Image}}\left(f\right)\subseteq T\subseteq\mathbb{N}$ we have that $S$ is countable.

- $S$ is countable if and only if $S=\emptyset$ or there is a surjection $f:T\rightarrow S$ for some subset $T\subseteq\mathbb{N}$:

    $\left(\Rightarrow\right)$: Suppose that $S$ is countable then there is a bijection $f:T\rightarrow S$ for some $T\subseteq\mathbb{N}$, namely the inverse of the bijection given by the definition, and a bijection is in particular a surjection.

    $\left(\Leftarrow\right)$: If $S=\emptyset$ then $S$ has cardinality $0$ and is therefore countable. So suppose that $S\neq\emptyset$ and that $f:T\rightarrow S$ is a surjection. By proposition 14{reference-type="ref" reference="prop:PropertyImagePreImage"} we have for any mapping $g:X\rightarrow Y$ that the pre-image $g^{-1}\left(Y\right)=X$, therefore $f^{-1}\left(S\right)=T$. By assumption $T\subseteq \mathbb{N}$ and is therefore either finite or some countably infinite subset of $\mathbb{N}$, possibly being $\mathbb{N}$ itself. If $T$ is finite then we have that $\left|S\right|\leq\left|T\right|$ as $f$ is surjective, and therefore $\left|S\right|$ is finite and $S$ is countable. So suppose that $\left|T\right|=\aleph_0$, so that $T$ is either a countably infinite subset of $\mathbb{N}$ or $\mathbb{N}$ itself.

    Let $g:T\rightarrow\mathbb{N}$ be a bijection then $g^{-1}:\mathbb{N}\rightarrow T$ is a bijection by proposition 35{reference-type="ref" reference="prop:InverseBijectionIsBijection"} and we have that $f\circ g^{-1}:\mathbb{N}\rightarrow S$ is a surjection by proposition [20](#prop: PropInjecSurjecBijecMapping){reference-type="ref" reference="prop: PropInjecSurjecBijecMapping"}. It is left to show that $f\circ g^{-1}$ being surjective implies $S$ is countable. Proposition 28{reference-type="ref" reference="prop:RightInverseIffSurjective"} gives that $f\circ g^{-1}$ being surjective means there exists a right inverse $h:S\rightarrow \mathbb{N}$. By proposition 30{reference-type="ref" reference="RightInverseOfSurjecctionisInection"} we have that $h$ is injective. It follows by part 1 that $S$ is countable.

The result is shown. $\qed$ :::
::: proposition Proposition 58. Set is countable if cardinality of set equals cardinality of a countable set
Let S,T be sets such that \left|S\right|=\left|T\right| then if S
is countable so is T.
Proof:
Suppose that S is countable. As
\left|S\right|=\left|T\right| there exists a bijection
f:S\rightarrow T, and so its inverse is a bijection
g:T\rightarrow S. Now as S is countable there exists an injection
h:S\rightarrow\mathbb{N}. Now as g is a bijection we have that g
is an injection. The mapping h\circ g:T\rightarrow \mathbb{N} is an
injection as h and g are. Hence as h\circ g is an injection it
follows that T is countable by proposition
57{reference-type="ref"
reference="prop:EquivalelntDefinitionsOfCountable"}. $\qed$
:::
Relations
Definition of a relation
So far we have seen a few notions that relate elements of a set to
one another. An example that relates elements of a set is equality of
natural numbers: two natural numbers are equal if and only if they are
the same element. Another example that we have seen on the natural
numbers is the less than operator <. A natural number x is less than
y if and only if x\subset y. A more fundamental example of a
relation is that of a mapping f:S\rightarrow T. We can consider a
mapping as relating s\in S to t\in T, recorded as the pair
\left(s,t\right), whenever f\left(s\right)=t.
In a sense, the idea of a relation is as fundamental
as that of sets and mappings. In fact we just described a mapping as a form
of relation, so the idea of a relation is more fundamental than that of a
mapping. Using the examples of the comparison operators on \mathbb{N}
we can motivate a definition for a relation.
::: definition Definition 80. Relation
Let S be a set and consider the Cartesian product S\times S. A
relation is a subset R\subseteq S\times S. We write an element
\left(a,b\right)\in R as aRb or we also write a\sim b and we say
that a relates to b. If \left(a,b\right)\not\in R we write
a\slashed{R} b or we write $a\not\sim b$
:::
We can recast the ideas at the start of this section into the language of relations.
::: example
Example 72. Consider equality on \mathbb{N}. We can define
equality as a relation R\subseteq\mathbb{N}\times \mathbb{N} where a\sim b if
and only if a\subseteq b and b\subseteq a. Explicitly we have that
R is the subset of \mathbb{N}\times\mathbb{N} given by
$$\begin{equation*} R=\left\{\left(0,0\right),\left(1,1\right),\left(2,2\right),\dots\right\} \end{equation*}$$ :::
::: example
Example 73. Consider the less than operator on \mathbb{N}. We
have that the less than operator is a relation where a\sim b is given
by a\subset b. To see this consider T=\left\{0,1,2\right\}. Then the
less than relation on T is given by the relation
$$\begin{equation*} R=\left\{\left(0,1\right),\left(0,2\right),\left(1,2\right)\right\} \end{equation*}$$ :::
::: example
Example 74. Let S=\left\{0,1\right\}\subseteq\mathbb{N} and
let T=P\left(S\right) be the power set of S given by
$$\begin{equation*} T=\left\{\emptyset,\left\{0\right\}, \left\{1\right\}, \left\{0,1\right\}\right\} \end{equation*}$$
where \left\{0,1\right\}=S. We can define a relation R\subseteq T\times T by
$$\begin{align*}
R = \{
&\left(\emptyset,\emptyset\right),\left(\emptyset,\left\{0\right\}\right),\left(\emptyset,\left\{1\right\}\right),\left(\emptyset,\left\{0,1\right\}\right),\left(\left\{0\right\},\left\{0\right\}\right),\\
&\left(\left\{0\right\},\left\{0,1\right\}\right),\left(\left\{1\right\},\left\{1\right\}\right),\left(\left\{1\right\},\left\{0,1\right\}\right),\left(\left\{0,1\right\},\left\{0,1\right\}\right)\}
\end{align*}$$ This relation expresses inclusive subset inclusion,
\subseteq, on T.
:::
::: example
Example 75. Let S=\left\{0,1,2\right\} and T=S. Then
T\times T is given by
$$\begin{equation*}
T\times T = \left\{\left(0,0\right),\left(0,1\right),\left(0,2\right),\left(1,0\right),\left(1,1\right),\left(1,2\right),\left(2,0\right),\left(2,1\right),\left(2,2\right)\right\}
\end{equation*}$$ We can use the less than or equal to operator, \leq,
to define a relation. We have that
$$\begin{equation*} R=\left\{\left(0,0\right),\left(0,1\right),\left(0,2\right),\left(1,1\right),\left(1,2\right),\left(2,2\right)\right\} \end{equation*}$$ :::
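Since a relation on a finite set is just a set of ordered pairs, the examples above can be reproduced directly. The sketch below assumes the set is small enough to list every pair. The helper name `relation_from_predicate` is hypothetical.

```python
def relation_from_predicate(s, related):
    """Return the relation {(a, b) in S x S : related(a, b)} as a set of pairs."""
    return {(a, b) for a in s for b in s if related(a, b)}


T = {0, 1, 2}

less_than = relation_from_predicate(T, lambda a, b: a < b)
less_equal = relation_from_predicate(T, lambda a, b: a <= b)

print(sorted(less_than))   # [(0, 1), (0, 2), (1, 2)]
print(sorted(less_equal))  # [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)]
```

Writing `(a, b) in less_equal` then plays the role of $a\sim b$.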
Reflexive Relation
All of the examples from the previous section, except the strictly less
than example, share a common property. Each element is related to
itself, that is in each example every element s\in S satisfies
\left(s,s\right)\in R\subseteq S\times S. We formalise this in the
following definition.
::: definition Definition 81. Reflexive relation
Let S be a set with a relation R\subseteq S\times S. We say that
the relation R is reflexive if and only if \forall s\in S we have
that \left(s,s\right)\in R. If there is an s\in S such that
\left(s,s\right)\not\in R then we say that the relation is
anti-reflexive.
:::
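For a finite set the definition can be checked by testing every element. This is a minimal sketch, assuming relations are stored as Python sets of pairs. The function name `is_reflexive` is chosen here for illustration.

```python
def is_reflexive(s, r):
    """True when (x, x) is in r for every x in s."""
    return all((x, x) in r for x in s)


S = {0, 1, 2}
less_equal = {(a, b) for a in S for b in S if a <= b}
less_than = {(a, b) for a in S for b in S if a < b}

print(is_reflexive(S, less_equal))  # True
print(is_reflexive(S, less_than))   # False
```

The second relation fails the check because no pair of the form $\left(s,s\right)$ belongs to it.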
We have given examples of reflexive relations and one example of an anti-reflexive relation. We give an additional example of an anti-reflexive relation.
::: example
Example 76. We have for a,b\in\mathbb{N} that a=b if and only
if a\subseteq b and b\subseteq a. If this doesn't hold then
a\neq b and either one of a\subseteq b or b\subseteq a is true but
not both. It follows that the relation a\sim b meaning a\neq b is
anti-reflexive. This also implies that if a\neq b then either
a\leq b or b\leq a.
:::
The examples given so far have allowed us to see some examples of
relations and one particular type of relation, a reflexive relation.
Unfortunately only considering relations on elements of a single set S
currently gives us few practical examples to work with. A simple
extension to the idea of a relation can fix this.
::: definition Definition 82. Binary Relation
Let S and T be sets. We define a binary relation to be a subset
R\subseteq S\times T. We write an element \left(s,t\right)\in R as
sRt or write s\sim t and we say that s relates to t. If
\left(s,t\right)\not\in R we write s\slashed{R} t or we write
s\not\sim t.
:::
We can extend the notions of a relation and a binary relation to any finite Cartesian product.
::: definition Definition 83. $n$-ary Relation
Let S_1,S_2,S_3,\dots,S_n be sets. We define an $n$-ary relation to
be a subset
R\subseteq S_1\times S_2\times S_3\times\dots\times S_n=\mathbb{S}. An
element of R has the form r=\left(r_1,r_2,r_3,\dots,r_n\right) and
we say that the elements of r relate. We write this as
$R\left(r\right)=R\left(r_1,r_2,r_3,\dots,r_n\right)$.
:::
In light of these previous definitions we would like to extend the
definition of a reflexive relation to binary and $n$-ary relations. To
see how we could extend a reflexive relation to a binary relation
suppose we have two sets S and T. The definition of a reflexive
relation on a set Z is that
\left(z,z\right)\in R_z\subseteq Z\times Z for all z\in Z, where R_z is
the relation defined on Z. A natural way to extend this to S and
T is to have either \left(s,s\right)\in R\subseteq S\times T or
\left(t,t\right)\in R\subseteq S\times T where R is a binary
relation for S and T. Hence for a reflexive binary relation to make
sense we must have that s,t\in S\cap T and therefore the relation
would have to be defined on S\cap T.
In the first case \left(s,s\right)\in R\subseteq S\times T we have by
definition of an ordered tuple that \left(s,s\right)\in R only
if s\in S and s\in T. Likewise for
\left(t,t\right)\in R\subseteq S\times T we must have t\in S and
t\in T, which is to say s,t\in S\cap T. If S\neq T then there will
exist at least one element s\in S with s\not\in T or one element
t\in T with t\not\in S, and in
this case it is not possible for a reflexive relation to exist.
::: definition Definition 84. Reflexive binary relation
Let S and T be sets with relation R\subseteq S\times T. We say that the
relation R is reflexive if and only if S=T and \left(s,s\right)\in R
for all s\in S.
:::
A similar argument shows there can be no reflexive $n$-ary relation
unless all of the sets that make the relation are the same. For example
consider the sets X,Y and Z. The natural way to represent a reflexive relation
R\subseteq X\times Y\times Z would be to have either
\left(x,x,x\right)\in R, \left(y,y,y\right)\in R or
\left(z,z,z\right)\in R where x\in X, y\in Y and z\in Z. If
\left(x,x,x\right)\in R then by definition we must have x\in Y and
x\in Z, likewise if \left(y,y,y\right)\in R then y\in X and
y\in Z and finally if \left(z,z,z\right)\in R then z\in X and
z\in Y. Any of the cases implies that x,y,z\in X\cap Y\cap Z.
::: definition Definition 85. Reflexive $n$-ary relation
Let S_1,S_2,S_3,\dots,S_n be sets with relation
R\subseteq S_1\times S_2\times S_3\times\dots\times S_n. We say that
the relation R is reflexive if and only if S_i=S_j for all
$i,j\in\left\{1,2,3,\dots,n\right\}$ and \left(s,s,\dots,s\right)\in R for all s\in S_1.
:::
This means when talking about a reflexive relation we only need to consider a single set.
An example of a binary relation is a mapping.
::: example
Example 77. Let S=T=\mathbb{N} and define the mapping
f:S\rightarrow T given by f\left(s\right)=s. We have that f
defines a relation as we have that
$$\begin{equation*} R=\left\{\left(0,0\right),\left(1,1\right),\left(2,2\right),\left(3,3\right),\dots\right\}\subseteq\mathbb{N}\times\mathbb{N} \end{equation*}$$ :::
::: example
Example 78. Let S=\left\{1,2\right\} and T=\left\{3,4\right\}.
Define the mapping f:S\rightarrow T by f\left(1\right)=4 and
f\left(2\right)=3. We have f defines a relation as
$$\begin{equation*} R=\left\{\left(1,4\right),\left(2,3\right)\right\}\subseteq S\times T \end{equation*}$$ :::
We can consider operators as relations by using the $n$-ary notion of a relation.
::: example
Example 79. Let X=Y=Z=\mathbb{N}. We can consider the operator
$+$ as a mapping given by
$$\begin{align*} f:X\times Y &\rightarrow Z\\ \left(x,y\right)&\mapsto f\left(x,y\right) = x+y \end{align*}$$
A relation can be defined by f. A sample of this relation R looks
as follows
$$\begin{equation*} R=\left\{\left(0,0,0\right), \left(0,1,1\right),\left(4,3,7\right),\left(3,4,7\right),\left(2,2,4\right),\dots\right\}\subseteq\mathbb{N}\times\mathbb{N}\times\mathbb{N} \end{equation*}$$
In general, R has the following definition
$$\begin{equation*} R=\left\{\left(x,y,x+y\right):x,y\in\mathbb{N}\right\} \end{equation*}$$
We note that as X=Y then for any x\in X we have x\in Y and
likewise for any y\in Y we have that y\in X. We therefore have that
both R\left(x,y,x+y\right) and R\left(y,x,y+x\right) hold, and as
x+y=y+x the relation contains \left(y,x,z\right) whenever it contains
\left(x,y,z\right). This reflects the fact that addition is commutative.
:::
::: example
Example 80. Let X=Y=Z=\mathbb{N}. We can consider the operator
$*$ as a mapping given by
$$\begin{align*} f:X\times Y &\rightarrow Z\\ \left(x,y\right)&\mapsto f\left(x,y\right) = xy \end{align*}$$
The relation defined by f looks as follows
$$\begin{equation*} R=\left\{\left(0,0,0\right), \left(0,1,0\right),\left(4,3,12\right),\left(3,4,12\right),\left(2,2,4\right),\dots\right\}\subseteq\mathbb{N}\times\mathbb{N}\times\mathbb{N} \end{equation*}$$
In general, R has the following definition
$$\begin{equation*} R=\left\{\left(x,y,xy\right):x,y\in\mathbb{N}\right\} \end{equation*}$$
As before, we have that as X=Y then for any x\in X we have x\in Y
and likewise, for any y\in Y we have that y\in X. We, therefore,
have that both R\left(x,y,xy\right) and R\left(y,x,yx\right) hold, and as
xy=yx the relation contains \left(y,x,z\right) whenever it contains
\left(x,y,z\right), again reflecting the fact that multiplication is commutative.
:::
::: example
Example 81. Let X=Y=Z=\mathbb{N}. We can consider the operator
$\wedge$ as a mapping given by
$$\begin{align*} f:X\times Y &\rightarrow Z\\ \left(x,y\right)&\mapsto f\left(x,y\right) = \wedge\left(x,y\right)=x^y \end{align*}$$
The relation defined by f looks as follows
$$\begin{equation*} R=\left\{\left(0,0,1\right), \left(0,1,0\right),\left(2,3,8\right),\left(8,2,64\right),\left(3,2,9\right),\dots\right\}\subseteq\mathbb{N}\times\mathbb{N}\times\mathbb{N} \end{equation*}$$
In general, R has the following definition
$$\begin{equation*} R=\left\{\left(x,y,x^y\right):x,y\in\mathbb{N}\right\} \end{equation*}$$
As before, we have that as X=Y then for any x\in X we have x\in Y
and likewise, for any y\in Y we have that y\in X. Here, however,
R\left(x,y,x^y\right) can hold while R\left(y,x,x^y\right) does not, for
example \left(2,3,8\right)\in R but \left(3,2,8\right)\not\in R, which
confirms that in general exponentiation is not commutative.
:::
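The three operator examples can be compared on a finite sample of $\mathbb{N}$. The sketch below is only an illustration. It assumes the convention $0^0=1$ used in the example above, which matches Python's integer power, and the helper names `ternary_relation` and `swaps_preserved` are hypothetical.

```python
def ternary_relation(op, n):
    """Finite sample {(x, y, op(x, y))} of the relation for x, y in {0,...,n-1}."""
    return {(x, y, op(x, y)) for x in range(n) for y in range(n)}


def swaps_preserved(r):
    """True when (y, x, z) is in r whenever (x, y, z) is in r."""
    return all((y, x, z) in r for (x, y, z) in r)


add = ternary_relation(lambda x, y: x + y, 5)
mul = ternary_relation(lambda x, y: x * y, 5)
power = ternary_relation(lambda x, y: x ** y, 5)

print(swaps_preserved(add))    # True
print(swaps_preserved(mul))    # True
print(swaps_preserved(power))  # False
```

The check fails for exponentiation because, for instance, `(2, 3, 8)` is in the sample while `(3, 2, 8)` is not.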
The examples of addition and multiplication expose another property that relations can have.
If two elements relate then it doesn't matter which way round the
relation is written, that is if x\sim y then we also have
y\sim x. Such a relation is called symmetric.
::: definition Definition 86. Symmetric relation
Let S be a set with relation R\subseteq S\times S. We say that R
is a symmetric relation if and only if \forall x,y\in S we have that
xRy implies yRx, equivalently we can write R is symmetric if and
only if x\sim y implies y\sim x. If R is not symmetric we say that
R is an anti-symmetric relation.
:::
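For a relation on a finite set, symmetry can be tested pair by pair. This is a minimal sketch, assuming relations are stored as Python sets of pairs. The function name `is_symmetric` is chosen here for illustration.

```python
def is_symmetric(r):
    """True when (y, x) is in r for every (x, y) in r."""
    return all((y, x) in r for (x, y) in r)


S = {0, 1, 2}
equal = {(a, a) for a in S}
not_equal = {(a, b) for a in S for b in S if a != b}
less_equal = {(a, b) for a in S for b in S if a <= b}

print(is_symmetric(equal))       # True
print(is_symmetric(not_equal))   # True
print(is_symmetric(less_equal))  # False
```

The last relation is not symmetric since it contains `(0, 1)` but not `(1, 0)`.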
As with reflexive relations, we can show that if we try to extend the idea of a symmetric relation on a single set to multiple sets we have to conclude that the sets are the same.
Indeed suppose that S and T are sets with a relation
R\subseteq S\times T. The natural extension for a symmetric relation
would be that sRt\Rightarrow tRs for all s\in S and t\in T. This
implies that t\in S and s\in T and therefore s,t\in S\cap T.
::: definition Definition 87. Symmetric binary relation
Let S and T be sets with relation R\subseteq S\times T. We say
that R is symmetric if and only if S=T and sRt implies tRs for all s,t\in S.
:::
Likewise, a similar argument holds for $n$-ary symmetric relations.
::: definition Definition 88. Symmetric $n$-ary relation
Let S_1,S_2,S_3,\dots,S_n be sets with relation
R\subseteq S_1\times S_2\times S_3\times\dots\times S_n. We say that
the relation R is symmetric if and only if S_i=S_j for all
$i,j\in\left\{1,2,3,\dots,n\right\}$.
:::
The comparison operators on the naturals (less than, less than or equal to, greater than, and greater than or equal to) also give insight into another interesting property. The following examples make this clearer.
::: example
Example 82. Let S=T=\mathbb{N} and define x\sim y by x\leq y.
Consider a,b,c\in\mathbb{N} with a=2, b=4 and c=6. We have that
a\sim b as 2\leq 4 and we have that b\sim c as 4\leq 6, we
clearly also have a\sim c as 2\leq 6. In general if we have
a,b,c\in\mathbb{N} with a\leq b\leq c we have that a\sim b and
b\sim c implies a\sim c.
:::
::: example
Example 83. Let S=T=\mathbb{N} and define x\sim y by x\geq y.
Consider a,b,c\in\mathbb{N} with a=8, b=3 and c=1. We have that
a\sim b as 8\geq 3 and we have that b\sim c as 3\geq 1, we also
have a\sim c as 8\geq 1. More generally if we have
a,b,c\in\mathbb{N} with a\geq b\geq c we have that a\sim b and
b\sim c implies a\sim c.
:::
::: example
Example 84. Let S=T=\mathbb{N} and define x\sim y by x= y.
Consider a,b,c\in\mathbb{N} with a=2, b=2 and c=2. We have that
a\sim b as 2=2 and we have that b\sim c as 2=2, we also have
a\sim c as 2=2. More generally if we have a,b,c\in\mathbb{N} with
a= b= c we have that a\sim b and b\sim c implies a\sim c.
:::
We see that for certain relations, if a\sim b is true and
b\sim c is true then we can conclude that a\sim c is true. Such a
relation is called a transitive relation.
::: definition Definition 89. Transitive relation
Let S be a set with relation R\subseteq S\times S. We say that R
is a transitive relation if and only if \forall a,b,c\in S we have
that if aRb and bRc then we have that aRc.
:::
We again consider if a transitive relation can be extended to multiple
sets. Suppose that we have a binary relation R\subseteq S\times T for
some sets S and T. The natural extension to make R a transitive
relation is to have s\sim t and t\sim u implies s\sim u for
s,t\in S and t,u\in T. Hence we must have s,t\in S but need not
have u\in S. As we aren't assuming anything else about the relation
R there is nothing more we can deduce about a binary transitive
relation.
::: definition Definition 90. Transitive binary relation
Let S and T be sets with relation R\subseteq S\times T. We say
that R is transitive if and only if the set \tilde{R} given by
$$\begin{equation} \tilde{R} = \left{\left(x,z\right) \in S \times T:\forall x\in S\wedge\forall z\in T: \exists y \in S \cap T: \left(x, y\right) \in R \wedge \left(y, z\right) \in R\right} \end{equation*}$$ is non-empty.* :::
A definition can be made for a transitive $n$-ary relation. I AM NOT SURE HOW TO DEFINE THIS YET, PAIR-WISE RELATION OF EACH SET????????????????? We can make use of a binary relation in order to define
::: definition Definition 91. Transitive $n$-ary relation
Let S_1,S_2,S_3,\dots,S_n be sets with relation
R\subseteq S_1\times S_2\times S_3\times\dots\times S_n. We say that
the relation R is transitive if and only if the set \tilde{R} given
by
$$\begin{equation} \tilde{R}=\left{\left(x,z\right)\in \right} \end{equation*}$$ is non-empty* :::
Equivalence Relations
Of all the examples of relations we have seen so far there is one in
particular that is special, the equality operator =. This relation is
reflexive, symmetric and transitive.
::: {#prop:EqualityOnNaturalsIsEquivRelation .proposition} Proposition 59. The equality relation on the natural numbers is reflexive, symmetric and transitive
Let S=T=\mathbb{N} and for x,y\in\mathbb{N} define the relation
x\sim y by x=y. We have that
- \sim is reflexive, that is \forall x\in\mathbb{N} we have $x\sim x$
- \sim is symmetric, that is \forall x,y\in\mathbb{N} we have $x\sim y\Rightarrow y\sim x$
- \sim is transitive, that is \forall x,y,z\in\mathbb{N} we have that if x\sim y and y\sim z then $x\sim z$
Proof:
- \sim is reflexive, that is \forall x\in\mathbb{N} we have x\sim x: Let x\in\mathbb{N}. By definition of equality we have for y,z\in\mathbb{N} that y=z if and only if y\subseteq z and z\subseteq y. It is clear that x=x and therefore x\sim x, proving reflexivity.
- \sim is symmetric, that is \forall x,y\in\mathbb{N} we have x\sim y\Rightarrow y\sim x: Let x,y\in\mathbb{N} with x\sim y. As x\sim y we have x=y. By definition of equality we also have y=x and so y\sim x, showing that \sim is symmetric.
- \sim is transitive, that is \forall x,y,z\in\mathbb{N} we have that if x\sim y and y\sim z then x\sim z: Let x,y,z\in\mathbb{N} such that x\sim y and y\sim z, then x=y and y=z. By definition of equality it follows that x=z and so x\sim z, showing transitivity.
The result follows. $\qed$ :::
What does it mean for a relation to be reflexive, symmetric and
transitive? In the case of equality on the natural numbers we see that
reflexivity tells us that an element is equal to itself. Equality being
symmetric tells us that if x=y then y=x, that is, it does not matter
which way round we say the two numbers are equal. Finally transitivity tells us
that if x=y and y=z we are able to deduce that x=z. In this
context, equality being reflexive, symmetric and transitive allows us to
quantify which elements are equivalent. In the case of equality it is
clear which elements are equivalent, the ones that are equal!
::: example
Example 85. Consider X=Y=\mathbb{N} and
define the relation R=\mathbb{N}\times\mathbb{N}. We have that R is
reflexive as for any x\in\mathbb{N} we have that
\left(x,x\right)\in R. Likewise R is symmetric as
\forall x,y\in\mathbb{N} we have that
\left(x,y\right)\in R\Rightarrow\left(y,x\right)\in R, as X=Y.
Finally R is transitive as \forall x,y,z\in\mathbb{N} we have that
if \left(x,y\right)\in R and \left(y,z\right)\in R then
\left(x,z\right)\in R.
What does R being reflexive, symmetric and transitive mean? In this
case R being reflexive, symmetric and transitive means that every
x\in X and y\in Y are related and we can see R as a relation
meaning "is an element of $\mathbb{N}$". This means that we have shown
that X and Y are equivalent, which we already know by the fact we
set X=Y=\mathbb{N}.
:::
Based on the two examples we motivate the following definition.
::: definition Definition 92. Equivalence relation
Let S be a set and R\subseteq S\times S a relation. We say that R
is an equivalence relation if and only if
- R is reflexive
- R is symmetric
- R is transitive
:::
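For a relation on a finite set, the three conditions in the definition can be checked by brute force. The following is a minimal sketch, assuming the relation is stored as a Python set of ordered pairs; the function names and the sample set `S` are illustrative choices and not part of the text.

```python
from itertools import product

def is_reflexive(S, R):
    # every element must be related to itself
    return all((x, x) in R for x in S)

def is_symmetric(R):
    # whenever (x, y) is in R, (y, x) must be in R as well
    return all((y, x) in R for (x, y) in R)

def is_transitive(R):
    # whenever (x, y) and (y, z) are in R, (x, z) must be in R as well
    return all((x, z) in R for (x, y) in R for (w, z) in R if y == w)

def is_equivalence(S, R):
    return is_reflexive(S, R) and is_symmetric(R) and is_transitive(R)

S = range(5)  # a small finite stand-in for the naturals
equality = {(x, y) for x, y in product(S, S) if x == y}
everything = set(product(S, S))
less_or_equal = {(x, y) for x, y in product(S, S) if x <= y}

print(is_equivalence(S, equality))       # True
print(is_equivalence(S, everything))     # True
print(is_equivalence(S, less_or_equal))  # False, as <= is not symmetric
```

The first two relations are finite versions of the two examples above; the third is reflexive and transitive but fails symmetry, so it is not an equivalence relation.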
Proposition
59{reference-type="ref"
reference="prop:EqualityOnNaturalsIsEquivRelation"} is equivalent to
saying that equality is an equivalence relation on \mathbb{N}. The two
examples also show a difference between the two equivalence relations
shown. In the case of equality the relation R was a strict subset
of \mathbb{N}\times\mathbb{N}, whereas in the second example R was
equal to \mathbb{N}\times \mathbb{N}. This raises the question: what is
different? We can answer this by looking at the set of elements that
relate to a given element. Such a set is called an equivalence class.
::: definition Definition 93. Equivalence class
Let S be a set, let x\in S and let R be an equivalence relation
on S. We define the equivalence class of x, denoted \left[x\right], to be
the set
$$\begin{equation*} \left[x\right]=\left\{y\in S:xRy\right\} \end{equation*}$$
If the context doesn't make clear which relation we are referring to, we
explicitly write \left[x\right]_R to be the equivalence class of x
under the relation R.
We say that an element y\in\left[x\right] is a representative of the
equivalence class of $x$
:::
To get a feel for equivalence classes we consider the following non-mathematical example.
::: example
Example 86. Consider the set X to be the set of all people
currently alive. Define a relation, \sim, on X by
$$\begin{equation*} \forall\left(x,y\right)\in X\times X: x\sim y\iff x\text{ and }y\text{ were born in the same year} \end{equation*}$$
We have that \sim is an equivalence relation. Clearly x\sim x, as
x was born in some year D. We have that if x\sim y then x and
y were born in the same year and clearly y\sim x. Now if x\sim y
and y\sim z then x and y are born in the same year and y and z
are born in the same year. This therefore means x and z are born in
the same year so x\sim z showing transitivity.
Now let x\in X and consider the equivalence class
\left[x\right]_\sim. By definition of an equivalence class we have
that
$$\begin{equation*} \left[x\right]_\sim=\left\{y\in X:x\sim y\right\} \end{equation*}$$
This means that the equivalence class \left[x\right]_\sim is the set
of all people currently alive that were born in the same year as x. As X
was the set of all currently alive people we have found a way to extract
a subset of X such that they are all born in the same year. If we now
pick another element of X, say a, such that x\not\sim a then by
definition a was not born in the same year as x and
\left[a\right]_\sim is another subset of X of currently alive people
born in the same year. Moreover we have that
\left[x\right]_\sim\neq\left[a\right]_\sim. We can do this for every
element of X and get a collection of sets that correspond to all of
the different years in which anyone currently alive was
born.
:::
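The example can be imitated on a small scale. The sketch below assumes a hypothetical handful of people with made-up birth years (the names and years are invented purely for illustration) and computes the equivalence class of a person directly from the definition.

```python
# Hypothetical data: names and birth years are invented for illustration.
birth_year = {"Ada": 1983, "Ben": 1990, "Cal": 1983, "Dee": 1990, "Eli": 2001}

def related(x, y):
    # x ~ y if and only if x and y were born in the same year
    return birth_year[x] == birth_year[y]

def equivalence_class(x, X):
    # [x] = {y in X : x ~ y}
    return {y for y in X if related(x, y)}

X = set(birth_year)
print(sorted(equivalence_class("Ada", X)))  # ['Ada', 'Cal']
print(sorted(equivalence_class("Ben", X)))  # ['Ben', 'Dee']
print(sorted(equivalence_class("Eli", X)))  # ['Eli']
```

Note that the classes of "Ada" and "Cal" coincide, while the classes of "Ada" and "Ben" share no elements.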
The previous example has shown that we are able to construct a partition
of a set S which has an equivalence relation \sim. We can prove this
more generally, firstly we recall the definition of a set partition.
Let S be a set and let \mathbb{S} be a set of subsets of
S. We say that \mathbb{S} is a partition of S if the following
hold.
- \forall S_1,S_2\in\mathbb{S} we have S_1\cap S_2=\emptyset whenever S_1\neq S_2
- Taking the union of every T\in\mathbb{S} gives us S, that is
$$\begin{equation*} S=\bigcup_{T\in\mathbb{S}} T \end{equation*}$$
- \forall T\in\mathbb{S} we have that T\neq\emptyset.
Before we can show that the equivalence classes partition the set we must first show that there can be no empty equivalence class.
::: {#prop:EquivClassNonEmpty .proposition} Proposition 60. Equivalence class is non-empty
Let S be a set with an equivalence relation \sim. Let x\in S then
we have that $\left[x\right]_\sim\neq\emptyset$
Proof:
Let S be a set with an equivalence relation \sim. By definition of
an equivalence relation we have that \forall x,y,z\in S that
- \sim is reflexive, that is $x\sim x$
- \sim is symmetric, that is $x\sim y\Rightarrow y\sim x$
- \sim is transitive, that is x\sim y and y\sim z implies that $x\sim z$
Consider the equivalence class \left[x\right]_\sim. By definition of
an equivalence class we know that
$$\begin{equation*} \left[x\right]_\sim=\left\{y\in S:x\sim y\right\} \end{equation*}$$
As \sim is reflexive we have that x\sim x and so
x\in\left[x\right]_\sim and therefore
\left[x\right]_\sim\neq\emptyset. $\qed$
:::
We can prove that an equivalence relation partitions the set it is defined on.
::: {#thm:EquivClassesOfRelationPartitionSet .theorem} Theorem 19. Equivalence classes of a relation partitions the set
Let S be a set with an equivalence relation \sim. Let \mathbb{S}
denote the equivalence classes of \sim for each s\in S. We have that
\mathbb{S} is a partition of S.
Proof:
Let S be a set with an equivalence relation \sim and let
\mathbb{S} be the set of equivalence classes of \sim for each
s\in S. Let x\in S then x belongs to at least one equivalence
class by proposition 60{reference-type="ref"
reference="prop:EquivClassNonEmpty"}. We therefore have that
$$\begin{equation*} S=\bigcup_{x\in S}\left[x\right]_\sim \end{equation*}$$
It is left to show that if \left[x\right]_\sim\neq\left[y\right]_\sim
for x,y\in S then we have that
\left[x\right]_\sim\cap\left[y\right]_\sim=\emptyset. This is
equivalent to saying that if
\left[x\right]_\sim\cap\left[y\right]_\sim\neq\emptyset then
\left[x\right]_\sim=\left[y\right]_\sim. So suppose that
\left[x\right]_\sim\cap\left[y\right]_\sim\neq\emptyset then
\left[x\right]_\sim\cap\left[y\right]_\sim has at least one element
z. As z\in\left[x\right]_\sim, by definition we have
that x\sim z. Let a\in\left[x\right]_\sim be an arbitrary element of
the equivalence class of x. By definition x\sim a, so a\sim x by symmetry, and by
transitivity of \sim we conclude that a\sim z. However as
z\in\left[x\right]_\sim\cap\left[y\right]_\sim then we have that
z\in\left[y\right]_\sim and so y\sim z. As \sim is symmetric we
have z\sim y and again by transitivity we conclude that a\sim y, so y\sim a by symmetry.
Hence a\in\left[y\right]_\sim and so
\left[x\right]_\sim\subseteq\left[y\right]_\sim.
A similar argument shows
\left[y\right]_\sim\subseteq\left[x\right]_\sim and therefore we have
that \left[x\right]_\sim=\left[y\right]_\sim. Finally we conclude that
unequal equivalence classes are disjoint and therefore the set of
equivalence classes \mathbb{S} is a partition for S.
The result is shown. $\qed$ :::
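The theorem can be sanity-checked on a finite set. The sketch below is only an illustration under assumptions of our own: the set is `range(12)` and the relation is "same remainder on division by 3", neither of which appears in the text. It builds the set of equivalence classes and tests the three partition conditions.

```python
def classes(S, related):
    # the set of equivalence classes [x] for each x in S
    return {frozenset(y for y in S if related(x, y)) for x in S}

def is_partition(S, blocks):
    blocks = list(blocks)
    nonempty = all(len(b) > 0 for b in blocks)
    # distinct blocks must be pairwise disjoint
    disjoint = all(a.isdisjoint(b)
                   for i, a in enumerate(blocks)
                   for b in blocks[i + 1:])
    # the union of all blocks must give back S
    covers = set().union(*blocks) == set(S)
    return nonempty and disjoint and covers

S = range(12)

def same_remainder(x, y):
    # illustrative equivalence relation, not taken from the text
    return x % 3 == y % 3

blocks = classes(S, same_remainder)
print(len(blocks), is_partition(S, blocks))  # 3 True
```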
Construction of the Integers
::: epigraph The trouble with integers is that we have examined only the very small ones. Maybe all the exciting stuff happens at really big numbers, ones we can't even begin to think about in any very definite way.
Ronald Graham :::
We now have enough theory to consider extending the natural numbers
\mathbb{N}. One reason to do this is to complete the
idea of subtraction. Recall that n-m is defined in \mathbb{N}
if and only if m\leq n. This is a limiting idea. For example, the idea
of debt can't be explained using only \mathbb{N}. We know that if the
balance on your bank account is negative then you owe money to someone,
if your balance is positive you have money to spend8 . The natural
numbers don't have a concept of "negative" or debt, we can only deal
with "positive" values. To keep the financial institutions happy we
should resolve this issue.
To do this we need to consider exactly what it is we want to achieve.
Firstly we want to be able to define n-m for all n,m\in\mathbb{N}.
Clearly, if n\geq m then such a number already exists in \mathbb{N}.
Secondly, such a number n-m could have many different representations,
for example, 6-2=4 and 5-1=4. We need a way to say that any of these
different representations actually represents the same thing. Formally
if we have a,b,c,d\in\mathbb{N} such that a-b=c-d then a-b and
c-d represent the same number, this is equivalent to a+d=b+c.
Thinking of - as a relation we can use the language of equivalence
relations to solve this issue. That is, we want a relation where
\left(a,b\right)\sim\left(c,d\right) if and only if a+d=b+c.
Defining the Integers
We start by recasting subtraction as an ordered tuple.
::: definition Definition 94. Subtraction as an ordered tuple
Let a,b\in\mathbb{N}. We define the subtraction as an ordered tuple
\left(a,b\right)\in\mathbb{N}^2 to mean \left(a-b\right). We will
call an element x\in\mathbb{N}^2 a subtraction tuple. We note that if
a\geq b we have $\left(a-b\right)\in\mathbb{N}$
:::
From this we can define a relation
::: definition Definition 95. Relation on subtraction
Let \left(a,b\right),\left(c,d\right)\in\mathbb{N}^2 be subtraction
tuples. We define the relation \sim such that
\left(a,b\right)\sim\left(c,d\right) if and only if $a+d=b+c$
:::
We have that this relation is an equivalence relation.
::: proposition Proposition 61. Relation on subtraction ordered tuples is an equivalence relation
Let x,y\in\mathbb{N}^2 be subtraction tuples and define the relation
x\sim y as above. We have that \sim is an equivalence relation.
Proof:
Let x,y,z\in\mathbb{N}^2 be subtraction tuples such that
x=\left(a,b\right), y=\left(c,d\right) and z=\left(e,f\right). We
need to show that \sim is an equivalence relation, that is
- \sim is reflexive
- \sim is symmetric
- \sim is transitive

- \sim is reflexive: We have that x=\left(a,b\right) and by definition of \sim we know that x\sim x if and only if a+b=b+a, which holds by commutativity of addition on the natural numbers. Hence x\sim x and \sim is reflexive.
- \sim is symmetric: We have that x=\left(a,b\right) and y=\left(c,d\right). Suppose that x\sim y, then we have that a+d=b+c. By symmetry of equality on the natural numbers we have a+d=b+c\Rightarrow b+c=a+d. By commutativity of addition on the natural numbers we have that b+c=a+d is the same as c+b=d+a. Hence we have that \left(c,d\right)\sim\left(a,b\right) by definition of \sim and so y\sim x, showing that \sim is symmetric.
- \sim is transitive: We know that x=\left(a,b\right), y=\left(c,d\right) and z=\left(e,f\right). Now suppose that x\sim y and y\sim z, then by definition we have that \left(a,b\right)\sim\left(c,d\right) and \left(c,d\right)\sim\left(e,f\right) and hence by definition of \sim we have a+d=c+b and c+f=e+d. Consider a+c+f, we have
$$\begin{equation*} a+c+f=a+e+d=a+d+e=c+b+e \end{equation*}$$
  That is to say a+c+f=c+b+e. By the cancellation laws on the natural numbers we have that a+f=b+e, which implies that \left(a,b\right)\sim\left(e,f\right), which is to say x\sim z. Hence transitivity has been shown.
It follows that \sim is an equivalence relation. $\qed$
:::
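As a sanity check of the proposition (not a replacement for the proof), the three properties can be tested exhaustively on a small grid of subtraction tuples. The function name `related` and the bound of 6 are arbitrary choices made for this sketch.

```python
from itertools import product

def related(p, q):
    # (a, b) ~ (c, d) if and only if a + d = b + c
    (a, b), (c, d) = p, q
    return a + d == b + c

# all subtraction tuples with entries below 6
pairs = list(product(range(6), repeat=2))

reflexive = all(related(p, p) for p in pairs)
symmetric = all(related(q, p)
                for p in pairs for q in pairs if related(p, q))
transitive = all(related(p, r)
                 for p in pairs for q in pairs for r in pairs
                 if related(p, q) and related(q, r))

print(reflexive, symmetric, transitive)  # True True True
```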
Now that we have shown that \sim is an equivalence relation we can
solve the multiple representation problem by considering the equivalence
classes of \mathbb{N}^2 with the relation \sim. Let
x\in\mathbb{N}^2 with x=\left(a,b\right) then the equivalence class
of x is given by
$$\begin{equation*} \left[x\right]_\sim=\left[\left(a,b\right)\right]_\sim=\left\{\left(c,d\right)\in\mathbb{N}^2 : \left(a,b\right)\sim\left(c,d\right)\right\} \end{equation*}$$
We know by theorem
19{reference-type="ref"
reference="thm:EquivClassesOfRelationPartitionSet"} that the
set of equivalence classes partitions
\mathbb{N}^2 and that distinct equivalence classes are disjoint. This is to
say if x,y\in\mathbb{N}^2 then we have that if
\left[x\right]_\sim\cap\left[y\right]_\sim\neq\emptyset then
\left[x\right]_\sim =\left[y\right]_\sim. This solves the multiple
representation issue.
Let us have a look at some equivalence classes
::: example
Example 87. Let x\in\mathbb{N}^2 with x=\left(1,3\right) by
definition we have that x represents 1-3. Consider the equivalence
class of x, \left[x\right]=\left\{y\in\mathbb{N}^2:x\sim y\right\} and
let y\in\left[x\right]. We have that y=\left(c,d\right) and that
1+d=3+c, one possible y where this is true is given by
y=\left(0,2\right) and y represents 0-2. As we have
y\in\left[x\right] then we have that \left[x\right]=\left[y\right]
so we shall take y to be the canonical representative of this
equivalence class.
:::
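One way to pick a canonical representative mechanically is to subtract the smaller entry from both entries of the tuple, so that at least one entry becomes 0. The sketch below assumes this convention; the helper names `canonical` and `related` are our own.

```python
def related(p, q):
    # (a, b) ~ (c, d) if and only if a + d = b + c
    (a, b), (c, d) = p, q
    return a + d == b + c

def canonical(pair):
    # remove the common part of both entries; one entry ends up as 0
    a, b = pair
    m = min(a, b)
    return (a - m, b - m)

print(canonical((1, 3)))   # (0, 2)
print(canonical((8, 11)))  # (0, 3)
print(canonical((6, 2)))   # (4, 0)
print(related((1, 3), canonical((1, 3))))  # True
```

Only subtractions that are already defined in the naturals are used, since `m` is never larger than either entry.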
Now that we have that the subtraction tuples are in equivalence classes
we can consider the following. Suppose that a,b,c\in\mathbb{N} then
what is a-\left(b-c\right)? For example if a=10, b=6 and c=3 then
we have that 10-\left(6-3\right)=10-3=7. This is also the same as
10+3-6=13-6=7. This holds in general where we have that
\left(a,b-c\right)\sim\left(a+c,b\right)
::: {#lem:NaturalMinusDifferenceOfNatural .lemma} Lemma 6. $\left(a,b-c\right)\sim\left(a+c,b\right)$
Let a,b,c\in\mathbb{N} with a> b\geq c. We have that
$$\begin{equation} \left(a,b-c\right)\sim\left(a+c,b\right) \end{equation*}$$*
Proof:
Let a,b,c\in\mathbb{N} be as given. By definition of \sim we have
\left(x,y\right)\sim\left(u,v\right) if and only if x+v=u+y. We
argue by contradiction, suppose that
\left(a,b-c\right)\not\sim\left(a+c,b\right) then by definition we
have that
$$\begin{align*} a+b&\neq a+c+\left(b-c\right)\\ b&\neq c+\left(b-c\right),\ \text{By the cancellation law}\\ b&\neq \left(c+b\right)-c,\ \text{By proposition \ref{prop:NaturalAddDifference}}\\ b&\neq\left(b+c\right)-c,\ \text{By commutativity}\\ b&\neq b+\left(c-c\right),\ \text{By proposition \ref{prop:NaturalAddDifference}}\\ 0&\neq \left(c-c\right),\ \text{By the cancellation law}\\ 0&\neq 0 \end{align*}$$
A contradiction. $\qed$ :::
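The lemma can also be checked numerically for small values. This is a brute-force sketch with an arbitrary bound, using a hypothetical helper `lemma_holds`; the condition a>b\geq c guarantees that b-c is a natural number.

```python
def related(p, q):
    # (x, y) ~ (u, v) if and only if x + v = u + y
    (x, y), (u, v) = p, q
    return x + v == u + y

def lemma_holds(limit):
    # test (a, b - c) ~ (a + c, b) for all a > b >= c with a < limit
    return all(
        related((a, b - c), (a + c, b))
        for a in range(limit)
        for b in range(a)
        for c in range(b + 1)
    )

print(related((10, 6 - 3), (10 + 3, 6)))  # True, the worked example above
print(lemma_holds(20))                    # True
```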
By this lemma it follows that a-\left(b-c\right)=\left(a+c\right)-b.
We now look at what the set of equivalence classes looks like. We make the following definition.
::: {#def:QuotientSet .definition} Definition 96. Quotient set
Let S be a set with an equivalence relation \sim. Let x\in S and
consider the equivalence class \left[x\right]_\sim. We define the
quotient set of S, denoted by S/\sim by
$$\begin{equation*} S/\sim=\left\{\left[x\right]_\sim :x\in S\right\} \end{equation*}$$
:::
Why have we called the set of the equivalence classes a quotient set? We can see why with a few examples.
::: example
Example 88. We reconsider the example where X is the set of all
people currently alive with the relation \sim given by
$$\begin{equation*} \forall\left(x,y\right)\in X\times X: x\sim y\iff x\text{ and }y\text{ were born in the same year} \end{equation*}$$
We know that \sim is an equivalence relation and we know that each
equivalence class is the set of all people currently alive born in a
certain year. We can identify the quotient set X/\sim with the set of
all of the years in which people currently alive were born.
As an example suppose that person x\in X was born in 1983. Then by the
definition of \sim we have that x\sim y if and only if y is also
born in 1983 and that \left[x\right]_\sim is the equivalence class of
all people born in 1983. As \left[x\right]_\sim\in X/\sim then
\left[x\right]_\sim is the set in X/\sim that represents the year
1983. That is, the quotient set has taken the set X of all currently
alive people and turned it into the set
of all possible birth years.
:::
::: example
Example 89. Let X be the set of all possible cars and define the
equivalence relation \sim such that x\sim y if and only if x and
y are the same colour. We have that \sim is an equivalence relation.
Reflexivity is clear as if x is a certain colour then clearly
x\sim x will be true. Now if x\sim y then both x and y are the
same colour and so y\sim x. Finally if x\sim y and y\sim z then
x and y are the same colour and so are y and z so it follows
that x\sim z.
Suppose now that x\in X, then the equivalence class
\left[x\right]_\sim is the set of all cars that are the same colour as x.
Hence the quotient set X/\sim will be the set of all possible car
colours. The quotient set has taken the set of all possible cars and
turned it into the set of all possible car colours.
If we had a different relation R where xRy if and only if x and
y have the same number of doors then R is also an equivalence relation and
X/R would take all of the possible cars X and turn it into the set
of all of the possible numbers of doors a car can have.
:::
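To see the quotient set "extract a property", we can build it for a small collection of cars. The data below is hypothetical (the car names and colours are invented for illustration), and `quotient_set` is our own helper name.

```python
# Hypothetical data: car names and colours are invented for illustration.
colour = {"hatchback": "red", "saloon": "blue", "estate": "red",
          "coupe": "green", "van": "blue"}

def same_colour(x, y):
    return colour[x] == colour[y]

def quotient_set(X, related):
    # X/~ = {[x] : x in X}, each class stored as a frozenset
    return {frozenset(y for y in X if related(x, y)) for x in X}

Q = quotient_set(colour, same_colour)

# label each class by the colour shared by its members
labels = {colour[min(block)]: sorted(block) for block in Q}
print(len(Q))          # 3
print(sorted(labels))  # ['blue', 'green', 'red']
```

Five cars collapse to three classes, one for each colour that occurs.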
These examples show that the quotient set takes a set of objects S and
extracts a given property defined by the equivalence relation \sim
defined on S. How can we use the quotient set on the equivalence
classes of the subtraction tuples?
We have that the quotient set \mathbb{N}^2/\sim is given by
$$\begin{equation*} \mathbb{N}^2/\sim=\left\{\left[x\right]_\sim:x\in\mathbb{N}^2\right\} \end{equation*}$$
What do these elements actually look like? Let
\left(a,b\right)=x\in\mathbb{N}^2 and consider the equivalence class
\left[x\right]_\sim. Firstly, in the naturals, we know that 0=0-0
and more generally that 0=a-a for any a\in\mathbb{N}. Hence
0\in\left[\left(0,0\right)\right].
Now, consider \left[\left(a,0\right)\right] then we would have that
any \left(c,d\right)=y\in\left[\left(a,0\right)\right] is such that
\left(a,0\right)\sim\left(c,d\right) if and only if a-0=c-d. Hence
each a is equivalent to some subtraction tuple. Moreover each
\left(a,0\right)=a\in\mathbb{N}, therefore we have a canonical
representation for each element a\in\mathbb{N}. What happens if we
have a tuple \left(a,b\right) where a<b? We can see that if
\left(a,b\right)\sim\left(c,d\right) then a+d=c+b. For example we
have that \left(8,11\right)\sim\left(0,3\right), as 8+3=11+0, and
\left(0,3\right)\sim\left(1,4\right), which gives
$$\begin{equation*}
0-3=1-4 \Rightarrow 0+4=1+3 \Rightarrow 4=4
\end{equation*}$$
Hence we can define a canonical representation for each
\left(0,a\right) where a\in\mathbb{N}. We will write the element
\left(0 ,a\right) by -a for each a\in\mathbb{N}. We have now defined
the set of Integers.
::: definition Definition 97. Integers
Let \mathbb{N}^2 have the equivalence relation \sim defined by
\left(a,b\right)\sim\left(c,d\right) if and only if a+d=b+c. We
define the set of Integers, denoted \mathbb{Z}, as the quotient set
\mathbb{N}^2/\sim. The set \mathbb{Z} has the form
$$\begin{equation} \mathbb{Z}=\left{\dots,-4,-3,-2,-1,0,1,2,3,4,\dots\right} \end{equation}$$ :::
We make two additional definitions based on the canonical form of the equivalence classes.
::: definition Definition 98. Positive Integer
Let a\in\mathbb{Z}. We say that a is a positive integer if and only
if a\in\left[\left(b,0\right)\right] for some b\in\mathbb{N} with
b\neq 0.
:::
::: definition Definition 99. Negative Integer
Let a\in\mathbb{Z}. We say that a is a negative integer if and only
if a\in\left[\left(0,b\right)\right] for some b\in\mathbb{N} with
b\neq 0.
:::
We can use these two definitions to define an occasionally useful idea.
::: definition Definition 100. Sign of an integer
Let x\in\mathbb{Z}. We define the sign of x, denoted by
\mathop{\mathrm{sgn}}\left(x\right) to be the following function
$$\begin{align*} \mathop{\mathrm{sgn}}:\mathbb{Z}&\rightarrow\left\{-1,0,1\right\}\\ x&\mapsto\mathop{\mathrm{sgn}}\left(x\right)=\begin{cases} 1,\ \text{If } x\text{ is a positive integer}\\ -1,\ \text{If } x\text{ is a negative integer}\\ 0,\ \text{Otherwise} \end{cases} \end{align*}$$
:::
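On a representative \left(a,b\right) the sign can be read off by comparing the two entries: the class contains some \left(c,0\right) with c\neq 0 exactly when a>b, and some \left(0,c\right) with c\neq 0 exactly when a<b. A minimal sketch under that observation, with `sgn` taking a tuple of naturals as an assumed representation:

```python
def sgn(pair):
    # pair = (a, b) is any representative of the integer a - b
    a, b = pair
    if a > b:
        return 1   # class contains (a - b, 0) with a - b != 0
    if a < b:
        return -1  # class contains (0, b - a) with b - a != 0
    return 0       # class of (0, 0)

print(sgn((5, 1)))   # 1
print(sgn((8, 11)))  # -1
print(sgn((4, 4)))   # 0
```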
We also have the following, clear result
::: proposition Proposition 62. The natural numbers are a subset of the integers
We have that $\mathbb{N}\subseteq\mathbb{Z}$
Proof:
We have that the elements of the equivalence class
\left[\left(x,0\right)\right] have the form x-0=x\in\mathbb{N}. Let
a\in\mathbb{N} then we have that a\in\left[\left(a,0\right)\right].
This holds for every a\in\mathbb{N} and so
\mathbb{N}\subseteq\mathbb{Z}. $\qed$
:::
We will let \left[\left(a,b\right)\right] be denoted by
\left[a,b\right] and extend the operations of addition and
multiplication to the integers by defining how they work on the
equivalence classes.
Extending equality to the integers
Equality for the integers is easy to define.
::: definition Definition 101. Equality of integers
Let x,y\in\mathbb{Z} be two integer numbers. We define that two
integers are equal, denoted x=y if and only if x\sim y. This is the
same as saying both x and y belong to the same equivalence class. In
the case where x\not\sim y, we say that x is not equal to y and
write x\neq y.
:::
Extending inequality operators to the integers
Inequality operators extend in a natural way.
::: definition Definition 102. Less than operator
Let x,y\in\mathbb{Z} where x\in\left[a,b\right] and
y\in\left[c,d\right] for some a,b,c,d\in\mathbb{N}. The less than
operator, denoted by x<y is defined by the logical proposition
$$\begin{equation*} <\left(x,y\right)=\begin{cases} 1,\ \text{If } a+d<b+c\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$
This can equivalently be expressed as
$$\begin{equation*} x<y \iff a+d<b+c \end{equation*}$$
:::
::: definition Definition 103. Less than or equal to operator
Let x,y\in\mathbb{Z} where x\in\left[a,b\right] and
y\in\left[c,d\right] for some a,b,c,d\in\mathbb{N}. The less than or
equal operator, denoted by x\leq y is defined by the logical
proposition
$$\begin{equation*} \leq\left(x,y\right)=\begin{cases} 1,\ \text{If } a+d\leq b+c\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$
This can equivalently be expressed as
$$\begin{equation*} x\leq y \iff a+d\leq b+c \end{equation*}$$
:::
::: definition Definition 104. Greater than operator
Let x,y\in\mathbb{Z} where x\in\left[a,b\right] and
y\in\left[c,d\right] for some a,b,c,d\in\mathbb{N}. The greater than
operator, denoted by x>y is defined by the logical proposition
$$\begin{equation*} >\left(x,y\right)=\begin{cases} 1,\ \text{If } a+d>b+c\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$
This can equivalently be expressed as
$$\begin{equation*} x>y \iff a+d>b+c \end{equation*}$$
:::
::: definition Definition 105. Greater than or equal to operator
Let x,y\in\mathbb{Z} where x\in\left[a,b\right] and
y\in\left[c,d\right] for some a,b,c,d\in\mathbb{N}. The greater than
or equal to operator, denoted by x\geq y is defined by the logical
proposition
$$\begin{equation*} \geq\left(x,y\right)=\begin{cases} 1,\ \text{If } a+d\geq b+c\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$
This can equivalently be expressed as
$$\begin{equation*} x\geq y \iff a+d\geq b+c \end{equation*}$$
:::
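The four operators only use addition and comparison on the naturals, so they are easy to try out on representatives. In the sketch below the function names are our own, and integers are represented as tuples of naturals.

```python
def less_than(p, q):
    # x < y if and only if a + d < b + c, for x in [a, b] and y in [c, d]
    (a, b), (c, d) = p, q
    return a + d < b + c

def less_or_equal(p, q):
    (a, b), (c, d) = p, q
    return a + d <= b + c

def greater_than(p, q):
    (a, b), (c, d) = p, q
    return a + d > b + c

def greater_or_equal(p, q):
    (a, b), (c, d) = p, q
    return a + d >= b + c

# (0, 3) and (1, 4) both represent -3; (5, 1) and (6, 2) both represent 4
print(less_than((0, 3), (5, 1)))         # True
print(less_than((1, 4), (6, 2)))         # True, other representatives agree
print(greater_than((0, 3), (5, 1)))      # False
print(less_or_equal((0, 3), (1, 4)))     # True, as they are equal
print(greater_or_equal((0, 3), (1, 4)))  # True
```

The second line suggests that the answer does not depend on which representatives are chosen, which is what we need for the operators to make sense on equivalence classes.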
Extending addition to the integers
We have an understanding of addition on the natural numbers, namely the recursive definition given by
$$\begin{align*} +&:\mathbb{N}^2\mathlarger{\mathlarger{\rightarrow}}\mathbb{N}\\ \left(m,n\right)&\mapsto +\left(m,n\right)=\begin{cases} m+0=m,\ \text{If } n=0\\ m+S\left(n\right)=S\left(m+n\right),\ \text{If } n\neq 0 \end{cases} \end{align*}$$
Now if we take a,b\in\mathbb{Z} with a,b being positive integers
then we have that a\in\left[\left(a,0\right)\right] and
b\in\left[\left(b,0\right)\right]. We then have that a+b will be in
\left[\left(a+b,0\right)\right]. Now suppose that a,b\in\mathbb{N}
and consider the negative integers -a and -b, then we have that
-a\in\left[\left(0,a\right)\right] and
-b\in\left[\left(0,b\right)\right]. Intuitively we know that -2+-3=-5
so we want these to add like in the positive integer case. This is to
say we have that -a+-b will be in the class
\left[\left(0,a+b\right)\right].
We can combine these two observations to define addition on the integers.
::: definition Definition 106. Addition on the Integers
Let x,y\in\mathbb{Z} with x=\left(a,b\right) and
y=\left(c,d\right). We define addition on the integers by
$$\begin{equation} \left[a,b\right]+\left[c,d\right]=\left[a+c,b+d\right] \end{equation}$$ :::
To check this definition makes sense consider x=4,y=3. Both x and
y belong to some equivalence class, for example
x\in\left[\left(5,1\right)\right] and
y\in\left[\left(8,5\right)\right]. Then we have that x+y=7 and
$$\begin{equation*} \left(5,1\right)+\left(8,5\right)=\left(5+8,1+5\right)=\left(13,6\right) \Rightarrow 13-6=7 \end{equation*}$$
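The same check can be carried out in code. This is a small sketch with tuples of naturals standing in for representatives; `add` and `related` are names chosen here for illustration.

```python
def related(p, q):
    # (a, b) ~ (c, d) if and only if a + d = b + c
    (a, b), (c, d) = p, q
    return a + d == b + c

def add(p, q):
    # [a, b] + [c, d] = [a + c, b + d]
    (a, b), (c, d) = p, q
    return (a + c, b + d)

print(add((5, 1), (8, 5)))                   # (13, 6)
print(add((4, 0), (3, 0)))                   # (7, 0)
print(related(add((5, 1), (8, 5)), (7, 0)))  # True, both represent 7
print(add((0, 2), (0, 3)))                   # (0, 5), that is -2 + -3 = -5
```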
Extending multiplication to the integers
We also extend multiplication to the integers. We have the definition of multiplication on the naturals given by
$$\begin{align*} *&:\mathbb{N}\times\mathbb{N}\mathlarger{\mathlarger{\rightarrow}}\mathbb{N}\\ \left(m,n\right)&\mapsto *\left(m,n\right)=\begin{cases} m*0=0,\ \text{If } n=0\\ m*S\left(n\right)=m*n+m,\ \text{If } n\neq 0 \end{cases} \end{align*}$$
As before, if we take x,y\in\mathbb{Z} with x,y being positive
integers then we have that x\in\left[\left(x,0\right)\right] and
y\in\left[\left(y,0\right)\right], so we have that
x*y\in\left[\left(x*y,0\right)\right].
Suppose that x,y\in\mathbb{Z} with x=\left(a,b\right) and
y=\left(c,d\right). We have that
$$\begin{align*}
\left(a-b\right)\left(c-d\right)&=\left(a-b\right)c-\left(a-b\right)d\\
&=ac-bc-\left(ad-bd\right)\\
&=ac-bc+bd-ad\\
&= ac+bd-bc-ad\\
&=ac+bd-\left(ad+bc\right)
\end{align*}$$
This is
\left(a,b\right)*\left(c,d\right)=\left(ac+bd,ad+bc\right).
This will be the definition of multiplication of the integers.
::: definition Definition 107. Multiplication on the Integers
Let x,y\in\mathbb{Z} with x=\left(a,b\right) and
y=\left(c,d\right). We define multiplication on the integers by
$$\begin{equation} \left[a,b\right]*\left[c,d\right]=\left[ac+bd,ad+bc\right] \end{equation}$$
:::
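A few products computed from this definition confirm the expected sign rules. The sketch below works on representatives given as tuples of naturals; the name `mul` is our own.

```python
def mul(p, q):
    # [a, b] * [c, d] = [ac + bd, ad + bc]
    (a, b), (c, d) = p, q
    return (a * c + b * d, a * d + b * c)

print(mul((5, 1), (8, 5)))  # (45, 33), and 45 - 33 = 12 = 4 * 3
print(mul((0, 2), (3, 0)))  # (0, 6), that is -2 * 3 = -6
print(mul((0, 2), (0, 3)))  # (6, 0), that is -2 * -3 = 6
```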
Closure properties of addition and multiplication
As with the natural numbers we need to show that the operations of addition and multiplication are closed. Additionally we want to prove our claim at the start of this section that the integers allow us to completely perform subtraction.
::: theorem Theorem 20. Addition and multiplication on the integers are well-defined operators and closed
We have that \forall x,y\in\mathbb{Z} that
- $x+y\in\mathbb{Z}$
- $x*y\in\mathbb{Z}$
Proof:
- x+y\in\mathbb{Z}: We need to show that if \left(a,b\right)\sim\left(a',b'\right) and \left(c,d\right)\sim\left(c',d'\right) then \left(a+c,b+d\right)\sim\left(a'+c',b'+d'\right), as this will show equivalent elements produce the same result when added and therefore integer addition is well-defined.

  We have by definition that \left(a,b\right)\sim\left(a',b'\right) gives a+b'=a'+b, likewise \left(c,d\right)\sim\left(c',d'\right) gives c+d'=c'+d.

  Now, we have that
$$\begin{align*} a+b'+c+d'&=a'+b+c'+d\\ a+c+b'+d'&=a'+c'+b+d\\ \Rightarrow \left(a+c,b+d\right)&\sim\left(a'+c',b'+d'\right) \end{align*}$$
  Hence \left[\left(a+c,b+d\right)\right]=\left[\left(a'+c',b'+d'\right)\right] and so addition is well-defined.

  It is left to prove closure. Let x,y\in\mathbb{Z} with x=\left(a,b\right) and y=\left(c,d\right). By definition of integer addition we have that x+y=\left(a+c,b+d\right) and moreover we have a+c\in\mathbb{N} and b+d\in\mathbb{N}. Hence \left(a+c,b+d\right)\in\left[a+c,b+d\right] and therefore x+y\in\mathbb{Z}, showing closure.
- x*y\in\mathbb{Z}: As with addition we need to show that if \left(a,b\right)\sim\left(a',b'\right) and \left(c,d\right)\sim\left(c',d'\right) then \left(a,b\right)*\left(c,d\right) \sim \left(a',b'\right)*\left(c',d'\right). We have that
$$\begin{equation*} \left(a,b\right)*\left(c,d\right)=\left(ac+bd,ad+bc\right)\iff ac+bd-\left(ad+bc\right) \end{equation*}$$
  Now as \left(a,b\right)\sim\left(a',b'\right) then a+b'=b+a' and as \left(c,d\right)\sim\left(c',d'\right) then c+d'=d+c'. Hence
$$\begin{align*} ac+bd-\left(ad+bc\right)&=\left(ac-ad\right)+\left(bd-bc\right)\\ &=a\left(c-d\right)+b\left(d-c\right)\\ &=a\left(c'-d'\right)+b\left(d'-c'\right), \text{ By assumption as } c+d'=d+c'\Rightarrow c-d=c'-d'\\ &=ac'-ad'+bd'-bc'\\ &=\left(ac'-bc'\right)+\left(bd'-ad'\right), \text{ By commutativity of the Naturals}\\ &=c'\left(a-b\right)+d'\left(b-a\right)\\ &=c'\left(a'-b'\right)+d'\left(b'-a'\right), \text{ By assumption as } a+b'=b+a'\Rightarrow a-b=a'-b'\\ &=\left(c'a'-c'b'\right)+\left(d'b'-d'a'\right)\\ &=c'a'-c'b'+d'b'-d'a'\\ &=a'c'-b'c'+b'd'-a'd', \text{ By commutativity of the Naturals}\\ &=\left(a'c'+b'd'\right)-b'c'-a'd'\\ &=\left(a'c'+b'd'\right)-\left(a'd'+b'c'\right), \text{ By lemma \ref{lem:NaturalMinusDifferenceOfNatural}} \end{align*}$$
  This shows that multiplication is well-defined. It is left to show closure. Let x,y\in\mathbb{Z} with x=\left(a,b\right) and y=\left(c,d\right). By the definition of multiplication on the integers we have that x*y=\left(ac+bd,ad+bc\right) with ac+bd\in\mathbb{N} and ad+bc\in\mathbb{N}. Hence we conclude that \left(ac+bd,ad+bc\right)\in\left[ac+bd,ad+bc\right], and so by definition x*y\in\mathbb{Z}.
The result is shown. $\qed$ :::
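The well-definedness argument can be checked by brute force on small representatives. The following is a minimal sketch, assuming integers are modelled as Python tuples `(a, b)` of naturals; the helper names `add`, `mul` and `equiv` are illustrative and not part of the text.

```python
from itertools import product

def add(p, q):
    """Integer addition on representatives: (a,b)+(c,d) = (a+c, b+d)."""
    return (p[0] + q[0], p[1] + q[1])

def mul(p, q):
    """Integer multiplication on representatives: (a,b)*(c,d) = (ac+bd, ad+bc)."""
    (a, b), (c, d) = p, q
    return (a * c + b * d, a * d + b * c)

def equiv(p, q):
    """(a,b) ~ (c,d) if and only if a + d = c + b."""
    return p[0] + q[1] == q[0] + p[1]

# All pairs of naturals with entries below 4.
pairs = list(product(range(4), repeat=2))

# Swapping each argument for an equivalent one must give an equivalent result.
well_defined = all(
    equiv(add(p, q), add(p2, q2)) and equiv(mul(p, q), mul(p2, q2))
    for p in pairs for p2 in pairs if equiv(p, p2)
    for q in pairs for q2 in pairs if equiv(q, q2)
)
```

A finite check like this is only a sanity test of the definitions; the proof above is what establishes the result for all naturals.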
Now that we have shown closure we can deduce an immediate property.
::: {#prop:multiplication_by_negative_one_for_integers .proposition} Proposition 63. Multiplication of an integer by $-1$
Let $x\in\mathbb{Z}$ where $x\in\left[a,b\right]$ for some
$a,b\in\mathbb{N}$. We have that
-
$-1*x = -1*\left(a,b\right)=\left(b,a\right)$
-
$x*-1 = \left(a,b\right)*-1=\left(b,a\right)$
Proof:
-
$-1*x = -1*\left(a,b\right)=\left(b,a\right)$: We have that $-1\in\left[0,1\right]$ and so
$$\begin{align*} -1*x&=\left(0,1\right)*\left(a,b\right)\\ &=\left(0*a+1*b,0*b+1*a\right)\\ &=\left(b,a\right) \end{align*}$$
-
$x*-1 = \left(a,b\right)*-1=\left(b,a\right)$: Likewise we have
$$\begin{align*} x*-1&=\left(a,b\right)*\left(0,1\right)\\ &=\left(a*0+b*1,a*1+b*0\right)\\ &=\left(b,a\right) \end{align*}$$
As required. $\qed$ :::
::: {#cor:multiplication_by_negative_one_changes_integer_sign .corollary}
Corollary 3. Multiplication of a positive integer by -1 makes it
a negative integer and multiplication of a negative integer by -1
makes it a positive integer
-
If $x$ is a positive integer then $-1*x$ is a negative integer.
-
If $x$ is a negative integer then $-1*x$ is a positive integer.
Proof:
By definition if x\in\mathbb{Z} is positive then
x\in\left[a,0\right] for some a\in\mathbb{N}. By proposition
63{reference-type="ref"
reference="prop:multiplication_by_negative_one_for_integers"} we have
that -1*x=\left(0,a\right)=x*-1, which is by definition a negative
integer.
Likewise if x\in\mathbb{Z} is negative then x\in\left[0,a\right]
for some a\in\mathbb{N}. By proposition
63{reference-type="ref"
reference="prop:multiplication_by_negative_one_for_integers"} we have
that -1*x=\left(a,0\right)=x*-1, which is by definition a positive
integer.
$\qed$ :::
Associativity of integer addition and multiplication
The associativity of addition and multiplication of the naturals also extends to the integers.
::: theorem
Theorem 21. Let x,y,z\in\mathbb{Z}. We have that
-
$x+\left(y+z\right)=\left(x+y\right)+z$
-
$x\left(yz\right)=\left(xy\right)z$
Proof:
-
$x+\left(y+z\right)=\left(x+y\right)+z$: Let $x,y,z\in\mathbb{Z}$ be such that $x=\left(a,b\right)$, $y=\left(c,d\right)$ and $z=\left(e,f\right)$ where $a,b,c,d,e,f\in\mathbb{N}$, so that $\left(a,b\right)\in\left[a,b\right]$, $\left(c,d\right)\in\left[c,d\right]$ and $\left(e,f\right)\in\left[e,f\right]$. We have that
$$\begin{align*} x+\left(y+z\right)&=\left(a,b\right)+\left(\left(c,d\right)+\left(e,f\right)\right)\\ &=\left(a,b\right)+\left(c+e,d+f\right)\\ &=\left(a+\left(c+e\right),b+\left(d+f\right)\right)\\ &=\left(\left(a+c\right)+e,\left(b+d\right)+f\right),\text{ By associativity of addition for natural numbers}\\ &=\left(a+c,b+d\right)+\left(e,f\right)\\ &=\left(\left(a,b\right)+\left(c,d\right)\right)+\left(e,f\right)\\ &=\left(x+y\right)+z \end{align*}$$
Which shows associativity of addition.
-
$x\left(yz\right)=\left(xy\right)z$: As with addition, let $x,y,z\in\mathbb{Z}$ be such that $x=\left(a,b\right)$, $y=\left(c,d\right)$ and $z=\left(e,f\right)$ where $a,b,c,d,e,f\in\mathbb{N}$, so that $\left(a,b\right)\in\left[a,b\right]$, $\left(c,d\right)\in\left[c,d\right]$ and $\left(e,f\right)\in\left[e,f\right]$. We then have that
$$\begin{align*} x\left(yz\right)&=\left(a,b\right)\left(\left(c,d\right)\left(e,f\right)\right)\\ &=\left(a,b\right)\left(ce+df,cf+de\right)\\ &=\left(a\left(ce+df\right)+b\left(cf+de\right),a\left(cf+de\right)+b\left(ce+df\right)\right)\\ &=\left(ace+adf+bcf+bde,acf+ade+bce+bdf\right)\\ &=\left(ace+bde+adf+bcf,acf+bdf+ade+bce\right),\ \text{By commutativity of addition for natural numbers}\\ &=\left(\left(ac+bd\right)e+\left(ad+bc\right)f,\left(ac+bd\right)f+\left(ad+bc\right)e\right)\\ &=\left(ac+bd,ad+bc\right)\left(e,f\right)\\ &=\left(\left(a,b\right)\left(c,d\right)\right)\left(e,f\right)\\ &=\left(xy\right)z \end{align*}$$
Showing associativity of multiplication.
The result follows. $\qed$ :::
Commutativity of integer addition and multiplication
As with the naturals, addition and multiplication in the integers both satisfy commutativity.
::: theorem Theorem 22. Addition and multiplication are commutative
For all x,y\in\mathbb{Z} we have that
-
$x+y=y+x$
-
$xy=yx$
Proof:
-
$x+y=y+x$: Let $x,y\in\mathbb{Z}$. By definition we have that $x\in\left[a,b\right]$ and $y\in\left[c,d\right]$ for some $a,b,c,d\in\mathbb{N}$. Let $x=\left(a,b\right)$ and $y=\left(c,d\right)$. We then have by definition of addition that
$$\begin{align*} x+y&=\left(a,b\right)+\left(c,d\right)\\ &=\left(a+c,b+d\right)\\ &=\left(c+a,d+b\right),\ \text{By commutativity of addition for natural numbers}\\ &= \left(c,d\right)+\left(a,b\right)\\ &=y+x \end{align*}$$
Showing commutativity holds for addition in the integers.
-
$xy=yx$: Let $x,y\in\mathbb{Z}$. By definition we have that $x\in\left[a,b\right]$ and $y\in\left[c,d\right]$ for some $a,b,c,d\in\mathbb{N}$. So let $x=\left(a,b\right)$ and $y=\left(c,d\right)$. By definition of multiplication we have
$$\begin{align*} xy&=\left(a,b\right)\left(c,d\right)\\ &=\left(ac+bd,ad+bc\right)\\ &=\left(ca+db,da+cb\right), \text{ By commutativity of multiplication of the naturals}\\ &=\left(ca+db,cb+da\right), \text{ By commutativity of addition of the naturals}\\ &=\left(c,d\right)\left(a,b\right)\\ &=yx \end{align*}$$
Showing commutativity for integer multiplication.
The result has been shown. $\qed$ :::
Multiplication distributes over addition
Another result that extends from the naturals is that multiplication distributes over addition.
::: theorem Theorem 23. Multiplication distributes over addition
For all x,y,z\in\mathbb{Z} we have that
-
$x\left(y+z\right)=xy+xz$
-
$\left(y+z\right)x=yx+zx=xy+xz$
Proof:
Let x,y,z\in\mathbb{Z} then
x\in\left[a,b\right],y\in\left[c,d\right] and z\in\left[e,f\right]
for some a,b,c,d,e,f\in\mathbb{N}.
So let x=\left(a,b\right), y=\left(c,d\right) and
z=\left(e,f\right).
-
$x\left(y+z\right)=xy+xz$: We have that
$$\begin{align*} x\left(y+z\right)&=\left(a,b\right)\left(\left(c,d\right)+\left(e,f\right)\right)\\ &=\left(a,b\right)\left(c+e,d+f\right)\\ &=\left(a\left(c+e\right)+b\left(d+f\right),a\left(d+f\right)+b\left(c+e\right)\right)\\ &=\left(ac+ae+bd+bf,ad+af+bc+be\right)\\ &=\left(ac+bd+ae+bf,ad+bc+af+be\right)\\ &=\left(ac+bd,ad+bc\right)+\left(ae+bf,af+be\right)\\ &=\left(a,b\right)\left(c,d\right)+\left(a,b\right)\left(e,f\right)\\ &=xy+xz \end{align*}$$
-
$\left(y+z\right)x=yx+zx=xy+xz$: Now that we have the previous part, the proof of this part is quick. We have
$$\begin{align*} \left(y+z\right)x&=x\left(y+z\right), \text{ By commutativity of multiplication}\\ &=xy+xz, \text{ By part }1.\\ &=yx+zx, \text{ By commutativity of multiplication} \end{align*}$$
As required. $\qed$ :::
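The three theorems above (associativity, commutativity and distributivity) can be spot-checked on representatives. The sketch below assumes the same tuple model of integers as pairs of naturals; `add` and `mul` are illustrative names. Note that the identities hold as exact equalities of pairs, not merely up to equivalence, which matches the proofs.

```python
from itertools import product

def add(p, q):
    """(a,b)+(c,d) = (a+c, b+d)."""
    return (p[0] + q[0], p[1] + q[1])

def mul(p, q):
    """(a,b)*(c,d) = (ac+bd, ad+bc)."""
    (a, b), (c, d) = p, q
    return (a * c + b * d, a * d + b * c)

pairs = list(product(range(3), repeat=2))   # 9 representatives
triples = list(product(pairs, repeat=3))    # 729 triples (x, y, z)

associative = all(
    add(x, add(y, z)) == add(add(x, y), z)
    and mul(x, mul(y, z)) == mul(mul(x, y), z)
    for x, y, z in triples
)
commutative = all(
    add(x, y) == add(y, x) and mul(x, y) == mul(y, x)
    for x, y in product(pairs, repeat=2)
)
distributive = all(
    mul(x, add(y, z)) == add(mul(x, y), mul(x, z))
    for x, y, z in triples
)
```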
The Zero and Identity laws
The zero and identity laws from the naturals extend to the integers.
::: theorem Theorem 24. The zero and Identity laws
Let $x\in\mathbb{Z}$. We have that
-
$x+0=x=0+x$
-
$1*x=x=x*1$
Proof:
Let $x\in\mathbb{Z}$ then we have that $x=\left(a,b\right)$ for some
$a,b\in\mathbb{N}$.
-
$x+0=x=0+x$: We have that $0\in\left[0,0\right]$. Hence we have that
$$\begin{equation*} x+0=\left(a,b\right)+\left(0,0\right)=\left(a+0,b+0\right)=\left(a,b\right)=\left(0+a,0+b\right)=\left(0,0\right)+\left(a,b\right)=0+x \end{equation*}$$
-
$x*1=x=1*x$: As $1\in\left[1,0\right]$ then
$$\begin{align*} x*1&=\left(a,b\right)*\left(1,0\right)\\ &=\left(a*1+b*0,a*0+b*1\right)\\ &=\left(a+0,0+b\right)\\ &=\left(a,b\right)=x\\ &=\left(1*a+0*b,1*b+0*a\right)\\ &=\left(1,0\right)*\left(a,b\right)\\ &=1*x \end{align*}$$
The result follows. $\qed$ :::
Extending subtraction to the integers
As we have a notion of subtraction on the naturals, we can ask about
extending this to the integers. We defined subtraction on the naturals
as follows. Let $n,m\in\mathbb{N}$ such that $m\leq n$. Let
$d\in\mathbb{N}$ such that $n=m+d$. We define subtraction by
$$\begin{equation*} d=n-m \end{equation*}$$
where we called $d$ the difference between $n$ and $m$. We also have the
notion of a positive and negative integer. Recall that $x\in\mathbb{Z}$
is a positive integer if and only if
$x\in\left[\left(b,0\right)\right]$ for some $b\in\mathbb{N}$. Likewise
$x$ is a negative integer if and only if
$x\in\left[\left(0,b\right)\right]$ for some $b\in\mathbb{N}$. In order
to extend subtraction to the integers we need to consider a few things.
::: definition Definition 108. Negation of a natural number
Let $x\in\mathbb{Z}$ so that $x$ is a positive integer, i.e. a natural
number. We define the negation of $x$, denoted $-x$, by
$$\begin{equation*} -x=-1*x=\left(0,1\right)*x \end{equation*}$$
where $\left(0,1\right)\in\left[\left(0,1\right)\right]$. That is,
$\left(0,1\right)$ is a representative of the equivalence class
$\left[\left(0,1\right)\right]$, which contains all pairs
that represent $-1$.
:::
We can extend this result to include a general integer.
::: proposition Proposition 64. Negation of an integer
Let x\in\mathbb{Z} so that x\in\left[\left(a,b\right)\right] for
some a,b\in\mathbb{N}. We have that
$$\begin{equation*} -1*x=-1*\left(a,b\right)=\left(b,a\right) \end{equation*}$$
Proof:
Let $x\in\mathbb{Z}$ be as given by the hypothesis. We have that
$$\begin{align*} -1*x&=-1*\left(a,b\right)\\ &=\left(0,1\right)*\left(a,b\right)\\ &=\left(0*a+1*b,0*b+1*a\right)\\ &=\left(b,a\right) \end{align*}$$
As required. $\qed$ :::
In light of this, we can define subtraction for integers.
::: definition Definition 109. Integer subtraction
Let x,y\in\mathbb{Z}. We define the subtraction of y from x,
denoted x-y by
$$\begin{equation*} x-y=x+\left(-y\right)=x+\left(-1*y\right) \end{equation*}$$ :::
We immediately get that subtraction is closed, from the fact that both addition and multiplication are closed. We do not have associativity of subtraction in general.
::: proposition Proposition 65. Integer subtraction is not associative
There exist $x,y,z\in\mathbb{Z}$ such that
$$\begin{equation*} x-\left(y-z\right)\neq \left(x-y\right)-z \end{equation*}$$
Proof:
Let $x=2$, $y=4$ and $z=6$, so that
$x\in\left[2,0\right]$, $y\in\left[4,0\right]$ and $z\in\left[6,0\right]$.
We take the representatives $x=\left(2,0\right)$, $y=\left(4,0\right)$ and
$z=\left(6,0\right)$. We have that
$$\begin{align*} x-\left(y-z\right)&=\left(2,0\right)-\left(\left(4,0\right)-\left(6,0\right)\right)\\ &=\left(2,0\right)-\left(\left(4,0\right)+\left(-1*\left(6,0\right)\right)\right)\\ &=\left(2,0\right)-\left(\left(4,0\right)+\left(0,6\right)\right)\\ &=\left(2,0\right)-\left(4,6\right)\\ &=\left(2,0\right)+\left(-1*\left(4,6\right)\right)\\ &=\left(2,0\right)+\left(6,4\right)\\ &=\left(8,4\right) \end{align*}$$
On the other side we have
$$\begin{align*} \left(x-y\right)-z&=\left(\left(2,0\right)-\left(4,0\right)\right)-\left(6,0\right)\\ &=\left(\left(2,0\right)+\left(-1*\left(4,0\right)\right)\right)-\left(6,0\right)\\ &=\left(\left(2,0\right)+\left(0,4\right)\right)-\left(6,0\right)\\ &=\left(2,4\right)-\left(6,0\right)\\ &=\left(2,4\right)+\left(-1*\left(6,0\right)\right)\\ &=\left(2,4\right)+\left(0,6\right)\\ &=\left(2,10\right) \end{align*}$$
Clearly $\left(8,4\right)\neq \left(2,10\right)$. Indeed they are not
even equivalent. Suppose that $\left(8,4\right)\sim\left(2,10\right)$;
then we would have $8+10=4+2$. However $18\neq 6$. $\qed$
:::
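The counterexample can be replayed mechanically. This is a small sketch under the assumption that integers are modelled as tuples of naturals; the helper names `neg`, `sub` and `equiv` are illustrative only.

```python
def add(p, q):
    """(a,b)+(c,d) = (a+c, b+d)."""
    return (p[0] + q[0], p[1] + q[1])

def mul(p, q):
    """(a,b)*(c,d) = (ac+bd, ad+bc)."""
    (a, b), (c, d) = p, q
    return (a * c + b * d, a * d + b * c)

def neg(p):
    """Negation as multiplication by the representative (0,1) of -1."""
    return mul((0, 1), p)

def sub(p, q):
    """Integer subtraction: x - y = x + (-1 * y)."""
    return add(p, neg(q))

def equiv(p, q):
    """(a,b) ~ (c,d) if and only if a + d = c + b."""
    return p[0] + q[1] == q[0] + p[1]

x, y, z = (2, 0), (4, 0), (6, 0)   # representatives of 2, 4 and 6
left = sub(x, sub(y, z))           # x - (y - z)
right = sub(sub(x, y), z)          # (x - y) - z
```

Here `left` represents $4$ and `right` represents $-8$, which agrees with the familiar arithmetic $2-(4-6)$ and $(2-4)-6$.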
We can also immediately see the following result, which allows us to formally show that subtraction is an inverse to addition.
::: {#prop:IntegerAdditiveInverse .proposition} Proposition 66. Subtracting an integer from itself gives zero
Let x\in\mathbb{Z}. We have that
$$\begin{equation} x-x=0 \end{equation*}$$*
Proof:
Let $x\in\mathbb{Z}$ where $x\in\left[a,b\right]$ for some
$a,b\in\mathbb{N}$. We have
$$\begin{align*} x-x&=\left(a,b\right)-\left(a,b\right)\\ &=\left(a,b\right)+\left(b,a\right)\\ &=\left(a+b,b+a\right) \end{align*}$$
It is left to show that $\left(a+b,b+a\right)\sim\left(0,0\right)$.
By definition this holds if and only if
$$\begin{equation*} \left(a+b\right)+0=0+\left(b+a\right) \end{equation*}$$
which is true as $a+b=b+a$ by commutativity of addition for natural numbers.
The result is shown. $\qed$ :::
The cancellation laws
We can now deduce that the cancellation laws also extend to the integers.
::: theorem Theorem 25. The cancellation laws
Let $x,y,z\in\mathbb{Z}$.
-
If $x+y=x+z$ then we have $y=z$.
-
For $x\neq 0$, if $xy=xz$ then we have that $y=z$.
Proof:
-
If $x+y=x+z$ then we have $y=z$: Let $x,y,z\in\mathbb{Z}$. We have that
$$\begin{align*} x+y&=x+z\\ \Rightarrow -x+x+y&=-x+x+z,\ \text{Adding the negative of } x \text{ to both sides}\\ \Rightarrow \left(-x+x\right)+y&=\left(-x+x\right)+z,\ \text{By associativity of integer addition}\\ \Rightarrow 0+y&=0+z,\ \text{By proposition \ref{prop:IntegerAdditiveInverse}}\\ \Rightarrow y&=z \end{align*}$$
-
For $x\neq 0$, if $xy=xz$ then we have that $y=z$: Let $x,y,z\in\mathbb{Z}$ where $x\neq 0$. Suppose that $x\in\left[a,b\right]$, $y\in\left[c,d\right]$ and $z\in\left[e,f\right]$. We have
$$\begin{align*} xy&=\left(a,b\right)\left(c,d\right)=\left(ac+bd,ad+bc\right)\\ xz&=\left(a,b\right)\left(e,f\right)=\left(ae+bf,af+be\right) \end{align*}$$
Now assume $xy=xz$. Then we have that $\left(ac+bd,ad+bc\right)\sim\left(ae+bf,af+be\right)$, which is to say
$$\begin{equation*} ac+bd+af+be=ae+bf+ad+bc \end{equation*}$$
Observe that
$$\begin{align*} ac+bd+af+be&=a\left(c+f\right)+b\left(d+e\right)\\ ae+bf+ad+bc&=a\left(e+d\right)+b\left(f+c\right) \end{align*}$$
Which gives
$$\begin{equation*} a\left(c+f\right)+b\left(d+e\right)=a\left(e+d\right)+b\left(f+c\right) \end{equation*}$$
As $x\neq 0$ we have $a\neq b$, so there are two cases to consider, $a<b$ and $a>b$. Firstly suppose that $a<b$. Then we can write $b=a+h$ for some $h>0$; this is well-defined as $a,b\in\mathbb{N}$. We then have
$$\begin{align*} a\left(c+f\right)+b\left(d+e\right)&=a\left(e+d\right)+b\left(f+c\right)\\ a\left(c+f\right)+\left(a+h\right)\left(d+e\right)&=a\left(e+d\right)+\left(a+h\right)\left(f+c\right)\\ a\left(c+f\right)+a\left(d+e\right)+h\left(d+e\right)&=a\left(e+d\right)+a\left(f+c\right)+h\left(f+c\right)\\ a\left(d+e\right)+h\left(d+e\right)&=a\left(e+d\right)+h\left(f+c\right),\text{ Cancelling }a\left(c+f\right)\\ h\left(d+e\right)&=h\left(f+c\right),\text{ Cancelling }a\left(d+e\right)\\ d+e&=f+c,\text{ Cancelling }h \end{align*}$$
Now as $d+e=f+c$ we have $c+f=e+d$, which is the definition of $\left(c,d\right)\sim\left(e,f\right)$, and this is the same as saying $y=z$.
Now if $a>b$ then we write $a=b+h$ for some $h>0$, again being well-defined as $a,b\in\mathbb{N}$. Thus
$$\begin{align*} a\left(c+f\right)+b\left(d+e\right)&=a\left(e+d\right)+b\left(f+c\right)\\ \left(b+h\right)\left(c+f\right)+b\left(d+e\right)&=\left(b+h\right)\left(e+d\right)+b\left(f+c\right)\\ b\left(c+f\right)+h\left(c+f\right)+b\left(d+e\right)&=b\left(e+d\right)+h\left(e+d\right)+b\left(f+c\right)\\ h\left(c+f\right)+b\left(d+e\right)&=b\left(e+d\right)+h\left(e+d\right),\text{ Cancelling }b\left(c+f\right)\\ h\left(c+f\right)&=h\left(e+d\right),\text{ Cancelling }b\left(d+e\right)\\ c+f&=e+d,\text{ Cancelling }h \end{align*}$$
As $c+f=e+d$ we have, by the same reasoning as before, that $y=z$.
The result is shown. $\qed$ :::
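As a sanity check of the cancellation laws, and of why the hypothesis $x\neq 0$ is needed, one can search small representatives exhaustively. This is a sketch under the assumption that integers are tuples of naturals; `is_zero` and the other helper names are illustrative.

```python
from itertools import product

def add(p, q):
    """(a,b)+(c,d) = (a+c, b+d)."""
    return (p[0] + q[0], p[1] + q[1])

def mul(p, q):
    """(a,b)*(c,d) = (ac+bd, ad+bc)."""
    (a, b), (c, d) = p, q
    return (a * c + b * d, a * d + b * c)

def equiv(p, q):
    """(a,b) ~ (c,d) if and only if a + d = c + b."""
    return p[0] + q[1] == q[0] + p[1]

def is_zero(p):
    """(a,b) represents 0 exactly when a = b."""
    return p[0] == p[1]

pairs = list(product(range(4), repeat=2))
triples = list(product(pairs, repeat=3))

# x + y = x + z implies y = z.
additive = all(
    equiv(y, z) for x, y, z in triples if equiv(add(x, y), add(x, z))
)
# For x != 0, xy = xz implies y = z.
multiplicative = all(
    equiv(y, z)
    for x, y, z in triples
    if not is_zero(x) and equiv(mul(x, y), mul(x, z))
)
# With x = 0 (represented by (1,1)) cancellation fails: 0*2 = 0*0 but 2 != 0.
zero_fails = (
    equiv(mul((1, 1), (2, 0)), mul((1, 1), (0, 0)))
    and not equiv((2, 0), (0, 0))
)
```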
Extending the summation and product notations to integers
Summation and product notation has been defined on the naturals. As with the theme of this section the notations extend in a natural way to integers. As before we need to define a few things.
Let $z\in\mathbb{Z}^{n+m+1}$ be an ordered $n+m+1$ tuple of integers
where $z=\left(z_{-m},z_{-m+1},\dots,z_{-1},z_0,z_1,\dots, z_n\right)$
and define
$\mathbb{Z}_m^n=\left\{-m,-m+1,-m+2,\dots,-1,0,1,\dots,n-1,n\right\}$.
Define $f:\mathbb{Z}_m^n\rightarrow\mathbb{Z}$ by
$$\begin{align*} f:\mathbb{Z}_m^n&\rightarrow \mathbb{Z}\\ i&\mapsto f\left(i\right)=z_i \end{align*}$$
As before, $f$ simply gets the value of $z_i$ from the ordered
tuple $z$.
::: definition Definition 110. Summation notation for the integers
Let $z\in\mathbb{Z}^{n+m+1}$ be an ordered $n+m+1$ tuple of integers where
$z=\left(z_{-m},z_{-m+1},\dots,z_{-1},z_0,z_1,\dots, z_n\right)$.
Define $\mathbb{Z}_m^n$ by
$\mathbb{Z}_m^n=\left\{-m,-m+1,-m+2,\dots,-1,0,1,\dots,n-1,n\right\}$.
Let $f:\mathbb{Z}_m^n\rightarrow\mathbb{Z}$ be defined by
$$\begin{align*} f:\mathbb{Z}_m^n&\rightarrow\mathbb{Z}\\ i&\mapsto f\left(i\right)=z_i \end{align*}$$
We define the summation notation for integers by
$$\begin{equation*} \sum_{i=-m}^n f\left(i\right)=f\left(-m\right)+f\left(-m+1\right)+\dots+f\left(-1\right)+f\left(0\right)+f\left(1\right)+\dots+f\left(n\right) \end{equation*}$$
Alternatively this is written
$$\begin{equation*} \sum_{i=-m}^n z_i = z_{-m}+z_{-m+1}+\dots+z_{-1}+z_0+z_1+\dots+z_n \end{equation*}$$
We have that $i$ is called the index of summation, that $i=-m$ is
the starting index of the summation, and that $n$ is the ending index of the
summation. If the tuple $z$ is empty then we define the summation to be $0$ and
call it an empty sum.
We can also define the summation of some subset of \mathbb{Z}_m^n
which allows for starting a summation at some starting point other than
i=-m. Let T\subseteq\mathbb{Z}_m^n. We define the summation over the
set T by
$$\begin{equation} \sum_{i\in T} z_i \end{equation*}$$*
If we have a mapping g:\mathbb{Z}\rightarrow\mathbb{Z} we can define
a summation over g by
$$\begin{equation} \sum_{i\in T} g\left(z_i\right) \end{equation*}$$*
Finally we can define a summation over a predicate P\left(i\right)
for i\in T by
$$\begin{equation} \sum_{P\left(i\right)}g\left(z_i\right) \end{equation*}$$*
where we take the sum of the $g\left(z_i\right)$ for the $i$ that
satisfy the predicate $P$. We note that if we have $k>n$ for some
$k\in\mathbb{Z}$ then the sum is empty, that is
$$\begin{equation*} \sum_{i=k}^n z_i=0 \end{equation*}$$ :::
The properties shown for summations with natural numbers also extend to the integer version.
::: proposition Proposition 67. Properties of summation notation
Let $n,m\in\mathbb{Z}$ such that $m<n$. Let $s,t\in\mathbb{Z}^{n+m+1}$
and let $c\in\mathbb{Z}$.
Let $a,b\in\mathbb{Z}$ with $-m<a<b<n$. Define $A=\left\{a,a+1,\dots,b-1,b\right\}$ and
define
$$\begin{equation*} B=\mathbb{Z}_m^n\setminus A=\left\{-m,-m+1,\dots,a-1,b+1,\dots,n-1,n\right\} \end{equation*}$$
so that $A\cup B =\mathbb{Z}_m^n$. Let
$k\in \mathbb{Z}$ be the starting index of a summation such that $k<n$, and let
$d\in\mathbb{Z}$ with $k\leq d<n$. The following properties hold.
-
$\displaystyle \sum_{i=-m}^n s_i = \sum_{i\in A} s_i +\sum_{i\in B} s_i =\sum_{i=-m}^{-1} s_i + \sum_{i=0}^{n}s_i$
-
$\displaystyle \sum_{i=k}^n s_i = \sum_{i=k}^d s_i + \sum_{i=d+1}^n s_i$
-
$\displaystyle\sum_{i=k}^n c*s_i = c*\sum_{i=k}^n s_i$ for some $c\in\mathbb{Z}$
-
$\displaystyle\sum_{i=k}^n c = c\left(n+1-k\right)$ for some $c\in\mathbb{Z}$
-
$\displaystyle\sum_{i=k}^n \left(s_i+t_i\right) = \sum_{i=k}^n s_i + \sum_{i=k}^n t_i$
Proof:
-
$\displaystyle \sum_{i=-m}^n s_i = \sum_{i\in A} s_i +\sum_{i\in B} s_i =\sum_{i=-m}^{-1} s_i + \sum_{i=0}^{n}s_i$: This follows by applying the definition. We have that
$$\begin{align*} \sum_{i=-m}^n s_i&=s_{-m}+s_{-m+1}+s_{-m+2}+\dots+s_{-1}+s_0+s_1+\dots+s_{n-1}+s_{n}\\ &=\left(s_{-m}+s_{-m+1}+s_{-m+2}+\dots+s_{-1}\right)+\left(s_0+s_1+\dots+s_{n-1}+s_{n}\right)\\ &=\sum_{i=-m}^{-1} s_i + \sum_{i=0}^n s_i \end{align*}$$
Additionally note that
$$\begin{align*} \sum_{i=-m}^n s_i&=s_{-m}+s_{-m+1}+s_{-m+2}+\dots+s_{-1}+s_0+s_1+\dots+s_{n-1}+s_{n}\\ &= \left(s_{-m}+s_{-m+1}+s_{-m+2}+\dots+s_{a-2}+s_{a-1}\right)+\left(s_a+s_{a+1}+\dots+s_{b-1}+s_b\right)\\ &\quad+\left(s_{b+1}+s_{b+2}+\dots+s_{n-1}+s_n\right)\\ &= \left(s_{-m}+s_{-m+1}+s_{-m+2}+\dots+s_{a-2}+s_{a-1}\right) + \left(s_{b+1}+s_{b+2}+\dots+s_{n-1}+s_n\right)\\ &\quad+ \left(s_a+s_{a+1}+\dots+s_{b-1}+s_b\right)\\ &= \sum_{i\in B} s_i + \sum_{i\in A} s_i = \sum_{i\in A} s_i + \sum_{i\in B} s_i \end{align*}$$
-
$\displaystyle \sum_{i=k}^n s_i = \sum_{i=k}^d s_i + \sum_{i=d+1}^n s_i$: The proof is similar to part 1, replacing $-m$ by $k$ and splitting the sum after the index $d$.
-
$\displaystyle\sum_{i=k}^n c*s_i = c*\sum_{i=k}^n s_i$ for some $c\in\mathbb{Z}$: We have by definition that
$$\begin{equation*} \sum_{i=k}^n c*s_i=c*s_k+c*s_{k+1}+c*s_{k+2}+\dots+c*s_n \end{equation*}$$
By multiplication distributing over addition we have
$$\begin{equation*} \sum_{i=k}^n c*s_i=c*s_k+c*s_{k+1}+c*s_{k+2}+\dots+c*s_n=c*\left(s_k+s_{k+1}+\dots+s_n\right)=c*\sum_{i=k}^n s_i \end{equation*}$$
-
$\displaystyle\sum_{i=k}^n c = c\left(n+1-k\right)$ for some $c\in\mathbb{Z}$: If $n>0$ and $k\geq 0$ then the result is the same as for natural numbers. So suppose that $k<0$. Consider the following set of the indices given by
$$\begin{equation*} S=\left\{k,k+1,k+2,\dots,-1,0,1,\dots,n-1,n\right\} \end{equation*}$$
We claim that the cardinality of $S$ is $n+1-k$. Indeed consider the following mapping
$$\begin{align*} f:S&\rightarrow \mathbb{N}\\ s&\mapsto f\left(s\right)=s-k \end{align*}$$
Define the mapping $g:S\rightarrow\mathop{\mathrm{Image}}\left(f\right)$ by $g\left(s\right)=f\left(s\right)$. We show that $g$ is a bijection. Suppose that $g\left(x\right)=g\left(y\right)$ for some $x,y\in S$ then
$$\begin{align*} g\left(x\right)&=g\left(y\right)\\ x-k&=y-k\\ x&=y \end{align*}$$
showing injectivity. Now as $g$ is a mapping from $S$ to the image of $f$ we have by proposition 15{reference-type="ref" reference="prob:RestOfCodomainToImageIsSurjective"} that $g$ is surjective. Hence we conclude that $g$ is a bijection.
Now we have that
$$\begin{align*} \mathop{\mathrm{Image}}\left(f\right)&=\left\{f\left(x\right):x\in S\right\}\\ &= \left\{k-k,\left(k+1\right)-k,\left(k+2\right)-k,\dots,-1-k,0-k,1-k,\dots,\left(n-1\right)-k,n-k\right\}\\ &=\left\{0,1,2,\dots,-1-k,-k,1-k,\dots,n-1-k,n-k\right\} \end{align*}$$
Hence $\left|S\right|=\left|\mathop{\mathrm{Image}}\left(f\right)\right|=n-k+1$, and so the sum adds $c$ to itself $n+1-k$ times. This is to say
$$\begin{equation*} \sum_{i=k}^n c= c\left(n+1-k\right) \end{equation*}$$
-
$\displaystyle\sum_{i=k}^n \left(s_i+t_i\right) = \sum_{i=k}^n s_i + \sum_{i=k}^n t_i$: This follows by the definition, together with associativity and commutativity of integer addition. We have
$$\begin{align*} \sum_{i=k}^n \left(s_i+t_i\right)&= \left(s_k+t_k\right)+\left(s_{k+1}+t_{k+1}\right)+\dots\\ &\quad+\left(s_{-1}+t_{-1}\right)+\left(s_{0}+t_{0}\right)+\left(s_{1}+t_{1}\right)+\dots+\left(s_{n-1}+t_{n-1}\right)+\left(s_{n}+t_{n}\right)\\ &=\left(s_k+s_{k+1}+\dots+s_{-1}+s_0+s_1+\dots+s_{n-1}+s_n\right)\\ &\quad+\left(t_k+t_{k+1}+\dots+t_{-1}+t_0+t_1+\dots+t_{n-1}+t_n\right)\\ &= \sum_{i=k}^n s_i + \sum_{i=k}^n t_i \end{align*}$$
$\qed$ :::
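The summation properties can be illustrated numerically. The sketch below uses Python's built-in integers and a dictionary keyed by the (possibly negative) index $i$ to stand in for the tuple $z$; the function `int_sum` and the sample values of `m`, `n`, `k`, `d` and `c` are assumptions chosen for illustration.

```python
def int_sum(z, start, stop):
    """Sum z[i] for start <= i <= stop; an empty range gives the empty sum 0."""
    return sum(z[i] for i in range(start, stop + 1))

m, n = 3, 4
s = {i: i * i - 5 for i in range(-m, n + 1)}   # indices -3, ..., 4
t = {i: 2 * i + 1 for i in range(-m, n + 1)}
c, k, d = -7, -2, 1                            # constant, start index, split index

# Property 1: split the sum at the boundary between negative and non-negative indices.
split_at_zero = int_sum(s, -m, n) == int_sum(s, -m, -1) + int_sum(s, 0, n)
# Property 2: split the sum after index d.
split_at_d = int_sum(s, k, n) == int_sum(s, k, d) + int_sum(s, d + 1, n)
# Property 3: a constant factor can be pulled out of the sum.
scaling = int_sum({i: c * s[i] for i in s}, k, n) == c * int_sum(s, k, n)
# Property 4: summing the constant c over k, ..., n gives c(n + 1 - k).
constant = int_sum({i: c for i in s}, k, n) == c * (n + 1 - k)
# Property 5: the sum of s_i + t_i is the sum of the s_i plus the sum of the t_i.
additive = (
    int_sum({i: s[i] + t[i] for i in s}, k, n)
    == int_sum(s, k, n) + int_sum(t, k, n)
)
# A starting index above the ending index gives the empty sum.
empty = int_sum(s, n + 1, n)
```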
We make a similar definition for product notation.
::: definition Definition 111. Product notation for the integers
Let $z\in\mathbb{Z}^{n+m+1}$ be an ordered $n+m+1$ tuple of integers where
$z=\left(z_{-m},z_{-m+1},\dots,z_{-1},z_0,z_1,\dots, z_n\right)$.
Define $\mathbb{Z}_m^n$ by
$\mathbb{Z}_m^n=\left\{-m,-m+1,-m+2,\dots,-1,0,1,\dots,n-1,n\right\}$.
Let $f:\mathbb{Z}_m^n\rightarrow\mathbb{Z}$ be defined by
$$\begin{align*} f:\mathbb{Z}_m^n&\rightarrow\mathbb{Z}\\ i&\mapsto f\left(i\right)=z_i \end{align*}$$
We define the product notation for integers by
$$\begin{equation*} \prod_{i=-m}^n f\left(i\right)=f\left(-m\right)*f\left(-m+1\right)*\dots*f\left(-1\right)*f\left(0\right)*f\left(1\right)*\dots*f\left(n\right) \end{equation*}$$
Alternatively this is written
$$\begin{equation*} \prod_{i=-m}^n z_i = z_{-m}*z_{-m+1}*\dots*z_{-1}*z_0*z_1*\dots*z_n \end{equation*}$$
We have that $i$ is called the index of the product, that $i=-m$ is
the starting index of the product, and that $n$ is the ending index of the
product. If the tuple $z$ is empty then we define the product to be $1$ and
call it an empty product.
We can also define the product of some subset of \mathbb{Z}_m^n which
allows for starting a product at some starting point other than i=-m.
Let T\subseteq\mathbb{Z}_m^n. We define the product over the set T
by
$$\begin{equation} \prod_{i\in T} z_i \end{equation*}$$*
If we have a mapping g:\mathbb{Z}\rightarrow\mathbb{Z} we can define
a product over g by
$$\begin{equation} \prod_{i\in T} g\left(z_i\right) \end{equation*}$$*
Finally we can define a product over a predicate P\left(i\right) for
i\in T by
$$\begin{equation} \prod_{P\left(i\right)}g\left(z_i\right) \end{equation*}$$*
where we take the product of the $g\left(z_i\right)$ for the $i$ that
satisfy the predicate $P$. We note that if we have $k>n$ for some
$k\in\mathbb{Z}$ then the product is empty, that is
$$\begin{equation*} \prod_{i=k}^n z_i=1 \end{equation*}$$ :::
::: proposition Proposition 68. Properties of product notation
Let $n,m\in\mathbb{Z}$ such that $m<n$. Let $s,t\in\mathbb{Z}^{n+m+1}$
and let $c\in\mathbb{Z}$. Let $a,b\in\mathbb{Z}$ so that $-m<a<b<n$.
Define $A=\left\{a,a+1,\dots,b-1,b\right\}$ and define
$$\begin{equation*} B=\mathbb{Z}_m^n\setminus A=\left\{-m,-m+1,\dots,a-1,b+1,\dots,n-1,n\right\} \end{equation*}$$
so that $A\cup B =\mathbb{Z}_m^n$. Let
$k\in \mathbb{Z}$ be the lower index of the product and let $d\in\mathbb{Z}$ with $k\leq d<n$.
We have that the following properties hold.
-
$\displaystyle \prod_{i=-m}^n s_i = \prod_{i\in A} s_i * \prod_{i\in B} s_i = \prod_{i=-m}^{-1} s_i * \prod_{i=0}^n s_i$
-
$\displaystyle \prod_{i=k}^n s_i = \prod_{i=k}^d s_i * \prod_{i=d+1}^n s_i$
-
$\displaystyle\prod_{i=k}^n s_i*t_i = \prod_{i=k}^n s_i * \prod_{i=k}^n t_i$
Proof:
-
$\displaystyle \prod_{i=-m}^n s_i = \prod_{i\in A} s_i *\prod_{i\in B} s_i = \prod_{i=-m}^{-1} s_i * \prod_{i=0}^n s_i$: This follows by the definition of the product. We have that
$$\begin{align*} \prod_{i=-m}^n s_i &= s_{-m}*s_{-m+1}*s_{-m+2}*\dots*s_{-1}*s_0*s_1*\dots*s_{n-1}*s_n\\ &=\left(s_{-m}*s_{-m+1}*s_{-m+2}*\dots*s_{-1}\right)*\left(s_0*s_1*\dots*s_{n-1}*s_n\right)\\ &=\prod_{i=-m}^{-1}s_i*\prod_{i=0}^n s_i \end{align*}$$
Likewise we have
$$\begin{align*} \prod_{i=-m}^n s_i &= s_{-m}*s_{-m+1}*s_{-m+2}*\dots*s_{-1}*s_0*s_1*\dots*s_{n-1}*s_n\\ &= \left(s_{-m}*s_{-m+1}*s_{-m+2}*\dots*s_{a-2}*s_{a-1}\right)*\left(s_a*s_{a+1}*\dots*s_{b-1}*s_b\right)\\ &\quad *\left(s_{b+1}*s_{b+2}*\dots*s_{n-1}*s_n\right)\\ &= \left(s_{-m}*s_{-m+1}*s_{-m+2}*\dots*s_{a-2}*s_{a-1}\right) * \left(s_{b+1}*s_{b+2}*\dots*s_{n-1}*s_n\right)\\ &\quad *\left(s_a*s_{a+1}*\dots*s_{b-1}*s_b\right)\\ &=\prod_{i\in B}s_i * \prod_{i\in A} s_i = \prod_{i\in A} s_i * \prod_{i\in B} s_i \end{align*}$$
-
$\displaystyle \prod_{i=k}^n s_i = \prod_{i=k}^d s_i * \prod_{i=d+1}^n s_i$: The proof is similar to part 1. We replace $-m$ with $k$ and split the product after the index $d$.
-
$\displaystyle\prod_{i=k}^n s_i*t_i = \prod_{i=k}^n s_i * \prod_{i=k}^n t_i$: Observe that, by associativity and commutativity of integer multiplication,
$$\begin{align*} \prod_{i=k}^n s_i*t_i&=s_{k}*t_{k}*s_{k+1}*t_{k+1}*s_{k+2}*t_{k+2}*\dots*s_{-1}*t_{-1}*s_{0}*t_{0}*s_{1}*t_{1}*\dots*s_{n-1}*t_{n-1}*s_{n}*t_{n}\\ &=\left(s_{k}*s_{k+1}*s_{k+2}*\dots*s_{-1}*s_{0}*s_{1}*\dots*s_{n-1}*s_{n}\right)\\ &\quad *\left(t_{k}*t_{k+1}*t_{k+2}*\dots*t_{-1}*t_{0}*t_{1}*\dots*t_{n-1}*t_{n}\right)\\ &=\prod_{i=k}^n s_i * \prod_{i=k}^n t_i \end{align*}$$
$\qed$ :::
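The product properties can be illustrated in the same way as the summation properties. This sketch uses the standard-library function `math.prod` on Python integers; the helper `int_prod` and the sample values are assumptions made for illustration.

```python
from math import prod

def int_prod(z, start, stop):
    """Multiply z[i] for start <= i <= stop; an empty range gives the empty product 1."""
    return prod(z[i] for i in range(start, stop + 1))

m, n = 2, 3
s = {i: i - 4 for i in range(-m, n + 1)}       # indices -2, ..., 3
t = {i: 2 * i + 1 for i in range(-m, n + 1)}
k, d = -1, 1                                   # lower index and split index

# Property 1: split the product between negative and non-negative indices.
split_at_zero = int_prod(s, -m, n) == int_prod(s, -m, -1) * int_prod(s, 0, n)
# Property 2: split the product after index d.
split_at_d = int_prod(s, k, n) == int_prod(s, k, d) * int_prod(s, d + 1, n)
# Property 3: the product of s_i * t_i is the product of the s_i times the product of the t_i.
multiplicative = (
    int_prod({i: s[i] * t[i] for i in s}, k, n)
    == int_prod(s, k, n) * int_prod(t, k, n)
)
# A lower index above the upper index gives the empty product.
empty = int_prod(s, n + 1, n)
```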
We can now consider extending the result of proposition
39{reference-type="ref"
reference="prop:NaturalsHaveNoZeroDivisors"}. That is, if the product
$ab=0$ for $a,b\in\mathbb{Z}$ then at least one of $a$ or $b$ is zero.
::: {#prop:IntegersHaveNoZeroDivisors .proposition} Proposition 69. Product of two integers being zero implies one of the numbers is zero
Let x,y\in\mathbb{Z}. If xy=0 then at least one of x or y is
zero.
Proof:
Let $x,y\in\mathbb{Z}$ with $xy=0$. If $x=y=0$ then the result is trivial. So
suppose that $x=\left(a,b\right)$ and $y=\left(c,d\right)$, and moreover
suppose $y\neq 0$, so that $c\neq d$. By definition of integer multiplication and by assumption we have that
$$\begin{equation*} xy=\left(a,b\right)*\left(c,d\right)=\left(ac+bd,ad+bc\right)\sim\left(0,0\right) \end{equation*}$$
We have that
$$\begin{align*} \left(ac+bd,ad+bc\right)&\sim\left(0,0\right) \iff ac+bd+0=ad+bc+0\\ \Rightarrow ac+bd&=ad+bc \end{align*}$$
Now suppose without loss of generality that $c>d$. Then
there exists a non-zero $p\in\mathbb{N}$ such that $d+p=c$. We hence have
$$\begin{align*} ac+bd&=ad+bc\\ a\left(d+p\right)+bd&=ad+b\left(d+p\right)\\ ad+ap+bd&=ad+bd+bp\\ ap&=bp\\ a&=b ,\text{ By the cancellation laws for the natural numbers}\\ a+0&=b+0 \Rightarrow \left(a,b\right)\sim\left(0,0\right) \end{align*}$$
A similar argument applies for $c<d$.
Hence $x=0$. A similar argument assuming $x\neq 0$ shows that $y=0$.
The result is shown. $\qed$
:::
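The absence of zero divisors can be sanity-checked by searching small representatives for a counterexample. This is a sketch assuming the tuple model of integers as pairs of naturals; `is_zero` and `counterexamples` are illustrative names.

```python
from itertools import product

def mul(p, q):
    """(a,b)*(c,d) = (ac+bd, ad+bc)."""
    (a, b), (c, d) = p, q
    return (a * c + b * d, a * d + b * c)

def is_zero(p):
    """(a,b) ~ (0,0) exactly when a = b."""
    return p[0] == p[1]

pairs = list(product(range(5), repeat=2))

# Look for non-zero x and y whose product is zero; the proposition says none exist.
counterexamples = [
    (x, y)
    for x in pairs
    for y in pairs
    if is_zero(mul(x, y)) and not is_zero(x) and not is_zero(y)
]
```

The search comes back empty on this range, as the proposition predicts.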
Extending the rules for inequalities to the integers
For the natural numbers, we were able to derive some rules for how
inequalities behave; we can extend those results to the integers. Before
we do so we have an additional consideration. As
$\mathbb{N}\subset\mathbb{Z}$, we can view every non-zero
$n\in\mathbb{N}$ as a positive integer in $\mathbb{Z}$. Hence for
positive $a,b,c\in\mathbb{Z}$ the results from proposition
48{reference-type="ref"
reference="prop:InequalityNaturalNumbers"} instantly extend to those
integers.
To extend the results fully we need to consider negative integers as
well. Consider $x=-3$ and $y=6$; clearly $x<y$. Now consider $-1*x = 3$
and $-1*y=-6$; we have that $-1*x> -1*y$. This can be shown in general.
::: {#prop:MultiplicationByNegativeOneFlipsInequalitySign .proposition}
Proposition 70. Multiplication by -1 changes the inequality sign
Let $x,y\in\mathbb{Z}$. We have the following
-
If $x<y$ then $-x>-y$
-
If $x\leq y$ then $-x\geq -y$
-
If $x>y$ then $-x<-y$
-
If $x\geq y$ then $-x\leq-y$
Proof:
-
If $x<y$ then $-x>-y$: Let $x,y\in\mathbb{Z}$ so that $x<y$. There are three cases to consider
-
$x\geq 0$ and $y\geq 0$
-
$x<0$ and $y\geq 0$
-
$x<0$ and $y<0$
-
$x\geq 0$ and $y\geq 0$: Suppose that $x\geq 0$ and $y\geq 0$. Then $x\in\left[\left(a,0\right)\right]$ for some $a\in\mathbb{N}$ and $y\in\left[\left(b,0\right)\right]$ for some $b\in\mathbb{N}$. As $x<y$ we must have $a+0<b+0\Rightarrow a<b$.
We have that
$$\begin{align*} -x=-1*x=-1*\left(a,0\right)&=\left(0,a\right)\\ -y=-1*y=-1*\left(b,0\right)&=\left(0,b\right) \end{align*}$$
Now, by proposition 48{reference-type="ref" reference="prop:InequalityNaturalNumbers"} part 2. we know that $a<b$ is the same as $b>a$. Now we have $-x>-y$ by definition of greater than for integers as we have
$$\begin{equation*} -x>-y \iff 0+b>a+0 \end{equation*}$$
-
$x<0$ and $y\geq 0$: Now suppose that $x<0$ and $y\geq 0$. Then we have that $x\in\left[\left(0,a\right)\right]$ and $y\in\left[\left(b,0\right)\right]$ where $a,b\in\mathbb{N}$. We have
$$\begin{align*} -x=-1*x=-1*\left(0,a\right)&=\left(a,0\right)\\ -y=-1*y=-1*\left(b,0\right)&=\left(0,b\right) \end{align*}$$
Now, $-x>-y$ holds if and only if
$$\begin{equation*} a+b>0+0 \end{equation*}$$
As $a,b\in\mathbb{N}$ and $x<0 \implies a> 0$, we conclude that $a+b\geq a > 0$ and so $-x>-y$.
-
$x<0$ and $y<0$: Now suppose that $x<0$ and $y< 0$. Then $x\in\left[\left(0,a\right)\right]$ for some $a\in\mathbb{N}$ and $y\in\left[\left(0,b\right)\right]$ for some $b\in\mathbb{N}$. As $x<y$ we have that $b<a$, which is the same as $a>b$.
We have that
$$\begin{align*} -x=-1*x=-1*\left(0,a\right)&=\left(a,0\right)\\ -y=-1*y=-1*\left(0,b\right)&=\left(b,0\right) \end{align*}$$
Applying the definition of $>$ to $-x$ and $-y$ gives
$$\begin{equation*} -x>-y \iff a>b \end{equation*}$$
Which we know to be true. Hence $-x>-y$.
This shows part 1.
-
-
If $x\leq y$ then $-x\geq -y$: If $x<y$ then we apply part 1. to get $-x>-y$, from which it follows that $-x\geq -y$ by definition. It is left to check the case $x=y$. This is clear, however, as $x=y\implies -x=-y$ and so $-x\geq -y$.
-
If $x>y$ then $-x<-y$: The proof of this part is similar to part 1. As in part 1. there are three cases to consider
-
$x\geq 0$ and $y\geq 0$
-
$x\geq 0$ and $y< 0$
-
$x<0$ and $y<0$
-
$x\geq 0$ and $y\geq 0$: Suppose that $x\geq 0$ and $y\geq 0$. Then $x\in\left[\left(a,0\right)\right]$ for some $a\in\mathbb{N}$ and $y\in\left[\left(b,0\right)\right]$ for some $b\in\mathbb{N}$. As $x>y$ we must have $a+0>b+0\Rightarrow a>b$.
We have that
$$\begin{align*} -x=-1*x=-1*\left(a,0\right)&=\left(0,a\right)\\ -y=-1*y=-1*\left(b,0\right)&=\left(0,b\right) \end{align*}$$
Now, by proposition 48{reference-type="ref" reference="prop:InequalityNaturalNumbers"} part 2. we know that $a>b$ is the same as $b<a$. Now we have $-x<-y$ by definition of less than for integers as we have
$$\begin{equation*} -x<-y \iff 0+b<a+0 \end{equation*}$$
-
$x\geq 0$ and $y<0$: Now suppose that $x\geq 0$ and $y< 0$. Then we have that $x\in\left[\left(a,0\right)\right]$ and $y\in\left[\left(0,b\right)\right]$ where $a,b\in\mathbb{N}$. We have
$$\begin{align*} -x=-1*x=-1*\left(a,0\right)&=\left(0,a\right)\\ -y=-1*y=-1*\left(0,b\right)&=\left(b,0\right) \end{align*}$$
Now, $-x<-y$ holds if and only if
$$\begin{equation*} 0+0<a+b \end{equation*}$$
As $a,b\in\mathbb{N}$ and $y<0 \implies b> 0$, we conclude that $0<b\leq a+b$ and so $-x<-y$.
-
$x<0$ and $y<0$: Now suppose that $x<0$ and $y< 0$. Then $x\in\left[\left(0,a\right)\right]$ for some $a\in\mathbb{N}$ and $y\in\left[\left(0,b\right)\right]$ for some $b\in\mathbb{N}$. As $x>y$ we have that $0+b>a+0\Rightarrow b>a$, which is the same as $a<b$. We have
$$\begin{align*} -x=-1*x=-1*\left(0,a\right)&=\left(a,0\right)\\ -y=-1*y=-1*\left(0,b\right)&=\left(b,0\right) \end{align*}$$
Applying the definition of $<$ to $-x$ and $-y$ gives
$$\begin{equation*} -x<-y \iff a+0<b+0 \iff a<b \end{equation*}$$
Which we know to be true. Hence $-x<-y$.
-
-
If
x\geq ythen-x\leq-y:If
$x>y$ we apply part 3. So instead suppose $x=y$; then $x=y\Rightarrow -x=-y$ and so by definition we have $-x\leq -y$.
The result is shown. $\qed$ :::
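The sign flip can also be checked mechanically on the pair representation used in the proof. The following is a minimal Python sketch, assuming a pair `(a, b)` of naturals stands for the integer `a - b`; the helper names `less` and `neg` are illustrative and do not appear in the text.

```python
from itertools import product

def less(p, q):
    """(a, b) < (c, d) exactly when a + d < b + c."""
    (a, b), (c, d) = p, q
    return a + d < b + c

def neg(p):
    """Multiplying by -1 swaps the two components of the pair."""
    a, b = p
    return (b, a)

# All pairs with small components; each integer appears several times.
pairs = list(product(range(6), repeat=2))

# Parts 1 and 3: a strict inequality is reversed by negation.
flips = all(
    less(neg(y), neg(x))
    for x in pairs
    for y in pairs
    if less(x, y)
)
print(flips)
```

A finite check like this is not a proof, but it is a quick way to catch a wrong sign in the definitions.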
This proposition will play a big role in the following proposition that extends the results for the rules of inequalities to the integers.
::: {#prop:InequalityIntegerNumbers .proposition} Proposition 71. Properties of inequalities for the integers
Let x,y,z,c\in\mathbb{Z}. We have the following properties for
inequalities
-
x<yis the same asy>x: -
x\leq yis the same asy\geq x: -
If
x<yandy<zthenx<z: -
If
x\leq yandy<zthenx<z: -
If
x<yandy\leq zthenx<z: -
If
x\leq yandy\leq zthenx\leq z: -
If
x>yandy>zthenx>z: -
If
x\geq yandy>zthenx>z: -
If
x>yandy\geq zthenx>z: -
If
x\geq yandy\geq zthenx\geq z: -
If
x<ythenx+z<y+z: -
If
x\leq ythenx+z\leq y+z: -
If
x>ythenx+z>y+z: -
If
x\geq ythenx+z\geq y+z: -
If
x<yandz\geq 0thenxz<yz: -
If
x<yandz< 0thenxz>yz: -
If
x\leq yandz\geq 0thenxz\leq yz: -
If
x\leq yandz<0thenxz\geq yz: -
If
x>yandz\geq 0thenxz>yz: -
If
x>yandz< 0thenxz<yz: -
If
x\geq yandz\geq 0thenxz\geq yz: -
If
x\geq yandz<0thenxz\leq yz:
Proof:
-
x<yis the same asy>x:Let
x,y\in\mathbb{Z}withx<y. Similar reasoning as in proposition 70{reference-type="ref" reference="prop:MultiplicationByNegativeOneFlipsInequalitySign"} can be used. As in the proposition, there are three cases to consider.-
x\geq 0and $y\geq 0$ -
x<0and $y\geq 0$ -
x<0and $y<0$
-
x\geq 0andy\geq 0:Suppose
x\geq 0andy\geq 0thenx\in\left[\left(a,0\right)\right]andy\in\left[\left(b,0\right)\right]for somea,b\in\mathbb{N}. We have thatx< yonly holds ifa<b, which is equivalent tob>aby proposition 48{reference-type="ref" reference="prop:InequalityNaturalNumbers"}. But by definition of>for integers, we have that$$\begin{equation} b>a \iff y>x \end{equation*}$$*
-
x<0andy\geq 0:Suppose that
x<0andy\geq 0, thenx\in\left[\left(0,a\right)\right]andy\in\left[\left(b,0\right)\right]for somea,b\in\mathbb{N}. By definition of<we have that$$\begin{equation} x<y \iff 0+0 < a+b\implies y>x\iff a+b > 0 \end{equation*}$$*
Now,
x<0\implies a>0and so we have thata+b\geq a > 0and soy>x. -
x<0andy<0:Now suppose that
x<0andy<0, it follows thatx\in\left[\left(0,a\right)\right]andy\in\left[\left(0,b\right)\right]for somea,b\in\mathbb{N}. By definition of<we have that$$\begin{equation} x<y\iff b<a \implies y>x \iff a>b \end{equation*}$$*
Hence, as
b<a, we have thata>band soy>x.
-
-
x\leq yis the same asy\geq x:If
x<ythen we apply part 1. Otherwise, we have thatx=yand so clearlyy=xand hencey\geq x. -
If
x<yandy<zthenx<z:Suppose that
x<yandy<z. There are four cases to consider.-
x\geq 0,y\geq 0and $z\geq 0$ -
x<0,y\geq 0and $z\geq 0$ -
x<0,y<0and $z\geq 0$ -
x<0,y<0and $z<0$
-
x\geq 0,y\geq 0andz\geq 0:Suppose that
x\geq 0,y\geq 0andz\geq 0then the result follows immediately by proposition 48{reference-type="ref" reference="prop:InequalityNaturalNumbers"} part 6. asx\geq 0,y\geq 0andz\geq 0givesx\in\left[\left(a,0\right)\right],y\in\left[\left(b,0\right)\right]andz\in\left[\left(c,0\right)\right]for somea,b,c\in\mathbb{N}and thereforex,y,z\in\mathbb{N}. -
x<0,y\geq 0andz\geq 0:Now suppose that
$x<0$, $y\geq 0$ and $z\geq 0$. We have that $x\in\left[\left(0,a\right)\right]$, $y\in\left[\left(b,0\right)\right]$ and $z\in\left[\left(c,0\right)\right]$ for some $a,b,c\in\mathbb{N}$. Now we have that
$$\begin{align*} x<0 &\iff a>0 \\ y\geq 0 &\iff b\geq 0\\ z\geq 0&\iff c\geq 0 \end{align*}$$
By assumption
x<yand so we have that0<a+b, moreover by assumptiony<zand so we haveb<c. We hence have that0<a+b<a+c. Now as0<a+cwe have by definition of<that$$\begin{equation} 0<a+c\iff 0+0<a+c \iff \left(0,a\right)<\left(c,0\right) \iff x<z \end{equation*}$$*
-
x<0,y<0andz\geq 0:Now suppose that
x<0,y<0andz\geq 0. We have thatx\in\left[\left(0,a\right)\right],y\in\left[\left(0,b\right)\right]andz\in\left[\left(c,0\right)\right]for some $a,b,c\in\mathbb{N}$$$\begin{align} x<0 &\iff a>0 \ y<0 &\iff b>0\ z\geq 0&\iff c\geq 0 \end{align*}$$*
By assumption
x<yand so we have thatb<a, moreover by assumptiony<zand so we have0<b+c. Asb<athen we have0<b+c<a+c, moreover we have by the definition of<that$$\begin{equation} 0<a+c\iff 0+0<a+c \iff \left(0,a\right)<\left(c,0\right) \iff x<z \end{equation*}$$*
-
x<0,y<0andz<0:Suppose that
x<0,y< 0andz<0. We have thatx\in\left[\left(0,a\right)\right],y\in\left[\left(0,b\right)\right]andz\in\left[\left(0,c\right)\right]for somea,b,c\in\mathbb{N}. Observe$$\begin{align} x<0 &\iff a>0 \ y<0 &\iff b>0\ z<0 &\iff c>0 \end{align*}$$*
As
x<ywe have thatb<a, likewise asy<zwe have thatc<b, hence we have thatc<b<aand soc<a. Hence by definition of<we have$$\begin{equation} c<a\iff 0+c<a+0\iff \left(0,a\right)<\left(0,c\right)\iff x<z \end{equation*}$$*
-
-
If
x\leq yandy<zthenx<z:Suppose that
x\leq yandy<z. Ifx<ythen we apply part 3. So suppose thatx=y, then we must have thaty<z\iff x<zand hence the result. -
If
$x<y$ and $y\leq z$ then $x<z$: As with part 4. Suppose
x<yandy\leq z, then ify<zwe apply part 3. Then we are left with the casey=zand hence we have thatx<y\iff x<z. -
If
x\leq yandy\leq zthenx\leq z:Suppose that
x\leq yandy\leq z, then ifx<yandy<zwe apply part 3. Ifx\leq yandy<zwe apply part 4. Ifx<yandy\leq zwe apply part 5. Hence we are left with the case wherex=yandy=z. The result follows immediately. -
If
x>yandy>zthenx>z:By part 1. of the proposition we have that this is equivalent to
y<xandz<ythenz<xand so part 3. applies. -
If
x\geq yandy>zthenx>z:Applying part 2 to
$x\geq y$ and part 1. to $y>z$ and $x>z$ gives the equivalent statement of $y\leq x$ and $z<y$ then $z<x$, and so part 5. applies. -
If
x>yandy\geq zthenx>z:As with part 8. Applying parts 2. and 1. gives the equivalent statement of
$y<x$ and $z\leq y$ then $z<x$, and so part 4. applies. -
If
x\geq yandy\geq zthenx\geq z:Solely applying part 2 of the proposition gives the statement
y\leq xandz\leq ythenz\leq x, so part 6. applies. -
If
x<ythenx+z<y+z:Suppose that
x<ywherex\in\left[\left(a,b\right)\right]andy\in\left[\left(c,d\right)\right]for somea,b,c,d\in\mathbb{N}. Letz\in\left[\left(e,f\right)\right]. By assumption we know that$$\begin{equation} x<y\iff a+d<b+c \end{equation*}$$*
Now, we have that
$$\begin{align} x+z=\left(a,b\right)+\left(e,f\right)=\left(a+e,b+f\right)\ y+z=\left(c,d\right)+\left(e,f\right)=\left(c+e,d+f\right) \end{align*}$$*
Now, suppose that
x+z<y+z. We have that$$\begin{equation} x+z<y+z\iff \left(a+e\right)+\left(d+f\right)<\left(b+f\right)+\left(c+e\right) \end{equation*}$$*
Observe that
$$\begin{align*} \left(a+e\right)+\left(d+f\right)&<\left(b+f\right)+\left(c+e\right)\\ \underbrace{\left(a+d\right)}_{=j}+\underbrace{\left(e+f\right)}_{=k}&<\underbrace{\left(b+c\right)}_{=l}+\underbrace{\left(f+e\right)}_{=k} \end{align*}$$
For some
j,k,l\in\mathbb{N}. We see thatj<lasa+d<b+c. Hence by proposition 48{reference-type="ref" reference="prop:InequalityNaturalNumbers"} part 12. thatj<l\Rightarrow j+k<l+kand so we havex+z<y+z. -
If
x\leq ythenx+z\leq y+z:If
x<ythen the result follows from part 11. Otherwise, we havex=yand thenx+z=y+zand so we havex+z\leq y+z. -
If
x>ythenx+z>y+z:As has been the case so far, applying part 1. gives us the statement
y<xtheny+z<x+zand so part 11. applies. -
If
x\geq ythenx+z\geq y+z:By part 2. we get the equivalent statement of
y\leq xtheny+z\leq x+zfrom which we can apply part 12. -
If
x<yandz\geq 0thenxz<yz:Suppose that
x<ywherex\in\left[\left(a,b\right)\right]andy\in\left[\left(c,d\right)\right]for somea,b,c,d\in\mathbb{N}. Letz\in\left[\left(e,0\right)\right]for somee\in\mathbb{N}. Asx<ywe have$$\begin{equation} x<y\iff a+d<b+c \end{equation*}$$*
Now we have that
$$\begin{align} xz=\left(a,b\right)\left(e,0\right)=\left(ae,be\right)\ yz=\left(c,d\right)\left(e,0\right)=\left(ce,de\right)\ \end{align*}$$*
Now, consider
$xz<yz$, then
$$\begin{equation*} xz<yz\iff ae+de<be+ce \iff e\underbrace{\left(a+d\right)}_{=m}<e\underbrace{\left(b+c\right)}_{=n} \end{equation*}$$
The result now follows from proposition 48{reference-type="ref" reference="prop:InequalityNaturalNumbers"} part 16.
-
If
x<yandz< 0thenxz>yz:Suppose that
x<ywherex\in\left[\left(a,b\right)\right]andy\in\left[\left(c,d\right)\right]for somea,b,c,d\in\mathbb{N}. Letz\in\left[\left(0,e\right)\right]for somee\in\mathbb{N}. Asx<ywe have$$\begin{equation} x<y\iff a+d<b+c \end{equation*}$$*
Now we have that
$$\begin{align} xz=\left(a,b\right)\left(0,e\right)=\left(be,ae\right)\ yz=\left(c,d\right)\left(0,e\right)=\left(de,ce\right)\ \end{align*}$$*
Now, we want to show that
$xz>yz$. By definition we have
$$\begin{equation*} xz>yz \iff be+ce>ae+de\iff e\underbrace{\left(b+c\right)}_{=m}>e\underbrace{\left(a+d\right)}_{=n} \end{equation*}$$
The result now follows from proposition 48{reference-type="ref" reference="prop:InequalityNaturalNumbers"} part 16.
-
If
x\leq yandz\geq 0thenxz\leq yz:Suppose that
x\leq ythen if we have thatx<ywe apply part 15. Otherwise,x=yand the result is trivial. -
If
x\leq yandz<0thenxz\geq yz:Likewise, if
x<ythen we apply part 16. So suppose thatx=ythenxz=yzand we, therefore, havexz\geq yz. -
If
x>yandz\geq 0thenxz>yz:Let
z\geq 0and by applying part 1. we get the equivalent statement ofy<xandz\geq 0thenyz<xzfor which we apply part 15. -
If
x>yandz< 0thenxz<yz:Applying part 1. we get the equivalent statement of
$y<x$ and $z<0$ then $yz>xz$, for which we apply part 16. -
If
x\geq yandz\geq 0thenxz\geq yz:Part 2 of this proposition gives the equivalent statement of
$y\leq x$ and $z\geq 0$ then $yz\leq xz$, and so part 17. applies. -
If
x\geq yandz<0thenxz\leq yz:Now, part 2 gives us the expression
y\leq xandz<0thenyz\geq xzand so we apply part 18.
The result has been shown. $\qed$ :::
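Several parts of the proposition can be sanity-checked by brute force on the pair representation. The sketch below assumes, as in the proofs, that `(a, b)` stands for `a - b`, that addition is componentwise and that `(a, b)(c, d) = (ac + bd, ad + bc)`; the function names are illustrative only. The strict multiplication parts are exercised with $z\neq 0$, because for $z=0$ both products are zero.

```python
from itertools import product

def less(p, q):
    """(a, b) < (c, d) exactly when a + d < b + c."""
    (a, b), (c, d) = p, q
    return a + d < b + c

def add(p, q):
    (a, b), (c, d) = p, q
    return (a + c, b + d)

def mul(p, q):
    (a, b), (c, d) = p, q
    return (a * c + b * d, a * d + b * c)

ZERO = (0, 0)
pairs = list(product(range(4), repeat=2))
ordered = [(x, y) for x in pairs for y in pairs if less(x, y)]

# Part 3: transitivity of <.
part3 = all(less(x, z) for x, y in ordered for z in pairs if less(y, z))
# Part 11: adding z to both sides preserves <.
part11 = all(less(add(x, z), add(y, z)) for x, y in ordered for z in pairs)
# Part 15 with z > 0: multiplying by a positive integer preserves <.
part15 = all(less(mul(x, z), mul(y, z))
             for x, y in ordered for z in pairs if less(ZERO, z))
# Part 16: multiplying by a negative integer reverses <.
part16 = all(less(mul(y, z), mul(x, z))
             for x, y in ordered for z in pairs if less(z, ZERO))

print(part3, part11, part15, part16)
```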
The absolute value function
After the construction of the natural numbers, we explored the notion of cardinality. That was assigning a notion of size to a natural number. Recall the definition,
$$\begin{align*} \left|\cdot\right|:\mathbb{N}&\rightarrow\mathbb{N}\\ n&\mapsto\left|n\right|=n \end{align*}$$
To extend this we consider the following. We know that a\in\mathbb{N}
has a cardinality \left|a\right|=a as a\in\mathbb{N} refers to a set
containing a elements. Unfortunately, the notion of a set containing
a elements doesn't extend in a natural way to the integers. For
example, what does it mean for a set to contain -3 elements? Instead,
we need to re-think the notion of size.
Armed with subtraction we can re-cast this understanding of size
into a more useful form. Consider for example 6-3=3, we can interpret
this expression as saying that the number 3 is 3 less than 6, or
equivalently the number 6 is 3 bigger than 3. Stated in another
way, if we were to get a ruler and measure something to be 6 cm long
and we want to cut it in half we will measure the halfway point at 3cm
along from where we start measuring. That is to say, the halfway point
would be 6cm - 3cm=3cm.
What we have done is rather than think about the number of elements, we
have thought about things in terms of distances. This turns out to be a
very powerful idea, there is an entire subject in mathematics which
studies this idea of distances, formally called metrics, which we will
see later. We have only considered the positive case so far, what about
3-6?
We know that 3-6=-3 and using similar logic this is saying that the
number -3 is 6 away from 3, equivalently 3 is 6 more than
-3.
We make a definition.
::: definition Definition 112. Distance function for integers
Let x,y\in\mathbb{Z}. Define the function
d:\mathbb{Z}^2\rightarrow\mathbb{N} by
$$\begin{align*} d:\mathbb{Z}^2&\rightarrow\mathbb{N}\\ \left(x,y\right)&\mapsto d\left(x,y\right)=\begin{cases} x-y,\ \text{If } x\geq y\\ -\left(x-y\right),\ \text{If } x< y \end{cases} \end{align*}$$ :::
We must verify that this is well defined
::: {#prop:IntegerDistanceFuncWellDefined .proposition} Proposition 72. The distance function for the integers is well-defined
Let x,y\in\mathbb{Z}. We have that
$$\begin{equation} d\left(x,y\right)=\begin{cases} x-y,\ \text{If } x\geq y\ -\left(x-y\right),\ \text{If } x< y \end{cases} \end{equation*}$$*
is well-defined.
Proof:
Let x,y\in\mathbb{Z}. There are two cases to consider x\geq y and
x<y.
-
x\geq y:Suppose that
x\geq y, then by proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} part 14. we have$$\begin{equation} x\geq y \Rightarrow \left(x+\left(-y\right)\right) \geq \left(y+\left(-y\right)\right) \Rightarrow x-y \geq 0 \end{equation*}$$*
Hence
x-y\in\mathbb{N}. -
x<y:As
x<ywe have by definition ofdthatd\left(x,y\right)=-\left(x-y\right)where we have thatx-y<0. However we have that-\left(x-y\right)=-1 * \left(x-y\right)and so by part 16 of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} we have that-1*\left(x-y\right)>0which is to say $-\left(x-y\right)\in\mathbb{N}$
The result has been shown. $\qed$ :::
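As an illustration, the distance function can be written down directly from its two cases. This is a small Python sketch using built-in integers rather than the equivalence-class construction; the name `d` follows the definition, everything else is an assumption made for the example.

```python
def d(x: int, y: int) -> int:
    """Distance between two integers, following the two cases of the definition."""
    if x >= y:
        return x - y
    return -(x - y)

print(d(6, 3), d(3, 6), d(-3, 3))  # 3 3 6

# On a sample range the value is never negative, as the proposition asserts.
lands_in_naturals = all(d(x, y) >= 0
                        for x in range(-10, 11)
                        for y in range(-10, 11))
print(lands_in_naturals)
```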
In light of the definition of the distance function, we can define the so-called absolute value function. This will give us a notion of the magnitude of an integer.
::: definition Definition 113. Absolute value function
Let x\in\mathbb{Z} we define the absolute value function, denoted by
\left|x\right| by the function
$$\begin{equation*} \left|x\right|=d\left(x,0\right)=\begin{cases} x,\ \text{If } x\geq 0\\ -x,\ \text{If } x< 0 \end{cases} \end{equation*}$$ :::
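A short Python sketch of the same idea, defining the absolute value as the distance to zero. The names `d` and `absolute` are chosen for this example; the comparison with Python's built-in `abs` is only a consistency check.

```python
def d(x: int, y: int) -> int:
    """Distance function from the earlier definition."""
    return x - y if x >= y else -(x - y)

def absolute(x: int) -> int:
    """Absolute value as the distance from x to 0."""
    return d(x, 0)

print([absolute(x) for x in (-3, 0, 3)])  # [3, 0, 3]

# Agrees with the built-in abs on a sample range.
agrees = all(absolute(x) == abs(x) for x in range(-50, 51))
print(agrees)
```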
With this definition, we have generalised the idea of "size" to the
integers. That is the size of an integer is its distance from 0. We
have the basic properties of the absolute value
::: proposition Proposition 73. Properties of the absolute value
Let x,y,z\in\mathbb{Z}. We have that the absolute value function has
the following properties
-
\left|x\right|\geq 0for all $x\in\mathbb{Z}$ -
$\left|x\right|=0\iff x=0$
-
$\left|x-y\right|=0\iff x=y$
-
$\left|xy\right|=\left|x\right|\left|y\right|$
-
$\left|\left|x\right|\right|=\left|x\right|$
-
$\left|-x\right|=\left|x\right|$
-
$\left|x\right|\leq y \iff -y\leq x\leq y$
-
\left|x\right|\geq y\iff x\leq -yor $x\geq y$ -
$\left|x+y\right|\leq \left|x\right|+\left|y\right|$
-
$\left|x-y\right|\leq\left|x-z\right|+\left|z-y\right|$
-
$\left|x-y\right|\geq \left|\left|x\right|-\left|y\right|\right|$
-
\left|\cdot\right|is not injective -
\left|\cdot\right|is not surjective
Proof:
-
\left|x\right|\geq 0for allx\in\mathbb{Z}:This follows by proposition 72{reference-type="ref" reference="prop:IntegerDistanceFuncWellDefined"}.
-
\left|x\right|=0\iff x=0:We have by definition that
\left|x\right|=0, if and only ifx=0. -
\left|x-y\right|=0\iff x=y:\left(\Rightarrow\right): Suppose that\left|x-y\right|=0. There are two cases to consider.Firstly if
x\geq y, then by definition we have that\left|x-y\right|=x-y=0from which we clearly havex=y. The other case isx<yfrom which we get\left|x-y\right|=-\left(x-y\right)=0. In other words, we have-1*\left(x-y\right)=0. Now by proposition 69{reference-type="ref" reference="prop:IntegersHaveNoZeroDivisors"} we know that for integersa,bthat ifab=0, at least one ofaorbis zero. As-1\neq 0we conclude thatx-y=0from which we getx=y.\left(\Leftarrow\right): Suppose thatx=ythenx-y=0and so\left|x-y\right|=0. -
\left|xy\right|=\left|x\right|\left|y\right|:Let
x,y\in\mathbb{Z}. There are four cases to consider.-
x\geq 0and $y\geq 0$ -
x\geq 0and $y<0$ -
x<0and $y\geq 0$ -
x<0and $y<0$
-
x\geq 0andy\geq 0:If
x\geq 0andy\geq 0thenxy\geq 0and so\left|xy\right|=xy. Likewise\left|x\right|=xand\left|y\right|=y. Hence\left|xy\right|=\left|x\right|\left|y\right|. -
x\geq 0andy<0:If
x\geq 0then\left|x\right|=xby definition, and ify<0then\left|y\right|=-y. Now\left|xy\right|=-xyasy<0. Moreover, we have that$$\begin{equation} -xy=\left(-1\right)\left(x\right)\left(y\right)=\left(x\right)\left(-1\right)\left(y\right)=\left(x\right)\left(-y\right)=\left|x\right|\left|y\right| \end{equation*}$$*
Hence we get $\left|xy\right|=\left|x\right|\left|y\right|$
-
x<0andy\geq 0:This is similar to the above but swapping the roles of
xandy. -
x<0andy<0:Suppose that
x<0andy<0, then we have that\left|x\right|=-xand\left|y\right|=-yby definition. Moreover, we have that-x*-y = xy. Hence $\left|xy\right|=xy=\left(-x\right)\left(-y\right)=\left|x\right|\left|y\right|$
-
-
\left|\left|x\right|\right|=\left|x\right|:We have that
\left|x\right|=xifx\geq 0and-xifx<0.So if
x\geq 0, we have$$\begin{equation} \left|\left|x\right|\right|=\left|x\right|=x=\left|x\right| \end{equation*}$$*
Now if
x<0then$$\begin{equation} \left|\left|x\right|\right|=\left|-x\right|=\underbrace{-x}_{\text{As }-x>0}=\left|x\right| \end{equation*}$$*
-
\left|-x\right|=\left|x\right|:As
-x=-1 *xwe have by part 4 that$$\begin{equation} \left|-x\right|=\left|-1x\right|=\left|-1\right|\left|x\right|=1\left|x\right|=\left|x\right| \end{equation*}$$*
-
\left|x\right|\leq y \iff -y\leq x\leq y:\left(\Rightarrow\right): Suppose that\left|x\right|\leq y. Ifx\geq 0then we get that\left|x\right|=x\leq y. From this, it is clear that-y\leq x\leq yasx\geq 0andx\leq y \Rightarrow y \geq 0.Now if
$x<0$, then $\left|x\right|=-x\leq y$. Clearly $x\leq -x$ as $x<0$, hence we conclude that $x\leq -x\leq y$. Now by part 18 of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} we have
$$\begin{equation*} \left(-1\right)\left(-x\right)\geq \left(-1\right)\left(y\right) \iff x\geq -y \end{equation*}$$
Now
x\geq -yis the same as-y\leq xand so we have-y\leq x\leq -x \leq y.Hence
-y\leq x\leq y.\left(\Leftarrow\right): Suppose that-y\leq x\leq y. There are two cases to consider.-
$x\geq 0$
-
$x<0$
-
x\geq 0:Suppose
$x\geq 0$, then clearly as $x\leq y$ we have $\left|x\right|\leq \left|y\right|=y$. Moreover, we have that $-y\leq x$ is the same as $x\geq -y$, and part 22. of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} when applied to $x\geq -y$ gives
$$\begin{equation*} \left(-1\right)\left(x\right)\leq \left(-1\right)\left(-y\right) \iff -x\leq y \end{equation*}$$
We have that
\left|-x\right|=\left|x\right|by part 6. Hence\left|-x\right|=\left|x\right|\leq \left|y\right|=y. -
x<0:Suppose
$x<0$. By assumption $x\leq y$, so either $y\geq 0$ or $y<0$. We can't have $y<0$: then $-y>0>x$, which contradicts the assumption $-y\leq x$. For example, $x=-4$ and $y=-2$ would give $2\leq -4\leq -2$, a contradiction. So suppose that
y\geq 0then asx\leq ywe have\left|x\right|\leq\left|y\right|=y. Now as-y\leq xby assumption we have thatx\geq -yand so part 22. of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} gives$$\begin{equation} \left(-1\right)\left(x\right)\leq \left(-1\right)\left(-y\right) \iff -x\leq y \end{equation}$$*
Hence part 6. applies and we get that $\left|x\right|\leq y$
-
-
\left|x\right|\geq y\iff x\leq -yorx\geq y:\left(\Rightarrow\right): Suppose that\left|x\right|\geq y. Ifx\geq 0then\left|x\right|=x\geq y. So suppose thatx<0then by definition we have that\left|x\right|=-xand so-x\geq yand the result follows when applying part 22. of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"}.\left(\Leftarrow\right): Suppose that eitherx\leq -yorx\geq y. We have three cases to consider.-
$x\leq -y$
-
$x\geq y$
-
x\leq -yand $x\geq y$
-
x\leq -y:Suppose that
$x\leq -y$ holds. If $x\geq 0$ then we have that $-y\geq 0$, hence $y\leq 0$. Moreover, we have by part 18. of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} that
$$\begin{equation*} \left(-1\right)\left(x\right)\geq \left(-1\right)\left(-y\right) \iff -x\geq y \end{equation*}$$
Now part 6. applies and we see that
\left|-x\right|=\left|x\right|\geq\left|y\right|=y. This is to say\left|x\right|\geq y.Now suppose that
$x<0$. Then as $x\leq -y$ we have that either $-y\geq 0$ or $-y<0$. In the former case $-y\geq 0$ gives $y\leq 0$. Hence by part 18. of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} we conclude that
$$\begin{equation*} \left(-1\right)\left(x\right)\geq \left(-1\right)\left(-y\right) \iff -x\geq y \end{equation*}$$
As
x<0then-x\geq 0. The result follows when taking the absolute value.Now suppose that
$-y<0$, then $y>0$. Following similar logic to the previous case, we see that
$$\begin{equation*} \left(-1\right)\left(x\right)\geq \left(-1\right)\left(-y\right) \iff -x\geq y \end{equation*}$$
The result again follows after taking the absolute value.
-
x\geq y:This case is trivial.
-
x\leq -yandx\geq y:Suppose that
$x\leq -y$ and $x\geq y$ are both true. We know by the first case that $x\leq -y$ gives $\left|x\right|\geq y$, and $x\geq y$ also implies $\left|x\right|\geq y$ by the second case. Hence both inequalities being true at the same time implies the result $\left|x\right|\geq y$.
-
-
\left|x+y\right|\leq \left|x\right|+\left|y\right|:Let
x,y\in\mathbb{Z}. There are four cases to consider.-
x\geq 0and $y\geq 0$ -
x\geq 0and $y\leq 0$ -
x\leq 0and $y\geq 0$ -
x\leq 0and $y\leq 0$
-
x\geq 0andy\geq 0:Suppose
x\geq 0andy\geq 0, then we have that$$\begin{equation} \left|x+y\right|=x+y=\left|x\right|+\left|y\right|\Rightarrow \left|x+y\right|\leq\left|x\right|+\left|y\right| \end{equation*}$$*
-
x\geq 0and $y\leq 0$By assumption we have that
\left|x\right|=xand\left|y\right|=-y. We have two cases based on the absolute value,\left|x\right|\leq\left|y\right|and\left|x\right|\geq\left|y\right|.So suppose that
\left|x\right|\leq\left|y\right|then by definitionx\leq -yand so by part 12. of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} we have that$$\begin{equation} x\leq -y \Rightarrow x+y\leq 0 \end{equation*}$$*
Moreover, as
x\geq 0theny\leq x+y\leq 0. Hence we have by the definition of the absolute value that$$\begin{equation} \left|x+y\right|=-\left(x+y\right)\leq -y=\left|y\right| \end{equation*}$$ As
$-y\geq 0$. In the case
\left|x\right|\geq\left|y\right|we have by definition thatx\geq -yand sox+y\geq 0. Additionally it is clear thatx\geq x+yasy\leq 0and\left|x\right|\geq\left|y\right|. Hence by definition of the absolute value we have that$$\begin{equation} \left|x+y\right|=x+y\leq x=\left|x\right| \end{equation*}$$*
Now, it is clear to see that
\left|x\right|\leq \left|x\right|+\left|y\right|and likewise\left|y\right|\leq \left|x\right|+\left|y\right|.We have hence shown that
$\left|x+y\right|\leq\left|x\right|+\left|y\right|$. -
x\leq 0andy\geq 0:This is similar to above, interchanging the roles of
xandy. -
x\leq 0andy\leq 0:Suppose that
$x\leq 0$ and $y\leq 0$, then $x+y\leq 0$ and by definition we have that $\left|x+y\right|=-\left(x+y\right)=-x-y$. As $x\leq 0$ and $y\leq 0$ we have that $\left|x\right|=-x$ and $\left|y\right|=-y$, which shows $\left|x+y\right|=\left|x\right|+\left|y\right|\leq\left|x\right|+\left|y\right|$
-
-
\left|x-y\right|\leq\left|x-z\right|+\left|z-y\right|:We have that
$$\begin{align} \left|x-y\right|&=\left|x-\left(z-z\right)-y\right|\ &=\left|x-z+z-y\right|\ &\leq \left|x-z\right|+\left|z-y\right| \end{align*}$$*
-
\left|x-y\right|\geq \left|\left|x\right|-\left|y\right|\right|:We have that
$$\begin{align} \left|x\right|&=\left|\left(x-y\right)+y\right|\leq \left|x-y\right|+\left|y\right| \Rightarrow \left|x\right|-\left|y\right|\leq \left|x-y\right|\ \left|y\right|&=\left|\left(y-x\right)+x\right|\leq \left|x-y\right|+\left|x\right| \Rightarrow \left|y\right|-\left|x\right|\leq \left|x-y\right|\ \end{align*}$$*
Hence we have
$$\begin{align} \left|x\right|-\left|y\right|\leq \left|x-y\right| &\Rightarrow \left|\left|x\right|-\left|y\right|\right|\leq \left|x-y\right|\ \left|y\right|-\left|x\right|=\left(-1\right)\left(\left|x\right|-\left|y\right|\right)\leq \left|x-y\right| &\Rightarrow \left|\left|x\right|-\left|y\right|\right|\leq \left|x-y\right|\ \end{align*}$$*
Hence we have the result.
-
\left|\cdot\right|is not injective:To see that the absolute value function is not injective consider
\left|3\right|=\left|-3\right|. We have that\left|3\right|=3and\left|-3\right|=3but3\neq -3. -
$\left|\cdot\right|$ is not surjective: The absolute value function, viewed as a function $\mathbb{Z}\rightarrow\mathbb{Z}$, is not surjective, as there is no
$x\in\mathbb{Z}$ so that $\left|x\right|=-1$, for example.
This ends the proposition. $\qed$ :::
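The listed properties lend themselves to a brute-force check over a small range of integers. The following Python sketch is only a finite sanity check under the assumption that built-in integers model $\mathbb{Z}$; the names `absolute`, `R` and `checks` are illustrative.

```python
from itertools import product

def absolute(x: int) -> int:
    """Absolute value, following the two cases of the definition."""
    return x if x >= 0 else -x

R = range(-6, 7)

checks = {
    "part 4": all(absolute(x * y) == absolute(x) * absolute(y)
                  for x, y in product(R, R)),
    "part 7": all((absolute(x) <= y) == (-y <= x <= y)
                  for x, y in product(R, R)),
    "part 8": all((absolute(x) >= y) == (x <= -y or x >= y)
                  for x, y in product(R, R)),
    "part 9": all(absolute(x + y) <= absolute(x) + absolute(y)
                  for x, y in product(R, R)),
    "part 10": all(absolute(x - y) <= absolute(x - z) + absolute(z - y)
                   for x, y, z in product(R, R, R)),
    "part 11": all(absolute(x - y) >= absolute(absolute(x) - absolute(y))
                   for x, y in product(R, R)),
}
print(checks)

# Parts 12 and 13: two inputs share an output, and -1 is never an output.
print(absolute(3) == absolute(-3), all(absolute(x) != -1 for x in R))
```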
Extending exponentiation to the integers
We can extend the idea of exponentiation to include integers. We are now
able to consider negative bases. In other words, expressions of the form
\displaystyle x^n for x\in\mathbb{Z} with x<0. This extension is
somewhat trivial and extends naturally from the definition of the
naturals. We first look at the case where n\geq 0
::: definition Definition 114. Exponentiation of integer numbers
Let \mathbb{Z}^+=\left\{x\in\mathbb{Z}:x\geq 0\right\}. Let
\left(x,n\right)\in\mathbb{Z}\times\mathbb{Z} with n\geq 0 and let
\wedge:\mathbb{Z}\times\mathbb{Z}\rightarrow\mathbb{Z}. We define the
exponentiation of x by n to be x multiplied by itself n-1 times
$$\begin{align*} \wedge:\mathbb{Z}\times\mathbb{Z}^+&\rightarrow\mathbb{Z}\\ \left(x,n\right)&\mapsto \wedge\left(x,n\right)=\begin{cases} 1,\ \text{If } x=0\text{ and } n=0\\ 1,\ \text{If } n=0\\ \displaystyle \prod_{i=1}^n x ,\ \text{If } n > 0 \end{cases} \end{align*}$$
We will write \wedge\left(x,n\right) as x^n. We say that x is the
base and n is the exponent. We sometimes say that x has been raised
to the power of n. In the case that x=0 and n=0 we have a vacuous
product and so an empty product which by definition has a value of 1.
:::
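The definition translates almost literally into code. The sketch below is one possible Python rendering, assuming Python 3.8 or later for `math.prod`; the function name `power` is hypothetical. The empty product is handled by `prod`, which returns 1 when there are no factors.

```python
from math import prod

def power(x: int, n: int) -> int:
    """x raised to the exponent n >= 0, as a product of n copies of x."""
    if n < 0:
        raise ValueError("this definition only covers exponents n >= 0")
    # For n = 0 the generator is empty and prod returns 1 (the empty product).
    return prod(x for _ in range(n))

print(power(-1, 3), power(2, 10), power(0, 0), power(0, 3))  # -1 1024 1 0
```

Raising a `ValueError` for negative exponents mirrors the restriction to $n\geq 0$ in the definition rather than guessing at a value.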
We will explore this definition by first considering x=-1
$$\begin{align*} x=x^1&=-1\\ x*x=x^2&=-1*-1=1\\ x*x*x=x^3&=-1*-1*-1=-1\\ x*x*x*x=x^4&=-1*-1*-1*-1=1 \end{align*}$$
This leads to the following proposition.
::: proposition
Proposition 74. Negative one to the power of 2n is 1
Let n\in\mathbb{N}. We have that
$$\begin{equation} \left(-1\right)^{2n} = 1 \end{equation*}$$*
Proof:
We argue by induction on n. The base case is n=0 and by definition,
we have that
$$\begin{equation*} \left(-1\right)^{2*0}=\left(-1\right)^{0}=1 \end{equation*}$$
Now suppose the result holds for some n=k, that is
$$\begin{equation} \left(-1\right)^{2k}=1 \end{equation*}$$*
We show that
$$\begin{equation} \left(-1\right)^{2*\left(k+1\right)}=1 \end{equation*}$$*
We have
$$\begin{align*} \left(-1\right)^{2*\left(k+1\right)}&=\left(-1\right)^{2k+2}\\ &=\prod_{i=1}^{2k+2} \left(-1\right)\\ &=\prod_{i=1}^{2k} \left(-1\right) * \prod_{i=2k+1}^{2k+2} \left(-1\right)\\ &= 1 * \left(\left(-1\right)*\left(-1\right)\right)\\ &=1*1=1 \end{align*}$$
Which shows the result. $\qed$ :::
This result generalises for any negative integer.
::: proposition Proposition 75. Negative integer to the power of 2n is positive
Let x\in\mathbb{Z} with x<0. Let n\in\mathbb{N}. We have that
$$\begin{equation*} x^{2n} \geq 1 \end{equation*}$$
Proof:
By definition we have
$$\begin{align*} x^{2n}&=\prod_{i=1}^{2n} x\\ &=\prod_{i=1}^{2n} \left(-1*-x\right)\\ &=\prod_{i=1}^{2n} \left(-1\right) * \prod_{i=1}^{2n}\left(-x\right)\\ &=1*\underbrace{\prod_{i=1}^{2n}\left(-x\right)}_{\geq 1} \geq 1 \end{align*}$$
As -x>0 because x<0. $\qed$
:::
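Both propositions are easy to spot-check numerically. The following Python sketch uses the built-in `**` operator as a stand-in for the exponentiation defined above, which is an assumption of the example; it also shows why the bound is $\geq 1$ rather than strict.

```python
# Proposition 74: even powers of -1 are all equal to 1.
even_powers_of_minus_one = [(-1) ** (2 * n) for n in range(6)]
print(even_powers_of_minus_one)  # [1, 1, 1, 1, 1, 1]

# Proposition 75: an even power of a negative integer is at least 1.
at_least_one = all(x ** (2 * n) >= 1
                   for x in range(-9, 0)
                   for n in range(6))
print(at_least_one)

# The value 1 is attained (x = -1, or any x with n = 0),
# so the inequality cannot be made strict in general.
print((-1) ** 4, (-5) ** 0)  # 1 1
```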
We also note that exponentiation is neither commutative nor associative as they were not for the naturals. However, the following results do extend.
::: {#prop:IntegerExponentiationPowerLaw .proposition} Proposition 76. Power law of exponentiation for positive exponents
Let x\in\mathbb{Z} and let n,m\in\mathbb{N} with n\geq 0 and
m\geq 0. We have that
$$\begin{equation} \left(x^n\right)^m = x^{nm} \end{equation*}$$*
Proof:
By the definition of exponentiation, we have that
$$\begin{equation} \left(x^n\right)^m=\prod_{i=1}^m x^n =\prod_{i=1}^m\left(\prod_{j=1}^n x\right) \end{equation*}$$*
Hence we have
$$\begin{align*} \left(x^n\right)^m&=\underbrace{\prod_{j=1}^n x * \prod_{j=1}^n x * \dots * \prod_{j=1}^n x}_{m\text{ times}}\\ &=\underbrace{\underbrace{x*x*\dots*x}_{n\text{ times}}*\underbrace{x*x*\dots*x}_{n\text{ times}}*\dots*\underbrace{x*x*\dots*x}_{n\text{ times}}}_{m\text{ times}} \end{align*}$$
Therefore, there are n*m total multiplications of x with itself.
Which is to say
$$\begin{equation*} \left(x^n\right)^m = \underbrace{x*x*x*\dots*x}_{nm\text{ times}} = \prod_{i=1}^{nm} x = x^{nm} \end{equation*}$$
As promised. $\qed$ :::
::: {#prop:IntegerExponentiationOfSameBaseAddsPowers .proposition} Proposition 77. Multiplying exponents of the same base adds the powers
Let x\in\mathbb{Z} be a fixed integer and let n,m\in\mathbb{N}. We
have that
$$\begin{equation} x^n x^m = x^{n+m} \end{equation}$$*
Proof:
Let x\in\mathbb{Z} and n,m\in\mathbb{N}. If n=0 or m=0 (or both)
then the result is immediate, since a factor of x^0=1 leaves the other
power unchanged. So suppose that n>0 and m>0.
We have by definition of exponentiation that
$$\begin{equation*} x^n*x^m=\prod_{i=1}^n x * \prod_{i=1}^m x = \underbrace{x*x*\dots*x}_{n\text{ times}} * \underbrace{x*x*\dots*x}_{m\text{ times}}=\underbrace{x*x*\dots*x}_{n+m \text{ times}}=x^{n+m} \end{equation*}$$
As expected. $\qed$ :::
::: {#prop:IntegerExponentiationPowerOfProductIsProductOfPowers .proposition} Proposition 78. Power of product is product of powers
Let x,y\in\mathbb{Z} and n\in\mathbb{N}. Then
$$\begin{equation} \left(xy\right)^n=x^ny^n \end{equation*}$$*
Proof:
If n=0 then \left(x*y\right)^n=1 and clearly x^0*y^0=1. So let
n>0 then we have
$$\begin{align*} \left(x*y\right)^n=\prod_{i=1}^n \left(x*y\right) &=\underbrace{x*y*x*y*\dots*x*y}_{n\text{ times}}\\ &= \left(\underbrace{x*x*\dots*x}_{n\text{ times}}\right)*\left(\underbrace{y*y*\dots*y}_{n\text{ times}}\right),\ \text{ by commutativity of multiplication}\\ &=x^n*y^n \end{align*}$$
Showing the proposition. $\qed$ :::
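The three exponent laws, and the failure of commutativity and associativity, can be spot-checked on a small grid of values. This Python sketch again uses the built-in `**` operator in place of the product definition, which is an assumption made for brevity; the variable names are illustrative.

```python
from itertools import product

bases = range(-4, 5)
exps = range(0, 5)

# Proposition 76: (x^n)^m = x^(n*m).
power_law = all((x ** n) ** m == x ** (n * m)
                for x, n, m in product(bases, exps, exps))
# Proposition 77: x^n * x^m = x^(n+m).
same_base = all(x ** n * x ** m == x ** (n + m)
                for x, n, m in product(bases, exps, exps))
# Proposition 78: (x*y)^n = x^n * y^n.
product_of_powers = all((x * y) ** n == x ** n * y ** n
                        for x, y, n in product(bases, bases, exps))
print(power_law, same_base, product_of_powers)

# Exponentiation is neither commutative nor associative.
print(2 ** 3, 3 ** 2)                # 8 9
print((2 ** 3) ** 2, 2 ** (3 ** 2))  # 64 512
```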
The attentive reader may have noticed that we have only dealt with non-negative
exponents so far in our extension of exponentiation to the integers.
What about negative exponents? We can, loosely, justify why we can't yet
consider negative exponents by considering proposition
77{reference-type="ref"
reference="prop:IntegerExponentiationOfSameBaseAddsPowers"}. For a
second suppose that instead of n,m\in\mathbb{N} we consider
n,m\in\mathbb{Z}. In particular n=1 and m=-1, then we have that
$$\begin{equation*} x^1*x^{-1}=x^{1+\left(-1\right)}=x^0=1 \end{equation*}$$
Hence we have that when x^1 is multiplied by x^{-1} we get back
to 1. Hence in a sense x^{-1} cancels with x. If we let x=2 we
have x^1=2 and so x^1*x^{-1}=1 gives us the equation 2*x^{-1}=1.
We intuitively know that \displaystyle x^{-1}=\frac{1}{2} which we
know is not an integer. Hence if proposition
77{reference-type="ref"
reference="prop:IntegerExponentiationOfSameBaseAddsPowers"} held for all
integer powers, it would imply the existence of a new type of object.
This object has the potential that when an integer is multiplied by the
appropriate member of this new type of object, assuming such an object
even exists, then integer multiplication is undone.
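To make the gap concrete, one can search for an integer $t$ with $2*t=1$ and find none, while a fraction does the job. The Python sketch below uses the standard-library `fractions.Fraction` type purely as a preview of the object about to be constructed; the search range is an arbitrary choice for the example.

```python
from fractions import Fraction

# No integer in the sample range undoes multiplication by 2.
integer_solutions = [t for t in range(-1000, 1001) if 2 * t == 1]
print(integer_solutions)  # []

# A fraction does: one half multiplied by 2 gives back 1.
half = Fraction(1, 2)
print(2 * half == 1)     # True
print(half.denominator)  # 2, so one half is not an integer
```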
Construction of the Rationals
::: epigraph A man is like a fraction whose numerator is what he is and whose denominator is what he thinks of himself. The larger the denominator, the smaller the fraction.
Leo Tolstoy :::
We have now built a theory of integer numbers. One main reason for doing
this was to be able to always undo subtraction. We still have a glaring
issue at hand, however. How do we undo multiplication? For example, we
are unable to express in mathematical language how many times one
quantity goes into another. If we have 6 pints and 3 friends we know
that each friend should get 2 pints as 3*2=6. In a sense we have
that 2 goes into 6 a total of 3 times and 3 goes into 6 a
total of 2 times. The integers don't have a concept of how many times
one integer can go into another. This is what we call division and we
write \displaystyle\frac{6}{2}=3 and \displaystyle\frac{6}{3}=2 for
each situation respectively.
Thankfully the method used to construct the integers can be used again
on the integers themselves to construct an even richer theory. As with
the integers, we should consider what we want to do. We seek a way to
undo the multiplication of integers. Consider a,b,c,d\in\mathbb{Z} with
a=6, b=3, c=12 and d=6. With these values we intuitively know that
\displaystyle\frac{6}{3}=2 and \displaystyle\frac{12}{6}=2. We also
note that a*d=6*6=36 and b*c=3*12=36. This gives us a clue on how to
proceed. We have that \displaystyle\frac{6}{3} and
\displaystyle\frac{12}{6} are hence similar. If we temporarily use the
language of relations we have that
\left(a,b\right)\sim\left(c,d\right).
Defining the Rationals
We proceed by defining division as an ordered tuple on integers
::: definition Definition 115. Division as an ordered tuple
Let a,b\in\mathbb{Z}. We define division as an ordered tuple
\left(a,b\right)\in\mathbb{Z}^2 to mean \displaystyle\frac{a}{b}. We
will call x\in\mathbb{Z}^2 a division tuple in this context.
:::
Hence we can define the relation we considered above.
::: definition Definition 116. Relation for division
Let \left(a,b\right),\left(c,d\right)\in\mathbb{Z}^2 be division
tuples. We define the relation \sim such that
\left(a,b\right)\sim\left(c,d\right) if and only if $ad=bc$
:::
With this definition there is something we need to consider that we have
heard since school, you can't divide by zero, that is for any integer
a we have \displaystyle\frac{a}{0} is not defined.
Suppose that \left(a,0\right)\sim\left(c,d\right) for some
a,c,d\in\mathbb{Z}. We have by definition of the relation that
$$\begin{equation*} \left(a,0\right)\sim\left(c,d\right)\iff ad=0c = 0 \end{equation*}$$
By proposition
69{reference-type="ref"
reference="prop:IntegersHaveNoZeroDivisors"} we have that either a=0
or d=0 or both.
If a=0 then we have
\left(0,0\right)\sim\left(c,d\right)\Rightarrow 0=0 for all
c,d\in\mathbb{Z}. This means that every division tuple in
\mathbb{Z}^2 would be equivalent to \left(0,0\right). Likewise if
d=0 we get \left(a,0\right)\sim\left(c,0\right)\Rightarrow 0=0, again
meaning that all such division tuples in \mathbb{Z}^2 would be equivalent.
Finally if both a=0 and d=0 then
\left(0,0\right)\sim\left(c,0\right) and so 0=0*c=0 and again every
division tuple would be equivalent.
This is a problem as this relation would imply that all elements are
essentially the same. This is not a useful definition to be using so
we will avoid this by not allowing b=0 in
\left(a,b\right)\in\mathbb{Z}^2. We revise the definition
::: definition Definition 117. Division as an ordered tuple
Let a,b\in\mathbb{Z} with b\neq 0. We define division as an ordered
tuple \left(a,b\right)\in\mathbb{Z}^2 to mean
\displaystyle\frac{a}{b}. We will call x\in\mathbb{Z}^2 a division
tuple in this context.
:::
::: definition Definition 118. Relation for division
Let \left(a,b\right),\left(c,d\right)\in\mathbb{Z}^2 be division
tuples where b\neq 0 and d\neq 0. We define the relation \sim such
that \left(a,b\right)\sim\left(c,d\right) if and only if $ad=bc$
:::
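To see concretely why a zero second entry has to be excluded, the following minimal Python sketch evaluates the unrevised relation on a few pairs. The function name `related` and the sample pairs are illustrative choices, not part of the text. It suggests that (0,0) is related to everything, while 1/2 and 1/3 are still not related to each other, so transitivity would fail.

```python
from typing import Tuple

Pair = Tuple[int, int]


def related(p: Pair, q: Pair) -> bool:
    """Unrevised relation: (a, b) ~ (c, d) if and only if a*d == b*c."""
    (a, b), (c, d) = p, q
    return a * d == b * c


# With a zero second entry allowed, (0, 0) is related to every pair ...
links_half = related((1, 2), (0, 0))
links_third = related((0, 0), (1, 3))
# ... yet (1, 2) and (1, 3) are not related to each other.
half_vs_third = related((1, 2), (1, 3))

print(links_half, links_third, half_vs_third)  # True True False
```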
We can show that this revised definition is an equivalence relation.
::: proposition Proposition 79. Relation for division ordered tuple is an equivalence relation
Let x,y,z\in\mathbb{Z}^2 be division tuples and define the relation
x\sim y as above. We have that \sim is an equivalence relation.
Proof:
Let x,y,z\in\mathbb{Z}^2 be division tuples such that
x=\left(a,b\right),y=\left(c,d\right) and z=\left(e,f\right). We
show that \sim is an equivalence relation, in other words, that

-   \sim is reflexive
-   \sim is symmetric
-   \sim is transitive

We take each property in turn.

-   \sim is reflexive: For x=\left(a,b\right) we have that x\sim x, as x\sim x holds if and only if ab=ba.

-   \sim is symmetric: Suppose that x=\left(a,b\right) and y=\left(c,d\right). Suppose that x\sim y then we have that ad=bc. Hence bc=ad \Rightarrow cb=da and so \left(c,d\right)\sim\left(a,b\right), that is y\sim x.

-   \sim is transitive: Suppose that x\sim y and y\sim z then by definition we have that ad=bc and cf=de. We have that

    $$\begin{align*} ad&=bc\\ adf&=bcf\\ adf&=bde\\ af&=be \end{align*}$$

    where the last step cancels d, which is permitted as d\neq 0. Hence \left(a,b\right)\sim\left(e,f\right) and so x\sim z.

It follows that \sim is an equivalence relation. $\qed$
:::
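The three properties can also be checked by brute force on a small sample. The sketch below is only a sanity check under stated assumptions, not a substitute for the proof: the range of entries is arbitrary and the helper name `related` is hypothetical.

```python
from itertools import product


def related(p, q):
    """(a, b) ~ (c, d) if and only if a*d == b*c."""
    (a, b), (c, d) = p, q
    return a * d == b * c


# All pairs (a, b) with b != 0 and entries in a small, arbitrarily chosen range.
R = range(-4, 5)
pairs = [(a, b) for a, b in product(R, R) if b != 0]

reflexive = all(related(p, p) for p in pairs)
symmetric = all(related(q, p) for p in pairs for q in pairs if related(p, q))
transitive = all(
    related(p, r)
    for p in pairs
    for q in pairs
    if related(p, q)
    for r in pairs
    if related(q, r)
)

print(reflexive, symmetric, transitive)  # True True True
```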
We can now turn our attention to the set
\mathbb{Z}^2/\sim=\left\{\left[x\right]_\sim:x\in\mathbb{Z}^2\right\}.
::: definition Definition 119. Rationals
Let \mathbb{Z}^2 have the equivalence relation \sim defined by
\left(a,b\right)\sim\left(c,d\right) if and only if ad=bc. We define
the set of rational numbers, denoted \mathbb{Q}, as the quotient set
\mathbb{Z}^2/\sim. The set has the form
$$\begin{equation} \mathbb{Q}=\left\{\dots,-\frac{2}{3},-\frac{1}{3},-\frac{1}{2},0,\frac{1}{2},\frac{1}{3},\frac{2}{3},\dots\right\} \end{equation}$$
:::
Extending equality to the rationals
As with the integers, it is easy to extend equality.
::: definition Definition 120. Equality of rationals
Let x,y\in\mathbb{Q} be two rational numbers. We define that two
rationals are equal, denoted x=y if and only if x\sim y. That is x
and y are in the same equivalence class. If x\not\sim y then we say
that x is not equal to y and write x\neq y.
:::
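In a program, equality of rationals can be tested either directly through the relation or by reducing each pair to a single chosen representative. The sketch below assumes a lowest-terms representative with a positive second entry, computed with `math.gcd`. This normal form is a common convention and an assumption here, as the text does not introduce it, and the function names are hypothetical.

```python
from math import gcd
from typing import Tuple

Pair = Tuple[int, int]


def equal(p: Pair, q: Pair) -> bool:
    """Equality of rationals: (a, b) = (c, d) if and only if a*d == b*c."""
    (a, b), (c, d) = p, q
    return a * d == b * c


def normalise(a: int, b: int) -> Pair:
    """Pick one representative of [a, b]: lowest terms, positive second entry."""
    if b == 0:
        raise ValueError("the second entry of a division tuple must be non-zero")
    g = gcd(a, b)  # strictly positive because b != 0
    if b < 0:
        g = -g  # dividing by a negative number flips both signs
    return (a // g, b // g)


print(equal((6, 3), (12, 6)))            # True
print(normalise(6, 3), normalise(12, 6))  # (2, 1) (2, 1)
print(normalise(1, -2))                   # (-1, 2)
```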
Extending inequality operators to the rationals
The inequality operators can be extended to the rationals in a natural way.
::: definition Definition 121. Less than operator
Let x,y\in\mathbb{Q} where x\in\left[a,b\right] and
y\in\left[c,d\right] for some a,b,c,d\in\mathbb{Z} with b>0 and d>0.
Such representatives always exist as
\left(a,b\right)\sim\left(-a,-b\right). The less than
operator, denoted by x<y is defined by the logical proposition
$$\begin{equation*} <\left(x,y\right)=\begin{cases} 1,\ \text{If } ad<bc\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$
This can equivalently be expressed as
$$\begin{equation*} x<y \iff ad<bc \end{equation*}$$
:::
::: definition Definition 122. Less than or equal to operator
Let x,y\in\mathbb{Q} where x\in\left[a,b\right] and
y\in\left[c,d\right] for some a,b,c,d\in\mathbb{Z} with b>0 and d>0.
The less than or equal to operator, denoted by x\leq y is defined by the
logical proposition
$$\begin{equation*} \leq\left(x,y\right)=\begin{cases} 1,\ \text{If } ad\leq bc\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$
This can equivalently be expressed as
$$\begin{equation*} x\leq y \iff ad\leq bc \end{equation*}$$
:::
::: definition Definition 123. Greater than operator
Let x,y\in\mathbb{Q} where x\in\left[a,b\right] and
y\in\left[c,d\right] for some a,b,c,d\in\mathbb{Z} with b>0 and d>0.
The greater than operator, denoted by x>y is defined by the logical
proposition
$$\begin{equation*} >\left(x,y\right)=\begin{cases} 1,\ \text{If } ad>bc\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$
This can equivalently be expressed as
$$\begin{equation*} x>y \iff ad>bc \end{equation*}$$
:::
::: definition Definition 124. Greater than or equal to operator
Let x,y\in\mathbb{Q} where x\in\left[a,b\right] and
y\in\left[c,d\right] for some a,b,c,d\in\mathbb{Z} with b>0 and d>0.
The greater than or equal to operator, denoted by x\geq y is defined by
the logical proposition
$$\begin{equation*} \geq\left(x,y\right)=\begin{cases} 1,\ \text{If } ad\geq bc\\ 0,\ \text{Otherwise} \end{cases} \end{equation*}$$
This can equivalently be expressed as
$$\begin{equation*} x\geq y \iff ad\geq bc \end{equation*}$$
:::
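The comparison ad<bc only agrees with the usual ordering when both second entries are positive. The sketch below therefore assumes that each pair is first rewritten with a positive second entry, using (a,b)~(-a,-b), before comparing. The function names are hypothetical, and Python's `fractions.Fraction` is used purely as an independent reference for checking.

```python
from fractions import Fraction
from itertools import product


def naive_less_than(p, q):
    """Compares a*d < b*c without looking at the signs of b and d."""
    (a, b), (c, d) = p, q
    return a * d < b * c


def less_than(p, q):
    """Compares after moving to representatives with positive second entries."""
    (a, b), (c, d) = p, q
    if b < 0:
        a, b = -a, -b
    if d < 0:
        c, d = -c, -d
    return a * d < b * c


# (1, -2) represents -1/2, which is below 1/2.
print(naive_less_than((1, -2), (1, 2)))  # False
print(less_than((1, -2), (1, 2)))        # True

# Agreement with Fraction on a small, arbitrarily chosen sample.
R = range(-3, 4)
pairs = [(a, b) for a, b in product(R, R) if b != 0]
agrees = all(
    less_than(p, q) == (Fraction(*p) < Fraction(*q)) for p in pairs for q in pairs
)
print(agrees)  # True
```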
Extending addition to the rationals
We can extend addition to the rationals. To do so we need to consider
how integers are represented in the rationals. As we know an element
\left(a,b\right)\in\mathbb{Q} is going to represent
\displaystyle\frac{a}{b}. So we can start by considering what an
integer will look like. We know by the definition of the equivalence
relation that for \left(a,b\right),\left(c,d\right)\in\mathbb{Z}^2
that
$$\begin{equation*} \left(a,b\right)\sim\left(c,d\right)\iff ad=bc \end{equation*}$$
Hence for b=d=1 we have that
$$\begin{equation*} \left(a,1\right)\sim\left(c,1\right)\iff a=c \end{equation*}$$
Hence an integer can be represented in the rationals by an element of
the form \left(k,1\right) for all k\in\mathbb{Z}. Therefore if
x,y\in\mathbb{Z} they will have the representation
x=\left(x_1,1\right) and y=\left(y_1,1\right) for some
x_1,y_1\in\mathbb{Z}. Hence by integer addition, we have that
$$\begin{equation*} x+y=\left(x_1,1\right)+\left(y_1,1\right)=\left(x_1+y_1,1\right) \end{equation*}$$
Now what happens if a=c=1? From the definition of the equivalence
relation we have that
$$\begin{equation*} \left(1,b\right)\sim\left(1,d\right)\iff d=b \end{equation*}$$
So we see that \left(1,b\right)\sim\left(1,d\right) means that
intuitively \displaystyle\frac{1}{b}=\frac{1}{d}. The question now
becomes what is \displaystyle\frac{1}{b}+\frac{1}{b}?
For example consider \displaystyle\frac{1}{2}+\frac{1}{2}=1, or
\displaystyle\frac{1}{3}+\frac{1}{3}=\frac{2}{3}. It seems the result
we need is that \displaystyle\frac{1}{b}+\frac{1}{b}=\frac{2}{b}. We
hence have that
$$\begin{equation*} \left(1,b\right)+\left(1,b\right)=\left(2,b\right) \end{equation*}$$
Hence more generally we have that
$$\begin{equation*} \left(a,b\right)+\left(c,b\right)=\left(a+c,b\right) \end{equation*}$$
Now, from intuition, we know that for example
\displaystyle\frac{1}{3}=\frac{2}{6}=\frac{1*2}{3*2}. In the language
of the relation we have defined, we have that
$$\begin{equation*} \left(a,b\right)\sim\left(ad,bd\right) \end{equation*}$$
With these facts, we have enough to recover the definition of the addition of rational numbers we were told in school.
We have that
$$\begin{align*} \left(a,b\right)+\left(c,d\right)&\sim\left(ad,bd\right)+\left(bc,bd\right)\\ &\sim\left(ad+bc,bd\right) \end{align*}$$
Indeed, we have for example
$$\begin{equation*} \frac{1}{2}+\frac{1}{3}=\frac{3*1+2*1}{3*2}=\frac{5}{6} \end{equation*}$$
We make the required definition.
::: definition Definition 125. Addition on the Rationals
Let x,y\in\mathbb{Q} with x\in\left[a,b\right] and
y=\left[c,d\right] so that b\neq 0 and d\neq 0. We define addition
on the rationals by
$$\begin{equation} x+y=\left[a,b\right]+\left[c,d\right]=\left[ad+bc,bd\right] \end{equation}$$ :::
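As a small illustration of the definition, the following sketch adds pairs using the rule (a,b)+(c,d)=(ad+bc,bd) and reproduces the worked example above. The names `add` and `related` are hypothetical helpers, and pairs are assumed to have a non-zero second entry.

```python
def related(p, q):
    """(a, b) ~ (c, d) if and only if a*d == b*c."""
    (a, b), (c, d) = p, q
    return a * d == b * c


def add(p, q):
    """Addition on representatives: (a, b) + (c, d) = (a*d + b*c, b*d)."""
    (a, b), (c, d) = p, q
    return (a * d + b * c, b * d)


print(add((1, 2), (1, 3)))                   # (5, 6)
print(add((1, 2), (1, 2)))                   # (4, 4)
print(related(add((1, 2), (1, 2)), (1, 1)))  # True, so 1/2 + 1/2 = 1
```

Note that the result is a representative of the sum and is not reduced, which is why (4,4) appears in place of (1,1).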
Extending multiplication to the rationals
We can extend multiplication to the rationals as well. As with extending
addition, we should consider how integers are represented in the
rationals. As before an integer in the rationals is of the form
\left(a,1\right) and given the definition from the integers we know we
must have
$$\begin{equation*} \left(a,1\right)\left(b,1\right)=\left(ab,1\right) \end{equation*}$$
Now we need to answer the question of
\left(1,b\right)*\left(1,d\right). Taking a similar approach as to
addition we will consider some examples. We intuitively know that
\displaystyle 1*\frac{1}{2}=\frac{1}{2}. This is to say that
$$\begin{equation*} \left(1,1\right)\left(1,2\right)=\left(1,2\right) \end{equation*}$$
We also know that 2*2=4 and so we know \displaystyle\frac{4}{2}=2.
In other words we must have that
$$\begin{equation*} \left(4,1\right)\left(1,2\right)=\left(2,1\right)\sim\left(4,2\right) \end{equation*}$$
Now, suppose we have \displaystyle\frac{3}{2}=1.5, what is
\displaystyle\frac{3}{2}*\frac{1}{3}? Again we know intuitively that
0.5+0.5+0.5=3(0.5)=1.5, hence we can write
$$\begin{equation*} \left(3,2\right)\left(1,3\right)=\left(1,2\right)\sim\left(3,6\right) \end{equation*}$$
We can now see how to handle \left(1,b\right)*\left(1,d\right) and
more generally \left(a,b\right)*\left(c,d\right). We make the
definition.
::: definition Definition 126. Multiplication on the Rationals
Let x,y\in\mathbb{Q} with x\in\left[a,b\right] and
y=\left[c,d\right] so that b\neq 0 and d\neq 0. We define
multiplication on the rationals by
$$\begin{equation} xy=\left[a,b\right]\left[c,d\right]=\left[ac,bd\right] \end{equation}$$ :::
Closure properties of addition and multiplication
As with the natural numbers and integers we need to show that the operations of addition and multiplication on the rationals are closed and well-defined.
::: theorem Theorem 26. Addition and multiplication on the rationals are well-defined operators and closed
We have that \forall x,y\in\mathbb{Q} that

-   $x+y\in\mathbb{Q}$
-   $xy\in\mathbb{Q}$

Proof:

-   x+y\in\mathbb{Q}: We must show that if \left(a,b\right)\sim\left(a',b'\right) and \left(c,d\right)\sim\left(c',d'\right) then we have

    $$\begin{equation*} \left(ad+bc,bd\right)\sim\left(a'd'+b'c',b'd'\right) \end{equation*}$$

    By definition we have that \left(a,b\right)\sim\left(a',b'\right) holds if and only if ab'=ba', likewise \left(c,d\right)\sim\left(c',d'\right) holds if and only if cd'=dc'. By definition of the equivalence relation we have that

    $$\begin{equation*} \left(ad+bc,bd\right)\sim\left(a'd'+b'c',b'd'\right) \iff \left(ad+bc\right)b'd'=bd\left(a'd'+b'c'\right) \end{equation*}$$

    We have that

    $$\begin{align*} \left(ad+bc\right)b'd'&=adb'd'+bcb'd', \text{ As integer multiplication distributes over addition}\\ &=\left(ab'\right)\left(dd'\right)+\left(cd'\right)\left(bb'\right), \text{ By commutativity}\\ &=\left(ba'\right)\left(dd'\right)+\left(dc'\right)\left(bb'\right), \text{ By the equivalence relation}\\ &=\left(bd\right)\left(a'd'\right)+\left(bd\right)\left(b'c'\right), \text{ By commutativity}\\ &=bd\left(a'd'+b'c'\right), \text{ As integer multiplication distributes over addition} \end{align*}$$

    Which is what we wished to show. Hence addition is well-defined. It is left to show closure. Let x,y\in\mathbb{Q} with x=\left(a,b\right) and y=\left(c,d\right) so that b\neq 0 and d\neq 0. By definition of addition we have that

    $$\begin{equation*} \left(a,b\right)+\left(c,d\right)=\left(ad+bc,bd\right) \end{equation*}$$

    As ad+bc\in\mathbb{Z} and bd\in\mathbb{Z}, with bd\neq 0 because the integers have no zero divisors, then \left(ad+bc,bd\right)\in\left[ad+bc,bd\right] and so x+y\in\mathbb{Q}.

-   x*y\in\mathbb{Q}: As with addition we need to show that if \left(a,b\right)\sim\left(a',b'\right) and \left(c,d\right)\sim\left(c',d'\right) that

    $$\begin{equation*} \left(ac,bd\right)\sim\left(a'c',b'd'\right) \end{equation*}$$

    As before, \left(a,b\right)\sim\left(a',b'\right) holds if and only if ab'=ba', likewise \left(c,d\right)\sim\left(c',d'\right) holds if and only if cd'=c'd. It is left to show \left(ac,bd\right)\sim\left(a'c',b'd'\right), that is

    $$\begin{equation*} \left(ac,bd\right)\sim\left(a'c',b'd'\right)\iff acb'd'=bda'c' \end{equation*}$$

    We have

    $$\begin{align*} acb'd'&=\left(ab'\right)\left(cd'\right),\ \text{By commutativity}\\ &=\left(ba'\right)\left(c'd\right),\ \text{By the equivalence relation}\\ &=bda'c',\ \text{By commutativity} \end{align*}$$

    Showing that multiplication is well-defined. To show closure let x,y\in\mathbb{Q} with x=\left(a,b\right) and y=\left(c,d\right) so that b\neq 0 and d\neq 0 then by definition we have that

    $$\begin{equation*} \left(a,b\right)\left(c,d\right)=\left(ac,bd\right) \end{equation*}$$

    From which it is clear that ac,bd\in\mathbb{Z}, again with bd\neq 0, so xy\in\mathbb{Q}.

The result is shown. $\qed$
:::
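Well-definedness says that swapping a pair for an equivalent one does not change the class of the result. The sketch below tests this exhaustively on a small, arbitrarily chosen range of entries. It is a numerical sanity check rather than a proof, and the helper names are hypothetical.

```python
from itertools import product


def related(p, q):
    """(a, b) ~ (c, d) if and only if a*d == b*c."""
    (a, b), (c, d) = p, q
    return a * d == b * c


def add(p, q):
    (a, b), (c, d) = p, q
    return (a * d + b * c, b * d)


def mul(p, q):
    (a, b), (c, d) = p, q
    return (a * c, b * d)


R = range(-3, 4)
pairs = [(a, b) for a, b in product(R, R) if b != 0]

# Every ordered choice of two representatives of the same class.
same_class = [(p, p2) for p in pairs for p2 in pairs if related(p, p2)]

ok_add = all(
    related(add(p, q), add(p2, q2)) for p, p2 in same_class for q, q2 in same_class
)
ok_mul = all(
    related(mul(p, q), mul(p2, q2)) for p, p2 in same_class for q, q2 in same_class
)
# Closure: the second entry of every result stays non-zero.
closed = all(add(p, q)[1] != 0 and mul(p, q)[1] != 0 for p in pairs for q in pairs)

print(ok_add, ok_mul, closed)  # True True True
```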
Associativity of rational addition and multiplication
The associativity of addition and multiplication extends to the rationals.
::: theorem
Theorem 27. Let x,y,z\in\mathbb{Q}. We have that

-   $x+\left(y+z\right)=\left(x+y\right)+z$
-   $x\left(yz\right)=\left(xy\right)z$

Proof:

-   x+\left(y+z\right)=\left(x+y\right)+z: Let x,y,z\in\mathbb{Q} be such that x=\left(a,b\right), y=\left(c,d\right) and z=\left(e,f\right) where a,b,c,d,e,f\in\mathbb{Z} and we have that \left(a,b\right)\in\left[a,b\right], \left(c,d\right)\in\left[c,d\right] and \left(e,f\right)\in\left[e,f\right]. We have that

    $$\begin{align*} x+\left(y+z\right)&=\left(a,b\right)+\left(\left(c,d\right)+\left(e,f\right)\right)\\ &=\left(a,b\right)+\left(cf+de,df\right)\\ &=\left(adf+b\left(cf+de\right),bdf\right)\\ &=\left(adf+bcf+bde,bdf\right)\\ &=\left(\left(ad+bc\right)f+bde,bdf\right)\\ &=\left(\left(ad+bc\right)f+\left(bd\right)e,\left(bd\right)f\right),\text{ By associativity of multiplication for integer numbers}\\ &=\left(ad+bc,bd\right)+\left(e,f\right)\\ &=\left(\left(a,b\right)+\left(c,d\right)\right)+\left(e,f\right)\\ &=\left(x+y\right)+z \end{align*}$$

    Which shows associativity of addition.

-   x\left(yz\right)=\left(xy\right)z: As with addition, let x,y,z\in\mathbb{Q} be such that x=\left(a,b\right), y=\left(c,d\right) and z=\left(e,f\right) where a,b,c,d,e,f\in\mathbb{Z} and we have that \left(a,b\right)\in\left[a,b\right], \left(c,d\right)\in\left[c,d\right] and \left(e,f\right)\in\left[e,f\right]. We then have that

    $$\begin{align*} x\left(yz\right)&=\left(a,b\right)\left(\left(c,d\right)\left(e,f\right)\right)\\ &=\left(a,b\right)\left(ce,df\right)\\ &=\left(ace,bdf\right)\\ &=\left(ac,bd\right)\left(e,f\right)\\ &=\left(\left(a,b\right)\left(c,d\right)\right)\left(e,f\right)\\ &=\left(xy\right)z \end{align*}$$

    Showing associativity of multiplication.

The result follows. $\qed$
:::
Commutativity of rational addition and multiplication
As with the naturals and integers, addition and multiplication in the rationals both satisfy commutativity.
::: theorem Theorem 28. Addition and multiplication are commutative
For all x,y\in\mathbb{Q} we have that

-   $x+y=y+x$
-   $xy=yx$

Proof:

-   x+y=y+x: Let x,y\in\mathbb{Q}. By definition we have that x\in\left[a,b\right] and y\in\left[c,d\right] for some a,b,c,d\in\mathbb{Z}. Let x=\left(a,b\right) and y=\left(c,d\right). We then have by definition of addition that

    $$\begin{align*} x+y&=\left(a,b\right)+\left(c,d\right)\\ &=\left(ad+bc,bd\right)\\ &=\left(bc+ad,bd\right),\ \text{By commutativity of addition for the integers}\\ &=\left(cb+da,db\right),\ \text{By commutativity of multiplication for the integers}\\ &=\left(c,d\right)+\left(a,b\right)\\ &=y+x \end{align*}$$

    Showing commutativity holds for addition in the rationals.

-   xy=yx: Let x,y\in\mathbb{Q}. By definition we have that x\in\left[a,b\right] and y\in\left[c,d\right] for some a,b,c,d\in\mathbb{Z}. So let x=\left(a,b\right) and y=\left(c,d\right). By definition of multiplication we have

    $$\begin{align*} xy&=\left(a,b\right)\left(c,d\right)\\ &=\left(ac,bd\right)\\ &=\left(ca,db\right),\ \text{By commutativity of multiplication of the integers}\\ &=\left(c,d\right)\left(a,b\right)\\ &=yx \end{align*}$$

    Showing commutativity for rational multiplication.

The result has been shown. $\qed$
:::
The Zero and Identity laws
The zero and identity laws from both the naturals and integers extend to the rationals. But first, we show the following result.
::: lemma Lemma 7. Representation of zero in the rationals
We have that 0=\left[0,a\right] for all a\in\mathbb{Z} with
$a\neq 0$
Proof:
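Let x,y\in\left[0,a\right] with x=\left(0,a_1\right) and
y=\left(0,a_2\right) where a_1\neq 0 and a_2\neq 0. We hence have that
x\sim y as
$$\begin{equation*} \left(0,a_1\right)\sim\left(0,a_2\right)\iff 0*a_2=a_1*0\iff 0=0 \end{equation*}$$
Where the final 0=0 is the zero of the integers, from which the
result is clear. $\qed$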
:::
We take the natural representation of 0 for the rationals.
::: theorem Theorem 29. The zero and Identity laws
Let x\in\mathbb{Q}. We have that

-   $x+0=x=0+x$
-   $1*x=x=x*1$

Proof:
Let x\in\mathbb{Q} then we have that x=\left(a,b\right) for some
a,b\in\mathbb{Z} with b\neq 0.

-   x+0=x=0+x: We have that 0\in\left[0,1\right]. Hence we have that

    $$\begin{align*} x+0&=\left(a,b\right)+\left(0,1\right)\\ &=\left(a*1+b*0,b*1\right)\\ &=\left(a,b\right)=x\\ &=\left(0*b+1*a,1*b\right)\\ &=\left(0,1\right)+\left(a,b\right)\\ &=0+x \end{align*}$$

-   x*1=x=1*x: As 1\in\left[1,1\right] then

    $$\begin{align*} x*1&=\left(a,b\right)\left(1,1\right)\\ &=\left(a*1,b*1\right)\\ &=\left(a,b\right)=x\\ &=\left(1*a,1*b\right)\\ &=\left(1,1\right)\left(a,b\right)\\ &=1*x \end{align*}$$

The result follows. $\qed$
:::
Multiplication distributes over addition
Yet another result that extends to the rationals is that multiplication distributes over addition.
::: theorem Theorem 30. Multiplication distributes over addition
For all x,y,z\in\mathbb{Q} we have that
-
$x\left(y+z\right)=xy+xz$
-
$\left(y+z\right)x=yx+zx=xy+xz$
Proof:
Let x,y,z\in\mathbb{Q} then
x\in\left[a,b\right],y\in\left[c,d\right] and z\in\left[e,f\right]
for some a,b,c,d,e,f\in\mathbb{Z}.
Let x=\left(a,b\right), y=\left(c,d\right) and z=\left(e,f\right).
-   x\left(y+z\right)=xy+xz: We have that

    $$\begin{align*} x\left(y+z\right)&=\left(a,b\right)\left(\left(c,d\right)+\left(e,f\right)\right)\\ &=\left(a,b\right)\left(cf+ed,df\right)\\ &=\left(a\left(cf+ed\right),bdf\right)\\ &=\left(acf+aed,bdf\right),\ \text{By multiplication distributes over addition for the integers}\\ &=\left(acf+aed,bdf\right)\left(1,1\right),\ \text{By the identity law for the rationals}\\ &=\left(acf+aed,bdf\right)\left(b,b\right),\ \text{As }\left(1,1\right)\sim\left(b,b\right)\\ &=\left(\left(acf+aed\right)b,bdfb\right)\\ &=\left(acfb+aedb,bdfb\right),\ \text{By multiplication distributes over addition for the integers}\\ &=\left(acbf+aebd,bdbf\right),\ \text{By commutativity of integer multiplication}\\ &=\left(ac,bd\right)+\left(ae,bf\right)\\ &=\left(a,b\right)\left(c,d\right)+\left(a,b\right)\left(e,f\right)\\ &=xy+xz \end{align*}$$

-   \left(y+z\right)x=yx+zx=xy+xz: Invoking the previous part of the proof we have that

    $$\begin{align*} \left(y+z\right)x&=x\left(y+z\right),\ \text{By commutativity of multiplication}\\ &=xy+xz,\ \text{By part }1.\\ &=yx+zx,\ \text{By commutativity of multiplication} \end{align*}$$

As required. $\qed$
:::
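The algebraic laws proved so far can be spot-checked on a small sample of pairs. The sketch below assumes hypothetical helpers `add`, `mul` and `related`, and compares results up to the relation rather than as literal pairs. This matters for distributivity, where the two sides differ by a common factor of b, matching the step in the proof that multiplies by (b,b).

```python
from itertools import product


def related(p, q):
    (a, b), (c, d) = p, q
    return a * d == b * c


def add(p, q):
    (a, b), (c, d) = p, q
    return (a * d + b * c, b * d)


def mul(p, q):
    (a, b), (c, d) = p, q
    return (a * c, b * d)


R = range(-2, 3)
pairs = [(a, b) for a, b in product(R, R) if b != 0]

associative = all(
    related(add(x, add(y, z)), add(add(x, y), z))
    and related(mul(x, mul(y, z)), mul(mul(x, y), z))
    for x, y, z in product(pairs, repeat=3)
)
commutative = all(
    related(add(x, y), add(y, x)) and related(mul(x, y), mul(y, x))
    for x, y in product(pairs, repeat=2)
)
distributive = all(
    related(mul(x, add(y, z)), add(mul(x, y), mul(x, z)))
    for x, y, z in product(pairs, repeat=3)
)
print(associative, commutative, distributive)  # True True True

# The two sides of the distributive law are different pairs in the same class.
lhs = mul((1, 2), add((1, 3), (1, 5)))
rhs = add(mul((1, 2), (1, 3)), mul((1, 2), (1, 5)))
print(lhs, rhs, related(lhs, rhs))  # (8, 30) (16, 60) True
```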
Extending subtraction to the rationals
We can extend subtraction from the integers to the rationals. Recall
that subtraction was defined for x,y\in\mathbb{Z} by
$$\begin{equation*} x-y=x+\left(-y\right)=x+\left(-1*y\right) \end{equation*}$$
That is to say subtraction was defined by adding the negation of y to
x. We will use a similar idea to define subtraction on the rationals.
Firstly we need to consider what it means to negate a rational number.
To do so we need to define what it means for a rational number to be
"positive" or "negative".
We know that any integer x can be expressed as a rational by
\left(x,1\right) and so in this case \left(x,1\right) is positive if
x is positive and \left(x,1\right) is negative if x is negative.
Hence a general rational number \left(a,b\right) being positive or
negative will depend on a and b being positive or negative. There
are a few cases to consider.
-   Suppose that a is positive and b is positive. We have that for \left(a,b\right)\sim\left(c,d\right) for some c,d\in\mathbb{Z} that

    $$\begin{equation*} ad=cb \end{equation*}$$

    As a and b are positive, ad has the sign of d and cb has the sign of c. We are forced to conclude that c and d have the same sign, that is they are both positive or both negative, for if not then the two sides of this equation would have different signs.

-   Suppose that a is positive and b is negative. Then as before we have that for \left(a,b\right)\sim\left(c,d\right) to be true that

    $$\begin{equation*} ad=cb \end{equation*}$$

    As b is negative then we have that cb is either positive or negative depending on c. If c is positive then cb is negative and so d must also be negative. Likewise if c is negative then cb is positive and d must be positive.

The cases for when a is negative and b is either positive or
negative are similar. We can use this to make a definition for a
positive and negative rational number.
::: definition Definition 127. Positive and negative rational number
Let x\in\mathbb{Q} so that x=\left(a,b\right) for some
a,b\in\mathbb{Z}. We say that x is a positive rational number if and
only if a and b have the same sign. That is to say
x\in\mathbb{Q} is positive if and only if
\mathop{\mathrm{sgn}}\left(a\right)=\mathop{\mathrm{sgn}}\left(b\right)
with \mathop{\mathrm{sgn}}\left(a\right)\neq 0 and
\mathop{\mathrm{sgn}}\left(b\right)\neq 0 where
\mathop{\mathrm{sgn}} denotes the sign function of an integer.
If
\mathop{\mathrm{sgn}}\left(a\right)\neq\mathop{\mathrm{sgn}}\left(b\right)
and \mathop{\mathrm{sgn}}\left(a\right)\neq 0 and
\mathop{\mathrm{sgn}}\left(b\right)\neq 0 then we have that x is a
negative rational number.
Finally if \mathop{\mathrm{sgn}}\left(a\right)= 0 and
\mathop{\mathrm{sgn}}\left(b\right)\neq 0 then we say that x is
neither positive nor negative.
:::
We can summarise this definition using \mathop{\mathrm{sgn}} just like
we did for the integers.
::: definition Definition 128. Sign of a rational number
Let x\in\mathbb{Q} where x=\left(a,b\right) with a,b\in\mathbb{Z}
and b\neq 0. We define the sign of x, denoted by
\mathop{\mathrm{sgn}}\left(x\right) to be the following function
$$\begin{align*} \mathop{\mathrm{sgn}}:\mathbb{Q}&\rightarrow\left\{-1,0,1\right\}\\ x&\mapsto\mathop{\mathrm{sgn}}\left(x\right)=\begin{cases} 1,\ \text{If } x\text{ is a positive rational number}\\ -1,\ \text{If } x\text{ is a negative rational number}\\ 0,\ \text{If } \mathop{\mathrm{sgn}}\left(a\right)=0 \end{cases} \end{align*}$$
:::
Now that we have defined the notion of a positive and negative rational
number we can consider what it means to negate a rational number. The
definition follows immediately from the representation of -1 in
\mathbb{Q} being \left(-1,1\right). Indeed for any x\in\mathbb{Q}
with x=\left(a,b\right) we have
$$\begin{equation*} -x=-1x=\left(-1,1\right)\left(a,b\right)=\left(-a,b\right) \end{equation*}$$
We make the formal definition.
::: definition Definition 129. Negation of a rational number
Let x\in\mathbb{Q}. We define the negation of x, denoted -x by
$$\begin{equation} -x=-1x=\left(-1,1\right)x \end{equation}$$
where \left(-1,1\right)\in\left[\left(-1,1\right)\right]. That is
\left(-1,1\right) is an element of the equivalence class
\left[\left(-1,1\right)\right] which represents all possible elements
that are -1.
:::
We can now make a definition for subtraction for the rational numbers
::: definition Definition 130. Rational number subtraction
Let x,y\in\mathbb{Q}. We define the subtraction of y from x,
denoted x-y by
$$\begin{equation*} x-y=x+\left(-y\right)=x+\left(-1*y\right) \end{equation*}$$
:::
We immediately get that subtraction is closed, from the fact that both addition and multiplication are closed. We do not have associativity of subtraction in general.
::: proposition Proposition 80. Rational number subtraction is not associative
Let x,y,z\in\mathbb{Q}. In general we have that
$$\begin{equation*} x-\left(y-z\right)\neq \left(x-y\right)-z \end{equation*}$$
Proof:
Let \displaystyle x=\frac{1}{2}, y=\frac{1}{4} and
\displaystyle z=\frac{1}{6}, we have
x\in\left[\left(1,2\right)\right], y\in\left[\left(1,4\right)\right]
and z\in\left[\left(1,6\right)\right] so
x=\left(1,2\right), y=\left(1,4\right) and z=\left(1,6\right). We
have that
$$\begin{align*} x-\left(y-z\right)&=\left(1,2\right)-\left(\left(1,4\right)-\left(1,6\right)\right)\\ &=\left(1,2\right)-\left(\left(1,4\right)+\left(-1*\left(1,6\right)\right)\right)\\ &=\left(1,2\right)-\left(\left(1,4\right)+\left(-1,6\right)\right)\\ &=\left(1,2\right)-\left(1*6+4*-1,4*6\right)\\ &=\left(1,2\right)-\left(2,24\right)\\ &=\left(1,2\right)+\left(-1*\left(2,24\right)\right)\\ &=\left(1,2\right)+\left(-2,24\right)\\ &=\left(1*24+2*-2,2*24\right)\\ &=\left(20,48\right) \end{align*}$$
On the other hand we have
$$\begin{align*} \left(x-y\right)-z&=\left(\left(1,2\right)-\left(1,4\right)\right)-\left(1,6\right)\\ &=\left(\left(1,2\right)+\left(-1*\left(1,4\right)\right)\right)-\left(1,6\right)\\ &=\left(\left(1,2\right)+\left(-1,4\right)\right)-\left(1,6\right)\\ &=\left(1*4+2*-1,2*4\right)-\left(1,6\right)\\ &=\left(2,8\right)-\left(1,6\right)\\ &=\left(2,8\right)+\left(-1*\left(1,6\right)\right)\\ &=\left(2,8\right)+\left(-1,6\right)\\ &=\left(2*6+8*-1,8*6\right)\\ &=\left(4,48\right) \end{align*}$$
It is left to show that \left(20,48\right)\neq\left(4,48\right).
Indeed to have \left(20,48\right)=\left(4,48\right) we need
\left(20,48\right)\sim\left(4,48\right) which occurs if and only if
20*48=48*4. However on the left-hand side 48 is multiplied by 20
and on the right-hand side 48 is multiplied by 4 so they clearly can
not be equal.
The result is shown. $\qed$
:::
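The counterexample can be reproduced mechanically. The sketch below uses hypothetical helpers `neg` and `sub` built from the definitions of negation and subtraction, and evaluates both bracketings for x=(1,2), y=(1,4) and z=(1,6).

```python
def related(p, q):
    (a, b), (c, d) = p, q
    return a * d == b * c


def add(p, q):
    (a, b), (c, d) = p, q
    return (a * d + b * c, b * d)


def neg(p):
    """Negation: (-1, 1) * (a, b) = (-a, b)."""
    a, b = p
    return (-a, b)


def sub(p, q):
    """Subtraction: x - y = x + (-y)."""
    return add(p, neg(q))


x, y, z = (1, 2), (1, 4), (1, 6)
left = sub(x, sub(y, z))   # x - (y - z)
right = sub(sub(x, y), z)  # (x - y) - z

print(left, right)           # (20, 48) (4, 48)
print(related(left, right))  # False
```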
As with subtraction with integers, we can now show that formally, subtraction is an inverse to addition.
::: {#prop:RationalAdditiveInverse .proposition} Proposition 81. Subtracting a rational number from itself gives zero
Let x\in\mathbb{Q}. We have that
$$\begin{equation*} x-x=0 \end{equation*}$$
Proof:
Let x\in\mathbb{Q} where x\in\left[\left(a,b\right)\right] for some
a,b\in\mathbb{Z} and b\neq 0. We have
$$\begin{align*} x-x&=\left(a,b\right)-\left(a,b\right)\\ &=\left(a,b\right)+\left(-a,b\right)\\ &=\left(ab+b*-a,bb\right)\\ &=\left(ab-ba,bb\right)\\ &=\left(ab-ab,bb\right)\\ &=\left(0,bb\right) \end{align*}$$
It is left to show that \left(0,b*b\right)\sim\left(0,1\right).
Indeed
$$\begin{equation*} 0*1=b*b*0 \Rightarrow 0=0 \end{equation*}$$
The result is shown. $\qed$ :::
The cancellation laws
We can now deduce that the cancellation laws extend to the rational numbers.
::: {#thm:CancellationLawsForRationals .theorem} Theorem 31. The cancellation laws
Let x,y,z\in\mathbb{Q}.

-   If x+y=x+z then we have y=z.
-   For x\neq 0, if xy=xz then we have that y=z.

Proof:

-   If x+y=x+z then we have y=z: Let x,y,z\in\mathbb{Q}. We have that

    $$\begin{align*} x+y&=x+z\\ \Rightarrow -x+x+y&=-x+x+z,\ \text{Adding the negative of } x \text{ to both sides}\\ \Rightarrow \left(-x+x\right)+y&=\left(-x+x\right)+z,\ \text{Associativity of the rationals}\\ \Rightarrow 0+y&=0+z,\ \text{By proposition \ref{prop:RationalAdditiveInverse}}\\ \Rightarrow y&=z \end{align*}$$

-   For x\neq 0, if xy=xz then we have that y=z: Let x,y,z\in\mathbb{Q} where x\neq 0. Suppose that x\in\left[\left(a,b\right)\right], y\in\left[\left(c,d\right)\right] and z\in\left[\left(e,f\right)\right]. We have

    $$\begin{align*} xy&=\left(a,b\right)\left(c,d\right)=\left(ac,bd\right)\\ xz&=\left(a,b\right)\left(e,f\right)=\left(ae,bf\right) \end{align*}$$

    Now suppose that xy=xz then we have that \left(ac,bd\right)\sim\left(ae,bf\right) which is to say

    $$\begin{equation*} acbf=aebd \end{equation*}$$

    As x\neq 0 we have a\neq 0, and b\neq 0 as x is a division tuple, so both can be cancelled. Observe that

    $$\begin{align*} &acbf=aebd\\ &a\left(cbf\right)=a\left(ebd\right)\\ &cbf=ebd,\ \text{By the cancellation laws for the integers}\\ &bcf=bed,\ \text{By commutativity of the integers}\\ &b\left(cf\right)=b\left(ed\right)\\ &cf=ed,\ \text{By the cancellation laws for the integers}\\ \Rightarrow&\left(c,d\right)\sim\left(e,f\right),\ \text{By definition of the equivalence relation} \end{align*}$$

    It hence follows that as \left(c,d\right)\sim\left(e,f\right) then y=z.

The result is shown. $\qed$
:::
Defining multiplicative inverses and division
When we extended the naturals to the integers we were able to extend the
notion of subtraction in such a way that we could undo any addition
operation. We were not able to do the same for multiplication in
general. For example if we have x*2=1 where 1,2,x\in\mathbb{Z} then
there is no integer x that when multiplied by 2 gives 1.
What happens if we consider instead the situation where we have
1,2,x\in\mathbb{Q}? Let x=\left(a,b\right) for some
a,b\in\mathbb{Z} with b\neq 0 and taking the natural representations
for 1 and 2 of 1=\left(1,1\right) and 2=\left(2,1\right). We
have that
$$\begin{align*} x*2&=1\\ \left(a,b\right)\left(2,1\right)&=\left(1,1\right)\\ \left(2a,b\right)&=\left(1,1\right)\\ \Rightarrow\left(2a,b\right)&\sim\left(1,1\right)\iff 2a=b \end{align*}$$
We don't seem to be in a better position than when we asked this
question for \mathbb{Z}. However as a,b were arbitrary, of course
with b\neq 0, we are free to vary them. For example a=1 gives us
b=2, a=2 gives b=4, a=3 yields b=6 and so on. We hence have
that there is a family of possible values for x which satisfy x*2=1
over the rational numbers, in particular we have x=\left(a,2a\right)
for a\in\mathbb{Z} and a\neq 0. Moreover we clearly have
$$\begin{equation*} \left(a,2a\right)\sim\left(1,2\right)\iff 2a=2a \end{equation*}$$
Hence we have that \left(a,2a\right) somehow undoes multiplication by
2. Indeed consider 45*2=90. We have that
$$\begin{equation*} 90*\left(a,2a\right)=\left(90,1\right)\left(a,2a\right)=\left(90a,2a\right) \end{equation*}$$
Where we have \left(90a,2a\right)\sim\left(45,1\right) as
90a*1=45*2a \iff 90a = 90a. We can generalise this to x*y=1 for any
y\in\mathbb{Q}. Indeed let x=\left(a,b\right) and
y=\left(c,d\right) where a,b,c,d\in\mathbb{Z} with b\neq 0, c\neq 0 and
d\neq 0 then we have
$$\begin{align*} xy&=\left(a,b\right)\left(c,d\right)\\ &=\left(ac,bd\right)=\left(1,1\right)\\ \Rightarrow\left(ac,bd\right)&\sim\left(1,1\right)\iff bd=ac \end{align*}$$
This is a somewhat unsatisfactory conclusion as it doesn't tell us what
a or b should actually be equal to in order for x*y=1, likewise,
it doesn't tell us what c or d should be either.
Perhaps then we should consider a simpler setup. Suppose that
x\in\mathbb{Z} then is there y\in\mathbb{Q} where
y=\left(c,d\right) with d\neq 0, such that x*y=1? We have
$$\begin{equation*} xy=\left(x,1\right)\left(c,d\right)=\left(xc,d\right)=\left(1,1\right) \end{equation*}$$
Hence
$$\begin{equation*} \left(xc,d\right)\sim\left(1,1\right)\iff xc=d \end{equation*}$$
Hence y=\left(c,xc\right) satisfies this relation. However we can see
that \left(c,cx\right)\sim\left(1,x\right). Hence for any integer
x\neq 0 we have a solution to x*y=1 with y\in\mathbb{Q}. We call
y a multiplicative inverse of x and x a multiplicative inverse of
y.
::: definition Definition 131. Multiplicative inverse of an integer
Let x\in\mathbb{Z} be such that x\neq 0. Then there is a
y\in\mathbb{Q} such that
$$\begin{equation*} xy=1=yx \end{equation*}$$
where y=\left(1,x\right). We can write this as
\displaystyle y=\frac{1}{x} or y=x^{-1}. We sometimes say that
x^{-1} is a reciprocal of x or a multiplicative inverse of x.
:::
In light of this, we have the immediate result
::: {#prop:MultiplicativeInverseOfIntegerTimesInverseIsOriginalNumber .proposition} Proposition 82. Multiplicative inverse of an integer times its multiplicative inverse is the original number
Let x\in\mathbb{Z} with x\neq 0 so that x^{-1}\in\mathbb{Q} where
\displaystyle x^{-1}=\frac{1}{x} is the multiplicative inverse to x
in the rationals. The following result holds.
$$\begin{equation} xx^{-1}x = x \end{equation}$$
Proof:
By definition of a multiplicative inverse we have that
$$\begin{equation} xx^{-1}=x\frac{1}{x}=\left(x,1\right)\left(1,x\right)=\left(x,x\right)\sim\left(1,1\right)=1 \end{equation}$$
Hence as x^{-1} is a multiplicative inverse for x it follows that
x is a multiplicative inverse for x^{-1} and so
$$\begin{equation} xx^{-1}x=1x=x \end{equation}$$
As required. $\qed$ :::
Armed with this definition we can answer the original question. Recall
that y=\left(c,d\right)=\left(c,1\right)\left(1,d\right)=cd^{-1}. To find an
x so that x*y=1 we need a multiplicative inverse for c and a
multiplicative inverse for \displaystyle d^{-1}=\frac{1}{d}. Clearly we have that
\displaystyle c^{-1}=\frac{1}{c} and a multiplicative inverse for
d^{-1} is simply d. Hence a candidate for x is given by
x=dc^{-1}=\left(d,c\right). Indeed we have that
$$\begin{equation*} xy=\left(d,c\right)\left(c,d\right)=\left(cd,cd\right)\sim\left(1,1\right)=1 \end{equation*}$$
We can hence extend the idea of multiplicative inverses to the rationals.
::: definition Definition 132. Multiplicative inverse of a rational number
Let x\in\mathbb{Q} such that x=\left(a,b\right) with
a,b\in\mathbb{Z}, a\neq 0 and b\neq 0. Then there is a y\in\mathbb{Q} such
that
$$\begin{equation*} xy=1=yx \end{equation*}$$
where y=\left(b,a\right). The condition a\neq 0 is needed for y to be a
rational number. We write this as \displaystyle y=\frac{b}{a} or as
\displaystyle x^{-1}=y=\frac{b}{a}. We sometimes say that x^{-1} is a
reciprocal of x or a multiplicative inverse of x.
:::
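The pair description of the rationals can be checked directly by computation. The following Python sketch is an illustration only: the helper names `mul`, `equivalent` and `inverse` are hypothetical and do not appear in the text, and a rational is modelled as a plain tuple of two integers with non-zero second entry.

```python
def mul(p, q):
    """Multiply the pairs (a, b) and (c, d) to get (a*c, b*d)."""
    (a, b), (c, d) = p, q
    return (a * c, b * d)


def equivalent(p, q):
    """Test (a, b) ~ (c, d), which holds exactly when a*d == b*c."""
    (a, b), (c, d) = p, q
    return a * d == b * c


def inverse(p):
    """Return the multiplicative inverse (b, a) of the pair (a, b)."""
    a, b = p
    if a == 0 or b == 0:
        raise ZeroDivisionError("only pairs with both entries non-zero are invertible")
    return (b, a)


one = (1, 1)
x = (3, -7)

# x times its inverse is equivalent to (1, 1).
print(equivalent(mul(x, inverse(x)), one))

# The pair (a, 2a) with a = 5 undoes multiplication by 2: 90 * (5, 10) ~ (45, 1).
print(equivalent(mul((90, 1), (5, 10)), (45, 1)))
```

Note that the product of a pair with its inverse is only equivalent to `(1, 1)`, not equal to it as a tuple, which is why the check uses `equivalent` rather than `==`.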
A result similar to proposition 82{reference-type="ref" reference="prop:MultiplicativeInverseOfIntegerTimesInverseIsOriginalNumber"} holds for the rationals.
::: {#prop:MultiplicativeInverseOfRationalTimesInverseIsOriginalNumber .proposition} Proposition 83. Multiplicative inverse of a rational number times its multiplicative inverse is the original number
Let x\in\mathbb{Q} with x=\left(a,b\right) and a,b\in\mathbb{Z}
so that a\neq 0 and b\neq 0. Let x^{-1} denote the multiplicative
inverse of x. The following result holds.
$$\begin{equation} xx^{-1}x = x \end{equation}$$
Proof:
By definition of a multiplicative inverse we have that
$$\begin{equation} xx^{-1}=1 \end{equation}$$
Hence
$$\begin{equation} xx^{-1}x=1x=x \end{equation}$$
As required. $\qed$ :::
We now have a solid grasp of undoing multiplication in the rational numbers, and we are in a position to define the operation of division. Most of the work is already done: division was our original motivation for defining the rational numbers in the first place, and multiplicative inverses give us exactly what we need.
::: definition Definition 133. Division
Let a,b\in\mathbb{Z} so that b\neq 0. We define the division of a
by b, denoted \displaystyle\frac{a}{b} by
$$\begin{equation*} \frac{a}{b}=ab^{-1}=\left(a,1\right)\left(1,b\right)=\left(a,b\right) \end{equation*}$$ :::
We can extend the notion of division even further by considering
a,b\in\mathbb{Q} rather than \mathbb{Z}. At first it appears we have
a problem, we defined the rationals using integers and division in terms
of integers, so how could we possibly assign any meaning to an
expression like \displaystyle\frac{1}{\frac{1}{2}}?
Consider for example the following
$$\begin{equation*} \frac{1}{\frac{1}{2}}*\frac{1}{2} \end{equation*}$$
If we suppose that the rule for multiplication that we defined extends to this situation then we get
$$\begin{equation*} \frac{1}{\frac{1}{2}}*\frac{1}{2}=\frac{1*1}{\frac{1}{2}*2}=\frac{1}{1}=1 \end{equation*}$$
In the context of the work we have just done we have that
\displaystyle \frac{1}{\frac{1}{2}} is a multiplicative inverse of
\frac{1}{2}. However we know that \displaystyle \frac{1}{2} has a
multiplicative inverse of 2. Does this mean that
\displaystyle \frac{1}{\frac{1}{2}}=2? To answer this we need a deeper analysis of
expressions of the form \displaystyle \frac{1}{\frac{1}{a}}.
We know from before that \displaystyle\frac{1}{a}=a^{-1} for any
non-zero a\in\mathbb{Z}. Hence we have that by definition
a^{-1}\in\mathbb{Q}. Hence we are considering the expression
$$\begin{equation*} \frac{1}{\frac{1}{a}}=\frac{1}{a^{-1}} \end{equation*}$$
Therefore we know from the definition of the multiplicative inverse of a
rational number that there is some y\in\mathbb{Q} so that
$$\begin{equation*} \frac{1}{a^{-1}}y=1 \end{equation*}$$
By the definition we also know what y must be, namely
\displaystyle y=\frac{a^{-1}}{1}=a^{-1}=\frac{1}{a}. Hence we can
justify our "temporary" assumption of extending the multiplication rule.
We hence make the following deduction.
::: {#prop:OneDividedByMultiplicativeInverseOfInteger .proposition} Proposition 84. One divided by multiplicative inverse of an integer is the integer itself
Let x\in\mathbb{Q} so that \displaystyle x=\frac{1}{\frac{1}{a}}
for some a\in\mathbb{Z} with a\neq 0. We have that
$$\begin{equation*} \frac{1}{\frac{1}{a}}=a \end{equation*}$$
Proof:
Let x\in\mathbb{Q} be such that
\displaystyle x=\frac{1}{\frac{1}{a}} for some non-zero
a\in\mathbb{Z}. We know by definition that
$$\begin{equation*} \frac{1}{a}=a^{-1} \end{equation*}$$
where a^{-1}\in\mathbb{Q} and therefore
\displaystyle x = \frac{1}{a^{-1}}. Moreover this is still a rational
number by definition and so there exists some rational y so that
$$\begin{equation*} xy=1 \end{equation*}$$
where \displaystyle y=\frac{a^{-1}}{1}=a^{-1}. It follows that
\displaystyle y=\frac{1}{a}. Again by definition there is some
z\in\mathbb{Q} so that y*z=1 where \displaystyle z=\frac{a}{1}=a,
that is to say z is a multiplicative inverse of y.
We therefore have that
$$\begin{equation*} xy=1=yz \end{equation*}$$
Hence by theorem
31{reference-type="ref"
reference="thm:CancellationLawsForRationals"} we have that x=z which
is to say
$$\begin{equation*} \frac{1}{\frac{1}{a}}=a \end{equation*}$$
As required. $\qed$ :::
We hence get an immediate corollary
::: corollary Corollary 4. One divided by rational number
Let x\in\mathbb{Q} be such that \displaystyle x=\frac{a}{b} with a\neq 0 and b\neq 0. We
have that
$$\begin{equation*} \frac{1}{x}=\frac{1}{\frac{a}{b}}=\frac{b}{a} \end{equation*}$$
Proof:
We have
$$\begin{equation*} \frac{1}{x}=\frac{1}{\frac{a}{b}}=\frac{1}{a\frac{1}{b}}=\frac{1}{a b^{-1}}=\frac{1}{a}\frac{1}{b^{-1}}=\frac{1}{a}b=\frac{b}{a} \end{equation*}$$
As required. $\qed$ :::
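The reciprocal rules above can be spot-checked with exact rational arithmetic. The sketch below uses Python's standard `fractions.Fraction` type as a stand-in for the rationals; the helper name `reciprocal` is hypothetical and chosen only for this illustration.

```python
from fractions import Fraction


def reciprocal(x):
    """Return 1/x for a non-zero rational x."""
    return Fraction(1) / x


print(reciprocal(Fraction(1, 2)))           # 2
print(reciprocal(reciprocal(Fraction(7))))  # 7
print(reciprocal(Fraction(3, 5)))           # 5/3
```

A handful of sample values is of course not a proof; it only confirms that the stated identities agree with an independent implementation of rational arithmetic.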
Extending the summation and product notations to the rationals
Summation and product notation has been defined on the naturals as well as the integers. We can extend the notation to include the rational numbers.
Let q\in\mathbb{Q}^{n+m+1} be an ordered n+m+1 tuple of rational
numbers where
$$\begin{equation*} q=\left(q_{-m},q_{-m+1},\dots,q_{-1},q_0,q_1,\dots, q_n\right) \end{equation*}$$
Define
\mathbb{Z}_m^n=\left\{-m,-m+1,-m+2,\dots,-1,0,1,\dots,n-1,n\right\} to
be a set of indices and define f:\mathbb{Z}_m^n\rightarrow\mathbb{Q}
by
$$\begin{align*} f:\mathbb{Z}_m^n&\rightarrow \mathbb{Q}\\ i&\mapsto f\left(i\right)=q_i \end{align*}$$
::: definition Definition 134. Summation notation for rational numbers
Let q\in\mathbb{Q}^{n+m+1} be an ordered n+m+1 tuple of rational numbers where
q=\left(q_{-m},q_{-m+1},\dots,q_{-1},q_0,q_1,\dots, q_n\right). Define
\mathbb{Z}_m^n by
\mathbb{Z}_m^n=\left\{-m,-m+1,-m+2,\dots,-1,0,1,\dots,n-1,n\right\}.
Let f:\mathbb{Z}_m^n\rightarrow\mathbb{Q} be defined by
$$\begin{align*} f:\mathbb{Z}_m^n&\rightarrow\mathbb{Q}\\ i&\mapsto f\left(i\right)=q_i \end{align*}$$
We define the summation notation for the rational numbers by
$$\begin{equation*} \sum_{i=-m}^n f\left(i\right)=f\left(-m\right)+f\left(-m+1\right)+\dots+f\left(-1\right)+f\left(0\right)+f\left(1\right)+\dots+f\left(n\right) \end{equation*}$$
Alternatively this is written
$$\begin{equation*} \sum_{i=-m}^n q_i = q_{-m}+q_{-m+1}+\dots+q_{-1}+q_0+q_1+\dots+q_n \end{equation*}$$
We have that i is called the index of summation and that i=-m is
the starting index of the summation, and n the ending index of the
summation. If the tuple q is empty then we define the summation to be 0 and
call the summation an empty sum.
We can also define the summation over some subset of \mathbb{Z}_m^n
which allows for starting a summation at some starting point other than
i=-m. Let T\subseteq\mathbb{Z}_m^n. We define the summation over the
set T by
$$\begin{equation*} \sum_{i\in T} q_i \end{equation*}$$
If we have a mapping g:\mathbb{Q}\rightarrow\mathbb{Q} we can define
a summation over g by
$$\begin{equation*} \sum_{i\in T} g\left(q_i\right) \end{equation*}$$
Finally we can define a summation over a predicate P\left(i\right)
for i\in T by
$$\begin{equation*} \sum_{P\left(i\right)}g\left(q_i\right) \end{equation*}$$
where we take the sum of the g\left(q_i\right) for the i that
satisfy the predicate P. We note that if we have k>n for some
k\in\mathbb{Z} then the sum
$$\begin{equation*} \sum_{i=k}^n q_i=0 \end{equation*}$$ :::
The usual properties shown for summations over the integers also extend to the rational number version.
::: proposition Proposition 85. Properties of summation notation
Let n,m\in\mathbb{Z} such that m<n. Let s,t\in\mathbb{Q}^{n+m+1}
and let c\in\mathbb{Q}.
Let a,b\in\mathbb{Z} with m<a<b<n. Define A=\left\{a,a+1,\dots,b\right\} and
define
$$\begin{equation*} B=\mathbb{Z}_m^n\setminus A=\left\{-m,-m+1,\dots,a-1,b+1,\dots,n-1,n\right\} \end{equation*}$$
so that A\cup B =\mathbb{Z}_m^n. Let
k\in \mathbb{Z} be the starting index of a summation such that k<n, and let
d\in\mathbb{Z} with k\leq d<n. We have that the following properties hold.
1. $\displaystyle \sum_{i=-m}^n s_i = \sum_{i\in A} s_i +\sum_{i\in B} s_i =\sum_{i=-m}^{-1} s_i + \sum_{i=0}^{n}s_i$
2. $\displaystyle \sum_{i=k}^n s_i = \sum_{i=k}^d s_i + \sum_{i=d+1}^n s_i$
3. $\displaystyle\sum_{i=k}^n c*s_i = c*\sum_{i=k}^n s_i$ for any $c\in\mathbb{Q}$
4. $\displaystyle\sum_{i=k}^n \left(s_i+t_i\right) = \sum_{i=k}^n s_i + \sum_{i=k}^n t_i$
Proof:
1. $\displaystyle \sum_{i=-m}^n s_i = \sum_{i\in A} s_i +\sum_{i\in B} s_i =\sum_{i=-m}^{-1} s_i + \sum_{i=0}^{n}s_i$: The proof is the same as for the integer case. We give it again for completeness.
    We have that
    $$\begin{align*} \sum_{i=-m}^n s_i&=s_{-m}+s_{-m+1}+s_{-m+2}+\dots+s_{-1}+s_0+s_1+\dots+s_{n-1}+s_{n}\\ &=\left(s_{-m}+s_{-m+1}+s_{-m+2}+\dots+s_{-1}\right)+\left(s_0+s_1+\dots+s_{n-1}+s_{n}\right)\\ &=\sum_{i=-m}^{-1} s_i + \sum_{i=0}^n s_i \end{align*}$$
    Additionally note that
    $$\begin{align*} \sum_{i=-m}^n s_i&=s_{-m}+s_{-m+1}+s_{-m+2}+\dots+s_{-1}+s_0+s_1+\dots+s_{n-1}+s_{n}\\ &= \left(s_{-m}+s_{-m+1}+\dots+s_{a-2}+s_{a-1}\right)+\left(s_a+s_{a+1}+\dots+s_{b-1}+s_b\right)+\left(s_{b+1}+s_{b+2}+\dots+s_{n-1}+s_n\right)\\ &= \left(s_{-m}+s_{-m+1}+\dots+s_{a-2}+s_{a-1}\right) + \left(s_{b+1}+s_{b+2}+\dots+s_{n-1}+s_n\right)+ \left(s_a+s_{a+1}+\dots+s_{b-1}+s_b\right)\\ &= \sum_{i\in B} s_i + \sum_{i\in A} s_i = \sum_{i\in A} s_i + \sum_{i\in B} s_i \end{align*}$$
2. $\displaystyle \sum_{i=k}^n s_i = \sum_{i=k}^d s_i + \sum_{i=d+1}^n s_i$: The proof is similar to part 1, replacing $-m$ by $k$ and splitting the terms after the index $d$.
3. $\displaystyle\sum_{i=k}^n c*s_i = c*\sum_{i=k}^n s_i$ for any $c\in\mathbb{Q}$: We have by definition that
    $$\begin{equation*} \sum_{i=k}^n c*s_i=c*s_k+c*s_{k+1}+c*s_{k+2}+\dots+c*s_n \end{equation*}$$
    By multiplication distributing over addition we have
    $$\begin{equation*} \sum_{i=k}^n c*s_i=c*s_k+c*s_{k+1}+c*s_{k+2}+\dots+c*s_n=c*\left(s_k+s_{k+1}+\dots+s_n\right)=c*\sum_{i=k}^n s_i \end{equation*}$$
4. $\displaystyle\sum_{i=k}^n \left(s_i+t_i\right) = \sum_{i=k}^n s_i + \sum_{i=k}^n t_i$: This follows by the definition, together with the commutativity and associativity of addition. We have
    $$\begin{align*} \sum_{i=k}^n \left(s_i+t_i\right)&= \left(s_k+t_k\right)+\left(s_{k+1}+t_{k+1}\right)+\dots+\left(s_{n-1}+t_{n-1}\right)+\left(s_{n}+t_{n}\right)\\ &=\left(s_k+s_{k+1}+\dots+s_{n-1}+s_n\right)+\left(t_k+t_{k+1}+\dots+t_{n-1}+t_n\right)\\ &= \sum_{i=k}^n s_i + \sum_{i=k}^n t_i \end{align*}$$
$\qed$ :::
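The summation properties can be illustrated numerically. The following Python sketch is a check on sample data, not part of the proof: the function `rational_sum` and the sample tuples `s` and `t` are hypothetical names introduced here, and tuples indexed by -m, ..., n are stored as dictionaries keyed by index.

```python
from fractions import Fraction


def rational_sum(q, start, stop):
    """Add q[i] for i = start, ..., stop; an empty range (start > stop) gives 0."""
    total = Fraction(0)
    for i in range(start, stop + 1):
        total += q[i]
    return total


m, n = 2, 3
indices = range(-m, n + 1)

# Hypothetical sample tuples s and t, stored as dictionaries keyed by index.
s = {i: Fraction(i, 3) for i in indices}
t = {i: Fraction(1, i + m + 1) for i in indices}
c = Fraction(5, 7)
k, d = -1, 1

# Splitting the range of summation at 0 and at d.
split_at_zero = rational_sum(s, -m, n) == rational_sum(s, -m, -1) + rational_sum(s, 0, n)
split_at_d = rational_sum(s, k, n) == rational_sum(s, k, d) + rational_sum(s, d + 1, n)

# Pulling out a constant factor and adding termwise.
scaled = {i: c * s[i] for i in indices}
added = {i: s[i] + t[i] for i in indices}
homogeneous = rational_sum(scaled, k, n) == c * rational_sum(s, k, n)
additive = rational_sum(added, k, n) == rational_sum(s, k, n) + rational_sum(t, k, n)

# A sum whose starting index exceeds its ending index is empty.
empty = rational_sum(s, n + 1, n) == 0

print(split_at_zero, split_at_d, homogeneous, additive, empty)
```

Because `Fraction` arithmetic is exact, the comparisons test genuine equality of rationals rather than approximate floating point agreement.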
We make a similar definition for product notation.
::: definition Definition 135. Product notation for the rationals numbers
Let q\in\mathbb{Q}^{n+m+1} be an ordered n+m+1 tuple of rational numbers where
q=\left(q_{-m},q_{-m+1},\dots,q_{-1},q_0,q_1,\dots, q_n\right). Define
\mathbb{Z}_m^n by
\mathbb{Z}_m^n=\left\{-m,-m+1,-m+2,\dots,-1,0,1,\dots,n-1,n\right\}.
Let f:\mathbb{Z}_m^n\rightarrow\mathbb{Q} be defined by
$$\begin{align*} f:\mathbb{Z}_m^n&\rightarrow\mathbb{Q}\\ i&\mapsto f\left(i\right)=q_i \end{align*}$$
We define the product notation for the rational numbers by
$$\begin{equation*} \prod_{i=-m}^n f\left(i\right)=f\left(-m\right)*f\left(-m+1\right)*\dots*f\left(-1\right)*f\left(0\right)*f\left(1\right)*\dots*f\left(n\right) \end{equation*}$$
Alternatively this is written
$$\begin{equation*} \prod_{i=-m}^n q_i = q_{-m}*q_{-m+1}*\dots*q_{-1}*q_0*q_1*\dots*q_n \end{equation*}$$
We have that i is called the index of the product and that i=-m is
the starting index of the product, and n the ending index of the
product. If the tuple q is empty then we define the product to be 1 and
call the product an empty product.
We can also define the product over some subset of \mathbb{Z}_m^n which
allows for starting a product at some starting point other than i=-m.
Let T\subseteq\mathbb{Z}_m^n. We define the product over the set T
by
$$\begin{equation*} \prod_{i\in T} q_i \end{equation*}$$
If we have a mapping g:\mathbb{Q}\rightarrow\mathbb{Q} we can define
a product over g by
$$\begin{equation*} \prod_{i\in T} g\left(q_i\right) \end{equation*}$$
Finally we can define a product over a predicate P\left(i\right) for
i\in T by
$$\begin{equation*} \prod_{P\left(i\right)}g\left(q_i\right) \end{equation*}$$
where we take the product of the g\left(q_i\right) for the i that
satisfy the predicate P. We note that if we have k>n for some
k\in\mathbb{Z} then the product
$$\begin{equation*} \prod_{i=k}^n q_i=1 \end{equation*}$$ :::
::: proposition Proposition 86. Properties of product notation
Let n,m\in\mathbb{Z} such that m<n. Let s,t\in\mathbb{Q}^{n+m+1}.
Let a,b\in\mathbb{Z} so that m<a<b<n.
Define A=\left\{a,a+1,\dots,b\right\} and define
$$\begin{equation*} B=\mathbb{Z}_m^n\setminus A=\left\{-m,-m+1,\dots,a-1,b+1,\dots,n-1,n\right\} \end{equation*}$$
so that A\cup B =\mathbb{Z}_m^n. Let
k\in \mathbb{Z} be the lower index of the product and let d\in\mathbb{Z} with k\leq d<n.
We have that the following properties hold.
1. $\displaystyle \prod_{i=-m}^n s_i = \prod_{i\in A} s_i * \prod_{i\in B} s_i = \prod_{i=-m}^{-1} s_i * \prod_{i=0}^n s_i$
2. $\displaystyle \prod_{i=k}^n s_i = \prod_{i=k}^d s_i * \prod_{i=d+1}^n s_i$
3. $\displaystyle\prod_{i=k}^n s_i*t_i = \prod_{i=k}^n s_i * \prod_{i=k}^n t_i$
Proof:
1. $\displaystyle \prod_{i=-m}^n s_i = \prod_{i\in A} s_i *\prod_{i\in B} s_i = \prod_{i=-m}^{-1} s_i * \prod_{i=0}^n s_i$: The proof is the same as for the integer case.
    We have that
    $$\begin{align*} \prod_{i=-m}^n s_i &= s_{-m}*s_{-m+1}*s_{-m+2}*\dots*s_{-1}*s_0*s_1*\dots*s_{n-1}*s_n\\ &=\left(s_{-m}*s_{-m+1}*s_{-m+2}*\dots*s_{-1}\right)*\left(s_0*s_1*\dots*s_{n-1}*s_n\right)\\ &=\prod_{i=-m}^{-1}s_i*\prod_{i=0}^n s_i \end{align*}$$
    Likewise we have
    $$\begin{align*} \prod_{i=-m}^n s_i &= s_{-m}*s_{-m+1}*s_{-m+2}*\dots*s_{-1}*s_0*s_1*\dots*s_{n-1}*s_n\\ &= \left(s_{-m}*s_{-m+1}*\dots*s_{a-2}*s_{a-1}\right)*\left(s_a*s_{a+1}*\dots*s_{b-1}*s_b\right)*\left(s_{b+1}*s_{b+2}*\dots*s_{n-1}*s_n\right)\\ &= \left(s_{-m}*s_{-m+1}*\dots*s_{a-2}*s_{a-1}\right) * \left(s_{b+1}*s_{b+2}*\dots*s_{n-1}*s_n\right)*\left(s_a*s_{a+1}*\dots*s_{b-1}*s_b\right)\\ &=\prod_{i\in B}s_i * \prod_{i\in A} s_i = \prod_{i\in A} s_i * \prod_{i\in B} s_i \end{align*}$$
2. $\displaystyle \prod_{i=k}^n s_i = \prod_{i=k}^d s_i * \prod_{i=d+1}^n s_i$: The proof is similar to part 1. We replace $-m$ with $k$ and split the factors after the index $d$.
3. $\displaystyle\prod_{i=k}^n s_i*t_i = \prod_{i=k}^n s_i * \prod_{i=k}^n t_i$: Observe that, by the commutativity and associativity of multiplication,
    $$\begin{align*} \prod_{i=k}^n s_i*t_i&=s_{k}*t_{k}*s_{k+1}*t_{k+1}*s_{k+2}*t_{k+2}*\dots*s_{n-1}*t_{n-1}*s_{n}*t_{n}\\ &=\left(s_{k}*s_{k+1}*s_{k+2}*\dots*s_{n-1}*s_{n}\right)*\left(t_{k}*t_{k+1}*t_{k+2}*\dots*t_{n-1}*t_{n}\right)\\ &=\prod_{i=k}^n s_i * \prod_{i=k}^n t_i \end{align*}$$
$\qed$ :::
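As with sums, the product properties can be checked on sample data. This Python sketch assumes Python 3.8 or later for `math.prod`; the function `rational_prod`, the sample tuples `s` and `t`, and the chosen index block are hypothetical and serve only as an illustration.

```python
from fractions import Fraction
from math import prod


def rational_prod(q, start, stop):
    """Multiply q[i] for i = start, ..., stop; an empty range gives 1."""
    return prod((q[i] for i in range(start, stop + 1)), start=Fraction(1))


m, n = 2, 3
indices = range(-m, n + 1)

# Hypothetical sample tuples with non-zero entries, keyed by index.
s = {i: Fraction(i + m + 1, 2) for i in indices}
t = {i: Fraction(3, i + m + 2) for i in indices}
k = -1

# A block A of consecutive indices and its complement B in the index set.
A = [i for i in indices if 0 <= i <= 1]
B = [i for i in indices if i not in A]
prod_A = prod((s[i] for i in A), start=Fraction(1))
prod_B = prod((s[i] for i in B), start=Fraction(1))
split_A_B = rational_prod(s, -m, n) == prod_A * prod_B

# The product of termwise products is the product of the products.
st = {i: s[i] * t[i] for i in indices}
multiplicative = rational_prod(st, k, n) == rational_prod(s, k, n) * rational_prod(t, k, n)

# A product whose starting index exceeds its ending index is empty.
empty = rational_prod(s, n + 1, n) == 1

print(split_A_B, multiplicative, empty)
```

Passing `start=Fraction(1)` makes the empty product a rational 1, matching the convention in the definition.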
We can now extend the result of proposition
39{reference-type="ref"
reference="prop:NaturalsHaveNoZeroDivisors"} and proposition
69{reference-type="ref"
reference="prop:IntegersHaveNoZeroDivisors"}. That is, if
ab=0 for a,b\in\mathbb{Q} then at least one of a or b is zero.
::: {#prop:RationalsHaveNoZeroDivisors .proposition} Proposition 87. Product of two rational numbers being zero implies one of the numbers is zero
Let x,y\in\mathbb{Q}. If xy=0 then at least one of x or y is
zero.
Proof:
Let x,y\in\mathbb{Q} with xy=0. If x=y=0 then the result is trivial. So
suppose that x=\left(a,b\right) and y=\left(c,d\right), moreover
suppose y\neq 0, so that c\neq 0. By definition of rational number
multiplication we have that
$$\begin{equation*} xy=\left(a,b\right)\left(c,d\right)=\left(ac,bd\right)\sim\left(0,1\right) \end{equation*}$$
Hence we must have ac*1=bd*0, that is ac=0. Therefore by proposition
69{reference-type="ref"
reference="prop:IntegersHaveNoZeroDivisors"} we must have that either
a=0 or c=0 or both. As c\neq 0 we conclude that a=0 and so
x=0. A similar argument assuming x\neq 0 shows that y=0. The
result is shown. $\qed$
:::
Extending the rules for inequalities to the rationals
For the natural numbers and the integers, we have a theory of inequalities, and we were able to derive some rules for how inequalities behave. These results extend to the rationals. Additionally, as rational numbers represent the division of integers there are some additional properties that now hold.
To extend the results fully we need to consider negative rational numbers as well. We follow a similar layout to the section on integer inequalities.
::: {#prop:MultiplicationByNegativeOneFlipsInequalitySignRational .proposition}
Proposition 88. Multiplication by -1 changes the inequality sign
Let x,y\in\mathbb{Q}. We have the following
1. If $x<y$ then $-x>-y$
2. If $x\leq y$ then $-x\geq -y$
3. If $x>y$ then $-x<-y$
4. If $x\geq y$ then $-x\leq-y$
Proof:
Let x,y\in\mathbb{Q} so that \displaystyle x=\frac{a}{b} and
\displaystyle y=\frac{c}{d} where b\neq 0 and d\neq 0.
1. If $x<y$ then $-x>-y$: Let $x,y\in\mathbb{Q}$ so that $x<y$. By definition of $<$ for the rationals we have that
    $$\begin{equation*} x<y\iff ad<bc \end{equation*}$$
    Applying proposition 70{reference-type="ref" reference="prop:MultiplicationByNegativeOneFlipsInequalitySign"} we have
    $$\begin{equation*} ad<bc\Rightarrow -ad>-bc \end{equation*}$$
    As $-x=\left(-a,b\right)$ and $-y=\left(-c,d\right)$, this is the statement $-x>-y$.
2. If $x\leq y$ then $-x\geq -y$: Let $x,y\in\mathbb{Q}$ so that $x\leq y$. Applying the definition of $\leq$ for the rationals gives
    $$\begin{equation*} x\leq y\iff ad\leq bc \end{equation*}$$
    Proposition 70{reference-type="ref" reference="prop:MultiplicationByNegativeOneFlipsInequalitySign"} gives
    $$\begin{equation*} ad\leq bc\Rightarrow -ad\geq -bc \end{equation*}$$
    Hence $-x\geq -y$.
3. If $x>y$ then $-x<-y$: Let $x,y\in\mathbb{Q}$ so that $x>y$. By definition of $>$ for the rationals we have that
    $$\begin{equation*} x>y\iff ad>bc \end{equation*}$$
    Proposition 70{reference-type="ref" reference="prop:MultiplicationByNegativeOneFlipsInequalitySign"} shows us that
    $$\begin{equation*} ad>bc\Rightarrow -ad<-bc \end{equation*}$$
    Hence $-x<-y$.
4. If $x\geq y$ then $-x\leq-y$: Let $x,y\in\mathbb{Q}$ so that $x\geq y$. By definition of $\geq$ for the rationals, we have that
    $$\begin{equation*} x\geq y\iff ad\geq bc \end{equation*}$$
    Applying proposition 70{reference-type="ref" reference="prop:MultiplicationByNegativeOneFlipsInequalitySign"} we have
    $$\begin{equation*} ad\geq bc\Rightarrow -ad\leq-bc \end{equation*}$$
    Hence $-x\leq -y$.
The result is shown. $\qed$ :::
Another lemma will be useful for extending the rules of inequalities to the rationals.
::: {#lem:LargerRatMinusSmallIsPositive .lemma} Lemma 8. Strictly larger rational minus a smaller is positive
Let x,y\in\mathbb{Q}. We have that $x<y\iff y-x>0$
Proof:
\left(\Rightarrow\right): Let x,y\in\mathbb{Q}, then
\displaystyle x=\frac{a}{b} and \displaystyle y=\frac{c}{d} for some
a,b,c,d\in\mathbb{Z} and b\neq 0 and d\neq 0. As x<y then
ad<bc. Now y-x is given by
$$\begin{align*} y-x&=y+\left(-1*x\right)\\ &=\left(c,d\right)+\left(-a,b\right)\\ &=\left(cb-ad,bd\right) \end{align*}$$
Now, saying y-x is positive is the same as y-x>0. By definition of
greater than, and the fact that 0\in\left[\left(0,1\right)\right] we
would have that
$$\begin{equation*} \left(cb-ad\right)*1>0*\left(bd\right) \Rightarrow bc-ad>0 \end{equation*}$$
Which is true as ad<bc. Hence y-x>0.
\left(\Leftarrow\right): Suppose that y-x>0 where
x,y\in\mathbb{Q}, with \displaystyle x=\frac{a}{b} and
\displaystyle y=\frac{c}{d} for some a,b,c,d\in\mathbb{Z} and
b\neq 0 and d\neq 0. We have that
$$\begin{equation*} y-x = \left(cb-ad,bd\right) \end{equation*}$$
Moreover, y-x>0 implies that
$$\begin{equation*} \left(cb-ad\right)*1>0*\left(bd\right) \Rightarrow bc-ad>0 \end{equation*}$$
This is to say bc>ad, which by part 1 of proposition
71{reference-type="ref"
reference="prop:InequalityIntegerNumbers"} is the same as ad<bc which
is equivalent to saying that x<y.
As required. $\qed$ :::
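The equivalence in the lemma can be traced through the pair arithmetic on a small grid of examples. The Python sketch below is an illustration under a stated assumption: every pair is taken with a strictly positive second entry, so that the comparison a*d < b*c can be used directly. The helper names `less` and `sub` are hypothetical.

```python
from itertools import product


def less(x, y):
    """(a, b) < (c, d) tested as a*d < b*c, assuming b > 0 and d > 0."""
    (a, b), (c, d) = x, y
    return a * d < b * c


def sub(y, x):
    """Return y - x = (c*b - a*d, b*d) for y = (c, d) and x = (a, b)."""
    (c, d), (a, b) = y, x
    return (c * b - a * d, b * d)


zero = (0, 1)

# Sample pairs with positive second entry (an assumption of this sketch).
pairs = [(a, b) for a in range(-3, 4) for b in range(1, 4)]

# x < y should hold exactly when y - x > 0.
ok = all(less(x, y) == less(zero, sub(y, x)) for x, y in product(pairs, repeat=2))
print(ok)
```

With positive second entries both sides of the comparison reduce to the same integer inequality, which is exactly the computation carried out in the proof.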
::: {#cor:LargerOrEqualRatMinusSmallIsPositive .corollary} Corollary 5. Larger or equal rational minus a smaller is positive
Let x,y\in\mathbb{Q}. We have that x\leq y\iff y-x>0 or y=x.
Proof:
\left(\Rightarrow\right): Suppose x\leq y. If x<y then lemma
8{reference-type="ref"
reference="lem:LargerRatMinusSmallIsPositive"} applies. Otherwise
x=y.
\left(\Leftarrow\right): Suppose that one of y-x> 0 or y=x holds.
In the first case, y-x>0 implies x<y by lemma
8{reference-type="ref"
reference="lem:LargerRatMinusSmallIsPositive"} and clearly we will have
x\leq y. If x=y then we clearly also have x\leq y by definition.
$\qed$
:::
We can now extend the properties of inequalities to the rationals.
::: {#prop:InequalityRationalNumbers .proposition} Proposition 89. Properties of inequalities for the rationals
Let x,y,z,c\in\mathbb{Q}. We have the following properties for
inequalities
1. $x<y$ is the same as $y>x$
2. $x\leq y$ is the same as $y\geq x$
3. If $x<y$ and $y<z$ then $x<z$
4. If $x\leq y$ and $y<z$ then $x<z$
5. If $x<y$ and $y\leq z$ then $x<z$
6. If $x\leq y$ and $y\leq z$ then $x\leq z$
7. If $x>y$ and $y>z$ then $x>z$
8. If $x\geq y$ and $y>z$ then $x>z$
9. If $x>y$ and $y\geq z$ then $x>z$
10. If $x\geq y$ and $y\geq z$ then $x\geq z$
11. If $x<y$ then $x+z<y+z$
12. If $x\leq y$ then $x+z\leq y+z$
13. If $x>y$ then $x+z>y+z$
14. If $x\geq y$ then $x+z\geq y+z$
15. If $x<y$ and $z>0$ then $xz<yz$
16. If $x<y$ and $z<0$ then $xz>yz$
17. If $x\leq y$ and $z\geq 0$ then $xz\leq yz$
18. If $x\leq y$ and $z<0$ then $xz\geq yz$
19. If $x>y$ and $z>0$ then $xz>yz$
20. If $x>y$ and $z<0$ then $xz<yz$
21. If $x\geq y$ and $z\geq 0$ then $xz\geq yz$
22. If $x\geq y$ and $z<0$ then $xz\leq yz$
23. If $x<y$ and $z>0$ then $\displaystyle\frac{x}{z}<\frac{y}{z}$
24. If $x\leq y$ and $z>0$ then $\displaystyle\frac{x}{z}\leq\frac{y}{z}$
25. If $x>y$ and $z>0$ then $\displaystyle\frac{x}{z}>\frac{y}{z}$
26. If $x\geq y$ and $z>0$ then $\displaystyle\frac{x}{z}\geq\frac{y}{z}$
27. If $x<y$ and $z<0$ then $\displaystyle\frac{x}{z}>\frac{y}{z}$
28. If $x\leq y$ and $z<0$ then $\displaystyle\frac{x}{z}\geq\frac{y}{z}$
29. If $x>y$ and $z<0$ then $\displaystyle\frac{x}{z}<\frac{y}{z}$
30. If $x\geq y$ and $z<0$ then $\displaystyle\frac{x}{z}\leq\frac{y}{z}$
31. If $x<y$ and $x>0$ and $y>0$ then $\displaystyle \frac{1}{x}>\frac{1}{y}$
32. If $x<y$ and $x<0$ and $y<0$ then $\displaystyle \frac{1}{x}>\frac{1}{y}$
33. If $x\leq y$ and $x>0$ and $y>0$ then $\displaystyle \frac{1}{x}\geq \frac{1}{y}$
34. If $x\leq y$ and $x<0$ and $y<0$ then $\displaystyle \frac{1}{x}\geq \frac{1}{y}$
35. If $x>y$ and $x>0$ and $y>0$ then $\displaystyle \frac{1}{x}<\frac{1}{y}$
36. If $x>y$ and $x<0$ and $y<0$ then $\displaystyle \frac{1}{x}<\frac{1}{y}$
37. If $x\geq y$ and $x>0$ and $y>0$ then $\displaystyle \frac{1}{x}\leq \frac{1}{y}$
38. If $x\geq y$ and $x<0$ and $y<0$ then $\displaystyle \frac{1}{x}\leq \frac{1}{y}$
Proof:
Let x,y,z\in\mathbb{Q}. Let \displaystyle x=\frac{a}{b},
\displaystyle y=\frac{c}{d}, \displaystyle z=\frac{e}{f} for
a,b,c,d,e,f\in\mathbb{Z} and b\neq 0, d\neq 0, f\neq 0.
1. $x<y$ is the same as $y>x$: Suppose that $x<y$ then by definition we have $ad<bc$. Applying part 1. of proposition 71{reference-type="ref" reference="prop:InequalityIntegerNumbers"} gives $bc>ad$ and so $y>x$.
2. $x\leq y$ is the same as $y\geq x$: If $x<y$ then part 1. applies. Otherwise we have $x=y$ and so $y=x$ and clearly $y\geq x$.
3. If $x<y$ and $y<z$ then $x<z$: Suppose that $x<y$ and $y<z$ then $y-x>0$ and $z-y>0$ by lemma 8{reference-type="ref" reference="lem:LargerRatMinusSmallIsPositive"}. Now we have that
    $$\begin{equation*} \left(y-x\right)+\left(z-y\right)=z-x>0 \end{equation*}$$
    as $y-x$ and $z-y$ are both greater than 0. Hence as $z-x>0$ then $x<z$.
4. If $x\leq y$ and $y<z$ then $x<z$: Suppose that $x\leq y$ and $y<z$. If $x<y$ then the previous part applies, so suppose not. Then $x=y$ and so $x<z$.
5. If $x<y$ and $y\leq z$ then $x<z$: Suppose that $x<y$ and $y\leq z$. By lemma 8{reference-type="ref" reference="lem:LargerRatMinusSmallIsPositive"} we have that $y-x>0$, likewise by corollary 5{reference-type="ref" reference="cor:LargerOrEqualRatMinusSmallIsPositive"} we have that $y\leq z$ means either $z-y>0$ or $y=z$. If $z-y>0$ then the result is the same as part 3. So suppose $y=z$ then clearly $x<z$.
6. If $x\leq y$ and $y\leq z$ then $x\leq z$: If $x\leq y$ and $y\leq z$ then either $x<y$ and $y<z$, in which case part 3. applies, or $x<y$ and $y=z$ so part 5. applies, or $x=y$ and $y<z$ so part 4. applies. Finally, we have the case $x=y$ and $y=z$ so clearly $x=z$ so that $x\leq z$.
7. If $x>y$ and $y>z$ then $x>z$: By part 1. this is equivalent to: if $y<x$ and $z<y$ then $z<x$, so part 3. applies.
8. If $x\geq y$ and $y>z$ then $x>z$: Using parts 1. and 2. gives us the equivalent statement: if $y\leq x$ and $z<y$ then $z<x$, and so part 5. applies.
9. If $x>y$ and $y\geq z$ then $x>z$: As with the previous part, applying parts 1. and 2. gives the statement: if $y<x$ and $z\leq y$ then $z<x$, so part 4. applies.
10. If $x\geq y$ and $y\geq z$ then $x\geq z$: Using part 2. gives us: if $y\leq x$ and $z\leq y$ then $z\leq x$, so part 6. applies.
11. If $x<y$ then $x+z<y+z$: Suppose that $x<y$ then $y-x>0$ by lemma 8{reference-type="ref" reference="lem:LargerRatMinusSmallIsPositive"}. Observe that
    $$\begin{align*} y-x&=y+\left(z-z\right)-x\\ &=\left(y+z\right)-\left(z+x\right)\\ &=\left(y+z\right)-\left(x+z\right)>0 \end{align*}$$
    So $\left(y+z\right)-\left(x+z\right)>0$ and so by the same lemma we conclude that $x+z<y+z$.
12. If $x\leq y$ then $x+z\leq y+z$: If $x<y$ then the previous part applies. Otherwise $x=y$ and clearly $x+z=y+z$ and so $x+z\leq y+z$.
13. If $x>y$ then $x+z>y+z$: Applying part 1. and then part 11. gives the equivalent result: if $y<x$ then $y+z<x+z$.
14. If $x\geq y$ then $x+z\geq y+z$: Applying part 2. and then part 12. gives the equivalent result: if $y\leq x$ then $y+z\leq x+z$.
15. If $x<y$ and $z>0$ then $xz<yz$: Suppose $x<y$ then $y-x>0$ by lemma 8{reference-type="ref" reference="lem:LargerRatMinusSmallIsPositive"}. As $z>0$, the product of the two positive rationals $z$ and $y-x$ is positive, so $z\left(y-x\right)>0$. Hence, by distributivity,
    $$\begin{equation*} z\left(y-x\right)=zy-zx=yz-xz>0 \Rightarrow xz<yz \end{equation*}$$
16. If $x<y$ and $z<0$ then $xz>yz$: Suppose $x<y$. As $z<0 \Rightarrow -z>0$, applying part 15. with $-z$ gives $-xz<-yz$. Finally proposition 88{reference-type="ref" reference="prop:MultiplicationByNegativeOneFlipsInequalitySignRational"} part 1 yields $xz>yz$.
17. If $x\leq y$ and $z\geq 0$ then $xz\leq yz$: If $z=0$ then $xz=0=yz$ and the result is clear. So suppose $z>0$; there are two cases to consider. If $x<y$ then part 15. applies. Otherwise $x=y$ and clearly $xz=yz$ giving $xz\leq yz$.
18. If $x\leq y$ and $z<0$ then $xz\geq yz$: Likewise, if $x\leq y$ there are two cases. The case $x<y$ is covered by part 16. Otherwise $x=y$ gives $xz=yz$ and again we have $xz\geq yz$.
19. If $x>y$ and $z>0$ then $xz>yz$: We have $x>y$ is the same as $y<x$ and so $x-y>0$. As $z>0$, we have that $z\left(x-y\right)>0$. Therefore by distributivity we have $zx-zy=xz-yz>0$ and so $yz<xz$ which is the same as $xz>yz$ by part 1.
20. If $x>y$ and $z<0$ then $xz<yz$: We have $x>y$. Additionally, $z<0\Rightarrow -z>0$ so applying part 19. with $-z$ gives $-xz>-yz$ and so by part 3 of proposition 88{reference-type="ref" reference="prop:MultiplicationByNegativeOneFlipsInequalitySignRational"} we conclude $xz<yz$.
21. If $x\geq y$ and $z\geq 0$ then $xz\geq yz$: If $z=0$ then $xz=0=yz$. So suppose $z>0$; there are two cases to consider. If $x>y$ then we apply part 19. Otherwise $x=y$ and $xz=yz$ so that $xz\geq yz$.
22. If $x\geq y$ and $z<0$ then $xz\leq yz$: Again there are two cases to consider. If $x>y$ then the result holds by part 20. Otherwise $x=y$ and so $xz=yz$ to give the result $xz\leq yz$.
23. If $x<y$ and $z>0$ then $\displaystyle\frac{x}{z}<\frac{y}{z}$: As $\displaystyle\frac{x}{z}=xz^{-1}$ and $z^{-1}>0$, this follows by part 15.
24. If $x\leq y$ and $z>0$ then $\displaystyle\frac{x}{z}\leq\frac{y}{z}$: As $z^{-1}>0$, this follows by part 17.
25. If $x>y$ and $z>0$ then $\displaystyle\frac{x}{z}>\frac{y}{z}$: As $z^{-1}>0$, this follows by part 19.
26. If $x\geq y$ and $z>0$ then $\displaystyle\frac{x}{z}\geq\frac{y}{z}$: As $z^{-1}>0$, this follows by part 21.
27. If $x<y$ and $z<0$ then $\displaystyle\frac{x}{z}>\frac{y}{z}$: As $\displaystyle\frac{x}{z}=xz^{-1}$ and $z^{-1}<0$, this follows by part 16.
28. If $x\leq y$ and $z<0$ then $\displaystyle\frac{x}{z}\geq\frac{y}{z}$: As $z^{-1}<0$, this follows by part 18.
29. If $x>y$ and $z<0$ then $\displaystyle\frac{x}{z}<\frac{y}{z}$: As $z^{-1}<0$, this follows by part 20.
30. If $x\geq y$ and $z<0$ then $\displaystyle\frac{x}{z}\leq\frac{y}{z}$: As $z^{-1}<0$, this follows by part 22.
31. If $x<y$ and $x>0$ and $y>0$ then $\displaystyle \frac{1}{x}>\frac{1}{y}$: Suppose that $x<y$. As $x>0$, either $a>0$ and $b>0$ or $a<0$ and $b<0$. Likewise as $y>0$ then either $c>0$ and $d>0$ or $c<0$ and $d<0$. Hence there are four cases to consider.
    1. $a>0$ and $b>0$ and $c>0$ and $d>0$: By definition $x<y$ gives $ad<bc$. Observe that
        $$\begin{align*} ad&<bc\\ a^{-1}ad&<a^{-1}bc,\ \text{By part 15. as } a^{-1}>0\\ d&<a^{-1}bc,\ \text{As multiplication of an element by its inverse is } 1\\ dc^{-1}&<a^{-1}bcc^{-1},\ \text{By part 15. as } c^{-1}>0\\ dc^{-1}&<a^{-1}b,\ \text{As multiplication of an element by its inverse is } 1\\ \frac{d}{c}&<\frac{b}{a},\ \text{By the definition of an inverse element} \end{align*}$$
        Hence $\displaystyle \frac{d}{c}<\frac{b}{a}$ which is equivalent to $\displaystyle \frac{b}{a}>\frac{d}{c}$, which is to say $\displaystyle\frac{1}{x}>\frac{1}{y}$.
    2. $a>0$ and $b>0$ and $c<0$ and $d<0$: As $\left(c,d\right)\sim\left(-c,-d\right)$, the rational $y$ is also represented by the pair $\left(-c,-d\right)$, whose entries are both positive. Replacing $\left(c,d\right)$ by this representative reduces the situation to the first case.
    3. $a<0$ and $b<0$ and $c>0$ and $d>0$: The argument is similar to the previous one, this time replacing $\left(a,b\right)$ by the equivalent pair $\left(-a,-b\right)$.
    4. $a<0$ and $b<0$ and $c<0$ and $d<0$: This is similar to the first part. We give the full argument. As $a<0$, $b<0$, $c<0$ and $d<0$ then $ad>0$ and $bc>0$ and $ad<bc$. Hence we can see that
        $$\begin{align*} ad&<bc\\ a^{-1}ad&>a^{-1}bc,\ \text{By part 16. as } a^{-1}<0\\ d&>a^{-1}bc\\ dc^{-1}&<a^{-1}bcc^{-1},\ \text{By part 20. as } c^{-1}<0\\ dc^{-1}&<a^{-1}b\\ \frac{d}{c}&<\frac{b}{a} \end{align*}$$
        Giving the result.
32. If $x<y$ and $x<0$ and $y<0$ then $\displaystyle \frac{1}{x}>\frac{1}{y}$: This is similar to the previous part. Suppose that $x<y$. As $x<0$, either $a>0$ and $b<0$ or $a<0$ and $b>0$. Likewise as $y<0$ then either $c>0$ and $d<0$ or $c<0$ and $d>0$. Hence there are four cases to consider.
    1. $a>0$ and $b<0$ and $c>0$ and $d<0$: As $a>0$ and $b<0$ and $c>0$ and $d<0$ then we have that $ad<0$ and $bc<0$ and $ad<bc$. As $a^{-1}>0$ and $c^{-1}>0$, part 15. gives
        $$\begin{align*} ad&<bc\\ a^{-1}ad&<a^{-1}bc\\ d&<a^{-1}bc\\ dc^{-1}&<a^{-1}bcc^{-1}\\ dc^{-1}&<a^{-1}b\\ \frac{d}{c}&<\frac{b}{a} \end{align*}$$
        Giving $\displaystyle \frac{1}{x}>\frac{1}{y}$.
    2. $a>0$ and $b<0$ and $c<0$ and $d>0$: As $\left(c,d\right)\sim\left(-c,-d\right)$, the rational $y$ is also represented by the pair $\left(-c,-d\right)$ with $-c>0$ and $-d<0$. Replacing $\left(c,d\right)$ by this representative reduces the situation to the first case.
    3. $a<0$ and $b>0$ and $c>0$ and $d<0$: This time we replace $\left(a,b\right)$ by the equivalent pair $\left(-a,-b\right)$ with $-a>0$ and $-b<0$, which again reduces the situation to the first case.
    4. $a<0$ and $b>0$ and $c<0$ and $d>0$: Finally, $a<0$ and $b>0$ and $c<0$ and $d>0$ gives $ad<0$ and $bc<0$ with $ad<bc$. As $a^{-1}<0$ and $c^{-1}<0$, parts 16. and 20. give
        $$\begin{align*} ad&<bc\\ a^{-1}ad&>a^{-1}bc\\ d&>a^{-1}bc\\ dc^{-1}&<a^{-1}bcc^{-1}\\ dc^{-1}&<a^{-1}b\\ \frac{d}{c}&<\frac{b}{a} \end{align*}$$
        Which concludes this part of the proposition.
If
x\leq yandx>0andy>0then\displaystyle \frac{1}{x}\geq \frac{1}{y}:If
x<ythen we apply part 31. Otherwisex=yand so\displaystyle \frac{1}{x}= \frac{1}{y}hence the result. -
If
x\leq yandx<0andy<0then $\displaystyle \frac{1}{x}\geq \frac{1}{y}$Likewise if
x<ywe apply part 32. Otherwisex=yand\displaystyle \frac{1}{x}= \frac{1}{y}so the result is clear. -
If
x>y and x>0 and y>0 then \displaystyle \frac{1}{x}<\frac{1}{y}: Applying part 1. the equivalent statement is
y<x and x>0 and y>0 then \displaystyle \frac{1}{y}>\frac{1}{x} so part 31. applies. -
If
x>y and x<0 and y<0 then \displaystyle \frac{1}{x}<\frac{1}{y}: Likewise by part 1. this is the same as
y<x and x<0 and y<0 then \displaystyle \frac{1}{y}>\frac{1}{x} so part 32. applies. -
If
x\geq yandx>0andy>0then\displaystyle \frac{1}{x}\leq \frac{1}{y}:If
x>ythen part 35 applies. Otherwise,x=yand the result is clear. -
If
x\geq yandx<0andy<0then\displaystyle \frac{1}{x}\leq \frac{1}{y}:Finally, if
x>ythen we apply part 36. Otherwisex=yand we get the result.
The result has been shown.10 $\qed$ :::
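The reciprocal parts of the proposition (parts 31 and 32) can be spot-checked numerically. The following is a minimal sketch, assuming Python's standard `fractions.Fraction` as a stand-in for \mathbb{Q}; the helper name `reciprocal_reverses_order` and the sample range are illustrative choices, and a finite check of this kind is evidence rather than a proof.

```python
from fractions import Fraction
from itertools import product


def reciprocal_reverses_order(x: Fraction, y: Fraction) -> bool:
    """Check that x < y implies 1/x > 1/y for non-zero x, y of the same sign."""
    assert x != 0 and y != 0 and (x > 0) == (y > 0)
    return (not x < y) or (1 / x > 1 / y)


# Illustrative sample: fractions p/q with small numerators and denominators.
samples = [Fraction(p, q) for p in range(-6, 7) for q in range(1, 7) if p != 0]

# Only pairs with the same sign are covered by parts 31 and 32.
all_ok = all(
    reciprocal_reverses_order(x, y)
    for x, y in product(samples, repeat=2)
    if (x > 0) == (y > 0)
)
print(all_ok)  # True
```

The same-sign restriction matters: for x=-\frac{1}{2} and y=\frac{1}{2} we have x<y but \frac{1}{x}=-2<2=\frac{1}{y}.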
Extending exponentiation to the rational numbers
Recall the definition of exponentiation from the integers.
$$\begin{align*} \wedge:\mathbb{Z}\times\mathbb{Z}^+&\rightarrow\mathbb{Z}\\ \left(x,n\right)&\mapsto \wedge\left(x,n\right)=\begin{cases} 1,\ \text{If } x=0\text{ and } n=0\\ 1,\ \text{If } n=0\\ \displaystyle \prod_{i=1}^n x ,\ \text{If }x\neq 0\text{ and } n \geq 0\\ \end{cases} \end{align*}$$
where \mathbb{Z}^+=\left\{x\in\mathbb{Z}:x\geq 0\right\}. We noted in
the section on extending exponentiation to the integers that we were
unable to consider the case of negative exponents. By assuming that
negative exponents did make sense, we deduced that a new type of object exists
that undoes integer multiplication. As we have seen in this section, that
object is a rational number. Indeed, we showed in proposition
82{reference-type="ref"
reference="prop:MultiplicativeInverseOfIntegerTimesInverseIsOriginalNumber"}
that if x\in\mathbb{Z} with x\neq 0 then there is some x^{-1}\in\mathbb{Q} so
that x*x^{-1}=1=x^0. This lets us generalise proposition
77{reference-type="ref"
reference="prop:IntegerExponentiationOfSameBaseAddsPowers"} to all
integer exponents rather than only non-negative ones. We hence generalise the
definition of exponentiation, and re-prove the earlier results, for all integer
exponents.
::: definition Definition 136. Exponentiation of integer numbers
Let \left(x,y\right)\in\mathbb{Z}\times\mathbb{Z} and let
\wedge:\mathbb{Z}\times\mathbb{Z}\rightarrow\mathbb{Q}. We define the
exponentiation of x by y by $$\begin{align*}
\wedge:\mathbb{Z}\times\mathbb{Z}&\rightarrow\mathbb{Q}\\
\left(x,y\right)&\mapsto \wedge\left(x,y\right)=\begin{cases}
1,\ \text{If } x=0\text{ and } y=0\\
1,\ \text{If } y=0\\
\displaystyle \prod_{i=1}^y x ,\ \text{If }x\neq 0\text{ and } y \geq 0\\
\displaystyle \prod_{i=1}^{\left|y\right|} \frac{1}{x} ,\ \text{If }x\neq 0\text{ and } y < 0\\
\end{cases}
\end{align*}$$
:::
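The case analysis in this definition translates directly into a short program. The sketch below is illustrative only: the function name `int_pow` is hypothetical, `fractions.Fraction` stands in for \mathbb{Q}, and two conventions are assumptions rather than part of the definition, namely that every base gives 1 for exponent 0 and that 0^y=0 for y>0 (a case the definition does not list).

```python
from fractions import Fraction


def int_pow(x: int, y: int) -> Fraction:
    """Raise the integer x to the integer power y, returning a rational."""
    if y == 0:
        # Empty product; this includes the convention 0^0 = 1.
        return Fraction(1)
    if x == 0:
        if y < 0:
            # 0 has no multiplicative inverse, so this case is undefined.
            raise ZeroDivisionError("0 cannot be raised to a negative power")
        # Assumption: 0^y = 0 for y > 0 (not listed in the definition).
        return Fraction(0)
    # Multiply |y| copies of x (y > 0) or of 1/x (y < 0).
    factor = Fraction(x) if y > 0 else Fraction(1, x)
    result = Fraction(1)
    for _ in range(abs(y)):
        result *= factor
    return result


print(int_pow(2, 3))    # 8
print(int_pow(2, -3))   # 1/8
print(int_pow(-2, -3))  # -1/8
```

Returning a `Fraction` in every case mirrors the codomain \mathbb{Q} of the extended map, even when the result happens to be an integer.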
We can now extend the results shown in the section on integer exponentiation extension.
::: {#prop:IntegerExtensionExponentiationPowerLaw .proposition} Proposition 90. Power law of exponentiation for integer exponents
Let x\in\mathbb{Z} and let n,m\in\mathbb{Z}. We have that
$$\begin{equation*} \left(x^n\right)^m = x^{nm} \end{equation*}$$
Proof:
If n,m\geq 0 the result is the same as proposition
76{reference-type="ref"
reference="prop:IntegerExponentiationPowerLaw"}. So we must consider the
following cases
-
n\geq 0and $m<0$ -
n< 0and $m\geq 0$ -
n< 0and $m<0$
-
n\geq 0 and m<0: By definition of integer exponentiation, we have that
\displaystyle x^n=\prod_{i=1}^n x. Now applying the general definition of integer exponentiation we see that $$\begin{align*} \left(x^n\right)^m&=\prod_{i=1}^{\left|m\right|} \frac{1}{x^n}\\ &=\underbrace{\left(\frac{1}{x^n}\right)\left(\frac{1}{x^n}\right)\left(\frac{1}{x^n}\right)\dots\left(\frac{1}{x^n}\right)}_{\left|m\right|\text{ times}} \end{align*}$$
Now, we know by definition of multiplication for rationals that
\displaystyle\frac{1}{a}*\frac{1}{b}=\frac{1}{ab} and so $$\begin{align*} \left(x^n\right)^m&=\underbrace{\left(\frac{1}{x^n}\right)\left(\frac{1}{x^n}\right)\left(\frac{1}{x^n}\right)\dots\left(\frac{1}{x^n}\right)}_{\left|m\right|\text{ times}}\\ &=\frac{1}{x^{n\left|m\right|}}\\ &=\prod_{i=1}^{n\left|m\right|} \frac{1}{x} =x^{nm} \end{align*}$$
by definition, as nm=-n\left|m\right|\leq 0.
-
n< 0 and m\geq 0: As
n<0 then we have that $$\begin{equation*} x^n=\prod_{i=1}^{\left|n\right|}\frac{1}{x}=\frac{1}{x^{\left|n\right|}} \end{equation*}$$
We can now apply similar logic to the first part to conclude the result.
-
n< 0andm<0:Using similar logic to the two previous parts deduces the result.
As promised. $\qed$ :::
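As a concrete instance of the case n\geq 0 and m<0, take the illustrative values x=2, n=3 and m=-2, chosen here only as an example. Evaluating each side separately from the definition gives

$$\begin{align*} \left(2^3\right)^{-2}&=\prod_{i=1}^{2}\frac{1}{2^3}=\frac{1}{8}*\frac{1}{8}=\frac{1}{64}\\ 2^{3*\left(-2\right)}&=2^{-6}=\prod_{i=1}^{6}\frac{1}{2}=\frac{1}{64} \end{align*}$$

so both sides agree, as the proposition predicts.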
::: {#prop:IntegerExtensionExponentiationOfSameBaseAddsPowers .proposition} Proposition 91. Multiplying exponents of the same base adds the powers
Let x\in\mathbb{Z} be a fixed integer and let n,m\in\mathbb{Z}. We
have that
$$\begin{equation*} x^n x^m = x^{n+m} \end{equation*}$$
Proof:
If n,m\geq 0 the result is the same as proposition
77{reference-type="ref"
reference="prop:IntegerExponentiationOfSameBaseAddsPowers"}, so we have
to consider the following three cases
-
n\geq 0and $m<0$ -
n< 0and $m\geq 0$ -
n< 0and $m<0$
-
n\geq 0 and m<0: Let
m=-k for some k\in\mathbb{Z} with k>0. We know that \displaystyle x^m=x^{-k}=\prod_{i=1}^{k} \frac{1}{x} = \frac{1}{x^k}. Now we have $$\begin{equation*} x^nx^m=x^n x^{-k}=\frac{x^n}{x^k}=x^{n-k} \end{equation*}$$
after cancelling the factors of x common to the numerator and the denominator. This is equivalent to
x^{n+m}. -
n< 0andm\geq 0:Like the previous part let
n=-kfor somek\in\mathbb{Z}withk>0then we get$$\begin{equation} x^nx^m=x^{-k}x^{m}=x^{-k+m}=x^{n+m} \end{equation}$$
-
n< 0andm<0:Let
n=-kandm=-jfork,j\in\mathbb{Z}withk>0andj>0. Then$$\begin{equation} x^nx^m=x^{-k}x^{-j}=x^{-k+-j}=x^{n+m} \end{equation}$$
As required. $\qed$ :::
::: {#prop:IntegerExtensionExponentiationPowerOfProductIsProductOfPowers .proposition} Proposition 92. Power of product is product of powers
Let x,y\in\mathbb{Z} and n\in\mathbb{Z}. Then
$$\begin{equation} \left(xy\right)^n=x^ny^n \end{equation*}$$*
Proof:
If n=0 then \left(x*y\right)^n=1 and clearly x^0*y^0=1. So
suppose n>0 then we have
$$\begin{align*} \left(xy\right)^n=\prod_{i=1}^n xy &=\underbrace{\left(xy\right)\left(xy\right)\dots \left(xy\right)}_{n\text{ times}}\\ &= \left(\underbrace{x x\dots x}_{n\text{ times}}\right)\left(\underbrace{y y\dots y}_{n\text{ times}}\right),\ \text{ By commutativity of multiplication}\\ &=x^ny^n \end{align*}$$
Finally, let n<0 then a similar argument shows that
$$\begin{equation*} \left(xy\right)^n=\frac{1}{x^{\left|n\right|}y^{\left|n\right|}}=x^ny^n \end{equation*}$$ Showing the proposition. $\qed$ :::
We have extended integer exponentiation. What can we say about rational
exponentiation? We can clearly extend the base of exponentiation to an
arbitrary rational number. We have already used special cases of this
when we considered denominators and numerators separately in the
previous proofs. We formalise this to a fully general rational number.
Firstly, we know that if n<0 then \displaystyle x^n=\frac{1}{x^{\left|n\right|}}.
Additionally if x\in\mathbb{Z} then a multiplicative inverse of x in
the rationals is given by x^{-1}=\frac{1}{x}. We combine the two into
a general definition.
::: definition Definition 137. Exponentiation for negative indices
Let x\in\mathbb{Z} with x\neq 0. We extend exponentiation to
negative indices by setting, for n\in\mathbb{Z} with n>0,
$$\begin{equation*} x^{-n} = \left(x^{-1}\right)^n \end{equation*}$$
Clearly we have in general that $x^{-n}\in\mathbb{Q}$ :::
Now we can consider the more general case of
\displaystyle\left(\frac{a}{b}\right)^n for a,b,n\in\mathbb{Z} and
b\neq 0. We have the following proposition
::: proposition Proposition 93. Rational number raised to an integer exponent
Let x\in\mathbb{Q} with \displaystyle x=\frac{a}{b} and b\neq 0.
Let n\in\mathbb{Z}. We have that
$$\begin{equation} \left(\frac{a}{b}\right)^n=\frac{a^n}{b^n} \end{equation*}$$*
Proof:
We have that
$$\begin{align*} \left(\frac{a}{b}\right)^n&=\left(ab^{-1}\right)^n\\ &= \underbrace{\left(a b^{-1}\right)\left(a b^{-1}\right)\dots \left(a b^{-1}\right)}_{n \text{ times}}\\ &=\underbrace{a a a\dots a}_{n \text{ times}}\underbrace{b^{-1}b^{-1}b^{-1}\dots b^{-1}}_{n \text{ times}}\\ &= a^n \left(b^{-1}\right)^n\\ &=a^nb^{-n}\\ &=\frac{a^n}{b^n} \end{align*}$$
As required. $\qed$ :::
The rules of integer exponentiation extend when the base is rational.
::: {#propRationalExponentiationPowerLaw .proposition} Proposition 94. Power law of exponentiation for a rational base
Let x\in\mathbb{Q} and let n,m\in \mathbb{Z}. We have that
$$\begin{equation} \left(x^n\right)^m = x^{nm} \end{equation*}$$*
Proof:
Let \displaystyle x=\frac{a}{b} with a,b\in\mathbb{Z} and
b\neq 0. We have that
$$\begin{align*} \left(x^n\right)^m&=\left(\left(\frac{a}{b}\right)^n\right)^m\\ &=\left(\frac{a^n}{b^n}\right)^m\\ &=\left(a^nb^{-n}\right)^m\\ &=a^{nm}b^{-nm}\\ &=\frac{a^{nm}}{b^{nm}}\\ &=x^{nm} \end{align*}$$ $\qed$ :::
::: {#prop:RationalExponentiationOfSameBaseAddsPowers .proposition} Proposition 95. Multiplying exponents of the same base adds the powers
Let x\in\mathbb{Q} be a fixed rational number and let n,m\in\mathbb{Z}. We
have that
$$\begin{equation} x^n x^m = x^{n+m} \end{equation}$$*
Proof:
Let \displaystyle x=\frac{a}{b} with a,b\in\mathbb{Z} and
b\neq 0. Observe that
$$\begin{align} x^nx^m&=\left(\frac{a}{b}\right)^n\left(\frac{a}{b}\right)^m\ &=\frac{a^n}{b^n}\frac{a^m}{b^m}\ &=\frac{a^na^m}{b^nb^m}\ &=\frac{a^{n+m}}{b^{n+m}}\ &=\left(\frac{a}{b}\right)^{n+m}\ &=x^{n+m} \end{align}$$*
As required. $\qed$ :::
::: {#prop:RationalExponentiationPowerOfProductIsProductOfPowers .proposition} Proposition 96. Power of product is product of powers
Let x,y\in\mathbb{Q} and n\in\mathbb{Z}. Then
$$\begin{equation} \left(xy\right)^n=x^ny^n \end{equation*}$$*
Proof:
Let \displaystyle x=\frac{a}{b} with a,b\in\mathbb{Z} and b\neq 0
and let \displaystyle y=\frac{c}{d} with c,d\in\mathbb{Z} and
d\neq 0. We have
$$\begin{align*} \left(xy\right)^n&=\left(\frac{a}{b}\frac{c}{d}\right)^n\\ &=\left(\frac{ac}{bd}\right)^n\\ &=\frac{\left(ac\right)^n}{\left(bd\right)^n}\\ &=\frac{a^n c^n}{b^n d^n}\\ &=\frac{a^n}{b^n}\frac{c^n}{d^n}\\ &=x^ny^n \end{align*}$$ :::
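The four propositions for a rational base can be checked on a finite sample. This is a minimal sketch, assuming `fractions.Fraction` and its built-in `**` operator as a model of rational exponentiation with integer exponents; the function name `check_exponent_laws` and the sample ranges are hypothetical, and zero bases are excluded so that negative exponents are always defined.

```python
from fractions import Fraction
from itertools import product

# Illustrative non-zero rational bases a/b and integer exponents.
bases = [Fraction(a, b) for a in range(-3, 4) for b in range(1, 4) if a != 0]
exponents = range(-3, 4)


def check_exponent_laws() -> bool:
    """Return True if the sampled bases and exponents satisfy all four laws."""
    for x in bases:
        a, b = x.numerator, x.denominator
        for n in exponents:
            # (a/b)^n = a^n / b^n
            if x**n != Fraction(a)**n / Fraction(b)**n:
                return False
            for m in exponents:
                # (x^n)^m = x^(nm) and x^n x^m = x^(n+m)
                if (x**n)**m != x**(n * m):
                    return False
                if x**n * x**m != x**(n + m):
                    return False
    for x, y in product(bases, repeat=2):
        for n in exponents:
            # (xy)^n = x^n y^n
            if (x * y)**n != x**n * y**n:
                return False
    return True


print(check_exponent_laws())  # True
```

Exact rational arithmetic is used deliberately: floating point numbers would introduce rounding and could make equal quantities compare as different.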
What about rational exponents? Can we assign meaning to expressions of
the form \displaystyle \wedge\left(\frac{a}{b},\frac{c}{d}\right)?
We use an argument similar to the one used when extending integer
exponentiation. Suppose that proposition
95{reference-type="ref"
reference="prop:RationalExponentiationOfSameBaseAddsPowers"} holds for
rational exponents. In particular we have for some x\in\mathbb{Q} that
$$\begin{equation*} x^{\frac{1}{2}}x^{\frac{1}{2}}=x^1 \end{equation*}$$
Now, suppose that x=2. We are hence saying that
$$\begin{equation*} 2^{\frac{1}{2}}2^{\frac{1}{2}}=2 \end{equation*}$$
If we suppose that \displaystyle 2^{\frac{1}{2}}\in\mathbb{Q} with say
\displaystyle y=2^{\frac{1}{2}} we are saying that y^2=2.
Unfortunately, there is no such rational y that satisfies this.
Moreover, we lack the theory required to prove this at this time. This
will be corrected in part {reference-type="ref"
reference="part2"}.
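Although the proof is deferred, a finite search gives some evidence for the claim. The sketch below looks for fractions \frac{p}{q} with a bounded denominator whose square equals a given positive integer; the function name `rational_square_roots` and the bounds are hypothetical choices, and an empty result for a finite bound is not a proof that no rational solution exists.

```python
from fractions import Fraction


def rational_square_roots(target: int, max_denominator: int) -> list:
    """List every p/q >= 0 with 1 <= q <= max_denominator and (p/q)^2 == target."""
    hits = set()
    for q in range(1, max_denominator + 1):
        # For target >= 1 any square root is at most target, so p <= target * q.
        for p in range(0, target * q + 1):
            candidate = Fraction(p, q)
            if candidate**2 == target:
                hits.add(candidate)
    return sorted(hits)


print(rational_square_roots(4, 10))  # [Fraction(2, 1)]
print(rational_square_roots(2, 50))  # []
```

The search finds y=2 for y^2=4 but nothing for y^2=2, which is consistent with the statement that no rational y satisfies y^2=2.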
Extending the absolute value function
When we constructed the integers we recast the notion of size into that of distance. This was achieved using the so-called absolute value function given by
$$\begin{equation*} \left|x\right|=d\left(x,0\right)=\begin{cases} x,\ \text{If } x\geq 0\\ -x,\ \text{If } x< 0 \end{cases} \end{equation*}$$
where
$$\begin{align*} d:\mathbb{Z}^2&\rightarrow\mathbb{N}\\ \left(x,y\right)&\mapsto d\left(x,y\right)=\begin{cases} x-y,\ \text{If } x\geq y\\ -\left(x-y\right),\ \text{If } x< y \end{cases} \end{align*}$$
Now that we have constructed the rational numbers we can consider how
this idea extends. One thing that is clear from the definition of d
for integers is that the smallest possible non-zero distance that can be
achieved is 1, for example, d\left(2,1\right)=1. However, consider
$$\begin{equation*} 1-\frac{1}{2}=\frac{1}{2} \end{equation*}$$
If this idea of distance is to extend to the rationals we will clearly
have that distances smaller than 1 are now possible. In other words,
the mapping for d when used with rational numbers can no longer map
into \mathbb{N}. This is easily remedied by defining the following
set.
::: definition Definition 138. Positive rationals
We define the set of positive rationals, which by the same convention as \mathbb{Z}^+ includes zero, by
$$\begin{equation*} \mathbb{Q}^+=\left\{x\in\mathbb{Q}: x\geq 0\right\} \end{equation*}$$ :::
It is clear from the definitions for the integers how to extend the distance function and the absolute value function to the rationals.
::: definition Definition 139. Distance function for the rationals
Let x,y\in\mathbb{Q}. Define the function
d:\mathbb{Q}^2\rightarrow\mathbb{Q}^+ by
$$\begin{align*} d:\mathbb{Q}^2&\rightarrow\mathbb{Q}^+\\ \left(x,y\right)&\mapsto d\left(x,y\right)=\begin{cases} x-y,\ \text{If } x\geq y\\ -\left(x-y\right),\ \text{If } x< y \end{cases} \end{align*}$$ :::
As before we prove that this distance function is well-defined.
::: {#prop:RationalDistanceFuncWellDefined .proposition} Proposition 97. The distance function for the rationals is well-defined
Let x,y\in\mathbb{Q}. We have that
$$\begin{equation*} d\left(x,y\right)=\begin{cases} x-y,\ \text{If } x\geq y\\ -\left(x-y\right),\ \text{If } x< y \end{cases} \end{equation*}$$
is well-defined.
Proof:
Let x,y\in\mathbb{Q}. There are two cases to consider x\geq y and
x<y.
-
x\geq y:Suppose that
x\geq y, then by proposition 89{reference-type="ref" reference="prop:InequalityRationalNumbers"} part 14. we have$$\begin{equation} x\geq y \Rightarrow \left(x+\left(-y\right)\right) \geq \left(y+\left(-y\right)\right) \Rightarrow x-y \geq 0 \end{equation*}$$*
Hence
x-y\in\mathbb{Q}^+. -
x<y: As
x<y we have by definition of d that d\left(x,y\right)=-\left(x-y\right) where we have that x-y<0. However, we have that -\left(x-y\right)=-1 * \left(x-y\right) and so by part 16 of proposition 89{reference-type="ref" reference="prop:InequalityRationalNumbers"} we have that -1*\left(x-y\right)>0 which is to say $-\left(x-y\right)\in\mathbb{Q}^+$
The result has been shown. $\qed$ :::
We can now generalise the absolute value function.
::: definition Definition 140. Absolute value function
Let x\in\mathbb{Q} we define the absolute value function, denoted by
\left|x\right| by the function
$$\begin{equation} \left|x\right|=d\left(x,0\right)=\begin{cases} x,\ \text{If } x\geq 0\ -x,\ \text{If } x< 0 \end{cases} \end{equation*}$$* :::
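The two piecewise definitions above can be written out directly. This is a minimal sketch, assuming `fractions.Fraction` as a model of \mathbb{Q}; the name `d` follows the text, while `absolute_value` is a hypothetical name for \left|\cdot\right|.

```python
from fractions import Fraction


def d(x: Fraction, y: Fraction) -> Fraction:
    """Distance between two rationals, following the piecewise definition."""
    return x - y if x >= y else -(x - y)


def absolute_value(x: Fraction) -> Fraction:
    """Absolute value defined as the distance from x to 0."""
    return d(x, Fraction(0))


print(d(Fraction(1), Fraction(1, 2)))   # 1/2
print(d(Fraction(1, 2), Fraction(1)))   # 1/2
print(absolute_value(Fraction(-3, 4)))  # 3/4
```

The first two calls show a distance smaller than 1, which is exactly why the codomain had to change from \mathbb{N} to a set of rationals.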
We have generalised the idea of "size" to the rationals. We can now also generalise the properties of the absolute value function explored in the construction of the integers.
::: proposition Proposition 98. Properties of the absolute value
Let x,y,z\in\mathbb{Q}. We have that the absolute value function has
the following properties
-
\left|x\right|\geq 0for all $x\in\mathbb{Q}$ -
$\left|x\right|=0\iff x=0$
-
$\left|x-y\right|=0\iff x=y$
-
$\left|xy\right|=\left|x\right|\left|y\right|$
-
\displaystyle \left|\frac{x}{y}\right|=\frac{\left|x\right|}{\left|y\right|}with $y\neq 0$ -
$\left|\left|x\right|\right|=\left|x\right|$
-
$\left|-x\right|=\left|x\right|$
-
$\left|x\right|\leq y \iff -y\leq x\leq y$
-
\left|x\right|\geq y\iff x\leq -yor $x\geq y$ -
$\left|x+y\right|\leq \left|x\right|+\left|y\right|$
-
$\left|x-y\right|\leq\left|x-z\right|+\left|z-y\right|$
-
$\left|x-y\right|\geq \left|\left|x\right|-\left|y\right|\right|$
-
\left|\cdot\right|is not injective -
\left|\cdot\right|is not surjective
Proof:
-
\left|x\right|\geq 0for allx\in\mathbb{Q}:This follows by proposition 97{reference-type="ref" reference="prop:RationalDistanceFuncWellDefined"}.
-
\left|x\right|=0\iff x=0:We have by definition that
\left|x\right|=0, if and only ifx=0. -
\left|x-y\right|=0\iff x=y:\left(\Rightarrow\right): Suppose that\left|x-y\right|=0. There are two cases to consider.Firstly if
x\geq y, then by definition we have that\left|x-y\right|=x-y=0from which we clearly havex=y. The other case isx<yfrom which we get\left|x-y\right|=-\left(x-y\right)=0. In other words, we have-1*\left(x-y\right)=0. Now by proposition 87{reference-type="ref" reference="prop:RationalsHaveNoZeroDivisors"} we know that for rationalsa,bthat ifab=0, at least one ofaorbis zero. As-1\neq 0we conclude thatx-y=0from which we getx=y.\left(\Leftarrow\right): Suppose thatx=ythenx-y=0and so\left|x-y\right|=0. -
\left|xy\right|=\left|x\right|\left|y\right|:Let
x,y\in\mathbb{Q}. There are four cases to consider.-
x\geq 0and $y\geq 0$ -
x\geq 0and $y<0$ -
x<0and $y\geq 0$ -
x<0and $y<0$
-
x\geq 0andy\geq 0:If
x\geq 0andy\geq 0thenxy\geq 0and so\left|xy\right|=xy. Likewise\left|x\right|=xand\left|y\right|=y. Hence\left|xy\right|=\left|x\right|\left|y\right|. -
x\geq 0andy<0:If
x\geq 0 then \left|x\right|=x by definition, and if y<0 then \left|y\right|=-y. Now \left|xy\right|=-xy as xy\leq 0. Moreover, we have that $$\begin{equation*} -xy=\left(-1\right)\left(x\right)\left(y\right)=\left(x\right)\left(-1\right)\left(y\right)=\left(x\right)\left(-y\right)=\left|x\right|\left|y\right| \end{equation*}$$
Hence we get $\left|xy\right|=\left|x\right|\left|y\right|$
-
x<0andy\geq 0:This is similar to the above but swapping the roles of
xandy. -
x<0andy<0:Suppose that
x<0andy<0, then we have that\left|x\right|=-xand\left|y\right|=-yby definition. Moreover, we have that-x*-y = xy. Hence $\left|xy\right|=xy=\left(-x\right)\left(-y\right)=\left|x\right|\left|y\right|$
-
-
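\displaystyle\left|\frac{x}{y}\right|=\frac{\left|x\right|}{\left|y\right|} with y\neq 0: This follows by part 4. Indeed, as \displaystyle x=\frac{x}{y}y we have \displaystyle\left|x\right|=\left|\frac{x}{y}\right|\left|y\right|, and dividing by \left|y\right|\neq 0 gives the result.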
-
\left|\left|x\right|\right|=\left|x\right|:We have that
\left|x\right|=xifx\geq 0and-xifx<0.So if
x\geq 0, we have$$\begin{equation} \left|\left|x\right|\right|=\left|x\right|=x=\left|x\right| \end{equation*}$$*
Now if
x<0then$$\begin{equation} \left|\left|x\right|\right|=\left|-x\right|=\underbrace{-x}_{\text{As }-x>0}=\left|x\right| \end{equation*}$$*
-
\left|-x\right|=\left|x\right|:As
-x=-1 *xwe have by part 4 that$$\begin{equation} \left|-x\right|=\left|-1x\right|=\left|-1\right|\left|x\right|=1\left|x\right|=\left|x\right| \end{equation*}$$*
-
\left|x\right|\leq y \iff -y\leq x\leq y:\left(\Rightarrow\right): Suppose that\left|x\right|\leq y. Ifx\geq 0then we get that\left|x\right|=x\leq y. From this, it is clear that-y\leq x\leq yasx\geq 0andx\leq y \Rightarrow y \geq 0.Now if
x<0, then \left|x\right|=-x\leq y. Clearly x\leq -x as x<0 hence we conclude that x\leq -x\leq y. Now by part 18 of proposition 89{reference-type="ref" reference="prop:InequalityRationalNumbers"} we have $$\begin{equation*} \left(-1\right)\left(-x\right)\geq \left(-1\right)\left(y\right) \iff x\geq -y \end{equation*}$$
Now
x\geq -yis the same as-y\leq xand so we have-y\leq x\leq -x \leq y.Hence
-y\leq x\leq y.\left(\Leftarrow\right): Suppose that-y\leq x\leq y. There are two cases to consider.-
$x\geq 0$
-
$x<0$
-
x\geq 0:Suppose
x\geq 0, then clearly as x\leq y then \left|x\right|\leq \left|y\right|=y. Moreover, we have that -y\leq x is the same as x\geq -y and part 22. of proposition 89{reference-type="ref" reference="prop:InequalityRationalNumbers"} when applied to x\geq -y gives $$\begin{equation*} \left(-1\right)\left(x\right)\leq \left(-1\right)\left(-y\right) \iff -x\leq y \end{equation*}$$
We have that
\left|-x\right|=\left|x\right| by part 7. Hence \left|-x\right|=\left|x\right|\leq \left|y\right|=y. -
x<0:Suppose
x<0. By assumptionx\leq yso eithery\geq 0ory< 0. We can't havey<0as for example takex=-4andy=-2then we would have2\leq -4\leq -2a contradiction.So suppose that
y\geq 0then asx\leq ywe have\left|x\right|\leq\left|y\right|=y. Now as-y\leq xby assumption we have thatx\geq -yand so part 22. of proposition 89{reference-type="ref" reference="prop:InequalityRationalNumbers"} gives$$\begin{equation} \left(-1\right)\left(x\right)\leq \left(-1\right)\left(-y\right) \iff -x\leq y \end{equation}$$*
Hence part 7. applies and we get that $\left|x\right|\leq y$
-
-
\left|x\right|\geq y\iff x\leq -yorx\geq y:\left(\Rightarrow\right): Suppose that\left|x\right|\geq y. Ifx\geq 0then\left|x\right|=x\geq y. So suppose thatx<0then by definition we have that\left|x\right|=-xand so-x\geq yand the result follows when applying part 22. of proposition 89{reference-type="ref" reference="prop:InequalityRationalNumbers"}.\left(\Leftarrow\right): Suppose that eitherx\leq -yorx\geq y. We have three cases to consider.-
$x\leq -y$
-
$x\geq y$
-
x\leq -yand $x\geq y$
-
x\leq -y:Suppose that
x\leq -y holds. If x\geq 0 then we have that -y\geq 0, hence y\leq 0. Moreover, we have by part 18. of proposition 89{reference-type="ref" reference="prop:InequalityRationalNumbers"} that $$\begin{equation*} \left(-1\right)\left(x\right)\geq \left(-1\right)\left(-y\right) \iff -x\geq y \end{equation*}$$
Now part 7. applies and we see that
\left|x\right|=\left|-x\right|\geq -x\geq y. This is to say \left|x\right|\geq y. Now suppose that
x<0. Then as x\leq -y we have that either -y\geq 0 or -y<0. In the former case -y\geq 0 gives y\leq 0. Hence by part 18. of proposition 89{reference-type="ref" reference="prop:InequalityRationalNumbers"} we conclude that $$\begin{equation*} \left(-1\right)\left(x\right)\geq \left(-1\right)\left(-y\right) \iff -x\geq y \end{equation*}$$
As
x<0 then -x\geq 0. The result follows when taking the absolute value. Now suppose that
-y<0 then y>0. Following similar logic to the previous case, we see that $$\begin{equation*} \left(-1\right)\left(x\right)\geq \left(-1\right)\left(-y\right) \iff -x\geq y \end{equation*}$$
The result again follows after taking the absolute value.
-
x\geq y:This case is trivial.
-
x\leq -yandx\geq y:Suppose that
x\leq -y and x\geq y are both true. We know by the first case that x\leq -y gives \left|x\right|\geq y and x\geq y also implies \left|x\right|\geq y by the second case. Hence both inequalities being true at the same time implies the result \left|x\right|\geq y.
-
-
\left|x+y\right|\leq \left|x\right|+\left|y\right|:Let
x,y\in\mathbb{Q}. There are four cases to consider.-
x\geq 0and $y\geq 0$ -
x\geq 0and $y\leq 0$ -
x\leq 0and $y\geq 0$ -
x\leq 0and $y\leq 0$
-
x\geq 0andy\geq 0:Suppose
x\geq 0andy\geq 0, then we have that$$\begin{equation} \left|x+y\right|=x+y=\left|x\right|+\left|y\right|\Rightarrow \left|x+y\right|\leq\left|x\right|+\left|y\right| \end{equation*}$$*
-
x\geq 0and $y\leq 0$By assumption we have that
\left|x\right|=xand\left|y\right|=-y. We have two cases based on the absolute value,\left|x\right|\leq\left|y\right|and\left|x\right|\geq\left|y\right|.So suppose that
\left|x\right|\leq\left|y\right|then by definitionx\leq -yand so by part 12. of proposition 89{reference-type="ref" reference="prop:InequalityRationalNumbers"} we have that$$\begin{equation} x\leq -y \Rightarrow x+y\leq 0 \end{equation*}$$*
Moreover, as
x\geq 0theny\leq x+y\leq 0. Hence we have by the definition of the absolute value that$$\begin{equation} \left|x+y\right|=-\left(x+y\right)\leq -y=\left|y\right| \end{equation*}$$ As
-y>0.*In the case
\left|x\right|\geq\left|y\right|we have by definition thatx\geq -yand sox+y\geq 0. Additionally it is clear thatx\geq x+yasy\leq 0and\left|x\right|\geq\left|y\right|. Hence by definition of the absolute value we have that$$\begin{equation} \left|x+y\right|=x+y\leq x=\left|x\right| \end{equation*}$$*
Now, it is clear to see that
\left|x\right|\leq \left|x\right|+\left|y\right|and likewise\left|y\right|\leq \left|x\right|+\left|y\right|.We have hence shown that
\left|x+y\right|\leq\left|x\right|+\left|y\right|. -
x\leq 0andy\geq 0:This is similar to above, interchanging the roles of
xandy. -
x\leq 0andy\leq 0:Suppose that
x\leq 0 and y\leq 0 then by definition we have that \left|x+y\right|=-\left(x+y\right)=-x-y. As x\leq 0 and y\leq 0 then we have that \left|x\right|=-x and \left|y\right|=-y which shows $\left|x+y\right|=\left|x\right|+\left|y\right|\leq\left|x\right|+\left|y\right|$
-
-
\left|x-y\right|\leq\left|x-z\right|+\left|z-y\right|:We have that
$$\begin{align} \left|x-y\right|&=\left|x-\left(z-z\right)-y\right|\ &=\left|x-z+z-y\right|\ &\leq \left|x-z\right|+\left|z-y\right| \end{align*}$$*
-
\left|x-y\right|\geq \left|\left|x\right|-\left|y\right|\right|:We have that
$$\begin{align} \left|x\right|&=\left|\left(x-y\right)+y\right|\leq \left|x-y\right|+\left|y\right| \Rightarrow \left|x\right|-\left|y\right|\leq \left|x-y\right|\ \left|y\right|&=\left|\left(y-x\right)+x\right|\leq \left|x-y\right|+\left|x\right| \Rightarrow \left|y\right|-\left|x\right|\leq \left|x-y\right|\ \end{align*}$$*
Hence we have
$$\begin{align} \left|x\right|-\left|y\right|\leq \left|x-y\right| &\Rightarrow \left|\left|x\right|-\left|y\right|\right|\leq \left|x-y\right|\ \left|y\right|-\left|x\right|=\left(-1\right)\left(\left|x\right|-\left|y\right|\right)\leq \left|x-y\right| &\Rightarrow \left|\left|x\right|-\left|y\right|\right|\leq \left|x-y\right|\ \end{align*}$$*
Hence we have the result.
-
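\left|\cdot\right| is not injective: This follows as the absolute value function was not injective for the integers. For example, \left|-1\right|=\left|1\right|=1 while -1\neq 1.
-
\left|\cdot\right| is not surjective: This follows as the absolute value function was not surjective for the integers. For example, regarding \left|\cdot\right| as a map into \mathbb{Q}, no x\in\mathbb{Q} has \left|x\right|=-1 as \left|x\right|\geq 0 by part 1.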
As required. $\qed$ :::
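Several of these properties lend themselves to a quick numerical check. The sketch below assumes `fractions.Fraction` together with Python's built-in `abs` as a model of \left|\cdot\right| on \mathbb{Q}; the function name `check_absolute_value_properties` and the sample range are hypothetical, and passing the check on a finite sample does not replace the proofs above.

```python
from fractions import Fraction
from itertools import product

# Illustrative sample of rationals p/q, including zero and negatives.
samples = [Fraction(p, q) for p in range(-4, 5) for q in range(1, 4)]


def check_absolute_value_properties() -> bool:
    """Return True if the sample satisfies properties 4 and 8 to 12."""
    for x, y in product(samples, repeat=2):
        if abs(x * y) != abs(x) * abs(y):                # property 4
            return False
        if (abs(x) <= y) != (-y <= x <= y):              # property 8
            return False
        if (abs(x) >= y) != (x <= -y or x >= y):         # property 9
            return False
        if abs(x + y) > abs(x) + abs(y):                 # property 10
            return False
        if abs(x - y) < abs(abs(x) - abs(y)):            # property 12
            return False
    for x, y, z in product(samples, repeat=3):
        if abs(x - y) > abs(x - z) + abs(z - y):         # property 11
            return False
    return True


print(check_absolute_value_properties())  # True
```

Properties 8 and 9 are tested as equivalences, so the check also covers negative values of y, where \left|x\right|\leq y can never hold and \left|x\right|\geq y always holds.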
Elementary Number Theory
Introduction
::: epigraph Mathematics is the queen of the sciences and Number Theory is the queen of mathematics.
Carl Friedrich Gauss :::
In the previous part, we started with only the axioms of ZFC, the rules of logic and knowledge of mappings, and built three types of numbers: the naturals, the integers and the rationals. Unfortunately, we now need to make a detour from constructing new objects. We need to start using the objects we have constructed to provide a guide on how to proceed with building more mathematical objects.
We will start with Number Theory. Number Theory primarily deals with the
properties of the integers \mathbb{Z} as well as mappings defined on
\mathbb{Z}. This includes properties about the operations on the
integers, properties about the compositions and ways of expressing
relationships between certain "types" of integers, solving equations
involving the integers and more.
The applications of Number Theory to the modern world are numerous. One main example of the usage of Number Theory is encryption, the art of obfuscating information so that it can only be read by trusted individuals11 . We will later consider an example of encryption called RSA.
Additionally, the ideas that we will develop when studying Number Theory are key to providing crucial insights into other branches of mathematics. We will come to see that many of the key properties of the integers are also enjoyed by many other types of mathematical objects, especially in an abstract setting.
Divisibility
::: epigraph Now where there are no parts, neither extension, shape, nor divisibility is possible. And these monads are the true atoms of nature and, in a word, the elements of things.
Gottfried Leibniz :::
Definition of divisibility of integers
Although we have a concrete construction of the integers, we haven't
even discussed some of their most basic properties! We know how to add,
subtract and multiply them, but we don't know how to divide them without
the rational numbers \mathbb{Q}. It is with \mathbb{Q} that we can
hope to find a rule that says that
\displaystyle\frac{a}{b}\in\mathbb{Z} for some a,b\in\mathbb{Z}.
Recall that in \mathbb{Q} we defined an equivalence relation \sim so
that for \left(a,b\right),\left(c,d\right)\in\mathbb{Z}^2 we have that
$$\begin{equation*} \left(a,b\right)\sim\left(c,d\right)\iff ad=bc \end{equation*}$$
where we had b\neq 0 and d\neq 0. We also saw that
\left(x,1\right)\in\left[\left(x,1\right)\right] represented an
integer. Hence the question we are resolving is when does
\left(a,b\right)\sim\left(x,1\right). We have that
$$\begin{equation*} \left(a,b\right)\sim\left(x,1\right)\iff a=bx \end{equation*}$$
That is b divides a and gives an integer if and only if a=bx. We
make this our first formal definition in the field of Number Theory.
::: {#def:NT_Int_Div_def .definition} Definition 141. Integer divisibility
Let a,b\in\mathbb{Z} with b\neq 0. We say that a is divisible by
b, or b divides a, written as b\mid a if and only if
\exists c\in\mathbb{Z} so that a=bc. We say that b is a divisor of
a.
If b does not divide a we write b\nmid a.
:::
::: example
Example 90. We have that 3\mid 6 as 6=3*2.
Observe that 2\nmid 3. Indeed there is no integer x so that 3=2x.
:::
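The definition of divisibility can be tested mechanically. This is a minimal sketch in which the function name `divides` is hypothetical; it uses the remainder operator `%` as an equivalent test for the existence of an integer c with a=bc, rather than searching for c directly.

```python
def divides(b: int, a: int) -> bool:
    """Return True when b | a, that is a = b*c for some integer c."""
    if b == 0:
        raise ValueError("the definition requires b != 0")
    # a = b*c for some integer c exactly when dividing a by b leaves remainder 0.
    return a % b == 0


print(divides(3, 6))  # True, as 6 = 3*2
print(divides(2, 3))  # False, there is no integer x with 3 = 2x
```

The argument order mirrors the notation b\mid a, with the divisor first.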
We make a definition based on divisibility, namely on whether a number can be divided into two equal parts.
::: definition Definition 142. Even number
Let x\in\mathbb{Z}. We say that x is even if we have that
2\mid x.
:::
This immediately gives another definition.
::: definition Definition 143. Odd number
Let x\in\mathbb{Z}. We say that x is odd if we have that
2\nmid x.
:::
We can make another definition, based on divisibility.
::: definition Definition 144. Integer multiple
Let a,b\in\mathbb{Z} so that b\mid a. We say that a is a multiple
of b.
:::
There are two results that we can derive based on an even number, an odd number and integer multiples.
::: {#prop:NT_even_iff_2n .proposition} Proposition 99. Integer is even if and only if it is a multiple of 2
Let x\in\mathbb{Z}. We have that x is even if and only if x is a
multiple of 2.
Proof:
\left(\Rightarrow\right): Suppose that x is even, then by
definition we have that 2\mid x and so by the definition of
divisibility we have that x=2c for some c\in\mathbb{Z}. By the
definition of being an integer multiple we have that x is a multiple
of 2.
\left(\Leftarrow\right): Suppose that x is a multiple of 2. By
definition of being an integer multiple, we have that x=2r for some
r\in\mathbb{Z}. Hence by the definition of divisibility, we have that
2\mid x and so by definition of an even number we have that x is
even. $\qed$
:::
We can find a similar proposition for odd numbers. Observe that by the
previous proposition that x being even means that x=2n for some
integer n. Also, we have that 2n+2=2\left(n+1\right) is even, so
what can we say about 2n+1?
::: proposition Proposition 100. Integer is odd if and only if it is not a multiple of 2
Let x\in\mathbb{Z}. We have that x is odd if and only if x is not
a multiple of 2.
Proof:
The proof follows by the contrapositive: x is not odd, that is x is even, if and only if x is a multiple of
2, which is the previous proposition. $\qed$
:::
Hence we need to determine if 2n+1 is even or odd. We need to develop
the theory of divisibility.
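Before developing that theory, a quick experiment suggests what the answer should be. The sketch below uses hypothetical helper names `is_even` and `is_odd`, built on the remainder operator, and checks 2n+1 for a finite range of n; it is evidence for a conjecture, not a proof.

```python
def is_even(x: int) -> bool:
    """x is even when 2 | x, tested here via the remainder on division by 2."""
    return x % 2 == 0


def is_odd(x: int) -> bool:
    """x is odd when 2 does not divide x."""
    return not is_even(x)


# 2n is even and 2n + 1 is odd for every n in an illustrative range.
all_even = all(is_even(2 * n) for n in range(-50, 51))
all_odd = all(is_odd(2 * n + 1) for n in range(-50, 51))
print(all_even, all_odd)  # True True
```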
The definition of divisibility gives an immediate result. Namely that
when considering the divisibility of integers we need only concern
ourselves with positive integers, as negative integers will also be
divisors. That is if b\mid a then so does -b.
::: {#prop:NT_PositiveAndNegativeDivisorsForIntsExist .proposition} Proposition 101. Integer dividing another implies negative integer also divides
Let a,b\in\mathbb{Z} with b\mid a. We also have that -b\mid a.
Proof:
Let a,b\in\mathbb{Z} with b\mid a. By definition of divisibility,
we have that \exists c\in\mathbb{Z} so that a=bc. We know that
-1*-1=1 and so we have that
$$\begin{equation*} a=bc=\left(-1*-1\right)bc=-b*-c \end{equation*}$$
As -c\in\mathbb{Z} then it follows by definition that -b\mid a.
$\qed$
:::
Hence by proposition 101{reference-type="ref" reference="prop:NT_PositiveAndNegativeDivisorsForIntsExist"} we will restrict our view to positive divisors only, knowing that any results about a positive divisor will extend to negative divisors.
One clear divisor of any integer a is itself, that is a\mid a as
a=a*1. We will find it interesting to consider the more non-trivial
divisors of some integers. Hence we make the following definition
::: definition Definition 145. Proper divisor
Let a,b\in\mathbb{Z} with b\mid a. If we have that 0<b<a then we
say that b is a proper divisor of a.
:::
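The pairing of positive and negative divisors, and the notion of a proper divisor, can be illustrated with a short sketch. The function names `divisors` and `proper_divisors` are hypothetical, and the search range relies on the assumption, not yet proved at this point in the text, that a positive divisor of a non-zero integer a is at most \left|a\right|.

```python
def divisors(a: int) -> list:
    """All integers b with b | a, for a non-zero integer a."""
    if a == 0:
        raise ValueError("every non-zero integer divides 0")
    # Assumption: a positive divisor of a is at most |a|.
    positive = [b for b in range(1, abs(a) + 1) if a % b == 0]
    # Each positive divisor b comes paired with the negative divisor -b.
    return sorted([-b for b in positive] + positive)


def proper_divisors(a: int) -> list:
    """Divisors b of a with 0 < b < a."""
    return [b for b in range(1, a) if a % b == 0]


print(divisors(6))          # [-6, -3, -2, -1, 1, 2, 3, 6]
print(proper_divisors(12))  # [1, 2, 3, 4, 6]
```

The output of `divisors(6)` is symmetric about zero, which is why restricting attention to positive divisors loses no information.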
There are some clear results about divisibility.
::: {#prop:NT_divisibility_properties .proposition} Proposition 102. Properties of divisibility
Let a,b,c\in\mathbb{Z}. We have the following properties for
divisibility

1.  $a\mid b \Rightarrow a\mid bc$ for any $c\in\mathbb{Z}$.
2.  $a\mid b$ and $b\mid c$ implies that $a\mid c$.
3.  $a\mid b$ and $a\mid c$ implies that $a\mid\left(bx+cy\right)$ for any $x,y\in\mathbb{Z}$.
4.  $a\mid b$ and $b\mid a$ implies $a=\pm b$, that is either $a=b$ or $a=-b$.
5.  $a\mid b$ and $a>0$ and $b>0$ implies that $a\leq b$.
6.  If $m\in\mathbb{Z}$ is such that $m\neq 0$ then $a\mid b$ is true if and only if $ma\mid mb$.
7.  For all $a\in\mathbb{Z}$ with $a\neq 0$ we have $a\mid 0$.

Proof:

1.  $a\mid b \Rightarrow a\mid bc$ for any $c\in\mathbb{Z}$: Suppose that $a\mid b$, then by definition there exists $d\in\mathbb{Z}$ so that $b=ad$. Hence we have that

    $$\begin{equation*} bc=adc \Rightarrow a\mid bc \end{equation*}$$

    as $dc\in\mathbb{Z}$.
2.  $a\mid b$ and $b\mid c$ implies that $a\mid c$: Suppose that $a\mid b$ and $b\mid c$, then by the definition of divisibility we have that $b=ax$ and $c=by$ for some $x,y\in\mathbb{Z}$. We hence see that

    $$\begin{equation*} c=axy \end{equation*}$$

    Hence as $xy\in\mathbb{Z}$ we conclude that $a\mid c$.
3.  $a\mid b$ and $a\mid c$ implies that $a\mid\left(bx+cy\right)$ for any $x,y\in\mathbb{Z}$: Let $a\mid b$ and $a\mid c$, then there are $d,e\in\mathbb{Z}$ such that $b=ad$ and $c=ae$. Now, let $x,y\in\mathbb{Z}$, then we have that $bx=adx$ and $cy=aey$ and $bx+cy=adx+aey=a\left(dx+ey\right)$. Hence $a\mid\left(bx+cy\right)$.
4.  $a\mid b$ and $b\mid a$ implies $a=\pm b$: If $a\mid b$ then $\exists x\in\mathbb{Z}$ so that $b=ax$, likewise if $b\mid a$ then $\exists y\in\mathbb{Z}$ so that $a=by$. It follows that $b=byx$. As $b\neq 0$, the cancellation law shows that $b=byx$ is true if and only if $yx=1$. Therefore either $x=y=1$ or $x=y=-1$. The result is clear after substituting $y$ into $a=by$.
5.  $a\mid b$ and $a>0$ and $b>0$ implies that $a\leq b$: Suppose that $a\mid b$, then there exists $d\in\mathbb{Z}$ so that $b=ad$. As $a>0$ and $b>0$ we must have $d>0$, that is $d\geq 1$. Hence $b=ad\geq a$.
6.  If $m\in\mathbb{Z}$ is such that $m\neq 0$ then $a\mid b$ is true if and only if $ma\mid mb$:

    $\left(\Rightarrow\right)$: Let $m\in\mathbb{Z}$ be non-zero and let $a\mid b$. By definition, there is some $c\in\mathbb{Z}$ so that $b=ac$. Multiplying both sides by $m$ gives

    $$\begin{equation*} bm=acm=amc \end{equation*}$$

    and so $am\mid bm$.

    $\left(\Leftarrow\right)$: Suppose that $am\mid bm$, then again by the definition of divisibility we have that there is some $c\in\mathbb{Z}$ so that $bm=amc$. By the cancellation law, as $m\neq 0$ we can cancel the $m$ to get $b=ac$ and the result follows.
7.  For all $a\in\mathbb{Z}$ with $a\neq 0$ we have $a\mid 0$: Let $a\in\mathbb{Z}$, where $a\neq 0$. We have that $0=ka$ has the solution $k=0$ by part I proposition 69{reference-type="ref" reference="prop:IntegersHaveNoZeroDivisors"}. Hence $a\mid 0$.

As required. $\qed$
:::
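The properties above can be sanity-checked numerically. The following is only a brute-force sketch over a small range of non-zero integers, not a substitute for the proof; the helper names `divides` and `check_divisibility_properties` and the default bound are assumptions made for this illustration.

```python
from itertools import product


def divides(a: int, b: int) -> bool:
    """Return True when a | b, that is b = a*c for some integer c (a non-zero)."""
    return a != 0 and b % a == 0


def check_divisibility_properties(bound: int = 6) -> bool:
    """Test parts 1 to 7 for all non-zero a, b, c in [-bound, bound]."""
    values = [n for n in range(-bound, bound + 1) if n != 0]
    for a, b, c in product(values, repeat=3):
        # Part 6 (taking m = c): a | b exactly when ca | cb.
        if divides(a, b) != divides(c * a, c * b):
            return False
        if not divides(a, b):
            continue
        # Part 1: a | bc.
        if not divides(a, b * c):
            return False
        # Part 2: a | b and b | c give a | c.
        if divides(b, c) and not divides(a, c):
            return False
        # Part 3: a | b and a | c give a | (bx + cy).
        if divides(a, c) and not all(
            divides(a, b * x + c * y) for x, y in product(values, repeat=2)
        ):
            return False
        # Part 4: a | b and b | a give a = b or a = -b.
        if divides(b, a) and abs(a) != abs(b):
            return False
        # Part 5: for positive a and b, a | b forces a <= b.
        if a > 0 and b > 0 and a > b:
            return False
    # Part 7: every non-zero a divides 0.
    return all(divides(a, 0) for a in values)


print(check_divisibility_properties())  # True
```

The check relies on Python's `%` operator returning 0 exactly when the left operand is a multiple of the right one, including for negative operands.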
Part 3. of the previous proposition can be generalised. We will work with an example to see how this can be achieved.
::: example
Example 91. Let a=2, b=16 and c=32. Clearly we have that
a\mid b as 16=8*2 and likewise a\mid c as 32=16*2.
Now part 3. states that if a\mid b and a\mid c then we must have
that a\mid\left(bx+cy\right) for any x,y\in\mathbb{Z}.
Indeed, for example, we can see that
2\mid\left(-5\left(16\right)+7\left(32\right)\right), as
-5\left(16\right)+7\left(32\right)=-80+224=144=2*72. Now suppose that
d=64 and say z=5. We can see that
$$\begin{equation*} -5\left(16\right)+7\left(32\right)+5\left(64\right)=144+320=464 \Rightarrow 2\mid\left(-5\left(16\right)+7\left(32\right)+5\left(64\right)\right) \end{equation*}$$
:::
We prove the general statement now.
::: {#prop:NT_Divisor_dividing_all_in_set_divides_linear_combination .proposition} Proposition 103. Divisor that divides a set of integers divides a combination of the set
Let a\in\mathbb{Z} and let S=\left\{b_1,b_2,b_3,\dots,b_n\right\}
be a set of n integers where b_i\in\mathbb{Z} for each b_i.
Moreover suppose that a\mid b_i for each b_i\in S. We have that
$$\begin{equation*} a\mid\sum_{i=1}^n b_i x_i \end{equation*}$$
for any x_i\in\mathbb{Z}.
Proof:
We argue by induction on n. The case n=1 is part 1. of proposition
102{reference-type="ref"
reference="prop:NT_divisibility_properties"} and the base case n=2 is part 3. of the same proposition. So suppose that the result
holds for some k\geq 2, which is to say that if
S=\left\{b_1,b_2,\dots,b_k\right\} and we have that a\mid b_i for
each b_i\in S then
$$\begin{equation*} a\mid\sum_{i=1}^k b_i x_i \end{equation*}$$
We need to show that the result holds for k+1. That is if
\Tilde{S}=S\cup \left\{b_{k+1}\right\} so that a\mid b_i for each
b_i\in\Tilde{S} then
$$\begin{equation*} a\mid\sum_{i=1}^{k+1} b_i x_i \end{equation*}$$
So take \Tilde{S}=S\cup \left\{b_{k+1}\right\} so that a\mid b_i
for each b_i\in\Tilde{S}. By applying part 1. of proposition
102{reference-type="ref"
reference="prop:NT_divisibility_properties"} to each a\mid b_i we know
that for all x_i\in\mathbb{Z} that a\mid b_ix_i.
Now, as a\mid b_i for each b_i\in S, the induction hypothesis gives
$$\begin{equation*} a\mid\sum_{i=1}^k b_i x_i \end{equation*}$$
Let \displaystyle d=\sum_{i=1}^k b_i x_i, so that d\in\mathbb{Z} and a\mid d.
Additionally we know that a\mid b_{k+1} and so by part 3. of proposition
102{reference-type="ref"
reference="prop:NT_divisibility_properties"}, applied with the coefficients 1 and x_{k+1}, we
have that
$$\begin{align*} a &\mid\left(1d + b_{k+1}x_{k+1}\right)\\ a &\mid\left(\sum_{i=1}^k b_i x_i + b_{k+1}x_{k+1}\right)\\ a &\mid\left(\sum_{i=1}^{k+1} b_i x_i\right) \end{align*}$$
Which implies the result holds for k+1 and hence for any
n\in\mathbb{N} by induction. $\qed$
:::
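The numbers from the earlier example give a concrete instance of this proposition. The sketch below evaluates the combination with a=2, the integers 16, 32, 64 and the coefficients -5, 7, 5; the function name `linear_combination` is an assumption made for this illustration.

```python
def linear_combination(bs: list[int], xs: list[int]) -> int:
    """Return b_1*x_1 + b_2*x_2 + ... + b_n*x_n."""
    return sum(b * x for b, x in zip(bs, xs))


a = 2
bs = [16, 32, 64]   # each b_i is a multiple of a
xs = [-5, 7, 5]     # arbitrary integer coefficients
total = linear_combination(bs, xs)
print(total, total % a == 0)  # 464 True
```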
The greatest common divisor and the least common multiple
Now that we have a solid grasp of the basics of integer divisibility, we can start looking towards some applications. One immediate question is, given a set of integers, say
$$\begin{equation*} S=\left\{a_1,a_2,a_3,\dots,a_n\right\} \end{equation*}$$
what is the largest integer which divides each a_i\in S, and what is
the smallest positive integer m so that m has each a_i\in S as a
divisor? These two ideas are immediately useful when doing
arithmetic with rational numbers. For example, consider trying to
simplify the fraction \displaystyle\frac{525}{2925}. To simplify this
we need to find the integers that multiply to make 525 and those that
multiply to make 2925. If there are any in common then we know from
the construction of the rationals that \displaystyle \frac{x}{x}=1 and
in particular we have that
\displaystyle\frac{xy}{xz}=\frac{y}{z}*\frac{x}{x}=\frac{y}{z}.
Likewise suppose we wanted to add \displaystyle\frac{1}{4} and
\displaystyle\frac{1}{7}. It is true that by definition of addition,
we would have
$$\begin{equation*} \frac{1}{4}+\frac{1}{7}=\frac{1*7+1*4}{7*4}=\frac{7+4}{7*4}=\frac{11}{28} \end{equation*}$$
The key stage was \displaystyle\frac{1*7+1*4}{7*4}, breaking this down
we see that
$$\begin{equation*} \frac{1*7+1*4}{7*4}=\frac{1*7}{7*4}+\frac{1*4}{7*4} \end{equation*}$$
In other words, we are finding a common multiple of 7 and 4 to
use as the denominator. It is therefore worthwhile to develop the
theory of common divisors and common multiples.
We will start by working out common divisors, by first making a definition.
::: definition Definition 146. Common divisor
Let a,b,c\in\mathbb{Z} be non-zero integers. We say that c is a
common divisor of a and b if c\mid a and c\mid b.
:::
::: example
Example 92. Consider the integers 35 and 25. The divisors of
35 are 1, 5, 7 and 35, likewise the divisors of 25 are
1, 5 and 25. The largest common divisor is therefore 5.
:::
::: example
Example 93. Consider the integers 24 and 54. Doing the same as
before, we can see that the divisors of 24 are 1, 2, 3, 4,
6, 8, 12 and 24. Looking at the divisors of 54 we see that
they are 1, 2, 3, 6, 9, 18, 27 and 54.
The common divisors of 24 and 54 are therefore 1, 2, 3 and
6, the largest of which is 6.
:::
::: example
Example 94. Consider the common divisors of 3 and 5. The
divisors of 3 are simply 1 and 3, likewise the divisors of 5 are
1 and 5. The only common divisor is 1.
:::
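The three examples above can be reproduced by listing divisors directly. This is a brute-force sketch for small positive integers only; the function names `divisors` and `common_divisors` are assumptions made for this illustration.

```python
def divisors(n: int) -> list[int]:
    """Return the positive divisors of a non-zero integer n by trial division."""
    n = abs(n)
    return [d for d in range(1, n + 1) if n % d == 0]


def common_divisors(a: int, b: int) -> list[int]:
    """Return the positive integers dividing both a and b."""
    return [d for d in divisors(a) if b % d == 0]


for a, b in [(35, 25), (24, 54), (3, 5)]:
    shared = common_divisors(a, b)
    print(a, b, shared, max(shared))
# 35 25 [1, 5] 5
# 24 54 [1, 2, 3, 6] 6
# 3 5 [1] 1
```

Trial division needs as many steps as the size of the integer, which is why a better method is developed in the remainder of this section.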
We can see from the previous examples that there was a largest, or greatest common divisor between the pairs of integers in each case. We can show that for any two integers, there is always a greatest common divisor.
::: {#thm:NT_gcd_exists .theorem} Theorem 32. The greatest common divisor of two integers exists
Let a,b\in\mathbb{Z} so that a\neq 0 or b\neq 0. Then there
exists d\in\mathbb{Z} so that d is the largest possible common
divisor, that is there is no g\in\mathbb{Z} with g>d so that
g\mid a and g\mid b.
Proof:
Firstly, we note that as 1\mid a and 1\mid b, a common divisor always
exists and any largest common divisor is at least 1. To show that there is
a largest common divisor we must show that no common divisor can
exceed some integer, say M, where M depends on a and b. Moreover
by proposition
101{reference-type="ref"
reference="prop:NT_PositiveAndNegativeDivisorsForIntsExist"} we only
need to consider the case where a\geq 0 and b\geq 0.
So suppose that c\mid a and c\mid b for some c\geq 1. By part 5.
of proposition
102{reference-type="ref"
reference="prop:NT_divisibility_properties"} we have that if a>0 then as c\mid a
we get c\leq a, likewise if b>0 then as c\mid b we get c\leq b. There are three
possibilities to consider

1.  $a=b$
2.  Without any loss of generality we have $a<b$
3.  One of $a=0$ or $b=0$ but not both at the same time.

We deal with each in turn.

1.  $a=b$: In this case we take $M$ to be the largest divisor of $a$, or equivalently $b$, which is $a$ itself. Then $c\leq M$.
2.  Without any loss of generality we have $a<b$: Here both $a$ and $b$ are non-zero, and if $a<b$ is not the case we simply swap the roles of $a$ and $b$. In this case, we take $M$ to be the largest divisor of $a$, so that $M\leq a$. A bound of the form $M\leq b$ alone would not do, for by the fact $a<b$ we could have the case that $M>a$, a contradiction to the fact that $c\leq a$ as $c\mid a$.
3.  One of $a=0$ or $b=0$ but not both at the same time: Suppose that $a=0$ and $b\neq 0$, then every non-zero integer divides $a$, but as $c\mid b$ then $c\leq b$ and so we take $M=b$ as $b\mid b$. Likewise if we assume $b=0$ and $a\neq 0$.

In each case we found an M so that if c\mid a and c\mid b then
c\leq M. The common divisors of a and b are therefore a non-empty set
of integers bounded above by M, and so there is a largest one. $\qed$
:::
We have shown that for any two integers, not both zero, a greatest common divisor always exists. We can make a formal definition.
::: definition Definition 147. Greatest common divisor
Let a,b\in\mathbb{Z} so that a\neq 0 and b\neq 0. Let
d\in\mathbb{Z} be such that d\mid a and d\mid b. We say the
largest value of d where d\mid a and d\mid b is the greatest
common divisor of a and b, denoted
d=\mathop{\mathrm{GCD}}\left(a,b\right), sometimes written
\gcd\left(a,b\right) and in some texts simply by \left(a,b\right).
As a\mid 0 for any non-zero integer a, we define
\mathop{\mathrm{GCD}}\left(a,0\right)=a, similarly
\mathop{\mathrm{GCD}}\left(0,b\right)=b.
:::
We will use the notation \mathop{\mathrm{GCD}} in this text and we
will usually abbreviate saying the greatest common divisor to
\mathop{\mathrm{GCD}}. Although we have proved that the greatest
common divisor exists, we do not yet actually have a method of
calculating what it is other than trying through trial and error. To see
how we can attempt to construct a method of finding
\mathop{\mathrm{GCD}} we should look to cases where integer division
does not fail and to cases where it does fail.
::: example
Example 95. It is clear that 2\nmid 3 as there is no integer x
so that 3=2x. If we take x=1 we get the false equality of 3=2, if
we take x=2 we get another false equality of 3=4. We observe however
that 3=2*1+1.
:::
::: example
Example 96. Let a=25 and b=7. It is clear that b\nmid a. The
first couple multiples of 7 are 7=7*1, 14=7*2, 21=7*3, 28=7*4
and so on. However, we can see that 25=7*3+4.
:::
::: example
Example 97. Let a=36 and b=12. Clearly b\mid a as
36=12*3. The first couple of multiples of 12 are 12=12*1, 24=12*2,
36=12*3 and so on, and we can write 36=12*3+0.
:::
::: example
Example 98. This time, let a=8 and b=2. Then we have that
2\mid 8 as 8=2*4. In a similar way to the previous examples we see
that $8=2*4+0$.
:::
If we let a,b\in\mathbb{Z} so that b\nmid a then, in the previous
examples it seems that we can always find a multiple of b so that
bx\leq a for some x\in\mathbb{Z} and in particular we have that
$$\begin{equation*} a=bx+\left(a-bx\right) \end{equation*}$$
In the case that b\mid a we can choose x so that a-bx=0. Interpreting what a-bx
means, when b\nmid a then a-bx\neq 0 for every x, and when b\mid a we had that
a-bx=0 for a suitable x. Hence a-bx is a measure of how far off we are from
having b\mid a. This is to say that if a-bx>0 then the multiple bx falls a little
short of a, and if a-bx<0 then the multiple bx is a
little over a.
In general, we can see that any integer division can be viewed in this
way, that is if a,b\in\mathbb{Z} we can see the result of a divided
by b in the form a=qb+r for some q,r\in\mathbb{Z}.
::: {#thm:NT_divAlg .theorem} Theorem 33. The division algorithm
Let a,b\in\mathbb{Z} so that b> 0, then there exist
q,r\in\mathbb{Z} with q,r being unique so that
$$\begin{equation*} a=bq+r \end{equation*}$$
where $0\leq r < b$.
Proof:
We prove existence for a\geq 0 here; negative values of a are dealt with
in the extended version of the theorem that follows. There are three cases to consider

1.  $a=b$
2.  $a<b$
3.  $a>b$

We deal with each in turn.

1.  $a=b$: If $a=b$ then $b\mid a$ holds trivially and we see that $a=1*b+0$ where $q=1$ and $r=0$.
2.  $a<b$: If $0\leq a<b$ then we also see that trivially we have that $a=0*b+a$ where $q=0$ and $r=a$.
3.  $a>b$: This case is the meat of the theorem. To prove the division theorem we will argue by strong induction on $a$. The base case is $a=1$ where we either have $a=b$ or $a<b$ which have been dealt with. So now suppose that the result holds for every integer from $1$ up to some $k\geq 1$. As in the base case, we only need to consider the case of $k+1>b$, or equivalently $b<k+1$.

    As $b<k+1$ we have that $1\leq \left(k+1\right)-b\leq k$ and so by the induction hypothesis we have that there are integers $q,r\in\mathbb{Z}$ so that

    $$\begin{equation*} \left(k+1\right)-b=bq+r \end{equation*}$$

    where $0\leq r< b$. From this, we clearly get $k+1=bq'+r$ where $q'=1+q$ which shows the induction step. The result now follows by induction.

Now that the existence has been shown, it is left to show the
uniqueness of q and r. So suppose that q_1,r_1 and q_2,r_2 are
two such pairs that satisfy the conditions of the theorem, so that
bq_1+r_1=a=bq_2+r_2. Firstly
suppose that r_1\neq r_2. Without loss of generality we have
that r_1<r_2 so that 0<r_2-r_1<b, and rearranging gives
$$\begin{equation*} r_2-r_1=b\left(q_1-q_2\right) \end{equation*}$$
which implies that b \mid\left(r_2-r_1\right). This is a
contradiction to proposition
102{reference-type="ref"
reference="prop:NT_divisibility_properties"} part 5. as this part
implies that b\leq r_2-r_1. Therefore r_1=r_2 and from r_1=r_2 we
have that 0=b\left(q_2-q_1\right) and by part
{reference-type="ref" reference="part1"} proposition
69{reference-type="ref"
reference="prop:IntegersHaveNoZeroDivisors"} as b>0 then q_2-q_1=0
giving q_2=q_1. $\qed$
:::
Based on this theorem we make a definition.
::: definition Definition 148. Quotient and remainder
Let a,b\in\mathbb{Z} so that b>0. We have by the division algorithm
that
$$\begin{equation*} a=qb+r \end{equation*}$$
where q,r\in\mathbb{Z} and 0\leq r < b. We say that q is the
quotient of the division and that r is the remainder.
:::
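The induction step in the proof of the division algorithm passes from a to a-b, which suggests computing the quotient and remainder by repeated subtraction. The sketch below does this for a\geq 0 and b>0 and compares the answer with Python's built-in `divmod`; the function name `divide_by_subtraction` is an assumption made for this illustration.

```python
def divide_by_subtraction(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a == q*b + r and 0 <= r < b, for a >= 0 and b > 0."""
    if a < 0 or b <= 0:
        raise ValueError("this sketch assumes a >= 0 and b > 0")
    q, r = 0, a
    while r >= b:   # mirrors the induction step: pass from r to r - b
        r -= b
        q += 1
    return q, r


for a, b in [(3, 2), (25, 7), (36, 12), (8, 2)]:
    q, r = divide_by_subtraction(a, b)
    assert (q, r) == divmod(a, b)  # agrees with the built-in for these inputs
    print(f"{a} = {q}*{b} + {r}")
# 3 = 1*2 + 1
# 25 = 3*7 + 4
# 36 = 3*12 + 0
# 8 = 4*2 + 0
```

The loop stops as soon as the running remainder drops below b, which is exactly the condition 0\leq r<b in the definition.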
In the theorem, we assumed that b>0. However by proposition
101{reference-type="ref"
reference="prop:NT_PositiveAndNegativeDivisorsForIntsExist"} we know
that negative divisors are also valid. To resolve this we reformulate
theorem 33{reference-type="ref"
reference="thm:NT_divAlg"} so that 0\leq r <\left|b\right|.
::: {#thm:NT_divAlg_ext .theorem} Theorem 34. The division algorithm (Extended)
Let a,b\in\mathbb{Z} so that b\neq 0, then there exist
q,r\in\mathbb{Z} with q,r being unique so that
$$\begin{equation*} a=bq+r \end{equation*}$$
where $0\leq r < \left|b\right|$.
Proof:
By the division algorithm, theorem
33{reference-type="ref" reference="thm:NT_divAlg"} we
have for \left|a\right| and \left|b\right| that there exist unique
q,r\in\mathbb{Z} so that
$$\begin{equation*} \left|a\right|=q\left|b\right|+r \end{equation*}$$
where 0\leq r<\left|b\right|. There are a few cases to consider.

1.  $r=0$
2.  $r>0$ and $a\geq 0$
3.  $r>0$ and $a<0$

We deal with each in turn.

1.  $r=0$: If $r=0$, then $\left|a\right|=q\left|b\right|$ and so by the properties of the absolute value we have that $a=\pm qb$, hence $a=b\left(\pm q\right)$ and we have the result.
2.  $r>0$ and $a\geq 0$: Now suppose $r>0$ and $a\geq 0$. We hence have that $a=q\left|b\right|+r$ which gives

    $$\begin{align*} a&=bq+r,\ \text{if } b>0\\ a&=\left(-b\right)q+r,\ \text{if } b<0 \end{align*}$$

    The first is simply the first version of the division algorithm and the second can be written as $a=b\left(-q\right)+r$ which gives the result.
3.  $r>0$ and $a<0$: Finally if $r>0$ and $a<0$ then we have

    $$\begin{equation*} -a=\left|b\right|q+r \Rightarrow a=-\left|b\right|q-r \end{equation*}$$

    This is a problem as it would give a negative remainder. We can employ a trick that doesn't change the value of $a$ but allows us to express $a=-\left|b\right|q-r$ in a more suitable form.

    $$\begin{align*} a&=-\left|b\right|q-r\\ a&=-\left|b\right|q+\left(\left|b\right|-\left|b\right|\right)-r\\ a&=-\left|b\right|q-\left|b\right|+\left(\left|b\right|-r\right)\\ a&=\left|b\right|\left(-1-q\right)+\left(\left|b\right|-r\right) \end{align*}$$

    By assumption we have that $0<r<\left|b\right|$, which implies that $0<\left|b\right|-r<\left|b\right|$, so we re-write the above as

    $$\begin{equation*} a=bq'+r' \end{equation*}$$

    where $r'=\left|b\right|-r$ and $q'=-1-q$ if $b>0$, and for $b<0$ we write $q'=1+q$.

This completes the proof. $\qed$
:::
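The extended statement can be tried out for negative divisors. One point to be aware of is that Python's built-in `divmod` gives the remainder the sign of the divisor, so for b<0 it does not satisfy 0\leq r<\left|b\right| directly. The sketch below adjusts for this; the function name `division_algorithm` is an assumption made for this illustration.

```python
def division_algorithm(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a == b*q + r and 0 <= r < abs(b), for non-zero b."""
    if b == 0:
        raise ValueError("the divisor b must be non-zero")
    q, r = divmod(a, b)   # Python gives r the same sign as b
    if r < 0:             # only possible when b < 0
        q, r = q + 1, r - b
    return q, r


for a, b in [(25, 7), (7, -3), (-7, 3), (-7, -3)]:
    q, r = division_algorithm(a, b)
    print(f"{a} = ({b})*({q}) + {r}")
# 25 = (7)*(3) + 4
# 7 = (-3)*(-2) + 1
# -7 = (3)*(-3) + 2
# -7 = (-3)*(3) + 2
```

The adjustment adds 1 to the quotient and subtracts b from the remainder, which leaves bq+r unchanged, in the same spirit as the trick used in the third case of the proof.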
We can now go back to a problem from the first section, namely showing
that 2n+1 must be odd.
::: {#prop:NT_Odd_iff_2n+1 .proposition} Proposition 104. Integer is odd if and only if it is of the form $2n+1$
Let x\in\mathbb{Z}. We have that x is odd if and only if
x=2n+1 for some n\in\mathbb{Z}.
Proof:
Suppose x\in\mathbb{Z}, then by the division algorithm we have that
$$\begin{equation*} x=2q+r \end{equation*}$$
where 0\leq r< \left|2\right|. Hence the only remainders possible are
r=0 or r=1, so either x=2q or x=2q+1, and by the uniqueness of the
remainder exactly one of these holds. In the first case we
have x=2q is even by definition, and so x is not odd. In the case x=2q+1 the
remainder is non-zero, so
2\nmid x and x can't be even by definition. It follows that
x is odd. Hence x is odd precisely when x=2q+1 for some q\in\mathbb{Z}. $\qed$
:::
With this proposition and proposition 99{reference-type="ref" reference="prop:NT_even_iff_2n"} we can derive the evenness or oddness when adding or multiplying even or odd integers.
::: proposition Proposition 105. Even and oddness for addition and multiplication
Let x,y\in\mathbb{Z}. We have that

1.  If $x$ is even and $y$ is even then $x+y$ is even and $xy$ is even.
2.  If $x$ is even and $y$ is odd then $x+y$ is odd and $xy$ is even.
3.  If $x$ is odd and $y$ is even then $x+y$ is odd and $xy$ is even.
4.  If $x$ is odd and $y$ is odd then $x+y$ is even and $xy$ is odd.

Proof:

1.  If $x$ is even and $y$ is even then $x+y$ is even and $xy$ is even: Suppose that $x$ and $y$ are even, then by proposition 99{reference-type="ref" reference="prop:NT_even_iff_2n"} we have $x=2n$ for some $n\in\mathbb{Z}$ and $y=2m$ for some $m\in\mathbb{Z}$. We have that $x+y=2n+2m=2\left(n+m\right)$ hence $x+y$ is even by proposition 99{reference-type="ref" reference="prop:NT_even_iff_2n"}. Likewise, we have that $xy=2n*2m=2\left(2nm\right)$ and therefore $xy$ is even.
2.  If $x$ is even and $y$ is odd then $x+y$ is odd and $xy$ is even: Suppose that $x$ is even and $y$ is odd. By proposition 99{reference-type="ref" reference="prop:NT_even_iff_2n"} we have that $x=2n$ for some $n\in\mathbb{Z}$ and by proposition 104{reference-type="ref" reference="prop:NT_Odd_iff_2n+1"} we have that $y=2m+1$ for some $m\in\mathbb{Z}$.

    We have $x+y=2n+2m+1=2\left(n+m\right)+1$ and so $x+y$ is odd by proposition 104{reference-type="ref" reference="prop:NT_Odd_iff_2n+1"}. Additionally, $xy=2n\left(2m+1\right)=2\left(2mn+n\right)$ and so by proposition 99{reference-type="ref" reference="prop:NT_even_iff_2n"} we have that $xy$ is even.
3.  If $x$ is odd and $y$ is even then $x+y$ is odd and $xy$ is even: Similar to above, swapping the roles of $x$ and $y$.
4.  If $x$ is odd and $y$ is odd then $x+y$ is even and $xy$ is odd: By proposition 104{reference-type="ref" reference="prop:NT_Odd_iff_2n+1"} we have that $x=2n+1$ for some $n\in\mathbb{Z}$ and $y=2m+1$ for some $m\in\mathbb{Z}$.

    Now, $x+y=\left(2n+1\right)+\left(2m+1\right)=2\left(n+m\right)+2=2\left(\left(n+m\right)+1\right)$. So by proposition 99{reference-type="ref" reference="prop:NT_even_iff_2n"} we have $x+y$ is even.

    Finally, $xy=\left(2n+1\right)\left(2m+1\right)=4nm+2n+2m+1=2\left(2nm+\left(n+m\right)\right)+1$ and so by proposition 104{reference-type="ref" reference="prop:NT_Odd_iff_2n+1"} $xy$ is odd.

As required. $\qed$
:::
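The four parity rules can be summarised as: a sum is even exactly when both terms have the same parity, and a product is odd exactly when both factors are odd. The following sketch checks this summary over a small range; the names `is_even` and `parity_rules_hold` and the default bound are assumptions made for this illustration.

```python
def is_even(n: int) -> bool:
    """Return True when n is a multiple of 2."""
    return n % 2 == 0


def parity_rules_hold(bound: int = 20) -> bool:
    """Check the sum and product rules for all x, y in [-bound, bound]."""
    for x in range(-bound, bound + 1):
        for y in range(-bound, bound + 1):
            same_parity = is_even(x) == is_even(y)
            both_odd = not is_even(x) and not is_even(y)
            if is_even(x + y) != same_parity:
                return False
            if is_even(x * y) == both_odd:  # the product must be odd iff both are odd
                return False
    return True


print(parity_rules_hold())  # True
```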
We now continue with our quest to find a method to compute the greatest
common divisor. At first, it might seem that we haven't made much
progress in finding a way to calculate the \mathop{\mathrm{GCD}}.
However, consider the following examples.
::: example
Example 99. Consider a=56 and b=24. By the division algorithm,
we have that 56=2*24+8. Now what about a=24 and b=8? Again, by the
division algorithm, we have that 24=3*8+0.
Now, the divisors of 56 are 1, 2, 4, 7, 8, 14, 28 and
56, the divisors of 24 are 1, 2, 3, 4, 6, 8, 12 and
24. The largest common divisor was 8, which was the remainder after
the first use of the division algorithm. Likewise, it was the divisor
in the second application of the division algorithm.
:::
::: example
Example 100. Consider a=4947 and b=1552. By the division
algorithm, we have that 4947=3*1552+291. Applying the division
algorithm to a=1552 and b=291 gives 1552=5*291+97. A third
application of the division algorithm to a=291 and b=97 gives
291=3*97+0.
Unlike with the previous example, there may be potentially too many
divisors for 4947 to list them out by trying each integer
0<x\leq 4947. The same is true for 1552. However, if we follow the
same logic as the previous example we might suspect that 97 is the
greatest common divisor, as by the division algorithm for a=4947 and
b=97 we get 4947=51*97+0. Applying the division algorithm to
a=1552 and b=97 gives 1552=16*97+0.
:::
Based on these two examples we might be tempted to make a conjecture on
how we can potentially calculate the \mathop{\mathrm{GCD}}. A further
example is needed.
::: example
Example 101. Let a=574 and b=34. By the division algorithm, we
have that 574=16*34+30. Applying the algorithm again to a=34 and
b=30 gives 34=1*30+4. Another application gives 30=7*4+2 and
finally a last application gives 4=2*2+0.
Now, applying the division algorithm to 574 and 2 gives
574=287*2+0 and applying it to 34 and 2 gives 34=17*2+0. So we
suspect that \mathop{\mathrm{GCD}}\left(574,34\right)=2.
:::
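The chains of divisions in the last three examples can be generated mechanically. The sketch below records each application of the division algorithm for positive integers until the remainder is 0; the function name `division_chain` is an assumption made for this illustration.

```python
def division_chain(a: int, b: int) -> list[tuple[int, int, int, int]]:
    """Return rows (a, q, b, r) with a == q*b + r, repeated on (b, r) until r == 0.

    This sketch assumes a and b are positive.
    """
    rows = []
    while b != 0:
        q, r = divmod(a, b)
        rows.append((a, q, b, r))
        a, b = b, r   # the next step divides the old divisor by the remainder
    return rows


for a, q, b, r in division_chain(574, 34):
    print(f"{a} = {q}*{b} + {r}")
# 574 = 16*34 + 30
# 34 = 1*30 + 4
# 30 = 7*4 + 2
# 4 = 2*2 + 0
```

In each of the examples the last non-zero remainder is the suspected greatest common divisor.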
If what we suspect is true, then repeated applications of the division algorithm might provide a way to compute the greatest common divisor of any two integers. We can provide more evidence that this must be the case by considering the examples in reverse.
::: example
Example 102. Consider a=56 and b=24. We saw that applying the
division algorithm twice gave us that
$$\begin{align*} 56&=2*24+8\\ 24&=3*8 \end{align*}$$
By substituting 24=3*8 into 56=2*24+8 we get
$$\begin{align*} 56&=2*24+8\\ 56&=2\left(3*8\right)+8\\ 56&=6*8+8\\ 56&=7*8 \end{align*}$$
And hence by the definition of divisibility 8\mid 56, likewise by
24=3*8 we have that 8\mid 24.
Now, suppose that d is a common divisor of 56 and 24. As
d\mid 56 and d\mid 24, and 8=56-2*24, we must have that
d\mid 8. Hence every common divisor d satisfies d\leq 8,
and as 8 is itself a common divisor it is the largest one.
:::
::: example
Example 103. In the example where a=4947 and b=1552, we saw
that applying the division algorithm three times gave us
$$\begin{align*} 4947&=3*1552+291\\ 1552&=5*291+97\\ 291&=3*97+0 \end{align*}$$
By substituting 291=3*97 into 1552=5*291+97 we get
$$\begin{align*} 1552&=5*291+97\\ 1552&=5\left(3*97\right)+97\\ 1552&=15*97+97\\ 1552&=16*97 \end{align*}$$
which gives us that 97\mid 1552. Now substituting
1552=16*97 and 291=3*97 into 4947=3*1552+291 yields
$$\begin{align*} 4947&=3*1552+291\\ 4947&=3\left(16*97\right)+3*97\\ 4947&=48*97+3*97\\ 4947&=51*97 \end{align*}$$
showing 97\mid 4947. Now as in the previous example, suppose that d
is a common divisor of 4947 and 1552. As d\mid 4947 and
d\mid 1552 then d\mid\left(4947-3*1552\right) which gives
d\mid 291. Applying similar logic, we see that as d\mid 1552 and
d\mid 291 then d\mid\left(1552-5*291\right) and so d\mid 97. The
largest such d satisfying this is d=97.
:::
It therefore appears that for integers a and b repeated
applications of the division algorithm on b and the remainder r give
a candidate for the greatest common divisor. Substituting this
candidate back through the equations generated by each use of the division
algorithm shows that it is a common divisor of a and b, and indeed the
largest one. Hence, informally, we have found the method for computing the
\mathop{\mathrm{GCD}}! It is left to formalise this discovery.
From working the examples in reverse we have an important proposition
that will be crucial for proving the result. Namely that the greatest
common divisor of a and b is also equal to the greatest common
divisor of b and r where r is the remainder from the division
algorithm.
::: {#prop:NT_GCD_of_ints_is_GCD_of_divisor_and_remainder .proposition} Proposition 106. $\mathop{\mathrm{GCD}}\left(a,b\right)=\mathop{\mathrm{GCD}}\left(b,r\right)$
Let a,b\in\mathbb{Z} so that b\neq 0. By the division algorithm we
have that a=qb+r where q,r\in\mathbb{Z} and
0\leq r<\left|b\right|.
We have that
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(a,b\right)=\mathop{\mathrm{GCD}}\left(b,r\right) \end{equation*}$$
Proof:
Let d=\mathop{\mathrm{GCD}}\left(a,b\right). By definition of the
greatest common divisor, we have that d\mid a and d\mid b. By the
division algorithm we have that a=qb+r where q,r\in\mathbb{Z} and
0\leq r<\left|b\right|.
Hence as d\mid a then d\mid\left(qb+r\right). Now, as r=a-qb then
d\mid r. Hence by definition of the greatest common divisor, we must
have that d\leq\mathop{\mathrm{GCD}}\left(b,r\right) as d is a
common divisor of b and r.
Now suppose that g=\mathop{\mathrm{GCD}}\left(b,r\right) then
g\mid b and g\mid r. However, by proposition
102{reference-type="ref"
reference="prop:NT_divisibility_properties"} part 3. as g\mid b and
g\mid r then \forall x,y\in\mathbb{Z} we have that
g\mid\left(bx+yr\right). In particular, we have that
g\mid\left(qb+r\right). But if g\mid\left(qb+r\right) then as
a= qb+r we have that g\mid a.
Therefore we have that g\leq \mathop{\mathrm{GCD}}\left(a,b\right).
Combining the two directions gives us that
$$\begin{align*} d&=\mathop{\mathrm{GCD}}\left(a,b\right)\leq \mathop{\mathrm{GCD}}\left(b,r\right)\\ g&=\mathop{\mathrm{GCD}}\left(b,r\right)\leq \mathop{\mathrm{GCD}}\left(a,b\right) \end{align*}$$
That is, d\leq g and g\leq d which is true if and only if d=g.
Which is to say
\mathop{\mathrm{GCD}}\left(a,b\right)=\mathop{\mathrm{GCD}}\left(b,r\right).
As required. $\qed$
:::
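This proposition can be checked numerically against a trusted implementation. The sketch below uses `math.gcd` from the Python standard library as the reference and computes the remainder as `a % abs(b)`, which satisfies 0\leq r<\left|b\right|; the function name `gcd_step_holds` and the default limit are assumptions made for this illustration.

```python
from math import gcd


def gcd_step_holds(limit: int = 30) -> bool:
    """Check GCD(a, b) == GCD(b, r) for all a and non-zero b in [-limit, limit]."""
    for a in range(-limit, limit + 1):
        for b in range(-limit, limit + 1):
            if b == 0:
                continue
            r = a % abs(b)   # remainder with 0 <= r < |b|
            if gcd(a, b) != gcd(b, r):
                return False
    return True


print(gcd_step_holds())  # True
```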
We are almost ready to formalise the process of computing the greatest
common divisor. The last step to show is that repeatedly applying the
division algorithm doesn't result in a process that never ends. We have
for integers a and b that the division algorithm gives a=qb+r
where 0\leq r<\left|b\right|. Another application applied to b and
r would give b=q'r+\Tilde{r} where we have
0\leq \Tilde{r}<\left|r\right|<\left|b\right|.
Clearly then, applying multiple stages of the division algorithm will
always cause the remainder at each stage to decrease, and by the
condition that 0\leq r <\left|b\right| this process ultimately will
give a remainder of 0. For if not then there would be some integer x
so that 0<x<1, which is a contradiction. We formally prove this result.
::: {#prop:NT_EuclidAlgor_Terminates .proposition} Proposition 107. Remainders from multiple applications of division algorithm decrease to $0$
Let a,b\in\mathbb{Z} with b\neq 0. Consider the result of the
division algorithm on a,b, i.e.
$$\begin{equation*} a=qb+r,\quad 0\leq r< \left|b\right| \end{equation*}$$
Likewise consider applying the division algorithm to b and r to
get
$$\begin{equation*} b=\Tilde{q}r+\Tilde{r},\quad 0\leq \Tilde{r} < r \end{equation*}$$
If we continually apply this process we have that the remainder is eventually zero.
Proof:
By proposition
106{reference-type="ref"
reference="prop:NT_GCD_of_ints_is_GCD_of_divisor_and_remainder"}, we
know that
\mathop{\mathrm{GCD}}\left(a,b\right)=\mathop{\mathrm{GCD}}\left(b,r\right)
where r is the remainder from the division algorithm and
0\leq r < \left|b\right|.
Applying the division algorithm to b and r gives us again, by
proposition
106{reference-type="ref"
reference="prop:NT_GCD_of_ints_is_GCD_of_divisor_and_remainder"} that
\mathop{\mathrm{GCD}}\left(b,r\right)=\mathop{\mathrm{GCD}}\left(r,r_1\right)
where 0\leq r_1 < \left|r\right|.
Continuing in this fashion for n applications we get the chain of
inequalities
$$\begin{equation*} 0\leq r_n <\left|r_{n-1}\right|<\left|r_{n-2}\right|<\dots <\left|r_2\right|<\left|r_1\right|<\left|r\right| \end{equation*}$$
Now, for any integers x,y\in\mathbb{Z}, where x\geq 0 and
y\geq 0, we have that the largest value of x so that x<y is given
by x=y-1. Hence, in the chain of inequalities for the remainder, the
smallest decrease from one remainder to the next is 1 and hence there
can only be at most r such decreases. If there were more than r
decreases, then at the $n$-th application we would have r_n<0 a
contradiction to the division algorithm.
This bounds the length of the chain of inequalities to be at most r
and therefore we eventually get to 0 as required. $\qed$
:::
We can now formalise the process for computing the greatest common divisor using repeated applications of the division algorithm.
::: {#thm:NT_EuclidAlgor .theorem} Theorem 35. The Euclidean algorithm
Let a,b\in\mathbb{Z} so that b\neq 0, and suppose that
\left|a\right|\geq \left|b\right|. Let x,y\in\mathbb{Z} so that
x=a and y=b. We have that following these steps computes the
greatest common divisor of a and b.

1.  Let $d=\mathop{\mathrm{GCD}}\left(x,y\right)$. If $y=0$ then $d=x$ and there is nothing more to do.
2.  Otherwise, $y\neq 0$ so use the division algorithm to write $x=qy+r$ where $0\leq r <\left|y\right|$.
3.  Replace $x$ by $y$ and $y$ by $r$. By the division algorithm the new value of $y$ is smaller in absolute value than the new value of $x$.
4.  Go back to step 1. and repeat until $y=0$.

Following these steps gives us that
d=\mathop{\mathrm{GCD}}\left(a,b\right) is the value of x after
these steps have been performed. This is to say we have that
$d=\mathop{\mathrm{GCD}}\left(a,b\right)=x$.
Proof:
Let a,b\in\mathbb{Z} be as stated in the theorem. Let x=a and
y=b. By the division algorithm we know that a=qb+r for some
q,r\in\mathbb{Z} where 0\leq r<\left|b\right|. Moreover by
proposition
106{reference-type="ref"
reference="prop:NT_GCD_of_ints_is_GCD_of_divisor_and_remainder"} we have
that
\mathop{\mathrm{GCD}}\left(a,b\right)=\mathop{\mathrm{GCD}}\left(b,r\right).
By this proposition, we are therefore looking for the value of
\mathop{\mathrm{GCD}}\left(b,r\right). By proposition
107{reference-type="ref"
reference="prop:NT_EuclidAlgor_Terminates"} we know that the chain of
remainders that are generated by repeatedly using the division algorithm
must eventually reach 0. Hence at some point, we are computing
\mathop{\mathrm{GCD}}\left(r_n,0\right) after some step n. The value
of \mathop{\mathrm{GCD}}\left(r_n,0\right) is r_n, which is the required
greatest common divisor. $\qed$
:::
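The steps of the theorem translate into a short loop. The following is a minimal sketch rather than a definitive implementation: it first replaces a and b by their absolute values, an assumption made here so that the `%` operator always returns a remainder in the required range, and the function name `euclid_gcd` is chosen for this illustration.

```python
from math import gcd


def euclid_gcd(a: int, b: int) -> int:
    """Return the greatest common divisor of a and b (not both zero)."""
    x, y = abs(a), abs(b)   # assumption of this sketch: work with non-negative values
    while y != 0:           # steps 1 and 4: stop once y == 0
        r = x % y           # step 2: x = q*y + r with 0 <= r < y
        x, y = y, r         # step 3: replace (x, y) by (y, r)
    return x


for a, b in [(56, 24), (4947, 1552), (574, 34), (525, 2925)]:
    assert euclid_gcd(a, b) == gcd(a, b)  # cross-check with the standard library
    print(a, b, euclid_gcd(a, b))
# 56 24 8
# 4947 1552 97
# 574 34 2
# 525 2925 75
```

The last line relates to the fraction considered at the start of this section: dividing 525 and 2925 by 75 gives 7 and 39 respectively.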
Theorem 35{reference-type="ref"
reference="thm:NT_EuclidAlgor"} has shown that we can calculate the
greatest common divisor for any integers a,b\in\mathbb{Z} where
b\neq 0. With this theorem, we can now assume that whenever
d=\mathop{\mathrm{GCD}}\left(a,b\right) is stated we know the value of
d by applying this algorithm. We can now consider properties of the
\mathop{\mathrm{GCD}}. One such example is
\mathop{\mathrm{GCD}}\left(ma,mb\right) for some non-zero m\in\mathbb{Z}.
Clearly if d=\mathop{\mathrm{GCD}}\left(a,b\right) then d\mid ma and
d\mid mb so d\leq\mathop{\mathrm{GCD}}\left(ma,mb\right). As we will
see it turns out that we must have in fact, that
\mathop{\mathrm{GCD}}\left(ma,mb\right)=\left|m\right|d. Another property is a
particular application of proposition
102{reference-type="ref"
reference="prop:NT_divisibility_properties"} part 3.
We know from part 3. that if a\mid b and a\mid c then a\mid\left(bx+cy\right) for all
integers x,y\in\mathbb{Z}. Now suppose
that d=\mathop{\mathrm{GCD}}\left(a,b\right). By definition we
have that d\mid a and d\mid b, so d\mid\left(ax+by\right) for any
x,y\in\mathbb{Z}. By the definition of divisibility, we have that
ax+by=cd for some c\in\mathbb{Z}. The question now is whether it is
possible to have c=1.
As it turns out, the answer is yes.
::: {#thm:NT_bezout_id .theorem} Theorem 36. Bézout's Identity
Let a,b\in\mathbb{Z} so that b\neq 0 and consider
d=\mathop{\mathrm{GCD}}\left(a,b\right). Then, there exist
x,y\in\mathbb{Z} so that
$$\begin{equation*} d=ax+by \end{equation*}$$
Proof:
Let a,b\in\mathbb{Z} be as given and let
d=\mathop{\mathrm{GCD}}\left(a,b\right). By proposition
102{reference-type="ref"
reference="prop:NT_divisibility_properties"} part 3. we have that as
d\mid a and d\mid b then we have that for all x,y\in\mathbb{Z}
that d\mid\left(ax+by\right).
Let S denote the set of all such ax+by, that is
$$\begin{equation*} S=\left\{ax+by:x,y\in\mathbb{Z}\right\} \end{equation*}$$
Now, it is clear that there are s\in S where s<0 and s\in S where
s>0. Moreover, we clearly have 0\in S as we can take x=0 and
y=0.
Now consider the set \Tilde{S} given by
$$\begin{equation*} \Tilde{S}=\left\{s\in S: s>0\right\} \end{equation*}$$
We have by definition of \Tilde{S} that \forall s \in \Tilde{S}
that s>0, and \Tilde{S} is non-empty as noted above, so \Tilde{S} is a non-empty subset of \mathbb{N}. Hence by the
well-ordering principle, theorem 18{reference-type="ref"
reference="thm:WOP"}, there is a smallest element, say \Bar{s}. By
definition of being an element of \Tilde{S} we have that
\Bar{s}=ax_0+by_0 for some x_0,y_0\in\mathbb{Z}, where x_0,y_0
each have a fixed value.
We show that \Bar{s}\mid a and \Bar{s}\mid b. Suppose instead that
\Bar{s}\nmid a, then by the division algorithm we have that
a=q\Bar{s}+r where 0<r<\left|\Bar{s}\right|. It hence follows that
$$\begin{align*} a&=q\Bar{s}+r\\ r&=a-q\Bar{s}\\ r&=a-q\left(ax_0+by_0\right)\\ r&=a-qax_0-qby_0\\ r&=a\left(1-qx_0\right)+b\left(-qy_0\right) \end{align*}$$
This gives us that r\in\Tilde{S}, as r>0. We know by the division
algorithm that 0<r<\left|\Bar{s}\right|, hence r<\Bar{s}, which
contradicts the minimality of \Bar{s} given by the well-ordering principle. Meaning that
\Bar{s}\nmid a is false so it must be the case that \Bar{s}\mid a. A
similar argument shows that \Bar{s}\mid b.
Now, we have that d=\mathop{\mathrm{GCD}}\left(a,b\right) and so
a=md and b=nd for some n,m\in\mathbb{Z}. Moreover, we have that
\Bar{s}=ax_0+by_0. So we have that
$$\begin{align*} \Bar{s}&=ax_0+by_0\\ \Bar{s}&=\left(md\right)x_0+\left(nd\right)y_0\\ \Bar{s}&=d\left(mx_0+ny_0\right) \end{align*}$$
Hence by the definition of divisibility, we conclude that
d\mid\Bar{s}. Applying part 5. of proposition
102{reference-type="ref"
reference="prop:NT_divisibility_properties"} we have that
d\leq \Bar{s}. But \Bar{s} is a common divisor of a and b, and d is the greatest common divisor of a and
b, so we can't have d< \Bar{s}. It follows that d=\Bar{s} as required.
$\qed$
:::
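The proof above shows that suitable x and y exist but does not compute them. One standard way to find such a pair, which is not the method of the proof, is to carry the coefficients along while running the Euclidean algorithm. The sketch below assumes this extended Euclidean approach; the name `extended_gcd` and the choice to normalise d to be non-negative are our own.

```python
def extended_gcd(a: int, b: int) -> tuple:
    """Return (d, x, y) with d = GCD(a, b) and d == a*x + b*y."""
    # Invariants kept by the loop:
    #   old_r == a*old_x + b*old_y   and   r == a*x + b*y
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:
        # Flip signs so that the returned divisor is non-negative.
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


d, x, y = extended_gcd(30, 105)
print(d, x, y)  # 15 -3 1
```

For a=30 and b=105 this gives 15=30(-3)+105(1). The pair returned is only one of many possible pairs.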
We now note the more standard properties of the greatest common divisor.
::: {#prop:NT_GCD_properties .proposition} Proposition 108. Properties of the greatest common divisor
Let a,b\in\mathbb{Z} with b\neq 0. The following properties
of the \mathop{\mathrm{GCD}} hold.
1.  If a>0 then $\mathop{\mathrm{GCD}}\left(a,a\right)=a$.
2.  $\mathop{\mathrm{GCD}}\left(a,b\right)=\mathop{\mathrm{GCD}}\left(b,a\right)$.
3.  Let D be the set of all common divisors of a and b. Then \forall d\in D we have that $d\mid\mathop{\mathrm{GCD}}\left(a,b\right)$.
4.  We have that \mathop{\mathrm{GCD}}\left(a,b\right) is the smallest positive value of ax+by where x,y\in\mathbb{Z}, so in particular $\mathop{\mathrm{GCD}}\left(a,b\right)=ax+by$ for some x,y\in\mathbb{Z}.
5.  Let m\in\mathbb{Z} with m>0, then $\mathop{\mathrm{GCD}}\left(am,bm\right)=m\mathop{\mathrm{GCD}}\left(a,b\right)$.
6.  If d\mid a and d\mid b where d\in\mathbb{Z} and d>0 then $\displaystyle\mathop{\mathrm{GCD}}\left(\frac{a}{d},\frac{b}{d}\right)=\frac{1}{d}\mathop{\mathrm{GCD}}\left(a,b\right)$.
7.  If \mathop{\mathrm{GCD}}\left(a,b\right)=d then $\displaystyle\mathop{\mathrm{GCD}}\left(\frac{a}{d},\frac{b}{d}\right)=1$.
Proof:
1.  \mathop{\mathrm{GCD}}\left(a,a\right)=a: Clearly, we have that a\mid a. Now by proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"} part 5. we have that if c\mid a with a>0 then c\leq a. Hence a is the largest such divisor so $\mathop{\mathrm{GCD}}\left(a,a\right)=a$.
2.  \mathop{\mathrm{GCD}}\left(a,b\right)=\mathop{\mathrm{GCD}}\left(b,a\right): This is trivial. If d=\mathop{\mathrm{GCD}}\left(a,b\right) then d is the largest common divisor of a and b, which is the same as the largest common divisor of b and a.
3.  Let D be the set of all common divisors of a and b. Then \forall d\in D we have that d\mid\mathop{\mathrm{GCD}}\left(a,b\right): Let D be defined as above, then
    $$\begin{equation*} D=\left\{x\in\mathbb{Z}: x>0\text{ and } x\mid a \text{ and } x\mid b\right\} \end{equation*}$$
    Then by definition of D we have that \forall d\in D that d is a divisor of a and d is a divisor of b. By Bézout's identity, theorem 36{reference-type="ref" reference="thm:NT_bezout_id"}, there exist x,y\in\mathbb{Z} so that \mathop{\mathrm{GCD}}\left(a,b\right)=ax+by. As d\mid a and d\mid b, part 3. of proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"} gives d\mid\left(ax+by\right), that is d\mid\mathop{\mathrm{GCD}}\left(a,b\right). We also note that \mathop{\mathrm{GCD}}\left(a,b\right)\in D as it is itself a common divisor of a and b.
4.  We have that \mathop{\mathrm{GCD}}\left(a,b\right) is the smallest positive value of ax+by where x,y\in\mathbb{Z}: This follows from the proof of theorem 36{reference-type="ref" reference="thm:NT_bezout_id"}, where \mathop{\mathrm{GCD}}\left(a,b\right) was shown to equal \Bar{s}, the smallest positive element of the set of all ax+by.
5.  Let m\in\mathbb{Z} with m>0, then \mathop{\mathrm{GCD}}\left(am,bm\right)=m\mathop{\mathrm{GCD}}\left(a,b\right): By the previous part we have that \mathop{\mathrm{GCD}}\left(a,b\right) is the smallest positive element of the set
    $$\begin{equation*} S=\left\{ax+by:x,y\in\mathbb{Z}\right\} \end{equation*}$$
    Let s\in S denote this smallest positive element, that is s=ax+by for some x,y\in\mathbb{Z} and s=\mathop{\mathrm{GCD}}\left(a,b\right).
    As s=\mathop{\mathrm{GCD}}\left(a,b\right) then s\mid a and s\mid b. As s\mid a then a=ks for some k\in\mathbb{Z} and so am=k\left(ms\right) which is to say ms\mid am. Likewise as s\mid b then b=ls for some l\in\mathbb{Z} and hence bm=l\left(ms\right) giving ms\mid bm.
    Now as s=ax+by then we have that ms=m\left(ax+by\right)=a\left(mx\right)+b\left(my\right). Moreover, every element of the set
    $$\begin{equation*} \Tilde{S}=\left\{amx+bmy:x,y\in\mathbb{Z}\right\} \end{equation*}$$
    is m times an element of S. So, as m>0 and s is the smallest positive element of S, we have that ms is the smallest positive element of \Tilde{S}. Hence by the previous part applied to am and bm we have that \mathop{\mathrm{GCD}}\left(am,bm\right)=ms=m\mathop{\mathrm{GCD}}\left(a,b\right).
6.  If d\mid a and d\mid b where d\in\mathbb{Z} and d>0 then $\displaystyle\mathop{\mathrm{GCD}}\left(\frac{a}{d},\frac{b}{d}\right)=\frac{1}{d}\mathop{\mathrm{GCD}}\left(a,b\right)$: Let a,b,d\in\mathbb{Z} so that d\mid a and d\mid b. As d\mid a then we have that \displaystyle\frac{a}{d}\in\mathbb{Z}, likewise as d\mid b then \displaystyle\frac{b}{d}\in\mathbb{Z}. The result now follows by applying the previous part to \displaystyle\frac{a}{d} and \displaystyle\frac{b}{d} with m=d.
7.  If \mathop{\mathrm{GCD}}\left(a,b\right)=d then \displaystyle\mathop{\mathrm{GCD}}\left(\frac{a}{d},\frac{b}{d}\right)=1: This follows by the previous part, as \displaystyle\frac{1}{d}\mathop{\mathrm{GCD}}\left(a,b\right)=\frac{d}{d}=1.
Concluding the proof. $\qed$
:::
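Some of these properties are easy to test numerically. The sketch below checks parts 2, 5 and 7 for given values, using Python's built-in `math.gcd`; the helper name `check_gcd_properties` is hypothetical and the check is only a sanity test for particular inputs, not a proof.

```python
from math import gcd


def check_gcd_properties(a: int, b: int, m: int) -> bool:
    """Check parts 2, 5 and 7 for integers a, b (b != 0) and m > 0."""
    d = gcd(a, b)
    symmetric = gcd(a, b) == gcd(b, a)         # part 2
    scaling = gcd(a * m, b * m) == m * d       # part 5
    reduced = gcd(a // d, b // d) == 1         # part 7, d divides a and b
    return symmetric and scaling and reduced


print(check_gcd_properties(24, 54, 5))  # True
```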
We have talked a lot about the greatest common divisor but nothing about the least common multiple. As with common divisors, we start by making a definition of a common multiple.
::: definition Definition 149. Common multiple
Let a,b,m\in\mathbb{Z} so that a\mid m and b\mid m. We say that
m is a common multiple of a and b.
:::
::: example
Example 104. Let a=2, b=4 and m=8. We have that 2\mid 8 and
4\mid 8 and so 8 is a common multiple of 2 and 4. In fact, 4
is also a common multiple of 2 and 4.
:::
::: example
Example 105. Let a=4 and b=14. Listing multiples of 4 we have
4, 8, 12, 16, 20, 24, 28, 32 and so on. Doing a similar
procedure for 14 we see we have 14, 28, 42 and so on. We see
that 28 is a common multiple of 4 and 14.
:::
::: example
Example 106. Consider a=24 and b=54. Listing the first ten
multiples of a and b we have
$$\begin{align*} &24,\ 48,\ 72,\ 96,\ 120,\ 144,\ 168,\ 192,\ 216,\ 240,\ \dots\\ &54,\ 108,\ 162,\ 216,\ 270,\ 324,\ 378,\ 432,\ 486,\ 540,\ \dots \end{align*}$$
The first common multiple is 216. Interestingly, we saw that
\mathop{\mathrm{GCD}}\left(a,b\right) was 6. We have that
$216*6=1296$ and $24*54=1296$.
:::
::: example
Example 107. We observe for any integer a that a\mid 0 as
0=am with m=0, where m\in\mathbb{Z}. By proposition
69{reference-type="ref"
reference="prop:IntegersHaveNoZeroDivisors"}, if a\neq 0 then this is the only
such m. Hence 0 can be argued to be a common multiple of any
integers a and b. This result is not particularly useful.
:::
These examples indicate that a common multiple always exists. In fact, there is always a smallest positive common multiple.
::: {#thm:NT_lcm_exists .theorem} Theorem 37. The least common multiple of two integers exists
Let a,b\in\mathbb{Z} where a>0 and b>0. We have that
\exists m\in\mathbb{Z} with m>0 so that m is the smallest common
multiple of a and b. That is m is the smallest such integer so
that a\mid m and b\mid m.
Proof:
We first prove that a non-trivial common multiple of a and b
exists. That is some m\neq 0, as 0 can be viewed as a common multiple
of any two integers a,b. Clearly ab is a common multiple of a and
b as a\mid ab and b\mid ab, and ab>0. Hence a non-trivial common multiple
exists.
It is left to show that there is a minimal common multiple. Let S be
the set of all positive common multiples of a and b. We have that S is
non-empty, as ab\in S, and S\subset\mathbb{N}. Hence by the
well-ordering principle, S has a smallest element. The result follows. $\qed$
:::
We can now make a formal definition. However, first, we can note that
the restriction of a>0 and b>0 is not needed.
::: corollary
Corollary 6. Let a,b\in\mathbb{Z}, where a\neq 0 and b\neq 0.
We have that \exists m\in\mathbb{Z} with m>0 so that m is the
smallest common multiple of a and b. That is, m is the smallest
such integer so that a\mid m and b\mid m.
Proof:
The proof is similar to theorem
37{reference-type="ref"
reference="thm:NT_lcm_exists"}. We have that ab is a common multiple
of a and b as is -ab. As a\neq 0 and b\neq 0 we have that one of ab>0 or -ab>0.
Let S be the set of all positive common multiples of a and b. Then S is non-empty and
the well-ordering principle gives us that S has a smallest
element. $\qed$
:::
::: definition Definition 150. Least common multiple
Let a,b\in\mathbb{Z} so that a\neq 0 and b\neq 0. We say that the
smallest positive value m so that a\mid m and b\mid m is the least
common multiple of a and b, denoted
m=\mathop{\mathrm{LCM}}\left(a,b\right), sometimes written
\mathop{\mathrm{lcm}}\left(a,b\right).
:::
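The definition can be turned directly into a search: test m=1, 2, 3 and so on until both a and b divide m. The sketch below does exactly this; the name `lcm_by_search` is our own, and the method is deliberately naive, it mirrors the definition rather than being an efficient way to compute the least common multiple.

```python
def lcm_by_search(a: int, b: int) -> int:
    """Smallest positive m with a | m and b | m, for non-zero a and b."""
    if a == 0 or b == 0:
        raise ValueError("a and b must be non-zero")
    m = 1
    # |ab| is a positive common multiple, so this loop must terminate.
    while m % a != 0 or m % b != 0:
        m += 1
    return m


print(lcm_by_search(24, 54))  # 216
print(lcm_by_search(4, 14))   # 28
```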
It is important to note why we say that the least common multiple is
positive. If we allowed a negative least common multiple, say -m with m>0, then
for all n\in\mathbb{Z} with n>1 we have that -nm is a smaller
common multiple than -m and so we could always find a smaller such
multiple.
As with the greatest common divisor, we need a way to compute the least
common multiple. We should look again at the example where a=24 and
b=54. We saw that the first, smallest, common multiple was 216, and
that the greatest common divisor was 6. We also noted that the product
ab=1296 which is also the product 216*6. We should look to more
examples to see if this holds in other cases.
::: example
Example 108. Let a=14 and b=21. Using the method of writing
multiples out we have
$$\begin{align*} &14,\ 28,\ 42,\ 56,\ \dots\\ &21,\ 42,\ 63,\ 84,\ \dots \end{align*}$$
So the smallest positive common multiple is 42. Now,
\mathop{\mathrm{GCD}}\left(14,21\right)=7. Finally, 14*21=294 and
7*42=294.
Hence we have that
\displaystyle \mathop{\mathrm{LCM}}\left(14,21\right)=\frac{14*21}{\mathop{\mathrm{GCD}}\left(14,21\right)}.
In general we might expect that $\displaystyle \mathop{\mathrm{LCM}}\left(a,b\right)=\frac{ab}{\mathop{\mathrm{GCD}}\left(a,b\right)}$.
:::
::: example
Example 109. Let a=6 and b=36. Using our expected result, we
have that
\displaystyle \mathop{\mathrm{LCM}}\left(a,b\right)=\frac{a*b}{\mathop{\mathrm{GCD}}\left(a,b\right)}.
So computing \mathop{\mathrm{GCD}}\left(a,b\right) we see that
\mathop{\mathrm{GCD}}\left(a,b\right)=6 and so we suspect that
\displaystyle\mathop{\mathrm{LCM}}\left(6,36\right)=\frac{6*36}{6}=36.
Writing out the multiples of both 6 and $36$ we have
$$\begin{align*} &6,\ 12,\ 18,\ 24,\ 30,\ 36,\ 42,\ \dots\\ &36,\ 72,\ 108,\ \dots \end{align*}$$
So the smallest common multiple is indeed 36.
:::
We have enough evidence to postulate and prove the following theorem.
::: {#thm:NT_LCM_by_GCD_is_product .theorem} Theorem 38. Least common multiple by greatest common divisor equals product
Let a,b\in\mathbb{Z} so that a> 0 and b> 0. We have that
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(a,b\right)\mathop{\mathrm{LCM}}\left(a,b\right)=ab \end{equation*}$$
Proof:
Let d=\mathop{\mathrm{GCD}}\left(a,b\right), then by definition we
have that d\mid a so by proposition
102{reference-type="ref"
reference="prop:NT_divisibility_properties"} part 1. implies that
d\mid ac for any c\in\mathbb{Z} and in particular d\mid ab. Hence
by the definition of divisibility, there exists n\in\mathbb{Z} so that
ab=dn.
Now as d\mid a then there is an integer u so that a=du, likewise
as d\mid b then there is an integer v so that b=dv. Hence we have
that
$$\begin{align*} dn&=dub \Rightarrow n=ub,\quad \text{by the cancellation law for the integers}\\ dn&=adv \Rightarrow n=av,\quad \text{by the cancellation law for the integers} \end{align*}$$
Hence as n=ub we have that b\mid n and likewise as n=av we have
that a\mid n. Hence it follows that n is a common multiple of a
and b. We need to show that n is the smallest such multiple so then
\mathop{\mathrm{LCM}}\left(a,b\right)=n.
So, let S denote the set of positive common multiples of a and b
and let s\in S be a common multiple of a and b. By definition of a
common multiple, we have that there exists some k_1,k_2\in\mathbb{Z}
so that s=ak_1 and s=bk_2.
Now, by Bézout's identity we have that
\exists x,y\in\mathbb{Z} so that
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(a,b\right)=d=ax+by \end{equation*}$$
Now, consider sd, we have that
$$\begin{align*} sd&=s\left(ax+by\right)\\ &=sax+sby\\ &=\left(bk_2\right)ax+\left(ak_1\right)by\\ &=abk_2x+abk_1y\\ &=ab\left(k_2x+k_1y\right)\\ &=dn\left(k_2x+k_1y\right)\\ s&=n\left(k_2x+k_1y\right),\quad \text{by the cancellation law for the integers} \end{align*}$$
Now \left(k_2x+k_1y\right)\in\mathbb{Z} and so we have that
n\mid s. Now by proposition
102{reference-type="ref"
reference="prop:NT_divisibility_properties"} part 5. we have that
n\leq s. We have that n>0, as ab=dn with a,b,d>0, and so n\in S. As s\in S was arbitrary we have that n is the
smallest element of S, i.e. n is the
smallest positive common multiple and so by definition
\mathop{\mathrm{LCM}}\left(a,b\right)=n.
Hence we have that
ab=dn=\mathop{\mathrm{GCD}}\left(a,b\right)\mathop{\mathrm{LCM}}\left(a,b\right),
as required. $\qed$
:::
We can now justify the following corollary to compute the least common multiple.
::: {#cor:NT_lcm_formula .corollary} Corollary 7. Least common multiple is product divided by greatest common divisor
Let a,b\in\mathbb{Z} so that a>0 and b>0. We have that
$$\begin{equation*} \mathop{\mathrm{LCM}}\left(a,b\right)=\frac{ab}{\mathop{\mathrm{GCD}}\left(a,b\right)} \end{equation*}$$
Proof:
By theorem 38{reference-type="ref" reference="thm:NT_LCM_by_GCD_is_product"} we have that
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(a,b\right)\mathop{\mathrm{LCM}}\left(a,b\right)=ab \end{equation*}$$
Let d=\mathop{\mathrm{GCD}}\left(a,b\right) then by definition we
have that d\mid a and d\mid b so that d\mid ab. Hence
\displaystyle\frac{ab}{d}\in\mathbb{Z}. As d>0, dividing both sides of the equation by d gives
\displaystyle\mathop{\mathrm{LCM}}\left(a,b\right)=\frac{ab}{d}. $\qed$
:::
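This corollary gives a practical way to compute the least common multiple once the greatest common divisor is known. A minimal sketch, assuming positive inputs as in the corollary and using Python's built-in `math.gcd`; the function name `lcm` is our own.

```python
from math import gcd


def lcm(a: int, b: int) -> int:
    """Least common multiple of positive integers via ab / GCD(a, b)."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive")
    # GCD(a, b) divides a, so the integer division below is exact.
    return a // gcd(a, b) * b


print(lcm(24, 54))  # 216
print(lcm(14, 21))  # 42
print(lcm(6, 36))   # 36
```

Dividing before multiplying keeps the intermediate value no larger than the final answer.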
We can now show some similar results to proposition 108{reference-type="ref" reference="prop:NT_GCD_properties"}.
::: {#prop:NT_LCM_properties .proposition} Proposition 109. Properties of the least common multiple
Let a,b\in\mathbb{Z} with a>0 and b>0. The following
properties of the \mathop{\mathrm{LCM}} hold.
1.  $\mathop{\mathrm{LCM}}\left(a,a\right)=a$.
2.  $\mathop{\mathrm{LCM}}\left(a,b\right)=\mathop{\mathrm{LCM}}\left(b,a\right)$.
3.  Let M be the set of all positive common multiples of a and b. Then \forall m\in M we have that $\mathop{\mathrm{LCM}}\left(a,b\right)\mid m$.
4.  We have that \mathop{\mathrm{LCM}}\left(a,b\right) is the greatest value of \displaystyle \frac{ab}{ax+by} over all x,y\in\mathbb{Z} with ax+by>0, attained where \mathop{\mathrm{GCD}}\left(a,b\right)=ax+by.
Proof:
1.  \mathop{\mathrm{LCM}}\left(a,a\right)=a: As \mathop{\mathrm{GCD}}\left(a,a\right)=a and a*a=a^2, we have by corollary 7{reference-type="ref" reference="cor:NT_lcm_formula"} that
    $$\begin{equation*} \mathop{\mathrm{LCM}}\left(a,a\right)=\frac{aa}{\mathop{\mathrm{GCD}}\left(a,a\right)}=\frac{a^2}{a}=a \end{equation*}$$
2.  \mathop{\mathrm{LCM}}\left(a,b\right)=\mathop{\mathrm{LCM}}\left(b,a\right): This follows as \mathop{\mathrm{GCD}}\left(a,b\right)=\mathop{\mathrm{GCD}}\left(b,a\right) and integer multiplication is commutative, this is to say
    $$\begin{equation*} \mathop{\mathrm{LCM}}\left(a,b\right)=\frac{ab}{\mathop{\mathrm{GCD}}\left(a,b\right)}=\frac{ba}{\mathop{\mathrm{GCD}}\left(b,a\right)}=\mathop{\mathrm{LCM}}\left(b,a\right) \end{equation*}$$
3.  Let M be the set of all positive common multiples of a and b. Then \forall m\in M we have that \mathop{\mathrm{LCM}}\left(a,b\right)\mid m: Let M be the set of all positive common multiples and let m\in M. In the proof of theorem 38{reference-type="ref" reference="thm:NT_LCM_by_GCD_is_product"} we showed that n=\mathop{\mathrm{LCM}}\left(a,b\right) satisfies n\mid s for every positive common multiple s of a and b. Taking s=m gives \mathop{\mathrm{LCM}}\left(a,b\right)\mid m for every m\in M.
4.  We have that \mathop{\mathrm{LCM}}\left(a,b\right) is the greatest \displaystyle \frac{ab}{ax+by} where \mathop{\mathrm{GCD}}\left(a,b\right)=ax+by: By proposition 108{reference-type="ref" reference="prop:NT_GCD_properties"} part 4. we have that \mathop{\mathrm{GCD}}\left(a,b\right)=ax+by for some x,y\in\mathbb{Z} is the smallest positive such ax+by. Hence
    $$\begin{equation*} \mathop{\mathrm{LCM}}\left(a,b\right)=\frac{ab}{\mathop{\mathrm{GCD}}\left(a,b\right)} \end{equation*}$$
    will be the greatest such fraction. For if not then there is either x_0,y_0\in\mathbb{Z} so that 0<ax_0+by_0<ax+by, a contradiction to part 4. of proposition 108{reference-type="ref" reference="prop:NT_GCD_properties"}, or we have that there is x_1,y_1\in\mathbb{Z} with ax_1+by_1>ax+by, and then by part 35. of proposition 89{reference-type="ref" reference="prop:InequalityRationalNumbers"} we have that
    $$\begin{equation*} \frac{ab}{ax_1+by_1}<\frac{ab}{ax+by} \end{equation*}$$
Concluding the proof. $\qed$
:::
Prime and co-prime numbers
::: epigraph God may not play dice with the universe, but something strange is going on with the prime numbers.
Paul Erdős :::
So far we have been building a theory of divisibility. This theory has allowed us to define what it means to be an odd or an even integer, to know when one integer divides another, and to compute the greatest common divisor of two integers. Where do we go from here? One question we could ask is how many divisors does a given integer have?
The divisor function
We start with the following definition.
::: definition Definition 151. The Divisor function
Let x\in\mathbb{Z}. We define
\sigma:\mathbb{Z}\rightarrow\mathbb{Z} by
$$\begin{align*} \sigma:\mathbb{Z}&\mathlarger{\mathlarger{\rightarrow}}\mathbb{Z}\\ x&\mapsto \sigma\left(x\right)=\sum_{d\mid x} 1 \end{align*}$$
here we are summing over all of the positive divisors d of x, where if
d\mid x then we add one to the sum total.
:::
Rather than work with explicit examples we will provide a table of the first 20 integers.
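| $x$ | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| $\sigma\left(x\right)$ | 1 | 2 | 2 | 3 | 2 | 4 | 2 | 4 | 3 | 4 | 2 | 6 | 2 | 4 | 4 | 5 | 2 | 6 | 2 | 6 |

: The divisor function for the integers 1\leq x\leq 20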
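The values in this table can be reproduced by counting divisors directly from the definition. The following sketch assumes that only positive divisors of a positive integer are counted; the name `divisor_count` is our own.

```python
def divisor_count(x: int) -> int:
    """Number of positive divisors of a positive integer x."""
    if x <= 0:
        raise ValueError("x must be positive")
    # Add one to the total for every d with d | x.
    return sum(1 for d in range(1, x + 1) if x % d == 0)


table = [divisor_count(x) for x in range(1, 21)]
print(table)
# [1, 2, 2, 3, 2, 4, 2, 4, 3, 4, 2, 6, 2, 4, 4, 5, 2, 6, 2, 6]
```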
There are a few things to note from this table. Firstly the only integer
with a single divisor is 1. Secondly, there are many examples of
integers having only 2 divisors. These are 2, 3, 5, 7, 11,
13, 17 and 19. As 1 is a divisor of every integer we can
conclude the only other divisor in the case of \sigma\left(x\right)=2 must
be x itself.
What about the case when \sigma\left(x\right)>2? Looking at 6 we see
the divisors are 1, 2, 3 and 6 itself, and from the table
\sigma\left(2\right)=\sigma\left(3\right)=2. Moreover, we have that
$6=2*3$.
Similarly with 12 we have that the divisors are 1, 2, 3, 4,
6 and 12. Again, we have that
\sigma\left(2\right)=\sigma\left(3\right)=2. Now, as 12=2*6 and
6=2*3 then we have that 12=2*2*3. In both cases, we have seen that a
number x with \sigma\left(x\right)>2 can be written as a product
of integers with exactly 2 divisors. We can ask does this hold in
general? To answer this we need to make some definitions.
Prime numbers
With the remarks of the previous section, we give a special name to any
integer x where \sigma\left(x\right)=2.
::: definition Definition 152. Prime number
Let x\in\mathbb{Z} with x\geq 2. We say that x is a prime number,
or simply that x is prime, if and only if \sigma\left(x\right)=2. In
other words, we say that x is prime, if and only if the only two
distinct positive divisors of x are 1 and itself. If x is not
prime we say that x is composite.
:::
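This definition can be checked mechanically by counting positive divisors. The sketch below follows the definition literally, so it is slow for large x and is meant only as an illustration; the name `is_prime` is our own.

```python
def is_prime(x: int) -> bool:
    """True if x >= 2 has exactly two positive divisors, 1 and x."""
    if x < 2:
        return False
    divisors = sum(1 for d in range(1, x + 1) if x % d == 0)
    return divisors == 2


primes_up_to_20 = [x for x in range(1, 21) if is_prime(x)]
print(primes_up_to_20)  # [2, 3, 5, 7, 11, 13, 17, 19]
```

The output agrees with the integers picked out from the table of the divisor function.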
We noted that there were many x\in\mathbb{Z} with
\sigma\left(x\right)=2. A natural question that arises is are there
infinitely many such x, or are there only finitely many? To answer
this we need to see how primes and divisibility interact. We first have
to make another definition based on the greatest common divisor of two
integers. We show some examples to motivate this new definition.
::: example
Example 110. Let a=6 and b=35. By the Euclidean algorithm, we
see that
$$\begin{align*} 35&=5\left(6\right)+5\\ 6&=1\left(5\right)+1\\ 5&=5\left(1\right)+0 \end{align*}$$
Hence \mathop{\mathrm{GCD}}\left(a,b\right)=1.
:::
::: example
Example 111. Let a=2 and b=3. By the Euclidean algorithm, we
see that
$$\begin{align*} 3&=1\left(2\right)+1\\ 2&=2\left(1\right)+0 \end{align*}$$
Hence \mathop{\mathrm{GCD}}\left(a,b\right)=1. We note that a and
b are prime.
:::
::: example
Example 112. Let a=4 and b=9. By the Euclidean algorithm, we
see that
$$\begin{align*} 9&=2\left(4\right)+1\\ 4&=4\left(1\right)+0 \end{align*}$$
Hence \mathop{\mathrm{GCD}}\left(a,b\right)=1.
:::
We see that there are integers a,b\in\mathbb{Z} so that
\mathop{\mathrm{GCD}}\left(a,b\right)=1. Meaning that they have no
common divisors other than 1. This situation turns out to happen
enough in Number Theory to warrant a definition.
::: definition Definition 153. Co-prime Integers
Let a,b\in\mathbb{Z}. We say that a is co-prime to b, or a and
b are co-prime, or a and b are relatively prime, if and only if
\mathop{\mathrm{GCD}}\left(a,b\right)=1.
:::
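Checking whether two integers are co-prime only needs the greatest common divisor. A small sketch using Python's built-in `math.gcd`; the name `are_coprime` is hypothetical.

```python
from math import gcd


def are_coprime(a: int, b: int) -> bool:
    """True exactly when GCD(a, b) = 1."""
    return gcd(a, b) == 1


print(are_coprime(6, 35))  # True
print(are_coprime(4, 9))   # True
print(are_coprime(3, 6))   # False
```

Note that 4 and 9 are co-prime even though neither of them is prime.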
We have some immediate results.
::: {#prop:NT_Bezout_coprime .proposition} Proposition 110. Bézout's Identity for co-prime integers
Let a,b\in\mathbb{Z} so that
\mathop{\mathrm{GCD}}\left(a,b\right)=1. We have that
\exists x,y\in\mathbb{Z} so that
$$\begin{equation*} 1=ax+by \end{equation*}$$
Proof:
This immediately follows from theorem 36{reference-type="ref" reference="thm:NT_bezout_id"}. $\qed$
:::
::: proposition Proposition 111. Distinct prime numbers are co-prime
Let p,q\in\mathbb{Z} so that p and q are prime with p\neq q. We have that
\mathop{\mathrm{GCD}}\left(p,q\right)=1.
Proof:
Let p,q\in\mathbb{Z} so that p and q are prime and p\neq q. As
p is prime then the only positive divisors are p and 1, likewise
for q. Hence the largest divisor of both p and q is 1 so that
\mathop{\mathrm{GCD}}\left(p,q\right)=1 by definition. $\qed$
:::
::: {#cor:NT_PrimeNotDividing_Integer_implies_coprime .corollary} Corollary 8. Prime not dividing integer implies co-prime
Let a,p\in\mathbb{Z} where p is prime. If p\nmid a then
$\mathop{\mathrm{GCD}}\left(a,p\right) = 1$
Proof:
Let a,p\in\mathbb{Z} where p is prime and where p\nmid a. Suppose
that \mathop{\mathrm{GCD}}\left(a,p\right)=d for some
d\in\mathbb{Z}. By definition of the greatest common divisor, we have
that d\mid p and by definition of a prime, we have that either d=1
or d=p. But if d=p then p\mid a by definition of the greatest
common divisor, contradicting the assumption that p\nmid a. Hence
d=1. $\qed$
:::
::: proposition Proposition 112. Product of co-prime integers is equal to their least common multiple
Let a,b\in\mathbb{Z} with a>0 and b>0 so that
\mathop{\mathrm{GCD}}\left(a,b\right)=1. We have that
ab=\mathop{\mathrm{LCM}}\left(a,b\right).
Proof:
Let a,b\in\mathbb{Z} be as given in the proposition. We have by
corollary 7{reference-type="ref"
reference="cor:NT_lcm_formula"} that
$$\begin{equation*} \mathop{\mathrm{LCM}}\left(a,b\right)= \frac{ab}{\mathop{\mathrm{GCD}}\left(a,b\right)} \end{equation*}$$
As a and b are co-prime, we have
\mathop{\mathrm{GCD}}\left(a,b\right)=1, hence the result. $\qed$
:::
::: {#prop:NT_Bezout_coef_coprime .proposition} Proposition 113. Coefficients in Bézout's identity are co-prime
Let a,b\in\mathbb{Z} with d=\mathop{\mathrm{GCD}}\left(a,b\right)
so that by Bézout's identity we have \exists x,y\in\mathbb{Z} so that
$$\begin{equation*} d=ax+by \end{equation*}$$
We have that $\mathop{\mathrm{GCD}}\left(x,y\right)=1$
Proof:
Let a,b\in\mathbb{Z} with d=\mathop{\mathrm{GCD}}\left(a,b\right).
By Bézout's identity we have that there exists x,y\in\mathbb{Z} so
that
$$\begin{equation*} d=ax+by \end{equation*}$$
Now, as d\mid a and d\mid b, dividing by d gives
$$\begin{equation*} 1=\frac{a}{d}x+\frac{b}{d}y \end{equation*}$$
Hence we have that 1=k_1x+k_2y where
\displaystyle k_1=\frac{a}{d}\in\mathbb{Z} and \displaystyle k_2=\frac{b}{d}\in\mathbb{Z}.
Any common divisor of x and y divides k_1x+k_2y=1. Hence \mathop{\mathrm{GCD}}\left(x,y\right)=1 and so by definition x
and y are co-prime. $\qed$
:::
With some basic results out of the way, we can start seeing more meaningful consequences of defining prime and co-prime numbers. One of the first things we should do is see how primes divide other integers.
::: example
Example 113. Let n=10, we have that 2\mid 10 and
\sigma\left(2\right)=2, hence 2 is prime. Moreover 10=2*5 and
clearly 2\mid 2.
:::
::: example
Example 114. Let n=4, clearly 4=2*2 and so 2\mid 4. Moreover,
2\mid 2.
:::
::: example
Example 115. Let n=14=2*7. Both 2 and 7 are prime and so
2\mid 14 and 7\mid 14.
:::
Then, if a prime p divides n=ab we seem to have that either
p\mid a or p\mid b.
::: {#lem:NT_Euclid .lemma} Lemma 9. Euclid's Lemma
Let a,b\in\mathbb{Z} and let p\in\mathbb{Z} be prime. Suppose that
p\mid ab. We have that either p\mid a or p\mid b.
Proof:
Let p\mid ab. Suppose that p\nmid b. As the only divisors of p
are 1 and itself then we have that
\mathop{\mathrm{GCD}}\left(p,b\right)=1 by corollary
8{reference-type="ref"
reference="cor:NT_PrimeNotDividing_Integer_implies_coprime"}. Now by
proposition 110{reference-type="ref"
reference="prop:NT_Bezout_coprime"} we have that
\exists x,y\in\mathbb{Z} so that
$$\begin{equation*} 1=px+by \end{equation*}$$
Multiplying by a gives a=apx+aby and as p\mid apx and p\mid aby (since p\mid ab)
we have that p\mid a. Likewise if p\nmid a then p\mid b. $\qed$
:::
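It is worth seeing that the primality of p really matters here. The sketch below searches small values of a and b for a counterexample to the statement that p dividing ab forces p to divide a or b; the name `euclid_property_holds` and the search bound are our own choices, and a finite search of this kind illustrates the lemma without proving it.

```python
def euclid_property_holds(p: int, bound: int) -> bool:
    """Check 'p | ab implies p | a or p | b' for all 1 <= a, b <= bound."""
    for a in range(1, bound + 1):
        for b in range(1, bound + 1):
            if (a * b) % p == 0 and a % p != 0 and b % p != 0:
                return False  # found a counterexample
    return True


print([euclid_property_holds(p, 30) for p in (2, 3, 5, 7)])
# [True, True, True, True]
print(euclid_property_holds(6, 30))  # False
```

For the composite value 6 the search fails, for instance 6 divides 2*3 but divides neither 2 nor 3.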
This result generalises to products of more than two integers.
::: {#lem:NT_Euclid_general .lemma} Lemma 10. Generalised Euclid's lemma
Let p\in\mathbb{Z} be prime. Let n\in\mathbb{Z} be such that
$$\begin{equation*} n=\prod_{i=1}^m a_i \end{equation*}$$
where m\geq 2 and a_i\in\mathbb{Z} for each i. Suppose that p\mid n, then
there exists an i\in\mathbb{N} with 1\leq i\leq m so that p\mid a_i.
Proof:
We argue by induction on m. The base case is m=2 which follows by
Euclid's lemma. So suppose the result holds for some k\geq 2, that is if
n is such that
$$\begin{equation*} n=\prod_{i=1}^k a_i \end{equation*}$$
and p\mid n then there is some i\in\mathbb{N} with 1\leq i\leq k so that p\mid a_i. We show that
if n is such that
$$\begin{equation*} n=\prod_{i=1}^{k+1} a_i \end{equation*}$$
and p\mid n then there is some i\in\mathbb{N} with 1\leq i\leq k+1 so that p\mid a_i. So suppose
that p\mid n, then
$$\begin{align*} p\mid&\prod_{i=1}^{k+1} a_i \\ p\mid&\left(\prod_{i=1}^{k} a_i\right) a_{k+1} \end{align*}$$
By Euclid's lemma we have that either
\displaystyle p\mid\prod_{i=1}^{k} a_i or p\mid a_{k+1}. In the first case, by the induction hypothesis there is some
i\in\mathbb{N} so that p\mid a_i where 1\leq i \leq k. Hence in either case we
have that p\mid a_i for some 1\leq i\leq k+1. The result now follows
by induction. $\qed$
:::
With Euclid's lemma, we can prove a very famous theorem. Namely, there
is no x\in\mathbb{Q} so that x^2=2. We first need a definition,
based on co-prime integers.
::: definition Definition 154. Reduced fraction
Let x\in\mathbb{Q} where \displaystyle x=\frac{a}{b} and b\neq 0.
We say that x is a reduced fraction, or a fraction in its lowest terms
if \mathop{\mathrm{GCD}}\left(a,b\right)=1.
:::
We give some examples.
::: example
Example 116. Let \displaystyle x=\frac{1}{2}. As
\mathop{\mathrm{GCD}}\left(1,2\right)=1 we have that x is a reduced
fraction.
:::
::: example
Example 117. Let \displaystyle x=\frac{3}{6}. We can compute that
\mathop{\mathrm{GCD}}\left(3,6\right)=3, hence we have that 3\mid 3
and 3\mid 6. We hence can write
$$\begin{equation*} x=\frac{3}{6}=\frac{3*1}{3*2}=\frac{1}{2} \end{equation*}$$
And as \mathop{\mathrm{GCD}}\left(1,2\right)=1 we can conclude x is
now in its lowest terms.
:::
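Reducing a fraction to its lowest terms amounts to dividing the numerator and the denominator by their greatest common divisor. A minimal sketch using Python's built-in `math.gcd`; the name `reduce_fraction` is our own, and the sign of the denominator is left unchanged as an assumption of this sketch.

```python
from math import gcd


def reduce_fraction(a: int, b: int) -> tuple:
    """Return (a/d, b/d) where d = GCD(a, b), for b != 0."""
    if b == 0:
        raise ValueError("the denominator must be non-zero")
    d = gcd(a, b)  # d >= 1 because b != 0
    return a // d, b // d


print(reduce_fraction(3, 6))  # (1, 2)
print(reduce_fraction(1, 2))  # (1, 2)
```

By part 7 of proposition 108 the returned pair is always co-prime, so the result is a reduced fraction.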
We can now show the theorem.
::: {#thm:NT_Root2Irrational .theorem} Theorem 39. No rational exists whose square is $2$
We have that \not\exists x\in\mathbb{Q} with x^2=2.
Proof:
Suppose instead that there is some x\in\mathbb{Q} with x^2=2, where
\displaystyle x=\frac{a}{b} with a,b\in\mathbb{Z} and b\neq 0. Moreover assume that x
is a reduced fraction, i.e. \mathop{\mathrm{GCD}}\left(a,b\right)=1. We
can make this assumption as otherwise we can divide a and b by
\mathop{\mathrm{GCD}}\left(a,b\right) without changing the value of x.
We have that
$$\begin{align*} x^2&=2\\ \frac{a^2}{b^2}&=2\\ a^2&=2b^2 \end{align*}$$
Hence by the definition of divisibility, we have 2\mid a^2 and so by
Euclid's lemma we have that 2\mid a as 2 is prime. So write a=2k
for some k\in\mathbb{Z}. Then we have that
$$\begin{align*} a^2&=2b^2\\ \left(2k\right)^2&=2b^2\\ 4k^2&=2b^2\\ 2k^2&=b^2 \end{align*}$$
Hence 2\mid b^2 and again by Euclid's lemma we have that 2\mid b.
We have a contradiction as 2\mid a and 2\mid b implies that
\mathop{\mathrm{GCD}}\left(a,b\right)\geq 2, so x can't have been
a reduced fraction. Hence our assumption was false and we can conclude that there is no rational
x so that x^2=2. $\qed$
:::
This raises the question if there is no rational x whose square is 2
then what exactly is x? Unfortunately, we are not quite ready to
properly answer this question in a satisfying way, all we can say is that we
have seen a hint of a new type of number. One that we can define but not
study in more detail at the moment.
::: definition Definition 155. Irrational number
If we have x\not\in\mathbb{Q}, then we say that x is irrational. In
other words, x is irrational if and only if x cannot be written as
\displaystyle x=\frac{a}{b} where a,b\in\mathbb{Z} and b\neq 0.
:::
Clearly, if S denotes the set of irrational numbers then by theorem
39{reference-type="ref" reference="thm:NT_Root2Irrational"} we have that S\neq\emptyset. Perhaps then it
makes sense, for now, to consider which x\in\mathbb{Q} are such
that x^2=y where y\in\mathbb{Z}, or more restrictively, which
x\in\mathbb{Z} are such that we have x^2=y where y\in\mathbb{Z}.
Before we start answering this question, we note one useful result by generalising Euclid's lemma from the prime case to the co-prime case.
::: {#lem:NT_Euclid_co_primes .lemma} Lemma 11. Euclid's lemma for co-primes
Let a,b,c\in\mathbb{Z} and suppose that c\mid ab and
\mathop{\mathrm{GCD}}\left(b,c\right)=1. We have that c\mid a.
Proof:
Let a,b,c\in\mathbb{Z} be such that c\mid ab and
\mathop{\mathrm{GCD}}\left(b,c\right)=1. As
\mathop{\mathrm{GCD}}\left(b,c\right)=1, we have by proposition
110{reference-type="ref"
reference="prop:NT_Bezout_coprime"} that there exist integers
x,y\in\mathbb{Z} so that
$$\begin{equation*} bx+cy=1 \end{equation*}$$
On multiplication by a we have that abx+acy=a. Clearly c\mid abx
and c\mid acy and so c\mid a as required. $\qed$
:::
There is a useful application of this lemma.
::: {#exam:NT_solutions_to_ax_plus_by .example} Example 118.
Let a,b\in\mathbb{Z} and let
d=\mathop{\mathrm{GCD}}\left(a,b\right). We know by Bézout's identity
that \exists x,y\in\mathbb{Z} so that
$$\begin{equation*} ax+by=d \end{equation*}$$
The theorem for Bézout's identity, theorem
36{reference-type="ref"
reference="thm:NT_bezout_id"}, doesn't state anything about there not
being another pair x',y' so that
$$\begin{equation*} ax'+by'=d \end{equation*}$$
For example, consider a=30 and b=105, then
\mathop{\mathrm{GCD}}\left(a,b\right)=15 and we have that
15=-3*30+1*105, i.e x=-3 and y=1 in this case. We could have also
have x=-10 and y=3 as -10*30+3*105=-300+315=15.
So supposing that a,b\in\mathbb{Z} and
d=\mathop{\mathrm{GCD}}\left(a,b\right) we know that
\exists x,x',y, y'\in\mathbb{Z} with
$$\begin{align} ax+by&=d\ ax'+by'&=d \end{align*}$$*
Can we find a relation between the pair x and y and the pair x'
and y'? As d\mid a then there exists a'\in\mathbb{Z} so that
a=a'd and likewise as d\mid b then there exists b'\in\mathbb{Z} so
that b=b'd. Hence we see that
$$\begin{align} ax+by&=d\ a'dx+b'dy&=d\ a'x+b'y&=1 \end{align*}$$*
Now, we have that x and y are co-prime so we can deduce that a'
and b' are also co-prime. Now, we have that
$$\begin{equation} ax+by=d=ax'+by' \end{equation*}$$*
So, re-arranging we see that
$$\begin{align} ax-ax'&=by'-by\ a\left(x-x'\right)&=b\left(y'-y\right) \end{align*}$$*
Dividing by d gives
$$\begin{equation} a'\left(x-x'\right)=b'\left(y'-y\right) \end{equation*}$$*
Now, as a' and b' are co-prime, we have by Euclid's lemma for
co-primes that a'\mid\left(y'-y\right), We, therefore have that
\exists k\in\mathbb{Z} so that
$$\begin{equation} y'-y=a'k \Rightarrow y'=y+a'k \end{equation*}$$*
But as y'-y=a'k we have that
$$\begin{align} a'\left(x-x'\right)&=b'\left(a'k\right)\ x-x'&=b'k\ x'&=x-b'k\ \end{align*}$$*
Therefore, we can conclude that
$$\begin{align} x'&=x-\frac{b}{d}k\ y'&=y+\frac{a}{d}k \end{align*}$$*
where k\in\mathbb{Z}. To check this is the case we return to the
example of a=30 and b=105 where we had that
\mathop{\mathrm{GCD}}\left(a,b\right)=15. We saw that x=-3 and
y=1. Using these values in the equations above we get
$$\begin{align} x'&=-3-\frac{105}{15}k \Rightarrow x'=-3-7k\ y'&=1+\frac{30}{15}k \Rightarrow y'=1+2k \end{align*}$$*
Using k=1 gives us the alternative solution we saw of x'=-10 and
y'=3.
:::
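The relation between two solutions found in the example above can be checked numerically. The following Python sketch is illustrative only: the function name `bezout_family` is our own choice, and it assumes that one solution x, y of ax+by=d is already known.

```python
from math import gcd


def bezout_family(a, b, x, y, ks):
    """Given one solution of a*x + b*y == gcd(a, b), return the pairs
    (x - (b/d)*k, y + (a/d)*k) for each integer k in ks."""
    d = gcd(a, b)
    if a * x + b * y != d:
        raise ValueError("(x, y) is not a solution of a*x + b*y = gcd(a, b)")
    return [(x - (b // d) * k, y + (a // d) * k) for k in ks]


# The values from the example: a = 30, b = 105, with x = -3 and y = 1.
for xk, yk in bezout_family(30, 105, -3, 1, range(-2, 3)):
    # Each pair in the family should again give 30*x + 105*y == 15.
    print(xk, yk, 30 * xk + 105 * yk)
```

Taking k=1 in this family recovers the pair x'=-10 and y'=3 seen in the example.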
From Euclid's lemma for co-primes we have deduced that every pair of
integers x,y with ax+by=d, where d=\mathop{\mathrm{GCD}}\left(a,b\right),
can be obtained from one known solution in this way.
We now return to the problem at hand. We wish to consider the elements
of x\in\mathbb{Z} so that x^2=y where y\in\mathbb{Z}. As is the
theme of this section we will do some exploratory examples.
::: example
Example 119. Let x\in\mathbb{Q} be such that
\displaystyle x=\frac{2}{1}, then
\displaystyle x^2=\frac{4}{1}=4\in\mathbb{Z}. In particular, we have
that 4=2*2=2^2.
:::
::: example
Example 120. Consider \displaystyle x=\frac{10}{1}=10. Clearly
x^2=100\in\mathbb{Z}. We have that
$$\begin{equation*} 100=2*50=2*2*25=2*2*5*5=2^2*5^2 \end{equation*}$$
:::
::: example
Example 121. We generalise the example of there being no
x\in\mathbb{Q} so that x^2=2. We will show that for a prime
p\in\mathbb{Z}, there is no x\in\mathbb{Q} so that x^2=p. So
suppose there is such an x, that is \displaystyle x=\frac{a}{b} so
that a,b\in\mathbb{Z} and b\neq 0 and moreover suppose that x is a
reduced fraction, which is to say
\mathop{\mathrm{GCD}}\left(a,b\right)=1. We then have that
$$\begin{equation*} x^2=\frac{a^2}{b^2}=p \Rightarrow a^2=pb^2 \end{equation*}$$
Hence p\mid a^2. Hence by Euclid's lemma, we have that p\mid a.
Hence let a=pk for some k\in\mathbb{Z}. We then have that
$$\begin{align*} a^2&=pb^2\\ \left(pk\right)^2&=pb^2\\ p^2k^2&=pb^2\\ pk^2&=b^2 \end{align*}$$
Therefore p\mid b^2 and so by Euclid's lemma we have that p\mid b,
a contradiction to the assumption that x was a fraction in reduced
form.
:::
This last example shows that for any prime p there is no rational
number x with x^2=p. We also saw an example where x^2=p^2,
namely when p=2, and an example of a product of primes satisfying
x^2=p^2*q^2, namely for the primes p=2 and q=5. It seems therefore that the
question of which x\in\mathbb{Z} are such that x^2=y for some integer y
is deeply connected to primes. In particular, in each example where such
an x exists, the powers of the primes in y are even. We need more examples before we can
make a claim.
::: example
Example 122. Consider x=4, we have that x^2=16 and 16 is not
prime as \sigma\left(16\right)=5, with divisors 1, 2, 4, 8 and 16.
However, we have that 16=2^4 and we know that 2 is prime.
:::
::: example
Example 123. Let y=3^2*5^4=5625, a product of primes. We can see
that we can take x=3*5^2=75.
:::
With these examples, we can see that to answer the question of what
x\in\mathbb{Z} are such that x^2=y for some y\in\mathbb{Z}, it is
enough to consider the structure of the primes that make y. This leads
us to, perhaps, the most important theorem of elementary Number
Theory.
::::: {#thm:NT_FTOA .theorem} Theorem 42. The fundamental theorem of arithmetic
Let n\in\mathbb{Z} be such that n\geq 2. We have that n can be
expressed as a product of one or more primes. This product is unique
up to the order of the primes. This is to say we have that
$$\begin{equation*} n=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k} \end{equation*}$$
where the p_i are distinct primes and e_i\geq 1 is the power of the prime
p_i. Here unique up to the order of the primes means that, for
example, 6=2*3=3*2 are considered the same product.
Proof:
There are two parts to this theorem, firstly we must show that every
integer n\geq 2 is expressible as a product of primes. Secondly that
this product is unique up to the ordering of the primes.
As a result, we will break this theorem down into two sub-theorems.
::: {#thm:NT_FOTA_EveryIntIsProductOfPrimes .theorem} Theorem 40. Every integer greater than one is expressible as a product of primes
Let n\in\mathbb{Z} be such that n>1. We have that
$$\begin{equation*} n=p_1p_2p_3\dots p_k \end{equation*}$$
where the p_i are primes, not necessarily distinct.
Proof:
We argue by strong induction on n. The base case is n=2, which is
itself a prime, so the base case is immediate. So suppose that for some
k\geq 2 the result holds for every integer m with 2\leq m\leq k, that is,
each such m can be written as a product of primes. We show that n=k+1
can be written as a product of primes.
If k+1 is itself prime we are done, so suppose not, then
\sigma\left(k+1\right)>2 and so there are some factors, say a and
b so that k+1=ab, where 2\leq a < k+1 and 2\leq b < k+1.
However, this means that we have 2\leq a \leq k and 2\leq b \leq k
and so by the induction hypothesis we can write each of a and b as a product
of primes. But then ab will be a product of primes and so k+1 is a
product of primes.
The result follows by induction. $\qed$
:::
::: {#thm:NT_FOTA_PrimeProdUnique .theorem} Theorem 41. The product of primes expression for an integer is unique
Let n\in\mathbb{Z} be such that n\geq 2. We have that the expression
for n as a product of primes is unique.
Proof:
Let n\in\mathbb{Z} be as given. Suppose that n has two
representations as a product of primes, that is
$$\begin{align*} n&=p_1p_2p_3\dots p_r\\ n&=q_1q_2q_3\dots q_s \end{align*}$$
where without loss of generality we suppose that r\leq s. Moreover,
without loss of generality suppose that we have the primes in ascending
order, that is, p_1\leq p_2\leq p_3\leq\dots\leq p_r and that
q_1\leq q_2\leq q_3\leq\dots\leq q_s.
Now as p_1\mid q_1q_2q_3\dots q_s we have by Euclid's lemma that
p_1\mid q_i for some 1\leq i \leq s. As q_i is prime and p_1>1 this
forces p_1=q_i, and therefore p_1\geq q_1 as the
primes are in ascending order. Likewise, as
q_1\mid p_1p_2p_3\dots p_r, then q_1\mid p_j for some
1\leq j\leq r, so that q_1=p_j\geq p_1. As p_1\geq q_1 and q_1\geq p_1
we must have that p_1=q_1. Hence we have
$$\begin{align*} p_1p_2p_3\dots p_r&=q_1q_2q_3\dots q_s\\ p_1p_2p_3\dots p_r&=p_1q_2q_3\dots q_s\\ p_2p_3\dots p_r&=q_2q_3\dots q_s \end{align*}$$
This process can be repeated for each prime p_j for the remaining
2\leq j\leq r, giving p_j=q_j each time. Now if r<s we will eventually get to
$$\begin{equation*} 1=q_{r+1}q_{r+2}q_{r+3}\dots q_s \end{equation*}$$
However the only divisors of 1 are 1 and -1, hence none of the
q_i for r+1\leq i\leq s can be prime, a contradiction. So r=s and
we have that p_i=q_i for 1\leq i\leq s. Hence the
two expressions for n are equal, giving us uniqueness. $\qed$
:::
The fundamental theorem of arithmetic now follows from theorem
40{reference-type="ref"
reference="thm:NT_FOTA_EveryIntIsProductOfPrimes"} and theorem
41{reference-type="ref"
reference="thm:NT_FOTA_PrimeProdUnique"}. The final result involving the
powers of primes is trivial to see. Suppose that n is a product of
primes given by
$$\begin{equation*} n=p_1p_2p_3\dots p_k \end{equation*}$$
We will have that some of the p_i will be the same and others will
not. If we combine the primes that are equal then we will get that,
after re-labelling so that k is once again the largest index that
appears,
$$\begin{equation*} n=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k} \end{equation*}$$
The result is shown. $\qed$
:::::
This theorem is of great importance. It ultimately allows us to deal with problems of divisibility by recasting them into statements about the primes that make the integer. We make a quick definition.
::: definition Definition 156. Prime factorisation of an integer
Let n\in\mathbb{Z} with n\geq 2, where
n=p_1^{e_1}*p_2^{e_2}*p_3^{e_3}*\dots*p_k^{e_k}. We say that this
expression for n is the prime factorisation of n, or simply the
factorisation of $n$.
:::
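The existence part of the fundamental theorem of arithmetic suggests a simple, if slow, way of finding a prime factorisation: repeatedly divide out the smallest divisor greater than one, which is necessarily prime. The Python sketch below does this by trial division. The function name `prime_factorisation` and the choice to return a dictionary mapping each prime p_i to its power e_i are our own assumptions, and the method is only practical for small integers.

```python
def prime_factorisation(n):
    """Return {prime: exponent} for an integer n >= 2, by trial division."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors = {}
    p = 2
    while p * p <= n:
        # Divide out p as many times as possible; p is prime whenever this
        # succeeds, since all smaller primes have already been removed.
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        # Whatever remains has no divisor up to its square root, so it is prime.
        factors[n] = factors.get(n, 0) + 1
    return factors


print(prime_factorisation(100))
print(prime_factorisation(5625))
```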
We have shown that any integer greater than one can be factored into a product of primes. A natural question we can now ask, and answer, is how many primes there are. Could it be the case that the set of primes is finite, if very large? We will see that the number of primes is infinite.
::: theorem Theorem 43. Number of primes is infinite
We have that the number of primes is infinite.
Proof:
We will argue by contradiction. Suppose that there are only a finite number of primes, say
$$\begin{equation*} P=\left\{p_1,p_2,p_3,\dots,p_n\right\} \end{equation*}$$
where we have that p_i<p_j for i<j and 1\leq i,j\leq n, i.e.
p_1=2, p_2=3 etc. Let N be the integer
$$\begin{equation*} N=\left(p_1p_2p_3\dots p_n\right)+1 \end{equation*}$$
Clearly, N is not prime as otherwise we would have N\in P but
N>p_n, which would be a contradiction. So N is composite and by the
fundamental theorem of arithmetic, we have that N has a factorisation
into primes. However, dividing N by any of the p_i leaves a remainder
of 1, so none of the p_i divide N. As every prime is one of the p_i,
no prime can appear in the prime factorisation of N, a contradiction.
Hence P can't be a finite set and the
number of primes must be infinite. $\qed$
:::
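The construction of N in the proof can be tried on an actual finite list of primes. The sketch below is only an illustration under our own choices: the list of the first six primes and the helper name `smallest_prime_factor` are assumptions, not part of the proof. It shows that N leaves remainder 1 on division by every prime in the list, and that N need not itself be prime; its prime factors are simply primes missing from the list.

```python
from math import prod


def smallest_prime_factor(n):
    """Return the smallest divisor of n greater than 1, which is prime."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            return p
        p += 1
    return n


primes = [2, 3, 5, 7, 11, 13]
N = prod(primes) + 1

# Dividing N by any prime in the list leaves a remainder of 1.
print([N % p for p in primes])
# The smallest prime factor of N is a prime that is not in the list.
print(N, smallest_prime_factor(N))
```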
The fundamental theorem of arithmetic can be used to recast some previous results for the greatest common divisor. We start with a result for integers being co-primes.
::: {#prop:NT_co-prime_iff_no_common_primes .proposition} Proposition 114. Greatest common divisor is 1 if and only if no-common prime in factorisation
Let a,b\in\mathbb{Z} with b\neq 0. We have that
\mathop{\mathrm{GCD}}\left(a,b\right)=1 if and only if a and b
share no common primes in their factorisations.
Proof:
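Suppose first that a and b share a common prime p in their
factorisations. Then p is a common divisor of a and b and so
\mathop{\mathrm{GCD}}\left(a,b\right)\geq p>1. Conversely, suppose that
d=\mathop{\mathrm{GCD}}\left(a,b\right)>1. By the fundamental theorem of
arithmetic d has a prime factor p, and as d\mid a and d\mid b we have
that p\mid a and p\mid b, so that p appears in the factorisation of a and
in the factorisation of b. Hence
\mathop{\mathrm{GCD}}\left(a,b\right)=1 if and only if there are no
primes in common between the two factorisations. $\qed$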
:::
We can compute the greatest common divisor by considering the prime
factorisations of a and b. To do so we need a helpful result.
::: {#prop:NT_express_primes_in_common_basis .proposition} Proposition 115. Expression for integers as powers of same primes
Let a,b\in\mathbb{Z} with a\geq 2 and b\geq 2. Consider the prime
factorisations of a and b given by
$$\begin{align*} a&=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_n^{e_n}\\ &=\prod_{\substack{p_i\mid a \\ p_i\text{ is prime}}} p_i^{e_i}\\ b&=q_1^{f_1}q_2^{f_2}q_3^{f_3}\dots q_m^{f_m}\\ &=\prod_{\substack{q_i\mid b \\ q_i\text{ is prime}}} q_i^{f_i} \end{align*}$$
where n need not be equal to m. We have that there exist prime
numbers
$$\begin{equation*} t_1<t_2<t_3<\dots <t_v \end{equation*}$$
and integers g_i\geq 0 and h_i\geq 0 so that
$$\begin{align*} a&=t_1^{g_1}t_2^{g_2}t_3^{g_3}\dots t_v^{g_v}\\ b&=t_1^{h_1}t_2^{h_2}t_3^{h_3}\dots t_v^{h_v} \end{align*}$$
Proof:
Let a,b\in\mathbb{Z} be as given. We have that
$$\begin{align*} a&=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_n^{e_n}\\ b&=q_1^{f_1}q_2^{f_2}q_3^{f_3}\dots q_m^{f_m} \end{align*}$$
Let A=\left\{p_1,p_2,p_3,\dots,p_n\right\} and let
B=\left\{q_1,q_2,q_3,\dots,q_m\right\}. We can therefore define the
set T=A\cup B, where we have
$$\begin{equation*} T=\left\{t_1,t_2,\dots,t_v\right\} \end{equation*}$$
where clearly v\leq \left(n+m\right). We now need a way to pick the
primes in the factorisation of a, and b, from the set T.
Define \iota_A and \iota_B as follows
$$\begin{align*} \iota_A:A&\mathlarger{\mathlarger{\rightarrow}}T\\ x&\mapsto \iota_A\left(x\right)=x \end{align*}$$
$$\begin{align*} \iota_B:B&\mathlarger{\mathlarger{\rightarrow}}T\\ x&\mapsto \iota_B\left(x\right)=x \end{align*}$$
That is, \iota_A and \iota_B simply map elements of either A or
B to the same element in T. Using these mappings we can see that
$$\begin{align*}
a&=\prod_{i=1}^n p_i^{e_i}\\
&=\prod_{i=1}^n \iota_A\left(p_i\right)^{e_i}\\
&=\prod_{p_i\in A} p_i^{e_i}\\
&=\prod_{p_i\in A} p_i^{e_i} * \prod_{t_i\in T\setminus A} t_i^0\\
&=\prod_{t_i\in T} t_i^{g_i}, \text{ where } g_i =\begin{cases}
e_j,\ \text{If } t_i=p_j \text{ for some } 1\leq j\leq n\\
0,\ \text{If } t_i\not\in A
\end{cases}\\
&= t_1^{g_1}t_2^{g_2}t_3^{g_3}\dots t_v^{g_v}
\end{align*}$$
Likewise, for b we have
$$\begin{align*}
b&=\prod_{j=1}^m q_j^{f_j}\\
&=\prod_{j=1}^m \iota_B\left(q_j\right)^{f_j}\\
&=\prod_{q_j\in B} q_j^{f_j}\\
&=\prod_{q_j\in B} q_j^{f_j} * \prod_{t_j\in T\setminus B} t_j^0\\
&=\prod_{t_j\in T} t_j^{h_j}, \text{ where } h_j =\begin{cases}
f_i,\ \text{If } t_j=q_i \text{ for some } 1\leq i\leq m\\
0,\ \text{If } t_j\not\in B
\end{cases}\\
&= t_1^{h_1}t_2^{h_2}t_3^{h_3}\dots t_v^{h_v}
\end{align*}$$
Hence, a and b have been expressed as a product of the same set of
primes, where possibly one or more of the powers in either of the
products could be zero. As required. $\qed$
:::
In other words, proposition
115{reference-type="ref"
reference="prop:NT_express_primes_in_common_basis"} is saying that given
the prime factorisations of a and b, we can always construct a new
set containing every prime that divides a or b and use this new set to
express the factorisations of both a and b, allowing powers of zero. Why is this useful? It is
useful because it will allow us to find the greatest common divisor of
two integers by simply looking at the primes, and the powers of those
primes, that the integers have in common. We can use some examples to express
this idea. The reader is encouraged to also try these examples using the
Euclidean algorithm to verify.
::: example
Example 124. Let a=2*3^2*5=90 and b=3*5^2*7=525. We have that
the \mathop{\mathrm{GCD}}\left(a,b\right)=15. Additionally, we know
that 15=3*5. The common primes of a and b are 3 and 5.
:::
::: example
Example 125. Let a=5*11=55 and b=2*7=14. We have that the
\mathop{\mathrm{GCD}}\left(a,b\right)=1. Moreover, a and b have no
common primes.
:::
::: example
Example 126. Let a=7*11*13=1001 and b=7*11*17=1309. We have
that the \mathop{\mathrm{GCD}}\left(a,b\right)=77, as the primes in
common are 7 and 11.
:::
::: example
Example 127. Let a=2*3^4=162 and b=3^3*5=135. We have that the
\mathop{\mathrm{GCD}}\left(a,b\right)=27, as the primes, with powers,
in common is only 3^3.
:::
We show that looking at the primes in common is sufficient to get the greatest common divisor. To aid in the notation we make a definition.
::: definition Definition 157. The minimum function for integers
Let a,b\in\mathbb{Z}. We define the minimum function, denoted
\min\left(a,b\right) by
$$\begin{align*} \min:\mathbb{Z}^2&\rightarrow\mathbb{Z}\\ \left(a,b\right)&\mapsto\min\left(a,b\right)=\begin{cases} a,\ \text{If } a\leq b\\ b,\ \text{If } b\leq a \end{cases} \end{align*}$$
:::
::: {#prop:NT_gcd_can_be_computed_by_primes .proposition} Proposition 116. Greatest common divisor from prime factorisation
Let a,b\in\mathbb{Z} with a\geq 2 and b\geq 2. By proposition
115{reference-type="ref"
reference="prop:NT_express_primes_in_common_basis"} we know that there
exists a set of primes
$$\begin{equation*} T=\left\{t_1,t_2,t_3,\dots,t_v\right\} \end{equation*}$$
so that the prime factorisations of a and b are given by
$$\begin{align*} a&=\prod_{i=1}^v t_i^{e_i}\\ b&=\prod_{i=1}^v t_i^{f_i} \end{align*}$$
We have that the greatest common divisor
\mathop{\mathrm{GCD}}\left(a,b\right) is given by
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(a,b\right)=t_1^{\min\left(e_1,f_1\right)}t_2^{\min\left(e_2,f_2\right)}t_3^{\min\left(e_3,f_3\right)}\dots t_v^{\min\left(e_v,f_v\right)} \end{equation*}$$
Proof:
Let a,b\in\mathbb{Z} be as given and suppose that we have expressed
a and b in accordance with proposition
115{reference-type="ref"
reference="prop:NT_express_primes_in_common_basis"}. This is to say we
have a set T of primes so that
$$\begin{equation*} T=\left\{t_1,t_2,t_3,\dots,t_v\right\} \end{equation*}$$
and the prime factorisations of a and b are given by
$$\begin{align*} a&=\prod_{i=1}^v t_i^{e_i}\\ b&=\prod_{i=1}^v t_i^{f_i} \end{align*}$$
Let d=\mathop{\mathrm{GCD}}\left(a,b\right) and let
\displaystyle D=t_1^{\min\left(e_1,f_1\right)}t_2^{\min\left(e_2,f_2\right)}t_3^{\min\left(e_3,f_3\right)}\dots t_v^{\min\left(e_v,f_v\right)}.
We need to show that d=D. To do so we show
- D\leq d
- d\leq D
Then the result follows from the fact that if n,m\in\mathbb{Z} satisfy
n\leq m and m\leq n then n=m. For ease of notation, let
\sigma_i=\min\left(e_i,f_i\right) for 1\leq i\leq v.
D\leq d: We have by definition of the minimum that
\sigma_i\leq e_i and \sigma_i\leq f_i. Hence \exists k_i, l_i\in\mathbb{Z} with k_i\geq 0 and l_i\geq 0 so that
$$\begin{align*} e_i&=\sigma_i+k_i\\ f_i&=\sigma_i+l_i \end{align*}$$
Hence, we can express a as
$$\begin{align*} a&=\prod_{i=1}^v t_i^{e_i}\\ &=\prod_{i=1}^v t_i^{\sigma_i+k_i}\\ &=\prod_{i=1}^v t_i^{\sigma_i}t_i^{k_i}\\ &=\prod_{i=1}^v t_i^{\sigma_i}\prod_{i=1}^v t_i^{k_i}\\ &=D*\prod_{i=1}^v t_i^{k_i} \end{align*}$$
Therefore, as \displaystyle \prod_{i=1}^v t_i^{k_i} \in\mathbb{Z} we have that D\mid a. A similar argument for b shows that D\mid b. Hence D is a common divisor of a and b and so by definition we have that D\leq d.
d\leq D: To show that d\leq D we will show that d\mid D; then by proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"} part 5. we will have that d\leq D. We have just seen that D is a common divisor of a and b, and by Bézout's identity, theorem 36{reference-type="ref" reference="thm:NT_bezout_id"}, there are x,y\in\mathbb{Z} with d=ax+by. Hence D\mid d, and so by the definition of divisibility we have that there is some k\in\mathbb{Z} so that
$$\begin{equation*} d=Dk \end{equation*}$$
As d>0 and D>0 we have that k>0, so k is either 1 or has a factorisation into primes by the fundamental theorem of arithmetic. Now, k could have primes in common with D, hence we can take those primes of k that also lie in T and absorb them into the powers of the t_i, this is to say we have that
$$\begin{align*} d&=Dk\\ d&=t_1^{\sigma_1}t_2^{\sigma_2}t_3^{\sigma_3}\dots t_v^{\sigma_v}k\\ d&=t_1^{\lambda_1}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}k' \end{align*}$$
Where we have that \lambda_i is the new power of each prime t_i after we extract the primes of k that lie in T; additionally, k' is the product of the primes of k that do not lie in T.
To get the result we want we need to show two things.
- k'=1
- \lambda_i\leq \sigma_i for all 1\leq i\leq v
k'=1: Suppose for a contradiction that k'\neq 1. As d>0 and D>0 then we must have that k>0 which means that k'>0, and so k'\geq 2. Hence we have by the fundamental theorem of arithmetic that k' has a factorisation into primes, say
$$\begin{equation*} k'=q_1^{r_1}q_2^{r_2}q_3^{r_3}\dots q_c^{r_c} \end{equation*}$$
Moreover, no q_j=t_i as k' has no common primes with t_1^{\lambda_1}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}. Pick one of the primes in k', say q=q_j, then we have that q\mid d. Moreover we have that d\mid a as d=\mathop{\mathrm{GCD}}\left(a,b\right), hence we must have that q\mid a. Hence q appears in the prime factorisation of a and so q is one of the primes t_i, a contradiction. Therefore k'=1.
\lambda_i\leq \sigma_i for all 1\leq i\leq v: Suppose for a contradiction that \lambda_i>\sigma_i for some 1\leq i\leq v. Without loss of generality take i=1, for if this is not the case re-label the primes. Now by definition of \sigma_1 we have that \sigma_1=\min\left(e_1,f_1\right) and so we must have that either \sigma_1=e_1 or \sigma_1=f_1. Without loss of generality let \sigma_1=e_1, as the case where \sigma_1=f_1 is similar with b in place of a.
We, therefore, have that \lambda_1>e_1. Now, as d is a divisor of a there is an s\in\mathbb{Z} so that ds=a, where s>0 as both a and d are. Now, comparing the prime factorisations of ds and a, and using that k'=1, we have that
$$\begin{equation*} st_1^{\lambda_1}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}=t_1^{e_1}t_2^{e_2}t_3^{e_3}\dots t_v^{e_v} \end{equation*}$$
Dividing by \displaystyle t_1^{e_1} we get that
$$\begin{equation*} st_1^{\lambda_1-e_1}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}=t_1^{e_1-e_1}t_2^{e_2}t_3^{e_3}\dots t_v^{e_v} \end{equation*}$$
Where clearly \displaystyle t_1^{e_1-e_1}=1. So this can be re-written as
$$\begin{equation*} st_1^{\lambda_1-e_1}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}=t_2^{e_2}t_3^{e_3}\dots t_v^{e_v} \end{equation*}$$
As \lambda_1>e_1, we have \lambda_1-e_1>0 and so t_1 divides the left-hand side of the equation. But by the fundamental theorem of arithmetic if t_1 divides the left-hand side it must also divide the right-hand side and so would appear in its factorisation, but it is not in the factorisation of the right-hand side, a contradiction. So \lambda_i\leq\sigma_i for all 1\leq i\leq v.
As k'=1 and \lambda_i\leq\sigma_i for all 1\leq i\leq v, we have that d\mid D. Therefore d\leq D.
As D\leq d and d\leq D we must have that d=D. As required. $\qed$
:::
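The formula for the greatest common divisor in terms of minimum powers can be tried on the earlier examples. The sketch below is a hedged illustration rather than a practical algorithm: the helper names `prime_factorisation` and `gcd_from_factorisations` are our own, the factorisation is found by slow trial division, and the set union plays the role of the common set of primes T. The result is compared against Python's built-in `math.gcd`.

```python
from math import gcd


def prime_factorisation(n):
    """Return {prime: exponent} for an integer n >= 2, by trial division."""
    if n < 2:
        raise ValueError("n must be at least 2")
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def gcd_from_factorisations(a, b):
    """Multiply each prime of T = A union B raised to the minimum power."""
    fa, fb = prime_factorisation(a), prime_factorisation(b)
    result = 1
    for t in sorted(set(fa) | set(fb)):
        # A prime missing from one factorisation has power 0 there.
        result *= t ** min(fa.get(t, 0), fb.get(t, 0))
    return result


for a, b in [(90, 525), (55, 14), (1001, 1309), (162, 135)]:
    print(a, b, gcd_from_factorisations(a, b), gcd(a, b))
```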
Proposition
116{reference-type="ref"
reference="prop:NT_gcd_can_be_computed_by_primes"} allows us to compute
the greatest common divisor by considering the prime factorisations,
rather than using the Euclidean algorithm. Unfortunately, we now have a
new problem, how do we compute the prime factorisation of an integer?
Thankfully, to answer this question we first have to answer the original
question posed: what x\in\mathbb{Z} are such that x^2=y for some
y\in\mathbb{Z}? Clearly if x\in\mathbb{Z} with x\geq 2 then x has some prime
factorisation, say
$$\begin{equation*} x=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k} \end{equation*}$$
So that
$$\begin{align*} x^2&=\left(p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k}\right)\left(p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k}\right)\\ &=\left(p_1^{e_1}p_1^{e_1}\right)\left(p_2^{e_2}p_2^{e_2}\right)\left(p_3^{e_3}p_3^{e_3}\right)\dots\left(p_k^{e_k}p_k^{e_k}\right)\\ &=p_1^{2e_1}p_2^{2e_2}p_3^{2e_3}\dots p_k^{2e_k}=y \end{align*}$$
For each prime p_i, the power of that prime is now of the form 2e_i
and therefore the power is even. We make this fact a definition.
::: {#def:NT_square_number .definition} Definition 158. Square number
Let y\in\mathbb{Z} where y>0, if there exists an x\in\mathbb{Z}
so that
$$\begin{equation*} x^2=y \end{equation*}$$
Then we say that y is a square number.
:::
In light of the above discussion, we have the following result.
::: {#prop:NT_square_number_iff_prime_exonents_even .proposition} Proposition 117. Square number if and only if prime factorisation has even powers
Let x\in\mathbb{Z}. We have that x is a square number if and only
if the prime factorisation of x only contains even prime powers. This
is to say that each prime p_i in the factorisation of x has an
exponent of the form 2e_i.
Proof:
\left(\Rightarrow\right): Suppose that x is a square number, by
definition there exists y\in\mathbb{Z} so that y^2=x. Let the prime
factorisation of y be
$$\begin{equation*} y=q_1^{f_1}q_2^{f_2}q_3^{f_3}\dots q_k^{f_k} \end{equation*}$$
We have that then
$$\begin{equation*} x=y^2=q_1^{2f_1}q_2^{2f_2}q_3^{2f_3}\dots q_k^{2f_k} \end{equation*}$$
Hence all the prime factors of x have an exponent of the form 2f_i
making them even.
\left(\Leftarrow\right): Suppose that the prime factorisation of x
has prime factors which only have even powers, that is
$$\begin{equation*} x=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k} \end{equation*}$$
As each e_i is even we have that
\displaystyle \frac{e_i}{2}\in\mathbb{Z}. Define y to be
$$\begin{equation*} y=p_1^{e_1/2}p_2^{e_2/2}p_3^{e_3/2}\dots p_k^{e_k/2} \end{equation*}$$
Where clearly y\in\mathbb{Z}. We then have that
$$\begin{align*} y^2&=\left(p_1^{e_1/2}p_2^{e_2/2}p_3^{e_3/2}\dots p_k^{e_k/2}\right)\left(p_1^{e_1/2}p_2^{e_2/2}p_3^{e_3/2}\dots p_k^{e_k/2}\right)\\ &=\left(p_1^{e_1/2}p_1^{e_1/2}\right)\left(p_2^{e_2/2}p_2^{e_2/2}\right)\left(p_3^{e_3/2}p_3^{e_3/2}\right)\dots \left(p_k^{e_k/2}p_k^{e_k/2}\right)\\ &=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k}=x \end{align*}$$
Hence as x=y^2 for some y\in\mathbb{Z} we conclude that x is a
square number. $\qed$
:::
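The characterisation of square numbers by even powers can be checked for small integers. In the sketch below the function names are our own, the trial-division helper is repeated so that the block runs by itself, and we treat 1 as having an empty factorisation (an assumption of this sketch). The check is compared against `math.isqrt`, which returns the integer part of the square root.

```python
from math import isqrt


def prime_factorisation(n):
    """Return {prime: exponent} by trial division; the result is {} for n == 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def is_square_by_exponents(x):
    """True when every power in the prime factorisation of x is even."""
    return all(e % 2 == 0 for e in prime_factorisation(x).values())


for x in (16, 100, 5625, 8, 12):
    # Second column: the even-powers test; third column: a direct check.
    print(x, is_square_by_exponents(x), isqrt(x) ** 2 == x)
```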
We also have an immediate proposition.
::: {#prop:NT_product_of_sqaure_numbers_is_sqaure_number .proposition} Proposition 118. Product of two square numbers is a square number
Let x,y\in\mathbb{Z} be square numbers. We have that xy is a square
number.
Proof:
Let x,y\in\mathbb{Z} be square numbers. We have by definition that
\exists a,b\in\mathbb{Z} so that
$$\begin{align*} a^2&=x\\ b^2&=y \end{align*}$$
Now, consider the product xy, we have
$$\begin{equation*} xy=a^2b^2=\left(ab\right)^2 \end{equation*}$$
Hence by definition, xy is a square number. $\qed$
:::
With proposition
117{reference-type="ref"
reference="prop:NT_square_number_iff_prime_exonents_even"} we can
finally answer the question of what x\in\mathbb{Z} are such that
x^2=y for some y\in\mathbb{Z}. It is those x\in\mathbb{Z} so that
x^2 is a square number! At first, this doesn't seem too useful as we
can clearly take any n\in\mathbb{Z} and see that n^2\in\mathbb{Z}.
However, the real meaning of this result is actually the converse, given
some n\in\mathbb{Z} we can see if there is an x\in\mathbb{Z} so that
x^2=n. With this, we make a definition.
::: definition Definition 159. Square root function
Let x\in\mathbb{Z} be a positive square number. We define the square
root function, denoted by \sqrt{}, as follows
$$\begin{align*} \sqrt{}:\mathbb{Z}&\rightarrow\mathbb{Z}\\ x&\mapsto \sqrt{x}=\begin{cases} n,\ \text{If } n^2=x \text{ and } n\geq 0\\ \text{Undefined otherwise} \end{cases} \end{align*}$$
That is, we define the square root of an integer x to be the
non-negative integer n that when squared gives x.
:::
In light of this definition, we have the following result.
::: {#prop:NT_root_of_product_is_product_of_roots .proposition} Proposition 119. Square root of product is product of square roots
Let x,y\in\mathbb{Z} be square numbers. We have that
$$\begin{equation*} \sqrt{xy}=\sqrt{x}\sqrt{y} \end{equation*}$$
Proof:
Let x,y be as given. By proposition
118{reference-type="ref"
reference="prop:NT_product_of_sqaure_numbers_is_sqaure_number"} we have
that xy is a square number and so \sqrt{xy} is well-defined. We need
to show that
$$\begin{equation*} \sqrt{xy}=\sqrt{x}\sqrt{y} \end{equation*}$$
By definition, we suppose that \sqrt{xy}=n, where n^2=xy.
Additionally, we can suppose that \sqrt{x}=a where a^2=x and
\sqrt{y}=b where b^2=y. Now, we have that
$$\begin{equation*} \left(\sqrt{x}\sqrt{y}\right)^2=\left(ab\right)^2=a^2b^2=xy=n^2=\left(\sqrt{xy}\right)^2 \end{equation*}$$
As n^2=\left(ab\right)^2 and we take n, a and b to be non-negative, we have that n=ab. Hence we have that
\sqrt{xy}=\sqrt{x}\sqrt{y} as required. $\qed$
:::
The idea of a square number actually generalises, meaning the question
of what x\in\mathbb{Z} are such that x^2=y for some y\in\mathbb{Z}
can be generalised to the question what x\in\mathbb{Z} are such that
x^n=y for some y\in\mathbb{Z} and a given n\in\mathbb{N}.
The generalisation works very similarly to how we got to square numbers.
As before let x\in\mathbb{Z} which has a factorisation
$$\begin{equation*} x=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k} \end{equation*}$$
Now, consider x^n, the factorisation is given by
$$\begin{equation*} x^n=p_1^{ne_1}p_2^{ne_2}p_3^{ne_3}\dots p_k^{ne_k} \end{equation*}$$
Hence the power of each prime p_i is of the form ne_i. This is the
defining characteristic for the next definition.
::: definition Definition 160. $n$-th power number
Let y\in\mathbb{Z} with y>0 and let n\in\mathbb{N}, if there
exists an x\in\mathbb{Z} so that
$$\begin{equation*} x^n=y \end{equation*}$$
We say that y is the $n$-th power of x, and that y is an $n$-th power
number. We have already seen the
case of n=2 where y is called a square number. For n=3 we call y
a cube number. For n\geq 4, there is no formal term hence the definition
using the terminology of $n$-th power.
:::
The next step is to prove an equivalent proposition to 117{reference-type="ref" reference="prop:NT_square_number_iff_prime_exonents_even"}.
::: {#prop:NT_nth_power_number_iff_prime_exonents_multiple_of_n .proposition}
Proposition 120. $n$-th power number if and only if prime
factorisation has multiples of n powers
Let x\in\mathbb{Z}. We have that x is an $n$-th power number if and
only if the prime factorisation of x only contains prime powers that
are a multiple of n. This is to say that each prime p_i in the
factorisation of x has an exponent of the form ne_i.
Proof:
\left(\Rightarrow\right): Suppose that x is an $n$-th power number,
by definition there exists y\in\mathbb{Z} so that y^n=x. Let the
prime factorisation of y be
$$\begin{equation*} y=q_1^{f_1}q_2^{f_2}q_3^{f_3}\dots q_k^{f_k} \end{equation*}$$
We have that then
$$\begin{equation*} x=y^n=q_1^{nf_1}q_2^{nf_2}q_3^{nf_3}\dots q_k^{nf_k} \end{equation*}$$
Hence all the prime factors of x have an exponent of the form nf_i,
meaning each prime power is a multiple of n.
\left(\Leftarrow\right): Suppose that the prime factorisation of x
has prime factors whose powers are all multiples of n, that is
$$\begin{equation*} x=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k} \end{equation*}$$
where each e_i is a multiple of n. We then have that
\displaystyle \frac{e_i}{n}\in\mathbb{Z}. Define y to be
$$\begin{equation*} y=p_1^{e_1/n}p_2^{e_2/n}p_3^{e_3/n}\dots p_k^{e_k/n} \end{equation*}$$
Where clearly y\in\mathbb{Z}. We then have that
$$\begin{align*} y^n&=\prod_{i=1}^n\left(p_1^{e_1/n}p_2^{e_2/n}p_3^{e_3/n}\dots p_k^{e_k/n}\right)\\ &=\prod_{j=1}^k\left(\prod_{i=1}^n\left(p_j^{e_j/n}\right)\right)\\ &=\prod_{j=1}^k\left(p_j^{e_j}\right)\\ &=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k}=x \end{align*}$$
Hence as x=y^n for some y\in\mathbb{Z} we conclude that x is an
$n$-th power number. $\qed$
:::
As before, there is an immediate proposition.
::: proposition
Proposition 121. Product of two $n$-th power numbers is an $n$-th
power number
Let x,y\in\mathbb{Z} be $n$-th power numbers. We have that xy is an
$n$-th power number.
Proof:
Let x,y\in\mathbb{Z} be $n$-th power numbers. By definition, we have
that \exists a,b\in\mathbb{Z} so that
$$\begin{align*} a^n&=x\\ b^n&=y \end{align*}$$
We have
$$\begin{equation*} xy=a^nb^n=\left(ab\right)^n \end{equation*}$$
Giving the result. $\qed$
:::
We can now generalise the square root function.
::: definition Definition 161. $n$-th root function
Let x\in\mathbb{Z} be a positive $n$-th power number. We define the
$n$-th root function, denoted by \displaystyle \sqrt[n]{}, by
$$\begin{align*} \sqrt[n]{}:\mathbb{Z}&\rightarrow\mathbb{Z}\\ x&\mapsto \sqrt[n]{x}=\begin{cases} m,\ \text{If } m^n=x \text{ and } m>0\\ \text{Undefined otherwise} \end{cases} \end{align*}$$
That is, we define the $n$-th root of an integer x to be the positive integer
m that when raised to the power of n gives x.
:::
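The proof that an integer is an $n$-th power number exactly when every power in its prime factorisation is a multiple of n also tells us how to compute the $n$-th root: divide each power by n. The sketch below follows that idea. The names `prime_factorisation` and `integer_nth_root` are our own, returning `None` stands in for "Undefined", and the trial-division helper is repeated so that the block runs by itself.

```python
def prime_factorisation(n):
    """Return {prime: exponent} by trial division; the result is {} for n == 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def integer_nth_root(x, n):
    """Return the positive integer m with m**n == x, or None if there is none."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    root = 1
    for p, e in prime_factorisation(x).items():
        if e % n != 0:
            # Some power is not a multiple of n, so x is not an n-th power.
            return None
        root *= p ** (e // n)
    return root


print(integer_nth_root(5625, 2))
print(integer_nth_root(8, 3))
print(integer_nth_root(8, 2))
```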
The integers modulo n
::: epigraph Mathematicians call it "the arithmetic of congruences." You can think of it as clock arithmetic
John Derbyshire :::
So far in the study of the divisibility of integers, we have considered
what it means for an integer a to divide another b, namely we have
that a\mid b if there is some c\in\mathbb{Z} such that ac=b. We
now explore the implications of the case where a\nmid b, in
particular, we look at the remainders from the division algorithm.
Remainders after division
Recall that for a,b\in\mathbb{Z} we have that a\mid b if
\exists c\in\mathbb{Z} so that b=ac. When this is not the case we
have that a\nmid b. By the division algorithm, when a\nmid b there
are q,r\in\mathbb{Z} with 0<r<\left|a\right| so that
$$\begin{equation*} b=qa+r \end{equation*}$$
The question is, what are the possible values for r? The division
algorithm gives us lower and upper bounds on the valid values of r but
does not say anything about whether it can take all values in this
range. Is only a small subset of this range valid? What happens if we
allow b to be an arbitrary integer rather than some fixed integer?
Some exploratory examples will be helpful.
::: example
Example 128. Let a=2 and consider some b>a. We have by the
division algorithm that
$$\begin{equation*} b=2q+r \end{equation*}$$
with 0\leq r<2. Hence r can only be one of 0 or 1. If r=0 then
b is an even number and if r=1 then b is an odd
number. As b is an arbitrary integer, and every integer
x\in\mathbb{Z} is either even or odd, dividing integers by 2
realises both of the possible remainders.
:::
::: example
Example 129. Let a=3 and consider some b>a. By the division
algorithm we have that r is either 0, 1 or 2. Like before, if
r=0 then b is a multiple of 3 so that b=3q.
Now, suppose b is a multiple of 3. We have that b+3 is also a
multiple of 3 as
$$\begin{equation*} b+3=3q+3=3\left(q+1\right) \end{equation*}$$
So, as b is a multiple of 3 and b+3 is a multiple of 3 then
these will give a remainder r=0 by the division algorithm. What can we
say about b+1 and b+2? Using b+1 in the division algorithm with
3 gives
$$\begin{equation*} b+1=3q+1 \end{equation*}$$
as b=3q. Hence the remainder is 1. Likewise using b+2 in the
division algorithm with 3 gives a remainder of 2. As b was an
arbitrary integer, we can conclude that the possible remainders when
dividing an arbitrary integer by 3 are 0, 1 and 2. All of the
possibilities are realised for the division of an arbitrary integer.
:::
::: example
Example 130. Let a=4 and consider some b>a. The division
algorithm gives the possible range of remainders of 0, 1, 2 and 3.
Like the previous example, we see that if the remainder is 0 then b
is a multiple of 4, so similarly b+4 is a multiple of 4. Looking
at b+1, b+2 and b+3 we see by the division algorithm that
$$\begin{align*} b+1&=4q+1\\ b+2&=4q+2\\ b+3&=4q+3 \end{align*}$$
So dividing an arbitrary integer by 4 will give a remainder in the
range 0 to 3 inclusive.
:::
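The three examples above can be checked numerically. The sketch below is only an illustration (the function name `remainders_seen` is an assumption of ours): it collects the remainders obtained when each b in a range is divided by a. It relies on Python's built-in `divmod`, which for a positive divisor returns a remainder r with 0\leq r<a, matching the division algorithm.

```python
def remainders_seen(a: int, start: int, stop: int) -> set:
    """Collect every remainder r from b = q*a + r for b in range(start, stop).

    Assumes a > 0, so divmod returns 0 <= r < a as in the division algorithm.
    """
    return {divmod(b, a)[1] for b in range(start, stop)}
```

Calling `remainders_seen(3, 4, 20)` gathers the remainders of 4, 5, ..., 19 on division by 3, which covers each of 0, 1 and 2.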
These examples suggest that dividing arbitrary integers b by
some a\in\mathbb{Z} with a>0 realises every value r with
0\leq r<\left|a\right| as a remainder. This is true not just for the examples above but
for every a>0. We can prove this but first make an important
observation.
::: {#cor:NT_integer_minus_remainder_is_divisable .corollary} Corollary 9.
Let a,b\in\mathbb{Z} with a\neq 0 and b>a. Consider the division algorithm for
b divided by a, that is we have
$$\begin{equation*} b=qa+r \end{equation*}$$
for some q,r\in\mathbb{Z} and 0\leq r<\left|a\right|. We have that
a\mid\left(b-r\right).
Proof:
Let a,b\in\mathbb{Z} be as given by the hypothesis. The division
algorithm applied to a dividing b gives
$$\begin{equation*} b=qa+r \end{equation*}$$
This gives \left(b-r\right)=qa and so by the definition of
divisibility we have that a\mid\left(b-r\right), as required. $\qed$
:::
Calling corollary 9{reference-type="ref" reference="cor:NT_integer_minus_remainder_is_divisable"} a corollary is slightly misleading, as this result provides the bedrock for the rest of this section.
Congruences and residues (Modular arithmetic)
We are now able to define the main topic behind this section. Corollary
9{reference-type="ref"
reference="cor:NT_integer_minus_remainder_is_divisable"} tells us that
when we divide b by a, then the difference between b and the
remainder r is always divisible by a. This is to say
a\mid\left(b-r\right). Now suppose we have another integer c so that
when c is divided by a the remainder is also r. We then also have
a\mid\left(c-r\right), in a sense b and c are similar when divided
by a. That is they give the same remainder after division. We use this
to make the definitions.
::: definition Definition 162. Congruences, congruent number and residue number
Let a,b,n\in\mathbb{Z} so that n>0. If a and b
have the same remainder when divided by n we say that a and b are
congruent modulo n. This is denoted by
$$\begin{equation*} a\equiv b \ (\mathrm{mod}\ n) \end{equation*}$$
We call b a residue of a modulo n, and we usually say that a is
congruent to b modulo n. A statement of this form is called a congruence;
it captures the notion of congruent numbers and residue numbers. We call the number n
the modulus of the congruence.
If a is not congruent to b, equivalently if b is not a residue of
a, we write a\not\equiv b\ (\mathrm{mod}\ n).
:::
We can make use of corollary 9{reference-type="ref" reference="cor:NT_integer_minus_remainder_is_divisable"} to connect division to congruences.
::: {#prop:NT_congruent_iff_difference_is_divisible .proposition} Proposition 122. Congruent if and only if the difference is divisible by modulus
Let a,b,n\in\mathbb{Z} with n\geq 1. We have that
a\equiv b\ (\mathrm{mod}\ n) if and only if n\mid\left(a-b\right).
Proof:
By the division algorithm, we have that
$$\begin{align*} a&=qn+r\\ b&=q'n+r' \end{align*}$$
for some q,q',r,r'\in\mathbb{Z} where 0\leq r<\left|n\right| and
0\leq r'<\left|n\right|. Hence we have that
$$\begin{equation*} a-b=\left(q-q'\right)n+\left(r-r'\right) \end{equation*}$$
where -n<r-r'<n.
\left(\Rightarrow\right): Suppose that
a\equiv b\ (\mathrm{mod}\ n). By definition of congruences, we have
that a and b share the same remainder when divided by n. Hence
r=r' and so r-r'=0 so that
$$\begin{equation*} a-b=\left(q-q'\right)n \end{equation*}$$
which implies that n\mid\left(a-b\right).
\left(\Leftarrow\right): Now suppose that n\mid\left(a-b\right). We
then have that
$$\begin{equation*} \left(a-b\right)-\left(q-q'\right)n=\left(r-r'\right) \end{equation*}$$
where -n<r-r'<n. As n divides both terms on the left-hand side, we have
that n\mid\left(r-r'\right). The only integer strictly between -n and n which
is divisible by n is 0, as every non-zero multiple of n has absolute
value at least n. Hence r-r'=0, which implies that
r=r'. So by the definition of a congruence, we have that
a\equiv b\ (\mathrm{mod}\ n). $\qed$
:::
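As a sanity check rather than a proof, the equivalence can be tested on a finite range of integers. This is a minimal sketch with illustrative function names of our own choosing. It uses the fact that, for a positive modulus, Python's `%` operator returns the remainder from the division algorithm, even for negative integers.

```python
def same_remainder(a: int, b: int, n: int) -> bool:
    """The definition: a and b leave the same remainder when divided by n."""
    return a % n == b % n


def difference_divisible(a: int, b: int, n: int) -> bool:
    """The divisibility condition: n divides a - b."""
    return (a - b) % n == 0


def check_equivalence(n: int, bound: int) -> bool:
    """Compare both conditions for every pair a, b with |a|, |b| <= bound."""
    values = range(-bound, bound + 1)
    return all(
        same_remainder(a, b, n) == difference_divisible(a, b, n)
        for a in values
        for b in values
    )
```

A call such as `check_equivalence(5, 20)` compares the two conditions over all pairs in the range and only returns `True` when they agree everywhere.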
Proposition 122{reference-type="ref" reference="prop:NT_congruent_iff_difference_is_divisible"} gives us a bridge, allowing us to translate statements about divisibility into statements about congruences and vice versa. To get used to working with congruences we will use some examples.
::: example Example 131. Suppose that today is Monday and you are asked what day it will be in 100 days' time. We can make use of the fact that days repeat in a 7-day cycle. By the division algorithm, we have that
$$\begin{equation*} 100=14\left(7\right)+2 \end{equation*}$$
That is to say, 100\equiv 2\ (\mathrm{mod}\ 7). So we know that if
the current day is a Monday, then in 100 days' time the day will be a
Wednesday.
:::
::: example
Example 132. The idea of the previous example can also be used to calculate
a time in the future. Suppose we are using a $24$-hour clock and we
know that it is 1 PM. What will the time be in 164 hours?
To find this we make use of the fact that a $24$-hour clock repeats
every 24 hours. So we need to compute the remainder of 164 when
divided by 24. The division algorithm gives
$$\begin{equation*} 164=6\left(24\right)+20 \end{equation*}$$
Hence 164\equiv 20\ (\mathrm{mod}\ 24). Now on a $24$-hour clock, 1
PM is equal to 13. So the time 164 hours later will be given by
13+20=33. We have a problem, 33 is not on a $24$-hour clock! To find
what time this is we need to find the remainder when 33 is divided by
24. We can quickly see that
$$\begin{equation*} 33=1\left(24\right)+9 \end{equation*}$$
So 33\equiv 9\ (\mathrm{mod}\ 24). Hence, 164 hours after 1 PM the time will be 9 AM.
:::
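The two calendar and clock examples above can be reproduced in a few lines. This is an illustrative sketch only; the names `DAYS`, `day_after` and `hour_after` are our own, and we assume the week is indexed from Monday.

```python
# Days of the week, indexed 0..6 starting from Monday (an assumption of this sketch).
DAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]


def day_after(start_day: str, days: int) -> str:
    """Day of the week reached `days` days after `start_day`, working modulo 7."""
    return DAYS[(DAYS.index(start_day) + days) % 7]


def hour_after(start_hour: int, hours: int) -> int:
    """Hour on a 24-hour clock reached `hours` hours after `start_hour`, working modulo 24."""
    return (start_hour + hours) % 24
```

Here `day_after("Monday", 100)` corresponds to the first example and `hour_after(13, 164)` to the second. Note that the code reduces the sum 13+164 directly, rather than reducing 164 first and then 33; the result is the same.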
::: example
Example 133. Let a,n\in\mathbb{Z} with n\geq 1. As a-a=0 we have that
n\mid\left(a-a\right) and so by proposition
122{reference-type="ref"
reference="prop:NT_congruent_iff_difference_is_divisible"} we have that
a\equiv a\ (\mathrm{mod}\ n).
:::
::: example
Example 134. Let a=8, b=11 and n=5. Using the division
algorithm we can see that
$$\begin{align*} 8&\equiv 3\ (\mathrm{mod}\ 5)\\ 11&\equiv 1\ (\mathrm{mod}\ 5) \end{align*}$$
Now, consider a+b=19. By the division algorithm, we see that
$$\begin{equation*} 19=3\left(5\right)+4 \end{equation*}$$
So that 19\equiv 4\ (\mathrm{mod}\ 5), and 4 is exactly the sum 3+1 of the
two residues. It would seem that we can add
congruences together and the result makes sense.
:::
The last two examples hint at some properties of congruences which
should be investigated. In particular, a\equiv a\ (\mathrm{mod}\ n) is
one criterion of being an equivalence relation. Do the other properties
for being an equivalence relation hold? This is to say if
a\equiv b\ (\mathrm{mod}\ n) do we have that
b\equiv a\ (\mathrm{mod}\ n) and if a\equiv b\ (\mathrm{mod}\ n) and
b\equiv c\ (\mathrm{mod}\ n) do we have
a\equiv c\ (\mathrm{mod}\ n)? As it turns out, the answer is yes.
::: {#prop:NT_congruences_form_equivalence_relation .proposition} Proposition 123. Congruences are an equivalence relation
Let a,b,c,n\in\mathbb{Z} so that n is fixed and n\geq 1. Consider
the relation \sim_n where
$$\begin{align*} a\sim_n b \iff a\equiv b\ (\mathrm{mod}\ n) \end{align*}$$
We have that \sim_n is an equivalence relation. That is
-
$a\equiv a\ (\mathrm{mod}\ n)$
-
If $a\equiv b\ (\mathrm{mod}\ n)$ then $b\equiv a\ (\mathrm{mod}\ n)$
-
If $a\equiv b\ (\mathrm{mod}\ n)$ and $b\equiv c\ (\mathrm{mod}\ n)$ then $a\equiv c\ (\mathrm{mod}\ n)$
Proof:
-
$a\equiv a\ (\mathrm{mod}\ n)$: As $n\mid\left(a-a\right)$ we have by proposition 122{reference-type="ref" reference="prop:NT_congruent_iff_difference_is_divisible"} that $a\equiv a\ (\mathrm{mod}\ n)$.
-
If $a\equiv b\ (\mathrm{mod}\ n)$ then $b\equiv a\ (\mathrm{mod}\ n)$: Suppose that $a\equiv b\ (\mathrm{mod}\ n)$, then $n\mid\left(a-b\right)$. By the definition of divisibility, we have that $\exists k\in\mathbb{Z}$ so that $a-b=kn$. Multiplying both sides by $-1$ gives $b-a=\left(-k\right)n$. So again by the definition of divisibility, we have that $n\mid\left(b-a\right)$ and so $b\equiv a\ (\mathrm{mod}\ n)$.
-
If $a\equiv b\ (\mathrm{mod}\ n)$ and $b\equiv c\ (\mathrm{mod}\ n)$ then $a\equiv c\ (\mathrm{mod}\ n)$: Suppose that $a\equiv b\ (\mathrm{mod}\ n)$ and $b\equiv c\ (\mathrm{mod}\ n)$, then $n\mid\left(a-b\right)$ and $n\mid\left(b-c\right)$. Property 3. of proposition 102{reference-type="ref" reference="prop:NT_divisibility_properties"} then gives us that $n\mid\left(\left(a-b\right)+\left(b-c\right)\right)$. Clearly $\left(a-b\right)+\left(b-c\right)=a-c$ and so $n\mid\left(a-c\right)$, which is to say $a\equiv c\ (\mathrm{mod}\ n)$.
As required. $\qed$ :::
We now know that congruences form an equivalence relation, one for each
n\geq 1. So we can consider the equivalence classes that are formed by
congruences. Let a\in\mathbb{Z}, what does the equivalence class
\left[a\right]_n look like?
Recall that \left[a\right]_n is given by
$$\begin{equation*} \left[a\right]_n=\left\{x\in\mathbb{Z}: a\sim_n x\right\}=\left\{x\in\mathbb{Z}:a\equiv x\ (\mathrm{mod}\ n)\right\} \end{equation*}$$
That is, the equivalence class \left[a\right]_n is the set of integers
that are congruent to a modulo n. Equivalently, as \sim_n is an
equivalence relation, we have that a\equiv x\ (\mathrm{mod}\ n) is the
same as x\equiv a\ (\mathrm{mod}\ n). Hence we can view
\left[a\right]_n as the set of integers x that give the same
remainder as a when divided by n. This is to say
$$\begin{equation*} \left[a\right]_n=\left\{x\in\mathbb{Z}:x\equiv a\ (\mathrm{mod}\ n)\right\} \end{equation*}$$
For example, suppose that a=0, then we have that
$$\begin{align*} \left[0\right]_n&=\left\{x\in\mathbb{Z}:x\equiv 0\ (\mathrm{mod}\ n)\right\}\\ &=\left\{\dots,-2n,-n,0,n,2n,\dots\right\} \end{align*}$$
Likewise, when a=1 we have that
$$\begin{align*} \left[1\right]_n&=\left\{x\in\mathbb{Z}:x\equiv 1\ (\mathrm{mod}\ n)\right\}\\ &=\left\{\dots,1-2n,1-n,1,1+n,1+2n,\dots\right\} \end{align*}$$
That is, the equivalence class of 0 modulo n is simply the set of multiples
of n and the equivalence class of 1 modulo n is the set of integers that are one more than a
multiple of n.
How many congruence classes are there for a given n? Clearly, by the
division algorithm, there are at most n such classes. We can show
that there are exactly n classes.
::: {#prop:NT_congruence_equiv_class_count .proposition} Proposition 124. Number of equivalence classes for congruence equivalence relation
Let a,b,n\in\mathbb{Z} so that n\geq 1 and consider the relation
\sim_n given by
$$\begin{equation*} a\sim_n b\iff a\equiv b\ (\mathrm{mod}\ n) \end{equation*}$$
We have that there are n equivalence classes for the relation
\sim_n, one for each possible remainder.
Proof:
By the division algorithm applied to n dividing a we have for
a,n\in\mathbb{Z} with n\geq 1 that
$$\begin{equation*} a=qn+r \end{equation*}$$
for some unique q,r\in\mathbb{Z} and 0\leq r < \left|n\right|.
Hence the possible remainders are in the set
$$\begin{equation*} R=\left\{0,1,2,\dots,n-1\right\} \end{equation*}$$
as n\geq 1. Firstly we show that no two distinct i,j\in R are congruent
modulo n. So suppose, WLOG, that 0\leq i<j<n, then j-i>0 and
j-i<n. Then we have that n\nmid\left(j-i\right) and so
j\not\equiv i\ (\mathrm{mod}\ n). As the choice of i,j was arbitrary
we conclude that no two distinct elements of R are congruent. It was shown in
the proof of theorem
19{reference-type="ref"
reference="thm:EquivClassesOfRelationPartitionSet"} that unequal
equivalence classes are disjoint. Hence
\left[i\right]_n\neq\left[j\right]_n for all distinct i,j\in R, so there are at least n classes.
Now, it is left to show that every a\in\mathbb{Z} belongs to
\left[r\right]_n for some r\in R, so that there are no further classes. This is clear upon rewriting the result from the
division algorithm as
$$\begin{equation*} a-r=qn \end{equation*}$$
which gives that a\equiv r\ (\mathrm{mod}\ n). $\qed$
:::
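To see the classes concretely, we can sort a finite window of integers by remainder. The following is a small sketch under our own naming (`residue_classes` is hypothetical); it only lists the part of each class that falls in the chosen window, since the classes themselves are infinite.

```python
def residue_classes(n: int, lo: int, hi: int) -> dict:
    """Group the integers lo..hi (inclusive) by their remainder modulo n.

    The keys are the canonical remainders 0, 1, ..., n-1, one per equivalence class.
    """
    classes = {r: [] for r in range(n)}
    for x in range(lo, hi + 1):
        classes[x % n].append(x)
    return classes
```

For example, `residue_classes(3, -6, 6)` produces three lists, one for each remainder, and every integer from -6 to 6 lands in exactly one of them.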
It follows immediately by theorem
19{reference-type="ref"
reference="thm:EquivClassesOfRelationPartitionSet"} that the equivalence
classes modulo n partition \mathbb{Z}. Additionally we have that
\left[a\right]_n=\left[b\right]_n if and only if
a\equiv b\ (\mathrm{mod}\ n).
As we have shown that \sim_n is an equivalence relation we can define
the quotient set, as in definition
96{reference-type="ref" reference="def:QuotientSet"}
::: definition Definition 163. The integers modulo $n$
Let n\in\mathbb{Z} with n\geq 1. We define the quotient set \mathbb{Z}_n to be
$$\begin{equation*} \mathbb{Z}_n=\left\{\left[a\right]_n:a\in\mathbb{Z}\right\} \end{equation*}$$
By proposition
124{reference-type="ref"
reference="prop:NT_congruence_equiv_class_count"} we know there are n
such classes, which correspond to all the possible remainders when an
integer a is divided by n. Hence we can explicitly write
$$\begin{equation*} \mathbb{Z}_n=\left\{\left[0\right]_n,\left[1\right]_n,\left[2\right]_n,\dots,\left[n-1\right]_n\right\} \end{equation*}$$
If we take the canonical representative of each class, for example
0 for the class \left[0\right]_n, we can write \mathbb{Z}_n more cleanly as
$$\begin{equation*} \mathbb{Z}_n=\left\{0,1,2,\dots,n-1\right\} \end{equation*}$$ :::
As hinted by an example and because we have defined arithmetic on
\mathbb{Z}, the next natural question is how does arithmetic work in
\mathbb{Z}_n? Recall the example where we had a=8, b=11 and n=5
and
$$\begin{align*} 8&\equiv 3\ (\mathrm{mod}\ 5)\\ 11&\equiv 1\ (\mathrm{mod}\ 5) \end{align*}$$
We computed a+b=19 and found that
$$\begin{equation*} 19\equiv 4\ (\mathrm{mod}\ 5) \end{equation*}$$
which agrees with the sum 3+1=4 of the residues.
What about multiplication? We know that 8*8=64 and we can see that
64\equiv 4\ (\mathrm{mod}\ 5). Multiplying the residue of 3 with
itself we get 3*3=9 from which we see 9\equiv 4\ (\mathrm{mod}\ 5).
Similarly, we can see that subtraction makes sense in \mathbb{Z}_n. We
know that 11-8=3 so clearly 11-8\equiv 3\ (\mathrm{mod}\ 5).
Subtracting the residues gives 1-3=-2. At first, this seems to be a
problem, we seem to be saying that 3\equiv -2\ (\mathrm{mod}\ 5).
However, a quick review of congruences tells us that this
is correct. We know that a\equiv b\ (\mathrm{mod}\ n) if and only if
n\mid\left(a-b\right), and in our case we indeed have that
5\mid\left(3-\left(-2\right)\right) as 3-\left(-2\right)=5.
We can make the idea of addition, subtraction and multiplication rigorous.
::: {#prop:NT_operations_on_congruences .proposition} Proposition 125. Addition, subtraction and multiplication of congruences
Let a,b,c,d,n\in\mathbb{Z} with n\geq 1 so that a\equiv b\ (\mathrm{mod}\ n) and
c\equiv d\ (\mathrm{mod}\ n). We have that
-
$\left(a+c\right)\equiv \left(b+d\right)\ (\mathrm{mod}\ n)$
-
$\left(a-c\right)\equiv \left(b-d\right)\ (\mathrm{mod}\ n)$
-
$\left(ac\right)\equiv \left(bd\right)\ (\mathrm{mod}\ n)$
Proof:
Let a,b,c,d,n\in\mathbb{Z} be as given by the hypothesis. As
a\equiv b\ (\mathrm{mod}\ n) we have by proposition
122{reference-type="ref"
reference="prop:NT_congruent_iff_difference_is_divisible"} that
n\mid\left(a-b\right) and so by the definition of divisibility we have
that a-b=kn for some k\in\mathbb{Z}. Likewise, as c\equiv d\ (\mathrm{mod}\ n)
we have by proposition
122{reference-type="ref"
reference="prop:NT_congruent_iff_difference_is_divisible"} that
n\mid\left(c-d\right) and so by the definition of divisibility we have
that c-d=ln for some l\in\mathbb{Z}.
In particular, we have that a=b+kn and c=d+ln. It follows that
$$\begin{align*} a+c&=\left(b+kn\right)+\left(d+ln\right)\\ &=\left(b+d\right)+\left(kn+ln\right)\\ &=\left(b+d\right)+n\left(k+l\right)\\ &\Rightarrow n\mid\left(\left(a+c\right)-\left(b+d\right)\right)\\ &\Rightarrow \left(a+c\right)\equiv \left(b+d\right)\ (\mathrm{mod}\ n) \end{align*}$$
Likewise, for subtraction we have
$$\begin{align*} a-c&=\left(b+kn\right)-\left(d+ln\right)\\ &=\left(b-d\right)+\left(kn-ln\right)\\ &=\left(b-d\right)+n\left(k-l\right)\\ &\Rightarrow n\mid\left(\left(a-c\right)-\left(b-d\right)\right)\\ &\Rightarrow \left(a-c\right)\equiv \left(b-d\right)\ (\mathrm{mod}\ n) \end{align*}$$
Finally, for multiplication, we see that
$$\begin{align*} ac&=\left(b+kn\right)\left(d+ln\right)\\ &=bd+bln+dkn+kln^2\\ &=bd+n\left(bl+dk+kln\right)\\ &\Rightarrow n\mid\left(\left(ac\right)-\left(bd\right)\right)\\ &\Rightarrow \left(ac\right)\equiv \left(bd\right)\ (\mathrm{mod}\ n) \end{align*}$$
As required. $\qed$ :::
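The three properties can be spot-checked on a finite range. In this sketch (the function name is our own) we take b and d to be the canonical residues of a and c, so that a\equiv b and c\equiv d hold by construction, and then compare both sides of each congruence. A finite check like this is an illustration, not a substitute for the proof.

```python
def operations_respect_congruence(n: int, bound: int) -> bool:
    """Check addition, subtraction and multiplication for |a|, |c| <= bound.

    b and d are chosen as the canonical residues of a and c modulo n,
    so a is congruent to b and c is congruent to d.
    """
    values = range(-bound, bound + 1)
    for a in values:
        for c in values:
            b, d = a % n, c % n
            if (a + c) % n != (b + d) % n:
                return False
            if (a - c) % n != (b - d) % n:
                return False
            if (a * c) % n != (b * d) % n:
                return False
    return True
```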
This proposition provides the backbone of showing that the operations of
addition, subtraction and multiplication are well-defined on
\mathbb{Z}_n.
::: definition Definition 164. Addition, subtraction and multiplication on $\mathbb{Z}_n$
Let a,b,n\in\mathbb{Z} with n\geq 1. We define addition,
subtraction and multiplication on \mathbb{Z}_n by
-
$\left[a\right]_n+\left[b\right]_n=\left[a+b\right]_n$
-
$\left[a\right]_n-\left[b\right]_n=\left[a-b\right]_n$
-
$\left[a\right]_n\left[b\right]_n=\left[ab\right]_n$ :::
We prove these are well-defined.
::: {#prop:NT_addition_subtraction_multiplication_Zn_well_defined .proposition}
Proposition 126. Addition, subtraction and multiplication on
\mathbb{Z}_n is well-defined and closed
Let n\in\mathbb{Z} so that n\geq 1. We have that addition,
subtraction and multiplication of equivalence classes are well-defined
and closed. This is to say \forall x,y\in\mathbb{Z}_n we have that
-
$\left[x\right]_n+\left[y\right]_n=\left[x+y\right]_n\in\mathbb{Z}_n$
-
$\left[x\right]_n-\left[y\right]_n=\left[x-y\right]_n\in\mathbb{Z}_n$
-
$\left[x\right]_n\left[y\right]_n=\left[xy\right]_n\in\mathbb{Z}_n$
Proof:
Suppose that a\in\left[x\right]_n and b\in\left[y\right]_n. By
definition, we have that a\equiv x\ (\mathrm{mod}\ n) and
b\equiv y\ (\mathrm{mod}\ n).
By proposition 125{reference-type="ref" reference="prop:NT_operations_on_congruences"} we have that
-
$a+b\equiv x+y\ (\mathrm{mod}\ n)$
-
$a-b\equiv x-y\ (\mathrm{mod}\ n)$
-
$ab\equiv xy\ (\mathrm{mod}\ n)$
So that a+b\in\left[x+y\right]_n, a-b\in\left[x-y\right]_n and
ab\in\left[xy\right]_n, showing the operations are well-defined.
Closure is immediate in each case. $\qed$
:::
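One way to picture arithmetic on \mathbb{Z}_n is as a small data type that always stores the canonical representative. The class below is a hypothetical sketch (the name `Mod` and its fields are our assumptions, not notation from the text), and it assumes both operands share the same modulus.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mod:
    """An element [value]_n of Z_n, stored via its canonical representative."""
    value: int
    n: int

    def __post_init__(self):
        # Reduce to the canonical representative 0 <= value < n.
        object.__setattr__(self, "value", self.value % self.n)

    # Each operation assumes `other` has the same modulus n.
    def __add__(self, other):
        return Mod(self.value + other.value, self.n)

    def __sub__(self, other):
        return Mod(self.value - other.value, self.n)

    def __mul__(self, other):
        return Mod(self.value * other.value, self.n)
```

Because every instance is reduced on creation, different representatives of the same class compare equal: `Mod(13, 5) + Mod(-4, 5)` and `Mod(3, 5) + Mod(1, 5)` are the same element, which is the well-definedness statement in miniature.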
We now have a well-defined idea of arithmetic on \mathbb{Z}_n. A poor
student or a particularly clever dog will realise immediately that we
have missed out on some operations that were defined on \mathbb{Z}. In
this section for example we defined what integer division means. What
about exponentiation?
We will first look at exponentiation. Thankfully there isn't much work
to do as we can make use of the definition of multiplication for
\mathbb{Z}_n. We can see that
$$\begin{align*} \left(\left[a\right]\right)^2&=\left[a\right]\left[a\right]=\left[aa\right]=\left[a^2\right]\\ \left(\left[a\right]\right)^3&=\left[a\right]^2\left[a\right]=\left[a^2a\right]=\left[a^3\right]\\ \left(\left[a\right]\right)^4&=\left[a\right]^3\left[a\right]=\left[a^3a\right]=\left[a^4\right]\\ &\dots \end{align*}$$
So clearly exponentiation is well-defined. Now, what about division? We
expect that to get a well-defined definition for division in
\mathbb{Z}_n it should respect the definition of divisibility for the
integers. Herein lies the problem, division over \mathbb{Z} is not
always defined, for example, 3\nmid 2, so it is clear there is no
equivalence class for this case. What about the cases where division
over \mathbb{Z} is defined? Divisibility is what we used to characterise being
congruent, so we can't extend to division of congruences this way either.
However, recall that we have defined the idea of a multiplicative
inverse. In particular, we had that for x\in\mathbb{Z} such that
x\neq 0, then y\in\mathbb{Q} was said to be a multiplicative inverse
of x if
$$\begin{equation*} xy=1=yx \end{equation*}$$
Perhaps then, we might hope to recover some notion of division modulo
n by using multiplicative inverses. Such a definition, of course,
would have to respect congruences. So for x\in\mathbb{Z}_n with
x\not\equiv 0\ (\mathrm{mod}\ n), we are looking for
y\in\mathbb{Z}_n so that x*y\equiv 1 \ (\mathrm{mod}\ n). To start,
it would be wise to look at multiplication for a few small values of
n\geq 2, to get a feel for what we are looking for.
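| \* | 0 | 1 |
|----|---|---|
| 0  | 0 | 0 |
| 1  | 0 | 1 |

: The multiplication table for n=2

| \* | 0 | 1 | 2 |
|----|---|---|---|
| 0  | 0 | 0 | 0 |
| 1  | 0 | 1 | 2 |
| 2  | 0 | 2 | 1 |

: The multiplication table for n=3

| \* | 0 | 1 | 2 | 3 |
|----|---|---|---|---|
| 0  | 0 | 0 | 0 | 0 |
| 1  | 0 | 1 | 2 | 3 |
| 2  | 0 | 2 | 0 | 2 |
| 3  | 0 | 3 | 2 | 1 |

: The multiplication table for n=4

| \* | 0 | 1 | 2 | 3 | 4 |
|----|---|---|---|---|---|
| 0  | 0 | 0 | 0 | 0 | 0 |
| 1  | 0 | 1 | 2 | 3 | 4 |
| 2  | 0 | 2 | 4 | 1 | 3 |
| 3  | 0 | 3 | 1 | 4 | 2 |
| 4  | 0 | 4 | 3 | 2 | 1 |

: The multiplication table for n=5

| \* | 0 | 1 | 2 | 3 | 4 | 5 |
|----|---|---|---|---|---|---|
| 0  | 0 | 0 | 0 | 0 | 0 | 0 |
| 1  | 0 | 1 | 2 | 3 | 4 | 5 |
| 2  | 0 | 2 | 4 | 0 | 2 | 4 |
| 3  | 0 | 3 | 0 | 3 | 0 | 3 |
| 4  | 0 | 4 | 2 | 0 | 4 | 2 |
| 5  | 0 | 5 | 4 | 3 | 2 | 1 |

: The multiplication table for n=6

| \* | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
|----|---|---|---|---|---|---|---|
| 0  | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 1  | 0 | 1 | 2 | 3 | 4 | 5 | 6 |
| 2  | 0 | 2 | 4 | 6 | 1 | 3 | 5 |
| 3  | 0 | 3 | 6 | 2 | 5 | 1 | 4 |
| 4  | 0 | 4 | 1 | 5 | 2 | 6 | 3 |
| 5  | 0 | 5 | 3 | 1 | 6 | 4 | 2 |
| 6  | 0 | 6 | 5 | 4 | 3 | 2 | 1 |

: The multiplication table for n=7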
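Tables like these are tedious to fill in by hand, so it can help to generate them. The following is a minimal sketch (the function name `multiplication_table` is ours): entry (i, j) is the canonical representative of the product i*j modulo n.

```python
def multiplication_table(n: int) -> list:
    """Return the n-by-n table whose (i, j) entry is (i * j) mod n."""
    return [[(i * j) % n for j in range(n)] for i in range(n)]
```

Row i of the result lists the products of i with 0, 1, ..., n-1, so an element i has a multiplicative inverse exactly when a 1 appears somewhere in row i.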
What do these tables tell us? Starting with the case n=2, we see that
only x\equiv 1\ (\mathrm{mod}\ 2) has a multiplicative inverse, namely
y\equiv 1\ (\mathrm{mod}\ 2). For n=3, we see that if x\equiv 1\ (\mathrm{mod}\ 3)
then we can take y\equiv 1\ (\mathrm{mod}\ 3) and likewise if
x\equiv 2\ (\mathrm{mod}\ 3) then we can take y\equiv 2\ (\mathrm{mod}\ 3).
Things get a little more complicated for n=4. We see that if
x\equiv 1\ (\mathrm{mod}\ 4) then we take
y\equiv 1\ (\mathrm{mod}\ 4) and if x\equiv 3\ (\mathrm{mod}\ 4) we
take y\equiv 3\ (\mathrm{mod}\ 4). What about
x\equiv 2\ (\mathrm{mod}\ 4)? Looking at the table we see that
x*1\equiv 2\ (\mathrm{mod}\ 4) and x*3\equiv 2\ (\mathrm{mod}\ 4), and
finally x*2\equiv 0\ (\mathrm{mod}\ 4). A disaster! We have that 2
does not have a multiplicative inverse modulo 4. Hence not all
elements of \mathbb{Z}_4 have a multiplicative inverse. A similar
situation occurs for the case n=6, for example, the row for
x\equiv 3\ (\mathrm{mod}\ 6) shows only 3 and 0 can be results.
So our quest of being able to define some notion of division for
\mathbb{Z}_n in general appears to be at an end.
That being said, the situation looks more promising in the cases of
n=5 and n=7. For \mathbb{Z}_5 we have the following
multiplicative inverses for those x\not\equiv 0\ (\mathrm{mod}\ 5).

| $x$ | $x^{-1}$ |
|-----|----------|
| 1   | 1        |
| 2   | 3        |
| 3   | 2        |
| 4   | 4        |

: The elements x\not\equiv 0\ (\mathrm{mod}\ 5) and their respective
multiplicative inverses

Likewise for the elements x\not\equiv 0\ (\mathrm{mod}\ 7) of
\mathbb{Z}_7 we have

| $x$ | $x^{-1}$ |
|-----|----------|
| 1   | 1        |
| 2   | 4        |
| 3   | 5        |
| 4   | 2        |
| 5   | 3        |
| 6   | 6        |

: The elements x\not\equiv 0\ (\mathrm{mod}\ 7) and their respective
multiplicative inverses
We saw similar situations for \mathbb{Z}_2 and \mathbb{Z}_3, so what
do \mathbb{Z}_2, \mathbb{Z}_3, \mathbb{Z}_5, \mathbb{Z}_7 have
in common? The thing they have in common is that the modulus is a prime!
Does this result hold for all primes? If so, why? If not, why not and
what primes does it fail for?
We also saw cases in \mathbb{Z}_4 and \mathbb{Z}_6 where certain
elements did have a multiplicative inverse. For example in
\mathbb{Z}_4 we saw x\equiv 1\ (\mathrm{mod}\ 4) had the
multiplicative inverse of 1, similarly we saw
x\equiv 3\ (\mathrm{mod}\ 4) had a multiplicative inverse of 3. In
\mathbb{Z}_6 we can see that 1 has the inverse of 1, and 5 has
an inverse of 5. So what is special in the case where n is not prime
that allows some elements to have an inverse?
In the case of \mathbb{Z}_4 the elements which had an inverse, 1 and
3, are co-prime to 4. Likewise in \mathbb{Z}_6 the elements that
had inverses were 1 and 5, which are again co-prime to 6. When n
is prime, we make the observation that all non-zero elements of
\mathbb{Z}_n are co-prime to n, for if not then such an element would share a
prime factor with n, which can only be n itself, and no x with 0<x<n is a multiple of n.
It seems we have recovered our original goal, that is to say, it
looks like it is the case that an element x\in\mathbb{Z}_n for
n\geq 2 has a multiplicative inverse if
\mathop{\mathrm{GCD}}\left(x,n\right)=1. In fact, this turns out to be an
if-and-only-if statement.
::: {#prop:NT_modulo_inverse_iff_coprime_with_modulus .proposition} Proposition 127. Existence of inverse element in $\mathbb{Z}_n$
Let n\in\mathbb{Z} with n\geq 2. Let x\in\mathbb{Z}_n. The
multiplicative inverse of x in \mathbb{Z}_n exists if and only if
$\mathop{\mathrm{GCD}}\left(x,n\right)=1$.
Proof:
\left(\Rightarrow\right): Let x\in\mathbb{Z}_n have an inverse
y\in\mathbb{Z}_n. We therefore have that
$$\begin{equation*} xy\equiv 1\ (\mathrm{mod}\ n) \end{equation*}$$
By proposition 122{reference-type="ref" reference="prop:NT_congruent_iff_difference_is_divisible"}, we therefore have that xy=1+kn for
some k\in\mathbb{Z}. Let d=\mathop{\mathrm{GCD}}\left(x,n\right). As
d is the greatest common divisor of x and n we have that d\mid x
and d\mid n so d\mid\left(xy-kn\right). But xy-kn=1 so d\mid 1. As
d\geq 1 we conclude that
d=1.
\left(\Leftarrow\right): Suppose that
\mathop{\mathrm{GCD}}\left(x,n\right)=1. We have by Bézout's Identity
(theorem \ref{thm:NT_bezout_id}) that \exists a,b\in\mathbb{Z} so
that
$$\begin{equation*} ax+bn=1 \end{equation*}$$
As bn\equiv 0\ (\mathrm{mod}\ n), working modulo n we get that ax\equiv 1\ (\mathrm{mod}\ n) and so the class of a is
the inverse element of x in \mathbb{Z}_n.
As required. $\qed$ :::
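The second half of the proof is constructive: a Bézout coefficient of x is an inverse of x modulo n. The sketch below makes this concrete using the extended Euclidean algorithm, which is a standard way to obtain such coefficients but is our choice here rather than something the proof prescribes. All function names are illustrative.

```python
from math import gcd


def extended_gcd(a: int, b: int):
    """Return (g, s, t) with g = GCD(a, b) and s*a + t*b == g, for a, b >= 0."""
    old_r, r = a, b
    old_s, s = 1, 0
    old_t, t = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_s, s = s, old_s - q * s
        old_t, t = t, old_t - q * t
    return old_r, old_s, old_t


def has_inverse(x: int, n: int) -> bool:
    """The criterion of the proposition: x is invertible modulo n iff GCD(x, n) = 1."""
    return gcd(x, n) == 1


def modular_inverse(x: int, n: int):
    """Return y in {0, ..., n-1} with x*y congruent to 1 modulo n, or None if none exists."""
    g, s, _ = extended_gcd(x % n, n)
    if g != 1:
        return None
    return s % n  # s*x + t*n = 1, so s*x is congruent to 1 modulo n
```

For example, `modular_inverse(2, 5)` recovers the entry for 2 in the table of inverses modulo 5, while `modular_inverse(2, 4)` returns `None`, matching the "disaster" observed in \mathbb{Z}_4.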
::: {#cor:NT_all_modulo_inverses_if_n_prime .corollary}
Corollary 10. All non-zero elements of \mathbb{Z}_p have an inverse if p
is prime
Let p be prime. We have that all the non-zero elements of
\mathbb{Z}_p have a multiplicative inverse.
Proof:
By corollary
8{reference-type="ref"
reference="cor:NT_PrimeNotDividing_Integer_implies_coprime"}, if p is
a prime and p\nmid a for some a\in\mathbb{Z} then
\mathop{\mathrm{GCD}}\left(a,p\right)=1. Now suppose that
x\in\mathbb{Z}_p is non-zero, so that its canonical representative satisfies 0<x<p. In particular p\nmid x. As
this is true for every non-zero x\in\mathbb{Z}_p then
\mathop{\mathrm{GCD}}\left(x,p\right)=1 and so each non-zero
x\in\mathbb{Z}_p has a multiplicative inverse by proposition
127{reference-type="ref"
reference="prop:NT_modulo_inverse_iff_coprime_with_modulus"}. $\qed$
:::
We have now recovered a definition of division of the congruence classes
of \mathbb{Z}_n. Now that modular arithmetic is on a solid footing,
what can we use it for? One immediate use case is solving problems about
divisibility.
::: example
Example 135. We will show that
6\mid a\left(a+1\right)\left(a+2\right) for every integer a. We
observe that the possible residues of a modulo 6 are 0, 1, 2,
3, 4 and 5. It is enough to check that for each residue the product is congruent to zero
modulo 6.
When a\equiv 0\ (\mathrm{mod}\ 6) we see that a+1\equiv 1\ (\mathrm{mod}\ 6)
and a+2\equiv 2\ (\mathrm{mod}\ 6). So that
$$\begin{equation*} a\left(a+1\right)\left(a+2\right)\equiv 0*1*2 \equiv 0\ (\mathrm{mod}\ 6) \end{equation*}$$
Now, when a\equiv 1\ (\mathrm{mod}\ 6) we see that
a+1\equiv 2\ (\mathrm{mod}\ 6) and a+2\equiv 3\ (\mathrm{mod}\ 6),
giving
$$\begin{equation*} a\left(a+1\right)\left(a+2\right)\equiv 1*2*3 \equiv 0\ (\mathrm{mod}\ 6) \end{equation*}$$
as 1*2*3=6, which is congruent to zero modulo 6. For the remaining residues we see, with
an abuse of notation for brevity, that

| $a$ | $a\ (\mathrm{mod}\ 6)$ | $\left(a+1\right)\ (\mathrm{mod}\ 6)$ | $\left(a+2\right)\ (\mathrm{mod}\ 6)$ | $a\left(a+1\right)\left(a+2\right)\ (\mathrm{mod}\ 6)$ |
|-----|-----|-----|-----|-----|
| $2$ | $2$ | $3$ | $4$ | $24\equiv 0$ |
| $3$ | $3$ | $4$ | $5$ | $60\equiv 0$ |
| $4$ | $4$ | $5$ | $0$ | $0\equiv 0$ |
| $5$ | $5$ | $0$ | $1$ | $0\equiv 0$ |

: The residues of a\ (\mathrm{mod}\ 6) for a\geq 2, the values of
each term and their resultant multiplication modulo $6$

We can see that the product is congruent to zero modulo 6 for every residue, so
a\left(a+1\right)\left(a+2\right)\equiv 0\ (\mathrm{mod}\ 6) for every integer a, which
implies 6\mid a\left(a+1\right)\left(a+2\right).
:::
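The case analysis in this example is mechanical, so it is a natural candidate for a quick computational check. This sketch (with a function name of our own) evaluates the product for each residue modulo 6; by the compatibility of multiplication and addition with congruences, checking the residues is enough.

```python
def product_residues(modulus: int = 6) -> dict:
    """Map each residue r to r*(r+1)*(r+2) reduced modulo `modulus`."""
    return {r: (r * (r + 1) * (r + 2)) % modulus for r in range(modulus)}
```

Every value in `product_residues()` should be zero, which is the content of the table above.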
The astute reader may notice that this feels longer than a proof that
uses only the definition of divisibility. The astute reader would be
correct. In fact we have that 6\mid m for some m\in\mathbb{Z} if and
only if 2\mid m and 3\mid m.
Indeed, suppose that 6\mid m then m=6n for some n\in\mathbb{Z},
moreover 6=2*3 so m=2*3*n which implies that 2\mid m and
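3\mid m. Conversely, if 2\mid m and 3\mid m then 6 clearly
divides m as 2 and 3 will each appear at least once in the prime
factorisation of m. For a product of three consecutive integers such as a\left(a+1\right)\left(a+2\right), at least one factor is even and exactly one is a multiple of 3, which gives the shorter proof. So why did we bother with congruences? By first
doing the longer calculations and then the shorter proof, we have seen a
hint at a possible generalisation to the theory!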
That is, if n\in\mathbb{Z}
with n>0 has the prime factorisation
$$\begin{equation*} n=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k} \end{equation*}$$
then we might expect that a\equiv b\ (\mathrm{mod}\ n) if and only if
a\equiv b\ (\mathrm{mod}\ p_i^{e_i}) for each i=1,2,\dots, k. We can
prove this.
::: proposition Proposition 128. Congruent if and only if congruent modulo each prime power in factorisation
Let n\in\mathbb{Z} so that n>0 and n has a prime factorisation
given by
$$\begin{equation*} n=p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k} \end{equation*}$$
where the primes p_1,p_2,\dots,p_k are distinct. Let a,b\in\mathbb{Z}. We have that a\equiv b\ (\mathrm{mod}\ n) if
and only if a\equiv b\ (\mathrm{mod}\ p_i^{e_i}) for each i where
$i=1,2,\dots, k$.
Proof:
Let n\in\mathbb{Z} be as given in the hypothesis. We have that
$$\begin{align*} a\equiv b\ (\mathrm{mod}\ n) &\iff n\mid\left(a-b\right)\\ &\iff p_1^{e_1}p_2^{e_2}p_3^{e_3}\dots p_k^{e_k}\mid\left(a-b\right)\\ &\iff p_i^{e_i}\mid\left(a-b\right),\ \text{for each } i=1,2,\dots, k\\ &\iff a\equiv b\ (\mathrm{mod}\ p_i^{e_i}),\ \text{for each } i=1,2,\dots, k \end{align*}$$
The third equivalence holds because the prime powers p_i^{e_i} are pairwise co-prime, so their product divides a-b exactly when each of them does.
As required. $\qed$ :::
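The proposition can be explored numerically. The sketch below factors n into prime powers by trial division (adequate for small n, and our choice of method) and compares the single congruence modulo n with the family of congruences modulo each prime power. The function names are illustrative only.

```python
def prime_power_factors(n: int) -> list:
    """Return the prime powers p**e in the factorisation of n > 0, by trial division."""
    factors = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            power = 1
            while n % p == 0:
                n //= p
                power *= p
            factors.append(power)
        p += 1
    if n > 1:
        factors.append(n)  # the remaining factor is a prime
    return factors


def congruent(a: int, b: int, n: int) -> bool:
    """a is congruent to b modulo n, tested as n dividing a - b."""
    return (a - b) % n == 0


def congruent_mod_each_prime_power(a: int, b: int, n: int) -> bool:
    """a is congruent to b modulo every prime power in the factorisation of n."""
    return all(congruent(a, b, q) for q in prime_power_factors(n))
```

For n=12 the prime powers are 4 and 3, so `congruent(a, b, 12)` and `congruent_mod_each_prime_power(a, b, 12)` should agree for every pair of integers a and b.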
Another use of congruences is in cryptography, the field of study concerned with taking messages and encoding (obfuscating) them in such a way that only the person the message was intended for can read them. Congruences are especially important for the RSA encryption method. We already have some of the mathematical machinery required to explore how this method of cryptography works, namely prime numbers and congruences. On the other hand, we still lack some important theory. If cryptography is the field of encoding messages so that only the intended recipient can read them, then there must be some method that encodes the message and a method that decodes the message using some information known to both the sender and recipient. This means that using this information the recipient will have some method of finding out the original message! We look at this idea in more detail.
Diophantine equations and Polynomials
::: epigraph I had a Polynomial once. My Doctor removed it.
Micheal Grant :::
We start with a definition that we have seen numerous times so far but have not formally defined. That of an equation.
::: definition Definition 165. Equation
An equation is a mathematical statement that states that two expressions are equal. :::
This seems simple enough, but what does it mean? Unfortunately, this depends on the situation, different situations will have a different meaning of what a statement is. Thankfully, we have seen equations already throughout the text so this abstract definition is familiar to us. For example
$$\begin{equation*} 1+1=2 \end{equation*}$$
is an equation. So is \mathop{\mathrm{GCD}}\left(a,b\right)=d. In a
similar vein, we have from Bézout's Identity that d=ax+by is also an
equation. So why define something if it is really this simple? Simply
put, we can use the idea of an equation in a more complex way. For
example,
$$\begin{equation*} 1+x=2 \end{equation*}$$
says that 1 plus x is equal to 2 but we don't know what x is.
However, we can see that
$$\begin{align*} 1+x&=2\\ x&=1 \end{align*}$$
That is, we see that x=1, this is an equation! This is where the power
of an equation starts to show its worth. If we have a problem where we
don't know the value of some quantity of interest, we might be able to
work out what that quantity is. We have seen more complex examples of
equations, for example x^2=2 which we have shown has no value of
x\in\mathbb{Q} where it is true.
Hence, equations that contain a value, or maybe multiple values that we don't know but want to know, are important. This section is focused on looking at such equations. We make another couple of definitions for when an equation contains a value we don't know.
::: definition Definition 166. Variable
A variable is a value that is allowed to be changed either freely or
restricted by some constraint or equation. A variable can be taken to be
any meaningful value, either inside or outside of some set S. The
context of the statement under study usually makes it clear where the
variable belongs.
:::
::: definition Definition 167. Indeterminate variable
An indeterminate variable is a variable value which has not been
specified. As with a variable, it could be inside or outside of some set
S.
:::
::: definition Definition 168. An unknown variable
An unknown variable, or simply an unknown, is a variable whose value is
unknown but we wish to find its value. As before, this unknown variable
is to be taken as a member of a set S. If a value for the unknown
variable can be found, we call it a solution to the equation.
:::
For example, the equation 5x+1=2 would have x as the indeterminate
variable, if we were solving for x then x would be the unknown
variable as well. The equation 2x+5y=6 has two indeterminate
variables, x and y. We can potentially have many indeterminate
variables in an equation. Moreover, in many problems, we will have a
certain type of variable whose value can vary but is not the unknown
that we are looking to solve for. We define this type of variable as
well.
::: definition Definition 169. Coefficient
A variable which can vary but is not the variable that is being solved for is called a coefficient, or a parameter of the equation. :::
So, let's start simply and consider the simplest equation possible with one unknown variable and two coefficients.
$$\begin{equation*} x+a=b \end{equation*}$$
This is simple to solve for the unknown x, simply take a from both
sides to give x=b-a. So for example if we let a,b\in\mathbb{Z} say
with a=5 and b=3, then we see that x\in\mathbb{Z} with x=3-5=-2.
This is also true if we take a,b\in\mathbb{Q}. A more complex form of
the above equation is
$$\begin{equation*} ax+b=c \end{equation*}$$
Now we hit a problem if we are looking for a solution x\in\mathbb{Z}.
Firstly, we have that ax=c-b, but then a solution x\in\mathbb{Z} can
occur if and only if a\mid\left(c-b\right). If we look for a solution
where x\in\mathbb{Q} and a\neq 0 then no such problem occurs. Therefore, the set
that we are looking for solutions in is crucial in solving equations.
With our current theory, the situation gets more hopeless the more
complicated the equation becomes. For example, if we consider the
equation
$$\begin{equation*} 4x^2+2x+3=0 \end{equation*}$$
Does this equation have solutions in \mathbb{Z}? How about
\mathbb{Q}? Additionally, what happens if we have more than one
equation or unknowns? For example, consider the two equations given by
$$\begin{align*} 4x+2y&=6\\ -2x+5y&=7 \end{align*}$$
How do we solve equations like this? This section aims to answer questions like these. We make a final definition, a special case for when we only seek integer solutions.
::: definition Definition 170. Diophantine equation
An equation for which the solutions have to be integers is called a Diophantine equation15 . :::
Linear Diophantine equations
Linear equations with two variables
We start where the previous section left off, by looking at the simplest type of equation that can be solved.
::: definition Definition 171. Linear equation of a single indeterminate variable
Let S be a set. We say an equation is a linear equation in a single
variable x if it has the form
$$\begin{equation*} ax+b=c \end{equation*}$$
for some coefficients a,b,c\in S and an indeterminate variable x.
In particular as this equation only has one indeterminate variable we
say it is a single-variable linear equation.
:::
We have already seen that solutions to this equation exist in
\mathbb{Z} if and only if a\mid\left(c-b\right), and a solution
always exists if we want x\in\mathbb{Q}. Things are a bit more
interesting if we introduce a second variable.
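The difference between solving in \mathbb{Z} and in \mathbb{Q} can be illustrated with a short sketch. The function names below are hypothetical, and we assume a\neq 0 throughout.

```python
from fractions import Fraction
from typing import Optional


def solve_in_integers(a: int, b: int, c: int) -> Optional[int]:
    """Solve a*x + b = c for an integer x; return None if no integer works.

    Assumes a != 0. A solution exists exactly when a divides c - b.
    """
    if (c - b) % a != 0:
        return None
    return (c - b) // a


def solve_in_rationals(a: int, b: int, c: int) -> Fraction:
    """Solve a*x + b = c for a rational x. Assumes a != 0."""
    return Fraction(c - b, a)
```

For instance, 5x+1=2 has no integer solution since 5 does not divide 1, so `solve_in_integers(5, 1, 2)` returns `None`, whereas `solve_in_rationals(5, 1, 2)` returns the fraction 1/5.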
::: definition Definition 172. Linear equation of two indeterminate variables
Let S be a set. We say an equation is a linear equation in two
variables x,y if it has the form
$$\begin{equation*} ax+by=c \end{equation*}$$
for some coefficients a,b,c\in S and indeterminate variables x and
y.
:::
We have seen this type of equation before, in Bézout's Identity (Theorem
36{reference-type="ref"
reference="thm:NT_bezout_id"}). In Bézout's Identity, we have that the
greatest common divisor, d, of two integers a,b can be expressed as
$$\begin{equation*} ax+by=d \end{equation*}$$
for some x,y\in\mathbb{Z}. This gives us examples of already solved
equations, but what about the other way? Given an equation of the form
$$\begin{equation*} ax+by=c \end{equation*}$$
with a,b,c\in\mathbb{Z} given, can we find integer values for x and
y? That is, we are considering ax+by=c to be a Diophantine
equation. If the reader is sufficiently alert, they will notice that by
mentioning Bézout's Identity we are hinting that it will be crucial to
finding the solutions.
We know of one solution, namely if
\mathop{\mathrm{GCD}}\left(a,b\right)=d and c=d then the solution is
found by the Euclidean algorithm. Now if c were a multiple of d can
we find solutions? Recall proposition
108{reference-type="ref"
reference="prop:NT_GCD_properties"} part 4. We have that
\mathop{\mathrm{GCD}}\left(a,b\right)=d is the smallest positive integer of the form
ax+by. Given that d is the smallest such integer, we can show that
the other integers of this form are precisely the multiples of d.
::: {#prop:NT_bezout_extension .proposition}
Proposition 129. Integer has form ax+by if and only if it is a multiple of
the greatest common divisor of a and $b$
Let a,b\in\mathbb{Z} and d=\mathop{\mathrm{GCD}}\left(a,b\right).
Let c\in\mathbb{Z}. We have that there exist x,y\in\mathbb{Z} so that
$$\begin{equation*} c=ax+by \end{equation*}$$
if and only if d\mid c, which is to say c is a multiple of d.
Proof:
\left(\Rightarrow\right): Clearly if c=ax+by then as
d=\mathop{\mathrm{GCD}}\left(a,b\right) we have by proposition
102{reference-type="ref"
reference="prop:NT_divisibility_properties"} part 3 that d\mid c.
\left(\Leftarrow\right): Suppose that c=de for some
e\in\mathbb{Z}. By Bézout's Identity, we have that
\exists u,v\in\mathbb{Z} so that
$$\begin{equation*} d=au+bv \end{equation*}$$
where d=\mathop{\mathrm{GCD}}\left(a,b\right). Multiplying both sides
by e we get
$$\begin{equation*} c=aue+bve=ax+by \end{equation*}$$
Hence x=ue and y=ve.
As required. $\qed$ :::
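The proof is constructive, so it can be turned into a computation. The following is a minimal sketch, assuming the usual iterative form of the extended Euclidean algorithm to obtain the Bézout coefficients u and v; the function names are ours.

```python
from typing import Optional, Tuple


def extended_gcd(a: int, b: int) -> Tuple[int, int, int]:
    """Return (d, u, v) with d = GCD(a, b) >= 0 and d = a*u + b*v."""
    old_r, r = a, b
    old_u, u = 1, 0
    old_v, v = 0, 1
    # Invariant: old_r = a*old_u + b*old_v and r = a*u + b*v.
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_u, u = u, old_u - q * u
        old_v, v = v, old_v - q * v
    if old_r < 0:  # normalise so that the divisor is non-negative
        old_r, old_u, old_v = -old_r, -old_u, -old_v
    return old_r, old_u, old_v


def represent(a: int, b: int, c: int) -> Optional[Tuple[int, int]]:
    """Return integers (x, y) with a*x + b*y = c, or None if d does not divide c."""
    d, u, v = extended_gcd(a, b)
    if d == 0:  # only happens when a = b = 0
        return (0, 0) if c == 0 else None
    if c % d != 0:
        return None
    e = c // d  # c = d*e, so multiply d = a*u + b*v through by e
    return u * e, v * e
```

With a=12 and b=42 we have d=6, so `represent(12, 42, 30)` returns a pair x,y with 12x+42y=30, while `represent(12, 42, 5)` returns `None`.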
Armed with this proposition we can find the solutions to the Diophantine
equation ax+by=c.
::: {#prop:NT_solutions_to_two_var_linear_diophantine_equation .proposition} Proposition 130. Solutions to the Diophantine equation $ax+by=c$
Let a,b,c\in\mathbb{Z} be such that
$$\begin{equation*} ax+by=c \end{equation*}$$
for the indeterminate variables x,y and let
d=\mathop{\mathrm{GCD}}\left(a,b\right). We have that there are
solutions so that x,y\in\mathbb{Z} if and only if d\mid c.
Moreover, if d\mid c then there are infinitely many solutions, and every solution is given by
$$\begin{align*} x&=x_0+\frac{bn}{d}\\ y&=y_0-\frac{an}{d} \end{align*}$$
for some n\in\mathbb{Z}, where x_0,y_0\in\mathbb{Z} is one particular solution.
Proof:
The existence of a solution is given by proposition
129{reference-type="ref"
reference="prop:NT_bezout_extension"}. It is left to show that the
suggested solutions x,y are solutions and that there are infinitely
many solutions. This follows the argument in example
118{reference-type="ref"
reference="exam:NT_solutions_to_ax_plus_by"}. We give the argument again
to refresh the reader's memory.
Let x_0,y_0\in\mathbb{Z} be a solution, then we have that
$$\begin{equation*} ax_0+by_0=c \end{equation*}$$
For any n\in\mathbb{Z} let
$$\begin{align*} x&=x_0+\frac{bn}{d}\\ y&=y_0-\frac{an}{d} \end{align*}$$
We then have that \displaystyle\frac{bn}{d}\in\mathbb{Z} as d\mid b
by definition of the greatest common divisor, likewise for
\displaystyle\frac{an}{d}. Hence, we have that
$$\begin{align*} ax+by&=a\left(x_0+\frac{bn}{d}\right)+b\left(y_0-\frac{an}{d}\right)\\ &=ax_0+a\frac{bn}{d}+by_0-b\frac{an}{d}\\ &=ax_0+\frac{abn}{d}+by_0-\frac{abn}{d}\\ &=ax_0+by_0=c \end{align*}$$
Hence x,y is a solution. Moreover, as n\in\mathbb{Z} is any integer
we have shown that there are infinitely many solutions. It is left to
show that these are the only solutions.
Let x,y\in\mathbb{Z} be any solution to ax+by=c, and let
x_0,y_0\in\mathbb{Z} be a particular solution. Hence
$$\begin{equation*} ax+by=ax_0+by_0 \end{equation*}$$
Subtracting ax_0+by_0 from both sides gives
$$\begin{align*} ax+by-ax_0-by_0&=0\\ a\left(x-x_0\right)+b\left(y-y_0\right)&=0 \end{align*}$$
Now, as d=\mathop{\mathrm{GCD}}\left(a,b\right) then we have that
d\mid a and d\mid b so that
$$\begin{align*} \frac{a}{d}\left(x-x_0\right)+\frac{b}{d}\left(y-y_0\right)&=0\\ \frac{a}{d}\left(x-x_0\right)&=-\frac{b}{d}\left(y-y_0\right) \end{align*}$$
If a=b=0, we are done so suppose not. Then one of a or b is
non-zero. Without loss of generality, suppose that a\neq 0. We have
that by proposition 108{reference-type="ref"
reference="prop:NT_GCD_properties"} that if
\mathop{\mathrm{GCD}}\left(a,b\right)=d then
\displaystyle\mathop{\mathrm{GCD}}\left(\frac{a}{d},\frac{b}{d}\right)=1,
moreover by definition of co-prime integers we have that
\displaystyle\frac{a}{d} and \displaystyle\frac{b}{d} are co-prime.
By Euclid's lemma for co-primes (lemma
11{reference-type="ref"
reference="lem:NT_Euclid_co_primes"}) we have that
\displaystyle\frac{a}{d} \mid-\left(y-y_0\right). Hence there is some
n\in\mathbb{Z} so that
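$$\begin{equation*} -\left(y-y_0\right)=n\frac{a}{d} \end{equation*}$$
Which is to say
$$\begin{equation*} y=y_0-\frac{an}{d} \end{equation*}$$
Substituting this back gives
\displaystyle\frac{a}{d}\left(x-x_0\right)=\frac{b}{d}\cdot\frac{an}{d},
and as a\neq 0 we can cancel \displaystyle\frac{a}{d} to get
$$\begin{equation*} x=x_0+\frac{bn}{d} \end{equation*}$$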
As required. $\qed$ :::
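To see the family of solutions concretely, here is a small sketch. It assumes the illustrative equation 6x+10y=8 with the particular solution x_0=-2, y_0=2 (found by inspection); the function name `solution` is ours.

```python
from math import gcd
from typing import Tuple


def solution(a: int, b: int, x0: int, y0: int, n: int) -> Tuple[int, int]:
    """n-th member of the family x = x0 + b*n/d, y = y0 - a*n/d, d = GCD(a, b).

    Assumes a and b are not both zero, so that d != 0.
    """
    d = gcd(a, b)
    return x0 + (b // d) * n, y0 - (a // d) * n


# Illustrative example (an assumption): 6x + 10y = 8 with (x0, y0) = (-2, 2).
a, b, c = 6, 10, 8
x0, y0 = -2, 2
family = [solution(a, b, x0, y0, n) for n in range(-3, 4)]
```

Here d=2, so consecutive solutions differ by 5 in x and by -3 in y; for example n=1 gives x=3, y=-1, and indeed 6\cdot 3+10\cdot\left(-1\right)=8.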
Linear equations with more than two variables
A natural question to ask now is what happens when we have more than two
indeterminate variables? For example ax+by+cz=e? We can take some
inspiration from the two variable case.
Recall that for ax+by=c with d=\mathop{\mathrm{GCD}}\left(a,b\right)
that there are solutions with x,y\in\mathbb{Z} if and only if
d\mid c. More importantly, we have that if
d=\mathop{\mathrm{GCD}}\left(a,b\right) then we can express d by
d=ax+by for some x,y\in\mathbb{Z} by Bézout's Identity. Moreover by
proposition
103{reference-type="ref"
reference="prop:NT_Divisor_dividing_all_in_set_divides_linear_combination"}
we have that for a set of n integers
S=\left\{b_1,b_2,b_3,\dots,b_n\right\}, if
a\mid b_i for each b_i\in S then, for any integers x_1,x_2,\dots,x_n,
$$\begin{equation*} a\mid\sum_{i=1}^n b_i x_i \end{equation*}$$
This hints at an extension to Bézout's Identity, given a suitable extension to the definition of the greatest common divisor for more than two inputs. Hence, our goal is to build this suitable extension to the greatest common divisor. We will start by looking at some exploratory examples before moving on with the generalisation.
::: example
Example 136. Let a=2, b=4 and c=6. What is
\mathop{\mathrm{GCD}}\left(a,b,c\right)? Clearly, by inspection, we
have that 2 is the largest divisor of a,b and c. In particular we
have that \mathop{\mathrm{GCD}}\left(2,4\right)=2 and
\mathop{\mathrm{GCD}}\left(2,6\right)=2. In other words, we have that
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(2,4,6\right)=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(2,4\right),6\right) \end{equation*}$$
Equivalently, we could have first considered
\mathop{\mathrm{GCD}}\left(4,6\right)=2 and then
\mathop{\mathrm{GCD}}\left(2,2\right)=2 so we have
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(2,4,6\right)=\mathop{\mathrm{GCD}}\left(2,\mathop{\mathrm{GCD}}\left(4,6\right)\right) \end{equation*}$$ :::
::: example
Example 137. Let a=3, b=6 and c=30. What is
\mathop{\mathrm{GCD}}\left(a,b,c\right)? Breaking this problem down we
have that \mathop{\mathrm{GCD}}\left(3,6\right)=3,
\mathop{\mathrm{GCD}}\left(3,30\right)=3 and
\mathop{\mathrm{GCD}}\left(6,30\right)=6. As the greatest common
divisor must divide all of the numbers we must conclude that
\mathop{\mathrm{GCD}}\left(3,6,30\right)=3.
:::
::: example
Example 138. Let a=3, b=5 and c=7. As a,b and c are all
prime we clearly see that $\mathop{\mathrm{GCD}}\left(a,b,c\right)=1$
:::
::: example
Example 139. Let a=14, b=35, c=7 and d=5. We again break
this down. We see that
$$\begin{align*} \mathop{\mathrm{GCD}}\left(14,35\right)&=7\\ \mathop{\mathrm{GCD}}\left(14,7\right)&=7\\ \mathop{\mathrm{GCD}}\left(14,5\right)&=1\\ \mathop{\mathrm{GCD}}\left(35,7\right)&=7\\ \mathop{\mathrm{GCD}}\left(35,5\right)&=5\\ \mathop{\mathrm{GCD}}\left(7,5\right)&=1 \end{align*}$$
Again the greatest common divisor must divide all of the inputs
a,b,c and d, and so it must divide each of the pairwise greatest
common divisors above. As one of these is 1 we have
\mathop{\mathrm{GCD}}\left(14,35,7,5\right)=1.
:::
In these examples, we made use of the fact that the greatest common
divisor of two numbers is the largest number that divides both of the
input numbers. We then looked at all of the possible pairs of the
inputs and found the largest value that divides every pairwise result. This is to be
consistent with the two variable version of the \mathop{\mathrm{GCD}} that
we have already developed. This was shown explicitly in the first
example with
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(2,4,6\right)=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(2,4\right),6\right)=\mathop{\mathrm{GCD}}\left(2,\mathop{\mathrm{GCD}}\left(4,6\right)\right) \end{equation*}$$
Hence an immediate property that we can deduce is that the
\mathop{\mathrm{GCD}} is associative, in the sense that computing the
\mathop{\mathrm{GCD}} of three numbers is equivalent to computing the
\mathop{\mathrm{GCD}} of two of the inputs with the remaining input.
::: proposition
Proposition 131. \mathop{\mathrm{GCD}} is associative
Let a,b,c\in\mathbb{Z}. We have that
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(a,\mathop{\mathrm{GCD}}\left(b,c\right)\right)=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a,b\right),c\right) \end{equation*}$$
Proof:
Let
x=\mathop{\mathrm{GCD}}\left(a,\mathop{\mathrm{GCD}}\left(b,c\right)\right)
and
y=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a,b\right),c\right).
We need to show that x\mid y and y\mid x, from which we can conclude that
x=y.
As
x=\mathop{\mathrm{GCD}}\left(a,\mathop{\mathrm{GCD}}\left(b,c\right)\right)
then by definition of the greatest common divisor, we have that
x\mid a and x\mid\mathop{\mathrm{GCD}}\left(b,c\right). Moreover as
x\mid\mathop{\mathrm{GCD}}\left(b,c\right) then again by definition of
the greatest common divisor we have that x\mid b and x\mid c.
As x\mid a and x\mid b then
x\mid\mathop{\mathrm{GCD}}\left(a,b\right) and likewise x\mid c so
x\mid\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a,b\right),c\right)
by definition and so x\mid y. The proof that y\mid x is similar.
As x\mid y and y\mid x and x>0 and y>0 we conclude that x=y
as required. $\qed$
:::
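The examples and the associativity property above can be checked with a short sketch. It assumes Python's standard `math.gcd` as the two-input greatest common divisor; the helper name `gcd_many` is ours.

```python
from functools import reduce
from itertools import product
from math import gcd


def gcd_many(*values: int) -> int:
    """Fold the two-input GCD from left to right over one or more inputs."""
    return reduce(gcd, values)


# The inputs of Examples 136 to 139.
examples = {
    (2, 4, 6): gcd_many(2, 4, 6),
    (3, 6, 30): gcd_many(3, 6, 30),
    (3, 5, 7): gcd_many(3, 5, 7),
    (14, 35, 7, 5): gcd_many(14, 35, 7, 5),
}

# Brute-force check of GCD(a, GCD(b, c)) = GCD(GCD(a, b), c) on a small cube.
associative = all(
    gcd(a, gcd(b, c)) == gcd(gcd(a, b), c)
    for a, b, c in product(range(-12, 13), repeat=3)
)
```

The dictionary `examples` reproduces the values 2, 3, 1 and 1 found by hand, and `associative` evaluates to `True` on the tested range.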
To extend our definition of the greatest common divisor to more than two
inputs, we will use the definition of the \mathop{\mathrm{GCD}} given
by the decomposition of primes. That is to say, given
a,b\in\mathbb{Z}, we know that there exists a set of primes
$$\begin{equation*} T=\left\{t_1,t_2,\dots,t_v\right\} \end{equation*}$$
So that a and b can be represented by a prime factorisation of
primes t_i\in T. That is
$$\begin{align*} a&=\prod_{i=1}^v t_i^{e_i}\\ b&=\prod_{i=1}^v t_i^{f_i} \end{align*}$$
We then have that the greatest common divisor is given by
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(a,b\right)=t_1^{\min\left(e_1,f_1\right)}t_2^{\min\left(e_2,f_2\right)}t_3^{\min\left(e_3,f_3\right)}\dots t_v^{\min\left(e_v,f_v\right)} \end{equation*}$$
Firstly, we will extend the result of proposition
115{reference-type="ref"
reference="prop:NT_express_primes_in_common_basis"} to the case of n
integers, the proof is similar to proposition
115{reference-type="ref"
reference="prop:NT_express_primes_in_common_basis"}.
::: {#prop:NT_General_express_primes_in_common_basis .proposition} Proposition 132. Expression of set of integers as powers of same primes
Let S=\left\{a_1,a_2,a_3,\dots,a_n\right\} be such that
a_i\in\mathbb{Z} and a_i>2 for 1\leq i\leq n. For each a_i let
its prime factorisation be denoted by
$$\begin{equation*} \mathlarger{a_i=\prod_{\substack{p_{\left(i,k\right)}\mid a_i \\ p_{\left(i,k\right)}\text{ is prime}}} p_{\left(i,k\right)}^{e_{\left(i,k\right)}}} \end{equation*}$$
where \left(i,k\right) is an index tuple with i denoting which of the
integers a_i is being factorised and k denoting the $k$-th element of $a_i$'s prime
factorisation. Then there exists a set of primes
$$\begin{equation*} T=\left\{t_1,t_2,t_3,\dots,t_v\right\} \end{equation*}$$
with t_1<t_2<t_3<\dots <t_v so that
$$\begin{equation*} \mathlarger{a_i=\prod_{j=1}^v t_{j}^{f_{\left(i,j\right)}}} \end{equation*}$$
for each 1\leq i\leq n.
Proof:
Let each a_i be as given. That is,
$$\begin{equation*} \mathlarger{a_i=\prod_{\substack{p_{\left(i,k\right)}\mid a_i \\ p_{\left(i,k\right)}\text{ is prime}}} p_{\left(i,k\right)}^{e_{\left(i,k\right)}}} \end{equation*}$$
Let
A_i=\left\{p_{\left(i,k\right)} : p_{\left(i,k\right)} \text{ appears in the prime factorisation of } a_i\right\},
that is each A_i denotes the set of the prime factors that appear in
a_i. We can therefore take T to be
$$\begin{equation*} T=\bigcup_{i=1}^n A_i \end{equation*}$$
so that
$$\begin{equation*} T=\left\{t_1,t_2,t_3,\dots,t_v\right\} \end{equation*}$$
where \displaystyle v\leq \sum_{i=1}^n \left|A_i\right|. It is now
left to show that we can pick the primes in the factorisations of the
a_i from T. Define the mapping \iota_{A_i} by
$$\begin{align*} \iota_{A_i}:A_i&\rightarrow T\\ x&\mapsto\iota_{A_i}\left(x\right)=x \end{align*}$$
We have that \iota_{A_i} maps the elements of A_i to the same
element in T. Therefore, we have for each a_i, writing k=\left|A_i\right|, that
$$\begin{align*} a_i&=\prod_{j=1}^k p_{\left(i,j\right)}^{e_{\left(i,j\right)}}\\ &=\prod_{j=1}^k \iota_{A_i}\left(p_{\left(i,j\right)}\right)^{e_{\left(i,j\right)}}\\ &=\prod_{p_{\left(i,j\right)}\in A_i} p_{\left(i,j\right)}^{e_{\left(i,j\right)}}\\ &=\prod_{p_{\left(i,j\right)}\in A_i} p_{\left(i,j\right)}^{e_{\left(i,j\right)}}\prod_{t_l\in T\setminus A_i} t_l^0\\ &=\prod_{t_l\in T} t_l^{g_l}, \text{ where } g_l =\begin{cases} e_{\left(i,j\right)}, &\text{If } t_l=p_{\left(i,j\right)}\\ 0, &\text{If } t_l\not\in A_i \end{cases}\\ &=t_1^{g_1}t_2^{g_2}t_3^{g_3}\dots t_v^{g_v} \end{align*}$$
Which expresses a_i in terms of the primes in T as required.
$\qed$
:::
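The construction of the common set of primes T can be sketched in code. The following assumes a simple trial-division factorisation, which is our choice and not a method from the text; the names `factorise` and `common_basis` are hypothetical.

```python
from typing import Dict, List, Tuple


def factorise(n: int) -> Dict[int, int]:
    """Trial-division prime factorisation of n >= 2 as {prime: exponent}."""
    factors: Dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:  # whatever remains is itself prime
        factors[n] = factors.get(n, 0) + 1
    return factors


def common_basis(values: List[int]) -> Tuple[List[int], List[List[int]]]:
    """Return the sorted primes T and, for each value, its exponents over T."""
    factorisations = [factorise(a) for a in values]
    # T is the union of the sets A_i of primes appearing in each a_i.
    T = sorted(set().union(*factorisations))
    # A prime of T that does not divide a_i receives exponent 0.
    exponents = [[f.get(t, 0) for t in T] for f in factorisations]
    return T, exponents
```

For the integers 12, 45 and 50 this gives T=\left\{2,3,5\right\} with exponent lists (2,1,0), (0,2,1) and (1,0,2), matching 12=2^2\cdot 3, 45=3^2\cdot 5 and 50=2\cdot 5^2.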
The final ingredient required before we can extend the
\mathop{\mathrm{GCD}} is to extend the minimum function to multiple
inputs. This is a straightforward extension.
::: definition Definition 173. General minimum function for integers
Let S=\left(a_1,a_2,a_3,\dots,a_n\right)\in\mathbb{Z}^n be a
$n$-tuple of integers. We define the minimum function on S by
$$\begin{align*} \min:\mathbb{Z}^n&\rightarrow\mathbb{Z}\\ S&\mapsto\min\left(S\right)=\begin{cases} a_1, &\text{If } n=1\\ \min\left(a_1,a_2\right), &\text{If } n=2\\ \min\left(\min\left(a_1,a_2,a_3,\dots,a_{n-1}\right),a_n\right), &\text{If } n\geq 3 \end{cases} \end{align*}$$ :::
We need to show that this is well-defined.
::: {#prop:NT_general_min_on_integers_is_well_defined .proposition} Proposition 133. General minimum function for the integers is well-defined
Let S=\left(a_1,a_2,a_3,\dots,a_n\right)\in\mathbb{Z}^n be a
$n$-tuple of integers. We have that \min\left(S\right) is
well-defined.
Proof:
We argue by induction on n. The base case is n=1 for which the
result is trivial, likewise the case n=2 is trivial. So suppose the
result holds for some k\geq 2, then we have that
$$\begin{equation*} \min\left(\min\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_k\right) \end{equation*}$$
is well-defined. We show that
$$\begin{equation*} \min\left(\min\left(a_1,a_2,a_3,\dots,a_{k}\right),a_{k+1}\right) \end{equation*}$$
is well-defined. Evaluating the inner
\min\left(a_1,a_2,a_3,\dots,a_{k}\right) we have by definition that
$$\begin{equation*} \min\left(a_1,a_2,a_3,\dots,a_{k}\right)=\min\left(\min\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_k\right) \end{equation*}$$
Which by hypothesis is well-defined. Hence
\min\left(a_1,a_2,a_3,\dots,a_{k}\right)=m for some m\in\mathbb{Z}.
Hence we have that
$$\begin{equation*} \min\left(\min\left(a_1,a_2,a_3,\dots,a_{k}\right),a_{k+1}\right)=\min\left(m,a_{k+1}\right) \end{equation*}$$
Which is well-defined. Hence by induction, we have that the general minimum function on the integers is well-defined. $\qed$ :::
We also have the following proposition.
::: {#prop:NT_general_min_function_on_integers_is_associative .proposition} Proposition 134. The general minimum function is associative
Let S=\left(a_1,a_2,a_3,\dots,a_n\right)\in\mathbb{Z}^n be a
$n$-tuple of integers. We have that
$$\begin{equation*} \min\left(\min\left(a_1,a_2,a_3,\dots,a_{n-1}\right),a_n\right)=\min\left(a_1,\min\left(a_2,a_3,\dots,a_{n-1},a_n\right)\right) \end{equation*}$$
Proof:
We argue by induction on n. The case n=1 has nothing to prove.
Likewise for n=2, so we shall show it holds for n=3. That is
$$\begin{equation*} \min\left(\min\left(a_1,a_2\right),a_3\right)=\min\left(a_1,\min\left(a_2,a_3\right)\right) \end{equation*}$$
There are 6 cases to consider.
- $a_1\leq a_2\leq a_3$
- $a_1\leq a_3\leq a_2$
- $a_2\leq a_1\leq a_3$
- $a_2\leq a_3\leq a_1$
- $a_3\leq a_1\leq a_2$
- $a_3\leq a_2\leq a_1$

We take each case in turn.

- $a_1\leq a_2\leq a_3$: We have that
  $$\begin{align*} \min\left(\min\left(a_1,a_2\right),a_3\right)&=\min\left(a_1,a_3\right)=a_1\\ \min\left(a_1,\min\left(a_2,a_3\right)\right)&=\min\left(a_1,a_2\right)=a_1 \end{align*}$$
- $a_1\leq a_3\leq a_2$:
  $$\begin{align*} \min\left(\min\left(a_1,a_2\right),a_3\right)&=\min\left(a_1,a_3\right)=a_1\\ \min\left(a_1,\min\left(a_2,a_3\right)\right)&=\min\left(a_1,a_3\right)=a_1 \end{align*}$$
- $a_2\leq a_1\leq a_3$:
  $$\begin{align*} \min\left(\min\left(a_1,a_2\right),a_3\right)&=\min\left(a_2,a_3\right)=a_2\\ \min\left(a_1,\min\left(a_2,a_3\right)\right)&=\min\left(a_1,a_2\right)=a_2 \end{align*}$$
- $a_2\leq a_3\leq a_1$:
  $$\begin{align*} \min\left(\min\left(a_1,a_2\right),a_3\right)&=\min\left(a_2,a_3\right)=a_2\\ \min\left(a_1,\min\left(a_2,a_3\right)\right)&=\min\left(a_1,a_2\right)=a_2 \end{align*}$$
- $a_3\leq a_1\leq a_2$:
  $$\begin{align*} \min\left(\min\left(a_1,a_2\right),a_3\right)&=\min\left(a_1,a_3\right)=a_3\\ \min\left(a_1,\min\left(a_2,a_3\right)\right)&=\min\left(a_1,a_3\right)=a_3 \end{align*}$$
- $a_3\leq a_2\leq a_1$:
  $$\begin{align*} \min\left(\min\left(a_1,a_2\right),a_3\right)&=\min\left(a_2,a_3\right)=a_3\\ \min\left(a_1,\min\left(a_2,a_3\right)\right)&=\min\left(a_1,a_3\right)=a_3 \end{align*}$$

Hence the base case is shown. Now suppose that the proposition holds
for some k\geq 3, that is
$$\begin{equation*} \min\left(\min\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_k\right)=\min\left(a_1,\min\left(a_2,a_3,\dots,a_{k-1},a_k\right)\right) \end{equation*}$$
we show that it holds for k+1, i.e.
$$\begin{equation*} \min\left(\min\left(a_1,a_2,a_3,\dots,a_{k}\right),a_{k+1}\right)=\min\left(a_1,\min\left(a_2,a_3,\dots,a_{k},a_{k+1}\right)\right) \end{equation*}$$
We have by evaluating the inner minimum of the left-hand side we get
$$\begin{equation*} \min\left(a_1,a_2,a_3,\dots,a_{k}\right)=\min\left(\min\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_{k}\right) \end{equation*}$$
And so by the induction hypothesis, we have that
$$\begin{align*} \min\left(\min\left(a_1,a_2,a_3,\dots,a_{k}\right),a_{k+1}\right)&=\min\left(\min\left(\min\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_{k}\right),a_{k+1}\right)\\ &=\min\left(\min\left(a_1,\min\left(a_2,a_3,\dots,a_{k-1},a_{k}\right)\right),a_{k+1}\right),\ \text{Induction hypothesis} \end{align*}$$
As \min\left(a_2,a_3,\dots,a_{k-1},a_{k}\right) is well-defined by
proposition
133{reference-type="ref"
reference="prop:NT_general_min_on_integers_is_well_defined"} then
\min\left(a_2,a_3,\dots,a_{k-1},a_{k}\right)=M say where
M\in\mathbb{Z}. Therefore, on substituting
M for \min\left(a_2,a_3,\dots,a_{k-1},a_{k}\right) for ease of
reading, and using the case n=3 shown above, we have
$$\begin{align*} \min\left(\min\left(a_1,\min\left(a_2,a_3,\dots,a_{k-1},a_{k}\right)\right),a_{k+1}\right)&=\min\left(\min\left(a_1,M\right),a_{k+1}\right)\\ &=\min\left(a_1,\min\left(M,a_{k+1}\right)\right)\\ &=\min\left(a_1,\min\left(\min\left(a_2,a_3,\dots, a_{k-1},a_{k}\right),a_{k+1}\right)\right)\\ &=\min\left(a_1,\min\left(a_2,a_3,\dots,a_{k},a_{k+1}\right)\right) \end{align*}$$
The result now follows by induction. $\qed$ :::
Proposition 134{reference-type="ref" reference="prop:NT_general_min_function_on_integers_is_associative"} is useful as it allows us to discard the cumbersome notation of the definition of the general minimum function on the integers. That is to say, we can now simply write
$$\begin{equation*} \min\left(a_1,a_2,a_3,\dots,a_n\right) \end{equation*}$$
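The recursive definition translates directly into code. The following is a minimal sketch that follows the three cases of the definition and compares the two nestings appearing in the associativity proposition; the function names are ours.

```python
from typing import Sequence


def general_min(values: Sequence[int]) -> int:
    """Minimum of a non-empty tuple, following the recursive definition."""
    n = len(values)
    if n == 1:
        return values[0]
    if n == 2:
        a1, a2 = values
        return a1 if a1 <= a2 else a2
    # n >= 3: min(min(a_1, ..., a_{n-1}), a_n)
    return general_min((general_min(values[:-1]), values[-1]))


def right_nested_min(values: Sequence[int]) -> int:
    """The other nesting, min(a_1, min(a_2, ..., a_n)), for n >= 2."""
    return general_min((values[0], general_min(values[1:])))


sample = (7, -3, 12, 0, -3, 5)
left = general_min(sample)
right = right_nested_min(sample)
```

Both `left` and `right` evaluate to -3 here, in agreement with Python's built-in `min`.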
For convenience, we also define the minimum function for a subset of n
integers.
::: definition Definition 174. General minimum function for a subset of integers
Let A=\left\{a_1,a_2,a_3,\dots,a_n\right\}\subset\mathbb{Z} be a
subset of n integers. Let
S=\left(a_1,a_2,a_3,\dots,a_n\right)\in A^n. We define the minimum of
the set of integers A by
$$\begin{equation*} \min\left(A\right)=\min\left(S\right)=\min\left(a_1,a_2,a_3,\dots,a_n\right) \end{equation*}$$
That is, we simply take the element of A^n which corresponds to the
set.
:::
::: example
Example 140. Let A=\left\{2,3\right\}. We have that
$$\begin{equation*} A^2=\left\{\left(2,2\right), \left(2,3\right), \left(3,2\right),\left(3,3\right)\right\} \end{equation*}$$
We have that S=\left(2,3\right)\in A^2 and
$$\begin{equation*} \min\left(A\right)=\min\left(S\right)=\min\left(2,3\right)=2 \end{equation*}$$ :::
We have all the ingredients required to extend the
\mathop{\mathrm{GCD}} function. We use a method similar to how we
extended the minimum function.
::: definition Definition 175. Generalised greatest common divisor
Let S=\left(a_1,a_2,a_3,\dots,a_n\right)\in\mathbb{Z}^n be a
$n$-tuple of integers. We define the greatest common divisor function on
S by
$$\begin{align*} \mathop{\mathrm{GCD}}:\mathbb{Z}^n&\rightarrow\mathbb{Z}\\ S&\mapsto\mathop{\mathrm{GCD}}\left(S\right)=\begin{cases} a_1, &\text{If } n=1\\ \mathop{\mathrm{GCD}}\left(a_1,a_2\right), &\text{If } n=2\\ \mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{n-1}\right),a_n\right), &\text{If } n\geq 3 \end{cases} \end{align*}$$ :::
We show that this is well-defined.
::: {#prop:NT_general_gcd_on_integers_is_well_defined .proposition} Proposition 135. Generalised greatest common divisor function for the integers is well-defined
Let S=\left(a_1,a_2,a_3,\dots,a_n\right)\in\mathbb{Z}^n be a
$n$-tuple of integers. We have that \mathop{\mathrm{GCD}}\left(S\right) is
well-defined.
Proof:
The argument is by induction on n. The base case is n=2 which is
well-defined by theorem 32{reference-type="ref"
reference="thm:NT_gcd_exists"}. Now suppose the result is true for some
k\geq 2, that is
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_k\right) \end{equation*}$$
is well-defined. We show that
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k}\right),a_{k+1}\right) \end{equation*}$$
is well-defined. Evaluating the inner
\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k}\right) we have by
definition that
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k}\right)=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_k\right) \end{equation*}$$
Which by hypothesis is well-defined. Hence
\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k}\right)=d for some
d\in\mathbb{Z}. Hence we have that
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k}\right),a_{k+1}\right)=\mathop{\mathrm{GCD}}\left(d,a_{k+1}\right) \end{equation*}$$
Which is well-defined. The result now follows by induction. $\qed$ :::
As with the minimum function, to avoid cumbersome notation we can show that the generalised greatest common divisor is associative.
::: {#prop:NT_general_gcd_on_integers_is_associative .proposition}
Proposition 136. Generalised \mathop{\mathrm{GCD}} is
associative
Let S=\left(a_1,a_2,a_3,\dots,a_n\right)\in\mathbb{Z}^n be a
$n$-tuple of integers. We have that
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{n-1}\right),a_n\right)=\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(a_2,a_3,\dots,a_{n-1},a_n\right)\right) \end{equation*}$$
Proof:
We argue by induction on n. The cases of n=1 and n=2 are trivial,
so we show it holds for n=3.
Let
x=\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(a_2,a_3\right)\right)
and
y=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2\right),a_3\right).
We need to show that x\mid y and y\mid x, from which we can conclude that
x=y.
As
x=\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(a_2,a_3\right)\right)
then by definition of the greatest common divisor, we have that
x\mid a_1 and x\mid\mathop{\mathrm{GCD}}\left(a_2,a_3\right).
Moreover, as x\mid\mathop{\mathrm{GCD}}\left(a_2,a_3\right) then again
by definition of the greatest common divisor we have that x\mid a_2
and x\mid a_3.
As x\mid a_1 and x\mid a_2 then
x\mid\mathop{\mathrm{GCD}}\left(a_1,a_2\right) and likewise
x\mid a_3 so
x\mid\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2\right),a_3\right)
by definition and so x\mid y. The proof that y\mid x is similar.
As x\mid y and y\mid x and x>0 and y>0 we conclude that x=y
as required.
Now suppose the result is true for some k\geq 3. That is
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_k\right)=\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(a_2,a_3,\dots,a_{k-1},a_k\right)\right) \end{equation*}$$
we show that
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k}\right),a_{k+1}\right)=\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(a_2,a_3,\dots,a_{k},a_{k+1}\right)\right) \end{equation*}$$
Evaluation of the inner \mathop{\mathrm{GCD}} of the left-hand side
yields
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k}\right)=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_{k}\right) \end{equation*}$$
So by the induction hypothesis, we have that
$$\begin{align*} \mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k}\right),a_{k+1}\right)&=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{k-1}\right),a_{k}\right),a_{k+1}\right)\\ &=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(a_2,a_3,\dots,a_{k-1},a_{k}\right)\right),a_{k+1}\right),\ \text{By hypothesis} \end{align*}$$
As \mathop{\mathrm{GCD}}\left(a_2,a_3,\dots,a_{k-1},a_{k}\right) is
well-defined by proposition
135{reference-type="ref"
reference="prop:NT_general_gcd_on_integers_is_well_defined"}, we have
\mathop{\mathrm{GCD}}\left(a_2,a_3,\dots,a_{k-1},a_{k}\right)=d with
d\in\mathbb{Z}. Hence we have
$$\begin{align*} \mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(a_2,a_3,\dots,a_{k-1},a_{k}\right)\right),a_{k+1}\right)&=\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,d\right),a_{k+1}\right)\\ &=\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(d,a_{k+1}\right)\right)\quad\text{by the base case}\\ &=\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_2,a_3,\dots,a_{k-1},a_{k}\right),a_{k+1}\right)\right)\\ &=\mathop{\mathrm{GCD}}\left(a_1,\mathop{\mathrm{GCD}}\left(a_2,a_3,\dots,a_{k},a_{k+1}\right)\right) \end{align*}$$
As required. $\qed$ :::
As with the minimum function, we can now simply write
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_{n-1},a_n\right) \end{equation*}$$
Likewise for convenience, we define the
\mathop{\mathrm{GCD}} function for a subset of n integers.
::: definition Definition 176. General greatest common divisor function for a subset of integers
Let A=\left\{a_1,a_2,a_3,\dots,a_n\right\}\subset\mathbb{Z} be a
subset of n integers. Let
S=\left(a_1,a_2,a_3,\dots,a_n\right)\in A^n. We define the
\mathop{\mathrm{GCD}} of the set of integers A by
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(A\right)=\mathop{\mathrm{GCD}}\left(S\right)=\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_n\right) \end{equation*}$$
That is, we simply take the element of A^n which corresponds to the
set.
:::
::: example
Example 141. Let A=\left\{2,3\right\}. We have that
$$\begin{equation*} A^2=\left\{\left(2,2\right), \left(2,3\right), \left(3,2\right),\left(3,3\right)\right\} \end{equation*}$$
We have that S=\left(2,3\right)\in A^2 and
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(A\right)=\mathop{\mathrm{GCD}}\left(S\right)=\mathop{\mathrm{GCD}}\left(2,3\right)=1 \end{equation*}$$ :::
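The associativity result above says that the order in which we nest the two-argument \mathop{\mathrm{GCD}} does not matter. As a minimal sketch, assuming Python's standard `math.gcd` as the two-argument function, we can compare the two nestings numerically. The names `gcd_left` and `gcd_right` are our own and are purely illustrative.

```python
from functools import reduce
from math import gcd


def gcd_left(values):
    """Nest from the left: GCD(GCD(GCD(a1, a2), a3), ...)."""
    return reduce(gcd, values)


def gcd_right(values):
    """Nest from the right: GCD(a1, GCD(a2, GCD(a3, ...)))."""
    values = list(values)
    result = values[-1]
    for a in reversed(values[:-1]):
        result = gcd(a, result)
    return result


# Both nestings agree on a sample input, as the proposition predicts.
sample = [12, 18, 30]
print(gcd_left(sample), gcd_right(sample))
```

A check of this kind is not a proof; it only illustrates the statement on particular inputs.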
We can now finally generalise the computation of the greatest common divisor from the prime factorisation of the inputs.
::: {#prop:NT_general_gcd_can_be_computed_by_primes .proposition} Proposition 137. Generalised version of the greatest common divisor from prime factorisation
Let S=\left\{a_1,a_2,a_3,\dots,a_n\right\}\subset\mathbb{Z} be a set
of integers so that at least one a_i\neq 0 for 1\leq i\leq n. By
proposition
132{reference-type="ref"
reference="prop:NT_General_express_primes_in_common_basis"}, we know
that there exists a set of primes
$$\begin{equation*} T=\left\{t_1,t_2,t_3,\dots,t_v\right\} \end{equation*}$$
so that for each a_i we have prime factorisations given by
$$\begin{equation*} \mathlarger{a_i=\prod_{j=1}^v t_{j}^{f_{\left(i,j\right)}}} \end{equation*}$$
for 1\leq i\leq n. Define the family of sets for each 1\leq j\leq v
$$\begin{equation*} P_j=\left\{f_{\left(i,j\right)} : 1\leq i\leq n\right\} \end{equation*}$$
We have that the greatest common divisor
\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_n\right) is given by
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_n\right)=t_1^{\min\left(P_1\right)}t_2^{\min\left(P_2\right)}t_3^{\min\left(P_3\right)}\dots t_v^{\min\left(P_v\right)} \end{equation*}$$
Proof:
The proof is similar to that of proposition
116{reference-type="ref"
reference="prop:NT_gcd_can_be_computed_by_primes"}. Let
S=\left\{a_1,a_2,a_3,\dots,a_n\right\}\subset\mathbb{Z} be as given so
by proposition
132{reference-type="ref"
reference="prop:NT_General_express_primes_in_common_basis"} we have a
set of primes
$$\begin{equation*} T=\left\{t_1,t_2,t_3,\dots,t_v\right\} \end{equation*}$$
so that for each a_i we have prime factorisations given by
$$\begin{equation*} \mathlarger{a_i=\prod_{j=1}^v t_{j}^{f_{\left(i,j\right)}}} \end{equation*}$$
Now, let d=\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_n\right)
and let
D = t_1^{\min\left(P_1\right)}t_2^{\min\left(P_2\right)}t_3^{\min\left(P_3\right)}\dots t_v^{\min\left(P_v\right)},
we show that d\leq D and D\leq d. Define
\sigma_j=\min\left(\left\{f_{\left(i,j\right)}: 1\leq i\leq n\right\}\right)
for 1\leq j\leq v.
- D\leq d: By the definition of the minimum, we have that \sigma_j\leq f_{\left(i,j\right)} for each 1\leq i\leq n. Hence, for each i and j there exists k_{\left(i,j\right)}\in\mathbb{Z} with k_{\left(i,j\right)}\geq 0 so that

  $$\begin{equation*} f_{\left(i,j\right)} = \sigma_j + k_{\left(i,j\right)} \end{equation*}$$

  So that a_i can be expressed as

  $$\begin{align*} a_i&=\prod_{j=1}^v t_j^{f_{\left(i,j\right)}}\\ &=\prod_{j=1}^v t_j^{\sigma_j+k_{\left(i,j\right)}}\\ &=\prod_{j=1}^v t_j^{\sigma_j} t_j^{k_{\left(i,j\right)}}\\ &=\prod_{j=1}^v t_j^{\sigma_j} \prod_{j=1}^vt_j^{k_{\left(i,j\right)}}\\ &= D\prod_{j=1}^vt_j^{k_{\left(i,j\right)}} \end{align*}$$

  As a_i was arbitrary this argument holds for each 1\leq i\leq n. Hence, we have that D\mid a_i for each i, so D is a common divisor of each a_i. We conclude that D\leq d.

- d\leq D: As D is a common divisor of each a_i and d is the greatest common divisor, we have D\mid d, so \exists k\in\mathbb{Z} so that

  $$\begin{equation*} d=Dk \end{equation*}$$

  Now, k has a factorisation into primes by the fundamental theorem of arithmetic. Moreover, k could have primes in common with D, so we can take those primes of k that are in common with D and place them into the factorisation of D. That is

  $$\begin{align*} d&=Dk\\ d&=t_1^{\sigma_1}t_2^{\sigma_2}t_3^{\sigma_3}\dots t_v^{\sigma_v}k\\ d&=t_1^{\lambda_1}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}k' \end{align*}$$

  Where \lambda_j are the new exponents of each prime after moving the primes common to D and k into D, and k' is the product of the primes of k that are not in common. We need to show that

  - k'=1
  - \lambda_j\leq \sigma_j for all 1\leq j\leq v

  We take each in turn.

  - k'=1: Suppose for a contradiction that k'\neq 1. As d>0 and D>0 then k>0 and so k'>0. Now as k'\neq 1 we have k'>1 and so by the fundamental theorem of arithmetic we have that k' has a factorisation into primes, say

    $$\begin{equation*} k'=q_1^{r_1}q_2^{r_2}q_3^{r_3}\dots q_c^{r_c} \end{equation*}$$

    Now, no q_l=t_j as k' has no primes in common with t_1^{\lambda_1}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}. Pick one of the primes in k', say q=q_l, then q\mid d. Now as d=\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_n\right) we have q\mid a_i for at least one a_i. This is a contradiction as then q is one of the primes t_j. We conclude that k'=1.

  - \lambda_j\leq \sigma_j for all 1\leq j\leq v: Suppose for contradiction that \lambda_j>\sigma_j for some 1\leq j\leq v. Without loss of generality, take j=1, for if not re-label the primes.

    By definition of \sigma_1, we have that \sigma_1=\min\left(\left\{f_{\left(i,1\right)}: 1\leq i\leq n\right\}\right). Without loss of generality suppose the minimum is attained at i=1, as the cases for the other values of i are similar. We have that \sigma_1=f_{\left(1,1\right)} and so \lambda_1>f_{\left(1,1\right)}. As d is a common divisor of a_1 there is an s\in\mathbb{Z} so that ds=a_1, where s>0 as both a_1 and d are.

    Comparing the prime factorisations, we get that

    $$\begin{equation*} st_1^{\lambda_1}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}=t_1^{f_{\left(1,1\right)}}t_2^{f_{\left(1,2\right)}}t_3^{f_{\left(1,3\right)}}\dots t_v^{f_{\left(1,v\right)}} \end{equation*}$$

    Dividing by \displaystyle t_1^{f_{\left(1,1\right)}} we get that

    $$\begin{equation*} st_1^{\lambda_1-f_{\left(1,1\right)}}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}=t_1^{f_{\left(1,1\right)}-f_{\left(1,1\right)}}t_2^{f_{\left(1,2\right)}}t_3^{f_{\left(1,3\right)}}\dots t_v^{f_{\left(1,v\right)}} \end{equation*}$$

    Where clearly \displaystyle t_1^{f_{\left(1,1\right)}-f_{\left(1,1\right)}}=1. So this can be re-written as

    $$\begin{equation*} st_1^{\lambda_1-f_{\left(1,1\right)}}t_2^{\lambda_2}t_3^{\lambda_3}\dots t_v^{\lambda_v}=t_2^{f_{\left(1,2\right)}}t_3^{f_{\left(1,3\right)}}\dots t_v^{f_{\left(1,v\right)}} \end{equation*}$$

    As \lambda_1>f_{\left(1,1\right)} then \lambda_1-f_{\left(1,1\right)}>0 and so t_1 divides the left-hand side of the equation. As t_1 divides the left-hand side it must also divide the right-hand side and therefore, by the fundamental theorem of arithmetic, appear in its factorisation. It is not in the factorisation on the right-hand side, which is a contradiction. It follows that \lambda_j\leq\sigma_j for all 1\leq j\leq v.

  As k'=1 and \lambda_j\leq\sigma_j for each j, we conclude that d\leq D.

As d\leq D and D\leq d we have that d=D and the result is shown. $\qed$ :::
These last few results were somewhat technical. To show that our new
generalised \mathop{\mathrm{GCD}} works we give an example.
::: example
Example 142. We compute
\mathop{\mathrm{GCD}}\left(54,78,35,144,50\right). By inspection of
each of the numbers we have that
$$\begin{align*} 54&=2\cdot 3^3\\ 78&=2\cdot 3\cdot 13\\ 35&=5\cdot 7\\ 144&=2^4\cdot 3^2\\ 50&=2\cdot 5^2 \end{align*}$$
Hence, the set of primes T is given by
$$\begin{equation*} T=\left\{2,3,5,7,13\right\} \end{equation*}$$
Now, by the proposition, we know that
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(54,78,35,144,50\right)=t_1^{\min\left(P_1\right)}t_2^{\min\left(P_2\right)}t_3^{\min\left(P_3\right)}t_4^{\min\left(P_4\right)}t_5^{\min\left(P_5\right)} \end{equation*}$$
Where P_j will be the powers of the prime t_j that appear in the
factorisation of each of the inputs. Taking t_1=2, t_2=3, t_3=5, t_4=7
and t_5=13 we have
$$\begin{align*} P_1&=\left\{1,1,0,4,1\right\}=\left\{0,1,4\right\}\\ P_2&=\left\{3,1,0,2,0\right\}=\left\{0,1,2,3\right\}\\ P_3&=\left\{0,0,1,0,2\right\}=\left\{0,1,2\right\}\\ P_4&=\left\{0,0,1,0,0\right\}=\left\{0,1\right\}\\ P_5&=\left\{0,1,0,0,0\right\}=\left\{0,1\right\} \end{align*}$$
From which it is clear that the minimum of every P_j is 0. So that
$$\begin{equation*} \mathop{\mathrm{GCD}}\left(54,78,35,144,50\right)=1 \end{equation*}$$ :::
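The computation in the example can be mirrored in a few lines of code. The following is only a sketch under stated assumptions: the inputs are positive integers, the factorisation is done by naive trial division, and the helper names `factorise` and `gcd_from_primes` are our own. For each prime t_j in the common basis it takes the minimum exponent over all inputs, exactly as in the proposition.

```python
def factorise(n):
    """Return {prime: exponent} for an integer n >= 1 using trial division."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def gcd_from_primes(values):
    """GCD of positive integers as the product of t_j ** min(P_j)."""
    tables = [factorise(a) for a in values]
    primes = sorted(set().union(*tables))  # the common basis T
    result = 1
    for t in primes:
        # A prime missing from a factorisation has exponent 0 there.
        result *= t ** min(table.get(t, 0) for table in tables)
    return result


print(gcd_from_primes([54, 78, 35, 144, 50]))
```

Trial division is chosen for brevity only; it is far slower than the Euclidean algorithm for large inputs.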
With a generalised \mathop{\mathrm{GCD}} function, we can extend
Bézout's Identity.
::: {#thm:NT_general_bezout_idenity .theorem} Theorem 44. Generalised Bézout's Identity
Let S=\left\{a_1,a_2,a_3,\dots,a_n\right\}\subset\mathbb{Z} be a set
of integers so that at least one a_i\neq 0 for 1\leq i\leq n.
Consider d=\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots ,a_n\right).
Then, for each 1\leq i\leq n, \exists x_i\in\mathbb{Z} so that
$$\begin{equation*} d=a_1x_1+a_2x_2+a_3x_3+\dots+a_nx_n=\sum_{i=1}^n a_ix_i \end{equation*}$$
Proof:
Let S be as given by the hypothesis and let
d=\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots ,a_n\right). By
definition, we have that as d\mid a_i for each 1\leq i\leq n then by
proposition
103{reference-type="ref"
reference="prop:NT_Divisor_dividing_all_in_set_divides_linear_combination"}
we have that
$$\begin{equation*} d\mid\sum_{i=1}^n a_ix_i \end{equation*}$$
for any x_i\in\mathbb{Z}. Define the set G by
$$\begin{equation*} G=\left\{\sum_{i=1}^n a_ix_i : x_i\in\mathbb{Z}\right\} \end{equation*}$$
Clearly, there are both positive and negative elements in G,
additionally 0\in G by taking each x_i=0. Define \Tilde{G} by
$$\begin{equation*} \Tilde{G}=\left\{g\in G: g>0\right\} \end{equation*}$$
It follows that \Tilde{G} is a non-empty subset of the positive integers
and so by the well-ordering principle it has a smallest element
\Tilde{g} of the form
$$\begin{equation*} \Tilde{g}=\sum_{i=1}^n a_ix_i \end{equation*}$$
We must show that \Tilde{g}\mid a_i for each i. Suppose for
contradiction and without loss of generality that \Tilde{g}\nmid a_1.
By the division algorithm, we have that
$$\begin{equation} a_1=q\Tilde{g}+r \end{equation*}$$*
with 0<r<\left|\Tilde{g}\right|. Therefore
$$\begin{align*} a_1&=q\Tilde{g}+r\\ r&=a_1-q\Tilde{g}\\ r&=a_1-q\sum_{i=1}^n a_ix_i\\ r&=a_1-\left(qa_1x_1+q\sum_{i=2}^n a_ix_i\right)\\ r&=a_1-qa_1x_1-q\sum_{i=2}^n a_ix_i\\ r&=a_1\left(1-qx_1\right)-q\sum_{i=2}^n a_ix_i\\ r&=a_1\left(1-qx_1\right)+\sum_{i=2}^n a_i\left(-qx_i\right) \end{align*}$$
Which shows that r\in G and, as r>0, that r\in\Tilde{G}. Moreover as
0<r<\left|\Tilde{g}\right| we have r<\Tilde{g}, contradicting the
minimality of \Tilde{g}, so \Tilde{g}\mid a_1. It follows that
\Tilde{g}\mid a_i for each i.
As d=\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots ,a_n\right) we have
that a_i=m_id for some m_i\in\mathbb{Z}, for each i. Combining this with
the expression for \Tilde{g} shows that
$$\begin{align*} \Tilde{g}&=\sum_{i=1}^n a_ix_i\\ \Tilde{g}&=\sum_{i=1}^n \left(m_i d\right)x_i\\ \Tilde{g}&=d\sum_{i=1}^n m_ix_i \end{align*}$$
So d\mid\Tilde{g} and we have that d\leq\Tilde{g}. As \Tilde{g} is a
common divisor of each a_i and d is the greatest common divisor, we also
have \Tilde{g}\leq d, so d=\Tilde{g} as required. $\qed$
:::
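The proof above is an existence argument and does not tell us how to find the x_i. One way to compute them, sketched below, is to apply the two-variable extended Euclidean algorithm repeatedly, using the nesting \mathop{\mathrm{GCD}}\left(\mathop{\mathrm{GCD}}\left(a_1,\dots,a_{k}\right),a_{k+1}\right). This is a different route from the well-ordering argument in the proof, and the function names `extended_gcd` and `bezout` are our own.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with g = GCD(a, b) >= 0 and a*x + b*y == g."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    if old_r < 0:  # normalise so that the GCD is non-negative
        old_r, old_x, old_y = -old_r, -old_x, -old_y
    return old_r, old_x, old_y


def bezout(values):
    """Return (d, xs) with d = GCD(values) and sum(a_i * x_i) == d."""
    d, xs = 0, []
    for a in values:
        # d_new = d*s + a*t, so earlier coefficients are scaled by s.
        d, s, t = extended_gcd(d, a)
        xs = [s * x for x in xs] + [t]
    return d, xs


values = [54, 78, 35, 144, 50]
d, xs = bezout(values)
print(d, xs, sum(a * x for a, x in zip(values, xs)))
```

The coefficients returned are one valid choice among infinitely many.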
We are now at the end of a long road. We can now, partially, generalise
proposition
130{reference-type="ref"
reference="prop:NT_solutions_to_two_var_linear_diophantine_equation"} to
the n variable case, namely we state the requirement for solutions to
exist.
::: definition
Definition 177. Linear equation of n indeterminate variables
Let S be a set. We say an equation is a linear equation in
$n$ variables if it has the form
$$\begin{equation*} a_1x_1+a_2x_2+a_3x_3+\dots+a_nx_n=c \end{equation*}$$
for some coefficients a_i\in S and c\in S and n indeterminate
variables x_1,x_2,\dots,x_n.
:::
::: {#prop:NT_existence_of_solutions_to_n_var_linear_diophantine_equation .proposition}
Proposition 138. Existence of solutions to n variable linear
Diophantine equation
Let S=\left\{a_1,a_2,a_3,\dots,a_n\right\}\subset\mathbb{Z} be a set of
integers so that at least one a_i\neq 0, let c\in\mathbb{Z}, and
consider the equation
$$\begin{equation*} a_1x_1+a_2x_2+a_3x_3+\dots+a_nx_n=c \end{equation*}$$
in the indeterminate variables x_i with 1\leq i\leq n. Let
d=\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_n\right). Then there
are solutions with each x_i\in\mathbb{Z} if and only if
d\mid c.
Proof:
\left(\Rightarrow\right): If \displaystyle c= \sum_{i=1}^n a_ix_i with each x_i\in\mathbb{Z}
then by proposition
103{reference-type="ref"
reference="prop:NT_Divisor_dividing_all_in_set_divides_linear_combination"}
we have that d\mid c.
\left(\Leftarrow\right): Suppose that d\mid c then
\exists e\in\mathbb{Z} so that c=de. By the generalised Bézout's
Identity, for each i \exists y_i\in\mathbb{Z} so that
$$\begin{equation*} d=\sum_{i=1}^n a_iy_i \end{equation*}$$
where d=\mathop{\mathrm{GCD}}\left(a_1,a_2,a_3,\dots,a_n\right).
Multiplying both sides by e we see that
$$\begin{equation*} c=e\sum_{i=1}^n a_iy_i=\sum_{i=1}^n a_i\left(ey_i\right) \end{equation*}$$
Hence taking each x_i=ey_i gives an integer solution.
The result is shown. $\qed$ :::
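The criterion d\mid c is easy to test by machine. The sketch below, which assumes at least one non-zero coefficient, compares the criterion with a brute-force search over a small box of integers. The search bound and the function names are our own choices, and the search is only an illustration: failing to find a solution in a box does not by itself prove none exists.

```python
from functools import reduce
from itertools import product
from math import gcd


def has_integer_solution(coeffs, c):
    """Criterion of the proposition: solvable iff GCD(coeffs) divides c."""
    d = reduce(gcd, coeffs)  # assumes at least one coefficient is non-zero
    return c % d == 0


def brute_force_solution(coeffs, c, bound=10):
    """Search -bound..bound in every variable; return a solution or None."""
    for xs in product(range(-bound, bound + 1), repeat=len(coeffs)):
        if sum(a * x for a, x in zip(coeffs, xs)) == c:
            return xs
    return None


print(has_integer_solution([15, 9, 27], 9), brute_force_solution([15, 9, 27], 9))
print(has_integer_solution([15, 9, 27], 10), brute_force_solution([15, 9, 27], 10))
```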
Unlike proposition 130{reference-type="ref" reference="prop:NT_solutions_to_two_var_linear_diophantine_equation"}, we did not show that there are infinitely many solutions and what form they take. Recall that we used example 118{reference-type="ref" reference="exam:NT_solutions_to_ax_plus_by"} to find the general form of the solutions for the two-variable case. We shall see if we can do the same for multiple variables.
::: example Example 143. Consider the three-variable Diophantine equation
$$\begin{equation} 15x+9y+27z=9 \end{equation*}$$*
Clearly \mathop{\mathrm{GCD}}\left(15,9,27\right)=3 and 3\mid 9 so
integer solutions exist. How can we find one?
One idea that might give a solution is to try and reduce this to a two-variable equation. How can this be done? We can see that
$$\begin{equation} 15x+9y+27z=3\left(5x+3y\right)+27z=9 \end{equation*}$$*
As 5x+3y\in\mathbb{Z} for x,y\in\mathbb{Z} we will denote this by
v\in\mathbb{Z}, that is v=5x+3y. The equation 5x+3y=v has integer
solutions for any integer v, as \mathop{\mathrm{GCD}}\left(5,3\right)=1
and 1 divides every integer. So first let v=1 to give
$$\begin{equation*} 5x+3y=1 \end{equation*}$$
The Euclidean algorithm shows that x=2 and y=-3 is a particular
solution. Hence the general solutions will be given by
$$\begin{align} x&=2+\frac{3n}{1}=2+3n\ y&=-3-\frac{5n}{1}=-3-5n\ \end{align*}$$*
In particular, the general solutions satisfy 5x+3y=1 and so the
general solutions to 5x+3y=v will be given by
$$\begin{align} x&=v\left(2+3n\right)\ y&=v\left(-3-5n\right)\ \end{align*}$$*
Now, consider the remaining equation given by
$$\begin{equation} 3v+27z=9 \end{equation*}$$*
By the Euclidean algorithm, we see that v=3 and z=0 is a particular
solution, with the general solutions given by
$$\begin{align*} v&=3+\frac{27k}{3}=3+9k\\ z&=0-\frac{3k}{3}=-k \end{align*}$$
Hence we have that
$$\begin{align} x&=v\left(2+3n\right)\ y&=v\left(-3-5n\right)\ v&=3+9k\ z&=-k\ \end{align*}$$*
As we have an expression for v we can substitute it into the
expressions for x and y to give
$$\begin{align} x&=\left(3+9k\right)\left(2+3n\right)\ y&=\left(3+9k\right)\left(-3-5n\right)\ z&=-k\ \end{align*}$$*
where n,k\in\mathbb{Z}. This is a general solution to 15x+9y+27z=9.
Indeed, if we substitute these back into the original equation we get
$$\begin{align} 15x+9y+27z&=15\left(\left(3+9k\right)\left(2+3n\right)\right)+9\left(\left(3+9k\right)\left(-3-5n\right)\right)+27\left(-k\right)\ &=15\left(3+9k\right)\left(2+3n\right)+9\left(3+9k\right)\left(-3-5n\right)-27k\ &=\left(3+9k\right)\left(15\left(2+3n\right)+9\left(-3-5n\right)\right)-27k\ &=\left(3+9k\right)\left(30+45n+\left(-27-45n\right)\right)-27k\ &=\left(3+9k\right)\left(3\right)-27k\ &=9+27k-27k=9\ \end{align*}$$* :::
::: {#exam:NT_solution_to_linear_diophantine_by_subs .example} Example 144.
We consider the same example again, but we will find a different solution method. So consider the three-variable equation given by
$$\begin{equation} 15x+9y+27z=9 \end{equation*}$$*
We will express x in terms of y and z. We have
$$\begin{equation} x=\frac{9-9y-27z}{15} \end{equation*}$$*
Observe that we can express 9 and 27 in terms of 15. We have
$$\begin{align*} 9&=15-6\\ 27&=2\cdot 15-3 \end{align*}$$
Hence, we can split the expression for x up into the parts that we
know are divisible by 15 and the parts that may or may not be
divisible by $15$
$$\begin{align} x&=\frac{9-9y-27z}{15}\ &=\frac{\left(15-6\right)-\left(15-6\right)y-\left(2\left(15\right)-3\right)z}{15}\ &=1-y-2z+\frac{-6+6y+3z}{15}\ \end{align*}$$*
As we seek x\in\mathbb{Z} then as 1-y-2z\in\mathbb{Z} we will also
require that \displaystyle \frac{-6+6y+3z}{15}\in\mathbb{Z}. Let
\displaystyle \frac{-6+6y+3z}{15}=s where s\in\mathbb{Z}. Then we
have
$$\begin{equation} x=1-y-2z+s \end{equation*}$$*
Now, we have that
$$\begin{equation*} 15s=-6+6y+3z \end{equation*}$$
We repeat the above process to get y in terms of z and s. Doing
so gives
$$\begin{align*} 15s&=-6+6y+3z\\ 6y&=15s+6-3z \end{align*}$$
As before, we can express 15 and 3 in terms of 6 to get
$$\begin{align*} 15&=2\cdot 6+3\\ 3&=6-3 \end{align*}$$
So that,
$$\begin{align*} 6y&=15s+6-3z\\ 6y&=\left(2\left(6\right)+3\right)s+6-\left(6-3\right)z\\ y&=\frac{\left(2\left(6\right)+3\right)s+6-\left(6-3\right)z}{6}\\ y&=2s+1-z+\frac{3s+3z}{6} \end{align*}$$
As we need y\in\mathbb{Z} then we require
\displaystyle \frac{3s+3z}{6}\in\mathbb{Z} say
\displaystyle \frac{3s+3z}{6}=t. Then we have
$$\begin{equation} y=2s+1-z+t \end{equation*}$$*
Finally, we have that
$$\begin{equation} 6t=3s+3z \end{equation*}$$*
Which can be solved directly for z to give z=2t-s. Substituting the
value of z in y gives
$$\begin{align*} y&=2s+1-z+t\\ y&=2s+1-\left(2t-s\right)+t\\ y&=3s+1-t \end{align*}$$
And on substitution of this y value and z into x we get
$$\begin{align} x&=1-y-2z+s\ x&=1-\left(3s+1-t\right)-2\left(2t-s\right)+s\ x&=1-3s-1+t-4t+2s+s\ x&=-3t\ \end{align*}$$*
Hence, a general solution is given by,
$$\begin{align} x&=-3t\ y&=3s+1-t\ z&=2t-s \end{align*}$$*
for s,t\in\mathbb{Z}. This is indeed a general solution as
$$\begin{align} 15x+9y+27z&=15\left(-3t\right)+9\left(3s+1-t\right)+27\left(2t-s\right)\ &=-45t+27s+9-9t+54t-27s\ &=9 \end{align*}$$* :::
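As a sanity check on the two worked examples for 15x+9y+27z=9, the sketch below substitutes a range of parameter values into both families and confirms that each resulting triple satisfies the equation. It checks only that members of each family are solutions; it does not test whether every solution arises from a given family. The function names are our own.

```python
def family_by_reduction(n, k):
    """Family from the first example: reduce to two-variable equations."""
    v = 3 + 9 * k
    return v * (2 + 3 * n), v * (-3 - 5 * n), -k


def family_by_substitution(s, t):
    """Family from the second example: repeated substitution."""
    return -3 * t, 3 * s + 1 - t, 2 * t - s


def satisfies(x, y, z):
    return 15 * x + 9 * y + 27 * z == 9


params = range(-20, 21)
print(all(satisfies(*family_by_reduction(n, k)) for n in params for k in params))
print(all(satisfies(*family_by_substitution(s, t)) for s in params for t in params))
```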
Of course, there was nothing special about using only three variables.
::: example Example 145. Consider the four-variable Diophantine equation
$$\begin{equation} 55a+35b-77c+144d=1 \end{equation*}$$*
As \mathop{\mathrm{GCD}}\left(55,35,77,144\right)=1 then integer
solutions exist. We will use a similar method to example
144{reference-type="ref"
reference="exam:NT_solution_to_linear_diophantine_by_subs"}, with some
details omitted for brevity.
Expressing a in terms of b, c and d we get that
$$\begin{equation} a=\frac{1-35b+77c-144d}{55} \end{equation*}$$*
Noting that
$$\begin{align*} 35&=55-20\\ 77&=55+22\\ 144&=2\cdot 55+34 \end{align*}$$
We can express a as
$$\begin{align} a&=\frac{1-35b+77c-144d}{55}\ &=\frac{1-\left(55-20\right)b+\left(55+22\right)c-\left(2\left(55\right)+34\right)d}{55}\ &=-b+c-2d+\frac{1+20b+22c-34d}{55}\ \end{align*}$$*
We require that \displaystyle \frac{1+20b+22c-34d}{55}\in\mathbb{Z}
say with \displaystyle \frac{1+20b+22c-34d}{55} = u for
u\in\mathbb{Z}. We therefore have that
$$\begin{equation} 55u=1+20b+22c-34d \end{equation*}$$*
Expressing b in terms of c, d and u we get
$$\begin{align} b=\frac{55u-1-22c+34d}{20} \end{align*}$$*
We see that
$$\begin{align*} 55&=2\cdot 20+15\\ 22&=20+2\\ 34&=20+14 \end{align*}$$
Hence,
$$\begin{equation} b=2u-c+d+\frac{15u-1-2c+14d}{20} \end{equation*}$$*
Set \displaystyle \frac{15u-1-2c+14d}{20}=v where v\in\mathbb{Z}.
Then
$$\begin{equation} 20v=15u-1-2c+14d \end{equation*}$$*
Solving c in terms of d,u and v gives
$$\begin{equation} c=\frac{15u-1-20v+14d}{2} \end{equation*}$$*
where we get
$$\begin{equation} c=7u-10v+7d+\frac{u-1}{2}=7u-10v+7d+x \end{equation*}$$*
where \displaystyle x=\frac{u-1}{2}\in\mathbb{Z}. We seem to have hit
a problem, we still don't have an expression for d. However, suppose
d\in\mathbb{Z} is arbitrary, can we recover a general solution with
this assumption?
Firstly, we will express a,b and c in terms of u,v,x and d. We
get that
$$\begin{align} d&\in\mathbb{Z}\ c&=7u-10v+7d+x\ b&=-6d-5u+11v-x\ a&=11d+13u-21v+2x \end{align*}$$*
Observe that,
$$\begin{align} 55a&=55\left(11d+13u-21v+2x\right)=605d+715u-1155v+110x\ 35b&=35\left(-6d-5u+11v-x\right)=-210d-175u+385v-35x\ 77c&=77\left(7u-10v+7d+x\right)=539d+539u-770v+77x \end{align*}$$*
Hence
$$\begin{align} 55a+35b-77c+144d&=605d+715u-1155v+110x\ &-210d-175u+385v-35x\ &-\left(539d+539u-770v+77x\right)\ &+144d\ &=0d+u+0v-2x=u-2x \end{align*}$$*
However, we know that \displaystyle x=\frac{u-1}{2} so that 2x=u-1
so
$$\begin{equation} u-2x=u-\left(u-1\right)=1 \end{equation*}$$*
Hence
$$\begin{align*} a&=11d+13u-21v+2x\\ b&=-6d-5u+11v-x\\ c&=7u-10v+7d+x\\ u&=2x+1,\quad v,x,d\in\mathbb{Z} \end{align*}$$
gives a general solution, where u is determined by x through 2x=u-1.
:::
It is interesting to note that the previous example uses four integer variables u, v, x and d, although only three of them are free as u=2x+1. In the case of three variables, we were able to find solutions requiring only two arbitrary integer variables. Does the other method also give a general solution requiring three arbitrary integer variables?
::: example Example 146. Consider again
$$\begin{equation} 55a+35b-77c+144d=1 \end{equation*}$$*
We have that \mathop{\mathrm{GCD}}\left(55,35\right)=5 so that
$$\begin{equation*} 55a+35b-77c+144d=5\left(11a+7b\right)-77c+144d=1 \end{equation*}$$
As 11a+7b\in\mathbb{Z} we can replace this with a variable, say u, so
that we get the equation
$$\begin{equation*} 11a+7b=u \end{equation*}$$
By the Euclidean algorithm, we see for u=1 that a=2 and b=-3 is a
particular solution, with the general solutions being given by
$$\begin{align*} a&=2+\frac{7x}{1}=2+7x\\ b&=-3-\frac{11x}{1}=-3-11x \end{align*}$$
Hence the general solution to 11a+7b=u is given by
$$\begin{align} a&=u\left(2+7x\right)\ b&=u\left(-3-11x\right)\ \end{align*}$$*
Now, the original four-variable equation is the three-variable equation
$$\begin{equation} 5u-77c+144d=1 \end{equation*}$$*
We have that \mathop{\mathrm{GCD}}\left(-77,144\right)=1. So replace
-77c+144d with a variable, say v, and first solve
$$\begin{equation*} -77c+144d=1 \end{equation*}$$
By the Euclidean algorithm, a particular solution is c=43 and d=23
and the general solution is
$$\begin{align} c&=43+\frac{144y}{1}=43+144y\ d&=23-\frac{-77y}{1}=23+77y\ \end{align*}$$*
So that solution to -77c+144d=v is given by
$$\begin{align} c&=v\left(43+144y\right)\ d&=v\left(23+77y\right)\ \end{align*}$$*
This turns the three-variable equation into a two-variable equation given by
$$\begin{equation} 5u+v=1 \end{equation*}$$*
which clearly has a particular solution of u=0 and v=1 to give
general solutions given by
$$\begin{align*} u&=0+\frac{z}{1}=z\\ v&=1-\frac{5z}{1}=1-5z \end{align*}$$
Therefore, we have
$$\begin{align} a&=u\left(2+7x\right)\ b&=u\left(-3-11x\right)\ c&=v\left(43+144y\right)\ d&=v\left(23+77y\right)\ u&=z\ v&=1-5z\ \end{align*}$$*
So, substituting u and v where required yields
$$\begin{align} a&=z\left(2+7x\right)\ b&=z\left(-3-11x\right)\ c&=\left(1-5z\right)\left(43+144y\right)\ d&=\left(1-5z\right)\left(23+77y\right)\ \end{align*}$$*
Where x,y,z\in\mathbb{Z}. We verify that this is a general solution.
We have
$$\begin{align} 55a+35b-77c+144d&=55z\left(2+7x\right)+35z\left(-3-11x\right)-77\left(1-5z\right)\left(43+144y\right)+144\left(1-5z\right)\left(23+77y\right)\ &=z\left(55\left(2+7x\right)+35\left(-3-11x\right)\right)+\left(1-5z\right)\left(-77\left(43+144y\right)+144\left(23+77y\right)\right)\ &=z\left(110+385x-105-385x\right)+\left(1-5z\right)\left(-77\left(43+144y\right)+144\left(23+77y\right)\right)\ &=5z+\left(1-5z\right)\left(-3311-11088y+3312+11088y\right)\ &=5z+\left(1-5z\right)=1\ \end{align*}$$*
Hence
$$\begin{align} a&=z\left(2+7x\right)\ b&=z\left(-3-11x\right)\ c&=\left(1-5z\right)\left(43+144y\right)\ d&=\left(1-5z\right)\left(23+77y\right)\ \end{align*}$$*
where x,y,z\in\mathbb{Z} is a general solution.
:::
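Both families found for 55a+35b-77c+144d=1 can be checked in the same way as before. In the sketch below the first family uses the relation u=2x+1, which follows from x=\frac{u-1}{2} in the substitution example, so that only d, v and x are free. As before, the code confirms that members of each family are solutions and nothing more; the function names are our own.

```python
def family_by_substitution(d, v, x):
    """Family from the substitution example, with u tied to x."""
    u = 2 * x + 1  # from x = (u - 1) / 2
    a = 11 * d + 13 * u - 21 * v + 2 * x
    b = -6 * d - 5 * u + 11 * v - x
    c = 7 * u - 10 * v + 7 * d + x
    return a, b, c, d


def family_by_reduction(x, y, z):
    """Family from the example that pairs variables off."""
    a = z * (2 + 7 * x)
    b = z * (-3 - 11 * x)
    c = (1 - 5 * z) * (43 + 144 * y)
    d = (1 - 5 * z) * (23 + 77 * y)
    return a, b, c, d


def satisfies(a, b, c, d):
    return 55 * a + 35 * b - 77 * c + 144 * d == 1


params = range(-6, 7)
triples = [(p, q, r) for p in params for q in params for r in params]
print(all(satisfies(*family_by_substitution(*t)) for t in triples))
print(all(satisfies(*family_by_reduction(*t)) for t in triples))
```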
Hence we expressed the $4$-variable linear Diophantine equation in terms of three arbitrary variables and found expressions for $3$-variable linear Diophantine equations in terms of two arbitrary parameters. Is this always the case, and if so does it hold for any number of variables? The answer to this question can be found by considering the method of replacing parts of the $n$-variable linear Diophantine equation with variables.
For example, for the $3$-variable case we have
$$\begin{equation*} ax+by+cz=d \end{equation*}$$
Suppose that \mathop{\mathrm{GCD}}\left(a,b,c\right)\mid d. After a
potential factoring of ax+by=g_1\left(a'x+b'y\right) where
g_1=\mathop{\mathrm{GCD}}\left(a,b\right), we can replace a'x+b'y
with a variable, say u, so that
$$\begin{equation*} a'x+b'y=u \end{equation*}$$
As we have factored out the greatest common divisor we will have
\mathop{\mathrm{GCD}}\left(a',b'\right)=1 by proposition
108{reference-type="ref"
reference="prop:NT_GCD_properties"} part 7. Hence we can solve
a'x+b'y=1 by the Euclidean algorithm and get a general solution
$$\begin{align*} x&=u\left(x_0+b'n\right)\ y&=u\left(y_0-a'n\right) \end{align*}$$
for some n\in\mathbb{Z}. As we have seen, this turns the $3$-variable
equation into a $2$-variable equation given by
$$\begin{equation*} g_1u+cz=d \end{equation*}$$
Which is solvable as \mathop{\mathrm{GCD}}\left(g_1,c\right)\mid d.
This will have a general solution of
$$\begin{align*} u&=u_0+\frac{cm}{\mathop{\mathrm{GCD}}\left(g_1,c\right)}\ z&=z_0-\frac{g_1m}{\mathop{\mathrm{GCD}}\left(g_1,c\right)}\ \end{align*}$$
Hence a general form of the general solution to the $3$-variable case is given by
$$\begin{align*} x&=\left(u_0+\frac{cm}{\mathop{\mathrm{GCD}}\left(g_1,c\right)}\right)\left(x_0+b'n\right)\ y&=\left(u_0+\frac{cm}{\mathop{\mathrm{GCD}}\left(g_1,c\right)}\right)\left(y_0-a'n\right)\ z&=z_0-\frac{g_1m}{\mathop{\mathrm{GCD}}\left(g_1,c\right)}\ \end{align*}$$
Here the arbitrary variables are n and m.
Now, in the $4$-variable case we have
$$\begin{equation*} aw+bx+cy+dz=e \end{equation*}$$
Suppose that \mathop{\mathrm{GCD}}\left(a,b,c,d\right)\mid e. As
before, after a potential factoring of aw+bx=g_1\left(a'w+b'x\right)
where g_1=\mathop{\mathrm{GCD}}\left(a,b\right), we can replace
a'w+b'x with a variable, say u, so that
$$\begin{equation*} a'w+b'x=u \end{equation*}$$
We can solve a'w+b'x=1 by the Euclidean algorithm and get a general
solution for any u
$$\begin{align*} w&=u\left(w_0+b'n\right)\ x&=u\left(x_0-a'n\right) \end{align*}$$
for some n\in\mathbb{Z}. This turns the $4$-variable equation into a
$3$-variable equation given by
$$\begin{equation*} g_1u+cy+dz=e \end{equation*}$$
Which is solvable as \mathop{\mathrm{GCD}}\left(g_1,c,d\right)\mid e.
Two choices can be made here: replacing g_1u+cy or cy+dz.
Taking the first choice, we have after a potential factoring
g_1u+cy=g_2\left(g_1'u+c'y\right) where
g_2=\mathop{\mathrm{GCD}}\left(g_1,c\right), we set g_1'u+c'y=v,
solving g_1'u+c'y=1 by the Euclidean algorithm, we get a general
solution for any v
$$\begin{align*} u&=v\left(u_0+c'm\right)\ y&=v\left(y_0-g_1'm\right) \end{align*}$$
We are now left with a $2$-variable equation given by
$$\begin{equation*} g_2v+dz=e \end{equation*}$$
Again, solvable because \mathop{\mathrm{GCD}}\left(g_2,d\right)\mid e.
This has a general solution given by
$$\begin{align*} v&=v_0+\frac{dk}{\mathop{\mathrm{GCD}}\left(g_2,d\right)}\ z&=z_0-\frac{g_2k}{\mathop{\mathrm{GCD}}\left(g_2,d\right)}\ \end{align*}$$
Hence, we have a general solution given by.
$$\begin{align*} w&=\left(v_0+\frac{dk}{\mathop{\mathrm{GCD}}\left(g_2,d\right)}\right)\left(u_0+c'm\right)\left(w_0+b'n\right)\ x&=\left(v_0+\frac{dk}{\mathop{\mathrm{GCD}}\left(g_2,d\right)}\right)\left(u_0+c'm\right)\left(x_0-a'n\right)\ y&=\left(v_0+\frac{dk}{\mathop{\mathrm{GCD}}\left(g_2,d\right)}\right)\left(y_0-g_1'm\right)\ z&=z_0-\frac{g_2k}{\mathop{\mathrm{GCD}}\left(g_2,d\right)}\ \end{align*}$$
Alternatively, suppose we took the choice of cy+dz. After a potential
factoring cy+dz=g_2\left(c'y+d'z\right) where
g_2=\mathop{\mathrm{GCD}}\left(c,d\right), setting c'y+d'z=v and
solving c'y+d'z=1 by the Euclidean algorithm, we get a general
solution for any v given by
$$\begin{align*} y&=v\left(y_0+d'm\right)\ z&=v\left(z_0-c'm\right)\ \end{align*}$$
Which now gives the $2$-variable equation
$$\begin{equation*} g_1u+g_2v=e \end{equation*}$$
Solutions exist as \mathop{\mathrm{GCD}}\left(g_1,g_2\right)\mid e.
This has a general solution
$$\begin{align*} u&=u_0+\frac{g_2k}{\mathop{\mathrm{GCD}}\left(g_1,g_2\right)}\\ v&=v_0-\frac{g_1k}{\mathop{\mathrm{GCD}}\left(g_1,g_2\right)} \end{align*}$$
Hence, a different general solution to the original $4$-variable equation is given by
$$\begin{align*}
w&=\left(u_0+\frac{g_2k}{\mathop{\mathrm{GCD}}\left(g_1,g_2\right)}\right)\left(w_0+b'n\right)\\
x&=\left(u_0+\frac{g_2k}{\mathop{\mathrm{GCD}}\left(g_1,g_2\right)}\right)\left(x_0-a'n\right)\\
y&=\left(v_0-\frac{g_1k}{\mathop{\mathrm{GCD}}\left(g_1,g_2\right)}\right)\left(y_0+d'm\right)\\
z&=\left(v_0-\frac{g_1k}{\mathop{\mathrm{GCD}}\left(g_1,g_2\right)}\right)\left(z_0-c'm\right)
\end{align*}$$
It seems there are a few things to show. Firstly, is it possible to show
that the solution to an $n$-variable linear Diophantine equation can be
expressed in terms of n-1 arbitrary variables? Secondly, do the general
solutions have the form of being a product of $2$-variable solutions?
Given an $n$-variable linear Diophantine equation, replacing two
variables with a single variable turns it into an equation with
n-1 variables. After n-2 such replacements, the process terminates at a
$2$-variable equation. Each replacement contributes one arbitrary
parameter and the final $2$-variable equation contributes one more,
giving n-1 in total.
We therefore have a strong understanding of solving linear Diophantine equations in any number of variables.
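The replacement process described above can be sketched as a short recursive procedure. The version below always replaces the first two variables, which is one of the choices discussed, and returns a single particular solution (all arbitrary parameters set to 0) rather than the parametrised family. The function names `extended_gcd` and `solve_by_reduction` are our own.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g and g = GCD(a, b) >= 0."""
    if b == 0:
        return abs(a), (a > 0) - (a < 0), 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y


def solve_by_reduction(coeffs, c):
    """Return one integer solution of sum(a_i * x_i) == c, or None."""
    if len(coeffs) == 1:
        a = coeffs[0]
        if a == 0:
            return [0] if c == 0 else None
        return [c // a] if c % a == 0 else None
    a, b, *rest = coeffs
    g, x0, y0 = extended_gcd(a, b)  # a*x0 + b*y0 == g
    # Replace a*x + b*y by g*u and solve the equation with one fewer variable.
    reduced = solve_by_reduction([g, *rest], c)
    if reduced is None:
        return None
    u, *tail = reduced
    # Scale the solution of a*x + b*y == g by u to recover x and y.
    return [u * x0, u * y0, *tail]


coeffs = [55, 35, -77, 144]
solution = solve_by_reduction(coeffs, 1)
print(solution, sum(a * x for a, x in zip(coeffs, solution)))
```

When the greatest common divisor of the coefficients does not divide c, the recursion bottoms out in a one-variable equation with no integer solution and the function returns `None`.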
Polynomials
Previously, we have seen how to handle linear equations in multiple variables. That is, equations of the form
$$\begin{equation*} a_1x_1+a_2x_2+\dots+a_nx_n=b,\quad a_i,x_i,b\in\mathbb{Z} \end{equation*}$$
A natural question to ask is can we extend this to non-linear equations?
For example, we have defined what we mean by a square number in definition
158{reference-type="ref"
reference="def:NT_square_number"}. That is, a number y\in\mathbb{Z} is
square if \exists x\in\mathbb{Z} so that x^2=y. If we consider x
as a variable for a moment, then we have seen many examples of solving
this type of equation. We studied this when finding what integer numbers
were squares. We can ask the question, what happens if we combine this
x^2 variable with just the variable x, for example, what values for
x\in\mathbb{Z} or \mathbb{Q} would satisfy
$$\begin{equation*} x^2+x=2 \end{equation*}$$
We can go further than simply x^2. For example, we can consider
$$\begin{equation*} x^n=\prod_{i=1}^n x \end{equation*}$$
and combine variables of this form however we wish and multiply them by constants, for example
$$\begin{equation*} x^8+15x^7-8x^3+2x^2+x+5=0 \end{equation*}$$
We will want to study equations of this form. To do so, we first define their building blocks.
::: definition Definition 178. Monomial
Let $X$ be a variable and let $a\in S$ for some set $S\neq\emptyset$.
We define a monomial to be an expression of the form
$$\begin{equation*} aX^n \end{equation*}$$
where $n\in\mathbb{Z}$ with $n\geq 0$.
:::
From monomials, we build a so-called polynomial.
::: definition Definition 179. Polynomial
Let $S$ be a set and let $n\in\mathbb{Z}$ with $n\geq 0$. Let $X$ be a
variable. We define a polynomial to be an expression of the form
$$\begin{equation*} P\left(X\right)=a_nX^n+a_{n-1}X^{n-1}+a_{n-2}X^{n-2}+\dots+a_1X+a_0 \end{equation*}$$
where $a_0,a_1,\dots,a_n\in S$ are called the coefficients of the
polynomial. We say that $X$ is an indeterminate variable and we say that
$P\left(X\right)$ is a polynomial in $X$ with coefficients in $S$.
Here we are formally using the $+$ operation associated with $S$
between the terms in the polynomial.
:::
We can, of course, replace $X$ with a particular value to evaluate the
polynomial.
::: definition Definition 180. Evaluation of a polynomial
Let $S\neq\emptyset$ be a set and let $P\left(X\right)$ be a polynomial
with coefficients in $S$. Let $s\in S$. We define the evaluation of the
polynomial $P$ at $s$ by
$$\begin{equation*} P\left(s\right)=a_ns^n+a_{n-1}s^{n-1}+a_{n-2}s^{n-2}+\dots+a_1s+a_0 \end{equation*}$$ :::
::: example
Example 147. Let $S=\mathbb{Z}$ and define $P\left(X\right)$ by
$$\begin{equation*} P\left(X\right)=2X^2-3X+5 \end{equation*}$$
What is $P\left(1\right)$?
On substituting $X=1$ we see
$$\begin{equation*} P\left(1\right)=2\left(1\right)^2-3\left(1\right)+5=2-3+5=4 \end{equation*}$$ :::
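The evaluation in the example can also be checked numerically. The following Python sketch assumes integer coefficients stored in ascending order of power, so that $2X^2-3X+5$ becomes `[5, -3, 2]`; the function name `evaluate` is hypothetical. It uses Horner's rule, which is a rearrangement of the defining sum rather than the method used in the text.

```python
def evaluate(coeffs, s):
    """Evaluate a_0 + a_1*s + ... + a_n*s^n, where coeffs = [a_0, a_1, ..., a_n]."""
    result = 0
    # Horner's rule: work from the highest power down,
    # multiplying by s and adding the next coefficient.
    for a in reversed(coeffs):
        result = result * s + a
    return result


# P(X) = 2X^2 - 3X + 5 evaluated at X = 1
print(evaluate([5, -3, 2], 1))  # prints 4
```

Horner's rule needs only $n$ multiplications for a polynomial of degree $n$, and gives the same value as substituting directly into the definition.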
It will be useful to describe the set of all polynomials whose
coefficients lie in some set $S$.
::: definition Definition 181. Set of all polynomials with coefficients in a set $S$
Let $S\neq\emptyset$. We define the set of all polynomials whose
coefficients are in $S$ by the set
$$\begin{equation*} S\left[X\right]=\left\{\sum_{i=0}^n s_iX^i: n\in\mathbb{N}\text{ and } s_i\in S\right\} \end{equation*}$$
We define the polynomials to be the elements of this set and write
$P\in S\left[X\right]$, with the understanding that $P$ actually means
$P\left(X\right)$.
:::
From the definition of a polynomial, we have many choices that we can
make that allow us to create a polynomial. We can modify the
coefficients $a_i$ for $0\leq i\leq n$ however we wish, so long as they
are all in the set $S$. In particular, the choices we can make are
clearly dependent on the value of $n$ we pick. The value of $n$ is
an important property of polynomials.
::: definition Definition 182. Degree of a polynomial
Let $P\in S\left[X\right]$. We define the degree of the polynomial $P$
to be the largest $n\in\mathbb{Z}$ so that the coefficient of $X^n$ is
not equal to zero.
We write $\deg\left(P\right)=n$ to mean the degree of the polynomial
$P$ is $n$.
:::
::: example
Example 148. Let $S=\mathbb{Z}$ and define $P\left(X\right)$ by
$$\begin{equation*} P\left(X\right)=2X^2-3X+5 \end{equation*}$$
We see that the largest $n$ for which the coefficient of $X^n$ is non-zero is $2$, so
$\deg\left(P\right)=2$.
:::
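The definition of the degree translates directly into a search for the last non-zero coefficient. The following Python sketch assumes integer coefficients stored in ascending order of power; the function name `degree` is hypothetical. It returns `None` when every coefficient is zero, since the definition assigns no degree in that case.

```python
def degree(coeffs):
    """Return the largest i with coeffs[i] != 0, or None if all entries are zero."""
    for i in range(len(coeffs) - 1, -1, -1):
        if coeffs[i] != 0:
            return i
    return None


print(degree([5, -3, 2]))     # 2X^2 - 3X + 5, prints 2
print(degree([5, -3, 2, 0]))  # a trailing zero coefficient does not change the degree, prints 2
print(degree([0, 0, 0]))      # no non-zero coefficient, prints None
```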
The astute reader might ask the following. Suppose that $P$ is given by
$$\begin{equation*} P=0+0X+0X^2+0X^3+\dots+0X^n \end{equation*}$$
What is the degree of $P$? On one hand, by our definition, we can't
assign it a degree: there are no non-zero coefficients in the
polynomial! On the other hand, we intuitively know that the above
polynomial represents a meaningful polynomial, especially for the theory
we are attempting to develop. It is not clear how to resolve this
problem for now. Perhaps developing the theory as much as we can
without it will make it clear how to resolve this issue.
Defining addition between two polynomials
We can define how to add two polynomials together. To do so we need to
recast how we see a polynomial, and for this we recall the definition of
the Cartesian product of $n$ sets
(Definition 33).
Let $S_1,S_2,\dots,S_n$ be sets. We define the Cartesian product of
$S_1,S_2,\dots,S_n$, denoted $S_1\times S_2\times\dots\times S_n$, to be
the set of all ordered $n$-tuples of the form
$\left(s_1,s_2,\dots,s_n\right)$ where
$s_1\in S_1,s_2\in S_2,\dots,s_n\in S_n$. This is to say that
$$\begin{equation*} S_1\times S_2\times\dots\times S_n=\left\{\left(s_1,s_2,\dots,s_n\right):s_1\in S_1,s_2\in S_2,\dots,s_n\in S_n\right\} \end{equation*}$$
In particular, if all of the sets are the same set $S$, we denote this by $S^n$.
We can use this idea to define a polynomial of degree $n$ as a tuple.
Firstly, observe that we can write a polynomial $P\left(X\right)$ as
$$\begin{align*} P\left(X\right)&=a_nX^n+a_{n-1}X^{n-1}+a_{n-2}X^{n-2}+\dots+a_1X+a_0\\ &= a_0+a_1X+a_2X^2+a_3X^3+\dots+a_{n-1}X^{n-1}+a_nX^n\\ &=\sum_{i=0}^n a_i X^i \end{align*}$$
That is, we can express $P$ as the sum of products of coefficients in $S$
and the corresponding powers of the indeterminate variable $X$.
Now, we have that $\deg\left(P\right)=n$, so in order to have the correct
sized tuple we must consider the Cartesian product of $S$ with itself
$n+1$ times, that is
$$\begin{equation*} S^{n+1}=\prod_{i=0}^n S \end{equation*}$$
As each $a_i\in S$ for $0\leq i\leq n$ we have that the tuple
$a=\left(a_0,a_1,a_2,\dots,a_{n-1},a_n\right)\in S^{n+1}$. This is the
correspondence we need.
::: definition Definition 183. Polynomial as an $n+1$-tuple
Let $S\neq\emptyset$, let $n\in\mathbb{Z}$ with $n\geq 0$, and let
$P\in S\left[X\right]$ with $\deg\left(P\right)=n$ where
$$\begin{equation*} P\left(X\right)=a_nX^n+a_{n-1}X^{n-1}+a_{n-2}X^{n-2}+\dots+a_1X+a_0 \end{equation*}$$
We can view the polynomial $P$ as an element of the set $S^{n+1}$, say $a$,
with the form
$$\begin{equation*} a=\left(a_0,a_1,a_2,\dots,a_{n-1},a_n\right) \end{equation*}$$
More simply, we can write
$$\begin{equation*} P=\left(a_0,a_1,a_2,\dots,a_{n-1},a_n\right) \end{equation*}$$
where the powers of $X$ are implicit.
:::
This definition has an immediate consequence: it enables us to have a
representation for each $X^n$ for any $n\geq 0$. For example, we see
that
$$\begin{align*} P\left(X\right)=1=X^0 &\iff a=\left(1\right)\\ P\left(X\right)=X &\iff a=\left(0,1\right)\\ P\left(X\right)=X^2 &\iff a=\left(0,0,1\right)\\ P\left(X\right)=X^3 &\iff a=\left(0,0,0,1\right)\\ &\dots \end{align*}$$
This allows us to build an understanding of how to properly define addition of two polynomials. Suppose we have
$$\begin{align*} P\left(X\right)&=1+X+X^2\\ Q\left(X\right)&=4-3X+X^2+X^3 \end{align*}$$
where the coefficients of $P$ and $Q$ are elements of $\mathbb{Z}$. We
see that $P=\left(1,1,1\right)$ and $Q=\left(4,-3,1,1\right)$. Firstly,
we have that $P$ has fewer entries in its tuple than $Q$. We can account
for this by noting that $P\left(X\right)=1+X+X^2=1+X+X^2+0X^3$ and so an
alternative representation of $P$ is given by $P=\left(1,1,1,0\right)$.
Now, consider the terms in both $P$ and $Q$ which are associated with
$X^0$, i.e. $P_0\left(X\right)=1$ and $Q_0\left(X\right)=4$. As these are
simply elements of $\mathbb{Z}$ we would expect that
$P_0\left(X\right)+Q_0\left(X\right)=1+4=5$, and so the first entry of the
tuple for the sum should be $5$.
Considering the terms in both $P$ and $Q$ which are associated with
$X^1$, $P_1\left(X\right)=X$ and $Q_1\left(X\right)=-3X$, we would then
expect that $P_1\left(X\right)+Q_1\left(X\right)=X-3X=-2X$.
We can continue this process for the other terms $X^2$ and $X^3$ to get
$$\begin{align*} P_0\left(X\right)+Q_0\left(X\right)&=1+4=5\\ P_1\left(X\right)+Q_1\left(X\right)&=X-3X=-2X\\ P_2\left(X\right)+Q_2\left(X\right)&=X^2+X^2=2X^2\\ P_3\left(X\right)+Q_3\left(X\right)&=0X^3+X^3=X^3 \end{align*}$$
This would then suggest that
$P\left(X\right)+Q\left(X\right)=5-2X+2X^2+X^3$. Or, expressing this in
tuple form, we have
$$\begin{equation*} \left(1,1,1,0\right)+\left(4,-3,1,1\right)=\left(5,-2,2,1\right) \end{equation*}$$
That is, the addition of two tuples representing polynomials is done by doing an "element-wise" addition of the tuples. There are a few things that would need to be considered for this to become the foundation for defining addition for polynomials.
Firstly, we observed that $P_0\left(X\right)+Q_0\left(X\right)$ made
sense as this represents integer addition. If we picked our coefficients
from, say, $\mathbb{N}$, we would not always be able to consider
$P_0\left(X\right)-Q_0\left(X\right)$; in fact, this holds for each
$P_i-Q_i$. It would therefore be useful to have closure of addition, and
additionally a notion of subtraction of the elements of $S$. This puts a
restriction on what the set $S$ can be; for example, it is clear that
$S\neq\mathbb{N}$ as $\mathbb{N}$ is not closed under subtraction.
This also means our definition of polynomials is going to depend on the
underlying set that the coefficients come from. It is therefore a wise
idea to, at least temporarily, distinguish between when we are talking
about polynomial addition and when we are talking about the addition of
the elements of the set $S$.
We will use $+_S$ when talking about addition between the elements of
the set $S$, and we will use $\oplus_S$ for polynomial
addition (see footnote 16).
Furthermore, we had that $P$ was of a lesser degree than $Q$, with
$\deg\left(P\right)=2$ and $\deg\left(Q\right)=3$. This poses no real
issue as we can always extend an $m$-tuple to an $n$-tuple, for $m<n$.
Indeed, suppose we have an element $s\in S^m$ and we want to extend it
to an element of $S^n$ where $m<n$, then we can use the following map
$$\begin{align*} f:S^m&\mathlarger{\mathlarger{\rightarrow}}S^n\\ (s_1,s_2,\dots,s_m)&\mapsto f\left((s_1,s_2,\dots,s_m)\right)=(s_1,s_2,\ldots,s_m, \underbrace{0, 0, \dots, 0}_{n-m \text{ times}}) \end{align*}$$
Or more simply, append $n-m$ zeros to the element of $S^m$. We provide a
general definition.
::: definition Definition 184. Polynomial tuple extension map
Let $n,m\in\mathbb{Z}$ so that $1\leq m\leq n$. We define the polynomial
tuple extension map by
$$\begin{align*} E_m^n:S^m&\mathlarger{\mathlarger{\rightarrow}}S^n\\ (s_1,s_2,\dots,s_m)&\mapsto E_m^n\left((s_1,s_2,\dots,s_m)\right)=(s_1,s_2,\ldots,s_m, \underbrace{0, 0, \dots, 0}_{n-m \text{ times}}) \end{align*}$$
Here, we are using the notation $E_m^n$ to indicate this extends an
$m$-tuple to an $n$-tuple.
:::
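As an illustration, the extension map amounts to padding a tuple with zeros. The following Python sketch assumes integer entries and represents an element of $S^m$ as a Python tuple of length $m$; the function name `extend` and the `zero` parameter are hypothetical.

```python
def extend(s, n, zero=0):
    """Pad the m-tuple s with n - m copies of zero, mirroring the extension map."""
    m = len(s)
    if m > n:
        raise ValueError("cannot extend a tuple to a shorter length")
    return tuple(s) + (zero,) * (n - m)


print(extend((1, 1, 1), 4))  # prints (1, 1, 1, 0)
```

When $m=n$ the tuple is returned unchanged, which matches the case $n-m=0$ in the definition.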
This means that given two polynomials expressed in their tuple forms, we can always extend the one with the fewer elements so that they share the same number of elements. From this, we can define polynomial addition.
::: definition Definition 185. Polynomial addition
Let $S$ be a set and let $P,Q\in S\left[X\right]$ so that
$\deg\left(P\right)=n$ and $\deg\left(Q\right)=m$ so that without loss
of generality we have that $m\leq n$ and
$$\begin{align*} P&=\left(p_0,p_1,p_2,\dots, p_{n-1},p_n\right)\\ Q&=\left(q_0,q_1,q_2,\dots, q_{m-1},q_m\right) \end{align*}$$
Furthermore, suppose that $S$ is endowed with an addition $+_S$ such
that $+_S$ is well-defined and closed.
As $Q$ has $m+1$ entries and $P$ has $n+1$ entries, we first extend $Q$ using the tuple extension map $E_{m+1}^{n+1}$.
We define the addition of $P$ and $Q$ by
$$\begin{align*} \oplus_S:S^{n+1}\times S^{n+1}&\mathlarger{\mathlarger{\rightarrow}}S^{n+1}\\ \left(P, E_{m+1}^{n+1}\left(Q\right)\right)&\mapsto\oplus_S\left(P,E_{m+1}^{n+1}\left(Q\right)\right)=\left(p_0+_S q_0,p_1+_S q_1,\dots,p_m+_S q_m,p_{m+1}+_S 0,\dots,p_n+_S 0\right) \end{align*}$$ :::
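The definition can be tried out on the earlier example. The following Python sketch assumes integer coefficients, with Python's `+` standing in for $+_S$; the function name `poly_add` is hypothetical. The padding performed by `zip_longest` plays the role of the tuple extension map.

```python
from itertools import zip_longest


def poly_add(p, q, zero=0):
    """Entry-wise sum of two coefficient tuples, padding the shorter one with zeros."""
    return tuple(a + b for a, b in zip_longest(p, q, fillvalue=zero))


# P(X) = 1 + X + X^2 and Q(X) = 4 - 3X + X^2 + X^3
print(poly_add((1, 1, 1), (4, -3, 1, 1)))  # prints (5, -2, 2, 1)
```

Because the padding is applied to whichever tuple is shorter, the sketch does not need the assumption $m\leq n$.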
::: proposition Proposition 139. Polynomial addition is well-defined and closed
Let $S$ be a set and let $P,Q\in S\left[X\right]$ so that
$\deg\left(P\right)=n$ and $\deg\left(Q\right)=m$ so that without loss
of generality we have that $m\leq n$ and
$$\begin{align*} P&=\left(p_0,p_1,p_2,\dots, p_{n-1},p_n\right)\\ Q&=\left(q_0,q_1,q_2,\dots, q_{m-1},q_m\right) \end{align*}$$
Furthermore, suppose that $S$ is endowed with an addition $+_S$ such
that $+_S$ is well-defined and closed. We have that the polynomial
addition of $P$ and $Q$, denoted $P\oplus_S Q$, is well-defined and
closed.
Proof:
This is immediate. By the definition of polynomial addition, we have that
$$\begin{equation*} P\oplus_S Q=\left(p_0+_S q_0,p_1+_S q_1,\dots,p_m+_S q_m,p_{m+1}+_S 0,\dots,p_n+_S 0\right) \end{equation*}$$
As $+_S$ is well-defined and closed, we have that
$p_i+_S q_i\in S$ for $0\leq i\leq m$. Moreover, for $m<i\leq n$ we have
that $p_i+_S 0=p_i\in S$. Hence, all the entries in the tuple given by
$P\oplus_S Q$ are elements of $S$ so that $P\oplus_S Q\in S^{n+1}$. $\qed$
:::
::: {#lem:NT_Polynomial_degree_addition .lemma} Lemma 12. Degree of polynomial from polynomial addition
Let $P,Q\in S\left[X\right]$. Then
$$\begin{equation*} \deg\left(P\oplus_S Q\right)\leq\max\left(\deg\left(P\right),\deg\left(Q\right)\right) \end{equation*}$$
Proof:
The result is immediate if $\deg\left(P\right)=\deg\left(Q\right)$, so
suppose not and without loss of generality suppose that
$\deg\left(P\right)>\deg\left(Q\right)$ where $\deg\left(P\right)=n$ and
$\deg\left(Q\right)=m$. Then as tuples we have that
$$\begin{align*} P&=\left(p_0,p_1,p_2,\dots,p_{n-1},p_n\right)\\ Q&=\left(q_0,q_1,q_2,\dots,q_{m-1},q_m\right) \end{align*}$$
As $\deg\left(Q\right)<\deg\left(P\right)$ we use the tuple extension
mapping $E_{m+1}^{n+1}$ on $Q$ and we have that
$\deg\left(E_{m+1}^{n+1}\left(Q\right)\right)\leq n$. The sum is then a tuple with $n+1$ entries, so no power of $X$ higher than $X^n$ can appear. Hence
$$\begin{equation*} \deg\left(P\oplus_S Q\right)\leq \deg\left(P\oplus_S E_{m+1}^{n+1}\left(Q\right)\right)\leq n = \max\left(\deg\left(P\right),\deg\left(Q\right)\right) \end{equation*}$$
$\qed$ :::
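The inequality in the lemma can be strict, which a small computation makes visible. The following Python sketch assumes integer coefficients in ascending order of power; the function names `poly_add` and `degree` are hypothetical. The two polynomials are chosen so that their leading terms cancel.

```python
from itertools import zip_longest


def poly_add(p, q):
    """Entry-wise sum of two coefficient tuples, padding the shorter one with zeros."""
    return tuple(a + b for a, b in zip_longest(p, q, fillvalue=0))


def degree(coeffs):
    """Largest index with a non-zero entry, or None if every entry is zero."""
    nonzero = [i for i, c in enumerate(coeffs) if c != 0]
    return max(nonzero) if nonzero else None


p = (1, 2, 3)    # 1 + 2X + 3X^2, degree 2
q = (0, 5, -3)   # 5X - 3X^2, degree 2
total = poly_add(p, q)
print(total, degree(total))  # prints (1, 7, 0) 1
```

Here both summands have degree $2$ while the sum has degree $1$, so only the bound $\leq$ can hold in general.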
We are getting an idea for our problem with the polynomial given by
$$\begin{equation*} P=0+0X+0X^2+0X^3+\dots+0X^n \end{equation*}$$
If we want Lemma 12
to be consistent, we
should define the degree of $P$ to be such that it is no larger than the
degree of any other polynomial. In particular, for non-zero $c\in S$ the
constant polynomial $Q=c$ with $Q\in S\left[X\right]$ has degree $0$, so we would want
$\deg\left(P\right)< \deg\left(Q\right)=0$. This still doesn't fully
answer the question: which negative integer should we take for the
degree of $P$? Maybe, once we have a definition for the multiplication
of polynomials, it will provide further insight.
Now, given a potential candidate for defining the addition of two
polynomials, we can also consider a potential candidate for defining the
subtraction of two polynomials. As before, we take inspiration from
$\mathbb{Z}$.
We have shown that the addition of integers is closed and
well-defined. Additionally, for every $x\in\mathbb{Z}$ we have that
$\exists y\in\mathbb{Z}$ so that $x+y=0$. In particular, we take $y=-x$ so that the
expression becomes $x-x=0$. A sensible definition for polynomial
subtraction should also respect these properties; subtracting two
polynomials should give another polynomial. This raises a question:
suppose $P\in S\left[X\right]$, what is $P-P$?
We know that in $\mathbb{N}$, $\mathbb{Z}$ and $\mathbb{Q}$, an
element $x$ satisfies $x-x=0$, but what does it mean for $0$ to
be an element of $S$ and, by extension, of $S\left[X\right]$? In particular,
is it the same $0$ as for $\mathbb{N}$, $\mathbb{Z}$ and $\mathbb{Q}$?
On the other hand, we know that for any $x$ in $\mathbb{N}$,
$\mathbb{Z}$ and $\mathbb{Q}$ we have $x+0=x=0+x$. A similar sort of
element of $S$ would be useful and clearly plays an important role in
defining a similar element for $S\left[X\right]$. This idea is general
enough, assuming we have a well-behaved $+_S$, that we can apply it to a
general set $S$.
::: definition Definition 186. Additive Identity of a set $S$
Let $S$ be a set so that there is an operation $+_S:S^2\rightarrow S$
such that $+_S$ is closed and well-defined. Let $e\in S$. If we have
that $\forall s \in S$ that $s+_S e=s$, then we say that $e$ is a right
additive identity element of $S$.
Similarly, if $\forall s \in S$ we have that $e+_S s=s$, then we say
that $e$ is a left additive identity element of $S$.
If we have that $\forall s\in S$ that $e+_S s=s=s+_S e$, we simply call
$e$ an additive identity element.
If we need to be clear which set the additive identity belongs to, we will write $e_S$. :::
It is an immediate consequence of the definition that the additive identity
element is unique.
::: proposition
Proposition 140. The additive identity element of a set $S$ is
unique
Let $S$ be a set so that there is an operation $+_S:S^2\rightarrow S$
such that $+_S$ is closed and well-defined. Let $e,f\in S$ be additive
identity elements of $S$.
We have that $e=f$.
Proof:
Let $S$ and $+_S:S^2\rightarrow S$ be as given, and let $e,f\in S$ be
additive identity elements of $S$.
By definition, we have that
$$\begin{equation*} e=e+_S f=f \end{equation*}$$
where the first equality holds as $f$ is a right additive identity and the second as $e$ is a left additive identity.
As $+_S$ is well-defined, we have that $e=f$ as required.
$\qed$
:::
From this, we can immediately identify that $0$ in $\mathbb{N}$,
$\mathbb{Z}$ and $\mathbb{Q}$ is unique.
We have resolved one part of the problem raised by the fact that, in $\mathbb{N}$,
$\mathbb{Z}$ and $\mathbb{Q}$, an element $x$ satisfies $x-x=0$: we have
answered what it means for "$0$" to be in $S$. But what does it mean
for $-x\in S$ given $x\in S$? Noting that $x-x=x+_S\left(-x\right)$,
this is precisely what it means for $x$ to be invertible in $S$, at least
with respect to $+_S$. As with the additive identity of $S$, this idea
is also general enough to apply to a more general set $S$.
::: definition Definition 187. Additive Inverse of a set $S$
Let $S$ be a set so that there is an operation $+_S:S^2\rightarrow S$
such that $+_S$ is closed and well-defined, and let $e\in S$ be the additive identity element of $S$. Let $s\in S$.
If we have that $\exists x\in S$ such that $s+_S x=e$, then we say that
$x$ is a right additive inverse element of $s$ in $S$.
Similarly, if $\exists x\in S$ such that $x+_S s=e$, then we say that
$x$ is a left additive inverse element of $s$ in $S$.
If we have that $\exists x\in S$ such that $x+_S s=e=s+_S x$, we simply call
$x$ an additive inverse element of $s$ in $S$.
:::
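For a small finite set, the definitions of additive identity and additive inverse can be checked by brute force. The following Python sketch is only an illustration: the choice of $S=\{0,1,2,3,4\}$ with addition modulo $5$ is an assumption made for the example, and the function names `identities` and `inverses` are hypothetical.

```python
def identities(elements, op):
    """All e with op(s, e) == s == op(e, s) for every s in elements."""
    return [e for e in elements
            if all(op(s, e) == s and op(e, s) == s for s in elements)]


def inverses(elements, op, e, s):
    """All x with op(s, x) == e == op(x, s)."""
    return [x for x in elements if op(s, x) == e and op(x, s) == e]


S = list(range(5))
add_mod_5 = lambda a, b: (a + b) % 5   # stands in for +_S in this example

print(identities(S, add_mod_5))        # prints [0]
print(inverses(S, add_mod_5, 0, 2))    # prints [3]
```

Each search returns a single element, which is what uniqueness of the identity and of inverses would lead us to expect in this example.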
As with the additive identity element, we have an immediate consequence
that the inverse of an element $s\in S$ is unique.
::: proposition
Proposition 141. The additive inverse element of an element of $S$
is unique
Let $S$ be a set so that there is an operation $+_S:S^2\rightarrow S$
such that $+_S$ is closed and well-defined, with additive identity element $e$. Suppose additionally that $+_S$ is associative. Let $s\in S$ be an arbitrary
element of $S$.
We have that the additive inverse of $s$ is unique.
Proof:
Let $S$ and $+_S:S^2\rightarrow S$ be as given, and let $s\in S$ be an
arbitrary element of $S$ and suppose that $s$ has two inverses $x$ and
$y$.
By definition, we have that
$$\begin{align*} x&=x+_S e\\ &=x+_S\left(s+_S y\right)\\ &=\left(x+_S s\right)+_S y\\ &=e+_S y\\ &=y \end{align*}$$
Here the second equality uses that $y$ is an inverse of $s$, the third uses associativity, and the fourth uses that $x$ is an inverse of $s$. Hence $x=y$ as required.
$\qed$
:::
It would also be useful to undo the addition of polynomials via
polynomial subtraction. The only requirement is that every element of $S$ is
invertible with respect to $+_S$. In particular, as we are using a well-defined and closed
operation on $S$, that is $+_S$, we have gained a definition of
subtraction for free! Using $-_S$ to denote subtraction in $S$, so that $a-_S b=a+_S\left(-b\right)$, we have
$$\begin{align*} \ominus_S:S^{n+1}\times S^{n+1}&\mathlarger{\mathlarger{\rightarrow}}S^{n+1}\\ \left(P, E_{m+1}^{n+1}\left(Q\right)\right)&\mapsto\ominus_S\left(P,E_{m+1}^{n+1}\left(Q\right)\right)=\left(p_0-_S q_0,p_1-_S q_1,\dots,p_m-_S q_m,p_{m+1}-_S 0,\dots,p_n-_S 0\right) \end{align*}$$
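As with addition, subtraction can be tried on the earlier pair of polynomials. The following Python sketch assumes integer coefficients, with Python's `-` standing in for $-_S$; the function name `poly_sub` is hypothetical.

```python
from itertools import zip_longest


def poly_sub(p, q, zero=0):
    """Entry-wise difference of two coefficient tuples, padding the shorter one with zeros."""
    return tuple(a - b for a, b in zip_longest(p, q, fillvalue=zero))


# Q(X) - P(X) with Q(X) = 4 - 3X + X^2 + X^3 and P(X) = 1 + X + X^2
print(poly_sub((4, -3, 1, 1), (1, 1, 1)))  # prints (3, -4, 0, 1)
```

Subtracting a tuple from itself gives a tuple of zeros, which is the behaviour we wanted from $P-P$.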
Given a notion of subtraction, we can also define what it means for two polynomials to be equal.
::: definition Definition 188. Equality of Polynomials
Let $P,Q\in S\left[X\right]$ where $\deg\left(P\right)=n$ and
$\deg\left(Q\right)=m$ where without loss of generality $m\leq n$.
We say that $P$ and $Q$ are equal as polynomials, written $P=Q$, if and
only if
$$\begin{equation*} P\ominus_S Q = 0 = \left(\underbrace{0,0,0,\dots, 0,0}_{n+1 \text{ times}}\right) \end{equation*}$$
That is, if the difference between the two is the zero polynomial. :::
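The equality test in the definition can be sketched as follows. This Python sketch assumes integer coefficients in ascending order of power; the function names `poly_sub` and `poly_equal` are hypothetical. Two tuples are declared equal when every entry of their difference is zero.

```python
from itertools import zip_longest


def poly_sub(p, q):
    """Entry-wise difference, padding the shorter tuple with zeros."""
    return tuple(a - b for a, b in zip_longest(p, q, fillvalue=0))


def poly_equal(p, q):
    """True when the difference of p and q is a tuple of zeros."""
    return all(c == 0 for c in poly_sub(p, q))


print(poly_equal((1, 1, 1), (1, 1, 1, 0)))  # prints True
print(poly_equal((1, 1, 1), (1, 1, 2)))     # prints False
```

Under this test the tuples $\left(1,1,1\right)$ and $\left(1,1,1,0\right)$ are equal as polynomials, in line with the earlier observation that $1+X+X^2=1+X+X^2+0X^3$.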
We can therefore define the following relation.
::: definition Definition 189. :::
It is immediate that a polynomial therefore has a unique representation as an $n+1$-tuple.
Defining multiplication between two polynomials
We can use the same idea of the $(n+1)$-tuples to define multiplication of
polynomials. Recall that we observed that we can express the
indeterminate $X$, and powers of it, as follows
$$\begin{align*} P\left(X\right)=1=X^0 &\iff a=\left(1\right)\\ P\left(X\right)=X &\iff a=\left(0,1\right)\\ P\left(X\right)=X^2 &\iff a=\left(0,0,1\right)\\ P\left(X\right)=X^3 &\iff a=\left(0,0,0,1\right)\\ &\dots \end{align*}$$
Intuitively, we want $X^2=X*X$, $X^3=X^2*X$ and so on. That is
$$\begin{align*} XX=\left(0,1\right)\left(0,1\right)&=\left(0,0,1\right)=X^2\\ X^2X=\left(0,0,1\right)\left(0,1\right)&=\left(0,0,0,1\right)=X^3\\ &\dots \end{align*}$$
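The pattern in these products suggests that multiplying a power of $X$ by $X$ prepends a zero to its tuple. The following Python sketch illustrates this for powers of $X$ only, assuming integer entries; the function name `times_x` is hypothetical, and applying it to sums of powers would rely on properties of multiplication that have not yet been established.

```python
def times_x(p):
    """Multiply by X: prepend a zero so every coefficient moves one power up."""
    return (0,) + tuple(p)


x = (0, 1)
x_squared = times_x(x)
x_cubed = times_x(x_squared)
print(x_squared, x_cubed)  # prints (0, 0, 1) (0, 0, 0, 1)
```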
What about more complex expressions, say $X*\left(X+X^2\right)$? The
answer to this would depend on whether multiplication is distributive over
addition with respect to the indeterminate, and additionally on
whether multiplication is commutative! For now, let us assume that this is the
case.
It seems therefore that multiplication by $X$ has the effect of
"shifting" the entries of the tuple one place to the right.
1. We are clearly not talking about sunsets.
2. If we are being logical and don't want to get soaked before we get to our destination.
3. By exist we mean in the abstract sense.
4. Without loss of generality means we have made a choice in the proof which allows us to consider a single case, as the other cases have the same argument just with the notation changed to reflect the different choice.
5. Unless you are either not a human or somehow reading this in some unknown form of existence.
6. We will first need to prove that, in order to speak of the inverse of a mapping, the left and right inverses must be equal.
7. Hence the similar names.
8. Hopefully not all at once!
9. We can think of this as some sort of singularity.
10. Phew!
11. Until someone manages to find a way to get past the elegant mathematics of the encryption scheme!
12. If there is only one theorem you learn when studying Number Theory, it has to be this one!
13. I prefer this way of thinking.
14. RSA stands for Rivest--Shamir--Adleman.
15. Named after the 3rd-century mathematician Diophantus of Alexandria.
16. When we have fully defined polynomial addition, we will go with the usual convention of just using $+$ to denote addition.