Intro to Modern Algebra

Math 310
University of Nebraska–Lincoln

In this course we will study proof writing based on abstract algebra.

Algebra comes from the Arabic word “al-jabr”, which means reunion of broken parts. In mathematics this word is used for naming a broad area of study which deals with manipulating abstract symbols and objects in a similar manner to that in which we manipulate numbers. Therefore, we shall start by studying the properties of numbers (specifically integers) and later we will build different algebraic structures which obey the same properties.

At the same time we will learn how to write mathematical proofs. This is a skill that takes some amount of patience to master. You might be surprised by the fact that advanced mathematics and in particular proof writing does not consist of a long string of formulas. Instead it consists of many words with some formulas sprinkled throughout. As is the case with a good essay, you might have to go through several attempts and subsequent drafts to write a really good proof.

One very important word of advice about proofs: in this course we will be developing the material from the ground up. This means that when you write a proof you should be able to justify every step relying only on facts previously mentioned in this class and not on any other knowledge.

1 The Integers

1.1 Properties of integers

What is a number? Certainly the things used to count sheep, money, etc. are numbers: $1, 2, 3, \dots$ .

We will call these positive whole numbers natural numbers and write ${\mathbb{N}}$ for the set of all natural numbers: ${\mathbb{N}} = \{1, 2, 3, 4, \dots\}.$ What you see above is set notation, which one reads as “the set of natural numbers, ${\mathbb{N}}$ , consists of 1, 2, 3, 4, etc.”

Since we like to keep track of debts too, we’ll allow negatives and $0$ , which gives us the integers. The word integer refers to a whole number, either negative, positive, or zero. So, the set of all integers is: ${\mathbb{Z}} = \{ \dots, -3, -2, -1, 0, 1, 2, 3, 4, \dots\}.$ The symbol ${\mathbb{Z}}$ is used for the set of integers since the German word for numbers is Zahlen.

From now on, for brevity, we write $a \in {\mathbb{Z}}$ to mean “ $a$ is an integer”. The symbol $\in$ means “in”, as in “belongs to” or “is an element of”, so $a \in {\mathbb{Z}}$ reads “ $a$ is an element of the set of all integers”.

Integers have many nice mathematical properties one could ask for: they have operations of addition, subtraction, and multiplication. These operations follow several rules which we will take for granted. We call these facts that we take for granted “axioms".

Axioms 1.1 (Properties of integers).

Closure under addition: If $a,b$ are integers then $a+b$ is an integer.
Associativity of addition: If $a,b,c$ are integers then $(a+b)+c=a+(b+c)$ .
Commutativity of addition: If $a,b$ are integers then $a+b=b+a$ .
Additive identity: There exists an integer $0$ so that if $a$ is an integer then $a+0=0+a=a$ .
Additive inverses: If $a$ is an integer then there exists an integer $-a$ so that $a+(-a)=(-a)+a=0$ .
Closure under multiplication: If $a,b$ are integers then $ab$ is an integer.
Associativity of multiplication: If $a,b,c$ are integers then $(ab)c=a(bc)$ .
Commutativity of multiplication: If $a,b$ are integers then $ab=ba$ .
Multiplicative identity: There exists an integer $1$ so that if $a$ is an integer then $a\cdot 1=1\cdot a=a$ .
Distributivity: If $a,b,c$ are integers then $a(b+c)=(b+c)a=ab+ac$ .

Remark 1.1. If we consider instead the set of natural numbers ${\mathbb{N}}=\{1,2,3,\ldots\}$ we see that while most of the above listed axioms continue to hold for the natural numbers, two of them do not. Indeed the additive identity axiom is not satisfied since $0$ is not a natural number. Moreover the additive inverses axiom is not satisfied since if $a$ is a natural number then $-a$ is not a natural number.

1.2 Divisibility and proof techniques

Multiplication produces a relationship which we call divisibility among integers.

Definition 1.2. An integer $a$ divides an integer $b$ if there exists an integer $n$ such that $b = a \cdot n$ .

Notation 1.1. We write $a\mid b$ for $a$ divides $b$ . This should not be confused with the fraction $a/b$ which indicates division.

Example 1.3.

The integers that divide $12$ are $1,2,3,4,6,12,-1,-2,-3,-4,-6$ and $-12$ — don’t forget the negative ones!
Every integer divides $0$ : If $a$ is any integer, then since $a \cdot 0 = 0$ , by definition $a$ divides $0$ .
The only integer $0$ divides is $0$ itself: since $0 \cdot n = 0$ (for any integer $n$ ), by definition $0$ divides $0$ . But if $b$ is any non-zero integer, then $0 \cdot n \ne b$ for all possible $n$ . So, $0$ does not divide $b$ when $b \ne 0$ .
The only integers that divide $1$ are $-1$ and $1$ . However $1$ divides any integer $b$ since $b=1\cdot b$ .
The only integers that divide $-1$ are $-1$ and $1$ . However $-1$ divides any integer $b$ since $b=(-1)\cdot (-b)$ where $-b$ is an integer by the additive inverses axiom.
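The facts in Example 1.3 can be checked by machine. The following is a minimal Python sketch, not part of the course material: the function names `divides` and `divisors` are made up for illustration, and the search range in `divisors` assumes that a divisor of a non-zero integer $b$ is never larger than $|b|$ in absolute value, a fact these notes have not proven.

```python
def divides(a: int, b: int) -> bool:
    """Return True when a divides b, i.e. b = a * n for some integer n."""
    if a == 0:
        # 0 * n is always 0, so 0 divides only 0.
        return b == 0
    # For a != 0, a divides b exactly when dividing b by a leaves remainder 0.
    return b % a == 0


def divisors(b: int) -> list[int]:
    """List the divisors of a non-zero integer b, negative ones included."""
    bound = abs(b)  # assumption: no divisor of b lies outside [-|b|, |b|]
    return [a for a in range(-bound, bound + 1) if a != 0 and divides(a, b)]


print(divisors(12))
print(divides(5, 0), divides(0, 0), divides(0, 7))
```

A computer check like this is useful for building intuition, but it does not replace the arguments given in the example, which work for every integer at once.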

Definition 1.4. An integer $m$ is even if $2$ divides $m$ . In other words, $m$ is even provided there exists an integer $n$ such that $m = 2 \cdot n$ .

Definition 1.5. An integer $m$ is odd provided it is not even.

Let us now work out our first mathematical proof!

We will prove the assertion "The sum of two even integers is even." It helps to first rewrite this in if-then form as “ If $a$ and $b$ are even integers then $a+b$ is an even integer."

The proof below is an example of a proof technique called direct proof, which works best for statements of the form “If P, then Q." A direct proof for such a statement starts with "Suppose P". Then several steps may follow that use axioms, previously proven facts, and formula manipulations. The proof ends when through these deductions you have reached the conclusion that Q is true. This is marked by the proof end symbol, which looks like a square.

Proposition 1.6. If $a$ and $b$ are even integers then $a+b$ is an even integer.

Proof. Suppose $a$ and $b$ are both even integers. By definition of even this means there exists an integer $x$ such that $a = 2x$ and there exists an integer $y$ such that $b = 2y$ . It follows that $a + b = 2x + 2y = 2(x+y) \qquad \text{ by distributivity}.$ Since $x$ and $y$ are integers, $x+y$ is an integer by closure of the integers under addition. Since we have found an integer, $x+y$ , such that $a + b =2(x+y)$ , by definition $2$ divides $a+b$ , so $a+b$ is even. ◻

Proposition 1.7. If $a, b,c,m,n$ are integers such that $a$ divides $b$ and $a$ divides $c$ , then $a$ divides $b \cdot m + c \cdot n$ .

Proof. Suppose $a$ divides $b$ and $a$ divides $c$ . By definition, this means that there exists an integer $x$ so that $b = ax$ and there exists an integer $y$ such that $c = ay$ . We have $b\cdot m+ c \cdot n = axm + ayn = a(xm+yn).$ Since $x,y,m,n$ are integers, $xm + yn$ is also an integer by closure of the integers under addition and multiplication. Since $b\cdot m+ c \cdot n =a(xm+yn)$ and $xm + yn$ is an integer, $a$ divides $bm + cn$ by definition. ◻

Let’s now prove that the sum of an odd integer and an even integer is odd.

We will do so by contradiction. To understand this proof technique consider the following scenario. Suppose I wanted to prove that it is not night outside. If I can look out the window and see the sun then it cannot be night, so it must be daytime. Here seeing the sun contradicts that it is night, so it allows us to deduce that it is day.

The proof below is an example of the proof technique called proof by contradiction, which works best for statements of the form “If P, then Q." where $Q$ is a negative statement. In the proof below $Q$ will be the statement $a+b$ is odd, which is the same as " $a+b$ is not even". Not $Q$ is the statement " $a+b$ is even".

A proof by contradiction for such a statement starts with "Suppose P. Assume (towards a contradiction) not Q." Then several steps may follow that use axioms, previously proven facts, and formula manipulations. The proof ends when through these deductions you have reached a contradiction. At this point we say “Since we have reached a contradiction the assumption not $Q$ must be false, thus $Q$ is true.”

Proposition 1.8. If $a$ is an even integer and $b$ is an odd integer, then $a+b$ is odd.

Proof. Suppose $a$ is an even integer and $b$ is an odd integer, and assume (towards a contradiction) that $a+b$ is even. Then $2\mid (a+b)$ and $2\mid a$ by definition of even. By definition of divisibility, there exist integers $k, l$ so that $a+b=2k$ and $a=2l$ .

We have $b=(a+b)-a=2k-2l=2(k-l).$ Since $l$ is an integer, so is its additive inverse $-l$ . Since $k$ and $-l$ are integers, $k-l=k+(-l)$ is an integer by closure under addition. Since $b=2(k-l)$ and $k-l$ is an integer, $2$ divides $b$ , so $b$ is even by definition. This contradicts the given fact that $b$ is an odd integer.

Since we have reached a contradiction, the assumption that $a+b$ is even must be false, thus $a+b$ is odd. ◻

1.3 Sets

Sets are just collections of objects. The objects in a set are called elements of that set.

Notation 1.2. We write $x\in A$ to mean that $x$ is an element of a set $A$ .

Example 1.9. Writing $A=\{1,2,3\}$ means $A$ is the set with elements 1,2, and 3.

Example 1.10. The empty set, denoted $\emptyset$ , is the set with no elements.

Notation 1.3. We can describe the elements of a set by writing down the properties they satisfy. This is called set-builder notation.

Example 1.11. For example, the set of all even integers is $E=\{a \mid a\in {\mathbb{Z}}, a\text{ is even} \}$ and the set of all odd integers is $O=\{a \mid a\in {\mathbb{Z}}, a\text{ is odd} \}.$ The vertical line in set builder notation is read "such that". It does not mean divides, although it is the same symbol.

New sets can be obtained from old by means of several set operations.

Definition 1.12. Given two sets $A$ and $B$ , we can form new sets by

$\bullet$	union	$A\cup B=\{x \ \vert \ x\in A \text{ or } x\in B\}$
$\bullet$	intersection	$A\cap B=\{x \ \vert \ x\in A \text{ and } x\in B\}$
$\bullet$	set difference	$A\setminus B=\{x \ \vert \ x\in A \text{ and } x\not\in B\}$

Example 1.13. What are $E\cup O, E\cap O, {\mathbb{Z}}\setminus E, {\mathbb{Z}}\setminus O$ ?

$\begin{matrix} E\cup O ={\mathbb{Z}} & E\cap O=\emptyset & {\mathbb{Z}}\setminus E=O &{\mathbb{Z}}\setminus O=E. \end{matrix}$
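The answers in Example 1.13 can be illustrated with Python's built-in `set` type. This is only a sketch: a computer cannot hold the infinite sets ${\mathbb{Z}}$ , $E$ and $O$ , so the code below works with a finite window of integers (the range from $-10$ to $10$ is an arbitrary choice made for illustration).

```python
# A finite window standing in for the (infinite) set of integers.
Z = set(range(-10, 11))
E = {a for a in Z if a % 2 == 0}   # even integers in the window
O = {a for a in Z if a % 2 != 0}   # odd integers in the window

print(E | O == Z)       # union:        E ∪ O = Z
print(E & O == set())   # intersection: E ∩ O = ∅
print(Z - E == O)       # difference:   Z \ E = O
print(Z - O == E)       # difference:   Z \ O = E
```

The set comprehensions mirror set-builder notation: `{a for a in Z if a % 2 == 0}` reads “the set of all `a` in `Z` such that `a` is even”.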

Example 1.14. Let $a$ and $b$ be integers and let $C=\{x\in{\mathbb{Z}} \mid x \text{ divides both } a \text{ and } b\}$ .

Can $C$ be the empty set? No. Since $1$ divides any integer, $1\in C$ no matter what $a$ and $b$ are. So $C$ is not the empty set.
Can $C$ be a set with only one element? No. Since $-1$ also divides any integer, $C$ has at least two elements ( $1$ and $-1$ ) no matter what $a$ and $b$ are. So $C$ is not a set with only one element.
Can $C={\mathbb{Z}}$ ? Yes, when $a=b=0$ every integer divides both $a$ and $b$ .

1.4 Logic

We can form compound statements as listed below:

$P$ and $Q$ is a true statement whenever both $P$ and $Q$ are true statements
$P$ or $Q$ is a true statement whenever either one of $P$ or $Q$ (or both) are true statements
Not $P$ is a true statement whenever $P$ is a false statement.
If $P$ then $Q$ (written $P\rightarrow Q$ ) is true whenever $P$ is false or $Q$ is true.

Example 1.15. Which of the following statements are true?

“If pigs could fly then they would be purple.” True since P=“pigs could fly" is false.
“If $12$ is even, then $5$ divides $12$ .” False since P=“12 is even" is true but Q=“ $5\mid 12$ " is false.
“For an integer $n$ , if $n$ is even then $-2$ divides $n$ ”. True. When P=“ $n$ is even" is true then Q=“ $-2$ divides $n$ " is true.

The following negation rules are useful. Note that negation takes and to or and vice-versa.

De Morgan’s Laws for negation:

not(P and Q)=(not P) or (not Q)
not(P or Q)=(not P) and (not Q)

We summarize the rules of logic above in truth tables where $T$ stands for true and $F$ stands for false.

P	Q	P and Q	P or Q	not P	$P\rightarrow Q$	(not P) $\rightarrow$ Q	(not Q) $\rightarrow$ (not P)
T	T	T	T	F	T	T	T
T	F	F	T	F	F	T	F
F	T	F	T	T	T	T	T
F	F	F	F	T	T	F	T

Two statements are logically equivalent if they have the same truth value for all possible truth values of the inputs $P, Q$ . If statements $S$ and $T$ are logically equivalent and we want to prove $S$ is true, we can prove $T$ is true instead!

Remark 1.16. We see from the table that the following pairs of statements are logically equivalent since they have the same truth values (the columns that correspond to the two statements have the same entries):

“If $P$ then $Q$ " is equivalent to “If (not Q) then (not P)".
“ $P$ or $Q$ " is equivalent to “(not $P)\rightarrow Q$ ".
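The equivalences in Remark 1.16 and De Morgan's Laws can be confirmed by running through all four rows of the truth table. The sketch below is one way to do this in Python; the helper name `implies` is introduced here for illustration and encodes the rule that “If $P$ then $Q$ ” is true whenever $P$ is false or $Q$ is true.

```python
from itertools import product


def implies(p: bool, q: bool) -> bool:
    """Truth value of 'if p then q': true whenever p is false or q is true."""
    return (not p) or q


# The four rows of the truth table: (T,T), (T,F), (F,T), (F,F).
rows = list(product([True, False], repeat=2))

# "If P then Q" has the same column as "If (not Q) then (not P)".
contrapositive_ok = all(implies(p, q) == implies(not q, not p) for p, q in rows)

# "P or Q" has the same column as "(not P) -> Q".
or_ok = all((p or q) == implies(not p, q) for p, q in rows)

# De Morgan's Laws for negation.
de_morgan_ok = all(
    (not (p and q)) == ((not p) or (not q))
    and (not (p or q)) == ((not p) and (not q))
    for p, q in rows
)

print(contrapositive_ok, or_ok, de_morgan_ok)
```

Because there are only finitely many rows, checking every row is a complete verification of each equivalence.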

Contrapositives and converses

Definition 1.17. The contrapositive of the if-then statement “If $P$ then $Q$ ” is “If not $Q$ then not $P$ ”.

By the first bullet point in Remark 1.16 the contrapositive is logically equivalent to the original statement. Sometimes when proving an if-then statement, it works a bit better to give a proof of the contrapositive. If you can prove the contrapositive is true, it follows automatically that the original statement is also true because of logical equivalence.

Example 1.18. Consider the statement “For an integer $a$ , if $a^2$ is odd, then $a$ is odd."

If we want to prove this statement we can instead prove the contrapositive “For an integer $a$ , if $a$ is even, then $a^2$ is even." This latter statement is easier to prove.

Definition 1.19. The converse of the if-then statement “If $P$ then $Q$ ” is “If $Q$ then $P$ ”.

Unlike the contrapositive, the converse of an if-then statement is not logically equivalent to the statement itself.

Example 1.20. Consider the statement: “If Math 310 is in session, then it is daytime." This statement is true. The converse is the statement “If it is daytime, then Math 310 is in session." This statement is false (there is no Math 310 in session at 8am). We see thus that the original statement is not logically equivalent to its converse since they don’t have the same truth value.

1.4.1 Quantifiers

In mathematics we use two quantifiers

the universal quantifier “for every” (or “for any" or “for all"), in symbols $\forall$ and
the existential quantifier “there exists”, in symbols $\exists$ .

An important difference

To prove an existentially quantified statement “There exists $x$ such that $P$ holds." it suffices to give an example of an $x$ for which $P$ holds.
To disprove a universally quantified statement “For all $x$ , $P$ holds." it suffices to give a counterexample of an $x$ for which $P$ is false.

quantifier	prove	disprove
$\forall$	proof	counterexample
$\exists$	example	proof

Example 1.21. Consider the following statement: “The sum of two odd integers is odd." Note that the statement involves implicit universal quantifiers. It implicitly states: “For all odd integers $a$ and $b$ the sum $a+b$ is odd." According to the table above, to disprove this statement it suffices to give a counterexample: $a=1$ and $b=1$ are two odd integers such that $a+b=2$ is even.
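Searching for a counterexample is something a computer can help with. The sketch below looks through a small, arbitrarily chosen window of odd integers for pairs whose sum fails to be odd. Keep in mind the asymmetry in the table above: finding one counterexample disproves a universal statement, but finding none in a finite window would prove nothing.

```python
from itertools import product

# Odd integers in a small window (the window size is an arbitrary choice).
odds = [a for a in range(-5, 6) if a % 2 != 0]

# Pairs (a, b) of odd integers whose sum is even, i.e. counterexamples to
# "for all odd integers a and b, the sum a + b is odd".
counterexamples = [
    (a, b) for a, b in product(odds, repeat=2) if (a + b) % 2 == 0
]

print((1, 1) in counterexamples)
print(len(counterexamples), "counterexamples found in the window")
```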

Negating statements containing quantifiers:

the negation of “For every $x$ , $P$ holds” is “There exists $x$ such that not $P$ holds”.
the negation of “There exists $x$ such that $P$ holds” is “For every $x$ , not $P$ holds”.

Example 1.22. Negate the following statements:

The sum of two odd integers is odd.

Negation: There exist odd integers $a$ and $b$ such that $a+b$ is even.
There exists an odd integer $a$ so that $a^2$ is even.

Negation: for every odd integer $a$ , $a^2$ is odd.

1.5 The division algorithm

First we look at a special property of the non-negative integers.

Axiom 1.23 (The Well-Ordering Principle). Every non-empty set of non-negative integers has a smallest element.

Understanding the Well-Ordering Principle.

Example 1.24. For example, the set $\{3, 7, 9, 21\}$ is a non-empty set of non-negative integers, and so the Well-Ordering Principle tells us that it has a smallest element. In this example, it’s evident: $3$ .

This also holds for infinite sets: Let $S$ be the set of all strictly positive multiples of $7$ . It has a smallest element (namely, $7$ ). The principle also applies even when it is not obvious what the smallest element is. For instance, consider the height measured in inches (and rounded down to the nearest whole number) of all giraffes on planet Earth. That is, let $S = \{m \mid \text{there exists a giraffe of height $m$ inches (rounded down)}\}.$ This set must have a smallest number (there is a shortest giraffe, poor guy!).

Remark 1.25. The word “non-empty” is crucial. The statement “Every set of non-negative integers has a smallest element” is false: the empty set is a subset of every set, so in particular it is a set of non-negative integers, yet it has no smallest element since it doesn’t have any elements at all.

Remark 1.26. The word “non-negative” is also important. For instance, the set of all negative multiples of $7$ , namely $\{-7, -14,-21, \dots \}$ is a non-empty set of integers, but it has no smallest element.

Let us now discuss the so-called “division algorithm”:

Theorem 1.27 (The Division Algorithm Theorem). Let $n, d \in \mathbb Z$ with $d>0$ . There exist unique $q, r \in \mathbb Z$ such that $n = q d + r \quad {\text{ and }} \quad 0 \leqslant r < d.$

Example 1.28. If we take $n = 25$ and $d = 4$ in the statement of the Division Algorithm, then the values of $q$ and $r$ that make the conclusion hold true are $q = 6$ and $r = 1$ : $25 = 6 \cdot 4 + 1$ and $0 \leqslant 1 < 4$ .

Let us check that $6$ and $1$ are the only values that work: Suppose $25 = q \cdot 4 + r$ and $0 \leqslant r < 4$ . So $r$ is one of $0,1,2$ , or $3$ . If $r = 0$ , then $25 = q \cdot 4$ , but $4$ does not divide $25$ , and so $r = 0$ is not possible. If $r=2$ , then $25 = q \cdot 4 + 2$ and thus $23 = q \cdot 4$ . But $4$ does not divide $23$ , and so $r = 2$ is not possible. If $r = 3$ , then $25 = q \cdot 4 + 3$ and thus $22 = q \cdot 4$ . But $4$ does not divide $22$ , and so $r = 3$ is not possible. We thus must have that $r = 1$ . Then $25 = q \cdot 4 + 1$ and hence $24 = q \cdot 4$ , so that $q$ must be $6$ .

Example 1.29. If we take $n = -25$ and $d = 4$ in the statement of the Division Algorithm, then the values of $q$ and $r$ that make the conclusion hold true are $q = -7$ and $r = 3$ : $-25 = (-7) \cdot 4 + 3$ and $0 \leqslant 3 < 4$ .
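Examples 1.28 and 1.29 can be reproduced in Python. For a positive divisor, Python's built-in `divmod` rounds the quotient down, so the remainder it returns lies in the range $0 \leqslant r < d$ required by the theorem. The wrapper name `division_algorithm` below is made up for illustration; it is a sketch that simply refuses non-positive $d$ , matching the hypothesis $d > 0$ .

```python
def division_algorithm(n: int, d: int) -> tuple[int, int]:
    """Return (q, r) with n = q*d + r and 0 <= r < d, assuming d > 0."""
    if d <= 0:
        raise ValueError("the Division Algorithm Theorem requires d > 0")
    # For d > 0, divmod uses floor division, so 0 <= r < d holds.
    q, r = divmod(n, d)
    return q, r


print(division_algorithm(25, 4))    # (6, 1)
print(division_algorithm(-25, 4))   # (-7, 3)
```

Note that many other programming languages round the quotient toward zero instead, which would give a negative remainder for $n = -25$ ; the behavior described here is specific to Python.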

Remark 1.30. If we were to allow $d$ to be negative, the conclusion of the Division Algorithm Theorem would fail to be true: note that it is impossible for any integer $r$ to satisfy $0 \leqslant r < d$ when $d < 0$ , so no such integers $q$ and $r$ would exist in this case.

Remark 1.31. If we were to drop the phrase “ $0 \leqslant r < d$ " from the Division Algorithm Theorem, it would also become false, since the uniqueness portion would fail: For instance, if $n = 25$ and $d = 4$ , then $25 = 6 \cdot 4 + 1$ but also $25 = 5 \cdot 4 + 5$ , so without the clause “ $0 \leqslant r < d$ ", there would not be a unique pair of integers $q$ and $r$ that satisfy the requirements.

The Division Algorithm Theorem has two parts: an existence statement and a uniqueness statement. Existence says that there is at least one way of writing $n = q d + r$ with $0 \leqslant r < d$ . The uniqueness part says there is at most one way of doing so. In other words, the uniqueness clause asserts that if we have $n = q d + r$ with $0 \leqslant r < d$ and we also have $n = q' d + r'$ with $0 \leqslant r' < d$ , then it must be that $q = q'$ and $r = r'$ .

Our next goal is to prove the existence portion of the division algorithm. Let’s recall what this says exactly:

Theorem 1.32 (Existence portion of the Division Algorithm).

Let $n, d$ be integers with $d > 0$ . There exist integers $q$ and $r$ such that $n = qd + r$ and $0 \leqslant r < d$ .

Before giving the formal proof, let’s think intuitively for a bit. If we want to divide $37$ by $5$ , for instance, leaving a remainder $r$ in the range $0 \leqslant r < 5$ , we could proceed by subtracting off $5$ ’s from $37$ repeatedly until it is no longer possible to do so without going into negative territory. That is, the remainder is what’s left over after we subtract off as many $5$ ’s as we can from $37$ ; in this example the remainder is $2$ .

In general, given $n$ and $d > 0$ , we want to subtract from $n$ as many copies of $d$ as possible. (Things are more complicated when $n$ is negative, but don’t worry about this for now.) This motivates the definition of the set $\mathcal S := \{n - d x\,\, | \,\, x \in \mathbb Z\,\,\, {\text{and}} \,\,\, n-dx \geqslant 0\}.$ So, an element $y$ belongs to $\mathcal S$ if (1) $y$ is a non-negative integer and (2) $y$ is the result of subtracting from $n$ some number of copies of $d$ .

For instance if $n = 17$ and $d = 5$ then $\mathcal S = \{2, 7, 12, 17, \dots\}$ . Note that in this example the smallest element of $\mathcal S$ is $2$ , which is indeed the remainder of division of $17$ by $5$ . We can thus “find” the remainder by seeking out the smallest element of $\mathcal S$ , and this turns out to work in general. Moreover, once we know what the remainder $r$ is, the value of $q$ is determined by the equation $n = qd + r$ .
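The idea of finding the remainder as the smallest element of $\mathcal S$ can be tried out directly. The sketch below builds a finite piece of $\mathcal S$ by letting $x$ range over a window of integers and then takes the minimum. The function name `smallest_of_S` and the window size $|n| + 1$ are choices made for this illustration; the window is large enough because the minimizing $x$ never exceeds $|n|$ in absolute value when $d \geqslant 1$ .

```python
def smallest_of_S(n: int, d: int) -> tuple[list[int], int]:
    """Return a finite piece of S = {n - d*x : x in Z, n - d*x >= 0}
    (sorted increasingly) together with its smallest element."""
    if d <= 0:
        raise ValueError("d must be positive")
    bound = abs(n) + 1  # window for x; wide enough to contain the minimum
    piece = sorted(
        n - d * x for x in range(-bound, bound + 1) if n - d * x >= 0
    )
    return piece, piece[0]


piece, r = smallest_of_S(17, 5)
print(piece[:4], r)   # [2, 7, 12, 17] 2
```

The same function also handles negative $n$ : for $n = -25$ and $d = 4$ the smallest element is $3$ , in agreement with Example 1.29.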

Let us now give a formal proof:

Proof of Theorem 1.32. Let $n$ and $d$ be integers with $d > 0$ . Define a set of non-negative integers as follows: $\mathcal S := \{n - d x\,\, | \,\, x \in \mathbb Z\,\,\, {\text{and}} \,\,\, n-dx \geqslant 0\}.$ By construction, $\mathcal S$ consists of only non-negative integers. We would like to apply the Well-Ordering Principle, but to do so we must check that $\mathcal S$ is not empty.

Let us prove $\mathcal S$ is non-empty by considering two cases. If $n \geqslant 0$ , then taking $x = -n$ in the formula above gives $n - d \cdot (-n) = n(d+1)$ , which belongs to $\mathcal S$ (since it is non-negative by assumption). If $n < 0$ , then taking $x = n$ in the formula above gives $n - d \cdot n = n(1-d)$ . Since $d \geqslant 1$ , we have $1-d \leqslant 0$ , and since $n < 0$ , it follows that $n(1-d) \geqslant 0$ . Thus $n- d \cdot n$ is an element of $\mathcal S$ . In either case, we have shown $\mathcal S$ is not the empty set.

Since $\mathcal S$ is a non-empty set of non-negative integers, by the Well-Ordering Principle, $\mathcal S$ has a smallest element, which we will call $r$ . (Our aim is to prove that this is the $r$ that “works” to give the conclusion.)

By definition of $\mathcal S$ , we have that $r \geqslant 0$ and that $r = n - qd$ for some integer $q$ . Let us now show that $r < d$ also. Suppose this is not the case; that is, suppose $r \geqslant d$ . Since $r = n - dq$ , it follows that $r-d = n - dq -d = n-d(q+1)$ . Since $r \geqslant d$ , we have $r-d \geqslant 0$ . This shows that $r - d \in \mathcal S$ . But $r -d < r$ since $d > 0$ , which contradicts the fact that $r$ is the smallest element of $\mathcal S$ . We conclude that it must be the case that $r < d$ .

We have shown that there are integers $r$ and $q$ such that $r = n - qd$ and $0 \leqslant r < d$ . Rewriting the first equation slightly, we have shown that there are integers $r$ and $q$ such that $n= r + qd$ and $0 \leqslant r < d$ . ◻

Let us prove the “uniqueness” portion of the Division Algorithm. Here is what it says:

Theorem 1.33. Let $n, d$ be integers with $d > 0$ . Suppose that $n = qd+r$ with $0 \leqslant r < d$ and $n = q'd + r'$ with $0 \leqslant r' < d$ . Then $r = r'$ and $q = q'$ .

Proof. We first show that $d$ divides $(r-r')$ : Since $n = qd+r$ and $n = q'd + r'$ , we have $qd + r = q'd + r'$ and hence $r-r' = q'd - qd = (q'-q)d$ . Since $q' -q$ is an integer, this shows that $d$ divides $r - r'$ .

We next show that $-d < r-r' < d$ : Since $r' < d$ , we have $-d < -r'$ , and since $0 \leqslant r$ , we have $-r' \leqslant -r' + r = r - r'$ . So $-d < r - r'$ . Similarly, since $r' \geqslant 0$ we have $-r' \leqslant 0$ and since $r < d$ , it follows that $r - r' \leqslant r < d$ , whence $r- r' < d$ . Putting these together gives $-d < r-r' < d.$

Now, the only integer $x$ such that $-d < x < d$ and $d$ divides $x$ is $0$ . Since we have shown that $r - r'$ has these two properties, it follows that $r-r' = 0$ ; that is, $r = r'$ .

Finally, since $q'd + r' = qd + r$ and $r = r'$ , we conclude that $0 = q'd - qd = (q'-q)d$ . Since $d \ne 0$ , it follows that $q'-q = 0$ and thus $q = q'$ . ◻
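Uniqueness can be observed experimentally by listing every pair $(q, r)$ that satisfies both requirements. The sketch below tries all $q$ in a finite window (the window size of $50$ is an arbitrary assumption that is more than enough for the small examples shown) and keeps those for which $r = n - qd$ lands in the range $0 \leqslant r < d$ . The name `representations` is introduced only for this illustration.

```python
def representations(n: int, d: int, window: int = 50) -> list[tuple[int, int]]:
    """All pairs (q, r) with n = q*d + r and 0 <= r < d, for q in a window."""
    return [
        (q, n - q * d)
        for q in range(-window, window + 1)
        if 0 <= n - q * d < d
    ]


print(representations(25, 4))    # [(6, 1)]
print(representations(-25, 4))   # [(-7, 3)]
```

In each case exactly one pair survives, as Theorem 1.33 predicts. Such an experiment only covers the values tried; the proof above is what guarantees uniqueness for all $n$ and all $d > 0$ .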

1.6 Greatest Common Divisor

Definition 1.34. Let $a$ be an integer. A divisor of $a$ is an integer $d$ so that $d$ divides $a$ .

Definition 1.35. Let $a$ and $b$ be two integers, not both of which are $0$ . The greatest common divisor or GCD of $a,b$ is the largest integer $d$ such that $d$ divides $a$ and $d$ divides $b$ . We often write $\gcd(a,b)$ for the GCD of $a$ and $b$ .

More formally by definition $\gcd(a,b)$ has the following properties:

$\gcd(a,b)$ divides both $a$ and $b$ . (“common divisor" property)
If $n$ is an integer that divides both $a$ and $b$ , $\gcd(a,b) \geqslant n$ . (“greatest" property)

According to the above definition if you want to prove that an integer $d$ is the GCD of $a$ and $b$ you must show

$d$ divides both $a$ and $b$ . (“common divisor" property)
If $n$ is an integer that divides both $a$ and $b$ , $d \geqslant n$ . (“greatest" property)

Example 1.36. For example, let us find $\gcd(18, 24)$ by listing all the common factors of $18$ and $24$ and finding the largest one: The factors of $18$ are $\pm 1, \pm 2, \pm 3, \pm 6, \pm 9, \pm 18$ . The factors of $24$ are $\pm 1, \pm 2, \pm 3, \pm 4, \pm 6, \pm 8, \pm 12, \pm 24.$ So the common factors are $\pm 1, \pm 2, \pm 3, \pm 6$ . The largest of these numbers is $6$ ; whence $\gcd(18, 24) = 6$ .
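The listing method of Example 1.36 translates directly into a brute-force computation. The following sketch follows Definition 1.35 literally: collect the common divisors and take the largest. The function names are made up for illustration, and the search range assumes that a common divisor of $a$ and $b$ (not both $0$ ) is at most $\max(|a|, |b|)$ in absolute value.

```python
def common_divisors(a: int, b: int) -> list[int]:
    """Common divisors of a and b, assuming a and b are not both 0."""
    bound = max(abs(a), abs(b))  # assumption: no common divisor is larger
    return [
        d
        for d in range(-bound, bound + 1)
        if d != 0 and a % d == 0 and b % d == 0
    ]


def gcd_by_listing(a: int, b: int) -> int:
    """The largest common divisor, found by brute force."""
    return max(common_divisors(a, b))


print(common_divisors(18, 24))   # [-6, -3, -2, -1, 1, 2, 3, 6]
print(gcd_by_listing(18, 24))    # 6
```

This approach is slow for large inputs because it tests every candidate divisor.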

Example 1.37.

For $a>0$ , what is $\gcd(a, a)$ ? Since $a$ divides itself and nothing bigger than $a$ does, $\gcd(a,a)= a$ . For $a<0$ , what is $\gcd(a, a)$ ? When $a<0$ , $-a$ is the largest divisor of $a$ so $\gcd(a,a)= -a$ .
What is $\gcd(a, 7a)$ when $a>0$ ? Since $a$ divides both $a$ and $7a$ and nothing bigger than $a$ divides $a$ , we have $\gcd(a, 7a) = a$ . What is $\gcd(a, 7a)$ when $a<0$ ? Since $-a$ divides both $a$ and $7a$ and nothing bigger than $-a$ divides $a$ , we have $\gcd(a, 7a) = -a$ .
What is $\gcd(a,0)$ for $a \neq 0$ ? Every integer divides $0$ . So $\gcd(a,0)$ is the largest divisor of $a$ . For $a>0$ , the largest divisor of $a$ is $a$ , but if $a < 0$ , then the largest divisor of $a$ is $-a$ . So $\gcd(a,0) = |a|$ .

Lemma 1.38. If $a, b \in \mathbb{Z}$ with $b \neq 0$ , then $\gcd(a,b)=\gcd(a,-b)$ .

Proof. Suppose $a, b \in \mathbb{Z}$ with $b \neq 0$ . We first show that $d$ divides $b$ if and only if $d$ divides $-b$ . If $d\mid b$ then there exists $n\in {\mathbb{Z}}$ so that $b=dn$ and thus $-b=d(-n)$ shows that $d\mid -b$ . Conversely if $d\mid -b$ then there exists $k\in {\mathbb{Z}}$ so that $-b=dk$ and thus $b=d(-k)$ shows that $d\mid b$ .

So, the set of common divisors of $a$ and $b$ is identical to the set of common divisors of $a$ and $-b$ . By definition, $\gcd(a,b)$ is the largest element of the first set and $\gcd(a,-b)$ is the largest element of the second set. Since these sets are identical, $\gcd(a,b) = \gcd(a,-b)$ . ◻

Theorem 1.39. Let $a$ and $b$ be integers with $b > 0$ , and suppose $q, r$ are integers which satisfy $a = qb+r$ . Then $\gcd(a,b) = \gcd(b,r)$ .

Proof. Suppose $a$ and $b$ are integers with $b > 0$ . First, suppose that $d$ is an integer that divides both $a$ and $b$ . Then we can find integers $n$ and $m$ such that $a = dn$ and $b = dm$ , and $dn = a = qb + r = qdm + r, \textrm{ so } r = dn-qdm = d(n-qm).$ Since $n,q,m$ are integers, $n-qm\in{\mathbb{Z}}$ by closure of the integers under addition and multiplication. Therefore, $d$ divides $r$ , and by assumption $d$ also divides $b$ . Now since $\gcd(b,r)$ is the largest integer that divides both $b$ and $r$ , we must have $d \leqslant \gcd(b,r)$ by the “greatest" property in Definition 1.35. In particular, this applies to the special case when $d = \gcd(a,b)$ , since that is an integer that divides both $a$ and $b$ . We conclude that $\gcd(a,b) \leqslant \gcd(b,r)$ .

Now we will show that $\gcd(a,b) \geqslant \gcd(b,r)$ . To do that, suppose now that $d$ is an integer that divides both $b$ and $r$ . This means we can find integers $m$ and $n$ such that $b=dm$ and $r = dn$ , so $a = qb+r = qdm + dn = d(qm+n).$ Again since $n,q,m$ are integers, $qm+n\in{\mathbb{Z}}$ by closure of the integers under addition and multiplication. This shows that $d$ must also divide $a$ . Since by assumption $d$ also divides $b$ , we must have $d \leqslant \gcd(a,b)$ by the “greatest" property in Definition 1.35. This applies in particular to the case when $d = \gcd(b,r)$ , so we conclude that $\gcd(b,r) \leqslant \gcd(a,b)$ .

We have shown that $\gcd(a,b) \leqslant \gcd(b,r)$ and $\gcd(b,r) \leqslant \gcd(a,b)$ , and hence $\gcd(a,b) = \gcd(b,r)$ . ◻
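Notice that Theorem 1.39 does not require $0 \leqslant r < b$ : any integers $q$ and $r$ with $a = qb + r$ will do. The sketch below tests this numerically using `math.gcd` from the Python standard library, which returns the non-negative greatest common divisor. The function name `theorem_holds` and the sample values are chosen only for illustration.

```python
from math import gcd


def theorem_holds(a: int, b: int, q: int) -> bool:
    """Check gcd(a, b) == gcd(b, r) where r = a - q*b, assuming b > 0."""
    r = a - q * b  # with this r, the equation a = q*b + r holds for any q
    return gcd(a, b) == gcd(b, r)


# Try several values of q, including ones that make r negative or large.
print(all(theorem_holds(524, 148, q) for q in range(-5, 6)))
```

The check passes even for values of $q$ that make $r$ negative, which matches the generality of the statement. In practice one chooses $q$ and $r$ from the Division Algorithm so that $r$ is smaller than $b$ .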

1.7 The Euclidean Algorithm

Theorem 1.39 is used in practice to find, in an efficient manner, the gcd of two integers. Let’s look at an example:

Example 1.40 (Euclidean algorithm). Suppose we want to find the gcd of $524$ and $148$ . Consider the following calculations: $\begin{eqnarray*} 524 = 148 \cdot 3 + 80 \\ 148 = 80 \cdot 1 + 68 \\ 80 =68 \cdot 1 + 12 \\ 68 = 12 \cdot 5 + 8 \\ 12 = 8 \cdot 1 + 4 \\ 8 = 4 \cdot 2 + 0 \end{eqnarray*}$ In each step, we are applying the Division Algorithm, starting by dividing $524$ by $148$ to leave a remainder of $80$ , then dividing $148$ by $80$ , leaving a remainder of $68$ , and so on. In general, if in one step we are dividing $x$ by $y$ leaving a remainder of $z$ , in the next step we are dividing $y$ by $z$ , leaving a new remainder.

By Theorem 1.39, if $a = bq + r$ , then $\gcd(a,b) = \gcd(b,r)$ . Applying Theorem 1.39 to the first equation above shows that $\gcd (524,148)= \gcd(148,80)$ . Applying Theorem 1.39 to the second equation shows that $\gcd(148,80) =\gcd (80,68)$ , and continuing to apply Theorem 1.39 to the remaining equations we obtain $\gcd (524,148)= \gcd(148,80)=\gcd (80,68) = \gcd(68, 12) =\gcd(12, 8) = \gcd(8,4) = \gcd(4,0).$ Since $\gcd(4,0) = 4$ by Example 1.37, the equalities above give $\gcd(524, 148) = 4$ .
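The chain of divisions in Example 1.40 is easy to automate. The sketch below records each step as a tuple $(x, y, q, z)$ meaning $x = y \cdot q + z$ , and then divides $y$ by $z$ in the next step, exactly as described above. The function name `euclid_steps` is made up for illustration and the code assumes $b > 0$ .

```python
def euclid_steps(a: int, b: int) -> list[tuple[int, int, int, int]]:
    """Return the division steps (x, y, q, z), each with x = y*q + z,
    obtained by repeatedly applying the Division Algorithm (b > 0)."""
    steps = []
    x, y = a, b
    while y != 0:
        q, z = divmod(x, y)      # x = y*q + z with 0 <= z < y
        steps.append((x, y, q, z))
        x, y = y, z              # next, divide y by the remainder z
    return steps


steps = euclid_steps(524, 148)
for x, y, q, z in steps:
    print(f"{x} = {y} * {q} + {z}")

# The divisor in the final step is the last non-zero remainder.
print("gcd =", steps[-1][1])
```

Running this prints the six equations of Example 1.40 followed by `gcd = 4`.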

The Extended Euclidean Algorithm

In addition to finding the gcd of two integers $a$ and $b$ , the Euclidean algorithm can also be used to find integers $x$ and $y$ such that $\gcd(a, b) = xa + yb$ . This is sometimes called the “Extended Euclidean Algorithm”.

Definition 1.41. An expression of the form $xa + yb$ , where $x$ and $y$ are integers, is known as an integer linear combination of $a$ and $b$ .

With this terminology, the extended Euclidean algorithm gives a method of expressing $\gcd(a,b)$ as an integer linear combination of $a$ and $b$ . Let’s see how this works in the example above:

Example 1.42 (Extended Euclidean algorithm). Our next goal is to express $\gcd(524,148)=4$ as a linear combination of $524$ and $148$ . We can do so re-using the work from the Euclidean Algorithm as follows:

$\begin{eqnarray*} 524 = 148 \cdot 3 + 80 \Leftrightarrow 80 = 1 \cdot 524 - 3 \cdot 148 \\ 68 = 1 \cdot 148 - 1 \cdot 80 = 1 \cdot 148 - (1 \cdot 524 - 3 \cdot 148) \cdot 1 = 4 \cdot 148 - 1 \cdot 524\\ 12 = 1 \cdot 80 - 1 \cdot 68 = (1 \cdot 524 - 3 \cdot 148) - 1 \cdot (4 \cdot 148 - 1 \cdot 524) = - 7 \cdot 148 + 2 \cdot 524 \\ 8 = 68 - 5 \cdot 12 = ( 4 \cdot 148 - 1 \cdot 524 ) - 5 \cdot (- 7 \cdot 148 + 2 \cdot 524) = -11 \cdot 524 + 39 \cdot 148\\ 4 = 1 \cdot 12 - 1 \cdot 8 = (- 7 \cdot 148 + 2 \cdot 524) - (-11 \cdot 524 + 39 \cdot 148) = 13 \cdot 524 - 46 \cdot 148 \end{eqnarray*}$
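The back-substitution in Example 1.42 can also be carried out by a program. The sketch below is one common iterative formulation, not necessarily the bookkeeping used above: alongside each remainder it tracks coefficients $x$ and $y$ that express that remainder as $x \cdot a + y \cdot b$ . The function name `extended_gcd` is introduced for illustration and the code assumes $b > 0$ .

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, x, y) with g = gcd(a, b) = x*a + y*b, assuming b > 0."""
    # Invariants: old_r = old_x*a + old_y*b  and  r = x*a + y*b.
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        # Replace (old_r, r) by (r, old_r - q*r), and update the
        # coefficients the same way so the invariants keep holding.
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


g, x, y = extended_gcd(524, 148)
print(g, x, y)                  # 4 13 -46
print(x * 524 + y * 148 == g)   # True
```

For $524$ and $148$ this reproduces the combination $4 = 13 \cdot 524 - 46 \cdot 148$ found by hand. The coefficients are not unique, so a different method could legitimately return a different pair $x, y$ .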

Here are formal statements of the Euclidean Algorithm and its extended version.

Theorem 1.43 (The Euclidean Algorithm). Let $a$ and $b$ be integers such that $b > 0$ . Define a list of numbers $r_1, r_2 \dots, r_m$ such that $b > r_1 > r_2 > \cdots > r_m \textrm{ and } r_m = 0$ as follows:

define $r_1$ to be the unique integer in the Division Algorithm such that $a = b \cdot q_1 + r_1$ for some integer $q_1$ and $0 \leqslant r_1 < b$ . (If $r_1 = 0$ , then the process stops at $m = 1$ .)
If $r_1 > 0$ , define $r_2$ to be the unique integer in the Division Algorithm such that $b = r_1 \cdot q_2 + r_2$ for some integer $q_2$ and $0 \leqslant r_2 < r_1$ . (If $r_2 = 0$ , then the process stops at $m = 2$ .)
If $r_2 > 0$ , define $r_3$ to be the unique integer in the Division Algorithm such that $r_1 = r_2 \cdot q_3 + r_3$ for some integer $q_3$ and $0 \leqslant r_3 < r_2$ . (If $r_3 = 0$ , then the process stops and $m = 3$ .)
If $r_3 > 0$ , define $r_4$ to be the unique integer in the Division Algorithm such that $r_2 = r_3 \cdot q_4 + r_4$ for some integer $q_4$ and $0 \leqslant r_4 < r_3$ . (If $r_4 = 0$ , then the process stops and $m = 4$ .)
Continue in this fashion until one arrives at an integer $r_m$ that is $0$ . More precisely, $r_{m-2} = r_{m-1} \cdot q_m + 0$ .

Then $\displaystyle \gcd(a,b) = r_{m-1}$ .

Proof. Claim 1: the algorithm terminates in finitely many steps.

The inequalities from the Division Theorem can be combined as follows: $b > r_1 > r_2 > r_3 > \cdots \geq 0.$ Since there are finitely many integers between $0$ and $b$ which can be the successive remainders in the algorithm, the process described in the Theorem cannot continue forever; that is, $r_m = 0$ for some $m$ . In fact, notice that the process must stop after at most $b$ steps.

Next we want to show

Claim 2: $\gcd(a,b)= r_{m-1}$ .

Since $a = bq_1 + r_1$ , by Theorem 1.39 we have $\gcd(a,b) = \gcd(b, r_1)$ . Moreover, since $b = r_1 q_2 + r_2$ , Theorem 1.39 also implies that $\gcd(b, r_1)= \gcd(r_1, r_2)$ . Since $r_1 = r_2 q_3 + r_3$ , by Theorem 1.39 we must also have $\gcd(r_1, r_2)= \gcd(r_2, r_3)$ . Continuing in this manner, applying Theorem 1.39 for each step of the algorithm, we obtain the list of equations $\gcd(a,b) = \gcd(b, r_1)= \gcd(r_1, r_2) = \gcd(r_2, r_3) = \cdots = \gcd(r_{m-2}, r_{m-1}) = \gcd(r_{m-1}, r_m).$ Recall that $r_m = 0$ , and so from the equality of the first and last quantities above we have $\gcd(a,b) = \gcd(r_{m-1}, 0)$ . Since $\gcd(x, 0) = |x|$ for any non-zero integer $x$ and $r_{m-1} > 0$ , we have $\gcd(r_{m-1},0) = r_{m-1}$ . This proves that $\gcd(a,b) = r_{m-1}$ . ◻
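The list of remainders in Theorem 1.43 is easy to compute. Here is a short Python sketch, assuming $b > 0$; the names `euclid_remainders` and `gcd_via_euclid` are made up for this illustration. It relies on the fact that Python's `%` operator returns a remainder $r$ with $0 \leqslant r < b$ whenever $b > 0$, which matches the Division Algorithm.

```python
def euclid_remainders(a, b):
    """Return the list [r_1, ..., r_m] from the Euclidean Algorithm, assuming b > 0."""
    remainders = []
    prev, cur = a, b
    while True:
        r = prev % cur  # for cur > 0 this satisfies 0 <= r < cur
        remainders.append(r)
        if r == 0:
            return remainders
        prev, cur = cur, r


def gcd_via_euclid(a, b):
    rs = euclid_remainders(a, b)
    # gcd(a, b) = r_{m-1}; when m = 1 the list has no r_{m-1} entry,
    # and in that case b divides a, so the gcd is b itself.
    return rs[-2] if len(rs) >= 2 else b


print(euclid_remainders(524, 148))  # [80, 68, 12, 8, 4, 0]
print(gcd_via_euclid(524, 148))     # 4
```

The remainders strictly decrease, as in Claim 1, so the list has at most $b$ entries.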

A consequence of the Extended Euclidean Algorithm is

Theorem 1.44. If $a,b$ are integers, $a\neq 0$ , and $b\neq 0$ , then there exist integers $x,y$ so that $\gcd(a,b)=xa+by$ .

1.8 Primes

Definition 1.45. An integer $p$ is called prime if

$p \ne 0$ , $p \ne 1$ , $p \ne -1$ and
the only divisors of $p$ are $1$ , $-1$ , $p$ and $-p$ .

Definition 1.46. An integer $n$ is composite if it is not prime.

Lemma 1.47. If $p$ is prime and $a$ is any integer, then $\gcd(p,a)$ is either $|p|$ or $1$ . In particular, if $p$ does not divide $a$ , then $\gcd(p,a) = 1$ .

Proof. Suppose $p$ is prime and $a$ is any integer. By definition, the only divisors of $p$ are $1$ , $-1$ , $p$ , and $-p$ . If $p$ divides $a$ , then so does $|p|$ , and since $|p|$ is the largest divisor of $p$ , we conclude that $\gcd(a,p) = |p|$ . If $p$ does not divide $a$ , neither does $-p$ , so $1$ is the only positive integer dividing both $a$ and $p$ , and $\gcd(p,a) = 1$ . ◻

Theorem 1.48. Let $a, b, c$ be integers. If $a$ divides $bc$ and $\gcd(a,b) = 1$ , then $a \mid c$ .

Proof. Suppose $a, b, c$ are integers, $a$ divides $bc$ and $\gcd(a,b) = 1$ . By the extended Euclidean Algorithm (Theorem 1.44) the greatest common divisor of $a$ and $b$ is always an integer linear combination of $a$ and $b$ ; specifically there exist integers $x$ and $y$ such that $1=\gcd(a,b)=ax+by$ . Therefore, multiplying by $c$ gives $axc+byc = c.$ On the other hand, our assumption that $a$ divides $bc$ says that there exists an integer $n$ such that $an=bc$ . Therefore, $\begin{align*} axc+byc & = c \\ axc + any & = c \\ a(xc+ny) & = c. \end{align*}$ Since $xc + ny$ is an integer by closure of ${\mathbb{Z}}$ under addition and multiplication, by definition, this means that $a$ divides $c$ . ◻

Corollary 1.49. Assume $a,b, p$ are integers. If $p$ is a prime and $p$ divides $ab$ , then $p$ must divide $a$ or $b$ .

Note that “or" here is not exclusive: the corollary says that one of the following three statements is true:

either $p$ divides $a$ but not $b$ ,
$p$ divides $b$ but not $a$ ,
or $p$ divides both $a$ and $b$ .

Proof. Assume $p$ is prime and $p$ divides $ab$ . We consider two cases. If $p$ divides $a$ , we are done since we already have what we wanted to prove. If $p$ does not divide $a$ , then by Lemma 1.47, $\gcd(a,p)=1$ . Since $\gcd(a,p)=1$ and we assumed that $p$ divides $ab$ , we can apply Theorem 1.48, which leads us to conclude that $p$ must divide $b$ . ◻

Corollary 1.50. Suppose $p$ is a prime integer, $a_1, a_2, \dots, a_n$ are arbitrary integers, and $p$ divides $a_1 a_2 \cdots a_n$ . Then $p$ divides at least one of $a_1, a_2, \dots,$ or $a_n$ .

Proof. The most rigorous way to prove this would be a proof by induction, a topic that has not yet been discussed in this class. We will make do with a slightly less formal proof:

Assume $p$ is prime and $p$ divides $a_1 a_2 \cdots a_n$ . Thinking of the latter product as $(a_1 \cdots a_{n-1}) \cdot a_n$ , by Corollary 1.49 we have that $p$ divides $a_1 \cdots a_{n-1}$ or $p$ divides $a_n$ . If $p$ divides $a_n$ , then we are done – we have found an $a_i$ that $p$ divides. If not, then $p$ divides $a_1 \cdots a_{n-1}$ . Now applying the Corollary again gives that $p$ divides $a_1 \cdots a_{n-2}$ or $p$ divides $a_{n-1}$ . Likewise, if $p$ divides $a_1 \cdots a_{n-2}$ , then $p$ divides $a_1 \cdots a_{n-3}$ or $p$ divides $a_{n-2}$ . Continuing in this manner, we arrive at the fact that $p$ divides $a_1$ or $p$ divides $a_2$ or $\cdots$ or $p$ divides $a_n$ . ◻

Corollary 1.51. Suppose $p$ , $q_1, q_2, \dots, q_n$ are all prime integers, and that $p$ divides $q_1 q_2 \cdots q_n$ . Then $p = \pm q_i$ for some $i \in \{1, 2, \dots, n\}$ .

Proof. By Corollary 1.50, we have that $p$ divides $q_i$ for some $i \in \{1, 2, \dots, n\}$ . Since $q_i$ is prime, its only divisors are $\pm 1$ and $\pm q_i$ . But $p \ne 1$ and $p \ne -1$ (by the definition of “prime”) and hence it must be that $p = \pm q_i$ . ◻

1.9 The Fundamental Theorem of Arithmetic

The Fundamental theorem of arithmetic says that every integer can be factored into primes in an essentially unique way. More precisely, it says the following:

Theorem 1.52 (The Fundamental Theorem of Arithmetic). Any integer $n \neq 0, 1, -1$ can be written as a product of primes. Moreover, if $p_1\cdots p_s =q_1\cdots q_t$ are two factorizations of $n$ into primes, then $s=t$ , and there exists a reordering of the $q_j$ ’s such that $q_i=\pm p_i$ for all $i$ .

Example 1.53. For example, we can factor $-24$ into a product of primes as follows $-24 = (-2) \cdot 2 \cdot 2 \cdot 3$ and also as $-24 = (-3) \cdot 2 \cdot 2 \cdot 2.$ These are “essentially the same” in the sense that if we rearrange the factors and change a couple of the signs, we can transform one factorization into the other one.

Example 1.54. For another example, $-17$ is already factored into a product of primes: it is a product of one prime.

The Fundamental Theorem consists of an “existence" and “uniqueness" statement. The existence part says that every nonzero integer can be written as a product of prime integers. More formally, for any integer $n$ such that $n \ne 0$ , there is a non-negative integer $s$ and a list of primes $p_1, \dots, p_s$ such that $n = \pm p_1 p_2 \cdots p_s$ . The uniqueness part says that this list of primes $p_1, \dots, p_s$ is unique up to signs and the order of the primes. More formally, given two factorizations of $n$ into primes, say $p_1\cdots p_s \,\,\,{\text{and} }\,\,\,q_1\cdots q_t$ , we have $s=t$ , and we can reorder the $q_j$ ’s such that $q_i=\pm p_i$ for all $i$ .
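To make “unique up to signs and the order of the primes" concrete, here is a small Python sketch. The helper name `essentially_same` is hypothetical, and the function only compares two lists of factors; it does not check that the entries are prime or that the two products agree.

```python
def essentially_same(factors_p, factors_q):
    """True if the two lists agree after reordering and changing signs."""
    return sorted(abs(p) for p in factors_p) == sorted(abs(q) for q in factors_q)


# Two factorizations of -24 into primes:
print(essentially_same([-2, 2, 2, 3], [-3, 2, 2, 2]))  # True
# Lists of different lengths can never match:
print(essentially_same([2, 3, 5], [2, 3]))             # False
```

Taking absolute values ignores the signs, and sorting ignores the order, so the comparison succeeds exactly when $s = t$ and the $q_j$ ’s can be reordered so that $q_i = \pm p_i$ for all $i$ .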

The existence portion of the Fundamental Theorem of Arithmetic states the following:

Theorem 1.55 (FTA existence). For any integer $n$ , other than $1$ , $-1$ or $0$ , there is a list of prime integers $p_1, p_2, \dots, p_s$ for $s \geqslant 1$ such that $n = p_1p_2 \cdots p_s$ .

Proof of the existence portion of the Fundamental Theorem. Suppose $n$ is an integer other than $1$ , $-1$ or $0$ .

Case 1: Let us first consider the case when $n \geqslant 2$ . Let $\mathcal S$ be the set of all integers that are greater than or equal to $2$ that are not products of primes: $\mathcal S= \{n \in {\mathbb{Z}} \mid n \geqslant 2 \text{ and $n$ is not a product of primes} \}.$ Our goal is to prove $\mathcal S$ is the empty set.

By way of contradiction, assume $\mathcal S$ is not empty. Then, by the Well-Ordering Principle, $\mathcal S$ will have a least element, call it $n$ . In other words, $n$ is an integer such that $n \geqslant 2$ and such that $n$ cannot be written as a product of primes, but such that every integer $m$ such that $2 \leqslant m < n$ can be written as a product of primes.

We note that $n$ cannot be prime, since a prime integer is its own prime factorization. (Note that $s = 1$ is allowed in the statement of the theorem.) So, we must have that $n = a \cdot b$ for some $1 < a < n$ and $1 < b < n$ . But then since $a$ and $b$ are both less than $n$ and $n$ is the least element of $\mathcal S$ , both $a$ and $b$ must not be in $\mathcal S$ . Since each is greater than or equal to $2$ , it follows that each of them is a product of primes. Say $a = q_1 q_2 \cdots q_k$ and $b = r_1 r_2 \cdots r_l$ where each $q_i$ and $r_j$ is prime. Then $n = q_1 q_2 \cdots q_k r_1 r_2 \cdots r_l$ is a product of primes, contrary to the fact that $n \in \mathcal S$ . We conclude that $\mathcal S$ must be the empty set; that is, every integer that is at least $2$ is a product of prime integers.

Case 2: Now assume $n \leqslant -2$ . Then $-n \geqslant 2$ and hence, by the case already established, we have that $-n = p_1 p_2 \cdots p_s$ for some primes $p_1, \dots, p_s$ . It follows that $n = (-p_1) p_2 p_3 \cdots p_s$ . Since $p_1$ is prime, so is $-p_1$ . This proves $n$ is a product of primes. ◻

The uniqueness portion of the Fundamental Theorem states:

Theorem 1.56 (FTA uniqueness). If $p_1 p_2 \cdots p_s= q_1 q_2 \cdots q_t$ for some primes $p_1, p_2, \dots, p_s$ and $q_1, q_2, \dots, q_t$ , then $s = t$ and, after possibly reordering the $q_j$ ’s, we have that $p_i = \pm q_i$ for all $i$ .

Proof. Suppose $n$ is an integer other than $1$ , $-1$ or $0$ .

Case 1: Let us first consider the case when $n \geqslant 2$ .

We give a proof by contradiction. Consider the set

$\mathcal S =\left\{n\in {\mathbb{Z}} \mid n\geq 2 \text{ and } n=p_1\cdots p_s = q_1\cdots q_t \text{ for two prime factorizations which differ not just by reordering and negating the factors} \right\}$ .

Assume towards a contradiction that $\mathcal{S}$ is not empty. Since $\mathcal{S}$ consists of nonnegative integers, by the Well-Ordering Principle, there is a smallest element of $\mathcal{S}$ — let us call it $N$ . Then $N$ has two factorizations, say $N = p_1 p_2 \cdots p_s = q_1 q_2 \cdots q_t$ for some primes $p_1, \dots, p_s$ and $q_1, \dots, q_t$ , such that no possible reordering and changing of signs can make them be the same.

We can rewrite the equations above as $N = p_1 (p_2 \cdots p_s) = q_1 q_2 \cdots q_t, \qquad(*)$ which shows that $p_1$ divides $q_1 q_2 \cdots q_t$ . Since $p_1$ is prime as are $q_1,\ldots, q_t$ , it must be that $p_1= \pm q_j$ for some $1\leq j\leq t$ by Corollary 1.51. By renumbering the $q$ ’s, we may assume $j = 1$ , so that $p_1=\pm q_1$ . Dividing $(*)$ by $|p_1|$ we get $m = \pm p_2 p_3 \cdots p_s = \pm q_2 q_3 \cdots q_t$ with $m=N/|p_1|$ still a positive integer. Since $p_1$ is a prime $|p_1| > 1$ , and so we must have $m<N$ . Thus $m\not \in \mathcal{S}$ and hence the two prime factorizations of $m$ above must be essentially the same, that is, $s-1=t-1$ and after reordering $p_i=\pm q_i$ for $2\leq i\leq s$ . Together with $p_1=\pm q_1$ this shows that the two factorizations of $N$ are essentially the same, a contradiction.

We conclude that $\mathcal{S}$ is empty, so every integer $n \geqslant 2$ has an essentially unique prime factorization.

Case 2: Now assume $n \leqslant -2$ . Then $-n \geqslant 2$ and hence, by the case already established, since $-n=(-p_1) p_2 \cdots p_s=(-q_1)q_2\cdots q_t$ we have that $s=t$ and after possibly reordering we have $p_i=\pm q_i$ for $1\leq i\leq s$ . ◻

Lemma 1.57. If the Fundamental Theorem of Arithmetic is true and $n > 1$ is an integer then there exist positive prime integers $p_1, \ldots, p_s$ such that $n=p_1\cdots p_s$ .

Proof. Suppose $n > 1$ is an integer and the FTA is true.

By the FTA, there exist primes $p'_1, \ldots, p'_s$ so that $n= p'_1 \cdots p'_s$ . Since $n > 0$ , we have $n = |n|= |p'_1 \cdots p'_s|=|p'_1| \cdots |p'_s|.$ Since the absolute value of a prime is a prime, each of the integers $|p'_1|, \ldots, |p'_s|$ is a positive prime, so by setting $p_i=|p'_i|$ we obtain that $n=p_1\cdots p_s$ and $p_1, \ldots, p_s$ are positive primes. ◻

Let’s see a sample application for the uniqueness statement of the FTA.

Example 1.58. Are there primes $p_1, p_2, p_3$ and $q_1, q_2$ so that $p_1\neq p_2, p_1\neq p_3, p_2\neq p_3, q_1\neq q_2$ and $p_1p_2p_3=q_1q_2$ ?

No, this is not possible. The FTA says that if we have two prime factorizations $p_1p_2p_3$ and $q_1q_2$ that are equal then the number of prime factors in each is the same. This would mean that $3=2$ , a contradiction.

The relationship between prime factorizations, divisors, and gcd

Here is the main result that relates prime factorizations and the notion of divisor.

Lemma 1.59. Let $n$ and $d$ be positive integers so that $n=p_1^{f_1}p_2^{f_2}\cdots p_k^{f_k}$ with $p_i>0$ distinct primes and $f_i\geq 0$ integers. Then $d\mid n$ if and only if $d=p_1^{e_1}p_2^{e_2}\cdots p_k^{e_k}$ for some integers $e_1,\ldots, e_k$ so that $0\leq e_i \leq f_i$ for all $1\leq i\leq k$ .
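Lemma 1.59 gives a recipe for listing all positive divisors of $n$ : choose, for each prime, an exponent between $0$ and that prime's exponent in $n$ . Here is a Python sketch of that recipe; the function name `divisors_from_factorization` and the dictionary representation (prime mapped to exponent) are assumptions made for this illustration.

```python
from itertools import product


def divisors_from_factorization(factorization):
    """factorization maps each prime p_i to its exponent in n; returns the positive divisors of n."""
    primes = list(factorization)
    # For each prime, the allowed exponents are 0, 1, ..., (its exponent in n).
    exponent_ranges = [range(factorization[p] + 1) for p in primes]
    divisors = []
    for exponents in product(*exponent_ranges):
        d = 1
        for p, e in zip(primes, exponents):
            d *= p ** e
        divisors.append(d)
    return sorted(divisors)


# 12 = 2^2 * 3^1
print(divisors_from_factorization({2: 2, 3: 1}))  # [1, 2, 3, 4, 6, 12]
```

In this example there are $(2+1)(1+1) = 6$ choices of exponents, matching the six positive divisors of $12$ .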

With this interpretation of the divisors in terms of prime factorization, we can determine a formula for the GCD.

Theorem 1.60. Let $a$ and $b$ be positive integers so that $\begin{eqnarray*} a=p_1^{a_1}p_2^{a_2}\cdots p_k^{a_k}\\ b=p_1^{b_1}p_2^{b_2}\cdots p_k^{b_k} \end{eqnarray*}$ with $p_i>0$ distinct primes and $a_i\geq 0, b_i\geq 0$ integers. Then $\gcd(a,b)=p_1^{\min\{a_1,b_1\}}p_2^{\min\{a_2,b_2\}}\cdots p_k^{\min\{a_k,b_k\}},$ where $\min\{a_i,b_i\}$ denotes the smaller of $a_i$ and $b_i$ .

Proof. By Lemma 1.59, $d\mid a$ if and only if $d=\pm p_1^{e_1}p_2^{e_2}\cdots p_k^{e_k}$ for some integers $e_1,\ldots, e_k$ so that $0\leq e_i \leq a_i$ for all $i$ , and $d\mid b$ if and only if $d=\pm p_1^{e_1}p_2^{e_2}\cdots p_k^{e_k}$ for some integers $e_1,\ldots, e_k$ so that $0\leq e_i \leq b_i$ for all $i$ . Thus $d$ is a common divisor of $a$ and $b$ if and only if $d=\pm p_1^{e_1}p_2^{e_2}\cdots p_k^{e_k}$ with $0\leq e_i \leq \min\{a_i,b_i\}$ for all $i$ . The largest number of this form is $p_1^{\min\{a_1,b_1\}}p_2^{\min\{a_2,b_2\}}\cdots p_k^{\min\{a_k,b_k\}}$ . This number is therefore the largest common divisor $\gcd(a,b)$ . ◻

Example 1.61. Suppose $a=2^{101}\cdot 3^{27}\cdot 5$ and $b=2^{10}\cdot 3^{33}$ then we can first write the numbers using all primes that appear in either, using exponents equal to zero if necessary: $a=2^{101}\cdot 3^{27}\cdot 5^1$ and $b=2^{10}\cdot 3^{33}\cdot 5^0$ . Then the formula in the theorem above gives $\gcd(a,b)=2^{\min\{101,10\}}\cdot 3^{\min\{27,33\}}\cdot 5^{\min\{1,0\}}=2^{10}\cdot3^{27}\cdot 5^0.$
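The formula in Theorem 1.60 can be checked on Example 1.61 with a few lines of Python. This is only a sketch: the function name `gcd_from_factorizations` and the dictionary representation of a factorization are assumptions made for this illustration, and the comparison uses the standard library function `math.gcd`.

```python
from math import gcd


def gcd_from_factorizations(fa, fb):
    """fa and fb map primes to exponents; a prime missing from one has exponent 0 there."""
    result = 1
    for p in set(fa) | set(fb):
        result *= p ** min(fa.get(p, 0), fb.get(p, 0))
    return result


a = 2**101 * 3**27 * 5
b = 2**10 * 3**33
g = gcd_from_factorizations({2: 101, 3: 27, 5: 1}, {2: 10, 3: 33})
print(g == 2**10 * 3**27)  # True
print(g == gcd(a, b))      # True
```

Using `fb.get(p, 0)` plays the role of writing $b = 2^{10} \cdot 3^{33} \cdot 5^0$ with an exponent equal to zero.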

2 Induction

2.1 Proofs by induction

Induction is a very helpful proof technique for showing results about all positive integers, or where the statement depends on a positive integer. Usually, proofs by induction go as follows:

We have a statement we want to prove that depends on a positive integer $n$ ; let’s refer to that statement as $P(n)$ . To prove $P(n)$ for all $n$ , we follow two steps:

Base Case: Prove that $P(n)$ holds for the smallest value of $n$ we are considering. Usually this is $P(0)$ or $P(1)$ .
Induction Step: Assume that $P(n)$ is true for some $n$ , and prove that this implies $P(n+1)$ is true.

The Domino Metaphor: Imagine that the statements $P(1), P(2), P(3), \dots$ are like dominoes. Suppose I tell you that (a) the dominoes have been arranged so that if one of them falls over, the next one in line will fall too, and (b) I have knocked over the first one. Then it is clear that, eventually, every domino will fall. (Ignore the fact that it would take an infinite amount of time for this to happen in reality.) The Principle of Mathematical Induction is an abstraction of this idea.

It is a theorem that once you prove both the base case and the induction step, $P(n)$ must hold for all integers $n \geqslant r$ , where $r$ is the value of $n$ in your base case. Most often $r=1$ .

Here is the formal statement:

Theorem 2.1 (The Principle of Mathematical Induction). Suppose $r\geq 1$ is an integer and $P(n)$ is a statement that applies to an integer $n\geq r$ . If $P(r)$ is true and $P(n)$ implies $P(n+1)$ for all $n \geqslant r$ , then $P(n)$ is true for all $n\geq r$ .

Proof. Suppose $P(r)$ is true and $P(n)$ implies $P(n+1)$ for all $n \geqslant r$ .

We give a proof by contradiction. Assume it is not the case that $P(n)$ is true for all $n \geqslant r$ . Then the set $S := \{k \in {\mathbb{Z}} \mid k \geqslant r, \text{ $P(k)$ is false}\}$ is a non-empty set of positive integers. By the Well-Ordering Axiom, $S$ has a least element, call it $j$ . In words, $j$ is an integer such that $P(j)$ is false but $P(i)$ is true for all $i$ such that $r \leqslant i < j$ .

Since we assumed $P(r)$ is true, $j \ne r$ . So $j > r$ and hence $j-1\geq r$ . Since $j-1 < j$ , $j-1$ must not be in the set $S$ . That is, $P(j-1)$ is true. But we are assuming $P(j-1)$ implies $P(j)$ , and thus $P(j)$ must be true. We have concluded that $P(j)$ is both true and false, which is a contradiction.

Since we reached a contradiction, it must be that the set $S$ is empty; that is, $P(n)$ is true for all integers $n \geqslant r$ . ◻

Here is a statement that can be proven by the Principle of Mathematical Induction:

Recall that $n! = 1 \cdot 2 \cdots n$ .

Theorem 2.2. For every integer $n \geqslant 1$ , we have $2^{n-1} \leqslant n!$ .

Proof. We give a proof by induction. Let $P(n)$ be the statement " $2^{n-1} \leq n!$ ".

Base case: $P(1)$ is the statement $2^0\le 1!$ . This is true since $2^0=1=1!$ .
Induction Step: Assume $P(n)$ is true for some $n\geq 1$ , that is, $2^{n-1} \leq n!$ .

$P(n+1)$ is the statement $2^{n+1-1} \leq (n+1)!$

Observe that $2^n=2^{n-1}\cdot 2\leq n!\cdot( n+1) = 1 \cdot 2 \cdots n \cdot (n+1)=(n+1)!,$ where the middle inequality follows by multiplying the inequalities $2^{n-1}\leq n!$ (the inductive hypothesis) and $2\leq n+1$ (which follows from $n\geq 1$ ).

This shows $P(n+1)$ is true.

By the Principle of Mathematical Induction, $2^{n-1} \leq n!$ for every integer $n \geq 1$ .

◻
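Before attempting an induction proof it can help to test the statement on small cases. The Python sketch below does this for Theorem 2.2; the function name `p` is just an illustrative stand-in for the statement $P(n)$ . A finite check like this is only a sanity check and does not replace the proof.

```python
from math import factorial


def p(n):
    """The statement P(n): 2^(n-1) <= n!."""
    return 2 ** (n - 1) <= factorial(n)


# Check P(1), P(2), ..., P(50).
print(all(p(n) for n in range(1, 51)))  # True
```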

When writing a proof by induction, you should tell the reader that you’re about to do a proof by induction, and you should clearly indicate where your base case and induction step are. The assumption that $P(n)$ holds for some $n$ used in the induction step is called the induction hypothesis. The fact that proving your base case and induction steps is sufficient to prove the theorem is recalled by invoking the Principle of Mathematical Induction to finish the proof.

Important features of a proof by induction:

always start by stating you will do a proof by induction
next state clearly what your $P(n)$ statement is
a frequent mistake is to include the words "for all/every/any $n$ " in the $P(n)$ statement. This is incorrect. The $P(n)$ statement applies just to the $n$ which is named inside the $P(n)$ notation.
label your base case and induction step clearly
state what $P(n+1)$ is clearly before you prove it.
finish the proof saying “By the Principle of Mathematical Induction $\ldots$ is true for all $n\geq \ldots$ ”.

Here are some more proofs using the Principle of Mathematical Induction:

Theorem 2.3. For all $n \geqslant 1$ , $1 + 2 + \cdots + n = \frac{n(n+1)}{2}$ .

Proof. Let’s prove this by induction on $n$ . Let $P(n)$ be the statement $1 + 2 + \cdots + n = \frac{n(n+1)}{2}$ .

Base Case: When $n=1$ , we do indeed have $1 = \frac{1 \cdot 2}{2}$ , so $P(1)$ holds.
Induction Step: Suppose that $P(n)$ holds for some $n \geqslant 1$ , so that $1 + 2 + \cdots + n = \frac{n(n+1)}{2}$ . Then $\begin{align*} 1 + 2 + \cdots + n + (n+1) & = ( 1 + 2 + \cdots + n ) + (n+1) \\ & = \frac{n(n+1)}{2} + (n+1) & \textrm{ by induction hypothesis} \\ & = (n+1) \left( \frac{n}{2} + 1 \right) & \textrm{ factoring out } n+1 \\ & = (n+1) \left( \frac{n+2}{2} \right) \\ & = \frac{(n+1)(n+2)}{2} \\ & = \frac{(n+1)((n+1) + 1)}{2} \\ \end{align*}$ so $P(n+1)$ holds.

By the Principle of Mathematical Induction, $1 + 2 + \cdots + n = \frac{n(n+1)}{2}$ for all $n \geqslant 1$ . ◻

Example 2.4. For every integer $n\geq 1$ , $5$ divides $11^n - 6$ .

Proof. We give a proof by induction. Let $P(n)$ be the statement "5 divides $11^n - 6$ ".

Base case: for $n = 1$ , $P(1)$ is the statement that $5$ divides $11 - 6$ . This is true since $11 - 6 = 5 = 5 \cdot 1$ .
Induction Step: Assume $P(n)$ is true for some $n\geq 1$ . This means that 5 divides $11^n - 6$ , so by definition of divisibility there exists $k\in{\mathbb{Z}}$ so that $11^n-6=5k$ or $11^n=5k+6$ .

We wish to prove that $P(n+1)$ is true. $P(n+1)$ is the statement that $5$ divides $11^{n+1} - 6$ . Observe that $11^{n+1}-6=11\cdot 11^n-6=11(5k+6)-6=11\cdot 5k+66-6=11\cdot 5k+60=5(11k+12).$ Since $11, k,12\in{\mathbb{Z}}$ , then $11k+12\in{\mathbb{Z}}$ by closure under addition and multiplication. Since $11^{n+1}-6=5(11k+12)$ and $11k+12\in{\mathbb{Z}}$ , $5$ divides $11^{n+1}-6$ . So $P(n+1)$ is true.

By the Principle of Mathematical Induction, 5 divides $11^n - 6$ for all integers $n\geq 1$ .

◻
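The induction step in Example 2.4 is constructive: from an integer $k$ with $11^n - 6 = 5k$ it produces the integer $11k + 12$ with $11^{n+1} - 6 = 5(11k+12)$ . The Python sketch below follows exactly this recipe, starting from $11^1 - 6 = 5 \cdot 1$ ; the function name `witness` is a hypothetical choice for this illustration.

```python
def witness(n):
    """Return an integer k with 11**n - 6 == 5*k, built as in the induction step (n >= 1)."""
    k = 1  # base case: 11**1 - 6 = 5 * 1
    for _ in range(n - 1):
        k = 11 * k + 12  # induction step: the next witness is 11k + 12
    return k


for n in range(1, 6):
    k = witness(n)
    print(n, k, 11**n - 6 == 5 * k)
```

Each printed line ends in `True`, confirming for these values of $n$ that the number produced by the induction step really is a witness for divisibility by $5$ .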

We can also reprove the following theorem using induction on $n \geqslant 1$ :

Theorem 2.5. Let $a_1, \ldots, a_n$ be integers. If $p$ is a prime and $p$ divides $a_1 \cdots a_n$ , then $p$ divides $a_i$ for some $i\in\{1,\ldots, n\}$ .

Of course we have already proved this before! But now we can give a more formal proof — our previous proof was a bit handwave-y. We will need the following result, which we have proved before:

Theorem 2.6 (Euclid’s Lemma). If $p$ is a prime and $p$ divides $ab$ , then $p$ divides $a$ or $p$ divides $b$ .

Proof of Theorem 2.5. Let’s fix a prime $p$ . Let $P(n)$ be the statement “If $p$ divides $a_1 \cdots a_n$ for some integers $a_1, \ldots, a_n$ , then $p$ divides $a_i$ for some $i$ ”. We prove $P(n)$ holds for all $n \geqslant 1$ by induction.

Base Case: When $n=1$ , if $p$ divides $a_1$ , then $p$ divides the only one of the $a_i$ , which is $a_1$ .
Induction Step: Suppose that $P(n)$ holds for some $n \geqslant 1$ , meaning that if $p$ divides $a_1 \cdots a_n$ , then $p$ must divide one of the $a_i$ . Now suppose that $a_1, \ldots, a_{n+1}$ are integers and $p$ divides $a_1 \cdots a_{n+1}$ . Since $n \geqslant 1$ , we must have $n+1 \geqslant 2$ , so that we have at least two different $a_i$ . Since $p$ divides $(a_1 \cdots a_{n}) a_{n+1}$ , Theorem 2.6 says that $p$ divides $a_1 \cdots a_n$ or $p$ divides $a_{n+1}$ . If $p$ divides $a_{n+1}$ , we are done. Otherwise, $p$ must divide $a_1 \cdots a_n$ , so by induction hypothesis, $p$ divides some $a_i$ for $1 \leqslant i \leqslant n$ . Either way, $p$ divides one of the numbers in the list $a_1, \ldots, a_{n+1}$ , and we have shown that $P(n+1)$ holds.

By the Principle of Mathematical Induction, $P(n)$ holds for all $n \geqslant 1$ . ◻

One thing that might be confusing about this proof is that Theorem 2.6 is precisely $P(2)$ . We needed this statement in our induction step.

Let’s now prove the uniqueness part of the FTA.

Theorem 2.7 (FTA uniqueness). If $p_1 p_2 \cdots p_s= q_1 q_2 \cdots q_t$ for some primes $p_1, p_2, \dots, p_s$ and $q_1, q_2, \dots, q_t$ , then $s = t$ and, after possibly reordering the $q_j$ ’s, we have that $p_1 = \pm q_1, \ldots, p_s=\pm q_s$ .

Proof. We give a proof by induction on $n=\max\{s,t\}$ . Let $P(n)$ be the statement “If $n=\max\{s,t\}$ and $p_1 p_2 \cdots p_s= q_1 q_2 \cdots q_t$ for some primes $p_1, p_2, \dots, p_s$ and $q_1, q_2, \dots, q_t$ , then $s = t$ and, after possibly reordering the $q_j$ ’s, we have that $p_1 = \pm q_1, \ldots, p_s=\pm q_s$ ."

Base Case: Suppose $1=\max\{s,t\}$ and $p_1 p_2 \cdots p_s= q_1 q_2 \cdots q_t$ for some primes $p_1, p_2, \dots, p_s$ and $q_1, q_2, \dots, q_t$ . Since $s$ and $t$ are positive and $\max\{s,t\}=1$ we must have $s=1$ and $t=1$ . So $s=t$ and the equality $p_1 p_2 \cdots p_s= q_1 q_2 \cdots q_t$ is just $p_1=q_1$ . This shows $P(1)$ is true.
Induction Step: Suppose $P(n)$ is true for some $n \geqslant 1$ . Suppose $n+1=\max\{s,t\}$ and $p_1 p_2 \cdots p_s= q_1 q_2 \cdots q_t$ for some primes $p_1, p_2, \dots, p_s$ and $q_1, q_2, \dots, q_t$ . Since $p_s\mid p_1 p_2 \cdots p_s$ and $p_1 p_2 \cdots p_s= q_1 q_2 \cdots q_t$ , we have that $p_s\mid q_1 q_2 \cdots q_t$ . By Theorem 2.5 we conclude that $p_s\mid q_i$ for some $i\in \{1,\ldots, t\}$ . After reordering the $q_j$ ’s, we may assume that $p_s\mid q_t$ and since both $p_s$ and $q_t$ are primes this gives $p_s=\pm q_t$ .

Substituting $p_s=\pm q_t$ into $p_1 p_2 \cdots p_s= q_1 q_2 \cdots q_t$ we have $p_1 p_2 \cdots p_{s-1}(\pm q_t)= q_1 q_2 \cdots q_{t-1}q_t$ and dividing by $q_t$ yields $p_1 p_2 \cdots (\pm p_{s-1})= q_1 q_2 \cdots q_{t-1}.$ Observe that $\max\{s-1,t-1\}=\max\{s,t\}-1=n$ so that $P(n)$ applies to the above displayed equality and gives $s-1=t-1$ and $p_1=\pm q_1, \ldots, \pm p_{s-1}=\pm q_{s-1}$ . Since $s-1=t-1$ we have $s=t$ and since $p_s=\pm q_t$ from above we have $p_s=\pm q_s$ . Together with $p_1=\pm q_1, \ldots, p_{s-1}=\pm q_{s-1}$ this shows that $P(n+1)$ is true.

By the principle of mathematical induction, we have proven that the theorem is true. ◻

2.2 Strong induction

Sometimes we need a stronger form of induction where in the induction step we assume not only $P(n)$ , but also $P(k)$ for all $k \leqslant n$ , where $k$ should also satisfy $k \geqslant 1$ or whatever else is our base case. This is called Strong Induction or Complete Induction. Here is the formal statement:

Theorem 2.8 (The Principle of Strong Mathematical Induction). Suppose $P(n)$ is a statement that refers to an integer $n$ . Suppose that there exist integers $r, b$ so that

$P(r), P(r+1), \ldots, P(b)$ are true and
for all integers $n> b$ , if $P(k)$ is true for all $r \leq k < n$ , then $P(n)$ is true.

Then $P(n)$ is true for all integers $n \geq r$ .

There are two differences between strong induction and usual induction. First, in strong induction one typically needs to prove more than one base case ( $P(r), P(r+1), \ldots, P(b)$ ). You will have to include as a base case all the cases for which you cannot prove the induction step. Second, in the induction step of strong induction we get to assume any previous statements $P(k)$ with $k<n$ and use whichever one is convenient. This flexibility is what makes this version of induction “strong".

Example 2.9. Use strong induction to prove that every class with at least $6$ students in it can be divided up into groups each of size $3$ or $4$ .

Proof. Let $P(n)$ be the statement “A class with $n$ students can be divided up into groups each of size $3$ or $4$ ”. We prove $P(n)$ is true for all $n \geqslant 6$ by strong induction.

Base case: For the base cases, we start by checking each of the cases $n=6$ , $n=7$ , and $n=8$ hold. This is true since $6 = 3 + 3$ , $7 = 3+4$ and $8 = 4 + 4$ .
Induction step: Assume $n \geq 9$ and that $P(k)$ is true for all $k$ such that $6 \leqslant k < n$ . Our goal is to prove $P(n)$ is true, that is, a class having $n$ students can be divided up into groups each of size $3$ or $4$ .

From a class of $n$ students, pick any three students and put them together into a group. Note that $n - 3 \geqslant 9-3=6$ since we have assumed $n \geqslant 9$ . Also we have $n-3<n$ . So the strong induction hypothesis applied for $k=n-3$ gives that $P(n-3)$ is true; that is, the remaining $n-3$ students can be put into groups of size $3$ or $4$ . In all, we have put all $n$ students into groups of size $3$ or $4$ by combining the group of $3$ we formed initially with the groups formed from the other $n-3$ . This shows that $P(n)$ is true.

By the Principle of Strong Mathematical Induction, $P(n)$ is true for all $n \geqslant 6$ . ◻
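A strong induction proof often translates directly into a recursive procedure: the base cases become the stopping conditions, and the induction step becomes the recursive call. Here is a Python sketch of the argument in Example 2.9; the function name `split_into_groups` is made up for this illustration.

```python
def split_into_groups(n):
    """Return a list of group sizes, each 3 or 4, that add up to n (for n >= 6)."""
    if n < 6:
        raise ValueError("the statement is only claimed for n >= 6")
    base_cases = {6: [3, 3], 7: [3, 4], 8: [4, 4]}
    if n in base_cases:
        return list(base_cases[n])
    # Induction step (n >= 9): set aside one group of 3, then use P(n - 3).
    return [3] + split_into_groups(n - 3)


print(split_into_groups(13))  # [3, 3, 3, 4]
```

The recursive call is made on $n - 3$ , which satisfies $6 \leqslant n-3 < n$ , so the recursion always ends in one of the three base cases. This is the reason all three of $n = 6, 7, 8$ are needed.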

Let’s use strong induction to reprove the existence part of the Fundamental Theorem of Arithmetic.

Theorem 2.10. Every integer $n \geqslant 2$ is a product of primes.

Proof. We give a proof by strong induction. Let $P(n)$ be the statement “the integer $n$ is a product of primes”.

Base Case: When $n=2$ , $2$ is prime, so it is a product of primes — in fact, it is a product of one prime.
Induction Step: Let $n > 2$ and suppose that every integer $k$ with $2 \leqslant k < n$ is a product of primes. We want to show that $n$ is a product of primes.

If $n$ is prime, then it is a product of primes, and we are done. If $n$ is not prime, by Lemma [lemma composite] we can find integers $a\neq \pm 1$ and $b\neq \pm 1$ such that $n = ab$ . In fact, setting $r=|a|$ and $s=|b|$ and using that $n>0$ we see that $n=|n|=|ab|=|a||b|=rs$ with $r>0$ and $s>0$ and $r\neq 1$ and $s\neq 1$ . This means that $r>1$ and $s>1$ so $s=n/r<n/1=n$ and $r=n/s<n/1=n$ .

Since $2 \leq r <n$ and $2 \leq s <n$ , by the induction hypothesis we can find primes $p_1, \ldots, p_u$ and $q_1, \ldots, q_v$ such that $r = p_1 \cdots p_u \textrm{ and } s = q_1 \cdots q_v,$ so $n= rs = (p_1 \cdots p_u) (q_1 \cdots q_v) = p_1 \cdots p_u q_1 \cdots q_v$ is a product of primes.

By the principle of strong mathematical induction every integer $n \geqslant 2$ is a product of primes. ◻
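The strong induction proof of Theorem 2.10 can be read as a recursive factoring procedure. The Python sketch below mirrors the proof rather than aiming for speed; the function name `prime_factors` is a hypothetical choice, and the search for a divisor by trying every candidate is an assumption of this sketch, since the proof only asserts that a factorization $n = rs$ exists.

```python
def prime_factors(n):
    """Return a list of positive primes whose product is n, for an integer n >= 2."""
    if n < 2:
        raise ValueError("expected an integer n >= 2")
    for r in range(2, n):
        if n % r == 0:
            s = n // r
            # Here n = r * s with 2 <= r < n and 2 <= s < n, so the
            # induction hypothesis applies to both r and s.
            return prime_factors(r) + prime_factors(s)
    # No divisor strictly between 1 and n exists, so n is prime.
    return [n]


print(prime_factors(360))  # [2, 2, 2, 3, 3, 5]
```

Both recursive calls are made on integers strictly smaller than $n$ , which is exactly why strong induction (and not ordinary induction) is the natural tool here.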

3 Modular Arithmetic

3.1 Equivalence relations and equivalence classes

Definition 3.1. An equivalence relation on a set $X$ is a binary relation $\sim$ satisfying the following properties:

For every $a \in X$ , $a \sim a$ . This is known as the reflexive property.
For all $a,b \in X$ , if $a \sim b$ , then $b \sim a$ . This is known as the symmetric property.
For all $a,b,c \in X$ , if $a \sim b$ and $b \sim c$ , then $a \sim c$ . This is known as the transitive property.

Typically the symbol “ $\sim$ ” is read as “is equivalent to”; $a \sim b$ is read as “ $a$ is equivalent to $b$ ”.

Example 3.2. Say $X$ is the set of students in our class and a relation is defined on this set as follows: students $a$ and $b$ are equivalent, written $a\sim b$ , if they are born in the same month. We check this is an equivalence relation:

For every $a \in X$ , $a$ is born in the same month as $a$ , thus $a\sim a$ . Thus the relation satisfies the reflexive property.
If $a \sim b$ , then $a$ and $b$ are born in the same month, so $b$ and $a$ are born in the same month, which shows $b\sim a$ . This means that the relation satisfies the symmetric property.
If $a \sim b$ and $b \sim c$ , then $a$ and $b$ are born in the same month and $b$ and $c$ are born in the same month so in fact all three ( $a,b,c$ ) are born in the same month. In particular, $a$ and $c$ are born in the same month so $a\sim c$ . This proves the transitive property.

Since it satisfies the reflexive, symmetric and transitive properties the relation of being born in the same month is an equivalence relation.

Here is an example of a binary relation that is not an equivalence relation:

Example 3.3. Let $X={\mathbb{Z}}$ and set $a\sim b$ to mean $a\leq b$ . This relation is reflexive and transitive but not symmetric. We show it is not symmetric by means of a counterexample: we have $3\leq 5$ so $3\sim 5$ , but $5\not \sim 3$ since $5\not\leq 3$ .
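On a finite set, the three properties of Definition 3.1 can be tested by brute force. The Python sketch below does this for the relation $a \leq b$ on a small sample of integers; the function names are illustrative. Note the asymmetry in what such a test shows: a failure on the sample proves the property fails on ${\mathbb{Z}}$ , while success on the sample only suggests that the property holds.

```python
def is_reflexive(xs, rel):
    return all(rel(a, a) for a in xs)


def is_symmetric(xs, rel):
    return all(rel(b, a) for a in xs for b in xs if rel(a, b))


def is_transitive(xs, rel):
    return all(rel(a, c)
               for a in xs for b in xs for c in xs
               if rel(a, b) and rel(b, c))


xs = range(-5, 6)  # a finite sample of the integers


def leq(a, b):
    return a <= b


print(is_reflexive(xs, leq), is_symmetric(xs, leq), is_transitive(xs, leq))
# True False True
```

The symmetry check fails because, for instance, $3 \leq 5$ holds while $5 \leq 3$ does not.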

Here is another example:

Example 3.4. Let $X$ denote the set of all points in the plane. Given two such points $P$ and $Q$ , declare $P \sim Q$ to mean that $P$ and $Q$ have the same distance to the origin. For instance the point $P = (3,4)$ has distance $5$ to the origin and the point $Q = (0,-5)$ also has distance $5$ to the origin, and hence $P \sim Q$ . But $P$ and $R$ are not equivalent for $R = (3,0)$ , since $R$ has distance $3$ to the origin.

The three properties of an equivalence relation are pretty clear in this case: (1) Any point $P$ has the same distance to the origin as itself, (2) if $P$ and $Q$ have the same distance to the origin, so do $Q$ and $P$ , and (3) if $P$ and $Q$ have the same distance to the origin and $Q$ and $R$ have the same distance to the origin, then $P$ and $R$ are of the same distance to the origin.

Definition 3.5. Let $\sim$ be an equivalence relation on some set $X$ . For $a \in X$ , the equivalence class of $a$ , written $[a]$ , is the subset of $X$ consisting of all elements that are equivalent to $a$ : $[a] = \{b \in X \mid a \sim b\}$

Example 3.6. Let $X$ and $\sim$ be as in Example 3.4. Given a point $P \in X$ , what is the equivalence class $[P]$ ? It is the set of all points in the plane located on the circle centered at the origin which goes through $P$ .

Suppose the distance from $P$ to the origin is $d$ while the distance from $Q$ to the origin is $r$ . As above $[P]$ is the circle centered at the origin of radius $d$ and $[Q]$ is the circle centered at the origin of radius $r$ .

If $P\sim Q$ then $d=r$ . Thus the circle centered at the origin of radius $d$ is the same as the circle centered at the origin of radius $r$ , that is $[P]=[Q]$ .

If $P\not \sim Q$ then $d\neq r$ . Thus the circle centered at the origin of radius $d$ and the circle centered at the origin of radius $r$ do not share any points, that is $[P]\cap [Q]=\emptyset$ .

Theorem 3.7. Let $X$ be any set and $\sim$ be any equivalence relation on $X$ . For elements $a, b \in X$ we have $[a] = [b]$ if and only if $a \sim b$ .

Proof. $\Rightarrow$ If $[a] = [b]$ then $a \sim b$ .

If $[a] = [b]$ then since $a \in [a]$ (by the reflexive property) we have $a \in [b]$ too. By definition of $[b]$ , this means $a \sim b$ .

$\Leftarrow$ If $a \sim b$ then $[a] = [b]$ . Suppose $a \sim b$ . Pick any $c \in [a]$ . By definition this means that $c \sim a$ . Since $a \sim b$ , by transitivity, this gives that $c \sim b$ and hence $c \in [b]$ . We have proven $[a] \subseteq [b]$ . Now pick any $d \in [b]$ . By definition this means that $d \sim b$ . Since $a \sim b$ , by symmetry we also have $b \sim a$ and hence by transitivity we have $d \sim a$ , which means that $d \in [a]$ . This proves that $[b] \subseteq [a]$ . Since $[b] \subseteq [a]$ and $[a] \subseteq [b]$ , we have $[a] = [b]$ . ◻

Corollary 3.8. Let $X$ be any set and $\sim$ an equivalence relation on $X$ . Any two equivalence classes are either equal or disjoint: for all $a, b \in X$ , either $[a] = [b]$ or $[a]\cap [b] = \emptyset$ .

Since “either $[a] = [b]$ or $[a]\cap [b] = \emptyset$ ” is a statement of the form “P or Q”, we will instead prove the logically equivalent statement “not(Q) $\rightarrow$ P”, that is, “if $[a]\cap [b] \neq \emptyset$ then $[a] = [b]$ ”.

Proof. Suppose $[a]\cap [b] \neq \emptyset$ . Then there exists $x\in [a]\cap [b]$ meaning that $x\in [a]$ and $x\in [b]$ . Since $x\in [a]$ we have $x\sim a$ by definition of equivalence class and since $x\in [b]$ we have $x\sim b$ by definition of equivalence class. By the symmetric and transitive properties, since $x\sim a$ and $x\sim b$ , then $a\sim b$ . Since $a\sim b$ , by the previous Theorem we deduce that $[a]=[b]$ . ◻

The corollary says that an equivalence relation on a set $X$ “partitions” $X$ into equivalence classes: Every element of $X$ belongs to one, and only one, equivalence class. Equivalently, $X$ is the disjoint union of its equivalence classes.
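
For a finite set, the partition into equivalence classes can be computed directly. The following is a minimal sketch under our own assumptions: the function name is hypothetical, the relation passed in is assumed to be an equivalence relation, and the set is a small grid of integer points with the relation of Example 3.4 (compared through squared distances to avoid square roots).

```python
def equivalence_classes(elements, related):
    """Group elements into classes; assumes `related` is an equivalence relation."""
    classes = []
    for x in elements:
        for cls in classes:
            if related(cls[0], x):   # comparing with one representative suffices
                cls.append(x)
                break
        else:
            classes.append([x])      # x starts a new class
    return classes

# Integer points in a small grid; P ~ Q when they have the same distance to the origin.
points = [(x, y) for x in range(-2, 3) for y in range(-2, 3)]

def same_distance(p, q):
    return p[0] ** 2 + p[1] ** 2 == q[0] ** 2 + q[1] ** 2

classes = equivalence_classes(points, same_distance)
for cls in classes:
    print(cls)

# Every point lands in exactly one class.
print(sum(len(cls) for cls in classes) == len(points))  # True
```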

3.2 Congruence and congruence classes

3.2.1 Congruence

Definition 3.9. Fix a nonzero integer $N$ . We say that $a, b\in \mathbb Z$ are congruent modulo $N$ if $N$ divides $a-b$ . We write $a \, \equiv \, b \pmod N$ for “ $a$ is congruent to $b$ modulo $N$ ". In this notation, the $a$ and $b$ are the two inputs, and $\, \equiv \!\pmod N$ is one piece, like a complicated equal sign.

Here are some examples:

$5\equiv 19 \pmod 7$ because $7$ divides $5-19=-14$ .
$-5\equiv 20 \pmod{10}$ is not true, because $10$ does not divide $-5-20=-25$ .
$-11\equiv -26 \pmod 5$ because $5$ divides $-11-(-26) = 15$ .
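
The definition translates into a one-line test. This is a small sketch; the function name `is_congruent` is our own, and it relies on the fact that in Python `x % n == 0` holds exactly when the nonzero integer `n` divides `x`.

```python
def is_congruent(a, b, n):
    """Return True when n divides a - b, that is, a is congruent to b modulo n."""
    if n == 0:
        raise ValueError("the modulus must be nonzero")
    return (a - b) % n == 0

print(is_congruent(5, 19, 7))     # True
print(is_congruent(-5, 20, 10))   # False
print(is_congruent(-11, -26, 5))  # True
```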

Theorem 3.10. Any two odd integers are congruent modulo 2.

Proof. Let $a$ and $b$ be two odd integers. By a problem in problem set 1, $a = 2q+1$ and $b = 2s+1$ for some integers $q$ and $s$ . Now $a-b = (2q+1)-(2s+1) = 2(q-s),$ so $2$ divides $a-b$ , which means that $a \equiv b \,\,\pmod 2$ . ◻

Example 3.11. It is not true that any two odd integers are congruent modulo 3, because for example $3 \not\equiv 5 \pmod{3}$ (since $3$ does not divide $3-5=-2$ ) and $3$ and $5$ are both odd.

Example 3.12. If $7 \equiv 27 \pmod{N}$ for some nonzero integer $N$ , what are the possible values of $N$ ? The congruence $7 \equiv 27 \pmod{N}$ holds if and only if $N$ divides $7 - 27 = -20$ , and so $N$ must be one of the divisors of $20$ : $N \in \{\pm 1, \pm 2, \pm 4, \pm 5, \pm 10, \pm 20\}$ .

Congruence classes modulo $10$ . The number $2001$ is congruent to $1$ modulo $10$ , because $2001 -1 = 2000$ is divisible by $10$ . One way to find this is to consider the division algorithm: dividing $2001$ by $10$ we obtain $2001=10\cdot 200+1$ , which shows that $2001-1=10\cdot 200$ is divisible by $10$ .

A different way to think about this: for any positive integer $n$ , if $r \in \{0, 1, \dots, 9\}$ is the digit appearing in its ones place, then the ones place of $n - r$ is $0$ and hence $n-r$ is divisible by $10$ . Thus, $n \equiv r \pmod{10}$ where $r$ is the digit in the ones place of $n$ .

However, we have to be a little more careful about negative integers. If $n <0$ is an integer, to find the unique $0 \leqslant r \leqslant 9$ such that $n \equiv r \pmod {10}$ , we consider the last digit $d$ of $n$ , and take $r=10-d$ if $d \neq 0$ and $r = 0$ if $d = 0$ . For example, $1999 \equiv 9 \pmod{10}$ and $-1999 \equiv 1 \pmod{10}$ .

Theorem 3.13. For a fixed $N>0$ , every $a\in \mathbb Z$ is congruent modulo $N$ to some $r \in \mathbb Z$ such that $0\leqslant r < N$ .

Proof. By the Division Algorithm we have $a = qN + r \text{ with } 0 \leqslant r < N.$ So $a-r = qN$ and thus $N$ divides $a-r$ . It follows from the definition of congruence that $a \equiv r \pmod{N}$ . ◻
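
The residue $r$ of this theorem is what Python's built-in `divmod` returns when the modulus is positive, including for negative $a$ . The sketch below wraps it in a function whose name, `least_residue`, is our own.

```python
def least_residue(a, n):
    """For n > 0, return the r with 0 <= r < n and a congruent to r modulo n."""
    if n <= 0:
        raise ValueError("n must be positive")
    q, r = divmod(a, n)        # for n > 0 Python gives a == q*n + r with 0 <= r < n
    assert a == q * n + r and 0 <= r < n
    return r

print(least_residue(2001, 10))    # 1
print(least_residue(1999, 10))    # 9
print(least_residue(-1999, 10))   # 1
```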

Theorem 3.14. Congruence modulo $N$ is an equivalence relation; that is, the following three properties hold for any nonzero integer $N$ :

Reflexive: For any integer $a$ , $a\equiv a \pmod N$ .
Symmetric: For all integers $a$ and $b$ , if $a\equiv b\pmod N$ , then $b\equiv a \pmod N$ .
Transitive: For all integers $a$ , $b$ , and $c$ , if $a\equiv b\pmod N$ and $b\equiv c \pmod N$ , then $a\equiv c \pmod N$ .

Proof. Fix a nonzero integer $N$ .

For any integer $a$ , $a - a = 0$ is divisible by $N$ and hence $a\equiv a \pmod N$ .
For integers $a$ and $b$ , assume $a \equiv b \pmod{N}$ . Then $N$ divides $a-b$ and hence $N$ also divides $(-1)(a-b) = b-a$ . This proves that $b\equiv a \pmod N$ .
Let $a$ , $b$ , and $c$ be integers such that $a\equiv b\pmod N$ and $b\equiv c \pmod N$ . By definition this means $N$ divides $a-b$ and $N$ divides $b-c$ . Using the fact proven before that if $N$ divides two integers then it divides their sum, we see that $N$ divides $(a-b) + (b-c) = a-c$ . By definition, this proves $a\equiv c \pmod N$ .

◻

Theorem 3.15 (Congruences can be added). Given a nonzero integer $N$ and integers $a,b,c,d$ , if $a \equiv b \pmod N$ and $c \equiv d \pmod N$ , then $a + c \equiv b + d \pmod N$ .

Proof. Fix a nonzero integer $N$ and assume $a,b,c,d$ are integers such that $a \equiv b \pmod N$ and $c \equiv d \pmod N$ . Then $N$ divides both $a-b$ and $c-d$ and hence $N$ also divides their sum, which is $a-b+c-d = (a+c) - (b+d)$ . By definition this means that $a + c \equiv b + d \pmod N$ . ◻

Theorem 3.16 (Congruences can be multiplied). Given a nonzero integer $N$ and integers $a,b,c,d$ , if $a \equiv b \pmod N$ and $c \equiv d \pmod N$ , then $a \cdot c \equiv b \cdot d \pmod N$ .

Proof. Fix a nonzero integer $N$ and assume $a,b,c,d$ are integers such that $a \equiv b \pmod N$ and $c \equiv d \pmod N$ . Then $N$ divides both $a-b$ and $c-d$ , so $a-b=Nk$ and $c-d=N\ell$ for some $k,\ell\in{\mathbb{Z}}$ .

Therefore $(a+c)-(b+d)=(a-b)+(c-d)=Nk+N\ell=N(k+\ell)$ where $k+\ell$ is an integer by closure of ${\mathbb{Z}}$ under addition. This means that $N\mid ((a+c)-(b+d))$ , which shows $a+c\equiv b+d \pmod N$ by the definition of congruence; this recovers Theorem 3.15.

Since $a=b+Nk$ and $c=d+N\ell$ we have $ac=(b+Nk)(d+N\ell)=bd+Nkd+N\ell b+N^2k\ell=bd+N(kd+\ell b+Nk\ell).$ It follows that $ac-bd = N(kd+\ell b+Nk\ell)$ , where $kd+\ell b+Nk\ell$ is an integer by closure of the integers under addition and multiplication. Thus $N\mid(ac-bd)$ and hence $ac \equiv bd \pmod N$ by the definition of congruence. ◻
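
As a sanity check on Theorems 3.15 and 3.16, one can test the two conclusions on many concrete integers. The sketch below is an illustration only: the function names, the modulus, and the ranges are arbitrary choices of ours, and passing the check is of course not a proof. It builds $b$ and $d$ from $a$ and $c$ exactly as in the proof, so that $a-b=Nk$ and $c-d=N\ell$ .

```python
def congruent(a, b, n):
    return (a - b) % n == 0

def check_sum_and_product(n, window=range(-10, 11), multiples=range(-2, 3)):
    """Test a + c = b + d and a*c = b*d modulo n whenever a = b and c = d modulo n."""
    for a in window:
        for c in window:
            for k in multiples:
                for l in multiples:
                    b = a - n * k   # so that a - b = n*k
                    d = c - n * l   # so that c - d = n*l
                    if not congruent(a + c, b + d, n):
                        return False
                    if not congruent(a * c, b * d, n):
                        return False
    return True

print(check_sum_and_product(7))   # True
```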

3.2.2 Congruence classes

Definition 3.17. Fix a nonzero integer $N$ . For $a\in \mathbb Z$ , the congruence class of $a$ modulo $N$ is the subset of $\mathbb Z$ consisting of all integers congruent to $a$ modulo $N$ ; that is, the congruence class of $a$ modulo $N$ is the set $[a]_N := \{b\in \mathbb Z\, | \, b\equiv a \pmod N\}.$

Let me emphasize that congruence classes are sets, not numbers.

Example 3.18. What are some of the elements in $[11]_4$ ? Here are some examples $11, 15, 19, 23$ and $7, 3, -1, -5$ .

The set $[7]_2$ consists of all the odd integers. The set $[-8]_2$ consists of all the even integers.

True or False? Justify.

$47 \in [17]_{5}$ . This is true since $47 \equiv 17 \pmod{5}$ .
$[17]_{7 } \cap [23]_{7 } = \emptyset$ . This is true: an element belonging to both sets would have to be congruent to both $17$ and $23$ modulo $7$ . If such a number existed, then $17$ and $23$ would be congruent modulo $7$ (using the symmetric and transitive properties of congruences). But they are not. So no such number exists.
$[17]_{6 } \cap [19]_{7 } = \emptyset$ . This is false; for instance, $5$ belongs to both these sets. Notice that we are using two different values for $N$ in this example.
For all integers $a$ , $[a]_{60} \subseteq [a]_{10}$ . This is true. If $b \in [a]_{60}$ then $b \equiv a \pmod{60}$ and hence $60$ divides $b-a$ . But then $10$ would also divide $b-a$ and hence $b \equiv a \pmod{10}$ , so that $b \in [a]_{10}$ .

Example 3.19. Set $N = 3$ . Then there are three congruence classes modulo $3$ :

$[0]_3 = \{ \dots, -6, -3, 0, 3, 6, 9, \dots\}$
$[1]_3 = \{ \dots, -5, -2, 1, 4 , 7, 10, \dots\}$
$[2]_3 = \{ \dots, -4, -1, 2, 5 , 8, 11, \dots\}$

Notice that every integer belongs to one, and only one, of these three sets. They partition the integers into three non-overlapping subsets, like countries on a map. A bit more formally, this property looks as follows.
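
Since a congruence class is an infinite set, a program can only list the part of it that falls in a finite window. The helper below, with the hypothetical name `class_window`, is a sketch of this for the three classes modulo $3$ .

```python
def class_window(a, n, lo, hi):
    """Elements of the congruence class of a modulo n between lo and hi inclusive."""
    return [b for b in range(lo, hi + 1) if (b - a) % n == 0]

for a in range(3):
    print(a, class_window(a, 3, -6, 11))

# Each integer in the window appears in exactly one of the three lists.
combined = sorted(x for a in range(3) for x in class_window(a, 3, -6, 11))
print(combined == list(range(-6, 12)))   # True
```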

Theorem 3.20. For any positive integer $N$ and any pair of integers $a,b$ exactly one of the following possibilities is true:

$[a]_N=[b]_N$ or
$[a]_N \cap [b]_N=\emptyset$

Proof. Since we have seen in Theorem 3.14 that congruence modulo $N$ is an equivalence relation, and since the congruence classes are just the equivalence classes of this equivalence relation, the claim of this theorem follows from Corollary 3.8. ◻

We can make more precise when each of the two cases in the preceding theorem occurs.

Theorem 3.21. For any positive integer $N$ and any pair of integers $a,b$ we have

$[a]_N=[b]_N$ if and only if $a\equiv b\pmod{N}$
$[a]_N \cap [b]_N=\emptyset$ if and only if $a\not\equiv b \pmod{N}$ .

Proof. The statement “ $[a]_N=[b]_N$ if and only if $a\equiv b\pmod{N}$ " is true by Theorem 3.7 applied to the equivalence relation congruence modulo $N$ .

We will now prove the statement “ $[a]_N \cap [b]_N=\emptyset$ if and only if $a\not\equiv b \pmod{N}$ ".

Suppose $[a]_N \cap [b]_N=\emptyset$ . Then since $a\in [a]_N$ we obtain that $a\not\in [b]_N$ . Therefore, by definition of congruence classes $a\not\equiv b \pmod{N}$ .

Suppose $a\not\equiv b \pmod{N}$ . Then $a\in[a]_N$ but $a\not \in [b]_N$ . This shows that $[a]_N\neq[b]_N$ . By Theorem 3.20 this means that $[a]_N \cap [b]_N=\emptyset$ must be true. ◻

Theorem 3.22. For any positive integer $N$ , there are exactly $N$ congruence classes modulo $N$ , namely the congruence classes $[0]_N, [1]_N, [2]_N, \ldots, [N-1]_N$ .

Proof. Suppose $N$ is a positive integer. By Theorem 3.13 we have that any integer $a$ is congruent to some integer $r$ so that $0\leq r<N$ . By Theorem 3.21, since $a\equiv r\pmod{N}$ we have $[a]_N=[r]_N$ for some integer $r$ so that $0\leq r<N$ . Since $0\leq r<N$ , the possibilities for $r$ are $0, 1,2, \ldots, N-1$ and thus the possibilities for $[a]_N$ are $[0]_N, [1]_N, [2]_N, \ldots, [N-1]_N$ . We thus have at most $N$ congruence classes modulo $N$ .

To see that the congruence classes $[0]_N, [1]_N, [2]_N, \ldots, [N-1]_N$ are distinct, we appeal to the uniqueness of the remainder $r$ in the Division Algorithm, which was used in the proof of Theorem 3.13. Indeed, suppose $[i]_N=[j]_N$ with $0\leq i\leq N-1$ and $0\leq j\leq N-1$ . Then $i\equiv j \pmod N$ by Theorem 3.21, so $i = qN + j$ for some integer $q$ . Since also $i = 0\cdot N + i$ , both $i$ and $j$ satisfy the properties required of the remainder of $i$ upon division by $N$ , and thus $i=j$ by uniqueness. Thus the congruence classes $[0]_N, [1]_N, [2]_N, \ldots, [N-1]_N$ are distinct and so we have exactly $N$ congruence classes. ◻

3.3 Modular arithmetic

We come to a very important point. We will see shortly that although congruence classes modulo $N$ are sets, not numbers, they behave like numbers in the sense that they can be added and multiplied. Thus we can talk about arithmetic on the following set.

Definition 3.23. For a positive integer $N$ , $\mathbb{Z}_N$ is the set of congruence classes of integers modulo $N$ , that is $\mathbb{Z}_N=\{[0]_N, [1]_N, \ldots, [N-1]_N\}.$

Using the fact that congruences can be added and multiplied, we may introduce rules for adding and multiplying congruence classes.

Definition 3.24. Fix a non-zero integer $N$ and let $a,b\in{\mathbb{Z}}$ . Define $[a]_N + [b]_N$ to be the congruence class modulo $N$ given by the following rule:

Pick any $x\in [a]_N$ and pick any $y\in [b]_N$ , and define $[a]_N+[b]_N:= [x+y]_N$ .

For example if we picked $x=a$ and $y=b$ we would get $[a]_N+[b]_N:= [a+b]_N.$

Example 3.25. Let us compute $[17]_5 + [11]_5$ . The rule says that we start by picking any elements $x \in [17]_5$ and $y \in [11]_5$ . I’ll choose $x = 2$ and $y = 6$ . Next we add our choices for $x$ and $y$ as ordinary integers to get $8$ , and conclude that $[17]_5 + [11]_5 = [8]_5$ . Note that $[8]_5 = [3]_5$ , so we could also say that $[17]_5 + [11]_5 = [3]_5$ . Since $[17]_5 = [2]_5$ and $[11]_5 = [1]_5$ , we could rewrite this yet again as $[2]_5 + [1]_5 = [3]_5$ .

Now you try it:

Example 3.26. Have each member of your group compute $[120]_6 + [13]_6$ . Then compare your answers. Are they equal? Repeat this for $[-19]_6 + [23]_6$ .

Solution: If one of us were to pick $114 \in [120]_6$ and $7 \in [13]_6$ , the rule for adding congruence classes gives $[120]_6 + [13]_6 = [121]_6$ . If another one of us were to pick $0 \in [120]_6$ and $1 \in [13]_6$ , the rule for adding congruence classes gives $[120]_6 + [13]_6 = [1]_6$ . This gives the same result since $[121]_6 = [1]_6$ , because $121 \equiv 1 \pmod{6}$ .

Perhaps you notice a troubling aspect of the definition for adding congruence classes: the definition of $[a]_N+[b]_N$ seems to depend on choices of representative elements $x$ and $y$ . What if one pair of choices leads to a different outcome than another pair of choices? That would be nonsensical.

Theorem 3.27. The definition of $[a]_N+[b]_N$ is independent of the choices made. That is, if $x_1, x_2 \in [a]_N$ and $y_1, y_2 \in [b]_N$ , then $[x_1 + y_1]_N = [x_2 + y_2]_N$ .

Proof. Our proof will use two theorems proven before: (1) $a$ and $b$ belong to the same congruence class modulo $N$ if and only if $a \equiv b \pmod{N}$ (Theorem 3.21) and (2) congruences may be added (Theorem 3.15).

Since $x_1$ and $x_2$ belong to the same congruence class, by Theorem 3.21 we have that $x_1 \equiv x_2 \pmod{N}$ , and likewise since $y_1$ and $y_2$ belong to the same congruence class, $y_1 \equiv y_2 \pmod{N}$ . By Theorem 3.15, we may add these two congruences to conclude $x_1 + y_1 \equiv x_2 + y_2 \pmod{N}$ . Using Theorem 3.21 again, it follows that $[x_1 + y_1]_N = [x_2 + y_2]_N$ . ◻

Now we can define multiplication of congruence classes.

Definition 3.28. Fix a non-zero integer $N$ and let $a,b\in{\mathbb{Z}}$ . Define $[a]_N \cdot [b]_N$ to be the congruence class modulo $N$ given by the following rule:

Pick any $x\in[a]_N$ and pick any $y\in [b]_N$ , and define $[a]_N\cdot[b]_N:=[xy]_N$ .

For example if we picked $x=a$ and $y=b$ we would get $[a]_N\cdot[b]_N:= [ab]_N.$

Theorem 3.29. The definition of multiplying congruence classes is independent of the choices made. In more detail, if $[a]_N$ and $[b]_N$ are two congruence classes modulo $N$ , and we pick $x_1, x_2 \in [a]_N$ , and $y_1, y_2 \in [b]_N$ , then $[x_1 y_1]_N = [x_2 y_2]_N$ .

Proof. Our proof will use two theorems proven before: (1) $a$ and $b$ belong to the same congruence class modulo $N$ if and only if $a \equiv b \pmod{N}$ (Theorem 3.21), and (2) congruences may be multiplied (Theorem 3.16).

Since $x_1$ and $x_2$ belong to the same congruence class, by (1) we have that $x_1 \equiv x_2 \pmod{N}$ , and likewise since $y_1$ and $y_2$ belong to the same congruence class, $y_1 \equiv y_2 \pmod{N}$ . By (2), we may multiply these two congruences to conclude $x_1 y_1 \equiv x_2 y_2 \pmod{N}$ . Using (1) again, it follows that $[x_1 y_1]_N = [x_2 y_2]_N$ . ◻
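
The independence of choices in Theorems 3.27 and 3.29 can be observed experimentally: sample several representatives of each class, apply the rule, and collect the resulting classes. In the sketch below the function names and the number of sampled representatives are our own choices, and each class is recorded by its residue in $\{0, \ldots, N-1\}$ . A set with a single element means that every sampled choice led to the same class.

```python
def representatives(a, n, count=5):
    """A few elements of the class of a modulo n: a shifted by multiples of n."""
    return [a + n * k for k in range(-count, count + 1)]

def results(a, b, n, op):
    """Residues of op(x, y) over sampled x in [a]_n and y in [b]_n."""
    return {op(x, y) % n
            for x in representatives(a, n)
            for y in representatives(b, n)}

def add(x, y):
    return x + y

def multiply(x, y):
    return x * y

print(results(120, 13, 6, add))       # {1}
print(results(-19, 23, 6, add))       # {4}
print(results(17, 11, 5, multiply))   # {2}
```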

3.3.1 Units in ${\mathbb{Z}}_N$

Definition 3.30. A congruence class $[a]_N$ in ${\mathbb{Z}}_N$ is called a unit if it has a multiplicative inverse. A multiplicative inverse of $[a]_N$ is a congruence class $[b]_N$ in ${\mathbb{Z}}_N$ such that $[b]_N \cdot [a]_N = [1]_N$ .

Example 3.31. In $\mathbb{Z}_6$ , $[1]_6$ and $[5]_6$ are the only units. The multiplicative inverse of $[1]_6$ is $[1]_6$ , and the multiplicative inverse of $[5]_6$ is $[5]_6$ .

Theorem 3.32. Given a non-zero integer $N$ and any integer $a$ , there exists some integer $x$ such that $x \cdot a \equiv 1 \pmod{N}$ if and only if $\gcd(a,N) = 1$ .

Proof. Recall that $\gcd(a,N) = 1$ if and only if $xa + yN = 1$ for some integers $x$ and $y$ . The existence of integers $x$ and $y$ such that $xa + yN = 1$ is equivalent to the existence of an integer $x$ such that $N$ divides $xa-1$ , or equivalently, such that $xa \equiv 1 \pmod N$ . ◻

We can translate Theorem 3.32 into a statement about congruence classes:

Corollary 3.33. Given a non-zero integer $N$ and any integer $a$ , the congruence class $[a]_N$ is a unit in ${\mathbb{Z}}_N$ if and only if $\gcd(a,N)=1$ .

Proof. There exists some $[x]_N$ in $\mathbb{Z}_N$ such that $[a]_N [x]_N = [ax]_N = [1]_N$ if and only if there is an integer $x$ such that $ax \equiv 1 \pmod N$ if and only if $\gcd(a,N) = 1$ by the previous theorem. So the congruence class $[a]_N \in {\mathbb{Z}}_N$ is a unit if and only if $\gcd(a,N) = 1$ . ◻
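
Corollary 3.33 can be compared against a direct search for inverses. The following sketch, with function names of our own, lists the units of ${\mathbb{Z}}_N$ in two ways: by trying every possible inverse, and by the gcd criterion. Classes are represented by their residues in $\{0, \ldots, N-1\}$ .

```python
from math import gcd

def units_by_search(n):
    """Residues a for which some residue b satisfies a*b = 1 modulo n."""
    return [a for a in range(n)
            if any((a * b) % n == 1 % n for b in range(n))]

def units_by_gcd(n):
    """Residues a with gcd(a, n) = 1."""
    return [a for a in range(n) if gcd(a, n) == 1]

print(units_by_search(6))   # [1, 5]
print(units_by_gcd(6))      # [1, 5]
print(all(units_by_search(n) == units_by_gcd(n) for n in range(1, 40)))   # True
```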

How do we find multiplicative inverses? We can use trial and error, but it gets very slow; or we can use the extended Euclidean algorithm.

Example 3.34. Note that $[131]_{260}$ is a unit. We can find its multiplicative inverse via the extended Euclidean algorithm, as follows. First, we note that $\begin{align*} 260 & = 1 \cdot 131 + 129 \\ 131 &= 1 \cdot 129 + 2 \\ 129 &= 64 \cdot 2 + 1 \\ 2 &= 2 \cdot 1 + 0, \end{align*}$ so that indeed $\gcd(131,260)=1$ . We can now use the computations above to find a linear combination of $131$ and $260$ that is equal to 1: $\begin{align*} 129 & = 1 \cdot 260 - 1 \cdot 131 \\ 2 &= 1 \cdot 131 - 1 \cdot 129 = 1 \cdot 131 - 1 \cdot (1 \cdot 260 - 1 \cdot 131) = - 1 \cdot 260 + 2 \cdot 131 \\ 1 &= 129 - 64 \cdot 2 = 1 \cdot 260 - 1 \cdot 131 - 64 \cdot (- 1 \cdot 260 + 2 \cdot 131) = 65 \cdot 260 + (-129) \cdot 131 \\ \end{align*}$

Therefore, $(-129) \cdot 131 \equiv 1 \pmod {260},$ so that $[-129]_{260} = [131]_{260}$ is the multiplicative inverse of $[131]_{260}$ .
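
The back-substitution above can be automated. Below is a minimal sketch of the extended Euclidean algorithm in its standard iterative form; the names `extended_gcd` and `inverse_mod` are our own, and we assume a positive modulus. It carries the coefficients along with the remainders instead of substituting backwards at the end, but it produces the same linear combination for this example.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g, where g = gcd(a, b) for a, b >= 0."""
    old_r, r = a, b
    old_x, x = 1, 0
    old_y, y = 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r   # the usual Euclidean step
        old_x, x = x, old_x - q * x   # same step applied to the coefficients of a
        old_y, y = y, old_y - q * y   # same step applied to the coefficients of b
    return old_r, old_x, old_y

def inverse_mod(a, n):
    """Residue of the multiplicative inverse of [a]_n, assuming n > 0."""
    g, x, _ = extended_gcd(a % n, n)
    if g != 1:
        raise ValueError(f"[{a}]_{n} is not a unit")
    return x % n

print(extended_gcd(131, 260))   # (1, -129, 65)
print(inverse_mod(131, 260))    # 131
```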

4 Rings

4.1 Binary operations

Definition 4.1. A binary operation on a set $S$ is a function from $S\times S$ to $S$ . In other words, it is a rule that assigns to two inputs taken from $S$ , one output which also belongs to $S$ .

Example 4.2. For example, addition and subtraction are binary operations on the set of integers.

If $\star$ denotes a binary operation, we write $x \star y$ to indicate the result of applying $\star$ to the pair of inputs $(x,y)$ , just as we would with the usual symbols $+$ and $-$ for the sum and subtraction of integers.

Remark 4.3. In defining a binary operation, we are implicitly stating that the set $S$ is closed under the operation.

Example 4.4. For instance, if we tried to define $a \star b = \frac{a}{b}$ on the set of non-zero integers, we would run into trouble: this is not an operation since $\frac{a}{b}$ isn’t always an integer.

Example 4.5.

Which of the following rules $\star$ are binary operations on the indicated set $S$ ?
1. Let $S$ be all non-zero real numbers and let $a \star b = \frac{a}{b}$ .
2. Let $S$ be the set of all even integers and $a \star b = a + b$
3. Let $S$ be the set of all odd integers and $a \star b = a + b$

Solution: The operations in (1) and (2) are in fact binary operations on the sets given, though (3) is not since it does not satisfy closure: the sum of two odd integers is never an odd integer.

Here are some special abstract properties a binary operation can have.

Definition 4.6.

Commutativity. A binary operation $\star$ is commutative if $x\star y = y \star x$ for any $x,y\in S$ .
Associativity. By definition, binary operations only take two inputs. If we wanted to operate on three things, we would have to choose two to pair first, then throw in the third. A binary operation $\star$ is associative if we get the same result with either grouping: $(x\star y) \star z = x \star(y\star z) \textrm{ for any } x,y,z \textrm{ in } S.$
Identity. An element $e\in S$ is an identity for a binary operation $\star$ if $e\star x = x$ and $x \star e = x$ for all $x\in S$ .
Inverses. If $\star$ is a binary operation with an identity $e$ , then an inverse for an element $x$ is another element $y$ such that $x \star y = e$ and $y\star x = e$ .

Example 4.7. For each of the following binary operations, determine if they are associative or commutative:

$S$ is the set of all non-zero real numbers and the operation is division: $a \star b = a/b$ .

Solution: This operation is not commutative, since for example $1/2 \neq 2/1$ . It is also not associative; associativity would mean that $(a/b)/c = a/(b/c)$ , and yet $(a/b)/c = \frac{a}{b} \cdot \frac{1}{c} = \frac{a}{bc} \textrm{ while } a/(b/c) = a \cdot \frac{c}{b} = \frac{ac}{b}.$ For example, $(1/2)/2 = 1/4 \neq 1 = 1/1=1/(2/2).$
$\mathbb{R}$ with the operation of subtraction: $a \star b = a - b$ .

Solution: This operation is neither commutative nor associative: commutativity fails for example in $1-2 = -1 \neq 1 = 2-1$ , while a counterexample to associativity is $(1-1)-1 = -1 \neq 1 = 1-(1-1)$ .
$\mathbb{R}$ with the operation of addition.

Solution: This is in fact a commutative and associative operation.
$\mathbb{R}$ with the operation being taking the maximum of two real numbers, that is, the operation is $a \star b = \max\{a,b\}$ .

Solution: This is in fact a commutative and associative operation.
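
A computer search over a few sample values can locate counterexamples like the ones above. This is a sketch under stated assumptions: the function names and the sample values are arbitrary choices of ours, and `fractions.Fraction` is used so that division is exact. Finding no counterexample in a finite sample does not prove commutativity or associativity; it only fails to refute it.

```python
from fractions import Fraction
from itertools import product

def commutativity_counterexample(op, samples):
    for x, y in product(samples, repeat=2):
        if op(x, y) != op(y, x):
            return (x, y)
    return None

def associativity_counterexample(op, samples):
    for x, y, z in product(samples, repeat=3):
        if op(op(x, y), z) != op(x, op(y, z)):
            return (x, y, z)
    return None

# Nonzero sample values, so that division is always defined.
samples = [Fraction(n) for n in (-3, -1, 1, 2, 5)]

operations = {
    "division": lambda a, b: a / b,
    "subtraction": lambda a, b: a - b,
    "addition": lambda a, b: a + b,
    "maximum": max,
}

for name, op in operations.items():
    print(name,
          commutativity_counterexample(op, samples),
          associativity_counterexample(op, samples))
```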

Definition 4.8. Recall that a $2 \times 2$ matrix is an array of real numbers of the form $\begin{bmatrix} a & b \\ c & d \\ \end{bmatrix}$ and that one may add $2 \times 2$ matrices using the rule $\begin{bmatrix} a & b \\ c & d \\ \end{bmatrix} + \begin{bmatrix} r & s \\ t & u \\ \end{bmatrix} = \begin{bmatrix} a+r & b+s\\ c+t & d+u\\ \end{bmatrix}.$

One can also multiply $2 \times 2$ matrices by the rule $\begin{bmatrix} a & b \\ c & d \\ \end{bmatrix} \cdot \begin{bmatrix} r & s \\ t & u \\ \end{bmatrix} = \begin{bmatrix} ar + bt & as + bu \\ cr + dt & cs + du \\ \end{bmatrix}$

Example 4.9. Addition of matrices is a binary operation on the set of all two-by-two matrices. This is in fact an associative and commutative operation.

The fact that multiplication of matrices is associative is one of the first properties one learns in a linear algebra class when talking about product of matrices. However, this is not a commutative operation! For example, $\begin{bmatrix} 1 & 0 \\ 0 & 0 \\ \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 0 & 0 \\ \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \\ \end{bmatrix} \neq \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \\ \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ \end{bmatrix}.$
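
The counterexample to commutativity can be reproduced by coding the multiplication rule of Definition 4.8 directly. In this sketch a matrix is stored as a nested list, and the names `matmul2`, `E11`, and `E12` are our own.

```python
def matmul2(A, B):
    """Product of 2x2 matrices given as nested lists [[a, b], [c, d]]."""
    return [
        [A[0][0] * B[0][0] + A[0][1] * B[1][0], A[0][0] * B[0][1] + A[0][1] * B[1][1]],
        [A[1][0] * B[0][0] + A[1][1] * B[1][0], A[1][0] * B[0][1] + A[1][1] * B[1][1]],
    ]

E11 = [[1, 0], [0, 0]]
E12 = [[0, 1], [0, 0]]

print(matmul2(E11, E12))   # [[0, 1], [0, 0]]
print(matmul2(E12, E11))   # [[0, 0], [0, 0]]
```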

We can also describe operations by tables, like we do with $+$ tables and $\times$ tables. Here are some operation tables for operations on the set $\{a,b,c,d\}$ : the entry in row $x$ and column $y$ for operation $\star$ means $x\star y$ . Decide for each whether the operation is commutative, has an identity, and/or has inverses.

$\clubsuit$	a	b	c	d
a	a	b	c	d
b	b	b	c	d
c	c	c	c	d
d	d	d	d	d

$\diamondsuit$	a	b	c	d
a	a	d	c	b
b	b	a	d	c
c	c	b	a	d
d	d	c	b	a

$\heartsuit$	a	b	c	d
a	a	a	a	a
b	a	b	c	d
c	a	c	a	c
d	a	d	c	b

$\spadesuit$	a	b	c	d
a	a	b	c	d
b	b	c	d	a
c	c	d	a	b
d	d	a	b	c

To see if one of these operations has an identity, we want to look for a row and a column that are unchanged, and that correspond to multiplication by the same element on the left and right. For example, we see that the row and column for $a$ in the table for $\clubsuit$ look exactly like the original row/column on the other side, so $a$ is the identity for this operation.

To see if one of these operations is commutative, we want to see that the entries corresponding to $x \star y$ and $y \star x$ coincide; more precisely, this means that the entries in positions $(i,j)$ and $(j,i)$ are the same. If we pretend that the entries in our tables correspond to the entries in a matrix, we want the transpose of our table to be the same as the original table, or else the operation is not commutative.

Finally, if an operation has an identity, we can spot the elements with inverses by looking for entries in the table that are equal to the identity. Notice that we need both $x \star y = e$ and $y \star x = e$ for $x$ to be the inverse of $y$ , so if the identity for our operation appears in position $(i,j)$ in the table, we also need the identity to appear in position $(j,i)$ to say that we found elements with an inverse.

In the tables above, we have:

$\clubsuit$ is commutative, has an identity $a$ , and $a$ is the only element with an inverse (since $a$ only appears in the table once).
$\diamondsuit$ is not commutative, since for example $d \diamondsuit a = d$ and $a \diamondsuit d = b$ . $\diamondsuit$ also has no identity, and thus it makes no sense to ask about inverses.
$\heartsuit$ is commutative, and $b$ is the identity. The inverse of $d$ is $d$ itself, and the inverse of the identity $b$ is of course $b$ .
$\spadesuit$ is commutative, and $a$ is the identity. Every element has an inverse: the inverse of $b$ is $d$ , the inverse of $c$ is itself, and of course the inverse of $a$ is $a$ , since $a$ is the identity.
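
These conclusions can be double-checked by encoding the four tables and testing each property exhaustively. The sketch below is only one possible encoding: each table is written as four strings, one per row, and the names `parse`, `TABLES`, and the three test functions are our own.

```python
ELEMENTS = "abcd"

def parse(rows):
    """Turn four rows of four letters into a dict sending (x, y) to x * y."""
    return {(x, y): rows[i][j]
            for i, x in enumerate(ELEMENTS)
            for j, y in enumerate(ELEMENTS)}

TABLES = {
    "club":    parse(["abcd", "bbcd", "cccd", "dddd"]),
    "diamond": parse(["adcb", "badc", "cbad", "dcba"]),
    "heart":   parse(["aaaa", "abcd", "acac", "adcb"]),
    "spade":   parse(["abcd", "bcda", "cdab", "dabc"]),
}

def is_commutative(t):
    return all(t[x, y] == t[y, x] for x in ELEMENTS for y in ELEMENTS)

def identity(t):
    """Return the two-sided identity of the table, or None if there is none."""
    for e in ELEMENTS:
        if all(t[e, x] == x and t[x, e] == x for x in ELEMENTS):
            return e
    return None

def inverses(t):
    """Map each invertible element to its inverse; empty if there is no identity."""
    e = identity(t)
    if e is None:
        return {}
    return {x: y for x in ELEMENTS for y in ELEMENTS
            if t[x, y] == e and t[y, x] == e}

for name, t in TABLES.items():
    print(name, is_commutative(t), identity(t), inverses(t))
```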

With this information, we can set out to identify two of these operations as the addition and multiplication in $\mathbb{Z}_4$ . Both are commutative, so $\diamondsuit$ is certainly neither of them. Now we note that every element in $\mathbb{Z}_4$ has an inverse for the addition: $[0]_4$ is the identity, $[1]_4$ is the inverse of $[3]_4$ , and $[2]_4$ is its own inverse. We conclude that $\spadesuit$ is the addition in $\mathbb{Z}_4$ , with $a=[0]_4$ , $c = [2]_4$ . Therefore, $b$ and $d$ must be $[1]_4$ and $[3]_4$ , though we don’t know which one is which; either choice will work.

To identify the multiplication in $\mathbb{Z}_4$ , we again want a commutative operation with identity, but this time we have an extra fun property: when we multiply any element in $\mathbb{Z}_4$ by $[0]_4$ , we always get $[0]_4$ as the result. So in the correct table we will find a row and column where every entry is the same element... so it must be $\heartsuit$ !
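
To confirm the identification, one can generate the addition and multiplication tables of $\mathbb{Z}_4$ and write them with letters. The sketch below assumes the labeling $a=[0]_4$ , $b=[1]_4$ , $c=[2]_4$ , $d=[3]_4$ , which is one choice consistent with the discussion above; the function name `letter_table` is our own. Each string is one row of the table.

```python
def letter_table(n, op, labels):
    """Operation table of Z_n, with labels[r] written in place of the class [r]_n."""
    return ["".join(labels[op(i, j) % n] for j in range(n)) for i in range(n)]

labels = "abcd"   # assumed labeling: a=[0], b=[1], c=[2], d=[3]

print(letter_table(4, lambda i, j: i + j, labels))   # ['abcd', 'bcda', 'cdab', 'dabc']
print(letter_table(4, lambda i, j: i * j, labels))   # ['aaaa', 'abcd', 'acac', 'adcb']
```

The first list reproduces the rows of the $\spadesuit$ table and the second reproduces the rows of the $\heartsuit$ table.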

Example 4.10. Which of the following operations have an identity? If so, what is it?

multiplication of $2\times 2$ matrices.
division of positive real numbers $a \star b =\frac{a}{b}$ .
averaging two real numbers, that is $a \star b = \frac{a+ b}{2}$ .
maximum of two real numbers: that is $a \star b = \max\{a,b\}$ .

Solution:

The multiplication of $2 \times 2$ matrices has an identity: the matrix $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ .
The division of positive real numbers does not have an identity! While $a/1=a$ for any real number, $1/a = a$ is not true for most real numbers.
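The operation of averaging two real numbers does not have an identity. If $\frac{e + a}{2} = a$ , then $e + a = 2a$ and hence $e = a$ . So there is no single element $e$ that works simultaneously for all $a$ .
The maximum of two real numbers does not have an identity: an identity $e$ would need $\max\{e,a\} = a$ , that is $e \leq a$ , for every real number $a$ , and no real number is less than or equal to all real numbers.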

4.2 Rings

Definition 4.11. A ring is a set $R$ with two binary operations, written as $+$ and $\times$ , and called addition and multiplication, respectively, such that the following properties all hold:

Closure under addition: If $a,b$ are elements of $R$ , then $a+b\in R$ .
Associativity of addition: If $a,b,c$ are elements of $R$ , then $(a+b)+c=a+(b+c)$ .
Commutativity of addition: If $a,b$ are elements of $R$ , then $a+b=b+a$ .
Additive identity: There exists an element $0\in R$ so that for any $a\in R$ we have $a+0=a=0+a$ .
Additive inverses: For every $a\in R$ there is an element denoted $-a\in R$ so that $a+(-a)=0=(-a)+a$ .
Closure under multiplication: If $a,b$ are elements of $R$ , then $ab\in R$ .
Associativity of multiplication: If $a,b,c$ are elements of $R$ , then $(ab)c=a(bc)$ .
Multiplicative identity: There exists an element $1\in R$ so that for any $a\in R$ we have $a\cdot 1=a=1\cdot a$ .
Distributivity: If $a,b,c$ are elements of $R$ , then $a(b+c)=ab+ac$ and $(b+c)a=ba+ca$ .

We may say that $(R,+,\times)$ is a ring to indicate that $R$ is a ring with addition given by the operation $+$ and multiplication given by the operation $\times$ .

Warning: our definition of ring is called a “ring with identity” in some textbooks which allow rings not to have a multiplicative identity.

Definition 4.12. A ring $R$ is called a commutative ring if in addition to all the properties in Definition 4.11 it satisfies

Commutativity of multiplication: If $a,b$ are elements of $R$ , then $ab=ba$ .

Theorem 4.13. The set of all integers $\mathbb{Z}$ with the usual sum and multiplication of integers is a commutative ring.

Proof. In the case of integers, the properties listed in the definition of a ring are so basic to the nature of the integers that they cannot be “proven” in any meaningful way, but are rather taken as axioms; see the Axioms in section 1.1 for a list of these properties. The validity of these axioms establishes that $\mathbb{Z}$ is a commutative ring. The additive and multiplicative identities are the familiar integers $0$ and $1$ . ◻

Theorem 4.14. The set of congruence classes modulo $N$ with the addition and multiplication of congruence classes given by $[a]_N + [b]_N = [a+b]_N \textrm{ and } [a]_N \cdot [b]_N = [a \cdot b]_N$ is a commutative ring.

Proof. We will not check all ten axioms in detail, but rather just check some of them. The common theme is that each axiom is a consequence of the fact that these axioms hold for ${\mathbb{Z}}$ . For instance, the distributive law holds since $[a]_N \cdot ([b]_N + [c]_N) = [a]_N \cdot [b+c]_N = [a(b+c)]_N = [ab + ac]_N = [ab]_N + [ac]_N = [a]_N [b]_N + [a]_N [c]_N$ and here the key step is the third equality, which is the distributive law for the integers.

The element $[0]_N$ is an additive identity, since $[0]_N + [a]_N = [0+a]_N = [a]_N$ , with the second equation holding since $0$ is the additive identity for the integers.

Given an element $[a]_N$ , it has an additive inverse, namely $[-a]_N$ , since $[a]_N + [-a]_N = [a - a]_N = [0]_N$ .

The element $[1]_N$ is a multiplicative identity, since $[1]_N \cdot [a]_N = [1 \cdot a]_N = [a]_N$ , with the second equation holding since $1$ is the multiplicative identity for the integers.

All the other axioms are proven in a similar manner. ◻
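The remaining axioms can also be spot-checked by brute force for any particular $N$. The following is a minimal Python sketch, not part of the proof: it represents the class $[a]_N$ by the integer `a % N`, and the function name `check_some_ring_axioms` is chosen here purely for illustration. It does not address well-definedness of the operations, only the axioms themselves.

```python
from itertools import product

def check_some_ring_axioms(N):
    """Brute-force several ring axioms for Z_N on the representatives 0, ..., N-1."""
    elems = range(N)

    def add(a, b):
        return (a + b) % N  # [a]_N + [b]_N = [a+b]_N

    def mul(a, b):
        return (a * b) % N  # [a]_N * [b]_N = [ab]_N

    for a, b, c in product(elems, repeat=3):
        if mul(a, add(b, c)) != add(mul(a, b), mul(a, c)):  # distributivity
            return False
        if add(add(a, b), c) != add(a, add(b, c)):  # associativity of +
            return False
        if mul(mul(a, b), c) != mul(a, mul(b, c)):  # associativity of *
            return False
    for a in elems:
        if add(0, a) != a:  # [0]_N is an additive identity
            return False
        if add(a, (-a) % N) != 0:  # [-a]_N is an additive inverse
            return False
        if mul(1 % N, a) != a:  # [1]_N is a multiplicative identity
            return False
    # commutativity of multiplication
    return all(mul(a, b) == mul(b, a) for a, b in product(elems, repeat=2))

print(check_some_ring_axioms(12))
```

A check like this only confirms the axioms for the chosen value of $N$; the argument in the proof is what covers every $N$ at once.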

Theorem 4.15. The set of all $2\times 2$ matrices with coefficients in $\mathbb R$ under the rules for adding and multiplying matrices given by $\begin{bmatrix} a & b \\ c & d \end{bmatrix} \cdot \begin{bmatrix} r & s \\ t & u \end{bmatrix} = \begin{bmatrix} ar + bt & as + bu \\ cr + dt & cs + du \end{bmatrix}$ and $\begin{bmatrix} a & b \\ c & d \\ \end{bmatrix} + \begin{bmatrix} r & s \\ t & u \end{bmatrix} = \begin{bmatrix} a+r & b+s\\ c+t & d+u \end{bmatrix}$ is a non-commutative ring.

Proof. This is indeed a ring, and each axiom is the sort of thing you might have seen justified in a class in matrix theory. For instance, the multiplicative identity is $I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ \end{bmatrix}$ since it follows from the rule for multiplying matrices that $I_2 \cdot A = A = A \cdot I_2$ for all two-by-two matrices $A$ . ◻
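The proof above does not exhibit two matrices that fail to commute. As an illustration, here is a small Python sketch using the multiplication rule from Theorem 4.15; the helper name `mat_mul` and the particular matrices `A` and `B` are choices made for this example only.

```python
def mat_mul(A, B):
    """Multiply two 2x2 matrices given as ((a, b), (c, d)), following Theorem 4.15."""
    (a, b), (c, d) = A
    (r, s), (t, u) = B
    return ((a * r + b * t, a * s + b * u),
            (c * r + d * t, c * s + d * u))

A = ((0, 1), (0, 0))
B = ((0, 0), (1, 0))

AB = mat_mul(A, B)  # ((1, 0), (0, 0))
BA = mat_mul(B, A)  # ((0, 0), (0, 1))
print(AB, BA, AB == BA)
```

Since $AB \neq BA$ for this pair, multiplication of $2\times 2$ real matrices is not commutative.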

Let us look at some other examples. In each case, we will determine if the given set with the given operations forms a ring.

Example 4.16.

The set of all even integers under the usual rules for adding and multiplying.

This is not a ring. The only axiom that fails is the existence of a multiplicative identity: there is no even integer $x$ such that $x \cdot a = a$ for all even integers $a$ .
The set of all positive real numbers under the usual rules for adding and multiplying.

This is not a ring. One axiom that fails is the existence of an additive identity: there is no positive real number $x$ such that $x + a = a$ for all positive real numbers $a$ . (Since there is no additive identity, Axiom 5 is not even defined.)
The set of all functions from $\mathbb{R}$ to $\mathbb{R}$ with the following rules for adding and multiplying functions $(f+g)(x)=f(x)+g(x)$ and $(f\cdot g)(x)=f(x)\cdot g(x)$ .

This is indeed a ring. Remember, if $f$ and $g$ are two such functions, their sum $f + g$ is the function whose value at $x \in \mathbb{R}$ is $f(x) + g(x)$ and their product is the function $f \cdot g$ whose value at $x$ is $f(x) \cdot g(x)$ . Both $f+g$ and $f \cdot g$ are again functions from $\mathbb{R}$ to $\mathbb{R}$ , and so the closures under addition and multiplication hold. The additive identity is the constant function $0$ and the multiplicative identity is the constant function $1$ . The additive inverse of a function $f$ is $-f$ , the function whose value at $x$ is $-f(x)$ . The remaining axioms all hold since these axioms hold for the real numbers themselves. We’ll skip the details.
The set of all increasing functions from $\mathbb{R}$ to $\mathbb{R}$ with the following rules for adding and multiplying functions $(f+g)(x)=f(x)+g(x)$ and $(f\cdot g)(x)=f(x)\cdot g(x)$ .

This is not a ring since it doesn’t satisfy closure under multiplication. The function $f:\mathbb{R}\to \mathbb{R}, f(x)=x$ is increasing but the function $(f\cdot f)(x)=f(x)^2=x^2$ is not increasing (recall the graph of $x^2$ has a dip).

Let’s review the examples of rings we have seen so far

Example 4.17.

the integers ${\mathbb{Z}}$
the set of congruence classes modulo $N$ , $\mathbb{Z}_N$
other familiar number systems, such as the rationals $\mathbb{Q}$ , the reals $\mathbb{R}$ , and the complex numbers $\mathbb{C}$
the set of $2\times 2$ matrices with real number entries $\mathcal{M}_{2\times 2}(\mathbb{R})$ with the operations described in Theorem 4.15. In fact there is nothing special about $2\times 2$ matrices. For any integer $n\geq 1$ , the set of $n\times n$ matrices with real number entries $\mathcal{M}_{n\times n}(\mathbb{R})$ is also a ring. Even more generally for any integer $n\geq 1$ , the set of $n\times n$ matrices with entries in a commutative ring $R$ is a ring.
the set of functions $\mathcal{F}=\{f:\mathbb{R}\to \mathbb{R}\}$ with the operations described in Example 4.16.

Many of the rings we have seen so far are commutative rings. Each of ${\mathbb{Z}}$ , ${\mathbb{Z}}_N$ , and the ring of all functions from $\mathbb{R}$ to $\mathbb{R}$ is a commutative ring. We have seen only one example so far of a non-commutative ring: the ring of two-by-two matrices with real entries.

4.3 Properties of rings

In this section we see to what extent the rules of arithmetic we are familiar with from working with, say, the integers continue to hold in any ring. Since we don’t know anything about rings other than the 9 properties in Definition 4.11, we will need to prove the theorem below using only those 9 properties.

Theorem 4.18. Let $(R,+,\cdot)$ be a ring.

(a) $0 \cdot a = 0$ and $a \cdot 0 = 0$ for all $a \in R$ .
(b) $(-a) \cdot b = - (a \cdot b)$ for any elements $a, b \in R$ .
(c) $(-1) \cdot a = - a$ for any element $a \in R$ .
(d) The zero element $0_R$ and the one element $1_R$ of $R$ are unique.
(e) Every element $x\in R$ has a unique additive inverse.
(f) For every element $x \in R$ , if $x$ is a unit, that is, if $x$ has a two-sided multiplicative inverse, then that inverse is unique.

Proof. Suppose $R$ is a ring.

(a) Let $a\in R$ . Then we compute $\begin{align*} 0+0 &= 0 & \text{ by additive identity} \\ (0+0)\cdot a&=0\cdot a & \text{ by multiplying both sides on the right by } a \\ 0\cdot a + 0\cdot a&= 0\cdot a& \text{ by distributivity}\\ -(0\cdot a) + 0\cdot a + 0\cdot a&=- (0\cdot a) + 0\cdot a & \text{ by adding }- (0\cdot a)\text{ to both sides}\\ 0+ 0\cdot a &=0 &\text{ by additive inverses}\\ 0\cdot a &=0 &\text{ by additive identity}. \end{align*}$ We have thus proved that $0\cdot a =0$ . The proof of $a\cdot 0 =0$ is similar and we omit it.

(d) Assume that there are two additive identity elements $0$ and $0'$ . Then we have $\begin{align*} 0 &= 0+0' & \text{ because }0'\text{ is an additive identity} \\ 0' &= 0+ 0' & \text{ because }0\text{ is an additive identity}. \end{align*}$ Because the right hand sides above are the same, we conclude $0=0'$ . Therefore there is only one additive identity.

Assume that there are two multiplicative identity elements $1$ and $1'$ . Then we have $\begin{align*} 1 &= 1\cdot 1' & \text{ because }1'\text{ is a multiplicative identity} \\ 1' &= 1\cdot 1' & \text{ because }1\text{ is a multiplicative identity}. \end{align*}$ Because the right hand sides above are the same, we conclude $1=1'$ . Therefore there is only one multiplicative identity.

(b) Let $a, b\in R$ . We shall first show that $(-a)\cdot b$ is an additive inverse for $a\cdot b$ . We compute using distributivity and part (a) that $\begin{eqnarray*} (-a)\cdot b + a\cdot b=((-a)+a)\cdot b=0\cdot b=0 \\ a\cdot b+(-a)\cdot b=(a+(-a))\cdot b=0\cdot b=0. \end{eqnarray*}$ By what we have computed above $(-a)\cdot b$ is an additive inverse for $a\cdot b$ . By definition $-(ab)$ is also an additive inverse for $a\cdot b$ . By part (e) $a\cdot b$ has a unique additive inverse, which means that $(-a)\cdot b=-(ab)$ .

For (c), we apply (b) using $a = 1$ and $b = a$ to get $(-1) a = -(1 \cdot a)$ . The right side is equal to $-a$ by the multiplicative identity property and thus $(-1)a = -a$ .

(e) Let $x\in R$ and assume $x$ has two additive inverses $y\in R$ and $z\in R$ .

Then $\begin{align*} x+y &= 0 & \text{ because }y\text{ is an additive inverse of } x \\ z+(x+y)&= z+0 & \text{ by adding } z \text{ to both sides } \\ (z+x)+y&= z& \text{ by associativity of addition and the additive identity property}\\ 0+y &=z & \text{ because }z\text{ is an additive inverse of } x \\ y &=z &\text{ by additive identity}. \end{align*}$ Since $y=z$ we have thus shown that there is only one additive inverse for $x$ .

(f) Let $x\in R$ and assume $x$ has two multiplicative inverses $y\in R$ and $z\in R$ . Then $\begin{align*} xy &= 1 & \text{ because }y\text{ is a multiplicative inverse of } x \\ z(xy)&= z\cdot 1 & \text{ by multiplying } z \text{ to both sides } \\ (zx)y&= z& \text{ by associativity of multiplication and the multiplicative identity property}\\ 1\cdot y &=z & \text{ because }z\text{ is a multiplicative inverse of } x \\ y &=z &\text{ by multiplicative identity}. \end{align*}$

Since $y=z$ we have thus shown that there is only one multiplicative inverse for $x$ . ◻

4.3.1 The Zero Ring

Theorem 4.19. The one-element set $\{0\}$ equipped with the binary operations $0 + 0 = 0$ and $0 \cdot 0 = 0$ is a ring.

Conversely, if $R$ is any ring such that $0_R = 1_R$ , then $R$ has only one element, namely $0_R$ .

Proof. It is straightforward to verify most of the nine axioms of a ring for the one element ring $\{0\}$ . The one that is a bit confusing is Axiom 9, which states that there exists some element $1_R$ of $R$ such that $1_R \cdot a = a$ and $a \cdot 1_R = a$ for all $a$ in $R$ . Set $1_R = 0$ . Then $1_R \cdot 0 = 0 \cdot 0 = 0$ and $0 \cdot 1_R = 0 \cdot 0 = 0$ . Since $0$ is the only element of this ring, this does indeed verify Axiom 9.

Now assume $R$ is any ring such that $0_R = 1_R$ . Pick any element $a \in R$ . We will prove that $a = 0_R$ and hence $R$ has just one element. By the first part of Theorem 4.18, $0_R \cdot a = 0_R$ , and by Axiom 9 we have $1_R \cdot a = a$ . But we are assuming $0_R = 1_R$ and hence $0_R = 0_R \cdot a = 1_R \cdot a = a$ , so that $a = 0_R$ . ◻

4.4 Subrings

Definition 4.20. A subset $S$ of a ring $R$ is a subring of $R$ if $S$ is a ring for the same operations of addition and multiplication from $R$ .

Example 4.21 (Trivial subrings). If $R$ is a ring then $\{0_R\}$ and $R$ are subrings of $R$ . These are called the trivial subrings of $R$ .

Example 4.22. $\mathbb{R}$ is a ring, and each of ${\mathbb{Z}}$ and $\mathbb{Q}$ is a non-trivial subring of $\mathbb{R}$ .

Example 4.23. What are the subrings of ${\mathbb{Z}}$ ? Every ring is a subring of itself, and thus ${\mathbb{Z}}$ is a subring. Also $\{0\}$ is a subring of ${\mathbb{Z}}$ . These are the only two subrings of ${\mathbb{Z}}$ .

Proposition 4.24. Assume $R$ is a ring and that $S\subseteq R$ is a subset of $R$ . If

$0_R \in S$ ,
$1_R\in S$ ,
$S$ is closed under addition: $s_1,s_2\in S$ $\rightarrow$ $s_1+s_2\in S$ ,
$S$ is closed under additive inverses: $s_1\in S$ $\rightarrow$ $-s_1\in S$ , and
$S$ is closed under multiplication: $s_1,s_2\in S$ $\rightarrow$ $s_1s_2\in S$ ,

then $S$ is a subring of $R$ .

Proof. The assumptions show that Axioms 1, 4, 5, 6 and 9 all hold. The remaining axioms are automatic since if they hold in $R$ they certainly hold for the subset $S$ of $R$ . For instance, since addition is commutative in $R$ we have that $a + b = b + a$ for all elements $a,b \in R$ , and so it certainly is true that $a + b = b + a$ for all elements $a,b \in S$ . Thus, $S$ is a ring under the same operations of addition and multiplication as in $R$ , that is, $S$ is a subring of $R$ . Moreover, the first two assumptions show that the identities of $S$ are those of $R$ , that is, $0_S = 0_R$ and $1_S = 1_R$ . ◻

Example 4.25. For a fixed positive integer $N$ , what are the subrings of ${\mathbb{Z}}_N$ ? As with ${\mathbb{Z}}$ , one subring is ${\mathbb{Z}}_N$ itself, and another is $\{0\}$ . But for certain values of $N$ we can have other subrings as well: for example one can check that the set $S=\{[0]_{12}, [4]_{12}, [8]_{12}\}$ is a subring of ${\mathbb{Z}}_{12}$ with $0_S=[0]_{12}, 1_S=[4]_{12}$ . This shows that the converse of Proposition 4.24 is not valid in the sense that if $S$ is a subring of $R$ it does not mean that we must have $1_R\in S$ .
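The claim about $S=\{[0]_{12}, [4]_{12}, [8]_{12}\}$ can be checked mechanically. The sketch below is one possible way to do so in Python, representing $[a]_N$ by `a % N`; the function name `subring_identity` is hypothetical. It relies on the fact that associativity, commutativity and distributivity are inherited from ${\mathbb{Z}}_N$, so only closure, zero, negatives and the existence of an identity need testing.

```python
def subring_identity(S, N):
    """If the subset S of Z_N is a ring under the operations of Z_N,
    return the representative of its multiplicative identity; otherwise None."""
    S = {s % N for s in S}
    closed = all((a + b) % N in S and (a * b) % N in S for a in S for b in S)
    has_negatives = all((-a) % N in S for a in S)
    if not (0 in S and closed and has_negatives):
        return None
    for e in sorted(S):
        # Z_N is commutative, so a one-sided check of e * a = a suffices.
        if all((e * a) % N == a for a in S):
            return e
    return None

print(subring_identity({0, 4, 8}, 12))  # identity of S is [4]_12
print(subring_identity({0, 6}, 12))     # closed, but has no identity
```

The second call illustrates that a subset closed under both operations need not be a subring in the sense of Definition 4.20, since it may lack a multiplicative identity.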

4.5 Domains and fields

Definition 4.26. Let $R$ be a ring. An element $r \in R$ is a unit in $R$ if it has a two-sided multiplicative inverse, that is, if there exists $s \in R$ such that $rs = 1 = sr$ .

Recall that you proved before that if an element $r$ has a multiplicative inverse, then it has only one such inverse. In other words, inverses are unique when they exist. If $r$ is a unit in a ring, we write its unique inverse as $r^{-1}$ .

Definition 4.27. Let $R$ be a commutative ring. A nonzero element $r \in R$ is a zerodivisor in $R$ if there exists a nonzero $s \in R$ such that $rs =0$ .

Definition 4.28. A ring $R$ is an integral domain, or simply a domain, if $R$ is commutative, $0_R \ne 1_R$ , and $R$ has no zerodivisors.

Definition 4.29. A ring $R$ is a field if $R$ is commutative, $1_R \ne 0_R$ , and every nonzero element is a unit.

Proposition 4.30. A ring $R$ is a domain if and only if

$R$ is commutative,
$0_R \ne 1_R$ , and
whenever $ab = 0_R$ for elements $a, b \in R$ , we have $a = 0_R$ or $b = 0_R$ .

Proof. $(\Rightarrow)$ Suppose $R$ is a domain. Then $R$ is commutative and $0_R \ne 1_R$ , by definition of domain. Suppose $a, b \in R$ are such that $ab = 0_R$ and assume towards a contradiction that $a \neq 0_R$ and $b \neq 0_R$ . Then $a$ and $b$ are zero divisors, contradicting the fact that a domain must not have any zero divisors. Thus the assumption is false and $a = 0_R$ or $b = 0_R$ must be true.

$(\Leftarrow)$ Suppose the three bullet points hold and assume towards a contradiction that $R$ is not a domain. Since the conditions that $R$ is commutative and $0_R \ne 1_R$ are satisfied it must be the case that $R$ has zero divisors. Let $a$ be a zero divisor. Then by definition of zero divisor $a\neq 0_R$ and there exists $b\in R$ so that $b\neq 0_R$ and $ab=0_R$ . This contradicts the third bullet point. Thus the assumption is false and $R$ is a domain. ◻

Example 4.31. The ring $\mathbb{Z}$ is a domain: it is a commutative ring, $1_{\mathbb{Z}} \neq 0_{\mathbb{Z}}$ , and given integers $a$ and $b$ , if $ab=0$ then $a=0$ or $b=0$ . However, $\mathbb{Z}$ is not a field, since for example $2$ is not a unit. In fact, the only units in $\mathbb{Z}$ are $1$ and $-1$ .

Example 4.32. The rings $\mathbb{R}$ and $\mathbb{Q}$ are both fields.

Theorem 4.33. Let $N > 1$ be an integer.

$\mathbb{Z}_N$ is a field if and only if $N$ is prime.
$\mathbb{Z}_N$ is a domain if and only if $N$ is prime.

Proof.

For the first statement, recall that we have shown before that $[a]_N$ is a unit if and only if $\gcd(a,N) = 1$ .

Suppose $N = p$ is prime. Since $p$ is prime, for each integer $a$ , either $p$ divides $a$ or $\gcd(a,p) = 1$ . Equivalently, if $[a]_p \neq [0]_p$ , then $\gcd(a,p) = 1$ . We conclude that every nonzero element $[a]_p$ in $\mathbb{Z}_p$ is a unit, and thus $\mathbb{Z}_p$ is a field.

Assume $N$ is not prime. Then there exist integers $1< a, b < N$ such that $N = ab$ , and thus $\gcd(a,N) = a > 1$ . Note also that $[a]_N \neq [0]_N$ , so $[a]_N$ is neither zero nor a unit. We conclude that $\mathbb{Z}_N$ is not a field.

For the second statement, assume $N$ is not prime. Then there exist integers $1< a, b < N$ such that $N = ab$ . Note that $[a]_N, [b]_N \neq [0]_N$ , even though $[a]_N [b]_N = [ab]_N = [N]_N = [0]_N$ . We conclude that $[a]_N$ is a zerodivisor, and $\mathbb{Z}_N$ is not a domain.

Now assume that $N = p$ is prime. Suppose that $[a]_p, [b]_p$ in $\mathbb{Z}_p$ are such that $[a]_p [b]_p = [0]_p$ . Then by definition we have $[ab]_p = [0]_p$ , or equivalently, $ab \equiv 0 \pmod p$ , so $p$ divides $ab$ . Since $p$ is prime, this implies that $p$ divides $a$ or $p$ divides $b$ . But that would mean that either $[a]_p = [0]_p$ or $[b]_p = [0]_p$ . Since $\mathbb{Z}_p$ is commutative and $[1]_p \neq [0]_p$ , Proposition 4.30 shows that $\mathbb{Z}_p$ is a domain.

◻
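Theorem 4.33 can be tested experimentally for small $N$. The following Python sketch lists units and zerodivisors of ${\mathbb{Z}}_N$ by exhaustive search, representing $[a]_N$ by `a % N`. All function names are chosen for this illustration, and the search is only practical for small moduli.

```python
from math import gcd

def units(N):
    """Representatives a in 1..N-1 such that [a]_N has a multiplicative inverse."""
    return [a for a in range(1, N) if any((a * b) % N == 1 for b in range(1, N))]

def zero_divisors(N):
    """Nonzero representatives a such that [a]_N [b]_N = [0]_N for some nonzero [b]_N."""
    return [a for a in range(1, N) if any((a * b) % N == 0 for b in range(1, N))]

def is_prime(N):
    return N > 1 and all(N % d != 0 for d in range(2, N))

def is_field(N):
    return len(units(N)) == N - 1  # every nonzero class is a unit

def is_domain(N):
    return not zero_divisors(N)    # no zerodivisors

for N in range(2, 13):
    print(N, is_prime(N), is_field(N), is_domain(N))
```

For each $N$ in the loop the three truth values agree, as the theorem predicts. The list `units(N)` also coincides with the representatives satisfying `gcd(a, N) == 1`, matching the criterion recalled at the start of the proof.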

Proposition 4.34. Let $R$ be a commutative ring. If $r \in R$ is a unit, then it is not a zero divisor.

Proof. Suppose $r$ is a unit, and let $v$ be the multiplicative inverse of $r$ . By way of contradiction, suppose that $ra = 0$ for some non-zero element $a \in R$ . Multiplying both sides by $v$ , we conclude that $a = 1 \cdot a = (vr) a = v(ra) = v \cdot 0 = 0$ , a contradiction. Therefore, $r$ is not a zerodivisor. ◻

Corollary 4.35. Every field is an integral domain.

Proof. Suppose $R$ is a field. By definition this implies $R$ is commutative and $1_R \ne 0_R$ . Given any nonzero element $r \in R$ , $r$ is a unit by definition, and hence by Proposition 4.34 $r$ is not a zerodivisor. Therefore, we conclude that no nonzero element in $R$ is a zerodivisor, and thus $R$ must be an integral domain. ◻

The converse is false: $\mathbb{Z}$ is a domain, but not a field.

Theorem 4.36. Assume $R$ is a commutative ring and $r \in R$ is a non-zero element that is not a zerodivisor. If $ra = rb$ for some elements $a, b \in R$ , then $a = b$ .

Proof. If $ra=rb$ , then $r(a-b) = 0$ . Since $r$ is not zero and not a zerodivisor, we must have $a-b = 0$ , or equivalently, $a=b$ . ◻

Corollary 4.37. If $R$ is an integral domain, then $R$ has the following cancelation property: for all nonzero $a \in R$ and any elements $b, c \in R$ , if $ab=ac$ then $b=c$ .

Proof. Given any nonzero element $a \in R$ , we have that $a$ is not a zerodivisor, since $R$ is a domain. Therefore, Theorem 4.36 applies, and thus if $ab=ac$ then $b=c$ . ◻

4.6 Product rings

Definition 4.38. Let $R$ and $S$ be two rings. Let $R\times S$ be the set of ordered pairs $R\times S=\{(r, s) \,\, | \,\, r \in R, s\in S\}.$ Define an operation $+$ on $R\times S$ by $(r_1, s_1) + (r_2, s_2) = (r_1+r_2, s_1+s_2)$ and an operation $\cdot$ on $R\times S$ by $(r_1, s_1) \cdot (r_2, s_2)= (r_1 \cdot r_2, s_1 \cdot s_2).$

Proposition 4.39. For any two rings $R$ and $S$ , $R \times S$ is also a ring. Its additive identity is $(0_R,0_S)$ and the additive inverse of an element $(r,s)$ is $(-r,-s)$ .

Proof. The only way to prove this is to painstakingly check all nine axioms. All of them follow directly from the fact that each of $R$ and $S$ is assumed to be a ring. We will just give a sampling of the proofs:

For instance, $R \times S$ has an additive identity, and it is $0_{R \times S} = (0_R,0_S)$ , since for any $(r,s) \in R \times S$ , $(0_R,0_S) + (r,s) = (0_R + r, 0_S + s) = (r,s)$ using that $0_R$ and $0_S$ are the additive identities of $R$ and $S$ .

Similarly, this ring has a multiplicative identity, namely $(1_R,1_S)$ .

To give just one more example, let us check the left distributive law. For arbitrary elements $r_1, r_2, r_3 \in R$ and $s_1, s_2, s_3 \in S$ , we have $(r_1, s_1) ((r_2, s_2) + (r_3, s_3)) = (r_1, s_1) (r_2 + r_3, s_2+ s_3) = (r_1(r_2 + r_3) , s_1(s_2+ s_3)) = (r_1r_2 + r_1r_3 , s_1s_2+ s_1s_3)$ where in the last equality we have used that the left distributive law holds in each of $R$ and $S$ . We also have $(r_1, s_1) (r_2, s_2) + (r_1, s_1) (r_3, s_3) = (r_1r_2 + r_1r_3, s_1s_2+ s_1s_3)$ and hence $(r_1, s_1) ((r_2, s_2) + (r_3, s_3)) =(r_1, s_1) (r_2, s_2) + (r_1, s_1) (r_3, s_3).$ ◻

Proposition 4.40. If the set $R$ has $N$ elements and the set $S$ has $M$ elements then the set $R\times S$ has $N\cdot M$ elements. In particular, ${\mathbb{Z}}_N\times {\mathbb{Z}}_M$ is a set with $N\cdot M$ elements.

Proof. We need to count the number of pairs $(r,s)$ with $r\in R$ and $s\in S$ . There are $N$ possibilities for $r$ and for each of these there are $M$ possibilities for $s$ , so overall $N\cdot M$ possibilities. ◻

Example 4.41. The ring ${\mathbb{Z}}_2 \times {\mathbb{Z}}_3$ has a total of six elements $([0 ]_2, [0 ]_3), ([0 ]_2, [1 ]_3), ([0 ]_2, [2 ]_3), ([1 ]_2, [0 ]_3), ([1 ]_2, [1 ]_3), ([1 ]_2, [2 ]_3)$ and we have, for example, $([1]_2, [1]_3) + ([1]_2, [1]_3) = ([0]_2, [2]_3)$ .

Example 4.42. The ring ${\mathbb{Z}}_2 \times {\mathbb{Z}}_2$ has the following addition and multiplication tables: $\begin{array}{ c || c | c | c | c } + & ([0 ]_2, [0 ]_2) & ([0 ]_2, [1 ]_2) & ([1 ]_2, [0 ]_2) & ([1 ]_2, [1 ]_2) \\ \hline\hline ([0 ]_2, [0 ]_2) & ([0 ]_2, [0 ]_2) & ([0 ]_2, [1 ]_2) & ([1 ]_2, [0 ]_2) & ([1 ]_2, [1 ]_2) \\ \hline ([0 ]_2, [1 ]_2) & ([0 ]_2, [1 ]_2) & ([0 ]_2, [0 ]_2) & ([1 ]_2, [1 ]_2)& ([1 ]_2, [0 ]_2) \\ \hline ([1 ]_2, [0 ]_2)& ([1 ]_2, [0 ]_2) & ([1 ]_2, [1 ]_2) & ([0 ]_2, [0 ]_2) & ([0 ]_2, [1 ]_2) \\ \hline ([1 ]_2, [1 ]_2)& ([1 ]_2, [1 ]_2) & ([1 ]_2, [0 ]_2) & ([0 ]_2, [1 ]_2) & ([0 ]_2, [0 ]_2) \\ \end{array}$ $\begin{array}{ c || c | c | c | c } \cdot & ([0 ]_2, [0 ]_2) & ([0 ]_2, [1 ]_2) & ([1 ]_2, [0 ]_2) & ([1 ]_2, [1 ]_2) \\ \hline\hline ([0 ]_2, [0 ]_2) & ([0 ]_2, [0 ]_2) & ([0 ]_2, [0 ]_2) & ([0 ]_2, [0 ]_2) & ([0 ]_2, [0 ]_2) \\ \hline ([0 ]_2, [1 ]_2) & ([0 ]_2, [0 ]_2) & ([0 ]_2, [1 ]_2) & ([0 ]_2, [0 ]_2) & ([0 ]_2, [1 ]_2) \\ \hline ([1 ]_2, [0 ]_2)& ([0 ]_2, [0 ]_2) & ([0 ]_2, [0 ]_2) & ([1 ]_2, [0 ]_2) & ([1 ]_2, [0 ]_2) \\ \hline ([1 ]_2, [1 ]_2)& ([0 ]_2, [0 ]_2) & ([0 ]_2, [1 ]_2) & ([1 ]_2, [0 ]_2) & ([1 ]_2, [1 ]_2) \\ \end{array}$

The additive identity is $([0 ]_2, [0 ]_2)$ , the multiplicative identity is $([1 ]_2, [1 ]_2)$ and the only unit in this ring is also $([1 ]_2, [1 ]_2)$ .
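The componentwise operations of Definition 4.38 are easy to experiment with. Here is a minimal Python sketch for ${\mathbb{Z}}_N \times {\mathbb{Z}}_M$, representing the pair $([a]_N,[b]_M)$ by the tuple `(a % N, b % M)`; the helper name `product_ring` is an assumption of this example.

```python
from itertools import product

def product_ring(N, M):
    """Return the elements of Z_N x Z_M with componentwise addition and multiplication."""
    elems = list(product(range(N), range(M)))

    def add(x, y):
        return ((x[0] + y[0]) % N, (x[1] + y[1]) % M)

    def mul(x, y):
        return ((x[0] * y[0]) % N, (x[1] * y[1]) % M)

    return elems, add, mul

# Z_2 x Z_2: search for units, i.e. elements x with x * y = (1, 1) for some y.
elems22, add22, mul22 = product_ring(2, 2)
units22 = [x for x in elems22 if any(mul22(x, y) == (1, 1) for y in elems22)]
print(units22)

# Z_2 x Z_3: the sum computed in Example 4.41.
elems23, add23, mul23 = product_ring(2, 3)
print(len(elems23), add23((1, 1), (1, 1)))
```

The first search returns only `(1, 1)`, in agreement with the statement that $([1]_2,[1]_2)$ is the only unit of ${\mathbb{Z}}_2 \times {\mathbb{Z}}_2$, and the second part reproduces the count and the sum from Example 4.41.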

4.7 Polynomial rings

We now define polynomials. This will match the notion of polynomials with coefficients in the real numbers that you have encountered in high school or in calculus, but we extend it to create polynomials with coefficients in any ring, for example, ${\mathbb{Z}}_N$ .

Theorem 4.43. If $R$ is a commutative ring, then there exists a polynomial ring denoted $R[x]$ , containing an element $x$ that is not in $R$ , which has these properties:

The set $R[x]$ consists of all expressions of the form $a_0+a_1x+...+a_nx^n \mbox{ (where $n \geq 0$ and $a_i \in R$ for $0\leq i\leq n$)}.$
The representation of elements in $R[x]$ is unique, that is, $a_0+a_1 x+...+a_n x^n =b_0+b_1 x+...+b_n x^n \text{ if and only if } a_0=b_0, \ldots, a_n=b_n.$ In particular, $a_0+a_1 x+...+a_n x^n =0_R$ if and only if $a_i =0_R$ for $0\leq i\leq n$ .
Addition in $R[x]$ is given by $(a_0 + a_1x + \cdots + a_n x^n) + (b_0 + b_1x + \cdots + b_n x^n) = (a_0 + b_0) + (a_1 + b_1) x + \cdots + (a_n + b_n) x^n.$
Multiplication in $R[x]$ is given by $\begin{eqnarray*} (a_0 + a_1x + \cdots + a_n x^n) \cdot (b_0 + b_1x + \cdots + b_m x^m) = \\ (a_0 \cdot b_0) + (a_1 b_0 +a_0 b_1) x + (a_2 b_0 + a_1b_1 + a_0 b_2) x^2 + \cdots + (a_n b_m) x^{n+m}. \end{eqnarray*}$ More precisely, the coefficient of $x^j$ in the product is $\sum_{i=0}^j a_{j-i}b_{i}$ .
$R$ is a subring of $R[x]$ by viewing elements $a_0\in R$ as polynomials.
$xa=ax$ for every $a \in R$ and, more generally, $R[x]$ is a commutative ring.

We will not prove this theorem as checking the ring axioms can get slightly tedious.

Example 4.44. Here are some polynomials together with the rings they belong to:

$[1]_9+[3]_9x\in {\mathbb{Z}}_9[x]$
$x^2+3x+7\in \mathbb{Z}[x]$
$2x^2+\frac{3}{2}x+1\in \mathbb{Q}[x]$
$\pi x^3+ex+\sqrt{2}\in \mathbb{R}[x]$

Definition 4.45. The degree of a polynomial $p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ is the largest $d\geq 0$ so that $a_d \ne 0$ . So, the degree is $n$ provided $a_n \ne 0$ .

We write $\operatorname{degree}p(x)$ for the degree of $p(x)$ .

Example 4.46. The degrees of the following polynomials in $\mathbb{R}[x]$ are

$\operatorname{degree}(3 - 7x^2 + \sqrt{2} x^7) = 7$
$\operatorname{degree}(\pi x^5 + \sqrt[7]{128} x^{12}) = 12$

$\operatorname{degree}(5.7) = 0$
$\operatorname{degree}(0) =$ undefined.

Proposition 4.47. Given a ring $R$ and non-zero polynomials $p(x)$ and $q(x)$ in $R[x]$ $\begin{eqnarray*} \operatorname{degree}(p(x) + q(x)) &\leq & \max\{\operatorname{degree}p(x),\operatorname{degree}q(x)\}\\ \operatorname{degree}(p(x) \cdot q(x)) &\leq & \operatorname{degree}p(x) + \operatorname{degree}q(x). \end{eqnarray*}$

Proof. Say $p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ and $q(x) = b_0 + b_1 x + b_2 x^2 + \cdots + b_m x^m$ with $a_n \ne 0$ and $b_m \ne 0$ , so that $\operatorname{degree}(p) = n$ and $\operatorname{degree}(q) = m$ .

Then the highest possible nonzero term of $p(x)+q(x)$ is the one among $a_n x^n$ or $b_m x^m$ which has largest exponent. Thus $\operatorname{degree}(p(x) + q(x)) \leq \max\{\operatorname{degree}p(x),\operatorname{degree}q(x)\}$ . Similarly, the highest possible nonzero term of $p(x) \cdot q(x)$ is $(a_nb_m) x^{n+m}$ . Thus $\operatorname{degree}(p(x) \cdot q(x)) \leq n + m$ . ◻

Example 4.48. The inequalities in the previous proposition need not be equalities!

if $p(x)=q(x)=x$ then $1=\operatorname{degree}(p(x) + q(x)) = \max\{\operatorname{degree}p(x),\operatorname{degree}q(x)\}$
if $p(x)=x$ and $q(x)=-x+1$ then $0=\operatorname{degree}(p(x) + q(x)) < \max\{\operatorname{degree}p(x),\operatorname{degree}q(x)\}=\max\{1,1\}$
if $p(x)=q(x)=x$ then $2=\operatorname{degree}(p(x) \cdot q(x)) = \operatorname{degree}p(x)+ \operatorname{degree}q(x)=1+1$
if $p(x)=[2]_4x$ and $q(x)=[2]_4x+[1]_4$ then $1=\operatorname{degree}(p(x) \cdot q(x)) < \operatorname{degree}p(x)+ \operatorname{degree}q(x)=1+1$
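The drop in degree in the last item comes from $[2]_4 \cdot [2]_4 = [0]_4$. A short Python sketch makes the computation explicit. It assumes a polynomial in ${\mathbb{Z}}_N[x]$ is stored as a list of coefficient representatives, lowest degree first, and the names `poly_mul_mod` and `degree` are illustrative.

```python
def poly_mul_mod(p, q, N):
    """Multiply two polynomials over Z_N; the coefficient of x^j is the sum of a_i * b_(j-i)."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] = (result[i + j] + a * b) % N
    return result

def degree(p):
    """Largest d with a nonzero coefficient of x^d, or None for the zero polynomial."""
    nonzero = [d for d, c in enumerate(p) if c != 0]
    return max(nonzero) if nonzero else None

p = [0, 2]  # [2]_4 x
q = [1, 2]  # [1]_4 + [2]_4 x
pq = poly_mul_mod(p, q, 4)
print(pq, degree(pq))  # the x^2 coefficient 2*2 = 4 vanishes modulo 4
```

Running the same product modulo a prime such as 5 keeps the $x^2$ term, so the degree is then $2$.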

However, the second inequality becomes an equality if the coefficient ring is a domain.

Proposition 4.49. If $R$ is a domain then

for non-zero polynomials $p(x)$ and $q(x)$ in $R[x]$ , $\operatorname{degree}(p(x) \cdot q(x)) = \operatorname{degree}p(x)+ \operatorname{degree}q(x).$
$R[x]$ is a domain.

Proof. (a) Say $p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ and $q(x) = b_0 + b_1 x + b_2 x^2 + \cdots + b_m x^m$ with $a_n \ne 0$ and $b_m \ne 0$ , so that $\operatorname{degree}(p) = n$ and $\operatorname{degree}(q) = m$ . Then the highest nonzero term of $p(x) \cdot q(x)$ is $(a_nb_m) x^{n+m}$ , which is non-zero since $a_n \ne 0$ and $b_m \ne 0$ and in a domain if $a_n \ne 0$ and $b_m \ne 0$ then $a_nb_m\neq 0$ (otherwise $a_n$ and $b_m$ would be zero divisors, a contradiction). Thus $\operatorname{degree}(p(x) \cdot q(x)) = n + m$ .

(b) Assume towards a contradiction that $R[x]$ has a zero divisor $p(x)$ . Then $p(x)\neq 0$ and there exists $q(x)\neq 0$ so that $p(x)q(x)=0$ . Same as in part (a) if $p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ and $q(x) = b_0 + b_1 x + b_2 x^2 + \cdots + b_m x^m$ with $a_n \ne 0$ and $b_m \ne 0$ , then the highest nonzero term of $p(x) \cdot q(x)$ is $(a_nb_m) x^{n+m}$ where $a_nb_m\neq 0$ by the same argument as in (a). Since $p(x) \cdot q(x)$ has a nonzero term it is not the zero polynomial, a contradiction.

Therefore our assumption is false and $R[x]$ has no zero divisors. Since $R[x]$ is commutative by Theorem 4.43 and $1_R \neq 0_R$ because $R$ is a domain, we conclude that $R[x]$ is a domain. ◻

4.7.1 The Division Algorithm and GCDs

Theorem 4.50 (Division Algorithm). Let $F$ be a field and $f(x),g(x) \in F[x]$ with $g(x) \neq0$ . Then there exist unique polynomials $q(x)$ and $r(x)$ such that $f(x)=g(x)q(x)+r(x) \mbox{ and either }\ r(x)=0_F \mbox{ or } \deg r(x) < \deg g(x).$

We will not worry about proving this theorem. Instead let’s recall that there is an algorithm (long division) that gives us the $q(x)$ and $r(x)$ in the Division Algorithm Theorem.

Example 4.51. We divide $f(x)=x^3+x^2-1$ by $g(x)=x-1$

Example 4.52. Divide $f(x)=3x^5+2x^4+2x^3+4x^2+x-2$ by $g(x)=2x^3+1$ in $\mathbb{R}[x]$ .
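Long division can be carried out mechanically. The following Python sketch works over $\mathbb{Q}$ using exact fractions, which suffices for the two examples above since their coefficients are integers. It stores a polynomial as a coefficient list, lowest degree first; this representation and the names `trim` and `poly_divmod` are assumptions of the sketch, not notation from these notes.

```python
from fractions import Fraction

def trim(p):
    """Convert coefficients to Fractions and drop zero leading terms."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(f, g):
    """Return (q, r) with f = g*q + r and r = 0 or degree r < degree g."""
    r, g = trim(f), trim(g)
    if not g:
        raise ZeroDivisionError("g(x) must be nonzero")
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 1)
    while len(r) >= len(g):
        shift = len(r) - len(g)
        coeff = r[-1] / g[-1]          # divide the leading terms
        q[shift] = coeff
        for i, c in enumerate(g):      # subtract coeff * x^shift * g(x)
            r[shift + i] -= coeff * c
        r = trim(r)                    # the leading term has cancelled
    return q, r

# Example 4.51: f(x) = x^3 + x^2 - 1, g(x) = x - 1
print(poly_divmod([-1, 0, 1, 1], [-1, 1]))
# Example 4.52: f(x) = 3x^5 + 2x^4 + 2x^3 + 4x^2 + x - 2, g(x) = 2x^3 + 1
print(poly_divmod([-2, 1, 4, 2, 2, 3], [1, 0, 0, 2]))
```

With these inputs the sketch yields $q(x)=x^2+2x+2$ and $r(x)=1$ for Example 4.51, and $q(x)=\frac{3}{2}x^2+x+1$ and $r(x)=\frac{5}{2}x^2-3$ for Example 4.52. Each result can be confirmed by expanding $g(x)q(x)+r(x)$.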

Definition 4.53. Let $F$ be a field and $a(x),b(x) \in F[x]$ with $b(x) \neq 0$ . We say that $a(x)$ divides $b(x)$ , and write $a(x)|b(x)$ , if $b(x)=a(x)h(x)$ for some $h(x) \in F[x]$ .

Definition 4.54. Let $F$ be a field and $a(x),b(x) \in F[x]$ , not both zero. The greatest common divisor of $a(x)$ and $b(x)$ is the polynomial of highest degree that divides both $a(x)$ and $b(x)$ and that has leading coefficient $1_F$ .

Example 4.55. Find the GCD of $a(x)=2x^4+5x^3-5x-2$ and $b(x)=2x^3-3x^2-2x$ in $\mathbb{R}[x]$ . We have the factorizations $\begin{eqnarray*} a(x)=2x^4+5x^3-5x-2 &=&(2x+1)(x+2)(x+1)(x-1)\\ b(x)=2x^3-3x^2-2x &=&(2x+1)(x-2)x. \end{eqnarray*}$ We see that the only factor common to both factorizations is $2x+1$ . However $2x+1=2(x+\frac{1}{2})$ , so another polynomial of degree one that divides both $a(x)$ and $b(x)$ is $x+\frac{1}{2}$ . Since the latter has leading coefficient one and no polynomial of higher degree divides both $a(x)$ and $b(x)$ , we conclude $\gcd(a(x),b(x))=x+\frac{1}{2}$ .

Theorem 4.56 (Euclidean Algorithm). Let $F$ be a field and $a(x),b(x) \in F[x]$ , not both zero. Then there is a unique GCD $d(x)$ of $a(x)$ and $b(x)$ which can be found using the Euclidean algorithm followed by dividing the last non-zero remainder polynomial by its leading coefficient.

Furthermore, there are polynomials $u(x)$ and $v(x)$ such that $d(x)=a(x)u(x)+b(x)v(x)$ .

Example 4.57. We find the GCD of $a(x)=2x^4+5x^3-5x-2$ and $b(x)=2x^3-3x^2-2x$ in $\mathbb{R}[x]$ using the Euclidean algorithm.

The last non-zero remainder is $r(x)=-\frac{48}{49}x-\frac{24}{49}$ and dividing it by its leading coefficient, $-\frac{48}{49}$ , gives $\gcd(a(x),b(x))=x+\frac{1}{2}.$
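The successive divisions can be reproduced with a short program. The sketch below is one possible Python rendering of the Euclidean algorithm over $\mathbb{Q}$ with exact fractions; polynomials are coefficient lists, lowest degree first, and the names `trim`, `poly_divmod`, `euclid_sequence` and `poly_gcd` are hypothetical helpers introduced for this illustration.

```python
from fractions import Fraction

def trim(p):
    """Convert coefficients to Fractions and drop zero leading terms."""
    p = [Fraction(c) for c in p]
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_divmod(f, g):
    """Long division: return (q, r) with f = g*q + r and r = 0 or degree r < degree g."""
    r, g = trim(f), trim(g)
    q = [Fraction(0)] * max(len(r) - len(g) + 1, 1)
    while len(r) >= len(g):
        shift = len(r) - len(g)
        coeff = r[-1] / g[-1]
        q[shift] = coeff
        for i, c in enumerate(g):
            r[shift + i] -= coeff * c
        r = trim(r)
    return q, r

def euclid_sequence(a, b):
    """Return [a, b, r1, r2, ...], ending with the last non-zero remainder."""
    seq = [trim(a), trim(b)]
    while seq[-1]:                       # stop once a remainder is zero
        _, r = poly_divmod(seq[-2], seq[-1])
        seq.append(r)
    seq.pop()                            # discard the final zero entry
    return seq

def poly_gcd(a, b):
    """Divide the last non-zero remainder by its leading coefficient."""
    last = euclid_sequence(a, b)[-1]
    return [c / last[-1] for c in last]

a = [-2, -5, 0, 5, 2]   # 2x^4 + 5x^3 - 5x - 2
b = [0, -2, -3, 2]      # 2x^3 - 3x^2 - 2x
for step in euclid_sequence(a, b):
    print(step)
print(poly_gcd(a, b))
```

For these inputs the remainders produced are $14x^2+3x-2$ and then $-\frac{48}{49}x-\frac{24}{49}$, after which the next remainder is zero. Normalizing the last non-zero remainder gives $x+\frac{1}{2}$, in agreement with the value stated above.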

Theorems 4.50 and 4.56 show that there is a strong analogy between the integers and polynomials. We make this precise as follows.

Let us summarize the analogy between $F[x]$ and ${\mathbb{Z}}$ so far:

$F[x]$	${\mathbb{Z}}$
domain (see 4.49)	domain
Long Division	Division Algorithm
$a(x)=b(x)q(x)+r(x), \deg(r(x))<\deg(b(x))$ or $r(x)=0$	$a=bq+r, 0\leq r<b$
Euclidean Algorithm	Euclidean Algorithm

Definition 4.58. A Euclidean domain is a ring $R$ which is a domain and in which the Division Theorem holds. More precisely this means that there is a function $N: R\setminus\{0\}\to {\mathbb{N}}\cup\{0\}$ so that for any $a,b\in R$ with $b\neq 0$ there exist $q,r\in R$ so that $a=b\cdot q+r \text{ and either } r=0 \text{ or } N(r)<N(b).$

For the ring of polynomials $N(p(x))=\deg(p(x))$ . For ${\mathbb{Z}}, N(x)=\vert x\vert$ .

Example 4.59. The ring of integers ${\mathbb{Z}}$ is a Euclidean domain. If $F$ is a field then the ring of polynomials with coefficients in $F$ , $F[x]$ , is a Euclidean domain. In particular, $\mathbb{Q}[x], \mathbb{R}[x], \mathbb{C}[x]$ and ${\mathbb{Z}}_p[x]$ are Euclidean domains for any prime $p>0$ .

4.7.2 Irreducible polynomials and irreducible factorizations

Let us explore the units in $R[x]$ .

Proposition 4.60. Let $R$ be a domain. A polynomial $p(x)$ is a unit in $R[x]$ if and only if $p(x)=c$ for some $c\in R$ such that $c$ is a unit in $R$ .

Proof. Assume $p(x)$ is a unit. Then $p(x) \cdot q(x) = 1$ for some polynomial $q(x)$ . By the degree formula, $\operatorname{degree}(p(x)) + \operatorname{degree}(q(x)) = \operatorname{degree}(1) = 0$ . Since degrees cannot be negative, this proves that $\operatorname{degree}(p(x)) = 0=\operatorname{degree}(q(x))$ . That is, $p(x) = c$ for some $c\in R$ and $q(x)=r$ for some $r\in R$ so that $cr=1$ . This means that $c$ is a unit in $R$ .

Conversely, suppose $p(x)=c$ for some $c\in R$ such that $c$ is a unit in $R$ . Then there exists $r\in R$ so that $rc=cr=1_R$ and thus we see that $q(x)=r$ is a multiplicative inverse for $p(x) = c$ in $R[x]$ . ◻

Corollary 4.61. Let $F$ be a field. A polynomial $p(x)$ is a unit in $F[x]$ if and only if $p(x)=c$ for some $0_F\neq c\in F$ .

Example 4.62. The previous proposition is no longer true if we remove the hypothesis that $R$ is a domain. For example $[1]_9+[3]_9x$ is a unit in ${\mathbb{Z}}_9[x]$ with multiplicative inverse $[1]_9+[6]_9x$ .
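The product in this example can be verified directly, and a small search shows how the situation differs from the domain case. The Python sketch below stores polynomials over ${\mathbb{Z}}_N$ as coefficient lists, lowest degree first; the names `poly_mul_mod` and `degree_one_units` are illustrative, and the search only looks for inverses of degree at most one.

```python
def poly_mul_mod(p, q, N):
    """Multiply two polynomials over Z_N given as coefficient lists."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] = (result[i + j] + a * b) % N
    return result

def degree_one_units(N):
    """Pairs (a, b) with b != 0 such that a + b x has an inverse c + d x in Z_N[x]."""
    return [(a, b) for a in range(N) for b in range(1, N)
            if any(poly_mul_mod([a, b], [c, d], N) == [1, 0, 0]
                   for c in range(N) for d in range(N))]

check = poly_mul_mod([1, 3], [1, 6], 9)  # (1 + 3x)(1 + 6x) in Z_9[x]
print(check)                             # constant polynomial 1
print((1, 3) in degree_one_units(9))     # 1 + 3x is found by the search
print(degree_one_units(5))               # Z_5 is a field: no non-constant units
```

The expansion is $1 + 9x + 18x^2$, and both $9$ and $18$ are congruent to $0$ modulo $9$. Over ${\mathbb{Z}}_5$ the search finds nothing, consistent with Proposition 4.60.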

Definition 4.63. Let $R$ be a commutative ring with identity. An element $a \in R$ is said to be an associate of an element $b \in R$ if there is a unit $u \in R$ such that $au=b$ .

Example 4.64.

$-4$ is an associate to the number $4$ in $\mathbb{Z}$ since $-4=4\cdot(-1)$ and $-1$ is a unit in ${\mathbb{Z}}$
$2(x^2+3x+2)$ is an associate to the polynomial $x^2+3x+2$ in $\mathbb{Q}[x]$ since $2$ is a unit in $\mathbb{Q}[x]$ .

Definition 4.65. Let $F$ be a field. A non-constant polynomial $p(x) \in F[x]$ is said to be irreducible if its only divisors are its associates and the nonzero constant polynomials (units).

Lemma 4.66. If $F$ is a field then every polynomial of degree 1 in $F[x]$ is irreducible in $F[x]$ .

Proof. Let $p(x)\in F[x]$ be a polynomial of degree one. Suppose $q(x)\mid p(x)$ . Then there exists $h(x)$ so that $p(x)=q(x)h(x)$ and by the degree formula 4.49 we have $1=\deg(p(x))=\deg(q(x))+\deg(h(x))$ . Therefore we see that at least one of $q(x)$ or $h(x)$ must have degree 0, that is $q(x)$ or $h(x)$ is a constant. It cannot be the zero constant since that would make $p(x)=0$ , a contradiction. So we see that either $q(x)$ or $h(x)$ is a unit in $F[x]$ . We have shown that either $q(x)$ is a nonzero constant or $q(x)$ is an associate of $p(x)$ so $p(x)$ is irreducible. ◻

Remark 4.67. Irreducible polynomials are analogous to prime integers since an integer $p$ is prime if and only if its only divisors are $\pm 1$ (the units of ${\mathbb{Z}}$ ) and $\pm p$ (the associates of $p$ in ${\mathbb{Z}}$ .)

To further the analogy between integers and polynomials here are new facts we have seen:

$F[x]$	${\mathbb{Z}}$
Units: $p(x)=c$ , $c\in F, c\neq 0_F$ (see 4.61)	Units: $-1, 1$
Associates: $q(x)$ is associate to $c\cdot q(x)$ for all $c\in F, c\neq 0_F$	Associates: $n$ is associate to $\pm n$
Irreducible polynomial	Prime integer
Irreducible factorization	Prime factorization

The following theorems expand on the analogy between prime integers and irreducible polynomials and between prime factorizations and irreducible factorizations.

The following theorem is reminiscent of [lemma composite].

Theorem 4.68. Let $F$ be a field. A non-zero polynomial $f(x)$ is reducible in $F[x]$ if and only if $f(x)$ can be written as the product of two polynomials of lower degree.

The following theorem is reminiscent of 1.49 and [lemma prime].

Theorem 4.69. Let $F$ be a field and $p(x)$ a non-constant polynomial in $F[x]$ . Then the following are equivalent:

The polynomial $p(x)$ is irreducible.
If $b(x)$ and $c(x)$ are polynomials and $p(x)|b(x)c(x)$ , then $p(x)|b(x)$ or $p(x)|c(x)$ .
If $r(x)$ and $s(x)$ are any polynomials such that $p(x)=r(x)s(x)$ , then $r(x)$ is a non-zero constant polynomial or $s(x)$ is a non-zero constant polynomial.

The following theorem is reminiscent of the Fundamental Theorem of Arithmetic 1.52. Note that the uniqueness part of that theorem, which states that $p_i=\pm q_i$ , could be restated as: $p_i$ and $q_i$ are associates (in ${\mathbb{Z}}$ ).

Theorem 4.70. Let $F$ be a field. Every non-constant polynomial $f(x) \in F[x]$ is a product of irreducible polynomials in $F[x]$ . This factorization is unique in the following sense: If $p_1(x)p_2(x)\cdots p_n(x)=q_1(x)q_2(x)\cdots q_m(x)$ with each $p_i(x)$ and $q_j(x)$ irreducible, then $n=m$ and, up to some relabeling, $p_i(x)$ and $q_i(x)$ are associates for each $i$ .

5 Homomorphisms and Isomorphisms

5.1 Functions

Definition 5.1. A function $f$ from a set $A$ to a set $B$ denoted $f:A\to B$ is a rule that assigns to every element $a\in A$ an element denoted $f(a)\in B$ .

The set $A$ is called the domain of $f$ and the set $B$ is called the codomain of $f$ .

In other situations it is better to represent functions by giving a formula or rule for $f(x)$ .

Example 5.2. Here are some functions you know:

$f:\mathbb{R}\to \mathbb{R}, f(x)=x^2$
$f:{\mathbb{Z}}\to {\mathbb{Z}}, f(x)=2x$
$f:\mathbb{R}\to {\mathbb{Z}}, f(x)=\lfloor x\rfloor$ , where $\lfloor x\rfloor$ denotes the largest integer that is less than or equal to $x$

and some that you will get to know

$f:{\mathbb{Z}}\to{\mathbb{Z}}_N, f(x)=[x]_N$
$f:{\mathbb{Z}}_N\to{\mathbb{Z}}_N, f([x]_N)=[-x]_N$

Notice that in Figure 1 sometimes there can be multiple arrows reaching the same element in $B$ . When this does not happen we say $f$ is injective.

Definition 5.3. A function $f:A\to B$ is injective or one-to-one if whenever $x,y \in A$ are such that $f(x)=f(y)$ then $x=y$ .

To prove a function $f$ is injective, suppose $f(x)=f(y)$ and prove that $x=y$ .
To prove a function $f$ is not injective find values $x, y$ so that $f(x)=f(y)$ but $x\neq y$ .

Notice that in Figure [fig] there is an arrow leaving from every element (dot) in $A$ . This is because a function associates to every $a\in A$ some element of $B$ . However there may be some elements in $B$ which do not receive arrows, so they are not outputs of the function $f$ . To distinguish those elements of $B$ which are outputs of $f$ we introduce the following notion.

Definition 5.4. The image of a function $f:A\to B$ is the set $\Im(f)=\{ f(a) \mid a\in A\}.$ The image is a subset of the codomain, that is, $\Im(f)\subseteq B$ , but they need not be equal.

Definition 5.5. A function $f:A\to B$ is surjective or onto if for every $b\in B$ there exists $a\in A$ so that $f(a)=b$ .

Equivalently, $f$ is surjective if and only if $\Im(f)=B$ .

Now we can combine the notions of injective and surjective.

Definition 5.6. A function $f:A\to B$ is bijective if $f$ is both injective and surjective.

Example 5.7. In Figure [fig]

the first function is not injective, not surjective, and not bijective
the second function is injective, not surjective, and not bijective
the third function is not injective, surjective, and not bijective
the fourth function is injective, surjective, and bijective.

Example 5.8.

the function $f:\mathbb{R}\to \mathbb{R}, f(x)=x^2$ is not injective since $f(-1)=f(1)$ ; it is not surjective since $\Im(f)=[0,\infty)\neq \mathbb{R}$ and is not bijective since it is not surjective.
the function $f:{\mathbb{Z}}\to {\mathbb{Z}}, f(x)=2x$ is injective: if $f(x)=f(y)$ then $2x=2y$ so $x=y$ (since $2$ is not a zero-divisor in ${\mathbb{Z}}$ ); the image consists of all the even integers so $f$ is not surjective and thus it is not bijective.
$f:\mathbb{R}\to {\mathbb{Z}}, f(x)=\lfloor x\rfloor$ , where $\lfloor x\rfloor$ denotes the largest integer that is less than or equal to $x$ . This function is not injective: $f(2.1)=2=f(2.2)$ ; it is surjective since for every integer $b\in {\mathbb{Z}}$ we have $f(b)=b$ . The function is not bijective since it is not injective.
$f:{\mathbb{Z}}\to{\mathbb{Z}}_N, f(x)=[x]_N$ is not injective since, for example, $f(0)=f(N)=[0]_N$ . The function is surjective since for every element $[x]_N\in {\mathbb{Z}}_N$ we have $f(x)=[x]_N$ . The function is not bijective since it is not injective.
$f:{\mathbb{Z}}_N\to{\mathbb{Z}}_N, f([x]_N)=[-x]_N$ is injective: if $f([x]_N)=f([y]_N)$ then $[-x]_N=[-y]_N$ and so by multiplying both sides by $[-1]_N$ we conclude $[-1]_N[-x]_N=[-1]_N[-y]_N, \text{ that is }, [x]_N=[y]_N.$ This function is surjective since for each $[b]_N\in {\mathbb{Z}}_N$ there exists $[a]_N=[-b]_N\in {\mathbb{Z}}_N$ so that $f([-b]_N)=[b]_N$ . This function is bijective since it is both injective and surjective.
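For functions between finite sets, the definitions of injective and surjective can be tested directly by listing all outputs. The sketch below does this for two of the functions above. The helper names, the choice $N=6$ , and the restriction of ${\mathbb{Z}}$ to the finite window $\{0,\ldots,2N-1\}$ are assumptions made for illustration, since a program cannot enumerate all of ${\mathbb{Z}}$ .

```python
def is_injective(f, domain):
    """f is injective on a finite domain if no output value is repeated."""
    outputs = [f(x) for x in domain]
    return len(set(outputs)) == len(outputs)

def is_surjective(f, domain, codomain):
    """f is surjective if its image equals the whole codomain."""
    image = {f(x) for x in domain}
    return image == set(codomain)

N = 6
Z_N = range(N)                   # the class [x]_N is represented by x in 0..N-1

negate = lambda x: (-x) % N      # f([x]_N) = [-x]_N
print(is_injective(negate, Z_N), is_surjective(negate, Z_N, Z_N))  # True True

# Reduction mod N, restricted to the finite window {0, ..., 2N-1} of Z.
reduce_mod = lambda x: x % N
window = range(2 * N)
print(is_injective(reduce_mod, window), is_surjective(reduce_mod, window, Z_N))  # False True
```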

Proposition 5.9. If a function $f:A\to B$ is bijective and $A$ and $B$ are finite sets, then $A$ and $B$ must have the same number of elements.

Proof. Suppose a function $f:A\to B$ is bijective and $A$ and $B$ are finite sets. Let $n=\#A$ .

Then we can write $A=\{a_1, a_2, \ldots, a_n\}$ where $a_i\neq a_j$ whenever $i\neq j$ . Since $f$ is injective, by the contrapositive of the definition of injective it follows that $f(a_i)\neq f(a_j)$ whenever $i\neq j$ . Thus the image of $f$ , that is, the set $\Im(f)=\{f(a_1), f(a_2), \ldots, f(a_n)\}$ consists of exactly $n$ distinct elements. In other words, $\#\Im(f)=n$ .

Since $f$ is surjective we have $B=\Im(f)$ and thus we conclude $\#B=\#\Im(f)=n$ . Since $\#A=n$ we deduce that $\#A=\#B$ . ◻

Example 5.10. Suppose in a certain class no two students have the same birthday month and for each of the months of the year there is a student born in that month. How many students are there in the class?

The answer is 12 students. To see this, consider $f:\{\text{students}\}\to \{\text{months}\}, \quad f(\text{student})=\text{student's birthday month}.$ The information given tells us $f$ is injective (no two students have the same birthday month) and surjective (for each of the months of the year there is a student born in that month). Thus $f$ is bijective and by Proposition 5.9 we conclude $\#$ students $= \#$ months $=12$ .

Example 5.11. Suppose in a certain class no two students have the same birthday month. How many students are there in the class?

There are at most 12 students in the class. Since each student has a different birthday month the function $f:\{\text{students}\}\to \{\text{months}\}, \quad f(\text{student})=\text{student's birthday month}$ is injective. As in the proof of Proposition 5.9, the number of students is equal to the number of their birthday months, which is at most 12 since there are only 12 months in a year.

Example 5.12. Suppose in a certain class there are more than 12 students. Can each student have a different birthday month?

No, the students cannot each have a different birthday month. Assume towards a contradiction that each student has a different birthday month. Then the set of all their birthday months would have as many elements as there are students in the class, so there would be more than 12 different birthday months. This is a contradiction.

Theorem 5.13. If $A$ and $B$ are finite sets which have the same number of elements and $f:A\to B$ is a function, then the following are equivalent:

$f$ is injective,
$f$ is surjective,
$f$ is bijective.

Note: a “the following are equivalent” statement as above means three if and only if statements: (a) if and only if (b); (b) if and only if (c); (c) if and only if (a). However, to prove all of this it suffices to prove only 3 if-then statements, namely

$(a)\rightarrow (b)$ : if (a) then (b)
$(b)\rightarrow (c)$ : if (b) then (c)
$(c)\rightarrow (a)$ : if (c) then (a)

Proof. $(a)\rightarrow (b)$ : if $f$ is injective then $f$ is surjective.

Suppose $\#A=\#B=n$ and $f:A\to B$ is injective. Then we can write $A=\{a_1, a_2, \ldots, a_n\}$ where $a_i\neq a_j$ whenever $i\neq j$ . Since $f$ is injective, by the contrapositive of the definition of injective it follows that $f(a_i)\neq f(a_j)$ whenever $i\neq j$ . Thus the image of $f$ , that is, the set $\Im(f)=\{f(a_1), f(a_2), \ldots, f(a_n)\}$ consists of exactly $n$ distinct elements. In other words, $\#\Im(f)=n$ . Since $\Im(f)\subseteq B$ and $\#\Im(f)=\#B=n$ it follows that there is no element of $B$ that is not in $\Im(f)$ , in other words $\Im(f)=B$ , so $f$ is surjective.

$(b)\rightarrow (c)$ : if $f$ is surjective then $f$ is bijective.

Suppose $\#A=\#B=n$ and $f:A\to B$ is surjective. By definition of surjective we have $\Im(f)=B$ and in particular $\#\Im(f)=\#B=n$ . We can write $A=\{a_1, a_2, \ldots, a_n\}$ and $\Im(f)=\{f(a_1), f(a_2), \ldots, f(a_n)\}$ . Since we know $\#\Im(f)=n$ the elements $f(a_1), f(a_2), \ldots, f(a_n)$ must be pairwise distinct (otherwise $\Im(f)$ would have fewer than $n$ elements). Thus $f(a_i)\neq f(a_j)$ whenever $a_i\neq a_j$ , which is the contrapositive of the definition of injective. We have thus shown $f$ is injective and since it was assumed surjective, $f$ is in fact bijective.

$(c)\rightarrow (a)$ : if $f$ is bijective then $f$ is injective.

This is true for any function by definition of bijective. ◻
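For small sets, Theorem 5.13 can also be confirmed by brute force. The sketch below enumerates every function between two $n$ -element sets and checks that injectivity and surjectivity always agree. The function name `check_theorem` and the use of $\{0,\ldots,n-1\}$ for both sets are illustrative choices; this is a sanity check for particular values of $n$ , not a replacement for the proof.

```python
from itertools import product

def check_theorem(n):
    """Enumerate every function f: A -> B with #A = #B = n and confirm that
    f is injective exactly when it is surjective."""
    A = range(n)
    B = range(n)
    for values in product(B, repeat=len(A)):   # values[i] plays the role of f(a_i)
        injective = len(set(values)) == n
        surjective = set(values) == set(B)
        if injective != surjective:
            return False
    return True

print(check_theorem(3))  # examines all 3**3 = 27 functions
```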

Theorem 5.14. If $R$ is a domain which has finitely many elements, then $R$ is a field.

Proof. Consider $r\in R$ with $r\neq 0$ . Since $R$ is a domain it follows that $r$ is not a zero divisor. Let $f:R\to R$ be given by $f(x)=rx$ . Then $f$ is injective. Indeed, if $f(x)=f(y)$ then $rx=ry$ and since $r$ is not a zero divisor we have that $x=y$ by the cancellation Theorem 4.36.

Since the domain and codomain of $f$ are finite sets having the same number of elements (in fact the domain and codomain are the same set $R$ ), and $f$ is injective, then $f$ must be surjective by Theorem 5.13. Thus there exists $s\in R$ such that $f(s)=1_R$ , that is $rs=1_R$ . Because $R$ is a domain it is a commutative ring and thus $sr=rs=1_R$ . We have shown that $r$ is a unit.

Since every $r\neq 0_R$ is a unit, $R$ is a field. ◻
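The argument in this proof can be watched in action on a small example. The sketch below lists the values of $f(x)=rx$ on ${\mathbb{Z}}_n$ and looks for the element $s$ with $f(s)=1$ . The function names and the choices of ${\mathbb{Z}}_7$ and ${\mathbb{Z}}_6$ are assumptions made for illustration.

```python
def multiplication_map(r, n):
    """Return the list [r*x mod n for x = 0, ..., n-1], i.e. the values of
    f(x) = r x on Z_n, with the class [x]_n represented by x."""
    return [(r * x) % n for x in range(n)]

def inverse_via_surjectivity(r, n):
    """Find s with r*s = 1 in Z_n by searching the values of f, or return None."""
    values = multiplication_map(r, n)
    return values.index(1) if 1 in values else None

# Z_7 is a finite domain: f(x) = 3x hits every element, so 3 has an inverse.
print(multiplication_map(3, 7), inverse_via_surjectivity(3, 7))  # [0, 3, 6, 2, 5, 1, 4] 5

# Z_6 is not a domain: f(x) = 2x repeats values and never reaches 1.
print(multiplication_map(2, 6), inverse_via_surjectivity(2, 6))  # [0, 2, 4, 0, 2, 4] None
```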

Theorem 5.15 (Fermat’s little theorem). If $p$ is a prime integer and $a$ is an integer not divisible by $p$ then $a^{p-1}\equiv 1 \pmod{p}$ .

Proof. Suppose $p$ is a prime integer and $a$ is an integer not divisible by $p$ . Since $p$ does not divide $a$ , we have $a\not\equiv 0 \pmod p$ and thus $[a]_p\neq [0]_p$ . By Theorem 4.33 we know ${\mathbb{Z}}_p$ is a field and so, since $[a]_p\neq [0]_p$ we deduce that $[a]_p$ is a unit in ${\mathbb{Z}}_p$ .

Consider the function $f:{\mathbb{Z}}_p\setminus \{[0]_p\}\to {\mathbb{Z}}_p\setminus \{[0]_p\}$ given by $f([x]_p)=[a]_p[x]_p$ . We claim that $f$ is injective. Indeed, if $f([x]_p)=f([y]_p)$ , then $[a]_p[x]_p=[a]_p[y]_p$ . Since $[a]_p$ is a unit it is not a zero divisor by Proposition 4.34. Applying the cancellation principle (Theorem 4.36) we deduce from $[a]_p[x]_p=[a]_p[y]_p$ that $[x]_p=[y]_p$ .

Since $f$ is injective and the domain and codomain of $f$ are finite sets which have the same number of elements ( $p-1$ ), by Theorem 5.13 we deduce that $f$ is surjective, so $\Im(f)= {\mathbb{Z}}_p\setminus \{[0]_p\}=\{[1]_p, [2]_p, \ldots, [p-1]_p\}$ . On the other hand $\Im(f)=\{[a]_p[1]_p, [a]_p[2]_p, \ldots, [a]_p[p-1]_p\}.$ Taking the product of the elements of $\Im(f)$ is the same as taking the product of the elements of ${\mathbb{Z}}_p\setminus \{[0]_p\}$ , so $[a]_p[1]_p\cdot [a]_p[2]_p\cdot \cdots \cdot [a]_p[p-1]_p = [1]_p \cdot [2]_p \cdot \cdots \cdot [p-1]_p,$ that is, $[1]_p\cdot [2]_p \cdot \cdots \cdot [p-1]_p\cdot [a]_p^{p-1}= [1]_p \cdot [2]_p \cdot \cdots \cdot [p-1]_p.$ Cancelling, as in Theorem 4.36, $[1]_p, [2]_p, \ldots, [p-1]_p$ , which are all non zero-divisors, in the above equation we obtain $[a]_p^{p-1}=[1]_p,$ in other words $a^{p-1}\equiv 1 \pmod{p}$ . ◻
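Both steps of this proof can be checked numerically for small primes: multiplication by $a$ permutes the nonzero classes modulo $p$ , and $a^{p-1}$ reduces to $1$ . The following sketch uses the hypothetical helper `fermat_check`; the sample values of $a$ and $p$ are arbitrary.

```python
def fermat_check(a, p):
    """For a prime p not dividing a, return (is_permutation, power) where
    is_permutation says whether x -> a*x permutes the nonzero classes mod p
    and power is a**(p-1) mod p."""
    nonzero = list(range(1, p))
    image = sorted((a * x) % p for x in nonzero)
    return image == nonzero, pow(a, p - 1, p)

print(fermat_check(3, 7))    # (True, 1)
print(fermat_check(10, 13))  # (True, 1)
```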

5.2 Isomorphisms

Two rings $R$ and $S$ are isomorphic if they are the same after relabelling. Here is the formal definition:

Definition 5.16. Let $R$ and $S$ be rings. An isomorphism from $R$ to $S$ is a bijective function (that is, a function that is both injective and surjective) $f\!: R \to S$ such that the following properties hold:

For all $a, b \in R$ if $f(a)=f(b)$ then $a=b$ . ( $f$ is injective.)
For all $s\in S$ there exists $r\in R$ so that $f(r)=s$ . ( $f$ is surjective.)
For all $a, b \in R$ we have $f(a + b) = f(a) + f(b)$ . ( $f$ preserves addition.)
For all $a, b \in R$ we have $f(a \cdot b) = f(a) \cdot f(b)$ . ( $f$ preserves multiplication.)
$f(1_R) = 1_S$ . ( $f$ preserves multiplicative identities.)

We say $R$ is isomorphic to $S$ if there exists an isomorphism from $R$ to $S$ , and we write $R \cong S$ to signify that $R$ and $S$ are isomorphic.

The reason we do not include $f(0_R) = 0_S$ in the definition of “isomorphism” is that it is a consequence of the other parts of the definition.

Example 5.17. Consider the set $S = \{ a,b,c,d \}$ with the two operations below:

$\spadesuit$	a	b	c	d
a	a	b	c	d
b	b	c	d	a
c	c	d	a	b
d	d	a	b	c

$\heartsuit$	a	b	c	d
a	a	a	a	a
b	a	b	c	d
c	a	c	a	c
d	a	d	c	b

$(S, \spadesuit, \heartsuit)$ is a ring. Now compare this with the addition and multiplication tables for $\mathbb{Z}_4$ :

$+$	$[0]_4$	$[1]_4$	$[2]_4$	$[3]_4$
$[0]_4$	$[0]_4$	$[1]_4$	$[2]_4$	$[3]_4$
$[1]_4$	$[1]_4$	$[2]_4$	$[3]_4$	$[0]_4$
$[2]_4$	$[2]_4$	$[3]_4$	$[0]_4$	$[1]_4$
$[3]_4$	$[3]_4$	$[0]_4$	$[1]_4$	$[2]_4$

$\cdot$	$[0]_4$	$[1]_4$	$[2]_4$	$[3]_4$
$[0]_4$	$[0]_4$	$[0]_4$	$[0]_4$	$[0]_4$
$[1]_4$	$[0]_4$	$[1]_4$	$[2]_4$	$[3]_4$
$[2]_4$	$[0]_4$	$[2]_4$	$[0]_4$	$[2]_4$
$[3]_4$	$[0]_4$	$[3]_4$	$[2]_4$	$[1]_4$

The tables for $S$ and the tables for ${\mathbb{Z}}_4$ are completely identical if we relabel the elements of ${\mathbb{Z}}_4$ as $a, b, c, d$ as follows: $f: {\mathbb{Z}}_4 \to S \qquad f([0]_4) = a \quad f([1]_4) = b \quad f([2]_4) = c \quad f([3]_4) = d.$ So this map $f$ is an isomorphism between $\mathbb{Z}_4$ and $(S, \spadesuit, \heartsuit)$ . Notice in fact that this is the only isomorphism between these two rings: since $1_S = b$ , we must have $f([1]_4) = b$ ; but then this implies that $f([2]_4) = f([1]_4+[1]_4) = f([1]_4) \spadesuit f([1]_4) = b \spadesuit b=c$ , and $f([3]_4) = f([1]_4+[2]_4) = f([1]_4) \spadesuit f([2]_4) = b \spadesuit c = d$ .
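The relabeling in Example 5.17 can be verified entry by entry. In the sketch below the two tables for $S$ are stored as strings, one per row, and the relabeling $[k]_4\mapsto$ the $k$ -th letter is compared against addition and multiplication modulo 4. The names `SPADE`, `HEART`, `table_lookup`, and `f` are chosen for this illustration.

```python
ELEMENTS = "abcd"

# Rows of the two operation tables on S, in the order a, b, c, d.
SPADE = ["abcd", "bcda", "cdab", "dabc"]
HEART = ["aaaa", "abcd", "acac", "adcb"]

def table_lookup(table, x, y):
    """Entry of the given table in row x, column y."""
    return table[ELEMENTS.index(x)][ELEMENTS.index(y)]

def f(k):
    """The relabeling f([k]_4): 0 -> a, 1 -> b, 2 -> c, 3 -> d."""
    return ELEMENTS[k % 4]

preserves_addition = all(
    f(i + j) == table_lookup(SPADE, f(i), f(j)) for i in range(4) for j in range(4)
)
preserves_multiplication = all(
    f(i * j) == table_lookup(HEART, f(i), f(j)) for i in range(4) for j in range(4)
)
print(preserves_addition, preserves_multiplication, f(1))  # True True b
```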

Example 5.18. Consider the set $T = \{a,b,c,d\}$ and the binary operations $\oplus$ and $\otimes$ given by the following tables:

$\oplus$	a	b	c	d
a	a	b	c	d
b	b	a	d	c
c	c	d	a	b
d	d	c	b	a

$\otimes$	a	b	c	d
a	a	a	a	a
b	a	b	c	d
c	a	c	c	a
d	a	d	a	d

Let us take it on faith that $(T, \oplus, \otimes)$ is a ring. Let’s prove that $T$ is isomorphic to the ring ${\mathbb{Z}}_2 \times {\mathbb{Z}}_2$ . Inspecting the addition and multiplication tables for $T$ , we see that the zero element of $T$ is $a$ and the one element is $b$ . So any isomorphism $f: {\mathbb{Z}}_2 \times {\mathbb{Z}}_2 \to T$ would have to satisfy $f(([0]_2, [0]_2)) = a$ and $f(([1]_2, [1]_2)) = b$ , and since $f$ must be a bijection, there are two possibilities. In fact, both of them work. Let’s consider the function $f: {\mathbb{Z}}_2 \times {\mathbb{Z}}_2 \to T$ defined by $\begin{aligned} f(([0]_2, [0]_2)) & = a \\ f(([1]_2, [0]_2)) & = c \\ f(([0]_2, [1]_2)) & = d \\ f(([1]_2, [1]_2)) & = b. \\ \end{aligned}$ It is clearly a bijection. Checking the three required axioms is rather tedious since there are $16$ possible additions and $16$ possible multiplications, but it does all work out.

To see that $f$ preserves addition, for instance, we have $f( ([0]_2, [1]_2) + ([1]_2, [1]_2)) = f( ([1]_2, [0]_2)) = c$ and $f( ([0]_2, [1]_2)) \oplus f(([1]_2, [1]_2)) = d \oplus b = c$ . So $f$ preserves addition in this one case, and it also does in the other $15$ cases. To check this carefully we could write down the addition table for ${\mathbb{Z}}_2 \times {\mathbb{Z}}_2$ and see that $f$ transforms it into the table $\oplus$ , entry by entry.

Similarly, $f$ preserves multiplication. For instance, $f( ([0]_2, [1]_2) \cdot ([1]_2, [1]_2)) = f( ([0]_2, [1]_2)) = d$ and $f( ([0]_2, [1]_2)) \otimes f(([1]_2, [1]_2)) = d \otimes b = d$ . So $f$ preserves multiplication in this one case, and it also does in the other $15$ cases. To check this carefully we could write down the multiplication table for ${\mathbb{Z}}_2 \times {\mathbb{Z}}_2$ and see that $f$ transforms it into the table $\otimes$ , entry by entry.

Finally, $1_T = b$ and $1_{{\mathbb{Z}}_2 \times {\mathbb{Z}}_2} = ([1]_2, [1]_2)$ and $f( ([1]_2, [1]_2)) = b$ , so that the final axiom holds too.
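The 16 additions and 16 multiplications in Example 5.18 are tedious by hand but quick to check by machine. The sketch below encodes the tables for $\oplus$ and $\otimes$ as strings and represents $([x]_2,[y]_2)$ by the pair `(x, y)`; these encodings and the helper names are assumptions made for illustration.

```python
from itertools import product

ELEMENTS = "abcd"
OPLUS = ["abcd", "badc", "cdab", "dcba"]    # rows a, b, c, d of the addition table of T
OTIMES = ["aaaa", "abcd", "acca", "adad"]   # rows a, b, c, d of the multiplication table of T

def lookup(table, x, y):
    """Entry of the given table in row x, column y."""
    return table[ELEMENTS.index(x)][ELEMENTS.index(y)]

# The bijection f from the example, on pairs (x, y) representing ([x]_2, [y]_2).
f = {(0, 0): "a", (1, 0): "c", (0, 1): "d", (1, 1): "b"}

pairs = list(product(range(2), repeat=2))

def add(u, v):
    """Componentwise addition in Z_2 x Z_2."""
    return ((u[0] + v[0]) % 2, (u[1] + v[1]) % 2)

def mul(u, v):
    """Componentwise multiplication in Z_2 x Z_2."""
    return ((u[0] * v[0]) % 2, (u[1] * v[1]) % 2)

addition_ok = all(f[add(u, v)] == lookup(OPLUS, f[u], f[v]) for u in pairs for v in pairs)
multiplication_ok = all(f[mul(u, v)] == lookup(OTIMES, f[u], f[v]) for u in pairs for v in pairs)
identity_ok = f[(1, 1)] == "b"
print(addition_ok, multiplication_ok, identity_ok)  # True True True
```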

Example 5.19. Consider the ring $R = {\operatorname{Mat}}_{2\times 2}(\mathbb{R})$ of all $2 \times 2$ matrices with real entries. The subring $S = \left\lbrace \begin{bmatrix} a & 0 \\ 0 & a \end{bmatrix} \mid a \in \mathbb{Z} \right\rbrace$ is isomorphic to $\mathbb{Z}$ . The map $f\!: \mathbb{Z} \to S$ given by $f(n) = \begin{bmatrix} n & 0 \\ 0 & n \end{bmatrix}$ is an isomorphism. Indeed:

$f$ is bijective: If $f(a) = f(b)$ then $\begin{bmatrix} a & 0 \\ 0 & a \end{bmatrix} = \begin{bmatrix} b & 0 \\ 0 & b \end{bmatrix}$ and hence $a = b$ . This shows $f$ is one-to-one (injective). Given an arbitrary element $Y := \begin{bmatrix} a & 0 \\ 0 & a \end{bmatrix}$ of $S$ , we have $f(a) = Y$ , and thus $f$ is onto (surjective).
$f$ preserves addition: $f(a+b) = \begin{bmatrix} a+b & 0 \\ 0 & a+b \end{bmatrix} = \begin{bmatrix} a & 0 \\ 0 & a \end{bmatrix} + \begin{bmatrix} b & 0 \\ 0 & b \end{bmatrix} = f(a) + f(b)$ for all $a, b \in \mathbb{Z}$ .
$f$ preserves multiplication: $f(ab) = \begin{bmatrix} ab & 0 \\ 0 & ab \end{bmatrix} = \begin{bmatrix} a & 0 \\ 0 & a \end{bmatrix} \begin{bmatrix} b & 0 \\ 0 & b \end{bmatrix} = f(a) f(b)$ for all $a, b \in \mathbb{Z}$ .
$f$ preserves multiplicative identities: $f(1) = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$ .

Theorem 5.20. The rings ${\mathbb{Z}}_2 \times {\mathbb{Z}}_2$ and ${\mathbb{Z}}_4$ are not isomorphic.

We will give several proofs of this. The first one will use the following proposition:

Proposition 5.21. If $f\!: R \to S$ is an isomorphism and $r \in R$ is a unit, then $f(r)$ is also a unit. In particular, if $R$ is isomorphic to $S$ , then there is a bijection between the units of $R$ and the units of $S$ .

Proof. Suppose that $r \in R$ is a unit, and let $s \in R$ be the inverse of $r$ . Then $f(r) f(s) = f(rs) = f(1_R) = 1_S \quad \textrm{ and } \quad f(s) f(r) = f(sr) = f(1_R) = 1_S.$ Thus $f(s)$ is a multiplicative inverse of $f(r)$ , so $f(r)$ is a unit in $S$ .

This proves the first claim, which can be interpreted as saying that $f$ induces a function $f': \{r \in R | \text{ $r$ is a unit in $R$ }\} \to \{s \in S | \text{ $s$ is a unit in $S$ }\}$ given by $f'(r) = f(r)$ . (I am calling it $f'$ since it has a different domain and codomain than $f$ , but it’s given by the same rule.) Since $f$ is one-to-one (injective) so is $f'$ . It is not obvious that $f'$ is onto, however. So, pick $s \in S$ such that $s$ is a unit. So there is an element $y$ in $S$ such that $sy = 1_S = ys$ . Since $f$ is onto (surjective), $s = f(r)$ for some $r \in R$ and $y = f(x)$ for some $x \in R$ . I claim $r$ is also a unit. We have $f(rx) = f(r) f(x) = s y = 1_S$ . On the other hand $f(1_R) = 1_S$ and thus $f(1_R) = f(rx)$ . But $f$ is one-to-one and hence we must have $rx = 1_R$ . A nearly identical argument shows that $xr = 1_R$ . This proves $r$ is a unit and hence it belongs to the domain of $f'$ . This proves $f'$ is onto (surjective). ◻

Now we can give a first proof of Theorem 5.20:

First proof of Theorem 5.20. The ring ${\mathbb{Z}}_4$ has two units, $[1]_4$ and $[3]_4$ , while ${\mathbb{Z}}_2 \times {\mathbb{Z}}_2$ only has one unit, $([1]_2,[1]_2)$ . Therefore, the two rings cannot be isomorphic. ◻

Proposition 5.22. If $f: R \to S$ is an isomorphism and $r \in R$ is a zerodivisor, then $f(r)$ is also a zerodivisor. In particular, if $R$ and $S$ are isomorphic, then there is a bijection between the zerodivisors of $R$ and the zerodivisors of $S$ .

Proof. Assume $r \in R$ is a zerodivisor and $f: R \to S$ is an isomorphism of rings. By definition, $r \ne 0_R$ and there is an element $y \in R$ such that $y \ne 0_R$ and $ry = 0_R$ . Since $f$ preserves multiplication, $f(r) f(y) = f(ry) = f(0_R)$ and by Lemma [lem1] $f(0_R) = 0_S$ . Thus $f(r) f(y) = 0_S$ . Since $f$ is one-to-one, $f(0_R) = 0_S$ , and $r \ne 0_R$ , it follows that $f(r) \ne 0_S$ . Similarly, $f(y) \ne 0_S$ . This proves that $f(r)$ is a zerodivisor in $S$ .

This proves the first claim, which can be interpreted as saying that $f$ induces a function $f': \{r \in R | \text{ $r$ is a zerodivisor in $R$ }\} \to \{s \in S | \text{ $s$ is a zerodivisor in $S$ }\}$ given by $f'(r) = f(r)$ . (I am calling it $f'$ since it has a different domain and codomain than $f$ , but it’s given by the same rule.) Since $f$ is injective so is $f'$ . It is not obvious that $f'$ is onto, however. So, pick $s \in S$ such that $s$ is a zerodivisor. So there is an element $y \neq 0_S$ in $S$ such that $sy = 0_S$ . Since $f$ is surjective, $s = f(r)$ for some $r \in R$ and $y = f(x)$ for some $x \in R$ . I claim $r$ is also a zerodivisor. We have $f(rx) = f(r) f(x) = s y = 0_S$ . On the other hand, $f(0_R) = 0_S$ and thus $f(0_R) = f(rx)$ . But $f$ is injective and hence we must have $rx = 0_R$ . Moreover, since $y \neq 0_S$ and $s \neq 0_S$ , and since $f(0_R) = 0_S$ , we must have $x \neq 0_R$ and $r \neq 0_R$ . This proves $r$ is a zerodivisor and hence it belongs to the domain of $f'$ . This proves $f'$ is onto (surjective). ◻

And this allows us to give another proof of Theorem 5.20.

Second proof of Theorem 5.20. The ring ${\mathbb{Z}}_4$ has one zerodivisor, $[2]_4$ , while ${\mathbb{Z}}_2 \times {\mathbb{Z}}_2$ has $2$ zerodivisors, $([1]_2,[0]_2)$ and $([0]_2,[1]_2)$ . Therefore, the two rings cannot be isomorphic. ◻
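The counts used in both proofs of Theorem 5.20 can be reproduced by exhaustive search. The sketch below lists the units and zerodivisors of a finite commutative ring given its elements and multiplication; the function name and the representation of ring elements as integers or pairs are assumptions made for illustration.

```python
from itertools import product

def units_and_zero_divisors(elements, mul, zero, one):
    """Return (units, zero divisors) of a finite commutative ring given by its
    element list and multiplication function."""
    units = [r for r in elements if any(mul(r, s) == one for s in elements)]
    zero_divisors = [
        r for r in elements
        if r != zero and any(s != zero and mul(r, s) == zero for s in elements)
    ]
    return units, zero_divisors

# Z_4, with [k]_4 represented by k.
z4 = list(range(4))
print(units_and_zero_divisors(z4, lambda a, b: (a * b) % 4, 0, 1))
# ([1, 3], [2])

# Z_2 x Z_2, with ([x]_2, [y]_2) represented by the pair (x, y).
z2xz2 = list(product(range(2), repeat=2))
pair_mul = lambda u, v: ((u[0] * v[0]) % 2, (u[1] * v[1]) % 2)
print(units_and_zero_divisors(z2xz2, pair_mul, (0, 0), (1, 1)))
# ([(1, 1)], [(0, 1), (1, 0)])
```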

5.3 Homomorphisms

Definition 5.23. A ring homomorphism between two rings $R$ and $S$ is a function $\phi: R \to S$ that satisfies the following properties:

$\phi(x + y) = \phi(x) + \phi(y)$ for all $x, y \in R$ . (“ $\phi$ preserves addition.”)
$\phi(x \cdot y) = \phi(x) \cdot \phi(y)$ for all $x, y \in R$ . (“ $\phi$ preserves multiplication.”)
$\phi(1_R) = 1_S$ . (“ $\phi$ preserves multiplicative identities.”)

Note that a ring isomorphism is a ring homomorphism that is both surjective (onto) and injective (one-to-one).

Proposition 5.24 (Properties of homomorphisms). Let $\phi: S \rightarrow T$ be a homomorphism of rings. Then

$\phi(0_S) = 0_T$ .
For all $x \in S$ , $- \phi(x) = \phi(-x)$ . (A ring homomorphism preserves additive inverses.)
If $u \in S$ is a unit, then $\phi(u) \in T$ is a unit.

Proof. $\,$

We have $\phi(0_S + 0_S) = \phi(0_S) + \phi(0_S)$ and $\phi(0_S + 0_S) = \phi(0_S)$ , so that $\phi(0_S) + \phi(0_S) = \phi(0_S)$ . Now add $-\phi(0_S)$ to both sides to conclude that $\phi(0_S) = 0_T$ .
We have $\phi(-x) + \phi(x) = \phi(-x + x) = \phi(0_S) = 0_T$ (using the first result), and this proves $\phi(-x)$ is the additive inverse of $\phi(x)$ .
If $u$ is a unit in $S$ , then there is a $v \in S$ such that $uv = 1_S = vu$ . It follows that $\phi(u) \phi(v) = \phi(uv) = \phi(1_S) = 1_T$ and similarly $\phi(v) \phi(u) = \phi(vu) = \phi(1_S) = 1_T$ . This shows that $\phi(v)$ is the multiplicative inverse of $\phi(u)$ and hence $\phi(u)$ is a unit.

◻
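As an illustration, the reduction map ${\mathbb{Z}}\to{\mathbb{Z}}_N$ , $x\mapsto [x]_N$ , is a standard example of a ring homomorphism that is not injective. The sketch below checks the three defining properties and the three consequences from Proposition 5.24 on a finite sample of integers. The modulus $N=12$ and the sample range are arbitrary choices, and a finite sample is only a spot check.

```python
N = 12  # an arbitrary modulus chosen for this illustration

def phi(x):
    """Reduction modulo N: phi(x) = [x]_N, represented by an integer in 0..N-1."""
    return x % N

sample = range(-20, 21)

# The three defining properties of a ring homomorphism.
preserves_add = all(phi(x + y) == (phi(x) + phi(y)) % N for x in sample for y in sample)
preserves_mul = all(phi(x * y) == (phi(x) * phi(y)) % N for x in sample for y in sample)
preserves_one = phi(1) == 1

# The three consequences listed in Proposition 5.24.
sends_zero_to_zero = phi(0) == 0
preserves_negatives = all(phi(-x) == (-phi(x)) % N for x in sample)
# The units of Z are 1 and -1; their images must be units of Z_N.
images_are_units = all(
    any((phi(u) * s) % N == 1 for s in range(N)) for u in (1, -1)
)

print(preserves_add, preserves_mul, preserves_one)
print(sends_zero_to_zero, preserves_negatives, images_are_units)
```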

6 RSA Cryptography

6.1 The mathematics behind the RSA

There are two central mathematical principles behind RSA public key cryptography.

May 8, 2023

Lemma 6.1. If $p$ and $q$ are positive primes such that $p\neq q$ and $a$ is an integer which satisfies $p\mid a$ and $q\mid a$ , then $(pq)\mid a$ .

Proof. Suppose that $a,p,q$ are integers such that $p\mid a$ and $q\mid a$ and $p$ and $q$ are distinct positive primes. By definition of divides, $a=pk$ for some $k\in {\mathbb{Z}}$ . Since $q\mid a$ we have $q\mid (pk)$ and since $q$ is a prime we deduce that $q\mid p$ or $q\mid k$ by 1.49. If $q\mid p$ then since $p$ and $q$ are prime we have $q=\pm p$ and since $p,q$ are both positive it must be that $q=p$ . This is a contradiction, so instead $q\mid k$ must be true. Thus $k=q\ell$ for some $\ell\in {\mathbb{Z}}$ by definition of divides.

Since $a=pk=pq\ell$ with $\ell\in {\mathbb{Z}}$ , this shows that $(pq)\mid a$ . ◻

Theorem 6.2. If $p,q$ are distinct positive primes, $de \equiv 1 \pmod{(p-1)(q-1)}$ and $b$ is an integer, then $b^{de}\equiv b \pmod{pq}.$

Proof. First, we will show that if $c \equiv 1 \mod {(p-1)(q-1)}$ , then $b^{c} \equiv b \mod {pq}$ ; applying this to $c = de$ will give that $b^{de} \equiv b \mod {pq}$ .

So let $c \equiv 1 \mod {(p-1)(q-1)}$ , so that $c = 1 + (p-1)(q-1)k$ for some $k \in {\mathbb{Z}}$ .

Case 1: if $p$ does not divide $b$ , using Fermat’s Little Theorem 5.15 we have

$b^c = b^{1 + (p-1)(q-1)k} = b \left( b^{p-1} \right)^{(q-1)k} \equiv b \cdot 1^{(q-1)k} \equiv b \mod p.$
Case 2: if $p\mid b$ then $b\equiv 0 \pmod {p}$ and also $b^c\equiv 0 \pmod p$ so again we have $b^c\equiv b\pmod p$ .

Similarly, modulo $q$ we have

Case 1: if $q$ does not divide $b$ , using Fermat’s Little Theorem 5.15

$b^c = b^{1 + (p-1)(q-1)k} = b \left( b^{q-1} \right)^{(p-1)k} \equiv b \cdot 1^{(p-1)k} \equiv b \mod q.$
Case 2: if $q\mid b$ then $b\equiv 0 \pmod {q}$ and also $b^c\equiv 0 \pmod q$ so again we have $b^c\equiv b\pmod q$ .

By the definition of congruence we have proved that $p\mid(b^c-b) \text{ and } q \mid(b^c-b).$ Since $p\neq q$ are positive primes it follows by Lemma [lem:last] that $n=pq\mid (b^c-b)$ , that is, $b^c\equiv b\pmod{n}$ . Applying this to $c = de$ yields $b^{de}\equiv b\pmod{n}$ .

◻
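Theorem 6.2 can be checked exhaustively for small primes. The sketch below uses the values $p=29$ , $q=101$ , $e=17$ , $d=1153$ that appear in the worked example of Section 6.2 and tests every residue $b$ modulo $pq$ , including those divisible by $p$ or $q$ .

```python
p, q = 29, 101
n = p * q
e, d = 17, 1153

# d*e leaves remainder 1 on division by (p-1)(q-1), as the theorem requires.
print((d * e) % ((p - 1) * (q - 1)))  # 1

# Check b^(de) = b (mod pq) for every residue b, including multiples of p or q.
holds_for_all = all(pow(b, d * e, n) == b for b in range(n))
print(holds_for_all)  # True
```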

6.2 How RSA works

We can replace any letter in the alphabet by a number with two digits as follows:

A	B	C	D	E	F	G	H	I	J	K	L	M
01	02	03	04	05	06	07	08	09	10	11	12	13

N	O	P	Q	R	S	T	U	V	W	X	Y	Z
14	15	16	17	18	19	20	21	22	23	24	25	26

We can then represent words by numbers: HELLO=0805121215.
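The letter-to-number table is easy to automate. The following sketch assumes the message consists only of the letters A to Z; the function names `encode` and `decode` are chosen for this illustration.

```python
def encode(word):
    """Replace each letter A..Z by its two-digit position in the alphabet."""
    return "".join(f"{ord(ch) - ord('A') + 1:02d}" for ch in word.upper())

def decode(digits):
    """Inverse of encode: read the digit string two characters at a time."""
    return "".join(
        chr(int(digits[i:i + 2]) + ord('A') - 1) for i in range(0, len(digits), 2)
    )

print(encode("HELLO"))       # 0805121215
print(decode("0805121215"))  # HELLO
```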

Secret: Bob picks two distinct primes $p,q$ and computes $n=pq$ (sometimes $n$ itself is taken to be a prime; that works, but is not necessary).	$p=29, q=101$ , $n=2929$
Secret: Bob chooses an encrypting exponent $e$ that is coprime with $\phi(n)=\phi(pq)=(p-1)(q-1)$ .	$e=17$
Public: Bob announces $n$ and $e$ . This is Bob’s public key.	$n=2929, e=17$
Secret: Alice writes her message as a sequence of numbers. If $n$ is an $(r+1)$ -digit number she will split her message into $r$ -digit blocks. Her message will look like $b_1 \quad b_2 \quad \ldots \quad b_m$ .	HELLO: 0 805 121 215
Secret: Alice computes $b_1^e \pmod n, b_2^e \pmod n, \ldots, b_m^e\pmod n$ .	$0^{17}=0$ , $805^{17}\equiv 962 \pmod{2929}$ , $121^{17}\equiv 1053 \pmod{2929}$ , $215^{17}\equiv 1491 \pmod{2929}$
Public: Alice sends the results of her computations to Bob. This is the encrypted message.	0 0962 1053 1491
Secret: Bob computes $d\equiv e^{-1} \pmod {(p-1)(q-1)}$ using the Euclidean algorithm.	$d=1153$
Bob computes $(b_1^e)^d \pmod n, (b_2^e)^d \pmod n, \ldots, (b_m^e)^d\pmod n$ , recovering the original message $b_1 \quad b_2 \quad \ldots \quad b_m$ . This is the decryption phase.	$962^{1153}\equiv 805 \pmod{2929}$ , $1053^{1153}\equiv 121 \pmod{2929}$ , $1491^{1153}\equiv 215 \pmod{2929}$
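The steps in the table above can be reproduced with a few lines of Python. This is a toy sketch with the small primes from the example, not a secure implementation. It uses Python's built-in three-argument `pow` for modular exponentiation and, instead of carrying out the Euclidean algorithm by hand, `pow(e, -1, phi_n)` for the modular inverse (available in Python 3.8 and later). The helper `split_blocks` is a name introduced here.

```python
p, q, e = 29, 101, 17
n = p * q                      # 2929, Bob's public modulus
phi_n = (p - 1) * (q - 1)      # 2800, kept secret

# Bob's decrypting exponent: the inverse of e modulo (p-1)(q-1).
d = pow(e, -1, phi_n)          # requires Python 3.8+

def split_blocks(digits, size):
    """Split a digit string into blocks of `size` digits, counting from the right."""
    first = len(digits) % size
    blocks = [digits[:first]] if first else []
    blocks += [digits[i:i + size] for i in range(first, len(digits), size)]
    return [int(block) for block in blocks]

message = "0805121215"                          # HELLO
blocks = split_blocks(message, len(str(n)) - 1)
encrypted = [pow(b, e, n) for b in blocks]      # Alice's computation
decrypted = [pow(c, d, n) for c in encrypted]   # Bob's computation

print(d)          # 1153
print(blocks)     # [0, 805, 121, 215]
print(encrypted)  # [0, 962, 1053, 1491]
print(decrypted)  # [0, 805, 121, 215]
```

Decryption recovers the original blocks because of Theorem 6.2 applied with $b=b_i$ .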

Why is this system secure? The only known practical way to find $d$ , given $e$ and $n$ , is by knowing the value of $(p-1)(q-1)$ . If one knows the value of $(p-1)(q-1)$ , it is easy and fast (Euclidean algorithm) to find $d$ ; without knowing the value of $(p-1)(q-1)$ there is no known practical way to find $d$ . Now, to find the value of $(p-1)(q-1)$ one must know the values of $p$ and $q$ . Recall $n = p \cdot q$ and that $n$ is PUBLIC. So, the only way to “crack” the code is to find the prime factorization of $n$ . There is no (known) method of factoring huge integers in a practical amount of time!
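To see why the primes must be huge, here is a sketch of the attack on the toy key $n=2929$ , $e=17$ : factor $n$ by trial division, compute $(p-1)(q-1)$ , and recover $d$ with the extended Euclidean algorithm. The function names are hypothetical. Trial division takes on the order of $\sqrt{n}$ steps, which is instant here but hopeless for a modulus with hundreds of digits.

```python
def smallest_prime_factor(n):
    """Trial division: return the smallest prime factor of n >= 2."""
    k = 2
    while k * k <= n:
        if n % k == 0:
            return k
        k += 1
    return n

def extended_gcd(a, b):
    """Return (g, x, y) with g = gcd(a, b) and a*x + b*y = g."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def crack(n, e):
    """Recover (p, q, d) from the public key (n, e) by factoring n.
    Feasible only for tiny n."""
    p = smallest_prime_factor(n)
    q = n // p
    phi_n = (p - 1) * (q - 1)
    g, x, _ = extended_gcd(e, phi_n)
    assert g == 1, "e must be coprime to (p-1)(q-1)"
    return p, q, x % phi_n

print(crack(2929, 17))  # (29, 101, 1153)
```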

This fancy-sounding principle is an axiom, and like most axioms, this is formalized common sense.