Intro to Modern Algebra

Math 310
University of Nebraska–Lincoln

In this course we will study proof writing based on abstract algebra.

Algebra comes from the arabic word “al-jabr" which means reunion of broken parts. In mathematics this word is used for naming a broad area of study which deals with manipulating abstract symbols and objects in a similar manner to that in which we manipulate numbers. Therefore, we shall start by studying the properties of numbers (specifically integers) and later we will build different algebraic structures which obey the same properties.

At the same time we will learn how to write mathematical proofs. This is a skill that takes some amount of patience to master. You might be surprised by the fact that advanced mathematics and in particular proof writing does not consist of a long string of formulas. Instead it consists of many words with some formulas sprinkled throughout. As is the case with a good essay, you might have to go through several attempts and subsequent drafts to write a really good proof.

One very important word of advice about proofs: in this course we will be developing the material from the ground up. This means that when you write a proof you should be able to justify every step relying only on facts previously mentioned in this class and not on any other knowledge.

1 The Integers

1.1 Properties of integers

What is a number? Certainly the things used to count sheep, money, etc. are numbers: 1,2,3,1, 2, 3, \dots.

We will call these the positive whole mumbers natural numbers and write {\mathbb{N}} for the set of all natural numbers ={1,2,3,4,,}.{\mathbb{N}} = \{1, 2, 3, 4, \dots, \}. What you see above is set notation which one reads as “ the set of natural numbers, {\mathbb{N}}, consists of 1,2,3,4, etc”.

Since we like to keep track of debts too, we’ll allow negatives and 00, which gives us the the integers. The word integer refers to a whole number, either negative, positive, or zero. So, the set of all integers is: ={,3,2,1,0,1,2,3,4,}.{\mathbb{Z}} = \{ \dots, -3, -2, -1, 0, 1, 2, 3, 4, \dots\}. The symbol {\mathbb{Z}} is used for the set of integers since the German word for number is Zahlen.

From now on, for brevity, we write aa \in {\mathbb{Z}} to mean aa is an integer’. The symbol \in means “in”, as in “belongs to” or “is an element of", so aa \in {\mathbb{Z}} reads “aa is an element of the set of all integers”.

Integers have many nice mathematical properties one could ask for: they have operations of addition, subtraction, and multiplication. These operations follow several rules which we will take for granted. We call these facts that we take for granted “axioms".

Axioms 1.1 (Properties of integers).

  1. Closure under addition: If a,ba,b are integers then a+ba+b is an integer.

  2. Associativity of addition: If a,b,ca,b,c are integers then (a+b)+c=a+(b+c)(a+b)+c=a+(b+c).

  3. Commutativity of addition: If a,ba,b are integers then a+b=b+aa+b=b+a.

  4. Additive identity: there exist an integer 00 so that if aa is an integer then a+0=0+a=aa+0=0+a=a.

  5. Additive inverses: If aa is an integer then there exists an integer a-a so that a+(a)=(a)+a=0a+(-a)=(-a)+a=0.

  6. Closure under multiplication: If a,ba,b are integers then abab is an integer.

  7. Associativity of multiplication: If a,b,ca,b,c are integers then (ab)c=a(bc)(ab)c=a(bc).

  8. Commutativity of multiplication: If a,ba,b are integers then ab=baab=ba.

  9. Multiplicative identity: There exists an integer 11 so that if aa is an integer then a1=1a=aa\cdot 1=1\cdot a=a.

  10. Distibutivity: If a,b,ca,b,c are integers then a(b+c)=(b+c)a=ab+aca(b+c)=(b+c)a=ab+ac.

Remark 1.1. If we consider instead the set of natural numbers ={1,2,3,}{\mathbb{N}}=\{1,2,3,\ldots\} we see that while most of the above listed axioms continue to hold for the natural numbers, two of them do no. Indeed the additive identity axiom is not satisfied since 00 is not a natural number. Moreover the additive inverses axiom is not satisfied since if aa is a natural number then a-a is no longer a natural number.

1.2 Divisibility and proof techniques

Multiplication produces a relationship which we call divisibility among integers.

Definition 1.2. An integer aa divides an integer bb if there exists an integer nn such that b=anb = a \cdot n.

Notation 1.1. We write aba\mid b for aa divides bb. This should not be confused with the fraction a/ba/b which indicates division.

Example 1.3.

Definition 1.4. An integer mm is even if 22 divides mm. In other words, mm is even provided there exists an integer nn such that m=2nm = 2 \cdot n.

Definition 1.5. An integer mm is odd provided it is not even.

Let us now work out our first mathematical proof!

We will prove the assertion "The sum of two even integers is even." It helps to first rewrite this in if-then form as “ If aa and bb are even integers then a+ba+b is an even integer."

The proof below is an example of a proof technique called direct proof, which works best for statements of the form “If P, then Q." A direct proof for such a statement starts with "Suppose P". Then several steps may follow that use axioms, previously proven facts, and formula manipulations. The proof ends when through these deductions you have reached the conclusion that Q is true. This is marked by the proof end symbol whiich looks like a square.

Proposition 1.6. If aa and bb are even integers then a+ba+b is an even integer.

Proof. Suppose aa and bb are both even integers. By definition of even this means there exists an integer xx such that a=2xa = 2x and there exists an integer yy such that and b=2yb = 2y. It follows that a+b=2x+2y=2(x+y) by distributivity.a + b = 2x + 2y = 2(x+y) \qquad \text{ by distributivity}. Since xx and yy are integers, x+yx+y is an integer by closure of the integers under addition. Since we have found an integer, x+yx+y, such that a+b=2((x+y)a + b =2((x+y), by definition 22 divides a+ba+b, so a+ba+b is even. ◻

Proposition 1.7. If a,b,c,m,na, b,c,m,n are integers such that aa divides bb and aa divides cc, then aa divides bn+cmb \cdot n + c \cdot m.

Proof. Suppose aa divides bb and aa divides cc. By definition, this means that there exists an integer xx so that b=axb = ax and there exists an integer yy such that c=ayc = ay. We have bm+cn=axm+ayn=a(xm+yn).b\cdot m+ c \cdot n = axm + ayn = a(xm+yn). Since x,y,m,nx,y,m,n are integers, xm+ynxm + yn is also an integer by closure of the integers under addition and mutiplication. Since bm+cn=a(xm+yn)b\cdot m+ c \cdot n =a(xm+yn) and xm+ynxm + yn is an integer, aa divides bm+cnbm + cn by definition. ◻

Let’s now prove that the sum of an odd integer and an even integer is odd.

We will do so by contradiction. To understand this proof technique consider the following scenario. Suppose I wanted to prove that it is not night outside. If I can look out the window and see the sun then it cannot be night, so it must be daytime. Here seeing the sun contradicts that it is night, so it allows to deduce that it is day.

The proof below is an example of the proof technique called proof by contradiction, which works best for statements of the form “If P, then Q." where QQ is a negative statement. In the proof below QQ will be the statement a+ba+b is odd, which is the same as "a+ba+b is not even". Not QQ is the statement "a+ba+b is even".

A proof by contradiction for such a statement starts with "Suppose P. Assume (towards a contradiction) not Q." Then several steps may follow that use axioms, previously proven facts, and formula manipulations. The proof ends when through these deductions you have reached a contradiction. At this point we say “Since we have reached a contradiction the assumption not QQ must be false, thus QQ is true.

Proposition 1.8. If aa is an even integer and bb is an odd integers, then a+ba+b is odd.

Proof. Suppose aa is an odd integer and bb is an even integers and assume that a+ba+b is even. Then 2(a+b)2\mid (a+b) and 2a2\mid a by definition of even. By definition of divisibility, there exit integers k,lk, l so that a+b=2ka+b=2k and a=2la=2l.

We have b=(a+b)a=2k2l=2(kl).b=(a+b)-a=2k-2l=2(k-l). Since ll is an integer, so is its additive inverse l-l. Since kk and l-l are integers, kl=k+(l)k-l=k+(-l) is an integer by closure under addition. Since b=2(kl)b=2(k-l) and klk-l is an integer, 22 divides bb, so bb is even by definition. This contradicts that given fact that bb is an odd integer.

Since we have reached a contradiction, the assumption that a+ba+b is even must be false, thus a+ba+b is odd. ◻

1.3 Sets

Sets are just collections of objects. the objects in a set are called elements of that set.

Notation 1.2. We write xAx\in A to mean that xx is an element of a set AA.

Example 1.9. Writing A={1,2,3}A=\{1,2,3\} means AA is the set with elements 1,2, and 3.

Example 1.10. The empty set, denoted \emptyset is the set with no elements.

Notation 1.3. We can describe the elements of a set by writing down the properties they satisfy. This is called set-builder notation.

Example 1.11. For example, the set of all even integers is E={aa,a is even}E=\{a \mid a\in {\mathbb{Z}}, a\text{ is even} \} and the set of all odd integers is O={aa,a is odd}.O=\{a \mid a\in {\mathbb{Z}}, a\text{ is odd} \}. The vertical line in set builder notation is read "such that". It does not mean divides, although it is the same symbol.

New sets can be obtained from old by means of several set operations.

Definition 1.12. Given two sets AA and BB, we can form new sets by

\bullet union AB={x|xA or xB}A\cup B=\{x \ \vert \ x\in A \text{ or } x\in B\}
\bullet intersection AB={x|xA and xB}A\cap B=\{x \ \vert \ x\in A \text{ and } x\in B\}
\bullet set difference A\B={x|xA and xB}A\setminus B=\{x \ \vert \ x\in A \text{ and } x\not\in B\}

Example 1.13. What are EO,EO,\E,\OE\cup O, E\cap O, {\mathbb{Z}}\setminus E, {\mathbb{Z}}\setminus O?

EO=EO=\E=O\O=E.\begin{matrix} E\cup O ={\mathbb{Z}} & E\cap O=\emptyset & {\mathbb{Z}}\setminus E=O &{\mathbb{Z}}\setminus O=E. \end{matrix}

Example 1.14. Let aa and bb be integers and let C={xx divides both a and b}C=\{x\in{\mathbb{Z}} \mid x \text{ divides both } a \text{ and } b\}.

1.4 Logic

We can form compound statements as listed below:

Example 1.15. Which of the following statements are true?

  1. “If pigs could fly then they would be purple.” True since P=“pigs could fly" is false.

  2. “If 1212 is even, then 55 divides 1212.” False since P=“12 is even" is true but Q=“5125\mid 12 is false.

  3. “For an integer nn, if nn is even then 2-2 divides nn”. True. When P=“nn is even" is true then Q=“2-2 divides nn" is true.

The following negation rules are useful. Note that negation takes and to or and vice-versa.

De Morgan’s Laws for negation:

not(P and Q)=(not P) or (not Q)
not(P or Q)=(not P) and (not Q)

We summarize the rules of logic above in truth tables where TT stands for true and FF stands for false.

P Q P and Q P or Q not P PQP\rightarrow Q (not P) \rightarrow Q (not Q)\rightarrow (not P)
T T T T F T T T
T F F T F F T F
F T F T T T T T
F F F F T T F T

Two statements are logically equivalent if they have the same truth value for all possible truth values of the inputs P,QP, Q. If statements SS and TT are logically equivalent and we want to prove SS is true, we can prove TT is true instead!

Remark 1.16. We see from the table that the following pairs of statements are logically equivalent since they have the same truth values (the columns that correspond to the two statements have the same entries):

Contrapositives and converses

Definition 1.17. The contrapositive of the if-then statement “If PP then QQ” is “If not QQ then not PP”.

By the second bullet point in 1.16 the contrapositive is logically equivalent to the original statement. Sometimes when proving an if-then statement, it works a bit better to give a proof of the contrapositive. If you can prove the contrapositive is true, it follows automatically that the original statement is also true because of logical equivalence.

Example 1.18. Consider the statement “For an integer aa, if a2a^2 is odd, then aa is odd."

If we want to prove this statement we can instead prove the contrapositive “For an integer aa, if aa is even, then a2a^2 is even." This latter statement is easier to prove.

Definition 1.19. The converse of the if-then statement “If PP then QQ” is “If QQ then PP”.

Unlike the contrapositive, the converse of an if-then statement is not logically equivalent to the statement itself.

Example 1.20. Consider the statement: “If Math 310 is in session, the it is daytime." This statement is true. The converse is the statement “If it is daytime, then Math 310 is in session." This statement is false (there is no Math 310 in session at 8am). We see thus that the original statement is not logically equivalent to its converse since they don’t have the same truth value.

1.4.1 Quantifiers

In mathematics we use two quantifiers

An important difference

quantifier prove disprove
\forall proof counterexample
\exists example proof

Example 1.21. Consider the following statement: “The sum of two odd integers is odd." Note that the statement involves implicit universal quantifiers. It implicitly states: “For all odd integers aa and bb the sum a+ba+b is odd." According to the table above to disprove this statement it suffices to give a counterexample a=1,b=1a=1, b=1 are two odd integers such that a+b=2a+b=2 is even.

Negating statements containing quantifiers:

Example 1.22. Negate the following statements:

1.5 The division algorithm

First we look at a special property of the non-negative integers.

Axiom 1.23 (The Well-Ordering Principle). 1 Every non-empty set of non-negative integers has a smallest element.

Understanding the Well-Ordering Principle.

Example 1.24. For example, the set {3,7,9,21}\{3, 7, 9, 21\} is a non-empty set of non-negative integers, and so the Well-Ordering Principle tells us that it has a smallest element. In this example, it’s evident: 33.

This also holds for infinite sets: Let SS be the set of all strictly positive multiples of 77. It has a smallest elements (namely, 77). The principle also applies even when it is not obvious what the smallest element is. For instance, consider the height measured in inches (and rounded down to the nearest whole number) of all giraffes on planet Earth. That is, let S={mthere exists a giraffe of height m inches (rounded down)}.S = \{m \mid \text{there exists a giraffe of height $m$ inches (rounded down)}\}. This set must have a smallest number (there is a shortest giraffe — poor guy!).

Remark 1.25. The word “non-empty” is crucial. The statement “Every set of non-negative integers has a smallest element” is false, since the empty-set (which is considered a subset of the set of non-negative integers — the empty set is a subset of every set) has no smallest element, since it doesn’t have any elements at all.

Remark 1.26. The word “non-negative” is also important. For instance, the set of all negative multiples of 77, namely {7,14,21,}\{-7, -14,-21, \dots \} is a non-empty set of integers, but it has no smallest element.

Let us now discuss the so-called “division algorithm”:

Theorem 1.27 (The Division Algorithm Theorem). Let n,dn, d \in \mathbb Z with d>0d>0. There exists unique q,rq, r \in \mathbb Z such that n=qd+r and 0r<d.n = q d + r \,\,\,\,\,\,\,\,\,\,\,\,\, {\text{ and }} \,\,\,\,\,\,\, 0 \leqslant r < d.

Example 1.28. If we take n=25n = 25 and d=4d = 4 in the statement of the Division Algorithm, then the values of qq and rr that make the conclusion hold true are q=6q = 6 and r=1r = 1: 25=64+125 = 6 \cdot 4 + 1 and 01<40 \leqslant 1 < 4.

Let us check that 66 and 11 are the only values that work: Suppose 25=q4+r25 = q \cdot 4 + r and 0r<40 \leqslant r < 4. So rr is one of 0,1,20,1,2, or 33. If r=0r = 0, then 25=q425 = q \cdot 4, but 44 does not divide 2525, and so r=0r = 0 is not possible. If r=2r=2, then 25=q4+225 = q \cdot 4 + 2 and thus 23=q423 = q \cdot 4. But 44 does not divide 2323, and so r=2r = 2 is not possible. If r=3r = 3, then 25=q4+325 = q \cdot 4 + 3 and thus 22=q422 = q \cdot 4. But 44 does not divide 2323, and so r=3r = 3 is not possible. We thus must have that r=1r = 1. Then 25=q4+125 = q \cdot 4 + 1 and hence 24=q424 = q \cdot 4, so that qq must be 66.

Example 1.29. If we take n=25n = -25 and d=4d = 4 in the statement of the Division Algorithm, then the values of qq and rr that make the conclusion hold true are q=7q = -7 and r=3r = 3: 25=74+3-25 = -7 \cdot 4 + 3 and 03<40 \leqslant 3 < 4.

Remark 1.30. If we were to allow dd to be negative, the conclusion of the Division Algorithm Theorem would fail to be true: note that it is impossible for any integer rr to satisfy 0r<d0 \leqslant r < d when d<0d < 0, so no such integers qq and rr would exist in this case.

Remark 1.31. If we were to drop the phrase “0r<d0 \leqslant r < d" from the Division Algorithm Theorem, it would also become false, since the uniqueness portion would fail: For instance, if n=25n = 25 and d=4d = 4, then 25=64+125 = 6 \cdot 4 + 1 but also 25=54̇+525 = 5 \dot 4 + 5, so without the clause “0r<d0 \leqslant r < d", there would not be a unique pair of integers qq and rr that satisfy the requirements.

The Division Algorithm Theorem has two parts: an existence statement and an uniqueness statement. Existence says that there is at least one way of writing n=qd+rn = q d + r with 0r<d0 \leqslant r < d. The uniqueness part says there is at most one way of doing so. In other words, the uniqueness clause asserts that if we have n=qd+rn = q d + r with 0r<d0 \leqslant r < d and we also have n=qd+rn = q' d + r' with 0r<d0 \leqslant r' < d, then it must be that q=qq = q' and r=rr = r'.

Our next goal is to prove the existence portion of the division algorithm. Let’s recall what this says exactly:

Theorem 1.32 (Existence portion of the Division Algorithm).

Let n,dn, d be integers with d>0d > 0. There exist integers qq and rr such that n=qd+rn = qd + r and 0r<d0 \leqslant r < d.

Before giving the formal proof, let’s think intuitively for a bit. If I want to divide 3737 by 55, for instance, leaving a remainder rr in the range 0r<50 \leqslant r < 5, we could proceed by subtracting off 55’s from 3737 repeatedly until it is no longer possible to do so without going into negative territory. That is, the remainder is what’s left over after I subtract off as many 55’s as a I can from 3737 — in this example the remainder is 22.

In general, given nn and d>0d > 0, we want to subtract from nn as many copies of dd as possible. (Things are more complicated when nn is negative, but don’t worry about this for now.) This motivates the definition of the set 𝒮:={ndx|xandndx0}.\mathcal S := \{n - d x\,\, | \,\, x \in \mathbb Z\,\,\, {\text{and}} \,\,\, n-dx \geqslant 0\}. So, an element yy belongs to 𝒮\mathcal S if (1) yy a positive integer and (2) yy is the result of subtracting from nn some number of copies of dd.

For instance if n=17n = 17 and d=5d = 5 then 𝒮={2,7,12,17,}\mathcal S = \{2, 7, 12, 17, \dots\}. Note that in this example the smallest element of 𝒮\mathcal S is 22, which is indeed the remainder of division of 1717 by 55. We can thus “find” the remainder by seeking out the smallest element of 𝒮\mathcal S, and this turns out to work in general. Moreover, once we know what the remainder rr is, the value of qq is determined by the equation n=qd+rn = qd + r.

Let us now give a formal proof:

Proof of Theorem 1.32. Let nn and dd be integers with d>0d > 0. Define a set of non-negative integers as follows: 𝒮:={ndx|xandndx0}.\mathcal S := \{n - d x\,\, | \,\, x \in \mathbb Z\,\,\, {\text{and}} \,\,\, n-dx \geqslant 0\}. By construction, 𝒮\mathcal S consists of only non-negative integers. We would like to apply the Well-Ordering Principle, but to do so we must check that 𝒮\mathcal S is not empty.

Let us prove 𝒮\mathcal S is non-empty by considering two cases. If n0n \geqslant 0, then taking x=nx = -n in the formula above gives nd(n)=n(d+1)n - d \cdot (-n) = n(d+1), which belongs to 𝒮\mathcal S (since it is non-negative by assumption). If n<0n < 0, then taking x=nx = n in the formula above gives ndn=n(1d)n - d \cdot n = n(1-d). Since d1d \geqslant 1, we have 1d01-d \leqslant 0, and since n<0n < 0, it follows that n(1d)0n(1-d) \geqslant 0. Thus ndnn- d \cdot n is an element of 𝒮\mathcal S. In either case, we have shown 𝒮\mathcal S is not the empty set.

Since 𝒮\mathcal S is a non-empty set of non-negative integers, by the Well-Ordering Principle, 𝒮\mathcal S has a smallest element, which we will call rr. (Our aim is to prove that this is the rr that “works” to give the conclusion.)

By definition of 𝒮\mathcal S, we have that r0r \geqslant 0 and that r=nqdr = n - qd for some integer qq. Let us now show that r<dr < d also. Suppose this is not the case; that is, suppose rdr \geqslant d. Since r=ndqr = n - dq, it follows that rd=ndxd=nd(x+1)r-d = n - dx -d = n-d(x+1). Since rdr \geqslant d, we have rd0r-d \geqslant 0. This shows that rd𝒮r - d \in \mathcal S. But rd<rr -d < r, which contradicts our assumption that rr is the smallest element of 𝒮\mathcal S. We conclude that it must be the case that r<dr < d.

We have shown that there are integers rr and qq such that r=nqdr = n - qd and 0r<d0 \leqslant r < d. Rewriting the first equation slightly, we have shown that there are integers rr and qq such that n=r+qdn= r + qd and 0r<d0 \leqslant r < d. ◻

Let us prove the “uniqueness” portion of the Division Algorithm. Here is what it says:

Theorem 1.33. Let n,dn, d be integers with d>0d > 0. Suppose that n=qd+rn = qd+r with 0r<d0 \leqslant r < d and n=qd+rn = q'd + r' with 0r<d0 \leqslant r' < d. Then r=rr = r' and q=qq = q'.

Proof. We first show that dd divides (rr)(r-r'): Since n=qd+rn = qd+r and n=qd+rn = q'd + r', we have qd+r=qd+rqd + r = q'd + r' and hence rr=qdqd=n=(qq)dr-r' = q'd - qd = n = (q-q')d. Since qqq' -q is an integer, this shows that dd divides rrr - r'.

We next show that d<rr<d-d < r-r' < d: Since r<dr' < d, we have d<r-d < -r', and since 0r0 \leqslant r, wehave rr+r=rr-r' \leqslant r' + r = r - r'. So drr-d \leqslant r - r'. Similarly, since r0r' \geqslant 0 we have r0-r' \leqslant 0 and since r<dr < d, it follows that rrr<dr - r' \leqslant r < d, whence rr<dr- r' < d. Putting these together gives d<rr<d.-d < r-r' < d.

Now, the only integer xx such that d<x<d-d < x < d and dd divides xx is 00. Since we have shown that rrr - r' has these two properties, it follows that rr=0r-r' = 0; that is, r=rr = r'.

Finally, since qd+r=qd+rq'd + r' = qd + r and r=rr = r', we conclude that 0=qdqd=(qq)d0 = q'd - qd = (q'-q)d. Since d0d \ne 0, it follows that qq=0q-q' = 0 and thus q=qq = q'. ◻

1.6 Greatest Common Divisor

Definition 1.34. Let aa be an integer. A divisor of aa is an integer dd so that dd divides aa.

Definition 1.35. Let aa and bb be two integers, not both of which are 00. The greatest common divisor or GCD of a,ba,b is the largest integer dd such that dd divides aa and dd divides bb. We often write gcd(a,b)\gcd(a,b) for the GCD of aa and bb.

More formally by definition gcd(a,b)\gcd(a,b) has the following properties:

  1. gcd(a,b)\gcd(a,b) divides both aa and bb. (“common divisor" property)

  2. If nn is an integer that divides both aa and bb, gcd(a,b)n\gcd(a,b) \geqslant n. (“greatest" property)

According to the above definition if you want to prove that an integer dd is the GCD of aa and bb you must show

  1. dd divides both aa and bb. (“common divisor" property)

  2. If nn is an integer that divides both aa and bb, dnd \geqslant n. (“greatest" property)

Example 1.36. For example, let us find gcd(18,24)\gcd(18, 24) by listing all the common factors of 1818 and 2424 and finding the largest one: The factors of 1818 are ±1,±2,±3,±6,±9,±18\pm 1, \pm 2, \pm 3, \pm 6, \pm 9, \pm 18. The factors of 2424 are ±1,±2,±3,±6,±12,±24.\pm 1, \pm 2, \pm 3, \pm 6, \pm 12, \pm 24. So the common factors are ±1,±2,±3,±6\pm 1, \pm 2, \pm 3, \pm 6. The largest of these numbers is 66; whence gcd(18,24)=6\gcd(18, 24) = 6.

Example 1.37.

Lemma 1.38. If a,ba, b \in \mathbb{Z} with b0b \neq 0, then gcd(a,b)=gcd(a,b)\gcd(a,b)=\gcd(a,-b) .

Proof. Suppose a,ba, b \in \mathbb{Z} with b0b \neq 0. We first show that dd divides bb if and only if dd divides b-b. If dbd\mid b then there exists nn\in {\mathbb{Z}} so that b=dnb=dn and thus b=d(n)-b=d(-n) shows that dbd\mid -b. Conversely if dbd\mid -b then there exists kk\in {\mathbb{Z}} so that b=dk-b=dk and thus b=d(k)b=d(-k) shows that dbd\mid b.

So, the set of common divisors of aa and bb is identical to the set of common divisors of aa and b-b. By definitions, gcd(a,b)\gcd(a,b) is the largest element of the first set and gcd(a,b)\gcd(a,-b) is the largest element of the second set. Since these sets are identical, gcd(a,b)=gcd(a,b)\gcd(a,b) = \gcd(a,-b). ◻

Theorem 1.39. Let aa and bb be integers with b>0b > 0, and suppose q,rq, r are integers which satisfy a=qb+ra = qb+r. Then gcd(a,b)=gcd(b,r)\gcd(a,b) = \gcd(b,r).

Proof. Suppose aa and bb are integers with b>0b > 0. First, suppose that dd is an integer that divides both aa and bb. Then we can find integers nn and mm such that a=dna = dn and b=dmb = dm, and dn=a=qb+r=qdm+r, so r=dnqdm=d(nqm).dn = a = qb + r = qdm + r, \textrm{ so } r = dn-qdm = d(n-qm). Since n,q,mn,q,m are integers by closure of the integers under addition and multiplication nqmn-qm\in{\mathbb{Z}}. Therefore, dd divides rr, and by assumption dd also divides bb. Now since gcd(b,r)\gcd(b,r) is the largest integer that divides both bb and rr, we must have dgcd(b,r)d \leqslant \gcd(b,r) by property (2) in Definition 1.35. In particular, this applies to the special case when d=gcd(a,b)d = \gcd(a,b), since that is an integer that divides both aa and bb. We conclude that gcd(a,b)gcd(b,r)\gcd(a,b) \leqslant \gcd(b,r).

Now we will show that gcd(a,b)gcd(b,r)\gcd(a,b) \geqslant \gcd(b,r). To do that, suppose now that dd is an integer that divides both bb and rr. This means we can find integers mm and nn such that b=dmb=dm and r=dnr = dn, so a=qb+r=qdm+dn=d(qm+n).a = qb+r = qdm + dn = d(qm+n). Again since n,q,mn,q,m are integers by closure of the integers under addition and multiplication qm+nqm+n\in{\mathbb{Z}}. This shows that dd must also divide aa. Since by assumption dd also divides bb, we must have dgcd(a,b)d \leqslant \gcd(a,b) by propery (2) in Definition 1.35. This applies in particular to the case when d=gcd(b,r)d = \gcd(b,r), so we conclude that gcd(b,r)gcd(a,b)\gcd(b,r) \leqslant \gcd(a,b).

We have shown that gcd(a,b)gcd(b,r)\gcd(a,b) \leqslant \gcd(b,r) and gcd(b,r)gcd(a,b)\gcd(b,r) \leqslant \gcd(a,b), and hence gcd(a,b)=gcd(b,r)\gcd(a,b) = \gcd(b,r). ◻

1.7 The Euclidean Algorithm

Theorem 1.39 is used in practice to find, in an efficient manner, the gcd of two integers. Let’s look at an example:

Example 1.40 (Euclidean algorithm). Suppose we want to find the gcd of 524524 and 148148. Consider the following calculations: 524=1483+80148=801+6880=681+1268=125+812=81+48=42+0\begin{eqnarray*} 524 = 148 \cdot 3 + 80 \\ 148 = 80 \cdot 1 + 68 \\ 80 =68 \cdot 1 + 12 \\ 68 = 12 \cdot 5 + 8 \\ 12 = 8 \cdot 1 + 4 \\ 8 = 4 \cdot 2 + 0 \end{eqnarray*} In each step, we are applying the Division Algorithm, starting by dividing 524524 by 148148 to leave a remainder of 8080, then dividing 148148 by 8080, leaving a remainder of 6868, and so on. In general, if in one step we are dividing xx by yy leaving a remainder of zz, in the next step we are dividing yy by zz, leaving a new remainder.

By Theorem 1.39, if a=bq+ra = bq + r, then gcd(a,b)=gcd(b,r)\gcd(a,b) = \gcd(b,r). Applying Theorem 1.39 to equation [1] it shows that gcd(524,148)=gcd(148,80)\gcd (524,148)= \gcd(148,80). Applying Theorem 1.39 to equation [2] it shows that gcd(148,80)=gcd(80,68)\gcd(148,80) =\gcd (80,68), and continuing to apply Theorem 1.39 to the remaining equations we obtain gcd(524,148)=gcd(148,80)=gcd(80,68)=gcd(68,12)=gcd(12,8)=gcd(8,4)=gcd(4,0).\gcd (524,148)= \gcd(148,80)=\gcd (80,68) = \gcd(68, 12) =\gcd(12, 8) = \gcd(8,4) = \gcd(4,0). Since gcd(4,0)=4\gcd(4,0) = 4 by Example 1.37, the equalities above give gcd(524,148)=4\gcd(524, 148) = 4.

The Extended Euclidean Algorithm

In addition to finding the gcd of two integers aa and bb, the Euclidean algorithm can also be used to find integers xx and yy such that gcd(a,b)=xa+by\gcd(a, b) = xa + by. This is sometimes called the “Extended Euclidean Algorithm”.

Definition 1.41. An expression of the form xa+ybxa + yb is known as a integer linear combination of aa and bb.

With this terminology, the extended Euclidean algorithm gives a method of expressing gcd(a,b)\gcd(a,b) as an integer linear combination of aa and bb. Let’s see how this works in the example above:

Example 1.42 (Extended Euclidean algorithm). Our next goal is to express gcd(524,148)=4\gcd(524,148)=4 as a linear combination of 524524 and 148148. We can do so re-using the work from the Euclidean Algorithm as follows:

524=1483+8080=1524314868=1148180=1148(15243148)1=4148152412=180+168=1(41481524)+(15243148)=7148+25248=68512=(41481524)5(7148+2524)=11524+391484=11218=(7148+2524)(11524+39148)=1352446148\begin{eqnarray*} 524 = 148 \cdot 3 + 80 \Leftrightarrow 80 = 1 \cdot 524 - 3 \cdot 148 \\ 68 = 1 \cdot 148 - 1 \cdot 80 = 1 \cdot 148 - (1 \cdot 524 - 3 \cdot 148) \cdot 1 = 4 \cdot 148 - 1 \cdot 524\\ 12 = - 1 \cdot 80 + 1 \cdot 68 = - 1 \cdot (4 \cdot 148 - 1 \cdot 524) + (1 \cdot 524 - 3 \cdot 148) = - 7 \cdot 148 + 2 \cdot 524 \\ 8 = 68 - 5 \cdot 12 = ( 4 \cdot 148 - 1 \cdot 524 ) - 5 \cdot (- 7 \cdot 148 + 2 \cdot 524) = -11 \cdot 524 + 39 \cdot 148\\ 4 = 1 \cdot 12 - 1 \cdot 8 = (- 7 \cdot 148 + 2 \cdot 524) - (-11 \cdot 524 + 39 \cdot 148) = 13 \cdot 524 - 46 \cdot 148 \end{eqnarray*}

Here are formal statements of the Euclidean Algorithm and its extended version.

Theorem 1.43 (The Euclidean Algorithm). Let aa and bb be integers such that b>0b > 0. Define a list of numbers r1,r2,rmr_1, r_2 \dots, r_m such that b>r1>r2>>rm and rm=0b > r_1 > r_2 > \cdots > r_m \textrm{ and } r_m = 0 as follows:

Then gcd(a,b)=rm1\displaystyle \gcd(a,b) = r_{m-1}.

Proof. Claim 1: the algorithm terminates in finitely many steps.

The inequalities from the Division Theorem can be combines as follows b>r1>r2>r3>0b > r_1 > r_2 > r_3 > \cdots \geq 0 Since there are finitely many integers between 00 and bb which can be the successive remainders in the algorithm, the process described in the Theorem cannot continue forever; that is, rm=0r_m = 0 for some mm. In fact, notice that the process must stop after at most bb steps.

Next we want to show

Claim 2: gcd(a,b)=rm1\gcd(a,b)= r_{m-1}.

Since a=bq1+r1a = bq_1 + r_1, by Theorem 1.39 we have gcd(a,b)=gcd(b,r1)\gcd(a,b) = \gcd(b, r_1). Moreover, since b=r1q2+r2b = r_1 q_2 + r_2, Theorem 1.39 also implies that gcd(b,r1)=gcd(r1,r2)\gcd(b, r_1)= \gcd(r_1, r_2). Since r1=r2q3+r3r_1 = r_2 q_3 + r_3, by Theorem 1.39 we must also have gcd(r1,r2)=gcd(r2,r3)\gcd(r_1, r_2)= \gcd(r_2, r_3). Continuing in the manner, applying Theorem 1.39 for each step of the algorithm, we obtain the list of equations gcd(a,b)=gcd(b,r1)=gcd(r1,r2)=gcd(r2,r3)==gcd(rm2,rm1)=gcd(rm1,rm).\gcd(a,b) = \gcd(b, r_1)= \gcd(r_1, r_2) = \gcd(r_2, r_3) = \cdots = \gcd(r_{m-2}, r_{m-1}) = \gcd(r_{m-1}, r_m). Recall that rm=0r_m = 0, and so from the equality of the first and last quantities above we have gcd(a,b)=gcd(rm1,0)\gcd(a,b) = \gcd(r_{m-1}, 0). Since gcd(x,0)=|x|\gcd(x, 0) = |x| for any non-zero integer xx, we have gcd(rm1,0)=rm1\gcd(r_{m-1},0) = r_{m-1}. This proves that gcd(a,b)=rm1\gcd(a,b) = r_{m-1}. ◻

A consequence of the Extended Euclidean Algorithm is

Theorem 1.44. If a,ba,b are integers, a0a\neq 0, and b0b\neq 0, then there exist integers x,yx,y so that gcd(a,b)=xa+by\gcd(a,b)=xa+by.

1.8 Primes

Definition 1.45. A integer pp is called prime if

Definition 1.46. An integer nn is composite if it is not prime.

Lemma 1.47. If pp is prime and aa is any integer, then gcd(p,a)\gcd(p,a) is either |p||p| or 11. In particular, if pp does not divide aa, then gcd(p,a)=1\gcd(p,a) = 1.

Proof. Suppose pp is prime and aa is any integer. By definition, the only divisors of pp are 11, 1-1, pp, and p-p. If pp divides aa, then so does |p||p|, and since |p||p| is the largest divisor of pp, we conclude that gcd(a,p)=|p|\gcd(a,p) = |p|. If pp does not divide aa, neither does p-p, so 11 is the only positive integer dividing both aa and pp, and gcd(p,a)=1\gcd(p,a) = 1. ◻

Theorem 1.48. Let a,b,ca, b, c be integers. If aa divides bcbc and gcd(a,b)=1\gcd(a,b) = 1, then aca \mid c.

Proof. Suppose a,b,ca, b, c are integers, aa divides bcbc and gcd(a,b)=1\gcd(a,b) = 1. By the extended Euclidear Algorithm (Theorem 1.43) the greatest common divisor of aa and bb is always an integer linear combination of aa and bb; specifically there exist integers xx and yy such that 1=gcd(a,b)=ax+by1=\gcd(a,b)=ax+by. Therefore, multiplying by cc gives axc+byc=c.axc+byc = c. On the other hand, our assumption that aa divides bcbc says that there exists an integer nn such that an=bcan=bc. Therefore, axc+byc=caxc+any=ca(xc+ny)=c.\begin{align*} axc+byc & = c \\ axc + any & = c \\ a(xc+ny) & = c. \end{align*} Since xc+nyxc + ny is an integer by closure of {\mathbb{Z}} under addition and multiplication, by definition, this means that aa divides cc. ◻

Corollary 1.49. Assume a,b,pa,b, p are integers. If pp is a prime and pp divides abab, then pp must divide aa or bb.

Note that “or" here is not exclusive: the theorem says that one of the following three statements is true:

Proof. Assume pp is prime and pp divides abab. We consider two cases. If pp divides aa, we are done since we already have what we wanted to prove. If pp does not divide aa. By Lemma 1.47, gcd(a,p)=1\gcd(a,p)=1. Since gcd(a,p)=1\gcd(a,p)=1 and we assumed that pp divides abab, we can apply Theorem 1.48, which leads us to conclude that pp must divide bb. ◻

Corollary 1.50. Suppose pp is a prime integer, a1,a2,,ana_1, a_2, \dots, a_n are arbitrary integers, and pp divides a1a2ana_1 a_2 \cdots a_n. Then pp divides at least one of a1,a2,,a_1, a_2, \dots, or ana_n.

Proof. The most rigorous way to prove this would be a proof by induction, a topic that has not yet been discussed in this class. We will make do with a slightly less formal proof:

Assume pp is prime and pp divides a1a2ana_1 a_2 \cdots a_n. Thinking of the latter product as (a1an1)an(a_1 \cdots a_{n-1}) \cdot a_n, by Corollary 1.49 we have that pp divides a1an1a_1 \cdots a_{n-1} or pp divides ana_n. If pp divides ana_n, then we are done – we have found an aia_i that pp divides. If not, then pp divides a1an1a_1 \cdots a_{n-1}. Now applying the Corollary again gives that pp divides a1an2a_1 \cdots a_{n-2} or pp divides an1a_{n-1}. Likewise, if pp divides a1an2a_1 \cdots a_{n-2}, then pp divides a1an3a_1 \cdots a_{n-3} or pp divides an2a_{n-2}. Continuing in this manner, we arrive at the fact that pp divides a1a_1 or pp divides a2a_2 or \cdots or pp divides ana_n. ◻

Corollary 1.51. Suppose pp, q1,q2,,qnq_1, q_2, \dots, q_n are all prime integers, and that pp divides q1q2qnq_1 q_2 \cdots q_n. Then p=±qip = \pm q_i for some i{1,2,,n}i \in \{1, 2, \dots, n\}.

Proof. By Corollary 1.50, we have that pp divides qiq_i for some i{1,2,,n}i \in \{1, 2, \dots, n\}. Since qiq_i is prime, its only divisors are ±1\pm 1 and ±qi\pm q_i. But p1p \ne 1 and p1p \ne -1 (by the definition of “prime”) and hence it must be that p=±qip = \pm q_i. ◻

1.9 The Fundamental Theorem of Arithmetic

The Fundamental theorem of arithmetic says that every integer can be factored into primes in an essentially unique way. More precisely, it says the following:

Theorem 1.52 (The Fundamental Theorem of Arithmetic). Any integer n0,1,1n \neq 0, 1, -1 can be written as a product of primes. Moreover, if p1ps=q1qtp_1\cdots p_s =q_1\cdots q_t are two factorizations of nn into primes, then s=ts=t, and there exists a reordering of the qjq_j’s such that qi=±piq_i=\pm p_i for all ii.

Example 1.53. For example, we can factor 24-24 into a product of primes as follows 24=(2)223-24 = (-2) \cdot 2 \cdot 2 \cdot 3 and also as 24=(3)222.-24 = (-3) \cdots 2 \cdot 2 \cdot 2. These are “essentially the same” in the sense that if we rearrange the factors and change a couple of the signs, we can tranform one factorization into the other one.

Example 1.54. For another example, 17-17 is already factored into a product of primes: it is a product of one prime.

The Fundamental Theorem consists of an “existence" and “uniqueness" statement. The existence part says that every nonzero integer can be written as a product of prime integers. More formally, for any integer nn such that n0n \ne 0, there is a non-negative integer ss and a list of primes p1,,psp_1, \dots, p_s such that n=±p1p2pnn = \pm p_1 p_2 \cdots p_n. The uniqueness part says that this list of primes p1,,psp_1, \dots, p_s is unique up to signs and the order of the primes. More formally, that given wo factorizations of nn into prime say p1psandq1qtp_1\cdots p_s \,\,\,{\text{and} }\,\,\,q_1\cdots q_t then s=ts=t, and we can reorder of the qjq_j’s such that qi=±piq_i=\pm p_i for all ii.

The existence portion of the Fundamental Theorem of Arithmetic states the following:

Theorem 1.55 (FTA existence). For any integer nn, other than 11, 1-1 or 00, there is a list of prime integers p1,p2,,psp_1, p_2, \dots, p_s for s1s \geqslant 1 such that n=p1p2psn = p_1p_2 \cdots p_s.

Proof of the existence portion of the Fundamental Theorem. Suppose nn is an integer other than 11, 1-1 or 00.

Case 1: Let us first consider the case when n2n \geqslant 2. Let 𝒮\mathcal S be the set of all integers that are greater than or equal to 22 that are not products of primes: 𝒮={nn2 and n is not a product of primes}.\mathcal S= \{n \in {\mathbb{Z}} \mid n \geqslant 2 \text{ and $n$ is not a product of primes} \}. Our goal is to prove 𝒮\mathcal S is the empty set.

By way of contradiction, assume 𝒮\mathcal S is not empty. Then, by the Well-Ordering Principle, 𝒮\mathcal S will have a least element, call it nn. In other words, nn is an integer such that n2n \geqslant 2 and such that nn cannot be written as a product of primes, but such that every integer mm such that 2m<n2 \leqslant m < n can be written as a product of primes.

We note that nn cannot be prime, since a prime integer is its own prime factorization. (Note that s=1s = 1 is allowed in the statement of the theorem.) So, we must have that n=abn = a \cdot b for some 1<a<n1 < a < n and 1<b<n1 < b < n. But then since aa and bb are both less than nn and nn is the least element of 𝒮\mathcal S, both aa and bb must not in 𝒮\mathcal S. Since each is greater than or equal to 22, it follows that each of them is a product of primes. Say a=q1q2qka = q_1 q_2 \cdots q_k and b=r1r2rlb = r_1 r_2 \cdots r_l where each qiq_i and rjr_j is prime. Then n=q1q2qkr1r2rln = q_1 q_2 \cdots q_k r_1 r_2 \cdots r_l is a product of primes, contrary to the fact that n𝒮n \in \mathcal S. We conclude that 𝒮\mathcal S must be the empty set — that is, every integer that is at least 22 is a product of prime intgers.

Case 2: Now assume n2n \leqslant -2. Then n2-n \geqslant 2 and hence, by the case already established, we have that n=p1p2ps-n = p_1 p_2 \cdots p_s for some primes p1,,psp_1, \dots, p_s. It follows that n=(p1)p2p3ps-n = (-p_1) p_2 p_3 \cdots p_s. Since p1p_1 is prime, so is p1-p_1. This proves nn is a product of primes. ◻

The uniqueness portion of the Fundamental Theorem states:

Theorem 1.56 (FTA uniqueness). If p1p2ps=q1q2qtp_1 p_2 \cdots p_s= q_1 q_2 \cdots q_t for some primes p1,p2,,psp_1, p_2, \dots, p_s and q1,q2,,qsq_1, q_2, \dots, q_s, then s=ts = t and, after possibly reordering the qjq_j’s, we have that pi=±qip_i = \pm q_i for all ii.

Proof. Suppose nn is an integer other than 11, 1-1 or 00.

Case 1: Let us first consider the case when n2n \geqslant 2.

We give a proof by contradiction. Consider the set

𝒮={nn2 and n=p1ps,n=q1qt\mathcal S =\left\{n\in {\mathbb{Z}} \mid n\geq 2 \text{ and } n=p_1\cdots p_s, n=q_1\cdots q_t \right. are two prime factorizations which differ not just by reordering and negating the factors }\left.\right\}.

Assume towards a contradiction that 𝒮\mathcal{S} is not empty. Since 𝒮\mathcal{S} consists of nonnegative integers, by the Well-Ordering Principle, there is a smallest element of 𝒮\mathcal{S} — let us call it NN. Then NN has two factorizations, say N=p1p2ps=q1q2qtN = p_1 p_2 \cdots p_s = q_1 q_2 \cdots q_t for some primes p1,,psp_1, \dots, p_s and q1,,qtq_1, \dots, q_t, such that no possible reordering and changing of signs can make them be the same.

We can rewrite the equations above as N=p1(p2ps)=q1q2qt,(*)N = p_1 (p_2 \cdots p_s) = q_1 q_2 \cdots q_t, \qquad(*) which shows that p1p_1 divides q1q2qtq_1 q_2 \cdots q_t. Since p1p_1 is prime as are q1,,qtq_1,\ldots, q_t, it must be that p1=±qjp_1= \pm q_j for some 1jt1\leq j\leq t by Corollary 1.51. By renumbering the qq’s, we may assume j=1j = 1, so that p1=±q1p_1=\pm q_1. Dividing (*)(*) by |p1||p_1| we get m=±p2p3ps=±q2q3qtm = \pm p_2 p_3 \cdots p_s = \pm q_2 q_3 \cdots q_t with m=N/|p1|m=N/|p_1| still a positive integer. Since p1p_1 is a prime |p1|>1|p_1| > 1, and so we must have m<Nm<N. Thus m𝒮m\not \in \mathcal{S} and hence the two prime factorizations of mm above must be essentially the same, that is, s1=t1s-1=t-1 and after reordering pi=±qip_i=\pm q_i for 2is2\leq i\leq s. Together with p1=±q1p_1=\pm q_1 this shows that the two factorizations of NN are essentially the same, a contradiction.

We conclude that 𝒮\mathcal{S} is empty, so every integer n2n \geqslant 2 has an essentially unique prime factorization.

Case 1: Now assume n2n \leqslant -2 . Then n2-n \geqslant 2 and hence, by the case already established, since n=(p1)p2ps=(q1)q2qt-n=(-p_1) p_2 \cdots p_s=(-q_1)q_2\cdots q_t we have that s=ts=t and after possibly reordering we have pi=±qip_i=\pm q_i for 1in1\leq i\leq n. ◻

Lemma 1.57. If the Fundamental Theorem of Arithmetic is true and n1n\neq 1 is an integer then there exist positive prime integers p1,,psp_1, \ldots, p_s such that n=p1psn=p_1\cdots p_s.

Proof. Suppose n1n\neq 1 is an integer and the FTA is true.

By the FTA, there exist primes p1,,psp'_1, \ldots, p'_s so that n=p1psn= p'_1 \cdots p'_s. Then we have |n|=|p1ps|=|p1||ps|.|n|= |p'_1 \cdots p'_s|=|p'_1| \cdots |p'_s|. Since the absolute value of a prime is a prime, each of the integers |p1|,|ps||p_1|, \ldots |p_s| are positive primes, so by setting pi=|pi|p_i=|p'_i| we obtain that n=p1psn=p_1\cdots p_s and p1,,psp_1, \ldots, p_s are positive primes. ◻

Let’s see a sample application for the uniqueness statement of the FTA .

Example 1.58. Are there primes p1,p2,p3p_1, p_2, p_3 and q1,q2q_1, q_2 so that p1p2,p1p3,p2p3,q1q2p_1\neq p_2, p_1\neq p_3, p_2\neq p_3, q_1\neq q_2 and p1p2p3=q1q2p_1p_2p_3=q_1q_2?

No this is not possible. The FTA says that if we have two prime factorizations p1p2p3p_1p_2p_3 and q1q2q_1q_2 that are equal then the number of prime factors in each is the same. This would mean that 3=23=2, a contradiction.

The relationship between prime factorizations, divisors, and gcd

Here is the main result that relates prime factorizations and the notion of divisor.

Lemma 1.59. Let nn and dd pe positive integers so that n=p1f1p2f2pkfkn=p_1^{f_1}p_2^{f_2}\cdots p_k^{f_k} with pi>0p_i>0 distinct primes and ei0e_i\geq 0 integers. Then dnd\mid n if and only if d=p1e1p2e2pkekd=p_1^{e_1}p_2^{e_2}\cdots p_k^{e_k} for some integers f1,,fkf_1,\ldots, f_k so that 0eifi0\leq e_i \leq f_i for all 1ik1\leq i\leq k.

With this interpretation of the divisors in terms of prime factorization, we can determine a formula for the GCD.

Theorem 1.60. Let aa and bb be positive integers so that a=p1a1p2a2pkakb=p1b1p2b2pkbk\begin{eqnarray*} a=p_1^{a_1}p_2^{a_2}\cdots p_k^{a_k}\\ b=p_1^{b_1}p_2^{b_2}\cdots p_k^{b_k} \end{eqnarray*} with pi>0p_i>0 distinct primes and ai0,bi0a_i\geq 0, b_i\geq 0 integers. Then gcd(a,b)=p1min{a1,b1}p2min{a2,b2}pkmin{ak,bk},\gcd(a,b)=p_1^{\min\{a_1,b_1\}}p_2^{\min\{a_2,b_2\}}\cdots p_k^{\min\{a_k,b_k\}}, where min{ai,bi}\min\{a_i,b_i\} denotes the smallest among aia_i and bib_i.

Proof. By Lemma 1.59, dad\mid a if and only if d=±p1e1p2e2pkekd=\pm p_1^{e_1}p_2^{e_2}\cdots p_k^{e_k} for some integers f1,,fkf_1,\ldots, f_k so that 0fiai0\leq f_i \leq a_i and dbd\mid b if and only if d=±p1e1p2e2pkekd=\pm p_1^{e_1}p_2^{e_2}\cdots p_k^{e_k} for some integers f1,,fkf_1,\ldots, f_k so that 0fibi0\leq f_i \leq b_i. Thus dd is a common divisor of aa and bb if and only if d=±p1e1p2e2pkekd=\pm p_1^{e_1}p_2^{e_2}\cdots p_k^{e_k} and 0fkmin{ai,bi}0\leq f_k \leq \min\{a_i,b_i\}. The largest number of this form is p1min{a1,b1}p2min{a2,b2}pkmin{ak,bk}p_1^{\min\{a_1,b_1\}}p_2^{\min\{a_2,b_2\}}\cdots p_k^{\min\{a_k,b_k\}}. This number is therefore the largest common divisor gcd(a,b)\gcd(a,b). ◻

Example 1.61. Suppose a=21013275a=2^{101}\cdot 3^{27}\cdot 5 and b=210333b=2^{10}\cdot 3^{33} then we can first write the numbers using all primes that appear in either, using exponents equal to zero if necessary: a=210132751a=2^{101}\cdot 3^{27}\cdot 5^1 and b=21033350b=2^{10}\cdot 3^{33}\cdot 5^0. Then the formula in the theorem above gives gcd(a,b)=2min{101,10}3min{27,33}5min{1,0}=21032750.\gcd(a,b)=2^{\min\{101,10\}}\cdot 3^{\min\{27,33\}}\cdot 5^{\min\{1,0\}}=2^{10}\cdot3^{27}\cdot 5^0.

2 Induction

2.1 Proofs by induction

Induction is a very helpful proof technique for showing results about all positive integers, or where the statement depends on a positive integer. Usually, proofs by induction go as follows:

We have a statement we want to prove that depends on a positive integer nn; let’s refer to that statement as P(n)P(n). To prove P(n)P(n) for all nn, we follow two steps:

The Domino Metaphor: Imagine that the statements P(1),P(2),P(3),P(1), P(2), P(3), \dots are like dominoes. Suppose I tell you that (a) the dominoes have been arranged so that if one of the them falls over, the next one in line will fall too, and (b) I have knocked over the first one. Then it is clear that, eventually, every domino will fall. (Ignore the fact that it would take an infinite amount of time for this to happen in reality.) The Principle of Mathematical Induction is an abstraction of this idea.

It is a theorem that once you prove both the base case and the induction step, P(n)P(n) must hold for all integers nrn \geqslant r, where rr is the value of nn in your base case. Most often r=1r=1.

Here is the formal statement:

Theorem 2.1 (The Principle of Mathematical Induction). Suppose r1r\geq 1 is an integer and P(n)P(n) is a statement that applies to an integer nrn\geq r. If P(r)P(r) is true and P(n)P(n) implies P(n+1)P(n+1) for all nrn \geqslant r, then P(n)P(n) is true for all nrn\geq r.

Proof. Suppose P(r)P(r) is true and P(n)P(n) implies P(n+1)P(n+1) for all n1n \geqslant 1.

We give a proof by contradiction. Assume it is not the case that P(n)P(n) is true for all nrn \geqslant r. Then the set S:={kkr, P(k) is false}S := \{k \in {\mathbb{Z}} \mid k \geqslant r, \text{ $P(k)$ is false}\} is a non-empty set of positive integers. By the Well-Ordering Axiom, SS has a least element, call it jj. In words, jj is an integer such that P(j)P(j) is true but P(i)P(i) is false for all ii such that ri<jr \leqslant i < j.

Since we assumed P(r)P(r) is true, jrj \ne r. So j>rj > r and hence j1rj-1\geq r. Since j1<jj-1 < j, j1j-1 must not be in the set SS. That is, P(j1)P(j-1) is true. But we are assuming P(j1)P(j-1) implies P(j)P(j), and thus P(j)P(j) must be true. We have concluded that P(jP(j) is both true and false, which is a contradiction.

Since we reached a contradiction, it must be that the set SS is empty — that is, P(n)P(n) is true for all positive integers nn. ◻

Here is a statement that can be proven by the Principle of Mathematical Induction:

Recall that n!=12nn! = 1 \cdot 2 \cdots \cdot n.

Theorem 2.2. For every integer n1n \geqslant 1, we have 2n1n!2^{n-1} \leqslant n!.

Proof. We give a proof by induction. Let P(n)P(n) be the statement "2n1n!2^{n-1} \leq n!".

 ◻

When writing a proof by induction, you should tell the reader that you’re about to do a proof by induction, and you should clearly indicate where your base case and induction step are. The assumption that P(n)P(n) holds for some nn used in the induction step is called the induction hypothesis. The fact that proving your base case and induction steps is sufficient to prove the theorem is recalled by invoking the Principle of Mathematical Induction to finish the proof.

Important features of a proof by induction:

  1. always start by stating you will do a proof by induction

  2. next state clearly what your P(n)P(n) statement is

  3. a frequent mistake is to include the words "for all/every/any nn" in the P(n)P(n) statement. This is incorrect. The P(n)P(n) statement applies just to the nn which is named inside the P(n)P(n) notation.

  4. label your base case and induction step clearly

  5. state what P(n+1)P(n+1) is clearly before you prove it.

  6. finish the proof saying “By the Principle of Mathematical Induction \ldots is true for all nn\geq \ldots.

Here are some more proofs using the Principle of Mathematical Induction:

Theorem 2.3. For all n1n \geqslant 1, 1+2++n=n(n+1)21 + 2 + \cdots + n = \frac{n(n+1)}{2}.

Proof. Let’s prove this by induction on nn. Let P(n)P(n) be the statement 1+2++n=n(n+1)21 + 2 + \cdots + n = \frac{n(n+1)}{2}.

By the Principle of Mathematical Induction, 1+2++n=n(n+1)21 + 2 + \cdots + n = \frac{n(n+1)}{2} for all n1n \geqslant 1. ◻

Example 2.4. For every integer n1n\geq 1, 55 divides 11n611^n - 6.

Proof. We give a proof by induction. Let P(n)P(n) be the statement "5 divides 11n611^n - 6".

 ◻

We can also reprove the following theorem using induction on n1n \geqslant 1:

Theorem 2.5. Let a1,,ana_1, \ldots, a_n be integers. If pp is a prime and pp divides a1ana_1 \cdots a_n, then pp divides aia_i for some i{1,,n}i\in\{1,\ldots, n\}.

Of course we have already proved this before! But now we can give a more formal proof — our previous proof was a bit handwave-y. We will need the following result, which we have proved before:

Theorem 2.6 (Euclid’s Lemma). If pp is a prime and pp divides abab, then pp divides aa or pp divides bb.

Proof of Theorem 2.5. Let’s fix a prime pp. Let P(n)P(n) be the statement “If pp divides a1ana_1 \cdots a_n for some integers a1,,ana_1, \ldots, a_n, then pp divides aia_i for some ii”. We prove P(n)P(n) holds for all n1n \geqslant 1 by induction.

 ◻

One thing that might be confusing about this proof is that Theorem 2.6 is precisely P(2)P(2). We needed this statement in our induction step.

Let’s now prove the uniqueness part of the FTA.

Theorem 2.7 (FTA uniqueness). If p1p2ps=q1q2qtp_1 p_2 \cdots p_s= q_1 q_2 \cdots q_t for some primes p1,p2,,psp_1, p_2, \dots, p_s and q1,q2,,qsq_1, q_2, \dots, q_s, then s=ts = t and, after possibly reordering the qjq_j’s, we have that p1=±q1,,ps=±qsp_1 = \pm q_1, \ldots, p_s=\pm q_s.

Proof. We give a proof by induction on n=max{s,t}n=\max\{s,t\}. Let P(n)P(n) be the statement “If n=max{s,t}n=\max\{s,t\} and p1p2ps=q1q2qtp_1 p_2 \cdots p_s= q_1 q_2 \cdots q_t for some primes p1,p2,,psp_1, p_2, \dots, p_s and q1,q2,,qsq_1, q_2, \dots, q_s, then s=ts = t and, after possibly reordering the qjq_j’s, we have that p1=±q1,,ps=±qsp_1 = \pm q_1, \ldots, p_s=\pm q_s."

By the principle of mathematical induction, we have proven that the theorem is true. ◻

2.2 Strong induction

Sometimes we need a stronger form of induction where in the induction step we assume not only P(n)P(n), but also P(k)P(k) for all knk \leqslant n, where kk should also satisfy k1k \geqslant 1 or whatever else is our base case. This is called Strong Induction or Complete Induction. Here is the formal statement:

Theorem 2.8 (The Principle of Strong Mathematical Induction). Suppose P(n)P(n) is a statement that refers to an integer nn. Suppose that there exist integers r,br, b so that

Then P(n)P(n) is true for all nn.

There are two differences between strong induction and usual induction. First, in strong induction one typically needs to prove more than one base case (P(r),P(r+1),,P(b)P(r), P(r+1), \ldots, P(b)). You will have to include as a base case all the cases for which you cannot prove the induction step. Second, in the induction step of strong induction we get to assume any previous statements P(k)P(k) with k<nk<n and use whichever one is convenient. This flexibility is what makes this version of induction “strong".

Example 2.9. Use strong induction to prove that every class with at least 66 students in it can be divided up into groups each of size 33 or 44.

Proof. Let P(n)P(n) be the statement “A class with nn students can be divided up into groups each of size 33 of 44”. We prove P(n)P(n) is true for all n6n \geqslant 6 by strong induction.

By the Principle of Strong Mathematical Induction, P(n)P(n) is true for all n6n \geqslant 6. ◻

Let’s use strong induction to reprove the existence part of the Fundamental Theorem of Arithmetic.

Theorem 2.10. Every integer n2n \geqslant 2 is a product of primes.

Proof. We give a proof by strong induction. Let P(n)P(n) be the statement the integer nn is a product of primes.

By the principle of strong mathematical induction every integer n2n \geqslant 2 is a product of primes. ◻

3 Modular Arithmetic

3.1 Equivalence relations and equivalence classes

Definition 3.1. An equivalence relation on a set XX is a binary relation \sim satisfying the following properties:

  1. For every aXa \in X, aaa \sim a. This is known as the reflexive property.

  2. For all a,bXa,b \in X, if aba \sim b, then bab \sim a. This is known as the symmetric property.

  3. For all a,b,cXa,b,c \in X, if aba \sim b and bcb \sim c, then aca \sim c. This is known as the transitive property.

Typically the symbol “\sim” is read as “is equivalent to”; aba \sim b is read as “aa is equivalent to bb”.

Example 3.2. Say XX is the set of students in our class and a relation is defined on this set as follows: students aa and bb are equivalent, written aba\sim b, if they are born in the same month. We check this is an equivalence relation:

Since it satisfies the reflexive, symmetric and transitive properties the relation of being born in the same month is an equivalence relation.

Here is an example of a binary relation that is not an equivalence relation:

Example 3.3. Let X=X={\mathbb{Z}} and set aba\sim b to mean aba\leq b. This relation is symmetric and transitive but not reflexive. We show it is not reflexive by means of a counterexample: we have 353\leq 5 so 353\sim 5, but 5≁35\not \sim 3 since 535\not\leq 3.

Here is another example:

Example 3.4. Let XX denote the set of all points in the plane. Given two such points PP and QQ, declare PQP \sim Q to mean that PP and QQ have the same distance to the origin. For instance the point P=(3,4)P = (3,4) has distance 55 to the origin and the point Q=(0,5)Q = (0,-5) also has distance 55 to the origin, and hence PQP \sim Q. Put PP and RR are not similar for R=(3,0)R = (3,0).

The three properties of an equivalence relation are pretty clear in this case: (1) Any point PP has the same distance to the origin as itself, (2) if PP and QQ have the same distance to the origin, so do QQ and PP, and (3) if PP and QQ have the same distance to the origin and QQ and RR have the same distance to the origin, then PP and RR are of the same distance to the origin.

Definition 3.5. Let \sim be an equivalence relation on some set XX. For aXa \in X, the equivalence class of aa, written [a][a], is the subset of XX consisting of all elements that are equivalent to aa: [a]={bXab}[a] = \{b \in X \mid a \sim b\}

Example 3.6. Let XX and \sim be as in Example 3.4. Given a point PXP \in X, what is the equivalence class [P][P]? It is the set of all points in the plane located on the circle centered at the origin which goes through PP.

Suppose the distance from PP to the origin is dd while the distance from QQ to the origin is rr. As above [P][P] is the circle centered at the origin of radius dd and [Q][Q] is the circle centered at the origin of radius rr.

If PQP\sim Q then d=rd=r. Thus the circle centered at the origin of radius dd is the same as the circle centered at the origin of radius rr, that is [P]=[Q][P]=[Q].

If P≁QP\not \sim Q then drd\neq r. Thus the circle centered at the origin of radius dd and the circle centered at the origin of radius rr do not share any points, that is [P][Q]=[P]\cap [Q]=\emptyset.

Theorem 3.7. Let XX be any set and \sim be any equivalence relation on XX. For elements a,bXa, b \in X we have [a]=[b][a] = [b] if and only if aba \sim b.

Proof. \rightarrow If [a]=[b][a] = [b] then aba \sim b.

If [a]=[b][a] = [b] then since a[a]a \in [a] (by the reflexive property) we have a[b]a \in [b] too. By definition of [b][b], this means aba \sim b.

\Leftarrow If aba \sim b then [a]=[b][a] = [b]. Suppose aba \sim b. Pick any c[a]c \in [a]. By definition this means that cac \sim a. Since aba \sim b, by transitivity, this gives that cbc \sim b and hence c[b]c \in [b]. We have proven [a][b][a] \subseteq [b]. Now pick any d[b]d \in [b]. By definition this means that dbd \sim b. Since aba \sim b, by symmetry we also have bab \sim a and hence by transitivity we have dad \sim a, which means that d[a]d \in [a]. This proves that [b][a][b] \subseteq [a]. Since [b][a][b] \subseteq [a] and [a][b][a] \subseteq [b], we have [a]=[b][a] = [b]. ◻

Corollary 3.8. Let XX be and set and \sim an equivalence relation on XX. Any two equivalence classes are either equal or disjoint: for all a,bXa, b \in X, either [a]=[b][a] = [b] or [a][b]=[a]\cap [b] = \emptyset.

Since “either [a]=[b][a] = [b] or [a][b]=[a]\cap [b] = \emptyset" is a statement of the for “ P or Q" we will instead prove the logically equivalent statement “ not(Q) \rightarrow P" , that is, “if [a][b][a]\cap [b] \neq \emptyset then [a]=[b][a] = [b]".

Proof. Suppose [a][b][a]\cap [b] \neq \emptyset. Then there exists x[a][b]x\in [a]\cap [b] meaning that x[a]x\in [a] and x[b]x\in [b]. Since x[a]x\in [a] we have xax\sim a by definition of equivalence class and since x[b]x\in [b] we have xbx\sim b by definition of equivalence class. By the symmetric and transitive properties, since xax\sim a and xbx\sim b, then aba\sim b. Since aba\sim b, by the previous Theorem we deduce that [a]=[b][a]=[b]. ◻

The corollary says that an equivalence relation on a set XX “partitions” XX into equivalence classes: Every element of XX belongs to one, and only one, equivalence class. Equivalently, XX is the disjoint union of its equivalence classes.

3.2 Congruence and congruence classes

3.2.1 Congruence

Definition 3.9. Fix a nonzero integer NN. We say that a,ba, b\in \mathbb Z are congruent modulo NN if NN divides aba-b. We write ab(modN)a \, \equiv \, b \pmod N for “aa is congruent to bb modulo NN". In this notation, the aa and bb are the two inputs, and (modN)\, \equiv \!\pmod N is one piece, like a complicated equal sign.

Here are some examples:

  1. 519(mod7)5\equiv 19 \pmod 7 because 77 divides 519=145-19=-14.

  2. 520(mod10)-5\equiv 20 \pmod{10} is not true, because 1010 does not divide 520=25-5-20=-25.

  3. 1126(mod5)-11\equiv -26 \pmod 5 because 55 divides 11(26)=15-11-(-26) = 15.

Theorem 3.10. Any two odd integers are congruent modulo 2.

Proof. Let aa and bb be two odd integers. By a problem in problem set 1, a=2q+1a = 2q+1 and b=2s+1b = 2s+1 for some integers qq and ss. Now ab=(2q+1)(2s+1)=2(qs),a-b = (2q+1)-(2s+1) = 2(q-s), so 22 divides aba-b, which means that ab(mod2)a \equiv b \,\,\pmod 2. ◻

 

Example 3.11. It is not true that any two odd integers are congruent modulo 3, because for example 35(mod3)3 \not\equiv 5 \pmod{3} (since 33 does not divide 35=23-5=-2) and 33 and 55 are both odd.

Example 3.12. If 727(modN)7 \equiv 27 \pmod{N} for some integer NN, what are the possible values of NN? The congruence 727(modN)7 \equiv 27 \pmod{N} holds if and only if NN divides 727=207 - 27 = 20, and so NN must be one of the divisors of 2020: N{±1,±2,±4,±5,±10}N \in \{\pm 1, \pm 2, \pm 4, \pm 5, \pm 10\}.

Congruence classes modulo 1010 The number 20012001 is congruent to 11 modulo 1010, because 20011=20002001 -1 = 2000 is divisible by 1010. One way to find this is to consider the division algorithm: dividing 2001 by 10 we obtain 2001=1020+12001=10\cdot 20+1 which shows that 20011=10202001-1=10\cdot 20 is divisible by 10.

A different way to think about this: for any positive number nn, if r{0,1,,9}r \in \{0, 1, \dots, 9\} is the digit appearing in it one’s place, then the ones place of nrn - r is 00 and hence nrn-r is divisible by 1010. Thus, nr(mod10)n \equiv r \pmod{10} where rr is the digit in the one’s place of nn.

However, we have to be a little more careful about negative integers. If n<0n <0 is an integer, to find the unique 0r90 \leqslant r \leqslant 9 such that nr(mod10)n \equiv r \pmod {10}, we consider the last digit rr of nn, and take r=10nr=10-n. For example, 19999(mod10)1999 \equiv 9 \pmod{10} and 19991(mod10)-1999 \equiv 1 \pmod{10}.

Theorem 3.13. For a fixed N>0N>0, every aa\in \mathbb Z is congruent modulo NN to some rr \in \mathbb Z such that 0r<N0\leqslant r < N.

Proof. By the Division Algorithm we have a=qN+r, with 0rN.a = qN + r, \text{ with $0 \leqslant r N$.} So ar=qNa-r = qN and thus NN divides ara-r. It follows from the definition of congruence that ar(modN)a \equiv r \pmod{N}. ◻

Theorem 3.14. Congruence modulo NN is an equivalence relation; that is, the following three properties hold for any nonzero integer NN:

  1. Symmetric: For any integer aa, aa(modN)a\equiv a \pmod N.

  2. Reflexive: For all integers aa and bb, if ab(modN)a\equiv b\pmod N, then ba(modN)b\equiv a \pmod N.

  3. Transitive: For all integers aa, bb, and cc, if ab(modN)a\equiv b\pmod N and bc(modN)b\equiv c \pmod N, then ac(modN)a\equiv c \pmod N.

Proof. Fix a nonzero integer NN.

  1. For any integer aa, aa=0a - a = 0 is divisible by NN and hence aa(modN)a\equiv a \pmod N.

  2. For integers aa and bb, assume ab(modN)a \equiv b \pmod{N}. Then NN divides aba-b and hence NN also divides (1)(ab)=ba(-1)(a-b) = b-a. This proves that ba(modN)b\equiv a \pmod N.

  3. Let aa, bb, and cc be integers such that ab(modN)a\equiv b\pmod N and bc(modN)b\equiv c \pmod N. By definition this means NN divides aba-b and NN divides bcb-c. Using the fact proven before that if NN divides two integers then it divides their sum, we see that NN divides (ab)+(bc)=ac(a-b) + (b-c) = a-c. By definition, this proves ac(modN)a\equiv c \pmod N.

 ◻

Theorem 3.15 (Congruences can be added). Given a nonzero integer NN and integers a,b,c,da,b,c,d, if ab(modN)a \equiv b \pmod N and cd(modN)c \equiv d \pmod N, then a+cb+d(modN)a + c \equiv b + d \pmod N.

Proof. Fix a nonzero integer NN and assume a,b,c,da,b,c,d are integers such that ab(modN)a \equiv b \pmod N and cd(modN)c \equiv d \pmod N. Then NN divides both aba-b and cdc-d and hence NN also divides their sum, which is ab+cd=(a+c)(b+d)a-b+c-d = (a+c) - (b+d). By definition this means that a+cb+d(modN)a + c \equiv b + d \pmod N. ◻

Theorem 3.16 (Congruences can be multiplied). Given a nonzero integer NN and integers a,b,c,da,b,c,d, if ab(modN)a \equiv b \pmod N and cd(modN)c \equiv d \pmod N, then acbd(modN)a \cdot c \equiv b \cdot d \pmod N.

Proof. Fix a nonzero integer NN and assume a,b,c,da,b,c,d are integers such that ab(modN)a \equiv b \pmod N and cd(modN)c \equiv d \pmod N. Then NN divides both aba-b and cdc-d, so ab=Nka-b=Nk and cd=Nc-d=N\ell for some k,k,\ell\in{\mathbb{Z}}.

Therefore a+b(c+d)=ac+bd=Nk+N=N(k+)a+b-(c+d)=a-c+b-d=Nk+N\ell=N(k+\ell) where k+k+\ell is an integer by closure of {\mathbb{Z}} under addition. This means that N(a+b(c+d))N\mid (a+b-(c+d)) which shows a+bc+da+b\cong c+d by the definition of congruence.

Since a=b+Nka=b+Nk and c=d+Nc=d+N\ell we have ac=(b+Nk)(d+N)=bd+Nkd+Nb+N2k=bd+N(kd+b+Nk).ac=(b+Nk)(d+N\ell)=bd+Nkd+N\ell b+N^2k\ell=bd+N(kd+\ell b+Nk\ell). It follows that acbd=N(kd+b+Nk)ac-bd = N(kd+\ell b+Nk\ell), where kd+b+Nkkd+\ell b+Nk\ell is an integer by closure of the integers under addition and multiplication. Thus N(acbd)N\mid(ac-bd) and hence acbd(modN)ac \equiv bd \pmod N by the definition of congruence. ◻

3.2.2 Congruence classes

Definition 3.17. Fix a nonzero integer NN. For aa\in \mathbb Z, the congruence class of aa modulo NN is the subset of \mathbb Z consisting of all integers congruent to aa modulo NN; that is, the congruence class of aa modulo NN is the set [a]N:={b|ba(modN)}.[a]_N := \{b\in \mathbb Z\, | \, b\equiv a \pmod N\}.

Let me emphasize that congruence classes are sets, not numbers.

Example 3.18. What are some of the elements in [11]4[11]_4? Here are some examples 11,15,19,2311, 15, 19, 23 and 7,3,1,57, 3, -1, -5.

The set [7]2[7]_2 consists of all the odd integers. The set [8]2[-8]_2 consists of all the even integers.

True or False? Justify.

  1. 47[17]547 \in [17]_{5}. This is true since 4717(mod5)47 \equiv 17 \pmod{5}.

  2. [17]7[23]7=[17]_{7 } \cap [23]_{7 } = \emptyset. This is true: an element belonging to both sets would have to be congruent to both 1717 and 2323 modulo 77. If such a number existed, then 1717 and 2323 would be congruent modulo 77 (using the symmetric and transitive properties of congruences). But they are not. So no such number exists.

  3. [17]6[19]7=[17]_{6 } \cap [19]_{7 } = \emptyset. This is false; for instance, 55 belongs to both these sets. Notice that we are using two different values for NN in this example.

  4. For all integers aa, [a]60[a]10[a]_{60} \subseteq [a]_{10}. This is true. If b[a]60b \in [a]_{60} then ba(mod60)b \equiv a \pmod{60} and hence 6060 divides bab-a. But then 1010 would also divide bab-a and hence ba(mod10)b \equiv a \pmod{10}, so that b[a]10b \in [a]_{10}.

Example 3.19. Set N=3N = 3. Then there are three congruence classes modulo 33:

Notice that every integer belongs to one, and only one, of these three sets. They partition the integers into three non-overlapping subsets, likes countries on a map. A bit more formally, this property looks as follows.

Theorem 3.20. For any positive integer NN and any pair of integers a,ba,b exactly one of the following possibilities is true:

  1. [a]N=[b]N[a]_N=[b]_N or

  2. [a]N[b]N=[a]_N \cap [b]_N=\emptyset

Proof. Since we have seen in Theorem 3.14 that congruence modulo NN is an equivalence relation, and since the congruence classes are just the equivalence classes of this equivalence relation, the claim of this theorem follows from Corollary 3.8. ◻

We can make more precise when each of the two cases in the preceding theorem occurs.

Theorem 3.21. For any positive integer NN and any pair of integers a,ba,b we have

  1. [a]N=[b]N[a]_N=[b]_N if and only if ab(modN)a\equiv b\pmod{N}

  2. [a]N[b]N=[a]_N \cap [b]_N=\emptyset if and only if ab(modN)a\not\equiv b \pmod{N}.

Proof. The statement “[a]N=[b]N[a]_N=[b]_N if and only if ab(modN)a\equiv b\pmod{N}" is true by Theorem 3.7 applied to the equivalence relation congruence modulo NN.

We will now prove the statement “[a]N[b]N=[a]_N \cap [b]_N=\emptyset if and only if ab(modN)a\not\equiv b \pmod{N}".

Suppose [a]N[b]N=[a]_N \cap [b]_N=\emptyset. Then since a[a]Na\in [a]_N we obtain that a[b]Na\not\in [b]_N. Therefore, by definition of congruence classes ab(modN)a\not\equiv b \pmod{N}.

Suppose ab(modN)a\not\equiv b \pmod{N} then a[a]Na\in[a]_N but a[b]Na\not \in [b]_N. This shows that [a]N[b]N[a]_N\neq[b]_N, By Theroem 3.21 this means that [a]N[b]N=[a]_N \cap [b]_N=\emptyset must be true. ◻

Theorem 3.22. For any positive integer NN, there are exactly NN congruence classes modulo NN, namely the congruence classes [0]N,[1]N,[2]N,,,[N1]N[0]_N, [1]_N, [2]_N, ,\ldots, [N-1]_N.

Proof. Suppose NN is a positive integer. By Theorem 3.13 we have that any integer aa is congruent to some integer rr so that 0r<N0\leq r<N. By Theorem 3.21, since ar(modN)a\equiv r\pmod{N} we have [a]N=[r]N[a]_N=[r]_N for some integer rr so that 0r<N0\leq r<N. Since 0r<N0\leq r<N, the possibilities for rr are 0,1,2,,N10, 1,2, \ldots, N-1 and thus the possibilities for [a]N[a]_N are [0]N,[1]N,[2]N,,,[N1]N[0]_N, [1]_N, [2]_N, ,\ldots, [N-1]_N. We thus have at most NN congruence classes modulo NN.

To see that the congruence classes [0]N,[1]N,[2]N,,,[N1]N[0]_N, [1]_N, [2]_N, ,\ldots, [N-1]_N are distinct, we appeal to the uniqueness of rr in Theorem 3.13. Indeed, if we hve that [i]N=[j]N[i]_N=[j]_N so that 0iN10\leq i\leq N-1 and 0iN10\leq i\leq N-1 then applying Theorem 3.13 for a=ia=i we get that both ii and jj satisfy the properties required for rr and thus i=ji=j by uniqueness. Thus the congruence classes [0]N,[1]N,[2]N,,,[N1]N[0]_N, [1]_N, [2]_N, ,\ldots, [N-1]_N are distinct and so we have exactly NN congruence classes. ◻

3.3 Modular arithmetic

We come to a very important point. We will see shortly that although congruence classes modulo NN are sets, not numbers, they behave like numbers in the sense that they can be added and multiplied. Thus we can talk about arithmetic on the following set.

Definition 3.23. For a positive integer NN, N\mathbb{Z}_N is the set of congruence classes of integers modulo NN, that is N={[0]N,[1]N,,[N1]N}.\mathbb{Z}_N=\{[0]_N, [1]_N, \ldots, [N-1]_N\}.

Using the fact that congruences can be added and multiplied, we may introduce rule for adding and multiplying congruence classes

Definition 3.24. Fix a non-zero integer NN and let a,ba,b\in{\mathbb{Z}}. Define [a]N+[b]N[a]_N + [b]_N to be the congruence class modulo NN given by the following rule:

Pick any x[a]Nx\in [a]_N and pick any y[b]y\in [b], and define [a]N+[b]N:=[x+y]N[a]_N+[b]_N:= [x+y]_N.

For example if we picked x=ax=a and y=by=b we would get [a]N+[b]N:=[a+b]N.[a]_N+[b]_N:= [a+b]_N.

Example 3.25. Let us compute [17]5+[11]5[17]_5 + [11]_5. The rule says that we start by picking any elements x[17]5x \in [17]_5 and y[11]5y \in [11]_5. I’ll choose x=2x = 2 and y=6y = 6. Next we add our choices for xx and yy as ordinary integers to get 88, and conclude that [17]5+[11]5=[8]5[17]_5 + [11]_5 = [8]_5. Note that [8]5=[3]5[8]_5 = [3]_5, so we could also say that [17]5+[11]5=[3]5[17]_5 + [11]_5 = [3]_5. Since [17]5=[2]5[17]_5 = [2]_5 and [11]5=[1]5[11]_5 = [1]_5, we could rewrite this yet again as [2]5+[1]5=[3]5[2]_5 + [1]_5 = [3]_5.

Now you try it:

Example 3.26. Have each member of your group compute [120]6+[13]6[120]_6 + [13]_6. Then compare your answers. Are they equal? Repeat this for [19]6+[23]6[-19]_6 + [23]_6.

Solution: If one of us were to pick 114[120]6114 \in [120]_6 and 7[13]67 \in [13]_6, the rule for adding congruence classes gives [120]6+[13]6=[121]6[120]_6 + [13]_6 = [121]_6. If another one of us were to pick 0[120]60 \in [120]_6 and 1[13]61 \in [13]_6, the rule for adding congruence classes gives [120]6+[13]6=[1]6[120]_6 + [13]_6 = [1]_6. This gives the same result since [121]6=[1]6[121]_6 = [1]_6, because 1211(mod6)121 \equiv 1 \pmod{6}.

Perhaps you notice a troubling aspect of the definition for adding congruence classes: the definition of [a]N+[b]N[a]_N+[b]_N seems to depend on choices of representative elements xx and yy. What if one pair of choices leads to a different outcome than another pair of choice? That would be non-sensical.

Theorem 3.27. The definition of [a]N+[b]N[a]_N+[b]_N is independent of the choices mades. That is, if x1,x2[a]Nx_1, x_2 \in [a]_N and y1,y2[b]Ny_1, y_2 \in [b]_N, then [x1+y1]N=[x2+y2]N[x_1 + y_1]_N = [x_2 + y_2]_N.

Proof. Our proof will use two theorems proven before: (1) aa and bb belong to the same congruence class modulo NN if and only if ab(modN)a \equiv b \pmod{N} (Theorem 3.21) and (2) congruence classes may be added (Theorem 3.15).

Since x1x_1 and x2x_2 belong to the same congruence class, by Theorem 3.21 we have that x1x2(modN)x_1 \equiv x_2 \pmod{N}, and likewise since y1y_1 and y2y_2 belong to the same congruence class, y1y2(modN)y_1 \equiv y_2 \pmod{N}. By Theorem 3.15, we may add these two congruence classes to conclude x1+y1x2+y2(modN)x_1 + y_1 \equiv x_2 + y_2 \pmod{N}. Using Theorem 3.21 again, it follows that [x1+y1]N=[x2+y2]N[x_1 + y_1]_N = [x_2 + y_2]_N. ◻

Now we can define multiplication of congruence classes.

Definition 3.28. Fix a non-zero integer NN and let a,ba,b\in{\mathbb{Z}}. Define [a]N[b]N[a]_N \cdot [b]_N to be the congruence class modulo NN given by the following rule:

Pick any x[a]Nx\in[a]_N and pick any y[b]Ny\in [b]_N, and define [a]N[b]N:=[xy]N[a]_N\cdot[b]_N:=[xy]_N.

For example if we picked x=ax=a and y=by=b we would get [a]N+[b]N:=[ab]N.[a]_N+[b]_N:= [ab]_N.

Theorem 3.29. The definition of multiplying congruence classes is independent of the choices made. In more detail, if [a]N[a]_N and [b]N[b]_N are two congruence classes modulo NN, and we pick x1,x2Xx_1, x_2 \in X, and y1,y2Yy_1, y_2 \in Y, then [x1y1]N=[x2y2]N[x_1 y_1]_N = [x_2 y_2]_N.

Proof. Our proof will use two theorems proven before: (1) aa and bb belong to the same congruence class modulo NN if and only if ab(modN)a \equiv b \pmod{N} (Theorem 3.21) , and (2) congruence classes may be multiplied (Theorem 3.16).

Since x1x_1 and x2x_2 belong to the same congruence class, by (1) we have that x1x2(modN)x_1 \equiv x_2 \pmod{N}, and likewise since y1y_1 and y2y_2 belong to the same congruence class, y1y2(modN)y_1 \equiv y_2 \pmod{N}. By (2), we may add these two congruence classes to conclude x1y1x2y2(modN)x_1 y_1 \equiv x_2 y_2 \pmod{N}. Using (1) again, it follows that [x1y1]N=[x2y2]N[x_1 y_1]_N = [x_2 y_2]_N. ◻

3.3.1 Units in N{\mathbb{Z}}_N

Definition 3.30. A congruence class [a]N[a]_N in N{\mathbb{Z}}_N is called a unit if it has a multiplicative inverse. A multiplicative inverse of [a]N[a]_N is a congruence class [b]N[b]_N in N{\mathbb{Z}}_N such that [b]N[a]N=[1]N[b]_N \cdot [a]_N = [1]_N.

Example 3.31. In 6\mathbb{Z}_6, [1]6[1]_6 and [5]6[5]_6 are the only units. The multiplicative inverse of [1]6[1]_6 is [1]6[1]_6, and the multiplicative inverse of [5]6[5]_6 is [5]6[5]_6.

Theorem 3.32. Given a non-zero integer NN and any integer aa, there exists some integer xx such that xa1(modN)x \cdot a \equiv 1 \pmod{N} if and only if gcd(a,N)=1\gcd(a,N) = 1.

Proof. Recall that gcd(a,N)=1\gcd(a,N) = 1 if and only if xa+yN=1xa + yN = 1 for some integers xx and yy. The existence of integers xx and yy such that xa+yN=1xa + yN = 1 is equivalent to the existence of an integer xx such that NN divides xa1xa-1, or equivalently, such that xa1(modN)xa \equiv 1 \pmod N. ◻

We can translate Theorem 3.32 into a statement about congruence classes:

Corollary 3.33. Given a non-zero integer NN and any integer aa, the congruence class [a]N[a]_N is a unit in N{\mathbb{Z}}_N if and only if gcd(a,N)=1\gcd(a,N)=1.

Proof. There exists some [x]N[x]_N in N\mathbb{Z}_N such that [a]N[x]N=[ax]N=[1]N[a]_N [x]_N = [ax]_N = [1]_N if and only if there is an integer xx such that ax1(modN)ax \equiv 1 \pmod N if and only if gcd(a,N)=1\gcd(a,N) = 1 by the previous theorem. So the congruence class [a]NN[a]_N \in {\mathbb{Z}}_N is a unit if and only if gcd(a,N)=1\gcd(a,N) = 1. ◻

How do we find multiplicative inverses? We can use trial and error, but it gets very slow; or we can use the extended Euclidean algorithm.

Example 3.34. Note that [131]260[131]_{260} is a unit. We can find its multiplicative inverse via the extended Euclidean algorithm, as follows. First, we note that 260=1131+129131=1129+2129=642+12=21+0,\begin{align*} 260 & = 1 \cdot 131 + 129 \\ 131 &= 1 \cdot 129 + 2 \\ 129 &= 64 \cdot 2 + 1 \\ 2 &= 2 \cdot 1 + 0, \end{align*} so that indeed gcd(131,260)=1\gcd(131,260)=1. We can now use the computations above to find a linear combination of 131131 and 260260 that is equal to 1: 129=126011312=11311129=11311(12601131)=1260+21311=129642=1260113164(1260+2131)=65260+(129)131\begin{align*} 129 & = 1 \cdot 260 - 1 \cdot 131 \\ 2 &= 1 \cdot 131 - 1 \cdot 129 = 1 \cdot 131 - 1 \cdot (1 \cdot 260 - 1 \cdot 131) = - 1 \cdot 260 + 2 \cdot 131 \\ 1 &= 129 - 64 \cdot 2 = 1 \cdot 260 - 1 \cdot 131 - 64 \cdot (- 1 \cdot 260 + 2 \cdot 131) = 65 \cdot 260 + (-129) \cdot 131 \\ \end{align*}

Therefore, (129)1311(mod260),(-129) \cdot 131 \equiv 1 \pmod {260}, so that [129]260=[131]260[-129]_{260} = [131]_{260} is the multiplicative inverse of [131]260[131]_{260}.

4 Rings

4.1 Binary operations

Definition 4.1. A binary operation on a set SS is a function from S×SS\times S to SS. In other words, it is a rule that assigns to two inputs taken from SS, one output which also belongs to SS.

Example 4.2. For example, addition and subtraction are binary operations on set the of integers.

If \star denotes a binary operation, we write xyx \star y to indicate the result of applying \star to the pair of inputs (x,y)(x,y), just as we would with the usual symbols ++ and - for the sum and subtraction of integers.

Remark 4.3. In defining a binary operation, we are implicitly stating that the set SS is closed under the operation.

Example 4.4. For instance, if we tried to define ab=aba \star b = \frac{a}{b} on the set of non-zero integers, we would run into trouble: this is not an operation since ab\frac{a}{b} isn’t always an integer.

Example 4.5.

  1. Which of the following rules \star are binary operations on the indicated set SS?

    1. Let SS be all non-zero real numbers and let ab=aba \star b = \frac{a}{b}.

    2. Let SS be the set of all even integers and ab=a+ba \star b = a + b

    3. Let SS be the set of all odd integers and ab=a+ba \star b = a + b

Solution: The operations in (a) and (b) are in fact binary operations in the sets given, though (c) is not since it does not satisfy closure: the sum of two odd integers is never an odd integer.

Here are some special abstract properties a binary operation can have.

Definition 4.6.

Example 4.7. For each of the following binary operations, determine if they are associative or commutative:

  1. SS is the set of all non-zero real numbers and the operation is division: ab=a/ba \star b = a/b.

    Solution: This operation is not commutative, since for example 1/22/11/2 \neq 2/1. It is also not associative; associativity would mean that (a/b)/c=a/(b/c)(a/b)/c = a/(b/c), and yet (a/b)/c=ab1c=abc while a/(b/c)=acb=acb.(a/b)/c = \frac{a}{b} \cdot \frac{1}{c} = \frac{a}{bc} \textrm{ while } a/(b/c) = a \cdot \frac{c}{b} = \frac{ac}{b}. For example, (1/2)/2=1/41=1/1=1/(2/2).(1/2)/2 = 1/4 \neq 1 = 1/1=1/(2/2).

  2. \mathbb{R} with the operation of subtraction: ab=aba \star b = a - b.

    Solution: This operation is neither commutative nor associative: commutativity fails for example in 12=11=211-2 = -1 \neq 1 = 2-1, while a counterexample to associativity is (11)1=11=1(11)(1-1)-1 = -1 \neq 1 = 1-(1-1).

  3. \mathbb{R} with the operation of addition.

    Solution: This is in fact a commutative and associative operation.

  4. \mathbb{R} with the operation being taking the maximum of two real numbers, that is, the operation is ab=max{a,b}a \star b = \max\{a,b\}.

    Solution: This is in fact a commutative and associative operation.

Definition 4.8. Recall that 2×22 \times 2 matrix is an array of real numbers of the form [abcd]\begin{bmatrix} a & b \\ c & d \\ \end{bmatrix} and that one may add 2×22 \times 2 matrices using the rule [abcd]+[rstu]=[a+rb+sc+td+u]\begin{bmatrix} a & b \\ c & d \\ \end{bmatrix} + \begin{bmatrix} r & s \\ t & u \\ \end{bmatrix} = \begin{bmatrix} a+r & b+s\\ c+t & d+u\\ \end{bmatrix}

One can also multiply 2×22 \times 2 matrices by the rule [abcd][rstu]=[ar+btas+bucr+dtcs+du]\begin{bmatrix} a & b \\ c & d \\ \end{bmatrix} \cdot \begin{bmatrix} r & s \\ t & u \\ \end{bmatrix} = \begin{bmatrix} ar + bt & as + bu \\ cr + dt & cs + du \\ \end{bmatrix}

Example 4.9. Addition of matrices is a binary operation on the set of all two-by-two matrices. This is in fact an associative and commutative operation.

The fact that multiplication of matrices is associative is one of the first properties one learns in a linear algebra class when talking about product of matrices. However, this is not a commutative operation! For example, [1000][0100]=[0100][0000]=[0100][1000].\begin{bmatrix} 1 & 0 \\ 0 & 0 \\ \end{bmatrix} \begin{bmatrix} 0 & 1 \\ 0 & 0 \\ \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \\ \end{bmatrix} \neq \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \\ \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ \end{bmatrix}.

We can also describe operations by tables, like we do with ++ tables and ×\times tables. Here are some operation tables for operations on the set {a,b,c,d}\{a,b,c,d\}: the entry in row xx and column yy for operation \star means xyx\star y. Decide for each whether the operation is commutative, has an identity, and/or has inverses.

\clubsuit a b c d
a a b c d
b b b c d
c c c c d
d d d d d
\diamondsuit a b c d
a a d c b
b b a d c
c c b a d
d d c b a
\heartsuit a b c d
a a a a a
b a b c d
c a c a c
d a d c b
\spadesuit a b c d
a a b c d
b b c d a
c c d a b
d d a b c

To see if one of these operations has an identity, we want to look for a row and a column that are unchanged, and that correspond to multiplication by the same element on the left and right. For example, we see that the row and column for aa in the table for \clubsuit look exactly like the original row/column on the other side, so aa is the identity for this operation.

To see if one of these operations is commutative, we want to see that the entries corresponding to xyx \star y and yxy \star x coincide; more precisely, this means that the entries in positions (i,j)(i,j) and (j,i)(j,i) are the same. If we pretend that the entries in our tables correspond to the entries in a matrix, we want the transpose of our table to be the same as the original table, or else the operation is not commutative.

Finally, if an operation has an identity, we can spot the elements with inverses by looking for entries in the table that are equal to the identity. Notice that we need both xy=ex \star y = e and yx=ey \star x = e for xx to be the inverse of yy, so if the identity for our operation appears in position (i,j)(i,j) in the table, we also need the identity to appear in position (j,i)(j,i) to say that we found elements with an inverse.

In the tables above, we have:

With this information, we can set out to identify two of these operations as the sum and multiplication in 4\mathbb{Z}_4. Both are commutative, so \diamondsuit is certainly none of them. Now we note that every element in 4\mathbb{Z}_4 has an inverse for the addition: [0]4[0]_4 is the identity, [1]4[1]_4 is the inverse of [3]4[3]_4, and [2]4[2]_4 is its own inverse. We conclude that \spadesuit is the addition in 4\mathbb{Z}_4, with a=[0]4a=[0]_4, c=[2]4c = [2]_4. Therefore, bb and dd must be [1]4[1]_4 and [3]4[3]_4, though we don’t know which one is which; either choice will work.

To identify the multiplication in 4\mathbb{Z}_4, we again want a commutative operation with identity, but this time we have an extra fun property: when we multiply any element in 4\mathbb{Z}_4 by [0]4[0]_4, we always get [0]4[0]_4 as the result. So in the correct table we will find a row and column where every entry is the same element... so it must be \heartsuit!

Example 4.10. Which of the following operations have an identity? If so, what is it?

  1. multiplication of 2×22\times 2 matrices.

  2. division of positive real numbers ab=aba \star b =\frac{a}{b}.

  3. averaging two real numbers, that is ab=a+b2a \star b = \frac{a+ b}{2}.

  4. maximum of two real numbers: that is ab=max{a,b}a \star b = \max\{a,b\}.

Solution:

  1. The multiplication of 2×22 \times 2 matrices has an identity: the matrix [1001]\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}.

  2. The division of positive real numbers does not have an identity! While a/1=aa/1=a for any real number, 1/a=a1/a = a is not true for most real numbers.

  3. The operation of averaging two real numbers does not have an identity. If e+a2=a\frac{e + a}{2} = a, then e+a=2ae + a = 2a and hence e=ae = a. So there is no single elements ee that works simultaneously for all aa.

  4. The maximum of two real numbers does not have an identity.

4.2 Rings

Definition 4.11. A ring is a set RR with two binary operations, written as ++ and ×\times, and called addition and multiplication, respectively, such that the following properties all hold:

  1. Closure under addition: If a,ba,b are elements of RR, then a+bRa+b\in R.

  2. Associativity of addition: If a,b,ca,b,c are elements of RR, then (a+b)+c=a+(b+c)(a+b)+c=a+(b+c).

  3. Commutativity of addition: If a,ba,b are elements of RR, then a+b=b+aa+b=b+a.

  4. Additive identity: There exists an element 0R0\in R so that for any aRa\in R we have a+0=a=0+aa+0=a=0+a.

  5. Additive inverses: For every aRa\in R there is an element denoted aR-a\in R so that a+(a)=0=(a)+aa+(-a)=0=(-a)+a.

  6. Closure under multiplication: If a,ba,b are elements of RR, then abRab\in R.

  7. Associativity of multiplication: If a,b,ca,b,c are elements of RR, then (ab)c=a(bc)(ab)c=a(bc).

  8. Multiplicative identity: There exists an element 1R1\in R so that for any aRa\in R we have a1=a=1aa\cdot 1=a=1\cdot a.

  9. Distibutivity: If a,b,ca,b,c are elements of RR, then a(b+c)=ab+aca(b+c)=ab+ac and (b+c)a=ba+ca(b+c)a=ba+ca.

We may say that (R,+,×)(R,+,\times) is a ring to indicate that RR is a ring with addition given by the operation ++ and multiplication given by the operation ×\times.

Warning: our definition of ring is called a “ring with identity” in some textbooks which allow rings not to have a multiplicative identity.

Definition 4.12. A ring RR is called a commutative ring if in addition to all the properties in Definition 4.11 it satisfies

  1. Commutativity of multiplication: If a,ba,b are elements of RR, then ab=baab=ba.

Theorem 4.13. The set of all integers \mathbb{Z} with the usual sum and multiplication of integers is a commutative ring.

Proof. In the case of integers, the properties listed in the definition of a ring are so basic to the nature of the integers that they cannot be “proven” in any meaningful way, but are rather taken as axioms; see the Axioms in section 1.1 for a list of these properties. The validity of these axioms establishes that \mathbb{Z} is a commutative ring. The additive and multiplicative identities are the familiar integers 00 and 11. ◻

Theorem 4.14. The set of congruences classes modulo NN with the addition and multiplication of congruence classes given by [a]N+[b]N=[a+b]N and [a]N[b]N=[ab]N[a]_N + [b]_N = [a+b]_N \textrm{ and } [a]_N \cdot [b]_N = [a \cdot b]_N is a commutative ring.

Proof. We will not check all ten axioms in detail, but rather just check some of them. The common theme is that each axiom is a consequence of the fact that these axioms hold for {\mathbb{Z}}. For instance, the distributive law holds since [a]N([b]N+[c]N)=[a]N[b+c]N=[a(b+c)]N=[ab+ac]N=[ab]N+[ac]N=[a]N[b]N+[a]N[c]N[a]_N \cdot ([b]_N + [c]_N) = [a]_N \cdot [b+c]_N = [a(b+c)]_N = [ab + ac]_N = [ab]_N + [ac]_N = [a]_N [b]_N + [a]_N [c]_N and here the key step is the third equality, which is the distributive law for the integers.

The element [0]N[0]_N is an additive identity, since [0]N+[a]N=[0+a]N=[a]N[0]_N + [a]_N = [0+a]_N = [a]_N, with the second equations holding since 00 is the additive identity for the integers.

Given an element [a]N[a]_N, it has an additive inverse, namely [a]N[-a]_N, since [a]N+[a]N=[aa]N=[0]N[a]_N + [-a]_N = [a - a]_N = [0]_N.

The element [1]N[1]_N is a multiplicative identity, since [1]N[a]N=[1a]N=[a]N[1]_N \cdot [a]_N = [1 \cdot a]_N = [a]_N, with the second equations holding since 11 is the multiplicative identity for the integers.

All the other axioms are proven in a similar manner. ◻

Theorem 4.15. The set of all 2×22\times 2 matrices with coefficients in \mathbb R under the rules for adding and multiplying matrices given by [abcd][rstu]=[ar+btas+bucr+dtcs+du]\begin{bmatrix} a & b \\ c & d \end{bmatrix} \cdot \begin{bmatrix} r & s \\ t & u \end{bmatrix} = \begin{bmatrix} ar + bt & as + bu \\ cr + dt & cs + du \end{bmatrix} and [abcd]+[rstu]=[a+rb+sc+td+u].\begin{bmatrix} a & b \\ c & d \\ \end{bmatrix} + \begin{bmatrix} r & s \\ t & u \end{bmatrix} = \begin{bmatrix} a+r & b+s\\ c+t & d+u \end{bmatrix}. is a non-commutative ring.

Proof. This is indeed a ring, and each axiom is the sort of thing you might have seen justified in a class in matrix theory. For instance, the multiplicative identity is I2=[1001]I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ \end{bmatrix} since it follows from the rule for multiplying matrices that I2A=A=AI2I_2 \cdot A = A = A \cdot I_2 for all two-by-two matrices AA. ◻

Let us look at some other examples. In each case, we will determine if the given set with the given operations forms a ring.

Example 4.16.

  1. The set of all even integers under the usual rules for adding and multiplying.

    This is not a ring. The only axiom that fails is the existence of a multiplicative identity: there is no even integer xx such that xa=ax \cdot a = a for all even integers aa.

  2. The set of all positive real numbers under the usual rules for adding and multiplying.

    This is not a ring. One axiom that fails is the existence of an additive identity: there is no positive real number xx such that x+a=ax + a = a for all positive real numbers aa. (Since there is no additive identity, Axiom 5 is not even defined.)

  3. The set of all function from \mathbb{R} to \mathbb{R} with the following rules for adding and multiplying functions (f+g)(x)=f(x)+g(x)(f+g)(x)=f(x)+g(x) and (fg)(x)=f(x)g(x)(f\cdot g)(x)=f(x)\cdot g(x).

    This is indeed a ring. Remember, if ff and gg are two such functions, their sum f+gf + g is the function whose value at xx \in \mathbb{R} is f(x)+g(x)f(x) + g(x) and their product is the function fgf \cdot g whose values at xx is f(x)g(x)f(x) \cdot g(x). As we know from calculus, adding and multiplying continuous functions gives continuous functions , and so the closures under addition and multiplication hold. The additive identity is the constant function 00 and the multiplicative identity is the constant function 11. The additive inverse of a function ff is f-f, the function whose value at xx is f(x)-f(x). The remaining axioms all holds since these axioms hold for the real numbers themselves. We’ll skip the details.

  4. The set of all increasing function from \mathbb{R} to \mathbb{R} with the following rules for adding and multiplying functions (f+g)(x)=f(x)+g(x)(f+g)(x)=f(x)+g(x) and (fg)(x)=f(x)g(x)(f\cdot g)(x)=f(x)\cdot g(x).

    This is not a ring since it doesn’t satisfy closure under multiplication. The function f:R,f(x)=xf:\mathbb{R}\to R, f(x)=x is increasing but the function (ff)(x)=f(x)2=x2(f\cdot f)(x)=f(x)^2=x^2 is not increasing (recall the graph of x2x^2 has a dip).

Let’s review the examples of rings we have seen so far

Example 4.17.

  1. the integers {\mathbb{Z}}

  2. the set of congruence classes modulo NN, N\mathbb{Z}_N

  3. all other sets of numbers such as the rationals \mathbb{Q}, the reals \mathbb{R}, the complex numbers \mathbb{C}

  4. the set of 2×22\times 2 matrices with real number entries 2×2()\mathcal{M}_{2\times 2}(\mathbb{R}) with the operations described in Theorem 4.15. In fact there is nothing special about 2×22\times 2 matrices. For any integer n1n\geq 1, the set of n×nn\times n matrices with real number entries n×n()\mathcal{M}_{n\times n}(\mathbb{R}) is also a ring. Even more generally for any integer n1n\geq 1, the set of n×nn\times n matrices with entries in a commutative ring RR is a ring.

  5. the set of functions ={f:}\mathcal{F}=\{f:\mathbb{R}\to \mathbb{R}\} with the operations described in Example 4.16.

Many of the rings we have seen so far are commutative rings. Each of {\mathbb{Z}}, N{\mathbb{Z}}_N, and the ring of all functions from \mathbb{R} to \mathbb{R} are commutative rings. We have seen only one example so far of a non-commutative ring: The ring of two-by-two matrices with real entries is a non-commutative ring.

4.3 Properties of rings

In this section we see to heat extent the rules of arithmetic we are familiar with from working with, say, the integers still continue to hold in any ring. Since we don’t know anything about rings other than the 9 properties in their Definition 4.11, we will need to prove the theorem below using only those 9 properties.

Theorem 4.18. Let (R,+,)(R,+,\cdot) be a ring.

  1. 0a=00 \cdot a = 0 and a0=0a \cdot 0 = 0 for all aRa \in R.

  2. (a)b=(ab)(-a) \cdot b = - (a \cdot b) for any elements a,bRa, b \in R.

  3. (1)a=a(-1) \cdot a = - a for any element aRa \in R.

  4. The zero element 0R0_R and the one element 1R1_R of RR are unique.

  5. Every element xRx\in R has a unique additive inverse.

  6. For every element xRx \in R, if xx is a unit, that is, if xx has a two-sided multiplicative inverse, then that inverse is unique.

Proof. Suppose RR is a ring.

(a) Let aRa\in R. Then we compute 0+0=0 by additive identity(0+0)a=0a by multiplying a to both sides 0a+0a=0a by distributivity(0a)+0a+0a=(0a)+0a by adding (0a) to both sides0+0a=0 by additive inverses0a=0 by additive identity.\begin{align*} 0+0 &= 0 & \text{ by additive identity} \\ (0+0)\cdot a&=0\cdot a & \text{ by multiplying } a \text{ to both sides } \\ 0\cdot a + 0\cdot a&= 0\cdot a& \text{ by distributivity}\\ -(0\cdot a) + 0\cdot a + 0\cdot a&=- (0\cdot a) + 0\cdot a & \text{ by adding }- (0\cdot a)\text{ to both sides}\\ 0+ 0\cdot a &=0 &\text{ by additive inverses}\\ 0\cdot a &=0 &\text{ by additive identity}. \end{align*} We have thus proved that 0a=00\cdot a =0. The proof of a0=0a\cdot 0 =0 is similar and I omit it.

(d) Assume that there are two additive identity elements 00 and 00'. Then we have 0=0+0 because 0 is an additive identity0=0+0 because 0 is an additive identity.\begin{align*} 0 &= 0+0' & \text{ because }0'\text{ is an additive identity} \\ 0' &= 0+ 0' & \text{ because }0\text{ is an additive identity}. \end{align*} Because the right hand sides are the same abovewe conclude 0=00=0'. Therefore there is only one additive identity.

Assume that there are two multiplicative identity elements 11 and 11'. Then we have 1=11 because 1 is an additive identity1=11 because 1 is an additive identity.\begin{align*} 1 &= 1\cdot 1' & \text{ because }1'\text{ is an additive identity} \\ 1' &= 1\cdot 1' & \text{ because }1\text{ is an additive identity}. \end{align*} Because the right hand sides are the same above we conclude 1=11=1'. Therefore there is only one multiplicative identity. (b) Let a,bRa, b\in R. We shall first show that (a)b(-a)\cdot b is an additive inverse for aba\cdot b. We compute using distributivity and part (d) that (a)b+ab=((a)+a)b=0b=0ab+(a)b=(a+(a))b=0b=0.\begin{eqnarray*} (-a)\cdot b + a\cdot b=((-a)+a)\cdot b=0\cdot b=0 \\ a\cdot b+(-a)\cdot b=(a+(-a))\cdot b=0\cdot b=0. \end{eqnarray*} By what we have computed above (a)b(-a)\cdot b is an additive inverse for aba\cdot b. By definition (ab)-(ab) is also an additive inverse for aba\cdot b. By part (b) aba\cdot b has a unique additive inverse, which means that (a)b=(ab)(-a)\cdot b=-(ab).

For (c), we apply (b) using a=1a = 1 and b=ab = a to get (1)a=(1a)(-1) a = -(1 \cdot a). The right side is equal to a-a by the multiplicative inverse property and thus (1)a=a(-1)a = -a.

(e) Let xRx\in R and assume xx has two additive inverses yRy\in R and zRz\in R.

Then x+y=0 because y is an additive inverse of xz+(x+y)=z+0 by adding z to both sides (z+x)+y=z by associativity of addition and the additive identity property0+y=z because z is an additive inverse of xy=z by additive identity.\begin{align*} x+y &= 0 & \text{ because }y\text{ is an additive inverse of } x \\ z+(x+y)&= z+0 & \text{ by adding } z \text{ to both sides } \\ (z+x)+y&= z& \text{ by associativity of addition and the additive identity property}\\ 0+y &=z & \text{ because }z\text{ is an additive inverse of } x \\ y &=z &\text{ by additive identity}. \end{align*} Since y=zy=z we have thus shown that there is only one additive inverse for xx.

(f) Let xRx\in R and assume xx has two multiplicative inverses yRy\in R and zRz\in R. Then xy=1 because y is a multiplicative inverse of xz(xy)=z1 by multiplying z to both sides (zx)y=z by associativity of multiplication and the multiplicative identity property1y=z because z is a multiplicative inverse of xy=z by multiplicative identity.\begin{align*} xy &= 1 & \text{ because }y\text{ is a multiplicative inverse of } x \\ z(xy)&= z\cdot 1 & \text{ by multiplying } z \text{ to both sides } \\ (zx)y&= z& \text{ by associativity of multiplication and the multiplicative identity property}\\ 1\cdot y &=z & \text{ because }z\text{ is a multiplicative inverse of } x \\ y &=z &\text{ by multiplicative identity}. \end{align*}

Since y=zy=z we have thus shown that there is only one multiplicative inverse for xx. ◻

4.3.1 The Zero Ring

Theorem 4.19. The one-element set {0}\{0\} equipped with binary the operations 0+0=00 + 0 = 0 and 00=00 \cdot 0 = 0 is a ring.

Conversely, if RR is any ring such that 0R=1R0_R = 1_R, then RR has only one element, namely 0R0_R.

Proof. It is straightforward to verify most of the nine axioms of a ring for the one element ring {0}\{0\}. The one that is a bit confusing is Axiom 9, which states that there exists some element 1R1_R of RR such that 1Ra=a1_R \cdot a = a and a1R=aa \cdot 1_R = a for all aa in RR. Set 1R=01_R = 0. Then 1R0=00=01_R \cdot 0 = 0 \cdot 0 = 0 and 01R=00=00 \cdot 1_R = 0 \cdot 0 = 0. Since 00 is the only element of this ring, this does indeed verify Axiom 9.

Now assume RR is any ring such that 0R=1R0_R = 1_R. Pick any element aRa \in R. We will prove that a=0Ra = 0_R and hence RR has just one element. By Theorem 1 above, 0Ra=0R0_R \cdot a = 0_R, and by Axiom 9 we have 1Ra=a1_R \cdot a = a. But we are assuming 0R=1R0_R = 1_R and hence 0R=0Ra=1Ra=a0_R = 0_R \cdot a = 1_R \cdot a = a, so that a=0Ra = 0_R. ◻

4.4 Subrings

Definition 4.20. A subset SS of a ring RR is a subring of RR if SS is a ring for the same operations of addition and multiplication from RR.

Example 4.21 (Trivial rings). If RR is a ring then {0R}\{0_R\} and RR are subrings of RR. These are called the trivial subrings of RR.

Example 4.22. \mathbb{R} is a ring, and each of {\mathbb{Z}} and \mathbb{Q} is a non-trivial subring of \mathbb{R}.

Example 4.23. What are the subrings of {\mathbb{Z}}? Every ring is a subring of itself, and thus {\mathbb{Z}} is a subring. Also {0}\{0\} is a subring of {\mathbb{Z}}. These are the only two subrings of {\mathbb{Z}}.

Proposition 4.24. Assume RR is a ring and that SRS\subseteq R is subset of RR. If

then SS is a subring of RR.

Proof. The assumptions show that Axioms 1, 4, 5, 6 and 9 all hold. The remaining axioms are automatic since if they hold in RR they certainly hold for the subset SS of RR. For instance, since addition is commutative in RR we have that a+b=b+aa + b = b + a for all elements a,bRa,b \in R, and so it certainly is true that a+b=b+aa + b = b + a for all elements a,bSa,b \in S. Thus, SS is a ring under the same operations of addition and multiplication as in RR. Moreover, the first two assumptions show that SS contains 0R0_R and 1R1_R, as is required of a subring. (Another way of saying this is that 0S=0R0_S = 0_R and 1S=1R1_S = 1_R hold.) ◻

Example 4.25. For a fixed positive integer NN, what are the subrings of N{\mathbb{Z}}_N? As with {\mathbb{Z}}, one subring is N{\mathbb{Z}}_N itself, and another is {0}\{0\}. But for certain values of NN we can have other subrings as well: for example one can check that the set S={[0]12,[4]12,[8]12}S=\{[0]_{12}, [4]_{12}, [8]_{12}\} is a subring of 12{\mathbb{Z}}_{12} with 0S=[0]12,1S=[4]120_S=[0]_{12}, 1_S=[4]_{12}. This shows that the converse of Proposition 4.24 is not valid in the sense that if SS is a subring of RR it does not mean that we must have 1RS1_R\in S.

4.5 Domains and fields

Definition 4.26. Let RR be a ring. An element rRr \in R is a unit in RR if it has a two-sided multiplicative inverse, that is, if there exists sRs \in R such that rs=1=srrs = 1 = sr.

Recall that you proved before that if an element rr has a multiplicative inverse, then it has only one such inverse. In other words, inverses are unique when they exist. If rr is a unit in a ring, we write its unique inverse as r1r^{-1}.

Definition 4.27. Let RR be a commutative ring. A nonzero element rRr \in R is a zerodivisor in RR if there exists a nonzero sRs \in R such that rs=0rs =0.

Definition 4.28. A ring RR is an integral domain, or simply a domain, if RR is commutative, 0R1R0_R \ne 1_R, and RR has no zerodivisors.

Definition 4.29. A ring RR is a field if RR is commutative, 1R0R1_R \ne 0_R, and every nonzero element is a unit.

Proposition 4.30. A ring RR is a domain if and only if

Proof. ()(\rightarrow) Suppose RR is a domain. Then RR is commutative and 0R1R0_R \ne 1_R, by definition of domain. Suppose a,bRa, b \in R are such that ab=0Rab = 0_R and assume towards a contradiction that a0Ra \neq 0_R and b0Rb \neq 0_R. Then aa and bb are zero divisors, contradicting the fact that a domain must not have any zero divisors. Thus the assumption is false and a=0Ra = 0_R or b=0Rb = 0_R must be true.

()(\Leftarrow) Suppose the three bullet points hold and assume towards a contradiction that RR is not a domain. Since the conditions that RR is commutative and 0R1R0_R \ne 1_R are satisfied it must be the case that RR has zero divisors. Let aa be a zero divisor. Then by definition of zero divisor a0Ra\neq 0_R and there exists bRb\in R so that b0Rb\neq 0_R and ab=0Rab=0_R. This contradicts the third butlet point. Thus the assumption is false and RR is a domain. ◻

Example 4.31. The ring \mathbb{Z} is a domain: it is a commutative ring, 101_{\mathbb{Z}} \neq 0_{\mathbb{Z}}, and given integers aa and bb, if ab=0ab=0 then a=0a=0 or b=0b=0. However, \mathbb{Z} is not a field, since for example 22 is not a unit. In fact, the only units in \mathbb{Z} are 11 and 1-1.

Example 4.32. The rings \mathbb{R} and \mathbb{Q} are both fields.

Theorem 4.33. Let N>1N > 1 be an integer.

  1. N\mathbb{Z}_N a field if and only if NN is prime.

  2. N\mathbb{Z}_N is a domain if and only if NN is prime.

Proof.

  1. We have shown before that [a]N[a]_N is a unit if and only if gcd(a,N)=1\gcd(a,N) = 1.

    Suppose N=pN = p is prime. Since pp is prime, for each integer aa, either pp divides aa or gcd(a,p)=1\gcd(a,p) = 1. Equivalently, if [a]p[0]p[a]_p \neq [0]_p, then gcd(a,p)=1\gcd(a,p) = 1. We conclude that every nonzero element [a]p[a]_p in p\mathbb{Z}_p is a unit, and thus p\mathbb{Z}_p is a field.

    Assume NN is not prime. Then there exist integers 1<a,b<N1< a, b < N such that N=abN = ab, and thus gcd(a,N)=a>1\gcd(a,N) = a > 1. Note also that [a]N[0]N[a]_N \neq [0]_N, so [a]N[a]_N is neither zero nor a unit. We conclude that N\mathbb{Z}_N is not a field.

  2. Assume NN is not prime. Then there exist integers 1<a,b<N1< a, b < N such that N=abN = ab. Note that [a]N,[b]N0[a]_N, [b]_N \neq 0, even though [a]N[b]N=[ab]N=[N]N=[0]N[a]_N [b]_N = [ab]_N = [N]_N = [0]_N. We conclude that [a]N[a]_N is a zerodivisor, and N\mathbb{Z}_N is not a domain.

    Now assume that N=pN = p is prime. Suppose that [a]p,[b]p[a]_p, [b]_p in p\mathbb{Z}_p are such that [a]p[b]p=0[a]_p [b]_p = 0. Then by definition we have [ab]p=[0]p[ab]_p = [0]_p, or equivalently, ab0(modp)ab \equiv 0 \pmod p, so pp divides abab. Since pp is prime, this implies that pp divides aa or pp divides bb. But that would mean that either [a]p=[0]p[a]_p = [0]_p or [b]p=[0]p[b]_p = [0]_p. We conclude that p\mathbb{Z}_p is a domain.edhere

 ◻

Proposition 4.34. Let RR be a commutative ring. If rRr \in R is a unit, then it is not a zero divisor.

Proof. Suppose uu is a unit, and let vv be the multiplicative inverse of RR. By way of contradiction, suppose that ua=0ua = 0 for some non-zero element aRa \in R. Multiplying both sides by vv, we conclude that a=v(ua)=(vu)a=v0=0a = v(ua) = (vu) a = v \cdot 0 = 0, a contradiction. Therefore, uu is not a zerodivisor. ◻

Corollary 4.35. Every field is an integral domain.

Proof. Suppose RR is a field. By definition this implies RR is commutative and 1R0R1_R \ne 0_R. Given any nonzero element rRr \in R, rr is a unit by definition, and hence by Proposition 4.34 uu is not a zerodivisor. Therefore, we conclude that no nonzero element in RR is a zerodivisor, and thus RR must be an integral domain. ◻

The converse is false: \mathbb{Z} is a domain, but not a field.

Theorem 4.36. Assume RR is a commutative ring and rRr \in R is a non-zero element that is not a zerodivisor. If ra=rbra = rb for some elements a,bRa, b \in R, then a=ba = b.

Proof. If ra=rbra=rb, then r(ab)=0r(a-b) = 0. Since rr is not zero and not a zerodivisor, we must have ab=0a-b = 0, or equivalently, a=ba=b. ◻

Corollary 4.37. if RR is an integral domain, then RR has the following cancelation property: for all nonzero aRa \in R and any elements b,cRb, c \in R, if ab=acab=ac then b=cb=c.

Proof. Given any nonzero element aRa \in R, we have that aa is not a zerodivisor, since RR is a domain. Therefore, Theorem 4.36 applies, and thus if ab=acab=ac then b=cb=c. ◻

4.6 Product rings

Definition 4.38. Let RR and SS be two rings. Let R×SR\times S be the set of ordered pairs R×S={(r,s)|rR,sS}.R\times S=\{(r, s) \,\, | \,\, r \in R, s\in S\}. Define an operation ++ on R×SR\times S by (r1,s1)+(r2,s2)=(r1+r2,s1+s2)(r_1, s_1) + (r_2, s_2) = (r_1+r_2, s_1+s_2) and an operation \cdot on R×SR\times S by (r1,s1)(r2,s2)=(r1r2,s1s2).(r_1, s_1) \cdot (r_2, s_2)= (r_1 \cdot r_2, s_1 \cdot s_2).

Proposition 4.39. For any two rings RR and SS, R×SR \times S is also a ring. Its additive identity is (0r,0S)(0_r,0_S) and the additive inverse of an element (r,s)(r,s) is (r,s)(-r,-s).

Proof. The only way to prove this is the painstakingly check all nine axioms. All of them follow directly from the fact that each of RR and SS are assumed to be rings. We will just give a sampling of the proofs:

For instance, R×SR \times S has an additive identity, and it is 0R×S=(0R,0S)0_{R \times S} = (0_R,0_S), since for any (r,s)R×S(r,s) \in R \times S, (0R,0S)+(r,s)=(0R+r,0S+s)=(r,s)(0_R,0_S) + (r,s) = (0_R + r, 0_S + s) = (r,s) using that 0R0_R and 0S0_S are the additive identities of RR and SS.

Similarly, this ring has a multiplicative identity, namely (1R,1S)(1_R,1_S).

To give just one more example, let us check the left distributive law. For arbitrary elements r1,r2,r3Rr_1, r_2, r_3 \in R and s1,s2,s3Ss_1, s_2, s_3 \in S, we have (r1,s1)((r2,s2)+(r3,s3))=(r1,s1)(r2+r3,s2+s3)=(r1(r2+r3),s1(s2+s3))=(r1r2+r1r3,s1s2+s1s3)(r_1, s_1) ((r_2, s_2) + (r_3, s_3)) = (r_1, s_1) (r_2 + r_3, s_2+ s_3) = (r_1(r_2 + r_3) , s_1(s_2+ s_3)) = (r_1r_2 + r_1r_3 , s_1s_2+ s_1s_3) where in the last equality we have used that the left distributive law holds in each of RR and SS. We also have (r1,s1)(r2,s2)+(r1,s1)(r3,s3)=(r1r2+r1r3,s1s2+s1s3)(r_1, s_1) (r_2, s_2) + (r_1, s_1) (r_3, s_3) = (r_1r_2 + r_1r_3, s_1s_2+ s_1s_3) and hence (r1,s1)((r2,s2)+(r3,s3))=(r1,s1)(r2,s2)+(r1,s1)(r3,s3).(r_1, s_1) ((r_2, s_2) + (r_3, s_3)) =(r_1, s_1) (r_2, s_2) + (r_1, s_1) (r_3, s_3). ◻

Proposition 4.40. If the set RR has NN elements and the set SS has MM elements then the set R×SR\times S has NMN\cdot M elements. In particular, N×M{\mathbb{Z}}_N\times {\mathbb{Z}}_M is a set with NMN\cdot M elements.

Proof. We need to count the number of pairs (r,s)(r,s) with rRr\in R and sSs\in S. There are NN possibilities for rr and for each of these there are MM possibilities for SS, so overall NMN\cdot M possibilities. ◻

Example 4.41. The ring 2×3{\mathbb{Z}}_2 \times {\mathbb{Z}}_3 has a total of six elements ([0]2,[0]3),([0]2,[1]3),([0]2,[2]3),([1]2,[0]3),([1]2,[1]3),([1]2,[2]3)([0 ]_2, [0 ]_3), ([0 ]_2, [1 ]_3), ([0 ]_2, [2 ]_3), ([1 ]_2, [0 ]_3), ([1 ]_2, [1 ]_3), ([1 ]_2, [2 ]_3) and we have, for example, ([1]2,[1]3)+([1]2,[1]3)=([0]2,[2]3)([1]_2, [1]_3) + ([1]_2, [1]_3) = ([0]_2, [2]_3).

Example 4.42. The ring 2×2{\mathbb{Z}}_2 \times {\mathbb{Z}}_2 has the following addition and multiplication tables: +([0]2,[0]2)([0]2,[1]2)([1]2,[0]2)([1]2,[1]2)([0]2,[0]2)([0]2,[0]2)([0]2,[1]2)([1]2,[0]2)([1]2,[1]2)([0]2,[1]2)([0]2,[1]2)([0]2,[0]2)([1]2,[1]2)([1]2,[0]2)([1]2,[0]2)([1]2,[0]2)([1]2,[1]2)([0]2,[0]2)([0]2,[1]2)([1]2,[1]2)([1]2,[1]2)([1]2,[0]2)([0]2,[1]2)([0]2,[0]2)\begin{array}{ c || c | c | c | c } + & ([0 ]_2, [0 ]_2) & ([0 ]_2, [1 ]_2) & ([1 ]_2, [0 ]_2) & ([1 ]_2, [1 ]_2) \\ \hline\hline ([0 ]_2, [0 ]_2) & ([0 ]_2, [0 ]_2) & ([0 ]_2, [1 ]_2) & ([1 ]_2, [0 ]_2) & ([1 ]_2, [1 ]_2) \\ \hline ([0 ]_2, [1 ]_2) & ([0 ]_2, [1 ]_2) & ([0 ]_2, [0 ]_2) & ([1 ]_2, [1 ]_2)& ([1 ]_2, [0 ]_2) \\ \hline ([1 ]_2, [0 ]_2)& ([1 ]_2, [0 ]_2) & ([1 ]_2, [1 ]_2) & ([0 ]_2, [0 ]_2) & ([0 ]_2, [1 ]_2) \\ \hline ([1 ]_2, [1 ]_2)& ([1 ]_2, [1 ]_2) & ([1 ]_2, [0 ]_2) & ([0 ]_2, [1 ]_2) & ([0 ]_2, [0 ]_2) \\ \end{array} ([0]2,[0]2)([0]2,[1]2)([1]2,[0]2)([1]2,[1]2)([0]2,[0]2)([0]2,[0]2)([0]2,[0]2)([0]2,[0]2)([0]2,[0]2)([0]2,[1]2)([0]2,[0]2)(([0]2,[1]2)([0]2,[0]2)([0]2,[1]2)([1]2,[0]2)([0]2,[0]2)([0]2,[0]2)([1]2,[0]2)([1]2,[0]2)([1]2,[1]2)([0]2,[0]2)([0]2,[1]2)([1]2,[0]2)([1]2,[1]2)\begin{array}{ c || c | c | c | c } \cdot & ([0 ]_2, [0 ]_2) & ([0 ]_2, [1 ]_2) & ([1 ]_2, [0 ]_2) & ([1 ]_2, [1 ]_2) \\ \hline\hline ([0 ]_2, [0 ]_2) & ([0 ]_2, [0 ]_2) & ([0 ]_2, [0 ]_2) & ([0 ]_2, [0 ]_2) & ([0 ]_2, [0 ]_2) \\ \hline ([0 ]_2, [1 ]_2) & ([0 ]_2, [0 ]_2) & (([0 ]_2, [1 ]_2) & ([0 ]_2, [0 ]_2) & ([0 ]_2, [1 ]_2) \\ \hline ([1 ]_2, [0 ]_2)& ([0 ]_2, [0 ]_2) & ([0 ]_2, [0 ]_2) & ([1 ]_2, [0 ]_2) & ([1 ]_2, [0 ]_2) \\ \hline ([1 ]_2, [1 ]_2)& ([0 ]_2, [0 ]_2) & ([0 ]_2, [1 ]_2) & ([1 ]_2, [0 ]_2) & ([1 ]_2, [1 ]_2) \\ \end{array}

The additive identity is ([0]2,[0]2)([0 ]_2, [0 ]_2), the multiplicative identity is ([1]2,[1]2)([1 ]_2, [1 ]_2) and the only unit in this ring is also ([1]2,[1]2)([1 ]_2, [1 ]_2).

4.7 Polynomial rings

We now define polynomials. This will match the notion of polynomials with coefficients in the real numbers that you have encountered in high school or in calculus, but we extend it to create polynomials with coefficients in any ring, for example, N{\mathbb{Z}}_N.

Theorem 4.43. If RR is a commutative ring, then there exists a polynomial ring denoted R[x]R[x], containing an element xx that is not in RR, which has these properties:

  1. The set R[x]R[x] consists of all expressions of the form a0+a1x+...+anxn (where n0 and aiR for 0in).a_0+a_1x+...+a_nx^n \mbox{ (where $n \geq 0$ and $a_i \in R$ for $0\leq i\leq n$)}.

  2. The representation of elements in R[x]R[x] is unique, that is, a0+a1x+...+anxn=b0+b1x+...+bnxn if and only if a0=b0,,an=bn.a_0+a_1 x+...+a_n x^n =b_0+b_1 x+...+b_n x^n \text{ if and only if } a_0=b_0, \ldots, a_n=b_n. In particular, a0+a1x+...+anxn=0Ra_0+a_1 x+...+a_n x^n =0_R if and only if ai=0Ra_i =0_R for 0in0\leq i\leq n.

  3. Addition in R[x]R[x] is given by (a0+a1x++anxn)+(b0+b1x++bnxn)=(a0+b0)+(a1+b1)x++(an+bn)xn.(a_0 + a_1x + \cdots + a_n x^n) + (b_0 + b_1x + \cdots + b_n x^n) = (a_0 + b_0) + (a_1 + b_1) x + \cdots + (a_n + b_n) x^n.

  4. Multiplication in R[x]R[x] is given by (a0+a1x++anxn)(b0+b1x++bmxm)=(a0b0)+(a1b0+a0b1)x+(a2b0+a1b1+a0b2)x2++(anbm)xn+m.\begin{eqnarray*} (a_0 + a_1x + \cdots + a_n x^n) \cdot (b_0 + b_1x + \cdots + b_m x^m) = \\ (a_0 \cdot b_0) + (a_1 b_0 +a_0 b_1) x + (a_2 b_0 + a_1b_1 + a_0 b_2) x^2 + \cdots + (a_n b_m) x^{n+m}. \end{eqnarray*} More precisely, the coefficient of xjx^j in the product is i=0jajibi\sum_{i=0}^j a_{j-i}b_{i}.

  5. RR is a subring of R[x]R[x] by viewing elements a0Ra_0\in R as polynomials.

  6. xa=axxa=ax for every aRa \in R and, more generally, R[x]R[x] is a commutative ring.

We will not prove this theorem as checking the ring axioms can get slightly tedious.

Example 4.44. Here are sole polynomials with the rings they belong to

  1. [1]9+[3]9x9[x][1]_9+[3]_9x\in {\mathbb{Z}}_9[x]

  2. x2+3x+7[x]x^2+3x+7\in \mathbb{Z}[x]

  3. 2x2+32x+1[x]2x^2+\frac{3}{2}x+1\in \mathbb{Q}[x]

  4. πx3+ex+2[x]\pi x^3+ex+\sqrt{2}\in \mathbb{R}[x]

Definition 4.45. The degree of a polynomial p(x)=a0+a1x+a2x2++anxnp(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n is the largest d0d\geq 0 so that ad0a_d \ne 0. So, the degree is nn provided an0a_n \ne 0.

We write degreep(x)\operatorname{degree}p(x) for the degree of p(x)p(x).

Example 4.46. The degrees of the following polynomials in [x]\mathbb{R}[x] are

2

  1. degree(37x2+2x7)=7\operatorname{degree}(3 - 7x^2 + \sqrt{2} x^7) = 7

  2. degree(πx5+1287x12)=12\operatorname{degree}(\pi x^5 + \sqrt[7]{128} x^{12}) = 12

  1. degree(5.7)=0\operatorname{degree}(5.7) = 0

  2. degree(0)=\operatorname{degree}(0) =undefined.

Proposition 4.47. Given a ring RR and non-zero polynomials p(x)p(x) and q(x)q(x) in R[x]R[x] degree(p(x)+q(x))max{degreep(x),degreeq(x)}degree(p(x)q(x))degreep(x)+degreeq(x).\begin{eqnarray*} \operatorname{degree}(p(x) + q(x)) &\leq & \max\{\operatorname{degree}p(x),\operatorname{degree}q(x)\}\\ \operatorname{degree}(p(x) \cdot q(x)) &\leq & \operatorname{degree}p(x) + \operatorname{degree}q(x). \end{eqnarray*}

Proof. Say p(x)=a0+a1x+a2x2++anxnp(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n and q(x)=b0+b1x+b2x2++bmxmq(x) = b_0 + b_1 x + b_2 x^2 + \cdots + b_m x^m with an0a_n \ne 0 and bm0b_m \ne 0, so that degree(p)=n\operatorname{degree}(p) = n and degree(q)=m\operatorname{degree}(q) = m.

Then the highest possible nonzero term of p(x)+q(x)p(x)+q(x) is the one among anxna_n x^n or bmxmb_m x^m which has largest exponent. Thus degree(p(x)+q(x))max{degreep(x),degreeq(x)}\operatorname{degree}(p(x) + q(x)) \leq \max\{\operatorname{degree}p(x),\operatorname{degree}q(x)\}. Then the highest possible nonzero term of p(x)q(x)p(x) \cdot q(x) is (anbm)xn+m(a_nb_m) x^{n+m}.Thus degree(p(x)q(x))n+m\operatorname{degree}(p(x) \cdot q(x)) \leq n + m. ◻

Example 4.48. The inequalities in the previous proposition need not be equalities!

  1. if p(x)=q(x)=xp(x)=q(x)=x then 1=degree(p(x)+q(x))=max{degreep(x),degreeq(x)}1=\operatorname{degree}(p(x) + q(x)) = \max\{\operatorname{degree}p(x),\operatorname{degree}q(x)\}

  2. if p(x)=xp(x)=x and q(x)=x+1q(x)=-x+1 then 0=degree(p(x)+q(x))<max{degreep(x),degreeq(x)}=max{1,1}0=\operatorname{degree}(p(x) + q(x)) < \max\{\operatorname{degree}p(x),\operatorname{degree}q(x)\}=\max\{1,1\}

  3. if p(x)=q(x)=xp(x)=q(x)=x then 2=degree(p(x)q(x))=degreep(x)+degreeq(x)=1+12=\operatorname{degree}(p(x) \cdot q(x)) = \operatorname{degree}p(x)+ \operatorname{degree}q(x)=1+1

  4. if p(x)=[2]4xp(x)=[2]_4x and q(x)=[2]4x+[1]4q(x)=[2]_4x+[1]_4 then 1=degree(p(x)q(x))<degreep(x)+degreeq(x)1+11=\operatorname{degree}(p(x) \cdot q(x)) < \operatorname{degree}p(x)+ \operatorname{degree}q(x)1+1

However, the second inequality becomes an equality if the coefficient ring is a domain.

Proposition 4.49. If RR is a domain then

  1. for non-zero polynomials p(x)p(x) and q(x)q(x) in R[x]R[x], degree(p(x)q(x))=degreep(x)+degreeq(x).\operatorname{degree}(p(x) \cdot q(x)) = \operatorname{degree}p(x)+ \operatorname{degree}q(x).

  2. R[x]R[x] is a domain.

Proof. (a) Say p(x)=a0+a1x+a2x2++anxnp(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n and q(x)=b0+b1x+b2x2++bmxmq(x) = b_0 + b_1 x + b_2 x^2 + \cdots + b_m x^m with an0a_n \ne 0 and bm0b_m \ne 0, so that degree(p)=n\operatorname{degree}(p) = n and degree(q)=m\operatorname{degree}(q) = m. Then the highest nonzero term of p(x)q(x)p(x) \cdot q(x) is (anbm)xn+m(a_nb_m) x^{n+m}, which is non-zero since an0a_n \ne 0 and bm0b_m \ne 0 and in a domain if an0a_n \ne 0 and bm0b_m \ne 0 then anbm0a_nb_m\neq 0 (otherwise ana_n and bmb_m would be zero divisors, a contradiction). Thus degree(p(x)q(x))n+m\operatorname{degree}(p(x) \cdot q(x)) \leq n + m.

(b) Assume towards a contradiction that R[x]R[x] has a zero divizor p(x)p(x). Then p(x)0p(x)\neq 0 and there exists q(x)0q(x)\neq 0 so that p(x)q(x)=0p(x)q(x)=0. Same as in part (a) if p(x)=a0+a1x+a2x2++anxnp(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n and q(x)=b0+b1x+b2x2++bmxmq(x) = b_0 + b_1 x + b_2 x^2 + \cdots + b_m x^m with an0a_n \ne 0 and bm0b_m \ne 0, then the highest nonzero term of p(x)q(x)p(x) \cdot q(x) is (anbm)xn+m(a_nb_m) x^{n+m} where anbm0a_nb_m\neq 0 by the same argument as in (a). Since p(x)q(x)p(x) \cdot q(x) has a nonzero term it is not the zero polynomials, a contradiction.

Therefore our assumption is false and R[x]R[x] has no zero divisors, so it is a domain. ◻

4.7.1 The Division Algorithm and GCDs

Theorem 4.50 (Division Algorithm). Let FF be a field and f(x),g(x)F[x]f(x),g(x) \in F[x] with g(x)0g(x) \neq0. Then there exists unique polynomials q(x)q(x) and f(x)f(x) such that f(x)=g(x)q(x)+r(x) and either r(x)=0F or degr(x)<degg(x)f(x)=g(x)q(x)+r(x) \mbox{ and either }\ r(x)=0_F \mbox{ or } \deg r(x) < \deg g(x)

We will not worry about proving this theorem. Instead let’s recall that there is an algorithm (long division) that gives us the q(x)q(x) and r(x)r(x) in the Division Algorithm Theorem.

Example 4.51. We divide f(x)=x3+x21f(x)=x^3+x^2-1 by g(x)=x1g(x)=x-1

Example 4.52. Divide f(x)=3x5+2x4+2x3+4x2+x2f(x)=3x^5+2x^4+2x^3+4x^2+x-2 by g(x)=2x3+1g(x)=2x^3+1 in [x]\mathbb{R}[x].

Definition 4.53. Let FF be a field and a(x),b(x)F[x]a(x),b(x) \in F[x] with b(x)0b(x) \neq 0 We say that a(x)a(x) divides b(x)b(x), and write a(x)|b(x)a(x)|b(x) if b(x)=a(x)h(x)b(x)=a(x)h(x) for some h(x)F[x]h(x) \in F[x].

Definition 4.54. Let FF be a field and a(x),b(x)F[x]a(x),b(x) \in F[x], not both zero. The greatest common divisor of a(x)a(x) and b(x)b(x) is the polynomial of highest degree that divides both a(x)a(x) and b(x)b(x) and that has leading coefficient 1F1_F.

Example 4.55. Find the GCD of a(x)=2x4+5x35x2a(x)=2x^4+5x^3-5x-2 and b(x)=2x33x22xb(x)=2x^3-3x^2-2x in [x]\mathbb{R}[x]. Since a(x)=2x4+5x35x2=(2x+1)(x+2)(x+1)(x1)b(x)=2x33x22x=(2x+1)(x2)x.\begin{eqnarray*} a(x)=2x^4+5x^3-5x-2 &=&(2x+1)(x+2)(x+1)(x-1)\\ b(x)=2x^3-3x^2-2x &=&(2x+1)(x-2)x. \end{eqnarray*} We see that the only common factor for f(x)f(x) and g(x)g(x) is 2x+12x+1. However 2x+1=2(x+12)2x+1=2(x+\frac{1}{2}), so another polynomial of degree one that divides both f(x)f(x) and g(x)g(x) is x+12x+\frac{1}{2}. Since the latter has leading coefficient one and is the unique polynomial with leading coefficient one that divides both f(x)f(x) and g(x)g(x) we conclude gcd(f(x),g(x))=x+12\gcd(f(x),g(x))=x+\frac{1}{2}.

Theorem 4.56 (Euclidean Algorithm). Let FF be a field and a(x),b(x)inF[x]a(x),b(x) in F[x], not both zero. Then there is a unique GCD of d(x)d(x) of a(x)a(x) and b(x)b(x) which can be found using the Euclidean algorithm followed by dividing the last non-zero remainder polynomial by its leading coefficient.

Furthermore, there are polynomial u(x)u(x) and v(x)v(x) such that d(x)=a(x)u(x)+b(x)v(x)d(x)=a(x)u(x)+b(x)v(x).

Example 4.57. We find the GCD of a(x)=2x4+5x35x2a(x)=2x^4+5x^3-5x-2 and b(x)=2x33x22xb(x)=2x^3-3x^2-2x in [x]\mathbb{R}[x] using the Euclidean algorithm.

The last non-zero remainder is r(x)=4849x2449r(x)=-\frac{48}{49}x-\frac{24}{49} and dividing it by its leading coefficient, 4849-\frac{48}{49}, gives gcd(a(x),b(x))=x+12.\gcd(a(x),b(x))=x+\frac{1}{2}.

4.50 and 4.56 show that there is a strong analogy between the integers and polynomials. We make this precise as follows.

Let us summarize the analogy between F[x]F[x] and {\mathbb{Z}} so far:

F[x]F[x] {\mathbb{Z}}
domain (see 4.49) domain
Long Division Division Algorithm
a(x)=b(x)q(x)+r(x),deg(r(x))<deg(b(x))a(x)=b(x)q(x)+r(x), \deg(r(x))<\deg(b(x)) or r(x)=0r(x)=0 a=bq+r,0r<ba=bq+r, 0\leq r<b
Euclidean Algorithm Euclidean Algorithm

Definition 4.58. A Euclidean domain is a ring RR which is a domain and in which the Division Theorem holds. More precisely this means that there is a function f:Rf: R\to {\mathbb{N}} so that for any a,bRa,b\in R with b0b\neq 0 there exist q,rRq,r\in R so that a=bq+r and either r=0 or N(r)<N(b).a=b\cdot q+r \text{ and either } r=0 \text{ or } N(r)<N(b).

For the ring of polynomials N(p(x))=deg(p(x))N(p(x))=\deg(p(x)). For ,N(x)=|x|{\mathbb{Z}}, N(x)=\vert x\vert.

Example 4.59. The ring of integers {\mathbb{Z}} is a Euclidean domain. If FF is a field then the ring of polynomials with coefficients in FF, F[x]F[x], is a Euclidean domain. In particular, [x],[x],[x]\mathbb{Q}[x], \mathbb{R}[x], \mathbb{C}[x] and p[x]{\mathbb{Z}}_p[x] are Euclidean domains for any prime p>0p>0.

4.7.2 Irreducible polynomials and irreducible factorizations

Let us explore the units in R[x]R[x].

Proposition 4.60. Let RR be a domain. A polynomial p(x)p(x) is a unit in R[x]R[x] if and only if p(x)=cp(x)=c for some cRc\in R such that cc is a unit in RR.

Proof. Assume p(x)p(x) is a unit. Then p(x)q(x)=1p(x) \cdot q(x) = 1 for some polynomial q(x)q(x). By the degree formula, degree(p(x))+degree(q(x))=degree(1)=0\operatorname{degree}(p(x)) + \operatorname{degree}(q(x)) = \operatorname{degree}(1) = 0. Since degrees cannot be negative, this proves that degree(p(x))=0=degree(q(x))\operatorname{degree}(p(x)) = 0=\operatorname{degree}(q(x)). That is, p(x)=cp(x) = c for some cRc\in R and q(x)=rq(x)=r for some rRr\in R so that cr=1cr=1. This means that cc is a unit in RR.

Conversely, suppose p(x)=cp(x)=c for some cRc\in R such that cc is a unit in RR. Then there exists rRr\in R so that rc=cr=1Rrc=cr=1_R and thus we see that q(x)=rq(x)=r is a multiplicative inverse for p(x)=cp(x) = c in R[x]R[x] ◻

Corollary 4.61. Let FF be a field. A polynomial p(x)p(x) is a unit in F[x]F[x] if and only if p(x)=cp(x)=c for some 0FcF0_F\neq c\in F.

Example 4.62. The previous proposition is no longer true if we remove the hypothesis that RR is a domain. For example [1]9+[3]9x[1]_9+[3]_9x is a unit in 9[x]{\mathbb{Z}}_9[x] with multiplicative inverse [1]9+[6]9x[1]_9+[6]_9x.

Definition 4.63. Let RR be a commutative ring with identity. An element aRa \in R is said to be an associate of an element bRb \in R if there is a unit uRu \in R such that au=bau=b.

Example 4.64.

  1. 4-4 is an associate to the number 44 in \mathbb{Z} since 4=4(1)-4=4\cdot(-1) and 1-1 is a unit in {\mathbb{Z}}

  2. 2(x2+3x+2)2(x^2+3x+2) is an associate to the polynomial x2+3x+2x^2+3x+2 in [x]\mathbb{Q}[x] since 22 is a unit in [x]\mathbb{Q}[x].

Definition 4.65. Let FF be a field. A non-constant polynomial p(x)F[x]p(x) \in F[x] is said to be irreducible if its only divisors are its associates and the nonzero constant polynomials (units).

Lemma 4.66. If FF is a field then every polynomial of degree 1 in F[x]F[x] is irreducible in F[x]F[x].

Proof. Let p(x)F[x]p(x)\in F[x] be a polynomial of degree one. Suppose q(x)p(x)q(x)\mid p(x). Then there exists h(x)h(x) so that p(x)=q(x)h(x)p(x)=q(x)h(x) and by the degree formula 4.49 we have 1=deg(p(x))=deg(q(x))+deg(h(x))1=\deg(p(x))=\deg(q(x))+\deg(h(x)). Therefore we see that at least one of q(x)q(x) or h(x)h(x) must have degree 0, that is q(x)q(x) or h(x)h(x) is a constant. It cannot be the zero constant since that would make p(x)=0p(x)=0, a contradiction. So we see that either q(x)q(x) or h(x)h(x) is a unit in F[x]F[x]. We have shown that either q(x)q(x) is a nonzero constant or q(x)q(x) is an associate of p(x)p(x) so p(x)p(x) is irreducible. ◻

Remark 4.67. Irreducible polynomials are analogous to prime integers since an integer pp is prime if and only if its only divisors are ±1\pm 1 (the units of {\mathbb{Z}}) and ±p\pm p (the associates of pp in {\mathbb{Z}}.)

To further the analogy between integers and polynomials here are new facts we have seen:

F[x]F[x] {\mathbb{Z}}
Units: p(x)=cp(x)=c, cF,c0Fc\in F, c\neq 0_F (see 4.61) Units: 1,1-1, 1
Associates: q(x)q(x) is associate to cq(x)c\cdot q(x) Associates: nn is associate to ±n\pm n
for all cF,c0Fc\in F, c\neq 0_F
Irreducible polynomial Prime integer
Irreducible factorization Prime factorization

The following theorems expand on the analogy between prime and irreducible polynomials and between prime factorizations and irreducible factorizations.

The following theorem is reminiscent of [lemma composite].

Theorem 4.68. Let FF be a field. A non-zero polynomial f(x)f(x) is reducible in F[x]F[x] if and only if f(x)f(x) can be written as the product of two polynomials of lower degree.

The following theorem is reminiscent of 1.49 and [lemma prime].

Theorem 4.69. Let FF be a field and p(x)p(x) a non-constant polynomial in F[x]F[x]. Then the following are equivalent:

  1. The polynomial p(x)p(x) is irreducible.

  2. If b(x)b(x) and c(x)c(x) are polynomials and p(x)|b(x)c(x)p(x)|b(x)c(x), then p(x)|b(x)p(x)|b(x) or p(x)|c(x)p(x)|c(x).

  3. If r(x)r(x) and s(x)s(x) are any polynomials such that p(x)=r(x)s(x)p(x)=r(x)s(x), then r(x)r(x) is a non-zero constant polynomial or s(x)s(x) is a non-zero constant polynomial.

The following theorem is reminiscent of the Fundamental Theorem of Arithmetic 1.52. Note that in that the uniqueness part of that theorem which states that pi=±qip_i=\pm q_i could be restated as pip_i and qiq_i are associates (in {\mathbb{Z}}).

Theorem 4.70. Let FF be a field. Every non-constant polynomial f(x)F[x]f(x) \in F[x] is a product of irreducible polynomials in F[x]F[x]. This factorization is unique in the following sense: If p1(x)p2(x)...pn(x)=q1(x)q2(x)...qm(x)p_1(x)p_2(x)...p_n(x)=q_1(x)q_2(x)...q_m(x) with each pi(x)p_i(x) and qi(x)q_i(x) irreducible, then r=sr=s and up to some relabeling, pi(x)p_i(x) and qi(x)q_i(x) are associates.

5 Homomorphisms and Isomorphisms

5.1 Functions

Definition 5.1. A function ff from a set AA to a set BB denoted f:ABf:A\to B is a rule that assigns to every element aAa\in A an element denoted f(a)Bf(a)\in B.

The set AA is called the domain of ff and the set BB is called the codomain of ff.

In other situations it is better to represent functions by giving a formula or rule for f(x)f(x).

Example 5.2. Here are some functions you know:

and some that you will get to know

Notice that in Figure 1 sometimes there can be multiple arrows reaching the same element in BB. When this does not happen we say ff is injective.

Definition 5.3. A function f:ABf:A\to B is injective or one-to-one if whenever x,yAx,y \in A are such that f(x)=f(y)f(x)=f(y) then x=yx=y.

Notice that in Figure [fig] there is arrow leaving from every element (dot) in AA. This is because a function associates to every aAa\in A some element of BB. However there may be some elements in BB which do not receive arrows, so they are not outputs of the function ff. To distinguish those elements of BB which are outputs of ff we introduce the following notion.

Definition 5.4. The image of a function f:ABf:A\to B is the set (f)={f(a)aA}.\Im(f)=\{ f(a) \mid a\in A\}. The image is a subset of the codomain, that is, (f)B\Im(f)\subseteq B, but they need not be equal.

Definition 5.5. A function f:ABf:A\to B is surjective or onto if for every bBb\in B there exists aAa\in A so that f(a)=bf(a)=b.

Equivalently, ff is surjective if and only if (f)=B\Im(f)=B.

Now we can combine the notions of injective and surjective.

Definition 5.6. A function f:ABf:A\to B is bijective if ff is both injective and surjective.

Example 5.7. In Figure [fig]

Example 5.8.

  1. the function f:,f(x)=x2f:\mathbb{R}\to \mathbb{R}, f(x)=x^2 is not injective since f(1)=f(1)f(-1)=f(1); it is not surjective since (f)=[0,)\Im(f)=[0,\infty)\neq \mathbb{R} and is not bijective since it is not surjective.

  2. the function f:,f(x)=2xf:{\mathbb{Z}}\to {\mathbb{Z}}, f(x)=2x is injective: if f(x)=f(y)f(x)=f(y) then 2x=2y2x=2y so x=yx=y (since 22 is not a zero-divisor in {\mathbb{Z}}); the image consists of all the even integers so ff is not surjective and thus it is not bijective.

  3. f:,f(x)=xf:\mathbb{R}\to {\mathbb{Z}}, f(x)=\lfloor x\rfloor, where x\lfloor x\rfloor denotes the largest integer that is smaller or equal to xx. This function is not injective: f(2.1)=2=f(2.2)f(2.1)=2=f(2.2); it is surjective since for every integer bb\in {\mathbb{Z}} we have f(b)=bf(b)=b. The function is not bijective since it is not surjective.

  4. f:N,f(x)=[x]Nf:{\mathbb{Z}}\to{\mathbb{Z}}_N, f(x)=[x]_N is not injective since, for example, f(0)=f(N)=[0]Nf(0)=f(N)=[0]_N. The function is surjective since for every element [x]NN[x]_N\in {\mathbb{Z}}_N we have f(x)=[x]Nf(x)=[x]_N. The function is not bijective since it is not injective.

  5. f:NN,f([x]N)=[x]Nf:{\mathbb{Z}}_N\to{\mathbb{Z}}_N, f([x]_N)=[-x]_N is injective: if f([x]N)=f([y]N)f([x]_N)=f([y]_N) then [x]N=[y]N[-x]_N=[-y]_N and so by multiplying both sides by [1]N[-1]_N we conclude [1]N[x]N=[1]N[y]N, that is ,[x]N=[y]N.[-1]_N[-x]_N=[-1]_N[-y]_N, \text{ that is }, [x]_N=[y]_N. This function is surjective since for each [b]NN[b]_N\in {\mathbb{Z}}_N there exists [a]N=[b]NN[a]_N=[-b]_N\in {\mathbb{Z}}_N so that f([b]N)=[b]Nf([-b]_N)=[b]_N. This function is bijective since it is both injective and surjective.

Proposition 5.9. If a function f:ABf:A\to B is bijective and AA and BB are finite sets, then AA and BB must have the same number of elements.

Proof. Suppose a function f:ABf:A\to B is bijective and AA and BB are finite sets. Let n=#An=\#A.

Then we can write A={a1,a2,,an}A=\{a_1, a_2, \ldots, a_n\} where aiaja_i\neq a_j whenever iji\neq j. Since ff is injective, by the contrapositive of the definition of injective it follows that f(ai)f(aj)f(a_i)\neq f(a_j) whenever iji\neq j. Thus the image of ff, that is, the set (f)={f(a1),f(a2),,f(an)}\Im(f)=\{f(a_1), f(a_2), \ldots, f(a_n)\} consists of exactly nn distinct elements. In other words, #(f)=n\#\Im(f)=n.

Since ff is surjective we have B=(f)B=\Im(f) and thus we conclude #B=#(f)=n\#B=\#\Im(f)=n. Since #A=n\#A=n we deduce that #A=#B\#A=\#B. ◻

Example 5.10. Suppose in a certain class no two students have the same birthday month and for each of the months of the year there is a student born in that month. How many students are there in the class?

The answer is 12 students. To see this, consider f:{students}{months},uadf(student)=student’s birthday month.f:\{\text{students}\}\to \{\text{months}\}, \mathbb{Q}uad f(\text{student})=\text{student's birthday month}. The information given tells us ff is injective (no two students have the same birthday month) and surjective (for each of the months of the year there is a student born in that month). Thus ff is bijective and by Theorem 5.9 we conclude #\# students =#= \# months =12=12.

Example 5.11. Suppose in a certain class no two students have the same birthday month. How many students are there in the class?

There are at most 12 students in the class. Since each student has a different birthday month the function f:{students}{months},uadf(student)=student’s birthday month.f:\{\text{students}\}\to \{\text{months}\}, \mathbb{Q}uad f(\text{student})=\text{student's birthday month}. is injective. As in the proof of 5.9, the number of students is equal to the number of their birthday months, which is at most 12 since there are only 12 months in a year.

Example 5.12. Suppose in a certain class there are more than 12 students? Can each student has a different birthday month?

No, the students cannot have each have a different birthday month. Assume towards a contradiction that each student has a different birthday month. Then the set of all their birthday months would have as many elements as there are students in the class, so there would be more than 12 different birthday months. This is a contradiction.

Theorem 5.13. If AA and BB are finite sets which have the same number of elements then the following are equivalent:

  1. ff is injective,

  2. ff is surjective,

  3. ff is bijective.

Note: a “the following are equivalent" statement as above means three if and only if sratements: (a) if and only if (b); (b) if and only if (c); (c) if and only of (a). However, to prove all of this it suffice to prove only 3 if – then statements, namely

  1. (a)(b)(a)\rightarrow (b): if (a) then (b)

  2. (b)(c)(b)\rightarrow (c): if (b) then (c)

  3. (c)(a)(c)\rightarrow (a): if (c) then (a)

Proof. (a)(b)(a)\rightarrow (b): if ff is injective then ff is surjective.

Suppose #A=#B=n\#A=\#B=n and f:ABf:A\to B is injective. Then we can write A={a1,a2,,an}A=\{a_1, a_2, \ldots, a_n\} where aiaja_i\neq a_j whenever iji\neq j. Since ff is injective, by the contrapositive of the definition of injective it follows that f(ai)f(aj)f(a_i)\neq f(a_j) whenever iji\neq j. Thus the image of ff, that is, the set (f)={f(a1),f(a2),,f(an)}\Im(f)=\{f(a_1), f(a_2), \ldots, f(a_n)\} consists of exactly nn distinct elements. In other words, #(f)=n\#\Im(f)=n. Since (f)B\Im(f)\subseteq B and #(f)=#B=n\#\Im(f)=\#B=n it follows that there is no element of BB that is not in (f)\Im(f), in other words (f)=B\Im(f)=B, so ff is surjective.

(b)(c)(b)\rightarrow (c): if ff is surjective. then ff is bijective.

Suppose #A=#B=n\#A=\#B=n and f:ABf:A\to B is surjective. By definition of surjective we have (f)=B\Im(f)=B and in particular #(f)=#B=n\#\Im(f)=\#B=n. We can write A={a1,a2,,an}A=\{a_1, a_2, \ldots, a_n\} and (f)={f(a1),f(a2),,f(an)}\Im(f)=\{f(a_1), f(a_2), \ldots, f(a_n)\} Since we know #(f)=n\#\Im(f)=n the elements f(a1),f(a2),,f(an)f(a_1), f(a_2), \ldots, f(a_n) must be pairwise distinct (otherwise (f)\Im(f) would have fewer than nn elements). Thus f(ai)f(aj)f(a_i)\neq f(a_j) whenever aiaja_i\neq a_j, which is the contrapositive of the definition of injective. We have thus shown ff is injective and since it was assume surjective ff is in fact bijective.

(c)(a)(c)\rightarrow (a): if ff is bijective then ff is injective.

This is true for any function by definition of bijective. ◻

Theorem 5.14. If RR is a domain which has finitely many elements, then RR is a field.

Proof. Consider rRr\in R with r0r\neq 0. Since RR is a domain if follows that RR is not a zero divisor. Let f:RRf:R\to R be given by f(x)=rxf(x)=rx. Then ff is injective. Indeed, if f(x)=f(y)f(x)=f(y) then rx=ryrx=ry and since rr is not a zero divisor we have that x=yx=y by the cancellation Theorem 4.36.

Since the domain and codomain of ff are finite sets having the same number of elements (in fact the domain and codomain are the same set RR), and ff is injective, then ff must be surjective by Theorem 5.13. Thus there exists sRs\in R such that f(s)=1Rf(s)=1_R, that is rs=1Rrs=1_R. Because RR is a domain it is a commutative ring and thus sr=rs=1Rsr=rs=1_R, We have shown that rr is a unit.

Since every r0Rr\neq 0_R is a unit, RR is a field. ◻

Theorem 5.15 (Fermat’s little theorem). If pp is a prime integer and aa is an integer not divisible by pp then ap11(modp)a^{p-1}\equiv 1 \pmod{p}.

Proof. Suppose pp is a prime integer and aa is an integer not divisible by pp. Since pp does not divide aa, we have a0(modp)a\not\equiv 0 \pmod p and thus [a]p[0]p[a]_p\neq [0]_p. By Theorem 4.33 we know p{\mathbb{Z}}_p is a field and so, since [a]p[0]p[a]_p\neq [0]_p we deduce that [a]p[a]_p is a unit in p{\mathbb{Z}}_p.

Consider the function f:p\{[0]p}p\{[0]p}f:{\mathbb{Z}}_p\setminus \{[0]_p\}\to {\mathbb{Z}}_p\setminus \{[0]_p\} given by f([x]p)=[a]p[x]pf([x]_p)=[a]_p[x]_p. We claim that ff is injective. Indeed, if f([x]p)=f([y]p)f([x]_p)=f([y]_p), then [a]p[x]p=[a]p[y]p[a]_p[x]_p=[a]_p[y]_p. Since [a]p[a]_p is a unit it is not a zero divisor by Proposition 4.34. Applying the cancellation principle (Theorem 4.36) we deduce from [a]p[x]p=[a]p[y]p[a]_p[x]_p=[a]_p[y]_p that [x]p=[y]p[x]_p=[y]_p.

Since ff is injective and the domain and codomain of ff are finite sets which have the same number of elements (p1p-1), by Theorem 5.13 we deduce that ff is surjective, so (f)=p\{[0]p}={[1]p,[2]p,,[p1]p}\Im(f)= {\mathbb{Z}}_p\setminus \{[0]_p\}=\{[1]_p, [2]_p, \ldots, [p-1]_p\}. On the other hand (f)={[a]p[1]p,[a]p[2]p,,[a]p[p1]p}.\Im(f)=\{[a]_p[1]_p, [a]_p[2]_p, \ldots, [a]_p[p-1]_p\}. Taking the product of the elements of (f)\Im(f) is the same as taking the product of the elements of p\{[0]p}{\mathbb{Z}}_p\setminus \{[0]_p\}, so [a]p[1]p[a]p[2]p[a]p[p1]p=[1]p[2]p[p1]p[a]_p[1]_p\cdot [a]_p[2]_p\cdot \cdots \cdot [a]_p[p-1]_p = [1]_p \cdot [2]_p \cdot \cdots \cdot [p-1]_p [1]p[2]p[p1]p[a]pp1=[1]p[2]p[p1]p[1]_p\cdot [2]_p \cdot \cdots \cdot [p-1]_p\cdot [a]_p^{p-1}= [1]_p \cdot [2]_p \cdot \cdots \cdot [p-1]_p Cancelling, as in Theorem 4.36, [1]p,[2]p,,[p1]p[1]_p, [2]_p, \ldots, [p-1]_p, which are all non zero-divisors, in the above equation we obtain [a]pp1=[1]p,[a]_p^{p-1}=[1]_p, in other words ap11(modp)a^{p-1}\equiv 1 \pmod{p}. ◻

5.2 Isomorphisms

Two rings RR and SS are isomorphic if they are the same after relabelling. Here is the formal definition:

Definition 5.16. Let RR and SS be rings. An isomorphism from RR to SS is a bijective function (that is, a function that is both injective and surjective) f:RSf\!: R \to S such that the following properties hold:

We say RR is isomorphic to SS if there exists an isomorphism from RR to SS, and we write RSR \cong S to signify that RR and SS are isomorphic.

The reason we do not include f(0R)=0Sf(0_R) = 0_S in the definition of “isomorphism” is that it is a consequence of the other parts of the definition.

Example 5.17. Consider the set S={a,b,c,d}S = \{ a,b,c,d \} with the two operations below:

 

\spadesuit a b c d
a a b c d
b b c d a
c c d a b
d d a b c
\heartsuit a b c d
a a a a a
b a b c d
c a c a c
d a d c b

 

(S,,)(S, \spadesuit, \heartsuit) is a ring. Now compare this with the addition and multiplication tables for 4\mathbb{Z}_4:

 

\spadesuit [0]4[0]_4 [1]4[1]_4 [2]4[2]_4 [3]4[3]_4
[0]4[0]_4 [0]4[0]_4 [1]4[1]_4 [1]4[1]_4 [3]4[3]_4
[1]4[1]_4 [1]4[1]_4 [1]4[1]_4 [3]4[3]_4 [0]4[0]_4
[1]4[1]_4 [1]4[1]_4 [3]4[3]_4 [0]4[0]_4 [1]4[1]_4
[3]4[3]_4 [3]4[3]_4 [0]4[0]_4 [1]4[1]_4 [1]4[1]_4
\heartsuit [0]4[0]_4 [1]4[1]_4 [1]4[1]_4 [3]4[3]_4
[0]4[0]_4 [0]4[0]_4 [0]4[0]_4 [0]4[0]_4 [0]4[0]_4
[1]4[1]_4 [0]4[0]_4 [1]4[1]_4 [1]4[1]_4 [3]4[3]_4
[1]4[1]_4 [0]4[0]_4 [1]4[1]_4 [0]4[0]_4 [1]4[1]_4
[3]4[3]_4 [0]4[0]_4 [3]4[3]_4 [1]4[1]_4 [1]4[1]_4

 

The tables above and below are completely identical if we relabel the elements a,b,c,da, b, c, d as follows: f:4Sf([0]4)=af([1]4)=bf([2]4)=cf([3]4)=df: {\mathbb{Z}}_4 \to S \qquad f([0]_4) = a \quad f([1]_4) = b \quad f([2]_4) = c \quad f([3]_4) = d So this map ff is an isomorphism between (S,,)(S, \spadesuit, \heartsuit) and 4\mathbb{Z}_4. Notice in fact that this is the only isomorphism between these two rings: since 1S=b1_S = b, we must have f(1)=bf(1) = b; but then this implies that f(2)=f(1+1)=f(1)+f(1)=b+b=cf(2) = f(1+1) = f(1) + f(1) = b+b=c, and f(3)=f(1+2)=f(1)+f(2)=b+c=df(3) = f(1+2) = f(1) + f(2) = b + c = d.

Example 5.18. Consider the set T={a,b,c,d}T = \{a,b,c,d\} and the binary operations \oplus and \otimes given by the following tables:

\oplus a b c d
a a b c d
b b a d c
c c d a b
d d c b a
\otimes a b c d
a a a a a
b a b c d
c a c c a
d a d a d

 

Let us take it on faith that (T,,)(T, \oplus, \otimes) is a ring. Let’s prove that TT is isomorphic to the ring 2×2{\mathbb{Z}}_2 \times {\mathbb{Z}}_2. Inspecting the addition and multiplication tables for TT, we see that the zero element of TT is aa and the one element is bb. So any isomorphism f:2×2Tf: {\mathbb{Z}}_2 \times {\mathbb{Z}}_2 \to T would have to satisfy f(([0]2,[0]2))=af(([0]_2, [0]_2)) = a and f(([1]2,[1]2))=bf(([1]_2, [1]_2)) = b, and since ff must be a bijection, there are two possibilities. In fact, both of them work. Let’s consider the function f:2×2Tf: {\mathbb{Z}}_2 \times {\mathbb{Z}}_2 \to T defined by f(([0]2,[0]2))=af(([1]2,[0]2))=cf(([0]2,[1]2))=df(([1]2,[1]2))=b.\begin{aligned} f(([0]_2, [0]_2)) & = a \\ f(([1]_2, [0]_2)) & = c \\ f(([0]_2, [1]_2)) & = d \\ f(([1]_2, [1]_2)) & = b. \\ \end{aligned} It is clearly a bijection. Checking the three required axioms is rather tedious since there are 1616 possible additions and 1616 possible multiplications, but it does all work out.

To see that ff preserves addition, for instance, we have f(([0]2,[1]2)+([1]2,[1]2))=f(([1]2,[0]2))=cf( ([0]_2, [1]_2) + ([1]_2, [1]_2)) = f( ([1]_2, [0]_2)) = c and f(([0]2,[1]2))f([1]2,[1]2))=db=cf( ([0]_2, [1]_2)) \oplus f([1]_2, [1]_2)) = d \oplus b = c. So ff preserves addition in this one case, and it also does in the other 1515 cases. To check this carefully we could write down the addition table for 2×2{\mathbb{Z}}_2 \times {\mathbb{Z}}_2 and see that ff transforms it into the table \oplus, entry by entry.

Similarly, ff preserves multiplication. For instance, f(([0]2,[1]2)([1]2,[1]2))=f(([0]2,[1]2))=df( ([0]_2, [1]_2) \cdot ([1]_2, [1]_2)) = f( ([0]_2, [1]_2)) = d and f(([0]2,[1]2))f([1]2,[1]2))=db=df( ([0]_2, [1]_2)) \otimes f([1]_2, [1]_2)) = d \otimes b = d. So ff preserves multiplication in this one case, and it also does in the other 1515 cases. To check this carefully we could write down the multiplication table for 2×2{\mathbb{Z}}_2 \times {\mathbb{Z}}_2 and see that ff transforms it into the table \otimes, entry by entry.

Finally, 1T=b1_T = b and 12×2=([1]2,[1]2)1_{{\mathbb{Z}}_2 \times {\mathbb{Z}}_2} = ([1]_2, [1]_2) and f(([1]2,[1]2))=bf( ([1]_2, [1]_2)) = b, so that the final axiom holds too.

Example 5.19. Consider the ring R=Mat2×2()R = {\operatorname{Mat}}_{2\times 2}(\mathbb{R}) of all 2×22 \times 2 matrices with real entries. The subring S={[a00a]a}S = \left\lbrace \begin{bmatrix} a & 0 \\ 0 & a \end{bmatrix} \mid a \in \mathbb{Z} \right\rbrace is isomorphic to \mathbb{Z}. The map f:Sf\!: \mathbb{Z} \to S given by f(n)=[n00n]f(n) = \begin{bmatrix} n & 0 \\ 0 & n \end{bmatrix} is an isomorphism. Indeed:

Theorem 5.20. The rings 2×2{\mathbb{Z}}_2 \times {\mathbb{Z}}_2 and 4{\mathbb{Z}}_4 are not isomorphic.

We will give several proofs of this. The first one will use the following Lemma:

Proposition 5.21. If f:RSf\!: R \to S is an isomorphism and rRr \in R is a unit, then f(r)f(r) is also a unit. In particular, if RR is isomorphic to SS, then there is a bijection between the units of RR and the units of SS.

Proof. Suppose that rRr \in R is a unit, and let sRs \in R be the inverse of rr. Then f(r)f(s)=f(rs)=f(1R)=1Suad and uadf(s)f(r)=f(sr)=f(1R)=1S.f(r) f(s) = f(rs) = f(1_R) = 1_S \mathbb{Q}uad \textrm{ and } \mathbb{Q}uad f(s) f(r) = f(sr) = f(1_R) = 1_S.

This proves the first claim, which can be interpreted as saying that ff induces a function f:{rR| r is a unit in R }{sS| s is a unit in S }f': \{r \in R | \text{ $r$ is a unit in $R$ }\} \to \{s \in S | \text{ $s$ is a unit in $S$ }\} given by f(r)=f(r)f'(r) = f(r). (I am calling it ff' since it has a different domain and codomain as ff, but it’s given by the same rule.) Since ff is one-to-one (injective) so is ff'. It is not obvious that ff' is onto, however. So, pick sSs \in S such that ss is a unit. So there is an element yy in SS such that sy=1S=yssy = 1_S = ys. Since ff is onto (surjective), s=f(r)s = f(r) for some rr and y=f(x)y = f(x) for some xx. I claim rr is also a unit. We have f(rx)=f(r)f(x)=sy=1Sf(rx) = f(r) f(x) = s y = 1_S. On the other hand f(1R)=1Sf(1_R) = 1_S and thus f(1R)=f(rx)f(1_R) = f(rx). But ff is one-to-one and hence we must have rx=1Rrx = 1_R. A nearly identical argument shows that xr=1Rxr = 1_R. This proves rr is a unit and hence it belongs to the domain of ff'. This proves ff' is onto (surjective). ◻

Now we can give yet a proof of Theorem 5.20:

First proof of Theorem 5.20. The ring 4{\mathbb{Z}}_4 has two units, [1]4[1]_4 and [3]4[3]_4, while 2×2{\mathbb{Z}}_2 \times {\mathbb{Z}}_2 only has one unit, ([1]2,[1]2)([1]_2,[1]_2). Therefore, the two rings cannot be isomorphic. ◻

Proposition 5.22. If f:RSf: R \to S is an isomorphism and rRr \in R is a zerodivisor, then f(r)f(r) is also a zerodivisor. In particular, if RR and SS are isomorphic, then there is a bijection between the zerodivisors of RR and the zerodivisors of SS.

Proof. Assume rRr \in R is a zerodivisor and f:RSf: R \to S is an isomorphism of rings. By definition, r0Rr \ne 0_R and there is an element yRy \in R such that y0Ry \ne 0_R and ry=0Rry = 0_R. Since ff preserves multiplication, f(r)f(y)=f(ry)=f(0R)f(r) f(y) = f(ry) = f(0_R) and by Lemma [lem1] f(0R)=0Sf(0_R) = 0_S. Thus f(r)f(y)=0Sf(r) f(y) = 0_S. Since ff is one-to-one, f(0R)=0Sf(0_R) = 0_S, and r0Rr \ne 0_R, it follows that f(r)0Sf(r) \ne 0_S. Similarly, f(y)0Sf(y) \ne 0_S. This proves that f(r)f(r) is a zerodivisor in SS.

This proves the first claim, which can be interpreted as saying that ff induces a function f:{rR| r is a zerodivisor in R }{sS| s is a zerodivisor in S }f': \{r \in R | \text{ $r$ is a zerodivisor in $R$ }\} \to \{s \in S | \text{ $s$ is a zerodivisor in $S$ }\} given by f(r)=f(r)f'(r) = f(r). (I am calling it ff' since it has a different domain and codomain as ff, but it’s given by the same rule.) Since ff is injective so is ff'. It is not obvious that ff' is onto, however. So, pick sSs \in S such that ss is a zerodivisor. So there is an element y0y \neq 0 in SS such that sy=0Ssy = 0_S. Since ff is surjective, s=f(r)s = f(r) for some rRr \in R and y=f(x)y = f(x) for some xRx \in R. I claim rr is also a zerodivisor. We have f(rx)=f(r)f(x)=sy=0Sf(rx) = f(r) f(x) = s y = 0_S. On the other hand, f(0R)=0Sf(0_R) = 0_S and thus f(0R)=f(rx)f(0_R) = f(rx). But ff is injective and hence we must have rx=0Rrx = 0_R. Moreover, since y0y \neq 0, and since ff is injective and f(0R)=0Sf(0_R) = 0_S, we must have x0Rx \neq 0_R. This proves rr is a zerodivisor and hence it belongs to the domain of ff'. This proves ff' is onto (surjective). ◻

And this allows us to give another proof of Theorem 5.20.

Second proof of Theorem 5.20. The ring 4{\mathbb{Z}}_4 has one zerodivisor, [2]4[2]_4, while 2×2{\mathbb{Z}}_2 \times {\mathbb{Z}}_2 has 22 zerodivisors, ([1]2,[0]2)([1]_2,[0]_2) and ([0]2,[1]2)([0]_2,[1]_2). Therefore, the two rings cannot be isomorphic. ◻

5.3 Homomorphisms

Definition 5.23. A ring homomorphism between two rings RR and SS is a function ϕ:RS\phi: R \to S that satisfies the following properties:

  1. ϕ(x+y)=ϕ(x)+ϕ(y)\phi(x + y) = \phi(x) + \phi(y) for all x,yRx, y \in R. (“ϕ\phi preserves addition.”)

  2. ϕ(xy)=ϕ(x)ϕ(y)\phi(x \cdot y) = \phi(x) \cdot \phi(y) for all x,yRx, y \in R. (“ϕ\phi preserves multiplication.”)

  3. ϕ(1R)=1S\phi(1_R) = 1_S. (“ϕ\phi preserves multiplicative identities.”)

Note that a ring isomorphism is a ring homomorphism that is both surjective (onto) and injective (one-to-one).

Proposition 5.24 (Properties of homomorphisms). Let ϕ:ST\phi: S \rightarrow T be a homomorphism of rings. Then

  1. ϕ(0S)=0T\phi(0_S) = 0_T.

  2. For all xSx \in S, ϕ(x)=ϕ(x)- \phi(x) = \phi(-x). (A ring homomorphism preserves additive inverses.)

  3. If uSu \in S is a unit, then ϕ(u)T\phi(u) \in T is a unit.

Proof. \,

  1. We have ϕ(0S+0S)=ϕ(0S)+ϕ(0S)\phi(0_S + 0_S) = \phi(0_S) + \phi(0_S) and ϕ(0S+0S)=ϕ(0S)\phi(0_S + 0_S) = \phi(0_S), so that ϕ(0S)+ϕ(0S)=ϕ(0S)\phi(0_S) + \phi(0_S) = \phi(0_S). Now add ϕ(0S)-\phi(0_S) to both sides to conclude that ϕ(0S)=0T\phi(0_S) = 0_T.

  2. We have ϕ(x)+ϕ(x)=ϕ(x+x)=ϕ(0S)=0T\phi(-x) + \phi(x) = \phi(-x + x) = \phi(0_S) = 0_T (using the first result), and this proves ϕ(x)\phi(-x) is the additive inverse of ϕ(x)\phi(x).

  3. If uu is a unit in SS, then there is a vSv \in S such that uv=1S=vuuv = 1_S = vu. It follows that ϕ(u)ϕ(v)=ϕ(uv)=ϕ(1S)=1T\phi(u) \phi(v) = \phi(uv) = \phi(1_S) = 1_T and similarly ϕ(v)ϕ(u)=ϕ(vu)=ϕ(1S)=1T\phi(v) \phi(u) = \phi(vu) = \phi(1_S) = 1_T. This shows that ϕ(v)\phi(v) is the multiplicative inverse of ϕ(u)\phi(u) and hence ϕ(u)\phi(u) is a unit.edhere

 ◻

6 RSA Cryptography

6.1 The mathematics behind the RSA

There are two central mathematical principles behind RSA public key cryptography.

May 8, 2023

Lemma 6.1. If pp and qq are positive primes such that pqp\neq q and aa is an integer which satisfies pap\mid a and qaq\mid a, then (pq)a(pq)\mid a.

Proof. Suppose that a,p,qa,p,q are integers such that pap\mid a and qaq\mid a and pp and qq are distinct primes. By definition of divides, a=pka=pk for some kk\in {\mathbb{Z}}. Since qaq\mid a we have q(pk)q\mid (pk) and since qq is a prime we deduce that qpq\mid p or qkq\mid k by 1.49. If qpq\mid p then since pp is and qq are prime we have q=±pq=\pm p and since p,qp,q are both positive it must be that q=pq=p. This is a contradiction, so instead qkq\mid k must be true. Thus k=qk=q\ell for some \ell\in {\mathbb{Z}} by definition of divides.

Since a=pk=pqa=pk=pq\ell with \ell\in {\mathbb{Z}}, this shows that (pq)a(pq)\mid a. ◻

Theorem 6.2. If p,qp,q are positive primes, de1(mod(p1)(q1))de \equiv 1 \pmod{(p-1)(q-1)} and bb is an integer, then bdeb(modpq).b^{de}\equiv b \pmod{pq}.

Proof. First, we will show that if c1mod(p1)(q1)c \equiv 1 \mod {(p-1)(q-1)}, then bc1modpqb^{c} \equiv 1 \mod {pq}; applying this to c=dec = de will give that bcebmodpqb^{ce} \equiv b \mod {pq}.

So let c1mod(p1)(q1)c \equiv 1 \mod {(p-1)(q-1)}, so that c=1+(p1)(q1)kc = 1 + (p-1)(q-1)k for some kk \in {\mathbb{Z}}.

Similarly, modulo qq we have

 ◻

6.2 How RSA works

We can replace any letter in the alphabet by a number with two digits as follows:

A B C D E F G H I J K L M
01 02 03 04 05 06 07 08 09 10 11 12 13
N O P Q R S T U V W X Y Z
14 15 16 17 18 19 20 21 22 23 24 25 26

We can then represent words by numbers: HELLO=0805121215.

Secret: Bob picks 2 primes p,qp,q and computes m=pqm=pq p=29,q=101p=29, q=101
(sometimes nn itself is taken to be a prime; n=2929n=2929
that works, but is not necessary)
Secret: Bob chooses an encrypting exponent ee that is coprime with e=17e=17
ϕ(m)=ϕ(pq)=(p1)(q1)\phi(m)=\phi(pq)=(p-1)(q-1)
Public: Bob announces mm and ee. This is Bob’s public key. n=2929,e=17n=2929, e=17
Secret: Alice writes her message as a sequence of numbers. HELLO
If mm is an r+1r+1-digit number she will split her message into 0  805  121  215
rr-digit blocks. Her message will look like b1uadb2uaduadbmb_1 \mathbb{Q}uad b_2 \mathbb{Q}uad \ldots \mathbb{Q}uad b_m
Secret: Alice computes b1e(modm),b2e(modm),bne(modn)b_1^e \pmod m, b_2^e \pmod m, \ldots b_n^e\pmod n 017=00^{17}=0
b1e(modm),b2e(modn),bne(modn)b_1^e \pmod m, b_2^e \pmod n, \ldots b_n^e\pmod n 80517962(mod2929)805^{17}\equiv 962 \pmod{2929}
121171053(mod2929)121^{17}\equiv 1053 \pmod{2929}
215171491(mod2929)215^{17}\equiv 1491 \pmod{2929}
Public: Alice sends the results of her computations to Bob. 0  0962  1053  1491
This is the encrypted message
Secret: Bob computes de1(mod(p1)(q1))d\equiv e^{-1} \pmod {(p-1)(q-1)} d=1153d=1153
using the Euclidean algorithm
Bob computes (b1e)d(modm),(b2e)d(modm),(bne)d(modm)(b_1^e)^d \pmod m, (b_2^e)^d \pmod m, \ldots (b_n^e)^d\pmod m, 9621153805(mod2929)962^{1153}\equiv 805 \pmod{2929}
recovering the original message b1uadb2uaduadbnb_1 \mathbb{Q}uad b_2 \mathbb{Q}uad \ldots \mathbb{Q}uad b_n. 10531153121(mod2929)1053^{1153}\equiv 121 \pmod{2929}
This is the decryption phase. 14911153215(mod2929)1491^{1153}\equiv 215 \pmod{2929}

Why is this system secure? The only way to find dd, given ee and nn, is by knowing the value of (p1)(q1)(p-1)(q-1). If one knows the value of (p1)(q1)(p-1)(q-1), it is easy and fast (Euclidean algorithm) to find dd; without knowing the value of (p1)(q1)(p-1)(q-1) there is no practical way to find dd. Now, to find the value of (p1)(q1)(p-1)(q-1) one must know the values of pp and qq. Recall n=pqn = p \cdot q and that nn is PUBLIC. So, they only way to “crack” the code is to find the prime factorization of nn. There is no (known) method of factoring huge integers in a practical amount of time!


  1. This fancy-sounding principle is an axiom, and like most axioms, this is formalized common sense.↩︎