Number Theory#
Contents#
Arithmetic#
The bit complexity of numbers#
We focus on the bit complexity of algorithms, the number of elementary operations on individual bits, because it reflects the amount of hardware (transistors and wires) necessary for implementing an algorithm.
It is a basic property of numbers that for any base \(b \ge 2\) the sum of any three single-digit numbers is at most two digits long.
\( \begin{aligned} b &= 2 && & 1 + 1 + 1 &= 11 &&= 1 \times 2^1 + 1 \times 2^0 \\ b &= 3 && & 2 + 2 + 2 &= 20 &&= 2 \times 3^1 + 0 \times 3^0 \\ b &= 10 && & 9 + 9 + 9 &= 27 &&= 2 \times 10^1 + 7 \times 10^0 \\ b &= 16 && & F + F + F &= 2D &&= 2 \times 16^1 + 13 \times 16^0 \\ b &= 60 && & \lambda_{59} + \lambda_{59} + \lambda_{59} &= 2\lambda_{57} &&= 2 \times 60^1 + 57 \times 60^0 \\ \end {aligned} \)
How many digits are needed to represent the number \(N \ge 1\) in base \(b\)? The answer is \(\lceil \log_b (N + 1) \rceil\). (By convention, \(N = 0\) is written with a single digit.)
With \(k \ge 1\) digits in base \(b\) we can express numbers up to \(b^k - 1\).
base \(2\)
with \(1\) digit in base \(2\) we can express numbers up to \(2^1 - 1 = 1\)
with \(2\) digits in base \(2\) we can express numbers up to \(2^2 - 1 = 3\)
with \(3\) digits in base \(2\) we can express numbers up to \(2^3 - 1 = 7\)
with \(4\) digits in base \(2\) we can express numbers up to \(2^4 - 1 = 15\)
base \(10\)
with \(1\) digit in base \(10\) we can express numbers up to \(10^1 - 1 = 9\)
with \(2\) digits in base \(10\) we can express numbers up to \(10^2 - 1 = 99\)
with \(3\) digits in base \(10\) we can express numbers up to \(10^3 - 1 = 999\)
with \(4\) digits in base \(10\) we can express numbers up to \(10^4 - 1 = 9999\)
\( \begin{aligned} b^k - 1 &= N \\ b^k &= N + 1 \\ \log_b b^k &= \log_b (N + 1) \\ k &= \log_b (N + 1) \\ \end {aligned} \)
When \(N + 1\) is not an exact power of \(b\) we round up, giving \(k = \lceil \log_b (N + 1) \rceil\).
Base \(2\)
\( \begin{aligned} \lceil \log_2 (\textcolor{green}{ 1} + 1) \rceil &= 1 \text{ b} \\ \lceil \log_2 (\textcolor{green}{ 2} + 1) \rceil &= 2 \text{ b} \\ \lceil \log_2 (\textcolor{green}{ 3} + 1) \rceil &= 2 \text{ b} \\ \lceil \log_2 (\textcolor{green}{ 4} + 1) \rceil &= 3 \text{ b} \\ &\vdots \\ \lceil \log_2 (\textcolor{green}{ 7} + 1) \rceil &= 3 \text{ b} \\ \lceil \log_2 (\textcolor{green}{ 8} + 1) \rceil &= 4 \text{ b} \\ &\vdots \\ \lceil \log_2 (\textcolor{green}{15} + 1) \rceil &= 4 \text{ b} \\ \lceil \log_2 (\textcolor{green}{16} + 1) \rceil &= 5 \text{ b} \\ &\vdots \\ \lceil \log_2 (\textcolor{green}{31} + 1) \rceil &= 5 \text{ b} \\ \lceil \log_2 (\textcolor{green}{32} + 1) \rceil &= 6 \text{ b} \\ &\vdots \\ \lceil \log_2 (\textcolor{green}{63} + 1) \rceil &= 6 \text{ b} \\ \lceil \log_2 (\textcolor{green}{64} + 1) \rceil &= 7 \text{ b} \\ \end {aligned} \)
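As a quick sanity check (a sketch, not part of the text; the function name is ours), the formula can be compared against Python's built-in `int.bit_length`:

```python
import math

def digits_needed(n: int, b: int = 2) -> int:
    """Number of base-b digits needed to write n: ceil(log_b(n + 1)) for n >= 1."""
    if n == 0:
        return 1  # by convention, 0 is written with a single digit
    count = 0
    while n > 0:
        n //= b       # strip one digit per iteration
        count += 1
    return count

# Cross-check against the formula and against int.bit_length.
for n in range(1, 1000):
    assert digits_needed(n, 2) == math.ceil(math.log2(n + 1)) == n.bit_length()
    assert digits_needed(n, 10) == len(str(n))
```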
How much does the size of a number change when its base changes?
The rule for converting logarithms from base \(a\) to base \(b\):
\(\log_b N = \frac{1}{\log_a b} \log_a N \iff \log_a N = \log_b N \times \log_a b\)
The size of an integer \(N\) in base \(a\) is the same as its size in base \(b\) times the constant factor \(\log_a b\). Therefore, in big-\(O\) notation the base is irrelevant, and the size is written simply as \(O(\log N)\).
\(\log N\) is the power to which \(2\) must be raised to obtain \(N\).
\(\lceil \log N \rceil\) is the number of times \(N\) must be halved to obtain unity. This will be useful when a number is halved at each iteration of an algorithm.
\(\lceil \log (N + 1) \rceil\) is the number of bits in the binary representation of \(N\).
\(\lfloor \log N \rfloor\) is the depth of a complete binary tree with \(N\) nodes.
\(\log N = 1 + 1/2 + 1/3 + \dotsb + 1/N\) to within a constant factor.
Addition#
The rule that the sum of any three single-digit numbers is at most two digits allows us to add two numbers in any base in the following way: align their right-hand ends and then perform a single right-to-left pass in which the sum is computed digit by digit, maintaining the overflow as a carry. Since we know the individual sum is a two-digit number, the carry is always a single digit, and so at any given step, three single-digit numbers are added.
Carry:  1     1 1 1
          1 1 0 1 0 1      (53)
        + 1 0 0 0 1 1      (35)
        -------------
        1 0 1 1 0 0 0      (88)
Complexity
Suppose both \(x\) and \(y\) are \(n\) bits long. Then \(x + y\) is at most \(n + 1\) bits long and each bit of this sum is computed in a fixed amount of time.
\(T(n) = c_1n + c_0 = O(n)\)
Is there a faster algorithm? In order to add two \(n\)-bit numbers they must at least be read and their sum written, which requires at least \(n\) operations. Therefore, the addition algorithm is optimal up to multiplicative constants.
It is true that integers whose size in bits is within the word length of today’s computers (\(32\) bits, say) can be added in a single processor instruction. But it is often useful and necessary to handle numbers much larger than this, perhaps several thousand bits long. Adding and multiplying such large numbers on real computers is very much like performing the operations bit by bit.
Implementation
#include <stdio.h>

int main (void) {
    long b1;            // first binary number, entered as decimal digits
    long b2;            // second binary number
    int i = 0;
    int carry = 0;
    int sum[24];        // result bits, least significant first

    printf("Enter the first binary number: ");  scanf("%ld", &b1);
    printf("Enter the second binary number: "); scanf("%ld", &b2);

    while (b1 != 0 || b2 != 0) {
        int bit = b1 % 10 + b2 % 10 + carry;  // add two bits plus the carry
        sum[i++] = bit % 2;
        carry = bit / 2;                      // the carry is always a single digit
        b1 /= 10;
        b2 /= 10;
    }
    if (carry != 0)
        sum[i++] = carry;
    if (i == 0)         // both inputs were zero: print a single 0
        sum[i++] = 0;

    printf("Sum: ");
    while (--i >= 0)
        printf("%d", sum[i]);
    printf("\n");
    return 0;
}
Multiplication#
Grade-School Binary Multiplication Algorithm#
Multiply two numbers \(x\) and \(y\). Create an array of intermediate sums each representing the product of \(x\) by a single digit of \(y\). These values are appropriately left-shifted and then added up.
\(13 \times 11\) or \(x = 1101\) and \(y = 1011\)
          1 1 0 1      (13, multiplicand)
        x 1 0 1 1      (11, multiplier)
        ---------
          1 1 0 1      (1101 times 1)                   13 x 2^0 =  13
        1 1 0 1        (1101 times 1, shifted once)     13 x 2^1 =  26
      0 0 0 0          (1101 times 0, shifted twice)     0 x 2^2 =   0
  + 1 1 0 1            (1101 times 1, shifted thrice)   13 x 2^3 = 104
  ---------------
  1 0 0 0 1 1 1 1      (binary 143)
In binary each intermediate row is either zero or \(x\) itself, left-shifted an appropriate amount of times. Left-shifting multiplies by the base and right-shifting divides by the base, rounding down if necessary.
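The tableau above can be sketched in a few lines of Python (an illustration, not a full bignum implementation; the function name is ours). Each set bit of the multiplier contributes a left-shifted copy of the multiplicand:

```python
def gradeschool_multiply(x: int, y: int) -> int:
    """Sum a left-shifted copy of x for each set bit of y, as in the tableau."""
    total = 0
    shift = 0
    while y > 0:
        if y & 1:                 # this bit of the multiplier is 1
            total += x << shift   # add x, left-shifted the appropriate amount
        y >>= 1                   # move to the next bit of the multiplier
        shift += 1
    return total

print(gradeschool_multiply(13, 11))  # 143, matching the tableau
```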
Correctness
Complexity
If \(x\) and \(y\) are both \(n\) bits then there are \(n\) intermediate rows with lengths of up to \(2n\) bits, accounting for the shifts. The total time taken to add up these rows, two numbers at a time, is
\(\underbrace{O(n) + O(n) + \dotsb + O(n)}_{n - 1 \text{ times}} = O(n^2)\)
Al Khwarizmi’s multiplication by repeated halvings of the multiplier#
Multiply two decimal numbers \(x\) and \(y\). Write them next to each other and then repeat the following: divide the first number by \(2\), rounding down the result, and double the second number; continue until the first number reaches unity and then strike out all the rows in which the first number is even and add up whatever remains in the second column.
parity of multiplier 11
   11     13                    11 = 5 x 2 + 1
    5     26                     5 = 2 x 2 + 1
    2     52   (strike out)      2 = 1 x 2 + 0
    1    104                     1 = 0 x 2 + 1
         ---
         143
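The halving-and-doubling procedure can be sketched as follows (an illustration; the function name is ours):

```python
def multiply_by_halving(x: int, y: int) -> int:
    """Al Khwarizmi's method: halve x and double y, keeping rows where x is odd."""
    total = 0
    while x >= 1:
        if x % 2 == 1:    # keep this row: the first number is odd
            total += y
        x //= 2           # halve the first number, rounding down
        y *= 2            # double the second number
    return total

print(multiply_by_halving(11, 13))  # 143
```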
Multiplication à la française#
This algorithm implements the following recursive rule.
\( x \cdot y = \begin{cases} 2(x \cdot \lfloor y/2 \rfloor) & \text{if \textit{y} is even} \\ x + 2(x \cdot \lfloor y/2 \rfloor) & \text{if \textit{y} is odd} \\ \end {cases} \)
INPUT: two n-bit integers x and y, where y >= 0
OUTPUT: their product

MULT (x, y)
    if y = 0
        return 0
    z = MULT(x, floor(y/2))
    if even(y)
        return 2z
    else
        return 2z + x
EXAMPLE
The algorithm decomposes the multiplier into powers of \(2\): since \(25 = 11001_2 = 16 + 8 + 1\),
\(x \cdot 25 = x \cdot 16 + x \cdot 8 + x \cdot 1\)
The terms \(x \cdot 2^i\) come from repeated doubling of \(x\).
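MULT transcribes directly into Python (a sketch; Python's arbitrary-precision integers make the shifts exact at any size):

```python
def mult(x: int, y: int) -> int:
    """Recursive multiplication following the rule above; requires y >= 0."""
    if y == 0:
        return 0
    z = mult(x, y // 2)   # x times floor(y/2)
    if y % 2 == 0:
        return 2 * z
    return 2 * z + x

print(mult(13, 11))  # 143
```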
Correctness
The recursive rule is transparently correct. Checking that the algorithm is correct is a matter of verifying that it mimics the rule and that it handles the base case \(y = 0\) properly.
Complexity
The algorithm terminates after \(n\) recursive calls because at each call \(y\) is halved (i.e., its number of bits is decreased by one). Each recursive call requires the following operations:
a division by \(2\) (a right shift)
a test for odd/even (looking up the last bit)
a multiplication by \(2\) (a left shift)
possibly, one addition
a total of \(O(n)\) bit operations. The total running time is
\(T(n) = O(n^2)\)
Can we do better? Intuitively, it seems that multiplication requires adding about \(n\) multiples of one of the inputs, and we know that each addition is linear, so it would appear that \(n^2\) bit operations are inevitable. But we will see that we can do better.
Division#
To divide an integer \(x\) by an integer \(y \ge 1\) means to find a quotient \(q\) and a remainder \(r\), where \(x = yq + r\) and \(0 \le r \lt y\).
INPUT: two n-bit integers x and y, where y >= 1
OUTPUT: the quotient and the remainder of x divided by y

DIVIDE (x, y)
    if x = 0
        return (q, r) = (0, 0)
    (q, r) = DIVIDE(floor(x/2), y)
    q = 2q
    r = 2r
    if odd(x)
        r = r + 1
    if r >= y
        r = r - y
        q = q + 1
    return (q, r)
Complexity
Like multiplication, division takes quadratic time.
\(T(n) = O(n^2)\)
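DIVIDE also transcribes directly into Python (a sketch, assuming \(x \ge 0\) and \(y \ge 1\)):

```python
def divide(x: int, y: int) -> tuple:
    """Recursive division: returns (q, r) with x = y*q + r and 0 <= r < y.
    Requires x >= 0 and y >= 1."""
    if x == 0:
        return 0, 0
    q, r = divide(x // 2, y)
    q, r = 2 * q, 2 * r
    if x % 2 == 1:        # restore the bit lost by halving x
        r += 1
    if r >= y:            # the doubled remainder may exceed y at most once
        r -= y
        q += 1
    return q, r

print(divide(10678, 42))  # (254, 10)
```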
\( \begin{aligned} 2^{32} - 1 &= 4294967295 &&= 3 \times 5 \times 17 \times 257 \times 65537 \\ 2^{53} - 1 &= 9007199254740991 &&= 6361 \times 69431 \times 20394401 \\ 2^{64} - 1 &= 18446744073709551615 &&= 3 \times 5 \times 17 \times 257 \times 641 \times 65537 \times 6700417 \\ \end {aligned} \)
What’s the largest value that \(n\) bits can hold?
print(f' largest value')
print(f' 16-bit numbers {2** 16 - 1:55} = 2^ 16 - 1')
print(f' 20-bit numbers {2** 20 - 1:55} = 2^ 20 - 1')
print(f' 32-bit numbers {2** 32 - 1:55} = 2^ 32 - 1')
print(f' 40-bit numbers {2** 40 - 1:55} = 2^ 40 - 1')
print(f' 52-bit numbers {2** 52 - 1:55} = 2^ 52 - 1')
print(f' 53-bit numbers {2** 53 - 1:55} = 2^ 53 - 1')
print(f' 64-bit numbers {2** 64 - 1:55} = 2^ 64 - 1')
print(f'128-bit numbers {2**128 - 1:55} = 2^128 - 1')
largest value
16-bit numbers 65535 = 2^ 16 - 1
20-bit numbers 1048575 = 2^ 20 - 1
32-bit numbers 4294967295 = 2^ 32 - 1
40-bit numbers 1099511627775 = 2^ 40 - 1
52-bit numbers 4503599627370495 = 2^ 52 - 1
53-bit numbers 9007199254740991 = 2^ 53 - 1
64-bit numbers 18446744073709551615 = 2^ 64 - 1
128-bit numbers 340282366920938463463374607431768211455 = 2^128 - 1
Theorem
If \(p\) is an odd prime then \(2^{p - 1}\) leaves the remainder \(1\) on division by \(p\).
Example
\(2^{1000}\) leaves the remainder \(562\) on division by \(1001\) so \(1001\) is composite.
\(1001 = 7 \times 11 \times 13\)
\(2^{2^k}\) can be computed by successive squaring
\( \begin{aligned} 1000 &= 2^3 + 2^5 + 2^6 + 2^7 + 2^8 + 2^9 \\ 2^{1000} &= 2^{2^3} \times 2^{2^5} \times 2^{2^6} \times 2^{2^7} \times 2^{2^8} \times 2^{2^9} \\ 2^{2^3} &= 256 \\ 2^{2^4} &= 256^2 = 65536 \equiv 471 \mod 1001 \\ 2^{2^5} &= 471^2 = 221841 \equiv 620 \mod 1001 \\ 2^{2^3} \times 2^{2^5} &\equiv 256 \times 620 = 158720 \equiv 562 \mod 1001 \\ 2^{2^6} &\equiv 620^2 = 384400 \equiv 16 \mod 1001 \\ 2^{2^3} \times 2^{2^5} \times 2^{2^6} &\equiv 562 \times 16 = 8992 \equiv 984 \mod 1001 \\ 2^{2^7} &\equiv 16^2 \equiv 256 \mod 1001 \\ 2^{2^3} \times 2^{2^5} \times 2^{2^6} \times 2^{2^7} &\equiv 984 \times 256 = 251904 \equiv 653 \mod 1001 \\ 2^{2^8} &\equiv 471 \mod 1001 \\ 2^{2^3} \times 2^{2^5} \times 2^{2^6} \times 2^{2^7} \times 2^{2^8} &\equiv 653 \times 471 = 307563 \equiv 256 \\ 2^{2^9} &\equiv 620 \\ 2^{1000} &= 2^{2^3} \times 2^{2^5} \times 2^{2^6} \times 2^{2^7} \times 2^{2^8} \times 2^{2^9} \equiv 620 \times 256 = 167168 \equiv 562 \\ \end {aligned} \)
Any programming language with double-precision arithmetic can compute \(2^{p - 1} \bmod p\) in time linear in the number of bits of \(p\).
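The successive-squaring computation above can be sketched as follows (the function name is ours); Python's built-in three-argument `pow` performs the same modular exponentiation:

```python
def power_mod(base: int, exponent: int, modulus: int) -> int:
    """Compute base**exponent mod modulus by successive squaring."""
    result = 1
    square = base % modulus
    while exponent > 0:
        if exponent % 2 == 1:                  # this power of two occurs in the exponent
            result = (result * square) % modulus
        square = (square * square) % modulus   # the next successive square
        exponent //= 2
    return result

print(power_mod(2, 1000, 1001))  # 562, so 1001 is composite
```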
Induction
Inductive definition of \(\mathbb{N}\): \(1 \in \mathbb{N}\) is the least member of \(\mathbb{N}\). Given any \(n \in \mathbb{N}\) the next member is \(n + 1\).
2#
We know that given integers \(a\) and \(b\) not both zero there are integers \(x\) and \(y\) s.t. \(\gcd(a, b) = ax + by\). How do we find \(x\) and \(y\)?
Euclid’s Algorithm gives a very efficient way to find them, and it is still the basis for many numerical methods in arithmetical applications.
Euclid’s Algorithm
We may assume \(a, b \gt 0\), since multiplying either by \(-1\) does not change \(\gcd(a, b)\); we need only replace \(x\) by \(-x\) or \(y\) by \(-y\) accordingly.
Suppose \(b \le a\). Let \(r_0 = b, r_{-1} = a\)
Apply the division algorithm iteratively as follows.
\( \begin{aligned} r_{-1} &= r_0 q_1 + r_1 && 0 \lt r_1 \lt r_0 \\ r_0 &= r_1 q_2 + r_2 && 0 \lt r_2 \lt r_1 \\ r_1 &= r_2 q_3 + r_3 && 0 \lt r_3 \lt r_2 \\ \vdots \\ r_{s-3} &= r_{s-2} q_{s-1} + r_{s-1} && 0 \lt r_{s-1} \lt r_{s-2} \\ r_{s-2} &= r_{s-1} q_s \\ \end {aligned} \)
We stop the moment there is a remainder equal to \(0\). This could be \(r_1\) if \(b \mid a\), for example.
Since \(r_j \lt r_{j-1}\) we must eventually have a zero remainder.
Euclid proved that \(\gcd(a, b) = r_{s-1}\)
\(\text{Proof}\)
\(\gcd(a, b) \mid a\) and \(\gcd(a, b) \mid b\) and so \(\gcd(a, b) \mid r_1 = a - bq_1\). Repeating, \(\gcd(a, b) \mid r_j\) for \(j = 2, 3, \dotsc, s - 1\)
Starting from the bottom, \(r_{s-1} \mid r_{s-2}\), \(r_{s-1} \mid r_{s-3}\), and so on until we have \(r_{s-1} \mid b\) and \(r_{s-1} \mid a\), and so \(r_{s-1} \mid \gcd(a, b)\)
Since \(r_{s-1} \mid \gcd(a, b)\) and \(\gcd(a, b) \mid r_{s-1}\) it is the case that \(r_{s-1} = \gcd(a, b)\).
\(\blacksquare\)
EXAMPLE
\( \begin{aligned} a &= & 10,678 \\ b &= & 42 \\ 10,678 &= & 42 &\times & 254 &+ & 10 \\ 42 &= & 10 &\times & 4 &+ & 2 \\ 10 &= & 2 &\times & 5 \\ \end {aligned} \)
\(\gcd(10678, 42) = 2\)
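The iterated divisions can be sketched in Python (the function name is ours); each step replaces \((a, b)\) by \((b, a \bmod b)\) until the remainder is zero:

```python
def gcd(a: int, b: int) -> int:
    """Euclid's algorithm: repeated division with remainder."""
    a, b = abs(a), abs(b)
    while b != 0:
        a, b = b, a % b   # replace (a, b) by (b, a mod b)
    return a

print(gcd(10678, 42))  # 2
```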
\(x\) and \(y\) can be found via back substitution, which is inefficient since it requires that all computations be stored.
\( \begin{aligned} 2 &= 42 - 10 \times 4 \\ &= 42 - (10,678 - 42 \times 254) \times 4 \\ &= 42 - 10,678 \times 4 + 42 \times 254 \times 4 \\ &= 42 + 42 \times 254 \times 4 - 10,678 \times 4 \\ &= 42 \times (1 + 254 \times 4) - 10,678 \times 4 \\ &= 42 \times 1017 - 10,678 \times 4 \\ \end {aligned} \)
Extended Euclidean Algorithm
\( \begin{aligned} x_{-1} &:= 1 \\ y_{-1} &:= 0 \\ x_0 &:= 0 \\ y_0 &:= 1 \\ r_{-1} &= r_0 q_1 + r_1 && x_1 = x_{-1} - q_1 x_0 && y_1 = y_{-1} - q_1 y_0 \\ r_0 &= r_1 q_2 + r_2 && x_2 = x_0 - q_2 x_1 && y_2 = y_0 - q_2 y_1 \\ r_1 &= r_2 q_3 + r_3 && x_3 = x_1 - q_3 x_2 && y_3 = y_1 - q_3 y_2 \\ \vdots \\ r_{s-3} &= r_{s-2} q_{s-1} + r_{s-1} && x_{s-1} = x_{s-3} - q_{s-1} x_{s-2} && y_{s-1} = y_{s-3} - q_{s-1} y_{s-2} \\ r_{s-2} &= r_{s-1} q_s \\ \end {aligned} \)
The claim is that \(x = x_{s-1}, y = y_{s-1}\). More generally, the claim is that \(r_j = ax_j + by_j\) and this can be proved by induction.
\(\text{Proof by induction}\)
By construction it is the case that \(r_{-1} = ax_{-1} + by_{-1}\) and \(r_0 = ax_0 + by_0\).
\( \begin{aligned} r_{-1} &= a \times 1 + b \times 0 = a \\ r_0 &= a \times 0 + b \times 1 = b \\ \end {aligned} \)
Suppose \(r_j = ax_j + by_j\) is established for all \(j \le k\).
\( \begin{aligned} r_{k+1} &= r_{k-1} - q_{k+1} r_k \\ &= (ax_{k-1} + by_{k-1}) - q_{k+1} (ax_k + by_k) \\ &= ax_{k-1} + by_{k-1} - q_{k+1} ax_k - q_{k+1} by_k \\ &= ax_{k-1} - q_{k+1} ax_k + by_{k-1} - q_{k+1} by_k \\ &= a(x_{k-1} - q_{k+1} x_k) + b(y_{k-1} - q_{k+1} y_k) \\ &= ax_{k+1} + by_{k+1} \\ \end {aligned} \)
\(\blacksquare\)
\(\boxed{\gcd(a, b) = r_{s-1} = ax_{s-1} + by_{s-1}}\)
EXAMPLE
\( \begin{aligned} r_{-1} &= & 10,678 \\ r_0 &= & 42 \\ x_{-1} &= & 1 \\ x_0 &= & 0 \\ y_{-1} &= & 0 \\ y_0 &= & 1 \\ 10,678 &= & 42 &\times & 254 &+ & 10 && \underset{x_1}{ 1} &= & 1 &- & 254 &\times & 0 && \underset{y_1}{-254} &= & 0 &- & 1 &\times & 254 \\ 42 &= & 10 &\times & 4 &+ & 2 && \underset{x_2}{-4} &= & 0 &- & 4 &\times & 1 && \underset{y_2}{1017} &= & 1 &- & 4 &\times & (-254) \\ 10 &= & 2 &\times & 5 \\ \end {aligned} \)
\(\gcd(10678, 42) = 2 = 10678 \times (-4) + 42 \times 1017\)
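The extended algorithm carries the \(x_j\) and \(y_j\) sequences along with the remainders, avoiding back substitution. A Python sketch (names are ours):

```python
def extended_gcd(a: int, b: int) -> tuple:
    """Return (g, x, y) with g = gcd(a, b) = a*x + b*y, for a, b > 0."""
    r_prev, r_cur = a, b   # r_{-1}, r_0
    x_prev, x_cur = 1, 0   # x_{-1}, x_0
    y_prev, y_cur = 0, 1   # y_{-1}, y_0
    while r_cur != 0:
        q = r_prev // r_cur
        r_prev, r_cur = r_cur, r_prev - q * r_cur
        x_prev, x_cur = x_cur, x_prev - q * x_cur
        y_prev, y_cur = y_cur, y_prev - q * y_cur
    return r_prev, x_prev, y_prev

print(extended_gcd(10678, 42))  # (2, -4, 1017)
```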
Linear Diophantine Equation
Euclid’s algorithm can be used to find the complete solution in integers to linear diophantine equations of the kind
\(ax + by = c\) where \(a\), \(b\), and \(c\) are given integers, and we seek all integers \(x\) and \(y\) which satisfy the equation
\(\text{Case 1}\) Both \(a\) and \(b\) are zero. If \(a = b = 0\) then the linear diophantine equation is not soluble unless \(c = 0\); and then it is soluble by any \(x\) and \(y\), which is not very interesting.
\(\text{Case 2}\) One of \(a\) and \(b\) is non-zero.
\(\gcd(a, b) \mid (ax + by) \implies \gcd(a, b) \mid c\)
Choose \(x\) and \(y\) s.t. \(ax + by = \gcd(a, b)\). Then there is a solution
\( \begin{aligned} ax + by &= \gcd(a, b) \\ \frac{ax + by}{\gcd(a, b)} &= 1 \\ c\,\frac{ax + by}{\gcd(a, b)} &= c \\ a \left( \frac{xc}{\gcd(a, b)} \right) + b \left( \frac{yc}{\gcd(a, b)} \right) &= c \\ \end {aligned} \)
Let’s call this solution \(x_0, y_0\). Consider any other solution \(x, y\).
\( \begin{aligned} ax + by &= c \\ ax + by - ax_0 - by_0 &= c - c = 0 \\ ax - ax_0 &= by_0 - by \\ a(x - x_0) &= b(y_0 - y) \\ \frac{a}{\gcd(a, b)}(x - x_0) &= \frac{b}{\gcd(a, b)}(y_0 - y) \\ \end {aligned} \)
Recall that \(\gcd \left( \frac{a}{\gcd(a, b)}, \frac{b}{\gcd(a, b)} \right) = 1\)
\(y_0 - y = z \frac{a}{\gcd(a, b)}\) for some \(z\)
\(x - x_0 = z \frac{b}{\gcd(a, b)}\) for the same \(z\)
Any \(x\) and \(y\) of this form give a solution, so we have found a complete solution set.
THEOREM
Suppose that \(a\) and \(b\) are not both zero, that \(\gcd(a, b) \mid c\), and that \(ax_0 + by_0 = c\). Then every solution of
\(ax + by = c\)
is given by
\(x = x_0 + z \frac{b}{\gcd(a, b)}, \quad y = y_0 - z \frac{a}{\gcd(a, b)}\)
where \(z\) is any integer.
The solutions \(x\) all leave the same remainder on division by \(\frac{b}{\gcd(a, b)}\) and likewise for \(y\) on division by \(\frac{a}{\gcd(a, b)}\).
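A sketch combining the extended Euclidean algorithm with this parametrization (the function name and the `z` parameter are ours; the code uses \(y = y_0 - z\,a/\gcd(a, b)\), the sign for which \(ax + by = c\) holds identically in \(z\)):

```python
def solve_diophantine(a: int, b: int, c: int, z: int = 0):
    """One integer solution of a*x + b*y = c (a, b not both zero),
    shifted z steps along the line of all solutions; None if insoluble."""
    # extended Euclid: g = a*old_x + b*old_y
    old_r, r, old_x, x, old_y, y = a, b, 1, 0, 0, 1
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    g = old_r
    if g < 0:                        # normalize so that g > 0
        g, old_x, old_y = -g, -old_x, -old_y
    if c % g != 0:                   # gcd(a, b) must divide c
        return None
    s = c // g
    # note the minus sign: a*(x0 + z*b/g) + b*(y0 - z*a/g) = c for every z
    return old_x * s + z * (b // g), old_y * s - z * (a // g)

print(solve_diophantine(10678, 42, 2))  # e.g. (-4, 1017)
```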
Lehman’s Method is based on differences of squares and is a small improvement over trial division.
Apply trial division with \(d = 2, 3, \dotsc\) up to \(d \le n^{\frac{1}{3}}\).
For \(1 \le t \le n^{\frac{1}{3}} + 1\), consider the integers \(x\) with \(\sqrt{4tn} \le x \le \sqrt{4tn + n^{\frac{2}{3}}}\) and check each \(x^2 - 4tn\) to see whether it is a perfect square \(y^2\) (to locate the candidates, compute \(4tn - \lfloor \sqrt{4tn} \rfloor^2\)).
If there are \(x\) and \(y\) s.t. \(x^2 - 4tn = y^2\) then compute \(\gcd(x + y, n)\).
If there is no \(t \le n^{\frac{1}{3}} + 1\) for which there are \(x\) and \(y\), then \(n\) is prime.
EXAMPLE
\(n = 10001\) and so \(\lfloor (10001)^{\frac{1}{3}} \rfloor = 21\)
Trial division with \(d = 2, 3, 5, 7, 11, 13, 17, 19\) finds no factors.
\( \begin{aligned} t &= 1 \\ 4tn &= 40004 \\ \lfloor \sqrt{4tn} \rfloor &= \lfloor \sqrt{40004} \rfloor = 200 \\ \left\lfloor \sqrt{4tn + n^{\frac{2}{3}}} \right\rfloor &= \left\lfloor \sqrt{40004 + 21^2} \right\rfloor = \left\lfloor \sqrt{40004 + 441} \right\rfloor = \left\lfloor \sqrt{40445} \right\rfloor = 201 \\ 4tn - \lfloor \sqrt{4tn} \rfloor^2 &= 40004 - 200^2 = 40004 - 40000 = 4 = 2^2 \\ \end {aligned} \)
\( \begin{aligned} x^2 &-& 4tn &=& y^2 \\ (200^2 = 40000) &-& 40004 &=& -4 \\ (201^2 = 40401) &-& 40004 &=& 397 \\ \end {aligned} \)
\( \begin{aligned} t &= 2 \\ 4tn &= 80008 \\ \lfloor \sqrt{4tn} \rfloor &= \lfloor \sqrt{80008} \rfloor = 282 \\ \left\lfloor \sqrt{4tn + n^{\frac{2}{3}}} \right\rfloor &= \left\lfloor \sqrt{80008 + 21^2} \right\rfloor = \left\lfloor \sqrt{80008 + 441} \right\rfloor = \left\lfloor \sqrt{80449} \right\rfloor = 283 \\ 4tn - \lfloor \sqrt{4tn} \rfloor^2 &= 80008 - 282^2 = 80008 - 79524 = 484 = 22^2 \\ 4tn - \left\lfloor \sqrt{4tn + n^{\frac{2}{3}}} \right\rfloor^2 &= 80008 - 283^2 = 80008 - 80089 = -81 = -(9^2) \\ \end {aligned} \)
\( \begin{aligned} x^2 &-& 4tn &=& y^2 \\ (282^2 = 79524) &-& 80008 &=& -484 \\ (283^2 = 80089) &-& 80008 &=& 81 \\ \end {aligned} \)
\(\gcd(x + y, n) = \gcd(283 + 9, 10001) = 73\)
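The whole method can be sketched as follows (a simplified illustration; `icbrt` is our helper for the integer cube root, \(n^{\frac{2}{3}}\) is approximated by the cube root squared, and \(n \ge 8\) is assumed so that trial division covers the smallest divisors):

```python
from math import gcd, isqrt

def icbrt(n: int) -> int:
    """Integer cube root: the largest c with c**3 <= n."""
    c = round(n ** (1 / 3))
    while c ** 3 > n:
        c -= 1
    while (c + 1) ** 3 <= n:
        c += 1
    return c

def lehman(n: int):
    """Lehman's method (sketch, n >= 8): a nontrivial factor of n, or None if prime."""
    c = icbrt(n)
    for d in range(2, c + 1):            # trial division up to n^(1/3)
        if n % d == 0:
            return d
    for t in range(1, c + 2):            # 1 <= t <= n^(1/3) + 1
        lo = isqrt(4 * t * n)
        if lo * lo < 4 * t * n:
            lo += 1                      # smallest x with x^2 >= 4tn
        hi = isqrt(4 * t * n + c * c)    # c*c approximates n^(2/3)
        for x in range(lo, hi + 1):
            y2 = x * x - 4 * t * n
            y = isqrt(y2)
            if y * y == y2:              # x^2 - 4tn is a perfect square
                g = gcd(x + y, n)
                if 1 < g < n:
                    return g
    return None                          # no t works: n is prime

print(lehman(10001))  # 73, found at t = 2 with x = 283, y = 9
```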
Proof of Lehman’s Method
The proof depends on a subject called Diophantine approximation, via continued fractions, which in turn has some connections with Euclid’s algorithm.
We can take a shortcut via a theorem of Dirichlet.
Dirichlet Theorem
For any real number \(\alpha\) and any integer \(Q \ge 1\) there exist integers \(a\) and \(q\) with \(1 \le q \le Q\) s.t.
\(\left| \alpha - \frac{a}{q} \right| \le \frac{1}{q(Q + 1)}\)
As a consequence of casting out all common factors of \(a\) and \(q\) in \(a/q\) we have
\(\gcd(a, q) = 1\)
Proof of Lehman’s Algorithm
When there is a \(d \mid n\) with \(n^{\frac{1}{3}} \lt d \le n^{\frac{1}{2}}\) then there is a \(t\) with \(1 \le t \le n^{\frac{1}{3}} + 1\) and \(x\) and \(y\) s.t.
\(4tn \le x^2 \le 4tn + n^{\frac{2}{3}}, x^2 - y^2 = 4tn\)
The running time is \(O(n^{\frac{1}{3}})\).
Acknowledgements#
2006
Dasgupta, Sanjoy; Christos Papadimitriou; & Umesh Vazirani. Algorithms. Chapter 1. McGraw-Hill.
2018
Rosen, Kenneth. Discrete Mathematics and its Applications 8e. McGraw-Hill.
2023
Vaughan, Robert. A Course of Elementary Number Theory.