Post date: May 12, 2014 11:34:11 PM
"What number below 1 million, in its Collatz sequence, takes the longest number of steps to reach 1?"
I really liked this one. It was a real challenge. I have two answers for you. The first one, the easiest one, is proper T-SQL, but it has a problem.
CREATE TABLE #Integers (i BIGINT NOT NULL PRIMARY KEY)
INSERT INTO #Integers VALUES (0), (1), (2), (3), (4), (5), (6), (7), (8), (9)
INSERT INTO #Integers
SELECT o.i + 10 * t.i + 100 * h.i + 1000 * k.i + 10000 * k10.i + 100000 * k100.i
FROM #Integers o
CROSS JOIN #Integers t
CROSS JOIN #Integers h
CROSS JOIN #Integers k
CROSS JOIN #Integers k10
CROSS JOIN #Integers k100
WHERE o.i + 10 * t.i + 100 * h.i + 1000 * k.i + 10000 * k10.i + 100000 * k100.i > 9
; WITH Collatz (i, j) AS
(
SELECT i, i
FROM #Integers
UNION ALL
SELECT i, CASE WHEN j % 2 = 0 THEN j/2 ELSE 3 * j + 1 END
FROM Collatz
WHERE j > 1
)
SELECT TOP (1) i, COUNT(*)
FROM Collatz
GROUP BY i
ORDER BY 2 DESC
OPTION (MAXRECURSION 1000)
DROP TABLE #Integers
This method takes on average of 1.8ms to generate a sequence, which is plenty fast normally. But in this case, with 1 million results, it means that it takes over 32 minutes, which is far too slow for my liking.
To get a more speedy answer, it takes a little cheating, Once you encounter a number that you have already calculated, for example 5, you already know every number that will follow it. 5, 16, 8, 4, 2, 1. We can take advantage of this, by keeping a table of the numbers that we have calculated and their lengths. Once we hit a number we have already calculated, we can add our current length to the saved value and skip the rest of the calculation. This short-circuiting of the recursion saves a huge amount of time. It isn't idiomatic SQL, using nested loops instead of queries, but we don't have the ability, as far as I know, to write a query where each line is returned in order, saving its data at that point in time for the next line in the query. This is not proper, but it's effective, getting us from 30 minutes to less than 2.
DECLARE @counter INT, @i BIGINT, @len INT, @templen INT
CREATE TABLE #Map (i INT NOT NULL PRIMARY KEY, len INT NOT NULL)
INSERT INTO #Map VALUES (1, 1)
SET @counter = 2
WHILE 1 = 1
BEGIN
-- Collatz calc
-- Wouldn't it be nice if you could define a temporary subroutine for this kind of thing?
SET @i = @counter
SET @len = 1
WHILE 1 = 1
BEGIN
-- Search the map. If we find a result, populate the map and go to the next number.
SET @templen = NULL
SELECT @templen = len
FROM #Map
WHERE i = @i
IF @templen IS NOT NULL
BEGIN
INSERT INTO #Map VALUES (@counter, @len + @templen - 1)
BREAK
END -- In map
-- Collatz calculation
IF @i % 2 = 0
BEGIN
SET @i = @i / 2
SET @len = @len + 1
END -- Even
ELSE BEGIN
SET @i = 3 * @i + 1
SET @len = @len + 1
END -- Odd
END -- Collatz
SET @counter = @counter + 1
IF @counter >= 1000000
BREAK
END -- Each number in range
SELECT TOP (1) i
FROM #Map
ORDER BY len DESC
DROP TABLE #Map