Post date: Jul 26, 2015 7:8:0 PM
"How many different combinations of two or more positive integers between 1 and 99 will sum up to 100?"
This is a simply stated problem that turns into a giant problem in possible combinations. It would be easy except for the need to remove duplicate values ("different combinations"), which takes up almost all the time generating the answer. It seems to be a specific case of the "Subset sum problem," which is NP-complete and more than I can do in SQL server. But after I spent the effort doing the exponential time solution by tracking the full details of every combination, I found that Euler (the person) had solved this particular case of the subset sum problem with a simple recursive series, the partition function P.
P = Sigma (k=0->n) [ -1 ** (k + 1) * (P(n - k * (3 * k - 1) / 2) + P(n - k * (3 * k + 1) / 2)) ]
I had actually searched for this fast numeric solution before doing the slow combinatorial solution, but Google didn't bring up any pages referencing it. It turned out to be described on a page that doesn't reference combinatorial math.
When I implemented it, I found that the function always produced incorrect values. Partly it was because there were two special cases not mentioned in the article (P(0) = 1 and 0 for n < 0), but in addition it seems to be off by 1, because it thinks that there is one solution that adds up to 1. It seems to consider the case where the number exists by itself, but that violates the rule of two or more integers. So it is necessary to subtract 1 from the value at the end. I found this very irritating. But it's really no big deal.
-- We can make this faster by storing the results in a table. It is faster to look them up than to recursively calculate them.
-- In addition, this lets us get around the limit of recursion in SQL server. If we look up the result first, we can avoid calling the recursive procedure.
CREATE TABLE #Cache (i INT PRIMARY KEY NOT NULL, a BIGINT NOT NULL)
INSERT INTO #Cache VALUES (0, 1) -- P(0) = 1
GO
-- This has to be a procedure because it updates a table
CREATE PROCEDURE dbo.PartitionFunction
@n numeric(38,10),
@return bigint = NULL OUTPUT
AS
BEGIN
SET @return = 0
IF @n < 0 RETURN
DECLARE @k int = 1, @temp0 numeric(38,10), @temp1 numeric(38,10), @temp2 numeric(38,0)
-- It is only necessary to check up to SQRT(n), because after that, the results are negative, which will return 0.
WHILE @k < SQRT(@n) + 1
BEGIN
SELECT @temp0 = @n - @k * (3 * @k - 1) / 2, @temp1 = 0, @temp2 = 0
SELECT @temp1 = a
FROM #Cache
WHERE i = @temp0
IF @temp1 = 0 AND @temp0 >= 0
EXEC dbo.PartitionFunction @temp0, @temp1 OUTPUT
SET @temp0 = @n - @k * (3 * @k + 1) / 2
SELECT @temp2 = a
FROM #Cache
WHERE i = @temp0
IF @temp2 = 0 AND @temp0 >= 0
EXEC dbo.PartitionFunction @temp0, @temp2 OUTPUT
IF @k % 2 = 1
SET @return = @return + @temp1 + @temp2
ELSE
SET @return = @return - @temp1 - @temp2
SET @k = @k + 1
END
IF @return = 0 RETURN
INSERT INTO #Cache VALUES (@n, @return)
END
GO
-- Because of the recursion limit of 32, we need to build up our cache table from the bottom up rather than the top down.
-- If the cache table is seeded, then it won't be necessary to call another recursive step in many cases.
EXEC dbo.PartitionFunction 30
EXEC dbo.PartitionFunction 60
EXEC dbo.PartitionFunction 90
-- Now the answer
DECLARE @answer int
EXEC dbo.PartitionFunction 100, @answer OUTPUT
SELECT @answer - 1
GO
DROP TABLE #Cache
GO
DROP PROCEDURE dbo.PartitionFunction
GO