Optimize BigInteger.Multiply by Toom-Cook multiplication #112876

Open
wants to merge 2 commits into
base: main

kzrnm
Contributor

@kzrnm kzrnm commented Feb 24, 2025

https://en.wikipedia.org/wiki/Toom%E2%80%93Cook_multiplication

I updated BigInteger multiplication to use the Toom-3 algorithm.

The current Karatsuba algorithm has a time complexity of $O(n^{\log_2{3}}) \simeq O(n^{1.58})$. Toom-3 is expected to improve this to $O(n^{\log_3{5}}) \simeq O(n^{1.46})$, resulting in better performance for large operands.
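
For reference, the two exponents follow from the usual divide-and-conquer recurrences. Karatsuba replaces one multiplication of size $n$ with three of size $n/2$, while Toom-3 replaces it with five of size $n/3$. In both cases the splitting, evaluation and interpolation steps are assumed to cost $O(n)$. This is the textbook derivation, not something measured in this PR:

$$
\begin{aligned}
T_{\mathrm{Karatsuba}}(n) &= 3\,T(n/2) + O(n) &&\Rightarrow\; T(n) = O\!\left(n^{\log_2 3}\right), &\log_2 3 &= \frac{\ln 3}{\ln 2} \approx 1.585,\\
T_{\mathrm{Toom\text{-}3}}(n) &= 5\,T(n/3) + O(n) &&\Rightarrow\; T(n) = O\!\left(n^{\log_3 5}\right), &\log_3 5 &= \frac{\ln 5}{\ln 3} \approx 1.465.
\end{aligned}
$$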

in other languages:

About the Implementation

  • Merged SquareThreshold and MultiplyKaratsubaThreshold.
    • Since both had the same value, this improves testability.
  • Added [MethodImpl(MethodImplOptions.AggressiveInlining)] to avoid stack consumption when determining the algorithm.
  • In some cases, the Toom-2.5 algorithm is used.
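
To illustrate the idea behind Toom-3, here is a minimal sketch written on top of the public BigInteger API. It is not the implementation in this PR, which is not shown here. The class name Toom3Sketch, the BaseCaseBits cut-off, and the choice of evaluation points (0, 1, -1, -2, infinity, with the interpolation sequence given in the Wikipedia article linked above) are assumptions made for the example.

```csharp
using System;
using System.Numerics;

public static class Toom3Sketch
{
    // Hypothetical cut-off (in bits); at or below it the built-in operator is used.
    private const long BaseCaseBits = 256 * 32;

    public static BigInteger Multiply(BigInteger a, BigInteger b)
    {
        // Work on magnitudes so that masking and shifting split the operands cleanly.
        int sign = a.Sign * b.Sign;
        if (sign == 0) return BigInteger.Zero;
        BigInteger product = MultiplyMagnitudes(BigInteger.Abs(a), BigInteger.Abs(b));
        return sign < 0 ? -product : product;
    }

    private static BigInteger MultiplyMagnitudes(BigInteger a, BigInteger b)
    {
        long bits = Math.Max(a.GetBitLength(), b.GetBitLength());
        if (bits <= BaseCaseBits) return a * b;

        // Split each operand into three k-bit pieces: x = x0 + x1*2^k + x2*2^(2k).
        int k = (int)((bits + 2) / 3);
        BigInteger mask = (BigInteger.One << k) - 1;
        BigInteger a0 = a & mask, a1 = (a >> k) & mask, a2 = a >> (2 * k);
        BigInteger b0 = b & mask, b1 = (b >> k) & mask, b2 = b >> (2 * k);

        // Evaluate both polynomials at 0, 1, -1, -2 and infinity, then multiply
        // pointwise: five recursive products instead of nine.
        BigInteger r0 = Multiply(a0, b0);
        BigInteger r1 = Multiply(a0 + a1 + a2, b0 + b1 + b2);
        BigInteger rm1 = Multiply(a0 - a1 + a2, b0 - b1 + b2);
        BigInteger rm2 = Multiply(a0 - 2 * a1 + 4 * a2, b0 - 2 * b1 + 4 * b2);
        BigInteger rInf = Multiply(a2, b2);

        // Interpolate the product coefficients c1..c3 (c0 = r0, c4 = rInf).
        // Every division below is exact.
        BigInteger c3 = (rm2 - r1) / 3;
        BigInteger c1 = (r1 - rm1) / 2;
        BigInteger c2 = rm1 - r0;
        c3 = (c2 - c3) / 2 + 2 * rInf;
        c2 = c2 + c1 - rInf;
        c1 -= c3;

        // Recompose: sum of c_i * 2^(i*k).
        return r0 + (c1 << k) + (c2 << (2 * k)) + (c3 << (3 * k)) + (rInf << (4 * k));
    }
}
```

The base case in this sketch delegates to the `*` operator, so it only shows the structure of the recursion and is not expected to be faster than the runtime's own multiplication.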

Why is MultiplyToom3Threshold 256?

Based on the benchmark results, I decided to set MultiplyToom3Threshold to 256.
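
As a rough sketch of how such a threshold could drive algorithm selection (not the code in this PR): the enum MultiplyAlgorithm, the method Choose, the Karatsuba threshold value of 32, the assumption that lengths are counted in 32-bit limbs, and the condition used for Toom-2.5 are all hypothetical and only serve to make the example concrete.

```csharp
using System;

public enum MultiplyAlgorithm { Schoolbook, Karatsuba, Toom25, Toom3 }

public static class MultiplyDispatchSketch
{
    // Assumed value, for illustration only; the PR text does not state it.
    public const int MultiplyKaratsubaThreshold = 32;

    // Value chosen in this PR based on the benchmark results.
    public const int MultiplyToom3Threshold = 256;

    // Lengths are assumed to be counted in 32-bit limbs.
    public static MultiplyAlgorithm Choose(int leftLength, int rightLength)
    {
        int longer = Math.Max(leftLength, rightLength);
        int shorter = Math.Min(leftLength, rightLength);

        if (shorter < MultiplyKaratsubaThreshold) return MultiplyAlgorithm.Schoolbook;
        if (shorter < MultiplyToom3Threshold) return MultiplyAlgorithm.Karatsuba;

        // Assumption: if the shorter operand fits into two of the three chunks
        // of the longer one, a 3x2 split (Toom-2.5) is used instead of 3x3.
        int chunk = (longer + 2) / 3;
        return shorter <= 2 * chunk ? MultiplyAlgorithm.Toom25 : MultiplyAlgorithm.Toom3;
    }
}
```

Under these assumptions, two 1000-byte operands (250 limbs each) would still take the Karatsuba path, while two 10000-byte operands (2500 limbs each) would take the Toom-3 path.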

Benchmark

When the number of digits is small, the overhead of algorithm selection is relatively large, leading to a slight regression: for example, a computation that previously took 19 μs now takes 20 μs.

However, as the number of digits increases, performance improves; for instance, a multiplication that used to take 750 μs is now completed in 690 μs.
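
As a rough cross-check against the expected complexity, the empirical growth exponent can be estimated from two rows of the results table that differ by a factor of 10 in operand size. The estimate below uses the 0100000-0099999 and 1000000-0999999 rows for main and pr0256. With a ShortRun job of only three iterations it is indicative at best:

$$
\begin{aligned}
\text{main:}\quad & \log_{10}\frac{1{,}194{,}190{,}333.3\ \text{ns}}{29{,}509{,}821.9\ \text{ns}} \approx \log_{10} 40.5 \approx 1.61 && (\text{Karatsuba: } \log_2 3 \approx 1.58)\\
\text{pr0256:}\quad & \log_{10}\frac{617{,}586{,}733.3\ \text{ns}}{21{,}727{,}724.0\ \text{ns}} \approx \log_{10} 28.4 \approx 1.45 && (\text{Toom-3: } \log_3 5 \approx 1.46)
\end{aligned}
$$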

Code
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System;
using System.Collections.Generic;
using System.Numerics;
using System.Runtime.InteropServices;

[MemoryDiagnoser(false)]
[HideColumns("Job", "Error", "StdDev", "Median", "RatioSD")]
public class MultiplySomeSizeTests
{
    public IEnumerable<object> GetMultiplyArgs()
    {
        var rnd = new Random(227);
        var bytes = new byte[1000000];
        var lengths = new int[] { 100, 500, 1000, 10000, 100000, 1000000 };
        for (int i = lengths.Length - 1; i >= 0; i--)
        {
            var largeLength = lengths[i];
            var large = Make(largeLength);
            foreach (var p in new double[] { 0.999999999999999, 0.75, 0.5, 0.25 })
            {
                var smallLength = (int)(p * lengths[i]);
                var small = Make(smallLength);
                yield return new Data($"{largeLength:D7}-{smallLength:D7}", large, small);
            }

            yield return new Data($"Square{largeLength:D7}", large, large);
        }
        BigInteger Make(int length)
        {
            var b = bytes.AsSpan().Slice(0, length);
            rnd.NextBytes(b);
            return new BigInteger(b, isUnsigned: true);
        }
    }

    public record Data(string Name, BigInteger Large, BigInteger Small)
    {
        public override string ToString() => Name;
    }

    [Benchmark]
    [ArgumentsSource(nameof(GetMultiplyArgs))]
    public BigInteger Multiply(Data data)
    {
        return data.Large * data.Small;
    }
}

BenchmarkDotNet v0.13.12, Windows 11 (10.0.26100.3194)
13th Gen Intel Core i5-13500, 1 CPU, 20 logical and 14 physical cores
.NET SDK 10.0.100-alpha.1.25077.2
  [Host]   : .NET 10.0.0 (10.0.25.7313), X64 RyuJIT AVX2
  ShortRun : .NET 10.0.0 (42.42.42.42424), X64 RyuJIT AVX2

Job=ShortRun  IterationCount=3  LaunchCount=1  
WarmupCount=3  

Method Toolchain data Mean Ratio Gen0 Gen1 Gen2 Allocated Alloc Ratio
Multiply \main\corerun.exe 0000100-0000025 132.2 ns 1.00 0.0119 - - 152 B 1.00
Multiply \pr0128\corerun.exe 0000100-0000025 132.6 ns 1.00 0.0119 - - 152 B 1.00
Multiply \pr0256\corerun.exe 0000100-0000025 134.3 ns 1.02 0.0119 - - 152 B 1.00
Multiply \pr0512\corerun.exe 0000100-0000025 133.3 ns 1.01 0.0119 - - 152 B 1.00
Multiply \pr1024\corerun.exe 0000100-0000025 133.2 ns 1.01 0.0119 - - 152 B 1.00
Multiply \main\corerun.exe 0000100-0000050 223.6 ns 1.00 0.0138 - - 176 B 1.00
Multiply \pr0128\corerun.exe 0000100-0000050 241.7 ns 1.08 0.0138 - - 176 B 1.00
Multiply \pr0256\corerun.exe 0000100-0000050 232.8 ns 1.04 0.0138 - - 176 B 1.00
Multiply \pr0512\corerun.exe 0000100-0000050 231.1 ns 1.03 0.0138 - - 176 B 1.00
Multiply \pr1024\corerun.exe 0000100-0000050 228.1 ns 1.02 0.0138 - - 176 B 1.00
Multiply \main\corerun.exe 0000100-0000075 322.2 ns 1.00 0.0157 - - 200 B 1.00
Multiply \pr0128\corerun.exe 0000100-0000075 322.2 ns 1.00 0.0157 - - 200 B 1.00
Multiply \pr0256\corerun.exe 0000100-0000075 336.5 ns 1.04 0.0157 - - 200 B 1.00
Multiply \pr0512\corerun.exe 0000100-0000075 323.3 ns 1.00 0.0157 - - 200 B 1.00
Multiply \pr1024\corerun.exe 0000100-0000075 324.6 ns 1.01 0.0157 - - 200 B 1.00
Multiply \main\corerun.exe 0000100-0000099 399.4 ns 1.00 0.0176 - - 224 B 1.00
Multiply \pr0128\corerun.exe 0000100-0000099 445.1 ns 1.11 0.0176 - - 224 B 1.00
Multiply \pr0256\corerun.exe 0000100-0000099 436.7 ns 1.09 0.0176 - - 224 B 1.00
Multiply \pr0512\corerun.exe 0000100-0000099 407.8 ns 1.02 0.0176 - - 224 B 1.00
Multiply \pr1024\corerun.exe 0000100-0000099 438.0 ns 1.10 0.0176 - - 224 B 1.00
Multiply \main\corerun.exe 0000500-0000125 2,540.3 ns 1.00 0.0496 - - 656 B 1.00
Multiply \pr0128\corerun.exe 0000500-0000125 2,650.3 ns 1.04 0.0496 - - 656 B 1.00
Multiply \pr0256\corerun.exe 0000500-0000125 2,621.6 ns 1.03 0.0496 - - 656 B 1.00
Multiply \pr0512\corerun.exe 0000500-0000125 2,648.0 ns 1.04 0.0496 - - 656 B 1.00
Multiply \pr1024\corerun.exe 0000500-0000125 2,644.0 ns 1.04 0.0496 - - 656 B 1.00
Multiply \main\corerun.exe 0000500-0000250 3,945.4 ns 1.00 0.0610 - - 776 B 1.00
Multiply \pr0128\corerun.exe 0000500-0000250 4,185.3 ns 1.06 0.0610 - - 776 B 1.00
Multiply \pr0256\corerun.exe 0000500-0000250 4,546.5 ns 1.15 0.0610 - - 776 B 1.00
Multiply \pr0512\corerun.exe 0000500-0000250 4,777.1 ns 1.21 0.0610 - - 776 B 1.00
Multiply \pr1024\corerun.exe 0000500-0000250 4,604.5 ns 1.17 0.0610 - - 776 B 1.00
Multiply \main\corerun.exe 0000500-0000375 5,344.4 ns 1.00 0.0687 - - 904 B 1.00
Multiply \pr0128\corerun.exe 0000500-0000375 6,243.9 ns 1.17 0.0687 - - 904 B 1.00
Multiply \pr0256\corerun.exe 0000500-0000375 6,226.8 ns 1.17 0.0687 - - 904 B 1.00
Multiply \pr0512\corerun.exe 0000500-0000375 6,180.4 ns 1.16 0.0687 - - 904 B 1.00
Multiply \pr1024\corerun.exe 0000500-0000375 5,654.4 ns 1.06 0.0687 - - 904 B 1.00
Multiply \main\corerun.exe 0000500-0000499 6,060.4 ns 1.00 0.0763 - - 1024 B 1.00
Multiply \pr0128\corerun.exe 0000500-0000499 7,032.1 ns 1.16 0.0763 - - 1024 B 1.00
Multiply \pr0256\corerun.exe 0000500-0000499 6,844.1 ns 1.13 0.0763 - - 1024 B 1.00
Multiply \pr0512\corerun.exe 0000500-0000499 6,821.3 ns 1.13 0.0763 - - 1024 B 1.00
Multiply \pr1024\corerun.exe 0000500-0000499 6,821.3 ns 1.13 0.0763 - - 1024 B 1.00
Multiply \main\corerun.exe 0001000-0000250 7,894.7 ns 1.00 0.0916 - - 1280 B 1.00
Multiply \pr0128\corerun.exe 0001000-0000250 9,196.4 ns 1.16 0.0916 - - 1280 B 1.00
Multiply \pr0256\corerun.exe 0001000-0000250 9,301.5 ns 1.18 0.0916 - - 1280 B 1.00
Multiply \pr0512\corerun.exe 0001000-0000250 9,215.9 ns 1.17 0.0916 - - 1280 B 1.00
Multiply \pr1024\corerun.exe 0001000-0000250 9,514.8 ns 1.21 0.0916 - - 1280 B 1.00
Multiply \main\corerun.exe 0001000-0000500 12,588.0 ns 1.00 0.1068 - - 1528 B 1.00
Multiply \pr0128\corerun.exe 0001000-0000500 13,960.0 ns 1.11 0.1068 - - 1528 B 1.00
Multiply \pr0256\corerun.exe 0001000-0000500 13,931.5 ns 1.11 0.1068 - - 1528 B 1.00
Multiply \pr0512\corerun.exe 0001000-0000500 14,277.4 ns 1.13 0.1068 - - 1528 B 1.00
Multiply \pr1024\corerun.exe 0001000-0000500 13,925.9 ns 1.11 0.1068 - - 1528 B 1.00
Multiply \main\corerun.exe 0001000-0000750 16,363.0 ns 1.00 0.1221 - - 1776 B 1.00
Multiply \pr0128\corerun.exe 0001000-0000750 17,010.7 ns 1.04 0.1221 - - 1776 B 1.00
Multiply \pr0256\corerun.exe 0001000-0000750 17,728.1 ns 1.08 0.1221 - - 1776 B 1.00
Multiply \pr0512\corerun.exe 0001000-0000750 17,987.1 ns 1.10 0.1221 - - 1776 B 1.00
Multiply \pr1024\corerun.exe 0001000-0000750 17,786.7 ns 1.09 0.1221 - - 1776 B 1.00
Multiply \main\corerun.exe 0001000-0000999 19,115.1 ns 1.00 0.1526 - - 2024 B 1.00
Multiply \pr0128\corerun.exe 0001000-0000999 19,084.0 ns 1.00 0.1526 - - 2024 B 1.00
Multiply \pr0256\corerun.exe 0001000-0000999 19,898.4 ns 1.04 0.1526 - - 2024 B 1.00
Multiply \pr0512\corerun.exe 0001000-0000999 20,120.3 ns 1.05 0.1526 - - 2024 B 1.00
Multiply \pr1024\corerun.exe 0001000-0000999 20,073.3 ns 1.05 0.1526 - - 2024 B 1.00
Multiply \main\corerun.exe 0010000-0002500 331,491.7 ns 1.00 0.9766 - - 12528 B 1.00
Multiply \pr0128\corerun.exe 0010000-0002500 345,360.3 ns 1.04 0.9766 - - 12528 B 1.00
Multiply \pr0256\corerun.exe 0010000-0002500 331,238.7 ns 1.00 0.9766 - - 12528 B 1.00
Multiply \pr0512\corerun.exe 0010000-0002500 328,670.5 ns 0.99 0.9766 - - 12528 B 1.00
Multiply \pr1024\corerun.exe 0010000-0002500 346,899.5 ns 1.05 0.9766 - - 12528 B 1.00
Multiply \main\corerun.exe 0010000-0005000 505,027.1 ns 1.00 0.9766 - - 15025 B 1.00
Multiply \pr0128\corerun.exe 0010000-0005000 476,146.5 ns 0.94 0.9766 - - 15024 B 1.00
Multiply \pr0256\corerun.exe 0010000-0005000 450,121.5 ns 0.89 0.9766 - - 15024 B 1.00
Multiply \pr0512\corerun.exe 0010000-0005000 486,532.8 ns 0.96 0.9766 - - 15024 B 1.00
Multiply \pr1024\corerun.exe 0010000-0005000 490,601.3 ns 0.97 0.9766 - - 15024 B 1.00
Multiply \main\corerun.exe 0010000-0007500 672,133.9 ns 1.00 0.9766 - - 17529 B 1.00
Multiply \pr0128\corerun.exe 0010000-0007500 639,168.1 ns 0.95 0.9766 - - 17529 B 1.00
Multiply \pr0256\corerun.exe 0010000-0007500 593,343.2 ns 0.88 0.9766 - - 17529 B 1.00
Multiply \pr0512\corerun.exe 0010000-0007500 588,610.1 ns 0.88 0.9766 - - 17528 B 1.00
Multiply \pr1024\corerun.exe 0010000-0007500 655,762.0 ns 0.98 0.9766 - - 17528 B 1.00
Multiply \main\corerun.exe 0010000-0009999 752,860.1 ns 1.00 0.9766 - - 20025 B 1.00
Multiply \pr0128\corerun.exe 0010000-0009999 655,242.7 ns 0.87 0.9766 - - 20025 B 1.00
Multiply \pr0256\corerun.exe 0010000-0009999 690,172.2 ns 0.92 0.9766 - - 20025 B 1.00
Multiply \pr0512\corerun.exe 0010000-0009999 662,638.8 ns 0.88 0.9766 - - 20024 B 1.00
Multiply \pr1024\corerun.exe 0010000-0009999 739,939.7 ns 0.98 0.9766 - - 20025 B 1.00
Multiply \main\corerun.exe 0100000-0025000 13,501,335.9 ns 1.00 31.2500 31.2500 31.2500 125052 B 1.00
Multiply \pr0128\corerun.exe 0100000-0025000 10,430,180.2 ns 0.77 31.2500 31.2500 31.2500 125056 B 1.00
Multiply \pr0256\corerun.exe 0100000-0025000 10,917,271.4 ns 0.81 31.2500 31.2500 31.2500 125052 B 1.00
Multiply \pr0512\corerun.exe 0100000-0025000 10,978,839.6 ns 0.81 31.2500 31.2500 31.2500 125052 B 1.00
Multiply \pr1024\corerun.exe 0100000-0025000 11,131,788.0 ns 0.82 31.2500 31.2500 31.2500 125052 B 1.00
Multiply \main\corerun.exe 0100000-0050000 19,529,617.7 ns 1.00 31.2500 31.2500 31.2500 150068 B 1.00
Multiply \pr0128\corerun.exe 0100000-0050000 15,097,175.5 ns 0.77 46.8750 46.8750 46.8750 150062 B 1.00
Multiply \pr0256\corerun.exe 0100000-0050000 14,365,115.1 ns 0.74 46.8750 46.8750 46.8750 150062 B 1.00
Multiply \pr0512\corerun.exe 0100000-0050000 15,798,947.9 ns 0.81 31.2500 31.2500 31.2500 150068 B 1.00
Multiply \pr1024\corerun.exe 0100000-0050000 15,769,881.2 ns 0.81 31.2500 31.2500 31.2500 150068 B 1.00
Multiply \main\corerun.exe 0100000-0075000 26,245,361.5 ns 1.00 31.2500 31.2500 31.2500 175068 B 1.00
Multiply \pr0128\corerun.exe 0100000-0075000 20,914,966.7 ns 0.80 31.2500 31.2500 31.2500 175067 B 1.00
Multiply \pr0256\corerun.exe 0100000-0075000 18,830,658.3 ns 0.72 31.2500 31.2500 31.2500 175067 B 1.00
Multiply \pr0512\corerun.exe 0100000-0075000 18,699,003.1 ns 0.71 31.2500 31.2500 31.2500 175058 B 1.00
Multiply \pr1024\corerun.exe 0100000-0075000 21,151,182.3 ns 0.81 31.2500 31.2500 31.2500 175058 B 1.00
Multiply \main\corerun.exe 0100000-0099999 29,509,821.9 ns 1.00 31.2500 31.2500 31.2500 200068 B 1.00
Multiply \pr0128\corerun.exe 0100000-0099999 20,654,077.1 ns 0.70 31.2500 31.2500 31.2500 200067 B 1.00
Multiply \pr0256\corerun.exe 0100000-0099999 21,727,724.0 ns 0.74 31.2500 31.2500 31.2500 200058 B 1.00
Multiply \pr0512\corerun.exe 0100000-0099999 20,679,755.2 ns 0.70 31.2500 31.2500 31.2500 200058 B 1.00
Multiply \pr1024\corerun.exe 0100000-0099999 23,118,738.5 ns 0.78 31.2500 31.2500 31.2500 200067 B 1.00
Multiply \main\corerun.exe 1000000-0250000 509,747,866.7 ns 1.00 - - - 1250760 B 1.00
Multiply \pr0128\corerun.exe 1000000-0250000 321,428,916.7 ns 0.63 - - - 1250392 B 1.00
Multiply \pr0256\corerun.exe 1000000-0250000 314,692,916.7 ns 0.62 - - - 1250392 B 1.00
Multiply \pr0512\corerun.exe 1000000-0250000 316,530,200.0 ns 0.62 - - - 1250472 B 1.00
Multiply \pr1024\corerun.exe 1000000-0250000 354,507,166.7 ns 0.70 - - - 1250472 B 1.00
Multiply \main\corerun.exe 1000000-0500000 778,175,866.7 ns 1.00 - - - 1500472 B 1.00
Multiply \pr0128\corerun.exe 1000000-0500000 453,160,500.0 ns 0.58 - - - 1500472 B 1.00
Multiply \pr0256\corerun.exe 1000000-0500000 433,646,666.7 ns 0.56 - - - 1500760 B 1.00
Multiply \pr0512\corerun.exe 1000000-0500000 438,128,166.7 ns 0.56 - - - 1500472 B 1.00
Multiply \pr1024\corerun.exe 1000000-0500000 467,433,166.7 ns 0.60 - - - 1500760 B 1.00
Multiply \main\corerun.exe 1000000-0750000 1,016,370,266.7 ns 1.00 - - - 1750760 B 1.00
Multiply \pr0128\corerun.exe 1000000-0750000 605,555,000.0 ns 0.60 - - - 1750472 B 1.00
Multiply \pr0256\corerun.exe 1000000-0750000 562,233,733.3 ns 0.55 - - - 1750472 B 1.00
Multiply \pr0512\corerun.exe 1000000-0750000 553,515,400.0 ns 0.54 - - - 1750472 B 1.00
Multiply \pr1024\corerun.exe 1000000-0750000 564,205,566.7 ns 0.56 - - - 1750472 B 1.00
Multiply \main\corerun.exe 1000000-0999999 1,194,190,333.3 ns 1.00 - - - 2000472 B 1.00
Multiply \pr0128\corerun.exe 1000000-0999999 621,246,233.3 ns 0.52 - - - 2000760 B 1.00
Multiply \pr0256\corerun.exe 1000000-0999999 617,586,733.3 ns 0.52 - - - 2000472 B 1.00
Multiply \pr0512\corerun.exe 1000000-0999999 613,818,600.0 ns 0.51 - - - 2000760 B 1.00
Multiply \pr1024\corerun.exe 1000000-0999999 613,492,666.7 ns 0.51 - - - 2000088 B 1.00
Multiply \main\corerun.exe Square0000100 275.5 ns 1.00 0.0176 - - 224 B 1.00
Multiply \pr0128\corerun.exe Square0000100 275.3 ns 1.00 0.0176 - - 224 B 1.00
Multiply \pr0256\corerun.exe Square0000100 280.0 ns 1.02 0.0176 - - 224 B 1.00
Multiply \pr0512\corerun.exe Square0000100 277.2 ns 1.01 0.0176 - - 224 B 1.00
Multiply \pr1024\corerun.exe Square0000100 276.6 ns 1.00 0.0176 - - 224 B 1.00
Multiply \main\corerun.exe Square0000500 4,054.4 ns 1.00 0.0763 - - 1024 B 1.00
Multiply \pr0128\corerun.exe Square0000500 4,121.7 ns 1.02 0.0763 - - 1024 B 1.00
Multiply \pr0256\corerun.exe Square0000500 4,123.9 ns 1.02 0.0763 - - 1024 B 1.00
Multiply \pr0512\corerun.exe Square0000500 4,142.7 ns 1.02 0.0763 - - 1024 B 1.00
Multiply \pr1024\corerun.exe Square0000500 4,175.9 ns 1.03 0.0763 - - 1024 B 1.00
Multiply \main\corerun.exe Square0001000 12,664.4 ns 1.00 0.1526 - - 2024 B 1.00
Multiply \pr0128\corerun.exe Square0001000 12,840.2 ns 1.01 0.1526 - - 2024 B 1.00
Multiply \pr0256\corerun.exe Square0001000 13,198.4 ns 1.04 0.1526 - - 2024 B 1.00
Multiply \pr0512\corerun.exe Square0001000 13,049.2 ns 1.03 0.1526 - - 2024 B 1.00
Multiply \pr1024\corerun.exe Square0001000 13,170.3 ns 1.04 0.1526 - - 2024 B 1.00
Multiply \main\corerun.exe Square0010000 507,761.8 ns 1.00 0.9766 - - 20025 B 1.00
Multiply \pr0128\corerun.exe Square0010000 449,839.6 ns 0.89 1.4648 - - 20024 B 1.00
Multiply \pr0256\corerun.exe Square0010000 446,288.1 ns 0.88 1.4648 - - 20024 B 1.00
Multiply \pr0512\corerun.exe Square0010000 433,092.4 ns 0.85 1.4648 - - 20024 B 1.00
Multiply \pr1024\corerun.exe Square0010000 491,795.1 ns 0.97 0.9766 - - 20024 B 1.00
Multiply \main\corerun.exe Square0100000 20,070,908.3 ns 1.00 31.2500 31.2500 31.2500 200059 B 1.00
Multiply \pr0128\corerun.exe Square0100000 14,622,159.4 ns 0.73 46.8750 46.8750 46.8750 200066 B 1.00
Multiply \pr0256\corerun.exe Square0100000 14,373,196.9 ns 0.72 46.8750 46.8750 46.8750 200062 B 1.00
Multiply \pr0512\corerun.exe Square0100000 14,284,324.5 ns 0.71 46.8750 46.8750 46.8750 200066 B 1.00
Multiply \pr1024\corerun.exe Square0100000 15,955,562.5 ns 0.79 31.2500 31.2500 31.2500 200068 B 1.00
Multiply \main\corerun.exe Square1000000 780,524,800.0 ns 1.00 - - - 2000472 B 1.00
Multiply \pr0128\corerun.exe Square1000000 445,885,933.3 ns 0.57 - - - 2000472 B 1.00
Multiply \pr0256\corerun.exe Square1000000 440,426,600.0 ns 0.56 - - - 2000760 B 1.00
Multiply \pr0512\corerun.exe Square1000000 428,211,366.7 ns 0.55 - - - 2000472 B 1.00
Multiply \pr1024\corerun.exe Square1000000 430,626,800.0 ns 0.55 - - - 2000760 B 1.00

@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Feb 24, 2025
Contributor

Tagging subscribers to this area: @dotnet/area-system-numerics
See info in area-owners.md if you want to be subscribed.

Labels
area-System.Numerics community-contribution Indicates that the PR has been added by a community member