Extremely poor performance on sum reduce #509

Open
lucdem opened this issue Feb 19, 2024 · 1 comment
Comments


lucdem commented Feb 19, 2024

Calling sum(axis: 0) on a large 3-dimensional array (500 x 500 x 500, float32) takes about 11 s on my machine. That is far slower than NumPy (around 75 ms) or even a naive C# implementation (around 200 ms).

C# Code:

using NumSharp;
using System.Diagnostics;
using System.Linq;

namespace NumSharpTest;

internal class Program
{
    static void Main(string[] args)
    {
        Console.WriteLine("NumSharp");
        NumSharp();
        Console.WriteLine("#######################");
        Console.WriteLine("Naive");
        Naive();
    }

    static void NumSharp()
    {
        var randArr = np.random.uniform(0, 1, [500, 500, 500]).astype(np.float32);

        var stopwatch = new Stopwatch();
        stopwatch.Start();
        var linesSum = randArr.sum(axis: 0);
        stopwatch.Stop();
        Console.WriteLine($"Elapsed: {stopwatch.ElapsedMilliseconds} ms");
    }

    static void Naive()
    {
        var random = new Random();
        var shape = new int[] { 500, 500, 500 };
        var randArr = new float[shape[0] * shape[1] * shape[2]];
        for (int i = 0; i < randArr.Length; i++) { randArr[i] = (float)random.NextDouble(); }

        var stopwatch = new Stopwatch();
        stopwatch.Start();
        var sum = new float[shape[1] * shape[2]];
        for (int i = 0; i < shape[0]; i++)
        {
            for (int j = 0; j < shape[1]; j++)
            {
                for (int k = 0; k < shape[2]; k++)
                {
                    sum[shape[2] * j + k] += randArr[shape[1] * shape[2] * i + shape[2] * j + k];
                }
            }
        }
        stopwatch.Stop();
        Console.WriteLine($"Elapsed: {stopwatch.ElapsedMilliseconds} ms");
    }
}


Python Code:

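import numpy as np
from time import perf_counter

# Note: np.random.normal returns float64, whereas the C# version uses float32.
rand_arr = np.random.normal(1, 0.5, (500, 500, 500))

# perf_counter is a monotonic, high-resolution clock suited to timing short runs.
start = perf_counter()
lines_sum = rand_arr.sum(axis=0)  # renamed from `sum` to avoid shadowing the builtin
end = perf_counter()

print(f'Elapsed: {(end - start) * 1000} ms')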

vkribo commented Mar 27, 2024

I profiled it, and the problem seems to be in the GetOffset method of the Shape class. That method creates a list called coords millions of times. I tried reimplementing it with a stackalloc'd array instead, and the run time dropped from 6287 ms to 2315 ms.
Maybe someone more familiar with the code could look into it.
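
To make the hotspot concrete, here is a rough pure-Python sketch of the general idea. It is not NumSharp's actual code. It assumes a C-contiguous (row-major) array, and `strides_for`, `get_offset`, `sum_axis0_per_element` and `sum_axis0_by_slab` are hypothetical names made up for this illustration. The first reduction resolves every element through a coordinate-to-offset lookup, which is presumably the kind of per-element work the profile points at. The second walks the flat buffer one axis-0 slab at a time, the same way the naive C# loop in the report does, so no per-element coordinate object is needed.

```python
from itertools import product


def strides_for(shape):
    """Row-major (C-order) strides, in elements, for the given shape."""
    strides = [1] * len(shape)
    for d in range(len(shape) - 2, -1, -1):
        strides[d] = strides[d + 1] * shape[d + 1]
    return strides


def get_offset(coords, strides):
    """Map N-dimensional coordinates to a flat offset (hypothetical helper)."""
    return sum(c * s for c, s in zip(coords, strides))


def sum_axis0_per_element(data, shape):
    """Axis-0 sum that builds a coordinate tuple and looks up an offset per element."""
    strides = strides_for(shape)
    out = [0.0] * (shape[1] * shape[2])
    for i, j, k in product(range(shape[0]), range(shape[1]), range(shape[2])):
        out[j * shape[2] + k] += data[get_offset((i, j, k), strides)]
    return out


def sum_axis0_by_slab(data, shape):
    """Axis-0 sum that walks the flat buffer slab by slab, with no coordinate lookup."""
    slab = shape[1] * shape[2]  # number of elements in one axis-0 slice
    out = [0.0] * slab
    for i in range(shape[0]):
        base = i * slab
        for p in range(slab):
            out[p] += data[base + p]
    return out


shape = (4, 3, 2)
data = [float(v) for v in range(shape[0] * shape[1] * shape[2])]

# Both strategies must agree; only the amount of per-element work differs.
assert sum_axis0_per_element(data, shape) == sum_axis0_by_slab(data, shape)
```

Whether NumSharp can take the slab route in general depends on how it handles non-contiguous views and broadcasting, which this sketch ignores.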
