GPU not detected -- RM detects a driver/library version mismatch. #4
Sorry, but I don't have much of an idea about what's going wrong. It seems likely that ollama simply doesn't support that gpu currently. That said, I did encounter an issue that seems superficially related, and I opened an issue with ollama. It's probably actually unrelated, but maybe it would be of interest? Ultimately, I would recommend opening an issue with ollama, since the maintainers there would hopefully know better about what's going wrong (even if it is my nix package that's actually at fault).
Would you be willing to open an issue about this with more detail, so I can try to fix it? Does it not build, or does it build but not detect any gpu?
I'll open an issue with ollama then. Thanks for your thoughts. I'll try and build the new version again and get back to you in a separate issue.
Just to note @abysssol, I tried your most up-to-date version again. I realised what the issue was: I was having nixpkgs follow the unstable branch of nixpkgs (which, at the moment, you have pointing to your separate repository before the PR). It's still building at the moment, but I imagine there won't be any issues now! Will let you know if it fails any further.
Sorry to re-open this issue. I noticed in your issue that you set the
This shows that the error is specifically coming from
Investigating further, the drivers I have in the

This is probably an issue that will be solved once everything ends up in nixpkgs, but I thought I'd try and dive into the rabbit hole and see if I was correct. Unfortunately I've hit one too many stumbling blocks and it's getting late where I am. Here's where I am at the moment. So it seems my issue COULD be coming from a mismatch, in that my system is on nixpkgs-stable and yours is currently based on nixpkgs-master. What's weird, though, is that I've never had this issue with other deep learning libraries in the past, where I base my flakes on nixpkgs-unstable. I've tried overriding a bit but keep getting stuck (pkgs referring to nixpkgs-stable):
Ends with:
And it's certainly right - there is no libcuda in that cuda-toolkit directory, so I'm guessing there has been a change in how cuda organises its libraries. If you have any thoughts on how to go about testing this I'd appreciate it - but no worries if not. Ultimately I seem to get stuck diving into rabbit holes like this ;)
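One possible explanation, though it's an assumption rather than something confirmed above: in nixpkgs the real `libcuda.so` ships with the kernel driver package (`nvidia_x11`), while `cudatoolkit` only carries a stub, so an override that swaps the toolkit alone would still leave a missing or mismatched `libcuda`. A minimal sketch of pointing the cuda variant at the system driver instead, assuming a flake input named `ollama` that exposes `packages.${system}.cuda` (as used later in this thread) and a NixOS `config` argument in scope:

```nix
# Sketch under the assumptions above, not a verified fix.
# The libcuda that ollama loads must come from the same driver derivation
# as the running kernel module, or the versions will mismatch.
ollamaCuda = ollama.packages.${system}.cuda.override {
  linuxPackages.nvidia_x11 = config.hardware.nvidia.package;
};
```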
Don't be; this is exactly what reopening issues is meant for. I'm actually rather excited to see that this problem may have a solution after all.
Sorry for not getting to this today; I've been working on getting ollama 0.1.24 merged into upstream nixpkgs.
My hope is that once ollama is available from upstream nixpkgs, all library and driver versions should match, so hopefully your gpu would work then. In the meantime, you could add the flake input like this:

```nix
ollama = {
  url = "github:abysssol/ollama-flake";
  inputs.nixpkgs.follows = "nixpkgs";
};
```

Then, try overriding `linuxPackages`:

```nix
let
  ollama = ollama-abysssol.cuda.override {
    # use this if you run the zen kernel; otherwise use the relevant kernel packages,
    # or omit the override entirely if you use the default kernel
    linuxPackages = pkgs.linuxPackages_zen;
  };
in
```

Maybe that could make a difference? It looks like you already did something similar, so maybe not ... I don't know.
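The snippet above stops at `in`; as a rough sketch (not part of the original suggestion), the overridden package could then simply be exposed through `environment.systemPackages` of a NixOS configuration:

```nix
let
  ollama = ollama-abysssol.cuda.override {
    # match whatever kernel you actually run
    linuxPackages = pkgs.linuxPackages_zen;
  };
in
{
  environment.systemPackages = [ ollama ];
}
```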
I created a new branch, `cuda-testing`. You can try it by pointing the flake input at that branch:

```nix
ollama = {
  url = "github:abysssol/ollama-flake/cuda-testing";
  inputs.nixpkgs.follows = "nixpkgs";
};
```

Unfortunately, I have thus far been unable to find any more information than what you already did.
Apologies for not responding - I have a number of reports/papers I'm working on at the moment! I just had a chance to try. It appears the patches fail - I'm going into a meeting now so I cannot debug, but I can leave you with a log for now.
No worries. It's good to prioritize things that actually matter to you. I'll do the same.
The failure was an oversight of mine: I left in an argument to
This is my problem too. I'm redefining nvidia_x11 in my nixos config like this:

```nix
hardware.nvidia.package = pkgs.linuxPackages_cachyos.nvidia_x11.overrideAttrs (s: rec {
  version = "550.40.07";
  name = (builtins.parseDrvName s.name).name + "-" + version;
  src = pkgs.fetchurl {
    url = "https://download.nvidia.com/XFree86/Linux-x86_64/${version}/NVIDIA-Linux-x86_64-${version}.run";
    sha256 = "298936c727b7eefed95bb87eb8d24cfeef1f35fecac864d98e2694d37749a4ad";
  };
});
```

That driver version is different from the one in nixpkgs itself, so ollama fails because it is built against 545, the default version in nixpkgs. How can I change the version to this one in the input definition of ollama?
I'm not sure if this will work, but I think you can just override nvidia_x11 with your custom driver. Try it and tell me if it works.

```nix
{ pkgs, lib, config, ollama }: # add `config` if it's not already an argument
let
  system = "x86_64-linux";
  ollamaCuda = (ollama.packages.${system}.cuda.override {
    linuxPackages.nvidia_x11 = config.hardware.nvidia.package;
  });
in
{
  # if you're using the service in nixos-unstable
  services.ollama.package = ollamaCuda;

  # otherwise, put it in system packages
  environment.systemPackages = [
    ollamaCuda
  ];
}
```

And make sure the flake input is declared:

```nix
inputs = {
  ollama.url = "github:abysssol/ollama-flake";
};
```
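One extra detail worth restating (an assumption on my part, echoing the earlier input snippet rather than new advice): since the whole issue is a driver/library version mismatch, it likely also helps to have the ollama flake reuse your system's nixpkgs so both are built from the same package set:

```nix
inputs = {
  ollama = {
    url = "github:abysssol/ollama-flake";
    # build ollama against the same nixpkgs as the rest of the system
    inputs.nixpkgs.follows = "nixpkgs";
  };
};
```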
Hmm, I think that is the way, but it seems that stdenv is not being taken from cuda
Working!!!!!!!!!!! with @abysssol's suggestion, only I did not add the
I'm using a full flake system, so my OS config and my home-manager config live in my own repo. So, in the end: in the main flake.nix,

```nix
ollama = {
  url = "github:abysssol/ollama-flake";
  inputs.utils.follows = "flake-utils";
};
```

Later, declare an overlay for this ollama:

```nix
overlay-ia = final: prev: {
  ia = {
    ollama = ollama.packages.${system}.cuda.override {
      linuxPackages.nvidia_x11 = pkgs.linuxPackages_cachyos.nvidia_x11.overrideAttrs (s: rec {
        version = "550.40.07";
        name = (builtins.parseDrvName s.name).name + "-" + version;
        src = pkgs.fetchurl {
          url = "https://download.nvidia.com/XFree86/Linux-x86_64/${version}/NVIDIA-Linux-x86_64-${version}.run";
          sha256 = "298936c727b7eefed95bb87eb8d24cfeef1f35fecac864d98e2694d37749a4ad";
        };
      });
    };
  };
};
```

Of course, this nvidia package must be the same one declared in my OS config:

```nix
hardware.nvidia.package = pkgs.linuxPackages_cachyos.nvidia_x11.overrideAttrs (s: rec {
  version = "550.40.07";
  name = (builtins.parseDrvName s.name).name + "-" + version;
  src = pkgs.fetchurl {
    url = "https://download.nvidia.com/XFree86/Linux-x86_64/${version}/NVIDIA-Linux-x86_64-${version}.run";
    sha256 = "298936c727b7eefed95bb87eb8d24cfeef1f35fecac864d98e2694d37749a4ad";
  };
});
```

Then, add this overlay to my nixpkgs definition and later use it in the home-manager package definitions. And wallaaaaa:
thanks @abysssol
I'm glad to hear you got it working. Is there a reason why you're duplicating the definition of your custom driver instead of using a let binding?
Nonono, no reason, heheheh; that was just me testing. Now I'm using a common let in the main flake to define the desired nvidia driver.
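For reference, a minimal sketch of that deduplication, assuming the driver is defined once in a let binding and reused by both the overlay and the NixOS option (the name `nvidiaPackage` is a placeholder, and `pkgs`, `ollama`, and `system` are assumed to be in scope):

```nix
let
  # define the custom driver once, reuse it everywhere the version matters
  nvidiaPackage = pkgs.linuxPackages_cachyos.nvidia_x11.overrideAttrs (s: rec {
    version = "550.40.07";
    name = (builtins.parseDrvName s.name).name + "-" + version;
    src = pkgs.fetchurl {
      url = "https://download.nvidia.com/XFree86/Linux-x86_64/${version}/NVIDIA-Linux-x86_64-${version}.run";
      sha256 = "298936c727b7eefed95bb87eb8d24cfeef1f35fecac864d98e2694d37749a4ad";
    };
  });
in
{
  hardware.nvidia.package = nvidiaPackage;

  nixpkgs.overlays = [
    (final: prev: {
      ia.ollama = ollama.packages.${system}.cuda.override {
        linuxPackages.nvidia_x11 = nvidiaPackage;
      };
    })
  ];
}
```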
Hi - I just wondered if you had some thoughts. I have a machine with some NVIDIA 2080 Supers in it that, for some reason, doesn't detect the GPU and launches in CPU-only mode. Happy to go over to Ollama directly if you're not sure. However, I thought you might have come across this, so it seemed worth asking here first. Thanks for looking at this either way.
I'm still working from the flake from prior to your changes that wrap around your nixpkgs fork (as of this moment, that doesn't work for me). I tried using the gpu/cuda package. I've used this on another machine and it works flawlessly (thanks for your hard work on this!).
I noticed I have this here:
To clarify, I'm able to run pytorch with CUDA on a GPU from within a flake, so I believe the system is set up correctly. Again, I'm happy to open an issue with Ollama if you don't believe you can help.
Full log of opening `ollama serve`.