
add microscopy tools #568

Open
stebo85 opened this issue Jan 18, 2024 · 23 comments

@stebo85 stebo85 moved this from New to External volunteer needed in NeuroDesk Jan 18, 2024

vennand commented Feb 9, 2024

Hi @stebo85 ,
I work at Sydney Microscopy & Microanalysis, at the University of Sydney. I suspect this was requested by me via Ryan Sullivan.
Can I be the external volunteer to implement this? I would start with implementing RELION.

I read that we need to submit an issue to request access to the interactive tool - is that right? Or is it better to use the manual method?
https://www.neurodesk.org/developers/new_tools/interactive_build/

Thanks


stebo85 commented Feb 9, 2024

Dear @vennand,

It would be wonderful to have your help on this!!!

I just added you to the access group, so you can login to https://labtokyo.neurodesk.org/ - Try out the interactive build process and let us know if this works for you. Once you are more familiar with how we build containers, the manual process might be more efficient.

We are still working on improving how people can contribute new containers to our repository - so it would be wonderful to hear where things don't make sense yet and how the process could be improved!

Thank you so much
Steffen


vennand commented Feb 20, 2024

Hi @stebo85,

Sorry for taking so long to reply. I tried the interactive build process, but it didn't let me choose a GPU node, so I went with the manual process.

I'm a bit stuck at the point of building relion, because I need to choose a CUDA toolkit version, and I also need to specify the GPU compute capabilities (assuming it's NVIDIA).

Is there a way to get those dynamically from the container before building? Because this would depend on the hardware of the system.

Thanks,
André


stebo85 commented Feb 20, 2024

Dear @vennand - GPUs are tricky. The way we have been successful so far is to package the latest CUDA toolkit version that works with the software into the container. At runtime, the GPU driver gets mounted into the container, and so far we have seen that various CUDA versions work well across different Nvidia GPU driver versions. AMD/Intel GPUs would likely need a different container; we haven't tested that yet.

Here is an example container where we install CUDA through conda:

conda_install='pytorch=1.6.0 torchvision=0.7.0 cudatoolkit=10.2 -c pytorch' \

This container works on various Nvidia GPUs we had access to
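A minimal sketch of the pattern described above (an illustration, not part of any NeuroDesk recipe): the CUDA toolkit is baked into the image, so at runtime the container only needs to check whether the host's NVIDIA driver is visible.

```shell
# Illustrative only: the CUDA toolkit ships inside the image, so at runtime
# we only probe for the host's NVIDIA driver, which gets mounted in.
if command -v nvidia-smi >/dev/null 2>&1; then
    driver_status="driver $(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)"
else
    driver_status="no NVIDIA driver visible in this environment"
fi
echo "$driver_status"
```

On a CPU-only machine this simply reports that no driver is visible, which is the situation the build server is in later in this thread.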


vennand commented Mar 1, 2024

Hi @stebo85,
I've created this container for relion:
https://github.com/vennand/neurocontainers/tree/master/recipes/relion

The CUDA toolkit seems to work, though I haven't tested with a dataset yet.
However, it doesn't seem like the "toolVersion" was replaced correctly in the README.md. I'm not sure if I've done something wrong.
When exiting the build process, it also tries to push the image automatically to docker.io/vnmd/relion_4.0.1, which I don't have permission for. To test it properly, I technically need to make sure the GUI launches, so it would probably be better if this step weren't automatic.

I was wondering, is there a way to call other containers when building a container? Relion uses ctffind, motioncor2 and topaz, but they are all third-party software that can be used independently. Is the best way to build all of them within the relion container, or can I create containers for them and use them in the relion container?
Topaz would likely cause issues with relion in the same environment, since it needs its own conda environment.


stebo85 commented Mar 1, 2024

Dear @vennand,

Great to hear you are making progress :)

The toolVersion should be replaced in the README once it's building in our repository. You can just send a pull request. This will then build it and provide you with a command for testing the container, and you can check if the GUI works. We are working on an interactive graphical build system that will allow you to do all of this nicely, but right now, that's the best we can do.

Yes, you can call other containers from within containers! For this to work you need to install singularity and lmod in the container, and then you can use the module system to call any other container. Here is an example where we do this: https://github.com/NeuroDesk/neurocontainers/blob/master/recipes/code/build.sh

from this example you need:

   --install lmod \
   --env GOPATH='$HOME'/go \
   --env PATH='$PATH':/usr/local/go/bin:'$PATH':${GOPATH}/bin \
   --run="wget https://dl.google.com/go/go$GO_VERSION.$OS-$ARCH.tar.gz \
    && tar -C /usr/local -xzvf go$GO_VERSION.$OS-$ARCH.tar.gz \
    && rm go$GO_VERSION.$OS-$ARCH.tar.gz \
    && mkdir -p $GOPATH/src/github.com/sylabs \
    && cd $GOPATH/src/github.com/sylabs \
    && wget https://github.com/sylabs/singularity/releases/download/v${SINGULARITY_VERSION}/singularity-ce-${SINGULARITY_VERSION}.tar.gz \
    && tar -xzvf singularity-ce-${SINGULARITY_VERSION}.tar.gz \
    && cd singularity-ce-${SINGULARITY_VERSION} \
    && ./mconfig --without-suid --prefix=/usr/local/singularity \
    && make -C builddir \
    && make -C builddir install \
    && cd .. \
    && rm -rf singularity-ce-${SINGULARITY_VERSION} \
    && rm -rf /usr/local/go $GOPATH \
    && ln -s /usr/local/singularity/bin/singularity /bin/" \
   --copy module.sh /usr/share/ \

the content of module.sh is:

trap "" 1 2 3

case "$0" in
    -bash|bash|*/bash) . /usr/share/lmod/6.6/init/bash ;;
       -ksh|ksh|*/ksh) . /usr/share/lmod/6.6/init/ksh ;;
       -zsh|zsh|*/zsh) . /usr/share/lmod/6.6/init/zsh ;;
          -sh|sh|*/sh) . /usr/share/lmod/6.6/init/sh ;;
                    *) . /usr/share/lmod/6.6/init/sh ;;  # default for scripts
esac

trap - 1 2 3

You might have to adjust the lmod version depending on your base operating system container version
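Assuming the pieces above are in place, calling another container from inside the relion container would then look roughly like this (the module name is an assumption; on a machine without lmod the sketch just reports that instead):

```shell
# Hypothetical usage inside the relion container: initialise lmod, then call
# a tool from another container through the module system.
if [ -f /usr/share/module.sh ]; then
    . /usr/share/module.sh           # set up the module command (see module.sh above)
    module load topaz                # module name/version is an assumption
    topaz --help
    module_status="lmod initialised and topaz loaded"
else
    module_status="lmod not available here; inside the container this would load topaz"
fi
echo "$module_status"
```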

I would go down the route of multiple containers if:

  1. a software has conflicting dependencies
  2. a software is useful for multiple other containers

Keen to see this working :)
Thank you
Steffen


vennand commented Mar 6, 2024

Hi @stebo85,

I'm not sure I understand how to use this. So let's say I want to create a container for Topaz, so I can use it in relion. In the Topaz container, I need to install lmod and singularity? Can I use the module.sh file as is?

After that, how do I call Topaz in the relion container? For relion, I would need to set an environment variable that points to the Topaz executable. Would this work then?
--env RELION_TOPAZ_EXECUTABLE=/usr/local/topaz/latest/topaz

Thanks,
André


stebo85 commented Mar 7, 2024

Dear @vennand

you don't need to install singularity and lmod in the Topaz container - only in the relion container (so the container that calls other containers). You should be able to use the module.sh file as is, but check that the lmod version fits with your base image version - otherwise adjust the version number.

the variable for RELION_TOPAZ_EXECUTABLE is a bit tricky and you need to try what works there, a few options:

  1. the easiest would be if you can run module load topaz/version and then topaz will just be on the path - maybe you can leave the variable empty and it first tries to find topaz on the path?
  2. if 1) doesn't work then you could create a wrapper script that does this and store it for example under /usr/local/topaz/latest/topaz
    this wrapper script then contains something like:
module load Topaz/x.x.x
topaz "$@"

Then you can set the variable to this and it might work:
--env RELION_TOPAZ_EXECUTABLE=/usr/local/topaz/latest/topaz
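Option 2 above could be sketched like this (the target path is from the thread; the module name and the demo directory are assumptions for illustration):

```shell
# Create the wrapper described in option 2. In the real container it would
# live at /usr/local/topaz/latest/topaz; here we write to a demo directory.
dir="${DEMO_DIR:-/tmp/topaz-demo}/latest"
mkdir -p "$dir"
cat > "$dir/topaz" <<'EOF'
#!/bin/bash
# Initialise lmod, load the topaz module, then hand all arguments to topaz.
source /usr/share/module.sh
module load topaz
exec topaz "$@"
EOF
chmod +x "$dir/topaz"
echo "wrapper installed at $dir/topaz"
```

RELION_TOPAZ_EXECUTABLE would then point at the wrapper, and RELION never needs to know the binary lives in another container.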

Let me know how you go with this :)

Thank you
Steffen


vennand commented Mar 8, 2024

Hi @stebo85,

Thanks for the instructions! I think it's going well so far.

Quick question: would you recommend installing other software under /usr/local/ or /opt/ ?
I'll create a separate container for Topaz, but it seems like it's not necessary for motioncor2 and ctffind. So at the moment, I'm installing under /usr/local/ctffind/4.1.14 and /usr/local/motioncor2/1.6.4

Thanks,
André


stebo85 commented Mar 8, 2024

It doesn’t matter where you install other software in the container :) I prefer /opt - but as long as the binaries are on the path variable it will work


vennand commented Mar 13, 2024

Hi @stebo85,

I've made pull requests for relion and topaz. I'm assuming the next step is to test it with a GUI, ideally on a machine with a GPU?

Also, side note: is it possible to create containers for software with licenses? It could be a dongle server or a network license.

Thanks,
André


stebo85 commented Mar 13, 2024

OK, great, I'll merge that.

Yes, once the container is built you get a command for testing the container.

Dongles are difficult but network licenses would work


stebo85 commented Mar 13, 2024

Topaz worked and you can test:
#613

Relion failed with:

[ 9/20] RUN git clone 'https://github.com/3dem/relion.git' --branch=4.0.1 && cd relion && mkdir build && cd build && cmake -DCUDA_ARCH= -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-11.8 -DCMAKE_INSTALL_PREFIX=/opt/relion-4.0.1/ -DFORCE_OWN_FLTK=ON .. && make && make install:
#16 145.5 [ 3%] Building NVCC (Device) object src/apps/CMakeFiles/relion_jaz_gpu_util.dir//jaz/cuda/relion_jaz_gpu_util_generated_test00.cu.o
#16 145.6 nvcc fatal : Value 'sm_' is not defined for option 'gpu-architecture'
#16 145.6 CMake Error at relion_jaz_gpu_util_generated_test00.cu.o.Release.cmake:220 (message):
#16 145.6 Error generating
#16 145.6 /tmp/relion/build/src/apps/CMakeFiles/relion_jaz_gpu_util.dir/__/jaz/cuda/./relion_jaz_gpu_util_generated_test00.cu.o
#16 145.6
#16 145.6
#16 145.6 make[2]: *** [src/apps/CMakeFiles/relion_jaz_gpu_util.dir/build.make:77: src/apps/CMakeFiles/relion_jaz_gpu_util.dir/__/jaz/cuda/relion_jaz_gpu_util_generated_test00.cu.o] Error 1
#16 145.6 make[1]: *** [CMakeFiles/Makefile2:460: src/apps/CMakeFiles/relion_jaz_gpu_util.dir/all] Error 2
#16 145.6 make: *** [Makefile:136: all] Error 2


vennand commented Mar 13, 2024

Does the machine have a GPU driver installed? It needs nvidia-smi, otherwise the environment variable COMPUTE_CAPABILITY comes back empty.

Should I try to add some conditions, so that if COMPUTE_CAPABILITY is empty, it'll compile for CPU only (without -DCUDA_ARCH=${COMPUTE_CAPABILITY})? Though a relion without GPU won't be very appealing to users...
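One way the detection described here could be sketched (`compute_cap` is a query field of reasonably recent `nvidia-smi` versions; older drivers may not support it, so treat this as an assumption):

```shell
# Probe the first GPU's compute capability; empty means no usable driver.
COMPUTE_CAPABILITY=$(nvidia-smi --query-gpu=compute_cap --format=csv,noheader 2>/dev/null | head -n1)
if [ -z "$COMPUTE_CAPABILITY" ]; then
    echo "no GPU detected: would compile CPU-only"
else
    echo "would compile with -DCUDA_ARCH=${COMPUTE_CAPABILITY}"
fi
```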


stebo85 commented Mar 13, 2024

Dear @vennand - no, the build server doesn't have a GPU - do you really need a GPU to build it? Can't you just set this variable explicitly?


vennand commented Mar 13, 2024

I need to specify the architecture of the GPU the container will run on, so I can't set it explicitly. For example, I'm testing it on a Tesla P40, which has a compute capability of 6.1, but a newer GPU like the RTX 4090 has 8.9. It affects the binaries the software compiles; it'll throw errors when we try to use it otherwise.

Anyway, I'll try to find a work-around with if/else statements inline.


stebo85 commented Mar 14, 2024

It would need a specific container for each. You could encode this through the toolVersion. Can you build containers with compute capability 6.1, compute capability 8.9, and CPU only?


vennand commented Mar 14, 2024

I added this instead; it seems to work in my testing. Could that cause issues in the future with how NeuroDesk is coded?

&& if [[ -z '${COMPUTE_CAPABILITY}' ]] ; then \
       # RELION: if there is no NVIDIA driver installed, compile without GPU
       cmake -DCMAKE_INSTALL_PREFIX=/opt/${toolName}-${toolVersion}/ -DFORCE_OWN_FLTK=ON .. ; \
   else \
       # RELION: otherwise, compile with the GPU architecture and CUDA version
       cmake -DCUDA_ARCH=${COMPUTE_CAPABILITY} -DCUDA_TOOLKIT_ROOT_DIR=/usr/local/cuda-11.8 -DCMAKE_INSTALL_PREFIX=/opt/${toolName}-${toolVersion}/ -DFORCE_OWN_FLTK=ON .. ; \
   fi \

It tests if COMPUTE_CAPABILITY is of length 0. If it is, then it compiles for CPU only, otherwise it uses the correct GPU architecture


stebo85 commented Mar 14, 2024

That works, but it will only build the CPU version. Why not build GPU versions as well?


vennand commented Mar 14, 2024

Then I don't think I understand how containers work. I assumed they would be built on each machine when installed. So once they are built by the test server, they are used as is? I'd have to test on a machine without a GPU, but I suspect relion's compile process checks for an NVIDIA driver and may compile for CPU only even if I specify an architecture.

I can make a container for each compute capability, but that would make 16 containers (from 3.5 to 9.0), plus CPU only.


stebo85 commented Mar 14, 2024

Dear @vennand,

Containers are built once and then used by the target system as is. They are not rebuilt on the target system. So, correct: you would need to build 16 different containers to support every target architecture in your case. As a more realistic option, why not build a CPU container, a container with the latest compute capability, and a container with an older compute capability - then see which deployment systems you have in practice and how these containers run?
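That three-variant idea could be enumerated like this (the container names and the capability values are purely illustrative, not from any recipe):

```shell
# Plan a small build matrix instead of one container per GPU architecture:
# CPU-only, one older compute capability, one recent one.
variants=""
for cc in "" 6.1 9.0; do
    if [ -z "$cc" ]; then
        variants="$variants relion-cpu"
    else
        variants="$variants relion-sm$(echo "$cc" | tr -d '.')"   # e.g. 6.1 -> relion-sm61
    fi
done
echo "planned builds:$variants"
```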

Thank you
Steffen


vennand commented Mar 14, 2024

Do you have an example of containers using different toolVersions? Do I have to create new recipe directories?


stebo85 commented Mar 14, 2024

We don't have a tool yet that needs that. I think it would be easiest for now to push different versions of the build.sh file to the repository and see if this actually works. Then we can see how to streamline this.
