
Ducho v2.0

This is the official GitHub repo for the paper "Ducho 2.0: Towards a More Up-to-Date Unified Framework for the Extraction of Multimodal Features in Recommendation".

Table of contents

  • What is Ducho
  • Prerequisites
  • Installation
  • Try Ducho
  • Use Ducho
  • The team
  • Cite Us

What is Ducho

[Table: supported modalities (Audio, Visual, Textual, Visual-Textual) by source (Items, Interactions) and backend (TensorFlow, PyTorch, Transformers, Sentence-Transformers).]
Ducho v2.0 is a Python framework for the extraction of multimodal features for recommendation. It provides a unified interface to the most common deep learning libraries (e.g., TensorFlow, PyTorch, Transformers, Sentence-Transformers) to extract high-level features from items (e.g., product images/descriptions) and user-item interactions (e.g., user reviews). It is highly configurable through a YAML-based configuration file (which may be overridden by command-line arguments when needed). Users can indicate the source from which to extract the multimodal features (i.e., items/interactions), the modalities (i.e., visual/textual/audio/multiple), and the list of models along with output layers and preprocessing steps for the extraction. Moreover, with the new version of Ducho, users can run extractions with their own pretrained models.
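To give a flavor of the configuration file, here is a minimal sketch. The field names below are illustrative assumptions, not Ducho's exact schema (only the textual.interactions.input_path hierarchy is confirmed by the override example later in this README); please refer to the official documentation for the actual keys:

# Illustrative sketch: field names are assumptions, not Ducho's exact schema
dataset_path: ./local/data/demo
textual:
  interactions:
    input_path: reviews.tsv          # raw user reviews
    output_path: textual_embeddings/ # where extracted features are stored
    model:
      - name: sentence-transformers/all-mpnet-base-v2
        backend: sentence_transformers
visual:
  items:
    input_path: images/              # product images
    output_path: visual_embeddings/
    model:
      - name: ResNet50
        backend: torch
        output_layers: avgpool       # which layer's activations to extract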

You may choose among three options to run Ducho:

  • Locally by cloning this GitHub repo.
  • By pulling our docker image on Docker Hub (link).
  • On Google Colab (link).

Prerequisites

Local

Ducho works on both CPU and GPU, harnessing the power of the CUDA and MPS engines. However, if you want to speed up your feature extraction, we highly recommend the GPU-accelerated option.

In that case, if your machine is equipped with an NVIDIA GPU, you should first make sure that CUDA is installed along with compatible NVIDIA drivers.

For example, a possible working environment involves the following versions (similar to what any Google Colab notebook provides):

Nvidia drivers: 525.85.12
Cuda: 11.8.89
Python: 3.11.2
Pip: 23.3.2

Please refer to this link for the official guidelines on how to install the NVIDIA CUDA toolkit on Linux from scratch.
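If you want to double-check what your machine currently provides before installing anything, the standard commands are:

nvidia-smi        # driver version and the highest CUDA version it supports
nvcc --version    # version of the installed CUDA toolkit (if any)
python3 --version
pip --version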

Docker

Docker is easily the best option to run any NVIDIA/CUDA-based framework, since ready-to-use images come with everything already set up for almost any need.

First of all, you need to install the Docker engine on your machine. Here is the official link for Ubuntu. Then, to run the demos provided in this repository, you might also need Docker Compose (here is a reference link).

Quite conveniently, you can find several CUDA-equipped images on Docker Hub. You may refer to this link. Depending on your CUDA version, you may also need to install nvidia-docker2 (if so, here is a reference).

To test that everything works smoothly, pull and run a container from the following docker image with this command:

docker run --rm --gpus all nvidia/cuda:11.8.0-cudnn8-devel-ubuntu22.04 nvidia-smi

Once the docker image has been downloaded from the hub, you should be able to see something like this:

==========
== CUDA ==
==========

CUDA Version 11.8.0

Container image Copyright (c) 2016-2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.

This container image and its contents are governed by the NVIDIA Deep Learning Container License.
By pulling and using the container, you accept the terms and conditions of this license:
https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

A copy of this license is made available in this container at /NGC-DL-CONTAINER-LICENSE for your convenience.

Mon Feb  5 08:11:59 2024       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.129.06   Driver Version: 470.129.06   CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla T4            Off  | 00000001:00:00.0 Off |                    0 |
| N/A   33C    P0    24W /  70W |      0MiB / 15109MiB |      4%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

meaning that the installation went fine and you are finally ready to pull Ducho's image (which is actually built on top of this CUDA image)!

Google Colab

You just need a Google Drive account!

Installation

Depending on where you are running Ducho, you might need to first clone this repo and install the necessary packages.

Local and Google Colab

If you are running Ducho on your local machine or Google Colab, you first need to clone this repository:

git clone https://github.com/sisinflab/Ducho.git

Then, install the needed dependencies through pip:

pip install -r requirements.txt # Local
pip install -r requirements_colab.txt # Google Colab

P.S. Since Google Colab already comes with almost all necessary packages, you just need to install the few missing ones.

Now you are all set to run Ducho (see later).
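If you went for the GPU-accelerated setup, a quick sanity check (using standard PyTorch and TensorFlow calls, nothing Ducho-specific) tells you whether the backends can actually see your GPU:

python3 -c "import torch; print(torch.cuda.is_available())"
python3 -c "import tensorflow as tf; print(tf.config.list_physical_devices('GPU'))"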

Docker

Note that these two steps are not necessary for the docker version because the image already comes with a suitable environment. In this case, you just need to pull our docker image from Docker Hub (link):

docker pull sisinflabpoliba/ducho
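For instance, a typical way to start the container with GPU access and your local data mounted might be the following (the /Ducho/local mount target is an assumption about the image layout, so adjust it to your setup):

docker run -it --gpus all -v $(pwd)/local:/Ducho/local sisinflabpoliba/ducho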

Once the container is up and running, you will be presented with a command line, where you can run Ducho (see later).

Try Ducho

To ease the usage of Ducho, here we provide a demo spanning different multimodal recommendation scenarios. Use it to familiarize yourself with the framework:

  • Demo RecSys: It performs visual, textual, and multiple-modality feature extraction from items. More precisely, it demonstrates how to extract visual and textual features, also incorporating custom models, and showcases the extraction of visual-textual features via the multiple modality (link).

Use Ducho

Once you have familiarized yourself with Ducho, you can use it for your own datasets and custom multimodal feature extractions! Please refer to the official documentation, where all modules, classes, and methods are explained in detail.

You may also consider taking a look at this guideline to better understand how to fill in your custom configuration files.

Independently of where you are running Ducho, here are the basic instructions to run a custom multimodal extraction pipeline.

Assuming all input data has been placed in the correct folder and the configuration file has been filled in, you can use our convenient run.py script:

python3 run.py --config=<path_to_config_file> [--additional_argument_1=<additional_value_1> --additional_argument_2=<additional_value_2> ...]

where the path to your custom configuration file is needed to override the existing default one (which performs no specific actions), while the additional argument/value pairs optionally override some of the configuration parameters from the command line.

As the configuration dictionary derived from the configuration file is built on nested dictionaries, the argument may come in the form of key1.key2.key3...keyn. For example, if you want to override the input path of the textual interaction data, you should write:

python3 run.py --config=<path_to_config_file> --textual.interactions.input_path=<path_to_input>

This will override the corresponding entry in the configuration file you provided. We recommend using command-line overriding only for simple configuration parameters (as in the reported example), since the framework is currently not tested for overriding, for example, model parameters (which are stored in complex lists of dictionaries).
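Conceptually, such an override walks the nested configuration dictionary along the dotted path and replaces the leaf value. Here is a minimal sketch of this logic in Python (not Ducho's actual implementation):

def apply_override(config, dotted_key, value):
    # Walk the nested dict along key1.key2...keyn and set the leaf value.
    *path, leaf = dotted_key.split(".")
    node = config
    for key in path:
        node = node.setdefault(key, {})
    node[leaf] = value

config = {"textual": {"interactions": {"input_path": "old.tsv"}}}
apply_override(config, "textual.interactions.input_path", "new.tsv")
print(config["textual"]["interactions"]["input_path"])  # -> new.tsv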

The team

Currently, Ducho is maintained by:

The scientific supervision is driven by:

Cite Us

DUCHO 2.0

If you use our code in a scientific work, don't forget to cite us :)

@inproceedings{DBLP:conf/www/AttimonelliDMPGD24,
  author       = {Matteo Attimonelli and
                  Danilo Danese and
                  Daniele Malitesta and
                  Claudio Pomo and
                  Giuseppe Gassi and
                  Tommaso Di Noia},
  title        = {Ducho 2.0: Towards a More Up-to-Date Unified Framework for the Extraction of Multimodal Features in Recommendation},
  booktitle    = {{WWW}},
  doi          = {10.1145/3589335.3651440},
  publisher    = {{ACM}},
  year         = {2024}
}

DUCHO 1.0

@inproceedings{DBLP:conf/mm/MalitestaGPN23,
  author       = {Daniele Malitesta and
                  Giuseppe Gassi and
                  Claudio Pomo and
                  Tommaso Di Noia},
  title        = {Ducho: {A} Unified Framework for the Extraction of Multimodal Features
                  in Recommendation},
  booktitle    = {{ACM} Multimedia},
  pages        = {9668--9671},
  doi          = {10.1145/3581783.3613458},
  publisher    = {{ACM}},
  year         = {2023}
}