
Example: COIL20

Kathleen Sucipto edited this page Feb 20, 2020 · 1 revision

The code for running the module on COIL20 is available in fastGNMF.examples.COIL20. Note that all images are automatically resized to 32 x 32.

Preparing the dataset

The dataset can be downloaded from this link. Ensure the images are stored in a directory named COIL20 under the examples directory and are named in the format obj[i]__[j].png, where

  • i = [1..n] and j = [0..k-1]
  • n = the number of objects
  • k = the number of images per object

For instance, with n = 2 and k = 3, the directory structure would be:

- examples
  |_ COIL20
  |  |_ obj1__0.png
  |  |_ obj1__1.png
  |  |_ obj1__2.png
  |  |_ obj2__0.png
  |  |_ obj2__1.png
  |  |_ obj2__2.png
  |_ COIL20.py
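
Before running the example, it can help to check that the directory matches the naming scheme. The following is a minimal sketch, not part of fastGNMF: the helper names expected_filenames and missing_files are hypothetical, and it assumes i runs from 1 to n and j from 0 to one less than the number of images per object.

```python
from pathlib import Path


def expected_filenames(n, k):
    """List the names obj[i]__[j].png for i in 1..n and j in 0..k-1."""
    return [f"obj{i}__{j}.png" for i in range(1, n + 1) for j in range(k)]


def missing_files(directory, n, k):
    """Return the expected file names that are not present in `directory`."""
    root = Path(directory)
    return [name for name in expected_filenames(n, k)
            if not (root / name).is_file()]


# Names expected for 2 objects with 3 images each.
print(expected_filenames(2, 3))
```

Calling missing_files("examples/COIL20", 20, 72) would then list any image of the full dataset that is absent or misnamed.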

Functions

read_dataset

A function to read the images from the COIL20 directory.

Parameters:

  • rank (int): the number of objects, by default = 20
  • image_num (int): the number of images per object, by default = 72
  • seed (int): the random seed used to select rank objects from all available objects, by default = None

Returns a tuple:

  • X: a numpy array containing all images, with dimension [(32 x 32) x (rank x image_num)], i.e. one flattened image per column; note that the images are shuffled so that images of the same object are not grouped together
  • groundtruth: the cluster label of each image, identifying which object it shows
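
To illustrate the shape of the returned values, here is a rough sketch of how flattened images could be stacked into X and shuffled together with their labels. It is not the actual read_dataset code: the function name stack_and_shuffle is hypothetical, and synthetic random arrays stand in for the resized 32 x 32 images.

```python
import numpy as np


def stack_and_shuffle(images, labels, seed=None):
    """Flatten images into columns of X and shuffle columns with their labels.

    images: array of shape (num_images, 32, 32)
    labels: array of shape (num_images,)
    """
    images = np.asarray(images, dtype=float)
    labels = np.asarray(labels)
    # Each image becomes one column: X has shape (32 * 32, num_images).
    X = images.reshape(images.shape[0], -1).T
    # Apply the same random permutation to the columns and to the labels.
    order = np.random.default_rng(seed).permutation(X.shape[1])
    return X[:, order], labels[order]


rank, image_num = 3, 4
rng = np.random.default_rng(0)
images = rng.random((rank * image_num, 32, 32))  # synthetic stand-ins
labels = np.repeat(np.arange(rank), image_num)   # object index per image
X, groundtruth = stack_and_shuffle(images, labels, seed=42)
print(X.shape)  # (1024, 12)
```

Because one permutation is applied to both arrays, column c of X still corresponds to groundtruth[c] after shuffling.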

visualize_tsne

A function to save the t-SNE plot of the latent feature vector V. It uses the TSNE() function implemented in sklearn and currently allows only random initialization.

Parameters:

  • V: an instance of numpy array or matrix, the latent feature vector produced by the matrix factorization
  • rank (int): the number of objects/rank/clusters
  • groundtruth (array of int): the cluster labels for each image provided by the read_dataset() function
  • plot_title (str): the title of the plot
  • plot_file (str): the file path of the stored plot image
  • tsne_perplexity (int): the perplexity to be passed to the TSNE function, by default = 2
  • seed (int): the randomization seed, by default = None

Image example:

[image: tsne_image]
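
The embedding step can be sketched with sklearn directly. This is an illustration under assumptions, not the code of visualize_tsne: the helper embed_tsne is hypothetical, V is assumed to hold one row per image, and the plotting and saving to plot_file are left out.

```python
import numpy as np
from sklearn.manifold import TSNE


def embed_tsne(V, tsne_perplexity=2, seed=None):
    """Project the rows of V (assumed: one row per image) to 2-D with t-SNE."""
    tsne = TSNE(n_components=2, init="random",
                perplexity=tsne_perplexity, random_state=seed)
    # np.asarray converts a numpy matrix to a plain array before fitting.
    return tsne.fit_transform(np.asarray(V))


rng = np.random.default_rng(0)
V = rng.random((30, 3))  # synthetic stand-in: 30 images, rank 3
points = embed_tsne(V, tsne_perplexity=2, seed=0)
print(points.shape)  # (30, 2)
```

The resulting 2-D points could then be drawn as a scatter plot colored by groundtruth. Note that sklearn requires the perplexity to be smaller than the number of samples.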

plot_basis

A function to plot the basis matrix U produced by the matrix factorization.

Parameters:

  • U: an instance of numpy array or matrix, the basis matrix U produced by the matrix factorization
  • ncol: the number of columns in the resulting canvas
  • nrow: the number of rows in the resulting canvas
  • size: the height/width of each basis image, by default = 32
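
As a rough picture of what such a canvas looks like, the sketch below tiles the columns of U into an nrow x ncol grid of size x size images. It is an assumption about the layout rather than the actual plot_basis code: the helper tile_basis is hypothetical, and each column of U is assumed to be one flattened basis image in row-major order.

```python
import numpy as np


def tile_basis(U, ncol, nrow, size=32):
    """Arrange the first nrow * ncol columns of U as a grid of images."""
    U = np.asarray(U, dtype=float)
    if U.shape[0] != size * size:
        raise ValueError("each column of U must have size * size entries")
    if U.shape[1] < nrow * ncol:
        raise ValueError("U has fewer columns than grid cells")
    # (size*size, n) -> (nrow, ncol, size, size): one image per grid cell.
    tiles = U[:, :nrow * ncol].T.reshape(nrow, ncol, size, size)
    # Interleave grid and pixel axes, then merge them into one 2-D canvas.
    return tiles.transpose(0, 2, 1, 3).reshape(nrow * size, ncol * size)


rng = np.random.default_rng(0)
U = rng.random((32 * 32, 6))  # synthetic stand-in for a rank-6 basis
canvas = tile_basis(U, ncol=3, nrow=2)
print(canvas.shape)  # (64, 96)
```

The canvas is a single 2-D array, so it could be displayed or saved with any image plotting routine.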

