bhavaygg/CVD-Correction-extension

IR2022_T9

Color Vision Deficiency (CVD) is one of the most common visual disorders worldwide, and no medical treatment is known to date. People with CVD face inconveniences in everyday activities such as reading traffic signals and distinguishing the colors of everyday items like clothes and flowers. With the time we spend on screens increasing considerably in recent years, people with CVD also face difficulties in viewing online content, whether for entertainment or education. In this project, we process images and videos so that people with CVD can view them more easily: colors that are imperceptible or indistinguishable to a person with CVD are recolored into colors that they can perceive and distinguish.

Our contributions include a deep learning method, an online recoloring tool, a Chrome extension for correcting images, and a dataset of images as perceived by people with different CVDs (protanopia, deuteranopia, and tritanopia). Future work includes developing new metrics for discernibility.

  • Baseline - We create a simulation model that transforms the image into the LMS color space, where a filter for the desired CVD type is applied. The generated image is an accurate representation of what a person with that CVD perceives and is taken as the ground truth. Our model then applies an additional transformation that shifts the image into a more visible spectrum. This is a simple and fast approach that emphasizes the contrast between colors that are hard to distinguish (a sketch of this simulate-then-shift pipeline follows this list).
  • Gaussian Mixture Model (GMM) - We also implement a GMM-based model, following the recoloring approach of Huang et al. (2009). We work in a color space where the perceptual difference between any two colors can be approximated by the Euclidean distance between them, and we assume that the image's color distribution is well approximated by K Gaussians, giving a probability density of the form p(x|\theta) = \sum_{i=1}^{K} \omega_i G_i(x|\theta_i). The distance between a pair of Gaussians is computed using the KL divergence, and the hue shift applied to a color depends on the posterior probability of each Gaussian and the corresponding mapping function. The shifted colors are then interpolated to preserve local color smoothness in the recolored image (see the GMM sketch below).
  • Deep Learning Based Model - We propose a two-pronged approach: first transform the image into the color-blind (CVD-simulated) space, then learn a further transformation of that image which enhances discernibility between subjects while maintaining the separation between objects in the color space. Our deep learning framework, visualized in the figure, comprises two parts: the corrector model and the referee model. The corrector network uses a U-Net-like architecture; its task is to take an input image and produce a recolored image that is better perceived by people with CVD. The referee model is a pre-trained CNN that detects objects in the corrected image generated by the corrector, compares its performance to that on the original image, and feeds the resulting loss back into the corrector network. The aim of this approach is to use object detection as a metric for discernibility, ensuring that the image generated by the corrector network remains clear (a minimal training-step sketch is given below).
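As a concrete illustration of the baseline, the following NumPy sketch runs the usual simulate-then-shift (daltonization) pipeline for protanopia. The matrix values are the commonly cited RGB-to-LMS, protanopia-simulation, and error-redistribution matrices from the daltonization literature, not necessarily the ones used in this repository, and the function names are illustrative.

```python
import numpy as np

# Widely used linear RGB -> LMS transform (illustrative values; the repository
# may use a different matrix).
RGB_TO_LMS = np.array([
    [17.8824,   43.5161,  4.11935],
    [3.45565,   27.1554,  3.86714],
    [0.0299566, 0.184309, 1.46709],
])
LMS_TO_RGB = np.linalg.inv(RGB_TO_LMS)

# Protanopia simulation in LMS space: the L response is rebuilt from M and S.
PROTANOPIA_SIM = np.array([
    [0.0, 2.02344, -2.52581],
    [0.0, 1.0,      0.0],
    [0.0, 0.0,      1.0],
])

# Error-redistribution matrix for the "shift into a more visible spectrum" step.
ERROR_SHIFT = np.array([
    [0.0, 0.0, 0.0],
    [0.7, 1.0, 0.0],
    [0.7, 0.0, 1.0],
])

def simulate_cvd(rgb, sim_matrix=PROTANOPIA_SIM):
    """Simulate how an (H, W, 3) RGB image in [0, 1] is perceived with CVD."""
    lms = rgb @ RGB_TO_LMS.T        # per-pixel RGB -> LMS
    lms_sim = lms @ sim_matrix.T    # drop the information the missing cone carried
    return np.clip(lms_sim @ LMS_TO_RGB.T, 0.0, 1.0)

def daltonize(rgb, sim_matrix=PROTANOPIA_SIM):
    """Shift the color information lost to CVD into channels that stay visible."""
    error = rgb - simulate_cvd(rgb, sim_matrix)   # what the viewer cannot see
    return np.clip(rgb + error @ ERROR_SHIFT.T, 0.0, 1.0)
```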

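The sketch below illustrates the main ingredients of the GMM-based recoloring with scikit-learn: fitting K Gaussians to an image's colors, the closed-form KL divergence between two components, and a posterior-weighted per-pixel shift. The choice of CIELAB as the working space and the `hue_shifts` argument are assumptions made for illustration; how the per-Gaussian shifts are actually optimized is not shown here.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_color_gmm(image_lab, n_components=5):
    """Fit K Gaussians to the color distribution of an (H, W, 3) image.

    The image is assumed to be in a perceptually uniform space (e.g. CIELAB)
    so that Euclidean distance approximates perceptual difference.
    """
    pixels = image_lab.reshape(-1, 3)
    gmm = GaussianMixture(n_components=n_components, covariance_type="full")
    gmm.fit(pixels)
    return gmm

def kl_between_gaussians(mu0, cov0, mu1, cov1):
    """Closed-form KL divergence between two multivariate Gaussians."""
    d = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    return 0.5 * (
        np.trace(cov1_inv @ cov0)
        + diff @ cov1_inv @ diff
        - d
        + np.log(np.linalg.det(cov1) / np.linalg.det(cov0))
    )

def recolor(image_lab, gmm, hue_shifts):
    """Blend per-component color shifts by posterior responsibility.

    `hue_shifts` is an (n_components, 3) array of per-Gaussian offsets; the
    posterior weighting keeps the recolored image locally smooth.
    """
    pixels = image_lab.reshape(-1, 3)
    resp = gmm.predict_proba(pixels)               # (N, K) posteriors
    pixels_shifted = pixels + resp @ hue_shifts    # soft, per-pixel shift
    return pixels_shifted.reshape(image_lab.shape)
```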
Link to Dataset - https://drive.google.com/drive/folders/1WNVi06hTzV1vZUWReQF7VzIQFiY_rDoo?usp=sharing
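Finally, for the deep-learning-based model, the sketch below shows one possible shape of the corrector/referee training step in PyTorch. `TinyCorrector` is a small stand-in for the U-Net-like corrector, the frozen ResNet-18 classifier stands in for the pre-trained object-detection referee, and the feature-matching MSE loss is an illustrative choice rather than the repository's actual objective.

```python
import torch
import torch.nn as nn
import torchvision

class TinyCorrector(nn.Module):
    """Minimal encoder-decoder stand-in for the U-Net-like corrector."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Conv2d(32, 3, 3, padding=1)

    def forward(self, x):
        # Predict a recoloring offset on top of the input image.
        return torch.sigmoid(x + self.decoder(self.encoder(x)))

corrector = TinyCorrector()

# Frozen, pre-trained referee (a classifier here; the project uses a detector).
referee = torchvision.models.resnet18(weights="IMAGENET1K_V1")
referee.eval()
for p in referee.parameters():
    p.requires_grad = False

optimizer = torch.optim.Adam(corrector.parameters(), lr=1e-4)
feature_loss = nn.MSELoss()

def training_step(original, cvd_simulated):
    """One update: recolor the CVD-simulated image and score it with the referee.

    `original` and `cvd_simulated` are (B, 3, H, W) tensors; `cvd_simulated`
    is the output of the LMS-space CVD simulation described above.
    """
    corrected = corrector(cvd_simulated)

    # The referee's response to the corrected image should match its response
    # to the original image, i.e. objects should stay equally recognizable.
    with torch.no_grad():
        target = referee(original)
    loss = feature_loss(referee(corrected), target)

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```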
