Skip to content
This repository has been archived by the owner on Sep 8, 2023. It is now read-only.

Latest commit

 

History

History
268 lines (214 loc) · 9.44 KB

code-analysis.md

File metadata and controls

268 lines (214 loc) · 9.44 KB

Absolutely! Let's go through your code line by line:

#!/usr/bin/env python

This is a shebang line that tells the system to use the Python interpreter specified in the environment to run the script.

##############
#### Your name: Yingqi Chen
##############

This is a placeholder for your name. You should replace it with your actual name.

import numpy as np
import re, math
from sklearn import svm, metrics
from skimage import io, feature, filters, exposure, color, measure, transform
import ransac_score

Here, you're importing various libraries and modules. These include NumPy for numerical operations, re for regular expressions, math for mathematical functions, scikit-learn's Support Vector Machines (SVM) for classification tasks, skimage for image processing tasks, and ransac_score for RANSAC scoring.

class ImageClassifier:

You're defining a class named ImageClassifier. This class will contain the methods and attributes required for image classification.

    def __init__(self):
        self.classifier = None

This is the constructor of the ImageClassifier class. It initializes self.classifier to None.

    def imread_convert(self, f):
        return io.imread(f).astype(np.uint8)

This is a method named imread_convert that takes a file path (f) as an argument. It reads an image using io.imread from scikit-image and converts it to uint8 data type.

    def load_data_from_folder(self, dir):

This method (load_data_from_folder) takes a directory path (dir) as an argument.

        ic = io.ImageCollection(dir+"*.bmp", load_func=self.imread_convert)

It creates an image collection (ic) by loading all BMP images from the specified directory. The load_func parameter is set to the imread_convert method, which converts the images to uint8.

        data = io.concatenate_images(ic)

This combines the images in the image collection into a single array data.

        labels = np.array(ic.files)

It creates an array labels containing the file paths of the images.

        for x, y in enumerate(labels):
            m = re.search("_", y)
            labels[x] = y[len(dir):m.start()]

This loop processes the file paths to extract labels from the image names. It uses a regular expression to find the position of the underscore (_) in the file name.

        return(data,labels)

The method returns the combined image data and the corresponding labels.

    def extract_image_features(self, data):

This method (extract_image_features) takes an array of image data (data) as an argument.

        arr = []

This initializes an empty list arr to store the extracted features.

        for im in data:

This loop iterates over the images in the dataset.

            im_gray = color.rgb2gray(im)
       
            im_gray = filters.gaussian(im_gray, sigma=0.4)
            
            f = feature.hog(im_gray, orientations=10, pixels_per_cell=(48, 48), cells_per_block=(4, 4), feature_vector=True, block_norm='L2-Hys')
            arr.append(f)

Within the loop, you perform the following operations:

  1. Convert the image to grayscale using color.rgb2gray.
  2. Apply Gaussian blur with a sigma of 0.4 using filters.gaussian.
  3. Extract HOG features using feature.hog.
        feature_data = np.array(arr)
        return(feature_data)

The method returns the list of extracted features as a NumPy array.

    def train_classifier(self, train_data, train_labels):

This method (train_classifier) takes training data (train_data) and corresponding labels (train_labels) as arguments.

        self.classifer = svm.LinearSVC()
        self.classifer.fit(train_data, train_labels)

Here, you initialize a linear Support Vector Classifier (LinearSVC) and fit it to the training data and labels.

    def predict_labels(self, data):

This method (predict_labels) takes a dataset (data) as an argument.

        predicted_labels = self.classifer.predict(data)
        return predicted_labels

It predicts labels for the given data using the trained classifier and returns the predicted labels.

    def line_fitting(self, data):

This method (line_fitting) takes a dataset (data) as an argument.

        slope = []
        intercept = []

You initialize empty lists to store slopes and intercepts.

        for img in data:

This loop iterates over the images in the dataset.

            # Convert the image to grayscale
            gray_image = color.rgb2gray(img)

            # Apply Gaussian filter with sigma around 3
            smoothed_image = filters.gaussian(gray_image, sigma=3)

            # Perform edge detection using Canny
            edges = feature.canny(smoothed_image)

            # Extract coordinates of edge pixels
            edge_coords = np.column_stack(np.where(edges))

            # Extract x and y coordinates
            x_coords = edge_coords[:, 1]


            y_coords = edge_coords[:, 0]

            # Perform RANSAC-based line fitting
            # In this case, the acceptable_dist changed from 1.0 to 0.5 will resolve the line fitting issue
            slopes, intercepts = self.ransac_line_fit(x_coords, y_coords, num_samples=2, repeats=1000,
                                                    acceptable_dist=0.5)

            slope.append(slopes)
            intercept.append(intercepts)

Within the loop, you perform the following operations:

  1. Convert the image to grayscale.
  2. Apply Gaussian filter with sigma around 3.
  3. Perform edge detection using Canny.
  4. Extract coordinates of edge pixels.
  5. Extract x and y coordinates from the edge coordinates.
  6. Perform RANSAC-based line fitting by calling the ransac_line_fit method.
        return slope, intercept

The method returns the lists of slopes and intercepts.

    def ransac_line_fit(self, x_coord, y_coord, num_samples, repeats, acceptable_dist):

This is a method for fitting a line using RANSAC. It takes x and y coordinates, as well as parameters for the RANSAC algorithm.

        best_slope = None
        best_intercept = None
        max_inliers = 0

        for _ in range(repeats):
            # Choose a random set of points
            random_indices = np.random.choice(len(x_coord), num_samples, replace=False)
            x_subset = x_coord[random_indices]
            y_subset = y_coord[random_indices]

            # Fit a line to the random subset of points
            line = np.polyfit(x_subset, y_subset, 1)
            slope, intercept = line[0], line[1]

            # Calculate perpendicular distances from the line
            distances = np.abs(y_coord - (slope * x_coord + intercept))

            # Count inliers (points close to the line)
            inliers = np.sum(distances <= acceptable_dist)

            # Update best model if this model is better
            if inliers > max_inliers:
                max_inliers = inliers
                best_slope, best_intercept = slope, intercept

        return best_slope, best_intercept

Within the ransac_line_fit method, you implement the RANSAC algorithm for line fitting. This includes selecting random points, fitting a line, calculating distances, and updating the best model.

def main():

    img_clf = ImageClassifier()

    # load images
    (train_raw, train_labels) = img_clf.load_data_from_folder('./train/')
    (test_raw, test_labels) = img_clf.load_data_from_folder('./test/')
    (wall_raw, _) = img_clf.load_data_from_folder('./wall/')
    
    # convert images into features
    train_data = img_clf.extract_image_features(train_raw)
    test_data = img_clf.extract_image_features(test_raw)
    
    # train model and test on training data
    img_clf.train_classifier(train_data, train_labels)
    predicted_labels = img_clf.predict_labels(train_data)
    print("\nTraining results")
    print("=============================")
    print("Confusion Matrix:\n",metrics.confusion_matrix(train_labels, predicted_labels))
    print("Accuracy: ", metrics.accuracy_score(train_labels, predicted_labels))
    print("F1 score: ", metrics.f1_score(train_labels, predicted_labels, average='micro'))
    
    # test model
    predicted_labels = img_clf.predict_labels(test_data)
    print("\nTest results")
    print("=============================")
    print("Confusion Matrix:\n",metrics.confusion_matrix(test_labels, predicted_labels))
    print("Accuracy: ", metrics.accuracy_score(test_labels, predicted_labels))
    print("F1 score: ", metrics.f1_score(test_labels, predicted_labels, average='micro'))

    # ransac
    print("\nRANSAC results")
    print("=============================")
    s, i = img_clf.line_fitting(wall_raw)
    print(f"Line Fitting Score: {ransac_score.score(s,i)}/10")

if __name__ == "__main__":
    main()

This is the main function. It creates an instance of ImageClassifier and performs the following steps:

  1. Loads image data and labels from specified directories.
  2. Extracts features from the image data.
  3. Trains a classifier on the training data and evaluates its performance on both training and test data.
  4. Performs RANSAC-based line fitting on wall images.
  5. Prints the results.

I hope this detailed explanation helps you understand your code better! If you have any further questions or need clarification on specific parts, feel free to ask.