GNN Model with 93% Accuracy for Facebook Page-Page Network Node Classification with TSNE Visualization #169

liammulhern · 2024-10-28T05:28:53Z

This project introduces a multi-layer graph neural network (GNN) for semi-supervised, multi-class node classification on the Facebook Large Page-Page Network dataset, achieving 93.14% accuracy. The network classifies nodes (representing Facebook pages) into four categories: Politicians, Government Organizations, Television Shows, and Companies.

Key features of PR:

Modules:

dataset.py: Loads and preprocesses data.
main.py: CLI for training and inference.
modules.py: Defines GNN architecture.
train.py: Manages training, validation, and metric logging.
predict.py: Runs model inference and visualizations.

Execution:

Supports training (--train --save --load), inference (--inference <index>), and visualization (--display) through CLI.
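
As a rough illustration of how the flags listed above could be wired together, here is a minimal `argparse` sketch. It is not the code in main.py: the `build_parser` helper is hypothetical, and treating `--save`, `--load`, and `--display` as boolean switches is an assumption, since the description does not say whether they take arguments.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags named in the PR description."""
    parser = argparse.ArgumentParser(
        description="Train or run a GNN on the Facebook Page-Page dataset."
    )
    # Assumption: these are plain on/off switches rather than path arguments.
    parser.add_argument("--train", action="store_true", help="train the model")
    parser.add_argument("--save", action="store_true", help="save the model after training")
    parser.add_argument("--load", action="store_true", help="load a previously saved model")
    parser.add_argument("--display", action="store_true", help="show visualizations")
    # --inference takes the index of the node (page) to classify.
    parser.add_argument("--inference", type=int, metavar="INDEX", default=None,
                        help="classify the node at INDEX")
    return parser


# Parse an explicit argument list so the sketch runs without a real command line.
args = build_parser().parse_args(["--train", "--save"])
print(args)
```

Passing `["--inference", "42"]` instead would leave the training switches off and set `args.inference` to `42`.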

GNN Architecture:

Uses multilayer perceptrons (MLPs) and sparse layers, transforming node features into learned embeddings for classification.
Incorporates ReLU activation and log softmax for output.
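
For readers unfamiliar with this kind of model, the following is a minimal sketch of a two-layer graph convolutional network in plain PyTorch, with ReLU between the layers and log softmax at the output. It is an illustration under stated assumptions, not the contents of modules.py: the names `normalize_adjacency` and `TwoLayerGCN`, the hidden size, and the dense symmetric-normalized adjacency are all hypothetical choices.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def normalize_adjacency(edge_index: torch.Tensor, num_nodes: int) -> torch.Tensor:
    """Build D^-1/2 (A + I) D^-1/2 as a dense matrix from a [2, E] edge list."""
    adj = torch.zeros(num_nodes, num_nodes)
    adj[edge_index[0], edge_index[1]] = 1.0
    adj[edge_index[1], edge_index[0]] = 1.0   # treat the graph as undirected
    adj.fill_diagonal_(1.0)                   # add self-loops
    d_inv_sqrt = adj.sum(dim=1).pow(-0.5)     # degrees are >= 1 thanks to self-loops
    return d_inv_sqrt[:, None] * adj * d_inv_sqrt[None, :]


class TwoLayerGCN(nn.Module):
    """Hypothetical two-layer GCN: each layer is a linear map followed by neighbour averaging."""

    def __init__(self, in_features: int, hidden: int, num_classes: int):
        super().__init__()
        self.lin1 = nn.Linear(in_features, hidden)
        self.lin2 = nn.Linear(hidden, num_classes)

    def forward(self, x: torch.Tensor, adj_norm: torch.Tensor) -> torch.Tensor:
        h = F.relu(adj_norm @ self.lin1(x))   # first propagation step + ReLU
        out = adj_norm @ self.lin2(h)         # second propagation step
        return F.log_softmax(out, dim=1)      # per-node log-probabilities


# Toy graph: 4 nodes in a chain, 8 random features per node, 4 classes.
torch.manual_seed(0)
edge_index = torch.tensor([[0, 1, 2], [1, 2, 3]])
x = torch.randn(4, 8)
model = TwoLayerGCN(in_features=8, hidden=16, num_classes=4)
log_probs = model(x, normalize_adjacency(edge_index, num_nodes=4))
print(log_probs.shape)
```

A dense adjacency matrix keeps the sketch short, but for a graph the size of this dataset a sparse tensor would likely be the more practical choice.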

Training:

Learning Rate: 1e-4
Epochs: 100
Optimizer: Adam
Loss: Cross-Entropy
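
Because the model outputs log softmax, the cross-entropy loss reduces to a negative log-likelihood over the labelled nodes. The notation below is introduced here for illustration and is not from the original: $z_{i,c}$ is the logit of node $i$ for class $c$, $y_i$ is its true class, and $\mathcal{V}_{\text{train}}$ is the set of training nodes.

```latex
\log p_{i,c} = z_{i,c} - \log \sum_{k=1}^{4} e^{z_{i,k}},
\qquad
\mathcal{L} = -\frac{1}{|\mathcal{V}_{\text{train}}|}
              \sum_{i \in \mathcal{V}_{\text{train}}} \log p_{i,y_i}
```

In PyTorch terms, this pairing usually corresponds to `nn.NLLLoss` applied to log softmax outputs. `nn.CrossEntropyLoss` applies log softmax internally, and since log softmax is idempotent, applying it to outputs that are already log-probabilities yields the same value.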

Results:

Achieves 93.14% accuracy; training and validation metrics show potential overfitting.
TSNE visualizations show clearer clustering post-training, indicating successful categorization.

gayanku · 2024-11-04T00:52:27Z

This is an initial inspection; no action is required at this point.
GNN FB Page-Page Network dataset ---- Normal Difficulty

| Category | Max | Marks | Comments |
|---|---|---|---|
| Algorithm solves the problem | 5 | 4.5 | Some overfitting seen. No early stop |
| Implementation functions as intended | 3 | 3 | 2 layer GCN, torch_geometric/GCNConv |
| Good design | 1 | 1 | Modular, Reusable |
| Commenting | 1 | 1 | Meaningful docstrings and comments |
| Algorithm above Normal Difficulty | 5 | 5 | |
| Algorithm is Hard difficulty | 5 | 0 | Normal Difficulty |
| Section IV: Max mark 15 from 20 | | 14.5 | |

TSNE / UMAP:
Good, though would have expected to see a bit more separation at the reported accuracy.

Discussion: Good, though a bit more depth would be useful.

Suggestions:
Redundant code in repo: SparseLayer. Is this needed?
The documentation also mentions a sparse layer, but the implemented model does not use it.
The code could add early stopping, e.g. the training loop should save the best model by keeping track of the current best accuracy.
Could add a hyperparameter search or discuss how the values were chosen.
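
To make the early-stop suggestion concrete, here is a minimal, framework-free sketch of tracking the best validation accuracy and stopping after a fixed number of epochs without improvement. The `BestModelTracker` class, the `patience` value, and the accuracy numbers are all made up for illustration; none of them come from the submitted code or its results.

```python
class BestModelTracker:
    """Hypothetical helper: remembers the best validation accuracy seen so far."""

    def __init__(self, patience: int = 10):
        self.patience = patience
        self.best_accuracy = float("-inf")
        self.best_epoch = -1
        self.epochs_without_improvement = 0

    def update(self, epoch: int, val_accuracy: float) -> bool:
        """Return True if this epoch is a new best, i.e. a checkpoint should be saved."""
        if val_accuracy > self.best_accuracy:
            self.best_accuracy = val_accuracy
            self.best_epoch = epoch
            self.epochs_without_improvement = 0
            return True
        self.epochs_without_improvement += 1
        return False

    @property
    def should_stop(self) -> bool:
        return self.epochs_without_improvement >= self.patience


# Made-up validation accuracies standing in for a real training run.
val_history = [0.80, 0.88, 0.91, 0.90, 0.89, 0.90, 0.92]

tracker = BestModelTracker(patience=3)
for epoch, accuracy in enumerate(val_history):
    if tracker.update(epoch, accuracy):
        # In a real loop this is where the model would be saved,
        # e.g. torch.save(model.state_dict(), checkpoint_path).
        pass
    if tracker.should_stop:
        break

print(tracker.best_epoch, tracker.best_accuracy, epoch)
```

With a patience of 3, the loop stops before reaching the last entry, which shows the trade-off: a short patience saves time but can miss a later improvement.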

gayanku · 2024-11-13T03:26:02Z

Marking

Good/OK/Fair Practice (Design/Commenting, TF/Torch Usage)
	Adequate design and implementation. Redundant code in repo - SparseLayer.	-1
	Spacing and comments.
	Header blocks.
Recognition Problem
	OK solution to problem. Some overfitting seen. No early stop	-2
	Driver Script present.
	File structure present.
	Good Usage & Demo & Visualisation & Data usage.
	Module present.
	Commenting present.
	No Data leakage found.
	Difficulty : Normal. GNN Task.	-5
Commit Log
	Good Meaningful commit messages.
	Good Progressive commits.
Documentation
	Readme: Good.
	Model/technical explanation: Acceptable.	-1
	Description and Comments: Good.
	Markdown used and PDF NOT submitted.	-2
Pull Request
	Successful Pull Request (Working Algorithm Delivered on Time in Correct Branch).	-2
	Feedback action required: Feedback marks possible +2 if the requested changes are made. Update PR to correct branch.	-2
	Request Description is good.
TOTAL		-15

Marked as of the due date; for fairness, changes made after it will not necessarily count toward the grade.
Subject to approval from Shakes.

gayanku · 2024-11-13T09:56:11Z

Feedback marks possible +2 if the requested changes are made (see above).

liammulhern added 27 commits October 3, 2024 15:57

Adds initial README with project scope

ba7e30b

Adds initial README for GNN with project scope

ba785bb

Init pattern recognition python files

76d53ab

Adds file descriptions for pattern recognition files

e4f82e2

Adds test and util file descriptions, updates gitignore for model files

eb91268

Adds FLPP dataset loader and test cases to validate features, classes, and nodes

3b34f27

Adds FLPP dataset loader with test training split

efd0159

Adds graph visualisation and related tests

1b77d5c

Adds edge list to adjacency matrix using torch geo

342677f

Adds GNN and sparse layer modules

559c989

Adds train and test methods for generic model inputs

45ef881

Adds multiple epoch train method that saves and loads model

a44ee7f

Updates save and load to device functionality for model values

bcb5afa

Adds initial model inference for categorisation of graph from GNN

bbfd83c

Adds rangpur slurm runner script

04c5bd2

Adds entry point with CLI args for running model functions; Adds full GNN training method

e76f5b2

Adds TSNE from raw FLPP dataset and figure output

d2c160f

Updates dataloader for validate split

1b40b44

Adds adjacency matrix to training model and begins readme documentation

b09d81e

Updates script to ease training setup

2adf099

Removes pytorch geo dependency

45b57b5

Adds dataset, hyperparameters, and model figures to readme

9e3ba4c

Adds overview and implementation documentation to readme

2b15848

Adds trained TSNE plot and updates training loader to use tensor masks

dc42071

Adds accuracy and loss plots for trained model

4e38681

Adds inference interface to main.py

c5e3d98

Adds conclusion and discussion to readme file

a495aec

shaivikaaaa added the _GNN label Oct 31, 2024

gayanku added the Preliminary Grade, help wanted, and Feedback Needed labels Nov 13, 2024

hanemma7moud added the BB label Nov 17, 2024

shakes76 added the Completed label and removed the help wanted label Nov 18, 2024
