Social Network Analysis | Chapter 9 | Graph Representation Learning | Part 3

Graph Neural Networks: Graph Representation Learning

Overview of Graph Learning Methods

  • The discussion continues on graph neural networks (GNNs) and graph representation learning, focusing on various graph representation learning (GRL) methods.
  • Previous lectures covered simple matrix factorization approaches and random walk methods to capture context in graphs.

Challenges in Graph Embedding

  • A key issue arises when nodes are distant with no shared neighbors, making it difficult to map them closely in the embedding space.

Deep Learning Approaches

  • The lecture recaps deep learning methods, particularly convolutional neural networks (CNNs), as motivation for carrying the convolution idea over to graphs.
  • A GCN follows a message-passing paradigm in which each node aggregates messages from its neighbors over a fixed number of hops.

Operations in GCN Layers

  • Each layer involves two main operations: aggregation (e.g., addition) and non-linearity.
  • Parameters W and B:
  • W: projects the aggregated neighbor embeddings.
  • B: projects the node's own embedding from the previous layer.

Computation Graph Structure

  • Despite varying topologies for different nodes, the operations remain consistent across layers—aggregation followed by non-linearity.

Node Representation Calculation

  • To compute a node's hidden state at depth k:
  • Aggregate (average) the neighbors' embeddings from depth k-1.
  • Project the aggregate with the weight matrix W_k.
  • Add the node's own previous embedding, projected by the matrix B_k, and apply a non-linearity.
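
The per-node update described above can be sketched in NumPy; the function name, toy dimensions, and the choice of ReLU as the non-linearity are illustrative assumptions, with W_k projecting the averaged neighbor embeddings and B_k projecting the node's own previous embedding:

```python
import numpy as np

def gcn_update(h_prev, neighbors, v, W_k, B_k):
    # h_v^k = ReLU( W_k . mean_{u in N(v)} h_u^{k-1} + B_k . h_v^{k-1} )
    agg = np.mean([h_prev[u] for u in neighbors[v]], axis=0)  # average neighbor embeddings
    return np.maximum(0, W_k @ agg + B_k @ h_prev[v])         # project, add self term, ReLU

rng = np.random.default_rng(0)
h_prev = rng.normal(size=(3, 4))   # previous-layer embeddings: 3 nodes, dim 4
neighbors = {0: [1, 2]}            # node 0 has neighbors 1 and 2
W_k = rng.normal(size=(4, 4))      # projects the averaged neighbor embedding
B_k = rng.normal(size=(4, 4))      # projects the node's own previous embedding
h0 = gcn_update(h_prev, neighbors, 0, W_k, B_k)
print(h0.shape)  # (4,)
```

The same two operations, aggregation then non-linearity, repeat at every layer regardless of the node's local topology.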

Compact Formulation of GCN

  • A compact matrix formulation simplifies the per-node calculations: H^(k+1) = sigma(A_hat H^(k) W_k).
  • Matrices involved:
  • H: hidden states (one row per node),
  • W: trainable weight matrices,
  • A_hat: the adjacency matrix with self-loops, symmetrically normalized by the degree matrix.
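
A minimal sketch of this compact form, assuming the common symmetric normalization A_hat = D^(-1/2)(A + I)D^(-1/2) with self-loops; all names and dimensions here are illustrative:

```python
import numpy as np

def normalized_adj(A):
    # A_hat = D^{-1/2} (A + I) D^{-1/2}: add self-loops, scale by degrees
    A_tilde = A + np.eye(A.shape[0])
    d = A_tilde.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_tilde @ D_inv_sqrt

def gcn_layer(H, A_hat, W):
    # H^{(k+1)} = ReLU(A_hat @ H^{(k)} @ W_k), all nodes updated at once
    return np.maximum(0, A_hat @ H @ W)

A = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]], dtype=float)
rng = np.random.default_rng(1)
H = rng.normal(size=(3, 4))        # 3 nodes, input dim 4
W = rng.normal(size=(4, 2))        # project to dim 2
H_next = gcn_layer(H, normalized_adj(A), W)
print(H_next.shape)  # (3, 2)
```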

Loss Function for Downstream Tasks

  • For tasks like node classification, a cross-entropy loss function is defined based on predicted labels versus ground truth.
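
One way to sketch this loss is as cross-entropy evaluated only on the labeled nodes, leaving the rest of the graph unlabeled as in semi-supervised node classification; the helper names and toy logits below are assumptions:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def node_cross_entropy(logits, labels, labeled_idx):
    # Average cross-entropy over the labeled nodes only
    p = softmax(logits[labeled_idx])
    return -np.mean(np.log(p[np.arange(len(labeled_idx)), labels[labeled_idx]]))

logits = np.array([[2.0, 0.1],    # model outputs for 3 nodes, 2 classes
                   [0.2, 1.5],
                   [0.0, 0.0]])
labels = np.array([0, 1, 0])      # ground-truth classes
loss = node_cross_entropy(logits, labels, labeled_idx=np.array([0, 1]))
print(round(loss, 3))  # 0.19
```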

Inductive vs. Transductive Learning Settings

  • Two machine learning paradigms discussed:
  • Transductive: the entire graph, including unlabeled nodes, is available during training; labels are known for some nodes but not all.
  • Inductive: the trained model must generalize to nodes or graphs not seen during training.

Understanding Graph Neural Networks

Introduction to Node Representation

  • The discussion begins with introducing a new node into the model, emphasizing that the trained model can handle it using the existing layers (the weights of layers one and two).
  • A computational graph is drawn for the new node, showing that no additional training is required since the previous weights (W1, b1, W2, b2) are already established.

Graph Convolutional Networks (GCN) vs. GraphSAGE

  • The speaker introduces GraphSAGE as a more generic version of GCN, highlighting similarities but also key differences.
  • In GCN, the node's own embedding and its neighbors' embeddings are summed together; GraphSAGE instead keeps the aggregated neighbor embedding and the node's own embedding side by side.

Embedding Computation in GCN

  • The process of computing the embedding for node v at layer k is explained using an equation that aggregates neighbors' embeddings.
  • A potential issue arises where it becomes unclear which part of the embedding corresponds to the node itself versus its neighbors due to summation.

Enhancements in GraphSAGE

  • Unlike GCN's simple aggregation method, GraphSAGE keeps neighbor embeddings separate from the original node's embedding.
  • An aggregation function can be defined for neighbor embeddings; this could be summation or other methods leading to combined representations.

Concatenation vs. Aggregation

  • In contrast to GCN’s summing approach, GraphSAGE uses concatenation of aggregated neighbor embeddings and own embeddings.
  • This concatenation allows for clearer differentiation between components during subsequent processing layers.
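
The concatenation step might be sketched as follows; note that the projection matrix now takes a 2d-dimensional input, and all names and sizes are illustrative:

```python
import numpy as np

def graphsage_update(h_prev, neighbors, v, W_k):
    # Aggregate neighbors (mean here), then CONCATENATE with the node's own
    # embedding before projecting, so self and neighbor information stay
    # distinguishable (unlike GCN's summation).
    agg = np.mean([h_prev[u] for u in neighbors[v]], axis=0)
    z = np.concatenate([h_prev[v], agg])   # dimension 2d, not d
    return np.maximum(0, W_k @ z)          # W_k has shape (d_out, 2d)

rng = np.random.default_rng(2)
h_prev = rng.normal(size=(3, 4))           # 3 nodes, dim 4
neighbors = {0: [1, 2]}
W_k = rng.normal(size=(4, 8))              # projects the concatenated 2d vector
h0 = graphsage_update(h_prev, neighbors, 0, W_k)
print(h0.shape)  # (4,)
```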

Aggregation Methods in Detail

  • Various aggregation methods are discussed; simple averaging gives equal weightage to all neighbors.

Graph Neural Networks and LSTM Aggregation

Element-wise Max Pooling and LSTM Models

  • The discussion begins with element-wise max pooling and mean pooling on the embeddings of neighboring nodes, highlighting these as initial approaches for aggregation.
  • An LSTM model is introduced as a more sophisticated method for aggregating neighbor embeddings, where each neighbor's embedding serves as input to the model.
  • A critical issue arises: LSTMs are designed to capture sequential data, but in this context, the order of neighbors is irrelevant. Thus, using an LSTM may lead to unintended learning of neighbor ordering.

Addressing Ordering Issues in Neighbor Aggregation

  • To prevent the LSTM from learning neighbor order, inputs can be shuffled across multiple iterations. This ensures that no specific ordering influences the output.
  • By feeding different permutations of neighbor embeddings into the LSTM multiple times, one can obtain various aggregated outputs which can then be averaged or combined in other ways.
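
A toy illustration of this idea, using a simple order-sensitive recurrent cell as a stand-in for a full LSTM (an assumption made only to keep the sketch short): feeding several random permutations and averaging the outputs dampens the dependence on neighbor order:

```python
import numpy as np

rng = np.random.default_rng(3)
W_in, W_rec = rng.normal(size=(4, 4)), rng.normal(size=(4, 4))

def rnn_aggregate(seq):
    # Toy order-SENSITIVE recurrent aggregator: the output depends on the
    # order in which neighbor embeddings are fed in, just like an LSTM.
    h = np.zeros(4)
    for x in seq:
        h = np.tanh(W_in @ x + W_rec @ h)
    return h

def permutation_averaged(neigh_emb, n_perms=20):
    # Feed several random orderings and average the outputs, so that no
    # single neighbor ordering dominates the aggregation.
    outs = []
    for _ in range(n_perms):
        perm = rng.permutation(len(neigh_emb))
        outs.append(rnn_aggregate(neigh_emb[perm]))
    return np.mean(outs, axis=0)

neigh = rng.normal(size=(5, 4))            # five neighbor embeddings, dim 4
h_fwd = rnn_aggregate(neigh)
h_rev = rnn_aggregate(neigh[::-1])
print(np.allclose(h_fwd, h_rev))           # order changes the raw output
avg = permutation_averaged(neigh)
print(avg.shape)                           # (4,)
```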

Theoretical Implications and Practical Applications

  • The paper discusses theoretical implications comparing mean pooling, max pooling, and LSTMs for aggregation methods. It emphasizes how even simple modifications can significantly enhance performance.
  • This approach leads to a generalized way of representing aggregation while providing theoretical guarantees about its effectiveness compared to traditional methods.

Compact Representations Using Matrices

  • The use of matrices is suggested for obtaining compact representations by breaking down dimensions effectively (e.g., d into d/2).

Further Reading and Related Topics

  • Suggested readings include tutorials on graph neural networks (GNNs), papers on graph attention networks, edge embedding techniques not covered in the lectures, and methods for incorporating node position awareness into GNN models.
  • Additional literature explores spectral approaches within GNN frameworks and geometric deep learning concepts that extend beyond Euclidean spaces.

Understanding Graph Attention Mechanisms

Introduction to Attention in Graph Context

  • Attention mechanisms are introduced as a means to improve aggregation processes within graphs by allowing certain neighbors' contributions to be weighted differently based on their importance.

Importance of Neighbor Weights

  • Traditional averaging gives equal weightage to all neighbors; however, some may contribute more significantly than others when generating node embeddings.

Learning Weights vs. Fixed Weights

Understanding User Interaction Networks and Attention Mechanisms

The Concept of Weights in User Interaction

  • The weight in a user interaction network is determined by the number of interactions between users, illustrating how frequently they engage with one another.
  • For example, if someone has five friends, but only three are classmates, the interactions with classmates will have higher weights due to regular engagement compared to less frequent interactions with other friends.

Importance of Diverse Friendships

  • Even if two friends do not share similar interests or interact often, shared passions (e.g., sports teams) can still influence their overall impact on one's social embedding.
  • In classification tasks, it’s crucial to consider both frequently interacted friends and those who may contribute differently despite less frequent contact.

Learning Paradigms for Weight Determination

  • A learning approach is necessary to determine which weights are beneficial for specific tasks rather than assuming uniform impact from all neighbors.
  • The concept of attention is introduced as a way to assess how much focus should be given to each neighbor when generating embeddings.

Computing Attention Coefficients

  • To compute attention coefficients (α), a function must be defined that considers the embeddings of nodes and their neighbors at previous layers.
  • This involves using trainable parameters and projections through matrices to derive vectors that represent these relationships.

Normalization and Aggregation Techniques

  • Attention coefficients need normalization; a softmax over each node's neighborhood ensures the coefficients lie between 0 and 1 and sum to 1, resembling a probability distribution.
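
A GAT-style attention computation might be sketched as below; the LeakyReLU slope, dimensions, and function names are assumptions following the description above:

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def attention_coefficients(h, v, neighbors, W, a):
    # Score each neighbor u of v with e_vu = LeakyReLU(a . [W h_v || W h_u]),
    # then softmax-normalize over the neighborhood so the coefficients are
    # in (0, 1) and sum to 1.
    scores = np.array([
        leaky_relu(a @ np.concatenate([W @ h[v], W @ h[u]]))
        for u in neighbors[v]
    ])
    e = np.exp(scores - scores.max())      # numerically stable softmax
    return e / e.sum()

rng = np.random.default_rng(4)
h = rng.normal(size=(4, 3))                # 4 nodes, input dim 3
W = rng.normal(size=(2, 3))                # trainable projection to dim 2
a = rng.normal(size=(4,))                  # attention vector over [Wh_v || Wh_u]
alpha = attention_coefficients(h, 0, {0: [1, 2, 3]}, W, a)
print(round(float(alpha.sum()), 6))        # 1.0
```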

Understanding Multi-Head Attention in Neural Networks

Introduction to Single Layer Neural Networks

  • A single-layer feedforward neural network can be employed to compute the attention scores from pairs of node embeddings.
  • The parameters of the model, the attention vector a and the weight matrices W, are learned during training. This end-to-end training allows for gradient computation with respect to each parameter.

Advancements in Attention Mechanisms

  • The Graph Attention Networks paper from ICLR 2018 employs multi-head attention, a mechanism widely used in natural language processing (NLP).
  • Multi-head attention processes pairs of nodes through various projections to create an attention vector, enhancing the model's ability to focus on relevant information.

Implementation of Multi-Head Attention

  • Instead of using a single projection matrix, multiple projections (e.g., eight) can be utilized. Each projection is trainable and contributes to the overall learning process.
  • An alternative approach involves splitting embeddings into parts (e.g., breaking a dimension of 64 into four parts), allowing for multiple projections that are then aggregated or concatenated.
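
The multiple-projection variant can be sketched as independent heads whose outputs are concatenated; the head count and dimensions below are illustrative:

```python
import numpy as np

def single_head(h_v, h_neigh, W, a):
    # One attention head: score and weight each neighbor, return their
    # weighted sum of projected embeddings.
    def leaky(x):
        return np.where(x > 0, x, 0.2 * x)
    scores = np.array([leaky(a @ np.concatenate([W @ h_v, W @ h_u]))
                       for h_u in h_neigh])
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()
    return sum(al * (W @ h_u) for al, h_u in zip(alpha, h_neigh))

def multi_head(h_v, h_neigh, heads):
    # Run several independently parameterized heads and concatenate their
    # outputs (GAT's final layer averages instead of concatenating).
    return np.concatenate([single_head(h_v, h_neigh, W, a) for W, a in heads])

rng = np.random.default_rng(5)
h_v = rng.normal(size=3)                   # the node's own embedding, dim 3
h_neigh = rng.normal(size=(4, 3))          # four neighbor embeddings
# e.g. 8 heads, each projecting dim 3 -> dim 2; concatenated output is dim 16
heads = [(rng.normal(size=(2, 3)), rng.normal(size=(4,))) for _ in range(8)]
out = multi_head(h_v, h_neigh, heads)
print(out.shape)  # (16,)
```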

Benefits of Enhanced Attention Mechanisms

  • Utilizing multiple attention heads stabilizes the training process by reducing fluctuations in the learned attention weights.

Empirical Effectiveness and Applications

  • Graph Attention Networks (GAT) have shown significant empirical effectiveness compared to other methods like Graph Convolutional Networks (GCN), particularly in node classification tasks.
  • Visualizations indicate that GAT achieves better separation among nodes than traditional approaches, highlighting its advantages in graph-based applications.

Ongoing Research and Developments

Video description

This is supplementary material for the book Social Network Analysis by Dr. Tanmoy Chakraborty. Book Website: https://social-network-analysis.in/ Available for purchase at: https://www.amazon.in/Social-Network-Analysis-Tanmoy-Chakraborty/dp/9354247830