1 Eigenvalues, Eigenvectors, and SVD

Powerful mathematical tools for understanding linear transformations and data analysis.

1.1 Eigenvalues and Eigenvectors

1.1.1 The Eigenvalue Problem

For a square matrix A, find scalars λ and non-zero vectors v such that: A·v = λ·v

This equation represents the eigenvalue equation.

1.1.2 Geometric Interpretation

Eigenvectors: Directions that remain unchanged (up to scaling) under linear transformation
Eigenvalues: Scaling factors along those directions

1.1.3 Characteristic Equation

The eigenvalues are roots of the characteristic polynomial: det(A - λI) = 0

1.2 Applications in Machine Learning

1.2.1 Principal Component Analysis (PCA)

PCA finds the principal directions of variance in data.

import numpy as np
from sklearn.decomposition import PCA
 
# Generate sample data
np.random.seed(42)
X = np.random.randn(100, 3)
 
# Apply PCA
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)
 
print("Explained variance ratio:", pca.explained_variance_ratio_)
print("Principal components shape:", pca.components_.shape)

1.2.2 Power Iteration Method

Algorithm to find the dominant eigenvalue and eigenvector:

def power_iteration(A, num_iterations=100):
    """Find dominant eigenvalue and eigenvector using power iteration"""
    n = A.shape[0]
    v = np.random.rand(n)
 
    for _ in range(num_iterations):
        # Multiply by matrix
        v_new = np.dot(A, v)
        # Normalize
        v_new = v_new / np.linalg.norm(v_new)
        v = v_new
 
    # Compute eigenvalue
    eigenvalue = np.dot(v, np.dot(A, v)) / np.dot(v, v)
 
    return eigenvalue, v

1.3 Singular Value Decomposition (SVD)

1.3.1 Matrix Decomposition

Every m×n matrix A can be decomposed as: A = U × Σ × Vᵀ

Where:

U: m×m orthogonal matrix (left singular vectors)
Σ: m×n diagonal matrix (singular values)
V: n×n orthogonal matrix (right singular vectors)

1.3.2 Low-Rank Approximation

SVD enables optimal low-rank approximations:

def low_rank_approximation(A, k):
    """Compute rank-k approximation of matrix A using SVD"""
    U, s, Vt = np.linalg.svd(A)
 
    # Keep only top k singular values
    S_k = np.diag(s[:k])
 
    # Reconstruct approximation
    A_k = U[:, :k] @ S_k @ Vt[:k, :]
 
    return A_k

1.4 Spectral Graph Theory

1.4.1 Graph Laplacian

For a graph with adjacency matrix A: L = D - A

Where D is the degree matrix.

1.4.2 Spectral Clustering

def spectral_clustering(adjacency_matrix, k):
    """Perform spectral clustering"""
    # Compute Laplacian
    degree_matrix = np.diag(np.sum(adjacency_matrix, axis=1))
    laplacian = degree_matrix - adjacency_matrix
 
    # Compute eigenvectors
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
 
    # Use k smallest eigenvectors (excluding first)
    X = eigenvectors[:, 1:k+1]
 
    # Normalize rows
    X = X / np.linalg.norm(X, axis=1, keepdims=True)
 
    # Apply k-means clustering
    from sklearn.cluster import KMeans
    kmeans = KMeans(n_clusters=k, random_state=42)
    labels = kmeans.fit_predict(X)
 
    return labels

1.5 Numerical Considerations

1.5.1 Condition Number

The condition number κ(A) = σ₁/σₙ measures matrix sensitivity:

Well-conditioned: κ(A) ≈ 1
Ill-conditioned: κ(A) >> 1

1.5.2 Eigenvalue Algorithms

QR Algorithm: For finding all eigenvalues
Jacobi Method: For symmetric matrices
Lanczos Method: For large sparse matrices

1.6 Applications in Deep Learning

1.6.1 Neural Network Analysis

def analyze_neural_network_weights(weights):
    """Analyze neural network weights using eigenvalues"""
    # Compute weight matrix eigenvalues
    eigenvalues, eigenvectors = np.linalg.eigh(weights @ weights.T)
 
    # Compute condition number
    condition_number = eigenvalues.max() / eigenvalues.min()
 
    # Check for vanishing/exploding gradients
    if condition_number > 100:
        print("Warning: Potentially unstable gradients")
    elif condition_number < 0.01:
        print("Warning: Potentially vanishing gradients")
 
    return condition_number, eigenvalues

1.6.2 Convolutional Neural Networks

SVD can be used for:

Network compression: Reducing parameter count
Knowledge distillation: Transferring knowledge between models
Pruning: Removing redundant filters

1.7 Key Takeaways

Eigenvalues reveal the scaling behavior of linear transformations
SVD provides the optimal low-rank approximation of any matrix
Spectral methods are powerful tools for data analysis and clustering
Numerical stability is crucial when working with eigenvalues

These concepts form the mathematical foundation for many modern machine learning algorithms and data analysis techniques.

Mathematics

Eigenvalues, Eigenvectors, and SVD

1 Eigenvalues, Eigenvectors, and SVD

1.1 Eigenvalues and Eigenvectors

1.1.1 The Eigenvalue Problem

1.1.2 Geometric Interpretation

1.1.3 Characteristic Equation

1.2 Applications in Machine Learning

1.2.1 Principal Component Analysis (PCA)

1.2.2 Power Iteration Method

1.3 Singular Value Decomposition (SVD)

1.3.1 Matrix Decomposition

1.3.2 Low-Rank Approximation

1.4 Spectral Graph Theory

1.4.1 Graph Laplacian

1.4.2 Spectral Clustering

1.5 Numerical Considerations

1.5.1 Condition Number

1.5.2 Eigenvalue Algorithms

1.6 Applications in Deep Learning

1.6.1 Neural Network Analysis

1.6.2 Convolutional Neural Networks

1.7 Key Takeaways

Table of Contents

Mathematics

Eigenvalues, Eigenvectors, and SVD

1 Eigenvalues, Eigenvectors, and SVD

1.1 Eigenvalues and Eigenvectors

1.1.1 The Eigenvalue Problem

1.1.2 Geometric Interpretation

1.1.3 Characteristic Equation

1.2 Applications in Machine Learning

1.2.1 Principal Component Analysis (PCA)

1.2.2 Power Iteration Method

1.3 Singular Value Decomposition (SVD)

1.3.1 Matrix Decomposition

1.3.2 Low-Rank Approximation

1.4 Spectral Graph Theory

1.4.1 Graph Laplacian

1.4.2 Spectral Clustering

1.5 Numerical Considerations

1.5.1 Condition Number

1.5.2 Eigenvalue Algorithms

1.6 Applications in Deep Learning

1.6.1 Neural Network Analysis

1.6.2 Convolutional Neural Networks

1.7 Key Takeaways

Share this article

Connect with OliveML

Table of Contents