How to Train and Validate Convolutional KAN Models with PyTorch

Jun 1, 2023 | Data Science

In this article, we’ll explore the exciting world of TorchConv KAN, a framework that facilitates the training, validation, and quantization of Convolutional Kolmogorov-Arnold Networks (KANs). Using the PyTorch library with CUDA acceleration, we can unlock the full potential of these networks on datasets such as MNIST, CIFAR-10/100, Tiny ImageNet, and ImageNet1k.

Introducing the Convolutional KAN Layers

Kolmogorov-Arnold Networks (KANs) are built on the Kolmogorov-Arnold representation theorem. Where a traditional Multi-Layer Perceptron (MLP) applies fixed non-linearities on its nodes, a KAN places learnable activation functions on the edges and performs simple summation on the nodes.
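Formally, the theorem states that any continuous multivariate function can be expressed as a superposition of univariate functions and addition:

f(x_1, \ldots, x_n) = \sum_{q=0}^{2n} \Phi_q \left( \sum_{p=1}^{n} \phi_{q,p}(x_p) \right)

In a KAN, the inner functions \phi_{q,p} and outer functions \Phi_q are not fixed but learned, typically as spline parameterizations.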

In a nutshell, a KAN convolution still works like a chef sliding a window (kernel) over the ingredients (input data), but each entry of that kernel is a small learnable univariate function (its own specialized technique) rather than a single fixed weight. This is what makes KAN convolutions distinct from traditional convolution layers.
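To make this concrete, here is a minimal sketch of a single KAN convolution applied to a dummy batch. It assumes the kan_convs package from torch-conv-kan is installed, and the positional argument order mirrors the SimpleConvKAN example below:

import torch
from kan_convs import KANConv2DLayer

# one KAN convolution: 1 input channel -> 8 output channels,
# cubic splines (spline_order=3) over a 3x3 window with 'same' padding
layer = KANConv2DLayer(1, 8, 3, kernel_size=3, groups=1, padding=1, stride=1)

x = torch.randn(4, 1, 28, 28)  # dummy batch of four 28x28 grayscale images
y = layer(x)
print(y.shape)                 # expected: torch.Size([4, 8, 28, 28])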

Prerequisites

To get started, ensure you have the following installed:

  • Python (version 3.9 or higher)
  • PyTorch (a CUDA-enabled build)
  • CUDA Toolkit (compatible with your PyTorch installation)
  • cuDNN (matching your installed CUDA Toolkit)
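Before going further, it can save time to confirm that PyTorch actually sees this stack; here is a quick sanity check using only standard PyTorch calls:

import torch

print(torch.__version__)               # installed PyTorch version
print(torch.version.cuda)              # CUDA version this build was compiled against
print(torch.backends.cudnn.version())  # cuDNN version PyTorch is using
print(torch.cuda.is_available())       # should be True on a working GPU setup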

Usage: Building a Simple Model

Here’s how to create a simple model using KAN convolutions:

import torch
import torch.nn as nn
from kan_convs import KANConv2DLayer

class SimpleConvKAN(nn.Module):
    def __init__(self, layer_sizes, num_classes: int = 10, input_channels: int = 1, spline_order: int = 3, groups: int = 1):
        super(SimpleConvKAN, self).__init__()
        # four KAN convolution blocks; the strided layers halve the spatial
        # resolution, and global average pooling collapses it to 1x1
        self.layers = nn.Sequential(
            KANConv2DLayer(input_channels, layer_sizes[0], spline_order, kernel_size=3, groups=1, padding=1, stride=1),
            KANConv2DLayer(layer_sizes[0], layer_sizes[1], spline_order, kernel_size=3, groups=groups, padding=1, stride=2),
            KANConv2DLayer(layer_sizes[1], layer_sizes[2], spline_order, kernel_size=3, groups=groups, padding=1, stride=2),
            KANConv2DLayer(layer_sizes[2], layer_sizes[3], spline_order, kernel_size=3, groups=groups, padding=1, stride=1),
            nn.AdaptiveAvgPool2d((1, 1))
        )
        self.output = nn.Linear(layer_sizes[3], num_classes)  # classification head
        self.drop = nn.Dropout(p=0.25)                        # regularization before the head

    def forward(self, x):
        x = self.layers(x)       # (B, C, H, W) -> (B, layer_sizes[3], 1, 1)
        x = torch.flatten(x, 1)  # -> (B, layer_sizes[3])
        x = self.drop(x)
        x = self.output(x)       # -> (B, num_classes) logits
        return x

This code stacks KAN convolutions in an nn.Sequential container: the strided layers progressively downsample the feature maps, adaptive average pooling collapses each map to a single value, and a final linear layer produces the class logits.
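Here is a quick smoke test for the model; the channel widths [32, 64, 128, 256] are an illustrative choice, not a prescribed configuration:

model = SimpleConvKAN([32, 64, 128, 256], num_classes=10, input_channels=1)

x = torch.randn(8, 1, 28, 28)  # dummy batch of eight MNIST-sized images
logits = model(x)
print(logits.shape)            # expected: torch.Size([8, 10])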

Running the Training and Testing

To run training and testing for the baseline models on the MNIST, CIFAR-10, or CIFAR-100 datasets, invoke the corresponding script; for MNIST, for example:

python mnist_conv.py

This initiates the training process on your chosen dataset while logging performance metrics.
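If you would rather wire SimpleConvKAN into your own script, here is a minimal sketch of a training epoch on MNIST. The transform constants, optimizer, and hyperparameters are illustrative assumptions, not the repository's exact settings:

import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

device = "cuda" if torch.cuda.is_available() else "cpu"
model = SimpleConvKAN([32, 64, 128, 256]).to(device)

# standard MNIST normalization constants
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.1307,), (0.3081,))])
train_set = datasets.MNIST("data", train=True, download=True, transform=transform)
loader = DataLoader(train_set, batch_size=128, shuffle=True)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)

model.train()
for images, labels in loader:
    images, labels = images.to(device), labels.to(device)
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()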

Accelerate-Based Training

For those looking to enhance efficiency, follow these steps:

  1. Clone the Repository and Install Dependencies:
    git clone https://github.com/IvanDrokin/torch-conv-kan.git
    cd torch-conv-kan
    pip install -r requirements.txt
  2. Set Up Weights and Biases:

    Sign up at Weights & Biases and authenticate your machine (for example, with the wandb login command), then integrate Wandb into your project.

  3. Run Your Training Script:
    accelerate launch cifar.py

    This command starts training on the CIFAR-10 dataset; a sketch of the underlying Accelerate pattern follows this list.
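For orientation, here is a hedged sketch of the core pattern an Accelerate-based script follows. The model, project name, and hyperparameters are illustrative assumptions rather than the exact contents of cifar.py:

from accelerate import Accelerator
import torch
import torch.nn as nn

accelerator = Accelerator(log_with="wandb")        # route metrics to Weights & Biases
accelerator.init_trackers("torch-conv-kan-cifar")  # hypothetical wandb project name

model = SimpleConvKAN([32, 64, 128, 256], input_channels=3)  # CIFAR images have 3 channels
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

# `loader` is assumed to be a CIFAR-10 DataLoader, built as in the MNIST sketch above
model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

model.train()
for step, (images, labels) in enumerate(loader):
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    accelerator.backward(loss)  # replaces loss.backward() on multi-device setups
    optimizer.step()
    accelerator.log({"train_loss": loss.item()}, step=step)

accelerator.end_training()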

Troubleshooting Common Issues

As with any complex system, you might encounter some bumps along the road. Here are a few troubleshooting tips:

  • CUDA Errors: Ensure your PyTorch installation is CUDA-enabled and matches the version of your CUDA Toolkit.
  • Library Dependencies: Make sure all required libraries are properly installed as per the requirements.txt file.
  • Access Permissions: If you encounter permission issues with the Wandb account, double-check your API key and account settings.

For more insights, updates, or to collaborate on AI development projects, stay connected with fxis.ai.

Conclusion

This deep dive into Convolutional KANs showcases their innovative architecture and offers a user-friendly roadmap for training your own models. Explore the world of AI with KAN layers, and remember that each experiment brings us one step closer to robust AI applications.

At fxis.ai, we believe that such advancements are crucial for the future of AI, as they enable more comprehensive and effective solutions. Our team is continually exploring new methodologies to push the envelope in artificial intelligence, ensuring that our clients benefit from the latest technological innovations.
