AI Image Detector - FFT Noise Analysis

Forensic Dashboard

Panels: Source (pan and zoom) | Power Spectrum (FFT, magma colormap) | Polar Spectrum (unwrapped; X: angle, Y: frequency) | Azimuthal Sum (X: angle, Y: power) | Radial Falloff (X: frequency, Y: power)

Analysis Quick Guide

1. Log Visuals vs. Linear Truth: The spectrum images are log-scaled so that weak components stay visible next to strong ones. Scoring uses linear power, where grid spikes that look faint on a log display can exceed the background by orders of magnitude. This is common practice in signal processing: log scale for display, linear scale for measurement.

2. Why Laplacian? We run a High-Pass filter (3×3 Laplacian kernel) to strip image content. The filter sums to zero, guaranteeing removal of the DC component (average brightness). We're looking for the manufacturing defects of the generator, not the picture itself.

3. Artifact Identification:

Star Field / Grid: A periodic grid of bright points indicates upsampling artifacts from transposed convolutions (the checkerboard effect). Most common in GANs (StyleGAN1/2, ProGAN) and early diffusion models.
Solid Halo / Ring: Usually indicates heavy JPEG compression, aggressive post-processing, or Gaussian blurring.
Natural Falloff: Real photographs typically show irregular, non-periodic frequency distributions without geometric patterns.

4. Model Effectiveness:

Best Results: GAN-generated faces (StyleGAN, ProGAN, BigGAN), AI upscalers, and images with known upsampling artifacts.
Limited Results: Modern diffusion models (SDXL, DALL-E 3, Midjourney v6+) have largely eliminated checkerboard artifacts through alternative upsampling methods.
Not Reliable For: Determining with certainty whether an image is AI-generated (use this as one tool among many).
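
The log-versus-linear point in item 1 can be illustrated with plain arithmetic. The sketch below uses made-up power values and assumes a log10(1 + p) display mapping; the tool's actual display scaling is not specified here.

```python
import math

# Hypothetical linear power values: a typical background bin and a grid spike.
background = 1.0e3
spike = 1.0e6

# Ratio as seen by a linear-power score.
linear_ratio = spike / background

# Ratio as seen on a log-scaled display (assumed mapping: log10(1 + p)).
log_ratio = math.log10(1 + spike) / math.log10(1 + background)

print(f"linear ratio:      {linear_ratio:.0f}x")   # 1000x
print(f"log-display ratio: {log_ratio:.2f}x")      # 2.00x
```

A spike that is a thousand times stronger than its surroundings looks only about twice as bright once log-scaled, which is why the display and the score use different scales.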

Technical Methodology & Signal Processing

1. The Pipeline

Unlike simple metadata checkers, this tool performs frequency domain analysis. The image is processed in five stages:
Input Image -> Blue Channel Extraction -> Laplacian High-Pass -> Hanning Window -> FFT

2. Channel Selection (Blue Channel)

We default to the Blue Channel. In digital sensors using a Bayer filter, color sampling is non-uniform: 50% green, 25% red, 25% blue. Blue shares the lowest sampling density with red and typically has the lowest signal-to-noise ratio of the three. Compression artifacts and generation errors tend to be most prominent in this channel, making it the "canary in the coal mine" for forensic analysis.

3. Edge Detection (Laplacian High-Pass)

A 3×3 Laplacian kernel acts as a second-order derivative filter. Because the kernel sums to zero, it guarantees removal of the DC component (average pixel intensity) and strongly attenuates low frequencies, leaving mostly high-frequency content. This effectively removes the "picture" (faces, objects, scenes) and reveals the underlying "texture" (pixel-level relationships, compression artifacts, generation defects).
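
A minimal sketch of this step, to show the zero-sum property in action. The 4-neighbour coefficients and the clamped border handling are assumptions for illustration; the tool's exact kernel and edge policy are not stated here.

```python
# Assumed 4-neighbour Laplacian; the coefficients sum to zero.
KERNEL = [[0, -1, 0],
          [-1, 4, -1],
          [0, -1, 0]]

def laplacian(img):
    """Filter a 2D list of floats with KERNEL, clamping indices at the borders."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for ky in range(3):
                for kx in range(3):
                    yy = min(max(y + ky - 1, 0), h - 1)  # clamp row index
                    xx = min(max(x + kx - 1, 0), w - 1)  # clamp column index
                    acc += KERNEL[ky][kx] * img[yy][xx]
            out[y][x] = acc
    return out

# A constant image carries only DC, so the filter output is zero everywhere.
flat = [[128.0] * 5 for _ in range(5)]
flat_out = laplacian(flat)

# A single bright pixel is pure fine detail and survives the filter.
spike = [[0.0] * 5 for _ in range(5)]
spike[2][2] = 1.0
spike_out = laplacian(spike)

print(max(abs(v) for row in flat_out for v in row))  # 0.0
print(spike_out[2][2], spike_out[2][1])              # 4.0 -1.0
```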

4. Windowing (Hanning)

Before the FFT, a Hanning (Hann) window is optionally applied to reduce spectral leakage. The FFT treats the image as periodic, so mismatched opposite borders act as sharp edges and show up as spurious high-frequency content. The window tapers the borders smoothly to zero, minimizing these artifacts at the cost of a slight loss of frequency resolution. Disable it to see the boundary artifacts for comparison.
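
The tapering can be sketched as a separable 2D window: one Hann curve per axis, multiplied together. This assumes the symmetric form of the window (endpoints exactly zero); the tool may use the periodic variant instead, and the function names are illustrative.

```python
import math

def hann(n):
    """Symmetric Hann window of length n; the first and last samples are zero."""
    if n == 1:
        return [1.0]
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * i / (n - 1))) for i in range(n)]

def apply_hann_2d(img):
    """Multiply a 2D list by the outer product of a row window and a column window."""
    wy, wx = hann(len(img)), hann(len(img[0]))
    return [[img[y][x] * wy[y] * wx[x] for x in range(len(wx))]
            for y in range(len(wy))]

# A constant image: after windowing, the borders fade to zero, the centre is kept.
ones = [[1.0] * 5 for _ in range(5)]
windowed = apply_hann_2d(ones)
for row in windowed:
    print(" ".join(f"{v:.2f}" for v in row))
```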

5. Frequency Domain (FFT)

We use a Fast Fourier Transform to convert spatial data (pixels) into frequency data (sine wave components). The output is centered (fftshift) so low frequencies appear at the center.

Center of the plot: Low frequencies (gradual gradients, smooth regions)
Outer edges of the plot: High frequencies (sharp edges, noise, fine detail)
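
The centering step is only a reordering of the FFT output. A minimal sketch, assuming the same convention as the common fftshift routines (the DC bin moves from index 0 to index n // 2 on each axis); the function name is illustrative:

```python
def fftshift2d(spec):
    """Rotate rows and columns so the DC bin moves from [0][0] to the centre."""
    h, w = len(spec), len(spec[0])
    sy, sx = h // 2, w // 2
    return [[spec[(y - sy) % h][(x - sx) % w] for x in range(w)]
            for y in range(h)]

# Label each bin with its (row, column) index in the raw FFT layout.
raw = [[(v, u) for u in range(4)] for v in range(4)]
centred = fftshift2d(raw)

print(raw[0][0])      # (0, 0): DC sits in the corner before the shift
print(centred[2][2])  # (0, 0): DC sits in the centre after the shift
```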

Transposed Convolutions & Checkerboard Artifacts:
Many GANs and older generative models use transposed convolutions (also called deconvolution) for upsampling. This operation can create checkerboard patterns in the spatial domain that are too subtle to see but mathematically regular. In the frequency domain, these artifacts appear as a periodic grid or star pattern whose spacing depends on the upsampling factor (a period-2 pattern from 2× upsampling, for example, puts its energy at half the sampling frequency).
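
To make the link between a spatial checkerboard and a spectral peak concrete, the sketch below runs a naive 2D DFT on a tiny synthetic checkerboard. The pattern is a simplified stand-in for real upsampling residue, not output from an actual generator, and the naive transform is used only because the image is 8×8.

```python
import cmath

def dft2_power(img):
    """Naive 2D DFT power spectrum (unshifted); only suitable for tiny images."""
    h, w = len(img), len(img[0])
    power = [[0.0] * w for _ in range(h)]
    for v in range(h):
        for u in range(w):
            acc = 0j
            for y in range(h):
                for x in range(w):
                    angle = -2.0 * cmath.pi * (u * x / w + v * y / h)
                    acc += img[y][x] * cmath.exp(1j * angle)
            power[v][u] = abs(acc) ** 2
    return power

n = 8
# Zero-mean checkerboard with a period of 2 pixels on both axes.
checker = [[1.0 if (x + y) % 2 == 0 else -1.0 for x in range(n)]
           for y in range(n)]

power = dft2_power(checker)
peak_power, peak_u, peak_v = max(
    (power[v][u], u, v) for v in range(n) for u in range(n)
)
total = sum(sum(row) for row in power)

print(peak_u, peak_v)                  # 4 4: half the sampling rate on both axes
print(round(peak_power / total, 6))    # 1.0: all energy sits in that single bin
```

In the unshifted layout the peak lands at bin (n/2, n/2); after centering it moves to the corner of the plot, the highest frequency the image can represent.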

6. FFT Engine: WASM vs JavaScript

The tool supports two FFT implementations with automatic fallback:

WASM (Rust) – Preferred:
• Handles any image dimensions (not limited to powers of 2)
• 3-10x faster through native compiled code and SIMD optimizations
• Uses Rust's FFT library for production-grade accuracy
• Returns centered output directly (no post-shift needed)

JavaScript – Fallback:
• Custom Cooley-Tukey implementation (requires power-of-2 dimensions)
• Upscales images with non-power-of-2 dimensions to the next power of 2 (e.g., 3151×4096 → 4096×4096)
• Pure JS, works in any browser without WASM support
• Separable 2D FFT: rows then columns (O(n² log n) complexity for an n×n image)
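
The fallback path can be sketched in a few lines. This is a Python illustration of the general technique (recursive radix-2 Cooley-Tukey, applied to rows and then columns), not the tool's JavaScript code; all function names are hypothetical.

```python
import cmath

def next_pow2(n):
    """Smallest power of 2 that is greater than or equal to n."""
    return 1 if n <= 1 else 1 << (n - 1).bit_length()

def fft1d(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of 2."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of 2")
    if n == 1:
        return list(x)
    even, odd = fft1d(x[0::2]), fft1d(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def fft2d(img):
    """Separable 2D FFT: transform every row, then every column."""
    rows = [fft1d(row) for row in img]
    cols = [fft1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # transpose back to row-major

print(next_pow2(3151), next_pow2(4096))  # 4096 4096

# A constant 4x4 image has all of its energy in the DC bin.
spectrum = fft2d([[1.0] * 4 for _ in range(4)])
print(abs(spectrum[0][0]))  # 16.0
```

Each 1D transform costs O(n log n), and an n×n image needs 2n of them, which gives the O(n² log n) total.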

7. Grid Prominence Score

The tool calculates a "Grid Prominence Score" by comparing peak power at cardinal angles (0°, 90°, 180°, 270°) against local averages. Higher ratios indicate stronger periodic artifacts characteristic of upsampling. The score is logged to the console but not displayed in the UI—it's used for internal analysis and future improvements.
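
One way such a score could be computed is sketched below. It works on an azimuthal profile (linear power summed per angle bin) and compares the peak near each cardinal angle with the mean of the bins around it. The window sizes, the averaging of the four ratios, and the function name are all assumptions for illustration; the tool's actual formula is not given here.

```python
def grid_prominence(azimuthal, peak_half_width=2, context_half_width=20):
    """Mean ratio of peak power near 0/90/180/270 degrees to nearby background.

    `azimuthal` holds linear power per angle bin and covers the full 360 degrees.
    Both window sizes are illustrative guesses, not the tool's values.
    """
    n = len(azimuthal)
    ratios = []
    for degrees in (0, 90, 180, 270):
        centre = round(degrees * n / 360) % n
        # Strongest bin within a narrow window around the cardinal angle.
        peak = max(azimuthal[(centre + d) % n]
                   for d in range(-peak_half_width, peak_half_width + 1))
        # Local average from the surrounding bins, excluding the peak window.
        ring = [azimuthal[(centre + d) % n]
                for d in range(-context_half_width, context_half_width + 1)
                if abs(d) > peak_half_width]
        background = sum(ring) / len(ring)
        ratios.append(peak / background if background > 0 else 0.0)
    return sum(ratios) / len(ratios)

# Synthetic profiles, one bin per degree.
flat_profile = [1.0] * 360
spiky_profile = list(flat_profile)
for angle in (0, 90, 180, 270):
    spiky_profile[angle] = 50.0

print(grid_prominence(flat_profile))   # 1.0: no angle stands out
print(grid_prominence(spiky_profile))  # 50.0: strong energy at cardinal angles
```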

8. References & Further Reading

• Detecting and Simulating Artifacts in GAN Fake Images (Zhang et al., 2019)
• Checkerboard artifacts free convolutional neural networks (Sugawara, 2019)
• Discrete Fourier Transform in Unmasking Deepfake Images (MDPI, 2024)
• On the Frequency Bias of Generative Models (Schwarz et al., NeurIPS 2021)