


VAAS: Vision-Attention Anomaly Scoring

Model Summary

VAAS (Vision-Attention Anomaly Scoring) is a dual-module vision framework for image anomaly detection and localisation.
It combines global attention-based reasoning with patch-level self-consistency analysis to produce a continuous, interpretable anomaly score alongside dense spatial anomaly maps.

Rather than making binary decisions, VAAS estimates where anomalies occur and how strongly they deviate from learned visual regularities, enabling explainable assessment of image integrity.

The framework is further extended with a cross-attention fusion mechanism that enables global representations to directly guide patch-level anomaly reasoning.




Architecture Overview


VAAS consists of two complementary components:

  • Global Attention Module (Fx)
    A Vision Transformer backbone that captures global semantic and structural irregularities through attention distributions.

  • Patch-Level Module (Px)
    A SegFormer-based segmentation model that identifies local inconsistencies in texture, boundaries, and regions.

The framework is further extended with a cross-attention fusion mechanism, enabling global representations from Fx to guide patch-level anomaly reasoning within Px.

These components are combined via a hybrid scoring mechanism:

  • S_F: Global attention fidelity score
  • S_P: Patch-level plausibility score
  • S_H: Final hybrid anomaly score

S_H provides a continuous measure of anomaly intensity rather than a binary decision.
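As a rough illustration of how a hybrid score of this kind can be formed, the sketch below blends the two component scores with a weight `alpha`. The convex combination, and the assumption that both scores lie in [0, 1], are illustrative guesses rather than the formula VAAS is confirmed to use. The name `alpha` mirrors the pipeline argument shown under Usage; `hybrid_score` is a hypothetical helper and not part of the `vaas` package.

```python
def hybrid_score(s_f: float, s_p: float, alpha: float = 0.5) -> float:
    """Blend a global score (s_f) and a patch-level score (s_p).

    ASSUMPTION: a simple convex combination weighted by alpha.
    The actual VAAS scoring formula may differ.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * s_f + (1.0 - alpha) * s_p


if __name__ == "__main__":
    # Equal weighting of the two component scores.
    print(hybrid_score(0.8, 0.4, alpha=0.5))
```

Under this assumption, `alpha=1.0` would rely only on the global score and `alpha=0.0` only on the patch-level score.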


Installation

VAAS is distributed as a lightweight inference library and installs with a single pip command.

PyTorch is only required when executing inference or loading pretrained VAAS models.
This allows users to inspect, install, and integrate VAAS without heavy dependencies.


1. Install PyTorch

To run inference or load pretrained VAAS models, install PyTorch and torchvision for your system (CPU or GPU).
Follow the official PyTorch installation guide:

https://pytorch.org/get-started/locally/

Quick installation (CPU)

pip install torch torchvision

2. Install VAAS

pip install vaas

VAAS will automatically detect PyTorch at runtime and raise a clear error if it is missing.
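The runtime check described above can be pictured with a small sketch. This is not the code used inside VAAS; `require_module` is a hypothetical helper that only illustrates one common way to test for an optional dependency without importing it, using the standard library's `importlib.util.find_spec`.

```python
import importlib.util


def require_module(name: str, install_hint: str = "") -> None:
    """Raise a descriptive ImportError if a top-level module cannot be found.

    find_spec locates the module without importing it, so the check is cheap
    even for heavy packages such as torch.
    """
    if importlib.util.find_spec(name) is None:
        message = f"'{name}' is required for this operation but is not installed."
        if install_hint:
            message += f" Install it with: {install_hint}"
        raise ImportError(message)


if __name__ == "__main__":
    try:
        require_module("torch", install_hint="pip install torch torchvision")
        print("PyTorch found")
    except ImportError as err:
        print(err)
```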


Usage


VAAS can also be tried on Google Colab with no local setup.


1. Quick start: run VAAS and get a visual result

from vaas.inference.pipeline import VAASPipeline
from PIL import Image
import requests
from io import BytesIO

pipeline = VAASPipeline.from_pretrained(
    repo_id="OBA-Research/vaas",
    device="cpu",
    alpha=0.5,
    model_variant="v2-base-df2023"  # v2-medium-df2023 and v2-large-df2023 are also available
)

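url = "https://raw.githubusercontent.com/OBA-Research/VAAS/main/examples/images/COCO_DF_C110B00000_00539519.jpg"
response = requests.get(url, timeout=30)
response.raise_for_status()  # fail early if the download did not succeed
image = Image.open(BytesIO(response.content)).convert("RGB")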

pipeline.visualize(
    image=image,
    save_path="vaas_visualization.png",
    mode="all",
    threshold=0.5,
)

2. Programmatic inference (scores + anomaly map)

result = pipeline(image)

print(result)
anomaly_map = result["anomaly_map"]

Output format

{
  "S_F": float,
  "S_P": float,
  "S_H": float,
  "anomaly_map": ndarray
}
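One way to work with the returned map is to threshold it into a binary mask and summarise how much of the image is flagged. The sketch below uses a synthetic array in place of `result["anomaly_map"]`. It assumes the map is a 2-D float array with values in [0, 1], which this card does not state explicitly; `summarize_anomaly_map` is a hypothetical helper and not part of the `vaas` package.

```python
import numpy as np


def summarize_anomaly_map(anomaly_map: np.ndarray, threshold: float = 0.5) -> dict:
    """Threshold an anomaly map and report simple statistics.

    ASSUMPTION: anomaly_map holds per-pixel scores in [0, 1].
    """
    mask = anomaly_map >= threshold  # boolean mask of flagged pixels
    return {
        "mask": mask,
        "flagged_fraction": float(mask.mean()),  # share of pixels at or above threshold
        "peak": float(anomaly_map.max()),  # strongest response in the map
    }


if __name__ == "__main__":
    # Synthetic stand-in for result["anomaly_map"].
    demo_map = np.array([[0.1, 0.9], [0.6, 0.2]])
    summary = summarize_anomaly_map(demo_map, threshold=0.5)
    print(summary["flagged_fraction"], summary["peak"])
```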

Model Variants

v2 (Cross-Attention VAAS)

| Model | Training Data | Description | Hugging Face Model |
| --- | --- | --- | --- |
| vaas-v2-base-df2023 | DF2023 (10%) | Lightweight inference with cross-attention fusion | https://huggingface.co/OBA-Research/vaas/tree/v2-base-df2023 |
| vaas-v2-medium-df2023 | DF2023 (≈50%) | Balanced anomaly reasoning with improved localisation | https://huggingface.co/OBA-Research/vaas/tree/v2-medium-df2023 |
| vaas-v2-large-df2023 | DF2023 (100%) | Full-scale training with highest sensitivity and interpretability | https://huggingface.co/OBA-Research/vaas/tree/v2-large-df2023 |

v1 (Legacy)

| Model | Training Data | Description | Hugging Face Model |
| --- | --- | --- | --- |
| vaas-v1-base-df2023 | DF2023 (10%) | Initial public inference release | https://huggingface.co/OBA-Research/vaas |
| vaas-v1-medium-df2023 | DF2023 (≈50%) | Scale-up experiment | https://huggingface.co/OBA-Research/vaas/tree/v1-medium-df2023 |
| vaas-v1-large-df2023 | DF2023 (100%) | Full-dataset training | https://huggingface.co/OBA-Research/vaas/tree/v1-large-df2023 |

Notes

  • VAAS supports both local and online images
  • PyTorch is loaded lazily and only required at runtime
  • CPU inference is supported; GPU accelerates execution but is optional
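For callers who want to normalise local and online inputs themselves before passing a PIL image to the pipeline, a helper along these lines may be convenient. It is a sketch only: `is_remote` and `load_image` are hypothetical names, not part of the `vaas` package, and the pipeline's own input handling may already cover both cases.

```python
from io import BytesIO
from urllib.parse import urlparse


def is_remote(source: str) -> bool:
    """Return True when the source looks like an http(s) URL."""
    return urlparse(source).scheme in ("http", "https")


def load_image(source: str):
    """Load an RGB PIL image from a local path or an http(s) URL."""
    from PIL import Image  # imported lazily so is_remote works without Pillow

    if is_remote(source):
        import requests  # only needed for online images

        response = requests.get(source, timeout=30)
        response.raise_for_status()
        return Image.open(BytesIO(response.content)).convert("RGB")
    return Image.open(source).convert("RGB")


if __name__ == "__main__":
    print(is_remote("https://example.com/image.jpg"))
    print(is_remote("examples/images/sample.jpg"))
```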

Intended Use

  • Image anomaly detection
  • Visual integrity assessment
  • Explainable inspection of irregular regions
  • Research on attention-based anomaly scoring
  • Prototyping anomaly-aware vision systems

Limitations

  • Trained on a single dataset (DF2023)
  • Does not classify anomaly types
  • Performance may degrade on out-of-distribution imagery

Ethical Considerations

VAAS is intended for research and inspection purposes.
It should not be used as a standalone decision-making system in high-stakes or sensitive applications without human oversight.


Citation

If you use VAAS, please cite both the software and the associated paper.

@software{vaas,
  title        = {VAAS: Vision-Attention Anomaly Scoring},
  author       = {Bamigbade, Opeyemi and Scanlon, Mark and Sheppard, John},
  year         = {2025},
  publisher    = {Zenodo},
  doi          = {10.5281/zenodo.18064355},
  url          = {https://doi.org/10.5281/zenodo.18064355}
}
@article{BAMIGBADE2026302063,
title = {VAAS: Vision-Attention Anomaly Scoring for image manipulation detection in digital forensics},
journal = {Forensic Science International: Digital Investigation},
volume = {56},
pages = {302063},
year = {2026},
note = {DFRWS EU 2026 - Selected Papers from the 13th Annual Digital Forensics Research Conference Europe},
issn = {2666-2817},
doi = {https://doi.org/10.1016/j.fsidi.2026.302063},
url = {https://www.sciencedirect.com/science/article/pii/S266628172600020X},
author = {Opeyemi Bamigbade and Mark Scanlon and John Sheppard},
keywords = {Digital forensics, Image manipulation detection, Tamper localisation, Explainable AI, Vision transformers, Segmentation, Attention mechanisms, Anomaly scoring},
abstract = {Recent advances in AI-driven image generation have introduced new challenges for verifying the authenticity of digital evidence in forensic investigations. Modern generative models can produce visually consistent forgeries that evade traditional detectors based on pixel or compression artefacts. Most existing approaches also lack an explicit measure of anomaly intensity, which limits their ability to quantify the severity of manipulation. This paper introduces Vision-Attention Anomaly Scoring (VAAS), a novel dual-module framework that integrates global attention-based anomaly estimation using Vision Transformers (ViT) with patch-level self-consistency scoring derived from segmentation embeddings. The hybrid formulation provides a continuous and interpretable anomaly score that reflects both the location and degree of manipulation. Evaluations on the DF2023 and CASIA v2.0 datasets demonstrate that VAAS achieves competitive F1 and IoU performance, while enhancing visual explainability through attention-guided anomaly maps. The framework bridges quantitative detection with human-understandable reasoning, supporting transparent and reliable image integrity assessment. The source code for all experiments and corresponding materials for reproducing the results are available open source.}
}

Contributing

We welcome contributions that improve the usability, robustness, and extensibility of VAAS.

See: https://github.com/OBA-Research/VAAS/blob/main/CONTRIBUTING.md


License

MIT License


Maintainers

OBA-Research
