`dataeval` Feasibility Tutorial¶

A guide to running the dataeval feasibility tools via checkmaite.

NOTE: The dataeval package can normally be used in the checkmaite framework for both image classification (IC) and object detection (OD) tasks. However, feasibility can only be used for IC tasks until checkmaite has upgraded to MAITE v0.8. As such, this tutorial will only cover the IC scenario.

What is `dataeval`?¶

The dataeval package analyzes datasets and models to give users the ability to train and test performant, unbiased, and reliable AI models and monitor data for impactful shifts to deployed models.

The tools demonstrated in this tutorial are a subset of the larger dataeval framework. They focus on estimating the maximum achievable performance of a model on a dataset. Put another way, the feasibility tools attempt to measure the irreducible error of a dataset: variability that is inherently random and cannot be explained by any predictive model. This quantity is of interest because it tells an engineer how difficult a problem inherently is. If operational requirements demand more performance than this difficulty allows, the problem must be changed in order to become feasible.
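
To make the idea of irreducible error concrete, here is a minimal sketch of a case where the Bayes error rate can be computed exactly: two equally likely classes whose single feature is normally distributed with the same spread but different means. The distributions, parameter values, and function names are illustrative assumptions and are not part of dataeval.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def bayes_error_two_gaussians(mu0: float, mu1: float, sigma: float) -> float:
    """Exact Bayes error for two equally likely 1-D Gaussian classes
    that share the standard deviation `sigma`.

    The optimal decision boundary is the midpoint between the two means, so
    each class is misclassified with probability Phi(-d / 2), where d is the
    distance between the means measured in standard deviations.
    """
    d = abs(mu1 - mu0) / sigma
    return normal_cdf(-d / 2.0)


ber = bayes_error_two_gaussians(mu0=0.0, mu1=2.0, sigma=1.0)
max_accuracy = 1.0 - ber
print(f"Bayes error rate: {ber:.4f}")  # about 0.1587
print(f"Best achievable accuracy: {max_accuracy:.4f}")  # about 0.8413
```

In this toy problem no classifier, however well trained, can exceed roughly 84% accuracy, because the two class distributions overlap.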

There are currently only two feasibility tools in dataeval:

- the Bayes Error Rate (BER) for IC calculations
- the Upper-Bound Average Precision (UAP) for OD calculations

The dataeval feasibility tools are generally applied to an entire dataset. Their computational demands are low to moderate, and they run on CPU.
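
When the class distributions are unknown, the BER has to be estimated from samples. One classical approach, sketched below, takes the leave-one-out error of a 1-nearest-neighbour classifier and uses the Cover-Hart inequality to bracket the BER of a two-class problem. This illustrates the general idea only and is not presented as dataeval's implementation. The function names and toy data are hypothetical, and the bounds hold asymptotically, so on small samples they are rough estimates.

```python
import math
from typing import Sequence


def loo_1nn_error(points: Sequence[Sequence[float]], labels: Sequence[int]) -> float:
    """Leave-one-out error rate of a 1-nearest-neighbour classifier."""
    errors = 0
    for i, x in enumerate(points):
        # Find the closest *other* point and compare its label with ours.
        nearest = min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(x, points[j]),
        )
        errors += labels[nearest] != labels[i]
    return errors / len(points)


def ber_bounds_binary(nn_error: float) -> tuple[float, float]:
    """Bracket the two-class BER (R*) using the Cover-Hart inequality
    R* <= R_1NN <= 2 R* (1 - R*), which holds asymptotically."""
    upper = min(nn_error, 0.5)
    lower = (1.0 - math.sqrt(max(0.0, 1.0 - 2.0 * nn_error))) / 2.0
    return lower, upper


# Toy 1-D data: the classes overlap near x = 1.0.
points = [(0.0,), (0.2,), (0.4,), (1.0,), (5.0,), (5.2,), (5.4,), (0.9,)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

nn_error = loo_1nn_error(points, labels)
lower, upper = ber_bounds_binary(nn_error)
print(f"1-NN leave-one-out error: {nn_error:.3f}")
print(f"Estimated BER range: [{lower:.3f}, {upper:.3f}]")
```

The brute-force neighbour search is quadratic in the number of samples. That is fine for a sketch, but a real dataset would call for an indexed nearest-neighbour search.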

Running the `dataeval` feasibility algorithms inside `checkmaite`¶

The following section uses the checkmaite API to run the dataeval feasibility test stage for Image Classification.

First, we create some helpers that generate a small YOLO-style IC dataset for demonstration purposes.

In [1]:

import os
import tempfile
from contextlib import contextmanager
from pathlib import Path
from typing import Iterable, Tuple, Iterator

from PIL import Image

CLASSES: list[str] = ["cat", "dog"]
NUM_IMAGES_PER_CLASS: int = 4
IMG_SHAPE: Tuple[int, int] = (64, 128)  # (H, W)


def create_fake_yolo_dataset(
    root_dir: Path,
    split: str,
    classes: Iterable[str],
    num_images_per_class: int,
    image_shape: Tuple[int, int],
) -> None:
    """Populate *root_dir/split/* with sub-dirs and RGB images."""
    root_dir = Path(root_dir)
    (root_dir / split).mkdir(parents=True, exist_ok=True)

    for class_name in classes:
        class_dir = root_dir / split / class_name
        class_dir.mkdir(parents=True, exist_ok=True)

        for i in range(num_images_per_class):
            # PIL expects size as (width, height), so reverse the (H, W) shape.
            img = Image.new("RGB", image_shape[::-1], color=(i, i, i))
            img.save(class_dir / f"{i}_{class_name}.jpg")


@contextmanager
def fake_dataset(
    *,
    split: str = "test",
    classes: Iterable[str] = CLASSES,
    num_images_per_class: int = NUM_IMAGES_PER_CLASS,
    image_shape: Tuple[int, int] = IMG_SHAPE,
) -> Iterator[tuple[Path, list[str], int, Tuple[int, int]]]:
    """
    Context-manager that yields a temporary dataset root and
    removes it automatically afterwards.

    Example
    -------
    >>> with fake_dataset() as (root, classes, n_per_cls, img_shape):
    ...     print(root)
    ...     # use files under `root / 'test' / <class_name>` ...
    """
    with tempfile.TemporaryDirectory() as tmp:
        dataset_root = Path(tmp)
        create_fake_yolo_dataset(
            root_dir=dataset_root,
            split=split,
            classes=classes,
            num_images_per_class=num_images_per_class,
            image_shape=image_shape,
        )
        yield dataset_root, list(classes), num_images_per_class, image_shape
        # directory is automatically cleaned up on exit
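
The helpers above produce a `root/split/<class_name>/<image>` layout, in which the directory name carries the label. As a rough illustration of how such a layout maps to labelled samples, the sketch below indexes a small placeholder tree. The `index_classification_dir` helper is hypothetical and is not how YoloClassificationDataset is implemented. Empty files stand in for real images.

```python
import tempfile
from pathlib import Path


def index_classification_dir(
    root: Path, split: str
) -> tuple[list[str], list[tuple[Path, int]]]:
    """Return sorted class names and (image_path, class_index) pairs
    for a `root/split/<class_name>/<image>` layout."""
    class_names = sorted(p.name for p in (root / split).iterdir() if p.is_dir())
    samples = [
        (img, idx)
        for idx, name in enumerate(class_names)
        for img in sorted((root / split / name).glob("*.jpg"))
    ]
    return class_names, samples


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("cat", "dog"):
        (root / "test" / name).mkdir(parents=True)
        for i in range(2):
            # Empty placeholder files; only the layout matters here.
            (root / "test" / name / f"{i}_{name}.jpg").touch()

    class_names, samples = index_classification_dir(root, "test")
    print(class_names)
    print([(path.name, label) for path, label in samples])
```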

We then create the necessary MAITE-wrapped dataset, initialize a DataevalFeasibility capability, and execute the test stage by passing the wrapped dataset to its run method. The performance threshold is supplied afterwards, when the report is generated. If the performance threshold exceeds the best performance that the estimated BER allows, the problem is said to be infeasible.

In [2]:

from pathlib import Path
from checkmaite.core.image_classification import DataevalFeasibility
from checkmaite.core.image_classification.dataset_loaders import YoloClassificationDataset

with fake_dataset() as (root, _, _, _):
    dataset = YoloClassificationDataset(root_dir=root, dataset_id="test")
    capability = DataevalFeasibility()
    output = capability.run(use_cache=False, datasets=[dataset])
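
The feasibility decision can be pictured as a simple comparison. The sketch below assumes the performance threshold is a required accuracy and that the BER is an error rate between 0 and 1, so the best achievable accuracy is 1 - BER. The `is_feasible` helper is hypothetical. This tutorial does not show how checkmaite performs the comparison internally.

```python
def is_feasible(required_accuracy: float, ber: float) -> bool:
    """Hypothetical helper: a requirement is feasible only if it does not
    exceed the best accuracy the BER allows, which is 1 - BER."""
    if not 0.0 <= ber <= 1.0:
        raise ValueError("ber must lie in [0, 1]")
    return required_accuracy <= 1.0 - ber


print(is_feasible(required_accuracy=0.6, ber=0.25))  # True: 0.6 <= 0.75
print(is_feasible(required_accuracy=0.9, ber=0.25))  # False: 0.9 > 0.75
```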

Slide Deck¶

Once the test stage has completed, the code below writes a Markdown report of the dataeval feasibility results to an output directory. The threshold argument passed to collect_md_report (0.6 here) sets the performance threshold used in the report.

In [3]:

import os
from checkmaite.core.report._markdown import create_markdown_output

output_dir = Path("dataeval_feasibility_example_output")
os.makedirs(output_dir, exist_ok=True)

create_markdown_output(output.collect_md_report(threshold=0.6), output_dir, md_filename='Dataeval_Feasibility_Example_Report.md')
print(f"Markdown report saved in {output_dir}.")

Markdown report saved in dataeval_feasibility_example_output.
