Active R&D · NSF SBIR Phase I · Medical Imaging

PERSIST

Predicting catastrophic forgetting before retraining.

PERSIST is a topology-driven system that predicts how much a deep learning model will forget when retrained on new data and recommends the best mitigation strategy, with its expected benefit, before a single training step is wasted. The initial validation target is medical imaging AI under the FDA's Predetermined Change Control Plan (PCCP) framework, where every model update has to be justifiable to a regulator.

The problem we're solving

Deep learning models forget. When you fine-tune a model on new data, it tends to lose performance on the data it was trained on before. This is called catastrophic forgetting, and it is a real and unsolved problem for any team that has to ship model updates over time. Self-driving stacks, medical imaging classifiers, foundation-model fine-tunes, and continually learning agents all hit it.

The standard playbook is to retrain, run a benchmark suite, see how bad the regression is, then try a mitigation strategy and retrain again. That cycle burns GPU time and produces noisy results. In regulated settings, like medical imaging under the FDA's PCCP framework, it produces a paper trail that auditors will challenge.

PERSIST is the opposite approach: predict the forgetting and recommend the mitigation before the retrain happens.

The approach

Topology of the loss landscape

We compute persistent homology over slices of the loss landscape around the trained model. The result is a small set of topological features that describe how fragmented or smooth the landscape is around the operating point. These features turn out to carry strong predictive signal about how the model will respond to additional training.
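To make the idea concrete, here is a minimal sketch of persistent homology on a single one-dimensional slice of a loss landscape. It is an illustration under stated assumptions, not the PERSIST pipeline: the page does not say how slices are chosen, which homology dimensions are used, or which library computes them. The sketch assumes a slice sampled at evenly spaced points and computes 0-dimensional sublevel-set persistence in plain Python. The function names and the summary features are hypothetical.

```python
from typing import Dict, List, Tuple

INF = float("inf")


def sublevel_persistence_1d(values: List[float]) -> List[Tuple[float, float]]:
    """0-dimensional sublevel-set persistence of a sampled 1D function.

    Each local minimum starts a basin (birth). When two basins meet at a
    ridge, the shallower one dies there (elder rule). The global minimum
    never dies and is reported with death = inf.
    """
    n = len(values)
    parent = [-1] * n   # -1 means "not yet added to the filtration"
    birth = [0.0] * n   # birth value, valid at component roots

    def find(i: int) -> int:
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    pairs: List[Tuple[float, float]] = []
    # Sweep the loss threshold upward, adding sample points lowest first.
    for i in sorted(range(n), key=lambda k: values[k]):
        parent[i] = i
        birth[i] = values[i]
        for j in (i - 1, i + 1):
            if 0 <= j < n and parent[j] != -1:
                a, b = find(i), find(j)
                if a == b:
                    continue
                # The younger basin (higher birth) dies at this level.
                young, old = (a, b) if birth[a] > birth[b] else (b, a)
                if values[i] > birth[young]:
                    pairs.append((birth[young], values[i]))
                parent[young] = old
    if n:
        pairs.append((min(values), INF))
    return pairs


def fragmentation_features(values: List[float]) -> Dict[str, float]:
    """Hypothetical summary features: more, longer-lived basins = more fragmented."""
    pairs = sublevel_persistence_1d(values)
    lifetimes = [d - b for b, d in pairs if d != INF]
    return {
        "n_basins": len(pairs),
        "total_persistence": sum(lifetimes),
        "max_persistence": max(lifetimes, default=0.0),
    }


# Made-up loss values along one slice, for illustration only.
bumpy_slice = [3.0, 1.0, 2.5, 0.5, 4.0, 2.0, 3.5]
smooth_slice = [4.0, 2.0, 1.0, 2.0, 4.0]
print(fragmentation_features(bumpy_slice))
print(fragmentation_features(smooth_slice))
```

On the bumpy slice the sweep finds three basins, two of which merge into a deeper neighbor; on the smooth slice it finds one. A feature vector built from summaries like these is the kind of input the rest of the approach relies on.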

Forgetting prediction

A trained predictor takes those topological features plus standard model statistics and produces a forecast of expected retention loss under continual training. The point is not to predict perfectly. The point is to produce a calibrated risk estimate that is faster and cheaper than the baseline of running the actual retrain.

Mitigation recommendation

Beyond prediction, PERSIST recommends which mitigation strategy is most likely to help, and by how much. Elastic Weight Consolidation (EWC) regularization, replay, distillation, and parameter-isolation methods are scored against the topology of the model being updated. Early experiments show a strong topological signal: some mitigations help dramatically on fragmented landscapes and barely at all on smooth ones.

FDA PCCP-aligned outputs

Every prediction comes with provenance: model fingerprint, dataset fingerprint, topological feature vector, predicted retention loss, and recommended mitigation with expected benefit. The output is structured for inclusion in a Predetermined Change Control Plan submission.
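One way to picture such a record is as a small immutable structure that serializes to JSON. The sketch below is an assumption about shape only: the field names mirror the list above, but the class name, the SHA-256 fingerprinting helper, and every value shown are hypothetical placeholders, not PERSIST's actual schema or results.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import List


def fingerprint(payload: bytes) -> str:
    """SHA-256 hex digest. The hashing scheme is an assumption for illustration."""
    return hashlib.sha256(payload).hexdigest()


@dataclass(frozen=True)
class PredictionRecord:
    """Hypothetical provenance record; one per prediction."""
    model_fingerprint: str
    dataset_fingerprint: str
    topological_features: List[float]
    predicted_retention_loss: float
    recommended_mitigation: str
    expected_benefit: float

    def to_json(self) -> str:
        # Sorted keys keep the serialized form stable across runs.
        return json.dumps(asdict(self), sort_keys=True, indent=2)


# All values below are placeholders, not measured results.
record = PredictionRecord(
    model_fingerprint=fingerprint(b"placeholder model weights"),
    dataset_fingerprint=fingerprint(b"placeholder dataset manifest"),
    topological_features=[3.0, 3.5, 2.0],
    predicted_retention_loss=0.12,
    recommended_mitigation="replay",
    expected_benefit=0.07,
)
print(record.to_json())
```

Freezing the dataclass and sorting keys are deliberate choices in this sketch: a record that cannot be mutated after creation and always serializes the same way is easier to diff and to attach to an audit trail.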

Where it stands

Phase I-A · Complete

Scale validation on ImageNet-100. Eight architectures spanning ResNet, ConvNeXt, EfficientNet, DenseNet, and ViT families. Topological signal replicates and strengthens at scale. Core findings published in the research codebase.

Phase I-B · In Progress

Cross-dataset forgetting sweep. 114 configurations covering six ordered dataset pairs across 19 architectures. Mixed-effects analysis identifies where topology is load-bearing and where it isn't. Findings are being written up for arXiv.
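The configuration count can be checked with a few lines of arithmetic. This sketch assumes the six ordered pairs are drawn from the three datasets named under "How it's being run" (train on one, then retrain on another), and that the 1,824 step checkpoints reported there are spread evenly across configurations. Neither assumption is stated outright on this page.

```python
from itertools import permutations

# Dataset names from the "How it's being run" section.
DATASETS = ["CIFAR-100", "CUB-200-2011", "NWPU-RESISC-45"]
N_ARCHITECTURES = 19
N_CHECKPOINTS = 1824

# Ordered pairs: (first task, second task), order matters.
ordered_pairs = list(permutations(DATASETS, 2))
n_configs = len(ordered_pairs) * N_ARCHITECTURES

print(len(ordered_pairs))           # 6
print(n_configs)                    # 114
print(N_CHECKPOINTS // n_configs)   # 16, if checkpoints are evenly distributed
```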

NSF SBIR · Phase I Application

The Phase I plan targets medical imaging AI under FDA PCCP requirements. Personnel, budget, NMSU subaward, and academic advisor commitments are locked in. Application materials are in active development.

How it's being run

PERSIST runs on the NMSU Discovery HPC cluster with NVIDIA A100-PCIE-40GB GPUs. The full Phase I-A and Phase I-B sweep covers 19 architectures from 0.3M to 44.7M parameters across three datasets (CIFAR-100, CUB-200-2011, and NWPU-RESISC-45) plus the ImageNet-100 scale-validation set. Every experiment is reproducible from a config file and a seed.

The codebase is open source under the MIT license. Phase I-B raw artifacts include 1,824 step checkpoints across the cross-dataset sweep, available for independent re-analysis.

Where it goes commercially

Primary

Medical imaging

FDA PCCP-eligible classifiers and segmentation models. Every retrain must be predictable and auditable.

Adjacent

Foundation-model fine-tuning

Predicting which fine-tunes will catastrophically regress the base model before paying the GPU bill.

Long-tail

Autonomy + manufacturing vision

Continual-learning stacks where every model swap is a deployment risk.
