RESEARCH PROGRAMS
Investigating Fundamental Obstacles to Continual Learning in Neural Networks
Four research programs investigating fundamental obstacles to continual learning. Our flagship experiment (EXP-01 PERSIST) has completed a preliminary proof of concept across 19 architectures and 3 datasets, demonstrating that loss landscape topology predicts mitigation benefit at small scale. Phase I will test whether these signals survive at production scale (100M-7B+ parameters), requiring supercomputer resources and novel distributed persistent homology algorithms.
Project PERSIST
Plasticity-Enabled Retention through Structured Information Synthesis over Time
Problem
Catastrophic forgetting (McCloskey & Cohen, 1989) prevents neural networks from learning sequential tasks without destroying previously acquired knowledge. Current mitigations (EWC, replay buffers, progressive networks) reduce but do not eliminate interference. The underlying geometric mechanism remains poorly understood.
Hypothesis
The persistence of learned knowledge under sequential task training is predictable from the topological features of the loss landscape around converged weight configurations. Architectures whose landscapes exhibit richer topological structure (measured via persistent homology H₁) are more resistant to catastrophic forgetting, independent of model size.
Methodology
Datasets (3)
CIFAR-100 (50/50 class split), CUB-200-2011 (fine-grained birds, 100/100), NWPU-RESISC45 (satellite scenes, 23/22). All resized to 32x32 for cross-architecture consistency.
Architectures (19)
14 diverse architectures (CNNs, ViTs, MLP-Mixer) plus a WRN-28-k width ladder (k=1,2,4,6,8,10) to isolate scale from topology. Range: 0.3M to 44.7M parameters.
Topological Analysis
50x50 loss landscape grid along filter-normalized random directions. 5 independent slices per architecture. Persistent homology via both Ripser (graph-based) and GUDHI (cubical complexes).
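To make the H₀ side of this analysis concrete, the sketch below computes 0-dimensional persistence of a sublevel-set filtration on a toy loss grid using a union-find sweep: cells are added in order of increasing loss, and when two components meet, the younger one (higher birth value) dies at the merge value. This is a stdlib-only teaching example, not the lab's pipeline, which uses Ripser and GUDHI as noted above.

```python
# Toy sketch: H0 persistence of a sublevel-set filtration on a 2D loss grid.
# Illustrative only; the actual pipeline uses Ripser / GUDHI cubical complexes.

def h0_persistence(grid):
    """(birth, death) pairs for connected components of the sublevel-set
    filtration of a 2D grid; the global minimum's component never dies."""
    rows, cols = len(grid), len(grid[0])
    order = sorted((grid[r][c], r, c) for r in range(rows) for c in range(cols))
    parent, birth = {}, {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    pairs = []
    for value, r, c in order:
        cell = (r, c)
        parent[cell] = cell
        birth[cell] = value
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nb = (r + dr, c + dc)
            if nb in parent:
                ra, rb = find(cell), find(nb)
                if ra == rb:
                    continue
                if birth[ra] > birth[rb]:   # elder rule: younger root dies
                    ra, rb = rb, ra
                if birth[rb] < value:       # skip zero-persistence pairs
                    pairs.append((birth[rb], value))
                parent[rb] = ra
    pairs.append((order[0][0], float("inf")))  # essential component
    return sorted(pairs)

# Two basins: global minimum at 0.5, secondary basin at 1.0, saddle at 2.5.
loss = [[3.0, 2.0, 3.0],
        [2.0, 0.5, 2.5],
        [3.0, 2.5, 1.0]]
print(h0_persistence(loss))  # [(0.5, inf), (1.0, 2.5)]
```

The secondary basin's (birth, death) pair records exactly the basin-depth information that the H₀ barcode summarizes over the full 50x50 grids.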
Results (19 Architectures, 3 Datasets Complete)
Headline numbers:
- CUB-200 permutation p: 0.037 (suggestive; does not survive Bonferroni across 3 datasets)
- Phase 6 pooled interaction p: 0.046 (n=57, clustered bootstrap, EWC moderation)
- Params-only (CUB): -0.92
- +Topology (CUB): 0.34
- Params (CIFAR): -0.76
On CIFAR-100 (n=19), parameter count dominates (rho = -0.76, p = 0.0002, survives Bonferroni) and topology adds no predictive value. On CUB-200 (n=19, fine-grained birds), topology rescues prediction (permutation p = 0.037), though this does not survive Bonferroni correction across the 3 datasets. On RESISC-45 (n=19, satellite scenes), topology does not help predict forgetting (p = 0.566); however, H₀ strongly predicts EWC benefit there (rho = 0.86, p = 2.4e-6). The Phase 6 pooled interaction analysis (n=57, clustered bootstrap) formally confirms dataset moderation: H₀ predicts EWC benefit on CIFAR-100 and RESISC-45 (CIs excluding zero, permutation p = 0.046) but not on CUB-200.
All 57 of 57 configurations are complete across the 3 datasets. Most stable signal: H₀ predicts EWC benefit (CIFAR-100 rho = 0.76, RESISC-45 rho = 0.86).
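For readers unfamiliar with the permutation p-values quoted above, here is a minimal stdlib-only sketch of a two-sided permutation test on Spearman rho. It is a toy: the real analysis additionally involves clustered bootstrap CIs and Bonferroni correction, which this sketch omits.

```python
# Sketch of a two-sided permutation test for Spearman correlation.
# Toy illustration only; omits clustering and multiple-comparison correction.
import random

def ranks(xs):
    """Rank values (1-based), averaging ranks over ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    rk = [0.0] * len(xs)
    i = 0
    while i < len(xs):
        j = i
        while j + 1 < len(xs) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tied block
        for k in range(i, j + 1):
            rk[order[k]] = avg
        i = j + 1
    return rk

def spearman(x, y):
    """Spearman rho = Pearson correlation of the rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx) ** 0.5
    vy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (vx * vy)

def perm_pvalue(x, y, n_perm=5000, seed=42):
    """Two-sided p-value for rho != 0 by shuffling y; +1 smoothing."""
    rng = random.Random(seed)
    observed = spearman(x, y)
    y_shuf = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_shuf)
        if abs(spearman(x, y_shuf)) >= abs(observed):
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

obs, p = perm_pvalue(list(range(8)), [1, 0, 3, 2, 5, 4, 7, 6])
print(round(obs, 4), p)
```

With n=19 architectures per dataset, this kind of label-shuffling null is what the quoted permutation p-values are drawn from.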
PERSIST References
- McCloskey & Cohen (1989). Catastrophic interference in connectionist networks. Psych. of Learning and Motivation.
- Kirkpatrick et al. (2017). Overcoming catastrophic forgetting. PNAS, 114(13).
- Li et al. (2018). Visualizing the loss landscape of neural nets. NeurIPS.
- Maria et al. (2014). The GUDHI Library: simplicial complexes and persistent homology.
- Bauer (2021). Ripser: efficient computation of Vietoris-Rips persistence barcodes. JOSS.
Publication target: NeurIPS / ICML (Continual Learning track)
Project DRIFT
Degradation Regimes In Iterated Field Transformations
Focus
Investigates behavioral uncertainty in quantum system state evolution under repeated manipulation, focusing on stability degradation under variation in operator ordering and diversity, in regimes where closed-form analytical prediction is not feasible across all configurations.
Research Themes
- State distribution dynamics under iteration
- Operator ordering effects on error profiles
- Operator diversity as experimental variable
- Predictability boundaries and stability thresholds
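The ordering theme has a two-line seed in linear algebra: non-commuting operators applied in different orders compose to different transformations, which is where Trotter-type ordering error originates. A stdlib-only toy with Pauli matrices (not DRIFT's experimental code):

```python
# Tiny illustration of why operator ordering matters in iterated
# transformations: Pauli X and Z do not commute, so the same operator set
# applied in a different order yields a different composite map.

def matmul(a, b):
    """2x2 matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

X = [[0, 1], [1, 0]]    # Pauli X (bit flip)
Z = [[1, 0], [0, -1]]   # Pauli Z (phase flip)

XZ = matmul(X, Z)
ZX = matmul(Z, X)
print(XZ)  # [[0, -1], [1, 0]]
print(ZX)  # [[0, 1], [-1, 0]]
```

Here XZ = -ZX; over long iterated sequences, such ordering differences accumulate into the error profiles DRIFT studies.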
DRIFT References
- Tranter et al. (2019). Ordering and Trotter error in quantum simulation.
- Nakamura & Ankerhold (2024). Non-Markovian effects in iterated quantum channels.
- Huang et al. (2024). Gate diversity as a design axis in quantum circuits.
- Kwon et al. (2021). Gate-based quantum computing review.
Project Φ
Systematic Survey of Integrated Information in Neural Network Architectures
Objective
Compute integrated information (Φ*) — a scalar measure of how much a system is “more than the sum of its parts” (Tononi, 2004) — across major deep learning architecture families. Test whether Φ* correlates with generalization, transfer learning, and robustness. No systematic Φ* survey across modern deep learning architectures has been published.
Methodology
Φ* approximation adapted for neural networks using the KSG mutual information estimator (Kraskov et al., 2004). Greedy bipartition search for the minimum information partition. Validated independently using Perturbational Complexity Index (Casali et al., 2013).
Architecture survey: MLPs, CNNs (ResNet), RNNs (LSTM, GRU), Transformers (GPT-2, ViT), Graph Networks (GCN, GAT). Φ* measured at 5 training checkpoints per architecture.
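The greedy minimum-information-partition search can be sketched independently of the MI estimator: given any score function over bipartitions, move one unit at a time whenever the move lowers the score. The `cut_score` toy below is a stand-in for the KSG-estimated effective information and is purely illustrative.

```python
# Hedged sketch of a greedy minimum-information-partition (MIP) search.
# `score` is a caller-supplied function over bipartitions; in the real
# pipeline it would be a KSG-based effective-information estimate.

def greedy_mip(units, score):
    """Greedy search for the bipartition of `units` minimizing score(a, b):
    keep moving single units across the cut while the score improves."""
    units = list(units)
    part_a = set(units[: len(units) // 2])  # simple initial split
    part_b = set(units[len(units) // 2:])
    best = score(frozenset(part_a), frozenset(part_b))
    improved = True
    while improved:
        improved = False
        for u in units:
            src, dst = (part_a, part_b) if u in part_a else (part_b, part_a)
            if len(src) == 1:
                continue  # keep both sides non-empty
            src.remove(u)
            dst.add(u)
            s = score(frozenset(part_a), frozenset(part_b))
            if s < best:
                best, improved = s, True
            else:
                dst.remove(u)
                src.add(u)  # revert the move
    return (frozenset(part_a), frozenset(part_b)), best

# Toy score: how many tightly coupled pairs the cut severs.
coupled = [frozenset({0, 1}), frozenset({2, 3})]
def cut_score(a, b):
    return sum(1 for pair in coupled if pair & a and pair & b)

(pa, pb), s = greedy_mip([0, 2, 1, 3], cut_score)
print(sorted(sorted(p) for p in (pa, pb)), s)  # [[0, 1], [2, 3]] 0
```

Greedy search is a heuristic: it avoids the exponential cost of exhaustive bipartition enumeration but can stop at a local minimum, which is a known trade-off in practical Φ* approximations.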
Connection to PERSIST
If topological depth (PERSIST) predicts forgetting resistance, does information integration (Φ*) also predict it? Networks with higher integrated information may create deeper topological features because integration requires complex, multi-scale structure in the loss landscape. QUANTA serves as a research instrument for interactive exploration of these measurements.
PHI References
- Tononi (2004). An information integration theory of consciousness. BMC Neuroscience.
- Oizumi et al. (2014). From phenomenology to mechanisms of consciousness: IIT 3.0. PLoS Comp. Bio.
- Barrett & Seth (2011). Practical measures of integrated information. PLoS Comp. Bio.
- Casali et al. (2013). Perturbational complexity index. Science Translational Medicine.
Publication target: Nature Machine Intelligence / ICLR / Neuroscience of Consciousness
Note on broader theoretical context
IIT's Φ metric has implications beyond computational systems. The relationship between integrated information and quantum measurement — whether systems with measurable Φ interact with quantum states differently — is an open question in the foundations of physics. Relevant work includes Von Neumann (1932) on quantum measurement, Wigner (1961) on consciousness and wave function collapse, and experimental investigations by Radin et al. (2012, 2016) and Nelson (2001). These connections inform our long-term research direction but are not the focus of EXP-02.
Project GENESIS
Information Capacity Scaling Laws in Neural Networks
Hypothesis
Neural network information capacity follows an area law — proportional to boundary parameters (input/output interface) — rather than a volume law proportional to total parameter count. This would constitute a computational analog of the Bekenstein bound (Bekenstein, 1973), which establishes that maximum entropy in a physical region is proportional to surface area, not volume.
Methodology
Memorization capacity measurement (Zhang et al., 2017) across 15+ architecture configurations varying depth/width ratios. Power-law fitting on log-log axes: C ~ V^α vs. C ~ A^β. Bayesian model comparison (BIC) to determine which scaling relationship is statistically preferred. Decisive test: vary depth at constant width — if capacity saturates, the area law is supported.
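The fitting-and-comparison step admits a compact sketch: ordinary least squares on log-log axes for each candidate law, then BIC on the residuals. Everything below (data, constants, names) is synthetic and illustrative; capacities are drawn from an exact area law plus mild noise so the comparison has a known answer.

```python
# Sketch of the log-log power-law fit and BIC comparison described above.
# Synthetic data only, not the lab's measurements.
import math

def ols_loglog(xs, ys):
    """Least-squares fit of log y = a + b * log x; returns (a, b, rss)."""
    lx, ly = [math.log(v) for v in xs], [math.log(v) for v in ys]
    n = len(lx)
    mx, my = sum(lx) / n, sum(ly) / n
    b = (sum((u - mx) * (v - my) for u, v in zip(lx, ly))
         / sum((u - mx) ** 2 for u in lx))
    a = my - b * mx
    rss = sum((v - (a + b * u)) ** 2 for u, v in zip(lx, ly))
    return a, b, rss

def bic(rss, n, k=2):
    """Gaussian-error BIC up to an additive constant."""
    return n * math.log(rss / n) + k * math.log(n)

# Synthetic configurations: boundary size A and depth; volume V = A * depth.
# Capacities follow an exact area law (C = 2 * A) times mild noise.
A = [64, 128, 256, 64, 128, 256]
depth = [2, 2, 2, 8, 8, 8]
V = [a * d for a, d in zip(A, depth)]
noise = [1.03, 0.97, 1.02, 0.98, 1.01, 0.99]
C = [2 * a * e for a, e in zip(A, noise)]

_, b_vol, rss_vol = ols_loglog(V, C)
_, b_area, rss_area = ols_loglog(A, C)
n = len(C)
print("area exponent ~", round(b_area, 3))
print("area BIC:", round(bic(rss_area, n), 1), " volume BIC:", round(bic(rss_vol, n), 1))
```

Because depth varies at constant A, the volume model must explain capacity changes that never happen, so its residuals (and BIC) blow up while the area model recovers its exponent; this is exactly the "vary depth at constant width" logic of the decisive test.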
Connection to PERSIST
If capacity is boundary-limited, catastrophic forgetting may occur when new task information competes for limited boundary capacity. Topological protection (PERSIST) may work by encoding knowledge in interior parameters that new learning cannot overwrite. The area law, if confirmed, would provide a theoretical explanation for why topology predicts forgetting.
GENESIS References
- Bekenstein (1973). Black holes and entropy. Physical Review D.
- Zhang et al. (2017). Understanding deep learning requires rethinking generalization. ICLR.
- Kaplan et al. (2020). Scaling laws for neural language models.
- Wheeler (1990). Information, physics, quantum: the search for links.
Publication target: Nature Physics / Physical Review Letters / ICML
Open Research
Reproducibility
- Deterministic seeding (seed = 42)
- Version-controlled YAML configs
- Full dependency pinning
- PyTorch 2.x, Ripser, scikit-tda
Infrastructure
- Preliminary: Local GPU (NVIDIA RTX 4090)
- Phase I: Supercomputer (planned)
- Automated experiment dashboard
- Results in structured JSON
Research Team
Crystal A. Gutierrez, MS
Principal Investigator & Co-Founder
MS in Information Technology. BS in Information Communication Technology. Adjunct Professor, New Mexico State University. Research experience in AI-driven predictive modeling through the Purdue University Data Mine, developing weather forecasting models in collaboration with Bayer. Oversees research strategy, institutional partnerships, and experimental design review.
Joshua R. Gutierrez, MS
Co-Principal Investigator & Co-Founder
MS in Artificial Intelligence & Data Science. BS in Computer Science. Designed and built the lab's experimental infrastructure — model architectures, topological analysis pipeline, loss landscape sampling, and reproducibility framework. Leads day-to-day experiment execution, computational methodology, and software engineering across all research programs.
Research Collaboration
We welcome inquiries from academic institutions, funding agencies, and researchers working on continual learning, topological data analysis, or deep learning theory.
