Improvement Plan

AQAL Roadmap

Based on expert review. Organized by priority, compute requirements, and current progress. The directive: fix the CPU science first, then invest in GPU learning.

Issues Identified

Four structural weaknesses identified by expert review. These are not caveats — they are the core problems to solve.

Uncorrected Statistics

resolved

Fixed in v0.3: 1,002 FDR-corrected connections (q<0.05) from 1,545 subjects across 36 sites. 2.6x improvement over single-consortium baseline. Site harmonization and age/sex covariates applied.

Statistics, Not Learning

critical

The ND transform is a fixed per-vertex scale+shift from t-tests. It discards covariance structure and cannot capture individual variation. A linear patch, not a learned model.
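To make this limitation concrete, here is a minimal sketch of a per-vertex scale+shift on synthetic data. The array names, sizes, and the assumption of positive scales are illustrative, not taken from AQAL's code. Because each vertex is rescaled independently, the correlation between any two vertices is unchanged, so a transform of this form cannot express a change in connectivity pattern.

```python
import numpy as np

rng = np.random.default_rng(0)
n_timepoints, n_vertices = 200, 5  # toy sizes; the real surface has far more vertices

# Synthetic "typical" activity with a shared signal so that vertices are correlated.
shared = rng.normal(size=(n_timepoints, 1))
typical = shared + 0.5 * rng.normal(size=(n_timepoints, n_vertices))

# Hypothetical fixed transform: one scale and one shift per vertex.
scale = rng.uniform(0.5, 1.5, size=n_vertices)  # assumed positive
shift = rng.normal(scale=0.2, size=n_vertices)
altered = typical * scale + shift

# Correlation is invariant to a positive per-column affine map.
corr_before = np.corrcoef(typical, rowvar=False)
corr_after = np.corrcoef(altered, rowvar=False)
correlation_unchanged = np.allclose(corr_before, corr_after)
print("vertex-to-vertex correlations unchanged:", correlation_unchanged)
```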

Resting → Task-Evoked Leap

high

Connectivity from resting-state fMRI applied to task-evoked predictions. Assumes resting wiring maps to active processing — the error is unquantified.

Average Neurodiverse Brain

high

88.3% of vertices get identical alteration regardless of individual. Autism is a spectrum — a single average risks stereotyping.

No Behavioral Ground Truth

high

Zero validation against sensory overload, eye-tracking, pupil dilation, GSR, or caregiver report. A visualization engine, not a validated tool.

Dataset Limitations

resolved

Expanded from 871 to 1,545 subjects across 36 sites (two consortia). Site harmonization applied. Age/sex covariates included. Male skew persists but dataset is now roughly 1.8x larger.

No Uncertainty

resolved

Fixed in v0.2: 200-iteration bootstrap produces 95% confidence intervals per vertex. Mean CI width 0.027. API returns uncertainty metadata.

No Early Detection

long-term

AQAL predicts brain activity from stimuli (forward). Early detection needs the inverse: behavior → neural differences → risk flag.

Without GPU — Fix the Science

Most cleanup is data work and statistics, not deep learning. These address the three biggest flaws: uncorrected p-values, group-average predictions, and missing behavioral ground truth.

FDR-Corrected Connectivity Map

Done

1,545 subjects across 36 sites (dual-consortium). 1,002 FDR-corrected connections (q<0.05), 1,773 uncorrected, 73 Bonferroni. 2.6x improvement over single-consortium v4.

v5 transform trained, uploaded to HuggingFace, APIs auto-load v5, paper updated.
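The three counts above (uncorrected, FDR, Bonferroni) come from applying different thresholds to the same set of p-values. Below is a minimal sketch of the Benjamini-Hochberg step-up procedure next to Bonferroni, run on made-up p-values. The source does not name the FDR procedure AQAL uses, so Benjamini-Hochberg is an assumption here, and the function names are illustrative.

```python
import numpy as np

def benjamini_hochberg(p_values, q=0.05):
    """Boolean mask of p-values that survive BH FDR control at level q."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    ranked = p[order]
    # Step-up rule: find the largest k with p_(k) <= (k / m) * q.
    thresholds = q * np.arange(1, m + 1) / m
    below = ranked <= thresholds
    keep = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])
        keep[order[: k + 1]] = True  # everything up to rank k is rejected
    return keep

def bonferroni(p_values, alpha=0.05):
    """Boolean mask of p-values that survive Bonferroni correction."""
    p = np.asarray(p_values, dtype=float)
    return p <= alpha / p.size

# Made-up p-values, one per hypothetical connection.
p_demo = np.array([0.001, 0.008, 0.039, 0.041, 0.042,
                   0.060, 0.074, 0.205, 0.212, 0.216])
uncorrected_mask = p_demo < 0.05
fdr_mask = benjamini_hochberg(p_demo)
bonf_mask = bonferroni(p_demo)
print(uncorrected_mask.sum(), fdr_mask.sum(), bonf_mask.sum())
```

The ordering of the three counts (uncorrected at least as large as FDR, FDR at least as large as Bonferroni) holds in general and matches the pattern reported above.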

Age Stratification

Done

Adolescents (12-18): 49 FDR from 262 ASD + 280 TD. Children (<12): 0 FDR from 221 ASD + 327 TD (548 total), confirming that the statistical transform is insufficient for this age band. Adults: 0 FDR.

Age-band selector in API and frontend. Child-specific analysis published as negative finding.

Uncertainty Quantification

Done

200 bootstrap iterations producing 95% confidence intervals for each of 20,484 vertices. Mean CI width: 0.027, max: 0.108. Propagated to API responses.

CI bounds stored in v4 transform. API returns uncertainty metadata (mean CI width, high-confidence vertex %).
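A percentile bootstrap of this kind can be sketched as follows. The source does not say which per-vertex quantity is resampled, so the group mean difference used here is an assumption, and the data, group sizes, and function name are synthetic placeholders.

```python
import numpy as np

def bootstrap_ci(group_a, group_b, n_boot=200, level=0.95, seed=0):
    """Percentile bootstrap interval for the per-vertex mean difference.

    group_a, group_b: arrays of shape (n_subjects, n_vertices).
    """
    rng = np.random.default_rng(seed)
    n_a, n_b = group_a.shape[0], group_b.shape[0]
    diffs = np.empty((n_boot, group_a.shape[1]))
    for i in range(n_boot):
        # Resample subjects with replacement within each group.
        idx_a = rng.integers(0, n_a, size=n_a)
        idx_b = rng.integers(0, n_b, size=n_b)
        diffs[i] = group_a[idx_a].mean(axis=0) - group_b[idx_b].mean(axis=0)
    tail = (1.0 - level) / 2.0
    lower = np.quantile(diffs, tail, axis=0)
    upper = np.quantile(diffs, 1.0 - tail, axis=0)
    return lower, upper

# Synthetic stand-in data: 60 vs 80 subjects, 500 vertices.
rng = np.random.default_rng(42)
asd = rng.normal(0.1, 1.0, size=(60, 500))
td = rng.normal(0.0, 1.0, size=(80, 500))

lower, upper = bootstrap_ci(asd, td)
ci_width = upper - lower
summary = {
    "mean_ci_width": float(ci_width.mean()),
    "max_ci_width": float(ci_width.max()),
}
print(summary)
```

The `summary` dictionary mirrors the kind of metadata described above (mean and maximum interval width); the key names are hypothetical.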

Site Harmonization

Done

Site effects residualized from all 4,950 connectivity features across 20 sites via linear regression before statistical testing.

Implemented in v4 training pipeline. Per-site documentation not yet started.
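Residualizing site effects by linear regression can be sketched as below: regress each feature on one-hot site indicators and keep the residuals. This is a simplified illustration on synthetic data, not the v4 pipeline; the function name and the choice to add the grand mean back are assumptions.

```python
import numpy as np

def residualize_site(features, site_ids):
    """Remove per-site mean offsets from every feature column.

    features: (n_subjects, n_features); site_ids: (n_subjects,) labels.
    """
    features = np.asarray(features, dtype=float)
    site_ids = np.asarray(site_ids).ravel()
    sites, codes = np.unique(site_ids, return_inverse=True)
    # One-hot design matrix: one column per site, no separate intercept.
    design = np.zeros((features.shape[0], sites.size))
    design[np.arange(features.shape[0]), codes] = 1.0
    beta, *_ = np.linalg.lstsq(design, features, rcond=None)
    residuals = features - design @ beta
    # Add the grand mean back so features keep their overall level.
    return residuals + features.mean(axis=0)

# Synthetic example: 4 sites, 25 subjects each, with a site-dependent offset.
rng = np.random.default_rng(0)
site_ids = np.repeat(np.arange(4), 25)
raw = rng.normal(size=(100, 6)) + 0.5 * site_ids[:, None]
clean = residualize_site(raw, site_ids)

site_means = np.array([clean[site_ids == s].mean(axis=0) for s in range(4)])
print(np.ptp(site_means, axis=0))  # spread of site means per feature, near zero
```

One design caveat worth keeping in mind: if diagnosis is unevenly distributed across sites, regressing out site alone can also remove part of the group effect, which is why covariates are usually included in the same model.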

5-Minute Individual Calibration

Not Started

Design standardized stimulus set, collect brief behavioral response, fit per-person scaling vector with ordinary least squares (CPU linear solve).

Turns population average into a personal prior. No code written yet.
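As a rough sketch of what the CPU linear solve could look like: assume the population model yields a matrix of predicted network responses for each calibration stimulus, and the person supplies one behavioral rating per stimulus. Both inputs, the 7-network width, and the function name are hypothetical; nothing here reflects an existing AQAL module.

```python
import numpy as np

def fit_scaling_vector(predicted, observed):
    """Ordinary least squares fit of per-network weights for one person.

    predicted: (n_stimuli, n_networks) population-level predictions.
    observed:  (n_stimuli,) behavioral response of the individual.
    """
    weights, *_ = np.linalg.lstsq(predicted, observed, rcond=None)
    return weights

# Synthetic calibration session: 30 stimuli, 7 networks.
rng = np.random.default_rng(1)
predicted = rng.normal(size=(30, 7))
true_weights = np.array([1.2, 0.8, 1.0, 1.5, 0.6, 1.1, 0.9])
observed = predicted @ true_weights + 0.01 * rng.normal(size=30)

personal_weights = fit_scaling_vector(predicted, observed)
print(np.round(personal_weights, 2))
```

With only a few minutes of data the stimulus count will be small, so a real implementation would likely need regularization toward the population weights; plain OLS is shown only because it is the method named above.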

Behavioral Validation Study

Not Started

AQAL predicts high visual-network divergence → measure pupil dilation, gaze aversion, caregiver stress rating. Report sensitivity, specificity, calibration curves.

Prospective study design needed. Requires IRB approval and clinical partner.
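The reporting metrics named above can be computed with a few lines of NumPy. The labels and scores below are invented for illustration; in the proposed study the labels would come from the behavioral measures and the scores from AQAL's divergence output, and the function names are placeholders.

```python
import numpy as np

def sensitivity_specificity(y_true, y_score, threshold=0.5):
    """Sensitivity and specificity of a thresholded score."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_score, dtype=float) >= threshold
    tp = np.sum(y_pred & y_true)
    fn = np.sum(~y_pred & y_true)
    tn = np.sum(~y_pred & ~y_true)
    fp = np.sum(y_pred & ~y_true)
    return tp / (tp + fn), tn / (tn + fp)

def calibration_curve(y_true, y_score, n_bins=5):
    """Per-bin (mean predicted score, observed event rate, count)."""
    y_true = np.asarray(y_true, dtype=float)
    y_score = np.asarray(y_score, dtype=float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_ids = np.clip(np.digitize(y_score, edges[1:-1]), 0, n_bins - 1)
    rows = []
    for b in range(n_bins):
        in_bin = bin_ids == b
        if in_bin.any():  # skip empty bins
            rows.append((float(y_score[in_bin].mean()),
                         float(y_true[in_bin].mean()),
                         int(in_bin.sum())))
    return rows

# Invented example: 1 = overload observed, score = predicted probability.
y_true = np.array([1, 1, 1, 0, 0, 0, 1, 0])
y_score = np.array([0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.7, 0.1])

sens, spec = sensitivity_specificity(y_true, y_score)
curve = calibration_curve(y_true, y_score)
print(sens, spec)
for mean_score, event_rate, count in curve:
    print(round(mean_score, 2), round(event_rate, 2), count)
```

A well-calibrated score has per-bin event rates close to the mean predicted score in that bin.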

Clinical Guardrails

In Progress

Define referral language, risk-flag thresholds. Specify what 'increased divergence' triggers in a pediatric workflow — and what it does not.

Clinical disclaimer added to all API compare responses. Full ethics documentation not yet started.

With GPU — Learn the Neurodiverse Brain

Our child analysis (548 subjects, 0 FDR connections) showed that the statistical transform does not detect connectivity differences in children under 12. GPU fine-tuning is therefore not optional: it is the only path to child-specific and early-detection models.

Conditional Encoder Fine-Tuning

Blocked on GPU

Fine-tune the 177M-parameter transformer directly on autistic task fMRI using LoRA adapters per subgroup. Learn the ND manifold instead of patching.

Months 3-8
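For readers unfamiliar with LoRA, the idea is to freeze the pretrained weight matrix and learn a small low-rank update per subgroup. The sketch below shows the arithmetic for a single linear layer in NumPy. The layer size, rank, scaling factor, and the use of age bands as subgroup names are all hypothetical; an actual fine-tune would use a deep learning framework on GPU.

```python
import numpy as np

rng = np.random.default_rng(0)
d_in, d_out, rank, alpha = 64, 64, 4, 8.0  # hypothetical sizes

# Pretrained weight of one linear layer; frozen during fine-tuning.
W_frozen = rng.normal(scale=0.02, size=(d_out, d_in))

def new_adapter():
    # Standard LoRA initialization: A random, B zero, so the adapter
    # starts as a no-op and the model initially behaves like the base model.
    A = rng.normal(scale=0.01, size=(rank, d_in))
    B = np.zeros((d_out, rank))
    return A, B

# One adapter per subgroup (subgroup names are illustrative).
adapters = {name: new_adapter() for name in ("child", "adolescent", "adult")}

def forward(x, subgroup):
    """Base layer output plus the subgroup's low-rank update."""
    A, B = adapters[subgroup]
    return x @ W_frozen.T + (alpha / rank) * (x @ A.T) @ B.T

x = rng.normal(size=(3, d_in))
base_out = x @ W_frozen.T
adapter_out = forward(x, "child")

trainable_per_adapter = rank * (d_in + d_out)
full_layer_params = d_in * d_out
print(trainable_per_adapter, full_layer_params)
```

Only `A` and `B` would receive gradients, which is why the per-subgroup cost stays a small fraction of the full layer.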

Developmental Models

Blocked on GPU

Separate encoders for infants/toddlers. Early autism markers appear in visual attention and social orienting during the first 18 months.

Months 6-12

Inverse Pipeline (Brain→Behavior)

Blocked on GPU

Flip the pipeline for screening: observe behavior, predict neural differences, output a risk flag — not a diagnosis.

Months 8-14

Multi-Condition Expansion

Blocked on GPU

Joint training for ADHD, SPD, anxiety. EEG fusion for portable real-time monitoring.

Months 10-18

End-to-End Learned Uncertainty

Blocked on GPU

Model ensembles or Bayesian neural nets so credible intervals come from the model itself, not post-hoc bootstrapping.

Months 6-10

Toward Early Detection

AQAL does not detect autism today. Early detection requires the inverse problem — and three prerequisites.

Train on infants and toddlers

Early markers are visual attention and social orienting in the first 18 months. Cannot extrapolate from adult resting-state maps.

Flip the pipeline

Instead of stimulus→brain, build brain→behavior. Compare predicted sensory profiles to real home videos. A mismatch score becomes a risk flag, not a diagnosis.

Clinical pathway integration

Output: 'increased visual and salience divergence relative to age norms, consider M-CHAT follow-up.' Models do not replace clinical assessment.

Toward Clinical Diagnostic Rigor

Steps required to move from a heuristic tool to Software as a Medical Device (SaMD).

Future

ISO 13485 Quality Management System

Future

FDA De Novo classification pathway

Future

Prospective trials (ADOS-2 / ADI-R)

Future

Physiological validation (HRV, GSR)

Future

Predetermined Change Control Plan

Future

Prove clinical utility vs clinician alone

Progress Tracker

Done (19)

Foundation model deployed (CPU, 20-30s inference)

1,545-subject dual-consortium connectivity analysis (v5)

1,002 FDR-corrected connections (q<0.05), 2.6x improvement over v4

Age-stratified transforms: child / adolescent / adult

200-iteration bootstrap uncertainty (95% CI per vertex)

v4 transform uploaded to HuggingFace, APIs auto-load

Age-band selector in NeuroBrain API and frontend

Uncertainty and CI metadata exposed in API responses

Clinical disclaimer added to all compare responses

Child analysis: dual-consortium combined (548 children, 221 ASD + 327 TD) — 0 FDR connections. Confirms statistical transform insufficient for children <12

5-minute individual calibration module at /calibrate

CI ranges displayed on sensory profile bars

Live API: predict, compare, connectivity, interpret

Sensory Audit app with video and text input

7-network sensory profiling system

Brain demo with per-timestep interpretation

Technical paper updated with child-specific negative finding

Public roadmap page with CPU vs GPU tracks

Three apps deployed: mind.new, neuro.mind.new, sensory.mind.new

In Progress (2)

HBN Data Usage Agreement (phenotypic access for 990 children 5-12)

Clinical guardrails — disclaimer added, full ethics documentation pending

Not Started (CPU) (1)

Behavioral validation study protocol

Blocked on GPU (5)

LoRA fine-tuning on ASD task fMRI

Infant/toddler developmental encoders

Brain→behavior inverse pipeline

Multi-condition expansion (ADHD, SPD, anxiety)

End-to-end Bayesian uncertainty

Pending External (5)

NDA dataset access approval

SPARK dataset access approval

GPU compute allocation

IRB approval for validation study

Clinical partner for ADOS-2 trials

Chief Scientist Directive

“Do the CPU work first. Clean statistics, age stratification, calibration, and behavioral validation will tell you whether the signal is real. Only then spend GPU budget to replace the statistical transform with a learned model. Otherwise you are training a bigger network on a noisy map.”