A 270,000-sample dataset covering spatial, physical, and embodied action reasoning reduces error rates by 66.6% on 20 capability probes; a 100K open subset and fine-tuned ...