Jian Zhang

Master's student at Xiamen University

Hello, I am Jian (Dylan). Over the past two years, I have collaborated closely with Yue Huang and Xinghao Ding, through which I developed core research skills and a clear long-term goal: building systems that can perceive, decide, and act in the physical world as humans do. I believe this direction can fundamentally reshape society. During this period, I also had the opportunity to collaborate with Dr. Zhiwen Fan. I plan to start my PhD at Texas A&M University in Fall 2026.

My recent projects include SpatialStack, VLM-3R, Dyn-Bench, DynamicVerse, Large Spatial Model, and InstantSplat. Early on, I focused on faster 3D reconstruction and semantic 3D representation, with some exploration of video generation. I am now increasingly focused on intelligence for embodied systems in the physical world.

I am currently seeking internship opportunities. Feel free to contact me by email.

selected publications

  1. SpatialStack: Layered Geometry-Language Fusion for 3D VLM Spatial Reasoning
    Jian Zhang*, Shijie Zhou*, Bangya Liu*, and 2 more authors
    In CVPR, 2026
  2. VLM-3R: Vision-Language Models Augmented with Instruction-Aligned 3D Reconstruction
    Zhiwen Fan*, Jian Zhang*, Renjie Li, and 8 more authors
    In CVPR, 2026
  3. Large Spatial Model: End-to-End Unposed Images to Semantic 3D
    Zhiwen Fan*, Jian Zhang*, Wenyan Cong, and 8 more authors
    In NeurIPS, 2024