Guangqian Guo

Ph.D. Student at Northwestern Polytechnical University

About Me

I have been a Ph.D. student at the School of Artificial Intelligence, Northwestern Polytechnical University (NWPU), Xi'an, China, since 2023, advised by Prof. Shan Gao. Before that, I received my B.S. from Shaanxi University of Science and Technology, Xi'an, China, in 2021. I have also been fortunate to intern at Huawei, vivo, and ByteDance, gaining research experience in industry alongside my academic work.

My research primarily focuses on developing robust and data-efficient visual foundation models and multimodal large language models for visual perception, understanding, and generation. More recently, I have been working on two closely related directions:

1. Robust visual foundation models: improving robustness and generalization for challenging scenarios such as degraded visual inputs, non-salient targets, and remote sensing images.

2. Multimodal large language models and vision-language agents: advancing image understanding and generation, with a particular interest in diffusion-based unified multimodal models (UMMs) and vision-language model (VLM) agents.

Internships

ByteDance

09/2025 - 01/2026 | Research Intern

Worked on unified multimodal understanding-generation modeling based on discrete diffusion models for parallel output.

vivo Camera Team

08/2025 - 09/2025 | Research Intern

Worked on a vision-language-model-based PhotoAgent for photo editing, tool-use planning, and multimodal agent evaluation.

Huawei

07/2024 - 04/2025 | Research Intern

Worked on robust segmentation foundation models for degraded visual inputs via generative latent-space enhancement.

Publications

GleSAM
Guangqian Guo, Yong Guo, Xuehui Yu, Wenbo Li, Yaoxing Wang, Shan Gao
Segment Any-Quality Images with Generative Latent Space Enhancement
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
[Paper] [Code]
VNS-SAM
Guangqian Guo, Pengfei Chen, Yong Guo, Huafeng Chen, Boqiang Zhang, Shan Gao
Boosting Segment Anything Model to Generalize Visually Non-Salient Scenarios
IEEE Transactions on Image Processing (TIP), 2025
[Paper] [Code]
GleSAM+
Guangqian Guo, Aixi Ren, Yong Guo, Xuehui Yu, Jiacheng Tian, Wenli Li, Yaoxing Wang, Shan Gao
Towards Any-Quality Image Segmentation via Generative and Adaptive Latent Space Enhancement
International Journal of Computer Vision (IJCV), under major revision
[Paper] [Code]
P2P
Guangqian Guo, Dian Shao, Chenguang Zhu, Sha Meng, Xuan Wang, Shan Gao
P2P: Transforming from Point Supervision to Explicit Visual Prompt for Object Detection and Segmentation
International Joint Conference on Artificial Intelligence (IJCAI), 2024
[Paper] [Code]
HANet
Guangqian Guo, Pengfei Chen, Xuehui Yu, Zhenjun Han, Qixiang Ye, Shan Gao
HANet: Save the Tiny, Save the All: Hierarchical Activation Network for Tiny Object Detection
IEEE Transactions on Circuits and Systems for Video Technology (TCSVT), 2023
[Paper] [Code]
RPG
*Chaowei Wang, *Guangqian Guo, Chang Liu, Dian Shao, Shan Gao
Effective Rotate: Learning Rotation-Robust Prototype for Aerial Object Detection
* co-first author
IEEE Transactions on Geoscience and Remote Sensing (TGRS), 2024
[Paper]
Hybrid-Net
Shan Gao (Ph.D. advisor), *Guangqian Guo, Hanqiao Huang, C. L. Philip Chen
Go Deep or Broad? Exploit Hybrid Network Architecture for Weakly Supervised Object Classification and Localization
* student first author
IEEE Transactions on Neural Networks and Learning Systems (TNNLS), 2023
[Paper]
SAM-COD
Huafeng Chen, Pengxu Wei, Guangqian Guo, Shan Gao
SAM-COD: SAM-guided Unified Framework for Weakly-Supervised Camouflaged Object Detection
European Conference on Computer Vision (ECCV), 2024
[Paper]
P-COD
Huafeng Chen, Dian Shao, Guangqian Guo, Shan Gao
Just a Hint: Point-Supervised Camouflaged Object Detection
European Conference on Computer Vision (ECCV), 2024
[Paper]