Qiucheng Wu

I am a final-year Ph.D. student in Computer Science at the University of California, Santa Barbara, under the supervision of Prof. Shiyu Chang. I received my bachelor’s and master’s degrees from the College of Engineering at the University of Michigan, and a second bachelor’s degree from Shanghai Jiao Tong University.

My research focuses on generative models at the intersection of vision and language. Specifically, I am interested in:

Multimodal LLM (MLLM) agents for tool calling.
The imbalance in how MLLMs understand visual versus textual information, and methods to enhance their performance on visual tasks.
Text control in visual generative models (e.g., diffusion models, GANs) for images and videos.

I am actively looking for full-time industry positions! Feel free to email me if you're interested!

Some of my projects and publications are shown below.

Email: qiucheng@ucsb.edu / CV

Research Projects

* indicates authors with equal contributions.

	VSP: Diagnosing the Dual Challenges of Perception and Reasoning in Spatial Planning Tasks for MLLMs Qiucheng Wu, Handong Zhao, Michael Saxon, Trung Bui, William Yang Wang, Yang Zhang, and Shiyu Chang. ICCV, 2025 [Code] To diagnose the imbalance between visual and textual understanding in MLLMs, we introduce VSP, focusing on evaluating their spatial planning ability and further analyzing weaknesses through fine-grained perception and reasoning tasks.
	VividCam: Learning Unconventional Camera Motions from Virtual Synthetic Videos Qiucheng Wu, Handong Zhao, Zhixin Shu, Jing Shi, Yang Zhang, and Shiyu Chang. arXiv, 2025 [Demo] We generate diverse and complex camera motions using only low-quality synthetic videos for training.
	Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image Synthesis Qiucheng Wu, Yujian Liu, Handong Zhao, Trung Bui, Zhe Lin, Yang Zhang, Shiyu Chang ICCV, 2023 [Code] We propose a new text-to-image algorithm with explicit control over cross-attention in diffusion models from spatial and temporal views. This alleviates inconsistencies between images and text and helps to fix errors like missing objects, mismatched attributes, and mislocated objects.
	Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models Qiucheng Wu, Yujian Liu, Handong Zhao, Ajinkya Kale, Trung Bui, Tong Yu, Zhe Lin, Yang Zhang, Shiyu Chang CVPR, 2023 [Code] [Demo] Based on a fixed stable diffusion model, we disentangle target attributes from a single training image. The learned parameters can then be applied to unseen images and achieve the same edits. This finding leads to a lightweight image editing framework with only 50 learnable parameters.
	Broad Spectrum Image Deblurring via An Adaptive Super-Network Qiucheng Wu, Yifan Jiang, Junru Wu, Victor Kulikov, Vidit Goel, Nikita Orlov, Humphrey Shi, Zhangyang Wang, Shiyu Chang IEEE Transactions on Image Processing (TIP), 2023 [Code] We propose Ada-Deblur, a super-network that handles a "broad spectrum" of blur levels with no retraining on novel blurs.
	Grasping the Arrow of Time from the Singularity: Decoding Micromotion in Low-dimensional Latent Spaces from StyleGAN Qiucheng Wu, Yifan Jiang, Junru Wu, Kai Wang, Gong Zhang, Humphrey Shi, Zhangyang Wang, Shiyu Chang CPAL, 2024 [Code] [Demo]
	Temporal Frame Filtering with Near-Pixel Compute for Autonomous Driving Wantong Li, Qiucheng Wu, Janak Sharda, Shiyu Chang, Shimeng Yu IEEE 4th International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2022
	Data-Model-Circuit Tri-Design for Ultra-Light Video Intelligence on Edge Devices Yimeng Zhang, Akshay Karkal Kamath, Qiucheng Wu, Zhiwen Fan, Wuyang Chen, Zhangyang Wang, Shiyu Chang, Sijia Liu, Cong Hao Proceedings of the 28th Asia and South Pacific Design Automation Conference, 2023
	Learning Action Translator for Meta Reinforcement Learning on Sparse Rewards Tasks Yijie Guo, Qiucheng Wu, Honglak Lee AAAI, 2022 Meta reinforcement learning requires substantial amounts of data. To improve the sample efficiency and performance of meta-RL algorithms on sparse-reward tasks, we introduce a novel objective function to learn an action translator among training tasks.
	DataSifterText: Partially Synthetic Text Generation for Sensitive Clinical Notes Nina Zhou, Qiucheng Wu, Zewen Wu, Simeone Marino, and Ivo Dinov Journal of Medical Systems, 2022 We propose DataSifter-Text to protect the privacy of sensitive textual datasets. It obfuscates identifiable information in a dataset while balancing privacy and utility.
	Compressive Big Data Analytics: An ensemble meta-algorithm for high-dimensional multisource datasets Simeone Marino, Yi Zhao, Nina Zhou, Yiwang Zhou, Arthur W. Toga, Lu Zhao, Yingsi Jian, Yichen Yang, Yehu Chen, Qiucheng Wu, Jessica Wild, Brandon Cummings, Ivo D. Dinov PLOS One, 2020 We apply compressive big data analytics (CBDA) to analyze salient features and key biomarkers of a high-dimensional clinical dataset. Awarded Most Likely Health Impact at the 4th Annual Symposium Poster Competition of the Michigan Institute for Data Science, University of Michigan.
	DataSifter: Statistical Obfuscation of Electronic Health Records and Other Sensitive Datasets Simeone Marino, Nina Zhou, Yi Zhao, Lu Wang, Qiucheng Wu, Ivo D. Dinov Journal of Statistical Computation and Simulation, 2019 We propose DataSifter to obfuscate sensitive clinical datasets while preserving utility for downstream tasks. Awarded Most Interesting Methodological Advances at the 4th Annual Symposium Poster Competition of the Michigan Institute for Data Science, University of Michigan.

Education

UC Santa Barbara
Final-year Ph.D. student, Sep 2021 – Present

University of Michigan
M.S., Computer Science and Engineering, Sep 2019 – May 2021
B.S., Computer Science, Sep 2017 – May 2019
GPA: 3.94/4.00
Focus Areas

Natural Language Processing

Deep Learning for Vision

Data Mining

Shanghai Jiao Tong University
B.S., Electrical and Computer Engineering, Sep 2015 – Aug 2019
GPA: 3.67/4.00
Capstone Gold Prize (2019)

Work Experience

Adobe
Research Intern
Summer 2023, Summer 2024, Summer 2025

Trained efficient multimodal LLM agents for Photoshop/Lightroom workflows.
Developed text-based controllable image/video generation with diffusion models.

Outcomes: ICCV 2025, ICCV 2023, CVPR 2023

Picsart
Research Intern
Summer 2022

Explored controllable image generation with GANs for creative editing tools.
Developed adaptive deblurring methods for image restoration.

Outcomes: CPAL 2024, TIP 2023

Fitly
Machine Learning Engineer Intern
2019 – 2020

Developed personalized ML models to improve few-shot food classification accuracy in a nutrition app.

Teaching Experience

• CS165B: Machine Learning (TA, Fall 2021, Winter 2024, UCSB)
• EECS 492: Intro to Artificial Intelligence (TA, Fall 2020, Winter 2021, UMich)
• Vv156: Applied Honors Calculus (TA, Fall 2016, UM-SJTU Joint Institute)

Others

Intel Akraino: Edge Cloud Game Architecture
Capstone Gold Prize
UM-SJTU Joint Institute, Shanghai Jiao Tong University, August 2019

In this capstone project, we designed a framework that speeds up web communication between terminal devices and servers by introducing edge servers. We used Kubernetes to run the edge servers as separate containers. The edge servers handle computation and send the results to terminal devices, which improves overall performance.

This website is built using the source code from Jon Barron's public academic website.