Qiucheng Wu

I am a final-year Ph.D. student in Computer Science at the University of California, Santa Barbara, under the supervision of Prof. Shiyu Chang. I received my bachelor’s and master’s degrees from the University of Michigan, College of Engineering, and a second bachelor’s degree from Shanghai Jiao Tong University.

My research focuses on generative models at the intersection of vision and language. Specifically, I am interested in:

  • The imbalanced capability of multimodal large language models (MLLMs) in understanding visual and textual information, and methods to enhance their performance on visual tasks.
  • Text control in visual generative models (e.g., diffusion models, GANs) for images and videos.

Some of my projects and publications are shown below.

Email: qiucheng@ucsb.edu /  CV

Research Projects

* indicates authors with equal contributions.

VSP: Diagnosing the Dual Challenges of Perception and Reasoning in Spatial Planning Tasks for MLLMs
Qiucheng Wu, Handong Zhao, Michael Saxon, Trung Bui, William Yang Wang, Yang Zhang, Shiyu Chang
ICCV, 2025 [Code]

To diagnose the imbalance between visual and textual understanding in MLLMs, we introduce VSP, which evaluates their spatial planning ability and further analyzes their weaknesses through fine-grained perception and reasoning tasks.

Harnessing the Spatial-Temporal Attention of Diffusion Models for High-Fidelity Text-to-Image Synthesis
Qiucheng Wu*, Yujian Liu*, Handong Zhao, Trung Bui, Zhe Lin, Yang Zhang, Shiyu Chang
ICCV, 2023 [Code]

We propose a new text-to-image algorithm with explicit control over cross-attention in diffusion models from spatial and temporal views. This alleviates inconsistencies between images and text and helps to fix errors like missing objects, mismatched attributes, and mislocated objects.

Uncovering the Disentanglement Capability in Text-to-Image Diffusion Models
Qiucheng Wu, Yujian Liu, Handong Zhao, Ajinkya Kale, Trung Bui, Tong Yu, Zhe Lin, Yang Zhang, Shiyu Chang
CVPR, 2023 [Code] [Demo]

Based on a fixed Stable Diffusion model, we disentangle target attributes from a single training image. The learned parameters can then be applied to unseen images to achieve the same edits. This finding leads to a lightweight image editing framework with only 50 learnable parameters.

Broad Spectrum Image Deblurring via An Adaptive Super-Network
Qiucheng Wu*, Yifan Jiang*, Junru Wu*, Victor Kulikov, Vidit Goel, Nikita Orlov, Humphrey Shi, Zhangyang Wang, Shiyu Chang
IEEE Transactions on Image Processing (TIP), 2023 [Code]

We propose Ada-Deblur, a super-network that handles a "broad spectrum" of blur levels without retraining on novel blurs.

Grasping the Arrow of Time from the Singularity: Decoding Micromotion in Low-dimensional Latent Spaces from StyleGAN
Qiucheng Wu*, Yifan Jiang*, Junru Wu*, Kai Wang, Gong Zhang, Humphrey Shi, Zhangyang Wang, Shiyu Chang
CPAL, 2024 [Code] [Demo]

Temporal Frame Filtering with Near-Pixel Compute for Autonomous Driving
Wantong Li, Qiucheng Wu, Janak Sharda, Shiyu Chang, Shimeng Yu
IEEE 4th International Conference on Artificial Intelligence Circuits and Systems (AICAS), 2022

Data-Model-Circuit Tri-Design for Ultra-Light Video Intelligence on Edge Devices
Yimeng Zhang*, Akshay Karkal Kamath*, Qiucheng Wu*, Zhiwen Fan*, Wuyang Chen, Zhangyang Wang, Shiyu Chang, Sijia Liu, Cong Hao
Proceedings of the 28th Asia and South Pacific Design Automation Conference, 2023

Learning Action Translator for Meta Reinforcement Learning on Sparse Rewards Tasks
Yijie Guo, Qiucheng Wu, Honglak Lee
AAAI, 2022

Meta reinforcement learning requires substantial amounts of data. To improve the sample efficiency and performance of meta-RL algorithms on sparse-reward tasks, we introduce a novel objective function to learn an action translator among training tasks.

DataSifterText: Partially Synthetic Text Generation for Sensitive Clinical Notes
Nina Zhou, Qiucheng Wu, Zewen Wu, Simeone Marino, and Ivo Dinov
Journal of Medical Systems, 2022

We propose DataSifter-Text to protect the privacy of sensitive textual datasets. DataSifter-Text obfuscates identifiable information in the dataset and effectively balances privacy and utility.

Compressive Big Data Analytics: An ensemble meta-algorithm for high-dimensional multisource datasets
Simeone Marino, Yi Zhao, Nina Zhou, Yiwang Zhou, Arthur W. Toga, Lu Zhao, Yingsi Jian, Yichen Yang, Yehu Chen, Qiucheng Wu, Jessica Wild, Brandon Cummings, Ivo D. Dinov
PLOS One, 2020

We apply compressive big data analytics (CBDA) to identify salient features and key biomarkers in a high-dimensional clinical dataset.

Recognized as "Most Likely Health Impact" in the 4th Annual Symposium Poster Competition, Michigan Institute for Data Science, University of Michigan.

DataSifter: Statistical Obfuscation of Electronic Health Records and Other Sensitive Datasets
Simeone Marino, Nina Zhou, Yi Zhao, Lu Wang, Qiucheng Wu, Ivo D. Dinov
Journal of Statistical Computation and Simulation, 2019

We propose DataSifter to obfuscate sensitive clinical datasets while preserving utility for downstream tasks.

Recognized as "Most Interesting Methodological Advances" in the 4th Annual Symposium Poster Competition, Michigan Institute for Data Science, University of Michigan.

Education
UC Santa Barbara
Final-year Ph.D. student, Computer Science, Sep 2021 – Present
University of Michigan
M.S., Computer Science and Engineering, Sep 2019 – May 2021
B.S., Computer Science, Sep 2017 – May 2019
GPA: 3.94/4.00
Focus Areas
  • Natural Language Processing
  • Deep Learning for Vision
  • Data Mining
Shanghai Jiao Tong University
B.S., Electrical and Computer Engineering, Sep 2015 – Aug 2019
GPA: 3.67/4.00
Capstone Gold Prize (2019)
Work Experience
Adobe
Research Intern
Summer 2023, Summer 2024, Summer 2025
  • Trained efficient multimodal LLM agents for Photoshop/Lightroom workflows.
  • Developed text-based controllable image/video generation with diffusion models.

Outcomes: ICCV 2025, ICCV 2023, CVPR 2023

Picsart
Research Intern
Summer 2022
  • Explored controllable image generation with GANs for creative editing tools.
  • Developed adaptive deblurring methods for image restoration.

Outcomes: CPAL 2024, TIP 2023

Fitly
Machine Learning Engineer Intern
2019 – 2020
  • Developed personalized ML models to improve few-shot food classification accuracy in a nutrition app.
Teaching Experience

  • CS165B: Machine Learning (TA, Fall 2021, Winter 2024, UCSB)
  • EECS 492: Intro to Artificial Intelligence (TA, Fall 2020, Winter 2021, UMich)
  • Vv156: Applied Honors Calculus (TA, Fall 2016, UM-SJTU Joint Institute)

Others
Intel Akraino: Edge Cloud Game Architecture
Capstone Gold Prize
UM-SJTU Joint Institute, Shanghai Jiao Tong University, August 2019

In this capstone project, we designed a framework that improves web communication between terminal devices and servers by introducing edge servers. We used Kubernetes to organize the edge servers as separate containers. The edge servers handle computation and send information to terminal devices, which improves efficiency.


This website is built using the source code from Jon Barron's public academic website.