About me

Mingfu Liang is a Ph.D. student at Northwestern University, where he started in Sep. 2020 under the supervision of Professor Ying Wu. His research mainly revolves around machine learning and computer vision, with a special emphasis on both the theoretical foundations and the practical applications of these fields. At present, his Ph.D. thesis research focuses on equipping machine learning algorithms with consistent, long-term learning capabilities and adaptive generalizability, enabling them to keep pace with a dynamic world. This research area, known as Continual Learning, Incremental Learning, or Lifelong Learning, aims to empower intelligent agents throughout their complete lifecycle. He also earned his Master's degree in Applied Mathematics from Northwestern University, where he specialized in analytical and computational methods for partial differential equations (PDEs), stochastic differential equations (SDEs), and advanced methods in parallel computing.

Apart from his thesis research on Continual Learning, his general research interests span various domains. He is actively involved in research projects on generative models such as GPT and diffusion models, multi-modality learning such as visual question answering (VQA) and vision-language models (VLMs), open-world/vocabulary learning (e.g., classification and detection), uncertainty learning, model customization and personalization, robotic learning, domain adaptation and generalization, autonomous driving, active learning, and semi-supervised learning. All these research endeavors underline his commitment to furthering the field of machine learning and his goal of enabling a new era of smart, adaptive machines.

I am actively searching for a full-time research scientist and/or research engineer position. Please feel free to contact me at mingfuliang2020@u.northwestern.edu!


Research before the Ph.D. journey: Before embarking on his Ph.D. journey, he tackled various intriguing challenges, including Image Matting, Semantic Segmentation, and Network Formulation; the latter encompassed areas such as Network Pruning and Optimization, Attention Mechanisms, Neural Architecture Search, and the Lottery Ticket Hypothesis. During his undergraduate studies in Pure and Applied Mathematics (specializing in Financial Mathematics and Engineering), he also displayed a keen interest in competitive problem-solving, participating in numerous data mining and mathematical modeling competitions and securing noteworthy rankings and accolades on platforms such as the Mathematical Contest in Modeling (MCM), Kaggle, and the SIGKDD Cup.

News

[2024.03] Joining the Privacy-Preserving Machine Learning (PPML) team at Sony AI this Spring (April - June) as a Research Intern, working on Scalable and Versatile Multi-Modality (Vision-Language) Foundation Models!

[2024.02] Two papers have been accepted by CVPR-2024, “AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving” and “Evidential Active Recognition: Intelligent and Prudent Open-World Embodied Perception”! Thanks to all my mentors and collaborators from NEC Labs America and Northwestern University! See you in Seattle this Summer!

[2024.01] Joined the Department of Media Analytics at NEC Labs America again this Winter quarter (January - March) as a research intern, working on Controllable Video/Image Generation Models for Autonomous Driving!

[2023.09] The paper “TOA: Task-oriented Active VQA” is accepted by NeurIPS-2023! See you in New Orleans this December!

[2023.07] The paper “Understanding Self-attention Mechanism via Dynamical System Perspective” is accepted by ICCV-2023. More details are coming soon!

[2023.06] An interesting course final project with Bin Wang on Virtual Try-on, based on the Segment Anything Model and conditional generative models (Stable Diffusion and conditional GANs) trained with the Diffusers library from Hugging Face! Some code snippets are released here.

[2023.04] I will be interning in the Department of Media Analytics at NEC Labs America this summer, working on Autonomous Driving and Continual Learning with Dr. Jong-Chyi Su, Dr. Samuel Schulter, and Prof. Manmohan Chandraker.

[2022.12] Gave a talk on our Incremental Subpopulation Learning work (ECCV-2022) at AI TIME.

[2022.07] The paper “Balancing between Forgetting and Acquisition in Incremental Subpopulation Learning” is accepted by ECCV-2022! The code is released here.

[2022.06] Started my internship in the Department of Machine Learning at NEC Labs America, working on an Interactive Visual Exploration System with Dr. Erik Kruus.

Selected Publications (*: contributed equally)

Mingfu Liang, Jong-Chyi Su, Samuel Schulter, Sparsh Garg, Shiyu Zhao, Ying Wu, Manmohan Chandraker

AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving
CVPR 2024

[ArXiv]

Main Takeaways:

• Created the first Automatic Data Engine (AIDE) to scale up Autonomous Vehicle (AV) systems for handling safety-critical novel object detection.
• AIDE consists of automatic issue identification, efficient data curation, model improvement via auto-labeling, and verification through diverse AV scenario generation, all intelligently powered by recent advancements in VLMs and LLMs.
• AIDE automatically enables a closed-set object detector to detect novel objects, outperforming state-of-the-art (SOTA) open-vocabulary object detectors (OVOD) by a large margin, and achieves higher cost-efficiency (training and/or labeling) than paradigms such as fully supervised, semi-supervised, and active learning. AIDE also maintains, and even improves, performance on known-object detection, and improves SOTA OVOD by >4% average precision (AP) on novel categories without human labels.
• Iterative use of AIDE significantly boosts performance, and minimal human feedback can lead to substantial gains, e.g., an extra 5.5% AP gained by correcting just 30 images during AIDE's verification stage. Last but not least, AIDE also scales with more unlabeled data and approaches the fully supervised upper bound.

Mingfu Liang, Jiahuan Zhou, Wei Wei, Ying Wu

Balancing between Forgetting and Acquisition in Incremental Subpopulation Learning
ECCV 2022

[Twitter] [Poster] [PDF] [Supp.] [Springer] [Code] [YouTube] [Project Page]

Main Takeaways:

• Studied a novel and practical setting for incremental learning, called Incremental Subpopulation Learning (ISL): incrementally learning to classify unseen subpopulations (e.g., different breeds of dogs) as their corresponding population (i.e., the class dog), without retaining the images used for learning the seen population.
• Empirically showed that ISL is promising for alleviating the subpopulation shifting problem (i.e., the large performance drop, mostly >30%, when a model is directly tested on unseen subpopulations), without sacrificing the original performance on the seen population.
• Proposed a two-stage learning framework as the first baseline tailored to ISL, which disentangles knowledge acquisition and forgetting to better handle the stability-plasticity trade-off, inspired by the generalized Boosting Theory. Proposed novel proxy estimations that approximately measure forgetting and knowledge acquisition to form a new optimization objective for ISL.
• Benchmarked representative and state-of-the-art (SOTA) non-exemplar-based methods for the first time on the BREEDS datasets, a recently proposed large-scale benchmark tailored to real-world subpopulation shifting. Conducted an extensive empirical study and formal analysis of the proposed method and the comparison methods to inform future research directions.

Zhongzhan Huang*, Mingfu Liang*, Jinghui Qin, Shanshan Zhong, Liang Lin

Understanding Self-attention Mechanism via Dynamical System Perspective
ICCV 2023

[ArXiv][Media Cover (in Chinese)]

Main Takeaways:

• We empirically show that the intrinsic stiffness phenomenon (SP) found in high-precision solutions of ordinary differential equations (ODEs) also widely exists in high-performance neural networks (NNs).
• We formally demonstrate that the Self-Attention Mechanism (SAM) is a stiffness-aware step-size adaptor: by refining the estimation of stiffness information and generating adaptive attention values, it enhances the model's representational ability to measure the intrinsic SP, which provides a new understanding of why and how SAM benefits model performance (see the illustrative equations below).
• This novel perspective can also explain the lottery ticket hypothesis in SAM, motivate new quantitative metrics of representational ability, and inspire a new theory-driven approach, StepNet.
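To convey the dynamical-system view at a glance, here is a minimal, illustrative formulation (the notation is mine, not the paper's exact one): a residual block reads as a forward-Euler step of an ODE with a fixed unit step size, while a channel-wise attention gate plays the role of an input-dependent, stiffness-aware step size.

```latex
% Residual block as one forward-Euler step of \dot{x} = G(x):
x_{l+1} = x_l + G(x_l) \qquad \text{(fixed unit step size)}
% With a channel-wise self-attention gate a(x_l) \in (0,1)^C:
x_{l+1} = x_l + a(x_l) \odot G(x_l) \qquad \text{(adaptive, stiffness-aware step size)}
```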

Xiaoying Xing, Mingfu Liang, Ying Wu

TOA: Task-oriented Active VQA
NeurIPS 2023

[OpenReview]

Main Takeaways:


Lei Fan, Mingfu Liang, Yunxuan Li, Gang Hua, Ying Wu

Evidential Active Recognition: Intelligent and Prudent Open-World Embodied Perception
CVPR 2024

[ArXiv]

Main Takeaways:


Other Publications (*: contributed equally)

Changhao Shi, Haomiao Ni, Kai Li, Shaobo Han, Mingfu Liang, Martin Renqiang Min

Exploring Compositional Visual Generation with Latent Classifier Guidance
CVPR 2023, Workshop of Generative Models for Computer Vision

[ArXiv]

Main Takeaways:

• We show that performing classifier guidance on the latent space modeled by diffusion models enables effective compositional generation, as well as sequential editing with high fidelity (a minimal sketch of latent classifier guidance is given below).
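For readers unfamiliar with classifier guidance, the sketch below illustrates the general idea applied to noisy latents; `unet`, `latent_classifier`, and the schedule variable `alpha_bar_t` are hypothetical placeholders, not the paper's actual code.

```python
import torch

def guided_eps(unet, latent_classifier, z_t, t, y, alpha_bar_t, scale=2.0):
    """One classifier-guided noise prediction on noisy latents z_t (illustrative sketch).

    unet(z_t, t)              -> predicted noise eps_theta       (placeholder model)
    latent_classifier(z_t, t) -> class logits on noisy latents   (placeholder model)
    """
    eps = unet(z_t, t)  # unconditional noise prediction

    # Gradient of log p(y | z_t) with respect to the noisy latent.
    z_in = z_t.detach().requires_grad_(True)
    logits = latent_classifier(z_in, t)
    log_prob = torch.log_softmax(logits, dim=-1)[torch.arange(z_t.shape[0]), y].sum()
    grad = torch.autograd.grad(log_prob, z_in)[0]

    # Classifier guidance: shift the noise prediction along the classifier gradient.
    return eps - scale * ((1.0 - alpha_bar_t) ** 0.5) * grad
```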

Lecture I gave on "Guidance for Diffusion Models" in CS-496 Deep Generative Models, Slides

Zhongzhan Huang*, Senwei Liang*, Mingfu Liang, Haizhao Yang

DIANet: Dense-and-Implicit Attention Network
AAAI 2020
[PDF] [Code]

Main Takeaways:

• We propose a novel yet simple framework, the Dense-and-Implicit-Attention (DIA) unit, which shares an attention module across different network layers to encourage the integration of layer-wise information.
• Many choices of modules can be used in the DIA unit. Since Long Short-Term Memory (LSTM) is capable of capturing long-distance dependencies, we focus on the case where the DIA unit is a modified LSTM (called DIA-LSTM); a minimal sketch of the shared-module idea appears below.
• Experiments on benchmark datasets show that the DIA-LSTM unit is capable of emphasizing layer-wise feature interrelation and leads to significant improvements in image classification accuracy.
• We further empirically show that DIA-LSTM has a strong regularization ability for stabilizing the training of deep networks, demonstrated by experiments that remove skip connections (He et al. 2016a) or Batch Normalization (Ioffe and Szegedy 2015) from the whole residual network.
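The sketch below shows only the core idea of sharing one recurrent attention module across residual blocks; it is an illustrative simplification, not the exact DIA-LSTM implementation, and the class name `SharedDIAGate` is mine.

```python
import torch
import torch.nn as nn

class SharedDIAGate(nn.Module):
    """One attention module shared by all residual blocks, carrying a
    recurrent state across layers (illustrative sketch)."""

    def __init__(self, channels: int):
        super().__init__()
        self.cell = nn.LSTMCell(channels, channels)  # same cell reused at every layer

    def forward(self, feat, state=None):
        # feat: (B, C, H, W) output of a residual branch
        pooled = feat.mean(dim=(2, 3))                # per-channel descriptor, (B, C)
        h, c = self.cell(pooled, state)               # recurrent update across layers
        gate = torch.sigmoid(h).unsqueeze(-1).unsqueeze(-1)
        return feat * gate, (h, c)                    # channel-wise reweighting

# Usage inside a residual network (pseudocode-level):
#   state = None
#   for block in blocks:
#       residual = block(x)
#       residual, state = shared_gate(residual, state)  # same module every layer
#       x = x + residual
```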

Senwei Liang*, Zhongzhan Huang*, Mingfu Liang, Haizhao Yang

Instance Enhancement Batch Normalization: An Adaptive Regulator of Batch Noise
AAAI 2020
[PDF] [Code]

Main Takeaways:

• We offer a new point of view: the self-attention mechanism can help regulate batch noise by enhancing instance-specific information, yielding a better regularization effect.
• We propose an attention-based BN called Instance Enhancement Batch Normalization (IEBN) that recalibrates the information of each channel by a simple linear transformation (see the sketch below).
• IEBN has a good capacity for regulating batch noise and stabilizing network training to improve generalization, even in the presence of two kinds of noise attacks during training.
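A minimal sketch of the recalibration idea follows: each channel of the BN output is rescaled by a sigmoid of a simple per-channel linear transform of the instance's own channel statistics. Details such as where the statistic is taken and how the parameters are initialized follow the paper and released code, not this sketch.

```python
import torch
import torch.nn as nn

class IEBNSketch(nn.Module):
    """Illustrative sketch of instance enhancement on top of BatchNorm (not the official IEBN code)."""

    def __init__(self, channels: int):
        super().__init__()
        self.bn = nn.BatchNorm2d(channels)
        self.gamma = nn.Parameter(torch.zeros(1, channels, 1, 1))  # per-channel scale (illustrative init)
        self.beta = nn.Parameter(torch.ones(1, channels, 1, 1))    # per-channel bias (illustrative init)

    def forward(self, x):
        out = self.bn(x)
        stat = x.mean(dim=(2, 3), keepdim=True)             # instance-specific channel statistic
        gate = torch.sigmoid(self.gamma * stat + self.beta)  # simple linear transform + sigmoid
        return out * gate                                     # instance-enhanced recalibration
```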

Wei He, Meiqing Wu, Mingfu Liang, Siew-Kei Lam

CAP: Context-Aware Pruning for Semantic Segmentation
WACV 2021
[PDF] [Code] [Supplementary] [Video]

Main Takeaways:

Work done when I was a summer research intern at Nanyang Technological University (NTU, Singapore).
• The first work to explore contextual information for guiding channel pruning tailored to semantic segmentation.
• We formulate the embedded contextual information by leveraging layer-wise channel interdependency via the Context-aware Guiding Module (CAGM), and introduce Context-aware Guided Sparsification (CAGS) to adaptively identify the informative channels in the cumbersome model by inducing channel-wise sparsity on the scaling factors in batch normalization (BN) layers (a minimal sketch of the sparsification idea is given below).
• The resulting pruned models require significantly fewer operations for inference while maintaining performance comparable to (and at times outperforming) the original models. We evaluated our framework on widely used benchmarks and showed its effectiveness on both large and lightweight models.
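The sketch below shows only the generic mechanism of sparsifying BN scaling factors for channel pruning; the context-aware guidance that CAGM/CAGS add on top is paper-specific and not reproduced here, and the function name is mine.

```python
import torch.nn as nn

def bn_sparsity_penalty(model: nn.Module, lam: float = 1e-4):
    """Illustrative L1 penalty on BN scaling factors (gamma) to induce channel-wise sparsity."""
    penalty = 0.0
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            penalty = penalty + m.weight.abs().sum()  # per-channel scales (gamma)
    return lam * penalty

# Training: loss = task_loss + bn_sparsity_penalty(model)
# Pruning:  drop channels whose |gamma| falls below a chosen threshold,
#           then fine-tune the slimmed model.
```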

Academic Service

  • Invited Conference Reviewer/Program Committee Member:
    • British Machine Vision Conference (BMVC) 2020
    • AAAI Conference on Artificial Intelligence (AAAI) 2021, 2022, 2023, 2024
    • IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2021
    • Asian Conference on Computer Vision (ACCV) 2024
    • Conference on Computer Vision and Pattern Recognition (CVPR) 2021, 2022, 2023, 2024
    • International Conference on Computer Vision (ICCV) 2021, 2023
    • European Conference on Computer Vision (ECCV) 2022, 2024
    • Conference on Lifelong Learning Agents (CoLLAs) 2023, 2024
    • IEEE International Conference on Multimedia and Expo (ICME) 2023
    • Conference on Neural Information Processing Systems (NeurIPS) 2023
    • International Conference on Learning Representations (ICLR) 2024
    • International Conference on Machine Learning (ICML) 2024
  • Invited Journal Reviewer:
    • IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
    • IEEE Transactions on Neural Networks and Learning Systems (TNNLS)
    • Pattern Recognition Letters (PRL)
    • Machine Vision and Applications (MVA)

Teaching Assistant

  • Introduction to Computer Vision (EE 332, 2021 Fall)
  • Engineering Analysis (EE 205-1, 2023 Fall)