Xiaodan Hu (胡晓丹)

I'm a final-year Ph.D. candidate (2019-2024) in the Computer Vision and Robotics Laboratory at the University of Illinois at Urbana-Champaign, supervised by Prof. Narendra Ahuja and working closely with Prof. Heng Ji. I specialize in multimodal video understanding with LLMs. I am actively seeking research scientist or ML engineer positions starting in May 2024.

I am a Mavis Future Faculty Fellow.

xiaodan8 at illinois dot edu     Google Scholar     Resume

Research Interest

    My research focuses on multimodal (language and vision) video understanding with LLMs: developing multimodal video models that harness LLMs to understand and reason about videos in a multimodal, hierarchical fashion, similar to how humans do. Downstream applications include video generation and video retrieval.


News

  • 2023.04 I was selected as the recipient of the Shun Lien Chuang Memorial Award for Excellence in Graduate Education, 2023-2024
  • 2023.03 I'm honored to receive the Yunni and Maxine Pao Memorial Fellowship, 2023-2024
  • 2022.09 I was selected as the recipient of the Amazon Fellowship, with grants and AWS credits for research, 2022-2023
  • 2022.04 I was selected as the recipient of the Rambus Computer Engineering Fellowship, 2022-2023
  • 2022.04 I was selected as a Mavis Future Faculty Fellow (MF3) and received the fellowship, 2022-2023
  • 2021.07 Our paper "Unsupervised 3D Pose Estimation for Hierarchical Dance Video Recognition" was accepted at ICCV 2021


Education


Papers

  • Unsupervised 3D Pose Estimation for Hierarchical Dance Video Recognition
    Xiaodan Hu, Narendra Ahuja
    Accepted by ICCV 2021
    PDF
  • Squeeze-and-Attention Networks for Semantic Segmentation
    Zilong Zhong, Zhong Qiu Lin, Rene Bidart, Xiaodan Hu, Ibrahim Ben Daya, Zhifeng Li, Wei-Shi Zheng, Jonathan Li, Alexander Wong
    Accepted by CVPR 2020
    PDF
  • MUSE: Illustrating Textual Attributes by Portrait Generation
    Xiaodan Hu, Pengfei Yu, Kevin Knight, Heng Ji, Bo Li, Honghui Shi
    arXiv 2020
    PDF
  • RUNet: A Robust UNet Architecture for Image Super-Resolution
    Xiaodan Hu, Mohamed A. Naiel, Alexander Wong, Mark Lamm and Paul Fieguth
    Accepted as an oral presentation at the Women in Computer Vision Workshop at CVPR 2019 (CVPRW-WiCV 2019)
    PDF
  • ClearGAN: Photo-Realistic High-Resolution Text-to-Image Synthesis via Joint Inter-modal and Intra-modal Attention Modeling
    Xiaodan Hu, Paul Fieguth, Mohamed A. Naiel and Alexander Wong
    CVPR 2019 Workshop Language and Vision, accepted as poster and spotlight (CVPRW-Language & Vision 2019)
  • ProstateGAN: Mitigating Data Bias via Prostate Diffusion Imaging Synthesis with Generative Adversarial Networks
    Xiaodan Hu, Audrey Chung, Alexander Wong, and Paul Fieguth
    Accepted as a poster presentation at the NeurIPS 2018 Workshop on Machine Learning for Health (NIPSW-ML4H2018)
    PDF
  • Content-Adaptive Non-Stationary Projector Resolution Enhancement
    Xiaodan Hu
    Master's Thesis, 04/2018-04/2019
  • Robust Visual Enhancement of Moving Contents in Projected Imagery
    Xiaodan Hu, Mohamed A. Naiel, Zohreh Azimifar, Mark Lamm and Paul Fieguth
    Accepted as a poster presentation at the 2019 Society for Information Display International Symposium, Seminar and Exhibition (SID2019)
    PDF
  • Device, system and method for enhancing one or more of high contrast regions and text regions in projected images
    Xiaodan Hu, Mohamed A. Naiel, Zohreh Azimifar, Ibrahim Ben Daya, Mark Lamm and Paul Fieguth
    U.S. Patent
    PDF
  • Projector Resolution Enhancement Using a Non-stationary Content-adaptive Scheme
    Xiaodan Hu, Mohamed A. Naiel, Zohreh Azimifar, Ibrahim Ben Daya, Mark Lamm and Paul Fieguth
    Journal of Signal Processing: Image Communication, in review (JSPIC)
  • Text Enhancement in Projected Imagery
    Xiaodan Hu, Mohamed A. Naiel, Zohreh Azimifar, Ibrahim Ben Daya, Mark Lamm and Paul Fieguth
    Accepted as a poster presentation at the Conference on Vision and Imaging Systems (CVIS2018), published in a special issue of Journal of Computational Vision and Imaging Systems (JCVIS)
    PDF
  • Motion Detection in High Resolution Enhancement
    Xiaodan Hu, Avery Ma, Ahmed Gawish, Mark Lamm, Paul Fieguth
    Accepted as a poster presentation at the Conference on Vision and Imaging Systems (CVIS2017), published in a special issue of Journal of Computational Vision and Imaging Systems (JCVIS)
    PDF
  • Application of Modular Approach in GIS-based Hydrological Modeling
    Shixiong Hu, He Jin, Xiaodan Hu, Yuannan Long
    Accepted at the 22nd International Conference on Geoinformatics (Geoinformatics 2014)
    PDF

Awards and Services

  • Shun Lien Chuang Memorial Award for Excellence in Graduate Education, 2023-2024
  • Yunni and Maxine Pao Memorial Fellowship, 2023-2024
  • Amazon Fellowship, 2022-2023
  • Mavis Future Faculty Fellowship (MF3) UIUC, 2022
  • Rambus Computer Engineering Fellowship UIUC, 2022
  • ACL 2023 Reviewer, ACL 2023 Demo Reviewer
  • ICLR 2023 Reviewer
  • EMNLP 2023 Reviewer, EMNLP 2022 Demo Reviewer
  • WACV 2023 Reviewer
  • ECCV 2022 Reviewer
  • NAACL 2022 Reviewer
  • ACL ARR 2022 Reviewer; ACL 2022 Demo Reviewer; ACL 2021 Demo Reviewer
  • New In ML Workshop at CVPR 2022, Reviewer
  • Annual Conference on Vision and Intelligent Systems 2020, Session Chair
  • Annual Conference on Vision and Intelligent Systems 2019-2020, Technical Program Committee
  • New In ML at NeurIPS 2020, Reviewer
  • ISCAS 2020, Reviewer
  • Received a travel award to attend and present the work at WiCV at CVPR 2019
  • Received a student travel grant to attend and present the work at SID Display Week 2019
  • Certificate in the Fundamentals of University Teaching program, UW, 2018
  • Graduate Research Studentship (GRS), UW 2017-2019
  • International Masters Student Award, UW 2017-2019
  • Faculty of Engineering Graduate Scholarship, UW 2018

Research Experience

  • Research Assistant at Computer Vision and Robotics Laboratory (CVRL), UIUC (2019.08 - present). Supervisor: Narendra Ahuja
  • Research Assistant at Vision and Image Processing Lab (VIP), UW (2017.05 - 2019.04). Supervisor: Paul Fieguth

Teaching Experience

  • BME 393 Digital Systems, University of Waterloo (2019 Spring). Instructor: Prof. Parsin Haji Reza
  • CS 444 Deep Learning for Computer Vision, University of Illinois at Urbana-Champaign (2023 Fall). Instructor: Prof. Saurabh Gupta

Employment Experience

  • Research Intern at Google (2023.05 - 2023.08).
    Large language model for video summarization.
  • Applied Scientist Intern at Amazon Go (2022.01 - 2022.08). Supervisors: Chuhang Zou, Jaechul Kim
    Large-language-model-driven open-vocabulary human activity recognition, generalizable to new scenes, activities, or objects.
    Video scene graph construction by commonsense reasoning.
  • Research Associate at Vision and Image Processing Lab (VIP), UW (2019.05 - 2019.08).
    Worked with Prof. Paul Fieguth and Prof. Alex Wong on text-to-image generation.
  • Research Intern at Christie Digital Systems Canada Inc. (2017.03 - 2018.08). Mentor: Mr. Mark Lamm
    Content-adaptive high-resolution enhancement using one/multiple low-resolution projector(s)

Projects

  • Multimodal Video Understanding and Synthesis
    Dance video recognition by tracking, 2D pose estimation, and unsupervised 3D pose estimation.
    Developed an action proposal generation network for automatic segmentation of a video into clips.
    Multimedia Human Activity Video Generation
  • Multimedia Attribute Discovery and Language Acquisition
    Concept acquisition and multimedia attribute extraction through curriculum learning.
    Language acquisition through textual-attribute-guided portrait painting generation.

Personal

  • I love travelling and food. I've visited much of China, the east and west coasts of the United States, and eastern Canada. Here are some photographs I took during my travels.
  • I have a lasting passion for the piano and have been playing since I was 5 years old. I have performed in a few local piano concerts and passed the level 10 piano grading test in China. In college, I also sang soprano in the choir. Here are some of my favorite pieces that I played while working from home: Fantaisie-Impromptu, Minute Waltz, 千と千尋 (Spirited Away).
  • I love badminton, table tennis, volleyball, and swimming. I have received special training in these sports, which has made me stronger and makes playing more fun. I also enjoy outdoor and extreme sports: I often climb mountains on long weekends, go drifting in summer, and ski in winter. For the past few years, I have been thinking about getting a diving certificate and a recreational pilot permit.
  • I enjoy being involved in volunteer activities, and I served as president of the volunteer association during my undergraduate studies. I still keep in touch with a school for deaf and mute students in Beijing.