Lijie Fan (樊立杰)

  • Ph.D. student in Computer Science
  • Massachusetts Institute of Technology
  • Email: lijiefan[at]

    I am a Ph.D. student at MIT CSAIL, advised by Prof. Dina Katabi. My research interests lie mainly in machine perception and learning from vision and wireless signals. Previously, I received my bachelor's degree from Tsinghua University.

  • Publications

    (*: equal contribution)

    Selected Research Projects

    Making the Invisible Visible: Action Recognition Through Walls and Occlusions

    Tianhong Li*, Lijie Fan*, Mingmin Zhao et al.

    We introduce a neural network model that can detect human actions through walls and occlusions. Our model takes radio frequency (RF) signals as input, generates 3D human skeletons as an intermediate representation, and recognizes actions. Our model achieves comparable accuracy to vision-based action recognition systems in visible scenarios, yet continues to work accurately when people are not visible, hence addressing scenarios that are beyond the limit of today’s vision-based models.

    Controllable Image-to-Video Translation: A Case Study on Facial Expression Generation

    Lijie Fan, Wenbing Huang, Chuang Gan et al.

    We propose a user-controllable approach to generate video clips of various lengths from a single face image. The lengths and types of the expressions are controlled by users. To this end, we design a novel neural network architecture that can incorporate the user input into its skip connections and propose several improvements to the adversarial training method for the neural network. Experiments and user studies verify the effectiveness of our approach.
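    The idea of feeding the user input into a skip connection can be illustrated with a minimal sketch (this is a generic NumPy illustration, not the paper's architecture; the function name and concatenation scheme are assumptions):

    ```python
    import numpy as np

    def skip_with_control(encoder_feat, control):
        """Inject a user-control vector into an encoder feature map
        before it is passed across a skip connection.

        encoder_feat: (H, W, C) feature map from the encoder.
        control:      (K,) user-control vector (e.g. expression type/length).
        Returns a (H, W, C + K) map where the control code is broadcast
        to every spatial position and concatenated onto the features.
        """
        h, w, _ = encoder_feat.shape
        ctrl = np.broadcast_to(control, (h, w, control.shape[0]))
        return np.concatenate([encoder_feat, ctrl], axis=-1)

    feat = np.random.rand(8, 8, 16)     # toy encoder feature map
    code = np.array([1.0, 0.0, 0.5])    # toy user-control code
    out = skip_with_control(feat, code)
    print(out.shape)                    # (8, 8, 19)
    ```

    Broadcasting the control code spatially lets every decoder location see the user's choice, which is one common way to condition a U-Net-style skip connection.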

    End-to-End Learning of Motion Representation for Video Understanding

    Lijie Fan*, Wenbing Huang*, Chuang Gan et al.

    We propose TVNet, a well-initialized, end-to-end trainable network that learns motion from videos by unfolding the iterations of the TV-L1 optical-flow method into specific neural layers. Despite being initialized to reproduce standard TV-L1, TVNet can be fine-tuned to learn richer, more task-oriented features than standard optical flow. Our model achieves better accuracy than other motion representation methods and C3D on HMDB-51, UCF-101 and ASLAN.
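    The unfolding idea can be sketched generically (this is not the TVNet code; for brevity, TV-L1 is replaced here by gradient descent on a simple least-squares objective, and the per-layer step sizes stand in for the learnable parameters):

    ```python
    import numpy as np

    def unrolled_solver(A, b, step_sizes):
        """Unroll gradient descent on 0.5 * ||Ax - b||^2 into 'layers'.

        Each loop iteration corresponds to one network layer; step_sizes
        play the role of per-layer parameters, initialized from the
        classical solver and then free to be fine-tuned independently.
        """
        x = np.zeros(A.shape[1])
        for tau in step_sizes:           # one iteration == one layer
            grad = A.T @ (A @ x - b)     # gradient of the data term
            x = x - tau * grad           # layer update with its own step size
        return x

    A = np.array([[2.0, 0.0], [0.0, 1.0]])
    b = np.array([4.0, 3.0])
    # Initialize every layer with the same classical step size; training
    # would adjust each entry separately.
    x_hat = unrolled_solver(A, b, [0.2] * 100)
    print(np.round(x_hat, 4))
    ```

    The point of the construction is that before any training, the unrolled network already computes what the classical iterative method computes, so learning starts from a well-initialized state rather than from random weights.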

    Towards Efficient Action Recognition: Principal Backpropagation for Training Two-Stream Networks

    Wenbing Huang*, Lijie Fan*, Mehrtash Harandi et al.

    We propose PBNets, which are trained with a Watch-and-Choose mechanism. Our approach exploits a dense snippet-wise temporal pooling strategy to discover the global characteristics of each input video, while backpropagating through only a small number of representative snippets selected with two novel strategies, the Max-rule and the KL-rule. The proposed model consistently outperforms state-of-the-art methods on the UCF-101 and HMDB-51 datasets.

    Adversarial Localization Network

    Lijie Fan, Shengjia Zhao, Stefano Ermon

    We apply adversarial training to weakly supervised object localization for the first time, making the classifier more robust to adversarial noise. We use super-pixel representations, which incorporate object-boundary structure, so that the predicted mask follows the object contour more closely at a given resolution. Experiments show that our method obtains results superior or comparable to the state of the art with a fairly small model, few hyper-parameters, and little post-processing.

    Professional Services

  • Conference Reviewer: CVPR, ICCV, NeurIPS, AAAI