2026 3rd International Conference on Computer Vision, Image Processing and Computational Photography (CVIP 2026)

Prof. Xin Lv

Communication University of China, China

Bio: Xin Lv is a Professor and PhD Supervisor at the School of Animation and Digital Arts, Communication University of China, where he also serves as the Dean of the Digital Human Research Center. He holds the positions of Director of the Network Technology and Intelligent Media Design Committee under the National Advisory Committee on Computer Fundamental Education for Universities, and Secretary-General of the Animation and Digital Media Art Committee of the National Association of Film and Television Education in Higher Institutions. His accolades include the National First Prize for Teaching Achievements, the 2023 "Rainbow Award" for Technological Innovation, and the Best Digital Creativity Award at the Asia Youth Animation and Digital Art Competition. He has also been selected for the "Young Talents Program" of Beijing Universities. His research primarily focuses on virtual digital humans and intelligent product design.

Title: Generative AI and Digital Humans: Exploring the New Frontiers of Empathetic Media

Abstract: The advancing maturity of generative AI, affective computing, and digital human technology is collectively driving media beyond its passive role as a "conduit" for human information, giving rise to intelligent media entities with empathic capabilities. This lecture unveils the core paradigm of "empathic media" through intelligent digital humans: By integrating large language models for contextual understanding, affective AI for emotional recognition, and fuzzy logic-driven multimodal body language expression, digital humans have evolved from audiovisual symbols into interactive agents capable of emotional resonance—transforming from digital shells into communicative partners with engaging souls. Grounded in the speaker's five years of digital human R&D case studies, the talk delves into the technical architecture and ethical challenges of empathic media, while envisioning the future of human-AI emotional interaction.

Prof. Xi Li

Zhejiang University, China

Bio: Xi Li is a Qishi Distinguished Professor at Zhejiang University, a recipient of the National Science Fund for Distinguished Young Scholars, and a Member of the National Academy of Artificial Intelligence (NAAI). He is also a Fellow of IAPR, IET, and AAIA, a Senior Member of IEEE, and a Distinguished Member of CCF and CSIG. Additionally, he has been honored as a National Young Talents Specialist, a Distinguished Expert of Zhejiang Province, and a recipient of the Zhejiang Provincial Science Fund for Distinguished Young Scholars. His research focuses on artificial intelligence and computer vision, and he has published over 200 high-quality academic papers. He serves as an editorial board member for international journals such as TMM and as an Area Chair for conferences such as ICCV. He has led several major research projects, including the Ministry of Science and Technology's 2030 New Generation Artificial Intelligence Major Project, the NSFC Joint Fund Key Project, the NSFC Mathematical Tianyuan Fund Cross-Disciplinary Key Project, the Ministry of Education Key Planning Research Project, the Military Science and Technology Commission Major Special Project, the Zhejiang Provincial Fund Major Project, the Zhejiang Provincial Distinguished Young Scholars Project, and the Ningbo Science and Technology Innovation Yongjiang 2035 Key R&D Project. As the first-ranked contributor, he has received multiple awards, including the 2023 China Invention Association Invention Entrepreneurship Award Innovation Prize First Class, the 2025, 2024, and 2022 Huawei "Problem Bounty" Spark Value Award, the 2023 Lu Zengyong CAD&CG High-Tech Award First Class, the 2021 Huawei Outstanding Technical Cooperation Achievement Award, the 2021 China Industry-University-Research Collaboration Innovation and Promotion Award (Individual Award), and the 2021 China Society of Image and Graphics Natural Science Award Second Class.
Additionally, he has received the 2022 Ministry of Education Science and Technology Progress Award First Class (ranked 2nd), the 2021 China Electronics Society Science and Technology Progress Award First Class (ranked 4th), the 2013 China Patent Excellence Award (ranked 5th), and the 2012 Beijing Science and Technology Award First Class (ranked 9th). He has also been honored with four Best Academic Paper Awards. Under his supervision, students have received the China Society of Image and Graphics Doctoral Dissertation Incentive Program Award and the Zhejiang Provincial Outstanding Master's Thesis Award.

Title: World Model Generation from Multimodal Representations for Images and Videos

Abstract: Image and video generation is a key focus and a challenging problem in artificial intelligence, particularly in connection with the underlying approach of building interactive world models. This report provides an in-depth analysis from multiple perspectives, such as efficient multimodal generation, understanding, and expression, centered on data-driven AI learning methods. It systematically reviews the developmental stages of multimodal feature representation and learning, and introduces representative research that we have conducted in recent years, together with its practical applications, using feature learning for visual semantic analysis, understanding, and generation. Special attention is given to the application potential of these technologies in building video generation-driven, real-time interactive world simulators. World simulators aim to simulate the evolution of the real world by generating dynamic video sequences that adhere to physical laws, providing a foundation for decision-making, simulation, and content creation. This places core demands on the efficiency, controllability, temporal consistency, and physical plausibility of generative models. Finally, the report will discuss some of the open questions and challenges in multimodal visual generation and understanding.

Assoc. Prof. Fang-Lue Zhang

Victoria University of Wellington, New Zealand

Bio: Fang-Lue Zhang is a Senior Lecturer at Victoria University of Wellington, New Zealand. He received his Ph.D. in Computer Science from Tsinghua University in 2015. Since 2009, Dr. Zhang has focused on research in computer graphics and intelligent image/video editing methods. He has proposed numerous innovative approaches in the structured representation, analysis, and synthesis of images and videos, as well as in perception-based visual media analysis and editing. He has published over 100 papers in international conferences and journals in the fields of computer graphics and artificial intelligence, including more than 30 papers in top-tier publications such as IEEE TPAMI, ACM SIGGRAPH/SIGGRAPH Asia, ACM TOG, IEEE TVCG, IEEE TIP, and AAAI. Dr. Zhang has received the Victoria University of Wellington Early-Career Research Excellence Award (2019) and the Royal Society of New Zealand Fast-Start Marsden Grant (2020). He has served as the Program Chair of Pacific Graphics 2020 and 2021, and as the Program Chair of CVM 2024. He is currently a member of the IEEE Central New Zealand Section and serves on the editorial boards of several international journals in computer graphics.

Title: Gaze-Driven 360-degree Scene Analysis and Enhancement

Abstract: Panoramic images and videos can present 360-degree real-world scenes, providing users with a highly immersive experience. Compared to traditional virtual reality (VR) scenes generated through complex modeling and rendering, panoramic images and videos are directly captured from the real world, offering a more intuitive and comprehensive representation of the scene. Visual perception in panoramic environments plays a crucial role in the quality of user experience, and understanding and analyzing users' visual perception is one of the core challenges in this field. This report focuses on a key perceptual feature in 360-degree images and videos—the user scanpath—and explores how deep learning techniques can be applied to predict scanpaths and enhance image quality based on such user gaze trajectories.

Prof. Shitong Wang

Jiangnan University, China

Bio: Shitong Wang received the M.S. degree in computer science from the Nanjing University of Aeronautics and Astronautics, Nanjing, China, in 1987. He spent over eight years as a research scientist or visiting professor at London University and Bristol University in the U.K., Hiroshima International University, Osaka Prefecture University, Hong Kong University of Science and Technology, and Hong Kong Polytechnic University. He has authored or co-authored 90 papers in several IEEE Transactions journals. He is currently a Full Professor with the School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, China.

Title: Recent Advances in Multi-View Fuzzy System Modeling

Abstract: Multi-view data are ubiquitous in practical applications. Earlier multi-view fuzzy model learning focused on collaboration among views. In this talk, Prof. Wang will introduce recent advances in collaborative learning frameworks for multi-view fuzzy systems that consider not only the diversity among views but also their consistency on visible and/or hidden information. Experimental results from Prof. Wang's team will be reported to demonstrate the effectiveness of these frameworks.