Difficulty: Advanced. Categories: Core Programming AI.

This course covers modern computer vision in depth, from theoretical foundations to current applications. Students examine how machines interpret visual data using algorithms and deep learning architectures. Key topics include convolutional neural networks (CNNs), vision transformers (ViTs), object detection, semantic and instance segmentation, 3D vision, and generative vision models such as diffusion models. The course also covers self-supervised learning, transfer learning, and multimodal vision-language systems. Through lectures, research paper discussions, and hands-on projects, students learn to design, implement, and optimize computer vision systems for real-world applications such as autonomous driving, medical imaging, and intelligent surveillance.
Prerequisites
Students should have a solid foundation in Python programming, linear algebra, probability, and basic machine learning (e.g., familiarity with neural networks and CNNs).