Computer vision research is advancing rapidly in depth estimation and visual perception across varied environments and conditions. Recent work draws on monocular foundation models, LiDAR-visual-thermal datasets, and symmetry guidance for point cloud completion to achieve state-of-the-art performance. Notably, integrating motion and structure priors has enabled robust depth estimation in diverse outdoor conditions. Benchmarks such as RGB-Th-Bench and LENVIZ have facilitated the evaluation of vision-language models and low-light image enhancement techniques. Other approaches, including intrinsic image decomposition and patchwise refinement, address challenges in self-supervised monocular depth estimation and high-resolution image processing. Noteworthy papers include "Distilling Monocular Foundation Model for Fine-grained Depth Completion", which achieved first place on the KITTI benchmark, and SymmCompletion, which introduced a point cloud completion method based on symmetry guidance. In addition, DiffV2IR proposed a framework for visible-to-infrared image translation, and "Synthetic-to-Real Self-supervised Robust Depth Estimation via Learning with Motion and Structure Priors" presented a depth estimation framework built on those priors.