Recent advances in neural network optimization and hardware-algorithm co-design have significantly pushed the boundaries of efficiency and performance in resource-constrained environments. A notable trend is the use of deep reinforcement learning to optimize core placement in many-core near-memory computing systems, with the goal of increasing parallelism while reducing power consumption. The development of binary-native, gradient-free training algorithms for binary neural networks has likewise opened new avenues for operation-optimized training, demonstrating substantial accuracy improvements with minimal hardware requirements (a toy sketch of this idea follows below). Hardware-aware training methodologies are also gaining traction, particularly for optimizing the memory footprint of neural networks deployed on event-based processors, where routing-aware training has been shown to significantly improve both accuracy and memory efficiency. The field is further witnessing innovative approaches to error detection and correction in ReRAM crossbar arrays, which preserve the fault-free accuracy of deep learning accelerators with minimal overhead. Collectively, these developments underscore a shift toward more integrated and efficient solutions that bridge the gap between algorithm design and hardware implementation, promising scalable and robust neural network deployments across diverse computational environments.
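To make the gradient-free, binary-native idea concrete, the sketch below trains a tiny two-layer binary network with a flip-and-evaluate local search on synthetic data. This is a minimal illustration of the general principle, not any specific published algorithm; the network shape, the toy dataset, and the random sign-flip heuristic are all assumptions chosen for brevity.

```python
import numpy as np

# Minimal sketch of gradient-free, binary-native training (illustrative only,
# not a published algorithm): weights live in {-1, +1} and are updated by
# proposing random sign flips and keeping those that do not hurt batch error.
rng = np.random.default_rng(0)

def binarize_input(x):
    """Map real-valued inputs to {-1, +1} so the whole forward pass is binary."""
    return np.where(x >= 0, 1, -1).astype(np.int8)

def forward(W1, W2, x):
    """Two-layer binary network: sign activations, integer matmuls only."""
    h = np.sign(x @ W1)          # hidden pre-activations -> {-1, 0, +1}
    h[h == 0] = 1                # break ties so activations stay binary
    return h @ W2                # integer logits

def loss(W1, W2, X, y):
    """0/1 classification error on a batch; no gradients are ever computed."""
    logits = forward(W1, W2, X)
    return np.mean(np.argmax(logits, axis=1) != y)

# Toy data: two Gaussian blobs with labels 0/1 (hypothetical stand-in task).
X = binarize_input(np.r_[rng.normal(1, 1, (100, 16)), rng.normal(-1, 1, (100, 16))])
y = np.r_[np.zeros(100, int), np.ones(100, int)]

# Random {-1, +1} weights for both layers.
W1 = rng.choice([-1, 1], size=(16, 32)).astype(np.int8)
W2 = rng.choice([-1, 1], size=(32, 2)).astype(np.int8)

best = loss(W1, W2, X, y)
for step in range(2000):
    # Propose flipping the signs of a few random weights in one layer.
    W = W1 if rng.random() < 0.5 else W2
    idx = (rng.integers(W.shape[0], size=4), rng.integers(W.shape[1], size=4))
    W[idx] *= -1
    new = loss(W1, W2, X, y)
    if new <= best:              # keep flips that do not increase batch error
        best = new
    else:                        # otherwise revert the proposal
        W[idx] *= -1

print(f"final training error: {best:.3f}")
```

The design point the sketch highlights is that training requires only forward passes over integer-valued tensors, with no floating-point gradients or backpropagation state, which is why this family of methods maps well onto minimal, low-precision hardware.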