Recent advances in image generation and editing show a marked shift toward more efficient and versatile methods. Researchers are increasingly developing techniques that improve the fidelity and controllability of image outputs without extensive model fine-tuning or heavy computational cost. One notable trend is the exploration of alternative attention mechanisms, such as uniform attention maps, which have been shown to improve image reconstruction fidelity and editing precision. There is also growing interest in steering rectified flow models toward controlled image generation, leveraging the learned vector field to guide denoising trajectories efficiently. Another line of work designs scale-wise transformers for text-to-image synthesis, optimizing sampling speed and memory usage while maintaining high-quality outputs. Finally, decoder-only autoregressive models that generate image tokens in random order have broadened the capabilities of visual generation models, enabling new applications such as inpainting and outpainting. Together, these developments point toward more flexible, efficient, and high-fidelity image generation and editing.
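The steering idea can be sketched with a toy rectified flow: sampling integrates a velocity field v(x, t) from noise at t = 0 to data at t = 1 with Euler steps, and adding a guidance term to the velocity steers the trajectory without retraining the model. Everything below is a hypothetical stand-in for illustration — the analytic single-point velocity field, the guidance function, and the step count are assumptions, not any specific paper's method.

```python
import numpy as np

def velocity(x, t, d):
    # Toy stand-in for a learned rectified-flow velocity field: for a flow
    # "trained" on a single data point d with straight-line couplings,
    # v(x, t) = (d - x) / (1 - t). The clamp avoids division by zero near t = 1.
    return (d - x) / max(1.0 - t, 1e-3)

def sample(d, guidance=None, scale=0.0, steps=100, seed=0):
    # Integrate dx/dt = v(x, t) [+ scale * guidance(x)] from t = 0 to t = 1.
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(d.shape)      # start from Gaussian noise at t = 0
    dt = 1.0 / steps
    for i in range(steps):
        t = i * dt
        v = velocity(x, t, d)
        if guidance is not None:
            v = v + scale * guidance(x)   # steer the denoising trajectory
        x = x + dt * v                    # Euler step along the vector field
    return x

d = np.array([1.0, -2.0])
plain = sample(d)                         # converges to (approximately) d
# Hypothetical guidance: a reward gradient pulling the first coordinate toward 3.0.
steered = sample(d, guidance=lambda x: np.array([3.0 - x[0], 0.0]), scale=0.5)
```

The unguided sample lands on the data point, while the guided trajectory is shifted in the direction the guidance term pushes — the same mechanism that makes inference-time steering attractive: control comes from modifying the velocity, not the weights.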
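The random-order generation idea reduces to a simple loop: pick an unfilled position, predict its token conditioned on everything generated so far, and repeat; pinning some positions in advance turns the same loop into inpainting or outpainting. The sketch below uses a hypothetical toy predictor (nearest-neighbour copy) in place of a trained decoder; the function names and signatures are illustrative, not from any specific model.

```python
import random

def generate_random_order(length, predict, known=None, seed=0):
    """Fill a sequence one position at a time in a random order.

    `predict(context, pos)` is a stand-in for a decoder-only model: it maps
    the (position, token) pairs generated so far plus a target position to a
    token. `known` pins tokens at given positions, which turns the same loop
    into inpainting (fill the middle) or outpainting (extend the ends).
    """
    rng = random.Random(seed)
    seq = dict(known or {})                 # already-observed tokens
    order = [p for p in range(length) if p not in seq]
    rng.shuffle(order)                      # visit missing positions in random order
    for pos in order:
        seq[pos] = predict(sorted(seq.items()), pos)
    return [seq[p] for p in range(length)]

def nearest_copy(context, pos):
    # Hypothetical predictor: copy the token of the nearest filled position.
    if not context:
        return 0
    return min(context, key=lambda kv: abs(kv[0] - pos))[1]

# Inpainting-style use: positions 0 and 7 are fixed, the rest are generated.
full = generate_random_order(8, nearest_copy, known={0: 5, 7: 9})
```

Because each step conditions on an arbitrary subset of filled positions rather than a left-to-right prefix, the same trained model serves generation, inpainting, and outpainting without task-specific heads.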