Visual content is the most natural abstraction of the real world. Interacting with various types of visual content on mobile devices, anytime and anywhere, is the promise of future mobile computing. Such sustained mobile visual computing will revolutionize cyber-physical systems, especially cyber-human interaction, in numerous applications from education and entertainment to infrastructure and healthcare monitoring. However, today's systems, which are highly optimized for static content or stationary devices, fail to achieve the energy efficiency required by this long-term vision. In this talk, I will present a human-centered perspective that bridges this gap by understanding human perception of dynamic visual content in the mobile context. I will focus on the end-to-end video pipeline, and use sustained mobile video delivery and display as two examples to demonstrate the benefits of the human-centered perspective. This work has resulted in prototypes that save mobile energy at scale, as well as new algorithms and models that enrich visual computing theory. Some ideas from this work are also being standardized by MPEG. I will end my talk by discussing my future research directions for pursuing sustained mobile visual computing with other types of visual content and computing tasks.