Nonlinear control systems in which the decision maker has only partial information are prevalent in a variety of applications. This work explores reinforcement learning methods for finding the optimal policy in nearly linear-quadratic regulator systems. The cost function of the system is nonconvex, but the study establishes local strong convexity and smoothness in the vicinity of the global optimizer. A policy gradient algorithm is proposed that is guaranteed to converge to the globally optimal policy at a linear rate.
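
As a rough illustration of the setting, the following minimal sketch runs gradient descent on the gain of a linear feedback policy for a toy scalar system whose dynamics are linear plus a small nonlinear term. Everything in it is an assumption made for illustration and is not taken from this work: the dynamics `x' = A*x + B*u + EPS*sin(x)`, the constants, the policy class `u = -k*x`, the finite horizon, and the fixed set of initial states. The gradient is estimated by central finite differences of the rollout cost, which stands in for whatever gradient estimator the proposed algorithm actually uses.

```python
import numpy as np

# Hypothetical toy problem: every constant below is an illustrative assumption.
A, B = 0.9, 0.5                    # linear part of the dynamics
Q, R = 1.0, 1.0                    # quadratic state and control cost weights
EPS = 0.01                         # size of the small nonlinear perturbation
HORIZON = 50                       # rollout length used to approximate the cost
X0 = np.linspace(-1.0, 1.0, 9)     # fixed set of initial states


def cost(k):
    """Average rollout cost of the linear policy u = -k * x."""
    x = X0.copy()
    total = np.zeros_like(x)
    for _ in range(HORIZON):
        u = -k * x
        total += Q * x**2 + R * u**2
        # Nearly linear dynamics: linear term plus a small nonlinear term.
        x = A * x + B * u + EPS * np.sin(x)
    return float(total.mean())


def policy_gradient(k0, step=0.05, iters=200, delta=1e-4):
    """Gradient descent on the gain k with a central finite-difference gradient."""
    k = k0
    for _ in range(iters):
        grad = (cost(k + delta) - cost(k - delta)) / (2.0 * delta)
        k -= step * grad
    return k


def lqr_gain(iters=500):
    """Gain of the purely linear problem (EPS = 0) via scalar Riccati iteration."""
    p = Q
    for _ in range(iters):
        p = Q + A**2 * p - (A * B * p) ** 2 / (R + B**2 * p)
    return A * B * p / (R + B**2 * p)


k_init = 0.1                       # stabilizing initial gain: |A - B * k_init| < 1
k_learned = policy_gradient(k_init)
k_lqr = lqr_gain()
print("learned gain:", k_learned)
print("LQR gain of the linear part:", k_lqr)
print("cost before and after:", cost(k_init), cost(k_learned))
```

Because the perturbation is small, the learned gain should land close to the LQR gain of the linear part, which gives a simple sanity check. The sketch does not demonstrate the convergence rate or the local strong convexity result; those claims rest on the analysis in the work itself.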