Policy gradient methods in federated reinforcement learning face challenges in guaranteeing convergence under heterogeneous environments: heterogeneity can make the optimal policy non-deterministic or time-varying, even in tabular settings. Global convergence results are established for federated policy gradient algorithms that use local updates, under specific conditions. The proposed b-RS-FedPG method admits explicit convergence rates towards near-optimal policies and empirically outperforms federated Q-learning in heterogeneous settings.
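To make the setup concrete, below is a minimal sketch of federated policy gradient with local updates and periodic parameter averaging on heterogeneous tabular MDPs. It is illustrative only: it assumes a tabular softmax parameterization, exact gradients, and uniform server averaging, and it is not the paper's b-RS-FedPG algorithm; all names and hyperparameters are hypothetical.

```python
import numpy as np

# Illustrative sketch (not b-RS-FedPG): federated softmax policy gradient on
# small tabular MDPs. Each agent owns a heterogeneous MDP, runs H local
# policy-gradient steps, and a server averages the softmax parameters.

rng = np.random.default_rng(0)
S, A, N, GAMMA = 4, 2, 3, 0.9          # states, actions, agents, discount

def random_mdp():
    """A small random tabular MDP: transitions P[s, a, s'] and rewards R[s, a]."""
    P = rng.dirichlet(np.ones(S), size=(S, A))
    R = rng.uniform(0.0, 1.0, size=(S, A))
    return P, R

def softmax_policy(theta):
    """Tabular softmax policy pi[s, a] from logits theta[s, a]."""
    z = theta - theta.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def value_and_grad(theta, P, R):
    """Exact discounted value under a uniform start distribution and its policy gradient."""
    pi = softmax_policy(theta)
    P_pi = np.einsum("sa,sap->sp", pi, P)              # state transitions under pi
    r_pi = (pi * R).sum(axis=1)                        # expected reward under pi
    V = np.linalg.solve(np.eye(S) - GAMMA * P_pi, r_pi)
    Q = R + GAMMA * P @ V                              # Q-values Q[s, a]
    rho = np.full(S, 1.0 / S)                          # uniform start distribution
    d = (1 - GAMMA) * np.linalg.solve(np.eye(S) - GAMMA * P_pi.T, rho)
    adv = Q - (pi * Q).sum(axis=1, keepdims=True)      # advantage A[s, a]
    grad = (d[:, None] * pi * adv) / (1 - GAMMA)       # softmax policy gradient
    return rho @ V, grad

mdps = [random_mdp() for _ in range(N)]                # heterogeneous environments
theta = np.zeros((S, A))                               # shared global parameters
LR, H, ROUNDS = 1.0, 5, 200                            # step size, local steps, rounds

for t in range(ROUNDS):
    local = []
    for P, R in mdps:
        th = theta.copy()
        for _ in range(H):                             # H local gradient-ascent steps
            _, g = value_and_grad(th, P, R)
            th += LR * g
        local.append(th)
    theta = np.mean(local, axis=0)                     # server averages parameters

avg_value = np.mean([value_and_grad(theta, P, R)[0] for P, R in mdps])
print(f"average value of the federated policy across agents: {avg_value:.3f}")
```

The sketch illustrates the tension the convergence analysis addresses: each agent's local steps ascend its own objective, so the averaged iterate need not ascend any single agent's value, and conditions on heterogeneity, step size, and the number of local updates are what make global convergence claims possible.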