A new approach to building generalizable reward models in Reinforcement Learning from Human Feedback (RLHF) is proposed. Existing reward models often fail to correctly evaluate unseen prompt-response pairs. The proposed approach decomposes the reward value into a prompt-free reward, which depends only on the response, and a prompt-related reward, which depends on the full prompt-response pair. The accompanying reward learning algorithm prioritizes data samples based on their prompt-free reward values.
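To make the decomposition concrete, below is a minimal PyTorch sketch, assuming precomputed prompt and response embeddings. All names here (`DecomposedRewardModel`, `prioritize_batch`, the encoder layers, and the largest-magnitude selection rule) are illustrative assumptions, not details from the paper; the abstract states only that the reward splits into prompt-free and prompt-related terms and that samples are prioritized by prompt-free reward values.

```python
import torch
import torch.nn as nn


class DecomposedRewardModel(nn.Module):
    """Sketch of a reward model whose scalar output is the sum of a
    prompt-free term (response only) and a prompt-related term
    (prompt-response pair). Architecture is a placeholder; a real
    model would share a pretrained LM backbone."""

    def __init__(self, hidden_dim: int = 768):
        super().__init__()
        self.response_encoder = nn.Linear(hidden_dim, hidden_dim)
        self.pair_encoder = nn.Linear(2 * hidden_dim, hidden_dim)
        self.prompt_free_head = nn.Linear(hidden_dim, 1)
        self.prompt_related_head = nn.Linear(hidden_dim, 1)

    def forward(self, prompt_emb: torch.Tensor, response_emb: torch.Tensor):
        # Prompt-free reward: scores the response in isolation.
        r_free = self.prompt_free_head(
            torch.tanh(self.response_encoder(response_emb))
        ).squeeze(-1)
        # Prompt-related reward: scores the prompt-response pair.
        pair = torch.cat([prompt_emb, response_emb], dim=-1)
        r_rel = self.prompt_related_head(
            torch.tanh(self.pair_encoder(pair))
        ).squeeze(-1)
        # Total reward is the sum of the two components.
        return r_free + r_rel, r_free


def prioritize_batch(prompt_free_rewards: torch.Tensor, k: int) -> torch.Tensor:
    """Return indices of the k samples with the largest |prompt-free reward|.

    The exact criterion (largest magnitude first) is an assumption made
    for illustration; the abstract specifies only that prioritization is
    based on prompt-free reward values.
    """
    return torch.topk(prompt_free_rewards.abs(), k).indices
```

In this sketch, training would feed the prioritized subset of each batch to the preference loss, using the returned `r_free` values solely for sample selection while the summed reward drives the gradient.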