Offline safe reinforcement learning (RL) is a promising approach for learning safe behaviors without risky online interaction with the environment. Existing offline safe RL methods, however, often produce policies that are either overly conservative or in violation of the safety constraints. This paper proposes a new approach to offline safe RL that learns a policy to generate desirable trajectories while avoiding undesirable ones. The approach partitions a pre-collected dataset into desirable and undesirable subsets and uses a classifier to score trajectories.
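To make the partition-and-score idea concrete, below is a minimal sketch of one plausible instantiation; the paper does not specify these details, so the feature representation, the cost and return thresholds (`cost_limit`, `return_floor`), and the choice of a logistic-regression classifier are all illustrative assumptions, not the paper's method.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical sketch: split a batch of pre-collected trajectories into
# desirable and undesirable subsets by cumulative cost and return, then
# fit a classifier whose probability output scores each trajectory.
# All names and thresholds here are assumptions for illustration.

rng = np.random.default_rng(0)

# Each trajectory is summarized by a feature vector (e.g., an embedding
# of its states and actions) plus its cumulative return and cost.
n_traj, feat_dim = 200, 8
traj_features = rng.normal(size=(n_traj, feat_dim))
returns = traj_features[:, 0] + rng.normal(scale=0.1, size=n_traj)
costs = np.abs(traj_features[:, 1]) + rng.normal(scale=0.1, size=n_traj)

cost_limit = np.median(costs)      # assumed safety budget
return_floor = np.median(returns)  # assumed minimum acceptable return

# Desirable = safe AND high-return; everything else is undesirable.
desirable = (costs <= cost_limit) & (returns >= return_floor)

# Train a classifier to distinguish the two subsets; its predicted
# probability acts as a per-trajectory desirability score that a
# downstream policy objective could weight toward safe, rewarding behavior.
clf = LogisticRegression().fit(traj_features, desirable.astype(int))
scores = clf.predict_proba(traj_features)[:, 1]

print(f"mean score (desirable):   {scores[desirable].mean():.3f}")
print(f"mean score (undesirable): {scores[~desirable].mean():.3f}")
```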