UW researchers develop a new method for training AI systems to predict users’ preferences

University of Washington researchers have developed a new AI training method called “variational preference learning” (VPL) to improve how AI systems reflect diverse user values. Traditional AI models, trained using reinforcement learning from human feedback (RLHF), often inherit the biases and preferences of their trainers, which may not align with the diverse values of all users. VPL addresses this by predicting and tailoring AI responses to individual preferences based on user feedback.

While RLHF tends to average preferences, leading to generic or biased responses, VPL enables more personalized outputs, accommodating cultural differences and unique user needs. The researchers also cautioned that while personalizing AI responses could reduce bias, it may also risk amplifying certain biases, highlighting the importance of balancing personalization with broader, inclusive alignment.
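The contrast described above can be illustrated with a toy sketch (this is an illustrative example, not the researchers’ actual code; the user names, response labels, and reward values are all invented). Averaging reward signals across annotators with opposing preferences washes out the signal, while conditioning the reward on a per-user variable, in the spirit of VPL, preserves it:

```python
# Toy setup (hypothetical): two users with opposing preferences over
# the same two candidate responses.
# user_rewards[user][response] = reward that user assigns to that response.
user_rewards = {
    "user_a": {"formal": 1.0, "casual": 0.0},
    "user_b": {"formal": 0.0, "casual": 1.0},
}

def averaged_reward(response):
    """RLHF-style single reward model: average over all annotators."""
    return sum(r[response] for r in user_rewards.values()) / len(user_rewards)

def conditioned_reward(user, response):
    """VPL-style idea: condition the reward on an inferred per-user
    preference variable (here, user identity stands in for that latent)."""
    return user_rewards[user][response]

# The averaged model scores both responses identically (0.5 each),
# so it cannot express either user's actual preference...
print(averaged_reward("formal"), averaged_reward("casual"))
# ...while the user-conditioned model recovers each preference exactly.
print(conditioned_reward("user_a", "formal"))  # 1.0
print(conditioned_reward("user_b", "casual"))  # 1.0
```

In the real method, the per-user conditioning variable is a learned latent inferred from that user’s feedback rather than an explicit lookup table, but the failure mode of averaging is the same.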

Read the Original Article >