This paper presents a spoken dialogue framework that helps users make decisions. Selecting from a set of alternatives involves various decision criteria. Users often lack a definite goal or definite selection criteria, so the system has to bridge this knowledge gap and, through dialogue, recommend an appropriate alternative together with the reason for the recommendation. We propose a dialogue state model that considers the user’s preferences as well as their knowledge about the domain, which changes through a decision-making dialogue. A user simulator is trained on statistics collected with a trial sightseeing guidance system. We then optimize the dialogue strategy based on the state model through reinforcement learning with a natural policy gradient approach, using this user simulator.
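As a rough illustration of the kind of dialogue state described above, the following minimal Python sketch tracks two things: the user's preference weights over decision criteria and the set of criteria the user knows about. The class name `DialogueState`, the methods `explain` and `score`, the criteria names, and all numeric values are hypothetical and chosen for illustration only; the sketch does not reproduce the model, the simulator, or the policy optimization used in this work.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class DialogueState:
    """Hypothetical state: what the user prefers and what the user knows."""

    # Assumed preference weight per decision criterion (illustrative values).
    preferences: Dict[str, float]
    # Criteria the user is assumed to know about so far.
    known: Set[str] = field(default_factory=set)

    def explain(self, criterion: str) -> None:
        """System action: explaining a criterion adds it to the user's knowledge."""
        self.known.add(criterion)

    def score(self, alternative: Dict[str, float]) -> float:
        """Weighted value of an alternative, counting only known criteria."""
        return sum(
            weight * alternative.get(criterion, 0.0)
            for criterion, weight in self.preferences.items()
            if criterion in self.known
        )


def scores(state: DialogueState) -> Tuple[float, float]:
    """Score both illustrative alternatives under the current state."""
    return state.score(TEMPLE), state.score(GARDEN)


# Illustrative alternatives only; not taken from the collected dialogue data.
TEMPLE = {"scenery": 1.0, "access": 0.5}
GARDEN = {"scenery": 0.5, "access": 1.0}

state = DialogueState(preferences={"scenery": 0.75, "access": 0.25})

state.explain("access")
before = scores(state)  # only "access" influences the comparison

state.explain("scenery")
after = scores(state)   # both criteria influence the comparison
```

In this toy setting the preferred alternative flips once the user learns about a second criterion, which is the reason a state model of this kind would track knowledge alongside preferences.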