Study on reinforcement learning using Voronoi diagram in continuous state space
http://hdl.handle.net/10232/17425
Name / File | License | Action |
---|---|---|
rikouken378.pdf (806.9 kB) | | |
diss_KATHY-THI-AUNG_201303.pdf (2.7 MB) | | |
Item type | Thesis or Dissertation(1) | |||||
---|---|---|---|---|---|---|
Publication date | 2015-02-18 | |||||
Title | ||||||
Title | Study on reinforcement learning using Voronoi diagram in continuous state space | |||||
Language | en | |||||
Title | ||||||
Title | 連続的な状態空間のボロノイ分割を用いた強化学習に関する研究 | |||||
Language | ja | |||||
Author | Kathy Thi Aung (ケティ ティ オウン) | |||||
Language | ||||||
Language | eng | |||||
Resource type | ||||||
Resource type identifier | http://purl.org/coar/resource_type/c_db06 | |||||
Resource type | doctoral thesis | |||||
Access rights | ||||||
Access rights | open access | |||||
Access rights URI | http://purl.org/coar/access_right/c_abf2 | |||||
Abstract | ||||||
Description type | Other | |||||
Description | Doctoral dissertation (Engineering), Graduate School of Science and Engineering; degree conferred: March 25, 2013 (Heisei 25) | |||||
Language | ja | |||||
Abstract | ||||||
Description type | Other | |||||
Description | "There are several kinds of learning methods; however, much of the research suggests that reinforcement learning (RL) [1] is the most suitable machine learning method for problems in which an agent decides which action to take at discrete time steps, and it is expected to be useful in many fields in the future. There are several ways to implement the learning process, but the Q-learning algorithm due to Watkins [2], which estimates the optimal state-action value (Q-value), is one of the most fundamental methods in RL. Q-learning can be applied in many practical applications, but it works only when the state and the action are both discrete. Continuous state spaces are difficult to handle because of the curse of dimensionality. This dissertation proposes the VQE (Voronoi Q-value Element) so that Q-learning can be applied in continuous state spaces and the curse of dimensionality can be addressed by partitioning the state space. As the method of space division, we apply the Voronoi diagram, which is a general form of space division. However, the Voronoi diagram offers a great deal of flexibility, so determining the positions of the VQEs becomes a problem. Therefore, we present a method of adding VQEs that decides their positions, in which the LBG algorithm is used for adaptive grouping of state transition vectors. In addition, we propose a method of integrating VQEs to reduce the number of states and the memory usage, in which the Delaunay tessellation technique is used to find adjacent VQEs. These proposed methods also aim to improve learning efficiency. To examine the efficiency of the proposed methods, we constructed an experimental model with continuous states and discrete actions. The experiments compare the proposed methods with the lattice-based partitioning of a previous work. The results indicate that the proposed methods perform considerably better than the previous method." | |||||
Language | en | |||||
Date created | ||||||
Date | 2013-03-25 | |||||
Date type | Issued | |||||
Publication type | ||||||
Publication type | VoR | |||||
Publication type resource | http://purl.org/coar/version/c_970fb48d4fbd8a85 | |||||
NDC | ||||||
Subject scheme | NDC | |||||
Subject | 007 | |||||
File (description) | ||||||
Description type | Other | |||||
Description | Dissertation abstract, dissertation full text | |||||
Language | ja | |||||
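Publisher | ||||||
Publisher | 鹿児島大学 | |||||
Language | ja | |||||
Publisher | ||||||
Publisher | Kagoshima University | |||||
Language | en | |||||
Degree certificate number | ||||||
Value | 理工研第378号 | |||||
Degree name | ||||||
Language | ja | |||||
Degree name | 博士(工学) | |||||
Degree name | ||||||
Language | en | |||||
Degree name | Doctor of Philosophy in Engineering | |||||
Degree-granting institution | ||||||
Degree-granting institution identifier scheme | kakenhi | |||||
Degree-granting institution identifier | 17701 | |||||
Language | ja | |||||
Degree-granting institution name | 鹿児島大学 | |||||
Date of degree conferral | ||||||
Date of degree conferral | 2013-03-25 | |||||
Degree conferral number | ||||||
Degree conferral number | 甲理工研第378号 | |||||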
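
The abstract describes storing Q-values at elements of a Voronoi partition of a continuous state space. The following is a minimal sketch of that general idea only, not the dissertation's implementation. It assumes a fixed set of sites, a Euclidean nearest-site lookup, and the standard tabular Q-learning update. The names `nearest_site` and `VoronoiQTable` and the values of `alpha` and `gamma` are hypothetical, and the VQE addition (LBG) and integration (Delaunay) steps are omitted.

```python
import math


def nearest_site(sites, state):
    """Return the index of the site nearest to `state`.

    A point lies in the Voronoi cell of its nearest site, so this lookup
    maps a continuous state to a discrete cell index.
    """
    return min(range(len(sites)), key=lambda i: math.dist(sites[i], state))


class VoronoiQTable:
    """Q-values stored per Voronoi cell (illustrative, fixed sites)."""

    def __init__(self, sites, n_actions, alpha=0.1, gamma=0.9):
        self.sites = list(sites)
        self.q = [[0.0] * n_actions for _ in self.sites]
        self.alpha = alpha  # learning rate (assumed value)
        self.gamma = gamma  # discount factor (assumed value)

    def greedy_action(self, state):
        """Action with the largest Q-value in the cell containing `state`."""
        row = self.q[nearest_site(self.sites, state)]
        return max(range(len(row)), key=row.__getitem__)

    def update(self, state, action, reward, next_state):
        """Standard one-step Q-learning update on the cell containing `state`."""
        i = nearest_site(self.sites, state)
        j = nearest_site(self.sites, next_state)
        target = reward + self.gamma * max(self.q[j])
        self.q[i][action] += self.alpha * (target - self.q[i][action])
        return self.q[i][action]


# Usage: three sites in a 2-D state space, two discrete actions.
table = VoronoiQTable(sites=[(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)], n_actions=2)
table.update(state=(0.1, 0.1), action=1, reward=1.0, next_state=(0.9, 0.2))
```

Because cell membership is decided by distance to the sites alone, sites can be placed densely where the dynamics need resolution and sparsely elsewhere, which is the flexibility the abstract attributes to Voronoi partitioning over a regular lattice.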