RL DQN not learning
I am working on a resource allocation problem for two network slices in a 5G network: one URLLC slice and one eMBB (xMBB) slice.
The action is the fraction of the overall available resources allocated to the URLLC slice:
A = [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1]
For example, 0.1 means 10% of the overall resources are allocated to the URLLC slice while the rest goes to the eMBB one. I observe 6 state variables (a sketch of the corresponding MATLAB specs follows the list):
1. ηx: utilization of the eMBB slice
2. ηu: utilization of the URLLC slice
3. Qx: QoS utility of the eMBB slice
4. Qu: QoS utility of the URLLC slice
5. Ux: total number of users in the eMBB slice; varies randomly in each iteration, so actions have no effect on this part of the next state
6. Uu: total number of users in the URLLC slice; varies randomly in each iteration, so actions have no effect on this part of the next state
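For reference, here is roughly how I define the action and observation specs (a sketch; obsInfo/actInfo and the limits are my own choices, and the environment itself is custom step/reset functions):
% Action: fraction of the overall resources given to the URLLC slice
actInfo = rlFiniteSetSpec(0:0.1:1);
actInfo.Name = 'URLLC resource share';
% Observation: [eta_x; eta_u; Qx; Qu; Ux; Uu] as a 6x1 vector
obsInfo = rlNumericSpec([6 1], ...
    'LowerLimit', zeros(6,1), ...
    'UpperLimit', [1; 1; 1; 1; Inf; Inf]);
obsInfo.Name = 'slice states';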
The reward is built from combinations of the state variables themselves:
R = 0.75 Ru + 0.25 Rx, giving more weight to the URLLC slice,
where:
Ru = β Qu + (1-β) ηu is the part of the reward coming from the URLLC slice,
β is a constant between 0 and 1 weighting the importance of QoS utility against utilization in the URLLC slice,
Rx = α Qx + (1-α) ηx is the part of the reward coming from the eMBB slice, and
α is a constant between 0 and 1 weighting the importance of QoS utility against utilization in the eMBB slice.
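Inside my step function the reward is computed roughly like this (alpha and beta are the constants above, and the state variables are already unpacked):
% Reward: per-slice trade-off between QoS utility and utilization,
% then URLLC weighted more heavily than eMBB
Ru = beta  * Qu + (1 - beta)  * eta_u;   % URLLC part
Rx = alpha * Qx + (1 - alpha) * eta_x;   % eMBB part
R  = 0.75 * Ru + 0.25 * Rx;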
I am using the MATLAB Reinforcement Learning Toolbox, modifying the DQN cart-pole example. Unfortunately the agent learns slowly, and when I simulate the trained agent it is stuck on just 2 of the 11 action values. What could be causing this?
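In case it matters, I create the agent roughly like this (a sketch of my setup; the option values shown are assumptions, everything else is left at the defaults from the cart-pole example):
agentOpts = rlDQNAgentOptions( ...
    'UseDoubleDQN', true, ...
    'DiscountFactor', 0.99, ...
    'MiniBatchSize', 64);
% Epsilon-greedy exploration schedule
agentOpts.EpsilonGreedyExploration.Epsilon      = 1;
agentOpts.EpsilonGreedyExploration.EpsilonMin   = 0.05;
agentOpts.EpsilonGreedyExploration.EpsilonDecay = 1e-3;
% Default critic network is generated automatically from the specs
agent = rlDQNAgent(obsInfo, actInfo, agentOpts);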