| dc.contributor.author | Amala S, Rajeev | |
| dc.contributor.author | Jibi P, Mathew | |
| dc.date.accessioned | 2023-10-07T09:24:38Z | |
| dc.date.available | 2023-10-07T09:24:38Z | |
| dc.date.issued | 2023-07 | |
| dc.identifier.uri | http://210.212.227.212:8080/xmlui/handle/123456789/468 | |
| dc.description.abstract | The primary goal of Demand Response (DR) is to lower the system's maximum demand. The introduction of the smart grid and bidirectional communications makes implementation easier. A common way of minimizing cost is shifting loads from peak hours to off-peak hours. Reinforcement Learning (RL) is used for solving various optimization problems, and since the power system is stochastic in nature, implementing DR using RL techniques is well suited. Here, a scenario of scheduling residential loads with flexible devices is considered, with the aim of minimizing energy consumption and consumer discomfort. Q-learning, a variant of RL, is used for implementing the scheduling. The main concern is to find a balance between exploration and exploitation. One of the traditional RL methods used for balancing exploration and exploitation is the epsilon-greedy algorithm. The main challenge in implementing the ϵ-greedy algorithm is obtaining the cooling schedule that balances exploration and exploitation. In this project, we propose an efficient algorithm for action selection, the pursuit algorithm. The performance of epsilon-greedy is analyzed for various cooling schedule methods, and the performance of the RL algorithm using ϵ-greedy and the pursuit algorithm is compared. The only parameter on which the performance of the pursuit algorithm depends is the convergence rate β. Since the pursuit algorithm depends less on hyperparameters and requires no predefined number of episodes, its convergence is faster than that of ϵ-greedy. The performance of the algorithm is also analyzed under various tariff structures. | en_US |
| dc.language.iso | en | en_US |
| dc.relation.ispartofseries | ;TKM21EEPS01 | |
| dc.title | RESIDENTIAL DEMAND RESPONSE USING REINFORCEMENT LEARNING WITH PURSUIT ALGORITHM | en_US |
| dc.type | Technical Report | en_US |
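
As a point of reference for the two action-selection methods named in the abstract, below is a minimal Python sketch contrasting the pursuit update with ϵ-greedy. The function names, the number of actions, and the value of β are illustrative assumptions, not taken from the report; the pursuit update itself follows the standard form, in which the probability of the current greedy action is pursued toward 1 and all other action probabilities decay toward 0.

```python
import numpy as np

rng = np.random.default_rng(0)
n_actions = 4                                 # assumed size of the action set
q = np.zeros(n_actions)                       # Q-value estimates for one state
probs = np.full(n_actions, 1.0 / n_actions)   # pursuit action probabilities
beta = 0.05                                   # convergence rate beta (assumed value)

def pursuit_select(q, probs, beta):
    """Pursuit action selection: move the probability of the current
    greedy action toward 1, all others toward 0, then sample."""
    greedy = int(np.argmax(q))
    probs = probs * (1.0 - beta)   # p[a] += beta * (0 - p[a]) for every action
    probs[greedy] += beta          # p[greedy] += beta * (1 - p[greedy])
    return rng.choice(len(probs), p=probs), probs

def epsilon_greedy_select(q, epsilon):
    """ϵ-greedy with an externally supplied (cooled) epsilon."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q)))   # explore: random action
    return int(np.argmax(q))               # exploit: greedy action
```

The contrast the abstract draws is visible here: ϵ-greedy needs an additional cooling schedule (e.g., decaying ϵ per episode) tuned over a predefined number of episodes, whereas the pursuit update is governed by the single parameter β.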