On the Feasibility of Reinforcement Learning in Single- and Multi-Agent Systems: The Cases of Indoor Climate and Prosumer Electricity Trading Communities
Dalarna University, School of Information and Engineering, Microdata Analysis. ORCID iD: 0000-0002-0551-9341
2023 (English) Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Over half of the world's population lives in urban areas, a share that is only expected to grow. This increasing urbanisation presents challenges for the management of urban infrastructure systems. As an essential infrastructure of any city, the energy system presents one of the biggest of these challenges: as cities grow in population and economic activity, global energy consumption increases, and with it greenhouse gas (GHG) emissions. Key to realising the goals laid out by the 2030 Agenda for Sustainable Development is the energy transition, embodied in the goals pertaining to affordable and clean energy, sustainable cities and communities, and climate action. Renewable energy systems (RESs) and energy efficiency have been shown to be key strategies for achieving these goals. While the building sector is considered one of the biggest contributors to climate change, it is also seen as an area with many opportunities for realising the energy transition. Indeed, the emergence of the smart city and the Internet of Things (IoT), alongside photovoltaic (PV) and battery technology, offers opportunities both for the smart management of buildings and for forming self-sufficient peer-to-peer (P2P) electricity trading communities. Within this context, advanced building control offers significant potential for mitigating global warming, grid instability, soaring energy costs, and exposure to poor indoor building climates. Most advanced control strategies, however, rely on complex mathematical models that require a great deal of expertise to construct, costing time and money, and are unlikely to be frequently updated, which can lead to sub-optimal or even erroneous performance.
Furthermore, arriving at solutions in economic settings as complex and dynamic as the P2P electricity markets referred to above often leads to solutions that are computationally intractable. A model-based approach thus seems, as alluded to above, unsustainable, and I therefore propose a model-free alternative instead. One such alternative is the reinforcement learning (RL) method. This method provides an elegant solution that addresses many of the limitations seen in more classical approaches to single- and multi-agent systems, namely those based on complex mathematical models. To address the feasibility of RL in the context of building systems, I have developed four papers. In studying the literature, while there is much review work in support of RL for controlling energy consumption, no such works were found analysing RL from a methodological perspective with respect to controlling the comfort level of building occupants. Thus, in Paper I, to fill this gap in knowledge, a comprehensive review of this area was carried out. Following up, in Paper II, a case study was conducted to further assess, among other things, the computational feasibility of RL for controlling occupant comfort in a single-agent context. It was found that the RL method was able to improve thermal and indoor air quality by more than 90% when compared with historically observed occupant data. Broadening the scope of RL, Papers III and IV considered the feasibility of RL at the district scale by considering the efficient trade of renewable electricity in a peer-to-peer prosumer energy market. In particular, in Paper III, by extending an open-source economic simulation framework, multi-agent reinforcement learning (MARL) was used to optimise a dynamic price policy for trading the locally produced electricity.
Compared with a benchmark fixed price signal, the dynamic price mechanism arrived at by RL increased community net profit by more than 28% and median community self-sufficiency by more than 2%. Furthermore, emergent socio-economic behaviours, such as changes in supply with respect to changes in price, were identified. A limitation of Paper III, however, is that it was conducted in a single environment. To address this limitation and to assess the general validity of the proposed MARL solution, in Paper IV a full factorial experiment was conducted, based on the factors of climate (manifested in heterogeneous demand/supply profiles and associated battery parameters), community scale, and price mechanism, in order to ascertain the response of the community with respect to net loss (financial gain), self-sufficiency, and income equality from trading locally produced electricity. The central finding of Paper IV was that the community, with respect to net loss, performs significantly better under a learned dynamic price mechanism than under the benchmark fixed price mechanism; furthermore, a community under such a dynamic price mechanism stands odds of 2 to 1 of increased financial savings.
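The model-free property the abstract appeals to can be illustrated with a minimal value-learning sketch. Everything below (the two-action toy problem, the reward values) is a hypothetical illustration of the general idea, not the implementation used in any of the four papers: the agent improves its action-value estimates purely from observed rewards, with no mathematical model of the environment.

```python
import random

# Hypothetical toy: an agent chooses between two abstract control
# actions and learns which is better purely from observed rewards.
random.seed(0)

ACTIONS = [0, 1]             # e.g. 0 = leave setting as-is, 1 = intervene
REWARDS = {0: 0.0, 1: 1.0}   # assumed: intervening is the better action

alpha, epsilon = 0.1, 0.2    # learning rate, exploration rate
q = {a: 0.0 for a in ACTIONS}

for _ in range(500):
    # epsilon-greedy: mostly exploit the current value estimates,
    # but keep exploring so the better action is eventually found
    if random.random() < epsilon:
        a = random.choice(ACTIONS)
    else:
        a = max(q, key=q.get)
    # model-free update: nudge the estimate toward the observed reward
    q[a] += alpha * (REWARDS[a] - q[a])

best = max(q, key=q.get)
```

After training, `best` is the higher-reward action: the agent has recovered the correct control decision without ever being given a model of how the environment works, which is the advantage over the model-based strategies criticised above.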

Place, publisher, year, edition, pages
Borlänge: Dalarna University, 2023.
Series
Dalarna Doctoral Dissertations ; 24
Keywords [en]
Reinforcement Learning, Multi-Agent Reinforcement Learning, Buildings, Indoor Climate, Occupant Comfort, Positive Energy Districts, Peer-to-Peer Markets, Complex Adaptive Systems
National Category
Energy Systems; Building Technologies; Computer and Information Sciences
Identifiers
URN: urn:nbn:se:du-45300
ISBN: 978-91-88679-40-6 (print)
OAI: oai:DiVA.org:du-45300
DiVA id: diva2:1731696
Public defence
2023-03-31, Room 311, Borlänge, 13:00 (English)
Available from: 2023-02-21 Created: 2023-01-27 Last updated: 2023-08-17 Bibliographically approved
List of papers
1. A review of reinforcement learning methodologies for controlling occupant comfort in buildings
2019 (English) In: Sustainable cities and society, ISSN 2210-6707, Vol. 51, article id 101748. Article in journal (Refereed). Published
National Category
Building Technologies
Research subject
Research Profiles 2009-2020, Complex Systems – Microdata Analysis
Identifiers
urn:nbn:se:du-30601 (URN), 10.1016/j.scs.2019.101748 (DOI), 000493744700053 (), 2-s2.0-85070980900 (Scopus ID)
Available from: 2019-08-08 Created: 2019-08-08 Last updated: 2023-02-17 Bibliographically approved
2. A novel reinforcement learning method for improving occupant comfort via window opening and closing
2020 (English) In: Sustainable cities and society, ISSN 2210-6707, Vol. 61, article id 102247. Article in journal (Refereed). Published
Abstract [en]

An occupant's window opening and closing behaviour can significantly influence the level of comfort in the indoor environment. Such behaviour is, however, complex to predict and control conventionally. This paper therefore proposes a novel reinforcement learning (RL) method for the advanced control of window opening and closing. The RL control aims at optimising the time point for window opening/closing by observing and learning from the environment. The theory of model-free RL control is developed with the objective of improving occupant comfort and applied to historical field measurement data taken from an office building in Beijing. Preliminary testing of RL control is conducted by evaluating the control method's actions. The results show that the RL control strategy improves thermal and indoor air quality by more than 90% when compared with the actual historically observed occupant data. This methodology establishes a prototype for optimally controlling window opening and closing behaviour. It can be further extended by including more environmental parameters and more objectives, such as energy consumption. The model-free characteristic of RL avoids the disadvantage of implementing inaccurate or complex models of the environment, thereby offering great potential for the application of intelligent control in buildings.
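A hypothetical sketch of the kind of control problem the paper describes can be written as tabular Q-learning over a discretised comfort state. The environment, states, rewards, and parameters below are all made up for illustration and are not the paper's model or data:

```python
import random

# Toy stand-in for the paper's setting: an agent decides when to
# open or close a window based on a discretised air-quality state,
# rewarded for keeping CO2 low (a proxy for occupant comfort).
random.seed(1)

STATES = ["low_co2", "high_co2"]
ACTIONS = ["close", "open"]

def step(state, action):
    """Toy environment: opening the window clears high CO2;
    reward is +1 when CO2 ends up low, else -1."""
    nxt = "low_co2" if action == "open" else state
    reward = 1.0 if nxt == "low_co2" else -1.0
    return nxt, reward

alpha, gamma, epsilon = 0.1, 0.9, 0.1
Q = {(s, a): 0.0 for s in STATES for a in ACTIONS}

state = "high_co2"
for _ in range(2000):
    a = (random.choice(ACTIONS) if random.random() < epsilon
         else max(ACTIONS, key=lambda x: Q[(state, x)]))
    nxt, r = step(state, a)
    # standard Q-learning update
    Q[(state, a)] += alpha * (r + gamma * max(Q[(nxt, b)] for b in ACTIONS)
                              - Q[(state, a)])
    state = nxt
    if random.random() < 0.3:   # occasionally CO2 builds up again
        state = "high_co2"

policy = {s: max(ACTIONS, key=lambda a: Q[(s, a)]) for s in STATES}
```

The learned greedy policy opens the window whenever CO2 is high, i.e. the agent discovers the right "time point" for action from interaction alone, mirroring the model-free argument made in the abstract.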

Keywords
Markov decision processes, Reinforcement learning, Control, Indoor comfort, Occupant
National Category
Building Technologies
Research subject
Research Profiles 2009-2020, Energy and Built Environments
Identifiers
urn:nbn:se:du-33796 (URN), 10.1016/j.scs.2020.102247 (DOI), 000573585000003 (), 2-s2.0-85086900231 (Scopus ID)
Available from: 2020-06-09 Created: 2020-06-09 Last updated: 2023-02-17 Bibliographically approved
3. A multi-agent reinforcement learning approach for investigating and optimising peer-to-peer prosumer energy markets
2023 (English) In: Applied Energy, ISSN 0306-2619, E-ISSN 1872-9118, Vol. 334, article id 120705. Article in journal (Refereed). Published
Abstract [en]

Current power grid infrastructure was not designed with climate change in mind, and its stability, especially at peak demand periods, has therefore been compromised. Furthermore, in light of the current reports of the UN's Intergovernmental Panel on Climate Change concerning global warming, and the goal of the 2015 Paris climate agreement to constrain the global temperature increase to within 1.5–2 °C above pre-industrial levels, urgent sociotechnical measures need to be taken. Together, smart microgrid and renewable energy technology have been proposed as a possible solution to help mitigate global warming and grid instability. Within this context, well-managed demand-side flexibility is crucial for efficiently utilising on-site solar energy. To this end, a well-designed dynamic pricing mechanism can organise the actors within such a system to enable the efficient trade of on-site energy, thereby contributing to the decarbonisation and grid-security goals alluded to above. However, designing such a mechanism in an economic setting as complex and dynamic as this one often leads to computationally intractable solutions. To overcome this problem, in this work we use multi-agent reinforcement learning (MARL) alongside Foundation, an open-source economic simulation framework built by Salesforce Research, to design a dynamic price policy. By incorporating a peer-to-peer (P2P) community of prosumers with heterogeneous demand/supply profiles and battery storage into Foundation, our results from data-driven simulations show that MARL, when compared with a baseline fixed price signal, can learn a dynamic price signal that achieves both a lower community electricity cost and a higher community self-sufficiency.
Furthermore, emergent socio-economic behaviours have also been identified, such as price elasticity, and community coordination leading to high grid feed-in during periods of overall excess photovoltaic (PV) supply and, conversely, high community trading during overall low PV supply. Our proposed approach can be used by practitioners to aid them in designing P2P energy trading markets.
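The economic margin that a learned dynamic price can exploit is easy to show with a toy settlement calculation. The tariffs, household profiles, and full-matching assumption below are invented for illustration; this is not the paper's Foundation-based simulation. The point it demonstrates is standard P2P market logic: any local price strictly between the feed-in tariff and the grid price leaves both the net seller and the net buyer better off than trading with the grid alone.

```python
# Assumed tariffs and a two-household community (made-up numbers)
GRID_PRICE, FEED_IN = 0.30, 0.05   # EUR/kWh
households = {
    "A": {"demand": 2.0, "supply": 5.0},   # net seller: 3 kWh surplus
    "B": {"demand": 4.0, "supply": 1.0},   # net buyer: 3 kWh deficit
}

def settle(local_price=None):
    """Per-household cost (negative = income).

    local_price=None is the grid-only baseline; otherwise the toy
    assumes the community surplus fully matches the deficit and is
    traded internally at local_price.
    """
    costs = {}
    for name, h in households.items():
        net = h["demand"] - h["supply"]    # positive = needs energy
        if local_price is None:            # buy at grid price, sell at feed-in
            costs[name] = net * (GRID_PRICE if net > 0 else FEED_IN)
        else:                              # internal trade at the local price
            costs[name] = net * local_price
    return costs

grid_only = settle()
p2p = settle(local_price=0.15)
```

Under the local price, the seller earns more than the feed-in tariff would pay and the buyer pays less than the grid price: both households improve, and the matched energy raises community self-sufficiency. Finding a price signal that sustains this coordination over time, under realistic heterogeneous profiles and storage, is the part the MARL agents learn.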

Keywords
Peer-to-peer market, Community-based market, Dynamic pricing, Multi-agent systems, Multi-agent reinforcement learning, Proximal Policy Optimisation
National Category
Energy Systems
Identifiers
urn:nbn:se:du-45480 (URN), 10.1016/j.apenergy.2023.120705 (DOI), 000922545500001 (), 2-s2.0-85148708839 (Scopus ID)
Available from: 2023-02-17 Created: 2023-02-17 Last updated: 2023-03-06 Bibliographically approved
4. Does a smart agent overcome the tragedy of the commons in residential prosumer communities?
2023 (English) Article in journal (Refereed). Submitted
National Category
Energy Systems
Identifiers
urn:nbn:se:du-45481 (URN)
Available from: 2023-02-17 Created: 2023-02-17 Last updated: 2023-04-03 Bibliographically approved

Open Access in DiVA

fulltext (537 kB)

Authority records

May, Ross

