A multi-agent reinforcement learning approach for investigating and optimising peer-to-peer prosumer energy markets
Dalarna University, School of Information and Engineering, Microdata Analysis. ORCID iD: 0000-0002-0551-9341
Dalarna University, School of Information and Engineering, Energy Technology. ORCID iD: 0000-0003-3025-6333
2023 (English). In: Applied Energy, ISSN 0306-2619, E-ISSN 1872-9118, Vol. 334, article id 120705. Article in journal (Refereed). Published
Abstract [en]

Current power grid infrastructure was not designed with climate change in mind, and its stability, especially during peak demand periods, has therefore been compromised. Furthermore, in light of the current reports of the UN’s Intergovernmental Panel on Climate Change concerning global warming, and the goal of the 2015 Paris climate agreement to constrain the global temperature increase to within 1.5–2 °C above pre-industrial levels, urgent sociotechnical measures need to be taken. Together, smart microgrids and renewable energy technology have been proposed as a possible solution to help mitigate global warming and grid instability. Within this context, well-managed demand-side flexibility is crucial for efficiently utilising on-site solar energy. To this end, a well-designed dynamic pricing mechanism can organise the actors within such a system to enable the efficient trade of on-site energy, thereby contributing to the decarbonisation and grid security goals mentioned above. However, designing such a mechanism in an economic setting this complex and dynamic often leads to computationally intractable solutions. To overcome this problem, we use multi-agent reinforcement learning (MARL) alongside Foundation, an open-source economic simulation framework built by Salesforce Research, to design a dynamic price policy. By incorporating a peer-to-peer (P2P) community of prosumers with heterogeneous demand/supply profiles and battery storage into Foundation, our data-driven simulations show that MARL, when compared with a baseline fixed price signal, can learn a dynamic price signal that achieves both a lower community electricity cost and a higher community self-sufficiency. Furthermore, emergent social–economic behaviours have been identified, such as price elasticity and community coordination that leads to high grid feed-in during periods of overall excess photovoltaic (PV) supply and, conversely, high community trading during periods of overall low PV supply. Our proposed approach can be used by practitioners to aid them in designing P2P energy trading markets.

Place, publisher, year, edition, pages
2023. Vol. 334, article id 120705
Keywords [en]
Peer-to-peer market, Community-based market, Dynamic pricing, Multi-agent systems, Multi-agent reinforcement learning, Proximal Policy Optimisation
National Category
Energy Systems
Identifiers
URN: urn:nbn:se:du-45480
DOI: 10.1016/j.apenergy.2023.120705
ISI: 000922545500001
Scopus ID: 2-s2.0-85148708839
OAI: oai:DiVA.org:du-45480
DiVA, id: diva2:1737565
Available from: 2023-02-17. Created: 2023-02-17. Last updated: 2023-03-06. Bibliographically approved
In thesis
1. On the Feasibility of Reinforcement Learning in Single- and Multi-Agent Systems: The Cases of Indoor Climate and Prosumer Electricity Trading Communities
2023 (English). Doctoral thesis, comprehensive summary (Other academic)
Abstract [en]

Over half of the world’s population lives in urban areas, a share that is expected to keep growing. This increasing urbanisation brings challenges in the management of urban infrastructure systems. As an essential infrastructure of any city, the energy system is one of the biggest of these challenges. Indeed, as cities expand in population and economically, global energy consumption increases, and as a result, so do greenhouse gas (GHG) emissions. Key to realising the goals laid out by the 2030 Agenda for Sustainable Development is the energy transition, embodied in the goals pertaining to affordable and clean energy, sustainable cities and communities, and climate action. Renewable energy systems (RESs) and energy efficiency have been shown to be key strategies for achieving these goals. While the building sector is considered to be one of the biggest contributors to climate change, it is also seen as an area with many opportunities for realising the energy transition. Indeed, the emergence of the smart city and the internet of things (IoT), alongside photovoltaic and battery technology, offers opportunities both for the smart management of buildings and for forming self-sufficient peer-to-peer (P2P) electricity trading communities. Within this context, advanced building control offers significant potential for mitigating global warming, grid instability, soaring energy costs, and exposure to poor indoor building climates. Most advanced control strategies, however, rely on complex mathematical models, which require a great deal of expertise to construct, are therefore costly in time and money, and are unlikely to be frequently updated, which can lead to sub-optimal or even incorrect performance.
Furthermore, arriving at solutions in economic settings as complex and dynamic as the P2P electricity markets referred to above often leads to solutions that are computationally intractable. A model-based approach thus seems unsustainable, and I therefore propose a model-free alternative instead. One such alternative is reinforcement learning (RL). This method provides an elegant solution that addresses many of the limitations of more classical approaches to single- and multi-agent systems, namely those based on complex mathematical models. To address the feasibility of RL in the context of building systems, I have developed four papers. A study of the literature found that, while there is much review work in support of RL for controlling energy consumption, no such works analyse RL from a methodological perspective with respect to controlling the comfort level of building occupants. Thus, in Paper I, a comprehensive review in this area was carried out to fill this gap in knowledge. To follow up, in Paper II, a case study was conducted to further assess, among other things, the computational feasibility of RL for controlling occupant comfort in a single-agent context. It was found that the RL method was able to improve thermal and indoor air quality by more than 90% when compared with historically observed occupant data. Broadening the scope of RL, Papers III and IV considered the feasibility of RL at the district scale by considering the efficient trade of renewable electricity in a peer-to-peer prosumer energy market. In particular, in Paper III, by extending an open-source economic simulation framework, multi-agent reinforcement learning (MARL) was used to optimise a dynamic price policy for trading the locally produced electricity.
Compared with a benchmark fixed price signal, the dynamic price mechanism arrived at by RL increased community net profit by more than 28% and median community self-sufficiency by more than 2%. Furthermore, emergent social-economic behaviours, such as changes in supply with respect to changes in price, were identified. A limitation of Paper III, however, is that it was conducted in a single environment. To address this limitation and to assess the general validity of the proposed MARL solution, in Paper IV a full factorial experiment was conducted, based on the factors of climate (manifested in heterogeneous demand/supply profiles and associated battery parameters), community scale, and price mechanism, in order to ascertain the response of the community with respect to net-loss (financial gain), self-sufficiency, and income equality from trading locally produced electricity. The central finding of Paper IV was that the community, with respect to net-loss, performs significantly better under a learned dynamic price mechanism than under the benchmark fixed price mechanism, and furthermore that a community under such a dynamic price mechanism has odds of 2 to 1 of achieving increased financial savings.

Place, publisher, year, edition, pages
Borlänge: Dalarna University, 2023
Series
Dalarna Doctoral Dissertations ; 24
Keywords
Reinforcement Learning, Multi-Agent Reinforcement Learning, Buildings, Indoor Climate, Occupant Comfort, Positive Energy Districts, Peer-to-Peer Markets, Complex Adaptive Systems
National Category
Energy Systems Building Technologies Computer and Information Sciences
Identifiers
URN: urn:nbn:se:du-45300
ISBN: 978-91-88679-40-6
Public defence
2023-03-31, Room 311, Borlänge, 13:00 (English)
Opponent
Supervisors
Available from: 2023-02-21. Created: 2023-01-27. Last updated: 2023-08-17. Bibliographically approved

Open Access in DiVA

fulltext (2469 kB), 277 downloads
File information
File name: FULLTEXT01.pdf
File size: 2469 kB
Checksum (SHA-512): 9b688da15900e92cb34024181256127cc0b0bfd483de54b26623adaaf66747bfe8ed69ad3710c43633fc4afbf37634a31e11299f6f2b95ac2db4a2cd1518edd3
Type: fulltext
Mimetype: application/pdf

Authority records

May, Ross; Huang, Pei
