Rethink Virtual Power Plants — II

This article is a follow-up that discusses a reliable software stack for virtual power plants, based on MQTT, ThingsBoard, CrateDB and PredictiveWorks.

The first part of this article is here.

This is the second part of an article about a recent project that aimed to define a reliable software stack for future virtual power plants, based on a well-defined data-driven process, MQTT communication, the IoT platform ThingsBoard, and PredictiveWorks for analytics and optimization.

PredictiveWorks, with its template-based, plug-and-play approach to advanced analytics, was brought in to cover power-plant-specific functionality while also providing the flexibility and openness to treat cyber security as another first-class citizen.

And why not leverage a one-stop analytics platform and use the same approach to forecast market prices, predict power profiles and detect anomalies in the behavior of distributed energy assets as indicators of compromise?

In the remaining sections of this article, the main building blocks of a virtual power plant architecture are introduced, and then, their implementation based on MQTT, ThingsBoard and PredictiveWorks is discussed.

Distributed algorithms: A characteristic feature is that control & optimization algorithms are distri­bu­ted in nature, with the right balance between edge and cloud intelligence.

Data Machine

Virtual power plants are usually considered by taking a functional or system perspective, which or­ga­nizes the system into two main building blocks:

An energy information system as a single source of truth for internal & external information. And an energy management system to constantly generate power pro­files, optimal sche­dules and set points to control distributed energy resources and loads.

In this project, a virtual power plant is discussed from a data-centric perspective with distri­bu­ted real-time data sources & sinks on the one hand and a data-driven process on the other hand.

According to the well-known Lambda architecture, two different data flows are supported: one for historical data to enable forecasts & predictions that look ahead, e.g. when participating in day-ahead markets or for smart segmentation of energy devices, and another one to respond to real-time events.

Data comprise attributes and telemetry data of distributed energy devices, market and wea­ther data and more. Computed insights & optimizations are transformed into set points and time frames to e.g. charge and discharge battery systems and control residential solar power generation.

From this perspective, there is not so much difference to an aggregated fleet of dis­tri­buted end­points and network devices built to detect and mitigate cyber threats. Or distributed sen­sors and devices in an Industrial IoT use case.


Being a template-based AI application platform, PredictiveWorks’ objective in this project is to provide machine intelligence for virtual power plants:

For smart device & asset segmentation, forecasting & predicting of power profiles & market prices, and optimizing power schedules (i.e. set points and time frames) to control remote energy devices.

Templates start with a business case. Each business case can be organized in tasks and each business task combines several reusable application blueprints.

Blueprints represent technical data workflows, comprising pre-built but configurable data connectors and machine in­telligence operators to trans­form data from a lower to a higher level of value.

The inevitable analytics & optimization part of vir­tual power plants is handled by a set of configurable business templates.

It is the opposite of a rigid software platform, where each use case is mapped onto a closed-box software solution. When additional features have to be taken into account, it is just a matter of selecting & configuring additional templates, without having to wait for the integration of another vertical software solution.

PredictiveWorks' template approach accelerates AI adoption for almost every project and use case. But to make this crystal clear: PredictiveWorks is a smart analytics platform and not a silver bullet for all phases of data-driven processes.

Tracking, collecting, aggregating or visualizing attributes and telemetry data of distributed energy resources is not the focus of PredictiveWorks. Trying to connect any advanced analytics platform to huge amounts of raw data without an intermediate partner platform that covers the previous phases of a data process generates a single result: frustration.


The project team has been aware of this fact from the very beginning and decided to use the open-source IoT platform ThingsBoard as the appropriate partner system to collect, prepare and aggregate device and asset specific attributes and telemetry data at IoT scale before per­forming advanced analytics and optimization.

ThingsBoard has been selected by the team for several reasons:

Electric grids are not designed to interact with a large number of individual residential solar battery systems. It is important to organize and aggregate them so that they look like a traditional grid component, e.g. a large steam turbine.

ThingsBoard’s device and asset management provides a flexible mechanism to aggregate IoT devices into assets. Assets can be combined with other assets to build even more complex ob­jects.

Aggregation & Edge Platform

In this project, a ThingsBoard device is an individual residential site. The first level of aggregation is accomplished by an edge platform (Raspberry Pi) installed at each site.

It aggregates batteries, con­trollers, in­verters, smart meters and other elementary IoT devices and abstracts from the diversity of installations and home loads.
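The first aggregation level can be pictured with a small sketch. It is not the project's edge software; the record layout, the sign convention (positive for generation or discharge, negative for consumption or charge) and all names are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class DeviceReading:
    """One reading from an elementary device on site (hypothetical layout)."""
    device_id: str
    kind: str        # e.g. "battery", "inverter", "smart_meter"
    power_w: float   # assumed sign convention: > 0 feeds in, < 0 draws power


def aggregate_site(readings):
    """Collapse per-device readings into one site-level telemetry record."""
    generated = sum(r.power_w for r in readings if r.power_w > 0)
    consumed = -sum(r.power_w for r in readings if r.power_w < 0)
    return {
        "generated_w": generated,
        "consumed_w": consumed,
        "net_w": generated - consumed,
        "device_count": len(readings),
    }


readings = [
    DeviceReading("inv-1", "inverter", 3200.0),
    DeviceReading("bat-1", "battery", -1500.0),
    DeviceReading("load-1", "smart_meter", -900.0),
]
site = aggregate_site(readings)
```

Whatever the installation behind it looks like, the site then appears to ThingsBoard as one device with a handful of telemetry keys.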

Edge platforms are an important component of the project's virtual power plant architecture, e.g. for the performance, resilience and security of the system.

The second level is accomplished by ThingsBoard. Devices are aggregated as assets repre­sen­ting virtual (sub-)stations following geographic or administrative factors.

A third aggregation level is used to organize larger virtual components (such as stations), and the fourth and final level is the top level of the virtual power plant.
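The four levels form a tree, and any quantity reported by the sites can be rolled up along it. The following sketch assumes a plain nested dictionary as topology and a `net_w` value per site; in the project this structure lives in ThingsBoard's device and asset model, so the names here are purely illustrative.

```python
def roll_up(node):
    """Return the net power (W) of a node: a site leaf or an asset with children."""
    if "net_w" in node:                      # leaf: residential site (ThingsBoard device)
        return node["net_w"]
    return sum(roll_up(child) for child in node["children"])


# Hypothetical topology: plant -> stations -> (sub-)stations -> sites
vpp = {
    "name": "vpp",
    "children": [
        {"name": "station-north", "children": [
            {"name": "substation-a", "children": [
                {"name": "site-1", "net_w": 800.0},
                {"name": "site-2", "net_w": -200.0},
            ]},
        ]},
        {"name": "station-south", "children": [
            {"name": "substation-b", "children": [
                {"name": "site-3", "net_w": 1500.0},
            ]},
        ]},
    ],
}

total_w = roll_up(vpp)
```

The same function answers the question at every level: calling it on a station node returns the aggregate of that station only.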

Rule Engine

Another ThingsBoard feature that was important for this project is its rule engine for data processing and for actions on incoming events, such as publishing to third-party platforms.

From an architectural point of view, ThingsBoard leverages Akka actors and Netty for high-perfor­man­ce coordination of messages between millions of devices. As a result, a single in­stance can constantly handle 20,000+ devices and 30,000 MQTT messages per se­cond.

All this makes ThingsBoard an excellent choice to become a key component in the architec­tu­re of a virtual power plant.

Lessons learned

ThingsBoard leverages Apache Cassandra as its NoSQL storage component. In this project, Cassandra was replaced by CrateDB as the IoT-scale database due to its performance and convenient SQL foundation.

Data Flows

Edge platforms use northbound communication via MQTT messages to send real-time data to ThingsBoard. This includes site attributes, telemetry data and locally optimized charging & discharging plans, which prepare the ground for global demand response (DR) requests.
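As a rough sketch of what an edge platform sends, the snippet below builds messages for ThingsBoard's MQTT device API, which uses the topics `v1/devices/me/telemetry` and `v1/devices/me/attributes` and accepts timestamped telemetry as `{"ts": ..., "values": {...}}`. The telemetry keys are hypothetical, and the actual publishing is left to whatever MQTT client the edge platform uses.

```python
import json
import time

# Topics defined by ThingsBoard's MQTT device API. The device authenticates
# with its access token as the MQTT username.
TELEMETRY_TOPIC = "v1/devices/me/telemetry"
ATTRIBUTES_TOPIC = "v1/devices/me/attributes"


def telemetry_message(values, ts_ms=None):
    """Return (topic, payload) for a timestamped telemetry upload."""
    if ts_ms is None:
        ts_ms = int(time.time() * 1000)      # ThingsBoard expects milliseconds
    return TELEMETRY_TOPIC, json.dumps({"ts": ts_ms, "values": values})


def attributes_message(attributes):
    """Return (topic, payload) for client-side attributes of the site."""
    return ATTRIBUTES_TOPIC, json.dumps(attributes)


# Key names below are assumptions for illustration, not the project's schema.
topic, payload = telemetry_message(
    {"generated_w": 3200.0, "consumed_w": 2400.0},
    ts_ms=1700000000000,
)
attr_topic, attr_payload = attributes_message({"battery_capacity_wh": 5000})
```

A locally optimized charging plan could travel the same way, as an additional telemetry key.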

ThingsBoard uses Apache Kafka for northbound communication to publish real-time events for PredictiveWorks. PredictiveWorks, in turn, uses HTTP for southbound communication with ThingsBoard, e.g. to optimize device & asset topologies, and MQTT to send messages and commands to each distributed site.

In addition, ThingsBoard and PredictiveWorks share historical data based on CrateDB.

Data Analytics

Data analytics is a key component of all future virtual power plants. In this section, we summarize how cloud intelligence, provided by ThingsBoard and PredictiveWorks, interacts with edge intelligence to generate the insights needed to operate a virtual power plant.


From a top-level data perspective, two analytics questions with respect to energy devices and aggregated assets have to be constantly answered: What happens (monitoring) and what will happen (predicting), i.e. what is the current and (near) future behavior of the system at each level of aggregation.

The project team decided to use these questions to distinguish between the responsibilities of ThingsBoard and PredictiveWorks:

Within the virtual power plant's data process, ThingsBoard is responsible for collecting attributes & telemetry data (monitoring) from hundreds or even thousands of distributed sites.

Due to its integrated rule engine, ThingsBoard constantly aggregates these data based on the current device and asset to­po­logy. This results in a time series of generated and consumed energy alongside respective at­tri­butes of the distributed devices.

Forecast & Prediction

PredictiveWorks is used to forecast & predict power profiles from past measurements and ag­gregations (historical data). These profiles represent the active power produced or con­sumed over a given time horizon (e.g. 24 h) with a system-defined time step (e.g. 15 min).

To this end, PredictiveWorks plugins for time series analysis & regression are used. Horizon and step size can be configured to meet the requirements of a certain energy market.
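To make horizon and step size concrete: a 24 h horizon with a 15 min step yields 96 values per profile. The sketch below uses a deliberately simple seasonal-naive forecast (the mean of the same time-of-day slot over past days) as a stand-in; it is not the method of the PredictiveWorks plugins, and it assumes the history is evenly sampled and that the forecast starts right after the last sample.

```python
def profile_length(horizon_h, step_min):
    """Number of values in a power profile for the given horizon and step."""
    return (horizon_h * 60) // step_min


def seasonal_naive_forecast(history, horizon_h=24, step_min=15):
    """Forecast each step as the mean of the same daily slot in past full days."""
    steps_per_day = (24 * 60) // step_min
    n_days = len(history) // steps_per_day
    if n_days == 0:
        raise ValueError("need at least one full day of history")
    recent = history[-n_days * steps_per_day:]   # keep whole days only
    forecast = []
    for k in range(profile_length(horizon_h, step_min)):
        slot = k % steps_per_day
        samples = [recent[d * steps_per_day + slot] for d in range(n_days)]
        forecast.append(sum(samples) / n_days)
    return forecast


# Two days of synthetic active power (W): a flat 100 W day, then a flat 200 W day
history = [100.0] * 96 + [200.0] * 96
forecast = seasonal_naive_forecast(history)
```

Changing `horizon_h` and `step_min` is all it takes to match a market with, say, hourly products.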

For each power profile, also a command profile based on the latest measurements is com­puted that contains the power set points and time frames required by each energy device to meet the desired power profiles.


PredictiveWorks is also responsible for constantly computing the optimal working point of the virtual power plant. Optimization is based on the computed power profiles, global optimization objectives and market prices.

PredictiveWorks uses a Mixed-Integer Linear Programming (MILP) formulation of the global optimization model, provided as a project-built plugin for data operations.
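The project's actual model is not spelled out here. For orientation only, a textbook-style MILP for scheduling batteries against market prices could look as follows. All symbols are assumptions: sites i in S, time steps t in T of length Δt, market price π_t, forecast net generation g_{i,t}, charge and discharge power c_{i,t} and d_{i,t}, state of charge s_{i,t}, efficiency η_i, and a binary u_{i,t} that prevents charging and discharging at the same time.

```latex
\begin{aligned}
\max_{c,\,d,\,u,\,s}\quad
  & \sum_{t \in T} \pi_t \,\Delta t \sum_{i \in S} \left( g_{i,t} + d_{i,t} - c_{i,t} \right) \\
\text{s.t.}\quad
  & 0 \le d_{i,t} \le P_i^{\max}\, u_{i,t}, \\
  & 0 \le c_{i,t} \le P_i^{\max}\, (1 - u_{i,t}), \\
  & s_{i,t+1} = s_{i,t} + \Delta t \left( \eta_i\, c_{i,t} - d_{i,t} / \eta_i \right), \\
  & S_i^{\min} \le s_{i,t} \le S_i^{\max}, \\
  & u_{i,t} \in \{0, 1\} \qquad \forall i \in S,\ t \in T.
\end{aligned}
```

The binary variables are what make the problem mixed-integer; without them it would reduce to a plain linear program.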

The implementation and testing of this MILP plugin, based on Apache Spark's DataFrame technology, took a couple of days. It demonstrated that even if a certain plugin is missing from the set of 200+ available plugins, PredictiveWorks is able to respond to this situation quickly.

Optimization is implemented as a distributed algorithm with balanced edge & cloud intel­li­gence: After having solved the global optimization problem, the results are published to the distributed edge platforms via MQTT.

The edge platform subscribes to the MQTT topic and, after having received the optimal com­mand profiles, computes the best plan for the residential solar battery sys­tem taking local con­straints and objectives into account.
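What "taking local constraints into account" can mean is sketched below for a single battery: the set points of the global command profile are clipped so that the power limit and the state-of-charge bounds hold. This is a minimal illustration under stated assumptions (positive set point means discharge, conversion losses are ignored, all parameter names are hypothetical), not the optimization running on the project's edge platforms.

```python
def local_plan(setpoints_w, soc_wh, capacity_wh, max_power_w,
               step_min=15, min_soc_wh=0.0):
    """Clip requested set points to the battery's power and energy limits.

    Returns the feasible plan (W per step) and the resulting state of charge (Wh).
    """
    step_h = step_min / 60.0
    plan = []
    for p in setpoints_w:
        p = max(-max_power_w, min(max_power_w, p))        # inverter power limit
        max_discharge = (soc_wh - min_soc_wh) / step_h    # W that empties the battery
        max_charge = (capacity_wh - soc_wh) / step_h      # W that fills the battery
        p = max(-max_charge, min(max_discharge, p))       # energy limits
        soc_wh -= p * step_h                              # discharge lowers the SoC
        plan.append(p)
    return plan, soc_wh


# Global profile asks for 4 kW discharge twice, then 4 kW charge
plan, final_soc = local_plan([4000, 4000, -4000], soc_wh=1000.0,
                             capacity_wh=5000.0, max_power_w=3000.0)
```

In this example the first set point is capped by the power limit and the second by the remaining energy, which is exactly the kind of deviation the site reports back as its plan.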

The computed optimization plans are published via MQTT the same way telemetry data are sent to ThingsBoard. Its rule engine forwards these plans and PredictiveWorks adapts its glo­bal optimization.

The main advantage of taking edge intelligence into account is resilience to the inevitable intermittency of communication: when sites go offline for short or moderate amounts of time, they still have the last version of the global command profiles and can continue to compute their best plan. And if sites have not published their plans within a certain amount of time, they are excluded from aggregation and do not contribute to the global optimization.
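The exclusion rule is simple to state in code. The sketch assumes that the cloud side keeps the timestamp of the last plan received per site; the 15-minute limit and all names are illustrative, not values from the project.

```python
def active_sites(last_plan_ts, now_s, max_age_s=900):
    """Return the sites whose latest plan is recent enough to be aggregated."""
    return sorted(
        site for site, ts in last_plan_ts.items()
        if now_s - ts <= max_age_s
    )


# Seconds since epoch of the last plan received from each site (made-up values)
last_plan_ts = {"site-1": 10_000, "site-2": 9_200, "site-3": 8_000}
included = active_sites(last_plan_ts, now_s=10_100)
```

Here `site-3` last reported 2,100 seconds ago and is left out until it publishes a plan again.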

After having determined the global optimal plan, it can be used to submit an appropriate bid to the market submission service.

And after having received a DR request, PredictiveWorks executes the computed command profiles by publishing them via MQTT to each site. Each edge platform then implements the set points and time frames for each DER and load.


The power plant’s virtual topology of aggregated devices and assets is subject to continuous changes and optimizations. Geographical or administrative factors may define initial drivers.

Similarity of devices and assets due to their dynamic behavior, however, is a better classi­fi­cation criterion with respect to the power plant’s resilience.

Therefore, continuous segmentation of all devices and assets is one of the analytics features PredictiveWorks is responsible for.

This task is not so much different from detecting customer or user segments in marketing or retail use cases. Understanding behavior and computing the baseline, i.e. what is normal for discovered segments, also build the foundation for protecting virtual power plants against cyber-attacks.

What is true for all distributed cloud-based IT systems also holds for digital energy systems: perimeter defense or other reactive defense mechanisms are not enough, and it is just a matter of time until digital systems get compromised.

With this in mind, the focus shifts to reducing infiltration time to a minimum. Without knowing the baseline of all devices and aggregated assets, hunting for malware or other threats is not possible.
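A baseline check can be as simple as the following sketch, which flags a reading that deviates too far from what is normal for its segment. It uses a plain z-score with an assumed threshold of three standard deviations as a stand-in; the anomaly detection actually used with PredictiveWorks is not described here.

```python
from statistics import mean, stdev


def baseline(values):
    """Baseline of a segment: mean and sample standard deviation."""
    return mean(values), stdev(values)


def is_anomalous(value, base, threshold=3.0):
    """True if value lies more than `threshold` standard deviations off the mean."""
    mu, sigma = base
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Typical evening consumption (W) of the sites in one segment (made-up values)
segment = [1000.0, 1020.0, 980.0, 1010.0, 990.0]
base = baseline(segment)
normal = is_anomalous(1005.0, base)
suspicious = is_anomalous(1500.0, base)
```

A site that suddenly behaves unlike its segment is not necessarily compromised, but it is a good place to start looking.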


Virtual power plants are promising but complex machines that will have a huge impact on the fu­ture energy sector. Aggregating, orchestrating and controlling distributed energy assets, in­teracting with the electric grid and participating in energy markets is challenging.

But all this does not justify building virtual power plants from the ground up and thereby reinventing the wheel over and over again.

Distributed cloud-based systems of similar complexity are well known from many (Industrial) IoT use cases. And state-of-the-art IoT technology is capable of meeting the new challenges of the energy sector.

PredictiveWorks, while primarily targeting small and medium-sized businesses with configurable but pre-defined templates for fast AI adoption, has proven able to respond to the demands of the power sector as well, either through rapid development of extra purpose-built plugins or by automatically providing additional functionality such as cyber defense.

Originally published at

The world's first AI-prediction template provisioning and sharing platform for advanced data analytics.