-
Notifications
You must be signed in to change notification settings - Fork 0
/
clipping-loss-validation.tex
287 lines (234 loc) · 22.7 KB
/
clipping-loss-validation.tex
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
% !TEX spellcheck = en_US
\documentclass[conference]{IEEEtran}
\usepackage{cite}
\usepackage{amsmath,amssymb,amsfonts}
\usepackage{algorithmic}
\usepackage{graphicx}
\usepackage{textcomp}
\usepackage{xcolor}
\usepackage{colortbl}
% add hyperlinks, delete all .aux files if adding hyperref after previous build
\usepackage{hyperref}
% support for unicode charcters like "é" and "ñ"
\usepackage[T1]{fontenc}
% Provides generic commands \degree, \celsius, \perthousand, \micro and \ohm
\usepackage{gensymb}
% splits a section into multiple columns
\usepackage{multicol}
\def\BibTeX{{\rm B\kern-.05em{\sc i\kern-.025em b}\kern-.08em
T\kern-.1667em\lower.7ex\hbox{E}\kern-.125emX}}
\begin{document}
\title{Validation of Subhourly Clipping Loss Error Corrections}
\author{\IEEEauthorblockN{Abhishek Parikh\textsuperscript{1}, Kirsten Perry\textsuperscript{2},Kevin Anderson\textsuperscript{2}, William B. Hobbs\textsuperscript{3}, Rounak Kharait\textsuperscript{1},}
\IEEEauthorblockN{and Mark A. Mikofski\textsuperscript{1}}
\IEEEauthorblockA{\textsuperscript{1}DNV, San Diego, CA, 92123, USA }
\IEEEauthorblockA{\textsuperscript{2}NREL, Golden, CO, 80401, USA }
\IEEEauthorblockA{\textsuperscript{3}Southern Company, Birmingham, AL, 35203, USA }}
\maketitle
\begin{abstract}
Under-performance of solar PV systems is an important issue that increases risks for stakeholders, including developers, investors and operators. Recently some attention has focused on underestimation of inverter clipping losses as a possible source of over-prediction where sub-hourly solar variability is high. Several models and data sets have been analyzed over the past few years, with the aim of quantifying, predicting, and correcting underestimated clipping loss errors for systems with high DC/AC ratio and solar variability. In this research, we apply a machine learning model developed at NREL to two physical PV systems, to correct for subhourly clipping losses. For each system, we compare overall AC power output for the model taken at 1-minute intervals to AC power output taken at 1-hour intervals with the addition of the subhourly clipping correction. Our findings consistently show that the addition of the clipping loss correction lead to a reduction in mean bias error of 0.8\% and 1.2\% for systems A and B, respectively, with no additional filtering applied. When examining high solar variability periods where clipping is more pronounced, system A and B experienced a 1.8\% and 2.7\% reduction in mean bias error, respectively, when the clipping correction was applied.
\end{abstract}
\begin{IEEEkeywords}
inverter, clipping, solar, irradiance, variability, performance, modeling, TMY
\end{IEEEkeywords}
\section{Introduction}
In its report "2020 Solar Risk Assessment" \cite{Matsui2020}, kWh Analytics warned that systematic underproduction across the industry exposes investors to increased risk. DNV found that a sample of 39 projects from 2019 were under-performing compared to their pre-construction energy assessments by 3\% on average, as shown in Fig.~\ref{fig:project-underperformance}. NextEra estimates that sub-hourly solar resource variability can affect actual energy production by approximately 1-4\% \cite{Bradford}. Cormode, \textit{et al.}, explained that energy assessments based on hourly weather, like typical meteorological year (TMY) data sets, underestimate inverter clipping losses by up to 5\% for projects at an annual level with intra-hourly solar variability, especially for high DC/AC ratios \cite{Cormode2019}. Allen, \textit{et al.}, used measured 1-minute weather data to model clipping loss underestimation from using hourly data, in the Southeast US \cite{Allen2015} and at ten sites across the country \cite{Allen2018}, for a range of DC/AC ratios and, for a subset of sites, across multiple years. They estimated similar errors, with some sites seeing errors of 2 to over 4\% for DC/AC ratios of 1.4 to 2.0, respectively. This is demonstrated in Fig.~\ref{fig:irradiance-and-power}, which shows irradiance and power output simulated at 1-minute and 1-hour resolution in the top and bottom panels, respectively. On July 10th, there is little intra-hour variability, indicative of a mostly clear day where hourly and sub-hourly simulations are in close agreement. In contrast, July 13th has much higher solar variability, but no clipping is observed in the 1-hour output. However, the 1-minute resolution shows intermittent clipping all day long, indicating that not all of the available irradiance is used by the system, and the hourly simulations underestimate inverter clipping \cite{Kharait}.
\begin{figure}[htbp]
\centerline{\includegraphics[width=9cm]{fig1.png}}
\caption{Project-average validation results for solar energy assessments. Each project-year was adjusted for interannual variability by scaling production by the ratio of TGY to historical monthly insolation.}
\label{fig:project-underperformance}
\end{figure}
\begin{figure}[htbp]
% combining the 1- and 60-min data on the same plot might be helpful. Additionally, using a second Y axis or normalizing GHI to 1000 w/m^2 and power to dc nameplate might help.
\centerline{\includegraphics[width=9cm]{hourly_v_1-min_clipping.png}}
\caption{Irradiance input from the NIST test-bed weather station in Gaithersburg, MD, and predicted energy output using SolarFarmer at time-resolution of 1-minute, \textit{top panel}, and 1-hour, \textit{bottom panel}. The black dashed line in both panels shows the inverter-rated power, 260-kW.}
\label{fig:irradiance-and-power}
\end{figure}
Several methods have been proposed to correct clipping loss error in hourly predictions \cite{Cormode2019,Kharait,Anderson2020,Bradford,Allen2019,hofmann2014improved}, but only NextEra validated their method with operational power data \cite{Bradford}.
In this report, we adapt the NREL machine learning-based model \cite{Anderson2020} to predict clipping loss corrections and apply these corrections to hourly energy assessments. The adjusted energy assessment is then validated against operational data from the same system and we report on the average and distribution of validation errors. We have obtained data from two operating PV systems in the southeast US, with DC/AC ratios around 1.4. In this paper, we will outline our method and present model validation results for these two systems.
\section{Methods}
\subsection{Model}
The machine learning (ML) model developed at NREL has already been described in detail \cite{Anderson2020}. This model was adapted for the purpose of this research. A black-box XGBoost model was generated, with satellite data inputs, as well as features derived from these fields. In an extension of the model described in \cite{Anderson2020}, system DC/AC ratio and AC power outputs from the associated SAM model were added as continuous variable features. Specific model input parameters included the following:
\begin{itemize}
\item Min-max normalized POA, GHI, clearsky POA, clearsky GHI, and cell temperature
\item First-order differenced normalized POA
\item Difference between Clearsky POA and POA
\item System DC/AC ratio
\item Modeled AC power (taken from the SAM simulation)
\end{itemize}
The model was trained using high quality irradiance measurements from data sources including SURFRAD, SOLRAD, UOSRML, MIDC, and BSRN \cite{Augustine2000} \cite{solrad} and 1-minute to 30-minute clipping loss errors predicted using SAM \cite{Freeman2018}. The resulting trained model was then used to make predictions at the operational systems.
\subsection{Systems}
For the validation, two operational systems in Georgia were chosen. The system parameters of the two solar systems used for validation are provided in Table~\ref{table1}. These systems were selected for their high DC/AC ratio and 1-minute data sampling frequency.
\begin{table}[htbp]
\caption{System Parameters}
\begin{center}
\begin{tabular}{ |c|c|c| }
\hline
& System A & System B \\
\hline
Array type & Tracking & Tracking \\
\hline
Array axis azimuth & 180$^{\circ}$ & 180$^{\circ}$\\
\hline
DC capacity & 27 MW & 43 MW\\
\hline
AC capacity & 19 MW & 30 MW \\
\hline
DC/AC Ratio & 1.42 & 1.43 \\
\hline
Ground coverage ratio & 0.3 & 0.3 \\
\hline
\end{tabular}
\end{center}
\label{table1}
\end{table}
\subsection{Validation}
A challenge in validating the clipping loss correction model is the lack of quality high-frequency irradiance measurements that are co-located with operational PV systems. Therefore this model differs from traditional approaches for model training and validation because it relies on high-quality ground station data measurements and SAM-simulated data to train the model, but uses operational data from different systems for validation. So no holdout methods are required in validating this model because the training and validation data sets are already independent. The downside of this method is that any systematic biases in SAM are introduced into the clipping loss correction model. In future work, if a large population of high-frequency irradiance and co-located operational data are obtained, the model can be retrained holding out a subset of the data for validation. This procedure would have the benefit of incorporating any physical effects that are not captured by SAM.
Quality of operational datasets is vital to avoid disruption of validation. Data points in the operational dataset related to plant sub-performance instances due to sources other than subhourly clipping error are considered to be outliers from the validation's standpoint and need to be removed. However, detecting sub-performance from inverter output is challenging. For this validation effort, the bias change is presented for both unfiltered and filtered data points. For filtering, data points that met the following criteria were retained:
\begin{itemize}
\item The difference between the measured, on-site POA and the modeled POA calculated using the NSRDB \cite{Sengupta2018} data is less than 5\%
\item The difference between actual power and simulated power of a particular timestep is less than 25\%.
\end{itemize}
For system A and system B, 60.5\% and 55.1\% of the data is retained after applying the data filter, respectively.
In addition, high clipping errors occur during periods where the plant observes DC clipping (usually corresponding to higher irradiance), and high fluctuations in the irradiance. For the purpose of this validation, the fluctuations are quantified by calculating the variability index based on \cite{Stein}. The variability index is normally defined on GHI. However, since only on-site measured POA was available, the variability index for this study was calculated and aggregated at 30-min intervals using the on-site POA measured at a 1-min frequency. The clearsky POA was calculated using the Perez implementation avaiable in pvlib \cite{Holmgren2018}. Since the ML model should be correcting only those periods with high fluctuations, the change is split into sets of low and high variability index. Periods are divided into three sets based on their variability index value:
\begin{itemize}
\item Low variability index $(\leq10)$
\item Medium variability index $(>10$ \& $\leq50)$.
\item High variability index $(>50)$
\end{itemize}
Table.~\ref{var_index_breakdown} shows the composition of the on-site data based on the variability index. For this validation, two complete years of operational data from system A are used, and one complete year of operational data from system B is used.
% is this 50% of plant nameplate, or 50% of one of the power values for the individual timestep?
% it would be good to note the # or % of intervals that were excluded
\begin{table}[htbp]
\caption{Composition of unfiltered data based on variability index}
\begin{center}
\begin{tabular}{ |c|c|c| }
\hline
& System A & System B \\
\hline
Low variability index $(\leq10)$ & 62\% & 63\% \\
\hline
Medium variability index $(>10$ \& $\leq50)$ & 21\% & 20\% \\
\hline
High variability index $(>50)$ & 17\% & 17\% \\
\hline
\end{tabular}
\end{center}
\label{var_index_breakdown}
\end{table}
The bias in model results is used to measure the reduction in clipping loss errors from the hourly energy assessment, and to identify if there are any unknown factors that are systematically affecting the results. Bias and mean bias error (MBE), the two metrics used to evaluate model performance in this study, are described in detail below.
\begin{itemize}
\item \textbf{Bias}: Delta between the corrected energy assessment predictions and the measured operational data. Positive bias means over-predicted energy output:
\begin{equation}
\Delta E={E_\text{predicted}} - {E_\text{measured}}\label{eq:bias}
\end{equation}
where $E$ is the energy output.
\item \textbf{Mean bias error (MBE)}: Average of the bias:
\begin{equation}
\mathit{MBE}=\frac{\sum_{n=1}^N{\Delta E_n}}{N}\label{eq:mbe}
\end{equation}
Where $n$ is a single timestep, and $N$ is the total number of time intervals. Please note that a low MBE might hide seasonal or diurnal bias
\end{itemize}
\section{Results}
30-minute NSRDB data from each system was fed into the ML model to make clipping error predictions. Model input parameters are previously described in the Methods section.
Time steps with a higher variability index should have higher clipping errors. Our results, shown in Fig. ~\ref{fig:DCS-vi-clec-heatmap}, support this statement. Fig.~\ref{fig:DCS-vi-clec-heatmap} displays a heat map of average variability index, as well as the average clipping loss error correction predicted for system A. Generally, higher variability periods coincide with higher clipping loss corrections, usually occurring around the middle of the day, and during summer months.
\begin{figure}[htbp]
\centerline{\includegraphics[width=9cm]{DCS_VI_CLEC_heatmap.png}}
\caption{The first sub-figure shows the average variability index calculated at system A. The variability index is calculated and aggregated at a 30-min interval using on-site POA measured at a 1-min time resolution. In the second sub-figure, the average clipping error correction calculated for system A using the ML model is shown. The model predicts the clipping error correction factor for each 30-min interval.}
\label{fig:DCS-vi-clec-heatmap}
\end{figure}
Fig.~\ref{fig:DCS-inv-modelbias-poa-scatter} shows the actual power of an inverter against the actual POA. At a higher variability index, the power is lower and corresponds to higher model bias.
\begin{figure}[htbp]
\centerline{\includegraphics[width=9cm]{DCS_Inv_ModelBias_POA_with_VI.png}}
\caption{Inverter power and bias in modeled power for a 1000 kW rated inverter at system A. Lower power and higher model bias are observed at a higher variability index.}
\label{fig:DCS-inv-modelbias-poa-scatter}
\end{figure}
For the energy model, 30-minute NSRDB data was used to generate a time series of energy estimates for system A and system B. Then, these energy estimates were corrected using the clipping loss error corrections predicted by the ML model. Table.~\ref{results_before_filter} summarizes the mean bias errors in the energy models before and after applying the correction, for systems A and B respectively. The overall shift in the bias is small for each time series. As indicated by Fig.~\ref{fig:DCS-vi-clec-heatmap} and Fig.~\ref{fig:DCS-inv-modelbias-poa-scatter}, the magnitude of the clipping loss error correction should be high during time steps with a high variability index and low during time steps with a low variability index. Hence, the bias is explored based on the low, medium and high variability index categories described in the Validation section. Part (A) of Fig.~\ref{fig:DCS-modelbias-cdf} and Fig.~\ref{fig:PAW-modelbias-cdf} shows the cumulative distribution of the overall bias before and after correction. Part (B) of these figures shows the bias broken down by the variability index. Bias is high for the points with a high variability index, and low for points with a low variability index. Promisingly, applying the clipping correction factor for high solar variability periods where clipping is most prominent results in the largest reduction in MBE. For unfiltered system A high VI data, applying the clipping correction factor results in a 1.8\% reduction in MBE; similarly, applying the clipping correction factor for the associated system B data results in a 2.7\% reduction in MBE.
\begin{table}[htbp]
\caption{Results summary by System}
\begin{center}
\scalebox{0.96}{%
\begin{tabular}{|c|c|c|c|c| }
\hline
System & Metric & Before & After & Change\\
& & correction & correction & [\%]\\
\hline
System A & Overall MBE & 5.4\% & 4.6\% & -0.8\\
\hline
System A & MBE (High VI) & 10.7\% & 8.9\% & -1.8\\
\hline
System A & MBE (Medium VI) & 4.3\% & 3.3\% & -1.0 \\
\hline
System A & MBE (Low VI) & 4.4\% & 3.9\% & -0.5\\
\hline
System B & Overall MBE & 7.8\% & 6.6\% & -1.2\\
\hline
System B & MBE (High VI) & 10.4\% & 7.7\% & -2.7\\
\hline
System B & MBE (Medium VI) & 7.7\% & 6.1\% & -1.6\\
\hline
System B & MBE (Low VI) & 7.1\% & 6.4\% & -0.7\\
\hline
\end{tabular}}
\end{center}
\label{results_before_filter}
\end{table}
\begin{figure}[htbp]
\centerline{\includegraphics[width=9cm]{DCS_ModelBias_breakdown_CDF_v4.png}}
\caption{CDF of Model bias for system A before data filtering}
\label{fig:DCS-modelbias-cdf}
\end{figure}
\begin{figure}[htbp]
\centerline{\includegraphics[width=9cm]{PAW_ModelBias_breakdown_CDF_v4.png}}
\caption{CDF of Model bias for system B before data filtering}
\label{fig:PAW-modelbias-cdf}
\end{figure}
The overall model bias in Fig.~\ref{fig:DCS-modelbias-cdf} and Fig.~\ref{fig:PAW-modelbias-cdf} can be attributed to many reasons. Factors like the difference in the resource measured on-site and in the NSRDB dataset, system-specific sub-performance, incorrect loss assumptions in the energy model, etc., impact the bias along with the clipping loss errors. However, for a given data point, it is unrealistic to have a bias of greater than 25\% in power when the bias in POA is less than 5\%. Consequently, points fitting this logic are removed and the bias change is recalculated. Table.~\ref{results_after_filter} summarizes the mean bias errors in the filtered data points of the energy models before and after applying the correction, for systems A and B respectively. Fig.~\ref{fig:DCS-modelbias-after-filter-cdf} and Fig.~\ref{fig:PAW-modelbias-after-filter-cdf} show the bias before and after the correction for this filtered dataset. Compared to the CDFs plotted for the unfiltered dataset, these plots show a much tighter distribution as the outliers are removed.
Results for the filtered data mirror the unfiltered data results, except with lower overall mean bias error values. Interestingly, the drop in mean bias error after applying the clipping error correction is generally consistent for both filtered and unfiltered data (see the "\% Change" field in Tables \ref{results_before_filter} and \ref{results_after_filter} for comparison). For example, the drop in overall MBE for unfiltered data is 0.8\% vs. 0.5\% for filtered data for system A. Similarly the overall MBE difference is 1.2\% for unfiltered data and 0.8\% for filtered data for system B.
\begin{table}[htbp]
\caption{Results summary by System, after filtering}
\begin{center}
\scalebox{0.96}{%
\begin{tabular}{ |c|c|c|c|c| }
\hline
System & Metric & Before & After & Change\\
& & Correction & Correction & [\%]\\
\hline
System A & Overall MBE & 2.3\% & 1.8\% & -0.5\\
\hline
System A & MBE (High VI) & 10.7\% & 9.0\% & -1.7 \\
\hline
System A & MBE (Medium VI) & 3.1\% & 2.2\% & -0.9\\
\hline
System A & MBE (Low VI) & 1.1\% & 0.8\% & -0.3\\
\hline
System B & Overall MBE & 3.6\% & 2.8\% & -0.8\\
\hline
System B & MBE (High VI) & 8.7\% & 5.9\% & -2.8\\
\hline
System B & MBE (Medium VI) & 4.4\% & 2.9\% & -1.5\\
\hline
System B & MBE (Low VI) & 2.8\% & 2.3\% & -0.5\\
\hline
\end{tabular}}
\end{center}
\label{results_after_filter}
\end{table}
\begin{figure}[htbp]
\centerline{\includegraphics[width=9cm]{DCS_ModelBias_breakdown_AFTER_filter_CDF_v4.png}}
\caption{CDF of Model bias for system A after data filtering}
\label{fig:DCS-modelbias-after-filter-cdf}
\end{figure}
\begin{figure}[htbp]
\centerline{\includegraphics[width=9cm]{PAW_ModelBias_breakdown_AFTER_filter_CDF_v4.png}}
\caption{CDF of Model bias for system B after data filtering}
\label{fig:PAW-modelbias-after-filter-cdf}
\end{figure}
\section{Conclusions}
Over-prediction occurs in typical energy assessments that use hourly data if the systems have sub-hourly solar variability and high DC/AC, because clipping losses are underestimated. These errors have been quantified using a machine-learning model trained on high-frequency solar irradiance data and simulated operational data to create clipping loss error corrections. The corrections were applied to a typical hourly energy assessment of an existing operational system and compared to the measured output from the same system. Model bias was calculated before and after the clipping correction was applied, and the difference in MBE values was compared. Overall, the clipping correction outputs from the model reduced bias in the energy assessments by 0.8\% and 1.2\% for systems A and B, respectively, with no additional data filters applied. However, these bias reductions should be interpreted with two points in consideration: first, clipping loss error decreases with decreasing simulation timestep, so the corrections in this 30-minute validation are somewhat smaller than they would be in an hourly context; and second, based on the CDF comparisons, it seems likely that the model corrections do not fully eliminate the bias. Thus, the bias reductions reported in this paper likely underestimate the bias of a typical energy model. For high solar variability periods where clipping is most pronounced, the difference in MBE before and after applying the clipping correction is more pronounced. For unfiltered data with a high VI index, the MBE difference after applying the correction was 1.8\% and 2.7\% for systems A and B, respectively.
\section*{Acknowledgment}
This work was authored in part by Alliance for Sustainable Energy, LLC, the manager and operator of the National Renewable Energy Laboratory for the U.S. Department of Energy (DOE) under Contract No. DE-AC36-08GO28308. Funding provided by the U.S. Department of Energy’s Office of Energy Efficiency and Renewable Energy (EERE) under Solar Energy Technologies Office (SETO) Agreement Numbers 34348. The views expressed in the article do not necessarily represent the views of the DOE or the U.S. Government. The U.S. Government retains and the publisher, by accepting the article for publication, acknowledges that the U.S. Government retains a nonexclusive, paid-up, irrevocable, worldwide license to publish or reproduce the published form of this work, or allow others to do so, for U.S. Government purposes.
\bibliographystyle{IEEEtran}
% argument is your BibTeX string definitions and bibliography database(s)
\bibliography{IEEEabrv,bibliography}
\end{document}