(Slightly amended with code and data link, 12th January 2019.)
Prediction of electrical load demand, or in other words electrical energy consumption, is important for the proper operation of electrical grids at all scales. Regional transmission organizations (RTOs) and independent system operators (ISOs) forecast demand based upon historical trends and facts, and use these forecasts to assure adequate supply is available.
This is particularly important when supply is intermittent, such as solar PV generation or wind generation, but, to some degree, all generation is intermittent and can be unreliable.
Such prediction is particularly difficult at the small and medium scale. At large scale, relative errors are easier to control, and a large number of units drawing upon or producing electrical energy are aggregated. At the very smallest of scales, it may be possible to anticipate the usage of single institutions or households based upon historical trends and living patterns. This has only partly been achieved in devices like the Sense monitor, and prediction is still far away.
Presumably, techniques which apply to the very small could be scaled to deal with small and moderate-sized subgrids, although the techniques for moderate-sized subgrids will probably be adaptations of those used at large scale.
There is some evidence that patterns of electrical consumption directly follow the behavior of the building’s or home’s occupants that day, modulated by outside temperatures and the occurrence of notable or special events. Accordingly, being able to identify the pattern of behavior early in a day can offer powerful prior information for the consumption pattern that will hold later in the day.
There is independent evidence that occupants do, in a sense, select their days from a palette of available behaviors. This has been observed in Internet Web traffic, as well as in secondary signals in emissions from transportation centers. Discovering that palette of behaviors is a challenge.
This post reports on an effort to do such discovery using a time series of electricity consumption for 366 days from a local high school. Consumption is sampled every 15 minutes.
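Since the clustering works on whole days, the first step is to cut the 15-minute series into daily pieces of 96 samples each. The following is a minimal Python sketch of that step, not the author's R code; the function name `split_into_days` and the synthetic values are assumptions made for illustration.

```python
def split_into_days(readings, samples_per_day=96):
    """Cut a flat series of 15-minute readings into one list per day."""
    if len(readings) % samples_per_day != 0:
        raise ValueError("series does not cover a whole number of days")
    return [readings[start:start + samples_per_day]
            for start in range(0, len(readings), samples_per_day)]


# Synthetic placeholder values, not the school's measurements.
series = [float(i % 96) for i in range(366 * 96)]
days = split_into_days(series)
print(len(days), len(days[0]))  # prints: 366 96
```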
Here is a portion of this series, with some annotations:
The segmentation is done automatically with a regime switching detector. The portion below shows these data atop a short-time Fourier transform (STFT) spectrum of the same:
The point of this exercise is to cluster days together in a principled way, so as to attempt to derive a kind of palette. One “color” of such a palette would be a cluster. Accordingly, if a day is identified, from the preliminary trace of its electricity consumption, as being a member of a cluster, the bet is that the remainder of the day’s consumption will follow the patterns of the other series seen in that cluster. If more than one cluster fits, then some kind of model average across clusters can be taken as predictive, obviously with greater uncertainty.
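One way the "bet" could be cashed out is sketched below in Python. It is only an illustration under stated assumptions, not a method from this post: each cluster is assumed to be summarized by a mean daily profile, the fit of the morning's trace to a profile is measured by root-mean-square difference, and the weights are taken as exp(-distance / scale). The names `partial_distance`, `predict_remainder`, and `scale`, and the toy profiles, are all hypothetical.

```python
import math


def partial_distance(partial, profile):
    """Root-mean-square gap between the observed start of a day and a profile."""
    n = len(partial)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(partial, profile[:n])) / n)


def predict_remainder(partial, profiles, scale=1.0):
    """Blend cluster profiles, weighting each by its fit to the partial trace.

    `profiles` maps a cluster label to a full-day profile (assumed input).
    Weights are exp(-distance / scale), normalized to sum to one.
    """
    n = len(partial)
    if n == 0:
        raise ValueError("need at least one observed sample")
    raw = {label: math.exp(-partial_distance(partial, prof) / scale)
           for label, prof in profiles.items()}
    total = sum(raw.values())
    weights = {label: w / total for label, w in raw.items()}
    horizon = len(next(iter(profiles.values()))) - n
    forecast = [sum(weights[label] * profiles[label][n + t] for label in profiles)
                for t in range(horizon)]
    return weights, forecast


# Toy profiles with 8 samples per day instead of 96, purely illustrative.
profiles = {
    "weekend": [10.0] * 8,
    "school": [10.0, 12.0, 30.0, 45.0, 50.0, 48.0, 30.0, 15.0],
}
weights, forecast = predict_remainder([10.0, 12.5, 29.0], profiles, scale=2.0)
print(weights)
print(forecast)
```

When one cluster fits far better than the others, its weight approaches one and the forecast is essentially that cluster's profile; when several fit comparably, the forecast is a blend, which reflects the greater uncertainty mentioned above.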
Each of the 366 days of the 2007-2008 academic year was separated out, and pairwise dissimilarities between all days were calculated using a Symmetrized Normalized Compression Divergence (SNCD) described previously. The dissimilarity matrix was used with the default hierarchical clustering function in R, hclust, and its Ward-D2 method (ward.D2). That clustering produced the following dendrogram:
The facilities of the dynamicTreeCut package of R were used to find a place to cut the dendrogram and thus identify clusters. The cutreeDynamic function was called on the result of the hierarchical clustering, using the hybrid method and a minimum cluster size setting of one, to give the cluster chooser free rein.
Five clusters were found. They are presented below in several ways.
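The pairwise dissimilarity step can be sketched in Python as follows. The exact SNCD definition is in the earlier post and is not reproduced here, so this sketch substitutes the standard normalized compression distance computed with zlib, averaged over both concatenation orders to make it symmetric. The quantization in `encode_day`, the choice of zlib, and all function names are assumptions for illustration, and the toy days are synthetic.

```python
import math
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Standard normalized compression distance for the ordering x then y."""
    cx, cy = compressed_size(x), compressed_size(y)
    return (compressed_size(x + y) - min(cx, cy)) / max(cx, cy)


def symmetrized_ncd(x: bytes, y: bytes) -> float:
    """Average both concatenation orders, since compressors are order-sensitive."""
    return 0.5 * (ncd(x, y) + ncd(y, x))


def encode_day(day, lo, hi, levels=64):
    """Quantize readings to one byte each so the compressor sees symbols."""
    span = (hi - lo) or 1.0
    return bytes(min(levels - 1, int((v - lo) / span * levels)) for v in day)


def dissimilarity_matrix(days):
    """Symmetric matrix of pairwise dissimilarities with a zero diagonal."""
    lo = min(min(d) for d in days)
    hi = max(max(d) for d in days)
    coded = [encode_day(d, lo, hi) for d in days]
    n = len(coded)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            dist[i][j] = dist[j][i] = symmetrized_ncd(coded[i], coded[j])
    return dist


# Toy days with 96 samples each; synthetic, not the school's data.
flat = [20.0] * 96
busy = [20.0 + 30.0 * math.sin(math.pi * t / 95.0) for t in range(96)]
late = busy[-8:] + busy[:-8]  # same shape, shifted by two hours
dist = dissimilarity_matrix([flat, busy, late])
for row in dist:
    print([round(v, 3) for v in row])
```

A matrix like this is what a hierarchical clustering routine consumes; in the workflow described above, that role is played by hclust in R.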
First, the dates and their weekdays:
$`1`
2007-09-06 2007-09-07 2007-09-10 2007-09-14 2007-09-17 2007-09-18 2007-09-21 2007-09-25 2007-09-27 2007-10-01 2007-10-02 2007-10-03 2007-10-04 2007-10-09
"Thursday" "Friday" "Monday" "Friday" "Monday" "Tuesday" "Friday" "Tuesday" "Thursday" "Monday" "Tuesday" "Wednesday" "Thursday" "Tuesday"
2007-10-10 2007-10-22 2007-10-23 2007-10-29 2007-10-31 2007-11-02 2007-11-05 2007-11-06 2007-11-13 2007-11-21 2007-11-28 2007-12-03 2007-12-04 2007-12-05
"Wednesday" "Monday" "Tuesday" "Monday" "Wednesday" "Friday" "Monday" "Tuesday" "Tuesday" "Wednesday" "Wednesday" "Monday" "Tuesday" "Wednesday"
2007-12-06 2007-12-11 2007-12-12 2007-12-14 2007-12-17 2007-12-18 2007-12-19 2008-01-03 2008-01-04 2008-01-11 2008-01-15 2008-01-16 2008-01-17 2008-01-18
"Thursday" "Tuesday" "Wednesday" "Friday" "Monday" "Tuesday" "Wednesday" "Thursday" "Friday" "Friday" "Tuesday" "Wednesday" "Thursday" "Friday"
2008-01-22 2008-01-23 2008-01-24 2008-01-29 2008-01-30 2008-01-31 2008-02-05 2008-02-06 2008-02-07 2008-02-11 2008-02-12 2008-02-13 2008-02-25 2008-02-27
"Tuesday" "Wednesday" "Thursday" "Tuesday" "Wednesday" "Thursday" "Tuesday" "Wednesday" "Thursday" "Monday" "Tuesday" "Wednesday" "Monday" "Wednesday"
2008-02-28 2008-03-10 2008-03-12 2008-03-13 2008-03-14 2008-03-19 2008-03-24 2008-03-25 2008-04-01 2008-04-02 2008-04-03 2008-04-04 2008-04-11 2008-04-23
"Thursday" "Monday" "Wednesday" "Thursday" "Friday" "Wednesday" "Monday" "Tuesday" "Tuesday" "Wednesday" "Thursday" "Friday" "Friday" "Wednesday"
2008-04-28 2008-04-30 2008-05-05 2008-05-07 2008-05-09 2008-05-12 2008-05-19 2008-05-22 2008-05-27 2008-05-28 2008-06-01 2008-06-02 2008-06-04 2008-06-05
"Monday" "Wednesday" "Monday" "Wednesday" "Friday" "Monday" "Monday" "Thursday" "Tuesday" "Wednesday" "Sunday" "Monday" "Wednesday" "Thursday"
2008-06-07 2008-06-10 2008-06-13 2008-06-17 2008-06-18 2008-06-19 2008-06-23 2008-06-24 2008-06-27 2008-07-01 2008-07-02 2008-07-05 2008-08-11 2008-08-18
"Saturday" "Tuesday" "Friday" "Tuesday" "Wednesday" "Thursday" "Monday" "Tuesday" "Friday" "Tuesday" "Wednesday" "Saturday" "Monday" "Monday"
2008-08-27
"Wednesday"
$`2`
2007-09-03 2007-09-04 2007-09-08 2007-09-12 2007-09-13 2007-09-15 2007-09-20 2007-09-24 2007-09-29 2007-10-06 2007-10-07 2007-10-08 2007-10-12 2007-10-15
"Monday" "Tuesday" "Saturday" "Wednesday" "Thursday" "Saturday" "Thursday" "Monday" "Saturday" "Saturday" "Sunday" "Monday" "Friday" "Monday"
2007-10-20 2007-10-27 2007-10-28 2007-10-30 2007-11-03 2007-11-22 2007-11-23 2007-11-26 2007-12-01 2007-12-13 2007-12-24 2007-12-26 2007-12-28 2007-12-31
"Saturday" "Saturday" "Sunday" "Tuesday" "Saturday" "Thursday" "Friday" "Monday" "Saturday" "Thursday" "Monday" "Wednesday" "Friday" "Monday"
2008-01-05 2008-01-14 2008-01-21 2008-01-25 2008-02-02 2008-02-04 2008-02-09 2008-02-10 2008-02-15 2008-02-18 2008-02-19 2008-02-20 2008-02-21 2008-02-22
"Saturday" "Monday" "Monday" "Friday" "Saturday" "Monday" "Saturday" "Sunday" "Friday" "Monday" "Tuesday" "Wednesday" "Thursday" "Friday"
2008-03-04 2008-03-06 2008-03-15 2008-03-18 2008-03-23 2008-03-28 2008-03-29 2008-04-05 2008-04-10 2008-04-16 2008-04-17 2008-04-18 2008-04-21 2008-04-22
"Tuesday" "Thursday" "Saturday" "Tuesday" "Sunday" "Friday" "Saturday" "Saturday" "Thursday" "Wednesday" "Thursday" "Friday" "Monday" "Tuesday"
2008-04-25 2008-05-01 2008-05-02 2008-05-08 2008-05-21 2008-05-24 2008-05-29 2008-06-08 2008-06-12 2008-06-21 2008-06-25 2008-06-26 2008-07-04 2008-07-06
"Friday" "Thursday" "Friday" "Thursday" "Wednesday" "Saturday" "Thursday" "Sunday" "Thursday" "Saturday" "Wednesday" "Thursday" "Friday" "Sunday"
2008-07-07 2008-07-13 2008-07-18 2008-07-21 2008-07-22 2008-07-23 2008-07-24 2008-07-29 2008-07-30 2008-08-01 2008-08-02 2008-08-05 2008-08-06 2008-08-08
"Monday" "Sunday" "Friday" "Monday" "Tuesday" "Wednesday" "Thursday" "Tuesday" "Wednesday" "Friday" "Saturday" "Tuesday" "Wednesday" "Friday"
2008-08-09 2008-08-10 2008-08-12 2008-08-13 2008-08-15 2008-08-16 2008-08-20 2008-08-28
"Saturday" "Sunday" "Tuesday" "Wednesday" "Friday" "Saturday" "Wednesday" "Thursday"
$`3`
2007-09-05 2007-09-11 2007-09-19 2007-09-26 2007-09-28 2007-10-05 2007-10-11 2007-10-16 2007-10-17 2007-10-18 2007-10-19 2007-10-24 2007-10-25 2007-10-26
"Wednesday" "Tuesday" "Wednesday" "Wednesday" "Friday" "Friday" "Thursday" "Tuesday" "Wednesday" "Thursday" "Friday" "Wednesday" "Thursday" "Friday"
2007-11-01 2007-11-07 2007-11-08 2007-11-09 2007-11-14 2007-11-15 2007-11-16 2007-11-19 2007-11-20 2007-11-27 2007-11-29 2007-11-30 2007-12-07 2007-12-10
"Thursday" "Wednesday" "Thursday" "Friday" "Wednesday" "Thursday" "Friday" "Monday" "Tuesday" "Tuesday" "Thursday" "Friday" "Friday" "Monday"
2007-12-20 2007-12-21 2007-12-27 2008-01-02 2008-01-07 2008-01-08 2008-01-09 2008-01-10 2008-01-28 2008-02-01 2008-02-08 2008-02-14 2008-02-26 2008-02-29
"Thursday" "Friday" "Thursday" "Wednesday" "Monday" "Tuesday" "Wednesday" "Thursday" "Monday" "Friday" "Friday" "Thursday" "Tuesday" "Friday"
2008-03-03 2008-03-05 2008-03-07 2008-03-08 2008-03-11 2008-03-17 2008-03-26 2008-03-27 2008-03-31 2008-04-07 2008-04-08 2008-04-09 2008-04-14 2008-04-15
"Monday" "Wednesday" "Friday" "Saturday" "Tuesday" "Monday" "Wednesday" "Thursday" "Monday" "Monday" "Tuesday" "Wednesday" "Monday" "Tuesday"
2008-04-24 2008-04-29 2008-05-06 2008-05-13 2008-05-14 2008-05-15 2008-05-16 2008-05-20 2008-05-23 2008-05-30 2008-06-03 2008-06-06 2008-06-09 2008-06-11
"Thursday" "Tuesday" "Tuesday" "Tuesday" "Wednesday" "Thursday" "Friday" "Tuesday" "Friday" "Friday" "Tuesday" "Friday" "Monday" "Wednesday"
2008-06-14 2008-06-16 2008-06-22 2008-07-14 2008-07-25 2008-08-19 2008-08-26
"Saturday" "Monday" "Sunday" "Monday" "Friday" "Tuesday" "Tuesday"
$`4`
2007-09-01 2007-09-02 2007-09-09 2007-09-16 2007-09-22 2007-09-23 2007-09-30 2007-10-13 2007-10-14 2007-10-21 2007-11-04 2007-11-10 2007-11-11 2007-11-12 2007-11-17 2007-11-18
"Saturday" "Sunday" "Sunday" "Sunday" "Saturday" "Sunday" "Sunday" "Saturday" "Sunday" "Sunday" "Sunday" "Saturday" "Sunday" "Monday" "Saturday" "Sunday"
2007-11-24 2007-11-25 2007-12-02 2007-12-08 2007-12-09 2007-12-15 2007-12-16 2007-12-22 2007-12-23 2007-12-25 2007-12-29 2007-12-30 2008-01-01 2008-01-06 2008-01-12 2008-01-13
"Saturday" "Sunday" "Sunday" "Saturday" "Sunday" "Saturday" "Sunday" "Saturday" "Sunday" "Tuesday" "Saturday" "Sunday" "Tuesday" "Sunday" "Saturday" "Sunday"
2008-01-19 2008-01-20 2008-01-26 2008-01-27 2008-02-03 2008-02-16 2008-02-17 2008-02-23 2008-02-24 2008-03-01 2008-03-02 2008-03-09 2008-03-16 2008-03-21 2008-03-22 2008-03-30
"Saturday" "Sunday" "Saturday" "Sunday" "Sunday" "Saturday" "Sunday" "Saturday" "Sunday" "Saturday" "Sunday" "Sunday" "Sunday" "Friday" "Saturday" "Sunday"
2008-04-06 2008-04-12 2008-04-13 2008-04-19 2008-04-20 2008-04-26 2008-04-27 2008-05-03 2008-05-04 2008-05-10 2008-05-11 2008-05-17 2008-05-18 2008-05-25 2008-05-31 2008-06-15
"Sunday" "Saturday" "Sunday" "Saturday" "Sunday" "Saturday" "Sunday" "Saturday" "Sunday" "Saturday" "Sunday" "Saturday" "Sunday" "Sunday" "Saturday" "Sunday"
2008-06-29 2008-07-12 2008-07-19 2008-07-20 2008-07-26 2008-07-27 2008-08-03 2008-08-17 2008-08-24 2008-08-31
"Sunday" "Saturday" "Saturday" "Sunday" "Saturday" "Sunday" "Sunday" "Sunday" "Sunday" "Sunday"
$`5`
2008-03-20 2008-05-26 2008-06-20 2008-06-28 2008-06-30 2008-07-03 2008-07-08 2008-07-09 2008-07-10 2008-07-11 2008-07-15 2008-07-16 2008-07-17 2008-07-28
"Thursday" "Monday" "Friday" "Saturday" "Monday" "Thursday" "Tuesday" "Wednesday" "Thursday" "Friday" "Tuesday" "Wednesday" "Thursday" "Monday"
2008-07-31 2008-08-04 2008-08-07 2008-08-14 2008-08-21 2008-08-22 2008-08-23 2008-08-25 2008-08-29 2008-08-30
"Thursday" "Monday" "Thursday" "Thursday" "Thursday" "Friday" "Saturday" "Monday" "Friday" "Saturday"
Note that most of the weekend days are in cluster 4, along with a Christmas Tuesday (25 December 2007), New Year’s Day on a Tuesday (1 January 2008), Veterans Day (observed) on a Monday (12 November 2007), and a Good Friday (21 March 2008). Assigning meanings to the other clusters depends upon having events to mark them with. It’s known, for example, that the last day of school in 2008 was 20th June 2008. Unfortunately, the academic calendars for 2007-2008 have apparently been discarded. (I was able to find a copy of the 2008 Westwood High School yearbook, but it is not informative about dates, consisting primarily of photographs.) Accordingly, it’s necessary to look for internal consistency.
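Picking the non-weekend days out of a cluster can be automated. The Python sketch below does this for a subset of the cluster 4 dates copied from the listing above; the function name `weekday_exceptions` is hypothetical, and the weekday names are hard-coded so the result does not depend on locale.

```python
from datetime import date

WEEKDAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
                 "Friday", "Saturday", "Sunday"]


def weekday_exceptions(iso_dates):
    """Return (date, weekday name) for dates that are not Saturday or Sunday."""
    found = []
    for text in iso_dates:
        index = date.fromisoformat(text).weekday()  # Monday is 0
        if index < 5:
            found.append((text, WEEKDAY_NAMES[index]))
    return found


# A subset of cluster 4, copied from the listing above.
cluster4_sample = [
    "2007-09-01", "2007-09-02", "2007-11-10", "2007-11-11", "2007-11-12",
    "2007-12-22", "2007-12-23", "2007-12-25", "2007-12-29", "2007-12-30",
    "2008-01-01", "2008-03-21", "2008-03-22", "2008-03-30",
]
for day, name in weekday_exceptions(cluster4_sample):
    print(day, name)
```

Run over a whole cluster, the same function gives a short list of dates to check against a holiday or school calendar, which is one route to the internal consistency check mentioned above.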
There is a visual way of representing these findings. The figure below, a reproduction of the one at the head of the blog post, traces energy consumption for the high school during each day. The abscissa shows hours of the day, broken up into 96 15-minute intervals. For each of the 366 days, the recorded energy consumption is plotted and the points are connected by lines. Each line is plotted in a different color depending upon the day of the week. The colors are faded by adjusting their alpha value so that overlapping lines can be seen through one another.
Note how days with flat energy consumption tend to be in a single color. These are apparently weekend days.
Atop each of the lines describing energy consumption, a black numeral has been printed giving the number of the cluster to which the day was assigned. These numerals are printed at the highest point of their associated curves, but are jittered so that they don’t stack atop one another and become hard to distinguish.
The clusters correspond to distinct consumption characters. A proactive energy management approach would entail examining the activities done on the days in each of the clusters. Of special interest would be clusters, such as clusters 1 and 3, which have very high energy usage.
Code and data
The code and data reviewed here are available in my Google replacement for a Git repository.
Future work
I am next planning to apply this clustering technique to long-neglected time series of streamflow in Sharon, MA and on the South Shore.