A look at an electricity consumption series using SNCDs for clustering

(Slightly amended with code and data link, 12th January 2019.)

Predicting electrical load demand, in other words electrical energy consumption, is important for the proper operation of electrical grids at all scales. Regional Transmission Organizations (RTOs) and Independent System Operators (ISOs) forecast demand from historical trends and facts, and use those forecasts to assure adequate supply is available.

This is particularly important when supply is intermittent, such as solar PV generation or wind generation, but, to some degree, all generation is intermittent and can be unreliable.

Such prediction is particularly difficult at the small and medium scale. At large scale, relative errors are easier to control, because a large number of units drawing upon or producing electrical energy are aggregated. At the very smallest of scales, it may be possible to anticipate usage of single institutions or households based upon historical trends and living patterns. This has only partly been achieved in devices like the Sense monitor, and prediction remains a distant prospect.

Presumably, techniques which apply to the very small could be scaled to deal with small and moderate size subgrids, although the moderate sized subgrids will probably be adaptations of the techniques used at large scale.

There is some evidence that patterns of electrical consumption directly follow the behavior of the building’s or home’s occupants that day, modulated by outside temperatures and the occurrence of notable or special events. Accordingly, being able to identify the pattern of behavior early in a day can offer powerful prior information for the consumption pattern that will hold later in the day.

There is independent evidence that occupants do, in a sense, select their days from a palette of available behaviors. This has been observed in Internet Web traffic, as well as in secondary signals in emissions from transportation centers. Discovering that palette of behaviors is a challenge.

This post reports on an effort to do such discovery using a time series of electricity consumption for 366 days from a local high school. Consumption is sampled every 15 minutes.
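As a rough illustration of the data layout just described, the sketch below cuts a flat series of 15-minute readings into daily traces of 96 samples each. It is written in Python purely for illustration (the analysis in this post was done in R), and the function name split_into_days and the constant stand-in data are hypothetical, not taken from the original code.

```python
from typing import List, Sequence

SAMPLES_PER_DAY = 24 * 4  # one reading every 15 minutes gives 96 per day


def split_into_days(readings: Sequence[float],
                    samples_per_day: int = SAMPLES_PER_DAY) -> List[List[float]]:
    """Cut a flat series into consecutive, complete daily traces."""
    if len(readings) % samples_per_day != 0:
        raise ValueError("series length is not a whole number of days")
    return [list(readings[i:i + samples_per_day])
            for i in range(0, len(readings), samples_per_day)]


# Hypothetical stand-in data: 366 days of a constant load.
series = [1.0] * (366 * SAMPLES_PER_DAY)
days = split_into_days(series)
print(len(days), len(days[0]))
```

Each element of days is then one candidate "day" to be compared against every other day.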

Here is a portion of this series, with some annotations:

The segmentation is done automatically with a regime switching detector. The portion below shows these data atop a short-time Fourier transform (STFT) spectrum of the same:

The point of this exercise is to cluster days together in a principled way, so as to attempt to derive a kind of palette. One “color” of such a palette would be a cluster. Accordingly, if a day is identified, from the preliminary trace of its electricity consumption, as being a member of a cluster, the bet is that the remainder of the day’s consumption will follow the patterns of other series seen in the cluster. If more than one cluster fits, then some kind of model average across clusters can be taken as predictive, obviously with greater uncertainty.
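To make the "bet" concrete, here is a minimal Python sketch of one way a partial day could be matched against clusters and the rest of the day predicted by a weighted average. Everything specific in it is an assumption for illustration: clusters are summarized by their mean profile, matching uses root-mean-square distance over the hours seen so far (not SNCD), and the weights come from an exponential of the negative distance with a hypothetical temperature parameter. The post does not prescribe any of these choices.

```python
import math
from typing import Dict, List, Sequence


def mean_profile(days: Sequence[Sequence[float]]) -> List[float]:
    """Pointwise mean of the daily traces belonging to one cluster."""
    n = len(days)
    return [sum(day[t] for day in days) / n for t in range(len(days[0]))]


def cluster_weights(partial: Sequence[float],
                    profiles: Dict[str, Sequence[float]],
                    temperature: float = 1.0) -> Dict[str, float]:
    """Weight each cluster by how well its profile fits the readings so far."""
    k = len(partial)
    dist = {c: math.sqrt(sum((partial[t] - p[t]) ** 2 for t in range(k)) / k)
            for c, p in profiles.items()}
    dmin = min(dist.values())  # subtract the minimum for numerical stability
    raw = {c: math.exp(-(d - dmin) / temperature) for c, d in dist.items()}
    total = sum(raw.values())
    return {c: r / total for c, r in raw.items()}


def predict_remainder(partial: Sequence[float],
                      profiles: Dict[str, Sequence[float]],
                      temperature: float = 1.0) -> List[float]:
    """Weighted average of cluster profiles over the not-yet-seen intervals."""
    w = cluster_weights(partial, profiles, temperature)
    k = len(partial)
    horizon = len(next(iter(profiles.values())))
    return [sum(w[c] * profiles[c][t] for c in profiles)
            for t in range(k, horizon)]


# Hypothetical profiles: a flat "weekend" day and a "school" day that ramps
# up between 06:00 (interval 24) and 18:00 (interval 72).
profiles = {
    "weekend": mean_profile([[20.0] * 96]),
    "school": mean_profile([[20.0] * 24 + [80.0] * 48 + [20.0] * 24]),
}
partial = profiles["school"][:32]          # readings up to 08:00
weights = cluster_weights(partial, profiles)
forecast = predict_remainder(partial, profiles)
print(max(weights, key=weights.get), len(forecast))
```

When two clusters fit the early readings about equally well, the weights are close to even and the forecast sits between the two profiles, which is the "greater uncertainty" case mentioned above.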


Each of the 366 days in the 2007-2008 academic year was separated out, and pairwise dissimilarities for all days were calculated using a Symmetrized Normalized Compression Divergence (SNCD) described previously. The dissimilarity matrix was used with the default hierarchical clustering function, hclust, in R, with its ward.D2 method. That clustering produced the following dendrogram:
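For readers who have not seen the earlier post, the following Python sketch shows the general idea of a compression-based dissimilarity and how a pairwise matrix is filled in. It is an approximation under stated assumptions, not the SNCD code used here: it uses zlib as the compressor, quantizes readings to single bytes over an assumed range, and symmetrizes by averaging the normalized compression distance over both concatenation orders. The exact SNCD definition and preprocessing are in the earlier post and may differ.

```python
import zlib
from typing import List, Sequence


def quantize(trace: Sequence[float], lo: float, hi: float) -> bytes:
    """Map each reading onto one of 256 bins over [lo, hi] and pack as bytes."""
    if hi <= lo:
        raise ValueError("hi must exceed lo")
    out = bytearray()
    for v in trace:
        k = int((v - lo) / (hi - lo) * 255)
        out.append(min(max(k, 0), 255))  # clamp readings outside [lo, hi]
    return bytes(out)


def clen(b: bytes) -> int:
    """Compressed length in bytes, a computable stand-in for complexity."""
    return len(zlib.compress(b, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance for the concatenation order x then y."""
    cx, cy = clen(x), clen(y)
    return (clen(x + y) - min(cx, cy)) / max(cx, cy)


def sncd(x: bytes, y: bytes) -> float:
    """Assumed symmetrization: average over both concatenation orders."""
    return 0.5 * (ncd(x, y) + ncd(y, x))


def dissimilarity_matrix(traces: Sequence[bytes]) -> List[List[float]]:
    """Fill a symmetric matrix of pairwise dissimilarities, zero on the diagonal."""
    n = len(traces)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = sncd(traces[i], traces[j])
    return d


# Hypothetical traces: two flat days and one highly variable day.
flat_a = [10.0] * 96
flat_b = [10.5] * 96
busy = [float((i * 37) % 100) for i in range(96)]
traces = [quantize(t, 0.0, 100.0) for t in (flat_a, flat_b, busy)]
d = dissimilarity_matrix(traces)
print(d[0][1] < d[0][2])
```

A matrix like d is what gets handed to the hierarchical clustering step.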

The facilities of the dynamicTreeCut package of R were used to find a place to cut the dendrogram and thus identify clusters. The cutreeDynamic function was called on the result of hierarchical clustering, using the hybrid method, and a minimum cluster size setting of one, to give the cluster chooser free range.
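The sketch below illustrates, in plain Python, the kind of agglomeration Ward's criterion performs on a dissimilarity matrix. It is a stand-in under assumptions and not what was run for this post: it squares the dissimilarities and applies the Lance-Williams update for Ward's method, which is how R documents ward.D2, but it simply stops at a caller-chosen number of clusters k instead of reproducing the adaptive cut made by cutreeDynamic. The function name ward_clusters and the toy data are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def ward_clusters(d: Sequence[Sequence[float]], k: int) -> List[int]:
    """Merge items by Ward's criterion until k clusters remain.

    Returns one label per item, numbered from 1 in order of each
    cluster's lowest-indexed member.
    """
    n = len(d)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and the number of items")
    members: Dict[int, List[int]] = {i: [i] for i in range(n)}
    # Squared dissimilarities, keyed by (smaller id, larger id).
    d2: Dict[Tuple[int, int], float] = {
        (i, j): d[i][j] ** 2 for i in range(n) for j in range(i + 1, n)}

    def get(a: int, b: int) -> float:
        return d2[(a, b) if a < b else (b, a)]

    next_id = n
    while len(members) > k:
        # Find the closest pair of currently active clusters.
        i, j = min(((a, b) for a in members for b in members if a < b),
                   key=lambda p: d2[p])
        ni, nj = len(members[i]), len(members[j])
        updated = {}
        for c in members:
            if c in (i, j):
                continue
            nc = len(members[c])
            # Lance-Williams update for Ward's method on squared distances.
            updated[c] = ((ni + nc) * get(c, i) + (nj + nc) * get(c, j)
                          - nc * get(i, j)) / (ni + nj + nc)
        members[next_id] = members.pop(i) + members.pop(j)
        for c, v in updated.items():
            d2[(c, next_id)] = v  # next_id is always the larger id
        next_id += 1

    labels = [0] * n
    ordered = sorted(members.values(), key=min)
    for lab, items in enumerate(ordered, start=1):
        for it in items:
            labels[it] = lab
    return labels


# Toy example: five points on a line, dissimilarity = absolute difference.
points = [0.0, 1.0, 10.0, 11.0, 50.0]
toy = [[abs(a - b) for b in points] for a in points]
labels = ward_clusters(toy, 3)
print(labels)
```

The search for the closest pair is cubic overall in this naive form, which is slow but workable for a few hundred days; a fixed k is used only because the adaptive cut has no simple equivalent here.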

Five clusters were found. They are presented below in several ways.

First, the dates and their weekdays:


$`1`
 2007-09-06  2007-09-07  2007-09-10  2007-09-14  2007-09-17  2007-09-18  2007-09-21  2007-09-25  2007-09-27  2007-10-01  2007-10-02  2007-10-03  2007-10-04  2007-10-09 
 "Thursday"    "Friday"    "Monday"    "Friday"    "Monday"   "Tuesday"    "Friday"   "Tuesday"  "Thursday"    "Monday"   "Tuesday" "Wednesday"  "Thursday"   "Tuesday" 
 2007-10-10  2007-10-22  2007-10-23  2007-10-29  2007-10-31  2007-11-02  2007-11-05  2007-11-06  2007-11-13  2007-11-21  2007-11-28  2007-12-03  2007-12-04  2007-12-05 
"Wednesday"    "Monday"   "Tuesday"    "Monday" "Wednesday"    "Friday"    "Monday"   "Tuesday"   "Tuesday" "Wednesday" "Wednesday"    "Monday"   "Tuesday" "Wednesday" 
 2007-12-06  2007-12-11  2007-12-12  2007-12-14  2007-12-17  2007-12-18  2007-12-19  2008-01-03  2008-01-04  2008-01-11  2008-01-15  2008-01-16  2008-01-17  2008-01-18 
 "Thursday"   "Tuesday" "Wednesday"    "Friday"    "Monday"   "Tuesday" "Wednesday"  "Thursday"    "Friday"    "Friday"   "Tuesday" "Wednesday"  "Thursday"    "Friday" 
 2008-01-22  2008-01-23  2008-01-24  2008-01-29  2008-01-30  2008-01-31  2008-02-05  2008-02-06  2008-02-07  2008-02-11  2008-02-12  2008-02-13  2008-02-25  2008-02-27 
  "Tuesday" "Wednesday"  "Thursday"   "Tuesday" "Wednesday"  "Thursday"   "Tuesday" "Wednesday"  "Thursday"    "Monday"   "Tuesday" "Wednesday"    "Monday" "Wednesday" 
 2008-02-28  2008-03-10  2008-03-12  2008-03-13  2008-03-14  2008-03-19  2008-03-24  2008-03-25  2008-04-01  2008-04-02  2008-04-03  2008-04-04  2008-04-11  2008-04-23 
 "Thursday"    "Monday" "Wednesday"  "Thursday"    "Friday" "Wednesday"    "Monday"   "Tuesday"   "Tuesday" "Wednesday"  "Thursday"    "Friday"    "Friday" "Wednesday" 
 2008-04-28  2008-04-30  2008-05-05  2008-05-07  2008-05-09  2008-05-12  2008-05-19  2008-05-22  2008-05-27  2008-05-28  2008-06-01  2008-06-02  2008-06-04  2008-06-05 
   "Monday" "Wednesday"    "Monday" "Wednesday"    "Friday"    "Monday"    "Monday"  "Thursday"   "Tuesday" "Wednesday"    "Sunday"    "Monday" "Wednesday"  "Thursday" 
 2008-06-07  2008-06-10  2008-06-13  2008-06-17  2008-06-18  2008-06-19  2008-06-23  2008-06-24  2008-06-27  2008-07-01  2008-07-02  2008-07-05  2008-08-11  2008-08-18 
 "Saturday"   "Tuesday"    "Friday"   "Tuesday" "Wednesday"  "Thursday"    "Monday"   "Tuesday"    "Friday"   "Tuesday" "Wednesday"  "Saturday"    "Monday"    "Monday" 
 2008-08-27 
"Wednesday" 

$`2`
 2007-09-03  2007-09-04  2007-09-08  2007-09-12  2007-09-13  2007-09-15  2007-09-20  2007-09-24  2007-09-29  2007-10-06  2007-10-07  2007-10-08  2007-10-12  2007-10-15 
   "Monday"   "Tuesday"  "Saturday" "Wednesday"  "Thursday"  "Saturday"  "Thursday"    "Monday"  "Saturday"  "Saturday"    "Sunday"    "Monday"    "Friday"    "Monday" 
 2007-10-20  2007-10-27  2007-10-28  2007-10-30  2007-11-03  2007-11-22  2007-11-23  2007-11-26  2007-12-01  2007-12-13  2007-12-24  2007-12-26  2007-12-28  2007-12-31 
 "Saturday"  "Saturday"    "Sunday"   "Tuesday"  "Saturday"  "Thursday"    "Friday"    "Monday"  "Saturday"  "Thursday"    "Monday" "Wednesday"    "Friday"    "Monday" 
 2008-01-05  2008-01-14  2008-01-21  2008-01-25  2008-02-02  2008-02-04  2008-02-09  2008-02-10  2008-02-15  2008-02-18  2008-02-19  2008-02-20  2008-02-21  2008-02-22 
 "Saturday"    "Monday"    "Monday"    "Friday"  "Saturday"    "Monday"  "Saturday"    "Sunday"    "Friday"    "Monday"   "Tuesday" "Wednesday"  "Thursday"    "Friday" 
 2008-03-04  2008-03-06  2008-03-15  2008-03-18  2008-03-23  2008-03-28  2008-03-29  2008-04-05  2008-04-10  2008-04-16  2008-04-17  2008-04-18  2008-04-21  2008-04-22 
  "Tuesday"  "Thursday"  "Saturday"   "Tuesday"    "Sunday"    "Friday"  "Saturday"  "Saturday"  "Thursday" "Wednesday"  "Thursday"    "Friday"    "Monday"   "Tuesday" 
 2008-04-25  2008-05-01  2008-05-02  2008-05-08  2008-05-21  2008-05-24  2008-05-29  2008-06-08  2008-06-12  2008-06-21  2008-06-25  2008-06-26  2008-07-04  2008-07-06 
   "Friday"  "Thursday"    "Friday"  "Thursday" "Wednesday"  "Saturday"  "Thursday"    "Sunday"  "Thursday"  "Saturday" "Wednesday"  "Thursday"    "Friday"    "Sunday" 
 2008-07-07  2008-07-13  2008-07-18  2008-07-21  2008-07-22  2008-07-23  2008-07-24  2008-07-29  2008-07-30  2008-08-01  2008-08-02  2008-08-05  2008-08-06  2008-08-08 
   "Monday"    "Sunday"    "Friday"    "Monday"   "Tuesday" "Wednesday"  "Thursday"   "Tuesday" "Wednesday"    "Friday"  "Saturday"   "Tuesday" "Wednesday"    "Friday" 
 2008-08-09  2008-08-10  2008-08-12  2008-08-13  2008-08-15  2008-08-16  2008-08-20  2008-08-28 
 "Saturday"    "Sunday"   "Tuesday" "Wednesday"    "Friday"  "Saturday" "Wednesday"  "Thursday" 

$`3`
 2007-09-05  2007-09-11  2007-09-19  2007-09-26  2007-09-28  2007-10-05  2007-10-11  2007-10-16  2007-10-17  2007-10-18  2007-10-19  2007-10-24  2007-10-25  2007-10-26 
"Wednesday"   "Tuesday" "Wednesday" "Wednesday"    "Friday"    "Friday"  "Thursday"   "Tuesday" "Wednesday"  "Thursday"    "Friday" "Wednesday"  "Thursday"    "Friday" 
 2007-11-01  2007-11-07  2007-11-08  2007-11-09  2007-11-14  2007-11-15  2007-11-16  2007-11-19  2007-11-20  2007-11-27  2007-11-29  2007-11-30  2007-12-07  2007-12-10 
 "Thursday" "Wednesday"  "Thursday"    "Friday" "Wednesday"  "Thursday"    "Friday"    "Monday"   "Tuesday"   "Tuesday"  "Thursday"    "Friday"    "Friday"    "Monday" 
 2007-12-20  2007-12-21  2007-12-27  2008-01-02  2008-01-07  2008-01-08  2008-01-09  2008-01-10  2008-01-28  2008-02-01  2008-02-08  2008-02-14  2008-02-26  2008-02-29 
 "Thursday"    "Friday"  "Thursday" "Wednesday"    "Monday"   "Tuesday" "Wednesday"  "Thursday"    "Monday"    "Friday"    "Friday"  "Thursday"   "Tuesday"    "Friday" 
 2008-03-03  2008-03-05  2008-03-07  2008-03-08  2008-03-11  2008-03-17  2008-03-26  2008-03-27  2008-03-31  2008-04-07  2008-04-08  2008-04-09  2008-04-14  2008-04-15 
   "Monday" "Wednesday"    "Friday"  "Saturday"   "Tuesday"    "Monday" "Wednesday"  "Thursday"    "Monday"    "Monday"   "Tuesday" "Wednesday"    "Monday"   "Tuesday" 
 2008-04-24  2008-04-29  2008-05-06  2008-05-13  2008-05-14  2008-05-15  2008-05-16  2008-05-20  2008-05-23  2008-05-30  2008-06-03  2008-06-06  2008-06-09  2008-06-11 
 "Thursday"   "Tuesday"   "Tuesday"   "Tuesday" "Wednesday"  "Thursday"    "Friday"   "Tuesday"    "Friday"    "Friday"   "Tuesday"    "Friday"    "Monday" "Wednesday" 
 2008-06-14  2008-06-16  2008-06-22  2008-07-14  2008-07-25  2008-08-19  2008-08-26 
 "Saturday"    "Monday"    "Sunday"    "Monday"    "Friday"   "Tuesday"   "Tuesday" 

$`4`
2007-09-01 2007-09-02 2007-09-09 2007-09-16 2007-09-22 2007-09-23 2007-09-30 2007-10-13 2007-10-14 2007-10-21 2007-11-04 2007-11-10 2007-11-11 2007-11-12 2007-11-17 2007-11-18 
"Saturday"   "Sunday"   "Sunday"   "Sunday" "Saturday"   "Sunday"   "Sunday" "Saturday"   "Sunday"   "Sunday"   "Sunday" "Saturday"   "Sunday"   "Monday" "Saturday"   "Sunday" 
2007-11-24 2007-11-25 2007-12-02 2007-12-08 2007-12-09 2007-12-15 2007-12-16 2007-12-22 2007-12-23 2007-12-25 2007-12-29 2007-12-30 2008-01-01 2008-01-06 2008-01-12 2008-01-13 
"Saturday"   "Sunday"   "Sunday" "Saturday"   "Sunday" "Saturday"   "Sunday" "Saturday"   "Sunday"  "Tuesday" "Saturday"   "Sunday"  "Tuesday"   "Sunday" "Saturday"   "Sunday" 
2008-01-19 2008-01-20 2008-01-26 2008-01-27 2008-02-03 2008-02-16 2008-02-17 2008-02-23 2008-02-24 2008-03-01 2008-03-02 2008-03-09 2008-03-16 2008-03-21 2008-03-22 2008-03-30 
"Saturday"   "Sunday" "Saturday"   "Sunday"   "Sunday" "Saturday"   "Sunday" "Saturday"   "Sunday" "Saturday"   "Sunday"   "Sunday"   "Sunday"   "Friday" "Saturday"   "Sunday" 
2008-04-06 2008-04-12 2008-04-13 2008-04-19 2008-04-20 2008-04-26 2008-04-27 2008-05-03 2008-05-04 2008-05-10 2008-05-11 2008-05-17 2008-05-18 2008-05-25 2008-05-31 2008-06-15 
  "Sunday" "Saturday"   "Sunday" "Saturday"   "Sunday" "Saturday"   "Sunday" "Saturday"   "Sunday" "Saturday"   "Sunday" "Saturday"   "Sunday"   "Sunday" "Saturday"   "Sunday" 
2008-06-29 2008-07-12 2008-07-19 2008-07-20 2008-07-26 2008-07-27 2008-08-03 2008-08-17 2008-08-24 2008-08-31 
  "Sunday" "Saturday" "Saturday"   "Sunday" "Saturday"   "Sunday"   "Sunday"   "Sunday"   "Sunday"   "Sunday" 

$`5`
 2008-03-20  2008-05-26  2008-06-20  2008-06-28  2008-06-30  2008-07-03  2008-07-08  2008-07-09  2008-07-10  2008-07-11  2008-07-15  2008-07-16  2008-07-17  2008-07-28 
 "Thursday"    "Monday"    "Friday"  "Saturday"    "Monday"  "Thursday"   "Tuesday" "Wednesday"  "Thursday"    "Friday"   "Tuesday" "Wednesday"  "Thursday"    "Monday" 
 2008-07-31  2008-08-04  2008-08-07  2008-08-14  2008-08-21  2008-08-22  2008-08-23  2008-08-25  2008-08-29  2008-08-30 
 "Thursday"    "Monday"  "Thursday"  "Thursday"  "Thursday"    "Friday"  "Saturday"    "Monday"    "Friday"  "Saturday" 

Note that most of the weekend days are in cluster 4, along with several holidays: Veterans Day (observed) on Monday, 12 November 2007; Christmas on Tuesday, 25 December 2007; New Year’s Day on Tuesday, 1 January 2008; and Good Friday, 21 March 2008. Assigning meanings to the other clusters depends upon having events to mark them with. It’s known, for example, that the last day of school in 2008 was 20th June 2008, a date which falls in cluster 5. Unfortunately, the academic calendars for 2007-2008 have apparently been discarded. (I was able to find a copy of the 2008 Westwood High School yearbook, but it is not informative about dates, consisting primarily of photographs.) Accordingly, it’s necessary to look for internal consistency.
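A quick way to check a cluster for internal consistency is to tally its weekdays and list the dates that break the pattern. The Python sketch below does this for the first sixteen dates of cluster 4 as printed above; the helper names are hypothetical, and weekday names are taken from a fixed list rather than the system locale.

```python
from collections import Counter
from datetime import date
from typing import List, Sequence

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def weekday_name(iso: str) -> str:
    """Weekday name for an ISO date string such as '2007-09-01'."""
    return WEEKDAYS[date.fromisoformat(iso).weekday()]


def weekday_counts(iso_dates: Sequence[str]) -> Counter:
    """Count how many dates fall on each weekday."""
    return Counter(weekday_name(d) for d in iso_dates)


def non_weekend(iso_dates: Sequence[str]) -> List[str]:
    """Dates that are not a Saturday or Sunday, candidates for holidays."""
    return [d for d in iso_dates
            if weekday_name(d) not in ("Saturday", "Sunday")]


# First sixteen dates of cluster 4, copied from the listing above.
cluster4_sample = [
    "2007-09-01", "2007-09-02", "2007-09-09", "2007-09-16",
    "2007-09-22", "2007-09-23", "2007-09-30", "2007-10-13",
    "2007-10-14", "2007-10-21", "2007-11-04", "2007-11-10",
    "2007-11-11", "2007-11-12", "2007-11-17", "2007-11-18",
]
counts = weekday_counts(cluster4_sample)
odd_days = non_weekend(cluster4_sample)
print(counts["Saturday"], counts["Sunday"], odd_days)
```

In this sample the only weekday that turns up is the observed Veterans Day, which is the kind of exception worth checking against a calendar of events when one is available.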

There is a visual way of representing these findings. The figure below, a reproduction of the one at the head of the blog post, traces energy consumption for the high school during each day. The abscissa shows hours of the day, broken up into 96 15-minute intervals. For each of 366 days, the energy consumption recorded is plotted, and the lines connected. Each line is plotted in a different color depending upon the day of the week. The colors are faded by adjusting their alpha value so they can be seen through.

Note how days with flat energy consumption tend to be in a single color. These are apparently weekend days.

Atop each of the lines describing energy consumption, a black numeral has been printed which gives the number of the cluster to which the day was assigned. These numerals are printed at the highest point of their associated curves, but are jittered so they don’t stack atop one another and become hard to distinguish.


The clusters correspond to distinct consumption characters. A proactive energy management approach would entail examining the activities done on the days in each of the clusters. Of special interest would be clusters, such as clusters 1 and 3, which have very high energy usage.

Code and data

The code and data reviewed here are available in my Google replacement for a Git repository.

Future work

I am next planning to apply this clustering technique to long neglected time series of streamflow in Sharon, MA and on the South Shore.

About ecoquant

See https://wordpress.com/view/667-per-cm.net/ Retired data scientist and statistician. Now working projects in quantitative ecology and, specifically, phenology of Bryophyta and technical methods for their study.

