Smart meters - Quality Guidelines for Big Data
Data Class Smart meter
- Short description of this class of big data, the source(s) and the structure of the raw data
The class “Smart meter” consists of data collections related to metering information in general, mostly related to the consumption or production of energy or water. The metering is done at short time intervals, and the measurements can be read remotely and automatically. In general, the data can include the inflow and outflow of a specific unit (metering point) in a given interval. The most common type of smart meter is the smart electricity meter, which measures the consumption and, in most cases, also the production of electrical power.
The structure of the raw data is very simple:
- Metering ID
In most cases, additional background variables are available per metering ID, which are used to transform metering IDs into useful statistical units. These background variables may consist of:
- a unique identifier to link the metering point to administrative units, e.g. business ID,
- geographical information, which may be as precise as an address or a coordinate or just a geographical unit, e.g. district, or
- information on the kind of metering point, e.g. household, business, producer.
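As an illustration of the structure described above, the following minimal sketch models one raw reading and the background variables of a metering point. All field names, units and example values are assumptions made for this example and are not taken from any actual data delivery.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class MeterReading:
    """One raw measurement for one metering point and one interval (hypothetical layout)."""
    metering_id: str
    interval_start: datetime   # start of the metering interval (assumed field)
    inflow_kwh: float          # quantity flowing into the unit during the interval
    outflow_kwh: float = 0.0   # quantity flowing out of the unit, if metered


@dataclass(frozen=True)
class MeteringPointInfo:
    """Background variables for one metering ID; any of them may be missing."""
    metering_id: str
    business_id: Optional[str] = None  # link to an administrative unit
    address: Optional[str] = None      # precise geographical information
    district: Optional[str] = None     # coarser geographical unit
    kind: Optional[str] = None         # e.g. "household", "business", "producer"


# Made-up example values.
reading = MeterReading("MP-001", datetime(2020, 1, 1, 0, 0), inflow_kwh=0.42)
info = MeteringPointInfo("MP-001", district="D-12", kind="household")
```

Keeping the readings and the background variables in separate structures, joined by the metering ID, mirrors the description above: the readings arrive at short intervals, while the background variables change rarely.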
- Short description of the role of the big data class in the ESSnet(s), including links to deliverables (if already existing)
This Big Data class was the main input for WP 3 of ESSnet Big Data I and is now handled in WPD Smart energy. The output of the previous ESSnet delivered insights into data access, data handling, the production of statistics and methodology, future perspectives and recommendations.
An obvious statistical output is the consumption of electricity. Identifying specific patterns of consumption might be of additional interest, e.g. to find inherent socio-demographic factors that explain "energy-saving" and "energy-wasting" households. Businesses' energy consumption could be related to business cycle effects and could therefore be used as an auxiliary variable in estimating economic activity. The identification of construction sites of new buildings, the classification of homes as vacant or non-vacant, and price/spending statistics are additional possible outputs.
- Basic description of the processes necessary to transform the raw data into statistical data
The process of transforming raw metering data into meaningful statistical data can be split into the following sub processes:
- Linking metering points to statistical units, e.g. to register data on businesses or households. The linkage might be done based on a unique identifier or on the address or coordinate of a metering point, if available.
- Non-linkable metering points need to be classified according to the necessary classification that is not present in the background information, e.g. household / business.
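The two sub processes above can be sketched as follows: a metering point is first linked via its unique identifier and, failing that, via its address. Whatever remains unlinked is passed on to classification. The function names, the dictionary layout and the simple address normalisation are assumptions for this illustration. Real linkage would typically need more careful address matching.

```python
def normalise_address(address):
    """Lower-case an address and collapse whitespace; returns None if missing."""
    if not address:
        return None
    return " ".join(address.lower().split())


def link_metering_points(points, units_by_id, units_by_address):
    """Link metering points to statistical units.

    points: list of dicts with "metering_id" and optional "business_id", "address".
    Returns (linked, unlinked): linked maps metering_id -> unit,
    unlinked lists the metering IDs that still need to be classified.
    """
    linked, unlinked = {}, []
    for point in points:
        # First attempt: unique identifier.
        unit = units_by_id.get(point.get("business_id"))
        # Second attempt: normalised address.
        if unit is None:
            unit = units_by_address.get(normalise_address(point.get("address")))
        if unit is None:
            unlinked.append(point["metering_id"])
        else:
            linked[point["metering_id"]] = unit
    return linked, unlinked


# Made-up example data.
points = [
    {"metering_id": "MP-001", "business_id": "B-17", "address": "1 Main St"},
    {"metering_id": "MP-002", "business_id": None, "address": " 5  Mill Road "},
    {"metering_id": "MP-003", "business_id": None, "address": None},
]
units_by_id = {"B-17": "business:B-17"}
units_by_address = {"5 mill road": "household:H-204"}

linked, unlinked = link_metering_points(points, units_by_id, units_by_address)
```

In this example MP-001 is linked by identifier, MP-002 by address, and MP-003 remains unlinked and would have to be classified by a model.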
- Quality guidelines relevant for this big data class
The accuracy of the measurement itself is probably of limited concern for the overall quality of the statistical output, because the measurements are used to invoice customers and can be expected to be reliable. Nevertheless, checks should be in place to detect extreme and implausible values.
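A check for extreme and implausible values could look like the following minimal sketch. It flags negative readings and readings far above the median of the same metering point. The rule and the threshold `max_ratio` are assumptions chosen for illustration only. Suitable rules depend on the data at hand.

```python
from statistics import median


def flag_implausible(readings_kwh, max_ratio=20.0):
    """Return the indices of readings that are negative or far above the median.

    max_ratio is an illustrative threshold, not a recommended value.
    """
    positive = [r for r in readings_kwh if r > 0]
    typical = median(positive) if positive else 0.0
    flagged = []
    for i, value in enumerate(readings_kwh):
        too_high = typical > 0 and value > max_ratio * typical
        if value < 0 or too_high:
            flagged.append(i)
    return flagged


# Made-up interval readings for one metering point.
readings = [0.31, 0.28, 0.35, -0.10, 0.30, 95.0, 0.33]
flagged = flag_implausible(readings)
```

The median is used instead of the mean so that a single extreme value does not shift the reference level against which it is compared.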
However, estimated classifications, such as the division into household or business, might suffer from poor accuracy. All models used should therefore be checked thoroughly, and confusion matrices should be estimated.
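A confusion matrix can be estimated wherever the true class is known, for example for metering points that could be linked to a register. The sketch below counts pairs of true and predicted labels. The labels and the small data set are made up for illustration.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels):
    """Count how often each (true, predicted) label pair occurs."""
    return Counter(zip(true_labels, predicted_labels))


def accuracy(matrix):
    """Share of cases in which the predicted label equals the true label."""
    total = sum(matrix.values())
    correct = sum(n for (true, pred), n in matrix.items() if true == pred)
    return correct / total if total else float("nan")


# Made-up labels for six metering points with known class.
true = ["household", "household", "business", "business", "household", "business"]
pred = ["household", "business", "business", "business", "household", "household"]

matrix = confusion_matrix(true, pred)
```

Here `matrix[("business", "household")]` gives the number of businesses wrongly classified as households. Such off-diagonal counts indicate in which direction the statistical output is likely to be biased.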
Currently, the coverage of smart electricity meters is increasing Europe-wide, but the roll-out is still ongoing. Undercoverage is therefore present and has to be handled accordingly.
There are no known examples of overcoverage in the raw data, but there can be artificial overcoverage due to linking or classification errors.
4.4. Comparability over time
Once set up completely, this should be a fairly stable data source.
4.5. Process Errors / data source specific errors
A very specific source of error for the consumption and production of electricity, when estimated from smart electricity meters, is the self-consumption of produced energy, since this is not recorded by most smart meters.
Because models are used to estimate some of the important characteristics, it is important to have good training data, which is not easy to obtain for some indicators, e.g. vacant/non-vacant dwelling.
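The effect of unrecorded self-consumption can be illustrated with a small worked example. It rests on a simplifying assumption: within one interval, produced energy is first used on site, and the meter only sees the remaining exchange with the grid. The function and variable names are hypothetical.

```python
def metered_flows(consumed_kwh, produced_kwh):
    """Split true consumption and production into what the meter sees.

    Simplifying assumption: production is used on site first, and only
    the remainder is exchanged with the grid.
    Returns (grid_import, grid_export, self_consumed).
    """
    self_consumed = min(consumed_kwh, produced_kwh)
    grid_import = consumed_kwh - self_consumed   # recorded as consumption
    grid_export = produced_kwh - self_consumed   # recorded as production
    return grid_import, grid_export, self_consumed


# Made-up interval: a unit consumes 3 kWh and produces 2 kWh.
grid_import, grid_export, self_consumed = metered_flows(3.0, 2.0)
```

In this example the meter records 1 kWh of consumption and no production, although the unit actually consumed 3 kWh and produced 2 kWh. Both totals are underestimated by the self-consumed 2 kWh.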