For service support related to Validation Reports, please contact: ESTAT-SUPPORT-SDMX@ec.europa.eu




Document release date

14.03.2021.

Purpose

This document describes:

  1. The standardization of the Validation Reports in Eurostat,
  2. The structure and content of the Validation Reports, and
  3. The error types reported by the structural validation service.

Service version

Descriptions are based on STRUVAL version 8.5.4


Standardization of the Validation Reports


In collaboration with National Statistical Institutes, Eurostat has introduced an SDMX-compliant XML schema for validation reports. This Machine-Readable Report (MRR) contains the results of the validation event (e.g. errors in data, error messages) and can be processed by Eurostat’s Validation Report Formatter Service (VRF) to produce an additional, user-friendly Human Readable Report (HRR). The HRR format enhances the readability and clarity of the information provided, and presents information on validation errors in a transparent and ordered manner.

Eurostat is adapting its validation services (STRUVAL and CONVAL) to produce the SDMX-compliant MRR and thus to generate the associated HRR. STRUVAL fully supports the new Validation Report, while its implementation for CONVAL is ongoing.


Retrieval of the Validation Report


The data provider may access the Validation Report through the EDAMIS feedback channel. The Validation Report is never sent directly (e.g. via email) to data providers due to possible confidentiality constraints. However, the EDAMIS service may be configured to send an email informing users that the report (or any other message received) is available.


Data providers receive the Validation Report in a human-readable format aimed at statisticians and other generalist groups. The HTML format described in the rest of this guide is the most widely used format, though the human-readable report can also be generated in TXT and PDF format. A machine-readable XML format is also generated, but is currently only available for Eurostat internal use.


For report examples, please see the Complete sample reports section below.


Please note that currently only STRUVAL reports are disseminated in the new format. CONVAL reports follow the current CSV format.



                                                                                                                    

Structure of the Validation Report (HRR)

The Validation Report consists of a Header section and a Validation Results section. The Header contains validation process metadata and a general overview of the results of the validation. The Validation Results section contains the details of all errors detected.


Header section


The Header section provides an overview of the validation event: the aggregated results, and the actors and assets involved. The Header consists of an information box on process metadata, a counter of validation failures per severity, and messaging related to the particular validation process instance.




Process Metadata


The information box lists the key metadata related to the validation flow instance.

Please note that entries that are not relevant for the individual validation event will not appear in the report, e.g. the use of Constraints is optional and therefore may not appear.


Component

Description

Data Provider

Identifier of the data provider country or organization. May be the country code or the name of the organization.

Data Submitted On

Date and time when the dataset was submitted in EDAMIS.

Process Type

Indicates if the validation instance is pre-validation or official transmission. 

If a pre-validation process is invoked, the process type is 'PRE-VALIDATION'

If official transmission is invoked, the process type is 'OFFICIAL TRANSMISSION'

Please also see Messaging below.

Processed On

Date and time when the dataset was processed by the Eurostat data validation services. In case the data is processed by more than one service, the time stamp denotes the closure of the last validation event.

Validated Dataset

Name of dataset validated; EDAMIS Dataset ID.

DSD

DSD Artefact ID as listed in the Euro SDMX Registry.

Dataflow

Dataflow Artefact ID as listed in the Euro SDMX Registry.

Constraint

Constraint Artefact ID as listed in the Euro SDMX Registry.

Validation Report Generated On

Date and time when the HRR version of the Validation Report is created.

Actors

Name and version of validation service(s) called. If multiple validation services are called, all are listed. In the Eurostat context, the services may include:

STRUVAL - mandatory
CONVAL - optional, if configured for the dataflow


Error occurrence counter


The Header displays a counter with the total number of issues encountered, grouped by Severity. A further breakdown of the detected failures by root cause is presented in the Validation Results section.


Severity

Handling of reported failure

Error

Blocking. The validation process is terminated and the identified issue must be corrected in the dataset before re-submission.

Warning

Non-blocking. The validation process detected an issue where evaluation and possible correction is required before the acceptance of the data.

Info

Non-blocking. Information on the data is provided.
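
As an illustration of the counter and the severity handling described above, the following is a minimal Python sketch. It is not part of the validation services: the record layout (a dictionary with a `severity` key), the function names, and the choice to count one issue per record are assumptions made for this example only.

```python
from collections import Counter

# Handling per severity, taken from the table above:
# True means blocking, False means non-blocking.
BLOCKING = {"ERROR": True, "WARNING": False, "INFO": False}


def count_by_severity(issues):
    """Count reported issues per severity, similar to the Header counter.

    `issues` is a hypothetical list of dicts with a "severity" key;
    each record is counted as one issue (an assumption of this sketch).
    """
    counts = Counter({severity: 0 for severity in BLOCKING})
    for issue in issues:
        counts[issue["severity"]] += 1
    return counts


def has_blocking_issue(counts):
    """Return True if at least one issue of a blocking severity was counted."""
    return any(counts[severity] > 0
               for severity, blocking in BLOCKING.items() if blocking)


if __name__ == "__main__":
    sample = [{"severity": "ERROR"}, {"severity": "WARNING"}, {"severity": "ERROR"}]
    counts = count_by_severity(sample)
    print(dict(counts), has_blocking_issue(counts))
```

In this sketch only the Error severity is treated as blocking, which mirrors the table: a dataset with Warning or Info issues alone does not terminate the validation process.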


Messaging


The Validation Report includes general purpose (not error specific) messages to inform data providers of specific circumstances of the validation flow instance.


Message

Occurrence

Official Transmission


The label appears for validation flow instances that are intended as official data transmissions, with no pre-validation option selected in EDAMIS.

Pre-Validation. Data is not officially transmitted to Eurostat.

The label appears for validation flow instances that are intended as pre-validation of the data, with the pre-validation option selected at data submission in EDAMIS.

Validation ended with success.

The validation process concluded with no Error or Warning severity issues detected.

Validation ended with errors found.

The validation process concluded with at least one Error or Warning severity issue detected.

The report is based on confidential data. Some values might have been removed.

Datasets may contain data defined as confidential. In such cases, all elements of the report that are or may be confidential are removed. These elements may include values for the concepts CONCEPT_NAME and CONCEPT_VALUE, and error messages that may include one or both of the previous concepts.

Confidential information is removed from the Validation Results section of the report. The Header section never contains confidential components.

Error limit reached.

Validation services are set to terminate after reaching a pre-set number of validation errors. In case the error limit is reached, the report indicates this fact. Please note that the data may contain further, unreported errors that have not yet been identified and may be detected on re-submission of the data.

The current error cap is set at 10000 error occurrences.
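
The rules in the Messaging table can be summarised in a short Python sketch. This is an illustration under stated assumptions rather than the logic of the actual services: the function name, its parameters, and the decision to compare the cap against the total number of reported occurrences are hypothetical. The message texts and the cap of 10000 are taken from this guide.

```python
ERROR_LIMIT = 10000  # current error cap stated in this guide


def header_messages(pre_validation, errors, warnings, occurrences, limit=ERROR_LIMIT):
    """Return the general-purpose messages for one validation flow instance.

    pre_validation: True if the pre-validation option was selected in EDAMIS.
    errors, warnings: number of Error and Warning severity issues detected.
    occurrences: total reported error occurrences (assumed basis for the cap).
    """
    messages = []
    if pre_validation:
        messages.append("Pre-Validation. Data is not officially transmitted to Eurostat.")
    else:
        messages.append("Official Transmission")

    # Info severity issues alone do not prevent a successful outcome.
    if errors == 0 and warnings == 0:
        messages.append("Validation ended with success.")
    else:
        messages.append("Validation ended with errors found.")

    if occurrences >= limit:
        messages.append("Error limit reached.")
    return messages


if __name__ == "__main__":
    for message in header_messages(True, 0, 2, 2):
        print(message)
```

The confidentiality message is left out of the sketch because it depends on the statistical domain rather than on the validation outcome.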


Validation Results section


Note: In case the validation process detects no errors of any severity, the Validation Results section is empty.

The Validation Results section holds all occurrences of errors detected by the validation services in the dataset. The section presents error locations, accompanying error-specific metadata and error messages, including the following attributes:


Component

Description

Error Code

Error category. Serves diagnostic purposes only.

Message ID

Identifier of the error type. Serves diagnostic purposes only.

For a complete list, please see the Error Types section of this guide below.

Concept Name

Name of concept affected by error. If confidential, it is removed from the report.

Concept Type

Type of concept affected by error. Value may be: Dimension / Attribute / Measure

Concept Value

Value of concept affected by error. If confidential, it is removed from the report.

Number of Occurrences

Value refers to the total number of occurrences for a unique error.

Severity

Severity of the error detected. Value may be: ERROR / WARNING / INFO

Description 1

Error message, describing the error. Possibly confidential.

Description 2

Instructional message, advising on how to resolve the error. Optional, may not be included for specific errors.

First Occurrence

The first occurrence of the error in a tabular representation. Concept Names comprising the series key are not truncated for readability. Includes error Position (line / row).

Next Occurrences

All further occurrences of the specific error (if any), with the Concept Names truncated and represented in-line.



When opening the Validation Results section, the grouped view of the errors appears on the left side. Each group includes all error occurrences generated by the same, unique root cause.

Note: A single root cause may trigger multiple error types, and these will be listed separately (e.g. a code is unexpected and also violates a length constraint).




Errors are identified as unique and therefore grouped together according to the following logic:



If a difference in value on any level is detected between two error occurrences, the root cause of the errors is considered distinct and therefore they are listed separately. If the values are identical on all four levels, the occurrences are grouped together.
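
The grouping rule above can be sketched in Python as follows. The sketch is generic on purpose: the four levels are not named in this text, so the level names `level_1` to `level_4` below are placeholders and not the actual levels used by the report. The function name and the dictionary-based occurrence records are likewise assumptions of this example.

```python
from collections import defaultdict

# Placeholder names only: the actual four levels are not reproduced here.
LEVELS = ("level_1", "level_2", "level_3", "level_4")


def group_occurrences(occurrences, levels=LEVELS):
    """Group error occurrences that have identical values on every level.

    Two occurrences that differ on any level end up in separate groups,
    because their root causes are considered distinct.
    """
    groups = defaultdict(list)
    for occurrence in occurrences:
        key = tuple(occurrence[level] for level in levels)
        groups[key].append(occurrence)
    return dict(groups)


if __name__ == "__main__":
    first = {"level_1": "a", "level_2": "b", "level_3": "c", "level_4": "d", "position": 3}
    same_cause = dict(first, position=9)           # identical on all four levels
    other_cause = dict(first, level_4="x", position=12)  # differs on one level
    for key, members in group_occurrences([first, same_cause, other_cause]).items():
        print(key, len(members))
```

Fields outside the grouping key, such as the position in the example, do not affect the grouping, so several occurrences at different locations can belong to the same group.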


When clicking the header of any error group, the panel containing the complete details on the error group opens on the right.




The report contains two descriptive messages to support resolution of the issue detected. Description 1 is a traditional error message, defining the nature of the error and the violating concepts (if any). Description 2 provides additional information or instructions. Description 2 is optional, and may be user defined.

Note: Errors generated by technical issues detected in the file (e.g. structurally incorrect dataset) will also appear as an error group, with 1 occurrence and no location defined. 




The locations of all occurrences of an error are listed in the Validation Report. In addition to the series keys (the cross-products of Concept Name-Concept Value pairs), the new report also defines the Position concept, which locates the error in the input file.



Filtering reports due to confidentiality constraints

Eurostat policy prohibits the inclusion of confidential data in Validation Reports distributed to external parties, including national statistical institutes. Statistical domains collecting and processing confidential data will receive a report where elements with the possibility of the presence of confidential data are filtered out.

The filtering extends to the following:

  1. Concept Type - Measure and Attribute, as these may contain confidential information. Dimensions are never filtered out.
  2. Concept Value
  3. Description 1 - The description may contain dynamic elements where confidential data (e.g. an OBS_VALUE) is inserted. The description is completely removed from the report.
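
One possible reading of the filtering rules above is sketched below in Python. It is an interpretation, not the implementation used by Eurostat: it assumes that filtering applies to errors whose Concept Type is Measure or Attribute, and that for those errors the Concept Value and Description 1 are removed. The field names and the use of `None` as the marker for a removed element are hypothetical.

```python
REMOVED = None  # hypothetical marker for an element removed from the report

# Concept types that may carry confidential information (from the list above).
FILTERED_CONCEPT_TYPES = {"Measure", "Attribute"}


def filter_confidential(error):
    """Return a copy of an error record with possibly confidential elements removed.

    Assumption of this sketch: records for Dimensions are returned unchanged,
    while records for Measures and Attributes lose Concept Value and Description 1.
    """
    filtered = dict(error)  # copy, so the original record is left untouched
    if filtered.get("concept_type") in FILTERED_CONCEPT_TYPES:
        filtered["concept_value"] = REMOVED
        filtered["description_1"] = REMOVED
    return filtered


if __name__ == "__main__":
    record = {
        "concept_type": "Measure",
        "concept_name": "OBS_VALUE",
        "concept_value": "12.5",
        "description_1": "example message",
        "description_2": "example instruction",
    }
    print(filter_confidential(record))
```

Description 2 is kept in the sketch because the list above names only Description 1 as filtered.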


Complete sample reports


Please download samples for optimal view.


HRR_no errors.html

MRR_no errors.xml


HRR_errors.html

MRR_errors.xml


HRR_truncated_confidential_values.html


HRR_technical error.html

MRR_technical error.xml


Service support


For service support related to Validation Reports, please contact:

 ESTAT-SUPPORT-SDMX@ec.europa.eu

See also


STRUVAL error codes and messages