WPG Overview

From ESSnet Big Data

Background

Financial transaction data (FTD) has considerable potential for enriching existing statistics, as auxiliary information for weighting and calibration, for quality assurance, and as a source in its own right. In societies where new payment solutions emerge as an alternative to cash and where the internet makes trading easier, FTD is a natural data source for statistics monitoring economic activity.

There are several ways to subdivide and categorise FTD. One of them is to cross-classify transactions into six groups by transaction type and by type of payer/receiver pair. The two transaction types are card transactions (debit or credit) and “giro” transactions (bank-to-bank transfers). The three payer/receiver pair types are business-to-business (B2B), consumer-to-business (C2B) and consumer-to-consumer (C2C), where C2B covers both directions (B2C and C2B). A consumer in our context can be a person or a household.
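The cross-classification can be sketched in a few lines of Python. This is only an illustration: the labels "card" and "giro", and the function names, are assumptions made for the example and do not come from any data provider's schema.

```python
from itertools import product

# Illustrative labels for the two axes described in the text.
TRANSACTION_TYPES = ("card", "giro")
PAIR_TYPES = ("B2B", "C2B", "C2C")


def ftd_groups():
    """Return the six FTD groups as (transaction type, pair type) tuples."""
    return list(product(TRANSACTION_TYPES, PAIR_TYPES))


def pair_type(payer_is_business, receiver_is_business):
    """Classify a payer/receiver pair; C2B covers both C2B and B2C."""
    if payer_is_business and receiver_is_business:
        return "B2B"
    if payer_is_business or receiver_is_business:
        return "C2B"
    return "C2C"


if __name__ == "__main__":
    for group in ftd_groups():
        print(group)
```

Treating C2B as direction-free keeps the number of groups at six; a study that needs to separate payments to businesses from payments by businesses would have to split that category.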

One area where FTD would be a vital data source for statistics is the measurement of the shared (or sharing) economy. This is an all-encompassing term that refers to a host of online economic transactions, often set up in a very nascent regulatory environment. The transactions can be of all the above types, and they take place through shared economy platforms (SEP), which allow individuals, groups or businesses to make money from underused assets that are offered as services. The platforms play a mediating role in connecting the owner of the asset with the consumer trying to access the good, and typically also guarantee the security of the C2C and C2B transactions. One example among several is Airbnb, for sharing houses and apartments.

Objectives

The main aim of the WP is to get an overview of the sources and the data infrastructure (metadata) of financial transaction data in the countries participating in this WP. The objective is to describe to what extent FTD are available and whether it is possible for NSIs to access them. Given the infrastructure, a further main aim of the WP is to assess the statistical potential of these data sources. This may be for improving the quality of, or for evaluating the quality of, some currently produced statistics, or it may be for a completely new portfolio of statistical products. To do this, empirical studies will be carried out. Provided that the relevant financial transaction data are acquired in time for this WP, empirical investigations should underpin the conclusions about the potential of such data sources and about implementing statistics based on them.

The WP will not cover all potential areas of statistics, but besides the shared economy concept there are connections to statistics on household budget, prices, finance, economic turnover, trade and so forth.

Description of work

Task 0 – Review of literature on SEP

Performed by INS (Statistics Bulgaria)

SEPs are a relatively new phenomenon, radically changing the traditional economy by enabling sharing globally instead of only locally. Despite being new, SEPs have already been studied in the literature, e.g. in “Platforms and the Sharing Economy: An Analysis”, a report from the EU H2020 research project Ps2Share (“Participation, Privacy, and Power in the Sharing Economy”) by Kateryna Stanoevska-Slabeva, Vera Lenz-Kesekamp and Viktor Suter, all from the University of St. Gallen. Task 0 (T0) is to study this and other reports and to write an overview of SEPs based on the literature, focusing on the aspects relevant for WPG, which are:

  1. What a SEP is and what it is not, i.e. finding out what the characteristics of a SEP are and which platforms fall outside that definition.
  2. What are the main SEPs, for what purposes are they used, and in which countries are they used?
  3. What are their market shares in the activities they provide?

Task 0 should concentrate on what is relevant for identifying SEPs as potential data sources for official statistics. A large part of the SEP literature is about the role of SEPs in the economy and in society, but this may not be relevant to the current ESSnet.

Task 1 - Existence and accessibility

Performed by all: SSB (Statistics Norway), INS (Statistics Bulgaria), Destatis (Statistics Germany), INE (Statistics Portugal) and SURS (Statistics Slovenia)

Task 1 (T1) is to investigate the accessibility of the six types of financial transaction data, including identifying the stakeholders, in the countries participating in the WP. The participating countries are spread throughout the ESS and should give an overall picture of the situation in the ESS. Circumstances that should be investigated are:

  1. What transaction data exists?
  2. Who owns the data (relevant where the data are fragmented across many owners)?
  3. What contents does the data cover?
  4. What are the legislative regulations relevant for getting access? Is it possible to get access?
  5. If access is possible, are there legislative limitations to the amount of data that can be accessed? Limitations could be: a sample instead of the complete registers; only some variables; no identification variables; data processing that must be performed at the data owner, leading to access to aggregated data only; access for predefined purposes only; access for a predefined period only.
  6. If access is possible, are there practical/technical limitations to the amount of data that can be accessed? Limitations could e.g. be: lack of server capacity at the NSI; no infrastructure for securely transferring large amounts of data from the data owner to the NSI; no relevant software for processing such large amounts of data; unreasonable costs for the data owner.
  7. If access to the complete data is impractical (see item 6), do the practical/technical limitations still apply when access is restricted to transactions made on a certain date?
  8. What processes are needed to access the data? What costs are involved and can they be covered? How can secure access by, or transfers to, an NSI be established? What are the possibilities of obtaining relevant data in time for the empirical analyses planned in T4 and T8?

A target in T1 is to cover all six groups of FTD; however, due to access limitations and limited contributions, some groups may be omitted for one or all countries.

Task 2 - Metadata

Performed by all: SSB (Statistics Norway), INS (Statistics Bulgaria), Destatis (Statistics Germany), INE (Statistics Portugal) and SURS (Statistics Slovenia)

Task 2 (T2) is to study the data infrastructure (metadata) of the T1-data, by using an exploratory approach and to create the foundation for T3. Relevant issues are e.g.:

  • What is the range of the population?
  • What is the base unit of the data sets? Which relevant composite (aggregated) units can be formed based on the data?
  • Do the units have an identifier that can be used at the NSI to link the data with internal NSI data, if such linkage is legal (cf. Task 1)?
  • Which variables are included? Which of them are numerical and which are only categorical? Does the data contain relevant GPS coordinates or other localisation variables?
  • Are there known quality issues with any of the variables? What are the problems?
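A first exploratory pass over such questions could look like the sketch below. It is a minimal illustration under stated assumptions: the record layout and the variable names ("payer_id", "amount", "type") are hypothetical, and real FTD would need a far more careful treatment of types and missing values.

```python
def profile_variables(records):
    """Summarise each variable: inferred kind, missing count, distinct values.

    `records` is a list of dicts, one per base unit (e.g. one transaction).
    """
    # Collect variable names in order of first appearance.
    names = []
    for record in records:
        for name in record:
            if name not in names:
                names.append(name)

    profile = {}
    for name in names:
        # A key that is absent from a record is treated as missing (None).
        column = [record.get(name) for record in records]
        present = [value for value in column if value is not None]
        numeric = bool(present) and all(
            isinstance(value, (int, float)) and not isinstance(value, bool)
            for value in present
        )
        profile[name] = {
            "kind": "numerical" if numeric else "categorical",
            "missing": len(column) - len(present),
            "distinct": len(set(present)),
        }
    return profile


if __name__ == "__main__":
    # Hypothetical sample records, invented for illustration only.
    sample = [
        {"payer_id": "A1", "amount": 120.5, "type": "giro"},
        {"payer_id": "B2", "amount": None, "type": "card"},
        {"payer_id": "A1", "amount": 40.0, "type": "card"},
    ]
    for variable, summary in profile_variables(sample).items():
        print(variable, summary)
```

A summary of this kind only flags candidates for closer inspection; whether a high share of missing values is a genuine quality problem still has to be judged with the data owner.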

Task 3 - Official statistics candidates

Performed by all: SSB (Statistics Norway), INS (Statistics Bulgaria), Destatis (Statistics Germany), INE (Statistics Portugal) and SURS (Statistics Slovenia)

Based on the results of T2, Task 3 (T3) is about identifying statistics that seem to be able to benefit from FTD, either as quality improvement (e.g. modifying the production system using financial transaction data as auxiliary variables) or for quality evaluation. Some issues are:

  • Considering the natural population for a certain official statistics topic, is this natural population covered by the FTD population? Are there coverage issues in the FTD population?
  • How relevant are the FTD-variables for the official statistics topic, given the ability to link the FTD-data to other data and given quality issues related to the variables?
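Where the FTD units can be linked to a register of the natural population, the coverage question can be checked with simple set operations. The sketch below assumes that both sources share a common unit identifier; the function name and the identifiers in the usage example are hypothetical.

```python
def coverage_summary(target_ids, ftd_ids):
    """Compare the natural (target) population with the FTD population.

    Both arguments are iterables of unit identifiers that are assumed
    to be comparable across the two sources.
    """
    target = set(target_ids)
    ftd = set(ftd_ids)
    covered = target & ftd
    return {
        # Share of the target population that is found in the FTD.
        "coverage_rate": len(covered) / len(target) if target else 0.0,
        # Target units missing from the FTD (undercoverage).
        "undercoverage": sorted(target - ftd),
        # FTD units outside the target population (overcoverage).
        "overcoverage": sorted(ftd - target),
    }


if __name__ == "__main__":
    # Hypothetical identifiers, invented for illustration only.
    register = ["a", "b", "c", "d"]
    transactions = ["b", "c", "e"]
    print(coverage_summary(register, transactions))
```

The coverage rate here counts units only. For a statistic on turnover or spending, weighting units by a size variable would probably be more informative.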

Some statistics that seem very promising candidates for using financial transaction data before T3 may turn out during T3 to be dead ends, due to a lack of relevant variables, a lack of linking possibilities or other quality issues. Some potential statistics in T3 are, for example:

  • economic activity of enterprises (transactions between businesses, B2B, possibly split by industry classification code),
  • price information,
  • tourism (transactions on stays at hotels, stays at Airbnb apartments etc.),
  • household budget statistics,
  • macroeconomic statistics.

Task 4 - Empirical analyses

Performed by all: SSB (Statistics Norway), INS (Statistics Bulgaria), Destatis (Statistics Germany), INE (Statistics Portugal) and SURS (Statistics Slovenia).

The empirical analyses, Task 4 (T4), will be performed on a selection of the promising statistics identified in T3. This task is only relevant if data have been accessed in time to do an empirical study within the WP. The purpose of each empirical analysis is to assess the statistical potential of an FTD source. The empirical analysis will e.g. look at whether FTD can give auxiliary information useful for improving the estimation leading to the statistics, and whether FTD can replace some of the current data sources.

As an example, Norway will at the start of the ESSnet have access to card transactions and giro transactions, and will perform at least one empirical analysis. The choice of statistic will depend on T3, but might be about the potential to use giro transactions for statistics about the market for rental apartments.

Task 5

Performed by all: SSB (Statistics Norway), INS (Statistics Bulgaria), Destatis (Statistics Germany), INE (Statistics Portugal), SURS (Statistics Slovenia).

Task 5 (T5) corresponds to T1, but its purpose is to study SEP data, and it is restricted to literature reviews. T0 will reveal what a SEP is and the variety of different sharing activities. Just like T1, T5 will start by identifying existing SEP data for different economic activities.

Task 6

Performed by all: SSB (Statistics Norway), INS (Statistics Bulgaria), Destatis (Statistics Germany), INE (Statistics Portugal), SURS (Statistics Slovenia).

Task 6 (T6) corresponds to T2, but for SEP data, and is restricted to literature reviews.

Task 7

Performed by all: SSB (Statistics Norway), INS (Statistics Bulgaria), Destatis (Statistics Germany), INE (Statistics Portugal) and SURS (Statistics Slovenia).

Task 7 (T7) corresponds to T3, but for SEP data, and is restricted to literature reviews.

Task 8

Performed by all: SSB (Statistics Norway), INS (Statistics Bulgaria), Destatis (Statistics Germany), INE (Statistics Portugal) and SURS (Statistics Slovenia).

The empirical analyses of SEP data, Task 8 (T8), will be performed on a selection of the promising statistics identified in T7. This task is only relevant if data have been accessed in time to do an empirical study within the WP. If such SEP data have been accessed, they should be discussed with Eurostat, and the task should only start after approval by Eurostat. The purpose of each empirical analysis is to assess the statistical potential of an SEP data source. The empirical analysis will e.g. look at whether SEP data can give auxiliary information useful for improving the estimation leading to the statistics, and whether SEP data can replace some of the current data sources.

Remark

T5-T8 will be performed taking into account position papers of the Commission (http://ec.europa.eu/growth/singlemarket/services/collaborative-economy_en) and of the partners' official national authorities. Only findings relevant to the concepts of and to the production of official statistics should be analysed and reported.

Task 9-10 - Final project report

Performed by all: SSB (Statistics Norway), INS (Statistics Bulgaria), Destatis (Statistics Germany), INE (Statistics Portugal) and SURS (Statistics Slovenia)

Task 9 (T9) and Task 10 (T10) are about writing a final project report on the findings of T1-T4 and T5-T8, respectively.

Collaboration with Banca d’Italia on T1 and T2

Banca d’Italia will work for ISTAT in the project on the basis of a Memorandum of Understanding, but will not be funded by the project, and no costs will be reimbursed for Banca d’Italia participants. The reason for its participation in the project is that Banca d’Italia has specific expertise on payment systems, and this expertise is not currently available at ISTAT. Banca d’Italia will provide consultancy in the following domains:

  • The provision of relevant data on electronic payments held by Banca d'Italia or other private data-holders;
  • The expertise related to the rules governing the electronic payment systems.

Moreover, Banca d’Italia is interested in identifying alternative sources (typically Big Data) for measuring economic activity and sharing them within the ESS.

Timeline

T1 and T5 start at M0 for all countries and for all financial transaction data types, followed by the parallel activities T2 and T6, T3 and T7, and T4 and T8. T9 and T10 are the final activities.

Milestones and deliverables


WPG milestones

  GM1   Report on the WP meeting mid-2019   Month 9
  GM2   Report on the WP meeting mid-2020   Month 20

WPG deliverables

  G1  Draft report on T1-T3 and T6-T8. The report will be a collection of the reports for each country on these tasks, including a brief overall summary. The descriptions should be sufficient for deciding on what studies/work should be included in T4 and T8, each study identified by the combination of “a statistical product”, “financial transaction data type”, and “country”.   Month 11  
  G2  Conclusions on studies, cf. D1, and plan for year two   Month 12  
  G3  Closing the selection of studies (statistics, financial transaction data type, country) to be part of T4 and T8   Month 15  
  G4  Final project report   Month 23  Based on task 9-10