Europa Analytics - Frequently Asked Questions

These FAQs are designed to provide a better understanding of the Europa Analytics service.

They provide basic information and will often link to more detailed information.

The list is going to be updated on a regular basis.



TECHNOLOGY

What is the technology behind Europa Analytics?

The system relies on a web server's recording of visit data in log files (weblogs). A proxy server records requests from visitors' browsers in weblogs. These contain information about the web audience, such as visitors' IP addresses, browser types, URLs of visited pages, referrer pages, search terms, date and time of visit and much more.

The application starts every day at 13h30 and processes the log files of the previous day. The main steps the application performs are the extraction, transformation and loading (so-called ETL: Extract, Transform and Load) of data into the tables of the Europa webmarts (specialised databases for web audience data). Europa Analytics data are stored in webmarts dedicated to Europa sites, set up on the basis of the configuration editor, which is updated by DGs and services. At this stage, only Europa sites hosted at the Data Centre in Luxembourg are processed. More information can be found on the Information Providers Guide.


What are the current available reporting tools?

The Europa Analytics service provides both easy-to-access dashboard reports via the so-called 'Kiosk' (no user account needed) and more complete reports via the 'Web Report Studio' (the so-called 'Europa Analytics reporting tool'), for which an account must be created.

You may extract reports such as monthly/quarterly/yearly dashboards as well as reports containing information on the following indicators: key figures, browsers and platforms, search terms, pages, referrer pages, visitor recency and frequency, countries, organisations, languages, spiders, status codes etc. Customised reports and scheduled reports (push reports) can be set up as well.

More information can be found on Europa Analytics reporting environments and on List of Generic Reports available with Europa Analytics.


Why are there differences in numbers between Dashboard Reports and other advanced Template reports in the Europa Analytics (EA) reporting tool?

The differences are due to some "historical" technical choices that caused the EA system to rely on two different datamarts (a datamart is a subset of the whole data warehouse that aims at meeting a specific need). The main reasons behind that choice were linked to performance issues and to the fact that some indicators were not easy to aggregate with the original datamart solution. As a result, the two datamarts, though importing data from the same weblogs, treat status codes such as 3xx (redirections), 4xx (client errors) and 5xx (server errors) differently, and this is why certain indicators differ.

For the Kiosk and Dashboard Reports, the so-called "DG COMM datamart" is used (which was mainly meant to speed up the process). This datamart does not take error codes and redirections into account, which results in figures that differ from the indicators obtained via the Pages Report, the Key Figures Reports and other advanced Template Reports.

In practice, this means that we can use those differences as an indicator of how many visitors experience errors and/or redirections on the page of interest. If no errors and no redirections occur on a website the figures will be exactly the same for both datamarts.


What are the requirements for adding new weblogs to the EA system?

The EA system relies on SAS logging technology and requires a customised configuration before data can be obtained from it (more information under the CONFIGURATION section of this FAQ). Currently, only weblogs (proxy) coming from the DIGIT Data Centre in Luxembourg are processed. The required log format is the BlueCoat format. More details can be found on Request the Delivery of the weblogs.


I am using a Pages Report and I see that some of my pages were not found and a result was given for all the pages in the site. What is the reason for this aggregation?

The EA system does not distinguish between different dynamic URLs if they are identified by a URL variable (also called keyword) that is not incorporated in the ETL procedure. Suppose that the "pg" (=page) variable is not known by the system. The result is then not given for the URL ec.europa.eu/research/index.cfm?pg=why&lg=en, but only for the first part of it, which is: ec.europa.eu/research/index.cfm?. To avoid this kind of situation, please send an email to FMB COMM EUROPA MANAGEMENT indicating the keywords that should be recorded for your webmart. Please keep in mind that random keys, session IDs, usernames, passwords, control keywords, etc. cannot be added, as this would slow down the pages database.
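
The aggregation described above can be pictured with a short Python sketch. It is an illustration under assumptions, not the actual ETL code: the function name `normalise_url` and the idea of passing the registered keywords as a set are hypothetical.

```python
from urllib.parse import parse_qsl, urlencode


def normalise_url(url, known_keywords=frozenset()):
    """Keep only the query keywords registered for the webmart (hypothetical helper)."""
    path, sep, query = url.partition("?")
    if not sep:
        return url  # no query string, nothing to aggregate
    kept = [(key, value)
            for key, value in parse_qsl(query, keep_blank_values=True)
            if key in known_keywords]
    return path + "?" + urlencode(kept)


url = "ec.europa.eu/research/index.cfm?pg=why&lg=en"
print(normalise_url(url))                # no keyword registered: variants collapse
print(normalise_url(url, {"pg", "lg"}))  # keywords registered: full URL is kept
```

With no keyword registered, every variant of index.cfm ends up in the same row of the report, which is the behaviour the question describes.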


Is there a way of excluding some of my URLs from being counted in the statistics?

You can add the following keyword to your URL: ea-ignore=true (example: /newsroom/beta/about-eu/index_en.htm?ea-ignore=true). This excludes the pages concerned from the Europa Analytics figures.
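
If links are generated by a script, the keyword can be appended programmatically. The following is a minimal sketch; the helper name `add_ea_ignore` is hypothetical, and only the keyword ea-ignore=true comes from this FAQ.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_ea_ignore(url):
    """Return the URL with ea-ignore=true added to its query string (hypothetical helper)."""
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    if ("ea-ignore", "true") not in query:
        query.append(("ea-ignore", "true"))  # keep existing keywords, add ours once
    return urlunsplit(parts._replace(query=urlencode(query)))


print(add_ea_ignore("/newsroom/beta/about-eu/index_en.htm"))
```

The helper leaves a URL unchanged if the keyword is already present, so it can be applied repeatedly without duplicating the parameter.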


How does EA handle web spiders?

Spiders are computers, rather than human viewers, that examine the content of websites. The EA solution filters out about 60% of the data, which is considered to come from web spiders. To see how many spiders were removed from the data, you can use the Spider File Hits Report. The analysis of spiders, index engines or crawlers is based on clickstream analysis within the SAS Web Analytics (SWA) tool. SWA proposes the list of IP addresses and provides all clicks made by them.

The following elements are checked for spider identification:

  • number of clicks above 5000 with avg of 2 secs per click
  • number of clicks above 15000 with avg of 1 sec per click
  • number of OS and browsers used in sessions
  • more than 13 languages used in one session

The IP addresses of spiders are stored and used to filter out their traffic during extraction. IP addresses that do not return are kept for 500 days. After that, they are deleted and no longer used for filtering in the ETL.
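
The 500-day retention rule can be sketched as follows. The dictionary layout, the function name `prune_spider_ips` and the example addresses are assumptions made for illustration; this FAQ only states the retention period.

```python
from datetime import date, timedelta

RETENTION_DAYS = 500  # retention period stated in this FAQ


def prune_spider_ips(last_seen, today):
    """Drop spider IP addresses not seen during the last 500 days (hypothetical helper).

    last_seen maps an IP address to the date it was last observed.
    """
    cutoff = today - timedelta(days=RETENTION_DAYS)
    return {ip: seen for ip, seen in last_seen.items() if seen >= cutoff}


today = date(2016, 1, 1)
spiders = {
    "192.0.2.10": today - timedelta(days=30),   # recently seen: kept
    "192.0.2.20": today - timedelta(days=501),  # not seen for over 500 days: dropped
}
print(prune_spider_ips(spiders, today))
```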


How can I access reports concerning the EC Intranet pages?

The Europa Analytics solution provides web indicators for Europa public websites and not for the password-protected Intranet pages. A similar service, managed by DG HR, exists for the EC Intracomm pages. Please contact FMB EC MYINTRACOMM for more information or have a look at this page.


Does Europa Analytics track webgate websites?

No, it does not. The current Europa Analytics tool does not support webgate websites, due to restrictions in the datalogs policy. The functionality will be available with the upcoming analytics tool.


How can I check up to which date the reports are available?

The Europa Analytics system relies on weblogs availability and ETL processes.

The data is fully updated with a standard delay of 2 days. Some partial data is available with a delay of 1 day, but it should be analysed with caution as it is not yet complete. Therefore, to extract a report on yesterday's data, it is recommended to wait until tomorrow.
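
The standard delay can be expressed as simple date arithmetic. This is a minimal sketch assuming no service disruption; the function name `first_complete_extraction_date` is hypothetical.

```python
from datetime import date, timedelta

STANDARD_DELAY_DAYS = 2  # standard delay stated in this FAQ


def first_complete_extraction_date(data_day):
    """Earliest day on which the data for data_day is expected to be complete."""
    return data_day + timedelta(days=STANDARD_DELAY_DAYS)


today = date(2015, 3, 11)
yesterday = today - timedelta(days=1)
# Data for yesterday is expected to be complete tomorrow.
print(first_complete_extraction_date(yesterday))
```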

In case of occasional service disruptions, data may not be synchronised to the standard 2 days delay. In this particular situation it is useful to check the data availability via the Kiosk interface, where the date of latest loaded data is displayed on the top of the page. The Webmart Properties Report also provides this kind of information (on a level of a specific webmart).


Is it allowed to use third party web analytics tools?

The IPG rules expressly state that "Third party services are not allowed on EUROPA. Webmasters must use in-house solutions and not third party tools". IPG rules should always be communicated by the DG webmaster to any external contractors providing website development services to the relevant DG.

The DG COMM recommends using the Europa Analytics service as it is the in-house solution customized for Europa webmasters and supported by the European Commission. It guarantees business continuity and a general control over the data - in both senses, by controlling respect of the protection of personal data and by controlling the processed data as such.

More information can be found on:

It is essential to highlight the importance of keeping fixed indicators for a tool in order for it to be able to provide comparable values over time. For example, to be able to see the evolution of the number of visitors to your website over a set period of time. Please refer to the TECHNOLOGY section of this FAQ to read more on how our system works.


CONFIGURATION

Where can I check the current configuration of my webmart(s)?

There are 3 possibilities to access webmart's configuration:

In order to check if the configuration is correct, please consider reviewing the following points:

  • missing URL(s)
  • missing redirection(s)
  • incorrect splitting into section(s)
  • obsolete URL(s)
  • expiry date of URL(s) approaching the end
  • one URL cannot be mapped more than once (excluding expired rules)


How can I request a change in the configuration of my webmart(s)?

There are two options:

  • You can initiate a change request via the Config Editor in the Europa Analytics reporting tool. The Config Editor user guide can be found in EA reporting tool under the section Config Editor.
  • You can follow the procedure described here.

In order to keep data collection coherent and comparable, it is best practice to implement changes on the 1st day of a month, a quarter or a year.


Why are my webmart's / section's figures for the current period significantly higher/lower than for previous month / quarter / year?

Please verify whether your webmart's / section's configuration has changed over the period that you are analysing. Higher numbers for the current period may be explained by the addition of new URL(s) to your webmart, whereas lower numbers may be explained by the expiry or deletion of URL(s) that were included in your webmart in previous periods and are no longer counted in the figures. Be aware that each configuration change may have an impact on the figures. This is why it is essential to understand your webmart's configuration before drawing any conclusions. More information can be found under the CONFIGURATION section of this FAQ.


I would like to create a new section within a webmart. Will the new section include data retroactively for my URL(s)?

Be aware that this will NOT be retroactive: the data for the new section will be collected as of the creation date. However, it will still be possible to retrieve the data by specifying the URL of interest in the Specify URL box in the parameters specification interface, available in the EA reporting tool. Please note that this will only be possible if the root section of your webmart was properly configured.


INDICATORS AND METRICS (DATA ANALYSIS)

How can I find out where the visitors to my website come from?

To find out where visitors come from, please use the Referrer Entry Pages Report and/or the Countries Report (available via Template Reports). The Referrer Entry Pages Report provides information on the pages that have led to your website, whereas the Countries Report provides information on the countries from which a request to access your website has been registered.


What does it mean when a report displays that "no referrer" was found?

"No referrer" means that no referrer page was identified. The visitor therefore opened the page via a bookmark, via a link in an email, or by typing the address directly in the address bar.


Where can I find the search terms that have led to my website/section or a particular URL of the website?

To get a list of the search terms (keywords) that have led to your website from external or internal search engines please use the Search Terms Overview Report (available via Template Reports).

Note1: Due to the latest changes implemented by Google, the Google search terms cannot be fully detected by Europa Analytics and other Web Analytics tools. However, search terms from other search engines including internal Europa Search are fully available for your analysis.

Note2: Please be informed that the internal search terms (from Europa Search) are not available for the following time period of data: 10 March - 22 July 2015.


How can I find the search terms that visitors typed in the search box of a specific site?

TEMPORARILY NOT AVAILABLE

  • Open Referrer Entry Pages Report and select the following parameters:
    • Dates: 01Mar2015 − 31Mar2015 per interval Month => more than 1 month is not recommended
    • Webmart: Europa Search => if you do not have access to this webmart please request it via COMM EUROPA MANAGEMENT
    • Specify URL: %querytext=% => please put the exact text: %querytext=% (together with the wildcards)
    • Number of URLs: 1000000000
  • Export the table to Excel (right click on table with results, click on export table option, choose Excel, click ok)
  • From the "Referrer" column (which provides the full referrer URL) select pages of the website from which you want to see the search terms. For example, select only those referrer URLs starting with: ec.europa.eu/agriculture in order to see search terms from Agriculture sites
  • Then look at the column "Page Description" (landing pages). The values of the querytext= keyword are the search terms which have been typed on the site you have selected (in our example Agriculture sites). Extract those terms from "Page Description".
  • Once search terms are extracted you can look at the page views metric in order to select the top ones

How can I check how many searches performed through Europa Search have led to my website?

Log into the Europa Analytics reporting tool with your credentials

  • Open Template Reports
  • Select Referrer Entry Pages
  • Insert time parameters => more than 1 month is not recommended
  • Select a webmart => select the webmart/website for which the number of internal searches has to be extracted
    • Optional: Select a section
    • Optional: If interested in a particular URL, please do not forget to delete the "http://" from the beginning of the URL. Not deleting "http://" will lead to no results.

If you are looking for more than one URL, please do not forget to use % at the beginning and the end of each URL. Separate the URLs with a comma.

  • In the field: Number of Referral Pages please insert the following number "10000000".

It will give you the optimum results.

  • Once you generate the report, please extract it in an excel format (right click on table with results, click on export table option, choose Excel, click OK)

In order to have the exact number of searches, you should manipulate the excel file as follows:

  • Click on data and add a Filter (Data tab)
  • On the 'referral' column, choose results which contain the following text "europa.eu/geninfo/query"
  • After you apply the filter, sum up the results of the Entry page views.

This number represents the number of searches performed via the Europa Search that have referred visitors to your website.
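
The same filter-and-sum step can also be done outside Excel. The sketch below assumes the exported table has been saved as CSV and that its columns are named "Referrer" and "Entry page views"; both column names and the sample rows are invented for illustration and may differ from the real export.

```python
import csv
import io

SEARCH_MARKER = "europa.eu/geninfo/query"  # text to filter on, as given in this FAQ


def count_europa_searches(csv_text, referrer_col="Referrer", views_col="Entry page views"):
    """Sum entry page views for rows whose referrer is Europa Search."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return sum(int(row[views_col])
               for row in reader
               if SEARCH_MARKER in row[referrer_col])


# Invented sample rows, only to show the mechanics.
sample = """Referrer,Entry page views
europa.eu/geninfo/query/?querytext=cap,12
www.example.org/some-page,40
europa.eu/geninfo/query/?querytext=milk,3
"""
print(count_europa_searches(sample))  # 15
```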

Note: Downloading the report might take more than 20 min.


How can I decode unreadable results on "Search Terms Overview" report?

When running the "Search Terms Overview" report, unreadable terms might appear under the section 'Searched Terms'. Below you can find examples of unreadable search terms:

  • Cr%C3%A9ative
  • dom%C3%ADnguez
  • EU%E3%80%80integrated %EF%BC%AD%EF%BD%81%EF%BD%92%EF%BD%89%EF%BD%8E%EF%BD%85%E3%80%80policy
  • %D0%B0%D1%82%D0%BB%D0%B0%D1%81 %D0%BD%D0%B0 %D0%BC%D0%BE%D1%80%D0%B5%D1%82%D0%B0%D1%82%D0%B0

These are in fact real search terms, but they contain characters that Europa Analytics did not decode (for example accented letters or non-Latin alphabets such as Arabic or Chinese) and are therefore shown in URL-encoded (percent-encoded) form. In order to read them, please use the following online decoder: Decoder

Please insert the text and click "Decode URL". If the text appears in a language that needs translation, copy paste the text and use a translation tool of your choice.
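
As an alternative to an online decoder, the terms can be decoded locally. This is a minimal sketch using Python's standard `urllib.parse.unquote`, assuming the terms are percent-encoded UTF-8 (which is the case for the examples listed above); the wrapper name `decode_search_term` is hypothetical.

```python
from urllib.parse import unquote


def decode_search_term(term):
    """Decode a percent-encoded search term, assuming UTF-8 encoding."""
    return unquote(term, encoding="utf-8")


for raw in ("Cr%C3%A9ative",
            "dom%C3%ADnguez",
            "%D0%B0%D1%82%D0%BB%D0%B0%D1%81 %D0%BD%D0%B0 %D0%BC%D0%BE%D1%80%D0%B5%D1%82%D0%B0%D1%82%D0%B0"):
    print(decode_search_term(raw))
```

The first two examples decode to "Créative" and "domínguez"; the third decodes to Cyrillic text, which can then be passed to a translation tool.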


Does Europa Analytics allow collecting data such as visits coming from or leaving Social Media channels?

Europa Analytics provides information on referrers that led your visitors to the Europa pages (including Social Media channel referrers). You may use the Referrer Entry Pages Report to analyse visits coming from Social Media channels. However, the current system cannot detect visitors leaving Europa sites to go to Social Media channels.


How can I get the average session time spent on the website?

To get the average session time, please use the Webmart Key Figures Report. You can also use the Visitor Recency Report or the Visitor Frequency Report.


Does the Unique Visitor metric actually reflect a person visiting a website?

The Unique Visitor metric does not measure a person but rather the device through which a person interacts with a website or network (that is why the term Unique Browser is often used instead). We may therefore reasonably assume that the number of Unique Browsers roughly corresponds to the number of people visiting a website (the trends correspond well to the increasing or decreasing popularity of a website). However, the very same person accessing a website from the office during the day, from a smartphone when travelling, and from home in the evening will be counted 3 times, because they are using 3 different devices. Conversely, when a PC is shared (at home, in an internet cafe, etc.), the visits of several persons are counted as only 1 Unique Browser (as they access the site through the same device).


Why is there a difference between the sum of the unique visitors per day (for 30 days) and an aggregation of unique visitors per month?

Repeat visitors (those who come back to your website over several days) cause this difference. The same visitor can make several visits on the same day; in this case the visitor is counted only once for that day (unique within a 24-hour period). However, if the same visitor returns to your site on each day of a month, they are counted once in the unique visitors of every day, but only once in the unique visitors of the month. It is therefore important not to sum the daily numbers of unique visitors. The same problem arises when summing the unique visitors per month (for 12 months) and comparing the result with the aggregated unique visitors per year.
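
A small numeric illustration may help. The visitor identifiers and dates below are invented; the sketch only shows why adding up daily unique counts overstates the monthly figure.

```python
# Invented data: which visitors were seen on each day.
daily_visitors = {
    "2015-03-01": {"A", "B"},
    "2015-03-02": {"A"},
    "2015-03-03": {"A", "C"},
}

# Summing the daily unique counts counts visitor A three times.
sum_of_daily_uniques = sum(len(visitors) for visitors in daily_visitors.values())

# The period figure counts each visitor once over the whole period.
period_uniques = len(set().union(*daily_visitors.values()))

print(sum_of_daily_uniques)  # 5
print(period_uniques)        # 3
```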


Why do I obtain more visits than page views for my website?

The page views are based on the HTTP status code 2XX (success). If another code is returned (3XX – redirection, 4XX – client error, 5XX – server error), there will be no "page view" although it will be counted as a visit. Therefore it is even possible that for various pages you may find some figures for visits when the corresponding row has 0 page views (Visits > 0 with Pages = 0).

This approach prevents redirections and errors from being counted in the page views, as in these cases the user does not get any real content. In order to construct and analyse the whole path of the visitor over the web, all the status codes are counted in the visits.

To analyse the number of redirections and error status codes on your site, please use the Pages Report (new version) available via the Template Reports. More information on redirections can be found on Redirection Impact.
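
The counting rule can be sketched as follows. The request tuples and the function name `per_url_figures` are assumptions for illustration; the rule itself (page views only for 2XX, visits for all status codes) is the one described above.

```python
from collections import defaultdict


def per_url_figures(requests):
    """Return {url: (visits, page_views)} from (visit_id, url, status) tuples."""
    page_views = defaultdict(int)
    visit_ids = defaultdict(set)
    for visit_id, url, status in requests:
        visit_ids[url].add(visit_id)   # every status code counts towards visits
        if 200 <= status < 300:
            page_views[url] += 1       # only 2XX responses count as page views
    return {url: (len(ids), page_views[url]) for url, ids in visit_ids.items()}


# Invented requests: /old only ever returns a redirection.
requests = [
    ("v1", "/old", 301),
    ("v1", "/new", 200),
    ("v2", "/new", 200),
]
print(per_url_figures(requests))
```

Here /old shows 1 visit and 0 page views, which is the "Visits > 0 with Pages = 0" situation.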


Why do I obtain more visits in the Pages Report than in the Webmart/Section Key Figures Report?

If you have come to this conclusion, this means that you have summed up the number of visits in the Pages Report. This practice should be avoided, as it leads to incorrect overstated figures. The Pages Report displays the number of visits per page, whereas the Webmart Key Figures Report and the Section Key Figures Report display the precise aggregated number for all pages in the site.

Simple example: A visitor opens a page /example/abc and after a while opens another page /example/xyz, both within a webmart called Example. In the Pages Report we get one visit for each page (total = 2 visits), whereas in reality there was only one visit. Summing the visits is valid only in the very rare situation where no single visit can include more than one of the displayed pages. More information can be found on: Is it ok to sum up the number of page views and visits displayed by URL?.
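
The example can be reproduced in a few lines of Python. The visit identifier "v1" is invented; the two pages are those of the example.

```python
from collections import defaultdict

# One visit ("v1") that views two pages of the Example webmart.
hits = [("v1", "/example/abc"), ("v1", "/example/xyz")]

visits_per_page = defaultdict(set)
for visit_id, page in hits:
    visits_per_page[page].add(visit_id)

summed_per_page = sum(len(ids) for ids in visits_per_page.values())
actual_visits = len({visit_id for visit_id, _ in hits})

print(summed_per_page)  # 2, overstated
print(actual_visits)    # 1, the real number of visits
```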


Is it ok to sum up the number of page views and visits displayed by URL?

The Pages Report allows analysing, among other indicators, the number of page views and visits per each visited URL of the website. The results for the page views by URL can be summed up in order to obtain a total figure for a given period of time. However, the results for the visits by URL should not be summed up, because the URLs displayed in the report might be a part of the same visit. The total number of visits would be hence very likely overstated.

Visual explanation of this problem: Visits vs. PageViews by URL.jpg


Why is there a difference between visits stated in the Pages Report and the Referrer Entry Pages Report? Shouldn’t this data be the same?

Please consider that it is not correct to sum up the visits from the Pages Report, as this produces overstated results. More information can be found under the question Why do I obtain more visits in the Pages Report than in the Webmart/Section Key Figures Report?. However, you may sum up visits from the Referrer Entry Pages Report, as they are all distinct visits.

Moreover, the Referrer Entry Pages report takes into account only the starting (entry) point of the visit. This means that non-entry pages are not counted in this report (no visits are registered for those pages).

Example: a visitor uses Google search and lands on the ec.europa.eu website, then browses further and opens another page of this website: ec.europa.eu/atwork. The Referrer Entry Pages Report will collect the data about the Google search (as the referrer) and about the ec.europa.eu website (as the entry page), but will not report any page view/visit on ec.europa.eu/atwork (as this is a non-entry page).


How is it possible that for the same time period and the same URL the number of visits/page views is different depending on the chosen interval?

Please pay attention when choosing the "interval" (read "aggregation") for the report. The data is aggregated as follows (applies to all the "Template Reports" except the Dashboard Reports):

  • Year: from 1 January – 31 December
  • Month: from the first day of the given month until the last day of the given month
  • Day: day by day

Taking the above into account, when selecting, for example, the monthly interval and specific dates such as 22 April - 22 May, the system returns data for the whole months of April and May. The figures are therefore higher with the monthly aggregation than with the daily aggregation. To obtain correct figures for the period 22 April - 22 May, please use the "day" interval.
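
The effect of the interval on the dates actually covered can be sketched as follows. The function name `effective_range` and the interval labels are hypothetical; the expansion rules are those listed above.

```python
import calendar
from datetime import date


def effective_range(start, end, interval):
    """Return the (first_day, last_day) actually covered for a given interval."""
    if interval == "day":
        return start, end
    if interval == "month":
        last_day = calendar.monthrange(end.year, end.month)[1]
        return start.replace(day=1), end.replace(day=last_day)
    if interval == "year":
        return date(start.year, 1, 1), date(end.year, 12, 31)
    raise ValueError("unknown interval: " + interval)


start, end = date(2015, 4, 22), date(2015, 5, 22)
print(effective_range(start, end, "month"))  # 1 April to 31 May
print(effective_range(start, end, "day"))    # 22 April to 22 May
```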


Are https pages included in the reports?

They are included in the figures, but there is no possibility of distinguishing between the http and the https pages.


Is the traffic originating from the European Commission excluded from the reports?

The reports do not exclude the EC's IP addresses. However, you may want to run the Organisations Report, which allows you to differentiate between the traffic generated by EC visitors and by non-EC visitors.


How should the bounce rate indicator be interpreted?

The bounce rate is meaningful only in relation to specific pages. A bounce is a visit consisting of a single page view (entry page = exit page); a visit (session) ends after 30 minutes of inactivity or with a connection from another browser. The Bounce Rate is the ratio between the number of single-page-view visits to a page and the total number of visits to that page (or webmart, or section, depending on the report and the query). When no URL has been specified (as in a Webmart Key Figures Report, or possibly in a Dashboard Report), the Bounce Rate is an aggregation of the bounce rates for the whole webmart or section for which the report has been retrieved. More information can be found on Bounce Rate.
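
The per-page ratio can be illustrated with a short sketch. It assumes each visit is available as the ordered list of pages viewed, which is a simplification of the real data; the function name `bounce_rate` and the sample visits are invented.

```python
def bounce_rate(visits, page):
    """Share of visits to `page` that consisted of a single page view."""
    visits_on_page = [pages for pages in visits if page in pages]
    if not visits_on_page:
        return 0.0
    bounces = sum(1 for pages in visits_on_page if len(pages) == 1)
    return bounces / len(visits_on_page)


# Invented visits, each given as the list of pages viewed.
visits = [["/a"], ["/a", "/b"], ["/b"], ["/a"]]
print(bounce_rate(visits, "/a"))  # 2 bounces out of 3 visits
print(bounce_rate(visits, "/b"))  # 1 bounce out of 2 visits
```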


How can I check how frequently the visitors come to the site?

Please use the Visitor Frequency Report. This report shows the number of unique visitors per visitor frequency. Visitor frequency (for a unique visitor for a selected period) is defined as the number of days between the first and the last visit divided by the total number of visits.
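
The definition translates directly into code. This is a minimal sketch assuming one date per visit; the function name `visitor_frequency` and the sample dates are invented.

```python
from datetime import date


def visitor_frequency(visit_days):
    """Days between first and last visit, divided by the total number of visits."""
    if not visit_days:
        raise ValueError("at least one visit is required")
    days = sorted(visit_days)
    return (days[-1] - days[0]).days / len(days)


# Four visits spread over 20 days.
visit_days = [date(2015, 3, 1), date(2015, 3, 6), date(2015, 3, 11), date(2015, 3, 21)]
print(visitor_frequency(visit_days))  # 5.0
```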


How can I check how recently the visitors came to the site?

Please use the Visitor Recency Report. This report shows the number of unique visitors per visitor recency. Visitor recency (for a unique visitor for a selected period) is defined as the number of weeks between the penultimate and the most recent visit of a visitor.
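
A minimal sketch of this definition follows. How the tool rounds weeks is not stated here, so the sketch returns a fractional number of weeks as an assumption; the function name `visitor_recency_weeks` and the sample dates are invented.

```python
from datetime import date


def visitor_recency_weeks(visit_days):
    """Weeks between the penultimate and the most recent visit (None if fewer than 2 visits)."""
    days = sorted(visit_days)
    if len(days) < 2:
        return None
    return (days[-1] - days[-2]).days / 7


visit_days = [date(2015, 3, 1), date(2015, 3, 8), date(2015, 3, 22)]
print(visitor_recency_weeks(visit_days))  # 2.0
```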


REPORT CUSTOMISATION

How to customise and add a customised report to my home page?

Please refer to the PPT slides from hands-on training and follow the steps in the exercise 8 (Personalising homepage) and exercise 9 (Reports' customisation).


How can I check how many times certain files were downloaded from my website?

Please use the Pages Report. The Specify URL box in the parameters specification allows you to display only the URLs that interest you. Remember to use the wildcard and type the extension of the file you want to track. For example, to track pdf files, enter the following: %.pdf%.
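
To preview which URLs a pattern such as %.pdf% would select, the wildcard can be emulated locally. The sketch assumes that % stands for any sequence of characters (as in an SQL LIKE pattern) and that matching is case-sensitive; both points are assumptions about the reporting tool, and the function name `like_match` and the sample URLs are invented.

```python
import re


def like_match(url, pattern):
    """Return True if url matches pattern, where % matches any sequence of characters."""
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    return re.fullmatch(regex, url) is not None


urls = [
    "ec.europa.eu/docs/report.pdf",
    "ec.europa.eu/docs/report.pdf?lg=en",
    "ec.europa.eu/docs/index.htm",
]
print([u for u in urls if like_match(u, "%.pdf%")])
```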


How can I apply a filter in a report?

Once your results are displayed you can right-click on a title of an indicator of interest and choose Filter and Rank option. Then you can choose an operator and a value to be applied.


How can I export a report from Europa Analytics?

You may export reports in pdf, excel or word format. Please check the procedure on Export report from Europa Analytics.


I am trying to export a yearly Pages Report to excel but the downloaded file is empty. Is there any way of getting the report in an excel format?

It is not recommended to download the whole year pages data, because of MS Excel limitations. Excel 2007/2010 will have the Excel 2003 limits (65,536 rows) when opening a .xls document (which is the case when a zip doc is downloaded from SAS). We recommend therefore the following:

  • Limit the number of URLs to be downloaded to a maximum of 65,000, for example by choosing the monthly data instead of the yearly data
  • Specify a filter for including/excluding specific URLs (more on Use URL in Report) or choose only those pages that were actually viewed properly (page views > 0)


TRAINING

How can I request practical training on Europa Analytics?

Personalised one-hour coaching sessions are offered by the Analytics Team. Please contact COMM EUROPA MANAGEMENT (Europamanagement@ec.europa.eu) to schedule a coaching session. The official syslog four-hour training sessions are no longer available.





The page Bugs on Europa Analytics lists all known bugs and needed or wanted improvements.
