WP8 Documentation1

From ESSnet Big Data

This page provides an overview, alphabetically by author(s), of all articles on big data methodology, IT or quality useful for WP8 Methodology. See here for documentation on big data in general and on the other ESSnet Big Data workpackages.

  • L. Altin, M. Tiru, E. Saluveer & A. Puura (2015): Using Passive Mobile Positioning Data in Tourism and Population Statistics, NTTS 2015 Conference abstract
  • A. Arai, Z. Fan, D. Matekenya & R. Shibasaki (2016): Comparative Perspective of Human Behavior Patterns to Uncover Ownership Bias among Mobile Phone Users
  • AAPOR (2013): Report of the Task Force on Non-probability sampling, June.
  • AAPOR (2015): American Association for Opinion Research Report on Big Data
  • R.L. Ackoff (1989): From Data to Wisdom, Journal of Applied Systems Analysis 16, 3-9
  • R. Agrawal & R. Srikant (1994): Fast algorithms for mining association rules in large databases, Proceedings of the 20th International Conference on Very Large Databases, 487-499, Santiago, Chile
  • G.M. Amdahl  (1967): Validity of the single processor approach to achieving large scale computing capabilities, AFIPS Conference Proceedings 30, 483-485
  • ASA-working group (2014): Discovery with Data: Leveraging Statistics with Computer Science to Transform Science and Society, report of a Working Group of the
  • M. Asay (2012): Big Data is now TOO BIG - and we're drowning in toxic information, Just why are we hoarding every last binary bit?, The Register, Cloud Business, 4 June
  • J.W. Ayers, B.M. Althouse, J.P. Allem, et al. (2013): Seasonality in seeking mental health information on Google, American Journal of Preventive Medicine 44, 520-525
  • J.W. Ayers, K. Ribisl & J.S. Brownstein (2011): Using Search Query Surveillance to Monitor Tax Avoidance and Smoking Cessation following the United States' 2009 “SCHIP” Cigarette Tax Increase, PLoS ONE 6(3): e16777
  • D. Ayoubkhani (2012): An investigation into using Google Trends as an administrative data source in ONS, Seminar on New Frontiers for Statistical Data Collection, UNECE Conference of European Statisticians, Geneva
  • F. Bacchini, M. D'Alò, S. Falorsi, et al. (2014): Does Google index improve the forecast of Italian labour market?, Proceedings of the 47th Scientific Meeting of the Italian Statistical Society, Cagliari
  • J. Bai, J. Fan, R. Tsay (2016): Special Issue on Big Data, Journal of Business and Economic Statistics 34(4), 487-488
  • R. Baker, J.M. Brick, N.A. Bates, M. Battaglia, M.P. Couper, J.A. Dever, K.J. Gile & R. Tourangeau (2013): Report of the AAPOR Task Force on Non-Probability Sampling. AAPOR report, May
  • C. Bange, T. Grosser & N. Janoschek (2015): Big data use cases 2015: Getting real on data monetization, research report, BARC Research
  • G. Bello-Orgaz, J.J. Jung & D. Camacho (2016): Social big data: Recent achievements and new challenges, Information Fusion 28, 45–59
  • M. Beresewicz (2016): Internet data sources for real estate market analysis. PhD Dissertation.
  • J. Bethlehem (2010): Selection bias in web surveys. International Statistical Review, 78(2), 161–188, Wiley Online Library
  • J. Bethlehem (2010): Statistics without surveys? About the past, present and future of data collection in the Netherlands, Presentation for the 2010 International Methodology Symposium of Statistics Canada, October 26-29, Ottawa, Canada
  • J. Bethlehem & S. Biffignandi (2012): Handbook of web surveys. John Wiley and Sons
  • M.A. Beyer & L. Douglas (2012): The Importance of Big Data: A Definition. Gartner report, June version, ID Number: G00235055.
  • P.J. Bickel, C. Chen, J. Kwon, J. Rice, E. van Zwet & P. Varaiya (2007): Measuring Traffic. Statistical Science, 22(4), 581–597
  • P. Biemer (2014): Total Survey Error: Adapting the Paradigm for Big Data
  • V.D. Blondel, A. Decuyper & G. Krings (2015): A survey of results on mobile phone datasets analysis. EPJ Data Science, 4(1), 1. Springer Berlin Heidelberg
  • J. Bollen, H. Mao & X-J. Zeng (2011): Twitter mood predicts the stock market, Journal of Computational Science 2(1), 1-8
  • D. Bollier (2010): The Promise and Peril of Big Data. Washington, DC: Aspen Institute, Communications and Society Program
  • A. Börsch-Supan, D. Elsner, H. Fassbender, R. Kiefer, D. McFadden & J. Winter (2004): How to make internet surveys representative: A case study of a two-step weighting procedure
  • O. ten Bosch & D. Windmeijer (2014): On the use of internet robots for official statistics, UNECE meeting on the Management of Statistical Information Systems (MSIS) Dublin, Ireland
  • D.M. Boyd & N.B. Ellison (2007): Social Network Sites: Definition, History, and Scholarship, Journal of Computer-Mediated Communication 13(1), 210–230
  • B. Braaksma, P. Daas, M. Offermans, M. Puts, M. Tennekes (2014): Big Data and official statistics: local experiences and international initiatives. Paper for the 47th Scientific Meeting of the Italian Statistical Society, 11-13 June, Cagliari, Italy
  • M. Braun (2015): Three Things About Data Science You Won't Find In the Books. Weblog 5th April.
  • L. Breiman (2001): Statistical Modeling: The Two Cultures. Statistical Science 16(3), 199-231
  • L. Breiman, J. Friedman, C.J. Stone & R.A. Olshen (1984): Classification and Regression Trees. CRC Press
  • J.M. Brick (2013): Unit Nonresponse and Weighting Adjustments: A Critical Review. Journal of Official Statistics 29(3), 329–353
  • D.J. Buckley (1968): A Semi-Poisson Model of Traffic Flow, Trans. Sci. 2, 107-133
  • B. Buelens, H.J. Boonstra, J. Van den Brakel & P. Daas (2012): Shifting paradigms in official statistics: from design-based to model-based to algorithmic inference. Discussion paper 201218, Statistics Netherlands, The Hague/Heerlen
  • B. Buelens, J. Burger & J. Van den Brakel (2015): Predictive inference for non-probability samples: a simulation study. Discussion paper 2015, Statistics Netherlands, The Hague/Heerlen, The Netherlands
  • B. Buelens, P. Daas, J. Burger, M. Puts & J. Van den Brakel (2014): Selectivity of Big Data, Discussion Paper 201411, Statistics Netherlands, The Hague/Heerlen, The Netherlands
  • B. Buelens, P. Daas, J. Van den Brakel (2012): Data Mining for Official Statistics: Challenges and Opportunities. Paper 915 of 12th IEEE International Conference on Data Mining Workshops, ICDM Workshops, Brussels, Belgium
  • E. Cambria & B. White (2014): Jumping NLP Curves: A Review of Natural Language Processing Research. IEEE Computational Intelligence Magazine 9(2), 48–57
  • L.J. Carr & S.I. Dunsiger (2012): Search Query Data to Monitor Interest in Behavior Change: Application for Public Health, PLoS ONE 7(10), e48158, doi:10.1371/journal.pone.0048158
  • N. Carr (2010): The Shallows: What the Internet Is Doing to Our Brains, W.W. Norton and Company, New York
  • A. Cavallo & R. Rigobon (2016): The Billion Prices Project: Using Online Prices for Measurement and Research, National Bureau of Economic Research Working Paper No. 22111
  • R. Chambers (2009): Regression Analysis of Probability-Linked Data, Official statistics research series. Wellington: Statistics New Zealand (PDF)
  • R. Chambers & H. Chandra (2013): A random effect block bootstrap for clustered data. Journal of Computational and Graphical Statistics 22(2), 452–470
  • R. Chambers & R. Clark (2012): An introduction to model-based survey sampling with applications, (Vol. 37) OUP Oxford
  • R. Chambers & N. Tzavidis (2006): M-quantile models for small area estimation, Biometrika 93(2), 255–268
  • M. Chen, S. Mao & Y. Liu (2014): Big data: A survey, Mobile Networks and Applications 19(2), 171–209
  • P. Cheung  (2012): Big Data, Official Statistics and Social Science Research: Emerging Data Challenges. Presentation at the December 19th World Bank meeting, Washington.
  • H. Choi &  H. Varian (2011): Predicting the present with Google Trends, Technical Report
  • R.M. Cormack (1989): Log-linear models for capture-recapture, Biometrics, 395–413
  • M. Couper (2013): Is the Sky Falling? New Technology, Changing Media, and the Future of Surveys. Survey Research Methods 7(3), 145-156
  • J.W. Crampton, M. Graham, A. Poorthuis, T. Shelton, M. Stephens, M.W. Wilson & M. Zook (2013): Beyond the geotag: situating big data and leveraging the potential of the geoweb, Cartography and Geographic Information Science 40(2), 130-139
  • P.J.H. Daas & M.J. Puts (2014): Big data as a source of statistical information. The Survey Statistician 69, 22-31
  • P. Daas (2012): Big Data and official statistics. Sharing Advisory Board, Software Sharing Newsletter 7, 2-3
  • P. Daas & J. Burger (2015): Profiling Big Data sources to assess their selectivity. Abstract for the New Techniques and Technologies for Statistics 2015 conference, Brussels, Belgium
  • P. Daas, S. De Broe & M. van Meeteren (2017): Center for Big Data Statistics at Statistics Netherlands. Abstract for the New Techniques and Technologies for Statistics 2017 conference, Brussels, Belgium
  • P. Daas, M. Puts & R. Renssen (2017): On Big Data based Statistical Inference. Abstract and poster for the 3rd UCL Workshop on the Theory of Big Data, June 26th-28th, London, UK
  • P.J.H. Daas (2013): Big Data and official statistics. The relevance of many tweets (in Dutch) STAtOR 14(3-4), 21-23
  • P.J.H. Daas & M.J.H. Puts (2014): Social Media Sentiment and Consumer Confidence. European Central Bank Statistics Paper Series No. 5, Frankfurt, Germany
  • P.J.H. Daas & M.P.J. Van der Loo (2013): Big Data (and official statistics), paper presented at the 2013 Meeting on the Management of Statistical Information Systems, Paris–Bangkok, France-Thailand.
  • P.J.H. Daas, B. Braaksma, R. Aly, Y. Engelhardt, D. Hiemstra &  R. Zurita Milla (2016): Big Data Masterclass and DataCamp 2015. Discussion paper 201615, Statistics Netherlands, The Hague/Heerlen, The Netherlands
  • P.J.H. Daas & B. Buelens (2017): Big data, bias and ways to correct for it. Abstract for the Big Data and ethics session at the 61st World Statistics Congress (ISI 2017) July 16th-21st, Marrakech, Morocco
  • P.J.H. Daas, J. Burger, L. Quan, O. ten Bosch & M. Puts (2016): Profiling of Twitter Users: a big data selectivity study. Discussion paper 201606, Statistics Netherlands, The Hague/Heerlen, The Netherlands
  • P.J.H. Daas, M.J.H. Puts, B. Buelens & P.A.M. van den Hurk (2015): Big data as a source for official statistics. Journal of Official Statistics 31, 249–269
  • P.J.H. Daas, M. Puts, M. Tennekes &  A. Priem (2014): Big Data as a Data Source for Official Statistics: experiences at Statistics Netherlands. Proceedings of Statistics Canada International Methodology Symposium 2014, Gatineau, Canada
  • P.J.H. Daas & M.J.H. Puts (2014): Sentiment analysis of Mexican tweets: smileys and emoticons. A Big Data sandbox study for the social data task team of the UNECE taskforce, UNECE.
  • P.J.H. Daas, M. Roos, C. de Blois, R. Hoekstra, O. ten Bosch & Y. Ma (2011): New data sources for statistics: Experiences at Statistics Netherlands. Discussion paper 201109, Statistics Netherlands, The Hague/Heerlen, The Netherlands
  • P.J.H. Daas, M. Roos, M. Van de Ven & J. Neroni (2012): Twitter as a potential data source for statistics, Discussion Paper 201221, Statistics Netherlands, The Hague/Heerlen, The Netherlands
  • E. De Jonge, M. van Pelt & M. Roos (2012): Time patterns, geospatial clustering and mobility statistics based on mobile phone network data. Discussion paper 201214, Statistics Netherlands
  • F. De Meersman, G. Seynaeve, M. Debusschere,  P. Lusyne, P. Dewitte, Y. Baeyens, A. Wirthmann, C. Demunter, F. Reis & H.I. Reuter (2016): Assessing the Quality of Mobile Phone Data as a Source of Statistics. Presentation for the European Conference on Quality in Official Statistics 2016, Madrid, Spain
  • T. De Waal, M. Puts & P. Daas (2014): Statistical Data Editing of Big Data. Paper for the Royal Statistical Society 2014 International Conference, Sheffield, UK
  • E. Demidenko (2004): Mixed Models. Theory and Applications. New York: Wiley
  • C. Demunter & G. Seynaeve (2017): Better quality of mobile phone data based statistics through the use of signalling information – the case of tourism statistics, NTTS Conference, 13-17 March 2017 (paper and presentation download page)
  • J.A. Dever & R. Valliant (2006): A Comparison of Model-Based and Model-Assisted Estimators under Ignorable and Non-Ignorable Nonresponse. Proceedings of the Section on Survey Research Methods, Washington DC: American Statistical Association, 2938–2945
  • J. Deville & P. Lavallée (2006): Indirect sampling: The foundations of the generalized weight share method. Survey Methodology 32(2), 165–177
  • J.-C. Deville & C.E. Särndal (1992): Calibration estimators in survey sampling. Journal of the American Statistical Association 87(418), 376–382
  • P. Deville, C. Linard, S. Martin, M. Gilbert, F.R. Stevens, A.E. Gaughan, V.D. Blondel & A.J. Tatem (2014): Dynamic population mapping using mobile phone data, PNAS 111(45), 15888-15893
  • L. Di Consiglio & T. Tuoto (2017): Small area estimation in the presence of linkage error
  • Dialogic, Ministry of Economic affairs, Utrecht University (2008): Go with the dataflow! Analysing the Internet as a data source. Report for the Ministry of Economic affairs, version May 13th
  • P.J. Diggle, K.-Y. Liang & S.L. Zeger (1994): Analysis of Longitudinal Data. Oxford: Oxford University Press
  • L. Douglas (2012): The Importance of 'Big Data': A Definition. Gartner. Retrieved 21 June 2012.
  • Economist (2010): Data, data everywhere! Special report of the Economist, February 27
  • B. Efron  (2010): Large-scale inference: empirical Bayes methods for estimation, testing, and prediction. Institute of mathematical statistics monographs 1. Cambridge; New York: Cambridge University Press
  • B. Efron & R. Tibshirani  (1986): Bootstrap methods for standard errors, confidence intervals, and other measures of statistical accuracy. Statistical Science 1(1), 54–75
  • B. Efron & T. Hastie  (2016): Computer Age Statistical Inference: Algorithms, Evidence, and Data Science. Cambridge University Press
  • L. Einav & J. Levin  (2014): Economics in the age of big data. Science 346(6210), 715-721, DOI: 10.1126/science.1243089
  • C.K. Enders  (2010): Applied missing data analysis. Guilford Press
  • European Commission (2014): Feasibility Study on the Use of Mobile Positioning Data for Tourism Statistics, Eurostat
  • European Statistical System Committee (2013): Scheveningen Memorandum on Big Data and Official Statistics
  • European Statistical System Committee (2014): Big Data Action Plan and Roadmap
  • D. Evans & S. Bratton  (2012): Social Media Marketing: An Hour a Day. Sybex/Wiley and Sons 2nd edition
  • G. Eysenbach (2009): Infodemiology and infoveillance: Framework for an emerging set of public health informatics methods to analyze search, communication and publication behavior on the Internet. Journal of Medical Internet Research 11(1)
  • E. Fabrizi, N. Salvati,  M. Pratesi & N. Tzavidis (2014): Outlier robust model-assisted small area estimation. Biometrical Journal 56(1), 157–175
  • J. Fan, F. Han & H. Liu (2014): Challenges of Big data analysis. National Science Review 1(2), 293-314
  • R.E. Fay  (1996): Alternative paradigms for the analysis of imputed survey data. Journal of the American Statistical Association 91(434), 490–498
  • M. Feder & D. Pfeffermann  (2015): Statistical inference under non-ignorable sampling and non-response. University of Southampton.
  • S.E. Fienberg (1972): The multiple recapture census for closed populations and incomplete 2^k contingency tables. Biometrika 59(3), 591–603
  • P. Flach (2014): Machine Learning, the Art and Science of Algorithms that Make Sense of Data, 4th edition. Cambridge University Press, Cambridge, UK
  • L. Flekova & I. Gurevych (2013): Can We Hide in the Web? Large Scale Simultaneous Age and Gender Author Profiling in Social Media. Paper for the evaluation lab on uncovering plagiarism, authorship, and social software misuse at Conference and Labs Evaluation Forum 2013, September 23–26, Valencia, Spain
  • J. Fosen & L.-C. Zhang  (2011): The approach to quality evaluation of the micro-integrated employment statistics
  • J. Friedman, T. Hastie & R. Tibshirani (2001): The elements of statistical learning (Vol. 1) Springer series in statistics Springer, Berlin
  • B. Fry (2008): Visualizing Data: Exploring and Explaining Data with the Processing Environment. Sebastopol, CA: O'Reilly Media Inc.
  • A. Fyhrlund, B. Fridlund & B. Sundgren  (2005): Using Text Mining in Official Statistics, Knowledge Mining, Proceedings of the NEMIS 2004 Final Conference, Studies in Fuzziness and Soft Computing 185, 201-211
  • A. Gandomi & M. Haider (2015): Beyond the hype: Big data concepts, methods, and analytics. International Journal of Information Management 35(2), 137–144
  • A. Gelman & J. Hill (2009): Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press
  • A. Gelman  (2007): Struggles with Survey Weighting and Regression Modelling. Statistical Science 22(2), 153–164.
  • A. Ghazal, T. Rabl, M. Hu, F. Raab, M. Poess, A. Crolotte & H.-A. Jacobsen (2013): Big-Bench: Towards an industry standard benchmark for big data analytics. In Proceedings of the 2013 international conference on Management of data - SIGMOD '13. Association for Computing Machinery (ACM)
  • J.D. Gibbons & S. Chakraborti (2003): Nonparametric Statistical Inference, 4th Ed. CRC Press, New York, USA
  • J. Ginsberg, M.H. Mohebbi, R.S. Patel, L. Brammer, M.S. Smolinski & L. Brilliant (2009): Detecting influenza epidemics using search engine query data. Nature 457(7232), 1012–1014, doi:10.1038/nature07634
  • M. Glasson, J. Trepanier, V. Patruno, P. Daas, M. Skaliotis & A. Khan (2013): What does Big Data mean for Official Statistics? Paper for the High-Level Group for the Modernization of Statistical Production and Services.
  • S.A. Golder & M.W. Macy (2011): Diurnal and seasonal mood vary with work, sleep, and daylength across diverse cultures. Science 333, 1878-1881
  • V. van Grinsven & G. Snijkers (2015): Sentiments and Perceptions of Business Respondents on Social Media: an Exploratory Analysis. Journal of Official Statistics 31, 283–304
  • P. Groves, B. Kayyali, D. Knott & S.V. Kuiken (2013): The big data revolution in healthcare: Accelerating value and innovation. Research report, McKinsey and Company, Center for US Health System Reform; Business Technology Office
  • R.M. Groves (2011): Three Eras of Survey Research, Public Opinion Quarterly 75(5), 861-871
  • I. Guyon & A. Elisseeff (2003): An Introduction to Variable and Feature Selection. JMLR special issue on variable and feature selection 3, 1157–1182
  • G. Hager & G. Wellein (2010): Introduction to High Performance Computing for Scientists and Engineers, Boca Raton: Chapman and Hall/CRC Computational Science
  • M. Hahsler, B. Grun, K. Hornik & C. Buchta (2010): Introduction to arules – A computational environment for mining association rules and frequent item sets
  • A. Hajjem, F. Bellavance & D. Larocque  (2011): Mixed effects regression trees for clustered data. Statistics and Probability Letters 81(4), 451–459. Elsevier B.V
  • A. Hajjem, F. Bellavance & D. Larocque  (2014): Mixed-effects random forest for clustered data. Journal of Statistical Computation and Simulation 84(6), 1313–1328
  • T. Harford (2014): Big Data: are we making a big mistake? Significance 11(5), 14-19
  • I.A.T. Hashem, I. Yaqoob, N.B. Anuar, S. Mokhtar, A. Gani & S.U. Khan (2015): The rise of big data on cloud computing: Review and open research issues. Information Systems 47, 98–115
  • H. Hassani, G. Saporta & E. Sirimal Silva (2014): Data Mining and Official Statistics: The Past, the Present and the Future. Big Data 2, 1–10.
  • T. Hastie, R. Tibshirani & J. Friedman (2009): The Elements of Statistical Learning: Data Mining, Inference, and Prediction. 2nd ed. New York: Springer Science + Business Media, LLC
  • J.J. Heckman (1976): The common structure of statistical models of truncation, sample selection, and limited dependent variables and a simple estimator for such models. Annals of Economic and Social Measurement 5, 475–492
  • N. Heerschap (2014): Mobile phone data and other new sources for tourism statistics (in Dutch) Section 10.2, Statistics Netherlands book on Tourism, 158-168, The Hague, The Netherlands
  • N.M. Heerschap, S.A. Ortega Azurduy, A.H. Priem & M.P.W. Offermans (2014): Innovation of tourism statistics through the use of new Big Data sources, paper presented at the Global Forum on Tourism Statistics, Prague.
  • G.T. Heineman, G. Pollice & S. Selkow (2009): Algorithms in a Nutshell, a desktop quick reference. O'Reilly Media Inc. Sebastopol, USA
  • H. Herodotou, H. Lim, G. Luo, N. Borisov, L. Dong, F.B. Cetin & S. Babu (2011): Starfish: A self-tuning system for big data analytics 49
  • T. Hey, S. Tansley, K. Tolle  (2009): The Fourth Paradigm, Data-Intensive Scientific Discovery. Microsoft Research, Redmond, Washington, USA
  • M. Hildebrandt & S. Gutwirth (2013): Profiling the European Citizen. Cross Disciplinary Perspectives. Springer, Dordrecht, the Netherlands
  • E. Hoogteijling (2016): Modernisation of price collection at Statistics Netherlands. Presentation at the ESS Modernisation Workshop, 16–17 March, Bucharest
  • M. Houbiers (2004): Towards a Social Statistical Database and Unified Estimates at Statistics Netherlands. Journal of Official Statistics 20(1), 55–75
  • M. Houbiers, P. Knottnerus, A.H. Kroese, R.H. Renssen & V. Snijders (2003): Estimating consistent table sets: position paper on repeated weighting. Statistics Netherlands, Discussion paper 3005, 2003
  • H. Hu, Y. Wen, T.-S. Chua & X. Li (2014): Toward scalable systems for big data analytics: A technology tutorial. IEEE Access 2, 652–687
  • Y. Hu, J. Fowler Wood, V. Smith & N. Westbrook (2004): Friendships through IM: Examining the relationship between Instant Messaging and Intimacy, Journal of Computer-Mediated Communication 10(1)
  • B. Hulliger, R. Lehtonen, R. Münnich, P. Jacques, European Commission & Eurostat (2012): Analysis of the Future Research Needs for Official Statistics. Luxembourg: Publications Office
  • Internet statistics guide (2002): Complete Guide to Internet Statistics and Research
  • M. Ito et al. (2010): Hanging Out, Messing Around, and Geeking Out: Kids Living and Learning with New Media
  • B. Janssen (2010): Web data collection for household surveys at Statistics Netherlands. Internal Report CBS
  • A. Java, X. Song, T. Finin, F. Tseng (2007): Why we twitter: understanding microblogging usage and communities. Proceedings of the 9th WebKDD and 1st SNA-KDD 2007 workshop on Web mining and social network analysis, ACM New York, USA
  • J. Jonas  (2012): Interview: Data protection challenge of the future: Big Data. Data Protection Law and Policy Newsletter 9(7)
  • S. Jones (1998): Doing Internet Research: Critical Issues and Methods for Examining the Net. Sage Publications, Inc. California, USA
  • K.D. Bell (2011): Comparing methods for estimation of daytime population in Downtown Indianapolis, Indiana, Master of Science thesis, Dept. Geography, Indiana University
  • C. Kadushin (2012): Understanding Social Networks: Theories, Concepts, and Findings. Oxford University Press, New York, USA
  • A. M. Kaplan & M. Haenlein (2010): Users of the world, unite! The challenges and opportunities of social media, Business Horizons 53(1), 59-68
  • G. Kim & R. Chambers (2012): Regression analysis under incomplete linkage. Computational Statistics and Data Analysis 56(9), 2756–2770
  • R. Kitchin (2013): Big data and human geography: Opportunities, challenges and risks. Dialogues in Human Geography 3(3), 262–267
  • R. Kitchin (2014): Big data, new epistemologies and paradigm shifts. Big Data and Society 1(1), 1–12
  • R. Kitchin (2015): The opportunities, challenges and risks of big data for official statistics. Statistical Journal of the IAOS 31(3), 471-481, DOI: 10.3233/SJI-150906
  • R. Kitchin & G. McArdle (2016): What makes big data, big data? exploring the ontological characteristics of 26 datasets. Big Data and Society 3(1), 1–10
  • S.-A. Knight & J. Burn (2005): Developing a framework for assessing information quality on the World Wide Web, Informing Science J. 8, 159-172
  • A.D.I. Kramer, J.E. Guillory & J.T. Hancock (2014): Experimental evidence of massive-scale emotional contagion through social networks. PNAS 111(24), 8788-8790
  • T. Kraska (2013): Finding the needle in the big data systems haystack. IEEE Internet Computing 17(1), 84–86
  • H. Kwak, C. Lee, H. Park & S. Moon (2010): What is Twitter, a Social Network or a News Media? In: Proceedings of the 19th international conference on World wide web, ACM New York, NY, USA, 591-600
  • P. Lahiri & M.D. Larsen (2005): Regression Analysis with Linked Data. Journal of the American Statistical Association 100(469), 222–230
  • D. Laney (2013): 3D data management: Controlling data volume, velocity and variety. META Group, Application Delivery Strategies (February 2001) (949)
  • T. Lansdall-Welfare, V. Lampos & N. Cristianini (2012): Nowcasting the mood of the nation, Significance: Big Data special issue 9(4), 26-28
  • P. Lavallée (2009): Indirect sampling (Vol. 7397) Springer Science and Business Media
  • P. Lavallée (2015): Sample matching: Toward a probabilistic approach for web surveys and big data?
  • D. Lazer, R. Kennedy, G. King & A. Vespignani (2014): The parable of Google Flu: traps in big data analysis. Science 343, 1203–1205
  • D. Lazer, A. Pentland, L. Adamic, S. Aral, A.L. Barabási, D. Brewer & T. Jebara (2009): Computational social science. Science 323, 721
  • S. Lee (2006): Propensity score adjustment as a weighting scheme for volunteer panel web surveys. Journal of Official Statistics 22(2), 329
  • R. Lehtonen & A. Veijanen (2016): Estimation of poverty rate and quintile share ratio for domains and small areas. In: Alleva G. and Giommi A. (eds.) Topics in Theoretical and Applied Statistics, New York: Springer, 153–165
  • R. Lehtonen & A. Veijanen (2016): Model-assisted methods for small area estimation of poverty indicators. In: Pratesi M. (ed.) Analysis of Poverty Data by Small Area Estimation. Chichester: Wiley, 109–127
  • R. Lehtonen & A. Veijanen (2012): Small area poverty estimation by model calibration. Journal of the Indian Society of Agricultural Statistics 66(1), 125–133
  • J.M. Lepkowski, C. Tucker, J.M. Brick, E.D. De Leeuw, L. Japec, P.J. Lavrakas, M.W. Link et al. (Eds.) (2007): Advances in telephone survey methodology (Vol. 538) John Wiley and Sons
  • J. Leskovec, A. Rajaraman & J.D. Ullman (2014): Mining of Massive Datasets, 2nd edition. Cambridge University Press, Cambridge, UK
  • R. Little (2012): Calibrated Bayes: an Alternative Inferential Paradigm for Official Statistics (with discussion and rejoinder) Journal of Official Statistics 28(3), 309–372
  • R.J. Little (2015): Calibrated Bayes, an inferential paradigm for official statistics in the era of big data. Statistical Journal of the IAOS 31(4), 555–563. IOS Press
  • A. Llorente, M. Garcia-Herranz, M. Cebrian & E. Moro (2015): Social Media Fingerprints of Unemployment. PLoS ONE 10(5), e0128692. doi:10.1371/journal.pone.0128692
  • W.-Y. Loh (2014): Fifty years of classification and regression trees. International Statistical Review 82(3), 329–348. Wiley Online Library
  • S. Lohr (2009): Sampling: Design and analysis. Cengage Learning
  • S. Lohr & J. Brick (2012): Blending domain estimates from two victimization surveys with possible bias. Canadian Journal of Statistics 40(4), 679–696
  • London Workshop (2014): Statistics and Science, report on the London Workshop on the Future of the Statistical Sciences
  • S. Lundström & C.-E. Särndal (1999): Calibration as a standard method for treatment of nonresponse. Journal of Official Statistics 15(2), 305–328
  • C. Lynch (2008): Big Data: How Do Your Data Grow? Nature 455, 28–29, doi:10.1038/455028a
  • Y. Ma  (2016): The Use of Advanced Transportation Monitoring Data for Official Statistics. Doctoral dissertation. Erasmus University, Rotterdam
  • J. Manyika, M. Chui, B. Brown, J. Bughin, R. Dobbs, C. Roxburgh & A. Hung Byers (2011): Big Data: The Next Frontier for Innovation, Competition, and Productivity. Report of the McKinsey Global Institute, McKinsey and Company
  • G. Manzi, D.J. Spiegelhalter, R.M. Turner, J. Flowers & S.G. Thompson (2011): Modelling bias in combining small area prevalence estimates from multiple surveys. Journal of the Royal Statistical Society. Series A (Statistics in Society) 174(1), 31–50
  • S. Marchetti, C. Giusti, M. Pratesi, N. Salvati, F. Giannotti, D. Pedreschi, S. Rinzivillo, L. Pappalardo &  L. Gabrielli (2015): Small area model-based estimators using big data sources. Journal of Official Statistics 31(2), 263–281
  • D. Marr (1982): Vision. A Computational Investigation into the Human Representation and Processing of Visual Information. The MIT Press, Cambridge, Massachusetts, USA
  • W.H. Hsu, J. Lancaster, M.S.R. Paradesi & T. Weninger: Structural Link Analysis from User Profiles and Friends Networks: A Feature Construction Approach. Department of Computing and Information Sciences, Kansas State University
  • V. Mayer-Schönberger & K. Cukier  (2013): Big Data: A Revolution That Will Transform How We Live, Work and Think. John Murray, London, UK
  • D. McFadden, S. Cosslett, G. Duguay & W.S. Jung (1977): Demographic data for policy analysis. Urban Travel Demand Forecasting Project, Final Report, Volume VIII, Institute of Transportation Studies, University of California, Berkeley
  • McKinsey Global Institute (2011): Internet matters: The Net's sweeping impact on growth, jobs, and prosperity
  • D. McNicol (1972): A Primer of Signal Detection Theory. George Allen and Unwin LTD., London, UK
  • G. Miller (2011): Social Scientists Wade Into the Tweet Stream. Science 333(6051), 1814-1815
  • Y. Moreno, R. Pastor-Satorras & A. Vespignani  (2002): Epidemic outbreaks in complex heterogeneous networks. Eur. Phys. J. B 26, 521-529
  • J.M.F. Moura (2009): What Is Signal Processing? President's Message, IEEE Signal Processing Magazine 26, 6, doi:10.1109/MSP.2009.934636
  • M.-P. Kwan (2016): Algorithmic Geographies: Big Data, Algorithmic Uncertainty, and the Production of Geographic Knowledge, Annals of the Association of American Geographers, March 2016
  • J. Murphy, A. Kim, H. Hagood, A. Richards, C. Augustine, L. Kroutil & A. Sage (2011): Twitter feeds and Google search query surveillance: can they supplement survey data collection? In Shifting the boundaries of research, edited by D. Birks et al., Proceedings of the sixth ASC International Conference, Bristol, Association for Survey Computing
  • K.P. Murphy (2012): Machine Learning: A Probabilistic Perspective. MIT Press, Cambridge, USA
  • J. Nagler & J.A. Tucker (2015): Drawing Inferences and Testing Theories with Big Data. Paper for the American Political Science Association, Jan. pp 84-88
  • National Institute of Economic and Social Research and Growth Intelligence (2013): Measuring the UK's digital economy with big data
  • National Research Council (2013): Frontiers in Massive Data Analysis. National Academies Press, Washington D.C., USA
  • C.B. Ng, Y.H. Tay & B.M. Goi (2012): Vision-based human gender recognition: a survey. arXiv:1204.1611
  • D. Nguyen, R. Gravel, D. Trieschnigg & T. Meder (2013): How old do you think I am?: A study of language and age in Twitter. In: Proceedings of the seventh international AAAI conference on weblogs and social media. AAAI Press, Palo Alto, CA, USA
  • Nixon, M.S., Aguado, A.S. (2012): Feature Extraction and Image Processing for Computer Vision, 3rd edition. Academic Press, Oxford, UK
  • D.J. Nordman & S.N. Lahiri (2004): On optimal spatial subsample size for variance estimation. Annals of Statistics, 1981–2027. JSTOR
  • B. O'Connor, R. Balasubramanyan, B.R. Routledge & N.A. Smith (2010): From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series. Proceedings of the Fourth International AAAI Conference on Weblogs and Social Media, May 23-26, Washington DC, USA
  • M. Offermans & M. Tennekes (2014): Mobile Phone Metadata: A New Source for Official Statistics. Presentation at the 2014 Joint Statistical Meeting (JSM) Boston, USA
  • C. O'Neil & R. Schutt (2013): Doing Data Science: Straight Talk from the Frontline. O'Reilly Inc. USA
  • N. Oostdijk, A. Hürriyetoglu, M. Puts, P. Daas & A. van den Bosch (2016): Information extraction from the social media: a linguistically motivated approach. Paper for the TALN 2016 workshop, 23rd French Conference on Natural Language Processing, session Risk and NLP: detection, prevention, management, Paris, France
  • L. Oostrom, A.N. Walker, B. Staats, M. Slootbeek-Van Laar, S. Ortega Azurduy & B. Rooijakkers (2016): Measuring the internet economy in The Netherlands: a big data analysis. Discussion paper 201614. Statistics Netherlands, The Hague/Heerlen, The Netherlands
  • Organisation for Economic Co-operation and Development (2013): Measuring the Internet Economy: A Contribution to the Research Agenda, OECD Digital Economy Papers, No. 226, OECD Publishing
  • A.B. Owen (2001): Empirical likelihood. CRC press
  • A.B. Owen (2013): Self-concordance for empirical likelihood. Canadian Journal of Statistics 41(3), 387–397
  • B. Pang & L. Lee (2008): Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval 2(1-2), 1-135
  • J.W. Pennebaker, M.E. Francis & R.J. Booth (2001): Linguistic Inquiry and Word Count: LIWC2001
  • D. Pfeffermann, J.L. Eltinge & L.D. Brown (2015): Methodological issues and challenges in the production of official statistics. Journal of Survey Statistics and Methodology 3, 425–483
  • D. Pfeffermann (2011): Modelling of complex survey data: Why model? Why is it a problem? How can we approach it. Survey Methodology 37(2), 115–136
  • D. Pfeffermann & M. Sverchkov (2003): Fitting generalized linear models under informative sampling. Analysis of Survey Data, 175–195. Wiley, New York, USA
  • D. Pfeffermann & M. Sverchkov (2007): Small-Area Estimation under Informative Probability Sampling of Areas and Within the Selected Areas. Journal of the American Statistical Association 102(480), 1427–1439
  • G. Pickering, J.M. Bull & D.J. Sanderson (1995): Sampling power-law distributions. Tectonophysics 248, 1-20
  • J.R. Pierce (1980): An introduction to Information Theory, Symbols, Signals and Noise, 2nd edition. Dover Publications Inc. NY, USA
  • M. Pratesi, editor (2016): Analysis of Poverty Data by Small Area Estimation. Wiley
  • M. Prensky (2001): Digital natives, digital immigrants. On the Horizon 9(5)
  • J. Prinz (2004): Which Emotions Are Basic? In: D. Evans and P. Cruse (Eds.) Emotion, Evolution, and Rationality, Oxford University Press, UK, 69-88
  • M. Puts & P. Daas (2015): Editing Big Data: an holistic approach. Paper for the Work Session on Statistical Data Editing, United Nations Economic Commission for Europe, Budapest, Hungary
  • M. Puts, P. Daas & T. de Waal (2015): Finding Errors in Big Data. Significance 12(3), 26-29, DOI: 10.1111/j.1740-9713.2015.00826.x
  • M. Puts, P. Daas & T. de Waal (2017): Finding Errors in Big Data. In: M. Pitici (ed.) The Best Writing on Mathematics 2016, pp. 291-299. Princeton University Press, Princeton, USA
  • M. Puts, P. Daas & M. Tennekes (2015): High frequency road sensor data for official statistics. Abstract for the New Techniques and Technologies for Statistics conference, Brussels, Belgium
  • M. Puts, M. Tennekes, P.J.H. Daas & C. de Blois (2016): Using huge amounts of road sensor data for official statistics. Paper for the European Conference on Quality in Official Statistics 2016, Madrid, Spain
  • A. Rajaraman & J.D. Ullman (2011): Mining of Massive Datasets. Cambridge: Cambridge University Press
  • J.N.K. Rao (1996): On variance estimation with imputed survey data. Journal of the American Statistical Association 91(434), 499–506
  • J.N.K. Rao & I. Molina (2015): Small area estimation, 2nd ed. Wiley
  • J.N.K. Rao & C. Wu (2010): Pseudo–Empirical Likelihood Inference for Multiple Frame Surveys. Journal of the American Statistical Association 105(492), 1494–1503
  • T. Rao & S. Srivastava (2012): Twitter Sentiment Analysis: How To Hedge Your Bets In The Stock Markets. http://arxiv.org/pdf/1212.1107.pdf
  • C. Reep & B. Buelens (2015): Complementing official health statistics with internet search indices. Discussion paper 201508, Statistics Netherlands, The Netherlands
  • C. Reilly, A. Gelman & J. Katz (2001): Poststratification without Population Level Information on the Poststratifying Variable with Application to Political Polling. Journal of the American Statistical Association 96(453), 1–11
  • P. Rey del Castillo (2012): Use of machine learning methods to impute categorical data. Conference on European Statistics
  • F. Ricciato, P. Widhalm, M. Craglia & F. Pantisano (2015): Estimating Population Density Distribution from Network-based Mobile Phone Data, JRC Technical Report
  • M.K. Riddles (2013): Propensity score adjusted method for missing data (PhD Thesis) Iowa State University
  • D. Rivers (2007): Sampling for web surveys. In: Joint Statistical Meetings
  • N. Robin, T. Klein & J. Jütting (2016): Public-private partnerships for statistics: lessons learned, future steps. Development Co-operation Working Paper 27. OECD, Paris
  • R.S. Bivand, E.J. Pebesma & V. Gómez-Rubio (2013): Applied Spatial Data Analysis with R, 2nd edition. Springer
  • M. Roos, P. Daas & M. Puts (2009): Innovative data collection: new sources and opportunities (in Dutch) Discussion paper 09027, Statistics Netherlands, Heerlen
  • P.R. Rosenbaum & D.B. Rubin (1983): The central role of the propensity score in observational studies for causal effects. Biometrika 70(1), 41–55. Biometrika Trust
  • D.B. Rubin (1976): Inference and missing data. Biometrika 63(3), 581–592. Biometrika Trust
  • D.B. Rubin (1987): Multiple imputation for nonresponse in surveys (Vol. 81) John Wiley and Sons
  • D.B. Rubin (1996): Multiple imputation after 18+ years. Journal of the American Statistical Association 91(434), 473–489. Taylor and Francis
  • N. Salvati, N. Tzavidis, M. Pratesi & R. Chambers (2012): Small area estimation via M-quantile geographically weighted regression. Test 21(1), 1–28
  • K. Samart (2011): Analysis of probabilistically linked data. PhD thesis, School of Mathematics and Applied Statistics, University of Wollongong
  • K. Samart & R. Chambers (2014): Linear Regression with Nested Errors Using Probability-Linked Data. Australian and New Zealand Journal of Statistics 56(1), 27–46
  • G. Saporta (2000): Data Mining and Official Statistics, paper presented at Quinta Conferenza Nazionale di Statistica, Rome, Italy
  • C.-E. Särndal (2007): The calibration approach in survey theory and practice. Survey Methodology 33(2), 99–119
  • C.-E. Särndal & S. Lundström (2005): Estimation in surveys with nonresponse. John Wiley and Sons
  • R. Sayre (2013): Research and Technology Explosion in the Scale-out Storage Era
  • R. Schutt & C. O'Neil (2013): Doing Data Science: Straight Talk from the Frontline. Sebastopol, CA: O'Reilly Media
  • R.J. Sela & J.S. Simonoff (2012): RE-EM trees: a data mining approach for longitudinal and clustered data. Machine Learning 86, 169–207
  • S.A. Shaikh (2011): Measures derived from a 2 x 2 table for an accuracy of a diagnostic test. Journal of Biometrics and Biostatistics 2(5)
  • C. Shannon (1948): A Mathematical Theory of Communication. Bell System Technical Journal 27, 379–423 and 623–656
  • J. Shao (2003): Impact of the bootstrap on sample surveys. Statistical Science 18(2), 191–198
  • C. Shirky (2008): Here Comes Everybody. Penguin Press, New York
  • K.E. Shirley (2015): Hierarchical models for estimating state and demographic trends in US death penalty public opinion, 1–28
  • S. Signorelli (2016): The Use of Big Data in Official Statistics, PhD thesis University of Bergamo, Italy
  • R. Silipo, I. Adea, A. Hart & M. Berthold (2015): Seven Techniques for Data Dimension Reduction. Whitepaper
  • N. Silver (2012): The Signal and the Noise: Why So Many Predictions Fail - but Some Don't. Penguin Group, New York, USA
  • D. Singh & C.K. Reddy (2014): A survey on platforms for big data analytics. Journal of Big Data, 1-8, DOI: 10.1186/s40537-014-0008-6
  • G. Snijkers, G. Haraldsen, M. Luppes, P. Daas, J. Erikson & L.-C. Zhang (2014): Quality Challenges in Modernising Official Business Statistics. Paper and presentation for the European Conference on Quality in Official Statistics 2014, Vienna, Austria
  • S. Soroka, L. Young & M. Balmas (2015): Bad News or Mad News? Sentiment Scoring of Negativity, Fear, and Anger in News Content. In: D.V. Shah, J.N. Cappella and W.R. Neuman (Eds.) Towards Computational Social Science: Big Data in Digital Environments, The Annals of the American Academy of Political and Social Science 659, 108-121
  • SportLaw (2012): Socialympics: How Sports Organizations and Athletes used Social Media at London 2012. Located at: http://www.sportlaw.ca/wp-content/uploads/2013/01/Social-Media-and-the-Games.pdf
  • Staff, National Research Council (1996): Massive Data Sets: Proceedings of a Workshop. Washington: National Academies Press
  • P. Steffens (2016): Measuring safety using social media: an applied sentiment analysis through the use of text mining. MSc thesis. University of Maastricht, Maastricht, the Netherlands
  • J. Sterne (2010): Social Media Metrics: How to Measure and Optimize Your Marketing Investment. John Wiley and Sons Inc., Hoboken, USA
  • P. Struijs, B. Braaksma & P. Daas (2014): Official statistics and Big Data. Big Data and Society, April–June, 1–6, DOI: 10.1177/2053951714538417
  • P. Struijs & P. Daas (2014): Quality Approaches to Big Data in Official Statistics. Paper and presentation for the European Conference on Quality in Official Statistics 2014, Vienna, Austria
  • P. Struijs & P.J.H. Daas (2013): Big Data, Big Impact? Paper and presentation for the Seminar on Statistical Data Collection, Geneva, Switzerland
  • Sutradhar, Rao & Pandit (2010): Inferences in longitudinal mixed models for survey data. Journal of the Indian Society of Agricultural Statistics, a special issue in Memory of Dr. G. R. Seth, 64, 177–189
  • S.-M. Tam & F. Clarke (2015): Big data, official statistics and some initiatives by the Australian Bureau of Statistics. International Statistical Review 83, 436–448
  • M. Tennekes & M. Offermans (2014): Daytime population estimations based on mobile phone metadata. Paper for the Joint Statistical Meetings, 2–7 August, Boston, MA
  • M. Tennekes, E. de Jonge & P. Daas (2013): Visualizing and Inspecting Large Datasets with Tableplots. Journal of Data Science 11, 43-58
  • M. Tennekes, E. de Jonge & P.J.H. Daas (2012): Innovative visual tools for data editing. Paper and presentations for the United Nations Economic Commission for Europe (UNECE) Work Session on Statistical Data Editing, Oslo, Norway
  • M. Tennekes & M. Puts (2015): Projection of road sensors to the Dutch road network, Abstract for the New Techniques and Technologies for Statistics conference, Brussels, Belgium
  • E. Tromp (2011): Multilingual Sentiment Analysis on Social Media. Master thesis, TU Eindhoven, July 16
  • T. Trump (2010): Types of twitter users. Presentation for the General online Research conference 2010, Pforzheim, Germany
  • T. Tuoto, D. Fusco & L. Di Consiglio (2016): Exploring solutions for linking Big Data in Official Statistics. SIS 2016, Conference proceedings ISBN: 9788861970618
  • N. Tzavidis, M.G. Ranalli, N. Salvati, E. Dreassi & R. Chambers (2015): Robust small area prediction for counts, Statistical methods in medical research 24(3), 373–395
  • UN Global Pulse (2012): Big Data for Development: Challenges and Opportunities, Version May
  • UN Global Pulse (2015): Analysing Social Media to understand Public Perceptions of Sanitation
  • C. Vaccari (2014): Big Data in Official Statistics, Thesis University of Camerino, Italy
  • M. Van de Ven (2011): Twitter message classification for national statistics, Thesis, draft version, Erasmus University of Rotterdam
  • J. Van den Brakel, E. Söhler, P. Daas & B. Buelens (2016): Social media as a data source for official statistics; the Dutch Consumer Confidence Index, Discussion paper 201601, Statistics Netherlands, The Hague/Heerlen, The Netherlands
  • J. Van den Brakel, E. Söhler, P. Daas & B. Buelens (2017): Social media as a data source for official statistics; the Dutch Consumer Confidence Index, Survey Methodology, Accepted for publication
  • A.J. van Strien, C.A.M. van Swaay & T. Termaat (2013): Opportunistic citizen science data of animal species produce reliable estimates of distribution trends if analysed with occupancy models. Journal of Applied Ecology 50, 1450–1458
  • L. Velikovich, S. Blair-Goldensohn, K. Hannan & R. McDonald (2010): The viability of web-derived polarity lexicons. Proceedings of Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics, 777–785
  • E.F. Vonesh (2012): Generalized Linear and Nonlinear Models for Correlated Data. Theory and Applications Using SAS. SAS Institute
  • A. Wallgren & B. Wallgren (2014): Register-based Statistics, 2nd edition. Wiley Series in Survey Methodology. John Wiley and Sons, Inc.
  • S.F. Wamba, S. Akter, A. Edwards, G. Chopin & D. Gnanzou (2015): How big data can make big impact: Findings from a systematic review and a longitudinal case study. International Journal of Production Economics 165, 234–246
  • D.J. Wang, X. Shi, D.A. McFarland & J. Leskovec (2012): Measurement error in network data: A re-classification. Social Networks 34(4), 396-409
  • W. Wang, D. Rothschild, S. Goel & A. Gelman (2015): Forecasting elections with non-representative polls. International Journal of Forecasting 31(3), 980–991
  • J.S. Ward & A. Barker (2013): Undefined by data: A survey of big data definitions
  • A. Wesolowski, N. Eagle, A.M. Noor, R.W. Snow & C.O. Buckee (2013): The impact of biases in mobile phone ownership on estimates of human mobility
  • K.M. Wolter (1986): Some coverage error models for census data. Journal of the American Statistical Association 81(394), 337–346
  • C. Wu (2005): Algorithms and R codes for the pseudo empirical likelihood method in survey sampling. Survey Methodology 31(2), 239–243
  • C. Wu & W.W. Lu (2016): Calibration Weighting Methods for Complex Surveys. International Statistical Review 84(1), 79–98
  • C. Wu & R.R. Sitter (2001): A model-calibration approach to using complete auxiliary information from survey data. Journal of the American Statistical Association 96(453), 185–193. Taylor and Francis
  • L. Wu & E. Brynjolfsson (2009): The future of prediction: how Google searches foreshadow housing prices and quantities, ICIS 2009 Proceedings, 147
  • C. Yang & P. Srinivasan (2016): Life Satisfaction and the Pursuit of Happiness on Twitter. PLoS ONE 11(3) e0150881 DOI: 10.1371/journal.pone.0150881
  • L.M.R. Ybarra & S.L. Lohr (2008): Small area estimation when auxiliary information is measured with error, Biometrika 95(4), 919–931
  • F. Zabala (2015): Let the data speak: Machine learning methods for data editing and imputation. Conference on European Statistics
  • L.-C. Zhang (2011): A Unit-Error Theory for Register-Based Household Statistics. Journal of Official Statistics 27(3), 415–432
  • L.-C. Zhang (2012): On the accuracy of register-based census employment statistics
  • L.-C. Zhang (2015): On modelling register coverage errors. Journal of Official Statistics 31(3), 381–396
  • L.-C. Zhang, I. Thomsen & Ø. Kleven (2013): On the Use of Auxiliary and Paradata for Dealing with Non-sampling Errors in Household Surveys, International Statistical Review 81(2), 270–288
  • L.-C. Zhang (2012): Topics of statistical theory for register-based statistics and data integration, Statistica Neerlandica 66, 41–63
  • P.C. Zikopoulos, C. Eaton, D. de Roos, T. Deutsch & G. Lapis (2012): Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data. McGraw Hill Enterprises, New York, USA
  • A. Zwitter (2014): Big Data Ethics. Big Data and Society 1(2), 205395171455925. doi:10.1177/2053951714559253