Flexible infrastructure for next generation of statisticians
Abstract
Statistical organisations all over the world have a continuing need for flexible environments in which statisticians and data scientists can use the latest technologies to create innovative products. Several initiatives have been launched around the globe to meet this need, to name just a few:
- Data science campus, ONS (https://datasciencecampus.ons.gov.uk)
- Advanced Analytics Workspace, StatCan (https://statcan.github.io/daaas/en/)
- Onyxia, INSEE (https://github.com/inseefrlab/onyxia)
- EC Data Platform (https://ec-europa.github.io/digit-dataplatform/)
- Cloudlab, Eurostat (https://github.com/eurostat/datalab)
Many of these projects use cloud-based technologies and are built on open source solutions so that the statistical community can reuse them. The session would review some of these initiatives, presenting the guiding principles behind their technological choices and the limitations of the different solutions. It would also provide a chance to discuss best practices, new trends and possible opportunities for National Statistical Institutes and the international scientific community to collaborate in order to realise economies of scale.
The session can be divided into two parts: a series of brief presentations by the speakers, followed by a panel discussion among the speakers moderated by a discussant or the chair. Alternatively, depending on the number of speakers, the session could consist only of presentations of ongoing projects, with no panel discussion.
Potential speakers are the developers or project managers of the solutions listed above, and the discussant can be one of the members of the UNECE 2022 Machine Learning group working on IT infrastructure (https://statswiki.unece.org/display/ML/Machine+Learning+Group+2022).
Submissions
- A practical application of flexible cloud infrastructure to empower analysts at Statistics Canada
- BALSAM – a template for creating data hubs
- Flexible infrastructure for next generation of statisticians - Eurostat presentation
- Onyxia: an open source cloud native data science platform
- Technology, big data, data science and the influence these have in producing modern and high-quality statistics
- Cloud agnostic datalab – a proof of concept for a flexible open source solution for official statistics