Marco Combetto

AI & Digital Transformation — Public Sector · AI Act · Data Science

Data Science and Data Engineering

End-to-end data services for public administrations, research institutions, and EU-funded projects, spanning everything from raw-data pipeline architecture to machine-learning-ready analytics.

Services include:

  1. Data Pipeline Design and Engineering
    Architecture and implementation of scalable ETL/ELT pipelines using modern tooling (Python, SQL, Apache Spark, dbt).
    Integration across heterogeneous sources: databases, REST APIs, open data portals, and legacy systems.
  2. Data Lake and Data Warehouse Architecture
    Design of cloud-native or hybrid data platforms (Azure, AWS, on-premises) aligned with public-sector security and sovereignty requirements.
    Schema design, partitioning strategies, and cost-optimised storage tiers for large datasets.
  3. Exploratory Data Analysis and Statistical Modelling
    Descriptive and inferential analysis to surface trends, anomalies, and actionable patterns in complex datasets.
    Reproducible analytical workflows using Jupyter, pandas, and R for evidence-based policy support.
  4. Machine Learning and Predictive Analytics
    Selection, training, and validation of supervised and unsupervised models for classification, forecasting, and clustering tasks.
    Deployment of explainable AI (XAI) pipelines designed to support EU AI Act transparency requirements.
  5. Data Visualisation and BI Dashboards
    Interactive dashboards and reports (Power BI, Grafana, Plotly/Dash) for decision-makers and operational teams.
    KPI frameworks and data storytelling to translate findings into governance decisions.
  6. Data Governance and Quality
    Definition of data dictionaries, lineage documentation, and quality rules for FAIR (Findable, Accessible, Interoperable, Reusable) datasets.
    Advisory on GDPR-aligned data handling and open-data publication workflows.

Deliverables

  • Documented data-pipeline architecture and code repositories
  • Data lake / data warehouse schemas and runbooks
  • Analytical reports and reproducible notebooks
  • Trained and validated ML models with performance benchmarks
  • Interactive dashboards and KPI frameworks
  • Data governance documentation and quality scorecards

Why Data Science and Engineering for the Public Sector?

Public institutions generate vast amounts of data, from administrative records and IoT sensors to satellite imagery and citizen surveys. Turning that raw data into reliable intelligence requires both sound engineering (so the data flows cleanly and consistently) and scientific rigour (so the models and statistics are trustworthy). With experience across EU research projects, national digital-transformation programmes, and open-data initiatives, I bridge the gap between data infrastructure and actionable insight.