Overview
This is a 6-month Data Engineer contract, inside IR35, based in London with remote working. It requires expertise in SQL, Python, geospatial data, and modern data tooling, with a focus on data transformation within the Breast Screening Pathway team.
Location: London/Remote, United Kingdom
Role Details
Contract: 6 months initially
IR35: Inside IR35
Work arrangement: Hybrid
Responsibilities
- Build secure, repeatable data ingestion and transformation pipelines for the Breast Screening Pathway team.
- Implement data cleansing rules and produce auditable, reproducible outputs.
- Analyse existing datasets spread across multiple system instances to identify data issues and to shape the target datasets and structures.
- Establish import/export patterns; handle data extracts, schema discovery, and incremental loads; and work with multiple source instances.
- Develop transformation-heavy pipelines covering data profiling, cleansing, standardisation, conformance, and publishing.
- Utilise advanced SQL for profiling, joins/merges, deduplication, anomaly detection, and performance tuning (see the deduplication sketch after this list).
- Write maintainable Python code for automation, parsing, rules engines, and data quality checks; work with data wrangling packages (Pandas, Polars) and visualisation tools (Matplotlib).
- Apply modern data tooling (e.g., Spark, Azure Data Factory) or equivalent implementations.
- Work with geospatial data (formats such as GeoJSON and shapefiles; coordinate reference systems (CRS); spatial analysis workflows) and apply geographical context in data processing pipelines.
- Work with publicly available official datasets such as ONS open data products (census boundaries, lookups, indices, population estimates).
- Build rules for completeness/validity/consistency and implement exception handling (see the data-quality sketch after this list).
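
To illustrate the advanced-SQL deduplication expectation above, here is a minimal, self-contained sketch using Python's built-in sqlite3 (which supports window functions from SQLite 3.25). The table and column names (screening_records, nhs_number, source_instance) are invented for illustration and do not come from the role description:

```python
import sqlite3

# Minimal sketch of SQL-based deduplication with a window function.
# Schema and values below are illustrative assumptions only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE screening_records (
        nhs_number TEXT,
        screening_date TEXT,
        source_instance TEXT
    );
    INSERT INTO screening_records VALUES
        ('111', '2024-01-10', 'instance_a'),
        ('111', '2024-01-10', 'instance_b'),  -- duplicate from a second source
        ('222', '2024-02-03', 'instance_a');
""")

# Keep one row per (nhs_number, screening_date), choosing a deterministic
# winner so the output is reproducible across runs.
deduped = conn.execute("""
    SELECT nhs_number, screening_date, source_instance
    FROM (
        SELECT *,
               ROW_NUMBER() OVER (
                   PARTITION BY nhs_number, screening_date
                   ORDER BY source_instance
               ) AS rn
        FROM screening_records
    )
    WHERE rn = 1
""").fetchall()

for row in deduped:
    print(row)
```

The same PARTITION BY / ROW_NUMBER pattern carries over to warehouse engines; SQLite is used here only to keep the sketch runnable without infrastructure.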
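For the data-quality bullet, a minimal pandas sketch of rule-based completeness/validity checks, collecting failures into an auditable exceptions table rather than raising on the first error. Field names, the postcode regex, and the age range are assumptions for illustration:

```python
import pandas as pd

# Minimal sketch of rule-based data quality checks.
# Columns and thresholds are illustrative assumptions only.
records = pd.DataFrame({
    "nhs_number": ["111", None, "333"],
    "postcode": ["SW1A 1AA", "XX", "EC1A 1BB"],
    "age": [52, 49, 130],
})

# Each rule is a boolean Series marking rows that FAIL the check.
rules = {
    "nhs_number_missing": records["nhs_number"].isna(),
    "postcode_invalid": ~records["postcode"].str.match(
        r"^[A-Z]{1,2}\d[A-Z\d]? ?\d[A-Z]{2}$", na=True
    ),
    "age_out_of_range": ~records["age"].between(0, 120),
}

# Exceptions table: one row per (record, failed rule), suitable for audit.
exceptions = pd.concat(
    [records.loc[mask].assign(rule=name) for name, mask in rules.items()],
    ignore_index=True,
)
print(exceptions)

# Rows passing every rule flow on to standardisation/publishing.
clean = records.loc[~pd.concat(rules.values(), axis=1).any(axis=1)]
```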
Qualifications
- Strong experience in data transformation-focused pipelines, data profiling, cleansing, standardisation, conformance, and publishing.
- Advanced SQL knowledge for profiling, joins/merges, deduplication, anomaly detection, and performance tuning.
- Python scripting for automation, parsing, rules engines, and data quality checks, with an emphasis on maintainable code.
- Experience with Python data wrangling packages (e.g., Pandas, Polars), modelling (e.g., scikit-learn), and visualisation (e.g., Matplotlib).
- Experience with modern data tooling (e.g., Spark, Azure Data Factory) or equivalents.
- Proven experience with geospatial data and related workflows (see the CRS sketch after this list).
- Ability to translate local/regional geospatial insights into wider national/regional datasets and outputs.
- Experience with official datasets such as ONS open data products.
- Ability to build robust data quality and completeness checks, with error handling.
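
As a sketch of the geospatial and CRS expectations above, the example below reprojects points from WGS84 (EPSG:4326) to British National Grid (EPSG:27700), the CRS commonly used for ONS boundary products. GeoPandas is an assumed tool choice (the posting names formats and workflows, not a library), and all coordinates and names are illustrative:

```python
import geopandas as gpd
from shapely.geometry import Point

# Minimal sketch of CRS handling with GeoPandas (assumed tooling).
# Points are illustrative, given as (lon, lat) in WGS84.
clinics = gpd.GeoDataFrame(
    {"clinic": ["A", "B"]},
    geometry=[Point(-0.1276, 51.5072), Point(-2.2426, 53.4808)],
    crs="EPSG:4326",
)

# Reproject to British National Grid so distances are in metres and the
# data aligns with ONS census boundary products.
clinics_bng = clinics.to_crs(epsg=27700)
print(clinics_bng)

# Reading a GeoJSON or shapefile follows the same pattern:
# boundaries = gpd.read_file("census_boundaries.geojson")  # hypothetical path
```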