HDF Integration and Utility Data Collection

Posted Sep 16, 2020
Project ID: 2619817
40 hrs/week
3 months
Starts: Sep 20, 2020
Ends: Dec 19, 2020

*Please note: the service contract for this position will not be concluded with Henkel AG & Co. KGaA but with an external party.

HDF Integration and Data Utility Collection


Project description:

The project aims at (1) collecting a large amount of data from different data sources (IoT data, process data, etc.) into an Azure Data Lake Storage on a continuous basis (e.g., hourly or daily) and (2) joining/transforming (part of) the data and copying it to different folders in the Data Lake (e.g., for third-party access). This service request is for item (2).

Detailed task description: 

The scope of services includes the following tasks, which the external contractor performs independently based on Change Requests (CRs) assigned in advance:


  • Technical consultation for and implementation of Azure Databricks transformations (SQL or Python) of data stored in Azure Data Lake (from several data sources: IoT, master data and process data) so that it can be accessed efficiently
  • Technical consultation for implementation of “delta transformations”, that is, transformation of only the data that is new since the last transformation (per minute, hourly, daily)
  • Setup of Databricks jobs for the above transformations, taking dependencies between them into account (that is, Job A must be finished before Job B can start)
  • Creation of (automated) tests for the above solutions
  • Creation of hand-over documents (operational task list, operational handbook, infrastructure sheet) for the maintenance team, which will be provided to Henkel for verification and approval
  • Remote training of the maintenance team on the implemented transformations (probably 1-3 sessions, depending on the complexity)
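
For illustration only, two of the ideas in the task list above ("delta transformations" and job dependencies) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the project's actual implementation: the field name ingested_at, the job names job_a and job_b, and the helper functions are hypothetical, and in Azure Databricks the same logic would more likely be expressed with Spark tables and job task dependencies.

```python
from datetime import datetime
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def delta_transform(rows, last_watermark, transform):
    """Transform only rows that arrived after last_watermark.

    Returns the transformed rows and the new watermark to store for the
    next run. 'ingested_at' is a hypothetical timestamp field.
    """
    new_rows = [r for r in rows if r["ingested_at"] > last_watermark]
    results = [transform(r) for r in new_rows]
    # If nothing new arrived, the watermark stays where it was.
    new_watermark = max(
        (r["ingested_at"] for r in new_rows), default=last_watermark
    )
    return results, new_watermark


def run_order(dependencies):
    """Order jobs so that each job comes after the jobs it depends on.

    'dependencies' maps a job name to the set of jobs that must finish first.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical sample data: three IoT readings, the first already processed.
rows = [
    {"sensor": "s1", "value": 10.0, "ingested_at": datetime(2020, 9, 20, 8)},
    {"sensor": "s1", "value": 12.0, "ingested_at": datetime(2020, 9, 20, 9)},
    {"sensor": "s2", "value": 7.5, "ingested_at": datetime(2020, 9, 20, 10)},
]
last_run = datetime(2020, 9, 20, 8)

results, watermark = delta_transform(
    rows, last_run, lambda r: {**r, "value": r["value"] * 2}
)

# Job B depends on Job A, so Job A must be scheduled first.
order = run_order({"job_b": {"job_a"}})
```

Storing the returned watermark after each run is what keeps the next run limited to new data; running the function again with the updated watermark and no new rows yields an empty result.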


The contractor's goal is to independently implement end-to-end solutions that fully automate existing manual interfaces or new interfaces.

Possible timelines to be kept.

The timelines requested for the transformation requests (CRs) should be met.
