Henkel

MS Azure Consultant

Posted Jul 22, 2021
Project ID: 2731217
Location
Remote, Remote
Hours/week
30 hrs/week
Timeline
6 months
Starts: Aug 2, 2021
Ends: Feb 1, 2022
Payrate range
Unknown

*Please note: the service contract for this position will not be concluded with Henkel AG & Co. KGaA but with an external party.


Project name

 

HDF – SAP CDC - Data Pipelining

 

Project description


The service is requested as part of the Henkel Data Foundation IoT dashboard project for the Laundry business. The project aims to build a comprehensive service for Laundry business and IT staff by combining various Microsoft Azure services and other tools into a data lake and reporting tool with data processing capabilities. Large datasets from different platforms are managed on this data hub.

 

Task description


The service of the contractor is delivered using an agile working method. External resources are needed as there is no internal staff with the required expertise in the following areas:

 

  • MS Azure Cloud offerings
  • Key Vault
  • App Service
  • Container Registry
  • Container Instances
  • Kubernetes
  • MS SQL server / database
  • Active Directory
  • Databricks
  • Azure Data Factory
  • Event Hub

 

Therefore, the external consultant is in a unique position and performs significantly different tasks than the internal employees.

 

Each sprint lasts two weeks, and there is a daily jour fixe. In these meetings the team discusses the current requirements, and the contractor then independently performs the following tasks:

 

  • Establish robust and performant pipelines using Python and Apache Spark on Azure Databricks within the Azure environment. The requirements for these are discussed in the sprint meetings.
  • Independently implement a data pipeline using Azure Data Factory (ADF) that extracts data from two different sources (Cosmos DB and SAP Gigya) and merges them to provide KPIs.
  • Independently implement quality checks on incoming data and perform quality assurance tests on the datasets and code.
  • The quality check verifies that the right data is extracted from the sources (in this case: databases) compared to the requirements that come from Business colleagues in the form of hypotheses. In case of failure, the ADF pipeline will have to be reconfigured or edited by the Business based on discussions in sprint meetings.
  • Independently analyse business requirements that come from Business colleagues participating in the sprint meetings. The analysis includes gathering the parameters from the Business, extracting the corresponding data from different databases, and overlaying it to identify any patterns.
  • Consult the project team (in this case the Marketing team from the Laundry business) regarding which parameters can and cannot be retrieved for marketing reports. If a specific report cannot be produced with existing data, this will be communicated to the Laundry Business colleagues, who then have to adapt the product.
  • Independently create maintenance documentation of the implementation in the form of a handover document, which is subject to approval by Henkel.

The goal of the contractor's service is to build the technology and data pipelines with the available tools (such as Databricks and Dremio) and to enable Henkel to maintain the solution.


