Mandatory Skills: Spark/Scala, SQL, Python/PySpark or a similar programming language, Databricks, Unity Catalog, ETL/ELT development, monitoring and pipelining using tools such as Apache Airflow, and ingestion tools
This role is responsible for data ingestion and platform management on the Databricks platform. It requires a deep understanding of data lake ingestion processes and best practices, ETL/ELT implementation, CI/CD, system integration tools, and data pipeline management.
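For illustration, the sketch below shows one way an ingestion job might land source data as a Delta table registered in Unity Catalog. It is a minimal PySpark example, not part of any existing codebase; the JDBC source, credentials, and the main.raw.orders table name are hypothetical placeholders.

    # Minimal PySpark ingestion sketch for a Databricks job.
    # All names (JDBC URL, user, catalog.schema.table) are hypothetical.
    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()  # supplied automatically on Databricks

    # Read from a hypothetical JDBC source system.
    orders = (
        spark.read.format("jdbc")
        .option("url", "jdbc:postgresql://source-host:5432/sales")
        .option("dbtable", "public.orders")
        .option("user", "ingest_user")
        .option("password", "...")  # in practice, fetch from a Databricks secret scope
        .load()
    )

    # Land the data as a Delta table in Unity Catalog's
    # three-level namespace: catalog.schema.table.
    orders.write.format("delta").mode("append").saveAsTable("main.raw.orders")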
Responsibilities:
1. Ingest data from a variety of source systems and tailor ingestion approaches on a per-system basis
2. Manage, maintain, and oversee ETL/ELT pipelines on the Databricks platform
3. Optimize data pipelines for scalability and speed
4. Document ingestion and integration flows and pipelines
5. Use Airflow to schedule and automate ingestion jobs (see the scheduling sketch after this list)
6. Manage metadata and master data in the technical data catalog
7. Ensure ETL/ELT designs meet required security and compliance guidelines, including PII management, flagging, and risk assessment during ingestion
8. Maintain ETL/ELT pipeline infrastructure and implement automated monitoring strategies
9. Ensure adherence to SDLC best practices
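As referenced in item 5, the following is a minimal sketch of how Airflow might schedule a Databricks ingestion job. It assumes the apache-airflow-providers-databricks package and Airflow 2.x; the DAG id, connection id, notebook path, and cluster spec are placeholders, not prescribed values.

    # Hypothetical Airflow DAG that runs a Databricks ingestion notebook daily.
    from datetime import datetime

    from airflow import DAG
    from airflow.providers.databricks.operators.databricks import (
        DatabricksSubmitRunOperator,
    )

    with DAG(
        dag_id="nightly_ingestion",          # placeholder DAG name
        start_date=datetime(2024, 1, 1),
        schedule="@daily",                   # Airflow 2.4+ argument name
        catchup=False,
    ) as dag:
        ingest_orders = DatabricksSubmitRunOperator(
            task_id="ingest_orders",
            databricks_conn_id="databricks_default",  # assumed Airflow connection
            new_cluster={
                "spark_version": "13.3.x-scala2.12",
                "node_type_id": "i3.xlarge",
                "num_workers": 2,
            },
            notebook_task={"notebook_path": "/Repos/ingestion/orders_ingest"},
        )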
Qualifications:
1. 3+ years of experience in data engineering, ingestion pipelining, and ETL/ELT
2. A bachelor's degree in computer science, engineering, statistics, or a related field
3. Hands-on experience with and understanding of technologies such as Spark/Scala, SQL, Python/PySpark or a similar programming language, Databricks, Unity Catalog, ETL/ELT development, monitoring and pipelining using tools such as Apache Airflow, ingestion tools such as Dell Boomi, data quality guidelines, CI/CD pipelines, Agile, Git, and version control