Mohammed Abubakar

Chicago

Summary

Dynamic Senior Data Engineer with a proven track record at Ernst and Young GDS, leveraging expertise in Databricks, Spark, and Python to architect and optimize data solutions. Excelled in automating and enhancing data workflows, demonstrating exceptional problem-solving skills and a collaborative spirit. Spearheaded innovative projects, significantly improving data processing efficiency and security.

Overview

8 years of professional experience
3 Certifications

Work History

Senior Data Engineer

Ernst and Young GDS
01.2022 - Current
  • Design, implement, and manage data pipelines using Azure Data Factory (ADF) for efficient ETL processes
  • Develop and maintain Databricks notebooks and jobs to process large-scale data using Apache Spark
  • Build and optimize Azure Data Lake Storage (ADLS) to store, manage, and analyze large datasets in a scalable, secure environment
  • Leverage Azure Synapse Analytics to integrate big data and data warehousing solutions for advanced analytics
  • Implement Azure Key Vault for secure management of secrets, encryption keys, and credentials across cloud applications
  • Develop and optimize a data lakehouse architecture, combining the features of data lakes and data warehouses to enable faster analytics and processing
  • Collaborate with stakeholders to ensure alignment between business requirements and technical solutions
  • Monitor and troubleshoot data pipelines, ensuring high availability, scalability, and performance
  • Automate data ingestion, transformation, and cleansing processes to enhance data quality and reduce operational overhead
  • Ensure data security and compliance with industry standards, implementing encryption and data access controls
  • Provide ongoing support, troubleshooting, and optimization of data workflows across the enterprise data lake infrastructure

Data Engineer

Yash Technologies
07.2020 - 01.2022
  • Performed data analysis and built Azure Data Factory pipelines and transformations on datasets from sources such as SQL Server and Azure Data Lake Storage
  • Created a data profiling framework that uncovers data quality issues in data sources and identifies what needs to be corrected in ETL; performed PySpark transformations on the Databricks platform via the PyCharm IDE
  • Prepared release documents and addressed client-reported bugs and errors
  • Collaborated with the team to better understand the project's functional and technical aspects
  • Applied optimization techniques at the ETL job level and the query level
  • Proven team player with good communication skills and a quick learner
  • Assisted the presales tech team in procuring new opportunities by adding value to the available products
  • POC: automated schedule migration from different sources to Azure Databricks and AWS Data Pipeline
  • POC: script parser to convert Hive queries into compatible Databricks notebook scripts
  • Documented new approaches based on usage, cost, and efficiency

Software Engineer

Hexaware Technologies
02.2017 - 05.2020
  • Member of the data engineering team responsible for the complete ETL flow
  • Implemented Azure Data Factory pipelines for loading data from on-premises SQL Server databases to the Azure cloud
  • Implemented full-load and incremental-load strategies for data loading
  • Implemented Azure Data Factory pipelines to create dimensions and facts per the project's data model
  • Analyzed client requirements for building end-to-end data pipelines
  • Connected Databricks to various sources
  • Performed extract, transform, load (ETL) operations using PySpark
  • Combined different functionality to derive meaningful results per requirements
  • Analyzed technical and functional client queries and debugged them after locating the root cause
  • Documented resolutions in the knowledge base to support future support and development use cases
  • Connected relational databases to the raw zone, then cleaned, transformed, and summarized the data

Education

Bachelor of Engineering - with Honors

RGPV Technical University

Skills

  • Databricks
  • Spark
  • Python
  • SQL
  • Data Warehouse
  • Microsoft Azure
  • Azure Data Factory
  • ADLS
  • Synapse
  • Fabric
  • SQL Server
  • Delta Lake
  • Azure DevOps
  • Confluence
  • AWS
  • S3
  • Glue
  • Lambda
  • CloudWatch
  • GitHub

Certification

  • Databricks Certified Associate Data Engineer, 2022
  • Azure Certified Data Engineer, 2021
  • Databricks Certified Apache Spark Developer, 2022

Awards

  • Star Award, for successfully implementing a PoC converting on-prem scripts to Spark-compatible logic.
  • Excellence Award, for successfully implementing automated data profiling.

Personal Information

Citizenship: US Citizen
