Murali Parimi

Chicago

Summary

Results-focused data professional with expertise in designing, building, and optimizing complex data pipelines and ETL processes. Strong in SQL, Python, and cloud platforms, with a focus on reliable data integration. Works well in collaborative environments and adapts quickly to evolving needs.

Overview

18 years of professional experience

Work History

Privacy Data Engineer

Meta
06.2022 - Current
  • Designed scalable data warehouse solutions to protect user privacy across Meta platforms (Facebook, Instagram, WhatsApp, Reality Labs), ensuring compliance with international and external data regulations.
  • Prevented cross-platform user data violations, mitigating the risk of fines of up to 10% of Meta’s revenue.
  • Delivered executive-level dashboards and visualizations to monitor data misuse, enabling informed decision-making.
  • Collaborated on ETL (Extract, Transform, Load) tasks, maintaining data integrity and verifying pipeline stability.
  • Designed and implemented a quality framework for 'One Catalog,' a centralized system for managing code and data assets and their metadata. Developed key metrics, including asset completion rates and metadata annotation quality, to improve data quality and ensure compliance with external regulations. Utilized these metrics to proactively identify and address gaps in asset discovery and ingestion processes between source systems and 'One Catalog,' thereby improving the overall data reliability and integrity of the system.
  • Used Meta Llama through LLM APIs to generate synthetic datasets, then used the IRS API to compare expected output against actual output and measure precision and recall.
  • Ran comparative analyses of Microsoft Presidio, Google DLP, and Private AI on the generated datasets.
  • Built data foundations and interactive dashboards to present the performance and capabilities of these privacy-enhancing technologies.


Senior Data Engineer

Optum
12.2020 - 06.2022
  • Developed high-performance Spark pipelines to process, transform, and load data into Azure Synapse and Blob Storage, replacing existing Ab Initio ETL pipelines. This saved $2 million by eliminating Ab Initio licensing fees and reducing on-premises data center costs.
  • Enhanced data governance by establishing robust Access Control Lists (ACLs) for Blob Storage and effectively managing Databricks clusters.
  • Led the Ab Initio and Azure Center of Excellence, providing oversight for both on-premises and cloud data lake operations.
  • Streamlined code migration processes by automating CI/CD pipelines with Jenkins, improving efficiency and reducing deployment risks.

Data Engineering Specialist

FedEx
07.2020 - 11.2020
  • As Infrastructure Data Engineer: Automated Ab Initio cluster installations, reducing setup time by 90%.
  • Team Leadership: Led a team of five junior engineers in implementing operational efficiency initiatives.

Data Engineer

Central Garden & Pet
06.2019 - 06.2020
  • As Administrator: Managed and maintained multiple Azure SQL databases, ensuring robust security measures and efficient access control. Set up Ab Initio on Azure using Ansible.
  • As Developer: Extracted and integrated data from Salesforce and SAP systems into MS SQL Server on Azure. This initiative facilitated the consolidation of numerous isolated data silos, enabling the client to establish a centralized and scalable data warehouse.

Data Engineer

Ciox Healthcare Tech
09.2018 - 05.2019
  • Designed a unified data model to integrate diverse data sources, ensuring consistency across data silos.
  • Automated ETL pipelines to load on-premises data into AWS S3 and Redshift.
  • Streamlined environment-specific code promotion with automation, reducing deployment errors.

Data Engineer

TransUnion
08.2014 - 07.2018
  • Spearheaded a Proof-of-Concept (POC) comparing Ab Initio ETL pipelines with open-source alternatives like Apache Hive and Apache Pig. This initiative paved the way for a successful migration to open-source data engineering technologies.
  • Improved ETL pipeline performance by 20% by transitioning file formats from ORC to Parquet.
  • Developed global ingestion frameworks to integrate data from international acquisitions into the big data environment.
  • Orchestrated complex data workflows using Autosys, ensuring high availability and reliability.

Senior ETL Developer

Infosys, Syntel, IBM, Exusia
06.2007 - 07.2014
  • Designed and implemented Ab Initio workflows to process complex business transformations across multiple data sources
  • Migrated 3NF relational models into dimensional data warehouses, improving analytical efficiency
  • Mentored and trained junior team members in data engineering best practices
  • Employers: Infosys, Syntel, IBM, Exusia India Pvt Ltd
  • Clients: Wells Fargo, CapitalOne, Allstate Insurance, TD Bank, Citi Bank
  • Skills: Ab Initio, Teradata, DB2, Oracle, Control-M

Education

Bachelor of Engineering - Electronics & Communication Engineering

Andhra University
Bhimavaram
04.2007

Skills

  • Core: Python, SQL, Java, Apache Spark, Hadoop, Data warehousing, Data modeling, Data pipeline design
  • Cloud & Big Data: Azure (Synapse, Data Factory), AWS (Redshift, S3), Databricks, Presto, Hive
  • Tools & Infrastructure: Ab Initio, Git, Jenkins, Ansible, Airflow, Control-M, Autosys

Key Projects

Meta - Digital Markets Act Compliance

Led the development of a comprehensive compliance solution for Meta's adherence to the Digital Markets Act (DMA), encompassing services like Facebook (dating, gaming, marketplace, Messenger), Instagram, and Reality Labs.

  • Designed and implemented a flexible logging mechanism for user consent flows, ensuring seamless adaptability across diverse services.
  • Collaborated closely with cross-functional teams to define and establish key performance indicators (KPIs) for compliance and product performance evaluation.
  • Developed scalable data pipelines using Python, SQL, Presto, and Airflow to support DMA experiments and facilitate future compliance initiatives.
  • Created insightful dashboards for both executive and technical teams to effectively monitor consent flow effectiveness and ensure ongoing compliance.

TransUnion - Ingestion Framework

  • Designed and implemented a data ingestion framework to integrate and harmonize datasets from multiple international acquisitions, improving data processing efficiency by 30% and easing collaboration across global teams. Technologies used: Apache Hive, Autosys, Python.
