Raga Preethi Potu

Plano, TX

Summary

  • Experienced in the design and deployment of enterprise applications, web applications, client-server technologies, and web programming using Java and big data technologies
  • Experience in the design, development, and implementation of data warehousing technology and data analysis
  • Comprehensive experience as a Data Engineer and Hadoop/Big Data & Analytics developer; analyzed credit data and financial statements to support lending decisions
  • Expertise in the Hadoop architecture and ecosystem, including HDFS, MapReduce, Pig, Hive, Sqoop, and Flume
  • Complete understanding of Hadoop daemons (Job Tracker, Task Tracker, NameNode, DataNode) and of the MRv1 and YARN architectures
  • Experience installing, configuring, managing, supporting, and monitoring Hadoop clusters using distributions such as Apache Hadoop, Cloudera, and Hortonworks, and cloud services such as AWS and GCP
  • Experience installing and configuring Hadoop stack elements: MapReduce, HDFS, Hive, Pig, Sqoop, Flume, Oozie, and ZooKeeper
  • Expertise in writing custom Kafka consumer code and modifying existing producer code in Python to push data to Spark Streaming jobs
  • Ample knowledge of Apache Kafka and Apache Storm for building data platforms, pipelines, and storage systems, and of search technologies such as Elasticsearch
  • Experienced with new Azure features; reproduce and troubleshoot Azure end-user issues and provide solutions to mitigate them
  • Knowledge of automated deployments leveraging Azure Resource Manager templates, DevOps, and Git repositories for automation and continuous integration/continuous delivery (CI/CD)
  • Experienced in data processing and analysis using Spark, HiveQL, and SQL
  • Responsive expert experienced in monitoring database performance, troubleshooting issues, and optimizing database environments
  • Strong analytical skills, excellent problem-solving abilities, and a deep understanding of database technologies and systems
  • Equally confident working independently and collaboratively, with excellent communication skills.

Overview

6 years of professional experience

Work History

Data Engineer

Eficens Systems LLC
02.2022 - Current
  • Involved in project life cycle including design, development, and implementation of verifying data received in data lake
  • Conducting data analysis and providing actionable insights through Tableau dashboards
  • Collaborating with stakeholders to define and understand their data visualization requirements
  • Optimizing Tableau workbooks and dashboards for improved performance and responsiveness
  • Configuring and optimizing EC2 instances to meet performance and scalability requirements
  • Developing and maintaining ETL workflows and processes using Snowflake and other related tools
  • Monitoring and optimizing performance of Snowflake data warehouse queries
  • Designing and developing ETL processes in AWS Glue to migrate accident data from external sources such as S3 and text files into AWS Redshift
  • Tuning and indexing tables to enhance query speed and overall system performance
  • Setting up role-based access controls (RBAC) and managing user permissions in Snowflake
  • Working closely with cross-functional teams, such as data scientists and analysts, to understand their data requirements and provide technical solutions
  • Designing and implementing data visualizations to effectively communicate insights and trends
  • Analyzed impact changes on existing ETL/ELT processes to ensure timely completion and availability of data in data warehouse for reporting use
  • Translated data access, transformation, and movement requirements into functional requirements and mapping designs
  • Developed, tested, and tuned performance of complex mappings, transforms, aggregations, joins, enrichment, validations for target data underpinnings
  • Used ranking and aggregation functions in Spark
  • Built real time streaming pipeline utilizing Kafka, Spark Streaming and Redshift
  • Developed logical and physical data flow models for Informatica ETL applications
  • Added support for AWS S3 and RDS to host static/media files and the database in the Amazon cloud
  • Worked on creating custom Docker container images and on tagging and pushing data images
  • Implemented and analyzed SQL query performance issues in databases
  • Responsible for the design and development of Spark SQL scripts based on functional requirements
  • Environment: Snowflake, Tableau, Hadoop, Spark, Hive, Python, Kafka, AWS S3 buckets, AWS Glue, NiFi, PostgreSQL, development toolkit (JIRA, Bitbucket/Git, ServiceNow, etc.)
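The indexing and query-tuning bullets above can be sketched with a minimal, self-contained example. This uses SQLite from the Python standard library purely as an illustration (the actual work was on Snowflake and Redshift, and the table and index names here are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE accidents (id INTEGER PRIMARY KEY, state TEXT, severity INTEGER)")
cur.executemany("INSERT INTO accidents (state, severity) VALUES (?, ?)",
                [("TX", i % 5) for i in range(1000)])

# Without an index, filtering on `state` forces a full table scan.
plan_before = cur.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM accidents WHERE state = 'TX'").fetchall()

# Adding an index lets the planner seek directly to matching rows.
cur.execute("CREATE INDEX idx_accidents_state ON accidents (state)")
plan_after = cur.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM accidents WHERE state = 'TX'").fetchall()

print(plan_before)  # plan text mentions a table scan
print(plan_after)   # plan text mentions idx_accidents_state
```

Comparing the two query plans before and after index creation is the same workflow used when tuning warehouse queries, even though the specific planner output differs by engine.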

Data Engineer

Tecspirit
08.2020 - 01.2022
  • Implemented simple and complex Spark jobs in Python for data analysis across different data formats
  • Developed upgrade and downgrade scripts in SQL that filter corrupted records and records with missing values, and that identify unique records based on different criteria
  • Implemented Azure Storage: storage accounts, Blob storage, and Azure SQL Server
  • Explored Azure storage account offerings such as Blob storage
  • Knowledge of Azure DevOps and its processes for creating tasks, pull requests, and Git repositories
  • Experience building, deploying, and troubleshooting data extraction for huge volumes of records using Azure Data Factory (ADF)
  • Fluency in Python with working knowledge of ML and statistical libraries
  • Cleaned input text data using the PySpark machine learning feature extraction API
  • Used pandas DataFrames for exploratory data analysis on sample datasets
  • Worked on Microsoft Azure services such as HDInsight clusters, Blob storage, ADLS, and Data Factory
  • Environment: Spark, Scala, Hadoop, Hive, Sqoop, Play Framework, Apache Ranger.
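The exploratory data analysis with pandas mentioned above can be illustrated with a short sketch; the DataFrame contents and column names here are made up for the example:

```python
import pandas as pd

# Hypothetical sample dataset standing in for the project's real data.
df = pd.DataFrame({
    "user_id": [1, 2, 2, 3, None],
    "amount": [10.0, 25.5, 25.5, None, 40.0],
})

# Basic profiling: shape, missing values, duplicates, summary statistics.
print(df.shape)
print(df.isna().sum())          # missing values per column
print(df.duplicated().sum())    # count of fully duplicated rows
print(df["amount"].describe())  # distribution of a numeric column
```

Profiling steps like these typically drive the upgrade/downgrade and deduplication scripts described above, since they reveal which records are corrupted, missing, or repeated.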

Graduate Research Assistant

UNCG
08.2019 - 07.2020
  • Implemented and used new tools such as the Globus API to securely transmit large volumes of data
  • Analyzed research data using reporting tools such as Tableau
  • Optimized data collection procedures and generated reports accordingly
  • Used statistical techniques for hypothesis testing to validate data and interpretations
  • Involved in communications and design discussions with the client (Gate City Research Network, gcrNet).
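The hypothesis-testing bullet above can be sketched with a simple two-sample comparison. This is an illustrative example with made-up measurements, computing Welch's t-statistic from the standard library (the actual research data and tests are not shown here):

```python
from statistics import mean, stdev
from math import sqrt

# Hypothetical measurements standing in for two groups in the research data.
control = [4.1, 3.9, 4.3, 4.0, 4.2, 3.8]
treatment = [4.6, 4.8, 4.5, 4.9, 4.7, 4.4]

def welch_t(a, b):
    """Welch's t-statistic for two independent samples (unequal variances)."""
    va, vb = stdev(a) ** 2, stdev(b) ** 2
    return (mean(a) - mean(b)) / sqrt(va / len(a) + vb / len(b))

t = welch_t(control, treatment)
# |t| well above ~2 suggests the group means differ at the usual 5% level
# (degrees of freedom and the exact p-value are omitted in this sketch).
print(round(t, 2))
```

In practice a library routine (e.g. a two-sample t-test from a statistics package) would also report the p-value; the sketch only shows the test statistic behind the validation step.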

Data Engineer Intern

Knowledge Matrix
05.2017 - 10.2017
  • Used cryptographic algorithms to encrypt data and transfer it with a secure key, supporting data security and mobility
  • Applied SQL for querying, data extraction, and data transformations
  • Gained solid working experience with relational databases such as Oracle and SQL Server
  • Used dual controllers on various business projects for dual data validation and data consistency
  • Interacted with users, analyzed client processes, and documented the business requirements for the project
  • Good experience identifying root causes, troubleshooting, and submitting change controls.

Education

Master’s in Computer Science

University of North Carolina
Greensboro, NC

Bachelor’s in Computer Science

GITAM University
Hyderabad, India
06.2019

Skills

  • Python
  • Java
  • Scala
  • R
  • SQL
  • PL/SQL
  • UNIX
  • Linux
  • HBase
  • Spark-Redis
  • Cloudera Manager
  • Snowflake
  • Tableau
  • AWS
  • Jenkins
  • Airflow
  • Dagger
  • Postman
  • Workflows
  • Data Pipelines
  • Data analysis
  • Warehousing expertise
  • Staging tables
  • Structure designs

Timeline

Data Engineer

Eficens Systems LLC
02.2022 - Current

Data Engineer

Tecspirit
08.2020 - 01.2022

Graduate Research Assistant

UNCG
08.2019 - 07.2020

Data Engineer Intern

Knowledge Matrix
05.2017 - 10.2017

Master’s in Computer Science

University of North Carolina

Bachelor’s in Computer Science

GITAM University