TSVIEL BEN-SHABAT

Data Scientist

Summary

Data Scientist with 6+ years of experience in data analysis, machine learning, and statistical modeling. Proficient in Python for extracting insights from complex datasets. Holds an M.Sc. in Statistics, with extensive knowledge of unsupervised learning techniques and AI frameworks. Passionate about leveraging cutting-edge technologies and expanding expertise through personal projects, including successful implementations using PyTorch. Committed to driving data-driven decision-making and delivering impactful solutions for business growth.

Overview

7 years of professional experience
6 years of post-secondary education

Work History

Data Scientist

Silk
02.2022 - Current
  • Anomaly Detection for Operational Risk:
    Developed a novel tree-based anomaly detection algorithm that assigns decision tree leaves to system components and computes anomaly scores based on probability ratios.
  • Performance & Cost Optimization:
    Collaborated with engineers to optimize system benchmarks and reduce cloud resource usage, ensuring high reproducibility of performance tests.
  • Cross-Functional Analytics:
    Integrated statistical methods into multi-dimensional data pipelines for improved risk monitoring and operational efficiency.

Teaching Assistant

Technion
01.2021 - 12.2021
  • Assisted in a course covering Network Graph Analysis, Recommender Systems, and Blockchain, among other topics.
  • Helped with grading assignments and tests, providing constructive feedback to students based on results.
  • Supported classroom activities, tutoring, and reviewing work.

Software Engineer

Kaminario
01.2018 - 12.2020
  • Designed, automated, and deployed microservices in a Kubernetes environment.
  • Developed an automated storage capacity prediction microservice using ELK ML, providing clients with clear OpEx projections six months ahead.

Student Data Analyst

Kaminario
01.2016 - 12.2018
  • Extracted and analyzed Call Home data to create monitoring and insights, reducing manual monitoring costs by $200K annually.
  • Updated and developed scripts and queries to extract and analyze data from multiple sources.

Education

Master of Science - Statistics

Technion, Israel
04.2019 - 01.2022
  • Cum Laude; GPA: 95; Awarded the M.Sc. Student Award for interdisciplinary data science research.
  • Research integrated unsupervised learning, game theory, and advanced statistical methodologies.
  • Authored and co-authored multiple academic papers.

Bachelor of Science - Computer Information Systems

Technion, Israel
04.2014 - 01.2018
  • Cum Laude; GPA: 88.6; Recognized on the President’s and Dean’s Lists.
  • Graduated with a project on a smart robot packaging system, emphasizing innovative application of information systems engineering.

Skills

Regression

Timeline

Data Scientist - Silk
02.2022 - Current
Teaching Assistant - Technion
01.2021 - 12.2021
Technion - Master of Science, Statistics
04.2019 - 01.2022
Software Engineer - Kaminario
01.2018 - 12.2020
Student Data Analyst - Kaminario
01.2016 - 12.2018
Technion - Bachelor of Science, Computer Information Systems
04.2014 - 01.2018

Projects

Anomaly Detection

  • Methodology: Developed a novel tree-based algorithm using a standard decision tree. Forced the tree’s leaves to correspond to system components and employed supervised learning (with known component labels) to compute probabilities for each time sample. Derived a score using 1 – (min(probabilities)/max(probabilities)) and flagged an anomaly when the 24-hour average exceeded 80% (see the sketch below).
  • Outcomes/Impact: Enabled detection of significant deviations across system components, effectively addressing the limitations of isolation forests—which are unsupervised and limited to time series data.
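
A minimal Python sketch of the scoring step, using scikit-learn and pandas. It assumes a timestamp-indexed feature matrix X and known component labels y; the tree depth and the column layout are illustrative, not the production configuration.

    import pandas as pd
    from sklearn.tree import DecisionTreeClassifier

    def anomaly_scores(X: pd.DataFrame, y: pd.Series) -> pd.Series:
        # Fit a supervised tree on component labels; leaves approximate components.
        tree = DecisionTreeClassifier(max_depth=5, random_state=0)  # depth is an assumption
        tree.fit(X, y)
        proba = tree.predict_proba(X)  # per-sample class probabilities
        score = 1.0 - proba.min(axis=1) / proba.max(axis=1)
        return pd.Series(score, index=X.index, name="anomaly_score")

    def flag_anomalies(scores: pd.Series) -> pd.Series:
        # Flag samples whose rolling 24-hour mean score exceeds the 80% threshold.
        return scores.rolling("24h").mean() > 0.8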

Stable-Performance

  • Methodology: Replaced a binary search approach with a constant jump search strategy combined with a Mann-Whitney test to evaluate performance variability in cloud-based storage systems. This method accounted for the logarithmic nature of performance gains and ensured that deviations signaled genuine system issues rather than noise (see the sketch below).
  • Outcomes/Impact: Achieved highly reproducible performance tests; by reducing variability, any observed poor performance reliably indicated a real system problem, thereby improving overall test reliability and troubleshooting.
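
A minimal sketch of the search-and-test logic, assuming a run_benchmark callable that returns a sample of throughput measurements for a given load level; the step size, load range, and significance level are illustrative.

    from scipy.stats import mannwhitneyu

    def is_significant_gain(previous, current, alpha=0.05):
        # Mann-Whitney U test: is the current load level genuinely better than the previous one?
        _, p_value = mannwhitneyu(current, previous, alternative="greater")
        return p_value < alpha

    def find_stable_peak(run_benchmark, start=8, step=8, max_load=128):
        # Constant-jump search over load levels instead of a binary search.
        best_load, best_sample = start, run_benchmark(start)
        for load in range(start + step, max_load + 1, step):
            sample = run_benchmark(load)
            if not is_significant_gain(best_sample, sample):
                break  # gains have flattened out on the logarithmic curve
            best_load, best_sample = load, sample
        return best_load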

Faster-Performance-Testing

  • Methodology: Identified that traditional 5-minute performance tests were excessive. Demonstrated that taking the median of the first 45 seconds of data yielded results 99.99% as accurate as the full 5-minute average (see the sketch below).
  • Outcomes/Impact: Reduced test runtime by 85% per thread, enabling more frequent testing (from once a week to every two nights) and significantly cutting cloud costs by reducing virtual machine deployment time.
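
The comparison behind this result can be sketched in a few lines of Python; it assumes one throughput sample per second, which is an illustrative sampling rate.

    import numpy as np

    def compare_windows(samples_per_second):
        # Traditional result: mean over the full 5-minute (300-sample) run.
        full_mean = np.mean(samples_per_second[:300])
        # Proposed result: median over the first 45 seconds only.
        short_median = np.median(samples_per_second[:45])
        relative_error = abs(short_median - full_mean) / full_mean
        return short_median, full_mean, relative_error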

Zone Critic

  • Methodology: Tackled frequent test failures due to resource shortages by implementing a Multi-Armed Bandit algorithm with exponential decay. This dynamically updated and tracked success ratios across cloud zones, reacting swiftly to abrupt changes while filtering out transient noise (see the sketch below).
  • Outcomes/Impact: Reduced test failures due to missing resources from an average of 10% down to less than 1%, ensuring a more reliable automated testing process.
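
A minimal sketch of the decayed bandit; the decay factor and exploration rate are assumed values rather than the tuned production settings.

    import random
    from collections import defaultdict

    class ZoneBandit:
        def __init__(self, zones, decay=0.9, epsilon=0.1):
            self.zones = list(zones)
            self.decay = decay          # exponential decay of old observations
            self.epsilon = epsilon      # exploration rate
            self.successes = defaultdict(float)
            self.attempts = defaultdict(float)

        def update(self, zone, succeeded):
            # Decay history so the ratio reacts quickly to abrupt capacity changes.
            self.successes[zone] = self.decay * self.successes[zone] + float(succeeded)
            self.attempts[zone] = self.decay * self.attempts[zone] + 1.0

        def ratio(self, zone):
            return self.successes[zone] / self.attempts[zone] if self.attempts[zone] else 1.0

        def pick_zone(self):
            # Epsilon-greedy choice: mostly the zone with the best decayed success ratio.
            if random.random() < self.epsilon:
                return random.choice(self.zones)
            return max(self.zones, key=self.ratio)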

Test Coverage

  • Methodology: Collaborated with QA leadership to define key feature categories and employed a large language model API to analyze test descriptions for these features. Generated visual reports (e.g., word clouds) detailing test coverage along with associated costs and success ratios (see the sketch below).
  • Outcomes/Impact: Streamlined test planning for new features, saving significant time and effort for QA leads when navigating complex test suites.
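
A hedged sketch of the categorization step; call_llm stands in for whichever LLM API is in use, and the category list is invented for illustration only.

    from collections import Counter
    from wordcloud import WordCloud

    CATEGORIES = ["snapshots", "replication", "resizing", "failover"]  # illustrative only

    def categorize(test_description, call_llm):
        prompt = (
            "Which of these feature categories does the following test cover? "
            "Categories: " + ", ".join(CATEGORIES) + ".\n"
            "Test: " + test_description + "\nAnswer with a comma-separated list."
        )
        answer = call_llm(prompt)  # hypothetical wrapper around the LLM API
        return [c for c in CATEGORIES if c in answer.lower()]

    def coverage_cloud(test_descriptions, call_llm):
        counts = Counter(cat for d in test_descriptions for cat in categorize(d, call_llm))
        return WordCloud(width=800, height=400).generate_from_frequencies(counts)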

Bug Duplicates

  • Methodology: Built a microservice that combines minimal information (error messages and failing test steps) with BERT embeddings and cosine similarity to flag duplicate bug reports (see the sketch below).
  • Outcomes/Impact: Flagged potential duplicate bug reports with approximately 90% accuracy (10% false positive rate), thereby saving cloud resources and developer time by reducing redundant investigations.
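
A minimal sketch of the similarity check with sentence-transformers; the encoder name and the 0.9 cut-off are assumptions, not the production values.

    from sentence_transformers import SentenceTransformer, util

    model = SentenceTransformer("all-MiniLM-L6-v2")  # any BERT-style encoder works here

    def is_duplicate(new_bug, known_bugs, threshold=0.9):
        # Embed the error message plus failing test step and compare with known reports.
        new_vec = model.encode(new_bug, convert_to_tensor=True)
        known_vecs = model.encode(known_bugs, convert_to_tensor=True)
        similarities = util.cos_sim(new_vec, known_vecs)  # 1 x N cosine similarity matrix
        return bool(similarities.max() >= threshold)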

IOPulse

  • Methodology: Developed a cost optimization solution for PV2 disks on Azure (details proprietary until patent protection). The approach ensures that costs are directly tied to actual resource consumption.
  • Outcomes/Impact: Provides a mechanism to avoid overpaying for disk resources, optimizing cloud costs by ensuring payment only for what is consumed.

Performance Analyzer

  • Methodology: Created a data pipeline that processes system monitoring data (iostat, VDBench, SAR) to identify resource bottlenecks. The pipeline detects the point of maximum system performance and then analyzes the surrounding data to determine whether the issue lies in network, compute, or another resource area (see the sketch below).
  • Outcomes/Impact: Aids in diagnosing system performance issues by accurately pinpointing resource bottlenecks, thereby enabling targeted configuration improvements and performance tuning.
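
A rough sketch of the triage step, assuming the monitoring sources are already merged into one timestamp-indexed DataFrame; the column names and the 90% thresholds are placeholders.

    import pandas as pd

    def diagnose(merged: pd.DataFrame, window: str = "30s") -> str:
        # Locate the point of maximum system performance, then inspect the data around it.
        peak_time = merged["iops"].idxmax()
        around = merged.loc[peak_time - pd.Timedelta(window): peak_time + pd.Timedelta(window)]
        if around["cpu_busy_pct"].mean() > 90:
            return "compute-bound"
        if around["net_util_pct"].mean() > 90:
            return "network-bound"
        return "likely storage or configuration issue"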

Chiplet Members

  • Methodology: Conducted latency tests between every pair of CPU cores on chiplet-based architectures. Clustered the latency measurements into two groups: lower latency (cores within the same chiplet) and higher latency (cores in different chiplets). Built graphs from low-latency connections and computed cliques to automatically determine chiplet membership (see the sketch below).
  • Outcomes/Impact: Uncovered hidden chiplet structures (not disclosed by cloud providers), enabling potential optimizations in inter-core communication and offering a deeper understanding of the underlying architecture.
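
A minimal sketch of the clustering-and-cliques step with scikit-learn and networkx, assuming a dict of measured latencies keyed by core pairs.

    import numpy as np
    import networkx as nx
    from sklearn.cluster import KMeans

    def chiplet_groups(latency):
        # latency: {(core_i, core_j): measured latency between the two cores}
        pairs = list(latency.items())
        values = np.array([v for _, v in pairs]).reshape(-1, 1)
        # Two clusters: intra-chiplet (low latency) vs. inter-chiplet (high latency).
        labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(values)
        low_cluster = labels[np.argmin(values)]  # cluster containing the fastest pair
        graph = nx.Graph()
        graph.add_edges_from(pair for (pair, _), lab in zip(pairs, labels) if lab == low_cluster)
        # Maximal cliques over low-latency edges approximate chiplet membership.
        return [set(clique) for clique in nx.find_cliques(graph)]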

ExtractAI (Faster Reporting Delivery)

  • Methodology: Leveraged iterative large language model (LLM) API calls to automate the extraction and summarization of information from a vast Elasticsearch database containing system events, call-home emails, and counters.
  • Outcomes/Impact: Accelerated report generation and streamlined data insights, reducing the reporting burden and enabling quicker access to critical system metrics.

Ransomware Ingest Pipeline

  • Methodology: Built a cost-effective data ingestion pipeline using Apache Beam (Dataflow on GCP), Cloud Scheduler, and BigQuery, completing development in under two weeks (see the sketch below).
  • Outcomes/Impact: Delivered a scalable pipeline for big data processing that efficiently removes duplicates and streamlines data ingestion.
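
A rough sketch of the pipeline shape in Apache Beam's Python SDK; the bucket, key field, project, and table names are placeholders, and the target table is assumed to exist.

    import json
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    def run():
        options = PipelineOptions(runner="DataflowRunner", project="my-project", region="us-central1")
        with beam.Pipeline(options=options) as p:
            (
                p
                | "Read" >> beam.io.ReadFromText("gs://my-bucket/events/*.json")
                | "Parse" >> beam.Map(json.loads)
                | "KeyById" >> beam.Map(lambda record: (record["event_id"], record))
                | "GroupById" >> beam.GroupByKey()
                | "DropDuplicates" >> beam.Map(lambda kv: next(iter(kv[1])))  # one record per id
                | "Write" >> beam.io.WriteToBigQuery(
                    "my-project:security.ransomware_events",
                    write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
                    create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,  # table assumed to exist
                )
            )

    if __name__ == "__main__":
        run()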

Spot TTL

  • Methodology: Applied survival analysis techniques to usage history metrics to evaluate system performance under varying conditions (see the sketch below).
  • Outcomes/Impact: Although not deployed, the analysis revealed that almost all tests can run on spot machines and that some zones are indeed better than others.
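
A minimal sketch of the survival estimate with the lifelines library, assuming a history DataFrame with runtime and preemption columns (the column names are illustrative).

    import pandas as pd
    from lifelines import KaplanMeierFitter

    def spot_survival(history: pd.DataFrame) -> pd.DataFrame:
        # One row per spot VM: how long it ran and whether it was preempted.
        kmf = KaplanMeierFitter()
        # Runs that finished without preemption are right-censored observations.
        kmf.fit(durations=history["runtime_minutes"], event_observed=history["preempted"])
        return kmf.survival_function_

    # Comparing the curves per zone shows which zones can safely host long-running tests:
    # per_zone = {zone: spot_survival(df) for zone, df in history.groupby("zone")}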

Ransomware Detection

  • Methodology: Employed statistical anomaly detection methods informed by extensive research into ransomware operational patterns.
  • Outcomes/Impact: Developed an end-user feature for detecting ransomware activity, enhancing system security through early threat identification.

Background Compactor Research

  • Methodology: Analyzed periodic customer behavior to identify inefficiencies, demonstrating how the existing algorithm contributed to high latency during critical periods.
  • Outcomes/Impact: Reduced product latency by optimizing algorithm performance, leading to improved system responsiveness.