

Yosef Schoen
DATA SCIENTIST
Phone: 050-709-3180
Email:
Address: Beit Shemesh, Israel
Date of Birth: August 26, 1996
EXPERIENCE
2025-Present
Data Analytics Instructor
Developers Institute
• Taught a full-spectrum Data Analytics curriculum including Python, PostgreSQL, Excel, Tableau, Power BI, and advanced SQL, with a focus on real-world application and project-based learning.
• Led instruction in data wrangling, object-oriented programming (OOP), data visualization (Matplotlib, Seaborn), A/B testing, and web data analysis using APIs and scraping tools.
• Mentored students through hands-on projects, developed curriculum content, and continuously refined teaching materials based on industry trends and student feedback.
2024-Present
Data Scientist
Pianoid
• Designed and implemented scalable data pipeline infrastructure for large-scale ETL operations in the cloud using AWS Glue, EMR, and S3, and queried the resulting data with SQL on Athena.
• Developed and optimized Python (pandas, NumPy) and Apache Spark (PySpark) scripts for feature extraction, transformation, and analysis of large audio datasets.
• Contributed to the development of machine learning and deep learning classification models using TensorFlow and scikit-learn.
• Assisted in algorithm development and advanced analytics for a GAN model.
2023-2024
Data Collector
Talent Solutions (Microsoft)
• Collected and curated a diverse dataset by conducting conversations with various speakers to support the development of a speech-to-text model.
• Managed, cleaned, and preprocessed data to ensure high-quality inputs for model training.
• Labeled and annotated data to establish a reliable ground truth for model evaluation and improvement.
2023
Data Science Intern
Tipalti
• Developed and implemented a transformer-based embedding model, combined with feature reduction and a clustering ML model, to group semantically similar messages from multiple sources while filtering out messages that were syntactically similar but semantically different.
• Cleaned and preprocessed multi-source data, reducing redundancy by 90% to improve efficiency.
• Collaborated with members of other teams to support the decision-making process.
EDUCATION
2022-2023
Data Science Certification
Israel Tech Challenge
Graduated from a full-time, hands-on training accelerator that encourages research, autonomous learning, and teamwork, with an emphasis on the industry's best practices and needs.
2018-2022
BSc Computer Science
Jerusalem College of Technology
Relevant courses: Algorithms, Software Engineering, Automata and Formal Languages, Computability and Computational Complexity, Object-Oriented Programming, Functional Programming, TCP/IP, Calculus, Linear Algebra, Probability, and Statistics.
SKILLS

Languages: Python, SQL
Big Data: Hadoop, Spark
AWS: SageMaker, Redshift, EMR, Glue, S3, Athena
Machine Learning
Business Intelligence with Power BI
Statistical Analysis
EXPERTISE
Data Analysis & Visualization
I specialize in extracting insights from complex datasets and presenting them in a visually compelling manner for informed decision-making.
Machine Learning
Experienced in building predictive models and implementing machine learning algorithms to solve business challenges.
Big Data Technologies
Proficient in handling and analyzing large-scale, unstructured data using big data technologies such as Hadoop and Spark.