Welcome to My Portfolio

I'm a data science professional with a passion for turning data into insights. Feel free to explore my projects and get in touch if you want to collaborate.

Yosef Schoen

DATA SCIENTIST 

Phone: 050-709-3180

Email:

Address: Beit Shemesh, Israel

Date of Birth: August 26, 1996

EXPERIENCE

2025-Present

Data Analytics Instructor

Developers Institute

• Taught a full-spectrum Data Analytics curriculum including Python, PostgreSQL, Excel, Tableau, Power BI, and advanced SQL, with a focus on real-world application and project-based learning.

• Led instruction in data wrangling, object-oriented programming (OOP), data visualization (Matplotlib, Seaborn), A/B testing, and web data analysis using APIs and scraping tools.

• Mentored students through hands-on projects, developed curriculum content, and continuously refined teaching materials based on industry trends and student feedback.

2024-Present

Data Scientist

Pianoid

• Designed and implemented scalable data pipeline infrastructure for large-scale cloud ETL using AWS Glue, EMR, and S3, and queried the results with SQL on Athena.

• Developed and optimized Python (pandas, NumPy) and Apache Spark (PySpark) scripts for feature extraction, transformation, and analysis of large audio datasets.

• Contributed to the development of machine learning and deep learning classification models using TensorFlow and scikit-learn.

• Assisted with algorithm development and advanced analytics for a GAN model.

2023-2024

Data Collector

Talent Solutions (Microsoft)

• Collected and curated a diverse dataset by conducting conversations with various speakers to support the development of a speech-to-text model.

• Managed, cleaned, and preprocessed data to ensure high-quality inputs for model training.

• Labeled and annotated data to establish a reliable ground truth for model evaluation and improvement.

2023

Data Science Intern

Tipalti

• Developed and implemented a transformer-based embedding model, combined with feature reduction and a clustering ML model, to group semantically similar messages from multiple sources while filtering out messages that are syntactically similar but semantically different.

• Cleaned and preprocessed multi-source data, reducing redundancy by 90% to improve efficiency.

• Collaborated with members of other teams to support the decision-making process.

EDUCATION

2022-2023

Data Science Certification

Israel Tech Challenge

Graduated from a full-time, hands-on training accelerator that encourages research, autonomous learning, and teamwork, with an emphasis on industry best practices and needs.

2018-2022

BSc Computer Science

Jerusalem College of Technology

Relevant courses: Algorithms, Software Engineering, Automata and Formal Languages, Computability and Computational Complexity, Object-Oriented Programming, Functional Programming, TCP/IP, Calculus, Linear Algebra, Probability, and Statistics.

SKILLS

Languages: Python, SQL

Big Data: Hadoop, Spark

AWS: SageMaker, Redshift, EMR, Glue, S3, Athena

Machine Learning

Business Intelligence with Power BI

Statistical Analysis

EXPERTISE

Data Analysis & Visualization

I specialize in extracting insights from complex datasets and presenting them in a visually compelling manner for informed decision-making.

Machine Learning

Experienced in building predictive models and implementing machine learning algorithms to solve business challenges.

Big Data Technologies

Proficient in handling and analyzing large-scale, unstructured data using big data technologies such as Hadoop and Spark.

CONTACT