Jaewon Shim
Education
University of California, Berkeley
Data Science B.A. | GPA: 3.9 / 4.0
Dec 2025 | Berkeley, CA
Work Experience
Data Scientist Intern, MKS Instruments
- Led the migration of dashboards from Tableau to Power BI, including optimizing the data ETL process.
- Developed a predictive analytics model from product quality data and built a Power BI dashboard to present the results to stakeholders.
- Automated tasks, including report generation, using Python and Excel VBA, increasing operational efficiency by 30%.
- Conducted exploratory data analysis to draw insights and identify key trends, supporting data-driven decisions.
Python and Mathematics Tutor, Tublet
- Facilitated online programming and statistics tutoring sessions, offering guidance to over 100 students.
- Earned an Honorable IT Tutor Certificate, awarded to the top 1% of tutors in the company, for raising student grades from C or below to A in 96% of tutoring sessions.
Skills
- Python: Pandas, Matplotlib, Seaborn, Scikit-learn, TensorFlow, Keras
- Mathematics/Statistics: Linear Algebra, Applied Calculus, Statistical Modeling, A/B Testing
- SQL: PostgreSQL, MySQL, Microsoft SQL Server, Advanced Queries
- Others: SAP, Smartsheet, Excel VBA, Snowflake, Power Automate
- Business Intelligence: Microsoft Power BI, Tableau
- Machine Learning: Deep Learning, Regression, Classification, Clustering, Ensemble Methods
Projects
Diabetes Classification [Python / Machine Learning]
- Performed a statistical analysis on a diabetes patient dataset and developed a classification model to predict whether a person is diabetic.
- Identified BMI and age as key predictors of diabetes risk through two-sample t-tests.
- Implemented six machine learning classifiers, with the best-performing Extra Trees model correctly classifying 84.5% of all samples.
Deep Learning for Samsung Stock Forecasting [Python / Deep Learning]
- Implemented LSTM and GRU architectures in TensorFlow, achieving an R-squared value of 0.95 with the GRU model.
- Provided actionable insights for stock investors to help them optimize their investment plans.
- Forecasted Samsung stock prices for the following 10 days, advising against investment due to a predicted price decline.
Socioeconomic Factors Impacting COVID-19 [SQL]
- Used a PostgreSQL database and advanced SQL queries to perform multivariate analysis.
- Explored the countries with the highest infection rates along with their corresponding death rates.
- Calculated a global correlation coefficient of -0.751 between GDP and infection rate, highlighting a strong negative association and emphasizing the influence of GDP per capita on the spread of the pandemic.
Root Cause Analysis Dashboard [Power BI], MKS Instruments
- Pinpointed issues in manufacturing or design by analyzing patterns in OBQ, AFR, and WIRR.
- Reduced warranty incidents by 23% in 6 months through quality benchmarking and servicing.
- Enabled data-driven decision-making by aligning OBQ insights with customer reviews to prioritize areas of improvement.
London Bike Rides Dashboard [Tableau]
- Created a moving average visualization with three customizable parameters using the London bike rides dataset.
- Implemented a heatmap with two bar charts in the tooltip, displaying ride length and weather distribution.
Regression Modeling of California Housing Prices [Python / Machine Learning]
- Performed exploratory data analysis and random forest regression modeling to predict house prices.
- Demonstrated proficiency in regression algorithms, data preprocessing, model evaluation, and hyperparameter tuning.
- Explained 80% of the variance in house prices, successfully predicting the price of the target house.
Certificates
- Google Data Analytics Certificate
- DataCamp SQL Certificate
- DataCamp Python Certificate
- IBM Data Science Certificate
Languages
- Korean: Native/Bilingual Proficiency
- English: Native/Bilingual Proficiency