Hai Bao
647-831-8623 | hbao12@gmail.com | hbao.ca
SUMMARY STATEMENT
Senior Data Scientist with proven expertise in integrating LLMs (such as GPT) to optimize data harvesting, reducing daily article volume by 25%. Skilled in text mining and in building Python/Plotly monitoring dashboards, with a record of significantly reducing workload for ESG analysts.
EXPERIENCE
Senior Data Scientist
Jan 2023 - Present
Morningstar Sustainalytics, Toronto, ON
● Responsible for maintaining the News Harvester System, which imports 400,000+ articles daily and filters them for ESG relevance ahead of ESG Analyst review.
● Integrated an LLM-based solution (GPT) into the News Harvester System to filter news articles more efficiently, reducing daily article volume by 25% (approximately 1,000 fewer articles per day).
● Created a Plotly dashboard in Python to monitor daily metrics and assist with anomaly detection. The dashboard is consumed by stakeholders through a daily automated email.
● Evaluated model performance using scikit-learn.
Data Scientist
Nov 2021 - Jan 2023
Morningstar Sustainalytics, Toronto, ON
● Responsible for maintaining the Thematic Crawler (web/internal database crawler).
● Improved the crawler's text mining performance by using multilayered keywords, reducing the number of hits by approximately 50% and effectively cutting ESG Analysts' workload in half.
Network Capacity Planner
Dec 2014 - Apr 2020
Zayo, Toronto, ON
● Responsible for identifying cost optimization opportunities through SQL queries of the network equipment inventory database.
● Converted these opportunities into realized cost savings by implementing network engineering projects (node installations/decommissions, customer circuit migrations).
● Worked on a project to vacate 2 floors of a data center, realizing $2 million in cost savings.
EDUCATION & CERTIFICATIONS
Bachelor of Science, University of Toronto
Sep 2005 - Aug 2010
Statistics Specialist
Machine Learning, IBM
Sep 2021
Covered topics such as Regression, Classification, Deep Learning, Time Series, Unsupervised Machine Learning
Data Analytics, Google
Jul 2021
Covered topics such as Exploratory Data Analysis, Data Engineering, Data Cleaning, Creating Visualizations
Natural Language Processing, DeepLearning.AI
Dec 2021
Covered topics such as Sentiment Analysis, Part-of-speech Tags, Named Entity Recognition
Tableau Desktop Specialist, Tableau
Nov 2020
SKILLS
Python Programming
Data Analytics
Data Visualizations
Machine Learning Modelling (XGBoost, Random Forest, etc.)
GenAI/LLM Integrated Solutions
GitHub