Harshal Talele

About

Harshal is a technology-driven individual with a Master's degree in Data Science, currently working with Blue Matter Consulting in their Insights & Analytics team.

Having a strong background in data science tools and technologies, he has demonstrated expertise in crafting end-to-end solutions that transform data into actionable insights to support business growth for life sciences companies. He strives to create an impactful career in the field of data science and establish long-term associations with organizations in need of data-backed business problem-solving.

In addition to his passion for data science, Harshal is also an enthusiastic public speaker, always seeking opportunities to share his knowledge and insights. Whether through presentations, hosting events, or participating in podcasts, he thrives on engaging with others and contributing to meaningful discussions.

Thank you for visiting! Feel free to explore the portfolio and reach out for work opportunities or inquiries. Together, let's harness the potential of data and make a difference!

Skills

  • Programming Languages: Python, R, PySpark, C/C++, Java, JavaScript, PHP, HTML/CSS
  • Database: MySQL, PostgreSQL, NoSQL, MongoDB, Google Firebase, Oracle SQL Server
  • Analytics/Cloud: Tableau, Looker, Databricks, Spark, Alteryx, AWS (S3, Redshift, Athena), MLFlow, IBM Cognos, Denodo, Advanced MS Excel
  • IDEs/Frameworks: Jupyter, Git, VSCode, RStudio, MS Office, Anaconda, Eclipse, Apache Spark
  • Development: Angular, Webflow, WordPress, Android Studio
  • Data Science: Machine Learning, Natural Language Processing, Generative Models, Time Series Forecasting, ETL, Exploratory Data Analysis, Hypothesis Testing, Statistical Analysis, A/B Testing, Quality Assurance and Control

Resume

Education

Master of Science - Data Science

Aug 2022 - Dec 2023

University of Rochester, USA

  • Key Courses – NLP, Statistical Machine Learning, Time Series Analysis, Computational Statistics, Data Mining
  • Recipient of a 30% tuition scholarship
  • Graduate Teaching Assistant for GBA465 - Python Analytics course
  • Supported accessibility and inclusivity by managing and proctoring exams for students at the Office of Disability Resources

Bachelor of Engineering - Computer Science

Jun 2016 - Aug 2020

Savitribai Phule Pune University, India

  • Key Courses – Database Management Systems, Data Analytics, Data Structures, Distributed Systems
  • Volunteered to organize and participate in non-technical events such as debates and treasure hunts, and technical events such as hackathons

Work Experience

Associate, Insights & Analytics

Jan 2024 - Present

Blue Matter Consulting

Data Analyst

Sep 2020 - Jun 2022

IQVIA (159 Solutions Inc.)

  • Provided analytical support by designing a launch tracker and creating dashboard reports for the client's new drug launch
  • Utilized Alteryx, MS Excel, and Tableau to process EDI 852, 867 sales, and EPI data to generate weekly stakeholder deliverables
  • Conducted ad-hoc analysis using HCOS, DDD data, and PLD to analyze patients and sales across demographics and geographies
  • Collaborated with cross-functional teams to migrate database from Teradata & Azure to AWS S3 by conducting QC and sanity checks
  • Recognized with the ‘Ovation Award’ for managing the offshore workstream, which included organizing daily catch-ups and facilitating client engagement and stakeholder deliverables as per client needs

Portfolio

Forecasting Bike Inventory for Citibike

PySpark, Databricks, MLFlow, ETL, EDA

Developed an application for Citibike, which has 50k+ daily users, to ensure the availability of bikes and empty docks at stations in New York City. Implemented an ETL pipeline with Spark streaming, performed EDA, built an optimized forecasting model for net bike change, and fine-tuned hyperparameters for enhanced accuracy.

Analyzing Political Interests of Indian Americans

Python, Topic modelling, Sentiment Analysis, Tableau, Twitter API

Analyzed tweets from 2.8k unique Indian Americans to identify their political trends and biases ahead of the 2024 presidential election. Designed visuals illustrating prevalent topics and sentiment within the community based on party affiliations and geographic locations.

Dynamic QA Generator for Research Papers

Python, Large Language Models, Data Preparation, OpenAI API

Developed a QA model to aid efficient comprehension of research papers by summarizing relevant information in the form of Q&A. Fine-tuned T5 models on an OpenAI-modified QASPER dataset with 1.5k+ papers for question generation and answer generation tasks.

COVID-19 Cases Prediction in Ohio

Ensemble Learning, Feature Engineering, Model Evaluation

Built an ML model to predict COVID-19 cases using Ohio county-level time series data and analyzed the impact of tweet-based social awareness. Achieved a top 30% ranking with an R² score of 0.89 in the Kaggle competition.

Classification of Tweets from Northern Europe

Python, Scikit Learn, Support Vector Classifier, Feature Engineering

Built an ML classification model to predict the political polarity of multilingual text using 500k+ tweets, achieving an accuracy of 79%. Implemented lemmatization, POS tagging, CountVectorizer, and TF-IDF Transformer for effective text cleaning and feature engineering.

Contact

I would love to hear from you! Whether you have questions, collaboration opportunities, or just want to connect, please feel free to reach out by email or on any social media platform. If you are near the location below, I would be open to meeting!

Location:

South San Francisco, California