Philips Domestic Appliances
Koninklijke Philips N.V. (lit. 'Royal Philips'), commonly shortened to Philips, is a Dutch multinational conglomerate corporation that was founded in Eindhoven in 1891.
Philips was once one of the largest consumer electronics companies in the world, but later focused on health technology, having divested its other divisions.
Worked directly under the Global Marketing Director and the Head of Digital Transformation of Philips Domestic Appliances.
Led a project focused on analyzing the impact of Net Promoter Score (NPS) on customer purchase and referral behavior, specifically in the context of repair and replacement services.
NPS, a key metric for assessing customer experience, was evaluated across various touchpoints to understand its influence on sales performance.
Lead Data Scientist responsible for overseeing the entire data science project lifecycle, including data collection, problem formulation, model development, and stakeholder management.
Key responsibilities include identifying business challenges and translating them into data-driven solutions, ensuring data quality and integrity, building and deploying predictive models, performing advanced statistical analysis, and communicating actionable insights to cross-functional teams.
Additionally, responsible for mentoring junior data scientists, coordinating with engineering teams to integrate models into production, and maintaining strong relationships with stakeholders to ensure alignment with business objectives.
Successfully integrated NPS data from an external system and developed a methodology to test hypotheses on its impact on sales.
Demonstrated a strong correlation between NPS and sales growth in certain categories over a 1.5-year period, while other categories showed inconclusive results.
Findings aligned with industry literature, affirming NPS as a reliable short-term sales predictor.
Additionally, decomposed and removed yearly seasonality from time series data to improve model accuracy and clarity.
Pandas: For data manipulation, cleaning, and merging external NPS data.
NumPy: For numerical operations and handling large datasets.
Matplotlib/Seaborn: For data visualization, helping to visualize trends, seasonality, and correlations.
SciPy: For hypothesis testing and statistical analysis.
Example: scipy.stats.ttest_ind() to test whether mean sales differ significantly between two independent groups (e.g., high-NPS vs. low-NPS segments).
Statsmodels: For more advanced statistical modeling, including time series decomposition and regression analysis.
scikit-learn: For implementing machine learning algorithms, particularly for correlation analysis and regression.
Prophet (Facebook): A specialized library for time series forecasting, which helps handle seasonality and trend decomposition.
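The scipy.stats.ttest_ind() usage mentioned under SciPy could look roughly like the sketch below. It assumes sales observations are split into two independent groups by NPS level; the data is synthetic and the names sales_high_nps and sales_low_nps are hypothetical, not taken from the project.

```python
import numpy as np
from scipy import stats

# Synthetic, illustrative data only (not project data).
rng = np.random.default_rng(0)
sales_high_nps = rng.normal(loc=110.0, scale=10.0, size=200)  # hypothetical group
sales_low_nps = rng.normal(loc=100.0, scale=10.0, size=200)   # hypothetical group

# Welch's t-test: compares the two group means without assuming equal variances.
result = stats.ttest_ind(sales_high_nps, sales_low_nps, equal_var=False)
t_stat, p_value = result.statistic, result.pvalue

print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

Welch's variant (equal_var=False) is used here as a cautious default, since there is no reason to assume both groups have the same variance.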
Explanation:
Medallia: a third-party company that provides software to gather and track NPS (Net Promoter Score) results.
Time Series Decomposition: To separate seasonality and trend from the time series data. statsmodels.tsa.seasonal.seasonal_decompose() or Prophet can be used for trend/seasonality analysis.
ARIMA (AutoRegressive Integrated Moving Average): For modeling time series data and predicting future values based on past behavior. statsmodels provides an implementation in statsmodels.tsa.arima.model.ARIMA.
Correlation Analysis (Pearson or Spearman):
- Pearson's coefficient measures linear correlation.
- Spearman and Kendall coefficients compare the ranks of data.
NumPy and pandas correlation functions:
- NumPy: np.corrcoef() returns a matrix of Pearson coefficients.
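A small sketch of np.corrcoef() on two synthetic arrays; the names nps and sales and the relationship between them are made up for illustration.

```python
import numpy as np

# Synthetic data: sales depend linearly on NPS, plus noise.
rng = np.random.default_rng(3)
nps = rng.normal(40, 5, 100)
sales = 2.0 * nps + rng.normal(0, 5, 100)

# Returns a 2x2 symmetric matrix; the diagonal is 1.0 and the
# off-diagonal entries hold the Pearson coefficient between the inputs.
corr_matrix = np.corrcoef(nps, sales)
r = corr_matrix[0, 1]
print(corr_matrix)
```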
Lag Analysis: To determine the delayed impact (e.g., 1.5 years) of NPS on sales. The pandas Series.shift() / DataFrame.shift() methods can be used to create lagged features for sales and NPS scores to test delayed effects.
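One way to sketch a lag analysis is to shift the NPS series by several candidate lags and compare the resulting correlations with sales. The data below is synthetic, with a lag of 3 periods deliberately built in; the candidate range 0 to 6 is an arbitrary assumption.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(4)
n, true_lag = 120, 3
nps = pd.Series(rng.normal(40, 5, n))

# Sales respond to the NPS value from `true_lag` periods earlier, plus noise.
sales = 2.0 * nps.shift(true_lag) + pd.Series(rng.normal(0, 2, n))
df = pd.DataFrame({"nps": nps, "sales": sales})

# Correlate sales with NPS shifted by each candidate lag.
# Series.corr() ignores rows where either value is NaN.
lag_corr = {lag: df["sales"].corr(df["nps"].shift(lag)) for lag in range(0, 7)}

best_lag = max(lag_corr, key=lag_corr.get)
print(lag_corr, best_lag)
```

With real data the peak is rarely this clean, so the correlation at each lag should be read together with its significance rather than by picking the maximum alone.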
Rolling Window Analysis: To smooth the data and observe trends over time by applying rolling statistics. The pandas Series.rolling() method can be used to calculate moving averages or rolling correlations.
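A brief sketch of a moving average and a rolling correlation with pandas, using synthetic monthly data; the 6-month window is an assumption chosen only for illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(5)
idx = pd.date_range("2018-01-01", periods=36, freq="MS")
nps = pd.Series(rng.normal(40, 5, 36), index=idx)
sales = 2.0 * nps + rng.normal(0, 5, 36)

window = 6  # hypothetical window length, in months

# Moving average: the first window-1 values are NaN (not enough history).
sales_ma = sales.rolling(window=window).mean()

# Rolling Pearson correlation between sales and NPS over the same window.
rolling_corr = sales.rolling(window=window).corr(nps)
```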
SciPy:
- Statistical routines in scipy.stats: pearsonr(), spearmanr(), kendalltau()
- Note that these functions return objects that contain two values: the correlation coefficient and the p-value (https://towardsdatascience.com/p-values-explained-by-data-scientist-f40a746cfc8)
- P-value:
- A small p-value (< 0.05) suggests a statistically significant correlation.
- A large p-value (> 0.05) means the data do not provide evidence of a significant correlation.
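The three scipy.stats routines and the p-value rule of thumb above can be sketched as follows. The data is synthetic, and the 0.05 threshold is the conventional cutoff mentioned above rather than a project-specific value.

```python
import numpy as np
from scipy import stats

# Synthetic data: sales depend linearly on NPS, plus noise.
rng = np.random.default_rng(6)
nps = rng.normal(40, 5, 80)
sales = 2.0 * nps + rng.normal(0, 5, 80)

# Each result unpacks into (coefficient, p-value).
pearson_r, pearson_p = stats.pearsonr(nps, sales)
spearman_rho, spearman_p = stats.spearmanr(nps, sales)
kendall_tau, kendall_p = stats.kendalltau(nps, sales)

alpha = 0.05
for name, coef, p in [
    ("Pearson", pearson_r, pearson_p),
    ("Spearman", spearman_rho, spearman_p),
    ("Kendall", kendall_tau, kendall_p),
]:
    verdict = "significant" if p < alpha else "not significant"
    print(f"{name}: coefficient = {coef:.3f}, p = {p:.3g} ({verdict})")
```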
Pandas:
- Use the .corr() method on a DataFrame to calculate any of the three correlation coefficients. Select the statistic with the method parameter, which accepts 'pearson' (the default), 'spearman', or 'kendall'.
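A short sketch of DataFrame.corr() with each method value, again on synthetic data with illustrative column names.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
nps = rng.normal(40, 5, 100)
df = pd.DataFrame({"nps": nps, "sales": 2.0 * nps + rng.normal(0, 5, 100)})

# One correlation matrix per method; each is a 2x2 DataFrame
# indexed by column name.
corr_tables = {
    method: df.corr(method=method)
    for method in ("pearson", "spearman", "kendall")
}

for method, table in corr_tables.items():
    print(method, round(table.loc["nps", "sales"], 3))
```

Unlike the scipy.stats functions, DataFrame.corr() returns only the coefficients, so p-values still need to be computed separately.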