This dissertation investigates whether individual tweets related to the S&P 500 Index can predict the volatility of future returns. A sample of 3,329,267 tweets containing the keyword “SPX” was collected over the period 2012 to 2021. We applied Principal Component Analysis (PCA) to reduce the dimensionality of the word frequency data and then added the resulting components to the Heterogeneous Autoregressive (HAR) model. We evaluated the in-sample and out-of-sample forecasting performance of various HAR-PCA models under different estimation window schemes and compared them with the original HAR model. We found that HAR-PCA models generally outperform the HAR model, especially during periods of particularly high and particularly low volatility. Our findings demonstrate the economic relevance of HAR-PCA models for portfolio investment and contribute to the literature by linking investor sentiment to return volatility through a word-based method, which avoids the complications of applying advanced algorithms.
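The HAR-PCA idea described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: it assumes the standard HAR lags (daily, 5-day and 22-day averages of realised volatility), a daily word-count matrix with one row per day, and plain least squares. The function names `har_features`, `pca_scores` and `fit_har_pca`, and the choice of `k` components, are hypothetical.

```python
import numpy as np

def har_features(rv):
    """Daily, weekly (5-day) and monthly (22-day) averages of realised volatility.

    Row i of the result describes day t = i + 21, the first day with a full
    22-day history.
    """
    rv = np.asarray(rv, dtype=float)
    rows = []
    for t in range(21, len(rv)):
        rows.append([rv[t], rv[t - 4:t + 1].mean(), rv[t - 21:t + 1].mean()])
    return np.array(rows)

def pca_scores(word_counts, k):
    """Scores of the first k principal components of a (days x words) matrix.

    Assumption: PCA is fitted on the whole sample, which is acceptable for an
    in-sample illustration but would leak future information in a genuine
    out-of-sample forecast.
    """
    X = np.asarray(word_counts, dtype=float)
    Xc = X - X.mean(axis=0)                      # centre each word column
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :k] * S[:k]

def fit_har_pca(rv, word_counts, k=2):
    """Regress next-day volatility on HAR lags plus k word-frequency components."""
    rv = np.asarray(rv, dtype=float)
    har = har_features(rv)[:-1]                  # predictors for days 21 .. n-2
    pcs = pca_scores(word_counts, k)[21:-1]      # components for the same days
    y = rv[22:]                                  # next-day volatility targets
    X = np.column_stack([np.ones(len(y)), har, pcs])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, X, y
```

Dropping the `pcs` columns from `X` gives the plain HAR regression, so the two specifications can be compared on the same target vector `y`.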
| | |
|---|---|
| Date of Award | 5 Dec 2023 |
| Original language | English |
| Awarding Institution | |
| Supervisor | Manuela Pedio (Supervisor) & Nick J Taylor (Supervisor) |
Predicting volatility with Twitter sentiment: an application to the US stock market.
Yang, C. (Author). 5 Dec 2023
Student thesis: Master's Thesis › Master of Philosophy (MPhil)