Electoral Predictions with Twitter: a Machine-Learning approach
Accepted at the 6th Italian Information Retrieval Workshop, Cagliari, Italy, 25–26 May 2015 [1].
Abstract. In this work we study how Twitter can provide interesting insights concerning the primary elections of an Italian political party. State-of-the-art approaches rely on indicators based on tweet and user volumes, often including sentiment analysis. We investigate how to exploit and improve those indicators in order to reduce the bias of the Twitter user sample. We propose novel indicators and a novel content-based method. Furthermore, we study how a machine learning approach can learn correction factors for those indicators. Experimental results on Twitter data support the validity of the proposed methods and their improvement over the state of the art.
Training correction factors
One of the assumptions of this work is that Twitter users are not a representative sample of the voters population. Even if we were able to correctly classify each Twitter user, we would not be able to make a reliable estimate of the voting results as (i) several Twitter users may not vote, (ii) several voters are not present on Twitter, and (iii) the voters of each candidate have a different degree of representativeness in Twitter.
Given a predictor ϕ(c), we aim at learning a set of weights wc, one for each candidate, such that wc·ϕ(c) improves the estimate of the votes actually received by c. The weights wc should act as a bridge, correcting an estimate based on Twitter users so that it fits real-world voting behavior. For each region of Italy and for each candidate c, we create a training instance ⟨yc, xc⟩, where yc is the target variable, equal to the percentage of votes actually achieved by c in the given region, and xc is the input variable, equal to a given estimator ϕ(c). In general, a vector of input variables can be used. We thus have a training data set with 60 training instances, coming from 20 regions and 3 candidates. To conduct a 5-fold cross-validation, the data set was split region-wise into training and test sets. The training set was used to learn, via linear regression, a weight wc that minimizes the squared error (yc − wc·ϕ(c))² summed over the training instances of candidate c.
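The following is a minimal sketch of this weight-learning step, assuming the Twitter-based estimates and the actual vote shares are arranged in two arrays of shape (regions × candidates); the array names and the use of scikit-learn's KFold for the region-wise split are illustrative choices, not part of the original experimental setup.

```python
# Illustrative sketch: per-candidate correction weights with a region-wise split.
import numpy as np
from sklearn.model_selection import KFold

def learn_weights(phi, y):
    """One weight per candidate: w_c minimizing sum_r (y[r, c] - w_c * phi[r, c])^2.
    Closed-form solution of the one-parameter regression without intercept."""
    return (phi * y).sum(axis=0) / (phi ** 2).sum(axis=0)

def region_wise_cv(phi, y, n_splits=5, seed=0):
    """5-fold cross-validation where each fold holds out whole regions (rows)."""
    maes = []
    for train_idx, test_idx in KFold(n_splits, shuffle=True, random_state=seed).split(phi):
        w = learn_weights(phi[train_idx], y[train_idx])   # fit on training regions only
        pred = phi[test_idx] * w                          # corrected estimates for held-out regions
        maes.append(np.abs(pred - y[test_idx]).mean())
    return float(np.mean(maes))                           # mean absolute error across folds
```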
Table 1. Performance of the baseline and proposed approaches, averaged across the 20 Italian regions.

Algorithm            | MAE    | RMSE   | MRM
baseline TweetCount  | 0.0818 | 0.1024 | 0.35
baseline UserCount   | 0.0940 | 0.1080 | 0.45
UserShare            | 0.0536 | 0.0705 | 0.75
ML-ClassTweetCount   | 0.0533 | 0.0663 | 0.69
ContentAnalysis      | 0.0525 | 0.0630 | 0.70
We used three different evaluation measures to assess the approaches discussed in this work. The most commonly used is the Mean Absolute Error (MAE). We also report the Root Mean Squared Error (RMSE), as it is more sensitive to large estimation errors. Finally, since we are also interested in the capability of predicting the correct ranking of the candidates, we introduce the Mean Rank Match (MRM) measure, i.e., the fraction of regions for which the correct ranking of all the candidates was predicted. Note that we conducted a per-region analysis, meaning that a prediction is produced for every region by exploiting the regional data only. The presented results are averaged across the 20 Italian regions.

We applied this weight-learning approach to three basic predictors. As reported in Table 1, these new approaches provide a significant improvement according to all metrics. The largest improvement is observed on the MRM metric: we were able to reduce the prediction error on the vote shares to the point of correctly predicting the final ranking of the candidates in most regions. By inspecting the weights learned by the best performing strategy, we see that the three candidates Renzi, Cuperlo and Civati have weights 1.02, 1.24 and 0.70 respectively. This means that the second candidate, Cuperlo, is under-represented in the Twitter data, while the third, Civati, is over-represented.
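For concreteness, the three evaluation measures above can be sketched as follows, again assuming that predicted and actual vote shares are stored in two arrays of shape (regions × candidates); the variable names are illustrative.

```python
import numpy as np

def mae(pred, true):
    """Mean Absolute Error over all (region, candidate) pairs."""
    return float(np.abs(pred - true).mean())

def rmse(pred, true):
    """Root Mean Squared Error; penalizes large estimation errors more heavily."""
    return float(np.sqrt(((pred - true) ** 2).mean()))

def mean_rank_match(pred, true):
    """Fraction of regions where the predicted candidate ranking matches the actual one."""
    pred_order = np.argsort(-pred, axis=1)   # candidates sorted by predicted share
    true_order = np.argsort(-true, axis=1)   # candidates sorted by actual share
    return float((pred_order == true_order).all(axis=1).mean())
```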
The drawback of this approach is that it requires training data from which to learn the correction weights wc. This makes it impossible to apply the method directly before the election takes place. On the other hand, we can assume that the weights are sufficiently stable, i.e., that the degree to which each candidate's electorate is represented on Twitter does not change abruptly over time. If this is the case, then we can learn those weights by exploiting data from previous events. Indeed, it would be possible to exploit elections at the municipal, regional and European level to learn a proper set of weights for national elections. Another interesting case is that of a two-round voting system, where the model could be trained after the first round and used to predict the outcome of the second. Yet another option is to complement the prediction with traditional poll data.