## Electoral Predictions with Twitter: a Machine-Learning approach

Accepted at the 6th Italian Information Retrieval Workshop, Cagliari, Italy, 25–26 May 2015 [1].

Abstract. In this work we study how Twitter can provide insights into the primary elections of an Italian political party. State-of-the-art approaches rely on indicators based on tweet and user volumes, often including sentiment analysis. We investigate how to exploit and improve those indicators in order to reduce the bias of the Twitter user sample. We propose novel indicators and a novel content-based method. Furthermore, we study how a machine learning approach can learn correction factors for those indicators. Experimental results on Twitter data support the validity of the proposed methods and their improvement over the state of the art.

### Training correcting factors

One of the assumptions of this work is that Twitter users are not a representative sample of the voting population. Even if we were able to correctly classify each Twitter user, we would not be able to make a reliable estimate of the voting results, because (i) several Twitter users may not vote, (ii) several voters are not present on Twitter, and (iii) the voters of each candidate have a different degree of representativeness on Twitter.

Given a predictor ϕ(c), we aim at learning a set of weights w_{c}, one for each
candidate, such that w_{c}ϕ(c) improves the estimate of the votes actually
received. The weights w_{c} act as a bridge, correcting an estimate based on
Twitter users so that it fits the behavior of real-world voters. For each region
of Italy and for each candidate c, we create a training instance ⟨y_{c},x_{c}⟩,
where the target variable y_{c} is the percentage of votes actually achieved by c
in the given region, and the input variable x_{c} is the value of a given
estimator ϕ(c). In general, a vector of input variables can be used. With 20
regions and 3 candidates, this yields a data set of 60 training instances. To
conduct a 5-fold cross-validation, the data set was split region-wise into
training and test sets, so that all the instances of a region fall on the same
side of the split. The training set was used to learn, via linear regression, the
weight w_{c} that minimizes the squared error (y_{c} − w_{c} ⋅ ϕ(c))^{2} over the
training instances.
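The fitting step can be illustrated with a minimal sketch. It assumes a single scalar predictor per candidate and no intercept term, as in the squared-error expression above, in which case the least-squares weight has the closed form w_{c} = Σ ϕ(c)·y_{c} / Σ ϕ(c)^{2}, with sums taken over the training regions. The function names, the array layout (one row per region, one column per candidate) and the random fold assignment are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def fit_weights(phi, y):
    """Per-candidate least-squares weight for the model y ~ w_c * phi (no intercept).

    phi, y: arrays of shape (n_regions, n_candidates).
    Returns an array of shape (n_candidates,).
    """
    phi = np.asarray(phi, dtype=float)
    y = np.asarray(y, dtype=float)
    # Closed-form minimiser of sum over regions of (y - w * phi)^2, per column.
    return (phi * y).sum(axis=0) / (phi ** 2).sum(axis=0)

def cross_validate(phi, y, n_folds=5, seed=0):
    """Region-wise k-fold CV: returns corrected predictions for held-out regions."""
    phi = np.asarray(phi, dtype=float)
    y = np.asarray(y, dtype=float)
    n_regions = phi.shape[0]
    rng = np.random.default_rng(seed)
    # Folds are sets of regions, so a region's candidates are never split
    # between training and test data.
    folds = np.array_split(rng.permutation(n_regions), n_folds)
    corrected = np.empty_like(phi)
    for test_idx in folds:
        train_mask = np.ones(n_regions, dtype=bool)
        train_mask[test_idx] = False
        w = fit_weights(phi[train_mask], y[train_mask])
        corrected[test_idx] = w * phi[test_idx]
    return corrected
```

With 20 regions and 5 folds, each iteration learns the three weights on 16 regions and corrects the predictions for the remaining 4.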

| Algorithm | MAE | RMSE | MRM |
|---|---|---|---|
| baseline TweetCount | 0.0818 | 0.1024 | 0.35 |
| baseline UserCount | 0.0940 | 0.1080 | 0.45 |
| UserShare | 0.0536 | 0.0705 | 0.75 |
| ML-ClassTweetCount | 0.0533 | 0.0663 | 0.69 |
| ContentAnalysis | 0.0525 | 0.0630 | 0.70 |

We used three different evaluation measures to assess the approaches discussed in this work. The most commonly used evaluation measure is the Mean Absolute Error (MAE). We also report the Root Mean Squared Error (RMSE), as it is more sensitive to large estimation errors. Finally, since we are also interested in the capability of predicting the correct ranking of the candidates, we introduce the Mean Rank Match (MRM) measure, i.e., the fraction of cases in which the correct ranking of all the candidates was produced. Note that we conducted a per-region analysis, meaning that a prediction is produced for every region by exploiting the regional data only. The presented results are averaged across the 20 Italian regions.

We applied this approach to three basic predictors. As reported in Table 1, these new approaches provide a significant improvement according to all metrics. The improvement is especially large for the MRM metric: we were able to reduce the prediction error on the vote shares to the point of correctly predicting the final ranking of the candidates.

By inspecting the weights learned by the best performing strategy, we see that the three candidates Renzi, Cuperlo and Civati have weights 1.02, 1.24 and 0.70 respectively. A weight above 1 scales the Twitter-based estimate up and a weight below 1 scales it down. This means that the second candidate, Cuperlo, is under-represented in the Twitter data, while the third candidate, Civati, is over-represented.
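The three evaluation measures can be sketched as follows. This is an illustrative implementation under stated assumptions: predictions and actual vote shares are arrays with one row per region and one column per candidate, each measure is computed per region and then averaged across regions, and a region counts as a rank match only when the full ordering of candidates agrees. The function names are chosen for this example.

```python
import numpy as np

def mae(pred, actual):
    """Mean Absolute Error, computed per region and averaged across regions."""
    err = np.abs(np.asarray(pred, dtype=float) - np.asarray(actual, dtype=float))
    return float(err.mean(axis=1).mean())

def rmse(pred, actual):
    """Root Mean Squared Error, computed per region and averaged across regions."""
    sq = (np.asarray(pred, dtype=float) - np.asarray(actual, dtype=float)) ** 2
    return float(np.sqrt(sq.mean(axis=1)).mean())

def mean_rank_match(pred, actual):
    """Fraction of regions where the predicted candidate ranking is fully correct."""
    # argsort of the negated shares gives candidate indices from first to last place.
    pred_order = np.argsort(-np.asarray(pred, dtype=float), axis=1)
    true_order = np.argsort(-np.asarray(actual, dtype=float), axis=1)
    return float(np.all(pred_order == true_order, axis=1).mean())
```

Because RMSE squares the errors before averaging, a single badly estimated candidate raises it more than it raises MAE, whereas MRM ignores the size of the errors and only checks the resulting order.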

The drawback of this approach is that it requires training data from which to
learn the correction weights w_{c}. This makes it impossible to apply the method
directly before the election takes place. On the other hand, we can assume that
the weights are sufficiently stable, i.e., that the degree of representativeness
of the Twitter sample does not change abruptly. If this is the case, then we can
learn those weights by exploiting data from previous events. Indeed, it would be
possible to exploit elections at the municipal, regional and European level to
learn a proper set of weights for national elections. Another interesting case is
that of a two-round voting system, where the model could be trained after the
first round and used to predict the outcome of the second. Yet another option is
to complement the prediction with traditional poll data.
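As a small worked example of reusing previously learned weights, the sketch below applies the three weights reported in this section to a set of Twitter-based shares. The input shares are hypothetical values invented for illustration, not data from the paper, and the helper name `apply_weights` is likewise an assumption.

```python
# Weights reported in this section for the best performing strategy.
weights = {"Renzi": 1.02, "Cuperlo": 1.24, "Civati": 0.70}

def apply_weights(phi, weights):
    """Scale each candidate's Twitter-based estimate by its learned weight."""
    return {candidate: weights[candidate] * share for candidate, share in phi.items()}

# Hypothetical predictor output for one region (NOT taken from the paper).
phi = {"Renzi": 0.60, "Cuperlo": 0.15, "Civati": 0.25}

corrected = apply_weights(phi, weights)
# Candidates ordered from the highest to the lowest corrected estimate.
ranking = sorted(corrected, key=corrected.get, reverse=True)
```

In this made-up case the raw shares place Civati ahead of Cuperlo, while the corrected estimates reverse the two, which is the kind of ranking change the MRM measure rewards when it matches the real outcome. The corrected values are not renormalized here, so they need not sum to one.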