Text analysis in incident duration prediction

Pereira, F.C. and Rodrigues, F. and Ben-Akiva, M.
Transportation Research Part C, Elsevier, 2013

Abstract: Because traffic incidents are heterogeneous and case-by-case in nature, much of the relevant information about them is communicated in free-text fields rather than constrained value fields. As a result, these text components contain rich detail that is invaluable for incident analysis, modeling and prediction. However, the difficulty of formally interpreting such data has led to minimal consideration in previous work.

This paper proposes the use of topic modeling, a text analysis technique, for the problem of incident duration prediction. We analyze a dataset of two years of accident cases and develop a duration prediction model that considers both textual and non-textual features. To demonstrate the value of the approach, we compare predictions with and without text analysis using several different prediction models.
