Modelling transport choices with survey data streams

Przemysław Wrona

supervisor: Maciej Grzenda



One of the driving forces of the information society is the Internet of Things: a set of technologies that allow any device, for example an intelligent sensor, to be connected to the Internet. Such devices produce never-ending data streams. One example is the stream of current locations of public transport vehicles such as buses, trams or metro trains.

The approach used in this work is to adapt traditional machine learning algorithms and artificial neural networks to a non-stationary environment, in which the training data arrives as an endless stream.
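As an illustration of learning from an endless stream, the following minimal sketch updates a linear model one example at a time and evaluates it in a test-then-train (prequential) fashion. It is not the method used in this work: the `OnlineLinearRegressor` class, the `synthetic_stream` generator with its abrupt change halfway through, and all parameter values are hypothetical and chosen only to show how a model can keep adapting when the data distribution shifts.

```python
import random


class OnlineLinearRegressor:
    """Hypothetical linear model updated one example at a time with SGD."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_one(self, x):
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def learn_one(self, x, y):
        # Single gradient step on the squared error of this one example.
        error = self.predict_one(x) - y
        self.bias -= self.learning_rate * error
        self.weights = [
            w - self.learning_rate * error * xi
            for w, xi in zip(self.weights, x)
        ]


def synthetic_stream(n, seed=0):
    """Yield (features, target) pairs; the target relation changes halfway."""
    rng = random.Random(seed)
    for i in range(n):
        x = [rng.random(), rng.random()]
        slope = 2.0 if i < n // 2 else 4.0  # assumed abrupt concept drift
        y = slope * x[0] + 1.0 * x[1] + rng.gauss(0, 0.05)
        yield x, y


def prequential_mae(model, stream):
    """Test-then-train: predict each example first, then learn from it."""
    total, count = 0.0, 0
    for x, y in stream:
        total += abs(model.predict_one(x) - y)
        model.learn_one(x, y)
        count += 1
    return total / count if count else float("nan")


model = OnlineLinearRegressor(n_features=2)
mae = prequential_mae(model, synthetic_stream(2000))
print(f"prequential MAE: {mae:.3f}, first weight: {model.weights[0]:.2f}")
```

Because every example is used for testing before it is used for training, the reported error reflects performance on unseen data without requiring a separate hold-out set, which is not available when the stream never ends.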

In this paper, we investigate day-to-day variability in public transport travel time using a GPS data set for a public transport route. We explore the nature and shape of travel time distributions for different departure time windows at different times of the day, as well as the factors causing travel time variability, such as the distance between stops and the destination, the quality of the vehicle, the number of seats, the delay at the previous stop, and historical data. Additionally, the data from the streams is used to build timetable files (GTFS) based on actual journeys, which serve as one of the inputs to the model. This data will be used to build a realistic model of travel behaviour and to study the factors that have the greatest impact on transport choices. We demonstrate the system with a real-world use case in the city of Warsaw, Poland.
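The idea of comparing travel time distributions across departure time windows can be sketched as follows. This is only an illustration under stated assumptions: the `(departure minute of day, travel time in minutes)` record layout, the one-hour window width, the helper names and the sample values are all hypothetical and are not taken from the Warsaw data.

```python
from collections import defaultdict
from statistics import mean, median, pstdev


def window_label(departure_minutes, window_minutes=60):
    """Map a departure time (minutes after midnight) to its window label."""
    start = (departure_minutes // window_minutes) * window_minutes
    end = start + window_minutes
    return f"{start // 60:02d}:{start % 60:02d}-{end // 60:02d}:{end % 60:02d}"


def summarise_by_window(trips, window_minutes=60):
    """Group travel times by departure window and compute summary statistics."""
    grouped = defaultdict(list)
    for departure_minutes, travel_time in trips:
        grouped[window_label(departure_minutes, window_minutes)].append(travel_time)
    return {
        label: {
            "n": len(times),
            "mean": mean(times),
            "median": median(times),
            "std": pstdev(times),  # population standard deviation
        }
        for label, times in sorted(grouped.items())
    }


# Made-up illustrative trips: (departure minute of day, travel time in minutes).
trips = [
    (7 * 60 + 5, 31.0),
    (7 * 60 + 40, 35.0),
    (8 * 60 + 10, 42.0),
    (8 * 60 + 50, 38.0),
    (13 * 60 + 15, 27.0),
]

summary = summarise_by_window(trips)
for label, stats in summary.items():
    print(label, stats)
```

With real GPS traces, the same grouping could be repeated per day to separate within-window spread from day-to-day variability.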