This script returns the most frequented location (MFL) of each individual during weekday daytime and nighttime, according to user-defined time windows. The MFL for a given time window is defined as the location in which the individual has spent the most time. The algorithm considers two time scales, hours and days. Ideally, for a given individual, the MFLs detected at the two scales are the same.
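As a rough illustration of the two scales, the sketch below counts distinct hours and distinct days per location for one individual inside one time window. It is not the code of MFL.py: the function name `most_frequented` and the tuple layout are assumptions made for this example, and ties are simply resolved in favour of the first location encountered.

```python
from collections import defaultdict

def most_frequented(records):
    """Return (MFL at hour scale, MFL at day scale) for ONE individual.

    records: iterable of (year, month, day, hour, location) tuples, already
    restricted to a single time window (hypothetical layout for this sketch).
    """
    hours_at = defaultdict(set)  # location -> distinct (year, month, day, hour)
    days_at = defaultdict(set)   # location -> distinct (year, month, day)
    for year, month, day, hour, loc in records:
        hours_at[loc].add((year, month, day, hour))
        days_at[loc].add((year, month, day))
    if not hours_at:
        return None, None
    # max() keeps the first maximal key, so ties follow insertion order here.
    mfl_hours = max(hours_at, key=lambda loc: len(hours_at[loc]))
    mfl_days = max(days_at, key=lambda loc: len(days_at[loc]))
    return mfl_hours, mfl_days

records = [
    (2016, 1, 4, 9, "A"), (2016, 1, 4, 10, "A"), (2016, 1, 4, 11, "A"),
    (2016, 1, 5, 9, "B"), (2016, 1, 6, 9, "B"),
]
print(most_frequented(records))  # ('A', 'B')
```

In this toy trajectory the two scales disagree: location A accumulates more hours (3 against 2), while location B is visited on more distinct days (2 against 1).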
The algorithm takes as input a 6-column CSV file with a header row of column names; the value separator is a semicolon ";". Each row of the file represents a spatio-temporal position in an individual's trajectory. The time is given by columns 2 to 5 (year, month, day and hour). Note that the table must be SORTED by individual ID and time. The columns are:
- ID of the individual
- Year
- Month (1->12)
- Day (1->31)
- Hour (0->23)
- ID of the geographical location
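A file in this format can be read with Python's standard `csv` module. The sketch below is only an assumption about how one might load it, not the script's own reader: the function name `read_trajectories` is hypothetical, columns are accessed by position because the header names are not specified here, and `itertools.groupby` is used to show why the rows need to be sorted by individual ID.

```python
import csv
from itertools import groupby

def read_trajectories(path):
    """Yield (individual_id, rows) pairs from a semicolon-separated file.

    rows is a list of (year, month, day, hour, location_id) tuples.
    Assumes the file is sorted by individual ID, as required above.
    """
    with open(path, newline="") as f:
        reader = csv.reader(f, delimiter=";")
        next(reader, None)  # skip the header row of column names
        parsed = (
            (r[0], (int(r[1]), int(r[2]), int(r[3]), int(r[4]), r[5]))
            for r in reader
        )
        # groupby only merges ADJACENT rows with the same ID, hence the sorting.
        for ind_id, group in groupby(parsed, key=lambda item: item[0]):
            yield ind_id, [row for _, row in group]
```

If the file were not sorted, the same individual would be yielded several times, once per contiguous run of rows.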
The algorithm has 6 parameters:
- wdinput: Path of the input file (ex: "input.csv")
- wdoutput: Path of the output file (ex: "outputoftheawsomemaximelenormandsalgorithm.csv")
- minH: Lower bound (included) of the night time window (ex: 20h)
- maxH: Upper bound (included) of the night time window (ex: 7h)
- minW: Lower bound (included) of the day time window (ex: 8h)
- maxW: Upper bound (included) of the day time window (ex: 19h)
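With the example values, the night window (20h to 7h) wraps around midnight while the day window (8h to 19h) does not. The sketch below shows one way to test an hour against an inclusive window in both cases, plus a weekday filter. Both helper names (`in_window`, `is_weekday`) are hypothetical, and treating Monday to Friday as weekdays is an assumption of this example rather than something stated by the script.

```python
from datetime import date

def in_window(hour, lower, upper):
    """True if hour lies in the inclusive window [lower, upper] on a 24-hour clock."""
    if lower <= upper:
        return lower <= hour <= upper
    # The window wraps past midnight, e.g. 20 -> 7.
    return hour >= lower or hour <= upper

def is_weekday(year, month, day):
    """True for Monday to Friday (assumed meaning of 'weekday' in this sketch)."""
    return date(year, month, day).weekday() < 5

print(in_window(23, 20, 7))    # True  (night window, before midnight)
print(in_window(7, 20, 7))     # True  (upper bound is included)
print(in_window(8, 20, 7))     # False
print(in_window(19, 8, 19))    # True  (day window, upper bound included)
print(is_weekday(2016, 1, 2))  # False (a Saturday)
```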
The algorithm returns a 19-column CSV file with a header row of column names; the value separator is a semicolon ";". Each row represents an individual:
- ID: ID of the individual
- NbMonths: Number of distinct months covered by the trajectory
- NbConsMonths: Maximum number of consecutive months covered by the trajectory
- MFLHomeDays: MFL during nighttime (day scale), 'NoMFL' if NbDaysHome=0
- MFLHomeDays2: Second MFL during nighttime (day scale) if ex aequo, 'NoMFL' otherwise
- NbDaysHomeMFL: Number of distinct days (during nighttime) spent in the MFL
- NbDaysHome: Number of distinct days covered by the trajectory (during nighttime)
- MFLHomeHours: MFL during nighttime (hour scale), 'NoMFL' if NbHoursHome=0
- MFLHomeHours2: Second MFL during nighttime (hour scale) if ex aequo, 'NoMFL' otherwise
- NbHoursHomeMFL: Number of distinct hours spent in the MFL (during nighttime)
- NbHoursHome: Number of distinct hours covered by the trajectory (during nighttime)
- MFLWorkDays: MFL during daytime (day scale), 'NoMFL' if NbDaysWork=0
- MFLWorkDays2: Second MFL during daytime (day scale) if ex aequo, 'NoMFL' otherwise
- NbDaysWorkMFL: Number of distinct days spent in the MFL (during daytime)
- NbDaysWork: Number of distinct days covered by the trajectory (during daytime)
- MFLWorkHours: MFL during daytime (hour scale), 'NoMFL' if NbHoursWork=0
- MFLWorkHours2: Second MFL during daytime (hour scale) if ex aequo, 'NoMFL' otherwise
- NbHoursWorkMFL: Number of distinct hours spent in the MFL (during daytime)
- NbHoursWork: Number of distinct hours covered by the trajectory (during daytime)
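Each group of four columns (for example MFLHomeDays, MFLHomeDays2, NbDaysHomeMFL, NbDaysHome) follows the same pattern. The sketch below shows how such a group could be derived from per-location counts. It is an illustration under stated assumptions, not the script's implementation: the function name `mfl_fields` is hypothetical, tied locations are ordered by their ID, only the first two tied locations are reported, and the MFL count is set to 0 when there is no MFL.

```python
def mfl_fields(counts, total):
    """Build (MFL, MFL2, NbMFL, NbTotal) for one time window and one scale.

    counts: dict mapping location ID (str) -> distinct days (or hours) spent there.
    total:  distinct days (or hours) covered by the trajectory in the window.
    """
    if total == 0 or not counts:
        return "NoMFL", "NoMFL", 0, 0
    best = max(counts.values())
    # Locations sharing the top count; sorting by ID is an assumption of this sketch.
    tied = sorted(loc for loc, n in counts.items() if n == best)
    second = tied[1] if len(tied) > 1 else "NoMFL"
    return tied[0], second, best, total

print(mfl_fields({"A": 3, "B": 3, "C": 1}, 5))  # ('A', 'B', 3, 5)
print(mfl_fields({"A": 2}, 2))                  # ('A', 'NoMFL', 2, 2)
print(mfl_fields({}, 0))                        # ('NoMFL', 'NoMFL', 0, 0)
```

The total is passed separately because an individual can be seen at several locations on the same day, so the per-location counts do not necessarily sum to the number of distinct days covered.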
You can run the code using the command:
python MFL.py input.csv output.csv 20 7 8 19
If you use this code, please cite:
Lenormand M, Louail T, Barthelemy M & Ramasco JJ (2016) Is spatial information in ICT data reliable? In proceedings of the 2016 Spatial Accuracy Conference, 9-17, Montpellier, France.
If you need help, find a bug, or want to give me advice or feedback, please contact me! You can reach me at maxime.lenormand[at]irstea.fr