mobilipy package
Submodules
mobilipy.constants module
- mobilipy.constants.LATITUDE = 'latitude'
Default latitude column name
- mobilipy.constants.LONGITUDE = 'longitude'
Default longitude column name
- mobilipy.constants.TRACKED_AT = 'tracked_at'
Default timestamp column name
- mobilipy.constants.UTC = 'UTC'
Default timezone for the WaypointsDataFrame
mobilipy.gtfs_helper module
- class mobilipy.gtfs_helper.GTFS_Helper(directory, lon_lat_step=0.003)
Bases: object
- get_coordinates(stop_id)
Finds coordinates of a stop with the given ID
- Parameters
stop_id (str) – ID of the stop
- Returns
tuple[float, float] – Coordinates of the stop as (latitude, longitude)
- get_n_closest_stops(longitude, latitude, n=1, lon_lat_step=0.003) → pandas.core.frame.DataFrame
Returns n closest stops to the given location
- Parameters
longitude (float) – Longitude as degrees
latitude (float) – Latitude as degrees
n (int, optional) – Number of stops to be returned. Defaults to 1.
lon_lat_step (float, optional) – Size of cells on map, in latitude/longitude degrees. Defaults to 0.003.
- Returns
pandas.DataFrame – DataFrame with information about n closest stops
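To illustrate what "n closest stops" means, the following sketch ranks a small in-memory list of stops by great-circle distance. It does not call mobilipy: the stop tuples, `haversine_m` and `n_closest` are hypothetical names made up for this example, and the library may measure distance differently.

```python
import heapq
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius, in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def n_closest(stops, longitude, latitude, n=1):
    """Return the n stops nearest to the location; stops are (stop_id, lat, lon)."""
    return heapq.nsmallest(
        n, stops, key=lambda s: haversine_m(latitude, longitude, s[1], s[2])
    )


# Hypothetical stops, roughly in the Zurich area.
stops = [
    ("A", 47.3779, 8.5403),
    ("B", 47.3663, 8.5485),
    ("C", 47.4115, 8.5442),
]
closest = n_closest(stops, longitude=8.5400, latitude=47.3770, n=2)
```

The argument order (longitude first, then latitude) mirrors the documented signature.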
- get_nearby_stops(longitude, latitude, lon_lat_step=0.003, df=True) → pandas.core.frame.DataFrame
Finds nearest stops to the given location. Uses a grid search, checking the cell of the location as well as all the ones around it.
- Parameters
longitude (float) – Longitude as degrees
latitude (float) – Latitude as degrees
lon_lat_step (float, optional) – Size of cells on map, in latitude/longitude degrees. Defaults to 0.003.
- Returns
pandas.DataFrame – DataFrame with info on all the nearest stops
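The grid search described above can be sketched as follows. This is a minimal illustration, not mobilipy's implementation: `cell_of`, `build_grid` and `nearby_stops` are hypothetical helpers, and the stops are made-up data. Stops are bucketed into square cells of `lon_lat_step` degrees, and a lookup inspects the cell of the query location plus its eight neighbours.

```python
import math
from collections import defaultdict


def cell_of(longitude, latitude, step):
    """Integer grid cell containing the location, for cells of `step` degrees."""
    return (math.floor(longitude / step), math.floor(latitude / step))


def build_grid(stops, step=0.003):
    """Bucket (stop_id, lat, lon) records by grid cell."""
    grid = defaultdict(list)
    for stop_id, lat, lon in stops:
        grid[cell_of(lon, lat, step)].append(stop_id)
    return grid


def nearby_stops(grid, longitude, latitude, step=0.003):
    """Collect stops from the location's cell and the 8 surrounding cells."""
    cx, cy = cell_of(longitude, latitude, step)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            found.extend(grid.get((cx + dx, cy + dy), []))
    return found


# Hypothetical stops: A and C are close to the query point, B is several cells away.
stops = [("A", 47.3779, 8.5403), ("B", 47.3663, 8.5485), ("C", 47.3795, 8.5420)]
grid = build_grid(stops)
nearby = nearby_stops(grid, longitude=8.5400, latitude=47.3770)
```

A coarser `step` widens the search area at the cost of returning more candidates per lookup.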
- get_transfers() → pandas.core.frame.DataFrame
Finds possible transfers within the same parent station.
- Returns
pandas.DataFrame – DataFrame containing all the possible transfers in the dataset.
- id_to_name(stop_id) → str
Finds the name of the stop with the given ID
- Parameters
stop_id (str) – ID of the stop
- Returns
str – Name of the stop with the given ID
mobilipy.legs module
- mobilipy.legs.get_user_legs(df, user_id, use_multiprocessing=True) → pandas.core.frame.DataFrame
Builds the legs DataFrame for the given user.
- Parameters
df (pandas.DataFrame) – waypoints DataFrame
user_id (str) – ID of the user whose legs are to be created
use_multiprocessing (bool, optional) – Specifies whether the multiprocessing package should be used. Defaults to True.
- Returns
pandas.DataFrame – DataFrame of user’s legs
mobilipy.mode_detection module
- mobilipy.mode_detection.mode_detection(df, speed_th=2.78, acceleration_th=0.5, minimal_walking_duration=100, minimal_trip_duration=120, use_multiprocessing=True)
Tags the DataFrame at ‘trip’ indexes with detected modes in the ‘detected_mode’ column.
- Parameters
df (pandas.DataFrame) – DataFrame to be processed, coming from segmentation module
speed_th (float, optional) – The walk speed threshold. Defaults to 2.78.
acceleration_th (float, optional) – The walk acceleration threshold. Defaults to 0.5.
minimal_walking_duration (int, optional) – The walk duration threshold. Defaults to 100.
minimal_trip_duration (int, optional) – The minimal trip duration threshold. Defaults to 120.
use_multiprocessing (bool, optional) – Specifies whether the multiprocessing package should be used. Defaults to True.
- Returns
pandas.DataFrame – Segments DataFrame with modes of transport tagged in the mode_detected column.
mobilipy.plot module
- mobilipy.plot.get_leg_points(legs_from_waypoints, clean_waypoints, index, info=False)
Returns a DataFrame with all the points belonging to the given leg.
- Parameters
legs_from_waypoints (pd.DataFrame) – DataFrame with legs, coming from legs.get_user_legs.
clean_waypoints (pd.DataFrame) – DataFrame with latitude, longitude and tracked_at columns.
index (int) – Index of the leg.
info (bool, optional) – Defines whether additional info about the leg should be returned. Defaults to False.
- Returns
pd.DataFrame – DataFrame with all the points belonging to the given leg
- mobilipy.plot.get_map_bounds(df)
Finds the map bounding box for the given WaypointsDataFrame
- Parameters
df (pd.DataFrame) – DataFrame with latitude and longitude columns
- Returns
((float, float), (float, float)) – Bounds as a 2x2 array, in the form of ((latitude_min, longitude_min), (latitude_max, longitude_max))
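The shape of the returned bounds can be shown with a small stand-alone sketch. It works on a plain list of (latitude, longitude) tuples rather than a DataFrame, and `map_bounds` is a hypothetical name used only here.

```python
def map_bounds(points):
    """Bounding box of (latitude, longitude) points as ((lat_min, lon_min), (lat_max, lon_max))."""
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return ((min(lats), min(lons)), (max(lats), max(lons)))


# Made-up sample points.
points = [(47.3779, 8.5403), (47.3663, 8.5485), (47.4115, 8.5442)]
bounds = map_bounds(points)
```

The first pair is the south-west corner and the second the north-east corner, which is the corner ordering that folium's `fit_bounds` accepts.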
- mobilipy.plot.plot_all(legs_from_waypoints, waypoints)
Plots dirty waypoints, clean waypoints and resulting legs.
- Parameters
legs_from_waypoints (pd.DataFrame) – DataFrame with legs, coming from legs.get_user_legs.
waypoints (pd.DataFrame) – DataFrame with latitude, longitude and tracked_at columns.
- Returns
folium.Map – Map with points from the DataFrames
- mobilipy.plot.plot_daily_legs(legs_from_waypoints, waypoints, day_num, first_=0, last_=-1, map_=None, dirty_waypoints=False, plot_waypoints=False, solos=False)
Plots all the legs for a given day
- Parameters
legs_from_waypoints (pd.DataFrame) – DataFrame with legs, coming from legs.get_user_legs.
waypoints (pd.DataFrame) – DataFrame with latitude, longitude and tracked_at columns.
day_num (int) – Index of the day
first_ (int, optional) – Index of the first leg to be plotted. Defaults to 0.
last_ (int, optional) – Index of the last leg to be plotted. Defaults to -1.
map_ (folium.Map, optional) – Existing Map object to plot the points on. Defaults to None.
dirty_waypoints (bool, optional) – Specifies whether the supplied waypoints need cleaning. Defaults to False.
plot_waypoints (bool, optional) – Specifies whether waypoints should be plotted one by one. Defaults to False.
solos (bool, optional) – Specifies whether solo legs should be plotted. Defaults to False.
- Returns
folium.Map – Map with points from the DataFrames
- mobilipy.plot.plot_gps(df, loc_map=None, type_='transport', line=True)
Plots supplied GPS points on a folium Map.
- Parameters
df (pd.DataFrame) – DataFrame with latitude, longitude and tracked_at columns.
loc_map (folium.Map, optional) – Existing Map object to plot the points on. Defaults to None.
type_ (str, optional) – Type of data, can be TRANSPORT or ACTIVITY. Defaults to 'transport'.
line (bool, optional) – Specifies whether consecutive points should be connected by a line. Defaults to True.
- Returns
folium.Map – Map with points from the DataFrame
- mobilipy.plot.plot_leg(legs_from_waypoints, clean_waypoints, index, map_=None, info=False)
Plots a selected leg on a folium Map.
- Parameters
legs_from_waypoints (pd.DataFrame) – DataFrame with legs, coming from legs.get_user_legs.
clean_waypoints (pd.DataFrame) – DataFrame with latitude, longitude and tracked_at columns.
index (int) – Index of the leg.
map_ (folium.Map, optional) – Existing Map object to plot the points on. Defaults to None.
info (bool, optional) – Defines whether additional info about the leg should be returned. Defaults to False.
- Returns
folium.Map – Map with points from the leg
- mobilipy.plot.plot_legs(legs_from_waypoints, clean_waypoints, map_=None)
Plots all the legs on a folium Map
- Parameters
legs_from_waypoints (pd.DataFrame) – DataFrame with legs, coming from legs.get_user_legs.
clean_waypoints (pd.DataFrame) – DataFrame with latitude, longitude and tracked_at columns.
map_ (folium.Map, optional) – Existing Map object to plot the points on. Defaults to None.
- Returns
folium.Map – Map with points from the legs DataFrame
- mobilipy.plot.plot_solos(solos, map_=None)
Plots DataFrame points on a folium Map without connecting them with a line.
- Parameters
solos (pd.DataFrame) – DataFrame with latitude, longitude and tracked_at columns.
map_ (folium.Map, optional) – Existing Map object to plot the points on. Defaults to None.
- Returns
folium.Map – Map with points from the DataFrame
- mobilipy.plot.plot_waypoints(waypoints, clean_df=True, map_=None)
Plots waypoints on a folium Map.
- Parameters
waypoints (pd.DataFrame) – DataFrame with latitude, longitude and tracked_at columns.
clean_df (bool, optional) – Prepares waypoints before plotting if True. Defaults to True.
map_ (folium.Map, optional) – Existing Map object to plot the points on. Defaults to None.
- Returns
folium.Map – Map with points from the DataFrame
mobilipy.poi_detection module
- mobilipy.poi_detection.detect_home_work(legs, waypoints, cell_size=0.2)
Detects home and work locations, tags them with ‘Home’ or ‘Work’ in the df
- Parameters
legs (pd.DataFrame) – legs DataFrame coming from the legs module
waypoints (pandas.DataFrame) – the waypoints DataFrame to be processed
mobilipy.preparation module
- mobilipy.preparation.prepare(df, accuracy_th=1000, sigma=10) → pandas.core.frame.DataFrame
Cleans a raw GPS points DataFrame by filtering to the Zurich area, rearranging features and applying Gaussian smoothing.
- Parameters
df (pandas.DataFrame) – DataFrame to be prepared for route processing
accuracy_th (int, optional) – Accuracy threshold for filtering. Defaults to 1000.
sigma (int, optional) – Sigma for Gaussian smoothing, defines the size of smoothing window. Defaults to 10.
- Returns
pd.DataFrame – Cleaned and smoothed DataFrame
mobilipy.privacy module
- mobilipy.privacy.add_noise(point, radius=100, offset=30)
Adds random uniform noise to the given point, within a circle of radius (radius - offset) around it.
- Parameters
point (tuple) – Point as a tuple (latitude, longitude) in degrees.
radius (int, optional) – Radius of the point neighbourhood. Defaults to 100.
offset (int, optional) – Offset from the perimeter of the neighbourhood circle, where the shifted point can’t be located. Defaults to 30.
- Returns
tuple – Shifted point expressed as (latitude, longitude)
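One way to draw a point uniformly from a disc of radius (radius - offset) is sketched below. Several things here are assumptions rather than documented behaviour: that `radius` and `offset` are in meters, that the Earth is a sphere of radius 6371 km, and the names `add_noise_sketch` and `rng`. The library's own sampling scheme may differ.

```python
import math
import random

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius, in meters


def add_noise_sketch(point, radius=100, offset=30, rng=random):
    """Shift (lat, lon) by a random vector no longer than radius - offset (assumed meters)."""
    lat, lon = point
    max_dist = radius - offset
    # Taking the square root makes the samples uniform over the disc's area
    # instead of clustering them near the center.
    dist = max_dist * math.sqrt(rng.random())
    angle = rng.uniform(0.0, 2.0 * math.pi)
    north_m = dist * math.cos(angle)
    east_m = dist * math.sin(angle)
    # Convert the metric offsets to degrees (small-distance approximation).
    dlat = math.degrees(north_m / EARTH_RADIUS_M)
    dlon = math.degrees(east_m / (EARTH_RADIUS_M * math.cos(math.radians(lat))))
    return (lat + dlat, lon + dlon)


rng = random.Random(0)  # seeded so the example is reproducible
home = (47.3769, 8.5417)  # made-up location
shifted = add_noise_sketch(home, rng=rng)
```

With the defaults, the shifted point stays within 70 of the original, so it can never sit on the outer 30 of the neighbourhood circle.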
- mobilipy.privacy.aggregate(waypoints_df, cell_size=0.2, delta=datetime.timedelta(seconds=900)) → pandas.core.frame.DataFrame
Aggregates users in timedeltas and cells on the map. Returns a DataFrame with the count of users in a given timedelta and cell.
- Parameters
waypoints_df (pandas.DataFrame) – DataFrame with ‘latitude’, ‘longitude’, ‘user_id’ and ‘tracked_at’ columns.
cell_size (float) – Size of the square cells on the map, in kilometers.
delta (datetime.timedelta) – Frequency for the time aggregation, e.g. 15 minutes.
- Returns
pandas.DataFrame – DataFrame with ‘tracked_at’, ‘cell_latitude’, ‘cell_longitude’ and ‘count’ columns. The ‘cell_latitude’ and ‘cell_longitude’ columns give coordinates of the centers of cells on the map.
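The aggregation idea can be sketched without pandas: floor each timestamp to the time window, snap each position to a cell center, and count users per (window, cell) pair. Everything below is an assumption for illustration, including the helper names, the spherical-Earth constant, the way cells are laid out, and the choice to count distinct users rather than rows.

```python
import math
from collections import defaultdict
from datetime import datetime, timedelta

KM_PER_DEG_LAT = 111.195  # assumed: spherical Earth with a 6371 km radius


def floor_dt(dt, delta):
    """Floor a naive datetime to a multiple of delta."""
    return dt - (dt - datetime.min) % delta


def cell_center(lat, lon, cell_size_km):
    """Center of the lattice cell containing (lat, lon); cells are cell_size_km wide."""
    lat_step = cell_size_km / KM_PER_DEG_LAT
    c_lat = (math.floor(lat / lat_step) + 0.5) * lat_step
    # A degree of longitude shrinks with latitude; use the snapped latitude.
    lon_step = cell_size_km / (KM_PER_DEG_LAT * math.cos(math.radians(c_lat)))
    c_lon = (math.floor(lon / lon_step) + 0.5) * lon_step
    return (round(c_lat, 6), round(c_lon, 6))


def aggregate_sketch(rows, cell_size=0.2, delta=timedelta(minutes=15)):
    """Count distinct users per (floored time, cell center)."""
    users = defaultdict(set)
    for user_id, tracked_at, lat, lon in rows:
        key = (floor_dt(tracked_at, delta), cell_center(lat, lon, cell_size))
        users[key].add(user_id)
    return {key: len(ids) for key, ids in users.items()}


# Made-up waypoints: two users share a cell in the first window.
rows = [
    ("u1", datetime(2021, 1, 1, 12, 1), 47.37690, 8.54000),
    ("u2", datetime(2021, 1, 1, 12, 7), 47.37695, 8.54005),
    ("u1", datetime(2021, 1, 1, 12, 20), 47.37690, 8.54000),
]
counts = aggregate_sketch(rows)
```

Reporting only counts per window and cell keeps individual trajectories out of the output, which is the point of the privacy module.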
- mobilipy.privacy.assign_cell_center(latitude, longitude, cell_size)
Returns the closest cell center for the given coordinates, when the map is divided into a lattice with the supplied cell_size
- Parameters
latitude (float) – Latitude as degrees
longitude (float) – Longitude as degrees
cell_size (float) – Cell size in kilometers
- Returns
tuple(float, float) – Coordinates of the closest cell center on the map
- mobilipy.privacy.dt_floor(dt, delta) → datetime.datetime
Performs the floor operation on datetime values, with given timedelta as the base, e.g. 2021-01-01 12:21:47 with timedelta of 15s returns 2021-01-01 12:21:45.
- Parameters
dt (datetime.datetime) – Datetime value to be floored.
delta (datetime.timedelta) – Timedelta used for the floor operation.
- Returns
datetime.datetime – Floored datetime.
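The floor operation can be reproduced with the standard library alone. The sketch below is one possible implementation, not necessarily the one used by mobilipy; `dt_floor_sketch` is a hypothetical name, and it assumes naive (timezone-unaware) datetimes.

```python
from datetime import datetime, timedelta


def dt_floor_sketch(dt, delta):
    """Floor a naive datetime to the nearest lower multiple of delta."""
    # The time elapsed since a fixed origin, taken modulo delta,
    # is exactly the remainder that has to be dropped.
    return dt - (dt - datetime.min) % delta


floored = dt_floor_sketch(datetime(2021, 1, 1, 12, 21, 47), timedelta(seconds=15))
quarter_hour = dt_floor_sketch(datetime(2021, 1, 1, 12, 21, 47), timedelta(minutes=15))
```

The first call reproduces the example from the description (12:21:45); the second floors the same timestamp to 12:15:00.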
- mobilipy.privacy.get_obfuscation_utility(w_prepared, w_obfuscated, legs) → float
Calculates the ratio of legs affected by obfuscation to total legs
- Parameters
w_prepared (pandas.DataFrame) – Smoothed and cleaned waypoints DataFrame
w_obfuscated (pandas.DataFrame) – Smoothed, cleaned, and obfuscated waypoints DataFrame
legs (pandas.DataFrame) – DataFrame that contains assembled legs
- Returns
float – Ratio of legs affected by obfuscation to total legs
- mobilipy.privacy.km_to_lat(km_north) → float
Expresses latitude given in kilometers to the north in degrees
- Parameters
km_north (float) – Latitude expressed in kilometers to the north
- Returns
float – Latitude in degrees
- mobilipy.privacy.km_to_lon(km_east, latitude) → float
Expresses given kilometers to the east in longitude degrees.
- Parameters
km_east (float) – Longitude expressed in km going east from longitude 0, at the given latitude
latitude (float) – Latitude in degrees
- Returns
float – Longitude in degrees
- mobilipy.privacy.lat_to_km(latitude) → float
Expresses given latitude in kilometers to the north
- Parameters
latitude (float) – Latitude in degrees.
- Returns
float – Latitude expressed in kilometers to the north
- mobilipy.privacy.lon_to_km(latitude, longitude) → float
Expresses given longitude in kilometers to the east
- Parameters
latitude (float) – Latitude expressed in degrees
longitude (float) – Longitude expressed in degrees
- Returns
float – Longitude as kilometers to the east
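The four conversions between degrees and kilometers can be sketched under a spherical-Earth assumption. The constant and the `_sketch` function names below are illustrative assumptions; mobilipy may use a different Earth model or constant.

```python
import math

# Assumed: spherical Earth with a 6371 km radius, so one degree of latitude
# spans about 111.19 km everywhere.
KM_PER_DEGREE = math.pi * 6371.0 / 180.0


def lat_to_km_sketch(latitude):
    """Kilometers north of the equator for a latitude in degrees."""
    return latitude * KM_PER_DEGREE


def km_to_lat_sketch(km_north):
    """Latitude in degrees for a distance north of the equator."""
    return km_north / KM_PER_DEGREE


def lon_to_km_sketch(latitude, longitude):
    """Kilometers east of longitude 0, measured along the given latitude."""
    return longitude * KM_PER_DEGREE * math.cos(math.radians(latitude))


def km_to_lon_sketch(km_east, latitude):
    """Longitude in degrees for a distance east of longitude 0 at the given latitude."""
    return km_east / (KM_PER_DEGREE * math.cos(math.radians(latitude)))
```

The longitude conversions need the latitude because circles of latitude shrink by a factor of cos(latitude) towards the poles.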
- mobilipy.privacy.obfuscate(df, locations, radius=100, offset=30, mode='remove') → pandas.core.frame.DataFrame
Obfuscates the regions of points given in ‘locations’ parameter by either removing all the points in their proximity, or changing the location of these points to one, noisy location in the proximity circle.
- Parameters
df (pandas.DataFrame) – DataFrame with ‘latitude’ and ‘longitude’ columns.
locations (list) – List of locations given as (latitude, longitude) tuples
radius (int, optional) – Radius of the obfuscation circle. Defaults to 100.
offset (int, optional) – Smallest distance from the perimeter of the obfuscation circle at which the location of interest must be located. Defaults to 30.
mode (str, optional) – Obfuscation mode, can be either ‘remove’ or ‘assign’. Defaults to ‘remove’.
- Returns
pandas.DataFrame – DataFrame with obfuscated regions
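The 'remove' mode can be illustrated with a stand-alone sketch that drops every point lying within the radius of a sensitive location. It assumes the radius is in meters and uses haversine distance on a spherical Earth; `remove_near` and `haversine_m` are hypothetical names, and the 'assign' mode and the `offset` parameter are not modelled here.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius, in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def remove_near(points, locations, radius=100):
    """Keep only the points farther than `radius` meters from every location."""
    return [
        p for p in points
        if all(haversine_m(p[0], p[1], loc[0], loc[1]) > radius for loc in locations)
    ]


# Made-up data: the first two points are within 100 m of "home".
home = (47.3769, 8.5417)
track = [(47.3769, 8.5417), (47.3772, 8.5420), (47.3800, 8.5500)]
kept = remove_near(track, [home])
```

Only the third point survives, since it lies several hundred meters from the protected location.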
- mobilipy.privacy.shift_point(point, latitude_shift_km, longitude_shift_km)
Shifts the coordinates of a given point.
- Parameters
point (tuple) – Starting point as (latitude, longitude), in degrees
latitude_shift_km (float) – Latitude offset in kilometers
longitude_shift_km (float) – Longitude offset in kilometers
- Returns
tuple – Shifted point expressed as (latitude, longitude)
mobilipy.reva module
- mobilipy.reva.analyse(df, user_id) → pandas.core.frame.DataFrame
Returns complete trip information from a raw GPS waypoints DataFrame. Segments the data into trips, detects the mode of transport and tags the home and work locations.
- Parameters
df (pandas.DataFrame) – WaypointsDataFrame
user_id (str) – user’s ID
- Returns
pandas.DataFrame – DataFrame with selected user’s legs
mobilipy.segmentation module
- mobilipy.segmentation.segment(prepared_df, radius=0.025, min_samples=50, time_gap=850, use_multiprocessing=True) → pandas.core.frame.DataFrame
Finds clusters of waypoints for legs
- Parameters
prepared_df (pandas.DataFrame) – Waypoints DataFrame to be processed
radius (float) – Eps for DBSCAN
min_samples (int) – Minimum number of samples to be considered for
time_gap (float) – Max time gap threshold for detected clusters
- Returns
pandas.DataFrame – DataFrame with the segment starts and ends
mobilipy.waypointsdataframe module
- class mobilipy.waypointsdataframe.WaypointsDataFrame(data, tracked_at='tracked_at', longitude='longitude', latitude='latitude', user_id='user_id', crs={'init': 'epsg:4326'}, timezone='UTC')
Bases: pandas.core.frame.DataFrame
Class that serves as an entry point for the mobilipy pipeline
Initializes the WaypointsDataFrame
- Parameters
data (pandas.DataFrame) – DataFrame with raw GPS data
tracked_at (str, optional) – Name of the column containing the timestamp. Defaults to constants.TRACKED_AT.
longitude (str, optional) – Name of the column containing the longitude. Defaults to constants.LONGITUDE.
latitude (str, optional) – Name of the column containing the latitude. Defaults to constants.LATITUDE.
user_id (str, optional) – Name of the column containing the user_id. Defaults to ‘user_id’.
crs (dict, optional) – Coordinate Reference System. Defaults to {“init”: “epsg:4326”}.
timezone (str, optional) – Timezone available in pytz. Defaults to constants.UTC.