mobilipy package

Submodules

mobilipy.constants module

mobilipy.constants.LATITUDE = 'latitude'

Default latitude column name

mobilipy.constants.LONGITUDE = 'longitude'

Default longitude column name

mobilipy.constants.TRACKED_AT = 'tracked_at'

Default timestamp column name

mobilipy.constants.UTC = 'UTC'

Default timezone for the WaypointsDataFrame

mobilipy.gtfs_helper module

class mobilipy.gtfs_helper.GTFS_Helper(directory, lon_lat_step=0.003)

Bases: object

get_coordinates(stop_id)

Finds coordinates of a stop with the given ID

Parameters

stop_id (str) – ID of the stop

Returns

tuple[float, float] – Coordinates of the stop as (latitude, longitude)

get_n_closest_stops(longitude, latitude, n=1, lon_lat_step=0.003) → pandas.core.frame.DataFrame

Returns n closest stops to the given location

Parameters
  • longitude (float) – Longitude as degrees

  • latitude (float) – Latitude as degrees

  • n (int, optional) – Number of stops to be returned. Defaults to 1.

  • lon_lat_step (float, optional) – Size of cells on map, in latitude/longitude degrees. Defaults to 0.003.

Returns

pandas.DataFrame – DataFrame with information about n closest stops
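
As a rough illustration of what a nearest-stop lookup involves, the sketch below ranks stops by great-circle distance using only the standard library. The names haversine_km, n_closest and the stops list are hypothetical stand-ins, not part of mobilipy; the library returns a pandas DataFrame and may rank stops differently.

```python
import heapq
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def n_closest(stops, longitude, latitude, n=1):
    """Return the n stops nearest to the location.

    stops is a list of (stop_id, latitude, longitude) tuples (hypothetical layout).
    """
    return heapq.nsmallest(
        n, stops, key=lambda s: haversine_km(latitude, longitude, s[1], s[2])
    )


# Made-up stop data, for illustration only
stops = [("A", 47.3780, 8.5400), ("B", 47.3660, 8.5480), ("C", 47.4100, 8.5440)]
closest = n_closest(stops, longitude=8.5417, latitude=47.3769, n=2)
```

Note the argument order: like the documented method, the sketch takes longitude before latitude.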

get_nearby_stops(longitude, latitude, lon_lat_step=0.003, df=True) → pandas.core.frame.DataFrame

Finds nearest stops to the given location. Uses a grid search, checking the cell of the location as well as all the ones around it.

Parameters
  • longitude (float) – Longitude as degrees

  • latitude (float) – Latitude as degrees

  • lon_lat_step (float, optional) – Size of cells on map, in latitude/longitude degrees. Defaults to 0.003.

Returns

pandas.DataFrame – DataFrame with info on all the nearest stops
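
The grid search described above can be sketched with a plain dictionary: stops are bucketed into square cells of lon_lat_step degrees, and a query inspects the cell containing the location plus its eight neighbours. The helper names (cell_of, build_grid, nearby) and the stop tuples are assumptions made for this example; the actual GTFS_Helper internals may differ.

```python
import math
from collections import defaultdict


def cell_of(longitude, latitude, step):
    """Integer grid coordinates of the cell containing the location."""
    return (math.floor(longitude / step), math.floor(latitude / step))


def build_grid(stops, step=0.003):
    """Bucket (stop_id, latitude, longitude) tuples by grid cell."""
    grid = defaultdict(list)
    for stop_id, lat, lon in stops:
        grid[cell_of(lon, lat, step)].append(stop_id)
    return grid


def nearby(grid, longitude, latitude, step=0.003):
    """Collect stop IDs from the location's cell and the 8 cells around it."""
    cx, cy = cell_of(longitude, latitude, step)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            found.extend(grid.get((cx + dx, cy + dy), []))
    return found


# Made-up stop data, for illustration only
stops = [("A", 47.3780, 8.5400), ("B", 47.3660, 8.5480), ("C", 47.3775, 8.5425)]
grid = build_grid(stops)
result = nearby(grid, longitude=8.5417, latitude=47.3769)
```

Stop B lies more than one cell away from the query location, so it is not returned. The step used to build the grid and the step used to query it must match.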

get_transfers() → pandas.core.frame.DataFrame

Finds possible transfers within the same parent station.

Returns

pandas.DataFrame – DataFrame containing all the possible transfers in the dataset.

id_to_name(stop_id) → str

Finds the name of the stop with the given ID

Parameters

stop_id (str) – ID of the stop

Returns

str – Name of the stop with the given ID

mobilipy.legs module

mobilipy.legs.get_user_legs(df, user_id, use_multiprocessing=True) → pandas.core.frame.DataFrame

Builds the legs DataFrame for the given user.

Parameters
  • df (pandas.DataFrame) – waypoints DataFrame

  • user_id (str) – ID of the user whose legs are to be created

  • use_multiprocessing (bool, optional) – Specifies whether the multiprocessing package should be used. Defaults to True.

Returns

pandas.DataFrame – DataFrame of user’s legs

mobilipy.mode_detection module

mobilipy.mode_detection.mode_detection(df, speed_th=2.78, acceleration_th=0.5, minimal_walking_duration=100, minimal_trip_duration=120, use_multiprocessing=True)

Tags the DataFrame at ‘trip’ indexes with detected modes in the ‘detected_mode’ column.

Parameters
  • df (pandas.DataFrame) – DataFrame to be processed, coming from segmentation module

  • speed_th (float, optional) – The walk speed threshold. Defaults to 2.78.

  • acceleration_th (float, optional) – The walk acceleration threshold. Defaults to 0.5.

  • minimal_walking_duration (int, optional) – The walk duration threshold. Defaults to 100.

  • minimal_trip_duration (int, optional) – The minimal trip duration threshold. Defaults to 120.

  • use_multiprocessing (bool, optional) – Specifies whether the multiprocessing package should be used. Defaults to True.

Returns

pandas.DataFrame – Segments DataFrame with modes of transport tagged in the ‘detected_mode’ column.

mobilipy.plot module

mobilipy.plot.get_leg_points(legs_from_waypoints, clean_waypoints, index, info=False)

Returns a DataFrame with all the points belonging to the given leg.

Parameters
  • legs_from_waypoints (pd.DataFrame) – DataFrame with legs, coming from legs.get_user_legs.

  • clean_waypoints (pd.DataFrame) – DataFrame with latitude, longitude and tracked_at columns.

  • index (int) – Index of the leg.

  • info (bool, optional) – Defines whether additional info about the leg should be returned. Defaults to False.

Returns

pd.DataFrame – DataFrame with all the points belonging to the given leg

mobilipy.plot.get_map_bounds(df)

Finds the map bounding box for the given WaypointsDataFrame

Parameters

df (pd.DataFrame) – DataFrame with latitude and longitude columns

Returns

((float, float), (float, float)) – Bounds as a 2x2 array, in the form of ((latitude_min, longitude_min), (latitude_max, longitude_max))
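
The return layout can be illustrated without pandas. The sketch below computes the same kind of bounding box from a list of (latitude, longitude) pairs; map_bounds_sketch is a hypothetical helper written for this example, not the library function.

```python
def map_bounds_sketch(points):
    """Bounding box of a sequence of (latitude, longitude) pairs in degrees.

    Returns ((latitude_min, longitude_min), (latitude_max, longitude_max)).
    """
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    return ((min(lats), min(lons)), (max(lats), max(lons)))


# Made-up coordinates, for illustration only
points = [(47.37, 8.54), (47.41, 8.50), (47.35, 8.56)]
bounds = map_bounds_sketch(points)
```

The first pair is the south-west corner and the second the north-east corner of the box.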

mobilipy.plot.plot_all(legs_from_waypoints, waypoints)

Plots dirty waypoints, clean waypoints and resulting legs.

Parameters
  • legs_from_waypoints (pd.DataFrame) – DataFrame with legs, coming from legs.get_user_legs.

  • waypoints (pd.DataFrame) – DataFrame with latitude, longitude and tracked_at columns.

Returns

folium.Map – Map with points from the DataFrames

mobilipy.plot.plot_daily_legs(legs_from_waypoints, waypoints, day_num, first_=0, last_=-1, map_=None, dirty_waypoints=False, plot_waypoints=False, solos=False)

Plots all the legs for a given day

Parameters
  • legs_from_waypoints (pd.DataFrame) – DataFrame with legs, coming from legs.get_user_legs.

  • waypoints (pd.DataFrame) – DataFrame with latitude, longitude and tracked_at columns.

  • day_num (int) – Index of the day

  • first_ (int, optional) – Index of the first leg to be plotted. Defaults to 0.

  • last_ (int, optional) – Index of the last leg to be plotted. Defaults to -1.

  • map_ (folium.Map, optional) – Existing Map object to plot the points on. Defaults to None.

  • dirty_waypoints (bool, optional) – Specifies whether the supplied waypoints need cleaning. Defaults to False.

  • plot_waypoints (bool, optional) – Specifies whether waypoints should be plotted one by one. Defaults to False.

  • solos (bool, optional) – Specifies whether solo legs should be plotted. Defaults to False.

Returns

folium.Map – Map with points from the DataFrames

mobilipy.plot.plot_gps(df, loc_map=None, type_='transport', line=True)

Plots supplied GPS points on a folium Map.

Parameters
  • df (pd.DataFrame) – DataFrame with latitude, longitude and tracked_at columns.

  • loc_map (folium.Map, optional) – Existing Map object to plot the points on. Defaults to None.

  • type_ (str, optional) – Type of data, either TRANSPORT or ACTIVITY. Defaults to ‘transport’.

  • line (bool, optional) – Specifies whether consecutive points should be connected by a line. Defaults to True.

Returns

folium.Map – Map with points from the DataFrame

mobilipy.plot.plot_leg(legs_from_waypoints, clean_waypoints, index, map_=None, info=False)

Plots a selected leg on a folium Map.

Parameters
  • legs_from_waypoints (pd.DataFrame) – DataFrame with legs, coming from legs.get_user_legs.

  • clean_waypoints (pd.DataFrame) – DataFrame with latitude, longitude and tracked_at columns.

  • index (int) – Index of the leg.

  • map_ (folium.Map, optional) – Existing Map object to plot the points on. Defaults to None.

  • info (bool, optional) – Defines whether additional info about the leg should be returned. Defaults to False.

Returns

folium.Map – Map with points from the leg

mobilipy.plot.plot_legs(legs_from_waypoints, clean_waypoints, map_=None)

Plots all the legs on a folium Map

Parameters
  • legs_from_waypoints (pd.DataFrame) – DataFrame with legs, coming from legs.get_user_legs.

  • clean_waypoints (pd.DataFrame) – DataFrame with latitude, longitude and tracked_at columns.

  • map_ (folium.Map, optional) – Existing Map object to plot the points on. Defaults to None.

Returns

folium.Map – Map with points from the legs DataFrame

mobilipy.plot.plot_solos(solos, map_=None)

Plots DataFrame points on a folium Map without connecting them with a line.

Parameters
  • solos (pd.DataFrame) – DataFrame with latitude, longitude and tracked_at columns.

  • map_ (folium.Map, optional) – Existing Map object to plot the points on. Defaults to None.

Returns

folium.Map – Map with points from the DataFrame

mobilipy.plot.plot_waypoints(waypoints, clean_df=True, map_=None)

Plots waypoints on a folium Map.

Parameters
  • waypoints (pd.DataFrame) – DataFrame with latitude, longitude and tracked_at columns.

  • clean_df (bool, optional) – Prepares waypoints before plotting if True. Defaults to True.

  • map_ (folium.Map, optional) – Existing Map object to plot the points on. Defaults to None.

Returns

folium.Map – Map with points from the DataFrame

mobilipy.poi_detection module

mobilipy.poi_detection.detect_home_work(legs, waypoints, cell_size=0.2)

Detects home and work locations and tags them with ‘Home’ or ‘Work’ in the DataFrame.

Parameters
  • legs (pd.DataFrame) – legs DataFrame coming from the legs module

  • waypoints (pandas.DataFrame) – the waypoints DataFrame to be processed

mobilipy.preparation module

mobilipy.preparation.prepare(df, accuracy_th=1000, sigma=10) → pandas.core.frame.DataFrame

Cleans a raw GPS points DataFrame by filtering to the Zurich area, rearranging features and applying Gaussian smoothing.

Parameters
  • df (pandas.DataFrame) – DataFrame to be prepared for route processing

  • accuracy_th (int, optional) – Accuracy threshold for filtering. Defaults to 1000.

  • sigma (int, optional) – Sigma for Gaussian smoothing, defines the size of smoothing window. Defaults to 10.

Returns

pd.DataFrame – Cleaned and smoothed waypoints DataFrame

mobilipy.privacy module

mobilipy.privacy.add_noise(point, radius=100, offset=30)

Adds uniform random noise to the given point, shifting it by at most radius - offset.

Parameters
  • point (tuple) – Point as a tuple (latitude, longitude) in degrees.

  • radius (int, optional) – Radius of the point neighbourhood. Defaults to 100.

  • offset (int, optional) – Offset from the perimeter of the neighbourhood circle, where the shifted point can’t be located. Defaults to 30.

Returns

tuple – Shifted point expressed as (latitude, longitude)
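
One way to picture this kind of noise is to draw a random point inside a disc of radius radius - offset around the original location. The sketch below does that under several assumptions not stated in the documentation: radius and offset are taken to be in meters, the Earth is treated as a sphere of radius 6371 km, and samples are uniform over the disc area. add_noise_sketch and its rng parameter are hypothetical; the library's own sampling may differ.

```python
import math
import random

KM_PER_DEGREE = math.pi * 6371.0 / 180.0  # spherical Earth; an assumption


def add_noise_sketch(point, radius=100, offset=30, rng=random):
    """Shift (latitude, longitude) by a random vector of at most radius - offset meters."""
    latitude, longitude = point
    max_shift_km = (radius - offset) / 1000.0
    # The square root makes samples uniform over the disc area
    # instead of clustering them near the centre.
    distance_km = max_shift_km * math.sqrt(rng.random())
    angle = rng.uniform(0.0, 2.0 * math.pi)
    d_lat = distance_km * math.cos(angle) / KM_PER_DEGREE
    d_lon = distance_km * math.sin(angle) / (
        KM_PER_DEGREE * math.cos(math.radians(latitude))
    )
    return (latitude + d_lat, longitude + d_lon)


rng = random.Random(0)  # seeded only to make the example repeatable
home = (47.3769, 8.5417)
shifted = add_noise_sketch(home, rng=rng)
```

With the default arguments the shifted point stays within 70 meters of the original.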

mobilipy.privacy.aggregate(waypoints_df, cell_size=0.2, delta=datetime.timedelta(seconds=900)) → pandas.core.frame.DataFrame

Aggregates users in timedeltas and cells on the map. Returns a DataFrame with the count of users in a given timedelta and cell.

Parameters
  • waypoints_df (pandas.DataFrame) – DataFrame with ‘latitude’, ‘longitude’, ‘user_id’ and ‘tracked_at’ columns.

  • cell_size (float) – Size of the square cells on the map, in kilometers.

  • delta (datetime.timedelta) – Frequency for the time aggregation, e.g. 15 minutes.

Returns

pandas.DataFrame – DataFrame with ‘tracked_at’, ‘cell_latitude’, ‘cell_longitude’ and ‘count’ columns. The ‘cell_latitude’ and ‘cell_longitude’ columns give coordinates of the centers of cells on the map.
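
The aggregation idea can be sketched with the standard library: floor each timestamp to the time window, map each position to a cell, and count users per (window, cell) pair. Everything named here is hypothetical. In particular, the cell function rounds degrees to two decimals as a crude stand-in for the kilometer lattice, and the sketch counts distinct users per group, which is an assumption about what the ‘count’ column holds.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def floor_time(dt, delta):
    """Round a naive datetime down to a multiple of delta."""
    return dt - (dt - datetime.min) % delta


def count_users(records, cell_of, delta=timedelta(minutes=15)):
    """Count distinct users per (time window, cell).

    records: iterable of (user_id, tracked_at, latitude, longitude).
    cell_of: callable mapping (latitude, longitude) to a hashable cell key.
    """
    users = defaultdict(set)
    for user_id, tracked_at, lat, lon in records:
        users[(floor_time(tracked_at, delta), cell_of(lat, lon))].add(user_id)
    return {key: len(ids) for key, ids in users.items()}


def toy_cell(lat, lon):
    """Crude stand-in for a map cell: coordinates rounded to 2 decimals."""
    return (round(lat, 2), round(lon, 2))


# Made-up records, for illustration only
records = [
    ("u1", datetime(2021, 1, 1, 12, 1), 47.3769, 8.5417),
    ("u1", datetime(2021, 1, 1, 12, 5), 47.3771, 8.5421),
    ("u2", datetime(2021, 1, 1, 12, 10), 47.3765, 8.5409),
    ("u2", datetime(2021, 1, 1, 12, 20), 47.3765, 8.5409),
]
counts = count_users(records, toy_cell)
```

Here both users fall into the 12:00 window of the same cell, while only u2 appears in the 12:15 window.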

mobilipy.privacy.assign_cell_center(latitude, longitude, cell_size)

Returns the closest cell center for the given coordinates, when the map is divided into a lattice with the supplied cell_size

Parameters
  • latitude (float) – Latitude as degrees

  • longitude (float) – Longitude as degrees

  • cell_size (float) – Cell size in kilometers

Returns

tuple(float, float) – Coordinates of the closest cell center on the map
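
A lattice snap of this kind might look like the sketch below: convert the coordinates to kilometers, find the cell that contains the point, take the middle of that cell, and convert back to degrees. The spherical Earth model, the lattice origin at latitude 0 and longitude 0, and the name cell_center_sketch are all assumptions of this example rather than documented behaviour.

```python
import math

KM_PER_DEGREE = math.pi * 6371.0 / 180.0  # spherical Earth; an assumption


def cell_center_sketch(latitude, longitude, cell_size):
    """Center of the cell_size (km) lattice cell that contains the point."""
    # Snap the north-south coordinate first.
    north_km = latitude * KM_PER_DEGREE
    center_north_km = (math.floor(north_km / cell_size) + 0.5) * cell_size
    center_latitude = center_north_km / KM_PER_DEGREE

    # East-west kilometers per degree shrink with latitude;
    # the snapped latitude is used so all points in a row share one scale.
    km_per_degree_lon = KM_PER_DEGREE * math.cos(math.radians(center_latitude))
    east_km = longitude * km_per_degree_lon
    center_east_km = (math.floor(east_km / cell_size) + 0.5) * cell_size
    center_longitude = center_east_km / km_per_degree_lon

    return (center_latitude, center_longitude)


center = cell_center_sketch(47.3769, 8.5417, cell_size=0.2)
```

The returned center is at most half a cell (0.1 km here) away from the input along each axis, and snapping a center again returns the same center.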

mobilipy.privacy.dt_floor(dt, delta) → datetime.datetime

Performs the floor operation on datetime values, with given timedelta as the base, e.g. 2021-01-01 12:21:47 with timedelta of 15s returns 2021-01-01 12:21:45.

Parameters
  • dt (datetime.datetime) – Datetime value to be floored.

  • delta (datetime.timedelta) – Timedelta used for the floor operation.

Returns

datetime.datetime – Floored datetime.
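
The floor operation can be reproduced with plain datetime arithmetic. The sketch below is one possible formulation, assuming naive (timezone-unaware) datetimes and windows counted from datetime.min; dt_floor_sketch is a hypothetical name and the library may implement the operation differently.

```python
from datetime import datetime, timedelta


def dt_floor_sketch(dt, delta):
    """Round a naive datetime down to a multiple of delta, counted from datetime.min."""
    # timedelta % timedelta yields the time elapsed since the last window boundary.
    return dt - (dt - datetime.min) % delta


floored = dt_floor_sketch(datetime(2021, 1, 1, 12, 21, 47), timedelta(seconds=15))
quarter = dt_floor_sketch(datetime(2021, 1, 1, 12, 21, 47), timedelta(minutes=15))
```

The first call reproduces the example from the description (12:21:47 floors to 12:21:45); with a 15-minute delta the same timestamp floors to 12:15:00.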

mobilipy.privacy.get_obfuscation_utility(w_prepared, w_obfuscated, legs) → float

Calculates the ratio of legs affected by obfuscation to total legs

Parameters
  • w_prepared (pandas.DataFrame) – Smoothed and cleaned waypoints DataFrame

  • w_obfuscated (pandas.DataFrame) – Smoothed, cleaned, and obfuscated waypoints DataFrame

  • legs (pandas.DataFrame) – DataFrame that contains assembled legs

Returns

float – Ratio of legs affected by obfuscation to total legs

mobilipy.privacy.km_to_lat(km_north) → float

Expresses a distance given in kilometers to the north as latitude in degrees

Parameters

km_north (float) – Latitude expressed in kilometers to the north

Returns

float – Latitude in degrees

mobilipy.privacy.km_to_lon(km_east, latitude) → float

Expresses given kilometers to the east in longitude degrees.

Parameters
  • km_east (float) – Longitude expressed in km going east from longitude 0, at the given latitude

  • latitude (float) – Latitude in degrees

Returns

float – Longitude in degrees

mobilipy.privacy.lat_to_km(latitude) → float

Expresses given latitude in kilometers to the north

Parameters

latitude (float) – Latitude in degrees.

Returns

float – Latitude expressed in kilometers to the north

mobilipy.privacy.lon_to_km(latitude, longitude) → float

Expresses given longitude in kilometers to the east

Parameters
  • latitude (float) – Latitude expressed in degrees

  • longitude (float) – Longitude expressed in degrees

Returns

float – Longitude as kilometers to the east
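
The degree and kilometer conversions in this module can be approximated on a spherical Earth. The sketch below is one such approximation, not the library's formulas: the 6371 km radius is an assumption, and the _sketch functions are hypothetical stand-ins that only mirror the documented argument order.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def lat_to_km_sketch(latitude):
    """Kilometers north of the equator for a latitude in degrees."""
    return math.radians(latitude) * EARTH_RADIUS_KM


def km_to_lat_sketch(km_north):
    """Latitude in degrees for a distance north of the equator in km."""
    return math.degrees(km_north / EARTH_RADIUS_KM)


def lon_to_km_sketch(latitude, longitude):
    """Kilometers east of longitude 0, measured along the given latitude."""
    return math.radians(longitude) * EARTH_RADIUS_KM * math.cos(math.radians(latitude))


def km_to_lon_sketch(km_east, latitude):
    """Longitude in degrees for a distance east of longitude 0 at the given latitude."""
    return math.degrees(km_east / (EARTH_RADIUS_KM * math.cos(math.radians(latitude))))


km_north = lat_to_km_sketch(47.3769)
km_east = lon_to_km_sketch(47.3769, 8.5417)
```

Under this model one degree of latitude is about 111.2 km everywhere, while one degree of longitude shrinks with the cosine of the latitude, which is why the longitude conversions also need a latitude.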

mobilipy.privacy.obfuscate(df, locations, radius=100, offset=30, mode='remove') → pandas.core.frame.DataFrame

Obfuscates the regions around the points given in the ‘locations’ parameter, either by removing all the points in their proximity or by moving these points to a single noisy location inside the proximity circle.

Parameters
  • df (pandas.DataFrame) – DataFrame with ‘latitude’ and ‘longitude’ columns.

  • locations (list) – List of locations given as (latitude, longitude) tuples

  • radius (int, optional) – Radius of the obfuscation circle. Defaults to 100.

  • offset (int, optional) – Smallest distance from the perimeter of the obfuscation circle at which the location of interest must be located. Defaults to 30.

  • mode (str, optional) – Obfuscation mode, can be either ‘remove’ or ‘assign’. Defaults to ‘remove’.

Returns

pandas.DataFrame – DataFrame with obfuscated regions
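
The ‘remove’ mode can be pictured as a distance filter. The sketch below drops every point within radius of a sensitive location, with two simplifications that are assumptions of this example: radius is taken to be in meters, and each circle is centered exactly on the location, so the offset parameter is ignored. haversine_m and remove_near are hypothetical helpers operating on plain tuples rather than a DataFrame.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters; an assumption


def haversine_m(p, q):
    """Great-circle distance in meters between two (latitude, longitude) points."""
    phi1, phi2 = math.radians(p[0]), math.radians(q[0])
    dphi = phi2 - phi1
    dlmb = math.radians(q[1] - p[1])
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def remove_near(points, locations, radius=100):
    """Keep only the points farther than radius meters from every location."""
    return [
        p for p in points
        if all(haversine_m(p, loc) > radius for loc in locations)
    ]


# Made-up coordinates, for illustration only
points = [(47.3769, 8.5417), (47.3772, 8.5420), (47.3900, 8.5600)]
home = (47.3770, 8.5418)
kept = remove_near(points, [home])
```

The first two points lie within a few tens of meters of the sensitive location and are removed; the third is more than a kilometer away and survives.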

mobilipy.privacy.shift_point(point, latitude_shift_km, longitude_shift_km)

Shifts the coordinates of a given point.

Parameters
  • point (tuple) – Starting point as (latitude, longitude), in degrees

  • latitude_shift_km (float) – Latitude offset in kilometers

  • longitude_shift_km (float) – Longitude offset in kilometers

Returns

tuple – Shifted point expressed as (latitude, longitude)
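
A shift expressed in kilometers has to be converted to degrees before it can be added to the coordinates. The sketch below shows one way to do this, assuming a spherical Earth and scaling the longitude shift by the latitude of the starting point; shift_point_sketch is a hypothetical name and the library's conversion may differ.

```python
import math

KM_PER_DEGREE = math.pi * 6371.0 / 180.0  # spherical Earth; an assumption


def shift_point_sketch(point, latitude_shift_km, longitude_shift_km):
    """Move (latitude, longitude) north and east by the given distances in km."""
    latitude, longitude = point
    new_latitude = latitude + latitude_shift_km / KM_PER_DEGREE
    # A degree of longitude covers fewer kilometers away from the equator.
    new_longitude = longitude + longitude_shift_km / (
        KM_PER_DEGREE * math.cos(math.radians(latitude))
    )
    return (new_latitude, new_longitude)


shifted = shift_point_sketch((47.0, 8.0), latitude_shift_km=1.0, longitude_shift_km=0.0)
```

Shifting one kilometer north changes the latitude by roughly 0.009 degrees and leaves the longitude untouched.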

mobilipy.reva module

mobilipy.reva.analyse(df, user_id) → pandas.core.frame.DataFrame

Returns complete trip information from a raw GPS waypoints DataFrame. Segments the data into trips, detects the mode of transport and tags the home and work locations.

Parameters
  • df (pandas.DataFrame) – WaypointsDataFrame

  • user_id (str) – user’s ID

Returns

pandas.DataFrame – DataFrame with selected user’s legs

mobilipy.segmentation module

mobilipy.segmentation.segment(prepared_df, radius=0.025, min_samples=50, time_gap=850, use_multiprocessing=True) → pandas.core.frame.DataFrame

Finds clusters of waypoints for legs

Parameters
  • prepared_df (pandas.DataFrame) – Waypoints DataFrame to be processed

  • radius (float) – Eps for DBSCAN

  • min_samples (int) – Minimum number of samples to be considered for a cluster by DBSCAN

  • time_gap (float) – Max time gap threshold for detected clusters

Returns

pandas.DataFrame – DataFrame with the segment starts and ends

mobilipy.waypointsdataframe module

class mobilipy.waypointsdataframe.WaypointsDataFrame(data, tracked_at='tracked_at', longitude='longitude', latitude='latitude', user_id='user_id', crs={'init': 'epsg:4326'}, timezone='UTC')

Bases: pandas.core.frame.DataFrame

Class that serves as an entry point for the mobilipy pipeline

Initializes the WaypointsDataFrame

Parameters
  • data (pandas.DataFrame) – DataFrame with raw GPS data

  • tracked_at (str, optional) – Name of the column containing the timestamp. Defaults to constants.TRACKED_AT.

  • longitude (str, optional) – Name of the column containing the longitude. Defaults to constants.LONGITUDE.

  • latitude (str, optional) – Name of the column containing the latitude. Defaults to constants.LATITUDE.

  • user_id (str, optional) – Name of the column containing the user_id. Defaults to ‘user_id’.

  • crs (dict, optional) – Coordinate Reference System. Defaults to {“init”: “epsg:4326”}.

  • timezone (str, optional) – Timezone available in pytz. Defaults to constants.UTC.

Module contents