Retrieve and combine bus route on-time performance report data
Source: R/on_time_performance.R
report_otp.Rd
Retrieves and combines data from Zonar and Versatrans and generates full OTP
data. This is the main workhorse function for pulling, combining, and organizing
OTP data. The full schedule and OTP data are returned. To summarize and update
the OTP reports in Google Sheets, call update_otp_report() on the data.frame
returned from this function.
Usage
report_otp(
date = as.character(Sys.Date()),
ampm = c("AM", "PM"),
include_zonar = TRUE,
cutoff_min = -25,
cutoff_max = 90,
inbound_delay_weight = 1,
outbound_delay_weight = 3.6,
rp_database = Sys.getenv("RP_DATABASE"),
os_database = Sys.getenv("OS_DATABASE"),
rp_odbc_name = Sys.getenv("RP_ODBC_NAME"),
uncovered_url = Sys.getenv("OTP_UNCOVERED_ID"),
route_change_id = Sys.getenv("ROUTE_CHANGE_ID"),
yard_ids = c(freeport = Sys.getenv("FREEPORT_OPSLOG_ID"), readville =
Sys.getenv("READVILLE_OPSLOG_ID"), washington = Sys.getenv("WASHINGTON_OPSLOG_ID")),
daysheet_id = Sys.getenv("DAYSHEET_ID"),
dispatchlog_id = Sys.getenv("OTP_DISPATCHLOG_ID"),
mail_to = strsplit(Sys.getenv("MAIL_TO"), ",")[[1]],
mail_from = Sys.getenv("MAIL_FROM"),
TZ = "America/New_York",
test = FALSE,
debug = FALSE
)
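Most of the defaults above are read with Sys.getenv(), which returns an empty string for a variable that is not set. The following is a minimal sketch of a pre-flight check, assuming you would rather be told about unset variables before calling report_otp(). The helper name missing_env_vars() is hypothetical and not part of the package, and the email addresses are placeholders.

```r
# Hypothetical helper (not part of the package): return the names of the
# environment variables that are unset or empty.
missing_env_vars <- function(vars) {
  values <- Sys.getenv(vars, unset = "")
  vars[!nzchar(values)]
}

# Variable names taken from the defaults in the usage above.
otp_vars <- c(
  "RP_DATABASE", "OS_DATABASE", "RP_ODBC_NAME", "OTP_UNCOVERED_ID",
  "ROUTE_CHANGE_ID", "FREEPORT_OPSLOG_ID", "READVILLE_OPSLOG_ID",
  "WASHINGTON_OPSLOG_ID", "DAYSHEET_ID", "OTP_DISPATCHLOG_ID",
  "MAIL_TO", "MAIL_FROM"
)

unset_vars <- missing_env_vars(otp_vars)
if (length(unset_vars) > 0) {
  message("Unset environment variables: ", paste(unset_vars, collapse = ", "))
}

# mail_to is split on commas, so one variable can hold several addresses.
Sys.setenv(MAIL_TO = "first@example.org,second@example.org")
strsplit(Sys.getenv("MAIL_TO"), ",")[[1]]
```

Any value can still be supplied directly as an argument instead of through the environment.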
Arguments
- date
Character vector of length 1 giving the date in YYYY-MM-DD format.
- ampm
Character vector of length 1, either "AM" or "PM", used to specify the morning or evening OTP report.
- include_zonar
Logical vector of length 1. If TRUE, Zonar schedule data will be retrieved and used for OTP calculations.
- cutoff_min
Numeric vector of length one giving the minimum delay time in minutes that will be considered a valid trip arrival. Usually this will be a negative value, e.g., -25 or so.
- cutoff_max
Numeric vector of length one giving the maximum delay time in minutes that will be considered a valid trip arrival. Usually this will be positive and relatively large, e.g., 60 or so.
- inbound_delay_weight
Numeric vector of length one giving the factor by which to prefer delayed vs early arrival for inbound trips. This factor is used to select among multiple arrival times in cases where vehicles pass through their assigned zones more than once. Higher values mean delayed arrivals are more likely to be selected over early arrivals. Default value is 1.
- outbound_delay_weight
Numeric vector of length one giving the factor by which to prefer early vs delayed arrival for outbound trips. This factor is used to select among multiple arrival times in cases where vehicles pass through their assigned zones more than once. Higher values mean early arrivals are more likely to be selected over delayed arrivals. Default value is 3.6.
- rp_database
Name of the RP database to connect to. Retrieved from the RP_DATABASE environment variable by default but can be overridden here; see the RVersatransRP package for details.
- os_database
Name of the Onscreen database to connect to.
- rp_odbc_name
Name of the Windows ODBC data source to use. Retrieved from the RP_ODBC_NAME environment variable by default but can be overridden here; see the RVersatransRP package for details.
- uncovered_url
URL or ID of a Google sheet containing a list of uncovered trips. Must contain columns named "Date", "AM/PM", "Route Set", and "Bus".
- route_change_id
ID of Google sheet containing route change data as returned by report_routechanges(). Defaults to Sys.getenv("ROUTE_CHANGE_ID").
- yard_ids
Named vector of IDs of Google sheets containing OPS vehicle override logs for each yard.
- daysheet_id
ID of Google sheet containing the day sheet log.
- dispatchlog_id
ID of Google sheet containing the shadow dispatch log.
- mail_to
Character vector of email addresses to send notifications to.
- mail_from
GMail address used to send notification emails.
- TZ
The timezone used by the database server.
- test
Logical vector of length one indicating whether to pull an abbreviated test data set.
- debug
Logical vector of length one indicating whether to generate extra debugging data. Only useful for development; never use this in production. Even with this, the time needed to test this function will be long.
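The interaction between the cutoff window and the delay weights can be illustrated with a small sketch. The cost function below is an assumption made for illustration, not the package's actual selection rule: candidates outside the window from cutoff_min to cutoff_max (treated here as inclusive) are dropped, then the disfavored side is multiplied by the weight and the candidate with the lowest cost is chosen. The function name pick_arrival() is hypothetical.

```r
# Illustrative only: this cost function is an assumption, not the rule
# implemented by report_otp().
pick_arrival <- function(delays, direction = c("Inbound", "Outbound"),
                         cutoff_min = -25, cutoff_max = 90,
                         inbound_delay_weight = 1,
                         outbound_delay_weight = 3.6) {
  direction <- match.arg(direction)
  # Keep only candidates inside the valid arrival window.
  valid <- delays[delays >= cutoff_min & delays <= cutoff_max]
  if (length(valid) == 0) return(NA_real_)
  early <- valid < 0
  cost <- if (direction == "Inbound") {
    # Penalize early arrivals, so delayed candidates are preferred.
    ifelse(early, abs(valid) * inbound_delay_weight, valid)
  } else {
    # Penalize delayed arrivals, so early candidates are preferred.
    ifelse(early, abs(valid), valid * outbound_delay_weight)
  }
  valid[which.min(cost)]
}

# Candidate delays in minutes for a vehicle that passed its zone four times.
candidates <- c(-10, 8, 40, 120)
pick_arrival(candidates, "Inbound")
pick_arrival(candidates, "Outbound")
```

Under these assumptions the 120-minute candidate is discarded by the window, and raising a weight makes the corresponding preference stronger.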
Value
A data frame with 64 columns:
- Date
dttm
Date
- Route
chr
Visible route id number
- RouteName
chr
Route Name
- Vehicle
chr
Vehicle number
- Direction
chr
Direction of trip: Inbound or Outbound
- TripOutcome
chr
classification of trip outcome: "on time", "late", "unreported", etc.
- ExpectedTime
dttm
Expected arrival time; bell time for inbound routes, anchor time for outbound routes
- ArrivalTime
dttm
Arrival time calculated from combined Onscreen and Zonar data
- DelayTime
int
Difference between ArrivalTime and ExpectedTime, in minutes
- TrueOnTime
lgl
Did the vehicle arrive on or before the expected time
- TimeTier
hms
expected arrival time tier
- Yard
chr
Vehicle yard, e.g. "FREEPORT", "READVILLE", or "WASHINGTON"
- AnchorTime
dttm
Scheduled arrival time for inbound routes, scheduled departure time for outbound routes
- DelayTimeByAnchor
int
Difference between ArrivalTime and AnchorTime, in minutes
- AnchorAbbrev
chr
RP Anchor abbreviation number
- OriginAbbrev
chr
Origin Point ID number
- ExpectedDepartureTime
dttm
Expected departure time
- ActualDepartTime
dttm
Actual departure time
- ActualLoad
int
Number of riders assigned to route
- RouteDistance
int
Route distance
- RouteTime
int
Route time
- Days
chr
Days the route is run
- RouteSetName
chr
Route Set Name
- AMPM
chr
Type of trip: "AM", "MID", or "PM"
- excludedShuttles
lgl
shuttle that should be excluded from OTP calc
- OriginalVehicle
chr
original vehicle number (if override in effect)
- LocationDescription
chr
School name and street
- LocationLat
dbl
School latitude
- LocationLon
dbl
School longitude
- Neighborhood
chr
Neighborhood the school is located in
- OriginPointDescript
chr
Origin Point description
- Originlatitude
dbl
For inbound trips this is the starting latitude; for outbound trips this is also the school latitude
- Originlongitude
dbl
For inbound trips this is the starting longitude; for outbound trips this is also the school longitude
- VehicleSource
chr
data source used to identify the vehicle
- LocationTotal
int
number of locations included in route
- LocationRecorded
int
number of route locations with vehicle arrival/departure time recorded
- LocationDelayAvg
dbl
average delay time of recorded route locations
- StopVehicleTotal
int
total stops assigned to vehicle on this date
- StopVehicleRecorded
int
total stops with vehicle arrival/departure time recorded on this date
- StopVehicleDelayAvg
dbl
average stop arrival/departure delay for all stops assigned to this vehicle on this date
- GPSPings
int
Number of GPS locations recorded in time tier
- Zone
chr
Zonar zone corresponding to the anchor location
- ZonarCategory
chr
the category assigned to the Zonar zone
- ZonarDuration
drtn
length of time recorded in Zonar zone
- ZoneInZonar
lgl
anchor location has a corresponding Zonar zone
- TypeOfBus
chr
one of "B", "H", "M", etc.
- ArrivalTimeSource
chr
Indicates if the arrival time used came from Onscreen or Zonar
- OnTimeByAnchor
lgl
Did the vehicle arrive on or before the anchor time
- RPOSAnchorTimesMatch
lgl
Do RP and Onscreen anchor times agree
- RPOSVehiclesUnmatched
chr
semi-colon separated list of vehicles with differing assignments in RP and Onscreen
- RPOSVehiclesMatch
lgl
TRUE if the same vehicle was assigned to both RP and Onscreen
- ArrivalInZonar
lgl
was arrival recorded by Zonar
- ArrivalInOnscreen
lgl
was arrival recorded by Onscreen
- InRPRoute
lgl
TRUE if route existed in RP database
- InOSRoute
lgl
TRUE if route existed in Onscreen database
- InArrivalWindow
lgl
arrival inside the specified min/max cutoff window
- RouteChanges
chr
List of route changes from the previous week
- Uncovered
lgl
route in uncovered list
- Unreported
lgl
route is unreported, i.e., no arrival time was recorded
- NoGoStops
dbl
Number of route stops listed as no-go
- NoGo
lgl
TRUE if route was listed as a no-go or if NoGoStops is equal to the number of stops on the route
- EarlyRelease
dttm
Time of early release, if applicable
- MatchCount
int
number of vehicles with anchor arrival time recorded that may have been assigned to the route
- PossibleMatches
chr
semi-colon separated list of vehicles with anchor arrival time recorded that may have been assigned to the route
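To make the relationship between ExpectedTime, ArrivalTime, DelayTime, and TrueOnTime concrete, here is a minimal sketch on made-up timestamps. Truncating the delay to whole minutes is an assumption; the rounding rule used by the package is not documented here.

```r
# Toy timestamps; the values are made up for illustration.
tz <- "America/New_York"
expected <- as.POSIXct("2023-06-07 07:30:00", tz = tz)
arrival <- as.POSIXct(c("2023-06-07 07:25:00", "2023-06-07 07:42:00"), tz = tz)

# Delay in minutes: negative means early, positive means late.
delay_time <- as.integer(as.numeric(difftime(arrival, expected, units = "mins")))

# On time means arriving on or before the expected time.
true_on_time <- arrival <= expected

data.frame(
  ExpectedTime = expected,
  ArrivalTime = arrival,
  DelayTime = delay_time,
  TrueOnTime = true_on_time
)
```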
Details
The main outcome variables are TripOutcome, which tells you what happened
(e.g., trip was reported/unreported/uncovered), and DelayTime, which
tells you the delay time in minutes for each reported route. The remaining variables are
included to make it easy to generate break-out reports (e.g., by Yard or by expected time tier)
or to help investigate specific trips (e.g., OriginalVehicle tells you the
originally assigned vehicle, which is sometimes used even though Onscreen says it
was swapped out).
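A break-out report by Yard might look like the sketch below. It uses a toy data frame with made-up values and only a few of the returned columns, and it assumes that trips without a recorded arrival should be left out of the delay summary. The output column names MeanDelayMin and ShareOnTime are illustrative.

```r
# Toy data standing in for a few columns of the report_otp() result.
otp <- data.frame(
  Yard = c("FREEPORT", "FREEPORT", "READVILLE", "WASHINGTON"),
  TripOutcome = c("on time", "late", "on time", "unreported"),
  DelayTime = c(-3L, 14L, 0L, NA),
  TrueOnTime = c(TRUE, FALSE, TRUE, NA)
)

# Only trips with a recorded arrival contribute to the summary.
reported <- otp[!is.na(otp$DelayTime), ]

# Mean delay and share of on-time trips per yard.
by_yard <- aggregate(
  cbind(DelayTime, TrueOnTime) ~ Yard,
  data = reported,
  FUN = mean
)
names(by_yard) <- c("Yard", "MeanDelayMin", "ShareOnTime")
by_yard
```

The same pattern applies to other grouping columns such as TimeTier.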
Examples
thedate <- as.character(BPSTranspoReportR:::last_wednesday(Sys.Date()))
otp <- report_otp(thedate, ampm = "AM", test = TRUE)
#> Retrieving data for 2023-06-07 AM OTP
#> Error in validate_drive_id(new_drive_id(out)) :
#> A <drive_id> must match this regular expression: `^[a-zA-Z0-9_-]+$`
#> Invalid input:
#> ✖ ""
#> Error in validate_drive_id(new_drive_id(out)) :
#> A <drive_id> must match this regular expression: `^[a-zA-Z0-9_-]+$`
#> Invalid input:
#> ✖ ""
#> Warning: Cannot retrieve OPS Log for2023-06-07
#> NOTE: only pulling schedule data for the first five zones; omit 'test' parameter if you want everything
#> Error in map(.x, .f, ...): ℹ In index: 1.
#> Caused by error in `is_destroyed()`:
#> ! Attempted to use cache which has been destroyed:
#> C:\Users\172060\Documents\Projects\BPSTranspoReportR\.zonarCache
dplyr::glimpse(otp[0,])
#> Rows: 0
#> Columns: 64
#> $ Date <dttm>
#> $ Route <chr>
#> $ RouteSetName <chr>
#> $ Vehicle <chr>
#> $ Direction <chr>
#> $ TripOutcome <chr>
#> $ ExpectedTime <dttm>
#> $ ArrivalTime <dttm>
#> $ DelayTime <dbl>
#> $ TrueOnTime <lgl>
#> $ TimeTier <time> hms()
#> $ Yard <chr>
#> $ AnchorTime <dttm>
#> $ DelayTimeByAnchor <dbl>
#> $ AnchorAbbrev <chr>
#> $ OriginAbbrev <chr>
#> $ OriginPointDescript <chr>
#> $ ExpectedDepartureTime <dttm>
#> $ ActualDepartTime <dttm>
#> $ ActualLoad <int>
#> $ RouteDistance <int>
#> $ RouteTime <int>
#> $ Days <chr>
#> $ RouteName <chr>
#> $ AMPM <chr>
#> $ excludedShuttles <lgl>
#> $ OriginalVehicle <chr>
#> $ LocationDescription <chr>
#> $ LocationLat <dbl>
#> $ LocationLon <dbl>
#> $ Neighborhood <chr>
#> $ Originlatitude <dbl>
#> $ Originlongitude <dbl>
#> $ VehicleSource <chr>
#> $ LocationTotal <int>
#> $ LocationRecorded <dbl>
#> $ LocationDelayAvg <dbl>
#> $ StopVehicleTotal <int>
#> $ StopVehicleRecorded <int>
#> $ StopVehicleDelayAvg <dbl>
#> $ GPSPings <dbl>
#> $ Zone <chr>
#> $ ZonarCategory <chr>
#> $ ZonarDuration <drtn> secs
#> $ ZoneInZonar <lgl>
#> $ TypeOfBus <chr>
#> $ ArrivalTimeSource <chr>
#> $ OnTimeByAnchor <lgl>
#> $ RPOSAnchorTimesMatch <lgl>
#> $ RPOSVehiclesUnmatched <chr>
#> $ RPOSVehiclesMatch <lgl>
#> $ ArrivalInZonar <lgl>
#> $ ArrivalInOnscreen <lgl>
#> $ InRPRoute <lgl>
#> $ InOSRoute <lgl>
#> $ InArrivalWindow <lgl>
#> $ RouteChanges <chr>
#> $ Uncovered <lgl>
#> $ Unreported <lgl>
#> $ NoGoStops <dbl>
#> $ NoGo <lgl>
#> $ EarlyRelease <lgl>
#> $ MatchCount <int>
#> $ PossibleMatches <chr>