Retrieve and combine bus route on-time performance report data
Source: R/on_time_performance.R
report_otp.Rd
Retrieves and combines data from Zonar and Versatrans to generate the full OTP
data set. This is the main workhorse function for pulling, combining, and
organizing OTP data; it returns the full schedule and OTP data. To summarize the
results and update the OTP reports in Google Sheets, call update_otp_report() on
the data.frame returned from this function.
Usage
report_otp(
date = as.character(Sys.Date()),
ampm = c("AM", "PM"),
include_zonar = TRUE,
cutoff_min = -25,
cutoff_max = 90,
inbound_delay_weight = 1,
outbound_delay_weight = 3.6,
rp_database = Sys.getenv("RP_DATABASE"),
os_database = Sys.getenv("OS_DATABASE"),
rp_odbc_name = Sys.getenv("RP_ODBC_NAME"),
uncovered_url = Sys.getenv("OTP_UNCOVERED_ID"),
route_change_id = Sys.getenv("ROUTE_CHANGE_ID"),
yard_ids = c(freeport = Sys.getenv("FREEPORT_OPSLOG_ID"), readville =
Sys.getenv("READVILLE_OPSLOG_ID"), washington = Sys.getenv("WASHINGTON_OPSLOG_ID")),
daysheet_id = Sys.getenv("DAYSHEET_ID"),
dispatchlog_id = Sys.getenv("OTP_DISPATCHLOG_ID"),
mail_to = strsplit(Sys.getenv("MAIL_TO"), ",")[[1]],
mail_from = Sys.getenv("MAIL_FROM"),
TZ = "America/New_York",
test = FALSE,
debug = FALSE
)
Arguments
- date
Character vector of length 1 giving the date in YYYY-MM-DD format.
- ampm
Character vector of length 1, either "AM" or "PM", used to specify the morning or evening OTP report.
- include_zonar
Logical vector of length 1. If TRUE, Zonar schedule data will be retrieved and used for OTP calculations.
- cutoff_min
Numeric vector of length one giving the minimum delay time in minutes that will be considered a valid trip arrival. Usually this will be a negative value, e.g., -25 or so.
- cutoff_max
Numeric vector of length one giving the maximum delay time in minutes that will be considered a valid trip arrival. Usually this will be positive and relatively large, e.g., 60 or so.
- inbound_delay_weight
Numeric vector of length one giving the factor by which to prefer delayed vs early arrival for inbound trips. This factor is used to select among multiple arrival times in cases where vehicles pass through their assigned zones more than once. Higher values mean delayed arrivals are more likely to be selected over early arrivals. Default value is 1.
- outbound_delay_weight
Numeric vector of length one giving the factor by which to prefer early vs delayed arrival for outbound trips. This factor is used to select among multiple arrival times in cases where vehicles pass through their assigned zones more than once. Higher values mean early arrivals are more likely to be selected over delayed arrivals. Default value is 3.6.
- rp_database
Name of the RP database to connect to. Retrieved from the RP_DATABASE environment variable by default but can be overridden here; see the RVersatransRP package for details.
- os_database
Name of the Onscreen database to connect to.
- rp_odbc_name
Name of the Windows ODBC data source to use. Retrieved from the RP_ODBC_NAME environment variable by default but can be overridden here; see the RVersatransRP package for details.
- uncovered_url
URL or ID of a Google sheet containing a list of uncovered trips. Must contain columns named "Date", "AM/PM", "Route Set", and "Bus".
- route_change_id
ID of Google sheet containing route change data as returned by report_routechanges(). Defaults to Sys.getenv("ROUTE_CHANGE_ID").
- yard_ids
Named vector of IDs of Google sheets containing OPS vehicle override logs for each yard.
- daysheet_id
ID of Google sheet containing the day sheet log.
- dispatchlog_id
ID of Google sheet containing the shadow dispatch log.
- mail_to
Character vector of email addresses to send notifications to.
- mail_from
Gmail address used to send notification emails.
- TZ
The timezone used by the database server.
- test
Logical vector of length one indicating whether to pull an abbreviated test data set.
- debug
Logical vector of length one. If TRUE, generates extra debugging data. This is only useful for development; never use it in production. Even with this, the time needed to test this function will be long.
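The interaction between cutoff_min, cutoff_max, and the delay weights described above can be illustrated with a small sketch. The source does not spell out the scoring rule, so the one below is an assumption made purely for illustration: arrivals outside the cutoff window are discarded, and the distance of a preferred-side arrival from the expected time is divided by the weight so that it looks "closer". pick_arrival() is a hypothetical helper, not a function in this package.

```r
# Hypothetical helper, not part of the package: choose one arrival among
# several candidate delays (minutes; negative = early) for a single trip.
pick_arrival <- function(delay, cutoff_min = -25, cutoff_max = 90,
                         delay_weight = 1, prefer = c("delayed", "early")) {
  prefer <- match.arg(prefer)
  # Keep only arrivals inside the valid window.
  valid <- delay[!is.na(delay) & delay >= cutoff_min & delay <= cutoff_max]
  if (length(valid) == 0) return(NA_real_)
  # Assumed scoring rule: absolute distance from the expected time, with the
  # preferred side's distance divided by the weight.
  preferred <- if (prefer == "delayed") valid > 0 else valid < 0
  score <- ifelse(preferred, abs(valid) / delay_weight, abs(valid))
  valid[which.min(score)]
}

# A vehicle that passed through its zone four times.
candidates <- c(-40, -6, 9, 120)

inbound          <- pick_arrival(candidates, delay_weight = 1,   prefer = "delayed")
inbound_weighted <- pick_arrival(candidates, delay_weight = 2,   prefer = "delayed")
outbound         <- pick_arrival(candidates, delay_weight = 3.6, prefer = "early")
```

With a weight of 1 the nearest valid arrival wins regardless of direction; raising the weight tips the choice toward the preferred side, which matches the "higher values mean ... more likely to be selected" wording in the argument descriptions.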
Value
A data frame with 64 columns:
- Date
dttm Date
- Route
chr Visible route id number
- RouteName
chr Route Name
- Vehicle
chr Vehicle number
- Direction
chr Direction of trip: Inbound or Outbound
- TripOutcome
chr classification of trip outcome: "on time", "late", "unreported", etc.
- ExpectedTime
dttm Expected arrival time; bell time for inbound routes, anchor time for outbound routes
- ArrivalTime
dttm Arrival time calculated from combined Onscreen and Zonar data
- DelayTime
int Difference between ArrivalTime and ExpectedTime, in minutes
- TrueOnTime
lgl Did the vehicle arrive on or before the expected time
- TimeTier
hms expected arrival time tier
- Yard
chr Vehicle yard, e.g. "FREEPORT", "READVILLE", or "WASHINGTON"
- AnchorTime
dttm Scheduled arrival time for inbound routes, scheduled departure time for outbound routes
- DelayTimeByAnchor
int Difference between ArrivalTime and AnchorTime, in minutes
- AnchorAbbrev
chr RP Anchor abbreviation number
- OriginAbbrev
chr Origin Point ID number
- ExpectedDepartureTime
dttm Expected departure time
- ActualDepartTime
dttm Actual departure time
- ActualLoad
int Number of riders assigned to route
- RouteDistance
int Route distance
- RouteTime
int Route time
- Days
chr Days the route is run
- RouteSetName
chr Route Set Name
- AMPM
chr Type of trip: "AM", "MID", or "PM"
- excludedShuttles
lgl shuttle that should be excluded from OTP calc
- OriginalRPVehicle
chr original vehicle number (if override in effect)
- LocationDescription
chr School name and street
- LocationLat
dbl School latitude
- LocationLon
dbl School longitude
- Neighborhood
chr Neighborhood the school is located in
- OriginPointDescript
chr Origin Point description
- Originlatitude
dbl For inbound trips this is the starting latitude; for outbound trips this is also the school latitude
- Originlongitude
dbl For inbound trips this is the starting longitude; for outbound trips this is also the school longitude
- VehicleSource
chr data source used to identify the vehicle
- LocationTotal
int number of locations included in route
- LocationRecorded
int number of route locations with vehicle arrival/departure time recorded
- LocationDelayAvg
dbl average delay time of recorded route locations
- StopVehicleTotal
int total stops assigned to vehicle on this date
- StopVehicleRecorded
int total stops with vehicle arrival/departure time recorded on this date
- StopVehicleDelayAvg
dbl average stop arrival/departure delay for all stops assigned to this vehicle on this date
- GPSPings
int Number of GPS locations recorded in time tier
- Zone
chr Zonar zone corresponding to the anchor location
- ZonarCategory
chr the category assigned to the Zonar zone
- ZonarDuration
drtn length of time recorded in Zonar zone
- ZoneInZonar
lgl anchor location has a corresponding Zonar zone
- TypeOfBus
chr one of "B", "H", "M", etc.
- ArrivalTimeSource
chr Indicates if the arrival time used came from Onscreen or Zonar
- OnTimeByAnchor
lgl Did the vehicle arrive on or before the anchor time
- RPOSAnchorTimesMatch
lgl Do RP and Onscreen anchor times agree
- RPOSVehiclesUnmatched
chr semicolon-separated list of vehicles with differing assignments in RP and Onscreen
- RPOSVehiclesMatch
lgl TRUE if the same vehicle was assigned to both RP and Onscreen
- ArrivalInZonar
lgl was arrival recorded by Zonar
- ArrivalInOnscreen
lgl was arrival recorded by Onscreen
- InRPRoute
lgl TRUE if route existed in RP database
- InOSRoute
lgl TRUE if route existed in Onscreen database
- InArrivalWindow
lgl arrival inside the specified min/max cutoff window
- RouteChanges
chr List of route changes from the previous week
- Uncovered
lgl route in uncovered list
- Unreported
lgl route is unreported, i.e., no arrival time was recorded
- NoGoStops
lgl Number of route stops listed as no-go
- NoGo
lgl TRUE if route was listed as a no-go or if NoGoStops is equal to the number of stops on the Route
- EarlyRelease
dttm Time of early release, if applicable
- MatchCount
int number of vehicles with anchor arrival time recorded that may have been assigned to the route
- PossibleMatches
chr semicolon-separated list of vehicles with anchor arrival time recorded that may have been assigned to the route
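To make the relationship between a few of these columns concrete, the sketch below derives DelayTime, TrueOnTime, InArrivalWindow, and Unreported from ExpectedTime and ArrivalTime on a made-up three-row data frame. The column names follow the list above; the values are invented, and the rounding to whole minutes is an assumption rather than the package's documented behavior.

```r
# Toy data only: three trips, the last with no recorded arrival.
tz <- "America/New_York"
trips <- data.frame(
  Route = c("A1", "A2", "A3"),
  ExpectedTime = as.POSIXct(
    c("2023-06-07 07:30:00", "2023-06-07 07:30:00", "2023-06-07 08:15:00"),
    tz = tz
  ),
  ArrivalTime = as.POSIXct(
    c("2023-06-07 07:25:00", "2023-06-07 07:42:00", NA),
    tz = tz
  )
)

# Delay in whole minutes; negative values mean the vehicle arrived early.
delay_min <- as.numeric(
  difftime(trips$ArrivalTime, trips$ExpectedTime, units = "mins")
)
trips$DelayTime <- as.integer(round(delay_min))

# Arrived on or before the expected time (NA when nothing was recorded).
trips$TrueOnTime <- trips$DelayTime <= 0

# Inside the cutoff window, using the default cutoffs from Usage.
cutoff_min <- -25
cutoff_max <- 90
trips$InArrivalWindow <- !is.na(trips$DelayTime) &
  trips$DelayTime >= cutoff_min & trips$DelayTime <= cutoff_max

# No arrival time recorded at all.
trips$Unreported <- is.na(trips$ArrivalTime)
```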
Details
The main outcome variables are TripOutcome, which tells you what happened
(e.g., trip was reported/unreported/uncovered), and DelayTime, which
tells you the delay time in minutes for each reported route. The remaining variables are
included to make it easy to generate break-out reports (e.g., by Yard or by expected time tier)
or to help investigate specific trips (e.g., RPOriginalVehicle tells you the
originally assigned vehicle, which is sometimes used even though Onscreen says it
was swapped out).
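As a rough illustration of the break-out reports mentioned above, the following sketch summarizes on-time rate and mean delay by Yard with base R. It runs on a small invented data frame that borrows only the column names from the Value section; it does not reproduce what update_otp_report() computes.

```r
# Toy data only; values are made up for illustration.
otp_toy <- data.frame(
  Yard = c("FREEPORT", "FREEPORT", "READVILLE", "READVILLE", "WASHINGTON"),
  TripOutcome = c("on time", "late", "on time", "unreported", "on time"),
  DelayTime = c(-3, 14, 0, NA, -8),
  TrueOnTime = c(TRUE, FALSE, TRUE, NA, TRUE)
)

# Unreported trips have no delay, so drop them before averaging.
reported <- otp_toy[otp_toy$TripOutcome != "unreported", ]

# One row per yard: share of on-time trips and mean delay in minutes.
by_yard <- aggregate(
  cbind(OnTimeRate = TrueOnTime, MeanDelay = DelayTime) ~ Yard,
  data = reported,
  FUN = mean
)
by_yard
```

The same pattern applies to a break-out by expected time tier: replace Yard with TimeTier on the right-hand side of the formula.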
Examples
thedate <- as.character(BPSTranspoReportR:::last_wednesday(Sys.Date()))
otp <- report_otp(thedate, ampm = "AM", test = TRUE)
#> Retrieving data for 2023-06-07 AM OTP
#> Error in validate_drive_id(new_drive_id(out)) :
#> A <drive_id> must match this regular expression: `^[a-zA-Z0-9_-]+$`
#> Invalid input:
#> ✖ ""
#> Error in validate_drive_id(new_drive_id(out)) :
#> A <drive_id> must match this regular expression: `^[a-zA-Z0-9_-]+$`
#> Invalid input:
#> ✖ ""
#> Warning: Cannot retrieve OPS Log for2023-06-07
#> NOTE: only pulling schedule data for the first five zones; omit 'test' parameter if you want everything
#> Error in map(.x, .f, ...): ℹ In index: 1.
#> Caused by error in `is_destroyed()`:
#> ! Attempted to use cache which has been destroyed:
#> C:\Users\172060\Documents\Projects\BPSTranspoReportR\.zonarCache
dplyr::glimpse(otp[0,])
#> Rows: 0
#> Columns: 64
#> $ Date <dttm>
#> $ Route <chr>
#> $ RouteSetName <chr>
#> $ Vehicle <chr>
#> $ Direction <chr>
#> $ TripOutcome <chr>
#> $ ExpectedTime <dttm>
#> $ ArrivalTime <dttm>
#> $ DelayTime <dbl>
#> $ TrueOnTime <lgl>
#> $ TimeTier <time> hms()
#> $ Yard <chr>
#> $ AnchorTime <dttm>
#> $ DelayTimeByAnchor <dbl>
#> $ AnchorAbbrev <chr>
#> $ OriginAbbrev <chr>
#> $ OriginPointDescript <chr>
#> $ ExpectedDepartureTime <dttm>
#> $ ActualDepartTime <dttm>
#> $ ActualLoad <int>
#> $ RouteDistance <int>
#> $ RouteTime <int>
#> $ Days <chr>
#> $ RouteName <chr>
#> $ AMPM <chr>
#> $ excludedShuttles <lgl>
#> $ OriginalVehicle <chr>
#> $ LocationDescription <chr>
#> $ LocationLat <dbl>
#> $ LocationLon <dbl>
#> $ Neighborhood <chr>
#> $ Originlatitude <dbl>
#> $ Originlongitude <dbl>
#> $ VehicleSource <chr>
#> $ LocationTotal <int>
#> $ LocationRecorded <dbl>
#> $ LocationDelayAvg <dbl>
#> $ StopVehicleTotal <int>
#> $ StopVehicleRecorded <int>
#> $ StopVehicleDelayAvg <dbl>
#> $ GPSPings <dbl>
#> $ Zone <chr>
#> $ ZonarCategory <chr>
#> $ ZonarDuration <drtn> secs
#> $ ZoneInZonar <lgl>
#> $ TypeOfBus <chr>
#> $ ArrivalTimeSource <chr>
#> $ OnTimeByAnchor <lgl>
#> $ RPOSAnchorTimesMatch <lgl>
#> $ RPOSVehiclesUnmatched <chr>
#> $ RPOSVehiclesMatch <lgl>
#> $ ArrivalInZonar <lgl>
#> $ ArrivalInOnscreen <lgl>
#> $ InRPRoute <lgl>
#> $ InOSRoute <lgl>
#> $ InArrivalWindow <lgl>
#> $ RouteChanges <chr>
#> $ Uncovered <lgl>
#> $ Unreported <lgl>
#> $ NoGoStops <dbl>
#> $ NoGo <lgl>
#> $ EarlyRelease <lgl>
#> $ MatchCount <int>
#> $ PossibleMatches <chr>
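The drive_id errors in the output above report an empty string as invalid input. Sys.getenv() returns "" for an unset variable, so one plausible cause (an inference, not something the source confirms) is that the Google Sheet ID environment variables were not set when the example was run. A minimal sketch for checking the variables named in Usage before calling report_otp() follows; missing_env() is a hypothetical helper, not part of the package, and the email addresses are placeholders.

```r
# Environment variables read by the argument defaults shown in Usage.
required_vars <- c(
  "RP_DATABASE", "OS_DATABASE", "RP_ODBC_NAME", "OTP_UNCOVERED_ID",
  "ROUTE_CHANGE_ID", "FREEPORT_OPSLOG_ID", "READVILLE_OPSLOG_ID",
  "WASHINGTON_OPSLOG_ID", "DAYSHEET_ID", "OTP_DISPATCHLOG_ID",
  "MAIL_TO", "MAIL_FROM"
)

# Hypothetical helper: names of variables that are unset or empty.
missing_env <- function(vars) {
  vars[!nzchar(Sys.getenv(vars))]
}

unset <- missing_env(required_vars)
if (length(unset) > 0) {
  message("Unset environment variables: ", paste(unset, collapse = ", "))
}

# The mail_to default splits a comma-separated MAIL_TO value into a vector.
Sys.setenv(MAIL_TO = "a@example.org,b@example.org")
mail_to <- strsplit(Sys.getenv("MAIL_TO"), ",")[[1]]
mail_to
```

These variables are typically set in a user-level or project-level .Renviron file so they are available at the start of every R session.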