Caseflow eFolder Monitoring - department-of-veterans-affairs/caseflow GitHub Wiki

Current Monitoring and Metrics

Raven/Sentry

Direct Raven/Sentry Calls

| Where in Code | Key ID | Method Used | Description |
| --- | --- | --- | --- |
| ExceptionLogger.capture() | n/a | Raven.capture_exception | General use for logging exceptions in the application to Sentry |
| JobDataDogMetricMiddleware.call() | job_class<br>queue<br>current_user.id<br>current_user.email<br>current_user.ip_address<br>current_user.station_id | Raven.capture_exception | Logs errors to Sentry when an error occurs while attempting to send a Datadog gauge message |
| VeteranFinder.find_duplicate_bgs_rec() | bgs_rec_numbers | Raven.capture_exception | Logs an error when the BGS lookup number does not match any SSN or claim number |
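
The capture pattern in the table above could look roughly like the sketch below. It is only an illustration: `ExceptionLoggerSketch` and `FakeErrorTracker` are hypothetical names, and the fake tracker stands in for the Sentry client so the example runs without the Raven gem. The `extra:` option mirrors the way context such as `bgs_rec_numbers` is typically attached to a captured exception.

```ruby
require "logger"
require "stringio"

# Hypothetical stand-in for the Sentry client. The real application calls
# Raven.capture_exception; this fake only records what it receives.
class FakeErrorTracker
  attr_reader :events

  def initialize
    @events = []
  end

  def capture_exception(error, extra: {})
    @events << { error: error, extra: extra }
  end
end

# Sketch of a shared capture helper: write the exception to the application
# log at error level, then forward it to the error tracker.
class ExceptionLoggerSketch
  def initialize(logger:, tracker:)
    @logger = logger
    @tracker = tracker
  end

  def capture(error, extra: {})
    @logger.error("#{error.class}: #{error.message}")
    @tracker.capture_exception(error, extra: extra)
  end
end

log_output = StringIO.new
tracker = FakeErrorTracker.new
exception_logger = ExceptionLoggerSketch.new(logger: Logger.new(log_output), tracker: tracker)

begin
  raise ArgumentError, "no SSN or claim number matched"
rescue StandardError => e
  # The key names passed in `extra` are illustrative.
  exception_logger.capture(e, extra: { bgs_rec_numbers: %w[111 222] })
end
```

Injecting the logger and tracker keeps the helper testable without a network connection to Sentry.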

ExceptionLogger Calls

| Where in Code | Key ID | Method Used | Description |
| --- | --- | --- | --- |
| Api::V1::ApplicationController<br>rescue_from StandardError | n/a | ExceptionLogger.capture | Logs all StandardErrors that occur within the API controller |
| ManifestFetcher.process() | n/a | ExceptionLogger.capture | Logs any exceptions that occur while processing Manifests |

Datadog

Direct Datadog Calls

| Where in Code | Key ID | Method Used | Description |
| --- | --- | --- | --- |
| DataDogService.increment_counter() | stat_name<br>tags | Datadog::Statsd.increment | Increments the given stat counter in Datadog |
| DataDogService.emit_gauge() | stat_name<br>metric_value<br>tags | Datadog::Statsd.gauge | Sends metric data to Datadog as a gauge |
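
As a rough illustration of the two wrapper methods above, the sketch below forwards a counter and a gauge to a StatsD-style client. `DataDogServiceSketch` and `RecordingStatsd` are hypothetical names, the `env` base tag is an assumption, and the recording client only mimics the call shape of `Datadog::Statsd#increment` and `#gauge` so the example runs without the dogstatsd gem.

```ruby
# Hypothetical stand-in that records calls instead of sending packets.
# Its methods mimic the call shape of Datadog::Statsd#increment and #gauge.
class RecordingStatsd
  attr_reader :calls

  def initialize
    @calls = []
  end

  def increment(stat, tags: [])
    @calls << [:increment, stat, tags]
  end

  def gauge(stat, value, tags: [])
    @calls << [:gauge, stat, value, tags]
  end
end

# Sketch of a thin service that adds shared tags before delegating.
class DataDogServiceSketch
  def initialize(statsd:, env:)
    @statsd = statsd
    @base_tags = ["env:#{env}"] # assumed base tag, not taken from the source
  end

  def increment_counter(stat_name:, tags: [])
    @statsd.increment(stat_name, tags: @base_tags + tags)
  end

  def emit_gauge(stat_name:, metric_value:, tags: [])
    @statsd.gauge(stat_name, metric_value, tags: @base_tags + tags)
  end
end

statsd = RecordingStatsd.new
service = DataDogServiceSketch.new(statsd: statsd, env: "test")
service.increment_counter(stat_name: "request_attempt", tags: ["service:bgs"])
service.emit_gauge(stat_name: "request_latency", metric_value: 0.42, tags: ["service:bgs"])
```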

DataDogService Calls

| Where in Code | Key ID | Method Used | Description |
| --- | --- | --- | --- |
| ApplicationJob.datadog_report_runtime() | metric_group_name | DataDogService.record_runtime | Records how long a job took to run within a given metric group (never used) |
| ApplicationJob.datadog_report_time_segment() | segment | DataDogService.emit_gauge | Records how long a segment of code took to run (never used) |
| CollectDataDogMetrics.emit_datadog_point() | type<br>db_name | DataDogService.emit_gauge | Records a count of how many database connections are active, dead, or idle |
| config.ru, emit_datadog_point() | type | DataDogService.emit_gauge | Records the number of Puma worker threads that are idle or active |
| JobDataDogMetricMiddleware.call() | job_class | DataDogService.emit_gauge | Logs how long each job takes and tags that job with its job_class |
| MetricsService.record() | service | DataDogService.emit_gauge | Records the latency of a given service when used in Production |
| MetricsService.increment_datadog_counter() | metric_name<br>service | DataDogService.increment_counter | Keeps a count of how many times a request for a service has been called, and how many times those requests have resulted in an error |
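
The job-timing row above, together with the Sentry fallback listed for `JobDataDogMetricMiddleware`, suggests a pattern like the following sketch. The class name, the `job_duration_seconds` stat name, and the injected lambdas are all hypothetical; the real middleware's signature is not shown in this page.

```ruby
# Sketch: time a job, emit the duration as a gauge tagged with job_class,
# and report to the error tracker if emitting the gauge itself fails.
class JobTimingMiddlewareSketch
  def initialize(emit_gauge:, capture_exception:)
    @emit_gauge = emit_gauge
    @capture_exception = capture_exception
  end

  def call(job_class, queue)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
  ensure
    report(job_class, queue, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started)
  end

  private

  def report(job_class, queue, seconds)
    @emit_gauge.call(
      stat_name: "job_duration_seconds", # hypothetical stat name
      metric_value: seconds,
      tags: ["job_class:#{job_class}"]
    )
  rescue StandardError => e
    # A metrics failure should not fail the job, so it is only reported.
    @capture_exception.call(e, extra: { job_class: job_class, queue: queue })
  end
end

gauges = []
captured = []
middleware = JobTimingMiddlewareSketch.new(
  emit_gauge: ->(**point) { gauges << point },
  capture_exception: ->(error, extra:) { captured << [error.message, extra] }
)
result = middleware.call("V2::DownloadManifestJob", "default") { :done }
```

A monotonic clock is used so that wall-clock adjustments cannot produce negative or inflated durations.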

CollectDataDogMetrics Calls

| Where in Code | Key ID | Method Used | Description |
| --- | --- | --- | --- |
| HealthChecksController:2 | db_name | include CollectDataDogMetrics | Includes a module whose callback method sends monitoring data to Datadog |
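
To make the connection-count gauges concrete, here is a minimal sketch that turns per-state connection counts into one gauge point per state. The module name, the `database.<state>` stat names, and the shape of the input hash are assumptions for illustration only.

```ruby
# Sketch: build one gauge point for each connection state of a database.
module ConnectionGaugeSketch
  STATES = %i[active dead idle].freeze

  def self.points_for(db_name, counts)
    STATES.map do |state|
      {
        stat_name: "database.#{state}", # hypothetical naming scheme
        metric_value: counts.fetch(state, 0),
        tags: ["db_name:#{db_name}"]
      }
    end
  end
end

# States missing from the input are reported as zero.
points = ConnectionGaugeSketch.points_for("primary", { active: 3, idle: 7 })
```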

MetricsService Calls

| Where in Code | Key ID | Method Used | Description |
| --- | --- | --- | --- |
| ExternalApi::BGSService.fetch_veteran_info() | file_number | MetricsService.record | Records the time it takes to fetch veteran info |
| ExternalApi::BGSService.fetch_poa_by_file_number() | file_number | MetricsService.record | Records the time it takes to fetch a POA by file number |
| ExternalApi::BGSService.fetch_poa_by_participant_id() | participant_id | MetricsService.record | Records the time it takes to fetch a POA by participant ID |
| ExternalApi::BGSService.fetch_poa_org_record() | participant_id | MetricsService.record | Records the time it takes to fetch a POA org record by participant ID |
| ExternalApi::BGSService.fetch_poas_by_participant_ids() | participant_ids | MetricsService.record | Records the time it takes to fetch a list of POAs that are associated with the given participant IDs |
| ExternalApi::BGSService.fetch_claims_for_file_number() | file_number | MetricsService.record | Records the time it takes to fetch a list of claims associated with a given file number |
| ExternalApi::BGSService.fetch_person_info() | participant_id | MetricsService.record | Records the time it takes to fetch data for a given participant |
| ExternalApi::BGSService.fetch_person_by_ssn() | ssn | MetricsService.record | Records the time it takes to fetch a person for a given SSN |
| ExternalApi::BGSService.check_sensitivity() | file_number | MetricsService.record | Records the time it takes to check whether or not a given file number can be accessed |
| ImageConverterService.convert_tiff_to_pdf() | tiff_to_convert | MetricsService.record | Records the time it takes to convert a TIFF to a PDF |
| ManifestFetcher.documents_from_service_for() | file_number | MetricsService.record | Records the time it takes to fetch documents |
| RecordFetcher.content_from_va_service() | record.file_number<br>record.manifest_source.name | MetricsService.record | Records the time it takes to fetch a document file from a VA manifest source |
| RecordFetcher.content_from_va_service() | record.s3_filename | MetricsService.record | Records the time it takes to convert an S3 image file |
| RecordFetcher.content_from_va_service() | record.s3_filename | MetricsService.record | Records the time it takes to store an S3 file |
| RecordFetcher.content_from_s3() | record.s3_filename<br>record.file_number | MetricsService.record | Records the time it takes to fetch an S3 file |
| ExternalApi::VBMSService.send_and_log_request() | id<br>request.class | MetricsService.record | Shared method for recording the time it takes for a request to run |
| ExternalApi::VBMSService.call_and_log_service() | vbms_id<br>service.class | MetricsService.record | Shared method for recording the time it takes for a service to run |
| ExternalApi::VBMSService.fetch_documents_for() | download.file_number<br>request.class | self.send_and_log_request | Records the time it takes to fetch documents for a download |
| ExternalApi::VBMSService.v2_fetch_documents_for() | veteran_file_number<br>request.class | self.send_and_log_request | Records the time it takes to fetch documents associated with a given veteran file number |
| ExternalApi::VBMSService.fetch_delta_documents_for() | veteran_file_number<br>request.class | self.send_and_log_request | Records the time it takes to fetch documents associated with a given veteran file number and time period |
| ExternalApi::VBMSService.fetch_document_file() | document.document_id<br>request.class | self.send_and_log_request | Records the time it takes to fetch a document |
| ExternalApi::VBMSService.v2_fetch_document_file() | document.document_id<br>request.class | self.send_and_log_request | Records the time it takes to fetch a document |
| ExternalApi::VBMSService.v2_fetch_documents_for() | veteran_file_number<br>service.class | self.call_and_log_service | Records the time it takes to find documents associated with a given veteran file number |
| ExternalApi::VVAService.fetch_documents_for() | download.file_number | MetricsService.record | Records the time it takes to fetch documents associated with a given claim number |
| ExternalApi::VVAService.v2_fetch_documents_for() | file_number | MetricsService.record | Records the time it takes to fetch documents associated with a given claim number |
| ExternalApi::VVAService.fetch_document_file() | document_id | MetricsService.record | Records the time it takes to fetch data for a given document |
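
Every row above wraps an external call in a timing block. A minimal sketch of that pattern, assuming a `record` method that takes a description and a block, is shown below. The class name, the `request_attempt` and `request_error` metric names, the log wording, and the injected lambdas are hypothetical and only illustrate the start/finish/rescue behavior described in this page.

```ruby
require "logger"
require "stringio"

# Sketch: log start and finish, count attempts and errors, and emit the
# latency of the wrapped block as a gauge.
class MetricsServiceSketch
  def initialize(logger:, emit_gauge:, increment_counter:)
    @logger = logger
    @emit_gauge = emit_gauge
    @increment_counter = increment_counter
  end

  def record(description, service:, name:)
    @logger.info("STARTED #{description}")
    @increment_counter.call(metric_name: "request_attempt", service: service, name: name)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    result = yield
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    @emit_gauge.call(service: service, name: name, latency_seconds: elapsed)
    @logger.info("FINISHED #{description} in #{format('%.3f', elapsed)}s")
    result
  rescue StandardError => e
    @increment_counter.call(metric_name: "request_error", service: service, name: name)
    @logger.info("RESCUED #{description}: #{e.class}")
    raise # re-raise so callers still see the failure
  end
end

log_output = StringIO.new
gauges = []
counters = []
metrics = MetricsServiceSketch.new(
  logger: Logger.new(log_output),
  emit_gauge: ->(**point) { gauges << point },
  increment_counter: ->(**point) { counters << point[:metric_name] }
)

veteran = metrics.record("BGS: fetch veteran info", service: :bgs, name: "fetch_veteran_info") do
  { file_number: "000000000" } # placeholder for the real BGS call
end
```

Returning the block's value lets callers wrap an existing call without changing how its result is used.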

Rails Logs / Shoryuken

Direct Rails Logger Calls

| Where in Code | Key ID | Method Used | Level | Description |
| --- | --- | --- | --- | --- |
| ApplicationController.authenticate() | request.original_url<br>request.referer | Rails.logger | info | Logs the page the user tried to access and the page the user came from when the user is not authenticated |
| Api::V2::ApplicationController.fetch_veteran_by_file_number() | file_number<br>current_user | Rails.logger | debug | Logs the current user and file number they are attempting to fetch |
| config.ru:44 | Process.pid<br>thread_count<br>backlog<br>waiting<br>@workers | Rails.logger | info | Logs Puma stats |
| DocumentCreator.create() | ft_delta_documents<br>css_id | Rails.logger | info | Logs whether or not the current user can use the feature toggle |
| DocumentCreator.create() | manifest_source | Rails.logger | info | Logs the manifest source ID of a record whose download was attempted after it had already been downloaded |
| V2::DownloadManifestJob.perform() | manifest_source<br>manifest<br>file_number | Rails.logger | error | Logs the error that occurred while a user was attempting to download a file, with the name of the error and the appeal file number |
| V2::DownloadManifestJob.perform() | file_number<br>docs.map(&:id) | log_info / Rails.logger | info | Logs the file number, documents in the file, and zipfile size when a user attempts to download a file |
| ExceptionLogger.capture() | n/a | Rails.logger | error | General use for logging exceptions in the application to Rails Logs |
| ManifestFetcher.fetch_documents_or_delta_documents_for() | file_number<br>docs<br>ft_delta_documents | Rails.logger | info | Logs the file number, documents in the file, and file size of documents fetched from VBMS |
| MetricsService.record() | description<br>stopwatch | Rails.logger | info | Shared method which logs the start of the task description sent to it and the finish with time taken, or logs that the task was rescued |
| PdfService.write() | size | Rails.logger | warn | Logs that a PDF of the size passed cannot be written, along with the PDF path, filename, and an error |
| RailsVBMSLogger.log() | response_code<br>response_body<br>request_class_name | Rails.logger | error | Logs the HTTP response code along with the request class name and the response body |
| RecordFetcher.process() | n/a | process | error | Logs an error if records cannot be fetched from VBMS or VVA |
| SessionsController.create() | user<br>css_id | Rails.logger | info | Logs the user who successfully signed in and the path they are being redirected to |
| SessionsController.create() | n/a | Rails.logger | error | Logs the error that occurred while a user was attempting to log in, with the code location where it failed |
| Shoryuken.rb | n/a | Shoryuken.configure_server | n/a | Sets the Rails logger to use Shoryuken Logging |
| ExternalApi::VBMSService.fetch_documents_for() | download | Rails.logger | info | Logs the length of the document list downloaded from VBMS |
| ExternalApi::VBMSService.v2_fetch_documents_for() | veteran_file_number | Rails.logger | info | Logs the length of the VBMS document list for a veteran's file number |
| ExternalApi::VVAService.fetch_documents_for() | download | Rails.logger | info | Logs the length of the VVA document list for the download |
| ExternalApi::VVAService.v2_fetch_documents_for() | file_number | Rails.logger | info | Logs the length of the VVA document list for the download |

MetricsService Calls

| Where in Code | Key ID | Method Used | Description |
| --- | --- | --- | --- |
| ExternalApi::BGSService.fetch_veteran_info() | file_number | MetricsService.record | Logs that fetching veteran info by file number has started and then finished with the time taken, or that there was an error |
| ExternalApi::BGSService.fetch_poa_by_file_number() | file_number | MetricsService.record | Logs that fetching a POA has started and then finished with the time taken, or that there was an error |
| ExternalApi::BGSService.fetch_poa_by_participant_id() | participant_id | MetricsService.record | Logs that fetching a POA by claimant and participant ID has started and then finished with the time taken, or that there was an error |
| ExternalApi::BGSService.fetch_poa_org_record() | participant_id | MetricsService.record | Logs that fetching a POA by org and participant ID has started and then finished with the time taken, or that there was an error |
| ExternalApi::BGSService.fetch_poas_by_participant_ids() | participant_ids | MetricsService.record | Logs that fetching POAs by org and participant IDs has started and then finished with the time taken, or that there was an error |
| ExternalApi::BGSService.fetch_claims_for_file_number() | file_number | MetricsService.record | Logs that fetching claims by file number has started and then finished with the time taken, or that there was an error |
| ExternalApi::BGSService.fetch_person_info() | participant_id | MetricsService.record | Logs that fetching a person's info by participant ID has started and then finished with the time taken, or that there was an error |
| ExternalApi::BGSService.fetch_person_by_ssn() | ssn | MetricsService.record | Logs that fetching a person by SSN has started and then finished with the time taken, or that there was an error |
| ExternalApi::BGSService.check_sensitivity() | file_number | MetricsService.record | Logs that checking if a client can access a given file number has started and then finished with the time taken, or that there was an error |
| ImageConverterService.convert_tiff_to_pdf() | image_magick | MetricsService.record | Logs that converting a TIFF to a PDF has started and then finished with the time taken, or that there was an error |
| ManifestFetcher.documents_from_service_for() | manifest_source.name<br>file_number | MetricsService.record | Logs that fetching documents or delta documents for a file number has started and then finished with the time taken, or that there was an error |
| RecordFetcher.content_from_va_service() | manifest_source.name<br>file_number<br>s3_filename | MetricsService.record | Logs that fetching a record from a manifest source for a file number has started and then finished with the time taken, or that there was an error |
| RecordFetcher.content_from_s3() | s3_filename<br>file_number | MetricsService.record | Logs that fetching a record from S3 for a file number has started and then finished with the time taken, or that there was an error |
| ExternalApi::VBMSService.send_and_log_request() | request<br>id | MetricsService.record | Logs that sending a request for a given ID has started and then finished with the time taken, or that there was an error |
| ExternalApi::VBMSService.call_and_log_service() | service<br>vbms_id | MetricsService.record | Logs that calling a service for a VBMS ID has started and then finished with the time taken, or that there was an error |
| ExternalApi::VVAService.fetch_documents_for() | download.file_number | MetricsService.record | Logs that getting the document list for a claim has started and then finished with the time taken, or that there was an error |
| ExternalApi::VVAService.fetch_document_file() | document.document_id | MetricsService.record | Logs that getting the document list for a claim has started and then finished with the time taken, or that there was an error |

Metrics and Monitoring Suggestions

Keys/Options

Services

  • Rails (Logging) - Debugging and security, uses Shoryuken
  • DataDog - Metrics
  • Sentry - Security and logging, uses Raven

Type

  • Info
  • Security
  • Errors
  • Metrics

Controllers

Api::V1::ApplicationController

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| unauthorized() | Sentry<br>Rails | Security - station_id, css_id |
| sensitive_record() | Sentry | Security - calls forbidden() |
| forbidden() | Sentry<br>Rails | Security - user_id, file_number or manifest_id |
| record_not_found() | Sentry<br>Rails | Info - Record ID |
| authenticate_or_authorize() | Sentry<br>Rails | Info - station_id, css_id<br>Security - calls unauthorized() |
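
One possible way to send the same security event to both Rails logs and Sentry, as suggested for `unauthorized()`, is sketched below. `SecurityEventSketch`, the JSON log format, the `warn` level, and the injected `report_message` lambda (standing in for a Sentry message call) are assumptions, not existing application code.

```ruby
require "json"
require "logger"
require "stringio"

# Sketch: build the event context once, then hand it to both sinks so the
# log line and the Sentry event carry identical fields.
class SecurityEventSketch
  def initialize(logger:, report_message:)
    @logger = logger
    @report_message = report_message
  end

  def unauthorized(station_id:, css_id:)
    context = { event: "unauthorized", station_id: station_id, css_id: css_id }
    @logger.warn(JSON.generate(context))
    @report_message.call("Unauthorized API request", extra: context)
  end
end

log_output = StringIO.new
reports = []
events = SecurityEventSketch.new(
  logger: Logger.new(log_output),
  report_message: ->(message, extra:) { reports << [message, extra] }
)
events.unauthorized(station_id: "101", css_id: "TEST_USER")
```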

Api::V2::ApplicationController

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| veteran_not_found() | Rails<br>Sentry | Errors - file_number |
| vso_denied_record() | Rails<br>Sentry | Security - calls forbidden() |
| verify_veteran_file_number() | Rails<br>DataDog<br>Sentry | Metrics - time taken to verify, user cache/queue info (suggestions: documents already open, attempts count, last attempt time)<br>Info - file_number |
| fetch_veteran_by_file_number() | Sentry | Security - user_id, file_number, can call sensitive_record() or vso_denied_record() |

Api::V2::FilesDownloadsController

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| start() | Rails<br>Sentry<br>DataDog | Info - @files_download<br>Metrics - time taken to download and finish, size of manifest |

Api::V2::ManifestsController

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| start() | Rails<br>Sentry<br>DataDog | Info - user_id, file_number<br>Metrics - time taken to download and finish, size of manifest |
| refresh() | Rails<br>Sentry<br>DataDog | Info - user_id, file_number<br>Metrics - time taken to download and finish, size of manifest |

Api::V2::RecordsController

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| document_failed() | Rails<br>Sentry | Errors - manifest_source, version_id, document_id |
| record() | Rails<br>Sentry<br>DataDog | Info - manifest_source, version_id, document_id<br>Metrics - time taken to download and finish, size of document |

ErrorsController

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| show() | Rails<br>Sentry | Errors - general error info, status_code |

SessionsController

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| build_user() | Rails<br>Sentry<br>DataDog | Info - username, css_id, station_id<br>Metrics - time taken to finish<br>Security - if fails - username, css_id, station_id |

Jobs

V2::DownloadManifestJob

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| perform() | Datadog | Metrics - time taken to start and end a manifest download, manifest size, and document count |

V2::PackageFilesJob

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| perform() | Datadog | Metrics - time taken to create a zipfile, zipfile size |

V2::SaveFilesInS3Job

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| perform() | Datadog | Metrics - time taken to save all the records to S3, record count, record size |

ApplicationJob

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| datadog_report_runtime() | Datadog | Metrics - leverage this in other jobs |
| datadog_report_time_segment() | Datadog | Metrics - leverage this in other jobs |
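
Leveraging a shared runtime helper from individual jobs might look like the sketch below, which also emits the zipfile size suggested for `V2::PackageFilesJob`. Both classes, the helper names, and the metric names are hypothetical, and joining strings is only a placeholder for building a real zipfile.

```ruby
# Sketch of a base job with shared metric helpers.
class ApplicationJobSketch
  def initialize(emit_gauge:)
    @emit_gauge = emit_gauge
  end

  # Times the block and emits its duration under the given metric group.
  def report_runtime(metric_group_name)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    yield
  ensure
    elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    @emit_gauge.call(group: metric_group_name, name: "runtime_seconds", value: elapsed)
  end

  def report_value(metric_group_name, name, value)
    @emit_gauge.call(group: metric_group_name, name: name, value: value)
  end
end

# Sketch of a job that reuses the helpers from the base class.
class PackageFilesJobSketch < ApplicationJobSketch
  def perform(records)
    report_runtime("package_files_job") do
      zipfile = records.join # placeholder for building a real zipfile
      report_value("package_files_job", "zipfile_bytes", zipfile.bytesize)
      zipfile
    end
  end
end

points = []
job = PackageFilesJobSketch.new(emit_gauge: ->(**point) { points << point })
zipfile = job.perform(%w[abc defg])
```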

Services

POAMapper

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| get_claimant_poa_from_bgs_poa() | DataDog<br>Sentry | Metrics - time taken to create bgs_rep object<br>Info - bgs_record<br>Errors |
| get_claimant_poa_from_bgs_claimants_poa() | DataDog<br>Sentry | Metrics - time taken to create object<br>Info - bgs_record<br>Errors |
| get_hash_of_poa_from_bgs_poas() | DataDog<br>Sentry | Metrics - time taken to create the hash<br>Info - bgs_resp<br>Errors |
| get_poa_from_bgs_org_poa() | DataDog<br>Sentry | Metrics - time taken to create object<br>Info - bgs_rep<br>Errors |

ExternalApi::BGSService

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| parse_veteran_info() | Sentry | Info - veteran_data |
| parse_person_info() | Sentry | Info - bgs_info |
| fetch_user_info() | Sentry | Security |
| fetch_veteran_info_cache_key() | Rails | Info - file_number, client |

DocumentCounter

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| count() | DataDog | Metrics - time taken to run the count, file_numbers count |

DocumentCreator

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| create() | DataDog | Metrics - time taken to create a document, external document count, external document sizes |

ManifestFetcher

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| fetch_documents() | DataDog | Metrics - time taken to fetch all the documents, file number count, document sizes |

PdfService

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| write() | Rails<br>DataDog | Info - contents.size, command<br>Errors<br>Metrics - time taken to write out the PDF, PDF size |
| write_attributes_file() | DataDog | Metrics - time taken to write the file |

UserAuthorizer

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| General | Sentry | Security |

VeteranFinder

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| find() | Rails | Info - file_number, bgs_rec_numbers |
| find_uniq_file_numbers() | DataDog | Metrics - time taken to find all the file numbers, file number count |
| find_duplicate_bgs_rec() | DataDog | Metrics - time taken to find all the duplicate file numbers, file number count |

ZipfileCreator

| Method | Service(s) | Type(s) |
| --- | --- | --- |
| process() | DataDog | Metrics - records count, time taken to create a zipfile, total size of records, size of zipfile |
| write_to_tempfile() | Rails | Info - t.path, record |