Analysis metric conventions - noncesense-research-lab/archival_network GitHub Wiki

Notes

Goal: Provide standard terminology/notation/definitions for commonly-encountered metrics in MAP discussion.

Feel free to edit this page - there are more quantities that are not included yet, and some of the terminology could use improvements.

For simplicity in defining terms, assume alternative blocks have been filtered out, so there is exactly one (main-chain) solved version of the block at each height. The node may still receive or relay multiple copies of that block.

While this page currently refers to 'blocks', most of the metrics can be directly applied to transaction propagation logs as well, since MAP records txn NRTs.

Notation

NRT(H,x) is the timestamp when the node received the xth copy of the block at height H.

Example log

Let's say we have a table of block ID and node receipt timestamps (NRTs) for an archival node:

Height  // NRTs (HH:MM:SS)
------------------------------------------------------------
   H    // NRT(H,1), NRT(H,2), NRT(H,3), ...
1643586 // 00:00:00, 00:00:04, 00:00:09
1643587 // 00:02:05, 00:02:06, 00:02:11, 00:03:00, 00:03:12
1643588 // 00:04:06, 00:05:00, 00:07:05
1643589 // 00:07:04, 00:07:14, 00:07:18, 00:07:21
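
As a starting point for the metrics below, here is a minimal sketch of how the example table could be held in Python. The names `RAW_LOG`, `NRT_LOG`, `to_seconds` and `nrt` are made up for illustration and are not part of any MAP tooling. Times are converted to seconds since the start of the log, and x is 1-indexed to match NRT(H,x).

```python
# Example log from the table above: height -> receipt times (HH:MM:SS).
RAW_LOG = {
    1643586: ["00:00:00", "00:00:04", "00:00:09"],
    1643587: ["00:02:05", "00:02:06", "00:02:11", "00:03:00", "00:03:12"],
    1643588: ["00:04:06", "00:05:00", "00:07:05"],
    1643589: ["00:07:04", "00:07:14", "00:07:18", "00:07:21"],
}


def to_seconds(hhmmss):
    """Convert 'HH:MM:SS' to seconds since the start of the log."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


# Sort each row so index 0 is the first copy and index -1 is the last copy.
NRT_LOG = {
    height: sorted(to_seconds(t) for t in times)
    for height, times in RAW_LOG.items()
}


def nrt(height, x):
    """NRT(H, x): receipt time of the x-th copy (1-indexed) at a height."""
    return NRT_LOG[height][x - 1]


print(nrt(1643587, 1))  # 125, i.e. 00:02:05
```

Note that the last copy of 1643588 (00:07:05) arrives after the first copy of 1643589 (00:07:04), so copies of consecutive blocks can overlap in time.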

Metrics

Block discovery waiting time

How long did it take for somebody to solve block H?

W(H) := NRT(H,first) - NRT(H-1, first)

A histogram of this quantity over a range of heights tells us about mining activity (see the equivalent computed from miner-reported timestamps, MRTs, in altchain_temporal_study.ipynb).
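
A small sketch of W(H) applied to the example log, using first-receipt times in seconds. The function name `waiting_times` is an assumption for illustration. Heights without a logged predecessor are skipped rather than guessed.

```python
# First receipt time per height, in seconds, taken from the example log.
NRT_FIRST = {1643586: 0, 1643587: 125, 1643588: 246, 1643589: 424}


def waiting_times(nrt_first):
    """W(H) = NRT(H, first) - NRT(H-1, first) for every H whose parent is logged."""
    result = {}
    for height in sorted(nrt_first):
        if height - 1 in nrt_first:
            result[height] = nrt_first[height] - nrt_first[height - 1]
    return result


W = waiting_times(NRT_FIRST)
print(W)  # {1643587: 125, 1643588: 121, 1643589: 178}
```

The first height in the log has no W value, because the receipt time of its predecessor is unknown.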

Broadcast delay &/or timestamp spoofing

The difference between a block's miner-reported timestamp (MRT) and the time it actually reached the network, approximated here by the node's first receipt. Maybe call this D, for 'delay'.

D(H) := NRT(H,first) - MRT(H)

(don't need to specify first or last for MRT since it will be the same in all copies)

A histogram of this quantity over a range of heights would in principle provide information about network latency. However, timestamp spoofing is common, and it becomes the more interesting feature of this histogram (included in block_timestamp_analysis.ipynb).
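
A sketch of D(H) follows. The example log has no MRT column, so the `MRT` values here are purely hypothetical placeholders chosen to show both signs of D. They do not come from any real block. The function name `broadcast_delays` is likewise an assumption.

```python
# First receipt time per height, in seconds, taken from the example log.
NRT_FIRST = {1643586: 0, 1643587: 125, 1643588: 246, 1643589: 424}

# HYPOTHETICAL miner-reported timestamps on the same clock (seconds).
# These are illustrative only and are not taken from real blocks.
MRT = {1643586: -2, 1643587: 120, 1643588: 600, 1643589: 420}


def broadcast_delays(nrt_first, mrt):
    """D(H) = NRT(H, first) - MRT(H) for heights present in both inputs."""
    return {h: nrt_first[h] - mrt[h] for h in sorted(nrt_first) if h in mrt}


D = broadcast_delays(NRT_FIRST, MRT)

# A negative D means the miner's timestamp is later than our first receipt,
# which cannot be ordinary latency (clock skew or spoofing are candidates).
negative = [h for h, delay in D.items() if delay < 0]

print(D)
print(negative)
```

With these placeholder values only height 1643588 yields a negative D. Sign alone does not distinguish spoofing from clock skew, so any threshold would need to be chosen from real data.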

Block broadcast window

What is the time difference between first and last receipt of a certain block by a given node?

B(H) := NRT(H,last) - NRT(H,first)

What are the implications? What would a histogram of this show us? Essentially, the time envelope for bursts of network activity around block discovery times. This might be an interesting way to heuristically detect a running node by network traffic rates, even if actual content is concealed by VPN, etc.
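
A sketch of B(H) on the example log, with receipt times in seconds. The function name `broadcast_windows` is an assumption for illustration. `min` and `max` are used so the result does not depend on the order in which copies were logged.

```python
# Example log: height -> receipt times in seconds since the start of the log.
NRT_LOG = {
    1643586: [0, 4, 9],
    1643587: [125, 126, 131, 180, 192],
    1643588: [246, 300, 425],
    1643589: [424, 434, 438, 441],
}


def broadcast_windows(nrt_log):
    """B(H) = NRT(H, last) - NRT(H, first) for each height."""
    return {h: max(times) - min(times) for h, times in nrt_log.items() if times}


B = broadcast_windows(NRT_LOG)
print(B)  # {1643586: 9, 1643587: 67, 1643588: 179, 1643589: 17}
```

A block received only once has B(H) = 0, so a histogram of B may be easier to interpret alongside the receipt count.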

Block receipt count

How many times do we receive a copy of a given block?

C(H) := # of NRT entries for height H

What does this tell us?
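
For completeness, C(H) on the example log is just the number of entries in each row. The function name `receipt_counts` is an assumption for illustration.

```python
# Example log: height -> receipt times in seconds since the start of the log.
NRT_LOG = {
    1643586: [0, 4, 9],
    1643587: [125, 126, 131, 180, 192],
    1643588: [246, 300, 425],
    1643589: [424, 434, 438, 441],
}


def receipt_counts(nrt_log):
    """C(H) = number of NRT entries recorded for each height."""
    return {h: len(times) for h, times in nrt_log.items()}


C = receipt_counts(NRT_LOG)
print(C)  # {1643586: 3, 1643587: 5, 1643588: 3, 1643589: 4}
```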

Global block propagation time

How long does it take for a broadcast to propagate across the network to the last node? To express this, extend the notation:

NRT(N,H,x) is the timestamp when MAP node N received the xth copy of the block at height H.

Suppose MAP node 'orange' is the first to hear a block, and MAP node 'ginger' is the last to hear about that block. Then we are interested in

G(H) := NRT(ginger, H, first) - NRT(orange, H, first)

More generally,

G(H) := NRT(last node, H, first copy) - NRT(first node, H, first copy)

This metric (especially when weighed against the block discovery waiting time W(H)) can be used to estimate the expected number of orphaned blocks due to natural causes.
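
A sketch of G(H) over several MAP nodes. The node names 'orange' and 'ginger' follow the example above. The node 'tabby', all timestamps in `MULTI_NODE_LOG`, and the function name `global_propagation_time` are hypothetical and only illustrate the calculation.

```python
# HYPOTHETICAL multi-node log: node -> height -> receipt times in seconds.
# The timestamps are illustrative and do not come from real MAP nodes.
MULTI_NODE_LOG = {
    "orange": {1643586: [0, 4, 9]},
    "ginger": {1643586: [3, 5]},
    "tabby": {1643586: [1, 2, 8]},  # hypothetical third node
}


def global_propagation_time(log, height):
    """G(H): latest first-receipt minus earliest first-receipt across nodes.

    Returns None when fewer than two nodes logged the block.
    """
    first_receipts = [
        min(heights[height])
        for heights in log.values()
        if heights.get(height)  # skip nodes with no copies at this height
    ]
    if len(first_receipts) < 2:
        return None
    return max(first_receipts) - min(first_receipts)


G = global_propagation_time(MULTI_NODE_LOG, 1643586)
print(G)  # 3: 'ginger' first hears the block 3 s after 'orange'
```

This only measures the spread among the monitored nodes, so it is a lower bound on propagation time across the whole network. It also assumes the nodes' clocks are synchronized.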