Custom ring composition spoils Monero fungibility - noncesense-research-lab/archival_network GitHub Wiki

Custom ring composition spoils Monero fungibility

Documentation by IsthmusCrypto

TL;DR:

This wiki page documents how a series of Monero transactions can be traced using two known analyses (unusual ring size and churn timing). Furthermore, a new heuristic is discovered: identifying sets of transactions built with non-standard decoy selection algorithms!

Background

Recent conversations about the transition to a fixed ringsize included requests for evidence that custom ringsizes are detrimental to Monero privacy.

The community has already reached a broad consensus in favor of fixed ringsizes, motivated by several statistically-sound arguments articulated in the above link. Since custom (> default) ringsizes become a distinguishing feature for those transactions, they are no longer indistinguishable from the rest of the anonymity pool that is using the mandatory ringsize (> 95% of transactions).

(general background on empirical analysis of the Monero blockchain can be found in this post and the references therein)

Theory

(rough estimate with several approximations) Suppose X(R) is the fraction of transactions in the recent zone that use a ringsize > R (e.g. from stoffu's charts, X(40) ~ 0.02, i.e. 98% of txns use ≤ 40 ring members)

Taking into account random selection of large ring-size decoys: define P(R,M) as the probability of including an extant output made by a ringsize > R in your new transaction with M ring members. P(R,M) ~ X(R)*M

The likelihood of an N-long chain of sequential transactions each including an extant output made by > R ring members would be P(R,M)^N = [X(R)*M]^N, right? In other words, there's a vanishingly small chance of finding 3 txns in a row with ringsize > 40…

Note that the above statistics only consider the chances for decoy selection, and do not take into account that a custom ringsize is itself uncommon. Thus the practical probability of these large ring-size chains occurring is even (significantly) lower than the P(R,M)^N calculated above, if one takes into account the inherent improbability of non-default ringsizes.
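The rough estimate above can be sketched numerically. This is a minimal sketch, not part of the original analysis: the function names are hypothetical, M = 7 is only an illustrative ring size, and `p_independent` adds an assumption the text does not make (M independent draws), included because the linear form X(R)*M is only reasonable while X(R)*M is small.

```python
def p_linear(x, m):
    """Linear approximation from the text: P(R, M) ~ X(R) * M, capped at 1."""
    return min(1.0, x * m)


def p_independent(x, m):
    """Alternative under an ADDED assumption: m independent draws, each
    picking a large-ring output with probability x."""
    return 1.0 - (1.0 - x) ** m


def chain_probability(p, n):
    """Probability of an n-long chain, treating each step as independent."""
    return p ** n


x40 = 0.02  # X(40) ~ 0.02, the value quoted in the text
m = 7       # illustrative ring size of the new transaction (assumption)
n = 3       # chain length used in the text's example

print("linear approx :", chain_probability(p_linear(x40, m), n))
print("independent   :", chain_probability(p_independent(x40, m), n))
```

Both forms ignore how rare custom ringsizes are in the first place, which is the point made in the note above.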

Motivation

I thought it would be fun to show an example demonstrating the damage that large ringsizes can cause! Fishing around for signatures of anomalous behaviors falls right in the ballpark of #noncesense-research-lab :- )

A case study in worst practices

It is trivial to follow the flow of funds from output 48f0a7cf371fff4bb594b36bf5a03974ee9bf639943dc18b1a4c13745fdb6b20

The owner churned a handful of times, with their wallet ringsize = 41 for all transactions, so the funds can be traced via the naked eye and a blockchain explorer. In all txns, the output actually being spent sticks out like a sore thumb from the decoys. For example, the ringsizes of the transactions that created each ring member look like: {7, 8, 7, 7, 8, 11, 7, 8, 41, 7, 8, 8, 7, 7, 7, …}
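The "sore thumb" check can be sketched in a few lines. This sketch reads the list above as the ringsize of the transaction that created each ring member (an interpretation; the text does not label the list), uses only the visible part of the list, and borrows the threshold of 40 from the Theory section. The function name is hypothetical.

```python
def flag_large_ring_members(source_ringsizes, threshold=40):
    """Return the positions of ring members whose creating transaction
    used more than `threshold` ring members."""
    return [i for i, size in enumerate(source_ringsizes) if size > threshold]


# Visible prefix of the list quoted above; the elided tail is not guessed.
source_ringsizes = [7, 8, 7, 7, 8, 11, 7, 8, 41, 7, 8, 8, 7, 7, 7]

suspects = flag_large_ring_members(source_ringsizes)
print(suspects)  # positions (0-based) of the members that stand out
```

When exactly one position is returned for every transaction in a chain, the ring signature provides essentially no cover for that spend.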

After a few hours of churning with massive ringsizes, the owner has “secretly/anonymously” moved the funds from the output mentioned above into transaction 739b86beb402422da5191857639eca36bc949664ded045a36d9f291424e6ba64

See for yourself!

It’s easiest to follow along if you view the ending transaction, then trace the churn upstream. Scroll to the bottom, click “more details”, then you can just follow the flow in reverse by clicking on the ring members.

Flow of funds through outputs

  • 48f0a7cf371fff4bb594b36bf5a03974ee9bf639943dc18b1a4c13745fdb6b20 // 1647658
  • 2fe08801f6010933e094cde35c10e4473d500debf0a1430c439809f8333029f0 // 1647672
  • 2c102ff223e2b5245e2f9bd80ffc96b1f3a97c4ec438d8132b54a9834ced95b8 // 1647688
  • ffef5289b8cb4436a856e3844d8d18f29936eb7eec196bc03c77235d4b6987de // 1647738
  • c9d4f3eca931ae4d48fe848cd37ba71822ae9c6e45e25669105da12a3384a3dc // 1647762
  • 3265790a08280e1818baede64f03912c8522fe7ed62e7c3fe6f3a4c43c1f04bd // 1647810
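The churn timing can be estimated from the list above. This is a rough sketch that assumes the number after each `//` is the block height at which that output appeared (the list does not say so explicitly) and uses Monero's 2-minute target block time, so the results are approximate.

```python
# Trailing numbers from the list above, ASSUMED to be block heights.
heights = [1647658, 1647672, 1647688, 1647738, 1647762, 1647810]

TARGET_BLOCK_MINUTES = 2  # target block time; real intervals vary

# Number of blocks between consecutive churn steps.
gaps_blocks = [later - earlier for earlier, later in zip(heights, heights[1:])]

# Approximate wall-clock time per step and for the whole chain.
gaps_minutes = [gap * TARGET_BLOCK_MINUTES for gap in gaps_blocks]
total_minutes = sum(gaps_minutes)

print(gaps_blocks)    # [14, 16, 50, 24, 48]
print(gaps_minutes)   # [28, 32, 100, 48, 96]
print(total_minutes)  # 304
```

Under these assumptions each churn step took roughly half an hour to under two hours, consistent with the "few hours of churning" described above.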

Three distinct failures

This entity made several critical mistakes while using Monero.

  1. Use of a non-default ringsize, across all transactions. A chain of transactions, each including one ring member that was itself created by a transaction with exactly 41 ring members, is a dead giveaway.

  2. The guess-newest heuristic (described here) is accurate across all churn steps. Despite having 40 decoys, the output being spent is always the most recent. Given 1 hour churn time and 40 decoys, there should be plenty of newer outputs used as ring members. This peculiarity leads us to failure #3:

  3. The entity used a non-standard decoy selection algorithm. Normally, over 50% of decoys are selected from the recent zone, i.e. the last 1.8 days. This user included nearly no decoys from the recent zone, and used an unusual distribution of outputs generated in the last year or so. Let's take a look at the [739b...ba64] transaction again, for example: (quick exploratory plot, not polished)
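The guess-newest heuristic from failure #2 is simple enough to sketch. The ring below is entirely hypothetical (made-up labels and heights, not data from the transactions above), and the function name is an illustration only.

```python
def guess_newest(ring):
    """Guess the real spend as the ring member created at the greatest
    block height. `ring` is a list of (label, creation_height) pairs."""
    label, _height = max(ring, key=lambda member: member[1])
    return label


# Hypothetical ring: every decoy is much older than the one fresh output.
ring = [
    ("decoy_a", 1500200),
    ("decoy_b", 1390750),
    ("fresh_output", 1647762),
    ("decoy_c", 1602310),
]

print(guess_newest(ring))  # fresh_output
```

With a standard decoy distribution and a one-hour churn interval, some decoys should be newer than the real input, so the guess would often be wrong. It succeeds at every step here only because the decoys were drawn almost entirely from older outputs.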

Wrong decoy selection

Interpretation

The custom ring-sizes and poor churn practices (rapid timing) allow us to follow Moneroj from output 84bd1a6571e69f970df62b6fdfcdda54052416716632260a86ca18fb33483736 through 6 levels of churn identified by ringsize (and 3 more levels appear to be likely assignable based on timing and peculiar decoy selection).

Regardless of whether this is the true flow of funds, or an attempt to frame a different user, either case is indicative of a local breakdown in fungibility. (I should not be able to make my transactions "look like" somebody else's transactions, or else a few things have gone wrong!). Perhaps it is some mixing service for old outputs? Unknown at the moment, analysis pending.

While Monero blockchain analysis has been previously discussed with respect to churn timing and ring size, the activity investigated here suggests another novel approach for heuristic linking: identify set(s) of transactions whose ring members have an age distribution that deviates significantly from the one produced by the decoy selection algorithm used during that era. See issue #42. Extreme deviations (e.g. a uniform distribution instead of triangular) provide partial fingerprinting for transactions that may have been generated from the same software or entity. (If two transactions are suspected to be linked due to other heuristics, then a matching non-standard decoy age profile would provide strong statistical support for the possible connection.)
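One way to quantify how far a ring's age profile departs from a reference is a two-sample Kolmogorov-Smirnov statistic. This is a sketch of one possible measure, not the method used for this page. The samples are synthetic: ages are normalized to [0, 1], the "expected" sample is a triangular shape favouring recent outputs, and the "observed" sample is uniform. Neither is a real Monero decoy distribution.

```python
import random
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample KS statistic: the largest gap between the empirical
    CDFs of the two samples, evaluated at every observed value."""
    a, b = sorted(sample_a), sorted(sample_b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Synthetic normalized ages (0 = newest, 1 = oldest); both are assumptions.
expected_ages = [rng.triangular(0.0, 1.0, 0.0) for _ in range(1000)]
observed_ages = [rng.uniform(0.0, 1.0) for _ in range(1000)]

d = ks_statistic(expected_ages, observed_ages)
print(f"KS distance between age profiles: {d:.3f}")
```

A large distance for a transaction, or a small distance between two suspect transactions that both sit far from the reference, is the kind of signal the heuristic above would look for.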

Is this MyMonero paranoia mode?

(Section under construction)

NRL sent a 41-ring member transaction using the MyMonero.com web interface, with ID: d0274cb2c6aa07ed1616036f3b0ef910d9c98cca35a5f65ce6dc8b5cec5dbba5

The distribution is roughly triangular, but does not sample > 50% of outputs from the recent zone.
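The recent-zone check can be expressed as a small calculation. This is a sketch with hypothetical heights, not the data from the test transaction. It converts the 1.8-day recent zone into blocks using the 2-minute target block time (720 blocks per day), and it counts every ring member, including the real spend.

```python
BLOCKS_PER_DAY = 720  # 24 h * 60 min / 2 min target block time


def recent_zone_fraction(member_heights, tx_height, recent_days=1.8):
    """Fraction of ring members created within `recent_days` of the
    transaction, measured in blocks at the target block time."""
    cutoff = tx_height - recent_days * BLOCKS_PER_DAY
    recent = sum(1 for height in member_heights if height >= cutoff)
    return recent / len(member_heights)


# Hypothetical ring: two of the four members fall inside the recent zone.
fraction = recent_zone_fraction([1000, 5000, 9500, 9900], tx_height=10000)
print(fraction)  # 0.5
```

A ring whose fraction sits well below 0.5 would match the observation above that the sample is not weighted toward the recent zone.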

MyMonero_test