Logbook 2021 H2 - cardano-scaling/hydra GitHub Wiki
- Add a mutator to `closeTx` which changes the snapshot number without changing the signature -> tests fail sometimes.
- We added labels to locate which mutation failed -> red bin: do this properly, it feels hacked and in a way we should have a sum type enumerating close-specific mutations
- Correctly discard the healthy case when mutating the snapshot number -> tests pass always.
- Add a third close mutator to improve the implementation: changing both the signature and the snapshot number to a valid but unexpected value
  - First we did this using the `closeRedeemer` smart constructor, thus always getting well-formed snapshot numbers
  - Later we deliberately did not use `closeRedeemer` to test using the "on-chain type", that is (still) a Bytestring
  - For example, mutating this to something not resembling a serialized integer (but still a valid signature)
- Then we changed the snapshot number back to an `Integer`, as we will be wanting to check `> 0` later
- When `snapshotNumber` is an `Integer`, we need to serialize and hash on-chain to verify a signature
  - We implemented a basic Integer-to-CBOR-encoder or rather Natural-to-CBOR as it will error on negative numbers
  - The fact that negative Integers error made our tests pass, as ill-formed values `(< 0)` were not valid, despite correct signatures
- Discussion on what we eventually will need on-chain with the realization that ultimately we only will ever need to serialize/hash Integers and Hashes for `closeTx`, but `TxOut` for `fanoutTx`
- We continued working on the close tx validator, starting with a first observation: there was a discrepancy between the on-chain and off-chain representations of a snapshot number (bytestring vs natural). Since any positive number is actually a potentially valid snapshot number, we created a mutation to generate negative numbers as snapshot numbers. As a consequence and to cope with the now failing test, we changed the on-chain representation to an `Integer` (there's no `Natural` available in Plutus!) and we wrote a basic CBOR encoder for unsigned integers (to match the off-chain signable representation).
- From there, we discussed whether this approach could indeed be used in the long run. Writing a CBOR encoder for unsigned integers is quite trivial and does not require much code. While this is mostly sufficient for the close tx (which only requires serializing the snapshot number), it isn't for the fanout. In its simplest form (i.e. no split, full UTXO fits in the transaction), it is necessary for the validator to verify that the output UTXO does indeed match whatever UTXO's hash was specified during the close and stored in the state-machine datum. This could potentially be addressed by #147. However, when we consider the realistic form of the fanout which will likely require splitting UTXO into sub-utxo, we will need to:
- (a) Split the UTXO into structured subsets;
- (b) Prove inclusion of a subset into the bigger set.
In the paper, this is achieved via Merkle-Patricia-Trees, but as discussed previously, in the coordinated form it can also be achieved with much simpler Merkle-Trees. This still means that we will need, eventually, to construct a hierarchical structure of hashes, where signable representations are Merkle nodes whose leaves are transaction outputs. Unfortunately, we can't really use the trick described in #147 in this case because it would require one extra datum PER TXOUT. Thus, without added builtins in Plutus, we are left with no choice but to find some on-chain signable representation of TxOut. Could this be CBOR in the same way we approached the signable representation of snapshot numbers? Maybe. We would need to write a CBOR encoder for:
- non-negative integers (CBOR type-00)
- negative integers (CBOR type-01)
- bytestrings (CBOR type-02)
- (finite) lists (CBOR type-04)
- (finite) dictionaries (CBOR type-05)
Note that this can be quite straightforward (e.g. https://github.com/elm-toulouse/cbor/blob/master/src/Cbor/Encode.elm) so it may not be a bad idea. Possibly as an independent "Plutus library". We should also probably start a discussion with the ledger and Plutus teams with regards to including CBOR-serialization of builtin types as a builtin.
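To give an idea of how little code the unsigned-integer case needs, here is a minimal off-chain sketch in plain Haskell. It is not the on-chain encoder from the repository: the names `encodeUnsigned` and `toBytesBE` are made up for illustration, it works on lists of `Word8` rather than Plutus builtin bytestrings, and it returns `Nothing` on negative input where the encoder described above errors.

```haskell
import Data.Word (Word8)

-- | Big-endian bytes of a non-negative integer, padded to exactly @n@ bytes.
-- The remainder yields the current low byte; the quotient is what remains
-- to be encoded. Swapping the two produces wrong bytes.
toBytesBE :: Int -> Integer -> [Word8]
toBytesBE n = reverse . take n . go
  where
    go x = fromIntegral (x `rem` 256) : go (x `quot` 256)

-- | CBOR encoding of an unsigned integer (major type 0).
-- Hypothetical helper: returns Nothing for negative values and for values
-- that do not fit in 64 bits.
encodeUnsigned :: Integer -> Maybe [Word8]
encodeUnsigned n
  | n < 0 = Nothing
  | n < 24 = Just [fromIntegral n]                      -- value fits in the header byte
  | n < 2 ^ (8 :: Int) = Just (0x18 : toBytesBE 1 n)    -- 1 byte follows
  | n < 2 ^ (16 :: Int) = Just (0x19 : toBytesBE 2 n)   -- 2 bytes follow
  | n < 2 ^ (32 :: Int) = Just (0x1a : toBytesBE 4 n)   -- 4 bytes follow
  | n < 2 ^ (64 :: Int) = Just (0x1b : toBytesBE 8 n)   -- 8 bytes follow
  | otherwise = Nothing
```

For example, `encodeUnsigned 1000` gives `Just [0x19, 0x03, 0xe8]`, and any negative snapshot number is rejected before a signature could ever be checked against it.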
- Discussed the ADR13 candidate:
- There was an "optional datum" field in Alonzo before, not in the final spec though.
- If we can get the ledger team to drop https://github.com/input-output-hk/cardano-ledger/blob/70cfbf9be79533a6d1b2ff446567f5b78bf945aa/eras/alonzo/impl/src/Cardano/Ledger/Alonzo/Rules/Utxow.hs#L290-L301, this approach would be less hacky.
- We should write down the alternative: Adding a serialization (+ hashing) builtin to plutus.
- Reviewed open PRs and what had been merged to master
- realized that the mock implementation is actually wrong: nothing checks whether the included hash (which is verified) is indeed corresponding to some specific snapshot number / pre-image, i.e. we could change the "content" + the signature to another valid value
- Goal for the week:
- complete the mock implementation using an on-chain encoder for the snapshot number (it's only an integer)
- implement the ADR13 method for `close` on a branch to check feasibility
Plan for today:
- merge pending PRs
- error handling in Head Logic
- architecture writeup
- Mithril
Rebased https://github.com/input-output-hk/hydra-poc/pull/144 on master after merging branch expanding MockHead contracts
Writing NodeSpec test to check we properly notify clients when an exception is raised in the Chain component.
- Added a new `PostTxOnChainFailed` message to the server output
- Introduce combinators to capture server output and mock exceptions in the `Chain` component
- Not sure this is the right thing to do however, seems like we are somewhat tightly coupling node and implementation of other components?
Also, perhaps "let-it-crash" strategy would be better for the Hydra node?
Need to adapt YAML specifications to add new message and move shared PostChainTx and InvalidTxError to common.yaml.
- Got an "interesting" error in the Log API tests:
1) Hydra.Logging HydraLog Assertion failed (after 1 test and 9 shrinks):
[Envelope {namespace = "", timestamp = 1864-05-09 09:18:20.203694887175 UTC, threadId = 0, message = DirectChain {directChain = PostingTxFailed {toPost = AbortTx {utxo = fromList []}, reason = CannotSpendInput {input = "", walletUtxo = fromList [], headUtxo = fromList []}}}}]
Traceback (most recent call last):
  File "/nix/store/ypidjcrcsxpnrqm4ivxf8pg475m0axqd-python3.8-jsonschema-3.2.0/lib/python3.8/site-packages/jsonschema/validators.py", line 811, in resolve_fragment
    document = document[part]
KeyError: 'PostChainTx'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/nix/store/ypidjcrcsxpnrqm4ivxf8pg475m0axqd-python3.8-jsonschema-3.2.0/bin/.jsonschema-wrapped", line 9, in <module>
    sys.exit(main())
  File "/nix/store/ypidjcrcsxpnrqm4ivxf8pg475m0axqd-python3.8-jsonschema-3.2.0/lib/python3.8/site-packages/jsonschema/cli.py", line 76, in main
    sys.exit(run(arguments=parse_args(args=args)))
  File "/nix/store/ypidjcrcsxpnrqm4ivxf8pg475m0axqd-python3.8-jsonschema-3.2.0/lib/python3.8/site-packages/jsonschema/cli.py", line 87, in run
    for error in validator.iter_errors(instance):
Errors are now properly reported to clients and the TUI. There are still errors which make the node crash, esp. the ones related to failure to validate the tx and validators.
- These should also be reported as `InvalidTxError` but this is left for next year
Plans for today:
- Spike implementation of matching mock crypto so that we can verify signatures in MockHead
- PR about reporting failures to submit transaction to end-user, catching exceptions and sending messages
Signature property check fails immediately => algorithm for computing encoding is basically wrong...
I had inverted quotient and remainder in my transformation from integer to bytes on-chain 🤦
The error we get with a failed property is not very helpful as it does not show anything about datums or redeemers, need to enhance output to show those?
- Adding redeemers and datums display to the `describeCardanoTx` function
Still have a failure: There's probably a difference in the representation of snapshot (number) on and off chain that explains the failing signature verification
- In the `SignableRepresentation` of `Snapshot` we used `show` but in constructing the datum in `closeTx` we use `serialise'`
Still having failure: There is a one character difference between what show displays for off-chain Signed data and what it displays from Datums:
MultiSigned {multiSignature = [UnsafeSigned "K\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\SOH",UnsafeSigned "K\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\STX",UnsafeSigned "K\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\ETX"]}
vs
DataConstr Constr 1 [B "\SOH",List [B "PK\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\SOH",B "PK\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\STX",B "PK\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\ETX"]]
That must come from the ToCBOR instance which certainly adds some prefix when we invoke serialize': the CBOR encoding of a bytestring prepends one or more bytes identifying the type and length of the bytes.
- Using CBOR encoding to pass bytes on- and off-chain is problematic because we don't have CBOR parsing capabilities on-chain so we must ensure whatever bytes manipulation we do works on compatible representations.
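The extra `P` in the datum output above is consistent with that explanation: the mock signatures shown are 16 bytes long, and the CBOR header for a byte string (major type 2) of length 16 is `0x40 + 16 = 0x50`, which is ASCII `P`. A small sketch to check this, using hypothetical helper names that are not part of the code base:

```haskell
import Data.Char (chr)
import Data.Word (Word8)

-- | CBOR header byte for a byte string (major type 2) shorter than 24 bytes.
-- Longer byte strings need additional length bytes and are not covered here.
shortBytesHeader :: Int -> Maybe Word8
shortBytesHeader len
  | len >= 0 && len < 24 = Just (0x40 + fromIntegral len)
  | otherwise = Nothing

-- | The same header byte as the character that `show` would display.
headerAsChar :: Int -> Maybe Char
headerAsChar = fmap (chr . fromIntegral) . shortBytesHeader
```

`headerAsChar 16` evaluates to `Just 'P'`, matching the one-character difference between the off-chain `UnsafeSigned` values and the `B "PK..."` bytes in the datum.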
Trying to use directly the bytes from the underlying signature => Unit tests pass
Got some tests still failing with new closeTx validator:
- DirectChainSpec is failing but this is expected as we don't pass any signature in the Snapshot we post :)
- TxSpec is also failing to observe closeTx
- ETESpec and TUISpec
Fixing first DirectChainSpec was easy enough, just needed to sign the snapshot.
- There's a minor snag in that we pass the `Party` to `withDirectChain` which means we need to draw the signing key from somewhere else. Perhaps it would make sense to pass the `SigningKey` to `withDirectChain`?
The problem with observing closeTx now is that we expect to decode an integer as SnapshotNumber but of course we get a bytestring!
https://github.com/input-output-hk/hydra-poc/blob/ensemble/more-contract-testing/hydra-node/src/Hydra/Chain/Direct/Tx.hs#L592
- Spent an hour troubleshooting the `TxSpec` test which was not passing because we passed a `SNothing` datum. The output of the test failure is particularly cryptic and does not provide many clues on what's going on.
I now only have the ETE test failing on validating the signatures, not sure why however 🤔 The cool thing is that we have a proper failure being reported in the test.
Having a look at the node's logs: The CloseTx transaction is properly posted on-chain by node 1 AFAICT, but the message is actually misleading: What happens is that the transaction gets constructed properly but the submission fails and crashes the node => Adding some more detailed log messages
Seems like not all parties have signed the snapshot, here is the plutus reported error:
The data is: Constr 1 [List [I 10,I 20,I 30]]
The redeemer is: Constr 1 [B "\SOH",List [B "K\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\n",B "K\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\RS"]]
We have 3 parties but only 2 signatures which is fishy! The error might come from the HeadLogic: The confirmed snapshot contains only part of the signatures we received! How come we did not catch this bug earlier 😮?
- Checking the ETE test passes with the "correct" signatures set then will write a proper test for that.
After the change we get the proper number of signatures but still have a failure to validate signatures:
The data is: Constr 1 [List [I 10,I 20,I 30]]
The redeemer is: Constr 1 [B "\SOH",List [B "K\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\DC4",B "K\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\n",B "K\245\DC2/4ET\197\NUL\NUL\NUL\NUL\NUL\NUL\NUL\RS"]]
This comes from the fact that the signatures are not in the same order as the parties, which is an important assumption made in the mock head code.
- Possible solutions: Pass a map of parties to signatures, check the signatures differently (eg. shuffling lists), or make sure the 2 lists are in the same order
Adding the signatures to the SnapshotConfirmed message so that we can observe it and have a proper unit test
Decided to not implement complex ordering logic checking within MockHead but rather to order the multisignatures in the HeadLogic, where they are produced, to match parties ordering.
- Ideally I would have liked to do that in the `aggregate` function but this one works on already signed `ByteString`s so the `Party` information is buried.
- Also noticed that the `APISpec` tests did not fail in spite of me adding a field to a `ServerOutput` constructor: I would expect the JSON specification validation code to have caught this but it did not.
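The ordering decision can be illustrated with a small sketch. The `Party` and `Signature` types below are simplified stand-ins rather than the real Hydra types, and `orderSignatures` is a hypothetical name: the idea is only that signatures are kept together with their signer until they are laid out in the canonical order of parties.

```haskell
import Data.Maybe (mapMaybe)

-- Simplified stand-ins for the real types (assumptions for illustration).
newtype Party = Party Int deriving (Eq, Show)
newtype Signature = Signature String deriving (Eq, Show)

-- | Lay out received signatures in the canonical order of parties.
-- Parties for which no signature was received are skipped.
orderSignatures :: [Party] -> [(Party, Signature)] -> [Signature]
orderSignatures parties received = mapMaybe (`lookup` received) parties

-- | True when every party contributed a signature.
allSigned :: [Party] -> [(Party, Signature)] -> Bool
allSigned parties received =
  length (orderSignatures parties received) == length parties
```

With parties 10, 20, 30 and signatures arriving in the order 20, 10, 30 (as in the redeemer printed above), `orderSignatures` returns them in the order 10, 20, 30, which is what a validator that zips parties with signatures expects.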
TUISpec test cannot pass in the current implementation of validators because it does not produce any snapshot, hence the Close transaction fails to pass validation.
This is explicitly handled in the paper, e.g when snapshot number is 0 hence we should deal with it in our ContractSpec tests.
Made all tests green but for one TUISpec test which I put in pending because we don't have the ability to make it pass right now: The TUI full lifecycle test does not produce any snapshot hence we close with snapshot 0 and no signatures and the MockHead validator does not cover this.
Trying to generalise mutations, to produce more redeemers then introducing different and more significant ones.
I am "surprised" by the fact one cannot automatically derive instances for Plutus datatypes:
src/Hydra/Contract/MockHead.hs:42:13: error: [-Wmissing-methods, -Werror=missing-methods]
• No explicit implementation for
‘==’
• In the instance declaration for ‘Eq Input’
|
42 | deriving (Eq, Generic, Show)
- Also, the `Eq` instance does not even seem to be visible to test code. This is probably an artifact from the Plutus plugin compiler and transformation?
test/Hydra/Chain/Direct/ContractSpec.hs:166:15: error:
• No instance for (Arbitrary Plutus.V1.Ledger.Value.Value)
arising from a use of ‘genericArbitrary’
There are instances for similar types:
instance cardano-ledger-shelley-test-0.1.0.0:Test.Cardano.Ledger.Shelley.ConcreteCryptoTypes.Mock
c =>
Arbitrary (Cardano.Ledger.Mary.Value.Value c)
-- Defined in ‘Test.Cardano.Ledger.ShelleyMA.Serialisation.Generators’
• In the expression: genericArbitrary
In an equation for ‘arbitrary’: arbitrary = genericArbitrary
In the instance declaration for ‘Arbitrary MockHead.Input’
|
166 | arbitrary = genericArbitrary
Perhaps they are available in some module we do not import?
- Looks like there are some in the PAB code. As we don't depend on `plutus-app` anymore, we'll need to rewrite them or vendor this file, like we did for SM code.
- Created a `Plutus.Orphans` module vendoring stuff from PAB.
I have a test failure with the generator when the redeemer is a Close with a different snapshot number, which is an expected error I would say?
*Anyhow, the close redeemer should not be only a snapshot number but a whole signed snapshot so this paves the way to do the needed changes.
Trying to remove non-documented symbols from haddock document generation, seems like this is only available as a module-level prune attribute, which sucks...
Some discussion around the "mutation testing" approach:
- There is nothing there forcing us into implementing correctly the output datum, so that a `close` tx could just as well produce an invalid output datum which won't be consumable by the fanout tx
- However we already have tests in place for the whole "happy path" so if a close does produce some invalid datum, this will be caught by the fanout, or later when we check the produced UTXO
- By thoroughly testing each validator, we ensure each link of the chain is correct but we need test(s) to ensure the whole chain is correct
- Validators are always checked and implemented as purely local functions so it makes sense to test them locally
Discussing what to do next, and which contracts to implement. Seems like Close is the best candidate because it's one we have barely touched so far.
What about the contestationPeriod? Discussing the passing of time in the Head logic, should probably be reported by underlying ChainComponent as ticks, instead of putting the contestation computation logic in the on-chain component.
2 different things:
- Has enough time passed?
- Is the fanout posting the right UTXO?
Discussing testing strategy for contracts:
- Mutation approach: Generate valid transactions, then mutate them to render them invalid and make sure the validator fails
- Constraints-based approach: Start with an empty transaction then add more "constraints" representing how we expect the transaction to be
  - requires to start with a `Const False` validator to make sure we have a failing test
- We need some higher level of testing involving sequence of transactions/validators to express properties: for example,
- It should always be possible to abort if the head is not open
- We also want to have rock-solid contracts so we need to test all kind of non-happy paths
We decide to give the "mutation" approach on Close validator a go. The idea is the following:
- We start from an `arbitrary` supposedly valid Tx (and relevant UTXO) of the required type, in this case `genCloseTx`
- We start from a `Const True` validator, i.e. one that validates any transaction
- We indeed verify this transaction passes stage 2 validation
  - We don't care about stage 1 as it's supposed to be validated before that, which kind of implies we need to generate structurally valid transactions
- We then generate various relevant mutations to this valid transaction that are supposed to make it invalid
  - We started with a simple one, namely replacing the redeemer with `Abort`
  - Various possibilities include:
    - Modifying the value of an input or an output,
    - modifying the datum of an input (note that we modify the hash in the provided UTXO because if the hash is not compatible the transaction is structurally invalid)
    - Removing some output or some input
- By adding more and more mutations, the goal is to "triangulate" the validators, to make them more and more precise and verify more and more conditions
- While constructing the mutations, we let emerge some kind of DSL to construct transactions that we use in the `Tx` module to simplify it
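The shape of this approach can be sketched on a toy model. Everything below is hypothetical and heavily simplified (a `Tx` is just a redeemer and a list of output values, a validator is a pure predicate); the real tests work on generated Cardano transactions and run the actual Plutus validators.

```haskell
-- Toy model, for illustration only.
data Redeemer = Close Integer | Abort deriving (Eq, Show)

data Tx = Tx
  { txRedeemer :: Redeemer
  , txOutputs :: [Integer] -- output values in lovelace
  }
  deriving (Eq, Show)

type Validator = Tx -> Bool

-- | A labelled mutation, so a failing case can be located easily.
type Mutation = (String, Tx -> Tx)

expectedValue :: Integer
expectedValue = 2000000

-- | The "healthy" transaction every mutation starts from.
healthyCloseTx :: Tx
healthyCloseTx = Tx {txRedeemer = Close 1, txOutputs = [expectedValue]}

mutations :: [Mutation]
mutations =
  [ ("replace redeemer with Abort", \tx -> tx {txRedeemer = Abort})
  , ("remove all outputs", \tx -> tx {txOutputs = []})
  , ("change output value", \tx -> tx {txOutputs = map (+ 1) (txOutputs tx)})
  ]

-- | Labels of the mutations that the validator wrongly accepts.
survivors :: Validator -> Tx -> [Mutation] -> [String]
survivors validator tx ms =
  [label | (label, mutate) <- ms, validator (mutate tx)]

-- | Starting point: accepts anything, so every mutation survives.
constTrue :: Validator
constTrue = const True

-- | A more precise validator obtained by "triangulating" with the mutations.
closeValidator :: Validator
closeValidator tx = case txRedeemer tx of
  Close snapshotNumber ->
    snapshotNumber > 0 && sum (txOutputs tx) == expectedValue
  Abort -> False
```

`survivors constTrue healthyCloseTx mutations` lists all three labels, while `survivors closeValidator healthyCloseTx mutations` is empty: each surviving mutation points at a condition the validator does not check yet.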
- Raised the question of MPT again with researchers based on mostly one observation / concern: Hydra's specification for MPTs is different from Ethereum's, and requires the prefix for each node to be part of the node's hash. Implementation-wise, this introduces significant complexity since it requires hashes to be constructed one-by-one, as an onion layer, for each digit of the prefix -- without which it is not feasible to add / remove an element with only a proof and a root hash. This means that in the case of Cardano output references, an MPT path is at minimum 32 hashes! We are afraid that this would make the computational budget go through the roof.
Thus two legitimate questions:
- Why include the prefix / path as part of the hash? What would be the security consequences / hypotheses for omitting the prefix / path from the hash structure?
- Do we actually need to add / remove elements from an MPT at all in a coordinated Head context?
Researchers are investigating (1) and comparing with Ethereum's implementation, looking at the trade-offs in both solutions. For (2), the answer is almost in the question. Adding and removing elements to and from an MPT is required by the OCV code for the close transition in the presence of dangling transactions (basically, the OCV is re-applying those transactions on top of the signed snapshot and checks that the resulting UTXO matches with the MPT's root). In a coordinated context, there's none. Which means that MPT are only truly needed for fanout splitting.
- As a consequence of (2) above, we may also wonder why an MPT is even needed at all. Since the output references are meaningless on the layer 1, all that is really necessary during fanout is to check that the resulting UTxO (i.e. addresses, value and datums) do indeed correspond to what's been agreed by the participants. As such, a "simple" Merkle Tree (perhaps enhanced with the accumulated size to facilitate splitting) would be sufficient in order to create and verify the split transactions.
- We discussed the possibility of adding and removing participants to and from an already established head. This would allow a head to on-board new participants, for either a short period of time, or until the end. While there's no apparent issue or concern with this (it was discussed during the writing of the paper, but omitted to avoid bloating the paper), there hasn't been any explicit use-case made for it. One possible idea would be to on-board new validating participants and make the head a bit more of an "open" network (so long as participants agree to onboard someone...).
Trying to remove the annoying messages that are printed when the thread controlling the TUI exits because it's blocked on a STM operation.
- Seems like it's thrown in the MonadSTM but the exception type is not exported which means there's no way to catch it.
- I presume this is intentional, in order to remove the temptation users could have to tamper with those exceptions but in our case this is pretty annoying. Perhaps I could wrap it in a `silently` call but then how about interaction with stdin/stdout?
- It's possible to add a custom event that would be handled in the main event loop as an `AppEvent` that will invoke the `halt` function to stop the TUI. However, it's not clear how to inject that custom event into the channel that distributes them as it's currently private, so there would be a need for some kind of control side channel that would stop the TUI inner loop but this is too significant a change to be dealt with right now.
In CI Build https://github.com/input-output-hk/hydra-poc/runs/4516685770?check_suite_focus=true there is something odd: it reports failed tests but those are nowhere to be seen.
The fact hydra-tui is not built along with hydra-node in the docker-compose is annoying, also there aren't any build instructions so it's not possible to build them in compose.
- Seems like demo instructions assume user pulls images and does not build them locally, going to add instructions on how to build them locally.
There is a need for another level of ETE test, one that would check the docker images are properly working. We could test through the TUI, using existing infrastructure but running the nodes as containers and the TUI in-process with several instances interacting with a cluster.
Trying to simplify TxSpec tests and see if I can extract common features to reuse in testing contracts.
- My idea about contracts would be to provide a way to build transactions and UTXO then apply the tx against given UTXO using underlying `ledger-specs` infra as provided by `Hydra.Ledger.Cardano`.
- To test contracts we could do something similar to the constraints eDSL in Plutus: Start from a blank transaction then generate a new transaction applying a sequence of arbitrary constraints to generate a tx that would or would not pass the validator. Then trying to validate the transaction. Problem is the oracle: How do we know the property holds? Perhaps what we could do is to have the generator express what a valid init/commit/... transaction is? Like:
prop "check valid commit" $ \ (ACommitTx tx utxo) -> isRight $ runIdentity $ evaluateTransactionExecutionUnits pparams tx utxo epochInfo systemStart costmodels
Goal: Make the demo work.
We have a flawed logic in the observation of commits: We remove the inputs from initials that have not been observed in the commit which obviously leads to the inability to commit after having observed another commit tx.
- To test the `observeCommit` tx and modify the Onchain state, we need to generate a list of initials `(TxIn, PKH)`. But we also need to populate the list of `initials :: [(TxIn, TxOut, Data)]`
- The reason why we have both `TxOut` and `Data` is that we are using ledger specs where `TxOut` only contains datum hash.
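The flawed commit observation described above can be reproduced on a toy model. This is only a sketch under assumptions: `TxIn` is reduced to a `String`, the function names are hypothetical, and the "fixed" variant is what we presume the intended behaviour to be (drop only the initial outputs actually spent by the observed commit).

```haskell
-- Toy stand-in for a transaction input reference (assumption).
type TxIn = String

-- | Flawed update: keeps only the initials spent by the observed commit,
-- i.e. removes every initial that was NOT observed in it. The other
-- parties' initials are lost and later commits cannot be built or observed.
observeCommitFlawed :: [TxIn] -> [TxIn] -> [TxIn]
observeCommitFlawed commitInputs = filter (`elem` commitInputs)

-- | Presumed intended update: remove exactly the initials that the
-- observed commit consumed, keep all the others.
observeCommitFixed :: [TxIn] -> [TxIn] -> [TxIn]
observeCommitFixed commitInputs = filter (`notElem` commitInputs)
```

With initials `["a#0", "b#0", "c#0"]` and a commit spending `"a#0"`, the flawed version leaves `["a#0"]` while the fixed one leaves `["b#0", "c#0"]`, so the two remaining parties can still commit.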
Going to fix the TUI's commits. The problem is that we cannot build transactions when the head is open as the UTXO committed are now using full cardano tx hence we need to identify them according to our own addresses. We also need matching signing key to be able to sign the tx
While refactoring TUI we are bitten by the problem that Party only contains the multisig key and not the cardano key, which makes it impossible to use it to identify addresses to use in the TUI
- Workaround is to infer the list of addresses to send money to in the TUI from the list of existing UTXO
Some little bugs remaining in the TUI:
- List of UTXO displays duplicate addresses which messes up with navigation
- When user has no UTXO to send, she can still go to the recipient list but this crashes afterwards => Won't fix for now
We were able to complete the journey through the TUI, observing the fanout transaction in the cardano-node 🎉 :
df44aeb02bb740c86b3745b604bf3eccbb77f13d3493532df8fda38eea95de3a 0 2000000 lovelace + TxOutDatumHash ScriptDataInAlonzoEra "67d8ed01e13f33438ea9059ac9be2e159f943cffe054283485e0300271e3e9f9"
df44aeb02bb740c86b3745b604bf3eccbb77f13d3493532df8fda38eea95de3a 1 100000000 lovelace + TxOutDatumNone
df44aeb02bb740c86b3745b604bf3eccbb77f13d3493532df8fda38eea95de3a 2 100000000 lovelace + TxOutDatumNone
df44aeb02bb740c86b3745b604bf3eccbb77f13d3493532df8fda38eea95de3a 3 10000000 lovelace + TxOutDatumNone
df44aeb02bb740c86b3745b604bf3eccbb77f13d3493532df8fda38eea95de3a 4 90000000 lovelace + TxOutDatumNone
Also, we were able to open another Head after having closed the first one, and have one party not committing anything which is fine 🍾 .
- A problem with our current scheme is that a party which commits nothing or which has consumed all its UTXO won't be listed in the recipients list.
[1 of 7] Compiling CardanoClusterFixture ( src/CardanoClusterFixture.hs, dist/build/CardanoClusterFixture.o, dist/build/CardanoClusterFixture.dyn_o )
src/CardanoClusterFixture.hs:14:15: error:
• Exception when trying to run compile-time code:
/tmp/nix-build-local-cluster-lib-local-cluster-0.1.0.drv-0/hydra-poc-root-local-cluster-lib-local-cluster-root/local-cluster/config: getDirectoryContents:openDirStream: does not exist (No such file or directory)
Code: makeRelativeToProject "config" >>= embedDir
• In the untyped splice:
$(makeRelativeToProject "config" >>= embedDir)
|
14 | configFiles = $(makeRelativeToProject "config" >>= embedDir)
Seems like file-embed does not work correctly inside nix build :sad:
- Using cabal API to package and extract extra data files across packages works just fine, no need to use the `file-embed` module: https://cabal.readthedocs.io/en/3.4/cabal-package.html#accessing-data-files-from-package-code
Still failing to build inside nix:
Setup: filepath wildcard 'config/*.json' refers to the directory 'config',
which does not exist or is not a directory.
- => We probably want to list all files explicitly
- We need to regenerate materialisation when changing a data file (or any content) of a local package
Docker compose build and run is working fine now, needed to:
- Update permissions when running `prepare-devnet.sh` so that files have 0400 perms
- rebuild hydra-tui properly
Cool thing is that running hydra-tui works fine from the docker-compose using just
docker-compose --profile tui run hydra-tui-alice
Injecting UTXO(s) into demo cardano-node so that TUI user can post transaction and commit
Trying to simplify key juggling code between crypto keys and cardano-api keys
- I am hitting a small snag with the `hashKey` function which is used by the `Tx` module to pack in the initial datum, trying to find a hashing function that works with API types
Got a first working version of an exe injecting seed payment for one address, but got a submission error when trying to run it, might be an issue with version of nodes
- Rebuilding docker containers...
- Program can inject a single UTXO + seed payment to be used in the network:
$ cabal run seed-network -- --cardano-node-socket demo/devnet/ipc/node.socket --cardano-signing-key demo/devnet/credentials/alice.sk
Querying node for Protocol Parameters at demo/devnet/ipc/node.socket
Posting seed payment transaction at demo/devnet/ipc/node.socket, amount: Lovelace 100000000, key: demo/devnet/credentials/alice.sk
UTXO for address ShelleyAddress Testnet (KeyHashObj (KeyHash "f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d")) StakeRefNull
{
    "223de11cbda4126bae963c1d653e7c4711554011bcd807ec3eea8bf958199fa7#0": {
        "address": "addr_test1vru2drx33ev6dt8gfq245r5k0tmy7ngqe79va69de9dxkrg09c7d3",
        "value": {
            "lovelace": 100000000
        }
    },
    "223de11cbda4126bae963c1d653e7c4711554011bcd807ec3eea8bf958199fa7#1": {
        "address": "addr_test1vru2drx33ev6dt8gfq245r5k0tmy7ngqe79va69de9dxkrg09c7d3",
        "datumhash": "a654fb60d21c1fed48db2c320aa6df9737ec0204c0ba53b9b94a09fb40e757f3",
        "value": {
            "lovelace": 899899828691
        }
    }
}
Fixed generators for WalletSpec so that we have 100% Right coverage. Trying to get code coverage information to understand what we are testing really
Somewhat correct invocation for code coverage with hpc, generating HTML files but failing to generate an index:
hpc markup \
'--destdir=/home/curry/hydra-poc/dist-newstyle/build/x86_64-linux/ghc-8.10.7/hydra-node-0.1.0.0/t/hydra-node/hpc/vanilla/html/hydra-node' \
--hpcdir=/home/curry/hydra-poc/dist-newstyle/build/x86_64-linux/ghc-8.10.7/hydra-node-0.1.0.0/t/hydra-node/hpc/vanilla/mix/hydra-node \
--hpcdir=/home/curry/hydra-poc/dist-newstyle/build/x86_64-linux/ghc-8.10.7/hydra-node-0.1.0/hpc/vanilla/mix/hydra-node-0.1.0/ \
--hpcdir=/home/curry/hydra-poc/dist-newstyle/build/x86_64-linux/ghc-8.10.7/hydra-node-0.1.0.0/hpc/vanilla/mix/hydra-node-0.1.0.0 \
--srcdir hydra-node \
./dist-newstyle/build/x86_64-linux/ghc-8.10.7/hydra-node-0.1.0/t/tests/hpc/vanilla/tix/tests/tests.tix
Discussing how to retrieve the UTXO from the cardano node, whether or not to put it in existing Client in TUI.
- It makes sense to separate responsibilities between component talking to Hydra and one talking to the node, even though in the end, from the perspective of the TUI, it's a single entry point
Got stuck once more in issues with various types of UTXO being available:
- `queryUtxo` returns a cardano-api `UTxO`
- we need a Hydra `Utxo`
We need to filter the UTXO used for payment with the markedDatum
- We manage to see the Head open in the TUI, with the right commits available
Troubleshooting the issue on CI with TUISpec, namely that End-to-end tests fail, looks related to the fact the fd used to write is not a vty: https://stackoverflow.com/questions/1605195/inappropriate-ioctl-for-device
- Starting to get the demo running: it's IMPORTANT to have the devnet re-created as the nodes do not sync back in time.
- Host-mounting the node.socket allows for some convenient cardano-cli querying
- hydra-node crashes when the TUI selects a randomly generated utxo to commit with "CannotSpendInput" -> expected
- By exposing the `IO Vty` initializer, we can hook into the `Vty` interface and re-direct the `Output` into a normal file
- A generic BrickTest handle / with pattern emerges
- Realized that the `update` function does write continuously into the file and it contains "multiple" screens
  - try to seek backward on each update
  - this messes with the terminal
- Theory why splitting individual frames is not possible: only changes are drawn to the Fd?
- There is a Mock output in vty: https://hackage.haskell.org/package/vty-5.33/docs/Graphics-Vty-Output-Mock.html
- Maybe outputPicture can be used instead? If used with a "fresh" displayContext, this shows the picture for real instead of forwarding it to the Fd
- Using a custom Output I can hook into 'outputByteBuffer' and redirect that? This seems to also allow providing an 'assumedStateRef' which could be cleared outside to force a "full re-render"?
- By writing into an IORef, which is cleared before each call to outputPicture we can keep a single frame!
- If we use the real display context, it draws correctly using `writeMoveCursor` but this output is harder to reason about programmatically
- Maybe I could drop the escape codes for expectations and keep them for displaying?
- Adding the shouldRender function was easy now
  - `threadDelay`s are necessary right now -> ugly .. double buffering where the `getPicture` uses a `(T)MVar` to block on the next frame could help
  - it outputs the frame which does not have the expected bytes -> nice!
- Also, I do get a BlockedIndefinitely exception on failing tests.. should be okay
- We want to add hydra & cardano node to the TUI tests using `withBFTNode` and `withHydraNode`
  - rewriting the tests was fine
  - but fails now as the `withBFTNode` from `local-cluster` can't find the hard-coded fixtures in `config/..`
- We see two options now:
  - generate everything similar to how `cardano-testnet` is doing it
  - embed our hard-coded local-cluster files when compiling the `local-cluster` library -> we go for this as it's more in line with what we have right now
- Writing config JSON files which had been copied before from memory
- Cardano keys are a bit more involved. We had been pointing the `withBFTNode` to the actual file paths instead of copying them to a temporary directory, so we tackle this and need to change quite some signatures of `keysFor` or `signingKeyPathFor`
- After rebasing the TUI test work, we also need to distribute `initialFunds` and make a "seed payment" using `mkSeedPayment` -> Success, all tests but the expected failure pass!
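The single-frame capture mentioned earlier in these TUI notes (writing into an `IORef` that is cleared before each call to `outputPicture`) can be sketched without vty at all. This is only a minimal sketch under assumptions: `FrameCapture`, `writeChunk` and `renderFrame` are hypothetical names, and plain `String` chunks stand in for the bytes a real vty output would receive.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef, writeIORef)

-- | Hypothetical capture buffer: holds the chunks written for the current frame only.
newtype FrameCapture = FrameCapture (IORef [String])

newFrameCapture :: IO FrameCapture
newFrameCapture = FrameCapture <$> newIORef []

-- | What a redirected output would call for every chunk it is asked to write.
-- Chunks are prepended, so they are stored in reverse order.
writeChunk :: FrameCapture -> String -> IO ()
writeChunk (FrameCapture ref) chunk = modifyIORef' ref (chunk :)

-- | Clear the buffer, run the drawing action, and return exactly one frame.
renderFrame :: FrameCapture -> (FrameCapture -> IO ()) -> IO String
renderFrame capture@(FrameCapture ref) draw = do
  writeIORef ref [] -- forget whatever the previous frame wrote
  draw capture
  concat . reverse <$> readIORef ref
```

Because the buffer is reset at the start of every `renderFrame`, an expectation only ever sees the latest frame rather than the accumulated "multiple" screens.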
Goal: Fix all tests implementing fee coverage using "marked" UTXO
Still stuck in fixing DirectSpec test, probably because there's a race condition while we are waiting for the payment utxo to appear
- Inject marker UTXO
- Throwing an exception when we cannot cover fee
- `DirectSpec` test is passing but `WalletSpec` properties never cover the `Right` side:
  coverFee balances transaction with fees +++ OK, passed 100 tests (100% Left).
  transaction's inputs are removed from wallet +++ OK, passed 100 tests (100% ErrNoPaymentUtxoFound).
Struggling to write a correct seed payment transaction generator for use in Integration and ETE tests
- There is a mismatch between the config we generate as part of the CardanoNode setup and the existing initial funds: In one case we use 900000 ADA and in the other case 900 ADA. As they use the same address, when the `mkGenesisTx` function runs it retrieves one or the other.
- I would like to start with empty `initialFunds` and then fill them up as we need when we start the cluster
Paying InitTx succeeds but posting all subsequent transactions in DirectChainSpec fails, probably because they are not waiting for payment to appear
- Retrying blindly without timeout does not work, of course, so need to add a timeout to all retries
- Problem now is that `generatePaymentToCommit` is probably consuming the `markerDatum` without recreating it so it disappears
Forgot to retry some postTx calls, the ones that are supposed to fail
- I can confirm the "marked" UTXO is consumed by the `generatePaymentToCommit`.
ETE tests still failing because we don't have a seed transaction posted so there's no payment utxo available => seeding UTXO
- It's possible there is a race condition between the time the node sees the commits and the time the wallet takes into account the payment utxo to cover the collectCom tx? It's Bob who is trying to post the collectCom tx and is failing to cover its fees. I can see this in Bob's log:
  { "message": { "directChain": { "contents": { "after": { "d0e48424eed4e798aac21e0caae434aa3fbb2fafa4dd62f40f568c6a7c895bdb#0": { "address": "601052386136b347f3bb7c67fe3f2ee4ef120e1836e5d2707bb068afa6", "datahash": null, "value": { "policies": { }, "lovelace": 97834279 } }, "876818297ef126d372d05572ddeb3a4dd971d72eb6062a119987bc20ab6212c5#1": { "address": "601052386136b347f3bb7c67fe3f2ee4ef120e1836e5d2707bb068afa6", "datahash": "a654fb60d21c1fed48db2c320aa6df9737ec0204c0ba53b9b94a09fb40e757f3", "value": { "policies": { }, "lovelace": 899896530691 } } }, "before": { "d0e48424eed4e798aac21e0caae434aa3fbb2fafa4dd62f40f568c6a7c895bdb#0": { "address": "601052386136b347f3bb7c67fe3f2ee4ef120e1836e5d2707bb068afa6", "datahash": null, "value": { "policies": { }, "lovelace": 97834279 } } }, "tag": "ApplyBlock" }, "tag": "Wallet" }, "tag": "DirectChain" }, "timestamp": "2021-12-07T12:14:49.531989295Z", "threadId": 20, "namespace": "HydraNode-2" }
  which happens "after" it tries to post collectCom, so it's perfectly possible it's missing the payment txout
- Should the node try again to post it if it fails, or should this be handled in the `postTx` definition in the `Direct` module? => It's reasonable to expect various race conditions and the need to retry posting given some conditions are not yet met but could be met in the future.
Added a timeout around finalizeTx in Direct and reinstated retry in the wallet so that we can wait for payment utxo to appear.
- This is actually a case where the client could be interested in the error reported and do something about it, eg. send money to the Node's "wallet" to pay for Head SM
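The "retry, but bounded by a timeout" idea from the entries above can be sketched with `base` alone. This is a minimal sketch, not the code in `Hydra.Chain.Direct`: `retryUntilJust` and `retryWithTimeout` are hypothetical names, and the polled action stands in for "look up a payment utxo in the wallet".

```haskell
import Control.Concurrent (threadDelay)
import System.Timeout (timeout)

-- | Poll an action until it yields a result, sleeping between attempts.
-- On its own this loops forever if the result never appears.
retryUntilJust :: Int -> IO (Maybe a) -> IO a
retryUntilJust delayMicros action = go
 where
  go = action >>= maybe (threadDelay delayMicros >> go) pure

-- | Bound the whole retry loop; 'Nothing' means the deadline passed first.
-- Both durations are in microseconds.
retryWithTimeout :: Int -> Int -> IO (Maybe a) -> IO (Maybe a)
retryWithTimeout timeoutMicros delayMicros action =
  timeout timeoutMicros (retryUntilJust delayMicros action)
```

Returning `Nothing` on expiry leaves the decision to the caller, which fits the observation that a client may want to see the error and react, for example by funding the node's wallet.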
LocalClusterSpec fails because there's no initialFunds as I removed it from the genesis-shelley.json file => We want to put them back, and overwrite them in tests.
- Start pairing with realization that the `MockHead` is already checking things:
- it asserts the "newValue" is the same as "oldValue", i.e. nothing is added
- but of course we are adding the collected value from the commits
- So we start by passing off-chain knowledge to the Head (SM) validator via the redeemer (SM Input)
- NOTE: This might not be a good idea and instead we should look at the script context / all commit (PT) inputs
- We improve error printing on tx submission failures of `Chain.Direct`
- Now the close fails because it does not preserve value and we pass in a `TxOut` of the head utxo
Goal: Have ETE and Benchmarks pass
The errors we are seeing from test execution are painful, so we want to improve their formatting:
- Formatting the submission error is a bit annoying as it requires peeling several layers of stacked errors
- The plutus error is already formatted so it would be nice to print it directly rather than `show`ing it
- Is there not already a way to PPrint ledger/node errors?
The error in the CollectCom comes from the SM:
- It checks the value stored in the SM UTXO is preserved between transitions, and this is not the case currently, hence the errors reported on script execution
- We collect the total value in the `CollectCom` redeemer and use that to update the destination state's value
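The value check discussed here can be written down as a small pure function. This is a simplified sketch under assumptions: amounts are plain lovelace instead of a multi-asset `Value`, and `collectComPreservesValue` is a hypothetical name, not the validator used in the state machine.

```haskell
-- | Lovelace amounts only; a simplification of the multi-asset value held on-chain.
type Lovelace = Integer

-- | The head output must hold the previous head value plus everything
-- collected from the commit outputs, nothing more and nothing less.
collectComPreservesValue ::
  -- | value locked in the head input
  Lovelace ->
  -- | values of the commit outputs being collected
  [Lovelace] ->
  -- | value locked in the new head output
  Lovelace ->
  Bool
collectComPreservesValue headIn commits headOut =
  headOut == headIn + sum commits
```

For instance, a head input of 2000000 with commits of 4000000, 2000000 and 4000000 passes only if the new head output holds 12000000. A check that demands `headOut == headIn` rejects every collectCom that actually collects something.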
SM validator now fails on the close tx, for the same reason, eg. missing values
Then fanoutTx also fails, for the same reason, but now we must ensure the Final state is really final so that the SM logic checks the destination state value is 0 and there's no additional output for the SM.
Commit is failing in the benchmarks run with not enough fees => Trying to fix the wallet logic to remove the input txin from selection logic, but this does not work
Unit tests are failing:
- Struggling with the ledger/api discrepancies to fix unit tests
- We are missing the datum to pass to the `collectComTx` function so we add them, but now this breaks some tests which require the UTXO, but not the datum and we are stuck in a maze of mapping and transforming back and forth between ledger and api
We are having an error in DirectSpec which fails to post the init tx: It seems we are retrying when posting the initTx but catching all errors which is sledgehammerish => Adding a proper exception instead
- InitTx submission blocks because we changed the way inputs are selected in the `finalizeTx` and in the wallet we `retry` when there's no `availableUtxo`, removing the `retry` reveals the error
Idea:
- We need to have a distinguished address or utxo we carry around to pay for the fees. Could we use a datum for that?
- Other option: Simply call cardano-cli to do the balancing of a tx
Trying to make the ETE fail by having both Alice and Bob committing UTXO. This should fail according to my theory that we are consuming the wrong UTXO in the collectCom tx
ETE test now fails for the same reason the benchmark fails:
CannotSpendInput
{ input = ("9546383daca50c0c643abca09331c5e58cfef49fa899eb8d15bfb2347ba1b001", 1)
, walletUtxo =
fromList
[ (TxIn "8d383a29a211578298143ab26b3b2e1c4406abe5d7a905c49b234fdccf2627c8" (TxIx 1), TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraAlonzo) (ShelleyAddress Testnet (KeyHashObj (KeyHash "3aaa2e3de913b0f5aa7e7f076e122d737db5329df1aa905192284fea")) StakeRefNull)) (TxOutValue MultiAssetInAlonzoEra (valueFromList [(AdaAssetId, 899996702000)])) TxOutDatumNone)
]
, headUtxo =
fromList
[ (TxIn "bd71f91ba872c4d79f45163dae877c1644a22527df47e1c49c25657acd10603c" (TxIx 0), TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraAlonzo) (ShelleyAddress Testnet (ScriptHashObj (ScriptHash "07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9")) StakeRefNull)) (TxOutValue MultiAssetInAlonzoEra (valueFromList [(AdaAssetId, 2000000)])) (TxOutDatumHash ScriptDataInAlonzoEra "c2f7589a052854c8877e74b7ec3de892981766ef819fc03bc8c893daf66dd72e"))
, (TxIn "bd71f91ba872c4d79f45163dae877c1644a22527df47e1c49c25657acd10603c" (TxIx 1), TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraAlonzo) (ShelleyAddress Testnet (ScriptHashObj (ScriptHash "12c4f8ff8070f0d659bdb2ecf844190ddb1134ef821764bd2b4649b2")) StakeRefNull)) (TxOutValue MultiAssetInAlonzoEra (valueFromList [(AdaAssetId, 2000000)])) (TxOutDatumHash ScriptDataInAlonzoEra "2502ff9c9c341dd1384724ae35eab0b19e394c90226892fcc8e7cc86342d324e"))
, (TxIn "bd71f91ba872c4d79f45163dae877c1644a22527df47e1c49c25657acd10603c" (TxIx 2), TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraAlonzo) (ShelleyAddress Testnet (ScriptHashObj (ScriptHash "12c4f8ff8070f0d659bdb2ecf844190ddb1134ef821764bd2b4649b2")) StakeRefNull)) (TxOutValue MultiAssetInAlonzoEra (valueFromList [(AdaAssetId, 2000000)])) (TxOutDatumHash ScriptDataInAlonzoEra "3b5e4228faf69ddf21fb84990b54d806c2b1234a250e0f4c54cc953257ff57ac"))
, (TxIn "bd71f91ba872c4d79f45163dae877c1644a22527df47e1c49c25657acd10603c" (TxIx 3), TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraAlonzo) (ShelleyAddress Testnet (ScriptHashObj (ScriptHash "12c4f8ff8070f0d659bdb2ecf844190ddb1134ef821764bd2b4649b2")) StakeRefNull)) (TxOutValue MultiAssetInAlonzoEra (valueFromList [(AdaAssetId, 2000000)])) (TxOutDatumHash ScriptDataInAlonzoEra "59b19510ad6d701df3df01804374a9a5126b265b779804d739e78721ebd3872c"))
]
}
This error is returned by the Wallet when it tries to resolveInputs. Adding more info about all the inputs of the transaction
- I don't see another way to have a proper `CollectComTx` than collecting the outputs of the commit transactions and keeping them around in the Direct chain state, or the head state, to pass later on when submitting the collect com tx. Now trying to retrieve the commits from observing the UTXO: we need to return the `ν_commit` UTXO from the observed commit txs.
- Trying to fix the value produced in the commit output
Once again lost in the maze of types between ledger and API...
Why don't we simply pass a Utxo to the commitTx function instead of a Maybe (x,y) tuple?
- This tuple is what is selected by the user, which should be a proper `Utxo` containing a single input.
Observation tests now pass but the test for transaction size fails
The problem in the TX size comes from the generated Value which is huge. This is also what we observe in the CI and this comes from our use of arbitrary Txs from the Ledger. We have something in the WalletSpec already for trimming down the values to something more palatable, perhaps using ReasonablySized ?
Struggled quite a bit but test checking we observe collectCom properly is failing because we do not use the commit outputs.
Note:
I really think the observeXXX functions should work with OnChainHeadState as:
- What is relevant for observation depends on the state
- The state can be modified by the observed TX, like the initials being produced/consumed by various transactions
Managed to have the CollectCom transaction consume the commits UTXO and not the committed ones, now checking what happens in ETE test
- ETE and DirectChainSpec tests still failing
I think I know what happens: The Wallet tries to resolve an input corresponding to the commit UTXO but it does not have it in its UTXO set because it's a UTXO paid to a script address and not to the Wallet's owner address so we don't track it.
- But what happens for Head script UTXO?
- We pass more UTXO to resolve when we call `coverFee`
=> Trying to add the accumulated commits in OnChainHeadState to the cover fee function
We are now observing a
WrappedShelleyEraFailure
( MissingScriptWitnessesUTXOW
( fromList
[ ScriptHash "6679d3c92844becb16c55161b60111336e1ba2f3d14bbb52b051c4db"
]
)
)
which probably means we don't put the script for consuming the commit outputs into the collect com tx. Also putting the redeemers.
Transaction now fails because of scripts execution:
HardForkApplyTxErrFromEra S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [
UtxowFailure (
MissingRequiredDatums (fromList [SafeHash "549485dcc8131ab64122a9163080943f55b83ae368bc55bec73e583f192f3080"]) (fromList [SafeHash "8392f0c940435c06888f9bdb8c74a95dc69f156367d6a089cf008ae05caae01e",SafeHash "f4b9d64e4725efc05d7d078bb19e952b288c8403f5a585a8a6ffe589a9851614"])),UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (UtxosFailure (ValidationTagMismatch (IsValid True)
(FailedUnexpectedly [
PlutusFailure "
The 3 arg plutus script (PlutusScript PlutusV1 ScriptHash \"07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9\") fails.
CekError An error has occurred: User error:
The provided Plutus code called 'error'.
The data is: Constr 0 [Constr 0 [I 100000000000000],List [I 10]]
The redeemer is: Constr 0 []
The context is:
Purpose: Spending (TxOutRef {txOutRefId = bc1eae5aa8e72d3f9e5c0cd725b49dc47954570b6087eb16b5cc3f4ce6daf4ea, txOutRefIdx = 0})
TxInfo:
TxId: 91b8eae01ad75d635d8e925925195da468bc700c9a20286d2efdfd7957b3d3a8
Inputs: [ a24d818e7fc823c61416a095f98b139fc8c520b9ee5365791245f8d9ec7efc6b!0 -> - Value (Map [(,Map [(\"\",3000000)])]) addressed to
ScriptCredential: 6679d3c92844becb16c55161b60111336e1ba2f3d14bbb52b051c4db (no staking credential)
,a24d818e7fc823c61416a095f98b139fc8c520b9ee5365791245f8d9ec7efc6b!1 -> - Value (Map [(,Map [(\"\",899989536279)])]) addressed to
PubKeyCredential: f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d (no staking credential)
, bc1eae5aa8e72d3f9e5c0cd725b49dc47954570b6087eb16b5cc3f4ce6daf4ea!0 -> - Value (Map [(,Map [(\"\",2000000)])]) addressed to
ScriptCredential: 07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9 (no staking credential)]
Outputs: [ - Value (Map [(,Map [(\"\",5000000)])]) addressed to
ScriptCredential: 07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9 (no staking credential)
, - Value (Map [(,Map [(\"\",899986238279)])]) addressed to
PubKeyCredential: f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d (no staking credential) ]
Fee: Value (Map [(,Map [(\"\",3298000)])])
Value minted: Value (Map [])
DCerts: []
Wdrl:[]
Valid range: (-\8734 , +\8734)
Signatories: []
Datums: [ ( 8392f0c940435c06888f9bdb8c74a95dc69f156367d6a089cf008ae05caae01e
Seems like it's missing the datum witnesses for the commit outputs.
- => Adding commit datums
So I don't have the missing datum error anymore, only a script failure
- The datum types are odd in the error, need to dump the transaction to see what's going on
It seems it's the head script which is failing:
The data is: Constr 0 [Constr 0 [I 10000000000000],List [I 10,I 20,I 30]]\nThe redeemer is: Constr 0 []
It's clear the datums are there:
HardForkApplyTxErrFromEra S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (UtxosFailure (ValidationTagMismatch (IsValid True) (FailedUnexpectedly [PlutusFailure "
The 3 arg plutus script (PlutusScript PlutusV1 ScriptHash \"07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9\") fails.
CekError An error has occurred: User error:
The provided Plutus code called 'error'.
The data is: Constr 0 [Constr 0 [I 10000000000000],List [I 10,I 20,I 30]]
The redeemer is: Constr 0 []
The context is:
Purpose: Spending (TxOutRef {txOutRefId = bd71f91ba872c4d79f45163dae877c1644a22527df47e1c49c25657acd10603c, txOutRefIdx = 0})
TxInfo:
TxId: 35b789b90d675222ca720ec0edb167d3c80d8ea2505327e5a8a6154de39c8ef7
Inputs: [ 0baa47ee668c4a9daf984ca29d2ada80224ea74ecae114f2b49c923252bd612f!0 -> - Value (Map [(,Map [(\"\",4000000)])]) addressed to
ScriptCredential: 6679d3c92844becb16c55161b60111336e1ba2f3d14bbb52b051c4db (no staking credential)
, 0baa47ee668c4a9daf984ca29d2ada80224ea74ecae114f2b49c923252bd612f!1 -> - Value (Map [(,Map [(\"\",899984536279)])]) addressed to
PubKeyCredential: f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d (no staking credential)
, 8d383a29a211578298143ab26b3b2e1c4406abe5d7a905c49b234fdccf2627c8!0 -> - Value (Map [(,Map [(\"\",2000000)])])addressed to
ScriptCredential: 6679d3c92844becb16c55161b60111336e1ba2f3d14bbb52b051c4db (no staking credential)
, 97469573293f61bf761da1adc77c1ad207e4c65249ee78f18c336db0a137a7a8!0 -> - Value (Map [(,Map [(\"\",4000000)])]) addressed to
ScriptCredential: 6679d3c92844becb16c55161b60111336e1ba2f3d14bbb52b051c4db (no staking credential)
, bd71f91ba872c4d79f45163dae877c1644a22527df47e1c49c25657acd10603c!0 -> - Value (Map [(,Map [(\"\",2000000)])]) addressed to
ScriptCredential: 07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9 (no staking credential) ]
Outputs: [ - Value (Map [(,Map [(\"\",12000000)])]) addressed to
ScriptCredential: 07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9 (no staking credential)
, - Value (Map [(,Map [(\"\",899981238279)])]) addressed to
PubKeyCredential: f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d (no staking credential) ]
Fee: Value (Map [(,Map [(\"\",3298000)])])
Value minted: Value (Map [])
DCerts: []
Wdrl: []
Valid range: (-\8734 , +\8734)
Signatories: []
Datums: [ ( 8392f0c940435c06888f9bdb8c74a95dc69f156367d6a089cf008ae05caae01e
, <> )
, ( 9352b132cb8dcedbc4d1115321a357d32b538aa1ba57c4c958ee6ebae8f5d50c
, <10,
\"{\\\"9546383daca50c0c643abca09331c5e58cfef49fa899eb8d15bfb2347ba1b001#1\\\":{\\\"address\\\":\\\"addr_test1vru2drx33ev6dt8gfq245r5k0tmy7ngqe79va69de9dxkrg09c7d3\\\",\\\"value\\\":{\\\"lovelace\\\":2000000}}}\"> )
, ( af586b80c5243d28f4c1e9d1984236d2934fecb6d3d0a3b7e50ba94f446c150f
, <30, \"{}\">)
, ( c2f7589a052854c8877e74b7ec3de892981766ef819fc03bc8c893daf66dd72e
, <<10000000000000>, [10, 20, 30]> )
, ( f5bcf944acb09ae13fcdec6517ad3ee23c03de7c6ac779dd4304ebfd38faeb44
, <20,
\"{\\\"ac6e8d41c8e11d7883b1a5f5b025494cde06c7cecd67d10d06d7627374cf81af#1\\\":{\\\"address\\\":\\\"addr_test1vqg9ywrpx6e50uam03nlu0ewunh3yrscxmjayurmkp52lfskgkq5k\\\",\\\"value\\\":{\\\"lovelace\\\":2000000}}}\"> ) ]
"
It's definitely the head script (with hash 07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9) that's failing as it's the only output beside change to the Tx.
- Amount is correct, it equals the sum of inputs + 2 ADAs
- I can see the 2 committed UTXO from Alice and Bob.
- Trying to have the mockhead script always succeed does not help
Trying to avoid my LSP displaying "ghost imports" which are a PITA to navigate the file. Also removing the annoying popups.
- It's `lsp-lens-mode` which is enabled by default in lsp-mode. Adding `(use-package lsp-mode :custom (lsp-lens-enable nil))` per https://emacs-lsp.github.io/lsp-mode/page/settings/lens/
Got the following error when running benchmark:
hydra-node: failed to submit tx: HardForkApplyTxErrFromEra S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (ValueNotConservedUTxO (Value 2000000 (fromList [])) (Value 1493272799 (fromList []))))),UtxowFailure (WrappedShelleyEraFailure (UtxoFailure
(BadInputsUTxO (fromList [TxInCompact (TxId {_unTxId = SafeHash "26ecb3a06c1e32f63742f1f7836c42dc86184fd71e32988d0b4099382cf009d1"}) 0])))),UtxowFailure (WrappedShelleyEraFailure (UtxoFailure NoCollateralInputs)),UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (InsufficientCollateral (Coin 0) (Coin 4947000))))]
})))))
Another (different) error:
hydra-node: failed to cover fee for transaction: ErrNotEnoughFunds {missingDelta = Coin 3589920368}, ValidatedTx {body = TxBodyConstr TxBodyRaw {_inputs = fromList [TxInCompact (TxId {_unTxId = SafeHash "852d11d73776b64a9416bbef7811cca03485a01691abc37751dcb866b1353a29"}) 0], _collateral = fromList [], _outputs = St
rictSeq {fromStrict = fromList [(Addr Testnet (ScriptHashObj (ScriptHash "07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9")) StakeRefNull,Value 2000000 (fromList []),SJust (SafeHash "67d8ed01e13f33438ea9059ac9be2e159f943cffe054283485e0300271e3e9f9")),(Addr Testnet (KeyHashObj (KeyHash "16601980e4ae7eb11e87
180d154cc44a9b24105a6ee7c592ca66329c")) StakeRefNull,Value 3604780403 (fromList []),SNothing),(Addr Testnet (KeyHashObj (KeyHash "542a8a32c2a56fc6081e784a2a0527803015922309eee9ff051f629e")) StakeRefNull,Value 2541227916 (fromList []),SNothing),(Addr Testnet (KeyHashObj (KeyHash "529b55087caf60a7251c68b38f480de1f3ad
d14561322447f25fcf20")) StakeRefNull,Value 1042096452 (fromList []),SNothing)]}, _certs = StrictSeq {fromStrict = fromList []}, _wdrls = Wdrl {unWdrl = fromList []}, _txfee = Coin 0, _vldt = ValidityInterval {invalidBefore = SNothing, invalidHereafter = SNothing}, _update = SNothing, _reqSignerHashes = fromList [],
_mint = Value 0 (fromList []), _scriptIntegrityHash = SNothing, _adHash = SNothing, _txnetworkid = SNothing}, wits = TxWitnessRaw {_txwitsVKey = fromList [], _txwitsBoot = fromList [], _txscripts = fromList [(ScriptHash "07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9",PlutusScript PlutusV1 ScriptHash "07
204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9")], _txdats = TxDatsRaw (fromList [(SafeHash "67d8ed01e13f33438ea9059ac9be2e159f943cffe054283485e0300271e3e9f9",DataConstr Constr 3 []),(SafeHash "ff5f5c41a5884f08c6e2055d2c44d4b2548b5fc30b47efaa7d337219190886c5",DataConstr Constr 2 [])]), _txrdmrs = RedeemersR
aw (fromList [(RdmrPtr Spend 0,(DataConstr Constr 3 [],WrapExUnits {unWrapExUnits = ExUnits' {exUnitsMem' = 0, exUnitsSteps' = 0}}))])}, isValid = IsValid True, auxiliaryData = SNothing}, using head utxo: fromList [(TxInCompact (TxId {_unTxId = SafeHash "852d11d73776b64a9416bbef7811cca03485a01691abc37751dcb866b1353
a29"}) 0,(Addr Testnet (ScriptHashObj (ScriptHash "07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9")) StakeRefNull,Value 2000000 (fromList []),SJust (SafeHash "ff5f5c41a5884f08c6e2055d2c44d4b2548b5fc30b47efaa7d337219190886c5")))], and wallet utxo: fromList [(TxIn "852d11d73776b64a9416bbef7811cca03485a01691
abc37751dcb866b1353a29" (TxIx 1),TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraAlonzo) (ShelleyAddress Testnet (KeyHashObj (KeyHash "30b49f3a89bb12e567cc21f749bbad9276e214eee6ffa63257bbcf30")) StakeRefNull)) (TxOutValue MultiAssetInAlonzoEra (valueFromList [(AdaAssetId,5378452288)])) TxOutDatumNone),(TxIn
"8a927269eb6e203d189c5be935efc8c239721a58e1176ffab515bcc3ba69040d" (TxIx 1),TxOut (AddressInEra (ShelleyAddressInEra ShelleyBasedEraAlonzo) (ShelleyAddress Testnet (KeyHashObj (KeyHash "30b49f3a89bb12e567cc21f749bbad9276e214eee6ffa63257bbcf30")) StakeRefNull)) (TxOutValue MultiAssetInAlonzoEra (valueFromList [(Ada
AssetId,3601482403)])) TxOutDatumNone)]
Seems like all nodes try to post the fanout which explains the invalid UTXO we observe -> leader only should try to fanout
- Our `BehaviorSpec` test is passing which is wrong
- We need to observe the transactions posted on chain to ensure a single node posts it
- Refactoring `ConnectToChain` type to have a `history` function exposed to observe it
- Our unit test is still passing :(
We confirm there's only one `FanOutTx` posted, by the party which decided to `Close`. Trying to change the closer in ETE test shows ETE test passes consistently even when we change the closing party, so there's probably something fishy in the transactions we generate in the benchmarks
tx =
HardForkApplyTxErrFromEra
S
( S
( S
( S
( Z
( WrapApplyTxErr
{ unwrapApplyTxErr =
ApplyTxError
[ UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (ValueNotConservedUTxO (Value 2000000 (fromList [])) (Value 520403697 (fromList [])))))
, UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (BadInputsUTxO (fromList [TxInCompact (TxId{_unTxId = SafeHash "5866ca5080edbe814c1a5d05d505b137c8648e4f885bc35b951dfaf82c1a969b"}) 0]))))
, UtxowFailure (WrappedShelleyEraFailure (UtxoFailure NoCollateralInputs))
, UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (InsufficientCollateral (Coin 0) (Coin 4947000))))
]
}
)
)
)
)
)
Seems like the wallet cannot find the input to pay/provide collateral. There's probably a race condition in the wallet whereby we keep a tx input that's been used until we observe it from an onchain block
- We should remove the inputs as soon as we post the transaction so they become unavailable
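Removing spent inputs eagerly, instead of waiting for a block to confirm them, could look like the following. It is a sketch with simplified stand-in types: `TxIn`, `WalletUtxo` and `markSpent` are hypothetical here and are not the types of the actual wallet module.

```haskell
import qualified Data.Map.Strict as Map
import qualified Data.Set as Set

-- | Simplified stand-ins: a transaction id with an output index, and lovelace.
type TxIn = (String, Int)

type Lovelace = Integer

type WalletUtxo = Map.Map TxIn Lovelace

-- | Drop the inputs of a transaction we just submitted, so that the next
-- fee coverage cannot select them again before a block is observed.
markSpent :: Set.Set TxIn -> WalletUtxo -> WalletUtxo
markSpent spent utxo = Map.withoutKeys utxo spent
```

One trade-off of this design: if the submission later fails or is rolled back, the removed entries have to be restored, for example when the next block is applied.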
Trying to add a property to WalletSpec checking that: Our properties for covering fees are not relevant as they end up with 100% False cases, eg. the generated tx/outputs don't have enough ADA to pass the function
- Trying to change the generators to produce TxOut with enough value and messing up with api/ledger discrepancies
- Struggling to generate the right combination of UTXOs for the wallet and an arbitrary transaction to cover fees. Our calculation depends on PParams' maximum fees which are too high as we currently compute fees as an upper bound using `maxTxExUnits`
- Provide trimmed down pparams to ensure most of the transactions successfully cover fees
We observe the close fails to submit because it does not have enough funds, which makes sense given the collectCom tx does not properly propagate the total funds committed.
- Fixing the value in the CollectCom's output
- Benchmarks now failing consistently because of `CannotSpendInput` error, which is probably caused by some node trying to post a CollectCom transaction concurrently with another node?
Looking at the errors reported by the benchmarks, feeling they could be clearer. Having the logs dumped to stdout as Haskell Show instances makes them somewhat less readable and parseable.
Also thinking of a way to prove this is caused by concurrent attempts at posting the collectcom tx: There aren't any hints at this in the logs
- Trying to replace the body of transactions in the `DirectChain` log with their ids
- Turns out we also have a problem in `commitTx`:
    commitTx party utxo (initialIn, pkh) =
      mkUnsignedTx body datums redeemers scripts
     where
      body =
        TxBody
          { inputs = Set.singleton initialIn <> maybe mempty (Set.singleton . toShelleyTxIn . fst) utxo
          , collateral = mempty
          , outputs =
              StrictSeq.fromList
                [ TxOut
                    (scriptAddr commitScript)
                    (inject $ Coin 2000000) -- TODO: Value of utxo + whatever is in initialIn
                    (SJust $ hashData @Era commitDatum)
                ]
  There too the values are incorrectly computed. This does not really explain why the benchmarks fail with `CannotSpendInput` though?
It seems we are consuming the wrong UTXO in the collectCom, eg. we are consuming the committed UTXOs instead of the result of the commit tx
In the observeCommitTx we return the committed UTXO:
observeCommitTx :: ValidatedTx Era -> Maybe (OnChainTx CardanoTx)
observeCommitTx tx@ValidatedTx{wits} = do
txOut <- snd <$> findScriptOutput (utxoFromTx tx) commitScript
dat <- lookupDatum wits txOut
(party, utxo) <- fromData $ getPlutusData dat
OnCommitTx (convertParty party) <$> convertUtxo utxo
where
commitScript = plutusScript MockCommit.validatorScript
convertUtxo = Aeson.decodeStrict' . OnChain.toByteString
and the collectCom has no way to know the actual UTXO from the CommitTX itself.
- Quick intro why Rollbacks happen and which parts of the architecture are related to this
- Arnaud presents the strategy:
- Each node just re-applies the events to recover the HeadState as it was
- When re-applying, the events are not reported back to the HeadLogic (really? why not?)
- But what happens when there is e.g. an Abort when re-applying / synchronizing with the chain?
- Any unexpected "replay" is deemed an adversarial action and we would be closing / aborting the head anyways -> anything that happened in the Head so far would be lost.
- We aim to expose "stability" to our users so they can decide whether they rely on the open Head.
- Discussion starts
- PAB does this replaying as well and we are in danger of "re-inventing the wheel".
- Seeing an inconsistent transaction might not necessarily be an adversarial move though. Forking chains could result in this even with all honest parties.
- What is the actual problem here?
- Running example: The txs establishing a Head are rolled back and cannot be re-applied -> the Head was never open.
- Is it only when opening the Head? But also when closing/contesting the Head?
- Simple re-submission might not be enough, it could also require the application to re-balance or even re-construct the transaction. (Three levels of reaction)
- "Whatever it takes" to re-establish the HeadState.
- "Confidence" in Head should be visible to the users and they can decide on it
- Is this an individual decision or should it be known a priori? i.e. a parameter
- Users should decide
- Other situations where rollbacks are bad:
- Contestation rollbacks!
- What happens with contestation period / validity?
- Need to adapt the timeout to the new situation / new slots?
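The "re-apply the events to recover the HeadState" strategy from this discussion amounts to a left fold over the observed chain events. The sketch below uses hypothetical, much simplified `HeadState` and `ChainEvent` types. Ignoring events that do not apply is an assumption of the sketch, not a decision recorded in these notes.

```haskell
import Data.List (foldl')

-- | Simplified, hypothetical head life cycle.
data HeadState = Idle | Initializing | Open | Closed | Final
  deriving (Eq, Show)

-- | Simplified, hypothetical on-chain observations.
data ChainEvent
  = InitObserved
  | CollectComObserved
  | CloseObserved
  | FanoutObserved
  | AbortObserved
  deriving (Eq, Show)

applyEvent :: HeadState -> ChainEvent -> HeadState
applyEvent Idle InitObserved = Initializing
applyEvent Initializing CollectComObserved = Open
applyEvent Initializing AbortObserved = Final
applyEvent Open CloseObserved = Closed
applyEvent Closed FanoutObserved = Final
applyEvent st _ = st -- assumption: events that do not apply are ignored

-- | Recover the state by replaying every event seen on the current chain.
replay :: [ChainEvent] -> HeadState
replay = foldl' applyEvent Idle
```

In the running example, if the transactions establishing the Head are rolled back and never re-applied, the replayed list is empty and `replay []` yields `Idle`: the Head was never open.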
We have several issues popping up following our changes in the commit/collectCom/fanout logic:
- ETE test is failing intermittently to submit the fanout tx because of insufficient funds and also missing UTXO
- Some properties are also failing on generating init/commit?
Checking the serialisation (Plutus) for ContestationPeriod
Trying to track where the PT1 error comes from. It only appears in the .pir code but not in the PLC, however the PIR for head is not generated.
- Other option is to dump the splice with TH.
- This is a dead end, what happens is that the script execution fails because of a mismatch in redeemers, because we are missing an input
Seems like we had a collision in the generators for TxIn which we import from Test.Cardano.Ledger.Shelley.Serialisation.EraIndepGenerators
The hash generator used is based on an Int:
genHash :: forall a h. HashAlgorithm h => Gen (Hash.Hash h a)
genHash = mkDummyHash <$> arbitrary
mkDummyHash :: forall h a. HashAlgorithm h => Int -> Hash.Hash h a
mkDummyHash = coerce . hashWithSerialiser @h toCBOR
...
instance CC.Crypto crypto => Arbitrary (TxId crypto) where
arbitrary = TxId <$> arbitrary
instance CC.Crypto crypto => Arbitrary (TxIn crypto) where
arbitrary =
TxIn
<$> (TxId <$> arbitrary)
<*> arbitrary
Int has a much smaller domain than a 32-byte BS obviously and we fell into a case where 2 TxIn were generated that led to the inputs being "merged".
How to evolve our code to handle that problem? What this means is that we must ensure the head input is unique and does not conflict with the initials input
- Function is currently not total, it fails if this requirement is not met
- We could return an `Either` or create a "smaller" input type with a smart constructor, but the latter is pretty much the same as the first.
- Trying to filter the initials to remove the head input if it's there => does not work, because it then fails to validate the scripts because of discrepancies between utxo, tx and redeemers
- If/when we move to using cardano-api, we know the `makeTransactionXX` functions can fail so we probably need to fail too.
End up returning Either which ripples all over the codebase
- We have 59 calls to `error` in our codebase, which is not great
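A minimal sketch of the "return `Either`" option, assuming simplified stand-in types: `TxIn`, `Datum`, `AbortInputsError` and `mkAbortInputs` are hypothetical names for illustration, not the actual Hydra or ledger definitions. The smart constructor rejects the two collisions described above instead of calling `error`.

```haskell
import Control.Monad (foldM, when)
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- | Stand-in for a ledger transaction input: (tx id, output index).
data TxIn = TxIn String Int
  deriving (Eq, Ord, Show)

-- | Stand-in for the datum attached to an initial output.
newtype Datum = Datum String
  deriving (Eq, Show)

-- | Hypothetical error type listing the ways the inputs can collide.
data AbortInputsError
  = HeadInputAmongInitials TxIn
  | DuplicateInitialInput TxIn
  deriving (Eq, Show)

-- | Hypothetical smart constructor: succeeds only when the head input is
-- distinct from every initial input and the initial inputs are unique.
mkAbortInputs ::
  TxIn ->
  [(TxIn, Datum)] ->
  Either AbortInputsError (TxIn, Map TxIn Datum)
mkAbortInputs headInput initials = do
  initialInputs <- foldM insertUnique Map.empty initials
  when (headInput `Map.member` initialInputs) $
    Left (HeadInputAmongInitials headInput)
  pure (headInput, initialInputs)
 where
  insertUnique acc (txIn, datum)
    | txIn `Map.member` acc = Left (DuplicateInitialInput txIn)
    | otherwise = Right (Map.insert txIn datum acc)
```

Callers then have to handle the `Left` case explicitly, which is the "ripple" effect noted above.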
We are observing collision in the list of initials now:
- Passing a `Map TxIn Data` as `initialInputs` to `abortTx` function makes it more explicit we don't want collisions there
- Start refactoring of using cardano-api in `Hydra.Chain.Direct{.Tx, .Wallet}`
- Start from the "inside out" by creating `Api.Tx` and converting it to `Ledger.ValidatedTx` on demand to ensure we can create the transaction drafts / pass them to the Wallet as we do right now.
- When creating a `plutusScript` counterpart to the existing one to get a `cardano-api` script (to get a script address) -> where is the `ToCBOR Plutus.Script` instance coming from?
- Turns out.. there is only a `Serialise` instance (from `serialise` package) and other parts (`plutus-ledger` package `Ledger.Scripts` module) define an orphan ToCBOR instance which does use `encode` from `serialise` package
- `observeInitTx` logic could be directly translated via access to `Ledger.TxBody`, but can be made much simpler using the `TxBodyContent`
- Rewriting `observeInitTx` was a bit more work than expected, but works.
- If we use `cardano-api` types we might be able to drop the `Data` from `OnChainHeadState` triples as the `Api.TxOut` can carry the Datum to spend an output (in CtxTx) .. or maybe not as it's optional in the Api.TxOut type.
- Next Step: ensure `finalizeTx` can work with only a `TxBody` and produce a fully balanced & signed `ValidatedTx`
  - ideally, using `makeTransactionBodyAutoBalance` eventually as an alternative to `coverFee_`
- Looking at the signature of `makeTransactionBodyAutoBalance`.. should `initTx` be producing a `TxBodyContent BuildTx Era` instead? and only the `Hydra.Direct.Wallet` make it a `BalancedTxBody` and then sign it? i.e.
initTx :: .. -> TxBodyContent BuildTx Era
-- rename to 'balance'?
coverFee :: .. -> TxBodyContent BuildTx Era -> TxBody Era
sign :: .. -> TxBody Era -> Tx Era
Today's goal: Fanout real UTXO
- We need to add the UTXO to the output of the fanout TX
- The fanout tx is currently incorrect as it outputs a UTXO for the state machine which should not be the case
  - To detect the fanout tx, we can look at the inputs and check one of them uses the head script passing `FanOut` as redeemer
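The detection rule above can be sketched as follows. This is an illustration over simplified stand-in types: `ScriptHash`, `HeadRedeemer`, `Input`, `Tx` and `isFanoutTx` are hypothetical, and only the `FanOut` redeemer name comes from the notes; the real check would work on ledger or cardano-api transactions.

```haskell
-- | Stand-in for the hash identifying a validator script.
newtype ScriptHash = ScriptHash String
  deriving (Eq, Show)

-- | Stand-in for the head script's redeemers (constructors other than
-- 'FanOut' are assumptions).
data HeadRedeemer = CollectCom | Close | Abort | FanOut
  deriving (Eq, Show)

-- | A spent input, with the script locking it and the redeemer used, if any.
data Input = Input
  { inputScript :: Maybe ScriptHash
  , inputRedeemer :: Maybe HeadRedeemer
  }
  deriving (Eq, Show)

newtype Tx = Tx {txInputs :: [Input]}
  deriving (Eq, Show)

-- | A transaction counts as a fanout when at least one of its inputs
-- spends from the head script using the 'FanOut' redeemer.
isFanoutTx :: ScriptHash -> Tx -> Bool
isFanoutTx headScript = any spendsHeadWithFanOut . txInputs
 where
  spendsHeadWithFanOut input =
    inputScript input == Just headScript
      && inputRedeemer input == Just FanOut
```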
Adding an assert to the DirectChainSpec to observe there is a payment made to Alice after observing the FanOutTx
- We fail to submit tx with a strange error about `OutputTooSmall`
- Looks like we were tripped by SN's comment about closing with an arbitrary UTXO! The problem is that we cannot make this test pass keeping it as it is, there are still some kind of verifications done to ensure consistency of txs
Going to add more verification in EndToEndSpec
- We want the same kind of assertion to be done in ETE test, eg. to check we correctly fan out the right UTXO for alice and bob
- Refactoring check from `DirectChainSpec` into a `waitForUtxo` function. Had to extract the `\case` function into a named and typed function in a where clause to make the compiler happy
We cannot use the utxo in ETE directly as it's a mixed type TxIn/JSON -> converting to and From JSON to get a correct Utxo
- The ETE test fails because the output we want to fanout is too small! It's `14` which is fine off-chain as our params are very lenient, but not quite so in Alonzo.
- We commit just 1 ADA in the head which is enough to fanout
We have a failure on the /Hydra.Chain.Direct.Tx/fanoutTx/transaction size below limit for small number of UTXO/ test: With too many UTXOs the transaction becomes way too large, esp. as those UTXOs are pretty much arbitrary and can themselves be very large.
- Trimming it down to filter UTXO > 10 items does not help much
- Disabling the property for now
Thinking about rollbacks and laying out a plan:
- If we can resubmit a rollbacked tx, we do it
- How do we detect a transaction has been rolled back?
- We need to record the block at which a transaction of interest is observed (from Chain Sync)
- When we get a rollback message, we can check which transactions are past the rolled back index
- Then we can resubmit them in order
- How do we know it can be resubmitted?
- we always resubmit
- If resubmit fails on a supposedly rolledback tx
- Alert user
- Act on failing tx in head state:
- Init => Head vanished => discard state
- Commit => User can try to recommit, the same or other utxo?? (we know which commit(s) has been rolledback)
- CollectCom => one commit probably disappeared?
- Who does the resubmit?
- the one who initially submitted it
- if the rollback is adversarial ??
- Could we kill `OnChainHeadState` and only do queries?
  - we have 2 competing states => risk of desync is high
- we should stop observing the chain in the DirectChain
- Each party's chain component should insulate the Head state from rollbacks and try to resubmit
- If someone does not resubmit, there's no point in another party trying to resubmit
- The onchain head state maintains a stream of events to post/to observe
- When a rollback happens, past events are rehandled in a "rollback mode" which means they do not propagate to the Head State
- New observations from the chain still entail notification to the head state
- Head state must trust the chain component and any event coming from it resets the head state
- All nodes need to follow the rollback protocol so there cannot be any "interleaved" event
- in this case we need to abort/close, but that can also be tricky
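The "record the block, then resubmit in order" part of the plan can be sketched like this. It is only an illustration under assumptions: `Observed` and `splitOnRollback` are hypothetical names, and a plain `Integer` stands in for the chain point at which a transaction was observed.

```haskell
import Data.List (partition, sortOn)

-- | A transaction of interest together with the point (here a plain
-- number standing in for a slot or block index) at which it was observed.
data Observed tx = Observed
  { observedAt :: Integer
  , observedTx :: tx
  }
  deriving (Eq, Show)

-- | Given the point we are rolled back to, split the observations into
-- those still on the chain and the transactions past the rollback point,
-- the latter in their original chain order so they can be resubmitted.
splitOnRollback :: Integer -> [Observed tx] -> ([Observed tx], [tx])
splitOnRollback rollbackPoint observations =
  (kept, map observedTx (sortOn observedAt rolledBack))
 where
  (kept, rolledBack) =
    partition ((<= rollbackPoint) . observedAt) observations
```

Whether a resubmission succeeds, and what to do with the head state when it does not, is the open part of the plan above and is not covered by this sketch.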
We need to design the Close sequence for rollbacks
How to deal with benchmarks?
- We need to commit UTXO initially
- We need to pass the keys for the initial UTxO to ensure the commits end up having the same ids between every run
Adding signingKey to Dataset type -> need to implement To/FromJSON, also removing Eq instance
- Adding a roundtrip JSON test for `Dataset`
- We cannot use plain `genDataset`, got some errors trying to generate arbitrary transactions:

src/Test/Aeson/Internal/RoundtripSpecs.hs:59:5:
1) Test.Generator, JSON encoding of Dataset, allows to encode values with aeson and read them back
uncaught exception: ErrorCall
findPayKeyPairAddr: expects only Base or Ptr addresses
CallStack (from HasCallStack):
error, called at src/Test/Cardano/Ledger/Shelley/Generator/Core.hs:434:7 in cardano-ledger-shelley-test-0.1.0.0-827a00c3eaf868a9c6ed74e429f91efce6a3bea6c8e377f0e0d8dab608426e8b:Test.Cardano.Ledger.Shelley.Generator.Core
(after 2 tests)
Exception thrown while showing test case:
findPayKeyPairAddr: expects only Base or Ptr addresses
CallStack (from HasCallStack):
error, called at src/Test/Cardano/Ledger/Shelley/Generator/Core.hs:434:7 in cardano-ledger-shelley-test-0.1.0.0-827a00c3eaf868a9c6ed74e429f91efce6a3bea6c8e377f0e0d8dab608426e8b:Test.Cardano.Ledger.Shelley.Generator.Core
Removed arbitrary dataset, now making sure we can commit generated UTXO
- In the `generatePayment` we extract the UTXO for the initial funds, there is a function apparently for that in the cardano-api we could use to get deterministically the initial `txIn` from which we can construct the payment transactions and hence its initial UTXO
- Adding a `mkGenesisTx` function to compute the transaction and output from `initialFunds` for a given key
- Realising we don't need the initial Utxo but actually an initial payment tx => we still need to store signing key to prime the node
Wart: Would make sense to have the networkId in the CardanoNode, right now we expose a defaultNetworkId which is hard-coded
Implementing castHash to convert from a payment key to genesis utxo key
- We don't have access to constructors so we need to serialise then deserialise to convert the values
GeneratorSpec tests are now failing because we use all UTXO from the initial funding transaction to compute the amount to send, we should only select the "commit UTXO" and pass this around -> writing a (partial) function to select the minimal UTXO.
- This will work fine for the genesis transaction because all UTXO have the same TxId and different index, and the "commit" UTXO is the first one by construction.
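A sketch of why picking the minimal entry selects the "commit" UTXO here, assuming a simplified `TxIn` whose derived ordering compares the transaction id first and the output index second. The names are hypothetical, and unlike the partial function mentioned above this variant returns `Maybe` for the empty case.

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- | Stand-in for a transaction input: (tx id, output index). The derived
-- 'Ord' compares the id first, then the index.
data TxIn = TxIn String Int
  deriving (Eq, Ord, Show)

-- | Total variant of "select the minimal UTXO": the entry with the
-- smallest key, or 'Nothing' for an empty set.
selectMinimalUtxo :: Map TxIn value -> Maybe (TxIn, value)
selectMinimalUtxo = Map.lookupMin
```

When all entries share one transaction id, as for outputs of a single funding transaction, the smallest key is the output with the lowest index.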
Benchmark compiles but fails with a strange error about "peers not connected"
- There was a discrepancy in the value of initial funds leading to an error when committing on-chain -> unified into a hardcoded constant in `CardanoCluster`
- We now only have a problem with paying the fees for the initial funding tx, so we need to use `buildRaw` and `calculateMinFee` to properly build the tx
- We also have a logical problem with the way we generate and run datasets:
  - The `concurrency` parameter defines how many datasets we generate
  - We use the number of datasets generated to define the number of nodes to run
Now struggling to retrieve the ProtocolParameters needed to calculate fees.
- We want to extract them from the genesis-shelley.json file, but apparently there's a discrepancy in the formats: The API has a `ProtocolParameters` format which is different from each Era's format
- There is a JSON instance for genesisShelley, we can use that to read it from file and then convert to API's `ProtocolParameters`
Benchmark fails on submitting commit tx with a ValueNotConserved error: Seems like the UTXO we consume is not correct, probably unknown by the Wallet -> This is the one we construct from the initialFunds which is supposed to work thanks to a ledger function to produce a TxIn from initial funds
Managed to generate dataset with initial funding transaction but ran into a snag: The "leader" of the bench run does the InitTx which means it consumes its initial funding and produces a new transaction out of it, hence the initial funding transaction in the dataset does not exist anymore.
- If we move the commit seeding before sending the init, we get another error: Commit transactions have more than one UTXO committed.
- We forgot to filter the utxo returned by the initial funding transaction
It now fails in the finalizeTx: The wallet raises an exception saying it cannot find the input to spend or cover the fees.
Fixed run of the benchmarks:
- We attached the client to the datasets in the wrong order thus the keys and UTXO were not the right ones
- Always select the maximum value UTXO from the wallet for change
Migrating to cardano-api:
- need to materialise nix on MB's machine because of a weird error
- some failing tests: generator test is failing with invalid witnesses, validation of TX also fails?
The generator is failing with NoRedeemer error -> probably generating tx with scripts and not passing the redeemers => Improving error reporting to see the actual tx generated
Transaction generated by alonzo generator are not necessarily valid because of the scripts or execution units or what not...
- We should talk to ledger team on how to generate valid Alonzo txs, who does generate Alonzo txs for test?
- In the meantime, would make sense to generate Mary txs and resign them because of the issue with body serialisation.
- Trying to increase execution budget in the PParams does not work
Using `freeCostModel` in our generator to sidestep the issue of execution units. Switching makes the test pass, seems like there's a `TODO` in ledger code about using a non-free cost model for generating tx with scripts
Rebasing hail-cardano-api branch onto master in order to fix ETE test: We want to be able to commit 0 or 1 UTXO which has been fixed in master and is the last failing test
Completing work on commits from L1.
Fixing ETE test to ensure it uses a properly committed UTXO. It's pretty straightforward thanks to the utxoToJSON function that converts the generated payment to the expected format.
Transaction fails to be submitted off-chain:
seen messages: {"transaction":{"witnesses":{"keys":["8200825820db995fe25169d141cab9bbba92baa01f9f2e1ece7df4cb2ac05190f37fcc1f9d5840dffeaeb16f1b23a76b1f038f835099c81aaaab7d1ac9e8c0fadc192e7593466810193a72d53c9402eeb7748e6b7eef19d287241b385976da929237f279d3d300"],"scripts":{}},"body":{"outputs":[{"address":"addr_test1vz35vu6aqmdw6uuc34gkpdymrpsd3lsuh6ffq6d9vja0s6s67d0l4","value":{"lovelace":1000000}}],"mint":{"lovelace":0},"auxiliaryDataHash":null,"withdrawals":[],"certificates":[],"inputs":["9fdc525c20bc00d9dfa9d14904b65e01910c0dfe3bb39865523c1e20eaeb0903#0"],"fees":0,"validity":{"notBefore":null,"notAfter":null}},"id":"4c69e0154cdc07ca752157ed6cf247fe449b3d21e40bab0c848a822ae5a54c85","auxiliaryData":null},"utxo":{"998eec9baf49ee66c1609157f00a31198621740226584ae0eb4f32c81ff700f0#1":{"address":"addr_test1vru2drx33ev6dt8gfq245r5k0tmy7ngqe79va69de9dxkrg09c7d3","value":{"lovelace":1000000}}},"validationError":{"reason":"ApplyTxError [UtxowFailure (UtxoFailure (ValueNotConservedUTxO (Value 0 (fromList [])) (Value 1000000 (fromList [])))),UtxowFailure (UtxoFailure (BadInputsUTxO (fromList [TxIn (TxId {_unTxId = SafeHash \"9fdc525c20bc00d9dfa9d14904b65e01910c0dfe3bb39865523c1e20eaeb0903\"}) 0])))]"},"tag":"TxInvalid"}
The input to use for the off-chain transaction was hardcoded -> replacing with the one we generate
Now with another error:
{"transaction":{"witnesses":{"keys":["8200825820db995fe25169d141cab9bbba92baa01f9f2e1ece7df4cb2ac05190f37fcc1f9d58405f14a0e9b7da0deca07529cd4d1fa8e59b5efe345afcc94c7dbf1eb7c2e8a485658e36715d04ea305f510291204c0450f4d7cbc119e2495d98430134a3b9c301"],"scripts":{}},"body":{"outputs":[{"address":"addr_test1vz35vu6aqmdw6uuc34gkpdymrpsd3lsuh6ffq6d9vja0s6s67d0l4","value":{"lovelace":1000000}}],"mint":{"lovelace":0},"auxiliaryDataHash":null,"withdrawals":[],"certificates":[],"inputs":["998eec9baf49ee66c1609157f00a31198621740226584ae0eb4f32c81ff700f0#1"],"fees":0,"validity":{"notBefore":null,"notAfter":null}},"id":"d87840c06e3e65d422ed9181273579dc82e6b471024d3899610b5c025a243442","auxiliaryData":null},"utxo":{"998eec9baf49ee66c1609157f00a31198621740226584ae0eb4f32c81ff700f0#1":{"address":"addr_test1vru2drx33ev6dt8gfq245r5k0tmy7ngqe79va69de9dxkrg09c7d3","value":{"lovelace":1000000}}},"validationError":{"reason":"ApplyTxError [UtxowFailure (MissingVKeyWitnessesUTXOW (WitHashes (fromList [KeyHash \"f8a68cd18e59a6ace848155a0e967af64f4d00cf8acee8adc95a6b0d\"])))]"},"tag":"TxInvalid"}
Probably because we are passing the wrong key?
- => Keys and addresses were generated and hardcoded
ETE test is now passing!
Fixing benchmark to work with on-chain commits:
- we currently generate an arbitrary dataset from a seed random Utxo and then generating transactions with the right keys
- in the case of the constant Utxo, things are fine because we generate key pair to produce a utxo so we could as well keep the (initial) keys around and then commit the initial utxo on chain when starting the benchmark
Struggling a bit with getting to/fromJSON instances right for KeyPair, current solution is to store the signing key only and regenerate the verification key from the serialised bytes
the bytestring is hex-encoded in a JSON object
- Added a function to transform a `Ledger.KeyPair` into a `(VerificationKey PaymentKey, SigningKey PaymentKey)`
Feels like changing the benchmark is a bit more involved than merely adding keys and committing, as the whole logic of generating transactions beforehand is a bit borked. We should probably provide not a dataset but parameters for a dataset, like number of transactions to run and other things, and generate the txs on the go. This might skew the timings a bit but is probably dwarfed by IOs anyway. Feels like the "right path" would be:
- have a single way of generating dataset, eg. one utxo per client
- generate keys for each participants
- pass dataset parameters instead of actual dataset
- do not store the dataset? Once we have keys defined then the UTxOs should be constant?
- generate transactions in client from previous transaction, using the same keys
- transactions can be sent at random to some other party? but this might deplete the clients' funds and lead to a client not being able to post txs anymore
- fix the payment graph so that amounts stay constant and all parties can always keep generating txs
Still trying to properly commit actual UTXOs.
Managed to get TxOut transformed between an Alonzo one and a Mary one, without requiring full transformation of the internal ledger
- Tests are still failing but with a different error, namely that we commit more than one UTxO which is odd... Actually not: We wait for all payments at some address and use the retrieved UTxO there to commit, but we should only commit one of them.
got another interesting error:
ErrNotEnoughFunds {missingDelta = Coin 2298000}
At least, I can see that the inputs are correctly set, with the committed UTxO as input and also in the datum:
failed to cover fee for transaction: ErrNotEnoughFunds {missingDelta = Coin 2298000},
ValidatedTx {body = TxBodyConstr TxBodyRaw {
_inputs = fromList [TxIn (TxId {_unTxId = SafeHash "bc1eae5aa8e72d3f9e5c0cd725b49dc47954570b6087eb16b5cc3f4ce6daf4ea"}) 1
,TxIn (TxId {_unTxId = SafeHash "e50062182d5d401d13249a7f7e7e1ac73deec0170421e10bc7d9b346c284ebdd"}) 1],
_collateral = fromList [],
_outputs = StrictSeq {fromStrict = fromList [(Addr Testnet (ScriptHashObj (ScriptHash "6679d3c92844becb16c55161b60111336e1ba2f3d14bbb52b051c4db")) StakeRefNull,Value 2000000 (fromList []),SJust (SafeHash "549485dcc8131ab64122a9163080943f55b83ae368bc55bec73e583f192f3080"))]},
_certs = StrictSeq {fromStrict = fromList []},
_wdrls = Wdrl {unWdrl = fromList []},
_txfee = Coin 0,
_vldt = ValidityInterval {invalidBefore = SNothing, invalidHereafter = SNothing},
_update = SNothing,
_reqSignerHashes = fromList [],
_mint = Value 0 (fromList []),
_scriptIntegrityHash = SNothing,
_adHash = SNothing,
_txnetworkid = SNothing},
wits = TxWitnessRaw {_txwitsVKey = fromList [],
_txwitsBoot = fromList [],
_txscripts = fromList [(ScriptHash "12c4f8ff8070f0d659bdb2ecf844190ddb1134ef821764bd2b4649b2",PlutusScript PlutusV1 ScriptHash "12c4f8ff8070f0d659bdb2ecf844190ddb1134ef821764bd2b4649b2")],
_txdats = TxDatsRaw (fromList [
(SafeHash "2502ff9c9c341dd1384724ae35eab0b19e394c90226892fcc8e7cc86342d324e",DataConstr B "\248\166\140\209\142Y\166\172\232H\NAKZ\SO\150z\246OM\NUL\207\138\206\232\173\201Zk\r"),
(SafeHash "549485dcc8131ab64122a9163080943f55b83ae368bc55bec73e583f192f3080",DataConstr Constr 0 [I 10,B "{\"e50062182d5d401d13249a7f7e7e1ac73deec0170421e10bc7d9b346c284ebdd#1\":{\"address\":\"addr_test1vru2drx33ev6dt8gfq245r5k0tmy7ngqe79va69de9dxkrg09c7d3\",\"value\":{\"lovelace\":1000000}}}"])]),
_txrdmrs = RedeemersRaw (fromList [(RdmrPtr Spend 0,(DataConstr Constr 0 [],WrapExUnits {unWrapExUnits = ExUnits' {exUnitsMem' = 0, exUnitsSteps' = 0}}))])}, isValid = IsValid True, auxiliaryData = SNothing},
using utxo: fromList [(TxIn (TxId {_unTxId = SafeHash "bc1eae5aa8e72d3f9e5c0cd725b49dc47954570b6087eb16b5cc3f4ce6daf4ea"}) 0,(Addr Testnet (ScriptHashObj (ScriptHash "07204aff6885e57d799ee359f7e401d64ff3b54e0de7c4edc3f163d9")) StakeRefNull,Value 2000000 (fromList []),SJust (SafeHash "f4b9d64e4725efc05d7d078bb19e952b288c8403f5a585a8a6ffe589a9851614"))),
(TxIn (TxId {_unTxId = SafeHash "bc1eae5aa8e72d3f9e5c0cd725b49dc47954570b6087eb16b5cc3f4ce6daf4ea"}) 1,(Addr Testnet (ScriptHashObj (ScriptHash "12c4f8ff8070f0d659bdb2ecf844190ddb1134ef821764bd2b4649b2")) StakeRefNull,Value 2000000 (fromList []),SJust (SafeHash "2502ff9c9c341dd1384724ae35eab0b19e394c90226892fcc8e7cc86342d324e")))]
Finally fixed the commit test:
- The UTXO set maintained by the Wallet is right: When we find a block, we traverse the transaction list (topological ordering?), remove the txins we know from the map and add the txouts we found corresponding to our address of interest.
- The problem was in the way we select the UTXO to use in `coverFee`: We take the maximum of the UTXO from our internal state but this maximum is just an ordering of txids and chances are we get a smaller UTXO. Just filtering the map to select UTXO with a value higher than some threshold makes the test pass.
- The only remaining test that fails is the `EndToEndSpec` test as we still try to commit an arbitrary UTXO.
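The key-versus-value confusion described above is easy to reproduce with a plain `Map`. The sketch below uses hypothetical stand-in types (`TxIn`, an `Integer` amount) and shows the difference between the largest key and the largest value; it is not the wallet's actual code, whose fix was to filter by a value threshold.

```haskell
import Data.List (maximumBy)
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map
import Data.Ord (comparing)

-- | Stand-in for a transaction input: (tx id, output index).
data TxIn = TxIn String Int
  deriving (Eq, Ord, Show)

-- | Maximum of the map as ordered by its keys: this only reflects the
-- ordering of transaction ids, not how much value an output holds.
largestByKey :: Map TxIn Integer -> Maybe (TxIn, Integer)
largestByKey = Map.lookupMax

-- | Entry holding the largest value, which is what fee coverage needs.
largestByValue :: Map TxIn Integer -> Maybe (TxIn, Integer)
largestByValue utxo
  | Map.null utxo = Nothing
  | otherwise = Just (maximumBy (comparing snd) (Map.toList utxo))
```

For a wallet holding `TxIn "aa" 0` with 5000000 and `TxIn "ff" 1` with 1000, `largestByKey` picks the `"ff"` entry while `largestByValue` picks the `"aa"` entry.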
Fixing TUI and the use of addresses, comparing by Show instance which is not great but needed because we keep them as keys in Map. Now implementing mkSimpleTx which is the function from TUI that creates the actual transaction to be committed on chain
Completed implementation of mkSimpleTx, it now returns an Either with an error, because makeTransactionBody does: It checks well-formedness of the transaction.
Got a failure with invalid witnesses and an encoding problem for UTxOs, getting an invalid UTF-8 encoding error
- Problem comes from `AssetName`. `AssetName` are encoded as Latin-1 in the cardano-api, why?
- `ToJSON/FromJSON` instances in our version of cardano-api are wrong, they do not roundtrip properly. We want to upgrade our cardano-node dependency as it's been fixed recently
Fixed the dependencies and the few impacts from changes in API
- Still have a few failures related to the UTXO generator
- One test that's failing is the one about size of commits: What is this test about?. Perhaps we only need to ensure a single UTXO would fit atm?
- Discussed usage of `cardano-api` in `Hydra.Ledger.Cardano`, MB is implementing it; that the Alonzo types are more complete / simpler to handle, but most of Alonzo features are not supported yet by our integration -> need to strip down generators in tests
- "Fixed" benchmarks to only commit a single UTXO (current limitation)
- Continued "committing real UTXO" in pairing session
  - Expect `postTx` of committing arbitrary UTXO to fail when really spending the selected UTXO in `commitTx`
  - To make it succeed though, we need to generate a payment tx such that the `Hydra.Direct.Wallet` "sees" the resulting UTXOs, knows about them and can spend them
  - We are using `cardano-api` via `CardanoClient` to construct, sign and submit this transaction
  - In order to pass the resulting `UTxO` to `CommitTx (Utxo CardanoTx)` we would either need to convert `cardano-api` `UTxO` to the `cardano-ledger` `UTxO` type, or utilize the refactored `Hydra.Ledger.Cardano` which uses `cardano-api` types
- The `shell.nix` by default does also build local cluster scripts which cannot be disabled with an argument.
- Also, this workbench is a bit confusing and it didn't seem to be giving me something I need.
- The `cabal` in scope is actually wrapped and uses a different `cabal.project` and using the standard one is not working well with `exactDeps = true`
- In summary, here is a diff I used to get into a `nix-shell` which can `cabal test cardano-api`:
diff --git a/shell.nix b/shell.nix
index b44ff6d99..c20294d26 100644
--- a/shell.nix
+++ b/shell.nix
@@ -89 +89 @@ let
- cabalWrapped
+ pkgs.cabal-install
@@ -104,5 +103,0 @@ let
- ## Workbench's main script is called directly in dev mode.
- ++ lib.optionals (!workbenchDevMode)
- [
- cluster.workbench.workbench
- ]
@@ -112,9 +106,0 @@ let
- ]
- ## Local cluster not available on Darwin,
- ## because psmisc fails to build on Big Sur.
- ++ lib.optionals (!stdenv.isDarwin)
- [
- pkgs.psmisc
- cluster.start
- cluster.stop
- cluster.restart
@@ -125 +111 @@ let
- exactDeps = true;
+ exactDeps = false;
Looking at PR to review, SN's PR fails to build on CI with the following error:
hydra-node: failed to submit tx: HardForkApplyTxErrFromEra S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (MaxTxSizeUTxO 17634 16384)))]})))))
CallStack (from HasCallStack):
error, called at src/Relude/Debug.hs:288:11 in relude-1.0.0.1-7lFjKp98vwwCeaGqBf05Yi:Relude.Debug
error, called at src/Hydra/Chain/Direct.hs:329:34 in hydra-node-0.1.0-inplace:Hydra.Chain.Direct
Seems like we are already hitting some limits of the chain, in this case the size of Tx apparently (greater than 16K)
- Bumping tx size to 50K and block size to 100K allows running the benchmark
- committed Utxo is about ±100 Utxo in total which makes the datum quite large
Something we could do to reduce the size of the datum on chain would be to make the commit and collectCom transactions pass a datum made from a hash of the MT of the committed Utxos.
- The committed Utxos would then be sent as a first message off-chain, signed by each party and verified thanks to the MT root hash:
  - each party sends a `Committed` message containing its Utxo
  - each party reconstructs the Utxo MT and verifies its root hash
- But is it really needed? We can reap the Utxo directly from the committed transactions, no need to pass it around
- Plus, how is it verified on-chain? The `ν_commit` validator needs to verify, in the case of an abort, that the UTXO posted by the `AbortTx` are indeed the ones present in the datum of each of the aborted commits: This could be achieved by computing the MT root of the UTXOs committed by the abort transactions, as the validator has access to them, but of course this could be computationally relatively expensive.
Another failure running the benchmark:
hydra-node: failed to submit tx: HardForkApplyTxErrFromEra S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (FeeTooSmallUTxO (Coin 3365841) (Coin 3298000))))]})))))
Two problems:
- We can't just pack arbitrarily many Utxo because this would blow up the tx size -> limit to a single Utxo per participant
- We can only ever commit concrete `CardanoTx` not abstract ones on the direct chain -> removing parameterisation on concrete transactions handling in Direct chain
- We can do things in 2 steps:
  - first degenerify the `tx` when using `DirectChain`
  - second limit commits to a single Utxo
Where to check we do not commit more than one TX?
- We could do it in the `CommitTx` but this would require introducing some more type for representing a single TxOut -> large change
- We do it at the `DirectChain` level throwing an exception if something goes wrong
- We check the size of committed Utxo in the `fromPostChainTx` function -> we could do it in the `commitTx` function instead?
Still 2 tests failing:
- 1 test about the size of the commitTx -> should be fixed once we change the interface to the `commitTx` function
- End to end test
Interestingly our current code does not allow committing no UTxO, which explains why the ETE test is failing: node 2 and 3 do not commit anything.
But the error reported says: MoreThanOneUtxoCommitteed which is certainly misleading.
- Added a test to `DirectChainSpec` asserting we can commit an empty UTxO set, but this feels a bit too high-level for this kind of test. Perhaps there could be a more granular test module for the `postChainTx` function? There seems to be a separate responsibility here, which is the handling of the on-chain state
- But the `commitTx` function accepts one and only one UTxO, so we cannot commit an empty UTxO set. Going down the easy route: pass `Maybe Utxo` to the `commitTx` function
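A sketch of the "zero or one UTxO" check that would sit in front of a `commitTx` taking a `Maybe`. The names `CommitError`, `TooManyUtxoToCommit` and `atMostOneUtxo` are hypothetical and the UTxO set is modelled as a plain `Map`; the actual error constructor and types in the codebase differ.

```haskell
import Data.Map.Strict (Map)
import qualified Data.Map.Strict as Map

-- | Hypothetical error: carries how many entries were offered.
newtype CommitError = TooManyUtxoToCommit Int
  deriving (Eq, Show)

-- | Narrow a UTxO set to what a single commit can carry: nothing, or
-- exactly one entry. Anything larger is rejected with an explicit count,
-- so an empty commit is not reported as "more than one".
atMostOneUtxo :: Map txIn txOut -> Either CommitError (Maybe (txIn, txOut))
atMostOneUtxo utxo =
  case Map.toList utxo of
    [] -> Right Nothing
    [single] -> Right (Just single)
    _ -> Left (TooManyUtxoToCommit (Map.size utxo))
```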
What we achieved?
- Swapped Hydra node to use real cardano node on a devnet, removing Mock chain
- Working on making master "green" again following big changes
- Demo works again with cardano node
What we plan to do?
- Properly commit and fanout Utxo from/to the real chain
- Design and implement handling of rollbacks
- Start implementing proper OCV in Plutus (again)
- Follow-up meetings with potential Hydra early adopters
- Seems like `haskell.nix` puts an exe into the shell env when it's mentioned as a `build-tool-depends` in one of the local packages' `.cabal`
- When showing the demo, ensure the `devnet` is wiped and `cardano-node` is restarted, otherwise the `hydra-node` (its wallet) could not find a "seed input" and crashes (for now at least)
- To have the `cardano-node` docker (entrypoint) not remove key arguments and indeed produce blocks it requires the environment variable `CARDANO_BLOCK_PRODUCER=true`
  - https://github.com/input-output-hk/cardano-node/blob/master/nix/docker/context/bin/entrypoint seems to be the entrypoint in use
- When getting `NoLedgerView` errors, updating genesis `systemStart` (byron + shelley) to be "within some time" helps
- Investigating why the returned tx is not captured by `observeInitTx`
  - We should really log why something is not an initTx etc. -> Either Reason OnChainTx
- Found the reason:
  - `observeInitTx` thought party is not `elem` of the converted parties
  - `convertParty` creates un-aliased `Party` from the chain data
- Possible solutions:
  - `alias` of `Party` is not taken into account for `Eq` of `Party`
  - strip `alias` from `party` before checking `elem party parties`
  - Do not incorporate alias into party but wrap it with `data AliasedParty = AliasedParty Text Party`
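To illustrate the problem and the third option, here is a small sketch with simplified stand-in types. `PartyWithAlias`, `Party`, `AliasedParty` and `isParticipant` are written for this example only, `String` is used in place of `Text` to keep it dependency-free, and the key is a plain `Integer`.

```haskell
-- | The problematic shape: the alias is part of the party, so the derived
-- 'Eq' distinguishes an aliased party from the un-aliased one rebuilt
-- from chain data.
data PartyWithAlias = PartyWithAlias
  { pAlias :: Maybe String
  , pVkey :: Integer
  }
  deriving (Eq, Show)

-- | Option three: the party is identified by its key only ...
newtype Party = Party {vkey :: Integer}
  deriving (Eq, Ord, Show)

-- | ... and the alias lives in a wrapper that is never compared against
-- on-chain data.
data AliasedParty = AliasedParty String Party
  deriving (Show)

-- | Membership check against parties converted from the chain: only the
-- wrapped 'Party' takes part in the comparison.
isParticipant :: AliasedParty -> [Party] -> Bool
isParticipant (AliasedParty _ party) onChainParties =
  party `elem` onChainParties
```

With the first shape, `PartyWithAlias (Just "alice") 42` is not an `elem` of `[PartyWithAlias Nothing 42]`, which matches the behaviour found in `observeInitTx`.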
- Receive a `CommandFailed` when trying to commit
- Ran into the problem that the hydra-tui was showing "Initializing" == `InitialState`, but in fact we were only in "ReadyState" -> this is because we violated "make impossible states unrepresentable" when managing the state in Hydra TUI! :(
- Make HeadLogic "alias"-proof by adding an aliased party in `HeadLogicSpec`
- Lots of repetition between `README.md` and `demo/README.md`, especially after introducing `prepare-devnet.sh` -> only explain in demo/?
- Chain works, but seems like hydra-nodes are not connecting to each other (using direct `cabal` invocation of demo setup)
- Nodes seem to be connected in the `docker-compose` setting
  - No tx submission in an open head though with repeating ReqTx:

{"message":{"node":{"by":{"alias":"alice","vkey":"000000000000002a"},"event":{"message":{"transaction":{"witnesses":{"scripts":{},"keys":["820082582060cdff1c5cd672fb7d8df7f60121fabd4416b2381df70d5c65cb1559af81599858406d08cd35336088575712d8f7fb5fc96a9e29fa6c89305a920aa41e2162a98b0daeb82a7696d14cd9ff6b308eebf71620f354a6820467d87ca5ff8ca383f10705"]},"body":{"outputs":[{"address":"addr_test1vre6wmj9zmh0fjfavedh6q9lq32lunnlseda4xk7t0cg47sal9qft","value":{"lovelace":1893670963}}],"mint":{"lovelace":0},"auxiliaryDataHash":null,"withdrawals":[],"certificates":[],"fees":0,"inputs":["ae85d245a3d00bfde01f59f3c4fe0b4bfae1cb37e9cf91929eadcea4985711de#93"],"validity":{"notBefore":null,"notAfter":null}},"id":"7071e48915eb9c3de986cef336544c24af8a01eedbf4721ddbef6fce0b591ad3","auxiliaryData":null},"party":{"alias":"alice","vkey":"000000000000002a"},"tag":"ReqTx"},"tag":"NetworkEvent"},"tag":"ProcessedEvent"},"tag":"Node"},"timestamp":"2021-11-17T18:21:07.164988302Z","namespace":"HydraNode-1","threadId":33}

Just realised we have this section in shell.nix
tools = [
pkgs.pkgconfig
pkgs.haskellPackages.ghcid
pkgs.haskellPackages.hspec-discover
pkgs.haskellPackages.graphmod
pkgs.haskellPackages.cabal-plan
pkgs.haskellPackages.cabal-fmt
# Handy to interact with the hydra-node via websockets
pkgs.ws
# For validating JSON instances against a pre-defined schema
pkgs.python3Packages.jsonschema
pkgs.yq
# For plotting results of local-cluster benchmarks
pkgs.gnuplot
];
but actually it's not used and the tools listed are not available on the command-line!
Seems like they are used in the shell based on haskell.nix but not in the cabal only shell
Problems on master:
- Flaky test on DirectChain
- Test checking conformance of logs with schema does not seem to catch undocumented ctors
Adding an item in the backlog for rollbacks which we should handle sooner rather than later
Looks like the flakiness of DirectChainSpec comes from the use of withCluster which starts 3 nodes and produces rollbacks
- We need to add `waitForSocket` everywhere which is clumsy => refactor to move into `withBFTNode`
Last step before merge to master = make benchmark runnable again
- We need to generate all key pairs for all nodes in the cluster and then write relevant files for each node
- We need the (Cardano) keys to modify the entries in `initialFunds` -> move them before we start BFTNode
- Passing the list of verification keys to `makeNodeConfig` so that we do the change to `genesis-shelley.json` inside the `withBFT` function => add empty list when not needed
Using lenses to update the `initialFunds` field, we can use `addField` function from `CardanoNode`
We need to encode the `VerificationKey PaymentKey` we have into Hex-encoded thingy
We can run the benchmarks and got results 🎉
- Seems like re-running benchmarks does not work correctly now
- resubmit transaction when it gets rollbacked? if it is valid
- indicate to the user probability of a rollback -> %age of stability = paranoia level
- making it a function of value committed? overridable with own settings
- we could replay the sequence of events? => genuine rollbacks (non-adversarial)
- if L1 can rollback so can L2
- once enough time has passed no rollback can happen => only need to keep stream of events until $k$ slots have passed
- the off-chain can start from where it was, the latest snapshot, if the Head can be reopened with same UTXO set
- if contestation period is shorter than rollback period this could be a security issue: we could not submit the exact same close because the validator would check the contestation period extends from the start of the close
- txs in the mempool would be replayed automatically in case of rollbacks => we'll observe them in the ChainSync
- user might want to introspect the on-chain state?
- PAB does not do anything about rollbacks -> pushing it to users
- what to expose to users? stability level, probability that there will be a rollback (99.99% is a few dozen blocks)
- provide a `HeadRollbacked` output to users
- practically, most rollbacks have been pretty small (< k/4)

https://plutus-apps.readthedocs.io/en/latest/plutus/howtos/handling-blockchain-events.html
https://plutus-apps.readthedocs.io/en/latest/plutus/explanations/rollback.html
Some documentation on Settlement error
SettlementError(b, eps, g) = g * exp(-0.69 - b * [0.249 * eps^{2.5} + 0.221 * eps^{3.5}])
Parameters:
b: the number of blocks on top of the transaction in question;
eps = 1 - 2*[adversarial stake], where the adversarial stake is a real between 0 and ½;
g: the grinding power of the adversary. A single-CPU grinding would correspond to g=10^5; a conservative default choice could be g=10^8 corresponding to a 1000-CPU grinding.
The resulting SettlementError(b, eps, g) is an estimate of the probability that a valid transaction appearing b blocks deep can be later invalidated. Here exp(X) refers to e^X where e is the base of the natural logarithm.
Note: given the crude estimate on grinding coming from the factor g, for small values of b the formula will produce outputs greater than 1 (until the exponential term becomes small enough to counter the effect of g). This simply means that for such small values of b this method does not provide any guarantees.
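To get a feeling for the numbers, the formula above can be transcribed directly. This is only a sketch: `settlementError` mirrors the documented formula, while `blocksForError` is a hypothetical helper (not part of hydra-node) that searches for the smallest depth at which the estimate drops below a target.

```haskell
-- | Estimate of the probability that a transaction buried under @b@ blocks
-- can still be invalidated, transcribed from the formula above.
settlementError ::
  -- | b: number of blocks on top of the transaction
  Double ->
  -- | eps = 1 - 2 * adversarial stake
  Double ->
  -- | g: grinding power of the adversary
  Double ->
  Double
settlementError b eps g =
  g * exp (-0.69 - b * (0.249 * eps ** 2.5 + 0.221 * eps ** 3.5))

-- | Smallest block depth at which the estimate is at most @target@.
-- Hypothetical helper. Assumes @eps > 0@ and @target > 0@, otherwise the
-- search never terminates.
blocksForError :: Double -> Double -> Double -> Int
blocksForError target eps g =
  head [n | n <- [0 ..], settlementError (fromIntegral n) eps g <= target]
```

For example, `blocksForError 1e-4 eps 1e8` gives the depth needed for a 99.99% stability level under the conservative grinding default, for whatever adversarial stake `eps` encodes.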
- Add instructions on how to start a local, single cardano-node devnet.
- Generating a topology file feels annoying, but providing all peers as arguments (as we do) might scale less?
echo '{"Producers": [{"addr": "127.0.0.1", "port": 3001, "valency": 1}]}' > topology.json
- Got a `HandShakeError` with `VersionMismatch`
  - was led astray on updating to a newer `cardano-node` dependency in our code
  - however, our code was already newer than the docker image so supporting the latest + one before was the solution
- `hydra-node` can connect after fixing protocol version, `initTx` is created and submitted, but not observed
  - Also when re-trying / re-submitting the node crashes with
hydra-node: cannot find a seed input to pass to Init transaction
CallStack (from HasCallStack):
error, called at src/Relude/Debug.hs:288:11 in relude-1.0.0.1-KWrPF7zdlFZ8gdnjuoSoUr:Relude.Debug
error, called at src/Hydra/Chain/Direct.hs:359:13 in hydra-node-0.1.0-inplace:Hydra.Chain.Direct
  - After restarting the hydra-node and re-trying [i]nit this is the error
hydra-node: failed to submit tx: HardForkApplyTxErrFromEra S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (ValueNotConservedUTxO (Value 0 (fromList [])) (Value 900000000000 (fromList []))))),UtxowFailure (WrappedShelleyEraFailure (UtxoFailure (BadInputsUTxO (fromList [TxIn (TxId {_unTxId = SafeHash "39786f186d94d8dd0b4fcf05d1458b18cd5fd8c6823364612f4a3c11b77e7cc7"}) 0]))))]})))))
CallStack (from HasCallStack):
error, called at src/Relude/Debug.hs:288:11 in relude-1.0.0.1-KWrPF7zdlFZ8gdnjuoSoUr:Relude.Debug
error, called at src/Hydra/Chain/Direct.hs:328:34 in hydra-node-0.1.0-inplace:Hydra.Chain.Direct
- When adding all the block signing keys the node spams a weird log message (error?)
{"thread":"31","loc":null,"data":{"val":{"kind":"TraceNoLedgerView","slot":16370873857},"credentials":"Cardano"},"sev":"Error","env":"1.30.1:0fb43","msg":"","app":[],"host":"eiger","pid":"1","ns":["cardano.node.Forge"],"at":"2021-11-16T18:29:45.70Z"}
{"thread":"31","loc":null,"data":{"kind":"TraceStartLeadershipCheck","chainDensity":0,"slot":16370873858,"delegMapSize":0,"utxoSize":6,"credentials":"Cardano"},"sev":"Info","env":"1.30.1:0fb43","msg":"","app":[],"host":"eiger","pid":"1","ns":["cardano.node.LeadershipCheck"],"at":"2021-11-16T18:29:45.80Z"}
- Re-using a db + cardano-node run command from e2e test works (hydra-node shows initializing!)
- Manually invoking cardano-node keeps producing errors (and no blocks), this time:
{"thread":"32","loc":null,"data":{"val":{"kind":"TraceNodeNotLeader","slot":3462},"credentials":"Cardano"},"sev":"Info","env":"1.30.1:0fb43","msg":"","app":[],"host":"eiger","pid":"1","ns":["cardano.node.Forge"],"at":"2021-11-16T18:43:29.20Z"}
Goal: remove a pendingWith statement in a test
- Master does not compile, so reverting to last green point which is 21 days in the past. We should really take care of not breaking master in the future as this prevents rapid intervention and branching when need be (not that I am a big fan of branching anyhow)
Trying to reactivate ServerSpec test, seems like there's a race condition.
- I don't understand how the code works anymore so it's unclear to me why it's failing, this has to do with more messages coming than expected 🤔 ?
Here is a trace I see
received "{\"me\":{\"vkey\":\"0000000000000001\"},\"tag\":\"Greetings\"}"
resp: Greetings {me = 0000000000000001}
received "{\"me\":{\"vkey\":\"0000000000000001\"},\"tag\":\"Greetings\"}"
resp: Greetings {me = 0000000000000001}
sending ReadyToCommit {parties = fromList []}
- Trying to augment timeout does not work
Adding showLogsOnFailure to have the server's traces displayed
- Of course, the client stops after receiving one message so it never waits for anything after the greetings....
- That was easy :)
Goal: Use DirectChain in EndToEndSpec test
Added signing and verification keys for Cardano tx
- We can see the `initTx` being submitted and added to the Mempool but not having it be part of a block
- Adding JSON instances to `DirectChainLog` in order to have better traces (we currently pass a `nullTracer` because we don't have those instances). Note that we don't write full instances because `ValidatedTx` is pretty complex so just use its `show` instance
We need to increase timeout for observing init and commit transactions
- We see 1 node can commit but the other nodes are crashing with `no ownInitial`, which says they cannot extract their own pkh from the initials Utxo => 🤦 We forgot to add our own pkh to the list of initials!
- We fail to observe `headIsOpen` because of timeout again

Block length on our cluster is 2 seconds but we produce one block out of 3 because of our config which has 3 validators => we need 6 seconds to produce a block
- It depends on the `slotLength` and `activeSlotCoeff`: 100ms * 20 = 2s
- The "rule" is that $3k / f > epochLength$
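The arithmetic behind these timings, as a small sketch. The value `f = 0.05` is an assumption inferred from the "100ms * 20" above (one active slot out of 20), and `waitForOwnBlock` simply mirrors the "one block out of 3" reasoning of this entry; both function names are made up for illustration.

```haskell
-- | Expected time between two blocks: on average one slot out of @1 / f@
-- has a leader.
expectedBlockInterval ::
  -- | slot length in seconds
  Double ->
  -- | active slots coefficient f, in (0, 1]
  Double ->
  Double
expectedBlockInterval slotLength f = slotLength / f

-- | Expected wait until a given node forges a block when block production
-- is shared evenly between @nodes@ producers (the reasoning of this entry).
waitForOwnBlock :: Int -> Double -> Double -> Double
waitForOwnBlock nodes slotLength f =
  fromIntegral nodes * expectedBlockInterval slotLength f
```

With a 0.1s slot length and `f = 0.05` this gives the 2 seconds and 6 seconds mentioned above; with `f = 1.0` every slot is active and the interval equals the slot length.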
Changed active slot coeff to 1.0 but to no avail, we still miss some commits and don't see the head being opened
- Carol has no fund so she can't do anything...
We got a green ETE test with DirectChain 🎉
- Tests are a bit slow though, even when we increase slot coefficient
We have a problem with benchmark: they run a cluster with arbitrary number of nodes, so we need to generate n addresses and keys for each node. Plan is:
- read genesis file as JSON (raw JSON, we don't care to use cardano-api)
- generate right number of key pairs
- need to store the files in the temporary directory where we run the cluster
- convert to CBOR encoded address using `buildAddress` from `CardanoClient`
- inject into `initialFunds` field in `genesis-shelley.json`
After switching to use withDirectChain in Bench.EndToEnd we have issues with overlapping instances on Utxo CardanoTx type which is an alias to the ledger's type
- Cardano.Api provides `ToJSON` instances which we should probably use, but we know cardano-api is prone to breaking changes. Also it has no `FromJSON` for some of the types, which is annoying as we rely on those in various tests
- Got stuck down a rabbit hole with lot of tests failing => revert and use custom function to bypass overlapping instances + comment out some tests relying on `Arbitrary` instances for `DirectChainLog`
Ending the day with only DirectChainSpec tests failing, unclear why.
Goal: Complete the OCV logic with Fanout
We have troubles with rollbacks in the chain: We observe some rollbacks even though there should not be any because we are on BFT nodes => Trying to run the network with a single BFT node makes the flakiness disappear
- BFT nodes should not do rollbacks but we have rollbacks, hence it's probably not BFT nodes we are running.
While writing observeFanoutTx we are stuck with an issue: The test fails but there's no obvious reason why it's failing, seems like the redeemer (Fanout) cannot be decoded correctly.
- Turns out it was a copy-paste issue 🤦
Ended up having the full test with mock fanout transaction (eg. one without actual committing txs) passing:
- We have a single node to avoid rollbacks
- The transition SM is still very simple, we probably want to ditch it altogether as we are handling the state threading by hand
- Refactored `CardanoCluster` and `CardanoNode` to make it easy to run a single node cluster, also removing copying of keys which is unneeded in the cluster. We can just load the keys from where they are stored in the source tree
- Work on open & close via Direct chain (using mock validators)
- Patterns start to arise: there are many things which could be DRYed
- observing state machine transitions are very similar
  - keeping track of "interesting" utxo in `OnChainHeadState` is very similar
  - constructing SM transition txs is very similar
- When adding the closing `SnapshotNumber` it was interesting to observe that initially I kept it as `Datum`, but to `observeCloseTx` it was more appropriate to keep it as `Redeemer` and decode it from that
  - There was no need to store it as `Datum` (right now)
  - Also: Redeemers are more space efficient as we would only include them in the spending tx
- Concerning `OnChainHeadState`
  - Seeing the repetition in `OnChainHeadState` of `(TxIn, TxOut, Data)` triples really gives the hint that this state-tracking code could also be generalized and only keep track of "interesting" utxo + their data
  - OTOH, a "head identifier" is currently implicitly encoded in the `threadOutput` `TxOut` address, while it would make perfect sense to add it to the `PostChainTx` / `OnChainTx` types to describe "which head to abort" etc.
  - This makes me think that we could keep the whole state (head id + interesting utxo) abstractly in the `HeadState`, e.g. existentially quantified; That way, we would not need `TVar` in the `Chain.Direct` and make the whole component stateless!
- Copy & extend e2e test to also cover posting & observing `CollectComTx`
- This should be fairly easy, as we know the total utxo from the `PostChainTx` value
  - OCV (not covered now) would "only" need to check all committed, i.e. all PTs present
- How far to go right now? `collectCom` could just ignore all the committed utxo?
- Is probably the smallest step, but there would be no real value in the head (or the ledger would not allow it)
- When drafting `collectCom` I realize that we do not need `HeadParameters`, but rather `Data Era`
  - it is enough to just keep the Datum around uninterpreted in the `OnChainHeadState`
- When fixing `TxSpec` usage of construct functions (because changed signatures), I realize that "cover fee" test was more arbitrary than necessary
  - It was "side-loading" initial inputs, instead of feeding the `initTx` outputs into `abortTx`
- The more complex tests in `TxSpec` cry for some refactoring now
  - Some DSL or operators to easily construct outputs, datums and "forward" them from one to the next tx would help
- Continuing with implementing `collectComTx` via the canonical transaction size prop test
  - also about 7kB transaction size for most `arbitrary :: Utxo SimpleTx`
- Next: roundtrip-test with a newly created `observeCollectComTx`
  - As the `OnCollectComTx` is actually holding no data, this should be quite trivial
- `observeCollectComTx` is just the same as `observeAbortTx` and can be obviously DRYed
  - deliberately holding back on it though
  - it's possible that something comes up which makes it not as straight-forward
- Unit tests pass now. Quick confusion about why it passes even though datum hash of provided output is `SNothing`
- To make the e2e "open Head" test pass, I only needed to plug `observeCollectComTx` into the `<|>` sequence of `runOnChainTxs`.. that was easy!
  - also, `runOnChainTxs` feels a bit off in `Chain.Direct.Tx` -> moving it to `Chain.Direct`
- Made abortTx unit property tests pass by improving observeAbortTx, which requires to pass a Utxo to `observeAbortTx` now
- When adding initials outputs to `initTx`, we need to store the `PubKeyHash` of the participants Cardano credential!
  - This is not yet kept around
- We need to add the Cardano credentials of all the participants to `initTx` construction
- Discussing on `DirectChainSpec` where the cardano credentials for participants should go now
- First try: Adding them to the `HeadParameters` analogously to `parties = [alice, bob, carol]`
  - We know this is brittle and morally we would change `Party` to relate `Hydra` and `Cardano` (public) credentials to each other
- Second try: Add it to `InitTx` for now as it's used in fewer places
- Realize adding it to `InitTx` is already involved
  - the lowest hanging fruit may be to pass it to the `withDirectChain` and thus make it "non-configurable"
  - would not work as we want to open subsets of participants?
- Third try: Start from bottom-up instead and work on `initTx` + `observeInitTx` for now
  - not worry yet about where the keys come from
- Seeing the "not observed if not invited" test raises the question whether we should determine being part of the Head using the Hydra credentials or Cardano credentials?
- This is an interesting case where we could have used the Mikado method to safely and incrementally build a plan to make cardano credentials available in `Party`
- Start putting credentials into withDirectChain and see where this gets me
- Surprisingly, the `Chain.Direct` integration test passes with [] as cardano keys -> this should matter
- Also interesting: only the "can commit" e2e test fails!
1) Test.DirectChain can commit
uncaught exception: ErrorCall
no ownInitial: []
CallStack (from HasCallStack):
error, called at src/Relude/Debug.hs:288:11 in relude-1.0.0.1-KWrPF7zdlFZ8gdnjuoSoUr:Relude.Debug
error, called at src/Hydra/Chain/Direct.hs:306:24 in hydra-node-0.1.0-inplace:Hydra.Chain.Direct
- Passing `[aliceCardanoVk]` makes the commit test progress further
src/Hydra/Chain/Direct.hs:275:15:
1) Test.DirectChain can commit
uncaught exception: ErrorCall
failed to cover fee for transaction: ErrUnknownInput ...
CallStack (from HasCallStack):
error, called at src/Relude/Debug.hs:288:11 in relude-1.0.0.1-KWrPF7zdlFZ8gdnjuoSoUr:Relude.Debug
error, called at src/Hydra/Chain/Direct.hs:275:15 in hydra-node-0.1.0-inplace:Hydra.Chain.Direct
- Of course: initials are not in `knownUtxo`
- Need to add TxOut to `initials` of `OnChainHeadState`
  - this was a bit messy and the tuples are really crying for a refactor
- "can commit" e2e test progresses, but times out
- log is not very conclusive
- try adding more cardano logs to debug there
- For some reason I have not seen the `TraceMempoolRejectedTx` error before...
{"thread":"80","loc":null,"data":{"tx":{"txid":"txid: TxId {_unTxId = SafeHash \"7cda5fb5d5828c4cf1081f406cb1cc3d0241e2b8aa824b3682edfc4ea64c8138\"}"},"mempoolSize":{"numTxs":0,"bytes":0},"kind":"TraceMempoolRejectedTx","err":{"received":["fb5a425ee6b4da39fd9074006af88d7675e24acad19f252c0e133f379d1246c4"],"required":["2502ff9c9c341dd1384724ae35eab0b19e394c90226892fcc8e7cc86342d324e"],"kind":"MissingRequiredDatums","scripts":{"12c4f8ff8070f0d659bdb2ecf844190ddb1134ef821764bd2b4649b2":{"spending":"cf4be62b474fe3047bc8630f462c0e130cb7064872a4e417da70ba321faf34e2#1"}},"errors":[{"kind":"CollectError","scriptpurpose":{"spending":"cf4be62b474fe3047bc8630f462c0e130cb7064872a4e417da70ba321faf34e2#1"},"error":"NoRedeemer"}]}},"sev":"Info","env":"1.30.0:a7085","msg":"","app":[],"host":"eiger","pid":"1980729","ns":["cardano.node.Mempool"],"at":"2021-10-28T16:47:44.00Z"}
- "Handling" the reject result in `txSubmissionClient` has the test fail right away and no need to dig into log files! 🎉
  - For example
src/Hydra/Chain/Direct.hs:289:38:
1) Test.DirectChain can commit
uncaught exception: ErrorCall
failed to submit tx: HardForkApplyTxErrFromEra S (S (S (S (Z (WrapApplyTxErr {unwrapApplyTxErr = ApplyTxError [UtxowFailure (MissingRequiredDatums (fromList [SafeHash "2502ff9c9c341dd1384724ae35eab0b19e394c90226892fcc8e7cc86342d324e"]) (fromList [SafeHash "fb5a425ee6b4da39fd9074006af88d7675e24acad19f252c0e133f379d1246c4"]))]})))))
CallStack (from HasCallStack):
error, called at src/Relude/Debug.hs:288:11 in relude-1.0.0.1-KWrPF7zdlFZ8gdnjuoSoUr:Relude.Debug
error, called at src/Hydra/Chain/Direct.hs:289:38 in hydra-node-0.1.0-inplace:Hydra.Chain.Direct
- Reason: `commitTx` is not including an initial redeemer
  - our unit tests would ideally balance, sign and validate txs against a ledger to catch this earlier
- After adding datum/redeemer the tx submission does not fail anymore, but test times out
- enabling tracers in cardano-node to debug this
- There are funny things in the logs like `FetchDeclineChainNotPlausible`
...declined","declined":"FetchDeclineChainNotPlausible","peer":{"remote":{"addr":"127.0.0.1","port":"25898"},"local":{"addr":"127.0.0.1","port":"41739"}}},{"length":"1","kind":"FetchDecision
- Reading our own logs would help though.. seems we see the tx, but not the corresponding `OnChainTx`
FromDirectChain "alice" (ReceiveTxs {onChainTxs = [], receivedTxs = [ValidatedTx {...
- Duh.. `observeCommitTx` was not even called from Direct chain / `runOnChainTxs`
  - What kind of test would cover / improve this?
- Commit e2e test passes!
- Continue with creating a `MockCommit` script to focus on off-chain parts for now
  - the `Commit` script is not wrong per se, but we would not know as no test is covering them right now
  - also, any change driven by the off-chain logic (i.e. observing a commit tx) would force us updating the validator logic (not being checked etc.)
- How to identify a commit tx?
- looking at the outputs? a single pay to v_commit?
- using the PT?
- After introducing an on-chain `Utxo` type, we realize that `convertUtxo :: OnChain.Utxo -> Utxo tx` is tricky
  - after a short discussion we decided to go via a binary representation
  - for now ToJSON / FromJSON, later CBOR or so
  - this allows us to use the `Tx` type class to convert to and from the on-chain `Utxo`
- Observing a commit tx works
- we were surprised by the small size and consistent size (~7kB)
- adding a scale (*100) had the transaction size property fail
- so it likely is because the `Gen (Utxo SimpleTx)` is staying in reasonable orders of magnitude and it's just Integers that we store
- The local-cluster test of committing via DirectChain still fails
  - with a non-saying `PostTxFailed`
  - led us to changing a `Maybe` function chain back to use `error` for better visibility
  - obviously we should improve error handling!
- The reason is that `initials = []`
  - as a next step, creating outputs which pay to a `MockInitial` script could work
- After introducing a `MockInitial` script some unit tests fail
- seems like abortTx + observeAbortTx do not yield a Just anymore
- could it be that the mocked redeemer type `()` messes with `Head.Abort`?
- can't seem to spot the bug in `observeAbortTx` and how it would lead to `Nothing`.. too long of a day as it seems
- Found the reason: `observeAbortTx` does think it sees a `Just CollectCom` when encountering `DataConstr Constr 0 []` in the redeemers, although that was the plutus equivalent of `()` (the `MockInitial` redeemer type)
- Next step: make `observeAbortTx` more robust
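The confusion is easy to reproduce in isolation. The sketch below uses a simplified stand-in for the Plutus `Data` type and a hypothetical `HeadRedeemer` (that `CollectCom` is constructor 0 follows from the observation above; `Abort` being constructor 1 is an assumption). `robustObserve` shows one possible way to be more robust, not necessarily what we will end up doing: only decode the redeemer attached to the head script's input.

```haskell
import Data.Maybe (listToMaybe, mapMaybe)

-- | Simplified stand-in for Plutus' 'Data' (constructor case only).
data Data = Constr Integer [Data]
  deriving (Eq, Show)

-- | Hypothetical head redeemer; the constructor order is an assumption.
data HeadRedeemer = CollectCom | Abort
  deriving (Eq, Show)

-- | Unit is encoded as constructor 0 without fields...
unitToData :: () -> Data
unitToData () = Constr 0 []

-- | ...and so is the first nullary constructor of any other type.
headRedeemerToData :: HeadRedeemer -> Data
headRedeemerToData CollectCom = Constr 0 []
headRedeemerToData Abort = Constr 1 []

headRedeemerFromData :: Data -> Maybe HeadRedeemer
headRedeemerFromData (Constr 0 []) = Just CollectCom
headRedeemerFromData (Constr 1 []) = Just Abort
headRedeemerFromData _ = Nothing

-- | Naive observation: take the first redeemer that decodes at all.
naiveObserve :: [Data] -> Maybe HeadRedeemer
naiveObserve = listToMaybe . mapMaybe headRedeemerFromData

-- | Only decode the redeemer of the head script. Redeemers are keyed by a
-- label standing in for the script they unlock.
robustObserve :: String -> [(String, Data)] -> Maybe HeadRedeemer
robustObserve headScript redeemers =
  lookup headScript redeemers >>= headRedeemerFromData
```

With an initial `()` redeemer listed before the head's `Abort` redeemer, `naiveObserve` reports `CollectCom` while `robustObserve` reports `Abort`.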
- Discussion about how `postTx` would fail if a transaction is invalid (or could not be submitted)
  - Synchronous failure via a return value or exception vs. asynchronous failure via the callback "back-channel"
  - Easier to stick with the callback (`OnChainTx`) for now
  - How to extend that type to accommodate a `PostTxFailed` -> wrap or additional data constructor?
  - Decided for the latter
  - Test assertion is then:
failAfter 10 $ takeMVar calledBackBob `shouldReturn` PostTxFailed
- Bob thinks he is part of the party now
- Sees the InitTx even though he is not invited
- Two ways:
- Provide the hydra credential to the direct chain and check HeadParameters against it
- Require to be paid (via a script) a participation token, spendable by our Cardano credentials
- Take small steps: We go with checking Hydra credentials being in parameters or not in off-chain
- Formulating a property test in `TxSpec` to facilitate "not being able to observe init tx when not invited"
  - required adding `Party` everywhere in `Chain.Direct`
- Prop tests pass, local-cluster test errs because can't post `AbortTx` in closed state
  - this catches the invalid scenario even in a synchronous manner
- in contrast to knowing whether a posted tx failed (tx submission thread is asynchronously connected via a queue)
- After wrestling a bit with the tx submission code, we get the DirectChainSpec test to pass!
- unfortunately, this is still not forcing us to do proper on-chain validation
- the stateful nature of the DirectChain component prohibits accidentally closing "other heads"
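A toy version of the "additional data constructor" decision discussed in this entry, to illustrate how a failed submission travels through the same callback as observed transactions. All names except `OnChainTx` and `PostTxFailed` are made up for this sketch, and the real `OnChainTx` constructors carry more data.

```haskell
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- | Simplified: failures are reported as one more kind of "observation".
data OnChainTx
  = OnInitTx
  | OnAbortTx
  | PostTxFailed
  deriving (Eq, Show)

type ChainCallback = OnChainTx -> IO ()

-- | Hypothetical submission step: whatever the outcome, the node learns
-- about it asynchronously through the callback.
submitTx :: ChainCallback -> Either String OnChainTx -> IO ()
submitTx callback result =
  case result of
    Left _reason -> callback PostTxFailed
    Right observed -> callback observed

-- | Mirrors the shape of the test assertion: wait on the callback and
-- inspect what comes back.
demo :: IO OnChainTx
demo = do
  calledBack <- newEmptyMVar
  submitTx (putMVar calledBack) (Left "rejected by the ledger")
  takeMVar calledBack
```

The downside visible even in this sketch is that the reason for the failure is dropped, which matches the "non-saying `PostTxFailed`" complaint elsewhere in this logbook.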
Today's goals:
- Finish reading Maladex's white-paper (they talk quite a lot about Hydra)
- Timebox trying to upgrade dependencies to make it possible to construct `Abort` transaction with cardano-api
- Draft new ADRs
- Add PT minting, initial validator, and burning to init and abort txs
Trying to find a suitable set of dependencies to be able to build Hydra OCV transactions with cardano-api.
Found recent commits in cardano-node and cardano-ledger-specs that work fine but this leads to issues in plutus because of dependencies to stuff in networking
Removing all dependencies to PAB and off-chain Contract from hydra-node to try to get to a smaller set of dependencies in Plutus
Sadly, the on-chain part of the StateMachine code is in the new repository plutus-apps, which depends on an older version of cardano-node: it is from more than one month ago and does not contain the changes we need, eg. allowing to store script as part of output's datum.
After upgrading dependencies for cardano-node, cardano-ledger-specs and plutus, and removing stuff related to the PAB, the local cluster test using cardano-api built transactions for init and abort passes:
Test.LocalCluster
should produce blocks and provide funds
Finished in 6.8452 seconds
1 example, 0 failures
Test suite integration: PASS
I had to vendorize the StateMachine related code from plutus-apps repository in order to make it work though.
Master build failed following merge, investigating why and fixing it:
- Timeout for observing transaction submission on-chain might be too low for slow CI, might want to increase it
- Also, when tests fail, we cannot easily access the logs that were generated from the tests failure, we could upload the logs to some S3/google storage bucket when CI tests fail
Drafting ADRs related to cardano-api and direct chain interaction
Trying to write a test to introduce the need for Participation Tokens minting in order to make it possible to Commit funds to an opened Head and thus actually do useful stuff with an opened Head
The purpose of the PTs is to
- ensure only identified Head participants can commit
- ensure only Head participants can advance the Head, eg. post transactions for the OCV state machine

The question is: How does one observe the rejection of a transaction submission, given this process is asynchronous?
- Seems like the "right way" to do this would be to add a constructor in the `OnChainTx` data type representing failed submissions, so that the node is notified when a submission fails? When we post a transaction we actually need to evaluate the validators in order to be able to balance it and assign execution units, so we know whether or not the transaction is correct before submitting, even though a submitted transaction can still "fail", eg. being rolled back or rejected by the ledger because of double spending
- There are quite a few `error` calls in `Direct` module anyways, that should be handled in one way or another
Direct chain test is still failing apparently randomly, might come from issues in handling of rollbacks in the wallet: https://github.com/input-output-hk/hydra-poc/runs/3998479889?check_suite_focus=true#step:7:2699
- When Hydra (on mainnet)?
- What does opening a Hydra Head actually mean?
- Can a Hydra Head be opened from a mobile phone?
- How can a Hydra Head (auto-)scale horizontally?
- What are the limitations for Hydra when fully implemented?
- Where does the DApp go in Hydra?
- How would Hydra work with AMM (automatic market makers?)?
- Is Hydra a mixer?
- How is Hydra different from (zk-)rollups?
Trying to execute init -> abort script sequence using CardanoClient, eg. only using cardano-api stuff
Currently lost in the maze of type wrappers for scripts...
Got a surprising error when trying to submit transaction:
uncaught exception: CardanoClientException
BuildException "TxBodyError TxBodyMissingProtocolParams"
The Protocol parameters are required in the TxBodyContent to build the body in Alonzo era
So the script got executed but it produces the infamous error:
uncaught exception: CardanoClientException
BuildException "TxBodyScriptExecutionError [(ScriptWitnessIndexTxIn 0,ScriptErrorEvaluationFailed (CekError An error has occurred: User error:\nThe provided Plutus code called 'error'.))]"
Activated debugging output for scripts execution to try to see what's happening, but there are no logged errors, so it's a more general problem with the way the transaction is built.
- Trying to add collateral input does not fix the problem.
- MB spots (at least) one problem with the transaction: The datum is not part of it so the `Head.validatorScript` will fail because the state machine requires both datums (input and output) to be present in the transaction.
- As observed by SN, there used to be no way in the cardano-api to include a `Datum` in a transaction, but this has changed "recently": One can now construct a `TxOut` passing either a `TxOutDatumHash` or a `TxOutDatum` which will then be included in the scripts' context
- Problem is: coverFee does not know what's the input value to balance, inputs only is a txref
- short discussion about what to do now
- SN mentions this is a "self-made problem" as there is a split between the chain component and the tiny wallet, where the chain component would know the relevant utxo
- By keeping the separation we might end up with a good interface of "doing it externally" later
- Start pairing by adding a UTXO to lookup inputs to
coverFee :: ValidatedTx Era -> STM m (Either ErrCoverFee (ValidatedTx Era))
- After adding `lookupUtxo` to `coverFee_` we introduce `knownUtxo` to get a set of known utxo from the Chain component's `OnChainHeadState` to provide to `coverFee`
  - SN is not convinced by the OnChainHeadState in general
  - But this provides for the right info at the right time without refactoring too much now
knownUtxo :: OnChainHeadState -> Map (TxIn StandardCrypto) (TxOut Era)
- Consequently we also add `TxOut` values to the `OnChainHeadState`'s `Initial` constructor
  - This required more work in `observeInitTx` to provide the `TxOut` and made its implementation more complex
  - But we also fixed the "bug" that its thread output might not always be the `firstInput`
- When trying to fix `observeAbortTx` with the additional txout for the `threadOutput`...
  - fixing it is smelly as it is not used at all in `observeAbortTx` and one would rather expect a single "Head identifier" or so
  - Remove the unused types with intention to re-add what the `observeXXX` require further down the road to not "overgeneralize"
  - Removing the `OnChainHeadState` made the signatures of `observeAbortTx` again a `-> Maybe (OnChainTx, OnChainHeadState)` and simplified implementation of `observeInitTx` again
- WalletSpec now failing because we changed interface and semantics of `coverFee`
- Need to also `resolveInput` in `walletUtxo` (besides `lookupUtxo`)
- We now get an `ErrorCall` from the ledger for too big TxOut values
  - scaling to `reasonablySized` generators helped
- `coverFee` results in double the amount -> we counted the selected utxo twice
  - include the selected utxo to cover fee to the inputs (use inputs' for resolvedInputs)
- `MissingScriptUTXOW` remains an error
  - we saw this in the past
  - this time again, the ledger sees less needed than provided scripts -> confusing error
  - in our case it was the abort tx not spending any 'initial' outputs (yet)
- We get `FeeTooSmallUTxO` error now from the ledger
  - after initial confusion of a too low fee or that `minFeeA` and `minFeeB` being 0 are possible problems
  - we found out that in fact the fee is just too low for script execution etc.
  - our cluster had too high `executionPrices` -> use realistic (mainnet) values for genesis-alonzo.json
- The integration test passes!
- We have a full roundtrip of posting and observing Init -> Abort, i.e. the short Head lifecycle (with many simplifications)!
- Another round of discussions over deposit-based Head allowing more participants off-chain than on-chain, e.g. variation on the idea of distinguishing between running a Hydra Head node and using a Hydra Head to make transactions.
- There is some research in that direction, along the lines of the tail protocol but removing the requirement for large amount of collateral deposit from intermediaries
- We discussed a related approach proposed by Matthias based on deposits and inspired by Lightning and drew it up on our Miro board
Analysing transaction that fails validation to check our scripts execution logic and redeemers setting
- Turns out the issue in the `coverFee_` test came from missing coins in the abortTx output: In the initTx output, we add a fixed 2000000 lovelace, but in the output of the abortTx we set the fee to 0
- Now more tests are failing, probably because of changes in `coverFee`. Also the previous `abortTx` property fails because of the missing 2000000.
Test checking abortTx transition was flaky because we were looking at the datums to identify the aborttx, but it could be the case that we decode the initial datum first -> Refactored code to look at the redeemers instead of the datums
- Test is still failing because of the 0 execution units
Writing test to check coverFee_ updates and cover execution cost of scripts
- Instead of calculating the exact execution cost for the redeemers, we take the maximum for a Tx and divide by the number of redeemers.
- But it's not much more complicated to call the actual function for computing exunits... 🤔
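The "maximum divided by number of redeemers" shortcut amounts to the following. `ExUnits` here is a local stand-in for the ledger's type and `evenBudget` is a hypothetical name; calling the ledger's actual execution unit evaluation would replace this entirely.

```haskell
import Numeric.Natural (Natural)

-- | Local stand-in for the ledger's execution units (memory and CPU steps).
data ExUnits = ExUnits
  { exUnitsMem :: Natural
  , exUnitsSteps :: Natural
  }
  deriving (Eq, Show)

-- | Give every redeemer the same share of the per-transaction maximum.
-- Rounds down, so the shares never add up to more than the maximum.
evenBudget :: ExUnits -> Natural -> Maybe ExUnits
evenBudget _ 0 = Nothing
evenBudget (ExUnits mem steps) n =
  Just (ExUnits (mem `div` n) (steps `div` n))
```

This over-provisions cheap scripts and can starve an expensive one, which is one more reason to compute the real execution units.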
We notice we are consuming the same UTXO in both transactions because the Wallet does not remove the UTXO it consumes when it covers fees
- `retry` in STM for the UTXO set to change when retrieving it
The error we now get is weird, it seems to be a mix of several errors:
- It's unbalanced
- It's using a UTXO which has already been spent before (39786f186d94d8dd0b4fcf05d1458b18cd5fd8c6823364612f4a3c11b77e7cc7)
- The `MissingScriptWitnessesUTXOW` error should be accompanied with a list of missing hashes
- => We should improve error reporting in our Direct chain test to understand better why it fails...
Continue working on replacing script with actual calls to cardano-api, part as educational work, part to build a proper CardanoClient we'll be able to use elsewhere when interacting with a node.
- Struggling to extract the value from the output: Some functions are very recent and not available in our version of the API which is already 3 weeks old
- Replaced more cardano-cli calls with custom functions in the CardanoClient module; could go further but I would first like to have master build.
Working on making master green, e.g. have DirectChainSpec validate
- Added query for PParams as it's needed for computing scriptIntegrityHash
- Compute scriptIntegrityHash in coverFee_ -> InitTx passes correctly, now tackling AbortTx which is failing currently
Adding a property test checking coverFee_ does the right thing with execution units computation and setting redeemers pointers right should it add a UTXO
- There's this annoying PParams lying around, going to add it to some fixture code so that we can use it in different modules.
- Managed to get the property for coverFee to fail with the "right" error, e.g. covering of the transaction succeeds but validation fails because rdptr is not correctly set, going to write code to fix that :fingers_crossed:
Adjusting the redeemer pointers, the idea is to compare the two sorted list of inputs and adjust those RdmrPtr for which the initial value is different from the final one, once we have added the input for balancing tx and paying fees.
- It's not working, still got a script execution error for one of the redeemers but it seems the adjustment of redeemer pointers actually worked.
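To illustrate why pointers need adjusting: a spending redeemer pointer is an index into the sorted set of transaction inputs, so adding a fee-paying input can shift it. The sketch below is a toy model only. The helper pointer_of and the short fake input ids are hypothetical, and sorting the "txid#ix" strings lexicographically is a simplification of the ledger's ordering (it breaks down for indices of 10 and above).

```shell
# pointer_of TXIN
# Reads a list of inputs ("txid#ix", one per line) from stdin, sorts it and
# prints the zero-based position of TXIN in the sorted list.
# Exits with status 1 when TXIN is not part of the list.
pointer_of() {
  LC_ALL=C sort | awk -v target="$1" '
    $0 == target { print NR - 1; found = 1 }
    END { exit !found }'
}

# Before balancing: the script input dddd#1 is the second input
printf '%s\n' 'bbbb#0' 'dddd#1' | pointer_of 'dddd#1'          # prints 1

# After the wallet adds aaaa#0 to pay fees, the same input moves to index 2
printf '%s\n' 'bbbb#0' 'dddd#1' 'aaaa#0' | pointer_of 'dddd#1' # prints 2
```

Comparing the position before and after balancing gives exactly the set of pointers whose value changed and therefore needs rewriting.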
Writing a cardano-cli "wrapper" module, something that would provide useful functions for common operations in order to remove the need to run scripts from the command-line, running into a couple snags:
- The generators exposed in the sub-library for cardano-api use Hedgehog instead of QuickCheck, so we need to convert them but this loses shrinking capability
- Of course, the cardano-cli uses types from cardano-api, but our Wallet uses cardano-ledger-specs types
Going to write the module using cardano-api types
While trying to replace address building from cardano-cli with Haskell code I got the following error:
test/Test/LocalClusterSpec.hs:25:11:
2) Test.LocalCluster should produce blocks and provide funds
uncaught exception: ErrorCall
InputTextEnvelopeError (TextEnvelopeTypeError [TextEnvelopeType "PaymentVerificationKeyShelley_ed25519"] (TextEnvelopeType "GenesisUTxOVerificationKey_ed25519"))
CallStack (from HasCallStack):
error, called at src/Relude/Debug.hs:288:11 in relude-1.0.0.1-KWrPF7zdlFZ8gdnjuoSoUr:Relude.Debug
error, called at test/Test/LocalClusterSpec.hs:45:21 in main:Test.LocalClusterSpec
assertCanSpendInitialFunds, called at test/Test/LocalClusterSpec.hs:25:11 in main:Test.LocalClusterSpec
Although the underlying representation is the same, it fails to deserialise properly because of the envelope tag, so we need to make address building more robust.
There is the castVerificationKey function which could be used for that?
In the genesis-shelley.json file we need a base16 encoding of the address but cardano-cli uses bech32
cardano-cli address build --testnet-magic 42 --payment-verification-key-file alice.sk | bech32
Managed to have a Direct component connected to the local cluster, had to tweak some parameters in the cardano-node.json to have protocols activated.
- We can see the transaction submitted but it fails, first with minimum output value, then with not being balanced -> need to wire wallet for balancing
- coverFee in Wallet takes a TxBody and not a ValidatedTx, so we need to adapt it
- Still missing computation of scriptIntegrityHash which should be done as part of tx balancing, fee payment and signing because it requires the final set of inputs to be defined
- Updated GHC to version 8.10.7
- Updated haskell.nix to version fd4d10efe278ba9ef26229a031f2b26b09ed83ff
- Removed nix dependency to cardanoPkgs
- It's probably what caused me a lot of trouble when I first tried to update haskell.nix a week ago
- local-cluster now uses build-tool-depends to pull cardano-node and cardano-cli in scope which guarantees we use the same version everywhere
- Updated plutus version to 5ffcfa6c0451b3b937c4b69d2575cd55adebe88b
- Updated ledger and node packages to suit plutus' dependencies
- Note this implies we temporarily depend on a fork of cardano-ledger-specs: https://github.com/raduom/cardano-ledger-specs, which does not yet contain major changes to directory structure
- Fixed minor changes in API following dependencies update
- Updated cabal.project's index-state to 2021-08-14T00:00:00Z
- This needs to be older than haskell.nix's version but I am really unsure how they relate to each other
- Ideally, we should need to change only one of those to ensure pinned dependencies
Goal for today:
- Have a working Alonzo cluster with funds in local-cluster package
- Rename local-cluster -> hydra-cluster and rename modules
I can see the cluster tries to start but there's no logs, need to store the logs as we do in the EndToEndSpec tests
- Added capture of logs to the CardanoNode process wrapper but seems like no logs are output => need to add backends to the node's configuration -> Now have logs activated for all cardano-nodes, in JSON
- Nodes are starting up and apparently succeeding in their connection, not sure why they are not producing logs? Our waitForNewBlock function is a bit crude => making it a bit smarter, actively waiting for a new block to be produced
Adding a test checking we can make a simple payment transaction using initial funds:
- Test simply executes a script using cardano-cli, this sounded much easier than trying to replicate all commands within Haskell
- Struggling to get the script to run correctly, it cannot find the cardano-cli executable in its PATH even though I pass the testing process' environment to it => I was incorrectly calling [ -x cardano-cli ], which tests for a file in the current directory rather than looking the name up in PATH
- I can see the transaction submitted to the node-1's mempool but it seems to never end in a block?
- The transaction appears to be rejected by the mempool:
{"thread":"73","loc":null,"data":{"tx":{"txid":"txid: TxId {_unTxId = SafeHash \"00943cd84146550c7162ba5fc9d2bdef940afddfc4712e916060af5373acefdb\"}"},"mempoolSize":{"numTxs":1,"bytes":234},"kind":"TraceMempoolRejectedTx","err":{"produced":{"policies":{},"lovelace":900000000000},"kind":"ValueNotConservedUTxO","consumed":{"policies":{},"lovelace":0},"badInputs":["2fec440b7b461450420820a57f913d17525bc915da37d86e0423775110a05683#0"],"error":"This transaction consumed Value 0 (fromList []) but produced Value 900000000000 (fromList [])"}},"sev":"Info","env" :"1.30.0:7ff91","msg":"","app":[],"host":"haskell-","pid":"696941","ns":["cardano.node.Mempool"],"at":"2021-10-15T08:10:56.70Z"}
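Coming back to the executable lookup problem noted above, here is a small sketch of a check that does search PATH. The helper name require_executable is hypothetical and is not what our test script uses.

```shell
# [ -x NAME ] only tests whether NAME, taken as a path relative to the current
# directory, is an executable file; it never searches PATH.
# command -v NAME performs the same lookup the shell does when running NAME.
require_executable() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "$1: not found in PATH" >&2
  return 1
}
```

A script would then start with something like: require_executable cardano-cli || exit 1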
It takes time for the UTXO to be committed, going for an active loop in the script to check the newly created UTXO presence:
- One problem in the script: the -e option fails the entire script as soon as 1 sub-command fails, so grep failing meant the script exited immediately
- Another problem: I was using some syntax not available in plain /bin/sh so it was running somewhat differently inside the test and outside of it....
- It actually takes a while to have the transaction correctly submitted => Reduced slot length to 100ms and it's definitely much faster :)
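A sketch of such an active loop that stays within plain /bin/sh and survives set -e. The function name wait_for and the POLL_INTERVAL variable are hypothetical, this is not the script embedded in our test.

```shell
#!/bin/sh
set -e

# wait_for PATTERN ATTEMPTS COMMAND...
# Runs COMMAND repeatedly until its output contains the fixed string PATTERN,
# giving up after ATTEMPTS tries.
wait_for() {
  pattern=$1
  attempts=$2
  shift 2
  while [ "$attempts" -gt 0 ]; do
    # grep is used as an `if` condition, so a failed match does not trip `set -e`
    if "$@" | grep -qF -- "$pattern"; then
      return 0
    fi
    attempts=$((attempts - 1))
    sleep "${POLL_INTERVAL:-1}"
  done
  return 1
}
```

With a cluster running, waiting for an output to show up would look like: wait_for "$txid" 30 cardano-cli query utxo --whole-utxo --testnet-magic 42, where txid holds the id of the submitted transaction.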
PR for AbortTx is green 🎉 🍾
Plans for today:
- merge PR to master
- send head -> abort transaction sequence to a local testnet
Now trying again to submit an aborttx to the testnet manually
Note: We should make available soon a Hydra Test Cluster "framework" or package that will make it easy for people to start a local cluster and interact with it programmatically, eg. expose our HydraNode from the local-cluster package to be used downstream and not only the docker-compose file. When working on the cardano-node cluster, I have found that having scripts is great but having our local-cluster is even greater as it should now be straightforward to wrap any test within this cluster and connect our hydra-nodes to it.
Exposing this feature to other developers would be super-useful.
Discussing with MB issues with running scripts on-chain:
- configure local-cluster to start in Alonzo era with some funded UTXOs
- see https://github.com/input-output-hk/cardano-configurations to pull config
- fix how we create transactions:
- correctly assign ex units => need to evaluate the Tx and update ex units in redeemers
- script integrity hash -> see hashScriptIntegrity in Ledger
- add collateral input -> should be done in the tinyWallet reusing the one and only input we have in the wallet
- beware of redeemer pointers logic: balancing the tx adds a new input which can change the pointer logic
- see Shelley/Transaction.hs in cardano-wallet
- beware of error in execution units -> seems like one needs to take a large margin (like 2x)
- script execution takes maximum 6 ADA
- wire wallet in the Direct.Tx to cover fees
- should also assign redeemers?
To debug scripts failure need a cardano-node rebuilt with tweaked cardano-ledger-specs to provide logging output when evaluating scripts
- Wrote a simple script to submit transactions for a head until the abort tx.
- Managed to get some debugging output: ["L1","Ld","S5","PT5"]
- L1 -> Output constraint failed
- Ld is used for 2 things:
MustSatisfyAnyOf xs ->
    traceIfFalse "Ld" -- "MustSatisfyAnyOf"
    $ any (checkTxConstraint ctx) xs

{-# INLINABLE checkScriptContext #-}
-- | Does the 'ScriptContext' satisfy the constraints?
checkScriptContext :: forall i o. ToData o => TxConstraints i o -> ScriptContext -> Bool
checkScriptContext TxConstraints{txConstraints, txOwnInputs, txOwnOutputs} ptx =
    traceIfFalse "Ld" -- "checkScriptContext failed"
    $ all (checkTxConstraint ptx) txConstraints
    && all (checkOwnInputConstraint ptx) txOwnInputs
    && all (checkOwnOutputConstraint ptx) txOwnOutputs
- S5 -> "State transition invalid - constraints not satisfied by ScriptContext" from the StateMachine library
- PT5 -> generic error when a check is false
Getting an even more puzzling error when trying to submit txs with MockHead script:
cardano-cli transaction build --alonzo-era --cardano-mode --testnet-magic 42 --change-address $(cat alice/payment.addr) --tx-in 0e2c5cb64ca7012dd235bad5b00fc8bf86662172e8af600a35aa5d42e761e5c3#1 --tx-in-collateral 0e2c5cb64ca7012dd235bad5b00fc8bf86662172e8af600a35aa5d42e761e5c3#0 --tx-out $(cardano-cli address build --payment-script-file mockHeadScript.plutus --testnet-magic 42)+1000 --tx-out-datum-hash ff5f5c41a5884f08c6e2055d2c44d4b2548b5fc30b47efaa7d337219190886c5 --tx-in-script-file mockHeadScript.plutus --tx-in-datum-file headDatum.data --tx-in-redeemer-file headRedeemer.data --out-file abort.draft --protocol-params-file example/pparams.json
Command failed: transaction build Error: The transaction does not balance in its use of ada. The net balance of the transaction is negative: Lovelace (-21987700) lovelace. The usual solution is to provide more inputs, or inputs with more ada.
- This shows I don't provide enough input to pay for the scripts' execution according to the result of evaluating execution units -> providing more input makes the transaction valid and submitted successfully.
L1 failure code happens in checkOwnOutputConstraint which checks the datum hash of the outputs of the transaction. It's as if it was missing the abort datum, but the hash is actually present in the transaction.
- Trying to use the --tx-out-datum-file option, which passes a hash extracted from the datum file, does not work either.
- 🎉 The trick is to use --tx-out-datum-embed-file which puts the data in the transaction!
- This means transactions working with the state machine always require the datums of their outputs to be included and not only the hashes, which increases the tx size requirements, esp. in the case of Hydra where the state of the SM can potentially be large
Configuring our local-cluster to start in alonzo and have initial funds
- Extended LocalClusterSpec to check we can actually spend outputs from initial funds. Writing it all in Haskell would be a PITA so I will first try to do that using an embedded shell script.
- Got an error when starting the cluster, so need to not delete the created directory if the tests fail.
- LocalClusterSpec is failing with new configuration, but not sure if this is not just a timeout problem -> need to gather logs...
Managed to have the project build outside nix-shell, sidestepping some issue in the retrieval of blocks in Wallet module.
All but 3 tests pass:
- TxSpec test for aborttx fails, I suspect this is caused by the change to validatorHash now that the dependencies are updated
- it cannot find cardano-node nor cardano-cli executable -> they should be provided in build-tool-depends as suggested by MPJ
- prometheus monitoring fails, perhaps a side effect of the above
What I did:
- Install ghcup and set versions to GHC 8.10.7
$ curl --proto '=https' --tlsv1.2 -sSf https://get-ghcup.haskell.org | sh
$ ghcup set ghc 8.10.7
I had some troubles with my path because I asked to append ~/.ghcup/bin to PATH instead of prepend
- Install various system dependencies
$ sudo apt install -y build-essential curl libffi-dev libffi7 libgmp-dev libgmp10 libncurses-dev libncurses5 libtinfo5
$ sudo apt install -y libz-dev liblzma-dev libzmq3-dev pkg-config libtool
Do not confuse lzma with liblzma-dev, those are 2 existing packages
- Install forked libsodium
git clone https://github.com/input-output-hk/libsodium
cd libsodium/
git checkout 66f017f16633f2060db25e17c170c2afa0f2a8a1
./autogen.sh
./configure
make && sudo make install
- Build and test everything:
cabal build all && cabal test all
Trying to update haskell.nix at fd4d10efe278ba9ef26229a031f2b26b09ed83ff and using ghc8107 -> updated materialisation first -> nix-shell works fine and I can build hydra 🎉 locally on my VM.
It appears the datum file which the cardano-cli requires must be the JSON representation of the Data and not the TextEnvelope. => making the changes in the inspect-script code...
- Doing Aeson.encode $ toData mydata does not work, it generates some CBOR encoding and not JSON
- Turns out one must not use serialiseToTextEnvelope for datums but scriptDataToJson ScriptDataJsonDetailedSchema. But scriptDataToJson operates on a ScriptData which is a mirror in cardano-api of Plutus' Data, so first one must convert with toData and then use Cardano.Api.Shelley.scriptDataFromPlutus
From Duncan on slack:
if you're starting from a type from the underlying ledger rather than starting with the API types, then yes you can use the conversion functions to/from the underlying ledger types. The general principle of the API is that Cardano.Api exports everything at the level of the API types and Cardano.Api.Byron or Cardano.Api.Shelley exports and exposes all the underlying representations and conversion functions. So since you want to "lift the lid" and see all the representations (i.e. using fromPlutusData) then you want to import Cardano.Api.Shelley.
The generated datums should be OK for now, rebuilding cardano-node in order to be able to start a local cluster and retry submitting my script transactions.
- The generated TextEnvelope for scripts has an incorrect type but not sure why? -> Need to use Ledger.Scripts.toCardanoApiScript on Plutus' scripts to "convert"
Trying to fix the last remaining important issue following up dependencies upgrade, namely the Point Block conversion problem: We have a Point (ShelleyBlock (AlonzoEra c)) and we want a Point (CardanoBlock c).
castPoint definition requires coercibility between 2 different HeaderHash type family instances:
castPoint :: Coercible (HeaderHash b) (HeaderHash b') => Point b -> Point b'
The concrete needed coercion looks like:
ShelleyHash StandardCrypto
-> OneEraHash
'[Ouroboros.Consensus.Byron.Ledger.Block.ByronBlock,
ShelleyBlock (Cardano.Ledger.Shelley.ShelleyEra StandardCrypto),
ShelleyBlock (Cardano.Ledger.Allegra.AllegraEra StandardCrypto),
ShelleyBlock (Cardano.Ledger.Mary.MaryEra StandardCrypto),
ShelleyBlock (Cardano.Ledger.Alonzo.AlonzoEra StandardCrypto)]
Following the chain of definitions and imports gives me:
newtype OneEraHash (xs :: [k]) = OneEraHash { getOneEraHash :: ShortByteString }
then
newtype ShelleyHash c = ShelleyHash {
unShelleyHash :: SL.HashHeader c
}
...
type instance HeaderHash (ShelleyBlock era) = ShelleyHash (EraCrypto era)
with SL.HashHeader being
newtype HashHeader crypto = HashHeader {unHashHeader :: Hash crypto (BHeader crypto)}
then
type Hash c = Hash.Hash (HASH c)
where module Hash is ultimately Cardano.Crypto.Hash.Class which does not export the constructor for newtype Hash
module Cardano.Crypto.Hash.Class
( HashAlgorithm (..)
, sizeHash
, ByteString
, Hash(UnsafeHash)
...
newtype Hash h a = UnsafeHashRep (PackedBytes (SizeHash h))
...
pattern UnsafeHash :: forall h a. HashAlgorithm h => ShortByteString -> Hash h a
pattern UnsafeHash bytes <- UnsafeHashRep (unpackBytes -> bytes)
where
UnsafeHash bytes = UnsafeHashRep (packBytes bytes :: PackedBytes (SizeHash h))
{-# COMPLETE UnsafeHash #-}
So if I cannot coerce I could just pattern match and get the underlying ShortByteString and rewrap it.
Right tip -> do
let blk = case tip of
GenesisPoint -> GenesisPoint
(BlockPoint slot h) -> BlockPoint slot (fromShelleyHash h)
fromShelleyHash (Ledger.unHashHeader . unShelleyHash -> UnsafeHash h) = coerce h
query = QueryIfCurrentAlonzo $ GetUTxOByAddress (Set.singleton address)
pure $ LSQ.SendMsgQuery (BlockQuery query) (clientStQueryingUtxo blk)
Now I need to fix the TxSpec test which fails, probably because serialisation has been fixed in Plutus
- ✅ Replaced convoluted Initial.Dependencies hash computation with validatorHash and script validates
Rebasing ch1bo/aborttx branch over master as it does not have some changes improving over flakiness of monitoring tests
Back to submitting transactions, restarting a cluster and recreating a user:
mkdir alice
cd alice
cardano-cli address key-gen --verification-key-file payment.vkey --signing-key-file payment.skey
cardano-cli address build --testnet-magic 42 --payment-verification-key-file payment.vkey > payment.addr
cd ..
cardano-cli query utxo --testnet-magic 42 --address $(cat alice/payment.addr)
I managed to build the aborttx transaction except the script validation failed before submission:
cardano-cli transaction build --alonzo-era --cardano-mode --testnet-magic 42 --change-address addr_test1vqpfgdh6ldx73nypc5hkur2wm2hpt0kx240qlxvykhy8efc74sfu5 --tx-in 6bd74fd0e48e6a35c4fd59ba474b671866f115bc67fc8d6d84259e45e229bf15#1 --tx-in-collateral 6bd74fd0e48e6a35c4fd59ba474b671866f115bc67fc8d6d84259e45e229bf15#0 --tx-out addr_test1wp3urt44rzvpsj2fu696su9ee573m6ne0ce4uydhcdnwhkshjamur+1000 --tx-out-datum-hash ff5f5c41a5884f08c6e2055d2c44d4b2548b5fc30b47efaa7d337219190886c5 --tx-in-script-file headScript.plutus --tx-in-datum-file headDatum.data --tx-in-redeemer-file headRedeemer.data --out-file abort.draft --protocol-params-file example/pparams.json
Command failed: transaction build Error: The following scripts have execution failures:
the script for transaction input 0 (in the order of the TxIds) failed with:
The Plutus script evaluation failed: An error has occurred: User error:
The provided Plutus code called 'erro
🤦 It's perfectly possible all this dance on dependencies upgrade was unneeded to submit transactions manually. I thought the formats had changed over the past few weeks but turns out I was using the wrong serialisation functions...
Today's goal:
- Spin-up an Alonzo (local) test network
- (optional) Send transactions from our Direct chain component to this network
The scripts directory in cardano-node repo contains mkfiles.sh script that does the necessary magic to create an Alonzo network, either transitioning all the way from Byron to Alonzo, or hardforking immediately at epoch 0.
Script needs to be run from top-level:
$ scripts/byron-to-alonzo/mkfiles.sh
~/cardano-node/example ~/cardano-node
scripts/byron-to-alonzo/mkfiles.sh: line 205: cardano-cli: command not found
and requires cardano-cli and cardano-node to be available in PATH
I need a recent version of cardano-node obviously to start an alonzo network, latest version in master is 1.30.0, but the version we have in scope in hydra-poc is 1.27.0
Activating nix inside cardano-node directory through echo use nix > .envrc and direnv allow .envrc -> perhaps we should upgrade our dependencies after all...
Other option suggested by MB: https://github.com/input-output-hk/cardano-wallet/blob/master/lib/shelley/exe/local-cluster.hs
- Wallet uses this version of cardano-node: https://github.com/input-output-hk/cardano-node/commits/0fb43f4e3da8b225f4f86557aed90a183981a64f
- Cardano-node (master) depends on this plutus version:
commit edc6d4672c41de4485444122ff843bc86ff421a0
Merge: 569f98402 63c6ca8ac
Author: Michael Peyton Jones <[email protected]>
Date: Fri Aug 20 10:43:53 2021 +0100
    Merge pull request #3430 from input-output-hk/hkm/windows-cross
    windows cross compile
Running scripts/byron-to-alonzo/mkfiles.sh alonzo "works": I can see 3 nodes up and running. Now need to understand how to post a transaction to them...
In the cardano-node scripts there's an initialFunds field which assigns some lovelace to some address, but in the wallet there's none and it says there can't be as it needs to transition from Byron to Shelley; but if we hard fork at epoch 0 immediately this should work?
cardano-cli can talk to the node and get some information:
$ CARDANO_NODE_SOCKET_PATH=example/node-bft1/node.sock cardano-cli query tip --cardano-mode --testnet-magic 42
{
"epoch": 5,
"hash": "48c4b8c546a0a9ffd0649a77b0926881e6e8869d83cb6da70f1d32ac9f936878",
"slot": 2700,
"block": 182,
"era": "Alonzo",
"syncProgress": "42.68"
}
I can also get the UTXO set of the network:
$ CARDANO_NODE_SOCKET_PATH=example/node-bft1/node.sock cardano-cli query utxo --cardano-mode --whole-utxo --testnet-magic 42
TxHash TxIx Amount
--------------------------------------------------------------------------------------
61fa39c2f3e110850c741da3a0f978bcee0fd9abfc7b0bce4df3ea047d61e824 0 5010000000 lovelace + TxOutDatumNone
704e1dc2f4dfcc44c0ba90978a0d58371b5f7ee1d3c47b1cedb52e1b1cb37b18 0 5010000000 lovelace + TxOutDatumNone
f2d04cab14eefbb0571e6a74b64b49453ac3312c20ce8fcda9d125c9020bd267 0 900000000000 lovelace + TxOutDatumNone
Leveraging SN's previous experiments to learn how to create a transaction to send some ADAs between addresses in the testnet
- How do I get the details of a transaction using cardano-cli?
- Going through https://github.com/input-output-hk/cardano-node/blob/master/doc/stake-pool-operations/simple_transaction.md to create a transaction to send some ADAs from the genesis funding transaction to some other user's address but it fails as I am missing the right key
- The key to use is example/shelley/utxo-keys/utxo1.skey which results in successfully submitting transactions. I can see the transaction is successfully submitted but it does not appear when querying the utxo set: Possibly because I set the validity interval too far in the future and I need to wait? The node is stale and does not make progress anymore => Restarting from scratch
I was able to submit a transaction!
$ cardano-cli query utxo --whole-utxo --testnet-magic 42
TxHash TxIx Amount
--------------------------------------------------------------------------------------
06a82e6521f8d88a9ffe082f66f9f2bb114c9145d3f13cbfb36a3facba8d4de9 0 5010000000 lovelace + TxOutDatumNone
6cafe0b8352fa6bf5c7433bb668bf675c220d27adb42e1da28ab25741290176e 0 899999999599 lovelace + TxOutDatumNone
c989e8557c10a2fee3de5d37bc3858e4a3f2629d07d897f39d8d9ddf631e0c0f 0 5010000000 lovelace + TxOutDatumNone
Now going to submit some plutus transactions and check how it goes... There are a bunch of examples in scripts/plutus that seem interesting
- I was able to run successfully:
$ scripts/plutus/example-txin-locking-plutus-script.sh guessinggame
TxHash TxIx Amount
--------------------------------------------------------------------------------------
9fd2c741e9b582328269dcd1ee5282625be36215126ae2ce0edc24f48de82057 1 10000000 lovelace + TxOutDatumNone
It does not work twice though, needed to do a minor change to retrieve the first Tx for given address -> updating script to select first transaction found
Next step: Generating needed files for our own scripts and datums -> Reviving old executable from MB that outputs a script in serialised form, useful for manually testing SC on a network
- Some interesting and useful documentation available here on genesis configuration for Shelley
- Plutus provides ways to export data for consumption by cardano-cli: https://plutus.readthedocs.io/en/latest/plutus/howtos/exporting-a-script.html -> One needs to serialise with TextEnvelope apparently
Looking at what the plutus script in cardano-node does:
plutusscriptaddr=$($CARDANO_CLI address build --payment-script-file "$plutusscriptinuse" --testnet-magic "$TESTNET_MAGIC")
it constructs an address from the script file's content which indeed is an "enveloped" serialised script
Wrote an inspect-script executable that output scripts, datums and redeemers for init and abort transaction given a currency and a token. These are written using cardano-node's custom TextEnvelope format which is "semi-readable", now going to try to submit head then abort transaction.
Restarting network from scratch, creating 3 utxos for Alice to use in the head txs
Estimating fees:
$ cardano-cli transaction calculate-min-fee --tx-body-file tx.draft --tx-in-count 1 --tx-out-count 4 --witness-count 1 --byron-witness-count 0 --testnet-magic 42 --genesis example/shelley/genesis.json
I want to send the change back to the genesis utxo, so I need its address: how do I get that?
$ cardano-cli -- shelley address build \
--payment-verification-key-file example/shelley/utxo-keys/utxo1.vkey \
--testnet-magic 42
addr_test1vqcvgup2qg3uf525ln7xyj5ymenupyzq6shrwcq08nanm2s2708jd
then
$ cardano-cli transaction build-raw --tx-in 837b43e0ce1da9aabe9794a4c5f8e3da5fde73e5f24927a97862c776357790b3#0 --tx-out $(cat alice/payment.addr)+10000000 --tx-out $(cat alice/payment.addr)+10000000 --tx-out $(cat alice/payment.addr)+10000000 --tx-out addr_test1vqcvgup2qg3uf525ln7xyj5ymenupyzq6shrwcq08nanm2s2708jd+$((900000000000 - 10000000 - 10000000 - 10000000 - 601)) --invalid-hereafter 10000 --fee 601 --out-file tx.draf
signing and submission:
$ cardano-cli transaction sign --tx-body-file tx.draft --signing-key-file example/shelley/utxo-keys/utxo1.skey --testnet-magic 42 --out-file tx.signed
$ cardano-cli transaction submit --tx-file tx.signed --testnet-magic 42
Transaction successfully submitted.
I now have 3 UTXOs to spend in the scripts.
$ cardano-cli query utxo --whole-utxo --testnet-magic 42
TxHash TxIx Amount
--------------------------------------------------------------------------------------
8930182280603aab400a1856daf20c63a6376cae31f2be584f2493f13fba3b22 0 5010000000 lovelace + TxOutDatumNone
b4f31ac83344988b1cd7bcf8bb150b9e3b4aca519b7f6bc89bc09d545b343f6f 0 10000000 lovelace + TxOutDatumNone
b4f31ac83344988b1cd7bcf8bb150b9e3b4aca519b7f6bc89bc09d545b343f6f 1 10000000 lovelace + TxOutDatumNone
b4f31ac83344988b1cd7bcf8bb150b9e3b4aca519b7f6bc89bc09d545b343f6f 2 10000000 lovelace + TxOutDatumNone
b4f31ac83344988b1cd7bcf8bb150b9e3b4aca519b7f6bc89bc09d545b343f6f 3 899969999399 lovelace + TxOutDatumNone
f517a7081008aa3e658a7f88ad0458bda733d22307a6401a1881cc12ff199890 0 5010000000 lovelace + TxOutDatumNone
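The change amount in the build-raw command above is plain arithmetic: the consumed input minus the fee and the three outputs for Alice. A small sketch, where the helper name change is hypothetical:

```shell
# change INPUT FEE OUTPUT...
# Prints the lovelace left over once the fee and all outputs are paid from INPUT.
change() {
  remaining=$(( $1 - $2 ))
  shift 2
  for out in "$@"; do
    remaining=$(( remaining - out ))
  done
  echo "$remaining"
}

change 900000000000 601 10000000 10000000 10000000
# prints: 899969999399
```

The result matches the amount of output #3 in the UTXO listing above.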
Trying to generate script's address fails with
$ cat alice/headScript.plutus
...
"description":"headScript","type":"PlutusV1Script"}
$ cardano-cli address build --payment-script-file alice/headScript.plutus --testnet-magic 42
Command failed: address build Error: alice/headScript.plutus: Error decoding script: TextEnvelope type error: Expected one of: SimpleScriptV1, SimpleScriptV2, PlutusScriptV1 Actual: PlutusV1Script
The descriptor type has been changed in recent cardano-node versions, so I need to update cardano-node dependencies to have the proper tag. ATM, trying to simply change the type in the plutus files directly..
$ cardano-cli address build --payment-script-file alice/headScript.plutus --testnet-magic 42
addr_test1wq2rv89vr2mtkfmcqqpzwz0f88sv86h05cw8mz74vcyd9gclj6lqt
🤦 Actually I need the datum hash to build the tx, not the datum of course.
Creating draft tx for Head init tx without outputting any PTs
$ cardano-cli transaction build --alonzo-era --cardano-mode --testnet-magic 42 --change-address $(cat alice/payment.addr) --tx-in b4f31ac83344988b1cd7bcf8bb150b9e3b4aca519b7f6bc89bc09d545b343f6f#0 --tx-out $(cardano-cli address build --payment-script-file alice/headScript.plutus --testnet-magic 42)+1000 --tx-out-datum-hash a6196b078239886432cc8bb0f981cb9f7df54bcf2fb8951b01c6639104a10640 --out-file head.draft
I was finally able to submit the transaction successfully:
$ cardano-cli query utxo --whole-utxo --testnet-magic 42
TxHash TxIx Amount
--------------------------------------------------------------------------------------
6147dae7ecb37fc1ea0c34e32419c1cc5916244dfb94f1239622d65d0be0d23d 0 9998733 lovelace + TxOutDatumNone
6147dae7ecb37fc1ea0c34e32419c1cc5916244dfb94f1239622d65d0be0d23d 1 1000 lovelace + TxOutDatumHash ScriptDataInAlonzoEra "a6196b078239886432cc8bb0f981cb9f7df54bcf2fb8951b01c6639104a10640"
...
Now checking I can actually consume the transaction! Unfortunately, serialisation formats definitely have changed for datums too:
$ cardano-cli transaction build --alonzo-era --cardano-mode --testnet-magic 42 --change-address $(cat alice/payment.addr) --tx-in 6147dae7ecb37fc1ea0c34e32419c1cc5916244dfb94f1239622d65d0be0d23d#1 --tx-in-collateral 6147dae7ecb37fc1ea0c34e32419c1cc5916244dfb94f1239622d65d0be0d23d#0 --tx-out $(cardano-cli address build --payment-script-file alice/headScript.plutus --testnet-magic 42)+1000 --tx-out-datum-hash 08090cf3024c750773519501c52bec72749c28d8732dcafc3690c2f77793f84e --tx-in-script-file alice/headScript.plutus --tx-in-datum-file alice/headDatum.plutus --tx-in-redeemer-file alice/headRedeemer.plutus --out-file abort.draft
Command failed: transaction build Error: Error reading metadata at: "alice/headDatum.plutus"
JSON schema error within the script data: {"cborHex":"d8799fd8799f1b000000e8d4a51000ff80ff","description":"headDatum","type":"ScriptDatum"}
JSON object does not match the schema.
Expected a single field named "int", "bytes", "string", "list" or "map".
Unexpected object field(s): {"cborHex":"d8799fd8799f1b000000e8d4a51000ff80ff","description":"headDatum","type":"ScriptDatum"}
- When upgrading dependencies I have run into more nix/cabal/hackage issues and was unable to upgrade cabal.project alone.
- Now trying to build the project with updated plutus dependencies not using nix: downloaded and installed ghcup with ghc version 8.10.7
Goal: Upgrade dependencies to more recent Plutus, Ledger and Cardano-node
Try upgrading hydra-poc dependencies following https://github.com/CardanoSolutions/ogmios/blob/5048fb6cd9eb245b4062191220ad96e945d66258/server/cabal.project
- Hitting an issue with plutus-contract package not building, checking what the dependencies are in Plutus at this commit
- Dependencies in ogmios are actually too old for plutus. The revision pointed at dates back from 2 months ago:
commit edc6d4672c41de4485444122ff843bc86ff421a0
Merge: 569f98402 63c6ca8ac
Author: Michael Peyton Jones <[email protected]>
Date: Fri Aug 20 10:43:53 2021 +0100
    Merge pull request #3430 from input-output-hk/hkm/windows-cross
    windows cross compile
Starting from plutus' master which might be a better choice for us
- Interestingly, plutus depends on a fork of cardano-ledger-specs:
source-repository-package
  type: git
  location: https://github.com/raduom/cardano-ledger-specs
  tag: ef6bb99782d61316da55470620c7da994cc352b2
The pointed at commit (https://github.com/raduom/cardano-ledger-specs/commit/ef6bb99782d61316da55470620c7da994cc352b2) says:
Make the code compile with a newer plutus version raduom/plutus-exbudget-error
- Now trying to update cardano-ledger-specs following changes in directories structure. Looks like updating those dependencies will be a nice 🐰 🕳️
- Build fails because of missing liblzma dependency, added `pkgs.lzma` to the `shell.nix` file and now it's recompiling nix!
- Reverting all my changes as it has become a mess. Starting over from the nix shell dependencies as it seems to be the root to be updated. I am in the situation I definitely would like to avoid: I need to update dependencies and for this I need to modify nix stuff, which means I need to understand what I am doing and what to update where. But I don't really know what I am doing, and SN is away and has been the one updating dependencies and maintaining the nix infrastructure in the past -> Bus factor = 1
- When he last updated dependencies, SN used `nix-shell -A cabalOnly` to not use haskell.nix, which seems to have helped him, will try this.
Adding lzma dependency to the shell.nix, also upgrading GHC to 8.10.7, haskell.nix archive to a more recent one and nixpkgs reference to 21.05
- Needed to materialise nix plan and it's now compiling all base dependencies
- Struggling with nix to get the update to 8.10.7 to pass, now I need to add some more libsodium configuration for some packages from https://github.com/input-output-hk/plutus/blob/master/nix/pkgs/haskell/haskell.nix#L256 => Just duplicated the `libsodium-vrf` declaration from `shell.nix` to `default.nix` and it now seems to work
Seems like upgrading Plutus won't be easy: MPJ failed to upgrade dependencies to cardano-ledger-specs in their repository, he had to create a separate branch to add some pending changes. It's probably safer to stay as we are now even with the issues we are having.
List of architectural katas: http://nealford.com/katas/list.html
Goal: Fix Direct on-chain component abort transaction validation failure
- Got failing test for init -> abort transaction logic, going to add traces to understand what's failing
- Checking the scripts and datums hashes to make sure the transaction provide all of them
- Modifying cardano-ledger-specs to add more verbose output when script validation fails. Trying to enter nix-shell in cardano-ledger-specs to be able to test my changes, took about 10 minutes to enter nix shell, now doing a `cabal build all` in the ledger specs directory to compile stuff
  - Depending on external dependencies like the ledger increases turnaround time to insane levels: Now need to check it works in the original repo before changing the reference in the hydra-poc repo, because otherwise every commit will require a full recompilation which is ridiculously expensive
- cardano-api depends on `ValidationFailed` error's structure so I need to also adapt the code there because I changed the error in alonzo => Rather than modifying the cardano-ledger-specs I am going to work at a lower level, namely testing the script directly with Plutus, as a unit test
- Just added `Verbose` logs with `Debug.Trace.trace` in ledger spec then update the dependency in hydra-poc
- Abort Tx test fails with
["mustRunContract: script not found.","Pd"]
which is exactly what MB was seeing the last time, which seems to imply the script cannot be found, either because the hash is invalid or some other reason. => looking at the source of the error
Trying to write a test using https://github.com/input-output-hk/plutus/blob/master/plutus-ledger-api/src/Plutus/V1/Ledger/Api.hs#L262
- This is hard because I need to build the `ScriptContext` which requires a full transaction which is difficult to build by hand. Could build the transaction in the ledger and then use the functions to translate it to Plutus but thought I might as well check first what the logs are saying
Trying to uncomment the mustRunContract function which is the one resolving the contract references we need to validate the output is correctly spent: This function fails to resolve the script, when replaced with a const True function the test does not pass but the scripts' execution succeeds
- display the `Dependencies` content and compare with what's in the transaction. Hashed dependencies show:
[581cf0bce8043dc5f9c32ebad31652e239a8f15d1bf01f4d8d1b9740f73f,581c2e95c0a89c450a245d3324d16260797b54f2010e2ea494e5214323c9]
but the scripts' hashes in the transaction are:
5d8dd23697de989275a58ef20edeacb320994f590cf0e10a0163cf3a f0bce8043dc5f9c32ebad31652e239a8f15d1bf01f4d8d1b9740f73f
Trying to simplify dependencies hash computation to use the validatorHash provided in the SC code. AFAICT The validatorHash ultimately uses the same hash function, the one from Cardano.Ledger.Era.hashScript
- Interestingly replacing the hash computation yielded the same hashes
- So I still have the same error... Now investigating what the hashes look like on both sides and trying to find how the `Credential` in the `TxInInfo` we are filtering is constructed on the ledger side
transCred :: Credential keyrole crypto -> P.Credential
transCred (KeyHashObj (KeyHash (UnsafeHash kh))) =
  P.PubKeyCredential (P.PubKeyHash (P.toBuiltin (fromShort kh)))
transCred (ScriptHashObj (ScriptHash (UnsafeHash kh))) =
  P.ScriptCredential (P.ValidatorHash (P.toBuiltin (fromShort kh)))
🍾 I managed to have the abortTx validate its scripts. The issue was indeed in the way we construct the hashes. It was unclear to me why we were seeing different hashes between Plutus.validatorHash and Ledger.hashScript but I finally found the reason: We are using an "old" version of Plutus.
Hash Computation
Here is the code that computes a `ValidatorHash` given a script:
validatorHash = ValidatorHash . scriptHash . getValidator
scriptHash :: Script -> Builtins.BuiltinByteString
scriptHash =
  toBuiltin
    . Cardano.Api.serialiseToRawBytes
    . Cardano.Api.hashScript
    . toCardanoApiScript

toCardanoApiScript :: Script -> Script.Script Script.PlutusScriptV1
toCardanoApiScript =
  Script.PlutusScript Script.PlutusScriptV1
    . Cardano.Api.PlutusScriptSerialised
    . SBS.toShort
    . BSL.toStrict
    . serialise
Then the code for Cardano.Api.Script.hashScript :
hashScript :: Script lang -> ScriptHash
hashScript (SimpleScript SimpleScriptV1 s) =
  ...
hashScript (PlutusScript PlutusScriptV1 (PlutusScriptSerialised script)) =
  -- For Plutus V1, we convert to the Alonzo-era version specifically and
  -- hash that. Later ledger eras have to be compatible anyway.
  ScriptHash
    . Ledger.hashScript @(ShelleyLedgerEra AlonzoEra)
    $ Alonzo.PlutusScript script
Where Cardano.Ledger.Era.hashScript is a method of the `ValidateScript` typeclass with Era-dependent implementations.
The generic implementation says:
-- UNLESS YOU UNDERSTAND THE SafeToHash class, AND THE ROLE OF THE scriptPrefixTag
hashScript =
  ScriptHash . Hash.castHash
    . Hash.hashWith
      (\x -> scriptPrefixTag @era x <> originalBytes x)
but the implementation for Alonzo says:
instance (CC.Crypto c) => Shelley.ValidateScript (AlonzoEra c) where
  scriptPrefixTag script =
    if isPlutusScript script
      then "\x01"
      else nativeMultiSigTag -- "\x00"
So it seems it hashes not only the script's serialised content but also a prefix tag of 0x01!
In the https://github.com/input-output-hk/cardano-node/blob/master/cardano-api/src/Cardano/Api/Eras.hs#L336 file we have:
ShelleyLedgerEra AlonzoEra = Ledger.StandardAlonzo
with the latter being defined in https://github.com/input-output-hk/ouroboros-network/blob/master/ouroboros-consensus-shelley/src/Ouroboros/Consensus/Shelley/Eras.hs#L89 as
type StandardAlonzo = AlonzoEra StandardCrypto
So all in all the computed hash values should be equal!
It turns out it all makes sense: The hashes are actually consistent, but only in a more recent version of the Plutus code than the one we are using! The version of plutus we use is at commit 36dcbb9140af0c9b5b741b6f7704497d901c9c65 which contains this code for hashing scripts:
scriptHash :: Serialise a => a -> Builtins.BuiltinByteString
scriptHash =
  toBuiltin
    . Crypto.hashToBytes
    . Crypto.hashWith @Crypto.Blake2b_224 id
    . Crypto.hashToBytes
    . Crypto.hashWith @Crypto.Blake2b_224 id
    . BSL.toStrict
    . serialise
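To make the mismatch concrete, here is a minimal sketch of the two hashing schemes side by side. It abstracts over the hash function (the real code uses Blake2b_224, which is not available from the libraries shipped with GHC), so `HashFn`, `oldPlutusScriptHash`, `ledgerScriptHash` and the toy hash are hypothetical names used for illustration only, not the actual Plutus or ledger API.

```haskell
import qualified Data.ByteString as BS
import qualified Data.ByteString.Char8 as BC

-- | Stand-in for Blake2b_224 (assumption: any function on bytes works for the sketch).
type HashFn = BS.ByteString -> BS.ByteString

-- | Old Plutus scheme shown above: the serialised script is hashed twice.
oldPlutusScriptHash :: HashFn -> BS.ByteString -> BS.ByteString
oldPlutusScriptHash h = h . h

-- | Alonzo ledger scheme: hash the prefix tag 0x01 followed by the script bytes.
ledgerScriptHash :: HashFn -> BS.ByteString -> BS.ByteString
ledgerScriptHash h script = h (BS.singleton 0x01 <> script)

main :: IO ()
main = do
  let toyHash = BS.reverse -- NOT a real hash, it only makes the difference visible
      script = BC.pack "serialised-script"
  -- The two schemes feed different bytes to the hash, so the results differ.
  print (oldPlutusScriptHash toyHash script == ledgerScriptHash toyHash script)
```

With a real hash function the outcome is the same in practice: the address carries the ledger-style hash, the old Plutus `validatorHash` produces a different one, and the script lookup fails.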
- Discussing so-called “star-shaped head network” protocol draft:
- There's one server which is part of a Head, or even is running a Head alone
- There are many clients connected to the server
- Client <-> Server are connected through 2-parties isomorphic channels, eg. "mini-Heads" that should be simpler than full multiparty head but with the same properties: Isomorphic, safe, requiring being online to ensure progress.
- Transactions can flow from one client to the others through the pairwise channels mediated by the Head which acts effectively as a bridge
- This channel construction is similar to Perun/lightning channels and can easily be leveraged to give Virtual channels network,
- On-line requirement is needed to ensure safety without collateral from the server like in T2P2: When offline, channel with the server is stale so client needs at least to be periodically online. This could be good fit for light/mobile clients provided there's a way to have a safe access to the chain's state (Watchtowers?)
- This implies some form of multi-protocol support inside a single node is needed (relaying, different protocols between different parties)
- Discussing a potential collaboration with Perun researchers/engineers for an alternative way to inter-connect Hydra Heads (using virtual perun channels?)
- Some more discussion about NFTs on Hydra Heads:
- analogy with "Scotty, beam me up" -> do you transport the matter or destruct/reconstruct it somewhere else?
- NFTs in Head -> MintingPolicy allowing to remint the NFT on-chain in the fanout
Published Milestone report urbi et orbi. Highlights are:
- good feedback from Summit and outcome of team Workshop in Berlin leading to refined understanding of short term goals and use case
- while we managed to have a working demo in time for the summit, we did not "close the loop" and were not able to run Hydra Head cluster over an actual Cardano Alonzo testnet,
- we are on track to provide a roadmap and implementation plan for S1 2022 by end of October 2021.
I want to clean up the PR backlog before tackling the Direct transaction submission problem, going to fix log-filter tests and process wrapper to ensure we can merge that today.
- Monitoring test is flaky on CI, although I changed the way we allocate ports => Fuse the 2 unit tests into one because it does not make much sense to have 2 separate tests for the same "behavior"
Thinking it could be a "fun" side-project to implement Golomb-Rice set in Haskell: https://github.com/btcsuite/btcutil/blob/master/gcs/gcs.go
Added a section in demo/README.md to start the demo without docker as requested by someone on the discord channel.
Also all PRs are merged and the only one left is the "direct chain interaction" https://github.com/input-output-hk/hydra-poc/pull/90 to have an Init -> Abort sequence working properly on-chain (or at least with transactions validated by the ledger)
Working on Wallet PR KtorZ/ADP-919/sign-transaction to add more integration tests
https://cbor.me enables decoding a base64 encoded CBOR data
- Got stuck with testing transaction signing with withdrawal(s), got an error with CBOR decoding of TX on the server side. Wrote a unit test at the deserialisation level for `sealedTxFromBytes` which led us to realising the roundtrip test was not covering much of the structure of the sealedTx.
- Possible investigations: Cover more fields in the roundtrip, also try to get better error reports at CBOR level
- The problem was that we passed the serialised `sealedTx` into the quasiquoter constructing the payload to the sign transaction endpoint, instead of the value itself. Seems like there's an `instance ToJSON ByteString` in scope.
- A `SealedTx` is just a wrapper around the raw bytes of a cardano transaction, not clear what the other fields are used for. It can be produced by parsing some bytes as it is just a cardano transaction in CBOR encoding. The question is: How is the bytestring encoded in user requests?
  - In JSON, it is assumed to be a string containing the base64 encoding of the transaction, but the cardano-cli outputs base16 encoded raw transactions for signing.
  - So we should accept both encodings for an `ApiT SealedTx` in order to minimize friction for end users.
Not having haskell-language-server working is painful, tried to install HLS from nix and source but does not work.
Trying to use HLS in the cardano-wallet does not work out of the box, seems like one needs to do more work: using scripts/gen-hie.sh. Wallet has more than 100 modules in the core packages which leads to long compilation time esp. without immediate feedback from HLS, this is painful.
Not sure me helping the wallet for a couple of weeks is a very useful and productive use of my and Matthias' time: I won't be able to learn much of the codebase in the given time frame so won't be autonomous and will need to ask a lot of questions and get a lot of guidance, for a net ROI which is probably negative as I won't stay working on the wallet. I could pair and possibly contribute useful observations and a second pair of eyes but this would require pairing most of the time with different people so unsure if that's the goal.
- Adding a shell.nix to get build tools like clang in scope
- Libsodium: the `musig2_compat` branch is quite different to the one we use in the `hydra-node` -> make sure to rebase the necessary changes later
- Got the `musig2test` working -> `nix-shell --run "make && ./musig2test"` on https://github.com/ch1bo/musig2/tree/ch1bo/build-via-nix
- Plan: dump keys and signed message from `./musig2test`, load them in haskell and run them through Plutus' verifySignature -> no FFI required for now
- Writing keys and message was straight-forward
  - quite nice to do C for a change!
- What is the format of the signed message / envelope?
  - The example uses libsodium's combined mode
  - Separated signatures are more likely what we would be requiring when wrapping this into a library
  - Split it by hand after the fact for now
  - Signature length should be `64` bytes
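Splitting a combined-mode signed message "by hand" can be sketched as below: in libsodium's combined mode the 64-byte signature is prepended to the message, so the split is a plain `splitAt`. The names `signatureLength` and `splitSigned` are made up for this sketch and the input bytes are dummy data, not output of `./musig2test`.

```haskell
import qualified Data.ByteString as BS

-- | Length of the signature in bytes, as noted above.
signatureLength :: Int
signatureLength = 64

-- | Split a combined-mode signed message into (signature, message).
-- Returns Nothing when the input is too short to contain a signature.
splitSigned :: BS.ByteString -> Maybe (BS.ByteString, BS.ByteString)
splitSigned signed
  | BS.length signed < signatureLength = Nothing
  | otherwise = Just (BS.splitAt signatureLength signed)

main :: IO ()
main = do
  -- Dummy "signature" of 64 bytes followed by a 3-byte message.
  let signed = BS.replicate 64 0xAA <> BS.pack [1, 2, 3]
  print (fmap (\(sig, msg) -> (BS.length sig, BS.unpack msg)) (splitSigned signed))
```

The two halves are then in the shape a detached-signature verifier expects: the signature on its own and the message on its own.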
- Realization: Protocol validators need to ensure that the tx spending the contract output and creating the next datum makes it possible for "others" to know the datum used for the datum hash. For example: The close tx validator needs to ensure that (at least) the snapshot number is included such that other participants can re-construct the datum using the number + stored snapshots in order to spend the output / contest.
- Making the datum simple helps in keeping the script output "spendable". In the end, the datum is a secret and if we want it easy to be able to contest, just having the snapshot number as datum is maybe enough?
- Walking through the on-chain validators, datums and redeemers again
- Fanout is quite complex, it might get expensive with many outputs
- Good thing: it is deterministic and costs would be known up-front and can be avoided before acknowledging them in the Head
- Optimizations to keep the utxos in the head small could be worthwhile
- Utxo set size and costs created by that need to be tractable and not hidden from users of hydra-node
- Applications / operators should be able to take action and decide on such things
- There are hard limits though based on main-chain protocol parameters
Looking at the document about Plutus extensions provided by MPJ, which refers to Hydra in a few places. Proposals are:
- Add reference inputs to transactions, eg. inputs which are not consumed by the transaction but whose datum/datum hash are available to scripts
- Use inline datum instead of only datum hashes.
- Provide script references which is a combination of the above 2 proposals, to remove the need to provide a script as witness in the consuming transaction every time it is used.
Found it difficult at first to understand what I had to do, and how to properly extend existing code to do what I want, eg. retrieve data from Init transaction so that it can be consumed by the Abort transaction. It's actually hard to shape one's thoughts along the lines of another person's thoughts, esp. when in "experimentation" mode and we take a lot of shortcuts, or drift away from the actual goal because it's too complicated to do in one step. In this situation, probably everyone would take a slightly different step and a slightly different direction.
This sheds some light on the importance of pairing/mobbing to share the context of the code we write when it's not already obvious. An alternative is to be very explicit about the goals, the intermediate steps we are taking, the assumptions we make about the environment, and the shortcuts we take. They are here but disseminated across different functions and files, which makes building a big picture hard.
Trying to add logs to the direct chain component, seems however we don't have JSON instances for ValidatedTx? -> using only Show for now but should be fine
How do I make a TxIn from a TxOut? => hash the TxBody and add the index
Got to the point where I have only one failing test, namely the one about init -> abort dance which now makes sense.
- I need to properly observe the abort transaction from the chain and make sure it has the necessary inputs from the current head state.
Actually there's currently no way to link init to abort because:
- Init produces a single output which is the address of the main (SM) script with the parameters as datum
- Abort consumes an output for the validator Script with the pubkeyhash of the recipient
- Abort should also consume the SM output and pass the parameters as datum which is what we need to verify first
Going back to basics, here are next steps to be able to do the init -> abort sequence correctly:
- Add the output for the SM to the Init tx
- Make sure this output updates the on-chain state
- Have the Abort transaction consume this output
- Add the thread token inferred from the seed txin
- Add parties' verification keys to the head parameters
- Mint the PTs (using Thread token) and create one output per party with the PTs and their verification keys
- Consume those outputs in the abort tx
In parallel we need to write and check the validator scripts themselves as this is not really done in our tests because the mock ledger does not verify anything, of course.
Got the state change test green but now the abort tx unit test validation fails with:
Evaluation results: fromList [(RdmrPtr Spend 0,Left (ValidationFailed (CekError An error has occurred: User error:
The provided Plutus code called 'error'.)))]
- Recompiling ledger-specs setting the flag for evaluation to `Verbose` in order to get better logs
- It seems there's another problem in the Head validator's state machine as we don't pass any `ThreadToken`, or rather we pass one but do not use it to instantiate the SM hence it's more than probable the evaluator fails to find the SM -> MB adapted the interface in another PR
- Trying to switch the validator to a simpler one, and check I can build the aborttx, possibly also checking I can sequence the transactions and observe the aborttx. With a simple (parameterized) validator, script evaluation succeeds just fine even though the test fails because the count of results is incorrect, but next execution fails
- With a single script reference to the MockHead validator it passes, so I must be doing something wrong with the `RdmrPtr` logic.
There is an issue related to how it resolves its redeemers.
- I was using directly `RdmrPtr` passing an incremented counter but of course this does not make sense because `inputs` is a `Set`.
- One needs to either sort the inputs by `(TxId, TxIx)` order or use the `rdptr` function from the ledger API that does the right thing to associate the redeemer with the right input.
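The redeemer index issue above can be illustrated with a small sketch: because the inputs form an ordered set, the index of a spending redeemer is the position of its input in that order, not the order in which inputs were added. The `TxIn` alias and `spendingIndex` are simplified stand-ins invented for this example (the real ledger types and the `rdptr` function are richer).

```haskell
import qualified Data.Set as Set

-- | Simplified stand-in for the ledger's TxIn: (transaction id, output index).
type TxIn = (String, Integer)

-- | Index to use in a spending redeemer pointer: the position of the input
-- in the ordered set of transaction inputs, if it is present at all.
spendingIndex :: TxIn -> Set.Set TxIn -> Maybe Int
spendingIndex = Set.lookupIndex

main :: IO ()
main = do
  -- Inputs are inserted in an arbitrary order; the set sorts them.
  let inputs = Set.fromList [("bb", 0), ("aa", 1), ("aa", 0)]
  print (map (`spendingIndex` inputs) [("bb", 0), ("aa", 0), ("cc", 0)])
```

Here `("bb", 0)` was inserted first but ends up last in the set, which is exactly the case an incremented counter gets wrong.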
Topic: What to do about rollbacks?
Actually, these are not really rollbacks, it's just that a longer chain was found. It's important because this means we are not going back in time, eg. the new chain will be at the same "moment" in time as the old one.
- `CollectCom` is the most sensitive transaction, if rolled back it's as if nothing happened in the head which might be quite annoying if people are expecting head transactions to be "final" or "settled" and do side-effects depending on it.
- `Close` and `Contest` could also be problematic as they are time-bound (by the contestation period) and it could be the case some contests "disappear" for want of time to post them in case of a rollback
- Other mainchain transactions are less problematic, they can simply be resubmitted
Hydra users need to be aware of the settlement time on the mainchain, for example there is a 600s limit on Kraken for payments on Cardano to be considered final. To be safe against adversarial nodes, one needs to wait for some number of blocks (there is a document available providing some simple tables relating expected probability of "failure" to adversarial stake and number of blocks to wait, eg. for 5% adv stake, 0.01% failure, one has to wait 73 blocks or ~20 minutes)
- The contestation period needs to be set to some large enough value, eg. larger than expected time to get a rollback
- Validators do not get absolute time slots, they only get the validity range of the transaction which, in the case of `Close`/`Contest` transactions includes the contestation period, and because scripts run after stage 1 validation, they can assume the range is valid. From there, the validator can check if the range falls within accepted bounds
- Also, `T_max` should not be so large as to prevent the head from making progress, but this can be verified by the validator too in the range
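The kind of range check a validator could perform might look like the sketch below. It is only an illustration under simplifying assumptions: `SlotRange`, `rangeNotTooWide` and `contestInTime` are hypothetical names, slots are plain integers, and real Plutus validity ranges can be open-ended, which is ignored here.

```haskell
-- | Toy stand-in for a transaction validity range, in slots (both bounds inclusive).
data SlotRange = SlotRange
  { lowerSlot :: Integer
  , upperSlot :: Integer
  }

-- | The range must be well-formed and no wider than tMax.
rangeNotTooWide :: Integer -> SlotRange -> Bool
rangeNotTooWide tMax (SlotRange lo hi) = lo <= hi && hi - lo <= tMax

-- | A contest is only acceptable if its whole validity range ends by the deadline.
contestInTime :: Integer -> SlotRange -> Bool
contestInTime deadline range = upperSlot range <= deadline

main :: IO ()
main = do
  let tMax = 100
      deadline = 1000
  print (rangeNotTooWide tMax (SlotRange 900 950), contestInTime deadline (SlotRange 900 950))
  print (rangeNotTooWide tMax (SlotRange 0 5000), contestInTime deadline (SlotRange 990 1010))
```

The validator never learns the "current" slot; it only reasons about the bounds, relying on the ledger having already checked that the transaction is inside its validity range.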
Consequences for Hydra Head:
- `HeadLogic` needs to be aware that its state could be "rolled back", eg. an onchain transaction can reset the state to something else, even while the head is opened => This could be property tested, we had something similar at one point
- The settlement time should be a parameter of the node set by users, depending on how long/what risk they are willing to take w.r.t. rollbacks
- The contestation period should be set large enough, possibly in relationship to this settlement time?
- The OnChain component could be the one doing the wait, retaining `OnChainTx` until enough blocks have passed before notifying the node
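A minimal sketch of the last idea, retaining observed transactions until they are buried under enough blocks, could look as follows. `Pending`, `onNewBlock` and the use of block numbers as the measure of depth are assumptions of this sketch; it also ignores the handling of an actual rollback, which would have to drop pending entries from the abandoned fork.

```haskell
import Data.List (partition)

-- | An observed transaction together with the block number it was seen in.
data Pending tx = Pending
  { seenAtBlock :: Integer
  , pendingTx :: tx
  }

-- | On a new block, release every transaction that has at least `depth`
-- blocks on top of it; keep the others waiting.
onNewBlock :: Integer -> Integer -> [Pending tx] -> ([tx], [Pending tx])
onNewBlock depth blockNo pending =
  let (settled, waiting) = partition (\p -> blockNo - seenAtBlock p >= depth) pending
   in (map pendingTx settled, waiting)

main :: IO ()
main = do
  let pending = [Pending 10 "initTx", Pending 12 "commitTx"]
      (ready, rest) = onNewBlock 3 13 pending
  print ready
  print (map pendingTx rest)
```

The `depth` argument would be derived from the user-chosen settlement time mentioned above.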
- Plutus validators are also `Blake2b_224`, but why did the `fromJust` not work before? case solved it, hash conversion works now
- Get a MissingScript error now
Falsified (after 1 test):
TxInCompact (TxId {_unTxId = SafeHash "03170a2e7597b7b7e3d84c05391d139a62b157e78786d8c082f29dcf4c111314"}) 180
Utxo: UTxO (fromList [(TxInCompact (TxId {_unTxId = SafeHash "03170a2e7597b7b7e3d84c05391d139a62b157e78786d8c082f29dcf4c111314"}) 180,(Addr Testnet (ScriptHashObj (ScriptHash "1302a9a442fa86e8e836aa39d961ec3e71f500f21a633ae0cf2b60b1")) StakeRefNull,Value 0 (fromList []),SJust (SafeHash "faa51ea0059e04224cc13da34b53bba807fb2affd71ee401e85dfa3f769081fd")))])
Tx: ValidatedTx {body = TxBodyConstr TxBodyRaw {_inputs = fromList [TxInCompact (TxId {_unTxId = SafeHash "03170a2e7597b7b7e3d84c05391d139a62b157e78786d8c082f29dcf4c111314"}) 180], _collateral = fromList [], _outputs = StrictSeq {fromStrict = fromList []}, _certs = StrictSeq {fromStrict = fromList []}, _wdrls = Wdrl {unWdrl = fromList []}, _txfee = Coin 0, _vldt = ValidityInterval {invalidBefore = SNothing, invalidHereafter = SNothing}, _update = SNothing, _reqSignerHashes = fromList [], _mint = Value 0 (fromList []), _scriptIntegrityHash = SNothing, _adHash = SNothing, _txnetworkid = SNothing}, wits = TxWitnessRaw {_txwitsVKey = fromList [], _txwitsBoot = fromList [], _txscripts = fromList [(ScriptHash "a893ca7be59f00c935c382fd8f8e515adcc9850f1ec5dbafbe99face",PlutusScript ScriptHash "a893ca7be59f00c935c382fd8f8e515adcc9850f1ec5dbafbe99face")], _txdats = TxDatsRaw (fromList []), _txrdmrs = RedeemersRaw (fromList [(RdmrPtr Spend 0,(DataConstr Constr 0 [],ExUnits {exUnitsMem = 0, exUnitsSteps = 0}))])}, isValid = IsValid True, auxiliaryData = SNothing}
Evaluation results: fromList [(RdmrPtr Spend 0,Left (MissingScript (RdmrPtr Spend 0)))]
- Reason: address script hash of utxo is different than the provided scripts
- Apparently `hashScript` is different than `hashFromBytes` via `ValidatorHash`
  - suspect: the double hashing in plutus' `scriptHash` is suspicious and maybe the ledger does not expect that?
  - use the same technique in both places to see whether it's just the hashing or the serialized script -> will it run?
- SN explained the current situation of transaction creation to AB
- Started off with the `MissingDatum` error when running the `initial` validator against an `abortTx`
- Solved it by providing the `PubKeyHash` to `abortTx`, but this is really only a quick fix. The `abortTx` should actually spend all the outputs which contain PTs.
- By providing the `PubKeyHash`, `abortTx` validates now against `initial`!
- Bad news: We realize that we need some "onchain state" which is tracking utxo + which datums would be able to spend these
  - e.g. for the abortTx we would need to have something like `[(TxIn, PTSpending)]` with
    data PTSpending = FromInitial PubKeyHash | FromCommit (UTxO Era)
    to keep track from where and how utxo's would need to be spent by this transaction
  - doing things the "Direct" way is hard!
- Give another shot on PAB, tests fail because of a missing nextTransactionsAt
- awaitUtxoProduced seems to be a good replacement, providing us with outputs and txs
- along with that there is `txOutRefMapForAddr` for filtering certain TxOut
- can't seem to find/import `ChainIndexTx` though
- `typeScriptTxOut` won't decode the datum because that `ChainIndexTxOut` is not necessarily containing the tx and thus it can't know the Datum (but just the hash)
  - `txOutRefMapForAddr` gives directly TxOuts which do only contain datum hashes
  - re-combine into a ChainIndexTxOut with ChainIndexTx or use `utxosTxOutTxFromTx` as `Plutus.Contract.StateMachine`
- Changed to just merge all utxos seen in `watchInit` together and try to decode the right datum from any output
  - It turns out we do not need the address of the state machine validator anymore
  - Maybe this is inefficient
- The Head statemachine contract can't be observed for `Final` state as it does not produce a txOut
  - re-defining `isFinal Final = False` works around this
- Idea: running the plutus validator in haskell against constructed transactions
- Which serialization of plutus scripts?
  - From TypedValidator we would use tvValidator / mkValidatorScript to get a `Validator`
  - This is an instance of `Serialise` (from `serialise` package)
  - However `Validator` is only a thin wrapper around `Script`, which has a `ToCBOR` where we can use `cardano-binary`'s `serialize`? -> opted for this one
- Debating on how to run the validator now
  - `evalScripts` takes the TxInfo already as `Data`, so maybe use `collectTwoPhaseScriptInputs` - this is from the ledger
  - there are other (more low-level) ways from `plutus-ledger-api` package
  - `evaluateTransactionExecutionUnits` is used by `cardano-api`, maybe also fine for us / our tests?
  - the `constructValidated` function looks promising for a template to collect+eval scripts, although this function seems not to be used anywhere
- Call into plutus `evaluateScriptCounting` directly as we want to evaluate a specific script on Tx
- Realized that validating the init tx with the initial contract does not make sense!
  - rather the commit tx or abort tx would be governed and thus need to validate with the Initial script
  - those txs would also include said script in order to spend the input, so using the plutus functions is too low level and we should be able to use the collect+eval scripts from ledger after all
- Refactor to use `evaluateTransactionExecutionUnits` from ledger
- What is that `ScriptIntegrityHash` used for?
- Converting from a plutus validatorHash/Address to the ledger's Addr is weird
  - also could not find whether / where plutus is doing the same thing?
- BTW they also have exactly a mock chain sync client and server but this seems to be using their own `Tx` type (see `Plutus.V1.Ledger.Tx`)
It's not only weird, but it fails because plutus does not give us blake2b_224 hashes (likely sha256 instead)
We discussed the new approach for on-chain interaction as an alternative to the PAB.
- We were able to complete a full round-trip using the Ouroboros mini-protocols, though the transactions being submitted and deserialized are not representative of the actual chain interactions. It still demonstrates how the setup works and that it's possible to fully test the approach in isolation with a mock server.
- We agree that we want to keep "wallet concerns" inside the component, and not leak them through the abstraction, to keep the solution as close as possible to what the PAB gives us. Ideally, we could swap one implementation with the other once done.
- The last point means that we have to provide the direct-chain component with credentials to be used for (a) signing Hydra on-chain transactions and (b) tracking users' funds to pay for those transactions. A simple pub/prv key pair would do, from which we can derive a change address and track the UTXO set easily. This means that the Hydra node would initially require users to move some funds into a specific address that they control, but gives "custody" to the Hydra-node for running a head. It's important that funds at the address are sufficient to cover a full head lifecycle (init, close and more importantly, contest) and we should warn users accordingly if not (or even, refuse to init a head).
- The current implementation of the chain-sync client is wrong (and we know it) as it will always synchronize blocks from the origin. What we want however is to start at a much later point, for example, the current tip and onwards. This is easily achieved with the chain-sync protocol itself but comes with a limitation: all participants have to be online and observing the chain before any init transaction is submitted to the network. While it is okay-ish for now, if we persist with that approach, we'll have to provide some synchronization mechanisms between peers.
- We are also currently over-simplifying the problem by considering that participants are only members of a single head at a time. Thus, looking for on-chain transactions does not currently check whether a transaction is indeed about a given head, but only checks whether it involves "us" as a participant. Later, we'll want to also recognize which head instance is concerned by an on-chain transaction, which can be done by means of the state-machine thread token.
- The question of rollbacks was raised again. In principle, in Praos, a node can rollback up to 3k/f slots (~18h) so, transactions only truly reach immutability after 18h. Yet, they reach a high enough probability (99.999%) well before that. Still, depending on the adversarial stake in the system, this can vary from a few minutes to hours. We want to bring this question to the next engineering meeting with the consensus team. There are really two types of rollbacks:
  - 'organic' rollbacks, which can occur because of Praos and how the consensus sometimes elects two or more leaders for the same slot. This type is quite benign, and transactions lost by such rollbacks can simply be re-submitted if needed. Although this is annoying for the contestation, it can likely be managed gracefully.
  - 'adversarial' rollbacks, which are induced by an adversarial party trying to double-spend. For example, one head participant could commit to a head some funds he/she is also trying to double-spend at the other edge of the network. This would result in head participants thinking they're indeed inside a head, whereas in practice, nothing really happened.
- Work stream on the "round-trip" of `InitTx` by posting and observing it again on chain - analogously as with the `ExternalPAB`
- Start with storing/recovering the HeadParameters as Datum
  - before diving into minting the thread token, participation tokens etc.
- Managed to create and add a script output, BUT
  - while I could check that there is an output with some datum hash (even checking the hash)
  - cardano-api seems not to include the `Datum` when creating an output (in contrast to the plutus framework)
  - So while the test would need to also `signShelleyTransaction` to see script witnesses in the transaction, the ones for "creating outputs" are not present
  - This would make it impossible to deserialize the `HeadParameters` from an `initTx`
  - ... back to `cardano-ledger-specs` for more control?
-
Switching to cardano-ledger-specs was not too hard
- Could construct the TxBody with the single input
- Now the question is how to assert that the datum is present?
- Return
Txwithout signatures and update that later or introduce an intermediate type just for this? e.g.data TxDraft = TxDraft { body :: TxBody (AlonzoEra StandardCrypto) , -- | Datums used by scripts in the body. dats :: TxDats (AlonzoEra StandardCrypto) }
- Opted for simply returning an unsigned (and unbalanced for that matter)
ValidatedTx
-
Interesting observation: converting between HeadParameters and Initial / onchain representations is explicit now in the tests / not via JSON
-
observeTxcould serve as the pendant ofconstructTx- Using
Alternative Maybewe could provide for a nice interface - Is this efficient?
- Using
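A minimal sketch of what an `Alternative Maybe` based `observeTx` could look like. The `Tx` and `OnChainTx` types below are toy stand-ins invented for illustration, not the real ledger or Hydra types; only the `<|>` combination reflects the idea noted above.

```haskell
import Control.Applicative ((<|>))

-- Toy stand-ins for the real transaction and observation types (hypothetical).
data Tx = Tx {txKind :: String, txValue :: Int}
  deriving (Eq, Show)

data OnChainTx
  = OnInitTx Int
  | OnCommitTx Int
  | OnCloseTx Int
  deriving (Eq, Show)

-- | Recognize a single kind of transaction, 'Nothing' otherwise.
observeKind :: String -> (Int -> OnChainTx) -> Tx -> Maybe OnChainTx
observeKind kind con tx
  | txKind tx == kind = Just (con (txValue tx))
  | otherwise = Nothing

observeInitTx, observeCommitTx, observeCloseTx :: Tx -> Maybe OnChainTx
observeInitTx = observeKind "init" OnInitTx
observeCommitTx = observeKind "commit" OnCommitTx
observeCloseTx = observeKind "close" OnCloseTx

-- | Try each observer in turn; the first one returning 'Just' wins.
observeTx :: Tx -> Maybe OnChainTx
observeTx tx = observeInitTx tx <|> observeCommitTx tx <|> observeCloseTx tx

main :: IO ()
main = mapM_ (print . observeTx) [Tx "init" 10, Tx "close" 3, Tx "other" 0]
```

On the efficiency question: with this shape every observer inspects the transaction again until one matches, so the cost grows with the number of observers. Whether that matters depends on how expensive each real observer is.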
Trying to enhance log-filter program to be able to fork a hydra-node program and filter the logs it produces directly from its stdout, rather than filtering logs on disk.
Ran into some problems:
- children are apparently not properly reaped when the parent dies
- passing arguments to the `log-filter` program is problematic when invoking hydra-node through `cabal run` because I need to pass `--` twice
Continue working on implementing a mock node in order to test Direct chain component that will build transactions from PostTx messages and send back OnChainTx messages from observed transactions in blocks.
Implementing mock TxSubmission server using a TQueue to hold the transactions, shortcutting the block construction with 1 block = 1 tx
- Find intersect: Client sends some points which are supposed to be in the ledger, server will respond with the latest point of its chain among the sent points
- Server maintains a cursor for each client and sends them updates on request, which could be backward/forward
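The two behaviours above can be sketched with a chain modelled as a plain list. This is a simplification under stated assumptions: `Point`, `Chain`, `Update`, `findIntersect` and `requestNext` are hypothetical names for this example and not the ouroboros-network API.

```haskell
import Data.List (find)
import Data.Maybe (fromMaybe)

-- | A chain point; 'Origin' sorts before any slot (derived 'Ord').
data Point = Origin | At Int
  deriving (Eq, Ord, Show)

-- | The server's chain, most recent point first, always ending in 'Origin'.
type Chain = [Point]

data Update = RollForward Point | RollBackward Point | AwaitReply
  deriving (Eq, Show)

-- | Most recent point of the server's chain that the client also sent.
findIntersect :: Chain -> [Point] -> Maybe Point
findIntersect chain clientPoints = find (`elem` clientPoints) chain

-- | Next update for a client whose cursor is at the given point.
requestNext :: Chain -> Point -> Update
requestNext chain cursor =
  case dropWhile (/= cursor) (reverse chain) of
    (_ : next : _) -> RollForward next -- cursor is on chain, move one step
    [_] -> AwaitReply -- cursor is already at the tip
    [] ->
      -- Cursor is not on the server's chain (any more): roll back to the
      -- most recent server point older than the cursor.
      RollBackward (fromMaybe Origin (find (< cursor) chain))

main :: IO ()
main = do
  let chain = [At 30, At 20, At 10, Origin]
  print (findIntersect chain [At 25, At 20, At 10])
  print (map (requestNext chain) [At 20, At 30, At 25])
```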
Design discussion about next steps for "mock node", or how to write server and client in order to test transactions are observed on both sides:
- We need one peer per client per protocol
- The client (production) and server-side (test) code should both be ouroboros applications
- We can build a pair of `Channel`s to the client/server? => the Mux can use a `Channel`
  - Or we need to bite the bullet and use a `Snocket`
Next steps:
- Complete network layer
- Add smart constructors to create transactions for Hydra protocol
- Which API to use? Cardano-api or ledger-api? => Which one is easier
- We used the cardano-api in the tests to decouple from the hydra-node library code, and went for using the JSON schema API instead
- Goal: smart constructors for protocol txs, e.g. `PostChainTx -> Tx AlonzoEra`
- Start working on the "chain tx" constructors, created `Hydra.Chain.Direct.Tx` as separate module as it provides for a good "seam" for unit tests
- Which Tx type to use? There are at least:
  - `GenTx (CardanoBlock StandardCrypto)` from ouroboros-consensus?
  - `ValidatedTx (AlonzoEra StandardCrypto)` from cardano-ledger-specs?
  - `Tx AlonzoEra` from cardano-api?
- Discussion with KtorZ on whether we'll have a `Tx` or a `TxBody` and "who signs transactions"?
  - Conclusion: Signing / keeping keys at the client would be morally the right thing, but we want to do it the same as we expected the PAB to work, i.e. "the system" has the keys and does sign txs / spend
  - The smart constructors though, should be producing `TxBody` and the `withDirectChain` would have access to some `SigningKey`
- Started with using the `cardano-api` types as they were easy to handle in the e2e test
- All of these functions will be something like `.. TxIn -> Either TxBodyError (TxBody AlonzoEra)`
- For `initTx` we can use a single `TxIn` for paying fees & as the parameter of minting thread token & participation tokens
- Problem: No Arbitrary instance for cardano-api's `TxIn`
- Shift to using `cardano-ledger-specs` as it has `Arbitrary TxIn` in one of their test packages, BUT
  - it's easier to shoot yourself in the foot with this API, i.e. nothing prevents you from creating a TX without inputs
  - not sure if this is good!?
- Was pointed at hedgehog `Gen TxIn` for `cardano-api` -> switch back
- Down a rabbit hole how to run a `hedgehog` `Gen` in a `QuickCheck` `Gen`
- Was pointed at `hedgehog-quickcheck`, where I missed the `hedgehog -> QuickCheck` direction by being blind
- Back in the flow of creating tests against `initTx :: ... -> Either TxBodyError TxBody` -> success with a generated `TxIn`
- Adding TxOut parameter to calculate and add a change output
  - Fails if fees /= 0 because there might not be enough in the generated txOut
  - create a `==>` implication to ensure enough lovelace
  - hedgehog `Gen TxIn` seems not to be `scale`-able from QuickCheck
- Seeing the complexity of `makeTransactionBodyAutoBalance` had me pivot to change `initTx` to create `TxBodyContent` instead and have the balancing, change and fee calculation be done by this function
- Presentation of early benchmark results
- Evolution of plots and performance over time
- Effect of various dimensions (number of nodes, generator type)
- We have 2 generators currently, a "standard" one which uses ledger's genTx producing "large" transactions and a growing UTXO set, and a "constant" one that produces micro-payment-like transactions (one input, one output) and keeps the UTXO set constant
- Discussion:
- Suggestion: Do a heap profile of each node, does not require recompilation with profiling enabled
- "Fractal benchmark": SN can't reproduce behaviour of simple txs over ouroboros network, so perhaps an artifact of a single run on a specific machine?
- Could be good to run on bare metal to get some real benchmarking, but virtualised environments are somewhat consistent with probable deployment model
- We should take some time with Marcin to check we are not doing something stupid in the protocol layer
- Question: Why does the performance seem to degrade over time? => we don't know (yet), requires more investigation
- Extract some info from RTS from the nodes and use embedded Prometheus server to get data at regular interval
- Discussion about PAB vs. direct interaction with the chain:
- Review with Marcin how to integrate with NodeClient protocol for the purpose of testing direct submission of txs to the chain
- Also have a look at how Plutus implements Mock chain server
- We want to get something running sooner, it's a tactical decision to unblock us in the short term
- PAB has taken a while to build because it's actually complex, and deals with a lot of intrinsic complexities of the chain (and node was also a moving target...)
- Once PAB is ready we will have an easier time because dev will be user-driven
- Even if we get it running now with the "direct" approach, it does not mean we won't be using the PAB (as-a-library) for production later
- Finish reading ACE Paper
I would like to count the UTXO set size after each transaction, then correlate that with the transaction confirmation time (or ReqTx time?) to check if this shows any correlation with the latency increase.
I now have a list of txId/utxo size pairs, now need to correlate that with the time the transaction was confirmed.
Need a table of tx ids with their confirmation time from the SnapshotConfirmed message
Struggled a bit with jq syntax to extract the timestamp for each transaction in the confirmedTransactions field of a snapshot confirmed message, but it turns out to be pretty simple:
$ cat log.1 | grep ProcessedEffect | grep SnapshotConfirmed | jq -cr '{ts:.timestamp, txs: .message.effect.serverOutput.snapshot.confirmedTransactions[]}' | jq -cr '[.txs, .ts] | @csv'
The confirmedTransactions array is actually "flattened" and one record is generated per element.
Trying to map UTXO over time, I messed up the logic: I need to apply the list of transactions in the order of their confirmation to keep track of the growth of the UTXO set over time; instead I simply concatenated the list of transactions in the dataset, which does not make sense. The confirmation time for each tx is good however, I need to sort the file on this.
Wrote a small Haskell program to extract the information I need from the dataset.json file and the confirmed-txs.json, namely the size of UTXO per transaction confirmed ordered by time.
- Then I can produce a time series for the size of UTXO set with:
cat utxo-size.json | jq -cr '.[] | @csv' > utxo-size.csv
join -t ',' utxo-size.csv confirmed-txs.csv | cut -d ',' -f 2- | tr -d \" | awk -F ',' '{ print $2, $1 }' | sort > utxo-time
Now trying to run a benchmark with a different transaction set, namely one where the UTXO set does not grow over time.
- Struggling to get the API right to generate a sequence of transactions that cycle among a UTXO set.
- In the `Cardano` module we use `Crypto.Ledger.Keys` but this is not in the `cardano-api` which has different types; ended up exposing a function to extract the verification key from a keypair as I need the former to generate UTXO, and the latter to generate addresses in transactions.
- Trying to leverage what @KtorZ did for the TUI, generating keys, addresses and specific UTXO set, but working with the API to eg. select a random UTXO from a list of UTXOs is very annoying
- Trying another approach: Start with a single UTXO, a single key pair, and then consume the UTXO producing the same UTXOs, sending to a new key pair. It seems to work but I still have errors when running the property test, seems like the values are not correctly conserved...
- Turns out the error was that I was trying to apply the transactions in the wrong order, using `foldr` instead of `foldl'`
- Now I get another error with some UTXO being too large, trying to trim that down to some manageable size.
Finally managed to run a benchmark with a constant UTXO set size:
Writing results to: test-bench-constant/results.csv
Confirmed txs: 18000
Average confirmation time: 6.724038065066666e-2
Confirmed below 1 sec: 100.0%
What's missing for demo?
- Sending native assets
- Showing the 3 clients opened in the head
- Explaining use of test faucet + use of addresses representing each party
- 2 lists of things in the Head: Connected nodes + hydra keys in hydra heads
- use address instead of peer to send UTXO to? + alias
- We send only a part of the UTXO
- Pb: There are 4 addresses instead of 3
- Show list of head participants using Hydra public keys w/ colors
- Showing the final UTXO set after closing
- Identifying parties, passing key pair to the exec for the head
- Initial addresses for the head are generated from the port only (which are all the same, which is weird...)
- Picked up on the "aliased Party" branch and implemented an optionally aliased `Party` type
  - `Ord` instance problems + tests
  - `Show` instance which prefixes `alias@` and uses hex encoding on the verification key
  - `ToJSON` dropping `alias` when `Nothing`
- Have the node alias `me` and `otherParties` when loading keys from file
  - Heuristic to only alias when files start with a letter
  - Leaves `Party` in end-to-end tests unaliased (otherwise we would need to carry around the names in our instrumentation)
- Show the list of `Party` using the new format in the TUI now
- Noticed that the TUI was refactored into `data State = State { ..., clientState :: ClientState}` with `data ClientState = Connected | Disconnected`, which is exactly what I avoided initially -> need to discuss this
- How to get the `me :: Party` on the client side?
  - Thought about different ways to do this:
    - Simply add a `me :: Party` to `ReadyToCommit {parties :: Set Party, me :: Party}` -> this provides us with enough info at the right point in time, which we can keep around while the head is open
    - Add a new ServerOutput `NodeInfo { me :: Party }` .. maybe also including a `version :: Version` etc.
      - This could be sent as some kind of "greeting" to each connected client as the first output in history
      - A corresponding `GetNodeInfo` input which fetches this information
  - Initially thought of option one being easiest, but changing `ReadyToCommit` to be node-specific is a PITA for tests
  - Option 2 with the latched greeting was easy to achieve and has a taste of old-school protocols
  - API server tests are obviously failing, could fix most.. but one was puzzling me and no time to fix now (sorry)
- Finally, the client-side can now store the public key of the connected node and we can generate Utxos and addresses from that, instead of a port or other info
  - This essentially means we would want to `Party -> CardanoKeyPair` and `Party -> Utxo`
  - Moved the "fauceting" and "credential conversion" into `Hydra.Ledger.Cardano` as it now requires deconstructing the `Party`'s `VerKeyMockDSIGN` to get our hands on a suitable seed for crafting cardano credentials -> `hydra-node` seemed to be a better place for this than the TUI
- Handling the `Greeting` was trivial, but the fact that `me :: Maybe Party` is messing up quite a lot of the code
- Some final more polishing / experiments in highlighting "us" and "own addresses"
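A rough sketch of an optionally aliased `Party` along the lines described above. The field names, the use of an `Integer` as a stand-in verification key, and the choice to ignore the alias in `Eq` and `Ord` are all assumptions for this example; the actual instances may differ.

```haskell
import Data.Function (on)
import Numeric (showHex)

-- | Hypothetical party: an optional alias plus a (mock) verification key,
-- assumed to be non-negative.
data Party = Party {alias :: Maybe String, vkey :: Integer}

-- Assumption: the alias is purely cosmetic, so identity and ordering only
-- depend on the verification key.
instance Eq Party where
  (==) = (==) `on` vkey

instance Ord Party where
  compare = compare `on` vkey

-- | Prefix with @alias\@@ when present, then the key in hex.
instance Show Party where
  show (Party malias vk) = maybe "" (<> "@") malias <> showHex vk ""

main :: IO ()
main = do
  print (Party (Just "alice") 255)
  print (Party Nothing 16)
  print (Party (Just "alice") 255 == Party Nothing 255)
```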
Extracting size of UTXO set from reduced log:
$ cat log.1 | grep HeadIsFinalized | jq '.message.effect.serverOutput.utxo | keys' | wc -l
Extracting length of processed snapshots:
cat log.1 | grep ReqSn | grep NetworkEvent | grep ProcessedEvent | jq -r '[.timestamp, (.message.event.message.transactions| length)] | @csv' | tr -d \" > snapshot-length.csv
Extracting timings for ReqSn start and stop:
$ cat log.1 | grep ReqSn | grep NetworkEvent | grep 'ProcessingEvent' | jq -r '[.message.event.message.snapshotNumber, .timestamp] | @csv' | tr -d \" > snapshot-processing.csv
$ cat log.1 | grep ReqSn | grep NetworkEvent | grep 'ProcessedEvent' | jq -r '[.message.event.message.snapshotNumber, .timestamp] | @csv' | tr -d \" > snapshot-processed.csv
$ join -t ',' snapshot-processing.csv snapshot-processed.csv > snapshot-time.csv
The idea is to compute the time span between the first processing event for an ack and the last processed:
$ for i in {1..6}; do cat log.1 | grep AckSn | grep NetworkEvent | grep ProcessingEvent | jq -r "select (.message.event.message.party == $i) | [.message.event.message.snapshotNumber, .timestamp] | @csv" | tr -d \" > processing-ack-$i.csv ; done
$ for i in {1..6}; do cat log.1 | grep AckSn | grep NetworkEvent | grep ProcessedEvent | jq -r "select (.message.event.message.party == $i) | [.message.event.message.snapshotNumber, .timestamp] | @csv" | tr -d \" > processed-ack-$i.csv ; done
join -t ',' processing-ack-1.csv processing-ack-2.csv | join -t ',' - processing-ack-3.csv
What I want is the ack-sn.csv file that looks like:
2021-09-10T16:33:28.664Z,118
2021-09-10T16:33:28.912Z,194
2021-09-10T16:33:29.141Z,136
2021-09-10T16:33:29.347Z,108
2021-09-10T16:33:29.598Z,149
2021-09-10T16:33:29.816Z,59
2021-09-10T16:33:30.066Z,145
2021-09-10T16:33:30.424Z,349
2021-09-10T16:33:30.629Z,186
2021-09-10T16:33:30.867Z,130
Producing it from the logs was somewhat painful though:
- Aggregate timings for `AckSn` events processed for each node, grouped by `ProcessingEvent` and `ProcessedEvent`, in a JSON array,
- Load the array (in my case using `nodejs`) and extract the minimum of start time and the maximum of stop time,
- Compute the difference between the two and produce a file with x value being the end time, and y value being the total latency for acking a snapshot.
Joining all AckSn timings:
join -t ',' processed-ack-1.csv processed-ack-2.csv | join -t ',' - processed-ack-3.csv | join -t ',' - processed-ack-4.csv | join -t ',' - processed-ack-5.csv| join -t ',' - processed-ack-6.csv > processed-ack-all.csv
join -t ',' processing-ack-1.csv processing-ack-2.csv | join -t ',' - processing-ack-3.csv | join -t ',' - processing-ack-4.csv | join -t ',' - processing-ack-5.csv| join -t ',' - processing-ack-6.csv > processing-ack-all.csv
join -t ',' processing-ack-all.csv processed-ack-all.csv > ack-all.csv
Then I manually transformed this file back to JSON in Emacs :( before processing its content in node. Here is the JS script to produce a JSON that contains the above data:
const fs = require('fs');
const ack = fs.readFileSync('ack-sn.json');
const d = JSON.parse(ack);
const ts = d.map(arr => [arr[0]].concat(arr.slice(1).map(d => new Date(d).getTime())));
// Columns 1..6 hold the ProcessingEvent timestamps, columns 7..12 the ProcessedEvent ones (6 nodes).
const minmax = ts.map(arr => [arr[0], Math.min(...arr.slice(1, 7)), Math.max(...arr.slice(7))]);
const sntime = minmax.map(arr => [new Date(arr[2]),arr[2] - arr[1]]);
fs.writeFileSync('ack-sn.json',JSON.stringify(sntime));
Then producing a CSV for plotting with gnuplot amounts to:
cat ack-sn.json | jq -rc '.[] | @csv' | tr -d \" > ack-sn.csv
gnuplot is a bit quirky to work with if the data is not in the right format; doing computations and transformations on data is awkward, e.g. computing the moving average for receiving all AckSn in the following script:
set xdata time
set format x "%H:%M:%S"
set xtics out rotate
set title 'Snapshot acknowledgement time (ms) - 1389 snapshots'
samples(x) = $0 > 9 ? 10 : ($0+1)
avg10(x) = (shift10(x), (back1+back2+back3+back4+back5+back6+back7+back8+back9+back10)/samples($0))
shift10(x) = (back10 = back9, back9 = back8, back8 = back7, back7 = back6, back6 = back5, back5 = back4, back4 = back3, back3 = back2, back2 = back1, back1 = x)
init(x) = (back1 = back2 = back3 = back4 = back5 = back6 = back7 = back8 = back9 = back10 = sum = 0)
plot sum =init(0),\
'ack-sn.csv' u (timecolumn(1,"%Y-%m-%dT%H:%M:%SZ")):($2) w l t 'AckSn Processing time (ms)', \
'ack-sn.csv' u (timecolumn(1,"%Y-%m-%dT%H:%M:%SZ")):(avg10($2)) w l t 'Moving average (10 points)', \
'ack-sn.csv' u (timecolumn(1,"%Y-%m-%dT%H:%M:%SZ")):(sum = sum + $2, sum/($0+1)) w l t 'Cumulative mean'
Plotting moving average with gnuplot is clunky: http://skuld.bmsc.washington.edu/~merritt/gnuplot/canvas_demos/running_avg.html
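For comparison, the same two series (moving average over the last 10 points and cumulative mean) are short to express outside of gnuplot. A sketch in Haskell, where `movingAverage` and `cumulativeMean` are names chosen for this example; like the gnuplot version, the first points are averaged over fewer samples.

```haskell
-- | Average over the last @n@ values seen so far (fewer at the start).
-- Assumes @n > 0@.
movingAverage :: Int -> [Double] -> [Double]
movingAverage n xs =
  [ sum window / fromIntegral (length window)
  | k <- [1 .. length xs]
  , let window = drop (max 0 (k - n)) (take k xs)
  ]

-- | Mean of all values seen so far.
cumulativeMean :: [Double] -> [Double]
cumulativeMean xs = zipWith (/) (scanl1 (+) xs) (map fromIntegral [1 :: Int ..])

main :: IO ()
main = do
  -- First ten latencies (ms) from the ack-sn.csv sample shown earlier.
  let latencies = [118, 194, 136, 108, 149, 59, 145, 349, 186, 130]
  print (movingAverage 10 latencies)
  print (cumulativeMean latencies)
```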
What I would like to do tomorrow is to map the UTXO set over time, checking if there's a correlation between the UTXO set and the time it takes to produce snapshots after a while. This does not seem to be the case as we can see the ReqSn processing time does not significantly change over time. Note this could also be tested experimentally by running a benchmark with a synthetic transaction sequence that does not increase the UTXO size, something like ping-pong style transactions which keep sending the same amount back and forth.
Plan for today:
- MB: Got simulation work to do for researchers, fixing "bug" in UTXO display and polishing TUI
- AB: Add more concurrency and parallelism to benchmark (run n clients per node, run more than 3 nodes)
Going to work on having multiple clients per node first, so that we can see the effect of submitting parallel non conflicting transactions to same node
- Also need to find a way to make the `contestationPeriod` dynamic because otherwise the test can fail as it times out waiting for finalisation
- There is a single registry for the confirmation time of all transactions on all sequences, but this does not really make sense and increases contention as each thread competes for the same piece of data -> split registry among all threads and combine them at end of run
Adding the ability to increase concurrency above number of nodes, eg. have more than one client per node
- Solution is rather simple: Just extract the `startConnect` function to the toplevel, it's the one responsible for opening the connection to the hydra node.
Introduce withCluster function that "folds" withHydraNode over an arbitrary non-zero expected number of nodes
- Got an error at startup so it seems there's some 3 hardcoded somewhere...
- I can see the 4 hydra nodes starting up but the test fails on an expectation:

      waitFor... timeout!
      nodeId: 4
      expected: {"parties":[1,2,3,4],"tag":"ReadyToCommit"}
      seen messages: {"parties":[4,1,2,3],"tag":"ReadyToCommit"}

- Turning the list of parties into a `Set` does the trick -> ordering is guaranteed as `Set` needs `Ord` instances and maintains a deterministic ordering of nodes
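The failure and the fix can be reproduced in isolation. In this sketch `Party` and `ServerOutput` are reduced to toy types (hypothetical, only `ReadyToCommit` is modelled); the comparison shows why the list-based expectation was order-sensitive while the `Set`-based one is not.

```haskell
import qualified Data.Set as Set

newtype Party = Party Int
  deriving (Eq, Ord, Show)

-- | Toy version of the server output carrying a set of parties.
newtype ServerOutput = ReadyToCommit (Set.Set Party)
  deriving (Eq, Show)

main :: IO ()
main = do
  let expected = map Party [1, 2, 3, 4]
      seen = map Party [4, 1, 2, 3]
  -- Comparing lists depends on the order in which parties were announced.
  print (expected == seen)
  -- Comparing sets does not.
  print (ReadyToCommit (Set.fromList expected) == ReadyToCommit (Set.fromList seen))
```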
Running a benchmark with 6 clients, 4 nodes gives me:
Confirmed txs: 481
Average confirmation time: 0.4695817474137214
Confirmed below 1 sec: 100.0%
With 3 nodes:
Confirmed txs: 571
Average confirmation time: 0.4129197640157618
Confirmed below 1 sec: 100.0%
with 2 nodes:
Confirmed txs: 533
Average confirmation time: 0.3144588079249531
Confirmed below 1 sec: 100.0%
Spent some time fixing a mistake I pushed to master: Changed the type of parties committing from [Party] to Set Party and this had ripples I did not notice
Trying to run a 10-node simulation => waitFor at startup times out, need to increase the timeout
Trying with a smaller number of nodes, say 6
- Got a 6-node benchmark running, I guess startup timeout should be increased to something like 20 seconds per node
- Benchmark still running after 1.5 hours, it's really hard to say how much has been done -> need better reporting than dumping transaction ids and snapshots
Trying to run a benchmark with more "reasonable" values: 10 concurrent clients, 6 nodes, scaling factor of 50, and added some progress report:
Client 5 (node 2): 17/249 (6.83)
Client 1 (node 6): 13/1366 (0.95)
Client 3 (node 4): 15/1112 (1.35)
Client 4 (node 3): 14/642 (2.18)
Client 2 (node 5): 14/901 (1.55)
Client 6 (node 1): 17/1081 (1.57)
Client 7 (node 6): 13/361 (3.60)
Client 8 (node 5): 14/154 (9.09)
Client 9 (node 4): 15/571 (2.63)
Client 10 (node 3): 14/1452 (0.96)
Also set the number of transactions to be the same for all clients, otherwise we might get artifacts in the numbers we extract from the run if some clients stop sending transactions before others.
Benchmark TODOs:
- Run the benchmark over some period of time to get steady state behaviour => trim down the logs?
- Add validation time and throughput
- Add metrics internal to nodes (event queue size, confirmation time)
- Increase the number of nodes
- Spread the load on different clients
- We have the worst case => increase the parallelism of transactions generated
TUI TODOs:
- commit from faucet
- new transaction
- UTXO set visualisation
Note: We can easily DoS nodes spamming them with invalid transactions, as demonstrated by performance drop when submitting a lot of invalid transactions. We need some rate limiting and/or caching of validation to reduce the load on the nodes
Enhancing plot:
- Looking at https://torbiak.com/post/histogram_gnuplot_vs_matplotlib/#gnuplot to plot throughput of txs in benchmark
- Trying to plot a histogram of transactions throughput, this link is actually simpler and works: http://www.gnuplotting.org/calculating-histograms/
- Managed to plot confirmation time and throughput on the same graph. Obviously confirmation time follows an inverse trend with throughput which follows from Little's Law
Now adding more parallelism in the data set so that we can observe how the nodes behave with a non-conflicting transactions set
- First step is generating `n` non-conflicting transaction sequences that will each be handled by one client thread sending to one node
- Got caught in a rabbit hole transforming the way we store and pass data to `EndToEnd` in order to introduce potential concurrency. Added a parameter to the benchmark and created a `Dataset` type to map a `Utxo` set and the sequence of transactions generated for it, so that we can have multiple sequences and distribute them among many clients
- Implemented parallel submission and confirmation of transactions, with one thread per generated dataset and the threads distributed over the various clients available
The run deadlocks pretty quickly and `TxInvalid` messages show up which shouldn't be the case: It seems the first transaction submitted in the second thread is not the first transaction from dataset?
- Of course: There is a single submission queue for all the threads -> removing it from the registry and creating it in each thread should work better. Each pair of submitter/confirmer now has its own queue but I am still running into a deadlock with concurrency > 2
- Trying to unblock the submitter when the transaction is invalid, so this works in the sense that the process does not deadlock but transactions stay invalid "forever"
- I have got the explanation: All the UTXO sets generated have the same TxId as reference, so the transaction consumes the wrong txout and everything goes awry => Need to make sure the UTXO sets are completely different...
  - Updated `genUtxo` function to use an arbitrary genesis TxId which hopefully should be fine?
- When trying to increase concurrency level over 3, eg. one client per node, I am running into troubles of course because all clients for the same node share a connection to the node, which does not work! => Need to make sure each submitter has its own connection, which might be slightly annoying given the way we structured HydraNode
All nodes are now busy, each with its own dedicated client:
28553 curry 20 0 1024.8g 95716 23184 S 101.0 0.6 0:56.91 hydra-node
28585 curry 20 0 1024.8g 85788 23120 S 98.7 0.5 0:55.95 hydra-node
28574 curry 20 0 1024.8g 92324 23104 S 95.0 0.6 0:57.10 hydra-node
Tip: Counting the number of transactions in a dataset:
cat bench-parallel-test-2/dataset.json | jq -c .[].transactionsSequence[].id | wc -l
Just realised it's not possible to run 2 benchmarks in parallel: I killed a long running one because I was oblivious to this :( Now adding validation time to the plot so that we can check how this evolves over time. I suspect validation time accounts for a significant fraction of the time spent processing a transaction as the size of the UTXO set grows.
It's slightly annoying, but due to the way the reader is coded for DiffTime one has to add an s suffix to the number of seconds for the timeout:
$ cabal bench local-cluster --benchmark-options '--scaling-factor 20 --concurrency 3 --output-directory bench-test --timeout 1000s'
Trying to run the benchmark with nullTracer just to make sure the output of the logs is not impacting the performance of the nodes significantly: I suspect JSON (de)serialisation might contribute significantly to bad performance of nodes
Final plot of the day, with a null tracer:

Goal for today:
- Write the snapshot decider "by the book", eg. independently from the rest of the protocol and as described by the paper
- Remove the current `ReqSn` production and wire the `newSn`
Removing SnapshotStrategy which is really not used
Tried to wire the new snapshot decision logic into the Node, simply enhancing the effects with a ReqSn when we decide we should snapshot:
- Lot of tests are failing/hanging now -> Trying to debug the tests one by one as we have a lot of them around
- Also removed `RequestedSnapshot` from the ADT
- Removed all snapshot emission tests from `HeadLogicSpec` -> Those are now covered by `SnapshotStrategySpec`
NodeSpec tests are failing with rather obvious reasons => We don't want to emit a ReqSn if seenTxs is empty while we ShouldSnapshot
Problem now is that we emit a ReqSn upon every ReqTx so one NodeSpec test fails because we add more ReqSn than expected
- `emitSnapshot` now changes the state so that we don't emit multiple snapshots while processing a batch of events
- We also make `newSn` work on `CoordinatedState` directly instead of `HeadState` to remove some cases
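A sketch of what a standalone snapshot decision function could look like, combining the conditions mentioned in these notes (we are the leader, no snapshot is in flight, there is something to snapshot). The record fields, the `Decision` type and the round-robin leader rule are assumptions for illustration; they are not taken from the actual `newSn` implementation.

```haskell
data SnapshotStatus = NoSnapshotInFlight | SnapshotInFlight
  deriving (Eq, Show)

-- | Hypothetical, heavily reduced view of the coordinated head state.
data CoordinatedState = CoordinatedState
  { parties :: [Int]
  , me :: Int
  , confirmedSnapshot :: Int
  , snapshotStatus :: SnapshotStatus
  , seenTxs :: [String]
  }

data Reason = NotLeader | SnapshotInProgress | NoTransactionsToSnapshot
  deriving (Eq, Show)

data Decision = ShouldSnapshot Int [String] | ShouldNotSnapshot Reason
  deriving (Eq, Show)

-- | Assumed round-robin leader election over the list of parties.
isLeader :: [Int] -> Int -> Int -> Bool
isLeader ps p sn = not (null ps) && ps !! ((sn - 1) `mod` length ps) == p

-- | Decide, independently of event processing, whether to request a snapshot.
newSn :: CoordinatedState -> Decision
newSn st
  | not (isLeader (parties st) (me st) next) = ShouldNotSnapshot NotLeader
  | snapshotStatus st == SnapshotInFlight = ShouldNotSnapshot SnapshotInProgress
  | null (seenTxs st) = ShouldNotSnapshot NoTransactionsToSnapshot
  | otherwise = ShouldSnapshot next (seenTxs st)
 where
  next = confirmedSnapshot st + 1

main :: IO ()
main = do
  let st = CoordinatedState [1, 2, 3] 2 1 NoSnapshotInFlight ["tx1"]
  print (newSn st)
  print (newSn st{seenTxs = []})
  print (newSn st{snapshotStatus = SnapshotInFlight})
```

Keeping the decision pure makes it straightforward to test separately from the rest of the head logic.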
Got all tests to pass but benchmarks are still livelocking so something's still wrong in the snapshotting/confirmation logic:
- AB going to finish log-filter program to be able to analyse logs more easily
- MB to troubleshoot issues with bench
Back working on log-filter process, got an issue with transforming the list of confirmedTransactions: what I get is not a list of ids but a single id, so I assume my traversal is not working
Interesting question is: How to test this log filter and ensure it stays in sync with the structure of the log entries?
- MB suggests to generate random logs, and check compression achieved by `log-filter` against some expected threshold
- Wrote unit tests to assert LogFilter properly transforms an Array of transactions into an Array of TxIds. Wasted 10 minutes troubleshooting the test which failed for the wrong reason because a keyword used was invalid => test is useful!
Impact of log-filter, before:
-rw-rw-r-- 1 curry curry 184567326 Sep 8 10:12 /run/user/1001/bench-67fb0dac6531c6bf/1
after:
-rw-rw-r-- 1 curry curry 5595672 Sep 8 13:40 filtered
Trying to extract a test case from the failed benchmark run of yesterday
Started writing a "log-filter" program whose purpose is to filter and trim down the logs, removing unessential details at this stage in order to better understand the flow of transactions and messages. It currently replaces the full tx with its txid and removes the long map of UTXO from a few messages; there is still some work to do to make it usable and remove most of the noise from the logs. Note that it also removes all log entries which are not from the Node (eg. ouroboros network messages). I have used lens-aeson but am unsure if I am really harvesting the full power of the library and lenses in general, but it seems to work fine so far.
Discussing issues about snapshots with MB
- Our current approach seems to be flawed: It's the 3rd time we are having issues with snapshots while we are trying to produce them synchronously with other events
- In the original formulation of the paper, the decision to create a new snapshot is independent from processing of txs and snapshots, and has been translated in hydra-sim as a `SnapshotStrategy` that drives a snapshot thread which injects the `ReqSn`
Plans for tomorrow:
- Write a better tester for our protocol, possibly using some kind of events generator. One approach would be to consider individual transactions "validation journey" and then compose the needed events/messages to produce an arbitrary interleaving and test that
- Another solution would be to just generate randomly possible messages coming from the network, eg. consider the network abstractly without taking into account the other nodes and just observe messages coming and act on them. Then we could generate sequence of messages from the point of view of the network to try to trigger unexpected behaviour and interleaving (if it's sound unclear it's because it is...)
- Remove the snapshotting logic from the `HeadLogic`'s function `update` and write a `Snapshotter` independently, with proper tests and specified behaviour, then plug it in the node
Recreating Dev VM, trying to see if I could get something faster to improve turnaround time for compilation, but seems like C2 machines are the fastest available CPU wise
I have managed to get the autotest.sh script to work again, thus will see over the next weeks/months whether this has an impact on the cabal time.
I expect it should because I should be able to run the tests using autotest.sh rather than cabal. Ideally I would need a small wrapper program to tally timings between compilation cycles within ghcid, which means I should probably look into ghcid's source code
When compiling master, I am having problems with missing git reference from dependencies, which are addressed by SN's PR:
$ cabal build all & cabal test all
[1] 5621
fatal: reference is not a tree: 09433fe537a4ab57df19e70be309c48d832f6576
fatal: reference is not a tree: 09433fe537a4ab57df19e70be309c48d832f6576
Working on merging this PR, then trying to fix compiler errors stemming from upgrade in dependencies.
ContractTest is failing with
Contract instance stopped with error: ConstraintResolutionError (DatumNotFound ca54c8836c475a77c6914b4fd598080acadb0f0067778773484d2c12ae7dc756)
src/Plutus/Contract/Test.hs:241:
Init > Commit > CollectCom: CollectCom is not allowed when not all parties have committed
The collectCom test is expected to fail but it seems this error is not expected? Actually it fails in the commit call.
Rebasing update deps PR on master before pushing, contract tests are now pending but we should revisit Plutus anyway so 🤷
Deciding to work on bench as MB has not touched it a lot and it's broken. Writing a function to submit transactions in parallel with confirmation
- When we get a TxInvalid we want to wait for the next snapshot confirmed to resubmit the transaction
Turning the Registry into a record that contains a queue of txs to resubmit
Slight problem for pairing: AB has not the missing reference from master anymore so cannot work on it, but MB has not built the update-deps branch so has no dependencies and the build is not finished yet so we don't have a cache available -> we don't rotate for the moment and wait for PR to finish building
MB got troubles compiling after merge of update deps PR: Problems with happy dependency from plutus-core -> program could not be found ??!!
- There were 2 different installations of happy and alex, removing one of them fixed the issue
We now put transactions in a queue and then repush them when they are invalid, then we resubmit a transaction only if the snapshot number has changed
- Our transaction submitter gets stuck reenqueueing transactions and never stops, which hogs the process at 100% CPU
- Trying to simplify code by extracting a "decision function" running in STM that returns an Outcome which says what to do with the function
- Got a successful run with 5 snapshots and 10 txs so I suspect there is a race condition
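The "decision function" idea above can be sketched as a pure function, which is easy to test before wrapping it in STM. This is only an illustration under stated assumptions: the Outcome constructors, the Pending record and decide are hypothetical names, not the ones used in the benchmark code.

```haskell
-- Hypothetical sketch: all names here are made up for illustration.

-- | What the submitter should do with a transaction that came back invalid.
data Outcome
  = ResubmitNow
  | WaitForSnapshot
  deriving (Eq, Show)

-- | A queued transaction and the snapshot number seen at its last submission.
data Pending tx = Pending
  { pendingTx :: tx
  , submittedAt :: Int
  }
  deriving (Eq, Show)

-- | Resubmit only once a newer snapshot has been confirmed; otherwise wait
-- instead of re-enqueueing in a tight loop.
decide :: Int -> Pending tx -> Outcome
decide latestSnapshot p
  | latestSnapshot > submittedAt p = ResubmitNow
  | otherwise = WaitForSnapshot

main :: IO ()
main = do
  print (decide 2 (Pending "tx1" 2))
  print (decide 3 (Pending "tx1" 2))
```

Returning WaitForSnapshot gives the caller a place to block (for example with an STM retry) rather than spinning, which is the 100% CPU symptom described above.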
I think the snapshot process "loses" transactions:
- A tx received through a ReqTx should be snapshotted by node 2 which will be the leader, but there is still a snapshot in progress so no new snapshot is started
- The snapshot is emitted, but then no new tx can be submitted because of the "blocked tx" which is not confirmed -> All subsequent txs that depend on it will fail to submit
- As no new tx can be submitted, no new ReqTx is produced which prevents the production of a new snapshot
This is exactly the problem for which we introduced DoSnapshot initially: We cannot link the production of a new snapshot to ReqTx because if we don't produce it immediately, and there is no more ReqTx to trigger the check, the transaction gets "lost" and never confirmed.
Originally not planned to do this today, but I realized that our master cannot be built from scratch in a fresh working copy because we seem to be referring a now gone cardano-ledger-specs commit (we took that one from the plutus we tracked so far).
So I set out to update the source-repository-package in cabal.project to match the most recent plutus master.
Using a nix-shell -A cabalOnly this was quite rapid to set up and then later materialize into a "proper" haskell.nix shell.
Three API changes seem to have happened:
- utxoAt -> utxosAt renamed and does return a ChainIndexTxOut now instead of a type UtxoMap = Map TxOutRef TxOutTx. Was not much of a problem to us, but I changed the code slightly to use more often TxOut instead of the Tx-referencing TxOutTx.
- ScriptLookups is now taking ChainIndexTxOut for slTxOutputs which can be created from txOut using fromTxOut :: TxOut -> Maybe ChainIndexTxOut and slOtherScripts is now taking ValidatorHash instead of addresses, this was also easy to change.
- nextTransactionsAt seems to have been deleted! This is more of a problem and was not yet solved. Was it replaced? None of the functions in Request.hs seem to be doing this.
Until the last API change has been fixed, these things can be found on this branch: https://github.com/input-output-hk/hydra-poc/tree/ch1bo/update-deps
Having a look at glean from FB, a tool to explore code bases as advertised on the web site
Looks like there are only indexers (e.g. parsers that generate predicates and facts from source code) for JS/Flow and Hack, the 2 "proprietary" languages owned by FB. Shouldn't be too hard to make some for Haskell code based on HLS provided tooling?
Discussing in details the Hydra demo and improvements to TUI we want to make
- Use UTxO and TX, do not try to abstract away the details of the ledger (eg. use values only)
- Show the TxRef + the address the values are sent to by the UTxO
- Detail: How do you know which output is owned by whom? Can we derive the pubkey of owner from the address used?
- when using Ledger generation machinery, keys are set as part of Constants, we could use the same key for everyone
- How do you create a transaction?
- Select from your unspent outputs
- Accumulate available values from selected outputs
- One dialog screen per step: Select inputs, create outputs, confirm
Planning and aligning on work to do in the next couple of weeks:
- Flesh out the TUI
- Adding metrics to the benchmark and making it more useful (run with more nodes and different configurations). Also gather (scrape) internal metrics from each node and output them as part of the benchmark's results
Using the innovation/learning budget to see how to use Plutus w/o PAB => Run a spike to craft txs by hand and use compiled validators
Updated Logbook
Worked on 2 PRs still in flight: Log API documentation and schema testing, and round-robin leader "election" in the protocol.
Design discussion about how the prop_specIsComplete property is defined and how to use it:
- It currently takes a SpecificationSelector which is really a lens selecting some part of the provided schema
- This lens should point to a schema fragment which is an array of objects having a title field, which we use to compare to the list of constructors extracted from arbitrary data and find discrepancies
- Unfortunately, the lens or some other kind of expression is needed, and not only the name of the field we are interested in because of differences in structures in the schemas
- We should document this test and property a bit more as they are not really obvious, and also help users by letting the common parts out of the provided lens (the part extracting a list of titles from a Value)
There was also a meta-discussion about whether or not it's ok for someone to add commits to someone else's PR to "fix" it. Seems like we agree this is all fine and good as long as the changes are motivated, but then one could ask: Why not simply add more commits on master directly, either pairing or ensembling, or discussing them at start of ensemble session?
Worked together on the PR to "complete" it as we agreed the test written in the HeadLogicSpec was not satisfying as it is: It would better fit as a NodeSpec test which is better suited to express the expected output of a Node given a sequence of events, without having to care about the details of the state.
Writing the NodeSpec test was not straightforward but led us to uncover an issue:
- If a node receives a ReqTx while a snapshot is being acknowledged, but before it's confirmed, and this node would be the next leader, then we should trigger a snapshot emission otherwise we run the risk of losing the transaction if no other tx is submitted => no snapshot is triggered until another transaction appears.
- We added a unit test in HeadLogic to ensure a leader emits a ReqSn when its turn comes
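The issue above boils down to when the "should I request a snapshot?" check gets evaluated. A minimal sketch, assuming a round-robin leader computed as snapshot number modulo number of parties (the actual formula in HeadLogic may differ) and using hypothetical names throughout:

```haskell
-- Hypothetical sketch of the leader check; names and the leader formula
-- are assumptions, not the HeadLogic implementation.

data SnapshotStatus = NoSnapshotInFlight | SnapshotInFlight
  deriving (Eq, Show)

-- | Index of the party leading a given snapshot number (round-robin).
leaderIndex :: Int -> Int -> Int
leaderIndex nParties snapshotNumber = snapshotNumber `mod` nParties

-- | A node requests the next snapshot when it leads it, nothing is in
-- flight, and it has seen transactions that are not yet snapshotted.
shouldRequestSnapshot :: Int -> Int -> Int -> SnapshotStatus -> [tx] -> Bool
shouldRequestSnapshot nParties me confirmedSn status seenTxs =
  leaderIndex nParties (confirmedSn + 1) == me
    && status == NoSnapshotInFlight
    && not (null seenTxs)

main :: IO ()
main = do
  -- ReqTx arrives while a snapshot is still being acknowledged: no ReqSn yet.
  print (shouldRequestSnapshot 3 1 0 SnapshotInFlight ["tx1"])
  -- Re-evaluating once the snapshot is confirmed lets the leader emit ReqSn.
  print (shouldRequestSnapshot 3 1 0 NoSnapshotInFlight ["tx1"])
```

If the check only runs on ReqTx, the second evaluation never happens when no further transaction arrives, which is how the transaction gets "lost".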
Discussing PRs in flight:
- https://github.com/input-output-hk/hydra-poc/pull/69:
  - Need to move Enveloppe type to Logging module as its use in tests makes api doc and code inconsistent
  - There's something fishy going on with the tests as they should not pass because we don't have the files in the data-files, plus there's a namespace which should be used
  - Implement ADTArbitrary as orphan instances in tests to make sure we cover constructors in aeson's roundtrip tests
- https://github.com/input-output-hk/hydra-poc/pull/70 => merging, using gnuplot is fine and simple enough
  - SN made some changes to make the script more portable => use /usr/bin/env to find bash executable
- https://github.com/input-output-hk/hydra-poc/pull/72
  - Discussion about the use of contestationDeadline in the OnCloseTx
  - Seems like we need the deadline anyway in various places, not only in the client
  - We store the "on-chain" transactions in the mock chain because we want to calculate time at posting time and not at consumption time
Trying to display the IP address of connected hosts in the TUI
- The Heartbeat answers a Party but we really need a Host. We can simply encode the Host in the Data constructor of the Heartbeat.
- Got into troubles with the APISpec saying Committed is wrong -> we had a comment saying it should be a Party but it was really a Peer
Demoing the TUI:
- Mock chain is confusing name as it's already used by Plutus -> Stub chain or Proxy chain
- Make it clear what the limits of the demo are, what's available or not (crypto primitives, main chain, contracts...)
- Would be great to have PAB with (actual) mock-chain as its release is due mid-Sep but seems unrealistic
Demoing the benchmark:
- Does not work out-of-the-box as we made some breaking changes
Puzzled by the behaviour of the APISpec and LoggingSpec, esp. how the namespace is used to check some properties
The classify function only works on a specific structure, namely one where we have the following tree from the root:
properties:
  <namespace>:
    type: array
    items:
      oneOf:
        - title: <property>
          ...
but the utxo and txs are defined as:
properties:
  utxo:
    type: array
    items:
      $ref: "#/definitions/Utxo"
  txs:
    type: array
    items:
      $ref: "#/definitions/Transaction"
and of course logs is not defined anywhere.
The intent of the property is pretty clear, namely to check the completeness of the specification against generated values, but it is very inflexible, tied to the precise structure of the api.yaml file and not suited for anything but having top-level properties with a specific structure
Rewriting the property to accept some arbitrary selector which makes it possible to adapt to specific structure of schema and tested data type.
- Current goal: be able to iterate the full life-cycle of a Head
- but keep commands static and only later make the client aware which is possible when.
- Committing some value in Hydra-TUI could act as some kind of "faucet", but I opted for simply committing mempty
  - Maybe we could have a brick dialog to ask users how much ADA (or other assets) they would like to commit?
- Adding command and server output handlers was really easy and quickly done. Although I refrained from rendering Utxo sets.
- When head was closed, client does not really know when the contestation period ends and this felt very unresponsive
  - The ServerOutput should provide a point in time when this (roughly) ends
  - The UI can then show a countdown or so
  - So when the HeadIsClosed should hold a contestationDeadline, the OnChainTx handling in Hydra.HeadLogic needs to know the current time.. or is given the deadline as well.
  - The latter seems to be easier as the chain client would also know best about "what time really means" on the respective blockchain
...And then add some tests for it
- Got failures in the APISpec tests, unsurprisingly. Seems like AuxiliaryData produces a None when not present which is unexpected?
- Do StrictMaybe fields whose value is SNothing generate None instead of null? Strict maybe's ToJSON instance is defined here: https://github.com/input-output-hk/cardano-base/blob/eb58eebc16ee898980c83bc325ab37a2c77b2414/strict-containers/src/Data/Maybe/Strict.hs#L91 and it's defined in terms of Maybe's instance which must produce a null if Nothing: https://github.com/haskell/aeson/blob/master/src/Data/Aeson/Types/ToJSON.hs#L1244
- Trying to generate a transaction and check manually the validity against api.json => Surprisingly, generated transaction is valid against schema. Trying to generate more but if I try to validate the generated tx from encode it works fine.
- Trying to save the input file to see if there's a discrepancy. The list of CardanoTx in the temporary directory is empty, seems like its content is not correctly updated upon shrinks perhaps?
- Saw https://github.com/Julian/jsonschema/issues/623: Json schema outputs the cryptic None: None is not of type 'string' when a string field has a null value, which is really not obvious from the output. Seems like a PR fixed it but unsure if we have it in our version
- => Found the solution to allow null values for auxiliaryData and auxiliaryDataHash:
  - For the hash, it's simply an enumeration of possible types:
    auxiliaryDataHash:
      type: [ "string", "null"]
      description: >-
        Hex-encoding of the hash of auxiliary data section of the transactions.
      examples:
        - "9b258583229a324c3021d036e83f3c1e69ca4a586a91fad0bc9e4ce79f7411e0"
  - For the data, I had to resort to use oneOf keyword to either have a Cbor value or null:
    auxiliaryData:
      description: >-
        Hex-encoding of CBOR encoding of auxiliary data attached to this
        transaction. Can be null if there's no auxiliary data
      oneOf:
        - type: "null"
        - $ref: "#/definitions/Cbor"
- Got bitten by the fact jsonschema is implemented in Python and actually relies on a mapping between the JSON schema specification and the Python type system. The value null is mapped to the value and type (?) None in Python leading to some cryptic error messages. I am mildly convinced by the model-first approach especially if the tooling trips us. Also, tests are somewhat intricate as we need to pass through a layer of transformation from YAML to JSON, then call an external process to validate a schema.
- While working on adding a validation test of log entries against JSON schema, I am hitting a snag: Importing both the Logging module and the Cardano module leads to conflicting JSON instances on UTxO, which has a JSON instance defined in Cardano.Api.Orphans.
Instead of custom types for string encoding, we should use media-types to represent various encoded pieces of data: https://json-schema.org/understanding-json-schema/reference/non_json_data.html
Start documenting in more details the structure of the Cardano transactions as exposed by Hydra node API.
- Got a bit puzzled by how to represent dynamic keys which are needed for assets' representation.
Playing with better formatting of errors in the benchmark, using hspec => We can use runSpec to run the bench, making it a Spec simply using it
Goals for today:
- Validate NewTx against confirmed snapshot and not submit ReqTx if it fails
- Let the client (benchmark) handle resubmission
We simply drop the transaction if it cannot be submitted by the benchmark => if it happens early then a lot of transactions will be dropped later
- We see the error message for the TxInvalid and the benchmark keeps running but we don't see any snapshot confirmed
- Node 1 sends an AckSn for its signature but it does not get processed. Seems like we don't process our own ReqSn??
- We need to improve our tooling for exploring the logs
Trying to modify wait so that we don't throw away messages => we can simply consume messages and dump them
- We still don't see snapshot confirmed messages
- Just happens we forgot to loop in the TxInvalid case 🤦
Benchmark succeeds but only 16 out of 526 transactions succeeded
- When having an InvalidTx we simply resubmit it. Resubmitting transaction immediately hits hard on the node, so trying to increase the delay between initial submission => much better, see a lot of snapshots
We managed to run benchmark to completion with all the transactions by delaying submission time => now plotting and interpreting the results
Week's progress:
- We are getting closer to a real ledger, no real crypto but a real ledger
- It's not about TPS but about latency => we need to plot distribution of latencies providing some kind of guarantees
- We also want to test with more nodes, seeing how the cluster behaves with more participants
- Load testing = saturate resources (CPU/Memory) and observe response time => need to be able to tune throughput to saturate the nodes
- Need to trim down the logs:
- remove some network logs which are very verbose ? => need to confirm if the network logs are actually a problem
- do not log full event/effect, log the end events/effects using ids
- Logs are written in a tmpfs now, we should parameterize it to be able to store more of them. tmpfs is limited in size. Later on, use some cloud storage or log ingestion system
There are infinitely many possibilities with the logs, what do we really need now?
- Confirmation of simulation?
- Is latency increasing when adding more nodes in an exponential/quadratic way?
- Keep a transaction set around that we can use as reference, rather than generating one on the fly every time. We need 2 different tools, we can have 2-3 different scenarios to benchmark
- Extension to load tester: make the number of nodes dynamic, submit transactions to multiple nodes instead of a single one
- We also want to check CPU/RAM load of each node to ensure they are saturated (also network bandwidth?)
- Do some cleanup work and make tests green again
- Debugging APISpec failures is somehow possible by temporarily adding more specs for sub-types and corresponding top-level properties to the schema, e.g. "utxo"
specify "Utxo" $ \(specs, tmp) ->
property $ prop_validateToJSON @(Utxo CardanoTx) specs "utxo" (tmp </> "Utxo")
and
utxo:
  type: array
  items:
    $ref: "#/definitions/Utxo"
  additionalItems: false
- I kept these ☝️ entries for Utxo CardanoTx and CardanoTx to differentiate test failures
- Realized that a HUnitFailure is not properly formatted in BehaviorSpec -> red bin
- Found the bug in NewTx for the failing BehaviorSpec:
case canApply ledger utxo tx of
  Valid -> [ClientEffect $ TxValid tx, NetworkEffect $ ReqTx party tx]
  Invalid err -> [ClientEffect $ TxInvalid{utxo = utxo, transaction = tx, validationError = err}]
We had been validating against the confirmed ledger, but not reporting it being invalid using the seenUTxo, so the expectation was wrong.
This now also requires the test to wait for a SnapshotConfirmed before re-submitting the secondTx.
And the benchmark should likely do the same.
- ToJSON should not contain empty objects, e.g. assets in a value
  - We did remove it for the assets, but there are others
  - Maybe tackle this when also documenting the API format for txs
- Is the benchmark really a load-test?
  - Using hspec / runSpec would also handle and render HUnitFailures properly
  - This turns out to be quite simple: runSpec (it "some context" action) defaultConfig >>= evaluateSummary (although this kills the process)
- Created a log capture template to easily capture entries like this
(setq org-capture-templates
'(("l" "Log" entry
(file+headline org-default-notes-file "Log")
"* %? %T\n%a"))
;; other templates
)We start writing a "missing" unit test for broacast-to-self in NodeSpec but we realise this is not possible as it's not directly observable -> just remove the comment about the implementation details and rely on indirect observations
We switch to complete serialisation of cardano transactions, working on adding minted values to the JSON format
- Added mint but it seems transactions are not generated with minted value in genTx for Mary => check what's going on to Alonzo
Running the benchmark we got errors in the validation of transactions. First error is about wrong script witness, then other errors about invalid key witnesses. Some transactions are valid, and we see 18 being processed as TxSeen, with 12 reported as TxValid
Error reporting in the benchmark is painful:
- we should stop as soon as we get a TxInvalid report
- We are missing some information when we get a validation error, namely the details of the transaction that failed and the UTxO set to which the transaction was applied -> add it to TxInvalid and then we can use it as regression tests when we get a failure
Adding unit test harvesting output from the Benchmark (need some love to be more usable), got the following failures:
ApplyTxError [UtxowFailure (MissingScriptWitnessesUTXOW (fromList [])),
UtxowFailure (UtxoFailure (ValueNotConservedUTxO (Value 55938162 (fromList [])) (Value 107981334 (fromList [])))),
UtxowFailure (UtxoFailure (BadInputsUTxO (fromList [TxInCompact (TxId {_unTxId = SafeHash \"d2635419a791eef0ba694bbcb66de7c7e76a865a493e7d2cc46f5c6b1ecb7b8d\"}) 3])))]")
MissingScriptWitnessesUTXOW shows an empty difference between needed and provided
Looking at how transactions are generated, we replace the property test on single transactions with one on a sequence of transactions -> The property fails, reusing the example to check why it fails
Trying to shrink the examples we have -> works for txs but not for utxos because the shrinking is not done in relationship with the UTXO
How can we generate a valid sequence of transactions that then fails to validate against the very same UTXO set used for generating the transactions? What is the shrinker for lists doing by default?
Answer: The reason applying several transactions at once vs. applying one by one fails is that we throw away the delegation pool state between each application when we generate them, we only keep the changed UTxO set. This is not the case in the applyTxsTransition which carries over both the UTxO set and the DPState, so transactions can now fail.
Not applying the transactions as a list but one by one works!
- Seems there's something wrong with the way we are applying the transactions to the ledger?
- => changing the interface of the Ledger to only apply one transaction at a time
- It's probable the failures we are seeing in the benchmark are caused by reordering of transactions?
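A ledger interface that applies one transaction at a time can still be lifted to a sequence with a monadic fold, and a toy model also shows why reordering breaks validation. This is a minimal sketch with a made-up ledger (integers as outputs, no delegation state); none of these types are the Cardano ledger's.

```haskell
import Control.Monad (foldM)
import Data.List ((\\))

-- Toy ledger, purely illustrative: outputs are identified by an Int.
type TxIn = Int
type UTxO = [TxIn]

data Tx = Tx {inputs :: [TxIn], outputs :: [TxIn]}
  deriving (Eq, Show)

newtype ValidationError = BadInputs [TxIn]
  deriving (Eq, Show)

-- | Apply a single transaction: all inputs must be unspent.
applyTx :: UTxO -> Tx -> Either ValidationError UTxO
applyTx utxo tx =
  case inputs tx \\ utxo of
    [] -> Right ((utxo \\ inputs tx) ++ outputs tx)
    missing -> Left (BadInputs missing)

-- | Apply transactions one by one, stopping at the first failure.
applyTxs :: UTxO -> [Tx] -> Either ValidationError UTxO
applyTxs = foldM applyTx

main :: IO ()
main = do
  let tx1 = Tx [1] [3]
      tx2 = Tx [3] [4]
  print (applyTxs [1, 2] [tx1, tx2]) -- in order
  print (applyTxs [1, 2] [tx2, tx1]) -- reordered: tx2 spends an output that does not exist yet
```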
We see more InvalidWitnessesUTXOW failure, with a list of public keys
- Issue probably comes from serialisation of keys (and possibly script) witnesses, investigating from a failure, using the Haskell show instance to compare how it's serialised to JSON and back.
Managed to have a WitVKey constructed as:
key :: CryptoFailable (WitVKey 'Witness StandardCrypto)
key = do
pubkey <- publicKey @ByteString "\150\f[\192l\179\v\136%\182%\137 \STX\215\229up\228$V\157?F\151i\236\144\SI;e\142"
sig <- signature @ByteString "\160r\240\221\191\ACK\221*\193\178>\SUB\USL\252HAID0\DC1\NUL~\131\&0\DLEy\188\187\197u\236\&8\201\175aNK\150\141\224\190\EM\141\129\STX\155\231\226N'E\DLEZ\249\131,ao\156\156\CANA\t"
pure $ Cardano.WitVKey (VKey $ VerKeyEd25519DSIGN pubkey) (SignedDSIGN $ SigEd25519DSIGN sig)
Writing a ToJSON/FromJSON instance for WitVKey, unpacking what we had in Witnesses before => The WitVKey is correctly encoded and decoded
Trying to chase the source of the error we are seeing from a failing transaction, deserialising the JSON witnesses and checking if they match the input transaction's => they do
However, this transaction contains minted values with a ScriptHash value as policy ID, could be the case that we get a missing script witness because we don't pass down the mints in the body?
We are minting the value
mint = Value 0 (fromList [(PolicyID {policyID = ScriptHash "42c7a014a4cd5537f64e5ae8ec7349db3d8603e16765dc37f8fb6e67"},fromList [("yellow0",134392),("yellow5",368980)])])}
which matches a script hash provided as part of the witnesses. Could it be we get a witness error because there are too many witnesses? OR a script provided is not matched in the body of the transaction? => yes
- The verification of script hashes checks that all script witnesses are used, and all required scripts are present
- Added JSON instance for assets in Value so that we complete the TxBody, bar the PP updates
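The two-sided check described above (every required script is present, every provided script is used) can be sketched as a comparison of two lists of hashes. This is a simplified illustration with hypothetical names, not the ledger rule itself.

```haskell
import Data.List (nub, (\\))

-- Hypothetical, simplified model: script hashes are plain strings.
type ScriptHash = String

data ScriptWitnessCheck
  = ScriptsOk
  | MissingScripts [ScriptHash]
  | ExtraneousScripts [ScriptHash]
  deriving (Eq, Show)

-- | Compare the hashes required by the transaction body (script inputs,
-- minting policies) against the scripts provided as witnesses.
checkScriptWitnesses :: [ScriptHash] -> [ScriptHash] -> ScriptWitnessCheck
checkScriptWitnesses needed provided
  | not (null missing) = MissingScripts missing
  | not (null extra) = ExtraneousScripts extra
  | otherwise = ScriptsOk
 where
  missing = nub needed \\ provided
  extra = nub provided \\ needed

main :: IO ()
main = do
  let policy = "42c7a014a4cd5537f64e5ae8ec7349db3d8603e16765dc37f8fb6e67"
  -- Mint field dropped from the serialised body: the script is no longer needed.
  print (checkScriptWitnesses [] [policy])
  -- Mint field kept: needed and provided agree.
  print (checkScriptWitnesses [policy] [policy])
```

In this toy model, dropping the mint field from the body makes a perfectly valid witness look extraneous, which matches the diagnosis above.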
Still having an error in the benchmark
ApplyTxError [UtxowFailure (UtxoFailure (ValueNotConservedUTxO (Value 309051813 (fromList [])) (Value 333277734 (fromList [])))),UtxowFailure (UtxoFailure (BadInputsUTxO (fromList [TxInCompact (TxId {_unTxId = SafeHash \"5e2921b6a85257bcdb0f2c5e9d96f0e5ed7cf199a646ce4d5d8961fa939bb126\"}) 2])))]")
It's perfectly possible for a submitted transaction to not be applicable at NewTx time, but in the HeadLogic we still submit it as a ReqTx and report a TxInvalid to the client
- In the original paper, transactions are required to apply to the confirmed UTxO set before being propagated
- Changing the logic of NewTx to:
  - Validate transaction against confirmed set (from latest snapshot)
  - Not send a ReqTx if the transaction does not apply
The behaviorSpec test now fails, which is to be expected:
FailureException (HUnitFailure (Just (SrcLoc {srcLocPackage = "main", srcLocModule = "Hydra.BehaviorSpec", srcLocFile = "test/Hydra/BehaviorSpec.hs", srcLocStartLine = 248, srcLocStartCol = 15, srcLocEndLine = 248, srcLocEndCol = 31})) (Reason "Test timed out after 1s seconds"))
Two things for tomorrow/next session:
- Change in the HeadLogic and adapt the tests
- Adapt the benchmark to use hspec to run it so that we get better error reporting. It's not really a benchmark anyway, it's more a load test.
We should change the names of the witnesses fields:
- scripts is fine
- addresses -> keys (and break it down into a vkey and a signature part)
We discussed the perceived awkwardness of the NodeSpec test as it is now:
- We should test at the boundaries and stub the effects, no more, and use the same createHydraNode function for all tests
- This means we should move the BroadcastToSelf wrapper into the node and not configure it outside as it is an integral part of the behaviour of the node
- Alternative would be to bake reinjection of Event from Effect into the HeadLogic protocol itself
- Writing a test exposing Wait of some event: We inject out of order AckSn/ReqSn and expect to see our own AckSn. But we see the AckSn with a weird signature... => Refactoring Node code to have a dedicated createHydraNode function that does the wrapping
- There is a problem in our createHydraNode function: withHeartbeat and withBroadcastToSelf require and produce NetworkComponents which contain both sending and receiving part of the network. Solution is to refactor createHydraNode to withHydraNode as a with pattern => Let's take a step back and not focus too much on this refactoring => keep the test pending for now and refactor later in solo mode
We then turn our attention towards benchmark errors again:
- Fixing the test output to be less verbose so that we get a better error reporting. Pondering if we should not write the messages into a file, but it makes things more complicated, going for the simple thing of truncating the list of messages when displaying the error
We were waiting for snapshotConfirmed and we changed the serialisation format to use Generic which emits SnapshotConfirmed with a capital C 🤦
Scaling bench again we have the timeout on waiting for confirmations again, but with more snapshots produced => Extracting the snapshot number from the confirmed ones and reporting it, instead of throwing an error and giving information, let's report progress from the benchmark
- We should use our Tracer to show progress in the benchmark
Investigating another failure, we see that node-3 gets a ReqSn that it drops because it is still processing another one. This case is actually incorrect, we should Wait if we get a ReqSn that could be valid in the future:
Writing a unit test in HeadLogic:
- Send 2 ReqSn in a row, should wait the 2nd one
- Receiving a ReqSn which is from the past should fail
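The two unit-test cases above suggest a three-way decision on an incoming ReqSn. A minimal sketch, assuming snapshot numbers increase by one and using hypothetical names (the real HeadLogic carries more state than this):

```haskell
-- Hypothetical sketch of how an incoming ReqSn could be classified.

data ReqSnDecision
  = Process
  | WaitForLater
  | Reject
  deriving (Eq, Show)

-- | Arguments: last confirmed snapshot number, whether a snapshot is
-- currently in flight, and the snapshot number carried by the ReqSn.
decideReqSn :: Int -> Bool -> Int -> ReqSnDecision
decideReqSn confirmed inFlight requested
  | requested <= confirmed = Reject -- from the past
  | requested == confirmed + 1 && not inFlight = Process
  | otherwise = WaitForLater -- could become valid in the future

main :: IO ()
main = do
  print (decideReqSn 0 False 1) -- first ReqSn
  print (decideReqSn 0 True 2) -- second ReqSn in a row, first one still in flight
  print (decideReqSn 2 False 1) -- ReqSn from the past
```

Waiting rather than dropping is what keeps a faster peer's request from being lost while the previous snapshot is still being processed.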
Added some tests to cover the case of snapshots "from the future", re-running the benchmarks works on AB's machine, yielding 50-70 snapshots whereas it fails on SN's machine which is faster
Node 1 ends up not emitting snapshots, seems like it is not emitting a network effect to send ReqSn message
Trying to drop the null (seenTxs ) condition works => the condition should be at the level of the guard so that we don't "consume" the DoSnapshot event without either waiting or emitting a ReqSn
- There is a property waiting to be written there, expressing snapshot strategy invariants in terms of variation of state/sequence of events.
Benchmark now runs to completion without failing 🎉
Discussing the snapshot strategy as it's getting somewhat cumbersome now
(Cont'ed work on Tx generator)
Setting most values to 0 led to an error in the generator, with all frequencies set to 0. QC.frequency is used in different sections of the generator:
- To generate credentials registration => some XXCred must be non 0
- To generate delegation => frequencyKeyCredDelegation or frequencyScriptCredDelegation must be non 0, or there must not exist a stake pool to delegate to (in DPState?)
- To generate withdrawals => we use defaults so should be fine?
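Since the failure only shows up at generation time, a small pre-flight check on the weights could point at the offending section earlier. This is a hypothetical helper, not part of the ledger's generator code; the group names are made up.

```haskell
-- Hypothetical pre-flight check for generator weights.

-- | A named group of weights that ends up in a single frequency call.
type WeightGroup = (String, [Int])

-- | Names of the groups in which no weight is positive; an empty group is
-- flagged as well.
unusableGroups :: [WeightGroup] -> [String]
unusableGroups groups =
  [name | (name, weights) <- groups, all (<= 0) weights]

main :: IO ()
main = do
  let groups =
        [ ("credential registration", [0, 0])
        , ("delegation", [0, 1])
        ]
  print (unusableGroups groups)
```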
Got another error now:
ApplyTxError [UtxowFailure (MissingScriptWitnessesUTXOW
So the transactions are generated sometimes with script addresses which obviously require a script witness, which we simply drop when creating the witnesses... => Going to add handling of script witnesses in the CardanoWitness data structure
- The script witnesses field is actually a Map ScriptHash Script, trying to serialise it as an object? Interestingly there's a ToJSON instance for ScriptHash but no ToJSONKey which is somewhat sad
- => Added scripts witnesses to the ToJSON instance for witnesses, and ToJSONKey/FromJSONKey instances for ScriptHash, now testing to see if I get failures in golden and roundtrip for witnesses, which should be the case...
- Added JSON instances for Timelock so that the JSON instance for witnesses is simpler
All serialisation tests now pass, now having an even more complex error with several issues:
Left (ValidationError {reason = "ApplyTxError [
UtxowFailure (MissingScriptWitnessesUTXOW (fromList [])),
UtxowFailure (InvalidWitnessesUTXOW [VKey (VerKeyEd25519DSIGN (PublicKey \"\\177g\\EM@R\\DC2\\251\\129\\GS\\175\\211+t\\146\\161\\205\\174\\138\\247\\154S\\244>\\r\\f%\\195U\\141\\166\\234\\&9\")),VKey (VerKeyEd25519DSIGN (PublicKey \"\\252]=\\212&\\139\\138\\240\\185\\ESC\\185\\GS\\186Dk\\164\\ESC`\\249I\\186\\163\\224K\\r\\SI\\192KT\\204\\160\\SO\")),VKey (VerKeyEd25519DSIGN (PublicKey \"\\150\\f[\\192l\\179\\v\\136%\\182%\\137 \\STX\\215\\229up\\228$V\\157?F\\151i\\236\\144\\SI;e\\142\")),VKey (VerKeyEd25519DSIGN (PublicKey \"o9\\174\\133\\251V\\252\\247\\210j\\187\\DC4\\178\\223@\\225\\182\\&9\\148\\a\\229\\\"4{\\185XR\\210<\\245\\154\\255\")),VKey (VerKeyEd25519DSIGN (PublicKey \"J\\219\\247.<\\203\\238\\216\\162\\EMhY{\\ESCk#\\214\\155\\170\\206J\\210\\FS\\206\\130\\209\\158s\\255\\&4\\255\\ETB\")),VKey (VerKeyEd25519DSIGN (PublicKey \"\\EOT0\\ETBo\\183\\n\\138\\182\\143\\192#\\172\\183\\243\\245\\215Sp\\201\\220\\DLE)\\SYNQ\\167\\ETB\\251\\218e\\ETX\\132\\196\")),VKey (VerKeyEd25519DSIGN (PublicKey \"h\\239\\210sTVfp\\NAK2-\\130\\STX\\253a\\DC2\\209\\204n\\245\\188\\213\\138cG\\136\\186I\\r\\249\\173\\143\"))]),
UtxowFailure (UtxoFailure (ValueNotConservedUTxO (Value 230733318 (fromList [])) (Value 230733318 (fromList [(PolicyID {policyID = ScriptHash \"42c7a014a4cd5537f64e5ae8ec7349db3d8603e16765dc37f8fb6e67\"},fromList [(\"yellow1\",729252),(\"yellow2\",901652),(\"yellow4\",871114),(\"yellow5\",127109)])]))))]"})
The transaction is relatively small:
{
"witnesses": {
"scripts": {
"42c7a014a4cd5537f64e5ae8ec7349db3d8603e16765dc37f8fb6e67": "820181820181820180",
"a3e84983320841577ac20d77058e440d7fb7e17e98659e921b1274a3": "83030383820282820182820181820518208200581c733aea10df2a2bb1d3019a7337b240ad64a174c919fc034fb372fdc9820182820181820418208200581c733aea10df2a2bb1d3019a7337b240ad64a174c919fc034fb372fdc9820282820182820181820518598200581c571758200680b643781738e0436291811be83c1707fc66edd4982b0e820182820181820418598200581c571758200680b643781738e0436291811be83c1707fc66edd4982b0e8202828201828201818205183582$0581cb9acb4b5682ddb6980f2471bbd13a3765e54d79ebf46417c850a609c820182820181820418358200581cb9acb4b5682ddb6980f2471bbd13a3765e54d79ebf46417c850a609c"
},
"addresses": [
"8200825820b16719405212fb811dafd32b7492a1cdae8af79a53f43e0d0c25c3558da6ea395840b8de36f9836332743d8068478fd5a1e93aeff12dfade0dedf86c74a252e23c1f7903b81d43a6a8b21e42b08fb531c2e9f6e78080aa71bf234e5117a7a1328a0f",
"8200825820fc5d3dd4268b8af0b91bb91dba446ba41b60f949baa3e04b0d0fc04b54cca00e5840ff802a5358b84e2110f981a697b60141dce0925f251a954c3c04877ea061083b9ddbfd40ce96a72e3600950bd4b866a49965480d70f0f45e186f8bcc8f9d130d",
"8200825820960c5bc06cb30b8825b625892002d7e57570e424569d3f469769ec900f3b658e5840cb9aa3267c3c7a05aabb7e3f57b62b238590922ff9e2b4d5965d4eab5a3fee92ccb1fd095c87ee7685f18a8704b04234ba56236adfb037ea157988aa8605e902",
"82008258206f39ae85fb56fcf7d26abb14b2df40e1b6399407e522347bb95852d23cf59aff58402e639dd813c0a9879366f7c9491ea95d70134be90687b7687308551488a556811c4fc0aced07af841f2e5cc0248af747cf3dfd506d5158d71592878800bce709",
"82008258204adbf72e3ccbeed8a21968597b1b6b23d69baace4ad21cce82d19e73ff34ff1758404c94c33d62fd704369c110cd010a4d6ea04eb002ae6c3fbdadb0d62a84b03c0e2be84696631d82e82eb3deec595c72e5b2d810f3a95909058bc2e82549a0a104",
"82008258200430176fb70a8ab68fc023acb7f3f5d75370c9dc10291651a717fbda650384c45840135bcc2f58a21a9273e8e7ca481a744aefadf12b48e9e8b5b3e5e6820e04bf74113d696cd45f5b5d17ade8d23e38522902dd4463852d17c9ce4c818e61c2c107",
"820082582068efd2735456667015322d8202fd6112d1cc6ef5bcd58a634788ba490df9ad8f5840fa9f0adb12e0ccea8ef31c656af30b473334c026228f223940e08f2ec344c6c9cc0d3a9b4ba0f597f79fb84bb885fd59089fc12890c3563b96a4121e68fb2701"
]
},
"body": {
"outputs": [
{
"address": "addr_test1qqzfllufs42yh9tz3j5zeqeh8v789hvzvz57kd4n5xez0pnc0554t3aspn7xrc0ekfq7he5gwwx935kc8yzx00znxr3shu4jta",
"value": {
"lovelace": 46146663
}
},
{
"address": "addr_test1qryc674js99w50kjf30heds8eqqe0vre3d8487swgrmd7q5a8uwp3k06h9vg32z7lrnzjvpey9eymx7zq8atvz755sjqcguqss",
"value": {
"lovelace": 46146663
}
},
{
"address": "addr_test1qz2y2kkjyhz4d5957msrgtesv85aerhynpgp29fnrxq676zpgz2ae0fmt5rr6kzr97c4e0qg8jvvnx4ktjnlx7unu4wsaad457",
"value": {
"lovelace": 46146663
}
},
{
"address": "addr_test1xzd9pvulknjv8x5fq6d9pluz3kccxw3l8rzvjk4tzdndug2900ddnvzdkregx3scav6qjjc0vq0l9apfh9sd8983zcesws0shy",
"value": {
"lovelace": 46146663
}
},
{
"address": "addr_test1xqs6qrhy9hu77wms0xcryy9tcnv32340gy63ejdz7zxeqj8kf86jrtuhuy6recsnpsn9gfen2u6uueqdljnsqlvu5kpsaldxc9",
"value": {
"lovelace": 46146663
}
},
{
"address": "addr_test1qqh8gkdmj6d8exd4tq65hql93dclhkfamh7dgr6etf8l4k5akymz4y4zwhufvwrymmy08acmy2ujkllln7jcs43k87ys04zqf8",
"value": {
"lovelace": 3
}
}
],
"inputs": [
"03170a2e7597b7b7e3d84c05391d139a62b157e78786d8c082f29dcf4c111314#20",
"03170a2e7597b7b7e3d84c05391d139a62b157e78786d8c082f29dcf4c111314#58",
"03170a2e7597b7b7e3d84c05391d139a62b157e78786d8c082f29dcf4c111314#61",
"03170a2e7597b7b7e3d84c05391d139a62b157e78786d8c082f29dcf4c111314#80"
]
},
"id": "f207b15e2b5691ce237d0b28e6e26cfc5f933281eb16c363458652d663b3dd29",
"auxiliaryData": null
}
So it is obviously missing outputs for non-ADA tokens => We drop the mint field and a lot of other fields from the generated body, which is a remnant from a previous attempt at dumbing down the Body. Just passing the body encoded as is should hopefully be fine, except we are probably dropping a lot of fields in the serialisation process => Test for applying transactions succeeds, now going to wire that into the benchmark
ETE test is now failing which is expected as the format has changed.
- Network is wrong, should unify to `TestNet` for all components. Funnily, the `Testnet` data constructor from cardano-api is different from the one in the Shelley ledger as it takes a `Magic` argument, but in the conversion process it is simply dropped.
As expected, when converting the benchmark to use CardanoTx, it fails to validate the transactions emitted because our serialisation is missing quite a lot of fields from the TxBody. We should either filter those, or complete the serialisation to handle more body fields => Covering JSON serialisation of missing fields in TX, in order to ensure we can properly encode/decode all kind of transactions, we'll deal with rejecting irrelevant transactions later on.
Continued working on generating transactions and checking roundtrip/golden serialisation.
Need to tweak the max transaction size parameter to find the right one:
- Default maximum tx size is set to 2048 (bytes?): https://github.com/input-output-hk/cardano-ledger-specs/blob/nil/shelley/chain-and-ledger/executable-spec/src/Shelley/Spec/Ledger/PParams.hs#L320 Setting it to 1MB for the moment
Got a different error this time:
ApplyTxError [UtxowFailure (UtxoFailure (UpdateFailure (NonGenesisUpdatePPUP (fromList [KeyHas
h \"099f27f2d9bc901017518ee78b9b12a52ce658142e255666e2ce0b9d\",KeyHash \"859e3a86e34626df256a84ee03d813819aa731e854b6e4034e7024e0\",KeyHash \"a
01f063c96ada95334fcdc7beb3a8fb2d0ff4ee8d206be17fa1becae\",KeyHash \"a94e6fffab278ffef8092918bc3ae6ac47d3cf8d9f4b923ecbfd8236\",KeyHash \"f90e54
1ed22c517263ab0885721c02f08a313b21de009efd3672afed\"]) (fromList []))))]"})
Trying to strip the update-related parts from the generated transactions' bodies, but of course it's not that simple: stripping the parts of the body we are not interested in leads to more errors:
[UtxowFailure (InvalidWitnessesUTXOW [VKey (VerKeyEd25519DSIGN (PublicKey \"J\\21
9\\247.<\\203\\238\\216\\162\\EMhY{\\ESCk#\\214\\155\\170\\206J\\210\\FS\\206\\130\\209\\158s\\255\\&4\\255\\ETB\")),VKey (VerKeyEd25519DSIGN (
PublicKey \"h\\239\\210sTVfp\\NAK2-\\130\\STX\\253a\\DC2\\209\\204n\\245\\188\\213\\138cG\\136\\186I\\r\\249\\173\\143\"))]),UtxowFailure (Miss
ingTxBodyMetadataHash (AuxiliaryDataHash {unsafeAuxiliaryDataHash = SafeHash \"8493f75f77f6f02b5342998e180be02fa132c26fba140bdb51026d9f1a2f6bce
\"}))]"})
As pointed out by Jared, we can tweak the generator by setting various parameters in the Constants argument
Reviewing TUI code written by SN
Trying to fix NodeSpec test by adding a Wait to replace error -> of course, unit test pass but benchmark still fails, note we have to revert to using SimpleTx in node and mock-chain
- One problem is that we can emit 2 times the same snapshot
Writing a NodeSpec test to check that we do not emit ReqSn twice for the same snapshot
- We don't see any `ReqSn` after injecting a bunch of txs -> node is not the leader
- Also we do not handle effects so we want to create the node with a list of events to prepopulate the queue and then process events until completion or quiescence, e.g. when the queue is empty
Starting at ed8eeaba94efffca8596e2339e03b1852d3ce4aa BehaviorSpec tests hang:
-
We spent some time troubleshooting an issue which ended up being caused by the following code:
runHydraNode :: Tracer m (HydraNodeLog tx) -> HydraNode tx m -> m ()
runHydraNode tracer node = forever . stepHydraNode
It happens that `forever` has type `Applicative f => f a -> f b`, and `(a ->)` is actually an `Applicative`, so `forever` would endlessly evaluate a thunk which reduces to a function which cannot be evaluated further, which locks the process.
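The pitfall can be reproduced in isolation. The sketch below is not the node's code: `step`, `loopStuck`, `loopOk` and `runBounded` are hypothetical names, and the step function simply counts iterations and aborts with an exception so that the working variant terminates.

```haskell
import Control.Exception (ErrorCall (..), throwIO, try)
import Control.Monad (forever, when)
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- Hypothetical stand-in for one node step: bump a counter and abort
-- with an exception once `limit` iterations have run.
step :: Int -> IORef Int -> IO ()
step limit ref = do
  modifyIORef' ref (+ 1)
  n <- readIORef ref
  when (n >= limit) $ throwIO (ErrorCall "limit reached")

-- Type-checks, but `forever` is instantiated at the function
-- Applicative ((->) (IORef Int)): no IO action is ever sequenced, so
-- applying this to both arguments never runs `step`.
loopStuck :: Int -> IORef Int -> IO ()
loopStuck = forever . step

-- Intended behaviour: fully apply `step` so that `forever` runs in IO.
loopOk :: Int -> IORef Int -> IO ()
loopOk limit ref = forever (step limit ref)

-- Run the working loop until `step` aborts it and return the number
-- of iterations that were executed.
runBounded :: Int -> IO Int
runBounded limit = do
  ref <- newIORef 0
  _ <- try (loopOk limit ref) :: IO (Either ErrorCall ())
  readIORef ref
```

Both loops have the same type signature, which is why the compiler cannot flag the mistake; only the number of arguments passed before `forever` is applied differs.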
Ended up having a finer grained description of the SeenSnapshot than a Maybe to distinguish the situation from the point of view of the leader and the followers so that we don't make too many snapshots.
There is an interesting micro-pattern here that also is prominent in cardano API which is to not have Maybe blindness: Use expressive and "domain relevant" ADTs to express the state.
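As a minimal sketch of that micro-pattern, assuming hypothetical constructor names and a snapshot identified only by its number (the actual type in the node may carry more data):

```haskell
import Numeric.Natural (Natural)

-- Hypothetical, simplified replacement for a `Maybe Natural` field.
-- With `Maybe`, `Nothing` could not tell "nothing in flight" apart
-- from "we already requested a snapshot but have not seen it yet".
data SeenSnapshot
  = NoSeenSnapshot -- nothing requested or seen yet
  | RequestedSnapshot Natural -- leader: request sent, not yet seen
  | SeenSnapshot Natural -- request seen and being signed
  deriving (Eq, Show)

-- A leader should only request a new snapshot when none is in flight.
shouldRequestSnapshot :: Bool -> SeenSnapshot -> Bool
shouldRequestSnapshot isLeader seen = isLeader && seen == NoSeenSnapshot
```

The extra constructor is what lets the leader avoid emitting a second request for the same snapshot.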
Re-running the benchmark (replacing ledger type) still fails in the followers with an out-of-order ReqSn but it fails much later, like in snapshot number 39 which is some progress: We get ReqSn for 40 but we are still processing 39 in node 2. We don't have any InvalidEvent in the node 3 though.
- Added command line parameter to decide to which node to connect using `--connect`
  - Uses the `Host` type of `hydra-node` module `Hydra.Network`
  - Needed to add more details to `ClientJSONDecodeError` to realize that the `hydra-node`s in docker containers were still previous API JSON instances
  - Rebuilt the docker images and re-started nodes using `docker-compose build` and `docker-compose up -d` in `demo/`
- Start adding commands to drive the Head lifecycle
  - Rendering `[i]nit` and handling the `KeyEvent` is quite easy with `brick`
  - Now the tricky part is implementing `Client{sendInput}` for a not necessarily connected websocket
- Switching to `CardanoTx` and adding `[a]bort` was a breeze
- I would like to test `handleEvent` as it gets quite complex, but I don't know how?
- At some point I realized that the `hydra-node` containers are all at full CPU utilization - are we busy looping?
  - Yes, in the recently introduced logging rewrite when flushing the queue: https://github.com/input-output-hk/hydra-poc/pull/63
Small note on the APISpec which tests our types against api.yaml:
test/Hydra/APISpec.hs:35:7:
1) Hydra.API, Validate JSON representations with API specification, ServerOutput
Assertion failed (after 1 test and 7 shrinks):
[PeerDisconnected {peer = VerKeyMockDSIGN 0}]
{'peer': 0, 'tag': 'PeerDisconnected'}: {'peer': 0, 'tag': 'PeerDisconnected'} is valid under each of {'required': ['tag', 'peer'], 'title': 'PeerDisconnected', 'properties': {'peer': {'$ref': '#/definitions/Peer'}, 'tag': {'enum': ['PeerDisconnected'], 'type': 'string'}}, 'type': 'object'}, {'required': ['tag', 'peer'], 'title': 'PeerConnected', 'properties': {'peer': {'$ref': '#/definitions/Peer'}, 'output': {'enum': ['PeerConnected'], 'type': 'string'}}, 'type': 'object'}
Does not mean necessarily that the PeerDisconnected is implemented wrong, but in this case it was indeed the specification in api.yaml of PeerConnected!
It's a bit confusing that no PeerConnected was in the failing list, although this might come from the batch-wise invocation of jsonschema and shrinking?
Discussing strategies and options for roadmap of Hydra. Could it be interesting to frame this using Real Options?
Merging PR about bech32 addresses, seems like TH is a bit overkill but OTOH it's safer and less ugly than handling a Left impossible. Also, handling of various addr types is cumbersome and could be removed if we used cardano-api's functions -> put a red bin to refactor that later
Got a failure in the mock-chain serialisation of UTxO: It's because the mock-chain is using SimpleTx and not CardanoTx. Could it be made more polymorphic and agnostic in the type of transactions it transports? => Not easily
We don't report the error to the InvalidInput which is annoying -> moving the InvalidClientInput wrapper to the HeadLogic module to have it available there, we realise this representation is actually too complex and does not roundtrip properly in JSON. We fixed the encoding of invalid input as just Text and display it to the end user (in ServerOutput).
- Looks like we did not do the right thing the first time, namely making sure we are reporting error to the user properly
We now have a proper error message with ETE test: The transaction fails to be deserialised properly, and it points to the witnesses not being properly encoded.
- Writing unit test with the faulty transaction as JSON
- Comparing ETE encoding with what we have in the node, seems like the ETE is encoding the witness set as a CBOR list and not a list of CBOR?
- We encode a list of `KeyWitness` on the client side, which seems ok, but the encoding of `KeyWitness` is weird, depending on the type of witness it encodes it as a 2-elements list with the first element being a discriminator: We should do something symmetric in the ToJSON/FromJSON in the Cardano ledger
Serialisation of TX is working fine, and now it fails on applying the transaction to the ledger: We'll reuse existing functions from MaryTest
- Previous code for applying transactions directly used `Cardano.Tx` but we have wrapped it in our type. Need to convert a `CardanoTx` to a `Cardano.Tx`
- We hit the same problem AB had for tx generation about `TxBody`: The one available in the API is not the correct one -> Changing for the proper one in `ShelleyMA`
We see the transaction is sent and injected into the ledger but it is invalid: We need to improve the reporting of errors about invalidity of transactions
- Make `ValidationError` more verbose which showed us our addresses were incorrect, namely we sent `Mainnet` and receive `Testnet`
{"transaction":{"witnesses":["8200825820db995fe25169d141cab9bbba92baa01f9f2e1ece7df4cb2ac05190f37fcc1f9d58400599ccd0028389216631446cf0f9a4b095bbed03c25537595aa5a2e107e3704a55050c4ee5198a0aa9fc88007791ef9f3847cd96f3cb9a430d1c2d81c817480c"],"body":{"outputs":[{"address":"addr1vx35vu6aqmdw6uuc34gkpdymrpsd3lsuh6ffq6d9vja0s6spkenss","value":{"lovelace":14}}],"inputs":["9fdc525c20bc00d9dfa9d14904b65e01910c0dfe3bb39865523c1e20eaeb0903#0"]},"id":"56b519a4ca907303c09b92e731ad487136cffaac3bb5bbc4af94ab4561de66cc"},"output":"transactionInvalid","validationError":{"reason":"ApplyTxError [UtxowFailure (UtxoFailure (WrongNetwork Testnet (fromList [Addr Mainnet (KeyHashObj (KeyHash \"a346735d06daed73988d5160b49b1860d8fe1cbe929069a564baf86a\")) StakeRefNull])))]"}}
Node now fails because of missing ToCBOR/FromCBOR instances for CardanoTx which prevents proper communication with other nodes.
- We realise our `CardanoTx` type is problematic as the `TxId` should always stay in sync with the `TxBody` -> remove it from the data structure and recompute it every time
- We got stuck in decoding the `Annotator TxBody` as there is no `FromCBOR TxBody` instance: The `FromCBOR` class provides a `Decoder` but the `runAnnotator` requires access to the underlying `ByteString`, which is annoying and prevents us from using a `FromCBOR (Annotator a)` within a `FromCBOR` instance needing a `FromCBOR a`.
- Getting dragged into the weeds of how transactions get serialised inside a node....
- Solution is to `decodeBytes` then use those bytes as input to the `decodeAnnotator` function. We write the `ToCBOR`/`FromCBOR` instances of `CardanoTx` using the underlying `FromCBOR (Annotator Tx)` instances and reconstruct the txId using the body
Finally got a green ETE test! 🎉
Implemented a basic Arbitrary instance for CardanoTx to have a proper roundtrip and golden test.
I have made the instance use a genCardanoTx function that will come in handy once we want to generate sequences of valid transactions, for example in the benchmark or end-to-end tests.
Now adding a ToCBOR/FromCBOR roundtrip test for completeness' sake.
Got failures when trying to quickcheck application of transactions to the Cardano ledger as the generated transactions run on the Testnet and not the Mainnet which changes the addresses used. -> Settle on using Testnet everywhere
Then another failure: The problem is that generated transactions are more complex than what we cope with in the ledger apparently.
- BTW, it's really not obvious how to meaningfully shrink a transaction!
- The transactions generated are actually using the auxiliary data so when we simply drop those in the generator, we make the transaction possibly invalid -> Add the auxiliary data as a field to the `CardanoTx` so that we have all the information for a real transaction.
Looking at writing a generator for our CardanoTx. There is a genTx function provided in ledger-specs: https://github.com/input-output-hk/cardano-ledger-specs/blob/master/shelley/chain-and-ledger/shelley-spec-ledger-test/src/Test/Shelley/Spec/Ledger/Generator/Utxo.hs#L103 which produces valid transactions. Trying to salvage work I have done before on model based tester
Follow-up on update/abort in PAB: We can't both wait for updates and expose endpoints because the update resumption always is done from the tip of the chain apparently, which means that by the time the abort is done, the update will miss the abort transaction. => activate 2 different contracts in 2 different threads
We add more tickets to the backlog on Miro, filling in some gaps we perceive in what's needed for the milestone. We also agree on making fewer tickets "ensemble-only" to allow team members to pick more stuff when working alone.
We end the session closing the loop with the "real" cardano ledger in the Head:
- AB had prepared To/FromJSON instances for most of our types, so we could start by wiring up the `Hydra.Ledger.Cardano`
- When simply using `cardanoLedger` in `hydra-node`, the new `Tx` and associated `Utxo` types were used
- We still had an error when deserializing a `Commit` client input and added the aeson error to the `APIInvalidInput` error trace
- Finally we realized that it was not parsing because the address format we used in our e2e fixture is using `bech32`, while we were still de-/serializing a raw hex serialization for the address
- Two solutions had been researched, but not audited yet
  - One is implemented now by Inigo
  - Both would be transparently supported by the node (when verifying)
- Some little addition required in libsodium to make it possible to construct a multisig signature so that, on the verifier side no change would be needed and classic Ed25519 verification can be used.
- Generating and not reusing nonces is a vital part (of engineering)
- While it works in practice at the moment, the theory behind the MuSig2 needs to be validated from a mathematical standpoint.
- Next steps
  - Have it theoretically defined -> will yield a formal definition
  - Implementation / changes to libsodium are checked against that
- Really the most complicated part is managing nonces
  - To produce a partial signature the signing party needs to have the nonce for it from all other signers
  - Signers need to produce those nonces and need to keep a state of already produced nonces because reusing a nonce means disclosing private key. This implies keeping state on nodes for generated nonces
  - Aggregation can be done by anyone, it does not handle any secret
  - Aggregation of public keys and conversion from prime order group can be done once, at startup time
- Other projects at IOHK will be using a non-interactive multi-signature scheme which requires changes on the verifier
  - They will require a hard-fork then?
- Code is located at https://github.com/input-output-hk/musig2, not production ready but something we can iterate on.
Then discussion about non-custodial head deployment model with some delegation, something like lightning with watchtowers. Conclusion is that it seems feasible as long as lawyers agree this is indeed a non-custodial solution.
There's no generally agreed upon JSON representation of a cardano transaction, which is somewhat annoying: In the cardano-api there are ToJSON instances but no FromJSON, and only for some parts of the API.
Switch to representing the utxo as a map from TxIn to TxOut instead of an object with a ref field and an output field: This means will be encoding the TxIn as it's done in the cardano-api, namely as a string with transaction id plus index.
We should write roundtrip serialisation tests for TxId, using the Gen TxId available in cardano-api perhaps?
There are actually 2 interesting properties:
- conformity with cardano-api
- roundtrip ToJSON/FromJSON
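The roundtrip property can be illustrated without any ledger dependency. The sketch below assumes a toy `TxIn` made of a hex string and an index, rendered as "transaction id plus index" separated by `#` as in the entries above; `renderTxIn`, `parseTxIn` and `roundtrips` are hypothetical names, not cardano-api functions, and conformity with cardano-api is not covered here.

```haskell
import Data.Char (isHexDigit)
import Numeric.Natural (Natural)
import Text.Read (readMaybe)

-- Toy stand-in for a transaction input: hex-encoded id plus output index.
data TxIn = TxIn {txInId :: String, txInIx :: Natural}
  deriving (Eq, Show)

-- Render as "<txid>#<index>".
renderTxIn :: TxIn -> String
renderTxIn (TxIn i ix) = i <> "#" <> show ix

-- Parse the textual form back, rejecting malformed ids and indices.
parseTxIn :: String -> Either String TxIn
parseTxIn s = case break (== '#') s of
  (i, '#' : rest)
    | not (null i)
    , all isHexDigit i
    , Just ix <- readMaybe rest ->
        Right (TxIn i ix)
  _ -> Left ("cannot parse TxIn: " <> s)

-- The property to check for arbitrary well-formed values.
roundtrips :: TxIn -> Bool
roundtrips txIn = parseTxIn (renderTxIn txIn) == Right txIn
```

In a real test suite the same property would be stated over the actual `ToJSON`/`FromJSON` instances and driven by a generator.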
Quick reflection on the session: We lost track this morning of our TDD principles, did not run our ETE test once and got lost in the weeds of implementing serialisation code for the full transaction whereas we would only need the UTXO to make some progress (eg. be able to send commit commands).
Discussing demo:
- what kind of audience are we targeting? devs/enthusiasts/SPOs.. -> interested in technical stuff
- summit is also a wider audience so having something graphical to show?
- the message is about scaling to global use cases
- is it the first time in Cardano we are "locking" fund?
- commits are mandatory, and an integral part of the head's capabilities
Do we really need a "lay people" demo? => Probably not, it might blur the message, letting people think this is all done and packaged where it really is not => something to leave for marketing people to work on, to carve the right message
No need to have the 3 of us working on finishing serialisation, AB is going to wrap up the Cardano head ledger so that we can have ToJSON/FromJSON instances working and tested, then we pick up the ensemble session working on the validation and integration of actual ledger.
Managed to get a reduced out-of-order snapshot test case, after extracting an immutable prefix from the events stream so that keys and committed UTXOs are right. Trying to plug this reduced test case in the shrinker does not lead to more shrinks so it seems in a sense to be minimal.
Struggling with getting the ToJSON/FromJSON instances right. Wrote a roundtrip test for UTxO and then the JSON instances would seem easy enough, but there's this crypto and era type parameters which are pretty annoying.
- Solution is: Make `Crypto crypto` a constraint for all typeclasses and a parameter for types and this will make testing easier
Got everything to compile and test to run, but roundtrip is failing. Going to troubleshoot.
- Trying to reproduce test failure in the REPL, but it does not work as easily: Several instances of FromJSON are in scope as the `cabal repl` command loads all the files from the component
- The issue is in the parsing of `TxIn` as key in maps
As expected, working with encoded txIn is a PITA...
Left "Error in $: parsing Natural failed, expected Number, but encountered String"
Making progress, can now properly serialise UTXO and witnesses:
Hydra.Ledger.Cardano
Cardano Head Ledger
JSON encoding of (UTxO (ShelleyMAEra 'Mary TestCrypto))
allows to encode values with aeson and read them back
+++ OK, passed 100 tests.
JSON encoding of (UTxO (ShelleyMAEra 'Mary TestCrypto))
produces the same JSON as is found in golden/UTxO (ShelleyMAEra 'Mary TestCrypto).json
JSON encoding of (WitnessSetHKD Identity (ShelleyMAEra 'Mary TestCrypto))
allows to encode values with aeson and read them back
+++ OK, passed 100 tests.
JSON encoding of (WitnessSetHKD Identity (ShelleyMAEra 'Mary TestCrypto))
produces the same JSON as is found in golden/WitnessSetHKD Identity (ShelleyMAEra 'Mary TestCrypto).json
Working again on signing a transaction to submit to the head, relevant function is signShelleyTransaction from cardano-api, but unsure if this is the right way to go as it seems a bit hard to work with.
We start trying to salvage what we did for MaryTest in order to build the txBody but give up after some efforts building the transaction: The ledger API used for MaryTest is too low level, we should really start from what we need in cardano-api, namely signShelleyTransaction and resolve issues from there.
Looking at how to build a transaction from the cardano-cli, what kind of data it provides. -> it uses TxBodyContent passing the different bits of information. Seems like a good strategy is to use exclusively stuff from Cardano.Api module which exposes the full node API.
For the moment, we use the getTxBody and getTxWitnesses to extract the data from the signed tx and just shove it as encoded strings into the JSON, but we want to have more details in the transactions.
We managed to build a complete transaction and print it in encoded form, now trying to format a transaction in JSON to send to the node. It's mildly annoying the cardano API does not provide default/empty values for a TxBodyContent so that we could just update the parts we are interested in, but it's nice it provides explicit types (not Maybe x) for all fields.
Also, we created a minimal JSON serialization of a Cardano Tx by "viewing" the TxBodyContent and using (partially) available ToJSON instances for TxIn and TxOut.
{
"witnesses": [
"8200825820db995fe25169d141cab9bbba92baa01f9f2e1ece7df4cb2ac05190f37fcc1f9d58400599ccd0028389216631446cf0f9a4b095bbed03c25537595aa5a2e107e3704a55050c4ee5198a0aa9fc88007791ef9f3847cd96f3cb9a430d1c2d81c817480c"
],
"body": {
"outputs": [
{
"address": "addr1vx35vu6aqmdw6uuc34gkpdymrpsd3lsuh6ffq6d9vja0s6spkenss",
"value": {
"lovelace": 14
}
}
],
"inputs": [
"9fdc525c20bc00d9dfa9d14904b65e01910c0dfe3bb39865523c1e20eaeb0903#0"
]
},
"id": "56b519a4ca907303c09b92e731ad487136cffaac3bb5bbc4af94ab4561de66cc"
}
Now need to make the test pass!
- Discussion on user interfaces and how or whether to split between a high-level "wallet" and more lower-level management UI
- Set off to "use the cardano ledger"
  - Revisited codebase and see what's in `MaryTest`, what would be missing and where we likely need to change things
- What's our goal? Start from the outside! We want to have our end-to-end test be using cardano transactions
- We use our "own" json format, but do intend to support the "serialized cbor" format for accepting transactions later on
  - Rationale being, that some `ServerOutput` is showing full transactions and clients likely are interested in comparing sent / seen transaction `outputs` etc.
- We stopped at signing the transaction, which in particular is not provided by the `cardano-ledger-core` based API we had been using for constructing addresses, so we think about switching to using `cardano-api` for constructing / signing a transaction
  - Also, it seems to be the most "blessed" and somewhat high-level API for dealing with Cardano transactions
- Few tools and documents mentioned that are quite useful when working with Cardano data:
  - bech32: very simple yet powerful for converting strings to/from bech32
  - cardano-addresses: handy command-line for creating, hashing and inspecting addresses and scripts. It has a nice(r than cardano-cli) interface.
  - cbor.me: simple tool for inspecting hex-encoded CBOR content
  - Mary CDDL & Alonzo CDDL: CBOR specifications for Cardano binary types.
- Creating a first draft for a terminal user interface using `brick` to "manage a hydra node"
  - This will be a Hydra client, which connects to a (local) hydra-node
  - Will focus on introspecting the hydra-node and the Head state, as well as opening and closing
- Start with static Brick UI which only shows `version` of the TUI and can be quit
- Attaching it to the `hydra-node` using a `Client` component (see ADRs) which opens a websocket connection to the `hydra-node`
  - For now hard-coded host and port
  - Deserialize `ServerOutput` and handle them as "application-specific events" using `customMain`
  - For example `PeerConnected` updates a list of `connectedPeers` in the State and `draw` paints them
- Making the `hydra-node` connection robust is non-trivial though
  - Connectivity should ideally be known to the UI
  - Changing the State to `Disconnected | Connected {...}` to make "invalid states unrepresentable"
  - Extend event type to be something like `ClientConnected | ClientDisconnected | Update ServerOutput`
  - Retry connection upon `ConnectionException` of `websockets` is not enough, need to catch and retry also on `IOException` (initially)
- Next steps:
  - Testing, any interesting properties in handling events / drawing?
  - Command line parsing for picking `hydra-node` to connect to
  - Adding commands and conditional rendering on `HeadState` -> How to infer it and which commands are possible from `ServerOutput`?
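The state and event types mentioned in this list can be sketched as a pure transition function, which is also one way to make event handling testable without running `brick`. Everything below is a simplified assumption: peers are plain strings, only two server outputs are modelled, and `handleEvent` here is not the TUI's actual handler.

```haskell
import Data.List (delete, nub)

-- Hypothetical, trimmed-down server outputs; peers are plain strings.
data ServerOutput
  = PeerConnected String
  | PeerDisconnected String
  deriving (Eq, Show)

-- Connectivity is part of the state, so a peer list cannot exist
-- while the client is disconnected.
data State
  = Disconnected
  | Connected {connectedPeers :: [String]}
  deriving (Eq, Show)

data Event
  = ClientConnected
  | ClientDisconnected
  | Update ServerOutput
  deriving (Eq, Show)

-- Pure state transition, independent of any rendering.
handleEvent :: State -> Event -> State
handleEvent _ ClientConnected = Connected []
handleEvent _ ClientDisconnected = Disconnected
handleEvent Disconnected (Update _) = Disconnected
handleEvent (Connected peers) (Update output) =
  case output of
    PeerConnected p -> Connected (nub (peers <> [p]))
    PeerDisconnected p -> Connected (delete p peers)
```

Folding a list of events over an initial `Disconnected` state gives simple properties to check, for example that the peer list is always empty right after a reconnect.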
Seems like the JSON logger we are using is actually unreliable, some messages appear truncated in the output, like:
{"thread":"41","loc":null,"data":{"network":{"data":{"trace":[["event","receive"],["agency","ClientAgency TokIdle"],["send",{"contents":{"tran\sactions":[{"outputs":[1797,1798,1799,1800,1801,1802,1803,1804,1805,1806],"id":329,"inputs":[1785,1795]},{"outputs":[1807,1808,1809],"id":330,\"inputs":[1801,1802,1803,1804,1806]},{"outputs":[1810,1811,1812,1813,1814,1815],"id":331,"inputs":[1798,1805,1807,1808]},{"outputs":[1816,1817\,1818],"id":332,"inputs":[1796,1800,1810,1813,1815]},{"outputs":[1819,1820,1821],"id":333,"inputs":[1797,1799,1809,1811,1812,1814,1816,1818]},\{"outputs":[1822,1823,1824,1825],"id":334,"inputs":[1819,1820,1821]},{"outputs":[1826,1827,1828,1829],"id":335,"inputs":[1817,1823,1824,1825]}\,{"outputs":[1830],"id":336,"inputs":[1826,1828]},{"outputs":[1831,1832,1833,1834,1835,1836,1837],"id":337,"inputs":[1822,1830]},{"outputs":[1\838,1839,1840,1841],"id":338,"inputs":[1829]},{"outputs":[1842,1843,1844,1845,1846,1847,1848],"id":339,"inputs":[1827,1831,1832,1834,1835,1836\,1837,1840]},{"outputs":[1849,1850,1851,1852,1853],"id":340,"inputs":[1833,1841,1843,1844,1847,1848]},{"outputs":[1854,1855],"id":341,"inputs"\:[1838,1839,1842,1846,1849,1850,1851,1852,1853]},{"outputs":[1856,1857,1858,1859,1860],"id":342,"inputs":[1845,1854,1855]},{"outputs":[1861,18\62,1863,1864,1865,1866,1867,1868,1869],"id":343,"inputs":[1858,1860]},{"outputs":[1870,1871,1872,1873,1874,1875,1876,1877,1878,1879],"id":344,\"inputs":[1856,1857,1862,1865,1866,1868]},{"outputs":[1880,1881],"id":345,"inputs":[1859,1870,1871,1874,1875,1876,1878,1879]},{"outputs":[1882\,1883,1884,1885,1886,1887],"id":346,"inputs":[1861,1864,1877]},{"outputs":[1888,1889,1890,1891,1892,1893,1894,1895,1896],"id":347,"inputs":[18\63,1873,1880,1883,1884,1885,1887]},{"outputs":[1897,1898,1899,1900,1901,1902],"id":348,"inputs":[1869,1872,1882,1886,1888,1890,1891,1894]},{"o\utputs":[1903],"id":349,"inputs":[1867,1889,1892,1893,1898,1899,1900,1901,1902]},{"outputs":[1904,1905],"id":350,"inputs":[1881,1895,1896]},{
"\outputs":[1906,1907,1908,1909,1910,1911,1912],"id":351,"inp]},{"outp
Simplifying logs using a simple queue where log messages are written to and read from in another thread which is responsible for dumping them in JSON to stdout, adding timestamp and various metadata. Ended up not using Katip as it still adds some cruft on top of what we really need, just wrote a simple thread-based logger that pumps from a queue and writes to stdout.
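A minimal sketch of such a queue-based logger, using only `base`, could look as follows. It is an illustration under assumptions rather than the project's implementation: `withLogger`, `LogMsg` and `demo` are hypothetical names, messages are plain strings instead of JSON with timestamps, and exceptions thrown by the inner action are not handled.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.Chan (newChan, readChan, writeChan)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import System.IO (Handle, hFlush, hPutStrLn, stdout)

-- Either a line to write or a request to stop the writer thread.
data LogMsg = LogLine String | Stop

-- Run an action with a logging function. Messages are queued and
-- written by a dedicated thread; the queue is drained before returning.
-- Assumption: no exception handling, unlike a production logger.
withLogger :: Handle -> ((String -> IO ()) -> IO a) -> IO a
withLogger h action = do
  queue <- newChan
  done <- newEmptyMVar
  let pump = do
        msg <- readChan queue -- blocks while the queue is empty
        case msg of
          LogLine l -> hPutStrLn h l >> pump
          Stop -> hFlush h >> putMVar done ()
  _ <- forkIO pump
  result <- action (writeChan queue . LogLine)
  writeChan queue Stop
  takeMVar done
  pure result

-- Example use: log two lines to stdout and return a value.
demo :: IO Int
demo = withLogger stdout $ \logLine -> do
  logLine "starting"
  logLine "done"
  pure 2
```

Because `readChan` blocks on an empty queue, the writer thread does not spin while idle.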
Added some simple test as I noticed this code is never tested directly.
Logging format is simpler, extracting events become:
$ cat /run/user/1001/bench-f531fd03b79aa8ca/1 | jq -c 'select((.message.tag == "Node") and (.message.node.tag|test("ProcessingEvent"))) | .message.node.event'
Rerunning the benchmark I still have an incorrectly formatted log entry for the first node, which seems to be the one generating the error, but it's unclear from the error message.
So the logs are truncated because of the error sent, which is ok but it's unclear why an entry in the middle of the file could be incorrect. Could be caused by the flushing of the logs I have added to Logging as we can still write more logs even when the inner action is interrupted by an exception that prevents proper evaluation of the JSON data?
Dumped events from node 3 whose last action is a ReqSn and which seems to crash too as its output is incomplete, trying to reproduce the failure using those logs. But I am still unable to reproduce the error thrown from update in HeadLogic, even though feedEvents now just discards LogicErrors, probably because the logs are truncated when the exception is thrown.
Trying to remove the error call and replace with a standard LogicError specialised for InvalidSnapshot. Still got a benchmark failure as some messages are not received.
I probably should give up for now... Cannot reproduce the failure using the logs which is really annoying, will try again later.
Back to work on the External PAB
What we are really interested in observing are the transactions that will be reflected to the Node as ChainTx values. Can we observe the "redeemer", or we don't need to, we just need to observe the inputs of the transactions (eg. the AbortTx Utxos come from the inputs)
We don't need much information from the tx coming back from the chain, because by definition they have been validated so they are correct. This implies we need to split the OnChainTx in 2, one for sending txs and one for receiving them. By separating the OnChainTx in 2 types, we add more logic to the head code but remove logic from the PAB/Contracts which is good.
For OnCloseTx we only need the snapshot number, and then we can verify the number against our latest confirmed snapshot:
- If it's same => OK
- If it's lower => post a contestTx
- If it's greater => We have a problem
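The three cases above amount to a single comparison. A small sketch, with hypothetical names (`CloseReaction`, `onCloseObserved`) and snapshot numbers modelled as `Natural`:

```haskell
import Numeric.Natural (Natural)

-- Possible reactions to an observed close transaction (names are illustrative).
data CloseReaction
  = AcceptClose -- closed with our latest confirmed snapshot
  | PostContest -- closed with an older snapshot: contest it
  | ReportInconsistency -- closed with a snapshot we never confirmed
  deriving (Eq, Show)

-- First argument: snapshot number carried by the observed close tx.
-- Second argument: our latest confirmed snapshot number.
onCloseObserved :: Natural -> Natural -> CloseReaction
onCloseObserved closedSn confirmedSn =
  case compare closedSn confirmedSn of
    EQ -> AcceptClose
    LT -> PostContest
    GT -> ReportInconsistency
```

For example, `onCloseObserved 3 5` yields `PostContest`, since the head was closed with a snapshot older than the one we have confirmed.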
Adapting MockChain to convert between on and posted transactions, so far it was only forwarding what it received.
In the PAB we now want to convert whatever we observe from the current state to an OnChainTx which we'll send to the client. We cannot use the same types in the PAB and the client (Node) because that introduces coupling, and the PAB shares types with the on-chain contract and we don't want to tie our Haskell code to Plutus code.
We are stuck on the abort test, the endpoint is called but server returns an error 500 saying it's not there, which might come from incorrect body (but not the case here) or more simply from the fact the endpoint promise is not called in the contract.
select in plutus says that:
-- | @select@ returns the contract that makes progress first, discarding the
-- other one
So if the first "progressing" contract is waiting on the chain for something, then you're stuck. We could use 2 different contracts activated, one for listening to transactions, another one for endpoints and interacting with the head, but plutus team is working on another solution: passing a Promise to the waitForupdate function that makes it possible to have endpoints active while waiting, leading to a Timeout in the waiter code.
We part ways for the week setting our goal for next week: Integrate real ledger into the head. We'll leave the PAB and Plutus stuff aside for the moment and focus a couple weeks on the Hydra node itself, possibly adding some frontend (wallet for end users, TUI for admins).
-
Discussing issue with ν_initial validator:
- We initially thought the PTs would be paid to public key of participants, but actually this does not work because we need to be able to post an abort transaction which implies any participant must be able to consume the multiple UTxO from the Init transaction.
- We need to have the output containing the PTs to be actually paid to a script, which is the ν_initial script, and have the verification key be passed as datum so that the commit transactions are valid iff their signing key matches verification key.
- Tricky thing about this - in order to "discover a Hydra head" the address to which the PT is paid is ideally known in advance
- Having a "single" validator for all Hydra head instances should be fine (according to researchers)
- We currently rely on the fact that the datum of the statemachine validator ν_SM is included in the `init` transaction
  - Manuel: This is not always the case! Datums do not *need* to be included in the transaction producing [the outputs holding] them.
-
How are we going to get the committed UTXO to pass to the collectCom endpoint to build the transaction? Right now in the test, they are known in advance but that won't be the case in real code, because the poster only has access to the chain? Or maybe not and just pass around the UTxO off-chain.
-
We need to pass in the ν_initial datum the parameter to reconstruct the address of the ν_sm state machine validator.
-
Added and vetted new coding standards
- Started to consolidate `master` with `contract-sm` and `origin/KtorZ/experiment-move-lift-params-to-datums` work streams on our plutus contracts
  - Individual modules for each of our three contracts (head statemachine, initial and commit validators)
  - Offchain / PAB glue code into the `Hydra.Contract.PAB` module
- SN discovered that doomemacs has a feature for yanking (and browsing) github at point using `SPC g y`
- Updating `plutus` and dependencies to continue investigation of weird behavior of smart contracts on semantically equivalent changes
  - New version of `plutus` changes how `endpoint` works, this function now takes a continuation
  - The default port and HTTP api paths have also changed
- After having it compile again, the changes which made it pass before do fail the test now whereas what was passing before does fail now!?
- Plutus team suspects it's ordering issues
Back to work after 2 weeks vacations, catching up.
Recreated yubikey after the first one got destroyed when I dropped the laptop on the wrong side, need to reorder a new one as spare. Fortunately, I had an encrypted volume containing the secret keys as backup so I was able to restore the keys on the new card relatively easily following Dr.Duh's guide https://github.com/drduh/YubiKey-Guide/#configure-smartcard.
The only snag I had was that gpg keeps the state of the key in its store, so reimporting it does not change the flag saying the key is on the card. I had to --delete-secret-keys manually to remove the key completely from the store, then reimport it, then move it to the smartcard.
Also recreating development VM. For some reason, the disks and FW rules still existed and were not completely destroyed when I used terraform destroy so I had to remove them from the console directly. Also, need to gcloud config configurations activate default to use the correct account settings. It's annoying gcloud does not allow per-directory configurations... Took about 1h8m to recreate haskell dev VM from scratch.
Reading layer2 market survey document coordinated by Ahmad. Seems like isomorphic transactions are really a distinguishing feature of Hydra compared to all other proposals, which either are limited to special transactions, e.g. payments, or rely on specialised contracts.
Checking Red bin to see if there's some useful employment of my time to do:
- Working on cleaning up working directory for tests
- Exposing a test prelude in a new package.
MB
- Done some more testing and exploration with MPT, in particular, playing around with different alphabet sizes.
- Refined a bit the test suite with a more precise formula w.r.t. the computation of the 'average' proof size.
- Read up on Verkle Trees and vector hashes
- More discussions and meeting on the Adrestia's side.
MB
- Continued the work on getting static addresses for init and commit contracts. This also included reworking a bit the existing contractSM so that it'd distribute the participation tokens to static init contracts, which can be observed from the `watchInit` endpoint. Doing so, I've also refactored a bit the module structure to more clearly separate on-chain from off-chain code.
- Opened a PR on Plutus to add an extra `MustSatisfyAnyOf` primitive for the `TxConstraints`, necessary for expressing some of the conditions we have in our various Hydra contracts. (https://github.com/input-output-hk/plutus/pull/3706)
- Investigated Plutus' contract size, and why they are so large. Opened an issue with some findings, and discussions with MPJ: https://github.com/input-output-hk/plutus/issues/3702
MB
- Mostly busy with Slack conversations and document reviews on various topics, but mainly, Adrestia, Plutus-core and the use of datum-parameterized contracts vs compilation-parameterized contracts.
MB
-
Stumbled upon https://vitalik.ca/general/2021/06/18/verkle.html, which I haven't read beside the intro which has enough to keep me captivated:
[Verkle Trees] serve the same function as Merkle trees. [...] The key property that Verkle trees provide, however, is that they are much more efficient in proof size.
-
Also discussed with Matthias Fitzi some possible improvements of the MPTs proofs:
- Each node could store its children's hashes as Merkle Trees, which would reduce the overall proof size by a factor of 3.
- We may want to try a shorter alphabet, to create slightly longer proofs but with fewer neighbors on each level.
- We've seen in previous simulations that even with short concurrency factors (~20), head networks would still perform reasonably well. So there's a right limit to find which leads to satisfactory performance.
-
I also had a go at inspecting the sizes of our Hydra contracts. They are rather big: 11KB for the state-machine, 8KB for the close and initial. We may want to consider optimizations to make scripts smaller.
MB
- Worked on an implementation of Merkle-Patricia Tree, including already a few of the necessary optimizations w.r.t. the storage of the prefixes.
- Writing a few QuickCheck properties revealed that we may have a proof size problem. While it is true that the size of proofs is in `log(|U|)` (for `U` the UTxO set), each element in the proof may embed up to 15 hashes, so for a reasonably large UTxO set, we end up with proofs carrying 35/40 hashes! Since a proof is needed *per input and per output* for each transaction, we may rapidly consume all the available space.
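To get a feeling for the numbers, here is a rough back-of-the-envelope sketch. It is not the MPT implementation: it assumes a perfectly balanced trie, takes the branching factor as the alphabet size, and counts the worst case of `branching - 1` sibling hashes per level. All function names are made up for this note.

```haskell
-- | Smallest depth d such that branching ^ d >= n, i.e. the number of
-- levels a balanced trie needs to hold n elements.
trieDepth :: Integer -> Integer -> Integer
trieDepth branching n
  | branching < 2 = error "trieDepth: branching factor must be at least 2"
  | otherwise = go 0 1
  where
    go d capacity
      | capacity >= n = d
      | otherwise = go (d + 1) (capacity * branching)

-- | Upper bound on the number of hashes in one proof: every level may
-- carry up to (branching - 1) sibling hashes.
worstCaseProofHashes :: Integer -> Integer -> Integer
worstCaseProofHashes branching n = (branching - 1) * trieDepth branching n

-- | Same bound in bytes, for a given hash size in bytes.
worstCaseProofBytes :: Integer -> Integer -> Integer -> Integer
worstCaseProofBytes hashSize branching n =
  hashSize * worstCaseProofHashes branching n
```

With a hexadecimal alphabet, `worstCaseProofHashes 16 1000` gives 45 hashes (3 levels of 15), whereas a binary alphabet gives `worstCaseProofHashes 2 1000 = 10` hashes over 10 levels. Real tries are sparser than this bound, but it shows why the alphabet size matters so much.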
- We have been observing weird / blocking issues with Plutus: https://github.com/input-output-hk/plutus/issues/3648
- Continued the work on the Hydra-Plutus-PAB integration to remove the hard-coded contestation period and have it part of the `init` transaction. Like for a few other types (e.g. `HeadParameters`, `Party`) we have duplicate type definitions for similar concepts between the Plutus on-chain code and the rest of the application.
- We discussed (and rejected) the idea of removing that duplication in favor of a single type definition at the root of the dependency tree. A single definition would be rather unsatisfactory because:
  - Although data-types have the same names and represent similar concepts on-chain and in the application, they aren't necessarily an exact overlap. Thus, we would end up with types that are more complex than they need to be, because they need to satisfy more downstream consumers.
  - More importantly, the Plutus machinery is more restrictive on what primitive types can be used. For example, there's no `Word` in Plutus, only `Integer`. No `DiffTime`, only `Integer`. These restrictions force data types defined for Plutus to lose a lot in expressiveness and safety compared to what we would normally do for a Haskell program. Thus, while a little duplication is unfortunate, it actually helps to get a nicer design for the main, off-chain, parts of the application while providing a minimal API surface for the on-chain code.
-
As a bridge between on-chain and off-chain types, we rely on JSON serialization (which fits nicely in the PAB design). Thus, PAB clients submit parameters as plain JSON, which gets deserialized into their on-chain compatible version using a more restricted set of primitives. Since this approach introduces some duplication in type definitions, it becomes all the more important to ensure it works as expected through tests, for which property-based roundtrip tests are a good fit. Our first property even caught a legitimate mistake of ours on its first execution.
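A minimal sketch of what such a roundtrip property looks like, leaving out the JSON layer and using only `base`. The two newtypes and the conversion functions are hypothetical stand-ins for the real off-chain / on-chain pair; with QuickCheck, `prop_roundtrip` would be checked on generated values.

```haskell
import Numeric.Natural (Natural)

-- | Off-chain view (hypothetical): a contestation period can never be negative.
newtype ContestationPeriod = ContestationPeriod Natural
  deriving (Eq, Show)

-- | On-chain view (hypothetical): only 'Integer' is available in Plutus.
newtype OnChainContestationPeriod = OnChainContestationPeriod Integer
  deriving (Eq, Show)

-- | Going on-chain always succeeds.
toOnChain :: ContestationPeriod -> OnChainContestationPeriod
toOnChain (ContestationPeriod n) = OnChainContestationPeriod (toInteger n)

-- | Coming back may fail, as the on-chain type admits negative values.
fromOnChain :: OnChainContestationPeriod -> Maybe ContestationPeriod
fromOnChain (OnChainContestationPeriod i)
  | i < 0 = Nothing
  | otherwise = Just (ContestationPeriod (fromInteger i))

-- | Converting to the on-chain type and back yields the original value.
prop_roundtrip :: ContestationPeriod -> Bool
prop_roundtrip p = fromOnChain (toOnChain p) == Just p
```

The asymmetry between `toOnChain` (total) and `fromOnChain` (partial) is exactly the loss in expressiveness discussed above.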
Somewhat ad-hoc agenda on the Close-OCV again:
- MPT helps only for fan out txs .. but not when we are on the Close / Contest
- MPT would move the "space burden" into redeemers instead of datum, but still requires non-bound size for validating "hanging transactions", right?
- Applicability of Hanging transactions should be encodable by MPT insert / remove operations
- Sandro will check again
- We can/will walk through that with him
- Coordinated protocol (= no hanging transactions) might not be affected by this
- Signed snapshots would suffice to validate in close/contest
- Only fanout needs to check presence of utxos in datum
- Refactored `HeadState` and `HeadStatus` back into a single data type to make invalid states unrepresentable: `ReadyState` won't ever have a valid list of `parties` as no `InitTx`, which announces the list of participants, was observed before that state
- Reviewed status quo of `ExternalPAB` and discussed various aspects of it (like paying PTs to pubkeys or some quirks of the PAB)
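The idea behind merging the two types can be sketched as follows. Only `ReadyState` and the `parties` notion come from the notes above; the other constructors and the helper functions are assumptions made for illustration.

```haskell
newtype Party = Party String
  deriving (Eq, Show)

-- | A single sum type: the list of parties only exists in states that
-- can be reached after an init transaction has been observed.
data HeadState
  = ReadyState
  | InitialState [Party]
  | OpenState [Party]
  | ClosedState [Party]
  deriving (Eq, Show)

-- | There is simply no list of parties to return in 'ReadyState'.
partiesOf :: HeadState -> Maybe [Party]
partiesOf ReadyState = Nothing
partiesOf (InitialState ps) = Just ps
partiesOf (OpenState ps) = Just ps
partiesOf (ClosedState ps) = Just ps

-- | Observing the init transaction is the only way to learn the parties.
observeInit :: [Party] -> HeadState -> HeadState
observeInit ps ReadyState = InitialState ps
observeInit _ other = other
```

With a separate status field plus a record holding `parties`, a "ready" head carrying a meaningless party list would still type-check; here it cannot be constructed.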
- As https://github.com/input-output-hk/plutus/issues/3546 was resolved, I set off to update `plutus` dependencies
- This time a simple bump of `plutus` and `cardano-ledger-specs` was enough to satisfy `cabal`
- Two additional changes required in the code though:
  - `IsData` is now three type classes `ToData`, `FromData` and `UnsafeFromData`
  - Boilerplate/gluecode for PAB is now using the `HasDefinitions` type class
- The reproducer does now work!
- We cannot update our code statemachine to use `Just threadToken` though as we do (currently) rely on forging our own tokens (including the PTs sharing the same currency symbol) and recent changes made `Plutus.Contract.Statemachine` forge thread token automagically.. which is not what we want 😿
- Talked about Tail simulation results, what would be a representative experiment for "general applicability" of the tail protocol, as well as future extensions / adaptations and how they would compare to something like Lightning network on Hydra Heads. Which also seems to be a good avenue to payment use cases for Hydra Heads.
- Read about and looked into the Raiden network
- This onboarding document seems to be a good "introduction" to their tech (stack)
- The spec is quite heavy-weight, but feels a bit "ad-hoc" or engineered rather than backed by research
- The raiden services seem to be "adding value" by short/cheap path finding and offline-capability (like LN watchtowers?) in exchange for some `RDN` token fee (their ROI?)
Solving together the issue with snapshots not being emitted for transactions once we run out of transactions to submit. Wrote failing behavioural test, solutions proposed:
1. have a concurrent snapshot thread like in hydra-sim
2. making sure we can have more than one snapshot in flight
Trying 2. as having multiple threads is unappealing. Test is still failing after changing `seenSnapshot :: Maybe Snapshot` to `seenSnapshots :: Map SnapshotNumber Snapshot`
We need to debug what's going on as the failure message is unhelpful => dump IOSim traces when something goes wrong in BehaviorSpec
Taking a step back and thinking how we should solve the snapshot number problem.
We need to add a snapshot number in the state, storing the nextSnapshotNumber and updating it in 2 places: When one emits a ReqSn and when one receives it, the former happening only when node is a leader
It's a monotonically increasing counter but it's redundant
Other solution is to store a Maybe Snapshot in the index, instead of a snapshot, so that the leader can use the index without constructing the snapshot
Test is failing because the snapshot 2 contains 2 confirmed txs instead of one => we should probably update the seenTxs as soon as we emit a ReqSn?
- If we remove the snapshotted txs from the seenTxs as soon as we emit a ReqSn => it does not work
Adding more edge cases for leader handling in snapshot emission, code seems more and more complicated
- We could also simply not handle `ReqTx` when there is a snapshot in flight in the leader?
Reverting back to where we had failing tests (And traces from IOSim logs), fiddling with merge/revert conflicts
Trying a mixed approach, not having a separate thread but having a separate event for requesting a new snapshot. The idea is that as we enqueue an event for each transaction anyway, we don't lose any of them: the new snapshot will be created with whatever exists at the time of its processing, and if there is another snapshot in flight, we will wait/discard it.
We managed to get "parallel" benchmark working by using a NewSn message that decorrelates the request for a new snapshot from the actual creation of snapshot.
The NewSn message is enqueued and waited for if there's already a snapshot in flight, and discarded if there aren't any transactions to snapshot (seenTxs is empty).
This alleviates the need to have a separate thread running to trigger the snapshot, and it also works if we want a finer grained snapshot policy, like after N txs.
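The decision taken when processing a NewSn can be sketched like this. The state record and the `Outcome` type are heavily simplified stand-ins for the real head logic, and checking "in flight" before "empty" is an assumption of this sketch.

```haskell
-- | Hypothetical, simplified view of what the leader knows.
data LeaderState tx = LeaderState
  { seenTxs :: [tx] -- transactions not yet part of a snapshot
  , snapshotInFlight :: Bool -- a ReqSn was emitted but not yet confirmed
  }

-- | What to do with a NewSn message.
data Outcome tx
  = Wait -- put NewSn back in the queue and retry later
  | Discard -- nothing to snapshot, drop the message
  | RequestSnapshot [tx] -- emit a ReqSn for these transactions
  deriving (Eq, Show)

onNewSn :: LeaderState tx -> Outcome tx
onNewSn st
  | snapshotInFlight st = Wait
  | null (seenTxs st) = Discard
  | otherwise = RequestSnapshot (seenTxs st)
```

A finer grained policy (e.g. snapshot only after N txs) would just be another guard before the last one.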
Then fixing HeadLogicSpec unit tests which are now failing because the snapshot logic has changed. Push to master was a bit too hasty...
- Read and discussed recent Hydra research
- Extended the visual roadmap in discussions with researchers and product manager
An interesting minor suggestion for improving code reviews, and commit messages: https://ncjamieson.com/conventional-comments/ If we insist on doing reviews that is...
An interesting scalability paper co-authored by C.Decker, the guy behind eltoo.
Writing a unit test exposing the problem we are seeing with our parallel benchmark, namely that we get a signature for a snapshot we have not seen yet:
(OpenState headState@CoordinatedHeadState{seenSnapshot, seenTxs}, NetworkEvent (AckSn otherParty snapshotSignature sn)) ->
case seenSnapshot of
Nothing -> error "TODO: wait until reqSn is seen (and seenSnapshot created)"
Just (snapshot, sigs)
Reverting back to when we had parallel confirmations to try to load test the cluster leads to another failure -> Try providing a more helpful message when a waitMatch fails in ETE tests, as the current one is not very useful
- Wait function was missing `HasCallStack` => no stack trace, wrong information from the failing tests
- The wait times out and there aren't any messages received, this is puzzling. I can see the snapshot being confirmed in the node's log and the `ClientEffect` trace, so could it be an artefact of deserialisation?
- It seems we never get more than one snapshot when submitting txs in parallel, which looks like an issue in the way we are doing the protocol
- There aren't any `Wait` effects in the logs, so this means we never get into the situation where a tx or snapshot cannot be handled
I think I understand what's going on: We only request a snapshot when processing a transaction and there's no snapshot currently being processed, but given we have a single queue, we end up submitting all txs, then doing one snapshot, but other transactions do not trigger a snapshot request because there is still an unconfirmed snapshot in flight. Then the snapshot ends up being confirmed, but there is no longer any transaction to trigger ReqSn.
Managed to edit the API documentation file with some descriptions, now trying to generate human-readable documentation from it. I can transform the document to JSON using
$ yq . hydra-node/api.yaml
which is a thin wrapper over jq and takes the same kind of expressions. Added nix expressions to the shell.nix for jq and yq, I guess only the latter is necessary as jq is a dependency of it. Trying to add a python3 package called [https://github.com/coveooss/json-schema-for-humans] but it does not work in nix: The package is not part of the nix database. Rodney and others have some pointers on how to add a python package to nix.
- To install non-standard python packages, follow instructions here: https://nixos.wiki/wiki/Python. This basically means writing a nix derivation that installs the package and invoking it in the shell...
- There's also mach-nix which provides tooling to produce nix derivations from python requirements.
Don't know why but I got a déjà vu feeling with those JSON Schemas, like I was back in the days of XML processing where XML was everything and everything was described in XML, with complex tools to parse, analyze, validate, merge/split/transform XML documents. It's not as bad with JSON but still feels quite similar. I guess the question is, as always, what's the best format for specifying interfaces and APIs: A pivot format from which to generate or verify code, or code from which to generate doc?
- Having a quick look at the generic Schema generator for Swagger in Haskell: It does not extract fields comments from data types records which is annoying as this means we'll need to repeat the same information twice.
Working on the ETE benchmark test, generating more transactions to input. We move the generator from the test package to the code package which introduces a dependency on QC in hydra-node, which is probably fine as we already depend on it in the Prelude => add more stuff from QC to the Prelude?
We probably want to separate submitting of transactions from confirmation in 2 different threads in order to make sure we observe confirmation as soon as possible, while loading the server with more transactions.
We are struggling to get a Set from a Vector of Value, until we realise the solution is simple: there is a Foldable toList method!
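For the record, the trick works for any `Foldable` container, boxed `Data.Vector.Vector` included. The sketch below uses `Data.Sequence` as a stand-in so that it only depends on `containers`; `toSet` is just a name picked for this note.

```haskell
import Data.Foldable (toList)
import qualified Data.Sequence as Seq
import Data.Set (Set)
import qualified Data.Set as Set

-- | Collect the elements of any 'Foldable' into a 'Set'.
toSet :: (Foldable f, Ord a) => f a -> Set a
toSet = Set.fromList . toList

-- | Duplicates disappear and elements end up ordered.
example :: Set Int
example = toSet (Seq.fromList [3, 1, 3, 2])
```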
After parallelizing submission and confirmation of txs, we get an error in the waitForPeersConnected function when running the test. We spend some time troubleshooting it:
- This is weird as the error seems to be happening at the beginning of the test but we can see the nodes get transactions and messages so this means there is a thread that keeps running and times out somewhere.
- Something's fishy in the `waitForAll` function, adding traces to understand what's going on
- Adding more traces around the `timeout` call: Could it be triggered asynchronously somehow?
- Node 1 is starting to wait for peers connected again at some point after the initial head is open
- Adding more traces around various `waitXXX` functions
- It looks as if it was calling `waitForNodesConnected` a second time after the first round, like there was some thread running in parallel doing that only for node1
- Trying to reformat the code and use `concurrently` instead of `concurrently_`, explicitly discarding the result
- Turns out the `action` is actually run twice (or more): When we are disconnected, it throws an `IOException` and this is always handled by `tryConnect` as a connection failure, which triggers re-running the action. We should only catch exceptions thrown by `runClient` and not the other ones, but this is not possible as failure to connect and disconnection seem to be represented as the same exception type.
- There is a race condition in using the `race_` function between detecting the process failed with exit code <> 0 and failing to connect to it, which leads to non-deterministic test results
We work around the issue by ensuring we don't retry to connect once the action has started running, which is clunky and uses a Bool flag but works well
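The workaround boils down to something like the sketch below. It is not the code from our test suite: `retryUntilStarted` and `onIOException` are made-up names, the `runClient` argument stands for any function that connects and then runs the given action, and there is neither delay nor retry limit, which a real version would want.

```haskell
import Control.Exception (IOException, catch, throwIO)
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

-- | Retry connecting on 'IOException', but only as long as the action
-- has not started: once it runs, exceptions are propagated.
retryUntilStarted :: (IO a -> IO a) -> IO a -> IO a
retryUntilStarted runClient action = do
  started <- newIORef False
  let attempt =
        runClient (writeIORef started True >> action)
          `catch` onIOException started attempt
  attempt

-- | Decide between retrying and re-throwing, based on the flag.
onIOException :: IORef Bool -> IO a -> IOException -> IO a
onIOException started retry e = do
  hasStarted <- readIORef started
  if hasStarted then throwIO e else retry
```

The flag is needed precisely because connection failure and disconnection cannot be told apart by exception type.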
Converting the SimpleTx generator to use getSize to be able to generate more transactions -> we see the process crashing and the error about reqSn not being properly implemented
Switching to upgrading dependencies, making sure we can get the latest plutus stuff from SN's branch. Plutus tests are failing and unfortunately the error message is not very informative:
[WARNING] Slot 3: W1: Validation error: Phase2 3ffcc708303460d9cb6871495ae3391ad855745bcec9d5af02c662705eb29c74: ScriptFailure (EvaluationError [])
The Init transaction is failing so the commit is failing too as it does not have the participation token to spend. Following luigi's advice in issues we raised, all the tests in the upgrade dependencies branch are now passing. The issue was a mismatch in the type of the monetary policy validator: A new parameter was added in a PR recently, like 1 month ago.
Trying to troubleshoot our close contract again. Removing the call to close endpoint still shows collectCom failure.
master branch is passing the test, so perhaps the issue comes from our types?
Trying to add small changes to the types to see if tests still pass here. Might be an issue with INLINABLE but this should break at compile time.
Test still fails with only a change to the types, trying to just add a simple no-arg constructor => still fails
Changing the order of ctors with a no-arg constructor pass the test, but not with Closed Snapshot.
There seems to be interaction between the order of constructors and the order of the case branches in the validator ??
Adding Close/Closed to state/transition at the end of datatypes make the test fail on close... which is weird
Adding traceIfFalse statements to check what exactly is failing (not obvious from Plutus' emulator messages) -> not very conclusive either
We should probably try with a more recent version of Plutus and check if we have same errors/better error reporting. Plutus SC is on our critical path anyway so no point in side-stepping it, but the Plutus team is drowning under pressure and deadlines. Looks like we are hitting a wall, next step is:
- upgrade dependencies and see if we can move forward
- circle back with Plutus team for some help
Moving to implementing benchmark
Goal is to have a simple benchmark, running a number of nodes and hitting those nodes with transactions through their clients. Dimensions of the bench are: number of nodes, concurrency level, also structure of transaction.
Discussing the respective merits of monotonic time, clock time, Data.Time or System.time packages...
io-sim-classes uses a DiffTime to represent differences and also to represent monotonic time.
monotonic time starts at undefined moment in the past (start of system) but is a Word64 in Haskell => No need to care about all this right now
Got JSON output of each transaction submission time and confirmation duration, in the form of a list Now refactoring tx submission to actually confirm all txs that are returned by the snapshot confirmed message
Got benchmark compiling and outputting the confirmation time of a single transaction, extracting txs and txid from the JSON values we get from the server. Next step is to send more transactions from a single node, then send transactions in parallel, and finally send them to several nodes.
See Miro board
Viewing Testing smart contracts by John Hughes
Trying to start again implementing a proper model for the head smart contracts, based on https://alpha.marlowe.iohkdev.io/doc/plutus/tutorials/contract-testing.html and John Hughes video. I think this should be our very first next step because it will help us get a complete picture of the smart contracts we need to implement and guide the implementation whatever form it can take, individual validators or state machine based. I want to get back to the paper and formalise the SM specification there in code.
- Defining value dimensions for a Layer 2 solution:
- Speed
- Transaction cost
- Security model, custodial vs. non-custodial, level of trust required
- Decentralisation
- Ledger capability
- Scale of participants
- ...
- Map different solutions on a spider web chart (aka. radar chart)
- Do the same thing for technical parts/components needed, defining how feasible they are:
- The dimensions are the technical components of possible solution(s)
- The scale of the dimension is the maturity level of that particular technical part
- Solutions are composed of components
- We can then relate the "desirability"/value of a solution in front of its "feasibility"/maturity
- This should be done collaboratively with various stakeholders in order to foster discussions on values, solutions, dimensions
Idea: Create a Hydra testnet with several Hydra nodes connected together, that expose an API that can be used by clients, eg. a dedicated wallet for experimenting.
Detailed notes here along with link to Miro board.
- Added notes on eltoo to Lightning network page.
- Fun fact: The LND implementation of lightning network daemon has over 1000 files of Go source code
These slides from Orfeos are pretty much useless without accompanying talk
Following links from eltoo site, I found Yet another micropayment network proposal.
- Updating dependencies in `cabal.project` to build the minimal example from 2 days ago with most recent `plutus` version
- This is non-trivial! When simply bumping all the `source-repository-package` tags to the ones plutus is using, I get a conflict when resolving dependencies
- Disabling `hydra-node` (and `local-cluster`) helps, as the conflict is somehow because of our use of `cardano-node` / `ouroboros-network` vs. its usage via plutus in the `hydra-pab` exe?
- To get the nix-shell updated, this is also involving a lot of updating sha256 sums and taking some things from `cardano-node` and others from `plutus`, e.g.
  - if we use the same `haskell.nix` rev as `plutus`, but stick with `ghc8104`, this would have the compiler be built along with ALL the packages (plutus seems to be using a custom ghc)
  - using the `haskell.nix` rev from `cardano-node`, requires us to be using a slightly older hackage index-state and use `ghc8105` to get at least SOME of the dependencies and not need to compile ghc
- An alternative is to use an ad-hoc shell to get my hands on ghc and cabal, e.g. `nix-shell -p ghc -p cabal-install -p pkgconfig -p zlib`
  - the packages are not enough and we need the patched libsodium
  - so I updated `shell.nix` with a "cabal-only" / "non-haskell.nix" variant of a shell derivation accessible by `nix-shell -A cabalOnly`
- Seems like multiple things have changed in recent `plutus`
  - `MonetaryPolicy` is now named `MintingPolicy`, wrapping / compiling it is different, several constraints have been renamed, `BlockChainActions` seems to have been removed etc.
  - Instead of fixing all this, I removed non-repro code from the cabal module list
- Finally, I needed to update the `SM.hs` code to use the new `Plutus.Contract.StateMachine.getThreadToken` function and could re-run the minimal reproducer example -> `runStep` still fails with `InsufficientFunds`
- Reported this as an issue
- Added analysis of how many of the confirmed transactions are within `1`, `10` and `0.1` slots and discussed that with researchers. Here are the previously recorded results for the `1000` slot compression (`25323` slots) with `s=10800`, with pro-active snapshotting at `p=0.8`:
txs-1000clients-25323slots-10800s-0.8p.csv
Analyze
{ numberOfConfirmedTransactions = 17999
, averageConfirmationTime = 625.6537533845784
, percentConfirmedWithin1Slot = 0.9408300461136729
, percentConfirmedWithin10Slots = 0.9408856047558197
, percentConfirmedWithinTenthOfSlot = 0.9245513639646648
}
and without pro-active snapshotting
txs-1000clients-25323slots-10800s.csv
Analyze
{ numberOfConfirmedTransactions = 19761
, averageConfirmationTime = 747.7579325380825
, percentConfirmedWithin1Slot = 0.9313293861646678
, percentConfirmedWithin10Slots = 0.9313799908911492
, percentConfirmedWithinTenthOfSlot = 0.9168058296644906
}
- I also did perform simulation of the data with `100` slot compression (`253227` simulated slots) to see how a ~3h settlement delay `s=10800` would fare
txs-1000clients-253227slots-10800s-0.8p.csv
Analyze
{ numberOfConfirmedTransactions = 46952
, averageConfirmationTime = 980.7909634163951
, percentConfirmedWithin1Slot = 0.9134861134775941
, percentConfirmedWithin10Slots = 0.9135500085193389
, percentConfirmedWithinTenthOfSlot = 0.8959362753450332
}
- Surprisingly, the performance is not as good as on the `1000` slot compression data. We would have expected that the settlement delay would fall "in between" transactions more often with a data set with less traffic. In discussions we speculated that the data might be biased due to the nature of the simulation, where all txs are "fast" until `s` and in the end all txs require snapshots (as no funds are incoming)
  - Proposed solution: Add an option `--discard-edges` to the analysis, i.e. keep it for measuring confirmation times (influence), but not calculate it into the results.
  - This required a breaking change in how the results are written to disk: Previously the txref was "node + slot + .." packed as string and lexical sorting was not honoring "when" a transaction happened. So I do record now the slot + this label as a `TxRef`, making the type effectively ordered by slot, but factually invalidating all previous results (they were not ordered), so I just added the slot as another column in the CSV to avoid confusion
- Given another simulation run on the `20000` slot compression (`1266` slots) set with `s=100`, we expect that the "edges" in the range of `s` with slots 0-100 and 1166-1266 (or even more) are biased in one way or the other (i.e. all txs until `s` are "fast"). The non-discarded results are:
λ cabal exec hydra-tail-simulation -- analyze --discard-edges 0 txs-1000clients-1266slots-100s.csv
Analyze
{ numberOfConfirmedTransactions = 37448
, averageConfirmationTime = 10.018317012686664
, percentConfirmedWithin1Slot = 0.9032258064516129
, percentConfirmedWithin10Slots = 0.9038666951506088
, percentConfirmedWithinTenthOfSlot = 0.879619739371929
}
- Discarding edges for `100` and `200` confirms that it "settles" in the "center" of the simulation run, although only slightly in this example:
λ cabal exec hydra-tail-simulation -- analyze --discard-edges 100 txs-1000clients-1266slots-100s.csv
Analyze
{ numberOfConfirmedTransactions = 30787
, averageConfirmationTime = 10.968831519346255
, percentConfirmedWithin1Slot = 0.8934940072108357
, percentConfirmedWithin10Slots = 0.8942735570208205
, percentConfirmedWithinTenthOfSlot = 0.8709845064475266
}
λ cabal exec hydra-tail-simulation -- analyze --discard-edges 200 txs-1000clients-1266slots-100s.csv
Analyze
{ numberOfConfirmedTransactions = 26721
, averageConfirmationTime = 10.80831723318664
, percentConfirmedWithin1Slot = 0.8944276037573444
, percentConfirmedWithin10Slots = 0.8953257737360129
, percentConfirmedWithinTenthOfSlot = 0.872609558025523
}
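For reference, the kind of computation behind these numbers can be sketched as below. This is not the `hydra-tail-simulation` code: the record, the function names, the unit of `confirmationTime` (slots) and the inclusive treatment of the edge boundaries are all assumptions of this sketch.

```haskell
-- | Hypothetical row of the results CSV.
data ConfirmedTx = ConfirmedTx
  { slot :: Integer -- slot in which the transaction was submitted
  , confirmationTime :: Double -- time to confirmation, assumed in slots
  }

-- | Drop transactions submitted in the first or last @edge@ slots of a
-- run that ended at @lastSlot@.
discardEdges :: Integer -> Integer -> [ConfirmedTx] -> [ConfirmedTx]
discardEdges edge lastSlot = filter inCenter
  where
    inCenter tx = slot tx >= edge && slot tx <= lastSlot - edge

-- | Fraction (between 0 and 1) of transactions confirmed within a limit.
percentConfirmedWithin :: Double -> [ConfirmedTx] -> Double
percentConfirmedWithin limit txs
  | null txs = 0
  | otherwise = fromIntegral (length fast) / fromIntegral (length txs)
  where
    fast = filter ((<= limit) . confirmationTime) txs

-- | Mean confirmation time, 0 for an empty data set.
averageConfirmationTime :: [ConfirmedTx] -> Double
averageConfirmationTime txs
  | null txs = 0
  | otherwise = sum (map confirmationTime txs) / fromIntegral (length txs)
```

Discarding edges only changes which rows enter the statistics; the discarded transactions still influenced the simulated confirmation times of the ones that remain.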
-
Two-party payment channels secured via hashes + time locks
-
Networking effect comes from routing payments using layered hashes/secrets (similar to Tor / onion routing)
- natural consequence: Lightning only works with fungible tokens!
- great for privacy as each party only knows previous and next hop
-
Liquidity is a big problem
- "can't receive until sent", Wallets tackle this by providing channels on-demand, e.g. Phoenix's pay-to-open
- by default, each payment channel needs to be liquid enough to forward the transaction's value -> hard to pay $1M through lightning
- spreading payments over multiple channels is getting researched (and implemented?) recently
-
Lightning nodes need to be online to be safe .. right?
- there is this game-theoretic way of punishing if peer broadcasts old commitment txs (LN-penalty)
-
Watchtowers
- it seems like these are used to allow lightning nodes to be offline for a longer time without losing much safety
- they detect and ensure that no old states are posted on chain and do even dispute with more recent states of the payment channel
- typically implemented as a third party service to which lightning nodes send encrypted data with the tx id triggering the dispute being the encryption key
-
eltoo
- a not-yet-implemented way to enforce continuity of states without incorporating all of them (?)
- drop-in replacement of the penalty mechanism
- My thoughts on this: Are watchtowers a way to make the Head protocol somewhat offline capable as well? i.e. backup multi-signed snapshots for potential contestation before you go offline .. obviously this is a trade-off for privacy (the watchtower sees all intermediate snapshots) unless we can also do this in an encrypted fashion?
-
How are non-custodial lightning wallets possible?
- They seem to have a lightning node integrated / running
- But only use a "light" bitcoin node to interact with the main chain (primarily as the storage required is huge), e.g. neutrino
-
(AB) took some notes on Lightning Network paper
- Talked to a developer of Marlowe as I found out that they are looking into "Merkelization" of the interpreter AST, which seems to be quite similar to our (researchers) ideas of using MPTs for not needing to store the whole UTxO set in the Hydra mainchain tx.
- Besides multiple organisational things, I started to look into the issue of `Plutus.Contract.StateMachine`'s `runStep` erroring with `InsufficientFunds` when using `Just threadToken`
- I started by creating a minimal reproducer example which consists of a very simple plutus state machine with two states `data State = First | Second` and a single `data Input = Step` as well as this trivial `transition oldState _ = Just (mempty, oldState{stateData = Second})`
- The following code is then printing
SMCContractError (WalletError (InsufficientFunds \\\"Total: Value (Map [(,Map [(\\\\\\\"\\\\\\\",99985508)])]) expected: Value (Map [(858535eed6775064eed795dd9261d258dd97ad51983877cc3df52e3a10ed6108,Map [(\\\\\\\"thread token\\\\\\\",1)])])\\\"))
contract :: Contract () BlockchainActions String ()
contract = do
threadToken <- mapError show Currency.createThreadToken
logInfo @String $ "Forged thread token: " <> show threadToken
let client = stateMachineClient threadToken
void $ mapSMError $ SM.runInitialise client First mempty
logInfo @String $ "Initialized state machine"
res <- mapSMError $ SM.runStep client Step
case res of
SM.TransitionFailure (SM.InvalidTransition os i) -> logInfo @String $ "Invalid transition: " <> show (os, i)
SM.TransitionSuccess s -> logInfo @String $ "Transition success: " <> show s
where
mapSMError = mapError (show @String @SM.SMContractError)
- Next: updating `cabal.project` to a newer version of `plutus` and all our transitive dependencies
Working on PAB again, trimming down what we have done so far to the bare minimum.
The goal is to have the InitTx transaction with PTs posted and observed from the mainchain:
- `setup` contract creates the thread token and starts the SM
- We need to have the cardano pubkeys available to post the init transaction, because we want to pay PTs to those pubkeys
- We see the transaction that creates a UTXO for token creation purpose, then the transaction that starts the state machine, with the thread token being added to the initiator's wallet.
- We got stuck with explicit `threadToken` threading, seems like it's not actually implemented right now so passing the thread token to the `mkStateMachine` function does not work and makes the transactions unbalanced.
  - https://github.com/input-output-hk/plutus/pull/3452 is the PR that introduces auto-forging of ST
- We want to reuse the same `CurrencySymbol` for the ST and the PTs but this won't be possible after that PR is merged.
- Using `Currency.forgeContract` to forge both ST and PTs but ignoring the former at the moment because it makes the transition fail.
Added wallet identifier to run `withExternalPAB` and have a test with 2 parties so that we can actually be sure the party that checks the transaction has been posted is different from the one that actually initiates the head.
To observe the init transaction being posted, we listen to outputs paid to our pubkeyhash with some arbitrary currency symbol (the "unique id" of our head) and our pubkeyhash as a token name (and amount of 1).
The fact we have a Party in the node and another one in the contracts code is annoying => we have the same wire format so it's fine for the moment, but a Party should not be tied to specific types in the Crypto module, it should just provide material to build keys which are just bytestrings
- Produced plots for multiple scenarios with high settlement times (as started yesterday)
- Obviously we have either very fast transactions or transactions just after the settlement delay, but there is also a noticeable set of txs at `2s`
- Investigating why this could be:
  - The tail server does decide when a client `NeedSnapshot`, based on the results of `matchBlocked`
  - There are multiple `withTMVar` blocks -> suspecting a race condition
- Created a reproducer `events.csv`:
slot,clientId,event,size,amount,recipients
0,1,pull,,,
0,2,pull,,,
0,1,new-tx,297,53964900,2
1,1,pull,,,
1,1,new-tx,297,53964900,2
2,1,pull,,,
2,1,new-tx,297,53964900,2
results in these times with s=3600:
txId,confirmationTime
1053964900297[2],1.1441610433e-2
1153964900297[2],7200.044700244114
1253964900297[2],1.1418798826e-2
- After adding some `trace`s and above minimal `events.csv`, it can be found that the second tx is `reenqueue`d, but the third transaction actually gets handled before, thus delaying the second tx again -> double settlement delay on the confirmation time
- Refactoring the code to not re-enqueue, but handle the message directly on `SnapshotDone` improves this particular situation, but due to concurrency in the server there is still a chance of "new" `NewTx` being handled before the "re-handled" `NewTx` and even a confirmation of around `3s` was seen
- Implementing the `--pro-active-snapshot` was not trivial due to the lack of tests, but with a small / constructed set of events and some `trace` it could be done (.. should've really written test myself this time :/)
- Preliminary results of the 1000 compression data set (24320 compressed slots) with `s=3600` and pro-active-snapshot limit of `p=0.8` show a `242.50993413052075` slot average confirmation time, which is quite a bit better than the `313.5453135554798` slots average confirmation without the pro-active snapshotting.
- There is a different number of `confirmedTransactions` when running the same dataset just with `--pro-active-snapshot` on and off, why?
- Likely caused by the way the simulation is structured:
  - Threads are forked for running the server and client loops, each client processing `[Event]` and reacting on messages from the server (e.g. blocking the loop while doing a snapshot)
  - Each client is having a local notion of `currentSlot` and gets delayed when blocking the event processing loop
  - The simulation is stopped after a pre-calculated time, currently `lastSlot of events + 2 * settlementDelay` converted to seconds
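The two rules mentioned above (snapshot pro-actively at a window limit `p`, and stop the simulation at `lastSlot of events + 2 * settlementDelay`) can be sketched as follows. This is only an illustration: the function names, the reading of the limit as a fraction of the payment window `[-w, w]`, and the types are assumptions, not the actual `hydra-sim` code.

```haskell
-- Sketch only: names and types are illustrative, not hydra-sim's API.

-- | Should a client snapshot pro-actively? Assumes the payment window is
-- [-w, w] and that we trigger once the balance has used a fraction p of it.
needsProActiveSnapshot :: Double -> Integer -> Integer -> Bool
needsProActiveSnapshot p w balance =
  fromIntegral (abs balance) >= p * fromIntegral w

-- | Slot at which the simulation is stopped:
-- last slot of the events plus twice the settlement delay.
simulationEndSlot :: Integer -> Integer -> Integer
simulationEndSlot lastEventSlot settlementDelay =
  lastEventSlot + 2 * settlementDelay

-- | Convert a number of slots to seconds, given the slot length in seconds.
slotsToSeconds :: Double -> Integer -> Double
slotsToSeconds slotLength slots = slotLength * fromIntegral slots

main :: IO ()
main = do
  print (needsProActiveSnapshot 0.8 100 79)    -- False
  print (needsProActiveSnapshot 0.8 100 (-85)) -- True
  print (simulationEndSlot 24320 3600)         -- 31520
```

A client that is still blocked on a snapshot when the end slot is reached never gets its remaining transactions confirmed, which would be consistent with the differing `confirmedTransactions` counts noted above.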
Topic: Discuss Custodial Hydra Head or whatever we should rather aim for as our MVP
- Provide some context and back story to Duncan
- There is always "something custodial"
- It's not like "this" or "that"
- Custodial systems raise regulatory obligations to the operating parties
- Hydra Pay (Tail) is also in this situation, the server is also a custodian
- If you are processing other people's payments, you need to register (in many jurisdictions)
- Are we facing these issues with all of our variants?
- Lightning is not having this problem? It's on a smaller scale (really?)
- Besides "custodial issue" the tail has some more risks
- It involves creation of a client as an additional component
- Research is still in very early stages, contracts seem complex
- Faces the same problem of "finding the right server" (vs. "finding the right head")
- After having a `cardano-node` and `ogmios` server running and fully synched with the main chain, I could finally use `npm run pipeline` in `hydra-sim/scripts/tail` to download blocks and construct an `events.csv`
- I opted for `npm run pipeline 1000 1000 24320` to try to re-produce the 1000 node with 1000 slot compression dataset as it is also checked in to the repo, which has events of slot `24320`
- Although I see (some of) the same events being produced, I realized that the current `cardano-node-ogmios` instance I am using is still (or again) having issues to synchronize fully:
[f30db018:cardano.node.DnsSubscription:Error:6539] [2021-07-07 07:41:28.21 UTC] Domain: "relays-new.cardano-mainnet.iohk.io" Application Exception: 18.158.202.103:3001 InvalidBlock (At (Block {blockPointSlot = SlotNo 28173266, blockPointHash = e42dacc3f7406a85b2e561fffc84118c63a5b71d05d3cb0272dbc2d11c235d2c})) (ValidationError (ExtValidationErrorLedger (HardForkLedgerErrorFromEra S (S (S (Z (WrapLedgerErr {unwrapLedgerErr = BBodyError (BlockTransitionError [LedgersFailure (LedgerFailure (DelegsFailure (WithdrawalsNotInRewardsDELEGS (fromList [(RewardAcnt {getRwdNetwork = Mainnet, getRwdCred = KeyHashObj (KeyHash "47edd4e27aa5ef468603ded3c3250b3fd53ac196d9009c3a189e3f2a")},Coin 14393994)])))),LedgersFailure (LedgerFailure (DelegsFailure (WithdrawalsNotInRewardsDELEGS (fromList [(RewardAcnt {getRwdNetwork = Mainnet, getRwdCred = KeyHashObj (KeyHash "aac051310c2760fae362766ab5e7dd27404da3f72732d68ea7ec0c2a")},Coin 1296084)])))),LedgersFailure (LedgerFailure (DelegsFailure (WithdrawalsNotInRewardsDELEGS (fromList [(RewardAcnt {getRwdNetwork = Mainnet, getRwdCred = KeyHashObj (KeyHash "e7a92b469d4af1e2b70efc3638f084757655e99a954d48aae232d488")},Coin 12093106)])))),LedgersFailure (LedgerFailure (DelegsFailure (WithdrawalsNotInRewardsDELEGS (fromList [(RewardAcnt {getRwdNetwork = Mainnet, getRwdCred = KeyHashObj (KeyHash "4ddc9a17c1e23a56f1e01718387f45e646b3bf9f83c0ba285b04e347")},Coin 1781746)])))),LedgersFailure (LedgerFailure (DelegsFailure (WithdrawalsNotInRewardsDELEGS (fromList [(RewardAcnt {getRwdNetwork = Mainnet, getRwdCred = KeyHashObj (KeyHash "ee9345b6e27716c48d68abd805aaca347ba2a65060c47f3e46904320")},Coin 1297137)]))))])})))))))
- Having another go with a `1.25.1` `cardano-node` and a separate `ogmios` instance to get it fully synchronized while I continue running tail simulations on the part of the dataset that I have
- After confirming that I have somewhat similar data, I ran the simulation `cabal exec hydra-tail-simulation run -- --payment-window 100 --settlement-delay 120 datasets/events-clients:1000-compression:20000.csv` to see whether I also get somewhat similar results as in the paper
  - I got these results and the average confirmation time seems to somewhat correspond to the graph in the T2P2 paper for 1000 nodes (12.4 seconds)
RunOptions
{ slotLength = 1 s
, paymentWindow = Just
( Ada 100 )
, settlementDelay = SlotNo 120
, verbosity = Verbose
, serverOptions = ServerOptions
{ region = LondonAWS
, concurrency = 16
, readCapacity = 102400 KBits/s
, writeCapacity = 102400 KBits/s
}
}
SimulationSummary
{ numberOfClients = 1000
, numberOfEvents = 1035738
, numberOfTransactions = NumberOfTransactions
{ total = 517869
, belowPaymentWindow = 259512
, belowHalfOfPaymentWindow = 207355
, belowTenthOfPaymentWindow = 123539
}
, averageTransaction = Ada 24
, lastSlot = SlotNo 1184
}
[...]
Analyze
{ numberOfConfirmedTransactions = 28346
, maxThroughput = 218.99746835443037
, actualThroughput = 23.92067510548523
, averageConfirmationTime = 10.123735995981168
}
-
Just realized that if I run the simulations with an events file I got from MB for 1000 nodes and 20000 compression, I get the same `12.4` seconds average confirmation time.
- I wonder though why that `events-clients_1000-compression_20000.csv` only has `519` compressed slots, while my recreated `20000` compression of a shorter block chain has `1184` compressed slots?
-
Possible next steps:
- Store all confirmation times and plot them (likely will show that the average is strongly biased by a small number of slow txs which required a snapshot)
- Increase settlement delay to 3h and re-run a simulation
- Add pro-active snapshotting when reaching a certain window limit (without lookahead)
- Only do pro-active snapshotting when sender knows not to have another tx anytime soon (> settlement delay)
-
Using the same data as above, but with a 500 slot settlement delay, we get double the average confirmation time and about half of `confirmedTransactions`:
Analyze
{ numberOfConfirmedTransactions = 14048
, maxThroughput = 218.99746835443037
, actualThroughput = 11.854852320675105
, averageConfirmationTime = 22.912008787073507
}
- Compression 1000 and settlement delay 3600 slots (~1h):
Analyze
{ numberOfConfirmedTransactions = 22672
, maxThroughput = 11.628016940092923
, actualThroughput = 0.9321985115743596
, averageConfirmationTime = 278.11186776079404
}
- Compression 1000 and settlement delay 10800 slots (~3h):
Analyze
{ numberOfConfirmedTransactions = 12842
, maxThroughput = 11.628016940092923
, actualThroughput = 0.5280210517659636
, averageConfirmationTime = 484.0768607394298
}
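One of the possible next steps listed above is to store all confirmation times, on the suspicion that the average is biased by a small number of slow transactions which required a snapshot. A minimal sketch of summary statistics that would make such a bias visible is given below. The function names and the sample data are made up for illustration; none of this is taken from `hydra-sim`.

```haskell
import Data.List (sort)

-- | Arithmetic mean; Nothing for an empty sample.
mean :: [Double] -> Maybe Double
mean [] = Nothing
mean xs = Just (sum xs / fromIntegral (length xs))

-- | Median; for an even number of samples, the mean of the two middle values.
median :: [Double] -> Maybe Double
median [] = Nothing
median xs
  | odd n = Just (sorted !! mid)
  | otherwise = Just ((sorted !! (mid - 1) + sorted !! mid) / 2)
  where
    sorted = sort xs
    n = length xs
    mid = n `div` 2

-- | Fraction of samples confirmed within the given time limit.
fractionWithin :: Double -> [Double] -> Maybe Double
fractionWithin _ [] = Nothing
fractionWithin limit xs =
  Just (fromIntegral (length (filter (<= limit) xs)) / fromIntegral (length xs))

main :: IO ()
main = do
  -- Made-up sample: nine fast confirmations and one that waited for a snapshot.
  let times = replicate 9 0.5 ++ [120.5]
  print (mean times)             -- Just 12.5
  print (median times)           -- Just 0.5
  print (fractionWithin 1 times) -- Just 0.9
```

In this made-up sample a single slow transaction moves the mean far away from the median, which is the kind of effect a per-transaction plot would expose.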
Trying (again) to complete the first test of PAB, passing some parties' keys to the Init transaction and have it recorded and observed on the chain through smart contracts and PAB.
We are making too many shortcuts in the PAB/Main thing so things don't make sense to me...
There's confusion between the contract activation logic, which basically instantiates a contract and returns a contract identifier that can later be used to invoke endpoints on it, and the actual endpoints handling. To watch transactions from the state machine one needs to run a contract that waits for state changes, which requires the thread token (or a state machine client which is created through the thread token)
There is a logical problem: We cannot start the state machine until we have the initTx command, that's what got me confused. Also, how does the endpoint mapping work through the webserver? Seems like we are using Builtin and calling the init endpoint but it does not seem to be declared anywhere => This is the case in our present incarnation, as we have not declared any endpoint so this can't possibly work.
How can I observe the state of a SM if I don't know its thread token? And then how do I know its thread token if I did not create the SM in the first place?
As shown in the Auction contract's test, the buyer needs an external way of getting the thread token to observe the SM progression:
auctionTrace1 :: Trace.EmulatorTrace ()
auctionTrace1 = do
sellerHdl <- Trace.activateContractWallet w1 seller
_ <- Trace.waitNSlots 3
currency <- extractAssetClass sellerHdl
hdl2 <- Trace.activateContractWallet w2 (buyer currency)
_ <- Trace.waitNSlots 1
Trace.callEndpoint @"bid" hdl2 trace1WinningBid
void $ Trace.waitUntilTime $ apEndTime params
void $ Trace.waitNSlots 2
In all instances of StateMachine I could find, this is done through forging a currency which can then be used as a unique identifier that's either used directly or as part of some larger initial state.
Here in the TokenSale example from week08 of PPP, the token is part of the TokenSale initialiser.
tsStateMachine :: TokenSale -> StateMachine (Maybe Integer) TSRedeemer
tsStateMachine ts = mkStateMachine (Just $ tsNFT ts) (transition ts) isNothing
This implies that a node that's not initiating a head won't be able to know what the head identifier is unless there's a way to get it by other means: either out-of-band, through the Head members network, or by watching a specific contract's address which only depends on the HeadParameters and is hence knowable by all parties. It could also be some other address where the participation tokens are forged, with the PTs being defined with a currency symbol which is exactly the thread token.
Init for SM should then:
- Create a unique thread token for the head -> this will be the head identifier
- Post a transaction with outputs containing PTs for each head participant sent to their HeadParameter's pub keys -> like what we do now
All parties observe this known address and retrieve the PT sent to them to know what is the SM thread token -> then they can start monitoring the SM and observe its state changes specifically.
We need to pass both HydraKey and CardanoKey in the InitTx so that listeners can retrieve Participation tokens and then the state machine's token.
Listening should happen in 2 stages:
- listen to the `InitTx` by listening to PTs being paid to one's pubkey
- listen to the state machine's changes using the PT's currency symbol as key to the state machine's instance
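The first stage can be sketched as a pure function over transaction outputs. The types below are simplified stand-ins (strings and association lists) chosen for illustration, not the real Plutus types, and `findHeadId` is a hypothetical name. It follows the convention described earlier: a PT is one token whose name is the recipient's pubkey hash, and its currency symbol identifies the head.

```haskell
-- Simplified stand-ins for the Plutus types; not the real ledger API.
type PubKeyHash = String
type CurrencySymbol = String
type TokenName = String
type Value = [(CurrencySymbol, [(TokenName, Integer)])]

data TxOut = TxOut
  { txOutPaidTo :: PubKeyHash
  , txOutValue :: Value
  }

-- | Stage 1: look for an output paid to us that carries exactly one token
-- named after our own pubkey hash. Its currency symbol identifies the head
-- and is what stage 2 would use to find the state machine instance.
findHeadId :: PubKeyHash -> [TxOut] -> Maybe CurrencySymbol
findHeadId me outs =
  case candidates of
    (cs : _) -> Just cs
    [] -> Nothing
  where
    candidates =
      [ cs
      | TxOut recipient value <- outs
      , recipient == me
      , (cs, tokens) <- value
      , lookup me tokens == Just 1
      ]

main :: IO ()
main = do
  let outs =
        [ TxOut "alice" [("", [("", 2000000)])]
        , TxOut "bob" [("", [("", 2000000)]), ("head-1", [("bob", 1)])]
        ]
  print (findHeadId "bob" outs)   -- Just "head-1"
  print (findHeadId "alice" outs) -- Nothing
```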
-
Had multiple meetings with researchers today
-
First it was about doing some additional Tail simulations:
- Using shelley data
- Focus on (optimistic) latency and ‘window recycling’
- Do some kind of "pro-active" snapshotting when reaching a certain window watermark, e.g. 0.8[-w,w]
  - Ideally when the sender knows it is offline for at least `s` (settlement delay)
- Increase `s` to a more realistic length of ~500 slots / ~3h
- We are most interested in confirmation times (not really throughput) -> plot each tx individually as pointcloud? with time and value as axes?
-
But originally, this is motivated by "prioritization" of Hydra Tail / Head, which was then discussed in the full Research Meeting
- Pointed out that the Hydra Head is more realistic to be implemented any time soon
- Maybe "prioritization" issue is about just the wrong appearance of Tail being the only solution to (micro-)payments?
- Eventually pitched our MVP idea for a delegated hydra head
- Was somewhat well received and also incremental approach on creating this made sense to most
-
After Aggelos talked to Charles though, the "delegated hydra head" seemed to be a non-solution because of regulatory obligations being implied by it being actually a "custodial hydra head"
- In order to be able to redo some of the simulations, I started by following these instructions
- First I was using the combined Docker image for `cardano-node-ogmios` against a somewhat old `db` of a `cardano-node` and invoking `npm run pipeline` against the ogmios server running with that state
  - For this I had to add `nodejs-14_x` to the `shell.nix` of `hydra-sim`
  - Also the download first fails because of `TypeError: reader.end is not a function`, but restarting the pipeline picks up the downloaded `blocks.json`
  - Contrary to the `README`, there is a third parameter which seems to be limiting the `maxSlot` (after compression?)
- When seeing that the data is not complete (slot no is way too low), I realized my `cardano-node` had problems extending the chain
- After trying several different tags and also re-synching completely from scratch, I found that there seems to be an issue with recent `cardano-node` versions and the allegra hard fork (in retrospect) - This was also observed by others on slack
- Synching with the `1.25.1` node (using a docker image) seems to work now
While fixing PR review's issues, I noticed one problem with the use of JUnit formatter: It does not output anything anymore, the output is sent to the XML file but not to the console which is annoying. Trying to find a way to configure format and get both outputs. Managed to combine the 2 formatters, JUnit XML file generator and console reporter. Seems like there could be a generic function to define there, something like:
both :: Monad m => (a -> m ()) -> (a -> m ()) -> (a -> m ())
both one two a = one a >> two a
This is aptly defined as tee.
Trying to understand why our close contract fails to validate properly, and at which points validation fails.
We get an error about some signature not being done but it seems we add the constraints and lookups that are needed. Trying to dump the generated transactions using logInfo to see what they look like.
- Trying to use `ownPubKey` to sign the transaction instead of the pub key we pass in the contract but to no avail, it's still failing.
- Trying to add `traceIfFalse` in the `OnChain` code => now fails in the `collectCom` transaction.
- Trying to trim down the `close` validator to bare minimum.
- Turnaround time is long: > 1 minute per compilation cycle which is horribly slow
- Why is `collectCom` sometimes failing while we are changing seemingly unrelated code?
Going through Plutus code that constructs a transaction, trying to understand where it's validated and what each part is doing.
Trying to trace the transactions that are posted. It's not possible to get the ones that are not validated by the wallet, seems like only the failing on ledger ones are dumped.
Plan for afternoon:
- Remove mock-chain -> complete PAB with lightweight contract logic so that we get a complete `OnChain` client talking to PAB
- Good to have a look at the plutus pioneer program again
Keep troubleshooting close contract failure: Seems pretty clear the failure is in the amounts but what's unclear is why adding traces can have side effects that make the collectCom transaction validation fail. Going to try to validate the transaction and then investigate the effects of traces.
Just adding the single validation mustBeSignedByOneOf makes the test fail at the collectcom call which does not really make sense.
Trying to remove `and` and the list of constraints makes the test pass!
- Adding check on amounts with `&&` operator (which is supposed to fail) makes the test fail and the error message is cryptic
- Trying to add `&& True` makes the test pass -> Seems like it's not the operator that is the problem, but the operand?
It's hard to troubleshoot errors when one cannot print/trace values: traceIfFalse takes a String but one cannot pass show from values on chain apparently?
-
Trying to remove check on equality of committed values/closed values which seems to break things a lot, focusing on the correct transitioning to `Closed` state
-
Trying to replace the amount computed from the inputs with constant lovelace value of 1 and pass that off-chain
With a constant `adaLovelaceValue 1` the transaction successfully completes but the test now fails with the wallet's balance not being the one expected. Alice's wallet should have changed by -1000 but it actually changed by 999, which probably comes from the fact that it submitted the close transaction that only outputs 1 lovelace and the rest of the inputs went to Alice's wallet. Changing the test to have the transaction posted by Bob gives the same result, plus the state is not changed!!
-
The `payToTheScript` constraint in off-chain correctly generates a value that contains the committed ADAs and the participation tokens. Need to add that constraint in the on-chain validator which seems to be what we were doing before but maybe not?
Putting an incorrect value in validator's verification raises an error in the `close` contract as expected.
-
Ended up submitting a Plutus Bug to try to have a better understanding of what's going on with our failures in the `close` contract's invocation.
Going to beef up Dev VM with C2 instance to have a faster CPU => does not significantly change turnaround time.
Changing the way the amount is computed in the validator changes the outcome of the test: Now I can see the transaction validation failing on-chain and the transaction is dumped to the console, which does not help that much in troubleshooting as it's huge, but it's progress nevertheless.
It seems the transaction has no input with value, I can see only a single input which is the script address with datum and redeemer.
The inputs and outputs of the CollectCom transaction are:
{inputs:
- 268bf918b954642de3e4a1b2d108dee48f2ed4a0f9c974b35c6291b60070ab54!1
Redeemer: <>
- 3832e5b62e1bf8df95054f42d522ec24388b407652dd8564281a30367dcac0ad!1
Redeemer: <>
- 63fcd1840fb27fa8eef570f6f8ea42f1309518c17d4e252f3ee2ddc4c4492848!1
Redeemer: <>
collateral inputs:
outputs:
- Value (Map [(,Map [("",2000)]),(1dd1049cbd0ff6c47602ac8ce76d9d0558edec1c8f417ad6fd4a2111d8b10f10,Map [(0x21fe31dfa154a261626bf854046fd2271b7bed
4b6abe45aa58877ef47f9721b9,1),(0x39f713d0a644253f04529421b9f51b9b08979d08295959c4f3990ee617f5139f,1)])]) addressed to
addressed to ScriptCredential: 43071a376b64c7e305c2fedff9125e3d498f2d00ab0f486fdcaad81baa99dbe1 (no staking credential)
The I/Os for the close transaction are:
{inputs:
- 4fb4fd36491388958f3299ea539b574fdca790e8b5668599e99cac6ba5a39fac!1
Redeemer: <<1,
[<<<"2\130\148\246\255\SUB;X\235\236\191j\224\fZj\137\221\SOs i\EM\235\137\FS\199H(\192\178\178">,
<>>,
[<"", [<"", 1480>]>],
<>>,
<<<"\FS|\224\211\DC3\244\DEL\141R\144\229^i\171B\170E\DC2+E\180\186k\171\154>]T\252\171ub">,
....
outputs:
- Value (Map [(,Map [("",2000)]),(1dd1049cbd0ff6c47602ac8ce76d9d0558edec1c8f417ad6fd4a2111d8b10f10,Map [(0x21fe31dfa154a261626bf854046fd2271b7bed
4b6abe45aa58877ef47f9721b9,1),(0x39f713d0a644253f04529421b9f51b9b08979d08295959c4f3990ee617f5139f,1)])]) addressed to
addressed to ScriptCredential: 43071a376b64c7e305c2fedff9125e3d498f2d00ab0f486fdcaad81baa99dbe1 (no staking credential)
Logging the utxoAt result in the close endpoint that gives the TxOutTx attached to the script's address:
Contract log: String "State machine UTxO: fromList [
(TxOutRef {txOutRefId = 4fb4fd36491388958f3299ea539b574fdca790e8b5668599e99cac6ba5a39fac, txOutRefIdx = 1}
,TxOutTx {txOutTxTx = Tx {
txInputs = fromList [ TxIn {txInRef = TxOutRef {txOutRefId = 268bf918b954642de3e4a1b2d108dee48f2ed4a0f9c974b35c6291b60070ab54, txOutRefIdx = 1}
, txInType = Just (ConsumeScriptAddress Validator { <script> } (Redeemer {getRedeemer = Constr 0 []}) (Datum {getDatum = Constr 0 [Constr 0 [Constr 0 [B \"9\\247\\DC3\\208\\166D%?\\EOTR\\148!\\185\\245\\ESC\\155\\b\\151\\157\\b)YY\\196\\243\\153\\SO\\230\\ETB\\245\\DC3\\159\"],Constr 1 []],List [Constr 0 [B \"\",List [Constr 0 [B \"\",I 1000]]]],Constr 1 []]}))}
,TxIn {txInRef = TxOutRef {txOutRefId = 3832e5b62e1bf8df95054f42d522ec24388b407652dd8564281a30367dcac0ad, txOutRefIdx = 1}
, txInType = Just (ConsumeScriptAddress Validator { <script> } (Redeemer {getRedeemer = Constr 0 []}) (Datum {getDatum = Constr 0 []}))}
,TxIn {txInRef = TxOutRef {txOutRefId = 63fcd1840fb27fa8eef570f6f8ea42f1309518c17d4e252f3ee2ddc4c4492848, txOutRefIdx = 0}
, txInType = Just ConsumePublicKeyAddress}
,TxIn {txInRef = TxOutRef {txOutRefId = 63fcd1840fb27fa8eef570f6f8ea42f1309518c17d4e252f3ee2ddc4c4492848, txOutRefIdx = 1}
, txInType = Just (ConsumeScriptAddress Validator { <script> } (Redeemer {getRedeemer = Constr 0 []}) (Datum {getDatum = Constr 0 [Constr 0 [Constr 0 [B \"!\\254\\&1\\223\\161T\\162abk\\248T\\EOTo\\210'\\ESC{\\237Kj\\190E\\170X\\135~\\244\\DEL\\151!\\185\"],Constr 1 []],List [Constr 0 [B \"\",List [Constr 0 [B \"\",I 1000]]]],Constr 1 []]}))}]
, txCollateral = fromList [TxIn {txInRef = TxOutRef {txOutRefId = 63fcd1840fb27fa8eef570f6f8ea42f1309518c17d4e252f3ee2ddc4c4492848, txOutRefIdx = 0}, txInType = Just ConsumePublicKeyAddress}]
, txOutputs = [
TxOut {txOutAddress = Address {addressCredential = PubKeyCredential 21fe31dfa154a261626bf854046fd2271b7bed4b6abe45aa58877ef47f9721b9
, addressStakingCredential = Nothing}
, txOutValue = Value (Map [(,Map [(\"\",99915670)])])
, txOutDatumHash = Nothing}
, TxOut {txOutAddress = Address {addressCredential = ScriptCredential 43071a376b64c7e305c2fedff9125e3d498f2d00ab0f486fdcaad81baa99dbe1, addressStakingCredential = Nothing}
, txOutValue = Value (Map [(,Map [(\"\",2000)]),(1dd1049cbd0ff6c47602ac8ce76d9d0558edec1c8f417ad6fd4a2111d8b10f10,Map [(0x21fe31dfa154a261626bf854046fd2271b7bed4b6abe45aa58877ef47f9721b9,1),(0x39f713d0a644253f04529421b9f51b9b08979d08295959c4f3990ee617f5139f,1)])])
, txOutDatumHash = Just 2b0b15c43ac83cb6d7a68f7ed516e3017964b30f44d2d828408dd9559c8df82d
}]
, txForge = Value (Map [])
, txFee = Value (Map [(,Map [(\"\",52304)])])
, txValidRange = Interval {ivFrom = LowerBound NegInf True, ivTo = UpperBound PosInf True}
, txForgeScripts = fromList []
, txSignatures = fromList [(d75a980182b10ab7d54bfed3c964073a0ee172f3daa62325af021a68f707511a,2dba3d2cc78c83aef5e080be8dbf85645f90a44edf596913abe466b8cd0634a4250239a127792629702cd8cc4178360999699590e05b38ad2cee9eed12d9bb01)]
, txData = fromList [(2b0b15c43ac83cb6d7a68f7ed516e3017964b30f44d2d828408dd9559c8df82d,Datum {getDatum = Constr 1 ...[]]]]})
,(2cdb268baecefad822e5712f9e690e1787f186f5c84c343ffdc060b21f0241e0,Datum {getDatum = Constr 0 []}),(d38a1142ade90b55793912774ec6b633b03b810ce2f7513b9776d628a5387aa5,Datum {getDatum = Constr 0 [....]]})
,(f37dfa2dac3e68fad98162f5fe2db3ea5e253dccad695ba540b16cbcdc486ece,Datum {getDatum = Constr 0...]})]}
, txOutTxOut = TxOut {txOutAddress = Address {addressCredential = ScriptCredential 43071a376b64c7e305c2fedff9125e3d498f2d00ab0f486fdcaad81baa99dbe1
, addressStakingCredential = Nothing}
, txOutValue = Value (Map [(,Map [(\"\",2000)]),(1dd1049cbd0ff6c47602ac8ce76d9d0558edec1c8f417ad6fd4a2111d8b10f10,Map [(0x21fe31dfa154a261626bf854046fd2271b7bed4b6abe45aa58877ef47f9721b9,1),(0x39f713d0a644253f04529421b9f51b9b08979d08295959c4f3990ee617f5139f,1)])])
, txOutDatumHash = Just 2b0b15c43ac83cb6d7a68f7ed516e3017964b30f44d2d828408dd9559c8df82d}})]"
Submitted TX in the close:
Tx {
txInputs = fromList [ TxIn {txInRef = TxOutRef {txOutRefId = 4fb4fd36491388958f3299ea539b574fdca790e8b5668599e99cac6ba5a39fac, txOutRefIdx = 0}
, txInType = Just ConsumePublicKeyAddress}
, TxIn { txInRef = TxOutRef {txOutRefId = 4fb4fd36491388958f3299ea539b574fdca790e8b5668599e99cac6ba5a39fac, txOutRefIdx = 1}
, txInType = Just (ConsumeScriptAddress Validator { <script> } (Redeemer {getRedeemer = Constr 1 [Constr...}))}]
, txCollateral = fromList [ TxIn {txInRef = TxOutRef {txOutRefId = 4fb4fd36491388958f3299ea539b574fdca790e8b5668599e99cac6ba5a39fac, txOutRefIdx = 0}
, txInType = Just ConsumePublicKeyAddress}]
, txOutputs = [ TxOut {txOutAddress = Address {addressCredential = PubKeyCredential 21fe31dfa154a261626bf854046fd2271b7bed4b6abe45aa58877ef47f9721b9, addressStakingCredential = Nothing}
, txOutValue = Value (Map [(,Map [(\"\",99894975)])])
, txOutDatumHash = Nothing}
,TxOut {txOutAddress = Address {addressCredential = ScriptCredential 43071a376b64c7e305c2fedff9125e3d498f2d00ab0f486fdcaad81baa99dbe1, addressStakingCredential = Nothing}
, txOutValue = Value (Map [(,Map [(\"\",2000)]),(1dd1049cbd0ff6c47602ac8ce76d9d0558edec1c8f417ad6fd4a2111d8b10f10,Map [(0x21fe31dfa154a261626bf854046fd2271b7bed4b6abe45aa58877ef47f9721b9,1),(0x39f713d0a644253f04529421b9f51b9b08979d08295959c4f3990ee617f5139f,1)])])
, txOutDatumHash = Just 98f5b7eed56b55ca67fb14a2f90708dc7e4939bdc424af280ad934d8343388fb}]
, txForge = Value (Map [])
, txFee = Value (Map [(,Map [(\"\",20695)])])
, txValidRange = Interval {ivFrom = LowerBound NegInf True, ivTo = UpperBound PosInf True}
, txForgeScripts = fromList []
, txSignatures = fromList [(d75a980182b10ab7d54bfed3c964073a0ee172f3daa62325af021a68f707511a,bd8f627fef117528a32c8a48c0fa7992e2bbd03fee8b219c72b3a1f95ea8ec97140875009f4bfd7f1e9da7f263c619ff556a202a1d0d2cc9173b10a2445b8b01)]
, txData = fromList [(2b0b15c43ac83cb6d7a68f7ed516e3017964b30f44d2d828408dd9559c8df82d
,Datum {getDatum = Constr 1 [...]]})
,(98f5b7eed56b55ca67fb14a2f90708dc7e4939bdc424af280ad934d8343388fb
,Datum {getDatum = Constr 2 [Constr 0 [...]})]}"
Transaction seems correct, though!
Deciding what to do next in the aftermath of first milestone meeting and update:
- Protocol is still not complete: There are a bunch of TODOs and Contest is not at all implemented. This is rather straightforward so better do it in solo mode
- PAB integration: we'll wait for SN to do this together
- Continuing smart contracts: We are not handling all transitions in the Head SCs and there's still a failing test (commented out)
Fixing commented test in ContractTest: Adding a tryCallEndpoint that returns something if an error is thrown within the contract endpoint, so that we can assert the return value. However, there is an assertContractError function that asserts a predicate over a ContractError supposedly thrown by a contract's instance.
Implemented basic endpoints logic for close
-
Add a `Snapshot` type containing a list of UTxO and a snapshot number. We should add the multisig later.
-
Add `OnChain` validator: It's pretty straightforward as there's not much to check.
-
Add `OffChain` code to submit transaction for the close -> test fails with a mysterious `[WARNING] Slot 7: 00000000-0000-4000-8000-000000000000 {Contract instance for wallet 1}: Contract instance stopped with error: WalletError (ValidationError (ScriptFailure (EvaluationError ["Missing signature","checkScriptContext failed"])))`
Note: In the OCV algorithms, there's no mention of checking equality between the amount(s) initially committed and the amount of each transition in the SM, nor with the UTxO decommitted. This is implicit in the fact that the committed snapshot is valid and signed, hence has been produced by a valid ledger, yet it would probably be better to check it in the OCV?
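The explicit check raised in the note could look roughly like the sketch below. It is plain Haskell rather than Plutus code, and everything in it is an assumption made for illustration: `Lovelace`, `Output`, `snapshotValue` and `preservesValue` are hypothetical names, values are reduced to a single amount (no multi-asset, no fees), and the `Snapshot` type only mirrors the "list of UTxO and a snapshot number" description above.

```haskell
-- Hypothetical, simplified value type: real UTxO carry multi-asset values.
newtype Lovelace = Lovelace Integer
  deriving (Eq, Ord, Show)

-- A simplified output: an owner label and an amount.
data Output = Output
  { outputOwner :: String
  , outputValue :: Lovelace
  }
  deriving (Eq, Show)

-- A snapshot as described above: a snapshot number and a list of UTxO.
-- The multisig is deliberately left out, as in the log entry.
data Snapshot = Snapshot
  { snapshotNumber :: Integer
  , snapshotUtxo :: [Output]
  }
  deriving (Eq, Show)

-- Total value held by all outputs of a snapshot.
snapshotValue :: Snapshot -> Lovelace
snapshotValue = Lovelace . sum . map (unLovelace . outputValue) . snapshotUtxo
  where
    unLovelace (Lovelace n) = n

-- The extra check discussed in the note: the closing snapshot must hold
-- exactly the value that was initially committed to the head.
preservesValue :: Lovelace -> Snapshot -> Bool
preservesValue committed snapshot = snapshotValue snapshot == committed
```

With a snapshot holding 1500 and 500, `preservesValue (Lovelace 2000)` holds, while any other committed amount is rejected.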
Bandwidth is fixed to 2000MB/s, all nodes are colocated in same DC, transactions are assumed to be always non-conflicting.
| Nr. nodes | concurrency | tps (snapshot) | snap size |
|---|---|---|---|
| 20 | 10 | 685 | 100 |
| 20 | 1 | 259 | 10 |
| 50 | 10 | 709 | 250 |
| 50 | 1 | 296 | 25 |
| 100 | 10 | 717 | 500 |
| 100 | 1 | 314 | 50 |
To compare with Simple Protocol's results.
We should make an ADR for hiding technical layers behind modules, eg. Hydra.Network encapsulates and re-exports everything network-related.
What happens when CollectComTx and AbortTx happen concurrently?
- We should not observe both coming back from the chain, but this assumes the chain is safe
- We have this property stating we can receive messages in any state, but it's probably wrong
- We should guard the `OnChainEvent` handlers too with the state
We should take care of mainchain rollbacks at some point:
- Chain can be rolled back up to 36 hours in the past
- This means our whole state could disappear, with the rug pulled under our feet while we run the head
- => we need to wait for opening the head until we get sufficient confidence it cannot be rolled back?
- Also relevant for contestation => if you have enough stake you could succeed in forcing a rollback which means you could cheat...
- => delays full finalization even further
- Heads us in the direction of long-running heads, with incremental commits/decommits (with same problem of rollbacks)
- Ouroboros Genesis should solve part of the issue
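The 36 hour figure in the list above is consistent with the usual `3k/f` stability window. The arithmetic below assumes Cardano mainnet parameters (security parameter `k = 2160`, active slot coefficient `f = 1/20`, one-second slots); these numbers are not stated in the notes and all names are hypothetical.

```haskell
-- Assumed mainnet parameters, not taken from the notes above.
securityParam :: Integer
securityParam = 2160

activeSlotCoeff :: Rational
activeSlotCoeff = 1 / 20

slotLengthSeconds :: Integer
slotLengthSeconds = 1

-- Stability window in slots: 3k/f.
stabilityWindowSlots :: Integer
stabilityWindowSlots = floor (3 * fromIntegral securityParam / activeSlotCoeff)

-- The same window expressed in hours.
stabilityWindowHours :: Rational
stabilityWindowHours =
  fromIntegral (stabilityWindowSlots * slotLengthSeconds) / 3600
```

Under these assumptions the window is 129600 slots, that is 36 hours, which is the horizon a head would have to wait out to be fully safe from rollbacks.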
There's a tradeoff between acceptable risk and head duration -> TINSTAAFL
There is an error in our code: We transition Abort to ClosedState which is wrong, but we cannot really observe that.
- The only way is to make sure there are some actions we can or cannot do. Also, state naming is not consistent with the paper -> renaming `InitState` to `ReadyState`
- Also, there is no need for a `FinalState` as there's only one head. When we `Abort` or finalize the head, we move back to `ReadyState`, which means we can test the correctness of the abort by sending `Init` again.
- Test fails because we had some uniqueness requirement on the txs in mocked chain in `BehaviorSpec`, so sending `InitTx [1,2]` twice fails -> remove the uniqueness requirements
Removing our property stating we should handle on-chain transactions in all states -> it's not true anymore. Then adding some unit tests in HeadLogicSpec to assert CollectComTx and AbortTx are exclusive of each other.
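A minimal sketch of what guarding on-chain handlers with the head state means is given below. It is not the actual head logic: the states and transactions carry no payload (for example `InitTx` has no parties), and `Outcome`, `update` and `runAll` are hypothetical names chosen for this illustration.

```haskell
-- Hypothetical, stripped-down head states; the real ones carry more data.
data HeadState
  = ReadyState
  | InitialState
  | OpenState
  | ClosedState
  deriving (Eq, Show)

-- Simplified on-chain transactions, without any payload.
data OnChainTx
  = InitTx
  | CollectComTx
  | AbortTx
  | CloseTx
  | FanoutTx
  deriving (Eq, Show)

-- Result of handling one transaction in a given state.
data Outcome
  = NewState HeadState
  | InvalidEvent HeadState OnChainTx
  deriving (Eq, Show)

-- Every handler is guarded by the current state. Aborting or finalizing
-- the head leads back to ReadyState, so no FinalState is needed.
update :: HeadState -> OnChainTx -> Outcome
update ReadyState InitTx = NewState InitialState
update InitialState CollectComTx = NewState OpenState
update InitialState AbortTx = NewState ReadyState
update OpenState CloseTx = NewState ClosedState
update ClosedState FanoutTx = NewState ReadyState
update st tx = InvalidEvent st tx

-- Apply a sequence of transactions, stopping at the first invalid one.
runAll :: HeadState -> [OnChainTx] -> Outcome
runAll st [] = NewState st
runAll st (tx : txs) = case update st tx of
  NewState st' -> runAll st' txs
  invalid -> invalid
```

In this model `runAll ReadyState [InitTx, AbortTx, InitTx]` ends in `InitialState`, which is the "send `Init` again" check described above, while `CollectComTx` followed by `AbortTx` (or the reverse) yields an `InvalidEvent`, so the two are exclusive.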