Cryptographic Calculations! - benjamin-schultz/wow-such-miner GitHub Wiki
Scrypt
How does Scrypt work? To google!
It is a key derivation function, so not exactly a hash. It takes a 'password' input. It is also noted on Wiki that it is designed to take up a large amount of memory. Oh dear, that doesn't bode well. Anyway, how does it work?
Function
It generates a long string of psuedo-random bits, then psuedo-randomly pulls out part of the string of bits and joins those parts together. This long string of bits can be generated part by part on the fly, but that would make it take longer. So it is designed to be a trade off between going fast or using less memory.
This is good for me, because I don't care much about the speed, so I can implement it via on-the-fly generation. Time to look at the algorithm!
Algorithm
Wikipedia provides some psuedo-code, but i'm gonna try to understand it rather than re-write the psuedo-code. (The word psuedo is coming up a lot... i feel like this will be common in cryptography)
It takes a passphrase and a salt. Passphrase will be whatever the block chain gives me I imagine, and i'm thinking the salt is the nonce? It doesn't matter at this stage I don't think, but I will keep an eye out.
There is Cost Factor, BlockSizeFactor and ParallelizationFactor, and these seem like the configuration settings that determine the speed vs memory trade off. There are also some other settings about lengths, i'm not gonna worry about them yet. I don't even understand the algorithm yet!
Ok, now the actual algorithm. So Scrypt takes your input, applies PBKDF2 (using HMAC-SHA256 as the hashing function), does some rearranging, then uses PBKDF2 again with our output. There are 2 main parts of this, PBKDF2 and the rearranging. The wiki clearly outlines the psuedo-code for the rearranging, however the PBKDF2 is a bit of a rabbit hole.
PBKDF2
That's a mouthful of an algorithm. Stands for "Password-Based Key Derivation Function 2", and basically it applies a hash and obfuscates it with some iteration. The actual algorithm itself is conceptually pretty simple, but it does rely on a specified hash function. In this case, that hash function is HMAC-SHA256... which means i'm going to need to implement that as well. I didn't expect to need to implement SHA as a part of this project, but here we are.
I could theoretically find it online, i'm sure its been done, but... might as well do it myself. Never done it before and its a learning experience.
PBKDF2 requires the hash function to take a key and an input. Initially I thought Scrypt only used SHA256, so I was confused trying to find the key, however now I realize it is HMAC-SHA256. So I guess first step of this project will be learning how to implement HMAC-SHA256!
HMAC-SHA256
So... whats this then? Well a HMAC is a "Hash-based Message Authentication Code", and the SHA256 tells you what the hash algorithm is. Basically you append a key, then hash, then do it again. The HMAC side of this is also a relatively simple algorithm, its just the SHA256 that scares me. Now I understand where the key and message go from the PBKDF2. I don't want to think about the SHA256 yet (even though its the first thing i'm gonna likely do), so next I will talk about the rearranging function in Scrypt!
Rearranging
Explaining whats happening here doesn't seem super important at this stage. Conceptually I mostly understand whats happening, and its mainly in the technical details. I don't want to both with technical details here, once I start implementing I may discuss though.
Conclusion
After writing and reading this, I have a reasonable understanding of how the process works. I can see 2 potential avenues of how to proceed from here:
- Read about SHA256 and try to implement it
- Ignore SHA256 and build the Scrypt function around it. Build the HMAC with a dummy hash, then the PBKDF2 etc... This sounds mor interesting. It also allows me to potentially cop-out and get a SHA256 implementation from the internet. Writing this out, it seems like this is the best option to get something that looks like a shell of what I want. I imagine the SHA256 will be a rabbit hole all of its own, and I don't want to get side tracked too early and discouraged! (Also, going HMAC -> PBKDF2 -> Scrypt feels like a much more gentle introduction into using VHDL and FPGA, as each step is slightly more complicated than the last)
So I guess... Next time, i'll start writing the HMAC! Exciting!