new ORC notes - animeshtrivedi/notes GitHub Wiki
In the file OrcColumnarBatchReader, the initialize function has this line:
recordReader = reader.rows(options);
which I also have in my file format benchmark (so this is cool, I wasn't off).
In the initBatch function, it takes the ORC schema (I also had this (IAHT)) and does batch = orcSchema.createRowBatch(CAPACITY);
What is this copyToSpark about? The comment in the code says: // Just wrap the ORC column vector instead of copying it to Spark column vector.
- this we can do as well. Why would you do otherwise?
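To make the wrap vs. copy question concrete for myself, here is a minimal sketch. None of these classes are the real Spark or ORC types: OrcLongVector, SparkLongColumn, WrappedLongColumn and CopiedLongColumn are hypothetical stand-ins I made up, assuming only that the ORC side exposes a long[] per column and that the reader refills the same batch object on every read.

```java
import java.util.Arrays;

// Hypothetical stand-in for an ORC long column inside a row batch.
final class OrcLongVector {
    final long[] vector;
    OrcLongVector(int capacity) { this.vector = new long[capacity]; }
}

// Hypothetical stand-in for the read-only column view the Spark side consumes.
interface SparkLongColumn {
    long getLong(int rowId);
}

// Wrap: keep a reference to the ORC vector, no per-batch copy.
final class WrappedLongColumn implements SparkLongColumn {
    private final OrcLongVector orc;
    WrappedLongColumn(OrcLongVector orc) { this.orc = orc; }
    @Override public long getLong(int rowId) { return orc.vector[rowId]; }
}

// Copy: snapshot the first `size` values into a separate array.
final class CopiedLongColumn implements SparkLongColumn {
    private final long[] data;
    CopiedLongColumn(OrcLongVector orc, int size) {
        this.data = Arrays.copyOf(orc.vector, size);
    }
    @Override public long getLong(int rowId) { return data[rowId]; }
}

final class WrapVsCopyDemo {
    public static void main(String[] args) {
        OrcLongVector orc = new OrcLongVector(4);
        orc.vector[0] = 7;

        SparkLongColumn wrapped = new WrappedLongColumn(orc);
        SparkLongColumn copied = new CopiedLongColumn(orc, 4);

        // Simulate the reader refilling the same batch with the next rows.
        orc.vector[0] = 42;

        System.out.println(wrapped.getLong(0)); // prints 42: sees the refilled batch
        System.out.println(copied.getLong(0));  // prints 7: still holds the old value
    }
}
```

So under these assumptions, wrapping is free per batch, but the values are only valid until the next batch is read. One plausible reason to copy anyway would be a consumer that holds on to rows across batches or needs a different memory layout; that is my guess, not something I verified in the Spark code.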