Delayed Feather.ipynb (mrocklin)
mrocklin/Delayed-Feather.ipynb
https://gist.github.com/mrocklin/e7b7b3a65f2835cda813096332ec73ca
Custom Computations with dask.delayed and for loops
Because the real world is a messy place
This example uses dask.delayed to construct a parallel dask.dataframe from a nested directory of data stored in a custom format, feather. It is a good example of using dask.delayed to handle messy situations in the real world and then hand those situations off to dask.dataframe for clean processing.
Example: Hierarchically stored data in custom format
Hierarchical storage: Custom directory structure with filenames encoding columns
Feather: New Dataframe format that came out two weeks ago
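To make "filenames encoding columns" concrete, here is a minimal sketch of recovering column values from a file's location. The layout `year/month/name.feather`, the key names, and the helper `columns_from_path` are all assumptions chosen for illustration; the actual directory structure used in the notebook is not shown in this section.

```python
from pathlib import PurePosixPath


def columns_from_path(path, keys=("year", "month", "name")):
    """Map the trailing path components onto column names.

    Assumes a hypothetical layout such as data/2016/03/alice.feather,
    where the last directories and the file stem each encode one column.
    """
    # Drop the file extension, then split the path into its components.
    parts = PurePosixPath(path).with_suffix("").parts
    values = parts[-len(keys):]
    if len(values) < len(keys):
        raise ValueError(f"path {path!r} has fewer than {len(keys)} components")
    return dict(zip(keys, values))


print(columns_from_path("data/2016/03/alice.feather"))
# {'year': '2016', 'month': '03', 'name': 'alice'}
```

The recovered values are strings; after loading a file they could be attached to its rows with something like `df.assign(**columns_from_path(path))`, converting types as needed.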