filter csv.py - Holger-HBB/hbb-python-scripts GitHub Wiki

Filter CSV using another CSV

My first contribution to a GitHub repository was adding two parameters from the Last.fm API to Petr Viktorin's lastexport.py.

My eventual goal is to cross-reference Last.fm scrobbles with the MusicBrainz database. The best way would be to replicate the MusicBrainz server and query it directly, but my personal PC is nowhere near powerful enough for that. The more practical option for now is to set up an SQL database and import only the tables needed for the task, but even those tables are simply too large: the artist table contains more than 1.7 million entries, of which I may need somewhere between 10,000 and 20,000.

So I've created this script to filter the large file.

I first came across a Python script for this here: Filter a large CSV file with Python (katiekodes.com), and I decided to generalize it.
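
To give an idea of the approach, here is a minimal sketch of filtering one CSV using another. It is not the actual script from this repository. The function name `filter_csv` and its parameters are made up for illustration, and it assumes both files are UTF-8, comma-separated, and start with a header row. The small file's key column is loaded into a set, and the large file is then streamed row by row so it never has to fit in memory.

```python
import csv


def filter_csv(large_path, keys_path, out_path, large_column, keys_column):
    """Copy the rows of large_path whose large_column value occurs in keys_path.

    Hypothetical helper for illustration only. Assumes both files have a
    header row and that every data row has as many fields as the header.
    Returns the number of rows written (not counting the header).
    """
    # Load the key column of the small file into a set for fast lookups.
    with open(keys_path, newline="", encoding="utf-8") as keys_file:
        wanted = {row[keys_column] for row in csv.DictReader(keys_file)}

    kept = 0
    # Stream the large file one row at a time instead of loading it whole.
    with open(large_path, newline="", encoding="utf-8") as src, \
            open(out_path, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if row[large_column] in wanted:
                writer.writerow(row)
                kept += 1
    return kept
```

With made-up file and column names, a call could look like `filter_csv("artist.csv", "scrobbled_artists.csv", "artist_filtered.csv", "name", "artist")`. Using a set keeps each lookup cheap, which matters when the large file has well over a million rows and the key list has tens of thousands.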