Agenda 12th of May
Update on receiving data: The data was received by email on the evening of the 29th of April.
Decrypting the data: The data was encrypted with AES and took somewhat longer to decrypt than expected. Justin was contacted for additional help on this issue, and the data has now been decrypted.
First insights into data (manual and using Python): Producing insights into the data using Python was ineffective because of the large amount of data. While the functions worked on a smaller file of 300 lines, the complete file of 2.7 million lines could not be analysed with this method. Maarten Marx proposed the pandas package for Python as an alternative. https://pypi.python.org/pypi/pandas/0.18.1/#downloads
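As an illustration of how pandas might cope with a file of this size, the sketch below reads the data in chunks and only keeps running totals in memory. It assumes the decrypted data is a delimited text file that `pandas.read_csv` can parse. The function name `summarize_in_chunks` and the column names `user_id` and `product_id` are hypothetical and do not come from the actual dataset.

```python
import io

import pandas as pd


def summarize_in_chunks(source, chunksize=100_000, **read_csv_kwargs):
    """Count rows and non-missing values per column, one chunk at a time.

    `source` is a path or file-like object; the CSV layout is an assumption.
    Extra keyword arguments (e.g. sep=";") are passed on to pandas.read_csv.
    """
    total_rows = 0
    non_null = None
    # With chunksize set, read_csv yields DataFrames of at most `chunksize` rows
    # instead of loading the whole file at once.
    for chunk in pd.read_csv(source, chunksize=chunksize, **read_csv_kwargs):
        total_rows += len(chunk)
        counts = chunk.count()  # non-missing values per column in this chunk
        non_null = counts if non_null is None else non_null.add(counts, fill_value=0)
    return total_rows, non_null.astype(int)


if __name__ == "__main__":
    # Tiny in-memory stand-in for the real file; column names are made up.
    sample = "user_id,product_id\n1,10\n2,\n3,30\n"
    rows, filled = summarize_in_chunks(io.StringIO(sample), chunksize=2)
    print(rows)
    print(filled)
```

For the real file, the path to the decrypted data would replace the `io.StringIO` object, and the chunk size could be tuned to the available memory.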
The main focus for the next week will be on understanding the data and, using pandas, gathering insights that will inform how to filter it. Maarten Marx will be updated on progress on this topic next week. Some pages of the final report describing the data might also be written.
Discussing approach on filtering data: Next week's focus will be on understanding and analysing the data rather than filtering it. Filtering will be done later in the project.