4.0 Functional programming exercises - GiovanniKaaijk/functional-programming GitHub Wiki

To practice data cleaning, I used a survey that our class filled out a while ago. I picked two columns and cleaned them.

I started by cleaning up the study debt of every student. I chose this column because I wanted to start with an easy one. The study debt comes in three different formats:

  • Geen studieschuld (no study debt)
  • NumberX - NumberY (e.g. 10.000-15.000)
  • Hoger dan 25000 (higher than 25000)

I wanted a single value for every format, so I did the following:

I replaced 'Geen studieschuld' with 0, replaced each NumberX-NumberY range with the average of its two bounds, and replaced 'Hoger dan 25000' with 25000+. This is the code I used:

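/**
 * Converts every survey answer in the raw string to a single value.
 * @param {string} string The raw string, one answer per line
 * @returns {string} One value per line: 0, the average of a range, or 25000+
 */
function debtToNumbers(string) {
    let stringArray = string.split("\n");
    let newArray = [];

    for (let thisString of stringArray) {
        if (thisString.includes("Geen studieschuld")) {
            newArray.push(thisString.replace("Geen studieschuld", "0"));
        } else if (thisString.includes("Hoger dan 25000") || thisString.includes("Meer dan 25000")) {
            // the top bracket has no upper bound, so it is kept as 25000+ (either wording is accepted)
            newArray.push("25000+");
        } else if (thisString.length > 0) {
            // strip the thousands separators first: parseInt("10.000") stops at the dot and returns 10
            let currentItem = thisString.split("-")
                .map(item => parseInt(item.replace(/\./g, ""), 10))
                .reduce((val1, val2) => (val1 + val2) / 2);
            newArray.push(currentItem);
        }
    }
    return newArray.join("\n");
}
debtToNumbers(string);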
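
One detail worth checking in the range branch: parseInt stops at the first character that is not part of an integer, so parseInt("10.000") returns 10 rather than 10000. The sketch below shows the same conversion written per answer with map instead of a loop. The helper name parseDebt and the sample input are made up for illustration, and it assumes the dots in the survey answers are thousands separators.

```javascript
// Hypothetical helper: converts one survey answer to a single value.
// Assumption: dots are thousands separators, as in "10.000-15.000".
const parseDebt = (answer) => {
    if (answer.includes("Geen studieschuld")) return "0";
    // Matches both "Hoger dan 25000" and "Meer dan 25000".
    if (answer.includes("dan 25000")) return "25000+";
    const [low, high] = answer
        .split("-")
        .map(part => parseInt(part.replace(/\./g, ""), 10));
    return String((low + high) / 2);
};

// Sample input made up for illustration, not taken from the survey.
const sampleDebts = "Geen studieschuld\n10.000-15.000\nHoger dan 25000";

const cleanedDebts = sampleDebts
    .split("\n")
    .filter(line => line.length > 0) // skip empty lines
    .map(parseDebt)
    .join("\n");

console.log(cleanedDebts);
// 0
// 12500
// 25000+
```

Keeping the per-answer logic in its own small function makes it easy to test a single answer without building a multi-line string first.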

I wanted to clean up another column of the survey, so I chose the hobby column. In this column the hobbies are supposed to be separated by ;. Many people used , instead, and some people put spaces around the separators while others did not.

To clean this column I replaced every , by ; and removed the spaces before and after each hobby. I kept the spaces between words because some hobbies consist of multiple words. I used the following code:

/**
 * Normalizes the separators in the hobby column: every , is replaced by
 * replaceBy, and a whitespace character at the start or end of a hobby is
 * removed. Spaces between words are kept.
 * @param {string} string The raw string, one person per line
 * @param {string} replaceBy The separator to put between hobbies, e.g. ';'
 * @returns {string} The cleaned string, one person per line
 */
function removeSpace(string, replaceBy) {
    // split the string at every line break
    let stringArray = string.split("\n")
        // replace every , by the given separator
        .map(singlePerson => singlePerson.replace(/,/g, replaceBy))
        // split each person's string into an array of hobbies
        .map(currentPerson => currentPerson = currentPerson.split(replaceBy)
            // remove a whitespace character at the start and at the end of each hobby, if present
            .map(hobby => hobby.replace(/^\s|\s$/gm, ''))
            // join the hobbies back into one string
            .join(replaceBy))
        // join all persons back into one string
        .join("\n")

    return stringArray
}
removeSpace(string2, ';');

Checkup

During a check-in with Danny on 8-11, I had to explain my code line by line.

First I split the single string into an array of strings, one for each line break.

let stringArray = string.split("\n")

[ '',
  'Muziek maken; muziek produceren; uitgaan met vrienden',
  'Drummen; Gamen; Chillen',
  'Fitness, Voetbal, Tennis, Uitgaan, Tekenen',
]
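
The empty first element suggests that the raw string starts with a line break; that is an assumption, since the input itself is not shown here. A minimal sketch with made-up variable names shows how split produces that element and how filter could drop it if empty rows are unwanted:

```javascript
// Assumed input: a multi-line string that starts with a line break.
const rawHobbies = "\nDrummen; Gamen; Chillen\nFitness, Voetbal, Tennis";

// split keeps the empty string that sits in front of the first "\n".
const rows = rawHobbies.split("\n");
console.log(rows); // [ '', 'Drummen; Gamen; Chillen', 'Fitness, Voetbal, Tennis' ]

// Optional: drop empty rows before cleaning any further.
const nonEmptyRows = rows.filter(row => row.length > 0);
console.log(nonEmptyRows.length); // 2
```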

I then map over these strings to replace every , with the character that was passed in when calling the function.

.map(singlePerson => singlePerson.replace(/,/g, replaceBy))

[ '',
  'Muziek maken; muziek produceren; uitgaan met vrienden',
  'Drummen; Gamen; Chillen',
  'Fitness; Voetbal; Tennis; Uitgaan; Tekenen',
]
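
The g flag matters in this step. A small sketch, using one of the sample rows above, compares a plain string pattern, which replaces only the first match, with the global regular expression:

```javascript
const row = "Fitness, Voetbal, Tennis";

// A string pattern replaces only the first comma.
const firstOnly = row.replace(",", ";");
console.log(firstOnly); // Fitness; Voetbal, Tennis

// The global flag replaces every comma.
const allCommas = row.replace(/,/g, ";");
console.log(allCommas); // Fitness; Voetbal; Tennis
```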

After replacing these characters, I split each person's string into an array of hobbies.

.map(currentPerson => currentPerson = currentPerson.split(replaceBy)

[ [ '' ],
  [ 'Muziek maken', ' muziek produceren', ' uitgaan met vrienden' ],
  [ 'Drummen', ' Gamen', ' Chillen' ],
  [ 'Fitness', ' Voetbal', ' Tennis', ' Uitgaan', ' Tekenen' ],
]

Then I map over these hobbies and use a regular expression to remove a space at the start or end of each hobby.

.map(hobby => hobby.replace(/^\s|\s$/gm, ''))

[ [ '' ],
  [ 'Muziek maken', 'muziek produceren', 'uitgaan met vrienden' ],
  [ 'Drummen', 'Gamen', 'Chillen' ],
  [ 'Fitness', 'Voetbal', 'Tennis', 'Uitgaan', 'Tekenen' ],
]
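
Note that /^\s|\s$/ matches a single whitespace character per side, so a hobby with two leading spaces would keep one of them. The sketch below compares it with the built-in String.prototype.trim, which removes all leading and trailing whitespace and also leaves the spaces between words alone. The sample value is made up for illustration.

```javascript
// Made-up sample: two leading spaces, one trailing space.
const hobby = "  uitgaan met vrienden ";

// The regex removes one whitespace character at the start and one at the end.
const regexTrimmed = hobby.replace(/^\s|\s$/gm, "");
console.log(JSON.stringify(regexTrimmed)); // " uitgaan met vrienden"

// trim removes all leading and trailing whitespace, keeping inner spaces.
const fullyTrimmed = hobby.trim();
console.log(JSON.stringify(fullyTrimmed)); // "uitgaan met vrienden"
```

For the survey rows shown here both approaches give the same result, because each hobby has at most one space on either side.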

After removing the spaces, I join each array of hobbies back into a single string with the replacement character, and then join these strings back into one string with a line break between them.

.join(replaceBy))
.join("\n")

Muziek maken;muziek produceren;uitgaan met vrienden
Drummen;Gamen;Chillen
Fitness;Voetbal;Tennis;Uitgaan;Tekenen