Cleaning & Transforming - NathanNeelis/frontend-data GitHub Wiki

Cleaning and transforming

I've gone through this process 3 times. I learned the basics while cleaning the survey data.
After that, I moved on to real data: first the NS API, and in the end a big dataset from NPR.

Index

Survey data column: Aantal glazen water per dag
Survey data column: Kleur ogen
Cleaning & Transforming NS API datasets
Cleaning & Transforming NPR open data


Cleaning NPR dataset

The NPR dataset contains around 8000 entries of the following form:

  "ParkingFacilities": [
    {
      "name": "P+R Station Appingedam (Appingedam)",
      "identifier": "fc749565-1fe9-42f0-920a-3b4e718d62f9",
      "staticDataUrl": "https://npropendata.rdw.nl//parkingdata/v2/static/fc749565-1fe9-42f0-920a-3b4e718d62f9",
      "limitedAccess": false,
      "staticDataLastUpdated": 1601717435
    },

Fetching the 'real' data

As my concept involves P+R data, I can easily filter on the name to get a new array of P+R parking facilities.
While trying to filter, I got an error because one or more objects have no name key.
Removing these objects is not best practice, and later on I change missing data to 'UNKNOWN' instead.
For now, I removed them with the following code:

function removeNoName(allData) {
    let newArray = allData.filter(obj => Object.keys(obj).includes("name"));
    return newArray
}
Resource: https://stackoverflow.com/questions/51367551/how-to-remove-object-from-array-if-property-in-object-do-not-exist

Now I can filter for P+R parking areas by using the following code:

function filterPrParking(allData) {
    let substring = 'P+R';
    let prParkingArray = [];
    for (let i = 0; i < allData.length; i++) {
        if (allData[i].name.indexOf(substring) !== -1) {
            prParkingArray.push(allData[i])
        }
    }
    return prParkingArray;
}

This function loops over all objects and keeps the ones whose name contains 'P+R'.

This returned an array of 405 objects. So down from 8k to 405.
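
For comparison, the two steps above (dropping entries without a name, then keeping the P+R names) could also be folded into a single filter. This is only a sketch, not the code used in the project: the function name filterPrParkingSafe and the sample array are made up for illustration.

```javascript
// Hypothetical helper: keeps only facilities whose name is a string containing 'P+R'.
// Entries without a usable name are skipped instead of causing an error.
function filterPrParkingSafe(allData, substring = 'P+R') {
    return allData.filter(obj => typeof obj.name === 'string' && obj.name.includes(substring));
}

// Made-up sample data in the shape of the ParkingFacilities entries
const sample = [
    { name: 'P+R Station Appingedam (Appingedam)', identifier: 'a' },
    { identifier: 'b' }, // no name key
    { name: 'Garage Centrum', identifier: 'c' }
];

const prOnly = filterPrParkingSafe(sample);
console.log(prOnly.length); // 1
```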

During the online lectures, someone noticed that the identifier in each object is the last part of the unique URL that returns all data for that facility.
So I had to extract an array of all identifiers. I did so with the following code:

function filterID(prData) {
    return prData.map(prParkingData => prParkingData.identifier) // returns array with PR parking ID's
}

The next step was to create all URLs to start fetching.
I set up a base URL and appended each identifier to it, which gave me one fetchable URL per facility.
Passing each URL to getData gave me promiseAllPr, an array of 405 promises.
Credits to my teacher Laurens for showing us how in his example.

const endpointNPR = 'https://npropendata.rdw.nl/parkingdata/v2/';
const baseURL = endpointNPR + 'static/';

const promiseAllPr = makeURLs(baseURL, prParkingIDs);

function makeURLs(baseURL, IDs) {
    return IDs.map(id => getData(baseURL + id))
}

async function getData(url) {
    const response = await fetch(url);
    const data = await response.json();
    return data;
}

So now I have an array with a lot of promises about to fire.
I noticed my teacher (Laurens) got rate-limited on 429 fetches and I had to make 405, so would it work?

const wrappedData = await Promise.all(promiseAllPr)
const prDataArray = unwrapData(wrappedData);

function unwrapData(wrappedData) {
    return wrappedData.map(item => item.parkingFacilityInformation)
}

fs.writeFileSync('./result.json', JSON.stringify(prDataArray, null, 4));

Yes, it did work! I now had an array in which every item was wrapped in an outer object called parkingFacilityInformation.
Laurens wrote an awesome function that removes that outer object and returns a nice array with my data, so I re-used his function here. After that, I saved my data into a new JSON file, so I wouldn't have to fetch 405 data points every time I reload my page. In the end, I saved this JSON file in a GitHub Gist so I could fetch the data from there later on.
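
Had the rate limit been a problem, one option would have been to fetch in small batches with a pause in between. The sketch below is an assumption about how that could look, not something used in this project. The helper fetchInBatches, the batch size and the pause are all hypothetical. Note that it takes plain URLs rather than promises, because makeURLs starts all requests at once.

```javascript
// Hypothetical helper: resolves the URLs in small batches with a pause in between.
// `fetcher` is any function that takes a URL and returns a promise (for example getData).
async function fetchInBatches(urls, fetcher, batchSize = 50, pauseMs = 1000) {
    const results = [];
    for (let i = 0; i < urls.length; i += batchSize) {
        const batch = urls.slice(i, i + batchSize);
        // Only the requests of the current batch run at the same time
        const batchResults = await Promise.all(batch.map(url => fetcher(url)));
        results.push(...batchResults);
        // Wait before starting the next batch, but not after the last one
        if (i + batchSize < urls.length) {
            await new Promise(resolve => setTimeout(resolve, pauseMs));
        }
    }
    return results;
}

// Usage with a fake fetcher, so the sketch runs without network access
const fakeFetcher = url => Promise.resolve({ parkingFacilityInformation: { url } });
const demoUrls = ['static/1', 'static/2', 'static/3'];

const batchDemo = fetchInBatches(demoUrls, fakeFetcher, 2, 10)
    .then(data => data.map(item => item.parkingFacilityInformation.url));
batchDemo.then(urls => console.log(urls)); // same order as demoUrls
```

The results keep the order of the input URLs, so the output can still be passed to unwrapData as before.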

My fetched data looks like this:

There is so much data packed in a single object:

How I want my object to look

Now I can start working with my fetched data object. Time to clean it and transform it into an array of objects I need for my visualization.
I want an object with the following keys:

{
    city: "Appingedam"
    description: "P+R Station Appingedam (Appingedam)"
    capacity: 22
}

Step 1: Start transforming the data - getting the city name

Starting with the city name, I noticed that it is stored in two different places, and that when one place has it undefined, the other place might still have it. To gather two arrays with these city names, I used the following cleaning and transforming functions:

let accessPointDataArray = filterData(prData, 'accessPoints'); // Array with all accessPoints data

function filterData(dataArray, column) {
    return dataArray.map(result => result[column]);
}

So this function returns an array of the column I specify, in this case this is accessPoints.

(405) [Array(1), Array(1), Array(1), Array(1)

Unfortunately, it was wrapped in an array, so I had to clean this by writing and using the following function:

function removeOuterArray(prData) {
    return prData.map(result => result[0]);
}

This returned 405 of the following objects:

0:
accessPointAddress: {emailAddresses: Array(1), streetName: "Stationsweg", houseNumber: "36", zipcode: "9901CS", city: "Appingedam", …}
accessPointLocation: [{…}]
alias: ""
isPedestrianEntrance: false
isPedestrianExit: false
isVehicleEntrance: true
isVehicleExit: true
validityEndOfPeriod: 0
validityStartOfPeriod: 1433116800
__proto__: Object

As you can see, accessPointAddress contains a key with the city name. That's the key I would like to extract into an array. But I ran into an issue: not every data point has the city name defined. I fixed this by using the following function to clean:

function fixEmptyValues(prData) {
    // Merge all items into one object, so it contains every key that occurs in the data
    // (the empty object as start value keeps reduce from throwing on an empty array)
    let obj = prData.reduce((res, item) => ({
        ...res,
        ...item
    }), {});
    // console.log('new object', obj);

    // Get those keys as an array
    let keys = Object.keys(obj);
    // console.log('all object keys', keys)

    // Create an object with all keys set to the default value (0)
    let def = keys.reduce((result, key) => {
        result[key] = 0;
        return result;
    }, {});
    // console.log('All keys with the default value 0', def);

    // Use object spread to overwrite the default values with the values we have
    let result = prData.map((item) => ({
        ...def,
        ...item
    }));

    // console.log('adds all values that we have to replace the 0 values', result)

    return result

    // CREDITS FOR user184994 @ stackoverflow
    // AMAZING RESOURCE: https://stackoverflow.com/questions/47870887/how-to-fill-in-missing-keys-in-an-array-of-objects
}

Thanks to the resource I got this function working. Instead of replacing all missing values with 'UNKNOWN', I later had to change the default to 0, because I wanted to add up the capacities and that only works with numbers. Still on my to-do list is an extra function that uses a fitting default for each key.
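
The to-do item above could be approached by passing a default value per key. What follows is a minimal sketch under that assumption and not the project's code. The name fixEmptyValuesWithDefaults, its parameters and the sample data are hypothetical.

```javascript
// Hypothetical variant of fixEmptyValues: missing keys get a default per key.
// Keys without an entry in `defaults` fall back to `fallback`.
function fixEmptyValuesWithDefaults(data, defaults = {}, fallback = 'UNKNOWN') {
    // Collect every key that occurs in at least one object
    const keys = [...new Set(data.flatMap(item => Object.keys(item || {})))];

    // Build one template object holding the default value for each key
    const template = {};
    for (const key of keys) {
        const hasDefault = Object.prototype.hasOwnProperty.call(defaults, key);
        template[key] = hasDefault ? defaults[key] : fallback;
    }

    // Values that exist on an item overwrite the defaults
    return data.map(item => ({ ...template, ...item }));
}

// Made-up sample: capacity defaults to 0, everything else to 'UNKNOWN'
const fixedSample = fixEmptyValuesWithDefaults(
    [{ city: 'Delft', capacity: 150 }, { city: 'Leiden' }, { capacity: 22 }],
    { capacity: 0 }
);
console.log(fixedSample);
// [
//   { city: 'Delft', capacity: 150 },
//   { city: 'Leiden', capacity: 0 },
//   { city: 'UNKNOWN', capacity: 22 }
// ]
```

With a numeric default for capacity, the capacities can still be summed, while a missing city name stays recognizable as 'UNKNOWN'.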

I could now extract my array of city names. The second set of city names was easier to extract: the filterData function was enough to reach that key. But I now have two arrays of city names, and I want to combine these into one array of objects. I wrote the following function to do that:

function wrapCity(cityNameOne, cityNameTwo) {
    let cities = cityNameOne.map((city, index) => {
        return {
            cityFirst: city,
            citySecond: cityNameTwo[index]
        }
    });

    return cities
}
// RESOURCE https://stackoverflow.com/questions/40539591/how-to-create-an-array-of-objects-from-multiple-arrays

Now I have an array of objects containing both city names. First step done!

{
    cityFirst: "Appingedam"
    citySecond: "Appingedam"
}

Step 2: Transforming the capacity

    let prSpecifications = filterData(prData, 'specifications') // all specifications
    let prSpecificationClean = removeOuterArray(prSpecifications); // all specifications without the outer array
    let prSPecificationFixed = fixEmptyValues(prSpecificationClean); // sets missing keys to 0


    let prCapicity = filterData(prSPecificationFixed, 'capacity'); //  Array of capacity for each parking area

These are the same steps as for gathering the city names: first filter the big data object for specifications, then remove the outer array and fix the empty values. After that, I could filter again for capacity. I now also have an array of capacities.
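
As a comparison, the same extraction could be written in one step with optional chaining. This is a sketch and not the code used above. The name getCapacities and the sample data are made up, and it assumes an environment that supports the ?. and ?? operators.

```javascript
// Hypothetical one-step alternative: read the capacity of the first specification,
// falling back to 0 when the array, the object or the key is missing.
function getCapacities(prData) {
    return prData.map(facility => facility.specifications?.[0]?.capacity ?? 0);
}

// Made-up sample in the shape described above
const capacitySample = [
    { specifications: [{ capacity: 22 }] },
    { specifications: [{}] }, // specification without capacity
    { specifications: [] },   // empty outer array
    {}                        // no specifications at all
];

console.log(getCapacities(capacitySample)); // [ 22, 0, 0, 0 ]
```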

Step 3: Transforming the description

let prDescription = filterData(data, 'description'); // array with descriptions

The description was not wrapped in other arrays or objects, so it was easy to extract.
I now have the following data arrays:

  • Array of objects with 2 city names
  • Array with capacities
  • Array with descriptions

Step 4: Transforming city names, description and capacity into an object

The next step is to combine this into 1 object. I did so with the following function:

function wrap(city, capacity, description) {
    let items = city.map((city, index) => {
        return {
            description: description[index],
            cityFirst: city.cityFirst,
            citySecond: city.citySecond,
            capacity: capacity[index]
        }
    });

    return items
}

My transformation is nearly done. My current object looks as follows:

{
    capacity: 150
    cityFirst: "Delft"
    citySecond: "Delft"
    description: "P+R Delft-Zuid (Delft)"
}

Step 5: Cleaning to just Randstad cities

Since my concept only includes the Randstad cities for now, I want my array to only have data points for those cities. I used the following functions for that:

function filterRandstad(prData, city) {
    let randstadData = prData.filter(array => {
        return array.cityFirst === city || array.citySecond === city; // keep the object if one of its city names equals the given city name
    })
    return randstadData;
}

function selectRandstad(objectArray) {
    let delftCities = filterRandstad(objectArray, 'Delft'); // creates an array with all datapoints for the city 
    let dordrechtCities = filterRandstad(objectArray, 'Dordrecht'); // creates an array with all datapoints for the city 
    let leidenCities = filterRandstad(objectArray, 'Leiden'); // creates an array with all datapoints for the city 
    let zaandamCities = filterRandstad(objectArray, 'Zaandam'); // creates an array with all datapoints for the city 
    let haarlemCities = filterRandstad(objectArray, 'Haarlem'); // creates an array with all datapoints for the city 
    let utrechtCities = filterRandstad(objectArray, 'Utrecht'); // creates an array with all datapoints for the city 
    let denHaagCities = filterRandstad(objectArray, 'Den Haag'); // creates an array with all datapoints for the city 
    let rotterdamCities = filterRandstad(objectArray, 'Rotterdam'); // creates an array with all datapoints for the city 
    let amsterdamCities = filterRandstad(objectArray, 'Amsterdam'); // creates an array with all datapoints for the city 

    let randstadCities = [...delftCities, ...dordrechtCities, ...leidenCities, ...zaandamCities, ...haarlemCities, ...utrechtCities, ...denHaagCities, ...rotterdamCities, ...amsterdamCities]
    // creates a new array that include all arrays for randstad cities

    return randstadCities;
}

So the function filterRandstad takes 2 arguments. The first one is the data and the second is the name of the city. In the second function, selectRandstad, I call filterRandstad once for each of the 9 Randstad cities, which gives me 9 arrays. I then spread these 9 arrays into 1 new array with 51 objects.
But to get a clean array to work with, I have to remove the duplicates and create an array of unique data points.

function listUnique(rsData) {
    const uniqueArray = rsData.filter((value, index) => {
        const keys = JSON.stringify(value);
        return index === rsData.findIndex(obj => {
            return JSON.stringify(obj) === keys;
        });
    });

    return uniqueArray;

    // Thanks to Eydrian @ stackoverflow
    // Resource: https://stackoverflow.com/questions/2218999/remove-duplicates-from-an-array-of-objects-in-javascript

}
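
One possible cause of duplicates is an object that matches two different cities, so it ends up in two of the 9 arrays. A single pass over the data with a Set of city names would avoid that case. The sketch below illustrates the idea and is not the project's code: the names RANDSTAD_CITIES and selectRandstadOnce and the sample data are hypothetical. Genuinely duplicated entries in the source data would still need listUnique.

```javascript
// The nine Randstad cities used in selectRandstad
const RANDSTAD_CITIES = ['Delft', 'Dordrecht', 'Leiden', 'Zaandam', 'Haarlem',
    'Utrecht', 'Den Haag', 'Rotterdam', 'Amsterdam'];

// Hypothetical single-pass alternative: each object is checked once,
// so an object can never end up in the result twice.
function selectRandstadOnce(objectArray, cities = RANDSTAD_CITIES) {
    const citySet = new Set(cities);
    return objectArray.filter(item => citySet.has(item.cityFirst) || citySet.has(item.citySecond));
}

// Made-up sample data
const mixedSample = [
    { description: 'A', cityFirst: 'Delft', citySecond: 'Den Haag', capacity: 10 },
    { description: 'B', cityFirst: 'Appingedam', citySecond: 'Appingedam', capacity: 22 },
    { description: 'C', cityFirst: 0, citySecond: 'Utrecht', capacity: 5 }
];

console.log(selectRandstadOnce(mixedSample).map(item => item.description)); // [ 'A', 'C' ]
```

Unlike selectRandstad, this keeps the objects in their original order instead of grouping them per city.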

Step 6: Transforming the 2 city names into just one object key "city"

I now have an array with 50 unique objects for all P+R parking areas in the Randstad. But each object still has 2 city names, which I am going to transform into 1 city name.

function combineData(rsData, city) {
    let cleanData = rsData.map((data) => {
        if (data.cityFirst === city || data.citySecond === city) {
            // console.log('i am looking for', city)
            return {
                description: data.description,
                city: city, // REPLACES cityFirst and citySecond FOR CITY: CITY
                capacity: data.capacity // ADDS THE CAPACITY
            }
        } else if (data.city != undefined) {
            return {
                description: data.description,
                city: data.city, // CHECKS IF DATA.CITY ALREADY EXIST, IF SO RETURNS THE SAME DATA.
                capacity: data.capacity // ADDS CAPACITY
            }
        } else return { // IF THE ABOVE STATEMENT IS NOT SO, THEN RETURN THE SAME OBJECT AS BEFORE.
            description: data.description,
            cityFirst: data.cityFirst,
            citySecond: data.citySecond,
            capacity: data.capacity
        }
    })
    return cleanData;

    // THIS CAN PROBABLY BE DONE MORE NEATLY WITH AN ARRAY OF RANDSTAD CITIES THAT IS LOOPED OVER HERE. MAYBE FOR LATER.
}

function cleanRandStadData(rsData) {
    let delftClean = combineData(rsData, 'Delft'); 
    let dordrechtClean = combineData(delftClean, 'Dordrecht'); 
    let leidenClean = combineData(dordrechtClean, 'Leiden'); 
    let zaandamClean = combineData(leidenClean, 'Zaandam'); 
    let haarlemClean = combineData(zaandamClean, 'Haarlem'); 
    let utrechtClean = combineData(haarlemClean, 'Utrecht'); 
    let denHaagClean = combineData(utrechtClean, 'Den Haag'); 
    let rotterdamClean = combineData(denHaagClean, 'Rotterdam');
    let amsterdamClean = combineData(rotterdamClean, 'Amsterdam'); 
    // Returns Array with objects that combines both citynames to this cityname and adds it to the previous array.

    let cleanData = [...amsterdamClean]; // creates a new array with all objects from above.
    let fixedCleanData = fixDescription(cleanData); // if description is undefined it sets the description as P+R + city name.

    return fixedCleanData;
}

The function combineData takes two arguments, the data and the city. If the city argument equals either one of the two city names in the object, it returns a new object with a single city key set to that city. If it does not match, it returns the object unchanged.
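
The idea from the comment in combineData, looping over an array of Randstad cities, could look roughly like the sketch below. It is an illustration under assumptions and not the project's code: the name combineAllCities and the sample data, including the capacities, are made up.

```javascript
// Hypothetical helper: picks the first city in `cities` that matches one of the two city names.
function combineAllCities(rsData, cities) {
    return rsData.map(data => {
        const match = cities.find(city => data.cityFirst === city || data.citySecond === city);
        if (match === undefined) {
            return data; // no city from the list found, keep the object as it was
        }
        return {
            description: data.description,
            city: match, // replaces cityFirst and citySecond
            capacity: data.capacity
        };
    });
}

// Made-up sample data
const combined = combineAllCities(
    [
        { description: 'Example 1', cityFirst: 'Delft', citySecond: 'Delft', capacity: 150 },
        { description: 'Example 2', cityFirst: 0, citySecond: 'Utrecht', capacity: 100 },
        { description: 'Example 3', cityFirst: 'Appingedam', citySecond: 0, capacity: 22 }
    ],
    ['Delft', 'Utrecht']
);

console.log(combined.map(item => item.city)); // [ 'Delft', 'Utrecht', undefined ]
```

As with the chained combineData calls, the first matching city in the list wins when an object matches more than one.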

Result: Data cleaned and transformed for use

I now have a clean and transformed data array of 50 objects that look like this:

{
    capacity: 150
    city: "Delft"
    description: "P+R Delft-Zuid (Delft)"
}

This data can be used for my visualization.

Extra step: Transforming extra data array for interaction in visual

For my interaction in the data visualization, I want to show the 9 Randstad cities with a total amount of parking spaces.
I got this data transformed by the following function:

export function combineDoubleCities(rsData) {
    let arr = rsData,
        result = [];

    arr.forEach(function (a) {
        if (!this[a.city]) { // if the city name is not in the array yet, continue
            this[a.city] = { // create new object
                description: a.city, // Quickfix for showing right cityname at datavis. should be description normally
                city: a.city, // city name is city out of previous object
                capacity: 0 // sets capacity to starting point of 0
            };
            result.push(this[a.city]); // push object to new array
        }
        this[a.city].capacity += a.capacity; // adds the capacity
    }, Object.create(null));

    return result;

    // WINNING RESOURCE: https://stackoverflow.com/questions/38294781/how-to-merge-duplicates-in-an-array-of-objects-and-sum-a-specific-property
    // RESOURCE: https://stackoverflow.com/questions/60036060/combine-object-array-if-same-key-value-in-javascript
}

By using this function I get a new array of only 9 objects, each with the total number of parking spaces for that city:

0: {description: "P+R Delft-Zuid (Delft)", city: "Delft", capacity: 150}
1: {description: "P+R Energiehuis (Dordrecht)", city: "Dordrecht", capacity: 150}
2: {description: "P+R Station Leiden Lammenschans (Leiden)", city: "Leiden", capacity: 636}
3: {description: "P+R Zaandam", city: "Zaandam", capacity: 119}
4: {description: "P+R Station Haarlem Spaarnwoude (Haarlem)", city: "Haarlem", capacity: 152}
5: {description: "P+R Papendorp", city: "Utrecht", capacity: 4675}
6: {description: "P+R Terrein Leidschenveen (Den Haag)", city: "Den Haag", capacity: 1929}
7: {description: "P+R Hoogvliet (Rotterdam)", city: "Rotterdam", capacity: 7622}
8: {description: "P+R Bos en Lommer (Amsterdam)", city: "Amsterdam", capacity: 3380}
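
For comparison, the same kind of total per city could also be computed with a Map instead of the this argument of forEach. This is an alternative sketch, not the function used above: the name sumCapacityPerCity and the sample data are hypothetical.

```javascript
// Hypothetical alternative: sum the capacity per city with a Map.
function sumCapacityPerCity(rsData) {
    const totals = new Map();
    for (const item of rsData) {
        totals.set(item.city, (totals.get(item.city) || 0) + item.capacity);
    }
    // Turn the Map back into an array of objects, in order of first appearance
    return [...totals].map(([city, capacity]) => ({ city, capacity }));
}

// Made-up sample data
const totalsSample = sumCapacityPerCity([
    { description: 'X', city: 'Delft', capacity: 100 },
    { description: 'Y', city: 'Utrecht', capacity: 40 },
    { description: 'Z', city: 'Delft', capacity: 50 }
]);

console.log(totalsSample);
// [ { city: 'Delft', capacity: 150 }, { city: 'Utrecht', capacity: 40 } ]
```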

Cleaning NS API dataset

I started off by working with this dataset. Later on, however, I noticed that the P+R data was incomplete: the city of Amsterdam, for example, had only 1 or 2 P+R facilities, while this should be closer to 10. That's when I decided to look for another API, which turned out to be the NPR open dataset.

But the work I did on the NS dataset is described below:
The NS dataset has a massive amount of data. Opening the API shows me 287 different datasets. The ones that are interesting are:

0: {type: "stationV2", name: "Stations", identifiers: Array(0), locations: Array(150), open: "Unknown", …}
4: {type: "stationfacility", name: "P+R betaald", identifiers: Array(1), categories: Array(2), locations: Array(60), …}
13: {type: "stationfacility", name: "P+R gratis", identifiers: Array(1), categories: Array(2), locations: Array(150), …}

The first one [0] is the dataset with all information about all train stations connected to 'De Nederlandse Spoorwegen'.
Using the filterData function, I extracted a lot of the nested information into separate arrays, like this:

function filterData(dataArray, column) {
    return dataArray.map(result => result[column]);
}

let trainStationName = filterData(trainStation, 'name'); // Name of the train station
let trainStationCode = filterData(trainStation, 'stationCode'); // Unique code for each train station

// location train station
let trainStationLongitudeArray = filterData(trainStation, 'lng'); // longitude train station
let trainStationLatitudeArray = filterData(trainStation, 'lat'); // latitude train station
let trainStationLocation = latLongCombine(trainStationLatitudeArray, trainStationLongitudeArray);

let trainStationCountry = filterData(trainStation, 'land');
let trainStationNL = filterCountryNL(trainStation); // all stations based in NL

To combine the longitude and latitude I used the following function:

function latLongCombine(latitudeArray, longitudeArray) {
    let locationArray = latitudeArray.map(function (latitude, index) {
        return [latitude, longitudeArray[index]];
    });
    return locationArray;

}
// Combining lat + long array
//Resource: https://stackoverflow.com/questions/47235728/how-to-merge-two-arrays-with-latitudes-and-longitudes-to-display-markers

Filtering out the train stations in the Netherlands

function filterCountryNL(stations) {
    let countryNL = stations.filter(function (stationArray) {
        return stationArray.land === 'NL';
    })
    return countryNL;
}

After gathering all that information into new arrays, I got to work on the P+R parking datasets. I gathered a lot of data arrays, as shown below. I had not yet started transforming them for visualization.

       let prPaid = nsData[4].locations; // all PR paid location data

        // PR Name
        let prPaidName = filterData(prPaid, 'name'); // Name of the PR parking area

        // PR station code
        let prPaidStationCode = filterData(prPaid, 'stationCode'); // refers to closest train station

        // PR paid location data
        let prPaidLongitudeArray = filterData(prPaid, 'lng'); // longitude PR parking area
        let prPaidLatitudeArray = filterData(prPaid, 'lat'); // latitude PR parking area
        let prPaidLocation = latLongCombine(prPaidLatitudeArray, prPaidLongitudeArray); // Array of lat + long combined
        let prPaidCity = filterData(prPaid, 'city'); // All city names
        let prPaidCityClean = removeEmptySlots(prPaidCity); // all city names with empty slots removed

        // PR paid rates
        let prPaidRates = filterData(prPaid, 'extra'); // ALL rates: Day rate regular, Hour rate regular, Day rate train passenger. + Total amount parking spaces
        let prPaidRegularDayRate = filterData(prPaidRates, 'Dagtarief regulier'); // Day rate regular parking
        let prPaidRegularHourRate = filterData(prPaidRates, 'Uurtarief regulier'); // Hourly rate regular parking
        let prPaidTrainPassengerRate = filterData(prPaidRates, 'Dagtarief treinreiziger'); // Day rate for a train passenger

        // PR parking spaces
        let prPaidTotalParkingSpots = filterData(prPaidRates, 'Aantal parkeerplaatsen'); // Total amount of parking spots in the PR parking area

        // ------------- PR FREE PARKING AREAS ------------- ------------- ------------- ------------- ------------- ------------- ------------- 
        let prFree = nsData[13].locations; // all PR free location data

        // PR Name
        let prFreeName = filterData(prFree, 'name'); // Name of the PR parking area


        // PR station code
        let prFreeStationCode = filterData(prFree, 'stationCode'); // refers to closest train station

        // PR free location data
        let prFreeLongitudeArray = filterData(prFree, 'lng'); // longitude PR parking area
        let prFreeLatitudeArray = filterData(prFree, 'lat'); // latitude PR parking area
        let prFreeLocation = latLongCombine(prFreeLatitudeArray, prFreeLongitudeArray); // Array of lat + long combined
        let prFreeCity = filterData(prFree, 'city'); // All city names
        let prFreeCityClean = removeEmptySlots(prFreeCity); // all city names with empty slots removed

        // PR parking spaces
        let prFreeExtra = filterData(prFree, 'extra');
        let prFreeTotalParkingSpots = filterData(prFreeExtra, 'Aantal parkeerplaatsen'); // Total amount of parking spots in the PR parking area
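
If this dataset were taken further, the separate arrays could be replaced by one object per P+R location. The sketch below only illustrates that idea and was not part of the project. The helper name toPrObjects and the sample values are made up, and it is an assumption that the number of parking spots may arrive as a string.

```javascript
// Hypothetical helper: one object per P+R location instead of one array per key.
function toPrObjects(locations) {
    return locations.map(location => ({
        name: location.name,
        stationCode: location.stationCode,
        location: [location.lat, location.lng],
        city: location.city,
        // 'extra' may be missing and the value may be a string, so convert defensively
        parkingSpots: Number(location.extra?.['Aantal parkeerplaatsen'] ?? 0)
    }));
}

// Made-up sample data using the keys from the arrays above
const prObjects = toPrObjects([
    { name: 'P+R Example', stationCode: 'XX', lat: 52.1, lng: 4.3, city: 'Example',
        extra: { 'Aantal parkeerplaatsen': '120' } },
    { name: 'P+R No extra', stationCode: 'YY', lat: 51.9, lng: 4.5, city: 'Example' }
]);

console.log(prObjects.map(item => item.parkingSpots)); // [ 120, 0 ]
```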

What my data looks like after cleaning and transforming

Keep in mind that I only worked on this until I had all my data in individual arrays. The only thing I transformed was the location, which I combined into [latitude, longitude] pairs. After that I quit working on this dataset.

Station data

let trainStationName = Array with train station names (150 in total)
let trainStationCode = Array with station codes (150 in total)
let trainStationLocation = Array with [latitude, longitude] pairs (150 in total)
let trainStationNL = Array with objects for all train stations in the Netherlands (106 in total)

P+R Paid data

let prPaid = Array with all P+R locations that are paid (60 in total)
let prPaidName = Array with names of the P+R paid locations (60 in total)
let prPaidLocation = Array with [latitude, longitude] pairs (60 in total)
let prPaidCity = Array with city names of the P+R (60 in total)
let prPaidRegularDayRate = Array with Day rate regular parking
let prPaidRegularHourRate = Array with Hourly rate regular parking
let prPaidTrainPassengerRate = Array with Day rate for a train passenger
let prPaidTotalParkingSpots = Array with Total amount of parking spots in the PR parking area

P+R free data

let prFree = Array with all P+R locations that are free (150 in total)
let prFreeName = Array with names of the P+R free locations (150 in total)
let prFreeLocation = Array with [latitude, longitude] pairs (150 in total)
let prFreeCity = Array with city names of the P+R (150 in total)
let prFreeTotalParkingSpots = Array with Total amount of parking spots in the PR parking area

Cleaning Survey data

Survey data column: Aantal glazen water per dag

Because my coding skills are still a bit rusty, I started off with an easy data column to clean.
The data in this column is the number of glasses of water each participant drinks per day.

Original column information

  • 93 datapoints
  • string data

To do list

  • remove all empty keys
  • convert strings to number / integer data.

Remove all empty keys

In order to use the data, I want to remove all empty entries (empty strings) from the array.
The code below shows how:

function removeEmptySlots(arr) {
    let cleanData = arr.filter(keys => keys != "");
    return cleanData;
}

Convert strings to integers

To use this data, I can imagine one would prefer numbers instead of strings.
In the code below I convert each string to a number with the unary plus operator:

function stringToNumbers(arr) {
    let newCleanData = arr.map(x => +x);
    return newCleanData;
}

Resource: https://stackoverflow.com/questions/15677869/how-to-convert-a-string-of-numbers-to-an-array-of-numbers
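
One thing to keep in mind with the unary plus: an empty string converts to 0 and text that is not a number converts to NaN. That is why the empty entries are removed first. A more defensive variant could look like the sketch below. It is not the code used here. The name stringsToValidNumbers and the sample input are made up, and accepting a decimal comma is an assumption about how participants might type.

```javascript
// Hypothetical variant of stringToNumbers: converts and drops entries that are not numeric.
function stringsToValidNumbers(arr) {
    return arr
        .map(value => Number(value.trim().replace(',', '.'))) // accept a decimal comma (assumption)
        .filter(number => Number.isFinite(number));           // drops NaN and Infinity
}

console.log(stringsToValidNumbers(['3', ' 8 ', '1,5', 'veel'])); // [ 3, 8, 1.5 ]
```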

New clean column information

  • 92 datapoints
  • numbers

Survey data column: Oog kleur

To challenge myself a bit further, I chose to clean the eye color data column.
I want all eye colors to be a HEX color code.

Original column information

  • 93 datapoints
  • string data
  • Most of the data is a Hex color code (#d2691e)

To do list

  • Convert all input to lowercase
  • Convert all color names to hex color codes
  • Remove all spaces in-between strings
  • Check if there is a hashtag in front of the code

Convert all input to lowercase

Before I can start converting all color names to hex color codes, I want all color names to be lowercase.
Below is the code I wrote to convert all array data to lowercase:

function toLowerCase(arr) {
    let newCleanData = arr.map(x => x.toLowerCase());
    return newCleanData;
}

Convert all color names to hex color codes

To have a uniform array of hex color codes, I have to convert all color names to color codes.
I checked on a website (see resources) which color name pairs with which hex color code.
Not all of these color names are present in the original data, but I included all of the main colors with future data in mind.
I converted the names by using the .replace() method, as shown in the code below:

function replaceColorNamesToHexcolors(arr) {
    var cleanData = arr.map(
        x => {
            return x
                .replace(/blauw/, '#0000FF')
                .replace(/blue/, '#0000FF')
                .replace(/groen/, '#008000')
                .replace(/green/, '#008000')
                .replace(/bruin/, '#A52A2A')
                .replace(/brown/, '#A52A2A')
                .replace(/rood/, '#FF0000')
                .replace(/red/, '#FF0000')
                .replace(/roze/, '#FFC0CB')
                .replace(/pink/, '#FFC0CB')
                .replace(/oranje/, '#FFA500')
                .replace(/orange/, '#FFA500')
                .replace(/geel/, '#FFFF00')
                .replace(/yellow/, '#FFFF00')
                .replace(/paars/, '#800080')
                .replace(/purple/, '#800080')
                .replace(/grijs/, '#808080')
                .replace(/gray/, '#808080')
                .replace(/wit/, '#FFFFFF')
                .replace(/white/, '#FFFFFF');
        });
    return cleanData; // Array with colornames converted to hex colors.
}

Resource: https://stackoverflow.com/questions/953311/replace-string-in-javascript-array
Resource: https://stackoverflow.com/questions/7990879/how-to-combine-str-replace-expressions-in-javascript
Resource: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/replace
Resource: Color data https://htmlcolorcodes.com/color-names/
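
A lookup table is another way to do this conversion. The sketch below shows that alternative and is not the code used above. The names COLOR_NAMES and colorNamesToHex are hypothetical. Note the difference in behaviour: it only converts an entry when the whole entry is a known color name, while the chained .replace() calls also replace a color name inside a longer string.

```javascript
// Color names mapped to hex codes (same values as in replaceColorNamesToHexcolors)
const COLOR_NAMES = new Map([
    ['blauw', '#0000FF'], ['blue', '#0000FF'],
    ['groen', '#008000'], ['green', '#008000'],
    ['bruin', '#A52A2A'], ['brown', '#A52A2A'],
    ['rood', '#FF0000'], ['red', '#FF0000'],
    ['roze', '#FFC0CB'], ['pink', '#FFC0CB'],
    ['oranje', '#FFA500'], ['orange', '#FFA500'],
    ['geel', '#FFFF00'], ['yellow', '#FFFF00'],
    ['paars', '#800080'], ['purple', '#800080'],
    ['grijs', '#808080'], ['gray', '#808080'],
    ['wit', '#FFFFFF'], ['white', '#FFFFFF']
]);

// Hypothetical helper: replaces an entry only when it is exactly a known color name.
function colorNamesToHex(arr, colors = COLOR_NAMES) {
    return arr.map(value => {
        const key = value.trim().toLowerCase();
        return colors.has(key) ? colors.get(key) : value;
    });
}

console.log(colorNamesToHex(['Blauw', '#d2691e', 'green ', 'blauwgrijs']));
// [ '#0000FF', '#d2691e', '#008000', 'blauwgrijs' ]
```

Adding a new color name then only takes one extra entry in the table.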

Remove all spaces in strings

There were some data points that had a space between the # and the color code.
To turn these entries into valid color codes, I have to remove the spaces.

function removeSpaces(arr) {
    let cleanData = arr.map(keys => keys.replace(/ /g, '')); // the g flag removes every space, not just the first one
    return cleanData; 
}

Check if there is a hashtag in front of the code

There are a few data entries that just have the color code without the hashtag.
To make sure these are valid color codes I have to add a hashtag in front of the code.
I got a bit stuck here, because I wanted to do this with indexOf(), but concluded that this wasn't going to work.
I took a sneak peek at the code of my support group member Marco and found out he was using the charAt() method. This was my breakthrough.

function hexCheck(arr) { // Check if array items start with #
    let cleanData = arr;
    for (let index = 0; index < cleanData.length; index++) { // declared loop variable instead of an implicit global
        if (cleanData[index].charAt(0) !== '#') { // If the first char is not a #
            cleanData[index] = '#' + cleanData[index] // add the # in front of the string
        }
    }
    return cleanData // return array with added #
}

Resource: https://github.com/marcoFijan/functional-programming/blob/bfa470db102b7c095bc4bf9d1bccae51cfcb6129/index.js#L58-L70

New data information

So after all my data cleaning I still have 93 data points, but 2 of them are still not valid.
One is a combination of a color code and a color name.
The other is an RGB value.
At this point, the course advances toward the RDW datasets, and for now I decided to move on.
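
To find the entries that are still invalid, a check with a regular expression could be used. This is a sketch only and was not part of the project. The names HEX_COLOR and splitValidHex and the sample entries are made up, and only 6-digit hex codes are treated as valid here.

```javascript
// Matches a # followed by exactly 6 hexadecimal digits (3-digit shorthand is not accepted here)
const HEX_COLOR = /^#[0-9a-f]{6}$/i;

// Hypothetical helper: splits the entries into valid and invalid hex color codes.
function splitValidHex(arr) {
    return {
        valid: arr.filter(value => HEX_COLOR.test(value)),
        invalid: arr.filter(value => !HEX_COLOR.test(value))
    };
}

// Made-up sample entries
const checked = splitValidHex(['#d2691e', '#0000FF', '#bruin#a52a2a', '#rgb(139,69,19)']);
console.log(checked.invalid); // [ '#bruin#a52a2a', '#rgb(139,69,19)' ]
```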