Riffing on splitting - kevshouse/exam_quest GitHub Wiki
ft_split Explained
This document breaks down the ft_split C function, which splits a string into an array of words using whitespace as the delimiter.
High-Level Goal
The primary objective of ft_split is to take a string (e.g., " hello world ") and produce a NULL-terminated array of strings (e.g., ["hello", "world", NULL]). This requires careful memory management to allocate space for both the array of pointers and each individual word.
To comply with the 42 School Norm, the logic is broken down into several small, static helper functions.
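To make the role of the NULL terminator concrete, here is a minimal sketch of how a caller can walk such an array without knowing its length in advance. The helper name count_entries is hypothetical and is not part of the original source.

```c
#include <stddef.h>

/* Hypothetical helper for illustration only: counts the strings in a
   NULL-terminated array by walking it until the NULL sentinel. */
int count_entries(char **arr)
{
    int n;

    n = 0;
    while (arr && arr[n] != NULL)
        n++;
    return (n);
}
```

Because the array carries its own end marker, ft_split only has to return a single pointer and does not need a separate length output.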
Core Functions
1. main
The main function serves as the entry point for testing. It initializes a sample string, calls ft_split to perform the split, and then iterates through the resulting array to print each word. Finally, it demonstrates proper memory management by freeing each word and then the array itself.
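/* stdio.h provides printf; stdlib.h provides malloc and free */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    char *test_string = " See you later, alligator! ";
    char **result;
    int i = 0;

    printf("Original string: \"%s\"\n", test_string);
    result = ft_split(test_string);
    if (result)
    {
        printf("Split result:\n");
        while (result[i])
        {
            printf(" Word %d: \"%s\"\n", i, result[i]);
            free(result[i]);    /* free each word once it has been printed */
            i++;
        }
        free(result);           /* then free the array of pointers */
    }
    return (0);
}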
2. ft_split
This is the main public function that orchestrates the entire splitting process. Its logic is as follows:
- Count Words: It first calls count_words to determine how many words are in the input string. This is crucial for allocating the correct amount of memory for the main array.
- Allocate Array: It allocates memory for the array of char pointers (char **). The size is word_count + 1 to accommodate a NULL terminator at the end.
- Extract and Fill: It enters a loop that runs word_count times. In each iteration, it calls get_next_word to extract the next word from the string.
- Error Handling: If get_next_word fails (returns NULL), it triggers a cleanup routine, freeing all previously allocated memory before returning NULL.
- Null-Termination: After the loop, it sets the final element of the array to NULL.
char **ft_split(char *str)
{
    char **arr;
    int word_count;
    int i;

    if (!str)
        return (NULL);
    word_count = count_words(str);
    /* one extra slot for the terminating NULL pointer */
    arr = (char **)malloc(sizeof(char *) * (word_count + 1));
    if (!arr)
        return (NULL);
    i = 0;
    while (i < word_count)
    {
        arr[i] = get_next_word(&str);
        if (!arr[i])
        {
            /* allocation failed: free the words stored so far, then the array */
            while (i > 0)
                free(arr[--i]);
            free(arr);
            return (NULL);
        }
        i++;
    }
    arr[i] = NULL;
    return (arr);
}
3. get_next_word
This function is the core of the word extraction logic. It is designed to be called repeatedly to get one word at a time.
- Pointer to Pointer: It takes a char **str_ptr as an argument. This allows it to modify the str pointer in the ft_split function directly, so the next call to get_next_word starts from where the last one left off.
- Skip Whitespace: It first advances the pointer past any leading whitespace.
- Find Word End: It then finds the end of the current word by scanning for the next whitespace character.
- Allocate and Copy: It allocates the exact amount of memory needed for the word and copies it over, adding a \0 at the end.
- Update Pointer: Finally, it updates the original str pointer (via *str_ptr) to the end of the extracted word, ensuring the next call starts at the right place.
static char *get_next_word(char **str_ptr)
{
    char *str;
    char *word_start;
    int len;
    char *word;
    int i;

    str = *str_ptr;
    /* skip leading whitespace */
    while (*str && is_whitespace(*str))
        str++;
    word_start = str;
    /* measure the word */
    len = 0;
    while (str[len] && !is_whitespace(str[len]))
        len++;
    word = (char *)malloc(sizeof(char) * (len + 1));
    if (!word)
        return (NULL);
    i = -1;
    while (++i < len)
        word[i] = word_start[i];
    word[i] = '\0';
    /* advance the caller's pointer to just past the copied word */
    *str_ptr = str + len;
    return (word);
}
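The pointer-to-pointer idea can be seen in isolation with a smaller sketch that allocates nothing. The function next_token_len below is a hypothetical illustration, not part of the original source, and for brevity it treats only the space character as a delimiter. It returns the length of the next word and moves the caller's cursor past it.

```c
#include <stddef.h>

/* Hypothetical helper for illustration only: returns the length of the
   next space-delimited token and advances *cursor to just past it.
   Returns 0 once the cursor reaches the end of the string. */
size_t next_token_len(const char **cursor)
{
    const char *s;
    size_t len;

    s = *cursor;
    while (*s == ' ')
        s++;
    len = 0;
    while (s[len] && s[len] != ' ')
        len++;
    *cursor = s + len;  /* the write that the caller will observe */
    return (len);
}
```

Calling it repeatedly on the same cursor yields one word length per call, because each call leaves the cursor where the next one should begin. Passing a plain char * instead would advance only a local copy, and every call would find the first word again.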
4. count_words
This helper function provides a preliminary scan of the string to count how many words it contains. It works by iterating through the string and identifying sequences of non-whitespace characters.
static int count_words(char *str)
{
    int count;

    count = 0;
    while (*str)
    {
        while (*str && is_whitespace(*str))
            str++;
        if (*str && !is_whitespace(*str))
        {
            count++;
            while (*str && !is_whitespace(*str))
                str++;
        }
    }
    return (count);
}
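As an alternative way to picture the same scan, the sketch below counts words in a single pass by tracking whether the previous character was inside a word. This is not the version used in this document: the name count_words_flag is hypothetical, and the whitespace test is inlined so the block stands on its own.

```c
/* Alternative sketch for illustration only: counts a word each time the
   scan moves from whitespace (or the string start) into a non-whitespace
   character. */
int count_words_flag(const char *str)
{
    int count;
    int in_word;

    count = 0;
    in_word = 0;
    while (*str)
    {
        if (*str == ' ' || *str == '\t' || *str == '\n')
            in_word = 0;
        else if (!in_word)
        {
            in_word = 1;
            count++;
        }
        str++;
    }
    return (count);
}
```

Both formulations should agree on every input, including strings that are empty or contain only whitespace, where the count is zero.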
5. is_whitespace
A simple utility that returns 1 if a character is a space, tab, or newline, and 0 otherwise. This keeps the main logic clean and readable.
static int is_whitespace(char c)
{
    return (c == ' ' || c == '\t' || c == '\n');
}
Memory Management
Proper memory handling is critical:
- The ft_split function allocates memory for the main array (char **).
- The get_next_word function allocates memory for each individual word (char *).
- If any allocation fails, a cleanup process is triggered to free all previously allocated memory to prevent leaks.
- The main function shows the correct way to free the memory after use: first, free each word in the array, and then free the array itself.
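The freeing order described above could be wrapped in a small helper so that every caller releases the result the same way. The sketch below is an assumption for illustration: free_split does not exist in the original source, and it returns the number of words freed only so that the example can be checked.

```c
#include <stdlib.h>

/* Hypothetical helper for illustration only: frees every word in a
   NULL-terminated array, then the array itself. Accepts NULL and
   returns the number of words that were freed. */
int free_split(char **arr)
{
    int i;

    if (!arr)
        return (0);
    i = 0;
    while (arr[i])
    {
        free(arr[i]);   /* words first */
        i++;
    }
    free(arr);          /* then the array of pointers */
    return (i);
}
```

The order matters: freeing the array first would discard the only pointers to the individual words, leaving them impossible to release.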