Entry 2.4: Mock Quiz Concept Review - bcb420-2025/Chloe_Calica GitHub Wiki

grep() vs grepl()

  • Quiz solution: myIDs$name[grep(myIDs$name, pattern="^[0-9]")]
  • My answer: myIDs$name[grepl("[1-9], myIDs$name)]
    • Missing a quotation mark
    • Missing "^" to indicate that we only want to look at the start of the string
    • Started at 1 instead of 0
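    • Used grepl() instead of grep() (with a correct pattern, grepl() would select the same elements when subsetting; the solution uses grep())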

Comparison Table

| Feature | grep() | grepl() |
|---|---|---|
| Output | Indices (or values if value = TRUE) | Logical vector (TRUE or FALSE) |
| When to use | Find positions or extract elements | Perform logical operations (subsetting, conditional checks) |
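
To make the difference in output concrete, here is a small sketch. The data frame is a made-up stand-in for the quiz's myIDs (only the column name `name` comes from the quiz; the values are invented for illustration).

```r
# Hypothetical stand-in for the quiz's myIDs data frame
myIDs <- data.frame(name = c("1abc", "gene2", "0xyz", "TP53"),
                    stringsAsFactors = FALSE)

idx  <- grep("^[0-9]", myIDs$name)                # integer indices of the matches
vals <- grep("^[0-9]", myIDs$name, value = TRUE)  # the matching strings themselves
lgl  <- grepl("^[0-9]", myIDs$name)               # one TRUE/FALSE per element

idx   # 1 3
vals  # "1abc" "0xyz"
lgl   # TRUE FALSE TRUE FALSE

# Used as a subsetting index, both forms pick the same elements
same <- identical(myIDs$name[idx], myIDs$name[lgl])
same  # TRUE
```

Note that grep() returns only the positions that matched, while grepl() always returns a vector as long as its input.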

Regular Expression Patterns

  1. Anchors
  • ^: match the start of a string
  • $: match the end of a string
  2. Character classes
  • [abc]: match any one character from the set
  • [^abc]: match any one character not in the set
  • [a-z]: match any one lowercase letter from a to z
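
A quick way to check these patterns is to run each one through grepl() on a small test vector. The vector `x` below is invented for illustration. No expected output is given for the [a-z] line because R's documentation warns that character ranges can be interpreted differently depending on the locale.

```r
# Made-up test strings
x <- c("apple", "Banana", "cherry", "42nd", "kiwi")

starts_a   <- grepl("^a", x)       # TRUE FALSE FALSE FALSE FALSE
ends_y     <- grepl("y$", x)       # FALSE FALSE TRUE FALSE FALSE
first_abc  <- grepl("^[abc]", x)   # TRUE FALSE TRUE FALSE FALSE
first_not  <- grepl("^[^abc]", x)  # FALSE TRUE FALSE TRUE TRUE
first_low  <- grepl("^[a-z]", x)   # first character in the range a to z (locale-dependent)

# Combining an anchor with a class, as in the quiz: strings starting with a digit
x[grepl("^[0-9]", x)]              # "42nd"
```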

log() vs log2()

  • Quiz solution: myIDs$log_ratio <- log2(myIDs$stim / myIDs$ctrl)

  • My answer: myIDs$fold <- log(myIDs$ctrl / myIDs$stim)

    • Used log() (natural logarithm by default) instead of log2()
    • Inverted the numerator and denominator: the control should always be in the denominator
  • Function Definitions

    • log(x, base): the general form; computes logarithms in any base
      • base defaults to e (natural logarithm)
    • log2(x): a shortcut equivalent to log(x, base = 2)
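
The two mistakes in my answer can be seen with a few numbers. The `ctrl` and `stim` values below are invented for illustration and are not from the quiz data.

```r
# Made-up expression values for three genes
ctrl <- c(10, 20, 5)
stim <- c(20, 20, 2.5)

# Correct log2 ratio: stimulated over control
log_ratio <- log2(stim / ctrl)
log_ratio                     # 1 0 -1 (doubled, unchanged, halved)

# Inverting the ratio flips the sign of every value
inverted <- log2(ctrl / stim)
inverted                      # -1 0 1

# log() without a base is the natural log, so the scale is different
natural <- log(stim / ctrl)

# log2(x) and log(x, base = 2) agree
same_base <- isTRUE(all.equal(log2(8), log(8, base = 2)))
same_base                     # TRUE
```

On the log2 scale a value of 1 means a two-fold increase and -1 a two-fold decrease, which is why the direction of the ratio matters.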

strsplit()

Definition

  • Splits strings into substrings at a specified delimiter or pattern.
  • Returns a list with one element per input string; each element is a character vector of the resulting substrings.

Syntax

strsplit(x, split)

  • x: character vector to split
  • split: the delimiter or pattern used to split the strings (interpreted as a regular expression by default)
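
A minimal sketch of the list that strsplit() returns. The `ids` vector and the "_" delimiter are invented for illustration.

```r
# Made-up identifiers with "_" as the delimiter
ids <- c("TP53_human", "BRCA1_mouse_v2")

parts <- strsplit(ids, split = "_")

length(parts)   # 2: one list element per input string
parts[[1]]      # "TP53" "human"
parts[[2]]      # "BRCA1" "mouse" "v2"
lengths(parts)  # 2 3: the elements can have different lengths

# Take the first substring of each input string
first_piece <- sapply(parts, `[`, 1)
first_piece     # "TP53" "BRCA1"
```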

Why use unlist after strsplit?

Because strsplit() returns a list, unlist() is often used to flatten that list into a vector. This is useful when you want to work with the substrings in a single flat structure rather than a list.

Use unlist when:

  • the input is a single string, so the result is a list with only one element
  • you want to flatten all substrings into a single vector
  • you need to apply vectorized operations like filtering, subsetting, or matching
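
The cases above can be sketched as follows. The strings are invented for illustration.

```r
# Single input string: the list has one element, so unlist() gives a plain vector
one <- "A,B,C"
pieces <- unlist(strsplit(one, split = ","))
pieces                  # "A" "B" "C"

# Vectorized operations now work directly on the substrings
kept <- pieces[pieces != "B"]
kept                    # "A" "C"

# Several input strings: unlist() flattens everything into one vector,
# so the grouping by input string is lost
many <- c("A,B", "C,D,E")
flat <- unlist(strsplit(many, split = ","))
flat                    # "A" "B" "C" "D" "E"
```

If you still need to know which input string each substring came from, keep the list and do not call unlist().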