Design Approach - nsip/curriculum-mapper GitHub Wiki

Macro level

  • Identify the critical metadata for the two curricula being aligned:

    • Year Level(s)

    • Learning Area

    • Strand (where shared between the two)

    • Keywords (e.g. ScOT) (where available in both)

    • Text (bag of words extraction for both, possibly with synonym engine)

      • Differentiate between main text and ancillary text (e.g. elaborations)

  • Express curricula in machine readable format

    • We’ll stick to CSV for now

    • If we can get RDF encodings for both, good, but we’ll want that translated into tuples out of RDF/XML

  • Create learning corpus

  • Feed into machine learning engine

  • See what happens
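
A minimal sketch of the first two steps, assuming a CSV layout this page does not actually specify: the column names, the `;` separator for multi-valued fields, and the sample rows are all made up for illustration. It keeps main text and ancillary text in separate bags of words so they can be treated differently later.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical columns and rows; the real CSV layout is not defined on this page.
SAMPLE_CSV = """id,year_levels,learning_area,strand,keywords,text,ancillary_text
AC-1,7,Science,Knowledge,cells;microscopes,Cells are the basic units of living things,Examining cells under a microscope
AC-2,7;8,Science,Skills,,Plan and conduct fair tests,
"""

def bag_of_words(text):
    """Lowercase the text and count its alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def load_items(handle):
    """Read curriculum rows, splitting multi-valued fields on ';'."""
    items = []
    for row in csv.DictReader(handle):
        items.append({
            "id": row["id"],
            "year_levels": [y for y in row["year_levels"].split(";") if y],
            "learning_area": row["learning_area"],
            "strand": row["strand"],
            "keywords": [k for k in row["keywords"].split(";") if k],
            # Main and ancillary text (e.g. elaborations) are kept apart.
            "main_words": bag_of_words(row["text"]),
            "ancillary_words": bag_of_words(row["ancillary_text"]),
        })
    return items

items = load_items(io.StringIO(SAMPLE_CSV))
```

A synonym engine, if used, would slot in inside `bag_of_words`, mapping each token to a canonical form before counting.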

Items may in principle be incommensurable: the Australian Curriculum has content descriptions and the NSW Syllabus has Outcomes, and the two are not meant to be the same thing at all. In principle. In practice, we need a mapping, and the mappings are only ever going to generate recommendations; something is much better than nothing.

Micro level

As a first cut, we will run a Bayesian classification over the target curriculum items to align them to the source curriculum text.

  • Each source curriculum standard has text

  • We use the source curriculum as the training data for the document classifier, treating the text of each standard as a document whose class is the standard identifier

    • The text includes both the content description itself, and the elaborations of the content descriptions. The elaborations are illustrative examples, and are not meant to be exhaustive; but every skerrick of text helps.

    • We tweak the classifier to use TF-IDF (foregrounding distinctive keywords in the text), which should perform slightly better than treating all words as equally important

  • We run the text of the target curriculum item through the document classifier; it will generate scores of best alignment for the target curriculum item to a source curriculum item
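
The steps above can be sketched as a small multinomial naive Bayes classifier with one training document per class and TF-IDF weights in place of raw counts. This is an illustration under stated assumptions, not the project's implementation: the class name `TfidfNaiveBayes`, the smoothing constant `alpha`, the smoothed IDF formula, and the identifiers `SRC-A` and `SRC-B` are all hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z]+", text.lower())

class TfidfNaiveBayes:
    """Naive Bayes over TF-IDF weights, one training document per class."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive smoothing (an assumed default)

    def fit(self, standards):
        # standards: dict mapping standard identifier -> text
        counts = {sid: Counter(tokenize(t)) for sid, t in standards.items()}
        n_docs = len(counts)
        # Document frequency: in how many standards each word appears.
        doc_freq = Counter(w for c in counts.values() for w in c)
        self.idf = {w: math.log((1 + n_docs) / (1 + df)) + 1.0
                    for w, df in doc_freq.items()}
        vocab_size = len(self.idf)
        self.log_prob = {}
        for sid, c in counts.items():
            weights = {w: tf * self.idf[w] for w, tf in c.items()}
            total = sum(weights.values()) + self.alpha * vocab_size
            self.log_prob[sid] = {
                w: math.log((weights.get(w, 0.0) + self.alpha) / total)
                for w in self.idf
            }
        return self

    def scores(self, text):
        # Words never seen in training are ignored. Every class has exactly
        # one document, so the prior is uniform and is left out.
        words = [w for w in tokenize(text) if w in self.idf]
        result = {sid: sum(lp[w] for w in words)
                  for sid, lp in self.log_prob.items()}
        return sorted(result.items(), key=lambda kv: kv[1], reverse=True)

# Made-up source standards and target item, for illustration only.
source = {
    "SRC-A": "cells are the basic units of living things",
    "SRC-B": "forces change the motion of objects",
}
clf = TfidfNaiveBayes().fit(source)
ranked = clf.scores("students describe how cells function in living organisms")
best_id = ranked[0][0]
```

The scores are log-likelihoods, so only their ordering and relative gaps are meaningful; in this toy case the target shares "cells" and "living" with `SRC-A` only, so `SRC-A` ranks first.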

Since the document classifier generates numerical scores, they are a natural input to a neural network, as described above. For the purposes of the pilot, the learning areas and year levels are constrained, and can be treated as filters imposed on the curriculum. So we don’t need to run full neural network training over the data: we have 37 Australian Curriculum content descriptions, to match against 18 BoSTES syllabus outcomes, for years 7 and 8 Science.
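
Treating learning area and year level as filters could look like the sketch below, which assumes each item is a dict with hypothetical `learning_area` and `year_levels` fields. Only the pairs that pass the filter need to be scored; with 37 items on one side and 18 on the other, that is at most 37 × 18 = 666 pairs.

```python
def candidate_sources(target, sources):
    """Keep source items sharing the target's learning area and a year level."""
    return [
        s for s in sources
        if s["learning_area"] == target["learning_area"]
        and set(s["year_levels"]) & set(target["year_levels"])
    ]

# Made-up records for illustration only.
sources = [
    {"id": "SRC-1", "learning_area": "Science", "year_levels": ["7"]},
    {"id": "SRC-2", "learning_area": "Science", "year_levels": ["9"]},
    {"id": "SRC-3", "learning_area": "English", "year_levels": ["7", "8"]},
]
target = {"id": "TGT-1", "learning_area": "Science", "year_levels": ["7", "8"]}
kept = [s["id"] for s in candidate_sources(target, sources)]
```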

Outcomes are not the same kind of thing as content, at all. But in truth, the division between outcome and content is porous, and only becomes more porous in non-STEM areas (English).

The Syllabus has both outcomes and content text for any given learning-area/stage/strand/substrand tuple. In Science Yr 7 & 8:

  • The Skills strand (which is competency-oriented) has one outcome per tuple; so the content text directly aligns to that tuple, and can be used to align Australian Curriculum Content Descriptions to that outcome.

  • The Knowledge and Understanding strand (which is content-oriented) has one or two outcomes per tuple; so the content text for that tuple will have broader coverage than the outcome. (Some of the text in the content may be specific to one outcome, and we have no a priori way of telling which without human analysis.) But the content text at least constrains the alignment of outcomes to Australian Curriculum Content Descriptions.

  • The Values and Attitudes strand (which is on the vague side of competency) has no content text. Outcomes in that strand will likely apply across any content description in the same tuple, so we expect poor and generic alignment.

The content text in the Syllabus partially aligns with Australian Curriculum content descriptions, and nominates the content descriptions it aligns to. Those content descriptions can be used as a check for any output of the document classifier.
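
One way to run that check, sketched under assumptions: for each Syllabus item, compare the classifier's ranked content descriptions against the ones the Syllabus nominates, and report what fraction of the nominated ones appear in the top `k`. The function name, the choice of `k`, and the `CD-…` identifiers are hypothetical.

```python
def nominated_recall_at_k(ranked_ids, nominated_ids, k=3):
    """Fraction of nominated content descriptions found in the top-k ranking."""
    if not nominated_ids:
        return None  # nothing nominated, so nothing to check against
    top_k = set(ranked_ids[:k])
    return len(top_k & set(nominated_ids)) / len(nominated_ids)

# Made-up identifiers for illustration only.
ranked_ids = ["CD-2", "CD-5", "CD-1", "CD-4"]  # classifier output, best first
nominated_ids = {"CD-1", "CD-3"}               # nominated by the Syllabus
recall = nominated_recall_at_k(ranked_ids, nominated_ids, k=3)
```

Here one of the two nominated descriptions (`CD-1`) is in the top three, so `recall` is 0.5. Because the Syllabus content only partially aligns, a low value flags an item for human review rather than proving the classifier wrong.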

We will separate the alignments based on the outcome text from those based on the associated content text (which in Knowledge and Understanding does not differentiate the two candidate outcomes). We expect the outcome text itself to align poorly with the content descriptions.
