TOSEC - cressie176/Load64 GitHub Wiki
TOSEC (The Old School Emulation Center) is a retrocomputing initiative dedicated to the cataloguing and preservation of software, firmware, and resources for retro systems. The TOSEC DAT files for the Commodore 64 are used to seed the LoadC64 Catalogue — matching ROM files by SHA-1 hash and extracting structured metadata from TOSEC filenames.
This page documents the TOSEC Naming Convention (TNC v4, 2015-03-23) as it applies to C64 software, with notes on how each field maps to LoadC64 concepts.
The authoritative specification is at https://www.tosecdev.org/tosec-naming-convention.
A TOSEC filename encodes all metadata about a ROM image. The full structure is:
`Title version (demo) (date)(publisher)(system)(video)(country)(language)(copyright)(devstatus)(media type)(media label)[dump flags][more info].ext`
Only Title, (date), and (publisher) are mandatory. All other fields are optional and appear in the order shown above when present.
Two bracket types are used throughout:
| Brackets | Purpose |
|---|---|
| `( )` | Classification metadata (who, when, what) |
| `[ ]` | Dump information and supplementary detail |
`Last Ninja, The (1987)(System 3)(PAL)[cr Steve][!].d64`

| Segment | Field | Value |
|---|---|---|
| `Last Ninja, The` | Title | Last Ninja, The |
| `(1987)` | Date | 1987 |
| `(System 3)` | Publisher | System 3 |
| `(PAL)` | Video | PAL |
| `[cr Steve]` | Dump flag | Cracked by Steve |
| `[!]` | Dump flag | Verified good dump |
Mandatory. The display title of the software.
- Articles (`The`, `A`, `An`, `De`, `Die`, `Le`, `La`, `Les`) are moved to the end, preceded by a comma and space: `The Last Ninja` → `Last Ninja, The`.
- Subtitles are appended after ` - ` (space-hyphen-space): `Robocop - The Last Chapter`.
- Forbidden characters are never present: `/ \ ? : * " < > |`
- Apostrophes and hyphens are permitted.
- Numbers and symbols are permitted.
The seeding utility must reverse article normalisation to recover a natural display title:
| TOSEC title | Recovered title |
|---|---|
| `Last Ninja, The` | The Last Ninja |
| `Legend of Tosec, A` | A Legend of Tosec |
| `Atic Atac` | Atic Atac |
Detection rule: if the title ends with `, The`, `, A`, or `, An`, move the article to the front and drop the comma.
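The detection rule can be sketched as follows. The function name `recover_title` is hypothetical, and the sketch only handles the three English articles named in the rule; it does not attempt titles where the article is followed by a subtitle.

```python
# Hypothetical helper illustrating the detection rule above.
# Only the three suffixes named in the rule are handled.
ARTICLE_SUFFIXES = (", The", ", An", ", A")


def recover_title(tosec_title: str) -> str:
    """Move a trailing ', The' / ', A' / ', An' back to the front of the title."""
    for suffix in ARTICLE_SUFFIXES:
        if tosec_title.endswith(suffix):
            article = suffix[2:]  # drop the leading ", "
            return f"{article} {tosec_title[:-len(suffix)]}"
    return tosec_title  # no trailing article: title is already in natural form


print(recover_title("Last Ninja, The"))
print(recover_title("Atic Atac"))
```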
Optional. Appears immediately after the title, before the first parenthesis.
| Format | Example |
|---|---|
| `v x.yy` | `v1.0` |
| `v x.yyb` | `v1.03b` |
| `Rev N` | `Rev 1` |
| `vYYYYMMDD` | `v20000101` |
The seeding utility should capture this as part of the title or store it separately for reference, but it does not map to a LoadC64 Game field.
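If the version is stored separately, one way to split it from the title is a regular expression over the formats in the table. This is a minimal sketch; `split_version` and the exact pattern are assumptions, not part of the TOSEC specification.

```python
import re
from typing import Optional, Tuple

# Assumed pattern: a trailing "v<digit>..." or "Rev <digits>" token after the title.
VERSION_RE = re.compile(r"^(?P<title>.*?)\s+(?P<version>v\d[\w.]*|Rev \d+)$")


def split_version(text: str) -> Tuple[str, Optional[str]]:
    """Return (title, version); version is None when no version suffix is found."""
    match = VERSION_RE.match(text)
    if match is None:
        return text, None
    return match.group("title"), match.group("version")


print(split_version("Some Game v1.03b"))
print(split_version("Atic Atac"))
```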
Optional. Appears after version, before the date.
| Value | Meaning |
|---|---|
| `(demo)` | General demonstration |
| `(demo-kiosk)` | Demo intended for kiosk/retail |
| `(demo-playable)` | Playable portion of a game |
| `(demo-rolling)` | Non-interactive rolling demo |
| `(demo-slideshow)` | Non-interactive slideshow |
Seeding note: Demo entries should be excluded from the LoadC64 Catalogue — they are not complete games.
Mandatory. Always the first parenthesised field after title/version/demo.
| Format | Meaning |
|---|---|
| `(19xx)` | Year unknown, 1900s |
| `(200x)` | Year unknown, 2000s |
| `(1986)` | Known year |
| `(2001-01)` | Known year and month |
| `(1986-06-21)` | Full date |
| `(19xx-12-Dx)` | Partial information (day unknown within month) |
Seeding mapping: Extract the four-digit year where known. If the year contains x (e.g. 19xx, 198x), treat as unknown and omit the year field.
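A minimal sketch of this mapping, assuming the date field arrives with or without its parentheses. The name `extract_year` is hypothetical.

```python
import re
from typing import Optional


def extract_year(date_field: str) -> Optional[int]:
    """Return the four-digit year, or None when the year contains 'x'."""
    value = date_field.strip("()")
    year = value[:4]
    # Only accept four plain digits; "19xx" or "198x" means the year is unknown.
    if re.fullmatch(r"[0-9]{4}", year):
        return int(year)
    return None


print(extract_year("(1986-06-21)"))
print(extract_year("(19xx)"))
```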
Mandatory. Always the second parenthesised field.
| Value | Meaning |
|---|---|
| `(-)` | Publisher unknown |
| `(Devstudio)` | Single publisher |
| `(Delphine - U.S. Gold)` | Multiple publishers, alphabetical order |
| `(Smith, Robert)` | Individual person |
Seeding mapping: Map to publisher. If the value is (-), set publisher to null. Strip outer parentheses.
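As a small illustration of this mapping (the function name `map_publisher` is hypothetical, and Python's `None` stands in for null):

```python
from typing import Optional


def map_publisher(field: str) -> Optional[str]:
    """Strip the outer parentheses and map the unknown marker '-' to None."""
    value = field.strip()
    if value.startswith("(") and value.endswith(")"):
        value = value[1:-1]
    return None if value == "-" else value


print(map_publisher("(System 3)"))
print(map_publisher("(-)"))
```

Multiple publishers such as `(Delphine - U.S. Gold)` are kept as a single string here; whether to split them is left open.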
Optional. Used in multi-system DATs to indicate which hardware variant an image is for.
Common C64-relevant values: (C64), (C128), (+4), (VIC-20).
Seeding note: For C64-specific DATs this field is typically absent. If present and not C64, the entry should be excluded from the C64 catalogue.
Optional. Specifies the TV standard when it cannot be inferred from the country.
| Value | Meaning |
|---|---|
| `(NTSC)` | NTSC |
| `(PAL)` | PAL |
| `(PAL-60)` | PAL-60 |
| `(NTSC-PAL)` | Dual standard |
| `(PAL-NTSC)` | Dual standard |
Seeding mapping: Map to colour_encoding:
| TOSEC video | LoadC64 `colour_encoding` |
|---|---|
| `PAL` | `pal` |
| `PAL-60` | `pal` |
| `NTSC` | `ntsc` |
| `NTSC-PAL` | `unknown` |
| `PAL-NTSC` | `unknown` |
| absent | infer from country (see below) |
Optional. ISO 3166-1 alpha-2 codes, uppercased. Multiple countries separated by - in alphabetical order: (DE-FR), (EU-US).
Selected codes relevant to C64 software:
| Code | Country/Region |
|---|---|
| `AT` | Austria |
| `AU` | Australia |
| `BE` | Belgium |
| `CA` | Canada |
| `DE` | Germany |
| `DK` | Denmark |
| `ES` | Spain |
| `EU` | Europe |
| `FI` | Finland |
| `FR` | France |
| `GB` | Great Britain |
| `IT` | Italy |
| `JP` | Japan |
| `NL` | Netherlands |
| `NO` | Norway |
| `NZ` | New Zealand |
| `PL` | Poland |
| `PT` | Portugal |
| `SE` | Sweden |
| `US` | United States |
| `ZA` | South Africa |
Seeding mapping — colour_encoding inference from country when video field is absent:
| Country code(s) | Inferred `colour_encoding` |
|---|---|
| `US`, `CA`, `JP` | `ntsc` |
| `EU`, `GB`, `DE`, `FR`, `ES`, `IT`, `NL`, `SE`, `NO`, `DK`, `FI`, `AT`, `BE`, `PT`, `AU`, `NZ`, `ZA`, `PL` | `pal` |
| Multiple mixed regions | `unknown` |
| Absent | `unknown` |
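The video mapping and the country fallback can be combined into one lookup. This is a sketch under two assumptions that the tables do not state explicitly: several countries that all share one standard (for example `DE-FR`) inherit that standard, and a country code not listed above resolves to `unknown`. The function name is hypothetical.

```python
from typing import Optional

VIDEO_TO_ENCODING = {
    "PAL": "pal", "PAL-60": "pal", "NTSC": "ntsc",
    "NTSC-PAL": "unknown", "PAL-NTSC": "unknown",
}
NTSC_COUNTRIES = {"US", "CA", "JP"}
PAL_COUNTRIES = {
    "EU", "GB", "DE", "FR", "ES", "IT", "NL", "SE", "NO",
    "DK", "FI", "AT", "BE", "PT", "AU", "NZ", "ZA", "PL",
}


def colour_encoding(video: Optional[str], country: Optional[str]) -> str:
    """Prefer the video field; fall back to country inference; else 'unknown'."""
    if video is not None:
        return VIDEO_TO_ENCODING.get(video, "unknown")
    if country is None:
        return "unknown"
    codes = set(country.split("-"))
    if codes <= NTSC_COUNTRIES:  # every listed country is NTSC
        return "ntsc"
    if codes <= PAL_COUNTRIES:   # every listed country is PAL
        return "pal"
    return "unknown"             # mixed regions or unlisted codes


print(colour_encoding(None, "EU-US"))
print(colour_encoding("PAL-60", None))
```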
Optional. ISO 639-1 codes, lowercased. Multiple languages in alphabetical order: (en-fr). More than two: (M3) etc.
English is the default — absence of a language field implies English or language-neutral.
Seeding note: Language is not a LoadC64 Game field. Capture in TOSEC notes only.
Optional.
| Code | Meaning |
|---|---|
| `(CW)` | Cardware |
| `(FW)` | Freeware |
| `(GW)` | Giftware |
| `(LW)` | Licenceware |
| `(PD)` | Public Domain |
| `(SW)` | Shareware |
| `(SW-R)` | Shareware Registered |
Seeding note: Not a LoadC64 field. May be useful for filtering — PD and Freeware titles are unambiguously safe to include.
Optional.
| Value | Meaning |
|---|---|
| `(alpha)` | Early test build |
| `(beta)` | Feature-complete test |
| `(preview)` | Near-complete |
| `(pre-release)` | Near-complete |
| `(proto)` | Unreleased prototype |
Seeding note: Prototypes and betas represent distinct ROM variants and should be included as separate ROMSet entries under their game, but flagged in the label field (e.g. "Beta (1986)").
Optional. Describes multi-part media.
| Value | Format example | Meaning |
|---|---|---|
| Disk | `(Disk 1 of 3)` | Magnetic disk |
| Disc | `(Disc 1 of 2)` | Optical disc |
| Tape | `(Tape 1 of 2)` | Magnetic tape |
| Side | `(Side A)` / `(Side B)` | Tape or disk side |
| File | `(File 1 of 2)` | Individual file |
| Part | `(Part 1 of 3)` | Numbered part |
Seeding mapping: The total count (of N) determines how many ROM files belong to a ROMSet. Individual files with the same title and publisher that form a set should be grouped into a single ROMSet. The media type and number map to roms[].label (e.g. "Disk 1", "Side A").
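A sketch of how the media type field might be turned into a `roms[].label` value plus the expected set size. The function name `parse_media` and the returned tuple shape are assumptions made for illustration.

```python
import re
from typing import Optional, Tuple


def parse_media(field: str) -> Optional[Tuple[str, Optional[int]]]:
    """Return (label, total) for a media type field, or None if it is not one.

    total is the N in "of N"; it is None for Side fields, which carry no count.
    """
    value = field.strip("()")
    numbered = re.fullmatch(r"(Disk|Disc|Tape|File|Part) ([0-9]+) of ([0-9]+)", value)
    if numbered:
        kind, index, total = numbered.groups()
        return f"{kind} {index}", int(total)
    if re.fullmatch(r"Side [A-Z]", value):
        return value, None
    return None


print(parse_media("(Disk 1 of 3)"))
print(parse_media("(Side A)"))
```

Files whose title and publisher match and whose `total` agrees could then be collected into one ROMSet.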
Optional. The last parenthesised field before square brackets. Describes the label printed on a physical disk, used to identify which disk to insert at runtime.
Examples:
- `(Disk 1 of 2)(Program)`
- `(Disk 2 of 2)(Data)`
- `(Disk 3 of 3)(Character Disk)`
Seeding mapping: Append to roms[].label for disambiguation where needed: "Disk 1 – Program".
Square bracket flags [ ] describing the image's condition or modification history. Multiple flags on a single file are ordered: modification flags first (alphabetically), then dump process flags.
Modification flags:

| Flag | Full form examples | Meaning |
|---|---|---|
| `[cr]` | `[cr]`, `[cr Cracker]` | Copy protection removed (cracked) |
| `[f]` | `[f]`, `[f Fix]`, `[f Fix Fixer]` | Fixed to run in a non-standard environment |
| `[h]` | `[h]`, `[h Hack]`, `[h Hack Hacker]` | Intro, sprites, or text altered (hacked) |
| `[m]` | `[m]`, `[m Modification]` | Unintended modification (e.g. save state) |
| `[p]` | `[p]`, `[p Pirate]` | Unlicensed copy |
| `[t]` | `[t]`, `[t +3 Trainer]` | Cheat/trainer added |
| `[tr]` | `[tr fr]`, `[tr de-partial Translator]` | Translated to another language |

Dump process flags:

| Flag | Full form examples | Meaning |
|---|---|---|
| `[o]` | `[o]` | Over dump: more data than expected |
| `[u]` | `[u]` | Under dump: less data than expected |
| `[v]` | `[v]`, `[v Virus Name]` | Image contains a virus |
| `[b]` | `[b]`, `[b Descriptor]` | Bad dump: known damage |
| `[a]` | `[a]`, `[a2]`, `[a3]` | Alternate dump: variant of the original |
| `[!]` | `[!]` | Verified good dump |
Numbering: Multiple instances of the same flag are numbered from the second: `[a]`, `[a2]`, `[a3]`. There is no `[a1]`.
Multiple crackers/hackers: Separated by ` - ` within the flag: `[h PDX - TRSi]`.
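To make the flag syntax concrete, here is a sketch of a parser for a single `[...]` token. `DumpFlag`, `parse_dump_flag`, and the regular expression are assumptions; tokens that do not look like a dump flag (for example `[aka ...]`) return `None` so they can be treated as more info.

```python
import re
from typing import NamedTuple, Optional

# Flag code, optional instance number, optional free-text detail.
FLAG_RE = re.compile(r"\[(cr|tr|[fhmptouvba!])(\d*)(?: (.+))?\]")


class DumpFlag(NamedTuple):
    code: str
    number: int            # 1 when the flag carries no number ([a] is instance 1)
    detail: Optional[str]  # e.g. cracker name, trainer description


def parse_dump_flag(token: str) -> Optional[DumpFlag]:
    match = FLAG_RE.fullmatch(token)
    if match is None:
        return None  # not a dump flag; probably a [more info] token
    code, number, detail = match.groups()
    return DumpFlag(code, int(number) if number else 1, detail)


flag = parse_dump_flag("[h PDX - TRSi]")
print(flag.code, flag.detail.split(" - "))
```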
The seeding utility should apply the following rules based on dump flags:
| Flag present | Action |
|---|---|
| `[!]` | Include: verified good dump; preferred ROMSet |
| `[b]` | Exclude: bad dump; unreliable data |
| `[v]` | Exclude: virus present |
| `[o]` | Exclude: over dump; corrupt or inaccurate |
| `[u]` | Exclude: under dump; incomplete |
| `[cr]` | Include: cracked; common on C64, often the only dump available; use as separate ROMSet |
| `[t]` | Include: trained; use as separate ROMSet with label noting trainer |
| `[h]` | Include with caution: hacked; include only if no clean dump exists; label accordingly |
| `[f]` | Include: fixed; often required to run on modern emulators; use as separate ROMSet |
| `[tr]` | Include: translated; use as separate ROMSet with language noted in label |
| `[m]` | Exclude: unintended modification; data unreliable |
| `[p]` | Include: pirated copy may be the only dump; use as separate ROMSet |
| `[a]` | Include: alternate dump; use as separate ROMSet |
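The table can be reduced to a small decision function over the set of flag codes on a file. Two points are assumptions rather than rules stated above: an exclude flag wins over any include flag on the same file, and a file with no flags at all is included. The name `flag_action` and the return strings are hypothetical.

```python
from typing import Iterable

EXCLUDE_CODES = {"b", "v", "o", "u", "m"}


def flag_action(codes: Iterable[str]) -> str:
    """Return 'exclude', 'include-with-caution', or 'include' for a set of flag codes."""
    present = set(codes)
    if present & EXCLUDE_CODES:
        return "exclude"               # assumed: any exclude flag takes precedence
    if "h" in present:
        return "include-with-caution"  # hacked: only if no clean dump exists
    return "include"


print(flag_action({"cr", "!"}))
print(flag_action({"cr", "b"}))
```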
Square bracket field [ ] appearing after all dump flags. Free-text supplementary information not covered by other fields.
Examples:
- `[aka House of TOSEC]`: alternate name
- `[Req TRS-DOS]`: software requirement
- `[source code]`: source code image
- `[data disk]`: data-only disk
- `[docs]`: documentation disk
Seeding mapping: [aka ...] values are useful for matching games that are known under alternate titles. [data disk], [docs], and [source code] entries should be excluded from the game catalogue as they are not playable software.
Multiple programs in one filename, separated by ` & ` (space-ampersand-space):
`Amidar (19xx)(Devstudio) & Amigos (1987)(Mr. Tosec)`
Each segment carries its own metadata. Global flags (shared by all programs in the set) are separated by `-` before the flag:
`Amidar (19xx)(Devstudio) & Amigos (1987)(Mr. Tosec) -(PD)[!]`
Seeding note: Multi-image compilations in a single file represent a different use case from a multi-disk game. A compilation file should not be added to the catalogue as a game — its component titles may already exist as individual entries. Skip compilation entries during seeding.
Relevant extensions for C64 software:
| Extension | Media type |
|---|---|
| `.d64` | 1541 disk image |
| `.d71` | 1571 disk image |
| `.d81` | 1581 disk image |
| `.t64` | Tape image (container) |
| `.tap` | Raw tape image |
| `.prg` | Executable program file |
| `.crt` | Cartridge image |
| `.g64` | GCR-encoded disk image |
| `.nib` | Nibble-encoded disk image |
| `.p00` | PC64 program file |
The seeding utility should parse a TOSEC filename as follows:
1. Strip the file extension.
2. Split on ` & `. If more than one segment results, this is a compilation; skip it.
3. Extract dump info flags: match all `[...]` tokens from the right of the string. Remove them from the working string.
4. Extract more info flags: these are `[...]` tokens that follow all dump flags. In practice, extract all `[...]` tokens; classify by content.
5. Extract parenthesised fields: match all `(...)` tokens from the string in order. Remove them from the working string.
6. The remaining text is the title (plus the optional version field).
7. Parse the title for a trailing version string (`v\d`, `Rev \d`, `vYYYYMMDD`).
8. Check for a demo flag (`(demo...)`). It was already removed from the working string in step 5, so look for it at the head of the ordered list of parenthesised fields.
9. Parse the remaining parenthesised fields in order:
   - Field 1: date
   - Field 2: publisher
   - Field 3+: system, video, country, language, copyright, devstatus, media type, media label (identify by content pattern)
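The steps above can be sketched as a single function. This is an illustration, not the seeding utility itself: `parse_tosec` and the dictionary keys are hypothetical, the filename is assumed to carry an extension, and fields 3+ and the bracket tokens are returned unclassified.

```python
import re
from typing import Optional


def parse_tosec(filename: str) -> Optional[dict]:
    """Split a TOSEC filename into raw fields; return None for entries to skip."""
    stem = filename.rsplit(".", 1)[0]                    # step 1: strip extension
    if " & " in stem:                                    # step 2: compilation
        return None
    flags = re.findall(r"\[([^\]]*)\]", stem)            # steps 3-4: all [...] tokens
    stem = re.sub(r"\[[^\]]*\]", "", stem)
    fields = re.findall(r"\(([^)]*)\)", stem)            # step 5: all (...) tokens
    title = re.sub(r"\([^)]*\)", "", stem).strip()       # step 6: what remains
    version = None
    match = re.search(r"\s+(v\d[\w.]*|Rev \d+)$", title)  # step 7: trailing version
    if match:
        version, title = match.group(1), title[:match.start()]
    demo = None
    if fields and fields[0].startswith("demo"):          # step 8: demo precedes the date
        demo = fields.pop(0)
    if len(fields) < 2:                                  # date and publisher are mandatory
        return None
    return {
        "title": title, "version": version, "demo": demo,
        "date": fields[0], "publisher": fields[1],       # step 9: fields 1 and 2
        "fields": fields[2:], "flags": flags,
    }


print(parse_tosec("Last Ninja, The (1987)(System 3)(PAL)[cr Steve][!].d64"))
```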
Match each (value) token against these patterns in order — use the first match:
| Pattern | Field |
|---|---|
| Matches date pattern (`\d{4}`, `19xx`, `200x`, etc.) | date (already consumed as field 1) |
| Matches known system token (`C64`, `C128`, `VIC-20`, `+4`, etc.) | system |
| Matches known video token (`PAL`, `NTSC`, `PAL-60`, etc.) | video |
| Matches 2-letter uppercase country code or known region (`EU`, `US`, etc.), optionally hyphenated | country |
| Matches 2-letter lowercase language code, optionally hyphenated, or `M\d` | language |
| Matches copyright token (`PD`, `SW`, `FW`, `GW`, `LW`, `CW`, `SW-R`, `GW-R`, `CW-R`) | copyright |
| Matches devstatus token (`alpha`, `beta`, `preview`, `pre-release`, `proto`) | devstatus |
| Matches media type pattern (`Disk \d+ of \d+`, `Tape \d+ of \d+`, `Side [AB]`, etc.) | media type |
| Any remaining token | media label |
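A first-match classifier along these lines might look like the sketch below. One caveat shapes the design: a generic two-letter uppercase pattern would also match copyright tokens such as `PD` or `SW` before the copyright rule is reached, so this sketch checks country codes against the selected list from this page instead. That choice, the exact regular expressions, and the name `classify` are all assumptions.

```python
import re

SYSTEMS = {"C64", "C128", "+4", "VIC-20"}
VIDEO = {"PAL", "NTSC", "PAL-60", "NTSC-PAL", "PAL-NTSC"}
COUNTRIES = {
    "AT", "AU", "BE", "CA", "DE", "DK", "ES", "EU", "FI", "FR", "GB",
    "IT", "JP", "NL", "NO", "NZ", "PL", "PT", "SE", "US", "ZA",
}
COPYRIGHT = {"PD", "SW", "FW", "GW", "LW", "CW", "SW-R", "GW-R", "CW-R"}
DEVSTATUS = {"alpha", "beta", "preview", "pre-release", "proto"}
DATE_RE = re.compile(r"(19|20)[0-9x]{2}(-[0-9x]{2}){0,2}")
LANGUAGE_RE = re.compile(r"[a-z]{2}(-[a-z]{2})*|M[0-9]+")
MEDIA_RE = re.compile(r"(Disk|Disc|Tape|File|Part) [0-9]+ of [0-9]+|Side [AB]")


def classify(value: str) -> str:
    """Classify one parenthesised value (without parentheses); first match wins."""
    if DATE_RE.fullmatch(value):
        return "date"
    if value in SYSTEMS:
        return "system"
    if value in VIDEO:
        return "video"
    if all(code in COUNTRIES for code in value.split("-")):
        return "country"
    if LANGUAGE_RE.fullmatch(value):
        return "language"
    if value in COPYRIGHT:
        return "copyright"
    if value in DEVSTATUS:
        return "devstatus"
    if MEDIA_RE.fullmatch(value):
        return "media type"
    return "media label"


print([classify(v) for v in ["PAL", "DE-FR", "PD", "Disk 1 of 3", "Program"]])
```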
| TOSEC field | LoadC64 field | Notes |
|---|---|---|
| Title | `title` | Reverse article normalisation (`, The` → `The `) |
| Date | `year` | Extract 4-digit year; omit if it contains `x` |
| Publisher | `publisher` | Strip parentheses; set null if `(-)` |
| Video | `colour_encoding` | See video mapping table above |
| Country | `colour_encoding` | Fallback when video absent; see country inference table above |
| Devstatus | `romsets[].label` | Append to label: e.g. "Beta (1986)" |
| Media type | `roms[].label` | e.g. "Disk 1", "Side A" |
| Media label | `roms[].label` | Append for disambiguation |
| Dump flags | `romsets[].label` | Summarise notable flags in label: e.g. "PAL release [cr]", "Trained +3" |
| SHA-1 | `roms[].sha1` | Taken from DAT file, not computed from filename |
| TOSEC filename | `third_party_ids.tosec` | Store the full filename (without extension) as the TOSEC ID |
- Video field if present
- Country field inference if video absent
- `unknown` if neither is present or inference is ambiguous
TOSEC provides no TDE flag. Default to false; allow manual override in the catalogue.
TOSEC provides no multiplayer data. Default to unknown; allow manual override in the catalogue.
Build a human-readable romsets[].label from available context:
<video or country> <devstatus> <notable dump flags>
Examples:
| Filename flags | Label |
|---|---|
| `(PAL)[!]` | PAL |
| `(NTSC)[cr]` | NTSC [cr] |
| `(EU)(beta)` | EU beta |
| `(US)[t +3 Trainer]` | US [t] |
| `(GB)[f NTSC]` | GB [f] |
| `[a]` | Alternate |
| `[a2]` | Alternate 2 |
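The examples in this table suggest a few conventions: `[!]` is left out of the label, free-text detail inside a flag is dropped, and alternate dumps are spelled out. The sketch below follows those examples only; `build_label` and its parameters are hypothetical, and how an alternate flag combines with a region is an assumption.

```python
import re
from typing import Iterable, Optional

LABEL_FLAG_RE = re.compile(r"\[(cr|tr|[fhmpt]|a)(\d*)(?: .+)?\]")


def build_label(video: Optional[str] = None, country: Optional[str] = None,
                devstatus: Optional[str] = None, flags: Iterable[str] = ()) -> str:
    """Join region, devstatus, and notable flags into a romsets[].label string."""
    parts = []
    region = video or country          # video takes priority over country
    if region:
        parts.append(region)
    if devstatus:
        parts.append(devstatus)
    for flag in flags:
        match = LABEL_FLAG_RE.fullmatch(flag)
        if match is None:
            continue                   # [!] and unrecognised tokens are not shown
        code, number = match.groups()
        if code == "a":
            parts.append("Alternate" + (f" {number}" if number else ""))
        else:
            parts.append(f"[{code}]")  # detail such as "+3 Trainer" is dropped
    return " ".join(parts)


print(build_label(country="US", flags=["[t +3 Trainer]"]))
print(build_label(flags=["[a2]"]))
```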
The seeding utility should skip a TOSEC entry if any of the following apply:
- The filename contains ` & ` (compilation)
- The demo field is present (demo version)
- Any `[more info]` flag contains `data disk`, `docs`, or `source code`
- Dump flags include `[b]`, `[v]`, `[o]`, or `[u]`
- The system field is present and is not `C64` (or the expected target platform)
- The `[m]` (modified) flag is present
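As a rough sketch, the skip rules can be checked directly against the raw filename before any full parse. The function name `should_skip`, the `target_system` parameter, and the regular expressions are assumptions; the system check only recognises the system tokens listed on this page.

```python
import re

EXCLUDED_INFO = ("data disk", "docs", "source code")
EXCLUDED_FLAG_RE = re.compile(r"\[[bvoum]\d*(?: [^\]]*)?\]")  # [b], [v], [o], [u], [m]
DEMO_RE = re.compile(r"\(demo(?:-[a-z]+)?\)")
SYSTEM_RE = re.compile(r"\((C64|C128|\+4|VIC-20)\)")


def should_skip(filename: str, target_system: str = "C64") -> bool:
    """Return True when any of the skip rules applies to the filename."""
    if " & " in filename:                      # compilation
        return True
    if DEMO_RE.search(filename):               # demo version
        return True
    info_tokens = re.findall(r"\[([^\]]*)\]", filename)
    if any(word in token for token in info_tokens for word in EXCLUDED_INFO):
        return True                            # data disk, docs, or source code
    if EXCLUDED_FLAG_RE.search(filename):      # bad, virus, over, under, modified
        return True
    system = SYSTEM_RE.search(filename)
    if system and system.group(1) != target_system:
        return True                            # image is for a different system
    return False


print(should_skip("Last Ninja, The (1987)(System 3)(PAL)[cr Steve][!].d64"))
print(should_skip("Some Game (1986)(Devstudio)[b].d64"))
```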