CAT nip SFW guide by Covalent and Curious Nekomimi - thaalesalves/ai-games-research GitHub Wiki

CAT<nip> Format v3.0

by Curious Nekomimi

Editor disclaimer

This format was made by Curious Nekomimi, but the original version is NSFW. This SFW version was made by Covalent and uploaded to GitHub, formatted with markdown for better readability. For the original version of the guide (I repeat, it's NSFW), made by the creator of the format themselves, check this link.

Purpose

A structured, highly accurate, and agile format for use with AI Dungeon’s Griffin and Dragon models. CAT<nip> format makes use of optimized tokenization (documentation coming soon courtesy of RogueSphinx) while also returning consistent outputs. It is doubtful that any format will ever achieve 100% accuracy with GPT-3. However, CAT<nip> can come close to 99.99% accuracy for many things (documentation coming soon).

Thanks

Bookdust, CrisAlcilian, Dichotomy, Kimtaengsshi, RogueSphinx, Zaltys, and many others for providing me with the inspiration to create this format! I will provide accurate acknowledgements where possible.

Recommendation

Most formats, including CAT<nip>, function best at Randomness settings at or below 1.2.

Operators Quick Reference

  • [ ] Square brackets indicate text the AI should ignore when formatting the style of outputs.
  • < > Angle brackets indicate strong emphasis on using a particular term above all others.
  • ≡ Triple bar very strongly indicates that two items are identical.
  • / Forward slash weakly indicates that two items are related, think “also” or “and/or”.
  • & Ampersand very strongly indicates that two items are referenced together, think “and”.
  • && Double ampersand indicates a stronger relationship than a single. Only a last resort.
  • : Colon indicates that two items are related, think “what follows is part of this”.
  • . Period indicates the end of an entry.

Operators

Space “ ”

Each word/number in CAT<nip> MUST be preceded by a space. The reason has to do with how the AI handles tokens. Okay, technically not every word needs a space, but it’s easier to do it that way than to check which words get broken into multiple tokens and which don’t. Thanks to Kimtaengsshi for their Text-to-Token Converter!

Take the word “railroad” for example. Without a leading space, the AI interprets “railroad” as two tokens, “rail” and “road”. With a leading space, the AI interprets “ railroad” as a single token, “ railroad”.

  1. This GREATLY enhances accuracy. Synergized with < >, &, and ≡, it turns CAT<nip> into a surprisingly accurate format in both Griffin and Dragon.

  2. Leading spaces can reduce token count, thus freeing up tokens for other information to be sent to the AI.

No leading space:
    Input: "railroad"
    Tokens: 30224, 6344
    Decoded Tokens: rail | road
    Decoded Output: railroad
    Total Token Count: 2

With a leading space:
    Input: " railroad"
    Tokens: 24337
    Decoded Tokens: " railroad"
    Decoded Output: " railroad"
    Total Token Count: 1
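
The leading-space rule can be applied mechanically. Below is a minimal sketch, assuming the operator set used in this guide's examples; `add_leading_spaces` is a hypothetical helper, not part of any AI Dungeon tooling. It does not call a tokenizer, so it only enforces the convention and does not measure token counts.

```python
import re

# Characters after which this guide's examples put a space before the next word.
# This set is an assumption drawn from the examples, not an official list.
_OPERATORS = r"\[<≡/&:"


def add_leading_spaces(entry: str) -> str:
    """Insert one space between an operator and a word that directly follows it."""
    # The lookahead only matches a word character, so ":<" and "/<" stay untouched
    # and text that already has the space is left alone.
    return re.sub(rf"([{_OPERATORS}])(?=\w)", r"\1 ", entry)


print(add_leading_spaces("[Lily summary:<full name≡Lily Ellison>.]"))
```

Running the helper twice gives the same result as running it once, so it is safe to apply to entries that are already partly formatted.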

Note (2021-02-03): I’ve been made aware that whitespace experiments were conducted as far back as September by Zaltys, Kimtaengsshi, and Monky. While I was unaware of their research until today, I do want to acknowledge their efforts, in the interest of collegiality.

Square Brackets [ ]

Square brackets [ ] should always be used to encapsulate all entries in Memory, Author’s Note, and World Info to prevent format leaks. This is true even if you are not using a format, because the prose (writing style) you enter can leak into the AI’s outputs.

Angle Brackets < >

The CAT<nip> format uses angle brackets to indicate to the AI that a specific word or collection of elements must be interpreted as literally as possible, with minimal creativity.

Angle brackets < > are used to emphasize that a particular word or statement is strongly preferred. Testing the accuracy of words encapsulated in < > showed a retention rate above 90% (the AI used the wording as-is), versus parentheses ( ), which showed a retention rate of only around 60%. It is key to maintain proper encapsulation. Every format will leak if there are an unequal number of < > or { } or [ ]. If you’re getting text in your output that looks like format, check that the opening and closing brackets are balanced.
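
The balance check can be automated before pasting an entry into Memory or World Info. Here is a minimal sketch using a stack; `find_unbalanced` and its message wording are hypothetical names chosen for illustration.

```python
from typing import List

# Opening bracket -> closing bracket, for the three pairs named in this guide.
PAIRS = {"<": ">", "{": "}", "[": "]"}


def find_unbalanced(entry: str) -> List[str]:
    """Return a list of problems; an empty list means every bracket is matched."""
    closers = {close: open_ for open_, close in PAIRS.items()}
    problems = []
    stack = []  # (opening character, index) for each bracket still open
    for i, ch in enumerate(entry):
        if ch in PAIRS:
            stack.append((ch, i))
        elif ch in closers:
            if stack and stack[-1][0] == closers[ch]:
                stack.pop()
            else:
                problems.append(f"unexpected '{ch}' at index {i}")
    # Anything left on the stack was opened but never closed.
    problems.extend(f"unclosed '{ch}' at index {i}" for ch, i in stack)
    return problems


print(find_unbalanced("[ Lily summary:< full name≡ Lily Ellison.]"))
```

A single missing bracket can produce more than one message, since the brackets around it no longer pair up either; fixing the first reported problem and re-running is usually the quickest route.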

Triple Bar ≡

The CAT<nip> format uses a triple bar ≡ to indicate that two or more elements should be considered identical.

The triple bar sign ≡ denotes “identical to” in mathematics and is much stronger than the equals sign = or a colon : when making associations. At first I used < > or : for associations like age<21>. Further testing revealed that statements like < age≡ 21> are far more reliable.

Credits: Zaltys for suggesting ≡ as a potential replacement for : and =.

Forward Slash /

The CAT<nip> format uses a forward slash / to indicate a weak transitive relationship between two or more elements, much like a comma.

The forward slash / denotes a weak relationship or list divider, e.g. < happy/ insane> can be interpreted by the AI as either happy or insane, or as both happy and insane. There are instances where this is useful, and / can be used to reliably string together up to three elements. Going beyond three associated elements yields reduced accuracy.

Ampersand & and Double Ampersand &&

The CAT<nip> format uses an ampersand & or, rarely, a double ampersand &&, to indicate a very strong relationship between two or more elements.

The ampersand & and double ampersand && denote that two items are strongly related. This means, if you were to put the words/items in a sentence, they should generally make sense with an AND between them.

For example, < happy& insane> is almost always interpreted as being both happy and insane. The double ampersand && is nearly never needed. If you find that two traits that should be connected almost never are, even when using &, try using && to further strengthen the connection.

As with the forward slash, when it comes to refinement, & and && can be strung together reliably for up to three items, e.g. < happy& insane& tired>, with anything more becoming less reliable.
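
The three-item guideline for / and & chains can be checked automatically. The sketch below is an advisory lint only, assuming the limit of three stated above; `long_chains` and `MAX_CHAIN` are hypothetical names.

```python
import re
from typing import List

MAX_CHAIN = 3  # the guide reports reduced reliability beyond three chained items


def long_chains(entry: str) -> List[str]:
    """Return every < > group whose & (or &&) chain, or / chain, exceeds MAX_CHAIN items."""
    flagged = []
    for group in re.findall(r"<([^<>]*)>", entry):
        # Count & chains and / chains separately, since they are different operators.
        for pattern in (r"&&?", r"/"):
            items = [part for part in re.split(pattern, group) if part.strip()]
            if len(items) > MAX_CHAIN:
                flagged.append(group.strip())
                break
    return flagged


print(long_chains("[ x mental:< happy& insane& tired& bored>.]"))
```

A flagged group is not necessarily wrong; it is a candidate for splitting into two < > groups joined by /.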

Colon :

The CAT<nip> format uses a colon : to indicate the start of an entry following a label.

Period .

The CAT<nip> format uses a period . to indicate the end of an entry.

Putting It All Together

Now for the fun part! Let’s create a character.

  1. Let’s name our character Lily Ellison (thanks https://www.name-generator.org.uk).

[ Lily summary:< full name≡ Lily Ellison>.]

But what does it all mean!?

Ah! Now we see an advantage of CAT<nip> over other formats that only indicate the character at the start of the “character sheet”. Labeling each entry with the character’s name significantly reduces the AI’s confusion over which attributes go with which character, a frustration I’ve had with other formats. So Lily summary: means that this is the summary attribute of the character Lily.

Okay, so what about < full name≡ Lily Ellison>?

Hopefully it’s easy to see what’s happening here. full name≡ reads “full name is identical to...” < full name≡ Lily Ellison> therefore means “full name is identical to Lily Ellison.”

Altogether Lily summary:< full name≡ Lily Ellison> translates to a character named Lily whose full name is Lily Ellison!
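
The same pattern can be generated from structured data. Below is a minimal sketch, assuming attributes are supplied as a mapping from label to a list of < > group contents; `build_entry` is a hypothetical helper, and the four-space indent on continuation lines simply mirrors the examples in this guide.

```python
from typing import Dict, List


def build_entry(name: str, attributes: Dict[str, List[str]]) -> str:
    """Render {label: [group, ...]} as one bracketed entry, one attribute per line."""
    lines = []
    for label, groups in attributes.items():
        # Each group becomes "< ...>"; groups on the same attribute are joined with "/".
        rendered = "/".join(f"< {group}>" for group in groups)
        # Every line repeats the character's name, as the guide recommends.
        lines.append(f" {name} {label}:{rendered}.")
    # "\n   " plus the leading space of each line gives a four-space indent.
    return "[" + "\n   ".join(lines) + "]"


print(build_entry("Lily", {"summary": ["full name≡ Lily Ellison"]}))
```

The group strings are inserted as given, so operators such as ≡ and & inside a group still need their following spaces.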

Examples

Characters

[ Jurģis description: < full name≡ Jurģis Ozols>/< age≡ 40>/< male>.
    Jurģis wearing: < blue citizen's jumpsuit& brown shoes>.
    Jurģis appearance: < hair≡ short& straight& brown>/< eyes≡ green>/< skin≡ somewhat pale>.
    Jurģis situation: < fought in French Foreign Legion during Seven Hour War>.
    Jurģis traits: < skills≡ combat& field medicine>.
    Jurģis mental: < frustrated& nervous>.]
[ you description:< full name≡ Jacob Ellison>/< age≡ 24>/< male>.
    you wearing:< polo& tan slacks& brown loafers>.
    you appearance:< athletic& fit>/< skin≡ dark& flawless>.
    you situation:< PhD student preparing for candidacy exam>.
    you mental:< confident& kind& resourceful>.
    you occupation:< graduate student>.
    you relationships:< sister≡ Lily>/< mother≡ Camilla>/< father≡ Gordon>.]

Added by Covalent & javaman: CAT<nip> can also be used to describe a variety of exotic or non-humanoid characters. Go wild!

[ Olra description:< solar dragon/ old archdragon>/< female>/< age≡ three millennia>/< sun goddess>.
    Olra appearance:< quadruped& four wings& eyes≡ white& scales≡ golden& large horns>.
    Olra wearing: < solar crown>.
    Olra abilities: < solar light magic>.
    Olra mental: < haughty& domineering& hate dark dragons& pious>.]
[ Yith summary:< name≡ yith/ alien& eldritch>.
    Yith appearance:< height≡ 3m>/< body≡ conical>/< arms≡ extensible& crab-like& clawed>/< rear arm≡ cluster& four& trumpet-like organs>/< foot≡ muscular slug-like>/< head≡ spherical>/< eyes≡ three& dark>/< hearing organs≡ stalked& flower-like& ontop>/< facial tentacles≡ green& on bottom>/< legs≡ slug& slimy& crawl>.
    Yith speech:< telepathy>.
    Yith abilities: < spacetime manipulation>/< mind puppeteer>.]
[ Hermaeus Mora description:< name ≡ Hermaeus Mora/ Herma-Mora/ Hermorah>/< age ≡ unknown>/< genderless>.
    Hermaeus Mora summary:< Usually referred to as a male, his plane of Oblivion is Apocrypha>.
    Hermaeus Mora appearance:< body ≡ tentacles/ many eyes/ floating monster/ wretched abyss/ dark purple vortex>.
    Hermaeus Mora mental:< smart& intelligent& knows all& possessive& manipulative& arrogant& deceiver>.
    Hermaeus Mora occupation:< god of knowledge/ daedric prince of knowledge>.
    Hermaeus Mora traits:< arrogant/ prepotent/ manipulative>.
    Hermaeus Mora speech:< telepathy>.
    Hermaeus Mora abilities:< knowing all>/< manipulation>/< mind puppeteer>/< controlling people>.]

Locations

Added by javaman: These examples are from my Elder Scrolls scenario. The original CAT<nip> guide doesn't have examples on locations, so I did my own testing, and had great results. More solid tests are being done, and I'll keep this doc updated with their results. So far, I have defined about 30 WIs using this format, and they've been working great.

[ Whiterun description:< city state in eastern skyrim>/< hold≡ Whiterun>/< kingdom≡ eastern skyrim>.
    Whiterun climate:< warm& breeze>.
    Whiterun culture:< economical power& hunting& smithing>.
    Whiterun ruler:< jarl≡ Yolanda>/< palace≡ Dragonsreach>.
    Whiterun features:< built on top of hill& dragonsreach& jorrvaskr& temple of kynareth& capital city of Whiterun>.]
[ Winterhold description:< city state in eastern skyrim>/< hold≡ Winterhold>/< kingdom≡ eastern skyrim>/< rival city≡ Windhelm>.
    Winterhold climate:< snowy& freezing>.
    Winterhold culture:< nord& magic& fishing& sea trading>.
    Winterhold ruler:< jarl≡ Ungvid>.
    Winterhold features:< college of winterhold& port city& capital city of winterhold>.]

Added by javaman: Tests with EWIJSON, breaking each WI down into several entries and sorting their t position, also worked very well.

[ Haafingar description:< hold in Western Skyrim>/< capital city≡ Solitude>/< kingdom≡ Western Skyrim>.
    Haafingar climate:< north≡ snowy& freezing>/< south≡ warm>.
    Haafingar geography:< Karth River& Sea of Ghosts& mountain& farmland& fertile& mills>.
    Haafingar ruler:< king≡ Vrage>.]

[ Haafingar culture:< nord& fishing& sea trading& port& arts>.
    Haafingar features:< Karth River& Kilkreath Temple& wolves& trolls& bears>.
    Haafingar towns:< Dragon Bridge>.]
[ Hjaalmarch description:< hold in northwestern Skyrim& part of the kingdom of Western Skyrim>/< capital city≡ Morthal>/< kingdom≡ Western Skyrim>.
    Hjaalmarch climate:< warm& hot>.
    Hjaalmarch geography:< marsh& swamp& Hjaal River& Karth River>.
    Hjaalmarch ruler:< jarl≡ Thora>.]

[ Hjaalmarch culture:< nord& wooden constructions& mining& hunting>.
    Hjaalmarch features:< mining& hunting& swamp& Hjaal River& Karth River>.
    Hjaalmarch towns:< Heljarchen& Stonehills& Dunstad>.]
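
The Haafingar and Hjaalmarch examples above each spread one location across two bracketed entries. The sketch below shows one way to do that split mechanically. It is an illustration only: `split_entry` and `per_entry` are hypothetical names, and it does not reproduce EWIJSON or its position sorting.

```python
from typing import List


def split_entry(lines: List[str], per_entry: int) -> List[str]:
    """Group attribute lines into bracketed entries of at most per_entry lines each."""
    if per_entry < 1:
        raise ValueError("per_entry must be at least 1")
    entries = []
    for start in range(0, len(lines), per_entry):
        chunk = lines[start:start + per_entry]
        # Each chunk gets its own balanced [ ] pair so it can stand alone as a WI entry.
        entries.append("[ " + "\n    ".join(chunk) + "]")
    return entries


haafingar = [
    "Haafingar culture:< nord& fishing& sea trading& port& arts>.",
    "Haafingar towns:< Dragon Bridge>.",
    "Haafingar ruler:< king≡ Vrage>.",
]
for entry in split_entry(haafingar, 2):
    print(entry)
```

Because every attribute line repeats the location's name, each resulting entry still identifies what it describes when it is loaded on its own.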

Items

WIP

Spells

WIP

Advanced Concepts

Why ordering matters, nested expressions, etc… Coming soon!

Disclaimer

Abstract concepts can work in CAT<nip>, for example giving a single character multiple bodies (think Cerberus from Helltaker), but unless referenced by the player with context clues, it is unlikely the AI will reference those extra bodies correctly or with an acceptable level of frequency. Such abstract ideas require testing and hours of manipulation and token testing depending on each case and are not guaranteed, with any format yet extant, to have much success. It is often less aggravating, and a more efficient use of time, to make such references during the course of play and guide the AI on a case-by-case basis.

Garbage in, garbage out. If the inputs you’re giving the AI are logically incomprehensible, poorly formatted, grammatical atrocities, don’t expect ANY format to help you.
