Lorebook Formats - TravelingRobot/NAI_Community_Research GitHub Wiki

This page is outdated!

The research notes in here concern the Sigurd model. Do not apply anything in here to newer models (e.g. Euterpe or Krake).

For Euterpe and Krake there is a guide by pume_ that describes current best practices (either write in full prose or use the Attribute Method).

Table of Contents

For new users (the tl;dr)
General Notes
    Which format is better?
    Encapsulation signs
List of Lore Formats
    Full Prose
    Concise Prose
    NAI Caveman
    Cat<nip>
    NAI Featherlite
    JSON style
    Python Dict Style
    Revised "no more JSON format"

For new users (the tl;dr)

For now, it looks like you will not be doing much wrong no matter the format you use. For beginners I would recommend you write your entries in concise prose, or full prose depending on what feels more natural to you.

Can I just import my old AID world info in zaltys/catnip/featherlite, etc.?

In short: Yes, it just will not be ideal for NAI. The exception seems to be zaltys - you'll be better off converting it to something else.

Slightly longer answer (careful - mostly just my own opinion): In general most formats seem to work okayish, but will not be ideal for NAI. NAI's models are likely to have their own strengths and weaknesses, and formats that played to Dragon's strengths might not play to Calliope's/Sigurd's strengths. (Dragon: much bigger base model, so better at "understanding" implied semantic relations; tight token/character limit, so formats should use as few characters as possible. NAI models: need things to be spelled out more explicitly, but no need to be super stingy with your characters.)

Zaltys deserves a special note, as we have seen heavy bleeding with zaltys, something that usually isn't a big problem with Sigurd. So it is strongly advised not to use zaltys. The same goes for abbreviated category words like APPE, MIND etc. These probably don't help Sigurd much in "understanding" the entry, but do lead to bleeding.

General notes on Lore Formats

A word of caution by zaltys:

I'd recommend not going overboard with format research, as NAI is always working on new models and the way the 'formats' work is likely to change without much warning.

An analogy by Gnurro that I think is a good way to think about Lore entries (and context manipulation in general):

the AI can be primed like people is kinda what this whole transformers thing is about.
It's like saying "green, lush, fresh" to make people more likely to say "plants".
All the shortened formats basically do that, but people think they do some hacking/computer thing, while it's actually a language thing. Transformers are just that good at simulating how people associate words.

Which format is better?

Initial empirical results have found no significant difference in performance between the different formats so far.

Encapsulation signs

  • Using [] seems to work best.
    • As always leave a space after the [. (So: [ Mark...])
  • Be mindful about using {}, you might get unwanted associations with programming languages (Valahraban)

    the wiki people say {} is closer to the topic of programming/pure data and I'm inclined to agree

List of Lore Formats

Full Prose

So there is a bit of an ongoing debate of full prose versus condensed formats with OccultSage being the main proponent of using full prose for Lore entries. While I disagree with some of his points I'll try to present his arguments that I think are worth considering:

  • So far no conclusive evidence has been presented against either full prose or the more condensed formats
  • Well-written full prose entries have the distinct advantage that they can be used to guide the style of the AI. (In that case you obviously drop the [] around your lore entries)
  • In theory the style of condensed lore entries might leak into your output (but so far this has not been observed to be a huge problem for most common condensed formats enclosed in [], even when testing in an empty prompt).

The following are things I suspect to be true (but take with a grain of salt):

  • Full prose gives you plenty of syntactic sugar that might help make relationships between entities a bit more explicit. This could maybe help with defining more complex relationships/concepts (this is just my own hunch though).
  • Condensed formats on the other hand let you focus on just the keywords you want to associate with a topic word. This might help with making the relationships between topic word + trait stronger (less "baggage" from connecting words, as Rinter has put it).

So full prose or a condensed format? My take is that currently it seems like you will not be doing much wrong either way. So I would recommend writing the entries in whatever makes the most intuitive sense for you, and feel free to experiment with other formats as you see fit.

Concise Prose

Concise prose seems to work quite well in NAI (concise prose in this context = very short, simple sentences). I only give it a heading here because it is well suited for beginners. A good starting point is to start with the topic word and try not to let the sentence run longer than 10 tokens. Then begin the next sentence either with the topic word of the entry or a signifier pronoun (he/she/it/his/her/its). This way the AI hopefully does not "forget" what you are talking about. Put [] around your entry, with a space after the opening [ (so: [ Mark...). If you want to experiment with more condensed formats: transitioning to NAI Caveman is quite easy from concise prose.

  • Example: [ Mark is a 35 year old witty man. He is strong and has red hair.]
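The guidelines above can be turned into a rough self-check. The following is only a minimal sketch: the function names `wrap_entry` and `check_concise_prose` are made up for illustration, and whitespace-separated words are used as a stand-in for tokens (real token counts will usually be higher, so treat the limit as approximate).

```python
import re

# Pronouns that may start a follow-up sentence, as listed in the text above.
PRONOUNS = {"he", "she", "it", "his", "her", "its"}

def wrap_entry(text):
    """Enclose an entry in [] with a space after the opening bracket."""
    return "[ " + text.strip() + "]"

def check_concise_prose(topic, text, max_words=10):
    """Return warnings for sentences that break the concise prose guidelines.

    Assumption: words split on whitespace approximate tokens.
    """
    warnings = []
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    for sentence in sentences:
        words = sentence.split()
        first = words[0].strip(".,!?").lower()
        starts_ok = sentence.lower().startswith(topic.lower()) or first in PRONOUNS
        if not starts_ok:
            warnings.append(f"does not start with topic or pronoun: {sentence!r}")
        if len(words) > max_words:
            warnings.append(f"longer than {max_words} words: {sentence!r}")
    return warnings

entry = "Mark is a 35 year old witty man. He is strong and has red hair."
print(wrap_entry(entry))
print(check_concise_prose("Mark", entry))
```

An empty list of warnings only means the entry follows the rules of thumb; it says nothing about how well the model will actually pick the entry up.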

Monky's NAI Caveman

  • Use <= 18 tokens per line, encapsulate each line with its own []
  • Repeat signal pronoun (he/she/it) every 10 tokens (so about middle of the line)
  • Might work better with "Descriptions: Write in prose" in A/N. (RollForPanda)

Basic Format (copied from Fuzzy)

[ NAME 7token REINFORCEMENT 7token ]
[ NAME 7token REINFORCEMENT 7token ]
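If you keep traits as a list of short fragments, packing them into lines of this shape can be sketched in a few lines of Python. This is an illustration only: `caveman_lines` is a hypothetical helper, it counts whitespace-separated words instead of real tokens, and it does not insert the reinforcement pronoun for you (here the pronoun is simply part of a fragment, e.g. "her skin pale white").

```python
def caveman_lines(name, fragments, max_words=18):
    """Pack trait fragments into bracketed lines that each start with the name.

    Assumption: whitespace-separated words approximate tokens, so a budget
    below 18 leaves some headroom for the real tokenizer.
    """
    name_words = len(name.split())
    lines = []
    current = []        # fragments on the line being built
    used = name_words   # words used so far on this line, including the name
    for fragment in fragments:
        n = len(fragment.split())
        # Start a new line when this fragment would exceed the budget.
        if current and used + n > max_words:
            lines.append(current)
            current, used = [], name_words
        current.append(fragment)
        used += n
    if current:
        lines.append(current)
    # Each line gets its own [] with a space after the opening bracket.
    return [f"[ {name} " + ", ".join(parts) + "]" for parts in lines]

fragments = [
    "hair shining golden", "very long", "her skin pale white", "heavily scarred",
    "eyes bright blue", "sparkle brilliantly", "her outfit polished black leather armor",
]
for line in caveman_lines("Lucy", fragments, max_words=12):
    print(line)
```

The word budget of 12 in the usage example is a deliberately cautious choice, since a word often costs more than one token.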

Examples

[ Lucy hair shining golden, very long, her skin pale white, heavily scarred]
[ Lucy eyes bright blue, sparkle brilliantly, her outfit polished black leather armor]
[ Lucy weighs fifty five pounds, she three and half foot tall]
[ Lucy note: wingless angel]
[ Lydia AI 'Miss Fortune' shuttle, Lydia and 'Miss Fortune' belong Zennifer]
[ Lydia operates 'Miss Fortune' shuttle, her objective deliver packages for Zennifer]
[ Lydia speech program features drunken accent, her behavior analytical helpful by design]
[ Lydia projects herself throughout shuttle interior, her avatar luminescent pink hologram]
[ Lydia is unfamiliar with you]
[ Jane hair light green, somewhat short, her skin deep purple, heavily tattooed]
[ Jane eyes glossy orange, fairly dull, outfit rusted yellow metal exoskeleton]
[ Jane stature humongous, weighs 2500 pounds, she 15 ft tall]

fairly accurate with no real sorcery involved. -6 position and -400 priority is about it. AI gets inventive as always, assumes she's an alien (fair) and adds extra eyes and metal limbs but it gets the defined part

[ Fox-girls are human-looking girls with fox traits]
[ Fox girl traits are fox-like ears, tail, claws, and attitude]

Kitsunes (rando)

(added here because it is sort of between prose and caveman imho...)

[Kitsunes: girls that have the form of a human, but have fox ears and one or more fox tails.]
[Genre: anime;]

Adding this entry and the genre seemed to finally do it. Clearly more anime is needed in the finetune though.

Cat<nip> (tested by rando, Cass)

Catnip SFW-Doc (AID), NSFW Doc (AID)

  • Still seems to work fine in NAI
  • Writing the entry on a single line (separate with .) also seems to work okay
  • can get some rare leaks with & and <>, you might want to ban <>, just in case

NAI Featherlite (Rinter)

  • Rinter is already working on revising featherlite for NAI here; revision notes here; notes on experimental ideas here

there's not much to say really. Replace bullet with [ ], and you don't need to worry about space removal between words and symbols. That's about it 😄

one other change (mainly to prevent word mashing format leaking) would be to dial back the amount of wordmashing, since it's not really that important anymore.

the tl;dr of modern featherlite now-a-days is [ pointer topic filter: spectrumofattributes topic moreattributes ] with some wordmashing only between associated words....

  • Kalmarr also regularly uses and tests featherlite. His information and examples can be found here.

Rando's JSON style

  • Essentially make a JSON and then minify it to one whole line

Why use JSON?

  • Can be manipulated with code
  • Useful for using scraped content as Lore entry
  • Supports nesting
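As a rough illustration of the "make a JSON and then minify it" step, Python's standard `json` module can do the minifying. The helper name `json_lore_entry` is made up for this sketch, and wrapping the result in [] without a space simply mirrors the examples on this page.

```python
import json

def json_lore_entry(data):
    """Minify a nested dict to one line of JSON and enclose it in []."""
    # separators=(",", ":") drops the spaces json.dumps adds by default.
    minified = json.dumps(data, separators=(",", ":"), ensure_ascii=False)
    return "[" + minified + "]"

aliens = {
    "Aliens": {
        "speciesType": "dangerous and terrifying insectoid monsters",
        "behaviors": "ambush predators who strike with stealth and speed",
    }
}
entry = json_lore_entry(aliens)
print(entry)
```

Because the entry stays valid JSON inside the brackets, it can be loaded back with `json.loads(entry[1:-1])` and edited with code, which is the main selling point listed above.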

Examples:

[{"Aliens":{"speciesType":"dangerous and terrifying insectoid monsters","behaviors":"ambush predators who strike with stealth and speed","motives":"behavior is driven by instinct, hunger, and animalistic cunning","appear":"Aliens are tall and sleek with serpentine grace and sinuous tails. Alien bodies are covered in glistening dark black chitin with sharp claws and teeth."}}]

[{"Lucy":{"sex": "female","speech": "Lucy is a mute & speaks only in sign language.","hair": "shining golden, very long", "skin": "pale white, heavily scarred", "eyes": "bright blue, sparkle brilliantly", "wears": "polished black leather armor", "weighs": "fifty five pounds", "height": "3.5", "race":"angel (wingless)"} }]

[{"Lucy":{"sex":"female","speech":{"accent":"like scottish","yes":"aye","you":"ye","can't, cannot, can not":"cannae","ass":"arse","small, little, tiny":"wee","didn't, did not":"dinae","you'":"ye'","my":"me","and":"an'"},"hair":"shining golden, very long","skin":"pale white","eyes":"bright blue, sparkle brilliantly","wears":"polished black leather armor","weighs":"55lbs","height":"5.5","race":"elf"}}]

I tried removing the accent or using the replacements to define it, but it broke it. Seems accent is required to trigger any replacements. Some accents are stronger than others.

[{"Albedo":{"nameFull":"Albedo","gender":"female","age":"18","race":"succubus","height":"170cm","weight":"90lbs","eyes":"golden/silted-pupil","hair":"black","skin":"white","appear":"black feathered wings, large breasts","wears":"a white dress with golden spiders web, black angel wings, and a golden necklace, white horns on head","mind":"calm, level-headed, strict, capable, loves Ainz/Momonga, hates humans","summary":"Overseer of the Guardians of the Great Tomb of Nazarick, virgin, created by Tabula.","class":"Unholy Knight","powers":"Succubus, Immune to disease, high offensive power, high defense power"}}]

[{"Illithid, mind flayer": "Illithids are eldritch creatures with 3 to 6 tentacles coming out of their face where their mouths should be. Blue black skin. Orange eyes. No nose or ears. They eat the brains of sentient creatures and are psionic."}]

If I didn't tell it where the mind flayer's tentacles came out of, it wouldn't understand.
kept having them come from his back
same with the number of tentacles
he had hundreds at one point

RollForPanda's Python Dict Style

it's literally just a Python Dictionary, without the " before and after the : (I've found it helps stop leaking). I've also found I don't have to reinforce who I'm referring to within the lore entry. I haven't tested a lot, but so far it looks promising.

Examples

[ Jane = {"Gender: Female", "Race: Advura", "Head: Elongated, Chitinous,", "Hair: Hairless", "Eyes: Not visible, Covered by chitin", "Skin: Grey, Chitinous"}]
[ Vonari = {"Name: Vonari Rishai", "Age: 30", "Gender: Female", "Species: Advura", "Movement: On all fours, Predatory Grace", "Length: Very Long", "Build: Slender, Lithe", "Body: Covered in blue, shiny, tough chitin", "Head: Elongated, smooth, chitinous", "Hair: Hairless", "Eyes: Not Visible, covered by chitin", "Feet: Clawed, Three Digits", "Tail: Long, Flexible, Segmented, Bladed tip", "Back: Dorsal Spines, Sensitive, Quill-Like", "Clothing: Naked", "Home: SSV Ikiro, Deck 2", "Rank: Commander", "Job: Executive Officer", "Personality: Determined, Loyal, Protective", "Sexuality: Straight", "Relationships: Single"}]
[ Advura= {"Species: Advura", "Type: Insectoid", "Movement: Quadrupedal, On all fours, Predatory Grace", "Length: Very Long", "Advura_Body: Slender, Lithe", "Advura_Skin: Chitin(dark, shiny, thick, tough)", "Advura_Head: Elongated(smooth, chitinous)", "Advura_Hair: None", "Advura_Eyes: Not Visible(covered by chitin)", "Advura_Legs: Slender", "Advura_Feet: Clawed, Three Digits", "Advura_Tail: Long, Flexible, Segmented, Bladed tip", "Advura_Back: Dorsal Spines(Sensitive, Quill-Like)", "Clothing: Naked", "Personality: Predatory", "Speech: Talking, Hissing".}]
[ Tahlia= {"Species: Tahlia", "Type: Humanoid, Reptilian", "Movement: Bipedal, Graceful", "Height: Average", "Body: Slender, Curvy, Hairless", "Body_Skin: Covered in bluish, leathery", "Body_Feet: Humanoid", "Body_Arms: Humanoid", "Body_Legs: Humanoid", "Head: Humanoid", "Head_Hair: Hairless, covered in flexible head crests", "Head_Eyes: Humanoid",  "Clothing: Ornate", "Personality: Arrogant", "Speech: Talking".}]
[ Asari= {"Species: Asari", "Gender: Monogendered(female)", "Type: Humanoid", "Movement: Bipedal, Dexterous", "Height: Average", "Body: Slender", "Asari_Skin: Blue(textured)", "Asari_Hair: None, head crests (cartilage, flexible)", "Clothing: Ornate", "Personality: Diplomatic", "Speech: Talking".}]
[ Turian= {"Species: Turian", "Type: Humanoid, Avian, Reptilian", "Movement: Bipedal", "Height: Average", "Body: Slender", "Turian_Skin: Brown(carapace)", "Turian_Hair: None, head crests (chitin, rigid)", "Clothing: Ornate", "Personality: Militaristic", "Speech: Talking".}]
[ Irasa= {"Species: Irasa", "Gender: Monogendered(female)", "Type: Humanoid", "Movement: Bipedal, Dexterous", "Height: Average", "Irasa_Body: Slender", "Irasa_Scalp: crests(several, cartilage, flexible), no hair", "Irasa_Skin: Blue(textured)", "Clothing: Ornate", "Personality: Diplomatic", "Speech: Talking".}]
[ Xael = {"Species: Xael", "Type: Humanoid", "Shape: Humanoid", "Movement: Bipedal", "Height: Slightly smaller", "Xael_Body: Slender", "Xael_Face: Flat", "Xael_Eyes: Wide", "Xael_Mouth: Humanoid", "Xael_Nose: Flat(Y-Shaped Slit)", "Xael_Skin: Grey-Blue(Leathery, Rough)", "Xael_Arms: Hands(Four Digits)", "Xael_Legs: Feet(Cloven)", "Loyalty: Xael Empire, Greater Good".}]

Note: For anyone that wants to try these out, they are NOT optimized for token count right now, just testing the format frame.
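For anyone who keeps their character data in code, generating entries of this shape could look roughly like the sketch below. `dict_style_entry` is a hypothetical helper name, and the output only imitates the frame of the examples above (each "Key: Value" pair as one quoted string); it is not a tested recommendation.

```python
def dict_style_entry(name, traits):
    """Render traits as [ Name = {"Key: Value", ...}], quoting each pair as a whole."""
    # Each pair becomes one quoted string, so there is no " before or after the colon.
    items = ", ".join(f'"{key}: {value}"' for key, value in traits.items())
    return f"[ {name} = {{{items}}}]"

jane = {"Gender": "Female", "Race": "Advura", "Hair": "Hairless"}
print(dict_style_entry("Jane", jane))
```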

Rando's revised "no more JSON format"

(sorry for the name - just had to call it something for the heading for now...)

JSON is making me mad. When an unlikely result (like an elf child being 70 in Overlord) happens, it still tends to go nuts and grab info even outside the brackets and inside other brackets. I'm testing out converting it to a format that reminds the AI who it's talking about for every category.

[Mare nameFull: Mare Bello Fiore; Mare gender: male; Mare age: 76; Mare race: Dark Elf; Mare height: 104cm; Mare weight: 42lbs; Mare eyes: right eye blue, left eye green; Mare hair: brown; Mare skin: tan; Mare appear: young dark elf child, looks feminine (is male), crossdresser; Mare wears: blue dragon scale suit, white vest with gold embroidery, short white skirt, green leaf cloak, white gloves, a black wooden staff, an acorn necklace; Mare mind: shy, cowardly, reads books, lazy, dislikes talking, likes sleeping, likes to eat, innocent; Mare summary: Floor Guardian of the 6th floor with his twin sister Aura, created by Bukubukuchagama.; Mare class: Druid; Mare powers: high leveled magical power, speed, endurance, enhanced strength, invisibility;]
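Converting an existing JSON-style entry to this layout is mostly string formatting. A minimal sketch, assuming the traits are already available as a flat dict (the helper name `prefixed_entry` is made up here):

```python
def prefixed_entry(name, traits):
    """Repeat the name before every category: [Name key: value; Name key: value;]."""
    parts = [f"{name} {key}: {value};" for key, value in traits.items()]
    return "[" + " ".join(parts) + "]"

mare = {"nameFull": "Mare Bello Fiore", "gender": "male", "age": "76"}
print(prefixed_entry("Mare", mare))
```

Nested values (like the "speech" replacements in the JSON examples) would need to be flattened first; this sketch does not attempt that.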
