Minecraftian Narrative: Part 7

Table of Contents

  1. What is “Minecraftian Narrative”?
  2. Is “Toki Pona” Suitable for Narrative Scripting?
  3. Interface and Gameplay Possibilities
  4. Toki Sona Implementation Quandaries
  5. Dramatica and Narrative AI
  6. Relationship and Perception Modeling
  7. Evolution of Toki Sona to “tokawaje”


Unlike previous iterations of this series, today we’ll be diving into the field of linguistics a bit more intensely. The reason I haven’t posted anything new in a month and a half is that I’ve been working on a completely new language, which is now approaching an alpha state (at which point I should be able to build a full parser for it). Today, I’ll cover why I decided to invent a language, where it came from, how it is different, and how it all ties into the overall goal of narrative scripting.

Future posts will most certainly reference this language, so if you aren’t interested in the background and just want the TL;DR of the language features and relevance to narrative scripting, then feel free to skip on down to the conclusion where I will review everything.

Without further ado, let’s begin!

Issues With “Toki Sona”

Prior to this post, I had puffed up the possibilities of using a toki pona-derived language, hereafter referred to as toki sona. While I was quite excited about tp’s potential to combine concepts, support a minimal vocabulary with simple pronunciation, and offer an intuitive second-hand vocabulary (e.g. “water-enclosedConstruction” = bathroom), a variety of issues forced me to reconsider its adoption for narrative scripting.


First and foremost is the ambiguity within the language. Syntactic ambiguity makes it nearly impossible for an algorithm to easily understand what is being stated, and tp has several instances of this lack of clarity. For example, “mi moku moku” could mean “the hungry me eats” or “I hungrily eat” or even some new double-word emphasis that someone is experimenting with: “I really dove into eating [something]” or “I am really hungry”.  With the language unable to clearly distinguish modifiers and verbs from each other, identifying parts of speech and therefore the semantic intent of a word and its relationship to other words is needlessly complicated.

The one remedy I thought of for this was to add hyphens between nouns/verbs and their associated modifiers, not only allowing us to instantly disambiguate this syntactic confusion, but also simplifying and accelerating the computer’s parsing with a clear delimiter (a special symbol that separates two ideas). However, this solution is not audibly communicable and therefore retains all of these issues in spoken dialogue, violating our needs. Using an audible delimiter would of course be completely impractical.

The other problem of ambiguity with the language is the intense level of semantic ambiguity present due to the restricted nature of the language’s vocabulary. The previously mentioned “bathroom” (“tomo telo”) could also be a bathhouse, a pool, an outhouse, a shower stall, or any number of other related things. Sometimes, the distinction is minor and unimportant (bathroom vs. outhouse), but other times that distinction may be the exact thing you wish to convey. What happens then if we specify “bathroom of outside”?


One possibility is the use of “ma” meaning “the land, the region, the outdoors, the earth”, but then we don’t know if this is an outdoor bathroom or if it is the only indoor bathroom in the region, or if it is just a giant pit in the earth that people use. The other possibility could be “poka” meaning “nearby” or “around”, but that is even more unclear as it specifies a request purely for an indoor bathroom of a given proximity.

As you can see, communicating specific concepts is not at all tp’s specialty. In fact, it goes against the very philosophy of the language: the culture supported by its creators and speakers is one that stresses the UNimportance of knowing such details.

If you ever happen to speak with a writer, however, they will tell you the importance of words, word choice, and the evocative nature of speech. Writers can evoke different emotions and steer the thoughts of the reader purely through style and the nuanced meanings of their terminology. If we are to support this capacity in a narrative scripting language, we cannot build its foundation on a philosophy prejudiced against good writing.


The final issue, related to the philosophy, is the grammatical limitations imposed by its syntax.

  1. Inter-sentence conjunctions like English’s FANBOYS (“I was tired, yet he droned on.”) are thankfully not entirely absent: tp has an “also” (“kin”) and a “but” (“taso”) that can start the following sentence and relate two ideas. One can even use adverbial phrases (the only type of dependent clause allowed) to assist in relating ideas. However, limitations remain, and the reason is a mix of the philosophy and the (admirable) goal of keeping the vocabulary compact.
  2. You cannot satisfactorily describe a single noun with adjectives of multiple, tiered details (“I like houses with toilet seats of gold and a green doorway”). The standard workaround revolves around the “noun1 pi multi-word modifier” technique, which converts a set of words into an adjective describing noun1, and users of tp have long debated ways of extending it. One option I considered, which is mildly popular, is re-appropriating the noun conjunction “en” (“and”) as a way of connecting pi-phrases, but because the more complex forms of description effectively require open and closed parentheses, there isn’t really a clean way of handling this.

Between the grammatical limitations, the prejudice against expressive writing, and the ambiguity, toki pona, and any language closely related to it, will inevitably find itself wanting as a narrative scripting language. Time to move on.

Lojban: Let’s Speak Logically!

In an attempt to solve the problems of toki pona, a friend recommended that I check out the “logical language”, Lojban (that ‘j’ is soft, as in “beige”).

Lojban is unlike any spoken language you have ever learned: it borrows much of its syntax from functional programming languages like Haskell. Every phrase/clause is built around a single word indicating a set of relationships, and the other words act as “parameters,” plugging concepts into those relations.


For example, “cukta” is the word for book. If you simply use it on its own, it plugs “book” into the parameter of another word. However, that’s not all it means. In full, “cukta” means…

x1 is a book containing work x2 by author x3 for audience x4 preserved in medium x5

If you were to have tons of cukta’s following one another (with the proper particles separating them), a fully expanded version would mean…

A book is a book about books that is written by a book for books by means of a book.

You can also specifically mark which “x” a given word is supposed to plug in as, without needing to use any of the other x’s, so the word cukta can also function as the word for “topic”, “author”, “audience”, and “medium” (all in relation to books). If that’s not conservation of vocabulary, I don’t know what is.

It should be noted that 5-parameter words in Lojban are far rarer than simpler ones with only 2 or 3 parameters. Still, it’s impressive that, using this technique, Lojban is able to communicate a large number of topics with a compressed vocabulary, yet remain extremely explicit about the meaning of its words.

Just as important to notice is how Lojban completely does away with the concept of a “noun”, “verb”, “object”, “preposition”, or anything of the sort. Concepts are simply reduced to a basic entity-relation-entity form: entity A has some relationship x to entity B. This certainly makes things easier for the computer. One might think such a simpler system would also be easier for learners, but because it is so vastly different from the norm, people coming from more traditional languages will have a more difficult time understanding it, especially given the plurality of relationships possible with a single word.


Another strong advantage of Lojban is that it is structured to provide perfect syntactic clarity to a computer program and can be completely parsed in a single pass. In layman’s terms, the computer only needs to “read” the text one time to determine, with 100% accuracy, the “part of speech” of every word in a sentence. There is no need for it to guess how a word will be syntactically interpreted.

In addition, Lojban imposes a strict morphological structure on its words to indicate their meaning. For example, each of the “root set” words like “cukta” has one of the two following patterns: CVCCV or CCVCV (C being “consonant” and V being “vowel”). This makes it much easier for the computer to pick out these words in contrast to other words such as particles, foreign words, etc. Every type of word in the language conforms to morphological standards of a similar sort. The end result is that Lojban parsers, i.e. “text readers”, are very fast in comparison to those for other languages.
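As a rough illustration, the shape constraint alone is enough for a program to classify candidate root words with a single regular expression. This sketch checks only the CVCCV/CCVCV skeleton; real Lojban morphology adds further rules (such as which consonant pairs are permissible) that it deliberately ignores:

```python
import re

# Lojban "root set" words (gismu) always have the shape CVCCV or CCVCV.
VOWEL = "[aeiou]"
CONS = "[bcdfgjklmnprstvxz]"
GISMU = re.compile(f"{CONS}{VOWEL}{CONS}{CONS}{VOWEL}|{CONS}{CONS}{VOWEL}{CONS}{VOWEL}")

def is_gismu(word: str) -> bool:
    """Classify a word as a root-word candidate by shape alone."""
    return GISMU.fullmatch(word) is not None

print(is_gismu("cukta"))  # True  (CVCCV)
print(is_gismu("skami"))  # True  (CCVCV)
print(is_gismu("iu"))     # False (vowel-only attitudinal)
```

Because the shape test never needs surrounding context, a parser can sort root words from particles and foreign words in a single pass over the text.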

One more great advantage of Lojban is that it has these terrifically powerful words called “attitudinal indicators” that allow one to communicate a complex emotion using words on a spectrum. For example, “iu” is “love”, but alternative suffixes give you “iucu’i” (“lack of love”, a neutral state) and “iunai” (“hate/fear”, the opposite state). You can even combine these terms to compose new emotions like “iu.iunai” (literally “love-hate”).


For all of these great elements though, Lojban has two aspects that make it abhorrent to use for the simple narrative scripting we are aiming for. It is too large of a language: 1,350 words just for the “core” set that allows you to say reasonable sentences. While this is spectacularly small for a traditional language, in comparison to toki pona’s nicely compact 120, it is unacceptably massive. As game designers, we simply can’t expect people to devote the time needed to learn such a huge language within a reasonable play time.

The other damaging aspect is the sheer complexity of the language’s phonology and morphology. When someone wishes to invent a new word from the root terms, they essentially mash them together end-to-end. That alone would be fine, but letters get swapped around and part of one root is consumed by the end of the other, which is unfortunately very difficult to follow. For example…

skami = “x1 is a computer used for purpose x2”
pilno = “x1 uses/employs x2 [tool, apparatus, machine, agent, acting entity, material] for purpose x3.”
skami pilno => sampli = “computer user”

Because “skami pilno” was a commonly occurring phrase in Lojban usage, a new word with the “root word” morphology could be coined by combining their letters. Obviously, this is very difficult to do on the fly and effectively requires people to learn an entirely new word for the concept.

All that to say that Lojban brings some spectacularly innovative concepts to the table, but due to its complex nature, fails to inspire any hope for an accessible scripting language for players.

tokawaje: The Spectral Language

We need some way of combining the computer-compatibility of Lojban with the elegance and simplicity of toki pona that omits as much ambiguity as possible, yet also allows the user to communicate as broadly and as specifically as needed using a minimal vocabulary.

Over the past month and a half, I’ve been developing just such a language, and it is called “tokawaje”. An overview of the language’s phonology, morphology, grammar, and vocabulary, along with some English and toki pona translations, can be found on my commentable Google Sheets page here (concepts on the Dictionary tab can be searched for with “abc:” where “abc” is the 3-letter root of the word). With grammar and morphology concepts derived from both Lojban and toki pona, and with a minimal vocabulary sized at 150 words, it approximates a toki pona-like simplicity with the potential depth of Lojban. While it is still in its early form, allow me to walk through the elements of tokawaje that capture the strengths of the other two despite avoiding their pitfalls.

Lojban has three advantages that improve its computer accessibility:

  1. The entity-relation-entity syntax for simpler parsing and syntactic analysis.
  2. Morphological and grammatical constraints: the word and grammar structure is directly linked to its meaning.
  3. The flexibility of meaning for every individual learned word: “cukta” means up to 5 different things.

toki pona has two advantages that improve its human accessibility:

  1. Words that are not present in the language can be approximated by combining existing words into new compounds. This makes the vocabulary much more intuitive.
  2. It is extremely easy to pronounce words due to its mouth-friendly word construction (all consonants are followed by a vowel except ‘n’).


“tokawaje” accomplishes this by…

  1. Using a similar, albeit heavily modified entity-relation-entity syntax.
  2. Having its own set of morphological constraints to indicate syntax.
  3. Using words that represent several things that are associated with one another on a spectrum.
  4. Relying on toki pona-like combinatoric techniques to compose new words as needed.
  5. Using a phonology and morphology focused on simple sound combinations that are easily pronounced; words must match the pattern VCV(CV)*(CVCV)*.

Now, once more, but with much more detail:

1) Entity-Relation-Entity Syntax

Sentences are broken up into simple 1-to-1 relations that are established in a context. These contexts contain words that each require a grammar prefix to indicate their role in that context. After the prefix, each word then has some combination of concepts to make a full word. Concepts are each composed of some particle, tag, or root (some spectrum of topic/s) followed by a precision marker that indicates the exact meaning on that spectrum.

The existing roles are as follows (the vowels are pronounced as in Spanish):

  1. prefix ‘u’: a left-hand-side entity (lhs) similar to a subject.
  2. prefix ‘a’: a relation similar to a verb or preposition.
  3. prefix ‘e’: a right-hand-side entity (rhs) similar to an object.
  4. prefix ‘i’: a modifier for another word, similar to an adjective or adverb.
  5. prefix ‘o’: a vocative marker, i.e. an interjection meant to direct attention.

Sentences are composed of contexts. For example, “I am real to you,” is technically two contexts. One asserts that “I am real” while the other asserts that “my being real” is in “your” perspective. This nested-context syntax is at the heart of tokawaje.

These contexts are connected with each other using context particles:

  1. ‘xa’ (pronounced “cha”), which opens a new context (every sentence silently starts with one of these).
  2. ‘xo’, which closes the current context.
  3. ‘xi’, which closes all open contexts back to the original layer.

(Each of these must also carry a corresponding grammar prefix.)
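To make the nesting concrete, here is a minimal sketch (the `group_contexts` helper is hypothetical, not part of any real tokawaje tooling) of how a parser might fold a flat stream of words into nested contexts using these particles. It naively reads the particle as the two letters after the grammar prefix, which only matters for words whose first concept is a context particle; everything else is kept as-is in the current context:

```python
def group_contexts(words):
    """Fold a flat word stream into nested contexts.
    'xa' opens a context, 'xo' closes the current one, and 'xi'
    cascades back to the original layer."""
    root = []          # every sentence silently starts in an open context
    stack = [root]
    for w in words:
        particle = w[1:3]          # naive: prefix vowel, then particle
        if particle == "xa":
            inner = []
            stack[-1].append(inner)  # nest the new context in place
            stack.append(inner)
        elif particle == "xo" and len(stack) > 1:
            stack.pop()
        elif particle == "xi":
            del stack[1:]            # cascade back to the original layer
        else:
            stack[-1].append(w)
    return root

# Hypothetical filler words around an explicit open/close pair:
print(group_contexts(["ufea", "uxa", "ufeo", "uxo", "ufei"]))
# → ['ufea', ['ufeo'], 'ufei']
```

The nested lists mirror the nested-context reading of a sentence: each inner list is one context that acts as a single entity in its parent.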

Examples of Concept Composition:

  1. ‘xa’ = an incomplete word composed of only a particle+precision.
  2. “uxa” = a full word with a concept composed of a grammar prefix and a particle+precision.
  3. “min” = root for pronouns, “mina” = “self”, full “umina” = “I”.
  4. “vel” = root for “veracity”, “vela” = “truth”, full “avela” = “is/are”.
  5. “sap” = root for object-aspects, “sapi” = “perspective”, full “asapi” = “from X’s perspective”.

Sample Breakdown:

“I am real to you” => “umina avela evela uxo asapi emino.”

  1. “umina” {u: subject, min/a: “pronoun=self”}
  2. “avela” {a: relation, vel/a: “veracity=true”}
  3. “evela” {e: object, vel/a: “veracity=true”}
  4. “uxo” {u: subject, xo: “context close”} // indicating the previous content was all a left-hand-side entity for an external context.
  5. “asapi” {a: relation, sap/i: “aspect=perspective”}
  6. “emino” {e: object, min/o: pronoun=you}


It’s no coincidence that the natural grammatical breakdown of a sentence looks very much like JSON data (web API anyone?). In reality, it would be closer to…

{ prefix: 'u', concepts: [ ["min","a"] ] }

…since the meanings would be stored locally between client and server devices.

This is DIFFERENT from Lojban in the sense that no single concept will encompass a variety of relations to other words, but it is SIMILAR in that the concept of a “subject”/”verb”/”object” structure isn’t technically there in reality. For example:

“umina anisa evelo” => “I -inside-> lie” => “I am inside a lie.”

In this case, “am inside” isn’t even a verb, but purely a relation simulating an English prepositional phrase where no “is” verb is technically present.

Contexts can also be left incomplete to create gerunds, adjective phrases, etc. For example, to create a gerund left-hand-side entity of “existing”, I might say

“avela uxo avela evelo.” => “Existing is (itself) a falsehood.”


You might ask, “How do we tell the difference with something like ‘uxavela’? Might it be {u: object, xav/e: something, la: something}?” Actually, no. The reason the computer can immediately understand the proper interpretation is tokawaje’s second Lojban incorporation:

2) Strict Morphological Constraints for Syntactic Roles

Consonants are split up into two groups: those reserved for particles, such as ‘x’ and those reserved for roots, such as ‘v’. The computer will always know the underlying structure of a word’s morphology and consequent syntax. Therefore, given the word “uxavela” we will know with 100% certainty that the division is u (has the V-form common to all prefixes), xa (CV-form of all particles), and vela (CVCV-form of all roots).

Particles can be split up into two categories based on their usual placement in a word.

  1. Those that are usually the first concept in a word.
    1. ‘x’ = relating to contexts (as you have already seen previously)
      1. ‘xa’ = open
      2. ‘xo’ = close
      3. ‘xi’ = cascading close
      4. ‘xe’ = a literal grammar context (to talk about tokawaje IN tokawaje)
    2. ‘f’ = relating to irrelevant and/or non-tokawaje content
      1. ‘fa’ = name/foreign word with non-tokawaje morphology constraints
      2. ‘fo’ = name/foreign word with tokawaje morphology constraints
      3. ‘fe’ = filler word for something irrelevant
  2. Those that are usually AFTER a concept as a suffix (could be mid-word).
    1. ‘z’ = concept manipulation
      1. ‘za’ (zah) = shift meaning more towards the ‘a’ end of the spectrum
      2. ‘zo’ (zoh) = shift meaning more towards the ‘o’ end of the spectrum
      3. ‘zi’ (zee) = the source THING that assumes the left-hand-side of this relation.
        1. Ex. “uvelazi” => that which is something
        2. Shorthand for “ufe avela uxo”.
      4. ‘ze’ (zeh) = the object THING that assumes the right-hand-side of this relation.
        1. Ex. “uvelaze” => that which something is.
        2. Shorthand for “avela efe uxo”.
      5. ‘zu’ (as in “food”) = questioning suffix
      6. ‘zy’ (zai) = commanding suffix
      7. ‘zq’ (zow) = requesting suffix
    2. ‘c’ = tensing, pronounced “sh”
      1. ‘ca’ = future tense
      2. ‘ci’ = progressive tense
      3. ‘co’ = past tense
    3. ‘b’ = logical manipulation
      1. ‘be’ = not
      2. ‘ba’ = and
      3. ‘bi’ = xor
      4. ‘bo’ = or

All other consonants in the language fall into the “root word” set. With these clear divisions, tokawaje will always know what role a concept has in manipulating the meaning of that word.
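Given these consonant classes, a single left-to-right pass can split any well-formed word into its prefix and concepts, producing exactly the JSON-like form shown earlier. A sketch, assuming the particle consonants listed above (‘x’, ‘f’, ‘z’, ‘c’, ‘b’) and treating every other consonant as the start of a CVCV root:

```python
PARTICLES = set("xfzcb")  # consonants reserved for particles (per the tables above)

def parse_word(word):
    """Split a tokawaje word into its grammar prefix and concept list.
    Assumes a well-formed word: a vowel prefix followed by
    CV particles and/or CVCV roots."""
    prefix, rest = word[0], word[1:]
    concepts = []
    i = 0
    while i < len(rest):
        if rest[i] in PARTICLES:                   # CV particle, e.g. "xa"
            concepts.append([rest[i], rest[i + 1]])
            i += 2
        else:                                      # CVCV root, e.g. "vela"
            concepts.append([rest[i:i + 3], rest[i + 3]])
            i += 4
    return {"prefix": prefix, "concepts": concepts}

print(parse_word("umina"))    # {'prefix': 'u', 'concepts': [['min', 'a']]}
print(parse_word("uxavela"))  # {'prefix': 'u', 'concepts': [['x', 'a'], ['vel', 'a']]}
```

Note that no lookahead or backtracking is ever needed: the first consonant of each concept alone decides whether to consume two letters or four, which is exactly the 100%-certain division described above.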

I’d also like to point out that informal, conversational uses of these two groups of particles may completely remove the distinction between them. For example, someone may simply say:

“uzq” => “Please.”

This would not actually impact the computer’s capacity to distinguish terms, though. I even plan to make my own parser assume that the lack of a grammar prefix implies an intended ‘u’ prefix (not that that’s encouraged).

3) Concepts in Tokawaje Exist on Spectra

Almost every word in the language has exactly four meanings; only three non-root concept types use more than that: the grammar prefixes, the ‘z’-based word manipulators (as you’ve already seen), and the general expressive noises / sound effects, which are vowel-only. This technique allows for a vocabulary that is flexible yet intuitive, despite its initial appearance of complexity.

4) Sounds and Structure are Designed for Clear, Flowing Speech

Every concept is restricted to a form that facilitates clear pronunciation and a consistent rhythm. Together, these elements ensure that the language is simple to learn phonetically.

Concepts have the form C (particles/tags) or CVC (roots) along with a vowel grammar prefix and a vowel precision suffix, resulting in a minimum word of VCV or VCVCV.

The rhythm of a word emphasizes the middle CV of each concept: u-MI-na, a-VE-la, etc. Even with suffixes applied to words, this pattern never becomes unmanageable. The result is a nice, flowy-feeling language:

  1. uvelominacoze / avelominaco (“velomina” => a personal falsehood)
    1. u-VE-lo-MI-na-CO-ze (that which one lied to oneself about)
    2. a-VE-lo-MI-na-co (to lie to oneself in the past)


5) Tokawaje Employs Tiered Combinatorics to Invent New Concepts

The first concept always communicates the root “what” of a thing while the subsequent concepts add further description of the thing. This structure emulates toki pona’s noun-combining mechanics.

‘u’, ‘a’, and other non-‘i’ terms are primary descriptors and more closely adhere to WHAT a thing is. ‘i’ terms are secondary descriptors and approximate the additional properties of a thing BEYOND simply WHAT it is. Fundamentally, every concept follows these simple rules:

  1. Non-‘i’ words are more relevant to describing their role’s reality than ‘i’ words.
  2. However, individual words are described more strongly by their subsequent ‘i’ words than they are by other terms.
  3. Multiple non-‘i’ words will further describe that non-‘i’ term such that later non-‘i’ words act as descriptors for the next-left non-‘i’ word and its associated ‘i’ words.

Let’s say I have the following sentence (I’ll be using the filler particle “fe” with artificially inserted numbers so individual words are easier to reference; think of each “feN” as a root+precision CVCV form):

“ufe1fe2 ife3fe4 ife5 ufe6fe7 ife8 avela uxofe9 ife10 afe11”

This can be broken down in the following way:

  1. Any pairing of adjacent fe’s form a compound word in which the second fe is an adjective for the previous fe, but the two of them together form a single concept. For example, “ife3fe4”: fe4 is modifying fe3, but the two together form an adjective modifying the noun “ufe1fe2”.
  2. The subject is primarily described by “ufe1fe2” and secondarily by “ufe6fe7” since they are both prefixed with ‘u’, but one comes later. “ufe6fe7” is technically modifying “ufe1fe2”, even if “ufe1fe2” is also being more directly modified by the ‘i’-terms following it.
  3. Each of those ‘u’ terms is additionally modified by its adjacent ‘i’-term adjective modifiers.
  4. “ife5” is an adverb modifying “ife3fe4”.
  5. The “existence of” the “ufe1-8” entity is the u-term of the “afe11” relation.
  6. The entirety of that u-term has a primary adjective descriptor of “-fe9” and a secondary adjective descriptor of “ife10”.
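These precedence rules can be read as a simple attachment algorithm: each ‘i’ word modifies the word immediately before it, while each later non-‘i’ word attaches to the previous non-‘i’ word. Here is a sketch; the `{word, mods}` tree shape is my own hypothetical representation, and it assumes the sentence role starts with a non-‘i’ head word:

```python
def attach_modifiers(words):
    """Build a modifier tree from a flat list of same-role words.
    Rule 2: an 'i' word modifies the word immediately before it.
    Rule 3: a later non-'i' word describes the previous non-'i' word."""
    head = None       # the first non-'i' word anchors the phrase
    prev = None       # most recent word of any kind
    prev_main = None  # most recent non-'i' word
    for w in words:
        node = {"word": w, "mods": []}
        if w.startswith("i") and prev is not None:
            prev["mods"].append(node)        # rule 2
        else:
            if prev_main is None:
                head = node
            else:
                prev_main["mods"].append(node)  # rule 3
            prev_main = node
        prev = node
    return head

# The "ufe1fe2 ife3fe4 ife5 ufe6fe7 ife8" subject from the breakdown above:
tree = attach_modifiers(["ufe1fe2", "ife3fe4", "ife5", "ufe6fe7", "ife8"])
```

In the resulting tree, “ife5” lands under “ife3fe4” and “ife8” under “ufe6fe7”, while both “ife3fe4” and “ufe6fe7” hang off the head “ufe1fe2”, matching points 2 through 4 of the breakdown.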


Suppose the word for “dog” were “uhumosoviloja” (u/lhs, humo/beast, sovi/land, loja/loyal = “loyal land-beast”). How might you describe a disloyal dog, then? You would use a ‘u’ for stating it is a dog (the identifying aspect) and an ‘i’ for the disloyalty (the added-on description). The spectrum of loyalty (“loj”) would therefore show up twice.

“uhumosoviloja ilojo” => “disloyal dog”

For clarity purposes, you may even split up the “loja”, but that wouldn’t impact the meaning since “uloja” still has a higher priority than “ilojo”.

“uhumosovi uloja ilojo” => “disloyal dog” (equivalent)

Let’s say there were actual distinctions between words though. How about we take the noun phrase “big wood box room”? Here’s the necessary vocabulary:

“sysa” => “big/large/to be big/amount”
“lijavena” => “rigid thing of plants” => “wood/wooden/to be wood”
“tema” => “of or relating to cubes”
“kita” => “of or relating to rooms and/or enclosed spaces”

Now let’s see some adaptations:

  1. ukitasysa => an “atrium”, a “gym”, some space that, by definition, is large.
  2. ukita usysa => same thing.
  3. ukita isysa => a room that happens to be relatively big.
  4. ukita utema ulijavena usysa => cube room of large-wood.
  5. ukita utema ulijavena isysa => cube room of large-wood.
  6. ukita utema ilijavena usysa => room of wooden large-boxes.
  7. ukita itema ulijavena usysa => a cube-shaped room of large-wood.
  8. ikita utema ulijavena usysa => [something] related to rooms that is cube-shaped, wooden, and large.
  9. ukita utema ilijavena isysa => a room of large-wood cubes.
  10. ukita itema ilijavena usysa => the naturally large room associated with wooden cubes.
  11. ikita itema ulijavena usysa => [something] related to cube-shaped rooms that is a large-plant.
  12. ukita itema ilijavena isysa => the room of large-wood boxes.
  13. ikita itema ilijavena usysa => [something] related to plant-box rooms that is an amount. (an inventory of greenhouses or something?)
  14. ikita itema ilijavena isysa => [something] related to rooms of large-wood boxes.
  15. ukita usysa utema ilijavena => large room of wooden boxes.
  16. ukita utema usysa ilijavena => room of wood-amount boxes.
  17. ukita utemasysa ilijavena => room of wooden big-boxes.
  18. ukita usysa iba ulijavena itema => A big-room and a cube-related plant.


Some of these are a little crazy and some of them are amazingly precise. The point is, we are achieving this level of precision using a vocabulary closer to the scope of toki pona. I can guarantee you that you would never have been able to say any of this in a language as vague as TP nor will it ever try to approximate this level of clarity. I can likewise guarantee that Lojban will never have a minified version of itself available for video games. Good thing we don’t need one.


As you can see, tokawaje combines the breadth, depth and computer-accessibility of Lojban with the simplicity, intuitiveness, and human-accessibility of toki pona.

For those of you wanting the TL;DR:

The invented language, tokawaje, is a spectrum-based language. Clarity of pronunciation, compactness of vocabulary (150 words), and combinatoric techniques to invent concepts all lend the language to great accessibility for new users of the language. On the other hand, a sophisticated morphology and grammar with clear constraints on word formations, sentence structure, and their associated syntax and semantics result in a language that is well-primed for speedy parsing in software applications.

More information on the language can be found on my commentable Google Sheets page here (concepts on the Dictionary tab can be searched for with “abc:” where “abc” is the 3-letter root of the word).

This is definitely the longest article I’ve written thus far, but it properly illuminates the faults with pre-existing languages and addresses the potential tokawaje has to change things for the better. Please also note that tokawaje is still in an early alpha stage and some of its details are liable to change at this time.

If you have any comments or suggestions, please let me know in the comments below or, if you have specific thoughts that come up while perusing the Google Sheet, feel free to comment on it directly.

Next time, I’ll likely be diving into the topic of writing a parser. Hope you’ve enjoyed it.


Next Article: Coming Soon!
Previous Article: Relationship and Perception Modeling

Minecraftian Narrative: Part 6



Previously, we identified two narrative AIs: the StoryMind that manages story development and content generation behind the scenes, and the Agent that simulates the behaviors of a character. The Agent consults with a Character while interpreting narrative scripting input. It then relays instructions to the Vessel that executes those instructions in the virtual world on behalf of the Character. Today, we’ll explore how an Agent could model socio-cultural constructs, account for multiple layers of interactive perceptions, and integrate narrative scripting into each of these.

Vessel = gameplay logic, Character = personnel record, Agent = interpretation AI logic

Amorphic Relationship Abstractions

In games, programmers will often construct objects in the game world based on a flexible design called the “Component” design pattern. This technique builds game objects less by focusing on a hierarchy (a Dragon is a Beast is a Physical is a Renderable is an Object), and more by attributing generic qualities to them which can be added or removed as needed. The objects then simply function as amorphous containers for these “components” of behavior. You don’t have a “dragon”, you have a “fire-breathing”, “flying”, “intelligent” “serpentine” and “animalistic” object that “occasionally attacks cities” which we simply label as a dragon. A player could then talk with the dragon and convince it to become more peaceful mid-game. The “Component” system is what allows us to dynamically change the dragon Character by simply removing the city-attacking behavior.
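The Component pattern described above can be sketched in a few lines (the `GameObject` class and string-based components are illustrative simplifications; a real engine would store component objects carrying their own data and update logic):

```python
class GameObject:
    """An amorphous container of behavior components."""
    def __init__(self, label, *components):
        self.label = label
        self.components = set(components)

    def add(self, component):
        self.components.add(component)

    def remove(self, component):
        self.components.discard(component)

    def has(self, component):
        return component in self.components

# A "dragon" is just a label over its attributed qualities.
dragon = GameObject("dragon", "fire-breathing", "flying", "intelligent",
                    "serpentine", "animalistic", "attacks-cities")

# Mid-game, the player convinces it to become peaceful:
dragon.remove("attacks-cities")
print(dragon.has("attacks-cities"))  # False
```

Because the dragon is nothing but its current set of components, changing its behavior at runtime is a single set operation rather than a restructuring of a class hierarchy.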


This same model would seem to be extremely effective at describing our relationships in life. Relationships are amorphous and are often interpreted by context: which behaviors actually exist between two entities, and which behaviors are expected. Let’s say you are trying to understand whether you are in a “friendship” relationship with someone. If the other person isn’t doing what you are expecting a friend to do, then the likelihood that your unknown, actual relationship is the suspected one decreases. On the flip side, if you expect your friends to quack like a duck, and this random person does quack like a duck at you, then you have found someone who is likely to become your friend (though, you and she would be a bit weird). Critical to this is how each Character may have its own definition of what behaviors constitute “friendship.”

In addition, when one evaluates the satisfaction of a relationship, one typically focuses on the behaviors one wishes to engage in with others. However, the person doesn’t then start engaging in new behaviors immediately in the context of their old relationship; they first prioritize changing the relationship itself, so as to make the sought-after behavior more acceptable to the other party.


For example, if a boy likes a girl, he shouldn’t (necessarily) immediately go to her home and declare his love, but perhaps first build a “familiar”, then an “associate”, and then a “friend” relationship (though the way a relational path from relationship A to B is calculated for any given Character would be a function of the Character’s personality).

The implication is that these kinds of procedurally generated relational pathways can lead to characters that naturally develop a variety of human-like behaviors as they decide on a goal, calculate a possible social path towards that goal, and then further break down ways in which to change the situation they are in to meet their goals. This is related to the concept of hope. When you hope for someone to engage in a given behavior, then you are really stating that you will be more satisfied if you are in a relationship where those kinds of behaviors are expected (and where the person actually does those behaviors, indicating that they actually are in that relationship with you).


For example, a “daughter” entity D and a “father” entity F exist in the following way:

  • D & F both have the same expectations of F such that they both agree F is a “biological father” with D (he is responsible for impregnating her mother).
  • D & F both have the same expectations of each other such that they both agree F is a “guardian” with D (he houses/feeds/protects her, & pays for her healthcare/schooling, etc.).
  • D & F have different expectations of each other such that F believes he has a “fatherhood” relationship with D, but D does not believe this. F is always working, and D wishes he would play with her more often and come to her public achievements in school. As such, D is not satisfied, since her conception of the “fatherhood” relationship differs from F’s.
  • Because D and F can both have different variations of the same relationship expectations, an AI will be able to support a system in which D and F may talk to each other “about” the same topic, but be thinking of totally different things (simulating the naturally confusing elements of the human condition). This is because the label for the relationship is equivalent, but the definition of each person’s idea of the relationship consists of different behaviors.

Tiered Relationship Expectations

Furthermore, the variations in relationship expectations may diverge at the individual level or group level. We may be able to assume that the vast majority of people within a given “group” have similar beliefs regarding one topic or another. However, we must also consolidate a hierarchy of relational priorities: for any given person, individual expectations override any group-level ones, and different groups will have various degrees to which they influence the individual’s social expectations of others. Let’s take The Legend of Korra as an example.


In this world, some people, called “benders,” can control a certain element (fire, earth, water, or air). Previously, the differences between types of benders and the cultures they came from led to conflict in the world. In “Republic City” however, those people can now live in peace with one another. This represents a “national” group with cultural expectations of uniting people despite their different cultures.


But those without any powers are subject to prejudice and to the general economic superiority of the “benders”. From there spawns a political activist and terrorist group: the Equalists. This group adds another layer of people on top of the “national” group layer. An Equalist who still believes in the capacity for people to unite is simply someone who is not as loyal to the Equalist cause. Whether the person still has this hope for a positive relationship between the Republic City entity and themselves is something that others may notice and consider when evaluating the actions of this person. The Equalist leader will see this person as someone who must be further manipulated to the cause, whereas peacekeepers will seek to redirect this person’s emotion-driven efforts at reform.

Finally, we have the individual level, which supersedes all group-level social expectations. Say one of these questionably-loyal Equalists also has another expectation that relates to equality: they believe a city should always be concerned about the well-being of the diversity of animals in the Avatar world. Animal-care efforts by the city therefore play a larger role in currying favor with this particular person, even if their Equalist position still puts them in a climate of distaste for the city. This person, like many who might join that organization, likely has a variety of internal conflicts to manage: expectations and needs battling for dominance of the mind. This is how our Agents should approach decision-making: through a diverse conflict of interests.
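A sketch of this tiered lookup might look like the following, where individual expectations override group-level ones, and groups are consulted in order of how strongly they influence the person (all names and weights here are hypothetical):

```python
def resolve_expectation(topic, individual, groups):
    """`individual` maps topic -> expectation; `groups` is a list of
    (influence_weight, expectations_dict) tuples. Individual beliefs
    win outright; otherwise the most influential group that has an
    opinion on the topic wins."""
    if topic in individual:
        return individual[topic]
    for _, expectations in sorted(groups, key=lambda g: -g[0]):
        if topic in expectations:
            return expectations[topic]
    return None

equalist = (0.8, {"city_government": "distrust"})
republic = (0.5, {"city_government": "unity", "animal_welfare": "neutral"})
individual = {"animal_welfare": "strong_support"}

# The Equalist layer wins on the city question, but the individual
# layer overrides both groups on animal welfare.
print(resolve_expectation("city_government", individual, [equalist, republic]))
# -> distrust
print(resolve_expectation("animal_welfare", individual, [equalist, republic]))
# -> strong_support
```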


Relationship Modeling

So, how to actually take this relational concept and model it in a way the computer can understand? Well, let’s first define our terms:

  • Narrative Entity: A basic “thing” in the narrative that has narrative relevance. Can be a form of Life, a non-living Object, a Place, or an abstract idea or piece of Lore. This is “what” the thing is and implies various sorts of properties. It also places default limits on things (for example, a Lore cannot be interacted with physically).
  • Character: A Narrative Entity that has a ‘will’, i.e. can desire for itself a behavior. This is “who” the thing is (ergo, it has a personality) and implies what sorts of behaviors it would naturally engage in.
  • Role: The name a Narrative Entity assumes under the context of its behaviors in a Relationship.
  • Relationship: The set of behaviors that have occurred between two Narrative Entities. They will always be binary links between NEs and may or may not be bidirectional, i.e. an Entity may not even have to do anything to be in a Relationship. It may not even be aware that it is in a Relationship with another entity.

A behavior, as we define it here, is some action or state change. For actions, there is a source for the behavior and an object. As such, we can graphically portray transitive behaviors as directed lines running from one Narrative Entity to another. For intransitive verbs, we simply have a directed line pointing to a Null Entity that represents nothingness.
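In code, this directed-edge model might be sketched as follows (the class and field names are illustrative assumptions, not a committed API):

```python
from dataclasses import dataclass

NULL_ENTITY = "null"  # stands in for "nothingness"

@dataclass(frozen=True)
class Behavior:
    source: str                # Narrative Entity performing the action
    action: str                # verb or state change
    target: str = NULL_ENTITY  # defaults to intransitive

relationship = [
    Behavior("Alice", "feeds", "Bob"),  # transitive: Alice -> Bob
    Behavior("Bob", "sleeps"),          # intransitive: Bob -> null
]

# The set of directed edges between two entities *is* their relationship.
edges_between = [b for b in relationship
                 if {b.source, b.target} == {"Alice", "Bob"}]
print(len(edges_between))  # 1
```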

Rather than simply have these be straight lines however, it is easier to think of them as longitudinal lines between points on a globe.

Renderings may not necessarily place them at equidistant positions. This is merely the simplest rendering.
At each pole is a Narrative Entity, and the entirety of the globe encompasses the actual relationship between the two. We can then define a set of “ideal” relationships that have their own globes of interactions. By checking the degree to which the ideal is a subset of the actual, we can calculate the likelihood that the actual includes the idealized relationship. This is a straightforward application of set logic to identifying and comparing relationships.
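That subset check reduces to a few lines of set arithmetic. Here is a hedged sketch (the behavior labels and the particular likelihood formula, the observed fraction of the ideal set, are my own illustrative choices):

```python
def inclusion_likelihood(ideal: set, actual: set) -> float:
    """Fraction of the ideal relationship's behaviors found in the
    actual history of behaviors (1.0 = the ideal is a full subset)."""
    if not ideal:
        return 1.0
    return len(ideal & actual) / len(ideal)

friendship_ideal = {"greets", "shares_meals", "helps_in_trouble", "confides"}
history          = {"greets", "shares_meals", "quacks_like_duck"}

# Half of the idealized friendship behaviors have actually occurred.
print(inclusion_likelihood(friendship_ideal, history))  # 0.5
```

Because each Character can carry its own `friendship_ideal` set, two Characters can disagree about whether the same history constitutes friendship, exactly as described above.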

I further propose that these globular relationships have a sequence of layers: a core globe summarizing the history of behaviors that have occurred between the two entities, and an intermediate layer composed of hoped-for behaviors for any given Character.

The intermediate layer is far more complex since it is both hypothetical and subjective, whether between two Characters (visualized as two clearly-divided hemispheres) or between a Character and a non-Character Narrative Entity (a full globe). The intermediate (hemi)sphere(s) would be calculated by an algorithm that takes into account the historical core of the relationship and the associated Character’s personality. Given Character goals X and past interactions Y, what type of relationship, i.e. what collection of behaviors, does the Character wish to have with the target of the relationship?

Picture each division of this orange as the source hemisphere of two respective Characters: clearly divided, yet maintaining the same directed-lines-as-globe structure.

Perception Modeling

Furthermore, we must ensure that we can simulate the accumulation of knowledge and its questionable nature: how are we to model perceptions of knowledge, e.g. “I suspect that you ate my cookies”?

In this scenario, Person 1 is fairly certain Person 2 stole their cookies, but Person 2 has not yet even realized that Person 1’s cookies are missing. Person 2 also does not know how Person 1 obtained the cookies.
For this, we must allow even behaviors themselves to be abstracted into Narrative Entities that can be known, suspected, or unknown. Without this recursiveness, without the ability to form interactions between Characters and knowledge of interactions, you cannot replicate more complicated scenarios such as…

  • A actually knows a secret S1.
  • B hopes to know S2.
  • A suspects B wants to know S1 and therefore attempts to hide their knowledge of S1 from B.
  • B has reason to believe that A knows S2, so B pays more attention to A, but tries to avoid revealing this suspicion to A.
  • A has noticed B’s abnormal attention directed at him/her, so A surreptitiously engages in a behavior X to help hide the “way” of learning about S1.
  • C witnesses X and tells B about it, so B is now more confident that A knows about S2.
  • (We don’t even necessarily know if S1 and S2 are the same secret).
  • etc.

With this quick example, you can see how perceptions need to be able to have various degrees of confidence in behaviors (actions and state changes) to help inform the mentalities of Agents.
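One hedged way to model these recursive, confidence-weighted perceptions is a belief wrapper that can hold any proposition, including another belief (the class name, fields, and the additive confidence update are all illustrative assumptions):

```python
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class Belief:
    holder: str        # which Character holds this perception
    proposition: Any   # a fact, a Behavior, or another Belief
    confidence: float  # 0.0 (no idea) .. 1.0 (certain)

secret = "location_of_S1"
a_knows = Belief("A", secret, 1.0)           # A actually knows S1
b_suspects = Belief("B", a_knows, 0.6)       # B suspects A knows it
a_suspects_b = Belief("A", b_suspects, 0.7)  # A suspects B's suspicion

# After C reports witnessing A's suspicious behavior X, B's
# confidence that A knows the secret rises.
b_updated = Belief("B", a_knows, min(1.0, b_suspects.confidence + 0.2))
print(round(b_updated.confidence, 1))  # 0.8
```

Because a `Belief` can wrap another `Belief`, arbitrarily deep chains like “A suspects that B suspects that A knows S1” fall out of the structure for free.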

Narrative Scripting Integration

As far as codifying these Entities and Behaviors goes, that is where narrative scripting comes into play. Every Entity, every Behavior, and therefore every Relationship is described solely in terms of narrative scripting statements. This prevents situations where the technology must do an intermediate translation into another language when interpreting scripted content into logical meaning.

So, for example, Cookies might have the following abstract description:

Cookies…

  1. are a bread-based production.
  2. have a flavor: (usually) sweet.
  3. have a shape: (usually) small, circular, and nearly flat.
  4. have a source material: a bread-based semi-solid.
  5. have a creation method: (usually) heated in a box-heat-outside (i.e. “oven”, distinct from the box-heat-inside, i.e. “microwave”).

These properties are defined in order of priority such that if something were to refer to a bread-based treat that is small and circular, the computer would have higher confidence that the statement refers to the entity sharing the other qualities (“sweet”, “made in an oven”, “nearly flat”, etc.) than to another entity described as having a different shape or a different taste.


With innumerable globes of interactions and perception lines linking everything together, a fully-rendered model might look something like this:

Some of you may recognize this from the article on “Modeling Human Behavior and Awareness”
This concept has really grown out of a pre-existing theory on how to model these same kinds of behaviors that I developed. No doubt it will receive revisions as an actual implementation is underway, but before we get to that, we’ll have to dive once more into the field of linguistics.

The great break in content between articles here is because I’ve been hard at work on developing my own constructed language that is quite distinct from Toki Pona/Sona. To hear about the reasons why, and what form this new language will take, please look forward to the next article.

As always, comments and criticisms are welcome in the comments below. Cheers!

Next Article: Evolution of Toki Sona to “tokawaje”
Previous Article: Dramatica and Narrative AI

Minecraftian Narrative: Part 5



Today’s the day! We’ve gotten an idea of what form Toki Sona-based narrative scripting will take, and we’ve examined some of the concerns regarding its integration and maintenance with code. Now we’re finally going to dive into my favorite part: theorizing the behavior of classes that would actually use Toki Sona and react.

The most brilliant works of media, in my opinion, are those which exhibit the Grand Argument Story. These stories have an overarching narrative with a particular argument embedded within, advanced throughout the experience by the main character and those he or she meets as they personify competing, adjacent, or parallel ways of thinking.

But how are we to teach a computer the narrative and character relationships as they appear to us? Thankfully, a well-fleshed out narrative framework already exists to help us as we figure it out. Its name is Dramatica, and from it, we shall design the computer types responsible for managing a dynamic narrative: the Character, Agent, and StoryMind.

Brief Dramatica Overview

The Dramatica Theory of Story is a framework for identifying the functional components of a narrative. In its 350-page introductory book, which is available for free on their website (the advanced book can be found there too), it defines a set of story concepts that must exist within a Grand Argument Story in order for it to be fully fleshed out. If anything is missing, then the story will be lacking. To be honest, the level of detail it gets into is rather jaw-dropping as a writer. Its creators even had to create a software application just to help writers manage the information from the framework! How detailed is it? Check this out…

Dramatica defines four categories: Character, Plot, Theme, and Genre.

It also defines 4 “Throughlines” which are perspectives on the Themes.

  • Overall Story (OS) = The story summarized as everyone experiences it. A dispassionate, objective view.
  • Main Character Story (MC) = The story as the main character experiences it. The character we relate to, experiencing inside-out.
  • Influence Character Story (IC) = The story as the influential character experiences it. The character we sympathize/empathize with, experiencing from the outside-in.
  • Relationship Story (SS) = The story viewed as the interactions between the MC & IC. An extremely passionate, subjective view.

Within Theme, there are 4 “Classes” that have several subdivisions within them.

  • Universe: External/State => A Situation
  • Physics: External/Activity => An Activity
  • Psychology: Internal/Activity => A Manner of Thinking
  • Mind: Internal/State => A State of Mind

One Throughline is matched to each of the Classes so that, for example, the MC is mainly concerned about dealing with a state of mind, the IC is trying to avoid a situation related to his/her past, the community at large is freaking out about the ongoing activity of preparing for and running a local tournament, and there is an ongoing difference in methodologies between the MC and IC that draws tension between them.

Each Class can be broken down into 64 elements. Highlighted: Universe.Future.Choice.Temptation Element.

For each Class, you select 1 Variation of a Concern per story. The 4 Plot Acts (traditionally exposition, rising action, falling action, and denouement) each then shift between the 4-Element quad within the chosen Variation. Since Variations each have a diagonal opposite, diagonal movements (a “slide”) don’t change the topic Variation as intensely as shifting Variations horizontally or vertically (a “bump”).


For example, the Universe.Future.Choice Variation has two opposing Elements, “Temptation” and “Self-Control”, plus another opposing pair, “Logic” and “Feeling”. Notice these are two distinct, albeit related, spectra of the human experience that come into play when making decisions about the future regarding an external situation that must be dealt with. Shifting topics from Temptation to Self-Control wouldn’t be as big of a change as going to Logic or Feeling, since the former deals with the same conflicting pair of Elements.

Each of those Elements can be organized with the Acts in a number of permutations. 3 patterns arise, each of which has 4 orientations and can be run forwards or backwards (2). That gives 24 possible permutations for each Variation. 16 Variations per Class, 4 Classes per story, and then times 4 again since each of the 4 Throughlines can be paired with a Class. Altogether, that comes out to 6,144 possible Plot-Theme permutations.

The Theme Classes are also matched up with Genre categories which can help the engine identify what sort of content to create at a given point of the story (doesn’t increase multiplicity).

The merging of Plot with Genre

On top of that, there are the Characters to consider. There are 8 general Archetypes, each of them composed by combining a Decision Characteristic and an Action Characteristic for each of 4 aspects of character: their reason for toiling, the way they do things, how they determine success, and what they are ultimately trying to do.


You can make any character by combining 2 Characteristics from 2 unopposed Archetypes. So, 7! = 5,040 permutations of any given characteristic within an aspect (never matching a characteristic with its opposite). 5,040 * 4 aspects * 2 characteristics = 40,320 permutations of Characters, optimally.

Finally, there’s the number of Themes that can be delivered by the external/internal successes and failures of the MC…

4 Possibilities

…and whether the MC and IC remained steadfast in their Class or changed (e.g. did they stay/change their state of mind?) and the success/failure thereof.


That makes 4 * 4 possible endings: 16.

PHEW! Okay, now, altogether that’s 16 endings * 40,320 characters * 6,144 plots…

Carry the 3…there we go:

3.96 billion. Stories.

And that’s without even “skinning” them as pirate, sci-fi, fantasy, take your pick.

Needless to say, these kinds of possibilities are EXACTLY the sort of variation we should be looking for in procedural narrative generation. Even if you knocked out the Informational genre in the interest of counting only non-edutainment games, that still leaves about 2.97 billion possibilities. Good odds, I say.

Also, keep in mind, any given video game will often have several sub-stories within the overarching story, ones where minor characters have their own stories to explore and see themselves as the Main Character and Protagonist of their own conflict. In these stories, you, the original main character, may play the role of Influence Character (think Mass Effect 2 loyalty missions if you’ve ever played that: every character’s unique storyline is critically affected by the decisions you make while accompanying them for a personal, yet vital journey). Assuming any given story has, say, 9 essential characters (a pretty small number by procedural generation standards, but pretty normal for children’s books), any single gameplay experience may involve 26.73 billion story arrangements.
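For the curious, the counts above can be reproduced with a few lines of arithmetic, following the article’s own counting assumptions (e.g. a flat factor of 4 for pairing Throughlines with Classes, rather than all 4! orderings):

```python
# Reproducing the article's Dramatica combinatorics.
act_orderings = 3 * 4 * 2            # patterns * orientations * directions = 24
plots = act_orderings * 16 * 4 * 4   # * Variations * Classes * Throughline pairing
characters = 5040 * 4 * 2            # 7! * 4 aspects * 2 characteristics
endings = 4 * 4                      # outcome possibilities * steadfast/change results

total = endings * characters * plots
print(plots)           # 6144
print(characters)      # 40320
print(total)           # 3963617280, i.e. ~3.96 billion stories
print(total * 3 // 4)  # ~2.97 billion with the Informational genre removed
```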

It isn’t just Dramatica’s variability that makes it so appealing though. Each of these details is designed to be clearly identified and catalogued. This has two important consequences. The first is that the engine will know what goes into making a good story and will therefore know how to create a good story structure from scratch. The second, and far more important to us, is that the engine will know when and how any of these qualities are not present or properly aligned. It will therefore understand what has happened to the story when the player changes things and how to fix them. Even better, because of its understanding of related story structures, it will even be able to adapt with completely new story forms should it wish to.

Head hurting yet? Fantastic! Let’s dig into characters as computer entities.

Characters & Agents

While Dramatica gives us the functional role of Characters, it doesn’t really flesh them out properly. Unfortunately, writers don’t really maintain a consolidated list of brainstorming material, but you can find several odds and ends around the Internet (list of character needs, list of unique qualities for realistic characters, and a character background sheet, for example). Any and all of these can be used to help flesh out and define the particular aspects of our Characters, beyond just their functional role.

The main interest we have with these brainstorming materials is to define a set of fields that an AI can connect Toki Sona inputs to. Given some Toki Sona instruction A, a definition of Character B, and a certain Context C, what course of action D should I take? Answering this question is the job of the Agent.

What exactly does an Agent entail? An Agent is the single object that represents the computer logic for the entirety of an assigned Character. In our case, we’re going to define a Character as ANY Narrative Entity that has (or could resume having) a will of its own. A Narrative Entity would simply be anything that requires a history of interactions with it to be recorded, such as a Life, an Object, a Place, or a piece of Lore.


Notice that characters don’t have to be living beings specifically. For example, an enchanted swamp may have an intelligence living amongst the trees. It would most certainly be a Character; however, it would also definitely be a Place that people can enter, exit, and reside in. As a swamp entity would be the embodiment of the land, the plants, and the animals within, one could also extend its attributes to Life as well. As a result, we’d have the swamp Agent that accesses the Character which in turn maintains properties of both the Life and Place for the swamp Narrative Entity.

Sample low-effort UML Class Diagram for the Agent Subsystem (made with UMLet)

In the diagram above, we specify that a single Agent is responsible for something called a Vessel rather than for a Character directly. What’s more, the Vessel can “wear” several Characters! What is the meaning of this?

Let’s say we wished to create a Jekyll & Hyde story. Although Jekyll and Hyde have different personalities, they also share a body. Whatever one is doing, wherever one is, the other will also be doing the moment they switch identities. This relates back to assets too. Whatever one sprite/model animation will be doing, the other will also be doing when those assets are switched to another set. In this way, Characters and Vessels are fully changeable without affecting the other. A multiple personality character might change Characters while not changing Vessels. A shapeshifting character might change Vessels while not changing Characters. In the case of Jekyll and Hyde, it would be a swap of both Character and Vessel, as their personalities and appearances are BOTH different, but the pair will always be tied to the same location and activity at the time of switching.


So, the Agent is just an AI that doesn’t care what it’s controlling or to what ends. It looks to the Character to figure out what it narratively should and can do, and it issues instructions based on that to the Vessel. It doesn’t care whether the Vessel knows how to do it. It simply assumes the Vessel will know what the instructions mean. In the process, we’ve divorced the concept of a Character 1) from the in-story and in-engine thing that they are embodied as and 2) from the logic that figures out what a given Character should do given a set of Toki Sona inputs from the interpreter.
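A rough sketch of this Agent/Vessel/Character split might look like the following (class names follow the diagram above; the method bodies and the personality-keyed decision rule are placeholder assumptions):

```python
class Character:
    """Who the entity is: personality, knowledge, goals."""
    def __init__(self, name, personality):
        self.name = name
        self.personality = personality

class Vessel:
    """The in-engine embodiment; can 'wear' several Characters."""
    def __init__(self, characters):
        self.characters = characters
        self.active = characters[0]

    def switch(self, name):
        # Multiple-personality swap: the Character changes while the
        # Vessel persists, so location and activity carry over.
        self.active = next(c for c in self.characters if c.name == name)

class Agent:
    """AI that consults the active Character and instructs the Vessel,
    without caring what the Vessel is or how it executes instructions."""
    def __init__(self, vessel):
        self.vessel = vessel

    def decide(self, instruction):
        # Placeholder decision rule keyed off the active personality.
        who = self.vessel.active
        return f"{who.name} responds to '{instruction}' ({who.personality})"

body = Vessel([Character("Jekyll", "gentle"), Character("Hyde", "violent")])
agent = Agent(body)
print(agent.decide("greet"))  # Jekyll responds to 'greet' (gentle)
body.switch("Hyde")
print(agent.decide("greet"))  # Hyde responds to 'greet' (violent)
```

Note how the Agent never touches assets or personalities directly; swapping either the active Character or the whole Vessel leaves the Agent untouched, which is exactly the decoupling the Jekyll & Hyde example calls for.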

The last important thing to note about the Characters and Agents here is that the Agents are informed, context-wise, by their associated Characters. As such, an Agent’s decisions are constrained by its Vessel’s current Character; only that Character’s acquired knowledge, background of experience and skills, and personality will inform behavior. An Agent will therefore factor into its decision-making the Character’s history of perceptions, likes and dislikes, attitude, goals, and everything else that constitutes the Character. It then translates incoming Toki Sona instructions into gameplay behavior. For example, what might a Character do if asked, “What do you know about the aliens?”


Maybe they don’t know much about the aliens. Or maybe they do, but it’s in their best interest to only reveal X information and not Y. But maybe they also really suck at lying, so you can see through it anyway. How will they know exactly what to say? How will they say it? Does the personality invite a curt, direct response, or do they swathe the invading aliens with adoration and delight in a giddy, I’m-too-obsessed-with-science kinda way?

The StoryMind

Finally, we address the overall story controls: the StoryMind. In Dramatica, the StoryMind is the fully formed mental argument and thought-process that the story communicates. In our context, the StoryMind is the computer type responsible for delivering the Dramatica framework’s StoryMind. It understands the possible story structures and makes decisions regarding whether the story can reasonably deliver the same themes with the existing Characters and Plot or whether it will need to adjust.

The StoryMind will have full and total control over that which has yet to be exposed to a human player within the story. Its job is to generate and edit content to deliver a Grand Argument Story of some kind to the player. What might this look like?

Story time:


Typical Fantasy RPG world/game. You’re a strength-and-dexterity-focused mercenary and you’ve developed a bit of a reputation for taking on groups of enemies solo and winning with vicious direct onslaughts. You’re walking through town and come across a flyer about a duke’s kidnapped heir (one of a few pre-generated premises made by the StoryMind). You ask a barkeep about it (and it alone), so the StoryMind begins to suspect that you may be interested in pursuing this storyline further (rather than whatever other premises it had prepared for you). It therefore begins to develop more content for this premise, inferring that it will need that story information soon. In fact, it takes the initiative.


You are blocked in the road by a woman named Steph who overheard you outside the bar and wishes to accompany you on your journey to rescue the heir. She says that she’s a sorceress with some dodgy business concerning the duke and she needs a bargaining chip. Let’s say you respond with “Sure. I only want the duke’s money,” (in Toki Sona of course). All of a sudden, the StoryMind knows a couple of things:

  1. You care more about the reward money than pleasing the duke.
  2. Because you have already invited risk into your relationship with the yet-to-be-met, quest-giving duke, you are even more likely to behave negatively towards this particular duke and his associates in the future. You also might have a natural bias against those of a higher social status (something it will test later perhaps).
  3. You have some level of trust towards Steph, though it’s not defined.
  4. You are not a definitive loner. You accepted a partnership, despite your past as a solo mercenary. But how deep does this willingness extend? It’s possible it might be worth testing this too.

Since you may have related goals, the StoryMind sets her up as the Influence Character. It randomly decides to attempt a “friendly rivals / romance?” relationship (partnership of convenience), modifying her Character properties behind the scenes so that she is similar to you (based on your actions and speech).


Along the way, a group of goblins ambush and surround you both, so you dash in to slaughter the beasts. The StoryMind may have been designing Steph to support you, but unbeknownst to you, in the interest of generating conflict, it changes some of Steph’s settings! Steph yells for you to stop, but you ignore her and slash through one of them to make an opening. In response, Steph sparks a blinding light, grabs your hand, and runs away in the ensuing chaos.


As soon as you’re clear, she starts yelling at you, asking why you wouldn’t wait. After you get her to calm down a bit and explain things, she confides that she is hemophobic and can’t stand to see, smell, or be anywhere near blood. She’d prefer to stealthily knock out, sneak past, trick, or bloodlessly maim those who stand in her way. How will you react? Astonishment? Scorn? Sympathy? Is this a deal breaker for your temporary partnership? Remember, she’s always paying attention to you, and so is the StoryMind. This difference in desired methodologies is but a small part of the narrative the StoryMind is crafting.

  • Throughline Type: Class.Concern.Variation.Element, Act I
    • Description
    • Genre
  • Overall Story Throughline: Physics.Obtaining.SelfInterest.Pursuit
    • A dukedom heir has been kidnapped.
    • Entertainment Through Thrills: Pursuit of an endangered royalty.
  • Influence Character Throughline: Mind.Preconscious.Worry.Result
    • Steph is worried about how to deal with her hemophobia. (The StoryMind generates this shortly afterward:) She can’t find work beast-slaying or healing because of it, and is now low on money. The duke is evicting her, despite her frequent requests for bloodless work as payment. Everything’s so stressful, and it’s all because of that stupid blood!
    • Comedy of Manners: the almighty sorceress, the bane of beasts and harbinger of health, brought down by the mere sight of blood.
  • Relationship Throughline:
    • You’d prefer to hack away at enemies, but she can’t stand blood and prefers alternative approaches to removing obstacles. As such, you each have different manners of thinking about how obstacles should be dealt with.
    • Growth Drama


  • Main Character Throughline: Universe.?
    • If nothing interrupts the progress of the other 3, the StoryMind is fit to throw external state-related problems your way, and these problems will necessarily dig into deeper, thematic issues. For example…
    •  Universe.Progress.Threat.Hunch
      • You eventually learn that those who took the heir have loose ties to the duke himself. Since you’re in pursuit to rescue him/her, you have a hunch that you may be a target soon as well. You need to unearth the mystery surrounding this. Does this impact your ability to trust the various characters you come across?
      • Entertainment Through Atmosphere: You’re experiencing a fantasy world!



And to think, if you’d just said, “No, thanks. I’m more of a loner,” at the beginning, Steph might instead have developed as a hindering rival Influence Character who tries to steal the heir for herself, popping in and out of the story when you least expect it! Does she even know about the duke’s possible relation to the kidnapping? Too bad we’ll never find out. After all, you didn’t say that. The characters and experiences in this world are real and permanent. You live with your choices, build relationships, and engage with a game world that truly listens to you, more intimately than any other game has before.


I have found that Dramatica is an excellent starting point from which to build story structures and inform our StoryMind narrative AI of how to craft and manipulate storylines and characters. I hope you too are interested in the potential of this sort of system so that one day we might see it in action.

Also, it’s entirely possible I might have slightly messed up some calculations concerning the Dramatica system as the book doesn’t do a great job of clearly defining the relationships in one place (it’s sort of scattered about in the chapters). As far as I can tell, I’ve got them right, but I don’t terribly favor my math skills. I’d be happy to correct any mistakes someone notices.

In the future, expect to find an article diving into the hypothetical technical representation of Agents: their relationships, perceptions, and decision-making. Again, I’d love to hear from you below with comments, criticisms, and/or questions. Cheers!

Next Article: Relationship and Perception Modeling
Previous Article: Toki Sona Implementation Quandries

Minecraftian Narrative: Part 4



At this point, I’ve communicated the basics of the Toki Sona language (a “story-focused” Toki Pona), its potential for simply communicating narrative concepts, and the types of interfaces and games that could exploit such a language.

This time, we’ll be diving into some of the nuts and bolts of how Toki Sona might actually be interpreted and how it might tie into code. An intriguing array of questions comes into play due to Toki Sona’s highly interpretive semantics. The end result is a sort of exaggerated problem domain taken from Natural Language Processing. How much information should we infer from what we are given? How do we handle vague interpretations in code? And what do we do when the language itself changes through usage over time? Let’s start thinking…

Variant Details In Interpretation

What we ultimately want in a narrative engine is a computer system that can dynamically generate the same content a human author would be able to create. To accomplish this, we must leverage our main tool: reducing the complexity of the language to such an extent that the computer doesn’t have to contend with the linguistic nuances and artistic value an author can imbue in their own work. Managing the degree to which we include these nuances, though, requires a careful balancing act.

For example, “It was a dark and stormy night…” draws into your mind many images beyond simply the setting. It evokes memories filled with emotions which an author may use to great effect in their manipulation of the audience’s emotional experience. Toki Sona’s focus on vague interpretation leaves many different ways of conveying the same concept, depending on one’s intent. Here are some English literal translations:

  • Version A: “When a black time of monstrous/fearful energy existed…”
    • tenpo-pimeja pi wawa-monsuta lon la, …
  • Version B: “This is the going time: The time is the black time. The air water travels below. As light of huge sound cuts the air above…”
    • ni li tenpo-kama: tenpo li tenpo-pimeja. telo-kon li tawa anpa. suno pi kalama-suli li kipisi e kon-sewi la, …

You’ll notice that version A jumps directly into communicating the tone that the audience should understand. As a result, it is far less particular in setting the scene’s physical characteristics about the weather.

Version B on the other hand takes the time to establish scene details with particulars (as specific as it can get, anyway). Although it takes several more statements to present the idea, it eventually equates itself loosely with the original English phrase. In this way, it manages to conjure emotions in the audience through imagery the same way the original does, but you can also tell that the impact isn’t quite as nuanced.


One of the key aspects of Toki Sona is that it is unable to include two independent phrases in a single statement. It is also unable to include anything beyond a single, adverbial dependent clause in addition to the core independent clause. These restrictions help ensure that each individual statement has a clear effect on interpretation. Only one core set of subjects and one core set of verbs may be present. Everything else is simply details for the singularly described content. As a result, a computer should be able to extract these singular concepts from Toki Sona more easily than it would a more complex language.

So while both database queries and statistical probability calculations are factors in interpreting the text, the algorithms will rely more on the probabilities due to the diminished size of the database contents (there are not that many vocabulary terms to track). This reliance also stems from the fact that words frequently have several divergent meanings that are relevant to a given context. As such, algorithms will often need to re-identify meanings after the fact, once successive statements have been interpreted.

Our difficulty comes in when we must identify how interpreted statements are to be translated into understood data. Version B is far more explicit about how things are to be added, while version A relies far more heavily on the interpreter to sort things out. How many narrative elements should the interpreter assume based on the statistical chances of their relevance? The more questionable elements are added, the more items we’ll need to revisit for every subsequent statement. After all, future statements could add information that grants us new insight into the meaning of already stated terms.

To illustrate this, let’s break down how the interpreter might compose a scene based on these statements into pseudocode, starting with version B. We’ll leave English literal translations in and identify them as if they were Toki Sona terms.


Version B
contextFrames[cxt_index = 0] = cxt = new Context(); //establish 1st context

"This is the going time:" =>
contextFrames[++cxt_index] = new Context(); //':' signifies new context
cxt = contextFrames[cxt_index]; //future ideas added to new context
cxt += Timeline(Past); //Add the "time that has gone" to the context

"The time is the black time." =>
cxt += TimeOfDay(Night) //Add the "time of darkness" to the context

"The air water travels below." =>
cxt += Audio(Rain) + Visual(Rain) // Add "water of the air" visuals. Audio auto-added.

"As light of huge sound cuts the air above..." =>
cxt += {Object|Visual}(Light+(Sound+Huge)) >> Action(Cut) >> Visual(Sky+Air);
cxt += Mood(Ominous)?
// The scene includes a light that is often associated with loud noises. These lights (an object? A visual? Is it interactive?) are cutting across the "airs in the sky", likely clouds. All together, this combination of elements might imply an ominous mood.

Version A
contextFrames[cxt_index = 0] = cxt = new Context(); //establish 1st context

"When a black time of monstrous/fearful energy existed..." =>
cxt += TimeOfDay(Night)? + Energy(Terrifying)? + Mood(Terrifying) + Mood(Ominous)?
// Establish night time and presence of a terrifying form of energy in the scene. Based on these, establish that the mood is terrifying in some way with the possibility of more negatively toned content to follow soon. Possible that "monstrous energy" may imply a general feel rather than a thing, in which case "black time" may reference an impression of past events as opposed to the time of day.

To emphasize ease of use and make a powerful assistance tool, it’s best to let the interpreter do as much work as possible and then just update previous assumptions as new information is introduced. That way, even if the user inputs a small amount of information, it will feel as if the system is anticipating your meaning and understanding you effectively. To do otherwise would save significantly on processing time, but would result in far too many assumptions being made that don’t account for the full context. This would in turn result in terrible errors in interpretation. Figuring out exactly how the data is organized and how the interpreter will make assumptions will be its own can of worms that I’ll get to some other day.
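As a rough illustration of this revise-as-you-go approach, here is a minimal Python sketch in which each narrative element carries a confidence score that later statements can raise, mirroring the trailing "?" marks in the pseudocode above. The class, statements, element names, and confidence values are all hypothetical stand-ins, not part of any actual Wyrd implementation.

```python
# Hypothetical sketch: provisional interpretations that later
# statements can strengthen, rather than fixed one-shot parses.

class Interpreter:
    def __init__(self):
        # Maps a narrative element to the interpreter's confidence in it.
        self.context = {}

    def add(self, element, confidence):
        # Keep the strongest confidence seen so far for this element.
        prior = self.context.get(element, 0.0)
        self.context[element] = max(prior, confidence)

    def interpret(self, statement):
        if statement == "telo-kon li tawa anpa":  # "air water travels below"
            self.add("Visual(Rain)", 0.9)
            self.add("Audio(Rain)", 0.7)
            self.add("Mood(Gloomy)", 0.4)  # provisional guess
        elif statement == "suno pi kalama-suli li kipisi e kon-sewi":
            self.add("Visual(Lightning)", 0.8)
            # New evidence strengthens the earlier provisional mood.
            self.add("Mood(Ominous)", 0.6)
            self.add("Mood(Gloomy)", 0.7)

interp = Interpreter()
interp.interpret("telo-kon li tawa anpa")
interp.interpret("suno pi kalama-suli li kipisi e kon-sewi")
```

Notice how the second statement retroactively raises the confidence of the mood guessed from the first one, which is exactly the "update previous assumptions" behavior described above.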

Data Representation

An additional concern is to identify the various ways that words will be understood logically as classes or typenames, hereafter “types” (for the non-programmers out there, this would be the organization the computer uses to better identify the relationships and behaviors between terms). Examples in the above pseudocode include TimeOfDay, Visuals and Audio elements, etc. Ideally, each of these definitions would alter the context in which characters exist. It would inform their decision-making and impact the kinds of events that might trigger in the world (if anything like that should exist).

One option would be to create a data structure type for each Toki Sona word (there’d certainly be few enough of them memory-wise, so long as a short-cut script were written to auto-generate the code). Having types represent the terms themselves, however, is quite unreliable as we don’t want to have to alter the application code in response to changes in the language. Furthermore, any given word can occupy several syntactic roles depending on its positioning within a sentence, and each Toki Sona word in a syntactic role comes with a variety of semantic roles based on context.


For example, “kon”, the word for air, occupies a variety of meanings. As a noun, it can mean “air”, “wind”, “breath”, “atmosphere”, and even “aether”, “spirit” or “soul” (literally, “the unseen existence”). These noun meanings are then re-purposed as other parts of speech. The verb to “kon” means to “breathe” or, if being creative, it could mean “to pass by/through as if gas” / “to blow past as if the wind”. To clarify, when one says, “She ‘kon’s” or “She ‘kon’ed”, one is literally saying “she ‘air’ed”, “she ‘wind’-ed”, “she ‘soul’-ed”, etc. The nouns themselves are used AS verbs, which in turn results in language conventions for interpreted meaning. You can therefore understand the interpretive variations involved, and that’s not even moving on to adjectives and adverbs! Through developing conventions, we could figure out that when a person “airs”, its semantic role is usually that the person breathes, sighs, or similar, not that they spirit away or become one with the atmosphere or something (meanings which are far less likely to use “kon” as a verb in the first place – probably an adverb if anything).
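To make that convention-driven disambiguation concrete, here is a tiny Python sketch that picks the most frequent sense of “kon” for a given syntactic role. The frequency table is entirely invented for illustration; a real system would derive it from corpus statistics.

```python
# Hypothetical convention table for "kon": how often each sense is
# observed in a given syntactic role. All frequencies are invented.
KON_SENSES = {
    "noun": {"air": 0.40, "wind": 0.25, "breath": 0.15,
             "atmosphere": 0.10, "spirit": 0.10},
    "verb": {"breathe": 0.70, "sigh": 0.20, "pass through like gas": 0.10},
}

def most_likely_sense(word_senses, role):
    # Pick the conventional (most frequent) sense for the given role.
    senses = word_senses[role]
    return max(senses, key=senses.get)

print(most_likely_sense(KON_SENSES, "verb"))  # breathe
print(most_likely_sense(KON_SENSES, "noun"))  # air
```

In a full interpreter, the surrounding context would re-weight these frequencies before the maximum is taken.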

In the end, a computer needs to understand a definitive behavior that is to occur with a given type name. However, since the nature of this behavior is dictated by the combination of terms involved, we can understand that Toki Sona terms are meant to serve as interpreted inputs to the types. Furthermore, it seems most appropriate for types to serve two purposes: they must indicate the syntactic role the word has in a sentence, and they must indicate the functional role the word has in a context.

In the pseudocode excerpt I came up with, we chose to highlight the latter route, defining described content based on how it impacted the narrative context: is this an Audio or Visual element that will affect perception or is this a detail concerning the setting’s external details such as the TimeOfDay, etc.? In addition to this, we’ll also need to incorporate syntactic analysis to better identify what the described content will actually be (is it a noun, verb, adjective, etc.?). As mentioned before, the way a word is used will greatly affect the type of meaning it has, so the function should be built on the syntax which is in turn built on the vocabulary.
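A minimal sketch of that layering might look like the following Python, where a toy subject/"li"/"e" grammar yields syntactic roles and a lookup then maps (word, role) pairs to functional types. The grammar, example words, and FUNCTIONAL table are simplified assumptions for illustration, not the real language.

```python
# Sketch of the layered analysis described above: functional role is
# derived from syntactic role, which is derived from word position.
# Toy grammar: subject words, "li" before the verb, "e" before objects.

def syntactic_roles(words):
    roles = {}
    state = "subject"
    for w in words:
        if w == "li":
            state = "verb"
        elif w == "e":
            state = "object"
        else:
            roles.setdefault(state, []).append(w)
    return roles

# Hypothetical mapping from (word, syntactic role) to a functional type.
FUNCTIONAL = {("tenpo-pimeja", "subject"): "TimeOfDay",
              ("kalama", "verb"): "Audio"}

def functional_types(roles):
    return [FUNCTIONAL[(w, r)] for r, ws in roles.items()
            for w in ws if (w, r) in FUNCTIONAL]

roles = syntactic_roles(["tenpo-pimeja", "li", "kalama"])
print(functional_types(roles))  # ['TimeOfDay', 'Audio']
```

The key point is the direction of the dependency: vocabulary feeds the syntax pass, and only then does the function pass decide what kind of narrative element to add to the context.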


Language Evolution

In addition, a system that implements this sort of code integration should be built around the assumption that the core vocabulary and semantics will change. As it stands, we already want to give users the power to add their own custom terms to the language for a particular application. These custom terms are always re-defined using a combination of sentences made of core terms and pre-existing custom terms.

However, because the integration of a living, breathing, and spoken language into a code base is a drastic measure, it is vital that the code be designed around the capacity for the core language to change. After all, languages are not unlike living creatures that adapt to environments, evolve to meet their needs, and strive to achieve their goals in the midst of it. In this sense, we can rest assured that players and developers alike will look forward to experimenting with and transforming this technology. This transformation will assuredly extend to the core terms, so not even the language should be tightly bound to them.

Given the lack of assurances regarding the core terms over an extended period of time, it would behoove us to incorporate an external dictionary. It should most likely be pre-baked with statistical semantic associations derived from machine learning NLP algorithms and then fed into runtime calculations that combine with the context to narrow down the interpretation most likely to meet users’ expectations.


In simple terms, Wyrd should be given a massive list of Toki Pona (or Toki Sona, later on as it becomes available) statements periodically, perhaps with a monthly update. It should then scan through them, learn the words, and figure out what they likely mean: How frequently is “kon” used as a noun? What verbs and adjectives is it often paired with? What words is it NEVER associated with? What sorts of emotions have been associated with the various term-pairings and which are most frequent? These statistical inputs will assist the system in determining the functional and syntactic role(s) words possess. Combining this data with the actual surrounding words in context will let the application have a keen understanding of how to use them AND grant it the ability to reload this information when necessary.
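As a toy version of that corpus scan, the following Python tallies how often “kon” appears in noun versus verb position across a tiny invented corpus, along with the verbs it pairs with. The three-sentence corpus and the subject/"li"/verb grammar are assumptions for illustration only.

```python
from collections import Counter, defaultdict

# Toy corpus scan of the kind described above. In this simplified
# grammar, words before "li" form the subject (noun position) and the
# word after "li" is the verb.
corpus = [
    "kon li tawa",   # "the wind moves"      -> kon as noun
    "jan li kon",    # "the person breathes" -> kon as verb
    "kon li lete",   # "the air is cold"     -> kon as noun
]

role_counts = Counter()
pairings = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    li = words.index("li")
    if "kon" in words[:li]:
        role_counts["noun"] += 1
        pairings["noun"][words[li + 1]] += 1  # verb it pairs with
    elif "kon" in words[li + 1:]:
        role_counts["verb"] += 1

print(role_counts["noun"], role_counts["verb"])  # 2 1
```

Scaled up to a real corpus with monthly updates, these counts become the statistical inputs that help the system determine the functional and syntactic roles words possess.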


Wyrd applications should also keep track of all Toki Sona input (if the user has volunteered it) so that it can be used as new machine-learning training material. If people start using a word in a new way and that trend develops, then the engine should respond by learning to adapt to that new usage and incorporating it into characters’ speech and applications’ descriptions. To do this, the centralized library of core terms must be updated by scanning through more recent Toki Sona literature. Ideally, we would pull this from update-electing users, generate new word data, and then broadcast the update to those same Wyrd users.


Well, we’ve explored some of the more in-depth programming difficulties that reside in using Toki Sona. There’ll likely be more updates in the future, but for now, this has all just been a brainstorming and analysis exercise. I apologize to those of you who aren’t especially tech-savvy (I tried to make things a little simpler outside of the pseudocode). From here on out, it’s likely we’ll end up dealing with things that are a bit more technical than the previous fare, but there will also be plenty of high-level discussion, so worry not!

For next time, I’ll be diving into the particulars of Agents, Characters, and the StoryMind: the fundamental tools for manipulating and understanding narrative concepts!

Next Article: Dramatica and Narrative AI
Previous Article: Interface and Gameplay Possibilities

Minecraftian Narrative: Part 3




The last time, we discussed the concept of a narrative scripting language that could revolutionize the way players interact with a game world. We considered the possibility of using Toki Pona, an artificial 120-word language with 10 syntax rules, as a starting point for creating a custom language to be used for scripting purposes. In this article, we’ll be focusing a bit more on the ways in which the language might be used and what form its interface might take.

To begin with, I would like to clarify both the licensing plans I have for this concept as well as what terms are involved:

  • Toki Pona: (“The Simple Language”) The original artificial language. This is already free to be used for any purpose.
  • Toki Sona: (“The Story Language”) The modified language that I will be deriving from Toki Pona (Toki Pona’s word for knowledge/story is “sona”). This too will be free to use for any purpose (as it should be). Free C++, C#, and JavaScript APIs would probably be made available for engine/application integration.
  • Wyrd: a paid-for plugin for various popular engines that will include an AI system for interpreting and responding to Toki Sona dynamically for spontaneous dialogue, game events, and character behavior.

Now that we’ve defined things, I’ll be exploring how the heck Wyrd might show up in a game!

Interface Possibilities: Suggested GUI Input

Imagine you’re playing a game and you are given the chance to say something to another character. Rather than being given a succinct list of possible responses, you could simply be given a Minecraft-crafting style of word-composition system.

  1. A character is encountered in the world.
  2. An input field and a word bank are made available to the user (players could summon it at will).
  3. Players can click on an image from the word bank. Think of this as picking a “block” in Minecraft.
  4. As players select statements, the system visually hints at how things are being interpreted, for example: {Myself} {Want} {Go}. These would be the things that are “crafted” from putting terms together.
  5. This hinting informs players of what concepts are ACTUALLY being communicated. For example: tomo a.k.a. {enclosed space} => “home” or “house”
  6. When players need to combine concepts together, for example: {enclosed space} and {water}, it can show them that it is understood as “bathroom”
  7. Players could then check what other interpretations are available for that combination (perhaps by clicking on it).
  8. If they wished to communicate the desire to bathe/shower instead, they could select that option.
Obviously, it would be the responsibility of the Toki Sona engine to ensure there is a standardized image available for all of the desirable concepts, but limits will be necessary.

Another possibility, which may be more realistic, is to have the hinted images generated based on the full content of the statements made. For example, the {enclosed space} {water} combo may be assumed to be “bathroom”, but when the player follows that statement up with {myself} {feel} {dirty} (mi pilin jaki), the bathroom image might be modified to one of bathing after the fact. In this way, users wouldn’t be responsible for the interpretation (it can all be automated), which keeps the scope of mapping images to concepts from creeping out of control. It also reduces how much the player has to do in order to fully interact with the system, and users would be able to see how their statements impact the interpretation of previous statements.
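A sketch of this after-the-fact re-ranking might look like the following Python, where a follow-up statement reorders the candidate readings of the {enclosed space} {water} compound. The reading table and context keys are hypothetical illustrations, not a real Wyrd data set.

```python
# Hypothetical hint table for compound concepts; readings are ordered
# from most to least likely, as the GUI hint would present them.
COMPOUND_READINGS = {
    ("enclosed space", "water"): ["bathroom", "bathing", "water tank"],
}

def hint(compound, context=()):
    readings = list(COMPOUND_READINGS[compound])
    # A follow-up like "mi pilin jaki" ("I feel dirty") re-ranks the
    # readings after the fact, as described above.
    if "feel dirty" in context:
        readings.sort(key=lambda r: r != "bathing")  # bathing first
    return readings[0]

print(hint(("enclosed space", "water")))                   # bathroom
print(hint(("enclosed space", "water"), ("feel dirty",)))  # bathing
```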

Interface Possibilities: Suggested Text Input

The text concept is very much like that of the GUI input, however all that would be displayed to the user instead is a text field. Typing into that text field would display a filtered subset of the word bank below the typed text. Things typed would be assumed to be Toki Sona words (for example, “tomo”, meaning “enclosed space”, i.e. “home”, “house”, “construction”). Players would be able to hit [Tab] to move down the hint list and hit [Enter] to auto-complete the selected word and have the image and hinted interpretation pop up. This would allow for MUCH more fluid communication once a player has pieced together the actual vocabulary of the language (you’d eventually get to the point where you wouldn’t even want/need the suggested text).
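The suggested-text filtering could be as simple as a prefix match over the word bank, as in this Python sketch (the word bank and glosses here are a tiny invented subset of the vocabulary):

```python
# Minimal prefix filter for the suggested-text input field.
WORD_BANK = {"tomo": "enclosed space", "toki": "language/talk",
             "telo": "water", "tawa": "go/movement"}

def suggestions(typed, bank=WORD_BANK):
    # Words matching the typed prefix, sorted for Tab-cycling.
    return sorted(w for w in bank if w.startswith(typed))

print(suggestions("to"))  # ['toki', 'tomo']
```

A real implementation would also rank the matches by likelihood given the grammar slot being filled, rather than sorting alphabetically.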

We would also likely need both input types to have expected grammar displayed, i.e. having a big N underneath the beginning to show you need a noun for a subject. If you have a noun typed, it might suggest a compound noun or a verb, etc. All versions would also auto-edit what you have typed so that it is grammatically correct in Toki Sona (things like, [auto-inserting forgotten grammar particle here], etc.).

Gameplay Possibilities

Creation: One could easily envision a game where the player is capable of supplying WHAT they want to make in narrative terms and then clicking on the environment to place that thing. One could then edit the behavior of anything placed in the scene by selecting it, etc.

Simulation/Visual Novel: Something more like the Sims where the player is given a character and must direct them to do and say things to proceed. What they do and say, and to whom / where they do it may trigger changes in the other characters, the story, the environment, etc. This could naturally progress things.

Puzzle: A game where the player is given a certain number of resources (limited points to spend on creation, a limited vocabulary, etc.) and must solve a problem. This would be something more in the vein of Don’t Starve or Scribblenauts.

Generic RPG: A regular RPG game that allows the player to pop-up a custom dialogue window for speaking purposes, but which would otherwise not rely on directing player controls through the Wyrd system.

Roleplay-Simulation: A game that directly attempts to simulate the experience of a live role-playing game. The system acts as a Game/Dungeon Master and various players can connect to the game to participate in the journey together. A top-down grid environment may show up during enemy encounters of some kind, but players would completely interact with and understand the environment based on the text/graphic output of the system. More like an old school text adventure, but hyped up to the next level.


These are just some of the ideas I’ve had for interface and gameplay using the Wyrd system. Obviously this system still needs a lot of work, but I feel there is clearly a niche market that would long for experiences like this. If you have any comments or suggestions please let me know!

I know this article was a little bit shorter / lighter than my usual fare, but I promise I will develop some more detailed content for you next time. Cheers!

Next Article: Toki Sona Implementation Quandries
Previous Article: Is “Toki Pona” Suitable for Narrative Scripting?

Minecraftian Narrative: Part 2



In the previous article, I explored the necessary elements of a “Minecraftian” game mechanic: one tailored for accessible and steady skill development, one that is equal parts editable and adaptable, visual and simple, granular and tabular.

I then addressed many issues with leveraging common languages to describe abstract concepts in this kind of mechanic. They are frequently hard to master. The Latin-based ones focus more on sounds than they do on meanings. Their complexity demands excessive processing from computer algorithms, which is impractical for any imminent use at the scale we intend. Relying on an existing language saves learning time, but only for a subset of the intended audience; for others, it is an ostracizing element that comes with the expectation of translating into other existing languages to provide the same privileges to alternative audiences. It would also bias any software built with it against younger players with underdeveloped language skills.

Because of these considerations, we began to consider the language Toki Pona as a possible tool to adapt for narrative scripting. What are the advantages of this Simple Language? Are there any problems with it? Let’s dive in and find out.


Ideal Narrative Scripting

Let’s first review what exactly we mean by “narrative scripting”. What sorts of tasks are we actually wanting to perform with this language? We’ve already established many of the characteristics we are looking for from our Minecraft analysis, and while Toki Pona meets many of these criteria, we must also consider the actual usage environment of our target language before we can significantly evaluate the utility of Toki Pona.


Before continuing, I would also like to point out that this sort of narrative scripting is entirely distinct from the “narrative scripting” language known as Ink. Scripting languages in general are just languages that are more user-friendly and provide a more intuitive, simple interface for computer-tasks that would otherwise be fairly complex. With Ink, the goal is to inform the computer of the relationships between lines of dialogue in branching story lines. In our case, the goal is to inform the computer of the narrative concepts associated with game world objects, actions, and places so that it can 1) interpret meaning based on those associations and 2) trigger events that can be leveraged by AI characters, world controls, and human players/modders to create behavior and change the game world. We want to put this kind of control in the hands of players.

Skyrim-creator Bethesda’s “creation kit” modding tool has its own scripting language as well, to edit objects/events in the game world. It is a bit technical though.

If we truly had a narrative scripting language, then we would be able to craft, with as little vocabulary and syntactic structure as possible, a description of any narrative content. More specifically, we should be able to describe with some measure of accuracy…

  • places’ geography, geometry (both its form and absolute and relative locations), and thematic atmosphere.
  • objects’ nomenclature, physical and functional characteristics, relative purpose, effects, history, ownership, and value.
  • humans’ (and, as a categorical subset, animals’ and human-like creatures’) nomenclature, physical and emotional characteristics, relationships, state of mind, responsibilities, history and scars, beliefs, tastes, hopes and dreams, fears, biases, allegiances, knowledge, awareness, senses and observations, skills and powers.
  • concepts’ and ideologies’ subject domain, relationships, and significant details.

These qualities will allow the user to competently describe an environment and the items, creatures, and people in it. In addition, for accessibility and then functional purposes, we need the language to be useful for the following tasks: 1) scenario descriptions, 2) dialogue, and 3) computer processing. The attributes above cover the first case.

Ideally, dialogue “options” wouldn’t exist, and we would be able to directly input a writing system that the non-player characters would be able to understand without some preset arrangement of scripted responses.

As for dialogue, that means it must also be able to model questions, interjections, quotations, prepositions, nouns, adjectives, and adverbs (common syntactic structures). It must accommodate the linguistic relationships between terms and their relative priority, e.g. is X AND/OR’d with Y, are these words a noun/adjective/adverb, are they subject/object, which adjectives describe the noun more clearly, etc. We must also have some means of singling out identifiers, i.e. terms that refer to things that are not themselves a part of the language (a player’s name, for example).

Finally, we must also ensure that the language’s structural simplicity is reinforced so that its consequent processing is more easily conducted by computer algorithms. This primarily involves restricting the number of syntactic rules and the size of the lexicon, but it also includes more subtle nuances like the number of interpreted meanings for a given set of words and the number of compound words.

To maximize the utility of the language itself, the “root” words that are individually present in the language must have the following characteristics:

  • The words must be, as much as possible, “perpenyms” of one another; that is, they must be perpendicular in meaning to their counterparts, neither synonyms – for obvious reasons – nor antonyms, to prevent you from simply saying, “not [the word]” to get another word in the language.
  • The words must have a high number of literal or metaphorical interpretive meanings laced within them to ensure a maximum number of functionally usable words per each learned word. Keep in mind, these interpretations must also be strongly linked by theme so that the words’ definitions will be easy to remember. If possible however, these multiple meanings should be individually interpretive based on context, so that one can assume in any given context one interpreted meaning vs. another somewhat clearly. The more this is supported, the less work the computer will be doing.
    • For example, in Toki Pona, the word “tawa” can mean “to go” as a verb, “mobile/moving/traveling” as an adjective, “to, until, towards, in order to” as a preposition (notice how each of those is generally applied to a different kind of preposition object, which helps pick out which one is being used), or even “journey/transportation/experience”, literally “the going”, as a noun. Each usage can be easily identified based on the positioning of the word in the sentence; each runs along a common theme, but each also has a unique connotation that can be interpreted rather well in context.
  • The words must be highly reactive with their fellow terms so that a high number of reasonable compound words can be made. Again, the goal is to maximize the number of unique meanings we can derive from the minimum set of words to learn. This will increase the number of vocabulary terms, but in a less defined way as the minimalist nature of the language will make it favor interpretation of clear-cut meanings anyway. Therefore, alternative constructions for the same compound word concept won’t be too big of a deal.
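To illustrate how position alone can pin down a word’s syntactic role, here is a toy Python classifier for “tawa” based on the patterns noted above. The positional rules are a drastically simplified approximation of Toki Pona’s actual grammar, intended only to show the principle.

```python
# Toy positional classifier for "tawa": its part of speech can be read
# off its position in the sentence (simplified rules, for illustration).
def classify_tawa(words):
    i = words.index("tawa")
    if i > 0 and words[i - 1] == "li":
        return "verb"          # "ona li tawa" -> "she goes"
    if i == len(words) - 2:
        return "preposition"   # "... tawa tomo" -> "... to the house"
    return "adjective"         # "jan tawa li ..." -> "the moving person"

print(classify_tawa(["ona", "li", "tawa"]))                 # verb
print(classify_tawa(["mi", "li", "moku", "tawa", "tomo"]))  # preposition
```

The fewer and more deterministic these positional rules are, the less statistical guesswork the computer has to do, which is precisely the design goal stated above.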

Toki Pona’s Potential

As previously mentioned, Toki Pona has many desirable characteristics that we seek in developing a narrative scripting language. Limited syntax and vocabulary clears the user-accessible and computationally efficient requirements (3). Toki Pona also does an excellent job of functioning as dialogue since it was designed from the ground up to be used conversationally (2).

The language has an exceptional potential to detail a wide variety of topics, and although it tends to be extremely vague, enough detail can be given to elucidate the general meaning of an idea. However, some details about the language have the effect of restricting its potential, namely the fact that the language’s creator, Sonja Lang, designed it to address the linguistic needs of a hypothetical, simplistic, aboriginal people on an island. As such, the language is not designed from the ground up to maximize functional vocabulary and in fact caters to the range of topics and activities that such a people would participate in.

The 2014 guide to Toki Pona, occupying an entire word in the language.

For example, the language includes words like “pu” (meaning “the book of Toki Pona”), which is utterly meaningless for our purposes, and “alasa” (meaning “to hunt, to forage”), which fails on perpendicularity since you can easily create the same meaning with phrases like “tawa oko tan moku” (meaning “to go eyeing/looking for food”).

Also, despite how much the language does to accommodate different modes of verb usage (including past, present, and future, progressive variations of each, etc.), it can be troublesome to express some necessary concepts, since “wile” (the word for “want to”) encompasses “want to”, “must/have to/need to”, and “should/ought to”, each of which is a highly distinct and significant nuance.

Some inventive uses of verbs have, however, been adopted to patch gaps in the vocabulary. For example, a person can convey that he or she should do something by applying the command imperative to themselves as a statement, e.g. “mi o moku” => “Me, eat” => “I should eat”, distinct from “mi moku” => “I eat.” These intricacies must be learned on top of the vocabulary and syntax rules, though, as they are built on usage conventions, and their obscurity inevitably hinders the accessibility of the language.
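A parser could recognize this self-imperative convention with a simple pattern check, as in this Python sketch (the output fields and the restriction to three-word statements are invented simplifications):

```python
# Sketch distinguishing "mi moku" ("I eat") from the self-imperative
# convention "mi o moku" ("I should eat") described above.
def parse_statement(words):
    if len(words) >= 3 and words[0] == "mi" and words[1] == "o":
        return {"subject": "mi", "verb": words[2], "mood": "should"}
    return {"subject": words[0], "verb": words[1], "mood": "indicative"}

print(parse_statement("mi o moku".split())["mood"])  # should
print(parse_statement("mi moku".split())["mood"])    # indicative
```

Each such convention is cheap to detect in code, but each one is also another rule the human user has to learn, which is the accessibility trade-off noted above.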

As such, it would appear that the most effective strategy would be to develop a derivation of Toki Pona, built on the same principles and leveraging much of the same language, but stripping it down to only the elements that are most critical for communication and plugging up gaps in linguistic coverage as much as possible.

Narrative Language: Core Concepts

While we won’t iron out the entirety of a language in one sitting, we can likely get a sense for what sorts of concepts must be included as core elements. If we limit ourselves to include 150 words (and even that is really stretching it, if we want to keep the language as effective as possible), then let’s see what ideas are really needed.

  • Pronouns
  • Parts of the body
  • Spatial reasoning, i.e. directions, orientation, positioning
  • Counting, simple math
  • Colors
  • Temporal reasoning, i.e. time (and tense) references
  • Types of living things (including references to types of people, e.g. male/female)
  • Common behaviors (supports occupations, most helping verbs, basic tasks)
  • Common prepositions
  • Elements of nature (earth, wind, water, fire/heat, light, darkness)
  • Forms of matter
  • Grammar particles (obviously taken directly from Toki Pona more or less)
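One way to sanity-check that 150-word budget is to track the lexicon as a category map and count entries as words get assigned. The categories mirror the list above, but every number here is a placeholder allocation I’ve invented for illustration, not a real design decision:

```python
# Hypothetical per-category word allocations; the counts are
# placeholders for illustration, not an actual lexicon design.
BUDGET = 150
categories = {
    "pronouns": 5,
    "body": 12,
    "spatial": 10,
    "counting": 8,
    "colors": 6,
    "temporal": 8,
    "living_things": 15,
    "behaviors": 25,
    "prepositions": 10,
    "nature": 8,
    "matter": 6,
    "particles": 10,
}

total = sum(categories.values())
print(f"{total} / {BUDGET} words allocated")
# -> 123 / 150 words allocated
assert total <= BUDGET, "over budget!"
```

Keeping the allocation explicit like this makes the trade-offs visible: every word added to “behaviors” is a word unavailable for the relationship and perception categories discussed next.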

We must also then include elements that are uniquely necessary to fulfill our needs of describing the nature of people and relationships.

  • Elements of perception (able to describe the “feel”, “connotation”, or “theme” of an experience)
  • Elements of relationships (as above, but also able to describe expected responsibilities; we need basic words to illuminate these, e.g. Toki Pona’s “lawa” for “leader”)
  • Elements of personality

Ideally, we would be able to get many of these meanings using the same words, but just applying them to a different object. For example, if we had the word “bitter”, we could use it to describe a perception, the overall nature of a relationship (or perhaps even the feelings experienced by one party in the relationship), or someone’s personality, e.g. they are a bitter and resentful person, etc.
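This kind of reuse amounts to domain-dependent polysemy, which a narrative engine could resolve by looking up a root word against the domain it is applied to. The sketch below uses “bitter” as the running example; the sense glosses are hypothetical, not drawn from any actual lexicon:

```python
# Sketch: one root word yielding different readings per domain.
# The senses are invented examples, not a real Toki Sona lexicon.
SENSES = {
    "bitter": {
        "perception": "an acrid, unpleasant taste or tone",
        "relationship": "marked by lingering resentment between parties",
        "personality": "a resentful, cynical disposition",
    }
}

def interpret(word, domain):
    """Resolve a word's reading given the domain it is applied to."""
    return SENSES.get(word, {}).get(domain, "unknown sense")

print(interpret("bitter", "personality"))
# -> a resentful, cynical disposition
```

One dictionary entry thus serves three vocabulary categories, which is exactly the kind of leverage a 150-word budget demands.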

Conclusion

In our analysis of Toki Pona, we covered the fact that it has many advantages with regard to fulfilling our requirements for a narrative scripting language; namely, it can double as dialogue and packs a high number of meanings into each learned vocabulary term, so it covers a lot of topics with little base learning time. However, we have also determined that the language has many flaws stemming from its original design purpose: meeting the linguistic needs of an isolated, tribal hunter-gatherer people. As such, there are many unnecessary terms, and some concepts simply cannot be conveyed adequately, if at all.

We can therefore state that the best course of action would be to derive a new language from the structure and vocabulary of Toki Pona, shaving away “the fat,” as it were, and editing the language to bolster its linguistic breadth and depth while still staying true to its minimalist nature. To that end, we outlined several vocabulary topics that would be essential for defining a narrative scripting language.

In future articles, I’ll begin to address the interface we may see in a narrative scripting editor and identify actual gameplay mechanics we could see with narrative scripting put into practice. Please feel free to leave comments if you have any further insights or criticisms and stay tuned for more!

Next Article: Interface and Gameplay Possibilities
Previous Article: What is “Minecraftian Narrative”?