Godot’s Node System, Part 3: Engine Comparisons

This article is part of a series.
Part 1: An OOP Overview
Part 2: Framework Concepts
Part 3: Engine Comparisons

Edit 1: Fixed an error in the description of GameObject layers.

Edit 2: Clarified use of “entity” and “component” terms vs. ECS depiction. Part 1 elaborates on differences.

Introduction

The time has finally arrived.

The last two articles presented an overview of what one can expect from OOP design and popular game frameworks. This article explores Unity, Unreal Engine 4, and Godot Engine: specifically, how each engine's framework uses the concepts discussed in those articles. If you want to understand how Godot's Nodes relate to the other APIs, this is the article for you.

If you want the TL;DR version, each engine's section (Unity, Unreal, Godot) and the Conclusion include a bullet-point summary of the article's information.

While it was mentioned in the last two articles, I’ll re-iterate:

This article's uses of the terms "entity" and "component" are not references to the aspects of an ECS paradigm. Rather, a traditional OOP Object's features are split up into essential elements called "components". Anything in the world that has components is an "entity".


Unity

In Unity, the “level” is a Scene. Each Scene contains a list of GameObject entities which serve as the root of their own tree of GameObjects.

Each GameObject may have a list of Components attached to it. It also has one Tag and one Layer. Tags and Layers can be defined globally (some are engine-defined). Tags assist with identification while Layers assist with operations (categorized custom rendering/collision/UI sorting, etc.). Each GameObject also has a Transform (so even abstract things have a place in the world).

Unity provides many Components for all manner of gameplay. One of these, the MonoBehaviour, is blank aside from a reference to a C# script. C# scripts extending MonoBehaviour become new, user-defined types of Components. MonoBehaviours are therefore able to delegate to their assigned C# script.


Simulated Multiple Inheritance

Unity’s framework supplies a very traditional Entity-Component framework, super-powered with user-customization and utilities for inheritance simulation.

What is meant by inheritance simulation? Well, Unity employs some tricks with its Component API. A user can look up GameObjects of a certain type. But, GameObjects do not have a user-defined type, and the only definable types are components. So while one is actually testing whether the GameObject has-a Component, Unity simulates it as an is-a relationship.

Unity further solidifies this illusion by mirroring the GameObject's GetComponent method on each Component. Each Component is able to access the others as if it were the GameObject. The Components don't really own any other Components though; the devs added this access as a convenience to reduce verbosity.

Components can also enforce dependencies on other Components via the RequireComponent attribute, which forces them onto the owning GameObject. This means that Components can carry their dependency burdens onto the GameObject.

This all implies a sort of multiple inheritance by composition. The components do not manage their own dependencies, but rather exploit the access to the GameObject to fulfill their requirements. A GameObject is considered to be several different types based on what Components it holds.

Users can mitigate this issue by relying on interfaces (C# version). In Unity specifically, that means using interface-typed getters for one's MonoBehaviour properties. This allows a single MonoBehaviour to loosely declare relationships with other Components. That MonoBehaviour becomes the predominant "type" indicator for the GameObject. The MonoBehaviour has loose "ownership" of the interfaces because users can swap the implementing Components out freely.
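
To make these mechanics concrete, here is a minimal sketch in the same no-particular-framework C++ style used elsewhere in this series. None of these names are Unity's actual API; they just illustrate how a has-a lookup can be presented as an is-a relationship.

#include <typeindex>
#include <typeinfo>
#include <unordered_map>

class GameObject; // forward declaration

class Component {
public:
    GameObject *owner = nullptr;
    virtual ~Component() {}
    // Mirror the owner's lookup so a Component can act
    // as if it were the GameObject itself.
    template <typename T> T *get_component();
};

class GameObject {
    // One Component per type, keyed by runtime type.
    std::unordered_map<std::type_index, Component *> components;
public:
    template <typename T> void add_component(T *c) {
        c->owner = this;
        components[std::type_index(typeid(T))] = c;
    }
    // "Is this GameObject a T?" is really "does it have a T?".
    template <typename T> T *get_component() {
        auto it = components.find(std::type_index(typeid(T)));
        return it == components.end() ? nullptr : static_cast<T *>(it->second);
    }
};

template <typename T> T *Component::get_component() {
    // Delegate to the owner: has-a presented as is-a.
    return owner ? owner->get_component<T>() : nullptr;
}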


Prefabs

The Engine's serialization solution is the Prefab. Each Prefab saves a single tree of GameObjects. You can instance the Prefab many times, and changing the original will update the instances. Prefabs do not support inheritance from each other.

You can also update one instance and apply those changes to the Prefab itself. That then automatically propagates the edited content to all other instances.

Note that in Unity, Prefabs are optional. When one defines a new type, they do it by defining a Component. They can create a GameObject and add the Component without needing a Prefab.

The Prefabs are not officially registered types in Unity. They are resources for recreating a particular arrangement of GameObjects, Components, and property values.

Unity provides two layers of abstraction: the organization of Components on a GameObject and the organization of GameObject trees. Many utilities already exist for the former. The Prefab exists for the latter.

But the Prefab is limited in that it cannot internally form relationships with other Prefabs. This means that a Scene’s list of GameObjects cannot be sub-organized into smaller units. If users build a Prefab, it can only become related to another Prefab in the Scene context.

The 2018.3 version of Unity will support nested Prefabs. With this, the Prefabs will be able to store other Prefabs. This actually stands to improve the situation drastically.

Rather than relying on inter-component relationships to define types, people will be able to use Prefabs. Because Prefabs provide a looser coupling via tree structures, users can simplify their code by reducing the number of related Components on any given GameObject.


Networking

Unity makes a lot of simple networking available to users. Like the other engines, it has a low-level API for direct communication and an HLAPI for server/client frameworks.

The core of the HLAPI is a NetworkManager-owning GameObject in the world that regulates connectivity, plus other NetworkIdentity-owning GameObjects to be replicated. Unity also provides different types of networking Components that automatically sync complex, dynamic data between machines.

The NetworkBehaviour Component then provides several utilities (sketched generically after the list):

  • Synchronizing variables.
  • Testing whether this “version” of the object is the server or the client.
  • Testing whether this object even belongs to the local client.
  • Sending notifications to the server or to other clients
  • etc.
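
In the same framework-agnostic C++ style, those utilities amount to roughly the surface below. The names are illustrative stand-ins, not Unity's actual C# API (which uses SyncVar attributes, isServer, hasAuthority, Commands, and ClientRpcs).

// Illustrative only: what an HLAPI behaviour base class exposes.
class NetBehaviour {
public:
    // Register a variable so the framework syncs it to clients.
    template <typename T> void sync_var(const char *name, T *value);
    // Which "version" of the object is running this code?
    bool is_server() const;
    bool is_client() const;
    // Does this object belong to the local client?
    bool has_local_authority() const;
    // Ask the server to run a method (client -> server).
    void send_command(const char *method);
    // Tell all clients to run a method (server -> clients).
    void send_rpc(const char *method);
};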

All in all, there’s a lot of powerful functionality here if you just want to dive in and get started.


Data Management

If a user wants to work with structured data in their Unity project, one of the most effective tools is the ScriptableObject.

To create one, users need to dip their toes into the Unity Editor's tools for modifying the editor itself. They first write an editor script that defines the ScriptableObject they want. Once compiled, the script adds a menu item that creates the asset.

The same C# script defines the properties of the ScriptableObject. Once an asset is made, those properties are editable from the Inspector. With these assets, users can create concrete, customized datasets.

If users require more editing power, they can write their own editors. Unity provides a scripting API for generating quick and dirty UI elements called Immediate Mode GUI (IMGUI).
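
"Immediate mode" means the UI is re-declared inside the draw code every frame rather than built as a persistent tree of widget objects. A rough, framework-agnostic C++ sketch of the idea (the widget functions are hypothetical, not Unity's actual IMGUI calls):

struct Config { int health = 100; };

// Hypothetical immediate-mode widget functions.
void Label(const char *text);
int IntField(const char *label, int value);
bool Button(const char *label);

// Called every frame the editor window repaints; each widget is
// drawn and its interaction handled in the same call.
void draw_inspector(Config &cfg) {
    Label("Enemy Settings");
    cfg.health = IntField("Health", cfg.health);
    if (Button("Reset to Defaults")) {
        cfg = Config(); // react to the click inline
    }
}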

Many Unity plugins make use of these tools. They do not rely on the GameObject-based GUI used for game production, though. Instead, IMGUI supplies a simple set of C# attributes and Editor-related Objects for adding content.


Unity Summary

  • Zero or more GameObjects compose a Scene (as a list).
  • One or more GameObjects compose a GameObject externally (as a tree, including the root).
  • Zero or more Components compose a GameObject internally (as a list).
  • GameObjects have a Transform, one Tag, and one Layer.
  • Components may inherit from other Components.
  • User scripts extend a particular Component: MonoBehaviour or NetworkBehaviour.
  • Components may form dependencies between each other on a single GameObject.
  • Unity strives to simulate multiple inheritance through its composition techniques.
  • Prefabs package up a tree of GameObjects and their Components.
  • Prefab edits propagate in both directions, rippling to all instances.
  • Prefabs do not inherit from each other.
  • Prefabs will soon support inter-Prefab ownership.
  • Prefabs are optional. GameObjects with components can be “inlined” without formalizing them as a Prefab type.
  • Unity provides many utilities for automatically setting up networked gameplay, including…
    • a Network Manager to coordinate connections.
    • ready-to-use components.
    • a blank template component with an HLAPI for server/client infrastructure.
  • Unity provides customizable datasets through ScriptableObjects.
  • Users can define custom editors for ScriptableObjects and other tools using a quick and easy IMGUI scripting API.


Unreal Engine 4

In Unreal, the “level” is a Level. Each Level contains a list of Actor entities which serve as the root of their own tree of Actors.

Each Actor has an internal tree of ActorComponents attached to it. Each Actor has a Transform, so they each have a place in the world.

Actors and ActorComponents are not the end though. Unreal has a fully fleshed-out class system that the editor relies on. The engine provides several types, and the scripting system involves defining new ones. Types can be declared with either a C++ class or a Blueprint, Unreal's visual scripting language.

Every class is a serializable, instanceable object. Users can edit an Actor C++ class or Blueprint independently and those changes will apply to every instance of that Actor in all Levels. The same goes for ActorComponents instanced within Actors.

Unreal sees everything as a class. The engine detects the creation of new classes or edits to existing ones and updates accordingly.

Unreal also has a variety of tag features. Actors and ActorComponents support multiple Tags. The engine provides an API for requesting all Actors or ActorComponents with a specific tag (one function for each).

The Editor also allows users to define global-scope GameplayTags, which are not the same thing. These can be queried and compared to any GameplayTag(s) property on an Actor or ActorComponent. Because they are properties, they don’t fulfill the same function as the previously mentioned tags.
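
For concreteness, a minimal UE4 C++ sketch of the plain tag API (assuming a valid world context; the GameplayTag API is separate):

#include "GameFramework/Actor.h"
#include "Kismet/GameplayStatics.h"

void TagExample(UWorld *World, AActor *Enemy)
{
    // Actors (and ActorComponents) carry an array of FName tags.
    Enemy->Tags.Add(FName("Enemy"));

    // Query every Actor in the world bearing a given tag.
    TArray<AActor *> Found;
    UGameplayStatics::GetAllActorsWithTag(World, FName("Enemy"), Found);

    // Per-instance check.
    bool bIsEnemy = Enemy->ActorHasTag(FName("Enemy"));
}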


Suggestive Design

Unreal is designed around particular ways of structuring one’s code. It helps to enforce good practices by segregating programming tasks into different classes.

  • Actors are objects in the world.
  • Actors have an internal tree hierarchy of ActorComponents.
    • ActorComponents don’t exactly “own” other ActorComponents.
    • An Actor may organize ActorComponents to have child ActorComponents, but that structure is only visible to the Actor.
    • Hence, ActorComponents cannot be organized into self-contained hierarchies.
  • ActorComponent-derived SceneComponents can have positions in the world too. Example: a Camera that is attached to a player and is part of the player’s identity.
  • SceneComponent-derived PrimitiveComponents have a graphical aspect to them.
  • Because self-contained ActorComponent hierarchies are not possible, UE4.14 added ChildActorComponents. These allow users to represent a child Actor as a SceneComponent in a parent Actor.
  • Actor-derived Pawns are Actors that receive instructions from a Controller.
  • Controllers can be…
    • PlayerControllers that respond to input.
    • AIControllers that hook into an AI data structure of some kind.
  • Pawn-derived Characters are a simple template for building mobile, animated, humanoid Pawns.
  • A GameMode singleton determines the rules of play and the initialization of the game:
    • What’s the starting playable character?
    • What’s the starting level?
    • Etc.
  • A GameInstance singleton exists for Level-to-Level data preservation.

The Actor "entity" class absolutely is a type in Unreal, and it has several specializations and relationships to other types. The same goes for its components, which have several variations themselves. This is quite different from Unity's generic GameObject, which pseudo-adopts the types of its components.

As one can see, this class layout is much more suggestive about how to structure a project. There is a class for every purpose.

The suggestive design has advantages and disadvantages though. On the bright side, it encourages users to write cleaner code. Each class's abundance of designed functionality improves iteration time and gets projects running quickly. The user's code is bottled up into each class naturally, and dependencies on other classes are clearly defined.

On the downside, it makes structures that deviate from the pre-defined dependencies more difficult to build. If the user doesn't know in advance how they should design their project, then refactoring will be needed.

The most common refactoring issue was needing to change an Actor into a SceneComponent or vice versa. That issue is much easier to resolve now that users can simulate Actors as SceneComponents via ChildActorComponents. If users create Actors for most things, they can still nest them within another Actor as ChildActorComponents.

Note that the root of an Actor’s component tree is a SceneComponent. Users can set another SceneComponent-derived type as the root if desired. So long as it derives from SceneComponent, the Actor will still have a place in the world. In this sense, one might think of a Level as a list of SceneComponent-derived entities with a subtree of components that are wrapped by an Actor’s class code.
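
A minimal UE4 C++ sketch of assigning a custom root component in an Actor's constructor (the class and component names are illustrative):

#include "GameFramework/Actor.h"
#include "Components/StaticMeshComponent.h"
#include "MyProp.generated.h"

UCLASS()
class AMyProp : public AActor
{
    GENERATED_BODY()
public:
    AMyProp()
    {
        // Any SceneComponent-derived type may serve as the root.
        // A PrimitiveComponent-derived mesh gives the Actor both
        // a Transform and a visible body.
        RootComponent =
            CreateDefaultSubobject<UStaticMeshComponent>(TEXT("Mesh"));
    }
};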


Blueprints

One of Unreal’s greatest advantages is in how it allows users to rely on a GUI to organize and edit each class. These GUIs, the Blueprints, can inherit from C++ classes, so users can safely work with both.

Each Blueprint can expose properties, including classes themselves, alongside methods, macros, and events. The workflow for events is identical to that of a void-returning method, and the macros are extremely flexible. Most of the C++ workflow can be reproduced in Blueprints. Indeed, Blueprints can even be compiled down to C++ for optimization.

In terms of Unity, this would be akin to having the full Unity Editor available for constructing the design of engine elements. The MonoBehaviours, Prefabs, GameObjects, and the relationships between them could interact with more than just a code editor and the Inspector.

Of course, the only reason Unreal needs such compartmentalization is that it has so many different high-level classes. However, it is to Unreal’s credit that the Blueprint element of its framework is intuitive, visually readable, and powerful.

A lot can be done with very little work because one glance at any Blueprint gives users a descriptive, high-level view of the class. Users can organize different aspects of a game’s design into the appropriate Unreal classes. Blueprint’s strong autocompletion features can even hint at what users can do and which class they need to access to do it.

As such, users can work very effectively with Blueprints. They can keep well-organized code despite its distributed nature, so long as the user knows what they are doing. This is more difficult to do in Unity, at least until Nested Prefab support is added.

One of the small downsides of everything being a class is that prototyping a makeshift type still requires creating a formal class (C++ or BP). Not a big deal, but a "throwaway" class still means creating files that later need to be deleted. Unity will happily let you make a GameObject, throw Components onto it, and then delete it, no filesystem cleanup involved.


Networking

Unreal’s networking system is one of the most convenient and mature aspects of the engine. Since Epic Games designed it for online multiplayer FPS games to begin with, it has several integrations with Blueprints that facilitate users’ networking HLAPI use.

  • The Editor can launch however many game instances the user desires, with or without a dedicated server instance.
  • Users can flag individual properties to replicate in various ways automatically.
  • No special classes are required.
    • Actors support replication out-of-the-box.
    • ActorComponents also support replication if their owning Actor’s replication is turned on.
  • Methods, Events, and by extension Macros all support various remote procedure modes (client-only, server-only, synchronize with all clients, etc.).
    • The contents of any of these can also test the role of the current context to define custom behavior, e.g. has authority or not, affiliated with the local client or a proxy, etc.
  • Users can define explicit validation functions for remotely called methods. These ensure that parameters are valid, i.e. not hacks, before progressing with remote execution (see the C++ sketch below).
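
A minimal UE4 C++ sketch of per-property replication and a validated server RPC (the class and member names are illustrative):

#include "GameFramework/Pawn.h"
#include "Net/UnrealNetwork.h"
#include "MyPawn.generated.h"

UCLASS()
class AMyPawn : public APawn
{
    GENERATED_BODY()
public:
    // Flagged once; the engine syncs changes to clients automatically.
    UPROPERTY(Replicated)
    int32 Health;

    // Runs on the server when called from the owning client;
    // the _Validate function can reject hacked parameters.
    UFUNCTION(Server, Reliable, WithValidation)
    void ServerSetHealth(int32 NewHealth);

    virtual void GetLifetimeReplicatedProps(
        TArray<FLifetimeProperty> &OutLifetimeProps) const override;
};

// --- in the .cpp file ---
void AMyPawn::GetLifetimeReplicatedProps(
    TArray<FLifetimeProperty> &OutLifetimeProps) const
{
    Super::GetLifetimeReplicatedProps(OutLifetimeProps);
    DOREPLIFETIME(AMyPawn, Health);
}

bool AMyPawn::ServerSetHealth_Validate(int32 NewHealth)
{
    return NewHealth >= 0 && NewHealth <= 100; // sanity-check input
}

void AMyPawn::ServerSetHealth_Implementation(int32 NewHealth)
{
    Health = NewHealth; // the change replicates to all clients
}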

Considering the depth and variety of features, it’s hard to top Unreal’s out-of-the-box networking smoothness.


Data Management

Data in Unreal is editable from two main assets: DataTables and CurveTables. Users can make each of them directly from the editor.

Each DataTable is a Map of FName (i.e. hashed string) keys paired with Struct instances. A single Struct type is associated with each DataTable, as its properties define the columns.

Structs are declared like classes with C++ or BP, but they don’t have anything except properties. The engine provides a simplified GUI editor for them as well.
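
A minimal UE4 C++ sketch of a row Struct and a typed row lookup (the names are illustrative):

#include "Engine/DataTable.h"
#include "ItemRow.generated.h"

// Each property becomes a column in the DataTable.
USTRUCT(BlueprintType)
struct FItemRow : public FTableRowBase
{
    GENERATED_BODY()

    UPROPERTY(EditAnywhere, BlueprintReadOnly)
    int32 Cost = 0;

    UPROPERTY(EditAnywhere, BlueprintReadOnly)
    float Weight = 0.0f;
};

// Look up a row by its FName key at runtime.
FItemRow *FindItem(UDataTable *Table, FName RowName)
{
    return Table->FindRow<FItemRow>(RowName, TEXT("FindItem"));
}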

CurveTables are quite similar in that they represent spreadsheets with StringName keys. But, their columns are numerically indexed and all cell values are floating point, e.g. 3.14. This is because every row of the CurveTable represents a Curve on the same graph.

With these, Unreal makes it easy to import, export, and edit spreadsheets with customized data. To modify how the data is edited, a user has two options.

The first would be to write a wrapper class to handle it, but it would only work at runtime. In this case, your class would take in inputs and execute changes to the data asset after validating or evaluating those inputs. Likewise, data requests are made simpler since the wrapper class abstracts away access to the spreadsheet. This is similar to what one might do for a typical database.

The second would be to incorporate a Slate plugin to create an Unreal Editor extension. Slate is UE4’s version of Unity’s IMGUI. It relies on its own declarative language embedded within the C++ source code. Slate can be used to create custom editors for the details panel in the Unreal Editor, among other things.


Unreal Engine 4 Summary

  • Zero or more Actors compose a Level (as a list).
  • One or more Actors compose an Actor externally (as a tree, including the root).
  • One or more ActorComponents compose an Actor internally (as a tree, including the root). The root defaults to and must derive from SceneComponent.
  • Actors have a Transform because SceneComponents have a Transform.
  • Actors and ActorComponents can have tags. Once a user has found an instance, they can run queries on GameplayTag(s) properties.
  • Most everything is an inheritable class.
  • There is a diverse ecosystem of classes for many purposes. Each one suggests a best practice.
    • It is easy to build content when following Unreal’s development plan.
    • Deviating from the plan can result in refactoring or manual work. Users will usually want to design with Unreal's API in mind from the beginning.
  • Actors can be nested within each other through ChildActorComponents. This provides Unity’s Nested Prefab feature immediately.
  • Blueprints provide a wonderful overview of classes. Iteration with them is quite easy.
  • Blueprints grant isolated views into a project’s structure.
  • Unreal’s class system requires a C++ or Blueprint file. Inlined, disposable types for quick prototyping aren’t possible.
  • Unreal is built from the ground up to support networking.
    • Several utilities for common tasks.
    • Can configure any property, method, or event’s replication from C++ or BP.
    • Replication supported for Actors. If an Actor is replicated, its ActorComponents may be specially replicated too.
  • Data can be edited as an in-editor spreadsheet. Spreadsheets always map an FName key to…
    • a customizable Struct for DataTables.
    • a series of float values representing a curve for CurveTables.
  • Creating tools and custom editors is done with Slate, an IMGUI-like system that builds editor changes using a scripting API.


Godot Engine

In Godot, the “level” is a scene. Each scene contains a single tree of nodes. Nodes can inherit from other nodes, and as part of a tree, nodes can own other nodes. Users can nest scenes within each other and inherit one scene from another.

Nodes do not necessarily have a location within the world. They do not have a Transform by default, unlike GameObjects and Actors. The most common types of Nodes are Node (empty), Spatial (3D transform), Node2D (2D transform), and Control (2D transform on GUI layer).

Each Node can have a single script which extends its type. The script itself just has to meet a Script interface. Godot 3.0 allows the following script types:

  • GDScript, a C++-Python mashup with a high level of Godot Engine integration.
  • VisualScript, a visual programming language akin to Blueprint Visual Scripting (although more limited and complex to understand in comparison). Also has high integration with Godot Engine. Users can easily define their own VisualScript nodes using any scripting language, including other VisualScripts (linked example uses GDScript).
  • Mono C#, as is present in Unity’s scripting. Exporting a project with C# is still a work-in-progress at the time of Godot 3.0.2. Someone has also been building F# examples, though that isn’t officially maintained (Setup, Implementation).
  • NativeScript, a script that executes dynamic library code. To do so, it uses a C API for the scripting engine called GDNative. Languages with generated C bindings can then also be compiled into dynamic libraries and executed. C++ is the only officially maintained binding, though community bindings exist for languages such as D, Nim, Python, and Rust (work-in-progress).
  • PluginScript, a script that also relies on GDNative like NativeScript, but which additionally teaches the Godot Editor how to handle the language directly, automatically recompiling the dynamic library as needed.

Scripts are usually created as files. However, GDScript and VisualScript files can be inlined and saved directly as sub-resources within scene files. This makes prototyping types much easier as there is no filesystem cleanup needed when dumping a disposable type.

Entities don’t really exist, nor do components. The notion of an Entity and a Component is instead replaced by a Node.

Nodes can be members of an arbitrary number of Groups. This is Godot’s tag system. Groups are globally recognized.
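
A small sketch of the group API using the GDNative C++ bindings (godot-cpp), assuming Godot 3.x signatures where get_nodes_in_group returns an Array:

#include <Godot.hpp>
#include <Node.hpp>
#include <SceneTree.hpp>

using namespace godot;

class Enemy : public Node {
    GODOT_CLASS(Enemy, Node)
public:
    static void _register_methods() {}

    void _init() {
        // Groups are Godot's tag system; membership is global.
        add_to_group("enemies");
    }

    void alert_all() {
        // Query every node in the running SceneTree with this tag.
        Array enemies = get_tree()->get_nodes_in_group("enemies");
        for (int i = 0; i < enemies.size(); i++) {
            Object *e = enemies[i];
            // ...duck-typed interaction, e.g. e->call("on_alert")
        }
    }
};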

Godot games involve running a SceneTree which manages the game’s global node tree.


How do Nodes and Scenes Relate?

So, how exactly does a Node work? How do they fit into scenes in Godot? Well first, let’s break down what a roughly equivalent structure would look like in the other engines. Keep in mind, these are rough estimates. Nodes in Godot are actually quite cheap to produce, so the comparisons aren’t clear-cut.

If Unity Were Godot

In Unity, imagine having a Scene with a single GameObject. That GameObject may have an arbitrary number of child GameObjects. Every single GameObject has exactly one component. The most basic one would by default just be a MonoBehaviour with no C# script attached.

Each GameObject has a single type and supplies a single batch of properties and methods. Each GameObject can operate on its own without any dependencies on the other GameObjects. Any arbitrary subtree can be packaged up into a Prefab, except that these hypothetical Prefabs can inherit from each other and be nested within one another. Then consider that your Scene itself can essentially be one giant Prefab. What if the whole Unity Editor was effectively a Prefab editor?

What if “running a game” meant creating and executing a Prefab? Who needs Scenes when all you have are Prefabs? You would then also be able to “run your game” at any level, because Prefabs exist at all scales of your game’s design.

If Unreal Were Godot

In Unreal, if you recall, this concept was already hinted at: have a Level with a single Actor that manages the entire subtree of Actors. The Level itself becomes a giant tree of SceneComponents that interact with each other. An Actor wrapper encapsulates interactions with each SceneComponent. Each Actor, i.e. relative root SceneComponent, then manages its own internal subtree of ActorComponents.

The Blueprints provide an independent, visual editor for reviewing and developing each class. The class could be an Actor-wrapper or an individual ActorComponent. This provides a full editor for just a class’s functionality, not any particular instance.

At this point, with only a single Actor in each Level, the Unreal Editor with its Viewport, Toolbar, Details Panel, etc. is just a tool for editing a Blueprint’s components and exposed properties in the world.

There is a separate space for “actors” (instances in the world) and “scripts/classes” (the data and behavior of each instance’s class).

What if “running a game” meant creating and executing an Actor Blueprint? Who needs Levels when all you have are Blueprints? You would then also be able to “run your game” at any level because Actor Blueprints exist at all scales of your game’s design.


Godot is Godot. Everything is a Node.

In Godot, each scene has only one Node at its root. Nodes fulfill a singular function but may encapsulate their own data and logic.

Nesting one scene within another is the same as instantiating a new type of Node that micromanages its own internal subtree of Nodes. Editing a scene is then identical to editing a Prefab or Blueprint independently of its context (a declarative extension of the Node’s type).

Defining a new script also allows one to create and add child Nodes, along with micromanaging them. Editing a script is then identical to editing a single-MonoBehaviour GameObject or Blueprint independently (an imperative extension of the Node’s type). A scene’s root node’s script is functionally similar to an Actor wrapper class around an internal subtree of components.

As a result, everything in Godot is a Node. Scripts are imperatively specialized nodes. Scenes are declaratively specialized nodes. A scene’s root node and other internal nodes can all have scripts, letting users mix and match programming styles (Here’s GDQuest’s description). Games are SceneTrees with a root Viewport node. The Godot Editor is one big fancy EditorNode. Everything is a Node, and there’s nothing magical about them.

This point is made even more clear by the fact that a scene cannot be saved or executed unless a root node has been defined. This is because the scene needs to know what type of node the scene is. Contrast that with Unity’s Scenes and Unreal’s Levels which are arbitrary containers of entities that users can safely save or execute in-editor even while empty.

Many of Godot’s C++ nodes are even imperatively defined subtrees of nodes. The image below shows the appearance of a simple scene with a Tree node on the left, as seen from the Editor. On the right, the scene’s runtime view reveals the diversity of nodes that comprise the Tree. The Tree node abstracts away a great many other nodes. The subtree is procedurally generated at runtime as users interact with the Tree’s API.

Scripts and scenes allow users to define new types of nodes with their own localized subtrees in the same way.

Nodes have a notion of “ownership”. Users can see and click on nodes that have no owner in the main screen; however, those nodes don’t exist in the scene dock at design-time. Only “owned” nodes appear in the scene dock.

The Tree node above creates nodes and adds them as children, but doesn’t make itself their owner. As such, they appear to be a part of the Tree node. Any scene file’s root node is automatically made the owner of its sub-nodes so as to make them visible.

“Running a game” means running a SceneTree. SceneTrees automatically instance the user-designated “main” scene. Users can also execute the SceneTree with the currently edited scene though. Because running a game entails running a scene, you can run your game at any level, because scenes exist at all scales of your game’s design.

Loosening Things Up

The big change here is that Unity's GameObject/MonoBehaviour and Unreal's Actor/ActorComponent frameworks each rely on composition. They tightly couple container and component: the components cannot survive without their container.

The components also do not handle their own sub-relationships directly. They might declare what other components they need, but the responsibility of forging those relationships is foisted upon the container logic. This prevents components from handling their own internal substructures without resorting to hacks, if it's possible at all.

In Godot, nodes rely instead on aggregation. Any node can be detached from the SceneTree and continue to exist. It will be frozen in time, blind to the world. Plugging it back in allows it to detect process updates, node movements, and other SceneTree-related notifications.

When users change what scene they are running, only the main scene’s nodes change. The main scene is still a child node of the root Viewport node, which remains unchanged. A user can therefore instantly create a persistent node by just attaching it to the SceneTree’s root node.

Even scripts don't have to be attached to a node to operate. A standalone Script functions just like a standalone class: constants, subclasses, and static methods are all accessible from the loaded script. Attaching the script just causes it to generate a script instance with properties, signals, and local methods.

Nodes operate largely independent of other nodes, so “component” problems are rare. If a node does have a dependency, the editor will clearly warn you.

The scripting API inherently relies on duck typing. Certain implementations (like C++ and C#) might not do that internally, but calls made to other loaded scripts operate via duck typing. As such, interfaces are fast and loose.
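
A framework-agnostic C++ sketch of what duck-typed dispatch amounts to (illustrative names, not Godot's actual binding API):

#include <functional>
#include <map>
#include <string>

// A script instance is just a bag of named callables;
// its "type" is whatever methods happen to be present.
struct ScriptInstance {
    std::map<std::string, std::function<void()>> methods;

    bool has_method(const std::string &name) const {
        return methods.count(name) != 0;
    }
    void call(const std::string &name) {
        // Look the method up by name, not by declared type.
        if (has_method(name)) methods[name]();
    }
};

int main() {
    ScriptInstance quacker;
    quacker.methods["quack"] = [] { /* it's duck enough */ };
    quacker.call("quack"); // runs
    quacker.call("moo");   // silently no-op; no static type check
}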

Godot’s framework can maintain loose coupling no matter how complex a project becomes. Users can execute and test every scene, and therefore every node/script, independently and at any level.

Need to test a subtree in a scene? No problem. Extract the subtree into its own scene (“saving a branch as a scene”).

Want to copy one scene’s subtree into the current one? No problem. Just merge it in (“merging from a scene”).

There is no aspect of Godot’s class framework that limits users. Users don’t waste time fighting the engine’s framework to build their own. Instead, they use Godot’s framework to build the perfect set of nodes for their game.


Okay, So Everything is a Node?

Well, Ok. Not everything is actually a Node. After all, the SceneTree isn’t one. What do they have in common? Well, they are both Objects. What parts of a Node are an Object?

Well, every Object can generically store properties, methods, constants, and signals (Godot's event system). Each can also store a script. Everything an Object supports, a script can also define (including signals).

The scripts actually implement prototypal inheritance. If a request isn’t met by a script, it delegates the request to the C++ Object. Some scripts (like GDScript and C#) support inheritance between other scripts of the same type. With scripts, users can imperatively define new types. One can even create a SceneTree script, point the Godot executable at it on the command line, and run it as a game!

Everything within an Object is subject to Godot's animation system too, so the AnimationPlayer node tends to be far more powerful than it first appears. It can control property changes, method calls, signal emissions, signal responses, etc. One can even combine AnimationPlayer nodes to animate one another's animations.

An Object’s memory is manually allocated and deleted. To simplify memory management, Godot also has a reference-counted Reference type. Those are then extended into the automatically serializable Resource type to which Script belongs. Even scenes get serialized using a Resource called PackedScene.


Data Management

Resources are most like the ScriptableObjects of Unity. Both provide a class wrapper around a set of data and can be (de-)serialized smoothly. Godot has many pre-made resources for different things (images, audio configurations, animations, etc.).

Creating a Resource is as easy as telling the Inspector to make one and saving it. Users can make custom resources by creating a blank one and attaching a script that derives Resource and defines new exposed properties.
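
For example, a custom Resource defined through a NativeScript class with the godot-cpp bindings (a sketch assuming Godot 3.x; the class and property names are illustrative):

#include <Godot.hpp>
#include <Resource.hpp>

using namespace godot;

// Attach this script to a blank Resource and the registered
// property becomes editable (and serializable) in the Inspector.
class UnitStats : public Resource {
    GODOT_CLASS(UnitStats, Resource)
public:
    int health;

    void _init() { health = 100; }

    static void _register_methods() {
        register_property<UnitStats, int>("health", &UnitStats::health, 100);
    }
};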

If one needs to create a custom editor for a Resource, Godot’s tools make that simple as well. Any node can be instructed to execute at design-time (“tool” mode). Godot’s editor is itself a Godot game. So, users can seamlessly integrate into the editor the same Control GUI nodes that they use for game development. With Controls, they can build custom docks, panels, toolbars, and main screen viewers very quickly and visually.

If a user wanted to save a tree of nodes, they could also do that by saving a scene using the PackedScene Resource. Another option would be using JSON utilities or a custom Resource to save more precise properties for later reproduction.


Networking

Godot 2.1 came with very rudimentary networking features, such as an HTTPRequest node and direct, low-level peer-to-peer communication.

Godot 3.0 adds a networking HLAPI (intro and lobby system tutorial, simple FPS demo). GDScript is well integrated with it, enabling users to declare the replication mode of specific methods. In addition, every node has access to the HLAPI utility methods, either through its own API or through the SceneTree's (sketched after the list below).

  • Every Node has access to HLAPI utility methods:
    • Can set properties or call methods.
    • Users can send each setter/call method to a specific client, to the server, or as a multicast to all.
    • Users can send each setter/call method with TCP or unreliably with UDP.
    • Every Node can get a reference to the SceneTree for its features.
  • The SceneTree provides methods for network connections overall.
    • Connecting to other players
    • Testing a player’s connection
    • Checking which network ID is the authority
    • Checking the network ID of the current context.
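
Sketched in C++ with illustrative classes (the method names mirror Godot 3.0's actual GDScript-facing API, but these declarations are stand-ins, not the engine's headers):

// Per-node HLAPI surface.
class NetNode {
public:
    void rset(const char *property, int value);    // remote-set, reliable
    void rset_unreliable(const char *property, int value); // via UDP
    void rpc(const char *method);                  // remote call, all peers
    void rpc_id(int peer_id, const char *method);  // one specific peer
};

// SceneTree-level HLAPI surface.
class NetTree {
public:
    bool is_network_server() const;    // is this context the authority?
    int get_network_unique_id() const; // network ID of this context
};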

People are discussing many potential improvements to the HLAPI. One desired feature is Unreal's ability to launch multiple server/client instances when running a scene. Another is the ability to simulate networking problems like latency, packet loss, bandwidth restrictions, etc. on a single machine.

Godot 3.1 will have WebSocket support, implemented by the same guy who made the HLAPI. That will allow people to send custom data over networks more effectively.

All that said, many of the features need refinement (as they were just introduced in the latest major release). Most of the HLAPI is only accessible from the scripting side of the engine, not the editor. As such, you can’t simply flag a node or property’s replication mode from the editor’s Inspector like you can with Unreal.

So, while Godot’s functionality is largely there, its usability still needs some work.


Godot Engine Summary

  • There is one SceneTree.
  • Scenes contain a single tree of nodes.
  • The SceneTree manages the root Viewport node plus one main scene and zero or more other child nodes.
  • Except for the root Viewport node, every Node has a parent and zero or more child Nodes.
  • Nodes manage self-contained data and logic. Nodes can manage other nodes:
    • imperatively micro-manage via script (potentially invisible to the editor).
    • declaratively micro-manage via scene (fully visible in the editor).
  • Scripts and Scenes ultimately just define new types of Nodes.
  • Scripts can be inlined into scenes if necessary. No need to clutter the file system while prototyping small things.
  • Developers work with the engine to develop the unique API their game needs. They don’t build an API around what the engine makes them use.
  • Just as Nodes can inherit from other Nodes and own other nodes, so too can scenes inherit from and own each other.
  • Scripts implement prototypal inheritance through ownership. What the script doesn’t do, it passes off to its owning C++ class.
  • Godot has a full class hierarchy as well, but rather than defining several high-level objects like Unreal, it defines several low-level objects.
    • Object defines properties, constants, methods, and signals (the event system).
    • The main low-level Objects include reference-counted References, serializable Resources, Scripts, and PackedScenes.
    • Nodes are Objects that support Groups (tag system), delete their own subtrees, and can see the SceneTree.
  • Godot provides JSON utilities and scripted Resources for custom data management.
  • Custom tools and editors are simple to make.
    • Any script can be executed directly in the editor, because…
    • The Godot Editor is a Godot game. Control GUI scenes designed in the Editor can easily integrate into the editor. Therefore…
    • Users can build games and tools using the node system interchangeably.
  • Godot has low-level and high-level networking features, supporting both P2P and server/client frameworks, just like the other engines.
    • GDScript has a simplified interface with the HLAPI for better usability.
  • The HLAPI does not yet have integration with the Editor. It is purely a scripting API.


Conclusion

So, that was a lot of information, way more than any of the other articles. If you stuck with it though, you should now have a decent understanding of how Godot’s framework compares to Unity’s and Unreal’s. I believe that it naturally allows for more flexibility in users’ design choices.

However, the main goal here was just to help people understand what Godot’s nodes and scenes are all about. The concepts are so different as to be very confusing to newcomers.

Unity, Unreal, and Godot share several properties with minute differences. Those small differences have a large impact on the usability of their design though:

  • Each engine provides a description of the world: Scene(s), Level(s), SceneTree.
  • That description entails a data structure of objects: list, list, tree.
  • The objects each manage their own tree hierarchies of similar objects: true, true, true.
  • In some cases, the most basic form of these objects must include a transform: true, true, false.
  • In some cases, the engine supports inheritance between these objects: false, true, true.
  • Each engine provides a means of assigning ownership between these objects: composition, composition, aggregation.
  • In some cases, these objects are containers with components that burden the container with dependencies: true, true, false.
  • In some cases, the engine provides an extended class hierarchy for users to use (which is neither good nor bad):
    • False. Components only.
    • True. High-level API with suggestive design for quick, but inflexible prototyping.
    • True. Low-level API that gives you only exactly what you need and stays out of the way.
  • In some cases, the engine provides a simple means of prototyping disposable classes: GameObjects w/o Prefabs, false, inlined scripts.
  • Each engine provides access to its class API with one or more scripting languages: C#; C++/Blueprint; GDScript/VisualScript/C#/C++/D/Nim/Rust (WIP)/Python.
  • Each engine provides a means of organizing reproducible, serializable, connected arrangements of object hierarchies: Prefabs, Blueprints, Scenes.
  • In some cases, the engine supports inheritance between these serializable hierarchies: false, true, true.
  • Each engine provides a means of assigning ownership between these serializable hierarchies: “Coming Soon” with Nested Prefabs, ChildActorComponents, Scenes.
  • In every case, the engine provides a way of overriding exposed properties in instances of serializable hierarchies, within the world: true, true, true.
  • In some cases, the engine provides a way of overriding exposed properties in locally owned serializable hierarchies, within other serializable hierarchies: undefined, false, true.
  • In some cases, the engine provides independent editors for those pre-made object hierarchies:
    • False. Unity Editor only, Prefabs edited in the same environment as other GameObjects.
    • True. Blueprints with a separate Blueprint Editor (modify only, not run).
    • True. Scenes with the singular Godot Editor (modify and run).
  • Each engine provides a means of defining data:
    • ScriptableObjects. Must use editor tools and a script to create. Editable from Inspector.
    • DataTables and CurveTables. Structs and *Tables can all be created and edited with different in-editor GUIs. Compatibility with spreadsheet tech is a nice plus.
    • Resources. Many premade resources available from Inspector. Can use a script for custom data. Editable from Inspector. No tools required.
  • Each engine provides a means of defining custom tools and editors.
    • IMGUI. Use a special scripting API with attributes to declare in-code what changes to load into the Editor. Quick and dirty.
    • Slate. Use a special scripting API with a declarative syntax to declare in-code what changes to load into the Editor. Dirty, with some configuration required.
    • Tool scripts. Use existing gamedev UI knowledge and the full Godot Editor to visually design expressive tools. Use any supported scripting language. Godot Editor is a Godot game with an identical GUI API.
      • In-scene changes can be done immediately.
      • Editor changes require the use of the EditorPlugin node which has access to the Godot Editor. EditorPlugins require some user configuration to be created (for now).
  • Each engine provides a rich, high-level networking API combined with low-level networking access to meet users’ needs: true, true, true.
  • In some cases, the engine provides access to networking HLAPI changes from the editor, and not just the scripting API: true, true, false.

So, judging from the bulleted list, it seems like Godot has the most flexible and lightweight design of the bunch. But it's important to note that this is a skewed list. It examines only the class framework: how the framework compares to best programming practices and what functionality it offers its users.

Much more goes into a game engine than just what this article presents above:

  • The breadth and depth of out-of-the-box features.
  • Available platforms.
  • Cost. And not just money, but also development time and maintenance once a product is released.
  • Community size and support.
  • How reliable the developers are.
  • Whether you as a developer can fix or work around problems you encounter, without depending on the engine's developers.
  • The breadth and depth of introductory and/or intermediate learning materials for different topics within the engine’s use.
  • Whether the scripting API is easy to learn and understand. Once you've gotten past the easier, tutorial stage of learning, the API needs to stay easy to follow so that you can continue to build content efficiently.
  • Is there readable source code accessible to your team?
  • Many, many more.

This series doesn’t go into any of that, so it should by no means be a definitive statement as to why you should use a particular engine.

For full disclosure, I have several Godot Engine enhancements I am working on, including…

  • Editor recognition of Scripts and Scenes as named types with Node-like support immediately upon creation.
  • Improving usability of EditorPlugin creation and installation (want to automate the configuration needed for EditorPlugins).
  • Improving usability of custom nodes defined by shareable EditorPlugins.

This article is bound to have some information I messed up somewhere. While I have used each engine before, it’s been some time since I’ve gotten in deep with Unity and Unreal. Things I didn’t already know for sure, I tried my best to look up and confirm (actually learned about UE4’s ChildActorComponents that way, so that was cool).

If you find anything that is stated in error or which is presented while missing crucial details, please let me know. I update articles with new information and corrections as they are reported to me. I sincerely hope the article helped you to better understand Godot’s relationship to Unity and Unreal Engine 4. Cheers!


Godot’s Node System, Part 2: Framework Concepts

This article is part of a series.
Part 1: An OOP Overview
Part 2: Framework Concepts
Part 3: Engine Comparisons

Edit 1: Clarified use of “entity” and “component” terms to direct away from the traditional ECS meanings. Part 1 elaborates on differences.

Introduction

While the last article grounded us in basic OOP concepts, this one will dive into common features of frameworks, i.e. collections of object relationships. A game framework does not require these ideas in order to be a framework by any means. Instead, these ideas paint a picture of what one can expect from the game frameworks of today's most popular game engines.

Moving forward with this article, note that the terms "entity" and "component" are not references to the aspects of an ECS paradigm. Rather, a traditional OOP Object's features are split up into essential elements called "components". Anything in the world that has components is an "entity".


Structure

As mentioned in the previous article, some engines use component systems in their frameworks. These are frameworks that rely heavily on various composition techniques rather than inheritance.

Because component systems are becoming more prevalent, we will assume their presence in our overview of game framework concepts. If an engine doesn't have a component system, chances are these concepts still apply, but the exact terminology and application will differ slightly.

Pretty much all engines have users create some “space” in which to create content. While the name used for these spaces may vary, we’ll simply say “level”. Users create a level with an environment paired with a list of entities. Users can edit the environment or place entities to customize their game content.

But, these game engines don’t need any entities in a level to run it. An empty level will execute just as cleanly as a populated one. You usually need a Camera entity to see or hear anything, but the editors will still run the level without it.

The entities can own other entities as children (tree structures), so users can organize things. Each entity also supports ownership of components, but the components don’t own things. The entity maintains all ownership as the user organizes its internal components.


Serialization

But what if a user wanted to define an entity with components that they will use many times? What if they could describe an entity with components, save it to a file, and then load it up whenever they wanted?

Well, first they’d need some automated way of converting the entity and all its components into a file format (usually binary, i.e. 0s and 1s). And thus the engines added Serialization for their entities. With a simple operation, users can take a built entity and save it for later production.

Devs do this by storing information about properties, usually each property's name and value. They first specify some means of formatting that information (text data? binary data?) and then write it directly to a file. They do this for the entity, all of its components, and then each of its sub-entities and their components.

When devs need to create the entity hierarchy again, they load up the file. They determine how each entity is related, which components they have, and which properties should have what values. Then, they create those entities and components with the associated properties, effectively reproducing the stored entity.

In no particular framework, here’s an example of how this might be done in C++.

int main() {
    // We will record this integer value under the name "count"
    int count = 250;
    File f;
    // Create and open a file called "stats.txt" for writing and reading
    bool is_open = f.open("stats.txt", File::READ_WRITE);
    if (!is_open) {
        return ERR_COULD_NOT_OPEN_FILE; //report failure
    }
    f.write_str("count"); // record the name to the File's buffer
    f.write_int(count);   // followed by the value.
    //the file will record how many bytes the integer takes up as well
    f.save(); // the data from the buffer has now been written to disk
    f.seek(0); // move back to the beginning of the file
    String name = f.read_str(); //fetch the name of the data
    if (name == "count") {      //in this case...
        count = f.read_int(); //load the value back into our variable
    }
    //we have now de-serialized the "count" variable
    f.close(); //terminate the file connection to release OS resources
}


Instancing

Users can now start creating many of this entity all throughout their project. But what if they need to make a change to each one? They’d need some way of editing one and having that change applied to all related entities.

Well, developers found a solution for that too. When users create the entity, the entity remembers which file the user deserialized it from. If all the entities use the file as a guide for how to exist, then editing the file can change all the entities at once.

Creating these entities from the serialized type is called “instancing”. Here’s another example of how this works in C++, using no specific framework.

int main() {
    //setup types and data
    Entity e1;
    Entity e2;
    CountComponent cc;
    cc.count = 250;

    //give each Entity a copy of the component
    e1.add_component(CountComponent(cc));
    e2.add_component(CountComponent(cc));
    //create a hierarchy of entities
    e1.add_child(e2);

    //save the hierarchy to a file.
    Serializer::save(e1, "entity.dat");
    //load the file
    SerializedEntity se;
    se.load("entity.dat");

    //Create a few instances.
    //Each has a copy of e1 and e2, each with a CountComponent.
    Entity e3 = se.instance(); //remembers "entity.dat" created it
    Entity e4 = se.instance(); //same

    //make sure one of them is overriding the default value
    e4.get_component<CountComponent>().count = 400;
    //update the original data
    e1.get_component<CountComponent>().count = 300;

    //re-serialize it
    Serializer::save(e1, "entity.dat");
    //de-serialize the data again.
    //another possibility is that the framework detects that "entity.dat" has changed and updates se automatically.
    se.load("entity.dat");

    //propagate the data to all instances.
    //Will create an instance of the data and then
    //diff all components and fields against each instance.
    //If the value is not overridden, go ahead and reset it.
    //Again, the framework may automatically do this.
    se.propagate(); //via Observer pattern, explained later.

    //prints the reloaded value of 300
    print(e3.get_component<CountComponent>().count);
    //prints the individually overridden 400
    print(e4.get_component<CountComponent>().count);

    e4.reset(); //framework may provide a means of resetting
    //prints the reset value of 300
    print(e4.get_component<CountComponent>().count);
}

So now we can make edits to one thing and have those changes propagate to other instances. Sounds a little like inheritance, right? Well, it is an extension of the inheritance concept, but this time, it’s a little different.


Prototypes

Developers have a couple ways to design inheritance. They can build objects from an idea, i.e. “class” (which we covered earlier), or they can create an object and have the instanced copy look to it for guidance on how to exist. This difference is one of class vs. prototypal inheritance.

The key distinction is in what the objects look to for guidance. If a user requests a property from an object, the object has to check whether it has the property.

Class-based objects don’t define properties. Instead, they check their class and its inherited classes (which again, are just ideas). If any of those classes have the property, then the object will too. Users usually define classes in the beginning and can’t edit them as the program executes.

A Prototype-based object can define its own properties. But it also has an inherited prototype, another object. If the current object can't find a requested property, it asks its prototype. If the prototype object has that property, then so does the current one. Continued failures keep moving the request up the prototype chain until the base prototype is reached (Delegation).

The trick here is that prototypes aren’t ideas. They are fully-fledged objects. If the prototypes gain new properties during execution, all their derived types do too. What’s more, a prototype can change at run-time, meaning inheritance is much more fluid. An object may be a drop-in replacement for one type one moment and another type the next moment.

I won’t be doing a code example of Prototype logic. Ample examples of that can be found everywhere since JavaScript, an extremely popular language, is built upon prototypal inheritance.


Networking

Devs often need to send data and execution instructions over a network. Most frameworks give people low-level controls for reading and writing network data. Writing involves sending network data to particular IP addresses and ports. Reading likewise involves listening for network data on particular ports.

Developers have several options when it comes to how to use or implement networking.

  • They can send network data on reliable, slow connections with TCP or on unreliable, fast connections with UDP (TCP vs UDP).
  • They can visualize the network data as individual messages (packets) or as a stream of data (WebSockets) (Packets vs. WebSockets).
  • They can have each application fully aware of the game state (peer-to-peer) or choose a single entity to be the authority on the game state (client-server) (P2P vs. client-server). Note that when the authority is not one of the players, devs call it a “dedicated server.”

Managing these relationships can be incredibly complex. Low-level controls like that are cumbersome for simpler, more common behaviors, such as…

  • Updating a local variable and synchronizing that update on the remote computers.
  • Calling a function locally and replicating that call on the remote computers.
  • Notifying the game state of a change, but not needing to notify other players, e.g. pausing.

Tasks such as these should be more accessible to developers. To simplify things, these frameworks can also define a high-level networking API (HLAPI). It generally creates utilities for client-server designs:

  • It assists in creating, managing, and destroying player connections.
  • It verifies which connected player is the authority.
  • It can test whether the currently executing context is the authority (to create client- or server-specific methods).
  • It enables users to mark which properties/methods it should replicate over the network automatically, to whom, and how.

A game engine's editor usually has some setting in its GUI that enables one to trigger different aspects of the HLAPI with regard to particular variables or methods.

As you can tell, networking is a deep topic with many associated technologies. As such, a concrete example of its usage is difficult. But, you might see something like this at the low level…

int main() {
    Socket s;
    int count = 250;
    String ip_address = "1.1.1.1";
    int port = 9000;
    //send the "count" value to the machine at IP address "ip_address"
    //and send the data to the "port" port on that machine.
    s.send_int(ip_address, port, count);
    //Now listen on the same port for a confirmation of arrival from
    //the other machine.
    while (s.listen(port)) {
        //maybe s.listen() has logic for timing out & returning false
    }
}

…and something like this at the high level (but still in engine code, sort of – this is kind of a bad example, but it’s shorter and requires little explanation to understand).

int main() {
    Integer count;
    Game game;
    //declares that "count" is a variable in the game w/ value 5
    //Also declares that it should synchronize the value to all players
    //All players now know that "count" exists on each machine.
    game.declare_int("count", &count, 5, REPLICATE_MODE::SYNC);
    //The Integer class is handling logic to propagate the new
    //value to other machines automatically through Game's HLAPI
    count.value = 10;
}
hey__listen__by_peachycheetah-d5d1cpv
Credit: Peachy Cheetah

Events

So, users want to maintain a loose coupling, but they also want to connect behaviors. By separation of concerns, an owned object should never know the details of its parent or siblings. But what if the child does something, and the parent or siblings need to react? What’s the best way for it to notify the others?

A poor implementation might be for the child to directly access the parent and, through it, the siblings, and call a method on them. That creates a tight coupling though, as the notifier now has to know the type of whatever it notifies. It also forms a dependency whereby the notifier can only exist in a parent that also has the required sibling.

To prevent this, devs usually make an event system available to the framework’s users. The parent is already aware of both children’s existence and types. It also knows that the sibling will need to react.

Instead of the notifier taking matters into its own hands, the parent simply tells the sibling to listen for the event on the notifier. The notifier can then emit the event. Because the sibling was listening for it, it will react. Neither the notifier nor the sibling knew each other. The parent orchestrates everything from behind-the-scenes. Devs call this design pattern the Observer pattern.

Almost every gaming framework has an event system of this nature. Some even have more than one for different purposes (input, collision detection, user-defined, etc.).

JavaScript offers plentiful examples of the Observer pattern as well, since it is also built upon it via EventListeners. The pattern can be found in virtually every language. We’ll cover its usage in modern game engines and their languages later.
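
To make the idea concrete, here is a minimal, hand-rolled sketch of the pattern in GDScript. All of the names (add_listener, emit, _on_event) are illustrative; they are not any engine’s built-in event API.

# notifier.gd
extends Node
var _listeners = []
func add_listener(p_object, p_method):
    _listeners.append([p_object, p_method])
func emit():
    # notify every registered listener; the notifier knows nothing of their types
    for listener in _listeners:
        listener[0].call(listener[1])

# sibling.gd
extends Node
func _on_event():
    print("sibling reacted!")

# parent.gd -- assumes children named "Notifier" and "Sibling" with the above scripts
extends Node
func _ready():
    # the parent wires the two children together behind-the-scenes
    $Notifier.add_listener($Sibling, "_on_event")
    $Notifier.emit() # prints "sibling reacted!"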

roles

Tags

Now users may start creating a huge number of entities in their games. What if they just need a certain group of them? What if they just need one?

It would be inefficient to cycle through all existing entities and compare their IDs to a list. Instead, it might be a good idea to set up a cache of entities that fit into particular categories. Looking through a list of 1 or 10 entities is a lot faster than looking through 1,000 or 100,000.

Well, devs decided to allow users to create these caches at will. To create one, a user gives it a name and adds entities to the cache. Users can then look up these names, a.k.a. “tags,” to find entities, or they can check which tags an entity has.

Here’s another simple example in non-specific C++ code.

int main() {
    Game game;
    PlayerEntity pe;
    pe.add_tag("player");
    game.add_entity(pe); //add the player
    //add 1000 miscellaneous entities
    for (int i = 0; i < 1000; i++) {
        game.add_entity(Entity()); //construct a fresh anonymous entity each time
    }
    EntityArray arr = game.get_entities_by_tag("player");
    print(arr.size()); //prints 1
    TagArray arr2 = pe.get_tags();
    print(arr2); //prints ["player"]
}

b333839608616bede4b8485fbbdf9802

Conclusion

While many more features are also shared between engines, these are the highlighted ones here. You should by now have an idea of what to expect in popular game engines nowadays, as well as why those features are available in the first place.

Again, we’re only touching the surface here, and we could never go into enough detail on even one of these topics (or sub-topics) in a single article. Feel free to look further into these concepts (especially other “design patterns” like the Observer concept above). There’s a nearly endless amount of material ripe for learning.

If you feel like you have a general grasp of these concepts, then you should be ready for us to talk about how some of today’s engines actually incorporate them. The next article will explain how Unity, Unreal, and Godot enable these designs for their users.

If you liked the article, or if something was confusing or off-sounding, please let me know! I’d love to hear from you in the comments. I do my best to ensure each article is updated as conversations progress. Cheers!

Godot’s Node System, Part 1: An OOP Overview

This article is part of a series.
Part 1: An OOP Overview
Part 2: Framework Concepts
Part 3: Engine Comparisons

Edit 1: Others graciously corrected my mislabeling of Component-based frameworks as a variation of the ECS paradigm. Added a Data-Oriented Design Disclaimer section to prevent misinformation.

Introduction

My last post reviewed Godot Engine. It’s a strong open source competitor to Unity and Unreal Engine 4, among others. While I covered each engine’s scripting frameworks already, I wish to do that analysis more justice.

Newcomers to Godot are often confused about how to approach Godot’s nodes and scenes. This is usually because they are so accustomed to the way objects in the other engines work.

This article is the first in a 3-part series in which we examine the frameworks of Unity, Unreal Engine 4, and Godot Engine. By examining how Unreal and Unity’s frameworks compare, readers may be able to ease into Godot better. Readers should come away understanding Godot’s feature parity with the other engines’ frameworks.

Discussing the engine comparisons requires prior understanding of basic concepts, first with programming and then with game frameworks. The first two articles each respectively cover those topics. If you feel you are already familiar with a given topic, feel free to skip to the next article. Let’s go!

main-qimg-19f14ae4b88596379962f539b2973525

What is OOP?

Game Engines tend to define a particular “framework”: a collection of objects with relationships to each other. Each framework empowers developers to structure their projects how they want. The more a framework enables developers to organize things as desired, the better it is.

To do so, they use Object-Oriented Programming (OOP) whereby users define groups of related variables and functions. Each group is usually called a “class”, an abstract concept. In the context of a class, the variables are “properties” and the functions are “methods”. Creating a manifestation of a class gives you an object, a.k.a. an “instance” of the class.

OOP has several significant elements to it. This article explores each of them in later comparisons. Therefore, it is imperative that we dive into them now.

oop5b15d

The Big 3 of OOP

Devs want users to know how to use a class, but not how it operates. For example, if someone drives a car, they don’t need to know the complexity of how the car drives. Only how to instruct it to drive. Driving might involve several other parts. It might involve data and behaviors the user knows nothing about. The car abstracts away the complexity of the task (Abstraction).

Below is a simple GDScript example of Abstraction. The user is really dealing with a complex collection of data: length, width, and area. But certain rules apply. The area is not editable, and its value changes as length and width change. Also, the user needs to refer to the length and width together. The dev has made it possible for the user to organize a complex concept into a single logical unit.

# rectangle.gd
extends Reference
var length = 1
var width = 1
func get_area():
    return length * width

Users want to edit data or “state”. Devs don’t want users to access the data though. They instead opt to enable users to issue instructions on how to change a class’s data. Devs can then change the data’s structure without affecting the program’s logic. For example, the user can explain that they want to “add an Apple” to the Basket, but how the Apple is stored relative to the Basket is the devs’ concern, not the user’s. Devs should also be able to control the visibility of data to other devs (Encapsulation – Definition 2).

This type of Encapsulation guarantees flexibility of data structure. For example, what if the user attempts to set the length directly?

# main.gd
extends Node
func _ready():
    var r = preload("rectangle.gd").new()
    r.length = 5

Well then, what if we later decide to store the data as a Vector2? That is, a struct containing two floats together? Each Vector2 has an “x” and a “y”.

# rectangle.gd
extends Reference
var vec = Vector2(1, 1)
func get_area():
    return vec.x * vec.y

With that change, “length” is no longer a property. The user now has to search their entire code base for every reference to r.length and change it to r.vec.x. What if instead we encapsulated the data behind instructions to change it, as methods? Then…

# rectangle.gd
extends Reference
var _length = 1 setget set_length, get_length
var _width = 1 setget set_width, get_width
func get_length():
    return _length
func get_width():
    return _width
func set_length(p_value):
    _length = p_value
func set_width(p_value):
    _width = p_value
func get_area():
    return _length * _width

# main.gd
extends Node
func _ready():
    var r = preload("rectangle.gd").new()
    r.set_length(5) # data change is encapsulated by class's method.
    r._length = 5 # in GDScript, the setget keyword forces external property access like this to call the setter rather than touch the data directly.

Now, if we want to change the data, we don’t need to modify the program. Either way, we are only calling set_length(5). Only the object, a self-contained area, must be modified.

# rectangle.gd
extends Reference
var _vec = Vector2(1, 1) setget set_vec, get_vec
func set_vec(p_value):
    pass # block people from setting it directly (if desired)
func get_vec():
    pass # block people from accessing it (if desired)
func get_length():
    return _vec.x
func get_width():
    return _vec.y
func set_length(p_value):
    _vec.x = p_value
func set_width(p_value):
    _vec.y = p_value
func get_area():
    return _vec.x * _vec.y

Users want to refer to a specialized class and a basic class with a common vocabulary. Let’s say you create a Square which is a Rectangle. You should be able to use a Square anywhere you can use a Rectangle. It has the same properties and methods as a Rectangle. But, its specialized behavior sets it apart (Inheritance).

The rule of a Square is that it is a Rectangle, but its sides are equal at all times. For this reason, the length and width are both referred to as its “extent.” Let’s change its instructions to maintain that rule.

# square.gd
extends "rectangle.gd"

# first, we setup the new rule
func set_extent(p_value):
    _vec = Vector2(p_value, p_value)

# Then, we override the original setter instructions.
# Now, only the Square's actions will execute
func set_length(p_value):
    set_extent(p_value)
func set_width(p_value):
    set_extent(p_value)

# main.gd
extends Node
func _ready():
    var s = preload("square.gd").new()
    s.set_length(5)
    print(s.get_area())

Note how Square explicitly creates a specialized version of the set_length() method. It keeps the width equal to the length. The Square can also still use get_area(), even though it doesn’t define it. That is because it is a Rectangle, and Rectangle does define it.

Now, specialized classes may or may not have read or write access to all the base properties and methods. GDScript doesn’t support this type of Encapsulation, but C++ does. C++ has what it calls “access modifiers” that determine what can only be seen by the current class (“private”), what specialized types can also see (“protected”), and what can be seen by all objects (“public”). This controlled access within a class allows for content to be encapsulated within particular layers of inheritance hierarchies (Encapsulation – Definition 1).

class Rectangle {
private:
    int _area = 1;
    bool _area_dirty = true;
    void _update_area() {
        _area = _length * _width;
        _area_dirty = false;
    }
protected:
    int _length = 1;
    int _width = 1;
    void _mark_area_dirty() {
        _area_dirty = true;
    }
public:
    int get_area() {
        if (_area_dirty) {
            _update_area();
        }
        return _area;
    }
};

class Square : public Rectangle {
public:
    void set_extent(int p_value) {
        _length = p_value;
        _width = p_value;
        _mark_area_dirty(); // the cached area must be recomputed on the next get_area()
    }
};

While the area content is locked to the Rectangle class (Square has no need to change how area works), the Square still needs access to _length and _width in order to assert equivalent values for them. Square can only reference “protected” members in its set_extent() method, so _area itself remains inaccessible; Square can merely flag it for recomputation through the protected helper. This encapsulates the “area” concept safely within the Rectangle type.

In cases where you don’t want even the specialized classes to see or access data, access modifiers of this kind can be very useful in assisting the encapsulation, i.e. “hiding” of data from other parties.

growthhacker

Advanced Concerns

Developers want to minimize how much they need to change the code. How? Replace instructional changes with execution changes where possible. The instructions on how to use an object are its “interface.” The way an object performs an instruction is its “implementation.” Devs refer to a program or library’s public-facing interface as its Application Programming Interface (API) (Interface vs Implementation).

You saw an example of the need for maintaining an interface with the first Encapsulation: if the user calls set_length() in all cases, then they don’t need to concern themselves with how it is performing its operations to change the data. An API is more generic though. If a user creates another object that also has a set_length() method, then the objects match the same API.

Some languages, like GDScript, use duck typing and will let the user call a method simply because it exists at the time of the request. Statically typed languages like C++ and C# instead require the user to convert the object into a type that strictly defines the needed function (via interfaces in C#, or a rough equivalent in C++).
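
As a small illustration of duck typing in GDScript (the class names here are arbitrary):

# main.gd
extends Node
class Rectangle:
    func set_length(p_value):
        print("rectangle length = ", p_value)
class Road:
    func set_length(p_value):
        print("road length = ", p_value)

# no type is declared for the parameter; the call succeeds on
# any object that happens to have a set_length() method
func resize(thing):
    thing.set_length(5)

func _ready():
    resize(Rectangle.new())
    resize(Road.new())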

A specialized class should execute its behavior in place of the base class’s where applicable. For example, let’s say Animal has specialized classes Dog, Monkey, and Human. If Animal has an “isHumanoid” method, each of them should have a specialized response to it (false, true, true). Accessing the “isHumanoid” method should always call the specialized response if possible (Polymorphism).

Below is an example of Polymorphism in GDScript. It outlines how the user can request the same information from different objects, but each one, depending on its type, provides a specialized response. There is no risk of the created WindowsOS somehow printing an empty line. Because Mac and Linux inherit from Unix, they inherit a polymorphic response, i.e. a response distinct from the Windows one.

# main.gd
extends Node

class OpSystem:
    func get_filepath_delimiter():
        return ""
class WindowsOS:
    extends OpSystem
    func get_filepath_delimiter():
        return "\\" # the first backslash is "escaping" the second
class UnixOS:
    extends OpSystem
    func get_filepath_delimiter():
        return "/"
class MacOS:
    extends UnixOS
class LinuxOS:
    extends UnixOS

func _ready():
    print(WindowsOS.new().get_filepath_delimiter())
    print(MacOS.new().get_filepath_delimiter())
    print(LinuxOS.new().get_filepath_delimiter())

To avoid over-complicating specializations, it can be useful to establish ownership between classes. If one class owns another class, then an instance of the second may only belong to a single instance of the first. Using ownership in place of specialization provides many advantages. (Aggregation).

  • It is easy to change a class’s ownership from a second class to a third class. It is difficult to change a single class’s specialization to another specialization (Loose Coupling).
  • Favoring Aggregation guarantees protection of the owned class’s data. This supports Encapsulation by ensuring that the class only has to worry about its own data and implementation (Separation of Concerns).
  • Aggregation abstracts away the complexity of using several classes at once. Inheritance abstracts away only the usage details of a single class. Inheritance has a deeper effectiveness for abstracting a single class’s complexity. But the tight coupling inheritance generates is often not beneficial in the long run when Aggregation is an option (stronger Abstraction).

In some cases, developers may want a class to rely on another class for its very existence. One class owns the other, but it is stronger than Aggregation. The owner creates the owned object directly, and if the owner dies, so does the owned object (Composition).

For example, computers and software have this relationship. Turning on the computer starts the software. Shutting it off kills the software. The software has a strong dependency on the computer. It “composes” the computer.

Below is a GDScript example of these concepts.

# main.gd
# Assuming a scene structure like so...
# - Node2D "main"
# - - Node2D "child1"
# - - - Sprite
# - - Node2D "child2"
# - - - Sprite
extends Node2D
func _ready():
    var texture = preload("icon.png")
    $child1/Sprite.texture = texture
    $child2/Sprite.texture = texture

Now, a key feature here is that each “child” isn’t a single object that has a texture property with logic to manipulate images. Instead, all of that is self-contained within a Sprite. The “child” nodes are free to have a variety of other features, but the Sprite tidbit is narrowly confined to that one node. This illustrates separation of concerns.

Because a node can be added and removed easily, it also shows loose coupling. Technically speaking, we could place a node of a different type where each sprite is, give it the name “Sprite”, and make sure it has a property called “texture”. If that were the case, the program would execute the same way. It isn’t strictly tied to the Sprite node type.

In this example, the texture asset is aggregated while each Sprite node that uses it composes its parent “child” node. If the user deletes child1, its child Sprite will die with it. This is because of the Sprite’s compositional relationship to the parent node. But that doesn’t mean the texture is unloaded from memory, i.e. the image is not deleted. It is still in use by child2’s Sprite.

In this way, each Sprite “owns” a reference to the texture, but the texture doesn’t depend on any particular Sprite to exist. The Sprites do depend on a particular node (their parent) to continue existing though. That is composition.

8df8xts

The Birth of Component-Based Systems

In game development, relying on inheritance alone is messy. Combining inheritance with aggregation and composition gives developers great flexibility.

Developers often want to describe game concepts as a set of attributes and behaviors. In the early days, devs took a naive approach to defining these qualities. They might define a basic object and then ever more specialized objects for all purposes of game development.

That quickly became too difficult to manage though. Whenever two specialized classes needed common behavior, devs had to push it up into a basic class. The “basic” classes soon became too complex to maintain.

# main.gd
extends Node
class Bird:
    func fly():
        # REALLY complex flying logic here. Expensive to reproduce and maintain
        pass
class Camera:
    extends Bird
    func record():
        # REALLY complex recording logic here. Same story.
        pass

Let’s say there is functionality in a class somewhere that could be really useful somewhere else. The Camera needs the movement logic to fly smoothly through the environment. Well, if that logic has already been written in the Bird class, then it can make the job a lot easier to just have the camera extend the Bird class and get the flying logic for free. Then the user can just add the camera logic.

But that poses a significant problem. If the camera is a bird, it suddenly also has a lot of unrelated bird functionality too: feathers, a bone structure, animations, chirping logic, wings, etc. And what if the user creates different types of cameras? What if the user makes a TrackCamera that moves on a track that doesn’t even require free movement like the bird provides? In order to get the Camera logic, the TrackCamera now has to carry all of this unnecessary Bird logic!

To solve this, developers created a new technique: create an empty container of attributes and behaviors. Then add only what you need! Devs call these empty containers “entities”. The attributes and behaviors are then grouped into “components” and added to an entity. Destroying the entity also destroys the components.

Through composition, devs could finally design their objects with only what they need. Nothing more. If the demands of an entity change, you add and/or remove a component. If the nature of a particular component needs to change, no problem! Swap out the component with another one that meets the same interface, but provides a new implementation.
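
As a rough sketch of the idea in GDScript (the Entity container and component classes here are illustrative, not any engine’s actual API):

# main.gd
extends Node
class FlyComponent:
    func fly():
        pass # the complex flying logic lives here, written exactly once
class RecordComponent:
    func record():
        pass # the complex recording logic lives here
class Entity:
    # an empty container; attributes and behaviors come from components
    var components = {}
    func add_component(p_name, p_component):
        components[p_name] = p_component

func _ready():
    var bird = Entity.new()
    bird.add_component("fly", FlyComponent.new())
    var fly_camera = Entity.new()
    fly_camera.add_component("fly", FlyComponent.new())
    fly_camera.add_component("record", RecordComponent.new())
    # a TrackCamera needs recording, but carries no Bird baggage
    var track_camera = Entity.new()
    track_camera.add_component("record", RecordComponent.new())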

Data-Oriented Design Disclaimer

This Component-based architecture is not to be confused with the often-used-interchangeably Entity-Component-System (ECS) pattern that is a staple of Data-Oriented Design (DOD). DOD is useful when one is primarily concerned with the hyper-performance of a game engine. It stresses techniques that organize programs around data and transformations rather than abstractions of real-world concepts. ECS is the latest illustration of this practice whereby…

  • Entities are merely an ID number.
  • Components are bundles of data.
  • Systems are stateless transformations of Components’ data.
  • Components are associated with Entities by their ID.

The lighter version of this design (which we cover here) foregoes the System aspect and merely abstracts its Objects into entity or component arrangements. The subdivision of objects into their components is still practiced. Users can simplify their Object into containers for those components. However, often the entity and/or components still execute behaviors rather than having systems handle it.
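
For contrast, a toy ECS-style arrangement might look like the following. This is purely illustrative; none of the engines discussed in this series work this way internally.

# main.gd
extends Node
# component stores: entity ID -> plain data, no behavior
var positions = {}
var velocities = {}

# a "system": a stateless transformation over all matching components
func move_system(p_delta):
    for id in velocities:
        positions[id] = positions[id] + velocities[id] * p_delta

func _ready():
    var id = 1 # an entity is merely an ID number
    positions[id] = Vector2(0, 0)
    velocities[id] = Vector2(10, 0)
    move_system(1.0)
    print(positions[id]) # prints (10, 0)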

Note that a component-based design solves many of the issues with Object-Oriented Programming’s love for inheritance, but it does not meet DOD goals if one wishes to program that way. There is a general trend of moving towards more ECS-like paradigms. The major engines aren’t quite there yet, but it appears as though Unity has hired some experts to help them focus more on that style down the road.

documents-1024x576

Conclusion

Hopefully that wasn’t too much for your brain! By now you should have an introductory understanding of Object-Oriented Programming and its applications in developing software. There are many more, far more detailed explanations for each of the reviewed concepts, and the links provided are simply a way to get you started.

The keywords mentioned should give you the means to find more information if you need a better understanding. Once you feel comfortable with the concepts of inheritance, ownership, and other programming principles, dive into the next topic: today’s gaming framework designs.

As always, please comment with any feedback you have, especially if something is in error or confusing. I frequently update my work over time to try to maintain its integrity.

Godot: the Game-Changer for GameDevs

Edit 1: Updated Graphics Comparison section with before/after shots of Godot and accuracy corrections.

Edit 2: There was some confusion over whether the Deponia devs used Godot for their PS4 port. I had removed that content, but they have now officially confirmed that it WAS in fact a PS4 port. Links in the Publishing section.

Edit 3: I’ve received questions regarding the preference for a 2D renderer option, so I’ve added a paragraph explaining my motivations in the Graphics section.

Edit 4: Godot lead developer Juan Linietsky mentioned in a comment a few points of advancement present in the new 3D renderer. The Graphics section has been updated with a link to his comment.

Edit 5: I have personally confirmed that the GDNative C++ scripting on Windows 10 x64 is now fully functional in a VS2015 project. Updating the Scripting section.

Edit 6: I have received new information regarding the relative performance of future Godot scripting options. I have therefore updated the scripting section.

Edit 7: Unity Technologies announced that they are phasing out UnityScript. Updating the Scripting section. Also, making a correction to Unreal’s 2D renderer description in Graphics.

Edit 8: Unreal recently added C# as a scripting language. I’ve updated the Scripting section accordingly.

Edit 9: More “official” blog posts / tutorials have been made explaining other aspects of the Godot Engine, so I am including links to those where appropriate.

Edit 10: Adding a GM:S API diff link.

Edit 11: Including addendums to each section for GameMaker: Studio, because I frequently see people despairing that this article doesn’t include it in its comparisons.

Introduction

I’ve been tinkering with game development for around 5 years now. I’ve worked with GameMaker: Studio (1.x, though I’ve seen 2.0), Unity, and Unreal Engine 4 along with some exploration of Phaser, Construct 2, and custom C++ engine development (SFML, SFGUI, EntityX). Through my journey, I’ve struggled to find an engine that has just the right qualities, including…

  • Graphics:
    • Dedicated 2D renderer
    • Dedicated 3D renderer
  • Accessibility:
    • Powerful scripting capabilities
    • Strong community
    • Good documentation
    • Preferred: visual scripting (for simpler designer/artist/writer tooling)
    • Preferred: Simple and intuitive scripting architecture
  • Publishing:
    • A large variety of cross-platform support (non-negotiable)
    • Continuously expanding/improving cross-platform support (non-negotiable)
  • Cost:
    • Low monetary cost
    • Indie-friendly licensing options
  • Customization:
    • Ease of extending the editor for custom tool creation
    • Capacity for powerful optimizations (likely via C++)
    • Preferred: open source
    • Preferred: ease of editing the engine itself for community bug fixes / enhancements

For each associated field, I’ll examine some various features in the top engines and include a comparison with the new contender on the block: Godot Engine, a community-driven, free and open source engine that is beginning to expand its graphics rendering capabilities. Note that my comments on the Godot Engine will be in reference to the newest Godot 3.0 pre-alpha currently in development and nearing completion.

Initial Filtering Caveats

Outside of the big 3, i.e. GM:S, Unity and UE4, everything fails on the publishing criteria alone since any custom or web-based engine isn’t going to be easily or efficiently optimized for publishing to other platforms.

github-electron
The new GitHub for Desktop app is an example of Electron (web browser dev-tools on right).

It is true that technologies such as Electron make it possible to port HTML5 projects into desktop and mobile applications, but those ports will inherently suffer limitations in ways natively low-level engines will not. HTML5 games will always need to rely on 3rd party conversion tools in order to become available for more and more platforms. If we wish to avoid that limitation, then that leaves us with engines written in C++ that allow for scripting languages and optimizations of some sort.

GM:S’s scripting system is more oriented towards beginners and doesn’t have quite the same flexibility that C# (Unity) or Blueprint (UE4) has. In addition, GM:S has no capacity for extending the functionality of the engine or optimizing code. The latest version has a $100 buy-in (or a severely handicapped free version). Without a reasonable free-to-use option available in addition to all of the other issues, GM:S fails to meet our constraints. That leaves only Unity, Unreal, and Godot.

Edit: for people who decide to try Godot, someone has started a repository for collecting API differences between Godot and GM:S.

Edit: Due to comments I’ve received over the past several months, I will add sections covering GameMaker: Studio. Note that while I have first-hand experience with 1, I’ve only watched some comparison videos for version 2, so it is possible that I may have missed something.

Graphics Comparisons

There is no doubt that Unreal Engine 4 is currently the reigning champion when it comes to graphical power with Unity coming in at a close second. In regards to 2D renderers, it should be noted that…

  1. Unity has no dedicated 2D renderer as selecting a 2D environment just locks the z-axis / orthogonal view of the 3D cameras in the 3D environment.
  2. Unreal’s dedicated 2D renderer, Slate, can be leveraged through the UMG widget framework to be used in the 3D environment. In a project consisting solely of UMG content therefore, you can more or less use Unreal to get the benefits of operating solely within a 2D renderer. However, all of Unreal’s official tooling and features related to 2D are confined within the non-UMG content (collision, physics, etc.), so it’s not exactly a legitimate “support” for 2D rendering.
  3. GameMaker: Studio has a dedicated 2D renderer. From what I understand, they have begun to simplify the workflow for 3D rendering as well, although, it personally sounds like it’s still a laborious process.
  4. Godot has a dedicated 2D renderer (what it started with in fact) and a dedicated 3D renderer with similar APIs and functionality for each.

For those who might wonder why a dedicated 2D renderer is even significant, the main reason is the ease of position computation. If you were to shift an object’s position in 2D, the positions (at least in Godot) are described purely in terms of pixels, so it’s a simple pair of addition operations (one for each axis). An analogous operation in a 3D renderer requires one to map from pixels to world units (a multiplication), calculate the new position in world coordinates (3 additions) and then convert from world space to screen space (a matrix multiplication, so several multiplications and additions). Things are just a lot more efficient if you have the option of working directly with a 2D renderer.
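
In rough GDScript terms, the difference looks something like this (a sketch assuming child nodes named “Sprite” and “Spatial”; Godot 3.x property names):

# main.gd
extends Node
func _ready():
    # 2D: position is already in pixels, so a move is two additions
    $Sprite.position += Vector2(5, 3)
    # 3D: position is in world units; even after the additions, the
    # renderer must still project world space to screen space through
    # the camera's transform (matrix math) before anything is drawn
    $Spatial.translation += Vector3(0.5, 0.0, 0.3)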

For the 3D rendering side, Unity and Unreal are top dogs in the industry, no doubt about it.

ue4-smoothness-vs-unity5-comparison_thumb
This is a few iterations old, but you get the idea of where they stand (very impressive)

I had some trouble finding sensible 3D examples for GameMaker Studio 2. This was one of 3 images I discovered. It appears to be decent, but not quite of the same caliber as the others.

maxresdefault1

Godot 2.x’s 3D renderer left something to be desired compared with Unity/UE4 (similar to GMS, although the workflow for the same quality of work in Godot 2.x appears to be much simpler than in GMS). The showcased marketing materials on their website look like this:

3dgames
Godot 2.x 3D demonstration, from the godotengine.org 3D marketing content

With Godot 3.0, steps are being taken to bring Godot closer to the fold. Pretty soon, we may start to see marketing materials like this:

godot-graphics
A test demonstration of the 3.0 pre-alpha’s power, shared in the Godot Facebook group.

I’d say that’s a grand improvement, and this graphical foundation lays the groundwork for an impressive future if the words of Godot’s lead developer, Juan Linietsky, are to be taken to heart. (Edit: Juan recently published an article that goes into MUCH further detail regarding how the 3D renderer works. He also recently updated the docs for the Godot shading language)

It remains to be seen exactly how far this advancement will go, but if this jump in progress is any indication, I see potential for Godot 3.0 to enter the same domain of quality as Unity and Unreal.

Publishing Comparisons

Unity is by far the leader in publishing platform diversity with Unreal coming in second and Godot coming in last.

unity-platforms
Unity’s cross-platform support as of July 2017

unreal-platforms-1

unreal-platforms-2
Unreal Engine 4’s cross-platform support as of July 2017
game_maker_publishing
Game Maker’s platforms as of March, 2018

(Note that Switch support for GMS is on its way soon.)

godot-platforms
Godot’s publicly disclosable cross-platform support as of July 2017

Note that for Godot specifically, it also has the capacity to be integrated with console platforms since it is natively written in C++; all you need is the development kit. For legal reasons, however, Godot’s free and open source GitHub cannot include (and thereby publicize freely) the integrated source code for these proprietary kits. Developers who already own a devkit can provide ports, but legally, the non-profit that manages Godot Engine (more on that later) cannot be involved. Despite this setback, the PS4 port of the game Deponia was implemented in Godot.

In addition, Godot 3.0 has recently conformed to the OpenHMD API, integrating functionality for all VR platforms that rely on that standard (so that would include HTC Vive, Oculus Rift, and PSVR). The community is gradually adding VR support, documentation, demonstrations, and tutorials.

vr-platforms

All in all, Unity is still the clear leader here, but both Unreal and Godot provide a wealth of options for prospective developers to publish to the most notable and widespread platforms related to game development. As such, this factor tends to be somewhat irrelevant unless one is targeting release on one of the engine-specific platforms.

Licensing Comparisons

Unity’s free license permits the developer to craft projects with little-to-no feature limitations (only Unity-powered services are restricted, not the engine itself), so long as the user’s total revenue from a singular game title does not exceed $100,000 (as of the time of writing). The license does limit the number of installations you can have, however, as they are linked to a “Personal Edition” of the software. If you end up exceeding the usage limits, then you must pay for a premium license that involves a $35 (to double the monetary limit) or $125 (to remove the monetary limit) monthly subscription rate.

Unreal Engine 4 likewise has a free license; however, its license has no restrictions whatsoever on the size of your team or the number of instances of the engine you are using (distinct from Unity). On the other hand, it has a revenue-sharing license in which 5% of all income over $3,000 per quarter is contributed to Epic Games.

unsecured-debt

The licensing between these two platforms therefore can be more or less beneficial depending on…

  1. How long you plan to spend developing (if using Unity professionally).
  2. How quickly you expect the revenue from your game to roll in (if it dips into UE4’s 5% cut trigger).
  3. How much total revenue you expect to make from a single game (Unity’s revenue cap per title).

GameMaker: Studio is priced from the very beginning. It supplies a free trial with limited asset usage (a bit of an extreme nerf in my opinion) and then ever-increasing purchase rates to get licenses that can publish to more platforms, ranging from $39 (desktop only) all the way to $399 (desktop, mobile, web, and console).

Godot, as you might expect, has no restrictions in any capacity since it is a free and open source engine. It uses the MIT license, effectively stating that you may use it for whatever purposes you wish, personal or commercial, and have no obligation to share any resources or accumulated revenue with the engine developers in any way. You can create as many copies of the engine as you like for as many people as you like. The engine itself is developed through the support of its contributors’ generous free work and through a Patreon whose funds are administered by the Software Freedom Conservancy.

In this domain, Godot is the obvious winner. The trade-off therefore comes in the form of the additional tooling and effort you as a developer have to invest to develop and publish your game with Godot. This and more we shall cover in the below examinations.

Scripting Comparisons

Edit: For a more detailed overview of the scripting API differences, please check out my subsequent article.

Unity officially supports Mono-powered C#. With some tweaking, you could potentially use other .NET languages too (like F#). If you end up needing optimizations, you are restricted to the high level language’s typical methods of speeding things up. It would be more convenient and vastly more efficient if one could just directly develop C++ code that can be called from the engine, but alas, this is not the case. Unity also doesn’t have any native visual scripting tools, although there are several paid-for extensions to the engine that people have developed.

Unreal Engine 4 is gaining a stronger and stronger presence due to its tight integration of C++, the powerful, native Blueprint visual scripting language, and its recent addition of Mono C#. Blueprint is flexible, effective, and can be compiled down into somewhat optimized C++ code. Unreal C++ is an impressive concoction of its own that adds reflection and garbage collection features commonly associated with high level languages like C#.

20140731_unreal-engine_findcover
Unreal’s Blueprint visual scripting language

GameMaker: Studio has its own scripting language, GameMaker Language (GML). It’s simple enough to use for beginners. What’s also cool is an even more beginner-friendly drag-and-drop programming system that can auto-translate to GML (at least in version 2). This all gets mixed into a visual scripting framework that connects things (only in version 2; these are all disparate windows in version 1).

7913962684d5dc85d5214480bfbcbd96
GameMaker Studio’s GML

It is in this area that Godot especially shines out from the others. Previous iterations of Godot have had directly implemented C++ and an in-house python-like scripting language called GDScript. GDScript was adopted after the developers had already tried Python, Lua, and other scripting languages and found all of them lacking in efficiency when dealing with the unique architectural designs that the Godot Engine implements. As such, GDScript is uniquely tailored for Godot usability in the same way that Blueprints for UE4 are.

Later on, a visual scripting system called VisualScript was implemented that could function equivalently to GDScript. Godot 3.0 is also including native support for Mono C# to cater to Unity enthusiasts.

godot-visual-scripting
Godot’s VisualScript visual scripting language

The power that truly sets Godot 3.0 apart however is its inclusion of a new C API for binding scripted properties, methods, and classes to code implemented in other languages. This API allows any native or bound language’s capabilities to be automatically integrated with every other native or bound language’s dynamically linked functionality. The dynamically linked libraries are registered as “GDNative” code that points to the bound languages’ code rather than as an in-engine script, effectively creating a foreign interface to Godot’s API. This means that properties, methods, and classes declared and implemented in one language can be used by any other language that has also been bound. Bindings of this sort have already been implemented for C++ (Windows, Mac, and Linux). Developers are also testing bindings for Python (already in beta), Nim, and D. Rust and JavaScript bindings are in the works as well, if I understand correctly.
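
From the scripting side, consuming a GDNative library is meant to feel like using any other script. A rough sketch of what that usage might look like from GDScript (the paths, resource names, and the run() method are all hypothetical):

# main.gd
extends Node
# a NativeScript resource pointing at a class implemented in, say, C++
const NativeSim = preload("res://bin/native_sim.gdns")
func _ready():
    var sim = NativeSim.new()
    sim.run() # implemented and registered on the native side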

In comparing these various scripting options, C# will likely have better performance than GDScript, but GDScript is more tightly integrated and easier to use. VisualScript will be the least performant of these, but arguably the easiest for non-programmers to use. If raw performance is the goal, then GDNative will be the most effective (since it is literally native code), but it is the least easy to use of these, as you have to create different builds of the dynamic library for each target platform.

prog-languages

The “loose integration” this enables will empower any Godot developer to leverage pre-existing libraries associated with any of the bound languages such as C++’s enhanced optimizations/data structures, any C# Unity plugins that are ported to Godot, pre-existing GDScript plugins, and the massive library of powerful statistical analysis and machine learning algorithms already implemented by data research scientists in Python. With every newly added language, users of Godot will not have to resign themselves to the daunting “language barrier” that haunts game development today. Instead, they’ll be able to create applications that take advantage of every conceivable library from every language they like.

Edit: C# was recently merged in, and someone ran a comparison of the performance between GDScript, C#/Mono, and GDNative C++. In addition, here is a post I made on Reddit that goes more in-depth into the relationship between the engine’s scripting languages.

Framework Comparisons

Unity and Unreal have very similar and highly analogous APIs when it comes to the basic assets developers work with. There are the loadable spaces in the game world (the Scene in Unity or Level in Unreal). They then have component systems and a discrete root entity that is used to handle logic referring to a collection of components (the GameObject in Unity or Actor in Unreal). Loadable spaces are organized as tree hierarchies of the discrete entities (Scene Hierarchy in Unity or World Outliner in Unreal) and the discrete entities can be saved into a generalizable format that can be duplicated or inherited from (the Prefab in Unity or Blueprint in Unreal).

If you want to extend the functionality of these discrete entities, you then must create scripts for them. In Unity this is done by adding a new MonoBehaviour component within the 1-dimensional list of components associated with a game object. You can add multiple scripts and each script can have its own properties that are exported to the editor’s property viewer (the “Inspector”).

unity-growing-pains-understanding-the-inspector
Multiples of these scripts can be added to a single GameObject if desired, but there is also no relationship defined between components.

In Unreal, a discrete entity has an Actor-level tree-hierarchy showing its components. Scripts, however, are not components themselves (although scripts can extend components too), but rather things directly added to the Actor Blueprint as a whole. An individual function may be created from scratch or extending/overloading an existing function. One can also create Blueprint scripts disassociated from any entity as an engine asset (called a Blueprint Function Library). The bad news is that Blueprints aren’t just a file you point to, i.e. you can’t just add the same script file to different Blueprints like you can with Unity’s C# files.

blueprint_editor_example
Components have their own hierarchy, but are merely variables in the scripting organized by context (event/function/macro) in Actors.

In GameMaker: Studio, you create “rooms” in which to place your objects, sprites, tiles, backgrounds, etc. Version 2 has added the ability for rooms to inherit from one another, which is an interesting nuance compared to Unity/UE4’s methods, allowing you to sort of “Blueprint” your room layout and initialized properties. The objects you place in these rooms can then have scripted responses to in-game events. Objects can inherit from one another, but there is no notion of a component system. In order to construct any sort of composition, the “has” relationships need to be declared overtly by searching the room for the object you wish to own and then manually assigning that object id to a variable. It feels clunky to me personally, but it can keep things simple for beginners who don’t want to concern themselves with the cleanliness of their code.

321
Objects in GM:S have a sprite, physics, a parent, and events to which they can attach scripts.

In Godot, things are simplified a great deal. Components, called “Nodes,” are similarly organized into discrete entities that can be saved, duplicated, inherited, and instanced. However, Godot sees no difference between the way a Prefab/Blueprint would organize its components and the way a Scene/Level would organize its entities. Instead, it unifies these concepts into a “scene” in its entirety, i.e. a Prefab/Blueprint is a GameObject/Actor is a Scene/Level; everything is just a gigantic set of instanceable and inheritable relationships between nodes. Scenes can be instanced within other scenes, so you might have one scene each for your bullet, your gun, your character, and your level (using them as you would a Prefab/Blueprint). Scripts that extend node functionality are attached 1-to-1 with nodes, and nodes can be cheaply added with the attached script either built-in (saved into the scene file) or externally linked from a saved script file.
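
So a project’s structure might be sketched like this (the scene and node names are illustrative):

# Level.tscn -- every branch below is just another instanced scene
# - Node2D "Level"
# - - "Character" (instance of Character.tscn)
# - - - "Gun" (instance of Gun.tscn)
# - - - - "Bullet" (instance of Bullet.tscn, spawned when fired)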

godot_scene_dock_2godot_scene_dock_sub

This section is more or less just to demonstrate how each engine has their own way of organizing the game data and highlighting the relationships between elements of functionality. In my personal experience, I find Godot’s model to be much more intuitive to reason about and work with once preconceptions from other engines’ tropes are discarded, but to be honest, this is really just a matter of personal taste.

(Edit: for lack of another place to put this, I’m inserting here; Godot will soon be integrating the Bullet physics engine as an option you can toggle in the editor settings.)

Community and Documentation Comparisons

The most glaring divergence in quality between the engines is the documentation. Unreal’s Blueprint and C++ documentation pale in comparison to the breadth and depth of Unity’s massive array of concepts, examples, and tutorials, built both by Unity Technologies and the large community. This is a damaging blow, but wouldn’t be so bad if Unreal’s documentation were at least adequate. Unfortunately, this is not the case: Blueprints have some diversity of tutorials and documentation (nothing like Unity’s though), especially from the user base, but Unreal C++’s documentation is abhorrently lacking. In-house tutorials will oftentimes be several versions behind, and the Q&A forums can take anywhere from a few days to weeks, months, or even over a year to get a proper response (several engine iterations later when the same issue is still popping up).

The ironic curve-ball in the situation is that Unreal Engine 4 publishes its own source code to its licensed users. One could arguably reference the source code itself in order to teach themselves UE4’s API and best practices. Unfortunately, Unreal C++ tends to be a huge, intimidating beast with custom compilation rules that are not well documented, even in code comments, and very difficult-to-follow code due simply to the complexity of the content. A typical advantage of source code-publishing projects is the capacity to spot a problem with the application, identify a fix, implement it, and submit a pull request, but the aforementioned complexity makes taking full advantage of UE4’s visible source code much more difficult for the average programmer (at least, in my experience and that of other programmers I’ve discussed it with).

GameMaker: Studio actually has very nice documentation for its usage which is a testament to its high usability for new beginners. And you can easily jump between functions and topics as most topics will provide a list of related functions. Therefore, learning about HOW to use a topic is often intertwined with the documentation on what the topic is (very beginner friendly). This is probably one of the highlights of using GM:S in my eyes.

docs_logo

Godot Engine’s documentation is stronger than Unreal’s, but still a bit weaker than Unity’s. I would also say there are some ways in which it is both better and worse than GameMaker: Studio’s (community managed means things can change very quickly, but it also means things need to be reported / discussed and someone has to actually do it. Proactive-ness is key). A public document generator is used for the Godot website documentation while an in-engine XML file is used to generate the contents of the Godot API documentation. As such, anybody can easily open up the associated files and add whatever information may be helpful to users (although they are approved by the community through pull requests).

On the downside this means that it is the developers’ responsibility to learn how to use tools. On the upside, the engine’s source code is beautifully written (and therefore very easy to understand), so teaching yourself isn’t really difficult when you really have to do it; however, that is often unnecessary as the already small community is filled with developers who have created very in-depth YouTube tutorials for many major tasks and elements of the engine.

You can receive fully informative answers to questions within a few hours on the Q&A website, Reddit page, or Facebook group (so it’s much more responsive than Unreal’s community). In this sense, the community is already active enough to start approaching the breadth and depth of Unity’s documentation and this level of detail is achieved with a minute fraction of the user base. If given the opportunity, a fully grown, matured, and active Godot community could easily create documentation approaching the likes of Unity’s professional work.

teamwork

So, while Unity is currently still the winner here, it is also clear from Godot’s accessibility and elegance that even with a larger community, Godot could easily enhance the dimensions of its documentation and tutorials to compensate for the community’s needs.

(Edit: note, that the community has been doing weekly documentation sprints in anticipation of the 3.0 release. Even with API revisions between 2.1 and 3.0, the docs have already improved by roughly 20% over the previous version’s content in the past 5 weeks alone. If you are interested in assisting, please visit the Class API contribution guide to get involved and discuss your plans / progress with the #documentation Discord channel (Discord link).)

Extension Comparisons

Due to UE4’s code complexity and Unity’s closed-source nature, both engines suffer from the disease of needing to wait for the development teams to implement new feature requests, bug fixes, and enhancements.

UE4 supposedly exposes itself for editor extensions with their Slate UI that can be coded in C++, but 1) Slate is incredibly hard to read and interpret and 2) it relies on the C++ code just to extend the editor as opposed to a simple scripting solution.

Unity does supply a strong API for creating editor extensions though. Creating certain types of C# scripts with Attributes above methods and properties can allow one to somewhat easily (with a little bit of learning) develop an understanding of how to create new tools and windows for the Unity engine. The relative simplicity of developing extensions for the editor is a prime reason why the Unity Asset Store is so replete with available options for high quality editor extensions.

As far as I’m aware, GM:S provides some customization options for the UI itself, but doesn’t really provide any means of extending the editor (correct me if I’m wrong). They provide a means of making “extension packages” that let you bundle in scripting functionality and assets, but they don’t give users the ability to modify how the editor works very effectively. It has an extension marketplace similar to other high-profile engines.

Godot has an even easier interface for creating editor extensions than Unity: adding the ‘tool’ keyword to the top line of a script simply tells the script to run at design time in addition to run-time. This instantly empowers developers to manipulate their scripts for tool development: they need only apply their existing script understanding to the design-time state of the scene hierarchy.

rtofmq

EditorPlugin scripts can also be written to edit the engine UI and create new in-engine types. The greatest boon is that all of the logic and UI of the scripting API is the exact same API that allows them to control the logic and UI of the engine itself, allowing these EditorPlugin scripts to operate using the same knowledge already accumulated during one’s ordinary development. These qualities together make creating tools in Godot unbelievably accessible.
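
A minimal sketch of such a plugin might look like this (Godot 3.x API; the custom type name, script, and icon paths are illustrative):

# addons/my_plugin/my_plugin.gd
tool
extends EditorPlugin

func _enter_tree():
    # register a new node type that appears in the editor's "Create Node" dialog
    add_custom_type("MyNode", "Node2D", preload("my_node.gd"), preload("icon.png"))

func _exit_tree():
    # clean up when the plugin is disabled
    remove_custom_type("MyNode")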

In a completely unexpected, but bewilderingly helpful feature, Godot also helps to simplify the process of team-based / communal extension development: all engine assets can be saved as binary files (the standard option for Unreal and Unity; .scn in Godot) OR as text-based, VCS-friendly files (.tscn). Using the latter makes pull requests and git diffs trivially simple to analyze, so it comes highly recommended.
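
For reference, a text-based scene file is just human-readable markup along these lines (a hypothetical two-node scene, in Godot 3.x’s format):

[gd_scene load_steps=2 format=2]

[ext_resource path="res://icon.png" type="Texture" id=1]

[node name="Main" type="Node2D"]

[node name="Sprite" type="Sprite" parent="."]
texture = ExtResource( 1 )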

Another significant difference between Unity and Godot’s extension development is the cultural shift: when looking up something in the Unity Asset Store, you’ll oftentimes find a half-dozen or more plugins for the same feature with different APIs, feature depth/breadth, and price points.

Godot’s culture on the other hand is one of “free and open source tools, proprietary games”. Plugins on the Godot Asset Library must be published with an open license (most of them use MIT), readily available for community enhancements and bug fixes. There is usually only 1 plugin for any given feature with a common, community-debated implementation that results in a common toolset for ALL developers working with the feature in the engine. This common foundation of developer knowledge and lack of any cost makes integrating and learning Godot plugins a joy.

1200px-godot_28game_engine29_logo-svg

Conclusion

Given a desire for high accessibility, a strong publishing and community foundation, minimal cost, powerful optimizations, and enhanced extensibility, I believe I’ve made the potential of Godot 3.0’s effect on the game industry quite clear. If offered a chance, it could become a new super-power in the world of top-tier game engines.

This article is the result of my working with the Godot 3.0 pre-alpha for approximately 3 months. I had never investigated it before, but was blown away by the engine when I first started working with it. I simply wished to convey my experience as a C++ programmer and my insight into what the future of Godot might hold. Hopefully you too will be willing to at least give it a try.

Who knows? You might find yourself falling in love all over again. I know I did.

Minecraftian Narrative: Part 7

Table of Contents

  1. What is “Minecraftian Narrative”?
  2. Is “Toki Pona” Suitable for Narrative Scripting?
  3. Interface and Gameplay Possibilities
  4. Toki Sona Implementation Quandries
  5. Dramatica and Narrative AI
  6. Relationship and Perception Modeling
  7. Evolution of Toki Sona to “tokawaje”

Introduction

Unlike previous iterations of this series, today we’ll be diving into the field of linguistics a bit more intensely. The reason for my lack of any new posts in a month and a half has been the result of my work on a completely new language which is now approaching an alpha state (at which point, I will theoretically be able to build a full parser for it). Today, I’ll be covering why I decided to invent a language, where it came from, how it is different, and how it all ties into the overall goal of narrative scripting.

Future posts will most certainly reference this language, so if you aren’t interested in the background and just want the TL;DR of the language features and relevance to narrative scripting, then feel free to skip on down to the conclusion where I will review everything.

Without further ado, let’s begin!

Issues With “Toki Sona”

Prior to this post, I had puffed up the possibilities of using a toki pona-derived language, hereafter referred to as toki sona. While I was quite excited about tp’s potential to combine concepts together and support a minimal vocabulary with simple pronunciation and an intuitive second-hand vocabulary (e.g. “water-enclosedConstruction” = bathroom), there were also a variety of issues that forced me to reconsider its adaptation towards narrative scripting.


First and foremost is the ambiguity within the language. Syntactic ambiguity makes it nearly impossible for an algorithm to easily understand what is being stated, and tp has several instances of this lack of clarity. For example, “mi moku moku” could mean “the hungry me eats” or “I hungrily eat” or even some new double-word emphasis that someone is experimenting with: “I really dove into eating [something]” or “I am really hungry”.  With the language unable to clearly distinguish modifiers and verbs from each other, identifying parts of speech and therefore the semantic intent of a word and its relationship to other words is needlessly complicated.

The one remedy I thought of for this would be to add hyphens in between nouns/verbs and their associated modifiers, not only allowing us to instantly disambiguate this syntactic confusion, but also simplifying and accelerating the computer’s parsing with a clear delimiter (a special symbol separating two ideas). However, this solution is not audibly communicable during speech and therefore retains all of these issues in spoken dialogue, violating our needs. Using an audible delimiter would of course be completely impractical.

The other ambiguity problem is the intense semantic ambiguity caused by the language’s restricted vocabulary. The previously mentioned “bathroom” (“tomo telo”) could also be a bathhouse, a pool, an outhouse, a shower stall, or any number of other related things. Sometimes, the distinction is minor and unimportant (bathroom vs. outhouse), but other times that distinction may be the exact thing you wish to convey. What happens then if we specify “bathroom of outside”?


One possibility is the use of “ma” meaning “the land, the region, the outdoors, the earth”, but then we don’t know if this is an outdoor bathroom or if it is the only indoor bathroom in the region, or if it is just a giant pit in the earth that people use. The other possibility could be “poka” meaning “nearby” or “around”, but that is even more unclear as it specifies a request purely for an indoor bathroom of a given proximity.

As you can see, communicating specific concepts is not at all tp’s specialty. In fact, it goes against the very philosophy of the language: the culture supported by its creators and speakers is one that stresses the UNimportance of knowing such details.

If you ever happen to speak with a writer, however, they will tell you the importance of words, word choice, and the evocative nature of speech. They can make you feel different emotions and manipulate the thoughts of the reader purely through the style of speech and the nuanced meanings of the terminology used. If we are to support this capacity in a narrative scripting language, we cannot build its foundation on a philosophy prejudiced against good writing.


The final issue, related to the philosophy, is the set of grammatical limitations imposed by its syntax.

  1. Inter-sentence conjunctions like English’s FANBOYS (“I was tired, yet he droned on.”) are thankfully not entirely absent: tp has an “also” (“kin”) and a “but” (“taso”) that can start the following sentence and relate two ideas. One can even use adverbial phrases (the only type of dependent clause allowed) to assist in relating ideas. However, limitations are still present, and the reason for that is a mix of the philosophy and the (admirable) goal of keeping the vocabulary size compact.
  2. You cannot satisfactorily describe a single noun with adjectives of multiple, tiered details (“I like houses with toilet seats of gold and a green doorway”). The problem revolves around the “noun1 pi multi-word modifier” technique, which converts a set of words into an adjective describing noun1. Users of tp have debated ways of combating this limitation. One option I had considered, and which is mildly popular, is re-appropriating the noun conjunction “en” (“and”) as a way of connecting pi-phrases, but because you effectively need open and closed parentheses to accomplish the more complex forms of description, there isn’t really a clean way of handling this.

Between the grammatical limitations, the prejudice against detailed writing, and the ambiguity, toki pona (and any language closely related to it) will inevitably find itself wanting as a narrative scripting language. Time to move on.

Lojban: Let’s Speak Logically!

In an attempt to solve the problems of toki pona, a friend recommended to me that I check out the “logical language”, Lojban (that ‘j’ is soft, as in “beige”).

Lojban is unlike any spoken language you have ever learned: it borrows much of its syntax from functional programming languages like Haskell. Every phrase/clause is built around a single word indicating a set of relationships; the other words act as “parameters”, plugging concepts into those relations.


For example, “cukta” is the word for book. If you simply use it on its own, it plugs “book” into the parameter of another word. However, that’s not all it means. In full, “cukta” means…

x1 is a book containing work x2 by author x3 for audience x4 preserved in medium x5

If you strung tons of cukta’s together (with the proper particles separating them) to fill every slot, the result would mean…

A book is a book about books that is written by a book for books by means of a book.

You can also specifically mark which “x” a given word is supposed to plug in as, without needing to use any of the other x’s, so the word cukta can also function as the word for “topic”, “author”, “audience”, and “medium” (all in relation to books). If that’s not conservation of vocabulary, I don’t know what is.

It should be noted that 5-parameter words in Lojban are far more rare than simpler ones with only 2 or 3 parameters. Still, it’s impressive that, using this technique, Lojban is able to communicate a large number of topics using a compressed vocabulary, and yet remain extremely explicit about the meaning of its words.

Just as important to notice is how Lojban completely does away with the concepts of “noun”, “verb”, “object”, “preposition”, and anything of the sort. Concepts are simply reduced to a basic entity-relation-entity form: entity A has some relationship x? to entity B. This certainly makes things easier for the computer. One might also think this simpler system would be easier for humans to learn, but because it is so vastly different from the norm, people coming from more traditional languages will have a harder time understanding it, especially given the plurality of relationships possible with a single word.


Another strong advantage of Lojban is that it is structured to provide perfect syntactic clarity to a computer program and can be completely parsed in a single pass. In layman’s terms, the computer only needs to “read” the text one time to understand, with 100% accuracy, the “part of speech” of every word in a sentence. There is no need for it to guess how a word will be syntactically interpreted.

In addition, Lojban imposes a strict morphological structure on its words to indicate their meaning. For example, each of the “root set” words like “cukta” has one of the two following patterns: CVCCV or CCVCV (C being “consonant” and V being “vowel”). This makes it much easier for the computer to pick out these words in contrast to other words such as particles, foreign words, etc. Every type of word in the language conforms to morphological standards of a similar sort. The end result is that Lojban parsers, i.e. “text readers”, are very, very fast in comparison to those for other languages.
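To illustrate just how mechanical that is, here is a small sketch of my own in Python (not an official Lojban tool): a single regular expression can test whether a word has a root-word shape.

import re

# The two root-word shapes mentioned above: CVCCV and CCVCV.
V = "[aeiou]"
C = "[bcdfgjklmnprstvxz]"  # Lojban's consonant inventory
GISMU = re.compile(f"^({C}{V}{C}{C}{V}|{C}{C}{V}{C}{V})$")

for word in ["cukta", "skami", "pilno", "le"]:
    print(word, bool(GISMU.match(word)))  # True, True, True, False ("le" is a particle)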

One more great advantage of Lojban is that it has these terrifically powerful words called “attitudinal indicators” that allow one to communicate a complex emotion using words on a spectrum. For example, “iu” is “love”, but alternative suffixes give you “iucu’i” (“lack of love”, a neutral state) and “iunai” (“hate/fear”, the opposite state). You can even combine these terms to compose new emotions like “iu.iunai” (literally “love-hate”).


For all of these great elements, though, Lojban has two aspects that make it unsuitable for the simple narrative scripting we are aiming for. First, it is too large a language: 1,350 words just for the “core” set that allows you to say reasonable sentences. While this is spectacularly small for a traditional language, in comparison to toki pona’s nicely compact 120, it is unacceptably massive. As game designers, we simply can’t expect people to devote the time needed to learn such a huge language within a reasonable play time.

The other damaging aspect is the sheer complexity of the language’s phonology and morphology. When someone wishes to invent a new word using the root terms, they essentially mash them together end-to-end. That alone would be fine, but letters get switched around and part of the latter word is consumed by the end of the former, which makes the result very difficult to follow. For example…

skami = “x1 is a computer used for purpose x2”
pilno = “x1 uses/employs x2 [tool, apparatus, machine, agent, acting entity, material] for purpose x3.”
skami pilno => sampli = “computer user”

Because “skami pilno” occurs commonly in Lojban usage, a new word with the “root word” morphology can be invented by combining their letters. Obviously, this is very difficult to do on the fly and effectively means people must learn an entirely new word for the concept.

All that to say that Lojban brings some spectacularly innovative concepts to the table, but due to its complex nature, fails to inspire any hope for an accessible scripting language for players.

tokawaje: The Spectral Language

We need some way of combining the computer-compatibility of Lojban with the elegance and simplicity of toki pona that omits as much ambiguity as possible, yet also allows the user to communicate as broadly and as specifically as needed using a minimal vocabulary.

Over the past month and a half, I’ve been developing just such a language, called “tokawaje”. An overview of the language’s phonology, morphology, grammar, and vocabulary, along with some English and toki pona translations, can be found on my commentable Google Sheets page here (concepts on the Dictionary tab can be searched for with “abc:” where “abc” is the 3-letter root of the word). With grammar and morphology concepts derived from both Lojban and toki pona, and with a minimal vocabulary of 150 words, it approximates toki pona-like simplicity with the potential depth of Lojban. While it is still in its early form, allow me to walk through the elements of tokawaje that capture the strengths of the other two while avoiding their pitfalls.

Lojban has three advantages that improve its computer accessibility:

  1. The entity-relation-entity syntax for simpler parsing and syntactic analysis.
  2. Morphological and grammatical constraints: the word and grammar structure is directly linked to its meaning.
  3. The flexibility of meaning for every individual learned word: “cukta” means up to 5 different things.

toki pona has two advantages that improve its human accessibility:

  1. Words that are not present in the language can be approximated by combining existing words, using composition to construct new concepts. This makes the vocabulary much more intuitive.
  2. It is extremely easy to pronounce words due to its mouth-friendly word construction (every consonant must be followed by a single vowel).


“tokawaje” accomplishes this by…

  1. Using a similar, albeit heavily modified entity-relation-entity syntax.
  2. Having its own set of morphological constraints to indicate syntax.
  3. Using words that represent several things that are associated with one another on a spectrum.
  4. Relying on toki pona-like combinatoric techniques to compose new words as needed.
  5. Using a phonology and morphology focused on simple sound combinations that are easily pronounced. Words must match the pattern: VCV(CV)*(CVCV)*.

Now, once more, but with much more detail:

1) Entity-Relation-Entity Syntax

Sentences are broken up into simple 1-to-1 relations that are established in a context. These contexts contain words that each require a grammar prefix to indicate their role in that context. After the prefix, each word has some combination of concepts that make up a full word. Each concept is composed of a particle, tag, or root (some spectrum of topics) followed by a precision marker that indicates the exact meaning on that spectrum.

The existing roles are… (vowels pronounced as in Spanish):

  1. prefix ‘u’: a left-hand-side entity (lhs) similar to a subject.
  2. prefix ‘a’: a relation similar to a verb or preposition.
  3. prefix ‘e’: a right-hand-side entity (rhs) similar to an object.
  4. prefix ‘i’: a modifier for another word, similar to an adjective or adverb.
  5. prefix ‘o’: a vocative marker, i.e. an interjection meant to direct attention.

Sentences are composed of contexts. For example, “I am real to you,” is technically two contexts. One asserts that “I am real” while the other asserts that “my being real” is in “your” perspective. This nested-context syntax is at the heart of tokawaje.

These contexts are connected with each other using context particles:

  1. ‘xa’ (pronounced “cha”), meaning open a new context (every sentence silently starts with one of these).
  2. ‘xo’ meaning close the current context.
  3. ‘xi’ meaning close all open contexts back to the original layer.

(These also must each be prefixed with a corresponding grammar prefix)
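To make the mechanics concrete, here is a minimal sketch of my own (in Python, assuming contexts nest as a simple stack; this is not part of any official spec) of how a parser might react to these three particles:

def apply_context_particle(stack, particle):
    # stack: the list of open contexts; the last element is the current one.
    # Every sentence silently starts with one open root context: stack = [[]]
    if particle == "xa":      # open a new nested context
        stack.append([])
    elif particle == "xo":    # close the current context into its parent
        inner = stack.pop()
        stack[-1].append(inner)
    elif particle == "xi":    # cascading close back to the root context
        while len(stack) > 1:
            inner = stack.pop()
            stack[-1].append(inner)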

Examples of Concept Composition:

  1. ‘xa’ = an incomplete word composed of only a particle+precision.
  2. “uxa” = a full word with a concept composed of a grammar prefix and a particle+precision.
  3. “min” = root for pronouns, “mina” = “self”, full “umina” = “I”.
  4. “vel” = root for “veracity”, “vela” = “truth”, full “avela” = “is/are”.
  5. “sap” = root for object-aspects, “sapi” = “perspective”, full “asapi” = “from X’s perspective”.

Sample Breakdown:

“I am real to you” => “umina avela evela uxo asapi emino.”

  1. “umina” {u: subject, min/a: “pronoun=self”}
  2. “avela” {a: relation, vel/a: “veracity=true”}
  3. “evela” {e: object, vel/a: “veracity=true”}
  4. “uxo” {u: subject, xo: “context close”} // indicating the previous content was all a left-hand-side entity for an external context.
  5. “asapi” {a: relation, sap/i: “aspect=perspective”}
  6. “emino” {e: object, min/o: “pronoun=you”}


It’s no coincidence that the natural grammatical breakdown of a sentence looks very much like JSON data (web API anyone?). In reality, it would be closer to…

{ prefix: 'u', concepts: [ ["min", "a"] ] }

…since the meanings would be stored locally between client and server devices.

This is DIFFERENT from Lojban in the sense that no single concept will encompass a variety of relations to other words, but it is SIMILAR in that the concept of a “subject”/”verb”/”object” structure isn’t technically there in reality. For example:

“umina anisa evelo” => “I -inside-> lie” => “I am inside a lie.”

In this case, “am inside” isn’t even a verb, but purely a relation simulating an English prepositional phrase where no “is” verb is technically present.

Contexts can also be left incomplete to create gerunds, adjective phrases, etc. For example, to create a gerund left-hand-side entity meaning “existing”, I might say

“avela uxo avela evelo.” => “Existing is (itself) a falsehood.”


You might ask, “How do we tell the difference with something like ‘uxavela’? Might it be {u: subject, xav/e: something, la: something}?” Actually, no. The reason the computer can immediately understand the proper interpretation is because of tokawaje’s second Lojban incorporation:

2) Strict Morphological Constraints for Syntactic Roles

Consonants are split up into two groups: those reserved for particles, such as ‘x’ and those reserved for roots, such as ‘v’. The computer will always know the underlying structure of a word’s morphology and consequent syntax. Therefore, given the word “uxavela” we will know with 100% certainty that the division is u (has the V-form common to all prefixes), xa (CV-form of all particles), and vela (CVCV-form of all roots).

Particles can be split up into two categories based on their usual placement in a word.

  1. Those that are usually the first concept in a word.
    1. ‘x’ = relating to contexts (as you have already seen previously)
      1. ‘xa’ = open
      2. ‘xo’ = close
      3. ‘xi’ = cascading close
      4. ‘xe’ = a literal grammar context (to talk about tokawaje IN tokawaje)
    2. ‘f’ = relating to irrelevant and/or non-tokawaje content
      1. ‘fa’ = name/foreign word with non-tokawaje morphology constraints
      2. ‘fo’ = name/foreign word with tokawaje morphology constraints
      3. ‘fe’ = filler word for something irrelevant
  2. Those that are usually AFTER a concept as a suffix (could be mid-word).
    1. ‘z’ = concept manipulation
      1. ‘za’ (zah) = shift meaning more towards the ‘a’ end of the spectrum
      2. ‘zo’ (zoh) = shift meaning more towards the ‘o’ end of the spectrum
      3. ‘zi’ (zee) = the source THING that assumes the left-hand-side of this relation.
        1. Ex. “uvelazi” => that which is something
        2. Shorthand for “ufe avela uxo”.
      4. ‘ze’ (zeh) = the object THING that assumes the right-hand-side of this relation.
        1. Ex. “uvelaze” => that which something is.
        2. Shorthand for “avela efe uxo”.
      5. ‘zu’ (as in “food”) = questioning suffix
      6. ‘zy’ (zai) = commanding suffix
      7. ‘zq’ (zow) = requesting suffix
    2. ‘c’ = tensing, pronounced “sh”
      1. ‘ca’ = future tense
      2. ‘ci’ = progressive tense
      3. ‘co’ = past tense
    3. ‘b’ = logical manipulation
      1. ‘be’ = not
      2. ‘ba’ = and
      3. ‘bi’ = to (actually, it is the “piping” functionality in programming, if you know about that)
      4. ‘bo’ = or
      5. ‘bu’ = xor

All other consonants in the language fall into the “root word” set. With these clear divisions, a tokawaje parser will always know what role a concept plays in manipulating the meaning of a word.
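As a quick sketch of how deterministic that segmentation is (my own Python illustration, using the particle consonants listed above):

PREFIXES = set("uaeio")
PARTICLE_CONSONANTS = set("xfzcb")  # per the lists above; all other consonants begin roots

def segment(word):
    prefix, rest = word[0], word[1:]  # the grammar prefix is a single leading vowel
    assert prefix in PREFIXES
    concepts = []
    while rest:
        # particles are CV (2 letters); roots are CVCV (4 letters)
        size = 2 if rest[0] in PARTICLE_CONSONANTS else 4
        concepts.append(rest[:size])
        rest = rest[size:]
    return prefix, concepts

print(segment("uxavela"))  # ('u', ['xa', 'vela'])
print(segment("umina"))    # ('u', ['mina'])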

I’d also like to point out that informal, conversational usage may drop the usual placement distinction between these two groups of particles. For example, someone may simply say:

“uzq” => “Please.”

This would not actually impact the computer’s capacity to distinguish terms, though. I even plan to make my own parser assume that the lack of a grammar prefix implies an intended ‘u’ prefix (not that that’s encouraged).

3) Concepts in Tokawaje Exist on Spectra

Almost every word in the language has exactly 4 meanings. Only 3 non-root concepts use more than that: the grammar prefixes, the ‘z’-based word manipulators (as you’ve already seen), and the general expressive noises / sound effects, which are vowel-only. This technique allows for vocabulary that is flexible, yet intuitive, despite its initial appearance of complexity.

4) Sounds and Structure are Designed for Clear, Flowing Speech

Every concept is restricted to a form that facilitates clear pronunciation and a consistent rhythm. Together, these elements ensure that the language is simple to learn phonetically.

Concepts have the form C (particles/tags) or CVC (roots) along with a vowel grammar prefix and a vowel precision suffix, resulting in a minimum word of VCV or VCVCV.

The rhythm to concepts emphasizes the middle CV: u-MI-na, a-VE-la, etc. Even with suffixes applied to words, this pattern never becomes unmanageable. The result is a nice, flowy-feeling language:

  1. uvelominacoze / avelominaco (“velomina” => a personal falsehood)
    1. u-VE-lo-MI-na-CO-ze (that which one lied to oneself about)
    2. a-VE-lo-MI-na-co (to lie to oneself in the past)


5) Tokawaje Employs Tiered Combinatorics to Invent New Concepts

The first concept always communicates the root “what” of a thing while the subsequent concepts add further description of the thing. This structure emulates toki pona’s noun-combining mechanics.

‘u’, ‘a’, and other non-‘i’ terms are primary descriptors and more closely adhere to WHAT a thing is. ‘i’ terms are secondary descriptors and approximate the additional properties of a thing BEYOND simply WHAT it is. Fundamentally, every concept follows these simple rules:

  1. Non-‘i’ words are more relevant to describing their role’s reality than ‘i’ words.
  2. However, individual words are described more strongly by their subsequent ‘i’ words than they are by other terms.
  3. Multiple non-‘i’ words will further describe that non-‘i’ term such that later non-‘i’ words act as descriptors for the next-left non-‘i’ word and its associated ‘i’ words.

Let’s say I have the following sentence (I’ll be using the filler particle “fe” with an artificially inserted number to reference more easily. Think of each of these as a root+precision CVCV form):

“ufe1fe2 ife3fe4 ife5 ufe6fe7 ife8 avela uxofe9 ife10 afe11”

This can be broken down in the following way:

  1. Any pairing of adjacent fe’s forms a compound word in which the second fe is an adjective for the previous fe, but the two of them together form a single concept. For example, “ife3fe4”: fe4 is modifying fe3, but the two together form an adjective modifying the noun “ufe1fe2”.
  2. The subject is primarily described by “ufe1fe2” and secondarily by “ufe6fe7” since they are both prefixed with ‘u’, but one comes later. “ufe6fe7” is technically modifying “ufe1fe2”, even if “ufe1fe2” is also being more directly modified by the ‘i’-terms following it.
  3. Each of those ‘u’ terms is additionally modified by its adjacent ‘i’-term adjective modifiers.
  4. “ife5” is an adverb modifying “ife3fe4”.
  5. The “existence of” the “ufe1-8” entity is the u-term of the “afe11” relation.
  6. The entirety of that u-term has a primary adjective descriptor of “-fe9” and a secondary adjective descriptor of “ife10”.


Suppose the word for “dog” were “uhumosoviloja” (u/lhs, humo/beast, sovi/land, loja/loyal = “loyal land-beast”). How might you describe a disloyal dog, then? You would use a ‘u’ for stating it is a dog (the identifying aspect) and an ‘i’ for the disloyalty (the added-on description). The spectrum of loyalty (“loj”) would therefore show up twice.

“uhumosoviloja ilojo” => “disloyal dog”

For clarity purposes, you may even split up the “loja”, but that wouldn’t impact the meaning since “uloja” still has a higher priority than “ilojo”.

“uhumosovi uloja ilojo” => “disloyal dog” (equivalent)

Let’s say there were actual distinctions between words though. How about we take the noun phrase “big wood box room”? Here’s the necessary vocabulary:

“sysa” => “big/large/to be big/amount”
“lijavena” => “rigid thing of plants” => “wood/wooden/to be wood”
“tema” => “of or relating to cubes”
“kita” => “of or relating to rooms and/or enclosed spaces”

Now let’s see some adaptations:

  1. ukitasysa => an “atrium”, a “gym”, some space that, by definition, is large.
  2. ukita usysa => same thing.
  3. ukita isysa => a room that happens to be relatively big.
  4. ukita utema ulijavena usysa => cube room of large-wood.
  5. ukita utema ulijavena isysa => cube room of large-wood.
  6. ukita utema ilijavena usysa => room of wooden large-boxes.
  7. ukita itema ulijavena usysa => a cube-shaped room of large-wood.
  8. ikita utema ulijavena usysa => [something] related to rooms that is cube-shaped, wooden, and large.
  9. ukita utema ilijavena isysa => a room of large-wood cubes.
  10. ukita itema ilijavena usysa => the naturally large room associated with wooden cubes.
  11. ikita itema ulijavena usysa => [something] related to cube-shaped rooms that is a large-plant.
  12. ukita itema ilijavena isysa => the room of large-wood boxes.
  13. ikita itema ilijavena usysa => [something] related to plant-box rooms that is an amount. (an inventory of greenhouses or something?)
  14. ikita itema ilijavena isysa => [something] related to rooms of large-wood boxes.
  15. ukita usysa utema ilijavena => large room of wooden boxes.
  16. ukita utema usysa ilijavena => room of wood-amount boxes.
  17. ukita utemasysa ilijavena => room of wooden big-boxes.
  18. ukita usysa iba ulijavena itema => A big-room and a cube-related plant.


Some of these are a little crazy and some of them are amazingly precise. The point is, we are achieving this level of precision using a vocabulary closer to the scope of toki pona. I can guarantee you that you would never be able to say any of this in a language as vague as tp, nor will it ever try to approximate this level of clarity. I can likewise guarantee that Lojban will never have a minified version of itself available for video games. Good thing we don’t need one: we have an alternative.

Conclusion

As you can see, tokawaje combines the breadth, depth and computer-accessibility of Lojban with the simplicity, intuitiveness, and human-accessibility of toki pona.

For those of you wanting the TL;DR:

The invented language, tokawaje, is a spectrum-based language. Clarity of pronunciation, compactness of vocabulary (150 words), and combinatoric techniques for inventing concepts all lend the language great accessibility for new users. On the other hand, a sophisticated morphology and grammar with clear constraints on word formation, sentence structure, and their associated syntax and semantics result in a language that is well-primed for speedy parsing in software applications.

More information on the language can be found on my commentable Google Sheets page here (concepts on the Dictionary tab can be searched for with “abc:” where “abc” is the 3-letter root of the word).

This is definitely the longest article I’ve written thus far, but it properly illuminates the faults with pre-existing languages and addresses the potential tokawaje has to change things for the better. Please also note that tokawaje is still in an early alpha stage and some of its details are liable to change at this time.

If you have any comments or suggestions, please let me know in the comments below or, if you have specific thoughts that come up while perusing the Google Sheet, feel free to comment on it directly.

Next time, I’ll likely be diving into the topic of writing a parser. Hope you’ve enjoyed it.

Cheers!

Next Article: Coming Soon!
Previous Article: Relationship and Perception Modeling

Minecraftian Narrative: Part 6

Table of Contents

  1. What is “Minecraftian Narrative”?
  2. Is “Toki Pona” Suitable for Narrative Scripting?
  3. Interface and Gameplay Possibilities
  4. Toki Sona Implementation Quandaries
  5. Dramatica and Narrative AI
  6. Relationship and Perception Modeling
  7. Evolution of Toki Sona to “tokawaje”

Introduction

Previously, we identified two narrative AIs: the StoryMind that manages story development and content generation behind the scenes, and the Agent that simulates the behaviors of a character. The Agent consults with a Character while interpreting narrative scripting input. It then relays instructions to the Vessel that executes those instructions in the virtual world on behalf of the Character. Today, we’ll explore how an Agent could model socio-cultural constructs, account for multiple layers of interactive perceptions, and integrate narrative scripting into each of these.

Vessel = gameplay logic, Character = personnel record, Agent = interpretation AI logic

Amorphic Relationship Abstractions

In games, programmers will often construct objects in the game world based on a flexible design called the “Component” design pattern. This technique builds game objects less by focusing on a hierarchy (a Dragon is a Beast is a Physical is a Renderable is an Object), and more by attributing generic qualities to them which can be added or removed as needed. The objects then simply function as amorphous containers for these “components” of behavior. You don’t have a “dragon”, you have a “fire-breathing”, “flying”, “intelligent” “serpentine” and “animalistic” object that “occasionally attacks cities” which we simply label as a dragon. A player could then talk with the dragon and convince it to become more peaceful mid-game. The “Component” system is what allows us to dynamically change the dragon Character by simply removing the city-attacking behavior.
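For readers who haven’t seen the pattern in code, a bare-bones sketch (Python; all names here are purely illustrative):

class Entity:
    """An amorphous container for components of behavior."""
    def __init__(self, label):
        self.label = label
        self.components = {}

    def add(self, component):
        self.components[type(component).__name__] = component

    def remove(self, component_name):
        self.components.pop(component_name, None)

class FireBreathing: ...
class AttacksCities: ...

dragon = Entity("dragon")
dragon.add(FireBreathing())
dragon.add(AttacksCities())
# The player persuades the dragon to become peaceful mid-game:
dragon.remove("AttacksCities")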


This same model would seem to be extremely effective at describing our relationships in life. Relationships are amorphous and are often interpreted by context: which behaviors actually exist between two entities, and which behaviors are expected. Let’s say you are trying to understand whether you are in a “friendship” relationship with someone. If the other person isn’t doing what you are expecting a friend to do, then the likelihood that your unknown, actual relationship is the suspected one decreases. On the flip side, if you expect your friends to quack like a duck, and this random person does quack like a duck at you, then you have found someone who is likely to become your friend (though, you and she would be a bit weird). Critical to this is how each Character may have its own definition of what behaviors constitute “friendship.”

In addition, when one evaluates the satisfaction of a relationship, one typically focuses on the behaviors one wishes to engage in with others. However, the person doesn’t then start engaging in new behaviors immediately in the context of their old relationship; they first prioritize changing the relationship itself, so as to make the sought-after behavior more acceptable to the other party.


For example, if a boy likes a girl, he shouldn’t (necessarily) immediately go to her home and declare his love, but perhaps get her to have a “familiar”, then “associate”, and then “friend” relationship first (though the way a relational path from relationship A to B is calculated for any given Character would be a function of the Character’s personality).

The implication is that these kinds of procedurally generated relational pathways can lead to characters that naturally develop a variety of human-like behaviors as they decide on a goal, calculate a possible social path towards that goal, and then further break down ways in which to change the situation they are in to meet their goals. This is related to the concept of hope. When you hope for someone to engage in a given behavior, then you are really stating that you will be more satisfied if you are in a relationship where those kinds of behaviors are expected (and where the person actually does those behaviors, indicating that they actually are in that relationship with you).


For example, a “daughter” entity D and a “father” entity F exist in the following way:

  • D & F both have the same expectations of F such that they both agree F is a “biological father” with D (he is responsible for impregnating her mother).
  • D & F both have the same expectations of each other such that they both agree F is a “guardian” with D (he houses/feeds/protects her, & pays for her healthcare/schooling, etc.).
  • D & F have different expectations of each other such that F believes he has a “fatherhood” relationship with D, but D does not believe this. F is always working, and D wishes he would play with her more often and come to her public achievements in school. As such, D is not satisfied, since her conception of the “fatherhood” relationship is not being fulfilled.
  • Because D and F can both have different variations of the same relationship expectations, an AI will be able to support a system in which D and F may talk to each other “about” the same topic, but be thinking of totally different things (simulating the naturally confusing elements of the human condition). This is because the label for the relationship is equivalent, but the definition of each person’s idea of the relationship consists of different behaviors.

Tiered Relationship Expectations

Furthermore, the variations in relationship expectations may diverge at the individual level or group level. We may be able to assume that the vast majority of people within a given “group” have similar beliefs regarding one topic or another. However, we must also consolidate a hierarchy of relational priorities: for any given person, individual expectations override any group-level ones, and different groups will have various degrees to which they influence the individual’s social expectations of others. Let’s take The Legend of Korra as an example.


In this world, some people, called “benders,” can control a certain element (fire, earth, water, or air). Previously, the differences between types of benders and the cultures they came from led to conflict in the world. In “Republic City” however, those people can now live in peace with one another. This represents a “national” group with cultural expectations of uniting people despite their different cultures.


But those who have no powers at all are subject to the prejudice and general economic superiority of the “benders”. From there spawns the political activist and terrorist group: the Equalists. This group adds another layer on top of the “national” group layer. An Equalist who still believes in people’s capacity to unite is simply someone who is not as loyal to the Equalist cause. Whether that person still holds out hope for a positive relationship between the Republic City entity and themselves is something others may notice and consider when evaluating their actions. The Equalist leader will see this person as someone who must be further manipulated toward the cause, whereas peacekeepers will seek to redirect this person’s emotionally-driven efforts at reform.

Finally, we have the individual level, which supersedes all group-level social expectations. Say one of these questionably-loyal Equalists also has another expectation that relates to equality: they believe a city should always be concerned about the well-being of the diversity of animals in the Avatar world. Animal-care efforts by the city therefore play a larger role in currying favor with this particular person, even if their Equalist position still puts them in a climate of distaste for the city. This person, like many who might join that organization, likely has a variety of internal conflicts to manage, expectations and needs battling for dominance of the mind. This is how our Agents should handle decision-making: through a diverse conflict of interests.


Relationship Modeling

So, how to actually take this relational concept and model it in a way the computer can understand? Well, let’s first define our terms:

  • Narrative Entity: A basic “thing” in the narrative that has narrative relevance. Can be a form of Life, a non-living Object, a Place, or an abstract idea or piece of Lore. This is “what” the thing is and implies various sorts of properties. It also places default limits on things (for example, a Lore cannot be interacted with physically).
  • Character: A Narrative Entity that has a ‘will’, i.e. can desire for itself a behavior. This is “who” the thing is (ergo, it has a personality) and implies what sorts of behaviors it would naturally engage in.
  • Role: The name a Narrative Entity assumes under the context of its behaviors in a Relationship.
  • Relationship: The set of behaviors that have occurred between two Narrative Entities. They will always be binary links between NEs and may or may not be bidirectional, i.e. an Entity may not even have to do anything to be in a Relationship. It may not even be aware that it is in a Relationship with another entity.

A behavior as we will see it is defined by some action or state change. For actions, there is a source for this behavior and an object. As such, we can graphically portray transitive behaviors as directed lines running from one Narrative Entity to another. For intransitive verbs, we simply have a directed line pointing to a Null Entity that represents nothingness.

Rather than simply have these be straight lines however, it is easier to think of them as longitudinal lines between points on a globe.

Renderings may not necessarily place them at equidistant positions. This is merely the simplest rendering.
At each pole is a Narrative Entity, and the entirety of the globe encompasses the actual relationship between the two. We can then define a set of “ideal” relationships that have their own globes of interactions. By checking the degree to which the ideal is a subset of the actual, we can calculate the likelihood that the actual includes the idealized relationship. This is an application of mathematical set logic to identifying and relating relationships.
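The core of that check is tiny in code. A sketch (Python; the behavior names are placeholders of my own):

def likelihood(actual: set, ideal: set) -> float:
    # How much of the idealized relationship is present in the actual one?
    return len(actual & ideal) / len(ideal) if ideal else 0.0

friendship = {"greets", "shares food", "defends"}
history = {"greets", "defends", "quacks like a duck"}
print(likelihood(history, friendship))  # 0.67 -> plausibly friends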

I further propose that these globular relationships have a sequence of layers: a core globe summarizing the history of behaviors that have occurred between the two entities, and an intermediate layer composed of hoped-for behaviors for any given Character.

The intermediate layer is far more complex since it is both hypothetical and subjective between any 2 Characters (visualized as two clearly-divided hemispheres) or a Character and a Narrative Entity (a globe).  The intermediate (hemi)sphere(s) would be calculated from an algorithm that takes into account the historical core of the relationship and the associated Character’s personality. Given Character goals X and past interactions Y, what type of relationship, i.e. what collection of behaviors does the Character wish to have with the target of the relationship?

Picture each division of this orange as the source hemisphere of two respective Characters: clearly divided, yet maintaining the same directed-lines-as-globe structure.

Perception Modeling

Furthermore, we must ensure that we can simulate the accumulation of knowledge and its questionable nature: how are we to model perceptions of knowledge, e.g. “I suspect that you ate my cookies”?

In this scenario, Person 1 is fairly certain Person 2 stole their cookies, but Person 2 has not yet even realized that Person 1’s cookies are missing. Person 2 also does not know how Person 1 obtained the cookies.
For this, we must allow even behaviors themselves to be abstracted into Narrative Entities that can be known, suspected, or unknown. Without this recursiveness, without the ability to form interactions between Characters and knowledge of interactions, you cannot replicate more complicated scenarios such as…

  • A actually knows a secret S1.
  • B hopes to know S2.
  • A suspects B wants to know S1 and therefore attempts to hide their knowledge of S1 from B.
  • B has reason to believe that A knows S2, so B pays more attention to A, but tries to avoid revealing this suspicion to A.
  • A has noticed B’s abnormal attention directed at him/her, so A surreptitiously engages in a behavior X to help hide the “way” of learning about S1.
  • C witnesses X and tells B about it, so B is now more confident that A knows about S2.
  • (We don’t even necessarily know if S1 and S2 are the same secret).
  • etc.

With this quick example, you can see how perceptions need to be able to have various degrees of confidence in behaviors (actions and state changes) to help inform the mentalities of Agents.
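One way to capture that recursiveness (a sketch of my own, not a finalized design) is to let a belief point at a behavior, whose target may itself be another behavior:

from dataclasses import dataclass
from typing import Union

@dataclass
class Behavior:
    source: str
    verb: str
    target: Union[str, "Behavior"]  # a target can be another Behavior -> recursion

@dataclass
class Belief:
    holder: str
    about: Behavior
    confidence: float  # 0.0 = unknown ... 1.0 = known

# "A suspects B wants to know S1":
wants_s1 = Behavior("B", "wants to know", "S1")
a_suspects = Belief("A", wants_s1, confidence=0.7)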

Narrative Scripting Integration

As far as codifying these Entities and Behaviors goes, that is where the narrative scripting comes into play. Every Entity, every Behavior, and therefore every Relationship is described solely in terms of narrative scripting statements. This prevents situations where the technology must do an intermediate translation into another language while interpreting scripted content into logical meaning.

So, for example, Cookies might have the following abstract description:

Cookies…

  1. are a bread-based production.
  2. have a flavor: (usually) sweet.
  3. have a shape: (usually) small, circular, and nearly flat.
  4. have a source material: bread-based semi-solid.
  5. have a creation method: (usually) heated in a box-heat-outside (i.e. “oven”, distinct from the box-heat-inside, i.e. “microwave”).

These properties are defined in an order of priority: if something refers to an entity that is a bread-based treat, small and circular, the computer would evaluate that statement with higher confidence as the entity sharing the other qualities (“sweet”, “made in an oven”, “nearly flat”, etc.) versus another entity described as having a different shape or taste.
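As a sketch of that priority ordering (Python; the 1/(i+1) weighting is an arbitrary choice of mine, not a fixed part of the design):

def match_confidence(entity_props, observed):
    # earlier-listed (higher-priority) properties contribute more weight
    weights = [1.0 / (i + 1) for i in range(len(entity_props))]
    matched = sum(w for w, p in zip(weights, entity_props) if p in observed)
    return matched / sum(weights)

cookie = ["bread-based", "sweet", "small flat circle", "semi-solid source", "oven-made"]
print(match_confidence(cookie, {"bread-based", "small flat circle"}))  # ~0.58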

Conclusion

With innumerable globes of interactions and perception lines linking everything together, a fully-rendered model might look something like this:

Some of you may recognize this from the article on “Modeling Human Behavior and Awareness”
This concept has really grown out of a pre-existing theory I developed on how to model these same kinds of behaviors. No doubt it will receive revisions as an actual implementation gets underway, but before we get to that, we’ll have to dive once more into the field of linguistics.

The great break in content between articles here is because I’ve been hard at work on developing my own constructed language that is quite distinct from Toki Pona/Sona. To hear about the reasons why, and what form this new language will take, please look forward to the next article.

As always, comments and criticisms are welcome in the comments below. Cheers!

Next Article: Evolution of Toki Sona to “tokawaje”
Previous Article: Dramatica and Narrative AI

Minecraftian Narrative: Part 5

Table of Contents

  1. What is “Minecraftian Narrative”?
  2. Is “Toki Pona” Suitable for Narrative Scripting?
  3. Interface and Gameplay Possibilities
  4. Toki Sona Implementation Quandaries
  5. Dramatica and Narrative AI
  6. Relationship and Perception Modeling
  7. Evolution of Toki Sona to “tokawaje”

Introduction

Today’s the day! We’ve gotten an idea of what form Toki Sona-based narrative scripting will take, and we’ve examined some of the concerns regarding its integration and maintenance with code. Now we’re finally going to dive into my favorite part: theorizing the behavior of classes that would actually use Toki Sona and react to it.

The most brilliant examples of media, in my opinion, are those which exhibit the Grand Argument Story. These stories have an overarching narrative with a particular argument embedded within, advanced throughout the experience by the main character and those he or she meets as they personify competing, adjacent, or parallel ways of thinking.

But how are we to teach a computer the narrative and character relationships as they appear to us? Thankfully, a well-fleshed out narrative framework already exists to help us as we figure it out. Its name is Dramatica, and from it, we shall design the computer types responsible for managing a dynamic narrative: the Character, Agent, and StoryMind.

Brief Dramatica Overview

The Dramatica Theory of Story is a framework for identifying the functional components of a narrative. In its 350-page introductory book, available for free on their website (the advanced book can be found over here too), it defines a set of story concepts that must exist within a Grand Argument Story for it to be fully fleshed out. If anything is missing, the story will be lacking. To be honest, as a writer, I find the level of detail it gets into rather jaw-dropping. Its creators even had to create a software application just to help writers manage the information from the framework! How detailed is it? Check this out…

Dramatica defines four categories: Character, Plot, Theme, and Genre.

It also defines 4 “Throughlines” which are perspectives on the Themes.

  • Overall Story (OS) = The story summarized as everyone experiences it. A dispassionate, objective view.
  • Main Character Story (MC) = The story as the main character experiences it. The character we relate to, experiencing inside-out.
  • Influence Character Story (IC) = The story as the influential character experiences it. The character we sympathize/empathize with, experiencing from the outside-in.
  • Relationship Story (SS) = The story viewed as the interactions between the MC & IC. An extremely passionate, subjective view.

Within Theme, there are 4 “Classes” that have several subdivisions within them.

  • Universe: External/State => A Situation
  • Physics: External/Activity => An Activity
  • Psychology: Internal/Activity => A Manner of Thinking
  • Mind: Internal/State => A State of Mind

One Throughline is matched to each of the Classes so that, for example, the MC is mainly concerned about dealing with a state of mind, the IC is trying to avoid a situation related to his/her past, the community at large is freaking out about the ongoing activity of preparing for and running a local tournament, and there is an ongoing difference in methodologies between the MC and IC that draws tension between them.

Each Class can be broken down into 64 elements. Highlighted: Universe.Future.Choice.Temptation Element.

For each Class, you select 1 Variation of a Concern per story. The 4 Plot Acts (traditionally exposition, rising action, falling action, and denouement) each then shift between the 4-Element quad within the chosen Variation. Since Variations each have a diagonal opposite, diagonal movements (a “slide”) don’t change the topic Variation as intensely as shifting Variations horizontally or vertically (a “bump”).


For example, the Universe.Future.Choice Variation has two opposing Elements, “Temptation” and “Self-Control”, plus another two, “Logic” and “Feeling”. Notice these are two distinct, albeit related, spectra of the human experience that come into play when making decisions about the future regarding an external situation that must be dealt with. Shifting topics from Temptation to Self-Control wouldn’t be as big of a change as going to Logic or Feeling, since the former deals with the same conflicting pair of Elements.

Each of those Elements can be organized with the Acts in a number of permutations. 3 patterns arise, each of which has 4 orientations and can be run forwards or backwards (2). That gives 24 possible permutations for each Variation. 16 Variations per Class, 4 Classes per story, and then times 4 again since each of the 4 Throughlines can be paired with a Class. Altogether, that comes out to 6,144 possible Plot-Theme permutations.

The Theme Classes are also matched up with Genre categories which can help the engine identify what sort of content to create at a given point of the story (doesn’t increase multiplicity).

The merging of Plot with Genre

On top of that, there are the Characters to consider. There are 8 general Archetypes, each of them composed by combining a Decision Characteristic and an Action Characteristic for each of 4 aspects of character: their reason for toiling, the way they do things, how they determine success, and what they are ultimately trying to do.


You can make any character by combining 2 Characteristics from 2 unopposed Archetypes. That gives 7! permutations of any given characteristic within an aspect (never pairing a characteristic with its opposite). 5,040 * 4 aspects * 2 characteristics = 40,320 permutations of Characters, optimally.

Finally, there’s the number of Themes that can be delivered by the external/internal successes and failures of the MC…

4 Possibilities

…and whether the MC and IC remained steadfast in their Class or changed (e.g. did they stay/change their state of mind?) and the success/failure thereof.


That makes 4 * 4 possible endings: 16.

PHEW! Okay, now, altogether that’s 16 endings * 40,320 characters * 6,144 plots…

Carry the 3…there we go:

3.96 billion. Stories.

And that’s without even “skinning” them as pirate, sci-fi, fantasy, take your pick.

Needless to say, these kinds of possibilities are EXACTLY the sort of variation we should be looking for in procedural narrative generation. Even if you knocked out the Informational genre in the interest of counting only the non-edutainment games, that still leaves about 2.97 billion possibilities. Good odds, I say.
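For anyone who wants to check the arithmetic (Python):

plots = 24 * 16 * 4 * 4    # Act orderings x Variations x Classes x Throughline pairings = 6,144
characters = 5040 * 4 * 2  # 7! x 4 aspects x 2 characteristics = 40,320
endings = 4 * 4            # 16
print(plots * characters * endings)           # 3,963,617,280 ~ 3.96 billion
print(plots * characters * endings * 3 // 4)  # 2,972,712,960 ~ 2.97 billion sans one genre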

Also, keep in mind, any given video game will often have several sub-stories within the overarching story, ones where minor characters have their own stories to explore and see themselves as the Main Character and Protagonist of their own conflict. In these stories, you, the original main character, may play the role of Influence Character (think Mass Effect 2 loyalty missions if you’ve ever played that: every character’s unique storyline is critically affected by the decisions you make while accompanying them for a personal, yet vital journey). Assuming any given story has, say, 9 essential characters (a pretty small number by procedural generation standards, but pretty normal for children’s books), any single gameplay experience may involve 26.73 billion story arrangements.

It isn’t just Dramatica’s variability that makes it so appealing, though. Each of these details is designed to be clearly identified and catalogued. This has two important consequences. The first is that the engine will know what goes into making a good story and will therefore know how to create a good story structure from scratch. The second, and far more important to us, is that the engine will know when and how any of these qualities are not present or properly aligned. It will therefore understand what has happened to the story when the player changes things and how to fix it. Even better, because of its understanding of related story structures, it will even be able to adapt with completely new story forms should it wish to.

Head hurting yet? Fantastic! Let’s dig into characters as computer entities.

Characters & Agents

While Dramatica gives us the functional role of Characters, it doesn’t really flesh them out properly. Unfortunately, writers don’t really maintain a consolidated list of brainstorming material, but you can find several odds and ends around the Internet (list of character needs, list of unique qualities for realistic characters, and a character background sheet, for example). Any and all of these can be used to help flesh out and define the particular aspects of our Characters, beyond just their functional role.

The main interest we have with these brainstorming materials is to define a set of fields that an AI can connect Toki Sona inputs to. Given some Toki Sona instruction A, a definition of Character B, and a certain Context C, what course of action D should I take? Answering this question is the job of the Agent.

What exactly is an Agent? It is the singular existence representing the computer logic for the entirety of an assigned Character. In our case, we’re going to define a Character as ANY Narrative Entity that has (or could resume having) a will of its own. A Narrative Entity would simply be anything that requires a history of interactions to be recorded, such as a Life, an Object, a Place, or a piece of Lore.


Notice that characters don’t have to be living beings specifically. For example, an enchanted swamp may have an intelligence living amongst the trees. It would most certainly be a Character; however, it would also definitely be a Place that people can enter, exit, and reside in. As a swamp entity would be the embodiment of the land, the plants, and the animals within, one could also extend its attributes to Life as well. As a result, we’d have the swamp Agent that accesses the Character, which in turn maintains properties of both the Life and Place for the swamp Narrative Entity.

Sample low-effort UML Class Diagram for the Agent Subsystem (made with UMLet)

In the diagram above, we specify that a single Agent is responsible for something called a Vessel rather than for a Character directly. What’s more, the Vessel can “wear” several Characters! What is the meaning of this?

Let’s say we wished to create a Jekyll & Hyde story. Although Jekyll and Hyde have different personalities, they also share a body. Whatever one is doing, wherever one is, the other will also be doing the moment they switch identities. This relates back to assets too. Whatever one sprite/model animation will be doing, the other will also be doing when those assets are switched to another set. In this way, Characters and Vessels are fully changeable without affecting the other. A multiple personality character might change Characters while not changing Vessels. A shapeshifting character might change Vessels while not changing Characters. In the case of Jekyll and Hyde, it would be a swap for both Character and Vessel as their personalities and bodies are BOTH different, but it will always be tied to the same location and activity at the time of switching.
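In rough class form, the split looks something like this (a sketch mirroring the diagram above; the method names are my own):

class Character:
    """Personnel record: personality, knowledge, goals."""

class Vessel:
    """The in-engine embodiment: assets, position, animation."""
    def __init__(self, characters):
        self.characters = characters   # a Vessel can "wear" several Characters
        self.active = characters[0]

    def switch_character(self, character):
        self.active = character        # Jekyll <-> Hyde: same body, same place

class Agent:
    """AI that consults the active Character and instructs the Vessel."""
    def __init__(self, vessel):
        self.vessel = vessel

    def act(self, script_input):
        # decide based on self.vessel.active's personality/knowledge,
        # then relay instructions to self.vessel
        ...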


So, the Agent is just an AI that doesn’t care what it’s controlling or to what ends. It looks to the Character to figure out what it narratively should and can do, and it issues instructions based on that to the Vessel. It doesn’t care whether the Vessel knows how to do it. It simply assumes the Vessel will know what the instructions mean. In the process, we’ve divorced the concept of a Character 1) from the in-story and in-engine thing that they are embodied as and 2) from the logic that figures out what a given Character should do given a set of Toki Sona inputs from the interpreter.

The last important thing to note about the Characters and Agents here is that the Agents are informed, context-wise, by their associated Characters. As such, an Agent’s decisions are constrained by their Vessel’s current Character; only its acquired knowledge, background of experience and skills, and personality will invoke behavior. An Agent will therefore factor into its decision-making the Character’s history of perceptions, likes and dislikes, attitude, goals, and everything else that constitutes the Character. It then translates incoming Toki Sona instructions into gameplay behavior. For example, what might a Character do if asked, “What do you know about the aliens?”


Maybe they don’t know much about the aliens. Or maybe they do, but it’s in their best interest to only reveal X information and not Y. But maybe they also really suck at lying, so you can see through it anyway. How will they know exactly what to say? How will they say it? Does the personality invite a curt, direct response, or do they swathe the invading aliens with adoration and delight in a giddy, I’m-too-obsessed-with-science kinda way?

The StoryMind

Finally, we address the overall story controls: the StoryMind. In Dramatica, the StoryMind is the fully formed mental argument and thought-process that the story communicates. In our context, the StoryMind is the computer type responsible for delivering the Dramatica framework’s StoryMind. It understands the possible story structures and makes decisions regarding whether the story can reasonably deliver the same themes with the existing Characters and Plot or whether it will need to adjust.

The StoryMind will have full and total control over that which has yet to be exposed to a human player within the story. Its job is to generate and edit content to deliver a Grand Argument Story of some kind to the player. What might this look like?

Story time:


Typical Fantasy RPG world/game. You’re a strength-and-dexterity-focused mercenary and you’ve developed a bit of a reputation for taking on groups of enemies solo and winning with vicious direct onslaughts. You’re walking through town and come across a flyer about a duke’s kidnapped heir (one of a few pre-generated premises made by the StoryMind). You ask a barkeep about it (and it alone), so the StoryMind begins to suspect that you may be interested in pursuing this storyline further (rather than whatever other premises it had prepared for you). It therefore begins to develop more content for this premise, inferring that it will need that story information soon. In fact, it takes the initiative.

You are blocked in the road by a woman named Steph who overheard you outside the bar and wishes to accompany you on your journey to rescue the heir. She says that she’s a sorceress with some dodgy business concerning the duke and she needs a bargaining chip. Let’s say you respond with “Sure. I only want the duke’s money,” (in Toki Sona of course). All of a sudden, the StoryMind knows a couple of things:

  1. You care more about the reward money than pleasing the duke.
  2. Because you have already invited risk into your relationship with the yet-to-be-met, quest-giving duke, you are even more likely to behave negatively towards this particular duke and his associates in the future. You also might have a natural bias against those of a higher social status (something it will test later perhaps).
  3. You have some level of trust towards Steph, though it’s not defined.
  4. You are not a definitive loner. You accepted a partnership, despite your past as a solo mercenary. But how deep does this willingness extend? It’s possible it might be worth testing this too.

Since you may have related goals, the StoryMind sets her up as the Influence Character. It randomly decides to attempt a “friendly rivals / romance?” relationship (partnership of convenience), modifying her Character properties behind the scenes so that she is similar to you (based on your actions and speech).
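A hedged sketch of how the StoryMind might record the four inferences above as weighted beliefs to test later (the keys and weights are invented for illustration):

using System.Collections.Generic;

// Each belief is a weighted hypothesis the StoryMind can later test.
var playerModel = new Dictionary<string, float>
{
    ["values:reward_money"] = +1.0f, // (1) money over the duke's approval
    ["bias:this_duke"]      = -0.5f, // (2) primed against this duke and his associates
    ["bias:high_status"]    = -0.2f, // (2) possible class bias, worth testing later
    ["trust:steph"]         = +0.3f, // (3) some trust, magnitude not yet defined
    ["trait:loner"]         = -0.4f  // (4) weakened by accepting a partnership
};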

Along the way, a group of goblins ambush and surround you both, so you dash in to slaughter the beasts. The StoryMind may have been designing Steph to support you, but unbeknownst to you, in the interest of generating conflict, it changes some of Steph’s settings! Steph yells for you to stop, but you ignore her and slash through one of them to make an opening. In response, Steph sparks a blinding light, grabs your hand, and runs away in the ensuing chaos.

As soon as you’re clear, she starts yelling at you, asking why you wouldn’t wait. After you get her to calm down a bit and explain things, she confides that she is hemophobic and can’t stand to see, smell, or be anywhere near blood. She’d prefer to stealthily knock out, sneak past, trick, or bloodlessly maim those who stand in her way. How will you react? Astonishment? Scorn? Sympathy? Is this a deal breaker for your temporary partnership? Remember, she’s always paying attention to you, and so is the StoryMind. This difference in desired methodologies is but a small part of the narrative the StoryMind is crafting.

  • Throughline Type: Class.Concern.Variation.Element, Act I
    • Description
    • Genre
      ——
  • Overall Story Throughline: Physics.Obtaining.SelfInterest.Pursuit
    • A dukedom heir has been kidnapped.
    • Entertainment Through Thrills: Pursuit of endangered royalty.
  • Influence Character Throughline: Mind.Preconscious.Worry.Result
    • Steph is worried about how to deal with her hemophobia. (The StoryMind generates this shortly afterward:) She can’t find work beast-slaying or healing because of it, and is now low on money. The duke is evicting her, despite her frequent requests for bloodless work as payment. Everything’s so stressful, and it’s all because of that stupid blood!
    • Comedy of Manners: the almighty sorceress, the bane of beasts and harbinger of health, brought down by the mere sight of blood.
  • Relationship Throughline: Psychology.Conceiving.Expediency.Protection
    • You’d prefer to hack away at enemies, but she can’t stand blood and prefers alternative approaches to removing obstacles. As such, you each think differently about how obstacles should be dealt with.
    • Growth Drama
  • Main Character Throughline: Universe.?
    • If nothing interrupts the progress of the other 3, the StoryMind is fit to throw external, state-related problems your way, and these problems will necessarily dig into deeper, thematic issues. For example…
    • Universe.Progress.Threat.Hunch
      • You eventually learn that those who took the heir have loose ties to the duke himself. Since you’re in pursuit to rescue him/her, you have a hunch that you may be a target soon as well. You need to unearth the mystery surrounding this. Does this impact your ability to trust the various characters you come across?
      • Entertainment Through Atmosphere: You’re experiencing a fantasy world!
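For the programmers: the Class.Concern.Variation.Element notation in the breakdown above could be captured in a small data type. A sketch, sampling only a few members of the Dramatica taxonomy:

public enum StoryClass { Universe, Physics, Psychology, Mind }

public struct Throughline
{
    public StoryClass Class;  // e.g. Physics
    public string Concern;    // e.g. "Obtaining"
    public string Variation;  // e.g. "SelfInterest"
    public string Element;    // e.g. "Pursuit"
}

// The Overall Story Throughline from the breakdown above:
var overallStory = new Throughline
{
    Class = StoryClass.Physics,
    Concern = "Obtaining",
    Variation = "SelfInterest",
    Element = "Pursuit"
};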

 

And to think, if you’d just said, “No, thanks. I’m more of a loner,” at the beginning, Steph might instead have developed as a hindering rival Influence Character who tries to steal the heir for herself, popping in and out of the story when you least expect it! Does she even know about the duke’s possible relation to the kidnapping? Too bad we’ll never find out. After all, you didn’t say that. The characters and experiences in this world are real and permanent. You live with your choices, build relationships, and engage with a game world that truly listens to you, more intimately than any other game has before.

Conclusion

I have found that Dramatica is an excellent starting point from which to build story structures and inform our StoryMind narrative AI of how to craft and manipulate storylines and characters. I hope you too are interested in the potential of this sort of system so that one day we might see it in action.

Also, it’s entirely possible I might have slightly messed up some calculations concerning the Dramatica system, as the book doesn’t do a great job of clearly defining the relationships in one place (they’re sort of scattered about the chapters). As far as I can tell, I’ve got them right, but I don’t terribly favor my math skills. I’d be happy to correct any mistakes someone notices.

In the future, expect to find an article diving into the hypothetical technical representation of Agents: their relationships, perceptions, and decision-making. Again, I’d love to hear from you below with comments, criticisms, and/or questions. Cheers!

Next Article: Relationship and Perception Modeling
Previous Article: Toki Sona Implementation Quandaries

Minecraftian Narrative: Part 4

Table of Contents:

  1. What is “Minecraftian Narrative”?
  2. Is “Toki Pona” Suitable for Narrative Scripting?
  3. Interface and Gameplay Possibilities
  4. Toki Sona Implementation Quandaries
  5. Dramatica and Narrative AI
  6. Relationship and Perception Modeling
  7. Evolution of Toki Sona to “tokawaje”

Introduction

At this point, I’ve communicated the basics of the Toki Sona language (a “story-focused” Toki Pona), its potential for simply communicating narrative concepts, and the types of interfaces and games that could exploit such a language.

This time, we’ll be diving into some of the nuts and bolts of actually interpreting Toki Sona and how it might tie into code. An intriguing array of questions comes into play due to Toki Sona’s highly interpretive semantics. The end result is a sort of exaggerated problem domain taken from Natural Language Processing. How much information should we infer from what we are given? How do we handle vague interpretations in code? And what do we do when the language itself changes through usage over time? Let’s start thinking…

Variant Details In Interpretation

What we ultimately want in a narrative engine is to be able to craft a computer system that can dynamically generate the same content that a human author would be able to create. To accomplish this, we must leverage our main tool: reducing the complexity of language to such an extent that the computer doesn’t have to compete with the linguistic nuances and artistic value that an author can imbue within their own work. Managing the degree to which we include these nuances requires a careful balancing act though.

For example, “It was a dark and stormy night…” draws into your mind many images beyond simply the setting. It evokes memories filled with emotions which an author may use to great effect in their manipulation of the audience’s emotional experience. Toki Sona’s focus on vague interpretation leaves many different ways of conveying the same concept, depending on one’s intent. Here are some English literal translations:

  • Version A: “When a black time of monstrous/fearful energy existed…”
    • tenpo-pimeja pi wawa-monsuta lon la, …
  • Version B: “This is the going time: The time is the black time. The air water travels below. As light of huge sound cuts the air above…”
    • ni li tenpo-kama: tenpo li tenpo-pimeja. telo-kon li tawa anpa. suno pi kalama-suli li kipisi e kon-sewi la, …

You’ll notice that version A jumps directly into communicating the tone that the audience should understand. As a result, it is far less particular about the scene’s physical characteristics, such as the weather.

Version B on the other hand takes the time to establish scene details with particulars (as specific as it can get, anyway). Although it takes several more statements to present the idea, it eventually equates itself loosely with the original English phrase. In this way, it manages to conjure emotions in the audience through imagery the same way the original does, but you can also tell that the impact isn’t quite as nuanced.

One of the key aspects of Toki Sona is that it is unable to include two independent clauses in a single statement. It is also unable to include anything beyond a single adverbial dependent clause in addition to the core independent clause. These restrictions help ensure that each individual statement has a clear effect on interpretation. Only one core set of subjects and one core set of verbs may be present. Everything else is simply detail for the singularly described content. As a result, a computer should be able to extract these singular concepts from Toki Sona more easily than it could from a more complex language.
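To illustrate why this matters computationally, here’s a minimal C# sketch. It assumes, as in Toki Pona, that the adverbial clause is marked off with the particle “la”; everything else is simplified, and the example sentence is invented:

// Split "X la Y" into the adverbial clause X and the independent clause Y.
// Because a statement has exactly one core clause, no recursion is needed.
public static (string Condition, string Core) SplitStatement(string statement)
{
    int laIndex = statement.IndexOf(" la ");
    return laIndex >= 0
        ? (statement.Substring(0, laIndex), statement.Substring(laIndex + 4))
        : (null, statement);
}

// e.g. SplitStatement("tenpo-pimeja lon la mi tawa")
//      => Condition: "tenpo-pimeja lon", Core: "mi tawa"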

So while both database queries and statistical probability calculations are factors in interpreting the text, the algorithms will rely more on the probabilities, since the database contents are comparatively small (there aren’t many vocabulary terms to track). The reliance on probability also stems from the fact that words frequently have several divergent meanings that could be relevant in a given context; as such, algorithms will often need to re-identify meanings after the fact, once successive statements have been interpreted.

Our difficulty comes in when we must identify how interpreted statements are to be translated into understood data. Version B is far more explicit about how things are to be added, while version A relies far more heavily on the interpreter to sort things out. How many narrative elements should the interpreter assume based on the statistical chances of their relevance? The more questionable elements are added, the more items we’ll need to revisit for every subsequent statement. After all, future statements could add information that grants us new insight into the meaning of already stated terms.

To illustrate this, let’s break down how the interpreter might compose a scene based on these statements into pseudocode, starting with version B. We’ll leave English literal translations in and identify them as if they were Toki Sona terms.

Version B
contextFrames[cxt_index = 0] = cxt = new Context(); //establish 1st context

"This is the going time:" =>
contextFrames[++cxt_index] = new Context(); //':' signifies new context
cxt = contextFrames[cxt_index]; //future ideas added to new context
cxt += Timeline(Past); //Add the "time that has gone" to the context

"The time is the black time." =>
cxt += TimeOfDay(Night) //Add the "time of darkness" to the context

"The air water travels below." =>
cxt += Audio(Rain) + Visual(Rain) // Add "water of the air" visuals. Audio auto-added.

"As light of huge sound cuts the air above..." =>
cxt += {Object|Visual}(Light+(Sound+Huge)) >> Action(Cut) >> Visual(Sky+Air);
cxt += Mood(Ominous)?
...
// The scene includes a light that is often associated with loud noises. These lights (an object? A visual? Is it interactive?) are cutting across the "airs in the sky", likely clouds. All together, this combination of elements might imply an ominous mood.

Version A
contextFrames[cxt_index = 0] = cxt = new Context(); //establish 1st context

"When a black time of monstrous/fearful energy existed..." =>
cxt += TimeOfDay(Night)? + Energy(Terrifying)? + Mood(Terrifying) + Mood(Ominous)?
...
// Establish night time and presence of a terrifying form of energy in the scene. Based on these, establish that the mood is terrifying in some way with the possibility of more negatively toned content to follow soon. Possible that "monstrous energy" may imply a general feel rather than a thing, in which case "black time" may reference an impression of past events as opposed to the time of day.

To emphasize ease of use and make a powerful assistance tool, it’s best to let the interpreter do as much work as possible and simply update previous assumptions as new information is introduced. That way, even if users input only a small amount of information, it will feel as if the system is anticipating their meaning and understanding them effectively. Doing otherwise would save significantly on processing time, but it would leave far too many assumptions that fail to account for the full context, which would in turn produce terrible errors in interpretation. Figuring out exactly how the data is organized and how the interpreter will make assumptions will be its own can of worms that I’ll get to some other day.
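Here’s a minimal sketch of that “interpret eagerly, revise later” loop, with all of the genuinely hard parts stubbed out:

using System.Collections.Generic;

public class Interpretation
{
    public string Meaning;    // e.g. "TimeOfDay(Night)"
    public bool Tentative;    // the '?' marks from the pseudocode above

    public void Reevaluate(List<Interpretation> context)
    {
        // Promote, revise, or discard this assumption given everything now known.
    }
}

public class Interpreter
{
    private readonly List<Interpretation> assumptions = new List<Interpretation>();

    public void OnStatement(string tokiSonaStatement)
    {
        // 1. Extract as much as the context allows, flagging weak guesses.
        assumptions.AddRange(Interpret(tokiSonaStatement));

        // 2. Revisit earlier tentative assumptions in light of the new statement.
        foreach (var a in assumptions)
            if (a.Tentative) a.Reevaluate(assumptions);
    }

    private IEnumerable<Interpretation> Interpret(string statement)
    {
        yield break; // the actual interpretation model is the hard part
    }
}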

Data Representation

An additional concern is identifying the various ways that words will be understood logically as classes or typenames, hereafter “types” (for the non-programmers out there, this is the organization the computer uses to identify the relationships and behaviors between terms). Examples in the above pseudocode include TimeOfDay, Visual and Audio elements, etc. Ideally, each of these definitions would alter the context in which characters exist, informing their decision-making and impacting the kinds of events that might trigger in the world (if anything like that should exist).

One option would be to create a data structure type for each Toki Sona word (there’d certainly be few enough of them memory-wise, so long as a short-cut script were written to auto-generate the code). Having types represent the terms themselves, however, is quite unreliable as we don’t want to have to alter the application code in response to changes in the language. Furthermore, any given word can occupy several syntactic roles depending on its positioning within a sentence, and each Toki Sona word in a syntactic role comes with a variety of semantic roles based on context.

For example, “kon”, the word for air, carries a variety of meanings. As a noun, it can mean “air”, “wind”, “breath”, “atmosphere”, and even “aether”, “spirit”, or “soul” (literally, “the unseen existence”). These noun meanings are then re-purposed as other parts of speech. The verb to “kon” means to “breathe” or, if being creative, it could mean “to pass by/through as if gas” / “to blow past as if the wind”. To clarify, when one says, “She ‘kon’s” or “She ‘kon’ed”, one is literally saying “she ‘air’ed”, “she ‘wind’-ed”, “she ‘soul’-ed”, etc. The nouns themselves are used AS verbs, which in turn gives rise to language conventions for interpreted meaning. You can therefore appreciate the interpretive variations involved, and that’s before even moving on to adjectives and adverbs! Through developing conventions, we could figure out that when a person “airs”, its semantic role is usually that the person breathes, sighs, or similar, not that they spirit away or become one with the atmosphere (meanings which are far less likely to use “kon” as a verb in the first place – probably an adverb if anything).

In the end, a computer needs to understand a definitive behavior that is to occur with a given type name. However, since the nature of this behavior is dictated by the combination of terms involved, we can understand that Toki Sona terms are meant to serve as interpreted inputs to the types. Furthermore, it seems most appropriate for types to serve two purposes: they must indicate the syntactic role the word has in a sentence, and they must indicate the functional role the word has in a context.

In the pseudocode excerpt above, we chose to highlight the latter role, defining described content by how it impacts the narrative context: is this an Audio or Visual element that will affect perception, or is it a detail of the setting, such as the TimeOfDay? In addition, we’ll also need to incorporate syntactic analysis to better identify what the described content actually is (is it a noun, verb, adjective, etc.?). As mentioned before, the way a word is used greatly affects the kind of meaning it has, so the functional role should be built on the syntax, which is in turn built on the vocabulary.
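One hypothetical shape for this dual typing, using “kon” from earlier as sample data (the weights are invented):

using System.Collections.Generic;

public enum Syntax { Noun, Verb, Adjective, Adverb }
public enum Function { Object, Visual, Audio, Action, Mood }

// A dictionary entry maps each syntactic role a word can take to the
// functional roles it tends to fill in a context, with weights a
// statistical model would supply (see Language Evolution below).
public class WordEntry
{
    public string Word;
    public Dictionary<Syntax, Dictionary<Function, float>> Roles;
}

var kon = new WordEntry
{
    Word = "kon",
    Roles = new Dictionary<Syntax, Dictionary<Function, float>>
    {
        [Syntax.Noun] = new Dictionary<Function, float>
        {
            [Function.Object] = 0.6f, // air, wind, breath
            [Function.Mood]   = 0.3f  // atmosphere, spirit, soul
        },
        [Syntax.Verb] = new Dictionary<Function, float>
        {
            [Function.Action] = 0.9f  // to breathe, to blow past
        }
    }
};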

Language Evolution

In addition, a system that implements this sort of code integration should be built around the assumption that the core vocabulary and semantics will change. As it stands, we already want to give users the power to add their own custom terms to the language for a particular application. These custom terms are always re-defined using a combination of sentences made of core terms and pre-existing custom terms.

However, because integrating a living, breathing, spoken language into a code base is a drastic measure, it is vital that the code be designed around the capacity for the core language to change. After all, languages are not unlike living creatures: they adapt to their environments and evolve to meet their speakers’ needs. In this sense, we can rest assured that players and developers alike will look forward to experimenting with and transforming this technology. That transformation will assuredly extend to the core terms, so not even the code should be tightly bound to them.

Given the lack of assurances in regard to the core terms over an extended period of time, it would behoove us to incorporate an external dictionary. It should most likely be pre-baked with statistical semantic associations derived from machine-learning NLP algorithms and then fed into runtime calculations that combine with the context to narrow down the interpretation most likely to meet users’ expectations.

In simple terms, Wyrd should be given a massive list of Toki Pona (or Toki Sona, later on as it becomes available) statements periodically, perhaps with a monthly update. It should then scan through them, learn the words, and figure out what they likely mean: How frequently is “kon” used as a noun? What verbs and adjectives is it often paired with? What words is it NEVER associated with? What sorts of emotions have been associated with the various term-pairings and which are most frequent? These statistical inputs will assist the system in determining the functional and syntactic role(s) words possess. Combining this data with the actual surrounding words in context will let the application have a keen understanding of how to use them AND grant it the ability to reload this information when necessary.
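In data terms, each such update might boil down to entries like this (all numbers invented for illustration):

using System.Collections.Generic;

// Pre-baked statistics for one word, as shipped in a periodic update.
public class WordStats
{
    public string Word;
    public Dictionary<string, float> RoleFrequency; // how often noun vs. verb, etc.
    public Dictionary<string, float> Collocations;  // words it is often paired with
}

var konStats = new WordStats
{
    Word = "kon",
    RoleFrequency = new Dictionary<string, float> { ["noun"] = 0.72f, ["verb"] = 0.18f, ["adjective"] = 0.10f },
    Collocations  = new Dictionary<string, float> { ["sewi"] = 0.21f, ["telo"] = 0.14f }
};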

Wyrd applications should also keep track of all Toki Sona input (if the user has volunteered it) so that it can be used as new machine-learning training material. If people start using a word in a new way, and that trend develops, then the engine should respond by learning to adapt to that new usage and incorporating it into characters’ speech and applications’ descriptions. To do this, the centralized library of core terms must be updated by scanning through more recent Toki Sona literature. Ideally, we would pull this from update-electing users, generate new word data, and then broadcast the update to those same Wyrd users.

Conclusion

Well, we’ve explored some of the more in-depth programming difficulties involved in using Toki Sona. There’ll likely be more updates in the future, but for now, this has all just been a brainstorming and analysis exercise. I apologize to those of you who aren’t as tech-savvy (I tried to keep things a little simpler outside of the pseudocode). From here on out, it’s likely we’ll end up dealing with things that are a bit more technical than the previous fare, but there will also be plenty of high-level discussion, so worry not!

For next time, I’ll be diving into the particulars of Agents, Characters, and the StoryMind: the fundamental tools for manipulating and understanding narrative concepts!

Next Article: Dramatica and Narrative AI
Previous Article: Interface and Gameplay Possibilities

Minecraftian Narrative: Part 3

 

Table of Contents:

  1. What is “Minecraftian Narrative”?
  2. Is “Toki Pona” Suitable for Narrative Scripting?
  3. Interface and Gameplay Possibilities
  4. Toki Sona Implementation Quandaries
  5. Dramatica and Narrative AI
  6. Relationship and Perception Modeling
  7. Evolution of Toki Sona to “tokawaje”

Introduction

The last time, we discussed the concept of a narrative scripting language that could revolutionize the way players interact with a game world. We considered the possibility of using Toki Pona, an artificial 120-word language with 10 syntax rules, as a starting point for creating a custom language to be used for scripting purposes. In this article, we’ll be focusing a bit more on the ways in which the language might be used and what form its interface might take.

To begin with, I would like to clarify both the licensing plans I have for this concept as well as what terms are involved:

  • Toki Pona: (“The Simple Language”) The original artificial language. This is already free to be used for any purpose.
  • Toki Sona: (“The Story Language”) The modified language that I will be deriving from Toki Pona (Toki Pona’s word for knowledge/story is “sona”). This too will be free to use for any purpose (as it should be). Free C++, C#, and JavaScript APIs would probably be made available for engine/application integration.
  • Wyrd: a paid-for plugin for various popular engines that will include an AI system for interpreting and responding to Toki Sona dynamically for spontaneous dialogue, game events, and character behavior.

Now that we’ve defined things, I’ll be exploring how the heck Wyrd might show up in a game!

Interface Possibilities: Suggested GUI Input

Imagine you’re playing a game and you are given the chance to say something to another character. Rather than being given a succinct list of possible responses, you could simply be given a Minecraft-crafting style of word-composition system.

  1. A character is encountered in the world.
  2. An input field and a word bank are made available to the user (players could summon it at will).
  3. Players can click on an image from the word bank. Think of this as picking a “block” in Minecraft.
  4. As players select statements, the system visually hints at how things are being interpreted, for example: {Myself} {Want} {Go}. These would be the things that are “crafted” from putting terms together.
  5. This hinting informs players of what concepts are ACTUALLY being communicated. For example: tomo, a.k.a. {enclosed space} => “home” or “house”.
  6. When players need to combine concepts, for example {enclosed space} and {water}, it can show them that the combination is understood as “bathroom”.
  7. Players could then check what other interpretations are available for that combination (perhaps by clicking on it).
  8. If they wished to communicate the desire to bathe/shower instead, they could select that option.

Obviously, it would be the responsibility of the Toki Sona engine to ensure there is a standardized image available for all of the desirable concepts, but limits will be necessary.

Another possibility, and perhaps a more realistic one, is to generate the hinted images from the full content of the statements made. For example, the {enclosed space} {water} combo may be assumed to be “bathroom”, but when the player follows that statement up with {myself} {feel} {dirty} (mi pilin jaki), the system might retroactively change the bathroom image to one of bathing. In this way, users wouldn’t be responsible for the interpretation (it can all be automated), which keeps the scope of mapping images to concepts from creeping out of control. It also asks less of the player to fully interact with the system. Users would also be able to see how their statements impact the interpretation of previous statements.
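A toy sketch of that after-the-fact revision; the hard-coded lookup here merely stands in for whatever statistical resolution the real system would use:

// Resolve an image hint for a phrase, optionally reconsidering it in light
// of a later statement. (Hard-coded purely for illustration.)
static string ResolveImage(string phrase, string laterContext = null)
{
    if (phrase == "tomo telo") // {enclosed space}{water}
        return laterContext == "mi pilin jaki" ? "bathing" : "bathroom";
    return phrase;
}

var hint = ResolveImage("tomo telo");                            // "bathroom"
hint = ResolveImage("tomo telo", laterContext: "mi pilin jaki"); // revised to "bathing"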

Interface Possibilities: Suggested Text Input

The text concept is very much like that of the GUI input, however all that would be displayed to the user instead is a text field. Typing into that text field would display a filtered subset of the word bank below the typed text. Things typed would be assumed to be Toki Sona words (for example, “tomo”, meaning “enclosed space”, i.e. “home”, “house”, “construction”). Players would be able to hit [Tab] to move down the hint list and hit [Enter] to auto-complete the selected word and have the image and hinted interpretation pop up. This would allow for MUCH more fluid communication once a player has pieced together the actual vocabulary of the language (you’d eventually get to the point where you wouldn’t even want/need the suggested text).

We would also likely need both input types to display the expected grammar, e.g. a big N underneath the beginning to show you need a noun for a subject. If you have a noun typed, it might suggest a compound noun or a verb, etc. All versions would also auto-edit what you have typed so that it is grammatically correct in Toki Sona (auto-inserting a forgotten grammar particle, and the like).
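A small sketch of the suggestion mechanics (prefix filtering only; the grammar hints would sit on top of this):

using System.Collections.Generic;
using System.Linq;

// Filter the word bank down to entries matching what has been typed so far.
static IEnumerable<string> Suggest(string typed, IEnumerable<string> wordBank) =>
    wordBank.Where(word => word.StartsWith(typed));

var bank = new List<string> { "tomo", "toki", "tawa", "telo", "tenpo" };
Suggest("to", bank); // -> "tomo", "toki"
// [Tab] would cycle this list; [Enter] would accept and show the hinted image.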

Gameplay Possibilities

Creation: One could easily envision a game where the player is capable of supplying WHAT they want to make in narrative terms and then clicking on the environment to place that thing. One could then edit the behavior of anything placed in the scene by selecting it, etc.

Simulation/Visual Novel: Something more like the Sims where the player is given a character and must direct them to do and say things to proceed. What they do and say, and to whom / where they do it may trigger changes in the other characters, the story, the environment, etc. This could naturally progress things.

Puzzle: A game where the player is given a certain number of resources (limited points to spend on creation, a limited vocabulary, etc.) and must solve a problem. This would be something more in the vein of Don’t Starve or Scribblenauts.

Generic RPG: A regular RPG game that allows the player to pop-up a custom dialogue window for speaking purposes, but which would otherwise not rely on directing player controls through the Wyrd system.

Roleplay-Simulation: A game that directly attempts to simulate the experience of a live role-playing game. The system acts as a Game/Dungeon Master and various players can connect to the game to participate in the journey together. A top-down grid environment may show up during enemy encounters of some kind, but players would completely interact with and understand the environment based on the text/graphic output of the system. More like an old school text adventure, but hyped up to the next level.

Conclusion

These are just some of the ideas I’ve had for interface and gameplay using the Wyrd system. Obviously this system still needs a lot of work, but I feel there is clearly a niche market that would long for experiences like this. If you have any comments or suggestions please let me know!

I know this article was a little bit shorter / lighter than my usual fare, but I promise I will develop some more detailed content for you next time. Cheers!

Next Article: Toki Sona Implementation Quandaries
Previous Article: Is “Toki Pona” Suitable for Narrative Scripting?

Minecraftian Narrative: Part 2

Table of Contents:

  1. What is “Minecraftian Narrative”?
  2. Is “Toki Pona” Suitable for Narrative Scripting?
  3. Interface and Gameplay Possibilities
  4. Toki Sona Implementation Quandaries
  5. Dramatica and Narrative AI
  6. Relationship and Perception Modeling
  7. Evolution of Toki Sona to “tokawaje”

Introduction

In the previous article, I explored the necessary elements of a “Minecraftian” game mechanic: one tailored for accessible and steady skill development, one that is equal parts editable and adaptable, visual and simple, granular and tabular.

I then addressed many issues with leveraging common languages to describe abstract concepts in this kind of mechanic. They are frequently hard to master. The Latin-based ones focus more on sounds than on meanings. Their complexity demands so much processing that computer algorithms become impractical at the scale on which we intend to use them. Relying on an existing language saves learning time, but only for a subset of the intended audience; for others, it is an ostracizing element that comes with the expectation of translating into other existing languages to extend the same privileges to alternative audiences. It would also bias any resulting software against younger players with still-developing language skills.

Because of these considerations, we began to consider the language Toki Pona as a possible tool to adapt for narrative scripting. What are the advantages of this Simple Language? Are there any problems with it? Let’s dive in and find out.

Ideal Narrative Scripting

Let’s first review what exactly we mean by “narrative scripting”. What sorts of tasks do we actually want to perform with this language? We’ve already established many of the characteristics we are looking for from our Minecraft analysis, and while Toki Pona meets many of these criteria, we must also consider the actual usage environment of our target language before we can meaningfully evaluate Toki Pona’s utility.

Before continuing, I would also like to point out that this sort of narrative scripting is entirely distinct from the “narrative scripting” language known as Ink. Scripting languages in general are simply languages that are more user-friendly, providing an intuitive interface for computer tasks that would otherwise be fairly complex. With Ink, the goal is to inform the computer of the relationships between lines of dialogue in branching storylines. In our case, the goal is to inform the computer of the narrative concepts associated with game-world objects, actions, and places so that it can 1) interpret meaning based on those associations and 2) trigger events that can be leveraged by AI characters, world controls, and human players/modders to create behavior and change the game world. We want to put this kind of control in the hands of players.

Skyrim-creator Bethesda’s “creation kit” modding tool has its own scripting language as well, used to edit objects and events in the game world. It is a bit technical, though.

If we truly had a narrative scripting language, then we would be able to craft, with as little vocabulary and syntactic structure as possible, a description of any narrative content. More specifically, we should be able to describe with some measure of accuracy…

  • places’ geography, geometry (both its form and absolute and relative locations), and thematic atmosphere.
  • objects’ nomenclature, physical and functional characteristics, relative purpose, effects, history, ownership, and value.
  • humans’ (and, as a categorical subset, animals’ and human-like creatures’) nomenclature, physical and emotional characteristics, relationships, state of mind, responsibilities, history and scars, beliefs, tastes, hopes and dreams, fears, biases, allegiances, knowledge, awareness, senses and observations, skills and powers.
  • concepts’ and ideologies’ subject domain, relationships, and significant details.

These qualities will allow the user to competently describe an environment and the items, creatures, and people in it. In addition, for both accessibility and functional purposes, we need the language to be useful for the following tasks: 1) scenario descriptions, 2) dialogue, and 3) computer processing. The attributes above cover the first case.

Ideally, dialogue “options” wouldn’t exist; we would be able to directly input a writing system that the non-player characters could understand without some preset arrangement of scripted responses.

As for dialogue, that means it must also be able to model questions, interjections, quotations, prepositions, nouns, adjectives, and adverbs (common syntactic structures). It must accommodate the linguistic relationships between terms and their relative priority, e.g. is X AND/OR’d with Y, are these words a noun/adjective/adverb, are they subject/object, which adjectives describe the noun more clearly, etc. We must also have some means of singling out identifiers, i.e. terms that refer to things that are not themselves a part of the language (a player’s name, for example).

Finally, we must also ensure that the language’s structural simplicity is reinforced so that its consequent processing is more easily conducted by computer algorithms. This primarily involves restricting the number of syntactic rules and the size of the lexicon, but it also includes subtler nuances like the number of interpreted meanings for a given set of words and the number of compound words.

To maximize the utility of the language itself, the “root” words that are individually present in the language must have the following characteristics:

  • The words must be, as much as possible, “perpenyms” of one another; that is, they must be perpendicular in meaning to their counterparts, neither synonyms – for obvious reasons – nor antonyms, to prevent you from simply saying, “not [the word]” to get another word in the language.
  • The words must have a high number of literal or metaphorical interpretive meanings laced within them to ensure a maximum number of functionally usable words per each learned word. Keep in mind, these interpretations must also be strongly linked by theme so that the words’ definitions will be easy to remember. If possible, however, these multiple meanings should be individually distinguishable based on context, so that in any given context one interpreted meaning can be assumed over another somewhat clearly. The more this is supported, the less work the computer will be doing.
    • For example, in Toki Pona, the word “tawa” can mean “to go” as a verb, “mobile/moving/traveling” as an adjective, “to, until, towards, in order to” as a preposition (notice how each of those is generally applied to a different prepositional object, which helps pick out which one is being used), or even “journey/transportation/experience”, literally “the going”, as a noun. Each usage can be identified fairly easily from the word’s position in the sentence; all run along a common theme, yet each has a unique connotation that can be interpreted rather well in context (see the sketch after this list).
  • The words must be highly reactive with their fellow terms so that a high number of reasonable compound words can be made. Again, the goal is to maximize the number of unique meanings we can derive from the minimum set of words to learn. This will increase the number of vocabulary terms, but in a less defined way as the minimalist nature of the language will make it favor interpretation of clear-cut meanings anyway. Therefore, alternative constructions for the same compound word concept won’t be too big of a deal.
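As promised above, here’s a rough sketch of reading a word’s syntactic role off its position, using “tawa” and some heavily simplified Toki Pona word-order rules (the particle “li” precedes the verb phrase):

public enum Role { Verb, Modifier, Preposition, Noun }

// Simplified positional disambiguation for a word like "tawa".
static Role RoleOf(string[] words, int i)
{
    if (i > 0 && words[i - 1] == "li") return Role.Verb; // "ona li tawa"
    if (i == 0)                        return Role.Noun; // "tawa li pona"
    return Role.Modifier; // or Preposition; surrounding context decides (glossed over here)
}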

Toki Pona’s Potential

As previously mentioned, Toki Pona has many desirable characteristics that we seek in developing a narrative scripting language. Limited syntax and vocabulary clears the user-accessible and computationally efficient requirements (3). Toki Pona also does an excellent job of functioning as dialogue since it was designed from the ground up to be used conversationally (2).

The language has an exceptional potential to detail a wide variety of topics, and although it tends to be extremely vague, enough detail can be supplied to elucidate the general meaning of an idea. However, some aspects of the language restrict its potential, namely the fact that the language’s creator, Sonja Lang, designed it around the linguistic needs of a hypothetical, simple, aboriginal people on an island. As such, the language was not designed from the ground up to maximize functional vocabulary; it instead caters to the range of topics and activities that such a people would participate in.

The 2014 guide to Toki Pona, occupying an entire word in the language.

For example, the language includes words like “pu” (meaning “the book of Toki Pona”), which is utterly meaningless for our purposes, and “alasa” (meaning “to hunt, to forage”), which fails on perpendicularity since you can easily create the same meaning with phrases like “tawa oko tan moku” (meaning “to go eyeing/looking for food”).

Also, despite how much the language does to accommodate different modes of verbiage (including past, present, and future tenses, progressive variations of each, etc.), it can be troublesome to express some necessary concepts, since “wile” (the word for “want to”) encompasses “want to”, “must/have/need to”, and “should/ought to”, each of which is a highly distinct and significant nuance.

Still, some inventive verb usages have been adopted to cover gaps in the vocabulary. For example, a person can convey that he or she should do something by applying the command imperative to themselves as a statement, e.g. “mi o moku” => “Me, eat” => “I should eat”, distinct from “mi moku” => “I eat.” These intricacies must be learned on top of the vocabulary and syntax rules, though, as they are built upon usage conventions; their obscurity inevitably hinders the accessibility of the language.
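Conventions like this could at least be pattern-matched cheaply. A toy sketch, covering only the first-person case:

// Sketch of encoding the convention above: an "o" directly after the subject
// marks a self-directed imperative, read as obligation.
static string Modality(string sentence) =>
    sentence.StartsWith("mi o ") ? "obligation"  // "mi o moku" => "I should eat"
                                 : "indicative"; // "mi moku"   => "I eat"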

As such, it would appear that the most effective strategy would be to develop a derivation of Toki Pona, built on the same principles and leveraging much of the same language, but stripping it down to only the elements that are most critical for communication and plugging up gaps in linguistic coverage as much as possible.

Narrative Language: Core Concepts

While we won’t iron out the entirety of a language in one sitting, we can likely get a sense for what sorts of concepts must be included as core elements. If we limit ourselves to include 150 words (and even that is really stretching it, if we want to keep the language as effective as possible), then let’s see what ideas are really needed.

  • Pronouns
  • Parts of the body
  • Spatial reasoning, i.e. directions, orientation, positioning
  • Counting, simple math
  • Colors
  • Temporal reasoning, i.e. time (and tense) references
  • Types of living things (including references to types of people, e.g. male/female)
  • Common behaviors (supports occupations, most helping verbs, basic tasks)
  • Common prepositions
  • Elements of nature (earth, wind, water, fire/heat, light, darkness)
  • Forms of matter
  • Grammar particles (obviously taken directly from Toki Pona more or less)

We must also then include elements that are uniquely necessary to fulfill our needs of describing the nature of people and relationships.

  • Elements of perception (able to describe the “feel”, “connotation”, or “theme” of an experience)
  • Elements of relationships (same for relationships, but also able to describe the expected responsibilities. Need basic words to help illuminate expected responsibilities, e.g. Toki Pona’s “lawa” for “leader”)
  • Elements of personality

Ideally, we would be able to get many of these meanings using the same words, but just applying them to a different object. For example, if we had the word “bitter”, we could use it to describe a perception, the overall nature of a relationship (or perhaps even the feelings experienced by one party in the relationship), or someone’s personality, e.g. they are a bitter and resentful person, etc.

Conclusion

In our analysis of Toki Pona, we covered the fact that it has many advantages with regard to fulfilling our requirements for a narrative scripting language; namely, it can be used as dialogue and has a high number of meanings per learned vocabulary term, and it is therefore able to cover a lot of topics with little base learning time involved. However, we have also determined that the language has many flaws due to its original design purpose: meeting the linguistic needs of an isolated, tribal, hunter-gatherer people. As such, there are many unnecessary terms and some concepts that simply cannot be conveyed adequately, if at all.

We can therefore state that the best course of action would be to derive a new language from the structure and vocabulary of Toki Pona, shaving away “the fat” as it were, and editing the language to bolster its linguistic breadth and depth while still staying true to its minimalist nature. To that end, we outlined several topics of vocabulary that would be essential for outlining a narrative scripting language.

In future articles, I’ll begin to address the interface we may see in a narrative scripting editor and identify actual gameplay mechanics we could see with narrative scripting put into practice. Please feel free to leave comments if you have any further insights or criticisms and stay tuned for more!

Next Article: Interface and Gameplay Possibilities
Previous Article: What is “Minecraftian Narrative”?