
Using Psychological Principles for Great User Interfaces

There are proven psychological principles behind user interfaces that work.

Whether you have a team of design experts or are just building with programmer art, you can use these principles to make your game easy to understand and a joy to pick up.

When you’re putting together the user interface design for your game – whether it’s a heads-up display (HUD), a level select menu, an in-game map, or a life meter – you want it all to work. Perfect UI design is invisible: the user never grapples with how the UI works. The UI disappears, and the player focuses on what they want to do in the game.

My favorite analogy for this is driving a car. When you’re first learning how to drive, you need to be taught what the steering wheel does, how the accelerator and brakes work, and how to shift gears. But once you learn these things, they all disappear. “Turn the steering wheel right” turns into “I want to make the car go over there”.

A good UI tries to get the player to this second stage as quickly as possible. Over at my site The Game Prodigy, we try to stress getting to this point immediately so that players can really get into your game.

To do this, there are a handful of psychological principles you can use: scaffolding concepts, functional and color consistency, and the rule of 7.

Let’s get started!

Scaffolding Concepts


When you’re laying out your game’s UI, what you are really laying out is a map of concepts.

The quickest way to explain a concept to someone is by making an analogy between the new concept and something they already understand. In teaching and education this is called “scaffolding” – by propping up new ideas with old ones, the new ideas are easier to comprehend.

Life bars are used almost universally in games, and they build on a concept people already understand from progress bars or gas gauges. If the bar is full, you have plenty left. If the bar is low, you’re almost out.


Attached Image: progressbar.png


You can see how this analogy is used in Keiji Inafune’s recent Kickstarter project, Mighty No. 9:


Attached Image: mightyno9.jpg


The best case is when you can scaffold with an object commonly understood from real life.

One note: players differ in what they already know. If your game is for casual players or typical “non-gamers”, you may have to pull concepts from real life to use as your analogies. If your players are gamers familiar with many game conventions, it will be easier to borrow from other games and expect them to know how certain elements already work.

Some other concepts that are useful to use are light switches, dials (like on a stove), on/off switches, elevator buttons, escape buttons, or clocks/alarms. The more common the real world object, the better.

Application: When developing your UI, ask yourself these questions:
  • Are there any UI concepts or analogies here that will be totally unfamiliar to players?
  • Can these new concepts be scaffolded with old concepts to make them easier to understand?

Strive for Consistency in Actions and Colors


When you introduce a UI concept to players, you want to be as consistent as possible with it across the game.

In line with the scaffolding concept we just discussed, consistency keeps players on familiar ground. The worst experience is teaching the player how something works, only to have it work differently in another area of the game.

A great example of this done well is in the recent indie hit Papers, Please. The game asks the player to deal with a variety of items in deciding who they should let through the immigration border control. These include passports, permits, photographs, and more.


Attached Image: papersplease1.png


Each of these is interacted with by using the mouse to drag items around on your desk, since that’s a sensible way to handle papers. However, the game also has lots of dialogue between the characters. Dialogue is typically presented in a floating box on top of the screen, with up and down used to select options (think Final Fantasy or Mass Effect dialogue menus).

But that interaction style wouldn’t fit with how the player is interacting with the rest of the game.

So to keep the UI consistent, Papers, Please also makes the dialogue an object on the desk, via a printed transcript. This doesn’t require the player to pull up some new menu or learn a new way of interacting with the world – it’s the same as all the rest of the objects. This keeps the game feeling consistent, and the player immediately understands how to interact with it.

Attached Image: papersplease2.jpg

Color can also be a great way to drive consistency, and it’s one of the big concepts we teach back at The Game Prodigy. Different colors carry cultural meanings, and keeping these colors consistent in a game makes it more intuitive.

Red in Western culture typically means stop (from traffic lights), warning, or bad. You can see this in the typical “Damage Taken” red shade that appears on the map UI in Grand Theft Auto V…


Attached Image: gta3.png


…and also in the UI around enemies in Nimble Quest. The red skulls represent how many “bad guys” remain, which again is signaled using red:


Attached Image: nimble_quest31.jpg


These colors make it possible to make quick sense of what's going on in the game world.

Application:
  • When designing your UI elements, keep them consistent with one another. Don’t switch from one interaction style to another partway through the game
  • Make use of color to subtly point out similarities between game elements to your players

Digit Span and Chunking


Let’s do an exercise. Memorize these numbers, and then close your eyes and try to recite them from memory:

4930661

If you’re just trying to skip ahead, don’t do it. Actually try to memorize them; it will help with the illustration.

Have you done it? Great! Now try these numbers:

5982385741

How did you do? According to research, the first set should have been simple, but the second much more difficult. Why?

Studies have shown that people can only hold about 7 unique numbers in their head at a time, give or take two. This is called “digit span” and is the reason that phone numbers are 7 digits long (without country or area code).

This concept applies in a more abstract sense as well. If 7 ideas present themselves to a player at the same time, that’s reaching the limit of what most players can handle. Beyond that, it becomes jumbled and confusing.

However, ideas can be pulled together to form one higher-level idea. This is called “chunking” and appears frequently in the psychology literature on memory.

For example, try to memorize these numbers:

199020012013

This is much easier to memorize if you chunk them into years: 1990, 2001, 2013. Simple, right?

Let’s look at this example of digit span and chunking in games from Dark Souls:


Attached Image: Darksouls.png


There is a lot going on here – 9 UI elements in total on the HUD:

  1. Health Bar
  2. Stamina Bar
  3. Level
  4. Up Item
  5. Down Item
  6. Right Equipment
  7. Left Equipment
  8. Currency
  9. Interaction Dialog

However, four of them, the weapons and items, are chunked together so that the player can think of them as a single concept that maps to the D-pad.

This brings the total down to 6 elements, with Up/Down/Left/Right combined into a single concept: “Items”. By chunking those 4 elements together – through both how they map to the controller and how they look on the screen – the Dark Souls UI comes across to players as just simple enough.

Let’s look at another example from Terraria:


Attached Image: terraria.jpg


Using chunking, this UI is made up of only 3 main elements:

  1. Life bar
  2. Mana bar
  3. Items bar (chunked together)

This visual design makes the game trivial to pick up; players immediately understand what’s going on.

Application:
  • How many unique UI elements does the player have to pay attention to at once? If it’s more than 7, consider reducing them or chunking common ideas together
  • Make sure that chunked ideas are interacted with in a similar manner

Summary


When you’re designing the UI for your game, try to draw analogies to knowledge and concepts your audience will already understand (even if you are making something new, you can still pull parts from common ideas). Keep your interaction styles and colors consistent so players can navigate without being surprised. Limit the number of ideas or concepts you show at once to no more than 7 to keep things from becoming jumbled.

These are rules of thumb, and of course there are exceptions. But adhering to them will help make your game more quickly understandable.

With some smart psychological principles, good UI can help your players get through the menus and into the game world you’ve created.

Good luck!

For more information on how to build and design games and a game career, visit The Game Prodigy for a free 29-page eBook.

Article Update Log


  • 10/23/2013 - Added another Papers, Please screenshot to make point clearer

Overview of Modern Volume Rendering Techniques for Games - Part 1

A couple of months ago Sony revealed their upcoming MMO title EverQuest Next. What made me really excited about it was their decision to base their world on a volume representation. This enables them to show amazing videos like this one. I’ve been interested in volume rendering for a long time, and in this series I’d like to point out the techniques that are most suitable for games today and in the near future.

In this series I’ll explain the details of some of the algorithms as well as their practical implementations.

This first post introduces the concept of volume rendering and its greatest benefits for games.

Volume rendering is a well-known family of algorithms that allow the projection of a set of 3D samples onto a 2D image. It is used extensively in a wide range of fields such as medical imaging (MRI and CT visualization), industry, biology, geophysics, etc. Its usage in games, however, has been relatively modest, with some interesting use cases in games like Delta Force, Outcast, C&C Tiberian Sun and others. The usage of volume rendering faded until recently, when we saw an increase in its popularity and a sort of “rediscovery”.


sample_asteroid.png
A voxel-based scene with complex geometry


In games we are usually interested only in the surface of a mesh – its internal composition is seldom of interest, in contrast to medical applications. Relatively few games have selected volume rendering in place of the usual polygon-based mesh representations. Volumes, however, have two characteristics that are becoming increasingly important for modern games: destructibility and procedural generation.

Games like Minecraft have shown that players are very much engaged by the possibility of creating their own worlds and shaping them the way they want. On the other hand, titles like Red Faction place an emphasis on the destruction of the surrounding environment. Both these games, although very different, have essentially the same technology requirement.

Destructibility (and of course constructability) is a property that game designers are actively seeking.

One way to achieve mesh modification is to apply it directly to traditional polygonal models. This has proved to be quite complicated. Middleware solutions like NVIDIA APEX solve polygon mesh destructibility, but usually still require input from a designer, and the construction part remains largely unsolved.


minecraft-fantasy-castle-720x450.jpg
Minecraft unleashed the creativity of users


Volume rendering can help a lot here. Representing the mesh as a 3D grid of volume elements (voxels) is much more natural than a collection of triangles. The volume already contains the important information about the shape of the object, and modifying it is close to what happens in the real world: we either add or subtract volumes from one another. Many artists already work in a similar way in tools like ZBrush.

Voxels themselves can contain any data we like, but usually they define a distance field – every voxel encodes a value indicating how far it is from the surface of the mesh. Material information is also embedded in the voxel. With such a definition, constructive solid geometry (CSG) operations on voxel grids become trivial. We can freely add or subtract any volume we’d like from our mesh, which brings a tremendous amount of flexibility to the modelling process.
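
To make this concrete, here is a minimal sketch in Java (with illustrative names, not from any particular engine) of CSG on a signed-distance voxel grid, using the common convention that negative values lie inside the surface:

// Minimal CSG sketch on a signed-distance voxel grid.
// Convention: negative = inside the surface, positive = outside.
final class VoxelCsg {
    static float union(float a, float b)        { return Math.min(a, b); }
    static float intersection(float a, float b) { return Math.max(a, b); }
    static float subtraction(float a, float b)  { return Math.max(a, -b); } // carve b out of a

    // Subtract a sphere (e.g. an explosion crater) from a distance grid.
    static void subtractSphere(float[][][] grid, float cx, float cy, float cz, float r) {
        for (int x = 0; x < grid.length; x++)
            for (int y = 0; y < grid[x].length; y++)
                for (int z = 0; z < grid[x][y].length; z++) {
                    float dx = x - cx, dy = y - cy, dz = z - cz;
                    // Signed distance from this voxel to the sphere's surface.
                    float sphere = (float) Math.sqrt(dx * dx + dy * dy + dz * dz) - r;
                    grid[x][y][z] = subtraction(grid[x][y][z], sphere);
                }
    }
}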

Procedural generation is another important feature with many advantages. First and foremost, it can save a lot of human effort and time. Level designers can generate a terrain procedurally and then just fine-tune it instead of starting from absolute zero and working out every tedious detail. This saving is especially relevant when very large environments have to be created, as in MMORPGs. With the new generation of consoles offering more memory and power, players will demand much more and better content. Only with procedural content generation will the creators of virtual worlds be able to achieve the variety needed for future games.

In short, procedural generation means that we create the mesh from a mathematical function that has relatively few input parameters. No sculpting by an artist is required, at least for the first raw version of the model.
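
As an illustration – assuming a toy sine-based terrain instead of proper noise (Perlin/simplex), so the sketch stays self-contained – an implicit surface can be as small as one function with a couple of parameters:

// Toy implicit terrain: a distance-like density driven by two parameters.
// Negative below the surface, positive above, zero at the surface.
final class ProceduralTerrain {
    final double amplitude, frequency;
    ProceduralTerrain(double amplitude, double frequency) {
        this.amplitude = amplitude;
        this.frequency = frequency;
    }
    double density(double x, double y, double z) {
        double height = amplitude * (Math.sin(x * frequency) + Math.cos(z * frequency));
        return y - height; // real projects would add octaves of noise here
    }
}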

Developers can also achieve high compression ratios and save a lot of bandwidth and disk space by using procedural content generation. The surface is represented implicitly, with functions and coefficients, instead of heightmaps or 3D voxel grids (two popular methods for surface representation used in games). We already see huge savings from procedurally generated textures – why shouldn’t the same apply to 3D meshes?

The use of volume rendering is not restricted to meshes. Today we see some other uses too, including:

  • Global illumination (see the great work in Unreal Engine 4)
  • Fluid simulation
  • GPGPU ray-marching for visual effects

In the next posts in the series I’ll detail the modern volume rendering algorithms that I believe have the greatest potential for use in current and near-future games.

How NOT to Market Your Indie Game

There are so many of those ‘How to market your indie game’ articles written by more or less successful developers that I decided to write about this from a different angle: the angle of someone who failed at marketing. My angle.

I have a long record of failing at marketing and PR, and you can easily check that by looking at my name/nick and not recognizing it. Thus you can trust me on what I’m going to say.

Warning: the following list may be filled with things so stupid that you wouldn’t ever imagine doing them, and yet I did all of them at some point, often multiple times. If that’s the case, you might as well just make fun of me since you’re already here.
  • Don’t accidentally forget to put the links to your website, Facebook and Twitter under anything you post about your game.
  • Don’t post detailed stuff about your game that only the most eager fans would be interested in. Especially when your game isn’t finished yet and doesn’t have any fans.
  • Like this very post we published a few days ago. Who the fuck could care about the backstory of one of the political parties in one of the playable races in our game that no one knows about?
  • Don’t casually accost random editors who’ve never heard of you on Twitter or Facebook.
  • Don’t fill your email’s subject line with tons of buzzwords, e.g. “Steam-punk MMORPG with a vast world to explore and innovative storyline, also a spiritual successor to XXX”. Despite the number of words, it doesn’t actually say anything about your game.
  • The same goes for Reddit posts.
  • Don’t release the screenshots that you took 5 days into development; they will stay on the Internet forever and haunt you. (The press posting about your game and using a year-old screenshot as a news header would be the best example.)
  • Don’t try to be funny if it doesn’t come to you naturally. It’s the most pathetic thing ever.
  • Don’t send a press release to 20 editors, putting their email addresses in ‘To:’ instead of ‘Bcc:’.
  • In fact don’t send a press release to 20 editors at all! Send each of your emails separately, with some consideration as to who you’re talking to.
  • Don’t believe them when they say that the press wants to write about your game.
  • You have to do EVERYTHING that is in your power to make yourself and your game look outstanding in the crowd of other developers and their games.
  • Don’t post your updates in the middle of the night. Do your research on when’s the best time to post. Facebook’s added a cool feature recently that lets you check the hourly activity of your fans.
  • Don’t wait until your game is released to start spamming the press. Email them right now. They need to know about the awesome project you’re working on, even if they don’t reply or post about it on teh websitez.
  • Don’t send an email titled “We’re making a game, it’ll be fun”. They’re not gonna make a story about it. They’re not gonna post about it. Unless you’re Notch, of course.
  • Don’t ask reviewers if they want a review copy of your game. Throw it at their faces. They weren’t gonna buy it anyways.
  • Don’t visit Twitter and forums only to post an update on your game’s development. If you’re not a part of a particular community, it’s better to not spam there at all. (Some may not agree with this, but IMO it’s kind of a scumbag move.)
  • Don’t miss out on #screenshotsaturday.
  • Don’t hate everyone that is more successful than you. It’s not good for your health. There are simply too many of them.
  • Don’t use your blog as a weekly list of all the sprites you made in the past days and all the little bugs you’ve fixed. No one cares about that. Don’t bore to death the people who decided to read your stuff.
  • Don’t play the ‘Top-secret project’ game! If you don’t show how cool your game is, then no one will know how cool your game is. Unless you’re already a successful developer, but then you wouldn’t be reading this, right?
  • If you don’t reveal your secret ultimate feature then no one’s gonna know about it. Dang, even if you reveal it most likely no one’s gonna know about it.
  • Don’t use ‘6 playable characters’ and ’20 enemies to kill’ as your key features. Trust me on this one.
  • Google ‘USP’ and think harder.
  • Don’t trust yourself on how good your gameplay is. Your opinion is ultimately biased.
  • Don’t make a game similar to a well-known hit if you can’t make yours better. People would rather just play the original.
  • Don’t skip on the pre-production phase, and don’t skip on thinking of your target group of players. (There has to be one!)
  • Look at your game! And I mean: look at it like you’re looking at other games. Make your friends look at it. Make strangers look at it. Don’t say anything more than what you have on your website/in your posts. Accept their feedback with gratitude. Change the way you’re presenting your game when you still have time for that.
  • Even if you’re making an awesome, innovative and original game in an entirely new genre, it may still look generic in your presentation, or in the way you’re describing it. Think about that.
  • Don’t try to make a game for both casual and hardcore players.
  • Don’t make the art in your game look inconsistent. It’s better to have bad but consistent art than a few good pixel-art assets mixed with good 3D renders, and so on.
  • Don’t skip on polishing the game!
  • Don’t insist on adding more content instead of polishing what you already have in the game.
  • Don’t have your website look like shit.
  • Don’t have your Facebook fanpage outdated and looking like shit.
  • Don’t expect people to think too much! They don’t find your game worthy of such a drag. Make everything obvious and in front of their very eyeballz.
  • Don’t be a dick if no one plays your game. It sucks. Deal with it.
I hope that helps : )

Now get back to working on your game! Don’t waste your time reading articles like this one. It’s not like you’re gonna believe anything someone else says before you make the same mistakes yourself. At least that’s my case. And yes, I’ve read thousands of those articles.

In case you would like to see more of my epic failures with your own eyes, you ought to follow me on Twitter.


COVER.jpg


And if you like games that you find awesome you should follow my game’s fanpage on Facebook, it’ll exceed your expectations.

Light Introduction to Genetic Programming

A few years ago I turned a card game I knew from my high school years into a mobile app. I had written an AI for it, but I wasn’t satisfied with its skill level. Not convinced it was worth the effort to manually code a better solution, I looked into AI algorithms and ended up at Genetic Programming (GP). To make a long story short, it worked so well that the AI now plays better than I do. GP is easy to understand conceptually and reasonably simple to implement at a basic level, which is what this article is about. Be warned: this article does not teach you how to program in general.

This article has two sections: One section to introduce GP, and one section to describe the example project.

Introduction to GP


GP starts with ‘genetic’ because it is inspired by a process in nature called evolution. It is a subclass of Genetic Algorithms (GA) that deals with “programs” represented by tree graphs. By running a genetic program, you can perform a computation, make a decision, or generally execute some sequence of events. Below is an example of a genetic program that calculates A + B * 7.


Attached Image: arithmetic_tree.png

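To make the tree structure concrete, here is a minimal sketch in Java (the class names are illustrative, not from the project attached to this article) of how A + B * 7 could be represented and evaluated:

// Minimal arithmetic GP tree sketch; node names are illustrative.
interface Node { double eval(double a, double b); }

final class Constant implements Node {
    final double value;
    Constant(double value) { this.value = value; }
    public double eval(double a, double b) { return value; }
}
final class VarA implements Node { public double eval(double a, double b) { return a; } }
final class VarB implements Node { public double eval(double a, double b) { return b; } }
final class Add implements Node {
    final Node left, right;
    Add(Node l, Node r) { left = l; right = r; }
    public double eval(double a, double b) { return left.eval(a, b) + right.eval(a, b); }
}
final class Mul implements Node {
    final Node left, right;
    Mul(Node l, Node r) { left = l; right = r; }
    public double eval(double a, double b) { return left.eval(a, b) * right.eval(a, b); }
}

// A + B * 7, built as a tree and evaluated:
// new Add(new VarA(), new Mul(new VarB(), new Constant(7))).eval(2, 3) == 23.0
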

Here’s another genetic program, one that executes an 8 pool Starcraft 2 build order opening (disclaimer: this is my interpretation, based on a description on a popular SC2 site). Note that this is still a tree structure, although it’s drawn in a non-standard manner. How to evaluate this tree should be straightforward, except for the decision nodes, whose conditions might be difficult to implement – “high damage potential” is a fuzzy concept.


Attached Image: 8pool.png


The ‘genetic’ part comes from the fact that these trees/programs can be considered genomes that can be “mutated” (changed) in various ways. By having a population of different genomes/programs mutate through successive generations, while applying fitness/selection criteria, we can influence what tasks the genetic programs become good at solving. Let the individual programs that best solve the task have the largest genetic influence on the next generation – this is evolution!

The image below shows an (unrealistically tiny) population of four individuals undergoing evolution for five generations through random color change and node addition/removal, where the task is to have as many blue nodes as possible. Of course, this is a contrived example given that we know how to make a huge blue tree.


Attached Image: evolution.png


Note how the best trees are either cloned or subjected to mutation, which may or may not result in a more fit child. Cloning ensures good trees survive in unaltered form, but slows down mutation (in huge populations cloning isn’t really necessary). Also note how the selection rules are not necessarily 100% fair; it doesn’t matter.

Finally, let me clarify what makes GP different from genetic algorithms in general. GA is about optimizing parameters through evolution, a form of stochastic search. How those parameters are combined is not something GA concerns itself with, except in specific cases such as GP. GP, on the other hand, provides a way of evolving not only parameter values but also how those parameters are combined, making it a more powerful learning approach. Even if you don’t know the expression for a certain computation, GP can learn it for you, rather than just optimizing a set of input parameters!

Some examples of what a GP tree can represent


Arithmetic tree

The easiest, and I suspect most common, use of GP is to let programs represent arithmetic expressions, most commonly expressions on “real” numbers (OK, floating-point precision aside). We have already looked at this earlier, but suffice to say that the ability to evolve a mathematical function for some purpose is a powerful one. Below is an evaluation function in Lua, evolved for a “medium” level AI in my game:

(((((env.rounds)+((env.suit)*(env.rank)))+(env.safe))+((env.block)*(env.block)))*(((-4.0745883)+(0.025403308))+((((env.selfblock)*(env.dist))+((env.selfblock)*(env.safe)))+(math.random()))))

Note how there are terminals (leaves) such as env.rounds, math.random() and constants, and the binary nonterminals + and *. Nonterminals in arithmetic trees generally don’t need any extra parameters – they just receive their operands as input branches. The whole thing is very confusing when read with parentheses in a linear text file, so let’s draw the tree:


Attached Image: lua_tree.png


This tree form is what’s actually stored in memory while evolution and simulation are running; the linear code is only an export optimized for my target game. Another interesting thing to note is the constant result computation -4.0745883 + 0.025403308; such redundancy and overgrowth (huge sub-trees of dubious utility) can be controlled with the fitness function, as we shall see later.

Execution tree

Another option is to let a genetic program represent a computer program in a more traditional sense, i.e. a control flow graph. Of course, a major restriction compared to a normal computer program is the fact that the graph must be a tree. This may or may not be a problem, and there are ways to work around the lack of loops through innovative node types.

The Starcraft 2 build order example earlier is an example of an execution tree; execution in that case occurs from root to leaves, with decisions in between. Nodes may either be decisions (needing annotated parameters) or simply execute an instruction with a linear result, such as “build worker”. Some instruction nodes may have no children at all and can thus effectively function as leaves as well.

A less generally useful but noteworthy option is to reverse the direction of flow and execute from a certain start leaf up to the root along a predictable linear path. In that case, execution might be seen as going from a specific condition to a more generalized conclusion, where different specific start conditions share a consequence.

Decision tree

Decision trees are tree graphs that are evaluated from root to leaves, following a path of decisions that eventually ends in a leaf consequence/classification – without any non-standard execution in the nonterminals; all nonterminals are pure decisions. Note that decision trees have been studied extensively, and there are likely better algorithms than GP if you need to build one. Still, the possibility is worth mentioning.

Note that in decision tree flow graph notation, nonterminals are indicated with multiple graphical elements:


Attached Image: Decision-Tree-Elements.png


As with execution trees, nonterminals in decision trees are more complex than in arithmetic trees, since extra parameters such as conditions or probabilities need to be annotated. Possibly, these parameters also need to be mutated directly for more efficient learning, rather than just being initialized randomly at node creation.

Genetic operations


Although only cloning and mutation were mentioned so far, there are in fact several fundamental genetic operations that can be applied on a GP tree.

Cloning

The easiest operation to understand is cloning, in which an identical copy of the parent/source tree is returned. I will avoid the term “parent” because it conflicts with tree graph terminology, and instead call parents source trees from here on. Similarly, I will avoid the term “child” (even though both terms are present in the source code). Anyway, here is an example of a cloning operation on a source tree:


Attached Image: cloning.png


Mutation

Mutation involves making some sort of change to a single source tree. In a GP tree, it is possible to replace a subtree on a branch of the source tree with a random tree. This in fact covers most cases of tree alteration you can think of: it can either grow or shrink the tree, depending on the maximum depth of the random tree generated.


Attached Image: mutation_add.png

Attached Image: mutation_remove.png


A good max depth for the random subtree seems to be 2 (allowing at most one nonterminal). If it were 1, trees would never grow through mutation; much larger, and your trees will rapidly fill with loads of random crap, excuse the expression.

For cases where parameters on nonterminals are required, you might want to include code to mutate these parameters without replacing trees. Similarly, if you have constant terminals in an arithmetic tree, you might want to mutate these in a controlled way. Consider that a better solution often lies close to an already good solution; completely throwing a constant out the window and replacing it with a random value, as happens with subtree replacement, is not likely to be particularly successful.


Attached Image: magnitude_update.png


Above is the constant update rule I use. Note that the magnitude is altered exponentially, by a factor between 0.5 and 2. The sign is intentionally never changed: in expressions such as A * B, changing the sign of one of the operands has a rather dramatic effect – it changes the sign of the result as well. For changing signs, I instead rely on the node replacement covered earlier.
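
In code, that rule can be as small as the following sketch (assuming a plain java.util.Random; the class and method names are illustrative):

import java.util.Random;

final class ConstantMutation {
    // Scale by 2^u with u uniform in [-1, 1], i.e. a factor in [0.5, 2].
    // The magnitude changes exponentially; the sign is intentionally preserved.
    static double mutate(double constant, Random rng) {
        double exponent = rng.nextDouble() * 2.0 - 1.0; // uniform in [-1, 1]
        return constant * Math.pow(2.0, exponent);
    }
}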

Crossover

Did you ever wonder why you have two biological parents and not one? Now you will know!

The limitation of mutation is that random data isn’t likely to do anything useful. Meanwhile, your population hopefully already contains a diversity of useful genetic programs. A key insight here is that the subtrees of a genetic tree may be useful sub-solutions to the task. Thus, rather than replacing a branch with random data, we replace it with a subtree of a proven useful source tree.

Crossover, as this is known, means combining the traits of two source trees. The results will not always be successful, but overall the chances of something interesting emerging are comparatively good. Crossover speeds up evolution, since successful sub-solutions can be shared among different lineages without having to develop independently through mutation.

I didn't include an image because crossover looks similar to mutation, except the donor tree isn't random but a randomly selected subtree of another individual.
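
For illustration, here is a minimal sketch of subtree crossover on a generic mutable tree – the types are illustrative, not the Individual class from the attached project:

import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class Tree {
    String label;
    final List<Tree> children = new ArrayList<>();
    Tree(String label) { this.label = label; }

    // Collect every node so a crossover point can be picked uniformly at random.
    List<Tree> allNodes() {
        List<Tree> nodes = new ArrayList<>();
        nodes.add(this);
        for (Tree c : children) nodes.addAll(c.allNodes());
        return nodes;
    }

    Tree deepCopy() {
        Tree copy = new Tree(label);
        for (Tree c : children) copy.children.add(c.deepCopy());
        return copy;
    }
}

final class Crossover {
    // Clone the source, then graft a copied random subtree of the donor
    // onto a random node of the clone.
    static Tree crossover(Tree source, Tree donor, Random rng) {
        Tree child = source.deepCopy();
        List<Tree> candidates = child.allNodes();
        Tree target = candidates.get(rng.nextInt(candidates.size()));
        List<Tree> donorNodes = donor.allNodes();
        Tree graft = donorNodes.get(rng.nextInt(donorNodes.size())).deepCopy();
        // Replace the target's payload and branches with the graft's.
        target.label = graft.label;
        target.children.clear();
        target.children.addAll(graft.children);
        return child;
    }
}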

Fitness and limiting overgrowth


Throughout this article we have assumed that there is a driving force behind evolution, guiding the evolution of genetic programs towards solving a particular task. That force is the fitness for the task.

How you define fitness is domain specific, but for GP a good start is a function that can be used to order the population. The fitness of an individual should be based on how efficient it is at the task (whether that is to win, to survive, etc.). For instance, in a game with win/lose outcomes you might base fitness on N_wins / (N_wins + N_losses).

There is another important aspect of the fitness function: it can be used to protect against overgrowth. Overgrowth was briefly mentioned earlier, and it is a huge danger to the efficiency of evolution in some GP setups. Consider how efficient the expression 0 * (abs(sin(x)) ^ 5.88 + 72.21 - 28.01 * min(-827.17, x)) really is. In practice, rather than an exact 0, you might see constants evolving to some very small values as evolution tries to get rid of useless subtrees. The full effect of this can be unlimited growth of useless subtrees throughout the population, slowing evolution to a snail’s pace, or worse.

The problem, of course, is that there’s no penalty for useless arithmetic in the AI code included in the game simulation. But rather than trying to add that to the simulation (which would break the game rules), we can add the penalty to the fitness function. A good (if rough) start is basing the penalty on the size of the GP tree. For instance, here’s the fitness function I use in Sevens:

float getFitness() {
    // Combined node count of the two evolved trees (play and send contexts).
    int size = play.getTreeSize() + send.getTreeSize();
    // The first 20 nodes are free; every node beyond that incurs a penalty.
    int factor = Math.max(0, size - 20);
    // Win ratio drives fitness; the size penalty keeps overgrowth in check.
    return getWinRatio() - factor * 0.0004f;
}

Overgrowth might be less of a problem with execution trees than with arithmetic trees, since fitness will be influenced directly by the simulation cost of executing instructions; building a worker does not cost zero resources, to return to the Starcraft 2 example. On the other hand, decision nodes might consistently leave a branch unvisited, in which case the subtree on that branch is dead weight memory-wise.

Evolution


There’s another important detail yet to discuss: how we select which genetic operations to apply to the individuals in each generation. First, here’s how my rather crude implementation works (approximately) for each generation:

new_population := {}
population := (already initialized)
run simulation on population
sort population according to fitness in descending order
for each genetic operation O:
    N := split_O * N_population
    for i = 1 to N:
        source := population[i]
        emit individual in new_population using O(source)
population := new_population

The split_O values should add up to 1 to keep the population at the same size across generations. These values let you assign what percentage of each population is submitted to each type of genetic operation. In practice, I use 10% cloning, 45% constant mutation, 22.5% tree mutation and 22.5% crossover. You’ll have to experiment to find what works for you.
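
As a sketch, one generation step under such splits might look like this in Java (Individual and Operation are placeholders for your own types; for crossover, the operation can draw its second source internally):

import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class Evolution {
    interface Individual { float getFitness(); }
    interface Operation { Individual apply(Individual source); }

    // splits, e.g. {0.10, 0.45, 0.225, 0.225}, should sum to 1 to keep the size stable.
    static List<Individual> nextGeneration(List<Individual> population,
                                           Operation[] ops, double[] splits) {
        // Sort by fitness, best first; the fittest individuals feed every operation.
        population.sort(Comparator.comparingDouble(Individual::getFitness).reversed());
        List<Individual> next = new ArrayList<>(population.size());
        for (int o = 0; o < ops.length; o++) {
            int n = (int) Math.round(splits[o] * population.size());
            for (int i = 0; i < n && i < population.size(); i++)
                next.add(ops[o].apply(population.get(i)));
        }
        return next;
    }
}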

Why not do GP


GP is great since it (for example) can really learn entire functions independently, right? Yes, but only under certain conditions. To use arithmetic trees for AI agent evaluation functions as an example: first, you need to specify what the outputs of your functions mean and in what context they will apply. Secondly, you need to describe the state of the environment accurately with terminal nodes, so agents are able to make informed decisions. Thirdly, you need to think about the set of nonterminal nodes. Including every binary function on real numbers you can think of is not necessarily useful; evolution will likely be much faster and more stable with a limited set of nonterminal operators.

Genetic programs share the same weakness as GA in general: a large number of generations with a large number of individuals must be computed for any learning/optimization to happen. This pretty much means GP is too slow for any online adjustment of the genomes. You need to be able to do batch processing to execute a huge number of trials in advance (offline), and that entails having a simulation of the world. Consequently, GP is more suitable for games (on the condition that they have batch-processable simulation cores) than, say, robotics, where it is harder to produce simulations of the environment the agents operate in.

GP and GA are random searches that will be outclassed by search algorithms better suited to the problem space, and should be seen as a last resort or an “I don’t care if it takes longer” option. There are no silver bullets in AI; always investigate thoroughly before picking your approach, or you’ll end up wasting your time.

Example project


The code attached to this article is the command-line simulation and evolution tool in Java/Eclipse that I developed in tandem with my app. The Lua source of my original app is not something anyone would want to view anyhow.

The game


In the game of Sevens, the decisions an agent needs to make are of a very basic nature: select the best card in two different contexts, when playing or when “sending” a card to a player who lacks a playable card. So, we can calculate a (utility) value for a given action (a playable card), and we can pick the action with the best value. This is a reasonable way of modelling the decision structure of the game, and its simplicity means we can evolve basic arithmetic functions with GP that calculate these values from the state of the game for a particular action and context.


Attached Image: cards.png
Which one???


Sevens is a highly stochastic game in which the outcome is often decided by the cards dealt, so the fact that a player happens to win a single round does not imply superiority. However, the law of large numbers applies: if you play 1000 games in which agent A, pitted against B and C, wins 500, you can reasonably assume A is the better agent. A word of warning, though: some games display non-transitive relations when comparing the skill levels of various agents, and although I couldn’t find this to be the case in Sevens, it may make it necessary to be very careful in how you select agents to pit against one another, to ensure they meet varied opposition.

Running the tool


The tool simply executes from SevenGP.main() without taking any parameters or input. Thus, it’s preferable to work from within Eclipse so you can modify the main function as needed. If you feel an urge to rant about this, consider that you will not be executing this kind of tool with changed input parameters very many times. You can download Eclipse for free here.

The tool generates a file called computer.txt in the project directory, containing the exported Lua evaluation functions. I’ve left a default generated file there for your interest. Notice how at both easy and medium many of the functions are downright trivial, consisting of a single terminal – this is due to the stochastic nature of the game and the fact that 0 and 1 generations, respectively, were executed for these skill levels. Only at hard, with 11 generations, do you start seeing more interesting evaluation functions. I commented out the expert level, as it’s not necessary for demo purposes and takes a while with 111 generations.

How to adapt for own needs


I would recommend you rewrite the code in your favorite language and for your own game (while making improvements), in order to understand it. Try to avoid language mismatches like the Java/Lua mismatch in this implementation; it is certainly preferable to share game simulation code rather than rewrite it in two languages. Nothing prevents you from generating e.g. C# code from C# (heck, in .NET even at runtime if desirable) if you’re concerned about the efficiency of the intermediate tree representation.

You might find it possible to directly reuse this code in JVM projects - feel free to do so, although refactoring is still recommended. I did not intend this to be used as a library and I have no desire to start reworking it into one myself.

Speaking of libraries, there are probably a couple to be found on the Internet in case you want to use a ready-made solution rather than do it yourself. If you feel confident you understand GP sufficiently, or you need a solution for a production-level game, that’s likely the better option.

Structure of code


.sevengp package: Contains the main class SevenGP which you’ll probably want to get reading/playing around with right away.

.sevengp.game package: Contains the game simulation. Read up on this if you want to understand the game rules clearly, but you don’t really need to.

.gp package: Some base classes and interfaces for GP development, though not particularly likely to be of generic use. The Individual class is quite interesting, since it contains implementations of the genetic operations in its constructors.

.sevengp.gp package: Extensions to the base GP package and player class for Sevens.

.sevengp.gp.alphabet package: Terminals and nonterminals for Sevens. Many operations reside here in object-oriented fashion (aspect-oriented programming might have been more appropriate). Reading these is highly recommended.

There’s some confusion in the code regarding the term “individual”. The Individual class only represents a single tree, whereas the Specimen class is a true individual, specifically tailored for the game of Sevens with two different trees for the play and send contexts. Other terms might be similarly confused; consider this a disclaimer.

Final words


Hopefully this helps beginners understand GP a bit better, and possibly gives them that final nudge that makes all the pieces of the puzzle fall into place with enough understanding to implement GP. In my (subjective) opinion, GP isn’t a difficult concept at all, and it can be a very useful tool in the gameplay programmer’s arsenal. Just make sure it’s the best option before you spend time applying it to a problem.


Top image credit: GECCO

Limits of Developing a Web-Based Hidden Object Game for Learning Languages

As production of "Pavel Piezo - Trip to the Kite Festival" draws to a close later this year, I reviewed the material I had collected for the Postmortem and found it too much and too diverse to put into one huge article. So I identified the topics that stand very well on their own and are not limited to this specific game or production, and decided to write three short(er) posts in advance of the Postmortem.

Setting Up Production


Earlier this year, after the game design and concept for "Pavel Piezo - Trip to the Kite Festival" were almost done and funding was secured, the crucial question had to be answered: Which game engine or programming system should be used for production?

Within our company, intolabs GmbH, the core team for this production consists of only two people: Sven Toepke as core programmer, and me as game designer, producer, additional programmer, auxiliary art director and everything else. Sure, we would have external, outsourced help with artwork, audio, marketing and so on, but the core development and production would be split between the two of us.

The game is to be released for tablets with iOS, later for tablets with Android and Windows and after that for Windows and Mac desktop.

A specific game engine springs to mind? Yes, it does.

But as we had virtually no budget, we chose a different solution. We had both successfully done various projects with HTML5, Javascript, jQuery, Cordova/Phonegap etc., and earlier this year Adobe committed to the CreateJS suite to deal with canvas, sound, tweens, preloading and such. Since we had already done some prototyping with this combination, the decision was made to use it as the "game engine".

After all, it's just big static pictures with some sprites for items, a few animations and sounds, right? Well, yes and no.

Although the game does run well, even on our minimum-specification test device, we did hit some limits and road bumps that are worth noting for anyone who plans to delve into creating games with HTML5/JS/canvas.

Missing or Insufficient Libraries or Functions


This one's the easiest to convey, as you may already be aware of it if you dabble in HTML5/JS. In particular, we missed particle effects and a more elaborate animation library. Sure, you can solve the problems yourself or use additional third-party libraries, but although you can overcome all the problems described in this article, doing so adds to the complexity, memory usage, programming effort and production time.

Particle Effects

Since we only needed one nice effect, instead of using an additional library for particle effects we opted for sprite animations with semi-transparent sprites. This is a valid solution and works well, but it adds to memory management and overdraw (see below). No biggie, but using a particle system would have been easier.

Animations

Again, we opted to work with what is provided by the described combination of systems. In our case we were missing circular animations. Sven programmed these himself, which is very doable in Javascript, but having this function within, say, CreateJS/TweenJS would have saved time. (You may find more elaborate animation functions in libraries like melonJS, CAAT, Canvas Engine or impactJS.)

2D Overdraw in canvas

While this one wasn't bothering us too much, we could see the effect in early performance tests and countered its drag on the frame rate from the very beginning. In case you are not familiar with the problem of overdraw, you can do a quick search here on gamedev.net, on Gamasutra, on Stack Overflow or simply via a search engine, but overdraw in 2D sprite-based games is quickly explained:

If the graphics of your sprites (objects) are not a perfectly filled rectangle, which they seldom are, you'll have transparent pixels around your sprite's graphic, filling it out to the closest bounding rectangle. This rectangle is the sprite (object) that you place and move around. Now, if you have objects that overlap in the transparent areas, the renderer still has to calculate the visibility of every transparent pixel. If multiple objects/sprites overlap, this has to be calculated for every pixel at each step of the Z (depth) order individually. This adds up, especially when animating. CreateJS/EaselJS allows you to group objects together in a container object, which in turn can be cached, but again, this adds to complexity and programming effort.

Right after seeing the issues in our first tests, we decided to preemptively cut background layers with huge transparent areas into smaller objects and arrange them together on stage. Additionally, we changed sprites that were originally positioned partially behind a "background" object into rectangles that include the pixels of that "background". This allowed such a sprite to be placed in front of the background layer, switching to a version with transparency only for animation.

Again, extra work.

Advanced 2D game engines provide additional, better techniques, such as putting sprites on non-rectangular meshes or using specific, well-adapted renderers. These were not available in any HTML5/JS/canvas library we looked at. That may change with time, but I found it's due to constraints in canvas itself and its implementations in different browsers.

Sound and FX

"Pavel Piezo - Trip to the Kite Festival" is relying heavily on sound. Each level has two different background loops, which are played out of alignment, sound effects for the GUI and game-events, plus there are a few different short spoken phrases for every item and active area in each level / picture.

For instance, if you see in the GUI that you are to look for sunglasses, tapping on them in the GUI will play one of two or three short sentences, like "You need to find the sunglasses". Tapping on them in the picture (finding them) will again play one of two or three sentences, like "Nice! You found the sunglasses." Tapping on an item in the picture which you are not (yet) looking for will also play one of two or three sentences, like "Very useful, a toothbrush." Remember, Pavel Piezo is a game for getting to know foreign languages, so we maximize exposure to the vocabulary and phrases, as well as to the association between the picture of an item and the spoken word.

The biggest problem we had with handling all these sound files is described in the passage about memory below, but there were additional quirks and annoyances. In the end, our finding was that we had to cheat our way around obvious negligence in the implementations of sound across different versions of browsers and webkit, and even versions of the underlying OS or hardware.

When preloading all the sounds for a level proved to be too much for memory, we switched to streaming the bigger and less frequently used files. However, depending on the system, that added up to 500 ms of latency before a sound started playing. This was not limited (only) to the least powerful system we tested with; the latency varied, seemingly at random, between OS versions, etc.

The most bizarre glitch we found was that one target system didn't seem to like some of our MP3s and would cut them a few hundred milliseconds short during playback. We tried re-encoding and checking the files' metadata, as well as several other methods, to identify the culprit. It wasn't the longest sounds, it wasn't files of a specific length, it wasn't anything we could identify in the metadata; we didn't find the reason within our limited time. We could, however, reproduce the glitch: it was always the same files that got cut short.

In the end we just checked the playback of all audio files and padded the ones that got cut with half a second of silence... more work.

Memory Management


"But Carsten, you are developing for mobile devices, you have to be aware of memory constraints right from the start!" you say and prepare to skip this paragraph.

“Yes”, I say, “we know”, I say, “we were”.

What we were not completely prepared for was how much developing a webkit-based application tightens the corset even further, despite our having developed similar applications in the past. The webkit on our minimum-spec test device left us with little more than 200 MB of usable memory, of which Cordova and the Javascript libraries ate a good 100 MB from the start.

We spent a good deal of time and effort optimizing the graphics, handling preloading and streaming, finding the highest compression for sounds while preserving the desired quality, cutting graphics to save on transparent areas (which eat up memory when displayed on stage) and utilizing other tricks mobile developers are well acquainted with.

While time consuming, this was still very doable; of course, you should optimize the heck out of your application for mobile devices regardless. But even on our min-spec system, a game engine that does not run on top of webkit would have given us more than double the memory to work with.

Conclusion


All the small problems aside, HTML5/JS/canvas can be a very viable combination for your development, and with Cordova/Phonegap there are few other ways to make your application cross-platform capable with so little effort.

Just be aware of the constraints that are still in place today.

Until full-fledged game engines like Unreal Engine or Unity3D become available on top of canvas, there'll be a bunch of extra work, and the additional memory constraints remain. We feel that we have reached some of the limits of what is possible with "Pavel Piezo", especially on older devices that are still widely used. It's clear that we could still optimize further with much more tinkering. From the perspective of production, though, it's simply more feasible to use a full-fledged system for cross-platform game development.

Still, in our opinion, the combination of HTML5/JS/jQuery/CreateJS/Cordova/Phonegap is the choice for making nifty, good-looking cross-platform apps in record time. Just as with modern HTML/CSS, the application doesn't have to look "html-y", and if the application consists mostly of logic, "screens" and some slick animations for transitions, popups, slide-ins etc., you can't beat the speed and ease of cross-platform development for a production of a certain scope. As we relied heavily on sound and tried to use as much animation as possible in "Pavel Piezo - Trip to the Kite Festival", we did hit a point where it became clear that, pushing further with coming releases, the effort for optimization would be too high compared to using a full-fledged game system.

Maybe that will change (again) in the future, but for now we have a very good idea of the boundaries that exist, of when to use HTML5 and when to use purpose-built tools and engines.

"Not So Random Randomness" in Game Design and Programming

As production of "Pavel Piezo - Trip to the Kite Festival" draws to a close later this year, I reviewed the material I had collected for the Postmortem and found it too much and too diverse to put into one huge article. So I identified the topics that stand very well on their own and are not limited to this specific game or production, and decided to write three short(er) posts in advance of the Postmortem.

This bit of information was first posted on Gamasutra as a short comment on Lucky Breaks. I use this technique in almost every game design and programming project; it's a handy tool in the belt and justifies a longer explanation and some examples.

Does "randomness" have to be really random?


In game design there is a kind of love-hate relationship with randomness. On the one hand it allows for variety with many types of content; on the other hand one can't "design" true randomness. How about a function that provides random outcomes, but within parameters that can be influenced to fit the game design? It's little effort compared to what is gained in control, it can easily be abstracted in code, and it only requires holding a few more variables for each not-really-random decision.

A quick example: how I have previously dealt with rolling dice to calculate "loot" in a "chest". (It was actually something different, but the mechanics are the same and the concepts of "chest" and "loot" are instantly recognizable to any game grognard.) Let's say the chance of finding a golden ticket, alongside the usual sell-loot, in a chest should be 1:10000. If you use true randomness, the player may find a ticket in three consecutive chests and after that find no ticket in three months of play. So what we do first is reduce the number 10000 by 100 with every chest opened, store this value in howLikelyIsPlayerToFindTicket, and use this variable to calculate our random chance: 1:9900, 1:9800 and so on. The chances get better with every chest until, after 100 tries, a ticket is guaranteed. Remember, it is still possible for the player to find a ticket at a chance of 1:8400 or at any other point along the way; the odds simply get better with every try.

Additionally, when the player has found a ticket, we don't want her to find another one again too soon after. We set another variable, absolutelyNoTicketFindable, to, say, 10. With every chest looted, we make sure that no ticket is in there and decrease the variable by 1. Once absolutelyNoTicketFindable reaches 0, we go back to our initial 1:10000 chance and to reducing it by 100 with every chest opened. We have introduced a few additional values: Base Chance (10000), Increase Chance (100), Blocker (10), Current Chance (X, between 10000 and 0) and Current Blocks (Y, between 10 and 0).

If you are a programmer, you can already see where this is headed.

I would hold BC, IC and B as statics (or in a parent class) and CC and CB as variables (or within an object, or in a derived class). A function MyRandomSuccess() processes these statics and variables (or receives the classes and objects as parameters). It calculates success against the current values, modifies the variables accordingly, and simply returns true or false. Depending on how you want to influence the outcome, you can introduce as many additional values as you wish; a sketch in code follows below.
  • You can influence the success with buffs, power-ups, in-game events or what-have-you.
  • You can reduce or increase “Increase Chance” with a buff or power-up.
  • For a level set in a "poor" area, the player never finds a ticket and finds fewer of the lootable "rare" items.
  • You don’t have to define a completely different lootable for every occasion. Simply tag the area or the specific chest as "poor".
  • If it fits better, the value of absolutelyNoTicketFindable can be an amount of time that counts down.
  • You can influence variables for e.g. "hard hit" and "critical hit" depending on the level difference between opponents, to even out the playing field in a MOBA.
  • You can generate filler enemies with not-so-random strengths and weaknesses based on the player's performance in the game thus far.
Properly abstracted, the function can be used for countless other decisions. I have used this technique for (something comparable to) critical hits, enemy encounters, "random" goodies in gamification, chances in a lottery depending on real-life weather, etc.
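
As a concrete (if minimal) sketch in Java, using the example values from above – class and field names are illustrative, not from any particular codebase:

import java.util.Random;

// Sketch of MyRandomSuccess() for the golden-ticket example:
// Base Chance 10000, Increase Chance 100, Blocker 10.
final class TicketChance {
    static final int BASE_CHANCE = 10000;
    static final int INCREASE = 100;
    static final int BLOCKER = 10;

    int currentChance = BASE_CHANCE; // CC: shrinks toward a guaranteed find
    int currentBlocks = 0;           // CB: chests that must not contain a ticket
    final Random rng = new Random();

    boolean myRandomSuccess() {
        if (currentBlocks > 0) {     // a ticket was found recently: block the drop
            currentBlocks--;
            return false;
        }
        currentChance = Math.max(1, currentChance - INCREASE); // odds improve each try
        boolean success = rng.nextInt(currentChance) == 0;     // 1 : currentChance
        if (success) {               // reset the odds and start the blocking window
            currentChance = BASE_CHANCE;
            currentBlocks = BLOCKER;
        }
        return success;
    }
}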

Voila, randomness harnessed.

In conclusion: the mechanics in question are still "random", can be heavily influenced by game design and, in my experience, are far easier to balance than having, for instance, 100 different lootables and just switching them around.


Using a "Leitner System" to Track a Player's Exposition to Content and Mechanics

As production of "Pavel Piezo - Trip to the Kite Festival" draws to a close later this year, I reviewed the material I had collected for the postmortem and found it too much and too diverse to put in one huge article. So I identified the topics that stand well on their own and are not limited to this specific game or production, and decided to write three short(er) posts in advance of the Postmortem.

The Leitner system


You may know the "Leitner system" from your days at school, university or pretty much any other situation where you tried to get new knowledge into your head. Using the Leitner system is often called "using flashcards", since the two go together very well, but strictly it means "employing the Leitner system with the use of flashcards". Wikipedia explains:
"The Leitner system is a widely used method to efficiently use flashcards that was proposed by the German science journalist Sebastian Leitner in the 1970s. It is a simple implementation of the principle of spaced repetition, where cards are reviewed at increasing interval.
...
In this method flashcards are sorted into groups according to how well you know each one in the Leitner's learning box. This is how it works: you try to recall the solution written on a flashcard. If you succeed, you send the card to the next group. But if you fail, you send it back to the previous group. Each succeeding group has a longer period of time before you are required to revisit the cards."

It is a very good method for us humans to learn vocabulary or any other information, and I will show how we can employ it within computer games to efficiently track a player's behavior and learning progress within a game system. Using that information, a game can react and tune any dynamic gameplay, content exposition or usage of game mechanics that we, as designers, choose. From here on I assume that you either know how the Leitner system works or have at least had a good look at the graphics and animations on the corresponding Wikipedia page. It is easy in principle and I won't explain it further.

Flashcards in Pavel Piezo


The "Pavel Piezo" games are a series about learning foreign languages. In the games, all questions, phrases and short sentences are entirely in the chosen (foreign) language. For the sake of readability I will write everything in english in this article.

We are using digital flashcards with the Leitner system to track a player's progress in learning the vocabulary. But we also use it for choosing which items (and therefore vocabulary) to put on the stage when a new level is started, and for choosing items for the short vocabulary test between levels. For this, we extended the classic system by a few values.

Every time the player taps an item on the screen, she hears the spoken word within a short sentence ("That's a toothbrush."). This counts as one exposition and gets noted on the virtual flashcard that accompanies that word; its exposition counter is incremented by one. If the player finds an item she is currently tasked to look for, thereby making an active recognition of the link between the spoken word and the picture of the corresponding item, a recognition counter is incremented on that word's virtual flashcard. Finally, in the (skippable) short vocabulary test between levels, a question is played through audio ("Which one is the apple?"), four pictures are shown and the player has to tap on the picture of the apple. Doing that successfully on the first try increments a known counter on that word's virtual flashcard.

With this we know if and how often a player has been exposed to the link between a spoken word and the corresponding visual, if and how often this link has been actively recognized, and if and how often knowledge of this link has been successfully queried. When choosing pictures for the vocabulary test, we employ the standard Leitner system to choose which vocables to query, but we only choose words for which the player has had at least one exposition or, better yet, recognition within the game. In practice we choose words with a combination of high exposition, high recognition and low known counters.
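
A minimal sketch of such an extended flashcard record, and of ranking candidates for the test, might look like this in C++. The field names and the scoring weights are my own illustrative assumptions, not the actual values used in the game:

#include <algorithm>
#include <string>
#include <vector>

struct Flashcard
{
    std::string word;
    int exposition = 0;   //how often the player heard the word (tapped the item)
    int recognition = 0;  //how often the player actively found the tasked item
    int known = 0;        //how often the word was answered correctly in the test
};

//rank candidates for the between-level test: prefer words the player has met
//often (exposition, recognition) but has rarely answered correctly (known)
std::vector<Flashcard*> pickTestWords(std::vector<Flashcard> &deck, size_t count)
{
    std::vector<Flashcard*> candidates;
    for (Flashcard &card : deck)
        if (card.exposition > 0)  //only query words the player has actually met
            candidates.push_back(&card);
    std::sort(candidates.begin(), candidates.end(),
              [](const Flashcard *a, const Flashcard *b)
              {
                  int scoreA = a->exposition + 2 * a->recognition - 3 * a->known;
                  int scoreB = b->exposition + 2 * b->recognition - 3 * b->known;
                  return scoreA > scoreB;
              });
    if (candidates.size() > count)
        candidates.resize(count);
    return candidates;
}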

At the beginning of a level we do not choose the items purely at random to provide variety (see previous article); we also consider which vocabulary the player has been least exposed to in previous levels or play-throughs. Additionally, if the system identifies items which the player has been exposed to many times but has never identified in the vocabulary test, we can make sure those items get chosen as often as possible. (Or maybe we have to check the corresponding graphical representations…)

Less obvious applications


That's all well and fine for games in which the player learns languages, but how about a few examples of game designs where the application is less obvious?

Imagine a game where, to open a door, the player can either pick the lock or just bash the door in.

At one point in the game's story it is vitally important to pick some doors so as not to be detected by guards. If the game has stored the tries and successes of previous lockpickings, the system can react accordingly: remind the player that lockpicking exists at all, offer to jump to the corresponding section in the tutorial, or make picking the locks in question extremely easy (or extra hard, if previous success suggests so).

Another application is for game mechanics. Let's take a shooter as example.

Our assumed player clearly prefers to jump in, shotguns blazing. But at one point in the game there's an event where, for story or design reasons, the sniper rifle is needed to successfully overcome the obstacles. With information on how often (or rarely) the sniper rifle was used before, and with how much (or little) success, the game can tweak the gameplay leading up to this point accordingly. In our case the supply of ammunition for the shotgun would run out, sniper rifles and ammunition would be placed in plain sight, and enemies would be spawned that are easy to pick off with the sniper rifle. Effectively, the area leading up to the sniper event is turned into an in-game tutorial.

Of course, if we know the player knows very well how to handle the sniper rifle and just prefers to use a shotgun, we can leave her to her own style of playing, resting sure that the upcoming sniping will cause no frustration.

We can also make decisions based on a player's choice of classes in a MOBA or MMO.

If the player can choose different classes, characters or skill sets, which one does he choose, and how often? How easy is it for him to find a group? How "successful" is the group? How much time does he spend with this character, class or skill set overall? We store this, and more if desired, in flashcards associated with the player.

(Keep in mind, we store this to have information about the player's preferences and act accordingly. Gathering information about how effective a class or combination of classes is against another class or combination of classes is a different beast.)

With the information we gather about the player's preferences and knowledge of the game system we can, for instance, fine-tune the grouping suggestions, making sure the player gets suggestions in which he can play his preferred style. On the other hand, we could offer a reward for playing a rarely or never played class, thus exposing him to content he would otherwise miss out on.

Conclusion


Gathering metrics about players, play styles, content exposition and usage of game mechanics is a deep rabbit hole. There are so many possibilities that it's hard to decide what information to collect and how. In my experience, starting out with an abstracted and generically applicable flashcard system that can be extended to one's own needs helps immensely to get started with key metrics, and allows for easy expansion and tuning where necessary.


Introduction to Game Programming with CUDA


Intro to CUDA


Modern game engines have a lot going on. With so many different subsystems competing for resources, multi-threading is a way of life. As multi-core CPUs have gotten cheaper and cheaper, game developers have been able to more easily take advantage of parallelism. While Intel and AMD fight to bring more and more cores to the CPU, GPUs have easily surpassed them in raw parallel ability. Modern GPUs contain thousands of cores, allowing tens of thousands of threads to execute code simultaneously. This presents game developers with yet another opportunity to add parallelism to their programs. In separate threads, an engine may want to perform a search or sort against a large amount of data, pre-process trees, generate a large amount of random data, process an image or perform calculations to be used for a transformation or collision detection. Any parallel computational task can be a good candidate for offloading to the GPU. This article aims to show you one possible way of harnessing that ability in a game using NVidia's CUDA.


CUDA is a parallel computing platform and programming model that allows code to run directly on the processing cores that make up modern GPUs. It was created by NVidia and is currently only supported on NVidia's hardware. It is similar to OpenCL in idea but different in execution. Using CUDA is as simple as having a recent NVidia graphics card and downloading the free SDK. Links for Windows, Linux and Mac OSX can be found here. While it is proprietary to NVidia, the programming model is easy to use, is supported by many languages such as C/C++, Java and Python, and is even seeing support on ARM architectures. The CUDA programming syntax itself is based on C and so pairs well with games written in C or C++. The CUDA code you write is compiled to object code with NVidia's nvcc compiler and then linked with standard C code using gcc or Visual Studio to produce the final program. For simple programs, the same file can contain both your entry point and your CUDA function(s). After downloading and installing the toolkit, compiling CUDA code can be done from the command line with the nvcc compiler or through Visual Studio using the CUDA Runtime template, which makes it easy to combine standard C/C++ and CUDA code files in one project.
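
For instance, compiling a single-file program from the command line might look like this (the file name is hypothetical):

$ nvcc -o cuda_add cuda_add.cu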


To demonstrate CUDA with C, we can start with a simple addition function. All samples shown in this article were compiled with the CUDA 5.5 toolkit:


__global__ void cudaAdd(int a, int b, int *c)
{
	*c = a + b;
}

This program adds two numbers and stores the result in c. The __global__ qualifier marks this function as a kernel, an entry point for the CUDA program. Now we will see how to call the above program. This can be placed in the same file to create one complete program:


#include <stdio.h>
    
__global__ void cudaAdd(int a, int b, int *c)
{
	*c = a + b;
}

int main()
{
    int a = 4;
    int b = 7;
    int *c;
    int answer;
    cudaMalloc((void**)&c, sizeof(int));
    cudaAdd<<<1,1>>>(a, b, c);
    cudaMemcpy(&answer, c, sizeof(int), cudaMemcpyDeviceToHost);
    printf("%d + %d = %d\n", a, b, answer);
    cudaFree(c); //free the device memory we allocated
    return 0;
}

Programs on CUDA are executed as kernels, with one kernel executing at a time. A kernel can be run by just one or by thousands of threads at the same time. Since we are retrieving a result from the GPU, we first use CUDA to allocate device memory for it. Next we execute our program, using the <<< >>> syntax to specify how many blocks and threads we want the kernel to use. The number of threads that can run in a block depends on the specific architecture of the GPU you have; on Fermi GPUs you can execute up to 1024 threads per block. For this simple example we are just executing one thread on one block. Once we have the data in our c variable, we need to copy it back to system memory using cudaMemcpy. Finally we can display the result.
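
One practical note not shown in the sample above: most CUDA runtime calls return a cudaError_t, which is worth checking while developing. A minimal check might look like this:

cudaError_t err = cudaMemcpy(&answer, c, sizeof(int), cudaMemcpyDeviceToHost);
if (err != cudaSuccess)
    printf("CUDA error: %s\n", cudaGetErrorString(err));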


Performing a Reduce


With a simple example out of the way, we can look at a more common one. A reduce is a parallel operation where data that exists across many threads is combined over a series of steps until a single value is held by one thread. A common example is computing a sum, where each step adds the values of two different threads. After each step, fewer and fewer threads are used, until only the final thread adds the last two values remaining and holds the sum. For this sample, we will demonstrate a program that has separate threads count the number of 5's in parts of an array and then performs a reduce to get the final total. This sample can be run over any number of blocks and threads:


__global__ void countFives(int *array, int size, int *total)
{
    int index = threadIdx.x;
    int totalThreads = blockDim.x * gridDim.x;
    int totalThreadIndex = (blockIdx.x * blockDim.x) + threadIdx.x;
    __shared__ int sharedCounts[512];

    //first determine how many elements each thread must count
    int chunk = (size / totalThreads);
    if (size % totalThreads > 0)
        chunk++;

    int start = totalThreadIndex * chunk;
    int end = start + chunk;
    if (end >= size)
        end = size;

    sharedCounts[index] = 0;
    //note: *total must be zeroed by the host (e.g. with cudaMemset) before the
    //kernel launches; zeroing it here would race with atomicAdd from other blocks

    //have each thread count its own elements and store in shared memory
    for (int i = start; i < end; i ++)
    {
        if (array[i] == 5)
        {
            sharedCounts[index]++;
        }
    }
    __syncthreads();
	
    //now perform a reduce to get the sum of all counts
    //the stride tells us how many elements to include at each level
    //each loop reduces the number of threads needed until only the first thread is used to capture the count
    for (int stride = 1; stride < blockDim.x; stride*=2)
    {
        int offset = index*(stride*2);
        if (offset + stride < blockDim.x)
        {
            sharedCounts[offset]+=sharedCounts[offset+stride];
        }
        __syncthreads(); //all adds at this stride must finish before the next level
    }
	
    //now have the first thread of each block sum the results to global memory
    if (index == 0)
    {
        atomicAdd(total, sharedCounts[0]);
    }
}

This program has three basic steps. First we broke the array up into chunks and had each thread look for 5's in its own chunk. Then we performed a simple add reduction across the threads of each block, storing the result in the first element of each block's shared memory. For the last step we used an atomic add to update the global total across the different blocks; the atomic add prevents contention issues between threads. The __syncthreads function shown here provides a barrier for the threads: all threads must reach this point before the program can continue. The example as a whole is inefficient, as it only uses about half the total threads for the reduction and has potential contention issues when accessing global memory, but it hopefully demonstrates the basic concept of a reduction. The following allocates memory for the array and calls the function:


const int size = 11;
int sourceArray[size] = { 1, 4, 5, 2, 5, 6, 8, 9, 5, 12, 5 };
int total; //stores final value we can examine
int *cudaTotal; //value to allocate for cuda to use
int *cudaArray;
cudaMalloc(&cudaTotal, sizeof(int));
cudaMalloc(&cudaArray, sizeof(int)*size);
//zero the total on the device before launching the kernel
cudaMemset(cudaTotal, 0, sizeof(int));
//copy our source numbers to cuda before calling
cudaMemcpy(cudaArray, sourceArray, sizeof(int)*size, cudaMemcpyHostToDevice);
countFives<<<2,2>>>(cudaArray, size, cudaTotal);
//copy our result from the device to the program's memory
cudaMemcpy(&total, cudaTotal, sizeof(int), cudaMemcpyDeviceToHost);
//free the device memory when done
cudaFree(cudaTotal);
cudaFree(cudaArray);

CUDA Thrust


A really great library that can be used for common CUDA tasks is Thrust. Thrust is a template library for CUDA with STL-like syntax that increases developer productivity. The CUDA SDK comes with a version of Thrust that can easily be used in C++ code. The following demonstrates a sum reduction and a count of fives using the same array as above:


#include <thrust/device_vector.h>
#include <thrust/count.h>

    ...

    thrust::device_vector<int> thrustArray(11);
    thrustArray[0] = 1;	thrustArray[1] = 4; thrustArray[2] = 5; thrustArray[3] = 2;
    thrustArray[4] = 5;	thrustArray[5] = 6; thrustArray[6] = 8; thrustArray[7] = 9;
    thrustArray[8] = 5;	thrustArray[9] = 12; thrustArray[10] = 5;

    //compute the sum of all elements in our array
    int sum = thrust::reduce(thrustArray.begin(), thrustArray.end(), (int) 0, thrust::plus<int>());
    //get a count of just the 5's in our array
    int count = thrust::count(thrustArray.begin(), thrustArray.end(), 5);
    printf("Array sum: %d Count of fives: %d\n", sum, count);

As you can see, the syntax is very similar to the Standard Template Library and makes it very easy to call common functions, saving you lots of coding time. It also integrates well with STL vectors. For useful examples of what Thrust can do, you can go here.
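
As one more example of the time saved, sorting the same device vector in place is a one-liner; this sketch assumes the thrustArray from above:

#include <thrust/sort.h>

    ...

    thrust::sort(thrustArray.begin(), thrustArray.end());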


Integrating with OpenGL


A great feature of CUDA is its built-in ability to work with OpenGL directly. This allows a CUDA program easy access to data such as textures, pixel buffers or vertex buffers, to perform operations against them quickly. Here we will see how we can use CUDA to alter data in a vertex buffer in parallel. The buffer shown here will be small and simple for demonstration purposes. I won't show all of the basic OpenGL setup or program layout here, but this sample will work with code from any basic OpenGL tutorial. I placed all my OpenGL code and the main game loop in one C file, and the CUDA kernel function and a wrapper to call it in a separate file with a .cu extension.


To get started, first we need to define our simple data structures to use to create the vertex buffer:


struct vertex {
    float x;
    float y;
    float z;
};

struct VertexType
{
    vertex position;
    //texture coordinate and other information below
    ...
};

Next we want to allocate an array to use for our vertex buffer using the above structures and then generate a buffer. For this sample we will just allocate an array of four vertices to store a quad. We also need a global variable to store the ID of our vertex buffer:


    GLuint vbufferId;
    ...

    VertexType verts[4];
    verts[0].position.x = -1.0f; verts[0].position.y = 1.0f; verts[0].position.z = 0.0f;
    verts[1].position.x = 1.0f; verts[1].position.y = 1.0f; verts[1].position.z = 0.0f;
    verts[2].position.x = 1.0f; verts[2].position.y = -1.0f; verts[2].position.z = 0.0f;
    verts[3].position.x = -1.0f; verts[3].position.y = -1.0f; verts[3].position.z = 0.0f;

    //fill in texture coordinates, etc
    ...

    glGenBuffers( 1, &vbufferId );
    glBindBuffer( GL_ARRAY_BUFFER, vbufferId );
    glBufferData( GL_ARRAY_BUFFER, 4 * sizeof(VertexType), verts, GL_DYNAMIC_DRAW );

With a simple buffer created, we can now create a CUDA resource to store a pointer to our vertex buffer. We need another global variable to store our resource:


struct cudaGraphicsResource *cuda_vb_resource;

Then we can map it to our vertex buffer immediately after the glBufferData call above:


cudaGraphicsGLRegisterBuffer(&cuda_vb_resource, vbufferId, cudaGraphicsMapFlagsWriteDiscard);

The resource now has a pointer to the vertex buffer we created above. This allows us to retrieve and modify the vertices using CUDA. The actual program to modify them is very simple. Since we want to stretch our quad in all directions, we first get a value of +1 or -1 by dividing each vertex position by the absolute value of itself. Then we multiply it by the elapsed time in seconds and by our desired rate of movement of .05 units a second.


__global__ void update_vb(VertexType *verts, double timeElapsed)
{
    int i = threadIdx.x;
    //fabsf is the single-precision absolute value function available in device code
    float valx = verts[i].position.x / fabsf(verts[i].position.x);
    float valy = verts[i].position.y / fabsf(verts[i].position.y);

    verts[i].position.x += valx * timeElapsed * .05f;
    verts[i].position.y += valy * timeElapsed * .05f;
}

I placed this code in a file separate from the main C file with the OpenGL code and gave it a .cu extension. Note that the program assumes that each thread will only act on one vertex. It also assumes one block for simplicity, but you could easily execute this over multiple blocks if you had enough vertices. We use the index of the current thread to determine which vertex to operate on. We also use an elapsed time variable to control how much change we want in each loop. This helps keep the movement constant if frame rates vary and our time elapsed delta is constantly changing.


The last step now is to create a function to call our CUDA kernel. We can place this function in the same .cu file. The extern keyword is used so that our main c program is able to find it when compiling and linking.


extern "C" void cuda_kernel(VertexType *verts, double timeElapsed)
{
    update_vb<<<1,4>>>(verts, timeElapsed);
}

All the wrapper needs to do is pass in the arguments and tell CUDA how many blocks and threads to run on. In this example we tell it to run 4 threads in one block, so each thread has its own vertex. With the function in place, we can call it from the main logic loop. If you are using multiple files, you will want to put the above function's signature, with the extern keyword, in your main C file so it can be found when linking. This code is set to execute once per loop:


    VertexType *verts;
    size_t num_bytes; //receives the size of the mapped buffer
    cudaGraphicsMapResources(1, &cuda_vb_resource, 0);
    cudaGraphicsResourceGetMappedPointer((void **)&verts, &num_bytes, cuda_vb_resource);
    cuda_kernel(verts, timeElapsed);
    cudaGraphicsUnmapResources(1, &cuda_vb_resource, 0);

The code works by getting a pointer to the vertices in the vertex buffer that is mapped to our CUDA resource. The vertices are passed to the kernel wrapper to be modified, then unmapped so they are released. This sample assumes there is some code for getting the elapsed time delta between this loop and the previous one; QueryPerformanceCounter works well for this. After clearing buffers and setting our texture, our render code looks like this:


    glEnableClientState( GL_VERTEX_ARRAY );
    glEnableClientState( GL_TEXTURE_COORD_ARRAY );
    glTexCoordPointer( 2, GL_FLOAT, sizeof(VertexType), (GLvoid*)offsetof( VertexType, texcoord ) );
    glVertexPointer( 3, GL_FLOAT, sizeof(VertexType), (GLvoid*)offsetof( VertexType, position ) );

    //now draw the array
    glBindBuffer(GL_ARRAY_BUFFER, vbufferId);
    glDrawArrays(GL_QUADS, 0, 4);

    glDisableClientState( GL_TEXTURE_COORD_ARRAY );
    glDisableClientState( GL_VERTEX_ARRAY );

The last step is to free our resources:


    cudaGraphicsUnregisterResource(cuda_vb_resource);
    glBindBuffer(GL_ARRAY_BUFFER, 0); //unbind the buffer before deleting it
    glDeleteBuffers(1, &vbufferId);

And that's it. OpenGL integration is fairly straightforward when dealing with buffers. This example can easily be extended to cover texture buffers, pixel buffers or render buffers as well.


Integrating with Direct3D


Similar to its integration with OpenGL, CUDA provides the ability to tie in with Direct3D 9, 10 or 11. Here I will demonstrate the Direct3D 11 version of modifying a simple vertex buffer. Just like with the OpenGL example, we will create a simple 2D quad that we can resize in a game loop. We can use the same vertex structure from the OpenGL example, which allows us to use the same CUDA kernel function as we did earlier:


    struct cudaGraphicsResource *cuda_vb_resource;
    ...
    VertexType *vertices; //system-memory arrays used to initialize the buffers
    unsigned long *indices;
    D3D11_BUFFER_DESC vertexBufferDesc, indexBufferDesc;
    D3D11_SUBRESOURCE_DATA vertexData, indexData;
    HRESULT result;

    m_vertexCount = 4;
    m_indexCount = 6;

    vertices = new VertexType[m_vertexCount];
    if(!vertices)
    {
        return false;
    }
    indices = new unsigned long[m_indexCount];
    if(!indices)
    {
        return false;
    }

    vertices[0].position.x = -1.0f; vertices[0].position.y = -1.0f; vertices[0].position.z = 0.0f;
    vertices[1].position.x = -1.0f; vertices[1].position.y = 1.0f; vertices[1].position.z = 0.0f;
    vertices[2].position.x = 1.0f; vertices[2].position.y = 1.0f; vertices[2].position.z = 0.0f;
    vertices[3].position.x = 1.0f; vertices[3].position.y = -1.0f; vertices[3].position.z = 0.0f;
    //fill in other properties
    ...
    //fill in indices for 2 triangles
    indices[0] = 0; indices[1] = 1; indices[2] = 2;
    indices[3] = 0; indices[4] = 2; indices[5] = 3;

    //create a dynamic vertex buffer
    vertexBufferDesc.Usage = D3D11_USAGE_DYNAMIC;
    vertexBufferDesc.ByteWidth = sizeof(VertexType) * m_vertexCount;
    vertexBufferDesc.BindFlags = D3D11_BIND_VERTEX_BUFFER;
    vertexBufferDesc.CPUAccessFlags = D3D11_CPU_ACCESS_WRITE;
    vertexBufferDesc.MiscFlags = 0;
    vertexBufferDesc.StructureByteStride = 0;
    vertexData.pSysMem = vertices;
    vertexData.SysMemPitch = 0;
    vertexData.SysMemSlicePitch = 0;

    result = device->CreateBuffer(&vertexBufferDesc, &vertexData, &m_vertexBuffer);
    if(FAILED(result))
    {
        return false;
    }

    //now create the index buffer
    indexBufferDesc.Usage = D3D11_USAGE_DEFAULT;
    indexBufferDesc.ByteWidth = sizeof(unsigned long) * m_indexCount;
    indexBufferDesc.BindFlags = D3D11_BIND_INDEX_BUFFER;
    indexBufferDesc.CPUAccessFlags = 0;
    indexBufferDesc.MiscFlags = 0;
    indexBufferDesc.StructureByteStride = 0;
    indexData.pSysMem = indices;
    indexData.SysMemPitch = 0;
    indexData.SysMemSlicePitch = 0;

    result = device->CreateBuffer(&indexBufferDesc, &indexData, &m_indexBuffer);
    if(FAILED(result))
    {
        return false;
    }

With the buffers created we can associate the resource and our vertex buffer like we did with OpenGL:


    cudaGraphicsD3D11RegisterResource(&cuda_vb_resource, m_vertexBuffer, cudaGraphicsRegisterFlagsNone);

Finally our rendering code looks like this:


    unsigned int stride = sizeof(VertexType); 
    unsigned int offset = 0;
    
    deviceContext->IASetVertexBuffers(0, 1, &m_vertexBuffer, &stride, &offset);
    deviceContext->IASetIndexBuffer(m_indexBuffer, DXGI_FORMAT_R32_UINT, 0);
    deviceContext->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST);

With that up and running, we can call the update from inside a game loop just like with OpenGL. The example I wrote used the exact same kernel and external wrapper function from the OpenGL example:


    VertexType *verts;
    size_t num_bytes;

    cudaGraphicsMapResources(1, &cuda_vb_resource, 0);
    cudaGraphicsResourceGetMappedPointer((void **)&verts, &num_bytes, cuda_vb_resource);
    cuda_kernel(verts, elapsedTime);
    cudaGraphicsUnmapResources(1, &cuda_vb_resource, 0);

Lastly we need to clean up:


    cudaGraphicsUnregisterResource(cuda_vb_resource);
    if(m_indexBuffer)
    {
        m_indexBuffer->Release();
        m_indexBuffer = 0;
    }

    if(m_vertexBuffer)
    {
        m_vertexBuffer->Release();
        m_vertexBuffer = 0;
    }

    delete [] vertices;
    delete [] indices;

Now we have seen some basic examples of how to create CUDA programs and how they can directly interact with data from OpenGL or Direct3D. These examples are pretty basic but hopefully provide a springboard to more advanced concepts. The SDK is loaded with useful samples that demonstrate the power and flexibility of the toolkit.


Why your Games are Unfinished, and What To Do About It

This post originally available on my dev blog.

So, you've got a new game idea, and it's going to change what everyone knows about the genre! Great!


After making a Game Design Document, you proceed to make some art, or maybe a prototype. You even got that fancy GIMP program, or started using a new 'multi-platform' library.

Time went on and you hit a wall. Maybe it's that annoying bug in the second level. Your plans aren't panning out that well. It's just too much work.

You start making excuses. The game idea wasn't that great. It might actually be a bit boring. The art looks crappy.

You abandon the project. There are better ideas you say.

If the above sounds like you, then the bad news is that, the way things are going, you might not release any games at all and just keep them locked inside your head!

The good news is, you're not alone. Almost every game developer loses interest in projects they are working on.

Coming from my personal experience, and interviewing a few other successful game developers, I've compiled a list of things to do when you find yourself killing off your own games.

1. Stop Editing


When writing, authors typically have one rule when making their first draft, and that is to kill the 'infernal internal editor'. "Don't edit, just write!"

This actually carries over to a lot of other creative industries, including game development. When developing a game, always make it so it just barely passes, and move on. The more you work on other parts of the project, the more motivated you will be. Don't try to perfect your game on the first run, remember, you can always edit it later on.

2. Make A Deadline


This goes hand in hand with Number One. Enforce a time constraint, do your best to stick to it, and you'll find yourself working on the essential game aspects.


There are lots of events with this in mind, such as OGAM (One Game A Month) and Ludum Dare.

3. Go For Small Games - if you're just starting


When you're just starting, start small. Making a fun, solid game or minigame is a huge leap for a game developer, and having at least one released game already puts you ahead of many of your contemporaries.


"But that super awesome MMORPG with that unique mechanic is going to be huge" you say. That sort of enthusiasm will go a long way, but if you haven't even been able to create one small game, do you really have what it takes to commit yourself to such a big project?


If your game idea simply and absolutely cannot wait, then try creating what's called a 'Vertical Slice'. Instead of creating your entire game, why not create one scene, one battle, or one encounter? This is a win on all fronts, because you can instantly:
  • Test out your idea
  • See if it's actually fun
  • And actually have something to show for your effort

4. Make it a Habit


Whether you're someone who makes games as a hobby or someone who really wants to get into the industry, make it a habit. Do one part of your game every day. It doesn't matter how much you can do in a day; the important part is that you work on the game.

You can even get yourself a to-do list. Ticking something off as done gives you a nice feeling in your stomach!

5. Don't worry about the technology



You're salivating over that new libgdx library that can compile to every platform known to mankind. You want to use Haxe because it's fast, multiplatform, and l33t. Microsoft killed XNA, and you avoid it like the plague.

The thing is, Don't Care! Remember, you're making a game, and it doesn't matter what language you use.

If your game is boring, no one will play it, even if you used the newest, shiniest language to have ever existed.

The next tip is also an inherent flaw of game programmers, but can be applied to game developing in general.

6. Keep It Simple Stupid!


If you're a programmer, just code, don't edit (slightly bringing us back to tip #1).

Design patterns? Throw 'em away. Component-based systems? So last year. Event listeners inefficient? Leave them be.

Keep It Simple, Stupid (KISS) is an actual programming methodology. It's what it says on the tin: just keep your code simple. Don't get fancy with design patterns, component-based systems, or making your loop run in the most efficient way possible. Premature optimization is the root of all evil.

Take pride in doing what you did, even if it was bad code. You might have a game with bad code, but at least you're not the other guy who has no game but good code.


7. Public Beta Tests


When losing motivation, try being public! Share what you have so far, be it a doodle, a screenshot, or maybe even a demo. Get a friend to play your game; with the internet, you have no excuse for not finding anyone.

The feedback you'll get for your game is priceless, outlining what's fun and what's not, and it may even be the push you need to make it big.

8. Flow


If there was ever a point in your life where you were so absorbed in something that you didn't even notice time passing, then you've experienced flow. When you're in this state, you're so focused on what you're doing that you won't even notice a plane crashing next door (okay, maybe that's an overstatement).

The point is, we can be totally immersed in one activity, and this is what you want to happen when developing your game. Close your browser and focus, have fun, and don't think of anything else. Throw out coding practices and optimizations, and don't perfect stuff. Just do it.

9. It's Dead Jim


Maybe the game really didn't pan out as you'd hoped. The gameplay was really flawed and it's not fun.

Sometimes, we need to quit when it's simply not working. (Seth Godin tackles this in his book, The Dip: A Little Book That Teaches You When to Quit (and When to Stick).)

Remember, there is nothing wrong with making a game and just leaving it at that. You've gained experience, and that's always a plus. But don't just leave your game at the back of your hard drive; going back to tip #7, be public! Share it on forums, saying it was a game you did in your free time that was left unfinished.

What do you know, maybe someone even gives you valuable feedback that turns out to be all that you needed...

How to Outsource Art

Many clients and friends often ask me, "We have never outsourced art work before; how should we start?" "What things should I prepare for you (the outsource team)?" Here I'll share some ideas; hopefully they will be useful.

I think a game developer who wants to outsource art can follow these steps:

  1. Sort out and make a full list of the assets which should be done by the outsource team. For example, you may decide to outsource all or part of the background images, the animations, the UI, etc. Background images are roughly done in two steps, line art and coloring; you may decide to let the in-house team do the line art and the outsource team do the coloring.
  2. Find reference pictures for a style/quality guide. Look for pictures of in-game quality to show the outsource team the art style and quality you need for your game. The reference pictures could be finished pictures from your own game, or screenshots of other games.
  3. Set technical specifications. For example, if you outsource background images or UI, you should tell the outsource team the resolution the images need to be delivered in. The image size affects the amount of labor and the price.
  4. Write down a description for each asset, for example a background image or a UI button. Write what you want, and your ideas.
  5. Get a quote. After giving the above #1~4 to the outsource team, you can ask them for a quote, and then negotiate with them. This step is important. Please don't assign a test before you get the quote. Chances are the team can deliver nice pictures, but you can't afford them.
  6. Assign a test. After you get a satisfactory quote from the team, you can assign a piece of art work as a test. It's best if it's a part of your upcoming game, as that's the most relevant content for a test. During the test, you can watch the team to see their way of processing things, their working speed and, above all, the art quality.
  7. Proceed to the formal commission. If you feel the team is competent for your project, you can sign a contract with them and enter the formal commission stage.

And during the production phase, there are two concerns I want to add:

  1. Whether to give some info about the storyline and gameplay: no doubt this info would help the art contractors do things better. Sometimes, due to confidentiality or other reasons, you would rather not give this information. Nonetheless, make sure the description and reference for each art piece are clear, so that the contractor will be able to deliver precisely what you want.
  2. When the game is released, share the news with the art contractors. This gives them a sense of engagement and achievement, and they will be more willing to work with you next time.

How To Make My Sample Library Sound Good - Part 1 Staccato Strings

When real-life instrumentalists play their instruments, they naturally make their own subtle accents. This video demonstrates how you can replicate this to make music that further involves your audience. It also covers the basics of MIDI editing for shorter notes and how to start a piece of music in a DAW. In short, it will enhance the way you use a sample library.

Software Used:
  • DAW: Cockos Reaper
  • Sample Library: EWQL Hollywood Strings

Attached Image: ccs-197647-0-68390500-1384690046.png


The image above shows the music contained in the video, colour coded according to the strength of accent on the 4 beats of each bar. The first beat is almost always accented slightly, unless the real-life player has been told otherwise. We also see this on the 3rd beat of each bar (in 4/4), and it then happens on the 2nd and 4th beats when there are quavers (8ths).
  • Red = Strongest Accent
  • Yellow = Medium Accent
  • Green = Weakest Accent

Why Does This Happen?

This happens because of the way players count. In 4/4 time, players count 1 and 2 and 3 and 4 and, whereas in 6/8 time they count 1 2 3 4 5 6, with the accents placed accordingly.

It is essential that you understand the counting and the slight stresses players put on the notes to successfully mimic the way real life musicians play.

Bear in mind that these accents are actually not very strong relative to an actual accent on printed music.

Note: The patches used in this tutorial are 'round robin' patches, which means there are multiple samples per note per velocity group. This example may therefore sound better than non-round-robin patches; however, the technique can still be used with those patches.


The Video



Conclusion


This video is part of a series which is currently in the process of being made, so keep an eye out for the next article! Thank you for watching the video/reading the article!

Article Update Log


17 Nov 2013: Initial release

How To Reverse Time - Introduction to Git, Cloud Computing, and Version Control

This post originally available on my devblog.

"Are you telling me I need to rework my entire game from scratch!?"
"If you want to pass, then yes."

This is how a conversation went down between one of my university friends and his professor. My friend had the unfortunate luck of his laptop dying on him, along with all its data, the day his final project was due. Now, while I love my friend, the professor should have failed him. Or maybe it's the professor who failed, since he didn't teach one very important thing. In the era of cloud computing, there's no excuse for lost data anymore. Backing up your data, and even sharing it with others, has become trivially easy. Perhaps you've heard of Git, or maybe GitHub. People treat them like the best thing since sliced bread, and you're left wondering, "What the hell are these things?" If you want a (beginner-friendly) introduction, read on!

But first...

The easy - Cloud file hosting


Cloud hosting services are easy ways to backup your data. All you do is install a program, and you'll have a folder that gets instantly 'synced' with your account. That way, if you lost your local copy, you can always download the data from your account. Neat huh?

These 'free' cloud hosting services are perhaps the most popular:

Dropbox - is the most popular. It starts out with 2 GB, but you can earn up to 20 GB for free quite easily! (It also helps to have connections to a senior VP architect there *grin*)

SkyDrive - Microsoft's Skydrive generously offers 7GB upfront! There's no way to make it bigger though (unless you want to pay).

GoogleDrive - Offers 5 GB upfront but, like SkyDrive, you're stuck at that amount unless you want to hand over some cash.

The 'big three' are enough of a list to get you started. Pick one, and when you're ready, go ahead.

The trivial - Version Control


This happens all too often. Let's say you're working on a game and, after hours of work, you have some solid mechanics down. 'Alpha', you call it. You continue working on your game for hours, adding new features and stuff. But compiling it, you realize you've made a drastic mistake in your code. After fiddling with the code a few hours more, you realize what a mess your code now is. You bang your head on the wall - if only you could reverse time to when everything worked, to your precious 'Alpha'.

This is where version control comes in. True to its name, a version control system, *ahem*, 'controls' your versions so you can easily "reverse time" when you need to. Unlike cloud file hosting, where your files get synced every time you make an update, here you save your files as a new 'version' whenever you choose.

This is a HUGE time saver, since if you want to 'revert' back changes, you could easily do it with a click of a button! (or pedantically, typing a few words)

There are lots of version control systems (VCS), but Git is probably the most relevant right now. There are still people who use SVN though, and knowing two VCS can't hurt, right?

The Idea


In its most simple form, here is a Version Control project.

Attached Image: GIT.png


Every little box above is a "version", which we call a "commit". Every "commit" gets saved on what's called the "repository" which is just a fancy word for "the box that contains all your saved data".

Whenever you want to go 'back in time', you "revert" your changes. If we were midway through 'alpha' to 'gold', but we realize there's a mistake, we can easily revert our changes back to 'alpha'.

Collaboration

VCS are more sophisticated than that though, and they are VERY HELPFUL in collaborations, not only because you can revert mistakes of your team, but also because you can have very real boundaries and division of labor.

Let's say your team decides what to work on. We decide that Member A will work on the mechanics of the game. The physics engine is horrendously slow, so we assign Member B, to rework the physics engine.

If we used what we did above, blindly committing your team's changes to one line will get ugly real quickly.

What we could do is set up what are called "branches".

Attached Image: GIT2.png


In the above, instead of committing on the main line, otherwise called the 'trunk', we create a 'branch' specifically for physics. Note the first commit in the physics branch: it has a 'commit message' aptly named 'buggy physics'. This does not at all affect the main line (containing alpha), and Member A can still work on the mechanics of the game, completely oblivious to the buggy physics.

Member B then manages to get the physics right, and finally 'merges' it. Member A, who at this point did not even see the changes by B, is delighted to see the new, fast physics engine.
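
In Git, for instance, this branch-and-merge workflow might look like the following on the command line (the branch name is taken from the example above; the individual commands are covered in more detail later in this article):

$ git checkout -b physics      # create and switch to a 'physics' branch
$ git commit -am 'buggy physics'
...
$ git checkout master          # switch back to the main line
$ git merge physics            # merge the finished physics work into it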

Here we see a more complicated VCS, but note that it simply has the concepts above.

Attached Image: 650px-Subversion_project_visualization.s


That's great, but how do you use it!

SVN


If you're using Windows, TortoiseSVN is an absolutely fantastic tool to work with SVN.

Attached Image: SVNProgrammers.png
TortoiseSVN, right click menu


Here we see some of the options of TortoiseSVN. They are self-explanatory: 'SVN Commit' commits your files, while 'Revert' rolls back changes. 'SVN Update' updates your 'local working copy', so that if your teammates made a new commit, you can see the changes (your files get synced).

SmartSVN is a good alternative if you're not in Windows.

GIT


First off, Github != Git.

Git, like SVN, is the 'real' version control system. Github is a web-based hosting service. Think of it as Facebook for Git.

Now that we have that out of the way, let's look at our options for Git. Go ahead and get Git first.

The best option I have found so far for GUI-based git is TortoiseGit, a port of the above TortoiseSVN. Because it's a port, it looks like the above, only that it uses Git.

There is also GitX for all you Mac lovers out there.

So what's github?

Github is like a social network that uses Git. You can commit your games/files there, and people can take a look at them and play with your code, copying your repository and working on it on their own (called forking).

There's a private repository option, but that requires payment.

If you want to have a private git repository for free, try bitbucket.

Good VCS practices

  • Keep your commit messages short but punchy.
  • Always, always use good commit messages. Tell everyone what you did with your commit, so they can track your changes. There's nothing scarier than a log full of 'fix bug' or 'updated game' messages that give you no idea what was changed.
  • Note the side effects of your changes. If you edited the GUI, which in turn changed how the levels work, make sure to note it.
  • Branch your code when necessary.
  • Test your code before you commit on the main trunk. Seriously.

Git - Something special for real programmers


You ask "Why did you give me Git Gui Options! Tell me how to use git on the terminal!"

If you consider yourself a real programmer, who can't be bothered to use GUI crap, please read my article, Why your games are unfinished, and what to do about it.

If you still want to learn Git through terminal, make sure you're not doing it because you're being an elitist. The above tools are more than enough to get you started on git. Ask yourself 'Will there be any practical gains for me by learning how to use Git through terminal?'.

If the answer to the above is yes, and a real definitive YES, then read on.

I'm not going to lie: I use Git through the terminal, and there are real practical gains from doing so. One is that you don't get to use GUI crap (heh), and therefore you can be more productive, because you minimize clicks. Ideally you would never even need to touch the mouse (some IDEs and even text editors have built-in terminals, and you can also Alt-Tab if you need to).

Here are the very basics of git terminal (some lines taken from git-scm, because they had nice colors):

$ git init

The above initializes a repository in your current directory. This creates an empty Git repository.

$ git add *.c
$ git commit -m 'initial project version'


git add *.c stages all the files that have '.c' as an extension for the next commit. The commit command accepts a '-m' flag; the string after it is your commit message.

Note that git commit will only commit to your local repository. If you're working with a team / using a remote repository, you also need to push your commits to it:

$ git push origin master

If you want to clone a repository, then use the clone command:

$ git clone git://github.com/schacon/grit.git

You can configure your identity by using:

$ git config --global user.name "Your Name"
$ git config --global user.email "username@domain.com"
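
And, true to this article's title, here is how you can actually 'reverse time' from the terminal (the commit hash and file name shown are hypothetical):

$ git log --oneline                # find the commit you want to go back to
$ git revert a1b2c3d               # undo that commit's changes with a new commit
$ git checkout a1b2c3d -- game.c   # or restore just one file as it was back then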

And that's the basics of git!

Note: There are other good VCS like Mercurial, but space is lacking, and this is a good introduction for those who want to get started with VCS.

The Total Beginner's Guide to 3D Graphics Theory


Introduction


When I was a kid, I thought computer graphics was the coolest thing ever. But when I tried to learn about it, I realized it was harder than I thought to create those super slick programs I'd seen growing up. I tried to hack my way through by reading things like the OpenGL pipeline specs, blogs, and websites on how graphics worked, and did numerous tutorials, but I got nowhere. Tutorials like NeHe's helped me see how to set things up, but I would misplace one glXXX() call and my program would either not work or function exactly as before, without my new additions. I didn't know enough about the basic theory to debug the program properly, so I did what any teenager does when they're frustrated because they aren't instantly good at something... I gave up.

However, I got the opportunity a few years later to take some computer graphics classes at the university (from one of Ivan Sutherland's doctoral students, no less) and I finally learned how things were supposed to work. If I had known this before, I would have had a lot more success earlier on. So, in the interest of helping others in a similar plight to mine, I'll try to share what I learned.

The Idea Behind Graphics


Overview


Let's start by thinking about the real world. In the real 3D world, light gets emitted from lots of different sources, bounces off a lot of objects, and some of those photons enter your eye via the lens and stimulate your retina. In a real sense, the 3D world is projected onto a 2D surface. Sure, your brain takes visual cues from your environment and composites your stereoscopic vision to perceive the whole 3D space, but it all comes from 2D information. This 2D image on your retina is constantly changing just from things moving in the scene, you moving in relation to the scene, lighting changing, and so on. Our visual system processes these images at a pretty fast rate, and the brain constructs a 3D model.

Attached Image: muybridgehorse.jpg
Horse movie image sequence courtesy of the US Library of Congress.


If we could take images and show them at a similar or higher rate, we could artificially generate a scene that would seem like a real space. Movies basically work on this same principle. They flash images from a 3D scene fast enough that everything looks continuous, like in the horse example above. If we could draw and redraw a scene on the computer that changed depending on motion through the scene, it would seem like a 3D world. Graphics works exactly the same way: it takes a 3D virtual world and converts the whole thing into an accurate 2D representation at a fast enough rate to make the brain think it's a 3D scene.

Constraints


The human visual threshold for processing a series of images as continuous is about 16 Hz. For computer graphics, that means we have at most 62.5 milliseconds to do the following:

  1. Determine where the eye is looking in a virtual scene.
  2. Figure out how the scene would look from this angle.
  3. Compute the colors of the pixels on the display to draw this scene.
  4. Fill the frame buffer with those colors.
  5. Send the buffer to the display.
  6. Display the image.

This is a complex problem. The time constraint means we can't just use a brute-force method like taking the 3D scene, throwing a bunch of photons into it from all our light sources, calculating trajectories and intensities, figuring out which ones hit the eye, mapping that to a 2D image, and then drawing it. (Note: that's kind of a lie, because that is roughly what happens in raytracing, but the techniques are really sophisticated and different enough that the above holds true.) Fortunately, there are some cool tricks and things we can take advantage of to cut down on the amount of computation.

Basic Graphics Theory


All the World's a Stage


Attached Image: bob-ross-landscape.jpg
Painting by the famous Bob Ross, courtesy of deshow.net.


Let's begin with an example. Say you're in a valley with mountains around you and a meadow in front of a river, similar to the Bob Ross painting above. You want to represent this 3D scene graphically. How do you do it? Well, we can try to paint an image that captures all the elements of the scene. That means we have to pick an angle to view the scene from, paint only the things we can see, and ignore the rest. We then have to determine which parts of which objects are behind others. We can see the meadow, but it obscures part of the river. The mountains are way in the distance, but they obscure everything behind them, so we can ignore the objects behind the mountains. Since the real physical size of the scene is much bigger than our canvas, we have to figure out how to scale the things we see to the canvas. Then we can paint the objects, taking into account the lighting and shadows, the haze of the mountains in the distance, etc. This is a good analogue to how computer graphics processes a scene. The main steps are:

  1. Determine what the objects in the world look like.
  2. Determine where the objects are in the world.
  3. Determine the position of the camera and a portion of the scene to render.
  4. Determine the relative position of the objects with respect to the camera.
  5. Draw the objects in the scene.
  6. Scale the scene to the viewport of the image.

These steps are basically trying to map points from our objects in the 3D world to the 2D image on the screen. This seems like a lot of work, but there are some really cool math tricks we can use to make it quick and easy. Remember going through algebra and thinking, "What will I ever use this for?" One answer is: graphics!

We can use matrices to map coordinates from the world into our image. Why matrices? Well, for one, a lot of operations can be represented in matrix form. Most importantly, though, we can concatenate operations by multiplying matrices together and get a single matrix that does all the operations at the same time. So even if we have 50 transformations, we can multiply the matrices together once to get one matrix that performs all 50.
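
For example, if a point should first be scaled by a matrix \(S\), then rotated by \(R\), then translated by \(T\), the three collapse into a single matrix that can be precomputed and reused for every point:

\[
p' = T(R(Sp)) = (TRS)p = Mp, \qquad M = TRS
\]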

We can define matrices to do the operations that we talked about during our painting example (defining scene, defining view, etc.). These matrices will convert the scene from one coordinate system to another. These conversions between coordinate systems are called transformations. We will talk about each coordinate system and what transformation will move us from one to the other.

Object Coordinates - Breaking up objects


How do we draw objects on the screen quickly? Computers are really great at doing relatively simple commands a lot of times in succession really fast. So, to take advantage of this, if we were able to represent the whole world with simple shapes, we could optimize graphics algorithms to process a lot of simple shapes really fast. This way, we don't have to make the computer recognize what a mountain or a meadow is in order to know how to draw it.

We'll have to create some algorithms to break our shapes down into simple polygons. This is called tessellation. Although we could use squares, we'll probably use triangles. There are lots of advantages to them, such as the fact that the three points of a triangle are always co-planar and that you can approximate just about anything with triangles. The only problem is that round objects will look polygonal. However, if we make the triangles small enough, like 1 pixel in size, we won't notice them. There are lots of opinions on the "best way" to do this, and it might depend on the shape you're tessellating.

Attached Image: sphere.PNG


Let's say we have a sphere that we want to tessellate. We can define the local origin of the sphere to be its center. If we do that, we can use an equation to pick points on the surface and then connect those points with polygons that we can draw. A common surface parameterization for a sphere is \(S(u,v) = [r\sin{u}\cos{v}, r\sin{u}\sin{v}, r\cos{u}]\), where u and v are just variables with a domain of \(u\in[0,\pi],v\in[0,2\pi]\) and r is the radius of the sphere. As you can see in the above picture, the points on the surface are connected with rectangles. We could have just as easily connected them with triangles.
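
As a rough illustration, here is how one might sample such points in C++ from the parameterization above (the function name and the step counts are arbitrary choices of mine):

#include <cmath>
#include <vector>

struct Point3 { float x, y, z; };

//sample a (uSteps+1) x (vSteps+1) grid of points on a sphere of radius r
std::vector<Point3> tessellateSphere(float r, int uSteps, int vSteps)
{
    const float PI = 3.14159265f;
    std::vector<Point3> points;
    for (int i = 0; i <= uSteps; i++)
    {
        float u = PI * i / uSteps;             //u in [0, pi]
        for (int j = 0; j <= vSteps; j++)
        {
            float v = 2.0f * PI * j / vSteps;  //v in [0, 2*pi]
            points.push_back({ r * sinf(u) * cosf(v),
                               r * sinf(u) * sinf(v),
                               r * cosf(u) });
        }
    }
    //neighboring grid points can then be connected into quads or triangles
    return points;
}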

The points on the surface are in what we call object coordinates. They are defined with respect to a local origin, in this case the center of the sphere. If we want to place the sphere in a scene, we can define a vector from the origin of the scene to the point where we want to place the sphere's origin, and then add that vector to every point on the sphere's surface. This puts the sphere in world coordinates.

World Coordinates - Putting our objects in the world


We really start our graphics journey here. We define an origin somewhere and every point in the scene is defined by a vector from the origin to that point. Although it's a 3D scene, we'll define each point as a 4-dimensional point \( [x,y,z,w] \), which will map to a 3D point at coordinates \([\frac{x}{w},\frac{y}{w},\frac{z}{w}]\). There are advantages to using 4D coordinates, but I won't discuss them here. Just know we want to use them.

A problem presents itself if we want to move around in our scene. If we want to change our view, we can either move the camera to another location or just move the world around the camera. In the computer, it's actually easier to move the world around, so we do that and let the camera stay fixed at the origin. The modelview matrix is a 4x4 matrix that we can use to move every point in the world around while keeping our camera fixed at its location. This matrix is basically a concatenation of all the rotations, translations and scalings that we want to apply to the scene. We multiply our points in world coordinates by the modelview matrix to move into what we call viewing coordinates:

\[
\left [
\begin{matrix}
x \\
y \\
z \\
w \\
\end{matrix}
\right ]_{view}

= [MV]
\left [
\begin{matrix}
x \\
y \\
z \\
w \\
\end{matrix}
\right ]_{world}
\]
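In code, that matrix-vector product might look like the following sketch, assuming a 4x4 matrix stored row-major in a flat array (a convention chosen here purely for illustration; real libraries have their own layouts):

type Vec4 = [number, number, number, number];
type Mat4 = number[]; // 16 entries, row-major

// Multiply a 4D point by a 4x4 matrix, e.g.:
//   viewPoint = transform(modelview, worldPoint)
function transform(m: Mat4, p: Vec4): Vec4 {
  const out: Vec4 = [0, 0, 0, 0];
  for (let row = 0; row < 4; row++) {
    for (let col = 0; col < 4; col++) {
      out[row] += m[row * 4 + col] * p[col];
    }
  }
  return out;
}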

Viewing Coordinates - Pick what we can see


After we've rotated, translated, and scaled the world, we can select just a portion of the world to consider. We do this by defining a viewing frustum, or truncated pyramid, formed by 6 clipping planes in viewing coordinates. The idea is that everything outside this frustum will be clipped, or discarded, when drawing the final image. This frustum is defined in a 4x4 matrix. The OpenGL glFrustum() function defines this matrix as follows:

\[
P = \left [
\begin{matrix}
\frac{2*n}{r-l} & 0 & \frac{r+l}{r-l} & 0 \\
0 & \frac{2*n}{t-b} & \frac{t+b}{t-b} & 0 \\
0 & 0 & -\frac{f+n}{f-n} & -\frac{2fn}{f-n} \\
0 & 0 & -1 & 0 \\
\end{matrix}
\right ]
\]

Attached Image: 04.3.frustum.gif
Picture courtesy of Silicon Graphics, Inc.


We can adjust this matrix for perspective or orthographic viewing. Perspective has a vanishing point, but orthographic views don't. Perspective views are what you usually see in paintings, orthographic views are seen on technical drawings. Because this matrix controls how the objects are projected onto the screen, this is called the projection matrix. Here, t,b,l,r,n,f are the coordinates of the top, bottom, left, right, near, and far clipping planes. Multiplying by the projection matrix moves the point from viewing coordinates to what we call clip coordinates:

\[
\left [
\begin{matrix}
x \\
y \\
z \\
w \\
\end{matrix}
\right ]_{clip}

= [P][MV]
\left [
\begin{matrix}
x \\
y \\
z \\
w \\
\end{matrix}
\right ]_{world}
\]
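As a sketch, the projection matrix above can be filled in directly from the six clipping-plane values (again using an illustrative row-major flat array, matching the layout of the earlier transform sketch):

// l, r, b, t, n, f: left, right, bottom, top, near, far clipping planes
function frustum(l: number, r: number, b: number, t: number,
                 n: number, f: number): number[] {
  return [
    2 * n / (r - l), 0,               (r + l) / (r - l),  0,
    0,               2 * n / (t - b), (t + b) / (t - b),  0,
    0,               0,               -(f + n) / (f - n), -2 * f * n / (f - n),
    0,               0,               -1,                 0,
  ];
}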

Clip Coordinates - Only draw what we see


This coordinate system is a bit different. These coordinates are left-handed (we've been dealing with right-handed systems up to now), and the viewing frustum we defined earlier maps to a cube that ranges from -1 to 1 in X, Y, and Z.

Up to now, we've been keeping track of all the points in our scene. However, once we have them in clip coordinates, we can start clipping them. Remember our 4D-to-3D point conversion? If not, we said that \( [x,y,z,w]_{4D} = [\frac{x}{w},\frac{y}{w},\frac{z}{w}]_{3D} \). Because we only want points in our viewing frustum, we only want to further process points such that \( -1 \le \frac{x}{w} \le 1 \), or \( -w \le x \le w \). This goes for the Y and Z coordinates as well. This is a simple way to tell if points lie inside or outside our view.

If we have points inside our viewing frustum, we do something called the perspective divide, where we divide by w to move from 4D to 3D coordinates. These points are still in the left-handed clip coordinate system, but at this stage, we call them normalized device coordinates.
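Here is a sketch of the clip test and the perspective divide just described (the types and names are illustrative):

type Vec4 = [number, number, number, number];
type Vec3 = [number, number, number];

// A point survives clipping only if -w <= x, y, z <= w.
function insideFrustum([x, y, z, w]: Vec4): boolean {
  return -w <= x && x <= w &&
         -w <= y && y <= w &&
         -w <= z && z <= w;
}

// Perspective divide: from 4D clip coordinates to 3D normalized
// device coordinates.
function perspectiveDivide([x, y, z, w]: Vec4): Vec3 {
  return [x / w, y / w, z / w];
}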

Normalized Device Coordinates - Figure out what obscures what


You can think of this as an intermediate step before mapping to an image. If you think about all the possible sizes of images you could have, we don't want to render for one image size and then either scale and stretch the image or re-render the image to fit in case the size changes. Normalized device coordinates (NDC) are nice because no matter what the image size is, you can scale the points in NDC to your image size. In NDC, you can see how the image will be constructed. The image being rendered will be projections of the objects inside the frustrum on the near clipping plane. Thus, the smaller the coordinate of a point in the Z direction, the closer that point is.

At this point, we don't usually do matrix calculations anymore, but apply a viewport transformation. This is usually just to stretch the coordinates to fit the viewport, or the final image size. The last step is to draw the image by converting things to window coordinates.
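A minimal sketch of such a viewport transformation, assuming a window whose origin is at the top-left corner (one common convention; see the next section):

// Stretch NDC, which ranges from -1 to 1, to pixel coordinates.
function toWindow(ndcX: number, ndcY: number,
                  width: number, height: number): [number, number] {
  const px = (ndcX + 1) / 2 * width;
  const py = (1 - ndcY) / 2 * height; // flip Y for a top-left origin
  return [px, py];
}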

Window Coordinates - Scale objects to canvas


The window is where the image is being drawn. At this point, our 3D world is a 2D image on the near clipping plane. We can use a series of line and polygon algorithms to draw the final image. 2D effects, such as anti-aliasing and polygon clipping, are done at this point before the image is drawn.

Our window might also use different coordinate conventions. For example, sometimes images are drawn with positive X to the right and positive Y downward. A transformation might be needed to draw things correctly in window coordinates.

There and Back Again - The Graphics Pipeline


You won't be handling all of the above steps yourself. At some point, you will use a graphics library to define things like the modelview and projection matrices and polygons in world coordinates, and the library will do just about everything else we talked about. If you're designing a game, you don't care about how the polygons get drawn, only that they get drawn correctly and fast, right?

Libraries like OpenGL and DirectX are very fast and they can use dedicated graphics hardware to do these computations quickly and easily. They are already widely available and there are a large number of developers that use them, so get comfortable with them. They still leave you with a lot of control over how things are done and you'd be amazed at some of the things people can do with them.

Conclusion


This is a very simple overview of how things are done. There are many more things that happen at the later stages of the rendering process, but this should be enough to get you oriented so that you can read and understand all the brilliant techniques presented in other articles and in the forums.

External Links


If you're interested in reading up on some of the interesting things in this article, I suggest the following sites:

http://www.scratchapixel.com/lessons/3d-advanced-lessons/perspective-and-orthographic-projection-matrix/perspective-projection-matrix/

http://www.songho.ca/opengl/index.html

Article Update Log


21 Nov 2013: Initial release

Games Are a Whole New Form of Storytelling

I have been fascinated by stories in games ever since I fell deeper and deeper into the universe of gaming. When I started making games I was convinced stories were the most important thing in games. I read books on game design which stated that games couldn't tell decent stories and probably never would. They told me the important thing about games was the mechanics; story was just a lick of paint. That was several years ago, and ever since reading this I have been determined to make stories a part of games that people don't think of as a lick of paint.

As I learnt more about games, I realized that some serious thought was needed on how stories in games should be treated. Thankfully 2013 has provided some amazing games that have finally given me confidence that storytelling has a place in games - but not like in any other medium. This was the revelation I needed.

If we take a step back for a minute and look at storytelling in other mediums, we notice something: it's different in every medium. Books, for example, tell stories by describing situations and characters in a unique way that engages a person's imagination and lets them follow the story in a deep and meaningful way. Movies and television use completely different methods to tell stories, relying on visuals. However, books, movies, and television all have something in common: they are forms of passive storytelling.

Passive storytelling is when the reader or viewer sits back and watches the story as told by someone else. They do not have any involvement in it. This allows the writers to craft a dramatic arc with twists and cliffhangers that put the viewer or reader on the edge of their seat as they follow the story.

This is where games come in. Games have always fascinated me far more than any other medium of entertainment and art because of how many levels of interaction they have. They are interactive and visual, which creates a whole new level of psychology that the player experiences. This, however, in my observation, breaks the normal methods of storytelling. Dramatic arcs, for example, are not as effective because the player has a direct effect on the outcome of the game. The player could walk forward and trigger a cutscene with a plot twist, or they could walk around in circles, breaking the pacing of the story. This makes it much harder to tell a story.

The player is now not a witness; he is not being guided through a story. He is now a part of it. It's not a case of watching a character. The player is the character; the character is an extension of the player in the same way a car is. The character therefore inherits all the personality traits of the player. This can create situations that undermine a linear story, for example if the character is presented as shy and fearful, but the player can then take control and run over people in the street with a car. Admittedly this is an exaggerated example, since many game developers carefully craft their characters and mechanics so it's hard to do something that doesn't make sense. But then you are taking away freedom from the player. A difficult problem to fix.

I have been quite negative so far, so let's see how we can fix this problem. Let's start by looking at the strengths of games. Games are systems. Systems of mechanics, very often with a reward or achievement system. These achievements and rewards are often more psychological than literal - for example, making a player feel powerful after using a massive weapon, or after passing a difficult level. This must be the basis for telling a story, rather than using methods borrowed from other mediums. The problem still exists, though: how can we get the player to behave in a way that makes sense to a story? It is clear games require whole new ways to tell stories than have ever existed before.

Attached Image: assassinpirateJUSTPUSHSTARTDOTCOM-1024x5


The way I think this should be done, and the way a lot of modern games are going now, is context. That is, putting the player in a context that makes sense. A story is not a context; the world is the context - the universe the player is involved in. I have noticed that when a player is given a world that makes sense, they will likely fall deeper and deeper into it. They will take on behaviours that their subconscious thinks they should in that context. This is similar to social convention in the real world; we just need to use it in fictional worlds. The player, however, must feel like they want to be in this world. The Assassin's Creed games are wonderful examples of this. The player is absorbed into a world that they feel makes sense, and they feel like they want to be a part of it. In AC4 the player is lost in a pirate world and so starts to do the things pirates would, which makes sense in the story.

Obviously this has a problem: it doesn't always work. Everyone plays games differently, which means their experience of the game is going to be different, but this is okay. We need to accept that games are not linear. Everyone's experience will be different, but that is because, unlike books or television, the player is actually a part of the story. This is something we need to celebrate about this medium and cherish for its uniqueness.

Doing this in games is not easy; I will not sugar coat that. You need to make the player feel like the character, and then give them a choice of actions that make sense, without making the player feel restricted. The Elder Scrolls V: Skyrim is an example where I don't think this has been done as well as it could be. Players are given a race, which gives them a backstory and personality. It is fully the player's responsibility to act like an elf, for example, and when the player does something that an elf wouldn't do, it doesn't quite feel right. In this sense, a huge number of games are essentially role-playing games, in that the player has to want to role-play the character.

Attached Image: ImaginationFstoppersDOTcom.jpg


The player's imagination is another important part of this process. They need to become the character in their own head. Imagination can also be used to help the player experience a story rather than watch it. If you look at your own life as a story that you experienced, you will see that nobody told you how your life was going; you figured it out yourself based on what was happening around you. Games need to tell stories in the same way. Put the player in a situation where he can use his imagination to connect the dots and realize what's going on. The player will relish the moment he figures it out far more than being told what was going on. It's a case of show, don't tell.

Now I want to tackle the problem with cutscenes and cinematics. They are a form of storytelling borrowed from another medium - this isn't good. We need to figure out how to tell stories without cutscenes. We are getting much better at this, which makes me exceedingly happy. A great example is Call of Duty. The Call of Duty games have very few cutscenes, other than the animated sequences shown during level loading. These sequences never have animated characters, only visuals and voices, and they are short and snappy. The rest of the story is told with dialogue while the player plays the game, and the occasional super-quick cinematic that blends seamlessly into gameplay. This, in my opinion, is a very good way of dealing with linear stories. While on the subject, I'd like to mention how the gameplay of Call of Duty matches the personality of the character, so it never feels out of place in the story.

I was thinking the other day about games and what I remember from them. After a chat with a friend I noticed something: I remember what I call "player engineered moments" far more than I do scripted moments of stories. By player engineered moments I mean memorable moments where whatever happens is a result of my own actions. Examples of games that do this very well are Grand Theft Auto 5 and Just Cause 2. Both are sandbox games and both give the player huge choice over their actions. These are highly non-linear games, and they really do feel like games. If a game is filled with moments like these, you have a situation where the story can be completely and utterly player engineered. The player has full control over the story, and he feels this. Games with branching storylines tend not to feel as freeing as they intend to; I often feel like I'm missing out when I know there is another storyline I am missing. The key is to make the player feel completely in control of the story, as though it's player engineered, even if it isn't as free as the player thinks.

The Walking Dead game, although obviously linear, has moments of this done very well. I was discussing my experiences of the game with a friend when I noticed that our opinions of the characters varied wildly. They hated a character that I loved. The game makes you feel involved in the story far more than a television show because you do genuinely have control over things; even though your choices are minor, they make a noticeable difference in the player's experience.

When crafting stories for games one must remember that the player will enjoy a story he experienced far more than one he was told. Games are not like movies, and I think it is completely valid to think of storytelling in games as something different. To prove my point I will ask you to leave a comment telling me your most memorable moment in games. Is it a scripted moment? Or is it the time you went hunting in Red Dead Redemption and you fought a huge and vicious bear to the death?

I asked myself, after thinking about player engineered moments, whether the actual systems of mechanics in a game could tell a story rather than dialogue and cutscenes. It's a difficult question to answer, although I do believe they can. It's the other side of my previous argument: if the player does things that make sense in the world, they will contribute to the character development and therefore the story of the game. Bioshock Infinite uses subtle psychological cues to influence a player's actions so they complement the story. A particular example is when (SPOILERS) the statue in the city is collapsing, you walk out onto the beach, and everyone is looking at the statue. Your normal human reactions cause you to look in the direction of the statue, because everyone else is. This is superb: you have shown the player something without taking control away from him, and without using cutscenes or cinematics.

Attached Image: BioshockStatueBIOSHOCKWikia-1024x576.jpg


Another example of telling a story with few cutscenes is Brothers: A Tale of Two Sons. The game has no dialogue, yet very effectively tells a story through body language. Can you imagine trying to sway a player's thoughts by the way a character moves? It is an effective means of conveying emotion: just as I mentioned earlier, you show the player instead of telling them. They will connect the dots and create a story. An important factor here is empathy. Emotional empathy is very powerful and can be used effectively to draw the player into the role of a character. If the events around the character in the game would make you feel sad, and you craft animations that make the character look sad, the player will follow. This is another thing games have over movies and books: the feeling of empathy is far more powerful and involving than the sympathy of watching someone else.

When the player has so much control over the story, they may create situations where someone in real life would say "this is the stuff you cannot write". This is perfect: when the player is feeling so involved in the world, they can engineer a story from what they find around them, a story that makes sense in their head and in which they feel a part, like it is real life. This is the power of games, a drastically different medium of entertainment and storytelling than anything we have ever seen before.

As I briefly mentioned earlier, player interpretation is a factor here, a factor we cannot ignore, although I am comfortable in the knowledge that this is okay. Think of a painting; think of how a painter can tell a story using just a simple image, where the viewer's interpretation of the image fills in the gaps and creates the story. Games are no different. This is also an argument for games being art, as they share the way in which the player or viewer interprets something to create the story.

I am confident we are heading towards this in the games industry. 2013 has been the best year for storytelling in my opinion; games like BioShock Infinite and The Last of Us, and indie games like Gone Home, The Stanley Parable, and Brothers, have made mind-blowing leaps towards what storytelling in games should be like.

We must never forget that games are a completely different medium from anything humans have encountered before, so it is unfair to compare the storytelling in games to books and movies and then say it is bad. It's not bad; it's just very different. It's storytelling where the player's imagination tells the story. However, you can still have some control over where the player's thoughts and imagination go, and therefore you can tell a story like never before.

This is a repost from my own site: www.peripherallabs.com

10 Ways To Improve Your Indie Game Development

I realize that most, if not all, of the stuff in here is somewhat obvious – it’s not like I’m the first to have these ideas. However, I believe that a lot of good games could be great ones if some of the following ideas were acted on. Most of these suggestions are either free to implement or require nothing more than an investment of time and energy. This, in itself, is often at a premium, which is why I’ve started the list by discussing community.

Here are my thoughts…

Step 1: Foster a community developed for – and by – your fans


The most important word here is foster. Just because you've thrown up a blog or a forum and you've got a twitter account doesn't really mean you're fostering a community. To build the community and have it last, you've got to be involved with it. This means listening to, responding to, and even implementing the community's suggestions for the game. If you've drawn people to your site who are bothering to post, tweet or comment on your game, chances are they've got some good ideas. Clearly not all of them will be good, but every now and then there's a gem that'll actually improve the game. Don't be scared of trying a few of these ideas - the community will appreciate the fact that you're paying attention.

Big companies generally have the budgets to go beyond these basics – they create art packs and/or music for their community to build their own fansites. This is huge – you’re basically giving your community the opportunity to do your marketing FOR YOU. Let them do it – give them the art, screenshots, wallpapers, mp3s. Whatever it takes to get your game out there – after all, there’s thousands of games coming out a year, yours needs to stand out. So let your community work for you.


Step 2: Don’t worry about graphics, but DO pay attention to art style


Many indie games do this right (Braid, Terraria, Revenge of the Titans, Castle Crashers, Frozen Synapse, Lume, etc etc). There’s a ton of great examples where in lieu of big-budget graphics, they’ve opted for 8-bit looks or just a clean, stylized look. This is great for several reasons – it makes your game more recognizable and also makes your game far less hardware-dependent.

Where this might be a negative is if your look strays too far into a design that some might consider childlike or 'kid-friendly'. Bold, colorful graphics are great, but you might turn off some people who just prefer a more "adult game" look. Regardless of whether people think your game is the best-looking one they've ever seen, by making sure your game looks unique, you're going to guarantee that it will at least stand out, which gives it a far better chance of being remembered by your target audience.


Step 3: Mods, mods, MODS


This is going to give your game longevity out the wazoo. Heck, I've replayed games for hours and hours just to try out new mods. Notable games that have a great mod scene are Half Life 2 (2004), Stalker: Shadow of Chernobyl (2007), Oblivion (2006), and Torchlight (2009). Heck, even Morrowind (2002) has a dedicated mod community that's as strong as ever. There are many more than just these few examples, but the key here is longevity. Granted, Oblivion and Morrowind are huge, expensive games, but even in the most recent super-ridiculous Steam Sale, they only dropped in price by $6 and $5, respectively. They've been lower in the past, but they maintain a reasonably high price for old games, mostly because they can get it. They're well supported, they (by now) run on almost all hardware, and staggeringly enormous amounts of mods are available for them. You can quite literally play these games for years. Only Half Life 2 has a mod community that rivals them.


Step 4: Let me play it first – the disappearing demo


This is something some devs do really, really well – Spiderweb Software and Soldak Entertainment come to mind here. Not only do they both provide comprehensive demos for every game they make, Soldak goes the extra mile and updates the demo with pretty much every patch they apply to the full game, so when you download one of their demos, you know it’s going to be representative of what you’re buying. Kudos to you for that, Soldak.

Demos are going the way of the dodo in the era of 10+ gigabyte games. That’s not surprising, but if you’re an indie developer, this is your chance to get your game out there into people’s hands. This may not always result in a sale, but if it does, that’s not only cash in your pocket, but also a potential one-person marketing team. If they like it, they’re probably going to tell their friends about it or possibly even join and become active in your community. This is worth its weight in gold.


Step 5: Stay focused on your simple, unique idea and implement it well


This is a tricky thing to nail down. You’re developing the game, so clearly you’re doing this. And in some ways, it flies in the face of my suggestion to take your community’s suggestions and implement them. I guess what I mean is this: make sure that whatever the core of your game is, do that one thing really well. In Braid, it was time-manipulation in a platformer. This could have bombed, if they’d over-complicated it, but they stuck to the basics and polished it until perfect. That’s what is going to make your game different from all the others, so make sure it works.


Step 6: Make your website look professional enough for people to give you money


This is something that can be overlooked quite easily. It's also something that isn't necessarily a hard-and-fast rule. But the way I see it, if I'm going to give you $10 or $20 of my money - or more - I want to feel like you are going to be around for a little while. I don't want you and/or the site to disappear. This goes hand-in-hand with creating and supporting a community. If you're working on your game a bunch and want it to be awesome, this is really a no-brainer. If the web's not really your thing and you're more of a programmer, put some feelers out to the community and see if you can get some volunteers to add a level of polish.


Step 7: Keep working on it and keep those patches coming


This, like the above, is going to give your audience and potential customers a lot of confidence. They want to feel like you're working to make this the best game ever. If you don't patch it regularly, and especially when it's necessary, kiss a lot of possible purchases goodbye. Nothing kills enthusiasm more than a buggy game that's not getting the love it needs. People will talk, and negative reviews will get posted. Start working on patches regularly; even if the game's not perfect yet, people will start talking about that instead. This exact thing happened to Star Ruler last year. By most accounts, it was buggy and unfinished out of the gate. This was admitted by the developers, as they simply ran out of cash. People took them at their word that they'd keep working on the game and bought it. Since that time, they've sold enough to still be around a year later, with reports of the game's current state being a massive improvement.


Step 8: Design your game for netbooks and other really low-end hardware


This is sort of a given for most Indie developers simply due to the fact that the majority of them are developing low-poly count, reasonably non-hardware-hungry type games. Generally they’re slightly more casual. That being said, some of them aren’t. However, with the explosion of netbooks and now handheld devices, be they smartphone or tablet, I have a ‘gaming platform’ with me at all times, pretty much. The more platforms you can get your game on, the more it will sell.


Step 9: Leverage other people's work


It was very interesting to work in the game industry until about the early 2000s. We were figuring out new rendering technology, how to solve physics, collision, etc. At some point in the early 2000s these all sort of became solved problems. There really isn't any value in solving them again (back to point one). You might be really interested in how physics simulations work, but you need to decide if you want to play with a physics engine or make games for a living. There's tons of middleware out there - if it solves a problem for you, use it, instead of wasting your time reinventing the wheel. This is a great lead-in to our next topic.


Step 10: Make it addictive


This is one that can’t always be implemented. And of course, all designers probably want this. Get ‘em hooked on a simple but addictive gameplay element and you’ve pretty much guaranteed yourself a bundle o’ dough. Hell, who woulda thunk a game with a unicorn mascot based around a game that became popular with a daytime TV show (Pachinko) would help get their company sold for over a billion dollars? I probably wouldn’t have put money on it if someone had said it that way… but if you play Peggle… then you start to understand why. That crap is addictive like heroin added to your morning cup of coffee. Heck, I’ve put more hours into Zuma-alikes than I’d care to admit.

And frankly, although I seriously dug Crackdown, the thing that kept me playing for hours and hours on end were those goddamn orbs. I just.. needed…. to… get… one… more…

Addict me, I’m begging you.


Article Update Log


23 Nov 2013: Initial release
26 Nov 2013: Changing the title from: How to build a better indie game in 10 steps

Writing Fast Code: Introduction To Algorithms and Big-O

This post was originally published on my devblog.

Did you know computers are stupid machines? They are actually so stupid that they don't know how to multiply; instead, they add - whenever you say 2*5, the computer is really doing 2+2+2+2+2!

What computers can do, and have always been able to do, is execute things incredibly quickly. So even if it uses repeated addition, a computer adds so quickly that you don't even feel it adding repeatedly; it just spits out the answer as if it had used its memory like you would have.

However, fast as computers are, sometimes you need them to be faster, and that's what we're going to tackle in this article.

But first, let's consider this:

Posted Image


Here are 7 doors, and behind each is a number; one of them hides a 36. If I asked you to find it, how would you do it?

Well, what we'd do is open the doors, and hope we're lucky to get the number.

Posted Image


Here we open 4 doors, and find 36. Of course, in the worst case scenario, if we were unlucky, we might have ended up opening all 7 doors before finally finding 36.

Now suppose I give you a clue. This time, I tell you the numbers are sorted, from lowest to highest. You can instantly see that we can work on opening the doors more systematically:

Posted Image


If we open the door in the middle, we know which half holds the number we are looking for. In this case we see 24, so we can tell that 36 must be in the latter half.

Posted Image


We can take the concept even further, and - quite literally - divide our problem in half:

Posted Image


In the picture above, instead of worrying about 6 more doors, we now only worry about 3.

Posted Image


We again open the middle, and find 36 in it.

In the worst case scenario, if we opened the doors wildly, or linearly, opening doors one by one without a care, we would have to open all doors to find what we are looking for.

In our new systematized approach, we would have to open a significantly lower number of doors - even in the worst case scenario. This approach is logarithmic in running time, because we always divide our number of doors by half.

Here is a graph of the doors we have to open, relative to the number of doors.

Posted Image


See how much faster log n is, even on incredibly large input. In the worst case scenario, if we had 1000 doors and we opened them linearly, we would have to open all of them. If we leverage the fact that the numbers are sorted, however, we can continually divide the problem in half and drastically lower that number.

An Algorithm is a step-by-step procedure for solving a problem, and here we have two basic algorithms.

As you have already seen, our second algorithm is faster than our first in the worst case scenario. We can classify them by running time, the time needed to complete the problem, which we measured here by the number of doors opened (since opening doors takes time).

The first algorithm, which is a linear search algorithm, would have to open ALL doors, so we say its running time is O(n) (pronounced big-O-of-n), where n is the number of doors (the input).

The second algorithm, which is a binary search algorithm, would have to open a logarithmic number of doors, so we say its running time is O(log n).

In the so-called O-Notation, O represents the time needed for your algorithm to complete the task in the worst case scenario.

We haven't looked at the other side of the coin however. What about the best case scenario?

In both of our algorithms, the best case scenario would be to find what we're looking for behind the very first door, of course! So, in both our cases, the best case running time would be 1. In this notation, omega (Ω) represents the best case scenario, so we say Ω(1).

Here is a table so far of our algorithms:

                Ω    O
Linear Search   1    n
Binary Search   1    log n
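Here is a minimal TypeScript sketch of both door-opening strategies on an array of numbers (the function names are my own; binary search assumes the numbers are sorted):

function linearSearch(doors: number[], target: number): number {
  for (let i = 0; i < doors.length; i++) { // O(n): may open every door
    if (doors[i] === target) return i;
  }
  return -1;
}

function binarySearch(doors: number[], target: number): number {
  let lo = 0, hi = doors.length - 1;
  while (lo <= hi) {                       // O(log n): halve each time
    const mid = Math.floor((lo + hi) / 2);
    if (doors[mid] === target) return mid;
    if (doors[mid] < target) lo = mid + 1; // look in the upper half
    else hi = mid - 1;                     // look in the lower half
  }
  return -1;
}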

Now unless all you're going to write in your life are door problems in the style of Monty Hall, we need to study more interesting algorithms.

Sorting - Case Study


Let's say I'm making an RTS, and I need to sort units by their health.

Posted Image


As you can see here, the positions of our units are a mess!

A human being can sort this out very easily - one can instantly 'see' that it should be arranged like so:

Posted Image


The brain makes some pretty interesting calculations, and in a snap, it sees how the problem can be solved, as if in one step.

In reality though, it actually does a series of steps, an algorithm, to sort out these pigs. Let's try to solve this problem programmatically.

One very straightforward way to solve this problem is to walk through our input, and whenever two neighbors are not in order, swap them:

Posted Image


Are the first two pigs sorted? Yes, so we leave them be.

Posted Image


Are the next two pigs sorted?

Posted Image


No, so we swap them.

Posted Image


We will do this repeatedly, until we walk through our input and see that we have in fact sorted it out.

To save you some bandwidth, here is our sorting algorithm in an animation, courtesy of Wikipedia:

Posted Image


(The animation is pretty long, so you might want to refresh the page to start over)

The algorithm we have described here is a Bubble Sort. Let's define its running time, shall we?

How many steps do we take to fully sort it out?

First we walk through our input. If our input has n elements, that is a total of n steps.

But how many times do we start over and walk again? Well, in the worst case scenario, that is, when the numbers are arranged largest to lowest, then we would have to walk through it n times too.

So n steps per walk, and n walks - that is a total of n².

In the best case scenario, we would only need to walk through it once (and see it's already sorted). In that case, n steps per walk and 1 walk - that is a total of n.

Now you might have noticed this, but n² is a pretty bad number to have. If we had to sort 100 elements, then that means we have to take 100² steps - that is, 10,000 steps for 100 elements!
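For reference, here is a minimal sketch of the bubble sort just described, sorting an array in place:

function bubbleSort(a: number[]): number[] {
  let swapped = true;
  while (swapped) {                          // up to n walks in the worst case
    swapped = false;
    for (let i = 0; i + 1 < a.length; i++) {
      if (a[i] > a[i + 1]) {
        [a[i], a[i + 1]] = [a[i + 1], a[i]]; // swap out-of-order neighbors
        swapped = true;
      }
    }
  }
  return a; // best case: 1 walk (n steps); worst case: n walks (n² steps)
}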

Selection Sort


Let's use another approach. This time, we walk through the input left to right, keeping track of the smallest number we find.

After each walkthrough, we swap the smallest number with the left-most one that is not in the correct place yet.

To again save you some bandwidth, here is an animation, again courtesy of Wikipedia:

Posted Image


The red item is the current lowest number that we save, presumably in a variable. After the walk, we put this lowest number to the top of the list, and we know that this is sorted already (yellow).

But what is the running time?

In the worst case scenario, we would have to do n walks. The number of steps per walk decreases, though: since we know that the first few numbers are already sorted, we can skip them. So our walks become n, n-1, n-2... steps, and so on.

The total running time of this is n(n-1)/2. However, constant factors like the division by 2 are dropped in this kind of analysis, so we still say this is n².

But what about the best case scenario? If you think about it, we would still need to walk through the input, even if it is already arranged! So our best case scenario is also n².
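A minimal sketch of selection sort:

function selectionSort(a: number[]): number[] {
  for (let start = 0; start < a.length - 1; start++) {
    let min = start;
    for (let i = start + 1; i < a.length; i++) {
      if (a[i] < a[min]) min = i;            // track the smallest so far
    }
    [a[start], a[min]] = [a[min], a[start]]; // swap it into place
  }
  return a; // always walks the remaining input: n² best and worst case
}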

Insertion Sort


Okay, so all of our examples so far have been n², which we know is bad. Let's take a look at another algorithm, shall we?

Imagine you're sorting a deck of cards. Now I don't know about you, but what most people would do is walk through the deck and insert each card into its correct position in another, sorted pile.

This might help you visualize it:

Posted Image


This is awesome! We only need to walk through the list once! So the running time is n, right?

Not quite. What if we needed to insert 1 into our sorted list? We would have to move every single element to make room. If you consider this, the worst case running time is actually n².
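A minimal sketch of insertion sort:

function insertionSort(a: number[]): number[] {
  for (let i = 1; i < a.length; i++) {
    const card = a[i];           // the next card to place
    let j = i - 1;
    while (j >= 0 && a[j] > card) {
      a[j + 1] = a[j];           // shift larger cards right to make room
      j--;
    }
    a[j + 1] = card;
  }
  return a; // already sorted: n steps; reversed input: n² steps
}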

Let's table them, shall we?

                Ω     O
Bubble Sort     n     n²
Selection Sort  n²    n²
Insertion Sort  n     n²

Are we screwed? Do we have no choice but n² running time for sorts? Of course not! If you've been vigilant, you'll have realized that none of the algorithms introduced so far uses the same 'divide-and-conquer' approach as binary search.

You might want to take a look at this visualization to see just how fast merge sort is compared to bubble, selection, and insertion sort.

Note: That little tidbit about computers doing repeated addition is a bit of a lie.

Credits
- Running Time Graph Image taken from: CS50, taught by David J. Malan from Harvard University
- Visualization Animations from Wikipedia.

Optimizing Multiplayer 3D Game Synchronization Over the Web

A few months ago, I stumbled upon an interesting article by Eric Li titled "Optimizing WebSockets Bandwidth" [1]. Eric mentions how the advent of WebSockets has made it easier to develop HTML5 multiplayer games. But delivering the positional and rotational data for each player in a 3D game can consume a lot of bandwidth. Eric does some calculations and illustrates several optimizations to expose how much bandwidth might be needed.

After reading the article, I was thrilled because I have been dealing with bandwidth optimization of real-time data streaming for thirteen years now. I am the CTO and co-founder of Lightstreamer, a Real-Time Web Server that we originally created for the financial industry to deliver real-time stock prices. Well, many of the optimization algorithms created for online financial trading can be applied, unchanged, to online gaming. And we have worked with many banks across the world for several years to optimize bandwidth and reduce latency. So, I considered Eric's article as a challenge to demonstrate how well the algorithms we have developed in the last decade can give immediate benefit to multiplayer games, including MMOs, MMORPGs, and immersive 3D virtual worlds. Coming from the sector of real-time financial data, I am not a game development expert but I think that "cross-fertilization" between finance and gaming could give some unexpected benefits.

We decided to work on an online demo of a simple multiplayer 3D world, using Lightstreamer for the real-time synchronization, while showing the actual bandwidth used. We added several buttons, sliders, and controls to allow tweaking the parameters of the scenario and simulate any flavor of data delivery.

Attached Image: 3d.png


The final result is a toolkit that can be exploited to experiment and tune different game communication scenarios. The full source code of the demo is freely available on GitHub.

In this article, I will take you through the demo, showing you how to use it, explaining what's under the hood, and illustrating some of the advanced techniques employed. If you are interested to know how we optimized the network transport, you can skip to the section "Techniques to Employ in the Real-Time Web Stack" below.

Let's start with the link to the online demo:
http://demos.lightstreamer.com/3DWorldDemo

You can play with it right now. The demo can be tweaked on purpose to consume high amounts of bandwidth and CPU. For this reason, we had to put a limit on the number of connected users. If you find yourself unable to connect, please let us know, and we will try to arrange some dedicated demo time.

Table of Contents


Preliminary Remarks
How to Use the Demo
   How to Move Around
   Your Identity
   The Matrix
   Tuning
   Results
Under the Hood
Techniques to Employ in the Real-Time Web Stack
   Dynamic Throttling
   TCP vs. UDP
      Batching and Nagle's Algorithm
      Avoid Queuing
   Delta Delivery
   Stream-Sense
   Client-to-Server in-Order Guaranteed Messaging
   Lightweight Protocol
   Message Routing
   Different Subscription Modes
   Scalability
Conclusion
Credits
Resources
References

Preliminary Remarks


What this demo is about:
  • Lightweight 3D positional/rotational data streaming over the Web
  • Bandwidth optimization and low-latency data delivery
  • Dynamic throttling with adaptive data resampling for each client
What this demo is NOT about:
  • Lightweight 3D physics calculations
  • Lightweight 3D rendering
  • Cool 3D rendering
Warning: Based on the tuning parameters you choose, you might experience high bandwidth, CPU, or memory usage. This is done on purpose, to let you experiment with many of the variables affecting the optimization of the multiplayer 3D world.

In other words, we show how the real-time transmission of the coordinates of a 3D world over normal Web protocols (HTTP and WebSockets) can be easily achieved, while passing through any kind of proxy and firewall, and optimizing the bandwidth required by data streaming. We did not focus on optimizing the client-side rendering of the world (for which we used Three.js, a very nice lib, in a basic way), nor on optimizing the physics engine itself. Basically, we are just the delivery guys but we are pretty good at that ;)

For efficient HTML5 3D rendering, it seems that Chrome and Firefox browsers are currently better than other browsers. In any case, feel free to do your own tests with any browser you want (including mobile browsers).

The demo server is located in Europe (at Amazon's data center in Dublin). Take this into account when evaluating the latency you are experiencing.

The demo can work in two different modes:
  • In server-side mode, the physics is calculated on the server side, according to a stream of input from clients. Every change to the coordinates of each object is streamed back to the clients. This means that not only is there no prediction done on the client side, but there is no interpolation either! This is really an extreme scenario, where the client acts as a "dumb renderer".
  • In client-side mode, the physics is calculated both on the server side and the client side. The rendering is based on physics calculations (translation and rotation for all the players) performed by the JavaScript client. Each client receives in real time from the server the commands originated by all the other clients (or, more precisely, the changes to the velocity vector and the angular momentum), which are used as input for the physics calculations. In addition, the client periodically receives from the server (which still calculates the physics for the world) a snapshot with the positional and rotational data for all the objects. This way, the client can resynchronize with the authoritative data coming from the server-side physics engine and correct any drift. It is also possible to stop the periodic resynchronization. In this case, each client will make its state of the world evolve independently, possibly diverging over time.
For more information on the two modes, a good read is "Networked Physics" by Glenn Fiedler [2].

How to Use the Demo


Point your browser to http://demos.lightstreamer.com/3DWorldDemo. You will see some controls and a 3D rendering of a virtual world, where some 3D objects are floating. You are the red object. Other human players are represented by blue objects, whereas white objects are played by robots (they are there to make the Default world less lonely, in case you are the only player there).

Attached Image: rendering.png


And now, the magic of the Real-Time Web...

Open another browser window, or a different browser, even on a different computer, and go to the same address (http://demos.lightstreamer.com/3DWorldDemo). You will see the same world, fully synchronized.

All the controls in the page have a question mark icon next to them, which provides full details on their purpose. In the Rendering box, you have two sliders for controlling the camera zoom and the camera field of view.

If you close the Rendering box, you are still there, but the rendering algorithms in the client are stopped to save on CPU.

HOW TO MOVE AROUND

You can start moving your red object by using the keyboard. Open the Commands box to know what keys to use. Basically, you can add force impulses and torque (rotation) impulses on the three axes. If you can't use the keyboard, just press the on-screen keys with your mouse or finger. The on-screen keys are available in both the Commands and Rendering boxes. The more impulses you give, the higher the speed becomes in that direction.

Attached Image: commands.png


YOUR IDENTITY

The Identity box allows you to change your nickname, broadcast a message associated with you, and choose your world. You will start in the "Default" world, but you can be teleported to any other world. Just enter the name of another world, either existing (agree with your friends on a name) or brand new.

Attached Image: identity.png


If there are too many users connected to a world, you will be put in watcher mode. Try another world to become an active player.

THE MATRIX

Open the Matrix box to see the actual data being delivered by the remote server. In other words, the table shows in real time all the positional and rotational data that is actually being received on the streaming connection of your browser window.

Attached Image: matrix.png


By default, the client-side physics engine is used, which means that only some periodic world synchronization will be delivered by the server. This is why you will see the numbers in the matrix change very infrequently. But read more in the Tuning section below to change that radically...

TUNING

The Tuning box contains most of the juice of the demo and is the actual toolkit provided for your experiments.

Attached Image: tuning.png


When the demo starts, the default mode is client-side, with 2-second resynchronization.

In the Tuning box, you can easily switch between server-side and client-side modes. As a result, you can see the current bandwidth, displayed at the top of the box, change dramatically. If you move to a world where there are no other players, the bandwidth used by the whole page is a fairly accurate estimate of the bandwidth required per player.

When client-side mode is used, you can tweak the resynchronization period with a slider (2 seconds by default).

In server-side mode, many more parameters can be tweaked. You can have real fun going through some of the cases discussed in Eric's article, by choosing the precision of the incoming data and the frequency of the updates. You can have even more fun if you open the Matrix box and see the crazy speed of changes of the coordinates.

First, you can change the frequency for updates streamed by the server via the "Max frequency" slider. By default, it's 33 updates/second for each object. You can increase it up to the server-side physics engine clock, which works at 100 Hz (100 updates/second). Look how the bandwidth changes when changing the max frequency.

Then, you can play with data encoding. You can choose between binary and string encoding for the coordinates. In case of binary encoding, you can switch between double precision and single precision. In case of string encoding, you can choose the number of decimals (8 by default).

Finally, you can choose the maximum bandwidth allowed for the downstream channel (bottom slider). The data flow will be resampled on the fly to respect the allocated bandwidth. Try to reduce it and see the movements become less fluid.

RESULTS

We can now simulate some of the scenarios depicted in Eric's article. For example, let's take a single user (create a new lonely world to see the resulting bandwidth). If we deliver 33 updates/s in double-precision floating point format, when the object is translating and rotating on all three axes, the bandwidth consumption is about 5.6 KBps.

In this particular kind of scenario, a fixed-length decimal string with 1 or 2 decimal digits may be enough. With this encoding, the bandwidth decreases to about 3.6 KBps.

But these are worst case scenarios, where all seven coordinates change (translation and rotation on all three axes). If any of these coordinates does not change, or changes less frequently, Lightstreamer automatically uses a delta delivery feature to send only changed data. For example, with a single object that is only moving on the X axis, with string encoding, 2 decimals, and 33 updates/s, we get a bandwidth usage of about 1.5 KBps.

More information on the optimization algorithms implemented by Lightstreamer is in the "Techniques to Employ in the Real-Time Web Stack" section below.

Under the Hood


Lightstreamer is used in this demo for full-duplex, bi-directional, real-time data delivery. More simply, Lightstreamer is a special Web server that is able to send and receive continuous flows of updates to/from any kind of client (browsers, mobile apps, desktop apps, etc.) using only Web protocols (that is, HTTP and WebSockets). Based on a form of publish/subscribe paradigm, it applies several optimizations on the actual data flows, rather than acting as a "dumb pipe".

This demo is made up of some client-side code and server-side code.

On the client side, the application was developed in pure HTML and JavaScript. The source code is available on GitHub. Among the JS libraries the client uses are the Lightstreamer JavaScript client (for the real-time communication) and Three.js (for the rendering, as mentioned above).

On the server side, two Lightstreamer Adapters were developed in Java. The Adapters are custom components that are plugged into the Lightstreamer Server to receive, elaborate, and send messages. The source code of the Adapters is available on GitHub too. The Adapters use the CroftSoft Code Library for the physics calculations.

So, the 3D physics engine was built on the server side as a Java Adapter for Lightstreamer. The Adapter keeps the positional and rotational data for all the players in the game, and recalculates the information with a configurable frequency up to 100 Hz.

The clients send the user's commands (keys pressed) to the server via the sendMessage facility of Lightstreamer. Each user can also change her nickname and send a message to the other players at any time (which means that a basic chat is built into the demo).

The clients receive the real-time data by subscribing to Lightstreamer items, with a set of fields, using a subscription mode (see the sections "Message Routing" and "Different Subscription Modes" to learn more):
  • There exists an item for each world in the game (actually, there is one item for each combination of world and representation precision, but let's keep the explanation simple). This item works in COMMAND mode and delivers the dynamic list of players in that world, signaling players entering and leaving the world, in addition to notifying nickname changes and chat messages. The fields available for this item are: "key" (the unique identifier of each player), "command" (add, update, or delete), "nick" (the current nickname), and "msg" (the current chat message).
  • For each player that enters a world, a specific item is created by the server (and subscribed to by the clients) to carry all the real-time coordinates and movements for that player. This item works in MERGE mode and is subscribed to/unsubscribed from by each client in that world based on the commands received as part of the first item above. The fields available for this item are: the coordinates and the quaternions, which represent the current position of the object ("posX", "posY", "posZ", "rotX", "rotY", "rotZ", "rotW") - this set of fields is subscribed to in server-side mode, to get the actual positions in real time and render the objects accordingly, and in client-side mode as well, to get the periodic authoritative resynchronizations (unless the resync period is set to 'never'); the matrix widget uses these fields to feed a Lightstreamer widget called DynaGrid. Then there are the velocity vector and the angular momentum, which represent the current movement of the object ("Vx", "Vy", "Vz", "momx", "momy", "momz") - this set of fields is subscribed to in client-side mode only, to receive the input for the client-side physics engine.
  • Each client subscribes to an item in DISTINCT mode to implement presence. In other words, each player signals her presence by keeping this subscription active. On leaving the page, the automatic unsubscription signals the loss of presence, and the server can let all the other players know that the user has gone away (by leveraging the COMMAND-mode item above).
  • Each client subscribes to an item in MERGE mode, to know the current downstream bandwidth (used by its own connection) in real time.
You are encouraged to study the source code of this demo, fork it on GitHub, and derive any possible work from it, limited only by your imagination. For example, you might attach a Unity front-end, you might want to create a real game, or you might create a mobile app. Lightstreamer offers libraries for many different client-side technologies. You can download the Vivace edition of Lightstreamer, which contains a free non-expiring demo license for 20 connected users.
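As a hypothetical sketch of what one of these subscriptions might look like with the Lightstreamer JavaScript client (the server URL, adapter set, and item name below are placeholders, and you should check the client API documentation for the exact signatures):

// Assume the Lightstreamer JavaScript client library has been loaded;
// these declarations just keep the sketch self-contained for TypeScript.
declare const LightstreamerClient: any;
declare const Subscription: any;

const client = new LightstreamerClient("http://example.com", "DEMO_ADAPTER");
client.connect();

// MERGE-mode subscription to one player's positional/rotational fields.
const coords = new Subscription(
  "MERGE",
  ["player_42_coords"], // hypothetical item name
  ["posX", "posY", "posZ", "rotX", "rotY", "rotZ", "rotW"]
);
coords.addListener({
  onItemUpdate: (update: any) => {
    // Feed the renderer with the latest authoritative position.
    const x = Number(update.getValue("posX"));
    // ...
  },
});
client.subscribe(coords);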

Techniques to Employ in the Real-Time Web Stack


Several techniques and features are required, as part of the real-time web stack, to help game developers implement multiplayer communication in a simple, reliable, and optimized way. Let's see some of them, delving a bit deeper.

DYNAMIC THROTTLING

Many kinds of data can, by their very nature, be filtered (that is, resampled). When a data feed provides a series of real-time samples for some given physical quantity, you might want to resample the series to decrease the frequency, if you don't need all of the samples. Imagine you have a sensor that measures the current temperature 100 times per second and you want to deliver the measurements in real time to a software application. If you just want to display the value to a human user, perhaps 10 updates per second are more than enough. Or, even better, you might want to deliver as many updates as possible without saturating the link bandwidth.

Resampling a real-time data flow is a complex procedure that needs to take several variables as input: the unsampled data flow, the desired maximum frequency, the desired maximum bandwidth, and the available bandwidth.

Back to our 3D virtual world: when server-side physics is used, a lot of data is produced to keep all the clients in sync with the world changes. If the server-side physics works at 100 Hz, in many cases you don't need, or cannot afford, to send the data at the same frequency to all the clients. You want to send as many updates as you can, based on the actual bandwidth available at any time for each individual client connection.

On the Developer Community pages of Valve [3] (whose Source engine powers some famous 3D games), you read:

Clients usually have only a limited amount of available bandwidth. In the worst case, players with a modem connection can't receive more than 5 to 7 KB/sec. If the server tried to send them updates with a higher data rate, packet loss would be unavoidable. Therefore, the client has to tell the server its incoming bandwidth capacity by setting the console variable rate (in bytes/second). This is the most important network variable for clients and it has to be set correctly for an optimal gameplay experience.

This is done automatically and dynamically by Lightstreamer. Thanks to its adaptive throttling algorithms (originated in the financial industry), Lightstreamer is able to resample the data flow for each individual user on the fly, taking into account all the dynamic variables:
  • Bandwidth allocation: For each client, a maximum bandwidth can be allocated to its multiplexed stream connection. Such allocation can be changed at any time and Lightstreamer guarantees that it is always respected, whatever the original data flow bandwidth is.
  • Frequency allocation: For each data flow (a subscription, in Lightstreamer terms) of each client's multiplexed stream connection, a maximum update frequency can be allocated. Again, such allocation can be changed at any time.
  • Real bandwidth detection: Internet congestion is automatically detected, and Lightstreamer continuously adapts to the real bandwidth available.
Lightstreamer heuristically combines these three categories of information to dynamically throttle the data flow with resampling. In our 3D World Demo, you can see all this in action. In the Tuning box, switch to server side. The "Max bandwidth" and "Max frequency" sliders allow you to govern bandwidth allocation and frequency allocation in real time. The data flow is resampled accordingly. Now, try to move to an unreliable network. Perhaps use your tablet or smartphone connected via 3G (or, better, 2G) and move to a place where the mobile signal is not very strong. In other words, you should provoke some packet loss and bandwidth shrinking. Lightstreamer will detect this, and instead of buffering the updates and playing them back as an aged history later, it will begin resampling and reducing frequency and bandwidth automatically.

Basically, every client will find its sweet spot for bandwidth usage while still receiving fresh data. Receiving less frequent updates does not mean receiving old updates: whenever you get an update, it must be the most recent available, not a piece of data buffered some time ago. Lightstreamer does exactly this. It sends fresh data even on tiny-bandwidth links by resampling the data flow rather than queuing it. You can see another live example in the simpler Bandwidth and Frequency Demo (http://demos.lightstreamer.com/BandwidthDemo).

Resampling works better with conflation. Conflation means that when resampling drops updates, the sender tries to merge them instead of simply discarding them, preserving as much information as possible. Let's clarify with an example. In the 3D World Demo, each subscription to the coordinates of an object carries 7 fields (3 for position and 4 for rotation), as you can see in the Matrix box. For the sake of simplicity, let's consider only the 3 position fields and imagine this sequence of updates in the original data flow:

  1. X=5.1; Y=3.0; Z=2.3
  2. X=unchanged; Y=3.1; Z=2.5
  3. X=5.3; Y=unchanged; Z=2.8

If the resampling algorithm decides to drop event 2, the conflation mechanism will produce a single update that replaces event 3, as follows:

  X=5.3; Y=3.1; Z=2.8

As you can see, event 2, which carried an update to Y, has not been fully discarded but has been conflated with event 3.
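To make the mechanism concrete, here is a toy JavaScript sketch of conflation (my own illustration of the idea, not Lightstreamer's internal code): instead of queuing every update, we keep one pending update per item and merge newer field values into it.

// Toy conflation buffer: one pending update per item.
var pending = {}; // itemName -> { field: latestValue, ... }

function onUpdate(itemName, fields) {
    var slot = pending[itemName] || (pending[itemName] = {});
    for (var f in fields) {
        slot[f] = fields[f]; // newer values overwrite, untouched fields survive
    }
}

function flush(send) {
    // Called at whatever rate the link can sustain.
    for (var item in pending) {
        send(item, pending[item]); // one conflated update per item
    }
    pending = {};
}

function transmit(item, update) { console.log(item, update); }

onUpdate("obj1", { X: 5.1, Y: 3.0, Z: 2.3 }); // event 1
flush(transmit);                               // event 1 goes out as-is

onUpdate("obj1", { Y: 3.1, Z: 2.5 });          // event 2, never sent alone
onUpdate("obj1", { X: 5.3, Z: 2.8 });          // event 3
flush(transmit); // sends { X: 5.3, Y: 3.1, Z: 2.8 }: events 2 and 3 conflated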

Conflation is enabled by default in Lightstreamer when subscriptions are done in MERGE mode.

This means you can produce data at any rate and let Lightstreamer automatically and transparently resample and conflate them on the fly for each individual connection.

TCP vs. UDP

TCP is often considered a bad choice for a game's communication layer. Its congestion control mechanisms and in-order delivery with automatic retransmission can degrade a game's performance. A couple of examples from the literature: a paper from National Taiwan University [4] compared protocols for MMORPGs, showing the cons of TCP, and Glenn Fiedler wrote [5]:

Using TCP is the worst possible mistake you can make when developing a networked game! To understand why, you need to see what TCP is actually doing above IP to make everything look so simple!

I'm not going to disagree with these positions, as they totally make sense, but I would like to introduce another perspective.

Let's highlight the advantages of TCP versus UDP:

  1. It is reliable (and even if you don't always need it, it's better to have it than not, provided, of course, that the price to pay is not too high).
  2. If used under Web protocols (HTTP and WebSockets), it can pass through any proxy, firewall, and network intermediary.

So, it would be nice to be able to use TCP even for games. Lightstreamer, which leverages HTTP and WebSockets, uses TCP and tries to overcome some of its limits with smart algorithms. I am not saying that Lightstreamer has magically made TCP as efficient as UDP... but I seriously maintain that it makes TCP usable enough for several games. Let's see how.

Batching and Nagle's Algorithm

As mentioned by Glenn Fiedler, setting the TCP_NODELAY option (that is, disabling TCP's Nagle algorithm) is a must. Lightstreamer, after disabling Nagle's algorithm, uses its own algorithms to decide how to pack data into TCP segments. A trade-off between latency and efficiency can be configured: what is the maximum latency you can accept in order to forge bigger packets and decrease overhead (both network overhead and CPU overhead)? You can answer this question for each application and game and configure it in Lightstreamer via the max_delay_millis parameter.

The higher the acceptable delay, the more data can be stuffed into the same TCP segment (batching), increasing overall performance. For extreme low-latency cases, you can set max_delay_millis to 0.

Consider that delivering real-time market data to professional traders, who work with 8 screens and have Matrix-like abilities to read them, is somewhat similar to connecting multiplayer games... We found that a batching interval of 30 ms works for most cases.
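To make the trade-off concrete, here is a toy sketch of such a batcher in JavaScript (my own illustration of the idea, not Lightstreamer's server code; socketWrite stands in for an actual socket write):

// Toy latency/efficiency batcher: hold writes for up to maxDelayMillis so
// several updates can share one TCP segment. Illustration only.
function makeBatcher(socketWrite, maxDelayMillis) {
    var buffer = [];
    var timer = null;
    return function send(update) {
        buffer.push(update);
        if (maxDelayMillis === 0) { // lowest latency: flush immediately
            socketWrite(buffer.join(""));
            buffer = [];
            return;
        }
        if (timer === null) { // first queued update starts the clock
            timer = setTimeout(function () {
                socketWrite(buffer.join("")); // one segment, many updates
                buffer = [];
                timer = null;
            }, maxDelayMillis);
        }
    };
}

// 30 ms works well for most cases, as noted above.
var send = makeBatcher(function (bytes) { /* write to the socket */ }, 30);
send("update1;"); // updates accumulate for up to 30 ms before being sent
send("update2;");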

Avoid Queuing

The real issue with TCP is packet retransmission. If a segment is lost, newer data queues up in the server buffers until the lost segment is actually delivered. This clutters the network, as bursts of aged data are sent out, with the risk of triggering continuous packet loss on small-bandwidth networks.

As explained above, Lightstreamer's dynamic throttling makes it possible to stop queuing data that has been produced but not yet sent out to the network if fresher data is available. Basically, when network congestion provokes packet loss, Lightstreamer begins resampling and conflating the data flow, overriding aged queued data with fresher data.

Again, this has its roots in financial market data dissemination. You want to be watching the very latest price of a stock. If network congestion blocks the data flow, you don't want to see a playback of old data when the network is available again; you want the new data at once.

DELTA DELIVERY

On the Developer Community pages of Valve [3], you can read:

Game data is compressed using delta compression to reduce network load. That means the server doesn't send a full world snapshot each time, but rather only changes (a delta snapshot) that happened since the last acknowledged update. With each packet sent between the client and server, acknowledge numbers are attached to keep track of their data flow. Usually full (non-delta) snapshots are only sent when a game starts or a client suffers from heavy packet loss for a couple of seconds.

Delta delivery is enabled by default in Lightstreamer, as it has always been a means to reduce the amount of financial data that needs to be sent. If a user subscribes to 20 different fields for each stock (price, bid, ask, time, etc.), only a few of them really change at every tick. Lightstreamer automatically extracts the delta, and the client-side library rebuilds the full state. Upon initial connection, Lightstreamer sends a full snapshot for all the subscribed items. Even in case of disconnection, the Lightstreamer client-side library automatically reconnects and restores the state via a full snapshot.
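As a rough illustration (again my own sketch, not Lightstreamer's implementation), delta extraction boils down to comparing the new state against the last state the client is known to have, and shipping only the changed fields:

// Toy delta extraction: send only the fields that changed since the last
// state the client is known to have. Illustration only.
function extractDelta(lastState, newState) {
    var delta = {};
    for (var field in newState) {
        if (newState[field] !== lastState[field]) {
            delta[field] = newState[field];
        }
    }
    return delta; // the client merges this into its copy of the state
}

var known = { price: 10.5, bid: 10.4, ask: 10.6, time: "12:00:01" };
var next  = { price: 10.5, bid: 10.4, ask: 10.7, time: "12:00:02" };
extractDelta(known, next); // { ask: 10.7, time: "12:00:02" }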

STREAM-SENSE

The documentation of Unity [7] (one of the most used 3D rendering engines) says:

Connecting servers and clients together can be a complex process. Machines can have private or public IP addresses and they can have local or external firewalls blocking access. Unity networking aims to handle as many situations as possible but there is no universal solution.

Stream-Sense is the somewhat exotic name we use in Lightstreamer for the ability to automatically detect the best transport to use over TCP. WebSocket is the best choice, but there are many real-world cases where WebSockets simply don't work. Apart from browser support, the real issue is network intermediaries (firewalls, proxies, caches, NATs, etc.) that don't speak WebSocket and block it. In all these cases, Lightstreamer automatically switches to HTTP Streaming, which is as efficient as WebSocket for sending data from the server to the client. There are even some cases where HTTP Streaming is blocked by corporate proxies. In such extreme situations, Lightstreamer still works by automatically switching to HTTP Smart Polling (a.k.a. Long Polling). This polling style is very different from traditional polling, as the polling frequency is dynamic and data-driven. See this slide deck [6] for more details.

Attached Image: streamsense.png


For many applications, the user experience provided by Smart Polling is the same as with WebSockets. And for games, being able to play from behind a super-strict corporate firewall, even at a lower frame rate, is better than nothing. Consider that even Smart Polling in Lightstreamer is subject to resampling, conflation, and bandwidth management, so at each polling cycle you are guaranteed to receive fresh data.

We have mentioned the situation where the network (including all its intermediaries) forces the use of Smart Polling. But there is another case where Lightstreamer may switch to Smart Polling automatically. Even if the network supports streaming (over HTTP or WebSocket), the event rate of the data stream might exceed the processing capacity of the client. This applies particularly to older devices that cannot keep pace with a high-frequency data flow. In these cases, the Lightstreamer client library can detect that the client code is not dequeuing the received events fast enough and decide to switch to Smart Polling. The actual update rate is then driven by the client's capacity to process a batch of events and request the next one; subsequent batches are subject to conflation. Though it may sound strange, in these cases Smart Polling can be used even over WebSocket. For interactive games, polling, even in its smart form, is never as good as streaming. But, again, being able to play from behind very strict corporate firewalls with a slightly decreased quality of service is better than nothing.

The selected transport is totally transparent to the application (the game), which does not need to take any specific actions based on the transport being used.

To know which transport is being used when you connect to the 3D World Demo, just roll the mouse pointer over the top-left "S" tag. The tag will slide to the right, revealing the actual transport in use (WebSocket or HTTP, in Streaming or Polling mode).

Attached Image: Untitled.png


CLIENT-TO-SERVER IN-ORDER GUARANTEED MESSAGING

Contrary to what you sometimes read, HTTP Streaming has no more overhead than WebSocket when sending messages from the server to the client. The real benefit of WebSocket over HTTP Streaming is when you need to send messages from the client to the server. The reason is that client-to-server HTTP, as managed by any normal web browser, has three main limitations:

  1. It requires a full round trip for each message (unless HTTP pipelining is used, which is turned off by default in most browsers).
  2. There is no control over message ordering, as multiple messages might be sent over multiple connections. This is decided by the browser's HTTP connection pool logic, and no information is exposed to the JavaScript layer. So, even if you send two messages in sequence from your JavaScript code (with 2 HTTP requests), they might overtake each other on the network.
  3. There is no control over connection reuse. As noted above, the browser has exclusive control over its connection pool and usually closes an idle connection after a timeout. So, if after some time you need to send a low-latency update to the server, you first have to set up a new TCP connection, increasing latency.

Lightstreamer implements a layer of abstraction that eliminates the three issues above even when using HTTP as a transport. In particular, Lightstreamer acts as follows:

  1. It automatically batches high-frequency client messages. This means if the client tries to deliver 100 messages in a burst, instead of sending 100 HTTP requests, the browser will send just a couple of HTTP requests.
  2. The client messages are automatically numbered. The server sends back an ACK for each message and reorders out-of-order messages. A mechanism in the client controls automatic retransmission of unacknowledged messages. In other words, a sort of "TCP over HTTP" has been implemented to get reliable messaging from client to server inside a browser!
  3. Reverse heartbeats can be activated so the browser keeps the connection open and ready to use as soon as needed. Reverse heartbeats are tiny messages sent by the client to force the browser to avoid closing the underlying TCP connection.

This layer of optimized in-order guaranteed messaging from client to server originated in Lightstreamer to deliver order submissions in online trading front ends, which drove the need to make it very reliable and to support high-frequency messaging. It is exposed to the application via a very simple sendMessage function call.
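At the application level this machinery surfaces as a single call. The snippet below assumes a connected LightstreamerClient instance named client (per the Lightstreamer JavaScript client API); the "move|..." payload format is made up for illustration, since each Data Adapter defines its own message format:

// 'client' is a connected LightstreamerClient; the payload format below
// is hypothetical, as your Data Adapter decides how messages are parsed.
client.sendMessage("move|id=27|dx=0.5|dz=-1.0");
client.sendMessage("jump|id=27");
// Both messages are batched, numbered, acknowledged, and delivered to the
// server in order, even when plain HTTP is the underlying transport.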

LIGHTWEIGHT PROTOCOL

Given a working transport (as chosen by the Stream-Sense mechanism) and a data model (see Message Routing below), a protocol is needed to deliver the data to the clients with full semantics. Lightstreamer avoids using JSON or, even worse, XML as the base of its protocol. These are extremely verbose formats that carry large amounts of redundant data with each event (for example, the field names), resulting in increased bandwidth usage.

Lightstreamer uses a position-based protocol, which reduces the overhead to a minimum. The actual payload accounts for most of the bandwidth.

Below is an example taken from the 3D World Demo by dumping the network traffic, with server-side mode and string encoding in use. In this case a JavaScript-based protocol is used, though not JSON; other protocols are possible.

d(27,1,2,'-18.85667',3,'0.91879','-0.39474',1);d(30,1,5,'-0.70090',1,'-0.71325','-59.00561');d(31,1,3,'27.42105','-0.00389','-0.44682','0.89365','-0.04139',1);d(29,1,3,'-57.12912','-0.00389','-0.44682','0.89365','-0.04139',1);d(28,1,2,'42.71763',1,'-0.40773','0.48877','-0.05785','-0.76909','-7.72828');

MESSAGE ROUTING

Having implemented optimized and reliable transports and protocols is not enough for developing complex real-time applications and games. On top of such mechanisms, you need an easy and flexible way to route messages, that is, to govern which messages should be delivered to which clients.

Lightstreamer is based on what I call an asymmetric publish-subscribe paradigm. Clients can subscribe to items and send messages to the server, but the actual publishers are on the server side. Based on any back-end data feed, any back-end computation algorithm, and any messages received from the clients, the publishers (Lightstreamer Data Adapters) inject real-time updates for any item into the Lightstreamer Server. It is then up to the server to route the updates based on the client subscriptions.

Attached Image: asympubsub.png


In the 3D World Demo, the Data Adapter contains the game engine. Based on the movement commands received from the clients (via the sendMessage facility), it calculates physics and publishes item updates accordingly.

Client subscriptions are based on items and schemas (sets of fields). For example, in the 3D World Demo, if a client wants to know the real-time position of an object, it will subscribe to a specific item, which represents that object, using the schema below:

"posX", "posY", "posZ", "rotX", "rotY", "rotZ", "rotW"

Let's suppose the client is interested in knowing the coordinates and the velocity vector. It will subscribe to the same item with a different schema:

"posX", "posY", "posZ", "Vx", "Vy", "Vz"

If the client wants to know the nicknames and messages of the other players, in addition to being notified of players entering and leaving a world, it will subscribe to the main item (which represents that world) with a schema like this:

"nick", "msg", "key", "command"

Basically, every client can subscribe to as many items as it needs, each with its own schema.
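Putting this together in code, a client might look like the sketch below. It follows the Lightstreamer JavaScript client API (LightstreamerClient, Subscription); the server address, the item name, and the moveObject rendering hook are all made up for illustration:

// Sketch following the Lightstreamer JavaScript client API; server
// address, item name, and moveObject are hypothetical.
var client = new LightstreamerClient("http://push.example.com", "DEMO");
client.connect();

// Two subscriptions to the same item with different schemas, multiplexed
// on the same physical connection.
var position = new Subscription("MERGE", "object_27",
    ["posX", "posY", "posZ", "rotX", "rotY", "rotZ", "rotW"]);

var vectors = new Subscription("MERGE", "object_27",
    ["posX", "posY", "posZ", "Vx", "Vy", "Vz"]);

position.addListener({
    onItemUpdate: function (update) {
        moveObject(update.getValue("posX"),   // hypothetical rendering hook
                   update.getValue("posY"),
                   update.getValue("posZ"));
    }
});

client.subscribe(position);
client.subscribe(vectors);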

Attached Image: multitems.png


As a result, the Lightstreamer Server will multiplex the updates for all the subscribed items of each client on top of the same physical connection.

Attached Image: multiplex.png


Clients can subscribe to and unsubscribe from any item at any time in their life cycle, without interrupting the multiplexed stream connection.

Lightstreamer uses an on-demand publishing model. Only when an item is first subscribed to by a client is the Data Adapter (publisher) notified to begin publishing data for that item. Further clients can then subscribe to the same item, and the Lightstreamer Server takes care of broadcasting the updates to all of them. When the last client unsubscribes from the item, the Data Adapter is notified to stop publishing data for it. This way, no data needs to be produced when no actual recipient exists, saving system resources.

With this item-based approach, all routing scenarios are supported by Lightstreamer: broadcasting, multicasting, and unicasting. For example, if 1 million different clients subscribe to the same item, the Data Adapter needs to publish each update only once, and the Lightstreamer Server takes care of the massive fan-out. On the other hand, each client may subscribe to its own individual item, enabling the delivery of personal messages. The good thing is that different routing scenarios can be mixed together on the same multiplexed connection.

Attached Image: fanout.png


In a game, the flexibility of this publish and subscribe model makes it much easier to control what data should reach each player.

DIFFERENT SUBSCRIPTION MODES

Okay, we have a reliable and optimized transport and protocol, and we have a message routing mechanism with support for items and schemas. What else might be needed? Content-aware subscription modes.

In Lightstreamer, when an item is subscribed to, the client can specify a subscription mode, choosing among MERGE, DISTINCT, COMMAND, and RAW. The subscription mode, together with further subscription parameters, determines the filtering mechanism that should be used (resampling with conflation, queuing, etc.) and opens the door to so-called meta-push, thanks to COMMAND mode.

With COMMAND mode, you can push not only changes to the fields of subscribed items, but also control, from the server, which items the client should subscribe to and unsubscribe from at any time. This is done by sending add, delete, and update commands as part of an item, which provoke subscriptions to and unsubscriptions from underlying items. The delivery of such commands over the network is extremely optimized, with auto-detection and elimination of redundant commands.

In the 3D World Demo, the core of the world is delivered in COMMAND mode. Whenever a player enters or leaves the world, a real-time command is sent to add or remove an object in the rendering pane and a row in the matrix. Then, for each added object (and row), an underlying item is subscribed to in MERGE mode to get the positions, vectors, etc.
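A sketch of how a client might consume such a COMMAND subscription, again following the Lightstreamer JavaScript client API (the addPlayer, renamePlayer, and removePlayer callbacks are hypothetical rendering hooks):

// In COMMAND mode every update carries "key" and "command" fields.
var world = new Subscription("COMMAND", "world_main",
                             ["key", "command", "nick"]);

world.addListener({
    onItemUpdate: function (update) {
        var key = update.getValue("key");
        switch (update.getValue("command")) {
            case "ADD":    addPlayer(key, update.getValue("nick"));    break;
            case "UPDATE": renamePlayer(key, update.getValue("nick")); break;
            case "DELETE": removePlayer(key);                          break;
        }
    }
});

client.subscribe(world); // 'client' is a connected LightstreamerClient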

COMMAND mode originated in financial trading applications to manage a dealer's portfolio in real time (stocks entering, being updated, and leaving the portfolio at any time).

SCALABILITY

To manage the real-time data of massively multiplayer online games, the streaming server needs very high scalability. Lightstreamer employs a concurrent staged event-driven architecture with non-blocking I/O. This means the number of threads in the server is fully decoupled from the number of connections, and the sizes of the thread pools are automatically tuned based on the number of CPU cores.

This architecture allows graceful degradation of the quality of service. In other words, in the extreme case that the server CPU saturates, the game will not block abruptly; instead, all users will be slightly penalized, receiving lower-frequency (but still fresh) updates.

The main driver of Lightstreamer's scalability is the overall message throughput more than the total number of connections. For example, a single server instance was successfully tested with one million connected clients at a low message rate per connection (a few updates per minute). Increasing the message frequency to tens of messages per second per client reduces the number of concurrent clients a single machine can manage to tens of thousands.

Several Lightstreamer Server instances can be clustered via any good web load-balancing appliance to reach much higher scalability. This should enable high-frequency massively multiplayer online games to be deployed easily and cost-effectively.

Conclusion


We have presented an online demo that shows how a technology created for the financial industry can be used to great benefit in the gaming industry. Adaptive streaming (dynamic throttling, conflation, delta delivery, etc.) is a set of techniques for making sure that the amount of real-time data needed to synchronize a 3D virtual world among several clients is governed by the actual network capacity of each client. Several low-level optimizations and high-level abstractions are required to make multiplayer game development easier with reliable results. If you want to solve the problems of real-time gaming efficiently, you should use similar techniques in your real-time web stack.

The source code of this demo is available on GitHub. Feel free to modify it and create any derivative work; perhaps a full-fledged game! Let us know if you find the demo code useful and if you will be working on any project based on it. We are eager for any feedback on both the demo and this article. You can contact us at support@lightstreamer.com.

Credits


Many thanks to Eric Li, Matt Davey, and Michael Carter for the comments.
Originally published on the Lightstreamer blog.

Resources

References

Overview of Modern Volume Rendering Techniques for Games - Part 2

In this blog series I write about some modern volume rendering techniques for real-time applications and why I believe their importance will grow in the future.

If you have not read part one of the series, check it out here; it is an introduction to the topic and an overview of volume rendering techniques.

In this second post of our multi-post series on volume rendering for games, I’ll explain the technical basics that most solutions share. Throughout the series I’ll concentrate on ‘realistic’, smooth rendering – not the ‘blocky’ style you can see in games like Minecraft.

Types of Techniques


Volume rendering techniques can be divided into two main categories – direct and indirect.

Direct techniques produce a 2D image directly from the volume representation of the scene. Almost all modern algorithms use some variation of ray-casting and do their calculations on the GPU. You can read more on the subject in the papers on “Efficient Sparse Voxel Octrees” and “GigaVoxels”.

Although direct techniques produce great looking images, they have some drawbacks that hinder their wide usage in games:

  1. Relatively high per-frame cost. The calculations rely heavily on compute shaders, and while modern GPUs run them well, GPUs are still primarily designed to draw triangles.
  2. Difficult to mix with other meshes. For some parts of the virtual world we might still want to use regular triangle meshes. The tools for editing them are well known to artists, and moving those assets to a voxel representation may be prohibitively difficult.
  3. Interop with other systems is difficult. Most physics systems for instance require triangle representations of the meshes.

Indirect techniques, on the other hand, generate an intermediate representation of the mesh: effectively, they create a triangle mesh from the volume. Moving to the more familiar triangle mesh has many benefits.

The polygonization (the transformation from voxels to triangles) can be done only once – on game/level load. After that, the triangle mesh is rendered every frame. GPUs are designed to work well with triangles, so we expect better per-frame performance. We also don’t need to make radical changes to our engine or third-party libraries, because they probably work with triangles anyway.

In all the posts in this series I’ll talk about indirect volume rendering techniques – both the polygonization process and the way we can effectively use the created mesh and render it fast – even if it’s huge.

What is a Voxel?


A voxel is the building block of our volume surface. The name ‘voxel’ comes from ‘volume element’, and it is the 3D counterpart of the more familiar pixel. Every voxel has a position in 3D space and some properties attached to it. Although we can attach any property we’d like, all the algorithms we’ll discuss require at least a scalar value that describes the surface. In games we are mostly interested in rendering the surface of an object, not its internals – this gives us some room for optimization. More technically speaking, we want to extract an isosurface from a scalar field (our voxels).

The set of voxels that will generate our mesh is usually parallelepipedal in shape and is called a ‘voxel grid’. If we employ a voxel grid the positions of the voxels in it are implicit.

In every voxel, the scalar we store is usually the value of the distance function at the point in space where the voxel is located. The distance function has the form f(x, y, z) = d, where d is the shortest distance from the point (x, y, z) to the surface. If the voxel is “in” the mesh, then the value is negative.

If you imagine a ball as the mesh in our voxel grid, all voxels “in” the ball will have negative values, all voxels outside the ball positive, and all voxels that are exactly on the surface will have a value of 0.
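To make this concrete, here is a minimal sketch (in JavaScript, purely for illustration) that fills a voxel grid with the signed distance to such a ball:

// Fill an n*n*n voxel grid with the signed distance to a sphere of
// radius r centered at (cx, cy, cz): negative inside, zero on the
// surface, positive outside. Voxel positions are implicit in the indexing.
function makeSphereGrid(n, cx, cy, cz, r) {
    var grid = new Float32Array(n * n * n);
    for (var z = 0; z < n; z++) {
        for (var y = 0; y < n; y++) {
            for (var x = 0; x < n; x++) {
                var dx = x - cx, dy = y - cy, dz = z - cz;
                grid[x + n * (y + n * z)] =
                    Math.sqrt(dx * dx + dy * dy + dz * dz) - r;
            }
        }
    }
    return grid;
}

var voxels = makeSphereGrid(32, 16, 16, 16, 10); // a ball in a 32^3 grid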

Attached Image: proc_cube.png
Cube polygonized with a MC-based algorithm – notice the loss of detail on the edge


Marching Cubes


The simplest and most widely known polygonization algorithm is called ‘marching cubes’. Many techniques give better results, but its simplicity and elegance are still well worth studying. Marching cubes is also the base of many more advanced algorithms and gives us a frame of reference in which we can compare them.

The main idea is to take 8 voxels at a time, forming the eight corners of an imaginary cube. We work with each cube independently of all others and generate triangles inside it – hence we “march” over the grid.

To decide what exactly to generate, we use just the signs of the voxels at the corners, which form one of 256 cases (there are 2^8 possible sign combinations). A precomputed table of those cases tells us which vertices to generate, where, and how to combine them into triangles.

The vertices are always generated on the edges of the cube, and their exact positions are computed by interpolating the values of the voxels at the two ends of each edge.

I’ll not go into all the details of the implementation – it is pretty simple and widely documented on the Internet, and a short sketch of the per-cube step follows the list below – but I want to underline some points that are valid for most MC-based algorithms.

  1. The algorithm expects a smooth surface. Vertices are never created inside a cube, only on its edges. If a sharp feature happens to fall inside a cube (very likely), it will be smoothed out. This makes the algorithm good for meshes with organic forms – like terrain – but unsuitable for surfaces with sharp edges, like buildings. To reproduce a sufficiently sharp feature you’d need a very high-resolution voxel grid, which is usually unfeasible.
  2. The algorithm is fast. The difficult calculation of which triangles to generate in each case is pre-computed in a table. The operations on each cube itself are very simple.
  3. The algorithm is easily parallelizable. Each cube is independent of the others and can be calculated in parallel. The algorithm is in the family of “embarrassingly parallel” problems.

After marching all the cubes, the mesh is composed of all the generated triangles.
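To give a feel for how little work each cube requires, here is a heavily abridged JavaScript sketch of a single cube step. The edgeTable and triTable lookups are the standard precomputed tables (see Paul Bourke’s write-up in the references), and EDGE_CORNERS (mapping each of the 12 edges to its two corner indices) and lerp (interpolating two positions component-wise) are assumed helpers; all four are omitted for space.

// One cube of the march, heavily abridged. 'corners' holds the 8 voxel
// values and 'cornerPositions' their 3D positions. edgeTable, triTable,
// EDGE_CORNERS, and lerp are assumed to be defined elsewhere.
function polygonizeCube(corners, cornerPositions, emitTriangle) {
    // 1. Build the 8-bit case index from the corner signs.
    var cubeIndex = 0;
    for (var i = 0; i < 8; i++) {
        if (corners[i] < 0) cubeIndex |= 1 << i; // negative = inside
    }
    if (edgeTable[cubeIndex] === 0) return; // cube fully inside or outside

    // 2. Generate a vertex on every edge crossed by the surface.
    var verts = [];
    for (var e = 0; e < 12; e++) {
        if (edgeTable[cubeIndex] & (1 << e)) {
            var a = EDGE_CORNERS[e][0], b = EDGE_CORNERS[e][1];
            var t = corners[a] / (corners[a] - corners[b]); // zero crossing
            verts[e] = lerp(cornerPositions[a], cornerPositions[b], t);
        }
    }

    // 3. Emit the triangles the table lists for this case.
    var tri = triTable[cubeIndex];
    for (var k = 0; tri[k] !== -1; k += 3) {
        emitTriangle(verts[tri[k]], verts[tri[k + 1]], verts[tri[k + 2]]);
    }
}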

Marching cubes tends to generate many tiny triangles. This can quickly become a problem if we have large meshes.

If you plan to use it in production, beware that it doesn’t always produce ‘watertight’ meshes – there are ambiguous configurations that will generate holes. This is pretty unpleasant and is fixed by later algorithms.

In the next post in this series I’ll discuss the requirements of a good volume rendering implementation for a game in terms of polygonization speed and rendering performance, and I’ll look into ways to achieve them with more advanced techniques.

References

Cyril Crassin, Fabrice Neyret, Sylvain Lefebvre, Elmar Eisemann. 2009. GigaVoxels: Ray-Guided Streaming for Efficient and Detailed Voxel Rendering.

Samuli Laine, Tero Karras. 2010. Efficient Sparse Voxel Octrees.

Paul Bourke. 1994. Polygonising a Scalar Field.

Marching cubes on Wikipedia.

You Don't Need to Hide Your Source Code


The Case Against Proprietary Software Games


For decades now, it's been fashionable for game developers, along with other software developers, to withhold their games' source code and keep it secret. But why? As far as I can tell, people do this not because of any inherent benefit they see in withholding the source code, but because everyone else (well, almost everyone else) does it. In truth, there is no reason to be doing this.

Don't misunderstand me: I don't just mean that there's no reason to do this with non-commercial games. Keeping source code secret is unnecessary for commercial games, too; commercial games are perfectly viable without it. In fact, I am going to argue that not only is there no harm in the public having access to the source code; there is not even any harm in putting the source code under a free/libre/open-source software (FLOSS) license.

The Point of Source Code Secrecy in the First Place


"What?!" I hear you say. "How can there be no harm to my business in making my game FLOSS?" To understand the answer to this question, you must first understand what the point was in keeping source code secret in the first place.

Understand: keeping source code secret does not by itself do anything to stop unauthorized copying. A binary distribution of a game can be distributed just as easily as source code. Therefore, there must be some other reason this trend of keeping source code secret started.

The practice of withholding source code came about in 1969, when IBM became the first party to charge for its software while withholding its source code. One has to wonder why IBM refused to provide source code, though; after all, copyright law already gave IBM a legal monopoly over the distribution of the program. So what was the purpose of keeping the source code secret?

The answer is simple: the point of withholding the source code was not to make money from distributing copies of the program, but to have an additional monopoly on support services. Because only IBM had access to the source code to its software, any time users of the software needed a new feature added or a bug fixed, they had to go to IBM, and IBM could charge more for the services because of this.

So, in fact, the original reason for keeping source code secret, gaining a monopoly on support services, is not even applicable to video games. Video games aren't tools, and video game developers don't sell support services. People who play games just buy them once and play them, or pay a regular fee for continued access to a server.

The New Point of Source Code Secrecy: DRM


More recently, a new reason to keep source code secret has arisen: DRM, which proponents say stands for "Digital Rights Management", though I prefer the more accurate term, "Digital Restrictions Management".

Keeping source code secret is essential to making DRM work. This is because DRM's functionality can be summed up as "the function of refusing to function". If a computer is refusing to function, and you have the means to make it stop refusing (the source code), it isn't doing a very good job of refusing to function.

But even if small video game developers were able to put effective DRM into their games, and even if the public was not strongly against DRM, it has been shown time and time again that DRM just doesn't work. There is a reason DRM use has died in music: it has been clearly demonstrated that it doesn't stop unauthorized copying. All it does is inconvenience legitimate customers and encourage them to obtain unauthorized copies that will actually work instead of buying authorized copies that won't work.

In short, source code secrecy is needed for DRM to work, but DRM is not any good in the first place.

Game vs. Game Engine


So it's clear that keeping the source code of a commercial game secret is completely unnecessary. But perhaps you think that, while it would be fine to distribute the source code of your games, it would be suicide to release that source code under a FLOSS license such as the GNU General Public License. After all, if you did that, people would be able to share copies of the game freely; how would you make money in that situation?

In fact, it might still be possible to make money in that situation. Crowdfunding can theoretically earn you money upfront to pay for the full cost of developing the game plus a good paycheck for the developer(s). It's also been shown that people will pay for quality entertainment such as games voluntarily; the Humble Bundle is a good example of this in action.

But these ideas have not had thorough testing, so let's assume that while these methods may work for some developers, they won't work for all. To understand why the most widely used method of making money from game development, selling copies, would be unaffected by releasing the source code under a FLOSS license, you must be fully aware of the distinction between the game and the game engine.

You may tend to think of the game as being one piece, but in fact it's made of two parts: the game engine, and the game data. The game data further consists of two major parts: the art assets, and the scenario.

The art assets are, obviously, things like sprites, 3-D models, sounds, and music. What exactly the scenario is depends on the game. In many simple games, the scenario will be a collection of levels. In some scrolling shooters, the scenario might be a list of what attack patterns to use at what times. In RPGs, the scenario will be a combination of the world, the quests, and the story.

The other half, the game engine, is the source code: the software that allows users to play the game.

With this distinction in mind, it is fairly obvious how one might make money from a game that uses a FLOSS game engine: simply keep the game data, or even just the scenario, proprietary.

Benefits of FLOSS Game Engines


You might be asking now, "OK, but what's the point in releasing my game engine as FLOSS, anyway?" There are currently four main benefits:

Additional Exposure

Hopefully this is temporary, but currently there are thousands and thousands of proprietary software games and far fewer FLOSS games. Being a part of this smaller pot as well as the larger pot of all games gives your game more exposure.

Benefiting from Other Developers

When several game developers release their engines under FLOSS licenses, code reuse becomes possible on a much larger scale than when everyone keeps their engine to themselves. This makes faster and easier game development possible for all FLOSS game developers.

Easy Cross-Platform Compatibility (if you use the right libraries)

One major weakness of proprietary software games is that they will only run on platforms you specifically support.

With FLOSS games, you don't need to worry about it; just provide binaries for some popular platforms and users of all the other platforms will do just fine compiling the source code themselves.

Volunteers May Help (though they probably won't)

It is popularly claimed that releasing a program as FLOSS will cause a lot of people to come over and fix bugs. In fact, this is false most of the time. What is certainly true is that volunteers can contribute to a FLOSS game, and some may do so if the game is sufficiently popular.

Counter-Arguments


I can think of three major counter-arguments against the use of FLOSS game engines:

"If I release the game engine as FLOSS, someone is just going to easily make their own game data and outcompete me."

In fact, for most games, making a whole new scenario is hard work, and most people just can't be bothered to do it. Take Freedoom, for instance: a project that has been in development for well over five years as of 2013, and yet is still in alpha. This can be attributed to a combination of lack of interest (everyone wants to play Doom or Doom II, not Freedoom) and the amount of work that needs to be put in for it to be done.

This objection is true enough for some games; Minecraft, for example, doesn't have any creative scenario to speak of (the scenario is entirely randomly generated), and in fact there are already suitable replacements for the art assets of Minecraft in a FLOSS clone, Minetest. But most games are not like Minecraft; most games have a significant enough scenario that a free replacement for it is not likely to cause a loss of profit.

"I won't be able to benefit from other people's engines, but other people will be able to benefit from mine without giving anything back to me."

This would be true if you released your game engine under a permissive license, such as the Expat License (often ambiguously called the "MIT License"). But fortunately, this is not the only option. The thought of proprietary software freely benefiting from his work, giving it an unfair advantage, was the very reason Richard Stallman pioneered the concept of copyleft, which ensures that anyone who uses the FLOSS code in question cannot do this. The most popular copyleft license, and the best choice in most cases, is the GNU General Public License.

"I don't want to share my source code because I think it's art (or some other similar reason)."

To be frank, I think this is as delusional as claiming that a DVD player is art because it's playing a wonderfully artistic movie.

Most game engines simply are not art. They are simply an interface to allow players to play the game. It's the actual game, the scenario, which is art.

If there is no scenario to speak of in the game, that doesn't make the game engine art, either. It rather makes the game more analogous to a simple toy than a DVD player playing a movie. Much like simple toys, such simple games don't tend to qualify as art.

Don't get me wrong; I appreciate code as much as the next guy. I'm a programmer, after all. But there's a difference between appreciating code and mistaking it for art.

The Moment of Truth


As you can see, not only is keeping your games' source code secret unnecessary; you may also be missing out on useful benefits because of it.

Of course, you can continue to make your game engines proprietary. It's your choice. But I hope you will make the wise choice: choose FLOSS, and don't look back.

Article Update Log


5 Dec 2013: Initial release

Writing Fast JavaScript For Games & Interactive Applications

Recent JavaScript engines are designed to execute large bodies of code very fast, but if you don't know how they work internally you can easily degrade your application's performance. This is especially true for games, which need every drop of performance they can get.

In this article I will try to explain some common optimization methods for JavaScript code that I picked up during development of my projects.

Garbage Collection


One of the biggest obstacles to a smooth experience is JavaScript garbage collector (GC) pauses. In JavaScript you create objects but you don't release them explicitly; that's the job of the garbage collector.

The problem arises when the GC decides to clean up your objects: execution is paused while the GC decides which objects are no longer needed and then releases them.

Attached Image: memory usage chrome.JPG
Zig-zag memory usage pattern while playing a JavaScript game.


To keep your framerate consistent, you should keep garbage creation as low as possible. Objects are often created with the new keyword, e.g. new Image(), but there are other constructs that allocate memory implicitly:

var foo = {}; // Creates a new anonymous object
var bar = []; // Creates a new array object
function(){}  // Creates a new function object

You should avoid creating objects in tight loops (e.g. the rendering loop). Try to allocate objects once and reuse them later. In languages like C++, developers sometimes use object pools to avoid the performance hits associated with memory allocation and fragmentation. The same idea can be used in JavaScript to avoid GC pauses, as the sketch below shows. Here you can find out more about object pools.
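A minimal pool might look like the following (a generic sketch, not tied to any particular library):

// Minimal object pool: reuse particle objects instead of allocating new
// ones inside the game loop.
function ParticlePool(size) {
    this.free = [];
    for (var i = 0; i < size; i++) {
        this.free.push({ x: 0, y: 0, vx: 0, vy: 0, alive: false });
    }
}

ParticlePool.prototype.obtain = function () {
    // Reuse a pooled object; grow only if the pool is exhausted.
    return this.free.pop() || { x: 0, y: 0, vx: 0, vy: 0, alive: false };
};

ParticlePool.prototype.release = function (p) {
    p.alive = false;
    this.free.push(p); // back to the pool, no garbage created
};

var pool = new ParticlePool(256);
var p = pool.obtain(); // inside the game loop: no allocation, no GC work
pool.release(p);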

To better demonstrate implicit garbage creation, consider following function:

function foo(){
	//Some calculations
	return {bar: "foo"};
}

Each time this function is called, it creates a new anonymous object that needs to be collected at some point. Another performance hit comes from using the array literal [] to clear your array:

var r = new Array("foo", "bar"); //New array filled with some values, same as ["foo", "bar"]
r = [];//Clear the array

As you can see, the second line creates a new array and marks the previous one as garbage. It's better to set the array length to 0:

r.length = 0;

Functions can wake up the GC as well. Consider the following:

function foo(){

	return function(x, y){
		return x + y;
	};
	
}

In the above code we return a function reference from foo, but we also allocate memory for a new anonymous function on every call. The code can be rewritten to avoid this allocation:

function bar(x, y){
	return x + y;
}

function foo(){
	return bar;
}

An important thing to be aware of is that global variables are not cleaned up by the garbage collector during the life of your page. That means objects like the above functions are created only once, so use them to your advantage whenever possible. Globals are cleaned up only when users refresh the page, navigate to another page, or close your page.

These are straightforward ways of avoiding the performance hits that come from GC, but you should also be aware of library functions that may create objects. By knowing what values your library functions return, you can make better decisions when designing your code. For example, if you know that a library function allocates memory and you use it in a performance-critical section of your code, you may want to rewrite it or use a similar but more efficient function.

JavaScript Internals


JavaScript engines do some preparation on your code (including optimizations) before execution. Knowing what they do behind the scenes will enable you to write generally better code. Here is an overview of how two popular JavaScript engines (Google's V8 and Mozilla's SpiderMonkey) work under the hood:

V8:
  • JavaScript is parsed and native machine code is generated for faster execution. The initial code is not highly optimized.
  • A runtime profiler monitors the code being run and detects "hot" functions (e.g. code that runs for long time).
  • Code that's flagged as "hot" will be recompiled and optimized.
  • V8 can deoptimize previousely optimized code if it discovers that some of the assumptions it made about the optimized code were too optimistic.
  • Objects in V8 are represented with hidden classes to improve property access.
SpiderMonkey:
  • JavaScript is parsed and bytecode is generated.
  • A runtime profiler monitors the code being run and detects "hot" functions (e.g. code that runs for long time).
  • Code that's flagged as "hot" will be recompiled and optimized by Just-In-Time(JIT) compiler.
As you can see, both engines (and other similar ones) apply common optimizations that we can take advantage of.

Deleting Object Properties


Avoid using the delete keyword for removing object properties if you can. Consider the following code:

var obj = { foo: 123 };
delete obj.foo;
typeof obj.foo == 'undefined' //true

It forces V8 to change obj's hidden class and run on a slower code path. The same is true of other JavaScript engines that optimize for "hot" objects. If possible, it's better to null the properties of the object instead of removing them.
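For example, nulling the property keeps the object's shape stable:

var obj = { foo: 123 };
obj.foo = null; // hidden class preserved; property access stays on the fast path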

Monomorphic Variables


Whenever possible, try to keep your variables monomorphic. For example, don't put objects with different hidden classes in the same array. The same applies to object properties and function parameters: functions that are always called with the same parameter types perform faster than those called with varying types.

//Fast
//JS engine knows you want an array of 3 elements of integer type
var arr = [1, 2, 3];
//Slow
var arr = [1, "", {}, undefined, true];

The function below can be called with different parameter types (ints, strings, objects, etc.), but doing so will make it slow:

function add(a, b){
	return a + b;
}

//Slow
add(1, 2);
add('a', 'b');
add(undefined, obj);

Array of Numbers


Using an array is usually faster than accessing object properties. This is particularly beneficial when the array contains numbers. For example, it's better to represent vectors as arrays than as objects with x, y, z properties.
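For instance, these hold the same vector, but the dense numeric array is typically friendlier to the engine:

// Slower: property lookup on a generic object
var vObj = { x: 1, y: 2, z: 3 };

// Usually faster: a dense array of numbers
var vArr = [1, 2, 3];
var sum = vArr[0] + vArr[1] + vArr[2];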

Arrays with Holes


Avoid "holes" in your arrays. It will make things slower than it should. Holes are created by deleting elements or adding elements out of range of the array. For example:

var arr = [1, 2, 3, 4, 5];//Full array
delete arr[0];//Creates hole
arr[7] = 1;   //Creates hole
var hArr = [0, 1, 2, 3, /* hole */, 5];//Holey array

Pre-allocating Large Arrays


Current implementations of SpiderMonkey and V8 favor growing over pre-allocating large arrays (those with more than 64K elements). Keep in mind that this is largely implementation-dependent, as some engines, such as Nitro (Safari) and Carakan (Opera), favor pre-allocated arrays.

Object Declaration


I can't give you a single best method for creating objects, as it's very engine-dependent, but you can see current performance test results for yourself here and decide what's best for your application.

Integer Arithmetic


Use integers where possible. Because most programs use integers, modern JavaScript engines are optimized for integer operations. In JavaScript all numbers have the Number type, so you can't directly specify a storage type (int, float, double, etc.) as in strongly typed languages. If your application is math-heavy, one unintended floating point operation can degrade your application's performance, and the effect can spread through your code. For example:

function halfVector(v){
	v[0] /= 2;
	v[1] /= 2;
	v[2] /= 2;
}

var v = [3, 5, 9];
halfVector(v);

In a strongly typed language like C++, if we had used an int type we would get [1, 2, 4] as the result, but here we implicitly switched to floating point math in our code.

JavaScript engines use integer math operations where possible because modern processors execute integer operations faster than floating point operations. Unlike other objects and floating point values, common integer values are stored in a form that doesn't require allocation.

To tell the JavaScript engine that we want to store integer values in the array from the above example, we can use the bitwise OR operator:

function halfIntVector(v){
	v[0] = (v[0] / 2) | 0;
	v[1] = (v[1] / 2) | 0;
	v[2] = (v[2] / 2) | 0;
}

The result of the bitwise OR operator is an integer, so the JavaScript engine knows it does not need to allocate memory for a floating point value.

Floating Point Values


As stated in the previous point, any time a floating point number is assigned to an object property or array element, memory may be allocated. If your program does lots of floating point math, these allocations can be costly. Although you can't avoid the allocation for object properties, you can use typed arrays (Float32Array and Float64Array).

These typed arrays store raw floating point values, and the JavaScript runtime can read and write them without extra memory allocations, as the sketch below shows.
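For example, a particle system might keep all its positions in one pre-allocated Float32Array (a generic sketch):

// Float32Array stores raw 32-bit floats in a contiguous buffer, so
// writing a float does not allocate a heap object per value.
var positions = new Float32Array(3 * 1000); // 1000 xyz triples

function updatePosition(i, x, y, z) {
    positions[3 * i]     = x;
    positions[3 * i + 1] = y;
    positions[3 * i + 2] = z; // plain stores, no per-value allocation
}

updatePosition(0, 1.5, 2.25, -0.75);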

Conclusion


There are many other ways to optimize your code, but this should be enough to get you started. Just keep in mind that you should always profile your code and optimize the portions that take the most time to execute.

Article Update Log


19 Dec 2013: Updated the article with some tips for integer/float values

12 Dec 2013: Initial release