Tag Archives: implementation

The uses of kernel threads

A novel approach to multitasking is assigning engine subsystems to run in separate threads. The processor is what actually runs a code stream, whilst the thread acts as an execution context for the processor.

A kernel thread is a kernel entity, like processes and interrupt handlers; it is the entity handled by the system scheduler. A kernel thread runs in the user-mode environment when executing user functions or library calls, and switches to the kernel-mode environment when executing system calls.

A thread stores the execution state of a code stream in three basic parts: the stack, which contains private data from execution; the registers, which are saved and restored whenever the thread is switched out and back in; and the thread control block (TCB), which holds the bookkeeping information the kernel needs about the thread.

When the OS schedules a new thread, it stores the registers of the current one and restores the registers of the new one. The new thread then starts running just where it left off.
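The save-and-restore step above can be sketched in a few lines. This is an illustrative model, not real kernel code; the `TCB` class, the register names, and `context_switch` are all hypothetical stand-ins for what the scheduler actually does.

```python
# Illustrative sketch: a context switch modeled as saving the current
# thread's register state into its TCB and restoring the next thread's
# state onto the "CPU". All names here are hypothetical.

class TCB:
    """Minimal thread control block: identity plus saved register state."""
    def __init__(self, tid, entry_point):
        self.tid = tid
        # Saved "registers": program counter and stack pointer stand-ins.
        self.registers = {"pc": entry_point, "sp": 0}

def context_switch(current, next_thread, cpu_registers):
    # 1. Store the registers of the currently running thread in its TCB.
    current.registers = dict(cpu_registers)
    # 2. Restore the registers of the new thread onto the "CPU".
    cpu_registers.clear()
    cpu_registers.update(next_thread.registers)
    return next_thread  # the new thread resumes where it left off

# Two threads; thread A has been running and advanced its program counter.
a, b = TCB(1, entry_point=100), TCB(2, entry_point=200)
cpu = dict(a.registers)
cpu["pc"] = 142                 # A executed some instructions
running = context_switch(a, b, cpu)
print(running.tid, cpu["pc"])   # B is now running at its own saved pc
print(a.registers["pc"])        # A's progress is preserved in its TCB
```

Because A's progress lives in its TCB, switching back to it later would resume at pc 142, just as the paragraph describes.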

It is fairly common for kernel code to create lightweight processes, kernel threads, which perform a certain task asynchronously. To see these threads, run ps ax on a 2.6 kernel and note all of the processes shown in brackets at the beginning of the listing. The code which sets up threads has tended to be reimplemented every time a new thread is needed, and certain tasks are not always handled well. The current kernel also does not easily allow the creator of a kernel thread to control the behavior of that thread.

A kernel thread is created with the following line of code:

    struct task_struct *kthread_create(int (*threadfn)(void *data),
                                       void *data,
                                       const char *namefmt, ...);

Sometimes, a thread asks the kernel to do something that doesn't require executing instructions, such as reading a block off the disk or receiving a packet from the network. These calls block inside the kernel.

The kernel starts the operation, puts the thread on a wait queue for when the operation completes, and schedules a new thread to run. It's as if the call to schedule() only returns when the operation is complete: the stack and data registers are maintained.

    Because the kernel has full knowledge of all threads, the scheduler may decide to give more time to a process having a large number of threads than to a process having a small number of threads. Kernel-level threads are especially good for applications that frequently block.

Kernel-level threads are, however, slow and inefficient; for instance, thread operations are hundreds of times slower than those of user-level threads.

    Since the kernel must manage and schedule threads as well as processes, it requires a full thread control block (TCB) for each thread to maintain information about it. As a result there is significant overhead and increased kernel complexity.

Kernel-level threads make concurrency much cheaper than processes because there is much less state to allocate and initialize. However, for fine-grained concurrency, kernel-level threads still suffer from too much overhead: thread operations still require system calls, whereas ideally we want thread operations to be as fast as a procedure call. Kernel-level threads also have to be general enough to support the needs of all programmers, languages, runtimes, etc.

Dark side of programming, hard coding, it’s more code now than game

This post covers the definition of a DSL, the applicability of DSLs to AI in particular, the practical side of constructing and deploying DSLs for AI, and the benefits (and beneficiaries) of DSLs for AI. Lastly, it concludes with some advice from hard experience on the subject.

A Domain-Specific Language is merely a language designed to be highly adept at solving a specific class of problems. Some languages are more specialized than others, some languages are designed to execute in certain environments, and some languages are intended to be as widely applicable as possible.

The first major principle that motivates our application of DSLs is code cleanliness and the effects of mapping code directly into the problem space (versus code which is too generalized, too abstract, or too specific – i.e. “hard coded”). The second major principle that motivates DSL deployment is elimination of irrelevant concerns. Our third motivating principle: removing excess overhead.

Scripts were developed in the early AI work by Roger Schank, Robert P. Abelson and their research group as a method of representing procedural knowledge. They are very much like frames, only the values that fill the slots must be ordered. A script is a structured representation describing a stereotyped sequence of events in a particular context. Scripts are used in natural language understanding systems to organize a knowledge base in terms of the situations that the system should understand.

The classic example of a script involves the typical sequence of events that occur when a person dines in a restaurant: finding a seat, reading the menu, ordering drinks from the waitstaff. In script form, these would be decomposed into conceptual transitions, such as MTRANS and PTRANS, which refer to mental transitions [of information] and physical transitions respectively.
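The ordered-slot idea can be made concrete with a toy data structure. This is a sketch only: the dictionary layout, the event wordings, and the `expected_next` helper are invented for illustration and are not taken from Schank and Abelson's actual notation.

```python
# A toy Schank-style script: an ordered sequence of events in a context,
# each tagged with a conceptual-dependency primitive such as PTRANS
# (physical transfer) or MTRANS (mental transfer of information).

restaurant_script = {
    "context": "restaurant",
    "roles": ["customer", "waitstaff"],
    "events": [                      # order matters, unlike plain frame slots
        ("PTRANS", "customer moves to a seat"),
        ("MTRANS", "customer reads the menu"),
        ("MTRANS", "customer tells order to waitstaff"),
        ("PTRANS", "waitstaff brings the drinks"),
    ],
}

def expected_next(script, last_event_index):
    """What a story-understanding system would predict happens next."""
    events = script["events"]
    if last_event_index + 1 < len(events):
        return events[last_event_index + 1]
    return None

# After the customer reads the menu (index 1), the script predicts ordering.
print(expected_next(restaurant_script, 1))
```

The point of the ordering is exactly this predictive power: given a partial story, the script tells the system which stereotyped event should come next.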

Schank, Abelson and their colleagues tackled some of the most difficult problems in artificial intelligence (i.e., story understanding), but ultimately their line of work ended without tangible success. This type of work received little attention after the 1980s, but it was very influential in later knowledge representation techniques, such as case-based reasoning.

Scripts can be inflexible. To deal with inflexibility, smaller modules called memory organization packets (MOPs) can be combined in a way that is appropriate for the situation. The behavior design is the roadmap for our AI scripting, giving us a clear set of coding elements to complete before its AI takes shape.

Although there are many different methods of designing AI, including sophisticated methods such as neural networks and artificial evolution, with a simple creature like this, our tutorial will follow the most basic and standard method used by all stock Unreal AI: conditional decision-making. This is the method that takes stimuli as a constant condition or trigger to either do this or that, also known as "if this, then do that", or "if-then" decision-making. A common example of this kind of AI is the AI Script in a Scripted Sequence, but even Bots use this method, just in an extremely complex form.
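A minimal "if this, then do that" rule looks like the sketch below. The stimuli (`npc_sees_player`, `npc_health`) and the actions are made up for illustration; they are not from any Unreal system.

```python
# Conditional decision-making: each branch is a trigger -> action pair,
# checked in priority order. Stimuli and action names are hypothetical.

def decide(npc_sees_player, npc_health, flee_threshold=25):
    if npc_sees_player and npc_health <= flee_threshold:
        return "flee"      # if hurt and threatened, then run
    if npc_sees_player:
        return "attack"    # if threatened, then fight
    return "patrol"        # default behavior with no stimulus

print(decide(npc_sees_player=True, npc_health=10))   # low health
print(decide(npc_sees_player=True, npc_health=80))   # healthy
print(decide(npc_sees_player=False, npc_health=80))  # no stimulus
```

Bot AI is the same idea scaled up: many more stimuli, many more branches, often organized into state machines rather than one flat function.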

We have some features of our NPC already defined for us: its model, textures, animation and sounds. Normally, a custom character would be started by defining its purpose or goal and then designing those features around it. But for this behavior design, we will work "backwards", starting with those features and finding the purpose and behavior design that fits them.

The perceived difficulty and cost of DSL development are largely misconceptions. DSLs are, by definition, simple and to the point. They eliminate the bulk of the issues that plague generic languages and require far less infrastructure to deploy effectively. Additionally, they require much less up-front design and development work.

Programmers in general are lazy and often try to find the simplest approach. This isn't necessarily a vice; after all, technology is a means to maximize efficiency by decreasing the energy and time required to do something. This is a godsend for game developers, as better technology can enable them to produce the best software and interface possible in shorter time spans, given how higher-ups give them specific deadlines. Scripts are ideal because of their flexibility: programmers can make necessary changes in time, whereas with hard code it would be near impossible.

The cinematics of Arkham Origins, lessons learned from making both a movie and a game

I went to MIGS this weekend and had a blast; being surrounded by all these games, as well as fellow students, upcoming developers and professionals from major studios, is quite frankly a dream come true.


My undisputed favorite part was the presentation by Ben Mattes of Warner Bros. Games Montreal, in which he spoke about his team's experiences making a movie and a game at the same time while creating the cinematics of Arkham Origins.


We all saw the TV spots and trailers; those CG cutscenes looked so visually amazing that I honestly thought it was a live action movie at first glance.


Naturally the process was very difficult. According to their stories, they only had since late last year to create everything, which is a very tight schedule. That wasn't even the worst of it. Given that they were telling a story, they naturally had to follow a script. The problem was that the script wasn't readily available to them from the start as you'd expect. No, the script was written, reviewed and approved in increments for the sake of editing flexibility, which left Mr. Mattes' team at a disadvantage with the time schedule. Considering how serious WB & DC are about their characters, it was not as if WB Games could take any liberties of the sort. Anything having to do with the story and characters began and ended with the property owners; the rest was left to the cinematic cutscene developers.


In order to properly animate the characters of the game, they made extensive use of motion capture and shot everything at a studio with an army of stuntmen and stuntwomen enacting the actions of the characters. Everything from Batman's martial arts to Joker's over-the-top body language to Copperhead's movements was done with motion capture. On the topic of Copperhead, things like climbing on walls were simulated with walls and rails that they built. For every movement that required some specific environment, the team built it in order to properly capture the right animations.

Indeed, they put in more effort than you would imagine, and of course it was a difficult task given the resources they had to gather. They had to go through the trouble of casting each motion capture actor to perfectly suit their role; in particular, they had to find a large man to play Bane. Developers don't just get people off the street to do this; in order to be hired to do motion capture, you need to be a credible actor and/or stunt person. I even met one at MIGS who told me this. Like actors in movies, motion capture actors have schedules that they and the developers need to organize. This was a huge problem for them given the issue with getting the script on time.


There is a faster method to create these cutscenes; an alternative to motion capture is performance capture, a recording method that encompasses body motion capture, facial motion capture and voice recording. The problem is, as you'd expect, that it's far too expensive.

Fortunately the long way proved to be much more ideal in the aesthetics department. The voice acting was done separately with expert voice actors such as Troy Baker as the Joker. As for the facial rigging, they did that using blend shapes, changing the facial expressions manually in Maya by interpolating with Catmull-Rom between 9 different expressions.


This ended up working better because they managed to avoid the uncanny valley and retain the exaggerated expressions of comic book characters.

They captured all these movements with the use of a virtual camera. But it's not a traditional virtual camera that's created in Maya and exported into the engine. The animators used a portable camera that shot the motion capture set, projecting the objects and animations into a virtual space. Like a regular camera, it's handled and moved into certain positions by a camera operator in order to get the exact angle they want. It's barely different from traditional filmmaking.


Arkham Origins is one of the few games this year that made use of pre-rendered cinematics, which are higher quality but take up more disk space. After all the scenes are shot, they take them into the engine and composite them in order to have… drumroll please… SHADERS! Adding lighting effects, dust particles and pyrotechnics creates a more lively and realistic environment.


The lengths the animators went to in creating their cutscenes are no different from how regular films are shot: they hire actors to perform in front of a camera handled by a camera operator, they need to follow the script, and they have to take the scenes and add effects later on in post-production. It's uncanny how much effort they went through given the number of obstacles they encountered, and to produce what they did at that caliber is to be commended. I think these cutscenes have better animation than most Pixar movies.

My only disappointment is that there wasn't enough time to ask him questions; I had tonnes.

A look at Prince of Persia: The Forgotten Sands game engine

In my last Game Engine Design class, which was on Halloween, the disappointment of my professor's lack of costume was neutralized by his showing us videos of two Prince of Persia games: The Forgotten Sands and Warrior Within. We were tasked with identifying different aspects of the former's game engine, often criticizing some of the inner mechanics despite its blockbuster production values and technological achievement.


Prince of Persia: The Forgotten Sands is a 2010 multi-platform game developed by Ubisoft. Any developer will tell you how taxing it is to design an engine to work with certain consoles; similar to car engines, they need to be carefully constructed with the capabilities and limitations in mind. The version we looked at, to my recollection, was either the PS3 or Xbox 360 version.

My primary grievance with the engine is the sloppily put together animation layering. Most of what I know about animation layering comes from Uncharted 2 and the hierarchical structure of its character kinematics. How it works is that limbs, hands, feet and other body parts are each animated separately while using triggers to switch between animation states according to either your character's relation to the world and/or your character's action.

The issue with Prince of Persia: The Forgotten Sands's animation layering is that it lacks detail and feels rushed. For example, when your character grabs onto ledges or walks on a wall for a few seconds, the hands don't seem to grip anything; it feels like that Game Jam game Surgeon Simulator, where your hand can't grip anything.

Spider Prince, spider prince, the bad layering and lack of collision makes me wince.


A major part of gaming is investment, and it's difficult to invest in your player's actions if something like grabbing onto a ledge doesn't feel solid. Internet critic and entertainer Doug Walker, under his persona the Nostalgia Critic, attributed Tom & Jerry's effectiveness in its craft to how solid the animators could make objects feel, hence enhancing the slapstick.

This technique can also apply to games. Take Ninja Gaiden Sigma, for example: what Prince of Persia: The Forgotten Sands did wrong, Ninja Gaiden Sigma did right. When Ryu climbs, runs along, or jumps off of walls, you can feel every one of his steps colliding with force against the wall. An example of a lack of solidity would be Sonic Heroes, a game I have loved since my youth, which has an issue with enemies not feeling rigid enough, to the point where you can almost breeze through them like air once your characters get strong enough. (I may blog about that game's engine some time in the future.) Naruto Shippuden Ultimate Ninja Storm 3 is a game that manages to have both soft and hard collisions. A combat-based game at its core, it gives you strong, medium and weak attacks, and naturally they make for collisions of the same level of force, provided your attack damages your opponent.

It's difficult to really convey the force of collisions in games to a lot of people without their actually playing the games, though gaming veterans and people involved in the field should be able to immediately identify such deficiencies.

The game's biggest issue is its AI and how incompetent it is. You know how in movies every enemy attacks one at a time rather than all at once? That's this game's AI to a tee. If that's what the programmers were going for, then I'd still protest that they robbed players of a challenge. It doesn't help that the AI moves at a snail's pace, both in traversing towards you and in its attacks. It takes them, no lie, 4 seconds to land a hit on you. Though your attacks are delayed as well, so it all balances out, right? WRONG! That's just bad combat. It's not even satisfying to kill them, due to the sound effects, which sound like you're being blocked rather than tearing away at their flesh.


Yeah just stand there and look intimidating, I’m sure that’ll scare the guy who can move himself more than 2 meters per second.

Once again I must refer to Ninja Gaiden Sigma: when your attacks are blocked, you hear a metallic sound effect, but when you hit, you can hear the sound of your weapon tearing flesh from skin or bones being broken. It also helps that said game has brilliantly programmed AI that is nearly as capable as your character, presenting a lot of challenge, a demand for skill on your part and the satisfaction of overcoming said challenge. Which means you had better bring it in the boss battle.
Seriously, have I hammered it into your brains already? Go play any of the Ninja Gaiden Sigma games.

I've been pretty hard on Prince of Persia: The Forgotten Sands all along. The truth is, despite its shortcomings, its engine has many positive aspects. One of them is its marriage of the camera system and trigger system. The camera manages to follow your character all over the level, placing itself at dynamic and interesting angles whilst capturing the emotions of the environment and situation. This means that when an explosion happens, the camera moves to emphasize the collision.


Perhaps the best example is when the character swings on poles: the camera very subtly moves along with him as he swings, seemingly taking you, the player, along for the ride. In parts where you're assigned an objective through an in-game cutscene, the camera will pan to point out where you're supposed to accomplish your task.

Despite the faults of the animation layering, the kinematics are above average by industry AAA standards. The screen-space ambient occlusion is top notch, and the gameplay is most likely as fun as ever; I wouldn't know, though, as I haven't played it. Modelling, texturing, shader rendering, and movement are all top notch as well.


Warrior Within proves to be a superior product despite its technological inferiority; its engine provides much better animation layering, combat kinetics and overall collisions, for all the reasons opposite to those of The Forgotten Sands. The former simply has more polished mechanics. In this battle of supremacy, the old proved superior to the new; hopefully developers take note of what it means to deliver a polished engine.

Navigation mesh as used in Insomniac Games

A navigation mesh, or navmesh, is an abstract data structure utilized in AI applications to aid agents in path-finding through vast spaces. Meshes that don’t map to static obstacles in the environment they model offer the additional advantage that agents with access to the mesh will not consider these obstacles in path-finding, reducing computational effort and making collision detection between agents and static obstacles moot. Meshes are typically implemented as graphs, opening their use to a large number of algorithms defined on these structures.

One of the most common uses of a navigation mesh is in video games, to describe the paths that a computer-controlled character can follow. It is usually represented as a volume or brush that is processed and computed during level compilation.

Games are striving for a more immersive and fun experience, and non-playable characters (NPCs) play a huge role in that; giving NPCs the ability to perceive the world around them and react according to the situation is a fundamental element of the game.

How AIs navigate is often how design gives personality to the NPCs in the game. Designers are not burdened with movement implementation, letting them worry about higher-level aspects of the game, as it should be. The industry is moving away from heavily scripted AI to more dynamic and innate movement solutions, where AI usage usually comes in as a few commands into the system.

NPCs also use the system to scan for targets, feeding results back to the AI logic to monitor threat perception by AI and player. It is also used to determine spawn locations in the level. Because Resistance single-player levels were reused by multiplayer designers, various world representations have been used as a result.

The navigation mesh has gained considerable traction recently, and the A* algorithm and its modifications are typically used for path-finding on it. What follows covers Insomniac's experience, implementation and progress; their early path-finding (A*) used volumes as nodes.


Designers laid out the entire level navigation mesh in Maya; polygons are represented as nodes and edges as connections for A*. Navigation was one of the big bottlenecks on the PPU, with a maximum soft limit of 8 navigating NPCs at a given time. An LOD system was put in place so that NPCs at a distance did not use nav, and the team went to great lengths to stagger the nav queries across frames even with the above limits.

It was a different production cycle from the PS3 launch. Roughly three RFOM levels shared a theme, and the team wanted that scale to make one Resistance 2 level. Level streaming took care of rendering, but nav was monolithic for the whole level; the design planned to increase nav-poly density by 3x, and the 8-player co-op mode being made implied enough NPCs to keep 8 players engaged in some encounters.


In the revised scheme, tri-edges are the A* graph nodes and polys are the connections/links in the search graph, which provides customization abilities: A* parametrization such as jump-up and jump-down distance is now possible. It introduced hierarchical path-finding, using A* among poly-clusters to calculate a coarse path and then refining the NPC's immediate path within a poly cluster as needed.

They ran into an inherent disadvantage of the scheme, wherein it doesn't give the shortest path a designer expects; as a result, path caching was added, and an NPC's new request for a path was checked against its last successful path (a high hit ratio). They also made the cost estimates in A* do more work: edge-midpoint-to-edge-midpoint estimates were replaced by computing the point on each edge which (greedily) minimizes cost, so the output path hardly needed smoothing.

The cost of A* at this point was so low that they took out hierarchical path-finding, spending hardly 10% of the navigation shader budget on A* even in co-op mode with ~100 NPCs using navigation on screen. Path-finding was changed to consider more parameters, such as the ability to jump up/down various distances, so that one navigation mesh serves varied-size NPCs.

A* considered the NPC's clearance threshold when searching, and the design used this feature to selectively path among AI. They preferred corner radius (maintaining a radius away from the boundary) and full-frame deferred navigation queries. The gameplay code was changed to access navigation data through queries, with the AI syncing the previous frame's results and setting up queries for the next frame's needs. Code was streamlined to batch all navigation queries, and navigation query data access was isolated for concurrency needs. This readied the system for shader offloading of navigation processing: a find-point-on-mesh query, given a volume and parameters, outputs the closest point on the mesh while maintaining the NPC's tolerance radius away from the boundary. Navigation-mesh clusters' AABBs are kept in local store for the broad phase.

Each frame reads the navigation query results from the previous frame, steers, and sets up navigation queries for the next frame, alongside the animation and physics setup update. Animation shader jobs, physics shader jobs and navigation shader jobs are added; the navigation jobs could run anywhere from after pre-update to the next frame's pre-update.

All the navigation queries were full-frame deferred (gameplay reads the results of the previous frame's issued query, steers, and sets up for the next frame if it has to).


One mesh is used for all sizes of AI, so string-pulling doesn't keep the NPC's tolerance distance at corners. This was an issue especially for big NPCs like the Titan in Resistance, which would try to cut across the edge.

An NPC has to not only get to the first bend point but also orient towards the next bend point in the path. A navigation shader job computed the Bezier approach curve for the bend point.




The starting curve tangent is computed by doing a swept-sphere check for the specified distance in the NPC's current facing direction. The end curve tangent is computed by doing a back-projected swept-sphere check from the first bend point in the opposite direction of the second bend point. The Bezier curve is computed using the above tangents, and the NPC always aims for the mid-point of the curve for a smoother approach.
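The curve construction can be sketched as follows. This is an illustrative approximation, not Insomniac's code: the swept-sphere checks are replaced by pre-supplied tangent vectors, the Hermite-to-Bezier conversion is my assumption about how tangents become control points, and all names and values are made up.

```python
# Sketch of the approach curve: given the NPC position, the first bend
# point, and start/end tangent directions (derived in the real system
# from swept-sphere checks), build a cubic Bezier and aim at its mid-point.

def bezier_point(p0, p1, p2, p3, t):
    """Evaluate a cubic Bezier at parameter t (2D tuples)."""
    u = 1.0 - t
    return tuple(
        u*u*u*a + 3*u*u*t*b + 3*u*t*t*c + t*t*t*d
        for a, b, c, d in zip(p0, p1, p2, p3)
    )

def approach_curve_target(npc_pos, bend_point, start_tangent, end_tangent):
    # Hermite-to-Bezier: interior control points lie along the tangents.
    p1 = tuple(p + t / 3.0 for p, t in zip(npc_pos, start_tangent))
    p2 = tuple(p - t / 3.0 for p, t in zip(bend_point, end_tangent))
    # The NPC steers towards the curve's mid-point for a smoother approach.
    return bezier_point(npc_pos, p1, p2, bend_point, 0.5)

target = approach_curve_target(
    npc_pos=(0.0, 0.0),
    bend_point=(10.0, 0.0),
    start_tangent=(6.0, 6.0),    # current facing direction, scaled
    end_tangent=(6.0, -6.0),     # direction towards the second bend point
)
print(target)  # a point lifted off the straight line between the two
```

Aiming at the mid-point rather than the bend point itself is what rounds off the turn: the NPC blends from its current facing into the direction it will need next.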

To support a higher volume of NPCs, especially for 8-player co-op, steering used simple dynamic avoidance, with the shader computing escape tangents for each obstacle it came across.


Steering spans all escape tangents in a circle around the NPC, and the resulting direction picked is the valid vacant hole closest to the desired direction towards the path's bend point. This simplicity trade-off was made in order to support a huge number of NPCs running, and it held up just fine, though NPCs could still get stuck, as picking the closest escape tangent could make one run into a wall.

When you wish upon A*, programmers will wonder what you are.


Unlike sports or board games, the great thing about video games is that you can play them alone thanks to AI: the Artificial Intelligence presenting you with a challenge by means of opposing enemies. AI is in nearly every game, and with our ever-ascending technological breakthroughs, developers are blessed with the opportunity to further improve it, making it smarter and more capable; one day it will be smart enough to overthrow and enslave us, but nothing to worry about, a kid born out of a confusing time paradox and a robot with a funny accent will save us all. Coincidentally related to the previous sentence, one of the many ways to implement AI is with the A* algorithm.


A* is a pathfinding and graph traversal algorithm that allows an object to follow a route by planning out a traversable path between nodes. It uses a best-first search and finds the shortest (or rather, least costly) path from an initial node to a goal node. It basically follows the path of the lowest expected total cost whilst keeping a sorted priority queue of alternate paths, and it identifies all possible routes that lead to the goal.

A* takes into account the distance it has already travelled; the g(n) part of the heuristic is the cost from the starting point. The formula for A* is f(n) = g(n) + h(n), where g(n) is the cost of reaching n from the starting node, and h(n) is an estimate of the cost of the cheapest path from n to the goal node.

Take this unweighted graph for instance:


We want a path 5 → v1 → v2 → … → 20

As such, g(20) = cost(5 → v1) + cost(v1 → v2) + … + cost(… → 20)

A* generates an optimal solution if h(n) is an admissible heuristic and the search space is a tree: h(n) is admissible if it never overestimates the cost to reach the destination node. It also generates an optimal solution if h(n) is a consistent heuristic and the search space is a graph: h(n) is consistent if, for every node n and every successor node n' of n, h(n) <= c(n,n') + h(n').
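The f(n) = g(n) + h(n) machinery fits in a short function. This is a generic textbook sketch, not any particular game's implementation; the example graph and the heuristic values are invented (and chosen to be admissible, never overestimating the remaining cost).

```python
import heapq

# Compact A* over an explicit graph, matching f(n) = g(n) + h(n).

def a_star(graph, h, start, goal):
    """graph: node -> list of (neighbor, edge_cost); h: node -> estimate."""
    open_set = [(h(start), start)]          # priority queue ordered by f
    g = {start: 0}                          # best known cost from start
    parent = {}
    while open_set:
        _, n = heapq.heappop(open_set)
        if n == goal:                       # reconstruct the path
            path = [n]
            while n in parent:
                n = parent[n]
                path.append(n)
            return list(reversed(path)), g[goal]
        for neighbor, cost in graph.get(n, []):
            tentative = g[n] + cost         # g(n) + c(n, n')
            if tentative < g.get(neighbor, float("inf")):
                g[neighbor] = tentative
                parent[neighbor] = n
                heapq.heappush(open_set, (tentative + h(neighbor), neighbor))
    return None, float("inf")

graph = {
    "A": [("B", 1), ("C", 4)],
    "B": [("C", 2), ("D", 5)],
    "C": [("D", 1)],
}
estimates = {"A": 3, "B": 2, "C": 1, "D": 0}   # admissible guesses
path, cost = a_star(graph, estimates.get, "A", "D")
print(path, cost)
```

Note how A → B → C → D (total cost 4) wins over the direct-looking A → C → D (cost 5): the priority queue keeps the alternatives sorted by f, so cheaper detours surface naturally.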

Have you ever played a PC game where the computer always knows exactly what path to take, even though the map hasn't actually been explored yet? Depending upon the game, pathfinding that is too good can be unrealistic. Fortunately, though, this is a problem that can be handled fairly easily.

The answer is to create a separate knownWalkability array for each of the players and computer opponents. Each array would contain information about the areas that the player has explored, with the rest of the map assumed to be walkable until proven otherwise. Using this approach, units will wander down dead ends and make similar wrong choices until they have learned their way around. Once the map is explored, however, pathfinding would work normally. F.E.A.R is an example of a game that uses A* to plan its search.
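The per-player array idea can be sketched like this. The map contents, cell layout, and helper names are made up for illustration; the one real constraint is the optimistic default, that unexplored terrain is assumed walkable until proven otherwise.

```python
# Sketch of a per-player knownWalkability array: pathfinding consults
# this map, not the ground truth, so units make honest wrong turns.

actual_walkable = [
    [True,  True,  False, True],
    [True,  False, False, True],
    [True,  True,  True,  True],
]

def make_known_walkability(rows, cols):
    # Optimistic default: everything unexplored is assumed walkable.
    return [[True] * cols for _ in range(rows)]

def explore(known, row, col):
    # On exploring a cell, copy the ground truth into this unit's map.
    known[row][col] = actual_walkable[row][col]

known = make_known_walkability(3, 4)
print(known[0][2])       # unexplored wall still looks walkable
explore(known, 0, 2)
print(known[0][2])       # after exploring, the unit knows better
```

Each player or computer opponent gets its own `known` array, so one side learning the map never leaks knowledge to the other.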


I've programmed an A* algorithm before; it's a basic but tremendously useful search algorithm that can be used in anything from simple Flash games to AAA titles. It's like the Liam Neeson of algorithms: for the first time in your life you could get people to willingly come after you, well, your game object at least. A programmer can dream, can't he?

P.S. you can thank Brett Gilespe for the joke in the title of this blog, that was a good one.

Quadtrees vs. Grid detection

My groupmates and I are making a fighting game by the name of Shattered Tides, and like any game it needs collision detection code. There are many ways to implement it; the real challenge is finding the most effective one in the best possible time, given our anemic time frame. After looking into it, it would seem that quadtrees are the most common way to go, but some resources mention the grid-based solution.

So which is better and more effective? The right answer depends a little on the actual game you're making, and choosing one over the other requires implementing both and profiling to find out which one is more time- or space-efficient in your specific game.

Grid detection seems to apply mainly to detecting collisions between moving objects and a static background. Its greatest advantage is that the static background is represented as a contiguous memory array, and each collision lookup is O(1), with good locality if you have to do multiple reads because entities cover more than one cell in the grid. The disadvantage, however, is that if the static background is large, the grid can be rather wasteful of space.
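The O(1) lookup is just integer division into the array. A minimal sketch, with a made-up cell size and map:

```python
# Grid broad-phase against a static background: an entity's overlapped
# cells are found by integer division; each cell test is a direct index.

CELL = 32  # pixels per grid cell (arbitrary for this sketch)

solid = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],   # 1 = solid background cell
    [0, 0, 0, 0],
]

def collides(x, y, w, h):
    """Check an axis-aligned box against every grid cell it overlaps."""
    for row in range(y // CELL, (y + h - 1) // CELL + 1):
        for col in range(x // CELL, (x + w - 1) // CELL + 1):
            if solid[row][col]:     # constant-time array lookup
                return True
    return False

print(collides(0, 0, 16, 16))    # top-left corner is empty
print(collides(40, 40, 16, 16))  # overlaps the solid cell at row 1, col 1
```

A 16x16 entity touches at most four cells here, so even the multi-read case stays a handful of adjacent array accesses.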

If instead you represent the static background as a quadtree, then the cost of individual lookups goes up, but because large blocks of the background take up small amounts of space, the memory requirements decrease, and so more of the background can sit in the cache. Even if it takes 10 times as many reads to do a lookup in such a structure, if it’s all in the cache, it’s still faster than a single lookup with a cache miss.
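The space saving comes from collapsing uniform regions into single leaves. A toy region quadtree over the same kind of boolean background, with all structure and names invented for illustration:

```python
# Region quadtree: uniform areas become one leaf, so large empty blocks
# cost a single node; lookups walk from the root down to a leaf.

class QuadNode:
    def __init__(self, x, y, size, solid_fn):
        cells = [(i, j) for j in range(y, y + size) for i in range(x, x + size)]
        values = {solid_fn(i, j) for i, j in cells}
        if len(values) == 1:                # uniform region: one leaf
            self.leaf, self.solid, self.children = True, values.pop(), None
        else:                               # mixed region: split into four
            h = size // 2
            self.leaf, self.solid = False, None
            self.children = [
                QuadNode(x,     y,     h, solid_fn),
                QuadNode(x + h, y,     h, solid_fn),
                QuadNode(x,     y + h, h, solid_fn),
                QuadNode(x + h, y + h, h, solid_fn),
            ]
        self.x, self.y, self.size = x, y, size

    def query(self, px, py):
        """Walk down to the leaf containing (px, py); cost grows with depth."""
        if self.leaf:
            return self.solid
        h = self.size // 2
        idx = (1 if px >= self.x + h else 0) + (2 if py >= self.y + h else 0)
        return self.children[idx].query(px, py)

# 4x4 map with a single solid cell; the empty quadrants collapse to leaves.
tree = QuadNode(0, 0, 4, lambda i, j: (i, j) == (2, 1))
print(tree.query(2, 1))  # inside the solid cell
print(tree.query(0, 3))  # a whole empty quadrant, answered in one step
```

Each query is a few pointer hops instead of one array index, which is exactly the trade the paragraph describes: more reads per lookup, but a far smaller structure that is more likely to stay in cache.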

Personally I'd go with the grid implementation, because it's easier to do and would be more productive for me to study. If I notice that our game is running slow, we'll do some profiling and see what could use some help. If it looks like the game is spending a lot of time doing collision detection, we'd try a quadtree and see if that's any better.