
Math for Game Developers: Quaternions

Math for Game Developers is exactly what it sounds like - a weekly instructional YouTube series wherein I show you how to use math to make your games. Every Thursday we'll learn how to implement one game design element, starting from the underlying mathematical concept and ending with its C++ implementation. The videos will teach you everything you need to know; all you need is a basic understanding of algebra and trigonometry. If you want to follow along with the code sections, it will help to know a bit of programming already, but it's not necessary. You can download the source code that I'm using from GitHub, linked in the description of each video. If you have questions about the topics covered or requests for future topics, I would love to hear them! Leave a comment, or ask me on my Twitter, @VinoBS.

Note:  
The video below contains the playlist for all the videos in this series, which can be accessed via the playlist icon in the bottom-right corner of the embedded video frame once the video is playing. The first video in the series is loaded automatically


Note:  
This is an ongoing series of videos that will be updated every week. When a new video is posted we will update the publishing date of this article and the new video will be found at the end of the playlist


Quaternions




Open Source and the Gaming Industry

Wait! Don't skip this article. I know you think this doesn't apply to you; after all, you don't program for Linux. But there is a lot more to open source than just the penguin. Many companies are taking advantage of open source software, due in large part to the increasing popularity of Linux. Open source has become a big buzzword, with many people not really understanding what it is.

In fact, most people are using open source software and don't even realize it. When you are looking at a website or checking your email, chances are the inner workings (the mail transports and web servers) are open source.

So where do we begin? Let's start with a little history lesson behind open source.

Note:  
This article was published on GameDev.net back in 2002. In 2008 it was revised by the original author and included in the book Business and Production: A GameDev.net Collection, which is one of 4 books collecting both popular GameDev.net articles and new original content in print format.


Sharing source goes back as early as the 1950's when the SHARE group was formed to exchange code for IBM mainframes. IBM released the source to the software for their mainframes, allowing changes for specific needs. This updated code was passed around the SHARE group, creating the earliest form of open source.

In 1985, the Free Software Foundation was formed by Richard Stallman to promote the freedom to distribute and modify computer software without restriction. Stallman had been working in the AI department at MIT and left in 1984 to begin writing GNU free software. (GNU is pronounced "guh-NEW" and it stands for "GNU's Not Unix".)

The Free Software Foundation (fsf.org) defines free software as a matter of liberty, not price. To understand the concept, you should think of "free" as in "free speech," not as in "free beer." Free software is a matter of the users' freedom to run, copy, distribute, study, change and improve the software.

In 1997, a paper written by Eric S. Raymond, The Cathedral and the Bazaar, triggered a series of events that led to the forming of the Open Source Initiative. One of the more significant events at the time was Netscape releasing the source to their browser. That project is still alive and well in various forms, such as the Firefox browser and the NVU (nvu.com) HTML editor. The OSI was jointly founded by Eric Raymond and Bruce Perens in late February 1998, as a general educational and advocacy organization.

Next, it is necessary to understand what open source software is and what it is not. Open source in its purest sense is software that has the source code available to anyone who desires it. That individual is free to use the code as he sees fit; however, any modifications to the code must be made available for everyone to use and modify, under the same conditions. According to opensource.org, "Open source promotes software reliability and quality by supporting independent peer review and rapid evolution of source code."

It's not source code that you download from a website with permission to redistribute and modify but also to add additional restrictions. It's not a free program like Artweaver (artweaver.de) that's made available to the public for free without the source code. It's not source code released into the public domain, where the rights to the code are forfeited. It is also not source code that's distributed freely with a clause saying it has to be non-profit. Open source creates a system of free access to information for the greater good of the project.

Types of Licenses


There are two categories of open source licenses, copyleft and permissive. A copyleft license is primarily concerned with preserving open software rights in forked versions of the project. It requires that any subsequent code be released and any finished products release the source code for the product. A permissive license places minimum restrictions on the source code. It can be modified without releasing the modifications. A permissive license can work hand-in-hand with closed, proprietary code.

Of the copyleft licenses, the most widely used is the GPL from the Free Software Foundation. It has a couple of variations, The LGPL and the AGPL. The LGPL (Lesser General Public License) makes a specific piece of code open source, but other parts of the project can be proprietary source code. The AGPL (Affero General Public License) focuses on software as a service and relaxes some restrictions on releasing the source to network clients. You can find out more about these licenses at gnu.org/licenses.

The BSD license is a popular permissive license; in fact, a permissive license will often be referred to as a "BSD-style" license. There are a number of variations on each of these licenses. If you go to opensource.org/licenses, you can find a number of licenses approved by the OSI.

There is great debate about the various types of licenses and the impact of each. Some game developers shy away from a restrictive copyleft license. There are parts of games that sometimes need to be kept away from view. Sometimes to stop cheating, sometimes to create an edge in the market. Also, there is the perspective that a copyleft license doesn't provide true freedom. There is a great blog post on copyleft (GPL) licenses for games here on GameDev.net.

Really, the choice in license is about what is best for your project. It may make more sense to have closed source or it may make sense to use open source and open your own source. There is a great overview of the legal aspects of open source licenses here.

Open Source Games


There are several open source games available. Some games have been released as open source, some have become open after a commercial release. Wikipedia has a list of open source games; I'll highlight a few here.

Parallel Realities has a number of open source games, including Blob Wars: Metal Blob Solid, a 2D arcade game. A cool thing that Parallel Realities does is release the making-of for several games.

Alien Trap has released a 3D shooter called Nexuiz, a game built on the Darkplaces engine, a fork of the Quake 1 engine. There is a Super Mario Kart clone called SuperTuxKart. The site has a good bit of information about the workings of the game and how to create additional levels.

Vega Strike is a 3D Action-Space-Sim. The website has a great deal of documentation and a developers blog. Warzone 2100 is a real-time strategy game that was released commercially in 1999, then released as open source in 2004. All of these games (and there are many others) provide a good starting point for developing your own game and for seeing various ways to organize an open source game project.

There is also an open source game that is a joint venture between Blender and Crystal Space teams. The name for the project is Apricot and is worth checking out. Speaking of Crystal Space, there are a few commercially released games that were made from the CS engine, including Ice Land.

How Open Source Helps the Gaming Industry


Funding


There are a few options for getting some funding for open source games. Right now Google has the Summer of Code, which pays college students to develop open source software over the summer. There are a couple of other groups that provide funding to open source projects; one is Linux Fund and another is SPI. There is not a lot of funding money to go around, but every little bit helps. Another idea is to do as the Apricot project did and pre-sell your final product. Some vendors may also donate hardware to an open source project.

Tools


There are a great number of tools available for just about anything you need to do in creating a game. Using open source software can greatly reduce the cost of development. For example, you can use Eclipse for your IDE, OpenOffice for document writing, GIMP for editing raster graphics, Inkscape for vector editing, Blender for 3D modeling and Audacity for sound editing.

When using open source tools, at the very least, provide feedback. It will give you and the rest of the world a better product. If you have time to spare, write some code or documentation. It will still save money compared to buying the commercially available products.

Information Sharing


Many think that the single greatest element in the rapid advancement of the gaming industry is the wide-open exchange of knowledge. There are many websites with gaming information, source code for games being released to be examined and studied, magazines and books giving anyone the ability to understand game making techniques. Open source also encourages and promotes this same information sharing.

Back in 2004, I started using the Mojavi framework for web development. I was able to get quite an education on the Model View Controller pattern from Sean Kerr, the developer of the framework, just by hitting the forums, IMing and volunteering to help out when possible. In return I wrote documentation and tutorials for the project. We both benefited (and so did the community) and as a bonus, I had the opportunity to get to know a great person. Unfortunately, Sean was unable to maintain Mojavi, but the project forked and now lives on in Agavi.

Reduced Overhead


Using open source libraries is a great way to reduce time on a project, save money and have more developers working on making the best library possible. The many eyes on a project are one of the greatest strengths of open source software.

There are a number of open source libraries available; searching for your need along with the phrase "open source" opens a world of possibilities. One library to consider is the Simple and Fast Multimedia Layer, a cross-platform multimedia library designed to provide low-level access to audio, keyboard, mouse, joystick and networking. Another is the SQLite database, which you could use in your game for data storage.

Do you need a game engine? You could use the Crystal Space engine (crystalspace3D.org). Is there a need to write a scripting language when you could use Python (python.org)? If you need an audio library then you could use OpenAL. Is it completely necessary to write a physics library when the Bullet library might work for you?

Development Speed


Imagine the ego trip for a hard core gamer to be able to say that he beta tested the newest RPG on the market. This alone will draw a number of contributors to an open source game project, especially if there is some credit in the finished game.

Having additional coders to fix or improve code after an alpha or beta stage can be a great benefit to the project. Oftentimes, as the project comes to a close, there is great pressure to hurry up and get the product finished. This is an area where open source can also be a benefit. The additional coders working on your project will be eager to see the project finished, potentially increasing their productivity.

I did an informal survey of developers on a couple of open source projects. I found, on average, the programmers that responded to the survey spent 87 hours a month writing code for various open source projects, the time spent ranged from 10 hours a month to about 350 hours a month. Look at it this way; if you had 4 outside developers (which is very realistic) spending an average of 87 hours a month on your project, you have picked up the equivalent of a full time programmer for the project without the added cost.

Additionally, just over half of those who responded say they make sure the code compiles and clean up minor errors on a daily or bi-daily basis. Also, many developers spend time on documentation and product support.

Reduce Redundancy


A few years back, while writing a 3ds importer for the Crystal Space library, I felt there could be a savings in time if I could find someone who had already done the dirty work of translating the 3ds information. I found lib3ds (lib3ds.sourceforge.net), which suited my needs; however, its I/O interface was written specifically for disk access. Crystal Space uses a Virtual File System, making it incompatible with lib3ds.

I sent an email to JE Hoffman, project leader of lib3ds and told him my dilemma. He gladly rewrote the interface to use any data format and it was integrated into Crystal Space. However, the story doesn't end here. Lib3ds was known to work only on Windows and Linux. Crystal Space covers many more platforms. I sent an email to the Crystal Space list asking for help for lib3ds to get it to work on other platforms. A Crystal Space developer, Eric Sunshine, responded and helped out on making lib3ds available for more platforms.

This was a win-win situation. Crystal Space got an importer; lib3ds was made available on more platforms. The programming community also benefits with having a more versatile library.

While this may not seem like the greatest example, how many trivial programming tasks consume parts of your day? Do you really need to write a compression library, or will zlib work just as well for you?
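To put that in perspective, here is a minimal sketch of what zlib already gives you, using its real one-shot calls compress() and compressBound(); the compressBuffer() wrapper and the use of std::vector are just illustrative choices, not part of zlib itself:

#include <zlib.h>
#include <vector>

// Compress a raw buffer in one shot; returns an empty vector on failure.
std::vector<unsigned char> compressBuffer(const unsigned char *data, uLong size)
{
    uLongf destLen = compressBound(size);   // worst-case size of the compressed output
    std::vector<unsigned char> out(destLen);
    if (compress(out.data(), &destLen, data, size) != Z_OK)
        out.clear();                        // let the caller decide how to handle failure
    else
        out.resize(destLen);                // shrink to the actual compressed size
    return out;
}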

Project Stability


Project stability could be the greatest case for using open source software. Open source software has been peer reviewed every step of the way. It has been tried in a number of environments and under a number of conditions. Just think of the number of machines with the wide variety of configurations that will be testing the product.

Mature open source code is as bulletproof as software ever gets. Why? It's always being looked over and compiled and tested. In my survey of open source developers, a number of developers spent a sizable percentage of their time making sure the code was in tip-top shape.

Broader Market


Why do game companies limit the number of platforms for which the product is available? Is it because they don't want that market share? Of course not. The reason is that that market share doesn't cover the costs of developing for that platform.

Would that platform be profitable if it could be added for next to nothing? Absolutely. This is another added benefit of open source; end users are often more than willing to do the necessary work to make a game work on their platform. Each added port is an added market for the game.

The additional exposure for the game generated by being an open source project can also help the project gain a greater market share. Anyone contributing code to the project would have an additional incentive to purchase the finished game.

Opportunity for Lone Gunmen


There have been many lamentations lately for the solo game developer. Oh, the good ol' days of the past when you could write a game yourself, put it in a zip-lock bag and sell it at a computer show. I don't believe the days of the solo developer have to be a distant memory of the past.

Granted, a single developer will never put together a project with the scope of Scorched 3D, but a Zuma-type game could be developed by a single developer. There are a number of open source tools and libraries available for creating Flash games or Java-based games that can be delivered across a number of platforms.

Making money with Open Source


The first argument you will hear from businesses concerning open source is that you cannot make money with it. Au contraire. There are companies making money with open source; we'll look at a few of them and at some ways that you can use open source to make money making games.

The first thing I would recommend is heading over to pentaho.org/beekeeper to download and read the brilliant paper called The Beekeeper. It is a well thought out analogy on making money with open source and can provide great insight into how open source fits into a business model.

Creating Titles


If your focus is making games and not creating technology, open source is an excellent way to devote more of your resources to the game play and less to the technology involved. A line of titles could be made from open source code, keeping the content of the game protected and making money from it as normal; only the cost of development would go down. I believe this would also help to shift the focus of games from eye-candy to content.

Selling Libraries


Dual licensing is the best of both worlds. This can be a working model for companies that make money from selling libraries. You have an open version, licensed with a GPL-like license, that helps with development and finding bugs. With the GPL, any additional work by other developers would go back to the project. However, if a company wanted to use the library as a base to start from, but not give any new development back to the community, then you charge that developer for the same software under a different license.

Selling Support


Much of the money made in selling libraries is actually in the form of selling support. This is included in the initial cost of the library; however, many charge an additional fee for each continuing year after the first. The same could be done with open source software. The support could include initial training for the product, setting up systems for the product, implementing specific changes to the software and telephone support.

Selling support doesn't have to be limited to software you have developed. Many open source projects fall short in documentation and support. You could step in to offer that service for companies trying to decrease the learning curve.

Peripheral Makers


If you make the drivers for sound cards open source, you increase the potential of your hardware working on a greater number of platforms. With peripherals there is probably no revenue from the drivers so the cost of moving to open source is minimal.

Having the drivers readily available and modifiable encourages developers to add support for a product in the game. This increases the number of products and subsequently the number of potential end users for their product.

In 2007 when AMD started opening up their graphics drivers, Novell released an alpha version of the ATI Radeon driver in just 8 days. It definitely makes more sense for developers who know an operating system to write a driver.

Writing


Who could better give a view of a project than one of its developers? There are many writing opportunities available. A book about using the software could be helpful to developers using the software. Magazine articles about the technology used in the software would help to give greater insight into the project and give back to the community.

While this isn't an option for everyone, there are those who could make money this way. A variation on this theme would be giving speeches about areas of the software.

Downsides


As with any good thing, there are a few downsides as well. This is also true of open source. However, most of the downsides can be dealt with if enough thought and effort is put into it.

Clusters


Oftentimes group design ends up being more talk than action, or, without a unified direction, the project ends up going nowhere. Either of these could send the project into a serious tailspin.

A cluster could be overcome by having a good design document. Another possible solution could be doing all of the engine design internally and then opening the project up. It may even be a good idea to delay releasing the code until the project's alpha release.

Cheaters


This is a serious, legitimate concern. Cheaters take away from the enjoyment of other players. The concern already exists with proprietary source code; having the source wide open would make it easier for cheaters to see how to beat the system. This was a problem when the source for Quake was released as open source. Will this be the one area that makes open source difficult to use in the gaming industry?

One solution is to use a license, such as the BSD, that would allow for the non-distribution of sensitive client/server code.

The Competitive Edge


Many companies feel that having the latest and greatest eye-candy gives them a competitive edge. Open source would take some of this away. Anyone using your code would have the same features. Even if they weren't using your code, they could see what is being implemented, describe it to another programmer and legally have the same features without deriving from the original work.

However, real competitive advantages do not come from having "cooler" graphics than the next game; they come from having solid, creative game play. If the focus remains on game play and not on graphics, then someone getting your newest graphical feature becomes less of a concern.

Conclusion


Unfortunately, many developers are very dogmatic when it comes to open source software. So the flame wars about proprietary versus open source rage on. Then it becomes evil capitalist against dazed and confused socialist. All of this is counterproductive and takes away from making games.

It is best to remember that open source is a tool to make better software. Not everyone wants to use that tool. Some think that proprietary software is the way to go. Of course, everyone has the right to be wrong!

Seriously, determining if open source is right for a project has to be decided on an individual basis. Eric S. Raymond has said that open source is not for everyone. His writings are a good place to start in considering if open source is right for your project.

In the end, it's all about writing code and making games, and while healthy debates are good and very needed, open source shouldn't be treated like a religious experience.

Removing The 'Tech' From 'Design Document'


Who are You?


If you are an aspiring game designer, artist, or programmer, this is your ticket to better game development. Think of it as "Introduction to game development 101" for anyone looking to break into the industry. You will learn about the differences between the mysterious "Design Document"1 and the even more mysterious "Technical Document" that together comprise the road map for developing a game.

If you are coming from a design, art or sound background, you might question the value of being familiar with “the technical details.” Like all media forms, games have their own set of medium restrictions which influence design decisions. If you want your design, artwork, or sound contribution to be the best it can be, it is important to understand how your contribution translates into a working game. If you are coming from a programming background, the technical documentation is ultimately a reflection of the game design. It is essential to know how to map from the Design Document to the Technical Document, and what belongs in each. This is true even if you are not directly involved in the design, since you will be responsible for implementing the most important part - interaction!

Note:  
This article was originally published to GameDev.net back in 2002. It was revised by the original author in 2008 and published in the book Design and Content Creation: A GameDev.net Collection, which is one of 4 books collecting both popular GameDev.net articles and new original content in print format.


Why Do Upfront Design?


Let's get some things straight – these days professionally developed games have multi-million dollar budgets. In some cases, hundreds of people work on a single game project over the course of years. Clearly, these people require a plan so they can work towards a common goal. The goal is usually to create a fun game, although considering some of the massively-hyped duds that have come out over the years, you can't help but wonder exactly what the goal was (or maybe, where the plan was.)

That being said, upfront design isn't just for large teams of professionals! Aspiring game developers2 also benefit from the process and end result of having useful design and tech documents for their new game. If you have never thought about “game design” before, and if I haven't lost you yet, it's time to explain the title of this article. As you may already know, design documents are essential to any game. They tell you, and the other people working on the game (if there are any) what the game is about, what it will eventually be, and all the little details that define the game's “design”. On the other hand, do you really need to plan the implementation of the graphics library in advance? Or make considerations for what platforms3 your game will be released on? You already know your main character is going to have a red hat. You can figure out how that will appear on the screen later, when you get there, right?

Sure.

In fact, many game developers do exactly that every day. Heck, you should see some of the MMORPG design documents that are out there on the web. They are huge! But if you go to the creators and ask them for a technical document, they might look at you like you're speaking another language. Or *cringe* they might say that the technical document is included in the design document. Funny, there are only a few MMORPG's in good working order, and only a handful from amateur sources. Well-done design (including technical) can save your project from being relegated to the “might have been cool” project category. It will ensure that your game actually gets finished, and somewhat resembles the original vision4.

What's This Game Design All About?


Let's quickly go over what “game design” is, just to be sure we are on the same page. We'll get started with GameDev.net's Game Dictionary definition of a “Design Document”:

"A document that the designer creates which contains everything that a game should include. Sometimes referred to as a "design bible", this document should list every piece of art, sound, music, character, all the back story and plot that will be in the game. Basically, if the game is going to have it, it should be thoroughly documented in the design document so that the entire development team understands exactly what needs to be done and has a common point of reference."


Effectively, it is a description of how the game will look when finished, from the user's point of view. It's the end result, or the final goal of developing the game. Other articles in this book already cover game design in greater detail, so let us focus on what does not go into the design document.

A Design Document is not the place to:
  • Keep software requirements.
  • Keep project milestones.
  • Depict software architecture (that includes UML diagrams, with the possible exception of Use Cases).
  • Describe specific art file formats, sound file formats... or any other file formats!
  • Describe system requirements, or performance metrics.
  • Describe game development processes (artwork pipeline, coding standards, etc.).
To summarize, a Design Document is not the place to have detailed information on how the game will be implemented. That's what technical documentation is for. The reason? To keep the concepts separate. Also, while game design is subject to (often a great deal of) change during the development phase, the design document should keep pace. That means that, regardless of the technical status of the project, you always know what the end result should be. Games often change target platforms, or need to be developed for new platforms. With radically different platforms, the game project's technical documentation might be completely different. In fact, these days, it's not surprising to see separate companies working on the same game for different platforms. For example, LucasArts' upcoming game The Force Unleashed is being developed by multiple companies for several different console systems. On the other hand, the design will remain the same for the most part, usually only changing to account for different control schemes, and possibly different levels of audio/visual fidelity. So you save time by keeping them separate and up-to-date.

'Tech' From 'Design Document', Huh?


Up until this point, I really haven't discussed why you need to have a technical document. Clearly you could start developing artwork, sounds, and music for your game with just a design document. Maybe start hacking out code for one (or more) of the platforms you intend to run the game on. Unfortunately this is a simple recipe for disaster. Starting production on a game without a detailed technical overview of how the game will come into being is like driving to a city you know, but not knowing where you are starting from and having no map to direct you to your final destination.

Basically, the design document describes the "place" you want to be at some point in the future. It's what you expect to be able to see, hear and touch when you arrive. However, you need to know how to get to that place. You need to know where you are starting from. That's where the technical documentation comes into play. It describes how you will get there – all the tools you need, and all the directions. Everything you need to get your game done.

Here's the definition of a technical document from GameDev.net's Game Dictionary:

"A specification for all of the programming algorithms, data, and the interfaces between the data and the algorithms."


Okay, while that's a concise definition, it teaches you little, if anything. Also, it is a very software-centric view of game development. These days, most teams have more artists than programmers, and they can often have very technical roles in the project as well. With that in mind, here is a simple checklist for getting you started on your own technical documentation:

A Technical Document is the place to:
  • Write the requirements. Typically software requirements – but it's good place to write requirements for artwork, music and sound development too.
  • Write which platforms the game will run on, and the specific details for each platform. Specifying clearly what (if anything) will be different on the platforms is important!
  • Describe how art, sound, and other game assets will be managed and put into the game, as well as the process for converting from standard media formats to game-specific formats.
  • List file types, data layout, etc. If you intended to have your own image, sound, map formats, etc. this is the place to document them.
  • List projected system requirements (usually only applies to PC/Mac games).
  • List all the technologies the game uses, and (ideally) how the game benefits from them. This includes software tools, licensed artwork, etc.
  • Describe the system architecture (usually thought of as software architecture).
  • Give consideration to future additions, and "what ifs". Doing this right will help you avoid boxing yourself into a corner with a rigid design/development process.
  • Specify documentation conventions such as coding conventions, artwork format requirements, etc.
That's a good starting list, but by no means complete. What you actually need to cover will vary, but generally the larger and more complex the project, and/or the more people involved, the more important it is to describe as much as possible in detail. Now that we've cleared up both what a design document is and what a technical document is, I think it's time we had a look at an example of a technical design document.

MMORPG? No. 3D Shooter? No. Tic-Tac-Toe? You must be kidding…


Unfortunately I'm not. Unless you're a documentation fanatic, a technical document for a program as small as a tic-tac-toe game may be overkill, but it's an easy example to work with. And, hey, we'll try to spruce it up a bit as we go. More importantly, I'm going to give you a major heads up. Technical documents may or may not contain information that covers the same areas. That is to say that if one technical document describes a MMORPG, and another describes tic-tac-toe, they will not and should not try to cover the same areas. It is not a simple question of page numbers either. What they contain will be significantly different, and reflect the different challenges in each – or lack thereof.

With that in mind, I am not going to suggest that you memorize some arcane template for a technical document. Although you might benefit from something like that, for the wrong game project you could end up wasting all sorts of time considering things that just do not matter. Likewise, you could ignore important considerations – probably much more of a concern than over-compensating. So, when you ask yourself if you have considered and documented enough yet, keep this in mind: The goal is to develop technical documentation that lets you meet the goals of the design document – nothing more, nothing less.

Okay, the following example doesn't try to fill in all the details (yes, even for a tic-tac-toe game), but it gives you an idea of how you can take apart the concept and use it for your own work.

SUPER+AWESOME TIC-TAC-TOE
Technical Document


Table of Contents

1. Introduction
2. Requirements
3. Development Process Overview
4. Software Architecture Overview
5. ...
13. Future Considerations


1. Introduction
This document describes the overall technical description of Super+Awesome Tic-Tac-Toe. Based on the game design document, the platform we are targeting (Windows PC), and the resources dedicated to the project, we will (etc...)

2. Requirements [Sample of high-level requirements]:

1. The program must be able to run on any Windows XP/Vista based system.
2. The program must not depend on any code libraries, other than those included with the OS itself.
3. It should be controlled exclusively by the mouse. No other input device should be required. That means we can't plan on having the player use a keyboard.
4. Each of the two players can interact with the game board on screen during their turn only, using the mouse.

(etc...)

3. Development Process Overview:
The development process will be fluid, using an iterative agile software development method. Even with an agile approach, we will try to do as much upfront design and documentation as possible. We will use Test-Driven-Development during software development...
(etc...)

4. Software Architecture Overview:
The game will be written for the Windows platform, using C++, and making use of the default Win32 platform functionality. We considered using DirectX, but it is simply not necessary to meet the game's graphical/sound requirements based on the provided design, so we chose not to make use of it. Due to the simplicity of the project, the overall software architecture will effectively be a default implementation of the standard model-view-controller software architecture pattern.

Attached Image: BasicDesign_RemovingTechFig1_MSikora.png

(etc...)

13. Future Considerations
Some of the things that were brought up during design for consideration for the next version of Super+Awesome Tic-Tac-Toe were:

1. Supporting multi-player over the Internet.
    We discussed the ways our current design could accommodate for this in a future version and decided that we could adapt the design to use a new Controller without changing the other parts of the system ...
(etc...)
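To make section 4 of the example a little more concrete, here is a minimal, hypothetical C++ sketch of how the model-view-controller split for a tic-tac-toe game might be laid out. None of these names come from the example document; they are only meant to show how the three roles stay separated:

#include <array>

// Model: the board state and the placement rule; knows nothing about input or drawing.
struct Board {
    enum Cell { Empty, X, O };
    std::array<Cell, 9> cells{};

    bool place(int index, Cell mark) {
        if (cells[index] != Empty) return false;
        cells[index] = mark;
        return true;
    }
};

// View: renders the model; a Win32/GDI implementation would sit behind this interface.
struct View {
    virtual void draw(const Board &board) = 0;
    virtual ~View() = default;
};

// Controller: turns a mouse click on a cell into a move on the model, then asks the view to redraw.
struct Controller {
    Board &board;
    View  &view;

    void onCellClicked(int index, Board::Cell currentPlayer) {
        if (board.place(index, currentPlayer))
            view.draw(board);
    }
};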

Conclusion


The ultimate goal here was to stimulate your thoughts on how game design and technical design fit into the overall game development process. If you are working on some of your first game projects, the important message to take away is that planning and documentation are important for all the reasons outlined here - platform requirements, design reuse, communication, and most importantly, working towards a common goal as a team.

1In most development studios a Game Design consists of many documents. The same applies for Technical Documentation. I'll keep referring to them as a single document just to make things a little more readable.

2I will just use “game developer” to refer to all the people involved in game development...like designers, artists, programmers, level designers, sound editors, script editors, producers, quality assurance people, etc

3Even amateur developers are releasing their games on multiple platforms these days, and it's practically a requirement for full-on commercial titles.

4A vision, which more often than not, starts life on the back of a cocktail napkin.

Getting Started with the D Programming Language

I’ve been using the D Programming Language for ten years now, these days more than ever. Over the years, I’ve experimented with different ways of compiling my D projects (make, build tools, custom build scripts) and with different editors and IDEs. Despite the fact that there are several great options now for using IDEs such as Visual Studio, MonoDevelop and Eclipse for D development, I have found myself consistently working from the command line.

In this article, I’m going to show how to get started with a D development process that mirrors my own. I’m not trying to push this process on anyone, nor am I making any claims that it’s the best way to go about developing software with D. Developers can be a fickle lot and each has their own preferences and habits, so I would never presume to say, “This is the One True Way!” Instead, I’m hoping to show those who are curious about D that it’s quite simple to get up and running so that experimentation can begin in no time.

I should also say before I go further that I live on Windows. I venture into the land of the penguins now and again, when I really have to, but I prefer to avoid it as much as possible. And the only Apple products I’ve used since the Apple II have been iPods/Pads/Phones. As such, this article is going to be quite Windows-centric. That said, it shouldn’t take too much effort to translate the setup I describe here to those other OSes.

Installing DMD


At the time of writing, there are three D compilers available. GDC is built on top of the GNU Compiler Collection. LDC is based on LLVM. DMD is the reference compiler maintained by Walter Bright, the creator of the D Programming Language. GDC and LDC are great compilers. In fact, they often produce more performant executables than DMD. But they do require a bit more effort to set up on Windows than DMD does. So for this article, I’m going to focus on DMD.

On the DMD download page you will find a Windows installer, a dmg file for Mac users, several deb packages and RPMs for a few different Linux distros and an all-inclusive zip file which contains binaries for all supported platforms. Personally, I prefer the zip file, even for the rare occasions I have to drop into Linux. I really believe that the Windows installer is superfluous. There’s nothing to configure besides the path and nothing needs to be copied into any system directories. Unzip, set the path, done.

A decision must be made in how to set the path. With the zipped DMD package, the path for the Windows binaries is “dmd2\windows\bin”, but there are also binaries for FreeBSD, Linux, and OSX (32- and 64-bit where supported). The simplest thing on Windows is to set the global path (a quick trip to Google should help in figuring out how to do this for a particular version of Windows). Be careful, though, as one of the Windows binaries is the Digital Mars version of ‘make’. If you have another version of make on your global path, that can cause some headaches. Setting the global path also makes it difficult to support multiple versions of DMD. There are easy ways around that (I use batch files and cmd.exe shortcuts, and there’s also Jacob Carlborg’s D Version Manager), but for someone just starting out it’s not a big deal. Linux users using the zip file can either copy some stuff around to make everything globally visible, or edit the appropriate shell config file to set things up for a specific user. More details can be found at dlang.

Once the path is configured properly, whether manually or via installer or deb package/RPM, get to a command prompt and type the following:

dmd

Invoking dmd with no args will cause it to display the version number along with all the valid command line options, the same as typing dmd --help. This will confirm that the path is properly configured. Next, open up a text editor (I used Crimson Editor, a.k.a. Emerald Editor, for years, but switched to Sublime Text 2 a few months back — well-worth the price of the license). Enter the following into a text file and save it somewhere as "hello.d".

import std.stdio;

void main() {
    writeln( "Hello D!" );
}

Now navigate on the command line to the directory in which hello.d was saved and enter dmd hello.d. If it compiles without error, then the compiler was able to find the standard library and all is well. If there were errors, be sure to read them carefully. Most DMD errors are fairly clear, but for someone without experience with compilers on the command line this might not be the case. At any rate, if it is not clear that the error was caused by the code or the configuration, the digitalmars.D.learn newsgroup is the place to go for answers. For those who left their newsreaders in the 90s or have no idea what a newsreader is, there is both a forum interface and a mailing list interface for all of the D newsgroups to make them more palatable.

One final note on the installation. The vanilla DMD package on Windows only fully provides what is needed to compile 32-bit programs. For 64-bit, it’s only half the story. 32-bit makes use of the Digital Mars C++ compiler backend toolchain (including the annoyingly ancient OPTLINK linker and some equally ancient Win32 libraries). 64-bit compilation actually relies on the Microsoft C++ compiler tools, meaning a version of the Windows SDK which includes the compiler tools must be installed. Right now, that means version 7.1 of the Windows SDK.

How to Avoid Invoking DMD Yourself


Many statically compiled system languages require a two-step process to create executables: compiling and linking. This can be condensed into one step if all of the source files and required libraries are passed on the command line together, but when compiling and linking separately each source file needs its own compile step, then all the object files need to be passed to the linker to generate the executable. Either way, when multiple source files are involved, this can be difficult to manage by hand.
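As a quick illustration (the file names are made up), the two-step route on Windows looks something like this, using DMD's -c (compile only) and -of (name the output file) switches:

dmd -c player.d
dmd -c enemy.d
dmd player.obj enemy.obj -ofgame.exe

while the condensed, single-step form passes everything at once:

dmd player.d enemy.d -ofgame.exe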

It would be nice if DMD (and GDC and LDC) compiled in a way more similar to Java than C and C++. They don’t. However, there are some tools out there that do. One of them actually ships with DMD. Before I get to that, a little history (because I just can’t resist).

A few years back, a D user named Derek Parnell created what, as I recall, was the first build tool for D. Called ‘bud’, it worked by following the import tree from the main source module, gathering up each imported module, and passing all of them to DMD for compilation. For example, if bud foo.d were executed, and foo.d imported bar.d, then bud would see that import, add bar.d to its list of files to compile, then scan imports in bar.d, and so on, until there were no more imports to parse. The tool became quite popular. In my binding project, Derelict, I initially used make, but eventually switched over to supporting bud as the default build tool via a build script. I began to recommend that people not compile Derelict at all, but just use bud to compile their app so that the Derelict modules would be compiled automatically. Compilation with DMD has always been extremely fast. bud made it extremely convenient.

Not long after, Gregor Richards released DSSS, which was a build and package management tool. Initially, he was using bud for the build side, but decided later to create his own tool called Rebuild (a play on bud, which was originally called "build"), which he released individually. Now it became possible to make D libraries distributable and compilable via a simple configuration script. Users could execute dsss net install derelict, for example, to automatically download a library (in this case, Derelict), compile it, and make it available to any new D programs developed on the local system. It looked set to become the de facto standard.

Unfortunately, neither project is active any longer. They were still usable for quite a while, but they inevitably fell off the radar. Thankfully, some new options have since become available.

Shortly after Andrei Alexandrescu became involved with D, he slapped together a little utility called ‘rdmd’. The basic concept behind it is the same as bud. It follows imports and pulls in all the imported files, then hands them off to DMD for compilation. By default, it executes the binaries it compiles, but can be told via a command line switch to compile without executing. It has other options as well, such as the --eval switch which allows code to be entered on the command line. See the documentation for details. It now ships with DMD on all supported platforms, so it’s easy to get compilation of multiple D source files up and running out of the box without installing any third-party tools.
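In practice (again with a made-up file name), rdmd main.d compiles main.d plus everything it imports and then runs the result, while

rdmd --build-only main.d

produces the executable without running it; see rdmd --help for the rest of the switches.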

The newest build tool on the block, and the one I’ve taken to using, is ‘dub’. It’s not just a build tool, but also a package manager, the spiritual successor to DSSS. This tool comes from Sönke Ludwig (who also gave us vibe.d). He has set up a registry for developers to register their projects. Users can specify registered projects as dependencies in a dub configuration file (package.json), and the tool will make sure all dependencies are downloaded, installed and compiled. Currently, only projects on github and Bitbucket are supported for automatic installation.

I now use dub for all of my current D projects and plan to continue using it in the future. There are two things I really love about it. First, the configuration file is dead simple to set up. The tool can generate a generic one for you along with a directory tree for your project, but since I don’t like the directory tree it creates, I just create my own and copy my package.json file from project to project, editing it as I go (really, it’s dead simple). Second, the tool has some simple heuristics to differentiate a library project from an executable project, but can be configured to compile any source tree in any way. This is extremely useful for library development. In the past, I’ve always had to keep a separate test program as I develop a library and compile it separately. With dub, I just specify two different configurations in the package.json file, set up the project to match dub's simple heuristics, and I can use a single source tree as a library or as an executable.

Just to more clearly illustrate what I mean, imagine a library called ‘MyLib’. All of the library source modules will be in the “source\mylib” directory. Then, a separate module, “source\app.d” is created. In package.json, dub is instructed of two configurations. One, the default which is used when dub build is executed, compiles a library. The second, let’s call it ‘dev’, is used when the command dub build --config=dev is executed and will produce an executable. The tool will automatically include “source\app.d” in the second case, but will ignore it in the first. The two configurations in the package.json file only need to specify a name for the configuration and a target type (library or executable). There are other options, but those are the only two required. An example is provided in the next section. More info is available at the dub registry.

To summarize, it’s easy to get going with multiple source files out of the box with DMD by using the rdmd tool that ships with the compiler. When experimentation moves beyond using the standard library and making use of third-party libraries, dub makes that a piece of cake as well.

Derelict 3


Now I’m going to toot my own horn a little. Given that this is a game development site, I would imagine that people experimenting with D for the first time will quickly move beyond “Hello world” style programs and want to get to something more visual. That’s where Derelict 3 comes in.

Derelict is a collection of D bindings to a number of popular C and C++ libraries that are useful for game development. I first started working on it in early 2004, providing dynamic bindings for OpenGL, SDL, and OpenAL. Over the years, it has evolved through three major versions, packages have been added and removed, build systems have come and gone, and I’ve pulled out more hair than I care to think about. I try to spend as little time on it as I can possibly get away with without feeling guilty. So far, that has served me well.

Getting started with Derelict isn’t too terribly difficult. The easiest way is to use dub. I’ll describe the harder way first.

Anyone who has been doing any sort of C or C++ development long enough must be familiar with git by now. If not, don’t look to me for help. Just move on to the next paragraph. Otherwise, Derelict 3 can be cloned from https://github.com/aldacron/Derelict3.git. Once that’s done, open up a command prompt with DMD on the path, cd to the “build” subdirectory in the “Derelict3” folder, and execute dmd build (or substitute gdc or ldc2). That compiles the build script. Now execute build (or ./build for the penguins). This will compile libraries for each Derelict package and each library will be output to “Derelict3/lib”. To use Derelict, make sure “Derelict3/import” is on the import path (the import path can be set on the command line with the -I switch) and that the libraries are linked (via pragma(lib, “path/to/lib”) in source or specified on the command line). Yeah, this is the more complicated way, particularly since each compiler has a different way of specifying the library path on the command line, and with DMD it depends on which backend is being used (I can never remember what it is for 32-bit DMD on Windows). So I recommend using dub anyway.

Here’s the package.json for another library I’m working on. The relevant bit for Derelict is in the “dependencies” section.

{
    "name": "derringdo",
    "description": "A framework for 2D games with SDL2.",
    "homepage": "",
    "copyright": "Copyright (c) 2013, Michael D. Parker",
    "authors": [
        "Mike Parker"
    ],

    "dependencies": {
        "derelict:sdl2": "~master",
        "derelict:physfs": "~master"
    },

    "configurations": [
        {
            "name": "lib",
            "targetType": "library",
            "targetPath": "lib"
        },
        {
            "name": "dev",
            "targetType": "executable",
            "targetPath": "bin"
        }
    ]
}

Nothing to it. Edit this to fit your project, save it in the parent directory of the project source tree, execute dub build or dub build --config=dev (as described above) and D Programming with Derelict 3 happiness is sure to follow.

Using Derelict in code is quite simple as well. Let’s assume a project using SDL2 and SDL2_image, two commonly used game development libraries. This snippet of code shows how to import the relevant Derelict modules and load the libraries (remember, Derelict is a set of dynamic bindings, meaning it is designed to load shared libraries (.dll, .so and .dylib) at runtime).

import derelict.sdl2.sdl;
import derelict.sdl2.image;

void main() {
    DerelictSDL2.load();
    DerelictSDL2Image.load();
}

That’s it. If the libraries fail to load, a DerelictException will be thrown, specifically one of the subclasses SharedLibLoadException or SymbolLoadException. The former is thrown when the library fails to load, the latter when a symbol from the library fails to load. Derelict also allows the throwing of SymbolLoadExceptions to be skipped so that if specific symbols are missing, loading can continue. This is useful for loading older versions of a library. Importing derelict.util.exception will pull all DerelictExceptions into the namespace so that they can be handled appropriately. Also, each loader has a version of the load method that allows shared library names to be specified explicitly. By default, the loaders use the common library names as output by each distribution's build system (such as SDL2.dll on Windows) and uses the default system search path to find them. Sometimes it may be desirable to override this behavior, such as when shipping all shared library dependencies in a subdirectory (DerelictSDL2.load(”libs/SDL2.dll”)).

Notice that there are no corresponding calls to any “unload” methods. This is because Derelict makes use of D’s static module destructor feature to automatically unload libraries when the app exits. Static module constructors could have been used to load them as well, but that would take away the opportunity to handle exceptions, or to specify alternate library names. The unload methods do exist and are publicly accessible in case a library needs to be explicitly unloaded at a particular time, but they are not required to be called otherwise.

When compiling the above manually, the libraries DerelictSDL2.lib and DerelictUtil.lib need to be linked. There is no DerelictSDL2Image.lib. The loaders for SDL2_image, SDL2_ttf, SDL2_mixer, and SDL2_net are all part of DerelictSDL2.lib. The format of the file names and the file extensions depend on the platform and compiler. On Posix systems, it is also necessary to link with libdl, since Derelict uses dlopen and friends to handle the shared libraries. When compiling with dub, libdl will still need to be linked on Posix systems. This can be done with a “libs-posix” entry in the package config file (see the dub package documentation for details).

All Derelict packages adhere to this basic format, with the one exception being DerelictGL3. There are two OpenGL loaders in Derelict, one that does not include deprecated functions and one that does. For the former, import derelict.opengl3.gl3 and call DerelictGL3.load. For the latter, import derelict.opengl3.gl and call DerelictGL.load. In both cases, after an appropriate OpenGL context has been created, a reload method must be called (DerelictGL3.reload and DerelictGL.reload respectively). The load methods load the OpenGL 1.0 and 1.1 functions. The reload methods load the functions for versions 1.2+, as well as all supported ARB extensions (support for all ARB extensions may not yet be implemented). It is also recommended to call the reload method each time the context is switched. Both loaders also provide a means of determining the version of OpenGL that actually loaded.

Derelict 3 provides bindings for OpenGL, OpenAL, SDL2, SFML2, GLFW3, AssImp3, Lua 5.2, FreeType 2.4 and more. At the time of writing, the project README claims it’s all in an alpha state, but it’s 100% usable. The only thing missing is the documentation and a couple of more packages I want to add. Of course, one thing I haven’t mentioned here is that, when using Derelict, it’s necessary for the user to get possession of the appropriate shared libraries, otherwise there's nothing to load. Most projects provide binary distributions, some do not. In the latter case, it is necessary to have a build environment set up to compile C binaries. I try to keep the bindings up to date for each project, but sometimes I fall behind. I’m always happy to accept pull requests to correct that.

Conclusion


This has been a brief introduction on one way to get ready to compile programs with the D Programming Language, with an eye toward would-be D game developers. I hope the information here is useful enough to help those new to the language in getting started. As for how to actually program in D, I’ll leave that to Andrei Alexandrescu to explain in his excellent book, The D Programming Language, and to the helpful souls in the D Community, which includes the D Newsgroups and #D on freenode.net.

For those who want to go beyond the bounds of what I’ve described here, I encourage a look at VisualD, MonoD, and DDT for Eclipse to experiment with D IDEs, and of course GDC and LDC to get a feel for alternative compilers. Anyone using Derelict is encouraged to join the Derelict forums to ask for help and to report bugs and other issues on the project page at github.

I find D an enjoyable programming language to use and I look forward to seeing more new faces join the ranks of our ever-growing community.

Intelligent 2D Collision and Pixel Perfect Precision

Please feel free to comment and tell me if I missed something or if there's anything wrong! Feedback is really appreciated!

When you're making a 2D game where collision is an important factor, the more precise your logic is, the better! But perfect collision in 2D games often involves pixel perfect collision, which generates an overhead. This type of collision, at least in some implementations, uses a secondary 1-bit bitmap (mask) where you have only black and white, or a boolean matrix. You take the origin of the two sprites, get their masks and check for overlapping white pixels (colliding "trues", if using the matrix); if there are any, the objects are in fact colliding.

When you have 100 small-sized objects, it's OK to check every one against every other. But when this changes to multiple hundreds of relatively high-resolution sprites, the overhead of the collision calculations will affect the game. What to do now? There are lots of ways to reduce this overhead, and we'll discuss some of them here today.

For the article, I'll be using the following situation:


Attached Image: 2hqqdj6.png


Here, we have 32 different objects (the topmost platform is formed by 2 rectangles), and the ground rectangles are way too big to check pixel perfect collision, it'll surely be a problem.

Bounding Box

Probably the most basic collision test is the bounding box check.


Attached Image: b843u8.png


The bounding box check is both the easiest and the most efficient way we have to reduce overhead on pixel-perfect collision games! Before you check if two sprites are colliding you check if they are near enough to have any chance of collision. If their bounding boxes aren't touching, there's no reason to check for pixel collision, since all pixels are necessarily inside their bounding boxes.

Verifying whether one bounding box is touching another is really simple. We have 2 objects, OB1 and OB2, each with 4 coordinates: OB1top, OB1bot, OB1left, OB1right, and the same for OB2.


Attached Image: n6sx2r.png


Where
  • OB1top is the y coordinate of point A;
  • OB1bot is the y coordinate of point B;
  • OB1left is the x coordinate of point A;
  • OB1right is the x coordinate of point B.
Now you create something similar to this:

bool colliding (Object OB1, Object OB2){

	// Check the collision Vertically
	if (OB1bot>OB2top) return false; /* this means that OB1 is above OB2,
					far enough to guarantee not to be touching */
	if (OB2bot>OB1top) return false; /* this means that OB2 is above OB1 */

	// Check the collision Horizontally
	if (OB1left>OB2right) return false; /* this means that OB1 is to the right of OB2 */
	if (OB2left>OB1right) return false; /* this means that OB2 is to the right of OB1 */

	return true; /* this means that no object is way above the other
			nor to the right of the other, meaning that the
			bounding boxes are, in fact, overlapping. */
}

Just by doing this simple check, you'll reduce the overhead of your pixel perfect collision enormously, without reducing its precision. A major advantage of this approach is that it's independent of the size of the objects! While pixel perfect collision depends a lot on the size of the objects being tested, this check takes constant time.

Now let's think in numbers:
If we collide every object against every other, the number of collision checks will be nothing less than:
32 objects X 31 other objects / 2 since we will check each pair only once = 32*31/2 = 496 collision checks!

Now imagine if instead of 6 soldiers, we had 12, and all of them shot twice as many projectiles, for a total of 88 projectiles:
The number of objects would rise to 104, but the number of checks would rise to no less than 5356! While we went from 32 objects to 104 (3.25 times as many), the number of collision tests rose to more than 10 times what it was before (~10.8 times)! As you can see, it's rather impractical.
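
To make the brute-force approach concrete, here is a minimal sketch (hypothetical names, building on the colliding() function above) of the all-pairs loop those numbers describe; it performs exactly n * (n - 1) / 2 checks:

// Test every pair of objects exactly once: the classic O(n^2) approach.
void bruteForceCollisions (std::vector<Object*> &objects){
	for (size_t i = 0; i < objects.size(); i++)
		for (size_t j = i + 1; j < objects.size(); j++)
			if ( colliding(*objects[i], *objects[j]) )
				handleCollision(*objects[i], *objects[j]); // hypothetical handler
}
// With 32 objects this is 32*31/2 = 496 checks; with 104 it is 104*103/2 = 5356.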

How can we further increase our collision performance if we have already removed almost all pixel level collisions? The answer is still simple - we have reduced the time each collision takes to be calculated, now we have to reduce the total number of collision checks! It may look weird, but to reduce collisions you'll need to create new special collisions!

You can chop your playable area, section it up, so you will check collisions only between objects in the same sections!

Sector (Grid) Collision Check

Using both a grid and bounding boxes, you'll have this:


Attached Image: 1zowxus.png


As can be seen here, there are four objects that each span 2 sections (2 bullets and 2 ground tiles!), and if we had an object at the exact point where the lines cross each other, that object would be in all 4 sections. When this happens, you have to check the collision of these objects against every object in each section that any part of them belongs to.

But how do I know if an object is in a section? Do a bounding box collision check against the sections! If the bounding box check returns true, the object is in fact in that section. In this case, we would have something like this:

//All of this is in pseudo-code

//It's really game-specific

std::vector<Object*> allObjectsList; //Contains all active objects

std::vector<Object*> sectionList[4]; /*Contains all objects within the sections*/

Object sections[4];

void insert (Object* object, std::vector<Object*> &list){
	//insert the object into the given section list
	list.push_back(object);
}

void flush (){
	//remove all objects from sectionList[0, 1, 2 and 3]
	for (int i = 0; i < 4; i++)
		sectionList[i].clear();
}

void sectionizeObjects (){
	flush();
	for (size_t obj = 0; obj < allObjectsList.size(); obj++){
		for (int i = 0; i < 4; i++)
			if( checkBBoxCollision(*allObjectsList[obj], sections[i]) )
				insert (allObjectsList[obj], sectionList[i]);
	}
}

Ta-da! Our collision system is finally optimized and ready to grow in scale! (relatively)

What now?


We have our pixel perfect collision world set up, working and really optimized in terms of performance compared to the initial situation. How can we further optimize this collision system? Now we are entering the zone of game-specific intelligent collision! We have lots of ways of further optimizing this environment, so let's analyze our previous situations!

Bounding Box Re-introduced

When we made that bounding box collision checker function we simply checked for:

  1. Vertical collision Possibility
  2. Horizontal collision Possibility

but let's look at our situation again:


Attached Image: 2hqqdj6.png


As you can see, lots of the objects have approximately the same Y but different X coordinates, and the ones with a different Y have a different X as well... This covers most of the cases in our little game here! So the majority of the collision tests will pass the first two (vertical) tests inconclusively and go on to the last 2 (the horizontal check), where the collision will finally be rejected. In other words, we are making lots of inconclusive vertical tests, while the most conclusive tests here are the horizontal ones...

If we simply swap the positions of both code parts, we will cut the number of comparisons to almost half of our original configuration. The result:

bool colliding (Object OB1, Object OB2){

	// Check the collision Horizontally
	/* checking horizontally first will make sure most of the calls return false
	   after one or two tests, instead of four as before */
	if (OB1left>OB2right) return false;
	if (OB2left>OB1right) return false;

	// Check the collision Vertically
	if (OB1bot>OB2top) return false;
	if (OB2bot>OB1top) return false;

	return true;
}

Now that we have optimized our bounding box algorithm logic, we have to move further to more advanced stuff.

Sector (Grid) Collision Check

If you pay attention to our grid, it has only 4 sections, and the sections will have lots of objects in them. What if we increased the number of sections to... let's say... 10? What if, instead of dividing the scene into 4 cartesian-like quarters, I divided it into horizontal sections? Well, here's the result of that:


Attached Image: 206blhf.png


Note: This horizontal grid makes testing vertical collisions unnecessary. If it was vertical, it would make horizontal collision checks unnecessary. You may want to make an optimized function when using these, if that's the case.
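
As a rough sketch of such an optimized function (reusing the same OB1/OB2 coordinate names as before), a check used only between objects that already share a horizontal section could drop the vertical tests entirely:

bool collidingInHorizontalSection (Object OB1, Object OB2){
	// Vertical overlap is implied by the horizontal (strip-shaped) sections,
	// so only the horizontal tests remain.
	if (OB1left>OB2right) return false;
	if (OB2left>OB1right) return false;
	return true;
}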

Changing your grid (how many sections there are, the sections' shapes, and so on) will greatly influence the number of collision checks you'll make.

The smallest size of a section in this case should be the size of a soldier. If the grid was smaller than a soldier, there'd be soldiers in 6 different sections, bullets in 4 sections, and the number of collision tests would increase instead of diminish.

So, how do I choose a size and shape for my grid?

The only way I know is: Think and Try. Create a variable that stores the total number of bounding box collision checks made and take note of your grid and this number until you find an optimal grid configuration for your game, one that minimizes the number of calls.
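
A minimal sketch of that kind of instrumentation (with a hypothetical global counter, wrapping the colliding() function from before) could look like this:

int g_bboxChecks = 0; // reset to 0 at the start of every frame

bool countedColliding (Object OB1, Object OB2){
	g_bboxChecks++; // one more bounding box test this frame
	return colliding(OB1, OB2);
}

// At the end of the frame, log g_bboxChecks together with the current grid
// configuration and compare the totals until you find the sweet spot.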

I've already used rectangular sections just like the first example and I've used vertical/horizontal sections as well. I've even used overlapping circular sections (which worked really well since all my objects were circular) - it really depends on your game.

Another way to optimize this is to change your grid implementation to something better.
Some implementations of collision sections use R-trees, quad trees, red-black trees and other kinds of segregation.
I will not enter this realm here since it's probably worth a full post of its own! But I'll leave some links at the bottom!

If you're interested try searching the net!

Intelligent Collision segregation


Now it gets a bit more complex and harder to illustrate with a single simple situation like our little war zone here. How can we make our collision check smarter? Let's say that in our game the bullets will pass through each other; they won't be destroyed when they hit other bullets. This way, we can go ahead and remove all bullet vs bullet collision checks! The same can be done with allied soldiers: if in our game they block each other's way, we have to do those collisions, but if allied soldiers can run through each other, there's no need to collide them. The same goes for allied soldiers vs allied bullets: if our game has no Friendly Fire, why calculate these collisions?
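
One possible way to express those rules (a sketch with hypothetical type and team fields on Object) is a cheap filter that runs before any bounding box or pixel test:

enum ObjectType { SOLDIER, BULLET, TERRAIN };

// Returns false for pairs we decided never need a collision test:
// bullet vs bullet, and (with no Friendly Fire) same-team soldier vs bullet.
bool shouldTestPair (Object &a, Object &b){
	if (a.type == BULLET && b.type == BULLET) return false;
	if (a.team == b.team &&
	   ((a.type == SOLDIER && b.type == BULLET) ||
	    (a.type == BULLET && b.type == SOLDIER))) return false;
	return true;
}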

As you can see here, this part is the one that depends the most on your game.

Good Sources


http://www.gamedev.n...-detection-r735
http://www.flipcode....ersection.shtml
http://www.metanetso.../tutorialA.html
http://go.colorize.n..._xna/index.html
http://www.fourtwo.s...exing-in-a-grid
http://gamedev.stack...gic-into-action
http://stackoverflow...ision-detection
http://stackoverflow...for-a-game-in-c
http://www.gamasutra...s_for_games.php
http://www.wildbunny...on-for-dummies/

soldier sprite taken from the flying yogi's SpriteLib (here)

Appendix A: Circular Collision Detection


To calculate if two circles are colliding, you need to check if the distance between their centers is less than the sum of their radii. Some games treat the collision between entities as simple circle collision. This way, the entities need only a 2D position and a radius.

bool colliding (Object OB1, Object OB2){
	// compare the squared distance between centers with the squared sum of the radii
	// (squared(v) is just v*v; comparing squares avoids a square root)
	if ( squared(OB1.x-OB2.x) + squared(OB1.y-OB2.y) < squared(OB1.Radius+OB2.Radius)) return true;
	return false;
}

GDC 2013: Interview with Marc Singer

If you were making video games, what role would you play? Design3 got a chance to chat with Marc Singer of Nival at GDC 2013, where he shared his favorite part about being a producer, the challenges of international publishing and some details on the upcoming MOBA/SRPG, Prime World.


Watch more interviews and game dev tutorials FREE at http://www.design3.com


Efficient Normal Computations for Terrain Lighting in DirectX 10

In 2005 I spent a good amount of time answering questions in the “For Beginners” forums of GameDev.net. One question which I frequently saw was how to compute the vertex normals required for various diffuse lighting models used in games. After answering the question three or four times, I aimed to write an introductory article which would explain the most commonly used algorithms for computing surface and vertex normals to date and to compare and contrast their performance and rendering quality. However, times and technology have changed.

At the time of my first article, Real-Time Shaders were just becoming the de-facto standard method for implementing graphics amongst video game developers, and the lack of texture fetches for Vertex Shaders and the limited instruction count prevented people from performing many of the operations they would have liked on a per-frame basis directly on the GPU. However, with the introduction of DirectX 10, Geometry Shaders, Streamed Output, and Shader Model 4.0 it’s now possible to move the work-load of transforming and lighting our scenes entirely to the GPU.

In this article, we will be utilizing the new uniform interface for texture fetching and shader resources of DirectX 10, along with the new Geometry Shader Stage to allow us to efficiently compute per-frame, per vertex normals for dynamic terrain such as rolling ocean waves or evolving meshes - entirely on the GPU. To make sure we cover our bases will we be computing the normals for terrain lighting using two distinct algorithms, each of which addresses a specific type of terrain. It is my hope that by the end of this article you will have discovered some efficient and exciting ways to take advantage of current-generation hardware to compute the surface and vertex normals necessary for the complex lighting algorithms which will be popping up with the new generation of rendering hardware.

Note:  
This article was originally published to GameDev.net in 2005. It was revised by the original author in 2008 and published in the book Advanced Game Programming: A GameDev.net Collection, which is one of 4 books collecting both popular GameDev.net articles and new original content in print format.


Heightmap Based Terrain


Depending upon the genre of game you're making, players may spend a significant amount of time looking at the terrain or ocean waves passing them by. With this in mind, it is desirable to have a terrain which is both realistic and attractive to look at. Among the simplest methods of creating attractive terrain are those based on heightmaps.

A heightmap is a one or two dimensional array of data which represents the height of a piece of terrain at a specific (x, z) point in space. In other words, if you were to look at the [x][z] offset within a 2D array or compute the associated index into a one-dimensional array, the value at that location in memory would be the height at point X, Z in 3D space.

The value in memory can be stored as either a floating point value or an integer, and in the case of an integer the values are often stored as either a 1 byte (8 bits) or 4 byte (32 bits) block of memory. When the height values are stored as 8-bit integers the heightmap can easily be saved and loaded to disk in the form of a grayscale image. This makes it possible to use a standard image editing application to modify the terrain offline. Figure 1 shows an example grayscale Heightmap.
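
As a small illustration (not taken from the demo), reading a height out of a one-dimensional array stored row by row might look like this, where numVertsWide is the width of the heightmap:

#include <vector>

// Height of the terrain at grid coordinates (x, z), assuming row-major storage.
float GetHeight(const std::vector<float>& heightmap, int x, int z, int numVertsWide)
{
  return heightmap[z * numVertsWide + x];
}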


Attached Image: AdvGameProg_EfficientNormal_Walsh_1.jpg
Figure 1: An example heightmap taken from Wikipedia


Fortunately for us, technology has advanced a great deal in recent years, and floating point buffers and textures are now frequently used. For the purpose of this article, we will use 32 bit, single precision Floating Point values to represent height.

When working with static terrain, or when it’s necessary to perform collision detection, the values from the heightmap can be read from memory and then assigned to a grid-shaped mesh that contains the same number of rows and columns of vertices as the dimension of the heightmap. Once this is done, the newly generated mesh can be triangulated and passed to the renderer for drawing or can be used for picking and collision detection. This “field of vertices” which is used for rendering and collision is called the heightfield. You can see a 3D heightfield representation of the heightmap used above in Figure 2.


Attached Image: AdvGameProg_EfficientNormal_Walsh_2.jpg
Figure 2: An example 3D heightfield taken from Wikipedia


Because the distance between pixels in a heightmap is uniformly treated as one, it is common to generate a heightfield with similarly distributed vertices. However, forcing your vertices to exist in increments of one can cause the terrain to seem unnatural. For this reason, a horizontal scale factor, sometimes called "Scale" or "Units per Vertex", is added to allow your vertices to be spaced at distances greater or smaller than 1.0.
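
For example (a simple sketch, not demo code), applying a "Units per Vertex" scale is just a multiplication when converting grid coordinates into world-space positions:

// unitsPerVertex = 1.0f reproduces the raw one-unit pixel spacing;
// larger or smaller values stretch or compress the terrain horizontally.
void GridToWorld(int x, int z, float unitsPerVertex, float& worldX, float& worldZ)
{
  worldX = x * unitsPerVertex;
  worldZ = z * unitsPerVertex;
}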

When it is not necessary for actors to collide against the terrain, or when it’s possible for the collisions to be computed on the GPU, it is more common to pass the heightmap directly to the GPU along with a flat mesh. In this case, the scaling and displacement of the vertices is performed by the GPU and is referred to as Displacement Mapping. This is the method we’ll use in this article.

Slope Method of Computing Heightfield Normals


Because heightfields are generated from heightmaps, the vertices are always evenly spaced and are never overlapping (resulting in the y-component of our normal always facing "up"). This makes it possible to break our 3-dimensional heightfield into two 2-dimensional coordinate systems, one in the XY plane and one in the ZY plane. We can then use the simple and well-known phrase "rise over run" from elementary geometry to compute the x and z components of the normal from each of our coordinate systems, while leaving the y-component at one. Consider the line shown in Figure 3.


Attached Image: AdvGameProg_EfficientNormal_Walsh_3.jpg
Figure 3: A simple 2D line


In figure 3 you can see that the slope for the line segment is 2. If you assume for a moment that the line segment represents a 3D triangle laying on its side, and that the front face of the triangle points “up”, then the surface normal for such a triangle (in the x-direction) can be determined by finding the negative reciprocal of the slope. In this case, the negative reciprocal is -(1/2). At the beginning of this explanation I made a point of indicating that we can express our 3D heightfield as a pair of 2D coordinate systems because the Y component of our normal always points up. That implies that we want to keep our y-component positive. So the slope for our normal is better expressed as 1 / -2. Note that this means our dy is 1 and dx is -2, and that if we use those values as the x and y components of a 2D normal vector we get the vector (-2, 1). Once normalized, that would indeed represent a vector which is normal to the triangle lying on its side in the XY plane.

In the discussion of heightfields we also noted that the distance between pixels is always 1, and consequently, the distance between vertices in the heightfield (before scaling) is also one. This further simplifies our computation of normals because it means that the denominator of our expression "rise / run" is always one, and that our x-component can be computed simply by subtracting the y-components (the rise) of the two points which make up our line segment. Take the example segment above: subtract the second height (3) from the first height (1) and you get -2.
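
Putting that together, a tiny sketch of the per-segment computation (a hypothetical helper, not demo code, with the run fixed at one) looks like this:

// 2D normal of the segment between two adjacent height samples one unit apart.
// The x-component is simply the negated rise (h0 - h1) and y stays at 1 so the
// normal points "up"; e.g. heights 1 and 3 give (-2, 1), matching the example.
void SegmentNormal(float h0, float h1, float& nx, float& ny)
{
  nx = h0 - h1;
  ny = 1.0f;
}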

So now we have a fast and efficient method of computing the normal for a line segment, the 2D equivalent of a surface normal, but we still need to take one more step to compute the normal for each vertex. Consider the picture in Figure 4.


Attached Image: AdvGameProg_EfficientNormal_Walsh_4.jpg
Figure 4: A series of 2D line segments


Here you can see 3 vertices, each separated by a line segment. Visually, the normal at point 0 would just be the up vector, the normal at point 2 would be the same as the line segment which we computed previously, and point 1 would be half-way in between. From this observation we can generalize an algorithm for computing the component-normal for a point in a 2D coordinate system.


“The vertex normal in a 2D coordinate system is the average of the normals of the attached line segments.”


Or, when expressed using standard equation speak:


ComponentNormal = Σ (lineNormals) / N; where N is the number of normals


Up until this point I’ve attempted to consistently use the term “component normal” to remind you that what we’ve been computing so far is simply the X-component of the normal in our 3D heightfield at any given vertex. Fortunately for us, computing the Z component is exactly the same. That is, we can compute the dy in the z-direction to get the z-component of the normal, just like we computed dy in the x-direction to get the x-component. When we combine the two equations, we get the following:


Normal.x = Σ(x-segments) / Nx;
Normal.y = 1.0
Normal.z = Σ(z-segments) / Nz;


The above algorithm can be shown more effectively using a visual aid.


Attached Image: AdvGameProg_EfficientNormal_Walsh_5.jpg
Figure 5: An overhead view of a heightfield


If you take the example shown in Figure 5 you’ll see the algorithm can be filled in to get the following equations:


Normal.x = [(A-P) + (P-B)] / 2.0
Normal.y = 1.0
Normal.z = [(C-P) + (P-D)] / 2.0


During implementation, the algorithm becomes a little more complicated because you must:
  • Find the indices of your points in the height data
  • Handle edge cases in which there are fewer points to sample
Rather than describe the special cases here, let’s look at them in the context of an implementation in DirectX 10 using Vertex Shaders.

Implementing the Algorithm with DirectX 10


In the explanation that follows I will try to address the most relevant components of implementing the above algorithm within a DirectX 10 application. However, I will be using the source code from the associated demo program, which you may examine for a more complete listing and an idea of how it may fit into your own games.

Before we do anything else we need to define our custom vertex format. For our 3D heightfield we’re only going to need two floating point values, x and z. This is because the normal values, along with the y-component of our position, will be pulled from the heightmap and procedurally computed within our Vertex Shader. When initializing the x and z components within our program, we are going to set them to simple integer increments, 0, 1, 2… We do this so we can compute an index into our heightmap in order to determine the height at any given vertex.

struct FilterVertex	// 8 Bytes per Vertex
{
  float x, z; 
};

The index into a one dimensional heightmap can be computed with the following equation, where numVertsWide is how many pixels your heightmap has in the x dimension:

index = z * numVertsWide + x

DirectX 10 differs from DirectX 9 and is particularly suited to this type of problem because, unlike DirectX 9, it provides a uniform interface for each stage in the graphics pipeline. This allows you to create buffers, textures, and constants which can be accessed the same way across all stages. For our particular purpose we're going to need a buffer to store our heights in. In DirectX 9, with Shader Model 3, we could have done this by stuffing our heights into a texture and then accessing it using the SM 3.0 vertex texture fetching operations. However, with DirectX 10 it's even easier. We can define a buffer which will store our heights and then bind it to the graphics pipeline as a Shader Resource. Once we do this, we access the buffer as though it were any other global variable within our shaders. To make this possible we need to define three different fields:

ID3D10Buffer*			m_pHeightBuffer;
ID3D10ShaderResourceView* 	m_pHeightBufferRV;
ID3D10EffectShaderResourceVariable* m_pHeightsRV;

First, we’ll need the buffer itself. This is what ultimately contains our float values and what we’ll be updating each frame to contain our new heights. Next, all resources which derive from ID3D10Resource (which includes textures and buffers) require an associated Resource View which tells the shader how to fetch data from the resource. While we fill our buffer with data, it is the Resource View which will be passed to our HLSL Effect. Finally, we’re going to need an Effect Variable. In DirectX 10, all effect fields can be bound to one of several variable types. Shader Resources such as generic buffers and textures use the Shader Resource Variable type.

While we won’t demonstrate it here, you will need to define your vertex and index buffers, and fill them with the corresponding values. Once you’ve done that you’ll want to create the shader resources we previously discussed. To do this you create an instance of the D3D10_BUFFER_DESC and D3D10_SHADER_RESOURCE_VIEW_DESC structures and fill them in using the following code:

void Heightfield::CreateShaderResources(int numSurfaces)
{
  // Create the non-streamed Shader Resources
  D3D10_BUFFER_DESC desc;
  D3D10_SHADER_RESOURCE_VIEW_DESC SRVDesc;
  // Create the height buffer for the filter method
  ZeroMemory(&desc, sizeof(D3D10_BUFFER_DESC));
  ZeroMemory(&SRVDesc, sizeof(SRVDesc));
  desc.ByteWidth = m_NumVertsDeep * m_NumVertsWide * sizeof(float);
  desc.Usage                  = D3D10_USAGE_DYNAMIC;
  desc.BindFlags              = D3D10_BIND_SHADER_RESOURCE;
  desc.CPUAccessFlags         = D3D10_CPU_ACCESS_WRITE;
  SRVDesc.Format              = DXGI_FORMAT_R32_FLOAT;
  SRVDesc.ViewDimension       = D3D10_SRV_DIMENSION_BUFFER;
  SRVDesc.Buffer.ElementWidth = m_NumVertsDeep * m_NumVertsWide;
  m_pDevice->CreateBuffer(&desc, NULL, &m_pHeightBuffer);
  m_pDevice->CreateShaderResourceView(m_pHeightBuffer, &SRVDesc, 
    &m_pHeightBufferRV);
}

For our filter normal algorithm, we're going to be writing to the heightmap buffer once every frame, so we'll want to specify it as a writeable, dynamic resource. We also want to make sure it's bound as a shader resource and has a format which supports our 32 bit floating point values. Finally, as seen in the code listing, we need to specify the number of elements in our buffer. Unfortunately, the field we use to do this is horribly misnamed, and the documentation describes it as containing a value which it should not. The field we're looking for is ElementWidth. The documentation says it should contain the size of an element in bytes; however, this is incorrect. This field should contain the total number of elements. Don't be fooled.

After we’ve created our index, vertex, and height buffers and filled them in with the correct values, we’ll need to draw our heightfield. But, before we pass our buffers off to the GPU we need to make sure to set the relevant properties for the different stages and set our buffers. So let’s examine our draw call a little bit at a time. First, we’ll define a few local variables to make the rest of the method cleaner.

void Heightfield::Draw()
{
  // Init some locals
  int numRows = m_NumVertsDeep - 1;
  int numIndices = 2 * m_NumVertsWide;
  UINT offset = 0;
  UINT stride = sizeof(FilterVertex);

Next, we need to tell the vertex shader the dimensions of our heightfield, so that it can determine whether a vertex lies on an edge, a corner, or in the middle of the heightfield. This is important as the number of line segments included in our algorithm is dependent upon where the current vertex lies in the heightfield.

  m_pNumVertsDeep->SetInt(m_NumVertsDeep);
  m_pNumVertsWide->SetInt(m_NumVertsWide);
  m_pMetersPerVertex->SetFloat(m_MetersPerVertex);

Next, we’re going to follow the usual procedure of specifying the topology, index buffer, input layout, and vertex buffer for our terrain.

  m_pDevice->IASetPrimitiveTopology (D3D10_PRIMITIVE_TOPOLOGY_TRIANGLESTRIP);
  m_pDevice->IASetIndexBuffer(m_pIndexBuffer,DXGI_FORMAT_R32_UINT,0);
  m_pDevice->IASetInputLayout(m_pHeightfieldIL);
  m_pDevice->IASetVertexBuffers(0, 1, &m_pHeightfieldVB, &stride, &offset);

Next, and this is perhaps the most important step, we're going to bind the resource view of our buffer to the previously established effect variable. Once we've done this, any references to the buffer inside of the HLSL will be accessing the data we've provided within our heightmap buffer.

  m_pHeightsRV->SetResource(m_pHeightBufferRV);

Finally, we're going to apply the first pass of our technique in order to set the required stages and the vertex and pixel shaders, and then we're going to draw our terrain.

  m_pFilterSimpleTech->GetPassByIndex(0)->Apply(0);
  for (int j = 0; j < numRows; j++)
    m_pDevice->DrawIndexed( numIndices, j * numIndices, 0 ); 
}

In my demo I treated the terrain as a series of rows, where each row contains a triangle strip. Prior to DirectX 10 it might have been better to render the entire heightfield as a triangle list in order to reduce the number of draw calls. However, in DirectX 10 the performance penalty for calling draw has been amortized and is significantly less. As a result, I chose to use triangle strips to reduce the number of vertices being sent to the GPU. Feel free to implement the underlying topology of your heightfield however you like.

Now that we’ve taken care of the C++ code let’s move on to the HLSL. The first thing we’re going to do is define a helper function which we will use to obtain the height values from our buffer. The Load function on the templated Buffer class can be used to access the values in any buffer. It takes an int2 as the parameter where the first integer is the index into the buffer, and the second is the sample level. This should just be 0, as it’s unlikely your buffer has samples. As I mentioned previously, DirectX 10 provides a uniform interface for many types of related resources. The second parameter of the Load method is more predominately used for Texture objects which may actually have samples.

float Height(int index)
{
  return g_Heights.Load(int2(index, 0));	
}

Now that we've got our helper function in place, let's implement our filter method. In this method we declare a normal with the y-component facing up, and then we set the x and z values to be the average of the computed segment normals, depending on whether the current vertex is on the bottom, middle, or top row, and whether it's on the left, center, or right hand side of the terrain. Finally, we normalize the vector and return it to the calling method.

float3 FilterNormal( float2 pos, int index )
{
  float3 normal = float3(0, 1, 0);
  if(pos.y == 0)
    normal.z = Height(index) - Height(index + g_NumVertsWide);
  else if(pos.y == g_NumVertsDeep - 1)
    normal.z = Height(index - g_NumVertsWide) - Height(index);
  else
    normal.z = ((Height(index) - Height(index + g_NumVertsWide)) +
          (Height(index - g_NumVertsWide) - Height(index))) * 0.5;
  if(pos.x == 0)
    normal.x = Height(index) - Height(index + 1);
  else if(pos.x == g_NumVertsWide - 1)
    normal.x = Height(index - 1) - Height(index);
  else
    normal.x = ((Height(index) - Height(index + 1)) +
          (Height(index - 1) - Height(index))) * 0.5;
	return normalize(normal);
}

For each vertex we’re going to execute the following vertex shader:

VS_OUTPUT FilterHeightfieldVS( float2 vPos : POSITION )
{
  VS_OUTPUT Output = (VS_OUTPUT)0;
  float4 position	 = 1.0f;
  position.xz	 = vPos * g_MetersPerVertex;

First, we take the 2D position which was passed in and compute the index into our buffer. Once we've got the index, we use the Load method on the buffer object to obtain the height at the current position, and use that to set the y-component of our vertex's position.

  // Pull the height from the buffer
  int index		 = (vPos.y * g_NumVertsWide) + vPos.x;
  position.y 	 = g_Heights.Load(int2(index, 0)) * g_MetersPerVertex;
  Output.Position = mul(position, g_ViewProjectionMatrix);


Next, we pass the 2D position and the index into the previously shown filter method. We pass the index into the function in order to prevent having to compute it again, and we pass in the position as the x and z values are used to determine whether the vertex lies on the edge of the terrain.

  // Compute the normal using a filter kernel
  float3 vNormalWorldSpace = FilterNormal(vPos, index);
  // Compute simple directional lighting equation
  float3 vTotalLightDiffuse = 	g_LightDiffuse *
    max(0,dot(vNormalWorldSpace, g_LightDir));
  Output.Diffuse.rgb = g_MaterialDiffuseColor * vTotalLightDiffuse;
  Output.Diffuse.a  = 1.0f;
  return Output;
}

As an additional note about this implementation, with DirectX 10 and the introduction of the Geometry Shader it’s now possible to generate geometry directly on the GPU. If someone were interested they could completely avoid the need to pass a mesh to the GPU and instead generate it, along with the normals inside of the geometry shader. However, as most games are not bus-bound, there would be no noticeable performance benefit as simply creating a static vertex buffer and passing it to the GPU each frame requires little overhead by the CPU.

Mesh-Based Terrain


While the previous algorithm works effectively for heightmap based terrain, it’s unacceptable for mesh-based terrain. In this case, caves, chasms, overhangs, and waves prevent us from performing any type of optimizations because we can make no guarantees about the direction of the y-component. For this reason it becomes necessary to compute the full normal. The following algorithm is a fast and efficient method for computing cross-products for mesh-based terrain entirely on the GPU, so long as certain assumptions and constraints are made.

Grid-Mesh Smooth Shading Algorithm


I refer to this algorithm as the Grid-Mesh Smooth shading algorithm because it is a combination of two principles. The first principle, which results in the “Smooth Shading” part of the name, was set forth by Henri Gouraud. Gouraud suggested that if you were to compute the normals of each of the facets (surfaces) of a polyhedron then you could get relatively smooth shading by taking all of the facets which are "attached" to a single vertex and averaging the surface normals.

Gouraud Shading is thus a two stage algorithm for computing the normal at a vertex. The first stage is to compute the surface normals of each of the triangles in the heightfield using cross-product calculations. The second stage is to sum the surface normals attached to any given vertex and then normalize the result. The following two equations are the mathematical definitions for the surface and vertex normals.


Surface Normal: Ns = A × B
Vertex Normal: N = Norm( Σ (Nsi) )
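
As a quick illustration of just these two equations (not the demo's GPU path), a CPU-side sketch using the D3DX math helpers might look like this; the winding order chosen for the cross product matches the geometry shader shown later:

#include <vector>
#include <d3dx10math.h>

// Stage 1: surface normal of a single triangle (facet).
D3DXVECTOR3 SurfaceNormal(const D3DXVECTOR3& p0, const D3DXVECTOR3& p1, const D3DXVECTOR3& p2)
{
  D3DXVECTOR3 edge1 = p1 - p0;
  D3DXVECTOR3 edge2 = p2 - p0;
  D3DXVECTOR3 n;
  D3DXVec3Cross(&n, &edge2, &edge1); // winding order decides which side the normal faces
  D3DXVec3Normalize(&n, &n);
  return n;
}

// Stage 2: vertex normal = normalized sum of the attached surface normals.
D3DXVECTOR3 VertexNormal(const std::vector<D3DXVECTOR3>& attached)
{
  D3DXVECTOR3 sum(0.0f, 0.0f, 0.0f);
  for (size_t i = 0; i < attached.size(); ++i)
    sum += attached[i];
  D3DXVec3Normalize(&sum, &sum);
  return sum;
}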


While I won’t demonstrate all the methods here (though they are included in the associated demo), there are actually three different ways for computing the vertex normals which distribute the workload differently. In order of efficiency they are:

  1. Compute both the surface and vertex normals on the CPU
  2. Compute the surface normals on the CPU and the vertex normals on the GPU
  3. Compute both the surface and vertex normals on the GPU

Until DirectX 10, only the first two options were available, and they were still expensive, as computing the cross-products for a large number of triangles on the CPU each frame can become unreasonable. With DirectX 10 and Geometry Shaders it is now possible to compute both the surface and vertex normals entirely on the GPU. This method is significantly faster than computing the surface normals on the CPU and brings the performance of complex, mesh-based terrain much closer to that of the methods used for heightmap based terrain.

The Geometry Shader Stage is unique in its functionality in that, unlike the Vertex Shader Stage which receives a single vertex at a time, or the Pixel Shader Stage which receives a single pixel at a time, the Geometry Shader Stage can receive a single primitive at a time. With a single triangle we can compute the cross-product using each of the three points of the triangle and can then stream the surface normal back out of the graphics pipeline to be used in a second pass.

With all of the above said, having both the current vertex position and the surface normals only solves half the problem. Without a method of determining which surfaces are attached to the current vertex, there's no way to determine which of the surfaces in the buffer should be summed and normalized. This leads us to the "Grid-Mesh" portion of the algorithm.

While this algorithm is intended to work with irregular meshes that may contain overhangs, vertical triangles, caves, etc., one thing must remain consistent: in order for us to quickly and predictably determine the surfaces attached to any given vertex, the mesh must have a fixed and well-understood topology. One method of ensuring this is to derive our terrain mesh from a grid. Every grid, even one containing extruded, scaled, or otherwise manipulated triangles, has a unique and predictable topology. Specifically, each vertex in a grid-based mesh has between one and six attached surfaces depending on the orientation of the triangles and the position of the vertex within the mesh. Consider the diagram of a grid-based mesh in Figure 6, which shows a few different cases and the related surface normals for each case.


Attached Image: AdvGameProg_EfficientNormal_Walsh_6.jpg
Figure 6: A simple grid-based mesh with triangles


Having a fixed topology which is based on a grid allows us to compute an index into a buffer, as we did in the filter method. As before, the corner or edge the vertex lies on changes the number and indices of the surfaces which will be used in computing the normal. Let's take a look at an implementation which uses DirectX 10 to compute both surface and vertex normals on the GPU.

Implementing the Algorithm with DirectX 10


As with the filter method, the first thing we must do is define our vertex format.

struct MeshVertex
{
	D3DXVECTOR3 pos;
	unsigned i;
};

Rather than having a 2D position vector containing an x and z component, we've now got a full 3D position. The second parameter, an unsigned integer which I call i, will be used as the vertex's index. This index was unnecessary with the filter method because the x and z values combined could be used to compute where in the heightfield the vertex lay. However, the arbitrary nature of vertices within a mesh makes it impossible to determine a vertex's relationship to other vertices based solely on its position. Thus, the index helps us to identify which normals within the surface normal buffer are attached to any given vertex.

Next, we'll need to create a buffer and associated view resources. Unlike before, this buffer will be used to store our surface normals rather than simple height values, and it will need to be both written to via stream output and read from by the vertex shader in the second pass.

ID3D10Buffer*	m_pNormalBufferSO;
ID3D10ShaderResourceView*	m_pNormalBufferRVSO;
ID3D10EffectShaderResourceVariable* m_pSurfaceNormalsRV;

When creating our buffer we need to make sure that in addition to being bound as a shader resource, it is also bound as a stream output buffer so that it can be set as a stream output target.

void Heightfield::CreateShaderResources( int numSurfaces )
{
  // Create the non-streamed Shader Resources
  D3D10_BUFFER_DESC desc;
  D3D10_SHADER_RESOURCE_VIEW_DESC SRVDesc;
  // Create output normal buffer for the Stream Output
  ZeroMemory(&desc, sizeof(D3D10_BUFFER_DESC));
  ZeroMemory(&SRVDesc, sizeof(SRVDesc));
  desc.ByteWidth 	= numSurfaces * sizeof(D3DXVECTOR4);
  desc.Usage	= D3D10_USAGE_DEFAULT;
  desc.BindFlags	= D3D10_BIND_SHADER_RESOURCE | D3D10_BIND_STREAM_OUTPUT;

Next, we need to make sure that the buffer format is compatible with stream output stage and that it can be used as a stream output target. The only format I was able to successfully bind was DXGI_FORMAT_R32G32B32A32_FLOAT. I attempted to get it working with R32G32B32_FLOAT, a full 4 bytes smaller per buffer element, but the compiler claimed it was an incompatible type. This may suggest the stream output stage must treat each element as a full float4.

  SRVDesc.Format	       = DXGI_FORMAT_R32G32B32A32_FLOAT;
  SRVDesc.ViewDimension		= D3D10_SRV_DIMENSION_BUFFER;
  SRVDesc.Buffer.ElementWidth 	= numSurfaces;
  m_pDevice->CreateBuffer(&desc, NULL, &m_pNormalBufferSO);
  m_pDevice->CreateShaderResourceView(m_pNormalBufferSO, &SRVDesc, 
    &m_pNormalBufferRVSO);
}

After initializing all of our buffers and updating our mesh positions the next step is to draw our mesh. As before, we’re going to create some local variables, set some necessary constants within our HLSL, and set the input layout, index buffer, and vertex buffers so that the input assembler knows how to construct and process our vertices for use by the vertex shader.

void Heightfield::Draw()
{
  int numRows  = m_NumVertsDeep - 1;
  int numIndices = 2 * m_NumVertsWide;
  m_pNumVertsDeep->SetInt(m_NumVertsDeep);
  m_pNumVertsWide->SetInt(m_NumVertsWide);
  m_pMetersPerVertex->SetFloat(m_MetersPerVertex);
  m_pDevice->IASetPrimitiveTopology (D3D10_PRIMITIVE_TOPOLOGY_TRIANGLESTRIP);
  m_pDevice->IASetIndexBuffer(m_pIndexBuffer, DXGI_FORMAT_R32_UINT,0);
  UINT offset = 0;
  UINT stride = sizeof(MeshVertex);
  m_pDevice->IASetInputLayout(m_pMeshIL);
  m_pDevice->IASetVertexBuffers(0, 1, &m_pMeshVB, &stride, &offset);

Next, we get a chance to take a look at some new stuff. Here we're creating an array of ID3D10ShaderResourceViews containing a single NULL element and setting it as the shader resource array of the Vertex Shader. We do this because we're using the same buffer as a stream output target and as input to our vertex shader. In order to do this we need to ensure that it's not still bound to the Vertex Shader from a previous pass when we attempt to write to it.

  ID3D10ShaderResourceView* pViews[] = {NULL};
  m_pDevice->VSSetShaderResources(0, 1, pViews);

Once we’ve cleared the shader resources attached to slot 0 we’re going to set our surface normal buffer as the output for the stream output stage. This makes it so any vertices we add to the stream get stored within our buffer. Because we don’t actually read from that buffer until the second pass, it’s also fine to bind it to the Vertex Shader at this point.

  m_pDevice->SOSetTargets(1, &m_pNormalBufferSO, &offset);
  m_pSurfaceNormalsRV->SetResource(m_pNormalBufferRVSO);

Unlike the filter method, we take a few additional steps when calling draw here. First, because we’re working with a technique which we know to have more than 1 pass we’re going to be polite and actually ask the technique how many passes it has, and then iterate over both.

  D3D10_TECHNIQUE_DESC desc;
  m_pMeshWithNormalMapSOTech->GetDesc(&desc);
  for(unsigned i = 0; i < desc.Passes; i++)
  {
    m_pMeshWithNormalMapSOTech->GetPassByIndex(i)->Apply(0);
    for (int j = 0; j < numRows; j++)
      m_pDevice->DrawIndexed(numIndices, j * numIndices, 0 );

At the end of our first pass, after iterating over each of the rows in our mesh and drawing them, we need to make sure to clear the stream output targets so that in pass 1 we can use the surface normal buffer as an input to the vertex shader.

    m_pDevice->SOSetTargets(0, NULL, &offset);
  }	
}

Next, we move on to the HLSL. While the technique declaration comes last within the HLSL file, I’m going to show it to you here first so you have a clear picture of how the technique is structured and how the 2 passes are broken down. In the first pass, pass P0, we’re setting the vertex shader to contain a simple pass-through shader. All this shader does is take the input from the Input Assembler and pass it along to the Geometry Shader.

GeometryShader gsNormalBuffer = ConstructGSWithSO( CompileShader( gs_4_0, 
  SurfaceNormalGS() ), "POSITION.xyzw" );
technique10 MeshWithNormalMapSOTech
{
  pass P0
  {
    SetVertexShader( CompileShader( vs_4_0, PassThroughVS() ) );
    SetGeometryShader( gsNormalBuffer );
    SetPixelShader( NULL );
  }

The next part of the declaration is the assignment of the geometry shader. For clarity, the geometry shader is built before the pass declaration using the ConstructGSWithSO HLSL method. Importantly, the last parameter of the ConstructGSWithSO method is the output format of the primitives that are added to the streamed output. In our case, we're simply passing out a four-component position value which, incidentally, doesn't represent position, but represents our surface normal vectors. The final part of P0 is setting the pixel shader. Because P0 is strictly for computing our surface normals we're going to set the pixel shader to null.

Once Pass 0 is complete, we render our mesh a second time using pass 1. In pass 1 we set a vertex shader which looks almost identical to the vertex shader we used for the filter method. The primary difference is that this vertex shader calls ComputeNormal instead of FilterNormal, resulting in a different approach to obtaining the vertex normal. Because Pass 1 is ultimately responsible for rendering our mesh to the screen we’re going to leave our Geometry Shader null for this pass and instead provide a pixel shader. The pixel shader is just a standard shader for drawing a pixel at a given point using the color and position interpolated from the previous stage. Note we also enable depth buffering so that the mesh waves don’t draw over themselves.

  pass P1
  {
    SetVertexShader( CompileShader( vs_4_0, RenderNormalMapScene() ) );
    SetGeometryShader( NULL );
    SetPixelShader( CompileShader( ps_4_0, RenderScenePS() ) );
    SetDepthStencilState( EnableDepth, 0 );
  }
}

Now let’s take look at the most important of those shaders one at a time. The first item on the list is the Geometry Shader.

[maxvertexcount(1)] 
void SurfaceNormalGS( triangle GS_INPUT input[3], inout PointStream<GS_INPUT> PStream )
{
  GS_INPUT Output = (GS_INPUT)0;
  float3 edge1 = input[1].Position - input[0].Position;
  float3 edge2 = input[2].Position - input[0].Position;
  Output.Position.xyz = normalize( cross( edge2, edge1 ) );
  PStream.Append(Output);
}

Here we declare a simple geometry shader that takes as input a triangle and a point stream. Inside of the geometry shader we implement the first part of Gouraud’s Smooth Shading algorithm by computing the cross-product of the provided triangle. Once we’ve done so, we normalize it, and then add it as a 4-component point to our point stream. This stream, which will be filled with float4 values, will serve as the surface normal buffer in the second pass.

This leads us to the second part of Gouraud’s Smooth shading algorithm. Rather than list the entire vertex shader here, let’s focus on the code that actually does most of the work – ComputeNormal.

float3 ComputeNormal(uint index)
{
  float3 normal 	 = 0.0;
  int topVertex	 = g_NumVertsDeep - 1;
  int rightVertex	 = g_NumVertsWide - 1;
  int normalsPerRow = rightVertex * 2;
  int numRows	 = topVertex;
  float top		 = normalsPerRow * (numRows - 1);
  int x = index % g_NumVertsWide;
  int z = index / g_NumVertsWide;
  // Bottom
  if(z == 0)
  {
    if(x == 0)
    {
      float3 normal0 = g_SurfaceNormals.Load(int2( 0, 0 ));
      float3 normal1 = g_SurfaceNormals.Load(int2( 1, 0 ));
      normal = normal0 + normal1;
    }
    else if(x == rightVertex)
    {
      index = (normalsPerRow - 1);
      normal = g_SurfaceNormals.Load(int2( index, 0 ));
    }
    else
    {
      index = (2 * x);
      normal = g_SurfaceNormals.Load(int2( index-1, 0 )) +
        g_SurfaceNormals.Load(int2( index,  0 )) +
        g_SurfaceNormals.Load(int2( index+1, 0 ));
    }
  }
  // Top
  else if(z == topVertex)
  {
    if(x == 0)
    {
      normal = g_SurfaceNormals.Load(int2( top, 0 ));
    }
    else if(x == rightVertex)
    {
      index =	(normalsPerRow * numRows) - 1;
      normal = g_SurfaceNormals.Load(int2( index,  0 )) +
           g_SurfaceNormals.Load(int2( index-1, 0 ));
    }
    else
    {
      index = top + (2 * x);
      normal = g_SurfaceNormals.Load(int2( index-2, 0)) +
           g_SurfaceNormals.Load(int2( index,  0)) +
           g_SurfaceNormals.Load(int2( index-1, 0));
    }
  }
  // Middle
  else
  {
    if(x == 0)
    {
      int index1 = z * normalsPerRow;
      int index2 = index1 - normalsPerRow;
      normal = g_SurfaceNormals.Load(int2( index1,  0 )) +
           g_SurfaceNormals.Load(int2( index1+1, 0 )) +
           g_SurfaceNormals.Load(int2( index2,  0 ));
    }
    else if(x == rightVertex)
    {
      int index1 = (z + 1) * normalsPerRow - 1;
      int index2 = index1 - normalsPerRow;
      normal = g_SurfaceNormals.Load(int2( index1,  0 )) +
           g_SurfaceNormals.Load(int2( index2,  0 )) +
           g_SurfaceNormals.Load(int2( index2-1, 0 ));
    }
    else
    {
      int index1 = (z * normalsPerRow) + (2 * x);
      int index2 = index1 - normalsPerRow;
      normal = g_SurfaceNormals.Load(int2( index1-1, 0 )) +
           g_SurfaceNormals.Load(int2( index1,  0 )) +
           g_SurfaceNormals.Load(int2( index1+1, 0 )) +			
           g_SurfaceNormals.Load(int2( index2-2, 0 )) +
           g_SurfaceNormals.Load(int2( index2-1, 0 )) +
           g_SurfaceNormals.Load(int2( index2,  0 ));
    }
  }
  return normal;
}

As with the filter algorithm for heightmap based terrain, the ComputeNormal method performs a series of checks to determine whether the vertex is on the left, middle, or right edge, and whether it’s on the bottom, center, or top edge of the grid. Depending on the answer, between one and six surface normals are sampled from the buffer and then summed together. The result is then returned to the main entry function for the vertex shader, where it is normalized, and used in computing the final color of the Vertex.

Conclusion


The code snippets contained within this article are based on the accompanying demo which is written in C++ using DirectX 10 and HLSL, and contains not only the algorithms detailed here but also the two remaining methods for computing vertex normals for a grid-based mesh. That is, it computes the Surface and Vertex normals on the CPU, and also computes the Surface normals on the CPU while using the GPU for the Vertex Normals. Please make sure to download those files, run the demo for yourself, and evaluate the remaining source code. You can then perform a personal comparison of the performance afforded to you by each method. For an idea of what the demo looks like, refer to Figure 7.


Attached Image: AdvGameProg_EfficientNormal_Walsh_7.jpg
Figure 7: A screenshot of the demo program for this article


As the demo uses DirectX 10 and SM 4.0, you will need a computer running Windows Vista and a compatible video card with the most up-to-date DirectX SDK and drivers in order to successfully run the application.

The limited documentation for the demo can be displayed within the application by pressing the F1 key. Additionally, on the lower right-hand corner of the screen there are four controls which can be used to configure the demonstration. The resolution slider identifies the number of vertices across and deep the terrain sample will be. The slider ranges from 17 to 512.

Below the Resolution Slider is a combo box which allows the user to select from one of the 4 different methods of implementation within the demo: “Filtered Normals”, “CPU Normals”, “GPU Normals”, and “Streamed Normals”.

Below the normal mode combo boxes are two buttons labeled “Toggle Animation” and “Toggle wireframe”. No surprise, these either toggle on/off the animation or toggle on/off wireframe mode, respectively.

References

Normal Computations for Heightfield Lighting
Jeromy Walsh, 2005, GameDev.net

GDC 2013: Interview with Jake Lewandowski

Where do video game developers find their inspiration? Some look to the fist! Design3 met up with Jake Lewandowski at GDC 2013, where he told us all about the success of his hit game, Fist Puncher. Jake also spilled the beans on the free tools he uses, turning his Kickstarter supporters into characters in his game, and what it's like developing games with his brother.


Watch more interviews and game dev tutorials FREE at http://www.design3.com



Introduction to Object Oriented Programming Concepts (OOP) and More


1. Introduction


I have noticed an increase in the number of articles published in the Architect category on CodeProject during the last few months. The number of readers for most of these articles is also high, though the ratings for the articles are not. This indicates that readers are interested in reading articles on architecture, but the quality does not match their expectations. This article is a constructive attempt to group, define and explain all the introductory concepts of software architecture for well-seasoned developers who are looking to take their next step as system architects.

One day I read an article that said that the richest 2 percent own half the world's wealth. It also said that the richest 1 percent of adults owned 40 percent of global assets in the year 2000. And further, that the richest 10 percent of adults accounted for 85 percent of the world's total wealth. So there is an unbalanced distribution of wealth in the physical world. Have you ever thought of an unbalanced distribution of knowledge in the software world? From my point of view, the massive expansion of the software industry is forcing developers to use already implemented libraries, services and frameworks to develop software within ever shorter periods of time. New developers are trained to use (I would say more often than not) already developed software components, to complete the development quicker. They just plug in an existing library and somehow manage to achieve the requirements. But the sad part of the story is that they never get training in how to define and design the architecture for such components, or how to implement them. As the years pass by, these developers become leads and then software architects. Their titles change, but the old legacy of not understanding, of not having any architectural experience, continues, creating a vacuum of good architects. The bottom line is that only a small percentage of developers know how to design a truly object oriented system. The solution to this problem is getting harder every day, as the aggressive nature of the software industry does not support an easy adjustment to existing processes, and the related online teaching materials are either complex, or less practical, or sometimes even wrong. Most of them use impractical, irrelevant examples of shapes, animals and many other physical world entities to teach the concepts of software architecture. There are only very few good business-oriented design references. Unfortunately, I myself am no exception and am a result of this very same system. I got the same education that all of you did, and also referred to the same resource set you all read.

Coming back to the initial point, I noticed that there is a knowledge gap, increasing every day, between the architects who know how to architect a system properly and the others who do not. The ones who know, know it right. But the ones who do not know, know nothing. Just like the world's wealth distribution, it is an unbalanced distribution of knowledge.

2. Background


This article began after reading and hearing the questions new developers have, on basics of software architecture. There are some good articles out there, but still developers struggle to understand the basic concepts, and more importantly, the way to apply them correctly.

As I see it, newcomers will always struggle to understand a precise definition of a new concept, because it is always a new and hence unfamiliar idea. The one who has experience understands the meaning, but the one who doesn't struggles to understand the very same definition. It is like the old hiring problem: employers want experienced employees, so they say you need to have experience to get a job. But how the hell is one supposed to have that experience if no one is willing to give him a job? The start with software architecture is no exception; it will be difficult. When you start to design your very first system, you will try to apply everything you know or have learned from everywhere. You will feel that an interface needs to be defined for every class, like I did once. You will find it hard to understand when to do something and when not to. Just prepare to go through a painful process. Others will criticize you, may laugh at you, and say that the way you have designed it is wrong. Listen to them, and learn continuously. In this process you will also have to read and think a lot. I hope that this article will give you the right start for that long journey.

“The knowledge of the actions of great men, acquired by long experience in contemporary affairs, and a continual study of antiquity” – I came across this phrase while reading the book “The Art of War”, and it seems applicable here, doesn’t it?

3. Prerequisites


This article is an effort to provide an accurate pool of information for new developers on the basics of software architecture, focusing on Object Oriented Programming (OOP). If you are a developer with a minimum of three or more years of continuous development experience and you have that hunger to learn more, to step into the next level and become a software architect, this article is for you.

4. The Main Content


4.1. What is Software Architecture?


Software Architecture is defined to be the rules, heuristics and patterns governing:
  • Partitioning the problem and the system to be built into discrete pieces
  • Techniques used to create interfaces between these pieces
  • Techniques used to manage overall structure and flow
  • Techniques used to interface the system to its environment
  • Appropriate use of development and delivery approaches, techniques and tools.

4.2. Why is Architecture important?


Attached Image: OOP.jpg


The primary goal of software architecture is to define the non-functional requirements of a system and to define the environment. Detailed design follows, defining how to deliver the functional behavior within the architectural rules. Architecture is important because it:
  • Controls complexity
  • Enforces best practices
  • Gives consistency and uniformity
  • Increases predictability
  • Enables re-use.

4.3. What is OOP?


OOP is a design philosophy. It stands for Object Oriented Programming. Object-Oriented Programming (OOP) uses a different style of programming from older procedural programming languages (C, Pascal, etc.). Everything in OOP is grouped into self-sustainable "objects". Hence, you gain re-usability by means of the four main object-oriented programming concepts.

In order to clearly understand object orientation, let’s take your “hand” as an example. The “hand” is a class. Your body has two objects of type hand, named left hand and right hand. Their main functions are controlled/managed by a set of electrical signals sent through your shoulders (through an interface). So the shoulder is an interface which your body uses to interact with your hands. The hand is a well-architected class. The hand is re-used to create the left hand and the right hand by slightly changing its properties.

4.4. What is an Object?


An object can be considered a "thing" that can perform a set of related activities. The set of activities that the object performs defines the object's behavior. For example, a hand (object) can grip something, or a Student (object) can give its name or address.

In pure OOP terms an object is an instance of a class.

4.5. What is a Class?


Attached Image: class.gif


A class is simply a representation of a type of object. It is the blueprint/plan/template that describes the details of an object, and it is what the individual objects are created from. A class is composed of three things: a name, attributes, and operations.

public class Student
{
}
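The Student class above is deliberately empty. Purely as an illustrative sketch (the attribute and operation shown are my own assumptions, not part of the original sample), the same class could carry one attribute and one operation like this:

public class Student
{
    // attribute (data the class holds)
    private string name;

    // operation (behavior the class exposes)
    public string GetName()
    {
        return this.name;
    }
}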

According to the sample given below, we can say that the student object, named objectStudent, has been created from the Student class.

Student objectStudent = new Student();

In the real world, you often find many individual objects all of the same kind. As an example, there may be thousands of other bicycles in existence, all of the same make and model. Each bicycle was built from the same blueprint. In object-oriented terms, we say that each bicycle is an instance of the class of objects known as bicycles.

In the software world, though you may not have realized it, you have already used classes. For example, the TextBox control you have always used is made out of the TextBox class, which defines its appearance and capabilities. Each time you drag a TextBox control onto a form, you are actually creating a new instance of the TextBox class.

4.6. How to identify and design a Class?


This is an art; each designer uses different techniques to identify classes. However, according to Object Oriented Design principles, there are five principles that you must follow when designing a class:

  • SRP - The Single Responsibility Principle -
    A class should have one, and only one, reason to change.
  • OCP - The Open Closed Principle -
    You should be able to extend a class's behavior without modifying it.
  • LSP - The Liskov Substitution Principle -
    Derived classes must be substitutable for their base classes.
  • DIP - The Dependency Inversion Principle -
    Depend on abstractions, not on concretions.
  • ISP - The Interface Segregation Principle -
    Make fine grained interfaces that are client specific.

For more information on design principles, please refer to Object Mentor. To make the list a little more concrete, a small illustrative sketch follows.
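The following is a minimal, hypothetical sketch (the class and interface names are my own, not from the article) of how SRP, OCP and DIP tend to appear together: new report formats can be added by writing new implementations of the interface, without modifying the class that uses it.

// The abstraction that clients depend on (DIP) and that stays closed for modification (OCP)
public interface IReportFormatter
{
    string Format(string reportBody);
}

public class PlainTextFormatter : IReportFormatter
{
    public string Format(string reportBody)
    {
        return reportBody;
    }
}

public class HtmlFormatter : IReportFormatter
{
    public string Format(string reportBody)
    {
        return "<html><body>" + reportBody + "</body></html>";
    }
}

// ReportPrinter has a single responsibility (SRP) and is extended with new
// formats by adding new IReportFormatter implementations, not by changing it (OCP).
public class ReportPrinter
{
    public void Print(string reportBody, IReportFormatter formatter)
    {
        System.Console.WriteLine(formatter.Format(reportBody));
    }
}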


Additionally, to identify a class correctly, you need to identify the full list of leaf-level functions/operations of the system (the granular-level use cases of the system). Then you can proceed to group related functions to form classes (a class groups functions/operations of the same kind). However, a well defined class must be a meaningful grouping of a set of functions and should support re-usability while increasing the expandability/maintainability of the overall system.


In the software world, the concept of dividing and conquering is always recommended. If you start analyzing a full system at once, you will find it harder to manage. So the better approach is to identify the modules of the system first and then dig deep into each module separately to seek out classes.


A software system may consist of many classes, and when you have many, they need to be managed. Think of a big organization with a work force exceeding several thousand employees (let’s take one employee as one class). In order to manage such a work force, you need to have proper management policies in place. The same technique can be applied to manage the classes of your software system. In order to manage the classes of a software system and to reduce complexity, system designers use several techniques which can be grouped under four main concepts named Encapsulation, Abstraction, Inheritance, and Polymorphism. These are the four main gods of the OOP world and, in software terms, they are called the four main Object Oriented Programming (OOP) concepts.


4.7. What is Encapsulation (or information hiding)?


Encapsulation is the inclusion within a program object of all the resources needed for the object to function - basically, the methods and the data. In OOP, encapsulation is mainly achieved by creating classes; the classes expose public methods and properties. A class is a kind of container or capsule or cell which encapsulates a set of methods, attributes and properties to provide its intended functionality to other classes. In that sense, encapsulation also allows a class to change its internal implementation without hurting the overall functioning of the system. The idea of encapsulation is to hide how a class does what it does, while allowing others to request what it should do.
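As a minimal sketch of this idea (the class and member names are illustrative assumptions, not taken from the article), the internal representation below can change freely as long as the public surface stays the same:

public class BankAccount
{
    // hidden internal state; callers cannot touch it directly
    private decimal balance;

    // the public surface the rest of the system depends on
    public decimal Balance
    {
        get { return this.balance; }
    }

    public void Deposit(decimal amount)
    {
        // validation stays inside the capsule
        if (amount <= 0)
        {
            throw new System.ArgumentException("Amount must be positive.");
        }
        this.balance += amount;
    }
}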


Attached Image: class.gif


In order to modularize/define the functionality of one class, that class can use the functions/properties exposed by another class in many different ways. According to Object Oriented Programming, there are several techniques classes can use to link with each other, and they are named association, aggregation, and composition.

There are several other ways in which encapsulation can be achieved; as an example, we can take the usage of an interface. An interface can be used to hide the information of an implementing class.


IStudent myStudent = new LocalStudent();
// later, the same variable can refer to a different implementation
myStudent = new ForeignStudent();

According to the sample above (let’s assume that LocalStudent and ForeignStudent both implement the IStudent interface), we can see how LocalStudent and ForeignStudent hide their localized implementation details behind the IStudent interface.
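For completeness, a minimal sketch of the types assumed above might look like this (the member shown is my own illustration; the article does not define these classes):

public interface IStudent
{
    string GetRegistrationCategory();
}

public class LocalStudent : IStudent
{
    public string GetRegistrationCategory()
    {
        // local-specific rules are hidden behind the interface
        return "Local";
    }
}

public class ForeignStudent : IStudent
{
    public string GetRegistrationCategory()
    {
        // foreign-specific rules are hidden behind the interface
        return "Foreign";
    }
}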

4.8. What is Association?


Association is a (*a*) relationship between two classes. It allows one object instance to cause another to perform an action on its behalf. Association is the more general term that defines the relationship between two classes, whereas aggregation and composition are relatively special cases.

public class StudentRegistrar
{
    public StudentRegistrar()
    {
        new RecordManager().Initialize();
    }
}

In this case we can say that there is an association between StudentRegistrar and RecordManager, or that there is a directional association from StudentRegistrar to RecordManager, or that StudentRegistrar uses a (*uses*) RecordManager. Since a direction is explicitly specified, in this case the controller class is the StudentRegistrar.


Attached Image: Association.gif


To some beginners, association is a confusing concept. The trouble is created not by association alone, but by the combination of three closely related OOP concepts: association, aggregation and composition. Everyone understands association before aggregation and composition are described, but aggregation and composition cannot be understood separately. If you try to understand aggregation alone, it will crack the definition given for association, and if you try to understand composition alone, it will always threaten the definition given for aggregation. All three concepts are closely related, hence they must be studied together, by comparing one definition to another. Let’s explore all three and see whether we can understand the differences between these useful concepts.

4.9. What is the difference between Association, Aggregation and Composition?


Association is a (*a*) relationship between two classes, where one class uses another. Aggregation describes a special type of association: aggregation is the (*the*) relationship between two classes where an object of one class has an (*has*) object of another, the second being a part of the first (a containment relationship). Unlike association, aggregation always implies a direction.

public class University
{
    private Chancellor  universityChancellor = new Chancellor();
}


Attached Image: aggregation.gif


In this case I can say that University aggregates Chancellor, or that University has a (*has-a*) Chancellor. Even without a Chancellor, a University can exist. But the Faculties cannot exist without the University; the lifetime of a Faculty (or Faculties) is attached to the lifetime of the University. If the University is disposed, the Faculties will not exist. In that case we say that the University is composed of Faculties, so composition can be recognized as a special type of aggregation.


Attached Image: Composite.gif


In the same way, as another example, you can say that there is a composite relationship between a KeyValuePairCollection and a KeyValuePair; the two mutually depend on each other.


.Net and Java use the composite relation to define their collections. I have seen composition being used in many other ways too. However, the more important factor that most people forget is the lifetime factor: the lifetimes of the two classes bound by a composite relation mutually depend on each other. If you take the .Net Collection to understand this, the Collection Element is defined inside the Collection (it is an inner part, hence we say the Collection is composed of it), forcing the Element to be disposed with the Collection. If, as an example, you instead define the Collection and its Element to be independent, then the relationship would be more of an aggregation than a composition. So the point is, if you want to bind two classes with a composite relation, the more accurate way is to define one inside the other class (making it a protected or private class). This way you allow the outer class to fulfill its purpose while tying the lifetime of the inner class to the outer class. A small sketch of this nesting appears below.
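Here is a minimal, hypothetical sketch of that advice (the University/Faculty names follow the article's example, but the members are my own assumptions): the Faculty type is nested inside University, so a Faculty instance can only live, and die, with its owning University.

public class University
{
    // Faculty is declared inside University, so its lifetime is bound
    // to the University instance that creates it (composition).
    private class Faculty
    {
        public string Name;
    }

    private System.Collections.Generic.List<Faculty> faculties =
        new System.Collections.Generic.List<Faculty>();

    public void AddFaculty(string name)
    {
        this.faculties.Add(new Faculty { Name = name });
    }
}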


So in summary, we can say that aggregation is a special kind of association and composition is a special kind of aggregation. (Association -> Aggregation -> Composition)


Attached Image: association_aggre_com.gif


4.10. What is Abstraction and Generalization?


Abstraction is an emphasis on the idea, qualities and properties rather than the particulars (a suppression of detail). The importance of abstraction is derived from its ability to hide irrelevant details and from the use of names to reference objects. Abstraction is essential in the construction of programs. It places the emphasis on what an object is or does rather than how it is represented or how it works. Thus, it is the primary means of managing complexity in large programs.

While abstraction reduces complexity by hiding irrelevant detail, generalization reduces complexity by replacing multiple entities which perform similar functions with a single construct. Generalization is the broadening of application to encompass a larger domain of objects of the same or different type. Programming languages provide generalization through variables, parameterization, generics and polymorphism. It places the emphasis on the similarities between objects. Thus, it helps to manage complexity by collecting individuals into groups and providing a representative which can be used to specify any individual of the group.


Abstraction and generalization are often used together. Abstracts are generalized through parameterization to provide greater utility. In parameterization, one or more parts of an entity are replaced with a name which is new to the entity. The name is used as a parameter. When the parameterized abstract is invoked, it is invoked with a binding of the parameter to an argument.


4.11. What is an Abstract class?


Abstract classes, which are declared with the abstract keyword, cannot be instantiated. They can only be used as super-classes for other classes that extend the abstract class. An abstract class is a concept whose implementation gets completed when it is realized by a subclass. In addition, a class can inherit from only one abstract class (but a class may implement many interfaces), and it must override all its abstract methods/properties and may override virtual methods/properties.

Abstract classes are ideal when implementing frameworks. As an example, let’s study the abstract class named LoggerBase below. Please carefully read the comments as it will help you to understand the reasoning behind this code.


public abstract class LoggerBase
{
    /// <summary>
    /// The field is private, so it is intended to be used inside the class only
    /// </summary>
    private log4net.ILog logger = null; 

    /// <summary>
    /// protected, so it is only visible to inherited classes
    /// </summary>
    protected LoggerBase()
    {
        // The private object is created inside the constructor 
        logger = log4net.LogManager.GetLogger(this.LogPrefix);
        // The additional initialization is done immediately after
        log4net.Config.DOMConfigurator.Configure();
    }

    /// <summary>
    /// When you define the property as abstract,
    /// it forces the inherited class to override the LogPrefix.
    /// So, with the help of this technique, the log can be made
    /// inside the abstract class itself, irrespective of its origin.
    /// If you study it carefully you will find a reason for not having a 'set' method here.
    /// </summary>
    protected abstract System.Type LogPrefix
    {
        get;
    }

    /// <summary>
    /// Simple log method,
    /// which is only visible to inherited classes
    /// </summary>
    /// <param name="message"></param>
    protected void LogError(string message)
    {
        if (this.logger.IsErrorEnabled)
        {
            this.logger.Error(message);
        }
    }

    /// <summary>
    /// Public property which is exposed to inherited classes
    /// and all other classes that have access to the inherited class
    /// </summary>
    public bool IsThisLogError
    {
        get
        {
            return this.logger.IsErrorEnabled;
        }
    }
}

The idea of having this class as abstract is to define a framework for exception logging. This class will allow all subclasses to gain access to a common exception logging module, and it will make it easy to replace the logging library. By the time you define the LoggerBase, you wouldn’t have an idea about the other modules of the system. But you do have a concept in mind, and that is: if a class is going to log an exception, it has to inherit the LoggerBase. In other words, the LoggerBase provides a framework for exception logging.

Let’s try to understand each line of the above code.


Like any other class, an abstract class can contain fields, hence I used a private field named logger to hold a reference to the ILog interface of the well-known log4net library. This allows the LoggerBase class to control what is used for logging and hence makes it easy to change the underlying logger library.


The access modifier of the constructor of the LoggerBase is protected. A public constructor has no use when the class is abstract, since abstract classes cannot be instantiated, so I went for the protected constructor.


The abstract property named LogPrefix is an important one. It enforces and guarantees that every subclass has a value for LogPrefix (LogPrefix is used to obtain the details of the source class in which the exception occurred) before it invokes a method to log an error.


The method named LogError is protected, hence exposed to all subclasses. You are not allowed, or rather you cannot, make it public, as any class that does not inherit from LoggerBase cannot use it meaningfully.
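Putting the last two points together, a hypothetical subclass (the class name and method below are my own illustration, not from the article) would only have to supply the LogPrefix and could then use the inherited LogError straight away:

public class RegistrationService : LoggerBase
{
    // Overriding the abstract property gives the base class the
    // source type it needs before any error can be logged.
    protected override System.Type LogPrefix
    {
        get { return typeof(RegistrationService); }
    }

    public void Register()
    {
        try
        {
            // registration logic would go here
        }
        catch (System.Exception e)
        {
            // protected helper inherited from LoggerBase
            this.LogError(e.Message);
        }
    }
}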


Let’s find out why the property named IsThisLogError is public. It may be important/useful for other classes associated with an inherited class to know whether the associated member logs its errors or not.


Apart from these, you can also have virtual methods defined in an abstract class. A virtual method may have a default implementation, and a subclass can override it when required.


All in all, the important factor here is that all OOP concepts should be used carefully, with reasons; you should be able to logically explain why you make a property public, a field private, or a class abstract. Additionally, when architecting frameworks, the OOP concepts can be used to guide the system to be developed in the way the framework architect wanted it to be architected initially.


4.12. What is an Interface?


In summary, an interface separates the implementation from the definition of the structure, and this concept is very useful in cases where you need the implementation to be interchangeable. Apart from that, an interface is very useful when the implementation changes frequently. Some say you should define all classes in terms of interfaces, but I think that recommendation seems a bit extreme.

An interface can be used to define a generic template, and then one or more abstract classes to define partial implementations of the interface. Interfaces just specify method declarations (implicitly public and abstract) and can contain properties (which are also implicitly public and abstract). An interface definition begins with the keyword interface. An interface, like an abstract class, cannot be instantiated.


If a class that implements an interface does not define all the methods of the interface, then it must be declared abstract, and the method definitions must be provided by the subclass that extends the abstract class. In addition to this, an interface can inherit from other interfaces.


The sample below will provide an interface for our LoggerBase abstract class.


public interface ILogger
{
    bool IsThisLogError { get; }
}
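One hedged sketch of why this interface is useful (the consumer class below is my own illustration, not part of the article): code that merely needs to query logging behavior can depend on ILogger rather than on the concrete LoggerBase hierarchy, keeping the implementation interchangeable.

public class DiagnosticsReport
{
    private readonly ILogger logger;

    public DiagnosticsReport(ILogger logger)
    {
        // any object that implements ILogger will do
        this.logger = logger;
    }

    public string Describe()
    {
        return this.logger.IsThisLogError
            ? "Error logging is enabled."
            : "Error logging is disabled.";
    }
}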

4.13. What is the difference between a Class and an Interface?


In .Net/C#, a class can be defined to implement an interface, and it supports implementing multiple interfaces. When a class implements an interface, an object of that class can be referred to through the interface.

If MyLogger is a class which implements ILogger, then we can write


ILogger log = new MyLogger();

A class and an interface are two different types (conceptually). Theoretically, a class emphasizes the idea of encapsulation, while an interface emphasizes the idea of abstraction (by suppressing the details of the implementation). The two pose a clear separation from one another. Therefore it is very difficult, or rather impossible, to have an effective meaningful comparison between the two, but it is very useful, and also meaningful, to have a comparison between an interface and an abstract class.

4.14. What is the difference between an Interface and an Abstract class?


There are quite a few differences between an interface and an abstract class, even though both look similar.

  • An interface definition begins with the keyword interface, so it is of type interface
  • Abstract classes are declared with the abstract keyword, so they are of type class
  • An interface has no implementation; it has to be implemented.
  • An abstract class’s methods can have implementations, and the class has to be extended.
  • An interface can only have method and property declarations (implicitly public and abstract); in C# an interface cannot contain fields
  • An abstract class’s methods lack an implementation only when they are declared abstract
  • An interface can inherit from more than one interface
  • An abstract class can implement more than one interface, but can inherit from only one class
  • A concrete subclass of an abstract class must override all abstract methods and may override virtual methods
  • An interface can be used when the implementation is changing
  • An abstract class can be used to provide some default behavior for a base class
  • An interface makes the implementation interchangeable
  • An interface increases security by hiding the implementation
  • An abstract class can be used when implementing a framework
  • Abstract classes are an excellent way to create planned inheritance hierarchies and to use as non-leaf classes in class hierarchies.

Abstract classes let you define some behaviors; they force your subclasses to provide others. For example, if you have an application framework, an abstract class can be used to provide the default implementation of the services and all mandatory modules such as event logging and message handling etc. This approach allows the developers to develop the application within the guided help provided by the framework.


However, in practice you will come across application-specific functionality that only your application can perform, such as startup and shutdown tasks. The abstract base class can declare virtual startup and shutdown methods. The base class knows that it needs those methods, but an abstract class lets it admit that it doesn't know how to perform those actions; it only knows that it must initiate them. When it is time to start up, the abstract class can call the startup method, and when the base class calls this method, it executes the method defined by the child class.
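A hedged sketch of that idea (the class and method names are mine, purely for illustration, not from the article):

public abstract class ApplicationBase
{
    // The framework drives the lifecycle...
    public void Run()
    {
        this.OnStartUp();
        // ... main work would happen here ...
        this.OnShutDown();
    }

    // ...but each application decides what startup and shutdown mean.
    protected virtual void OnStartUp() { }
    protected virtual void OnShutDown() { }
}

public class BillingApplication : ApplicationBase
{
    protected override void OnStartUp()
    {
        System.Console.WriteLine("Opening billing database connection.");
    }

    protected override void OnShutDown()
    {
        System.Console.WriteLine("Closing billing database connection.");
    }
}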


4.15. What are Implicit and Explicit Interface Implementations?


As mentioned before, .Net supports implementing multiple interfaces. The concepts of implicit and explicit implementation provide a safe way to implement methods of multiple interfaces by hiding, exposing or preserving the identities of each interface's methods, even when the method signatures are the same.


Let's consider the interface defined below.


interface IDisposable
{
    void Dispose();
}

In the sample below, you can see that the class Student has both implicitly and explicitly implemented the method named Dispose(), via Dispose and IDisposable.Dispose.


class Student : IDisposable
{
    public void Dispose()
    {
        Console.WriteLine("Student.Dispose");
    }

    void IDisposable.Dispose()
    {
        Console.WriteLine("IDisposable.Dispose");
    }
}
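Which implementation runs depends on the static type of the reference used for the call. The following short usage sketch is my own addition, not part of the original sample:

Student student = new Student();
student.Dispose();               // implicit implementation: prints "Student.Dispose"

IDisposable disposable = student;
disposable.Dispose();            // explicit implementation: prints "IDisposable.Dispose"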

4.16. What is Inheritance?


The ability to create a new class from an existing class by extending it is called inheritance.


Attached Image: Inheritance.gif


public class Exception
{
}


public class IOException : Exception
{
}

According to the above example, the new class (IOException), which is called the derived class or subclass, inherits the members of an existing class (Exception), which is called the base class or super-class. The class IOException can extend the functionality of the class Exception by adding new members and methods and by overriding existing ones.


Just as abstraction is closely related to generalization, inheritance is closely related to specialization. It is important to discuss these two concepts together with generalization to better understand them and to reduce complexity.


One of the most important relationships among objects in the real world is specialization, which can be described as the “is-a” relationship. When we say that a dog is a mammal, we mean that the dog is a specialized kind of mammal. It has all the characteristics of any mammal (it bears live young, nurses with milk, has hair), but it specializes these characteristics to the familiar characteristics of canis domesticus. A cat is also a mammal. As such, we expect it to share certain characteristics with the dog that are generalized in Mammal, but to differ in those characteristics that are specialized in cats.


The specialization and generalization relationships are both reciprocal and hierarchical. Specialization is just the other side of the generalization coin: Mammal generalizes what is common between dogs and cats, and dogs and cats specialize mammals to their own specific subtypes.


Similarly, as an example, you can say that both IOException and SecurityException are of type Exception. They have all the characteristics and behaviors of an Exception, meaning that IOException is a specialized kind of Exception. A SecurityException is also an Exception. As such, we expect it to share certain characteristics with IOException that are generalized in Exception, but to differ in those characteristics that are specialized in SecurityException. In other words, Exception generalizes the shared characteristics of both IOException and SecurityException, while IOException and SecurityException specialize with their own characteristics and behaviors.


In OOP, the specialization relationship is implemented using the principle called inheritance. This is the most common, most natural and most widely accepted way of implementing this relationship.


4.17. What is Polymorphism?


Polymorphism is a generic term that means 'many shapes'. More precisely, polymorphism means the ability to request that the same operations be performed by a wide range of different types of things.

At times, I used to think that understanding Object Oriented Programming concepts is made difficult by the fact that they are grouped under four main concepts, while each concept is closely related to the others. Hence one has to be extremely careful to correctly understand each concept separately, while also understanding the way each relates to the other concepts.


In OOP, polymorphism is achieved by using several different techniques, named method overloading, operator overloading and method overriding.


4.18. What is Method Overloading?


Method overloading is the ability to define several methods, all with the same name, that differ in their parameter lists.

public class MyLogger
{
    public void LogError(Exception e)
    {
        // Implementation goes here
    }

    public bool LogError(Exception e, string message)
    {
        // Implementation goes here
    }
}
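The compiler picks the overload from the argument list at the call site. The short usage sketch below is my own illustration (the exception and message are made up):

MyLogger logger = new MyLogger();
Exception error = new Exception("Disk not found");

logger.LogError(error);                       // resolves to LogError(Exception)
logger.LogError(error, "While saving game");  // resolves to LogError(Exception, string)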

4.19. What is Operator Overloading?


Operator overloading (less commonly known as ad-hoc polymorphism) is a specific case of polymorphism in which some or all operators like +, - or == are treated as polymorphic functions and as such have different behaviors depending on the types of their arguments.

public class Complex
{
    private int real;
    public int Real
    { get { return real; } }

    private int imaginary;
    public int Imaginary
    { get { return imaginary; } }

    public Complex(int real, int imaginary)
    {
        this.real = real;
        this.imaginary = imaginary;
    }

    public static Complex operator +(Complex c1, Complex c2)
    {
        return new Complex(c1.Real + c2.Real, c1.Imaginary + c2.Imaginary);
    }
}

In the above example I have overloaded the plus operator for adding two complex numbers. The two properties named Real and Imaginary have been declared exposing only the required “get” method, while the user-defined constructor of the class demands mandatory real and imaginary values.

4.20. What is Method Overriding?


Method overriding is a language feature that allows a subclass to override a specific implementation of a method that is already provided by one of its super-classes.

A subclass can give its own definition of a method, but it needs to have the same signature as the method in its super-class. This means that when overriding a method, the subclass's method has to have the same name and parameter list as the super-class's overridden method.


using System;
public class Complex
{
    private int real;
    public int Real
    { get { return real; } }

    private int imaginary;
    public int Imaginary
    { get { return imaginary; } }

    public Complex(int real, int imaginary)
    {
        this.real = real;
        this.imaginary = imaginary;
    }

    public static Complex operator +(Complex c1, Complex c2)
    {
        return new Complex(c1.Real + c2.Real, c1.Imaginary + c2.Imaginary);
    }

    public override string ToString()
    {
        return (String.Format("{0} + {1}i", real, imaginary));
    }
}

In the above example I have extended the implementation of the sample Complex class given under the operator overloading section. This class has one overridden method named “ToString”, which overrides the default implementation of the standard “ToString” method to support the correct string conversion of a complex number.

Complex num1 = new Complex(5, 7);
Complex num2 = new Complex(3, 8);

// Add two Complex numbers using the
// overloaded plus operator
Complex sum = num1 + num2;

// Print the numbers and the sum
// using the overridden ToString method
Console.WriteLine("({0}) + ({1}) = {2}", num1, num2, sum);
Console.ReadLine();

4.21. What is a Use case?


A use case is a thing an actor perceives from the system. A use case maps actors to functions. Importantly, the actors need not be people. As an example, a system can perform the role of an actor when it communicates with another system.


Attached Image: usercase1.gif


From another angle, a use case encodes a typical user interaction with the system. In particular, it:
  • Captures some user-visible function.
  • Achieves some concrete goal for the user.
A complete set of use cases largely defines the requirements for your system: everything the user can see and would like to do. The diagram below contains a set of use cases that describes a simple login module of a gaming website.


Attached Image: usecaseLogin.gif


4.22. What is a Class Diagram?


Class diagrams are widely used to describe the types of objects in a system and their relationships. Class diagrams model class structure and contents using design elements such as classes, packages and objects. Class diagrams describe three different perspectives when designing a system: conceptual, specification, and implementation. These perspectives become evident as the diagram is created and help solidify the design.


Class diagrams and physical data models, along with the system overview diagram, are in my opinion the most important diagrams that suit current-day rapid application development requirements.


UML Notations:


Attached Image: notation.jpg


4.23. What is a Package Diagram?


Package diagrams are used to reflect the organization of packages and their elements. When used to represent class elements, package diagrams provide a visualization of the namespaces. In my designs, I use package diagrams to organize classes into the different modules of the system.

4.24. What is a Sequence Diagram?


A sequence diagram models the flow of logic within a system in a visual manner. It enables you both to document and to validate your logic, and it is used for both analysis and design purposes. Sequence diagrams are the most popular UML artifact for dynamic modeling, which focuses on identifying the behavior within your system.

4.25. What is two-tier architecture?


Two-tier architecture refers to client/server architecture. The term client/server was first used in the 1980s in reference to personal computers (PCs) on a network. The actual client/server model started gaining acceptance in the late 1980s, and later it was adapted to World Wide Web programming.

In the modern-day use of two-tier architecture, the user interfaces (or, with ASP.NET, all web pages) run on the client and the database is stored on the server. The actual application logic can run on either the client or the server, so in this case the user interfaces access the database directly. The clients can also be non-interface processing engines, which provide services to other remote/local systems. In either case, today the two-tier model is not as widely used as the three-tier model. The advantage of the two-tier design is its simplicity, but the simplicity comes at the cost of scalability. The newer three-tier architecture, which is more popular, introduces a middle tier for the application logic.


Attached Image: 2-Tier.jpg


4.26. What is three-tier architecture?


The three-tier software architecture (also known as three-layer architecture) emerged in the 1990s to overcome the limitations of the two-tier architecture. This architecture has been aggressively customized and adapted by modern-day system designers for web systems.

Three-tier is a client-server architecture in which the user interface, functional process logic, data storage and data access are developed and maintained as independent modules, sometimes on separate platforms. The term "three-tier" or "three-layer", as well as the concept of multi-tier architectures (of which three-tier is the most common case), seems to have originated within Rational Software.


Attached Image: 3-Tier.jpg


The 3-Tier architecture has the following three tiers.

  1. Presentation Tier or Web Server: User Interface, displaying/ accepting data/ input to/ from the user
  2. Application Logic/ Business Logic/ Transaction Tier or Application Server: Data validation, acceptability check before being added to the database and all other business/ application specific operations
  3. Data Tier or Database server: Simple reading and writing method to database or any other storage, connection, command, stored procedures etc

4.27. What is MVC architecture?


The Model-View-Controller (MVC) architecture separates the modeling of the domain, the presentation, and the actions based on user input into three separate classes.

Unfortunately, the popularity of this pattern has resulted in a number of faulty usages; each technology (Java, ASP.NET etc.) has defined it in its own way, making it difficult to understand. In particular, the term "controller" has been used to mean different things in different contexts. The definitions given below are the closest possible ones I found for the ASP.NET version of MVC.


Attached Image: mvc.jpg


  1. Model: DataSet and typed DataSet (sometimes business objects, object collections, XML etc.) are the most common uses of the model.
  2. View: The ASPX and ASCX files generally handle the responsibilities of the view.
  3. Controllers: The handling of events or the controlling is usually done in the code-behind class.

In a complex n-tier distributed system, the MVC architecture plays the vital role of organizing the presentation tier of the system.

4.28. What is SOA?


A service-oriented architecture is essentially a collection of services. These services communicate with each other. The communication can involve either simple data passing or it could involve two or more services coordinating some activity. Some means of connecting services to each other is needed.

The .Net technology introduces SOA by means of web services.


Attached Image: SOA.gif


SOA can be used as a concept to connect multiple systems to provide services. It has its great share in the future of the IT world.


According to the imaginary diagram above, we can see how Service Oriented Architecture is being used to provide a set of centralized services to the citizens of a country. The citizens are given a unique identifying card which carries all the personal information of each citizen. Each service center, such as a shopping complex, hospital, station, or factory, is equipped with a computer system connected to a central server which is responsible for providing services to a city. As an example, when a customer enters the shopping complex, the regional computer system reports it to the central server and obtains information about the customer before providing access to the premises. The system welcomes the customer. When the customer finishes shopping and leaves the complex, he is asked to go through a billing process managed by the regional computer system. The payment is handled automatically with the details obtained from the customer's identifying card.


The regional system will report to the city (computer system of the city) while the city will report to the country (computer system of the country).


4.29. What is the Data Access Layer?


The data access layer (DAL), which is a key part of every n-tier system, mainly consists of a simple set of code that does basic interactions with the database or any other storage device. These functionalities are often referred to as CRUD (Create, Retrieve, Update, and Delete).

The data access layer needs to be as generic, simple, quick and efficient as possible. It should not include complex application/business logic.


I have seen systems with lengthy, complex stored procedures (SPs) which run through several cases before doing a simple retrieval. They contain not only most of the business logic, but application logic and user-interface logic as well. If an SP is getting long and complicated, then it is a good indication that you are burying your business logic inside the data access layer.
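As a hedged sketch of what "simple CRUD only" means in practice (the class, table and connection details below are illustrative assumptions, not from the article), a data access class should look roughly like this and nothing more:

public class StudentDataAccess
{
    private readonly string connectionString;

    public StudentDataAccess(string connectionString)
    {
        this.connectionString = connectionString;
    }

    // Retrieve: one straightforward query, no business rules in here
    public System.Data.DataTable GetStudentById(int studentId)
    {
        using (var connection = new System.Data.SqlClient.SqlConnection(this.connectionString))
        using (var command = new System.Data.SqlClient.SqlCommand(
                   "SELECT Id, Name FROM Student WHERE Id = @Id", connection))
        {
            command.Parameters.AddWithValue("@Id", studentId);
            var table = new System.Data.DataTable();
            new System.Data.SqlClient.SqlDataAdapter(command).Fill(table);
            return table;
        }
    }
}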


4.30. What is the Business Logic Layer?


I know for a fact that this is a question for most, but on the other hand, by reading many articles I have become aware that not everyone agrees on what business logic actually is. In many cases it is just the bridge between the presentation layer and the data access layer, with nothing much to do except take from one and pass to the other. In some other cases it is not even well thought out; people just take the leftovers from the presentation layer and the data access layer and put them in another layer which is automatically called the business logic layer. However, there are no god-given rules that cannot be changed in the software world. You can change things as and when you feel comfortable that the method you apply is flexible enough to support the growth of your system. There are many great ways, but be careful when selecting them; they can over-complicate a simple system. It is a balance one needs to find with experience.

As a general piece of advice, when you define business entities you must decide how to map the data in your tables to correctly defined business entities. The business entities should be defined meaningfully, considering the various types of requirements and the functioning of your system. It is recommended to identify business entities that encapsulate the functional/UI (User Interface) requirements of your application, rather than defining a separate business entity for each table of your database. For example, if you want to combine data from a couple of tables to build a UI (User Interface) control (Web Control), implement that function in the Business Logic Layer with a business object that uses a couple of data objects to support your complex business requirement.


4.31. What is Gang of Four (GoF) Design Patterns?


The Gang of Four (GoF) patterns are generally considered the foundation for all other patterns. They are categorized in three groups: Creational, Structural, and Behavioral. Here you will find information on these important patterns.

Creational Patterns
  • Abstract Factory - Creates an instance of several families of classes
  • Builder - Separates object construction from its representation
  • Factory Method - Creates an instance of several derived classes
  • Prototype - A fully initialized instance to be copied or cloned
  • Singleton - A class of which only a single instance can exist
Structural Patterns
  • Adapter - Match interfaces of different classes
  • Bridge - Separates an object’s interface from its implementation
  • Composite - A tree structure of simple and composite objects
  • Decorator - Add responsibilities to objects dynamically
  • Facade - A single class that represents an entire subsystem
  • Flyweight - A fine-grained instance used for efficient sharing
  • Proxy - An object representing another object
Behavioral Patterns
  • Chain of Responsibility - A way of passing a request between a chain of objects
  • Command - Encapsulate a command request as an object
  • Interpreter - A way to include language elements in a program
  • Iterator - Sequentially access the elements of a collection
  • Mediator - Defines simplified communication between classes
  • Memento - Capture and restore an object's internal state
  • Observer - A way of notifying change to a number of classes
  • State - Alter an object's behavior when its state changes
  • Strategy - Encapsulates an algorithm inside a class
  • Template Method - Defer the exact steps of an algorithm to a subclass
  • Visitor - Defines a new operation to a class without change

4.32. What is the difference between Abstract Factory and Builder design patterns?


The two design patterns are fundamentally different. However, when you learn them for the first time, you will see a confusing similarity, which makes them harder to understand. If that early confusion sticks, you may end up afraid of design patterns altogether. It is like an infant phobia: once you become afraid at an early age, it stays with you forever, and the result is that you never look at design patterns again. Let me see whether I can solve this brain teaser for you.


In the image below, you have both design patterns listed. I am comparing the two one-on-one to identify the similarities. If you observe the figure carefully, you will see an easily understandable color pattern (the same color is used to mark classes that are of a similar kind).


Attached Image: Factory.jpg


Please follow up with the numbers in the image when reading the listing below.


Mark #1: Both patterns use a generic class as the entry class. The only difference is the name of the class: one pattern names it “Client”, while the other names it “Director”.
Mark #2: Here again the difference is the class name. It is “AbstractFactory” for one and “Builder” for the other. Additionally, both classes are abstract.
Mark #3: Once again both patterns define two concrete (WindowsFactory & ConcreteBuilder) classes. They have both been created by inheriting their respective abstract class.
Mark #4: Finally, both seem to produce some kind of a generic output.


Now, where are we? Don’t they look almost identical? So then why do we have two different patterns here?


Let’s compare the two again side by side one last time, but this time focusing on the differences.

  • Abstract Factory: Emphasizes a family of product objects (either simple or complex)
  • Builder: Focuses on constructing a complex object step by step
  • Abstract Factory: Focuses on *what* is made
  • Builder: Focuses on *how* it is made
  • Abstract Factory: Focuses on defining many different types of *factories* to build many *products*; it is not one builder for just one product
  • Builder: Focuses on building one complex but single *product*
  • Abstract Factory: Defers the choice of what concrete type of object to make until run time
  • Builder: Hides the logic/operation of how to assemble that complex object
  • Abstract Factory: *Every* method call creates and returns a different object
  • Builder: Only the *last* method call returns the object, while the other calls partially build the object

Sometimes creational patterns are complementary, so you can combine one or many patterns when you design your system. As an example, a builder can use one of the other patterns to decide which components get built, and in another case Abstract Factory, Builder, and Prototype can use Singleton in their implementations. So the conclusion is that the two design patterns exist to resolve two types of business problems, and even though they look similar, they are not.


I hope that this sheds some light on the puzzle. If you still don’t understand it, then this time it is not you; it has to be me, since I don’t know how to explain it any better.


5. What is the Conclusion?


I don't think it is realistic to try to make a programming language be everything to everybody. The language becomes bloated, hard to learn, and hard to read if everything plus the kitchen sink is thrown in. In other words, every language has its limitations. As system architects and designers we should be able to fully, and more importantly correctly (this also means that you shouldn’t use a ballistic missile to kill a fly or hire the FBI to catch the fly), utilize the available tools and features to build usable, sustainable, maintainable and, very importantly, expandable software systems that fully utilize the features of the language to bring a competitively advanced system to the customer. In order to do that, the foundation of a system plays a vital role. The design or the architecture of a software system is that foundation. It holds the system together, hence designing a system properly (this never means *over*-designing) is the key to success. When you talk about designing a software system, the correct handling of the OOP concepts is very important. I have made the above article rich with ideas but still kept it short, so that one can learn/revise all the important concepts at a glance. I hope you all enjoy reading it.


Finally, after reading all this, one may argue that anybody can write up all these concept definitions, but do I know how and when to apply them in real-world systems? For those who want to see these concepts being applied in real-world systems, please check the source code of my latest open-source project, named Rocket Framework.


Note: For newbies, Rocket Framework is going to be a little too advanced, but check it out, use it and review it. If you have any questions/criticisms about my design, don't hesitate to raise them here or there.


6. What I Referred?


7. History

  • 28/01/2008
    • Added more content based on design principles.
    • Added History section
  • 04/02/2008
    • Added more details to explain Composition.
  • 15/04/2008
    • Added comparison between Abstract Factory and Builder
    • Updated some of the wording slightly
  • 31/05/2010
    • Corrected the 'Composition' related description, as pointed out here
  • 26/01/2011
    • Conclusion updated and a link added to Rocket-Framework
Reposted from Code Project with author's permission

What Language Do I Use?

Like operating systems, software office suites, and computers themselves, there exist a large variety of computer languages. And the reason for such variety is the same as the reason for variety anywhere else -- because there is not a single solution that solves all problems. Some languages are better at raw speed. Some languages make it easier to write crash-resistant code. Some languages are very good at parsing strings of text and work effectively on a server. Some languages have very large corporate investment. And some languages still exist because they are compatible with large amounts of existing code that is impractical to rewrite.

Your choice of language will affect the rest of your project, and it's impossible to change languages in the middle of a project without a complete (or at least very extensive) rewrite, so it is not a choice you should make lightly. It is also not a choice that you should allow to be colored by your own personal preferences or the urging of friends. Your choice of computer language for your project should be well-researched and pragmatic. What counts most is the quality of your results and not that the language be worthy of your programming skills.

This article will cover some of the languages that are popular with game programmers. This list is neither complete nor deep. This article is intended to give you a bird's eye view of the most popular game development languages out there along with a short overview and a few situations where they would be a good or a poor choice for a project.

One final note before you read the list. This list does include plenty of terminology that might not be familiar to you as a beginner, and there simply isn't enough space to define everything. It is recommended that you keep Wikipedia handy for the terminology with which you are not yet familiar.

Note:  
This article was originally published to GameDev.net back in 2000. It was revised by the original author in 2008 and published in the book Beginning Game Programming: A GameDev.net Collection, which is one of 4 books collecting both popular GameDev.net articles and new original content in print format.


Note:  
We'd like to update this document once again from its 2008 version, so please use the comments to add new language sections and we will place them in the article. Additions and updates to current sections are welcome as well!


C


The C Programming Language is either the direct parent of or a heavy influence on every other language discussed in this article. While it was itself derived from a couple of other languages that have fallen into disuse, C is now considered to be one of the "root" languages of computer science. Some other languages that predate C (COBOL, FORTRAN, Lisp, Smalltalk) are still in use today and owe nothing to C, but just about every language that's been produced since the 1980s owes at least some of its syntax to C.

C is a classic "brace language", which is to say that it's a structured goto-less language that uses the open and closed curly-braces to group statements together. It's a structure that's repeated in many of the languages that follow (C++, Java, C#, ActionScript, PHP). One advantage of the endlessly-imitated structure of C, braces and otherwise, is that once you understand how things are done in C, you can carry them over to the other languages with almost no changes. The if(), while(), and for() statements in C and PHP, for example, are basically identical. For that very fact alone, it is recommended that you familiarize yourself with C syntax, as it is something that you can keep with you.

Advantages: C is good for writing small fast programs. It is easy to interface with assembly language. The language itself as well as the library routines are standardized, so moving programs to other platforms can be a straightforward process if you plan ahead.

Disadvantages: The ability to write small programs quickly also works against C, as C does not support object oriented programming, which is a method of structuring your code that's better suited to large programs with distributed development, and large C programs can grow disorganized easily. While many very large projects have been written in C (Unix, Windows, Oracle), managing a large C-based project requires more discipline than in languages that are built around more modularity.

Portability: While the core of the language itself and the ISO function calls are very portable, these calls are limited to control flow, simple memory management and simple file handling. Modern user-interface constructs, like menus, buttons, dialog boxes, etc., are not portable between platforms, so you will need to either write your code to work with a third-party UI toolkit or plan to write your user-interface twice.

While the language was ahead of its time when it was created, the C library is showing its age. Many non-trivial library functions, like memory management and strings, have a simplistic syntax that has been significantly tuned up in the language's successors.

While C has been redesigned and re-standardized several times to keep up with the times, some of the syntax is counterintuitive by necessity. Some of the quirkier language constructs remain for the sake of compatibility with existing code.

Suitability for Beginners: Not very good. While the core of C is fairly compact and easy to grasp, many of C's library calls are antiquated and are easier handled in some of its successor languages.

Resources: While Kernighan and Ritchie's The C Programming Language is the "classic" book on the subject, the book does cover topics fairly quickly, and it might be too quick for a rank beginner at programming. Other well-recommended books include C How to Program, C Programming: A Modern Approach, and C Primer Plus

C++


C++ is C's most established "child" language. It was designed in the 1980's as an extended version of C with support for "classes", which are abstract data structures that aggregate primitive data types and algorithms into something that is better able to model real-world (or in the case of games, simulated-world) objects. C++ classes also support the concept of "data hiding" in which you can hide the underlying implementation of an object from the rest of your program. While this method seems a bit inscrutable, it is extremely useful when programming in a team environment. It allows you to agree on how the object's interface is intended to work without regard as to how the object works internally. It's a bit like saying "I'm giving you a job, and I don't care how you do it as long as it gets done, and the result looks the way I want".

Advantages: Supports the object-oriented (OO) paradigm very completely, which is much better than C for supporting large projects. Unlike C, it contains a very well-designed library of common data structures and algorithms.

Disadvantages: The syntax of C++ has grown larger and more complicated with each iteration, and the language is absolutely byzantine now. The syntax lends itself very easily to abuse and, while the language does support team programming very well, its huge and deep syntax can make code difficult to read.

Portability: Despite its roots in C, C++ has better portability than C. This is because most modern portability toolkits are implemented as C++ object libraries rather than the old-style C function libraries. In addition, C++'s standard library and very useful Boost library is very standardized and cross-platform despite the complexity of both.

Suitability for Beginners: While the memory management and I/O operations in C++ are significantly easier to understand than C, C++ has a pretty high learning curve just from its sheer size. Thankfully, one doesn't have to learn the entire language to be productive with it.

Resources: A perfect beginner's book for C++ is C++: A Dialog by Steve Heller. While the book hasn't seen an update in a while, it doesn't really need to as the C++ language doesn't change often. It's very well-paced and approachable, and it's perfect for beginning programmers.

For a more comprehensive approach, Bruce Eckel's monumental two-volume Thinking in C++ series will tell you all you need to know about C++ in only 1,600 pages.

As an added bonus, these books are available for free download. Google for them, and you should have no trouble finding the official sites.

C or C++: Short of "DirectX or OpenGL", this is one of the most-asked questions when getting ready to learn the language. C++ is, with a couple of very minor exceptions, a proper superset of C. That means that all that C does, C++ does the same. Every C++ compiler will also compile C, and C-only compilers are very hard to find nowadays. While those facts might make it seem logical to start with C and then learn the "rest" of the language later, it is a better idea to learn classes and OO programming early on rather than having to "unlearn" techniques that don't model the environment (real or simulated) very well.

Assembly Language


There are two things that you must know about assembly language.

  1. The name of the language is "assembly". The name of the tool that converts assembly language into low-level machine code is called an "assembler". It's a common mistake, even among experienced programmers, to call the language "assembler". So please start out on the right foot by calling the language by its proper name.
  2. Assembly language is, by definition, the smallest and fastest language in this article. Since it is not a high-level language but is actually a mnemonic representation of your CPU's instruction set, there is nothing written in a higher level language that can't be done faster in assembly.

And given fact number two above, you might think your search is over. After all, if a language is necessarily the smallest and fastest of the bunch, why not use it? In fact, why do other people bother with C or C++ or anything else when you can write your code in assembly and get the best results by definition?

That's because the term "best code" doesn't just refer to the raw speed and size of your program. There is the quality of readability, as you might need to hand some of your code over to a colleague so he can work on it. There is the quality of portability, as you might need to move your code to another operating system or hardware architecture. There is the quality of maintainability, which is the need to easily fix problems at the close of the project. There is the quality of abstraction, in which you can write code in terms of moving a character down a hallway rather than manipulating numbers in locations in memory.

And in all of those factors, assembly language comes in dead last. Assembly language is very difficult to read and maintain. Unless meticulously commented, it is of little use to anyone else inheriting the code. Fixing bugs and extending the existing code is difficult at best. You have to keep a constant eye on portability, lest you end up writing code that won't even run on different processor models by the same manufacturer. And the closest you'll get to "move the alien ten pixels to the left" are some register updates followed by several instructions to call the bitmap-display function.

In practice, assembly is almost never used for complete games. Assembly, when it is used, is used in parts of programs that do a lot of calculations and are called a lot of times. Sometimes whittling a few machine-instructions out of a function that is called millions of times can provide enough of a benefit to make the extra work worthwhile. But this is a process that isn't undertaken from the beginning of a project. It is something done in early testing, after determining where the programming bottlenecks actually are.

Assembly language is not for the faint of heart, and if you're reading a book-chapter to try to figure out what language to use, then you should probably look elsewhere.

Advantages: Is the fastest and most compact way to go if you know what you are doing.

Disadvantages: If you are reading this, you probably don't know what you are doing. Prepare to spend a long time learning a million little tricks to shave off processor ticks here and there.

Portability: Worse than bad. Unless you are programming for the baseline processor, your programs might not even run on other "compatible" processors. For example, some special instructions on AMD processors are not available on Intel and vice versa.

Suitability for Beginners: Run away.

Resources: Assembly Language for Intel-Based Computers is well-recommended if you intend to write for Intel processors. If not Intel, check out the makers of your target CPU for technical resources.

Java


Java is, for practical purposes, the first "post web" language. While some languages like Perl suddenly found their string-handling capabilities to be a natural for retrieving values and sending display-able HTML to web browsers, Java initially found its footing in the browser itself, first in the very interesting but monumentally quirky HotJava browser (written in Java itself), and later in the form of extensions for existing browsers.

Java, while on its face structured similarly to C and C++, behaves very differently on the "back end". The Java compiler, rather than compiling Java source code to native machine-code, compiles to "bytecode" that is run by a Virtual Machine or VM. The Java bytecode is an assembly language of sorts, but it is not an assembly language that is married to a particular processor. The Java VM, which is actually a runtime interpreter of bytecode, will then interpret the bytecode for your machine's target processor. The advantage of this approach is that Java bytecode can be moved from machine to machine and executed without changes provided that the target machine has a compatible Java VM or, as Java promised, "write once, run everywhere". The disadvantage of this approach is that Java bytecode is not native machine code, and while technologies such as "just in time" compilers can improve VM performance, the fact is that you're doing some level of interpretation at runtime, and that does entail a minor, but measurable performance hit.

The other disadvantage is that realities of Java have not lived up to the language's early promises. While the idea of executing games inside web-pages captured everyone's hearts almost immediately, the reality quickly set in that Java VM's aren't as compatible with each other as they should be, and a Java application or applet written on one machine using a particular VM may or may not run nicely on another machine with another VM version. "Write once, run everywhere" was snarkily renamed "write once, port everywhere", which is to say that once you finished writing your Java code and it's running beautifully on one platform, you then had the non-trivial task of making sure the application will actually run well and look nice on all systems.

The third disadvantage of Java came with its GUI. While the first "pass" at making a Java GUI used native OS controls (buttons, scroll bars, etc) and was reasonably small and fast, it wasn't very deep. The next pass, Swing, looked better but performed worse and was entirely different from the original controls. And, worst of all, Sun (Java's parent) was slow to add OS features that had been in existence in the underlying OS for years, like support for ClearType font rendering. Hence, Java applications always seemed to look a few versions away from state-of-the-art.

There is one place, though, where Java took a good hold and Java's advantages outweighed its disadvantages, and that was in server programming. One big advantage of a VM is that since it's not an actual processor but just a simulation of one, crashing the VM isn't much of an issue. If you do manage to completely confuse the Java VM, that doesn't really affect the parent operating system, and you can close and restart a session without having to reboot the entire machine. Couple that with the fact that Java's memory management scheme is a generation evolved from that in C++ and C, and suddenly problems like allocating memory without releasing it back to the system became much less of a problem. And a system like this is perfect for a server environment. A server can pop up and kill VM's as necessary without affecting the underlying OS. Also, the GUI problems don't really apply, as it doesn't matter if your server software doesn't look spectacular unless you just want to impress server-admins. Today you'll find many commercial Massively Multiplayer games that use Java on the server side. A good example would be the multiplayer games by Three Rings that are fully Java on the client as well as the server-side.

Another place where Java has very strongly caught on is in the mobile phone market. J2ME (Java 2, Micro Edition) is a "miniature" version of the Java VM with a significantly truncated class library which is designed to run on mobile phones and other small devices. In fact, if you include the mobile phone demographic, Java is one of the most popular platforms in existence.

Advantages: Java's Virtual Machine coupled with its memory management and automatic collection of no-longer-needed memory allows you to make software that is very robust and crash-resistant. It also has a strong tradition of extensive documentation[1].

Disadvantages: Java's "write once, run everywhere" promise wasn't fulfilled. The Java class libraries have been rewritten multiple times without removing old calls, so while the libraries are very backward-compatible with old code, there seem to be three ways of doing everything, all but one of which are discouraged as being "obsolete".

Portability: Fairly good, but not as good as it should have been. Making a portable application that uses the underlying OS's latest features is almost as difficult to do in Java as in C++.

Suitability for Beginners: Reasonably good. While figuring out the "right" way to do things without bumping into a deprecated object is a bit of a pain for beginners wading through the language, the core of the language itself is well-designed and easy to understand. Also Java is a standard language for many university courses.

Resources: Oracle Inc., the Java authority, has plenty of great resources for Java programmers.

.NET Languages (specifically C# and Visual Basic)


.NET (pronounced "dot net") is basically Microsoft's answer to the Java VM. Actually, .NET is the name for the overarching technology. The actual VM's name is CLR (common language runtime), and everything that was said earlier about the Java VM also applies to the CLR with one significant exception: the CLR was designed from the ground up to not be "married" to a single language, as Java was. And because of this, there are a whole host of languages that use the CLR to do the back-end processing. Everything from ancient legacy languages like COBOL and FORTRAN to modern languages like Python can target the CLR. Mind you, some of the CLR projects out there are little one-man projects, so don't get too excited if you find your favorite language in a CLR version, as some of the compilers are far from mature.

C# and Visual Basic, both developed by Microsoft, are the most popular CLR-based languages. C# is a language clearly derived from Java, and it shares about 90% of Java's syntax despite sounding more like something derived from C or C++. C# does have several nice language extensions that Java has been slow to add as well as a completely rewritten class library.

Visual Basic, which was briefly renamed VB.NET, is a CLR implementation and replacement for Microsoft's established and popular Visual Basic environment. While it is still called "Basic" and is no longer in all-caps, it bears very little resemblance to the old BASIC interpreters that were burned into the ROM of just about every computer sold in the 1980's. The syntax is now structured similarly to the other languages in this list, although it still doesn't use braces to group statements. It also uses the more object-oriented "dot notation" to call functions rather than the large function library of the pre-CLR versions of the language.

Advantages: While Java does have a couple of minor efforts to compile languages to the Java VM, the CLR is designed from the ground-up to support this. Hence, there are several CLR-based languages, and it's relatively easy to get them to communicate with each other.

The .NET technologies are very well-supported by Microsoft's Visual Studio environment, which is a very mature and feature-rich development environment.

C# is the premier programming language for Microsoft's XNA technology, which is a method of making games that are portable between Windows and the Xbox 360 game console.

Disadvantages: Unlike Java, CLR applications can't run as applets within web pages. While the "Silverlight" technology does allow this, it's fairly late to the game and is not entrenched in browsers the way that Java and Flash are. Silverlight has also been discontinued after release 5.

CLR-based applications are much less portable than they should be.

Portability: While there are third-party efforts to port the CLR to operating systems other than Windows, the efforts in that direction are significantly smaller than the work being done in Windows. So while you might be able to create a very robust .NET application for Windows, your Mac and Linux efforts won't be nearly as smooth.

Suitability for Beginners: Good on both counts (C# and Visual Basic). Both languages are straightforward and easy to understand. In addition, their tight integration with the Visual Studio environment makes setup fairly easy.

Resources: Microsoft.com

Flash and ActionScript


Flash is a bit of an unusual member of this list, as its roots do not exist with the language itself but with an animation tool. In the 1990's, a couple of developers were dismayed at the size required to display animated graphics on web pages and the method by which they were displayed, so they developed a browser plug-in called "FutureSplash" as well as a drawing and animation tool that could create very compact vector-based animations. Macromedia, already a player in the interactive web-business with their Shockwave plug-in that could play content from their Director animation tool, purchased FutureSplash, renamed it "Flash", and proceeded to take the browser animation market by storm.

A few versions later, Macromedia added a subset of JavaScript (discussed later), dubbed "ActionScript" to the plug-in, and Flash became a fully fledged programming environment, although it did take a few versions for the language as well as the development tool to "grow up" into a first-class environment useful for content other than simple web-based games. Today, Flash, now owned by Adobe, is based on ActionScript 3, which is a full implementation of the ECMAScript standard and is as capable as anything on this list.

A few years ago, Adobe introduced a tool called "Flex" which was an attempt to "grow up" Flash into something more suitable for building browser-based user interfaces and RIA (Rich Internet Application) content rather than just animations and games. While not a replacement for Flash, Flex is better suited for building user interfaces, as it is an XML-based programming language with rich UI support rather than an animation tool with a built-in programming language.

Along with the most recent version of Flex, Adobe introduced a product called AIR which decoupled Flash content from the browser. Using AIR and some newly-created objects intended to give Flash content access to more machine resources than the browser plug-in, you can now create first-class executables out of Flash (as well as JavaScript, HTML, and PDF) that run cross-platform.

Advantages: Flash's integrated drawing and programming tools make programming web-based games absurdly easy. The Flash environment, while not having the pedigree of Visual Studio, is extremely feature-rich.

Disadvantages: Flash, while a great environment for one person, does not support team programming very well.

While the Flex compiler is free, the very nice FlexBuilder Flex content-creation tool is not.

Unlike the other languages on this list, ActionScript is a client-only technology. While some kind of server-side ActionScript interpreter might prevent you from having to learn a separate server language, such a thing does not exist.

Portability: Flash runtime players are available on Windows, Mac, Linux, several breeds of mobile phone, and some game consoles. Not all devices support the same version of Flash, though, so you will need to learn the capabilities of each version and what you want to target before you get started.

Suitability for Beginners: Excellent, especially with its aggregation of drawing and animation tools, although such ease of building in the Flash environment will not carry over to other languages. A technique or build tool used in C++, for example, will likely have an equivalent in Java and vice versa, but the Flash development environment is unlike all others.

Resources: Just like Microsoft is one-stop-shopping for all of your .NET needs, Adobe is the place to go for Flash.

Python


Python, unlike the previously-mentioned languages, did not start out as a large corporate or university project. It was much more of a grassroots effort among university students and, later, among people in the industry who liked the language's structure as well as its lack of legacy features. The language itself is fairly compact and is easy to use. It is also easy to embed a Python language interpreter into existing projects, which is why you'll see Python as an embedded scripting language in many games.

Python also functions well as a server language as it features many of the server-friendly attributes of Java. In fact, Python compilers exist that can compile Python code to both the Java VM and Microsoft's CLR. Many services that you would recognize (YouTube, Google, and Yahoo) use Python extensively for back-end processing.

Python is also becoming quite popular in the gaming community with the user-supported PyGame library. PyGame is an object library that abstracts the well-established SDL cross-platform graphics library into something friendly and easy-to-use from Python. Several impressive arcade games have been written entirely in Python.

Advantages: Free and open-source. Very dedicated user community. Integrated fully into Google AppEngine, which is Google's "pay to play" processing server.

Disadvantages: Virtually everything is handled not by a large corporation but by its user community, so it might be a hard sell to get a company to sign off on a Python-based project, although some very big players are now invested heavily in Python, so it now looks like much less of a "hobby language" than it used to be.

Portability: Pretty good. Most of the third-party libraries made for Python are built around portable technologies like SDL and OpenGL, so it's not too difficult to write something in Python that will run on several platforms.

Suitability for Beginners: The Python language has an easy-to-follow syntax and is easy to learn. In addition, there are several good community-written tutorials out there.

Resources: Python.org is a well-organized home for all things Python. It is also the home of many active community forums where you can get your questions answered.

Server languages


While many of the languages mentioned above will work nicely on the server, some of the technologies around which they were built are rather archaic, and much smoother easier-to-use solutions are now available. The original standard for writing a custom back-end for web-pages was called CGI, and it was a simple standard for running a customized executable on the server-side, passing it data from the page that called it, and collecting text output and returning it to the user. And while it was simple to implement on the server-side, it really doesn't model the interactive experience as the web is used today, and CGI-based solutions for interactive web-applications can be clunky at best.

PHP

PHP was one of the first true "embedded" scripting languages for the web, and in many ways it revolutionized the way that pages are scripted. While PHP can operate as a scripting language that takes input from a web form, processes it, and returns output, its real strength is its usage as a hypertext preprocessor. PHP, once configured to work as a preprocessor for a server, can process code that's embedded in the page itself. So rather than writing a piece of standalone code that will, for example, print out the current date in a page, you can just embed the PHP code that prints the date directly into your web page, and the code will be quietly replaced with the resulting text as it is sent to a browser. PHP also added a library of native commands to communicate with the free and powerful MySQL database, making storing and retrieving persistent data easy.

Another thing that made PHP instantly popular was the price. It was free, thus cementing it in the popular server configuration known as LAMP (Linux, Apache, MySQL, and PHP). The combination of these four technologies gave beginning or low-budget web designers an easy-to-use, scalable, and very powerful web setup for free. And as a bonus, it could run on low-cost hardware. And this fact was not lost on web developers. Today there are a large number of free or low price PHP scripts to perform just about any task, from simple user databases to complete "website in a box" setups.

ASP.NET

Not to be outdone, Microsoft quickly put together their own PHP-esque configuration based entirely on Microsoft's technologies, namely Windows, Internet Information Server (Microsoft's web server), CLR, and SQL Server. While far from free and not in PHP's league in the breadth of third-party scripts available, ASP.NET does have the advantage of all the tech support money can buy as well as support for languages other than PHP. If you're familiar with Visual Basic on the client side, for example, you can use it on the server-side with ASP.NET.

But if you're working within a budget, a LAMP setup will likely be a better fit.

Ruby on Rails

Ruby on Rails (often just called "Rails") is not itself a programming language but is a class library built with the Ruby programming language. While Ruby itself is a not-very-revolutionary object-oriented scripting language that owes its heritage to both Python and Perl, it is the Rails library that makes the system revolutionary. The Rails library fully integrates the MVC (Model View Controller) paradigm and is designed to require as little repetition of technique as possible. This allows you to build a fairly rich server-based system with a minimum of code, and the Rails folks will proudly show off web forums and social networking sites that have been written in an absurdly small amount of code.

Like PHP, Ruby on Rails is free.

Other Languages That Are Worth Mention


The sections above cover most of the languages that have a pedigree in the game development world, both on the client and server. That is to say that large scale or popular games have been written using these languages. There are, however, a couple of interesting technologies out there that, while not yet established as first-class languages for games, show lots of promise. It would not be surprising to see these languages score some major projects in the future.

JavaScript

JavaScript received brief mention in the section about Flash and ActionScript. JavaScript has the same root as ActionScript, the ECMAScript standard, and the languages resemble each other quite a bit. JavaScript first gained popularity as a language for scripting web pages, and today it's ubiquitous as the language used to nudge and squeeze and stretch and convince web browsers to display web content exactly the way you want. If you've ever been annoyed by a web page that resizes itself to fit the page's content, you can rest assured that JavaScript was behind that.

But JavaScript is useful for more than just web annoyances. It's becoming a popular language for writing all of those interesting and mostly-useless widgets that are sticking on desktops everywhere. Even better, the language is robust enough to write complete games, running inside the browser or standalone if used with a widget framework.

There are two primary problems that are preventing JavaScript from becoming more popular as a language for web games (compared to Flash, for example). One is that, unlike the Flash plug-in, JavaScript's interpreter is dependent on the maker of the browser. And while the language is based on the very complete ECMA standard, the JavaScript implementations used by browsers differ both in language features and performance. Hence it is difficult to write a truly large and robust JavaScript application that works in all browsers.

The second problem is one of protecting your intellectual property. JavaScript is not compiled and is streamed to browsers as source code, which is then interpreted by the browser itself. And while there are some clever tools that attempt to obfuscate and hide your code from a user, the source code to your game is never very far away from the "View Source" command of your browser. If you don't want your game to be borrowed/improved/stolen, you should first take a hard look at the security solutions available.

D

D is a sort of "unofficial" grandchild of C, C++, and Java. It is the brainchild of Walter Bright, who is one of the pioneers of C and C++ compiler construction on PCs. Growing frustrated with the ever-growing class libraries as well as the fanatical need for backward compatibility, Mr. Bright decided to build something from the ground-up that took the best features of C, C++, and Java while jettisoning anything that didn't have a very good reason for existing. And the result was a language that was tighter and easier to learn than its "parents" without sacrificing important features or runtime speed. D is what happens when you take language design away from the realms of the committee.

While D does jettison backward compatibility in the name of simplicity, it does have excellent methods for communicating with C code, so if you have a third-party library or some C source code that you are loath to rewrite, you can still talk to it without much difficulty.

That is to say that D would be the best of all worlds if it gained more support. It simply doesn't yet have the huge libraries of code, wealth of tools, and base of user support of the other languages. Hopefully its support will grow and it will receive the attention it deserves, but that will take some time.

Conclusion


Early on it was clear that this article would not reach a satisfactory conclusion as to what language to use. Fact is, there's very rarely a single solution that will solve all problems. Hopefully this list will at least whittle your choices down to two or three good candidates. The rest of the research is up to you.

Thankfully, virtually every solution mentioned above has a free implementation, so you can try out these languages and choose the one that you think will best suit your project.

Embedding Math Equations in Articles


Introduction


It's no secret that we at Gamedev.net have always been pretty heavy on the programming side of game development. The purpose of this article is to share some quick information on how to beef up your articles with all sorts of fancy pants math equations and formulas that will help others to understand your article topic better.

Background


First we want to give special thanks to Tim Bright for pointing out MathJax to us. MathJax is a JavaScript-based display engine for LaTeX and MathML that works in all browsers. We currently use the standard configuration of MathJax, so all examples will closely follow the documentation on their site.

Tutorial


Most of this article is based on the TeX samples located at http://www.mathjax.org/demos/tex-samples/. The key to using formulas inside of your articles is to wrap any LaTeX or MathML formula in one of two special wrappers.

For multiline formulas use this:
\[ multiline formula goes here \]

For inline formulas and equations such as \(\sqrt{3x-1}+(1+x)^2\) you can use the following:
\( inline formula goes here \)


Quick Demo


The Lorenz Equations

\[\begin{aligned}
\dot{x} & = \sigma(y-x) \\
\dot{y} & = \rho x - y - xz \\
\dot{z} & = -\beta z + xy
\end{aligned} \]

The Cauchy-Schwarz Inequality

\[ \left( \sum_{k=1}^n a_k b_k \right)^2 \leq \left( \sum_{k=1}^n a_k^2 \right) \left( \sum_{k=1}^n b_k^2 \right) \]

A Cross Product Formula

\[\mathbf{V}_1 \times \mathbf{V}_2 = \begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\frac{\partial X}{\partial u} & \frac{\partial Y}{\partial u} & 0 \\
\frac{\partial X}{\partial v} & \frac{\partial Y}{\partial v} & 0
\end{vmatrix} \]

The probability of getting \(k\) heads when flipping \(n\) coins is

\[P(E) = {n \choose k} p^k (1-p)^{ n-k} \]

An Identity of Ramanujan

\[ \frac{1}{\Bigl(\sqrt{\phi \sqrt{5}}-\phi\Bigr) e^{\frac25 \pi}} =
1+\frac{e^{-2\pi}} {1+\frac{e^{-4\pi}} {1+\frac{e^{-6\pi}}
{1+\frac{e^{-8\pi}} {1+\ldots} } } } \]

A Rogers-Ramanujan Identity

\[ 1 + \frac{q^2}{(1-q)}+\frac{q^6}{(1-q)(1-q^2)}+\cdots =
\prod_{j=0}^{\infty}\frac{1}{(1-q^{5j+2})(1-q^{5j+3})},
\quad\quad \text{for $|q|<1$}. \]

Maxwell’s Equations

\[ \begin{aligned}
\nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} & = \frac{4\pi}{c}\vec{\mathbf{j}} \\ \nabla \cdot \vec{\mathbf{E}} & = 4 \pi \rho \\
\nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t} & = \vec{\mathbf{0}} \\
\nabla \cdot \vec{\mathbf{B}} & = 0 \end{aligned}
\]

Demo Source


<b>The Lorenz Equations</b>

\[\begin{aligned}
\dot{x} & = \sigma(y-x) \\
\dot{y} & = \rho x - y - xz \\
\dot{z} & = -\beta z + xy
\end{aligned} \]

<b>The Cauchy-Schwarz Inequality</b>

\[ \left( \sum_{k=1}^n a_k b_k \right)^2 \leq \left( \sum_{k=1}^n a_k^2 \right) \left( \sum_{k=1}^n b_k^2 \right) \]

<b>A Cross Product Formula</b>

\[\mathbf{V}_1 \times \mathbf{V}_2 =  \begin{vmatrix}
\mathbf{i} & \mathbf{j} & \mathbf{k} \\
\frac{\partial X}{\partial u} &  \frac{\partial Y}{\partial u} & 0 \\
\frac{\partial X}{\partial v} &  \frac{\partial Y}{\partial v} & 0
\end{vmatrix}  \]

<b>The probability of getting \(k\) heads when flipping \(n\) coins is</b>

\[P(E)   = {n \choose k} p^k (1-p)^{ n-k} \]

<b>An Identity of Ramanujan</b>

\[ \frac{1}{\Bigl(\sqrt{\phi \sqrt{5}}-\phi\Bigr) e^{\frac25 \pi}} =
1+\frac{e^{-2\pi}} {1+\frac{e^{-4\pi}} {1+\frac{e^{-6\pi}}
{1+\frac{e^{-8\pi}} {1+\ldots} } } } \]

<b>A Rogers-Ramanujan Identity</b>

\[  1 +  \frac{q^2}{(1-q)}+\frac{q^6}{(1-q)(1-q^2)}+\cdots =
\prod_{j=0}^{\infty}\frac{1}{(1-q^{5j+2})(1-q^{5j+3})},
\quad\quad \text{for $|q|<1$}. \]

<b>Maxwell’s Equations</b>

\[  \begin{aligned}
\nabla \times \vec{\mathbf{B}} -\, \frac1c\, \frac{\partial\vec{\mathbf{E}}}{\partial t} & = \frac{4\pi}{c}\vec{\mathbf{j}} \\   \nabla \cdot \vec{\mathbf{E}} & = 4 \pi \rho \\
\nabla \times \vec{\mathbf{E}}\, +\, \frac1c\, \frac{\partial\vec{\mathbf{B}}}{\partial t} & = \vec{\mathbf{0}} \\
\nabla \cdot \vec{\mathbf{B}} & = 0 \end{aligned}
\]


Conclusion


We hope this helps to build stronger, better, faster mathematics-based articles. Good luck and let us know if you come up with any issues.

Article Update Log


2 Aug 2013: Initial release

How About UNICODE and UTF-8

When many first learn to program computers, they are often introduced to ASCII with or without knowing it. ASCII stands for "American Standard Code for Information Interchange" and when programmers use it, they are often talking about the character encoding scheme for the English alphabet. If you're using C or C++ and use the char data type to write strings, you're probably using ASCII. ASCII actually only uses 7 bits and is from 0 - 127. (There is an extended ASCII set, but in this article, only the original set will be considered.) This works well when you only want to use Latin letters in your programs, but in our more global world, we need to think about making programs that can display characters from other writing systems such as Korean, Chinese, or Japanese.

UNICODE was developed as a way to encode all of the characters for every language, but when we consider languages like Korean and Chinese, 8 bit characters just aren't enough. Windows programmers may be familiar with UCS-2. UCS-2 is a 16 bit version of UNICODE, and it can encode the values for all of the most common UNICODE characters. In UCS-2, all characters are exactly 16 bits. These days, Windows also supports UTF-16, which uses 16 bit values but allows some characters to be composed of two 16 bit units. This works well on Windows and fits perfectly with the Windows 16 bit wchar_t type. For many who want to support characters from different languages and at the same time support multiple platforms, however, this is not enough.

wchar_t has some disadvantages. For example, wchar_t is 16 bits on Windows, but 32 bits on some other platforms. Also, when using wchar_t, and even with UTF-16 and UTF-32, you have to worry about endianness. UTF-8 can be used as an alternative.

What is UTF8 and how is it encoded?


UTF-8 is a way to encode UNICODE values. From this point forward, the word character in this article will be used to refer to the numeric value of the character in Unicode, which runs from 0 up to a maximum of 1,114,111 (0x10FFFF), with zero usable as a string terminator. UTF-8 is a variable-sized encoding. In UTF-8, characters encode into either 1, 2, 3, or 4 bytes. 1 byte encodings are only for characters from 0 to 127, meaning a 1 byte encoding is equivalent to ASCII. 2 byte encodings are for characters from 128 to 2047. 3 byte encodings are for characters from 2048 to 65535, and 4 byte encodings are for characters from 65536 up to the maximum Unicode value. To understand how the encoding works, we'll need to examine the binary representation of each character's numeric value. To do this easily, I'll also use hexadecimal notation, as one hexadecimal digit always corresponds to a 4 bit nibble. Here's a quick table.

Attached Image: hextable.png



So 2C (hexadecimal) = (2 X 16) + (12 X 1) = 44 (decimal) and 0010 1100 (binary).
I understand that many may know this already, but I want to make sure new programmers will be able to understand.

The UTF-8 Format


In UTF-8, the high-order bits in binary are important. UTF-8 works by using the leading high-order bits of the first byte to tell how many bytes were used to encode the value. For single byte encodings of values from 0 to 127, the high-order bit will always be zero. Because of this, if the high-order bit is zero, the byte will always be treated as a single byte encoding. Therefore, all single byte encodings have the following form: 0XXX XXXX

7 bits are available to code the number. Here is the format for all of the encodings:

Attached Image: utf8format.png



Once you know the format of UTF-8, converting back and forth between it is fairly simple. To convert to UTF-8, you can easily see if it will encode to 1, 2, 3, or 4 bytes by checking the numerical range. Then copy the bits to the correct location.

Example Conversion


Let's try an example. I'm going to use hexadecimal value 1FACBD for this example. Now, I don't believe this is a real UNICODE character, but it'll help us see how to encode values. The number is greater than FFFF so it will require a 4-byte encoding. Let's see how it'll work. First, here's the value in binary.

Attached Image: utf8convertingnumber.png



This will be a 4-byte encoding so we'll need to use the following format.

Attached Image: utf84charformat.png



Now converting to UTF-8 is as simple as copying the bits from right to left into the correct positions.

Attached Image: utf8conversion.png



Like I said, UTF-8 is a fairly straight-forward format.
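To tie the format back to code, here is a rough C++ sketch of an encoder that follows the bit layout above. The function name and the use of std::string as the output buffer are my own choices for illustration, not part of any standard API:

#include <cstdint>
#include <string>

// Encode one code point (0 .. 0x10FFFF) as UTF-8 and append the bytes to 'out'.
// Returns false for values beyond the encodable range.
bool appendUtf8( std::uint32_t codePoint, std::string& out ) {
    if( codePoint <= 0x7F ) {                     // 1 byte:  0xxxxxxx
        out += static_cast<char>( codePoint );
    } else if( codePoint <= 0x7FF ) {             // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>( 0xC0 | ( codePoint >> 6 ) );
        out += static_cast<char>( 0x80 | ( codePoint & 0x3F ) );
    } else if( codePoint <= 0xFFFF ) {            // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>( 0xE0 | ( codePoint >> 12 ) );
        out += static_cast<char>( 0x80 | ( ( codePoint >> 6 ) & 0x3F ) );
        out += static_cast<char>( 0x80 | ( codePoint & 0x3F ) );
    } else if( codePoint <= 0x10FFFF ) {          // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>( 0xF0 | ( codePoint >> 18 ) );
        out += static_cast<char>( 0x80 | ( ( codePoint >> 12 ) & 0x3F ) );
        out += static_cast<char>( 0x80 | ( ( codePoint >> 6 ) & 0x3F ) );
        out += static_cast<char>( 0x80 | ( codePoint & 0x3F ) );
    } else {
        return false;                             // Beyond the Unicode range.
    }
    return true;
}

Passing 0x2C, for example, appends the single byte 2C, while larger values expand into the multi-byte forms shown in the table.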

Advantages of UTF-8


If you want to support non-Latin characters, UTF-8 has a lot of advantages. Since it codes characters using one byte chunks and since UTF-8 strings will never contain a "null" byte, you can use UTF-8 strings with most traditional null-terminated string processing functions. More and more things are being encoded in UTF-8, especially things that are sent over the Internet. Many web pages are coded in UTF-8, and UTF-8 is often used with XML and JSON. Supporting UTF-8 will allow developers to retrieve text data from other sources without conversions. UTF-8 is also byte oriented and, as long as it is read one byte at a time, you don't have to worry about endianness.
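As a sketch of the byte-at-a-time processing just mentioned, the following hypothetical C++ function walks a UTF-8 string and extracts each character's numeric value. It assumes the input is well-formed UTF-8 and does no validation:

#include <cstdint>
#include <string>
#include <vector>

// Decode a UTF-8 string into individual code point values, one byte at a time.
std::vector<std::uint32_t> decodeUtf8( const std::string& text ) {
    std::vector<std::uint32_t> codePoints;
    for( std::size_t i = 0; i < text.size(); ) {
        unsigned char lead = static_cast<unsigned char>( text[i] );
        std::uint32_t value = 0;
        std::size_t extra = 0;

        if( lead < 0x80 )      { value = lead;        extra = 0; } // 0xxxxxxx
        else if( lead < 0xE0 ) { value = lead & 0x1F; extra = 1; } // 110xxxxx
        else if( lead < 0xF0 ) { value = lead & 0x0F; extra = 2; } // 1110xxxx
        else                   { value = lead & 0x07; extra = 3; } // 11110xxx

        // Each continuation byte contributes its low 6 bits.
        for( std::size_t k = 1; k <= extra; ++k ) {
            value = ( value << 6 ) | ( static_cast<unsigned char>( text[i + k] ) & 0x3F );
        }

        codePoints.push_back( value );
        i += extra + 1;
    }
    return codePoints;
}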

Conclusion


As you can see, converting back and forth between UTF-8 is not difficult. As a programmer who wants to use UNICODE, you have to decide whether it would be better to continue to store things as wide character strings, using UTF-8 only to store things in files, or to use UTF-8 all of the time. If you want to support non-Latin characters, UTF-8 has a lot of advantages. Unless you need to do a lot of string manipulation, you can keep your strings in UTF-8 until you need to display them. Typical string operations like concatenation, copying, and finding sub-strings can be done directly in UTF-8. If you want to parse through all of the characters to show them in a GUI, for example, you can create an iterator to go through each character. (Comment from Aressera.) UTF-8 support is not difficult to implement, and if you're wondering how to add support for non-Latin characters, it's worth considering.

Additional References


ASCII Wiki - http://en.wikipedia.org/wiki/ASCII
UTF-8 Encoding - http://www.fileformat.info/info/unicode/utf8.htm
UTF-8 Wiki - http://en.wikipedia.org/wiki/UTF-8

Article Update Log


4 Aug 2013: Initial Draft
6 Aug 2013: Updated Introductions and Conclusions


This article was originally posted on the Squared'D Blog

D Exceptions and C Callbacks


Introduction


When mixing multiple languages in the same project, there are often some subtle issues that can crop up from their interaction. One of those tricky cases is exception handling. Even handling exceptions across shared libraries implemented in a single language, like C++, can be a problem if different compilers were used to compile the libraries. Since the D Programming Language has exceptions built in, and they are always on, some special care should be taken when interacting with other languages.

In this article, I'm going to explain a specific scenario I encountered using GLFW 3 in D and the solution I came up with. I think this solution can work in many situations in which D is used to create an executable interacting with C libraries. If it's the other way around, where a C executable is using D libraries, this isn't going to be as broadly applicable.

About GLFW


GLFW 3 is a cross-platform library that can be used to create OpenGL applications. It abstracts away window and context creation and system event processing. The latest version has been pared down quite a bit to provide little more than that. It's a small library and does its job well.

Events in GLFW are processed via a number of callback functions. For example, the following C code shows how to handle notification of window close events.

#include <stdio.h>
#include <GLFW/glfw3.h>

// Assume this is opened elsewhere.
FILE *_log;

// Assume this is defined elsewhere; it signals the main loop to exit.
void setExitFlag( void );

static void onWindowClose( GLFWwindow* win ) {
    fputs( "The window is closing!", _log );

    // Tell the main loop to exit
    setExitFlag();
}

void initEventHandlersForThisWindow( GLFWwindow* win ) {
    // Pass the callback to glfw
    glfwSetWindowCloseCallback( win, onWindowClose );
}

All system events are handled in this manner. In C, there's very little to get into trouble with here. There is a warning in the documentation not to call glfwDestroyWindow from a callback, but other than that pretty much anything goes. When doing this in D, the restriction on destroying windows still holds, but that's not all.

The Problem


I've seen code from people using D who misunderstood the meaning of D's extern( C ) linkage attribute. They believed that this restricted them to only using C libraries and/or C constructs inside the function. This is not the case. Any D library calls can be made and any valid D code can be used in these functions. All the attribute does is tell the compiler that the function uses the __cdecl calling convention.

I bring this up because one of the primary use cases of implementing a function in D with extern( C ) linkage is for callbacks to pass to C libraries. To demonstrate, I'll rewrite the above C example in D.

import std.stdio;

// Imaginary module that defines an App class
import mygame.foo.app;

// Assume this is opened elsewhere.
File _log;

// private at module scope makes the function local to the module,
// just as a static function in C. This is going to be passed to
// glfw as a callback, so it must be declared as extern( C ).
private extern( C ) void onWindowClose( GLFWwindow* win ) {

    // Calling a D function
    _log.writeln( "The window is closing!" );

    // Tell the app to exit.
    myApp.stop();
}

void initEventHandlersForThisWindow( GLFWwindow* win ) {
    glfwSetWindowCloseCallback( win, &onWindowClose );
}

This bit of code will work fine, probably very near 100% of the time. But taking a gander at the documentation for std.stdio.File.writeln will reveal the following.

Throws:
Exception if the file is not opened. ErrnoException on an error writing to the file.

Now imagine that there is a write error in the callback and an ErrnoException is thrown. What's going to happen?

GLFW events are processed by calling glfwPollEvents. Then, as events are internally processed, the appropriate callbacks are called by GLFW to handle them. So the sequence goes something like this: DFunc -> glfwPollEvents -> onWindowClose -> return to glfwPollEvents -> return to DFunc.

Now, imagine for a moment that all GLFW is written in D so that every function of that sequence is a D function. If log.writeln throws an ErrnoException, the sequence is going to look like this: DFunc -> glfwPollEvents -> onWindowClose -> propagate exception to glfwPollEvents -> propagate exception to DFunc. That would be great, but it isn't reality. Here's what the sequence really looks like: DFunc -> glfwPollEvents ->onWindowClose -> {} -> return to glfwPollEvents -> return to DFunc. The {} indicates that the exception is never propagated beyond the callback. Once onWindowClose returns, execution returns to a part of the binary that was written in C and was not compiled with any instructions to handle exception propagation. So the exception is essentially dropped and the program will, with luck, continue as if it were never thrown at all. I'm told that on Linux, D exceptions can sometimes be propagated through the C side, but it can leave things in an undefined state.

The result of not handling an exception can be unpredictable. Sometimes, it's harmless. In this particular example, it could be that nothing is ever written to the same log object again, or maybe the next call to _log.writeln succeeds, or maybe it fails again but happens in D code where the exception can be propagated. In my own tests using callbacks that do nothing but throw exceptions, no harm is ever done. But it's not a perfect world. Sometimes exceptions are thrown at a certain point in a function call, or as a result of a certain failure, that causes the application to be in an invalid state. This can, sooner or later, cause crashes, unexpected behavior, and hard to find bugs. For a program to be more robust, exceptions thrown in D from C callbacks ought to be handled somehow.

A Solution


I'm convinced that there's a genetic mutation we programmers have that leads us all to believe we can be disciplined about our code. I know I've suffered from it. I've used C for years and never felt the need to choose C++ for any of my projects. I know all of the pitfalls of C strings and C arrays. I can manage them effectively! I don't need any std::string or std::vector nonsense! I've got some good C code lying around that mitigates most of those problems most of the time. Yet despite (or maybe because of) all of my confidence in my ability to properly manage the risks of C, I've still had bugs that I wouldn't have had if I'd just gone and used C++ in the first place. Coding discipline is a skill that takes time to learn and is never perfectly maintained. It's irrational to believe otherwise. We all lapse and make mistakes.

Any solution to this problem that relies on discipline is a nonsolution. And that goes doubly so for a library that's going to be put out there for other people to use. In this particular case, that of GLFW event handling, one way around this is to use the callbacks solely to queue up custom event objects, and then process the queue once glfwPollEvents returns. That's a workable solution, but it's not what I settled on. I have an aversion to implementing anything that I don't absolutely need. It just doesn't feel clean. Besides which, it's a solution that's useful only for a subset of cases. Other cases that don't translate to the event-handling paradigm would require a different approach.

Another solution is to wrap any external calls made by the callback in a try...catch block and then save the exception somewhere. Then, when the original call into C returns, the exception can be rethrown from D. Here's what that might look like.

// In D, class references are automatically initialized to null.
private Throwable _rethrow;

private extern( C ) void onWindowClose( GLFWwindow* win ) {

    try {
        _log.writeln( "The window is closing!" );
        myApp.stop();
    } catch( Throwable t ) {
        // Save the exception so it can be rethrown below.
        _rethrow = t;
    }
}

void pumpEvents() {
    glfwPollEvents();

    // The C function has returned, so it is safe to rethrow the exception now.
    if( _rethrow !is null ) {
        throw _rethrow;
    }
}

Notice that I'm using Throwable here. D has two exception types in the standard library, which both derive from Throwable: Exception and Error. The latter is analogous to Java's Error in that it is not intended to be caught. It should be thrown to indicate an unrecoverable error in the program. The language does not prevent it from being caught, but Andrei Alexandrescu's book "The D Programming Language" has this to say about catching Throwable.

The first rule of Throwable is that you do not catch Throwable. If you do decide to catch it, you can't count on struct destructors being called and finally clauses being executed.

Another issue is that any Error caught might be of the type AssertError. This sort of error really should be propagated all the way up the call stack. It's acceptable to catch Throwable here, since I'm rethrowing it. By making _rethrow a Throwable, I'm ensuring that I won't miss any exceptions, regardless of type. With one caveat.

This implementation is fine if only one callback has been set. But if others have been set, glfwPollEvents can call any number of them on any given execution. If more than one exception is thrown, the newest will always overwrite its immediate predecessor. This means that, potentially, an Exception might overwrite an Error that really shouldn't be lost. In practice, this is unlikely to be a problem. If I'm implementing this for my own personal use, I know whether or not I care about handling multiple exceptions and can modify the code at any time if I find that I need to later on. But for something I want to distribute to others, this solution is not enough.

Like exceptions in Java, exceptions in D have built-in support for what's known as 'exception chaining.' Throwable exposes a public field called next. This can be set in a constructor when the exception is thrown.

void foo() {
    try { ... }
    catch( Throwable t ) {
        throw new MyException( "Something went horribly wrong!", t );
    }
}

void bar() {
    try {
        foo();
    } catch( MyException me ) {
        // Do something with MyException
        me.doSomething();

        // Throw the original
        throw me.next;
    }
}

Exception chaining is something that isn't needed often, but it can be useful in a number of cases. The problem I'm describing in this article is one of them.

Given the information so far, a first pass modification of onWindowClose might look something like this.

try { ... }
catch( Throwable t ) {
    // Chain the previously saved exception to t.
    t.next = _rethrow;

    // Save t.
    _rethrow = t;
}

This appears to do the job, making sure that no exceptions caught here are lost. However, there's still a problem. If t has an existing exception chain, then setting t.next will cause any previous exceptions connected to it to be lost.

To make the problem clear, what I want to do here is to save the caught exception, t and anything chained to it, along with any previously saved exceptions and their chains. This way, all of that information is available if needed once the C callback returns. That appears to call for more than one next field. Additionally, it would probably be a good idea to distinguish between saved Errors and Exceptions and handle them appropriately so that there's no need to rely on programmer discipline to do so elsewhere. Finally, it would be nice for user code to be able to tell the difference between exceptions thrown from the C callbacks and exceptions thrown from normal D code. This can be a useful aid in debugging.

Given the above constraints, a simple solution is a custom exception class. Here is the one I've implemented for my own code.

class CallbackThrowable : Throwable {

    // This is for the previously saved CallbackThrowable, if any.
    CallbackThrowable nextCT;

    // The file and line params aren't strictly necessary, but they are good
    // for debugging.
    this( Throwable payload, CallbackThrowable t,
            string file = __FILE__, size_t line = __LINE__ ) {

        // Call the super class constructor that takes file and line info,
        // and make the wrapped exception part of this exception's chain.
        super( "An exception was thrown from a C callback.", file, line, payload );

        // Store a reference to the previously saved CallbackThrowable
        nextCT = t;
    }
    
    // This method aids in propagating non-Exception throwables up the callstack.
    void throwWrappedError() {
        // Test if the wrapped Throwable is an instance of Exception and throw it
        // if it isn't. This will cause Errors and any other non-Exception Throwable
        // to be rethrown.
        if( cast( Exception )next is null ) {
            throw next;
        }
    }
}

Generally, it is frowned upon to subclass Throwable directly. However, this subclass is not a typical Exception or Error and is intended to be handled in a special way. Furthermore, it wraps both types, so doesn't really match either. Conceptually, I think it's the right choice.

This is intended to be used to wrap any exceptions thrown in the callbacks. I assign the caught exception to the next member, via the superclass constructor, so that it and its entire chain are saved. Then I assign the previously saved CallbackThrowable to the custom nextCT member. If nothing was previously saved, that's okay, since D will have initialized _rethrow to null. Finally, I save the new CallbackThrowable to the _rethrow variable. The modified callback example below demonstrates how this is used.

private CallbackThrowable _rethrow;

private extern( C ) void onWindowClose( GLFWwindow* win ) {

    try {
       _log.writeln( "The window is closing!" );
        myApp.stop();
    } catch( Throwable t ) {
        // Save a new CallbackThrowable that wraps t and chains _rethrow.
        _rethrow = new CallbackThrowable( t, _rethrow );
    }
}

void pumpEvents() {
    glfwPollEvents();

    if( _rethrow !is null ) {
        // Loop through each CallbackThrowable in the chain and rethrow the first
    	// non-Exception throwable encountered.    
    	for( auto ct = _rethrow; ct !is null; ct = ct.nextCT ) {
        	ct.throwWrappedError();
        }
        
        // No Errors were caught, so all we have are Exceptions.
        // Throw the saved CallbackThrowable.
        throw _rethrow;
    }
}

Now, code further up the call stack can look for instances of CallbackThrowable and not miss a single Exception that was thrown in the callbacks. Errors will still be thrown independently, propagating up the callstack as intended.

This still isn't quite as perfect as one would like. If multiple Errors were thrown, then all but one would be lost. If that's important to handle, it's not difficult to do so. One potential solution would be to log each error in the loop above and rethrow the last one logged, rather than calling throwWrappedError. Another would be to implement CallbackError and CallbackException subclasses of CallbackThrowable. Errors can be chained to the former, Exceptions to the latter. Then the loop can be eliminated altogether. I'll leave that as an exercise for the reader.

Conclusion


In Java, I see exceptions used everywhere, but that's primarily because there's often no choice (I still cringe thinking about my first experience with JDBC). Checked exceptions can be extremely annoying to deal with. Conversely, in C++ where there are no checked exceptions, I've found that they are less common (in the source I've seen, which certainly is in a narrow domain -- I wouldn't be surprised to find them used extensively in some fields outside of game development).

In D, like C++, there are no checked exceptions. However, their usage in D tends to be pervasive, but in a way that is more subtle than in Java. That is, you don't see try...catch blocks all over a D codebase as you do a Java codebase. Often, in user code it's enough just to use try...catch only in special cases, such as file IO when you don't want to abort the app just because a file failed to load, and let the rest propagate on up the call stack. The runtime will generate a stack trace on exit. But they can be thrown from anywhere any time, particularly from the standard library. So in D, whether a codebase is littered with exceptions or not, they should always be on the mind.

As such, cross-boundary exception handling is one potential source of problems that needs to be considered when using D and C in the same codebase. The solution presented here is only one of several possible. I find it not only simple, but flexible enough to cover a number of use cases.

Changelog

August 7, 2013: Clarified the difference between Errors and Exceptions. Updated the CallbackThrowable solution to handle Errors differently.

Level Creation Concerns with Unity

Developers who are creating a game for the first time may find the idea of creating levels a daunting task. Unity is an extremely powerful engine and editor which gives the developer a lot of freedom, but with so much freedom where should a developer begin? This article covers some of my experiences and opinions of the level creation process. Please comment with your own experiences, techniques and criticism!

Here are some of the concerns that I had when starting out with my first game:
  • Should I create the entire level in a modelling package?
  • What advantages would there be to composing the levels in Unity?
  • Which techniques would be most suitable for mobile development?
The truth is that there is no single "correct" approach. It all depends upon the type of game that you are making, the platforms that you are targeting, the version of Unity that you are using (Indie / Pro), and of course the level of skill that you possess!

Level Creation


Create bulk of scene using modelling package

With the modelling package of your choice you can create the bulk of your scene with tons of flexibility.

When taking this route I would strongly recommend creating an appropriate scene hierarchy (when supported) because this will make things a lot easier when working in Unity. I have also found it particularly useful to group objects using named empty objects; this avoids having to scroll through a very long list of objects in Unity!

Whilst levels created using this approach will generally require more memory at runtime, this can lead to improved rendering performance.

The only major problem that I found with this approach was with regard to changing and re-exporting the level. I found that on several occasions, components (and component properties) that had been assigned to sub-objects within the Unity editor were lost. This may have been something that I was doing wrong, though...

Compose scene in Unity

Instead of creating and mapping the whole scene in a modelling package, create a selection of parts that can be reused. Compose your scene using the Unity editor by making the most of prefabs.

This will often use less memory than the previous approach; however, the dynamic batching of objects may be more intensive.

When using this approach I did not encounter problems when updating meshes and re-exporting them.

Use specialized extensions

There are a number of extensions available from the Unity asset store that can make it considerably easier to create levels. It is well worth taking a look at what is available because there are some fantastic tools that can save a lot of time and hard work!

Here are just a few that are relevant to level design:
  • Rotorz Tile System - This is an extension that I created which aims to make it easier to design levels using 3D tiles. You can create tiles using your favourite modelling software and then import them into Unity for use within tile systems. You can control how painted tiles are transformed, and tiles can optionally be oriented automatically.

    Attached Image: gallery_198490_437_187985.png

  • UniTile (2D Map Editor) - Another fantastic tile based editor which specialises with 2D graphics. Easily create tiles from texture atlases and build highly optimized levels.
  • RageSpline - Create and edit smooth 2D vector-style graphics. This is ideal for creating both levels and various other graphics. This tool can lead to some visually stunning effects with a similar look to Flash-type games.
Procedural Generation

This is an extremely complex topic, but in some scenarios it may be beneficial to dynamically build scenes. Scenes could be composed of predefined meshes or from procedurally generated meshes.

Performance Considerations


When modelling for games it is useful to keep the idea of draw calls in mind (especially for mobile development!). The number of draw calls increases for each mesh that is rendered (one for each material applied to the mesh). There are a number of ways in which the number of draw calls can be reduced.

Reduce the number of materials

Use the fewest number of materials possible to achieve the visual quality that you are after. Where possible reduce the number of similar materials by combining multiple textures into a single image (often referred to as a texture atlas). In many cases this can significantly reduce the number of draw calls. The "Angry Bots" demo project that is included with Unity is an excellent example of this!

Batching

Both Unity and Unity Pro support dynamic batching which attempts to reduce the number of draw calls by submitting multiple objects that share the same material at the same time. Unity Pro includes the additional option of static batching which takes this a step further by combining objects that share the same material into a single object. Whilst static batching does require more memory, it can lead to significantly better performance.

It is also possible to create a custom script that combines meshes at runtime, or a custom editor script.

Number of triangles and vertices

Keeping the number of triangles and vertices in a mesh to a minimum will improve performance, especially on mobile devices. More often than not additional detail can be added to a texture instead. Custom shaders can also be created to add detail to an otherwise flat object using special textures like bump and height maps (for example).

For those who are interested in getting started with their own shaders, I would strongly recommend watching JessyUV's videos on YouTube!


Originally published on May 12, 2012 in kruncher's Journal

Wade Not In Unknown Waters: Part Two


This time I want to speak about the 'printf' function. Everybody has heard of software vulnerabilities and that functions like 'printf' are outlawed. But it's one thing to know that you'd better not use these functions, and quite another to understand why. In this article, I will describe two classic software vulnerabilities related to 'printf'. You won't become a hacker after reading it, but perhaps you will have a fresh look at your code. You might be creating similarly vulnerable functions in your project without knowing it.


STOP. Reader, please stop, don't pass by. You have seen the word "printf", I know. And you're sure that you will now be told a banal story that the function cannot check types of passed arguments. No! It's vulnerabilities themselves that the article deals with, not the things you have thought. Please come and read it.


The previous post can be found here: Part one.


Introduction


Have a look at this line:


printf(name);

It seems simple and safe. But actually it hides at least two methods to attack the program.


Let's start our article with a demo sample containing this line. The code might look a bit odd. It is, really. It was quite difficult to write a program in such a way that it could then be attacked. The reason is optimization performed by the compiler. It appears that if you write a program that is too simple, the compiler creates code where nothing can be hacked: it uses registers, not the stack, to store data, creates intrinsic functions and so on. We could write code with extra operations and loops so that the compiler lacked free registers and started putting data onto the stack, but unfortunately that code would be too large and complicated. We could write a whole detective story about all this, but we won't.


The cited sample is a compromise between complexity and the necessity to create a code that would not be too simple for the compiler to get it "collapsed into nothing". I have to confess that I still have helped myself a bit: I have disabled some optimization options in Visual Studio 2010. First, I have turned off the /GL (Whole Program Optimization) switch. Second, I have used the __declspec(noinline) attribute.


Sorry for such a long introduction: I just wanted to explain why my code is such a crock and prevent beforehand any debates on how we could write it in a better way. I know that we could. But we didn't manage to make the code short and show you the vulnerability inside it at the same time.


Demo sample


The complete code and project for Visual Studio 2010 can be found in the attached file: printf_demo.zip.

const size_t MAX_NAME_LEN = 60;
enum ErrorStatus {
  E_ToShortName, E_ToShortPass, E_BigName, E_OK
};

void PrintNormalizedName(const char *raw_name)
{
  char name[MAX_NAME_LEN + 1];
  strcpy(name, raw_name);

  for (size_t i = 0; name[i] != '\0'; ++i)
    name[i] = tolower(name[i]);
  name[0] = toupper(name[0]);

  printf(name);
}

ErrorStatus IsCorrectPassword(
  const char *universalPassword,
  BOOL &retIsOkPass)
{
  string name, password;
  printf("Name: "); cin >> name;
  printf("Password: "); cin >> password;
  if (name.length() < 1) return E_ToShortName;
  if (name.length() > MAX_NAME_LEN) return E_BigName;
  if (password.length() < 1) return E_ToShortPass;

  retIsOkPass = 
    universalPassword != NULL &&
    strcmp(password.c_str(), universalPassword) == 0;
  if (!retIsOkPass)
    retIsOkPass = name[0] == password[0];

  printf("Hello, ");
  PrintNormalizedName(name.c_str());

  return E_OK;
}

int _tmain(int, char *[])
{
  _set_printf_count_output(1);
  char universal[] = "_Universal_Pass_!";
  BOOL isOkPassword = FALSE;
  ErrorStatus status =
    IsCorrectPassword(universal, isOkPassword);
  if (status == E_OK && isOkPassword)
    printf("\nPassword: OK\n");
  else
    printf("\nPassword: ERROR\n");
  return 0;
}

The _tmain() function calls the IsCorrectPassword() function. If the password is correct or if it coincides with the magic word "_Universal_Pass_!", then the program prints the line "Password: OK". The purpose of our attacks will be to have the program print this very line.


The IsCorrectPassword() function asks the user to specify name and password. The password is considered correct if it coincides with the magic word passed into the function. It is also considered correct if the password's first letter coincides with the name's first letter.


Regardless of whether the correct password is entered or not, the application shows a welcome window. The PrintNormalizedName() function is called for this purpose.


The PrintNormalizedName() function is of the most interest. It is this function where the "printf(name);" we're discussing is stored. Think of the way we can exploit this line to cheat the program. If you know how to do it, you don't have to read further.


What does the PrintNormalizedName() function do? It prints the name making the first letter capital and the rest letters small. For instance, if you enter the name "andREy2008", it will be printed as "Andrey2008".


The first attack


Suppose we don't know the correct password. But we know that there is some magic password somewhere. Let's try to find it using printf(). If this password's address is stored somewhere in the stack, we have certain chances to succeed. Any ideas how to get this password printed on the screen?


Here is a tip. The printf() function belongs to the family of variable-argument functions. These functions work in the following way. Some amount of data is written into the stack. The printf() function doesn't know how much data has been pushed or what type it has; it follows only the format string. If it reads "%d%s", then the function should extract one value of the int type and one pointer from the stack. Since the printf() function doesn't know how many arguments it has been passed, it can look deeper into the stack and print data that have nothing to do with it. That usually causes an access violation or prints trash. And we may exploit this trash.
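As a tiny, purely illustrative fragment (hypothetical, and deliberately undefined behavior), a format string with no matching arguments makes printf() read whatever happens to lie where its arguments would be:

// Undefined behavior on purpose: there are no arguments for the
// conversion specifiers, so printf() prints "garbage" values taken
// from the stack / argument registers.
printf("%x %x %x\n");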


Let's see how the stack might look at the moment when calling the printf() function:


image1.png


Figure 1. Schematic arrangement of data in the stack.


The "printf(name);" function's call has only one argument which is the format string. It means that if we type in "%d" instead of the name, the program will print the data that lie in the stack before the PrintNormalizedName() function's return address. Let's try:


Name: %d

Password: 1

Hello, 37

Password: ERROR


This action doesn't achieve much by itself. To get to something really interesting, we first have to print our way past the return addresses and the whole contents of the char name[MAX_NAME_LEN + 1] buffer, which is located in the stack too.


If an attacker cannot disassemble or debug the program, he/she cannot know for sure whether there is anything interesting to be found in the stack. He/she can still proceed in the following way.


First we can enter "%s". Then "%x%s". Then "%x%x%s" and so on. Doing so, the hacker searches through the data in the stack one slot at a time and tries to print it as a string. It helps the intruder that all the data in the stack are aligned on at least a 4-byte boundary.


To be honest, we won't succeed if we go this way. We will exceed the limit of 60 characters and have nothing useful printed. "%f" will help us - it is intended to print values of the double type. So, we can use it to move along the stack with an 8-byte step.


Here it is, our dear line:


%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%x(%s)


This is the result:


image2.png


Figure 2. Printing the password. Click on the picture to enlarge it.


Let's try this line as the magic password:


Name: Aaa

Password: _Universal_Pass_!

Hello, Aaa

Password: OK


Hurrah! We have managed to find and print the private data which the program didn't intend to give us access to. Note also that you don't have to get access to the application's binary code itself. Diligence and persistence are enough.


Conclusions on the first attack


You should give wider consideration to this method of getting at private data. When developing software containing variable-argument functions, think over whether there are cases where they may become a source of data leakage. It can be a log file, a packet passed over the network and the like.


In the case we have considered, the attack is possible because the printf() function receives a string that may contain control commands. To avoid this, you just need to write it in this way:


printf("%s", name);

The second attack


Do you know that the printf() function can modify memory? You must have read about it but forgotten. We mean the "%n" specifier. It allows you to write the number of characters already printed by the printf() function to a certain address.


To be honest, an attack based on the "%n" specifier is just of a historical character. Starting with Visual Studio 2005, the capability of using "%n" is off by default. To perform this attack, I had to explicitly allow this specifier. Here is this magic trick:


_set_printf_count_output(1);

To make it clearer, let me give you an example of using "%n":

int i;

printf("12345%n6789\n", &i);

printf( "i = %d\n", i );

The program's output:

123456789

i = 5


We have already found out how to get to the needed pointer in the stack. And now we have a tool that allows us to modify memory by this pointer.


Of course, it's not very convenient to use. To start with, we can write only 4 bytes at a time (the size of the int type). If we need a larger number, the printf() function will have to print very many characters first. To avoid this we may use the "%00u" specifier: it affects the value of the current number of output bytes. Let's not go into detail here.


Our case is simpler: we just have to write any value not equal to 0 into the isOkPassword variable. This variable's address is passed into the IsCorrectPassword() function, which means that it is stored somewhere in the stack. Do not be confused by the fact that the variable is passed as a reference: a reference is an ordinary pointer at the low level.


Here is the line that will allow us to modify the isOkPassword variable:


%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f%f %n


The "%n" specifier does not take into account the number of characters printed by specifiers like "%f". That's why we make one space before "%n" to write value 1 into isOkPassword.


Let's try:


image4.png


Figure 3. Writing into memory. Click on the picture to enlarge it.


Are you impressed? But that's not all. We may perform a write to virtually any address. If the printed line is stored in the stack, we can reach the needed characters and use them as an address.


For example, we may write a string containing characters with the codes '\xF8', '\x32', '\x01', '\x7F' in a row. It turns out that the string then contains a hard-coded number equivalent to the value 0x7F0132F8. We add the "%n" specifier at the end. Using "%x" or other specifiers we can get to the encoded number 0x7F0132F8 and write the number of printed characters to this address. This method has some limitations, but it is still very interesting.


Conclusions on the second attack


We may say that an attack of the second type is hardly possible nowadays. As you see, support of the "%n" specifier is off in contemporary libraries by default. But you may create a self-made mechanism subject to this kind of vulnerability. Be careful when external data coming into your program control what is written into memory and where.


Particularly in our case, we may avoid the problem by writing the code in this way:


printf("%s", name);

General conclusions


We have considered only two simple examples of vulnerabilities here. Surely, there are much more of them. We don't make an attempt to describe or at least enumerate them in this article; we wanted to show you that even such a simple construct like "printf(name)" can be dangerous.


There is an important conclusion to draw from all this: if you are not a security expert, you'd better follow all the recommendations you can find. Their point might be too subtle for you to grasp the whole range of dangers on your own. You must have read that the printf() function is dangerous. But I'm sure that many of you reading this article have only now learned how deep the rabbit hole is.


If you create an application that is a potential attack target, be very careful. Code that looks quite safe from your viewpoint might contain a vulnerability. If you don't see a catch in your code, it doesn't mean there isn't any.


Follow all the compiler's recommendations on using updated versions of string functions. We mean using sprintf_s instead of sprintf and so on.


It's even better if you move away from low-level string handling. These functions are a heritage of the C language. Now we have std::string, and we have safe methods of string formatting such as boost::format or std::stringstream.
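For example, the greeting from the demo could be built with std::stringstream instead of printf() - a minimal sketch, not the authors' code; the user-supplied name is treated purely as data:

#include <iostream>
#include <sstream>
#include <string>

void PrintGreeting(const std::string &name)
{
  // The name is only ever appended as data, so format specifiers
  // typed by the user ("%n", "%s", ...) have no special meaning here.
  std::ostringstream out;
  out << "Hello, " << name;
  std::cout << out.str() << std::endl;
}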


P.S. Some of you, having read the conclusions, may say: "well, it's as clear as day". But be honest with yourself. Did you know and remember that printf() can write into memory before you read this article? Well, and this is a great vulnerability. At least, it used to be. Now there are others, just as insidious.


The Healthy Programmer

When the publisher asked me to review The Healthy Programmer: Get Fit, Feel Better, and Keep Coding, I was interested. I also raised my skeptic flag, because books on getting healthy can go in two directions. They are either common-sense guides to good eating and exercise, or they descend straight into “be healthier with no effort at all on your part” pseudoscience. That's not The Healthy Programmer. This is a nice readable guide for programmers who want to avoid the perils of the I-spend-eighty-hours-in-front-of-a-screen lifestyle.

The book itself is pretty common-sense. There is little coverage of all of those fitness doodads that you clip to your body. It does mention a couple of apps that are helpful, but they are basically just little database-type “to do” lists that you can use to track your progress towards becoming less sedentary. The exercises generally do not require any extra equipment, and the eating advice looks pretty healthy. Of course, such down-to-earth advice means that you will find little that you would not be able to do with an “eat less and move more” lifestyle change. But if your game is on a hard deadline, you are probably not in a position to take a break for a salad and some yoga.

The Healthy Programmer avoids trendiness. There is no mention of toxins or cleanses or whatever kinds of lose-weight-fast plans are popular this week and will be forgotten next week. It is pretty-much "here is how to eat better and move more" translated into programmerese and put into a plan that will work with a minimum (or even zero) of leaving your office, and with a focus on some programmer-centric maladies like carpal tunnel.

If I had a complaint about The Healthy Programmer, it would be that it does not delve too deeply into specifics. As someone who is shopping for a standing desk, I would have liked to see a better discussion or comparison of products beyond the benefits of having a standing desk and pictures of standing desks constructed from IKEA tables. Although that is a fairly minor complaint given the number of well-reviewed desks on the major office supply websites.

The Healthy Programmer is available in paper form at Amazon, or you can get DRM-free ebook versions at the publisher's site.

Checking the Open-Source Multi Theft Auto Game

We haven't used PVS-Studio to check games for a long time. So, this time we decided to return to this practice and picked out the MTA project. Multi Theft Auto (MTA) is a multiplayer modification for PC versions of the Grand Theft Auto: San Andreas game by Rockstar North that adds online multiplayer functionality. As Wikipedia tells us, the specific feature of the game is "well optimized code with fewest bugs possible". OK, let's ask our analyzer for its opinion.

Introduction


This time I decided to omit the texts of diagnostic messages generated by PVS-Studio for every particular defect. I comment upon examples anyway, so if you want to find out in which particular line and by which diagnostic rule a certain bug was found, see the file mtasa-review.txt.

When looking through the project, I noted in the mtasa-review.txt file those code fragments which I found suspicious and used it to prepare the article.

Important! I added only those code fragments which I personally didn't like. I'm not an MTA developer, so I'm not familiar with its logic and principles. That's why I may well have made a few mistakes, attacking correct code fragments and missing genuine bugs. Also, when studying certain fragments, I admit I was too lazy to describe some slightly incorrect printf() function calls. So, I'm asking the MTA Team developers not to rely on this article and to consider checking the project by themselves. It is pretty large, so the demo version of PVS-Studio won't be enough. However, we support free open-source projects. If you're an open-source developer, contact us and we'll discuss the question of giving you a free registration key.

So, Multi Theft Auto is an open-source project in C/C++. Analysis was performed with the PVS-Studio 5.05 analyzer.
Now let's see what bugs PVS-Studio has managed to find in the game. They aren't numerous, and most of them are found in rarely-used parts of the program (error handlers). It's no wonder: most bugs are found and fixed through other, more expensive and slow, methods. To use static analysis properly is to use it regularly. By the way, PVS-Studio can be called to analyze recently modified and compiled files only (see incremental analysis mode). This mechanism allows the developer to find and fix many bugs and misprints immediately, which makes it much faster and cheaper than detecting errors through testing. This subject was discussed in detail in the article "Leo Tolstoy and static code analysis". It's a worthy article, and I do recommend reading the introduction to understand the ideology of using PVS-Studio and other static analysis tools.

Strange Colors


// c3dmarkersa.cpp
SColor C3DMarkerSA::GetColor()
{
  DEBUG_TRACE("RGBA C3DMarkerSA::GetColor()");
  // From ABGR
  unsigned long ulABGR = this->GetInterface()->rwColour;
  SColor color;
  color.A = ( ulABGR >> 24 ) && 0xff;
  color.B = ( ulABGR >> 16 ) && 0xff;
  color.G = ( ulABGR >> 8 ) && 0xff;
  color.R = ulABGR && 0xff;
  return color;
}

By mistake '&&' is used instead of '&'. The color is torn into bits and pieces to leave only 0 or 1.
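Presumably the fixed code should use the bitwise AND operator:

color.A = ( ulABGR >> 24 ) & 0xff;
color.B = ( ulABGR >> 16 ) & 0xff;
color.G = ( ulABGR >> 8 ) & 0xff;
color.R = ulABGR & 0xff;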

The same problem is found in the file "ccheckpointsa.cpp".

One more problem with colors.

// cchatechopacket.h
class CChatEchoPacket : public CPacket
{
  ....
  inline void SetColor( unsigned char ucRed,
                        unsigned char ucGreen,
                        unsigned char ucBlue )
  { m_ucRed = ucRed; m_ucGreen = ucGreen; m_ucRed = ucRed; };
  ....
}

Red is copied twice, while blue is not copied at all. The fixed code should look like this:

{ m_ucRed = ucRed; m_ucGreen = ucGreen; m_ucBlue = ucBlue; };

The same problem is found in the file cdebugechopacket.h.

By the way, quite a number of the game's bugs are duplicated in two files which, I suspect, belong to the client side and the server side respectively. Do you feel the great power of the Copy-Paste technology? :).

Something Wrong with utf8


// utf8.h
int
utf8_wctomb (unsigned char *dest, wchar_t wc, int dest_size)
{
  if (!dest)
    return 0;
  int count;
  if (wc < 0x80)
    count = 1;
  else if (wc < 0x800)
    count = 2;
  else if (wc < 0x10000)
    count = 3;
  else if (wc < 0x200000)
    count = 4;
  else if (wc < 0x4000000)
    count = 5;
  else if (wc <= 0x7fffffff)
    count = 6;
  else
    return RET_ILSEQ;
  ....
}

The size of the wchar_t type in Windows is 2 bytes. Its value range is [0..65535], which means that comparing it to values 0x10000, 0x200000, 0x4000000, 0x7fffffff is pointless. I guess the code should be written in some different way.

Missing break


// cpackethandler.cpp
void CPacketHandler::Packet_ServerDisconnected (....)
{
  ....
  case ePlayerDisconnectType::BANNED_IP:
    strReason = _("Disconnected: You are banned.\nReason: %s");
    strErrorCode = _E("CD33");
    bitStream.ReadString ( strDuration );
  case ePlayerDisconnectType::BANNED_ACCOUNT:
    strReason = _("Disconnected: Account is banned.\nReason: %s");
    strErrorCode = _E("CD34");
    break;
  ....
}

The break operator is missing in this code. It results in processing the situation BANNED_IP in the same way as BANNED_ACCOUNT.
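The likely fix is simply to add the missing break at the end of the BANNED_IP case:

case ePlayerDisconnectType::BANNED_IP:
  strReason = _("Disconnected: You are banned.\nReason: %s");
  strErrorCode = _E("CD33");
  bitStream.ReadString ( strDuration );
  break;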

Strange Checks


// cvehicleupgrades.cpp
bool CVehicleUpgrades::IsUpgradeCompatible (
  unsigned short usUpgrade )
{
  ....
  case 402: return ( us == 1009 || us == 1009 || us == 1010 );
  ....
}

The variable is compared twice to the number 1009. A bit ahead in the code there is a similar double comparison.

Another strange comparison:

// cclientplayervoice.h
bool IsTempoChanged(void)
{ 
  return m_fSampleRate != 0.0f ||
         m_fSampleRate != 0.0f ||
         m_fTempo != 0.0f;
}

This error was also copied into the cclientsound.h file.

Null Pointer Dereferencing


// cgame.cpp
void CGame::Packet_PlayerJoinData(CPlayerJoinDataPacket& Packet)
{
  ....
  // Add the player
  CPlayer* pPlayer = m_pPlayerManager->Create (....);
  if ( pPlayer )
  {
    ....
  }
  else
  {
    // Tell the console
    CLogger::LogPrintf(
      "CONNECT: %s failed to connect "
      "(Player Element Could not be created.)\n",
      pPlayer->GetSourceIP() );
  }
  ....
}

If the object player can't be created, the program will attempt printing the corresponding error message into the console. It will fail because it's a bad idea to use a null pointer when calling the function pPlayer->GetSourceIP().

Another null pointer is dereferenced in the following fragment:

// clientcommands.cpp
void COMMAND_MessageTarget ( const char* szCmdLine )
{
  if ( !(szCmdLine || szCmdLine[0]) )
    return;
  ....
}

If the szCmdLine pointer is null, it will be dereferenced.

The fixed code must look like this, I suppose:

if ( !(szCmdLine && szCmdLine[0]) )

The following code fragment I like most of all:

// cdirect3ddata.cpp
void CDirect3DData::GetTransform (....) 
{
  switch ( dwRequestedMatrix )
  {
    case D3DTS_VIEW:
      memcpy (pMatrixOut, &m_mViewMatrix, sizeof(D3DMATRIX));
      break;
    case D3DTS_PROJECTION:
      memcpy (pMatrixOut, &m_mProjMatrix, sizeof(D3DMATRIX));
      break;
    case D3DTS_WORLD:
      memcpy (pMatrixOut, &m_mWorldMatrix, sizeof(D3DMATRIX));
      break;
    default:
      // Zero out the structure for the user.
      memcpy (pMatrixOut, 0, sizeof ( D3DMATRIX ) );
      break;
  }
  ....
}

Very nice Copy-Paste. The function memset() must be called instead of the last memcpy() function.
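Presumably the default branch was meant to look like this:

default:
  // Zero out the structure for the user.
  memset (pMatrixOut, 0, sizeof ( D3DMATRIX ) );
  break;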

Uncleared Arrays


There are a number of errors related to uncleared arrays. They all can be arranged into two categories. The first includes unremoved items, the second includes partial array clearing errors.

Unremoved Items


// cperfstat.functiontiming.cpp
std::map < SString, SFunctionTimingInfo > m_TimingMap;

void CPerfStatFunctionTimingImpl::DoPulse ( void )
{
  ....
  // Do nothing if not active
  if ( !m_bIsActive )
  {
    m_TimingMap.empty ();
    return;
  }
  ....
}

The function empty() only checks whether or not the container contains items. To remove items from the m_TimingMap container one should call the clear() function.
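That is, the intended call was presumably:

m_TimingMap.clear ();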

Another example:

// cclientcolsphere.cpp
void CreateSphereFaces (
  std::vector < SFace >& faceList, int iIterations )
{
  int numFaces = (int)( pow ( 4.0, iIterations ) * 8 );
  faceList.empty ();
  faceList.reserve ( numFaces );
  ....
}

Some more similar bugs are found in the file cresource.cpp.

Note:  If you have started reading the article from the middle and therefore skipped the beginning, see the file mtasa-review.txt to find out exact locations of all the bugs.


Partial Array Clearing Errors


// crashhandler.cpp
LPCTSTR __stdcall GetFaultReason(EXCEPTION_POINTERS * pExPtrs)
{
  ....
  PIMAGEHLP_SYMBOL pSym = (PIMAGEHLP_SYMBOL)&g_stSymbol ;
  FillMemory ( pSym , NULL , SYM_BUFF_SIZE ) ;
  ....
}

Everything looks alright at first sight. But FillMemory() will in fact have no effect. FillMemory() and memset() are different functions. Have a look at this fragment:

#define RtlFillMemory(Destination,Length,Fill) \
  memset((Destination),(Fill),(Length))
#define FillMemory RtlFillMemory

The second and the third arguments are swapped. That's why the correct code should look like this:

FillMemory ( pSym , SYM_BUFF_SIZE, 0 ) ;

The same thing is found in the file ccrashhandlerapi.cpp.

And here is the last error sample of this type. Only one byte gets cleared.

// hash.hpp
unsigned char m_buffer[64];
void CMD5Hasher::Finalize ( void )
{
  ....
  // Zeroize sensitive information
  memset ( m_buffer, 0, sizeof (*m_buffer) );
  ....
}

Asterisk '*' should be removed: sizeof (m_buffer).

Uninitialized Variable


// ceguiwindow.cpp
Vector2 Window::windowToScreen(const UVector2& vec) const
{
  Vector2 base = d_parent ?
    d_parent->windowToScreen(base) + getAbsolutePosition() :
    getAbsolutePosition();
  ....
}

The variable base initializes itself. Another bug of this kind can be found a few lines ahead.

Array Index out of Bounds


// cjoystickmanager.cpp
struct
{
  bool    bEnabled;
  long    lMax;
  long    lMin;
  DWORD   dwType;
} axis[7];

bool CJoystickManager::IsXInputDeviceAttached ( void )
{
  ....
  m_DevInfo.axis[6].bEnabled = 0;
  m_DevInfo.axis[7].bEnabled = 0;
  ....
}

The last line m_DevInfo.axis[7].bEnabled = 0; writes outside the bounds of the axis array (valid indices are 0 through 6) and should be removed.

Another error of this kind

// cwatermanagersa.cpp
class CWaterPolySAInterface
{
public:
  WORD m_wVertexIDs[3];
};

CWaterPoly* CWaterManagerSA::CreateQuad ( const CVector& vecBL, const CVector& vecBR, const CVector& vecTL, const CVector& vecTR, bool bShallow )
{
  ....
  pInterface->m_wVertexIDs [ 0 ] = pV1->GetID ();
  pInterface->m_wVertexIDs [ 1 ] = pV2->GetID ();
  pInterface->m_wVertexIDs [ 2 ] = pV3->GetID ();
  pInterface->m_wVertexIDs [ 3 ] = pV4->GetID ();
  ....
}

One more:

// cmainmenu.cpp
#define CORE_MTA_NEWS_ITEMS 3

CGUILabel* m_pNewsItemLabels[CORE_MTA_NEWS_ITEMS];
CGUILabel* m_pNewsItemShadowLabels[CORE_MTA_NEWS_ITEMS];

void CMainMenu::SetNewsHeadline (....)
{
  ....
  for ( char i=0; i <= CORE_MTA_NEWS_ITEMS; i++ )
  {
    m_pNewsItemLabels[ i ]->SetFont ( szFontName );
    m_pNewsItemShadowLabels[ i ]->SetFont ( szFontName );
    ....
  }
  ....
}

At least one more error of this kind can be found in the file cpoolssa.cpp. But I decided not to describe it in the article because that would be a pretty large sample and I didn't know how to make it brief and clear. As I've already said, this and all the rest of the bugs can be found in the detailed report.

The Word 'throw' is Missing


// fallistheader.cpp
ListHeaderSegment*
FalagardListHeader::createNewSegment(const String& name) const
{
  if (d_segmentWidgetType.empty())
  {
    InvalidRequestException(
      "FalagardListHeader::createNewSegment - "
      "Segment widget type has not been set!");
  }
  return ....;
}

The correct line is throw InvalidRequestException(....).

Another code fragment.

// ceguistring.cpp 
bool String::grow(size_type new_size)
{
  // check for too big
  if (max_size() <= new_size)
    std::length_error(
      "Resulting CEGUI::String would be too big");
  ....
}

The correct code should look like this: throw std::length_error(....).

Oops: free(new T[n])


// cresourcechecker.cpp
int CResourceChecker::ReplaceFilesInZIP(....)
{
  ....
  // Load file into a buffer
  buf = new char[ ulLength ];
  if ( fread ( buf, 1, ulLength, pFile ) != ulLength )
  {
    free( buf );
    buf = NULL;
  }
  ....
}

The new operator is used to allocate memory, while the function free() is used to release it. The result is unpredictable.
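Since the buffer was allocated with new[], the matching release would presumably be:

delete [] buf;
buf = NULL;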

Always True/False Conditions


// cproxydirect3ddevice9.cpp
#define D3DCLEAR_ZBUFFER 0x00000002l
HRESULT CProxyDirect3DDevice9::Clear(....)
{
  if ( Flags | D3DCLEAR_ZBUFFER )
    CGraphics::GetSingleton().
      GetRenderItemManager()->SaveReadableDepthBuffer();
  ....
}

The programmer wanted to check a particular bit in the Flag variable. By mistake he wrote the '|' operation instead of '&'. This results in the condition being always true.
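The intended check is presumably:

if ( Flags & D3DCLEAR_ZBUFFER )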

A similar mess-up is found in the file cvehiclesa.cpp.

Another bug in a check is found here: unsigned_value < 0.

// crenderitem.effectcloner.cpp
unsigned long long Get ( void );

void CEffectClonerImpl::MaybeTidyUp ( void )
{
  ....
  if ( m_TidyupTimer.Get () < 0 )
    return;
  ....
}

The Get() function returns a value of the unsigned long long type. It means that the check m_TidyupTimer.Get () < 0 is pointless. Other errors of this type can be found in the files csettings.cpp, cmultiplayersa_1.3.cpp and cvehiclerpcs.cpp.

This Code May Work, but You'd Better Refactor It


Many PVS-Studio diagnostics detected bugs which will most likely never manifest themselves. I don't like describing such bugs because they are not interesting. So, here are just a couple of examples.

// cluaacldefs.cpp
int CLuaACLDefs::aclListRights ( lua_State* luaVM )
{
  char szRightName [128];
  ....
  strncat ( szRightName, (*iter)->GetRightName (), 128 );
  ....
}

The third argument of the strncat() function is not the size of the buffer but the maximum number of characters that may still be appended to it. A buffer overflow can theoretically occur here, but in practice it will most probably never happen. This type of error is described in detail in the V645 diagnostic's description.

The second example.

// cscreenshot.cpp
void CScreenShot::BeginSave (....)
{
  ....
  HANDLE hThread = CreateThread (
    NULL,
    0,
    (LPTHREAD_START_ROUTINE)CScreenShot::ThreadProc,
    NULL,
    CREATE_SUSPENDED,
    NULL );
  ....
}

In many game fragments, the functions CreateThread()/ExitThread() are used. This is in most cases a bad idea. You should use the functions _beginthreadex()/_endthreadex() instead. For details on this issue see the V513 diagnostic's description.
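A rough sketch of the replacement (my assumption: ThreadProc is changed to the unsigned __stdcall (void*) signature that _beginthreadex() expects):

#include <process.h>

// static unsigned __stdcall CScreenShot::ThreadProc(void* param);
HANDLE hThread = (HANDLE)_beginthreadex (
  NULL,                    // default security
  0,                       // default stack size
  CScreenShot::ThreadProc, // unsigned (__stdcall *)(void*)
  NULL,                    // no argument
  CREATE_SUSPENDED,
  NULL );                  // thread id not needed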

I Have to Stop Somewhere


I have described only a part of all the defects I noticed. But I have to stop here: the article is already big enough. See the file mtasa-review.txt for other bug samples.

There you will find bugs which I haven't mentioned in the article:
  • identical branches in the conditional operator if () { aa } else { aa };
  • checking a pointer returned by the new operator for being a null pointer: p = new T; if (!p) { aa };
  • a poor way of using #pragma to suppress compiler warnings (instead of push/pop);
  • classes contain virtual functions but no virtual destructors;
  • a pointer gets dereferenced first and only then checked for being a null pointer;
  • identical conditions: if (X) { if (X) { aa } };
  • miscellaneous.

Conclusion


The PVS-Studio analyzer can be efficiently used to eliminate various bugs at early development stages, both in game projects and in projects of any other type. It won't find algorithmic errors of course (it would need AI to do that), but it will help save a lot of the time programmers usually waste searching for silly mistakes and misprints. Developers actually spend much more time finding plain defects than they may think. Even debugged and tested code contains a number of such errors, and ten times more of them get fixed while new code is being written.

The Most Effective Playtester: The Griefer!

Back in my college days, when the internet was still new to the general public, I formed a friendship with this guy that turned out to be "The Greatest Gamer I've Ever Known". He was a natural at fighting games. He read tabletop RPG manuals and exploited the flaws in character development better than anyone else. His insight on the gaming (and piracy!) communities was unparalleled. He's the only guy I knew in 2000 that could make a self-booting Dreamcast copy of a disk. He was quick of wit but always calm in his demeanor. And he was a jerk to everyone except his friends and he did it for fun! He's dead now (no joke!) but he's immortalized in my mind as a "Master Griefer"... and the world, most especially the online community, is better for having experienced him in their lives even if he was the villain they so vehemently despised. (And rightfully so!)

His name was "Ralph". (He would have LOVED the Wreck It Ralph movie! It's so appropriate...)

The Master Griefer:


The Effective Playtester


I watched Ralph terrorize several online MUDs. Those communities rallied like white blood cells to combat his actions which is exactly what he wanted: attention and power. Over his shoulder, I watched as whole scenes played out over several days, weeks, and months as the community and admins tried their best to contain "The Griefer". What was great about it, though, was that he knew that whatever exploit he exposed would be fixed within a week or so. As I marveled at his expertise to find flaws, I wondered out loud to him why the game admins didn't try to get him on their side and circumvent the grief he was causing to the player-base. He'd shrug noncommittally and mutter "I dunno." I replied, "If I ever had someone like you poking holes as big as you are in my game (and Ralph was a big guy!), I'd be sure to get you on my side to help me fix things." He chuckled at that and responded with the essence of what this article is about: "They are fixing things because of me. I have to work harder every week to be as big a pain in the ass as I am to them now." I'll never forget that line. He was justifying his playstyle as being an effective service to the games he was screwing around with. And I can't say he was wrong!

This claim I am making would be incomplete without some examples. Of course, this could be incriminating in a court of law, but Ralph's dead now and he always worked alone. I never participated in anything illegal he did. I was just impressed by his skills and had a second-row seat.

One particular MUD he loved to harass banned all the accounts he made as fast as he could be identified. At one point, he got the entire BellSouth ISP banned from making accounts for that MUD. Ten minutes later through a couple proxies, he had another account made and was back up to his hijinks.

Same MUD, he caused the admins to enforce strong passwords on their server after harassing another account that stood up to his bullying in game. The victim's login name was "Merlin". Ralph talked smack about himself being an admin (not true) and waited for the guy to log out of the game. It's an insult to hackers to say that Ralph hacked the other guy's account. "Merlin's" password was "Excalibur". The victim was looted, the password was changed, and his character was left out in an area far above his level in case it was ever recovered. The strong password requirement came down a few days later.

Another MUD: Ralph caused naming conventions to be enforced. There was a player on the server that was decently respected and liked by the community named "Virgil". Ralph (over several accounts) and Virgil became heated enemies over time. So Ralph engaged in some character assassination by making an account named "VirgiI", replacing the lower case "L" with an uppercase "i". On the UNIX FTP screen on which the MUD was played, "Virgil" (VirgiL) and "VirgiI" (Virgii) were identically displayed down to the pixel. He only played "VirgiI" (Virgii) when he knew "Virgil" (VirgiL) wasn't around. Knowing Virgil so well from the time they fought each other verbally online, he mimicked the original's typing style and mannerisms. He even made derogatory comments about The Griefer account that Ralph was currently using. Subtly, over months, the derogatory comments became inflammatory and then outrageous to the point where Virgil (VirgiL) started suffering the repercussions of VirgiI's (Virgii's) actions. The whole thing took about 8 months to be discovered and VirgiI (Virgii) got banned. But when Ralph went to make another character, there was a new rule in place that would only allow the FIRST letter of a character's name to be capitalized.

EDIT: The naming was so effective, that it even looks the same in this article. I've added alternate spellings in parenthesis to help clarify things.

Same MUD as above: (Ralph really liked picking on this MUD!) There was a very underutilized character class on the server. It was a spellcasting class that was terrible at soloing as it was a support class. One spell it had was WEB. WEB was cast on an opponent and held that opponent in place for a period of time determined by some formula tied to the Personality trait. Personality was generally used for NPC interactions and shop discounts. Once it got to a certain level, it was never increased by the common player. For this one class and this one spell, though, Ralph figured out that if he dumped ALL his creation points into Personality and got the WEB spell right away, that WEB would root a target in place for 3 days in game time. Once more... in game time! And this MUD encouraged PVP. So Ralph would wander around picking fights and casting WEB as his first move. Yes, his character would die and respawn, but the other guy was stuck in that spot for 3 days! In game time! That meant 72 hours of logged-in game time. If they logged out, the timer paused until they logged back in. The fix for that came down in 2 weeks since most of the regular player base was stuck and couldn't play the game. The WEB spell became a tier-based timer based on Personality with a max cap of 5 minutes.

So what can be done?


I'm not defending Ralph or his behavior. He was an online monster. Calling him an early internet troll is as massive an understatement as like saying Jim Jones "had a few followers". He was a jerk of colossal proportions and liked being that way. I was fortunate to be on his "good side" and learn how to deal with people like him when I'm not on their "good side".

It takes what I call "social judo": redirecting the will of The Griefer to suit your own purposes. All Ralph wanted to do was to give someone a hard time online. Since it was online, it didn't really affect the person's "real world" life, right? (You couldn't disagree with Ralph. He'd grief you in a heartbeat with a big dopey smile on his face! It helped that he was technically right most of the time anyway.) So I accepted him for who he was. When I ran tabletop RPGs I let him make the characters the way he liked. I watched him. I talked to him and asked him for his opinion. I gave him what he wanted: attention and the power to shape the games he played. And by doing that, we got to be good friends and he worked WITH me in my assorted projects, making them mostly "Grief Proof" in the process.

Now time has passed and Ralph's been dead for several years now. (Natural physical health reasons.) The internet and online communities now have several controls in place to deal with the likes of him. But I know that he wasn't just one-of-a-kind and there are others like him out there. I assert that even though they are trouble, they should be embraced as a part of the community, not fought. If Ralph is any indication, they are geniuses struggling with their life circumstances and looking for some outlet for their feelings of frustration: needing to be in control of some aspect of their lives. Seriously, any MMO company would have benefitted tremendously from someone like Ralph as a permanent playtester. He would have broken their game in so many ways, the final product would have been bulletproof.

Article Update Log


16 Aug 2013: Wrote initial article points and incomplete first draft as a placeholder.

3 Sep 2013: Article approved for review. Minor clarifications and proofreading edits made.


GameDev.net Soapbox logo design by Mark "Prinz Eugn" Simpson

Useless Snippet #2: AABB/Frustum test

Welcome back. This time we'll take a look at one of the most common operations in a 3D graphics engine, frustum culling. How fast can such code become by using SIMD?

The problem


Classify whether a batch of AABBs are completely inside, completely outside or intersecting a frustum (6 planes).

Restrictions
  • AABBs are defined as (Center, Extent) pairs.
  • All vectors are Vector3f's.

Structs and initialization


#define OUTSIDE 0
#define INSIDE 1
#define INTERSECT 2 // or 3 depending on the algorithm used (see the discussion on the 2nd SSE version)

struct Vector3f
{
	float x, y, z;
};

struct AABB
{
	Vector3f m_Center;
	Vector3f m_Extent;
};

struct Plane
{
	float nx, ny, nz, d;
};

All the functions presented in the snippets below have the same signature:

void CullAABBList(AABB* aabbList, unsigned int numAABBs, Plane* frustumPlanes, unsigned int* aabbState);

As you can see, I've taken the state of the AABB (with regard to the specified frustum) outside of the struct itself. This has a couple of advantages.
  • You can cull the same AABB list with multiple different frustums in parallel without worrying about conflicts. Just pass a different aabbState array to each function call and you are done.
  • All relevant AABB data are close to each other, which helps reduce cache misses (since the array is expected to be read sequentially).
Finally, all arrays are assumed to be 16-byte aligned. This is required for the SSE version.

Let's take a look at some code.

C++ (reference implementation)


// Performance (cycles/AABB): Average = 102.5 (stdev = 12.0)
void CullAABBList_C(AABB* aabbList, unsigned int numAABBs, Plane* frustumPlanes, unsigned int* aabbState)
{
	for(unsigned int iAABB = 0;iAABB < numAABBs;++iAABB)
	{
		const Vector3f& aabbCenter = aabbList[iAABB].m_Center;
		const Vector3f& aabbSize = aabbList[iAABB].m_Extent;

		unsigned int result = INSIDE; // Assume that the aabb will be inside the frustum
		for(unsigned int iPlane = 0;iPlane < 6;++iPlane)
		{
			const Plane& frustumPlane = frustumPlanes[iPlane];

			float d = aabbCenter.x * frustumPlane.nx + 
				  aabbCenter.y * frustumPlane.ny + 
				  aabbCenter.z * frustumPlane.nz;

			float r = aabbSize.x * fast_fabsf(frustumPlane.nx) + 
			          aabbSize.y * fast_fabsf(frustumPlane.ny) + 
				  aabbSize.z * fast_fabsf(frustumPlane.nz);

			float d_p_r = d + r;
			float d_m_r = d - r;

			if(d_p_r < -frustumPlane.d)
			{
				result = OUTSIDE;
				break;
			}
			else if(d_m_r < -frustumPlane.d)
				result = INTERSECT;
		}

		aabbState[iAABB] = result;
	}
}

The code is based on method 4c from an excellent post by Fabian Giesen on AABB culling: View frustum culling. If you haven't read it yet, check it out now. He presents several different ways of performing the same test, each of which has its own advantages and disadvantages.

There's nothing special about the above code, except for the fact that it breaks out of the inner loop as soon as the box is classified as completely outside one of the planes. No need to check the others.
But we can do better. First, precalculate the absolute values of the plane normals once, outside the AABB loop. No need to recalculate the same thing over and over. We can also replace the floating-point comparisons with bitwise sign-bit checks. Let's take a look at the code.

C++ Optimized


// Performance (cycles/AABB): Average = 84.9 (stdev = 12.3)
void CullAABBList_C_Opt(AABB* __restrict aabbList, unsigned int numAABBs, Plane* __restrict frustumPlanes, unsigned int* __restrict aabbState)
{
	Plane absFrustumPlanes[6];
	for(unsigned int iPlane = 0;iPlane < 6;++iPlane)
	{
		absFrustumPlanes[iPlane].nx = fast_fabsf(frustumPlanes[iPlane].nx);
		absFrustumPlanes[iPlane].ny = fast_fabsf(frustumPlanes[iPlane].ny);
		absFrustumPlanes[iPlane].nz = fast_fabsf(frustumPlanes[iPlane].nz);
	}

	for(unsigned int iAABB = 0;iAABB < numAABBs;++iAABB)
	{
		const Vector3f& aabbCenter = aabbList[iAABB].m_Center;
		const Vector3f& aabbSize = aabbList[iAABB].m_Extent;

		unsigned int result = INSIDE; // Assume that the aabb will be inside the frustum
		for(unsigned int iPlane = 0;iPlane < 6;++iPlane)
		{
			const Plane& frustumPlane = frustumPlanes[iPlane];
			const Plane& absFrustumPlane = absFrustumPlanes[iPlane]; 

			float d = aabbCenter.x * frustumPlane.nx + 
				  aabbCenter.y * frustumPlane.ny + 
				  aabbCenter.z * frustumPlane.nz;

			float r = aabbSize.x * absFrustumPlane.nx + 
				  aabbSize.y * absFrustumPlane.ny + 
				  aabbSize.z * absFrustumPlane.nz;

			float d_p_r = d + r + frustumPlane.d;
			if(IsNegativeFloat(d_p_r))
			{
				result = OUTSIDE;
				break;
			}

			float d_m_r = d - r + frustumPlane.d;
			if(IsNegativeFloat(d_m_r))
				result = INTERSECT;
		}

		aabbState[iAABB] = result;
	}
}

fast_fabsf() and IsNegativeFloat() are simple functions which treat the float as int and remove/check the MSB.
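Their bodies aren't shown here, but based on that description they probably look roughly like the following (an assumption, not the code used for the measurements; a production version should also respect strict-aliasing rules, e.g. by going through memcpy):

inline float fast_fabsf(float f)
{
	unsigned int bits = *(unsigned int*)&f; // reinterpret the float's bits
	bits &= 0x7FFFFFFF;                     // clear the sign bit (the MSB)
	return *(float*)&bits;
}

inline bool IsNegativeFloat(float f)
{
	// A float is negative iff its sign bit (the MSB) is set.
	return (*(unsigned int*)&f & 0x80000000) != 0;
}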

I don't have any other C-level optimizations in mind. So let's see what we can do using SSE.

SSE (1 AABB at a time)


// 2013-09-10: Moved outside of the function body. Check comments by @Matias Goldberg for details.
__declspec(align(16)) static const unsigned int absPlaneMask[4] = {0x7FFFFFFF, 0x7FFFFFFF, 0x7FFFFFFF, 0xFFFFFFFF};

// Performance (cycles/AABB): Average = 63.9 (stdev = 10.8)
void CullAABBList_SSE_1(AABB* aabbList, unsigned int numAABBs, Plane* frustumPlanes, unsigned int* aabbState)
{
	__declspec(align(16)) Plane absFrustumPlanes[6];

	__m128 xmm_absPlaneMask = _mm_load_ps((float*)&absPlaneMask[0]);
	for(unsigned int iPlane = 0;iPlane < 6;++iPlane)
	{
		__m128 xmm_frustumPlane = _mm_load_ps(&frustumPlanes[iPlane].nx);
		__m128 xmm_absFrustumPlane = _mm_and_ps(xmm_frustumPlane, xmm_absPlaneMask);
		_mm_store_ps(&absFrustumPlanes[iPlane].nx, xmm_absFrustumPlane);
	}

	for(unsigned int iAABB = 0;iAABB < numAABBs;++iAABB)
	{
		__m128 xmm_aabbCenter_x = _mm_load_ss(&aabbList[iAABB].m_Center.x);
		__m128 xmm_aabbCenter_y = _mm_load_ss(&aabbList[iAABB].m_Center.y);
		__m128 xmm_aabbCenter_z = _mm_load_ss(&aabbList[iAABB].m_Center.z);
		__m128 xmm_aabbExtent_x = _mm_load_ss(&aabbList[iAABB].m_Extent.x);
		__m128 xmm_aabbExtent_y = _mm_load_ss(&aabbList[iAABB].m_Extent.y);
		__m128 xmm_aabbExtent_z = _mm_load_ss(&aabbList[iAABB].m_Extent.z);

		unsigned int result = INSIDE; // Assume that the aabb will be inside the frustum
		for(unsigned int iPlane = 0;iPlane < 6;++iPlane)
		{
			__m128 xmm_frustumPlane_Component = _mm_load_ss(&frustumPlanes[iPlane].nx);
			__m128 xmm_d = _mm_mul_ss(xmm_aabbCenter_x, xmm_frustumPlane_Component);
			
			xmm_frustumPlane_Component = _mm_load_ss(&frustumPlanes[iPlane].ny);
			xmm_d = _mm_add_ss(xmm_d, _mm_mul_ss(xmm_aabbCenter_y, xmm_frustumPlane_Component));

			xmm_frustumPlane_Component = _mm_load_ss(&frustumPlanes[iPlane].nz);
			xmm_d = _mm_add_ss(xmm_d, _mm_mul_ss(xmm_aabbCenter_z, xmm_frustumPlane_Component));

			__m128 xmm_absFrustumPlane_Component = _mm_load_ss(&absFrustumPlanes[iPlane].nx);
			__m128 xmm_r = _mm_mul_ss(xmm_aabbExtent_x, xmm_absFrustumPlane_Component);

			xmm_absFrustumPlane_Component = _mm_load_ss(&absFrustumPlanes[iPlane].ny);
			xmm_r = _mm_add_ss(xmm_r, _mm_mul_ss(xmm_aabbExtent_y, xmm_absFrustumPlane_Component));

			xmm_absFrustumPlane_Component = _mm_load_ss(&absFrustumPlanes[iPlane].nz);
			xmm_r = _mm_add_ss(xmm_r, _mm_mul_ss(xmm_aabbExtent_z, xmm_absFrustumPlane_Component));

			__m128 xmm_frustumPlane_d = _mm_load_ss(&frustumPlanes[iPlane].d);
			__m128 xmm_d_p_r = _mm_add_ss(_mm_add_ss(xmm_d, xmm_r), xmm_frustumPlane_d);
			__m128 xmm_d_m_r = _mm_add_ss(_mm_sub_ss(xmm_d, xmm_r), xmm_frustumPlane_d);

			// Shuffle d_p_r and d_m_r in order to perform only one _mm_movemask_ps
			__m128 xmm_d_p_r__d_m_r = _mm_shuffle_ps(xmm_d_p_r, xmm_d_m_r, _MM_SHUFFLE(0, 0, 0, 0));
			int negativeMask = _mm_movemask_ps(xmm_d_p_r__d_m_r);

			// Bit 0 holds the sign of d + r and bit 2 holds the sign of d - r
			if(negativeMask & 0x01)
			{
				result = OUTSIDE;
				break;
			}
			else if(negativeMask & 0x04)
				result = INTERSECT;
		}

		aabbState[iAABB] = result;
	}
}

Again, since we are processing one AABB at a time, we can break out of the inner loop as soon as the AABB is found to be completely outside one of the 6 planes.

The SSE code should be straightforward. All arithmetic operations are scalar and we use _mm_movemask_ps in order to extract the signs of (d + r) and (d - r). Depending on the signs, we classify the AABB as completely outside or intersecting the frustum. Even though we are testing one AABB at a time, the code is faster than the optimized C++ implementation.

The "problem" with the above snippet is that all SSE operations are scalar. We can do better by swizzling the data from 4 AABBs in such a way that it will be possible to calculate 4 (d + r) and 4 (d - r) simultaneously. Let's try that.

SSE (4 AABBs at a time)


// 2013-09-10: Moved outside of the function body. Check comments by @Matias Goldberg for details.
__declspec(align(16)) static const unsigned int absPlaneMask[4] = {0x7FFFFFFF, 0x7FFFFFFF, 0x7FFFFFFF, 0xFFFFFFFF};

// Performance (cycles/AABB): Average = 24.1 (stdev = 4.2)
void CullAABBList_SSE_4(AABB* aabbList, unsigned int numAABBs, Plane* frustumPlanes, unsigned int* aabbState)
{
	__declspec(align(16)) Plane absFrustumPlanes[6];
	__m128 xmm_absPlaneMask = _mm_load_ps((float*)&absPlaneMask[0]);
	for(unsigned int iPlane = 0;iPlane < 6;++iPlane)
	{
		__m128 xmm_frustumPlane = _mm_load_ps(&frustumPlanes[iPlane].nx);
		__m128 xmm_absFrustumPlane = _mm_and_ps(xmm_frustumPlane, xmm_absPlaneMask);
		_mm_store_ps(&absFrustumPlanes[iPlane].nx, xmm_absFrustumPlane);
	}

	// Process 4 AABBs in each iteration...
	unsigned int numIterations = numAABBs >> 2;
	for(unsigned int iIter = 0;iIter < numIterations;++iIter)
	{
		// NOTE: Since the aabbList is 16-byte aligned, we can use aligned moves.
		// Load the 4 Center/Extents pairs for the 4 AABBs.
		__m128 xmm_cx0_cy0_cz0_ex0 = _mm_load_ps(&aabbList[(iIter << 2) + 0].m_Center.x);
		__m128 xmm_ey0_ez0_cx1_cy1 = _mm_load_ps(&aabbList[(iIter << 2) + 0].m_Extent.y);
		__m128 xmm_cz1_ex1_ey1_ez1 = _mm_load_ps(&aabbList[(iIter << 2) + 1].m_Center.z);
		__m128 xmm_cx2_cy2_cz2_ex2 = _mm_load_ps(&aabbList[(iIter << 2) + 2].m_Center.x);
		__m128 xmm_ey2_ez2_cx3_cy3 = _mm_load_ps(&aabbList[(iIter << 2) + 2].m_Extent.y);
		__m128 xmm_cz3_ex3_ey3_ez3 = _mm_load_ps(&aabbList[(iIter << 2) + 3].m_Center.z);

		// Shuffle the data in order to get all Xs, Ys, etc. in the same register.
		__m128 xmm_cx0_cy0_cx1_cy1 = _mm_shuffle_ps(xmm_cx0_cy0_cz0_ex0, xmm_ey0_ez0_cx1_cy1, _MM_SHUFFLE(3, 2, 1, 0));
		__m128 xmm_cx2_cy2_cx3_cy3 = _mm_shuffle_ps(xmm_cx2_cy2_cz2_ex2, xmm_ey2_ez2_cx3_cy3, _MM_SHUFFLE(3, 2, 1, 0));
		__m128 xmm_aabbCenter0123_x = _mm_shuffle_ps(xmm_cx0_cy0_cx1_cy1, xmm_cx2_cy2_cx3_cy3, _MM_SHUFFLE(2, 0, 2, 0));
		__m128 xmm_aabbCenter0123_y = _mm_shuffle_ps(xmm_cx0_cy0_cx1_cy1, xmm_cx2_cy2_cx3_cy3, _MM_SHUFFLE(3, 1, 3, 1));

		__m128 xmm_cz0_ex0_cz1_ex1 = _mm_shuffle_ps(xmm_cx0_cy0_cz0_ex0, xmm_cz1_ex1_ey1_ez1, _MM_SHUFFLE(1, 0, 3, 2));
		__m128 xmm_cz2_ex2_cz3_ex3 = _mm_shuffle_ps(xmm_cx2_cy2_cz2_ex2, xmm_cz3_ex3_ey3_ez3, _MM_SHUFFLE(1, 0, 3, 2));
		__m128 xmm_aabbCenter0123_z = _mm_shuffle_ps(xmm_cz0_ex0_cz1_ex1, xmm_cz2_ex2_cz3_ex3, _MM_SHUFFLE(2, 0, 2, 0));
		__m128 xmm_aabbExtent0123_x = _mm_shuffle_ps(xmm_cz0_ex0_cz1_ex1, xmm_cz2_ex2_cz3_ex3, _MM_SHUFFLE(3, 1, 3, 1));

		__m128 xmm_ey0_ez0_ey1_ez1 = _mm_shuffle_ps(xmm_ey0_ez0_cx1_cy1, xmm_cz1_ex1_ey1_ez1, _MM_SHUFFLE(3, 2, 1, 0));
		__m128 xmm_ey2_ez2_ey3_ez3 = _mm_shuffle_ps(xmm_ey2_ez2_cx3_cy3, xmm_cz3_ex3_ey3_ez3, _MM_SHUFFLE(3, 2, 1, 0));
		__m128 xmm_aabbExtent0123_y = _mm_shuffle_ps(xmm_ey0_ez0_ey1_ez1, xmm_ey2_ez2_ey3_ez3, _MM_SHUFFLE(2, 0, 2, 0));
		__m128 xmm_aabbExtent0123_z = _mm_shuffle_ps(xmm_ey0_ez0_ey1_ez1, xmm_ey2_ez2_ey3_ez3, _MM_SHUFFLE(3, 1, 3, 1));

		unsigned int in_out_flag = 0x0F; // = 01111b Assume that all 4 boxes are inside the frustum.
		unsigned int intersect_flag = 0x00; // = 00000b if intersect_flag[i] == 1 then this box intersects the frustum.
		for(unsigned int iPlane = 0;iPlane < 6;++iPlane)
		{
			// Calculate d...
			__m128 xmm_frustumPlane_Component = _mm_load_ps1(&frustumPlanes[iPlane].nx);
			__m128 xmm_d = _mm_mul_ps(xmm_frustumPlane_Component, xmm_aabbCenter0123_x);

			xmm_frustumPlane_Component = _mm_load_ps1(&frustumPlanes[iPlane].ny);
			xmm_frustumPlane_Component = _mm_mul_ps(xmm_frustumPlane_Component, xmm_aabbCenter0123_y);
			xmm_d = _mm_add_ps(xmm_d, xmm_frustumPlane_Component);

			xmm_frustumPlane_Component = _mm_load_ps1(&frustumPlanes[iPlane].nz);
			xmm_frustumPlane_Component = _mm_mul_ps(xmm_frustumPlane_Component, xmm_aabbCenter0123_z);
			xmm_d = _mm_add_ps(xmm_d, xmm_frustumPlane_Component);

			// Calculate r...
			xmm_frustumPlane_Component = _mm_load_ps1(&absFrustumPlanes[iPlane].nx);
			__m128 xmm_r = _mm_mul_ps(xmm_aabbExtent0123_x, xmm_frustumPlane_Component);

			xmm_frustumPlane_Component = _mm_load_ps1(&absFrustumPlanes[iPlane].ny);
			xmm_frustumPlane_Component = _mm_mul_ps(xmm_frustumPlane_Component, xmm_aabbExtent0123_y);
			xmm_r = _mm_add_ps(xmm_r, xmm_frustumPlane_Component);

			xmm_frustumPlane_Component = _mm_load_ps1(&absFrustumPlanes[iPlane].nz);
			xmm_frustumPlane_Component = _mm_mul_ps(xmm_frustumPlane_Component, xmm_aabbExtent0123_z);
			xmm_r = _mm_add_ps(xmm_r, xmm_frustumPlane_Component);

			// Calculate d + r + frustumPlane.d
			__m128 xmm_d_p_r = _mm_add_ps(xmm_d, xmm_r);
			xmm_frustumPlane_Component = _mm_load_ps1(&frustumPlanes[iPlane].d);
			xmm_d_p_r = _mm_add_ps(xmm_d_p_r, xmm_frustumPlane_Component);

			// Check which boxes are outside this plane (if any)...
			// NOTE: At this point whichever component of the xmm_d_p_r reg is negative, the corresponding 
			// box is outside the frustum. 
			unsigned int in_out_flag_curPlane = _mm_movemask_ps(xmm_d_p_r);
			in_out_flag &= ~in_out_flag_curPlane; // NOTed the mask because it's 1 for each box which is outside the frustum, and in_out_flag holds the opposite.

			// If all boxes have been marked as outside the frustum, stop checking the rest of the planes.
			if(!in_out_flag)
				break;

			// Calculate d - r + frustumPlane.d
 			__m128 xmm_d_m_r = _mm_sub_ps(xmm_d, xmm_r);
 			xmm_d_m_r = _mm_add_ps(xmm_d_m_r, xmm_frustumPlane_Component);
			
			// Check which boxes intersect the frustum...
			unsigned int intersect_flag_curPlane = _mm_movemask_ps(xmm_d_m_r);
			intersect_flag |= intersect_flag_curPlane;
		}

		// Calculate the state of the AABB from the 2 flags.
		// If the i-th bit from in_out_flag is 0, then the result will be 0 independent of the value of intersect_flag
		// If the i-th bit from in_out_flag is 1, then the result will be either 1 or 2 depending on the intersect_flag.
		aabbState[(iIter << 2) + 0] = ((in_out_flag & 0x00000001) >> 0) << ((intersect_flag & 0x00000001) >> 0);
		aabbState[(iIter << 2) + 1] = ((in_out_flag & 0x00000002) >> 1) << ((intersect_flag & 0x00000002) >> 1);
		aabbState[(iIter << 2) + 2] = ((in_out_flag & 0x00000004) >> 2) << ((intersect_flag & 0x00000004) >> 2);
		aabbState[(iIter << 2) + 3] = ((in_out_flag & 0x00000008) >> 3) << ((intersect_flag & 0x00000008) >> 3);
	}

	// Process the rest of the AABBs one by one...
	for(unsigned int iAABB = numIterations << 2; iAABB < numAABBs;++iAABB)
	{
		// NOTE: This loop is identical to the CullAABBList_SSE_1() loop. Not shown in order to keep this snippet small.
	}
}

Compared to the original (unoptimized) C++ version, it's 4x faster. But that's an unfair comparison. Compared to the scalar SSE version it's about 2.5x faster. Not bad, don't you think? Can it get any better? Probably yes. The problem with the 2 SSE versions is the lack of enough XMM registers to keep all the AABB data resident and avoid using the stack for storing intermediate results. Unfortunately, we need 6 registers for the 4 Center/Extent pairs, and the remaining 2 aren't enough for the inner loop. Compiling for 64-bit, where 16 XMM registers are available, should give better results because of that.

Another way to optimize this algorithm further is to have the data already laid out the way the code expects it (in SoA form). This gets rid of the shuffles (there will still be 6 memory loads).
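To make that more concrete, here is a minimal sketch of what such a layout could look like (the struct and field names are illustrative, not taken from the article's code, and alignas assumes a C++11 compiler; older compilers need their own alignment attribute). Each batch of 4 AABBs keeps its components in separate 16-byte aligned arrays, so the loads at the top of the loop become plain _mm_load_ps() calls and the shuffles disappear:

struct alignas(16) AABBBatchSoA
{
	float centerX[4], centerY[4], centerZ[4]; // centers of AABBs 0..3
	float extentX[4], extentY[4], extentZ[4]; // extents of AABBs 0..3
};

// Inside the loop, the AoS loads + shuffles would then reduce to e.g.:
// __m128 xmm_aabbCenter0123_x = _mm_load_ps(batch.centerX);
// __m128 xmm_aabbCenter0123_y = _mm_load_ps(batch.centerY);
// ...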

Finally, the code used to calculate the state of each AABB from the two bitfields can change. One other way of doing it is the following:

intersect_flag &= in_out_flag; // mask out the invalid "outside but intersecting" combination
intersect_flag <<= 1;          // align the intersect bit of box i with bit (i + 1)
aabbState[i] = ((in_out_flag & (1 << i)) | (intersect_flag & (1 << (i + 1)))) >> i;

This way, we can get rid of the variable shifts (when the 4 statements are written out explicitly, all shifts and ANDs are compile-time constants). The difference from the previous version is that the intersection state is 3 instead of 2. In this scheme, 2 would mean that the box is both outside and intersecting the frustum, which is an invalid case; that's why we AND intersect_flag with in_out_flag first, so the value 2 can never be produced.
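For illustration, here is this branch-free extraction unrolled for one 4-box batch (a sketch; the helper name is made up, and aabbState is assumed to hold one integer per box). in_out_flag and intersect_flag are the two masks produced by _mm_movemask_ps() in the loop above:

static void writeAABBStates(unsigned int in_out_flag, unsigned int intersect_flag, unsigned int* aabbState)
{
	intersect_flag &= in_out_flag; // kill the invalid "outside + intersecting" combination
	intersect_flag <<= 1;          // line the intersect bit of box i up with bit (i + 1)

	aabbState[0] = ((in_out_flag & 0x1) | (intersect_flag & 0x2))  >> 0;
	aabbState[1] = ((in_out_flag & 0x2) | (intersect_flag & 0x4))  >> 1;
	aabbState[2] = ((in_out_flag & 0x4) | (intersect_flag & 0x8))  >> 2;
	aabbState[3] = ((in_out_flag & 0x8) | (intersect_flag & 0x10)) >> 3;
	// Resulting states: 0 = outside, 1 = completely inside, 3 = intersecting.
}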

Unfortunately, in practice this doesn't make a big difference in performance. So, it's a matter of taste.

Results


Below is a table with more performance data from all the above snippets. Two cases have been tested; for both of them, the frustum is the [0, 1]^3 box. The first one is a random case, where all the boxes are randomly placed in the [-1.0, 2.0] box and have random sizes in the [0.1, 0.2] range. The other case is the worst case, where all boxes are inside the frustum.
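For reference, the random test set could be generated with something like the following (a sketch, not the article's benchmark code; the AABB struct and its field names are placeholders, with the random size used directly as the extent):

#include <cstdlib>

struct AABB
{
	float centerX, centerY, centerZ;
	float extentX, extentY, extentZ;
};

static float randomInRange(float minVal, float maxVal)
{
	return minVal + (maxVal - minVal) * (rand() / (float)RAND_MAX);
}

static void generateRandomTestCase(AABB* aabbList, unsigned int numAABBs)
{
	for(unsigned int i = 0; i < numAABBs; ++i)
	{
		// Centers anywhere inside the [-1.0, 2.0] box...
		aabbList[i].centerX = randomInRange(-1.0f, 2.0f);
		aabbList[i].centerY = randomInRange(-1.0f, 2.0f);
		aabbList[i].centerZ = randomInRange(-1.0f, 2.0f);
		// ...and sizes in the [0.1, 0.2] range.
		aabbList[i].extentX = randomInRange(0.1f, 0.2f);
		aabbList[i].extentY = randomInRange(0.1f, 0.2f);
		aabbList[i].extentZ = randomInRange(0.1f, 0.2f);
	}
}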

Method      Random (32)     Worst (32)      Random (1024) *  Worst (1024)
C++ (ref)   105.2 (14.0)    181.5 (21.0)    102.5 (12.0)     159.3 (20.3)
C++ (opt)    96.1 (10.3)    138.0 (17.2)     84.9 (12.3)     119.8 (16.3)
SSE_1        73.7 (9.8)      93.2 (13.8)     63.9 (10.8)      72.8 (9.8)
SSE_4        26.5 (4.0)      27.6 (4.3)      24.1 (4.2)       22.9 (4.0)

Table: Performance comparison between the 4 snippets. 2 batch sizes, 32 and 1024 AABBs. The column marked with * contains the data shown in the post.

That's all folks. Thanks for reading. Any corrections/suggestions are welcome.

Changes

2013-09-06: Changed C to C++ because the code uses references. Thanks to @zdlr for pointing that out.
2013-09-10: Removed underscore from structure names (comment by @NightCreature83). Also moved definition of absFrustumPlaneMask outside of the two SSE functions (see comments by @Matias Goldberg for details).

Introduction to the Graphics Pipeline


Introduction


This article is mainly intended to give some introductory background information about the graphics pipeline in a triangle-based rendering scheme and how it maps to the different system components. We'll only cover the parts of the pipeline that are relevant to understanding the rendering of a single triangle with OpenGL.

Graphics Pipeline


The basic functionality of the graphics pipeline is to transform your 3D scene, given a certain camera position and camera orientation, into a 2D image that represents the 3D scene from this camera's viewpoint. We'll start by giving an overview of this graphics pipeline for a triangle-based rendering scheme in the following paragraph. Subsequent paragraphs will then elaborate on the identified components.

High-level Graphics Pipeline Overview


We'll discuss the graphics pipeline from what can be seen in figure 1. This figure shows the application running on the CPU as the starting point of the graphics pipeline. The application is responsible for the creation of the vertices, and it will use a 3D API to instruct the CPU/GPU to draw these vertices to the screen.


Attached Image: Graphics_pipeline1.png
Figure 1: Functional Graphics Pipeline


We'll typically want to transfer our vertices to the memory of the GPU. As soon as the vertices have arrived on the GPU, they can be used as input to the shader stages of the GPU. The first shader stage is the vertex shader, followed by the fragment shader. The input of the fragment shader is provided by the rasterizer, and the output of the fragment shader is captured in a color buffer which resides in the backbuffer of our double-buffered framebuffer. The contents of the frontbuffer of the double-buffered framebuffer are displayed on the screen. In order to create animation, the front- and backbuffer will need to swap roles as soon as a new image has been rendered to the backbuffer.

Geometry and Primitives


Typically, our application is the place where we define the geometry that we want to render to the screen. This geometry can be defined by points, lines, triangles, quads, triangle strips... These are so-called geometric primitives, since they can be used to build up the desired geometry: a square, for example, can be composed out of 2 triangles, and a triangle can be composed from 3 points. Let's assume we want to render a triangle; you can then define 3 points in your application, which is exactly what we'll do here. These points will initially reside in system memory. The GPU will need access to these points, and this is where the 3D API, such as Direct3D or OpenGL, comes into play. Your application will use the 3D API to transfer the defined vertices from system memory into the GPU memory. Also note that the order of the points cannot be random; this will be discussed when we consider primitive assembly.
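As a small, hedged example of that last step, here is roughly what handing 3 points to the GPU could look like with modern OpenGL (the coordinates and buffer name are illustrative, and a valid OpenGL context plus loaded GL headers are assumed):

float trianglePositions[] = {
	-0.5f, -0.5f, 0.0f,   // point 1
	 0.5f, -0.5f, 0.0f,   // point 2
	 0.0f,  0.5f, 0.0f    // point 3 (anti-clockwise order, see Primitive Assembly below)
};

GLuint vbo = 0;
glGenBuffers(1, &vbo);                                    // create a buffer object that lives in GPU memory
glBindBuffer(GL_ARRAY_BUFFER, vbo);                       // make it the current vertex buffer
glBufferData(GL_ARRAY_BUFFER, sizeof(trianglePositions),  // copy the points from system memory to GPU memory
             trianglePositions, GL_STATIC_DRAW);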

Vertices


In graphics programming, we tend to add some more meaning to a vertex than its mathematical definition. In mathematics you could say that a vertex defines the location of a point in space. In graphics programming, however, we generally add some additional information. Suppose we already know that we would like to render a green point; then this color information can be added. So we'll have a vertex that contains location as well as color information. Figure 2 clarifies this aspect: you can see a more classical "mathematical" point definition on the left and a "graphics programming" definition on the right.


Attached Image: PointVSVertex.png
Figure 2: Pure "mathematics" view on the left versus a "graphics programming" view on the right
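In code, the vertex on the right of figure 2 could simply be represented like this (a sketch; the exact field layout is up to the application):

struct Vertex
{
	float x, y, z;    // location in space
	float r, g, b;    // color information, e.g. (0, 1, 0) for a green vertex
};

Vertex greenVertex = { 0.0f, 0.5f, 0.0f,   0.0f, 1.0f, 0.0f };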


Shaders - Vertex Shaders


Shaders can be seen as programs, taking inputs and transforming them into outputs. It is interesting to understand that a given shader is executed multiple times in parallel for independent input values: since the input values are independent and need to be processed in exactly the same way, the processing can be done in parallel.

We can consider the vertices of a triangle as independent inputs to the vertex shaders. Figure 3 tries to clarify this with a "pass-through" vertex shader. A "pass-through" vertex shader will take the shader inputs and will pass these to its output without modifying them: the vertices P1, P2 and P3 from the triangle are fetched from memory, each individual vertex is fed to vertex shader instances which run in parallel. The outputs from the vertex shaders are fed into the primitive assembly stage.


Attached Image: VertexShader.png
Figure 3: Clarification of shaders
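As an illustration, a "pass-through" vertex shader for a vertex like the one in figure 2 (position plus color) could look like the following GLSL source, stored here as a C++ string literal; the attribute and varying names are illustrative:

const char* passThroughVertexShader = R"(
	#version 330 core
	layout(location = 0) in vec3 inPosition; // per-vertex location
	layout(location = 1) in vec3 inColor;    // per-vertex color

	out vec3 vColor;                         // handed on towards the fragment shader

	void main()
	{
		vColor = inColor;                    // pass the color through unmodified
		gl_Position = vec4(inPosition, 1.0); // pass the position through unmodified
	}
)";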


Primitive Assembly


The primitive assembly stage will break our geometry down into the most elementary primitives such as points, lines and triangles. For triangles, it will also determine whether they are visible or not, based on the "winding" of the triangle. In OpenGL, an anti-clockwise-wound triangle is considered front-facing by default and will thus be visible. Clockwise-wound triangles are considered back-facing and will thus be culled (removed from rendering).
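This behaviour maps onto a few pieces of OpenGL render state; a minimal sketch (assuming a current OpenGL context) could be:

glEnable(GL_CULL_FACE);   // turn on culling of back-facing triangles (disabled by default)
glFrontFace(GL_CCW);      // anti-clockwise winding is front-facing (the default)
glCullFace(GL_BACK);      // remove back-facing triangles from rendering (the default)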

Rasterization


After the visible primitives have been determined by the primitive assembly stage, it is up to the rasterization stage to determine which pixels of the viewport need to be lit: the primitive is broken down into its constituent fragments. This can be seen in figure 4: the cells represent the individual pixels, the pixels marked in grey are the pixels that are covered by the primitive, and they indicate the fragments of the triangle.


Attached Image: Rasterization.png
Figure 4: Rasterization of a primitive into 58 fragments


We see how the rasterization has divided the primitive into 58 fragments. These fragments are passed on to the fragment shader stage.

Fragment Shaders


Each of these 58 fragments generated by the rasterization stage will be processed by fragment shaders. The general role of the fragment shader is to calculate the shading function, which is a function that indicates how light will interact with the fragment, resulting in a desired color for the given fragment. A big advantage of these fragments is that they can be treated independently of each other, meaning that the shader programs can run in parallel. After the color has been determined, this color is passed on to the framebuffer.
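A minimal fragment shader that goes with the pass-through vertex shader sketched earlier could look like this (again GLSL in a C++ string literal; names are illustrative, and the shading function here is simply the interpolated vertex color):

const char* simpleFragmentShader = R"(
	#version 330 core
	in vec3 vColor;      // interpolated color coming from the vertex shader
	out vec4 fragColor;  // the color that ends up in the color buffer

	void main()
	{
		fragColor = vec4(vColor, 1.0); // write the fragment's final color
	}
)";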

Framebuffer


From figure 1, we already learned that we are using a double-buffered framebuffer, which means that we have 2 buffers, a frontbuffer and a backbuffer. Each of these buffers contains a color buffer. Now the big difference between the frontbuffer and the backbuffer is that the frontbuffer's contents are actually being shown on the screen, whereas the backbuffer's contents are basically (I'm neglecting the blend stage at this point) being written by the fragment shaders. As soon as all our geometry has been rendered into the backbuffer, the front- and backbuffer can be swapped. This means that the frontbuffer becomes the backbuffer and the backbuffer becomes the frontbuffer.

Figure 1 and figure 5 represent these buffer swaps with the red arrows. In figure 1, you can see how color buffer 1 is used as color buffer for the backbuffer, whereas color buffer 2 is used for the frontbuffer. The situation is reversed in figure 5.


Attached Image: Graphics_pipeline2.png
Figure 5: Functional Graphics Pipeline with swapped front- and backbuffer
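As a rough sketch of how this swap shows up in application code, here is a minimal render loop, assuming a GLFW window with an OpenGL context has already been created and the triangle's buffers and shaders are set up (the swap call differs per windowing API, e.g. SwapBuffers on raw Win32):

while (!glfwWindowShouldClose(window))
{
	glClear(GL_COLOR_BUFFER_BIT);      // start from a clean color buffer in the backbuffer
	glDrawArrays(GL_TRIANGLES, 0, 3);  // render our triangle into the backbuffer

	glfwSwapBuffers(window);           // swap roles: the backbuffer becomes the visible frontbuffer
	glfwPollEvents();                  // keep the window responsive
}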


This last paragraph concludes our tour through the graphics pipeline. We now have a basic understanding of how vertices and triangles end up on our screen.

Further reading


If you are interested in exploring the graphics pipeline in more detail and reading up on, e.g., other shader stages or the blending stage, then by all means feel free to have a look at this.

If you want to have an impression of the OpenGL pipeline map, click on the link.

This article was based on an article I originally wrote for my blog.