One of the awesome things about ArenaNet is that we run a programming internship program that actually does a phenomenal job of preparing people to work in the games industry. This is accomplished by focusing on three primary principles:
Last week I had the chance to write down some thoughts about his performance so far in the program, and offer some feedback. After sitting down to re-read my letter, it struck me that there's a lot of stuff in there that might be useful for anyone who is just learning to work on large projects with large teams.
You may not be working on the next great MMO (or maybe you are!) but I think there's some value in this advice for anyone who is early on in their programming career.
This is my variant of the old carpenter's rule of "measure twice, cut once." In general, one of the biggest challenges of working on large-scale projects is keeping in mind all the ramifications of your decisions. Sometimes those implications are easier to see, and sometimes there's just no way to know ahead of time. But either way, it pays to take some time when writing code (and after writing code) to think very carefully about it.
One thing I personally like to do is let my non-trivial changelists sit for a day or so and then come back to them with a fresh mind, and re-read the code. I try to approach it as if I'd never seen the code before and was responsible for a code review on it. There are two directions that need to be considered: the tiny details, and the big-picture implications. Usually I will do two passes, one for each frame of mind.
I'll cover some of the small details later; the big things are generally harder anyways. It takes a lot of practice and experience to intuitively spot the consequences of design decisions; this has two important facets. First, it means that it won't be immediately obvious most of the time when you make a decision that has large-scale effects. Second, it means that it will take effort and conscious deliberation to train yourself to recognize those situations. The best suggestion I can offer is to pause often and try and envision the future - which I'll tackle next
I could also say "be nice to everyone else who will ever read your code" but that doesn't make for as nice of a section heading. The general idea here is that code is written once and then lives a very, very, very long time. The natural impact of this is that people will have to read the code many times while it survives. Again there are two directions this can go in: tiny details, and large-scale impacts, and again, the details are easier to spot - especially at first.
Some concrete examples are probably in order at this point. For details, one of the things that goes a long way is simple formatting. It may seem almost overbearingly anal-retentive to complain about whitespace and which line your braces go on, but it has an impact. After twenty-odd years of reading code, you get to a point where you can recognize common patterns very easily. Especially in a workplace like ArenaNet with a strict and consistently-followed formatting standard, this is a major time-saver; if everyone writes code that looks similar, it's easier to spot "weird" things. If my brain is distracted while reading a piece of code by the formatting, it gets harder to focus on the meaning and intent of the code itself.
On the larger scale, there are things like comments. Code commenting is a religious warfare issue in most of the world, and even ArenaNet has lots of diverse viewpoints on how it should be done. However, we have some basic philosophical common ground that is very helpful.
Sometimes comments need to go away, and sometimes they need to be added. Comments that need to go away are generally guilty of at least one of the following crimes:
This is a lot more detail-oriented stuff. There are a number of conventions that any team will typically have that are important and help readability and clarity of intent.
There are many specific examples in ArenaNet's code standards documentation, but there are other conventions that are more general and likely to be in use in almost any environment. For example, pluralization is important. Iterator variables should be singular ("item" or "currency" or "character"). Containers should be plural ("items" or "currencies" or "characters"). The same goes for function names; if a function does one thing, name it singularly. If it does multiple things, or does one thing to multiple things, pluralize appropriately.
In general names are hard to select and should be chosen with great care. If a function's purpose changes, change its name as well. Always make sure a name corresponds to the best description of intent that you can manage. (Just don't use extremelyVerboseVariableNamesWithLotsOfExtraneousDetails.)
Extra parameters that aren't used by a function should be removed. (There are compiler warnings for this, by the way - you should always build with warnings as errors and the maximum warning level that you can manage.) Similarly, make sure that all variables are actually used and do something. Make sure that function calls are useful. For example, initializing a variable and then immediately changing its value just creates extra noise that confuses the reader. Passing nothing but a string literal to sprintf() and then printf()ing the resulting buffer is confusing as well, and a little wasteful.
This is partly about readability and partly about real efficiency. In large code bases like ours, the death isn't from a single massively wasteful piece of code - it's from thousands of tiny decisions that add up over time... both in terms of reading code, and in terms of how it performs at runtime. Both memory and processing power are something to keep in mind. They may seem cheap (or even free) at times, especially in environments with immense resources. But don't forget that everything has a cost and those costs accumulate a lot faster than we might wish sometimes.
A corollary to this is cleanup practices. If you remove a piece of functionality, make sure all vestiges of it are gone - comments, preparatory code, cleanup code, etc. It's easy to forget pieces of logic when removing things, and this just leads to more noise and confusion for the next reader. Once again, re-reading your own code with a critical eye helps a lot here.
Where things live in code as well as data is always an important consideration. Some things don't need to be wrapped into classes - if there's no state being carried around, or no shared interface that must be used, favor free functions instead of building classes that are just methods. On the data side, make sure to scope things as tightly as you can, and declare things as close as possible to their first point of use. Some things don't need to be file-level variables (let alone globals), and can just live as locals in some function. Some things don't even need to live the entire lifetime of a function. RAII is a big deal in C++, and should be used liberally.
File organization is also a big thing to keep in mind. Keeping related bits of code in self-contained files is a good habit; but it takes some careful thought to decide the taxonomy for what is "related." Think of programs like a pipeline. Each segment of pipe (module) should do one very specific and very contained thing. To build the whole program, you link together segments of pipe (modules/classes/etc.) and compose them into a more sophisticated machine.
You should always try and emulate the code you're working in if someone else owns it. Follow the style, the naming patterns, the architectural decisions, and so on. Often you can make your life harder by departing from established convention, and you will definitely make the lives of everyone else harder at the same time.
Don't be shy to ask questions if any of those decisions or patterns are unclear. I know that a lot of this stuff seems a bit vague and mystical right now; much of the reasoning for why things are the way they are may not be immediately apparent. Asking questions is always good - if there are reasons for why things are a certain way, then you get to learn something; and if there are no reasons, or bad reasons, you open up an opportunity to make things better.
One of the highest aspirations we should have as team programmers is to leave code better than we found it. This ranges from fixing minor details to cleaning up design decisions and adding documentation. This is something you should try for even if you're not in your "own" code - often if there is objective room for improvement, the owner will actually appreciate you taking the time to make his area nicer. Obviously this doesn't mean you should go on a rampage to change every code file just for the sake of it, and there's always value in consulting with an owner before making changes - but you get the idea.
It can be very tempting in life to plateau. Sometimes we just want to feel like we have "arrived" and now we "get it." And indeed there will be milestones in your career where you can notice a profound change in your skills and perspective.
The key is to never stop chasing the next boost. Even after more than twenty years of writing computer programs, I learn new things all the time. Your learning process is either constantly running, or you're effectively dead.
And that, of course, leads to the final piece of advice I could ever offer anyone on life: "don't be dead."
- Everything you do will matter. There is no pointless busy-work, no useless coffee-and-bagel fetching type nonsense, and every project contributes directly to something that impacts the studio and/or the game as a whole.
- Everything you do will be reviewed. We have at least one senior programmer per intern dedicated to helping make sure that the work is top-notch. This entails exhaustive code reviews and extended design/analysis discussions before a line of code is even written.
- Whether we end up hiring you or not, we're committed to making sure you leave the program as a good hire. The program is not a guaranteed-hire affair. However, with extremely few exceptions, we ensure that upon completion of the internship you're well prepared and ready to tackle game industry jobs.
Last week I had the chance to write down some thoughts about his performance so far in the program, and offer some feedback. After sitting down to re-read my letter, it struck me that there's a lot of stuff in there that might be useful for anyone who is just learning to work on large projects with large teams.
You may not be working on the next great MMO (or maybe you are!) but I think there's some value in this advice for anyone who is early on in their programming career.
Think Twice, Commit Once
This is my variant of the old carpenter's rule of "measure twice, cut once." In general, one of the biggest challenges of working on large-scale projects is keeping in mind all the ramifications of your decisions. Sometimes those implications are easier to see, and sometimes there's just no way to know ahead of time. But either way, it pays to take some time when writing code (and after writing code) to think very carefully about it.
One thing I personally like to do is let my non-trivial changelists sit for a day or so and then come back to them with a fresh mind, and re-read the code. I try to approach it as if I'd never seen the code before and was responsible for a code review on it. There are two directions that need to be considered: the tiny details, and the big-picture implications. Usually I will do two passes, one for each frame of mind.
I'll cover some of the small details later; the big things are generally harder anyways. It takes a lot of practice and experience to intuitively spot the consequences of design decisions; this has two important facets. First, it means that it won't be immediately obvious most of the time when you make a decision that has large-scale effects. Second, it means that it will take effort and conscious deliberation to train yourself to recognize those situations. The best suggestion I can offer is to pause often and try and envision the future - which I'll tackle next
Be Nice to Future You
I could also say "be nice to everyone else who will ever read your code" but that doesn't make for as nice of a section heading. The general idea here is that code is written once and then lives a very, very, very long time. The natural impact of this is that people will have to read the code many times while it survives. Again there are two directions this can go in: tiny details, and large-scale impacts, and again, the details are easier to spot - especially at first.
Some concrete examples are probably in order at this point. For details, one of the things that goes a long way is simple formatting. It may seem almost overbearingly anal-retentive to complain about whitespace and which line your braces go on, but it has an impact. After twenty-odd years of reading code, you get to a point where you can recognize common patterns very easily. Especially in a workplace like ArenaNet with a strict and consistently-followed formatting standard, this is a major time-saver; if everyone writes code that looks similar, it's easier to spot "weird" things. If my brain is distracted while reading a piece of code by the formatting, it gets harder to focus on the meaning and intent of the code itself.
On the larger scale, there are things like comments. Code commenting is a religious warfare issue in most of the world, and even ArenaNet has lots of diverse viewpoints on how it should be done. However, we have some basic philosophical common ground that is very helpful.
Sometimes comments need to go away, and sometimes they need to be added. Comments that need to go away are generally guilty of at least one of the following crimes:
- Repeating what the code already says
- Being out of date or in imminent danger of becoming out of date
- Being outright false (often as a result of accidents with copy/paste)
A Rose By Any Other Name Smells Like Shit
This is a lot more detail-oriented stuff. There are a number of conventions that any team will typically have that are important and help readability and clarity of intent.
There are many specific examples in ArenaNet's code standards documentation, but there are other conventions that are more general and likely to be in use in almost any environment. For example, pluralization is important. Iterator variables should be singular ("item" or "currency" or "character"). Containers should be plural ("items" or "currencies" or "characters"). The same goes for function names; if a function does one thing, name it singularly. If it does multiple things, or does one thing to multiple things, pluralize appropriately.
In general names are hard to select and should be chosen with great care. If a function's purpose changes, change its name as well. Always make sure a name corresponds to the best description of intent that you can manage. (Just don't use extremelyVerboseVariableNamesWithLotsOfExtraneousDetails.)
Wastefulness Will Bite You Sooner Rather Than Later
Extra parameters that aren't used by a function should be removed. (There are compiler warnings for this, by the way - you should always build with warnings as errors and the maximum warning level that you can manage.) Similarly, make sure that all variables are actually used and do something. Make sure that function calls are useful. For example, initializing a variable and then immediately changing its value just creates extra noise that confuses the reader. Passing nothing but a string literal to sprintf() and then printf()ing the resulting buffer is confusing as well, and a little wasteful.
This is partly about readability and partly about real efficiency. In large code bases like ours, the death isn't from a single massively wasteful piece of code - it's from thousands of tiny decisions that add up over time... both in terms of reading code, and in terms of how it performs at runtime. Both memory and processing power are something to keep in mind. They may seem cheap (or even free) at times, especially in environments with immense resources. But don't forget that everything has a cost and those costs accumulate a lot faster than we might wish sometimes.
A corollary to this is cleanup practices. If you remove a piece of functionality, make sure all vestiges of it are gone - comments, preparatory code, cleanup code, etc. It's easy to forget pieces of logic when removing things, and this just leads to more noise and confusion for the next reader. Once again, re-reading your own code with a critical eye helps a lot here.
Give It A Nice Home
Where things live in code as well as data is always an important consideration. Some things don't need to be wrapped into classes - if there's no state being carried around, or no shared interface that must be used, favor free functions instead of building classes that are just methods. On the data side, make sure to scope things as tightly as you can, and declare things as close as possible to their first point of use. Some things don't need to be file-level variables (let alone globals), and can just live as locals in some function. Some things don't even need to live the entire lifetime of a function. RAII is a big deal in C++, and should be used liberally.
File organization is also a big thing to keep in mind. Keeping related bits of code in self-contained files is a good habit; but it takes some careful thought to decide the taxonomy for what is "related." Think of programs like a pipeline. Each segment of pipe (module) should do one very specific and very contained thing. To build the whole program, you link together segments of pipe (modules/classes/etc.) and compose them into a more sophisticated machine.
Follow the Leader...
You should always try and emulate the code you're working in if someone else owns it. Follow the style, the naming patterns, the architectural decisions, and so on. Often you can make your life harder by departing from established convention, and you will definitely make the lives of everyone else harder at the same time.
Don't be shy to ask questions if any of those decisions or patterns are unclear. I know that a lot of this stuff seems a bit vague and mystical right now; much of the reasoning for why things are the way they are may not be immediately apparent. Asking questions is always good - if there are reasons for why things are a certain way, then you get to learn something; and if there are no reasons, or bad reasons, you open up an opportunity to make things better.
...But Clean Up What You Find
One of the highest aspirations we should have as team programmers is to leave code better than we found it. This ranges from fixing minor details to cleaning up design decisions and adding documentation. This is something you should try for even if you're not in your "own" code - often if there is objective room for improvement, the owner will actually appreciate you taking the time to make his area nicer. Obviously this doesn't mean you should go on a rampage to change every code file just for the sake of it, and there's always value in consulting with an owner before making changes - but you get the idea.
Learning Never Stops
It can be very tempting in life to plateau. Sometimes we just want to feel like we have "arrived" and now we "get it." And indeed there will be milestones in your career where you can notice a profound change in your skills and perspective.
The key is to never stop chasing the next boost. Even after more than twenty years of writing computer programs, I learn new things all the time. Your learning process is either constantly running, or you're effectively dead.
And that, of course, leads to the final piece of advice I could ever offer anyone on life: "don't be dead."