DEVBLOG 5: The Performance Of A Runtime

Hi!  Today we’re talking a little bit about run-time optimization.  Parts of this are applicable to coding in general, but we’ll talk about it specifically in game-engine terms.

A game engine does the following, over and over and over:

  • process player input
  • process network input/output (if applicable)
  • update game state
  • render frame
  • (sometimes) load stuff or write to disk

Ideally, your engine completes all five of the steps above in 16.67ms or less (preferably much less!) to keep your game running at 60FPS.  There are lots of ways to get these steps wrong and blow that budget.  In most cases, your three biggest weapons are batching, removing loop-invariant work, and removing pointless work.
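To put a number on that budget, here is a minimal sketch in Python (used here purely for illustration, not because your engine is written in it).  The names `frame_budget_ms` and `over_budget` are made up for this post, and the frame times are assumed to be measured by you in milliseconds.

```python
def frame_budget_ms(target_fps):
    """Milliseconds available for one frame at the given frame rate."""
    return 1000.0 / target_fps


def over_budget(frame_times_ms, target_fps=60.0):
    """Return the indices of the frames that took longer than the budget."""
    budget = frame_budget_ms(target_fps)
    return [i for i, t in enumerate(frame_times_ms) if t > budget]
```

Feeding it a list of measured frame times tells you which frames missed 60FPS, which is usually a better starting point for optimizing than a hunch.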

Batching is taking small jobs that you submit very frequently and turning them into bigger jobs you submit less often.  A draw call takes a list of vertices and a list of indices and turns them into drawn primitives (triangles or quads), and every draw call carries a fixed cost on the CPU and driver side no matter how little it draws.  If you have a room with 500 tiles in it and you submit one draw call to the graphics card for every single tile, you’re going to draw rooms very slowly, especially when you start asking the engine to render multiple rooms at once.  Instead of drawing each of your tiles separately, compile all your tiles into one big vertex buffer, and draw them all at once.  “But wait!” you might say.  “How do I change textures in the middle of a draw call?”  Well, you don’t.  Use a texture atlas for that.  (You’ll need to make sure each of your tiles has the correct texture coords to draw properly, which just takes a little math.  Maybe a topic for another post.)
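Here is a rough sketch of what building that one big buffer might look like.  Everything in it is an assumption made for illustration: the `Tile` class, the `(x, y, u, v)` vertex layout, and an atlas laid out as a simple grid of equally sized cells.  The actual upload and draw call depend on your graphics API and are left out.

```python
from dataclasses import dataclass


@dataclass
class Tile:
    col: int          # tile position in the room grid
    row: int
    atlas_index: int  # which cell of the atlas this tile uses


def atlas_uv(atlas_index, atlas_cols, atlas_rows):
    """Return (u0, v0, u1, v1) for one cell of a grid-shaped atlas."""
    cell_w = 1.0 / atlas_cols
    cell_h = 1.0 / atlas_rows
    u0 = (atlas_index % atlas_cols) * cell_w
    v0 = (atlas_index // atlas_cols) * cell_h
    return u0, v0, u0 + cell_w, v0 + cell_h


def build_tile_batch(tiles, tile_size, atlas_cols, atlas_rows):
    """Pack every tile into one vertex list and one index list."""
    vertices = []  # one (x, y, u, v) tuple per corner
    indices = []   # two triangles (six indices) per tile
    for tile in tiles:
        x0, y0 = tile.col * tile_size, tile.row * tile_size
        x1, y1 = x0 + tile_size, y0 + tile_size
        u0, v0, u1, v1 = atlas_uv(tile.atlas_index, atlas_cols, atlas_rows)
        base = len(vertices)  # index of this tile's first corner
        vertices += [(x0, y0, u0, v0), (x1, y0, u1, v0),
                     (x1, y1, u1, v1), (x0, y1, u0, v1)]
        indices += [base, base + 1, base + 2, base, base + 2, base + 3]
    return vertices, indices
```

A 500-tile room then becomes one buffer of 2000 vertices and 3000 indices that you submit with a single draw call, and you only need to rebuild it when the room actually changes.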

The same can be done for enemies and pickups.  If you know groups of things in your scene share a texture, either concatenate their vertex buffers and draw them all at once, or at least draw them one after another so you don’t have to keep switching textures back and forth.
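The "draw them one after another" option can be as simple as sorting your draw list by texture before submitting it.  A minimal sketch, assuming a draw list of `(texture_name, sprite_name)` pairs (a made-up format for this example):

```python
def count_texture_switches(draw_list):
    """Count how many times the bound texture would have to change."""
    switches = 0
    current = None
    for texture, _sprite in draw_list:
        if texture != current:
            switches += 1
            current = texture
    return switches


def sort_by_texture(draw_list):
    """Group draws that share a texture; sorted() is stable, so draws
    with the same texture keep their original relative order."""
    return sorted(draw_list, key=lambda item: item[0])
```

One caveat: reordering draws can change what ends up on top when sprites overlap, so you may only be able to sort within a layer.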

Loop-invariant work is work your engine repeats over and over even though the result never changes.  The first time you code a feature, you might just do whatever computations or loads are necessary to get the feature working and testable, without thinking about whether you should be saving any of the results.  If you need to load up a level texture atlas, don’t release it the moment you’re done drawing with it (unless you’re already tight on memory).  Instead, keep it in memory, and the next time you’d go to load that texture from disk, check whether you’ve already got it.
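A small cache keyed by file path is usually enough for this.  The sketch below is one possible shape, not a prescription: `TextureCache` is a hypothetical class, and `loader` stands in for whatever function your engine uses to read a texture from disk.

```python
class TextureCache:
    """Keeps loaded textures around so each file is read from disk once."""

    def __init__(self, loader):
        self._loader = loader  # assumed: callable taking a path, returning a texture
        self._textures = {}

    def get(self, path):
        """Return the cached texture, loading it only on the first request."""
        if path not in self._textures:
            self._textures[path] = self._loader(path)
        return self._textures[path]

    def release(self, path):
        """Drop a texture once nothing needs it anymore."""
        self._textures.pop(path, None)
```

Call `get` everywhere you used to call your loader directly, and call `release` when you leave the level or run short on memory.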

Look at your functions that do 2D/3D math: do any of them compute the same values every time they’re called?  Your compiler might optimize some of this away, but there’s no guaranteeing that.  Move the math somewhere where it can be done once per frame instead of once per enemy check.
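As an illustration, take a "can the player see this enemy?" check based on a view cone.  The scenario and every name here are invented for the example.  The slow version recomputes the player's facing vector and the cone's cosine for every enemy, even though none of those values depend on the enemy; the fast version computes them once and reuses them.

```python
import math


def in_view(px, py, dir_x, dir_y, cos_half_fov, ex, ey):
    """True if the enemy at (ex, ey) lies inside the player's view cone."""
    to_x, to_y = ex - px, ey - py
    dist = math.hypot(to_x, to_y)
    if dist == 0.0:
        return True  # standing on the player counts as visible
    # Cosine of the angle between the facing direction and the enemy.
    return (to_x * dir_x + to_y * dir_y) / dist >= cos_half_fov


def visible_enemies_slow(player, facing, fov, enemies):
    px, py = player
    visible = []
    for ex, ey in enemies:
        # Loop-invariant: recomputed on every pass, same result each time.
        dir_x = math.cos(facing)
        dir_y = math.sin(facing)
        cos_half_fov = math.cos(fov / 2.0)
        if in_view(px, py, dir_x, dir_y, cos_half_fov, ex, ey):
            visible.append((ex, ey))
    return visible


def visible_enemies_fast(player, facing, fov, enemies):
    px, py = player
    # Hoisted out of the loop: computed once per frame.
    dir_x = math.cos(facing)
    dir_y = math.sin(facing)
    cos_half_fov = math.cos(fov / 2.0)
    return [(ex, ey) for ex, ey in enemies
            if in_view(px, py, dir_x, dir_y, cos_half_fov, ex, ey)]
```

Both functions return the same list (angles are in radians); the second just does three trig calls per frame instead of three per enemy.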

Removing pointless work is about making sure your engine doesn’t calculate anything it doesn’t need to.  An enemy that isn’t interacting with the player and is 15 screens away probably doesn’t need to be doing any AI calculations.  That platform on the other side of the level probably doesn’t need to be drawn.  Check which objects actually need attention right now (would they even be visible on the screen?) and don’t draw or update the ones that don’t.
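One common way to do that check is a rectangle overlap test against the camera.  The sketch below assumes axis-aligned bounding boxes and uses made-up names (`Rect`, `select_active`, `update_margin`); the margin keeps things just off-screen updating so they don't freeze the instant they leave the view.

```python
from dataclasses import dataclass


@dataclass
class Rect:
    x: float
    y: float
    w: float
    h: float

    def overlaps(self, other):
        """True if the two axis-aligned rectangles intersect."""
        return (self.x < other.x + other.w and other.x < self.x + self.w and
                self.y < other.y + other.h and other.y < self.y + self.h)


def expanded(rect, margin):
    """Return a copy of rect grown by margin on every side."""
    return Rect(rect.x - margin, rect.y - margin,
                rect.w + 2 * margin, rect.h + 2 * margin)


def select_active(entities, camera, update_margin):
    """Split entity bounds into what to draw and what to update."""
    update_area = expanded(camera, update_margin)
    to_draw = [e for e in entities if e.overlaps(camera)]
    to_update = [e for e in entities if e.overlaps(update_area)]
    return to_draw, to_update
```

With hundreds of entities you would likely pair this with a grid or other spatial structure so you aren't testing every entity every frame, but even the brute-force version skips the expensive part (AI and drawing) for everything far away.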

Make as few calls to your graphics card and as few memory allocations as you can get away with, eliminate any code that’s being run redundantly, and you’ll be running smoothly in no time.

See you next week!