Performance, Profiling, and Optimization


Making a game that both runs well and fits within memory constraints is an exciting problem that touches many systems and many disciplines. This page is your gateway to a large number of tools and techniques that can help you win that battle. Even with the best tools in the world, you still need to instill the mindset that everyone uses the minimal amount of data and CPU cycles to create what they want. The tools and techniques listed here will help you do exactly that.

To do that, we break the large problem into a number of subproblems so that each area can be addressed as independently of the others as possible. This allows multiple people to work on different areas in parallel.

General Upkeep

Keeping your game running is a balancing act between constantly adding new features vs fitting within your budgets. Code being handled by multiple developers has to all fit together while constantly being modified. The game has to run within the constraints of the target platforms. Keeping track of all of these aspects and making sure the game continues to be functional and playable requires constant monitoring.

The Game Maintenance page outlines several tools and techniques that can be used on a regular basis to maintain a game during production.

Basic Tools and Techniques

Being able to quickly and easily find which parts of your game are causing issues is extremely important during production. This can save time and money. But what can be an even bigger time saver is identifying potential problem areas early, even before they are shown to cause issues in standard game testing situations. If you know something is going to cause issues, it can be fixed or worked around before too many other elements or systems rely on it, making the solution that much more of a headache.

The Basic Profiling and Optimization Techniques page describes several simple techniques that can be used to identify and fix areas or elements in your game that are currently, or may become, performance issues.

The Gameplay Optimization for Gameplay Programmers page describes a sample workflow for optimizing gameplay and rendering that was used during the development of Infinity Blade Dungeons.

Memory Profiling

Memory usage is always a concern for video games, especially those destined for consoles or mobile devices where memory is limited. This extends from the amount of space taken up by content assets on disk, to the memory used by different systems at runtime, to the number of memory allocations and deallocations. These are all extremely important pieces of information that make it possible to constrain memory usage to acceptable limits.
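
As a rough illustration of the kind of numbers worth tracking (live bytes, peak bytes, and allocation/deallocation counts), here is a minimal C++ sketch. The `AllocationTracker` type and its members are hypothetical names invented for this example and are not part of Unreal Engine 3; the engine's own memory tools report this information for you.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical bookkeeping struct; names are illustrative, not engine API.
struct AllocationTracker {
    std::size_t currentBytes = 0; // bytes currently allocated
    std::size_t peakBytes = 0;    // highest value currentBytes has reached
    std::size_t allocCount = 0;   // number of allocations seen
    std::size_t freeCount = 0;    // number of deallocations seen

    // Record an allocation and update the high-water mark.
    void OnAlloc(std::size_t bytes) {
        currentBytes += bytes;
        ++allocCount;
        if (currentBytes > peakBytes) {
            peakBytes = currentBytes;
        }
    }

    // Record a deallocation of a block of the given size.
    void OnFree(std::size_t bytes) {
        // Freeing more than is live would indicate a bookkeeping bug.
        assert(bytes <= currentBytes);
        currentBytes -= bytes;
        ++freeCount;
    }
};
```

A high allocation count with a low peak usually points at allocation churn, while a high peak points at overall footprint; the two call for different fixes.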

The Memory Usage and Profiling page shows several tools and techniques that can be used to be aware of and control memory usage by your game.

Content Profiling and Optimization

Given the amount of content used in today's games, it is extremely important to limit the impact of that content on the game's performance whenever possible. Optimizing content has the potential to provide the largest return on investment, so it is often the first area that is profiled and optimized.

Unreal Engine 3 provides several tools to help in this process. These tools, along with many tips on how to limit the impact of content, are detailed on the Content Profiling and Optimization page.

CPU Performance

The work performed by the engine on the CPU is broken up into threads. While there can be several worker threads performing various duties at any one time, the two main threads are:

  • GameThread - The GameThread handles updating gameplay. This includes ticking Actors, ticking components, performing garbage collection, etc.
  • RenderThread - The RenderThread handles performing lighting calculations, shadowing calculations, translucency setup, updating scene captures, occlusion, etc.

Managing engine performance on the CPU means making sure that no one thread is ever sitting around for long waiting for any other thread to complete its work, while also trying to keep overall time spent doing work on the CPU to a minimum.
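
To make the idea of "which thread is everyone waiting on" concrete, the sketch below picks the slowest stage from a set of per-frame timings. It assumes a simplified model in which the GameThread, RenderThread, and GPU overlap fully, so the slowest one sets the frame time. The `FrameTimes` struct and `FindBottleneck` function are hypothetical; in practice you would read these timings from the engine's stats output.

```cpp
#include <cassert>
#include <string>

// Hypothetical per-frame timings in milliseconds (illustrative only).
struct FrameTimes {
    double gameThreadMs;
    double renderThreadMs;
    double gpuMs;
};

// Returns the name of the stage with the largest time. Under the simplifying
// assumption that all three overlap, that stage limits the frame rate.
inline std::string FindBottleneck(const FrameTimes& t) {
    assert(t.gameThreadMs >= 0.0 && t.renderThreadMs >= 0.0 && t.gpuMs >= 0.0);
    if (t.gameThreadMs >= t.renderThreadMs && t.gameThreadMs >= t.gpuMs) {
        return "GameThread";
    }
    if (t.renderThreadMs >= t.gpuMs) {
        return "RenderThread";
    }
    return "GPU";
}
```

Optimizing anything other than the stage this identifies is unlikely to change the frame rate, which is why isolating the bottleneck comes first.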

After you've isolated your frame bottleneck as a CPU performance issue, one of the next steps should be to determine where in your C++ code the time is being spent. Additionally, you may want to look for hidden performance costs such as Load-Hit-Stores or cache misses.
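
A Load-Hit-Store happens when the CPU loads from an address it has only just stored to and has to wait for the store to complete. One common source in C++ is accumulating through a reference or pointer inside a loop. The following is an illustrative sketch only: the function names are made up for this example, and whether a stall actually occurs depends on the compiler and the target CPU.

```cpp
#include <cassert>
#include <cstddef>

// Accumulates through a reference. Because `values` might alias `total`,
// the compiler may have to store `total` to memory and reload it on every
// iteration, which is the store-then-load pattern to watch for.
inline void SumThroughReference(const float* values, std::size_t count, float& total) {
    assert(values != nullptr || count == 0);
    total = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
}

// Accumulates in a local variable, which can stay in a register, and
// writes the result to memory once at the end.
inline void SumThroughLocal(const float* values, std::size_t count, float& total) {
    assert(values != nullptr || count == 0);
    float local = 0.0f;
    for (std::size_t i = 0; i < count; ++i) {
        local += values[i];
    }
    total = local;
}
```

Both functions produce the same result; only the memory traffic inside the loop differs. A profiler capture is the way to confirm whether the change matters on your platform.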

Most platforms have robust CPU sampling tools available that can provide detailed code metrics and even call graphs for C++ code using data captured from a live session. For example, on the Windows platform you might use Intel VTune to capture sampling data over a short period to isolate hot spots. Console platforms have similar tools available, often included with the development tool chain.

You'll want to familiarize yourself with these applications and use them extensively throughout development!

Note that on some consoles, Unreal has a utility function to help perform single frame CPU sampling captures. You can type TRACE GAME to initiate a CPU trace.

Most CPU profilers support capturing sampling data and/or call graph data over a short duration. This is great for debugging performance issues that involve a sustained poor frame rate. Sampling captures will usually give you function call hot spots in the C++ code. This can be useful sometimes, but more often than not you'll find that a Call Graph capture is much more useful!

Call graphs often have more capture-time overhead, but provide detailed information about the callers of hot spot functions, which can often lead you straight to the source of a problem! Even in cases where many objects/actors are calling a single function that shows up as slow, if you're lucky the call graph will lead you back to a particular actor class name or other pattern that hints at the cause.
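
The value of caller information can be shown with a small sketch: given sampled call stacks, count which functions directly call a known hot spot. This is an assumption-laden toy, not how any particular profiler stores its data; `CallStack` and `CountCallersOf` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// One sampled call stack, outermost caller first (illustrative format).
using CallStack = std::vector<std::string>;

// Counts how many times each function appears as the direct caller of
// `hotFunction` across all sampled stacks.
inline std::map<std::string, int> CountCallersOf(const std::vector<CallStack>& samples,
                                                 const std::string& hotFunction) {
    assert(!hotFunction.empty());
    std::map<std::string, int> counts;
    for (const CallStack& stack : samples) {
        for (std::size_t i = 1; i < stack.size(); ++i) {
            if (stack[i] == hotFunction) {
                ++counts[stack[i - 1]];
            }
        }
    }
    return counts;
}
```

A flat sampling capture would only report that the hot function is expensive; grouping by caller shows which class is responsible for most of the calls.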

Sometimes call graphs will simply show UnrealScript taking up time (calls to ProcessEvent, CallFunction, etc), in which case you'll want to run a Stats Viewer or Gameplay Profiler session capture to drill down into those calls.

When dealing with frame hitches, CPU profilers can sometimes be unwieldy due to their relatively low sampling frequency. If the hitch is easily reproducible in game, a good strategy is to trigger the hitch as frequently as possible (ideally many times per frame) while capturing a short CPU profiling session. If the hitch is more elusive, you'll want to capture data for use in Stats Viewer, which will provide historical call graph data over many frames.
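
When hunting an elusive hitch in a history of frame times, it helps to flag the outlier frames first and then inspect only those. The sketch below marks any frame that takes longer than a chosen multiple of the average frame time. The `FindHitches` function and the threshold rule are assumptions made for illustration, not part of Stats Viewer.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Returns the indices of frames whose time exceeds `factor` times the mean
// frame time of the whole capture.
inline std::vector<std::size_t> FindHitches(const std::vector<double>& frameMs, double factor) {
    assert(factor > 0.0);
    std::vector<std::size_t> hitches;
    if (frameMs.empty()) {
        return hitches;
    }
    const double mean =
        std::accumulate(frameMs.begin(), frameMs.end(), 0.0) / static_cast<double>(frameMs.size());
    for (std::size_t i = 0; i < frameMs.size(); ++i) {
        if (frameMs[i] > mean * factor) {
            hitches.push_back(i);
        }
    }
    return hitches;
}
```

For example, with frame times of 16, 17, 16, 80, and 16 ms and a factor of 2, the mean is 29 ms, the threshold is 58 ms, and only the 80 ms frame is flagged.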

Game Thread Performance

Every gameplay object that is added to the scene takes up some resource; and usually the most interesting objects take up the most resources. In order to have enough GameThread CPU time for those objects you need to make certain that other objects are not unfairly utilizing resources and that all objects do the minimal amount of work needed to accomplish their goal.

Many tools, tips, and techniques for profiling and optimizing GameThread work are detailed on the Game Thread Profiling and Optimization page.

Render Thread Performance

The complexity of the scene in terms of lighting and shadowing, as well as several other processor-intensive visual aspects, can have a huge impact on time spent in the RenderThread. Figuring out which of these areas is slowing down the RenderThread will help focus efforts on lessening the impact of elements that are not as important, while making more resources available to crucial features and visual showpieces.

The Render Thread Profiling and Optimization page demonstrates several tools and methods of profiling and optimizing visual elements handled in the RenderThread.

GPU Performance

While the CPU handles calculations and updating the game each frame, the GPU is responsible for rendering the game. Obviously, this means drawing polygons, but also involves dealing with shader complexity and overdraw, GPU-skinned skeletal meshes, post processing effects, fluid surfaces, etc. If the GPU is spending a lot of time working, it could mean your game is fill-bound or other rendering elements are causing performance issues.

The GPU receives commands from the rendering thread and performs the following work:

  • Transforming vertices and skinning of skeletal meshes
  • Rasterization (figuring out which pixels fall on which triangles)
  • Shading (executing the pixel shader on all the pixels of a given triangle. The pixel shader is controlled by the material given to that triangle)

GPU profiling involves many interacting factors, so the rule of thumb with regard to GPU performance is to always experiment. The main time-consuming operation performed by the GPU is pixel shading. The complexity of the shaders used, the amount of screen space those shaders cover, the amount of overdraw caused by translucent shaders, etc. are all very important factors to consider when profiling the GPU. Another area that the GPU is responsible for is the transforming of skinned meshes. The number of vertices in the skeletal meshes being used is a large factor, as skinned vertices are much more costly than static mesh vertices. The number of polygons being drawn is also a potential problem area for GPU performance, though modern cards can draw large numbers of polygons. Even so, you always want to be sure that polygons which do not need to be drawn are not being drawn, and that meshes only have as many polygons as are necessary to achieve the desired visual result.
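
The way shader complexity, screen coverage, and overdraw multiply together can be sketched with some simple arithmetic. The following is a deliberately crude relative cost model, not a prediction of real GPU time; `MaterialDraw` and `EstimateShadingCost` are hypothetical names, and the instruction counts are made-up inputs.

```cpp
#include <cassert>
#include <vector>

// Hypothetical description of how one material contributes to a frame.
struct MaterialDraw {
    double screenCoverage;   // fraction of the screen covered, 0.0 to 1.0
    int layers;              // how many times those pixels are shaded (overdraw)
    int shaderInstructions;  // rough pixel shader instruction count
};

// Sums pixels * coverage * layers * instructions over all materials to give
// a relative shading cost for comparing scenes or settings.
inline double EstimateShadingCost(int width, int height, const std::vector<MaterialDraw>& draws) {
    assert(width > 0 && height > 0);
    const double pixels = static_cast<double>(width) * static_cast<double>(height);
    double cost = 0.0;
    for (const MaterialDraw& d : draws) {
        cost += pixels * d.screenCoverage * d.layers * d.shaderInstructions;
    }
    return cost;
}
```

In this model, a translucent effect covering a quarter of the screen with four layers of overdraw shades as many pixels as an opaque full-screen surface, which is why overdraw from translucency deserves attention even when the effect looks small.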

To aid in the capturing of information about the GPU, a GPU single-frame sampling capture can be performed on some consoles by using the TRACE RENDER command.

There are several tools available to help in profiling the GPU both on the PC and on the various consoles. The GPU Performance and Profiling page explains how to use the tools provided with Unreal Engine 3 as well as other external tools to determine where performance issues may be occurring on the GPU.

Network Profiling

Keeping an eye on the amount of data being sent over the network for online games is important. You don't want to be sending a bunch of unnecessary data or have specific objects spamming the network. Keeping the network traffic streamlined will make online games run smoother for all the players, but especially those with limited bandwidth.
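
One simple way to spot objects that are spamming the network is to total the bytes sent per actor class over a capture and look at the heaviest entries. The sketch below assumes a hypothetical `ReplicatedUpdate` record with a class name and a byte count; it does not reflect the format of any actual engine capture.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical record of one update sent over the network.
struct ReplicatedUpdate {
    std::string actorClass;
    int bytes;
};

// Totals the bytes sent for each actor class.
inline std::map<std::string, long> BytesByClass(const std::vector<ReplicatedUpdate>& updates) {
    std::map<std::string, long> totals;
    for (const ReplicatedUpdate& u : updates) {
        assert(u.bytes >= 0);
        totals[u.actorClass] += u.bytes;
    }
    return totals;
}

// Returns the class with the largest total, or an empty string if there
// are no entries. Ties go to the class that sorts first by name.
inline std::string HeaviestClass(const std::map<std::string, long>& totals) {
    std::string heaviest;
    long most = -1;
    for (const auto& entry : totals) {
        if (entry.second > most) {
            most = entry.second;
            heaviest = entry.first;
        }
    }
    return heaviest;
}
```

A class that sends many small updates can outweigh one that sends a few large ones, so totals per class are often more revealing than the size of any single packet.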

Mobile Profiling

Profiling games on mobile devices is pretty similar to profiling for PC. There are some special tools for the various platforms and some considerations to take into account, however. For details on profiling on mobile devices, see the Mobile Profiling Home page.