Optimizing Graphics Performance - Unity Graphics Performance Optimization

2014-08-22 11:53

Good performance is critical to the success of many games. Below are some simple guidelines for maximizing the speed of your game's graphical rendering.

Where are the graphics costs

The graphical parts of your game primarily load two systems of the computer: the GPU and the CPU. The first rule of any optimization is to find where the performance problem is, because strategies for optimizing for the GPU and for the CPU are quite different (and can even be opposite: it's quite common to make the GPU do more work while optimizing for the CPU, and vice versa).

Typical bottlenecks and ways to check for them:

GPU is often limited by fillrate or memory bandwidth.

Does running the game at a lower display resolution make it faster? If so, you're most likely limited by fillrate on the GPU.

CPU is often limited by the number of things that need to be rendered, also known as "draw calls".

Check "draw calls" in the Rendering Statistics window; if it's more than several thousand (for PCs) or several hundred (for mobile), then you might want to optimize the object count.

Of course, these are only rules of thumb; the bottleneck could as well be somewhere else. Less typical bottlenecks:

Rendering is not a problem, neither on the GPU nor the CPU! For example, your scripts or physics might be the actual problem. Use the Profiler to figure this out.
GPU has too many vertices to process. How many vertices are "ok" depends on the GPU and the complexity of vertex shaders. Typical figures are "not more than 100 thousand" on mobile, and "not more than several million" on PC.
CPU has too many vertices to process, for things that do vertex processing on the CPU. This could be skinned meshes, cloth simulation, particles etc.
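
The two typical checks above can be written down as a small decision helper. This is only a sketch of the rules of thumb, not a profiler: the function name, the 10% speed-up margin and the concrete draw call limits (3000 for PC, 300 for mobile) are illustrative assumptions standing in for "several thousand" and "several hundred".

```python
def guess_bottleneck(frame_ms_full_res, frame_ms_low_res, draw_calls, platform):
    """Return rule-of-thumb hints; all thresholds are illustrative assumptions."""
    # "Several thousand" (PC) and "several hundred" (mobile), made concrete
    # purely for the sake of the example.
    draw_call_limit = {"pc": 3000, "mobile": 300}[platform]
    hints = []
    # If a lower display resolution makes the frame noticeably faster,
    # the GPU is probably fillrate-bound. The 10% margin is an assumption.
    if frame_ms_low_res < 0.9 * frame_ms_full_res:
        hints.append("GPU fillrate")
    if draw_calls > draw_call_limit:
        hints.append("CPU draw calls")
    if not hints:
        hints.append("look elsewhere (scripts, physics, vertex count)")
    return hints

print(guess_bottleneck(33.0, 20.0, 150, "mobile"))
print(guess_bottleneck(16.0, 15.9, 5000, "pc"))
```

The frame times would come from measuring the same scene at two display resolutions; the draw call count would be read from the Rendering Statistics window.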

CPU optimization - draw call count

In order to render any object on the screen, the CPU has some work to do - things like figuring out which lights affect that object, setting up the shader & shader parameters, sending drawing commands to the graphics driver, which then prepares the commands to be sent off to the graphics card. All this "per object" CPU cost is not very cheap, so if you have lots of visible objects, it can add up.

So for example, if you have a thousand triangles, it will be much, much cheaper if they are all in one mesh, instead of a thousand individual meshes of one triangle each. The cost of both scenarios on the GPU will be very similar, but the work done by the CPU to render a thousand objects (instead of one) will be significant.

In order to make the CPU do less work, it's good to reduce the visible object count:

Combine close objects together, either manually or using Unity's draw call batching.
Use fewer materials in your objects, by putting separate textures into a larger texture atlas and so on.
Use fewer things that cause objects to be rendered multiple times (reflections, shadows, per-pixel lights etc., see below).

Combine objects together so that each mesh has at least several hundred triangles and uses only one Material for the entire mesh. It is important to understand that combining two objects which don't share a material does not give you any performance increase at all. The most common reason for having multiple materials is that two meshes don't share the same textures, so to optimize CPU performance, you should ensure that any objects you combine share the same textures.
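
As a rough illustration of why material sharing decides what can be combined, the sketch below groups objects by material and reports one combined mesh per material. The function and the scene data (object names, material names, triangle counts) are hypothetical; this is a bookkeeping model, not Unity's batching code.

```python
from collections import defaultdict

def combine_by_material(objects):
    """Group (name, material, triangles) tuples into one combined mesh per material."""
    groups = defaultdict(lambda: {"objects": [], "triangles": 0})
    for name, material, triangles in objects:
        groups[material]["objects"].append(name)
        groups[material]["triangles"] += triangles
    return dict(groups)

# Hypothetical scene: three objects share an atlas material, one does not.
scene = [
    ("crate_a", "wood_atlas", 120),
    ("crate_b", "wood_atlas", 120),
    ("barrel", "wood_atlas", 200),
    ("lamp", "metal", 300),
]
combined = combine_by_material(scene)
print(len(scene), "objects ->", len(combined), "combined meshes")
```

In this model, only objects that end up in the same group can reduce the object count, which is why moving separate textures into one atlas matters.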

However, when using many pixel lights in the Forward rendering path, there are situations where combining objects may not make sense, as explained below.

GPU: Optimizing Model Geometry

When optimizing the geometry of a model, there are two basic rules:

Don't use any more triangles than necessary
Try to keep the number of UV mapping seams and hard edges (doubled-up vertices) as low as possible

Note that the actual number of vertices that graphics hardware has to process is usually not the same as the number reported by a 3D application. Modeling applications usually display the geometric vertex count, i.e. the number of distinct corner points that make up a model. For a graphics card, however, some geometric vertices will need to be split into two or more logical vertices for rendering purposes. A vertex must be split if it has multiple normals, UV coordinates or vertex colors. Consequently, the vertex count in Unity is usually higher than the count given by the 3D application.
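
The splitting rule can be checked on the simplest hard-edged model, a cube. The sketch below counts distinct corner positions (the geometric count) and distinct position/normal pairs (the logical count); the helper name is an assumption made for this example, and UVs and vertex colors are left out to keep it short.

```python
from itertools import product

def cube_corner_records():
    """Return (position, normal) for every face corner of a hard-edged cube."""
    records = []
    for axis in range(3):
        for sign in (-1, 1):
            # Each of the 6 faces has its own flat normal.
            normal = tuple(sign if i == axis else 0 for i in range(3))
            other_axes = [i for i in range(3) if i != axis]
            for a, b in product((-1, 1), repeat=2):
                position = [0, 0, 0]
                position[axis] = sign
                position[other_axes[0]] = a
                position[other_axes[1]] = b
                records.append((tuple(position), normal))
    return records

records = cube_corner_records()
geometric_vertices = {position for position, _ in records}
logical_vertices = set(records)  # a corner is split once per distinct normal
print(len(geometric_vertices), len(logical_vertices))  # 8 24
```

Each corner is shared by three faces with different normals, so the 8 geometric vertices become 24 logical ones.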

While the amount of geometry in the models is mostly relevant for the GPU, some features in Unity also process models on the CPU, for example mesh skinning.

Lighting Performance

Lighting which is not computed at all is always the fastest! Use Lightmapping to "bake" static lighting just once, instead of computing it each frame. The process of generating a lightmapped environment takes only a little longer than just placing a light in the scene in Unity, but:

It is going to run a lot faster (2-3 times for 2 per-pixel lights)
And it will look a lot better since you can bake global illumination and the lightmapper can smooth the results

In a lot of cases, simple tricks in shaders and content can replace adding more lights all over the place. For example, instead of adding a light that shines straight into the camera to get a "rim lighting" effect, consider adding a dedicated "rim lighting" computation into your shaders directly.

Lights in forward rendering

Per-pixel dynamic lighting will add significant rendering overhead to every affected pixel and can lead to objects being rendered in multiple passes. On less powerful devices, like mobile or low-end PC GPUs, avoid having more than one Pixel Light illuminating any single object, and use lightmaps to light static objects instead of having their lighting calculated every frame. Per-vertex dynamic lighting can add significant cost to vertex transformations. Try to avoid situations where multiple lights illuminate any given object.

If you use pixel lighting then each mesh has to be rendered as many times as there are pixel lights illuminating it. If you combine two meshes that are very far apart, it will increase the effective size of the combined object. All pixel lights that illuminate any part of this combined object will be taken into account during rendering, so the number of rendering passes that need to be made could be increased. Generally, the number of passes that must be made to render the combined object is the sum of the number of passes for each of the separate objects, and so nothing is gained by combining. For this reason, you should not combine meshes that are far enough apart to be affected by different sets of pixel lights.
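
A small model makes the pass arithmetic concrete. It assumes, as the paragraph does, one pass per pixel light touching a mesh; the function names, light names and triangle counts are hypothetical, and the "triangles submitted" figure is an added illustration of the larger effective size, not a measured cost.

```python
def render_cost(meshes):
    """meshes: list of (triangle_count, set_of_pixel_light_names).

    Returns (passes, triangles_submitted), assuming a mesh is drawn once
    per pixel light that touches it.
    """
    passes = sum(len(lights) for _, lights in meshes)
    triangles = sum(tris * len(lights) for tris, lights in meshes)
    return passes, triangles

def combine(meshes):
    """Merge meshes into one: triangles add up, light sets are unioned."""
    total_tris = sum(tris for tris, _ in meshes)
    all_lights = set().union(*(lights for _, lights in meshes))
    return [(total_tris, all_lights)]

# Two meshes far apart, lit by different lights: combining gains nothing.
far_apart = [(500, {"lamp_1", "lamp_2"}), (500, {"lamp_3"})]
print(render_cost(far_apart), render_cost(combine(far_apart)))

# Two meshes close together, lit by the same lights: combining halves the passes.
close_together = [(500, {"lamp_1", "lamp_2"}), (500, {"lamp_1", "lamp_2"})]
print(render_cost(close_together), render_cost(combine(close_together)))
```

In the far-apart case the pass count stays at 3 while every pass now covers the whole combined mesh; in the close-together case the pass count drops from 4 to 2.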

During rendering, Unity finds all lights surrounding a mesh and calculates which of those lights affect it most. The Quality Settings are used to modify how many of the lights end up as pixel lights and how many as vertex lights. Each light calculates its importance based on how far away it is from the mesh and how intense its illumination is. Furthermore, some lights are more important than others purely from the game context. For this reason, every light has a Render Mode setting which can be set to Important or Not Important; lights marked as Not Important will typically have a lower rendering overhead.

As an example, consider a driving game where the player's car is driving in the dark with headlights switched on. The headlights are likely to be the most visually significant light sources in the game, so their Render Mode would probably be set to Important. On the other hand, there may be other lights in the game that are less important (other cars' rear lights, say) and which don't improve the visual effect much by being pixel lights. The Render Mode for such lights can safely be set to Not Important so as to avoid wasting rendering capacity in places where it will give little benefit.
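
The selection described above can be sketched as a ranking problem. Everything specific here is an assumption for illustration: the score formula (intensity divided by one plus squared distance), the "auto" mode standing for lights left to the automatic ranking, the rule that Important lights always take a pixel slot, and the light data. It is not Unity's actual algorithm.

```python
def classify_lights(lights, mesh_position, pixel_light_count):
    """Split lights into (pixel, vertex) name lists for one mesh.

    lights: dicts with 'name', 'position', 'intensity' and 'mode', where mode
    is 'important', 'not_important' or 'auto' (ranked by score).
    """
    def score(light):
        d2 = sum((a - b) ** 2 for a, b in zip(light["position"], mesh_position))
        return light["intensity"] / (1.0 + d2)  # hypothetical importance score

    forced = [l for l in lights if l["mode"] == "important"]
    auto = sorted((l for l in lights if l["mode"] == "auto"), key=score, reverse=True)
    budget = max(0, pixel_light_count - len(forced))
    pixel = forced + auto[:budget]
    vertex = auto[budget:] + [l for l in lights if l["mode"] == "not_important"]
    return [l["name"] for l in pixel], [l["name"] for l in vertex]

# Hypothetical driving-game lights around a mesh at the origin.
lights = [
    {"name": "headlights", "position": (0, 0, 2), "intensity": 2.0, "mode": "important"},
    {"name": "street_lamp", "position": (5, 0, 0), "intensity": 1.0, "mode": "auto"},
    {"name": "far_lamp", "position": (50, 0, 0), "intensity": 3.0, "mode": "auto"},
    {"name": "rear_light", "position": (1, 0, 0), "intensity": 1.0, "mode": "not_important"},
]
pixel, vertex = classify_lights(lights, (0, 0, 0), pixel_light_count=2)
print(pixel, vertex)
```

With a budget of two pixel lights, the headlights and the nearby street lamp are rendered per pixel, while the distant lamp and the rear light fall back to cheaper lighting.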

Optimizing per-pixel lighting saves work on both the CPU and the GPU: the CPU has fewer draw calls to do, and the GPU has fewer vertices to process and pixels to rasterize for all these additional object renders.

GPU: Texture Compression and Mipmaps

Using Compressed Textures will decrease the size of your textures (resulting in faster load times and a smaller memory footprint) and can also dramatically increase rendering performance. Compressed textures use only a fraction of the memory bandwidth needed for uncompressed 32-bit RGBA textures.
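
To put numbers on "a fraction", the sketch below compares the size of a single texture level in a few formats. The formats are examples chosen for this illustration (DXT1 at 4 bits per pixel and DXT5 at 8 bits per pixel are common desktop compressed formats; the text above does not name specific ones), and mip levels are ignored.

```python
# Bits per pixel for a few example formats (chosen for illustration).
BITS_PER_PIXEL = {"RGBA32": 32, "RGB565": 16, "DXT5": 8, "DXT1": 4}

def texture_bytes(width, height, fmt):
    """Size in bytes of one texture level, without mipmaps."""
    return width * height * BITS_PER_PIXEL[fmt] // 8

for fmt in BITS_PER_PIXEL:
    print(fmt, texture_bytes(1024, 1024, fmt) // 1024, "KiB")
```

For a 1024x1024 texture this gives 4096 KiB uncompressed against 512 KiB for DXT1, an eightfold reduction in the data the GPU has to fetch.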

Use Texture Mip Maps

As a rule of thumb, always have Generate Mip Maps enabled for textures used in a 3D scene. In the same way Texture Compression can help limit the amount of texture data transferred when the GPU is rendering, a mipmapped texture will enable the GPU to use a lower-resolution texture for smaller triangles.

The only exception to this rule is when a texel (texture pixel) is known to map 1:1 to the rendered screen pixel, as with UI elements or in a 2D game.
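
The following sketch shows what a mip chain looks like and roughly how a level could be picked. The level picker is a deliberately crude approximation (one texel per screen pixel along one axis) and both function names are assumptions; real GPUs choose levels per pixel from texture coordinate derivatives.

```python
import math

def mip_chain(width, height):
    """List the (width, height) of every mip level down to 1x1."""
    sizes = [(width, height)]
    while width > 1 or height > 1:
        width, height = max(1, width // 2), max(1, height // 2)
        sizes.append((width, height))
    return sizes

def mip_level_for(texture_size, on_screen_pixels):
    """Crude level pick: halve the texture until it roughly matches the screen size."""
    levels = len(mip_chain(texture_size, texture_size))
    level = math.floor(math.log2(texture_size / on_screen_pixels))
    return min(levels - 1, max(0, level))

chain = mip_chain(1024, 1024)
total_texels = sum(w * h for w, h in chain)
print(len(chain), "levels,", total_texels / (1024 * 1024), "x the base size")
print(mip_level_for(1024, 64))  # a 1024 texture drawn 64 pixels wide
```

The full chain costs about a third more memory than the base level, while a small on-screen object samples a level with far fewer texels.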

LOD and Per-Layer Cull Distances

In some games, it may be appropriate to cull small objects more aggressively than large ones, in order to reduce both the CPU and GPU load. For example, small rocks and debris could be made invisible at long distances while large buildings would still be visible.

This can be achieved either with the Level Of Detail system, or by setting manual per-layer culling distances on the camera. You could put small objects into a separate layer and set up per-layer cull distances using the Camera.layerCullDistances script function.
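
The idea behind per-layer cull distances can be modelled outside the engine. The sketch below is plain Python, not Unity script: it assumes an array with one distance per layer (32 layers), treats a value of 0 as "use the camera's far clip plane", and uses layer 10 as a hypothetical "small objects" layer.

```python
NUM_LAYERS = 32  # assumed number of layers

def is_culled(layer, distance, layer_cull_distances, far_clip_plane):
    """True if an object on `layer` at `distance` from the camera is not drawn."""
    # Assumption: 0 means "no per-layer limit, fall back to the far clip plane".
    limit = layer_cull_distances[layer] or far_clip_plane
    return distance > limit

distances = [0.0] * NUM_LAYERS
distances[10] = 15.0  # hypothetical layer holding small rocks and debris

print(is_culled(10, 20.0, distances, far_clip_plane=1000.0))  # small rock far away
print(is_culled(0, 20.0, distances, far_clip_plane=1000.0))   # large building
```

Small objects on the dedicated layer disappear beyond 15 units, while objects on every other layer stay visible up to the far clip plane.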

Realtime Shadows

Realtime shadows are nice, but they can cost quite a lot of performance, both in terms of extra draw calls for the CPU, and extra processing on the GPU. For further details, see the Shadows page.

GPU: Tips for writing high-performance shaders

The performance gap between a high-end PC GPU and a low-end mobile GPU can be literally hundreds of times. The same is true even on a single platform: on a PC, a fast GPU is dozens of times faster than a slow integrated GPU, and on mobile platforms you can see just as large a difference between GPUs.

So keep in mind that GPU performance on mobile platforms and low-end PCs will be much lower than on your development machines. Typically, shaders will need to be hand optimized to reduce calculations and texture reads in order to get good performance. For example, some built-in Unity shaders have their "mobile" equivalents that are much faster (but have some limitations or approximations - that's what makes them faster).

Below are some guidelines that are most important for mobile and low-end PC graphics cards:

Complex mathematical operations

Transcendental mathematical functions (such as pow, exp, log, cos, sin, tan, etc) are quite expensive, so a good rule of thumb is to have no more than one such operation per pixel. Consider using lookup textures as an alternative where applicable.
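
A lookup texture trades a per-pixel calculation for a precomputed table read. The sketch below mimics that on the CPU with a 256-entry table for pow(x, 8) and nearest-entry sampling; the table size, the exponent and the function names are assumptions for illustration, and a real shader would sample a texture, usually with filtering.

```python
def build_pow_lut(exponent, size=256):
    """Precompute x ** exponent for `size` evenly spaced x in [0, 1]."""
    return [(i / (size - 1)) ** exponent for i in range(size)]

def sample_lut(lut, x):
    """Nearest-entry lookup, the table-read stand-in for pow(x, exponent)."""
    x = min(1.0, max(0.0, x))  # clamp like a clamped texture coordinate
    return lut[round(x * (len(lut) - 1))]

lut = build_pow_lut(8)
for x in (0.25, 0.5, 0.9):
    print(x, sample_lut(lut, x), x ** 8)
```

The table is built once, and each lookup then costs a single read. The price is a small quantization error that grows where the function is steep.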

It is not advisable to attempt to write your own normalize, dot, inversesqrt operations, however. If you use the built-in ones then the driver will generate much better code for you.

Keep in mind that the alpha test (discard) operation will make your fragment processing slower.

Floating point operations

You should always specify the precision of floating point variables when writing custom shaders. It is critical to pick the smallest possible floating point format in order to get the best performance. Precision of operations is completely ignored on many desktop GPUs, but is critical for performance on many mobile GPUs.

If the shader is written in Cg/HLSL then precision is specified as follows:

float - full 32-bit floating point format, suitable for vertex transformations but has the slowest performance.
half - reduced 16-bit floating point format, suitable for texture UV coordinates and roughly twice as fast as float.
fixed - 10-bit fixed point format, suitable for colors, lighting calculation and other high-performance operations and roughly four times faster than float.

If the shader is written in GLSL ES then the floating point precision is specified as highp, mediump, lowp respectively.
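
The effect of dropping from 32-bit to 16-bit floats can be seen on the CPU. The sketch below rounds values through IEEE 754 half precision using Python's struct module; it assumes a shader's half behaves like an IEEE half, which depends on the GPU, and the helper name and sample values are made up for the example. The fixed format is not modelled.

```python
import struct

def to_half(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

uv = 0.3             # a texture coordinate in the 0..1 range
position = 1000.3    # a world-space coordinate

print(uv, "->", to_half(uv))              # error well below 1/1000
print(position, "->", to_half(position))  # snaps to the nearest 0.5
```

A UV coordinate survives with an error of about a ten-thousandth, while a large position value is off by a fifth of a unit. This matches the advice to keep float for vertex position calculations and use the smaller formats elsewhere.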

For further details about shader performance, please read the Shader Performance page.

Simple Checklist to make Your Game Faster

Keep vertex count below 200K..3M per frame when targeting PCs, depending on the target GPU
If you're using built-in shaders, pick ones from the Mobile or Unlit category. They work on non-mobile platforms as well, but are simplified and approximated versions of the more complex shaders.
Keep the number of different materials per scene low - share as many materials between different objects as possible.
Set the Static property on non-moving objects to allow internal optimizations like static batching.
Do not use Pixel Lights when it is not necessary - choose to have only a single (preferably directional) pixel light affecting your geometry.
Do not use dynamic lights when it is not necessary - choose to bake lighting instead.
Use compressed texture formats when possible, otherwise prefer 16-bit textures over 32-bit.
Do not use fog when it is not necessary.
Learn the benefits of Occlusion Culling and use it to reduce the amount of visible geometry and draw calls in case of complex static scenes with lots of occlusion. Plan your levels to benefit from occlusion culling.
Use skyboxes to "fake" distant geometry.
Use pixel shaders or texture combiners to mix several textures instead of a multi-pass approach.
If writing custom shaders, always use the smallest possible floating point format:

fixed / lowp - for colors, lighting information and normals,
half / mediump - for texture UV coordinates,
float / highp - avoid in pixel shaders, fine to use in vertex shader for position calculations.

Minimize use of complex mathematical operations such as pow, sin, cos etc. in pixel shaders.
Choose to use fewer textures per fragment.