I just finished the first iteration of my sprite renderer, and I'm questioning its performance.
Currently, I am trying to render 10K 64x64 textured sprites in an 800x600 window. The sprites all use the same texture, vertex shader, and pixel shader, so there are basically no state changes. The sprite renderer is dynamic: it maps the vertex buffer with D3D11_MAP_WRITE_NO_OVERWRITE, then with D3D11_MAP_WRITE_DISCARD once the buffer is full. The buffer is large enough to hold all 10K sprites and draw them in a single draw call. Cutting the buffer size down so that it only fits 1000 sprites per draw call does not seem to change performance.
When I time the render method of my sprite renderer (the only renderer that is running), I get about 40ms. Besides adjusting the size of the vertex buffer, I have tried using a 1x1 texture and making the window smaller (640x480) as a quick and dirty check of whether the GPU is the bottleneck, but I still get 40ms in both cases.
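The 40ms figure is a CPU-side wall-clock measurement around the whole render method. A finer-grained version of that measurement could look roughly like the sketch below. It assumes a std::chrono scoped timer; `ScopedTimer`, `FrameTimings`, and the three stage names are hypothetical and not taken from my actual renderer.

```cpp
#include <chrono>
#include <cstdio>

// Hypothetical helper (not part of the renderer): adds the wall-clock time
// spent inside a scope, in milliseconds, to the referenced accumulator.
class ScopedTimer {
public:
    explicit ScopedTimer(double& accumulatedMs)
        : accumulatedMs_(accumulatedMs),
          start_(std::chrono::steady_clock::now()) {}

    ~ScopedTimer() {
        const auto end = std::chrono::steady_clock::now();
        accumulatedMs_ +=
            std::chrono::duration<double, std::milli>(end - start_).count();
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    double& accumulatedMs_;
    std::chrono::steady_clock::time_point start_;
};

// Hypothetical per-frame buckets: one per CPU-side stage of the render method.
struct FrameTimings {
    double transformMs = 0.0;  // multiplying vertices by the model matrix
    double uploadMs = 0.0;     // map / memcpy / unmap of the vertex buffer
    double drawMs = 0.0;       // issuing the draw call(s)
};

// Prints the accumulated time of each stage for one frame.
inline void printTimings(const FrameTimings& t) {
    std::printf("transform %.3f ms, upload %.3f ms, draw %.3f ms\n",
                t.transformMs, t.uploadMs, t.drawMs);
}
```

Each stage would be wrapped as `{ ScopedTimer t(timings.uploadMs); /* map, memcpy, unmap */ }` and the totals printed once per frame. This only measures time spent on the CPU; the GPU executes submitted work asynchronously, so it would not show GPU cost directly.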
I'm kind of at a loss. What are some of the ways that I could figure out where my bottleneck is?
Only being able to render 10K sprites feels really low, but I can't tell whether I coded a poor renderer with a bottleneck somewhere or whether I'm being limited by my hardware.
Just some other info:
Dev PC specs:
GPU: Intel HD Graphics 4600 / Nvidia GTX 850M (Nvidia is set as the preferred GPU in the Nvidia control panel. Vsync is set to off)
CPU: Intel Core i7-4710HQ @ 2.5GHz
Renderer:
//The renderer has a working depth buffer
//Sprites have matrices that are precomputed. These pretransformed vertices are placed into the buffer
Matrix4 model = sprite->getModelMatrix();
verts[0].position = model * verts[0].position;
verts[1].position = model * verts[1].position;
verts[2].position = model * verts[2].position;
verts[3].position = model * verts[3].position;
verts[4].position = model * verts[4].position;
verts[5].position = model * verts[5].position;
//Vertex buffer is flagged for dynamic use
vertexBuffer = BufferModule::createVertexBuffer(D3D11_USAGE_DYNAMIC, D3D11_CPU_ACCESS_WRITE, sizeof(SpriteVertex) * MAX_VERTEX_COUNT_FOR_BUFFER);
//The vertex buffer is mapped to when adding a sprite to the buffer
//vertexBufferMapType could be D3D11_MAP_WRITE_NO_OVERWRITE or D3D11_MAP_WRITE_DISCARD depending on the data already in the vertex buffer
D3D11_MAPPED_SUBRESOURCE resource = vertexBuffer->map(vertexBufferMapType);
memcpy(((SpriteVertex*)resource.pData) + vertexCountInBuffer, verts, BYTES_PER_SPRITE);
vertexBuffer->unmap();
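The choice between the two map types described above can be sketched as follows. This is a minimal illustration, not my actual code: `MapType` is a stand-in for D3D11_MAP so the snippet compiles without d3d11.h, and `BufferCursor`, `MapDecision`, and `reserve` are hypothetical names.

```cpp
#include <cstddef>

// Stand-in for D3D11_MAP_WRITE_NO_OVERWRITE / D3D11_MAP_WRITE_DISCARD
// (assumption: used only so the sketch builds without the D3D11 headers).
enum class MapType { WriteNoOverwrite, WriteDiscard };

// Hypothetical bookkeeping for how much of the dynamic buffer is in use.
struct BufferCursor {
    explicit BufferCursor(std::size_t capacityInVertices)
        : capacity(capacityInVertices), used(0) {}

    std::size_t capacity;  // total vertices the buffer can hold
    std::size_t used;      // vertices written since the last discard
};

// Result of a reservation: which map type to use and where to write.
struct MapDecision {
    MapType type;
    std::size_t offset;  // first vertex index to write at
};

// Reserves room for `count` vertices (assumes count <= capacity).
// While there is free space, append with NO_OVERWRITE. When the request
// would overflow, the caller is expected to flush the pending draw first;
// the buffer is then discarded and writing restarts at offset 0.
inline MapDecision reserve(BufferCursor& cursor, std::size_t count) {
    if (cursor.used + count > cursor.capacity) {
        cursor.used = count;
        return {MapType::WriteDiscard, 0};
    }
    const std::size_t offset = cursor.used;
    cursor.used += count;
    return {MapType::WriteNoOverwrite, offset};
}
```

With a buffer of 12 vertices and 6 vertices per sprite, the first two sprites would be appended with NO_OVERWRITE at offsets 0 and 6, and the third would trigger a DISCARD and start again at offset 0.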
//The constant buffer used for the MVP matrix is updated once per draw call
D3D11_MAPPED_SUBRESOURCE resource = mvpConstBuffer->map(D3D11_MAP_WRITE_DISCARD);
memcpy(resource.pData, projectionMatrix.getData(), sizeof(Matrix4));
mvpConstBuffer->unmap();
Vertex / Pixel Shader:
cbuffer mvpBuffer : register(b0)
{
matrix mvp;
}
struct VertexInput
{
float4 position : POSITION;
float2 texCoords : TEXCOORD0;
float4 color : COLOR;
};
struct PixelInput
{
float4 position : SV_POSITION;
float2 texCoords : TEXCOORD0;
float4 color : COLOR;
};
PixelInput VSMain(VertexInput input)
{
input.position.w = 1.0f;
PixelInput output;
output.position = mul(mvp, input.position);
output.texCoords = input.texCoords;
output.color = input.color;
return output;
}
Texture2D shaderTexture;
SamplerState samplerType;
float4 PSMain(PixelInput input) : SV_TARGET
{
float4 textureColor = shaderTexture.Sample(samplerType, input.texCoords);
return textureColor;
}
If any more info is needed, feel free to ask. I would really like to know how I can improve this, assuming I'm not hardware limited.