PV227 GPU Rendering
Marek Vinkler
Department of Computer Graphics and Design

Workflow
- the same operation is run exactly once for every vertex/patch/primitive/fragment,
- independent states, no communication,
- the program is for the entire pipeline,
- data can be passed between shaders.

Workflow – Shaders
Which shader to use for a given task? It depends on the modified data:
- per vertex → vertex shader,
- per patch → tessellation shader,
- per primitive → geometry shader,
- per fragment → fragment shader,
- no idea → compute shader.

Workflow – Shaders (cont.)
Which shader to use for a given task? It may depend on special properties of the processors:
- cancel computation → fragment or geometry shader,
- some built-in functions are defined only for certain processors.

Workflow – Properties
- Shaders replace the entire fixed pipeline.
- If we want to modify the vertex transformation behaviour, we also have to write code for lighting, texture generation, . . .
- This may be tedious when only small changes are desired.
- In bigger projects you usually rewrite it anyway ;-).

Vertex Processor
Replaces the following fixed functionality:
- Vertex transformation by modelview and projection matrix.
- Texture coordinates transformation by texture matrices.
- Transformation of normals to eye coordinates.
- Rescaling and normalization of normals.
- Texture coordinate generation.
- Per vertex lighting computations.
- Color material computations.
- Point size distance attenuation.

Vertex Processor – Fixed Functionality
Fixed functionality applied to the result:
- Perspective division on clip coordinates.
- Viewport mapping.
- Depth range scaling.
- View frustum clipping.
- Front face determination.
- Culling.
- Flat-shading.
- Associated data clipping.
- Final color processing.

Vertex Processor – Input and Output
Figure: Scan from OpenGL Shading Language 3rd edition

Input Data
- vertex attributes (user-defined),
- uniforms (built-in, user-defined),
- textures,
- special built-in variables (very few in core).

Vertex Attributes
- user-defined per vertex data,
- consist of a number of indexed locations called current vertex state,
- limited number of attributes,
- attributes are set with the glVertexAttrib family of functions,
- one indexed location can hold a quadruple,
- matrix attributes are stored in column-major order in successive attribute locations,
- the same value can be set for all vertices (that do not have it otherwise specified).

Vertex Attributes – Binding
void glBindAttribLocation(GLuint program, GLuint index, const GLchar *name);
    program − the handle of the program.
    index − index of the generic vertex attribute to be bound.
    name − string containing the name of the vertex shader attribute variable to which index is to be bound.
- Used before linking to set the attribute name-index pairing.
- For a matrix name, the remaining columns are assigned automatically to index+1, [index+2, [index+3]].
- Reserved variables (gl_*) must not be bound this way.
- Can be used to pair attributes from the same array consistently across different shaders.

Vertex Attributes – Binding (cont.)
GLint glGetAttribLocation(GLuint program, const GLchar *name);
    program − the handle of the program.
    name − string containing the name of the vertex shader attribute variable to be queried.
- Used after linking to get the attribute name-index pairing.
- For a matrix name the returned index is that of the first column (the remaining columns are at index+1, [index+2, [index+3]]).
- For non-existent attributes or reserved variables (gl_*), -1 is returned.
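
The column-major storage of matrix attributes in successive locations can be illustrated without any OpenGL calls. The sketch below is only an illustration: the helper names matrix_column_location and mat4_column are hypothetical and not part of OpenGL, and it assumes a 4x4 float matrix stored as 16 consecutive floats.

```c
#include <assert.h>

/* Hypothetical helper: attribute location of column `col` of a matrix
   attribute whose first column is bound to `base_location`.
   Each column occupies one attribute location. */
int matrix_column_location(int base_location, int col)
{
    assert(col >= 0 && col < 4);
    return base_location + col;
}

/* Copies column `col` of a column-major 4x4 matrix into out[4].
   In column-major order, the four values of one column are adjacent. */
void mat4_column(const float m[16], int col, float out[4])
{
    assert(col >= 0 && col < 4);
    for (int row = 0; row < 4; ++row) {
        out[row] = m[col * 4 + row];
    }
}
```

With a mat4 attribute bound to index 5, the columns would land at locations 5, 6, 7 and 8, and each location receives one quadruple taken from four adjacent floats of the array.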

Vertex Attributes – Enable
void glEnableVertexAttribArray(GLuint index);
void glDisableVertexAttribArray(GLuint index);
    index − index of the generic vertex attribute to be enabled/disabled.
- Enables/disables vertex attributes for use in the draw calls.
- By default all generic attributes are disabled.

Vertex Attributes – Data
void glVertexAttribPointer(GLuint index, GLint size, GLenum type, GLboolean normalized, GLsizei stride, const GLvoid *pointer);
void glVertexAttribIPointer(GLuint index, GLint size, GLenum type, GLsizei stride, const GLvoid *pointer);
    index − index of the generic vertex attribute to be modified.
    size − the number of components of the generic attribute (1|2|3|4).
    type − the type of each component.
    normalized − whether fixed-point data should be normalized.
    stride − byte offset between consecutive vertex attributes.
    pointer − offset of the first attribute in the buffer bound to the GL_ARRAY_BUFFER target.
- Specifies the location and format of vertex attributes.
- The I variant passes integer attributes unchanged.

Vertex Arrays and Buffers
- All attributes are bound to a single vertex array object (VAO).
- This VAO references a number of buffers holding the individual attributes.
- The VAO holds all the information for the draw call, e.g. glDrawArrays or glDrawElements.
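
The stride and pointer parameters of glVertexAttribPointer are easiest to see on an interleaved buffer. The following is a minimal sketch that makes no OpenGL calls; the Vertex and AttribLayout types and both functions are hypothetical names introduced only for illustration, and the vertex format (four position floats followed by three color floats) is an assumption.

```c
#include <stddef.h>  /* size_t, offsetof */

/* Hypothetical interleaved vertex: position followed by color. */
typedef struct {
    float position[4];
    float color[3];
} Vertex;

/* Values that would be handed to glVertexAttribPointer for one attribute. */
typedef struct {
    int    size;    /* number of components (1..4) */
    size_t stride;  /* bytes between the same attribute of consecutive vertices */
    size_t offset;  /* byte offset of the first value inside the buffer */
} AttribLayout;

AttribLayout position_layout(void)
{
    AttribLayout l = { 4, sizeof(Vertex), offsetof(Vertex, position) };
    return l;
}

AttribLayout color_layout(void)
{
    AttribLayout l = { 3, sizeof(Vertex), offsetof(Vertex, color) };
    return l;
}
```

In a real program the result l would be passed on as glVertexAttribPointer(index, l.size, GL_FLOAT, GL_FALSE, (GLsizei)l.stride, (const GLvoid *)l.offset). A stride of 0 is the special case of tightly packed data, where each attribute lives in its own buffer.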

Vertex Arrays and Buffers – Example
    GLuint vao;

    // Create the VAO
    glGenVertexArrays(1, &vao);
    glBindVertexArray(vao);

    // Create buffers for our vertex data
    GLuint buffers[2];
    glGenBuffers(2, buffers);

    // Vertex coordinates buffer (vertices is an array, so sizeof gives its byte size)
    glBindBuffer(GL_ARRAY_BUFFER, buffers[0]);
    glBufferData(GL_ARRAY_BUFFER, sizeof(vertices), vertices, GL_STATIC_DRAW);
    glEnableVertexAttribArray(VERTEX_COORD_ATTRIB);
    // 4 floats per vertex, not normalized, tightly packed (stride 0), offset 0
    glVertexAttribPointer(VERTEX_COORD_ATTRIB, 4, GL_FLOAT, GL_FALSE, 0, 0);

    // Index buffer
    glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, buffers[1]);
    glBufferData(GL_ELEMENT_ARRAY_BUFFER, sizeof(faceIndex), faceIndex, GL_STATIC_DRAW);

Vertex Arrays and Buffers – Example (cont.)
    // Unbind the VAO
    glBindVertexArray(0);

    . . .

    // Render VAO
    glBindVertexArray(vao);
    glDrawElements(GL_TRIANGLES, faceCount * 3, GL_UNSIGNED_INT, 0);

Uniforms
user-defined:
- read-only in all shaders,
- constant per draw call,
- changed per primitive at most (not recommended for performance),
- can be initialized inside the shader,
- location indices are assigned during link,
- limited number of uniforms (both built-in and user-defined),
- uniforms can be grouped into named blocks.

Uniforms – Blocks
- all variables outside a named block are in the default block,
- sampler variables must be in the default block,
- cannot be used for another program,
- advantageous for variables tied to an individual shader/program.

Uniforms – Location
GLint glGetUniformLocation(GLuint program, const GLchar *name);
    program − the handle of the program.
    name − string containing the name of the uniform variable to be queried.
- Returns the memory location of a uniform variable.
- Must be called after linking the program (the location may change with each link).
- Not usable for structures, arrays of structures, or subcomponents of vectors and matrices.
- Elements of structures and arrays can be addressed with “.” and “[]”.
- For non-existent uniforms or reserved names (gl_*), -1 is returned.

Uniforms – Lifetime
- during link uniforms are set to 0,
- their value can be modified only when their program is in use,
- the values are preserved when the program is switched off and on,
- uniforms are set with the glUniform family of functions.

Uniforms – Data
void glUniform{1|2|3|4}{f|i|ui}(GLint location, TYPE v);
    location − the location of the uniform variable.
    v − 1|2|3|4 component value of the uniform.
void glUniform{1|2|3|4}{f|i|ui}v(GLint location, GLsizei count, const TYPE *v);
    location − the location of the uniform variable.
    count − number of array elements to be specified.
    v − array of values to be loaded.
void glUniformMatrix{2|3|4|2x3|3x2|2x4|4x2|3x4|4x3}fv(GLint location, GLsizei count, GLboolean transpose, const GLfloat *v);
    location − the location of the uniform variable.
    count − number of matrices to be specified.
    transpose − load from row-major order?
    v − array of values to be loaded.

Uniforms – Properties
- types and sizes of the uniform variables must match the functions,
- locations for array elements and other variables cannot be computed: loc("A[n]") != loc("A")+n.

Uniforms – Example
    uniform struct
    {
        struct
        {
            float a;
            float b[10];
        } c[2];
        vec2 d;
    } e;

    loc1 = glGetUniformLocation(prog, "e.d");         // valid: vec2
    loc2 = glGetUniformLocation(prog, "e.c[0]");      // invalid: struct
    loc3 = glGetUniformLocation(prog, "e.c[0].b");    // valid: array
    loc4 = glGetUniformLocation(prog, "e.c[0].b[2]"); // valid: array element

    // f points to an array of at least 10 floats
    glUniform2f(loc1, 1.0f, 2.0f); // valid: vec2
    glUniform2i(loc1, 1, 2);       // invalid: not ivec2
    glUniform1f(loc1, 1.0f);       // invalid: not float
    glUniform1fv(loc3, 10, f);     // valid: b[0] (+10)
    glUniform1fv(loc4, 10, f);     // invalid: out of range
    glUniform1fv(loc4, 8, f);      // valid: b[2] (+8)

Uniforms – Samplers
- only glUniform1i and glUniform1iv can be used to load samplers,
- the loaded value is the index of the texture unit to be used,
- the same unit cannot be loaded into samplers of different types.

Special Built-in Variables
- gl_VertexID – implicit vertex index passed by e.g. glDrawArrays,
- gl_InstanceID – implicit instance index passed by instanced draw calls, e.g. glDrawArraysInstanced.

Output Data
- special built-in variables (very few in core),
- varying variables (user-defined).

Special Built-in Variables
- out vec4 gl_Position; homogeneous position in clip space (modelview, projection), must be set, used by the rest of the pipeline,
- out float gl_PointSize; size of the rasterized points, must be set if points are rasterized,
- out float gl_ClipDistance[]; array of distances to user clipping planes, must be set if user clipping is enabled.

Varying Variables
- passed from vertex processor to rasterizer,
- anything can be passed,
- more variables can be output than are used by the follow-up shader,
- interpolation type can be set,
- limited number of interpolated values.

Geometry Processor
- Optional (no fixed pipeline equivalent).
- Receives assembled primitives, outputs zero (culling) or more primitives.
- May receive adjacency information.
- The type of input and output primitives need not match (triangles → points).
- Designed for moderate geometry amplification, not tessellation.

Geometry Processor – Primitives
- Input primitives: points, lines, lines_adjacency, triangles, triangles_adjacency.
- Output primitives: points, line_strip, triangle_strip.

Input Data
- varying variables (built-in, user-defined),
- uniforms (built-in, user-defined),
- textures,
- special built-in variables (very few in core).

Varying Variables
- built-in and user-defined varying variables for each vertex,
- in the form of an array of structures (user-defined or gl_PerVertex),
- definition must match the vertex shader.

Uniforms
- defined the same way as for the vertex shader,
- can be the same set of variables as in the vertex shader,
- no need to set up uniforms for each shader,
- limited number of uniforms (both built-in and user-defined).

Output Data
- same output as the vertex shader,
- definition of primitives,
- special built-in variables (very few in core),
- varying variables (user-defined).

Fragment Processor
Replaces the following fixed functionality:
- Texture environments and texture functions.
- Texture application.
- Color sum.
- Fog.

Fragment Processor – Fixed Functionality
The fragment shader does not change the following operations:
- Texture image specification.
- Alternate texture image specification.
- Compressed texture image specification.
- Texture parameters that behave as specified even when a texture is accessed from within a fragment shader.
- Texture state and proxy state.
- Texture object specification.
- Texture comparison modes.

Fragment Processor – Input and Output
Figure: Scan from OpenGL Shading Language 3rd edition

Input Data
- interpolated varying variables (built-in, user-defined),
- uniforms (built-in, user-defined),
- textures,
- special built-in variables (very few in core).

Varying Variables
- in vec4 gl_FragCoord; window coordinate position (xy), fragment depth (z),
- in bool gl_FrontFacing; whether the fragment originated from a front-facing primitive,
- in vec2 gl_PointCoord; position of the fragment within the point (only for point primitives),
- user-defined varying variables, definition must match the vertex/geometry shader.

Uniforms
- defined the same way as for the vertex/geometry shader,
- can be the same set of variables as in the vertex/geometry shader,
- no need to set up uniforms for each shader,
- limited number of uniforms (both built-in and user-defined).

Output Data
- special built-in variables (very few in core),
- user-defined output.

Special Built-in Variables
- out float gl_FragDepth; replaces the fragment depth (the fragment can also be discarded),
- the fragment's x,y position cannot be changed.

User-defined Output
- output color or discard fragment,
- multiple buffers may be updated.

User-defined Output – Rendering Targets
void glDrawBuffers(GLsizei n, const GLenum *bufs);
    n − number of render targets.
    bufs − array of output buffers.
- sets the output rendering targets.
void glBindFragDataLocation(GLuint program, GLuint colorNum, const char *name);
    program − the handle of the program.
    colorNum − the color number to bind the user-defined varying out variable to.
    name − the name of the varying out variable whose binding to modify.
- colorNum is the index of the target as specified in glDrawBuffers,
- also possible to set from shader code.

Tools
- shaders are just strings → any editor you desire,
- RenderMonkey (http://developer.amd.com/resources/archive/archived-tools/gpu-tools-archive/rendermonkey-toolsuite/),
- FX Composer (https://developer.nvidia.com/fx-composer),
- OpenGL Shader Designer (http://www.opengl.org/sdk/tools/ShaderDesigner/),
- and many more, mostly discontinued,
- shader programming got diverse, only IDEs for specialized tasks.

Debuggers & Profilers
- NVIDIA NSight: for registered developers,
- AMD CodeXL: directly downloadable,
- gDEBugger (http://www.gremedy.com/): directly downloadable, up to OpenGL 3.2,
- VS2010: use what is already there, syntax highlighting, IntelliSense.

Project Setup
- create folder H:\PV227 (not Desktop, Documents, . . . ),
- create subfolders Templates and Final,
- unzip the libraries into the Templates folder (optionally also into Final),
- unzip the source code into these folders,
- launch the projects with Ctrl-F5 (keeps the console open).

GLUT
- multiplatform windowing system for OpenGL,
- not updated, alternatives exist: FreeGLUT (http://freeglut.sourceforge.net/),
- download built libraries at http://www.transmissionzero.co.uk/software/freeglut-devel/.

GLEW
- library for accessing OpenGL core and extension functionality,
- download built libraries at http://glew.sourceforge.net/,
- use the older version 1.10.0.

Visual Studio Paths
Project properties → set for All Configurations:
- VC++ Directories, Include Directories: \freeglut\include;\glew-1.10.0\include;
- VC++ Directories, Library Directories: \freeglut\lib;\glew-1.10.0\lib\Release\Win32;
- Debugging, Environment: PATH=\freeglut\bin;\glew-1.10.0\bin\Release\Win32;

Visual Studio Editor
Syntax highlighting:
- Tools → Options, Text Editor → File Extension,
- add vert, geom, frag with Microsoft Visual C++ syntax,
- update usertype.dat in the VS2010 directory C:\Program Files\Microsoft Visual Studio 10.0\Common7\IDE.

Example
- Complete the CPU calls.
- Vertex Shader: Project the triangle!
- Fragment Shader: Shade the triangle!

“Advanced” example
- Rotate the triangle on the CPU.
- Vertex Shader: Rotate the triangle, set a varying attribute (color).
- Fragment Shader: Draw the inverse color.

Built-in Constants
- values accessible from the OpenGL API by glGet,
- give the minimum value for an OpenGL conforming implementation.

    const int gl_MaxVertexAttribs = 16;
    const int gl_MaxVertexUniformComponents = 1024;
    const int gl_MaxFragmentUniformComponents = 1024;
    . . .

    glGetIntegerv(GL_MAX_{VERTEX|GEOMETRY|FRAGMENT}_UNIFORM_COMPONENTS, &nComponents);
    glGetIntegerv(GL_MAX_VARYING_FLOATS, &nFloats);
    glGetIntegerv(GL_MAX_VERTEX_ATTRIBS, &nAttribs);
    glGetIntegerv(GL_MAX_DRAW_BUFFERS, &nBuffers);
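
Because the built-in constants are guaranteed minimums, a shader interface that stays within them should run on any conforming implementation. The sketch below checks a vertex attribute budget against the minimum of 16 quoted above. The names MIN_MAX_VERTEX_ATTRIBS, attrib_locations_needed and fits_minimum_limits are hypothetical, and the counting rule (one location per vector, one per matrix column) assumes single-precision attributes only.

```c
/* Guaranteed minimum of gl_MaxVertexAttribs, as listed on the slide. */
enum { MIN_MAX_VERTEX_ATTRIBS = 16 };

/* columns[i] is 1 for a scalar or vector attribute and 2..4 for a matrix
   attribute; every column occupies one attribute location. */
int attrib_locations_needed(const int *columns, int count)
{
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += columns[i];
    }
    return total;
}

/* Returns 1 if the attribute set fits within the guaranteed minimum,
   so no run-time query is needed; returns 0 otherwise. */
int fits_minimum_limits(const int *columns, int count)
{
    return attrib_locations_needed(columns, count) <= MIN_MAX_VERTEX_ATTRIBS;
}
```

When the budget does not fit, the actual limit of the running implementation can still be larger, which is what the glGetIntegerv(GL_MAX_VERTEX_ATTRIBS, &nAttribs) query reports.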