PV227 GPU programming
Marek Vinkler
Department of Computer Graphics and Design

GLSL

- officially the OpenGL Shading Language,
- part of the OpenGL standard (since OpenGL 2.0),
- high-level procedural language (based on C and C++),
- hardware independent,
- performance oriented (through custom compilers),
- a single set of instructions for (almost) all shaders,
- native support for vectors and matrices,
- no pointers (hurray :D) and no strings,
- strict with types,
- no limit on shader length imposed by the language itself,
- part of the OpenGL driver, i.e. the graphics driver,
- common front-end (or at least there should be),
- different (optimized) back-ends,
- shaders are combined into programs,
- linking resolves cross-shader references,
- shaders are strings, not files (no #include).

Scalar data types

- float, int, uint, bool,
- declarations may appear anywhere.

    float f;
    float h = 2.4;      // unsuffixed literal is a float (a double needs the lf suffix, GLSL 4.0+)
    f = 0.2f;
    float ff = 1.5e10;
    ff -= 1.E-3;

    uint n = 5;         // implicit int -> uint conversion needs GLSL 4.0+; write 5u in GLSL 3.3 and below
    n = 15u;
    int a = 0xA;        // hexadecimal literal
    a += 071;           // octal literal

    bool skip = true;
    skip = skip && false;

Vector data types

- vec2, vec3, vec4: float,
- ivec2, ivec3, ivec4: int,
- uvec2, uvec3, uvec4: uint,
- bvec2, bvec3, bvec4: bool,
- two-, three- or four-component vectors of scalar types,
- components are accessed by field selection or array indexing,
- x, y, z, w for positions or directions,
- r, g, b, a for colors,
- s, t, p, q for texture coordinates,
- the three name sets exist only for readability; all of them select the same components (e.g. v.x ≡ v.r ≡ v.s ≡ v[0]).

Matrix data types

- only matrices of floats,
- mat2, mat3, mat4: 2 × 2, 3 × 3, 4 × 4 matrices,
- matmxn: m × n (columns × rows) matrix,
- column-major order (the first index selects the column), as in OpenGL, unlike C/C++.
    mat4 m;
    vec4 v = m[3];       // fourth column
    float f = m[3][1];   // second component (row) of the fourth column vector

Sampler data types

- for texture access,
- variants for floats, ints and unsigned ints (no bool),
- [i|u]sampler{1|2|3}D: access a one-, two- or three-dimensional texture,
- [i|u]samplerCube: access a cube-map texture,
- [i|u]sampler2DRect: access a two-dimensional rectangle texture,
- [i|u]sampler{1|2}DArray: access a one- or two-dimensional texture array,
- [i|u]samplerBuffer: access a texture buffer,
- sampler{1|2}DShadow: access a one- or two-dimensional depth texture with comparison (there is no 3D shadow sampler),
- samplerCubeShadow: access a cube-map depth texture with comparison,
- sampler2DRectShadow: access a two-dimensional rectangle depth texture with comparison,
- sampler{1|2}DArrayShadow: access a one- or two-dimensional depth texture array with comparison,
- the application initializes the samplers,
- samplers are passed to shaders through uniform variables,
- samplers cannot be manipulated in the shader,
- shadow textures and color samplers must not be mixed → undefined behaviour.

    uniform sampler2D sampler;
    vec2 coord = vec2(0.f, 1.f);
    vec4 color = texture(sampler, coord);

Structures

- C++ style (the name of the structure becomes the name of the type),
- structures can be nested (used as members of other structures) and can contain arrays,
- bit-fields are not allowed,
- no union, enum or class.

    struct vertex
    {
        vec3 pos;
        vec3 color;
    };
    vertex v;

Arrays

- available for any type,
- zero indexed,
- no pointers → always declared with [] and a size,
- an array must be declared with the same size in all shaders.

Declarations and scopes

- variable name format is the same as in C/C++ (case sensitive),
- names beginning with "gl_" or "__" are reserved,
- scoping is similar to C++.
    float f;                       // declared from this point until the end of the block
    for (int i = 0; i < 3; ++i)    // i is declared only inside this loop
        f *= f;

    if (i == 1)                    // invalid: i is out of scope here
    {
        ...
    }

Initializers and constructors

- scalar variables may be initialized in the declaration,
- constants must be initialized,
- in and out variables may not be initialized,
- uniform variables may be initialized.

    int a = 0, b, c = 1;
    const float eps = 1e-3f;
    uniform float temp = 36.5f;

- aggregate types are initialized or set with constructors,
- the number of components in the vectors need not match.

    vec2 v = vec2(0.f, 1.f);
    v = vec2(1.f, 0.f);
    vec3 v3 = vec3(v, 0.f);

    v3 = vec3(1.f);   // vec3(1.f, 1.f, 1.f);
    v = vec2(v3);     // vec2(1.f, 1.f);

    float array[4] = float[4](0.f, 1.f, 2.f, 3.f);

    // GLSL 3.30 does not support struct definitions embedded in another struct,
    // so attrib is declared first and used as a named member (the member name attr is illustrative).
    struct attrib
    {
        vec3 color;
        bool active;
    };
    struct person
    {
        attrib attr;
        vec3 pos;
    } person1 = person(attrib(vec3(0.5f, 0.5f, 0.5f), true), v3);

Matrix constructors

- matrix components are read and written in column-major order,
- matrices could not be constructed from matrices in GLSL 1.10; since GLSL 1.20 they can.

    mat2 matrix = mat2(1.f, 2.f, 3.f, 4.f);   // 1.f, 3.f
                                              // 2.f, 4.f
    mat2 identity = mat2(1.f);                // initializes the diagonal
                                              // mat2(1.f, 0.f, 0.f, 1.f);

    vec2 v = vec2(1.0f);
    // mat2 identity2 = mat2(v);              // invalid: a single vec2 supplies too few components
    mat2 identity2 = mat2(v.x);               // scalar form: initializes the diagonal with v.x

Type matching and promotion

- strict matching (prevents ambiguity),
- assigned types and function parameters must match exactly,
- scalar integers may be implicitly converted to scalar floats,
- the strictness may force the programmer to use explicit conversions.
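The matching rules above can be illustrated with a small fragment shader. This is only a sketch, assuming GLSL 3.30; the variable names and the output fragColor are made up for the example.

```glsl
#version 330 core

out vec4 fragColor;   // hypothetical output name

void main()
{
    int count = 3;
    float scaled = count;        // allowed: a scalar int is implicitly converted to float
    // int rounded = scaled;     // rejected: there is no implicit float -> int conversion
    int rounded = int(scaled);   // explicit conversion through a constructor

    vec3 gray = vec3(count);     // constructors convert their arguments as well

    fragColor = vec4(gray / 3.0, float(rounded) / 3.0);
}
```

The commented-out line is the kind of assignment that the compiler refuses; the constructor call on the next line is the usual way around it.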
Type conversions

- performed with constructors,
- no C-style typecast,
- constructors convert values; they never reinterpret the bits,
- conversions to boolean → non-zero becomes true, zero becomes false,
- conversions from boolean → true becomes 1 (1.f), false becomes 0 (0.f).

GLSL qualifiers

- tell the compiler where the value comes from,
- in: vertex attribute (vertex shader), vertex data (geometry shader) or interpolated value (fragment shader),
- uniform: variable that is constant across all shaders,
- out: varying variable passed from one shader to another, or output to the frame buffer,
- const: compile-time constant variable,
- in, uniform and out variables are always global,
- qualifiers are specified before the variable type.

Uniform qualifier

- cannot be modified from the shader,
- updated less frequently, at most once per primitive,
- all data types are supported,
- used for samplers,
- all shaders inside a program share uniform variables.

In qualifier (vertex shader)

- vertex attributes,
- can change as often as once per vertex,
- unsupported data types: boolean scalars and vectors, structures, arrays.

Out qualifier (vertex shader / geometry shader)

- output to the geometry shader / rasterizer,
- interpolation qualifiers for computing fragment values:
  - smooth out: perspective-correct interpolation,
  - noperspective out: interpolation without perspective correction,
  - flat out: no interpolation,
- supported types: floating-point scalars, vectors, matrices and arrays,
- with flat out also [unsigned] integer scalars, vectors and arrays,
- no structures.

In qualifier (fragment shader)

- interpolated values from the rasterizer,
- must match the definition of the out variables in the vertex / geometry shader: interpolation qualifier, type, size and name.
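A vertex shader that exercises the qualifiers described above might look as follows. This is a sketch under assumptions: GLSL 3.30, and every variable name (position, normal, materialId, mvp and the v-prefixed outputs) is invented for illustration.

```glsl
#version 330 core

// Per-vertex attributes: may change with every vertex.
layout(location = 0) in vec3 position;
layout(location = 1) in vec3 normal;
layout(location = 2) in int materialId;

// Uniform: set by the application, read-only in the shader.
uniform mat4 mvp;

// Outputs to the rasterizer, one per interpolation mode.
smooth out vec3 vNormal;            // perspective-correct interpolation
noperspective out vec2 vScreenPos;  // linear interpolation in screen space
flat out int vMaterialId;           // integers must be flat: no interpolation

void main()
{
    gl_Position = mvp * vec4(position, 1.0);
    vNormal = normal;
    vScreenPos = gl_Position.xy / gl_Position.w;
    vMaterialId = materialId;
}
```

A matching fragment shader would declare the same three variables with in instead of out, keeping the interpolation qualifier, type and name identical.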
Out qualifier (fragment shader)

- passed to the per-fragment fixed-function stage,
- floating-point, integer and unsigned integer scalars, vectors and arrays,
- no matrices or structures,
- can be preceded by layout(location = x), where x is the number of the render target.

Constant qualifiers

- compile-time constant,
- not visible outside the shader,
- individual structure members may not be constants,
- initializers may contain only literal values or other const variables.

No qualifiers

- can be both read and written,
- unqualified global variables are shared between shaders of the same type, not between shaders of different types,
- not visible outside the program,
- lifetime limited to a single run of a shader (no "static"),
- different variables for different processors → do NOT use.

Interface blocks

- a common name for several variables,
- different meaning for each qualifier, same syntax,
- used for passing data between shaders and for loading uniform variables.

    storage_qualifier block_name
    {

    } [instance_name];

- block_name is used from OpenGL,
- instance_name is optional and creates named instances inside GLSL,
- arrays of instances are possible.

Inter shader communication

Name based matching:

    // vertex shader
    out vec4 color;

    -------------------
    // geometry shader
    in vec4 color[];
    out vec4 colorFromGeom;

    -------------------
    // fragment shader
    in vec4 colorFromGeom;

- names in the shaders must match,
- an in and an out variable cannot have the same name,
- the same shader cannot be used for both vertex → fragment and vertex → geometry → fragment pipelines.
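The fragment-shader side of these rules can be sketched as below: an input matched by name, a const initialized from literals, and two outputs bound to render targets with layout(location = x). The example assumes GLSL 3.30 and a frame buffer with two color attachments; the names outColor, outLuma and lumaWeights are hypothetical.

```glsl
#version 330 core

// Matched by name with an "out vec4 colorFromGeom" in the previous stage.
in vec4 colorFromGeom;

// One output per render target.
layout(location = 0) out vec4 outColor;
layout(location = 1) out vec4 outLuma;

// Compile-time constant, initialized only from literal values.
const vec3 lumaWeights = vec3(0.2126, 0.7152, 0.0722);

void main()
{
    float luma = dot(colorFromGeom.rgb, lumaWeights);
    outColor = colorFromGeom;
    outLuma = vec4(vec3(luma), 1.0);
}
```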
Inter shader communication

Location based matching:

    // vertex shader
    layout(location = 0) out vec3 normalOut;
    layout(location = 1) out vec4 colorOut;

    -------------------
    // geometry shader
    layout(location = 0) in vec3 normalIn[];
    layout(location = 1) in vec4 colorIn[];

    layout(location = 0) out vec3 normalOut;
    layout(location = 1) out vec4 colorOut;

    -------------------
    // fragment shader
    layout(location = 0) in vec3 normalIn;
    layout(location = 1) in vec4 colorIn;

- locations in the shaders must match,
- a location holds at most one vec4-sized item (not an aggregate type),
- assigning location numbers by hand is laborious.

Interface based matching:

    // vertex shader
    out Data {
        vec3 normal;
        vec3 eye;
        vec2 texCoord;
    } DataOut;

    -------------------
    // geometry shader
    in Data {
        vec3 normal;
        vec3 eye;
        vec2 texCoord;
    } DataIn[];

    out Data {
        vec3 normal;
        vec3 eye;
        vec2 texCoord;
    } DataOut;

    -------------------
    // fragment shader
    in Data {
        vec3 normal;
        vec3 eye;
        vec2 texCoord;
    } DataIn;
    ...
    DataOut.normal = normalize(someVector);

- block names in the shaders must match,
- data is manipulated through the instance name,
- the blocks must have the same members.

Uniform interface blocks

- sharing uniforms between programs,
- setting multiple uniforms at once,
- named blocks of uniform variables (individual items are globally scoped),
- backed by buffers for data transfer,
- used for setting transform matrices, common variables in shader families etc.

    layout(xxx) uniform ColorBlock {
        vec4 diffuse;
        vec4 ambient;
    };
    ...
    out vec4 outputF;

    void main() {
        outputF = diffuse + ambient;
    }

- layout specifies the storage (the default is implementation dependent),
- std140: layout specified by OpenGL, blocks can be shared between shaders,
- shared: implementation-dependent layout, blocks can be shared between shaders,
- packed: unused variables are optimized out, not shareable,
- uniform blocks are connected with buffers through binding points,
- block indices are assigned during program link,
- multiple blocks can be bound to the same binding point.

    GLuint bindingPoint = 1, buffer, blockIndex;
    float myFloats[8] = {1.0, 0.0, 0.0, 1.0, 0.4, 0.0, 0.0, 1.0};

    // Assign the uniform block to the binding point (p is the program object)
    blockIndex = glGetUniformBlockIndex(p, "ColorBlock");
    glUniformBlockBinding(p, blockIndex, bindingPoint);

    glGenBuffers(1, &buffer);
    glBindBuffer(GL_UNIFORM_BUFFER, buffer);

    // Fill the buffer and assign it to the binding point
    glBufferData(GL_UNIFORM_BUFFER, sizeof(myFloats), myFloats, GL_DYNAMIC_DRAW);
    glBindBufferBase(GL_UNIFORM_BUFFER, bindingPoint, buffer);

- individual uniforms may be aligned in memory,
- to set them correctly we need to compute their offsets,
- offsets are queried with glGetActiveUniformBlockiv and glGetActiveUniformsiv,
- values are set with glBufferSubData.

Shader side:

    layout(std140) uniform ColorBlock2 {
        vec3 diffuse;
        vec3 ambient;
    };

Application side:

    GLuint bindingPoint = 1, buffer, blockIndex;
    float myFloats[3] = {0.4, 0.0, 0.0};

    glGenBuffers(1, &buffer);
    glBindBuffer(GL_UNIFORM_BUFFER, buffer);

    // Allocate storage before the partial update
    // (assumed size: two vec3 members, each padded to 16 bytes under std140)
    glBufferData(GL_UNIFORM_BUFFER, 8 * sizeof(float), NULL, GL_DYNAMIC_DRAW);

    // Update only ambient. Notice the offset: std140 aligns a vec3 to 4 floats
    glBufferSubData(GL_UNIFORM_BUFFER, 4 * sizeof(float), sizeof(myFloats), myFloats);

Program flow

- similar to C++,
- main is the entry point of a shader,
- global variables are initialized before main is executed,
- looping: for, while, do-while, break, continue,
- selection: if, if-else, if-else if-else, ?: and switch,
- condition expressions must be boolean,
- short-circuit evaluation of &&, || and ?:,
- no goto,
- discard prevents the fragment from updating the frame buffer.

Functions

- C++-style overloading by parameter type,
- a prototype declaration or the definition must precede the call,
- parameters must match exactly,
- values are returned with return,
- no recursion,
- the program's entry point is the function void main().

Calling conventions

- value-return,
- all input parameter values are copied to the function before execution,
- all output parameter values are copied from the function after execution,
- parameter behaviour is controlled by the qualifiers in (default), out and inout,
- in parameters can also be const (not writeable inside the function).

Functions continued

- arrays and structures are also copied by value,
- any return type (including structures).
    void foo1(in vec3 normal, float eps, inout vec3 coord);
    vec3 foo2(in vec3 normal, float eps, in vec3 coord);
    void foo3(in vec3 normal, float eps, in vec3 coord, out vec3 coordOut);

    // Get coord
    vec3 coord;
    foo1(normal, eps, coord);
    coord = foo2(normal, eps, coord);
    foo3(normal, eps, coord, coord);

Swizzling

- used to select (rearrange) components of a vector,
- must use component names from the same set,
- the result must still be a valid type (no more than 4 components),
- R-values: any combination and repetition of components,
- L-values: no repetition of components.

    vec4 pos = vec4(1.f, 2.f, 3.f, 4.f);
    vec2 v1 = pos.xy;            // (1.f, 2.f)
    vec3 v2 = pos.abb;           // (4.f, 3.f, 3.f)
    vec4 v3 = pos.xyrs;          // illegal: different sets
    vec4 o;
    o.xw = v1;                   // o.x = 1.f, o.w = 2.f; the right-hand side must be a vec2
    o.xx = vec2(0.f);            // illegal: repetition

Operations on vectors and matrices

- mostly component-wise (independently for each component),
- vector sizes must match,
- vector * matrix and matrix * matrix are not component-wise,
- logical operations (!, &&, ||, ^^) work only on scalar booleans,
- the built-in function not performs component-wise logical NOT on boolean vectors,
- relational operators (<, >, <=, >=) work on scalar floats and integers → scalar boolean,
- built-in functions like lessThanEqual perform component-wise relational operations on vectors,
- == and != operate on whole values (arrays included since GLSL 1.20) → scalar boolean,
- for component-wise comparison use equal and notEqual → boolean vector,
- any and all turn a boolean vector into a boolean scalar,
- = assigns whole values, including structures (and arrays since GLSL 1.20); the arithmetic variants (+=, -=, *=, /=) do not apply to structures and arrays.
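The difference between the scalar comparison operators and the component-wise built-ins can be sketched as follows. The example assumes GLSL 3.30; the input vPosition, the output fragColor and the helper insideUnitBox are hypothetical names.

```glsl
#version 330 core

in vec3 vPosition;    // hypothetical interpolated input
out vec4 fragColor;   // hypothetical output

// Component-wise comparisons yield a bvec3; all() collapses it to one bool.
bool insideUnitBox(vec3 p)
{
    bvec3 aboveMin = greaterThanEqual(p, vec3(0.0));
    bvec3 belowMax = lessThanEqual(p, vec3(1.0));
    return all(aboveMin) && all(belowMax);
}

void main()
{
    // == compares the whole vectors and yields a single bool.
    bool diagonal = (vPosition.xy == vPosition.yx);

    // equal() compares component by component; not() negates each component.
    bvec2 differs = not(equal(vPosition.xy, vPosition.yx));

    // diagonal is true exactly when no component differs.
    if (diagonal == any(differs))
        discard;   // never taken; shown only to relate the two forms

    // R-value swizzle with reordered components.
    fragColor = insideUnitBox(vPosition) ? vec4(vPosition.zyx, 1.0)
                                         : vec4(0.0, 0.0, 0.0, 1.0);
}
```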
Preprocessor

- basically the same as in C,
- macros beginning with "GL_" or "__" are reserved,
- shaders should declare the GLSL version they are written for (#version number) on the first line of the code,
- useful pragmas: optimize(on/off) and debug(on/off),
- language extensions can be accessed using #extension.

Built-in functions

- make shader programming easier,
- expose hardware functionality that cannot be written in the shader,
- provide optimized (possibly hardware-accelerated) implementations of common functions,
- usually have both scalar and vector variants,
- can be overridden by redeclaration,
- may be specific to a single shader type.

Shader specific functions

Geometry shader:

- void EmitVertex(void); uses the current output state for a new vertex,
- void EndPrimitive(void); completes the current output primitive.

Keep up to date

- http://www.opengl.org/sdk/docs/man/
- http://www.opengl.org/sdk/docs/manglsl/
- http://www.opengl.org/registry/
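As an illustration of the geometry-shader functions and the #version directive mentioned above, here is a minimal pass-through geometry shader. It is a sketch assuming GLSL 3.30 and triangle input; the color and colorFromGeom names follow the name-based matching example from earlier in the slides.

```glsl
#version 330 core

layout(triangles) in;
layout(triangle_strip, max_vertices = 3) out;

in vec4 color[];          // one entry per input vertex
out vec4 colorFromGeom;

void main()
{
    for (int i = 0; i < 3; ++i)
    {
        // Set the output state for this vertex ...
        gl_Position = gl_in[i].gl_Position;
        colorFromGeom = color[i];
        // ... and emit a vertex that captures it.
        EmitVertex();
    }
    // Close the triangle strip after three vertices.
    EndPrimitive();
}
```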