Users Manual
1.  // input (same struct is output from "cg_multipaintVP.cg")
    struct MultiPaintV2F {
        float4 HPosition : POSITION;
        float4 TexCoords : TEXCOORD0;   // base ST coordinates
        float3 OPosition : TEXCOORD1;   // position (obj space)
        float3 Normal    : TEXCOORD2;   // normal (eye space)
        float3 VPosition : TEXCOORD3;   // view pos (obj space)
        float3 T         : TEXCOORD4;   // tangent (obj space)
        float3 B         : TEXCOORD5;   // binormal (obj space)
        float3 N         : TEXCOORD6;   // normal (obj space)
        float4 LightVecO B  sve UP    iiie chis  cel9y specs
    };

    // channels in our material map
    #define SPEC_STR x
    #define METALNESS y
    #define NORM_SPEC_EXPON z

    // subfields in "SpecData"
    #define MINPOWER x
    #define MAXPOWER y
    #define MAXSPEC z

    // subfields in "ReflData"
    #define FRESNEL_MIN x
    #define FRESNEL_MAX y
    #define FRESNEL_EXPON z
    #define REFL_STRENGTH w

    808 00504 0000 006    NVIDIA    167    Cg Language Toolkit

    // subfields in "BumpData"
    #define BUMP_SCALE x

    half4 main(MultiPaintV2F IN,
               uniform sampler2D ColorMap,      // color
               uniform sampler2D MaterialMap,   // see above
               uniform sampler2D NormalMap,     // tangent-space normals
               uniform samplerCUBE EnvMap,      // environment skybox
               uniform float4 SpecData,         // see above
               uniform float4 ReflData,         // see above
               uniform float4 BumpData          // see above
              ) : COLOR
    {
        half4 surfCol = tex2D(Col
2.  };

    DWORD declaration[] = {
        D3DVSD_STREAM(0),
        D3DVSD_REG(cgD3D8ResourceToInputRegister(CG_POSITION), D3DVSDT_FLOAT3),
        D3DVSD_REG(cgD3D8ResourceToInputRegister(CG_COLOR0), D3DVSDT_D3DCOLOR),
        D3DVSD_STREAM(1),
        D3DVSD_SKIP(4),
        D3DVSD_REG(cgD3D8ResourceToInputRegister(CG_TEXCOORD0), D3DVSDT_FLOAT2),
        D3DVSD_END()
    };

    If it is possible to do so, the functions cgD3D9ResourceToDeclUsage() and cgD3D8ResourceToInputRegister() convert a CGresource enumerated type into a Direct3D vertex shader input register:

        BYTE  cgD3D9ResourceToDeclUsage(CGresource resource);
        DWORD cgD3D8ResourceToInputRegister(CGresource resource);

    If the resource is not a vertex shader input resource, the call to cgD3D9ResourceToDeclUsage() returns CGD3D9_INVALID_REG and the call to cgD3D8ResourceToInputRegister() returns CGD3D8_INVALID_REG.

    To write the vertex declarations described above based on the program parameters, which eliminates the reference to any semantic, use cgD3D9ResourceToDeclUsage() or cgD3D8ResourceToInputRegister():

        CGparameter position = cgGetNamedParameter(program, "position");
        CGparameter color    = cgGetNamedParameter(program, "color");
        CGparameter texCoord = cgGetNamedParameter(program, "texCoord");

        const D3DVERTEXELEMENT9 declaration[] = {
            7L  9   D   slizcor  Melo  y  D3DDECL
3.  With this program handle, cgEvaluateProgram() evaluates the program over the same one-, two-, or three-dimensional domain. Its parameters are as follows:

    - a CGprogram handle
    - a float* to an output buffer
    - the number of components in the output buffer (1, 2, 3, or 4)
    - the number of positions in the x dimension at which to evaluate the function
    - the number of positions in the y dimension
    - the number of positions in the z dimension

    The total size of the buffer should be equal to the product of the number of positions in each of the dimensions and the number of components in the buffer.

        #define RES 256
        #define NCOMPS 4
        float *buf = new float[NCOMPS*RES*RES];
        cgEvaluateProgram(tp, buf, NCOMPS, RES, RES, 1);
        // Do something with buf
        delete [] buf;

    It is a runtime error to pass a CGprogram that doesn't have the CG_PROFILE_GENERIC profile to cgEvaluateProgram().

    Annotations

    Additionally, each variable, technique, pass, and program in the file can have an optional annotation. The annotation is a per-variable-instance structure that contains data that the effect author wants to communicate to a CgFX-aware application, such as an artist tool. The application can then allow the variable to be manipulated, based on a GUI element that is appropriate for the type of annotation.

    An annotation can be used to describe a user in
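    The buffer-sizing rule for cgEvaluateProgram() in item 3 can be sketched in plain C. This is only an illustration: eval_buffer_len is a hypothetical helper, not a Cg runtime function, and it assumes the buffer length is counted in floats.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of the Cg runtime): number of floats the
   output buffer must hold when evaluating over an nx * ny * nz domain
   with ncomps components per position. */
size_t eval_buffer_len(int ncomps, int nx, int ny, int nz)
{
    assert(ncomps >= 1 && ncomps <= 4);      /* 1, 2, 3, or 4 components */
    assert(nx >= 1 && ny >= 1 && nz >= 1);   /* at least one position per axis */
    return (size_t)ncomps * (size_t)nx * (size_t)ny * (size_t)nz;
}
```

    With the values from the snippet (NCOMPS 4, RES 256, a two-dimensional domain with z set to 1), the helper gives 4 * 256 * 256 * 1 floats, which matches the size passed to new in the original code.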
4.          recurra losa  ria Color 15 09  p       808 00504 0000 006 207  NVIDIA          Cg Language Toolkit       Shadow Mapping    Description    This effect shows generating texture coordinates for shadow mapping  along  with using the shadow map in the lighting equation per pixel  Fig  19          Fig  19  Example of Shadow Mapping       208 808 00504 0000 006  NVIDIA    Basic Profile Sample Shaders    Vertex Shader Source Code for Shadow Mapping    struct appdata                     float3 Position POSITION   float3 Normal NORMAL    y    struct vocoma 1  float4 Hposition BOSTON   float4 TexCoordO0 EXCOORDO   float4 TexCoordl EXCOORD1   low erem  COLORO     y     vpconn main appdata IN   uniform  uniform  uniform  uniform float3 LightVec    vpconn OUT    float3 worldNormal      float ldotn   max  dot  LightVec     UA  Color sy tcl teme    float4 tempPos   tempPos xyz   IN Position xyz                 tempPos w   1 0   OU exCoord0   mul  TexTransform   OU exCoordl   mul TexTransform                 OUT Hposition      rerurn OUT     normalize  mul  WorldIT     worldNormal   0     mul  WorldViewProj     float4x4 WorldViewProj   float4x4 TexTransform   float3x3 WorldIT     IN Normal       0       tempPos    tempPos      tempPos         808 00504 0000 006  NVIDIA    209          Cg Language Toolkit    Pixel Shader Source Code for Shadow Mapping    struct v2f_simple    ao Sion IOSIITIQNE  float4 TexCoord0   TEXCOORDO   float4 TexCoordl   TEXCOORD1   float4 Color0   COLORO       
5.  HRESULT hresult = cgD3D8LoadProgram(vertexProgram, TRUE,
                                        D3DXASM_DEBUG, D3DUSAGE_SOFTWAREVERTEXPROCESSING,
                                        declaration);
    HRESULT hresult = cgD3D8LoadProgram(fragmentProgram, TRUE, 0, 0, 0);

    If you want to apply the same vertex program to several sets of geometric data, each having a different layout, you need to load the program with different vertex declarations in Direct3D 8. To do so, you need to make a duplicate of the program, using cgCopyProgram(), for each of these declarations. Here is a code sample illustrating this operation:

        CGprogram program1, program2;
        program1 = cgCreateProgramFromFile(context, CG_SOURCE,   yerce rosa  es  CE ARO Iba WS i    09
        const DWORD* declaration1 = cgD3D8GetVertexDeclaration(program1);
        cgD3D8LoadProgram(program1, TRUE, 0, 0, declaration1);
        program2 = cgCopyProgram(program1);
        const DWORD declaration2[] = {
            // Custom declaration
        };
        if (cgD3D8ValidateVertexDeclaration(program2, declaration2))
            cgD3D8LoadProgram(program2, TRUE, 0, 0, declaration2);

    Only the loading functions differ between Direct3D 9 and Direct3D 8; the unloading and binding functions are the same.

    To release the Direct3D resources allocated by cgD3D9LoadProgram(), such as the Direct3D shader object and any shadowed parameter, use

        HRESULT cgD3D9UnloadProgram(CGprogram program);

    Note that cgD3D9UnloadProgram() does not free any core runtime resources, such as
6.  - CG_INVALID_PROFILE_ERROR: Returned when the profile is not supported.
    - CG_INVALID_VALUE_TYPE_ERROR: Returned when an unknown value type is assigned to a parameter.
    - CG_NOT_MATRIX_PARAM_ERROR: Returned when the parameter is not of a matrix type.
    - CG_INVALID_ENUMERANT_ERROR: Returned when the enumerant parameter has an invalid value.
    - CG_NOT_4x4_MATRIX_ERROR: Returned when the parameter must be a 4x4 matrix type.
    - CG_FILE_READ_ERROR: Returned when the file cannot be read.
    - CG_FILE_WRITE_ERROR: Returned when the file cannot be written.
    - CG_MEMORY_ALLOC_ERROR: Returned when a memory allocation fails.
    - CG_INVALID_CONTEXT_HANDLE_ERROR: Returned when an invalid context handle is used.
    - CG_INVALID_PROGRAM_HANDLE_ERROR: Returned when an invalid program handle is used.
    - CG_INVALID_PARAM_HANDLE_ERROR: Returned when an invalid parameter handle is used.
    - CG_UNKNOWN_PROFILE_ERROR: Returned when the specified profile is unknown.
    - CG_VAR_ARG_ERROR: Returned when the variable arguments are specified incorrectly.
    - CG_INVALID_DIMENSION_ERROR: Returned when the dimension value is invalid.
    - CG_ARRAY_PARAM_ERROR: Returned when the parameter must be an array.
    - CG_OUT_OF_ARRAY_BOUNDS_ERROR: Returned when the index into an array is out of bounds.

    API-Specific Cg Runtimes

    Each API-specific Cg runtime provides an additional se
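    An application typically turns error codes like those in item 6 into readable messages. The sketch below shows one way to do that in plain C. It is self-contained and does not use the Cg headers: the DemoError enum and demo_error_text are hypothetical stand-ins that mirror only four of the listed codes.

```c
#include <assert.h>

/* Hypothetical local mirror of a few error codes from the list above.
   These are NOT the real CGerror values from the Cg headers. */
typedef enum {
    DEMO_INVALID_PROFILE_ERROR,
    DEMO_FILE_READ_ERROR,
    DEMO_MEMORY_ALLOC_ERROR,
    DEMO_OUT_OF_ARRAY_BOUNDS_ERROR,
    DEMO_ERROR_COUNT
} DemoError;

/* Map an error code to the description given in the list. */
const char *demo_error_text(DemoError e)
{
    static const char *const text[DEMO_ERROR_COUNT] = {
        "The profile is not supported.",
        "The file cannot be read.",
        "A memory allocation fails.",
        "The index into an array is out of bounds."
    };
    assert((int)e >= 0 && (int)e < DEMO_ERROR_COUNT);
    return text[e];
}
```

    A table indexed by the enum keeps each code and its message side by side, so adding an entry means touching one enum line and one string.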
7.  tex1D(sampler1D tex, float2 sz)
        1D nonprojective depth compare
    tex1D(sampler1D tex, float2 sz, float dsdx, float dsdy)
        1D nonprojective depth compare with derivatives
    tex1Dproj(sampler1D tex, float2 sq)
        1D projective
    tex1Dproj(sampler1D tex, float3 szq)
        1D projective depth compare
    tex2D(sampler2D tex, float2 s)
        2D nonprojective
    tex2D(sampler2D tex, float2 s, float2 dsdx, float2 dsdy)
        2D nonprojective with derivatives
    tex2D(sampler2D tex, float3 sz)
        2D nonprojective depth compare
    tex2D(sampler2D tex, float3 sz, float2 dsdx, float2 dsdy)
        2D nonprojective depth compare with derivatives
    tex2Dproj(sampler2D tex, float3 sq)
        2D projective
    tex2Dproj(sampler2D tex, float4 szq)
        2D projective depth compare

    Table 3  Texture Map Functions (continued)

    Function / Description

    texRECT(samplerRECT tex, float2 s)
        2D RECT nonprojective
    texRECT(samplerRECT tex, float2 s, float2 dsdx, float2 dsdy)
        2D RECT nonprojective with derivatives
    texRECT(samplerRECT tex, float3 sz)
        2D RECT nonprojective depth compare
    texRECT(samplerRECT tex, float3 sz, float2 dsdx, float2 dsdy)
        2D RECT nonprojective depth compare with derivatives
    texRECTproj(samplerRECT tex, float3 sq)
        2D RECT projective
    texRECTproj(samplerRECT tex, float3 szq)
        2D RECT pro
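    The "projective" variants in item 7 differ from the nonprojective ones in that the coordinates are divided by the last component (q) before the lookup. The C sketch below illustrates only that divide. Coord2 and project_coord are hypothetical names for illustration, and no texture is actually sampled.

```c
#include <assert.h>

typedef struct { float s, t; } Coord2;

/* Hypothetical illustration of the projective divide: a projective 2D
   lookup with coordinates (s, t, q) addresses the texture at (s/q, t/q). */
Coord2 project_coord(float s, float t, float q)
{
    Coord2 c;
    assert(q != 0.0f);   /* the divide is undefined for q == 0 */
    c.s = s / q;
    c.t = t / q;
    return c;
}
```

    For example, the projective coordinates (2, 4, 2) address the same texel as the nonprojective coordinates (1, 2).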
8.        y        float4 main v2f_simple IN   uniform sampler2D ShadowMap        uniform sampler2D SpotLight    COLOR     float4 shadow   tex2D ShadowMap  IN TexCoord0 xy    float4 spotlight   tex2D SpotLight  IN TexCoordl xy    float4 lighting   IN Color0     return shadow   spotlight   lighting        210 808 00504 0000 006  NVIDIA    Basic Profile Sample Shaders       Shadow Volume Extrusion    Description    This effect uses vertex programs to generate shadow volumes by extruding  geometry along the light vector  Fig  20          Fig  20    Example of Shadow Volume Extrusion       808 00504 0000 006 211  NVIDIA          Cg Language Toolkit    Vertex Shader Source Code for Shadow Volume Extrusion    struct appdata         y     Sa ABONO SON  float3 Normal   NORMAL   Sas ECO lO mC OO RO  Tloat2 Tex   cordi  s LTEXCOORD 0        Steve vocoma      y     float4 Hposition   POSITION   mioara Colorz0O MEME GA EIE  float2 TexCoord0   TEXCOORDO        vpconn main appdata IN     uniform float4x4 WorldViewProj    uniform float4 LightPos      in object space   uniform float4 Fatness    uniform float4 ShadowExtrudeDist    uniform float4 Factors       vpconn OUT        Create normalized vector from vertex to light  float4 light to vert   normalize IN Position   LightPos         N dot L to decide if point should be moved away  ES from the light to extrude the volum  float ndotl   dot  light to vert xyz  IN Normal xyz            Inset the position along      the normal vector direction    
9.          out float4 colorO : COLOR0,
            out float4 texCoordO : TEXCOORD0,
            const uniform float4x4 ModelViewMatrix)
    {
        positionO = mul(position, ModelViewMatrix);
        colorO = color;
        texCoordO = texCoord;
    }

    Fragment Program

    The following Cg code is assumed to be in a file called FragmentProgram.cg.

    void FragmentProgram(
            in float4 color : COLOR0,
            in float4 texCoord : TEXCOORD0,
            out float4 colorO : COLOR0,
            const uniform sampler2D BaseTexture,
            const uniform float4 SomeColor)
    {
        colorO = color * tex2D(BaseTexture, texCoord) + SomeColor;
    }

    Direct3D 9 Application

    The following C code links the previous vertex and fragment programs to the Direct3D 9 application.

    #include <cg/cg.h>
    #include <cg/cgD3D9.h>

    IDirect3DDevice9* device;        // Initialized somewhere else
    IDirect3DTexture9* texture;      // Initialized somewhere else
    D3DXMATRIX matrix;               // Initialized somewhere else
    D3DXCOLOR constantColor;         // Initialized somewhere else
    CGcontext context;
    CGprogram vertexProgram, fragmentProgram;
    IDirect3DVertexDeclaration9* vertexDeclaration;
    IDirect3DVertexShader9* vertexShader;
    IDirect3DPixelShader9* pixelShader;
    CGparameter baseTexture, someColor, modelViewMatrix;

    // Called at application startup
    void OnStartup()
    {
        // Create context
        context = cgCreateContext();
    }

    Introduction to the Cg Runtime Library

    // Called whenever the Direct3D device needs to be create
10. // Fragment program
    struct myvf {
        float4 diffusecolor : COLOR0;
        float4 uv0 : TEXCOORD0;
        float4 uv1 : TEXCOORD1;
    };

    fragout bar(myvf indata) {
        float4 x = indata.uv0;
        ...
    }

    The following binding semantics are available in all Cg vertex profiles for output from vertex programs: POSITION, PSIZE, FOG, COLOR0-COLOR1, and TEXCOORD0-TEXCOORD7.

    All vertex programs must declare and set a vector output that uses the POSITION binding semantic. This value is required for rasterization.

    To ensure interoperability between vertex programs and fragment programs, both must use the same struct for their respective outputs and inputs. For example:

    struct myvert2frag {
        float4 pos : POSITION;
        float4 uv0 : TEXCOORD0;
        float4 uv1 : TEXCOORD1;
    };

    // Vertex program
    myvert2frag vertmain(...) {
        myvert2frag outdata;
        ...
        return outdata;
    }

    // Fragment program
    void fragmain(myvert2frag indata) {
        float4 tcoord = indata.uv0;
        ...
    }

    Introduction to the Cg Language

    Note that values associated with some vertex output semantics are intended for and are used by the rasterizer. These values cannot actually be used in the fragment program, even though they appear in the input struct. For example, the indata.pos value associated with the POSITION fragment semantic may not be read in the fragmain shader.

    Varying Outputs from Fragment Programs

    Binding semantics are always required on the outputs of fr
11. The OpenGL ARB Vertex Program Profile is used to compile Cg source code to vertex programs compatible with version 1.0 of the GL_ARB_vertex_program extension.

    - Profile name: arbvp1
    - How to invoke: Use the compiler option -profile arbvp1.

    This section describes the capabilities and restrictions of Cg when using the arbvp1 profile.

    - The arbvp1 profile is similar to the vp20 profile except for the format of its output and its capability of accessing OpenGL state easily.
    - ARB_vertex_program has the same capabilities as NV_vertex_program and DirectX 8 vertex shaders, so the limitations that this profile places on the Cg source code written by the programmer are the same as those of the NV_vertex_program profile.

    Accessing OpenGL State

    The arbvp1 profile allows Cg programs to refer to the OpenGL state directly, unlike the vp20 profile. However, if you want to write Cg programs that are compatible with the vp20, vp30, and dx8vs profiles, you should use the alternate mechanism of setting uniform variables with the necessary state using the Cg runtime. The compiler relies on the feature of ARB vertex assembly programs that enables parts of the OpenGL state to be written automatically to program parameter registers as the state changes. The OpenGL driver handles this state-tracking feature.

    A special variable semantic called state can be used to refer to every part of the OpenGL state that ARB vertex programs can reference. Following this pa
12. Table 51  ps_1_x Uniform Input Binding Semantics
    Table 52  ps_1_x Varying Input Binding Semantics
    Table 53  ps_1_x Varying Output Binding Semantics
    Table 54  ps_1_x Auxiliary Texture Functions

    Foreword

    We are in the midst of a great transition in computer graphics, both in terms of graphics hardware and in terms of the visual quality and authoring process for games, interactive applications, and animation. Graphics hardware has evolved from "big iron" graphics workstations costing hundreds of thousands of dollars to single-chip graphics processing units (GPUs) whose performance and features have grown to match and now even to exceed traditional workstations. The processing power provided by a modern GPU in a single frame rivals the amount of computation that used to be expended for an offline-rendered animation frame. Indeed, at the launch of GeForce3 on the Apple Macintosh, a convincing version of Pixar's Luxo Jr. was demonstrated running interactively in real time. At the 2001 SIGGRAPH conference, an interactive version of a more recent film, Square Studios' Final Fantasy, was shown running in real time, again on a GeForce3.

    Although these feats of computation are astounding, there is much more to come. Today's GPUs evolve very quickly. Typically, a product generation is only six months long, and with
13.     // Application code that is traced
        cgD3D9EnableDebugTracing(CG_FALSE);

    Note that each debug trace output sets an error equal to cgD3D9DebugTrace. So, if an error callback has been registered with the core runtime using cgSetErrorCallback(), each debug trace output triggers a call to this error callback (see "Using Error Callbacks" on page 116).

    Direct3D Error Reporting

    Error reporting in Cg includes defined error types, functions that allow testing for errors, and support for error callbacks.

    Direct3D Error Types

    The Direct3D runtime generates errors of type CGerror, reported by the Cg core runtime, and of type HRESULT, reported by the Direct3D runtime. In addition, it returns the errors listed in the next two groups that are specific to the Direct3D Cg runtime.

    - CGerror
        - cgD3D9Failed: Set when a Direct3D runtime function makes a Direct3D call that returns an error.
        - cgD3D9DebugTrace: Set when a debug message is output to the debug console when using the debug DLL (see "Direct3D Debugging Mode" on page 112).
    - HRESULT
        - CGD3D9ERR_INVALIDPARAM: Returned when a parameter value cannot be set.
        - CGD3D9ERR_INVALIDPROFILE: Returned when a program with an unexpected profile is passed to a function.
        - CGD3D9ERR_INVALIDSAMPLERSTATE: Returned when a parameter of type D3DTEXTURESTAGESTATETYPE, which is not a valid sampler state, is passed to a sampler state function.
14. Input point size; Generic Attribute 6

    BLENDINDICES, ATTR7                 Generic Attribute 7
    TEXCOORD0-TEXCOORD7, ATTR8-ATTR15   Input texture coordinates (texcoord0-texcoord7); Generic Attributes 8-15
    TANGENT, ATTR14                     Generic Attribute 14
    BINORMAL, ATTR15                    Generic Attribute 15

    The valid binding semantics for varying output parameters in the vp20 profile are summarized in Table 31.

    These binding semantics map to NV_vertex_program output registers. The two sets act as aliases to each other.

    Table 31  vp20 Varying Output Binding Semantics

    Binding Semantics Name              Corresponding Data
    POSITION, HPOS                      Output position
    PSIZE, PSIZ                         Output point size
    FOG, FOGC                           Output fog coordinate
    COLOR0, COL0                        Output primary color
    COLOR1, COL1                        Output secondary color
    BCOL0                               Output backface primary color
    BCOL1                               Output backface secondary color
    TEXCOORD0-TEXCOORD3, TEX0-TEX3      Output texture coordinates

    The profile also allows WPOS to be present as binding semantics on a member of a structure of a varying output data structure, provided the member with this binding semantics is not referenced. This allows Cg programs to have the same structure specify the varying output of a vp20 profile prog
15. Structure methods are called using the "." notation: given an object of type Foo, its valueTimesTwo() method is called by writing the object's name followed by .valueTimesTwo().

    Interfaces

    Interfaces may be declared in order to define a set of methods that a structure must provide in order to implement that interface.

    Programs and functions can take interfaces as parameters, where the specific structure types being passed to them may be resolved at runtime. Depending on hardware limitations, some profiles may require that the concrete types associated with a particular usage of interfaces be resolved by the runtime before the program can execute.

    Interfaces are specified with the interface keyword:

        interface Light {
            float3 illuminate(float3 position);
        };

    Appendix A Cg Language Specification

    A structure indicates that it implements a particular interface with a colon and the name of the interface:

        struct lusum 3 heim 1  floes illunmatre  lots positiam  d sss    y

    A structure may only implement a single interface, and inheritance between structures is not supported.

    Types

    Cg's types are as follows:

    - The int type is preferably 32-bit two's complement. Profiles may optionally treat int as float.
    - The float type is as close as possible to the IEEE single-precision (32-bit) floating point. Profiles must support the float data type.
    - The half type is lower-precision IEEE-like floating point. Profiles must su
16. Varying input binding semantics in the fp20 profile consist of COLOR0, COLOR1, TEXCOORD0, TEXCOORD1, TEXCOORD2, and TEXCOORD3. These map to output registers in vertex shaders.

    The valid binding semantics for varying input parameters in the fp20 profile are summarized in Table 36.

    Table 36  fp20 Varying Input Binding Semantics

    Binding Semantics Name              Corresponding Data
    COLOR, COLOR0, COL, COL0            Input color value v0
    COLOR1, COL1                        Input color value v1
    TEXCOORD0-TEXCOORD3, TEX0-TEX3      Input texture coordinates t0-t3
    FOGP, FOG                           Input fog color and factor

    Additionally, the fp20 profile allows POSITION, PSIZE, TEXCOORD4, TEXCOORD5, TEXCOORD6, and TEXCOORD7 to be specified on varying inputs, provided these inputs are not referenced. This allows Cg programs to have the same structure specify the varying output of a vp20 profile program and the varying input of an fp20 profile program.

    The valid binding semantics for varying output parameters in the fp20 profile are summarized in Table 37.

    Table 37  fp20 Varying Output Binding Semantics

    Binding Semantics Name              Corresponding Data
    COLOR, COLOR0, COL, COL0            Output color (float4)
    DEPR, DEPTH                         Output depth (float)

    The output depth value is special in that it may only be assigned a value of the form

        float4 t = <texture shader operation>;
        float z = dot(texCoord<n>, t
17.     float4 HPosition    POSITION   loss Orosielca 8 EE COORD  Hoat sT mos licioa     MECO RD   float3 Normal ITESO ORD  float3 TexCoord0   TEXCOORDO   Tigers Colon  ME OO RO   float3 LightPos EXC O ORD A  float3 ViewerPos   TEXCOORD5     y                 void calcLighting out float diffuse  out float specular     Eie     1    itllexeuES  mormeall  Blogs fracios  Elosies EAE SS  float3 eyePos  float specularExp        loat3 light   lightPos   fragPos   loat len   length  light      ligne   light   lemy    11  a    i     loat3 eye   normalize eyePos   fragPos    loat3 halfVec   normalize eyePos   light      loat aAttemmciiom   1     3   lenj          1    loat4 lighting   lit dot light  normal    dot  halfVec  normal   specularExp         diffuse   lighting y   attenuation   specular   lighting z   attenuation     float4 main vert2frag IN     uniform float4  LightPos   uniform sampler3D noise map   uniform sampler2D nv map   uniform samplerCUBE cube map   uniform float4  interpolate    EEG USES       float diffuse  specular     float3 biVariate    float3 IN OPosition x IN OPosition z        808 00504 0000 006    163  NVIDIA          Cg Language Toolkit    JEN c  G8 oYSjaLie ab oda o sz INAOBO sion  0p  float3 uniVariate   float3 IN OPosition x IN OPosition z   0  0      float3 normal   normalize IN Normal    float3 noiseTex   float3  IN OPosition x IN OPosition z  6   EN MOL So ZONE  float3 noiseSum   tex3D noise_map  biVariate 3  rgb 12    tex3D noise map  noiseTex   rgb 18  
18. -> M n m  Componentwise
    M n m    M n m    -> M n m  Componentwise
    M n m    M n m    -> M n m  Componentwise

    Operators

    Boolean: &&  ||  !

    Boolean operators may be applied to bool or packed bool vectors, in which case they are applied in elementwise fashion to produce a result vector of the same size. Each operand must be a bool vector of the same size.

    Both sides of && and || are always evaluated; there is no short-circuiting as there is in C.

    Comparisons: <  >  <=  >=  !=  ==

    Comparison operators may be applied to numeric vectors. Both operands must be vectors of the same size. The comparison operation is performed in elementwise fashion to produce a bool vector of the same size.

    Comparison operators may also be applied to bool vectors. For the purpose of relational comparisons, true is treated as one and false is treated as zero. The comparison operation is performed in elementwise fashion to produce a bool vector of the same size.

    Comparison operators may also be applied to numeric or bool scalars.

    Arithmetic: +  -  *  /  %  ++  --  unary -  unary +

    The arithmetic operator % is the remainder operator, as in C. It may only be applied to two operands of cint or int type.

    When / or % is used with cint or int operands, C rules for integer / and % apply.

    The C operators that combine assignment with arithmetic operat
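    The elementwise rules in item 18 can be emulated in plain C to make them concrete. This is a sketch only: Float3, Bool3, less_than3, and and3 are hypothetical names, and C functions stand in for Cg's built-in vector operators. Because C evaluates both function arguments before the call, and3 also mirrors the "no short-circuiting" rule.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { float x, y, z; } Float3;
typedef struct { bool x, y, z; } Bool3;

/* Elementwise a < b: one bool per component. */
Bool3 less_than3(Float3 a, Float3 b)
{
    Bool3 r = { a.x < b.x, a.y < b.y, a.z < b.z };
    return r;
}

/* Elementwise AND. Both operands are fully computed before the call,
   so nothing is skipped the way C's scalar && would skip its right side. */
Bool3 and3(Bool3 a, Bool3 b)
{
    Bool3 r = { a.x && b.x, a.y && b.y, a.z && b.z };
    return r;
}

/* Small self-check of the behaviour described above. */
void check_elementwise(void)
{
    Float3 a = { 1.0f, 5.0f, 3.0f };
    Float3 b = { 2.0f, 4.0f, 3.0f };
    Bool3 lt = less_than3(a, b);               /* (true, false, false) */
    Bool3 both = and3(lt, less_than3(b, a));   /* (false, false, false) */
    assert(lt.x && !lt.y && !lt.z);
    assert(!both.x && !both.y && !both.z);
}
```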
19.    1  id  i  i be    lic colos  floata sheenColor     1  i  i  1 e    sheen Color  float4 skinColor   tex2D  texl  In texcoords      float3 g      Woe  035 M0 Pe   float3 albedo   Us  Uso  d       oiliness mask  float4 oiliness   0 9   tex2D  tex2  In texcoords             Get eye spac ye vector   float3 v   normalize   In eyeSpacePosition             Get eye space light and halfangle vectors   float3 1   normalize  eyeSpaceLightPosition    In eyeSpacePosition     leer S la   seXonsvellistexe o w sr db E          Get tangent space normal vector from normal map    float3 tangentSpaceNormal   tex2D tex0  In texcoords   rgb   float3 bumpscale     bscale  bscale  1 0     tangentSpaceNormal   tangentSpaceNormal   bumpscale                 Transform it into eye space                    Floats qms   n 0    dot  In tangentToEyeMat0 xyz  tangentSpaceNormal     n 1    dot  In tangentToEyeMatl  tangentSpaceNormal     n 2    dot  In tangentToEyeMat2  tangentSpaceNormal       n   normalize  n          Compute the lighting equation   fleet m  lotril   nesl elo  im  1   0 je fi elsu 0 to 1  float meloicin max  dot n h   0       eis 0 to 1  loewe deg  lora   actor  gt   0p       Compute oil  sheen  subsurf scattering contributions   ilo a Calls  float4 sheen        178    808 00504 0000 006  NVIDIA    Advanced Profile Sample Shaders    Plot silos wei y  itle ike  IA  itle It  ez   loata  ar  127  TILOAES IR  IR       Compute fresnel at sheen layer  ramp it up a bit   Kr   fresnel   v  n  eta
20. Binding Semantics Name              Corresponding Data
    COLOR, COLOR0, COL, COL0            Output color (float4)
    DEPTH, DEPR                         Output depth (float)

    The output depth value is special in that it may only be assigned a value in the ps_1_3 profile, and must be of the form

        float4 t = <texture addressing operation>;
        float z = dot(texCoord<n>, t.xyz);
        float w = dot(texCoord<n+1>, t.xyz);
        depth = z / w;

    Appendix B Language Profiles

    Auxiliary Texture Functions

    Because the capabilities of the texture addressing instructions are limited in DirectX pixel shader 1_X, a set of auxiliary functions is provided in these profiles that express the functionality of the more complex texture addressing instructions. These functions are provided merely as a convenience for writing ps_1_x Cg programs. The same result can be achieved by writing the expanded form of each function directly. The expanded form has the added advantage of being supported on other profiles.

    These functions are summarized in Table 54.

    Table 54  ps_1_x Auxiliary Texture Functions

    Texture Function / Description

    offsettex2D(uniform sampler2D tex, float2 st, float4 prevlookup, uniform float4 m)
        Performs the following:
            float2 newst = st + m.xy * prevlookup.xx + m.zw * prevlookup.yy;
            return tex2D(tex, newst);
        where
            st are texture coordinates associated with sampler tex,
            prevlookup is the result of a previous texture o
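    The coordinate arithmetic inside offsettex2D in item 20 can be written out component by component in C. This is a sketch under an assumption: the operators in the extracted text were lost, so the form newst = st + m.xy * prevlookup.xx + m.zw * prevlookup.yy is a reading of the snippet. Float2, Float4, and offset_coords are hypothetical names, and the final tex2D lookup is omitted.

```c
#include <assert.h>

typedef struct { float x, y; } Float2;
typedef struct { float x, y, z, w; } Float4;

/* Assumed expansion of the offsettex2D coordinate computation:
   newst = st + m.xy * prevlookup.xx + m.zw * prevlookup.yy */
Float2 offset_coords(Float2 st, Float4 prevlookup, Float4 m)
{
    Float2 newst;
    newst.x = st.x + m.x * prevlookup.x + m.z * prevlookup.y;
    newst.y = st.y + m.y * prevlookup.x + m.w * prevlookup.y;
    return newst;
}
```

    Written this way, m acts as a 2x2 matrix applied to the first two channels of the previous lookup, and the result perturbs the original coordinates.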
21. 3. Use the Cg Standard Library

    The functions in the Cg Standard Library have been carefully written for both efficiency and correctness. By using Standard Library functions when appropriate, you can automatically take advantage of the work that went into making sure they compile to fast code on GPUs while you concentrate on the hard problems you're solving in your own shaders.

    Particularly fast Standard Library functions include dot(), which computes the dot product of two vectors; abs(), which computes the absolute value of a variable; saturate(), which clamps a value to be between zero and one; and min() and max(), which return the minimum and maximum of a pair of values. You won't be able to write more efficient implementations of these functions than the Standard Library provides because many of them compile directly to GPU assembly language instructions. Writing a dot product function of your own,

        float mydot(float3 a, float3 b)
        {
            return a.x*b.x + a.y*b.y + a.z*b.z;
        }

    compiles to a handful of instructions, while the built-in dot() function compiles to a single specialized dot product instruction. There's no way to get to this instruction other than by using the Standard Library.

    Two functions deserve particular attention. The abs() function usually has no cost in either vertex or fragment programs because the GPU can evaluate the function while executing other instructions. Similarly, the saturat
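    For readers who want to check the arithmetic of dot() and saturate() from item 21 on the CPU, here is a small C reference sketch. The Float3 type and the function names dot3 and saturate_ref are hypothetical; the code only reproduces the math described in the text and says nothing about how a GPU compiles it.

```c
#include <assert.h>

typedef struct { float x, y, z; } Float3;

/* Reference dot product of two 3-component vectors. */
float dot3(Float3 a, Float3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Reference saturate: clamp a value to the range [0, 1]. */
float saturate_ref(float v)
{
    float r = v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
    assert(r >= 0.0f && r <= 1.0f);
    return r;
}
```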
22. Returns x otherwise.

    sign(x)
        1 if x > 0; -1 if x < 0; 0 otherwise.
    sin(x)
        Sine of x.
    sincos(float x, out s, out c)
        s is set to the sine of x, and c is set to the cosine of x. If sin(x) and cos(x) are both needed, this function is more efficient than calculating each individually.
    sinh(x)
        Hyperbolic sine of x.
    smoothstep(min, max, x)
        For values of x between min and max, returns a smoothly varying value that ranges from 0 at x = min to 1 at x = max. x is clamped to the range [min, max] and then the interpolation formula is evaluated:
            -2*((x - min)/(max - min))^3 + 3*((x - min)/(max - min))^2
    step(a, x)
        0 if x < a; 1 if x >= a.
    sqrt(x)
        Square root of x; x must be greater than zero.
    tan(x)
        Tangent of x.
    tanh(x)
        Hyperbolic tangent of x.
    transpose(M)
        Matrix transpose of matrix M. If M is an AxB matrix, the transpose of M is a BxA matrix whose first column is the first row of M, whose second column is the second row of M, whose third column is the third row of M, and so on.

    Geometric Functions

    Table 2, "Geometric Functions," presents the geometric functions that are provided in the Cg Standard Library.

    Table 2  Geometric Functions

    Function / Description

    distance(pt1, pt2)
        Euclidean distance between points pt1 and pt2.
    faceforward(N, I, Ng)
        N if dot
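    The definitions of sign, step, and smoothstep in item 22 translate directly into scalar C. The sketch below is a CPU reference for the formulas as stated in the table, with hypothetical names (sign_ref, step_ref, smoothstep_ref) chosen to avoid suggesting these are the library's own implementations.

```c
#include <assert.h>

/* sign(x): 1 if x > 0, -1 if x < 0, 0 otherwise. */
float sign_ref(float x)
{
    return (x > 0.0f) ? 1.0f : ((x < 0.0f) ? -1.0f : 0.0f);
}

/* step(a, x): 0 if x < a, 1 if x >= a. */
float step_ref(float a, float x)
{
    return (x < a) ? 0.0f : 1.0f;
}

/* smoothstep(min, max, x): clamp x to [min, max], then evaluate
   -2*t^3 + 3*t^2 with t = (x - min) / (max - min). */
float smoothstep_ref(float min, float max, float x)
{
    float t;
    assert(max > min);               /* avoid dividing by zero */
    if (x < min) x = min;
    if (x > max) x = max;
    t = (x - min) / (max - min);
    return -2.0f * t * t * t + 3.0f * t * t;
}
```

    At the midpoint of the range the formula gives -2(1/8) + 3(1/4) = 1/2, and at the two ends it gives exactly 0 and 1, which is the "0 at x = min, 1 at x = max" behavior the table describes.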
23.    User's Manual

A Developer's Guide to Programmable Graphics

Release 1.4
September 2005

Cg Language Toolkit

ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, "MATERIALS") ARE BEING PROVIDED "AS IS." NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE.

Information furnished is believed to be accurate and reliable. However, NVIDIA Corporation assumes no responsibility for the consequences of use of such information or for any infringement of patents or other rights of third parties that may result from its use. No license is granted by implication or otherwise under any patent or patent rights of NVIDIA Corporation. Specifications mentioned in this publication are subject to change without notice. This publication supersedes and replaces all information previously supplied. NVIDIA Corporation products are not authorized for use as critical components in life support devices or systems without express written approval of NVIDIA Corporation.

Trademarks

NVIDIA and the NVIDIA logo are trademarks or registered trademarks of NVIDIA Corporation in the United States and other countries.

Microsoft, Windows, the Windows logo, and DirectX are registered trademarks of Microsoft Corporation.
24.    based Cg profiles there is no such implied mapping.

Binding semantics may be specified directly on program parameters rather than on struct elements. Thus, the following vertex program definition is legal:

    outdata foo(float3 myPosition : POSITION,
                float3 myNormal : NORMAL,
                float3 myTangent : TANGENT,
                float refractive_index : TEXCOORD3)
    {
        /* ... */
        // Within the program, the parameters are referred to by
        // their variable names: "myPosition", "myNormal",
        // "myTangent", and "refractive_index".
        /* ... */
    }

Varying Outputs to and from Vertex Programs

The outputs of a vertex program pass through the rasterizer and are made available to a fragment program as varying inputs. For a vertex program and fragment program to interoperate, they must agree on the data being passed between them.

As it does with the data flow between the application and vertex program, Cg uses binding semantics to specify the data flow between the vertex program and fragment program.

This example shows the use of binding semantics for vertex program output:

    // Vertex program
    struct myvf {
        float4 pout : POSITION;    // Used for rasterization
        float4 diffusecolor : COLOR0;
        float4 uv0 : TEXCOORD0;
        float4 uv1 : TEXCOORD1;
    };

    myvf foo(/* ... */)
    {
        myvf outstuff;
        /* ... */
        return outstuff;
    }

And this example shows how to use this same data as the input to a fragment program
25.    float4 main  MyInterface foo    COLOR    sica  tor Well  a 5  ES  p       Listing 3  Cg Program 3    Notice that both Cg Program 1 and Cg Program 2 define the val    method  of the MyInterface and MyStruct types using the float type  whereas Cg  Program 3 does so using the half type  As a result  the MyInterface and  MyStruct types defined in Cg Program Three are not equivalent to types in  the other two programs  even though the types have the same names     The following C program creates all three of the above Cg programs and  connects shared parameter instances to their input parameters   static CGprogram CreateProgram const char  program_str     return cgCreateProgram Context  CG_SOURCE   program str  CG PROFILE ARBFP1   Muela  Nfl  P             ame mein  Late euge  che exe   if  CGContext Context   CGprogram Programl  Program2  Program3   CGparameter msl  ms3      Disable automatic compilation  since the     programs cannot be compiled until concrete structs     are connected to each program s interface parameters              62    808 00504 0000 006  NVIDIA    Introduction to the Cg Runtime Library       Context   cgCreateContext      cgSetAutoCompile  Context  CG_COMPILE MANUAL       Create the programs       Programl   CreateProgram ProgramlString    Program2   CreateProgram Program2String     Program3   CreateProgram Program3String          Create two shared parameters      one of the MyStruct type from Programl  and     one of the MyStruct type from Program3  
26.    msl   cgCreateParameter  cgGetNamedUserType  Programl   iN Sie EUGEN     g    ms3   cgCreateParameter  cgGetNamedUserType  Program3   UMV eee        p       Connect the same shared parameter to Programl and                Progran2 wy  cgConnectParameter  Fool  cgGetNamedParameter  Programl        MESONI J  cgConnectParameter  Fool  cgGetNamedParameter  Program2   KECOH       The following would generate an error because the type     of the Fool parameter is not equivalent to type   J  MS ciacer Erom Pieegieena c      cgConnectParameter  ms1    Hit cgGetNamedParameter  Program3   foo        cgConnectParameter  ms3  cgGetNamedParameter  Program3   UE xg            Now we can compile all three programs   cgCompileProgram Programl    cgCompileProgram Program2    cgCompileProgram Program3     We a   a BO GM us             808 00504 0000 006 63  NVIDIA          Cg Language Toolkit    Parameter Properties    Parameter properties encompass validity  references  size  and other  attributes     Parameter Type    The Cg language defines a number of built in parameter types  such as  float4  int3x3  and so on  In addition  user defined types may be specified  in a program when declaring structure and interface types  For example  if  the following Cg code is included in the source to a CGprogram created via  cgCreateProgram     the types MyInterface and MyStruct will be added to  the resulting CGprogram        interface MyInterface    float SomeMethod  float x    y   struct MyStruct  
27.    newResult  xyz   tlostatl 0 00     normal   normalize normal         calculate diffuse lighting off the normal   Ue that was just calculated   floats liceos   Elec  0  9 19  5   float3 lightVec   normalize lightPos   position    float diffuselnten   dot  lightVec  normal         e wp the final color     The first term is a semi random term based  Ve on the total height of this straw     The second term is the diffuse lighting component  UT Coloro   nongmalize  ars    chris lnea    NADO Sion       return OUT        204    808 00504 0000 006  NVIDIA    Basic Profile Sample Shaders       Refraction    Description    This effect performs custom texture coordinate generation to compute a  refracted vector per vertex that is then used to look up in a cube map  Fresnel  is also calculated to blend between reflection and refraction  Fig  18          Fig  18    Example of Refraction       808 00504 0000 006 205  NVIDIA          Cg Language Toolkit    Vertex Shader Source Code for Refraction    SUSE 3UmpexUR  e      float4 Position PO SIERO NIS  float4 Normal   NORMAL     y     SEXucl DULPUES               float4 hPosition B IONS IL ILO P  float4 fresnelTerm   COLORO   float4 refractVec B LE COORD OT   float4 reflectVec EAS O OE Daa           y        fresnel approximation  fixed rast ares  mle n  o at NNI   float3 fresnelValues              fixed power   fresnelValues x    fixed scale   fresnelValues y    fixed bias   fresnelValues z    ieirwica dias   jocww il 0     cdta  1   Texo
28.    variables.

Binding Semantics for Uniform Data

The valid binding semantics for uniform parameters in the vp20 profile are summarized in Table 29.

Table 29. vp20 Uniform Input Binding Semantics

Binding Semantics Name                   Corresponding Data
register(c0)-register(c95), C0-C95       Constant register [0..95]. The aliases c0-c95 (lowercase) are also accepted. If used with a variable that requires more than one constant register (for example, a matrix), the semantic specifies the first register that is used.

Binding Semantics for Varying Input/Output Data

The valid binding semantics for varying input parameters in the vp20 profile are summarized in Table 30.

One can also use TANGENT and BINORMAL instead of TEXCOORD6 and TEXCOORD7. A second set of binding semantics, ATTR0-ATTR15, can also be used. The two sets act as aliases to each other.

Table 30. vp20 Varying Input Binding Semantics

Binding Semantics Name           Corresponding Data
POSITION, ATTR0                  Input Vertex, Generic Attribute 0
BLENDWEIGHT, ATTR1               Input vertex weight, Generic Attribute 1
NORMAL, ATTR2                    Input normal, Generic Attribute 2
COLOR0, DIFFUSE, ATTR3           Input primary color, Generic Attribute 3
COLOR1, SPECULAR, ATTR4          Input secondary color, Generic Attribute 4
TESSFACTOR, FOGCOORD, ATTR5      Input fog coordinate, Generic Attribute 5
PSIZE, ATTR6
29.   0 5    lightVectorInTangentSpace   0 5                    compute view vector  float3 viewVector    normalize  EyePosition xyz   IN Position xyz            compute half angle vector  float3 halfAngleVector    normalize  LightVector xyz   viewVector         transform half angle vector from     object space to tangent space  OUT HalfAngleVector xyz     mul  objToTangentSpace  halfAngleVector          transform position to projection space  OUT Position   mul  WorldViewProj  IN Position      return OUT     Pixel Shader Source Code for Bump Dot3x2    EErEE wi       y     float4 Position   POSITION    in projection space  float4 Normal   COLORO    in tangent space   float4 LightVectorUnsigned   COLOR1    in tangent space  float3 TexCoord0   TEXCOORDO    float3 TexCoordl   TEXCOORD1    float4 LightVector   TEXCOORD2    in tangent space  float4 HalfAngleVector   TEXCOORD3    in tangent space                               Elec meat  2 JON     uniform sampler2D DiffuseMap        194    808 00504 0000 006  NVIDIA    Basic Profile Sample Shaders    uniform sampler2D NormalMap   uniform sampler2D IlluminationMap   uniform float Ambient    COLOR    Ii kerteh base color  float4 color   tex2D DiffuseMap  IN TexCoord0 xy         fetch bump normal and expand it to   1 1   float4 bumpNormal   2     tex2D  NormalMap  IN TexCoordl xy    0 5         compute the dot product between    Ue the bump normal and the light vector       compute the dot product between   v the bump normal and the half a
30.   5         set the two color coefficients  the magic constants     are arbitrary   these two color coefficients are used          808 00504 0000 006 159  NVIDIA          Cg Language Toolkit       to calculate the contribution from each of the two     environment cubemaps  one bright  one dark     OUT Color0    fres 1 4   min reflected y 0   xxxx    Elige 4 2 oS oS E  OUT Colorl    fres 1 26   xxxx        return OUT     Pixel Shader Source Code for Improved Water                float4 main in float3 colorO0 INCOLORO  iia flogs colori  amp  COLOR    in float3 reflectVec EIE A9  9 1   in tloadks  rer llechViccDark S EEREXGOORDST  uniform samplerCUBE environmentMaps  2   Y  2 COLOR       float3 reflectColor   texCUBE environmentMaps  0            reflectVec   rgb   float3 reflectColorDark   texCUBE  environmentMaps 1    reflectVecDark   rgb           floats color    retlectColor   color0      refliectColorDark   colorl    return floatti color  1 0         160 808 00504 0000 006  NVIDIA    Advanced Profile Sample Shaders       Melting Paint    Description    This shader uses an environment map with procedurally modified texture  lookups to create a melting effect on the surface texture  the NVIDIA logo in  this example   The reflection vector is shifted using a noise function  giving  the appearance of a bumpy surface  The surface texture   s texture coordinates  are shifted in a time dependent manner  also based on a noise texture        Fig  7  Example of Melting Paint    Verte
31.   structure that is a uniform parameter to the program. This requirement also applies when the array is indirectly a uniform program parameter (that is, it and/or the structure containing it has been passed via a chain of in function parameters). There are two operations that must be supported:

- Rvalue subscripting by a run-time computed value or a compile-time value.
- Passing the entire array as a parameter to a function, where the corresponding formal function parameter is declared as in.

The following operations are explicitly not required to be supported:

- Lvalue subscripting
- Copying
- Other operators, including multiply, add, compare, and so on

Note that when the array is rvalue subscripted, the result is an expression, and this expression is no longer considered to be a uniform program parameter. Therefore, if this expression is an array, its subsequent use must conform to the standard rules for array usage.

These rules are not limited to arrays of numeric types, and thus imply support for arrays of struct, arrays of matrices, and arrays of vectors when the array is a uniform program parameter. Maximum array sizes may be limited by the number of available registers or other resource limits, and compilers are permitted to issue error messages in these cases. However, profiles must support sizes of at least float arr[8], float4 arr[8], and float4x4 arr[4][4].

Fragment profile
32.   (Ng, I) < 0; otherwise, -N.

length(v)             Euclidean length of a vector.
normalize(v)          Returns a vector of length 1 that points in the same direction as vector v.
reflect(i, n)         Computes reflection vector from entering ray direction i and surface normal n. Only valid for 3-component vectors.
refract(i, n, eta)    Given entering ray direction i, surface normal n, and relative index of refraction eta, computes refraction vector. If the angle between i and n is too large for a given eta, returns (0, 0, 0). Only valid for 3-component vectors.

Texture Map Functions

Table 3, "Texture Map Functions," presents the texture functions that are provided in the Cg Standard Library. These texture functions are fully supported by the ps_2, arbfp1, fp30, and fp40 profiles. The two-dimensional variants of these functions are supported by the vp40 profile. All of the functions in the table return a float4 value.

Because of the limited pixel programmability of older hardware, the ps_1 and fp20 profiles use a different set of texture mapping functions. See "Language Profiles" on page 255 for more information.

Table 3. Texture Map Functions

Function                                                   Description
tex1D(sampler1D tex, float s)                              1D nonprojective
tex1D(sampler1D tex, float s, float dsdx, float dsdy)      1D nonprojective with derivatives
33.   O G A O OU w U U 5                                  y              Ensure the resulting declaration is compatible with the     shader  This is really just a sanity check        808 00504 0000 006 107  NVIDIA          Cg Language Toolkit    assert  cgD3D9ValidateVertexDeclaration vertexProgram   declaration          device  gt CreateVertexDeclaration    declaration   amp vertexDeclaration       Load the program with th xpanded interfac     Parameter shadowing is enabled  second parameter   TRUE    cgD3D9LoadProgram vertexProgram  TRUE  0                  Create the pixel shader    fragmentProgram   cgCreateProgramFromFile   context  CG SOURCE   FragmentProgram cg    pixelProfile   FragmentProgram   pixelOptions               Load the program with th xpanded interface  Parameter     shadowing is enabled  second parameter   TRUE   Ignore      vertex shader specifc flags  such as declaration usage   cgD3D9LoadProgram fragmentProgram  TRUE  0                  Grab some parameters              modelViewMatrix   cgGetNamedParameter  vertexProgram    ModelViewMatrix     baseTexture   cgGetNamedParameter  fragmentProgram    BaseTexture     someColor   cgGetNamedParameter  fragmentProgram   Some oou   e          Sanity check that parameters have th xpected siz  assert  cgD3D9TypeToSize  cgGetParameterType    modelViewMatrix      16    assert  cgD3D9TypeToSize  cgGetParameterType  someColor        SE          Set parameters that don t change  They can be set     only once since parame
34.   Program Iteration

The programs within a context are sequentially ordered and can be iterated over by using cgGetFirstProgram() and cgGetNextProgram():

    CGprogram cgGetFirstProgram(CGcontext context);
    CGprogram cgGetNextProgram(CGprogram program);

The first program of the sequence is retrieved by cgGetFirstProgram(). If the context is invalid or does not contain any program, the function returns zero. Given a program, cgGetNextProgram() returns the program immediately next in the sequence, or zero if there is none. Here is how those two functions would typically be used given a valid context named context:

    CGprogram program = cgGetFirstProgram(context);
    while (program != 0)
    {
        /* Here is the code that handles the program */
        program = cgGetNextProgram(program);
    }

Nothing is guaranteed regarding the order of the programs in the sequence or how cgGetFirstProgram() and cgGetNextProgram() behave when programs are created or destroyed during iteration.

Program Query

Program queries encompass validity, compilation results, and attributes.

Program Validity

Use cgIsProgram() to check whether a program handle references a valid program:

    CGbool cgIsProgram(CGprogram program);

Compilation Result

You can query the result of the compilation resulting from the last call to cgCreateProgram() for a given context by using cgGetLastListing():

    const char *cgGetLastListing(CGcontext context);
35.   R T      Kee   sinooclasicee  0 0  0 5  me   g   ite    d40   Xx          Compute the refracted light ray and the refraction     coefficient    Ke2   ame  SL  mn  ical  1X2  IZ Mp   122 Ssmooehsiten 00  Oh oy NN   ua   1 0     IXe2p             For oil contribution  modulate the oiliness mask by a     specular term   Oil   0 5   olliness   poy  acloitla  m p       For sheen contribution  modulate Fresnel term by      sheen color times specular  Modulate by additional      diffuse term to soften it a bit    sheen   2 5 Kr sheenColor   ndot1   0 2   pow  ndoth  m           Compute single scattering approximation to subsurface     scattering  Here we compute 3 scattering terms     simultaneously and the results end up in the x y z     components of a float3  Using 3 terms approximates     distribution of multiply scattered light  For     details see  Matt Pharr   s SIGGRAPH 2001 RenderMan     course notes    Layered Media for Surface Shaders      float3 temp   singleScatter  T2  T  n  g  albedo   thickness      uloste   2 5   sikamCollor   mdgtl   Eug   EZ  gt     temp  x temp y temp z                        Add contributions from oil  sheen  and subsurface      scattering and modulate by light color and result      of a shadow map lookup    return lightColor tex2Dproj  tex3  In shadowcoords   r     oil   sheen   subsurf         808 00504 0000 006 179  NVIDIA          Cg Language Toolkit       Thin Film Effect    Description    This demo shows a thin film interference effect  
36.   This book is intended as an introduction to Cg, as well as a practical handbook to get programmers started developing in Cg. It includes a language description, a reference for the standard and run-time libraries, and is full of helpful examples. The goal for this book is to be both an introduction and a tool for the new user, as well as a reference and resource for developers as they become more proficient.

Welcome to the world of Cg.

David Kirk
Chief Scientist
NVIDIA Corporation

Preface

The goal of this book is to introduce you to Cg, a new high-level language for graphics programming. To that end, we have organized this document into the following sections:

- "Introduction to the Cg Language" on page 1
  A quick introduction to the current release of Cg, with everything you need to know to start working with it.

- "Cg Standard Library Functions" on page 33
  A list of the Standard Library functions, which can help to reduce your program development time.

- "Introduction to the Cg Runtime Library" on page 43
  An introduction to the Cg runtime APIs, which allow you to easily compile Cg programs and pass data to them from within applications.

- "Introduction to CgFX" on page 117
  The CgFX API, which supports this Cg extended file format, is described.

- "A Brief Tutorial" on page 145
  A description of a simple Cg program and Microsoft Vi
37.   This moves the shadow volume points      inside the model slightly to minimize      popping of shadowed areas as      each facet comes in and out of shadow       The Fatness value should be negative   float4 inset pos    IN Normal   Fatness xyz    IN   POSANE I OI SAV  VBP   inset_pos w   IN Position w        scale the vector from light to vertex       212    808 00504 0000 006  NVIDIA    Basic Profile Sample Shaders       float4 extrusion_vec   light_to_vert   ShadowExtrudeDist        if ndotl  lt  0 then the vertex faces   Vay  away from the light  so move it       It will be moved along the direction from   1 d light to vertex to extrude the shadow volume   iPlWoxeE  chew     low   cor Ll  lt  0  5             Move the back facing shadow volume points  float4 new_position   extrusion_vec   away   inset_pos        Transform position to hclip space   OUT Hposition   mul  WorldViewProj  new position         Set the color to blue for when the shadow volume    il is rendered in color for illustrative purposes  float    color   TPloat4 r0  0  BRactors x 0      OUT Color0   color   OUT TexCoord0 xy   IN TexCoord0   return OUT           808 00504 0000 006 213  NVIDIA          Cg Language Toolkit       Sine Wave Demo    Description    This effect modifies the vertex positions using a sine function based on the  current time  It demonstrates use of the built in sin    function  It also  computes a normal based on the perturbed mesh  and uses this to compute a  reflection vector to
38.   and less than the value of  GL_MAX CLIP_PLANES  ColorMask bool4 1 0  ColorMatrix float4x4 ARB imaging  ColorMaterial int2 Front  Back  1 0  FrontAndBack   Emission  Ambient   Diffuse  Specular   AmbientAndDiffuse  CullFace int Front  Back  1 0  FrontAndBack  DepthBounds float2 EXT depth bounds test  DepthFunc int Never  Less  1 0  LEqual  Equal   Greater  NotEqual   GEqual  Always  DepthMask bool 1 0  DepthRange float2 1 0  FogMode int Linear  Exp  Exp2  1 0  FogDensity float 1 0  FogStart float 1 0  FogEnd float 1 0  FogColor float4 1 0  FragmentEnvParameter float4 ARB fragment program      ndx                 ndx must be greater than or  equal to zero and less than  the value of   GL MAX PROGRAM ENV  PARAMETERS ARB for the  GL FRAGMENT PROGRAM  ARB target to  glGetProgramivARB          132    NVIDIA    808 00504 0000 006       Introduction to CgFX                                                       Table 6    CgFX OpenGL State Manager States  continued   State Name Type Valid Enumerants Requires  Fragment LocalParameter float4 ARB fragment program    ndx  ndx must be greater or  equal to zero and less than  the value of  GL MAX PROGRAM LOCAL  PARAMETERS ARB for the  GL FRAGMENT PROGRAM ARB  target to  glGetProgramivARB  FogCoordSrc int FragmentDepth  OpenGL 1 4 or  FogCoord EXT fog coord  FogDistanceMode int EyeRadial  NV fog distance  EyePlane   EyePlaneAbsolute  FragmentProgram compile ARB fragment program  statement OrNV fragment program  FrontFace int CW  CCW 1 0  L
39.   coreCg 50  control constructs used 19  core Cg context 50  Core Cg error reporting 71  Core Cg parameter 54  Core Cg program 50  core Cg runtime 49    D  data types  bool 11  fixed 11  float 11  half 11  int 11  sampler 11  supported 11  data types for performance 325  debugging function 41  declaration  Cg definition 224  definition  as used in Cg 224  derivative functions 41  Direct3D Cg runtime 85  cgD3D9EnableDebugTracing   114  cgD3D9GetLastError   115  cgD3D9TranslateHRESULT   116  CGerror 114  debugging mode 112  error callbacks 116  error testing 115  error types 114  expanded interface 98  cgD3D8LoadProgram   103  cgD3D8SetSamplerState   102  cgD3D9BindProgram   105    cgD3D9EnableParameterShadowing      103  cgD3D9GetDevice   98  cgD3D9GetlatestPixelProfile    cgD3D9GetLatestVertexProfile      cgD3D9GetOptimalOptions   105    808 00504 0000 006    cgD3D9IsParameterShadowingEnable  d   103  cgD3D9IsProgramLoaded   104  cgD3D9LoadProgram   103  cgD3D9SetDevice   98  cgD3D9SetSamplerState   102  cgD3D9SetTexture   102  cgD3D9SetTextureWrapMode   102  cgD3D9SetUniform   100  cgD3D9SetUniformArray   101  cgD3D9SetUniformMatrix   101  cgD3D9SetUniformMatrixArray   10  1  cgD3D9UnloadProgam   104  Direct3D 8 application 109  Direct3D 9 application 106  Direct3D device 98  fragment program 106  lost devices 98  parameters 100  array 101  sampler 102  uniform 100  profile support 105  program executiion 103  vertex program 106  HRESULT 114  minimal interface     85  cgD3D8
40.   discard   texl a   col0   sum     scale_by_one_half

Table 32 summarizes how the different NV_texture_shader and NV_register_combiners instruction set modifiers are expressed in Cg programs. For more details on the context in which each modifier is allowed and the ways in which modifiers may be combined, refer to the NV_texture_shader and NV_register_combiners documentation.

Table 32. NV_texture_shader and NV_register_combiners Instruction Set Modifiers

Instruction/Register Modifier              Cg Expression
scale_by_two                               2*x
scale_by_four                              4*x
scale_by_one_half                          x/2
bias_by_negative_one_half                  x-0.5
bias_by_negative_one_half_scale_by_two     2*(x-0.5)
unsigned(reg)                              saturate(x), i.e. min(1, max(0, x))
unsigned_invert(reg)                       1-saturate(x)
half_bias(reg)                             x-0.5
-reg                                       -x
expand(reg)                                2*(x-0.5)

Language Constructs and Support

Data Types

In the fp20 profile, operations occur on signed clamped floating-point values in the range -1 to 1. These profiles allow all data types to be used, but all operations are carried out in the above range. Refer to the NV_texture_shader and NV_register_combiners documentation for more details.

Statements and Operators

The fp20 profile supports all of the Cg language constructs, with the following exceptions:

- Arbitrary swizzles are not supported (though arbitrary write masks are). Only t
41.   modulos, and casts from floating point types.

- fixed or sampler* data types are not supported, but the profile does provide the minimal partial support that is required for these data types by the core language specification; that is, it is legal to declare variables using these types, as long as no operations are performed on the variables.

Statements and Operators

This profile is a superset of the vp20 profile. Any program that compiles for the vp20 profile should also compile for the vp30 profile, although the converse is not true.

The additional capabilities of the vp30 profile beyond those of vp20 are:

- for, while, and do loops are supported without requiring loop unrolling.
- Full support for if/else, allowing non-constant conditional expressions.

Bindings

Binding Semantics for Uniform Data

The valid binding semantics for uniform parameters in the vp30 profile are summarized in Table 23.

Table 23. vp30 Uniform Input Binding Semantics

Binding Semantics Name                     Corresponding Data
register(c0)-register(c255), C0-C255       Constant register [0..255]. The aliases c0-c255 (lowercase) are also accepted. If used with a variable that requires more than one constant register (for example, a matrix), the semantic specifies the first register that is used.

Binding Seman
42.   register(c95), C0-C95    Constant register [0..95]. The aliases c0-c95 (lowercase) are also accepted. If used with a variable that requires more than one constant register (for example, a matrix), the semantic specifies the first register that is used.

Binding Semantics for Varying Input/Output Data

The valid binding semantics for varying input parameters in the vs_1_1 profile are summarized in Table 46. These map to the input registers in DirectX 8.1 vertex shaders.

Table 46. vs_1_1 Varying Input Binding Semantics

Binding Semantics Name     Corresponding Data
POSITION                   Vertex shader input register: v0
BLENDWEIGHT                Vertex shader input register: v1
BLENDINDICES               Vertex shader input register: v2
NORMAL                     Vertex shader input register: v3
PSIZE                      Vertex shader input register: v4
COLOR0, DIFFUSE            Vertex shader input register: v5

Options

Table 46. vs_1_1 Varying Input Binding Semantics (continued)

Binding Semantics Name     Corresponding Data
COLOR1, SPECULAR           Vertex shader input register: v6
TEXCOORD0-TEXCOORD7        Vertex shader input register: v7-v14
TANGENT (i)                Vertex shader input register: v14
BINORMAL                   Vertex shader input register: v15

(i) TANGENT is an alias for TEXCOORD7.

The valid binding semantics for varying output parameters in the vs_1_x profile  These map to output registers in DirectX 8.1 ve
43.   setucate  Clore  NIE  Dohe  ssColor   AmbiColor   baseTex   DiffLight   ffPupil   AmbiColor   saturate  dot  xAxis   Ln             lfAng   normalize  Ln   Vn      abs  dot  Nf halfAng     cl   pow ndh  GlossData PHONG       smoothstep GlossData GLOSS1  GlossData GLOSS2      lerp GlossData DROP  specl  s2    ecularLight   SpecColor   specl     tColor   missColor     e  gt   0 0h     radedEta   BallData ETA              808 00504 0000 006    173  NVIDIA          Cg Language Toolkit    gradedEta   1 0h gradedEta   half3 faceColor   BgColor              half3 refVector   refract  Vn  Nf  gradedEta    if  dot  refVector  refVector   gt  0        now let s intersect with the iris plane  half irisT   intersect_plane IN OPosition  refVector   planeEquation     half fadeT   irisT   BallData LENS DENSITY   fadeT   fadeT   fadeT   faceColor   DiffPupil xxx   iit  aeisi  gt  0  d  half3 irisPoint   IN OPosition   irisT refVector   Halts Sierss th issscaile imi spon   MBULIES  0  On  OO  Sia  O  Sim  y  faceColor   tex2D ColorMap  irisST yz   rgb                        faceColor   lerp faceColor  LensColor  fadeT     hitColor   lerp missColor  faceColor   smoothstep 0 0h  GRADE  slice             hitColor   hitColor   SpecularLight   maSiewien walii  inwicolloie  LaO p       174 808 00504 0000 006  NVIDIA    Advanced Profile Sample Shaders       Skin    Description    This effect demonstrates some techniques for rendering skin ranging from  simple Blinn Phong Bump Mapping to more compl
44.   state.texgen[0].eye.q          state.texgen[0].object.s
state.texgen[0].object.t       state.texgen[0].object.r
state.texgen[0].object.q       state.fog.color
state.fog.params               state.clip[0].plane

The state semantics of type float that can be accessed are listed in Table 15.

Table 15. float state Semantics

state.point.size               state.point.attenuation

Position Invariance

- The arbvp1 profile supports position invariance, as described in the core language specification.
- The modelview-projection matrix is not specified using a binding semantic of GL_MVP.

Data Types

This profile implements data types as follows:

- float data type is implemented as defined in the ARB_vertex_program specification.
- half data type is implemented as float.
- fixed or sampler* data types are not supported, but the profile does provide the minimal partial support that is required for these data types by the core language specification; that is, it is legal to declare variables using these types as long as no operations are performed on the variables.

Compatibility with the vp20 Vertex Program Profile

Programs that work with the vp20 profile are compatible with the arbvp1 profile as long as they use the Cg run time to manage all uniform parameters, including OpenGL state. That is, arbvp1 and vp20 profiles can be used interchangeably without changing the Cg source code or the application
45.   tempnorm).xyz;
    normalVec = normalize(normalVec);

    // compute the eye->vertex vector
    float3 eyeVec = EyeVector.xyz;

    // compute the view depth for the thin film
    float viewdepth = (1.0 / dot(normalVec, eyeVec)) * FilmDepth.x;
    OUT.filmDepth = viewdepth.xx;

    // store normalized light vector
    float3 lightVec = normalize((float3)LightVector);

    // calculate half angle vector
    float3 halfAngleVec = normalize(lightVec + eyeVec);

    // calculate diffuse component
    float diffuse = dot(normalVec, lightVec);

    // calculate specular component
    float specular = dot(normalVec, halfAngleVec);

    // use the lit instruction to calculate lighting,
    // automatically clamp
    float4 lighting = lit(diffuse, specular, 32);

    // output final lighting results
    OUT.diffCol = (float4)lighting.y;
    OUT.specCol = (float4)lighting.z;

    return OUT;
}

Pixel Shader Source Code for Thin Film Effect

struct v2f {
    float3 diffCol   : COLOR0;
    float3 specCol   : COLOR1;
    float2 filmDepth : TEXCOORD0;
};

void main(v2f IN,
          out float4 color : COLOR,
          uniform sampler2D fringeMap,
          uniform sampler2D diffMap)
{
    // diffuse material color
    float3 diffCol = float3(0.3, 0.3, 0.5);

    // lookup fringe value based on view depth
    float3 fringeCol = (float3)tex2D(fringeMap, IN.filmDepth);

    // modulate specular lighting by fringe color,
    // combine with regular lighting
    color.rgb = fringeCol * IN.specCol + IN
46.   tex3D noise map  biVariate 6  rgb 18   normal   normalize normal   noiseSum            calcLighting diffuse  specular  normal  IN OPosition   IN LightPos  IN ViewerPos  32      float3 nvShift   tex3D noise map  uniVariate 3  rgb   2    tex3D noise map  uniVariate  rgb   4    tex3D noise map  biVariate 3  rgb   16    yal  yla are     dumerpodetesax   Sp   0     FANASIMIL IEE  s sx  ANASINILIETE 7    biVariate   float3 IN OPosition x     IN OPosition z   INODORO 0p  Float  texloowel   loiveiciace ss7 4 t Eloeuz  lo 3259     nvShift yx   float2 0  interpolate x 8    float3 nvDecal    tex2D nv_map  float2 1 texCoord x  texCoord y   rgb     Imes OOlaice 2 U 27  335       float3 eye   IN ViewerPos JEN  OP Sive iL om p  float3 lightMetal   texCUBE cube map   reflect normal  eye   rgb   loss der Meral    Cchiiriruse   iloacs  5   25 0   a  Specular  se fil eat  iv  DN       float3 finalColor   lerp lightMetal  darkMetal  nvDecal x    wejeulicin ilo erc4  ienmalCoillei  1  5       164 808 00504 0000 006  NVIDIA    Advanced Profile Sample Shaders       MultiPaint    Description    MultiPaint presents a single pass solution to a common production problem   mixing multiple kinds of materials on a single polygonal surface  MultiPaint  provides a simple BRDF  bidirectional reflectance distribution function  that  is still complex enough to represent many common metallic and dielectric  surfaces  and controls all key factors of the variable BRDF through texturing   This permits you to cre
47.   the appropriate code path at run time.

An example of this situation would be a fragment shader that supported a generic light source model for shading. Depending on how its parameters were set, it might implement a point light, a spotlight, or a light source that projected a texture map to determine the light distribution. Rather than having a series of if-else tests to determine which light model to use, having a separate version of the shader for each light type is generally more efficient.

Appendix D  Cg Compiler Options

This appendix describes the command-line options for the Cg compiler. What follows are the command-line options for the Cg compiler (cgc.exe):

-profile prof
    Compile for the prof profile.

-profileopts profopts
    Specify a comma-separated list of profile-specific options. See the profile specification for valid options.

-entry fname
    Specify the main function name as fname.

-o fname
    Write the output to file fname.

-Dmacro[=value]
    Define a macro, with optional value.

-Ipathname
    Specify path to an include directory.

-l filename
    Write compiler messages to filename rather than to standard output.

-strict
    Enforce strict type checking.

-nofx
    Do not treat CgFX keywords as reserved words.

-quiet
    Suppress printing the header to stdout.

-nocode
    Compile, but do not generate any code.

-nostdlib
    Do not include the stdlib.h hea
48.   use the "Advanced Profile Sample Shaders" on page 153 and "Basic Profile Sample Shaders" on page 189 as a basis to build your own effects.

Release Notes

Release notes for Cg are now contained in a separate document that is part of the Cg distribution.

Please report any bugs, issues, and feedback to NVIDIA by e-mailing cgsupport@nvidia.com. We will expeditiously address any reported problems.

Online Updates

Any changes, additions, or corrections are posted at the NVIDIA Cg Web site:

http://developer.nvidia.com/Cg

Refer to this site often to keep up on the latest changes and additions to the Cg language. Information on how to report any bugs you may find in the release is also available on this site.

Introduction to the Cg Language

Historically, graphics hardware has been programmed at a very low level. Fixed-function pipelines were configured by setting states such as the texture-combining modes. More recently, programmers configured programmable pipelines by using programming interfaces at the assembly language level. In theory, these low-level programming interfaces provided great flexibility. In practice, they were painful to use and presented a serious barrier to the effective use of hardware.

Using a high-level programming language, rather than the low-level languages of the past, provides several advantages:

- A high-level language speeds up the tweak-and-run cycle when a
49.   xut eese dee we ahah aed ane boe kei SR eae eae wees 304  Language Constructs and Support wie ca kac a ee ee o pdg ee 304  BINGINGS iu acies ura pao HO Ec Reh eee Rede Fe parar Kad qr d 306  OPUS  ari C Xp PEUPLE Ris ed RO eal aes eine 307   DirectX Pixel Shader 1 x Profiles  ps_1_    0    ccc cee eee oraka 308  aU Dag PCT 308  Modifies cura cute s mapa qued EEA Ad xe Rp PT AREE EU BENE RE 309  Language  Constructs  and SUPPOM aa Ra 310  Standard Library FUNCUONS sucio rt dad e 311  BINGINGS   cet bee pin AAA BRERA E A RP hale 312  Auxiliary Texture  FUNCIONS 24 2246 963245 8 2 VAGRRARE OAS AER NS dde qx 315  Examples       os aid a e set Ei 319   Appendix C   Nine Steps to High Performance C9         lt ccooooccccc o 321  Appendix D   Cg Compiler OptiONS    i5  x iconos a dca A a el A ew a a 329  MAER EE 331  808 00504 0000 006 vii    NVIDIA          Cg Language Toolkit       viii 808 00504 0000 006  NVIDIA    Contents  Figures  and Tables    List of Figures       Figs  2  CgsModelofthe GPU o isa eek yx Fono m SOR A Gok a Box BOR A ox 3 2  Fig  2  The Parts of the Cg Runtime API        2 2 2 2    0 022     eee 45  Fig 3  The Cg Simple Workspace  sois  ee RRR Ee X ox UR RO a 145  Fig  4  Thesimple cg Shader    2    cns 146  Fig  5  Example of Improved Skinning               lens 154  Fig  6  Example of Improved Water         les 157  Fig  7     Example of Melting Paint   ues Rm ike Ug a a de RR GR A 161  Fig  8  Example of MultiPaint             ns 165  Fig  9  Example of R
50.  123  saturate   for performance 324  scalar type category 232  semantics  aliasing 243  restrictions 243  shader sample  anisotropic lighting 190  bump dot 3x2 diffuse and specular 192  bump reflection mapping 196  fresnel 200  grass 202  improved skinning 154  improved water 157  matrix palette skinning 217  melting paint 161  multipaint 165  ray traced refraction 170  refraction 205    shadow mapping 208  shadow volume extrusion 211  sine wave demo 214    skin 175  shader  simple cg example 146  shaders    advanced profile samples     153  basic profile samples 189  shading computations for performance 326  shadow mapping 208  pixel shader code example 210  sample shader 208  vertex shader code example 209  shadow volume extrusion  sample shader 211  vertex shader code example 212  shadow volumes 211  silent incompatibilities with C 221  simple cg  basic transformations 149  passing arguments 149  Sine function 202  214  sine wave demo  sample shader 214  vertex shader code example 215  sinh x  37  skin  pixel shader code example 175  sample shader 175  skinning  improved  sample shader 154  vertex shader code example 155  smearing  scalar to vector 237  Stanford shading language  relation to Cg 221  State assignment 118  statements  introduction 18  statements  in Cg 244  structures  introduction 13  swizzle  for performance 323  swizzle operator 22  swizzle operator  described 245       336    808 00504 0000 006    NVIDIA    T    technique 117  technique validation 120  
51.  43 for more details     Consider the following effect    float3 DiffuseColor lt    string type    color     float3 minValue   float3 0 0 0    float3 maxValue   float3 10 10 10       cd qu d sms    technique FixedFunctionLighting            pass    LightingEnable   true   ightEnable 0    true   tige eosi elom O    Lileasea  10  10  10  i1       LightAmbient 0    float4  1  1  1  1    LightDiffuse 0     float4 2 DiffuseColor  1     LightSpecular 0    float4 1 1 1 1            MaterialShininess   10 f   MaterialAmbient   float4 1 1 1 1         118    808 00504 0000 006  NVIDIA    Introduction to CgFX    MacercrtalpDpi Ss Eoi bM   NIME  MaterialSpecular   float4  5   5   5  1              The effect defines a single effect parameter  DiffuseColor  with three  associated annotations  a string named type and two float3s named  minValue and maxValue  These annotations exist purely for the use of the  application using the effect file  the Cg runtime does not interpret the  annotation names or values in any way  The effect parameter is initialized to  the value  1 1 1      The effect also defines a single technique  named FixedFunctionLighting   which in turn contains a single rendering pass  The rendering pass sets the  appropriate OpenGL state to perform per vertex lighting using the built in  fixed function material model of OpenGL  The complete set of supported  OpenGL states is listed in the section    OpenGL State     on page 129     Note that the LightDiffuse  0  state value 
52.   Parameter Values

The core Cg runtime provides a number of entry points for setting and retrieving parameter values. In addition, the graphics API-specific Cg runtimes provide additional entry points for managing parameter values.

When managing numeric parameters, choosing which set of entry points to use is largely a matter of programmer preference. In some circumstances, it may be slightly more efficient to use the core Cg runtime entry points. However, parameters that hold graphics API-specific quantities, such as sampler handles, must be set using the API-specific entry points. The API-specific entry points must be used because the core Cg runtime, which is graphics API-agnostic, provides no such entry points.

The most often used parameter value routines are used to set and get a parameter's current values. A parameter's current value is initialized to any default value assigned in the Cg source, or 0 otherwise. The current value of a numeric parameter can be queried using the family of entry points:

    int cgGetParameterValue{i,f,d}{r,c}(CGparameter param,
                                        int nvals, type *v);

The given parameter must be a scalar, vector, matrix, or an (possibly multidimensional) array of scalars, vectors, or matrices. There are versions of each function to retrieve the values into an int, float, or double buffer; these are signified by the i, f, and d in the entry point
53.  CG_SOURCE   FragmentProgram cg    de miro a  198 2 0   Iracmena rogue  0 4       CComPtr lt ID3DXBuffer gt  byteCode    const char  progSrc   cgGetProgramString fragmentProgram   CG_COMPILED_PROGRAM      D3DXAssembleShader  progSrc  strlen progSrc   0  0  0              808 00504 0000 006 93  NVIDIA          Cg Language Toolkit     amp byteCode  0    device  gt CreatePixelShader  byteCode   GetBufferPointer      amp pixelShader           Grab some parameters   modelViewMatrix   cgGetNamedParameter  vertexProgram               ModelViewMatrix     baseTexture   cgGetNamedParameter  fragmentProgram    BaseTexture     someColor   cgGetNamedParameter  fragmentProgram    SomeColor           Sanity check that parameters have th xpected siz   assert  cgD3D9TypeToSize  cgGetParameterType    modelViewMatrix      16     assert  cgD3D9TypeToSize  cgGetParameterType  someColor          4  2                Called to render the scen  void OnRender                  Get the Direct3D resource locations for parameters     This can be done earlier and saved  DWORD modelViewMatrixRegister    cgGetParameterResourcelndex  modelViewMatrix      DWORD baseTextureUnit     cgGetParameterResourcelndex  baseTexture     DWORD someColorRegister     cgGetParameterResourceIndex  someColor                      See the Dizect3D state   device  gt SetVertexShaderConstantF  modelViewMatrixRegister   cmaci a Aye  device  gt SetPixelShaderConstantF  someColorRegister    eC OSes Color A E  vice  gt SetVertexDeclara
54.   CgFX file may contain one technique for an advanced GPU with powerful fragment programmability, and another technique for older graphics hardware supporting fixed-function texture blending. CgFX techniques can also be used for functionality, level-of-detail, or performance fallbacks. For example:

    technique PixelShaderVersion
    { ... }

    technique FixedFunctionVersion
    { ... }

    technique LowDetailVersion
    { ... }

An application can make queries about which techniques are present in an effect and can choose an appropriate one at runtime, based on whatever criteria are appropriate.

Each technique contains one or more passes. Each pass represents a set of render states and shaders to apply for a single rendering pass within a technique. For instance, the first pass might lay down depth only, so that subsequent passes can apply an additive alpha-blending technique without requiring polygon sorting.

Each pass may contain a vertex program, a fragment program, or both, and each pass may use fixed-function vertex processing, pixel processing, or both. For example, a first pass might use fixed-function pixel processing to output the ambient color. The next pass could use an fp30 fragment program, and pass three might use an arbfp1 fragment program.

State Assignments

Each pass also contains render state assignments such as alpha blending, depth writes, and texture filtering modes, to name a few. For example:

    pass firstPass {
        DepthTestEnable = tru
55.   Comparison operators are allowed (>, <, >=, <=, ==, !=) and Boolean operators (||, &&, ?:) are allowed. However, the logic operators (&, |, ^, ~) are not.

Data Types

The profiles implement data types as follows:

- float data types are implemented as IEEE 32-bit single precision.
- half and double data types are treated as float.
- int data type is supported using floating-point operations, which adds extra instructions for proper truncation for divides, modulos, and casts from floating-point types.
- fixed or sampler* data types are not supported, but the profiles do provide the minimal partial support that is required for these data types by the core language specification (that is, it is legal to declare variables using these types, as long as no operations are performed on the variables).

Using Arrays

Variable indexing of arrays is allowed as long as the array is a uniform constant. For compatibility reasons, arrays indexed with variable expressions need not be declared const, just uniform. However, writing to an array that is later indexed with a variable expression yields unpredictable results.

Array data is not packed because vertex program indexing does not permit it. Each element of the array takes a single 4-float program parameter register. For example, float arr[10], float2 arr[10], float3 arr[10], and float4 arr[10] all consume 10 program parameter registers.

It is more efficient to access an array
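The register rule stated under "Using Arrays" in the excerpt above can be restated as a small C sketch. The helper names array_register_count and array_wasted_floats are hypothetical and are not part of Cg or its runtime; they only encode the statement that array data is not packed, so every element occupies one whole 4-float register whatever its width.

```c
#include <assert.h>

/* Hypothetical helper, not a Cg API: number of 4-float program parameter
 * registers consumed by an array of `count` elements, where each element
 * has `components` floats (1 for float, 2 for float2, and so on).
 * Because array data is not packed, the width does not matter. */
int array_register_count(int components, int count)
{
    assert(components >= 1 && components <= 4);
    assert(count >= 0);
    return count; /* one register per element */
}

/* Hypothetical helper: floats left unused inside those registers. */
int array_wasted_floats(int components, int count)
{
    return array_register_count(components, count) * 4 - components * count;
}
```

Under these assumptions, float arr[10] and float4 arr[10] both cost 10 registers, but the float version leaves 30 of its 40 float slots unused, which is one reason to prefer wider element types when the data allows it.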
56.   IDENTITY for applying no transformation at all
- CG_GL_MATRIX_TRANSPOSE for transposing the matrix
- CG_GL_MATRIX_INVERSE for inverting the matrix
- CG_GL_MATRIX_INVERSE_TRANSPOSE for inverting and transposing the matrix

Setting Uniform Arrays of Scalar, Vector, and Matrix Parameters

To set the values of arrays of uniform scalar or vector parameters, use the cgGLSetParameterArray functions:

    void cgGLSetParameterArray1f(CGparameter parameter,
             long startIndex, long numberOfElements,
             const float* array);
    void cgGLSetParameterArray1d(CGparameter parameter,
             long startIndex, long numberOfElements,
             const double* array);
    void cgGLSetParameterArray2f(CGparameter parameter,
             long startIndex, long numberOfElements,
             const float* array);
    void cgGLSetParameterArray2d(CGparameter parameter,
             long startIndex, long numberOfElements,
             const double* array);
    void cgGLSetParameterArray3f(CGparameter parameter,
             long startIndex, long numberOfElements,
             const float* array);
    void cgGLSetParameterArray3d(CGparameter parameter,
             long startIndex, long numberOfElements,
             const double* array);
    void cgGLSetParameterArray4f(CGparameter parameter,
             long startIndex, long numberOfElements,
             const float* array);
    void cgGLSetParameterArray4d(CGparameter parameter,
             long startIndex, long numberOfElements,
             const double* array);

The digit in the name of those functions indica
57.   IDirect3DDevice8::CreateVertexShader().

A data stream is basically an array of data structures. Each of those structures is of a particular type called the vertex format of the stream. Here is an example of a vertex declaration for Direct3D 9:

    const D3DVERTEXELEMENT9 declaration[] = {
        { 0, 0 * sizeof(float),
          D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
          D3DDECLUSAGE_POSITION, 0 },   // Position
        { 0, 3 * sizeof(float),
          D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
          D3DDECLUSAGE_NORMAL, 0 },     // Normal
        { 0, 8 * sizeof(float),
          D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT,
          D3DDECLUSAGE_TEXCOORD, 0 },   // Base texture
        { 1, 0 * sizeof(float),
          D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
          D3DDECLUSAGE_TEXCOORD, 1 },   // Tangent
        D3DDECL_END()
    };

Here is an example of a vertex declaration for Direct3D 8:

    const DWORD declaration[] = {
        D3DVSD_STREAM(0),
        D3DVSD_REG(D3DVSDE_POSITION, D3DVSDT_FLOAT3),   // Position
        D3DVSD_REG(D3DVSDE_NORMAL, D3DVSDT_FLOAT3),     // Normal
        D3DVSD_SKIP(2),   // Skip the diffuse and specular color
        D3DVSD_REG(D3DVSDE_TEXCOORD0, D3DVSDT_FLOAT2),  // Base texture
        D3DVSD_STREAM(1),                               // Tangent basis stream
        D3DVSD_REG(D3DVSDE_TEXCOORD1, D3DVSDT_FLOAT3),  // Tangent
        D3DVSD_END()
    };

Both declarations tell the Direct3D runtim
58.   Multiple color outputs are not supported in pixel shaders. Only COLOR0 is supported.

DirectX Vertex Shader 1.1 Profile (vs_1_1)

The DirectX Vertex Shader 1.1 profile is used to compile Cg source code to DirectX 8.1 Vertex Shaders and DirectX 9 VS 1.1 shaders.

- Profile name: vs_1_1
- How to invoke: Use the compiler option -profile vs_1_1

The vs_1_1 profile limits Cg to match the capabilities of DirectX Vertex Shaders.

This section describes how using the vs_1_1 profile affects the Cg source code that the developer writes.

Memory Restrictions

DirectX 8 vertex shaders have a limited amount of memory for instructions and data.

Program Instruction Limits

The DirectX 8 vertex shaders are limited to 128 instructions. If the compiler needs to produce more than 128 instructions to compile a program, it reports an error.

Vector Register Limits

Likewise, there are limited numbers of registers to hold program parameters and temporary results. Specifically, there are 96 read-only vector registers and 12 read-write vector registers. If the compiler needs more registers to compile a program than are available, it generates an error.

Language Constructs and Support

Data Types

This profile implements data types as follows:

- float data types are implemented as IEEE 32-bit single precision.
- half and double data types are treated as float.

8. To unders
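The limits quoted in the excerpt above (128 instructions, 96 read-only vector registers, 12 read-write vector registers) can be collected into a small budget check. This is only a sketch: fits_vs_1_1 and the VS11_* constants are hypothetical names, not part of Cg or Direct3D, and the Cg compiler performs its own check and reports an error when a program does not fit.

```c
#include <assert.h>
#include <stdbool.h>

/* Limits for the vs_1_1 profile as stated in the text above. */
enum {
    VS11_MAX_INSTRUCTIONS    = 128,
    VS11_READONLY_REGISTERS  = 96,  /* program parameters */
    VS11_READWRITE_REGISTERS = 12   /* temporary results */
};

/* Hypothetical helper: returns true when an estimated resource usage
 * stays within all three vs_1_1 limits. */
bool fits_vs_1_1(int instructions, int readonly_regs, int readwrite_regs)
{
    assert(instructions >= 0 && readonly_regs >= 0 && readwrite_regs >= 0);
    return instructions   <= VS11_MAX_INSTRUCTIONS
        && readonly_regs  <= VS11_READONLY_REGISTERS
        && readwrite_regs <= VS11_READWRITE_REGISTERS;
}
```

A tool that estimates shader cost ahead of time could use a check of this shape to decide whether to select a simpler shader variant before invoking the compiler.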
59.   MyInterface {
        float Scale;
        float SomeMethod(float x) {
            return Scale * x;
        }
    };

In order to obtain the unique enumerant associated with a parameter's type, the following entry point should be used:

    CGtype cgGetParameterNamedType(CGparameter param);

The CGtype associated with a named user-defined type in a program can be retrieved using

    CGtype cgGetNamedUserType(CGhandle handle, const char *name);

Here, handle can be either a CGprogram or a CGeffect.

The struct types can implement a given interface. In such a case, the indicated interface is known as a parent type of the struct type. In the example above, MyStruct has a single parent type, MyInterface. The parent types of a given named type may be obtained with the following entry points:

    int cgGetNumParentTypes(CGtype type);
    CGtype cgGetParentType(CGtype type, int index);

Note that the Cg language specification currently makes it impossible for a struct type to have more than a single parent type.

All of the user-defined types associated with a program may be obtained with the following entry points:

    int cgGetNumUserTypes(CGprogram program);
    CGtype cgGetUserType(CGprogram program, int index);

Note that the runtime treats interface program parameters as if they were structure parameters with no concrete data or function members.

In older applications that use the Cg runtime, you may encounter the
60.  MySampler      cgD3D9SetTexture  mySampler  myDefaultPoolTexture     Te se El               See the Direct3D documentation for a full explanation of lost devices and  how to properly handle them     Setting Expanded Interface Parameters    This section discusses setting the various types of parameters of the  expanded interface  including uniform scalar  uniform vector  uniform  matrix  uniform arrays of the three previous types  and sampler     Setting Uniform Scalar  Vector  and Matrix Parameters    The function cgD3D9SetUni form    sets floating point parameters like  float3 and float4x3     HRESULT cgD3D9SetUniform CGparameter parameter   const void  value       The amount of data required depends on the type of parameter  but is  always specified as an array of one or more floating point values  The type is  void  so a user defined structure that is compatible can be passed in without  type casting  Here is some code illustrating the use of cgD3D9SetUni form     for setting a vectorParam of type float3  matrixParamof type float2x3   and arrayParam of type float2x2  3     DEDXVECTORS vectorData l  2 3         filbat matrixDatal2       4  2  Sty  14  SBS  Bi  float arrayData 3  2  2     title Zh   187 Aio lis Oro iodo MAS  WON   ML  1231     cgD3D9SetUniform vectorParam   amp vectorData     cgD3D9SetUniform matrixParam  matrixData    cgD3D9SetUniform arrayParam  arrayData      As mentioned previously  cgD3D9TypeToSize    can be used to determine  how many values are requi
61.  N    N  E    where  strq are texture coordinates associated with sampler tex   prevlookup is the result of a previous texture operation   intermediate _coordl are texture coordinates associated with the n 2  texture unit   intermediate coord2 are texture coordinates associated with the n 1  texture unit  and    eye is the eye ray vector     This function can be used generate the  dot product reflect cube map const eye NV texture shader  instruction combination           tex dp3x2 depth float3 str  float4 intermediate coord     float4 prevlookup           Performs the following  float z   dot intermediate coord xyz  prevlookup xyz    float w   dot str  prevlookup xyz    return z   w   where  str are texture coordinates associated with the nth texture unit   intermediate coord are texture coordinates associated with the n 1  texture unit  and  prevlookup is the result of a previous texture operation   This function can be used in conjunction with the DEPTH varying out semantic  to generate the dot  product depth replace NV texture shader  instruction combination              294    808 00504 0000 006  NVIDIA    Appendix B Language Profiles    Examples    The following examples show how a developer can use Cg to achieve  NV_texture_shader and NV_register_combiners functionality           Example 1   struct VertexOut    float4 color ENG ORORO  float4 texCoord0   TEXCOORDO   sui riis Totoral y EDe   OR DMs          y    float4 main VertexOut IN   uniform sampler2D diffuseMap   un
62.  The  final skinned positions are computed using these bones  along with the  weights supplied per vertex  Tangent space bases are skinned in a similar  fashion and then used to transform the light vector into tangent space for  per pixel bump mapping  Fig  22          Fig  22  Example of Matrix Palette Skinning       808 00504 0000 006 217  NVIDIA          Cg Language Toolkit    Vertex Shader Source Code for Matrix Palette Skinning    struct appdata     ellen Ss  iles slicskoin 8 POSITION   loat2 Weights   BLENDWEIGHTO   loat2 Indices   BLENDINDICES   loat3 Normal   NORMAL                 T             loat2 TexCoord0   TEXCOORDO   Loats 5  TEXCOURDI   loat3 SN TEXCOORD                       lan la ten  da Gey Uy ler           ARSS 2    MEDMCOMINDI SIs  y     struct vpconn    float4 Hposition   POSITION   float4 TexCoord0   TEXCOORDO   float4 TexCoordl   TEXCOORD1   float4 Color0   COLORO                    y     vpconn main appdata IN   uniform float4x4 WorldViewProj   uniform float3x4 Bones 26    uniform float3 LightVec     VOC Onmm OWA     float4 tempPos   tempPos xyz   IN Position xyz   tempPos w   1 0        grab first bone matrix  float IN indices x       transform position  float3 pos0   mul  Bones i   tempPos        create 3x3 version of bone matrix  float3x3 m    m  mOO m01 m02   Bones i  _m00_m01_m02   fate  MO mdd il 2 Bonos ia  m _m20_m21_m22   Bones i  _m20_m21_m22            tension S  UT   SEXT  float3 s0   mula NIS        218 808 00504 0000 006  NVIDIA    Bas
63.   about the performance of this and other NVIDIA GPUs.) The fp40 profile, therefore, provides two options to control whether the compiler should emit branches or conditionally executed code for the if statements and loops within Cg shaders. The options are described in Table 22.

Table 22. fp40 Compiler Branching Options

Compiler Option: -ifcvt=<all | none | count=N>
Description: Changes the if conversion mode based on the option selected:
    - all: All if statements are converted to conditional writes.
    - none: All if statements generate branching code.
    - count=N: Sets if-limit cost to N operations.

Compiler Option: -unroll=<all | none | count=N>
Description: Changes the loop unrolling mode based on the option selected:
    - all: All loop statements that can be unrolled will be.
    - none: All loop statements that can be implemented with branching will be.
    - count=N: Sets loop-limit cost to N operations.

Setting both -ifcvt and -unroll to all yields behavior similar to the fp30 profile, for which branch instructions are not available. Using -ifcvt=none places the burden on the Cg fragment program author to use if statements where they want true branches and to use conditional expressions otherwise.

FACE Semantic

The FACE semantic can be applied to a varying parameter to a program. The value of such a parameter has a value less than zero if the fragment being rendered
64.   corresponding formal parameter in any function in the set, remove all functions whose corresponding parameter does not match exactly.

   b. If there is a defined promotion for the type of the actual parameter to the unqualified type of the formal parameter of any function, remove all functions for which this is not true from the set.

   c. If there is a valid implicit cast that converts the type of the actual parameter to the unqualified type of the formal parameter of any function, remove all functions without this cast.

   d. Fail.

6. Choose a function based on profile:

   a. If there is at least one function with a profile that exactly matches the compilation profile, discard all functions that don't exactly match.

   b. Otherwise, if there is at least one function with a wildcard profile that matches the compilation profile, determine the "most specific" matching wildcard profile in the candidate set. Discard all functions except those with this most specific wildcard profile. How "specific" a given wildcard profile name is relative to a particular profile is determined by the profile specification.

7. If the number of functions remaining in the set is not one, then fail.

Global Variables

Global variables are declared and used as in C. Uniform non-static variables may have a semantic associated with them. Uniform non-static variables may have their value set throu
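The per-parameter filtering rules (a) through (d) in the overload-resolution excerpt above can be sketched in C. This is an illustration under stated assumptions, not the compiler's implementation: the MatchKind classification, the Candidate record, and filter_by_parameter are hypothetical names, and the sketch assumes each candidate's parameter has already been classified as an exact match, a promotion, an implicit cast, or no match.

```c
#include <assert.h>

/* Hypothetical classification of how one actual parameter relates to the
 * corresponding formal parameter of a candidate function. */
typedef enum {
    MATCH_EXACT = 0,
    MATCH_PROMOTION = 1,
    MATCH_IMPLICIT_CAST = 2,
    MATCH_NONE = 3
} MatchKind;

typedef struct {
    int id;          /* identifies the candidate function */
    MatchKind kind;  /* match quality for the parameter being examined */
} Candidate;

/* Applies rules (a) to (d) for a single parameter. The best tier present
 * in the set wins, and every candidate outside that tier is removed.
 * Survivors are compacted to the front of `set` in their original order.
 * Returns the number of survivors, or -1 when rule (d) applies. */
int filter_by_parameter(Candidate *set, int n)
{
    assert(set != 0 || n == 0);
    for (int tier = MATCH_EXACT; tier <= MATCH_IMPLICIT_CAST; ++tier) {
        int kept = 0;
        for (int i = 0; i < n; ++i) {
            if ((int)set[i].kind == tier) {
                set[kept++] = set[i];
            }
        }
        if (kept > 0) {
            return kept;  /* rule (a), (b), or (c) matched at this tier */
        }
    }
    return -1;            /* rule (d): fail */
}
```

Running this filter once per parameter, then applying the profile rules and the final "exactly one function remains" test, mirrors the order of the steps in the specification text.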
65.  corresponding to the fixed   function light s diffuse color  is set with an expression involving the  DiffuseColor effect parameter  If the value of this parameter is changed by  the application and the pass s state is later set  the parameter   s new value is  used in the expression that sets the light   s diffuse color     Note also that this expression is parenthesized  In general  CgFX requires  that most expressions  like this one  involving effect parameters be in  parenthesis  This is necessary so that CgFX can distinguish between effect  parameters and built in enumerant values representing constants     The code below demonstrates how to create an effect given the name of an  effect file  After creating a Cg context  cgGLRegisterStates    sets up the  state assignments that support the standard OpenGL state manager  Most  applications will want to do this immediately after creating the CGcontext   Next  the effect is created and associated with the given context   CGcontext context   cgCreateContext      cgGLRegisterStates  context     CGeffect effect   cgCreateEffectFromFile  context   oamp ler oe a me N UI   if  leffect      fprintf stderr   Unable to creat ffect  n      const char  listing   cgGetLastListing  context     de  Mist    corales  sedert   Esa   lisina    exa  iL  E                808 00504 0000 006 119    NVIDIA          Cg Language Toolkit    Technique Validation    Before using any of the techniques in an effect  it   s important to validate the  te
66.   diffCol * diffCol;
    color.a = 1.0;
}

Car Paint 9

Description

This car paint shader uses gonioreflectometric paint samples measured by Cornell University. The samples were converted into a 2D texture map which is indexed using NdotL and NdotH as the (s, t) coordinate pair, and which provides the diffuse component of our lighting equation. The specular term is calculated using the Blinn model, and also includes a term which simulates the clear coat's metallic flecks.

The fleck normal mipmap chain has randomly generated vectors which reside within a positive Z cone in tangent space. The cone is reduced gradually at every level such that in the distance the flecks are pointing mostly up. The flecks' specular power and their contribution are reduced by distance, to give it a grainier appearance up close and a more uniform appearance from afar. Next, the view vector is reflected off a wavy normal map (which represents the object's natural undulations) to index into the environment map. The shininess of the clear coat itself is calculated by scaling the Fresnel term by the luminance of the environment map. (The luminance transfer function selects only the perceptually bright areas of the environment map in order not to reflect the darker areas of the scene.) Finally, the shader lerps between the diffuse paint color and the reflection based on the Fresnel term, and adds the
67. each new product generation comes a two-fold increase in performance. Graphics processor performance increases at approximately three times the rate of microprocessors (Moore's Law cubed). In addition to the performance increases, each year brings new hardware features, supported by new application programming interfaces (APIs). This dizzying pace is difficult for developers to adapt to, but adapt they must.

Developers and users are demanding better rendering quality and more realistic imagery and experiences. Users don't care about the details; they simply want games and other interactive applications to look more like movies, special effects, and animation. Developers want more power (always more), along with more flexibility in controlling the massively capable GPUs of today and tomorrow. APIs do not, and cannot, keep up with the rapid pace of innovation in GPUs. As APIs and underlying technologies change, programmers, artists, and software publishers struggle to adapt to the change and the churn of the hardware/software platform.

What's needed is to raise the level of abstraction for interaction with GPUs. Continued updates and improvements to the hardware and APIs are too painful if developers are too "close to the metal." This problem was exacerbated by the advent of programmability in GPUs. Older GPUs had a small number of controllable or configurable rendering paths, but th
68. either entry point. Only unsized arrays may be modified using these entry points.

Parameter Attributes

A parameter's general class can be queried using

    CGparameterclass cgGetParameterClass(CGparameter param);

The returned CGparameterclass value enumerates the high-level parameter classes:

o CG_PARAMETERCLASS_SCALAR
  A scalar type, such as CG_INT or CG_FLOAT.
o CG_PARAMETERCLASS_VECTOR
  A vector type, such as CG_INT1 or CG_FLOAT4.
o CG_PARAMETERCLASS_MATRIX
  A matrix type, such as CG_INT1x2 or CG_FLOAT4x4.
o CG_PARAMETERCLASS_STRUCT
  A struct or interface.
o CG_PARAMETERCLASS_SAMPLER
  A sampler type, such as sampler1D or samplerCUBE.
o CG_PARAMETERCLASS_OBJECT
  A texture, string, or program.

The program that the parameter corresponds to is found using cgGetParameterProgram():

    CGprogram cgGetParameterProgram(CGparameter parameter);

To determine whether the parameter is varying, uniform, or constant, cgGetParameterVariability() is used:

    CGenum cgGetParameterVariability(CGparameter parameter);

The call returns CG_VARYING if the parameter is a varying parameter, CG_UNIFORM if the parameter is a uniform parameter, or CG_CONSTANT if the parameter is a constant parameter. A constant parameter is a parameter whose value never changes for the life of a compiled program, so that changing its value requires recompiling the program. For some profiles, the 
69. float and int types except for the usual arithmetic conversion behavior and function overloading rules (see "Function Overloading" on page 240).

The usual arithmetic conversions for binary operators are defined as follows:

1. If either operand is double, the other is converted to double.
2. Otherwise, if either operand is float, the other operand is converted to float.
3. Otherwise, if either operand is half, the other operand is converted to half.
4. Otherwise, if either operand is fixed, the other operand is converted to fixed.
5. Otherwise, if either operand is cfloat, the other operand is converted to cfloat.
6. Otherwise, if either operand is int, the other operand is converted to int.
7. Otherwise, both operands have type cint.

Note that conversions happen prior to performing the operation.

Assignment

Assignment of an expression to an object or compile-time typed value converts the expression to the type of the object or value. The resulting value is then assigned to the object or value.

The value of the assignment expressions (=, *=, and so on) is defined as in C: an assignment expression has the value of the left operand after the assignment but is not an lvalue. The type of an assignment expression is the type of the left operand unless the left operand has a qualified type, in which case it is the unqualified version of the type of the l
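The seven conversion rules listed above amount to a ranking of the numeric types: the operand with the higher-ranked type decides the type of both operands. The C sketch below illustrates that reading only. It is not part of the Cg compiler or runtime, and the SketchType enum and usual_arithmetic_conversion() are hypothetical names introduced for this example.

```c
/* Hypothetical type tags, listed from lowest to highest conversion rank
   (rule 7 at the bottom, rule 1 at the top). Not part of the Cg API. */
typedef enum {
    SK_CINT,
    SK_INT,
    SK_CFLOAT,
    SK_FIXED,
    SK_HALF,
    SK_FLOAT,
    SK_DOUBLE
} SketchType;

/* Type to which both operands of a binary operator are converted:
   whichever of the two operand types ranks higher. */
SketchType usual_arithmetic_conversion(SketchType a, SketchType b)
{
    return (a > b) ? a : b;
}
```

Under this sketch, combining an int operand with a half operand yields SK_HALF (rule 3 applies before rule 6), and two cint operands stay SK_CINT (rule 7).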
70. floating point is supported.

It is recommended that you use fixed, half, and float, in that order, for maximum performance. Reversing this order provides maximum precision. You are encouraged to use the fastest type that meets your needs for precision.

Statements and Operators

o Full support for if/else
o No for and while loops, unless they can be unrolled by the compiler
o Support for flexible texture mapping
o Support for screen-space derivative functions
o No support for variable indexing of arrays

Bindings

Binding Semantics for Uniform Data

The valid binding semantics for uniform parameters in the fp30 profile are summarized in Table 26.

Table 26. fp30 Uniform Input Binding Semantics

    Binding Semantics Name          Corresponding Data
    ------------------------------  ------------------------------------------
    register(s0)-register(s15)      Texunit N, where N is in the range [0..15].
    TEXUNIT0-TEXUNIT15              May be used only with uniform inputs with
                                    sampler* types.
    register(c0)-register(c31)      Constant register N, where N is in the
    C0-C31                          range [0..31]. May be used only with
                                    uniform inputs.

Binding Semantics for Varying Input/Output Data

The valid binding semantics for varying input parameters in the fp30 profile are summarized in Table 27.

These binding semantics map to NV_fragment_program input registers. The two sets act as aliases to each other. The profile also allows POSITION, FOG, PSIZE, HPOS, FOGC, PSIZ, BCOL0, BCOL1, 
71. if x is infinite.

isnan(x)
    Returns true if x is NaN (not a number).

ldexp(x, n)
    x * 2^n

lerp(a, b, f)
    Linear interpolation: (1 - f)*a + b*f, where a and b are matching vector or scalar types. Parameter f can be either a scalar or a vector of the same type as a and b.

lit(ndotl, ndoth, m)
    Computes lighting coefficients for ambient, diffuse, and specular light contributions. Returns a 4-vector as follows:
    - The x component of the result vector contains the ambient coefficient, which is always 1.0.
    - The y component contains the diffuse coefficient, which is zero if (n . l) < 0; otherwise (n . l).
    - The z component contains the specular coefficient, which is zero if either (n . l) < 0 or (n . h) < 0; (n . h)^m otherwise.
    - The w component is 1.0.
    There is no vectorized version of this function.

log(x)
    Natural logarithm ln(x). x must be greater than zero.

log2(x)
    Base 2 logarithm of x. x must be greater than zero.

log10(x)
    Base 10 logarithm of x. x must be greater than zero.

max(a, b)
    Maximum of a and b.

min(a, b)
    Minimum of a and b.

Table 1. Mathematical Functions (continued)

modf(x, out ip)
    Splits x into integral and fractional parts, each with the same sign as x. Stores the integral part in ip and returns the fractional part.

mul(M, N)
    Matrix
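The lerp and lit entries in the table above can be sketched in plain C as follows. This is an illustration, not the Cg Standard Library implementation: Vec4, sketch_lerp(), sketch_ipow() and sketch_lit() are hypothetical names, only the scalar form of lerp is shown, and the specular exponent m is assumed to be a non-negative integer so that the example needs no math library.

```c
/* Four-component result, standing in for a Cg float4. */
typedef struct { float x, y, z, w; } Vec4;

/* Scalar lerp(a, b, f): (1 - f)*a + b*f */
float sketch_lerp(float a, float b, float f)
{
    return (1.0f - f) * a + b * f;
}

/* base raised to a non-negative integer power.
   Assumption for this sketch: m is an integer, unlike the real lit(). */
float sketch_ipow(float base, unsigned m)
{
    float result = 1.0f;
    while (m-- > 0)
        result *= base;
    return result;
}

/* lit(ndotl, ndoth, m) as described in the table:
   x = ambient (always 1), y = diffuse, z = specular, w = 1. */
Vec4 sketch_lit(float ndotl, float ndoth, unsigned m)
{
    Vec4 r;
    r.x = 1.0f;
    r.y = (ndotl < 0.0f) ? 0.0f : ndotl;
    r.z = (ndotl < 0.0f || ndoth < 0.0f) ? 0.0f : sketch_ipow(ndoth, m);
    r.w = 1.0f;
    return r;
}
```

For a surface facing away from the light (ndotl below zero), both the diffuse and the specular coefficients come out as zero, which is the behavior the table describes.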
72. images. To interface Cg programs with applications, you must do two things:

1. Compile the programs for the correct profile. In other words, compile the programs into a form that is compatible with the 3D API used by the application and the underlying hardware.

2. Link the programs to the application program. This allows the application to feed varying and uniform data to the programs.

You have two choices as to when to perform these operations. You can perform them at compile time, when the application program is compiled into an executable, or you can perform them at run time, when the application is actually executed. The Cg runtime is an application programming interface that allows an application to compile and link Cg programs at run time.

Benefits of the Cg Runtime

Future Compatibility

Most applications need to run on a range of profiles. If an application precompiles its Cg programs (the compile-time choice), it must store a compiled version of each program for each profile. This is reasonable for one program, but is cumbersome for an application that uses many programs. What's worse, the application is frozen in time. It supports only the profiles that existed when it was compiled; it cannot take advantage of the optimizations that future compilers could offer.

In contrast, programs compiled by applications at run time

o Benefit from future compiler optimizations for 
73. instructions, no limit on texture instructions, no limit on texture-dependent reads, and support for predication.

This section describes the capabilities and restrictions of Cg when using these profiles.

Program Instruction Limit

DirectX 9 pixel shaders have a limit on the number of instructions in a pixel shader.

o PS 2.0 (ps_2_0) pixel shaders are limited to 32 texture instructions and 64 arithmetic instructions.
o Extended PS 2 (ps_2_x) shaders have a maximum total instruction count of between 96 and 1024 instructions. There is no separate texture instruction limit on extended pixel shaders.

If the compiler needs to produce more than the maximum allowed number of instructions to compile a program, it reports an error.

Vector Register Limit

Likewise, there are limited numbers of registers to hold program parameters and temporary results. Specifically, there are 32 read-only vector registers and 12 to 32 read/write vector registers. If the compiler needs more registers to compile a program than are available, it generates an error.

Footnote 7: To understand the capabilities of DirectX PS 2.0 Pixel Shaders and the code produced by the compiler, refer to the Pixel Shader Reference in the DirectX 9 SDK documentation.

Language Constructs and Support

Data Types

This profile implements data types as follows:

o float data type is implemented as IEEE 32 bit si
74. is no enum or union.

o Bit-field declarations in structures are not allowed.
o Variables may be defined anywhere before they are used, rather than just at the beginning of a scope as in C. (That is, we adopt the C++ rules that govern where variable declarations are allowed.) Variables may not be redeclared within the same scope.
o Vector constructors, such as the form float4(1, 2, 3, 4), may be used anywhere in an expression.
o A struct definition automatically performs a corresponding typedef, as in C++.
o An interface can be specified to define a set of methods that comprises an abstract interface.
o A struct type can be declared as implementing an interface by adding a colon (:) and the name of the interface after the name of the struct.
o Methods can be defined in the body of a struct definition.
o C++ style // comments are allowed in addition to C style /* ... */ comments.

Detailed Language Specification

Definitions

The following definitions are based on the ANSI C standard:

o Object
  An object is a region of data storage in the execution environment, the contents of which can represent values. When referenced, an object may be interpreted as having a particular type.
o Declaration
  A declaration specifies the interpretation and attributes of a set of identifiers.
o Definition
  A declaration that also causes storage to be reserved for an object or co
75. it this way:

    [component-wise listing illegible in source; its last term combines a.w and b.w]

than to write it this way:

    float4 c = a [operator illegible in source] b;

The compiler does its best to find vectorization in your programs, but the more vectorized your original code is, the better starting place it has to work from.

A more specific example comes from a common computation done for tangent-space bump mapping. Given a texture map that encodes a bump map by storing the offset along the tangent direction in x, the offset along the binormal in y, and the offset along the normal in z, the bump-mapped normal is computed by scaling the tangent, binormal, and normal appropriately. In C or C++, the natural way to write this computation is as shown:

    // Tangent, binormal, normal: Passed in from vertex program
    float3 T, B, N;
    float3 Nbump;   // Bump-mapped normal
    float3 bump = tex2D(bumpSampler, uv);
    Nbump.x = bump.x * T.x + bump.y * B.x + bump.z * N.x;
    Nbump.y = bump.x * T.y + bump.y * B.y + bump.z * N.y;
    Nbump.z = bump.x * T.z + bump.y * B.z + bump.z * N.z;

However, here we have written a series of computations that add and multiply single pairs of floating-point values at a time. After a little algebra, we can rewrite this as three multiplies of a float3 and a float and two float3 additions, which runs several times faster than the original:

    Nbump = bump.x * T + bump.y * B + bump.z * N;

2. Use Swiz
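The algebra behind the bump-mapping rewrite described above can be checked with a small C sketch. C has no built-in vector types, so this only confirms that the component-wise form and the "three scales and two additions" form produce the same normal; it says nothing about GPU performance. Float3 and all function names here are hypothetical helpers introduced for the example.

```c
/* Stand-in for a Cg float3. */
typedef struct { float x, y, z; } Float3;

/* Scale a vector by a scalar: one float3 * float multiply. */
Float3 f3_scale(float s, Float3 v)
{
    Float3 r = { s * v.x, s * v.y, s * v.z };
    return r;
}

/* Add two vectors: one float3 addition. */
Float3 f3_add(Float3 a, Float3 b)
{
    Float3 r = { a.x + b.x, a.y + b.y, a.z + b.z };
    return r;
}

/* Component-wise form: nine scalar multiplies and six scalar adds. */
Float3 bump_normal_scalar(Float3 bump, Float3 T, Float3 B, Float3 N)
{
    Float3 r;
    r.x = bump.x * T.x + bump.y * B.x + bump.z * N.x;
    r.y = bump.x * T.y + bump.y * B.y + bump.z * N.y;
    r.z = bump.x * T.z + bump.y * B.z + bump.z * N.z;
    return r;
}

/* Vector form: three vector scales and two vector additions. */
Float3 bump_normal_vector(Float3 bump, Float3 T, Float3 B, Float3 N)
{
    return f3_add(f3_add(f3_scale(bump.x, T), f3_scale(bump.y, B)),
                  f3_scale(bump.z, N));
}
```

Both functions evaluate the same sum of three scaled basis vectors; the second simply groups the work per vector, which is the shape a vectorizing compiler or a vector instruction set can exploit.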
76. look up in a cube map (Fig. 21).

Fig. 21. Example of Sine Wave

Vertex Shader Source Code for Sine Wave

    struct appdata {
        float4 TexCoord0 : TEXCOORD0;
    };

    struct vpconn {
        float4 HPOS : POSITION;
        float4 COL0 : COLOR0;
        float4 TEX0 : TEXCOORD0;
    };

    vpconn main(appdata IN,
                uniform float4x4 WorldViewProj,
                uniform float3x4 WorldView,
                uniform float3x3 WorldViewIT,
                uniform float3 WavesX,
                uniform float3 WavesY,
                uniform float3 WavesH,
                uniform float Time)
    {
        vpconn OUT;

        float3 angle = (WavesX * IN.TexCoord0.x) +
                       (WavesY * IN.TexCoord0.y);
        angle = angle + Time;

        float3 sine, cosine;
        sincos(angle, sine, cosine);

        // position is [comment partly illegible in source]
        float4 position;
        position.xz = IN.TexCoord0.xy;
        position.y = dot(WavesH, sine);
        position.w = 1.0;
        OUT.HPOS = mul(WorldViewProj, position);

        // normal is [comment partly illegible in source; it involves
        // h, WaveX, WaveY and cos(angle)]
        float3 normal;
        normal.x = dot([sign illegible] WavesH * WavesX, cosine);
        [remainder of listing illegible in source]

Matrix Palette Skinning

Description

This effect performs matrix palette skinning using two bones per vertex. All the bones for the mesh are set in the constant memory, and each vertex includes two indices that indicate which bones influence this vertex 
77. main parameter gives the name of the function to use as the main entry point when the program is executed. Lastly, args is a null-terminated list of null-terminated strings that is passed as an argument to the compiler.

Loading a Program

After you compile a program, you need to pass the resulting object code to the 3D API that you're using. For this, you need to invoke the Cg runtime's API-specific functions.

The Direct3D-specific functions require the Direct3D device structure in order to make the necessary Direct3D calls. The application passes it to the runtime using the following call:

    cgD3D9SetDevice(Device);

You must do this every time a new Direct3D device is created, typically only at the beginning of the application.

You can then load a Cg program in this way for the Direct3D 9 Cg runtime,

    cgD3D9LoadProgram(program, CG_FALSE, 0);

or this way for the Direct3D 8 Cg runtime:

    cgD3D8LoadProgram(program, CG_FALSE, 0, 0, vertexDeclaration);

The parameter vertexDeclaration is the Direct3D 8 vertex declaration array that describes where to find the necessary vertex attributes in the vertex streams. (See "Expanded Interface Program Execution" on page 103 for the details on the arguments to cgD3D8LoadProgram() and cgD3D9LoadProgram().)

In OpenGL, the equivalent call is

    cgGLLoadProgram(program);

Modifying Program Parameters
78. parameter shadowing is turned off for a given program and the value of any of its uniform parameters is set by some function of the Direct3D Cg runtime, it is immediately downloaded to the GPU constant memory (the memory containing the values of all the uniform parameters). When parameter shadowing is turned on, the value is shadowed instead and no Direct3D call is made at the time it is set; only when the program is bound are all of its parameters actually downloaded to the constant memory. This means that a parameter value set after binding the program is not used during the execution of the program until the next time the program is bound. Parameter shadowing applies to all parameter settings, including texture state stage and texture mode.

Disabling parameter shadowing allows the runtime to consume less memory, but forces the application to do the work of making sure that the constant memory contains all the right values every time it activates a program.

OpenGL Cg Runtime

This section discusses setting parameters and program execution for the OpenGL Cg runtime.

Note: Before any OpenGL Cg runtime functions can be executed, an OpenGL context must be created with either wglCreateContext() or glXCreateContext().

Setting Parameters in OpenGL

In accordance with the OpenGL convention, many of the functions described below come in two versions: a version operating on float
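The parameter-shadowing behavior described earlier for the Direct3D Cg runtime can be modeled with a toy C sketch. Nothing here calls Cg or Direct3D: SketchProgram, sketch_set_uniform() and sketch_bind() are hypothetical names, a single float stands in for GPU constant memory, and the model covers one uniform parameter only.

```c
/* Toy model of one program with one uniform parameter. */
typedef struct {
    int   shadowing;     /* nonzero: parameter shadowing is enabled      */
    int   shadow_set;    /* nonzero: a shadowed value has been stored    */
    float shadow_value;  /* copy kept by the runtime when shadowing      */
    float gpu_constant;  /* stands in for GPU constant memory            */
} SketchProgram;

/* Setting a parameter: shadow it, or download it immediately. */
void sketch_set_uniform(SketchProgram *p, float value)
{
    if (p->shadowing) {
        p->shadow_value = value;   /* no download at set time */
        p->shadow_set = 1;
    } else {
        p->gpu_constant = value;   /* immediate download */
    }
}

/* Binding a program: shadowed values are downloaded now.
   A parameter that was never set triggers no download. */
void sketch_bind(SketchProgram *p)
{
    if (p->shadowing && p->shadow_set)
        p->gpu_constant = p->shadow_value;
}
```

With shadowing enabled, a value set after sketch_bind() does not reach the simulated constant memory until the next sketch_bind(), which mirrors the caveat in the text. With shadowing disabled, every set takes effect at once and the application is responsible for keeping the constants correct.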
79. product of matrix M and matrix N, as shown below:

                | M11 M21 M31 M41 |   | N11 N21 N31 N41 |
    mul(M, N) = | M12 M22 M32 M42 | * | N12 N22 N32 N42 |
                | M13 M23 M33 M43 |   | N13 N23 N33 N43 |
                | M14 M24 M34 M44 |   | N14 N24 N34 N44 |

    If M has size AxB, and N has size BxC, returns a matrix of size AxC.

mul(M, v)
    Product of matrix M and column vector v, as shown below:

                | M11 M21 M31 M41 |   | v1 |
    mul(M, v) = | M12 M22 M32 M42 | * | v2 |
                | M13 M23 M33 M43 |   | v3 |
                | M14 M24 M34 M44 |   | v4 |

    If M is an AxB matrix and v is a Bx1 vector, returns an Ax1 vector.

mul(v, M)
    Product of row vector v and matrix M, as shown below:

                                    | M11 M21 M31 M41 |
    mul(v, M) = | v1 v2 v3 v4 | *   | M12 M22 M32 M42 |
                                    | M13 M23 M33 M43 |
                                    | M14 M24 M34 M44 |

    If v is a 1xA vector and M is an AxB matrix, returns a 1xB vector.

noise(x)
    Either a 1-, 2-, or 3-dimensional noise function, depending on the type of its argument. The returned value is between zero and one and is always the same for a given input value.

pow(x, y)
    x^y

radians(x)
    Degree-to-radian conversion.

round(x)
    Closest integer to x.

Table 1. Mathematical Functions (continued)

rsqrt(x)
    Reciprocal square root of x. x must be greater than zero.

saturate(x)
    Equivalent to clamp(x, 0, 1):
    - Returns 0 if x is less than 0.
    - Returns 1 if x is greater than 1.
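The difference between mul(M, v) and mul(v, M), and the clamping done by saturate, can be sketched in C as below. This is an illustration rather than the Cg Standard Library code: Mat4 and the sketch_* functions are hypothetical names, the matrix is fixed at 4x4, and the array is indexed as m[row][column], which is a choice made for this example.

```c
/* 4x4 matrix, indexed as m[row][column] in this sketch. */
typedef struct { float m[4][4]; } Mat4;

/* mul(M, v): v is treated as a column vector, so each output
   component is the dot product of one matrix row with v. */
void sketch_mul_mv(const Mat4 *M, const float v[4], float out[4])
{
    for (int r = 0; r < 4; ++r) {
        out[r] = 0.0f;
        for (int c = 0; c < 4; ++c)
            out[r] += M->m[r][c] * v[c];
    }
}

/* mul(v, M): v is treated as a row vector, so each output
   component is the dot product of v with one matrix column. */
void sketch_mul_vm(const float v[4], const Mat4 *M, float out[4])
{
    for (int c = 0; c < 4; ++c) {
        out[c] = 0.0f;
        for (int r = 0; r < 4; ++r)
            out[c] += v[r] * M->m[r][c];
    }
}

/* saturate(x): clamp x to the range [0, 1]. */
float sketch_saturate(float x)
{
    if (x < 0.0f) return 0.0f;
    if (x > 1.0f) return 1.0f;
    return x;
}
```

For a non-symmetric matrix the two products differ, since mul(v, M) is the same as multiplying the transpose of M by the column vector v. Passing the arguments in the wrong order is therefore a silent error, not a compile-time one.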
80. profile is an enumerant specifying the profile to which the program must be compiled.

o entry is the name of the function that must be considered as the main entry point by the compiler. If the value is zero, the name main is used.
o args is a pointer to a null-terminated array of null-terminated strings that are passed as arguments to the compiler. The pointer may itself be null.

The only difference between the two functions is how program is interpreted. For cgCreateProgramFromFile(), program is a string containing the name of a file containing source code; for cgCreateProgram(), program directly contains source code. If the enumerant programType is equal to CG_SOURCE, the source code is Cg source code; if it is equal to CG_OBJECT, the source code is precompiled object code and does not require any further compilation.

The CGprogram handle returned by cgCreateProgramFromFile() is valid if it is different from zero, which means that the program has been successfully created and compiled. The program is destroyed by passing its handle to cgDestroyProgram():

    void cgDestroyProgram(CGprogram program);

The Cg runtime allows for either automatic or manual compilation of programs. Compilation of a program is required before the program may be used when drawing. As such, program compilation is necessary sometime after the program is first created, or whenever it enters an uncompiled state. A program may enter an uncompiled state for a variety 
81. program and any of its parameter handles. On the other hand, destroying a program with cgDestroyProgram() or cgDestroyContext() releases any Direct3D resources by indirectly calling cgD3D9UnloadProgram().

Function cgD3D9IsProgramLoaded() returns CG_TRUE if a program is loaded:

    CGbool cgD3D9IsProgramLoaded(CGprogram program);

All programs must be loaded before they can be bound. Binding a program is done by calling cgD3D9BindProgram():

    HRESULT cgD3D9BindProgram(CGprogram program);

This function activates the Direct3D shader corresponding to program by calling IDirect3DDevice9::SetVertexShader() or IDirect3DDevice9::SetPixelShader(), depending on the program's profile. If parameter shadowing is enabled for program, it also sets all the shadowed parameters and their associated Direct3D states (such as texture stage states for the sampler parameters). No value or state tracking is performed by the runtime, so this setting is done regardless of the current values of these parameters or of their states. If a shadowed parameter has not been set by the time cgD3D9BindProgram() is called, no Direct3D call of any sort is issued for this parameter.

Only one vertex program and one fragment program can be bound at any given time, so binding a program of a given type implicitly unbinds any other program of the same type.

Expanded Interface Profil
82. program except for specifying a different profile. However, if any of the glProgramParameterxxNV() routines are used, the application program needs to be changed to use the corresponding ARB functions.

Since there is no ARB function corresponding to glTrackMatrixNV(), an application using glTrackMatrixNV() and the arbvp1 profile needs to be modified. One solution is to change the Cg source code to refer to the matrix using the state structure so that the matrix is automatically tracked by the OpenGL driver as part of its GL_ARB_vertex_program support. Another solution is for the application to use the Cg run-time routine cgGLSetStateMatrixParameter() to load the appropriate matrix or matrices when necessary.

Another potential incompatibility between the arbvp1 and vp20 profiles is the way that input varying semantics are handled. In the vp20 profile, semantic names such as POSITION and ATTR0 are aliases of each other, the same way NV_vertex_program aliases Vertex and Attribute 0 (see Table 30, "vp20 Varying Input Binding Semantics," on page 281). In the arbvp1 profile, the semantic names are not aliased because ARB_vertex_program allows the conventional attributes (such as vertex position) to be separate from the generic attributes (such as Attribute 0). For this reason it is important to follow the conventions given in Table 17, "arbvp1 Varying Input Binding Semantics," on page 261 so that arbvp1 programs work for all implementations of ARB_vertex_pr
83. programs to behave correctly under other pixel shader profiles.

The swizzles required on the texture coordinate parameter to the projective texture lookup functions are listed in Table 34.

Table 34. Required Projective Texture Lookup Swizzles

    Texture Lookup Function    Texture Coordinate Swizzle
    -----------------------    --------------------------
    tex1Dproj                  .xw / .ra
    tex2Dproj                  .xyw / .rga
    texRECTproj                .xyw / .rga
    tex3Dproj                  .xyzw / .rgba
    texCUBEproj                .xyzw / .rgba

Bindings

Manual Assignment of Bindings

The Cg compiler can determine bindings between texture units and uniform sampler parameters/texture coordinate inputs automatically. This automatic assignment is based on the context in which uniform sampler parameters and texture coordinate inputs are used together.

To specify bindings between texture units and uniform parameters/texture coordinates to match their application, all sampler uniform parameters and texture coordinate inputs that are used in the program must have matching binding semantics; for example, TEXUNIT<n> may only be used with TEXCOORD<n>. Partially specified binding semantics may not work in all cases. Fundamentally, this restriction is due to the close coupling between texture samplers and texture coordinates in the NV_texture_shader extension.

Binding Semantics for Uniform Data

If a binding semantic for a uniform parameter is not specified, then the compiler will allocat
84. sample of which is given below. For a complete list see "Texture Map Functions" on page 38.

o Standard nonprojective texture lookup:

    tex2D(sampler2D tex, float2 s)
    texRECT(samplerRECT tex, float2 s)
    texCUBE(samplerCUBE tex, float3 s)

o Standard projective texture lookup:

    tex2Dproj(sampler2D tex, float3 sq)
    texRECTproj(samplerRECT tex, float3 sq)
    texCUBEproj(samplerCUBE tex, float4 sq)

o Nonprojective texture lookup with user-specified filter kernel size:

    tex2D(sampler2D tex, float2 s, float2 dsdx, float2 dsdy)
    texRECT(samplerRECT tex, float2 s, float2 dsdx, float2 dsdy)
    texCUBE(samplerCUBE tex, float3 s, float3 dsdx, float3 dsdy)

  The filter size is specified by providing the derivatives of the texture coordinates with respect to pixel coordinates x (dsdx) and y (dsdy). For more information see "Texture Map Functions" on page 38.

o Shadowmap lookup:

    tex2Dproj(sampler2D tex, float4 szq)
    texRECTproj(samplerRECT tex, float4 szq)

  In these functions, the z component of the texture coordinate holds a depth value to be compared against the shadowmap. Shadowmap lookups require the associated texture unit to be configured by the application for depth-compare texturing; otherwise, no depth comparison is actually performed.

Effects

Cg includes a powerful, versatile shader specification and interchange format: CgFX. For artists and developers of rea
85. [code fragment illegible in source; only "maxval" and "return x" can be made out]

Texture Lookups in Advanced Fragment Profiles

Cg's advanced fragment profiles and the vp40 profile provide a variety of texture lookup functions. Please note that Cg uses a different set of texture lookup functions for basic fragment profiles because of the restricted pixel programmability of that hardware. Basic fragment profile lookup functions aren't discussed in this introductory chapter.

Advanced fragment profile texture lookup functions always require at least two parameters:

o Texture sampler
  A texture sampler is a variable with the type sampler, sampler1D, sampler2D, sampler3D, samplerCUBE, or samplerRECT and represents the combination of a texture image with a filter, clamp, wrap, or similar configuration. Texture sampler variables cannot be set directly within the Cg language; instead, they must be provided by the application as uniform parameters to a Cg program.
o Texture coordinate
  Depending on the type of texture lookup, the coordinate may be a scalar, a two-vector, a three-vector, or a four-vector.

The following fragment program uses the tex2D() function to perform a 2D texture lookup to determine the fragment's RGBA color:

    void applytex(uniform sampler2D mytexture,
                  float2 uv : TEXCOORD0,
                  out float4 outcolor : COLOR)
    {
        outcolor = tex2D(mytexture, uv);
    }

Cg provides a wide variety of texture lookup functions, a
86. [Figure: screenshot of the Cg Simple workspace, showing the Source Files, Header Files, and Cg Programs folders, with the vertex program simple.cg open in the editor.]

Fig. 3. The Cg Simple Workspace

As usual, click the FileView tab to view the various files in the project. What's different in this case, though, is that in addition to the usual Source Files and Header Files folders, there is also a Cg Programs folder.

This Cg Programs folder should contain one Cg program, simple.cg, which is what you can use for experimentation. Double-click simple.cg to ope
87. source code.

Examples shown are

o Anisotropic Lighting
o Bump Dot3x2 Diffuse and Specular
o Bump Reflection Mapping
o Fresnel
o Grass
o Refraction
o Shadow Mapping
o Shadow Volume Extrusion
o Sine Wave Demo
o Matrix Palette Skinning

Anisotropic Lighting

Description

The anisotropic lighting effect (Fig. 13) shows the vertex program's half-angle vector calculation. It uses HdotN and LdotN per vertex to look up into a 2D texture to achieve interesting lighting effects.

Fig. 13. Example of Anisotropic Lighting

Vertex Shader Source Code for Anisotropic Lighting

    struct appdata {
        [type illegible in source] Position : POSITION;
        float3 Normal : NORMAL;
    };

    struct vpconn {
        float4 Hposition : POSITION;
        float4 TexCoord0 : TEXCOORD0;
    };

    vpconn main(appdata IN,
                uniform float4x4 WorldViewProj,
                uniform [type illegible in source] WorldIT,
                uniform float3x4 World,
                uniform float3 LightVec,
                uniform float3 EyePos)
    {
        vpconn OUT;

        float3 worldNormal = normalize(mul(WorldIT, IN.Normal));

        // build float4
        float4 tempPos;
        tempPos.xyz = IN.Position.xyz;
        tempPos.w = 1.0;

        // compute world space position
        float3 worldSpacePos = mul(World, tempPos);

        [the statements that assign OUT.TexCoord0 and OUT.Hposition and return OUT are scrambled in the source]

        // vector from vertex to eye, normalized
        float3 vertToEye = normalize(EyePos - wor
88. structure that is defined in simple.cg is vertout, which connects the vertex to the fragment.

    // define outputs from vertex shader
    struct vertout {
        float4 HPosition : POSITION;
        float4 Color : COLOR;
    };

The vertout structure also contains only two members: HPosition, the vertex position in homogeneous coordinates, and Color, the vertex color. Again, binding semantics are used to specify register locations for the variables. In this case, the homogeneous position information resides in the hardware register corresponding to POSITION, and the color information resides in the hardware register corresponding to COLOR.

Passing Arguments

Now let's take a look at the body of the program, section by section, starting with the declaration of main():

    vertout main(appin IN,
                 uniform float4x4 ModelViewProj,
                 uniform float4x4 ModelViewIT,
                 uniform float4 LightVec)

As required for a vertex program, main() takes an application-to-vertex structure as input and returns a vertex-to-fragment structure. In this case, we are using the two structure types we have already defined, appin and vertout. Notice that main() takes in three uniform parameters: two matrices and one vector. All three parameters are passed to simple.cg by the application, using the run-time library.

The first matrix, ModelViewProj, is the concatenation of the modelview and projection matrices. Together, these matrices transfo
89.  supported by the Cg Standard Library. Vertex profiles are not required to support these functions.

Table 4  Derivative Functions

    Function    Description
    ddx(a)      Approximate partial derivative of a with respect to screen-space x coordinate.
    ddy(a)      Approximate partial derivative of a with respect to screen-space y coordinate.

Debugging Function

Table 5, "Debugging Function", presents the debugging function that is supported by the Cg Standard Library. Vertex profiles are not required to support this function.

Table 5  Debugging Function

    Function                Description
    void debug(float4 x)    If the compiler's DEBUG option is specified, calling this function causes the value x to be copied to the COLOR output of the program, and execution of the program is terminated. If the compiler's DEBUG option is not specified, this function does nothing.

The debug function is intended to allow a program to be compiled twice: once with the DEBUG option and once without. By executing both programs, you can obtain one frame buffer containing the final output of the program and a second containing an intermediate value to be examined for debugging.

Predefined Fragment Program Output Structures

A number of helper structure types for use in fragment programs are predefined in the standard library. Variables of th
90.  teak Ae eee mb ob Rel oC Rodeo d 208  Descriptio  causes tice oboe ERU RUE EO bae E d BOR doka OPE a Rc eed ae ee E eR RE REIR 208  Vertex Shader Source Code for Shadow Mapping            0 000 eee eee 209  Pixel Shader Source Code for Shadow Mapping          llle 210  Shadow Volume  EXEPUSIOE s sai sa a ahaa RC RR OR ER y V OR e e Rd ec OR 211  presi et PRICE UELLE UTE 211  Vertex Shader Source Code for Shadow Volume Extrusion              000005 212  Sine  Wave  DEMO scc so e RH ceed Ree OR qot c RAR ESA e e RR 214  BILE ca TT TTL 214  Vertex Shader Source Code for Sine Wave            liliis 215  Matrix Palette SKIMMING  s xu ae gate det da A gid  ae a A 217  DESCAPON cock x cer A CHAR Red OC ORO MERA CR Ree RR 217  Vertex Shader Source Code for Matrix Palette Skinning                        218  Appendix A  Cg Language Specification        ooooocccccr eee 221  Language OVSIVIGW  cs sss tsi A RS RENE eU Sd ROLE E RSEN NOR PU US 221  Silent incormpatibilities       acu need Eg poder and ox eee Ed Vade do ad 221  Similar Operations That Must be Expressed Differently                000005 222  Differences from ANSI Co    ce hh 222  Detailed Language Specification    sse a hh oe 224  A Io ad  Rate Ride Ru a echa i t RI  Pao dba MR RS 224  Proteo neta o tite dapi cp tarda  Bua a teat qaot Ap Rod AUR RAE 225  The Unifor MOAIEN is a aden  dex iex oia de Ue RD t   ai a RR alc p D a Re     a 225  Function Declarations sise rd d o ac AP a Sob deed wea dade qe 226  Overloading of Fu
91.  texture handle should be used for the sampler2D in the effect file. Secondly, the application must use the Cg runtime to set the texture state given in the sampler_state block at the appropriate time.

Under OpenGL, the easiest way to achieve these goals is to call cgGLSetupSampler(param, textureID). This entry point binds the given texture, associates the texture handle with the given parameter, and initializes the sampler state by calling cgSetSamplerState().

Alternately, an application can perform these steps itself. The code below shows this in practice:

    CGparameter p = cgGetNamedEffectParameter(effect, "samp");
    GLuint handle;
    glGenTextures(1, &handle);
    glBindTexture(GL_TEXTURE_2D, handle);
    cgGLSetTextureParameter(p, handle);
    cgSetSamplerState(p);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, RES, RES, 0, GL_RGBA,
                 GL_FLOAT, data);

Note the calls to cgGLSetTextureParameter() and cgSetSamplerState(). The first call is the usual runtime call that needs to be made to tell the runtime which OpenGL texture object is associated with a given parameter.

The cgSetSamplerState() call ends up making the glTexParameter calls that set up the texture state defined in the sampler_state block. It expects that the appropriate texture object has been bound with glBindTexture first.

After the sampler has been initialized in either of these manners, there are two possibilities for how the texture para
92.  the same meaning as they do for the cgGLSetMatrixParameter functions.

Setting Varying Parameters

The values of fragment program varying parameters are set as the result of the interpolation across the triangles performed by the GPU, so only the values of vertex program varying parameters are set by the application.

Setting a vertex varying parameter requires two steps.

The first step consists of passing a pointer to an array containing the values for each vertex. This is done using cgGLSetParameterPointer():

    void cgGLSetParameterPointer(CGparameter parameter,
                                 GLint size, GLenum type, GLsizei stride,
                                 GLvoid* array);

The variable size indicates the number of values per vertex that are stored in array. It is equal to 1, 2, 3, or 4. If fewer values are set than the parameter requires, the non-specified values default to 0 for x, y, and z, and 1 for w.

The enumerated type type specifies the data type of the values stored in array: GL_SHORT, GL_INT, GL_FLOAT, or GL_DOUBLE.

The parameter stride is the byte offset between any two consecutive vertices. Passing a value of zero for stride is equivalent to passing a byte offset equal to size multiplied by the size of type in bytes; in other words, it means that there is no gap between two consecutive vertex values. Note that the minimum size for array is implicitly defined by the biggest vertex index specified in the t
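The stride rule described above can be sketched in plain C. This is only an illustration under stated assumptions: it does not call the Cg runtime, the helper names effective_stride and min_array_bytes are hypothetical, and the element size is passed in as a byte count instead of being derived from a GLenum.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte distance between two consecutive vertices.
 * A stride of 0 means "tightly packed", i.e. size * elem_size bytes. */
size_t effective_stride(int size, size_t elem_size, size_t stride)
{
    assert(size >= 1 && size <= 4);   /* values per vertex must be 1..4 */
    return stride != 0 ? stride : (size_t)size * elem_size;
}

/* Hypothetical helper: smallest array, in bytes, that can supply every
 * vertex from index 0 up to and including max_index. */
size_t min_array_bytes(int size, size_t elem_size, size_t stride,
                       size_t max_index)
{
    size_t step = effective_stride(size, elem_size, stride);
    return max_index * step + (size_t)size * elem_size;
}
```

For example, three 4-byte values per vertex with a stride of 0 gives a 12-byte step, so ten vertices (indices 0 to 9) need at least 120 bytes. With an explicit 32-byte stride, the same ten vertices need 300 bytes.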
93.  to which output values are to be written, ncomp is the number of components per pixel in the output buffer (1, 2, 3, or 4), and nx, ny, and nz indicate the number of positions at which the function should be evaluated in each of the x, y, and z dimensions.

The total size of the buffer should be equal to the product of the number of positions in each of the dimensions and the number of components in the buffer, as in the example below.

    #define RES 256
    #define NCOMPS 4
    float *buf = new float[NCOMPS*RES*RES];
    cgEvaluateProgram(tp, buf, NCOMPS, RES, RES, 1);
    // do something with buf
    delete [] buf;

It is an error to pass a CGprogram that doesn't have the CG_PROFILE_GENERIC profile to cgEvaluateProgram().

Annotations

Using annotations, it is possible to attach additional information to parameters, techniques, programs, and passes in the effect file for use by the application. An annotation is a list of variables and values denoted by angle brackets immediately following a declaration, as in the effect below.

    loas Lalola aie  lt  sismo Ues   Wehiicecieaeim   p Sf
    technique fancyHalo < bool optional = true; > {
        pass < string geometry = "character"; string destination = "texture"; > {
            ...
        }
    }

CgFX does not interpret the meaning of annotations in any way; annotations exist solely for the convenience of the application. The example abov
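The buffer-size rule for cgEvaluateProgram() can be written out as a small C helper. This is a sketch only: eval_buffer_floats is a hypothetical name and is not part of the Cg runtime. It computes the number of float elements as the number of components times the number of positions in each dimension.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of floats the output buffer must hold,
 * given ncomp components at each of nx * ny * nz evaluation positions. */
size_t eval_buffer_floats(int ncomp, int nx, int ny, int nz)
{
    assert(ncomp >= 1 && ncomp <= 4);        /* 1, 2, 3, or 4 components */
    assert(nx >= 1 && ny >= 1 && nz >= 1);   /* at least one position per axis */
    return (size_t)ncomp * (size_t)nx * (size_t)ny * (size_t)nz;
}
```

With the values from the example (NCOMPS of 4, RES of 256, and a z extent of 1), the helper returns 262144, which matches NCOMPS*RES*RES.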
94.  values (marked with an f) and a version operating on double values (marked with a d).

Setting Uniform Scalar and Uniform Vector Parameters

To set the values of scalar parameters or vector parameters, use the cgGLSetParameter functions:

    void cgGLSetParameter1f(CGparameter parameter, float x);
    void cgGLSetParameter1fv(CGparameter parameter, const float* array);
    void cgGLSetParameter1d(CGparameter parameter, double x);
    void cgGLSetParameter1dv(CGparameter parameter, const double* array);

    void cgGLSetParameter2f(CGparameter parameter, float x, float y);
    void cgGLSetParameter2fv(CGparameter parameter, const float* array);
    void cgGLSetParameter2d(CGparameter parameter, double x, double y);
    void cgGLSetParameter2dv(CGparameter parameter, const double* array);

    void cgGLSetParameter3f(CGparameter parameter, float x, float y, float z);
    void cgGLSetParameter3fv(CGparameter parameter, const float* array);
    void cgGLSetParameter3d(CGparameter parameter, double x, double y, double z);
    void cgGLSetParameter3dv(CGparameter parameter, const double* array);

    void cgGLSetParameter4f(CGparameter parameter, float x, float y, float z, float w);
    void cgGLSetParameter4fv(CGparameter parameter, const float* array);
    void cgGLSetParameter4d(CGparameter parameter, double x, double y, double z, double w);
    void cgGLSetParameter4dv(CGparameter parameter, const dou
95.  vp20 Vertex Shader profile and the DirectX VS 1.1 profile is that the vp20 profile supports two additional outputs: BCOL0 (for back-facing primary color) and BCOL1 (for back-facing secondary color).

Position Invariance

- The vp20 profile supports position invariance, as described in the core language specification.
- The modelview-projection matrix must be specified using a binding semantic of _GL_MVP.

Data Types

This profile implements data types as follows:

- float data types are implemented as IEEE 32-bit single precision.
- half and double data types are implemented as float.
- int data type is supported using floating point operations, which add extra instructions for proper truncation for divides, modulos, and casts from floating point types.
- fixed or sampler* data types are not supported, but the profile does provide the minimal partial support that is required for these data types by the core language specification; that is, it is legal to declare variables using these types, as long as no operations are performed on the

3. To understand the NV_vertex_program and the code produced by the compiler using the vp20 profile, see the GL_NV_vertex_program extension documentation.
4. See "OpenGL NV_vertex_program 1.0 Profile (vp20)" on page 279 for a full explanation of the data types, statements, and operators supported by this profile.
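The note that int is emulated with floating point operations, with extra work for truncation in divides, modulos, and casts, can be illustrated with a C sketch. This is an assumption-laden illustration, not the instruction sequence the vp20 compiler actually emits (the text does not give it). The function names are hypothetical, and truncation is done here with a C cast, which is valid only while the value fits in an int.

```c
#include <assert.h>

/* Truncate toward zero. Valid only while x fits in an int. */
static float trunc_to_int(float x)
{
    return (float)(int)x;
}

/* Integer-style division carried out on float values. */
float int_div(float a, float b)
{
    assert(b != 0.0f);
    return trunc_to_int(a / b);
}

/* Integer-style modulo: remainder that matches the truncating divide. */
float int_mod(float a, float b)
{
    assert(b != 0.0f);
    return a - b * trunc_to_int(a / b);
}

/* Cast from a floating point value to an "int" held in a float. */
float int_cast(float x)
{
    return trunc_to_int(x);
}
```

Each operation needs an explicit truncation step on top of the plain floating point divide or subtract, which is the kind of extra cost the bullet refers to.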
96. - CGD3D9ERR_INVALIDVEREXDECL: Returned when a program is loaded with the expanded interface, but the given declaration is incompatible.
- CGD3D9ERR_NODEVICE: Returned when a required Direct3D device is 0. This typically occurs when an expanded interface function is called and a Direct3D device has not been set with cgD3D9SetDevice().
- CGD3D9ERR_NOTMATRIX: Returned when a parameter that is not a matrix type is passed to a function that expects one.
- CGD3D9ERR_NOTLOADED: Returned when a parameter has not been loaded with the expanded interface by cgD3D9LoadProgram().
- CGD3D9ERR_NOTSAMPLER: Returned when a parameter that is not a sampler parameter is passed to a function that expects one.
- CGD3D9ERR_NOTUNIFORM: Returned when a parameter that is not uniform is passed to a function that expects one.
- CGD3D9ERR_NULLVALUE: Returned when a value of zero is passed to a function that requires a non-zero value.
- CGD3D9ERR_OUTOFRANGE: Returned when an array range specified to a function is out of range.
- CGD3D9_INVALID_REG: Returned when a register number is requested for an invalid parameter type. This error is specific to the minimal interface functions and does not trigger an error callback.

Testing for Errors

When a Direct3D runtime function is called that returns an error of type HRESULT, the proper method of testing for success or failure is to use the Win32 macros 
97. Table 48  ps_1_x Instruction Set Modifiers (continued)

    Instruction/Register Modifier    Cg Expression
    instr_sat                        saturate(x)  (i.e. min(1, max(0, x)))
    reg_bias                         x - 0.5
    1-reg                            1 - x
    -reg                             -x
    reg_bx2                          2 * (x - 0.5)

Language Constructs and Support

Data Types

In the ps_1_x profiles, operations occur on signed clamped floating-point values in the range -MaxPixelShaderValue to MaxPixelShaderValue, where MaxPixelShaderValue is determined by the DirectX implementation. These profiles allow all data types to be used, but all operations are carried out in the above range. Refer to the DirectX pixel shader 1.X documentation for more details.

Statements and Operators

The DirectX pixel shader 1.X profiles support all of the Cg language constructs, with the following exceptions:

- Arbitrary swizzles are not supported (though arbitrary write masks are). Only the following swizzles are allowed:
      .x/.r, .y/.g, .z/.b, .w/.a
      .xy/.rg, .xyz/.rgb, .xyzw/.rgba
      .xxx/.rrr, .yyy/.ggg, .zzz/.bbb, .www/.aaa
      .xxxx/.rrrr, .yyyy/.gggg, .zzzz/.bbbb, .wwww/.aaaa
- Matrix swizzles are not supported.
- Boolean operators other than <, <=, > and >= are not supported. Furthermore, <, <=, > and >= are only supported as the condition in the ?: operator.
- Bitwise integer operators are not supported.
- / is not supported unless the divisor is a non-zero constant or it is used to compute t
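The modifier-to-expression mapping in Table 48 can be checked with scalar C equivalents. This is a sketch for illustration only: the function names are hypothetical, the code models just the arithmetic of each Cg expression, and it does not model the clamping to the MaxPixelShaderValue range that the hardware applies.

```c
#include <assert.h>

/* instr_sat: saturate(x), i.e. min(1, max(0, x)) */
float ps_sat(float x)    { return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x); }

/* reg_bias: x - 0.5 */
float ps_bias(float x)   { return x - 0.5f; }

/* 1-reg: 1 - x */
float ps_invert(float x) { return 1.0f - x; }

/* -reg: -x */
float ps_negate(float x) { return -x; }

/* reg_bx2: 2 * (x - 0.5), which expands [0, 1] to [-1, 1] */
float ps_bx2(float x)    { return 2.0f * (x - 0.5f); }

/* Usage example: a few identities that follow from the table. */
void ps_modifiers_selfcheck(void)
{
    assert(ps_sat(1.5f) == 1.0f);
    assert(ps_bx2(0.0f) == -1.0f);
    assert(ps_bx2(1.0f) == 1.0f);
    assert(ps_bx2(0.75f) == 2.0f * ps_bias(0.75f));
}
```

Writing a Cg expression in one of these forms is what lets the compiler fold it into a modifier instead of spending a separate instruction on it.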
98. 1(float4 v, float4 low, float4 high)
    {
        return saturate((v - low) / (high - low));
    }

    float4 remapFrom01(float4 v, float4 low, float4 high)
    {
        return lerp(low, high, v);
    }

Don't forget vectorization here as well. If two float-valued functions have the same domain and range, you can pack them into two texture components of the same texture. Only one texture lookup is needed to load them both, and vectorized versions of the remap functions can be used to do the remapping more efficiently as well.

5. Use Data Types with Minimum Sufficient Precision

For profiles that support multiple precisions, a general rule of thumb is that if you can do a computation with fixed-precision variables, the computation is faster than if you use half, and if you use half, the computation is faster than if you use float. Although sometimes you need the range and extra precision that half and float offer, you should avoid using them unless necessary.

6. Use the Right Standard Library Routines for Shading Computations

If you're implementing a shading model (such as Lambertian, Blinn, or Phong), you'll generally be performing some dot product routines, clamping negative results to zero, and raising some of the values to a power (to compute a specular exponent). There are a few tricks that can speed up this process.

- Be sure to use the dot() function when computing dot products.
- If you need t
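The two remapping functions shown above are inverses of each other inside the [low, high] range. A scalar C sketch makes that easy to verify. The names remap_to01 and remap_from01 are hypothetical stand-ins for the Cg versions, saturate1 stands in for Cg's saturate(), and lerp(low, high, v) is written out as low + v * (high - low).

```c
#include <assert.h>

/* Scalar stand-in for Cg's saturate(): clamp to [0, 1]. */
static float saturate1(float x)
{
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Map v from [low, high] into [0, 1], clamping values outside the range. */
float remap_to01(float v, float low, float high)
{
    assert(high != low);   /* a zero-width range cannot be remapped */
    return saturate1((v - low) / (high - low));
}

/* Map v from [0, 1] back into [low, high]; same as lerp(low, high, v). */
float remap_from01(float v, float low, float high)
{
    return low + v * (high - low);
}
```

A value stored in a texture as remap_to01(x, low, high) is recovered after the lookup with remap_from01, as long as x was within [low, high] to begin with. Values outside that range are clamped and cannot be recovered.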
99. 22  operator  enhancements 247  precedence 247    operators  arithmetic 20  boolean 21    conditional 22  introduction 18    808 00504 0000 006    swizzle 22  write mask 22    P  packed  type modifier 230  parameter shadowing 73  parameters  modifiable function  passing 19  parameters in function definitions  syntax 227  pass 117  120  pass state 120  performance techniques  abs   324  avoiding matrix transposes 328  computation frequency 327  conditional code in fragment  programs 328  datatypes 325  dot   324  min   324  saturate   324  shading computations 326  swizzle 323  texture maps 324  vectorization 321  pixel program  defined 3  pixel shader  defined 3  position invariance 250  profile  arbfpl 263  arbvpl 256  fp20 283  fp30 274  ps_1_1  ps_1_2  ps_1_3 308  ps_2_0  ps_2_x 300    vp20 279  vp30 270  vs_1_1 304    vs20 vs2x 296  profile  defined 3  program   declaring 5   kinds of inputs 5  program profiles   fragment 252    335  NVIDIA          Cg Language Toolkit    vertex 250  programming model  GPU 2  ps 1 x profile 308  ps 2 0 profile 300  ps 2 x profile 300    R  ray traced refraction  pixel shader code example 172  sample shader 170  vertex shader code example 171  recursion  function 19  reflection vector 200  refraction  pixel shader code example 207  sample shader 205  vertex shader code example 206  release notes xvi  Renderman  relation toCg 221  reserved words 249  runtime  core Cg 49    S    sampler data type 11  sampler type  specification 230  samplers
100. 3

Table 49  Supported Standard Library Functions (continued)

    tex2Dproj(sampler2D, float4)
    tex3D(sampler3D, float3)
    tex3Dproj(sampler3D, float4)
    texCUBE(samplerCUBE, float3)
    texCUBEproj(samplerCUBE, float4)

Note: The non-projective texture lookup functions are actually done as projective lookups on the underlying hardware. Because of this, the w component of the texture coordinates passed to these functions from the application or vertex program must contain the value 1.

Texture coordinate parameters for projective texture lookup functions must have swizzles that match the swizzle done by the generated texture addressing instruction. While this may seem burdensome, it is intended to allow ps_1_x profile programs to behave correctly under other pixel shader profiles.

The swizzles required on the texture coordinate parameter to the projective texture lookup functions are listed in Table 50.

Table 50  Required Projective Texture Lookup Swizzles

    Texture Lookup Function    Texture Coordinate Swizzle
    tex1Dproj                  .xw/.ra
    tex2Dproj                  .xyw/.rga
    texRECTproj                .xyw/.rga
    tex3Dproj                  .xyzw/.rgba
    texCUBEproj                .xyzw/.rgba

Bindings

Manual Assignment of Bindings

The Cg compiler can determine bindings between texture units and uniform sampler parameters and texture coordinate inputs automatically. This automatic assi
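The requirement that w be 1 for non-projective lookups follows from how a projective lookup works. The C sketch below assumes the usual definition, in which the coordinates are divided by the last component before the texture is addressed. The passage itself does not spell this out, so treat it as an assumption. TexCoord2 and project_st are hypothetical names.

```c
#include <assert.h>

/* Hypothetical 2D texture coordinate after projection. */
typedef struct {
    float s;
    float t;
} TexCoord2;

/* Assumed projective step: divide s and t by the projective component q. */
TexCoord2 project_st(float s, float t, float q)
{
    TexCoord2 out;
    assert(q != 0.0f);
    out.s = s / q;
    out.t = t / q;
    return out;
}
```

With q equal to 1 the coordinates pass through unchanged, which is the behavior a non-projective lookup expects. Any other value of q rescales them, so a stray w left over from the vertex program would silently shift the lookup.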
101. 3    May be used only with uniform inputs with sampler* types.

    register(c0) to register(c7), C0 to C7    Constant register [0..7].

Binding Semantics for Varying Input/Output Data

The varying input binding semantics in the ps_1_x profiles are the same as the varying output binding semantics of the vs_1_1 profile.

Varying input binding semantics in the ps_1_x profiles consist of COLOR0, COLOR1, TEXCOORD0, TEXCOORD1, TEXCOORD2 and TEXCOORD3. These map to output registers in DirectX vertex shaders.

The valid binding semantics for varying input parameters in the ps_1_x profiles are summarized in Table 52.

Table 52  ps_1_x Varying Input Binding Semantics

    Binding Semantics Name                   Corresponding Data
    COLOR, COLOR0, COL, COL0                 Input color value v0
    COLOR1, COL1                             Input color value v1
    TEXCOORD0 to TEXCOORD3, TEX0 to TEX3     Input texture coordinates t0 to t3

Additionally, the ps_1_x profiles allow POSITION, FOG, PSIZE, TEXCOORD4, TEXCOORD5, TEXCOORD6, and TEXCOORD7 to be specified on varying inputs, provided these inputs are not referenced. This allows Cg programs to have the same structure specify the varying output of a vs_1_1 profile program and the varying input of a ps_1_x profile program.

The valid binding semantics for varying output parameters in the ps_1_x profile are summarized in Table 53.

Table 53  ps_1_x Varying Output Binding Semantics
102. 33  overloading by profile 226  standard library 33  texture map 38    G  geometric functions 38  GL_ARB_vertex 256  global variables 241  graphics hardware  evolution of xiii  grass   sample shader 202   vertex shader code example 202    H  half datatype 11  half type  specification 229    l  if statements 244  inputs  uniform 5  varying 5 6    int data type 11   int type  specification 229  integral type category 232  interfaces 125    J    Java  relation to Cg 221    L  language profiles  conceptof 3    M   mathematical functions 33  matrices  multiplying 20  matrices  support of 12  matrix palette skinning 217       334    808 00504 0000 006    NVIDIA    sample shader 217   vertex shader code example 218  matrix transposes and performance 328  melting paint   pixel shader code example 163   sample shader 161   vertex shader code example 161  min   for performance 324  miscellaneous operators 249  modifiable function parameters  passing 19  multipaint   pixel shader code example 167   sample shader 165   vertex shader code example 166    namespaces 237  numeric type category 232    O  object  Cg definition 224  open profile functions 227  OpenGL Cg runtime 73  error reporting 85  OpenGL application 82  parameter setting 74  OpenGL CGerror 85  OpenGL profiles  ARB fragment program 263  ARB vertex program 256  NV fragment program 274  NV register combiners 283  NV texture shader 283  NV vertex program 279  NV vertex program 2 0 270  operations  expressed differently from C 2
103. 4

    MaxTexIndirections <n>    where n >= 1 (default infinite)
    NumDrawBuffers <n>        where 1 <= n <= 4 (default 1)

OpenGL NV_vertex_program 3.0 Profile (vp40)

The vp40 profile is an extended version of the arbvp1 profile. It has all of the capabilities of arbvp1 and the added capability described in this section.

Vertex Texturing

The vp40 profile supports accessing texture maps in programs. Textures are available via the usual sampler* types and the tex*() standard library calls.

OpenGL NV_fragment_program 2.0 Profile (fp40)

The fp40 profile is an extended version of the arbfp1 profile. It has all of the capabilities of arbfp1 as well as the added capabilities described in this section.

Branching

The branching support in fp40 allows some if statements and looping constructs to be implemented with branching. In profiles such as fp30, conditional execution of code was always implemented with predicated instructions, and loops were always unrolled.

In the GeForce 6800 GPU, there is a cost associated with executing a branch in the fragment shading engine. As such, it is possible that the cost of the branch will outweigh the savings from skipping over a block of conditionally executed code or of executing an unrolled loop. (Please refer to the NVIDIA developer Web site for more information
104. 504 0000 006  NVIDIA    Introduction to the Cg Language    computations to be performed in slower  high precision arithmetic  If the C  behavior is desired  the constant should be explicitly typed to force the type  promotion  halfvar   2 0   is compiled as   float  halfvar    2 0       Cg uses the following type suffixes for constants   Q   f for float    O h for half       O x for fixed    Structures and Member Functions    Cg supports structures the same way C does  Cg adopts the C   convention  of implicitly performing a typedef based on the tag name when a struct is  declared     struct mystruct     5 sou EL ka    mystice sp    Define  s  as m Vastu      Structures may define member functions in addition to member variables   Member functions provide a convenient way of encapsulating helper  functions associated with the data in the structure  or as a means of  describing the behavior of a data object     Structure member functions are declared and defined within the body of the  structure definition     struct Foo    logic Wells  float helper float x     imenEwUE well ep 29     y     Member functions may reference their arguments or the member variables of  the structure in which they are defined  The result of referring to a variable  outside the scope of the enclosing structure  such as  global variables  is  undefined  instead  passing such variables as arguments to member  functions that need them is recommended     Member functions are invoked using the usual      
105. Arithmetic  Operators TOM C    uis eee RR ERE Chae Aes heehee ERR E SAU d 20  Multiplication  FUNCHONSS  cascos ett pops aperte a prec Eo Foe ob e uot gd 20  Vector COnStructOFr   3 acoso img RR CER ORR RC p Ra REA RA RR AN 21  Boolean and Comparison Operators     1 1 0    ee 21  Swizzle Operator serani eret eii eng d eyed cole tea Merten ew ate 22  Wite Mask Operdtotiin einander aci au gc a kg E ace  toned a aeg cut v a lec bees 22  Conditional Operator a p smpi piede E Vb qal acd Bop eda IR OR ORE US EOD OR d 22  Texture Lookups in Advanced Fragment Profiles          llis 23  clc PP                               m 24  Imc e TP 25  Passes  svo yk ROT OAD SEE Gc AEG EHE EEO aS EEUU XI EUER RA ADR KON 26  State ASS MENS cotas s gE SENCER EErEE ia E eR REA e SCR d 26  Parameters and Semantics i s sonrasi a tanir ea ras 27  Vertex and Fragment Programs usada 27  Textures and Samplers   24 3 dog hok RE RES LRG ERG ae EEE OE ORY qus 29  Interfaces and Unsized Arrays    s vues pec ur Vows ea RE eee ok C Ree EO we 29  Running Cg Programs On the CPU oca RR e Re Re eR IL Ue ea o RR 30  808 00504 0000 006    NVIDIA       Cg Language Toolkit       ANNOIN S ua  arras d Seb eo dede tede Sed BOE taa ra c qd c po ER dod d ong 32  More DS Tea cep  PEE 32  Cg Standard Library Functions          000 ccc e eee 33  Mathematical FUNCOMS    ou eee eke CEA Cet ra Ree eee Rl deu RY 33  Geomettic FUNCHONS   as qued wade ERE aah EQ Re E PD EO AWE Cp 38  Texture Map FUNCUONS    ac cee kbavav even Shee 
106. Assignments  141
Table 9.  Type Conversions  235
Table 10. Expanded Operators  247
Table 11. Vertex Output Binding Semantics  251
Table 12. Fragment Output Binding Semantics  252
Table 16. arbvp1 Uniform Input Binding Semantics  260
Table 17. arbvp1 Varying Input Binding Semantics  261
Table 18. arbvp1 Varying Output Binding Semantics  261
Table 19. arbfp1 Uniform Input Binding Semantics  265
Table 20. arbfp1 Varying Input Binding Semantics  265
Table 21. arbfp1 Varying Output Binding Semantics  265
Table 22. fp40 Compiler Branching Options  269
Table 23. vp30 Uniform Input Binding Semantics  271
Table 24. vp30 Varying Input Binding Semantics  272
Table 25. vp30 Varying Output Binding Semantics  272
Table 26. fp30 Uniform Input Binding Semantics  275
Table 27. fp30 Varying Input Binding Semantics  275
Table 28. fp30 Varying Output Binding Semantics  276
Table 29. vp20 Uniform Input Binding Semantics  280
Table 30. vp20 Varying Input Binding Semantics  281
Table 31. vp20 Varying Output Binding Semantics  281
Table 32. 
107. COORDO                       float3 3  MEPXCIOQEND ILS ARO e cis pares  Hoar BE COORD P EON SD des  loat N p WaxXCOOROSs   a  m ales Space       Si    y     float Position 2 POSITION    im projection space                   float4 Normal   COLORO    in tangent space   float4 LightVectorUnsigned   COLOR1    in tangent space  float3 TexCoord0   TEXCOORDO    float3 TexCoordl   TEXCOORD1    Hoa icto or EL XCOORDZ    in tangent space  float4 HalfAngleVector   TEXCOORD3    in tangent space          v2f main a2v IN     uniform float4x4 WorldViewProj   uniform float4 LightVector    in object space  uniform float4 EyePosition   in object space            v2   OUT        pass texture coordinates for     fetching the diffuse map  OUT  TexCoord0 xy   IN TexCoord xy        pass texture coordinates for     fetching the normal map  OUT TexCoordl xy   IN TexCoord xy        compute the 3x3 transform from     tangent space to object space  float3x3 objToTangentSpace   objToTangentSpace 0    IN T   objToTangentSpace 1  TEIN S ehe  objToTangentSpace  2  TEIN SINE                      transform normal from       808 00504 0000 006 193    NVIDIA          Cg Language Toolkit       object space to tangent space  OUT Normal xyz   0 5   mul objToTangentSpace  IN Normal     Oar       transform light vector from     object space to tangent space  float3 lightVectorInTangentSpace     mul  objToTangentSpace  LightVector xyz    OUT LightVector xyz   lightVectorInTangentSpace   OUT  LightVectorUnsigned xyz 
108. Car Paint 9   pixel shader code example 186  vertex shader code example 184  cfloat type  specification 229   Cg    brief tutorial 145   defined 1   language  introduction 1  necessity for xiv   standard library functions 33    Cg compiler  cgc exe 329  command line options 329  Cg runtime  API specific 72  benefits 44  compiling 46  context creation 46  Direct3D 85    NVIDIA    cgD3D9GetLastError   115  CGerror 114  debugging mode 112  error callbacks 116  error testing 115  error types 114  Direct3D  cgD3D9EnableDebugTracing   114  Direct3D  cgD3D9TranslateHRESULT   116  Direct3D expanded interface 98  cgD3D8LoadProgram   103  cgD3D8SetSamplerState   102  cgD3D9BindProgram   105  cgD3D9EnableParameterShadowing    103  cgD3D9GetDevice   98  cgD3D9GetLatestPixelProfile   105  cgD3D9GetLatestVertexProfile   105       Cg Language Toolkit    cgD3D9GetOptimalOptions   105  cgD3D9IsParameterShadowingEnable  d   103  cgD3D9IsProgramLoaded   104  cgD3D9LoadProgram   103  cgD3D9SetDevice   98  cgD3D9SetSamplerState   102  cgD3D9SetTexture   102  cgD3D9SetTextureWrapMode   102  cgD3D9SetUniform   100  cgD3D9SetUniformArray   101  cgD3D9SetUniformMatrix   101  cgD3D9SetUniformMatrixArray   10  T  cgD3D9UnloadProgam   104  Direct3D 8 application 109  Direct3D 9 application 106  Direct3D device 98  fragment program 106  lost devices 98  parameters 100  array 101  sampler 102  uniform     100  profile support 105  program executiion 103  vertex program 106  Direct3D HRESULT 114  Direct3D 
109. Car Paint Q cara neared RC ne RR RR 186  Basic Profile Sample Shaders         coooocononc eee 189  AnisotropicEighlfit suo Ss gue cinta wie Aa g AN Rod a B N O E 190  Descriptio eve 2L TP E REOR EAS A LL Ep A ORE EAA dd 190  Vertex Shader Source Code for Anisotropic Lighting                  o  oooo   191  Bump Dot3x2 Diffuse and Specular serrara sek prr tex iiy d eser ad Rage 192  DESCUPUON acer eae ee qe m pp Roo ep or eod mee a dg mee Cic 192  Vertex Shader Source Code for Bump Dot3X2           ssll ee 193  Pixel Shader Source Code for Bump Dot3x2           0 000  cee ee 194  B  mp Reflection Mappllii     5 agii mogsa ie ee CAR IS a RR tap Te Te cde AMM alg  ea 196  Descrip seres Sen PO me a ee ee Rer Sens par desunt e a 196  808 00504 0000 006 iii    NVIDIA          Cg Language Toolkit       Vertex Shader Source Code for Bump Reflection Mapping               00005 197  Pixel Shader Source Code for Bump and Reflection Mapping                   199  o A A AO 200  DESCAPHON mtr  200  Vertex Shader Source Code for Fresnel      0 0 0 0  0c 200  GaSe ce ow EHI ARO oo 202  DESCARTO pra inte edi a eie as sido qui we ie Rosen  o 202  Vertex Shader Source Code for Grass          liliis 202  Refraction ci exa Rx eR RR ECCE AAA AAA RO OR GC CH IRR 205  BDeSCHBLOTI rsrs arx eos genes RE aede CE n Sa EA Eee dee we d aua 205  Vertex Shader Source Code for Refraction         2    206  Pixel Shader Source Code for Refraction          l l  eee 207  Shadow MappIDg     32 35 x bred e dun pee
110. D(TRACE): Activating vertex shader for program 3
    cgD3D(TRACE): Setting shadowed parameters for program 3
    cgD3D(TRACE): Setting registers for uniform parameter 'ModelViewProj' of type float4x4
    cgD3D(TRACE): Setting constant registers [0 - 3] for parameter 'ModelViewProj' of type float4x4
    cgD3D(TRACE): Activating pixel shader for program 24
    cgD3D(TRACE): Setting shadowed parameters for program 24
    cgD3D(TRACE): Setting texture for sampler parameter 'BaseTexture'
    cgD3D(TRACE): Setting SamplerState[0].D3DTSS_MAGFILTER for sampler parameter 'BaseTexture'
    cgD3D(TRACE): Setting SamplerState[0].D3DTSS_MINFILTER for sampler parameter 'BaseTexture'
    cgD3D(TRACE): Setting SamplerState[0].D3DTSS_MIPFILTER for sampler parameter 'BaseTexture'
    cgD3D(TRACE): Deleting vertex shader for program 3
    cgD3D(TRACE): Deleting pixel shader for program 24

To use the debug DLL:

1. Link your application against cgD3D9d.lib (or cgD3D8d.lib) instead of cgD3D9.lib (or cgD3D8.lib).
2. Make sure that the application can find cgD3D9d.dll (or cgD3D8d.dll).
3. Turn on and turn off tracing of portions of your code using cgD3D9EnableDebugTracing():

    void cgD3D9EnableDebugTracing(CGbool enable);

Here is how you would enable debug tracing for part of the application code:

    cgD3D9EnableDebugTracing(CG_TRUE
111. E E EA To ER ARR rra LR A nre 260  OPUS ba ios o ciao es 262  OpenGL ARB Fragment Program Profile azbf  p1             sisse 263  Accessing  OpenGL State a oo sem ding er   ed a 263  808 00504 0000 006    NVIDIA          Cg Language Toolkit       MET SUB DOME a aca fered anita  dood tin a ra dete aR ale d 263  Resource EImits 5s pia ae 264  Language Constructs and Support    es 264  Bindlhgs   uiuit pem hehe Shae ahaa Rode Wea Bese aed Rone Seg cb hem ee Stans 265  anc CP  ree IM 266  OpenGL NV vertex program 3 0 Profile  vb40          leen 267  Vertex  Textulitigi  uox douce a ia da aed mab RR CU Re Rn 267  OpenGL NV fragment program 2 0 Profile   p40            isse 268  sud e PPP RPE CA CREM SERA E eee RRR EE 268  FACE Semantis viele made CaaS Ad A wale EE eed xa 269  OpenGL NV vertex program 2 0 Profile  vb30          see 270  Position MATERIE a sms tue dc deti ia dove een d goede Pidgin y bie Ay dak q dd 270  Language CONSHUCES   ccc peor ice Roe RC RC Cee A AAA Re Ro 270  Biridilngs   suscita dies a a sinus A aR ho ade tested Meese UE EU TIE  271  OpenGL NV fragment program Profile   p30              llle 274  Language Constructs and Suppor orig wurst es macetna sos ac ema  9 dled mundo 274  BIN S pei a al Waco ER a BSG 215  Pack and Unpack FUNGOS s a sii cue ceo iix Dee xk a EROR RR RR Ra RC URS 216  OpenGL NV vertex program 1 0 Profile  vP20           lise 279  AUCI Werer nse errar eRe 279  Position I nVallaflcBs ir REUS oak A AE Ka 279  Data Types veria REOR E RA A O
112. Table 7. Enable/Disable States (continued)

State Name                      Type  Requires
Enable                          bool  1.0
LightModelLocalViewerEnable     bool  1.0
LightModelTwoSideEnable         bool  1.0
LineSmoothEnable                bool  1.0
LineStippleEnable               bool  1.0
LogicOpEnable                   bool  1.0
MultisampleEnable               bool  1.3 or ARB_multisample
NormalizeEnable                 bool  1.0
PointSmoothEnable               bool  1.0
PointSpriteEnable               bool  2.0, ARB_point_sprite, or NV_point_sprite
PolygonOffsetFillEnable         bool  OpenGL 1.1
PolygonOffsetLineEnable         bool  1.1
PolygonOffsetPointEnable        bool  1.1
PolygonSmoothEnable             bool  1.0
PolygonStippleEnable            bool  1.0
RescaleNormalEnable             bool  1.2 or EXT_rescale_normal
SampleAlphaToCoverageEnable     bool  1.3 or ARB_multisample
SampleAlphaToOneEnable          bool  1.3 or ARB_multisample
SampleCoverageEnable            bool  1.3 or ARB_multisample
ScissorTestEnable               bool  1.0
StencilTestEnable               bool  1.0
TexGenSEnable[ndx]              bool  1.0; ndx must be greater than or equal to zero and
                                      less than the value of GL_MAX_TEXTURE_COORDS
TexGenTEnable[ndx]              bool  Same as TexGenSEnable
TexGenREnable[ndx]              bool  Same as TexGenSEnable
TexGenQEnable[ndx]              bool  Same as TexGenSEnable
Texture1DEnable[ndx]            bool  1.0; ndx must be greater than or equal to zero and
                                      less than the value of GL_MAX_TEXTURE_IMAGE_UNITS
Texture2DEnable[ndx]            bool  Same as Texture1DEnable
Texture3DEnable[ndx]
113. FAILED() and SUCCEEDED(). Simply testing the error against zero or D3D_OK is not sufficient, because there could be more than one success value.

As an added convenience, and for uniformity with the core runtime, the Direct3D runtime also supplies cgD3D9GetLastError(), which is analogous to cgGetLastError() but returns the last Direct3D runtime error of type HRESULT for which the FAILED() macro returns TRUE:

    HRESULT cgD3D9GetLastError();

The last error is always cleared immediately after the call.

The function cgD3D9TranslateHRESULT() converts an error of type HRESULT into a string:

    const char* cgD3D9TranslateHRESULT(HRESULT hr);

This function should be called instead of DXGetErrorDescription9(), because it also translates errors that the Cg Direct3D runtime generates.

Using Error Callbacks

Here is an example of a possible error callback that sorts out debug trace errors from core runtime errors and from Direct3D runtime errors:

    void MyErrorCallback()
    {
        CGerror error = cgGetError();
        if (error == cgD3D9DebugTrace) {
            // This is a debug trace output.
            // A breakpoint could be set here to step from one
            // debug output to the other.
            return;
        }
        char buffer[1024];
        if (error == cgD3D9Failed)
            sprintf(buffer, "A Direct3D error occurred: '%s'\n",
                    cgD3D9TranslateHRESULT(cgD3D9GetLastError()));
        else
            sprintf(buffer, "A Cg error occurred: '%s'\n", cgD3D9Tra
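The remark at the start of this excerpt, that comparing an HRESULT against zero is not enough, can be illustrated with a small self-contained sketch. It does not use the Windows headers or the Cg runtime: the type hresult_t, the HR_* constants, and the helper functions are stand-ins invented for this illustration, with values chosen to mirror S_OK, S_FALSE, and E_FAIL. The assumption is that success codes are non-negative and failure codes are negative, which is what the FAILED() and SUCCEEDED() macros test.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for HRESULT and three well-known codes, so the
   sketch compiles without <windows.h>. Values mirror S_OK, S_FALSE, E_FAIL. */
typedef int32_t hresult_t;
#define HR_S_OK    ((hresult_t)0)
#define HR_S_FALSE ((hresult_t)1)            /* a second success value */
#define HR_E_FAIL  ((hresult_t)-2147467259)  /* 0x80004005 as a signed 32-bit value */

/* Naive test: only the single value 0 counts as success. */
int naive_is_ok(hresult_t hr)
{
    return hr == HR_S_OK;
}

/* Sign test, the same idea as the SUCCEEDED() macro:
   every non-negative code is a success. */
int hr_succeeded(hresult_t hr)
{
    return hr >= 0;
}

/* Debug helper: abort if a call reported a failure code. */
void require_success(hresult_t hr)
{
    assert(hr_succeeded(hr));
}
```

With these definitions, HR_S_FALSE is rejected by naive_is_ok() but accepted by hr_succeeded(), which is the situation the text warns about.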
114. JD  s COLORI  return  Los  gt  QR T ww     2 suv

    float bar;

    technique NewSimpleFrag {
        pass {
            VertexProgram = NULL;
            FragmentProgram = compile arbfp1 main(2 * bar);
        }
    }

Here, the value 2 * bar is associated with the foo parameter of main(). When the value of bar is changed by the application, the value of foo in main() is set appropriately.

Finally, vertex or fragment programs may be assigned the value NULL in the state assignment. This signifies that no program should be used in this pass.

Textures and Samplers

CgFX makes it possible to define state related to textures in the effect file. The short effect file below shows an example:

    sampler2D samp = sampler_state {
        generateMipMap = true;
        minFilter = LinearMipMapLinear;
        magFilter = Linear;
    };

    float4 texsimple(uniform sampler2D sampler,
                     float2 uv : TEXCOORD0) : COLOR
    {
        return tex2D(sampler, uv);
    }

    technique TextureSimple {
        pass {
            FragmentProgram = compile arbfp1 texsimple(samp);
        }
    }

Interfaces and Unsized Arrays

CgFX also supports Cg's interfaces and unsized arrays features. Given an effect file with Cg programs that use these features, the compile statement can be used in two different ways to resolve the interfaces and unsized arrays so that the program can be compiled.

Consider the following example: a Light interface has been defined with SpotLight implementing t
115. L Profile Support

A convenient function is provided that gives the best available profile for vertex or fragment programs depending on the available OpenGL extensions:

    CGprofile cgGLGetLatestProfile(CGGLenum profileType);

Parameter profileType is equal to CG_GL_VERTEX or CG_GL_FRAGMENT. Function cgGLGetLatestProfile() may be used in conjunction with cgCreateProgram() or cgCreateProgramFromFile() to ensure that the best available vertex and fragment profiles are used for compilation. This allows you to make your application future-ready, because the Cg programs are automatically compiled for the best profiles that are available at runtime, even if these profiles did not exist at the time the application was written.

Another function that helps you obtain optimal compilation is cgGLSetOptimalOptions(). It sets implicit compiler arguments that are appended to the argument list passed to cgCreateProgram() or cgCreateProgramFromFile():

    void cgGLSetOptimalOptions(CGprofile profile);

OpenGL Program Execution

All programs must be loaded before they can be bound. To load a program, use cgGLLoadProgram():

    void cgGLLoadProgram(CGprogram program);

Binding a program only works if its profile is enabled. This is done by calling cgGLEnableProfile() with the program profile:

    void cgGLEnableProfile(CGprofile profile);

The binding itself is done using cgGLBindProgram():

    void 
116. If no call to cgCreateProgram() has been made for the context, cgGetLastListing() returns zero. Otherwise, it returns a string containing the output you would typically get from the command-line version of the compiler.

Program Attributes

To retrieve the context the program belongs to, use cgGetProgramContext():

    CGcontext cgGetProgramContext(CGprogram program);

Retrieving the profile the program has been compiled to is done with cgGetProgramProfile():

    CGprofile cgGetProgramProfile(CGprogram program);

The function pair cgGetProfile() and cgGetProfileString() allows you to find the correspondence between a profile enumerant and its corresponding string:

    CGprofile cgGetProfile(const char* profileString);
    const char* cgGetProfileString(CGprofile profile);

If the string passed to cgGetProfile() does not correspond to any profile, CG_PROFILE_UNKNOWN is returned.

The function cgGetProgramString() retrieves various strings related to the program depending on the value of the enumerant stringType:

    const char* cgGetProgramString(CGprogram program, CGenum stringType);

The variable stringType can have any of these values:

- CG_PROGRAM_SOURCE: The original Cg source program is returned.
- CG_PROGRAM_ENTRY: The main entry point of the Cg source program is returned.
- CG_PROGRAM_PROFILE: The profile string is returned.
- CG_COMPILED_PROGRAM: The resulting compiled program is returned.

Core Cg Parameters

Cg par
117. Table 8. sampler_state State Assignments (continued)

Name            Type     Valid Values                  Requires
                         MirrorClamp,                  OpenGL 1.2 or EXT_texture3D for WrapR;
                         MirrorClampToEdge,            1.2 or EXT_texture_edge_clamp for ClampToEdge;
                         MirrorClampToBorder           1.3 or ARB_texture_border_clamp for ClampToBorder;
                                                       1.4, ARB_texture_mirrored_repeat, or
                                                       IBM_texture_mirrored_repeat for MirroredRepeat;
                                                       EXT_texture_mirror_clamp or ATI_texture_mirror_once
                                                       for MirrorClamp or MirrorClampToEdge;
                                                       EXT_texture_mirror_clamp for MirrorClampToBorder
BorderColor     float4                                 OpenGL 1.0
CompareMode     int      None, CompareRToTexture       1.4 or ARB_shadow
CompareFunc     int      Never, Less, LEqual, Equal,   1.4 or ARB_shadow; 1.5 or EXT_shadow_funcs
                         Greater, NotEqual, GEqual,    for Never, Less, Equal, Greater, NotEqual,
                         Always                        or Always
DepthMode       int      Alpha, Intensity, Luminance   1.4 or ARB_depth_texture
GenerateMipMap  bool                                   1.4 or SGIS_generate_mipmap
LODBias         float                                  1.4
MinFilter       int      Nearest, Linear,              1.0
                         LinearMipMapNearest,
                         NearestMipMapNearest,
                         NearestMipMapLinear,
                         LinearMipMapLinear
MagFilter       int      Nearest, Linear               1.0
MaxMipLevel     float                                  1.2 or EXT_texture_lod
MaxAnisotropy   float                                  EXT_texture_filter_anisotropic
MinMipLevel     float                                  1.2 or EXT_texture_lod
Texture         texture  Reference to texture
                         parameter

OpenGL State Not Specifiable with State Assignments

By design, state assi
118. NV_texture_shader and NV_register_combiners Instruction Set Modifiers
Table 33. Supported Standard Library Functions
Table 34. Required Projective Texture Lookup Swizzles
Table 35. fp20 Uniform Binding Semantics
Table 36. fp20 Varying Input Binding Semantics
Table 37. fp20 Varying Output Binding Semantics
Table 38. fp20 Auxiliary Texture Functions
Table 39. vs_2_x Uniform Input Binding Semantics
Table 40. vs_2_x Varying Input Binding Semantics
Table 41. vs_2_x Varying Output Binding Semantics
Table 42. ps_2_x Uniform Input Binding Semantics
Table 43. ps_2_x Varying Input Binding Semantics
Table 44. ps_2_x Varying Output Binding Semantics
Table 45. vs_1_1 Uniform Input Binding Semantics
Table 46. vs_1_1 Varying Input Binding Semantics
Table 47. vs_1_1 Varying Output Binding Semantics
Table 48. ps_1_x Instruction Set Modifiers
Table 49. Supported Standard Library Functions
Table 50. Required Projective Texture Lookup Swizzles
119. ON     Tangent space VIEW   distance attenuation  O view   dElbexeuE  eean CEN       tanV z  viewP w         808 00504 0000 006 185  NVIDIA          Cg Language Toolkit       Vi NI       EWTANG          O tange  O  oia   O norma  O fresn    return    ine   mal  AL    O       normalize  View   normalize  View   normalize  View   FresnelApprox        Tangent  0     Tangent  1     Tangent  2       Pixel Shader Source Code for Car Paint 9       column     column    0  il      Sala 2       This shader is based on the Time Machine temporal rust                                                                                                          shader  Car paint data was measured by Cornell     University from samples provided by Ford Motor Company   EN  SPSS MO MA  float4 HPosition POSITION     coord position in window  float2 uv EXCOORDO     wavy fleckmap coords  clones Lale ame EXCOORD1     light pos  tangent space   float4 halfangle EXCOORD2     Blinn halfangle  floats  reflection  TEXCOORDS ARE vector  per vertex   float4 view EXCOORD4     view  tangent space   float3 tangent EXCOORD5     view tangent matrix  float3 binormal EXCOORD6      float3 normal EXCOORD7      float fresn COLORO   y      PIXEL SHADER  float4 main  VS_OUTPUT vert   uniform sampler2D WavyMap register s0    uniform samplerCUBE EnvironmentMap register  s1    uniform sampler2D PaintMap register  s2    uniform sampler2D FleckMap register  s3    uniform float Ambient   COLOR        NEWPAINTSPEC     UNUSED  S
120. OSITION      position  elijo spaca   float4 TexCoords   TEXCOORDO     base ST coordinates  float3 OPosition   TEXCOORD1     position  obj space   float3 Normal   TEXCOORD2     normal  eye space   float3 VPosition   TEXCOORD3     view pos  obj space   iloet3 7    TEXCOORD4     tangent  obj space   loe s  18   TEXCOORD5     binormal  obj space   floats STI   TEXCOORD6     normal  obj space   float4 LightVecO SER IDEE 2 ASIE Dre cie  elos sees                                   MultiPaintV2F main  appin IN   uniform float4x4 ModelViewProj   uniform float4x4 ModelViewIT   uniform float4x4 ModelViewl   uniform float4 TexRepeats   uniform float4 LightVec      eye space     MultiPaintV2F OUT   OUT HPosition   mul  ModelViewProj  IN Position         pass through object space position  OUD OPosit Lone ENE Os On 2727       transform normal to eye space  OUT Normal   normalize  mul  ModelViewIT  IN Normal   xyz            OUT TexCoords   IN UV   TexRepeats        pass through object space normal  tangent  binormal        166 808 00504 0000 006  NVIDIA    Advanced Profile Sample Shaders          OUT N   normalize  IN Normal xyz      QUE   1   IN  Temejsat o xS TP   OU Pee EN AB no a  gt       transform view pos  origin  to obj space  OUT VPosition   mul  ModelViewI  float4 0 0 0 1   xyz      transform light vector to obj space   OUT LightVecO   mul  ModelViewI  LightVec       return OUT     Pixel Shader Source Code for MultiPaint    T     define WHITE half4 1 0h 1 0h 1 0h 1 0h            
121. OpenGL is a trademark of SGI   Other company and product names may be trademarks of the respective companies with which they  are associated     Updates  Any changes  additions  or corrections will be posted at the NVIDIA Cg Web site     http     developer nvidia com  Cg    Refer to this site often to keep up on the latest changes and additions to the Cg language     Copyright     2002   2005 NVIDIA Corporation  All rights reserved     NVIDIA     NVIDIA Corporation  2701 San Tomas Expressway  Santa Clara  CA 95050  www nvidia com       Foreword  asia aaa aa xiii    Preface iaa a o AAA CN a xv  Release Notes ies se ERREUR Keene E RUE NEQOE BI xvi  Online  Updates  a RN xvi   Introduction   to the Cg Language   545 eode ru Rx A KR RI RR Ra ad E 1  Th   Cg Language is creed ih Ee P TU Rob Paco REPRE RP E QUEE Rob dtd 2  Cg s Programming Model for GPUs  soci const 0 0000 2  Cg Language PMOTIES ecards 3  Declaring    Programs IMCO esep inie Rag a a A qo adobe 5  Program Inputs and OUtpUls     s s e a e n oboe Roe E ER Gees 5  Working  With Data 3 2  2x gioid e ga E Rex g s EORR E dnbie do ao 11  Basie D  ta TYPES cea cds id ia oia Sande 11  TYPE CONVETSIONE   souci orangia tee eR IRR it mc IRIS ART o WUE RAO EAO Ree 12  Structures and Member FUNCOMS coi o dae eS 13  AMS P  PIERDE 14  Statements and OPEO S sapa xac ones kasd ee dra bbc obe qq a Py Roe darus 18  CONTEOLEIOW asi ss eee LAER REE A Ca CR Ra OR e nd 19  Function Definitions and Function Overloading            lille 19  
122. PEC POWER  GLOSSINESS      FLECK SPEC POWER    float4 NewPaintSpec   OW  648 08  Does Bo je  float3 ClearCoat   099 0 59 vi  Oy dibaie Teo  float3 FleckColor   1 S 1 05  soe is  float3 WavyScale eeu Une  E VU Ps          186    NVIDIA    808 00504 0000 006    Advanced Profile Sample Shaders       Tangent space LIGHT vector  float3 L   normalize vert light             Tangent space HALF ANGLE vector  float3 H   normalize vert halfangle xyz                  Tangent space VIEW vector  float3 V   normalize vert view xyz    float v_dist   vert view w           Tangent space WAVY_NORMAL          float3 wavyN    float3 tex2D WavyMap  vert uv  2 1   wavyN   normalize  wavyN WavyScale        PAINT       A normal map map could be loaded here instead if      we wanted more detail  In this case we have a      uniform tangent space normal  0 0 1    llore ig ol jl   Mas mE   elote mella   kozy   float3 paint color    float3 tex2D  PaintMap   tlosciez  um  cl 1  mel 1m    p          SPECULAR POWER   use a saturated diffuse term     to clamp the backlighting  n_d_h   saturate  n_d_1 4   pow n_d_h  NewPaintSpec y                        REFLECTION ENVIRONMENT      Reflect view vector about wavy normal and bring     to view space   float3 R   reflect  V  wavyN     R   R x vert tangent   R y vert binormal    R z vert normal   float3 reflect_color                          float3 texCUBE  EnvironmentMap  R            FLECKS      Load random 3 vector flecks from fleck map     Reduce tiling artifact
123. PLEVEL D3DTSS_MAXANISOTROPY

Parameter value is a value appropriate for the corresponding type. Here is an example of how to use this function:

    cgD3D8SetTextureStageState(parameter, D3DTSS_MAGFILTER, D3DTEXF_LINEAR);

The texture wrap mode is set using

    HRESULT cgD3D9SetTextureWrapMode(CGparameter parameter, DWORD value);

The input value is either zero or a combination of D3DWRAP_U, D3DWRAP_V, and D3DWRAP_W. Here is an example of how to use this function:

    cgD3D9SetTextureWrapMode(parameter, D3DWRAP_U | D3DWRAP_V);

Parameter Shadowing

Parameter shadowing can be enabled or disabled on a per-program basis:

- When loading the program (see "Expanded Interface Program Execution")
- At any time using

      HRESULT cgD3D9EnableParameterShadowing(CGprogram program, CGbool enable);

  for which enable should be set to CG_TRUE to enable parameter shadowing and to CG_FALSE to disable it.

To know if parameter shadowing is enabled for a given program, use

    CGbool cgD3D9IsParameterShadowingEnabled(CGprogram program);

This function returns CG_TRUE if parameter shadowing is enabled for program.

Expanded Interface Program Execution

To load a program in Direct3D 9, use cgD3D9LoadProgram():

    HRESULT cgD3D9LoadProgram(CGprogram program,
                              CG_BOOL parameterShadowingEnabled,
                              DWORD assembleFlags);

This function assembles the result of the compilation 
124. Plane float 4 Same as   ndx  TexGenSEyePlane  TexGenQObjectPlane float4 Same as   ndx  TexGenSEyePlane  TexturelD ndx  sampler1D OpenGL 1 0  ndx must be  greater or equal to zero and  less than the value of  GL MAX TEXTURE IMAGE  UNITS  Texture2D ndx  sampler2D Same as TexturelD  Texture3D  ndx  sampler3D 1 2 or EXT texture3D  ndx must be greater or  equal to zero and less than  the value of  GL MAX TEXTURE IMAGE  UNITS  TextureRectangle ndx   samplerRECT ARB texture rectangle                 EXT texture rectangle   Apple   or   NV texture rectangle   ndx must be greater or  equal to zero and less than  the value of   GL MAX TEXTURE IMAGE  UNITS             808 00504 0000 006    NVIDIA    137          Cg Language Toolkit    Table 6     CgFX OpenGL State Manager States  continued        State Name    Type Valid Enumerants    Requires       TextureCubeMap  ndx     TextureEnvColor  ndx     samplerCUBE    float4    1 3   ARB_texture_cube_map   or  EXT_texture_cube_ map   ndx must be greater or  equal to zero and less than  the value of   GL MAX TEXTURE IMAGE  UNITS       OpenGL 1 0  ndx must be  greater or equal to zero and  less than the value of   GL MAX TEXTURE UNITS       TextureEnvMode  ndx     int Modulate  Decal   Blend  Replace   Add    1 0  1 3    ARB texture env add Oor  EXT texture env addfor  Add  ndx must be greater or  equal to zero and less than  the value of   GL MAX TEXTURE UNITS       VertexEnvParameter   ndx     float4    ARB vertex program   ndx must be greate
125. PointSize float 1 0  PointSizeMin float 1 4   ARB point parameters   or  EXT point parameters  134 808 00504 0000 006    NVIDIA       Introduction to CgFX                                                 Table 6    CgFX OpenGL State Manager States  continued   State Name Type Valid Enumerants Requires  PointSizeMax float OpenGL 1 4   ARB point parameters  or  EXT point parameters  PointSpriteCoordOrigin int LowerLeft  2 0  UpperLeft  PointSpriteCoordReplace bool 2 0  ARB point sprite    ndx  Or NV point sprite  ndx  must be greater than or  equal to zero and less than  the value of  GL MAX TEXTURE COORDS  PointSpriteRMode int Zero  R  S NV point sprite  PolygonMode int2 Front  Back  1 0  FrontAndBack   Point  Line  Fill  PolygonOffset float2 1 1  ProjectionMatrix float4x4 1 0  Scissor int4 1 0  ShadeModel int Flat  Smooth 1 0  StencilFunc int3 Never  Less  1 0  LEqual  Equal   Greater  NotEqual   GEqual  Always  StencilMask int 1 0  Stencilop int3 Keep  Zero  1 0  Replace  Incr   Decr  Invert   IncrWrap  DecrWrap  StencilFuncSeparate int4 Front  Back  2 0 or          FrontAndBack   Never  Less   LEqual  Equal   Greater  NotEqual   GEqual  Always          EXT stencil two side          808 00504 0000 006    NVIDIA    135          Cg Language Toolkit                               Table 6    CgFX OpenGL State Manager States  continued   State Name Type Valid Enumerants Requires  StencilMaskSeparate int2 Front  Back  OpenGL 2 0 or  FrontAndBack EXT stencil two side  StencilOpSepara
126. RESULT cgD3D9SetUniformMatrixArray(CGparameter parameter,
                                        DWORD startIndex, DWORD numberOfElements,
                                        const D3DMATRIX* matrices);

The parameters startIndex and numberOfElements have the same meanings as for cgD3D9SetUniformMatrix().

The upper-left portion of each matrix of the array matrices is extracted to fit the size of the element of the array parameter parameter. Array matrices is assumed to have numberOfElements elements.

Setting Sampler Parameters

You assign a Direct3D texture to a sampler parameter using

    HRESULT cgD3D9SetTexture(CGparameter parameter,
                             IDirect3DBaseTexture9* texture);

To set the sampler state in the Direct3D 9 Cg runtime, use

    HRESULT cgD3D9SetSamplerState(CGparameter parameter,
                                  D3DSAMPLERSTATETYPE type, DWORD value);

Parameter type is any of the D3DSAMPLERSTATETYPE enumerants and parameter value is a value appropriate for the corresponding type. Here is an example of how to use this function:

    cgD3D9SetSamplerState(parameter, D3DSAMP_MAGFILTER, D3DTEXF_LINEAR);

To set the texture stage state in the Direct3D 8 Cg runtime, use

    HRESULT cgD3D8SetTextureStageState(CGparameter parameter,
                                       D3DTEXTURESTAGESTATETYPE type, DWORD value);

Parameter type must be one of the following values:

    D3DTSS_ADDRESSU     D3DTSS_ADDRESSV
    D3DTSS_ADDRESSW     D3DTSS_BORDERCOLOR
    D3DTSS_MAGFILTER    D3DTSS_MINFILTER
    D3DTSS_MIPFILTER    D3DTSS_MIPMAPLODBIAS
    D3DTSS_MAXMI
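The statement above that "the upper-left portion of each matrix is extracted to fit the size of the element" can be pictured with a short sketch. This is not the Cg Direct3D runtime's code: the function name extract_upper_left is hypothetical, and the sketch assumes a row-major 4x4 source stored as 16 floats and a row-major destination, purely for illustration.

```c
#include <assert.h>

/* Copy the upper-left rows x cols block of a row-major 4x4 matrix (src)
   into dst, which must hold rows * cols floats, also row-major.
   Hypothetical helper; layout is an assumption of this sketch. */
void extract_upper_left(const float src[16], int rows, int cols, float *dst)
{
    assert(rows >= 1 && rows <= 4);
    assert(cols >= 1 && cols <= 4);
    for (int r = 0; r < rows; ++r) {
        for (int c = 0; c < cols; ++c) {
            dst[r * cols + c] = src[r * 4 + c];
        }
    }
}
```

For a parameter element declared as a 3x2 matrix, for example, only the first two entries of each of the first three rows of the 4x4 source would be used under this assumption.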
127. RGUERCROROR AR RA 279  Sp ae  280  OpenGL NV texture shader and NV register combiners Profile  fp20              283  OVeIVIGW c expe AA al Me Ade eed Na eU  tap EOE oh ER 283  sca cd                                    283  uno C  284  Language Constr  cts arid SUpDOFt sspe re ur Pewee a Parr 285  Standard Library EURCEIORS ci ace ee eat RR op BOR RC RR a RU RO Re ee OR 286  Sp aee C                                                   288  Auxiliary Texture FubcHorls xut sie edet de ime grind UIS ap aro RR pci AE e OR 290  Examples   s sacs m RR a PEER KL Na cm UE Mac RON 295  DirectX Vertex Shader 2 x Profiles  vs 2              se II 296  Or                                       296  Memory    eus dunk PA eee RO ER EPEN G ADEE PET CRE GE ae ee 296  Statements and Operators  2 vrs rar wie ee DRE AES LEE EASE dd 297  Data TYPES wise cece ee cee ecb OO ERORROEGUPR CRAY AR CRESS CREE OES RRS 297  USIN GLASS  ace trud ports eder a cs O O E edad andes a 297  BIMGIINGS ac aces ia ea id da Rara Bonn de Sons oa mae eR 298  E ceeded Resa 299  DirectX Pixel Shader 2 x Profiles ps 2      0    cee eee eee 300  A E peddle tata soo ide eats A Sorde 300  Language Constructs and SUPPO   xu ose  tera a ads 301  e amimga Eo EORR E SOROR Se mead A aud de meti a ed Do 302  ORT OMS   ccn dre biti Rodin ea en tao harta do aed 303  vi 808 00504 0000 006    NVIDIA    Limitations inthis Implementation pan le 303       DirectX Vertex Shader 1 1 Profile vs 1 1            s RR 304  Memory RestriCLIOns   
128. ResourceToDeclUsage   90  cgD3D8ValidateVertexDeclaration    88  cgD3D9ResourceToDeclUsage   90  cgD3D9ValidateVertexDeclaration    88  Direct3D 8 application 95  Direct3D 9 application 92  fragment program 92  type retrieval 91  vertex declaration 85  vertex declaration for Direct3D 8 86  vertex declaration for Direct3D 9 86  vertex program 91    Direct3D debug DLL  using 113  DirectX pixel shader 1 x profiles 308  DirectX pixel shader 2 x profile 300  DirectX vertex shader 1 1 profile 304          Cg Language Toolkit    DirectX vertex shader 2 x profile 296  dot   for performance 324  dx8ps profile  deprecated 308    E  effect 117  Effect parameter 118  effect parameters 121  evaluating Cg programs 127  explicit casts  compile time 235  numeric 236  numeric matrix 236  numeric vector 236    F  fixed datatype 11  fixed type  specification 229  float data type 11  float type  specification 229  floating type category 232  for statements 244  fp20 profile 283  fp30 profile 274  fragment profiles  texture lookups 23  fragment program 121  predefined output structures 42  varying output 9  fragment program profiles 252  OpenGL ARB 263  OpenGL NV fragment program 274  fragment program  defined 3  fresnel 200  sample shader 200  vertex shader code example 200  function  calls 228  multiplying 20  open profile 227  function definitions  introduction 19  function overloading 240  introduction 19  functions    debugging 41   declaring 226   derivative 41   geometric 38  mathematical 
129. Specular and diffuse lighting are computed per vertex in a Cg program, along with a view depth parameter, which is computed using the view vector, surface normal, and the depth of the thin film on the surface of the object. The view depth is then perturbed in an ad hoc manner per fragment by the underlying decal texture, and is then used to look up into a 1D texture containing the precomputed destructive interference for red, green, and blue wavelengths given a particular view depth. This interference value is then used to modulate the specular lighting component of the standard lighting equation.

Fig. 11 Example of Thin Film Effect

Vertex Shader Source Code for Thin Film Effect

    // define inputs from application
    struct a2v {
        float4 Position : POSITION;
        float3 Normal   : NORMAL;
    };

    // define outputs from vertex shader
    struct v2f {
        float4 HPOS      : POSITION;
        float4 diffCol   : COLOR0;
        float4 specCol   : COLOR1;
        float2 filmDepth : TEXCOORD0;
    };

    v2f main(a2v IN,
             uniform float4x4 WorldViewProj,
             uniform float4x4 WorldViewIT,
             uniform float4x4 WorldView,
             uniform float4 LightVector,
             uniform float4 FilmDepth,
             uniform float4 EyeVector)
    {
        v2f OUT;

        // transform position to clip space
        OUT.HPOS = mul(WorldViewProj, IN.Position);

        float4 tempnorm = float4(IN.Normal, 0.0);

        // transform normal from model space to view space
        float3 normalVec = mul(WorldViewIT
130. TYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
          cgD3D9ResourceToDeclUsage(cgGetParameterResource(position)),
          cgGetParameterResourceIndex(position) },
        { 0, 3 * sizeof(float),
          D3DDECLTYPE_D3DCOLOR, D3DDECLMETHOD_DEFAULT,
          cgD3D9ResourceToDeclUsage(cgGetParameterResource(color)),
          cgGetParameterResourceIndex(color) },
        { 1, 4 * sizeof(float),
          D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT,
          cgD3D9ResourceToDeclUsage(cgGetParameterResource(texCoord)),
          cgGetParameterResourceIndex(texCoord) },
        D3DDECL_END()
    };

    DWORD declaration[] = {
        D3DVSD_STREAM(0),
        D3DVSD_REG(cgD3D8ResourceToInputRegister(
                       cgGetParameterResource(position)),
                   D3DVSDT_FLOAT3),
        D3DVSD_REG(cgD3D8ResourceToInputRegister(
                       cgGetParameterResource(color)),
                   D3DVSDT_D3DCOLOR),
        D3DVSD_STREAM(1),
        D3DVSD_SKIP(4),
        D3DVSD_REG(cgD3D8ResourceToInputRegister(
                       cgGetParameterResource(texCoord)),
                   D3DVSDT_FLOAT2),
        D3DVSD_END()
    };

The size specified as the second argument of the D3DVSD_REG() macro call of a Direct3D 8 declaration does not need to match the size of the corresponding parameter for the vertex declaration to be valid. Those sizes are specified to describe how the data is laid out in the streams, not to perform any type checking with the shader code. The data referr
131. Table 6    CgFX OpenGL State Manager States  continued   State Name Type Valid Enumerants Requires  BlendFuncSeparate int4 Zero  One  OpenGL 1 4 or   rgb_src  DestColor  EXT blend func separate  rgb dst  OneMinusDestColor  1 4 or NV  blend square  a src  SrcAlpha  for SrcColor or  a_dst  OneMinusSrcAlpha  OneMinusSrcColor for  DstAlpha  rgb_src  and DstColor or  OneMinusDstAlpha  OneMinusDstColor for  SrcAlphaSaturate  rgb_dst  SrcColor   OneMinusSrcColor   ConstantColor   OneMinusConstantColor   ConstantAlpha   OneMinusConstantAlpha  BlendEquation int FuncAdd  1 4 or ARB_imaging  or  FuncSubtract  Min   EXT blend subtract for  Max  LogicOp FuncSubtract Or  FuncReverseSubtract   Or EXT blend minmax for  Min Or Max  or  EXT_blend_logic_op for  LogicOp  BlendEquationSeparate  int2  rgb   FuncAdd  EXT_blend_equation_  alpha  FuncSubtract  Min    separate  or 1 4   Max  LogicOp ARB_imaging  Or  EXT_blend_subtract for  FuncSubtract or  FuncReverseSubtract  Ol  1 4  ARB_imaging  or  EXT_blend_minmax for  Min Or Max  or  EXT_blend_logic_op for  LogicOp  BlendColor float4 1 4  ARB_imaging  or  EXT blend color  ClearColor float4 1 0  ClearStencil int 1 0  ClearDepth float 1 0                      808 00504 0000 006    NVIDIA    131             Cg Language Toolkit                                                 Table 6    CgFX OpenGL State Manager States  continued   State Name Type Valid Enumerants Requires  ClipPlane  ndx  float4 OpenGL 1 0  ndx must be  greater than or equal to zero
132. The runtime gives you the option of modifying the values of your program parameters. The first step is to get a handle to the parameter:

    CGparameter myParameter = cgGetNamedParameter(program, "myParameter");

The string "myParameter" is the name of the parameter as it appears in the program source code.

The second step is to set the parameter value. The function used depends on the parameter type.

Here is an example in OpenGL:

    cgGLSetParameter4fv(myParameter, value);

Here is the same example in Direct3D:

    cgD3D9SetUniform(myParameter, value);

Numeric parameters may also be set using core Cg runtime calls, such as

    cgSetParameterValuefr(myParameter, 4, value);

These function calls assign the four floating-point values contained in the array value to the parameter myParameter, which is assumed to be of type float4.

In both APIs, there are variants of these calls to set matrices, arrays, textures, and texture states. The core Cg runtime provides variants of these calls to set the value of numeric parameters, including scalars, vectors, arrays, and structures. The graphics API-specific runtimes must be used to set API-specific values, such as sampler handles.

Executing a Program

Before you can execute a program in OpenGL, you must enable its corresponding profile:

    cgGLEnableProfile(CG_PROFILE_ARBVP1);

In Direct3D, nothing explicitly needs to be done to 
133. X greater than

Unlike C, Cg allows all boolean operators to be applied to vectors, in which case boolean operations are performed in an elementwise fashion. The result of such a boolean expression is a vector of bool elements with as many elements as the two source vectors. Also unlike C, the logical AND (&&) and logical OR (||) operators cannot be used for short-circuiting evaluation: side effects of both sides of these expressions always occur, regardless of the value of the boolean expression.

Swizzle Operator

Cg has a swizzle operator (.) that allows the components of a vector to be rearranged to form a new vector. The new vector need not be the same size as the original vector: elements can be repeated or omitted. The characters x, y, z, and w represent the first, second, third, and fourth components of the original vector, respectively. The characters r, g, b, and a can be used for the same purpose. Because the swizzle operator is implemented efficiently in the GPU hardware, its use is usually free.

The following are some examples of swizzling:

    float3(a, b, c).zyx yields float3(c, b, a)
    float4(a, b, c, d).xxyy yields float4(a, a, b, b)
    float2(a, b).yyxx yields float4(b, b, a, a)
    float4(a, b, c, d).w yields d

The swizzle operator can also be used to create a vector from a scalar:

    a.xxxx yields float4(a, a, a, a)

The precedence of th
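The swizzle rules quoted above can be mimicked on the CPU to check what a given pattern selects. The following C sketch is an illustration only and is not part of Cg or its runtime; the helper names `component_index` and `swizzle` are hypothetical. It accepts both the xyzw and rgba character sets, and for brevity it does not reject patterns that mix the two sets.

```c
#include <stddef.h>
#include <string.h>

/* Map a swizzle character to a component index, or -1 if it is invalid.
   x/y/z/w and r/g/b/a both name components 0..3. */
static int component_index(char c)
{
    switch (c) {
    case 'x': case 'r': return 0;
    case 'y': case 'g': return 1;
    case 'z': case 'b': return 2;
    case 'w': case 'a': return 3;
    default:            return -1;
    }
}

/* Build dst from src according to pattern (for example "zyx" or "xxyy").
   src holds src_len components (1..4); dst receives strlen(pattern) values.
   Returns the number of components written, or 0 if the pattern is empty,
   longer than four characters, or names a component src does not have. */
static size_t swizzle(const float *src, size_t src_len,
                      const char *pattern, float *dst)
{
    size_t n = strlen(pattern);
    if (n == 0 || n > 4)
        return 0;
    for (size_t i = 0; i < n; ++i) {
        int idx = component_index(pattern[i]);
        if (idx < 0 || (size_t)idx >= src_len)
            return 0;
        dst[i] = src[idx];
    }
    return n;
}
```

Calling `swizzle` on a three-component source with the pattern "zyx" reverses it, and calling it on a single scalar with "xxxx" replicates that scalar four times, matching the examples in the text.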
134. _shader instruction combinations

Texture Function:
    texCUBE_reflect_dp3x3(uniform samplerCUBE tex, float4 strq, float4 intermediate_coord1, float4 intermediate_coord2, float4 prevlookup)

Description: Performs the following:

    float3 E = float3(intermediate_coord2.w, intermediate_coord1.w, strq.w);
    float3 N = float3(dot(intermediate_coord1.xyz, prevlookup.xyz),
                      dot(intermediate_coord2.xyz, prevlookup.xyz),
                      dot(strq.xyz, prevlookup.xyz));
    return texCUBE(tex, 2 * dot(N, E) / dot(N, N) * N - E);

where strq are texture coordinates associated with sampler tex, prevlookup is the result of a previous texture operation, intermediate_coord1 are texture coordinates associated with the n-2 texture unit, and intermediate_coord2 are texture coordinates associated with the n-1 texture unit.

This function can be used to generate the dot_product_reflect_cube_map_eye_from_qs NV_texture_shader instruction combination.

Table 38. fp20 Auxiliary Texture Functions (continued)

Texture Function:
    texCUBE_reflect_eye_dp3x3(uniform samplerCUBE tex, float3 str, float4 intermediate_coord1, float4 intermediate_coord2, float4 prevlookup, uniform float3 eye)

Description: Performs the following:

    float3 N = float3(dot(intermediate_coord1.xyz, prevlookup.xyz),
                      dot(intermediate_coord2.xyz, prevlookup.xyz),
                      dot(coords.xyz, prevlookup.xyz));
    return texCUBE(tex, 2 * dot(N, E) / dot(N
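The lookup direction used by these functions is the reflection of E about N, written as 2 * dot(N, E) / dot(N, N) * N - E. A minimal C sketch of just that arithmetic follows; the type `vec3` and the helpers `dot3` and `reflect_about_normal` are hypothetical names introduced for illustration, and the sketch performs no texture access. N is assumed to be non-zero.

```c
/* Three-component vector, standing in for Cg's float3. */
typedef struct { float x, y, z; } vec3;

static float dot3(vec3 a, vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* R = 2 * dot(N, E) / dot(N, N) * N - E.
   Dividing by dot(N, N) means N does not have to be unit length. */
static vec3 reflect_about_normal(vec3 n, vec3 e)
{
    float k = 2.0f * dot3(n, e) / dot3(n, n);
    vec3 r = { k * n.x - e.x, k * n.y - e.y, k * n.z - e.z };
    return r;
}
```

Because of the division by dot(N, N), scaling N leaves the result unchanged, which is why the formula can be applied to the unnormalized N built from the three dot products.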
135. age Specification

correct precision and range, but is not required to produce bit-exact results. It is recommended that compilers provide an option either to forbid these optimizations or to guarantee that they are made in bit-exact fashion.

Operator Precedence

Cg uses the same operator precedence as C for operators that are common between the two languages.

The swizzle and write-mask operators (.) have the same precedence as the structure member operator (.) and the array index operator ([]).

Operator Enhancements

The standard C arithmetic operators (+, -, *, /, %, unary -) are extended to support vectors and matrices. Sizes of vectors and matrices must be appropriately matched, according to standard mathematical rules. Scalar-to-vector promotion (see "Smearing of Scalars to Vectors" on page 237) allows relaxation of these rules.

Table 10. Expanded Operators

    Operator                          Description
    M[n][m]                           Matrix with n rows and m columns
    V[n]                              Vector with n elements
    -V[n] -> V[n]                     Unary vector negate
    -M[n] -> M[n]                     Unary matrix negate
    V[n] * V[n] -> V[n]               Componentwise *
    V[n] / V[n] -> V[n]               Componentwise /
    V[n] % V[n] -> V[n]               Componentwise %
    V[n] + V[n] -> V[n]               Componentwise +
    V[n] - V[n] -> V[n]               Componentwise -
    M[n][m] * M[n][m] -> M[n][m]      Componentwise *
    M[n][m] / M[n][m] -> M[n][m]      Componentwise /
    M[n][m] % M[n][m
136. agment programs

Fragment programs are required to declare and set a vector output that uses the COLOR semantic. This value is usually used by the hardware as the final color of the fragment. Some fragment profiles also support the DEPTH output semantic, which allows the depth value of the fragment to be modified, and some support additional color outputs for hardware that supports multiple render targets (MRTs).

As with vertex programs, fragment programs may return their outputs in the body of a structure. However, it is usually more convenient to either declare outputs as out parameters:

    void main(...,
              out float4 color : COLOR,
              out float depth : DEPTH)
    {
      ...
      color = diffuseColor * ...;
      depth = ...;
    }

or to associate a semantic with the return value of the shader:

    float4 main(...) : COLOR
    {
      ...
      return diffuseColor * ...;
    }

The following example shows a simple vertex program that calculates diffuse and specular lighting. Two structures for varying data, appin and vertout, are also declared. Don't worry about understanding exactly what the program is doing: the goal is simply to give you an idea of what Cg code looks like. "A Brief Tutorial" on page 145 explains this shader in detail.

    // Define inputs from application
    struct appin
    {
      float4 Position : POSITION;
      float4 Normal   : NORMAL;
    };

    // Define outputs fr
137. alues are propagated do not appear as lvalues within any kind of control statement (if, for, or while) or ?: construct.

Profiles may choose to support more general constant propagation techniques, but such support is not required.

- Profiles may optionally support fully general for and while loops.

New Vector Operators

These new operators are defined for vector types:

- Vector construction operator: <typeID>(...)

  This operator builds a vector from multiple scalars or shorter vectors:

      float4(scalar, scalar, scalar, scalar)
      float4(float3, scalar)

- Matrix construction operator: <typeID>(...)

  This operator builds a matrix from multiple rows. Each row may be specified either as multiple scalars or as any combination of scalars and vectors with the appropriate size:

      float3x3(1, 2, 3, 4, 5, 6, 7, 8, 9)
      float3x3(float3, float3, float3)
      float3x3(1, float2, float3, float3, 1, 1, 1)

- Swizzle operator: (.)

      a = b.xxyz;  // A swizzle operator example

  - At least one swizzle character must follow the operator.
  - There are two sets of swizzle characters, and they may not be mixed. Set one is xyzw = 0123, and set two is rgba = 0123.
  - The vector swizzle operator may only be applied to vectors or to scalars.
  - Applying the vector swizzle operator to a scalar gives the same result as applying the operator to a vector of length one. Thus, myscalar.xxx and
138. ameterArray3d(CGparameter parameter, long startIndex, long numberOfElements, double* array);
    cgGLGetParameterArray4f(CGparameter parameter, long startIndex, long numberOfElements, float* array);
    cgGLGetParameterArray4d(CGparameter parameter, long startIndex, long numberOfElements, double* array);

Similar functions exist to set the values of arrays of uniform matrix parameters:

    void cgGLSetMatrixParameterArrayfr(CGparameter parameter, long startIndex, long numberOfElements, const float* array);
    void cgGLSetMatrixParameterArrayfc(CGparameter parameter, long startIndex, long numberOfElements, const float* array);
    void cgGLSetMatrixParameterArraydr(CGparameter parameter, long startIndex, long numberOfElements, const double* array);
    void cgGLSetMatrixParameterArraydc(CGparameter parameter, long startIndex, long numberOfElements, const double* array);

and to query those values:

    void cgGLGetMatrixParameterArrayfr(CGparameter parameter, long startIndex, long numberOfElements, float* array);
    void cgGLGetMatrixParameterArrayfc(CGparameter parameter, long startIndex, long numberOfElements, float* array);
    void cgGLGetMatrixParameterArraydr(CGparameter parameter, long startIndex, long numberOfElements, double* array);
    void cgGLGetMatrixParameterArraydc(CGparameter parameter, long startIndex, long numberOfElements, double* array);

The c and r suffixes have
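In the Cg OpenGL runtime, the r suffix on these matrix calls denotes data laid out in row-major order and the c suffix denotes column-major order. The C sketch below shows only the difference between the two layouts for a flat array; it does not call the Cg runtime, and the helper names `at_row_major`, `at_col_major`, and `row_to_col_major` are hypothetical.

```c
#include <stddef.h>

/* Element (row, col) of a matrix stored row by row; cols is the row length. */
static float at_row_major(const float *m, size_t cols, size_t row, size_t col)
{
    return m[row * cols + col];
}

/* Element (row, col) of a matrix stored column by column;
   rows is the column length. */
static float at_col_major(const float *m, size_t rows, size_t row, size_t col)
{
    return m[col * rows + row];
}

/* Copy a rows x cols matrix from row-major storage to column-major storage. */
static void row_to_col_major(const float *src, float *dst,
                             size_t rows, size_t cols)
{
    for (size_t r = 0; r < rows; ++r)
        for (size_t c = 0; c < cols; ++c)
            dst[c * rows + r] = src[r * cols + c];
}
```

The same logical matrix therefore yields two different flat arrays, and passing an array to a call with the wrong suffix effectively transposes the matrix.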
139. ameters fall into three broad categories: program parameters, effect parameters, and shared parameters.

Program parameters are associated with Cg programs. A parameter that is declared as part of the program's entry point belongs to the program's namespace. A parameter that is declared globally in the file scope of the Cg program belongs to the program's global namespace.

Effect parameters are associated with Cg Effects. See the Introduction to CgFX chapter for more information on managing effect parameters.

Shared parameters are associated with Cg contexts. See "Shared Parameters" on page 59 for more details.

Cg functions exist for retrieving, creating, and querying program parameters.

Program Parameter Retrieval

Parameters associated with Cg programs may be retrieved iteratively or directly.

Iteration

A program has a sequence of parameters that can be iterated over by using cgGetFirstParameter() and cgGetNextParameter():

    CGparameter cgGetFirstParameter(CGprogram program, CGenum namespace);
    CGparameter cgGetNextParameter(CGparameter parameter);

A call to cgGetFirstParameter() returns the first parameter of the sequence. If the program is invalid or does not contain any parameter, the call returns zero. Given a parameter, cgGetNextParameter() returns the parameter immediately next in the sequence, or zero if there is none. The namespace
140. and CLP0-CLP5 to be present as binding semantics on a member of a structure of a varying input data structure, provided the member with this binding semantics is not referenced. This allows Cg programs to have the same structure specify the varying output of a vp30 profile program and the varying input of an fp30 profile program.

Table 27. fp30 Varying Input Binding Semantics

    Binding Semantics Name               Corresponding Data (type)
    COLOR0, COL0                         Input color (float4)
    COLOR1, COL1                         Input color1 (float4)
    TEXCOORD0-TEXCOORD7, TEX0-TEX7       Input texture coordinates (float4)
    WPOS                                 Window Position Coordinates (float4)

The valid binding semantics for varying output parameters in the fp30 profile are summarized in Table 28.

Table 28. fp30 Varying Output Binding Semantics

    Binding Semantics Name               Corresponding Data
    COLOR, COLOR0, COL                   Output color (float4)
    DEPTH, DEPR                          Output depth (float)

Pack and Unpack Functions

The fp30 profile provides a number of functions for packing multiple floating-point values into a single 32-bit result. Corresponding unpacking functions are also provided. These functions map directly to the packing and unpacking instructions defined by the NV_fragment_program OpenGL extension.

pack_2half()

    float pack_2half(float2 a);
    float pack_2half(half2 a);

Converts the components of a into a pair of 16-bit f
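To make the idea of packing two 16-bit floating-point values into one 32-bit result concrete, here is a rough CPU-side sketch in C. It is not the fp30 implementation. The bit layout (first component in the low 16 bits, second in the high 16 bits) is an assumption based on the PK2H instruction of NV_fragment_program, and the conversion is deliberately simplified: it truncates instead of rounding, flushes values too small for a normal half to zero, and maps values that are too large (and NaN) to infinity. The names `float_to_half_bits` and `pack_2half_bits` are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Simplified conversion of a 32-bit float to IEEE-style 16-bit half bits
   (1 sign bit, 5 exponent bits, 10 mantissa bits). Assumptions: mantissa
   is truncated, small values flush to signed zero, large values and NaN
   become signed infinity. */
static uint16_t float_to_half_bits(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);               /* reinterpret the float */

    uint32_t sign = (bits >> 16) & 0x8000u;        /* sign moved to bit 15 */
    int32_t  exp  = (int32_t)((bits >> 23) & 0xFFu) - 127 + 15; /* rebias */
    uint32_t mant = (bits >> 13) & 0x3FFu;         /* top 10 mantissa bits */

    if (exp <= 0)
        return (uint16_t)sign;                     /* too small: signed zero */
    if (exp >= 31)
        return (uint16_t)(sign | 0x7C00u);         /* too large: infinity */
    return (uint16_t)(sign | ((uint32_t)exp << 10) | mant);
}

/* Pack two halves into 32 bits. Assumed layout: x low, y high. */
static uint32_t pack_2half_bits(float x, float y)
{
    return (uint32_t)float_to_half_bits(x)
         | ((uint32_t)float_to_half_bits(y) << 16);
}
```

In Cg the packed result is returned as a float whose raw bits carry the two halves; the sketch returns the raw bits as an integer so that they are easy to inspect.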
141. are associated with a CGcontext. They may be created with the following entry points:

    CGparameter cgCreateParameter(CGcontext ctx, CGtype type);
    CGparameter cgCreateParameterArray(CGcontext ctx, CGtype type, int length);
    CGparameter cgCreateParameterMultiDimArray(CGcontext ctx, CGtype type, int dim, const int* lengths);

Only parameters of concrete types may be created. In particular, parameters of abstract interface types may not be created. By default, a created parameter has uniform variability and undefined values.

Shared Parameter Deletion

Shared parameters may be deleted using

    void cgDestroyParameter(CGparameter param);

When a shared parameter is deleted, all parameters connected to it are disconnected, and vice versa.

Connecting Parameters

Once created, a shared parameter may be connected to any number of program, effect, or shared parameters using

    void cgConnectParameter(CGparameter source, CGparameter sink);

where source is the shared parameter, and sink is the target parameter that will inherit the shared parameter's values.

Once a parameter has had a source connected to it, its value should no longer be set directly. Instead, its value can be set indirectly by setting the value of the associated source.

A parameter that has been connected to a shared source parameter may be disconnected using

    void cgDisconnectParameter(CGparameter param);

Shared Parameters and Interfaces

Using Cg, it is possible to create families of code (modules) that share a common inte
142. argument of cgGetFirstParameter() specifies the namespace of the parameters returned by this function and subsequent calls to cgGetNextParameter(). Every parameter belongs to a particular namespace that defines its scope. When CG_GLOBAL is specified, the program's global parameters (i.e., those parameters that are in the file scope of the program's entry point) are iterated over. When CG_PROGRAM is specified, the parameters specified in the program's entry point declaration are iterated over.

Here is how those two functions would typically be used given a valid program called program:

    CGparameter parameter = cgGetFirstParameter(program, CG_PROGRAM);
    while (parameter != 0)
    {
      /* Here is the code that handles the parameter */
      parameter = cgGetNextParameter(parameter);
    }

These functions don't provide access to the fields of a structure parameter (type CG_STRUCT) or the elements of an array parameter (type CG_ARRAY). In other words, if a struct or array parameter is declared, these entry points return a handle to the struct or array itself.

One way to access the fields of a structure is to use cgGetFirstStructParameter() along with cgGetNextParameter():

    CGparameter cgGetFirstStructParameter(CGparameter parameter);

If parameter is not of type CG_STRUCT, cgGetFirstStructParameter() returns zero.

Similarly, to get access to the elements of an ar
143. ariable may only be used by passing it to another function as an in parameter. Assignment to sampler* variables is not permitted, and sampler* expressions are not permitted.

The following sampler* types are always defined: sampler, sampler1D, sampler2D, sampler3D, samplerCUBE, and samplerRECT. The base sampler type may be used in any context in which a more specific sampler type is valid. However, a sampler variable must be used in a consistent way throughout the program. For example, it cannot be used in place of both a sampler1D and a sampler2D in the same program.

Fragment profiles are required to fully support the sampler, sampler1D, sampler2D, sampler3D, and samplerCUBE data types. Fragment profiles are required to provide partial support (see "Partial Support of Types" on page 231) for the samplerRECT data type and may optionally provide full support for this data type.

Vertex profiles are required to provide partial support for the six sampler data types and may optionally provide full support for these data types.

An array type is a collection of one or more elements of the same type. An array variable has a single index.

Some array types may be optionally designated as packed, using the packed type modifier. The storage format of a packed type may be different from the storage format of the corresponding unpacked type. The storage format of packed types is implementation dependent, but must be consistent for any particular combinatio
144. ariables are initialized with the same value, but the variables are not aliased thereafter. Output aliasing is illegal, but implementations are not required to detect it. If the compiler does not issue an error on a program that aliases output binding semantics, the results are undefined.

Restrictions on Semantics Within a Structure

For a particular profile, it is illegal to mix input binding semantics and output binding semantics within a particular struct. That is, for a particular top-level function, a struct must be either input-only or output-only. Likewise, a struct must consist exclusively of uniform inputs or exclusively of non-uniform inputs. It is illegal to use binding semantics to mix the two within a single struct.

Additional Details for Binding Semantics

The following rules are somewhat redundant, but provide extra clarity:

- Semantics names are case-insensitive.
- Semantics attached to parameters to non-main functions are ignored.
- Input semantics may be aliased by multiple variables.
- Output semantics may not be aliased.

How Programs Receive and Return Data

A program is just a non-static function that has been designated as the main entry point at compilation time. The varying inputs to the program come from this top-level function's varying in parameters. The uniform inputs to the program come from the top-level function's uniform in parameters and from any non-static global variables that are referenced by the
145. as main() must be declared as uniform.

A structure that implements a particular interface may be used wherever its interface type is expected. For example:

    float3 myfunc(Light light)
    {
      float3 result = light.illuminate(...);
      ...
    }

    float4 main(uniform SpotLight spot)
    {
      float3 color = myfunc(spot);
      ...
    }

Here, the SpotLight variable spot may be used as a generic Light in the call to myfunc() because SpotLight implements the Light interface.

It is possible to declare a local variable of an interface type. However, a concrete structure must be assigned to that variable before any of the interface's methods may be called. For example:

    Light mylight;
    SpotLight spot;
    float3 color;
    /* initialize spot */
    ...
    color = mylight.illuminate(...);  // error
    mylight = spot;
    color = mylight.illuminate(...);  // OK

Under all current profiles, the concrete implementation of all interface method calls must be resolvable at compile time. There is no dynamic run-time determination of which implementation to call under any current profile.

See the interfaces_ogl example, included in the Cg distribution, for an example of the use of interfaces.

Notes and Caveats

The following limitations may be addressed in future releases:

- There is no inheritance per se in Cg: a structure may not inherit from another structure.
- Structures may only implement a single interface.
- Interfa
146. as  121
    Textures and Samplers  123
    Interfaces and Unsized Arrays  125
    Evaluating Cg Programs using the Virtual Machine  127
    Annotations  128
    OpenGL State  129
    OpenGL Sampler State  141
    OpenGL State Not Specifiable with State Assignments  142
A Brief Tutorial  145
    Loading the Workspace  145
    Understanding simple.cg  146
    Program Listing for simple.cg  147
    Definitions for Structures with Varying Data  148
    Passing Arguments  149
    Basic Transformations  149
    Prepare for Lighting  150
    Calculating the Vertex Color  151
    Further Experimentation  152
Advanced Profile Sample Shaders  153
    Improved Skinning  154
    Description  154
    Vertex Shader Source Code for Improved Skinning  155
    Improved Water  157
    Description  157
    Vertex Shader Source Code for Impr
147. ate multiple materials without switching shaders   splitting your model  or resorting to multiple passes     Uses for MultiPaint might include complex armor built of inlaid metals   woods  and stones   all modeled on a single  simple poly mesh  buildings  composed of multiple types of stone  glass  and metal  expressed as simple  cubes  cloth with inlaid metallic threads  or as in this demo  metal partially  covered with peeling paint     Using multiple BRDFs is common in the offline world  but rarely optimized   instead  two different shaders may be evaluated and their results blended  using a mask texture or chained through if statements  For maximum real   time performance  MultiPaint instead integrates all of the key parts of the  BRDFs as multiple painted textures so that only one pass through the shader  is required to create the mixed appearance  This permits a single pass shader  containing diffuse  specular  and environmental lighting effects in a compact   fast executing package        Fig  8  Example of MultiPaint       808 00504 0000 006 165  NVIDIA          Cg Language Toolkit    Vertex Shader Source Code for MultiPaint       define inputs from vertex buffer  struct appin                                                float4 Position 2 IPOS IW IONS  float4 UV TO LEXCOORDIO   float4 Tangent ee LEEXCOOR DAN   float4 Binormal ESO EID   float4 Normal 2 TEXCOORD 3   y      output    same struct is the input to  cg multipaint cg   spice Milicia qd  float4 HPosition 3 P
148. ay Traced Refraction  170
Fig. 10. Example of Skin  175
Fig. 11. Example of Thin Film Effects  180
Fig. 12. Example of Car Paint 9  183
Fig. 13. Example of Anisotropic Lighting  190
Fig. 14. Example of Bump Dot3x2 Diffuse and Specular  192
Fig. 15. Example of Bump Reflection Mapping  196
Fig. 16. Example of Fresnel  200
Fig. 17. Example of Grass  202
Fig. 18. Example of Refraction  205
Fig. 19. Example of Shadow Mapping  208
Fig. 20. Example of Shadow Volume Extrusion  211
Fig. 21. Example of Sine Wave  214
Fig. 22. Example of Matrix Palette Skinning  217

List of Tables

Table 1. Mathematical Functions  34
Table 2. Geometric Functions  38
Table 3. Texture Map Functions  39
Table 4. Derivative Functions  41
Table 5. Debugging Function  42
Table 6. CgFX OpenGL State Manager States  130
Table 7. Enable/Disable States  139
Table 8. sampler_state State
149. ber of the source can be converted to the target.
ii. Not allowed if target is larger than source. Warning issued if target is smaller than source.
iii. Only allowed if source and target are the same total size.
iv. Only allowed if both source and target have the same number of members and each member of the source can be converted to the corresponding member of the target.

Explicit casts are:

- Compile-time type when applied to expressions of compile-time type
- Numeric type when applied to expressions of numeric or compile-time type
- Numeric vector type when applied to another vector type of the same number of elements
- Numeric matrix type when applied to another matrix type of the same number of rows and columns

Type Equivalency

Type T1 is equivalent to type T2 if any of the following are true:

- T2 is equivalent to T1.
- T1 and T2 are the same scalar, vector, or structure type. A packed array type is not equivalent to the same size unpacked array.
- T1 is a typedef name of T2.
- T1 and T2 are arrays of equivalent types with the same number of elements.
- The unqualified types of T1 and T2 are equivalent, and both types have the same qualifications.
- T1 and T2 are functions with equivalent return types, the same number of parameters, and all corresponding parameters are pair-wise equivalent.

Type Promotion Rules

The cfloat and cint types behave like
150. ble* array);

The digit in the name of those functions indicates how many scalar values are set by the function. The v suffix is for functions that operate on an array of values as opposed to individual arguments.

If more values are set than the parameter requires, the extra values are ignored. If fewer values are set than the parameter requires, the last value is smeared. The cgGLSetParameter functions may be called for either uniform or varying parameters. When called for a varying parameter, the appropriate immediate-mode OpenGL entry point is called.

The corresponding parameter value retrieval functions are as follows:

    cgGLGetParameter1f(CGparameter parameter, float* array);
    cgGLGetParameter1d(CGparameter parameter, double* array);
    cgGLGetParameter2f(CGparameter parameter, float* array);
    cgGLGetParameter2d(CGparameter parameter, double* array);
    cgGLGetParameter3f(CGparameter parameter, float* array);
    cgGLGetParameter3d(CGparameter parameter, double* array);
    cgGLGetParameter4f(CGparameter parameter, float* array);
    cgGLGetParameter4d(CGparameter parameter, double* array);

Setting Uniform Matrix Parameters

The cgGLSetMatrixParameter functions are used to set any matrix:

    void cgGLSetMatrixParameterfr(CGparameter parameter, const float* matrix);
    void cgGLSetMatrixParameterfc(CGparameter parameter, const float* matrix);
    void cgGLSetMatrixParameterdr(CGparameter parameter, const double* matrix);
    void cgGLSetMatrixParameterdc(CGpa
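The rule about extra and missing values can be pictured with a few lines of C. This is a sketch of the behavior as the text describes it, not code from the Cg runtime; the helper name `set_with_smear` is hypothetical.

```c
#include <stddef.h>

/* Fill `needed` components of dst from `given` source values.
   Extra source values are ignored; if too few are supplied, the last
   supplied value is repeated ("smeared") into the remaining components.
   Does nothing when no source values are supplied. */
static void set_with_smear(float *dst, size_t needed,
                           const float *src, size_t given)
{
    if (given == 0)
        return;
    for (size_t i = 0; i < needed; ++i)
        dst[i] = src[i < given ? i : given - 1];
}
```

Under this reading, supplying the two values (1, 2) for a four-component parameter gives (1, 2, 2, 2), and supplying four values for a three-component parameter drops the fourth.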
151. blic register reinterpret_cast return row_major sampler sampler_state sampler1D sampler2D sampler3D samplerCUBE shared short signed sizeof static static_cast string struct switch technique template texture texture1D texture2D texture3D textureCUBE textureRECT this throw true try typedef typeid typename uniform union unsigned using vector vertexfragment vertexshader virtual void volatile while __identifier (two underscores before identifier)

Cg Standard Library Functions

Cg provides a set of built-in functions and predefined structures with binding semantics to simplify GPU programming. These functions are discussed in "Cg Standard Library Functions" on page 33.

Vertex Program Profiles

A few features of the Cg language that are specific to vertex program profiles are required to be implemented in the same manner for all vertex program profiles.

Mandatory Computation of Position Output

Vertex program profiles may (and typically do) require that the program compute a position output. This homogeneous clip-space position is used by the hardware rasterizer and must be stored in a program output with an output binding semantic of POSITION (or HPOS for backward compatibility).

Position Invariance

In many graphics APIs, the user can choose between two different approaches to specifying per-vertex computations: use a built-in configurable fixed-fu
152. bool
  Requires: OpenGL 1.2 or EXT_texture3D. ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_IMAGE_UNITS.

Table 7. Enable/Disable States (continued)

Enable/Disable State Name: TextureRectangleEnable[ndx]
  Type: bool
  Requires: ARB_texture_rectangle, EXT_texture_rectangle (Apple), or NV_texture_rectangle. ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_IMAGE_UNITS.

Enable/Disable State Name: TextureCubeMapEnable[ndx]
  Type: bool
  Requires: OpenGL 1.3, ARB_texture_cube_map, or EXT_texture_cube_map. ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_IMAGE_UNITS.

OpenGL Sampler State

The following table lists the state assignments available in sampler_state blocks when using the CgFX OpenGL state manager. Any state values given are set when the cgSetSamplerState() routine is called with the CGparameter handle for a particular sampler.

Note that some of these states are defined in OpenGL extensions: for example, MirrorClampToBorder is defined in the EXT_texture_mirror_clamp extension. Any state used that is based on an extension not supported by the current OpenGL context is ignored by the CgFX runtime.

Table 8. sampler_state State Assignments

Name: WrapS, WrapT, WrapR
  Type: int
  Valid Values: Repeat, Clamp, ClampToEdge, ClampToBorder, MirroredRepeat,
153. bout the contents of a Cg file.

Cg also includes built-in vector data types that are based on the basic data types. A sample of these built-in vector data types includes, but is not limited to, the following:

    float4 float3 float2 float1
    bool4  bool3  bool2  bool1

Additional support is provided for matrices of up to four-by-four elements. Here are some examples of matrix declarations:

    float1x1 matrix1;  // One-element matrix
    float2x3 matrix2;  // Two-by-three matrix (six elements)
    float4x2 matrix3;  // Four-by-two matrix (eight elements)
    float4x4 matrix4;  // Four-by-four matrix (sixteen elements)

Note that the multi-dimensional array float M[4][4] is not type-equivalent to the matrix float4x4 M.

There are no unions or bit fields in Cg at present.

Type Conversions

Type conversions in Cg work largely as they do in C. Type conversions may be explicitly specified using the C (newtype) cast operator.

Cg automatically performs type promotion in mixed-type expressions, just as C does. For example, the expression floatvar * halfvar is compiled as floatvar * (float)halfvar.

Cg uses different type promotion rules than C does in one case: a constant without an explicit type suffix does not cause type promotion. Cg compiles the expression halfvar * 2.0 as halfvar * (half)2.0. In contrast, C would compile it as (double)halfvar * 2.0. Cg uses different rules than C to minimize inadvertent type promotions that cause
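The C half of this comparison is easy to confirm with sizeof, because an unsuffixed floating constant in C has type double. The short C sketch below uses float in place of half, since C has no standard half type; the function names are hypothetical, and the sketch says nothing about how a Cg compiler is implemented.

```c
#include <stddef.h>

/* In C, 2.0 is a double, so float * 2.0 is computed and typed as double. */
static size_t size_of_unsuffixed_product(float x)
{
    return sizeof(x * 2.0);
}

/* With an explicit f suffix the constant is a float and no promotion to
   double takes place in the expression's type. */
static size_t size_of_suffixed_product(float x)
{
    return sizeof(x * 2.0f);
}
```

The first function returns sizeof(double) and the second returns sizeof(float), which is exactly the inadvertent widening that the Cg rule for unsuffixed constants avoids.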
154. by a large grid of vertices (because of the free rotation), but switching to wireframe or increasing the frustum angle makes it apparent that the vertices are a static mesh, with the height, normal, and texture coordinates being calculated on the fly based on the direction and height of the viewer. This technique allows for very GPU-friendly water animations because the static mesh can be precomputed. The vertices are displaced using sine waves, and in this example a loop is used to sum five sine waves to achieve realistic effects.

Fig. 6. Example of Improved Water

Vertex Shader Source Code for Improved Water

    struct app2vert
    {
      float4 Position : POSITION;
    };

    struct vert2frag
    {
      float4 Position  : POSITION;
      float4 TexCoord0 : TEXCOORD0;
      float4 TexCoord1 : TEXCOORD1;
      float4 Color0    : COLOR0;
      float4 Color1    : COLOR1;
    };

    void calcWave(out float disp, out float2 normal,
                  float dampening, float3 viewPosition,
                  float waveTime, float height,
                  float frequency, float2 waveDirection)
    {
      float distance1 = dot(viewPosition.xy, waveDirection);
      distance1 = frequency * distance1 + waveTime;

      disp = height * sin(distance1) / dampening;
      normal = -cos(distance1) * height * frequency *
               (waveDirection.xy) / (.4 * dampening);
    }

    vert2frag main(app2vert IN,
                   uniform float4x4 ModelViewProj,
                   uniform float4x4 ModelView,
                   uniform float4x4 ModelViewIT,
                   uni
155. c ran eN Y 15
            FragmentProgram = compile arbfp1 main(2.f);
        }
    }

    technique AsmFrag {
        pass {
            FragmentProgram = asm {
                EIA sx oOfCOLR Mhz WEG 2Dp END
            };
        }
    }

Compile statements are generally the most commonly used of these three options for specifying programs. They take the profile that the program is to be compiled to (fp30, fp40, arbfp1, vp20, and so on), the name of the function in the effect file to be compiled, and a list of expressions ((2.f) in the above example). These expressions have a one-to-one correspondence with the uniform parameters of the program being compiled: there must be exactly one for each uniform program parameter.

In the example above, the expression (2.f) sets the value of the foo parameter to main(). Because it is using a literal value, CgFX is able to compile the shader into a particularly efficient version that just includes returning the uv value.

Inline assembly is given with the asm keyword, with the assembly language code between braces as in the example above. CgFX depends on having the appropriate header at the start of the assembly (!!FP1.0 for fp30, !!ARBvp1.0 for arbvp1, and so on) to determine which assembly profile the code is given in.

It is also possible to include effect parameters in the expression used in the compile statement. For example:

    float4 main(uniform float foo, float4 uv : TEXCOORD0
156. called FragmentProgram.cg.

    void FragmentProgram(
        in float4 color : COLOR0,
        in float4 texCoord : TEXCOORD0,
        out float4 colorO : COLOR0,
        const uniform sampler2D BaseTexture,
        const uniform float4 SomeColor)
    {
        colorO = color * tex2D(BaseTexture, texCoord) + SomeColor;
    }

OpenGL Application

This C code links the previous vertex and fragment programs to the application.

    #include <cg/cg.h>
    #include <cg/cgGL.h>

    float* vertexPositions;   // Initialized somewhere else
    float* vertexColors;      // Initialized somewhere else
    float* vertexTexCoords;   // Initialized somewhere else
    GLuint texture;           // Initialized somewhere else
    float constantColor[];    // Initialized somewhere else

    CGcontext context;
    CGprogram vertexProgram, fragmentProgram;
    CGprofile vertexProfile, fragmentProfile;
    CGparameter position, color, texCoord, baseTexture, someColor,
                modelViewMatrix;

    // Called at initialization
    void CgGLInit()
    {
        // Create context
        context = cgCreateContext();

        // Initialize profiles and compiler options
        vertexProfile = cgGLGetLatestProfile(CG_GL_VERTEX);
        cgGLSetOptimalOptions(vertexProfile);
        fragmentProfile = cgGLGetLatestProfile(CG_GL_FRAGMENT);
        cgGLSetOptimalOptions(fragmentProfile);

        // Create the vertex program
        vertexProgram = cgCreateProgramFromFile(
            context, CG_SOURCE, "VertexProgram.cg", vertexProfile, "V
157. can be retrieved by calling cgGLGetTextureEnum() (see the following discussion).

The second step consists of enabling the texture unit associated with the sampler parameter for a specific drawing call. It is strongly recommended that applications allow the Cg OpenGL runtime library to perform this second step itself. This is accomplished by calling

    void cgGLSetManageTextureParameters(CGcontext context,
                                        CGbool enable);

with enable set to a non-zero value after the Cg context has been created. When automatic texture parameter management is in effect, the Cg OpenGL runtime will automatically enable all appropriate texture units when a CGprogram is bound.

If, despite the above, you wish to manage texture parameters yourself, you can use the helper function

    void cgGLEnableTextureParameter(CGparameter parameter);

which must be called after cgGLSetTextureParameter() and before the actual drawing call.

The equivalent disabling function is

    void cgGLDisableTextureParameter(CGparameter parameter);

You can retrieve the texture object assigned to a sampler parameter using

    GLuint cgGLGetTextureParameter(CGparameter parameter);

You can retrieve the OpenGL enumerant for the texture unit associated with a sampler parameter using

    GLenum cgGLGetTextureEnum(CGparameter parameter);

The returned enumerant has the form GL_TEXTURE#_ARB where # is the texture unit index.

OpenG
158. ces cannot be extended or combined.

Although there is no structure inheritance, it is possible to define a default implementation of a particular interface method. The default implementation can be defined as a global function, and structures that implement that interface may then call this default method via a wrapper.

Note, also, that interface and structure parameters of top-level functions (such as main()) may be connected to structures that are created in the runtime. See the Cg runtime documentation for more details.

Statements and Operators

Cg supports the following types of statements and operators:

- Control flow
- Function definitions and function overloads
- Arithmetic operators from C
- Multiplication function
- Vector constructor
- Boolean and comparison operators
- Swizzle operator
- Write mask operator
- Conditional operator

Control Flow

Cg uses the following C control constructs:

- Function calls and the return statement
- if/else
- while
- for

These control constructs require that their conditional expressions be of type bool. Because Cg expressions like i <= 3 are of type bool, this change from C is normally not apparent.

Profiles like vs_2_x, vp30, and vp40 support branch instructions, so for and while loops are fully supported in these profiles. In other profiles, for and while loops may only be us
159. cgGLBindProgram(CGprogram program);

Only one vertex program and one fragment program can be bound at any given time, so binding a program implicitly unbinds any other program of that type.

Profiles are disabled using cgGLDisableProfile():

    void cgGLDisableProfile(CGprofile profile);

Some profiles may not be supported on some systems. For example, a given profile is not supported if the OpenGL extensions it requires are not available. You can check if a profile is supported by using cgGLIsProfileSupported():

    CGbool cgGLIsProfileSupported(CGprofile profile);

It returns CG_TRUE if profile is supported and CG_FALSE otherwise.

OpenGL Program Examples

This section presents code that illustrates how to use functions from the OpenGL Cg interface to make Cg programs work with OpenGL. The vertex and fragment programs below are used in "OpenGL Application" on page 82.

OpenGL Vertex Program

The following Cg code is assumed to be in a file called VertexProgram.cg.

    void VertexProgram(
        in float4 position : POSITION,
        in float4 color : COLOR0,
        in float4 texCoord : TEXCOORD0,
        out float4 positionO : POSITION,
        out float4 colorO : COLOR0,
        out float4 texCoordO : TEXCOORD0,
        const uniform float4x4 ModelViewMatrix)
    {
        positionO = mul(position, ModelViewMatrix);
        colorO = color;
        texCoordO = texCoord;
    }

OpenGL Fragment Program

The following Cg code is assumed to be in a file
160. chniques. Validation fails, for instance, if a technique includes a "compile" state assignment that references a profile that isn't supported on the current graphics hardware. Similarly, validation fails if the technique includes a state assignment that uses an unsupported OpenGL extension. Effects are commonly written such that the application can iterate over the given techniques in order and then choose the first technique that passes validation to apply the effect. For this reason, techniques are usually given in order of decreasing quality.

The code below iterates through the techniques in a CGeffect in turn, attempting to validate each of them and printing an error for the ones that fail.

    CGtechnique technique = cgGetFirstTechnique(effect);
    while (technique) {
        if (cgValidateTechnique(technique) == CG_FALSE)
            fprintf(stderr, "Technique %s did not validate. Skipping.\n",
                    cgGetTechniqueName(technique));
        technique = cgGetNextTechnique(technique);
    }

The function cgIsTechniqueValidated() can be used to check if the given technique has been validated.

Note that any Cg programs referenced in a technique are not compiled until the technique is validated. This makes it possible to modify the uncompiled program by connecting concrete shared structs to interface effect parameters, marking uniforms as literals, changing the program's profile, and so on.

Passes and Pass State

The heart of CgFX is applyin
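The selection strategy described in this section (walk the techniques in order of decreasing quality and keep the first one that validates) can be sketched without the Cg runtime. The example below is a mock in plain C: MockTechnique and first_valid_technique are hypothetical names, and the supported flag stands in for the result that cgValidateTechnique() would return in a real application.

```c
#include <stddef.h>

/* Hypothetical stand-in for a CgFX technique. The supported flag plays
   the role of the validation result; it is not part of the Cg API. */
typedef struct {
    const char *name;
    int supported;
} MockTechnique;

/* Return the first technique that validates, or NULL if none does.
   Techniques are assumed to be listed in order of decreasing quality. */
const MockTechnique *first_valid_technique(const MockTechnique *techniques,
                                           size_t count)
{
    for (size_t i = 0; i < count; ++i) {
        if (techniques[i].supported)
            return &techniques[i];
    }
    return NULL;
}
```

Because the list is ordered by quality, stopping at the first success yields the best technique the hardware can run, and a NULL result tells the application that the effect cannot be applied at all.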
161. compiler has to add some that correspond to literal constant values in the code.

A parameter's variability can also be modified via the core Cg runtime using

    void cgSetParameterVariability(CGparameter parameter,
                                   CGenum vary);

Here, vary may be one of:

- CG_UNIFORM
  The parameter is set to uniform variability.

- CG_LITERAL
  The parameter is marked as a literal, whose value can be assumed to be a compile-time constant. This feature can be used to "bake" parameter values into the compiled Cg program, which often produces much more efficient compiled code.

- CG_DEFAULT
  The parameter reverts to its default variability as specified in the program text, or is made to inherit its variability from any source it has been connected to.

Note that parameters may not currently be set to CG_VARYING variability.

To obtain the parameter direction, use cgGetParameterDirection():

    CGenum cgGetParameterDirection(CGparameter parameter);

It returns CG_IN if the parameter is an input parameter, CG_OUT if the parameter is an output parameter, or CG_INOUT if the parameter is both an input and an output parameter.

The entry point cgGetParameterName() retrieves the parameter name:

    const char* cgGetParameterName(CGparameter parameter);

Use cgGetParameterSemantic() to retrieve the parameter semantic string:

    const char* cgGetParameterSemantic(CGparameter parameter);

If the parameter does not have any semantic, an empty string is
162. ction set and machine architecture limit programmability in these profiles compared to what is allowed by Cg constructs.(9) Thus, these profiles place additional restrictions on what can and cannot be done in a Cg program.

The main difference between these profiles from the Cg perspective is that additional texture addressing operations are exposed in ps_1_2 and ps_1_3, and the depth value output is made available (in a limited form) in ps_1_3.

Operations in the DirectX pixel shader 1.X profiles can be categorized as texture addressing operations and arithmetic operations. Texture addressing operations are operations which generate texture addressing instructions; arithmetic operations are operations which generate arithmetic instructions. A Cg program in one of these profiles is limited to generating a maximum of four texture addressing instructions and eight arithmetic instructions. Since these numbers are quite small, users need to be very aware of this limitation while writing Cg code for these profiles.

9. For more details about the underlying instruction sets, their capabilities, and their limitations, refer to the MSDN documentation of DirectX pixel shaders 1.1, 1.2, and 1.3.

Modifiers

There are certain simple arithmetic operations that can be applied to inputs of texture addressing operations and to inputs and outputs of arithmetic operations without generating an a
163. ctors; this causes a warning if it is done implicitly. A matrix may also be converted implicitly to a matrix of the same size and shape and compatible element type.

A matrix may be converted to a smaller matrix type (the upper-left submatrix is selected), or to a vector of the same total size, but a warning is issued if an explicit cast is not used.

- Structure conversions
  A structure may be explicitly cast to the type of its first member or to another structure type with the same number of members, if each member of the struct can be converted to the corresponding member of the new struct. No implicit conversions of struct types are allowed.

- Array conversions
  No conversions of array types are allowed.

Table 9 summarizes the type conversions discussed here. The table entries have the following meanings, but please pay attention to the footnotes:

- Allowed: allowed implicitly or explicitly
- Warning: allowed, but warning issued if implicit
- Explicit: only allowed with explicit cast
- No: not allowed

Table 9. Type Conversions

                  Source Type
    Target Type   Scalar     Vector     Matrix     Struct     Array
    Scalar        Allowed    Warning    Warning    Explicit   No
    Vector        Allowed    Allowed    Warning    Explicit   No
    Matrix        Allowed    Warning    Allowed    Explicit   No
    Struct        Explicit   No         No         Explicit   No
    Array         No         No         No         No         No

i. Only allowed if the first mem
164. d

    void OnCreateDevice()
    {
        // Create the vertex shader
        vertexProgram = cgCreateProgramFromFile(context, CG_SOURCE,
            "VertexProgram.cg", CG_PROFILE_VS_2_0, "VertexProgram", 0);
        CComPtr<ID3DXBuffer> byteCode;
        const char* progSrc = cgGetProgramString(vertexProgram,
            CG_COMPILED_PROGRAM);
        D3DXAssembleShader(progSrc, strlen(progSrc), 0, 0, 0,
            &byteCode, 0);

        // If your program uses explicit binding semantics (like
        // this one), you can create a vertex declaration
        // using those semantics.
        const D3DVERTEXELEMENT9 declaration[] = {
            { 0, 0 * sizeof(float),
              D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
              D3DDECLUSAGE_POSITION, 0 },
            { 0, 3 * sizeof(float),
              D3DDECLTYPE_D3DCOLOR, D3DDECLMETHOD_DEFAULT,
              D3DDECLUSAGE_COLOR, 0 },
            { 0, 4 * sizeof(float),
              D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT,
              D3DDECLUSAGE_TEXCOORD, 0 },
            D3DDECL_END()
        };

        // Make sure the resulting declaration is compatible with
        // the shader. This is really just a sanity check.
        assert(cgD3D9ValidateVertexDeclaration(vertexProgram,
            declaration));

        device->CreateVertexDeclaration(
            declaration, &vertexDeclaration);
        device->CreateVertexShader(
            (const DWORD*)byteCode->GetBufferPointer(), &vertexShader);

        // Create the pixel shader
        fragmentProgram = cgCreateProgramFromFile(context,
165. de) that will be generated for a function named by an identifier is a definition.

Profiles

Compilation of a Cg program, a top-level function, always occurs in the context of a compilation profile. The profile specifies whether certain optional language features are supported. These optional language features include certain control constructs and standard library functions. The compilation profile also defines the precision of the float, half, and fixed data types, and specifies whether the fixed and sampler* data types are fully or only partially supported. The choice of a compilation profile is made externally to the language, by using a compiler command-line switch, for example.

The profile restrictions are only applied to the top-level function that is being compiled and to any variables or functions that it references, either directly or indirectly. If a function is present in the source code, but not called directly or indirectly by the top-level function, it is free to use capabilities that are not supported by the current profile.

The intent of these rules is to allow a single Cg source file to contain many different top-level functions that are targeted at different profiles. The core Cg language specification is sufficiently complete to allow all of these functions to be parsed. The restrictions provided by a compilation profile are only needed for code generati
166. deprecated entry point:

    CGtype cgGetParameterType(CGparameter parameter);

This entry point differs from cgGetNamedUserType() in that it always returns CG_STRUCT for any struct parameter, rather than returning the enumerant associated with the user-defined type of the struct.

The name associated with a given type enumerant can be queried using

    const char* cgGetTypeString(CGtype type);

If the string passed to cgGetType() does not correspond to any type, CG_UNKNOWN_TYPE is returned.

The function cgGetParameterBaseType() returns the basic type of vector, matrix, and array parameters. For example, given a float4x4 parameter, cgGetParameterBaseType() returns the CG_FLOAT type. Similarly, given a multidimensional array of float4x4s, it also returns CG_FLOAT.

It is also possible to determine the general class of the type of a parameter:

    CGparameterclass cgGetParameterClass(CGparameter param);

It returns one of the following enumerated values:

    CG_PARAMETERCLASS_UNKNOWN   CG_PARAMETERCLASS_SCALAR
    CG_PARAMETERCLASS_VECTOR    CG_PARAMETERCLASS_OBJECT
    CG_PARAMETERCLASS_MATRIX    CG_PARAMETERCLASS_STRUCT
    CG_PARAMETERCLASS_ARRAY

Parameter Type Equivalency

If a program containing a user-defined type is created in a context that already contains another program or effect that defines a user type with the same name, the two type definitions are compared. If both type definitions are found to be equivalent, the CGtype enumerant associated with the user typ
167. der file before compilation.

- -longprogs
  Allow code generation that is longer than a profile's limit.

- -debug
  Activate the debug() function.

- -v
  Print the compiler's version to stdout.

- -h
  Print a short help message.

- -maxunrollcount N
  Set the maximum loop unroll count to N. Loops with greater than N iterations are not unrolled. Defaults to 256.

- -posinv
  Generate a position-invariant vertex program if position invariance is supported by the current profile.
168. e() function usually has no cost in fragment programs. Do not hesitate to use these functions when appropriate.

4. Use Texture Maps to Encode Complex Functions

For profiles that support texture maps, filtered texture map lookups are extraordinarily efficient. If you have a complex function that takes more than a handful of arithmetic operations to evaluate, you might want to encode the function in a texture map. Say that you have written a function f(x, y) that is a bottleneck in your shader. Assume for now that it is always called with values of x and y between zero and one, and that the value that f(x, y) computes is always between zero and one. If the function is reasonably smooth and you don't need to compute it at extremely high precision, you can precompute the function in your application and store it in a texture map, replacing calls like

    float val = f(x, y);

with code like

    Flogs val EDE oanp ller lol  3    03

This method can also be applied to one- and three-dimensional functions, using 1D and 3D texture maps.

More generally, the values you pass to the function may not be in the range [0, 1], and the values your function returns may not be in the range [0, 1]. In this case, the following two utility functions can serve as a base: remapTo01() remaps the range [low, high] into [0, 1]; remapFrom01() does the opposite.

    float4 remapTo0
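The range remapping described above can be sketched as two scalar helpers in plain C. This is an illustration under stated assumptions rather than the manual's Cg code: the names remap_to_01 and remap_from_01 are hypothetical, the functions take a single float instead of a float4, and clamping the result to [0, 1] is an assumption made here so that out-of-range inputs stay usable as texture coordinates.

```c
/* Map v from the range [low, high] into [0, 1].
   The clamp is an assumption of this sketch; high must differ from low. */
float remap_to_01(float v, float low, float high)
{
    float t = (v - low) / (high - low);
    if (t < 0.0f) t = 0.0f;
    if (t > 1.0f) t = 1.0f;
    return t;
}

/* Map t from [0, 1] back into the range [low, high]. */
float remap_from_01(float t, float low, float high)
{
    return low + t * (high - low);
}
```

An application would apply the first mapping to the function's arguments before the texture lookup and the second mapping to the value fetched from the texture.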
169. e;
    DepthFunc = Less;
    AlphaTestEnable = true;
    AlphaFunc = float2(Equal, 0);

Parameters and Semantics

The CgFX file also contains global Cg parameters. These variables are usually passed as uniform parameters to Cg functions, or as the values for render or texture state settings. For instance, a bool variable might be used as a uniform parameter to a Cg function, or as a value enabling or disabling the alpha blend render state:

    bool AlphaBlending = false;
    float bumpHeight = 0.5f;

These variables can contain a user-defined semantic, which helps applications provide the correct data to the shader without having to decipher the variable names:

    float4x4 myViewMatrix : ViewMatrix;
    texture2D someTexture : DiffuseMap;

A CgFX-enabled application can then query the CgFX file for its variables and their semantics.

Vertex and Fragment Programs

With the OpenGL state manager, vertex and fragment programs are defined via assignments to the VertexProgram and FragmentProgram states, respectively. Three different types of expressions can be on the right-hand side of these program types:

- Compile statements
- In-line assembly
- NULL

These three possibilities are demonstrated in the effect file below.

    float4 main(uniform float foo, float4 uv : TEXCOORD0) : COLOR
    {
        return (foo > 0) ? uv : 2 * uv;
    }

    technique SimpleFrag {
        pass {
            VieKnirexP ho
170. e Support

Two convenient functions are provided that give the highest vertex and pixel shader versions supported by the device:

    CGprofile cgD3D9GetLatestVertexProfile();
    CGprofile cgD3D9GetLatestPixelProfile();

This allows you to make your application future-ready, because the Cg programs are automatically compiled for the best profiles that are available at runtime, even if these profiles did not exist at the time the application was written. Another function that allows you optimal compilation is cgD3D9GetOptimalOptions(). It returns a string representing the optimal set of compiler options for a given profile:

    char const* cgD3D9GetOptimalOptions(CGprofile profile);

This string is meant to be used as part of the argument parameter to cgCreateProgram(). It does not need to be destroyed by the application. However, its content could change if cgD3D9GetOptimalOptions() is called again for the same profile but for a different Direct3D device.

Expanded Interface Program Examples

In this section we provide programs that illustrate how and when to use functions from the expanded interface to make Cg programs work with Direct3D. For the sake of clarity, the examples do very little error checking, but a production application should check the return values of all Cg functions. The vertex and fragment programs that follow are referenced in "Expanded Interface DirectD3D 9 App
171. e application must then create a shared array of concrete light instances. To do so, the application proceeds as it would when operating on a CGprogram: by retrieving the CGtype corresponding to each type of concrete instance to be created, and calling cgCreateParameter() or cgCreateParameterArray() to create the shared parameter of the given type. Lastly, the shared parameter is connected to the effect parameter. This process is illustrated below.

    CGtype spotType = cgGetNamedUserType(effect, "SpotLight");
    CGparameter spots = cgCreateParameterArray(context,
                                               spotType, 4);
    CGparameter lights = cgGetNamedEffectParameter(effect,
                                                   "lights");
    cgConnectParameter(spots, lights);

Note that cgGetNamedUserType() in this case is passed a CGeffect handle, rather than a CGprogram handle.

Later, when the associated technique is validated, any programs that make use of the abstract effect parameters are compiled.

Note that abstract parameters may not be used on the right-hand side of any state assignments other than compile state assignments. Doing so results in an error at effect creation time.

Evaluating Cg Programs using the Virtual Machine

There are many situations where it is useful to execute Cg programs on the CPU using the Cg runtime Virtual Machine (VM). Although running Cg programs on the CPU doesn't offer the same performance as execution on the GPU, it is som
172. e in the new program will be identical to that of the identical user type in the existing program or effect. If the types are not equivalent, the new type will be assigned a unique CGtype. In this way, type equivalency of parameters shared between multiple programs and effects can be assured simply by comparing CGtype enumerants.

In order for two types to be considered equivalent, they must meet the following requirements:

- The type names must match.
  Both types must have the exact same name.

- The parent types, if any, must match.
  If the type is a structure, both must either not implement an interface, or both implement interfaces that are type-equivalent.

- The member variables and methods must match.
  They must both have the exact same member variables and methods. The order and name of the variables must match exactly, and the order and name of the methods must match. The signature of the methods, including argument and return types, must be identical.

Type equivalency is useful when using shared parameter instances with multiple programs by connecting them with cgConnectParameter().

Parameter Validity

The function cgIsParameter() allows you to check whether a parameter handle references a valid parameter or not:

    CGbool cgIsParameter(CGparameter parameter);

A parameter handle becomes invalid when the program or the context of the program it corresponds to is de
173. e most recent technology is highly programmable, and becoming ever more so. We can now write short vertex and fragment programs to be executed by the GPU. This requires great skill, and is only possible with short programs.

When GPU hardware grows to allow programs of hundreds, thousands, or even more instructions, assembly coding will no longer be practical. Rather than programming each rendering state, each bit, byte, and word of data and control through a low-level assembly language, we want to express our ideas in a more straightforward form, using a high-level language.

Thus Cg ("C for Graphics") becomes necessary and inevitable. Just as C was derived to expose the specific capabilities of processors while allowing higher-level abstraction, Cg allows the same abstraction for GPUs. Cg changes the way programmers can program, focusing on the ideas, the concepts, and the effects they wish to create, not on the details of the hardware implementation. Cg also decouples programs from specific hardware because the language is functional, not hardware-implementation specific. Also, since Cg can be compiled at run time on any platform, operating system, and for any graphics hardware, Cg programs are truly portable. Finally, and perhaps best of all, Cg programs are future-proof and can adapt to run well on future products. The compiler can optimize directly for a new target GPU that perhaps did not even exist when the original Cg program was written.
174. e of an object when used in an expression. The qualifiers are:

- const
  The value of a const-qualified object cannot be changed after its initial assignment. The definition of a const-qualified object that is not a parameter must contain an initializer. Named compile-time values are inherently qualified as const, but an explicit qualification is also allowed.
  The value of a static const cannot be changed after compilation, and thus its value may be used in constant folding during compilation. A uniform const, on the other hand, is only const for a given execution of the program; its value may be changed via the runtime between executions.

- in and out
  Formal parameters may be qualified as in, out, or both (by using in out or inout). By default, formal parameters are in-qualified. An in-qualified parameter is equivalent to a call-by-value parameter. An out-qualified parameter is equivalent to a call-by-result parameter, and an inout-qualified parameter is equivalent to a value/result parameter. An out-qualified parameter cannot be const-qualified, nor may it have a default value.

Type Conversions

Some type conversions are allowed implicitly, while others require an explicit cast. Some implicit conversions may cause a warning, which can be suppressed by using an explicit cast. Explicit casts are indicated using C-style syntax (casting variable to the float4 type can be achieved using (floa
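The in, out, and inout parameter modes described above have no direct counterpart in C, but their copy semantics can be imitated with pointers. The following plain C sketch is only an analogy: the function scale_and_count and its parameter names are hypothetical, and the explicit local copies are written out to make the call-by-value, call-by-result, and value/result behaviour visible.

```c
/* in_value     behaves like a Cg "in" parameter (call by value).
   out_scaled   behaves like a Cg "out" parameter (call by result):
                it is never read, only written when the function returns.
   inout_calls  behaves like a Cg "inout" parameter (value/result):
                it is copied in at entry and copied back out at exit. */
void scale_and_count(float in_value, float *out_scaled, int *inout_calls)
{
    int local_calls = *inout_calls;      /* copy in */
    float local_scaled = in_value * 2.0f;

    local_calls += 1;

    *out_scaled = local_scaled;          /* copy out */
    *inout_calls = local_calls;          /* copy out */
}
```

A caller that passes 1.5f and a counter holding 3 receives 3.0f in the output and 4 in the counter, while its own copy of the input value is untouched.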
175. e one automatically. Scalar uniform parameters may be allocated to either the xyz or the w portion of a constant register depending on how they are used within the Cg program. When using the output of the compiler without the Cg runtime, you must set all values of a scalar uniform to the desired scalar value, not just the x component.

The valid binding semantics for uniform parameters in the fp20 profile are summarized in Table 35.

Table 35. fp20 Uniform Binding Semantics

    Binding Semantics Name          Corresponding Data
    register(s0)-register(s3)       Texture unit N, where N is in range [0..3].
    TEXUNIT0-TEXUNIT3               May be used only with uniform inputs with
                                    sampler* types.

The ps_1_x profiles allow the programmer to decide which constant register a uniform variable will reside in by specifying the C<n>/register(c<n>) binding semantic. This is not allowed in the fp20 profile since the NV_register_combiners extension does not have a single bank of constant registers. While the NV_register_combiners extension does describe constant registers, these constant registers are per-combiner stage, and specifying bindings to them in the program would overly constrain the compiler.

Binding Semantics for Varying Input/Output Data

The varying input binding semantics in the fp20 profile are the same as the varying output binding semantics of the vp20 profile
176. e shows a few common uses for annotations: the annotation of LightDir indicates what sort of user interface widget would be appropriate to provide the user for setting that parameter. The technique's annotation might indicate that applying the technique was optional when rendering the scene. In the example above, the pass annotations indicate to the application which part of the scene geometry to draw when rendering that pass, as well as where to store the image from rendering the pass.

Given a handle to a technique, pass, or parameter, there are API entry points for iterating through the annotations in turn:

    CGannotation cgGetFirstTechniqueAnnotation(CGtechnique);
    CGannotation cgGetFirstPassAnnotation(CGpass);
    CGannotation cgGetFirstParameterAnnotation(CGparameter);
    CGannotation cgGetFirstProgramAnnotation(CGprogram);
    CGannotation cgGetNextAnnotation(CGannotation);

In addition, there are entry points for retrieving annotations by name:

    CGannotation cgGetNamedTechniqueAnnotation(CGtechnique,
                                               const char*);
    CGannotation cgGetNamedPassAnnotation(CGpass, const char*);
    CGannotation cgGetNamedParameterAnnotation(CGparameter,
                                               const char*);
    CGannotation cgGetNamedProgramAnnotation(CGprogram,
                                             const char*);

Given an annotation handle, its values may be retrieved through the use of one of the cgGet*AnnotationValues() entry points:

    const float* cgGetFloatAnnotationVal
177. e specified length. For example:

    float func(float vals[6][]) { ... }

    float4 main(...)
    {
        float vals1[6][3];
        float vals2[5][3];
        ...
        float myv1 = func(vals1);   // match: 6 == 6
        float myv2 = func(vals2);   // no match: 5 != 6
    }

Unsized arrays may only be declared as function parameters; they may not be declared as variables. Furthermore, in all current profiles, the actual array length and address calculations implied by array indexing must be known at compile time.

Unsized array parameters of top-level functions (such as main()) may be connected to sized arrays that are created in the runtime, or their size may be set directly for convenience. See the cgSetArraySize() manual in the Cg core runtime documentation for details.

Interfaces

Cg supports interfaces, a language construct found in other languages, including Java and C# (and in C++ as pure virtual classes). Interfaces provide a means of abstractly describing the member functions a particular structure provides, without specifying how those functions are implemented. When used in conjunction with parameter instantiation by the Cg runtime, this abstraction makes it possible to plug any structure that implements a given interface into a program, even if the structure was not known to the author of the original program.

An interface declaration describes a set of member functions that a structure must define in order to impleme
178. e supported by this profile are presented in Table 33. See the standard library documentation for descriptions of these functions.

Table 33. Supported Standard Library Functions

    dot(floatN, floatN)
    lerp(floatN, floatN, floatN)
    lerp(floatN, floatN, float)
    tex1D(sampler1D, float)
    tex1D(sampler1D, float2)
    tex1Dproj(sampler1D, float2)
    tex1Dproj(sampler1D, float3)
    tex2D(sampler2D, float2)
    tex2D(sampler2D, float3)
    tex2Dproj(sampler2D, float3)
    tex2Dproj(sampler2D, float4)
    texRECT(samplerRECT, float2)
    texRECT(samplerRECT, float3)
    texRECTproj(samplerRECT, float3)
    texRECTproj(samplerRECT, float4)
    tex3D(sampler3D, float3)
    tex3Dproj(sampler3D, float4)
    texCUBE(samplerCUBE, float3)
    texCUBEproj(samplerCUBE, float4)

Note: The nonprojective texture lookup functions are actually done as projective lookups on the underlying hardware. Because of this, the w component of the texture coordinates passed to these functions from the application or vertex program must contain the value 1.

Texture coordinate parameters for projective texture lookup functions must have swizzles that match the swizzle done by the generated texture shader instruction. While this may seem burdensome, it is intended to allow fp20 profile
179. e swizzle operator is the same as that of the array subscripting operator ([]).

Write Mask Operator

The write mask operator (.) is placed on the left-hand side of an assignment statement. It can be used to selectively overwrite the components of a vector. It is illegal to specify a particular component more than once in a write mask, or to specify a write mask when initializing a variable as part of a declaration.

The following is an example of a write mask:

    float4 color = float4(1.0, 1.0, 0.0, 0.0);
    color.a = 1.0;   // Set alpha to 1.0, leaving RGB alone

The write mask operator can be a powerful tool for generating efficient code because it maps well to the capabilities of GPU hardware. The precedence of the write mask operator is the same as that of the swizzle operator.

Conditional Operator

Cg includes C's if/else conditional statement and conditional operator (?:). With the conditional operator, the control variable may be a bool vector. If so, the second and third operands must be similarly sized vectors, and selection is performed on an elementwise basis. Unlike C, any side effects associated with the second and third operands always occur, regardless of the conditional.

As an example, the following would be a very efficient way to implement a vector clamp function, if the min() and max() functions did not exist:

    float3 clamp(float3 x, float minval, float maxval)
180. e the device changes or is destroyed
    void OnDestroyDevice()
    {
        device->DeleteVertexShader(vertexShader);
        device->DeletePixelShader(pixelShader);
    }

    // Called before application shuts down
    void OnShutdown()
    {
        // This frees any core runtime resources.
        // The minimal interface has no dynamic storage to free.
        cgDestroyContext(context);
    }

Direct3D Expanded Interface

If you use the expanded interface for a program, use it consistently for all shader-related operations that can be performed through its functions, such as shader setting, shader activation, and parameter setting (including setting texture stage states). Mixing it with direct calls invites inconsistencies.

Setting the Direct3D Device

The expanded interface encapsulates more functionality than the minimal interface to ease program and parameter management. It does this by making the appropriate Direct3D calls at the appropriate times. Because some of these calls require the Direct3D device, it must be communicated to the Cg runtime:

    HRESULT cgD3D9SetDevice(IDirect3DDevice9* device);

You can get the Direct3D device currently associated with the runtime using cgD3D9GetDevice():

    IDirect3DDevice9* cgD3D9GetDevice();

When cgD3D9SetDevice() is called with zero as an input, all Direct3D resources used by the expanded interface are released. Since a Direct3D device
181. e to find (1) the positions of the vertices in stream 0 as the first three floating-point values of the vertex format, (2) the normals as the next three floating-point values following the three floating-point values in stream 0, and (3) the texture coordinates as the two floating-point values located at an offset equal to twice the size of a DWORD from the end of the normal data in stream 0. The tangents are provided in stream 1 as a second texture coordinate set that is found as the first three floating-point values of the vertex format.

To get a vertex declaration from a Cg vertex program for the Direct3D 9 Cg runtime, use cgD3D9GetVertexDeclaration():

    CGbool cgD3D9GetVertexDeclaration(CGprogram program,
        D3DVERTEXELEMENT9 declaration[MAXD3DDECLLENGTH]);

MAXD3DDECLLENGTH is a Direct3D 9 constant that gives the maximum length of a Direct3D 9 declaration. If no declaration can be derived from the program, cgD3D9GetVertexDeclaration() fails and returns CG_FALSE.

To get a vertex declaration from a Cg vertex program for the Direct3D 8 Cg runtime, use cgD3D8GetVertexDeclaration():

    CGbool cgD3D8GetVertexDeclaration(CGprogram program,
        DWORD declaration[MAX_FVF_DECL_SIZE]);

MAX_FVF_DECL_SIZE is a Direct3D constant that gives the maximum length of a Direct3D declaration. If no declaration can be derived from the program, cgD3D8GetVertexDeclaration() fails and ret
182. eColor, &constantColor);
    }

    // Called to render the scene
    void OnRender()
    {
        // Load model-view matrix
        D3DXMATRIX modelViewMatrix;
        // ...

        // Set the parameters that change every frame
        // This must be done before binding the programs
        cgD3D8SetUniformMatrix(modelViewMatrix, &modelViewMatrix);

        // Bind the programs. This downloads any parameter values
        // that have been previously set
        cgD3D8BindProgram(vertexProgram);
        cgD3D8BindProgram(fragmentProgram);

        // Draw scene
        // ...
    }

    // Called before the device changes or is destroyed
    void OnDestroyDevice()
    {
        // Calling this function tells the expanded interface to
        // release its internal reference to the Direct3D device
        // and free its Direct3D resources
        cgD3D8SetDevice(0);
    }

    // Called before application shuts down
    void OnShutdown()
    {
        // This frees any core runtime resource
        cgDestroyContext(context);
    }

Direct3D Debugging Mode

In addition to the error reporting mechanisms described in "Direct3D Error Reporting" on page 114, a debug version of the Direct3D 9 or Direct3D 8 Cg runtime DLL is provided to assist you with the development of applications using the Direct3D 9 or Direct3D 8 Cg runtime. This version does not have debug symbols, but when used in place of the regular version, it uses the Win32 function OutputDebugString() to output many helpful messages and traces to the d
183. ... 38
Derivative Functions 41
Debugging Function 41
Predefined Fragment Program Output Structures 42
Introduction to the Cg Runtime Library 43
Introducing the Cg Runtime 43
Benefits of the Cg Runtime 44
Overview of the Cg Runtime 45
Core Cg Runtime 49
Core Cg Context 50
Core Cg Program 50
Core Cg Parameters 54
Core Cg Error Reporting 71
API-Specific Cg Runtimes 72
Parameter Shadowing 73
OpenGL Cg Runtime 73
Direct3D Cg Runtime 85
Introduction to CgFX 117
CgFX Overview 117
... 117
Getting Started 118
Technique Validation 120
Passes and Pass State 120
Effect Parameters 121
Vertex and Fragment Programs
184. ebug output console. Examples of information the debug DLL outputs are the following:

- Any Direct3D or Cg core runtime errors
- Debugging information about parameters that are managed by the expanded interface
- Potential performance warnings

Here is a sample trace:

    cgD3D(TRACE): Creating vertex shader for program 3
    cgD3D(TRACE): Discovering parameters for vertex program 3
    cgD3D(TRACE): Discovered uniform parameter 'ModelViewProj' of type float4x4
    cgD3D(TRACE): Finished discovering parameters for vertex program 3
    cgD3D(TRACE): Creating pixel shader for program 24
    cgD3D(TRACE): Discovering parameters for pixel program 24
    cgD3D(TRACE): Discovered sampler parameter 'BaseTexture'
    cgD3D(TRACE): Discovered uniform parameter 'SomeColor' of type float4
    cgD3D(TRACE): Finished discovering parameters for pixel program 24
    cgD3D(TRACE): Shadowing state for sampler parameter 'BaseTexture'
    cgD3D(TRACE): Shadowing sampler state D3DTSS_MAGFILTER for sampler parameter 'BaseTexture'
    cgD3D(TRACE): Shadowing sampler state D3DTSS_MINFILTER for sampler parameter 'BaseTexture'
    cgD3D(TRACE): Shadowing sampler state D3DTSS_MIPFILTER for sampler parameter 'BaseTexture'
    cgD3D(TRACE): Shadowing 16 values for uniform parameter 'ModelViewProj' of type float4x4
    cgD3
185. ed.

Function cgGetParameterResourceIndex() retrieves the numerical portion of the resource:

    unsigned long cgGetParameterResourceIndex(CGparameter parameter);

For example, if the resource for a given parameter is CG_TEXCOORD7, cgGetParameterResourceIndex() returns 7.

The cgGetParameterValues() function retrieves the default or constant value of a uniform parameter:

    const double* cgGetParameterValues(CGparameter parameter,
        CGenum valueType, int* numberOfValuesReturned);

It retrieves the default value if valueType is equal to CG_DEFAULT and the constant value if valueType is equal to CG_CONSTANT. The components of the value are returned in row-major order as a pointer to an array containing type double elements. After cgGetParameterValues() is called, the number of components available in the array is pointed to by numberOfValuesReturned.

Core Cg Error Reporting

An error code is associated with each type of runtime error that can be generated. The runtime caches both the most recently generated error and the error that was first generated since the error code was last checked by the application. Applications can query the cached error codes, as well as the error message corresponding to either, using:

    CGerror error = cgGetError();
    CGerror error = cgGetFirstError();
    const char* errorString = cgGetErrorString(error);

An error code of 0 i
186. ed if the compiler can fully unroll them (that is, if the compiler can determine the iteration count at compile time). Likewise, return can only appear as the last statement in a function in these profiles.

Function recursion (and co-recursion) is forbidden in Cg.

The switch, case, and default keywords are reserved, but they are not supported by any profiles in the current release of the Cg compiler.

Definitions and Function Overloading

To pass a modifiable function parameter in C, the programmer must explicitly use pointers. C++ provides a built-in pass-by-reference mechanism that avoids the need to explicitly use pointers, but this mechanism still implicitly assumes that the hardware supports pointers. Cg must use a different mechanism because the vertex and fragment hardware of the GPU does not support the use of pointers. Cg passes modifiable function parameters by value-result, instead of by reference. The difference between these two methods is subtle; it is only apparent when two function parameters are aliased by a function call. In Cg, the two parameters have separate storage in the function, whereas in C++ they would share storage.

To reinforce this distinction, Cg uses a different syntax than C++ to declare function parameters that are modified:

    function blah1(out float x);     // x is output-only
    function blah2(inout float x);   // x is input and output
    function blah3(in float x);      // x is input-only
    function blah4(float x);         // x is inpu
187. ed in Cg is to associate a binding semantic with each element of the packet. This is a bind-by-name approach. For example, an output with the binding semantic FOO is fed to an input with the binding semantic FOO. Profiles may allow the user to define arbitrary identifiers in this "semantic namespace", or they may restrict the allowed identifiers to a predefined set. Often, these predefined names correspond to the names of hardware registers or API resources.

In some cases, predefined names may control non-programmable parts of the hardware. For example, vertex programs normally compute a position that is fed to the rasterizer, and this position is stored in an output with the binding semantic POSITION.

For any profile, there are two namespaces for predefined binding semantics: the namespace used for in variables and the namespace used for out variables. The primary implication of having two namespaces is that the binding semantic cannot be used to implicitly specify whether a variable is in or out.

Binding Semantics

A binding semantic may be associated with an input to a top-level function in one of three ways:

- The binding semantic is specified in the formal parameter declaration for the function. The syntax for formal parameters to a function is
      [const] [in | out | inout] <type> <identifier> [: <binding-semantic>] [= <initializer>]
- If the f
188. ed is back facing, greater than zero if it is front facing, and zero if the fragment was from a line or a point.

OpenGL NV_vertex_program 2.0 Profile (vp30)

The vp30 Vertex Program profile is used to compile Cg source code to vertex programs for use by the NV_vertex_program2 OpenGL extension.

- Profile name: vp30
- How to invoke: Use the compiler option -profile vp30.

The vp30 profile limits Cg to match the capabilities of the NV_vertex_program2 extension. This section describes the capabilities and restrictions of Cg when using the vp30 profile.

Position Invariance

Under vp30, unlike other profiles, the following points can be made:

- The -posinv option won't cause an OPTION driver directive to be added to the assembly code header (see the OpenGL specification for more details on this directive).
- The instructions for transforming the position using the modelview-projection matrix are emitted.

They are true because the final assembly code itself guarantees that the position calculation is invariant compared to the fixed pipeline calculation.

Language Constructs

Data Types

This profile implements data types as follows:

- float data type is implemented as IEEE 32-bit single precision.
- half data type is implemented as float.
- int data type is supported using floating-point operations, which adds extra instructions for proper truncation for divides
189. ed matrix types. Implementations must also predefine type identifiers (in the global scope) to represent these types:

    packed TYPE1 TYPE1x1[1];    packed TYPE1 TYPE3x1[3];
    packed TYPE2 TYPE1x2[1];    packed TYPE2 TYPE3x2[3];
    packed TYPE3 TYPE1x3[1];    packed TYPE3 TYPE3x3[3];
    packed TYPE4 TYPE1x4[1];    packed TYPE4 TYPE3x4[3];
    packed TYPE1 TYPE2x1[2];    packed TYPE1 TYPE4x1[4];
    packed TYPE2 TYPE2x2[2];    packed TYPE2 TYPE4x2[4];
    packed TYPE3 TYPE2x3[2];    packed TYPE3 TYPE4x3[4];
    packed TYPE4 TYPE2x4[2];    packed TYPE4 TYPE4x4[4];

For example, implementations must predefine the type identifiers float2x1, float3x3, float4x4, and so on. A typedef follows the usual matrix-naming convention of TYPE_rows_X_columns. If we declare float4x4 a, then a[3] is equivalent to a._m30_m31_m32_m33. Both expressions extract row 3 of the matrix (the fourth row, since indices start at zero).

- Implementations are required to support indexing of vectors and matrices with constant indices.
- A struct type is a collection of one or more members of possibly different types.
- An interface type defines a collection of methods that comprises an abstract interface.

Partial Support of Types

This specification mandates partial support for some types. Partial support for a type requires the following:

- Definitions and declarations using the type are supported.
- Assignment and copy of objects of that type are supported (including implicit copie
190. ed to by a D3DVSD_REG() macro call is expanded to the four floating-point values of the corresponding hardware register, and the missing values are set to 0 for x, y, and z, and to 1 for w.

Minimal Interface Type Retrieval

Use cgD3D9TypeToSize() to retrieve the size of a CGtype enumerated type in terms of floating-point numbers:

    DWORD cgD3D9TypeToSize(CGtype type);

More precisely, it is the number of floating-point values required to store a parameter of type type. This function does not apply to some types, like the sampler types, in which case it returns zero. It is useful because applications can determine how many floating-point values they have to provide to set the value of a given parameter.

Minimal Interface Program Examples

In this section we provide some code samples that illustrate how and when to use functions from the minimal interface to make Cg programs work with Direct3D. To enhance clarity, the examples do very little error checking, but a production application should check the return values of all Cg functions. The vertex and fragment programs below are referenced in "Direct3D 9 Application" on page 92 and "Direct3D 8 Application" on page 95.

Vertex Program

The following Cg code is assumed to be in a file called VertexProgram.cg:

    void VertexProgram(
        in float4 position : POSITION,
        in float4 color : COLOR0,
        in float4 texCoord : TEXCOORD0,
        out float4 positionO : POSITION,
191. effect contains one or more techniques. A technique is intended to encapsulate the information needed to produce a visual effect: graphics state, shaders, and at least one rendering pass.

Pass

Each technique contains one or more rendering passes. Passes store graphics state, possibly including fixed-function state settings and vertex and fragment shaders. The passes are generally processed in order: CgFX sets the graphics state for a pass, the application draws the scene geometry, the state for the next pass is set, geometry is drawn again, and so on.

State assignment

Passes hold state assignments that describe the graphics state for the pass.

Annotation

Annotations make it possible to associate meta-data with parameters, techniques, passes, and so on. For example, a parameter like lightIntensity might have annotations indicating the minimum and maximum valid values for the parameter.

Effect parameter

Parameters declared in the global scope of the effect file are effect parameters. Effect parameter values may be set and queried using the Cg runtime API. Effect parameters may be referenced on the right-hand side of state assignments and also as global parameters within Cg functions and programs defined within the effect.

Getting Started

We expect that the reader is generally familiar with the Cg runtime. See "Introduction to the Cg Runtime Library" on page
192. eft operand. The side effect of updating the stored value of the left operand occurs between the previous and the next sequence point.

Smearing of Scalars to Vectors

If a binary operator is applied to a vector and a scalar, the scalar is automatically type-promoted to a same-sized vector by replicating the scalar into each component. The ternary ?: operator also supports smearing. The binary rule is applied to the second and third operands first, and then the binary rule is applied to this result and the first operand.

Namespaces

Just as in C, there are two namespaces. Each has multiple scopes, as in C.

- Tag namespace, which consists of struct tags
- Regular namespace:
    - typedef names (including an automatic typedef from a struct declaration)
    - Variables
    - Function names

Arrays and Subscripting

Arrays are declared as in C, except that they may optionally be declared to be packed, as described under "Types" on page 229. Arrays in Cg are first-class types, so array parameters to functions and programs must be declared using array syntax, rather than pointer syntax. Likewise, assignment of an array-typed object implies an array copy rather than a pointer copy.

Arrays with size [1] may be declared but are considered a different type from the corresponding non-array type.

Because the language does not currently support pointers, the storage order of arrays
193. egister N, where N is in range [0..31] (C0-C31). May only be used with uniform inputs.

Binding Semantics for Varying Input/Output Data

The valid binding semantics for varying input parameters in the ps_2_0 and ps_2_x profiles are summarized in Table 43.

Table 43. ps_2_* Varying Input Binding Semantics

    Binding Semantics Name     Corresponding Data (type)
    COLOR0                     Input color 0 (float4)
    COLOR1                     Input color 1 (float4)
    TEXCOORD0-TEXCOORD7        Input texture coordinates (float4)

The valid binding semantics for varying output parameters in the ps_2_0 and ps_2_x profiles are summarized in Table 44.

Table 44. ps_2_* Varying Output Binding Semantics

    Binding Semantics Name     Corresponding Data
    COLOR, COLOR0              Output color (float4)
    DEPTH                      Output depth (float)

Options

The ps_2_x profile allows the following profile-specific options:

    NumTemps=<n>                where 0 <= n <= 32 (default 32)
    NumInstructionSlots=<n>     where n >= 0 (default 1024)
    Predication=<b>             where b = 0 or 1 (default 1)
    ArbitrarySwizzle=<b>        where b = 0 or 1 (default 1)
    GradientInstructions=<b>    where b = 0 or 1 (default 1)
    NoDependentReadLimit=<b>    where b = 0 or 1 (default 1)
    NoTexInstructionLimit=<b>   where b = 0 or 1 (default 1)

Limitations in this Implementation

Currently, this profile implementation has the following limitations:

- Dynamic flow control is not supported in extended pixel shaders.
-
194. enable a specific profile.

Next, you bind the program to the current state. This means that in subsequent drawing calls the program is executed for every vertex in the case of a vertex program and for every fragment in the case of a fragment program.

Here's how to bind a program in OpenGL:

    cgGLBindProgram(program);

Here's how to bind a program in Direct3D:

    cgD3D9BindProgram(program);

You can only bind one vertex and one fragment program at a time for a particular profile. Therefore, the same vertex program is executed until another vertex program is bound. Similarly, the same fragment program is executed as long as no other fragment program is bound.

In OpenGL, you disable profiles by the following call:

    cgGLDisableProfile(CG_PROFILE_ARBVP1);

Disabling a profile also disables the execution of the corresponding vertex or fragment program.

Releasing Resources

When your application is ready to close, it is good programming practice to free resources that you've acquired.

Because the Direct3D runtime keeps an internal reference to the Direct3D device, you must tell it to release this reference when you are done using the runtime. This is done with the following call:

    cgD3D9SetDevice(0);

To free resources allocated for a program, call this function:

    cgDestroyProgram(program);

To free resources allocated for a context, use this function:

    cgD
195. encouraged to issue a warning.

- Implementations may choose to recognize more general versions of the second condition (such as the variables being copy-propagated from the original inputs and outputs), but this additional generality is not required.

Binding Semantics for Outputs

As shown in Table 11, there are two output binding semantics for vertex program profiles.

Table 11. Vertex Output Binding Semantics

    Name        Meaning                                   Type      Default Value
    POSITION    Homogeneous clip-space position;          float4    Undefined
                fed to rasterizer
    PSIZE       Point size                                float     Undefined

Profiles may define additional output binding semantics with specific behaviors, and these definitions are expected to be consistent across commonly used profiles.

Fragment Program Profiles

A few features of the Cg language that are specific to fragment program profiles are required to be implemented in the same manner for all fragment program profiles.

Binding Semantics for Outputs

As shown in Table 12, there are three output binding semantics for fragment program profiles. Profiles may define additional output binding semantics with specific behaviors, and these definitions are expected to be consistent across commonly used profiles.

Table 12. Fragment Output Binding Semantics

    Name        Meaning                                   Type      Default Value
    COLOR       RGBA output color                         float4    Undefined
    COLOR0      Sa
196. erally are executed many more times than vertex programs. Therefore, move computation from fragment programs into vertex programs whenever possible. Recall that varying outputs from vertex programs are automatically linearly interpolated before being passed to the fragment program.

There are three main cases where you can move computation from a fragment program into a vertex program:

- The result is constant over all fragments.
  If the vertex shader computes a value that is the same for all vertices, so that all fragments receive the same value after interpolation, any computation that the fragment shaders do that is based solely on such values can be moved to the vertex shader (as long as it doesn't require texture map lookups or other fragment-only operations).

- The result is linear across a triangle.
  If the fragment shader is computing a value that varies linearly over the face of the triangle (for example, the distance from the fragment to a light source, to be used for attenuation), the value can be computed in the vertex shader at each vertex, passed to the fragment shader, and automatically interpolated by the GPU along the way.

- The result is nearly linear across a triangle.
  When a value computed by a fragment shader varies slowly over triangles, it may be an acceptable approximation to compute its value at each vertex and use its linearly interpolated value in the fragment shader. For example, the usual Gouraud shading algorithm take
197. ertexProgram", 0);

        // Load the program
        cgGLLoadProgram(vertexProgram);

        // Create the fragment program
        fragmentProgram = cgCreateProgramFromFile(
            context, CG_SOURCE, "FragmentProgram.cg",
            fragmentProfile, "FragmentProgram", 0);

        // Load the program
        cgGLLoadProgram(fragmentProgram);

        // Grab some parameters
        position = cgGetNamedParameter(vertexProgram, "position");
        color = cgGetNamedParameter(vertexProgram, "color");
        texCoord = cgGetNamedParameter(vertexProgram, "texCoord");
        modelViewMatrix = cgGetNamedParameter(vertexProgram,
            "ModelViewMatrix");
        baseTexture = cgGetNamedParameter(fragmentProgram,
            "BaseTexture");
        someColor = cgGetNamedParameter(fragmentProgram,
            "SomeColor");

        // Set parameters that don't change
        // They can be set only once because of parameter shadowing
        cgGLSetTextureParameter(baseTexture, texture);
        cgGLSetParameter4fv(someColor, constantColor);
    }

    // Called to render the scene
    void Display()
    {
        // Set the varying parameters
        cgGLEnableClientState(position);
        cgGLSetParameterPointer(position, 3, GL_FLOAT, 0,
            vertexPositions);
        cgGLEnableClientState(color);
        cgGLSetParameterPointer(color, 1, GL_FLOAT, 0,
            vertexColors);
        cgGLEnableClientState(texCoord);
        cgGLSetParameterPointer(texCoord, 2, GL_FLOAT, 0,
            vertexTexCoords);

        // Set the uniform parameters that change ev
198. ery frame
        cgGLSetStateMatrixParameter(modelViewMatrix,
            CG_GL_MODELVIEW_PROJECTION_MATRIX,
            CG_GL_MATRIX_IDENTITY);

        // Enable the profiles
        cgGLEnableProfile(vertexProfile);
        cgGLEnableProfile(fragmentProfile);

        // Bind the programs
        cgGLBindProgram(vertexProgram);
        cgGLBindProgram(fragmentProgram);

        // Enable texture
        cgGLEnableTextureParameter(baseTexture);

        // Draw scene
        // ...

        // Disable texture
        cgGLDisableTextureParameter(baseTexture);

        // Disable the profiles
        cgGLDisableProfile(vertexProfile);
        cgGLDisableProfile(fragmentProfile);

        // Disable the varying parameters
        cgGLDisableClientState(position);
        cgGLDisableClientState(color);
        cgGLDisableClientState(texCoord);
    }

    // Called before application shuts down
    void CgShutdown()
    {
        // This frees any runtime resource
        cgDestroyContext(context);
    }

OpenGL Error Reporting

Here is the list of the CGerror errors specific to the OpenGL Cg runtime:

- CG_PROGRAM_LOAD_ERROR: Returned when the program could not be loaded.
- CG_PROGRAM_BIND_ERROR: Returned when the program could not be bound.
- CG_PROGRAM_NOT_LOADED_ERROR: Returned when the program must be loaded before the operation may be used.
- CG_UNSUPPORTED_GL_EXTENSION_ERROR: Returned when an unsupported OpenGL extension is required to perform the operation.

Any OpenGL Cg runtime f
199. es" (where you can push the GPU to its limits through careful programming).

The Cg language shields you from the majority of the low-level details of GPU hardware, enabling you to think about your shaders at a higher level than the low-level GPU instruction sets. However, just as an understanding of modern computer architecture (such as cache and memory-hierarchy issues) is important for writing fast C and C++ code, understanding a bit about the GPU can help you write better Cg code. This appendix focuses on techniques for maximizing performance from vertex and fragment programs written in Cg and running on the NVIDIA GeForce FX architecture (specifically the vp30, fp30, arbfp1, ps_2_0, ps_2_x, vs_2_0, and vs_2_x profiles), although many of the principles are more broadly applicable.

Program for Vectorization

The GPU can generally perform four arithmetic operations as quickly as it can perform a single operation. Therefore, if you have two vectors of four floating-point values:

    float4 a, b;

you can add the two vectors together

    float4 c = a + b;

with no more computational expense than adding together two of their elements:

    float c = a.x + b.x;

This has two implications for efficient programming. First, you should try to write code that naturally maps to these vector operations. If you want to add two float4 variables together, it may be substantially less efficient to write
200. ese types can be used to hold the outputs of a fragment program. Their use is strictly optional.

For the ps_1 and fp20 profiles, the fragout structure is defined as follows:

    struct fragout {
        float4 col : COLOR;
    };

The ps_2, arbfp1, and fp30 profiles have two fragment output types defined:

    struct fragout {
        half4 col : COLOR;
        float depth : DEPTH;
    };

    struct fragout_float {
        float4 col : COLOR;
        float depth : DEPTH;
    };

Introduction to the Cg Runtime Library

This chapter introduces the Cg Runtime Library. It assumes that you have some basic knowledge of the Cg language, as well as the OpenGL or Direct3D APIs, depending on which one you use in your applications.

The first section, "Introducing the Cg Runtime" on page 43, describes the benefits of using the Cg Runtime Library and gives a brief overview of how it is used in an application to create and manage Cg programs. The next two sections, "Core Cg Runtime" on page 49 and "API-Specific Cg Runtimes" on page 72, describe the APIs composing the Cg Runtime.

This chapter is primarily focused on using the Cg runtime to directly create and manage Cg programs. The following chapter, "Introduction to CgFX", describes how the runtime may also be used to create and manage Cg-based shader effects.

Introducing the Cg Runtime

Cg programs are lines of code that describe shading, but they need the support of applications to create
201. estroyContext(context);

Note that destroying a context destroys all the programs it contains as well.

Core Cg Runtime

The core Cg runtime provides all the functions necessary to manage Cg programs from within the application. It makes no assumption about which 3D API the application uses, so any application can ignore the API-specific Cg runtime libraries and content itself with the core Cg runtime.

The core Cg runtime is built around three main concepts: context, program, and parameter, which are represented by the CGcontext, CGprogram, and CGparameter object types. These concepts are hierarchically related to each other: a program has several parameters, a context contains several programs and shared parameters, and the application can define several contexts.

The next sections describe these three basic object types and the runtime entry points that operate on them. The three object types have some points in common:

- The use of CGbool, which is an integer type equal to either CG_TRUE or CG_FALSE.
- The use of CGenum, which is an enumerated type used to specify various enumerated values that are not necessarily related.
- The convention that functions that return a value of type CGcontext, CGprogram, CGparameter, or const char * indicate failure by returning zero.

Core Cg Context

The Cg runtime provides functions for creating, destroying, and quer
202. etimes useful, as in tabularizing complex functions into texture maps.

Programs that are to run on the VM are declared as follows:

    [code listing garbled in source: a function foo that returns a float4 with the COLOR semantic and takes parameters bound to the POSITION and PSIZE semantics]

The POSITION semantic denotes the parameter or parameters that are initialized with the coordinates of each point at which the function is evaluated. The value passed varies from zero to one in each of the dimensions over which the function is being evaluated. The PSIZE semantic denotes the parameter that is initialized with the spacing between samples at which the function is being evaluated. Lastly, the COLOR semantic denotes which parameter (or function return value) holds the computed value. Thus, the function above could have been written as a void function but with an out float4 ret : COLOR parameter and an assignment to ret, instead of using a return statement.

Given an effect file with such a program, a CGprogram handle to it can be retrieved by creating a program using the CG_PROFILE_GENERIC profile:

    CGprogram tp = cgCreateProgramFromEffect(effect,
        CG_PROFILE_GENERIC, [remaining arguments garbled in source]);

Given such a program handle, cgEvaluateProgram evaluates the program over a one-, two-, or three-dimensional domain:

    cgEvaluateProgram(CGprogram prog, float *obuf, int ncomp,
                      int nx, int ny, int nz);

where prog is the CGprogram handle retrieved using cgCreateProgramFromEffect(), obuf is the buffer
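To make the POSITION and PSIZE description above concrete, here is a minimal plain-C sketch of tabulating a function over a normalized two-dimensional domain. It does not call the Cg runtime: sample_fn, evaluate_2d, and product_fn are hypothetical names, and placing the samples at i / (n - 1) is an assumption made for illustration, since the text only says that the coordinate runs from zero to one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: writes ncomp values for the sample at (px, py),
   given the spacing between samples (sx, sy). */
typedef void (*sample_fn)(float px, float py, float sx, float sy,
                          float *out, int ncomp);

/* Fills obuf (nx * ny * ncomp floats, row-major) by calling fn on a grid
   whose coordinates run from 0 to 1 in each dimension.
   ASSUMPTION: samples sit at i / (n - 1); the real runtime may differ. */
static void evaluate_2d(sample_fn fn, float *obuf, int ncomp, int nx, int ny)
{
    assert(fn != NULL && obuf != NULL && ncomp > 0 && nx > 0 && ny > 0);
    float sx = (nx > 1) ? 1.0f / (float)(nx - 1) : 1.0f;
    float sy = (ny > 1) ? 1.0f / (float)(ny - 1) : 1.0f;
    for (int j = 0; j < ny; ++j) {
        for (int i = 0; i < nx; ++i) {
            float px = (nx > 1) ? (float)i * sx : 0.0f;
            float py = (ny > 1) ? (float)j * sy : 0.0f;
            float *out = obuf
                + ((size_t)j * (size_t)nx + (size_t)i) * (size_t)ncomp;
            fn(px, py, sx, sy, out, ncomp);
        }
    }
}

/* Example function to tabulate: stores px * py in every component. */
static void product_fn(float px, float py, float sx, float sy,
                       float *out, int ncomp)
{
    (void)sx;
    (void)sy;
    for (int c = 0; c < ncomp; ++c)
        out[c] = px * py;
}
```

Calling evaluate_2d(product_fn, buf, 1, 3, 3) fills a 3x3 single-component table that could then be uploaded as a texture by the application.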
203. ex Subsurface Scattering lighting models. It also illustrates the use of "Rim" lighting and simple translucency for capturing some of the more subtle properties of skin resulting from complex, non-local lighting interactions. Finally, it shows how the various techniques can be combined to produce compelling, stylized skin.

Fig. 10. Example of Skin

Pixel Shader Source Code for Skin

    struct fragin {
        float2 texcoords : TEXCOORD0;
        float4 shadowcoords : TEXCOORD1;
        float4 tangentToEyeMat0 : TEXCOORD4;
        float3 tangentToEyeMat1 : TEXCOORD5;
        float3 tangentToEyeMat2 : TEXCOORD6;
        float3 eyeSpacePosition : TEXCOORD7;
    };

    float3 hgphase(float3 v1, float3 v2, float3 g)
    {
        float costheta;
        float3 g2;
        float3 gtemp;

        costheta = dot(-v1, v2);
        g2 = g * g;
        gtemp = 1.0.xxx + g2 - 2.0 * g * costheta;
        gtemp = pow(gtemp, 1.5.xxx);
        gtemp = (1.0.xxx - g2) / gtemp;
        return gtemp;
    }

    // Computes the single-scattering approximation to
    // scattering from a one-dimensional volumetric surface.
    float3 singleScatter(float3 wi, float3 wo, float3 n,
                         float3 g, float3 albedo,
                         float thickness)
    {
        float win = abs(dot(wi, n));
        float won = abs(dot(wo, n));
        float eterm;
        float3 result;

        eterm = [expression garbled in source; it involves win, won, and thickness];
        result = eterm * (albedo * hgphase(wo, wi, g) /
                          (win + won));
        return result;
    }

    // i is the incident ray
    // n is the surface normal
    // eta
204. ex2D(diffuseMap, IN.texCoord0.xy);
        float4 normal = 2 * (tex2D(normalMap, IN.texCoord1.xy) - 0.5);
        float3 light_vector = 2 * (IN.color.rgb - 0.5);
        float4 dot_result = saturate(dot(light_vector, normal.xyz).xxxx);
        return dot_result * diffuseTexColor;
    }

Example 2

    struct VertexOut {
        float4 texCoord0 : TEXCOORD0;
        float4 texCoord1 : TEXCOORD1;
        float4 texCoord2 : TEXCOORD2;
        float4 texCoord3 : TEXCOORD3;
    };

    float4 main(VertexOut IN,
                uniform sampler2D normalMap,
                uniform sampler2D intensityMap,
                uniform sampler2D colorMap) : COLOR
    {
        float4 normal = 2 * (tex2D(normalMap, IN.texCoord0.xy) - 0.5);
        float2 intensCoord = float2(dot(IN.texCoord1.xyz, normal.xyz),
                                    dot(IN.texCoord2.xyz, normal.xyz));
        float4 intensity = tex2D(intensityMap, intensCoord);
        float4 color = tex2D(colorMap, IN.texCoord3.xy);
        return color * intensity;
    }

Appendix C: Nine Steps to High-Performance Cg

Writing Cg code that compiles to efficient programs requires techniques and approaches that are different from efficient programming in C, C++, or Java. While some of the basic lessons are the same (such as using efficient underlying algorithms), the hardware programming model of modern GPUs is substantially different from that of modern CPUs. This can lead to "pitfalls" (where you may be disappointed by your shader's performance), as well as to "opportuniti
205. float2 newst = float2(dot(intermediate_coord.xyz, prevlookup.xyz),
                               dot(str, prevlookup.xyz));
        return tex2D(tex, newst);

    where
        str are texture coordinates associated with sampler tex,
        prevlookup is the result of a previous texture operation, and
        intermediate_coord are texture coordinates associated with the previous texture unit.

    This function can be used to generate the texm3x2pad/texm3x2tex instruction combination in all ps_1_x profiles.

Table 54. ps_1_x Auxiliary Texture Functions (continued)

Texture Function:

    tex3D_dp3x3(sampler3D tex, float3 str,
                float4 intermediate_coord1,
                float4 intermediate_coord2,
                float4 prevlookup)

    texCUBE_dp3x3(samplerCUBE tex, float3 str,
                  float4 intermediate_coord1,
                  float4 intermediate_coord2,
                  float4 prevlookup)

Description:

    Performs the following:

        float3 newst = float3(dot(intermediate_coord1.xyz, prevlookup.xyz),
                              dot(intermediate_coord2.xyz, prevlookup.xyz),
                              dot(str, prevlookup.xyz));
        return tex3D/CUBE(tex, newst);

    where
        str are texture coordinates associated with sampler tex,
        prevlookup is the result of a previous texture operation,
        intermediate_coord1 are texture coordinates associated with the n-2 texture unit, and
        intermediate_coord2 are texture coordinates associated with the n-1 texture unit.

    This function can be used to generate the texm3x3pad/texm3x3pad/texm3x3tex instruction combination in a
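The coordinate computation behind the 3x2 dependent lookup described above can be checked on the CPU with a few lines of plain C. This is only an illustrative sketch: dot3 and dp3x2_coords are hypothetical helper names, not part of Cg, and the texture fetch itself is left out.

```c
#include <assert.h>

/* Three-component dot product. */
static float dot3(const float a[3], const float b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

/* Computes newst = (dot(intermediate_coord.xyz, prevlookup.xyz),
                     dot(str, prevlookup.xyz)),
   the coordinates that the dependent 2D lookup would sample with. */
static void dp3x2_coords(const float str[3],
                         const float intermediate_coord[3],
                         const float prevlookup[3],
                         float newst[2])
{
    assert(newst != 0);
    newst[0] = dot3(intermediate_coord, prevlookup);
    newst[1] = dot3(str, prevlookup);
}
```

With intermediate_coord = (1, 0, 0) and str = (0, 1, 0), the result is simply the x and y components of prevlookup, which makes the role of each input easy to see.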
206. float3(myscalar, myscalar, myscalar) yield the same value.

- If only one swizzle character is specified, the result is a scalar, not a vector of length one. Therefore, the expression b.y returns a scalar.
- Care is required when swizzling a constant scalar because of ambiguity in the use of the decimal point character. For example, to create a three-vector from a scalar, use one of the following:
      (1).xxx or 1..xxx or 1.0.xxx or 1.0f.xxx
- The size of the returned vector is determined by the number of swizzle characters. Therefore, the size of the result may be larger or smaller than the size of the original vector. For example, float2(0,1).xxyy and float4(0,0,1,1) yield the same result.

Matrix swizzle operator:

For any matrix type of the form <type><rows>x<columns>, the notation

    <matrixObject>._m<row><col>[_m<row><col>][...]

can be used to access individual matrix elements (in the case of only one <row><col> pair) or to construct vectors from elements of a matrix (in the case of more than one <row><col> pair). The row and column numbers are zero-based.

For example:

    float4x4 myMatrix;
    float myFloatScalar;
    float4 myFloatVec4;

    // Set myFloatScalar to myMatrix[3][2]
    myFloatScalar = myMatrix._m32;

    // Assign the main diagonal of myMatrix to myFloatVec4
    myFloatVec4 = myMatrix._m00_m11_m22_m33;

For compatibility wit
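The vector swizzle rules above (any component may be repeated, and the result size follows the number of swizzle characters) can be mimicked in plain C. The sketch below is an illustration only; swizzle is a hypothetical helper, and it handles just the xyzw naming, not rgba.

```c
#include <assert.h>
#include <string.h>

/* Applies a Cg-style swizzle such as "xxyy" to src (src_len components)
   and writes one float per swizzle character to dst.
   Returns the number of components written, or 0 if the pattern is empty,
   longer than four characters, or names a component src does not have. */
static int swizzle(const float *src, int src_len, const char *pattern,
                   float *dst)
{
    static const char names[] = "xyzw";
    assert(src != NULL && pattern != NULL && dst != NULL);
    size_t n = strlen(pattern);
    if (n == 0 || n > 4)
        return 0;
    for (size_t k = 0; k < n; ++k) {
        const char *p = strchr(names, pattern[k]);
        if (p == NULL || (int)(p - names) >= src_len)
            return 0;
        dst[k] = src[p - names];
    }
    return (int)n;
}
```

Applied to the two-component vector (0, 1) with the pattern "xxyy", this writes (0, 0, 1, 1), matching the float2(0,1).xxyy example in the text.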
207. foo parameter of main().

When the value of bar is changed by the application, the value of foo in main() is set appropriately.

The second class of program state assignment types is assembly code. In-line assembly is indicated using the asm keyword, with the assembly language code between braces, as in the example above. CgFX depends on having the appropriate header at the start of the assembly (!!FP1.0 for fp30, !!ARBvp1.0 for arbvp1, and so on) to determine the profile for which the code is given.

Finally, vertex or fragment programs may be assigned the value NULL in the state assignment. This signifies that no such program should be used in this pass.

Textures and Samplers

CgFX also makes it possible to define state related to textures in the effect file. The effect file below shows an example. The full set of supported OpenGL texture state is listed in "OpenGL State" on page 129.

    sampler2D samp = sampler_state {
        generateMipMap = true;
        minFilter = LinearMipMapLinear;
        magFilter = Linear;
    };

    float4 texsimple(uniform sampler2D sampler,
                     float2 uv : TEXCOORD0) : COLOR
    {
        return tex2D(sampler, uv);
    }

    technique TextureSimple {
        pass {
            FragmentProgram = compile arbfp1 texsimple(samp);
        }
    }

Given this effect file, the application must take an extra step or two when setting up the texture in OpenGL. First, the application must indicate which
208. form float4x4 TextureMat,
                   [parameter declaration garbled in source],
                   uniform float4 Wave1,
                   uniform float4 Wave1Origin,
                   uniform float4 Wave2,
                   uniform float4 Wave2Origin,
                   const uniform float4 WaveData[5])
    {
        vert2frag OUT;

        float4 position = float4(IN.Position.x, 0, [remaining arguments garbled in source]);
        float4 normal = float4(0, 1, 0, 0);

        float dampening = 1 - dot(position.xyz, position.xyz) / 1000;
        float disp;
        float2 norm;

        for (int i = 0; i < 5; i++) {
            float waveTime = Time.x * WaveData[i].z;
            float frequency = WaveData[i].z;
            float height = WaveData[i].w;
            float2 waveDir = WaveData[i].xy;

            calcWave(disp, norm, dampening, IN.Position.xyz,
                     waveTime, height, frequency, waveDir);
            position.y = position.y + disp;
            normal.xz = normal.xz + norm;
        }

        OUT.HPosition = mul(ModelViewProj, position);

        // transform normal into eye space
        normal = mul(ModelViewIT, normal);
        normal.xyz = normalize(normal.xyz);

        // get a vector from the vertex to the eye
        float3 eyeToVert = mul(ModelView, position).xyz;
        eyeToVert = normalize(eyeToVert);

        // calculate the reflected vector for cubemap look-up
        float4 reflected = mul(TextureMat,
                               reflect(eyeToVert, normal.xyz).xyzz);

        // output two reflection vectors for the two
        // environment cubemaps
        OUT.TexCoord0 = reflected;
        OUT.TexCoord1 = reflected;

        // Calculate a fresnel term (note that f0 = 0)
        float fres = 1 + dot(eyeToVert, normal.xyz);
        fres = pow(fres
209. g the state defined in the passes in a technique.

The loop below demonstrates the standard approach for looping over a technique's passes and applying their states in turn.

    CGpass pass = cgGetFirstPass(technique);
    while (pass) {
        cgSetPassState(pass);
        drawGeom();
        cgResetPassState(pass);
        pass = cgGetNextPass(pass);
    }

Each of the state assignments in a pass translates directly to an OpenGL API call. For example, LightingEnable = true; translates to the call glEnable(GL_LIGHTING), and LightPosition[0] = float4(10, 10, 10, 1); translates to the call glLightfv(GL_LIGHT0, GL_POSITION, v), where v is an array of four GLfloat values.

Before or after the call to cgSetPassState(), the application is of course free to set other OpenGL state as desired. However, any state set before the call to cgSetPassState() may be overridden by the pass.

Note that if the technique containing the indicated pass has not been validated, calling cgSetPassState() triggers an attempted validation of the technique. If validation fails, a runtime error results.

After the geometry has been drawn, cgResetPassState() resets the state that was set by the pass to the default values as specified by OpenGL. Note that it does not reset state to its values before cgSetPassState(); an application that desires this behavior should either push and pop OpenGL state, or should manually examine the state assignme
210. gCreateProgramFromFile(context, CG_SOURCE,
        "VertexProgram.cg", CG_PROFILE_VS_1_1,
        "VertexProgram", 0);

    CComPtr<ID3DXBuffer> byteCode;
    const char *progSrc = cgGetProgramString(vertexProgram,
                                             CG_COMPILED_PROGRAM);

    // Normally, you also grab the constants and prepend them
    // to your vertex declaration. Not shown here for brevity.
    D3DXAssembleShader(progSrc, strlen(progSrc), 0, 0,
                       &byteCode, 0);

    // If your program uses explicit binding semantics (like
    // this one), you can create a vertex declaration
    // using those semantics.
    DWORD declaration[] = {
        D3DVSD_STREAM(0),
        D3DVSD_REG(D3DVSDE_POSITION, D3DVSDT_FLOAT3),
        D3DVSD_REG(D3DVSDE_DIFFUSE, D3DVSDT_D3DCOLOR),
        D3DVSD_REG(D3DVSDE_TEXCOORD0, D3DVSDT_FLOAT2),
        D3DVSD_END()
    };

    // Make sure the resulting declaration is compatible with
    // the shader. This is really just a sanity check.
    assert(cgD3D8ValidateVertexDeclaration(vertexProgram,
                                           declaration));

    // Create the shader handle using the declaration.
    device->CreateVertexShader(declaration,
        byteCode->GetBufferPointer(), &vertexShader, 0);

    // Create the pixel shader.
    fragmentProgram = cgCreateProgramFromFile(context,
        CG_SOURCE, "FragmentProgram.cg",
        CG_PROFILE_PS_1_1, "FragmentProgram", 0);

    CComPtr<ID3DXBuffer> byteCode;
    const char *progSrc = cgGetProgramString(fragmentProg
211. gh the run-time API.

Use of Uninitialized Variables

It is incorrect for a program to use an uninitialized variable. However, the compiler is not obligated to detect such errors, even if it would be possible to do so by compile-time data-flow analysis. The value obtained from reading an uninitialized variable is undefined. This same rule applies to the implicit use of a variable that occurs when it is returned by a top-level function. In particular, if a top-level function returns a struct, and some element of that struct is never written, then the value of that element is undefined.

Note: Variables are not defined as being initialized to zero because this would result in a performance penalty in cases where the compiler is unable to determine if a variable is properly initialized by the programmer.

Preprocessor

Cg profiles must support the full ANSI C standard preprocessor capabilities: #if, #define, and so on. However, Cg profiles are not required to support macro-like #define or the use of #include directives.

Overview of Binding Semantics

In stream-processing architectures, data packets flow between different programmable units. On a GPU, for example, packets of vertex data flow from the application to the vertex program.

Because packets are produced by one program (the application, in this case) and consumed by another (the vertex program), there must be some method for defining the interface between the two. The approach us
212. gnment is based on the context in which uniform sampler parameters and texture coordinate inputs are used together.

To specify bindings between texture units and uniform parameters/texture coordinates to match their application, all sampler uniform parameters and texture coordinate inputs that are used in the program must have matching binding semantics; that is, TEXUNIT<n> may only be used with TEXCOORD<n>.

Partially specified binding semantics may not work in all cases. Fundamentally, this restriction is due to the close coupling between texture samplers and texture coordinates in DirectX pixel shaders 1_X.

Binding Semantics for Uniform Data

If a binding semantic for a uniform parameter is not specified, then the compiler will allocate one automatically. Scalar uniform parameters may be allocated to either the xyz or the w portion of a constant register depending on how they are used within the Cg program. When using the output of the compiler without the Cg runtime, you must set all values of a scalar uniform to the desired scalar value, not just the x component.

The valid binding semantics for uniform parameters in the ps_1_x profiles are summarized in Table 51.

Table 51. ps_1_x Uniform Input Binding Semantics

    Binding Semantics Name          Corresponding Data
    register(s0)-register(s3)       Texture unit N, where N is in range [0..3]
    TEXUNIT0-TEXUNIT
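The requirement above, that every component of the register holding a scalar uniform carries the scalar when the Cg runtime is not used, amounts to a simple replication step on the application side. The plain-C sketch below illustrates only that step; replicate_scalar is a hypothetical helper, and the actual upload call depends on the 3D API and is not shown.

```c
#include <assert.h>

/* Fills all four components of a constant-register image with one scalar,
   so the value is valid whichever component the compiler chose to read. */
static void replicate_scalar(float value, float reg[4])
{
    assert(reg != 0);
    for (int i = 0; i < 4; ++i)
        reg[i] = value;
}
```

The filled array can then be passed to whatever constant-setting call the application already uses for four-component registers.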
213. gnments are limited to OpenGL state related to rendering geometric primitives. OpenGL state that is not assignable using the built-in OpenGL state manager includes the following:

- Pixel path state, such as pixel transfer and convolution state
- Per-vertex attributes, such as glColor or glNormal
- Client-side state, such as vertex arrays and pixel store modes
- Vertex and pixel buffer object state
- Miscellaneous state for evaluators, feedback, selection, or occlusion queries
- Texture environment GL_COMBINE state. Although related to rendering, it is complex and redundant with fragment color operations better specified with Cg fragment programs.

Future enhancements may allow assignments for currently unassignable OpenGL state.

A Brief Tutorial

This section walks you through the sample Cg Microsoft Visual Studio workspace we have provided, along with a simple Cg program that you can use for experimentation.

Loading the Workspace

When you load the Cg_Simple file, your workspace should look like the image in Fig. 3.

[Fig. 3: screenshot of the cg_simple workspace in Microsoft Visual C++, showing simple.cg]
214. h OpenGL or Direct3D. It addresses the following four issues:

- The Cg language lets you easily express how an object should be rendered. Although current Cg profiles describe only a single rendering pass, many shading techniques, such as shadow volumes or shadow maps, require more than one rendering pass.
- Many applications need to target a wide range of graphics hardware functionality and performance. Thus, versions of shaders that run on older hardware, and versions that aid performance for distant objects, are important.
- Each Cg program typically targets a single profile, and doesn't specify how to fall back to other profiles, to assembly-language shaders, or to fixed-function vertex or fragment processing.
- To generate images with Cg programs, some information about their environment is needed. For instance, some programs might require alpha blending to be turned on and depth writes to be disabled. Others may need a certain texture format to work correctly. This information is not present in standard Cg source files.

Techniques

Each CgFX file usually presents a certain effect that the shader author is trying to achieve, such as bump mapping, environment mapping, or anisotropic lighting. The CgFX file contains one or more techniques, each of which describes a way to achieve the effect. Each technique usually targets a certain level of GPU functionality, so a
215. h the D3DMatrix data type, Cg also allows one-based swizzles, using a form with the m omitted after the _ symbol:

    <matrixObject>._<row><col>[_<row><col>][...]

In this form, the indexes for <row> and <col> are one-based, rather than the C-standard zero-based. So, the two forms are functionally equivalent.

    float4x4 myMatrix;
    float4 myVec;

    // These two statements are functionally equivalent:
    myVec = myMatrix._m00_m23_m11_m31;
    myVec = myMatrix._11_34_22_42;

Because of the confusion that can be caused by the one-based indexing, use of the latter notation is strongly discouraged.

The matrix swizzles may only be applied to matrices. When multiple components are extracted from a matrix using a swizzle, the result is an appropriately sized vector. When a swizzle is used to extract a single component from a matrix, the result is a scalar.

The write-mask operator: (.)

It can only be applied to an lvalue that is a vector. It allows assignment to particular elements of a vector or matrix, leaving other elements unchanged. The only restriction is that a component cannot be repeated.

Arithmetic Precision and Range

Some hardware may not conform exactly to IEEE arithmetic rules. Fixed-point data types do not have IEEE-defined rules.

Optimizations are allowed to produce slightly different results than unoptimized code. Constant folding must be done with approximately the
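The relationship between the zero-based _m<row><col> form and the one-based _<row><col> form described above can be illustrated with a small plain-C decoder. This is a sketch under stated assumptions: parse_matrix_element is a hypothetical helper, it handles one element token at a time, and it assumes a 4x4 matrix.

```c
#include <assert.h>

/* Decodes one matrix-swizzle element, such as "_m23" (zero-based form) or
   "_34" (one-based form), into zero-based row and column indices.
   Returns 1 on success, or 0 if the token is malformed or out of range
   for a 4x4 matrix. */
static int parse_matrix_element(const char *tok, int *row, int *col)
{
    assert(tok != 0 && row != 0 && col != 0);
    if (tok[0] != '_')
        return 0;
    int base = 1;                 /* one-based unless an 'm' follows '_' */
    const char *p = tok + 1;
    if (*p == 'm') {
        base = 0;
        ++p;
    }
    /* Exactly two decimal digits must remain. */
    if (p[0] < '0' || p[0] > '9' || p[1] < '0' || p[1] > '9' || p[2] != '\0')
        return 0;
    int r = (p[0] - '0') - base;
    int c = (p[1] - '0') - base;
    if (r < 0 || r > 3 || c < 0 || c > 3)
        return 0;
    *row = r;
    *col = c;
    return 1;
}
```

Both "_m23" and "_34" decode to row 2, column 3, which is why the two statements in the example above select the same elements.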
216. half angle vector
        float3 eyeVec = float3(0.0, 0.0, 1.0);
        float3 halfVec = normalize(lightVec + eyeVec);

        // Calculate diffuse component
        float diffuse = dot(normalVec, lightVec);

        // Calculate specular component
        float specular = dot(normalVec, halfVec);

        // Use the lit function to compute lighting vector from
        // diffuse and specular values
        float4 lighting = lit(diffuse, specular, 32);

        // Blue diffuse material
        float3 diffuseMaterial = float3(0.0, 0.0, 1.0);

        // White specular material
        float3 specularMaterial = float3(1.0, 1.0, 1.0);

        // Combine diffuse and specular contributions and
        // output final vertex color
        OUT.Color.rgb = lighting.y * diffuseMaterial +
                        lighting.z * specularMaterial;
        OUT.Color.a = 1.0;

        return OUT;
    }

Definitions for Structures with Varying Data

The first thing to notice is the definitions of structures with binding semantics for varying data.

Let's take a look at the appin structure:

    // define inputs from application
    struct appin {
        float4 Position : POSITION;
        float4 Normal : NORMAL;
    };

This structure contains only two members: Position and Normal. Because this data varies per vertex, the binding semantics POSITION and NORMAL tell the compiler that the position information is associated with the predefined attribute POSITION and that the normal information is associated with the predefined attribute NORMAL.

The other
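The vertex program above reads lighting.y as the diffuse coefficient and lighting.z as the specular coefficient returned by lit. The plain-C sketch below mirrors the commonly documented behavior of that function so the two coefficients can be inspected on the CPU. It is an approximation rather than the library's implementation: lit_sketch is a hypothetical name, and the exponent is assumed to be a non-negative integer (32 in the sample) so that the power can be computed by repeated multiplication.

```c
#include <assert.h>

/* Writes (1, diffuse, specular, 1) to out, where
   diffuse  = max(n_dot_l, 0) and
   specular = n_dot_h ^ shininess when both dot products are positive,
              0 otherwise.
   ASSUMPTION: shininess is a non-negative integer. */
static void lit_sketch(float n_dot_l, float n_dot_h, unsigned shininess,
                       float out[4])
{
    assert(out != 0);
    float diffuse = (n_dot_l > 0.0f) ? n_dot_l : 0.0f;
    float specular = 0.0f;
    if (n_dot_l > 0.0f && n_dot_h > 0.0f) {
        specular = 1.0f;
        for (unsigned i = 0; i < shininess; ++i)
            specular *= n_dot_h;
    }
    out[0] = 1.0f;
    out[1] = diffuse;
    out[2] = specular;
    out[3] = 1.0f;
}
```

A surface facing away from the light (negative n_dot_l) gets zero for both coefficients, so the final vertex color in the sample would be black there.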
217. have a cgD3D9 prefix. Because most of the functions are identical between the two runtimes, we describe the Direct3D 9 Cg runtime with the understanding that the description applies to the Direct3D 8 Cg runtime as well, unless otherwise indicated.

The same prefix convention used for the function names is also used for the type names, macro names, and enumerant values.

Header Files

Here is how to include the core Cg runtime API into your C or C++ program:

    #include <Cg/cg.h>

Here is how to include the OpenGL Cg runtime API:

    #include <Cg/cgGL.h>

Here is how to include the Direct3D 9 Cg runtime API:

    #include <Cg/cgD3D9.h>

And here is how to include the Direct3D 8 Cg runtime API:

    #include <Cg/cgD3D8.h>

Creating a Context

A context is a container for multiple Cg programs. It holds the Cg programs, as well as their shared data.

Here's how to create a context:

    CGcontext context = cgCreateContext();

Compiling a Program

Compile a Cg program by adding it to a context with cgCreateProgram():

    CGprogram program = cgCreateProgram(context,
        CG_SOURCE, myVertexProgramString,
        CG_PROFILE_ARBVP1, "main", args);

CG_SOURCE indicates that myVertexProgramString, a string argument, contains Cg source code, not precompiled object code. Indeed, the Cg runtime also lets you create a program from precompiled object code, if you want to.

CG_PROFILE_ARBVP1 is the profile the program is to be compiled to. The
218. Overview of the Cg Runtime

The Cg runtime API consists of three parts (Fig. 2):

- A core set of functions and structures that encapsulates the entire functionality of the runtime
- A set of functions specific to OpenGL built on top of the core set
- A set of functions specific to Direct3D built on top of the core set

To make it easier for application writers, the OpenGL and Direct3D runtime libraries adopt the philosophy and data structure style of their respective API.

Fig. 2. The Parts of the Cg Runtime API

The rest of the section provides instructions for using the Cg runtime in the framework of an application. Each step includes source code for OpenGL and Direct3D programming.

Functions that involve only pure Cg resource management belong to the core runtime and have a cg prefix. In these cases, the same code is used for OpenGL and Direct3D.

When functions from the OpenGL or Direct3D Cg runtimes are used, notice that the API name is indicated by the function name. Functions belonging to the OpenGL Cg runtime library have a cgGL prefix, and functions in the Direct3D Cg runtime library have a cgD3D prefix.

There are actually two Direct3D Cg runtime libraries: one for Direct3D 8 and one for Direct3D 9. Functions belonging to the Direct3D 8 Cg runtime have a cgD3D8 prefix, and functions belonging to the Direct3D 9 Cg runtime
219. he depth output in ps_1_3.

- % is not supported.
- Ternary ?: is supported if the boolean test expression is a compile-time boolean constant, a uniform scalar boolean, or a scalar comparison to a constant value in the range [-0.5, 1.0] (for example, (a > 0.5) ? b : c).
- do, for, and while loops are supported only when they can be completely unrolled.
- Arrays, vectors, and matrices may be indexed only by compile-time constant values or index variables in loops that can be completely unrolled.
- The discard statement is not supported. The similar but less general clip() function is supported.
- The use of an allocation-rule-identifier for an input or output struct is optional.

Standard Library Functions

Because the DirectX pixel shader 1_X profiles have limited capabilities, not all of the Cg standard library functions are supported. Table 49 presents the Cg standard library functions that are supported by these profiles. See the standard library documentation for descriptions of these functions.

Table 49. Supported Standard Library Functions

    dot(floatN, floatN)
    lerp(floatN, floatN, floatN)
    lerp(floatN, floatN, float)
    tex1D(sampler1D, float)
    tex1D(sampler1D, float2)
    tex1Dproj(sampler1D, float2)
    tex1Dproj(sampler1D, float3)
    tex2D(sampler2D, float2)
    tex2D(sampler2D, float3)
    tex2Dproj(sampler2D, float
220. he following swizzles are allowed:

    x/r, y/g, z/b, w/a
    xy/rg, xyz/rgb, xyzw/rgba
    xxx/rrr, yyy/ggg, zzz/bbb, www/aaa
    xxxx/rrrr, yyyy/gggg, zzzz/bbbb, wwww/aaaa

- Matrix swizzles are not supported.
- Boolean operators other than <, <=, > and >= are not supported. Furthermore, <, <=, > and >= are only supported as the condition in the ?: operator.
- Bitwise integer operators are not supported.
- / is not supported unless the divisor is a non-zero constant or it is used to compute the depth output.
- % is not supported.
- Ternary ?: is supported if the boolean test expression is a compile-time boolean constant, a uniform scalar boolean, or a scalar comparison to a constant value in the range [-0.5, 1.0] (for example, (a > 0.5) ? b : c).
- do, for, and while loops are supported only when they can be completely unrolled.
- Arrays, vectors, and matrices may be indexed only by compile-time constant values or index variables in loops that can be completely unrolled.
- The discard statement is not supported. The similar but less general clip() function is supported.
- The use of an allocation-rule-identifier for an input or output struct is optional.

Standard Library Functions

Because the fp20 profile has limited capabilities, not all of the Cg standard library functions are supported.

The Cg standard library functions that ar
221. he interface  The main   program takes an  unsized array of Light interface objects  loops over them  and returns the  sum of the values returned by their respective value    methods     interface Light    float4 value     y   SELUGIE Soo llei 2 bacine    deitas value EE rulo a AS EE  y     float4 main uniform Light 1      COLOR         808 00504 0000 006 29  NVIDIA          Cg Language Toolkit    float4 v   float4 0 0 0 0     foe  ame 3     Of a  lt  lade x   v    l i  value      return v          Recall that all uniform parameters to the program must have expressions in  the parenthesized list in the compile statement and  therefore  one expression  is necessary here for the one parameter  The first way that main    can be  compiled is to give the name of an effect parameter that resolves both the  actual size of the array as well as the concrete type that implements the  Light interface     Spor might spots     technique    pass    FragmentProgram   compile arbfpl main  spots            Alternatively  the application can leave the resolution of the concrete types  and array size until later so that they can be set via Cg runtime calls from the  application   This was the usual approach before CgFX 1 4      For this case  the expression passed to the compile statement should just be  an unsized array of the abstract interface type     wieme laches lg    technique    pass    FragmentProgram   compile arbfpl main  lights            Running Cg Programs on the CPU    There are 
222. ic Profile Sample Shaders    flost3 tU   mul  m  IN T    Eloato sxtU aula  INS       next bone  i   IN Indices y        create 3x3 version of bone   m _m00_m01_m02   Bones i  _m00_m01_m02   m _m10_m11_m12   Bones i  _m10_m11_m12   Tier Omer leer Bone AA Omer IA    float3 posl   mul  Bones i   tempPos      I    tzanstomn S UT  Sx  float3 sl  mulim  INS  logics el il  Gm  dS 1  p  tiloacs ssel dl  Gm  I  N  SIN  E       final blending    li blemil m  Ey Su   float3 finalS   sO   IN Weights x   sl   IN Weights y   Eloats Esa   t0   IN Weights x   tl   IN Weights y   float3 finalSxT   sxt0   IN Weights x sxt1   IN Weights y           blend between the two positions  float3 finalPos   pos0   IN Weights x posl1 IN Weights y                    float3x3 worldToTangentSpace        worldToTangentSpace _m00_m01_m02   finalS   worldToTangentSpace _m10_m11_m12   finalT   worldToTangentSpace _m20_m21_m22   c 3Ligvel IL Sean e             float3 tangentLight    normalize  mul  worldToTangentSpace  LightVec            secale emd bias  ech bie  ut embleme  tangentLight     tangentLight   1 0    0 5    0 2        create float4 with 1 0 alpha  float4 tempLight    tempLight xyz   tangentLight xyz   tempLight w   1 0    OUT Color0   tempLight           808 00504 0000 006 219  NVIDIA          Cg Language Toolkit          220 808 00504 0000 006  NVIDIA          Appendix A  Cg Language Specification       Language Overview    The Cg language is primarily modeled on ANSI C  but adopts some ideas  fro
223. ic data types are float, half, and fixed. Fragment profiles are required to support all three data types, but may choose to implement half and fixed at float precision. Vertex profiles are required to support half and float, but may choose to implement half at float precision. Vertex profiles may omit support for fixed operations, but must still support definition of fixed variables. Cg allows profiles to omit run-time support for int. Cg allows profiles to treat double as float.

- Many operators support per-element vector operations.
- The ?:, ||, &&, !, and comparison operators can be used with bool four-vectors to perform four conditional operations simultaneously. The side effects of all operands to the ?:, ||, and && operators are always executed.
- Non-static global variables and parameters to top-level functions (such as main()) may be designated as uniform. A uniform variable may be read and written within a program, just like any other variable. However, the uniform modifier indicates that the initial value of the variable or parameter is expected to be constant across a large number of invocations of the program.
- A new set of sampler* types represents handles to texture objects.
- Functions may have default values for their parameters, as in C++. These defaults are expressed using assignment syntax.
- Function overloading is supported.

There
224. iform sampler2D normalMap    COLOR    float4 diffuseTexColor   tex2D diffuseMap  IN texCoord0 xy    float4 normal   2    tex2D normalMap  IN texCoordl xy  0 5    flog lagi wector   2    Mealor eo   0 5    float4 dot result   saturate     dot  stg hivavie cols  morma 9 2  cse  n  return dot_result   diffuseTexColor                   Example 2   struct VertexOut    float4 texCoord0   TEXCOORDO   loan mises    CO SC amen OOD lee  float4 texCoord2   TEXCOORD2   suits Mte C oco MEETUPS    ORD or             y     float4 main VertexOut IN   uniform sampler2D normalMap   uniform sampler2D intensityMap   uniform sampler2D colorMap    COLOR    float4 normal   2    tex2D normalMap  IN texCoord0 xy  0 5    float2 intensCoord   float2    dot  IN texCoordl xyz  normal xyz    dot  IN texCoord2 xyz  normal xyz     float4 intensity   tex2D intensityMap  intensCoord     float4 color   tex2D colorMap  IN texCoord3 xy    recta colar    iaeteas iy         808 00504 0000 006 295  NVIDIA          Cg Language Toolkit       DirectX Vertex Shader 2 x Profiles  vs 2       Overview    Memory    The DirectX Vertex Shader 2 0 profiles are used to compile Cg source code to  DirectX 9 VS 2 0 vertex shaders  and DirectX 9 VS 2 0 Extended vertex  shaders     Q Profile names  vs_2_0  for DirectX 9 VS 2 0 vertex shaders   vs_2_x  for DirectX 9 VS 2 0 extended vertex shaders     Q How to invoke  Use the compiler options   profile vs_2_0   profile vs_2_x    This section describes how using the vs_2_0 and vs 2 x 
225. ightModelAmbient | float4 | | 1.0
LightAmbient[ndx] | float4 | | 1.0; ndx must be greater than or equal to 0 and less than the value of GL_MAX_LIGHTS
LightConstantAttenuation[ndx] | float | | Same as LightAmbient
LightDiffuse[ndx] | float4 | | Same as LightAmbient
LightLinearAttenuation[ndx] | float | | Same as LightAmbient
LightPosition[ndx] | float4 | | Same as LightAmbient
LightQuadraticAttenuation[ndx] | float | | Same as LightAmbient
LightSpecular[ndx] | float4 | | Same as LightAmbient
LightSpotCutoff[ndx] | float | | Same as LightAmbient
LightSpotDirection[ndx] | float3 | | Same as LightAmbient

Table 6. CgFX OpenGL State Manager States (continued)

State Name | Type | Valid Enumerants | Requires
LightSpotExponent[ndx] | float | | Same as LightAmbient
LightModelColorControl | int | SingleColor, SeparateSpecular | OpenGL 1.2 or EXT_separate_specular_color
LineStipple | int2 | | 1.0
LineWidth | float | | 1.0
LogicOp | int | Clear, And, AndReverse, Copy, AndInverted, Noop, Xor, Or, Nor, Equiv, Invert, OrReverse, CopyInverted, Nand, Set | 1.0
MaterialAmbient | float4 | | 1.0
MaterialDiffuse | float4 | | 1.0
MaterialEmission | float4 | | 1.0
MaterialShininess | float | | 1.0
MaterialSpecular | float4 | | 1.0
ModelViewMatrix | float4x4 | | 1.0
PointDistanceAttenuation | float3 | | 1.4, ARB_point_parameters, or EXT_point_parameters
PointFadeThresholdSize | float | | 1.4, ARB_point_parameters, or EXT_point_parameters
226. interface's reference to that texture so it can be destroyed and the Direct3D device can be reset from a lost state. Later, after resetting the Direct3D device and recreating the texture, it needs to be re-bound to the sampler parameter. For example:

    IDirect3DDevice9* device;   // Initialized elsewhere
    IDirect3DTexture9* myDefaultPoolTexture;
    CGprogram program;

    void OneTimeLoadScene()
    {
        // Load the program with cgD3D9LoadProgram and
        // enable parameter shadowing
        /* ... */
        cgD3D9LoadProgram(program, TRUE, 0, 0, 0);
        /* ... */
        // Bind sampler parameter
        CGparameter parameter;
        parameter = cgGetParameterByName(program, "MySampler");
        cgD3D9SetTexture(parameter, myDefaultPoolTexture);
    }

    void OnLostDevice()
    {
        // First release all necessary resources
        PrepareForReset();
        // Next actually reset the Direct3D device
        device->Reset(/* ... */);
        // Finally recreate all those resources
        OnReset();
    }

    void PrepareForReset()
    {
        /* ... */
        // Release expanded interface reference
        cgD3D9SetTexture(mySampler, 0);
        // Release local reference
        // and any other references to the texture
        myDefaultPoolTexture->Release();
        /* ... */
    }

    void OnReset()
    {
        // Recreate myDefaultPoolTexture in D3DPOOL_DEFAULT
        /* ... */
        // Since the texture was just recreated,
        // it must be re-bound to the parameter
        CGparameter parameter;
        parameter = cgGetParameterByName(prog
227. ions, such as +=, are also supported when the corresponding arithmetic operator is supported by Cg.

Conditional Operator (?:)

If the first operand is of type bool, one of the following statements must hold for the second and third operands:

- Both operands have compatible structure types.
- Both operands are scalars with numeric or bool type.
- Both operands are vectors with numeric or bool type, where the two vectors are of the same size, which is less than or equal to four.

If the first operand is a packed vector of bool, then the conditional selection is performed on an elementwise basis. Both the second and third operands must be numeric vectors of the same size as the first operand.

Unlike C, side effects in the expressions in the second and third operands are always executed, regardless of the condition.

Miscellaneous Operators (typecast, ,)

Cg supports C's typecast and comma operators.

Reserved Words

The following are the reserved words in Cg:

    asm asm_fragment auto bool break case catch char class
    column_major compile const const_cast continue decl
    default delete discard do double dword dynamic_cast else emit
    enum explicit extern false fixed float for friend get
    goto half if in inline inout int interface long
    matrix mutable namespace new operator out packed pass
    pixelfragment pixelshader private protected pu
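The elementwise form of the conditional operator described above can be illustrated on the CPU. The following is a minimal sketch in C, not Cg: the helper name select4 and the use of plain arrays for four-component vectors are assumptions made for illustration. Both candidate vectors are fully computed before the call, which mirrors the rule that side effects in the second and third operands are always executed.

```c
#include <stdbool.h>

/* Hypothetical helper: elementwise selection over four components.
 * out[i] takes a[i] where cond[i] is true and b[i] otherwise.
 * Both a and b are already evaluated by the caller, as in Cg. */
static void select4(const bool cond[4], const float a[4],
                    const float b[4], float out[4])
{
    for (int i = 0; i < 4; ++i) {
        out[i] = cond[i] ? a[i] : b[i];
    }
}
```

Each component is chosen independently, so a single bool four-vector can mix components from both operands in one result.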
228. is destroyed only when all references to it are removed, the application should call cgD3D9SetDevice() with zero as an input when it is done with a Direct3D device so that it gets destroyed when the application shuts down. Otherwise, Direct3D does not shut down properly and reports memory leaks to the debug console.

Note that calling cgD3D9SetDevice() with zero as an input does not affect the Cg core runtime resources in any way: all the related core runtime handles (of type CGprogram, CGparameter, and so on) remain valid.

If you call cgD3D9SetDevice() a second time with a different device, all programs managed by the old device are rebuilt using the new device.

Responding to Lost Direct3D Devices

The expanded interface may hold references to Direct3D resources that need to be recreated in response to a lost device. In particular, certain sampler parameters might need to be released before a Direct3D device can be reset from a lost state. The expanded interface is holding a reference to a texture that needs to be reset in response to a lost device if both of the following are true for a texture:

- It was created in the D3DPOOL_DEFAULT pool.
- It was bound to a sampler parameter (using cgD3D9SetTexture()) of a program for which parameter shadowing is enabled.

In this case, the parameter must be set to zero (using cgD3D9SetTexture()) to remove the expanded
229. is only visible when an application passes parameters to a vertex or fragment program. Therefore, the compiler is currently free to allocate temporary variables as it sees fit.

The declaration and use of arrays of arrays is in the same style as in C. That is, if the 2D array A is declared as

    float A[4][4];

then the following statements are true:

- The array is indexed as A[row][column].
- The array can be built with a constructor using

      A = { { A[0][0], A[0][1], A[0][2], A[0][3] },
            { A[1][0], A[1][1], A[1][2], A[1][3] },
            { A[2][0], A[2][1], A[2][2], A[2][3] },
            { A[3][0], A[3][1], A[3][2], A[3][3] } };

- A[0] is equivalent to { A[0][0], A[0][1], A[0][2], A[0][3] }.

Support must be provided for any struct containing arrays.

Minimum Array Requirements

Profiles are required to provide partial support for certain kinds of arrays. This partial support is designed to support vectors and matrices in all profiles. For vertex profiles, it is additionally designed to support arrays of light state (indexed by light number) passed as uniform parameters, and arrays of skinning matrices passed as uniform parameters.

Profiles must support subscripting, copying, and swizzling of vectors and matrices. However, subscripting with run-time computed indices is not required to be supported.

Vertex profiles must support the following operations for any non-packed array that is a uniform parameter to the program, or is an element of a
230. is the ratio of indices of refraction     r is the reflected ray          is the transmitted ray       176    808 00504 0000 006  NVIDIA    Advanced Profile Sample Shaders    loe fresnel loss al  Eloeis im  lose Elica   out logars r  out Eloato t jJ    float result   float cui  float sz   loat Elec        Refraction vector courtesy Paul Heckbert     cl   dot   i n    cs   1  0 eta  eus  4   O cilci  p  celes    los   es2  gt   0 0     ig   titlag      Gtar cl   seiet  es2   a     Start  y  le ws  ilssSevohy unit lemojela Cie 1 0 10    0         Compute Fresnel terms       From Global Illumination Compendeum    Plot  reXoloxtic P   filogie Cosic_cliw cosils   itle Coar eiv COSL    close Esp   flog tor   loge lez     meore   lou   m  E  y       dosis chiy cosl   meote   ely   cosa bw sos   el   Teitt    is    Cesie_chiv Cosi   eta     os  COS IES   SIS   so    cosl_ciwy_ cos    eta     cosi civ coss   etaj   fg   o   top    ki   0 5     EE SE pol  E  resulti telag se  Lo cr ka  P  r   reflect  i  n       return result     float4 main  fragin In   uniform sampler2D tex0   uniform sampler2D texl   uniform sampler2D tex2   uniform sampler2D tex3   uniform float3 eyeSpaceLightPosition   uniform float thickness        808 00504 0000 006 177  NVIDIA          Cg Language Toolkit       uniform float4 ambient     COLOR   float bscale   In tangentToEyeMat0 w    float eta    1 0 1 4     If ratio OH UACLGSS Qu refraccion  museum    float m   34      specular exponent  tloatd ilagincColoe  
231. jective depth compare

Texture Function | Description
tex3D(sampler3D tex, float3 s) | 3D nonprojective
tex3D(sampler3D tex, float3 s, float3 dsdx, float3 dsdy) | 3D nonprojective with derivatives
tex3Dproj(sampler3D tex, float4 szq) | 3D projective depth compare
texCUBE(samplerCUBE tex, float3 s) | Cubemap nonprojective
texCUBE(samplerCUBE tex, float3 s, float3 dsdx, float3 dsdy) | Cubemap nonprojective with derivatives
texCUBEproj(samplerCUBE tex, float4 sq) | Cubemap projective

In the table, the name of the second argument to each function indicates how its values are used when performing the texture lookup: s indicates a 1-, 2-, or 3-component texture coordinate; z indicates a depth comparison value for shadowmap lookups; q indicates a perspective value and is used to divide the texture coordinate, s, before the texture lookup is performed.

For convenience, the standard library also defines versions of the texture functions prefixed with h4 (such as h4tex2D()) that return half4 values and prefixed with x4 (such as x4tex2D()) that return fixed4 values.

When the texture functions that allow specifying a depth comparison value are used, the associated texture unit must be configured for depth-compare texturing. Otherwise, no depth comparison is actually performed.

Derivative Functions

Table 4, "Derivative Functions," presents the derivative functions that are
232. l   functions are for multiplying matrices by vectors  and matrices  by matrices        Matrix by column vector multiply  mar reco Mumma we Cie One B  uil  MA        Row vector by matrix multiply  row vector matrix  mul v  M         Matrix by matrix multiply  matrix matrix  mul M  N      It is important to use the correct version of mul      Otherwise  you are likely  to get unexpected results  More detail on the mu1    functions are provided  in  Cg Standard Library Functions  on page 33        20    808 00504 0000 006  NVIDIA    Introduction to the Cg Language    Vector Constructor    Cg allows vectors  up to size 4  to be constructed using the following  notation     y       load  3 0  2560  150   1 0      The vector constructor can appear anywhere in an expression  Furthermore   vectors can be constructed from smaller vectors     MA  amp    sss  float    b   tloatst a  0 0  1 0     Boolean and Comparison Operators    Cg includes three of the standard C boolean operators      amp  amp  logical AND  I  logical OR    logical negation    In C  these operators consume and produce values of type int  but in Cg  they consume and produce values of type bool  This difference is not  normally noticeable  except when declaring a variable that will hold the  value of a boolean expression  Cg also supports the C comparison operators   which produce values of type bool      lt  less than    lt   less than or equal to      inequality      equality    gt   greater than or equal to   gt  
233. l bits set to 0 corresponds to the value -(128/127), and a representation with all bits set to 1 corresponds to +(127/127). The four signed integers are then packed into a single 32-bit result. This operation may be reversed using the unpack_4byte() function.

C Pseudocode

    ib.x = round(127.0 * clamp(a.x, -128/127, 127/127) + 128);
    ib.y = round(127.0 * clamp(a.y, -128/127, 127/127) + 128);
    ib.z = round(127.0 * clamp(a.z, -128/127, 127/127) + 128);
    ib.w = round(127.0 * clamp(a.w, -128/127, 127/127) + 128);
    result = (ib.w << 24) | (ib.z << 16) | (ib.y << 8) | ib.x;

unpack_4byte()

    half4 unpack_4byte(float a);

Unpacks four 8-bit integers from a and scales the results into individual 16-bit floating-point values between -(128/127) and +(127/127).

C Pseudocode

    result.x = (((a >> 0) & 0xFF) - 128) / 127.0;
    result.y = (((a >> 8) & 0xFF) - 128) / 127.0;
    result.z = (((a >> 16) & 0xFF) - 128) / 127.0;
    result.w = (((a >> 24) & 0xFF) - 128) / 127.0;

pack_4ubyte()

    float pack_4ubyte(float4 a);
    float pack_4ubyte(half4 a);

Converts the four components of a into 8-bit unsigned integers. The unsigned integers are such that a representation with all bits set to 0 corresponds to 0.0, and a representation with all bits set to 1 corresponds to 1.0. The four unsigned integers are then packed into a single 32-bit result. This operation can be reversed using
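The signed-byte packing described above can be tried out on the CPU. The following is a minimal sketch in C under stated assumptions: the packed result is held in a uint32_t rather than in the bits of a float, the helper clampf is hypothetical, and rounding is done by adding 0.5 before truncation (valid here because the biased value is never negative). It is an illustration of the arithmetic, not the profile's implementation.

```c
#include <stdint.h>

/* Hypothetical helper: clamp v to the range [lo, hi]. */
static float clampf(float v, float lo, float hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Pack four values in [-128/127, 127/127] into one 32-bit word.
 * Component 0 lands in the low byte, component 3 in the high byte.
 * Each byte stores round(127 * c + 128), so 0.0 maps to 128. */
static uint32_t pack_4byte(const float a[4])
{
    uint32_t packed = 0;
    for (int i = 0; i < 4; ++i) {
        float c = clampf(a[i], -128.0f / 127.0f, 1.0f);
        uint32_t b = (uint32_t)(127.0f * c + 128.0f + 0.5f);
        packed |= b << (8 * i);
    }
    return packed;
}

/* Reverse of pack_4byte: remove the bias of 128 and rescale by 1/127. */
static void unpack_4byte(uint32_t packed, float out[4])
{
    for (int i = 0; i < 4; ++i) {
        uint32_t b = (packed >> (8 * i)) & 0xFFu;
        out[i] = ((float)b - 128.0f) / 127.0f;
    }
}
```

Packing and then unpacking loses precision: only 256 distinct values per component survive, and inputs outside the representable range are clamped before encoding.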
234. l time graphics, this format provides several key benefits:

- Encapsulation of multiple rendering techniques, enabling fallbacks for level of detail, functionality, and performance.
- Support for Cg, assembly language, and fixed-function shaders.
- Editable parameters and GUI descriptions embedded in the file.
- Multipass shaders.
- Render state and texture state specification.

In practical terms, by wrapping both Cg vertex programs and Cg fragment programs together with render state, texture state, and pass information, developers can describe a complete rendering effect. Although individual Cg programs may contain the core rendering algorithms necessary for an effect, only when combined with this additional environmental information does the shader become complete and self-contained. The addition of artist-friendly GUI descriptions and fallbacks enables CgFX files to integrate well with the production workflow used by artists and programmers.

CgFX encapsulates, in a single text file, everything needed to apply a rendering effect. This feature lets a third-party tool or another 3D application use a CgFX text file as is, with no external information other than the necessary geometry and texture data. In this sense, CgFX acts as an interchange format. CgFX allows shaders to be exchanged without the associated C++ code that is normally necessary to make a Cg program work wit
235. ldSpacePos         h   normalize l   e   float3 halfAngle   normalize vertToEye   LightVec           X   max dot  LightVec worldNormal  0 0     y   max  dot  halfAngle  worldNormal   0 0         transform into homogeneous clip space      mul  WorldViewProj  tempPos         808 00504 0000 006    191  NVIDIA          Cg Language Toolkit       Bump Dot3x2 Diffuse and Specular    Description    The bump dot3x2 diffuse and specular effect mixes bump mapping with  diffuse and specular lighting based on the texm3x2tex DirectX 8 pixel  shader instruction  DOT_PRODUCT_TEXTURE_2D in OpenGL   This  instruction computes the dot product of the normal and the light vector   corresponding to the diffuse light component  and the dot product of the  normal and the half angle vector  corresponding to the specular light  component  This results into two scalar values that are used as texture  coordinates to look up a 2D illumination texture containing the diffuse color  and the specular term in its alpha component  Since the normal fetched from  the normal map is in tangent space  both the light vector and the half angle  vector are transformed to this space by the vertex shader  Fig  14          Fig  14  Example of Bump Dot3x2 Diffuse and Specular       192 808 00504 0000 006  NVIDIA    Basic Profile Sample Shaders    Vertex Shader Source Code for Bump Dot3x2    Semuce ex f    y     float4 Position   POSITION    in object space  float3 Normal   NORMAL    in object space  float2 TexCcoord   TEX
236. ler2D ColorMap     color     components   radius irisDepth  eta  lensDensity   uniform float4 BallData           172    808 00504 0000 006  NVIDIA    Advanced Profile Sample Shaders          components   phongExp gloss1 gloss2  drop     uniform  uniform  uniform  uniform  uniform  uniform    float4 GlossData   float3 AmbiColor   losa WalirirtColloie   float3 SpecColor   float3 LensColor   ioar o BG COMO  COLOR    const half3 baseTex   half3 1 0h 1 0h 1 0h    const half GRADE   0 05h   const half3 yAxis   half3 0 0h 1 0h 0 0h      const ha  const ha    ii lectu  nadle abest  Sarei  nali asea  wee aLa  half3 pu  Pip alae Sx  half D    avale eLa  half4 pl       view  half3 Vn  nalia NE  half3 Ln  naes IDa  half3 mi  als Dal    half3 ha  half ndh  half spe  nal ye   specl   specl     half3 Sp       aces inal    skit  elie  half g       1f3 xAxis   half3 1 0h 0 0h 0 0h    ies  bellet   aedis  0   Ola   0  Ola   0  5 Ole  y       ally constants   could be done in VP or on CPU   sSize   BallData RADIUS     0h BallData IRIS DEPTH   BallData IRIS DEPTH    sScale   0 3333h   max 0 01h  irisSize    sDist   BallData RADIUS   BallData IRIS  DEPTH   fowiceinceie   beller a gt  halra  GuxesbeysL sic  0   Ola     5 Olin  E  axis  returns simple  irisDist    dot pupilCenter  xAxis     cs   MN ROP oO sate Wome   real smile    aneEquation   half4 xAxis  D                  vector TO surface     normalize IN OPosition   IN VPosition    normalize IN N       IN LightVecO xyz    EtaGine   Distcolor 
237. les supports MRTs. The MaxDrawBuffers profile option may be used to explicitly set the number of draw buffers (that is, render targets) available on the target hardware. If the input program requires more than the specified number of draw buffers, compilation fails.

If the MaxDrawBuffers profile option is not specified, the stand-alone Cg compiler (cgc) assumes that the target hardware supports MRTs to whatever extent required by the input program.

When compiling programs using the Cg runtime, be sure to call cgGLSetOptimalOptions() under OpenGL, or call cgD3D9GetOptimalOptions() under Direct3D. These functions allow you to automatically determine the value for the MaxDrawBuffers profile option that is appropriate for the graphics hardware on the target machine.

2. To understand the capabilities of OpenGL ARB fragment programs and the code produced by the compiler, refer to the ARB_fragment_program extension in the OpenGL Extensions documentation.

Resource Limits

The ARB_fragment_program specification allows an OpenGL implementation to place limits on the numbers and types of resources that a fragment program may use. If these resource limits must be exceeded to compile a Cg program, the compilation will fail. Resources that may be limited include the number of instructions, the number of registers, and the number of dependent texture reads.

The arbfp1 profile suppo
238. lication" on page 106 and "Expanded Interface Direct3D 8 Application" on page 109.

Expanded Interface Vertex Program

The following Cg code is assumed to be in a file called VertexProgram.cg.

    void VertexProgram(
        in float4 position : POSITION,
        in float4 color : COLOR0,
        in float4 texCoord : TEXCOORD0,
        out float4 positionO : POSITION,
        out float4 colorO : COLOR0,
        out float4 texCoordO : TEXCOORD0,
        const uniform float4x4 ModelViewMatrix)
    {
        positionO = mul(position, ModelViewMatrix);
        colorO = color;
        texCoordO = texCoord;
    }

Expanded Interface Fragment Program

The following Cg code is assumed to be in a file called FragmentProgram.cg.

    void FragmentProgram(
        in float4 color : COLOR0,
        in float4 texCoord : TEXCOORD0,
        out float4 colorO : COLOR0,
        const uniform sampler2D BaseTexture,
        const uniform float4 SomeColor)
    {
        colorO = color * tex2D(BaseTexture, texCoord) + SomeColor;
    }

Expanded Interface Direct3D 9 Application

The following C code links the previous vertex and fragment programs to the Direct3D 9 application.

    #include <cg/cg.h>
    #include <cg/cgD3D9.h>

    IDirect3DDevice9* device;     // Initialized somewhere else
    IDirect3DTexture9* texture;   // Initialized somewhere else
    D3DXCOLOR constantColor;      // Initialized somewhere else

    CGcontext context;
    IDirect3DVertexDeclaration9* vertexDeclaration;
    CGprogram vertexProgram, fragmentProgram;
    CGparameter baseTexture, someColor, modelViewMat
239. ll ps_1_x profiles.

Texture Function:

    texCUBE_reflect_dp3x3(uniform samplerCUBE tex,
                          float4 strq,
                          float4 intermediate_coord1,
                          float4 intermediate_coord2,
                          float4 prevlookup)

Description: Performs the following:

    float3 E = float3(intermediate_coord2.w, intermediate_coord1.w, strq.w);
    float3 N = float3(dot(intermediate_coord1.xyz, prevlookup.xyz),
                      dot(intermediate_coord2.xyz, prevlookup.xyz),
                      dot(strq.xyz, prevlookup.xyz));
    return texCUBE(tex, 2 * dot(N, E) / dot(N, N) * N - E);

where strq are texture coordinates associated with sampler tex, prevlookup is the result of a previous texture operation, intermediate_coord1 are texture coordinates associated with the n-2 texture unit, and intermediate_coord2 are texture coordinates associated with the n-1 texture unit. This function can be used to generate the texm3x3pad/texm3x3pad/texm3x3vspec instruction combination in all ps_1_x profiles.

Table 54. ps_1_x Auxiliary Texture Functions (continued)

Texture Function:

    texCUBE_reflect_eye_dp3x3(uniform samplerCUBE tex,
                              float3 str,
                              float4 intermediate_coord1,
                              float4 intermediate_coord2,
                              float4 prevlookup,
                              uniform float3 eye)

Description: Performs the following:

    float3 N = float3(dot(intermediate_coord1.xyz, prevlookup.xyz),
                      dot(intermediate_coord2.xyz, prevlookup.xyz),
                      dot(coords.xyz, prevlookup.xyz));
    return texCUBE(tex, 2 * dot(N, E) / dot(N, N) * N - E);

where s
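The lookup direction passed to texCUBE() in the functions above is a reflection of the eye vector E about the normal N. The extraction dropped the operators, so the following C sketch assumes the usual form R = 2 (N.E) / (N.N) N - E, which does not require N to be normalized. The type vec3 and the helper names dot3 and reflect_about are hypothetical and exist only for this illustration.

```c
/* Hypothetical three-component vector type for the sketch. */
typedef struct {
    float x, y, z;
} vec3;

/* Three-component dot product. */
static float dot3(vec3 a, vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Reflect e about n: R = 2 * dot(n, e) / dot(n, n) * n - e.
 * Dividing by dot(n, n) makes the result independent of the
 * length of n, so n may be unnormalized (but not zero). */
static vec3 reflect_about(vec3 n, vec3 e)
{
    float s = 2.0f * dot3(n, e) / dot3(n, n);
    vec3 r = { s * n.x - e.x, s * n.y - e.y, s * n.z - e.z };
    return r;
}
```

For example, reflecting e = (1, 0, 1) about n = (0, 0, 2) keeps the component along n and flips the component perpendicular to it, giving (-1, 0, 1).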
240. loating point values. The two converted components are then packed into a single 32-bit result. This operation can be reversed using the unpack_2half() function.

C Pseudocode

    result = (((half)a.y) << 16) | (half)a.x;

unpack_2half()

    half2 unpack_2half(float a);

Unpacks a 32-bit value into two 16-bit floating-point values.

C Pseudocode

    result.x = (a >> 0) & 0xFFFF;
    result.y = (a >> 16) & 0xFFFF;

pack_2ushort()

    float pack_2ushort(float2 a);
    float pack_2ushort(half2 a);

Converts the components of a into a pair of 16-bit unsigned integers. The two converted components are then packed into a single 32-bit return value. This operation can be reversed using the unpack_2ushort() function.

C Pseudocode

    ushort.x = round(65535.0 * clamp(a.x, 0.0, 1.0));
    ushort.y = round(65535.0 * clamp(a.y, 0.0, 1.0));
    result = (ushort.y << 16) | ushort.x;

unpack_2ushort()

    float2 unpack_2ushort(float a);

Unpacks two 16-bit unsigned integer values from a and scales the results into individual floating-point values between 0.0 and 1.0.

C Pseudocode

    result.x = ((a >> 0) & 0xFFFF) / 65535.0;
    result.y = ((a >> 16) & 0xFFFF) / 65535.0;

pack_4byte()

    float pack_4byte(float4 a);
    float pack_4byte(half4 a);

Converts the four components of a into 8-bit signed integers. The signed integers are such that a representation with al
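The unsigned-short packing described above can be checked with a small CPU-side sketch. The C code below makes two assumptions that are not part of the manual: the packed word is carried in a uint32_t instead of the bits of a float, and clampf is a hypothetical helper. Rounding uses the add-0.5-and-truncate idiom, which is valid because the scaled values are never negative.

```c
#include <stdint.h>

/* Hypothetical helper: clamp v to the range [lo, hi]. */
static float clampf(float v, float lo, float hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Pack two values in [0, 1] into one 32-bit word:
 * x goes to the low 16 bits, y to the high 16 bits. */
static uint32_t pack_2ushort(float x, float y)
{
    uint32_t ux = (uint32_t)(65535.0f * clampf(x, 0.0f, 1.0f) + 0.5f);
    uint32_t uy = (uint32_t)(65535.0f * clampf(y, 0.0f, 1.0f) + 0.5f);
    return (uy << 16) | ux;
}

/* Reverse of pack_2ushort: extract each 16-bit field and
 * rescale it to a floating-point value in [0, 1]. */
static void unpack_2ushort(uint32_t packed, float *x, float *y)
{
    *x = (float)(packed & 0xFFFFu) / 65535.0f;
    *y = (float)((packed >> 16) & 0xFFFFu) / 65535.0f;
}
```

Note that each field needs a 16-bit mask (0xFFFF); an 8-bit mask would discard the upper half of every component.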
241. m modern languages such as C++ and Java, and from earlier shading languages such as RenderMan and the Stanford shading language. The language also introduces a few new ideas. In particular, it includes features designed to represent data flow in stream-processing architectures such as GPUs. Profiles, which are specified at compile time, may subset certain features of the language, including the ability to implement loops and the precision at which certain computations are performed.

Silent Incompatibilities

Most of the changes from ANSI C are either omissions or additions, but there are a few potentially silent incompatibilities. These are changes within Cg that could cause a program that compiles without errors to behave in a manner different from C:

- The type promotion rules for constants are different when the constant is not explicitly typed using a type cast or type suffix. In general, a binary operation between a constant that is not explicitly typed and a variable is performed at the variable's precision, rather than at the constant's default precision.
- Declarations of struct perform an automatic typedef (as in C++) and thus could override a previously declared type.
- Arrays are first-class types that are distinct from pointers. As a result, array assignments semantically perform a copy operation for the entire array.

Similar Operations That Must be Expressed Differen
242. many situations  such as tabularizing complex functions into  texture maps  where it is useful to execute Cg programs on the CPU and not  on the GPU  While the CPU path doesn t offer the same performance  it can  be useful because it doesn t have the resource limits associated with GPUs     Programs that run on a CPU in this manner are declared like the following     float foo   4 f   mioara sie  loewe js TEOSTON wihoeic2 clellicae  2 ASIA  E COLOR             30    808 00504 0000 006  NVIDIA    Introduction to the Cg Language    ESC UE FOC     113037          The POSITION semantic denotes the parameter or parameters that should be  set with the coordinates of each point at which the function is evaluated      there is a coordinate value from zero to one for each dimension over which  the function is being evaluated  The PSIZE semantic denotes a parameter that  should be initialized with the value of the spacing between samples at which  the function is being evaluated  and the COLOR semantic denotes where the  result of the function should be returned   Thus  the function above could  have been written as a void function with an out float4 ret   COLOR  parameter and an assignment to ret instead of the return statement      Given an effect file with such a program  a CGprogram handle to it can be  retrieved by creating a program with the following CG_PROFILE_GENERIC  profile        CGprogram tp   cgCreateProgramFromEffect  effect   CE IPINOM Iii  Cla  WiewuayesY ATA          
243. me as COLOR

    DEPTH    Fragment depth value (in range [0,1])    float    Interpolated depth from rasterizer (in range [0,1])

If a program desires an output color alpha of 1.0, it should explicitly write a value of 1.0 to the w component of the COLOR output. The language does not define a default value for this output.

Note: If the target hardware uses a default value for this output, the compiler may choose to optimize away an explicit write specified by the user if it matches the default hardware value. Such defaults are not exposed in the language.

In contrast, the language does define a default value for the DEPTH output. This default value is the interpolated depth obtained from the rasterizer. Semantically, this default value is copied to the output at the beginning of the execution of the fragment program.

Note: Although the DEPTH output is assigned a default value, as with all outputs its value cannot be read in a Cg program.

As discussed earlier, when a binding semantic is applied to an output, the type of the output variable is not required to match the type of the binding semantic. For example, the following is legal, although not recommended:

    struct myfragoutput {
        float2 mycolor : COLOR;
    };

In such cases, the variable is implicitly copied (with a typecast) to the semantic upon program completion. If the variable's
244. meter(parameter, i));
                }
                break;
            default:
                /* Here is the code that handles the parameter. */
                break;
            }
        } while ((parameter = cgGetNextParameter(parameter)) != 0);
    }

In practice, it is usually simpler to iterate over all of the "leaf" parameters (that is, non-aggregate parameters) directly using cgGetNextLeafParameter():

    CGparameter cgGetFirstLeafParameter(CGprogram program,
                                        CGenum namespace);
    CGparameter cgGetNextLeafParameter(CGparameter parameter);

These functions iterate through all the simple parameters, including structure fields and array elements that serve as inputs to the program. Nothing is guaranteed regarding the order of the parameters in the sequence.

Direct Retrieval

Any parameter of a program can also be retrieved directly by using its name with cgGetNamedProgramParameter():

    CGparameter cgGetNamedProgramParameter(CGprogram program,
                                           CGenum namespace,
                                           const char *name);

Here, namespace may be either CG_GLOBAL or CG_PROGRAM, as above. If the program has no parameter corresponding to name, cgGetNamedProgramParameter() returns zero.

The Cg syntax is used to retrieve structure fields or array elements. Let's take the following code snippet as an example:

    struct FooStruct
    {
        float4 A;
        float4 B;
    };

    struct BarStruct
    {
        FooStruct Foo[2];
    };

    void main(BarStruct Bar[3])
    {
        // ...
    }

The following are valid names for retrieving the corresponding parameter:
245. meters are managed. By far the easiest method is to enable texture management in the context:

    cgGLSetManageTextureParameters(context, CG_TRUE);

If this is done, then when the CGprogram is bound by a call to cgSetPassState(), the texture parameters used are associated with the appropriate hardware texture units automatically.

Alternatively, the mapping of texture parameters to hardware units can be handled explicitly by the application, using the routine cgGLEnableTextureParameter():

    CGparameter progParam = cgGetNamedParameter(prog, "sampler");
    cgGLEnableTextureParameter(progParam);

However, note that it is not possible to call cgGLEnableTextureParameter() with a handle to an effect's sampler parameter; the handle must be to an actual program parameter.

In general, the first approach is to be preferred for its simplicity.

Interfaces and Unsized Arrays

CgFX also supports Cg's interfaces and unsized arrays features. Given an effect file with Cg programs that use these features, the compile statement can be used in two different ways to resolve the interfaces and unsized arrays so that the program can be compiled. The abstract types may be resolved using Cg code itself, or they may be resolved using the Cg runtime.

Consider the following example: a Light interface has been defined with SpotLight implementing the interface. The main() program takes an
247. modulate the lighting contributions with the material properties to get the final vertex color, and we assign it to the output structure's color field, OUT.Color. Finally, we set the alpha channel of the final color to 1.0, so that our object will be opaque, and return the computed position and color values stored in the OUT structure.

Further Experimentation

Use simple.cg as a framework to try more advanced experiments, perhaps by adding more parameters to the program or by performing more complex calculations in the vertex program. Have fun experimenting!

Advanced Profile Sample Shaders

This chapter provides a set of advanced profile sample shaders written in Cg. Each shader comes with an accompanying snapshot, description, and source code.

Examples shown are:

- Improved Skinning
- Improved Water
- Melting Paint
- MultiPaint
- Ray-Traced Refraction
- Skin
- Thin Film Effect
- Car Paint 9

Improved Skinning

Description

This shader takes in a set of all the transformation matrices that can affect a particular bone. Each bone also sends in a list of matrices that affect it. There is then a simple loop that, for each vertex, goes through each bone that affects that vertex and transforms it. This allows just one Cg program to do the entire skinning for vertices affected by any number of bones, instead of having one pr
248. n(2.f);
        }
    }

    technique AsmFrag {
        pass {
            FragmentProgram = asm {
                [assembly instructions illegible in source]
                END
            };
        }
    }

The most common of these three options for specifying programs is using compile statements. The first argument following the compile keyword is the name of the profile to which the program is to be compiled (for example, fp30, fp40, arbfp1, or vp20). The next argument gives the name of the function in the effect file that serves as the program entry point, followed by a list of expressions, for example, (2.f). These expressions have a one-to-one correspondence with the uniform parameters of the program being compiled: there must be exactly one for each uniform program parameter, no more and no less.

In the example above, the expression "2.f" sets the value for the foo parameter to main(). Because it is a literal value, CgFX is able to compile the program to a particularly efficient version that just includes returning the uv value.

It is also possible to include references to effect parameters in the expression used in the compile statement, for example:

    float4 main(uniform float foo, float4 uv : TEXCOORD0) : COLOR
    {
        return (foo > 0) ? uv : 2 * uv;
    }

    float bar;

    technique NewSimpleFrag {
        pass {
            VertexProgram = NULL;
            FragmentProgram = compile arbfp1 main(2 * bar);
        }
    }

Here, the value "2 * bar" is associated with the
249. n>
    MaxLocalParams=<n>

    where 1 <= n <= 32 (default 32)
    where 1 <= n <= 8 (default 1)
    where 16 <= n <= 4096 (default 1024)
    where 16 <= n <= 256 (default 96)

OpenGL ARB Fragment Program Profile (arbfp1)

The OpenGL ARB Fragment Program Profile is used to compile Cg source code to fragment programs compatible with version 1.0 of the GL_ARB_fragment_program OpenGL extension.

- Profile name: arbfp1
- How to invoke: Use the compiler option -profile arbfp1

The arbfp1 profile limits Cg to match the capabilities of OpenGL ARB fragment programs. This section describes the capabilities and restrictions of Cg when using the arbfp1 profile.

Accessing OpenGL State

The arbfp1 profile supports access to OpenGL state with the same set of state semantics provided by the arbvp1 profile. See "Accessing OpenGL State" on page 256 for more information about this feature.

MRT Support

This profile supports multiple render targets (MRTs). When MRTs are used, up to three additional four-component outputs may be written in addition to the COLOR and DEPTH outputs supported in other profiles. These new outputs are available via the output semantics COLOR1 through COLOR3.

The use of MRTs is an optional feature of the ARB_fragment_program and the DirectX PixelShader 2 specifications; consequently, not all hardware that supports these profi
250. n it for editing. While you are editing simple.cg, you can press Control+F7 at any time to compile it. Because of the way the project is set up, any errors in your code will be shown just as when you compile a normal C or C++ program.

You can also double-click on an error, which takes you to the location in the source code that caused the error.

Understanding simple.cg

The Cg_Simple application runs the shader defined in simple.cg on a torus. The provided version of simple.cg calculates diffuse and specular lighting for each vertex. A screenshot of the shader is shown in Fig. 4.

[Fig. 4. The simple.cg Shader]

Program Listing for simple.cg

The following is the program listing for simple.cg:

    // Define inputs from application.
    struct appin
    {
        float4 Position : POSITION;
        float4 Normal   : NORMAL;
    };

    // Define outputs from vertex shader.
    struct vertout
    {
        float4 HPosition : POSITION;
        float4 Color     : COLOR;
    };

    vertout main(appin IN,
                 uniform float4x4 ModelViewProj,
                 uniform float4x4 ModelViewIT,
                 uniform float4 LightVec)
    {
        vertout OUT;

        // Transform vertex position into homogenous clip-space.
        OUT.HPosition = mul(ModelViewProj, IN.Position);

        // Transform normal from model-space to view-space.
        float3 normalVec = normalize(mul(ModelViewIT, IN.Normal).xyz);

        // Store normalized light vector.
        float3 lightVec = normalize(LightVec.xyz);

        // Calculate
251. n of compiler and profile. The operations supported on a packed type in a particular profile may be different than the operations supported on the corresponding unpacked type in that same profile. Profiles may define a maximum allowable size for packed arrays, but must support at least size 4 for packed vector (one-dimensional array) types, and 4x4 for packed matrix (two-dimensional array) types.

When declaring an array of arrays in a single declaration, the packed modifier only refers to the outermost array. However, it is possible to declare a packed array of packed arrays by declaring the first level of array in a typedef using the packed keyword and then declaring a packed array of this type in a second statement. It is not possible to have a packed array of unpacked arrays.

- For any supported numeric data type TYPE, implementations must support the following packed array types, which are called vector types. Type identifiers must be predefined for these types in the global scope:

      typedef packed TYPE TYPE1[1];
      typedef packed TYPE TYPE2[2];
      typedef packed TYPE TYPE3[3];
      typedef packed TYPE TYPE4[4];

  For example, implementations must predefine the type identifiers float1, float2, float3, float4, and so on for any other supported numeric type.

- For any supported numeric data type TYPE, implementations must support the following packed array types, which are call
252. name, respectively. Similarly, there are versions of each function that retrieve any matrices in the given parameter in row-major or column-major order. These are specified using r or c, respectively. At most nvals values will be copied into the given array, v. The total number of values copied into v is returned.

For example, cgGetParameterValueic() retrieves the values of the given parameter into the supplied array of integer data, and copies matrix data in column-major order. The total number of values associated with a given parameter, and hence the required length of the given array, can be computed using the core Cg runtime:

    int nrows = cgGetParameterRows(param);
    int ncols = cgGetParameterColumns(param);
    int asize = cgGetArrayTotalSize(param);
    int ntotal = nrows * ncols;
    if (asize > 0) ntotal *= asize;

A similar family of entry points exists for setting a parameter's values:

    void cgSetParameterValue{i,f,d}{r,c}(CGparameter param,
                                         int nvals, type *v);

The entry points in this family are identical to those of the cgGetParameterValue family. The total number of values in a parameter may be computed as above. If nvals is less than the total size of the parameter, an error is generated.

The core Cg runtime also allows the application to query a parameter's default values:

    const double *cgGetParameterValues(CGparameter parameter, CGenum valueT
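The size computation shown earlier in this chunk (rows times columns, multiplied by the array size when the parameter is an array) can be wrapped in a small helper. The following is a minimal C sketch under stated assumptions: total_parameter_values and buffer_is_large_enough are hypothetical names, not Cg runtime entry points, and their integer arguments stand in for the results of cgGetParameterRows(), cgGetParameterColumns(), and cgGetArrayTotalSize().

```c
/* Hypothetical helper, not part of the Cg runtime.
 * nrows, ncols, and asize stand in for the values returned by
 * cgGetParameterRows(), cgGetParameterColumns(), and
 * cgGetArrayTotalSize() for one parameter. */
int total_parameter_values(int nrows, int ncols, int asize)
{
    int ntotal = nrows * ncols;
    /* asize is 0 for a non-array parameter, so only scale for arrays. */
    if (asize > 0) {
        ntotal *= asize;
    }
    return ntotal;
}

/* Hypothetical check: returns nonzero when a buffer holding `capacity`
 * elements is long enough to receive or supply every value. */
int buffer_is_large_enough(int nrows, int ncols, int asize, int capacity)
{
    return capacity >= total_parameter_values(nrows, ncols, asize);
}
```

With these helpers, a single float4x4 needs 16 values and an array of three float4x4 matrices needs 48, so an application could size its buffer before calling one of the value-setting entry points and avoid the error raised when nvals is too small.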
253. nction pipeline or specify a user-written vertex program. If the user wishes to mix these two approaches, it is sometimes desirable to guarantee that the position computed by the first approach is bit-identical to the position computed by the second approach. This position invariance is particularly important for multipass rendering.

Support for position invariance is optional in Cg vertex profiles, but for those vertex profiles that support it, the following rules apply:

- Position invariance with respect to the fixed-function pipeline is guaranteed if two conditions are met:

  - The vertex program is compiled using a compiler option indicating position invariance (-posinv, for example).
  - The vertex program computes position as follows:

        OUT_POSITION = mul(MVP, IN_POSITION)

    where
    OUT_POSITION is a variable (or structure element) of type float4 with an output binding semantic of POSITION or HPOS.
    IN_POSITION is a variable (or structure element) of type float4 with an input binding semantic of POSITION.
    MVP is a uniform variable (or structure element) of type float4x4 with an input binding semantic that causes it to track the fixed-function modelview-projection matrix. (The name of this binding semantic is currently profile-specific; for OpenGL profiles, the semantic GL_MVP is recommended.)

- If the first condition is met but not the second, the compiler is
255. ndicates no error. When either error-fetching entry point is called, its cached error value is reset to 0.

More comprehensive error checking and handling can be achieved using Cg's error handler callback mechanism. Each time an error occurs, the core Cg runtime calls an error handler callback function, optionally provided by the application. The application registers the error handler using:

    typedef void (*CGerrorHandlerFunc)(CGcontext ctx, CGerror err,
                                       void *appdata);

    void cgSetErrorHandler(CGerrorHandlerFunc func, void *data);

When an error occurs, the Cg runtime calls the specified function, passing the CGcontext in which the error occurred, the code associated with the triggering error, and a copy of the data pointer registered by the application. A typical implementation of the error handler might look like this:

    void HandleCgError(CGcontext ctx, CGerror err, void *appdata)
    {
        fprintf(stderr, "Cg error: %s\n", cgGetErrorString(err));

        const char *listing = cgGetLastListing(ctx);
        if (listing != NULL)
            fprintf(stderr, " last listing: %s\n", listing);
    }

Here is a list of some of the CGerror codes specific to the core Cg runtime:

- CG_NO_ERROR: Returned when no error has occurred.
- CG_COMPILER_ERROR: Returned when the compiler generated an error. A call to cgGetLastListing() should be made to get more details on the actual compiler error.
- CG_INVALID_PARAMETER_ERROR: Returned when the parameter used is invalid
256. ngle precision.

- half, fixed, and double data types are treated as float. half data types can be used to specify partial precision hint for pixel shader instructions.
- int data type is supported using floating point operations.
- sampler* types are supported to specify sampler objects used for texture fetches.

Statements and Operators

With the ps_2_0 profiles, while, do, and for statements are allowed only if the loops they define can be unrolled, because there is no dynamic branching in PS 2.0 shaders. In the current Cg implementation, extended ps_2_x shaders also have the same limitation.

Comparison operators (>, <, >=, <=, ==, !=) and Boolean operators (||, &&, ?:) are allowed. However, the logic operators (&, |, ^, ~) are not.

Using Arrays and Structures

Variable indexing of arrays is not allowed. Array and structure data is not packed.

Bindings

Binding Semantics for Uniform Data

The valid binding semantics for uniform parameters in the ps_2_0 and ps_2_x profiles are summarized in Table 42.

Table 42. ps_2_* Uniform Input Binding Semantics

    Binding Semantics Name          Corresponding Data
    register(s0)-register(s15)      Texunit unit N, where N is in range [0..15].
    TEXUNIT0-TEXUNIT15              May only be used with uniform inputs with sampler* types.
    register(c0)-register(c31)      Constant r
257. ngle vector

        // fetch the illumination map using
        // the result of the two previous dot products
        // as texture coordinates
        // returns the diffuse color in the
        // color components and the specular color in the
        // alpha component
        float2 illumCoord =
            float2(dot(IN.LightVector.xyz, bumpNormal.xyz),
                   dot(IN.HalfAngleVector.xyz, bumpNormal.xyz));
        float4 illumination = tex2D(IlluminationMap, illumCoord);

        // expand iterated normal to [-1,1]
        float4 normal = 2 * (IN.Normal - 0.5);

        // compute self-shadowing term
        float shadow = saturate(4 * dot(normal.xyz,
                                        IN.LightVectorUnsigned.xyz));

        // compute final color
        return (Ambient * color + shadow *
                (illumination * color + illumination.wwww));
    }

Bump Reflection Mapping

Description

This effect mixes bump mapping and reflection mapping based on the texm3x3vspec DirectX 8 pixel shader instruction (DOT_PRODUCT_REFLECT_CUBE_MAP in OpenGL). This instruction computes three dot products to transform the normal fetched from the normal map into the environment cube space, reflects the transformed normal with respect to the eye vector, and fetches a cube map to get the final color. The vertex shader is responsible for computing the transform matrix and the eye vector (Fig. 15).

[Fig. 15. Example of Bump Reflection Mapping]

Vertex Shader Source Code fo
258. nguage.

Typically, the initial value of a uniform variable or parameter is stored in a different class of hardware register. Furthermore, the external mechanism for specifying the initial value of uniform variables or parameters may be different than that used for specifying the initial value of non-uniform variables or parameters. Parameters qualified as uniform are normally treated as persistent state, while non-uniform parameters are treated as streaming data, with a new value specified for each stream record (such as within a vertex array).

Function Declarations

Functions are declared essentially as in C. A function that does not return a value must be declared with a void return type. A function that takes no parameters may be declared in one of two ways:

- As in C, using the void keyword: functionName(void)
- With no parameters at all: functionName()

Functions may be declared as static. If so, they may not be compiled as a program and are not visible from other compilation units.

Overloading of Functions by Profile

Cg supports overloading of functions by compilation profile. This capability allows a function to be implemented differently for different profiles. It is also useful because different profiles may support different subsets of the language capabilities, and because the most efficient implementation of a function may be different for different profiles.
259. nguage features, especially in fragment programs. These are referred to as basic profiles.

See "Language Profiles" on page 255 for detailed descriptions of these and related profiles.

Declaring Programs in Cg

CPU code generally consists of one program specified by main() in C. In contrast, a Cg program can have any name. A program is defined using the following syntax:

    <return-type> <program-name>(<parameters>)[: <semantic-name>]
    {
        /* ... */
    }

Program Inputs and Outputs

The programmable processors in GPUs operate on streams of data. The vertex processor operates on a stream of vertices, and the fragment processor operates on a stream of fragments.

A programmer can think of the main program as being executed just once on a CPU. In contrast, a program is executed repeatedly on a GPU, once for each element of data in a stream. The vertex program is executed once for each vertex, and the fragment program is executed once for each fragment.

The Cg language adds several capabilities to C to support this stream-based programming model. For new Cg programmers, these capabilities often take some time to understand because they have no direct correspondence to C capabilities. However, the sample programs later in this document demonstrate that it really is easy to use these capabilities in Cg programs.

Two Kinds of Program Inputs

A Cg program can consume two different kinds of inputs:

- Varying inputs are u
260. notation:

    float4 main(uniform Foo myfoo, uniform float myval) : COLOR
    {
        return myfoo.helper(myval);
    }

Note that in the current release, member variables must be declared before member functions that reference them; additionally, member functions may not be overloaded based on profile.

Arrays

Arrays are supported in Cg and are declared just as in C. Because Cg does not support pointers, arrays must always be defined using array syntax rather than pointer syntax:

    // Declare a function that accepts an array
    // of five skinning matrices.
    returnType foo(float4x4 mymatrix[5]) { /* ... */ }

Basic profiles place substantial restrictions on array declaration and usage. General-purpose arrays can only be used as uniform parameters to a vertex program. The intent is to allow an application to pass arrays of skinning matrices and arrays of light parameters to a vertex program.

The most important difference from C is that arrays are first-class types. That means array assignments actually copy the entire array, and arrays that are passed as parameters are passed by value (the entire array is copied before making any changes) rather than by reference.

Unsized Arrays

Cg supports unsized arrays: arrays with one or more dimensions having no specified length. This makes it possible to write Cg functions that operate on arrays of arbitrary size. For example:

    float myfunc(float val
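The by-value array semantics described in the Arrays section of this chunk differ from C, where an array argument decays to a pointer. A minimal C sketch of the Cg behavior, assuming a struct wrapper to force copying: Float4Array, scale_and_sum, and copy_then_modify are hypothetical names introduced only for this illustration and do not appear in the Cg Toolkit.

```c
/* C arrays decay to pointers when passed to functions, so this sketch
 * wraps the array in a struct to imitate Cg's by-value array semantics.
 * Float4Array is a hypothetical type used only for illustration. */
typedef struct {
    float v[4];
} Float4Array;

/* The callee receives a copy of the whole array; its writes never
 * reach the caller's array, as with a Cg array parameter. */
float scale_and_sum(Float4Array a, float s)
{
    float sum = 0.0f;
    for (int i = 0; i < 4; ++i) {
        a.v[i] *= s;      /* modifies the local copy only */
        sum += a.v[i];
    }
    return sum;
}

/* Assignment copies every element, as Cg array assignment does, so
 * changing dst afterwards leaves src untouched. */
Float4Array copy_then_modify(Float4Array src)
{
    Float4Array dst = src;
    dst.v[0] = 100.0f;
    return dst;
}
```

A C programmer porting code that relies on a callee modifying the caller's array would therefore need to return the modified array or use an out parameter in Cg.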
261. nslateCGerror(error));
        OutputDebugString(buffer);
    }

    cgSetErrorCallback(MyErrorCallback);

Introduction to CgFX

CgFX Overview

CgFX is an extended file format for Cg. In addition to Cg programs, CgFX files can also represent both fixed-function graphics state and meta-information about shader parameters. The CgFX API makes it possible to load CgFX effects files, traverse the data in them, set the associated graphics state, and so on. This chapter introduces this new API and the ideas behind it and is intended to make it easy to get started using CgFX.

This chapter assumes that the OpenGL state manager, implemented as part of the CgGL runtime, is being used. Because CgFX allows for extensible, custom state managers, alternate state managers that accept different state syntax may also be available. For example, a Direct3D state manager might accept Direct3D-style state names, while a Direct3D-Under-OpenGL state manager might accept Direct3D-style state names but allow for rendering using OpenGL.

Key Concepts

Effect

An effect file contains a collection of shader source code, parameters, and rendering techniques. An effect encapsulates one or more different methods to render a particular visual effect. For example, the effect might provide one approach intended for use on fixed-function hardware, and a different approach on more modern, programmable hardware.

Technique

Each
262. nt the named interface. Interfaces contain only function prototype definitions. They do not contain actual function implementations or data members. For example, the following example defines an interface named Light consisting of two methods, illuminate() and color():

    interface Light {
        float3 illuminate(float3 P, out float3 L);
        float3 color(void);
    };

A Cg structure may optionally implement an interface. This is signified by placing a ":" and the name of the interface after the name of the structure being defined. The methods required by the interface must be defined within the body of the structure. For example:

    struct SpotLight : Light {
        sampler2D shadow;
        samplerCUBE distribution;
        float3 Plight, Clight;

        float3 illuminate(float3 P, out float3 L) {
            L = normalize(Plight - P);
            return Clight * tex2D(shadow, P).xxx *
                   texCUBE(distribution, L).xyz;
        }

        float3 color(void) {
            return Clight;
        }
    };

Here, the SpotLight structure is defined, which implements the Light interface. Note that the illuminate() and color() methods are defined within the body of the structure, and that their implementations are able to reference data members of the SpotLight structure (for example, Plight, Clight, shadow, and distribution).

Function parameters, local variables, and global variables all may have interface types. Interface parameters to top-level functions, such
263. nts in the pass in order to determine what state was changed, so that it can set it back to the desired values. (The routines to manually traverse the state in a pass are explained in "OpenGL State" on page 129.)

Effect Parameters

Handles to effect parameters can be retrieved using cgGetNamedEffectParameter(). Given such a handle, the name of the parameter can be found with cgGetParameterName(), its value can be set using the Cg runtime value-setting entry points, and so on:

    CGparameter c = cgGetNamedEffectParameter(effect, "Color");
    cgSetParameter3fv(c, Color);

    CGparameter mvp = cgGetNamedEffectParameter(effect,
                                                "ModelViewProjection");
    cgGLSetStateMatrixParameter(mvp,
                                CG_GL_MODELVIEW_PROJECTION_MATRIX,
                                CG_GL_MATRIX_IDENTITY);

Vertex and Fragment Programs

With the OpenGL state manager, vertex and fragment programs are defined via assignments to the VertexProgram and FragmentProgram states, respectively. Three different classes of expressions can be given on the right-hand side of these state assignments:

- Compile statements
- In-line assembly
- NULL

These three possibilities are demonstrated in the effect file below:

    float4 main(uniform float foo, float4 uv : TEXCOORD0) : COLOR
    {
        return (foo > 0) ? uv : 2 * uv;
    }

    technique SimpleFrag {
        pass {
            VertexProgram = NULL;
            FragmentProgram = compile arbfp1 mai
264. ny constant that is not explicitly typed is implicitly typed. If the constant includes a decimal point, it is implicitly typed as cfloat. If it does not include a decimal point, it is implicitly typed as cint.

By default, constants are base 10. For compatibility with C, integer hexadecimal constants may be specified by prefixing the constant with 0x, and integer octal constants may be specified by prefixing the constant with 0.

Compile-time constant folding is preferably performed at the same precision that would be used if the operation were performed at run time. Some compilation profiles may allow some precision flexibility for the hardware; in such cases the compiler should ideally perform the constant folding at the highest hardware precision allowed for that data type in that profile.

If constant folding cannot be performed at run-time precision, it may optionally be performed using the precision indicated below for each of the numeric data types:

- float: s23e8 ("fp32") IEEE single precision floating point
- half: s10e5 ("fp16") floating point with IEEE semantics
- fixed: s1.10 fixed point, clamping to [-2, 2)
- double: s52e11 ("fp64") IEEE double precision floating point
- int: signed 32-bit integer

Type Qualifiers

The type of an object may be qualified with one or more qualifiers. Qualifiers apply only to objects. Qualifiers are removed from the valu
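The fixed entry in the precision list of this chunk (s1.10 fixed point, clamping to [-2, 2)) can be made concrete with a small C model. This is a sketch under stated assumptions, not the compiler's actual folding code: quantize_fixed_s1_10 is a hypothetical name, and round-to-nearest is an assumption, since the text specifies only the format and the clamping range.

```c
/* Hypothetical model of the s1.10 "fixed" format: a sign bit, one
 * integer bit, and 10 fractional bits, giving steps of 1/1024 over
 * the range [-2, 2). Round-to-nearest is an assumption of this
 * sketch; the specification text states only the clamping range. */
float quantize_fixed_s1_10(float x)
{
    const float scale = 1024.0f;   /* 2^10 steps per unit */
    float scaled = x * scale;
    long q;

    if (scaled <= -2048.0f) {
        q = -2048;                 /* clamp to -2.0 */
    } else if (scaled >= 2047.0f) {
        q = 2047;                  /* clamp to the largest value below 2.0 */
    } else {
        /* round half away from zero, then truncate */
        q = (long)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
    }
    return (float)q / scale;
}
```

Under this model, an out-of-range constant such as 3.0 folds to 2047/1024, which is the largest representable value below 2, while values such as 0.5 that are multiples of 1/1024 are preserved exactly.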
265. o clamp the result of a dot product computation to the range [0, 1] in a fragment program, use the saturate() function instead of max(). The clamp is often written as max(0, dot(N, L)), but as long as the N and L vectors are normalized, it can be written equivalently as saturate(dot(N, L)), because the dot product of two normalized vectors is never greater than one. Given that saturate() is free in fragment programs (see "3. Use the Cg Standard Library" on page 324), this compiles to more efficient code.

- Use the lit() Standard Library function, if appropriate. The lit() function implements a diffuse-glossy Blinn shading model. It takes three parameters:

    - The dot product of the normalized surface normal and the light vector
    - The dot product of a half-angle vector and the normal
    - The specular exponent

  It returns a 4-vector, where:

    - The x and w components are always one.
    - The y component is equal to the diffuse dot product, or to zero if the product is less than zero.
    - The z component is equal to the specular dot product raised to the given exponent, or to zero if the diffuse dot product was less than zero.

  All this is done substantially more efficiently than if the corresponding operations were written out in Cg code.

Appendix C Nine Steps to High Performance Cg

7. Take Advantage of the Different Levels of Computation Frequency

Always keep in mind the fact that fragment programs gen
266. o small objects using a single-bounce, ray-traced pass. In this example, the polygonal surface is sampled and a refraction vector is calculated. This vector is then intersected with a plane that is defined as being perpendicular to the object's x axis. The intersection point is calculated and used as texture indices for a painted iris.

The demo permits varying the index of refraction and the depth and density of the lens. Note that the choice of geometry is arbitrary: this sample is a sphere, but any polygonal model can be used.

Fig. 9  Example of Ray Traced Refraction

Advanced Profile Sample Shaders

Vertex Shader Source Code for Ray Traced Refraction

    struct appin
    {
        float4 Position : POSITION;
        float4 Normal   : NORMAL;
    };

    // output: same struct is the input to fragment shader
    struct EyeV2F
    {
        float4 HPosition : POSITION;   // clip space pos
        float3 OPosition : TEXCOORD0;  // Obj-coords location
        float3 VPosition : TEXCOORD1;  // eye pos (obj space)
        float3 N         : TEXCOORD2;  // normal (obj space)
        float4 LightVecO : TEXCOORD3;  // light dir (obj space)
    };

    EyeV2F main(appin IN,
                uniform float4x4 ModelViewProj,
                uniform float4x4 ModelViewI,
                uniform float4 LightVec)   // in EYE coords
    {
        EyeV2F OUT;
        // calculate clip-space position for rasterizer use
        OUT.HPosition = mul(ModelViewProj, IN.Position);
        // pass through object-space position
        OUT.OPositi
267. o the Cg Runtime Library

        D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT,
        D3DDECLUSAGE_TEXCOORD, 0 },
      D3DDECL_END()
    };

and the following Direct3D 8 vertex declaration is valid:

    DWORD declaration[] = {
      D3DVSD_STREAM(0),
      D3DVSD_REG(D3DVSDE_POSITION, D3DVSDT_FLOAT3),
      D3DVSD_REG(D3DVSDE_DIFFUSE, D3DVSDT_D3DCOLOR),
      D3DVSD_STREAM(1),
      D3DVSD_SKIP(4),
      D3DVSD_REG(D3DVSDE_TEXCOORD0, D3DVSDT_FLOAT2),
      D3DVSD_END()
    };

This is true because D3DDECLUSAGE_POSITION and D3DVSDE_POSITION match the hardware register associated with the predefined semantic POSITION, D3DDECLUSAGE_COLOR (usage index 0) and D3DVSDE_DIFFUSE match the register associated with COLOR0, and D3DDECLUSAGE_TEXCOORD (usage index 0) and D3DVSDE_TEXCOORD0 match the register associated with TEXCOORD0.

The above declarations can also be written the following way using cgD3D9ResourceToDeclUsage() or cgD3D8ResourceToInputRegister():

    const D3DVERTEXELEMENT9 declaration[] = {
      { 0, 0 * sizeof(float),
        D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
        cgD3D9ResourceToDeclUsage(CG_POSITION), 0 },
      { 0, 3 * sizeof(float),
        D3DDECLTYPE_D3DCOLOR, D3DDECLMETHOD_DEFAULT,
        cgD3D9ResourceToDeclUsage(CG_COLOR0), 0 },
      { 1, 4 * sizeof(float),
        D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT,
        cgD3D9ResourceToDeclUsage(CG_TEXCOORD0), 0 },
      D3DDECL_END()
268. of program using D3DXAssembleShader() with assembleFlags as the D3DXASM flags. Depending on the program's profile, it then either uses IDirect3DDevice9::CreateVertexShader() to create a Direct3D 9 vertex shader, or uses IDirect3DDevice9::CreatePixelShader() to create a Direct3D 9 pixel shader.

Here is a typical use of the function:

    HRESULT hresult = cgD3D9LoadProgram(vertexProgram, TRUE, D3DXASM_DEBUG);
    hresult = cgD3D9LoadProgram(fragmentProgram, TRUE, 0);

To load a program in Direct3D 8, use cgD3D8LoadProgram():

    HRESULT cgD3D8LoadProgram(CGprogram program,
                              BOOL parameterShadowingEnabled,
                              DWORD assembleFlags,
                              DWORD vertexShaderUsage,
                              const DWORD* declaration);

This function assembles the result of the compilation of program using D3DXAssembleShader() with assembleFlags as the D3DXASM flags. Depending on the program's profile, it then either uses IDirect3DDevice8::CreateVertexShader() to create a Direct3D vertex shader with declaration as the vertex declaration and vertexShaderUsage as the usage control, or uses IDirect3DDevice8::CreatePixelShader() to create a Direct3D pixel shader.

The value of parameterShadowingEnabled should be set to TRUE to enable parameter shadowing for the program. This behavior can be changed after the program is created by calling cgD3DEnableParameterShadowing().

Here is a typical use of the function
269. of reasons, including:

- Changing variability of parameters
  Parameters may be changed from uniform variability to literal variability (compile-time constant). See the cgSetParameterVariability manual page for more information.

- Changing value of literal parameters
  Changing the value of a literal parameter will require recompilation, since the value is used at compile time. See the cgSetParameter and cgSetMatrixParameter manual pages for more information.

- Resizing unsized arrays
  Changing the length of a parameter array may require recompilation, depending on the capabilities of the program profile. See the cgSetArraySize and cgSetMultiDimArraySize manual pages for more information.

- Connecting structures to interface parameters
  Structure parameters can be connected to interface program parameters to control the behavior of the program. Changing these connections requires recompilation on all current profiles. See the cgConnectParameter manual page and the Interfaces section of this document for more details.

When a program enters an uncompiled state, it is automatically unloaded and unbound. In order to be used again, the program must be recompiled (either automatically or manually; see the following) and then reloaded and rebound.

Compilation can be performed manually by the application via

    cgCompileProgram(CGprogram program);

or automatically by the runtime.

Com
270. of vectors than an array of matrices. Accessing a matrix requires a floor calculation, followed by a multiply by a constant to compute the register index. Because vectors (and scalars) take one register, neither the floor nor the multiply is needed. It is faster to do matrix skinning using arrays of vectors with a premultiplied index than using arrays of matrices.

Bindings

Binding Semantics for Uniform Data

The valid binding semantics for uniform parameters in the vs_2_0 and vs_2_x profiles are summarized in Table 39.

Table 39  vs_2_* Uniform Input Binding Semantics

    Binding Semantics Name         Corresponding Data
    register(c0)-register(c255)    Constant register [0..255].
    C0-C255                        The aliases c0-c255 (lowercase) are also
                                   accepted.
                                   If used with a variable that requires more
                                   than one constant register (for example, a
                                   matrix), the semantic specifies the first
                                   register that is used.

Binding Semantics for Varying Input/Output Data

Only the binding semantic names need be given for these profiles. The vertex parameter input registers are allocated dynamically. All the semantic names, except POSITION, can have a number from 0 to 15 after them.

Table 40  vs_2_* Varying Input Binding Semantics

    POSITION       PSIZE
    BLENDWEIGHT    BLENDINDICES
    NORMAL         TEXCOORD
    COLOR          TANGENT
    TESSFACTOR     BINORMAL

The valid binding semantics for varying output 
271. ogram. The arbvp1 conventions are compatible with the vp20 and vp30 profiles.

Loading Constants

Applications that do not use the Cg run time are no longer required to load constant values into program parameter registers as indicated by the #const expressions in the Cg compiler output. The compiler produces output that causes the OpenGL driver to load them. However, uniform variables that have a default definition still require constant values to be loaded into the appropriate program parameter registers, as ARB vertex programs do not support this feature. Application programs either have to use the Cg run time, parse and handle the #default commands, or avoid initializing uniform variables in the Cg source code.

Bindings

Binding Semantics for Uniform Data

The valid binding semantics for uniform parameters in the arbvp1 profile are summarized in Table 16.

Table 16  arbvp1 Uniform Input Binding Semantics

    Binding Semantics Name         Corresponding Data
    register(c0)-register(c255)    Local parameter with index n, n = [0..255].
    C0-C255                        The aliases c0-c255 (lowercase) are also
                                   accepted.
                                   If used with a variable that requires more
                                   than one constant register (for example, a
                                   matrix), the semantic specifies the first
                                   local parameter that is used.

Binding Semantics for Varying Input/Output Data

The valid binding semantics for uniform parameters in the a
272. ogram. To see if it is compatible with the program, use cgD3D9ValidateVertexDeclaration():

    CGbool cgD3D9ValidateVertexDeclaration(CGprogram program,
                                           const D3DVERTEXELEMENT9* declaration);

for the Direct3D 9 Cg runtime, or use cgD3D8ValidateVertexDeclaration():

    CGbool cgD3D8ValidateVertexDeclaration(CGprogram program,
                                           const DWORD* declaration);

for the Direct3D 8 Cg runtime.

A call to cgD3D9ValidateVertexDeclaration() or cgD3D8ValidateVertexDeclaration() returns CG_TRUE if the vertex declaration is compatible with the program. A Direct3D 9 declaration is compatible with the program if the declaration has an entry matching every varying input parameter used by the program. A Direct3D 8 declaration is compatible with the program if the declaration has a D3DVSD_REG() macro call matching every varying input parameter used by the program. For the program

    void main(float4 position : POSITION,
              float4 color    : COLOR0,
              float4 texCoord : TEXCOORD0)
    {
      // ...
    }

the following Direct3D 9 vertex declaration is valid:

    const D3DVERTEXELEMENT9 declaration[] = {
      { 0, 0 * sizeof(float),
        D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
        D3DDECLUSAGE_POSITION, 0 },
      { 0, 3 * sizeof(float),
        D3DDECLTYPE_D3DCOLOR, D3DDECLMETHOD_DEFAULT,
        D3DDECLUSAGE_COLOR, 0 },
      { 1, 4 * sizeof(float),

Introduction t
273. ogram for one bone  another program for two bones  and so on        Fig  5  Example of Improved Skinning       154 808 00504 0000 006  NVIDIA    Advanced Profile Sample Shaders    Vertex Shader Source Code for Improved Skinning    SERUCIE imauies                              float4 position POSITION   float4 weights BLENDWEIGHT   float4 normal NORMAL   float4 matrixIndices TESSFACTOR   float4 numBones SPECULAR    y    SstrUCC QUEPUES      float4 hPosition POSITION   float4 color COLORO     y     Oui   dUlcs mesita  Eajoules I  N   uniform float4x4 modelViewProj     uniform float3x4 boneMatrices 30      vision los colo   uniform float4 lightPos     outputs OUT     float4 index   IN matrixIndices   float4 weight   IN weights    float4 position    float3 normal    for  float i   0  i  lt  IN numBones x        transform the offset by bone i  position Position neto  float 4  mul  boneMatrices index x    il OMe       transform normal by bone i  normal normal   weight x    mul   float3x3  boneMatrices  ind       TN mal 7  o XS P    ff shit ower iin     the index and     the       index weight variables   weight for the current bone into    i       1  d    JUIN DOS ELO  ZE    Pepe    this moves     X component of the index and weight variables       808 00504 0000 006  NVIDIA    155          Cg Language Toolkit          156 808 00504 0000 006  NVIDIA    Advanced Profile Sample Shaders       Improved Water    Description    This demo gives the appearance that the viewer is surrounded 
274. om vertex shader
    struct vertout
    {
        float4 HPosition : POSITION;
        float4 Color     : COLOR0;
    };

    vertout main(appin IN,
                 uniform float4x4 ModelViewProj,
                 uniform float4x4 ModelViewIT,
                 uniform float4 LightVec)
    {
        vertout OUT;

        // Transform vertex position into homogenous clip-space.
        OUT.HPosition = mul(ModelViewProj, IN.Position);

        // Transform normal from model-space to view-space.
        float3 normalVec = normalize(mul(ModelViewIT, IN.Normal).xyz);

        // Store normalized light vector.
        float3 lightVec = normalize(LightVec.xyz);

        // Calculate half angle vector.
        float3 eyeVec = float3(0.0, 0.0, 1.0);
        float3 halfVec = normalize(lightVec + eyeVec);

        // Calculate diffuse component.
        float diffuse = dot(normalVec, lightVec);

        // Calculate specular component.
        float specular = dot(normalVec, halfVec);

        // Use the lit function to compute lighting vector from
        // diffuse and specular values.
        float4 lighting = lit(diffuse, specular, 32);

        // Blue diffuse material
        float3 diffuseMaterial = float3(0.0, 0.0, 1.0);

        // White specular material
        float3 specularMaterial = float3(1.0, 1.0, 1.0);

        // Combine diffuse and specular contributions and
        // output final vertex color.
        OUT.Color.rgb = lighting.y * diffuseMaterial +
                        lighting.z * specularMaterial;
        OUT.Color.a = 1.0;

        return OUT;
    }

Introduction to the Cg Language

Working with Data

Like C, Cg supports feature
275. ompiled into GPU assembly code, either on demand at run time or beforehand.

Cg makes it easy to combine a Cg fragment program with a handwritten vertex program, or even with the non-programmable OpenGL or DirectX vertex pipeline. Likewise, a Cg vertex program can be combined with a handwritten fragment program, or with the non-programmable OpenGL or DirectX fragment pipeline.

Cg Language Profiles

Because all CPUs support essentially the same set of basic capabilities, the C language supports this set on all CPUs. However, GPU programmability has not quite yet reached this same level of generality. For example, the current generation of programmable vertex processors supports a greater range of capabilities than do the programmable fragment processors. Cg addresses this issue by introducing the concept of language profiles. A Cg profile defines a subset of the full Cg language that is supported on a particular hardware platform or API. The current release of the Cg compiler supports the following profiles:

- OpenGL ARB vertex programs
  Runtime profile: CG_PROFILE_ARBVP1
  Compiler option: -profile arbvp1

- OpenGL ARB fragment programs
  Runtime profile: CG_PROFILE_ARBFP1
  Compiler option: -profile arbfp1

- OpenGL NV40 vertex programs
  Runtime profile: CG_PROFILE_VP40
  Compiler option: -profile vp40

- OpenGL NV40 fragment programs
  Runtime profile: CG_PROFILE_FP40
  Compiler option: -profile fp40

- OpenGL NV30 vertex programs
  Runtime p
276. on = IN.Position.xyz;
        // object-space normal
        OUT.N = normalize(IN.Normal.xyz);
        // transform view pos and light vec to obj space
        OUT.VPosition = mul(ModelViewI, float4(0, 0, 0, 1)).xyz;
        OUT.LightVecO = normalize(mul(ModelViewI, LightVec));
        return OUT;
    }

Pixel Shader Source Code for Ray Traced Refraction

    // Assume ray direction is normalized.
    // Vector "planeEq" is encoded half4(A,B,C,D) where
    // (Ax+By+Cz+D)=0 and half3(A,B,C) has been normalized.
    // Returns distance along ray to intersection; distance is
    // negative if there is no intersection.
    half intersect_plane(half3 rayOrigin, half3 rayDir,
                         half4 planeEq)
    {
        half3 planeN = planeEq.xyz;
        half denominator = dot(planeN, rayDir);
        half result = -1.0h;
        // d==0 -> parallel || d>0 -> faces away
        if (denominator < 0.0h) {
            half top = dot(planeN, rayOrigin) + planeEq.w;
            result = -top / denominator;
        }
        return result;
    }

    // subfields in "BallData"
    #define RADIUS x
    #define IRIS_DEPTH y
    #define ETA z
    #define LENS_DENSITY w

    // subfields in "SpecData"
    #define PHONG x
    #define GLOSS1 y
    #define GLOSS2 z
    #define DROP w

    struct EyeV2F
    {
        float4 HPosition : POSITION;
        float3 OPosition : TEXCOORD0;
        float3 VPosition : TEXCOORD1;
        float3 N         : TEXCOORD2;
        float4 LightVecO : TEXCOORD3;
    };

    half4 main(EyeV2F IN,
               uniform samp
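The plane test performed by intersect_plane in the listing can be checked on the CPU with a few lines of C. This is a sketch under stated assumptions: the plane is n.p + d = 0 with n normalized, the ray direction is normalized, and -1 is the "no hit" value. The type and function names (vec3_sketch, dot3, intersect_plane_sketch) are hypothetical and not part of the Cg toolkit.

```c
/* Minimal 3-vector and dot product for the sketch. */
typedef struct { float x, y, z; } vec3_sketch;

static float dot3(vec3_sketch a, vec3_sketch b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Plane is n.p + d = 0 with n normalized; ray_dir is assumed normalized.
   Returns the distance along the ray to the plane, or -1 when the ray is
   parallel to the plane or the plane faces away from the ray. */
static float intersect_plane_sketch(vec3_sketch ray_origin,
                                    vec3_sketch ray_dir,
                                    vec3_sketch plane_n, float plane_d)
{
    float denominator = dot3(plane_n, ray_dir);
    float result = -1.0f;
    if (denominator < 0.0f) {
        /* Solve n.(o + t*dir) + d = 0 for t. */
        float top = dot3(plane_n, ray_origin) + plane_d;
        result = -top / denominator;
    }
    return result;
}
```

For example, a ray starting at (0, 0, 5) and pointing down the negative z axis meets the plane z = 0 at distance 5, while a ray travelling along the x axis is parallel to that plane and yields -1.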
277. on, and are therefore only applied to those functions for which code is being generated. This specification uses the word program to refer to the top-level function, any functions the top-level function calls, and any global variables or typedef definitions it references.

Each profile must have a separate specification that describes its characteristics and limitations.

This core Cg specification requires certain minimum capabilities for all profiles. In some cases, the core specification distinguishes between vertex-program and fragment-program profiles, with different minimum capabilities for each.

The Uniform Modifier

Non-static global variables and parameters passed to functions (such as main()) can be declared with an optional qualifier uniform. To specify a uniform variable, use this syntax:

    uniform <type> <variable>

For example:

    uniform float4 myVector;

or

    float4 foo(uniform float4 uv);

If the uniform qualifier is specified for a function that is not top level, it is meaningless and is ignored. The intent of this rule is to allow a function to serve either as a top-level function or as one that is not.

Note that uniform variables may be read and written just like non-uniform variables. The uniform qualifier simply provides information about how the initial value of the variable is to be specified and stored, through a mechanism external to the la
278. onent of x is not equal to 0. Returns false otherwise.

    Function          Description
    asin(x)           Arcsine of x in range [-π/2, π/2];
                      x should be in [-1, 1].
    atan(x)           Arctangent of x in range [-π/2, π/2].
    atan2(y, x)       Arctangent of y/x in range [-π, π].
    ceil(x)           Smallest integer not less than x.
    clamp(x, a, b)    x clamped to the range [a, b] as follows:
                      - Returns a if x is less than a.
                      - Returns b if x is greater than b.
                      - Returns x otherwise.
    cos(x)            Cosine of x.
    cosh(x)           Hyperbolic cosine of x.
    cross(a, b)       Cross product of vectors a and b;
                      a and b must be 3-component vectors.
    degrees(x)        Radian-to-degree conversion.
    determinant(M)    Determinant of matrix M.
    dot(a, b)         Dot product of vectors a and b.
    exp(x)            Exponential function e^x.
    exp2(x)           Exponential function 2^x.
    floor(x)          Largest integer not greater than x.
    fmod(x, y)        Remainder of x/y, with the same sign as x.
                      If y is zero, the result is implementation-defined.

Table 1  Cg Standard Library Functions: Mathematical Functions (continued)

    Function          Description
    frac(x)           Fractional part of x.
    frexp(x, out exp) Splits x into a normalized fraction in the interval
                      [1/2, 1), which is returned, and a power of 2, which
                      is stored in exp. If x is zero, both parts of the
                      result are zero.
    isfinite(x)       Returns true if x is finite.
    isinf(x)          Returns true
279. orMap  IN TexCoords xy     half4 material   tex2D MaterialMap  IN TexCoords xy      half3 Nt   tex2D NormalMap  IN TexCoords xy  rgb    ha teo  0 5  045127 0   510  y          SpecData MAXSPEC  should  range from 0   1   half specStr   material SPEC STR   SpecData MAXSPEC     half specPower   SpecData MINPOWER    material NORM SPEC EXPON                            SpecData MAXPOWER   SpecData MINPOWER            palm oiva   morma lla vs  IN  essen mM MEISNIIO EO SENESIEGIET  half3 Ln   normalize IN LightVecO  xyz   half3 Nb   normalize BumpData BUMP SCALE      nal  nal    Fh Fh Fh  w    nal    ns    naL  naL    imn dew dem  ns    ns    Na    nar       half4 reflColor               ie dew de    nal        Nt x IN T   Nt y IN B      NE Z ENS       cizre   got NO   Hn    normalize Vn   Ln    4 lighting   lit diff  dot Hn  Nb   specPower       diffResult   lighting y   surfCol   specCol   lerp WHITE  surfCol  material METALNESS     specResult   lighting z   specStr   specCol                    3 reflVect   reflect  Vn  Nb               texCUBE  EnvMap  reflVect               fakeFresnel   ReflData FRESNEL_MIN               ReflData FRESNEL_MAX    pow  saturate  1 0h dot   Vn  IN N     ReflData FRESNEL_EXPON                               168    808 00504 0000 006  NVIDIA    Advanced Profile Sample Shaders          808 00504 0000 006 169  NVIDIA          Cg Language Toolkit       Ray Traced Refraction    Description    This shader presents a method for adding high quality details t
280. orm samplerCUBE EnvironmentMap,
                uniform float3 EyeVector) : COLOR
    {
        // fetch the bump normal from the normal map
        float4 normal = tex2D(NormalMap, IN.TexCoord.xy);

        // transform the bump normal into cube space
        // then use the transformed normal and eye vector
        // to compute the reflection vector that is
        // used to fetch the cube map
        return texCUBE_reflect_eye_dp3x3(EnvironmentMap,
                                         IN.TangentToCubeSpace2.xyz,
                                         IN.TangentToCubeSpace0,
                                         IN.TangentToCubeSpace1,
                                         normal,
                                         EyeVector);
    }

Fresnel

Description

This effect computes a reflection vector to look up into an environment map for reflections, and modulates this by a Fresnel term. The result is reflections only at grazing angles (Fig. 16).

Fig. 16  Example of Fresnel

Vertex Shader Source Code for Fresnel

    struct app2vert
    {
        float4 Position  : POSITION;
        float4 Normal    : NORMAL;
        float4 TexCoord0 : TEXCOORD0;
    };

Basic Profile Sample Shaders

    struct vert2frag
    {
        float4 HPosition : POSITION;
        float4 Color0    : COLOR0;
        float4 TexCoord0 : TEXCOORD0;
    };

    vert2frag main(app2vert IN,
                   uniform float4x4 ModelViewProj,
                   uniform float4x4 ModelView,
                   uniform float4x4 ModelViewIT)
    {
        vert2frag OUT;

    #ifdef PROFILE_ARBVP1
        ModelViewProj = glstate.matrix.mvp;
        ModelView = glstate.matrix.modelview[0];
        ModelViewIT = glstate.matrix.inv
281. ormal parameter is a struct, the binding semantic may be specified with an element of the struct when the struct is defined:

    struct <struct-tag> {
        <type> <identifier> [: <binding-semantic>];
        ...
    };

- If the input to the function is implicit (a non-static global variable that is read by the function), the binding semantic may be specified when the non-static global variable is declared:

    <type> <identifier> [: <binding-semantic>] [= <initializer>];

  If the non-static global variable is a struct, the binding semantic may be specified when the struct is defined, as described in the second bullet above.

- A binding semantic may be associated with the output of a top-level function in a similar manner:

    <type> <identifier> (<parameter-list>) [: <binding-semantic>]
    {
        <body>
    }

  Another method available for specifying a semantic for an output value is to return a struct and to specify the binding semantic(s) with elements of the struct when the struct is defined. In addition, if the output is a formal parameter, the binding semantic may be specified using the same approach used to specify binding semantics for inputs.

Aliasing of Semantics

Semantics must honor a copy-on-input and copy-on-output model. Thus, if the same input binding semantic is used for two different variables, those v
282. ormalize vert Binormal    normalize vert Normal  j        184    808 00504 0000 006  NVIDIA    Advanced Profile Sample Shaders             FRESNEL   OFFSET  SCALE  POWER  UNUSED     float4 Fresnel O Ce p c                            float3x3 ViewTangent   mul  ModelTangent    float3x3 ModelViewIT            Generate VIEW SPACE vectors   float3 viewN   normalize  mul   float3x3 ModelView   vert  ONormal     float4 viewP   mul  ModelView  vert OPosition    viewP w   l saturate sqrt  dot  viewP xyz   ViewP xyz   0 01               float3 viewV    viewP xyz           Generate OBJECT SPACE vectors   float3 objV   normalize  EyePosition vert OPosition xyz     float3 objL   normalize  LightVector      float3 objH   normalize objL   objV                           Generate TANGENT SPACE vectors   float3 tanL   mul  ModelTangent  objL    float3 tanV   mul  ModelTangent  objV    float3 tanH   mul  ModelTangent  objH                           Generate REFLECTION vector for per vertex     reflection look up  float3 reflection   reflect   viewV  viewN                     Generate FRESNEL term   float ndv   saturate  dot  viewN  viewV      float FresnelApprox    pow  1 ndv  Fresnel z  Fresnel y    Fresnel x               Fill OUTPUT parameters                                  ONU   yart wwe    TEXCOORDO xy  O   dbitcoylee   tanL     Tangent space LIGHT     Tangent space HALF ANGLE  O halfangle   float4 tanH x  tanH y   tanH z  l exp  viewP w     G reflection   deux op    View space REFLECTI
283. oved Water  158
    Pixel Shader Source Code for Improved Water  160
Melting Paint  161
    Description  161
    Vertex Shader Source Code for Melting Paint  161
    Pixel Shader Source Code for Melting Paint  163
MultiPaint  165
    Description  165
    Vertex Shader Source Code for MultiPaint  166
    Pixel Shader Source Code for MultiPaint  167
Ray Traced Refraction  170
    Description  170
    Vertex Shader Source Code for Ray Traced Refraction  171
    Pixel Shader Source Code for Ray Traced Refraction  172
Skin  175
    Description  175
    Pixel Shader Source Code for Skin  175
Thin Film Effect  180
    Description  180
    Vertex Shader Source Code for Thin Film Effect  180
    Pixel Shader Source Code for Thin Film Effect  182
Car Paint 9  183
    Description  183
    Vertex Shader Source Code for Car Paint 9  184
    Pixel Shader Source Code for 
284. parameters in the vs_2_0 and vs_2_x profiles are summarized in Table 41.

Appendix B Language Profiles

These map to output registers in DirectX 9 vertex shaders.

Table 41  vs_2_* Varying Output Binding Semantics

    Binding Semantics Name    Corresponding Data
    POSITION                  Output position: oPos
    PSIZE                     Output point size: oPts
    FOG                       Output fog value: oFog
    COLOR0-COLOR1             Output color values: oD0, oD1
    TEXCOORD0-TEXCOORD7       Output texture coordinates: oT0-oT7

Options

The vs_2_x profile allows the following profile-specific options:

    DynamicFlowControlDepth=<n>    where n = 0 or 24 (default 24)
    NumTemps=<n>                   where 12 <= n <= 32 (default 16)
    Predication                    (default true)

DirectX Pixel Shader 2.x Profiles (ps_2_*)

The DirectX Pixel Shader 2.0 Profiles are used to compile Cg source code to DirectX 9 PS 2.0 pixel shaders and DirectX 9 PS 2.0 extended pixel shaders.

- Profile names
  ps_2_0 (for DirectX 9 PS 2.0 pixel shaders)
  ps_2_x (for DirectX 9 PS 2.0 extended pixel shaders)

- How to invoke
  Use the compiler options
  -profile ps_2_0
  -profile ps_2_x

The ps_2_0 profile limits Cg to match the capabilities of DirectX PS 2.0 pixel shaders. The ps_2_x profile is the same as the ps_2_0 profile but allows extended features such as arbitrary swizzles, larger limit on number of 
285. pare for lighting

    // store normalized light vector
    float3 lightVec = normalize(LightVec.xyz);

    // calculate half angle vector
    float3 eyeVec = float3(0.0, 0.0, 1.0);
    float3 halfVec = normalize(lightVec + eyeVec);

At this point we have to ensure that all our vectors are normalized. We start by normalizing LightVec (see footnote 1). Then, in preparation for specular lighting, we have to define the "half-angle" vector halfVec, which is the vector halfway between the light and the eye vectors, that is, (lightVec + eyeVec)/2. We normalize halfVec, so we don't need to bother with the division by two, because it cancels out after normalization anyway. In this example, we assume that the eye is at (0, 0, 1), but an application would typically pass the eye position also as a uniform parameter, since it would be unchanged from vertex to vertex. We use Cg's inline vector construction capability to build a 3-component float vector that contains the eye position, and then we assign this value to eyeVec.

1. Because LightVec is uniform, it is more efficient to normalize it once in the application rather than on a per-vertex basis. It is done here for illustrative purposes.

A Brief Tutorial

Calculating the Vertex Color

Now we have to calculate the vertex color to output.

Calculating the Diffuse and Specular Lighting Contributions

In this example, we're going to calculate just a simple combination of diffuse and specular ligh
286. pecification

The profile name must immediately precede the type name in the function declaration. For example, to define two different versions of the function myfunc() for the profileA and profileB profiles:

    profileA float myfunc(float x) { /* ... */ };
    profileB float myfunc(float x) { /* ... */ };

If a type is defined (using a typedef) that has the same name as a profile, the identifier is treated as a type name and is not available for profile overloading at any subsequent point in the file.

If a function definition does not include a profile, the function is referred to as an open-profile function. Open-profile functions apply to all profiles.

Several wildcard profile names are defined. The name vs matches any vertex profile, while the name ps matches any fragment or pixel profile.

The names ps_1 and ps_2 match any DirectX 8 pixel shader 1.x profile or DirectX 9 pixel shader 2.x profile, respectively. Similarly, the names vs_1 and vs_2 match any DirectX vertex shader 1.x or 2.x profile, respectively. Additional valid wildcard profile names may be defined by individual profiles.

In general, the most specific version of a function is used. More details are provided in "Function Overloading" on page 240, but roughly speaking, the search order is the following:

1. Version of the function with the exact profile overload
2. Version of the function with the most specific wildcard profile overload (such as vs or ps_1)
3. Version of the function 
287. peration, and
    m is the 2-D bump environment mapping matrix.
    This function can generate the texbem instruction in all ps_1_x profiles.

offsettex2DScaleBias(uniform sampler2D tex, float2 st, float4 prevlookup, uniform float4 m, uniform float scale, uniform float bias)
    Performs the following:
        float2 newst = st + m.xy * prevlookup.xx + m.zw * prevlookup.yy;
        float4 result = tex2D(tex, newst);
        return result * saturate(prevlookup.z * scale + bias);
    where
        st are texture coordinates associated with sampler tex,
        prevlookup is the result of a previous texture operation,
        m is the 2-D bump environment mapping matrix,
        scale is the 2-D bump environment mapping scale factor, and
        bias is the 2-D bump environment mapping offset.
    This function can generate the texbeml instruction in all ps_1_x profiles.

Table 54. ps_1_x Auxiliary Texture Functions (continued)

Texture Function
    Description

tex1D_dp3(sampler1D tex, float3 str, float4 prevlookup)
    Performs the following:
        return tex1D(tex, dot(str, prevlookup.xyz));
    where
        str are texture coordinates associated with sampler tex, and
        prevlookup is the result of a previous texture operation.
    This function can be used to generate the texdp3tex instruction in the ps_1_2 and ps_1_3 profiles.

tex2D_dp3x2(uniform sampler2D tex, float3 str, float4 intermediate_coord, float4 prevlookup)
    Performs the following:
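The arithmetic that offsettex2DScaleBias is described as performing can be traced on the CPU. The following C sketch is an illustration only: the Vec2 and Vec4 types and the function names are hypothetical, and the texture fetch itself is left out, so only the coordinate offset and the scale/bias factor are computed.

```c
/* Hypothetical stand-ins for Cg's float2 and float4. */
typedef struct { float x, y; } Vec2;
typedef struct { float x, y, z, w; } Vec4;

/* Clamp to [0, 1], mirroring Cg's saturate(). */
static float saturate_f(float v)
{
    if (v < 0.0f) return 0.0f;
    if (v > 1.0f) return 1.0f;
    return v;
}

/* newst = st + m.xy * prevlookup.xx + m.zw * prevlookup.yy */
static Vec2 offset_coords(Vec2 st, Vec4 m, Vec4 prevlookup)
{
    Vec2 newst;
    newst.x = st.x + m.x * prevlookup.x + m.z * prevlookup.y;
    newst.y = st.y + m.y * prevlookup.x + m.w * prevlookup.y;
    return newst;
}

/* Factor applied to the fetched texel: saturate(prevlookup.z * scale + bias). */
static float scale_bias_factor(Vec4 prevlookup, float scale, float bias)
{
    return saturate_f(prevlookup.z * scale + bias);
}
```

With m set to (1, 0, 0, 1), the offset reduces to adding prevlookup.xy to st, which is a convenient sanity check.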
288. peration or arithmetic operation can occur in the program. A texture shader operation may not have any dependency on the output of an arithmetic operation unless

- the arithmetic operation is a valid input modifier for the texture shader operation.
- the arithmetic operation is part of a complex texture shader operation, which are summarized in the section "Auxiliary Texture Functions" on page 290.

[5] For more details about the underlying instruction sets, their capabilities, and their limitations, please refer to the NV_texture_shader and NV_register_combiners extensions in the OpenGL Extensions documentation.

Modifiers

There are certain simple arithmetic operations that can be applied to inputs of texture shader operations and to inputs and outputs of arithmetic operations without generating a register combiner instruction. These operations are referred to as input modifiers and output modifiers.

Instead of generating a register combiners instruction, the arithmetic operation modifies the assembly instruction or source registers to which it is applied. For example, the following Cg expression

    z = (x - 0.5 + y) / 2

could generate the following register combiner instruction, assuming x is in tex0, y is in tex1, and z is in col0:

    rgb {
        discard = half_bias(tex0.rgb);
        discard = tex1.rgb;
        col0 = sum();
        scale_by_one_half();
    }
    alpha {
        discard = half_bias(tex0.a);
289. pilation behavior is controlled via

    void cgSetAutoCompile(CGcontext ctx, CGenum flag);

Here, flag may be one of the following enumerants:

- CG_COMPILE_MANUAL
  In this mode, the application is responsible for manually compiling a program. The application may check to see if a program requires recompilation with the entry point cgIsProgramCompiled(). The program may then be compiled via cgCompileProgram(). This mode provides the application with the most control over how and when program recompilation occurs.

- CG_COMPILE_IMMEDIATE
  In this mode, the Cg runtime will force compilation automatically and immediately when a program enters an uncompiled state, or when the program is first created. This is the default mode.

- CG_COMPILE_LAZY
  This mode is similar to CG_COMPILE_IMMEDIATE, but will delay program compilation until the program object code is needed. The advantage of this method is the reduction of extraneous recompilations. The disadvantage is that compile-time errors will not be encountered when the program enters an uncompiled state, but will instead be encountered at some later time, most likely when the program is loaded or bound.

A call to cgIsProgramCompiled() determines whether a program needs to be recompiled:

    CGbool cgIsProgramCompiled(CGprogram program);

To recompile a program, use cgCompileProgram():

    void cgCompileProgram(CGprogram program);
290. ponding Data

POSITION, HPOS                      Output position
PSIZE, PSIZ                         Output point size

Table 25. vp30 Varying Output Binding Semantics (continued)

Binding Semantics Name              Corresponding Data
FOG, FOGC                           Output fog coordinate
COLOR0, COL0                        Output primary color
COLOR1, COL1                        Output secondary color
BCOL0                               Output backface primary color
BCOL1                               Output backface secondary color
TEXCOORD0-TEXCOORD7, TEX0-TEX7      Output texture coordinates
CLP0-CLP5                           Output clip distances

The profile allows WPOS to be present as binding semantics on a member of a structure of a varying output data structure, provided the member with this binding semantics is not referenced. This allows Cg programs to have the same structure specify the varying output of a vp30 profile program and the varying input of an fp30 profile program.

OpenGL NV_fragment_program Profile (fp30)

The fp30 Fragment Program Profile is used to compile Cg source code to fragment programs for use by the NV_fragment_program OpenGL extension.

- Profile name: fp30
- How to invoke: Use the compiler option -profile fp30.

This section describes the capabilities and restrictions of Cg when using the fp30 profile.

Language Constructs and Support

Data Types

- fixed type (s1.10 fixed point) is supported.
- half type (s10e5
291. ponding Data

register(s0)-register(s15), TEXUNIT0-TEXUNIT15
    Texunit image unit N, where N is in range [0..15]. May only be used with uniform inputs with sampler* types.
register(c0)-register(c31), C0-C31
    Local Parameter N, where N is in range [0..31]. May only be used with uniform inputs.

Binding Semantics for Varying Input/Output Data

The valid binding semantics for varying input parameters in the arbfp1 profile are summarized in Table 20.

Table 20. arbfp1 Varying Input Binding Semantics

Binding Semantics Name    Corresponding Data (type)
COLOR0                    Input color 0 (float4)
COLOR1                    Input color 1 (float4)
TEXCOORD0-TEXCOORD7       Input texture coordinates (float4)

The valid binding semantics for varying output parameters in the arbfp1 profile are summarized in Table 21.

Table 21. arbfp1 Varying Output Binding Semantics

Binding Semantics Name    Corresponding Data
COLOR, COLOR0             Output color (float4)
DEPTH                     Output depth (float)

Options

The ARB fragment program profile allows the following profile-specific options:

    NumTemps=<n>                  where 0 <= n <= 32 (default 32)
    NumInstructionSlots=<n>       where n >= 0 (default 1024)
    NumMathInstructionSlots=<n>   where n >= 0 (default 1024)
    NoDependentReadLimit=<b>      where b = 0 or 1 (default 1)
    NumTexInstructionSlots=<n>    where n >= 0 (default 102
292. pport the half type, but may choose to implement it with the same precision as the float type.

- The fixed type is a signed type with a range of at least [-2,2) and with at least 10 bits of fractional precision. Overflow operations on the data type clamp rather than wrap. Fragment profiles must support the fixed type, but may implement it with the same precision as the half or float types. Vertex profiles are required to provide partial support (see "Partial Support of Types" on page 231) for the fixed type. Vertex profiles have the option to provide full support for the fixed type or to implement the fixed type with the same precision as the half or float types.

- The bool type represents Boolean values. Objects of bool type are either true or false.

- The cint type is 32-bit two's complement. This type is meaningful only at compile time; it is not possible to declare objects of type cint.

- The cfloat type is IEEE single-precision (32-bit) floating point. This type is meaningful only at compile time; it is not possible to declare objects of type cfloat.

- The void type may not be used in any expression. It may only be used as the return type of functions that do not return a value.

- The sampler* types are handles to texture objects. Formal parameters of a program or function may be of type sampler*. No other definition of sampler* variables is permitted. A sampler* v
293. profiles affects the Cg source code that the developer writes.

The vs_2_0 profile limits Cg to match the capabilities of DirectX VS 2.0 vertex shaders. The vs_2_x profile is the same as the vs_2_0 profile but allows extended features such as dynamic flow control (branching).

DirectX 9 vertex shaders have a limited amount of memory for instructions and data.

Program Instruction Limit

DirectX 9 vertex shaders are limited to 256 instructions. If the compiler needs to produce more than 256 instructions to compile a program, it reports an error.

Vector Register Limit

Likewise, there are limited numbers of registers to hold program parameters and temporary results. Specifically, there are 256 read-only vector registers and 12-32 read/write vector registers. If the compiler needs more registers to compile a program than are available, it generates an error.

[6] To understand the DirectX VS 2.0 Vertex Shaders and the code the compiler produces, see the Vertex Shader Reference in the DirectX 9 SDK documentation.

Statements and Operators

If the vs_2_0 profile is used, then if, while, do, and for statements are allowed only if the loops they define can be unrolled, because there is no dynamic branching in unextended VS 2.0 shaders.

If the vs_2_x profile is used, then if, while, and do statements are fully supported as long as the DynamicFlowControlDepth option is not 0.
294. r Bump Reflection Mapping

    struct a2v {
        float4 Position : POSITION;   // in object space
        float2 TexCoord : TEXCOORD0;
        float3 T : TEXCOORD1;         // in object space
        float3 B : TEXCOORD2;         // in object space
        float3 N : TEXCOORD3;         // in object space
    };

    struct v2f {
        float4 Position : POSITION;   // in projection space
        float4 TexCoord : TEXCOORD0;

        // first row of the 3x3 transform
        // from tangent to cube space
        float4 TangentToCubeSpace0 : TEXCOORD1;

        // second row of the 3x3 transform
        // from tangent to cube space
        float4 TangentToCubeSpace1 : TEXCOORD2;

        // third row of the 3x3 transform
        // from tangent to cube space
        float4 TangentToCubeSpace2 : TEXCOORD3;
    };

    v2f main(a2v IN,
             uniform float4x4 WorldViewProj,
             uniform float3x4 ObjToCubeSpace,
             uniform float3 EyePosition,   // in cube space
             uniform float BumpScale)
    {
        v2f OUT;

        // pass texture coordinates for
        // fetching the normal map
        OUT.TexCoord.xy = IN.TexCoord.xy;

        // compute 3x3 transform from tangent to object space
        float3x3 objToTangentSpace;

        // first rows are the tangent and binormal
        // scaled by the bump scale
        objToTangentSpace[0] = BumpScale * IN.T;
        objToTangentSpace[1] = BumpScale * IN.B;
        objToTangentSpace[2] = IN.N;

        // compute the 3x3 transform from
        // tangent space to cube space:
        // TangentToCubeSpace
        //   = object2cube * tangent2object
        //   = object2cube * trans
295. r or equal to zero and less than the value of GL_MAX_PROGRAM_LOCAL_PARAMETERS_ARB for the GL_VERTEX_PROGRAM_ARB target to glGetProgramivARB

VertexLocalParameter[ndx]   float4              ARB_vertex_program; ndx must be greater or equal to zero and less than the value of GL_MAX_PROGRAM_LOCAL_PARAMETERS_ARB for the GL_VERTEX_PROGRAM_ARB target to glGetProgramivARB
VertexProgram               compile statement   ARB_vertex_program or NV_vertex_program

Similarly, there is a simple algorithm for determining the relationship between enumerants for glEnable() and for glDisable() and each of the states in the table below; for example, the state assignment BlendEnable = false corresponds to a call to glDisable(GL_BLEND).

Table 7. Enable/Disable States

Enable/Disable State Name   Type   Requires
AlphaTestEnable             bool   OpenGL 1.0
AutoNormalEnable            bool   1.0
BlendEnable                 bool   1.0
ClipPlaneEnable[ndx]        bool   1.0; ndx must be greater or equal to zero and less than the value of GL_MAX_CLIP_PLANES
ColorLogicOpEnable          bool   1.2
CullFaceEnable              bool   1.0
DepthBoundsEnable           bool   EXT_depth_bounds
DepthClampEnable            bool   NV_depth_clamp
DepthTestEnable             bool   1.0
DitherEnable                bool   1.0
FogEnable                   bool   1.0
LightEnable[ndx]            bool   1.0; ndx must be greater or equal to 0 and less than the value of GL_MAX_LIGHTS
Lighting
296. ragraph are three lists of the state fields that can be accessed. The array indexes are shown as 0, but an array can be accessed using any non-negative integer that is less than the limit of the array. For example, the diffuse component of the second light would be accessed by using the semantic state.light[1].diffuse (assuming that GL_MAX_LIGHTS is at least 2), as shown in the following code:

    void main(uniform float4 lightColor : state.light[1].diffuse)

[1] See "OpenGL NV_vertex_program 1.0 Profile (vp20)" on page 279 for a full explanation of the data types, statements, and operators supported by this profile.

The state semantics of type float4x4 that can be accessed are in Table 13.

Table 13. float4x4 state Semantics

state.matrix.modelview[0]             state.matrix.projection
state.matrix.mvp                      state.matrix.texture[0]
state.matrix.palette[0]               state.matrix.program[0]
state.matrix.inverse.modelview[0]     state.matrix.inverse.projection
state.matrix.inverse.mvp              state.matrix.inverse.texture[0]
state.matrix.inverse.palette[0]       state.matrix.inverse.program[0]
state.matrix.transpose.modelview[0]   state.matrix.transpose.projection
state.matrix.transpose.mvp            state.matrix.transpose.texture[0]
state.matrix.transpose.palette[0]     state.matrix.transpose.program[0]
state.matrix.invtrans.modelview[0]    state.matrix.invtrans.projection
state.matrix.invtrans.m
297. ram, CG_COMPILED_PROGRAM);
        D3DXAssembleShader(progSrc, strlen(progSrc), 0, 0, &byteCode, 0);
        device->CreatePixelShader(byteCode->GetBufferPointer(), &pixelShader);

        // Grab some parameters
        modelViewMatrix = cgGetNamedParameter(vertexProgram, "ModelViewMatrix");
        baseTexture = cgGetNamedParameter(fragmentProgram, "BaseTexture");
        someColor = cgGetNamedParameter(fragmentProgram, "SomeColor");

        // Sanity check that parameters have the expected size
        assert(cgD3D8TypeToSize(cgGetParameterType(modelViewMatrix)) == 16);
        assert(cgD3D8TypeToSize(cgGetParameterType(someColor)) == 4);
    }

    // Called to render the scene
    void OnRender()
    {
        // Get the Direct3D resource locations for parameters
        // This can be done earlier and saved
        DWORD modelViewMatrixRegister = cgGetParameterResourceIndex(modelViewMatrix);
        DWORD baseTextureUnit = cgGetParameterResourceIndex(baseTexture);
        DWORD someColorRegister = cgGetParameterResourceIndex(someColor);

        // Set the Direct3D state
        device->SetVertexShaderConstant(modelViewMatrixRegister, &matrix, 4);
        device->SetPixelShaderConstant(someColorRegister, &constantColor, 1);
        device->SetTexture(baseTextureUnit, texture);
        device->SetVertexShader(vertexShader);
        device->SetPixelShader(pixelShader);

        // Draw scene
        // ...
    }

    // Called befor
298. ram and the varying input of an fp30 profile program.

OpenGL NV_texture_shader and NV_register_combiners Profile (fp20)

The OpenGL NV_texture_shader and NV_register_combiners profile is used to compile Cg source code to the nvparse text format for the NV_texture_shader and NV_register_combiners family of OpenGL extensions.

- Profile name: fp20
- How to invoke: Use the compiler option -profile fp20.

This document describes the capabilities and restrictions of Cg when using the fp20 profile.

Overview

Operations in the fp20 profile can be categorized as texture shader operations and arithmetic operations. Texture shader operations are operations which generate texture shader instructions; arithmetic operations are operations which generate register combiners instructions.

The underlying instruction set and machine architecture limit programmability in this profile compared to what is allowed by Cg constructs. Thus, this profile places additional restrictions on what can and cannot be done in a Cg program.

Restrictions

A Cg program in one of these profiles is limited to generating a maximum of four texture shader instructions and eight register combiner instructions. Since these numbers are quite small, users need to be very aware of this limitation while writing Cg code for these profiles.

The fp20 profile also restricts when a texture shader o
299. rameter parameter, const double* matrix);

The matrix is passed as an array of floating-point values whose size matches the number of coefficients of the matrix. The r suffix is for functions that assume the matrix is laid out in row order, and the c suffix is for functions that assume the matrix is laid out in column order.

The corresponding parameter value retrieval functions are

    void cgGLGetMatrixParameterfr(CGparameter parameter, float* matrix);
    void cgGLGetMatrixParameterfc(CGparameter parameter, float* matrix);
    void cgGLGetMatrixParameterdr(CGparameter parameter, double* matrix);
    void cgGLGetMatrixParameterdc(CGparameter parameter, double* matrix);

Use cgGLSetStateMatrixParameter() to set an OpenGL 4x4 state matrix:

    void cgGLSetStateMatrixParameter(CGparameter parameter, GLenum stateMatrixType, GLenum transform);

The variable stateMatrixType is an enumerated type specifying the state matrix to be used to set the parameter:

- CG_GL_MODELVIEW_MATRIX for the current model-view matrix
- CG_GL_PROJECTION_MATRIX for the current projection matrix
- CG_GL_TEXTURE_MATRIX for the current texture matrix
- CG_GL_MODELVIEW_PROJECTION_MATRIX for the concatenated model-view and projection matrices

The variable transform is an enumerated type specifying a transformation applied to the state matrix before it is used to set the parameter value:

- CG_GL_MATRIX
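The difference between the row-order and column-order layouts mentioned above comes down to how a (row, column) pair maps to an index in the flat array. The C sketch below illustrates this for a 4x4 matrix; the helper names are hypothetical and are not part of the Cg runtime.

```c
/* Element (row, col) of a 4x4 matrix stored in row order. */
static float element_row_order(const float m[16], int row, int col)
{
    return m[row * 4 + col];
}

/* Element (row, col) of a 4x4 matrix stored in column order. */
static float element_column_order(const float m[16], int row, int col)
{
    return m[col * 4 + row];
}

/* Re-pack a row-order array into column order (a transpose of the storage). */
static void row_to_column_order(const float in[16], float out[16])
{
    for (int row = 0; row < 4; ++row)
        for (int col = 0; col < 4; ++col)
            out[col * 4 + row] = in[row * 4 + col];
}
```

An application that keeps its matrices in one layout would pick the matching suffix rather than re-packing by hand; the conversion routine is shown only to make the two layouts concrete.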
300. ray, you can use cgGetArrayDimension(), cgGetArraySize(), cgGetArrayParameter(), and cgGetNextParameter():

    int cgGetArrayDimension(CGparameter parameter);
    int cgGetArraySize(CGparameter parameter, int dimension);
    CGparameter cgGetArrayParameter(CGparameter parameter, int index);

These three functions return 0 if parameter is not of type CG_ARRAY. Function cgGetArrayDimension() gives the dimension of the array. It returns 1 for float4 array[10], 2 for float4 array[10][100], and so on. Next, cgGetArraySize() gives the size of every dimension. For example, for float4 array[10][100], cgGetArraySize(array, 0) returns 10 and cgGetArraySize(array, 1) returns 100. An array, anArray, has cgGetArraySize(anArray, 0) elements. If its dimension is greater than one, those elements are themselves arrays.

Here is how these iteration functions could be used given a valid program named program:

    void IterateProgramParameters(CGprogram program)
    {
        RecurseProgramParameters(cgGetFirstParameter(program, CG_PROGRAM));
    }

    void RecurseProgramParameters(CGparameter parameter)
    {
        if (parameter == 0)
            return;
        do {
            switch (cgGetParameterType(parameter)) {
            case CG_STRUCT:
                RecurseProgramParameters(cgGetFirstStructParameter(parameter));
                break;
            case CG_ARRAY:
                {
                    int arraySize = cgGetArraySize(parameter, 0);
                    for (int i = 0; i < arraySize; ++i)
                        RecurseProgramParameters(cgGetArrayPara
301. rbvp1 profile are summarized in Table 17.

The set of binding semantics for varying input data to arbvp1 consists of POSITION, BLENDWEIGHT, NORMAL, COLOR0, COLOR1, TESSFACTOR, PSIZE, BLENDINDICES, and TEXCOORD0-TEXCOORD7. One can also use TANGENT and BINORMAL instead of TEXCOORD6 and TEXCOORD7. Additionally, a set of generic binding semantics of ATTR0-ATTR15 can be used. In OpenGL implementations, conventional and generic vertex attributes may or may not be aliases for each other; see the ARB_vertex_program specification for more details. The mapping of these semantics to corresponding setting command is listed in the table.

Table 17. arbvp1 Varying Input Binding Semantics

Binding Semantics Name    Corresponding Data
POSITION                  Input Vertex, through Vertex command
BLENDWEIGHT               Input vertex weight through WeightARB, VertexWeightEXT command
NORMAL                    Input normal through Normal command
COLOR0, DIFFUSE           Input primary color through Color command
COLOR1, SPECULAR          Input secondary color through SecondaryColorEXT command
FOGCOORD                  Input fog coordinate through FogCoordEXT command
TEXCOORD0-TEXCOORD7       Input texture coordinates (texcoord0-texcoord7) through MultiTexCoord command
ATTR0-ATTR15              Generic Attribute 0-15 through VertexAttrib command
PSIZE, ATTR6              Generic Attribute 6

The valid binding semantics for varying o
302. red for setting a parameter of a particular type.

For convenience, there is also a function to set a parameter from a 4x4 matrix of type D3DMATRIX:

    HRESULT cgD3D9SetUniformMatrix(CGparameter parameter, const D3DMATRIX* matrix);

The upper-left portion of the matrix is extracted to fit the size of the input parameter, so that you could set matrixParam this way as well:

    D3DXMATRIX matrix(
        1, 1, 1, 0,
        1, 1, 1, 0,
        0, 0, 0, 0,
        0, 0, 0, 0);
    cgD3D9SetUniformMatrix(matrixParam, &matrix);

In the example above, every element of matrixParam is set to 1.

Setting Uniform Arrays of Scalar, Vector, and Matrix Parameters

To set an array parameter, use cgD3D9SetUniformArray():

    HRESULT cgD3D9SetUniformArray(CGparameter parameter, DWORD startIndex, DWORD numberOfElements, const void* array);

The parameters startIndex and numberOfElements specify which elements of the array parameter are set. Those are the numberOfElements elements of indices ranging from startIndex to startIndex + numberOfElements - 1. It is assumed that array contains enough values to set all those elements. As with cgD3D9SetUniform(), cgD3D9TypeToSize() can be used to determine how many values are required, and the type is void*, so a compatible user-defined structure can be passed in without type casting.

There is a convenience function equivalent to cgD3D9SetUniformMatrix():

    H
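The "upper-left portion" rule described above can be pictured as copying a rows x cols block out of a 4x4 matrix. The C sketch below is only an illustration of that idea, not the runtime's implementation: extract_upper_left is a hypothetical helper, the 4x4 input is assumed to be stored in row order, and the 2x3 size used in the usage note is an assumption about matrixParam.

```c
/*
 * Copy the upper-left rows x cols block of a row-order 4x4 matrix
 * into a tightly packed rows x cols array (also row order).
 * Assumes 1 <= rows <= 4 and 1 <= cols <= 4.
 */
static void extract_upper_left(const float m4x4[16], int rows, int cols,
                               float *out)
{
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            out[r * cols + c] = m4x4[r * 4 + c];
}
```

Applied to a 4x4 matrix whose first two rows are 1, 1, 1, 0 and whose last two rows are zero, a 2x3 extraction yields six ones, which is consistent with the statement that every element of the parameter is set to 1.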
303. returned.

There is a one-to-one correspondence between a set of predefined semantics (POSITION, COLOR, and so on) and hardware resources (registers, texture units, and so on). In the Cg runtime, a hardware resource is represented by the type CGresource, and cgGetParameterResource() retrieves the resource assigned to a parameter:

    CGresource cgGetParameterResource(CGparameter parameter);

If the parameter does not have any associated resource, cgGetParameterResource() returns CG_UNDEFINED.

The two functions cgGetResource() and cgGetResourceString() allow you to determine the correspondence between a resource enumerant and its corresponding string:

    CGresource cgGetResource(const char* resourceString);
    const char* cgGetResourceString(CGresource resource);

If the string passed to cgGetResource() does not correspond to any resource, CG_UNDEFINED is returned.

Using cgGetParameterBaseResource() allows you to retrieve the base resource for a parameter in a Cg program:

    CGresource cgGetParameterBaseResource(CGparameter parameter);

The base resource is the first resource in a set of sequential resources. For example, if a given parameter has a resource equal to CG_TEXCOORD7, its base resource is CG_TEXCOORD0. Only parameters with resources whose name ends with a number have a base resource. All other parameters return CG_UNDEFINED when cgGetParameterBaseResource() is call
304. rface, each member of which has a different implementation. This ability makes it easy for applications to construct material trees on the fly, to change the number or type of texture maps applied to an object at application runtime, and so on.

Specifying which particular implementation of an interface to use is accomplished through "connecting" parameters. In particular, a shared instance of a struct that implements the interface is created by the application. This shared instance is then connected to the interface parameter. The act of connecting the parameters causes the interface parameter to inherit the shared parameter's implementation of the interface. This process can be thought of as implementing compile-time polymorphism.

It is legal to connect a shared parameter of a user-defined structure type to an interface parameter, as long as the structure type implements that interface type. At runtime, the entry point cgIsParentType(), coupled with cgGetParameterNamedType(), can be used to determine type parenthood.

When a structure parameter is connected to an interface parameter, copies of any child (that is, member) variables associated with the source structure parameter are automatically created as children of the sink parameter. Under most circumstances, these member variable copies can be ignored by the application, since their values and variability are a
305. riangles drawn.

The second step consists of enabling the varying parameter for a specific drawing call:

    void cgGLEnableClientState(CGparameter parameter);

The equivalent disabling function is

    void cgGLDisableClientState(CGparameter parameter);

Another way to set the vertex varying parameter is to use the cgGLSetParameter functions. When a cgGLSetParameter function is called for a varying parameter, the appropriate immediate-mode OpenGL entry point is called. The cgGLGetParameter functions do not apply to varying parameters.

Setting Sampler Parameters

Setting a sampler parameter requires two steps. First, an OpenGL texture object handle must be assigned to the sampler parameter. Next, the texture unit associated with the sampler must be enabled prior to drawing. The first step must be done explicitly by the application. The second step may also be performed explicitly by the application, or the OpenGL Cg runtime can be instructed to automatically manage texture units itself.

The first step consists of assigning an OpenGL texture object to the sampler parameter using

    void cgGLSetTextureParameter(CGparameter parameter, GLuint textureName);

where textureName is the OpenGL texture name. Note that when your application makes OpenGL calls to initialize the texture environment for a given sampler, it is important to remember to set the active texture unit to that associated with the sampler before doing so. The sampler's texture unit
306. rithmetic instruction. From here on, these operations are referred to as input modifiers and output modifiers.

The ps_1_x profiles also restrict when a texture addressing operation or arithmetic operation can occur in the program. A texture addressing operation may not have any dependency on the output of an arithmetic operation unless

- The arithmetic operation is a valid input modifier for the texture addressing operation.
- The arithmetic operation is part of a complex texture addressing operation, which are summarized in the section on Auxiliary Texture Functions.

Input and output modifiers may be used to perform simple arithmetic operations without generating an arithmetic instruction. Instead, the arithmetic operation modifies the assembly instruction or source registers to which it is applied. For example, the following Cg expression

    z = (x - 0.5 + y) / 2

could generate the following pixel shader instruction, assuming x is in t0, y is in t1, and z is in r0:

    add_d2 r0, t0_bias, t1

Table 48 summarizes how the different DirectX pixel shader 1_X instruction set modifiers are expressed in Cg programs. For more details on the context in which each modifier is allowed and the ways in which modifiers may be combined, refer to the DirectX pixel shader 1_X documentation.

Table 48. ps_1_x Instruction Set Modifiers

Instruction/Register Modifier   Cg Expression
instr_x2                        2*x
instr_x4                        4*x
instr_d2                        x/2
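To see why the single instruction add_d2 r0, t0_bias, t1 can stand in for the whole expression, the modifiers can be modeled as small functions. This C sketch is a simplified model only: the function names are hypothetical, and it ignores the range clamping and limited precision of real ps_1_x registers.

```c
/* Source modifier _bias: subtract 0.5 from the register value. */
static float modifier_bias(float value)
{
    return value - 0.5f;
}

/* Instruction modifier _d2: divide the instruction result by two. */
static float modifier_d2(float result)
{
    return result / 2.0f;
}

/* Model of "add_d2 r0, t0_bias, t1": r0 = ((t0 - 0.5) + t1) / 2. */
static float add_d2_bias(float t0, float t1)
{
    return modifier_d2(modifier_bias(t0) + t1);
}

/* The Cg expression from the text, written out directly. */
static float cg_expression(float x, float y)
{
    return (x - 0.5f + y) / 2.0f;
}
```

Both functions evaluate the same arithmetic in the same order, so they return identical values for the same inputs.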
307. rix

    // Called at application startup
    void OnStartup()
    {
        // Create context
        context = cgCreateContext();
    }

    // Called whenever the Direct3D device needs to be created
    void OnCreateDevice()
    {
        // Pass the Direct3D device to the expanded interface
        cgD3D9SetDevice(device);

        // Determine the best profiles to use
        CGprofile vertexProfile = cgD3D9GetLatestVertexProfile();
        CGprofile pixelProfile = cgD3D9GetLatestPixelProfile();

        // Grab the optimal options for each profile
        const char* vertexOptions[] = {
            cgD3D9GetOptimalOptions(vertexProfile),
            0,
        };
        const char* pixelOptions[] = {
            cgD3D9GetOptimalOptions(pixelProfile),
            0,
        };

        // Create the vertex shader
        vertexProgram = cgCreateProgramFromFile(
            context, CG_SOURCE, "VertexProgram.cg",
            vertexProfile, "VertexProgram", vertexOptions);

        // If your program uses explicit binding semantics, you
        // can create a vertex declaration using those semantics
        const D3DVERTEXELEMENT9 declaration[] = {
            { 0, 0 * sizeof(float),
              D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
              D3DDECLUSAGE_POSITION, 0 },
            { 0, 3 * sizeof(float),
              D3DDECLTYPE_D3DCOLOR, D3DDECLMETHOD_DEFAULT,
              D3DDECLUSAGE_COLOR, 0 },
            { 1, 4 * sizeof(float),
              D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT,
              D3DDECLUSAGE_TEXCOORD, 0 },
            D3DDECL_END()
308. rm float scale, uniform float bias)
offsettexRECTScaleBias(uniform samplerRECT tex, float2 st, float4 prevlookup, uniform float4 m, uniform float scale, uniform float bias)
    Performs the following:
        float2 newst = st + m.xy * prevlookup.xx + m.zw * prevlookup.yy;
        float4 result = tex2D/RECT(tex, newst);
        return result * saturate(prevlookup.z * scale + bias);
    where
        st are texture coordinates associated with sampler tex,
        prevlookup is the result of a previous texture operation,
        m is the offset texture matrix,
        scale is the offset texture scale, and
        bias is the offset texture bias.
    This function can be used to generate the offset_2d_scale or offset_rectangle_scale NV_texture_shader instructions.

Table 38. fp20 Auxiliary Texture Functions (continued)

Texture Function
    Description

tex1D_dp3(sampler1D tex, float3 str, float4 prevlookup)
    Performs the following:
        return tex1D(tex, dot(str, prevlookup.xyz));
    where
        str are texture coordinates associated with sampler tex, and
        prevlookup is the result of a previous texture operation.
    This function can be used to generate the dot_product_1d NV_texture_shader instruction.

tex2D_dp3x2(uniform sampler2D tex, float3 str, float4 intermediate_coord, float4 prevlookup)
texRECT_dp3x2(uniform samplerRECT tex, float3 str, float4 intermediate_coord, float4 prevlookup)
    Performs the following:
        float2 ne
309. rm points from model space to clip space. The second matrix, ModelViewIT, is the inverse transpose of the modelview matrix. The third parameter, LightVec, is a vector that specifies the location of the light source.

Basic Transformations

Now we start the body of the vertex program:

    vertout OUT;
    OUT.HPosition = mul(ModelViewProj, IN.Position);

A vertex program is responsible for calculating the homogeneous clip-space position of the vertex, given the vertex's model-space coordinates. Therefore, the vertex's model-space position, given by IN.Position, needs to be transformed by the concatenation of the modelview and projection matrices (called ModelViewProj in this example). The transformed position is assigned directly to OUT.HPosition. Note that you are not responsible for the perspective division when using vertex programs. The hardware automatically performs the division after executing the vertex program.

Since we want to do our lighting in eye space, we have to transform the model-space normal IN.Normal to eye space:

    // transform normal from model space to view space
    float3 normalVec = normalize(mul(ModelViewIT, IN.Normal).xyz);

Remember that when transforming normals, we need to multiply by the inverse transpose of the modelview matrix. Then we normalize the eye-space normal vector and store it as normalVec.

Prepare for Lighting

The subsequent steps pre
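The reason normals need the inverse transpose, rather than the modelview matrix itself, shows up as soon as the transform contains a non-uniform scale. The following is a minimal CPU-side sketch in C, not code from the manual: the helper names inverse_transpose3, mul3, and dot3 are hypothetical, and it works on a plain 3x3 matrix instead of the 4x4 ModelViewIT used above.

```c
#include <math.h>

/* Hypothetical helper: inverse transpose of a 3x3 matrix, computed as
   cofactor(m) / det(m). Returns 0 if the matrix is singular, 1 otherwise. */
int inverse_transpose3(float m[3][3], float out[3][3])
{
    float cof[3][3];
    for (int i = 0; i < 3; i++) {
        for (int j = 0; j < 3; j++) {
            int i1 = (i + 1) % 3, i2 = (i + 2) % 3;
            int j1 = (j + 1) % 3, j2 = (j + 2) % 3;
            /* Cyclic indexing yields the signed cofactor directly. */
            cof[i][j] = m[i1][j1] * m[i2][j2] - m[i1][j2] * m[i2][j1];
        }
    }
    float det = m[0][0] * cof[0][0] + m[0][1] * cof[0][1] + m[0][2] * cof[0][2];
    if (fabsf(det) < 1e-12f)
        return 0;
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            out[i][j] = cof[i][j] / det;
    return 1;
}

/* Matrix times column vector. */
void mul3(float m[3][3], const float v[3], float out[3])
{
    for (int i = 0; i < 3; i++)
        out[i] = m[i][0] * v[0] + m[i][1] * v[1] + m[i][2] * v[2];
}

/* Dot product of two 3-component vectors. */
float dot3(const float a[3], const float b[3])
{
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}
```

With a scale of 2 along x, a surface tangent (1, 1, 0) and its normal (1, -1, 0) stop being perpendicular if both are multiplied by the same matrix. Multiplying the normal by the inverse transpose keeps the dot product at zero.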
310. rmance or precision reasons, it is generally wiser to use the standard library functions when possible. The standard library functions will continue to be optimized for future GPUs, meaning that a shader written today will automatically be optimized for the latest architectures at compile time. Additionally, the standard library provides a convenient unified interface for both vertex and fragment programs.

This section describes the contents of the Cg Standard Library, including:

- Mathematical functions
- Geometric functions
- Texture map functions
- Derivative functions
- Predefined helper struct types

Where appropriate, functions are overloaded to support scalar and vector variations when the input and output types are the same.

Mathematical Functions

Table 1, "Mathematical Functions", lists the mathematical functions that the Cg Standard Library provides. The list includes functions useful for trigonometry, exponentiation, rounding, and vector and matrix manipulations, among others. All functions work on scalars and vectors of all sizes, except where noted.

Table 1. Mathematical Functions

Function    Description
abs(x)      Absolute value of x.
acos(x)     Arccosine of x in range [0, π], x in [-1, 1].
all(x)      Returns true if every component of x is not equal to 0. Returns false otherwise.
any(x)      Returns true if any comp
311. rofile: CG_PROFILE_VP30
  Compiler option: -profile vp30
- OpenGL NV30 fragment programs
  Runtime profile: CG_PROFILE_FP30
  Compiler option: -profile fp30
- OpenGL NV2X vertex programs
  Runtime profile: CG_PROFILE_VP20
  Compiler option: -profile vp20
- OpenGL NV2X fragment programs
  Runtime profile: CG_PROFILE_FP20
  Compiler option: -profile fp20
- DirectX 9 vertex shaders
  Runtime profiles: CG_PROFILE_VS_2_X, CG_PROFILE_VS_2_0
  Compiler options: -profile vs_2_x, -profile vs_2_0
- DirectX 9 pixel shaders
  Runtime profiles: CG_PROFILE_PS_2_X, CG_PROFILE_PS_2_0
  Compiler options: -profile ps_2_x, -profile ps_2_0
- DirectX 8 vertex shaders
  Runtime profile: CG_PROFILE_VS_1_1
  Compiler option: -profile vs_1_1
- DirectX 8 pixel shaders
  Runtime profiles: CG_PROFILE_PS_1_3, CG_PROFILE_PS_1_2, CG_PROFILE_PS_1_1
  Compiler options: -profile ps_1_3, -profile ps_1_2, -profile ps_1_1

The DirectX 9 profiles (vs_2_x and ps_2_x), OpenGL ARB profiles (arbfp1 and arbvp1), NV30 OpenGL profiles (fp30 and vp30), and NV40 OpenGL profiles (fp40 and vp40) generally support longer, more complex programs and offer more features and functionality to the developer. These are referred to as advanced profiles.

The DirectX 8 profiles (vs_1_1 and ps_1_3) and NV2X OpenGL profiles (fp20 and vp20) have more restrictions on program length and available
312. rtex shaders are summarized in Table 47.

Table 47. vs_1_1 Varying Output Binding Semantics

Binding Semantics Name      Corresponding Data
POSITION                    Output position: oPos
PSIZE                       Output point size: oPts
FOG                         Output fog value: oFog
COLOR0-COLOR1               Output color values: oD0, oD1
TEXCOORD0-TEXCOORD7         Output texture coordinates: oT0-oT7

When using the vs_1_1 profile under DirectX 9, it is necessary to tell the compiler to produce dcl statements to declare varying inputs. The option -profileopts dcls causes dcl statements to be added to the compiler output.

DirectX Pixel Shader 1.x Profiles (ps_1_*)

Overview

The DirectX pixel shader 1.x profiles are used to compile Cg source code to DirectX PS 1.1, PS 1.2, or PS 1.3 pixel shader assembly.

- Profile names:
  ps_1_1 (for DirectX PS 1.1 pixel shaders)
  ps_1_2 (for DirectX PS 1.2 pixel shaders)
  ps_1_3 (for DirectX PS 1.3 pixel shaders)
- How to invoke:
  Use the compiler options -profile ps_1_1, -profile ps_1_2, or -profile ps_1_3.

The deprecated profile dx8ps is also available and is synonymous with ps_1_1.

This document describes the capabilities and restrictions of Cg when using the DirectX pixel shader 1.x profiles.

DirectX PS 1.4 is not currently supported by any Cg profile; all statements about ps_1_x in the remainder of this document refer only to ps_1_1, ps_1_2, and ps_1_3.

The underlying instru
313. rts a number of options that allow these limits to be specified on the compiler command line; see "Options" on page 262 for details. These limits may also be set to values appropriate for the host computer's GPU using the cgGLSetOptimalOptions() Cg runtime call.

Language Constructs and Support

Data Types

This profile implements data types as follows:

- float data type is implemented as IEEE 32-bit single precision.
- half, fixed, and double data types are treated as float.
- int data type is supported using floating point operations.
- sampler* types are supported to specify sampler objects used for texture fetches.

Statements and Operators

With the ARB fragment program profiles, while, do, and for statements are allowed only if the loops they define can be unrolled, because there is no dynamic branching in ARB fragment program 1.

Comparison operators (>, <, >=, <=, ==, !=) and Boolean operators (||, &&, ?:) are allowed. However, the logic operators (&, |, ^, ~) are not.

Using Arrays and Structures

Variable indexing of arrays is not allowed. Array and structure data is not packed.

Bindings

Binding Semantics for Uniform Data

The valid binding semantics for uniform parameters in the arbfp1 profile are found in Table 19.

Table 19. arbfp1 Uniform Input Binding Semantics

Binding Semantics Name    Corres
314. s

Here, myfunc() is declared to be a function of a single parameter, vals, which is a one-dimensional array of floats. However, the length of the vals array is not specified.

The effect of this declaration is that any subsequent call to myfunc() that passes a one-dimensional array of floats of any size resolves to the declared function. For example:

    float myfunc(float vals[]) {
      ...
    }

    float4 main(...) {
      float vals1[2];
      float vals2[76];
      float myval1 = myfunc(vals1); // match
      float myval2 = myfunc(vals2); // match
      ...
    }

The actual length of an array parameter (sized or unsized) may be queried via the .length pseudo-member:

    float myfunc(float vals[]) {
      float sum = 0;
      for (int i = 0; i < vals.length; i++) {
        sum += vals[i];
      }
      return sum;
    }

The size of a particular dimension of a multidimensional array may be queried by dereferencing the appropriate number of dimensions of the array. For example, vals2d[0].length gives the length of the second dimension of the two-dimensional vals2d array:

    float myfunc(float vals2d[][]) {
      float sum = 0;
      for (int i = 0; i < vals2d.length; i++) {
        for (int j = 0; j < vals2d[0].length; j++) {
          sum += vals2d[i][j];
        }
      }
      return sum;
    }

If the length of any dimension of an array parameter is specified, that parameter only matches calls with variables whose corresponding dimension is of th
315. s advantage of this situation to compute lighting per vertex, rather than per pixel.

In a similar manner, it may be advantageous to move any vertex shader computation that is solely dependent on the values of uniform parameters to the CPU and then to pass the result of the computation into the vertex shader with different uniform parameters. For example, if the vertex shader is passed a float3 vector giving the direction of a distant light source, the vector should be normalized on the CPU and passed to the vertex shader. This avoids the need to repeatedly and unnecessarily recompute normalize(lightvector) in the vertex shader.

8. Avoid Matrix Transposes Just for Multiplication

Computing the transpose of a matrix can often be avoided. If you would like to multiply a transposed float3x3 matrix m by a float3 v,

    mul(v, m)

is equivalent to and more efficient than

    mul(transpose(m), v)

9. Minimize Conditional Code in Fragment Programs

GPUs don't currently support branching in fragment programs; a program with a large amount of code that is conditionally executed (for example, in an if-else expression) tends to run at the same speed as if all of it were executed. Therefore, if you have a large amount of conditional code and it is possible to evaluate the condition on the CPU, it may be advantageous to have multiple versions of the shader source code and to bind the one with
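The equivalence in tip 8 follows from the two forms of mul: with the vector first, it is treated as a row vector multiplied by the matrix; with the matrix first, the vector is treated as a column. A row vector times m gives the same components as the transpose of m times the column vector. The sketch below checks this on the CPU in plain C; the function names mul_vec_mat, mul_mat_vec, and transpose3 are hypothetical stand-ins for illustration, not Cg library calls.

```c
/* Row vector times matrix: the C analogue of mul(v, m). */
void mul_vec_mat(const float v[3], float m[3][3], float out[3])
{
    for (int j = 0; j < 3; j++)
        out[j] = v[0] * m[0][j] + v[1] * m[1][j] + v[2] * m[2][j];
}

/* Matrix times column vector: the C analogue of mul(m, v). */
void mul_mat_vec(float m[3][3], const float v[3], float out[3])
{
    for (int i = 0; i < 3; i++)
        out[i] = m[i][0] * v[0] + m[i][1] * v[1] + m[i][2] * v[2];
}

/* Explicit transpose, the extra work that tip 8 avoids. */
void transpose3(float m[3][3], float out[3][3])
{
    for (int i = 0; i < 3; i++)
        for (int j = 0; j < 3; j++)
            out[j][i] = m[i][j];
}
```

Both paths produce the same three components, but the first one never builds the transposed matrix.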
316. s are not required to support any operations on arbitrarily sized arrays; only support for vectors and matrices is required.

Unsized Arrays

An unsized array may be declared by declaring an array with no length specified between the brackets: float a[]. The actual length of the array may then be set by the runtime before program execution. In program code, the length of any array can be queried using the syntax a.length, where length acts like an undeclared structure parameter that holds the actual length of the array at runtime.

Function Overloading

Multiple functions may be defined with the same name, as long as the definitions can be distinguished by unqualified parameter types and do not have an open-profile conflict (see "Overloading of Functions by Profile" on page 226).

Function-matching rules:

1. Add all visible functions with a matching name in the calling scope to the set of function candidates.
2. Eliminate functions whose profile conflicts with the current compilation profile.
3. Eliminate functions with the wrong number of formal parameters. If a candidate function has excess formal parameters, and each of the excess parameters has a default value, do not eliminate the function.
4. If the set is empty, fail.
5. For each actual parameter expression in sequence, perform the following:
   a. If the type of the actual parameter matches the unqualified type of the
317. s by sampling at
    // different frequencies
    float3 fleckN = (float3)tex2D(FleckMap, vert.uv*37)*2-1;
    Pecki E S eleat9 ex2bIlleciMapPVeE5uVvePS  2   1   2 5  ie exei 2 p
    float fleck_n_d_h = saturate(dot(fleckN, H));
    float3 fleck_color = FleckColor * pow(fleck_n_d_h,
        lerp(NewPaintSpec.y, NewPaintSpec.w, v_dist));

    // Control the ambient fleckiness and also
    // attenuate with distance
    fleck_color = fleck_color*Ambient*vert.halfangle.w;

    // DIFFUSE
    close Ie cl   Series  ial il  2    float3 paintResult   lerp Ambient paint_color   parme dolos  le er

    // FRESNEL
    float Fresnel = saturate(dot(ClearCoat, reflect_color));
    Fresnel = pow(Fresnel, NewPaintSpec.z);

    // This helps make the clear coat less omnipresent
    // (only the really (perceptually) bright areas reflect
    // the most)
    Fresnel = saturate(vert.fresn*Fresnel);

    // Show more of the specular reflection environment
    // when in fresnel zones:
    // (diffuse * (1-fresnel)) + (environment * fresnel)
    paintResult = lerp(paintResult, reflect_color, Fresnel);

    // SPECULAR
    // (diffuse + specular + flecks)
    paintResult = paintResult + n_d_h + fleck_color;

    // OUTPUT
    return paintResult.xyzz;
    }

Basic Profile Sample Shaders

This chapter provides a set of basic profile sample shaders written in Cg. Each shader comes with an accompanying snapshot, description, and
318. s that create and manipulate data:

- Basic types
- Structures
- Arrays
- Type conversions

Basic Data Types

Cg supports seven basic data types:

- float
  A 32-bit IEEE floating point (s23e8) number that has one sign bit, a 23-bit mantissa, and an 8-bit exponent. This type is supported in all profiles, although the DirectX 8 pixel profiles implement it with reduced precision and range for some operations.
- half
  A 16-bit IEEE-like floating point (s10e5) number.
- int
  A 32-bit integer. Profiles may omit support for this type or have the option to treat int as float.
- fixed
  A 12-bit fixed point (s1.10) number. It is supported in all fragment profiles.
- bool
  Boolean data is produced by comparisons and is used in if and conditional operator (?:) constructs. This type is supported in all profiles.
- sampler*
  The handle to a texture object comes in six variants: sampler, sampler1D, sampler2D, sampler3D, samplerCUBE, and samplerRECT. With one exception, these types are supported in all pixel profiles, fragment profiles, and the NV40 vertex program profile. The samplerRECT type is not supported in the DirectX profiles.
- string
  Although it is not possible to use strings in Cg program code for any currently existing profile, they can be set and have their values queried through the Cg runtime API; thus, they can be useful for storing information a
319. s when passing function parameters.

- Top-level function parameters may be defined using that type.

If a type is partially supported, variables may be defined using that type but no useful operations can be performed on them. Partial support for types makes it easier to share data structures in code that is targeted at different profiles.

Type Categories

- The integral type category includes types cint and int.
- The floating type category includes types cfloat, float, half, and fixed. (Note that floating really means floating or fixed/fractional.)
- The numeric type category includes integral and floating types.
- The compile-time type category includes types cfloat and cint. These types are used by the compiler for constant type conversions.
- The concrete type category includes all types that are not included in the compile-time type category.
- The scalar type category includes all types in the numeric category, the bool type, and all types in the compile-time category.

In this specification, a reference to a <category> type (such as a reference to a numeric type) means one of the types included in the category (such as float, half, or fixed).

Constants

A constant may be explicitly typed or implicitly typed. Explicit typing of a constant is performed, as in C, by suffixing the constant with a single character indicating the type of the constant:

- f for float
- d for double
- h for half
- x for fixed

A
320. sche    244
Minimum Requirements for if, while, and for Statements    244
New Vector Operators    244
Arithmetic Precision and Range    246
Operator Precedence    247
Operator Enhancements    247
Operator Overloading    248
Reserved Words    249
Cg Standard Library Functions    250
Vertex Program Profiles    250
Mandatory Computation of Position Output    250
Position Invariance    250
Binding Semantics for Outputs    251
Fragment Program Profiles    252
Binding Semantics for Outputs    252
Appendix B: Language Profiles    255
OpenGL ARB Vertex Program Profile (arbvp1)    256
Overview    256
Accessing OpenGL State    256
Position Invariance    258
Data Types    258
Compatibility with the vp20 Vertex Program Profile    259
Loading Constants    260
Bindings
321. sed for data that is specified with each element of the stream of input data. For example, the varying inputs to a vertex program are the per-vertex values that are specified in vertex arrays. For a fragment program, the varying inputs are the interpolants, such as texture coordinates.

- Uniform inputs are used for values that are specified separately from the main stream of input data, and don't change with each stream element. For example, a vertex program typically requires a transformation matrix as a uniform input. Often, uniform inputs are thought of as graphics state.

Varying Inputs to a Vertex Program

A vertex program typically consumes several different per-vertex (varying) inputs. For example, the program might require that the application specify the following varying inputs for each vertex (typically in a vertex array):

- Model space position
- Model space normal vector
- Texture coordinate

In a fixed-function graphics pipeline, the set of possible per-vertex inputs is small and predefined. This predefined set of inputs is exposed to the application through the graphics API. For example, OpenGL 1.4 provides the ability to specify a vertex array of normal vectors.

In a programmable graphics pipeline, there is no longer a small set of predefined inputs. It is perfectly reasonable for the developer to write a vertex program that uses a per-vertex refractive inde
322. shader is developed. The ultimate test for a shader is "Does it look right?" To that end, the ability to quickly prototype and modify a shader is crucial to the rapid development of high-quality effects.

- The compiler optimizes code automatically and performs low-level tasks, such as register allocation, that are tedious and prone to error.
- Shading code written in a high-level language is much easier to read and understand. It also allows new shaders to be easily created by modifying previously written shaders. What better way to learn than from a shader written by the best artists and programmers?
- Shaders written in a high-level language are portable to a wider range of hardware platforms than shaders written in assembly code.

This chapter introduces Cg ("C for Graphics"), a high-level language tailored for programming GPUs. Cg offers all the advantages just described, allowing programmers to finally combine the inherent power of the GPU with a language that makes GPU programming easy.

The Cg Language

Cg is based on C, but with enhancements and modifications that make it easy to write programs that compile to highly optimized GPU code. Cg code looks almost exactly like C code, with the same syntax for declarations, function calls, and most data types.

Before describing the Cg language in detail, it is important to explain the reason for some of the differences that exis
323. sionality of an array is queried using

    int cgGetArrayDimension(CGparameter param);

Dimensions are enumerated starting at 0 (zero). The length of a particular dimension of an array can be retrieved by calling

    int cgGetArraySize(CGparameter param, int dimension);

The total number of elements in an array may be queried using

    int cgGetArrayTotalSize(CGparameter param);

Here, param may be an array of any dimension; the returned value is the total number of elements across all dimensions of the array.

The type of each element of an array can be queried using

    CGtype cgGetArrayType(CGparameter param);

For example, if a parameter were declared

    float4 array[2][5];

cgGetArrayType() would return CG_FLOAT4. If it were declared

    mystruct array[3];

cgGetArrayType() would return the enumerant corresponding to the user-defined mystruct type.

Unsized Array Length

Unsized arrays can be assigned concrete sizes via the runtime. Under many profiles, setting the size of unsized arrays associated with a Cg program is required before the program can be compiled.

The length of one-dimensional unsized arrays can be set using

    void cgSetArraySize(CGparameter param, int size);

The size of multidimensional arrays may be set using

    void cgSetMultiDimArraySize(CGparameter param, int* sizes);

Note that arrays with completely determined lengths may not have their size changed using
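The relationship between the per-dimension lengths and the total element count described above can be sketched in plain C. This is an illustration only: total_elements is a hypothetical helper, not a Cg runtime entry point, and it assumes the total is simply the product of the lengths of all dimensions.

```c
/* Hypothetical helper (not part of the Cg runtime): multiplies the
   per-dimension lengths to get the total number of elements.
   sizes[d] is the length of dimension d, with d starting at 0. */
int total_elements(const int *sizes, int dimensions)
{
    int total = 1;
    for (int d = 0; d < dimensions; d++)
        total *= sizes[d];
    return total;
}
```

For a two-dimensional array with dimension lengths 2 and 5, the helper yields 10 elements, each of the element type that cgGetArrayType() reports.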
324. specular highlights.

Fig. 12. Example of Car Paint 9

Vertex Shader Source Code for Car Paint 9

    // This shader is based on the Time Machine temporal rust
    // shader. Car paint data was measured by Cornell
    // University from samples provided by Ford Motor Company.

    struct a2v {
        float4 OPosition : POSITION;
        float3 ONormal   : NORMAL;
        float2 uv        : TEXCOORD0;
        float3 Tangent   : TEXCOORD1;
        float3 Binormal  : TEXCOORD2;
        float3 Normal    : TEXCOORD3;
    };

    struct VS_OUTPUT {
        float4 HPosition  : POSITION;  // coord position in window
        float2 uv         : TEXCOORD0; // wavy/fleckmap coords
        float3 light      : TEXCOORD1; // light pos (tangent space)
        float4 halfangle  : TEXCOORD2; // Blinn halfangle
        float3 reflection : TEXCOORD3; // Refl vector (per-vertex)
        float4 view       : TEXCOORD4; // view (tangent space)
        float3 tangent    : TEXCOORD5; // view-tangent matrix
        float3 binormal   : TEXCOORD6;
        float3 normal     : TEXCOORD7;
        float  fresn      : COLOR0;
    };

    VS_OUTPUT main(a2v vert,
                   // TRANSFORMATIONS
                   uniform float4x4 ModelView,
                   uniform float4x4 ModelViewIT,
                   uniform float4x4 ModelViewProj,
                   uniform float3 LightVector,  // Obj space
                   uniform float3 EyePosition)  // Obj space
    {
        VS_OUTPUT O;

        // Generate homogeneous POSITION
        O.HPosition = mul(ModelViewProj, vert.OPosition);

        // Generate BASIS matrix
        float3x3 ModelTangent = { normalize(vert.Tangent),
                                  n
325. spose(objToTangentSpace)
        // (since the inverse of a rotation is its transpose)
        //
        // So a row of TangentToCubeSpace is the transform by
        // objToTangentSpace of the corresponding row of
        // ObjToCubeSpace
        OUT.TangentToCubeSpace0.xyz = mul(objToTangentSpace, ObjToCubeSpace[0].xyz);
        OUT.TangentToCubeSpace1.xyz = mul(objToTangentSpace, ObjToCubeSpace[1].xyz);
        OUT.TangentToCubeSpace2.xyz = mul(objToTangentSpace, ObjToCubeSpace[2].xyz);

        // compute the eye vector
        // (going from eye to shaded point) in cube space
        float3 eyeVector = mul(ObjToCubeSpace, IN.Position) - EyePosition;
        OUT.TangentToCubeSpace0.w = eyeVector.x;
        OUT.TangentToCubeSpace1.w = eyeVector.y;
        OUT.TangentToCubeSpace2.w = eyeVector.z;

        // transform position to projection space
        OUT.Position = mul(WorldViewProj, IN.Position);

        return OUT;
    }

Pixel Shader Source Code for Bump and Reflection Mapping

    struct v2f {
        float4 Position : POSITION;  // in projection space
        float4 TexCoord : TEXCOORD0;

        // first row of the 3x3 transform
        // from tangent to cube space
        float4 TangentToCubeSpace0 : TEXCOORD1;

        // second row of the 3x3 transform
        // from tangent to cube space
        float4 TangentToCubeSpace1 : TEXCOORD2;

        // third row of the 3x3 transform
        // from tangent to cube space
        float4 TangentToCubeSpace2 : TEXCOORD3;
    };

    float4 main(v2f IN,
                uniform sampler2D NormalMap,
                unif
326. stroyed.

Parameter References

A parameter that is referenced by the original Cg source code may be optimized out of the compiled program by the compiler, in which case the application can simply ignore it and not set its value. Calling cgIsParameterReferenced() allows you to check whether a parameter is potentially used by the final compiled program:

    CGbool cgIsParameterReferenced(CGparameter parameter);

Note that the value returned by this entry point is conservative, but not always exact, particularly if the program has not yet been compiled. Also note that no error is generated if you set the value of a parameter that is not referenced.

Parameter Size

A number of core Cg runtime entry points are provided for querying and setting parameter size and length.

The number of rows or columns associated with a parameter can be retrieved using

    int cgGetParameterRows(CGparameter param);
    int cgGetParameterColumns(CGparameter param);

A scalar parameter is considered to have a single row and a single column, while a vector parameter has a single row and columns equal to the length of the vector. If param is a matrix parameter, the values returned correspond to those of the matrix. If param is an array, the number of rows or columns associated with each element of the array is returned. If param is not a numeric type, 0 is returned by either entry point.

The dimen
327. sual Studio workspace, both provided on the accompanying CD, that you can use to start experimenting with Cg.

- "Advanced Profile Sample Shaders" on page 153
  A list of sample NV30 shaders, complete with source code.
- "Basic Profile Sample Shaders" on page 189
  A list of sample NV2X shaders, complete with source code.
- Appendix A, "Cg Language Specification" on page 221
  The formal Cg language specification.
- Appendix B, "Language Profiles" on page 255
  Describes features and restrictions of the currently supported language profiles: DirectX 8 vertex, DirectX 8 pixel, OpenGL ARB vertex, NV2X OpenGL vertex, NV30 OpenGL vertex, NV30 OpenGL fragment, OpenGL ARB fragment, NV40 OpenGL vertex, and NV40 OpenGL fragment.
- Appendix C, "Nine Steps to High Performance Cg" on page 321
  Strategies for getting the most out of your Cg code.
- Appendix D, "Cg Compiler Options" on page 329
  A list of the various command-line options that the Cg compiler accepts.
- Cg Developer's CD
  The CD provided with this book contains the entire Cg release, which allows you to get started immediately. The readme.txt file on the CD describes the contents of the release in detail.

You can begin working with Cg immediately by reading the "Introduction to the Cg Language" on page 1 and then going through "A Brief Tutorial" on page 145. Once you have a basic understanding of the Cg language
328. sulting declaration is compatible with the
        // shader. This is really just a sanity check.
        assert(cgD3D8ValidateVertexDeclaration(vertexProgram,
                                               declaration));

        // Load the program with the expanded interface.
        // Parameter shadowing is enabled (second parameter = TRUE).
        cgD3D8LoadProgram(vertexProgram, TRUE, 0, 0, declaration);

        // Create the pixel shader.
        fragmentProgram = cgCreateProgramFromFile(
            context, CG_SOURCE, "FragmentProgram.cg",
            pixelProfile, "FragmentProgram", pixelOptions);

        // Load the program with the expanded interface.
        // Parameter shadowing is enabled (second parameter = TRUE).
        // Ignore vertex shader specific flags, like declaration and
        // usage.
        cgD3D8LoadProgram(fragmentProgram, TRUE, 0, 0, 0);

        // Grab some parameters.
        modelViewMatrix = cgGetNamedParameter(vertexProgram,
                                              "ModelViewMatrix");
        baseTexture = cgGetNamedParameter(fragmentProgram,
                                          "BaseTexture");
        someColor = cgGetNamedParameter(fragmentProgram,
                                        "SomeColor");

        // Sanity check that parameters have the expected size
        assert(cgD3D8TypeToSize(cgGetParameterType(
            modelViewMatrix)) == 16);
        assert(cgD3D8TypeToSize(cgGetParameterType(someColor)) == 4);

        // Set parameters that don't change. They can be set
        // only once since parameter shadowing is enabled
        cgD3D8SetTexture(baseTexture, texture);
        cgD3D8SetUniform(som
329. t between Cg and C. Fundamentally, it comes down to the difference in the programming models for GPUs and for CPUs.

Cg's Programming Model for GPUs

CPUs normally have only one programmable processor. In contrast, GPUs have at least two programmable processors, the vertex processor and the fragment processor, plus other non-programmable hardware units. The processors, the non-programmable parts of the graphics hardware, and the application are all linked through data flows. Cg's model of the GPU is illustrated by Fig. 1.

Fig. 1. Cg's Model of the GPU (data flow from the 3D application or game, through the 3D API (OpenGL or Direct3D) and across the CPU-GPU boundary, to the GPU front end, the programmable vertex processor, primitive assembly, rasterization and interpolation, the programmable fragment processor, raster operations, and the frame buffer)

The Cg language allows you to write programs for both the vertex processor and the fragment processor. We refer to these programs as vertex programs and fragment programs, respectively. (Fragment programs are also known as pixel programs or pixel shaders, and we use these terms interchangeably in this document.) Cg code can be c
330. t of functions on top of the core Cg runtime to ease the integration of Cg into an application based on this API. They essentially interface between the core runtime data structures and the API data structures to provide the following facilities:

- Setting the parameter values
  A distinction is made between texture, matrix, array, vector, and scalar values, as those various types are handled differently by each API and have different data structures.
- Executing the program
  Program execution is divided into program loading (passing the result of the Cg compiler to the API) and program binding (setting the program as the one to execute for any subsequent draw calls). This is because those two operations are usually done at different times. A program is loaded each time it is recompiled, and it is bound each time it needs to be executed for a particular draw call.

Parameter Shadowing

When the value of a uniform parameter is set by some function of the OpenGL Cg runtime, it is actually stored internally (or shadowed) by either the Cg or the OpenGL runtime so that it does not need to be reset every time the program is about to be executed. This behavior is referred to as parameter shadowing.

If the Direct3D Cg runtime expanded interface (described in "Direct3D Expanded Interface" on page 98) is used, parameter shadowing can be turned on or off on a per-program basis. When
331. t only (default), as in C

Cg supports function overloading by the number of operands and by operand type. The choice of a function is made by matching one operand at a time, starting at the first operand. The formal language specification provides more details on the matching rules, but it is not normally necessary to study them because the overloading generally works in an intuitive manner. For example, the following code declares two versions of a function, one that takes two float operands and one that takes two bool operands:

    bool same(float a, float b) { return (a == b); }
    bool same(bool a, bool b) { return (a == b); }

Arithmetic Operators from C

Cg includes all the standard C arithmetic operators (+, -, *, /) and allows the operators to be used on vectors as well as on scalars. The vector operations are always performed in elementwise fashion. For example:

    float3(a, b, c) * float3(A, B, C)   equals   float3(a*A, b*B, c*C)

These operators can also be used in a form that mixes scalar and vector: the scalar is "smeared" to create a vector of the necessary size to perform an elementwise operation. Thus:

    a * float3(A, B, C)   is equal to   float3(a*A, a*B, a*C)

The built-in arithmetic operators do not currently support matrix operands. It is important to remember that matrices are not the same as vectors, even if their dimensions are the same.

Multiplication Functions

Cg's mu
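As a rough illustration of the elementwise and "smearing" rules described above, the following C sketch mimics them with a hand-written struct. The names float3, mul_elementwise, smear, and scale are hypothetical helpers invented for this example; in Cg the operators are built in and none of this code is needed.

```c
/* Hypothetical stand-in for Cg's built-in float3 type. */
typedef struct { float x, y, z; } float3;

/* Elementwise product, mirroring float3 * float3 in Cg. */
float3 mul_elementwise(float3 a, float3 b) {
    float3 r = { a.x * b.x, a.y * b.y, a.z * b.z };
    return r;
}

/* "Smear" a scalar into a vector of the required size. */
float3 smear(float s) {
    float3 r = { s, s, s };
    return r;
}

/* Scalar times vector, expressed as smear followed by an
   elementwise product, which is how the text describes it. */
float3 scale(float s, float3 v) {
    return mul_elementwise(smear(s), v);
}
```

Calling scale(a, v) here plays the role of the Cg expression a * float3(A, B, C): every component of v is multiplied by the same scalar.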
332. t4 variable.

- Scalar conversions
  Implicit conversion of any scalar numeric type to any other scalar numeric type is allowed. A warning may be issued if the conversion is implicit and a loss of precision is possible. Implicit conversion of any scalar object type to any compatible scalar object type is allowed. Conversions between incompatible scalar object types or between object and numeric types are not allowed, even with an explicit cast. A sampler is compatible with sampler1D, sampler2D, sampler3D, samplerCUBE, and samplerRECT. No other object types are compatible (sampler1D is not compatible with sampler2D, even though both are compatible with sampler).
  Scalar types may be implicitly converted to vectors and matrices of compatible type. The scalar is replicated to all elements of the vector or matrix. Scalar types may also be explicitly cast to structure types if the scalar type can be legally cast to every member of the structure.

- Vector conversions
  Vectors may be converted to scalar types (the first element of the vector is selected). A warning is issued if this is done implicitly. A vector may also be implicitly converted to another vector of the same size and compatible element type.
  A vector may be converted to a smaller compatible vector or a matrix of the same total size, but a warning is issued if an explicit cast is not used.

- Matrix conversions
  Matrices may be converted to a scalar type (element (0,0) is selected). As with ve
333. tand the DirectX VS 1.1 Vertex Shaders and the code the compiler produces, see the Vertex Shader Reference in the DirectX 8.1 SDK documentation.

- The int data type is supported using floating-point operations, which adds extra instructions for proper truncation for divides, modulos, and casts from floating-point types.

- The fixed and sampler* data types are not supported, but the profile does provide the minimal partial support that is required for these data types by the core language specification; that is, it is legal to declare variables using these types, as long as no operations are performed on the variables.

Statements and Operators

The if, while, do, and for statements are allowed only if the loops they define can be unrolled, because there is no branching in VS 1.1 shaders.

There are no subroutine calls either, so all functions are inlined. Comparison operators (>, <, >=, <=, ==, !=) and Boolean operators (||, &&, ?:) are allowed. However, the logic operators (&, |, ^, ~) are not allowed.

Using Arrays

Variable indexing of arrays is allowed as long as the array is a uniform constant. For compatibility reasons, arrays indexed with variable expressions need not be declared const, just uniform. However, writing to an array that is later indexed with a variable expression yields unpredictable results.

Array data is not packed because verte
334. te | int4 | Keep, Zero, Replace, Incr, Decr, Invert, IncrWrap, DecrWrap | 2.0 or EXT_stencil_two_side
TexGenSMode[ndx] | int | ObjectLinear, EyeLinear, SphereMap, ReflectionMap, NormalMap | 1.0; or 1.3, ARB_texture_cube_map, EXT_texture_cube_map, or NV_texgen_reflection for ReflectionMap or NormalMap. ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_COORDS.
TexGenTMode[ndx] | int | Same as TexGenSMode
TexGenRMode[ndx] | int | ObjectLinear, EyeLinear, ReflectionMap, NormalMap | 1.0; or 1.3, ARB_texture_cube_map, EXT_texture_cube_map, or NV_texgen_reflection for ReflectionMap or NormalMap. ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_COORDS.
TexGenQMode[ndx] | int | ObjectLinear, EyeLinear | 1.0. ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_COORDS.
TexGenSEyePlane[ndx] | float4 | | 1.0. ndx must be greater than or equal to zero and less than the value of GL_MAX_TEXTURE_COORDS.
TexGenTEyePlane[ndx] | float4 | Same as TexGenSEyePlane
TexGenREyePlane[ndx] | float4 | Same as TexGenSEyePlane

Table 6. CgFX OpenGL State Manager States (continued)

State Name | Type | Valid Enumerants | Requires
TexGenQEyePlane[ndx] | float4 | Same as TexGenSEyePlane
TexGenSObjectPlane[ndx] | float4 | Same as TexGenSEyePlane
TexGenTObjectPlane[ndx] | float4 | Same as TexGenSEyePlane
TexGenRObject
335. ter shadowing is enabled
    cgD3D9SetTexture(baseTexture, texture);
    cgD3D9SetUniform(someColor, &constantColor);
}

// Called to render the scene
void OnRender()
{
    // Load model-view matrix.
    // (Named "matrix" here so that it does not shadow the
    // CGparameter modelViewMatrix used below.)
    D3DXMATRIX matrix;
    // ...

    // Set the parameters that change every frame.
    // This must be done before binding the programs.
    cgD3D9SetUniformMatrix(modelViewMatrix, &matrix);

    // Set the vertex declaration
    device->SetVertexDeclaration(vertexDeclaration);

    // Bind the programs. This downloads any parameter values
    // that have been previously set.
    cgD3D9BindProgram(vertexProgram);
    cgD3D9BindProgram(fragmentProgram);

    // Draw scene
    // ...
}

// Called before the device changes or is destroyed
void OnDestroyDevice()
{
    // Calling this function tells the expanded interface to
    // release its internal reference to the Direct3D device
    // and free its Direct3D resources.
    cgD3D9SetDevice(0);
}

// Called before application shuts down
void OnShutdown()
{
    // This frees any core runtime resource.
    cgDestroyContext(context);
}

Expanded Interface Direct3D 8 Application

The following C code links the previous vertex and fragment programs to the Direct3D 8 application:

#include <cg/cg.h>
#include <cg/cgD3D8.h>

IDirect3DDevice8* device;    // Initialized somewhere else
IDirect3DTexture8* texture;  // Ini
336. terface element for manipulating uniform parameters, or to describe the type of render target a rendering pass is expecting.

    float bumpHeight <
        string gui = "slider";
        float uimin = 0.0f;
        float uimax = 1.0f;
        float uistep = 0.1f;
    > = 0.5f;

The annotation appears after the optional semantic and before variable initialization. Applications can query for annotations and use them to expose certain parameters to artists in a CgFX-aware tool, such as Discreet's 3ds max 5 or Alias|Wavefront's Maya 4.5.

More Details

The purpose of this chapter has been to give you a brief overview of Cg so that you can get started quickly and experiment to gain hands-on experience. If you would like some more detail about any of the language features described in this chapter, see "Cg Language Specification" on page 221.

Cg Standard Library Functions

Cg provides a set of built-in functions and predefined structures with binding semantics to simplify GPU programming. These functions are similar in spirit to the C standard library, providing a convenient set of common functions. In many cases, the functions map to a single native GPU instruction, meaning they are executed very quickly. Of those functions that map to multiple native GPU instructions, you may expect the most useful to become more efficient in the near future.

Although customized versions of specific functions can be written for perfo
337. ters and pointer-related capabilities, such as the & and -> operators, are not supported.

- Arrays are supported, but with some limitations on size and dimensionality. Restrictions on the use of computed subscripts are also permitted. Arrays may be designated as packed. The operations allowed on packed arrays may be different from those allowed on unpacked arrays. Predefined packed types are provided for vectors and matrices. It is strongly recommended that these predefined types be used.

- Unsized arrays can be created by declaring an array's dimension as []. The array's actual dimension can be set at runtime before a final compilation step.

- There is a built-in swizzle operator, .xyzw or .rgba, for vectors. This operator allows the components of a vector to be rearranged and also replicated. It also allows the creation of a vector from a scalar.

- For an lvalue, the swizzle operator allows components of a vector or matrix to be selectively written.

- There is a similar built-in swizzle operator for matrices:

    ._m<row><col>[_m<row><col>][...]

  This operator allows access to individual matrix components and allows the creation of a vector from elements of a matrix. For compatibility with DirectX 8 notation, there is a second form of matrix swizzle, which is described later.

- Numeric data types are different. Cg's primary numer
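To make the vector swizzle rules above concrete, here is a small C sketch that models them with explicit component indices. Everything in it (the float4 struct and the functions swizzle4, from_scalar, and write_components) is a hypothetical illustration, not part of Cg or its runtime; in Cg the same effects are written directly as suffixes such as .wzyx or .xxxx.

```c
#include <assert.h>

/* Hypothetical stand-in for a Cg float4; v[0..3] are x, y, z, w. */
typedef struct { float v[4]; } float4;

/* Read swizzle: build a new vector from components of src chosen by
   index. Indices may repeat, so swizzle4(a, 0, 0, 0, 0) plays the
   role of a.xxxx (replication). */
float4 swizzle4(float4 src, int i0, int i1, int i2, int i3) {
    const int idx[4] = { i0, i1, i2, i3 };
    float4 r;
    for (int k = 0; k < 4; ++k) {
        assert(idx[k] >= 0 && idx[k] < 4);
        r.v[k] = src.v[idx[k]];
    }
    return r;
}

/* Creating a vector from a scalar: replicate one value everywhere. */
float4 from_scalar(float s) {
    float4 r = { { s, s, s, s } };
    return r;
}

/* Write swizzle on an lvalue: overwrite only the listed components of
   dst and leave the others untouched. idx = {0, 3} plays the role of
   an assignment to dst.xw. */
void write_components(float4 *dst, const int *idx, int n, const float *vals) {
    for (int k = 0; k < n; ++k) {
        assert(idx[k] >= 0 && idx[k] < 4);
        dst->v[idx[k]] = vals[k];
    }
}
```

The index-based interface is only a convenience for the sketch; it shows that a read swizzle is a gather with possible repetition, while a write swizzle is a masked store.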
338. tes the type of the parameter array elements: 1 for arrays of float1, 2 for arrays of float2, and so on. The variables startIndex and numberOfElements specify which elements of the array parameter are set. They are the numberOfElements elements whose indices range from startIndex to startIndex+numberOfElements-1. Passing a value of 0 for numberOfElements tells the functions to set all the values starting at index startIndex up to the last valid index of the array, namely cgGetArraySize(parameter, 0)-1. This is equivalent to setting numberOfElements to cgGetArraySize(parameter, 0)-startIndex. The parameter array is an array of scalar values. It must hold numberOfElements values for the cgGLSetParameterArray1 functions, 2*numberOfElements values for the cgGLSetParameterArray2 functions, and so on.

The corresponding parameter value retrieval functions are as follows:

    void cgGLGetParameterArray1f(CGparameter parameter,
        long startIndex, long numberOfElements, float* array);
    void cgGLGetParameterArray1d(CGparameter parameter,
        long startIndex, long numberOfElements, double* array);
    void cgGLGetParameterArray2f(CGparameter parameter,
        long startIndex, long numberOfElements, float* array);
    void cgGLGetParameterArray2d(CGparameter parameter,
        long startIndex, long numberOfElements, double* array);
    void cgGLGetParameterArray3f(CGparameter parameter,
        long startIndex, long numberOfElements, float* array);
    void cgGLGetPar
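The index arithmetic described above for startIndex and numberOfElements can be sketched in plain C as follows. This is only a model of the rule, assuming the array size is passed in directly: arraySize stands in for the value that cgGetArraySize(parameter, 0) would return, and the function names effective_count and required_scalars are hypothetical, not Cg runtime entry points.

```c
#include <assert.h>

/* Number of array elements actually touched. A numberOfElements of 0
   means "everything from startIndex to the last valid index". */
long effective_count(long arraySize, long startIndex, long numberOfElements) {
    assert(startIndex >= 0 && startIndex < arraySize);
    if (numberOfElements == 0)
        return arraySize - startIndex;
    assert(numberOfElements > 0);
    assert(startIndex + numberOfElements <= arraySize);
    return numberOfElements;
}

/* Number of scalars the caller's buffer must hold: components per
   element (1 for float1 arrays, 2 for float2 arrays, and so on)
   times the number of elements touched. */
long required_scalars(int components, long arraySize,
                      long startIndex, long numberOfElements) {
    assert(components >= 1 && components <= 4);
    return components * effective_count(arraySize, startIndex, numberOfElements);
}
```

For an 8-element float3 array, starting at index 2 with numberOfElements set to 0 touches 6 elements, so the buffer passed to the corresponding Array3 function would need 18 scalars.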
339. texture lookups 23
texture map functions 38
texture maps for performance 324
textures 123
thin film effect
  pixel shader code example 182
  vertex shader code example 180
tutorial 145
type conversions 12, 234
  array 235
  matrix 234
  scalar 234
  structure 235
  vector 234
type equivalency 236
type promotion 236
  assignment 237
  smearing 237
type qualifiers 233
  const 233
  in 233
  out 233
types
  general discussion 229
  partial support 231

U
uniform inputs 5
uniform modifier, use of 225
uninitialized variables, use of 241
unsized arrays 125

V
variables
  global 241
  uninitialized, use of 241
varying inputs 5, 6
vector data types 12
vector operators, new 244
vectorization, for performance 321
vectors, constructing 21
vertex color 149
vertex position 149
vertex program 121
  varying output 7
vertex program profiles 250
vertex programs, defined 3
virtual machine 127
void type, specification 229
vp20 profile 279
vp30 profile 270
vs_1_1 profile 304
vs_2_0 profile 296
vs_2_x profile 296

W
water, improved
  pixel shader code example 160
  sample shader 157
  vertex shader code example 158
web site, NVIDIA xvi
while statements 244
workspace, loading 145
write mask operator 22
  described 246
340. the CgFX state assignment BlendFunc = int2(Zero, DstAlpha).

When a state assignment depends on the presence of an OpenGL extension (for example, BlendFuncSeparate requires either EXT_blend_func_separate or the presence of OpenGL 1.4), it is possible to successfully load an effect file that uses that extension in one of its techniques, even if the OpenGL context doesn't support that extension. However, validation of any technique that uses such an unsupported extension in one of its passes will fail.

The following table lists the names of the states supported by the CgFX OpenGL state manager, their types, and valid enumerants. The "Requires" column in the tables below indicates what OpenGL version or extension is required for each state assignment.

Table 6. CgFX OpenGL State Manager States

State Name | Type | Valid Enumerants | Requires
AlphaFunc | float2 (enum, reference_value) | Never, Less, LEqual, Equal, Greater, NotEqual, GEqual, Always | OpenGL 1.0
BlendFunc | int2 (src_factor, dst_factor) | Zero, One, DestColor, OneMinusDestColor, SrcAlpha, OneMinusSrcAlpha, DstAlpha, OneMinusDstAlpha, SrcAlphaSaturate, SrcColor, OneMinusSrcColor, ConstantColor, OneMinusConstantColor, ConstantAlpha, OneMinusConstantAlpha | 1.0; 1.4 or NV_blend_square for SrcColor or OneMinusSrcColor for src_factor, and DstColor or OneMinusDstColor for dst_factor
341. the existing profiles.

- Run on future profiles corresponding to new 3D APIs or to hardware that did not exist at the time the Cg programs were written.

No Dependency Limitations

If you link a Cg program to the application when it is compiled, the application is too dependent on the result of the compilation. The application program has to refer to the Cg program input parameters by using the hardware register names that are output by the Cg compiler. This approach is awkward for two reasons:

- The register names can't be easily matched to the corresponding meaningful names in the Cg program without looking at the compiler output.

- Register allocations can change each time the Cg program, the Cg compiler, or the compilation profile changes. This means you have the inconvenience of updating the application each time as well.

In contrast, linking a Cg program to the application program at run time removes the dependency on the Cg compiler. With the runtime, you need to alter the application code only when you add, delete, or modify Cg input parameters.

Input Parameter Management

The Cg runtime also offers additional facilities to manage the input parameters of the Cg program. In particular, it makes data types such as arrays and matrices easier to deal with. These additional functions also encompass the necessary 3D API calls to minimize code length and reduce programmer errors.
342. the unpack_4ubyte() function.

C Pseudocode

    ub.x = round(255.0 * clamp(a.x, 0.0, 1.0));
    ub.y = round(255.0 * clamp(a.y, 0.0, 1.0));
    ub.z = round(255.0 * clamp(a.z, 0.0, 1.0));
    ub.w = round(255.0 * clamp(a.w, 0.0, 1.0));
    result = (ub.w << 24) | (ub.z << 16) | (ub.y << 8) | ub.x;

unpack_4ubyte

    half4 unpack_4ubyte(float a)

Unpacks the four 8-bit integers in a and scales the results into individual 16-bit floating-point values between 0.0 and 1.0.

C Pseudocode

    result.x = ((a >> 0) & 0xFF) / 255.0;
    result.y = ((a >> 8) & 0xFF) / 255.0;
    result.z = ((a >> 16) & 0xFF) / 255.0;
    result.w = ((a >> 24) & 0xFF) / 255.0;

OpenGL NV_vertex_program 1.0 Profile (vp20)

Overview

The vp20 Vertex Program profile is used to compile Cg source code to vertex programs for use by the NV_vertex_program OpenGL extension.

- Profile name: vp20

- How to invoke: Use the compiler option -profile vp20.

This section describes the capabilities and restrictions of Cg when using the vp20 profile.

The vp20 profile limits Cg to match the capabilities of the NV_vertex_program extension. NV_vertex_program has the same capabilities as DirectX 8 vertex shaders, so the limitations that this profile places on the Cg source code written by the programmer are the same as those of the DirectX VS 1.1 shader profile.

Aside from the syntax of the compiler output, the only difference between the
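The pack_4ubyte and unpack_4ubyte pseudocode given earlier in this excerpt can be turned into compilable C along the following lines. This is a sketch under stated assumptions, not the library implementation: the packed value is carried in a uint32_t rather than in the bits of a float, rounding is done by adding 0.5 and truncating (valid because the clamped input is non-negative), and the names float4, pack_4ubyte_bits, and unpack_4ubyte_bits are hypothetical.

```c
#include <stdint.h>

/* Hypothetical stand-in for a Cg four-component vector. */
typedef struct { float x, y, z, w; } float4;

static float clamp01(float v) {
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* Clamp to [0, 1], scale to [0, 255], round to nearest integer. */
static uint32_t to_ubyte(float v) {
    return (uint32_t)(255.0f * clamp01(v) + 0.5f);
}

/* Pack four components into one 32-bit word, x in the low byte. */
uint32_t pack_4ubyte_bits(float4 a) {
    uint32_t x = to_ubyte(a.x);
    uint32_t y = to_ubyte(a.y);
    uint32_t z = to_ubyte(a.z);
    uint32_t w = to_ubyte(a.w);
    return (w << 24) | (z << 16) | (y << 8) | x;
}

/* Unpack the four bytes and scale each back into [0, 1]. */
float4 unpack_4ubyte_bits(uint32_t a) {
    float4 r;
    r.x = (float)((a >> 0) & 0xFFu) / 255.0f;
    r.y = (float)((a >> 8) & 0xFFu) / 255.0f;
    r.z = (float)((a >> 16) & 0xFFu) / 255.0f;
    r.w = (float)((a >> 24) & 0xFFu) / 255.0f;
    return r;
}
```

Packing an unpacked word returns the original word, because each byte survives the divide-by-255 and multiply-by-255 round trip once rounding is applied.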
343. tialized somewhere else
D3DXCOLOR constantColor;  // Initialized somewhere else

CGcontext context;
CGprogram vertexProgram, fragmentProgram;
CGparameter baseTexture, someColor, modelViewMatrix;

// Called at application startup
void OnStartup()
{
    // Create context
    context = cgCreateContext();
}

// Called whenever the Direct3D device needs to be created
void OnCreateDevice()
{
    // Pass the Direct3D device to the expanded interface
    cgD3D8SetDevice(device);

    // Determine the best profiles to use
    CGprofile vertexProfile = cgD3D8GetLatestVertexProfile();
    CGprofile pixelProfile = cgD3D8GetLatestPixelProfile();

    // Grab the optimal options for each profile
    const char* vertexOptions[] = {
        cgD3D8GetOptimalOptions(vertexProfile), 0 };
    const char* pixelOptions[] = {
        cgD3D8GetOptimalOptions(pixelProfile), 0 };

    // Create the vertex shader
    vertexProgram = cgCreateProgramFromFile(
        context, CG_SOURCE, "VertexProgram.cg",
        vertexProfile, "VertexProgram", vertexOptions);

    // If your program uses explicit binding semantics (like
    // this one), you can create a vertex declaration
    // using those semantics.
    DWORD declaration[] = {
        D3DVSD_STREAM(0),
        D3DVSD_REG(D3DVSDE_POSITION, D3DVSDT_FLOAT3),
        D3DVSD_REG(D3DVSDE_DIFFUSE, D3DVSDT_D3DCOLOR),
        D3DVSD_REG(D3DVSDE_TEXCOORD0, D3DVSDT_FLOAT2),
        D3DVSD_END()
    };

    // Ensure the re
344. tics for Varying Input/Output Data

The valid binding semantics for varying input parameters in the vp30 profile are summarized in Table 24.

One can also use TANGENT and BINORMAL instead of TEXCOORD6 and TEXCOORD7. These binding semantics map to NV_vertex_program2 input attribute parameters. The two sets act as aliases to each other.

Table 24. vp30 Varying Input Binding Semantics

Binding Semantics Name | Corresponding Data
POSITION, ATTR0 | Input vertex, Generic Attribute 0
BLENDWEIGHT, ATTR1 | Input vertex weight, Generic Attribute 1
NORMAL, ATTR2 | Input normal, Generic Attribute 2
COLOR0, DIFFUSE, ATTR3 | Input primary color, Generic Attribute 3
COLOR1, SPECULAR, ATTR4 | Input secondary color, Generic Attribute 4
TESSFACTOR, FOGCOORD, ATTR5 | Input fog coordinate, Generic Attribute 5
PSIZE, ATTR6 | Input point size, Generic Attribute 6
BLENDINDICES, ATTR7 | Generic Attribute 7
TEXCOORD0-TEXCOORD7, ATTR8-ATTR15 | Input texture coordinates (texcoord0-texcoord7), Generic Attributes 8-15
TANGENT, ATTR14 | Generic Attribute 14
BINORMAL, ATTR15 | Generic Attribute 15

The valid binding semantics for varying output parameters in the vp30 profile are summarized in Table 25.

These binding semantics map to NV_vertex_program2 output registers. The two sets act as aliases to each other.

Table 25. vp30 Varying Output Binding Semantics

Binding Semantics Name | Corres
345. ting

    // calculate diffuse component
    float diffuse = dot(normalVec, lightVec);

    // calculate specular component
    float specular = dot(normalVec, halfVec);

    // Use the lit function to compute lighting vector from
    // diffuse and specular values
    float4 lighting = lit(diffuse, specular, 32);

Here we use the Cg Standard Library to perform dot products, using dot(). We also make use of the Standard Library's lit() function to calculate a Blinn-style lighting vector based on the previously computed dot products. The returned vector holds the diffuse lighting contribution in the y coordinate and the specular lighting contribution in the z coordinate.

Remember to take advantage of the Standard Library to help speed up your development cycle.

Modulating the Diffuse and Specular Lighting Contributions

Once the diffuse and specular lighting contributions, lighting.y and lighting.z, have been calculated, we need to modulate them with the object's material properties:

    // blue diffuse material
    float3 diffuseMaterial = float3(0.0, 0.0, 1.0);

    // white specular material
    float3 specularMaterial = float3(1.0, 1.0, 1.0);

    // combine diffuse and specular contributions and
    // output final vertex color
    OUT.Color.rgb = lighting.y * diffuseMaterial +
                    lighting.z * specularMaterial;
    OUT.Color.a = 1.0;

    return OUT;
}

We define the object's diffuse material color as blue. We 
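For readers who want to see what the lighting step above computes, the following C sketch models it on the CPU. It assumes the commonly documented behaviour of lit(): the y component is the diffuse term clamped at zero, and the z component is the specular term raised to the given power, forced to zero when the surface faces away from the light. The names lit_sketch, modulate, color3, and powi are hypothetical, and the exponent is restricted to a non-negative integer so the sketch needs no math library.

```c
/* Hypothetical stand-ins for Cg vector types. */
typedef struct { float x, y, z, w; } float4;
typedef struct { float r, g, b; } color3;

/* Integer power by repeated multiplication. */
static float powi(float base, unsigned n) {
    float result = 1.0f;
    while (n--)
        result *= base;
    return result;
}

/* Model of lit(NdotL, NdotH, m): returns (1, diffuse, specular, 1). */
float4 lit_sketch(float ndotl, float ndoth, unsigned m) {
    float4 r;
    r.x = 1.0f;
    r.y = ndotl > 0.0f ? ndotl : 0.0f;
    r.z = (ndotl > 0.0f && ndoth > 0.0f) ? powi(ndoth, m) : 0.0f;
    r.w = 1.0f;
    return r;
}

/* Modulate the lighting vector with the material colors, as in
   lighting.y * diffuseMaterial + lighting.z * specularMaterial. */
color3 modulate(float4 lighting, color3 diffuseMaterial, color3 specularMaterial) {
    color3 c;
    c.r = lighting.y * diffuseMaterial.r + lighting.z * specularMaterial.r;
    c.g = lighting.y * diffuseMaterial.g + lighting.z * specularMaterial.g;
    c.b = lighting.y * diffuseMaterial.b + lighting.z * specularMaterial.b;
    return c;
}
```

With a blue diffuse material and a white specular material, a vertex facing away from the light receives no contribution from either term, which is the reason lit() couples the specular result to the sign of the diffuse dot product.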
346. tion(vertexDeclaration);
    device->SetTexture(baseTextureUnit, texture);
    device->SetVertexShader(vertexShader);
    device->SetPixelShader(pixelShader);

    // Draw scene
    // ...
}

// Called before the device changes or is destroyed
void OnDestroyDevice()
{
    vertexShader->Release();
    pixelShader->Release();
    vertexDeclaration->Release();
}

// Called before application shuts down
void OnShutdown()
{
    // This frees any core runtime resources.
    // The minimal interface has no dynamic storage to free.
    cgDestroyContext(context);
}

Direct3D 8 Application

The following C code links the previous vertex and fragment programs to the Direct3D 8 application:

#include <cg/cg.h>
#include <cg/cgD3D8.h>

IDirect3DDevice8* device;    // Initialized somewhere else
IDirect3DTexture8* texture;  // Initialized somewhere else
D3DXMATRIX matrix;           // Initialized somewhere else
D3DXCOLOR constantColor;     // Initialized somewhere else
CGcontext context;
CGprogram vertexProgram, fragmentProgram;
DWORD vertexShader, pixelShader;
CGparameter baseTexture, someColor, modelViewMatrix;

// Called at application startup
void OnStartup()
{
    // Create context
    context = cgCreateContext();
}

// Called whenever the Direct3D device needs to be created
void OnCreateDevice()
{
    // Create the vertex shader
    vertexProgram = c
347. tly

There are several changes that force the same operation to be expressed differently in Cg than in C:

- A Boolean type, bool, is introduced, with corresponding implications for operators and control constructs.

- Arrays are first-class types because Cg does not support pointers.

- Functions pass values by value-result, and thus use an out or inout modifier in the formal parameter list to return a parameter. By default, formal parameters are in, but it is acceptable to specify this explicitly. Parameters can also be specified as in out, which is semantically the same as inout.

Differences from ANSI C

Cg was developed based on the ANSI C language with the following major additions, deletions, and changes. (This is a summary; more detail is provided later in this document.)

- Language profiles (described in "Profiles" on page 225) may subset language capabilities in a variety of ways. In particular, language profiles may restrict the use of for and while loops. For example, some profiles may only support loops that can be fully unrolled at compile time.

- A binding semantic may be associated with a structure tag, a variable, or a structure element to denote that object's mapping to a specific hardware or API resource. See "Binding Semantics" on page 242.

- Reserved keywords goto, break, and continue are not supported.

- Reserved keywords switch, case, and default are not supported. Labels are not supported either.

- Poin
348. top level function or by any functions that it calls. The output of the program comes from the return value of the function (which is always implicitly varying) and from any out parameters, which must also be varying.

Parameters to a program of type sampler* are implicitly const.

Statements

Statements are expressed just as in C, unless an exception is stated elsewhere in this document. Additionally:

- The if, while, and for statements require bool expressions in the appropriate places.

- Assignment is performed using =. The assignment operator returns a value, just as in C, so assignments may be chained.

- The new discard statement terminates execution of the program for the current data element (such as the current vertex or current fragment) and suppresses its output. Vertex profiles may choose to omit support for discard.

Minimum Requirements for if, while, and for Statements

The minimum requirements are as follows:

- All profiles should support if, but such support is not strictly required for older hardware.

- All profiles should support for and while loops if the number of loop iterations can be determined at compile time.

  "Can be determined at compile time" is defined as follows: The loop iteration expressions can be evaluated at compile time by use of intra-procedural constant propagation and folding, where the variables through which constant v
349. trans modelview 0
#endif

    OUT.HPosition = mul(ModelViewProj, IN.Position);

    float3 normal = normalize(mul(ModelViewIT, IN.Normal).xyz);
    float3 eyeToVert = normalize(mul(ModelView, IN.Position).xyz);

    // reflect the eye vector across the normal vector
    // for reflection
    OUT.TexCoord0 = float4(reflect(eyeToVert, normal), 1.0);

    float   0   15

    // compute the fresnel term
    float oneMCosAngle = 1 + dot(eyeToVert, normal);
    oneMCosAngle = pow(oneMCosAngle, 5);
    OUT Color0   lerp oneMCosAngle  1    0   xxxx

    return OUT;
}

Grass

Description

This effect shows procedural animation of geometry using a sine function, along with calculation of a normal for the procedurally deformed geometry (Fig. 17).

[Figure: rendered grass example.]

Fig. 17. Example of Grass

Vertex Shader Source Code for Grass

struct app2vert {
    float4 Position : POSITION;
    float4 Normal : NORMAL;
    float4 TexCoord0 : TEXCOORD0;
    float4 Color0 : COLOR0;
};

struct vertout {
    float4 Hposition : POSITION;
    Hali MN ike  COLOR0
    float4 TexCoord0 : TEXCOORD0;
};

vertout main(app2vert IN,
             uniform float4x4 ModelViewProj,
             uniform float4x4 ModelView,
             uniform float4x4 ModelViewIT,
             uniform float4 Constants)
{
    vertout OUT;

    // we need to figure OUT what the position is
    float4 position  position z   0   position y 0     IN Position

    // add IN the act
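The Fresnel term computed in the reflection shader earlier in this excerpt can be sketched in C as shown below. Several things here are assumptions rather than facts from the source: eyeToVert is taken to point from the eye toward the vertex, so the cosine of the viewing angle is the negated dot product and "one minus cosine" becomes 1 + dot; both vectors are assumed to be normalized; and the blend weight passed to the final lerp is a hypothetical parameter, because the value used in the original listing is not legible. The names fresnel_sketch, dot3, and lerpf are likewise invented for the example.

```c
/* Hypothetical stand-in for a Cg float3. */
typedef struct { float x, y, z; } float3;

static float dot3(float3 a, float3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Linear interpolation with the same argument order as Cg's lerp():
   returns a when t is 0 and b when t is 1. */
static float lerpf(float a, float b, float t) {
    return a + t * (b - a);
}

/* Approximate Fresnel factor: (1 - cos(angle))^5, blended toward 1
   by the caller-supplied weight. */
float fresnel_sketch(float3 eyeToVert, float3 normal, float weight) {
    float oneMCosAngle = 1.0f + dot3(eyeToVert, normal);
    float p = oneMCosAngle * oneMCosAngle;   /* power 2 */
    p = p * p * oneMCosAngle;                /* power 5 */
    return lerpf(p, 1.0f, weight);
}
```

Looking straight at a surface gives a term of zero before blending, so the result equals the weight; at a grazing angle the term reaches one, which is why reflections strengthen toward the silhouette.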
350. trq are texture coordinates associated with sampler tex,
prevlookup is the result of a previous texture operation,
intermediate_coord1 are texture coordinates associated with the n-2 texture unit,
intermediate_coord2 are texture coordinates associated with the n-1 texture unit, and
eye is the eye-ray vector.

This function can be used to generate the texm3x3pad/texm3x3pad/texm3x3spec instruction combination in all ps_1_x profiles.

tex_dp3x2_depth(float3 str, float4 intermediate_coord, float4 prevlookup)

Performs the following:

    float z = dot(intermediate_coord.xyz, prevlookup.xyz);
    float w = dot(str, prevlookup.xyz);
    return z / w;

where
str are texture coordinates associated with the nth texture unit,
intermediate_coord are texture coordinates associated with the n-1 texture unit, and
prevlookup is the result of a previous texture operation.

This function can be used with the DEPTH varying out semantic to generate the texm3x2pad/texm3x2depth instruction combination in ps_1_3.

Examples

The following examples illustrate how a developer can use Cg to achieve DirectX pixel shader 1_X functionality.

Example 1

struct VertexOut {
    float4 color : COLOR0;
    float4 texCoord0 : TEXCOORD0;
    float4 texCoord1 : TEXCOORD1;
};

float4 main(VertexOut IN,
            uniform sampler2D diffuseMap,
            uniform sampler2D normalMap) : COLOR
{
    float4 diffuseTexColor = t
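The arithmetic that tex_dp3x2_depth is described as performing earlier in this excerpt amounts to two dot products and a divide. A minimal C model follows, assuming only the xyz components of the four-component inputs matter (as the pseudocode indicates) and making no attempt to guard against a zero divisor; the names float3, dot3, and dp3x2_depth_sketch are hypothetical.

```c
/* Hypothetical stand-in for a Cg float3 (or the .xyz of a float4). */
typedef struct { float x, y, z; } float3;

static float dot3(float3 a, float3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Depth value as described for tex_dp3x2_depth: the previous lookup is
   dotted with the two coordinate sets, and the results are divided. */
float dp3x2_depth_sketch(float3 str, float3 intermediate_coord, float3 prevlookup) {
    float z = dot3(intermediate_coord, prevlookup);
    float w = dot3(str, prevlookup);
    return z / w;
}
```

In effect, the two texture coordinate sets act as two rows of a matrix applied to the previous lookup result, and the quotient of the two products is what would be written out as depth.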
351. ual base location of      d cas straw  uel IN Colum   POSE Lam    POSWELOM     a WN Colo eO  xp  position a   Tou  s UN Color    z7          figure OUT where the wind is coming from  float4 origin lora  20   0 20  0  4  float4 dir POSLEOM   Gueiepusp    Wf al tae imncsnesity  Our idas wmi       float inten   sin Constants x    2 length dir      JUN  POSE LOIN ye  dir   normalize  dir         Bezier curve stuff here    float4 0 0 0 0     Float  0  UN  Colocd w 2  0 0  P   loca  eli x lt    nimeem  IN Color  Y   chile  atimta  0         do the Bezier linear interpolation steps   ILO de  IN Colort o were       we need to do som  iloare rere lil  ihoaee orii   Tlosr   ciel           808 00504 0000 006    203  NVIDIA       Cg Language Toolkit    loge temo   lheroa  ciellil  qued  ww   clock cenos   lema  eri  gus  i  Pp  float4 result   lerp temp  temp2  t            add IN the height and wind displacement components  position   position   result   position w   1        transform for sending to the reg  combiners  OUT Hposition   mul  ModelViewProj  position         calculate the texture coordinate  721 from the position passed IN    UA econ    lose    N osea  ar 195  5300  8  ML 10  9       find the normal      we need one more point to do a partial  cedo   lez  Curl  etrr Er  05  8   tempa   lewo erele  ceils  Er0  05    float4 newResult   lerp temp  temp2  t 0 05            do a crossproduct with a vector that   EY is horizontal across the screen   float normal   cross   result
352. ues(CGannotation, int* nvalues);
const int* cgGetIntAnnotationValues(CGannotation, int* nvalues);
const char* cgGetStringAnnotationValue(CGannotation);
const int* cgGetBooleanAnnotationValues(CGannotation, int* nvalues);

OpenGL State

When cgGLRegisterStates() is called, the CgFX OpenGL runtime initializes state assignments that correspond to almost all appropriate or useful OpenGL API calls. The set of states and state callbacks that are registered by this call compose the CgFX OpenGL state manager.

There is a one-to-one mapping between the state assignments that are provided by the OpenGL state manager and the corresponding OpenGL calls. Given an OpenGL call of interest, it is intended to be simple to determine which state assignment it corresponds to, and vice versa. For example, the state assignment ClearColor = float4(0,1,0,1) leads to the call glClearColor(0,1,0,1) when the state assignment is executed during a call to cgSetPassState().

For calls that take enumerated values (for example, GL_DST_COLOR for glBlendFunc()), corresponding enumerants are defined by the CgFX OpenGL state manager, again with a straightforward mapping: GL_DST_COLOR corresponds to DestColor, and so forth. When an OpenGL call takes multiple parameters or multiple enumerants, a corresponding vector type is used; for example, a call to glBlendFunc(GL_ZERO, GL_DST_ALPHA) corresponds to 
353. unction can generate an OpenGL error in addition to the Cg-specific error. These errors are checked in Cg, as in any OpenGL application, by using glGetError().

Direct3D Cg Runtime

The Direct3D Cg runtime is composed of two interfaces:

- Minimal interface: This interface makes no Direct3D calls itself and should be used when you prefer to keep the Direct3D code in the application itself.
- Expanded interface: This interface makes the Direct3D calls necessary to provide enhanced program and parameter management and should be used when you prefer to let the Cg runtime manage the Direct3D shaders.

Direct3D Minimal Interface

The minimal interface simply supplies convenient functions to convert some information provided by the core runtime to information specific to Direct3D.

Vertex Declaration

In Direct3D, you have to supply a vertex declaration that establishes a mapping between the vertex shader input registers and the data provided by the application as data streams. In Direct3D 9, this vertex declaration is bound to the current state the same way the vertex shader is (see the Direct3D 9 documentation on IDirect3DDevice9::CreateVertexDeclaration() and IDirect3DDevice9::SetVertexDeclaration() for a detailed explanation). In Direct3D 8, the vertex declaration is required at the time you create the vertex shader (for more information, see the Direct3D 8 documentation on
354. unsized array of Light interface objects, loops over them, and returns the sum of the values returned by their respective value() methods.

    interface Light {
      float4 value();
    };

    struct SpotLight : Light {
      float4 value() { return float4(...); }
    };

    float4 main(uniform Light l[]) : COLOR {
      float4 v = float4(0, 0, 0, 0);
      for (int i = 0; i < l.length; ++i)
        v += l[i].value();
      return v;
    }

Recall that all uniform parameters to the program must have expressions in the parenthesized list in the compile statement, and therefore one expression is necessary here for the l parameter.

Resolution using Cg

The first way that main() can be compiled is to provide the name of an effect parameter that resolves both the actual size of the array and the concrete type that implements the Light interface.

    SpotLight spots[4];

    technique {
      pass {
        FragmentProgram = compile arbfp1 main(spots);
      }
    }

Resolution using the Cg runtime

Alternatively, the application can leave the resolution of the concrete types and array size until later, so that they may be set via Cg runtime calls from the application, as one typically does for Cg programs that are not CgFX.

For this case, the expression passed to the compile statement should just be an unsized array of the abstract interface type.

    Light lights[];

    technique {
      pass {
        FragmentProgram = compile arbfp1 main(lights);
      }
    }

Th
355. urns CG_FALSE.

The declaration returned by cgD3D9GetVertexDeclaration() or cgD3D8GetVertexDeclaration() is for a single stream, so that for the following program:

    void main(in float4 position : POSITION,
              in float4 color : COLOR0,
              in float4 texCoord : TEXCOORD0,
              out float4 hpos : POSITION)
    {
    }

it is equivalent to

    const D3DVERTEXELEMENT9 declaration[] = {
      { 0, 0 * sizeof(float),
        D3DDECLTYPE_FLOAT4, D3DDECLMETHOD_DEFAULT,
        D3DDECLUSAGE_POSITION, 0 },
      { 0, 4 * sizeof(float),
        D3DDECLTYPE_FLOAT4, D3DDECLMETHOD_DEFAULT,
        D3DDECLUSAGE_COLOR, 0 },
      { 0, 8 * sizeof(float),
        D3DDECLTYPE_FLOAT4, D3DDECLMETHOD_DEFAULT,
        D3DDECLUSAGE_TEXCOORD, 0 },
      D3DDECL_END()
    };

for the Direct3D 9 Cg runtime, and it is equivalent to

    const DWORD declaration[] = {
      D3DVSD_STREAM(0),
      D3DVSD_REG(D3DVSDE_POSITION, D3DVSDT_FLOAT4),
      D3DVSD_REG(D3DVSDE_DIFFUSE, D3DVSDT_FLOAT4),
      D3DVSD_REG(D3DVSDE_TEXCOORD0, D3DVSDT_FLOAT4),
      D3DVSD_END()
    };

for the Direct3D 8 Cg runtime.

Usually, though, you want to apply a vertex program to geometric data that comes in multiple streams or with specific vertex formats. In this case, the vertex declaration is based on the vertex formats rather than the pr
356. utomatically set by the Cg runtime. However, in some situations it may be useful to query a "sink"-side member parameter for its underlying resource, for example.

A shared instance of a structure whose type is defined in one Cg program or effect may be connected to parameters of other programs or effects, provided that the entities involved define the source structure types and destination interface types equivalently. See "Parameter Type Equivalency" on page 65 for more details. If the types are not equivalent, cgConnectParameter() generates a runtime error.

The following example illustrates structure-to-interface connection by creating three programs, all of which define a type named Foo, with one program's definition differing from the others.

    interface MyInterface {
      float Val(float x);
    };
    struct MyStruct : MyInterface {
      float Scale;
      float Val(float x) { return Scale * x; }
    };
    float4 main(MyInterface foo) : COLOR {
      return foo.Val(...);
    }

Listing 1  Cg Program 1

    interface MyInterface {
      float Val(float x);
    };
    struct MyStruct : MyInterface {
      float Scale;
      float Val(float x) { return Scale * x; }
    };
    float4 main(MyInterface foo) : COLOR {
      return foo.Val(...);
    }

Listing 2  Cg Program 2

    interface MyInterface {
      half Val(half x);
    };
    struct MyStruct : MyInterface {
      float Scale;
      half Val(half x) { return Scale * x; }
357. utput parameters in the arbvp1 profile are found in Table 18. These binding semantics map to ARB_vertex_program output registers. The two sets act as aliases to each other.

Table 18  arbvp1 Varying Output Binding Semantics

    Binding Semantics Name            Corresponding Data
    POSITION, HPOS                    Output position
    PSIZE, PSIZ                       Output point size
    FOG, FOGC                         Output fog coordinate
    COLOR0, COL0                      Output primary color
    COLOR1, COL1                      Output secondary color
    BCOL0                             Output backface primary color
    BCOL1                             Output backface secondary color
    TEXCOORD0-TEXCOORD7, TEX0-TEX7    Output texture coordinates

Note: The application must call glEnable(GL_COLOR_SUM_ARB) in order to enable COLOR1 output when using the arbvp1 profile.

The profile also allows WPOS to be present as binding semantics on a member of a structure of a varying output data structure, provided the member with this binding semantics is not referenced. This allows Cg programs to have the same structure specify the varying output of an arbvp1 profile program and the varying input of an fp30 profile program.

Options

The arbvp1 profile supports the following profile-specific options:

    NumTemps=<n>
    MaxAddressRegs=<n>
    MaxInstructions=<
358. vector size is shorter than the semantic's vector size, the larger-numbered components of the semantic receive their default values, if applicable, and otherwise are undefined. In the case above, the R and G components of the output color are obtained from mycolor, while the B and A components of the color are undefined.

Appendix B  Language Profiles

This appendix describes the language capabilities that are available in each of the following profiles supported by the Cg compiler:

- OpenGL ARB Vertex Program Profile (arbvp1)
- OpenGL ARB Fragment Program Profile (arbfp1)
- OpenGL NV_vertex_program 3.0 Profile (vp40)
- OpenGL NV_fragment_program 2.0 Profile (fp40)
- OpenGL NV_vertex_program 2.0 Profile (vp30)
- OpenGL NV_fragment_program Profile (fp30)
- OpenGL NV_vertex_program 1.0 Profile (vp20)
- OpenGL NV_texture_shader and NV_register_combiners Profile (fp20)
- DirectX Vertex Shader 2.x Profiles (vs_2_*)
- DirectX Pixel Shader 2.x Profiles (ps_2_*)
- DirectX Vertex Shader 1.1 Profile (vs_1_1)
- DirectX Pixel Shader 1.x Profiles (ps_1_*)

In each case, the capabilities are a subset of the full capabilities described by the Cg language specification in "Cg Language Specification" on page 221.

OpenGL ARB Vertex Program Profile (arbvp1)

Overview
359. vp state matrix invtrans texture 0   state matrix invtrans palette 0  state matrix invtrans program 0           Accessible state semantics of type float4 are listed in Table 14                                         Table 14  float4 state Semantics   state material ambient state material diffuse  state material specular state material emission  state material shininess state material front ambient  state material front diffuse state material front specular  state material front emission state material front shininess  state material back ambient state material back diffuse  state material back specular state material back emission  808 00504 0000 006 257    NVIDIA          Cg Language Toolkit                                                          Table 14  float4 state Semantics  continued   state material back shininess state light  0   ambient  state light  0   diffuse state light  0   specular  state light  0   position state light  0   attenuation  state light  0   spot direction state light  0   half  state lightmodel  ambient state lightmodel scenecolor  state lightmodel front scenecolor state  lightmodel  back scenecolor  state lightprod 0   ambient state lightprod 0  diffuse  state lightprod 0  specular state lightprod 0  front ambient  state lightprod 0  front diffuse state lightprod 0  front specular  state lightprod 0  back ambient state lightprod 0  back diffuse  state lightprod 0  back specular state texgen 0  eye s  state texgen 0  eye t state texgen 0  eye r
360. with no profile overload

This search process allows generic versions of a function to be defined that can be overridden as needed for particular hardware.

Syntax for Parameters in Function Definitions

Functions are declared in a manner similar to C, but the parameters in function definitions may include a binding semantic (see "Binding Semantics" on page 242) and a default value.

Each parameter in a function definition takes the following form:

    [uniform] <type> identifier [: <binding_semantic>] [= <default>]

where

- <type> may include the qualifiers in, out, inout, and const, as discussed in "Type Qualifiers" on page 233.
- <default> is an expression that resolves to a constant at compile time.

Default values are only permitted for uniform parameters, and for in parameters to functions that are not top level.

Function Calls

A function call returns an rvalue. Therefore, if a function returns an array, the array may be read but not written. For example, the following is allowed:

    y = myfunc(x)[2];

But this is not:

    myfunc(x)[2] = y;

For multiple function calls within an expression, the calls can occur in any order; the order is undefined.

Method Calls

Structures may have methods declared and defined in their structure definitions. For example:

    struct Foo {
      float value;
      float valueTimesTwo() { return 2 * value; }
    };
361. wst   float2  dot  intermediate coord xyz  prevlookup xyz    dot str  prevlookup xyz     return tex2D RECT  tex  newst    where  str are texture coordinates associated with sampler tex   prevlookup is the result of a previous texture operation  and  intermediate coord are texture coordinates associated with the previous  texture unit     This function can be used to generate the dot product 2dor  dot product rectangle NV texture shader instruction combinations           tex3D dp3x3 sampler3D tex  float3 str     texCUBE dp3x3 samplerCUBE tex  float3 str     float4 intermediate coordl   float4 intermediate coord2  float4 prevlookup     float4 intermediate coordl   float4 intermediate coord2  float4 prevlookup           292    808 00504 0000 006  NVIDIA    Appendix B Language Profiles    Table 38    p20 Auxiliary Texture Functions  continued        Texture Function       Description       Performs the following  float3 newst   float3  dot  intermediate coordl xyz  prevlookup xyz      dot  intermediate coord2 xyz  prevlookup xyz     dot  str  prevlookup  xyz      return tex3D CUBE  tex  newst     where  str are texture coordinates associated with sampler tex   prevlookup is the result of a previous texture operation   intermediate_coord1 are texture coordinates associated with the n 2  texture unit  and  intermediate coord2 are texture coordinates associated with the n 1  texture unit     This function can be used to generate the dot_product_3d or  dot_product_cube_map NV_texture
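The coordinate computation described at the start of this excerpt (two dot products against the previous lookup result) can be written out in plain C. This is only a sketch of the arithmetic, assuming the formula newst = float2(dot(intermediate_coord.xyz, prevlookup.xyz), dot(str, prevlookup.xyz)); the types and the function name dp3x2_coords are hypothetical, and the final texture fetch is omitted.

```c
/* Three-component vector standing in for a Cg float3. */
typedef struct { float x, y, z; } vec3;

/* Two-component texture coordinate standing in for a Cg float2. */
typedef struct { float s, t; } vec2;

/* Three-component dot product. */
float dot3(vec3 a, vec3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Build the new 2D lookup coordinates: the first component uses the
   coordinates of the previous texture unit, the second uses the
   coordinates of the current one. Both are dotted with prevlookup. */
vec2 dp3x2_coords(vec3 str, vec3 intermediate_coord, vec3 prevlookup) {
    vec2 newst;
    newst.s = dot3(intermediate_coord, prevlookup);
    newst.t = dot3(str, prevlookup);
    return newst;
}
```

The three-coordinate variants in the table follow the same pattern with one more dot product per additional texture unit.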
362. x Shader Source Code for Melting Paint       define inputs from application  struct app2vert     float4 Position TEOSTITON   float4 Normal   NORMAL        808 00504 0000 006 161  NVIDIA          Cg Language Toolkit      1    1  H    St LUCE         A          Ime  dram    H    vert2f    ve    Wy  Hi  Ou   14  Ou         TE                                                    oat4 ColorO LODOLOBOU  oat4 TexCoord0    TEXCOORDD   vert2frag  oat4 HPosition T ROSTETON   oat3 OPosition EAS O ORINA  oat3 EPosition ECO ORDES  oat3 Normal   TEXCOORD1   oat3 TexCoord0 LOU EXCODEIDUS  oat4 ColorO  COLO   loat3 LightPos   TEXCOORD4   loat3 ViewerPos   TEXCOORD5   rag main  app2vert In     uniform float4x4 ModelViewProj   uniform float4x4 ModelView   uniform float4x4 ModelViewl   uniform float4 ViewerPos   uniform float4 LightPos       ae ere teo  tout     Vertex positions    In clip space   t HPosition   mul  ModelViewProj  In Position    In object space   E OROSite tom   Im  POSE Oa a SUE   In eye space   t EPosition   mul  ModelView  In Position   xyz        t Normal   normalize In Normal xyz    Copy the texture coordinates  t TexCoord0   In TexCoord0 xyz        Generate a white color   t Color0   LightPos    t LightPos   mul  ModelViewI  LightPos   xyz   t ViewerPos   mul  ModelViewI  float4 0 0 0 1    xyz   cuca  Ova p       162    808 00504 0000 006  NVIDIA    Advanced Profile Sample Shaders    Pixel Shader Source Code for Melting Paint    struct vert2frag                             
363. x program indexing does not permit it. Each element of the array takes a single 4-float program parameter register. For example, float arr[10], float2 arr[10], float3 arr[10], and float4 arr[10] all consume ten program parameter registers.

It is more efficient to access an array of vectors than an array of matrices. Accessing a matrix requires a floor calculation, followed by a multiply by a constant to compute the register index. Because vectors (and scalars) take one register, neither the floor nor the multiply is needed. It is faster to do matrix skinning using arrays of vectors with a premultiplied index than using arrays of matrices.

Constants

Literal constants can be used with this profile, but it is not possible to store them in the program itself. Instead, the compiler will issue, as comments, a list of program parameter registers and the constants that need to be loaded into them. The Cg run-time system will handle loading the constants, as directed by the compiler.

Note: If the Cg run-time system is not used, it is the responsibility of the programmer to make sure that the constants are loaded properly.

Bindings

Binding Semantics for Uniform Data

The valid binding semantics for uniform parameters in the vs_1_1 profile are summarized in Table 45.

Table 45  vs_1_1 Uniform Input Binding Semantics

    Binding Semantics Name    Corresponding Data
    register(c0
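The register arithmetic behind the advice on matrix versus vector arrays can be sketched in C. This is an illustration only: the function names are hypothetical, and it assumes that a matrix occupies one register per row, which the excerpt implies but does not state.

```c
/* Registers consumed by an array of `count` scalars or vectors:
   every element takes one 4-float register, whatever its width. */
int vector_array_registers(int count) {
    return count;
}

/* Register of row `row` in matrix number `index` of a matrix array.
   The index arrives as a float, so it is floored first (the cast
   truncates, which equals floor for non-negative values) and then
   multiplied by the per-matrix register count. */
int matrix_row_register(float index, int rows_per_matrix, int row) {
    int whole = (int)index;
    return whole * rows_per_matrix + row;
}

/* Same lookup against an array of vectors, where the application has
   already multiplied the index by the row count: no floor, no multiply. */
int vector_row_register(int premultiplied_index, int row) {
    return premultiplied_index + row;
}
```

For matrix number 2 of a four-row matrix array, both paths reach the same register, but the second skips the two extra operations; that is the saving the text describes for matrix skinning.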
364. x value as long as the application provides this value with each vertex.

Cg provides a flexible mechanism for specifying these per-vertex inputs in the form of a set of predefined names. Each program input must be bound to a name from this set. In the following structure, the vertex program definition binds its parameters to the predefined names POSITION, NORMAL, TANGENT, and TEXCOORD3. The application must provide the vertex array data associated with these predefined names.

    struct myinputs {
      float3 myPosition : POSITION;
      float3 myNormal : NORMAL;
      float3 myTangent : TANGENT;
      float refractive_index : TEXCOORD3;
    };

    outdata foo(myinputs indata) {
      /* ... */
      // Within the program, the parameters are referred to as
      // "indata.myPosition", "indata.myNormal", and so on.
      /* ... */
    }

We refer to the predefined names as binding semantics. The following set of binding semantics is supported in all Cg vertex program profiles. Some Cg profiles support additional binding semantics.

    POSITION        BLENDWEIGHT
    NORMAL          TANGENT
    BINORMAL        PSIZE
    BLENDINDICES    TEXCOORD0-TEXCOORD7

The binding semantic POSITION0 is equivalent to the binding semantic POSITION; likewise, the other binding semantics have similar equivalents.

In the OpenGL Cg profiles, binding semantics implicitly specify the mapping of varying inputs to particular hardware registers. However, in DirectX
365. xwese  9 sealer    outputs main inputs IN   uniform float4x4 ModelViewProj   uniform float4x4 ModelView   uniform float4x4 ModelViewIT   uniform float theta     outputs OUT   OUT hPosition   mul ModelViewProj  IN Position         convert the position and normal into      appropriate spaces   float3 eyeToVert   mul ModelView  IN Position  xyz   eyeToVert   normalize eyeToVert     float3 normal   mul ModelViewIT  IN Normal  xyz   normal   normalize  normal      OUT refractVec xyz   refract eyeToVert  normal  theta            206 808 00504 0000 006  NVIDIA    Basic Profile Sample Shaders     DIU acte c WIR    OUT reflectVec xyz   reflect eyeToVert  normal    OUT reflectVec w   1              calculate the fresnel reflection  OUT fresnelTerm   fast fresnel  eyeToVert  normal   Ergat o  3 0  1 0   9 0    Pp       return OUT     Pixel Shader Source Code for Refraction                ellos mata  aum sEllowHES  eer ACE Wee   TEXCOORDO   iin Tlosis retlecivee e WE YMCOORIDI y   in float3 fresnelTerm   COLORO     uniform samplerCUBE environmentMaps 2    uniform float enableRefraction   uniform float enableFresnel    COLOR       float3 refractColor   texCUBE  environmentMaps 0    refractVec   rgb   float3 reflectColor   texCUBE environmentMaps  1    reflectVec   rgb        float3 reflectRefract   lerp refractColor  reflectColor   fresnelTerm             float3 finalColor   enableRefraction     enableFresnel   reflectRefract   refractColor    enableFresnel   reflectColor   fresnelTerm   
366. xyz);
    float w = dot(texCoord<n+1>, t.xyz);
    depth = z / w;

Auxiliary Texture Functions

Because the capabilities of the texture shader instructions are limited in NV_texture_shader, a set of auxiliary functions is provided in these profiles that expresses the functionality of the more complex texture shader instructions. These functions are merely provided as a convenience for writing fp20 Cg programs. The same result can be achieved by writing the expanded form of each function directly. Using the expanded form has the additional advantage of being supported on other profiles.

These functions are summarized in Table 38.

Table 38  fp20 Auxiliary Texture Functions

Texture Function:

    offsettex2D(uniform sampler2D tex, float2 st, float4 prevlookup, uniform float4 m)
    offsettexRECT(uniform samplerRECT tex, float2 st, float4 prevlookup, uniform float4 m)

Description: Performs the following:

    float2 newst = st + m.xy * prevlookup.xx + m.zw * prevlookup.yy;
    return tex2D/RECT(tex, newst);

where st are texture coordinates associated with sampler tex, prevlookup is the result of a previous texture operation, and m is the offset texture matrix.

This function can be used to generate the offset_2d or offset_rectangle NV_texture_shader instructions.

Texture Function:

    offsettex2DScaleBias(uniform sampler2D tex, float2 st, float4 prevlookup, uniform float4 m, unifo
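The expanded form of the offset lookup is a small 2x2 matrix-vector product added to the base coordinates. The C sketch below mirrors that arithmetic component by component; the types and the name offset_coords are hypothetical stand-ins for the Cg types, and the texture fetch itself is left out.

```c
/* Stand-ins for the Cg float2 and float4 types. */
typedef struct { float x, y; } vec2;
typedef struct { float x, y, z, w; } vec4;

/* newst = st + m.xy * prevlookup.xx + m.zw * prevlookup.yy
   m holds the 2x2 offset texture matrix: (m.x, m.y) scales the first
   component of the previous lookup, (m.z, m.w) scales the second. */
vec2 offset_coords(vec2 st, vec4 prevlookup, vec4 m) {
    vec2 newst;
    newst.x = st.x + m.x * prevlookup.x + m.z * prevlookup.y;
    newst.y = st.y + m.y * prevlookup.x + m.w * prevlookup.y;
    return newst;
}
```

With m set to (1, 0, 0, 1) the previous lookup is added to st unchanged, which is a convenient sanity check when porting the expanded form to another profile.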
367. ying contexts.

Context Creation and Destruction

Programs can only be created as part of a context that acts as a program container. A context is created by calling cgCreateContext():

    CGcontext cgCreateContext();

A context is destroyed by cgDestroyContext():

    void cgDestroyContext(CGcontext context);

cgDestroyContext() deletes all data associated with the context, including all programs it contains. cgDestroyContext() should be called before destroying any associated OpenGL context or Direct3D device.

Context Query

To check whether a context handle references a valid context or not, use cgIsContext():

    CGbool cgIsContext(CGcontext context);

Core Cg Program

There are Cg functions for creating, destroying, iterating over, and querying programs.

Program Creation and Destruction

A program is created by calling either cgCreateProgram():

    CGprogram cgCreateProgram(CGcontext context,
                              CGenum programType,
                              const char *program,
                              CGprofile profile,
                              const char *entry,
                              const char **args);

or cgCreateProgramFromFile():

    CGprogram cgCreateProgramFromFile(CGcontext context,
                                      CGenum programType,
                                      const char *program,
                                      CGprofile profile,
                                      const char *entry,
                                      const char **args);

These functions create a program object, add it to the specified context, and compile the associated source code. For both of them:

- context is a valid context handle.
368. ype, int *numberOfValuesReturned);

This entry point retrieves the parameter's default value if valueType is equal to CG_DEFAULT. The components of the value are returned in row-major order as a pointer to an array of double elements. The number of components available in the array is returned in numberOfValuesReturned. cgGetParameterValues() can also be used to retrieve a parameter's constant values, but this functionality is rarely used; see the corresponding manual page for more details.

Shared Parameters

The core Cg runtime supports the creation of instances of any type of concrete parameter (e.g., built-in types, user-defined structures) within a Cg context. A parameter instance may be connected to any number of compatible parameters, including any program or effect parameter within the context.

When an instance is connected to another parameter, the second parameter will inherit its values from the instance. Furthermore, if the variability of the second parameter has not been explicitly set by a call to cgSetParameterVariability(), its variability will also be inherited from the instance.

The ability to create and easily manage shared, context-global parameters provides a powerful means for creating parameter trees and for sharing data and user-defined objects between multiple Cg programs or effects.

Shared Parameter Creation

Shared parameters
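Because the values come back as a flat array of doubles in row-major order, reading one matrix element needs a small index calculation. The helpers below are a hypothetical C sketch of that calculation; they are not part of the Cg runtime, and the caller is assumed to know the number of columns of the parameter.

```c
#include <stddef.h>

/* Offset of element (row, col) in a row-major array: all elements of
   row 0 come first, then all elements of row 1, and so on. */
size_t row_major_index(size_t row, size_t col, size_t ncols) {
    return row * ncols + col;
}

/* Read one element from a flat array of doubles laid out row by row,
   such as the array described above. */
double matrix_element(const double *values, size_t ncols,
                      size_t row, size_t col) {
    return values[row_major_index(row, col, ncols)];
}
```

For a float4x4 parameter the array would hold 16 values, and element (1, 0) would sit at offset 4.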
369. zles to Make the Most of Vectorization

The GPU can swizzle the values in vectors with no performance penalty (recall that a swizzle can be used to rearrange the elements of a vector). Given a vector:

    float4 a = float4(0, 1, 2, 3);

swizzles construct new vectors:

    a.xx = float2(0, 0);
    a.yzz = float3(1, 2, 2);
    a.zy = float2(2, 1);

and so forth. By swizzling your data carefully, you can still take advantage of vectorization, even when you don't want to use the same component of both vectors on both sides of your computation. For example, consider the computation of the cross product. Given two three-dimensional vectors, the cross product returns a new vector that is perpendicular to the given vectors. It is computed by

    float3 a, b;
    float3 c = float3(a.y * b.z - a.z * b.y,
                      a.z * b.x - a.x * b.z,
                      a.x * b.y - a.y * b.x);

Here we've again got a lot of arithmetic operations, each using a single pair of float values. Some cleverness lets us turn this into a vectorized operation. Below is the implementation of the cross() function from the Cg Standard Library, requiring just two vector multiply operations and one vector subtraction operation.

    float3 cross(float3 a, float3 b) {
      return a.yzx * b.zxy - a.zxy * b.yzx;
    }

Confirm for yourself that this computes the same value as the first section of code for the cross product; note that it exposes much more vectorized computation for the GPU to efficiently process.
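The equivalence the text asks you to confirm can be checked on the CPU. The following C sketch writes the cross product both ways, with the swizzles spelled out as small helper functions; all names here are hypothetical, and plain C has no swizzle syntax, so yzx() and zxy() only imitate it.

```c
/* Stand-in for the Cg float3 type. */
typedef struct { float x, y, z; } vec3;

/* Component-by-component cross product: six multiplies, three subtracts. */
vec3 cross_scalar(vec3 a, vec3 b) {
    vec3 c;
    c.x = a.y * b.z - a.z * b.y;
    c.y = a.z * b.x - a.x * b.z;
    c.z = a.x * b.y - a.y * b.x;
    return c;
}

/* Imitations of the .yzx and .zxy swizzles. */
vec3 yzx(vec3 v) { vec3 r; r.x = v.y; r.y = v.z; r.z = v.x; return r; }
vec3 zxy(vec3 v) { vec3 r; r.x = v.z; r.y = v.x; r.z = v.y; return r; }

/* Component-wise vector multiply and subtract, as a GPU does in one
   instruction each. */
vec3 mul3(vec3 a, vec3 b) { vec3 r; r.x = a.x * b.x; r.y = a.y * b.y; r.z = a.z * b.z; return r; }
vec3 sub3(vec3 a, vec3 b) { vec3 r; r.x = a.x - b.x; r.y = a.y - b.y; r.z = a.z - b.z; return r; }

/* Swizzled form: a.yzx * b.zxy - a.zxy * b.yzx,
   two vector multiplies and one vector subtract. */
vec3 cross_swizzled(vec3 a, vec3 b) {
    return sub3(mul3(yzx(a), zxy(b)), mul3(zxy(a), yzx(b)));
}
```

For a = (1, 2, 3) and b = (4, 5, 6), both functions return (-3, 6, -3).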
    