        Cg Toolkit User's Manual
         Contents
1.
        // Enable the profiles
        cgGLEnableProfile(vertexProfile);
        cgGLEnableProfile(fragmentProfile);

        // Bind the programs
        cgGLBindProgram(vertexProgram);
        cgGLBindProgram(fragmentProgram);

        // Enable texture
        cgGLEnableTextureParameter(baseTexture);

        // Draw scene

        // Disable texture
        cgGLDisableTextureParameter(baseTexture);

        // Disable the profiles
        cgGLDisableProfile(vertexProfile);
        cgGLDisableProfile(fragmentProfile);

        // Set the varying parameters
        cgGLDisableClientState(position);
        cgGLDisableClientState(color);
        cgGLDisableClientState(texCoord);
        }

        // Called before application shuts down
        void CgShutdown()
        {
            // This frees any runtime resource
            cgDestroyContext(context);
        }

    808 00504 0000 004  NVIDIA  Using the Cg Runtime Library

    OpenGL Error Reporting

    Here is the list of the CGerror errors specific to the OpenGL Cg runtime:

    - CG_PROGRAM_LOAD_ERROR: Returned when the program could not be loaded.
    - CG_PROGRAM_BIND_ERROR: Returned when the program could not be bound.
    - CG_PROGRAM_NOT_LOADED_ERROR: Returned when the program must be loaded before the operation may be used.
    - CG_UNSUPPORTED_GL_EXTENSION_ERROR: Returned when an unsupported OpenGL extension is required to perform the operation.

    Any OpenGL Cg runtime function can generate an OpenGL error in addition to the Cg-specific error. These errors are checked in Cg, as in any OpenGL application, by using glGetError().

    Direct3D Cg Runtime

    The Direct3D Cg runtime is
2.  To load a program in Direct3D 8, use cgD3D8LoadProgram():

        HRESULT cgD3D8LoadProgram(CGprogram program,
                                  BOOL parameterShadowingEnabled,
                                  DWORD assembleFlags,
                                  DWORD vertexShaderUsage,
                                  const DWORD* declaration);

    This function assembles the result of the compilation of program using D3DXAssembleShader() with assembleFlags as the D3DXASM flags. Depending on the program's profile, it then either uses IDirect3DDevice8::CreateVertexShader() to create a Direct3D vertex shader with declaration as the vertex declaration and vertexShaderUsage as the usage control, or uses IDirect3DDevice8::CreatePixelShader() to create a Direct3D pixel shader.

    The value of parameterShadowingEnabled should be set to TRUE to enable parameter shadowing for the program. This behavior can be changed after the program is created by calling cgD3DEnableParameterShadowing(). Here is a typical use of the function:

        HRESULT hresult = cgD3D8LoadProgram(vertexProgram, TRUE,
            D3DXASM_DEBUG, D3DUSAGE_SOFTWAREVERTEXPROCESSING,
            declaration);

        HRESULT hresult = cgD3D8LoadProgram(fragmentProgram, TRUE,
            0, 0, 0);

    If you want to apply the same vertex program to several sets of geometric data, each having a different layout, you need to load the program with different vertex declarations in Direct3D 8. To do so, you need to make a duplicate of the program, using cgCopyProgram(), for each of these declarations. Here is a code sam
3.
        D3DVSD_END()
        };

    If it is possible to do so, the functions cgD3D9ResourceToDeclUsage() and cgD3D8ResourceToInputRegister() convert a CGresource enumerated type into a Direct3D vertex shader input register:

        BYTE  cgD3D9ResourceToDeclUsage(CGresource resource);
        DWORD cgD3D8ResourceToInputRegister(CGresource resource);

    If the resource is not a vertex shader input resource, the call to cgD3D9ResourceToDeclUsage() returns CGD3D9_INVALID_REG and the call to cgD3D8ResourceToInputRegister() returns CGD3D8_INVALID_REG.

    To write the vertex declarations described above based on the program parameters, which eliminates the reference to any semantic, use cgD3D9ResourceToDeclUsage() or cgD3D8ResourceToInputRegister():

        CGparameter position = cgGetNamedParameter(program, "position");
        CGparameter color    = cgGetNamedParameter(program, "color");
        CGparameter texCoord = cgGetNamedParameter(program, "texCoord");

        const D3DVERTEXELEMENT9 declaration[] = {
            { 0, 0 * sizeof(float),
              D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
              cgD3D9ResourceToDeclUsage(
                  cgGetParameterResource(position)),
              cgGetParameterResourceIndex(position) },
            { 0, 3 * sizeof(float),
              D3DDECLTYPE_D3DCOLOR, D3DDECLMETHOD_DEFAULT,
              cgD3D9ResourceToDeclUsag
4.              TANGENT SPAC    mul  Model  mul  Model17    mul  Model             mul  ModelView   1 saturate  sqrt  dot  viewP xyz     normalize  normalize objL   objV      Tangent   Tangent   Tangent      float3x3 ModelViewIT      Wectors     mul   float3x3 ModelView   vert ONormal     vert OPosition      Walle   272      10   QU   9     viewP xyz     vectors  EyePosition vert OPosition xyz    LightVector         m  E     vectors   ob3L     195 W      objH          REFL          Generate       ECTION vector for per vertex             ll Sai Loeb    float3 reflection   reflect   viewV  viewN        Generate FRESNEL term   float ndv   saturate  dot  viewN  viewV       float FresnelApprox     pow  1 ndv   Fresnel z  Fresnel y    Fresnel x                                    Fill OUTPUT parameters  ALVA   vert uv     TEXCOORDO xy   os   tanL     Tangent space LIGHT     Tangent space HALF ANGLE  O halfangle   float4 tanH x  tanH y    tanH z  l exp  viewP w     O reflection   reflection     View space REFLECTION     Tangent space VIEW   distance attenuation  O view   float4 tanV x  tanV y           808 00504 0000 004 129    NVIDIA    Cg Language Toolkit                ELY ga     VIEWTANGENT  O tangent   normalize  View  O binormal   normalize View  O normal   normalize View  O fresn   FresnelApprox   return 0     vie  Tang  Pang  Tang     amp        A    wP w     ent 0       column 0  cie LI      collin 1  emela Dy 27 colma 2    Pixel Shader Source Code for Car Paint 9       This sh
5.
        v2f main(a2v IN,
                 uniform float4x4 WorldViewProj,
                 uniform float4 LightVector,   // in object space
                 uniform float4 EyePosition)   // in object space
        {
            v2f OUT;

            // pass texture coordinates for
            // fetching the diffuse map
            OUT.TexCoord0.xy = IN.TexCoord.xy;

            // pass texture coordinates for
            // fetching the normal map
            OUT.TexCoord1.xy = IN.TexCoord.xy;

            // compute the 3x3 transform from
            // tangent space to object space
            float3x3 objToTangentSpace;
            objToTangentSpace[0] = IN.T;
            objToTangentSpace[1] = IN.B;
            objToTangentSpace[2] = IN.N;

            // transform normal from
            // object space to tangent space
            OUT.Normal.xyz = 0.5 * mul(objToTangentSpace, IN.Normal) + 0.5;

            // transform light vector from
            // object space to tangent space
            float3 lightVectorInTangentSpace =
                mul(objToTangentSpace, LightVector.xyz);
            OUT.LightVector.xyz = lightVectorInTangentSpace;
            OUT.LightVectorUnsigned.xyz =
                0.5 * lightVectorInTangentSpace + 0.5;

            // compute view vector
            float3 viewVector =
                normalize(EyePosition.xyz - IN.Position.xyz);

            // compute half angle vector
            float3 halfAngleVector =
                normalize(LightVector.xyz + viewVector);

            // transform half angle vector from
            // object space to tangent space
            OUT.HalfAngleVector.xyz =
                mul(objToTangentSpace, halfAngleVector);

            // transform position to projection space
            OUT.Position = mul(WorldViewProj, IN.Position);

            return OUT;
6.  Examples

    The following examples illustrate how a developer can use Cg to achieve DirectX pixel shader 1_X functionality.

    Example 1

        struct VertexOut {
            float4 color     : COLOR0;
            float4 texCoord0 : TEXCOORD0;
            float4 texCoord1 : TEXCOORD1;
        };

        float4 main(VertexOut IN,
                    uniform sampler2D diffuseMap,
                    uniform sampler2D normalMap) : COLOR
        {
            float4 diffuseTexColor = tex2D(diffuseMap, IN.texCoord0.xy);
            float4 normal = 2 * (tex2D(normalMap, IN.texCoord1.xy) - 0.5);
            float3 light_vector = 2 * (IN.color.rgb - 0.5);
            float4 dot_result = saturate(dot(light_vector, normal.xyz).xxxx);
            return dot_result * diffuseTexColor;
        }

    Example 2

        struct VertexOut {
            float4 texCoord0 : TEXCOORD0;
            float4 texCoord1 : TEXCOORD1;
            float4 texCoord2 : TEXCOORD2;
            float4 texCoord3 : TEXCOORD3;
        };

        float4 main(VertexOut IN,
                    uniform sampler2D normalMap,
                    uniform sampler2D intensityMap,
                    uniform sampler2D colorMap) : COLOR
        {
            float4 normal = 2 * (tex2D(normalMap, IN.texCoord0.xy) - 0.5);
            float2 intensCoord = float2(dot(IN.texCoord1.xyz, normal.xyz),
                                        dot(IN.texCoord2.xyz, normal.xyz));
            float4 intensity = tex2D(intensityMap, intensCoord);
            float4 color = tex2D(colorMap, IN.texCoord3.xy);
            return color * intensity;
        }

    OpenGL NV_vertex_program 1.0 Profile (vp20)

    The vp20 Vertex Program profile is used to compile Cg
7.       OUT refractVec xyz   refract eyeToVert  normal  theta    OUT refractVec w   1           OUT reflectVec xyz   reflect  eyeToVert  normal    OUT reflectVec w   1        calculace the fresnel reflection  QUIN  1rResaclcin   rast txssmel  eya lovere  normal   closets  5 07 LO  0 0   7                return OUT     Pixel Shader Source Code for Refraction          float4 main in float3 refractVec MESS O ORD O  in float3 reflectVec MECO ORD  in float3 fresnelTerm COMODO         uniform samplerCUBE environmentMaps 2    uniform float enableRefraction   uniform float enableFresnel    COLOR          float3 refractColor   texCUBE  environmentMaps 0    refractVec  rgb   float3 reflectColor   texCUBE  environmentMaps 1    reflectVec   rgb        float3 reflectRefract   lerp refractColor  reflectColor   fresnelTerm            float3 finalColor   enableRefraction     enableFresnel   reflectRefract   refractColor    enableFresnel   reflectColor   fresnelTerm            return float4 finalColor  1 0         808 00504 0000 004 151  NVIDIA    Cg Language Toolkit       Shadow Mapping    Description    This effect shows generating texture coordinates for shadow mapping  along  with using the shadow map in the lighting equation per pixel  Figure 19         Figure 19 Example of Shadow Mapping       152 808 00504 0000 004  NVIDIA    Basic Profile Sample Shaders    Vertex Shader Source Code for Shadow Mapping    struct appdata      he    IO RSS Son  OSA ION  float3 Normal   NORMAL     struct vpc
8.  offsettex2D(uniform sampler2D tex, float2 st,
                float4 prevlookup, uniform float4 m)

        Performs the following:

            float2 newst = st + m.xy * prevlookup.xx + m.zw * prevlookup.yy;
            return tex2D(tex, newst);

        where st are texture coordinates associated with sampler tex, prevlookup is the result of a previous texture operation, and m is the 2-D bump environment mapping matrix.

        This function can generate the texbem instruction in all ps_1_x profiles.

    Table 40  ps_1_x Auxiliary Texture Functions (continued)

    Texture Function / Description

    offsettex2DScaleBias(uniform sampler2D tex, float2 st,
                         float4 prevlookup, uniform float4 m,
                         uniform float scale, uniform float bias)

        Performs the following:

            float2 newst = st + m.xy * prevlookup.xx + m.zw * prevlookup.yy;
            float4 result = tex2D(tex, newst);
            return result * saturate(prevlookup.z * scale + bias);

        where st are texture coordinates associated with sampler tex, prevlookup is the result of a previous texture operation, m is the 2-D bump environment mapping matrix, scale is the 2-D bump environment mapping scale factor, and bias is the 2-D bump environment mapping offset.

        This function can generate the texbeml instruction in all ps_1_x profiles.

    tex1D_dp3(sampler1D tex, float3 str, float4 prevlookup)

        Performs the following:

            return tex1D(tex, dot(str, prevlookup.xyz));

        where str are texture coordi
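    The coordinate arithmetic in the two offsettex2D entries above can be checked on the CPU. The following is a minimal C sketch, not part of the Cg runtime: the Float2 and Float4 structs and the function names offset_coords and scale_bias_factor are hypothetical stand-ins, and the texture fetch itself is left out.

```c
/* Hypothetical stand-ins for Cg's float2 and float4 types. */
typedef struct { float x, y; } Float2;
typedef struct { float x, y, z, w; } Float4;

/* newst = st + m.xy * prevlookup.xx + m.zw * prevlookup.yy,
 * written out per component. */
Float2 offset_coords(Float2 st, Float4 prevlookup, Float4 m)
{
    Float2 newst;
    newst.x = st.x + m.x * prevlookup.x + m.z * prevlookup.y;
    newst.y = st.y + m.y * prevlookup.x + m.w * prevlookup.y;
    return newst;
}

/* saturate(prevlookup.z * scale + bias): the factor that
 * offsettex2DScaleBias multiplies into the fetched texel. */
float scale_bias_factor(Float4 prevlookup, float scale, float bias)
{
    float v = prevlookup.z * scale + bias;
    if (v < 0.0f) return 0.0f;
    if (v > 1.0f) return 1.0f;
    return v;
}
```

    With m = (2, 0, 0, 4), the s coordinate is offset by twice prevlookup.x and the t coordinate by four times prevlookup.y, which is the 2x2 matrix reading of m that the table describes.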
9.      use the lit instruction to calculate lighting      automatically clamp  float4 lighting   lit diffuse  specular  32         output final lighting results  OUT diffCol    float4 lighting y   OUT specCol    float4 lighting z        return OUT     Pixel Shader Source Code for Thin Film Effect    SRCE ZE      E locie  lied  oL   COLORO   float3 specCol COMO        float2 filmDepth   TEXCOORDO   y        void main  v2f IN   e aeiloync4  Colkoie 3  COMO   uniform sampler2D fringeMap   uniform sampler2D diffMap        diffuse material color  logics  Carrol   eloacS  0 3  0 3  O 5 8       lookup fringe value based on view depth  float3 fringeCol    float3 tex2D fringeMap  IN filmDepth          modulate specular lighting by fringe color       combine with regular lighting   color rgb   fringeCol IN specCol   IN diffCol diffCol   Coloma   ban       126    808 00504 0000 004  NVIDIA    Advanced Profile Sample Shaders       Car Paint 9    Description    This car paint shader uses gonioreflectomettic paint samples measured by  Cornell University  The samples were converted into a 2D texture map which is  indexed using NdotL and NdotH as the s t coordinate pair  and which provides  the diffuse component of our lighting equation  The specular term is calculated  using the Blinn model  and also includes a term which simulates the clear coat s  metallic flecks     The fleck normal mipmap chain has randomly generated vectors which reside  within a positive Z cone in tangent space  The con
10. 2

    could generate the following pixel shader instruction, assuming x is in t0, y is in t1, and z is in r0:

        add_d2 r0, t0_bias, t1

    Table 34 summarizes how different DirectX pixel shader 1_X instruction set modifiers are expressed in Cg programs. For more details on the context in which each modifier is allowed and ways in which modifiers may be combined, refer to the DirectX pixel shader 1_X documentation.

    Table 34  ps_1_x Instruction Set Modifiers

        Instruction/Register Modifier    Cg Expression
        instr_x2                         2*x
        instr_x4                         4*x
        instr_d2                         x/2
        instr_sat                        saturate(x) (i.e. min(1, max(x, 0)))
        reg_bias                         x-0.5
        1-reg                            1-x
        -reg                             -x
        reg_bx2                          2*(x-0.5)

    Language Constructs and Support

    Data Types

    In the ps_1_X profiles, operations occur on signed clamped floating-point values in the range -MaxPixelShaderValue to MaxPixelShaderValue, where MaxPixelShaderValue is determined by the DirectX implementation. These profiles allow all data types to be used, but all operations are carried out in the above range. Refer to the DirectX pixel shader 1_X documentation for more details.

    Statements and Operators

    The DirectX pixel shader 1_X profiles support all of the Cg language constructs, with the following exceptions:

    - Arbitrary swizzles are not supported (though arbitrary write masks are). Only the following swizzles are allowed: .x
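    The modifier table above maps directly onto small arithmetic helpers. The following C sketch is only an illustration of the Cg expressions in Table 34: the function names are hypothetical, and the clamping to the MaxPixelShaderValue range that the hardware applies is deliberately not modeled.

```c
/* Instruction modifiers: applied to the result of an instruction. */
float mod_x2(float x)  { return 2.0f * x; }
float mod_x4(float x)  { return 4.0f * x; }
float mod_d2(float x)  { return x / 2.0f; }
float mod_sat(float x) { return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x); }

/* Register modifiers: applied to a source operand before use. */
float mod_bias(float x)   { return x - 0.5f; }
float mod_invert(float x) { return 1.0f - x; }
float mod_negate(float x) { return -x; }
float mod_bx2(float x)    { return 2.0f * (x - 0.5f); }

/* "add_d2 r0, t0_bias, t1" read as r0 = ((t0 - 0.5) + t1) / 2. */
float add_d2_bias(float t0, float t1)
{
    return mod_d2(mod_bias(t0) + t1);
}
```

    For example, add_d2_bias(1.0f, 0.5f) biases t0 to 0.5, adds t1 to reach 1.0, and halves the sum.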
11. Binding Semantics for Varying Input/Output Data

    Only the binding semantic names need be given for these profiles. The vertex parameter input registers are allocated dynamically. All the semantic names, except POSITION, can have a number from 0 to 15 after them.

    Table 11  vs_2_* Varying Input Binding Semantics

        POSITION       PSIZE
        BLENDWEIGHT    BLENDINDICES
        NORMAL         TEXCOORD
        COLOR          TANGENT
        TESSFACTOR     BINORMAL

    Table 12 summarizes the valid binding semantics for varying output parameters in the vs_2_0 and vs_2_x profiles. These map to output registers in DirectX 9 vertex shaders.

    Table 12  vs_2_* Varying Output Binding Semantics

        Binding Semantics Name    Corresponding Data
        POSITION                  Output position: oPos
        PSIZE                     Output point size: oPts
        FOG                       Output fog value: oFog
        COLOR0-COLOR1             Output color values: oD0, oD1
        TEXCOORD0-TEXCOORD7       Output texture coordinates: oT0-oT7

    Options

    The vs_2_x profile allows the following profile-specific options:

        DynamicFlowControlDepth=<n>    where n = 0 or 24 (default 24)
        NumTemps=<n>                   where 12 <= n <= 32 (default 16)
        Predication                    (default true)

    DirectX Pixel Shader 2.x Profiles (ps_2_*)

    Memory

    The DirectX Pixel Shader 2.0 Profiles are used to compile Cg source code to DirectX
12.     It will be moved along the direction from   ll light to vertex to extrude the shadow volume   float away    float   ndotl    0               Move the back facing shadow volume points  loci new Position   eztr  sion VES Y away r imss DOS        Transform position to hclip space   OUT Hposition   mul WorldViewProj  new position            Set the color to blue for when the shadow volume  Teh is rendered in color for illustrative purposes  float4 color   float4 0  0  Factors x  0      OUr Colori   colos  OUT TexCoord0 xy   IN TexCoord0   return OUT           808 00504 0000 004 157  NVIDIA    Cg Language Toolkit       Sine Wave Demo    Description    This effect modifies the vertex positions using a sine function based on the  current time  It demonstrates use of the built in sin    function  It also  computes a normal based on the perturbed mesh  and uses this to compute a  reflection vector to look up in a cube map  Figure 21         Figure 21 Example of Sine Wave       158 808 00504 0000 004  NVIDIA    Basic Profile Sample Shaders    Vertex Shader Source Code for Sine Wave    struct appdata    float4 TexCoord0   TEXCOORDO        he    SLUG wou  float4 HPOS   POSITION   float4 COLO   COLORO   float4 TEXO   TEXCOORDO                 H    vpconn main appdata IN   uniform float4x4 WorldViewProj   uniform float3x4 WorldView   uniform float3x3 WorldViewIT   uniform float3 WavesX   Quantus 3EllexeHr S WavesY   uniform float3 WavesH   uniform float3 Time    vpconn OUT     float3 a
13.     Ola  0   Olan     come lasilies  Joa   Inculiesi   Ola   0     Ola  0   Ola  p           actually constants   could be done in VP or on CPU   half irisSize   BallData RADIUS     sart  UL Ola Beles  TRES DARIK   SEUL DEE  IRIS  DUBINI  p  half irisScale   0 3333h   max 0 01h  irisSize    hakk rise   ysl Deca  RADIUS   BaLlDAta IRIS DUPINA  nubes   mol cc    loll itGicie 4   Javedbit 9  S sie  0   OOO  ICE 5x GRAS  Teruras S  mole lesbi   half D    dot  pupilCenter  xAxis     Panke sico   duoi rs   12 DS E   half4 planeEquation   half4 xAxis  D                           view vector TO surface   half3 Vn   normalize IN OPosition   IN VPosition     half3 Nf   normalize IN N     half3 Ln   IN LightVecO xyz    ohlar S Diker Lukee   JbitsesECIo loe   Sciuanas  lot  UNE  Ia     9  half3 missColor AmbiColor   baseTex   DiffLight    half3 DiffPupil   AmbiColor   saturate  dot  xAxis   Ln             half3 halfAng   normalize  Ln   Vn     half ndh   abs  dot  Nf halfAng      half specl   pow ndh  GlossData PHONG     half s2   smoothstep GlossData GLOSS1  GlossData GLOSS2   specl     specl   lerp GlossData DROP  speci  s2     half3 SpecularLight   SpecColor   specl           half3 hitColor   missColor     if  slice  gt   0 0h          808 00504 0000 004 117  NVIDIA    Cg Language Toolkit    half gradedEta   BallData ETA   gradedEta   1 0h gradedEta   half3 faceColor   BgColor        half3 refVector   refract Vn  Nf  gradedEta       dot refVector  refVector   gt  0        now let s in
14. Product of row vector v and matrix M, as shown below:

        mul(v, M) = [ v1 v2 v3 v4 ] | M11 M12 M13 M14 |
                                    | M21 M22 M23 M24 |
                                    | M31 M32 M33 M34 |
                                    | M41 M42 M43 M44 |

    If v is a 1xA vector and M is an AxB matrix, returns a 1xB vector.

    noise(x)
        Either a 1-, 2-, or 3-dimensional noise function, depending on the type of its argument. The returned value is between zero and one and is always the same for a given input value.

    pow(x, y)
        x^y

    radians(x)
        Degree-to-radian conversion.

    Table 1  Mathematical Functions (continued)

    Function / Description

    round(x)
        Closest integer to x.

    rsqrt(x)
        Reciprocal square root of x; x must be greater than zero.

    sign(x)
        1 if x > 0; -1 if x < 0; 0 otherwise.

    sin(x)
        Sine of x.

    sincos(float x, out s, out c)
        s is set to the sine of x, and c is set to the cosine of x. If sin(x) and cos(x) are both needed, this function is more efficient than calculating each individually.

    sinh(x)
        Hyperbolic sine of x.

    smoothstep(min, max, x)
        For values of x between min and max, returns a smoothly varying value that ranges from 0 at x = min to 1 at x = max. x is clamped to the range [min, max] and then the interpolation formula is evaluated:

            -2*((x-min)/(max-min))^3 + 3*((x-min)/(max-min))^2

    step(a, x)
        0 if x < a; 1 if x >= a.

    sqrt(x)
        Square root of x; x must be greater than zero.

    tan(x)
        Tangent of
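    The smoothstep, step, and sign entries above are fully specified by their descriptions, so they can be written out in a few lines. The C sketch below follows the table text; the cg_ prefixed names are hypothetical, and cg_smoothstep assumes min < max, which the table does not state.

```c
/* Clamp x to [min, max], then evaluate -2*t^3 + 3*t^2
 * with t = (x - min) / (max - min). Assumes min < max. */
float cg_smoothstep(float min, float max, float x)
{
    float t;
    if (x < min) x = min;
    if (x > max) x = max;
    t = (x - min) / (max - min);
    return -2.0f * t * t * t + 3.0f * t * t;
}

/* 0 if x < a, 1 if x >= a. */
float cg_step(float a, float x)
{
    return x < a ? 0.0f : 1.0f;
}

/* 1 if x > 0, -1 if x < 0, 0 otherwise. */
float cg_sign(float x)
{
    if (x > 0.0f) return 1.0f;
    if (x < 0.0f) return -1.0f;
    return 0.0f;
}
```

    At the midpoint of the range t is 0.5, so the polynomial gives -0.25 + 0.75 = 0.5; outside the range the clamp pins the result to 0 or 1.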
15.
        // This frees any core runtime resources.
        // The minimal interface has no dynamic storage to free.
        cgDestroyContext(context);
        }

    Direct3D 8 Application

    The following C code links the previous vertex and fragment programs to the Direct3D 8 application.

        #include <cg/cg.h>
        #include <cg/cgD3D8.h>

        IDirect3DDevice8* device;     // Initialized somewhere else
        IDirect3DTexture8* texture;   // Initialized somewhere else
        D3DXMATRIX matrix;            // Initialized somewhere else
        D3DXCOLOR constantColor;      // Initialized somewhere else

        CGcontext context;
        CGprogram vertexProgram, fragmentProgram;
        DWORD vertexShader, pixelShader;
        CGparameter baseTexture, someColor, modelViewMatrix;

        // Called at application startup
        void OnStartup()
        {
            // Create context
            context = cgCreateContext();
        }

        // Called whenever the Direct3D device needs to be created
        void OnCreateDevice()
        {
            // Create the vertex shader
            vertexProgram = cgCreateProgramFromFile(context, CG_SOURCE,
                "VertexProgram.cg", CG_PROFILE_VS_1_1, "VertexProgram", 0);
            CComPtr<ID3DXBuffer> byteCode;
            const char* progSrc = cgGetProgramString(vertexProgram,
                CG_COMPILED_PROGRAM);

            // Normally, you also grab the constants and prepend them
            // to your vertex declaration. Not shown here for brevity.
            D3DXAssembleShader(progSrc, strlen(progSrc), 0, 0, 0,
                &byteCode, 0);

            // If your program uses explicit binding semantic
16. fp30 profiles and will also be supported by all future profiles that have texture-mapping capabilities. All of the functions in Table 3 return a float4 value.

    Because of the limited pixel programmability of older hardware, the ps_1 and fp20 profiles use a different set of texture-mapping functions. See "Language Profiles" on page 195 for more information.

    Table 3  Texture Map Functions

        Function                                           Description
        tex1D(sampler1D tex, float s)                      1D nonprojective
        tex1D(sampler1D tex, float s,
              float dsdx, float dsdy)                      1D nonprojective with derivatives
        tex1D(sampler1D tex, float2 sz)                    1D nonprojective depth compare
        tex1D(sampler1D tex, float2 sz,
              float dsdx, float dsdy)                      1D nonprojective depth compare with derivatives
        tex1Dproj(sampler1D tex, float2 sq)                1D projective
        tex1Dproj(sampler1D tex, float3 szq)               1D projective depth compare
        tex2D(sampler2D tex, float2 s)                     2D nonprojective
        tex2D(sampler2D tex, float2 s,
              float2 dsdx, float2 dsdy)                    2D nonprojective with derivatives
        tex2D(sampler2D tex, float3 sz)                    2D nonprojective depth compare

    Table 3  Texture Map Functions (continued)

        Function                                           Description
        tex2D(sampler2D tex, float3 sz,
              float2 dsdx, float2 dsdy)                    2D nonprojective depth compare with derivatives
        tex2Dproj(sampler2D tex, float3 sq)                2D projective
17. If fewer values are set than the parameter requires, the last value is smeared. The cgGLSetParameter functions may be called for either uniform or varying parameters. When called for a varying parameter, the appropriate immediate-mode OpenGL entry point is called.

    The corresponding parameter value retrieval functions are as follows:

        cgGLGetParameter1f(CGparameter parameter, float* array);
        cgGLGetParameter1d(CGparameter parameter, double* array);
        cgGLGetParameter2f(CGparameter parameter, float* array);
        cgGLGetParameter2d(CGparameter parameter, double* array);
        cgGLGetParameter3f(CGparameter parameter, float* array);
        cgGLGetParameter3d(CGparameter parameter, double* array);
        cgGLGetParameter4f(CGparameter parameter, float* array);
        cgGLGetParameter4d(CGparameter parameter, double* array);

    Setting Uniform Matrix Parameters

    The cgGLSetMatrixParameter functions are used to set any matrix:

        void cgGLSetMatrixParameterfr(CGparameter parameter,
                                      const float* matrix);
        void cgGLSetMatrixParameterfc(CGparameter parameter,
                                      const float* matrix);
        void cgGLSetMatrixParameterdr(CGparameter parameter,
                                      const double* matrix);
        void cgGLSetMatrixParameterdc(CGparameter parameter,
                                      const double* matrix);

    The matrix is passed as an array of floating-point values whose size matches the number of coefficients of the matrix. The r suffix is for functions that assume the matrix is laid out in row o
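    Since the matrix setters take a flat array, the storage order of that array matters. The C sketch below converts a matrix stored row by row into the column-by-column arrangement. It assumes that the c-suffixed functions expect that counterpart layout, which the truncated sentence above does not confirm, and row_to_column_major is a hypothetical helper name rather than a Cg runtime function.

```c
/* Copy a rows x cols matrix from row-by-row storage (src)
 * into column-by-column storage (dst). Both arrays hold
 * rows * cols floats and must not overlap. */
void row_to_column_major(const float *src, float *dst,
                         int rows, int cols)
{
    int r, c;
    for (r = 0; r < rows; ++r) {
        for (c = 0; c < cols; ++c) {
            dst[c * rows + r] = src[r * cols + c];
        }
    }
}
```

    For a 2x3 matrix stored as {1, 2, 3, 4, 5, 6}, the converted array is {1, 4, 2, 5, 3, 6}: each column of the original becomes a contiguous run.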
18. More precisely, it is the number of floating-point values required to store a parameter of type type. This function does not apply to some types, like the sampler types, in which case it returns zero. It is useful because applications can determine how many floating-point values they have to provide to set the value of a given parameter.

    Minimal Interface Program Examples

    In this section we provide some code samples that illustrate how and when to use functions from the minimal interface to make Cg programs work with Direct3D. To enhance clarity, the examples do very little error checking, but a production application should check the return values of all Cg functions. The vertex and fragment programs below are referenced in "Direct3D 9 Application" on page 64 and "Direct3D 8 Application" on page 67.

    Vertex Program

    The following Cg code is assumed to be in a file called VertexProgram.cg.

        void VertexProgram(
            in float4 position : POSITION,
            in float4 color : COLOR0,
            in float4 texCoord : TEXCOORD0,
            out float4 positionO : POSITION,
            out float4 colorO : COLOR0,
            out float4 texCoordO : TEXCOORD0,
            const uniform float4x4 ModelViewMatrix)
        {
            positionO = mul(position, ModelViewMatrix);
            colorO = color;
            texCoordO = texCoord;
        }

    Fragment Program

    The following Cg code is assumed to be in a file called FragmentProgram.cg.

        void FragmentProgram(
            in float4 color : COLOR0,
            in float4 texCoord : TEXCOORD0,
            out float4 colorO
19. Pixel Shader Source Code for Bump Dot3x2

        struct v2f {
            float4 Position : POSITION;             // in projection space
            float4 Normal : COLOR0;                 // in tangent space
            float4 LightVectorUnsigned : COLOR1;    // in tangent space
            float3 TexCoord0 : TEXCOORD0;
            float3 TexCoord1 : TEXCOORD1;
            float4 LightVector : TEXCOORD2;         // in tangent space
            float4 HalfAngleVector : TEXCOORD3;     // in tangent space
        };

        float4 main(v2f IN,
                    uniform sampler2D DiffuseMap,
                    uniform sampler2D NormalMap,
                    uniform sampler2D IlluminationMap,
                    uniform float Ambient) : COLOR
        {
            // fetch base color
            float4 color = tex2D(DiffuseMap, IN.TexCoord0.xy);

            // fetch bump normal and expand it to [-1,1]
            float4 bumpNormal =
                2 * (tex2D(NormalMap, IN.TexCoord1.xy) - 0.5);

            // compute the dot product between
            // the bump normal and the light vector,
            // compute the dot product between
            // the bump normal and the half angle vector,
            // fetch the illumination map using
            // the result of the two previous dot products
            // as texture coordinates
            // returns the diffuse color in the
            // rgb color components and the specular color in the
            // alpha component
            float2 illumCoord =
                float2(dot(IN.LightVector.xyz, bumpNormal.xyz),
                       dot(IN.HalfAngleVector.xyz, bumpNormal.xyz));
            float4 illumination = tex2D(IlluminationMap, illumCoord);

            // expand iterated normal to [-1,1]
            float4 normal = 2 * (IN.Normal - 0.5);

            // compute self shadowing
20. The arithmetic operator % is the remainder operator, as in C. It may only be applied to two operands of cint or int type.

    When / or % is used with cint or int operands, C rules for integer / and % apply.

    The C operators that combine assignment with arithmetic operations (such as +=) are also supported when the corresponding arithmetic operator is supported by Cg.

    Conditional Operator (?:)

    If the first operand is of type bool, one of the following statements must hold for the second and third operands:

    - Both operands have compatible structure types.
    - Both operands are scalars with numeric or bool type.
    - Both operands are vectors with numeric or bool type, where the two vectors are of the same size, which is less than or equal to four.

    If the first operand is a packed vector of bool, then the conditional selection is performed on an elementwise basis. Both the second and third operands must be numeric vectors of the same size as the first operand.

    Unlike C, side effects in the expressions in the second and third operands are always executed, regardless of the condition.

    Miscellaneous Operators (typecast, ,)

    Cg supports C's typecast and comma operators.

    Reserved Words

    The following are the reserved words in Cg:

        asm
        bool
        catch
        column_major
        const_cast
        default
        do
        dynamic_cast
        enum
        false
        for
        goto
        in
        int
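    The conditional-operator rule above is the one place in this excerpt where Cg departs from C in evaluation order, so a side-by-side comparison may help. The C sketch below is an emulation, not Cg code: select_c uses C's own ?: (only the chosen operand runs), while select_cg_like evaluates both operands first to mimic the behavior the manual describes. All names are hypothetical.

```c
/* Call counters make the side effects observable. */
int calls_a = 0;
int calls_b = 0;

int pick_a(void) { ++calls_a; return 10; }
int pick_b(void) { ++calls_b; return 20; }

/* Plain C: only the selected operand is evaluated. */
int select_c(int cond)
{
    return cond ? pick_a() : pick_b();
}

/* Emulates the Cg rule: both operands are evaluated,
 * then one of the two results is selected. */
int select_cg_like(int cond)
{
    int a = pick_a();
    int b = pick_b();
    return cond ? a : b;
}
```

    After one call to select_c(1), only calls_a has advanced. After one call to select_cg_like(0), both counters have advanced even though only the second result is returned. A Cg expression with side effects in either branch therefore cannot rely on the condition to suppress them.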
21. User's Manual

    A Developer's Guide to Programmable Graphics

    Release 1.1
    February 2003

    Cg Language Toolkit

    ALL NVIDIA DESIGN SPECIFICATIONS, REFERENCE BOARDS, FILES, DRAWINGS, DIAGNOSTICS, LISTS, AND OTHER DOCUMENTS (TOGETHER AND SEPARATELY, "MATERIALS") ARE BEING PROVIDED "AS IS." NVIDIA MAKES NO WARRANTIES, EXPRESSED, IMPLIED, STATUTORY, OR OTHERWISE WITH RESPECT TO THE MATERIALS, AND EXPRESSLY DISCLAIMS ALL IMPLIED WARRANTIES OF NONINFRINGEMENT, MERCHANTABILITY, AND FITNESS FOR A PARTICULAR PURPOSE.

    Information furnished is believed to be accurate and reliable. However, NVIDIA Corporation assumes no responsibility for the consequences of use of such information or for any infringement of patents or other rights of third parties that may result from its use. No license is granted by implication or otherwise under any patent or patent rights of NVIDIA Corporation. Specifications mentioned in this publication are subject to change without notice. This publication supersedes and replaces all information previously supplied. NVIDIA Corporation products are not authorized for use as critical components in life support devices or systems without express written approval of NVIDIA Corporation.

    Trademarks

    NVIDIA and the NVIDIA logo are trademarks of NVIDIA Corporation.

    Microsoft, Windows, the Windows logo, and DirectX are registered trademarks of Microsoft Corporation.

    OpenGL is a trademark of SGI.

    Other company and product name
22. provided these inputs are not referenced. This allows Cg programs to have the same structure specify the varying output of a vp20 profile program and the varying input of a fp20 profile program.

    Table 49 summarizes the valid binding semantics for varying output parameters in the fp20 profile.

    Table 49  fp20 Varying Output Binding Semantics

        Binding Semantics Name       Corresponding Data
        COLOR, COLOR0, COL, COL0     Output color (float4)
        DEPR, DEPTH                  Output depth (float)

    The output depth value is special in that it may only be assigned a value of the form

        float4 t = <texture shader operation>;
        float z = dot(texCoord<n>, t.xyz);
        float w = dot(texCoord<n+1>, t.xyz);
        depth = z / w;

    Auxiliary Texture Functions

    Because the capabilities of the texture shader instructions are limited in NV_texture_shader, a set of auxiliary functions are provided in these profiles that express the functionality of the more complex texture shader instructions. These functions are merely provided as a convenience for writing fp20 Cg programs. The same result can be achieved by writing the expanded form of each function directly. Using the expanded form has the additional advantage of being supported on other profiles.

    Table 50 summarizes these functions.

    Table 50  fp20 Auxiliary Texture Functions

    Texture Function / Description

    offsettex2D
23.
    void cgGLSetParameter1f(CGparameter parameter, float x);
    void cgGLSetParameter1fv(CGparameter parameter, const float* array);
    void cgGLSetParameter1d(CGparameter parameter, double x);
    void cgGLSetParameter1dv(CGparameter parameter, const double* array);
    void cgGLSetParameter2f(CGparameter parameter, float x, float y);
    void cgGLSetParameter2fv(CGparameter parameter, const float* array);
    void cgGLSetParameter2d(CGparameter parameter, double x, double y);
    void cgGLSetParameter2dv(CGparameter parameter, const double* array);
    void cgGLSetParameter3f(CGparameter parameter, float x, float y, float z);
    void cgGLSetParameter3fv(CGparameter parameter, const float* array);
    void cgGLSetParameter3d(CGparameter parameter, double x, double y, double z);
    void cgGLSetParameter3dv(CGparameter parameter, const double* array);
    void cgGLSetParameter4f(CGparameter parameter, float x, float y, float z, float w);
    void cgGLSetParameter4fv(CGparameter parameter, const float* array);
    void cgGLSetParameter4d(CGparameter parameter, double x, double y, double z, double w);
    void cgGLSetParameter4dv(CGparameter parameter, const double* array);
The digit in the name of these functions indicates how many scalar values the function sets. The v suffix marks functions that take an array of values instead of individual arguments.
If more values are set than the parameter requires, the extra values are ignored
24.   CG_PROFILE_ARBVP1, "main", args);
CG_SOURCE indicates that myVertexProgramString, a string argument, contains Cg source code, not precompiled object code. Indeed, the Cg runtime also lets you create a program from precompiled object code, if you want to.
CG_PROFILE_ARBVP1 is the profile the program is to be compiled to. The "main" parameter gives the name of the function to use as the main entry point when the program is executed. Lastly, args is a null-terminated list of null-terminated strings that is passed as an argument to the compiler.
Loading a Program
After you compile a program, you need to pass the resulting object code to the 3D API that you're using. For this, you need to invoke the Cg runtime's API-specific functions.
The Direct3D-specific functions require the Direct3D device structure in order to make the necessary Direct3D calls. The application passes it to the runtime using the following call:
    cgD3D9SetDevice(Device);
You must do this every time a new Direct3D device is created, typically only at the beginning of the application.
You can then load a Cg program in this way for the Direct3D 9 Cg runtime:
    cgD3D9LoadProgram(program, CG_FALSE, 0);
or this way for the Direct3D 8 Cg runtime:
    cgD3D8LoadProgram(program, CG_FALSE, 0, 0, vertexDeclaration);
The parameter vertexDeclaration is the Direct3D 8 vertex declaration array that
25.   - Cg Developer's CD
  The CD provided with this book contains the entire Cg release, which allows you to get started immediately. The readme.txt file on the CD describes the contents of the release in detail.
You can begin working with Cg immediately by reading the "Introduction to the Cg Language" on page 1 and then going through "A Brief Tutorial" on page 89. Once you have a basic understanding of the Cg language, use the "Advanced Profile Sample Shaders" on page 97 and "Basic Profile Sample Shaders" on page 133 as a basis to build your own effects.
Release Notes
Release notes for Cg are now contained in a separate document that is part of the Cg distribution.
Please report any bugs, issues, and feedback to NVIDIA by e-mailing cgsupport@nvidia.com. We will expeditiously address any reported problems.
Online Updates
Any changes, additions, or corrections are posted at the NVIDIA Cg Web site: http://developer.nvidia.com/Cg
Refer to this site often to keep up on the latest changes and additions to the Cg language. Information on how to report any bugs you may find in the release is also available on this site.
Introduction to the Cg Language
Historically, graphics hardware has been programmed at a very low level. Fixed-function pipelines were configured by setting states such as the texture-combining modes. More recently, programmers con
26.   Scalar types may be implicitly converted to vectors and matrices of compatible type. The scalar is replicated to all elements of the vector or matrix. Scalar types may also be explicitly cast to structure types if the scalar type can be legally cast to every member of the structure.
Vector conversions
Vectors may be converted to scalar types (the first element of the vector is selected). A warning is issued if this is done implicitly. A vector may also be implicitly converted to another vector of the same size and compatible element type.
A vector may be converted to a smaller comparable vector or a matrix of the same total size, but a warning is issued if an explicit cast is not used.
Matrix conversions
Matrices may be converted to a scalar type (element (0,0) is selected). As with vectors, this causes a warning if it is done implicitly. A matrix may also be converted implicitly to a matrix of the same size and shape and comparable element type.
A matrix may be converted to a smaller matrix type (the upper right sub-matrix is selected), or to a vector of the same total size, but a warning is issued if an explicit cast is not used.
Structure conversions
A structure may be explicitly cast to the type of its first member or to another structure type with the same number of members, if each member of the struct can be converted to the corresponding member of the new struct. No implicit conversions of struct types are allowed.
27.   This profile implements data types as follows:
- float data types are implemented as IEEE 32-bit single precision.
- half and double data types are implemented as float.
- int data type is supported using floating point operations, which add extra instructions for proper truncation for divides, modulos, and casts from floating point types.
- fixed or sampler* data types are not supported, but the profile does provide the minimal partial support that is required for these data types by the core language specification; that is, it is legal to declare variables using these types, as long as no operations are performed on the variables.
Bindings
Binding Semantics for Uniform Data
Table 41 summarizes the valid binding semantics for uniform parameters in the vp20 profile.
Table 41  vp20 Uniform Input Binding Semantics
    Binding Semantics Name                      Corresponding Data
    register(c0) - register(c95), C0 - C95      Constant register [0..95]. The aliases c0 - c95 (lowercase) are also accepted. If used with a variable that requires more than one constant register (for example, a matrix), the semantic specifies the first register that is used.
Binding Semantics for Varying Input/Output Data
Table 42 summarizes the valid binding semantics for varying input parameters in the vp20 profile.
One can also use TANGENT and BINORMAL instead of TEXCOORD6 and TEXCOORD7. A second se
28.   although many of the principles are more broadly applicable.
1. Program for Vectorization
The GPU can generally perform four arithmetic operations as quickly as it can perform a single operation. Therefore, if you have two vectors of four floating point values:
    float4 a, b;
you can add the two vectors together:
    float4 c = a + b;
with no more computational expense than adding together two of their elements:
    float d = a.x + b.x;
This has two implications for efficient programming. First, you should try to write code that naturally maps to these vector operations. If you want to add two float4 variables together, it may be substantially less efficient to write it this way:
    float4 c = float4(a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w);
than to write it this way:
    float4 c = a + b;
The compiler does its best to find vectorization in your programs, but the more vectorized your original code is, the better starting place it has to work from.
A more specific example comes from a common computation done for tangent-space bump mapping. Given a texture map that encodes a bump map by storing the offset along the tangent direction in x, the offset along the binormal in y, and the offset along the normal in z, the bump-mapped normal is computed by scaling the tangent, binormal, and normal appropriately. In C or C++, the natural way to write this computation is as shown:
    Tangen
29.   b, c).zyx yields float3(c, b, a)
    float4(a, b, c, d).xxyy yields float4(a, a, b, b)
    float2(a, b).yyxx yields float4(b, b, a, a)
    float4(a, b, c, d).w yields d
The swizzle operator can also be used to create a vector from a scalar:
    a.xxxx yields float4(a, a, a, a)
The precedence of the swizzle operator is the same as that of the array subscripting operator ([]).
Write Mask Operator
The write mask operator (.) is placed on the left-hand side of an assignment statement. It can be used to selectively overwrite the components of a vector. It is illegal to specify a particular component more than once in a write mask, or to specify a write mask when initializing a variable as part of a declaration.
The following is an example of a write mask:
    float4 color = float4(1.0, 1.0, 0.0, 0.0);
    color.a = 1.0;  // Set alpha to 1.0, leaving RGB alone
The write mask operator can be a powerful tool for generating efficient code because it maps well to the capabilities of GPU hardware. The precedence of the write mask operator is the same as that of the swizzle operator.
Conditional Operator
Cg includes C's if/else conditional statement and conditional operator (?:). With the conditional operator, the control variable may be a bool vector. If so, the second and third operands must be similarly sized vectors, and selection is performed on an elementwise basis. Unlike C, any side
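The swizzle, write mask, and elementwise conditional described in this excerpt can be mimicked in plain C to make their semantics concrete. This is only an illustrative sketch: the float4 and bool4 structs and the functions swizzle4, write_mask4, and select4 are hypothetical helpers invented here, not part of Cg or its runtime, and the sketch says nothing about how a GPU implements these operators.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical C stand-ins for Cg's float4 and bool4; not part of any Cg API. */
typedef struct { float c[4]; } float4;   /* c[0]=x, c[1]=y, c[2]=z, c[3]=w */
typedef struct { bool  c[4]; } bool4;

/* Swizzle: result component i is v's component idx[i] (each idx in 0..3).
   {0,0,1,1} mimics v.xxyy; {3,2,1,0} mimics v.wzyx. */
float4 swizzle4(float4 v, const int idx[4]) {
    float4 r;
    for (int i = 0; i < 4; ++i) {
        assert(idx[i] >= 0 && idx[i] < 4);
        r.c[i] = v.c[idx[i]];
    }
    return r;
}

/* Write mask: overwrite only the components whose bit is set in mask
   (bit 0 = x ... bit 3 = w); the other components keep their old values. */
void write_mask4(float4 *dst, float4 src, unsigned mask) {
    for (int i = 0; i < 4; ++i)
        if (mask & (1u << i))
            dst->c[i] = src.c[i];
}

/* Elementwise conditional: mimics (cond ? a : b) with a bool4 control. */
float4 select4(bool4 cond, float4 a, float4 b) {
    float4 r;
    for (int i = 0; i < 4; ++i)
        r.c[i] = cond.c[i] ? a.c[i] : b.c[i];
    return r;
}
```

For instance, calling write_mask4 with mask 1u << 3 corresponds to assigning through a .a (or .w) write mask: only the fourth component changes.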
30.   const DWORD* declaration);
for the Direct3D 8 Cg runtime.
A call to cgD3D9ValidateVertexDeclaration() or cgD3D8ValidateVertexDeclaration() returns CG_TRUE if the vertex declaration is compatible with the program. A Direct3D 9 declaration is compatible with the program if the declaration has an entry matching every varying input parameter used by the program. A Direct3D 8 declaration is compatible with the program if the declaration has a D3DVSD_REG() macro call matching every varying input parameter used by the program. For the program
    void main(float4 position : POSITION,
              float4 color    : COLOR0,
              float4 texCoord : TEXCOORD0)
    { ... }
the following Direct3D 9 vertex declaration is valid:
    const D3DVERTEXELEMENT9 declaration[] = {
        { 0, 0 * sizeof(float), D3DDECLTYPE_FLOAT3,   D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_POSITION, 0 },
        { 0, 3 * sizeof(float), D3DDECLTYPE_D3DCOLOR, D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_COLOR,    0 },
        { 0, 4 * sizeof(float), D3DDECLTYPE_FLOAT2,   D3DDECLMETHOD_DEFAULT, D3DDECLUSAGE_TEXCOORD, 0 },
        D3DDECL_END()
    };
and the following Direct3D 8 vertex declaration is valid:
    DWORD declaration[] = {
        D3DVSD_STREAM(0),
        D3DVSD_REG(D3DVSDE_POSITION, D3DVSDT_FLOAT3),
        D3DVSD_REG(D3DVSDE_DIFFUSE, D3DVSDT_D3DCOLOR),
        D3DVSD_STREAM(1),
        D3DVSD_
31.   eight elements
    float4x4 matrix4;  // Four-by-four matrix, sixteen elements
Note that the multi-dimensional array float M[4][4] is not type-equivalent to the matrix float4x4 M.
There are no unions or bit fields in Cg at present.
Type Conversions
Type conversions in Cg work largely as they do in C. Type conversions may be explicitly specified using the C (newtype) cast operator.
Cg automatically performs type promotion in mixed-type expressions, just as C does. For example, the expression floatvar * halfvar is compiled as floatvar * (float)halfvar.
Cg uses different type promotion rules than C does in one case: a constant without an explicit type suffix does not cause type promotion. Cg compiles the expression halfvar * 2.0 as halfvar * (half)2.0.
In contrast, C would compile it as ((double)halfvar) * 2.0. Cg uses different rules than C to minimize inadvertent type promotions that cause computations to be performed in slower, high-precision arithmetic. If the C behavior is desired, the constant should be explicitly typed to force the type promotion: halfvar * 2.0f is compiled as ((float)halfvar) * 2.0f.
Cg uses the following type suffixes for constants:
- f for float
- h for half
- x for fixed
Structures
Cg supports structures the same way C does. Cg adopts the C++ convention of implicitly performing a typedef based on the tag name when a struct is de
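The C side of the promotion rule contrasted in this excerpt can be checked directly in C. The sketch below is a small illustration, not Cg code: C has no half type, so a float stands in for the manual's halfvar (an assumption), and the function names are hypothetical. It relies only on sizeof reporting the size of an expression's result type.

```c
#include <assert.h>
#include <stddef.h>

/* C has no half type, so float stands in for the manual's halfvar (assumption). */

/* Size of the result type of (v * 2.0): the unsuffixed constant 2.0 is a
   double, so C converts v to double before multiplying. */
size_t size_of_unsuffixed_product(float v) { return sizeof(v * 2.0); }

/* Size of the result type of (v * 2.0f): both operands are float, so the
   result type stays float. */
size_t size_of_f_suffixed_product(float v) { return sizeof(v * 2.0f); }

/* Checks the C behavior that the text contrasts with Cg's rule. */
void check_c_promotion(void) {
    assert(size_of_unsuffixed_product(1.5f) == sizeof(double));
    assert(size_of_f_suffixed_product(1.5f) == sizeof(float));
}
```

Under the Cg rule described above, the unsuffixed constant would instead take the type of the other operand, so no widening would occur in the first case.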
32.   float3 eyeSpacePosition TEXCOORD7           he    ELOGIES imegnimese  locus wil  itiloees w2  loss e        float costheta    float3 g2    float3 gtemp     costheta   dot   vl  v2      g2   g g    guisempe  Mec a G4   2  Oe COS chetan  gtemp   pow  gtemp  1 5 xxx      Gneewuo    1  sos    2    creas   return Lemos       Computes the single scattering approximation to     scattering from a one dimensional volumetric surface   float3 singleScatter  float3 wi  float3 wo  float3 n   float3 g  float3 albedo   float thickness         float win   abs dot wi n     float won   abs dot wo n     float  eterm    Brodit si result    eterm   1 0   exp      1  win   1  won   thickness      result   eterm    albedo   hgphase  wo  wi  g       win   won       return result        i is the incident ray      n is the surface normal      eta is the ratio of indices of refraction     r is the reflected ray      t is the transmitted ray    float fresnel  float3 i  float3 n  float eta        120    808 00504 0000 004  NVIDIA    Advanced Profile Sample Shaders    ine  acloenes  12  Obie elotes qu        float result   lote  Cy  itle ES27   ie llOete  1616 Weve p       Refraction vector courtesy Paul Heckbert   cl   cdo   i  im  4   esa    50er  es  La 0 Eil 91  y   ella    losw   es2    059         ula    Greece  sal  sp exeo   A  ISS allready  wait tenga or  0 0 0        Compute Fresnel terms      From Global Illumination Compendeum      LOBE  MOORE    ile coss chiy COSp  logre Cosi oh cosi  itle 
33.   for DirectX PS 1.1 pixel shaders
  ps_1_2 for DirectX PS 1.2 pixel shaders
  ps_1_3 for DirectX PS 1.3 pixel shaders
- How to invoke: Use the compiler options
  -profile ps_1_1
  -profile ps_1_2
  -profile ps_1_3
The deprecated profile dx8ps is also available and is synonymous with ps_1_1.
This document describes the capabilities and restrictions of Cg when using the DirectX pixel shader 1.X profiles.
Overview
DirectX PS 1.4 is not currently supported by any Cg profile; all statements about ps_1_X in the remainder of this document refer only to ps_1_1, ps_1_2, and ps_1_3.
The underlying instruction set and machine architecture limit programmability in these profiles compared to what is allowed by Cg constructs. Thus, these profiles place additional restrictions on what can and cannot be done in a Cg program.
The main difference between these profiles from the Cg perspective is that additional texture addressing operations are exposed in ps_1_2 and ps_1_3, and the depth value output is made available, in a limited form, in ps_1_3.
Operations in the DirectX pixel shader 1.X profiles can be categorized as texture addressing operations and arithmetic operations. Texture addressing operations are operations that generate texture addressing instructions; arithmetic operations are operations that generate arithmetic instructions. A Cg program in one of these profiles is limited to generating a maximum of four texture addressing instructions and
34.   lighting z   attenuation        float main  vert2frag IN   uniform float4  LightPos   uniform sampler3D noise map   uniform sampler2D nv_ map   uniform samplerCUBE cube map   uniform float4  interpolate  IEC OO        float diffuse  specular        808 00504 0000 004 107  NVIDIA    Cg Language Toolkit    float3 biVariate    float3 IN OPosition x IN OPosition z   NRO BO Sion aN    Osito OE   float3 uniVariate   float3 IN OPosition x IN OPosition z   0  0      float3 normal   normalize IN Normal    float3 noiseTex   float3  IN OPosition x IN OPosition z  6   dN gexexsstieskom a    2 5  0   float3 noiseSum   tex3D noise map  biVariate 3  rgb 12    tex3D noise map  noiseTex  rgb 18     tex3D noise map  biVariate 6  rgb 18   normal   normalize  normal   noiseSum               calcLighting diffuse  specular  normal  IN OPosition   IN LightPos  IN ViewerPos  32      float3 nvShift   tex3D noise map  uniVariate 3  rgb   2    tex3D noise map  uniVariate  rgb   4    tex3D noise map  biVariate 3  rgb   16    SS ANS A A SOS   nvShift y   0     biVariate   float3 IN OPosition x   IN OPosition z   NOLO Sascsonenya  9   float2 texCoord   biVariate xy 4   float2 1 1   5     nvShift yx   float2 0  interpolate x 8    float3 nvDecal    LES  OF mejo  OETA  Ulsa  SC OO  1     suelo     l interpolate x    7  xxx        float3 eye   IN ViewerPos   IN OPosition   float3 lightMetal   texCUBE cube map   reflect  normal  eye   rgb   MAPS RAMSAR  RS 7 dEdkoyeuES   5   25 0   sp  enxexeuwubeuc  s ar ko
35.   <= 1024 (default 1024)
    Predication=<b> where b = 0 or 1 (default 1)
    ArbitrarySwizzle=<b> where b = 0 or 1 (default 1)
    GradientInstructions=<b> where b = 0 or 1 (default 1)
    NoDependentReadLimit=<b> where b = 0 or 1 (default 1)
    NoTexInstructionLimit=<b> where b = 0 or 1 (default 1)
Limitations in this Implementation
Currently, this profile implementation has the following limitations:
- Dynamic flow control is not supported in extended pixel shaders.
- Multiple color outputs are not supported in pixel shaders. Only COLOR0 is supported.
OpenGL ARB Vertex Program Profile (arbvp1)
The OpenGL ARB Vertex Program Profile is used to compile Cg source code to vertex programs compatible with version 1.0 of the GL_ARB_vertex_program extension.
- Profile name: arbvp1
- How to invoke: Use the compiler option -profile arbvp1
Overview
This section describes the capabilities and restrictions of Cg when using the arbvp1 profile.
- The arbvp1 profile is similar to the vp20 profile except for the format of its output and its capability of accessing OpenGL state easily.
- ARB_vertex_program has the same capabilities as NV_vertex_program and DirectX 8 vertex shaders, so the limitations that this profile places on the Cg source code written by the programmer are the same as those of the NV_vertex_program profile.
Accessing OpenGL State
The ar
36.   pass through object space position    IN Normal  xyz         110  NVIDIA    808 00504 0000 004       pass  OUT N  CUE  QUID       trans  OUT  PGS    ans  OUT Ligh          through object space normal     normalize IN Normal xyz    IN Tangent xyz   IN Binormal xyz     form view pos  origin        Advanced Profile Sample Shaders    tangent  binormal     to obj space  loca  OO  0 3     072    ition   mul ModelViewI   form light vector to obj space  tVecO   mul ModelViewI     return OUT     Pixel Shader Source Code for MultiPaint                      LightVec       define WHITE half4 1 0h 1 0h 1 0h 1 0h       input    same struct is output from  cg multipaintVP cg    struct MultiPaintV2F    float4 HPosition POSITION     position  clip space   float4 TexCoords   TEXCOORD0     base ST coordinates  float3 OPosition UEXCO ORD CO Sion  OO Asp de e   float3 Normal TEXCOORD2     normal  eye space   float3 VPosition TEXCOORD3     view pos  obj space   float3 T TEXCOORD4     tangent  obj space   float3 B TEXCOORD5     binormal  obj space   float3 N TEXCOORD6     normal  obj space   float4 LightVecO TEXCOORD7     light dir  obj space     lg          channel  p    define SE    S in our material map     EC STR x             define M    ETALN    BIS Vy        define NORM SF    fp lutis aia          PEC_EXPON z             subfields in  SpecData   define MINPOWER x   define MAXPOWER  define MAXSPEC       y  Z        ReflData           define FR    ESNE             define FRESNE      MIN x   
37.   texRECT(samplerRECT, float3)
    texRECTproj(samplerRECT, float3)
    texRECTproj(samplerRECT, float4)
    tex3D(sampler3D, float3)
    tex3Dproj(sampler3D, float4)
    texCUBE(samplerCUBE, float3)
    texCUBEproj(samplerCUBE, float4)
Note: The nonprojective texture lookup functions are actually done as projective lookups on the underlying hardware. Because of this, the w component of the texture coordinates passed to these functions from the application or vertex program must contain the value 1.
Texture coordinate parameters for projective texture lookup functions must have swizzles that match the swizzle done by the generated texture shader instruction. While this may seem burdensome, it is intended to allow fp20 profile programs to behave correctly under other pixel shader profiles.
Table 46 lists the swizzles required on the texture coordinate parameter to the projective texture lookup functions.
Table 46  Required Projective Texture Lookup Swizzles
    Texture Lookup Function      Texture Coordinate Swizzle
    tex1Dproj                    .xw / .ra
    tex2Dproj                    .xyw / .rga
    texRECTproj                  .xyw / .rga
    tex3Dproj                    .xyzw / .rgba
    texCUBEproj                  .xyzw / .rgba
Bindings
Manual Assignment of Bindings
The Cg compiler can determine bindings between texture units and uniform sampler parameters/texture coordinate inputs automatically. This automatic assignment is
38.   w    float2 waveDir   WaveData i  xy     calcWave  disp  norm  dampening  IN Position xyz   waveTime  height  frequency  waveDir     position y   position y   disp    normal   z e normal  Xz   norm     OUT HPosition   mul  ModelViewProj  position         transfom normal into eye space  normal   mul  ModelViewIT  normal    normal xyz   normalize normal xyz         get a vector from the vertex to the eye  float3  eyeToVert   mul ModelView  position  xyz   eyeToVert   normalize  eyeToVert         calculate the reflected vector for cubemap look up  float4 reflected   mul  TextureMat   reflect  eyeToVert  normal xyz   xyzz                  output two reflection vectors for the two     environment cubemaps   OUT TexCoord0   reflected    OUT TexCoordl   reflected              Calculate a Eresnell term  note that   0   0   float fres   l dot eyeToVert normal xyz     fres   pow fres  5            set the two color coefficients  the magic constants       808 00504 0000 004 103  NVIDIA    Cg Language Toolkit          are arbitrary   these two color coefficients are used     to calculate the contribution from each of the two     environment cubemaps  one bright  one dark     Ow od cec SS A SSL O   AO ATAN So 0   2  OW Colori e  eres db 2 9   Roos       return OUT     Pixel Shader Source Code for Improved Water          float4 main in float3 color0   COLORO   aum  ftlhouuES colos COMORES  in float3 reflectVec   TEXCOORDO   in float3 reflectVecDark   TEXCOORDI   uniform samplerCUBE envir
39.   without generating a register combiner instruction. These operations are referred to as input modifiers and output modifiers.
Instead of generating a register combiners instruction, the arithmetic operation modifies the assembly instruction or source registers to which it is applied. For example, the following Cg expression:
    z = (x - 0.5 + y) / 2
could generate the following register combiner instruction, assuming x is in tex0, y is in tex1, and z is in col0:
    rgb {
        discard = half_bias(tex0.rgb);
        discard = tex1.rgb;
        col0 = sum();
        scale_by_one_half();
    }
    alpha {
        discard = half_bias(tex0.a);
        discard = tex1.a;
        col0 = sum();
        scale_by_one_half();
    }
Table 44 summarizes how different NV_texture_shader and NV_register_combiners instruction set modifiers are expressed in Cg programs. For more details on the context in which each modifier is allowed and ways in which modifiers may be combined, refer to the NV_texture_shader and NV_register_combiners documentation.
Table 44  NV_texture_shader and NV_register_combiners Instruction Set Modifiers
    Instruction/Register Modifier    Cg Expression
    scale by two                     2*x
    scale by four                    4*x
    scale by one half                x/2
    bias by negative one half        x-0.5
    bias by
40.  0, and a representation with all bits set to 1 corresponds to 1.0. The four unsigned integers are then packed into a single 32-bit result. This operation can be reversed using the unpack_4ubyte() function.
C Pseudocode
    ub.x = round(255.0 * clamp(a.x, 0.0, 1.0));
    ub.y = round(255.0 * clamp(a.y, 0.0, 1.0));
    ub.z = round(255.0 * clamp(a.z, 0.0, 1.0));
    ub.w = round(255.0 * clamp(a.w, 0.0, 1.0));
    result = (ub.w << 24) | (ub.z << 16) | (ub.y << 8) | ub.x;
unpack_4ubyte
    half4 unpack_4ubyte(float a)
Unpacks the four 8-bit integers in a and scales the results into individual 16-bit floating point values between 0.0 and 1.0.
C Pseudocode
    result.x = ((a >> 0) & 0xFF) / 255.0;
    result.y = ((a >> 8) & 0xFF) / 255.0;
    result.z = ((a >> 16) & 0xFF) / 255.0;
    result.w = ((a >> 24) & 0xFF) / 255.0;
DirectX Vertex Shader 1.1 Profile (vs_1_1)
The DirectX Vertex Shader 1.1 profile is used to compile Cg source code to DirectX 8.1 Vertex Shaders and DirectX 9 VS 1.1 shaders.
- Profile name: vs_1_1
- How to invoke: Use the compiler option -profile vs_1_1
The vs_1_1 profile limits Cg to match the capabilities of DirectX Vertex Shaders.
This section describes how using the vs_1_1 profile affects the Cg source code that the developer writes.
Memory Restrictions
DirectX 8 vertex shaders have a limited amount of memory for instructions and data.
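The pack_4ubyte and unpack_4ubyte pseudocode earlier in this excerpt can be turned into a small, runnable C sketch. Several things here are assumptions made for illustration: the packed value is returned as a uint32_t instead of being carried in the bits of a float as in Cg, the float4 struct and the function names pack_4ubyte_bits and unpack_4ubyte_bits are hypothetical, results are plain float (C has no half type), and rounding is done by adding 0.5 and truncating, which is adequate because the clamped input is never negative.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for Cg's float4; not part of any Cg API. */
typedef struct { float x, y, z, w; } float4;

/* Clamp v to the range [0, 1]. */
static float clamp01(float v) {
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* Convert one component to an 8-bit unsigned value: 0.0 -> 0, 1.0 -> 255. */
static uint32_t to_ubyte(float v) {
    return (uint32_t)(255.0f * clamp01(v) + 0.5f);
}

/* Pack four clamped components into one 32-bit word, x in the low byte. */
uint32_t pack_4ubyte_bits(float4 a) {
    return (to_ubyte(a.w) << 24) | (to_ubyte(a.z) << 16) |
           (to_ubyte(a.y) << 8)  |  to_ubyte(a.x);
}

/* Reverse the packing: extract each byte and scale it back to [0, 1]. */
float4 unpack_4ubyte_bits(uint32_t p) {
    float4 r;
    r.x = (float)((p >> 0)  & 0xFFu) / 255.0f;
    r.y = (float)((p >> 8)  & 0xFFu) / 255.0f;
    r.z = (float)((p >> 16) & 0xFFu) / 255.0f;
    r.w = (float)((p >> 24) & 0xFFu) / 255.0f;
    return r;
}

/* Self-check: out-of-range inputs are clamped, and unpack followed by pack
   returns the original word. */
void pack_4ubyte_selfcheck(void) {
    float4 a = { 2.0f, -1.0f, 0.5f, 1.0f };
    uint32_t p = pack_4ubyte_bits(a);
    assert(p == 0xFF8000FFu);
    assert(pack_4ubyte_bits(unpack_4ubyte_bits(p)) == p);
}
```

Packing then unpacking is lossy (each component is quantized to 8 bits), whereas unpacking then packing reproduces the original 32-bit word.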
41.  The parameter type is retrieved by cgGetParameterType():
    CGtype cgGetParameterType(CGparameter parameter);
One of five types is returned:
1. CG_STRUCT if the parameter is a structure,
2. CG_ARRAY if the parameter is an array,
3. CG_HALF* if the parameter is a half-based type,
4. CG_FLOAT* if the parameter is a float-based type, or
5. CG_SAMPLER* if the parameter is a sampler-based type.
The pair of functions cgGetType() and cgGetTypeString() indicates the correspondence between a type enumerant and its corresponding string:
    CGtype cgGetType(const char* typeString);
    const char* cgGetTypeString(CGtype type);
If the string passed to cgGetType() does not correspond to any type, CG_UNKNOWN_TYPE is returned.
Function cgGetParameterName() retrieves the parameter name:
    const char* cgGetParameterName(CGparameter parameter);
Use cgGetParameterSemantic() to retrieve the parameter semantic string:
    const char* cgGetParameterSemantic(CGparameter parameter);
If the parameter does not have any semantic, an empty string is returned.
There is a one-to-one correspondence between a set of predefined semantics (POSITION, COLOR, and so on) and hardware resources (registers, texture units, and so on). In the Cg runtime, a hardware resource is represented by the type CGresource, and cgGetParameterResource() retrieves the resource assigned to a parameter:
    CGresource cgGetParameterResource(CGparame
42.  204  arithmetic operators 14  189  arithmetic precision 188  arithmetic range 188  array type  specification 172  arrays  declaration and use of 179  support of 12    B  binding semantics 183  defined 6  overview 183  Blinn Phong Bump Mapping 119  bool data type 11  bool type  specification 172  boolean operators 15  189  built in functions 19  bump dot3x2 diffuse and specular  pixel shader code example 138  sample shader 136  vertex shader code example 137  bump reflection mapping  pixel shader code example 143  sample shader 140  vertex shader code example 141       C    C preprocessor  supporting 182  C    relation to Cg 165  Car Paint 9  pixel shader code example 130  vertex shader code example 128  cfloat type  specification 172  Cg  brief tutorial 89  defined 1  language  introduction 1  necessity for xii  standard library functions 19  Cg compiler  cgc exe 265  command line options 265  Cg runtime 29  API specific 45  benefits 29  compiling 32  context creation 32  Direct3D 57  cgD3D9GetLastError   87  CGerror 86  debugging mode 83  error callbacks 87  error testing 87  error types 85  Direct3D  cgD3D9EnableDebugTracing   85  Direct3D cgD3D9TranslateHRESULT   87  Direct3D expanded interface 69  cgD3D8LoadProgram   75  cgD3D8SetSamplerState   73  cgD3D9BindProgram   76  cgD3D9EnableParameterShadowing      808 00504 0000 004    NVIDIA    Cg Language Toolkit    74  cgD3D9GetDevice   70  cgD3D9GetLatestPixelProfile   76  cgD3D9GetLatestVertexProfile   76  cgD3D9GetOptimal
43.  - CGerror
  - cgD3D9Failed: Set when a Direct3D runtime function makes a Direct3D call that returns an error.
  - cgD3D9DebugTrace: Set when a debug message is output to the debug console when using the debug DLL (see "Direct3D Debugging Mode" on page 82).
- HRESULT
  - CGD3D9ERR_INVALIDPARAM: Returned when a parameter value cannot be set.
  - CGD3D9ERR_INVALIDPROFILE: Returned when a program with an unexpected profile is passed to a function.
  - CGD3D9ERR_INVALIDSAMPLERSTATE: Returned when a parameter of type D3DTEXTURESTAGESTATETYPE, which is not a valid sampler state, is passed to a sampler state function.
  - CGD3D9ERR_INVALIDVEREXDECL: Returned when a program is loaded with the expanded interface, but the given declaration is incompatible.
  - CGD3D9ERR_NODEVICE: Returned when a required Direct3D device is 0. This typically occurs when an expanded interface function is called and a Direct3D device has not been set with cgD3D9SetDevice().
  - CGD3D9ERR_NOTMATRIX: Returned when a parameter that is not a matrix type is passed to a function that expects one.
  - CGD3D9ERR_NOTLOADED: Returned when a parameter has not been loaded with the expanded interface by cgD3D9LoadProgram().
  - CGD3D9ERR_NOTSAMPLER: Returned when a parameter that is not a sampler parameter is passed to a function that expects one.
  - CGD3D9ERR_NOTUNIFORM: Returned when a parameter that is not uniform is passed to a function that expects
44.  9 PS 2.0 pixel shaders, and DirectX 9 PS 2.0 extended pixel shaders.
- Profile names:
  ps_2_0 (for DirectX 9 PS 2.0 pixel shaders)
  ps_2_x (for DirectX 9 PS 2.0 extended pixel shaders)
- How to invoke: Use the compiler options
  -profile ps_2_0
  -profile ps_2_x
The ps_2_0 profile limits Cg to match the capabilities of DirectX PS 2.0 pixel shaders. The ps_2_x profile is the same as the ps_2_0 profile but allows extended features such as arbitrary swizzles, a larger limit on the number of instructions, no limit on texture instructions, no limit on texture dependent reads, and support for predication.
This section describes the capabilities and restrictions of Cg when using these profiles.
Program Instruction Limit
DirectX 9 pixel shaders have a limit on the number of instructions in a pixel shader.
- PS 2.0 (ps_2_0) pixel shaders are limited to 32 texture instructions and 64 arithmetic instructions.
- Extended PS 2 (ps_2_x) shaders have a maximum total instruction count that lies between 96 and 1024 instructions. There is no separate texture instruction limit on extended pixel shaders.
If the compiler needs to produce more than the maximum allowed number of instructions to compile a program, it reports an error.
Vector Register Limit
Likewise, there are limited numbers of registers to hold program parameters and temporary results. Specifically, there are 32 read-only vector registers and 12 to 32 read/write vector registers. If the compile
45.  Input Binding Semantics
Table 38  ps_1_x Varying Input Binding Semantics
Table 39  ps_1_x Varying Output Binding Semantics
Table 40  ps_1_x Auxiliary Texture Functions
Table 41  vp20 Uniform Input Binding Semantics
Table 42  vp20 Varying Input Binding Semantics
Table 43  vp20 Varying Output Binding Semantics
Table 44  NV_texture_shader and NV_register_combiners Instruction Set Modifiers
Table 45  Supported Standard Library Functions
Table 46  Required Projective Texture Lookup Swizzles
Table 47  fp20 Uniform Binding Semantics
Table 48  fp20 Varying Input Binding Semantics
Table 49  fp20 Varying Output Binding Semantics
Table 50  fp20 Auxiliary Texture Functions

Foreword

We are in the midst of a great transition in computer graphics, both in terms of graphics hardware and in terms of the visual quality and authoring process for games, interactive applications, and animation. Graphics hardware has evolved from "big iron" graphics workstations costing hundreds of thousands of dollars to single-chip graphics processing units (GPUs) whose performance and features have grown to mat
46.  MAX y                808 00504 0000 004    NVIDIA    111    Cg Language Toolkit       define FRESN          EL EXPON z             STRENGTH w          subfields          half4 main  Mu  uniform  uniform  uniform  uniform  uniform  uniform  uniform    8  loui    half4 surfC  half4 mater  half3 Nt         SpecData  half specSt  half specPo    half3 Vn    half3 Ln    half3 Nb      meli gli  half3 Hn    half4 ligh    clsitime INI    detine BUM     in  BumpData   SCALE x             ier Pari I  N   sampler2D ColorMap   sampler2D MaterialMap   sampler2D NormalMap   samplerCUBE EnvMap   float4 SpecData   float4 ReflData   float4 BumpData   R                Wi  dtl  A  Ad  A  El  A    color   see above  tangent space normals  environment skybox  see above   see above   see above    ol   tex2D ColorMap  IN TexCoords xy    ial   tex2D MaterialMap   tex2D NormalMap  IN TexCoords xy  rgb      REALESO Sia  0  Sint  0   310  P    JEN 5L ess C toYonetols oor    d    MANI PE CSS O Ml Mbit c re M   i  ie meson SH SIMA       wer   SpecData MINPOWER            material NORM SP    AC        EXPON          SpecData MAXPOW         normalize IN VPosition            ER   SpecData MINPOWER         IN OPosition      normalize IN LightVecO  xyz   normalize BumpData BUMP SCALE       IME A JEN IN             dot  Ln  Nb     normalize Vn   Ln    amo   Mate  Clute  Choice  sim                  NINA Ne EIN 318  E    Nb   specPower      half4 diffResult   lighting y   surfCol   ol   lerp WHITE  surfCol   ha
47.  Range

Some hardware may not conform exactly to IEEE arithmetic rules. Fixed-point data types do not have IEEE-defined rules.

Optimizations are allowed to produce slightly different results than unoptimized code. Constant folding must be done with approximately the correct precision and range, but is not required to produce bit-exact results. It is recommended that compilers provide an option either to forbid these optimizations or to guarantee that they are made in bit-exact fashion.

Operator Precedence

Cg uses the same operator precedence as C for operators that are common between the two languages.

The swizzle and write-mask operators (.) have the same precedence as the structure member operator (.) and the array index operator ([]).

Operator Enhancements

The standard C arithmetic operators (+, -, *, /, %, unary -) are extended to support vectors and matrices. Sizes of vectors and matrices must be appropriately matched, according to standard mathematical rules. Scalar-to-vector promotion (see "Smearing of Scalars to Vectors" on page 179) allows relaxation of these rules.

Table 7  Expanded Operators

Operator                  Description
M[n][m]                   Matrix with n rows and m columns
V[n]                      Vector with n elements
-V[n] -> V[n]             Unary vector negate
-M[n] -> M[n]             Unary matrix negate
V[n] * V[n] -> V[n]       Componentwise *
V[n] / V[n] -> V[n]       Componentwise /
V[n] % V[n] -> V[n]       Componentwise %
V[n]
48.  Semantics
Table 19  arbvp1 Uniform Input Binding Semantics
Table 20  arbvp1 Varying Input Binding Semantics
Table 21  arbvp1 Varying Output Binding Semantics
Table 22  arbfp1 Uniform Input Binding Semantics
Table 23  arbfp1 Varying Input Binding Semantics
Table 24  arbfp1 Varying Output Binding Semantics
Table 25  vp30 Uniform Input Binding Semantics
Table 26  vp30 Varying Input Binding Semantics
Table 27  vp30 Varying Output Binding Semantics
Table 28  fp30 Uniform Input Binding Semantics
Table 29  fp30 Varying Input Binding Semantics
Table 30  fp30 Varying Output Binding Semantics
Table 31  vs_1_1 Uniform Input Binding Semantics
Table 32  vs_1_1 Varying Input Binding Semantics
Table 33  vs_1_1 Varying Output Binding Semantics
Table 34  ps_1_x Instruction Set Modifiers
Table 35  Supported Standard Library Functions
Table 36  Required Projective Texture Lookup Swizzles
Table 37  ps_1_x Uniform
49.  Vertex shader input register: v3

PSIZE                     Vertex shader input register: v4
COLOR0, DIFFUSE           Vertex shader input register: v5
COLOR1, SPECULAR          Vertex shader input register: v6
TEXCOORD0-TEXCOORD7       Vertex shader input register: v7-v14
TANGENT (i)               Vertex shader input register: v14
BINORMAL                  Vertex shader input register: v15

i. TANGENT is an alias for TEXCOORD7.

Table 33 summarizes the valid binding semantics for varying output parameters in the vs_1_1 profile. These map to output registers in DirectX 8.1 vertex shaders.

Table 33  vs_1_1 Varying Output Binding Semantics

Binding Semantics Name    Corresponding Data
POSITION                  Output position: oPos
PSIZE                     Output point size: oPts
FOG                       Output fog value: oFog
COLOR0-COLOR1             Output color values: oD0, oD1
TEXCOORD0-TEXCOORD7       Output texture coordinates: oT0-oT7

Options

When using the vs_1_1 profile under DirectX 9, it is necessary to tell the compiler to produce dcl statements to declare varying inputs. The option -profileopts dcls causes dcl statements to be added to the compiler output.

DirectX Pixel Shader 1.x Profiles (ps_1_x)

The DirectX pixel shader 1.x profiles are used to compile Cg source code to DirectX PS 1.1, PS 1.2, or PS 1.3 pixel shader assembly.

- Profile names: ps_1_1
50.  a cgGLSetParameter function is called for a varying parameter, the appropriate immediate mode OpenGL entry point is called. The cgGLGetParameter functions do not apply to varying parameters.

Setting Sampler Parameters

Setting a sampler parameter requires two steps.

The first step consists of assigning an OpenGL texture object to the sampler parameter using

    void cgGLSetTextureParameter(CGparameter parameter, GLuint textureName);

where textureName is the OpenGL texture name.

The second step consists of enabling the sampler parameter for a specific drawing call:

    void cgGLEnableTextureParameter(CGparameter parameter);

The function cgGLEnableTextureParameter() must be called after cgGLSetTextureParameter() and before the actual drawing call.

The equivalent disabling function is

    void cgGLDisableTextureParameter(CGparameter parameter);

You can retrieve the texture object assigned to a sampler parameter using

    GLuint cgGLGetTextureParameter(CGparameter parameter);

You can retrieve the OpenGL enumerant for the texture unit associated with a sampler parameter using

    GLenum cgGLGetTextureEnum(CGparameter parameter);

The returned enumerant has the form GL_TEXTUREn_ARB, where n is the texture unit index.

OpenGL Profile Support

A convenient function is provided that gives the best available profile for vertex or fragment programs depending on the available OpenGL extensions:

    CGprofile cgGLGetLatestProfile(CGGLenum profileType);

Param
51.  an application based on this API. They essentially interface between the core runtime data structures and the API data structures to provide the following facilities:

- Setting the parameter values. A distinction is made between texture, matrix, array, vector, and scalar values, as those various types are handled differently by each API and have different data structures.
- Executing the program. Program execution is divided into program loading (passing the result of the Cg compiler to the API) and program binding (setting the program as the one to execute for any subsequent draw calls). This is because those two operations are usually done at different times. A program is loaded each time it is recompiled, and it is bound each time it needs to be executed for a particular draw call.

Parameter Shadowing

When the value of a uniform parameter is set by some function of the OpenGL Cg runtime, it is actually stored internally, or shadowed, by either the Cg or the OpenGL runtime so that it does not need to be reset every time the program is about to be executed. This behavior is referred to as parameter shadowing.

If the Direct3D Cg runtime expanded interface (described in "Direct3D Expanded Interface" on page 69) is used, parameter shadowing can be turned on or off on a per-program basis. When parameter shadowing is turned off for a given program and the value of any of its uniform paramete
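The idea behind parameter shadowing can be illustrated with a small simulation in plain C. This is only a sketch under stated assumptions: the types program and device and the functions set_uniform and bind_program are hypothetical stand-ins, not part of the Cg runtime, and the real runtimes may store and download values quite differently.

```c
#include <string.h>

#define MAX_PARAMS 8

/* Hypothetical program object: keeps a shadow copy of every uniform value. */
typedef struct {
    float shadow[MAX_PARAMS][4];
} program;

/* Hypothetical device: the constant registers the bound program reads. */
typedef struct {
    float regs[MAX_PARAMS][4];
} device;

/* Setting a uniform only records the value in the program's shadow copy. */
static void set_uniform(program *p, int index, const float value[4])
{
    memcpy(p->shadow[index], value, sizeof(float) * 4);
}

/* Binding downloads the shadowed values, so the application does not
   have to set them again each time the program is about to execute. */
static void bind_program(device *d, const program *p)
{
    memcpy(d->regs, p->shadow, sizeof d->regs);
}
```

In this model, binding a second program overwrites the device registers, and rebinding the first program restores its values from the shadow copy without any further set_uniform calls.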
52.  and Destruction

Programs can only be created as part of a context that acts as a program container. A context is created by calling cgCreateContext():

    CGcontext cgCreateContext();

A context is destroyed by cgDestroyContext():

    void cgDestroyContext(CGcontext context);

Context Query

To check whether a context handle references a valid context or not, use cgIsContext():

    CGbool cgIsContext(CGcontext context);

Core Cg Program

There are Cg functions for creating, destroying, iterating over, and querying programs.

Program Creation and Destruction

A program is created by calling either cgCreateProgram():

    CGprogram cgCreateProgram(CGcontext context,
                              CGenum programType,
                              const char* program,
                              CGprofile profile,
                              const char* entry,
                              const char** args);

or cgCreateProgramFromFile():

    CGprogram cgCreateProgramFromFile(CGcontext context,
                                      CGenum programType,
                                      const char* program,
                                      CGprofile profile,
                                      const char* entry,
                                      const char** args);

These functions create a program object, add it to the specified context, and compile the associated source code. For both of them:

- context is a valid context handle.
- profile is an enumerant specifying the profile to which the program must be compiled.
- entry is the name of the function that must be considered as the main entry point by the compiler. If the value is zero, the name main is used.
- args is a pointer to a null
53.  assumed to be in a file called FragmentProgram.cg.

    void FragmentProgram(
        in float4 color : COLOR0,
        in float4 texCoord : TEXCOORD0,
        out float4 colorO : COLOR0,
        const uniform sampler2D BaseTexture,
        const uniform float4 SomeColor)
    {
        colorO = color * tex2D(BaseTexture, texCoord) + SomeColor;
    }

Expanded Interface Direct3D 9 Application

The following C code links the previous vertex and fragment programs to the Direct3D 9 application.

    #include <cg/cg.h>
    #include <cg/cgD3D9.h>

    IDirect3DDevice9* device;     // Initialized somewhere else
    IDirect3DTexture9* texture;   // Initialized somewhere else
    D3DXCOLOR constantColor;      // Initialized somewhere else

    CGcontext context;
    IDirect3DVertexDeclaration9* vertexDeclaration;
    CGprogram vertexProgram, fragmentProgram;
    CGparameter baseTexture, someColor, modelViewMatrix;

    // Called at application startup
    void OnStartup()
    {
        // Create context
        context = cgCreateContext();
    }

    // Called whenever the Direct3D device needs to be created
    void OnCreateDevice()
    {
        // Pass the Direct3D device to the expanded interface
        cgD3D9SetDevice(device);

        // Determine the best profiles to use
        CGprofile vertexProfile = cgD3D9GetLatestVertexProfile();
        CGprofile pixelProfile = cgD3D9GetLatestPixelProfile();

        // Grab the optimal options for each profile
        const char* vertexOptions[] = {
            cgD3D9GetOptimalOptions(vertexProfile),
            0,
        };
        cons
54.  but is cumbersome for an application that uses many programs. What's worse, the application is frozen in time. It supports only the profiles that existed when it was compiled, and it cannot take advantage of the optimizations that future compilers could offer.

In contrast, programs compiled by applications at run time

- Benefit from future compiler optimizations for the existing profiles
- Run on future profiles corresponding to new 3D APIs or to hardware that did not exist at the time the Cg programs were written

No Dependency Limitations

If you link a Cg program to the application when it is compiled, the application is too dependent on the result of the compilation. The application program has to refer to the Cg program input parameters by using the hardware register names that are output by the Cg compiler. This approach is awkward for two reasons:

- The register names can't be easily matched to the corresponding meaningful names in the Cg program without looking at the compiler output.
- Register allocations can change each time the Cg program, the Cg compiler, or the compilation profile changes. This means you have the inconvenience of updating the application each time as well.

In contrast, linking a Cg program to the application program at run time removes the dependency on the Cg compiler. With the runtime, you need to alter the application code only when you add, delete, or modify Cg input parameters.

Input Parameter Ma
55.  Melting Paint
    Description
    Vertex Shader Source Code for Melting Paint
    Pixel Shader Source Code for Melting Paint
MultiPaint
    Description
    Vertex Shader Source Code for MultiPaint
    Pixel Shader Source Code for MultiPaint
Ray-Traced Refraction
    Description
    Vertex Shader Source Code for Ray-Traced Refraction
    Pixel Shader Source Code for Ray-Traced Refraction
Skin
    Description
    Pixel Shader Source Code for Skin
Thin Film Effect
    Description
    Vertex Shader Source Code for Thin Film Effect
    Pixel Shader Source Code for Thin Film Effect
Car Paint
    Description
    Vertex Shader Source Code for Car Paint 9
    Pixel Shader Source Code for Car Paint
Basic Profile Sample Shaders
Anisotropic Lighting
    Description
    Ver
56.  cgD3D9SetDevice() 69
    cgD3D9SetSamplerState() 73
    cgD3D9SetTexture() 73
    cgD3D9SetTextureWrapMode() 74
    cgD3D9SetUniform() 72
    cgD3D9SetUniformArray() 73
    cgD3D9SetUniformMatrix() 72
    cgD3D9SetUniformMatrixArray() 73
    cgD3D9UnloadProgram() 76
    Direct3D 8 application 81
    Direct3D 9 application 78
    Direct3D device 69
    fragment program 77
    lost devices 70
    parameters 72
        array 73
        sampler 73
        uniform 72
    profile support 76
    program execution 74
    vertex program 77
  HRESULT 86
  minimal interface 57
    cgD3D8ResourceToDeclUsage() 61
    cgD3D8ValidateVertexDeclaration() 60
    cgD3D9ResourceToDeclUsage() 61
    cgD3D9ValidateVertexDeclaration() 60
    Direct3D 8 application 67
    Direct3D 9 application 64
    fragment program 63
    type retrieval 63
    vertex declaration 57
    vertex declaration for Direct3D 8 58
    vertex declaration for Direct3D 9 58
    vertex program 63
Direct3D debug DLL, using 85
DirectX pixel shader 1.x profiles 227
DirectX pixel shader 2.x profile 200
DirectX vertex shader 1.1 profile 223
DirectX vertex shader 2.x profile 196
dot(), for performance 259
dx8ps profile, deprecated 227

E

explicit casts
    compile time 177
    numeric 177
    numeric matrix 177
    numeric vector 177

F

fixed data type 11
fixed type, specification 171
float data type 10
float type, specification 171
floating type category 174
for statements 185
fp20 profile 244
fp30 profile 218
fragment profiles, texture lookups 17
fragment program
57.  easy

The Cg Language

Cg is based on C, but with enhancements and modifications that make it easy to write programs that compile to highly optimized GPU code. Cg code looks almost exactly like C code, with the same syntax for declarations, function calls, and most data types.

Before describing the Cg language in detail, it is important to explain the reason for some of the differences that exist between Cg and C. Fundamentally, it comes down to the difference in the programming models for GPUs and for CPUs.

Cg's Programming Model for GPUs

CPUs normally have only one programmable processor. In contrast, GPUs have at least two programmable processors, the vertex processor and the fragment processor, plus other non-programmable hardware units. The processors, the non-programmable parts of the graphics hardware, and the application are all linked through data flows. Figure 1 illustrates Cg's model of the GPU.

Figure 1  Cg's Model of the GPU

The Cg language allows you to write programs for both the vertex processor and the fragment p
58.  effects associated with the second and third operands always occur, regardless of the conditional.

As an example, the following would be a very efficient way to implement a vector clamp function, if the min() and max() functions did not exist:

    float3 clamp(float3 x, float minval, float maxval)
    {
        x = (x < minval.xxx) ? minval.xxx : x;
        x = (x > maxval.xxx) ? maxval.xxx : x;
        return x;
    }

Texture Lookups in Advanced Fragment Profiles

Cg's advanced fragment profiles provide a variety of texture lookup functions. Please note that Cg uses a different set of texture lookup functions for basic fragment profiles because of the restricted pixel programmability of that hardware. Basic fragment profile lookup functions aren't discussed in this introductory chapter.

Advanced fragment profile texture lookup functions always require at least two parameters:

- Texture sampler
  A texture sampler is a variable with the type sampler, sampler1D, sampler2D, sampler3D, samplerCUBE, or samplerRECT and represents the combination of a texture image with a filter, clamp, wrap, or similar configuration. Texture sampler variables cannot be set directly within the Cg language; instead, they must be provided by the application as uniform parameters to a Cg program.
- Texture coordinate
  Depending on the type of texture lookup, the coordinate may be a scalar, a two-vector, a three-vector, or a four-vector.

The following fragment program use
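The componentwise conditional select behind the vector clamp described above can be mimicked in plain C, which has no vector ?: operator. The sketch below is illustrative only: vec3, clamp_component, and clamp_vec3 are hypothetical names standing in for Cg's float3 and its vector conditional, and the C version evaluates one component at a time rather than as a single vector operation.

```c
/* Hypothetical three-component vector standing in for Cg's float3. */
typedef struct { float x, y, z; } vec3;

/* Clamp one component with two conditional selects, first against the
   lower bound and then against the upper bound. */
static float clamp_component(float v, float minval, float maxval)
{
    v = (v < minval) ? minval : v;
    v = (v > maxval) ? maxval : v;
    return v;
}

/* Apply the same pair of selects to every component of the vector. */
static vec3 clamp_vec3(vec3 v, float minval, float maxval)
{
    vec3 r;
    r.x = clamp_component(v.x, minval, maxval);
    r.y = clamp_component(v.y, minval, maxval);
    r.z = clamp_component(v.z, minval, maxval);
    return r;
}
```

Each component is selected independently, which is the property the vector form of the conditional operator relies on.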
59.  eight arithmetic instructions. Since these numbers are quite small, users need to be very aware of this limitation while writing Cg code for these profiles.

There are certain simple arithmetic operations that can be applied to inputs of texture addressing operations and to inputs and outputs of arithmetic operations without generating an arithmetic instruction. From here on, these operations are referred to as input modifiers and output modifiers.

6. For more details about the underlying instruction sets, their capabilities, and their limitations, refer to the MSDN documentation of DirectX pixel shaders 1.1, 1.2, and 1.3.

The ps_1_x profiles also restrict when a texture addressing operation or arithmetic operation can occur in the program. A texture addressing operation may not have any dependency on the output of an arithmetic operation unless

- The arithmetic operation is a valid input modifier for the texture addressing operation.
- The arithmetic operation is part of a complex texture addressing operation (these are summarized in the section on Auxiliary Texture Functions).

Modifiers

Input and output modifiers may be used to perform simple arithmetic operations without generating an arithmetic instruction. Instead, the arithmetic operation modifies the assembly instruction or source registers to which it is applied. For example, the following Cg expression

    z   x   0 5   y
60.  float4 intermediate_coord2, float4 prevlookup)

Performs the following:

    float3 newst = float3(dot(intermediate_coord1.xyz, prevlookup.xyz),
                          dot(intermediate_coord2.xyz, prevlookup.xyz),
                          dot(str, prevlookup.xyz));
    return tex3D/CUBE(tex, newst);

where
  str are texture coordinates associated with sampler tex,
  prevlookup is the result of a previous texture operation,
  intermediate_coord1 are texture coordinates associated with the (n-2) texture unit, and
  intermediate_coord2 are texture coordinates associated with the (n-1) texture unit.

This function can be used to generate the dot_product_3d or dot_product_cube_map NV_texture_shader instruction combinations.

Table 50  fp20 Auxiliary Texture Functions (continued)

Texture Function: texCUBE_reflect_dp3x3(uniform samplerCUBE tex, float4 strq, float4 intermediate_coord1, float4 intermediate_coord2, float4 prevlookup)

Description: Performs the following:

    float3 E = float3(intermediate_coord2.w, intermediate_coord1.w, strq.w);
    float3 N = float3(dot(intermediate_coord1.xyz, prevlookup.xyz),
                      dot(intermediate_coord2.xyz, prevlookup.xyz),
                      dot(strq.xyz, prevlookup.xyz));
    return texCUBE(tex, 2 * dot(N, E) / dot(N, N) * N - E);

where
  strq are texture coordinates associated with sampler tex,
  prevlookup is the result of a previous texture operation,
  intermediate_coord1 are texture coordinates associated with the (n-2) te
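The lookup direction in the texCUBE_reflect_dp3x3 entry above, 2 * dot(N, E) / dot(N, N) * N - E, reflects E about N without requiring N to be normalized. The following plain C sketch works through that arithmetic; vec3, dot3, and reflect_about_normal are hypothetical helper names introduced for illustration and are not Cg or NV_texture_shader functions.

```c
/* Hypothetical three-component vector standing in for Cg's float3. */
typedef struct { float x, y, z; } vec3;

static float dot3(vec3 a, vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Computes R = 2 * dot(N, E) / dot(N, N) * N - E.
   Dividing by dot(N, N) removes the need to normalize N first.
   The caller must pass a non-zero N. */
static vec3 reflect_about_normal(vec3 n, vec3 e)
{
    float s = 2.0f * dot3(n, e) / dot3(n, n);
    vec3 r = { s * n.x - e.x, s * n.y - e.y, s * n.z - e.z };
    return r;
}
```

For example, with N = (0, 0, 2) and E = (1, 0, 1) the scale factor is 2 * 2 / 4 = 1, so the reflected vector is (-1, 0, 1): the component along N is kept and the component perpendicular to N is flipped.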
61.  going from eye to shaded point (in cube space)
        float3 eyeVector = mul(ObjToCubeSpace, IN.Position) - EyePosition;
        OUT.TangentToCubeSpace0.w = eyeVector.x;
        OUT.TangentToCubeSpace1.w = eyeVector.y;
        OUT.TangentToCubeSpace2.w = eyeVector.z;

        // transform position to projection space
        OUT.Position = mul(WorldViewProj, IN.Position);

        return OUT;
    }

Pixel Shader Source Code for Bump and Reflection Mapping

    struct v2f {
        float4 Position : POSITION;   // in projection space
        float4 TexCoord : TEXCOORD0;

        // first row of the 3x3 transform
        // from tangent to cube space
        float4 TangentToCubeSpace0 : TEXCOORD1;

        // second row of the 3x3 transform
        // from tangent to cube space
        float4 TangentToCubeSpace1 : TEXCOORD2;

        // third row of the 3x3 transform
        // from tangent to cube space
        float4 TangentToCubeSpace2 : TEXCOORD3;
    };

    float4 main(v2f IN,
                uniform sampler2D NormalMap,
                uniform samplerCUBE EnvironmentMap,
                uniform float3 EyeVector) : COLOR
    {
        // fetch the bump normal from the normal map
        float4 normal = tex2D(NormalMap, IN.TexCoord.xy);

        // transform the bump normal into cube space,
        // then use the transformed normal and eye vector
        // to compute the reflection vector that is
        // used to fetch the cube map
        return texCUBE_reflect_eye_dp3x3(EnvironmentMap,
                                         IN.TangentToCubeSpace2.xyz,
                                         IN.TangentToCubeSpace0,
                                         IN.TangentToCubeSpace1,
                                         normal,
                                         EyeVec
62.  in the same program.

Fragment profiles are required to fully support the sampler, sampler1D, sampler2D, sampler3D, and samplerCUBE data types. Fragment profiles are required to provide partial support (see "Partial Support of Types" on page 173) for the samplerRECT data type and may optionally provide full support for this data type.

Vertex profiles are required to provide partial support for the six sampler data types and may optionally provide full support for these data types.

An array type is a collection of one or more elements of the same type. An array variable has a single index.

Some array types may be optionally designated as packed, using the packed type modifier. The storage format of a packed type may be different from the storage format of the corresponding unpacked type. The storage format of packed types is implementation dependent, but must be consistent for any particular combination of compiler and profile. The operations supported on a packed type in a particular profile may be different from the operations supported on the corresponding unpacked type in that same profile. Profiles may define a maximum allowable size for packed arrays, but must support at least size 4 for packed vector (one-dimensional array) types, and 4x4 for packed matrix (two-dimensional array) types.

- When declaring an array of arrays in a single declaration, the packed mod
63.  inputs IN,
        uniform float4x4 modelViewProj,
        uniform float3x4 boneMatrices[30],
        uniform float4 color,
        uniform float4 lightPos)
    {
        outputs OUT;

        float4 index = IN.matrixIndices;
        float4 weight = IN.weights;

        float4 position;
        float3 normal;

        for (float i = 0; i < IN.numBones.x; i += 1) {
            // transform the offset by bone i
            position = position + weight.x *
                float4(mul(boneMatrices[index.x], IN.position).xyz, 1.0);

            // transform normal by bone i
            normal = normal + weight.x *
                mul((float3x3)boneMatrices[index.x], IN.normal.xyz).xyz;

            // shift over the index/weight variables; this moves
            // the index and weight for the current bone into

Improved Water

Description

This demo gives the appearance that the viewer is surrounded by a large grid of vertices (because of the free rotation), but switching to wireframe or increasing the frustum angle makes it apparent that the vertices are a static mesh with the height, normal, and texture coordinates being calculated on the fly based on the direction and height of the viewer. This technique allows for very GPU-friendly water animations because the static mesh can be precomputed. The vertices are displaced using sine waves, and in this example a loop is used to sum five sine waves to achieve realistic effects.

Figure 6  Example of Improv
64.  is invalid.

- CG_INVALID_PROFILE_ERROR: Returned when the profile is not supported.
- CG_INVALID_VALUE_TYPE_ERROR: Returned when an unknown value type is assigned to a parameter.
- CG_NOT_MATRIX_PARAM_ERROR: Returned when the parameter is not of a matrix type.
- CG_INVALID_ENUMERANT_ERROR: Returned when the enumerant parameter has an invalid value.
- CG_NOT_4x4_MATRIX_ERROR: Returned when the parameter must be a 4x4 matrix type.
- CG_FILE_READ_ERROR: Returned when the file cannot be read.
- CG_FILE_WRITE_ERROR: Returned when the file cannot be written.
- CG_MEMORY_ALLOC_ERROR: Returned when a memory allocation fails.
- CG_INVALID_CONTEXT_HANDLE_ERROR: Returned when an invalid context handle is used.
- CG_INVALID_PROGRAM_HANDLE_ERROR: Returned when an invalid program handle is used.
- CG_INVALID_PARAM_HANDLE_ERROR: Returned when an invalid parameter handle is used.
- CG_UNKNOWN_PROFILE_ERROR: Returned when the specified profile is unknown.
- CG_VAR_ARG_ERROR: Returned when the variable arguments are specified incorrectly.
- CG_INVALID_DIMENSION_ERROR: Returned when the dimension value is invalid.
- CG_ARRAY_PARAM_ERROR: Returned when the parameter must be an array.
- CG_OUT_OF_ARRAY_BOUNDS_ERROR: Returned when the index into an array is out of bounds.

API-Specific Cg Runtimes

Each API-specific Cg runtime provides an additional set of functions on top of the core Cg runtime to ease the integration of Cg to
65.  is not referenced. This allows Cg programs to have the same structure specify the varying output of an arbvp1 profile program and the varying input of an fp30 profile program.

OpenGL ARB Fragment Program Profile (arbfp1)

The OpenGL ARB Fragment Program Profile is used to compile Cg source code to fragment programs compatible with version 1.0 of the GL_ARB_fragment_program OpenGL extension.

- Profile name: arbfp1
- How to invoke: Use the compiler option -profile arbfp1

The arbfp1 profile limits Cg to match the capabilities of OpenGL ARB fragment programs. This section describes the capabilities and restrictions of Cg when using the arbfp1 profile.

Program Instruction Limits

OpenGL ARB fragment programs have a limit on the number of instructions in an ARB fragment program.

The total instruction limit can be queried from the underlying OpenGL implementation using MAX_PROGRAM_INSTRUCTIONS_ARB, which has a minimum value of 72. There are also limits on the number of texture instructions (minimum limit of 24) and arithmetic instructions (minimum limit of 48) that can be queried from the OpenGL implementation.

If the compiler needs to produce more than the maximum allowed number of instructions to compile a program, it reports an error.

Vector Register Limits

Likewise, there are limited numbers of registers that can be queried from OpenGL implementat
66.  m01 m02   Bones i   MOO m01 m02   mo MIO mi mi2    Bones  i s mi mil m2   un  1020  21 1022   Bones  11   120 21 1227    float3 posl   mul  Bones i   tempPos      II crensian Sp Ty Sew  float3 sl   mul  m  IN S    Hoare acil   mul  m  IN T    float3 sxtl nmol  Git  JENIS San  o       final blending    fi oland m Ey EXE       float3 finalSxT          blend between the two positions             float3x3 worldToTangentSpace     Basic Profile Sample Shaders    float3 finalS   30    UN Meuemesos  ap Sil SIN MEL ES  Y   float3 finalT   t0   IN Weights x   tl   IN Weights  y   sxt0   IN Weights xt sxtl   IN Weights y     float3 finalPos   pos0   IN Weights xtpos1 IN Weights y     worldToTangentSpace  m00 m01 m02   finalS   worldToTangentSpace  m10 ml11 m12   finalT   worldToTangentSpace  m20 m21 m22   finalSxT        float3 tangentLight         normalize  mul  worldToTangentSpace  LightVec       i  Seale cue bias  edel bit Que eulos    tangentLight     tangentLight   1 0    0 5    0 2        create float4 with 1 0 alpha  float4 tempLight    tempLight xyz   tangentLight xyz   tempLight w   1 0           808 00504 0000 004    NVIDIA    163    Cg Language Toolkit          164 808 00504 0000 004  NVIDIA    Appendix A  Cg Language Specification       Language Overview    The Cg language is primarily modeled on ANSI C  but adopts some ideas from  modern languages such as C   and Java  and from earlier shading languages  such as RenderMan and the Stanford shading language  The language al
67.  matrix type when applied to another matrix type of the same number of rows and columns. Type Equivalency. Type T1 is equivalent to type T2 if any of the following are true: (a) T2 is equivalent to T1. (b) T1 and T2 are the same scalar, vector, or structure type. A packed array type is not equivalent to the same size unpacked array. (c) T1 is a typedef name of T2. (d) T1 and T2 are arrays of equivalent types with the same number of elements. (e) The unqualified types of T1 and T2 are equivalent, and both types have the same qualifications. (f) T1 and T2 are functions with equivalent return types, the same number of parameters, and all corresponding parameters are pair-wise equivalent. Type Promotion Rules. The cfloat and cint types behave like float and int types except for the usual arithmetic conversion behavior and function overloading rules (see "Function Overloading"). The usual arithmetic conversions for binary operators are defined as follows: 1. If either operand is double, the other is converted to double. 2. Otherwise, if either operand is float, the other operand is converted to float. 3. Otherwise, if either operand is half, the other operand is converted to half. 4. Otherwise, if either operand is fixed, the other operand is converted to fixed. 5. Otherwise, if either operand is cfloat, the other operand is converted to cfloat. 6. Otherwise, if either operand i
68.  negative one half scale by two     2   x 0  5                       unsigned reg  saturate  x    i e  min  x  max x  1   0    unsigned invert reg  1 saturate  x   half bias reg  x 0 5   reg  x  expand reg  2   x 0 5           Language Constructs and Support    Data Types    In the   p20 profile  operations occur on signed clamped floating point values in  the range  1 to 1  These profiles allow all data types to be used  but all  operations are carried out in the above range  Refer to the NV texture shader  and NV  register combiners documentation for more details     Statements and Operators    The   p20 profile supports all of the Cg language constructs  with the following  exceptions     a    Arbitrary swizzles are not supported  though arbitrary write masks are    Only the following swizzles are allowed    x  r  y  g  z  b  w  a    xy  rg  xyz  rgb  xyzw  rgba    xxx  rrr  yyy  ggg  zzz  bbb  www  aaa    xxxx  rrrr  yyyy  gggg  zzzz  bbbb  wwww  aaaa    Matrix swizzles are not supported     Boolean operators other than  lt    lt     gt  and  gt   are not suppotted   Purthermore   lt    lt     gt  and  gt   are only supported as the condition in the     operator     Bitwise integer operators ate not supported       is not supported unless the divisor is a non zero constant or it is used to  compute the depth output        246    808 00504 0000 004  NVIDIA       Appendix B Language Profiles    Q   is not supported        Q    Ternary    is supported if the boolean test exp
69.  not normally noticeable, except when declaring a variable that will hold the value of a boolean expression. Cg also supports the C comparison operators, which produce values of type bool: < (less than), <= (less than or equal to), != (inequality), == (equality), >= (greater than or equal to), > (greater than). Unlike C, Cg allows all boolean operators to be applied to vectors, in which case boolean operations are performed in an elementwise fashion. The result of such a boolean expression is a vector of bool elements with the same number of elements as the two source vectors. Also unlike C, the logical AND (&&) and logical OR (||) operators cannot be used for short-circuiting evaluation; side effects of both sides of these expressions always occur, regardless of the value of the boolean expression. Swizzle Operator. Cg has a swizzle operator (.) that allows the components of a vector to be rearranged to form a new vector. The new vector need not be the same size as the original vector: elements can be repeated or omitted. The characters x, y, z, and w represent the first, second, third, and fourth components of the original vector, respectively. The characters r, g, b, and a can be used for the same purpose. Because the swizzle operator is implemented efficiently in the GPU hardware, its use is usually free. The following are some examples of swizzling: float3 a
70.  profile implements data types as follows: The float data type is implemented as IEEE 32-bit single precision. The half data type is implemented as float. The int data type is supported using floating point operations, which adds extra instructions for proper truncation for divides, modulos, and casts from floating point types. The fixed or sampler* data types are not supported, but the profile does provide the minimal partial support that is required for these data types by the core language specification; that is, it is legal to declare variables using these types, as long as no operations are performed on the variables. Statements and Operators. This profile is a superset of the vp20 profile. Any program that compiles for the vp20 profile should also compile for the vp30 profile, although the converse is not true. The additional capabilities of the vp30 profile, beyond those of vp20, are: for, while, and do loops are supported without requiring loop unrolling; if/else is fully supported, allowing non-constant conditional expressions. Bindings. Binding Semantics for Uniform Data. Table 25 summarizes the valid binding semantics for uniform parameters in the vp30 profile. Table 25. vp30 Uniform Input Binding Semantics. Binding Semantics Name | Corresponding Data. register(c0)-register(c255), C0-C255 | Constant register [0..255]. The aliases c0-c255 (lowercase) are als
71.  program compute a position output. This homogeneous clip-space position is used by the hardware rasterizer and must be stored in a program output with an output binding semantic of POSITION (or HPOS for backward compatibility). Position Invariance. In many graphics APIs, the user can choose between two different approaches to specifying per-vertex computations: use a built-in configurable fixed-function pipeline or specify a user-written vertex program. If the user wishes to mix these two approaches, it is sometimes desirable to guarantee that the position computed by the first approach is bit-identical to the position computed by the second approach. This position invariance is particularly important for multipass rendering. Support for position invariance is optional in Cg vertex profiles, but for those vertex profiles that support it, the following rules apply. Position invariance with respect to the fixed-function pipeline is guaranteed if two conditions are met: (1) The vertex program is compiled using a compiler option indicating position invariance (-posinv, for example). (2) The vertex program computes position as follows: OUT_POSITION = mul(MVP, IN_POSITION), where OUT_POSITION is a variable (or structure element) of type float4 with an output binding semantic of POSITION or HPOS; IN_POSITION is a variable (or structure element) of type float4 with an input binding semantic of POSITION; MVP is a uniform variable (or structu
72.  space coordinates. Therefore, the vertex's model-space position, given by IN.Position, needs to be transformed by the concatenation of the modelview and projection matrices (called ModelViewProj in this example). The transformed position is assigned directly to OUT.HPosition. Note that you are not responsible for the perspective division when using vertex programs. The hardware automatically performs the division after executing the vertex program. Since we want to do our lighting in eye space, we have to transform the model-space normal IN.Normal to eye space:
    // transform normal from model space to view space
    float3 normalVec = normalize(mul(ModelViewIT, IN.Normal).xyz);
Remember that when transforming normals, we need to multiply by the inverse transpose of the modelview matrix. Then we normalize the eye-space normal vector and store it as normalVec. Prepare for Lighting. The subsequent steps prepare for lighting:
    // store normalized light vector
    float3 lightVec = normalize(LightVec.xyz);
    // calculate half angle vector
    float3 eyeVec = float3(0.0, 0.0, 1.0);
    float3 halfVec = normalize(lightVec + eyeVec);
At this point we have to ensure that all our vectors are normalized. We start by normalizing LightVec. Then, in preparation for specular lighting, we have to define the "half-angle" vector halfVec, which is the vector halfway between the light and the eye vector
73.  term  float shadow   saturate 4   dot  normal xyz   IN  LightVectorUnsigned  xyz                 compute final color  return  Ambient   color   shadow      illumination   color   illumination wwww          808 00504 0000 004 139  NVIDIA    Cg Language Toolkit       Bump Reflection Mapping    Description    This effect mixes bump mapping and reflection mapping based on the  texm3x3vspec DirectX 8 pixel shader instruction   DOT_PRODUCT_REFLECT_CUBE_MAP in OpenGL   This instruction  computes three dot products to transform the normal fetched from the normal  map into the environment cube space  reflects the transformed normal with  respect to the eye vector and fetches a cube map to get the final color  The  vertex shader is responsible for computing the transform matrix and the eye    vector  Figure 15         Figure 15 Example of Bump Reflection Mapping       140 808 00504 0000 004  NVIDIA    Basic Profile Sample Shaders    Vertex Shader Source Code for Bump Reflection Mapping    struct a2v      H    float4 Position   POSITION     in object space  float2 TexCoord   TEXCOORDO                 loss UW 2 MU TURON  9 STONE    in object space  float3 B   TEXCOORD2     in object space  float3 N TEXCOORD3     in object space    Struct  fay    yo    float4 Position   POSITION     in projection space  float4 TexCoord   TEXCOORDO        ff ei TOW  our  taS 929 itacresavelirroncim      from tangent to cube space  float4 TangentToCubeSpace0   TEXCOORD1           second row of the 3x3 tran
74.  than b. Returns x otherwise. cos(x): Cosine of x. cosh(x): Hyperbolic cosine of x. cross(a, b): Cross product of vectors a and b; a and b must be 3-component vectors. degrees(x): Radian-to-degree conversion. determinant(M): Determinant of matrix M. dot(a, b): Dot product of vectors a and b. exp(x): Exponential function e^x. exp2(x): Exponential function 2^x. floor(x): Largest integer not greater than x. Table 1. Mathematical Functions (continued). fmod(x, y): Remainder of x/y, with the same sign as x. If y is zero, the result is implementation-defined. frac(x): Fractional part of x. frexp(x, out exp): Splits x into a normalized fraction in the interval [1/2, 1), which is returned, and a power of 2, which is stored in exp. If x is zero, both parts of the result are zero. isfinite(x): Returns true if x is finite. isinf(x): Returns true if x is infinite. isnan(x): Returns true if x is NaN (not a number). ldexp(x, n): x * 2^n. lerp(a, b, f): Linear interpolation, (1 - f)*a + b*f, where a and b are matching vector or scalar types. Parameter f can be either a scalar or a vector of the same type as a and b. lit(ndotl, ndoth, m): Computes lighting coefficients for ambient, diffuse, and specular light contributions. Retu
75.  the Cg Language: A quick introduction to the current release of Cg, with everything you need to know to start working with it. Cg Standard Library Functions: A list of the Standard Library functions, which can help to reduce your program development time. Using the Cg Runtime Library: An introduction to the Cg runtime APIs, which allow you to easily compile Cg programs and pass data to them from within applications. A Brief Tutorial: A description of a simple Cg program and Microsoft Visual Studio workspace, both provided on the accompanying CD, that you can use to start experimenting with Cg. Advanced Profile Sample Shaders: A list of sample NV30 shaders, complete with source code. Basic Profile Sample Shaders: A list of sample NV2X shaders, complete with source code. Appendix A, Cg Language Specification: The formal Cg language specification. Appendix B, Language Profiles: Describes features and restrictions of the currently supported language profiles: DirectX 8 vertex, DirectX 8 pixel, OpenGL ARB vertex, NV2X OpenGL vertex, NV30 OpenGL vertex, and NV30 OpenGL fragment. Appendix C, Nine Steps to High Performance Cg: Strategies for getting the most out of your Cg code. Appendix D, Cg Compiler Options: A list of the various command-line options that the Cg compiler accepts.
76.  the language (by using a compiler command-line switch, for example). The profile restrictions are only applied to the top-level function that is being compiled and to any variables or functions that it references, either directly or indirectly. If a function is present in the source code, but not called directly or indirectly by the top-level function, it is free to use capabilities that are not supported by the current profile. The intent of these rules is to allow a single Cg source file to contain many different top-level functions that are targeted at different profiles. The core Cg language specification is sufficiently complete to allow all of these functions to be parsed. The restrictions provided by a compilation profile are only needed for code generation, and are therefore only applied to those functions for which code is being generated. This specification uses the word program to refer to the top-level function, any functions the top-level function calls, and any global variables or typedef definitions it references. Each profile must have a separate specification that describes its characteristics and limitations. This core Cg specification requires certain minimum capabilities for all profiles. In some cases, the core specification distinguishes between vertex program and fragment program profiles, with different minimum capabilities for each. The Unifor
77.  three ways: (a) The binding semantic is specified in the formal parameter declaration for the function. The syntax for formal parameters to a function is: [const] [in | out | inout] <type> <identifier> [: <binding-semantic>] [= <initializer>]. (b) If the formal parameter is a struct, the binding semantic may be specified with an element of the struct when the struct is defined: struct <struct-tag> { <type> <identifier> [: <binding-semantic>]; ... }; (c) If the input to the function is implicit (a non-static global variable that is read by the function), the binding semantic may be specified when the non-static global variable is declared: <type> <identifier> [: <binding-semantic>] [= <initializer>]; If the non-static global variable is a struct, the binding semantic may be specified when the struct is defined, as described in the second bullet above. (d) A binding semantic may be associated with the output of a top-level function in a similar manner: <type> <identifier> (<parameter-list>) [: <binding-semantic>] { <body> }. Another method available for specifying a semantic for an output value is to return a struct and to specify the binding semantic(s) with elements of the struct when the struct is defined. In addition, if the output is a formal parameter, the binding semantic may be specified using the same a
78.  uniform sampler2D tex  float2 st   float4 prevlookup  uniform float4 m    offsettexRECT  uniform samplerRECT tex  float2 st   float4 prevlookup  uniform float4 m        Performs the following   float2 newst   st   m xy   prevlookup xx   m zw   prevlookup yy   return tex2D RECT tex  newst    where  st are texture coordinates associated with sampler tex   prevlookup is the result of a previous texture operation  and  m is the offset texture matrix     This function can be used to generate the o    set 2d or  offset rectangle NV texture shader instructions                 808 00504 0000 004 251  NVIDIA    Cg Language Toolkit    Table 50   p20 Auxiliary Texture Functions  continued        Texture Function       Description       offsettex2DScaleBias  uniform sampler2D tex  float2 st   float4 prevlookup  uniform float4 m   uniform float scale  uniform float bias   offsettexRECTScaleBias  uniform samplerRECT tex  float2 st   float4 prevlookup  uniform float4 m   uniform float scale  uniform float bias        Performs the following  float2 newst   st   m xy   prevlookup xx   m zw   prevlookup yy   float4 result   tex2D RECT  tex  newst    return result   saturate  prevlookup z   scale   bias    where  st are texture coordinates associated with sampler tex   prevlookup is the result of a previous texture operation   m is the offset texture matrix   scale is the offset texture scale  and  bias is the offset texture bias     This function can be used to generate the o  fset 2d scale o
79.  used        Figure9 Example of Ray Traced Refraction       114 808 00504 0000 004  NVIDIA    Vertex Shade    Advanced Profile Sample Shaders    r Source Code for Ray Traced Refraction    struct appin         float4 Position see Oouelak N  float4 Normal   NORMAL   y      output    same struct is the input to fragment shader    struct EyeV2F         float4 HPosition   POSITION     clip space pos    float3 OPosition  float3 VPosition  float3 N   float4 LightVecO                    OUT        OUT    uA  OUT        OUT       Sy    TEXCOORDO     Obj coords location  TEXCOORD1     eye pos  obj space   TEXCOORD2     normal  obj space    TEXCOORD3     light dir  obj sp     a    q          E    EyeV2F main  appin IN     uniform float4x4 ModelViewProj   uniform float4x4 ModelViewI   uniform float4 LightVec     in EYE coords                EyeV2F OUT     calculate clip space position for rasterizer use    HPosition   mul ModelViewProj  IN Position      pass through object space position  MOROS TENON AUN POSPONE     object space normal  T N   normalize IN Normal xyz      transform view pos and light vec to obj space    VPosition   mul ModelViewI  float4 0 0 0 1   xyz        OUT      LightVecO   normalize  mul  ModelViewI  LightVec       return OUT        808 00504 0000 004    115  NVIDIA    Cg Language Toolkit    Pixel Shader Source Code for Ray Traced Refraction       Assume ray direction is normalized      Vector  planeEq  is encoded half3 A B C D  where      Ax By Cz D  0 and half3 A 
80.  void cgGLDisableProfile(CGprofile profile); Some profiles may not be supported on some systems. For example, a given profile is not supported if the OpenGL extensions it requires are not available. You can check if a profile is supported by using cgGLIsProfileSupported(): CGbool cgGLIsProfileSupported(CGprofile profile); It returns CG_TRUE if profile is supported and CG_FALSE otherwise. OpenGL Program Examples. This section presents code that illustrates how to use functions from the OpenGL Cg interface to make Cg programs work with OpenGL. The vertex and fragment programs below are used in "OpenGL Application". OpenGL Vertex Program. The following Cg code is assumed to be in a file called VertexProgram.cg:
    void VertexProgram(
        in float4 position : POSITION,
        in float4 color : COLOR0,
        in float4 texCoord : TEXCOORD0,
        out float4 positionO : POSITION,
        out float4 colorO : COLOR0,
        out float4 texCoordO : TEXCOORD0,
        const uniform float4x4 ModelViewMatrix)
    {
        positionO = mul(position, ModelViewMatrix);
        colorO = color;
        texCoordO = texCoord;
    }
OpenGL Fragment Program. The following Cg code is assumed to be in a file called FragmentProgram.cg:
    void FragmentProgram(
        in float4 color : COLOR0,
        in float4 texCoord : TEXCOORD0,
        out float4 colorO : COLOR0,
        const uniform sampler2D BaseTexture,
        const uniform float4 SomeColor)
    {
        colorO = color * tex2D(
81.  x. tanh(x): Hyperbolic tangent of x. transpose(M): Matrix transpose of matrix M. If M is an AxB matrix, the transpose of M is a BxA matrix whose first column is the first row of M, whose second column is the second row of M, whose third column is the third row of M, and so on. Geometric Functions. Table 2 presents the geometric functions that are provided in the Cg Standard Library. Table 2. Geometric Functions. distance(pt1, pt2): Euclidean distance between points pt1 and pt2. faceforward(N, I, Ng): N if dot(Ng, I) < 0; otherwise -N. length(v): Euclidean length of a vector. normalize(v): Returns a vector of length 1 that points in the same direction as vector v. reflect(i, n): Computes reflection vector from entering ray direction i and surface normal n. Only valid for 3-component vectors. refract(i, n, eta): Given entering ray direction i, surface normal n, and relative index of refraction eta, computes refraction vector. If the angle between i and n is too large for a given eta, returns (0,0,0). Only valid for 3-component vectors. Texture Map Functions. Table 3 presents the texture functions that are provided in the Cg Standard Library. These texture functions are fully supported by the ps_2, arbfp1, and
82.  you can precompute the function in your application and store it in a texture map, replacing calls like
    float val = f(x, y);
with code like
    float val = tex2D(fSampler, float2(x, y)).x;
This method can also be applied to one- and three-dimensional functions, using 1D and 3D texture maps. More generally, the values you pass to the function may not be in the range [0,1], and the values your function returns may not be in the range [0,1]. In this case, the following two utility functions can serve as a base: remapTo01() remaps the range [low, high] into [0,1]; remapFrom01() does the opposite.
    float4 remapTo01(float4 v, float4 low, float4 high) {
        return saturate((v - low) / (high - low));
    }
    float4 remapFrom01(float4 v, float4 low, float4 high) {
        return lerp(low, high, v);
    }
Don't forget vectorization here as well. If two float-valued functions have the same domain and range, you can pack them into two texture components of the same texture. Only one texture lookup is needed to load them both, and vectorized versions of the remap functions can be used to do the remapping more efficiently as well. 5. Use Data Types with Minimum Sufficient Precision. For profiles that support multiple precisions, a general rule of thumb is that if you can do a computation with fixed precision variables, the computation is faster than if you use half, and if you use half, the comput
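The remapping helpers in snippet 82 can be sketched on the CPU side as well, for example when an application fills the lookup texture. This is a minimal scalar sketch in C, not the manual's code: the names `remap_to_01`, `remap_from_01`, and `saturate_f` are hypothetical, and the sketch assumes `high != low`.

```c
/* Clamp x to [0, 1], mirroring what Cg's saturate() does for a scalar. */
static float saturate_f(float x) {
    if (x < 0.0f) return 0.0f;
    if (x > 1.0f) return 1.0f;
    return x;
}

/* Map v from [low, high] into [0, 1]. Assumes high != low. */
float remap_to_01(float v, float low, float high) {
    return saturate_f((v - low) / (high - low));
}

/* Map v from [0, 1] back into [low, high], written as the
   linear interpolation (1 - v) * low + v * high. */
float remap_from_01(float v, float low, float high) {
    return (1.0f - v) * low + v * high;
}
```

An application could call `remap_to_01` on each function value before writing it into the texture, and the shader would apply the inverse mapping after the lookup.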
83. 08 00504 0000 004  NVIDIA    a    Appendix A Cg Language Specification    Array conversions  No convetsions of array types are allowed     Table 6 summarizes the type conversions discussed here  The table entries have  the following meanings  but please pay attention to the footnotes     a       a  a  a    Allowed  allowed implicitly or explicitly  Warning  allowed  but warning issued if implicit  Explicit  only allowed with explicit cast   No  not allowed    Table 6 Type Conversions                                                 Target Type Source Type  Scalar Vector Matrix Struct Array   Scalar Allowed Warning Warning Explicit  No  Vector Allowed Allowed    Warning    Explicit  No  Matrix Allowed   Warning    Allowed  Explicit  No  Struct Explicit No No Explicit    No  Array No No No No No   i Only allowed if the first member of the source can be converted to the target    ii  Not allowed if target is larger than source  Warning issued if target is smaller than source    ii  Only allowed if source and target are the same total size    iv  Only allowed if both source and target have the same number of members  and each member of the    source can be converted to the corresponding member of the target     Explicit casts are          Q Compile time type when applied to expressions of compile time type  Q Numeric type when applied to expressions of numeric or compile time type  Q Numeric vector type when applied to another vector type of the same number  of elements  Q Numeric
84. 155  Descriptloll ous qoe rad e A A ROR Ewa ees 155  Vertex Shader Source Code for Shadow Volume Extrusion              llli  156  Sine Wave DEMO    suse s adum s POSS SEERA Rd ai SACR CER RR 158   bs goo na MMC T  I   158  Vertex Shader Source Code for Sine Wave    isses rar ka Rs 159  Matrix Palette Skinning      5 ico wb Oe Re ee RARI UDeE GU ERE Laud LASSE eRe 161  DDSScFIDUON fests   x 80k dor ciat mart es a end aoo A 161  Vertex Shader Source Code for Matrix Palette Skinning              o o oooooo   162  808 00504 0000 004 ii    NVIDIA    Cg Language Toolkit       Appendix A  Cg Language Specification       ooooooooccccc nnnm 165  Language OVSIVI amp W 2 53 32 ab sod after odios Bul a RE quendi ie det Sidi 165  Silent Incompatbilities  ssa x  xut rr a dn tr hne Re Rh or Rene 165  Similar Operations That Must be Expressed Differently            o oooooooo    165  Differences from ANSL C cias e ec Ru ru abs cR RR HERR 166  Detailed Language Specification    is  2i  eci ok RR RR RERO ERR RE 168  Definitions 4a dod e eer Oa ORR oR SE bem kae PU RUD 4p BORNE I eno da 168  Profiles ss pi mE usd dd RI RR E CiU EIER ER RUE as 168  The Uniform  Modifier a cs ree y e ala a R cR Ron 169  Function DeclaraltlOrs s    3 2 2 22 eas ros Ce EE aaa ls 169  Overloading of Functions by Profile           serrtis entieri m hn 170  Syntax for Parameters in Function Definitions             llle 171  Function Call sc 3  nk none RR R Rh EUR GR EROR RR Da DEA RA EEA Beate a 171   uo rr ad ra id 
85. 188  Operator Enhancements sss  esa tka sra kc E PRO RE EROR ERE 188  olco SCC  e ea ea e e a a d aa a 189  Reserved WOEdS   aora A RA 191  Cg Standard Library Functians     iiu sx a e da Ue E AS 191  Vertex Program Profiles    iacu iu ini 192  Mandatory Computation of Position Output            llle 192  iv 808 00504 0000 004    NVIDIA    Position  Imvatial e     3  a anu nm harundo Xx cR AR a a 192       Binding Semantics for OutpUtS Hi    ar eee ead tracer donando e died 193  Fragment Program Profiles  sisi nage hots aie Re eae debe Melee a ee US Sto Se 193  Binding Semantics for OUEDUlS  ccna e aise ee die hd ea 193  Appendix B  Language Profiles 2 0 06 000 00 0sse eee ee eee 195  DirectX Vertex Shader 2 x Profiles  vs 2     0    cece eect iishie inan iia 196  OVEVI Wicca or ta nae wad Ree OE X FCR Peas Red aaa ons emo 196  MEMORY si erta enpa Eh a b OE Gee ae o a ebd ee 196  Statements and Operators  i esr e iu pd ek A Ao e 197  Data IDES 5 aide ach a ERR CR CER CR CIRC a ac OR CR 197  Using  Arfdys ix cedro bere ve dane eee eRe x RU RR Pd hw cuo Fab Oe n 197  BINDINGS  lt   Sx wastes exec eed e ek da de Ead REG 198  CODES su costuatd ola aiite in diis dereud mv dcs Sees Pada S sainte ue E 199  DirectX Pixel Shader 2 x Profiles  ps  2 Jo uude ic dc aeter eee tee eda 200  MEMON  EET a e ert EN E MERE 200  Language    Constructs and S  ppott va t a 94s 40 bce Re a a ew eee 201  BINGINGS zs kae n n AAA RETO RE a AAA 202  A                              HER 203  Limitations i
86. 3. If the number of functions remaining in the set is not one, then fail. Global Variables. Global variables are declared and used as in C. Uniform non-static variables may have a semantic associated with them. Uniform non-static variables may have their value set through the run-time API. Use of Uninitialized Variables. It is incorrect for a program to use an uninitialized variable. However, the compiler is not obligated to detect such errors, even if it would be possible to do so by compile-time data-flow analysis. The value obtained from reading an uninitialized variable is undefined. This same rule applies to the implicit use of a variable that occurs when it is returned by a top-level function. In particular, if a top-level function returns a struct, and some element of that struct is never written, then the value of that element is undefined. Note: Variables are not defined as being initialized to zero because this would result in a performance penalty in cases where the compiler is unable to determine if a variable is properly initialized by the programmer. Preprocessor. Cg profiles must support the full ANSI C standard preprocessor capabilities: #if, #define, and so on. However, Cg profiles are not required to support macro-like #define or the use of #include directives. Overview of Binding Semantics. In stream processing architectures, data packets f
87.  DirectX Vertex Shader 2.x Profiles (vs_2_*). The DirectX Vertex Shader 2.0 profiles are used to compile Cg source code to DirectX 9 VS 2.0 vertex shaders and DirectX 9 VS 2.0 Extended vertex shaders. Profile names: vs_2_0 (for DirectX 9 VS 2.0 vertex shaders), vs_2_x (for DirectX 9 VS 2.0 extended vertex shaders). How to invoke: Use the compiler options -profile vs_2_0 or -profile vs_2_x. This section describes how using the vs_2_0 and vs_2_x profiles affects the Cg source code that the developer writes. Overview. The vs_2_0 profile limits Cg to match the capabilities of DirectX VS 2.0 vertex shaders. The vs_2_x profile is the same as the vs_2_0 profile but allows extended features such as dynamic flow control (branching). Memory. DirectX 9 vertex shaders have a limited amount of memory for instructions and data. Program Instruction Limit. DirectX 9 vertex shaders are limited to 256 instructions. If the compiler needs to produce more than 256 instructions to compile a program, it reports an error. Vector Register Limit. Likewise, there are limited numbers of registers to hold program parameters and temporary results. Specifically, there are 256 read-only vector registers and 12-32 read/write vector registers. If the compiler needs more registers to compile a program than are available, it generates an error. 1. To understand the DirectX VS 2.0 Vertex Shaders and the code the compile
88. 535  07 pack_4byte(). float pack_4byte(float4 a); float pack_4byte(half4 a); Converts the four components of a into 8-bit signed integers. The signed integers are such that a representation with all bits set to 0 corresponds to the value -(128/127), and a representation with all bits set to 1 corresponds to +(127/127). The four signed integers are then packed into a single 32-bit result. This operation may be reversed using the unpack_4byte() function. C Pseudocode:
    ub.x = round(127.0 * clamp(a.x, -128.0/127.0, 127.0/127.0) + 128.0);
    ub.y = round(127.0 * clamp(a.y, -128.0/127.0, 127.0/127.0) + 128.0);
    ub.z = round(127.0 * clamp(a.z, -128.0/127.0, 127.0/127.0) + 128.0);
    ub.w = round(127.0 * clamp(a.w, -128.0/127.0, 127.0/127.0) + 128.0);
    return ((ub.w << 24) | (ub.z << 16) | (ub.y << 8) | ub.x);
unpack_4byte(). half4 unpack_4byte(float a); Unpacks four 8-bit integers from a and scales the results into individual 16-bit floating-point values between -(128/127) and +(127/127). C Pseudocode:
    result.x = (((a >> 0) & 0xFF) - 128) / 127.0;
    result.y = (((a >> 8) & 0xFF) - 128) / 127.0;
    result.z = (((a >> 16) & 0xFF) - 128) / 127.0;
    result.w = (((a >> 24) & 0xFF) - 128) / 127.0;
pack_4ubyte(). float pack_4ubyte(float4 a); float pack_4ubyte(half4 a); Converts the four components of a into 8-bit unsigned integers. The unsigned integers are such that a representation with all bits set to 0 corresponds to 0
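The byte mapping that snippet 88 describes for pack_4byte() and unpack_4byte() (all bits 0 means -128/127, all bits 1 means 127/127, x in the low byte) can be checked with a small host-side C sketch. The function names below are hypothetical, the packed value is returned as a `uint32_t` rather than as bits stored in a float, and rounding is done as round-half-up; these are assumptions for illustration, not the profile's definition.

```c
#include <stdint.h>

/* Clamp v to [-128/127, 127/127] and map it to a byte value 0..255. */
static uint32_t quantize_signed_byte(float v) {
    const float lo = -128.0f / 127.0f;
    const float hi = 1.0f;
    float t;
    if (v < lo) v = lo;
    if (v > hi) v = hi;
    t = 127.0f * v + 128.0f;        /* roughly in [0, 255] */
    return (uint32_t)(t + 0.5f);    /* round half up, then truncate */
}

/* Pack four components: a[0] (x) in the low byte, a[3] (w) in the high byte. */
uint32_t pack_4byte_sketch(const float a[4]) {
    return (quantize_signed_byte(a[3]) << 24) |
           (quantize_signed_byte(a[2]) << 16) |
           (quantize_signed_byte(a[1]) << 8)  |
            quantize_signed_byte(a[0]);
}

/* Reverse the packing: each byte b becomes (b - 128) / 127. */
void unpack_4byte_sketch(uint32_t packed, float out[4]) {
    int i;
    for (i = 0; i < 4; ++i) {
        uint32_t b = (packed >> (8 * i)) & 0xFFu;
        out[i] = ((float)b - 128.0f) / 127.0f;
    }
}
```

Packing and then unpacking loses precision, since each component is quantized to one of 256 levels.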
89. Here we've again got a lot of arithmetic operations, each using a single pair of float values. Some cleverness lets us turn this into a vectorized operation. Below is the implementation of the cross() function from the Cg Standard Library, requiring just two vector multiply operations and one vector subtraction operation.

    float3 cross(float3 a, float3 b)
    {
        return a.yzx * b.zxy - a.zxy * b.yzx;
    }

Confirm for yourself that this computes the same value as the first section of code for the cross product; note that it exposes much more vectorized computation for the GPU to efficiently process.

3. Use the Cg Standard Library

The functions in the Cg Standard Library have been carefully written for both efficiency and correctness. By using Standard Library functions when appropriate, you can automatically take advantage of the work that went into making sure they compile to fast code on GPUs while you concentrate on the hard problems you're solving in your own shaders.

Particularly fast Standard Library functions include dot(), which computes the dot product of two vectors; abs(), which computes the absolute value of a variable; saturate(), which clamps a value to be between zero and one; and min() and max(), which return the minimum and maximum of a pair of values. You won't be able to write more efficient implementations of these functions than the Standard Library pr
90. B C  has been normalized      Returns distance along to to intersection  distance is     negative if no intersection   half intersect plane half3 rayOrigin half3 rayDir   half4 planeEg      half3 planeN   planeEq xyz    half denominator   dot  planeN  rayDir     half result    1 0h                 d  0     parallel    d  0   gt  faces away   if  denominator  lt  0 0h     half top   dot planeN rayOrigin    planeEq w   result    top denominator             return result        subfields in  BallData   define RADIUS x   define IRIS DEPTH y  define ETA z  define LENS DENSITY w                subfields in  SpecData   define PHONG x   define GLOSS1 y   define GLOSS2 z   define DROP w                      struct EyeV2F      silio d ted EO Site OL  2 O SII ONIS   Flogs  iS kao nue mE LEEK OORD 0s   float3 VPosition TEXCOORD1   float3 N TEXCOORD2    float4 LightVecO TEXCOORD3           H       half4 main EyeV2F IN   uniform sampler2D ColorMap     color     components   radius irisDepth  eta  lensDensity           116 808 00504 0000 004  NVIDIA    Advanced Profile Sample Shaders    uniform float4 BallData       components   phongExp glossi gloss2 drop   uniform float4 GlossData    uniform float3 AmbiColor    uniform float3 DiffColor    uniform float3 SpecColor    uniform float3 LensColor    massa Oconee CC OMe    COLOR             const half3 baseTex   half3 1 0h 1 0n 1 0h    const half GRADE   0 05h    const half3 yAxis nales  10 Ola  il  Olay   0  5 Ola     CONS ESAS aele S  AL   Ola
91. BaseTexture, texCoord) + SomeColor;
    }

OpenGL Application

This C code links the previous vertex and fragment programs to the application.

    #include <cg/cg.h>
    #include <cg/cgGL.h>

    float* vertexPositions;   // Initialized somewhere else
    float* vertexColors;      // Initialized somewhere else
    float* vertexTexCoords;   // Initialized somewhere else
    GLuint texture;           // Initialized somewhere else
    float constantColor[];    // Initialized somewhere else

    CGcontext context;
    CGprogram vertexProgram, fragmentProgram;
    CGprofile vertexProfile, fragmentProfile;
    CGparameter position, color, texCoord, baseTexture, someColor,
                modelViewMatrix;

    // Called at initialization
    void CgGLInit()
    {
        // Create context
        context = cgCreateContext();

        // Initialize profiles and compiler options
        vertexProfile = cgGLGetLatestProfile(CG_GL_VERTEX);
        cgGLSetOptimalOptions(vertexProfile);
        fragmentProfile = cgGLGetLatestProfile(CG_GL_FRAGMENT);
        cgGLSetOptimalOptions(fragmentProfile);

        // Create the vertex program
        vertexProgram = cgCreateProgramFromFile(
            context, CG_SOURCE, "VertexProgram.cg",
            vertexProfile, "VertexProgram", 0);

        // Load the program
        cgGLLoadProgram(vertexProgram);

        // Create the fragment program
        fragmentProgram = cgCreateProgramFromFile(
            context, CG_SOURCE, "FragmentProgram.cg",
            fragmentProfile, "FragmentProgram", 0);

        // Loa
92. CopyProgram():

    CGprogram cgCopyProgram(CGprogram program);

This function creates a new program object that is a copy of program and adds it to the same context. So, you can have several versions of the same original program, each of them modified in a particular way.

Program Iteration

The programs within a context are sequentially ordered and can be iterated over by using cgGetFirstProgram() and cgGetNextProgram():

    CGprogram cgGetFirstProgram(CGcontext context);
    CGprogram cgGetNextProgram(CGprogram program);

The first program of the sequence is retrieved by cgGetFirstProgram(). If the context is invalid or does not contain any program, the function returns zero. Given a program, cgGetNextProgram() returns the program immediately next in the sequence, or zero if there is none. Here is how those two functions would typically be used given a valid context named context:

    CGprogram program = cgGetFirstProgram(context);
    while (program != 0) {
        /* Here is the code that handles the program */
        program = cgGetNextProgram(program);
    }

Nothing is guaranteed regarding the order of the programs in the sequence or how cgGetFirstProgram() and cgGetNextProgram() behave when programs are created or destroyed during iteration.

Program Query

Program queries encompass validity, compilation results, and attributes.

Program Validity

Use cgIsProgram() to check whether a program handle refere
93. D3D9SetTextureWrapMode(parameter, D3DWRAP_U | D3DWRAP_V);

Parameter Shadowing

Parameter shadowing can be enabled or disabled on a per-program basis:

- When loading the program (see "Expanded Interface Program Execution" on page 74)
- At any time using

      HRESULT cgD3D9EnableParameterShadowing(CGprogram program, CGbool enable);

  for which enable should be set to CG_TRUE to enable parameter shadowing and to CG_FALSE to disable it.

To know if parameter shadowing is enabled for a given program, use

    CGbool cgD3D9IsParameterShadowingEnabled(CGprogram program);

This function returns CG_TRUE if parameter shadowing is enabled for program.

Expanded Interface Program Execution

To load a program in Direct3D 9 use cgD3D9LoadProgram():

    HRESULT cgD3D9LoadProgram(CGprogram program,
                              CGbool parameterShadowingEnabled,
                              DWORD assembleFlags);

This function assembles the result of the compilation of program using D3DXAssembleShader() with assembleFlags as the D3DXASM flags. Depending on the program's profile, it then either uses IDirect3DDevice9::CreateVertexShader() to create a Direct3D 9 vertex shader, or uses IDirect3DDevice9::CreatePixelShader() to create a Direct3D 9 pixel shader.

Here is a typical use of the function:

    HRESULT hresult = cgD3D9LoadProgram(vertexProgram, TRUE, D3DXASM_DEBUG);
    hresult = cgD3D9LoadProgram(fragmentProgram, TRUE, 0);
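The effect of the parameterShadowingEnabled flag can be pictured with a toy model. The sketch below is not the Cg runtime and makes no Cg or Direct3D calls; every name in it (ToyProgram, toy_init, toy_set_uniform, toy_bind, g_device_regs) is hypothetical. It assumes the usual description of shadowing: when it is enabled, a value set on a parameter is cached and written to the device again each time the program is bound; when it is disabled, the value goes straight to the device and nothing is kept.

```c
#include <assert.h>
#include <string.h>

#define TOY_NUM_REGS 8  /* hypothetical number of device constant registers */

/* Stand-in for the device's constant registers (one float each). */
float g_device_regs[TOY_NUM_REGS];

typedef struct {
    int   shadowing;                /* nonzero: cache values, replay on bind */
    float shadow[TOY_NUM_REGS];     /* cached ("shadowed") values            */
    int   has_shadow[TOY_NUM_REGS]; /* which registers hold a cached value   */
} ToyProgram;

const ToyProgram *g_bound = 0;      /* program currently bound, if any */

void toy_init(ToyProgram *p, int shadowing)
{
    memset(p, 0, sizeof *p);
    p->shadowing = shadowing;
}

void toy_set_uniform(ToyProgram *p, int reg, float value)
{
    assert(reg >= 0 && reg < TOY_NUM_REGS);
    if (p->shadowing) {
        /* Remember the value; touch the device only if p is bound now. */
        p->shadow[reg] = value;
        p->has_shadow[reg] = 1;
        if (g_bound == p)
            g_device_regs[reg] = value;
    } else {
        /* No cache: the value goes to the device and is not remembered. */
        g_device_regs[reg] = value;
    }
}

void toy_bind(const ToyProgram *p)
{
    int i;
    g_bound = p;
    if (!p->shadowing)
        return;                     /* nothing cached, nothing to replay */
    for (i = 0; i < TOY_NUM_REGS; ++i)
        if (p->has_shadow[i])
            g_device_regs[i] = p->shadow[i];
}
```

In this model a program without shadowing loses its value as soon as another program writes the same register, so the application has to set it again after every bind. A program with shadowing gets its values back automatically, at the cost of the extra cached copy.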
95. DTSS_MIPFILTER for sampler parameter 'BaseTexture'
    cgD3D(TRACE): Deleting vertex shader for program 3
    cgD3D(TRACE): Deleting pixel shader for program 24

To use the debug DLL:

1. Link your application against cgD3D9d.lib (or cgD3D8d.lib) instead of cgD3D9.lib (or cgD3D8.lib).

2. Make sure that the application can find cgD3D9d.dll (or cgD3D8d.dll).

3. Turn on and turn off tracing of portions of your code using cgD3D9EnableDebugTracing():

    void cgD3D9EnableDebugTracing(CGbool enable);

Here is how you would enable debug tracing for part of the application code:

    cgD3D9EnableDebugTracing(CG_TRUE);
    // ...
    // Application code that is traced
    // ...
    cgD3D9EnableDebugTracing(CG_FALSE);

Note that each debug trace output sets an error equal to cgD3D9DebugTrace. So, if an error callback has been registered with the core runtime using cgSetErrorCallback(), each debug trace output triggers a call to this error callback (see "Using Error Callbacks" on page 87).

Direct3D Error Reporting

Error reporting in Cg includes defined error types, functions that allow testing for errors, and support for error callbacks.

Direct3D Error Types

The Direct3D runtime generates errors of type CGerror, reported by the Cg core runtime, and of type HRESULT, reported by the Direct3D runtime. In addition, it returns the errors listed in the next two groups that are specific to the Direct3D Cg runtime.
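Because every debug trace also reaches the error callback, a callback usually wants to tell traces apart from real errors. The sketch below shows that filtering pattern on its own, with hypothetical stand-ins (ToyError, toy_set_error_callback, toy_get_error, toy_raise) in place of the real CGerror, cgSetErrorCallback(), and cgGetError(), so that it compiles without the Cg headers. It assumes, as with cgGetError(), that reading the last error also resets it.

```c
#include <assert.h>

/* Hypothetical stand-ins for CGerror values; not the real constants. */
typedef int ToyError;
enum { TOY_NO_ERROR = 0, TOY_D3D_FAILED = 1, TOY_DEBUG_TRACE = 2 };

ToyError g_last_error = TOY_NO_ERROR;
void (*g_callback)(void) = 0;

int g_traces_seen = 0;  /* debug traces that reached the callback */
int g_real_errors = 0;  /* everything else                        */

/* Plays the role of cgSetErrorCallback(). */
void toy_set_error_callback(void (*cb)(void)) { g_callback = cb; }

/* Plays the role of cgGetError(): returns the last error and resets it. */
ToyError toy_get_error(void)
{
    ToyError e = g_last_error;
    g_last_error = TOY_NO_ERROR;
    return e;
}

/* What the runtime does internally: record the error, then call back. */
void toy_raise(ToyError e)
{
    assert(e != TOY_NO_ERROR);
    g_last_error = e;
    if (g_callback)
        g_callback();
}

/* Application callback: count traces separately from real errors. */
void on_error(void)
{
    ToyError e = toy_get_error();
    if (e == TOY_DEBUG_TRACE) {
        ++g_traces_seen;    /* informational only, not a failure */
        return;
    }
    ++g_real_errors;
}
```

In a real application the same test would compare the value returned by cgGetError() against cgD3D9DebugTrace before treating the call as a failure.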
96. GetArraySize() gives the size of every dimension. For example, for float4 array[10][100], cgGetArraySize(array, 0) returns 10 and cgGetArraySize(array, 1) returns 100. An array, anArray, has cgGetArraySize(anArray, 0) elements. If its dimension is greater than one, those elements are themselves arrays.

Here is how all these iteration functions would typically be used given a valid program named program:

    void RecurseProgramParameters(CGparameter parameter);

    void IterateProgramParameters(CGprogram program)
    {
        RecurseProgramParameters(
            cgGetFirstParameter(program, CG_PROGRAM));
    }

    void RecurseProgramParameters(CGparameter parameter)
    {
        if (parameter == 0)
            return;
        do {
            switch (cgGetParameterType(parameter)) {
            case CG_STRUCT:
                RecurseProgramParameters(
                    cgGetFirstStructParameter(parameter));
                break;
            case CG_ARRAY: {
                int arraySize = cgGetArraySize(parameter, 0);
                for (int i = 0; i < arraySize; ++i)
                    RecurseProgramParameters(
                        cgGetArrayParameter(parameter, i));
                break;
            }
            default:
                // Here is the code that handles the parameter
                break;
            }
        } while ((parameter = cgGetNextParameter(parameter)) != 0);
    }

If you do not need to know how the parameters are organized in terms of structure and arrays, you can also iterate through all of them using cgGetFirstLeafParameter() and cgGetNextLeafParameter():

    CGparameter cgGetFirstLeafParameter(CGprogram program,
                                        CGenum namespace);
    CGparameter cgGetNextLeafParameter(CGparameter 
97. const uniform sampler2D BaseTexture,
    const uniform float4 SomeColor)
    {
        color0 = color * tex2D(BaseTexture, texCoord) + SomeColor;
    }

Direct3D 9 Application

The following C code links the previous vertex and fragment programs to the Direct3D 9 application.

    #include <cg/cg.h>
    #include <cg/cgD3D9.h>

    IDirect3DDevice9* device;     // Initialized somewhere else
    IDirect3DTexture9* texture;   // Initialized somewhere else
    D3DXMATRIX matrix;            // Initialized somewhere else
    D3DXCOLOR constantColor;      // Initialized somewhere else

    CGcontext context;
    CGprogram vertexProgram, fragmentProgram;
    IDirect3DVertexDeclaration9* vertexDeclaration;
    IDirect3DVertexShader9* vertexShader;
    IDirect3DPixelShader9* pixelShader;
    CGparameter baseTexture, someColor, modelViewMatrix;

    // Called at application startup
    void OnStartup()
    {
        // Create context
        context = cgCreateContext();
    }

    // Called whenever the Direct3D device needs to be created
    void OnCreateDevice()
    {
        // Create the vertex shader
        vertexProgram = cgCreateProgramFromFile(context, CG_SOURCE,
            "VertexProgram.cg", CG_PROFILE_VS_2_0, "VertexProgram", 0);
        CComPtr<ID3DXBuffer> byteCode;
        const char* progSrc = cgGetProgramString(vertexProgram,
            CG_COMPILED_PROGRAM);
        D3DXAssembleShader(progSrc, strlen(progSrc), 0, 0, 0,
            &byteCode, 0);

        // If your program uses explicit binding semantics (like
        // this one), you c
99. Program Instruction Limits

The DirectX 8 vertex shaders are limited to 128 instructions. If the compiler needs to produce more than 128 instructions to compile a program, it reports an error.

Vector Register Limits

Likewise, there are limited numbers of registers to hold program parameters and temporary results. Specifically, there are 96 read-only vector registers and 12 read/write vector registers. If the compiler needs more registers to compile a program than are available, it generates an error.

Language Constructs and Support

Data Types

This profile implements data types as follows:

- float data types are implemented as IEEE 32-bit single precision.
- half and double data types are treated as float.
- int data type is supported using floating point operations, which adds extra instructions for proper truncation for divides, modulos, and casts from floating point types.
- fixed or sampler* data types are not supported, but the profile does provide the minimal partial support that is required for these data types by the core language specification; that is, it is legal to declare variables using these types, as long as no operations are performed on the variables.

5. To understand the DirectX VS 1.1 Vertex Shaders and the code the compiler produces, see the Vertex Shader Reference in the DirectX 8.1 SDK documentation.

Statements and Operators

The if, while, do, an
100.     D3DVSD_SKIP(4),
        D3DVSD_REG(D3DVSDE_TEXCOORD0, D3DVSDT_FLOAT2),
        D3DVSD_END()
    };

This is true because D3DDECLUSAGE_POSITION and D3DVSDE_POSITION match the hardware register associated with the predefined semantic POSITION, D3DDECLUSAGE_COLOR and D3DVSDE_DIFFUSE match the register associated with COLOR0, and D3DDECLUSAGE_TEXCOORD0 and D3DVSDE_TEXCOORD0 match the register associated with TEXCOORD0.

The above declarations can also be written the following way using cgD3D9ResourceToDeclUsage() or cgD3D8ResourceToInputRegister():

    const D3DVERTEXELEMENT9 declaration[] = {
        { 0, 0 * sizeof(float),
          D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
          cgD3D9ResourceToDeclUsage(CG_POSITION), 0 },
        { 0, 3 * sizeof(float),
          D3DDECLTYPE_D3DCOLOR, D3DDECLMETHOD_DEFAULT,
          cgD3D9ResourceToDeclUsage(CG_COLOR0), 0 },
        { 1, 4 * sizeof(float),
          D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT,
          cgD3D9ResourceToDeclUsage(CG_TEXCOORD0), 0 },
        D3DDECL_END()
    };

    DWORD declaration[] = {
        D3DVSD_STREAM(0),
        D3DVSD_REG(cgD3D8ResourceToInputRegister(CG_POSITION),
                   D3DVSDT_FLOAT3),
        D3DVSD_REG(cgD3D8ResourceToInputRegister(CG_COLOR0),
                   D3DVSDT_D3DCOLOR),
        D3DVSD_STREAM(1),
        D3DVSD_SKIP(4),
        D3DVSD_REG(cgD3D8ResourceToInputRegister(CG_TEXCOORD0),
                   D3DVSDT_FLOAT2),
101. T,
                 uniform float4 LightVec)
    {
        vertout OUT;

        // Transform vertex position into homogeneous clip-space.
        OUT.HPosition = mul(ModelViewProj, IN.Position);

        // Transform normal from model-space to view-space.
        float3 normalVec = normalize(mul(ModelViewIT, IN.Normal).xyz);

        // Store normalized light vector.
        float3 lightVec = normalize(LightVec.xyz);

        // Calculate half angle vector.
        float3 eyeVec = float3(0.0, 0.0, 1.0);
        float3 halfVec = normalize(lightVec + eyeVec);

        // Calculate diffuse component.
        float diffuse = dot(normalVec, lightVec);

        // Calculate specular component.
        float specular = dot(normalVec, halfVec);

        // Use the lit function to compute lighting vector from
        // diffuse and specular values.
        float4 lighting = lit(diffuse, specular, 32);

        // Blue diffuse material
        float3 diffuseMaterial = float3(0.0, 0.0, 1.0);

        // White specular material
        float3 specularMaterial = float3(1.0, 1.0, 1.0);

        // Combine diffuse and specular contributions and
        // output final vertex color.
        OUT.Color.rgb = lighting.y * diffuseMaterial +
                        lighting.z * specularMaterial;
        OUT.Color.a = 1.0;

        return OUT;
    }

Working with Data

Like C, Cg supports features that create and manipulate data:

- Basic types
- Structures
- Arrays
- Type conversions

Basic Data Types

Cg supports six basic data types:

- float
  A 32-bit IEEE floating point (s23e8) number that ha
102. Uniform Arrays of Scalar, Vector, and Matrix Parameters

To set the values of arrays of uniform scalar or vector parameters, use the cgGLSetParameterArray functions:

    void cgGLSetParameterArray1f(CGparameter parameter,
        long startIndex, long numberOfElements, const float* array);
    void cgGLSetParameterArray1d(CGparameter parameter,
        long startIndex, long numberOfElements, const double* array);
    void cgGLSetParameterArray2f(CGparameter parameter,
        long startIndex, long numberOfElements, const float* array);
    void cgGLSetParameterArray2d(CGparameter parameter,
        long startIndex, long numberOfElements, const double* array);
    void cgGLSetParameterArray3f(CGparameter parameter,
        long startIndex, long numberOfElements, const float* array);
    void cgGLSetParameterArray3d(CGparameter parameter,
        long startIndex, long numberOfElements, const double* array);
    void cgGLSetParameterArray4f(CGparameter parameter,
        long startIndex, long numberOfElements, const float* array);
    void cgGLSetParameterArray4d(CGparameter parameter,
        long startIndex, long numberOfElements, const double* array);

The digit in the name of those functions indicates the type of the parameter array elements: 1 for arrays of float1, 2 for arrays of float2, and so on. The variables startIndex and numberOfElements specify which elements of the array parameter are set: They are the numberOfElements elements of the indices that range fr
103. Use the compiler option -profile fp30.

This section describes the capabilities and restrictions of Cg when using the fp30 profile.

Language Constructs and Support

Data Types

- fixed type (s1.10 fixed point) is supported
- half type (s10e5 floating point) is supported

It is recommended that you use fixed, half, and float in that order for maximum performance. Reversing this order provides maximum precision. You are encouraged to use the fastest type that meets your needs for precision.

Statements and Operators

- Full support for if/else
- No for and while loops, unless they can be unrolled by the compiler
- Support for flexible texture mapping
- Support for screen-space derivative functions
- No support for variable indexing of arrays

Bindings

Binding Semantics for Uniform Data

Table 28 summarizes the valid binding semantics for uniform parameters in the fp30 profile.

Table 28 fp30 Uniform Input Binding Semantics

    Binding Semantics Name          Corresponding Data
    ------------------------------  ------------------------------------------
    register(s0)-register(s15)      Texunit N, where N is in the range [0..15].
    TEXUNIT0-TEXUNIT15              May be used only with uniform inputs with
                                    sampler* types.
    register(c0)-register(c31)      Constant register N, where N is in range
    C0-C31                          [0..15]. May only be used with uniform
                                    inputs.

Binding Semantics for Varying Input/Output Data

Table 29 summarizes the valid binding semantics for v
104.
    V[n] + V[n] -> V[n]                    Componentwise +
    V[n] - V[n] -> V[n]                    Componentwise -
    M[n][m] * M[n][m] -> M[n][m]           Componentwise *
    M[n][m] / M[n][m] -> M[n][m]           Componentwise /
    M[n][m] % M[n][m] -> M[n][m]           Componentwise %
    M[n][m] + M[n][m] -> M[n][m]           Componentwise +
    M[n][m] - M[n][m] -> M[n][m]           Componentwise -

Operators

Boolean

    && || !

Boolean operators may be applied to bool packed bool vectors, in which case they are applied in elementwise fashion to produce a result vector of the same size. Each operand must be a bool vector of the same size.

Both sides of && and || are always evaluated; there is no short-circuiting as there is in C.

Comparisons

    < > <= >= != ==

Comparison operators may be applied to numeric vectors. Both operands must be vectors of the same size. The comparison operation is performed in elementwise fashion to produce a bool vector of the same size.

Comparison operators may also be applied to bool vectors. For the purpose of relational comparisons, true is treated as one and false is treated as zero. The comparison operation is performed in elementwise fashion to produce a bool vector of the same size.

Comparison operators may also be applied to numeric or bool scalars.

Arithmetic

    + - * / % ++ -- unary- unary+ 
105.
  - The vector swizzle operator may only be applied to vectors or to scalars.

  - Applying the vector swizzle operator to a scalar gives the same result as applying the operator to a vector of length one. Thus, myscalar.xxx and float3(myscalar, myscalar, myscalar) yield the same value.

  - If only one swizzle character is specified, the result is a scalar, not a vector of length one. Therefore, the expression b.y returns a scalar.

  - Care is required when swizzling a constant scalar because of ambiguity in the use of the decimal point character. For example, to create a three-vector from a scalar, use one of the following:

        (1).xxx or 1..xxx or 1.0.xxx or 1.0f.xxx

  - The size of the returned vector is determined by the number of swizzle characters. Therefore, the size of the result may be larger or smaller than the size of the original vector.

    For example, float2(0,1).xxyy and float4(0,0,1,1) yield the same result.

- Matrix swizzle operator:

  For any matrix type of the form <type><rows>x<columns>, the notation

      <matrixObject>._m<row><col>[_m<row><col>][...]

  can be used to access individual matrix elements (in the case of only one <row><col> pair) or to construct vectors from elements of a matrix (in the case of more than one <row><col> pair). The row and column numbers are zero-based.

  For example:

      floa
106. Figure 3 The Cg_Simple Workspace

As usual, click the FileView tab to view the various files in the project. What's different in this case, though, is that in addition to the usual Source Files and Header Files folders, there is also a Cg Programs folder.

This Cg Programs folder should contain one Cg program, simple.cg, which is what you can use for experimentation. Double-click simple.cg to open it for editing. While you are editing simple.cg, you can press Control+F7 at any time to compile it. Because of the way the project is set up, any errors in your code will be shown just as when you compile a normal C or C++ program.

You can also double-click on an error, which takes you to the location in the source code that caused the error.

Understanding simple.cg

The Cg_Simple application runs the shader defined in simple.cg on a torus. The provi
107.
    Memory Restrictions
    Language Constructs and Support
DirectX Pixel Shader 1.x Profiles (ps_1_*)
    Overview
    Modifiers
    Language Constructs and Support
    Standard Library Functions
    Bindings
    Auxiliary Texture Functions
    Examples
OpenGL NV_vertex_program 1.0 Profile (vp20)
    Overview
    Position Invariance
    Data Types
OpenGL NV_texture_shader and NV_register_combiners Profile (fp20)
    Overview
    Restrictions
    Modifiers
    Language Constructs and Support
    Standard Library Functions
    Bindings
    Auxiliary Texture Functions
    Examples
Appendix C: Nin
108. ader is based on the Time Machine temporal rust       sha    der     Car paint data was measured by Cornell       University from samples provided by Ford Motor Company     Pal    SRC  floa  floa  floa   loa   loa  loa  loa  loa  loa  loa    fl       Jd RIS  float4    iif 38  72    floa  floa  floa  floa                                                                         VS OUTPUT     t4 HPosition POSITION     coord position in window    22 Uy TEXCOORDO     wavy fleckmap coords   ES ILENE TEXCOORD1     light pos  tangent space    t4 halfangle TEXCOORD2     Blinn halfangle   t3 reflection  TEXCOORD3     Refl vector  per vertex    t4 view TEXCOORD4     view  tangent space    t3 tangent TEXCOORD5     view tangent matrix   t3 binormal TEXCOORD6       t3 normal TEXCOORDI o  v   t fresn COLORO    EL SHADER   Main  VS OUTPUT vert    uniform sampler2D WavyMap register  s0     uniform samplerCUBE EnvironmentMap register  s1     uniform sampler2D PaintMap register  s2     uniform sampler2D FleckMap register  s3     uniform float Ambient   COLOR   EWPAINTSPEC     UNUSED  SPEC POWER  GLOSSINESS   FLECK SPEC POWER     t4 NewPaintSpec qt  Qi  Gus  Sete Tenue  is   t3 ClearCoat   OE 2 00 DS qm  Odd deg   T Luke ekC oor See ORO eles Ops ale   t3 WavyScale ed 0 27 0527 1 0 p       130    NVIDIA    808 00504 0000 004    Advanced Profile Sample Shaders       Tangent space LIGHT vector  float3 L   normalize vert light         Tangent space HALF ANGLE vector  float3 H   normalize  vert halfan
109. al reference to the Direct3D device
        // and free its Direct3D resources
        cgD3D8SetDevice(0);
    }

    // Called before application shuts down
    void OnShutdown()
    {
        // This frees any core runtime resource
        cgDestroyContext(context);
    }

Direct3D Debugging Mode

In addition to the error reporting mechanisms described in "Direct3D Error Reporting" on page 85, a debug version of the Direct3D 9 or Direct3D 8 Cg runtime DLL is provided to assist you with the development of applications using the Direct3D 9 or Direct3D 8 Cg runtime. This version does not have debug symbols, but when used in place of the regular version, it uses the Win32 function OutputDebugString() to output many helpful messages and traces to the debug output console. Examples of information the debug DLL outputs are the following:

- Any Direct3D or Cg core runtime errors
- Debugging information about parameters that are managed by the expanded interface
- Potential performance warnings

Here is a sample trace:

    cgD3D(TRACE): Creating vertex shader for program 3
    cgD3D(TRACE): Discovering parameters for vertex program 3
    cgD3D(TRACE): Discovered uniform parameter 'ModelViewProj' of type float4x4
    cgD3D(TRACE): Finished discovering parameters for vertex program 3
    cgD3D(TRACE): Creating pixel shader for program 24
    cgD3D(TRACE): Discovering parameters for pixel pr
110. an create a vertex declaration
        // using those semantics.
        const D3DVERTEXELEMENT9 declaration[] = {
            { 0, 0 * sizeof(float),
              D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
              D3DDECLUSAGE_POSITION, 0 },
            { 0, 3 * sizeof(float),
              D3DDECLTYPE_D3DCOLOR, D3DDECLMETHOD_DEFAULT,
              D3DDECLUSAGE_COLOR, 0 },
            { 0, 4 * sizeof(float),
              D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT,
              D3DDECLUSAGE_TEXCOORD, 0 },
            D3DDECL_END()
        };

        // Make sure the resulting declaration is compatible with
        // the shader. This is really just a sanity check.
        assert(cgD3D9ValidateVertexDeclaration(vertexProgram,
                                               declaration));

        device->CreateVertexDeclaration(
            declaration, &vertexDeclaration);
        device->CreateVertexShader(
            (const DWORD*)byteCode->GetBufferPointer(), &vertexShader);

        // Create the pixel shader.
        fragmentProgram = cgCreateProgramFromFile(context,
            CG_SOURCE, "FragmentProgram.cg",
            CG_PROFILE_PS_2_0, "FragmentProgram", 0);

        CComPtr<ID3DXBuffer> byteCode;
        const char* progSrc = cgGetProgramString(fragmentProgram,
            CG_COMPILED_PROGRAM);
        D3DXAssembleShader(progSrc, strlen(progSrc), 0, 0, 0,
            &byteCode, 0);
        device->CreatePixelShader(
            (const DWORD*)byteCode->GetBufferPointer(), &pixelShader);

        // Grab some parameters.
        modelViewMatrix = cgGetNamedPara
111. and gives a brief overview of how it is  used in an application  The next two sections   Core Cg Runtime  on page 34  and    API Specific Cg Runtimes  on page 45  give an exhaustive description of  the APIs composing the Cg Runtime        Introducing the Cg Runtime    Cg programs are lines of code that describe shading  but they need the support  of applications to create images  To interface Cg programs with applications   you must do two things     1  Compile the programs for the correct profile  In other words  compile the  programs into a form that is compatible with the 3D API used by the  application and the underlying hardware     2  Link the programs to the application program  This allows the application  to feed varying and uniform data to the programs     You have two choices as to when to perform these operations  You can  perform them at compile time  when the application program is compiled into  an executable  or you can perform them at run time  when the application is  actually executed  The Cg runtime is an application programming interface that  allows an application to compile and link Cg programs at run time     Benefits of the Cg Runtime    Future Compatibility    Most applications need to run on a range of profiles  If an application  precompiles its Cg programs  the compile time choice   it must store a  compiled version of each program for each profile  This is reasonable for one    808 00504 0000 004 29    NVIDIA       Cg Language Toolkit    program 
112. are and set a vector output that uses the COLOR semantic. This value is usually used by the hardware as the final color of the fragment. Some fragment profiles also support the DEPTH output semantic, which allows the depth value of the fragment to be modified.

As with vertex programs, fragment programs may return their outputs in the body of a structure. However, it is usually more convenient to either declare outputs as out parameters:

    void main(/* ... */
              out float4 color : COLOR,
              out float depth : DEPTH)
    {
      // ...
      color = diffuseColor;
      /* ... */
    }

or to associate a semantic with the return value of the shader:

    float4 main(/* ... */) : COLOR
    {
      // ...
      return diffuseColor;
      /* ... */
    }

The following example shows a simple vertex program that calculates diffuse and specular lighting. Two structures for varying data, appin and vertout, are also declared. Don't worry about understanding exactly what the program is doing; the goal is simply to give you an idea of what Cg code looks like. "A Brief Tutorial" on page 89 explains this shader in detail.

    // Define inputs from application.
    struct appin
    {
      float4 Position : POSITION;
      float4 Normal : NORMAL;
    };

    // Define outputs from vertex shader.
    struct vertout
    {
      float4 HPosition : POSITION;
      float4 Color : COLOR;
    };

    vertout main(appin IN,
                 uniform float4x4 ModelViewProj,
                 uniform float4x4 ModelViewI
113. arying input parameters in the fp30 profile.

These binding semantics map to NV_fragment_program input registers. The two sets act as aliases to each other. The profile also allows POSITION, FOG, PSIZE, HPOS, FOGC, PSIZ, BCOL0, BCOL1, and CLP0-CLP5 to be present as binding semantics on a member of a structure of a varying input data structure, provided the member with this binding semantics is not referenced. This allows Cg programs to have the same structure specify the varying output of a vp30 profile program and the varying input of an fp30 profile program.

Table 29 fp30 Varying Input Binding Semantics

    Binding Semantics Name              Corresponding Data (type)
    COLOR0, COL0                        Input color0 (float4)
    COLOR1, COL1                        Input color1 (float4)
    TEXCOORD0-TEXCOORD7, TEX0-TEX7      Input texture coordinates (float4)
    WPOS                                Window position coordinates (float4)

Table 30 summarizes the valid binding semantics for varying output parameters in the fp30 profile.

Table 30 fp30 Varying Output Binding Semantics

    Binding Semantics Name              Corresponding Data
    COLOR, COLOR0, COL                  Output color (float4)
    DEPTH, DEPR                         Output depth (float)

Pack and Unpack Functions

The fp30 profile provides a number of functions for packing multiple floating point values into a single 32-bit result. Corresponding unpacking functions are also provided. These functions map directly to 
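The idea of packing several floating point values into one 32-bit result can be sketched on the CPU as follows. This is only an illustration in plain C, not the fp30 functions themselves: the names pack_two_unorm16 and unpack_two_unorm16 are hypothetical, and the choice of two 16-bit fields with the first value in the low half is an assumption made for the sketch.

```c
#include <stdint.h>

/* Hypothetical helper: quantize v, clamped to [0, 1], to 16 bits,
   rounding to the nearest representable step. */
uint16_t quantize_unorm16(float v)
{
    if (v < 0.0f) v = 0.0f;
    if (v > 1.0f) v = 1.0f;
    return (uint16_t)(v * 65535.0f + 0.5f);
}

/* Pack two values into one 32-bit word. The layout (x in the low
   16 bits, y in the high 16 bits) is an assumption of this sketch. */
uint32_t pack_two_unorm16(float x, float y)
{
    return (uint32_t)quantize_unorm16(x)
         | ((uint32_t)quantize_unorm16(y) << 16);
}

/* Recover the two values; each is accurate to within 1/65535. */
void unpack_two_unorm16(uint32_t packed, float *x, float *y)
{
    *x = (float)(packed & 0xFFFFu) / 65535.0f;
    *y = (float)(packed >> 16) / 65535.0f;
}
```

Packing trades precision for storage: each value survives a round trip only to within one quantization step, which is why packed formats suit colors and normalized data better than positions.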
114. ated as the main entry point at compilation time. The varying inputs to the program come from this top-level function's varying in parameters. The uniform inputs to the program come from the top-level function's uniform in parameters and from any non-static global variables that are referenced by the top-level function or by any functions that it calls. The output of the program comes from the return value of the function (which is always implicitly varying) and from any out parameters, which must also be varying.

Parameters to a program of type sampler* are implicitly const.

Statements

Statements are expressed just as in C, unless an exception is stated elsewhere in this document. Additionally:

- The if, while, and for statements require bool expressions in the appropriate places.

- Assignment is performed using =. The assignment operator returns a value, just as in C, so assignments may be chained.

- The new discard statement terminates execution of the program for the current data element (such as the current vertex or current fragment) and suppresses its output. Vertex profiles may choose to omit support for discard.

Minimum Requirements for if, while, and for Statements

The minimum requirements are as follows:

- All profiles should support if, but such support is not strictly required for older hardware.

- All profiles should support for and while loops if the number of loop iterations can be determined at compile t
115. ategory 174    O  object  Cg definition 168  open profile functions 170  OpenGL Cg runtime 46  error reporting 57  OpenGL application 54  parameter setting 46  OpenGL CGerror 57  OpenGL profiles  ARB fragment program 211  ARB vertex program 204  NV  fragment program 218  NV register combiners 244  NV texture shader 244  NV vertex program 240  NV  vertex program 2 0 214  operations  expressed differently from C 165  operator  enhancements 188  precedence 188    operators  arithmetic 14  boolean 15    conditional 17  introduction 13  swizzle 16  write mask 16    P    packed  type modifier 172  parameter shadowing 46  parameters  modifiable function  passing 14  parameters in function definitions  syntax 171  performance techniques  abs   259  avoiding matrix transposes 263  computation frequency 262  conditional code in fragment  programs 263  data types 261  dot   259  min   259  saturate   260  shading computations 261  Swizzle 258  texture maps 260  vectorization 257  pixel program  defined 2  pixel shader  defined 2  position invariance 192  profile  arbfp1 211  arbvp1 204  fp20 244  fp30 218  psii psi2 psi13 227  ps20 ps2x 200  vp20 240  vp30 214  vs_1 1 223  vs 20 vs 2x 196  profile  defined 3  program  declaring 4  kinds of inputs 5  program profiles  fragment 193  vertex 192  programming model  GPU 2  ps_1_x profile 227  ps_2_0 profile 200  ps 2 xprofile 200       808 00504 0000 004    271    NVIDIA    Cg Language Toolkit    R  ray traced refraction  pixel shader code e
116. ation is faster than if you use float. Although sometimes you need the range and extra precision that half and float offer, you should avoid using them unless necessary.

6. Use the Right Standard Library Routines for Shading Computations

If you're implementing a shading model (such as Lambertian, Blinn, or Phong), you'll generally be performing some dot product routines, clamping negative results to zero, and raising some of the values to a power, to compute a specular exponent. There are a few tricks that can speed up this process:

- Be sure to use the dot() function when computing dot products.

- If you need to clamp the result of a dot product computation to the range [0, 1] in a fragment program, use the saturate() function instead of max(). This is often written as max(0, dot(N, L)), but as long as the N and L vectors are normalized, this can be written equivalently as saturate(dot(N, L)), because the dot product of two normalized vectors is never greater than one. Given that saturate() is free in fragment programs (see "3. Use the Cg Standard Library" on page 259), this compiles to more efficient code.

- Use the lit() Standard Library function, if appropriate. The lit() function implements a diffuse glossy Blinn shading model. It takes three parameters:
  - The dot product of the normalized surface normal and the light vector
  - The dot product of a half-angle vector and the normal
  - The specular exponent
  It returns a 4 
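The equivalence between max(0, dot(N, L)) and saturate(dot(N, L)) for normalized vectors can be checked with a small CPU-side sketch. The code below is plain C, not Cg: the vec3 type and the helper names are hypothetical stand-ins for Cg's float3, dot(), max(), and saturate(), and the inputs are assumed to be unit length already.

```c
/* Hypothetical stand-in for Cg's float3. */
typedef struct { float x, y, z; } vec3;

float dot3(vec3 a, vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Clamp to [0, 1], mirroring what saturate() does in Cg. */
float saturate(float v)
{
    if (v < 0.0f) return 0.0f;
    if (v > 1.0f) return 1.0f;
    return v;
}

/* The common formulation: clamp only the negative side. */
float diffuse_with_max(vec3 n, vec3 l)
{
    float d = dot3(n, l);
    return d > 0.0f ? d : 0.0f;
}

/* The cheaper formulation. It matches diffuse_with_max() whenever
   n and l are unit length, because their dot product cannot exceed 1. */
float diffuse_with_saturate(vec3 n, vec3 l)
{
    return saturate(dot3(n, l));
}
```

If either vector is not normalized, the two formulations can differ, because saturate() also clamps values above one.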
117. based on the context in which uniform sampler parameters and texture coordinate inputs are used together.

To specify bindings between texture units and uniform parameters/texture coordinates to match their application, all sampler uniform parameters and texture coordinate inputs that are used in the program must have matching binding semantics; for example, TEXUNIT<n> may only be used with TEXCOORD<n>. Partially specified binding semantics may not work in all cases. Fundamentally, this restriction is due to the close coupling between texture samplers and texture coordinates in the NV_texture_shader extension.

Binding Semantics for Uniform Data

If a binding semantic for a uniform parameter is not specified, then the compiler will allocate one automatically. Scalar uniform parameters may be allocated to either the xyz or the w portion of a constant register depending on how they are used within the Cg program. When using the output of the compiler without the Cg runtime, you must set all values of a scalar uniform to the desired scalar value, not just the x component.

Table 47 summarizes the valid binding semantics for uniform parameters in the fp20 profile.

Table 47 fp20 Uniform Binding Semantics

    Binding Semantics Name              Corresponding Data
    register(s0)-register(s3),          Texture unit N, where N is in range [0, 3].
    TEXUNIT0-TEXUNIT3                   May be used only with uniform inputs with
                                        sampler* types.

The ps_1_x profile
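The requirement to set all values of a scalar uniform, not just the x component, amounts to replicating the scalar across the whole constant register. A minimal sketch in plain C follows; the const_register type and the smear_scalar name are hypothetical, and the actual upload call depends on the 3D API in use, so it is left out.

```c
/* Hypothetical model of one constant register: four floats (x, y, z, w). */
typedef struct { float v[4]; } const_register;

/* Replicate a scalar into every component, so the shader reads the
   right value whichever portion of the register the compiler chose. */
const_register smear_scalar(float s)
{
    const_register r = { { s, s, s, s } };
    return r;
}
```

The resulting four floats would then be passed to whatever constant-setting function the application already uses.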
118. bvp1 profile allows Cg programs to refer to the OpenGL state directly, unlike the vp20 profile. However, if you want to write Cg programs that are compatible with vp20 and dx8vs profiles, you should use the alternate mechanism of setting uniform variables with the necessary state using the Cg run time. The compiler relies on the feature of ARB vertex assembly programs that enables parts of the OpenGL state to be written automatically to program parameter registers as the state changes. The OpenGL driver handles this state-tracking feature. A special variable called glstate, defined as a structure, can be used to refer to every part of the OpenGL state that ARB vertex programs can reference. Following this paragraph are three lists of the glstate fields that can be accessed. The array indexes are shown as 0, but an array can be accessed using any non-negative integer that is less than the limit of the array. For example, to access the diffuse component of the second light use glstate.light[1].diffuse, assuming that GL_MAX_LIGHTS is at least 2.

3. See "DirectX Vertex Shader 1.1 Profile (vs_1_1)" on page 223 for a full explanation of the data types, statements, and operators supported by this profile.

Table 16 lists the glstate fields of type float4x4 that can be accessed.

Table 16 float4x4 glstate Fields  
119. calar to vector 179  Stanford shading language  relation to Cg 165  statements   introduction 13  statements  in Cg 185  structures   introduction 12  swizzle   for performance 258  swizzle operator 16  swizzle operator  described 186    T  texture lookups 17  texture map functions 25  texture maps for performance 260  thin film effect  pixel shader code example 126  vertex shader code example 124  tutorial 89       272    808 00504 0000 004    NVIDIA    type conversions 11  176  array 177  matrix 176  scalar 176  structure 176  vector 176   type equivalency 178   type promotion 178  assignment 178  smearing 179   type qualifiers 175  const 175  in 175  out 175   types  general discussion 171  partial support 173    U   uniform inputs 5   uniform modifer  use of 169  uninitialized variables  use of 182    V  variables  global 182  uninitialized  use of 182  varying inputs 5  vector data types 11  vector operators  new 186  vectorization  for performance 257  vectors  constructing 15  vertex color 93  vertex position 93  vertex program  varying output 7  vertex program profiles 192  vertex programs  defined 2  void type  specification 172  vp20 profile 240  vp30 profile 214  vs_1_1 profile 223  vs_2_0 profile 196  vs 2 x profile 196    WwW  water  improved  pixel shader code example 104  sample shader 101  vertex shader code example 102  web site  NVIDIA xiv  while statements 185  workspace  loading 89  write mask operator 16  described 187       808 00504 0000 004    273  NVI
120. cally activates the Direct3D shader corresponding to program by calling IDirect3DDevice9::SetVertexShader() or IDirect3DDevice9::SetPixelShader() depending on the program's profile. If parameter shadowing is enabled for program, it also sets all the shadowed parameters and their associated Direct3D states (such as texture stage states for the sampler parameters). No value or state tracking is performed by the runtime, so this setting is done regardless of what the current values of these parameters or of their states are. If a shadowed parameter has not been set by the time cgD3D9BindProgram() is called, no Direct3D call of any sort is issued for this parameter.

Only one vertex program and one fragment program can be bound at any given time, so binding a program of a given type implicitly unbinds any other program of the same type.

Expanded Interface Profile Support

Two convenient functions are provided that give the highest vertex and pixel shader versions supported by the device:

    CGprofile cgD3D9GetLatestVertexProfile();
    CGprofile cgD3D9GetLatestPixelProfile();

This allows you to make your application future-ready, because the Cg programs are automatically compiled for the best profiles that are available at runtime, even if these profiles did not exist at the time the application was written. Another function that allows you optimal compilation is cgD3D9GetOpti
121. capable GPUs of today and tomorrow. APIs do not, and cannot, keep up with the rapid pace of innovation in GPUs. As APIs and underlying technologies change, programmers, artists, and software publishers struggle to adapt to the change and the churn of the hardware/software platform.

What's needed is to raise the level of abstraction for interaction with GPUs. Continued updates and improvements to the hardware and APIs are too painful if developers are too "close to the metal." This problem was exacerbated by the advent of programmability in GPUs. Older GPUs had a small number of controllable or configurable rendering paths, but the most recent technology is highly programmable, and becoming ever more so. We can now write short vertex and fragment programs to be executed by the GPU. This requires great skill, and is only possible with short programs.

When GPU hardware grows to allow programs of hundreds, thousands, or even more instructions, assembly coding will no longer be practical. Rather than programming each rendering state, each bit, byte, and word of data and control through a low-level assembly language, we want to express our ideas in a more straightforward form, using a high-level language.

Thus Cg ("C for Graphics") becomes necessary and inevitable. Just as C was derived to expose the specific capabilities of processors while allowing higher-level abstraction, Cg allows 
122. ce for both vertex and fragment programs.

This section describes the contents of the Cg Standard Library, including:

- Mathematical functions
- Geometric functions
- Texture map functions
- Derivative functions
- Predefined helper struct types

Where appropriate, functions are overloaded to support scalar and vector variations when the input and output types are the same.

Mathematical Functions

Table 1 lists the mathematical functions that the Cg Standard Library provides. The list includes functions useful for trigonometry, exponentiation, rounding, and vector and matrix manipulations, among others. All functions work on scalars and vectors of all sizes, except where noted.

Table 1 Mathematical Functions

    Function          Description
    abs(x)            Absolute value of x.
    acos(x)           Arccosine of x in range [0, π]; x in [-1, 1].
    all(x)            Returns true if every component of x is not equal to 0.
                      Returns false otherwise.
    any(x)            Returns true if any component of x is not equal to 0.
                      Returns false otherwise.
    asin(x)           Arcsine of x in range [-π/2, π/2]; x should be in [-1, 1].
    atan(x)           Arctangent of x in range [-π/2, π/2].
    atan2(y, x)       Arctangent of y/x in range [-π, π].
    ceil(x)           Smallest integer not less than x.
    clamp(x, a, b)    x clamped to the range [a, b] as follows:
                      - Returns a if x is less than a.
                      - Returns b if x is greater
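The behavior described for clamp(), all(), and any() can be illustrated with scalar C equivalents. This is a sketch of the semantics only, not the library's implementation; the function names are hypothetical, and a plain float array stands in for a Cg vector.

```c
#include <stdbool.h>
#include <stddef.h>

/* clamp(x, a, b): a if x is less than a, b if x is greater than b,
   x otherwise. */
float clamp_f(float x, float a, float b)
{
    if (x < a) return a;
    if (x > b) return b;
    return x;
}

/* all(x): true if every component is not equal to 0. */
bool all_nonzero(const float *v, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if (v[i] == 0.0f) return false;
    return true;
}

/* any(x): true if at least one component is not equal to 0. */
bool any_nonzero(const float *v, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        if (v[i] != 0.0f) return true;
    return false;
}
```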
123. cgGetFirstParameter() and cgGetNextParameter() will allow you to iterate through all the parameters of a program that are within the scope of the context.

Here is how those two functions would typically be used, given a valid program called program:

    CGparameter parameter = cgGetFirstParameter(program, CG_PROGRAM);
    while (parameter != 0)
    {
      /* Here is the code that handles the parameter */
      parameter = cgGetNextParameter(parameter);
    }

These functions don't give access to the fields of a structure parameter (type CG_STRUCT) or the elements of an array parameter (type CG_ARRAY).

To get access to the fields of a structure, you use cgGetFirstStructParameter() along with cgGetNextParameter():

    CGparameter cgGetFirstStructParameter(CGparameter parameter);

If parameter is not of type CG_STRUCT, cgGetFirstStructParameter() returns zero.

To get access to the elements of an array, you use cgGetArrayDimension(), cgGetArraySize(), cgGetArrayParameter(), and cgGetNextParameter():

    int cgGetArrayDimension(CGparameter parameter);
    int cgGetArraySize(CGparameter parameter, int dimension);
    CGparameter cgGetArrayParameter(CGparameter parameter, int index);

These three functions return 0 if parameter is not of type CG_ARRAY. Function cgGetArrayDimension() gives the dimension of the array. It returns 1 for float4 array[10], 2 for float4 array[10][100], and so on. Next, cg
124. ch and now even to exceed traditional workstations. The processing power provided by a modern GPU in a single frame rivals the amount of computation that used to be expended for an offline-rendered animation frame. Indeed, at the launch of GeForce3 on the Apple Macintosh, a convincing version of Pixar's Luxo, Jr. was demonstrated running interactively in real time. At the 2001 SIGGRAPH conference, an interactive version of a more recent film, Square Studios' Final Fantasy, was shown running in real time, again on a GeForce.

Although these feats of computation are astounding, there is much more to come. Today's GPUs evolve very quickly. Typically, a product generation is only six months long, and with each new product generation comes a two-fold increase in performance. Graphics processor performance increases at approximately three times the rate of microprocessors: Moore's Law cubed. In addition to the performance increases, each year brings new hardware features, supported by new application programming interfaces (APIs). This dizzying pace is difficult for developers to adapt to, but adapt they must.

Developers and users are demanding better rendering quality and more realistic imagery and experiences. Users don't care about the details; they simply want games and other interactive applications to look more like movies, special effects, and animation. Developers want more power (always more), along with more flexibility in controlling the massively 
125.
      float3 finalColor = lerp(lightMetal, darkMetal, nvDecal.x);
      return float4(finalColor, 1);
    }

MultiPaint

Description

MultiPaint presents a single-pass solution to a common production problem: mixing multiple kinds of materials on a single polygonal surface. MultiPaint provides a simple BRDF (bidirectional reflectance distribution function) that is still complex enough to represent many common metallic and dielectric surfaces, and controls all key factors of the variable BRDF through texturing. This permits you to create multiple materials without switching shaders, splitting your model, or resorting to multiple passes.

Uses for MultiPaint might include complex armor built of inlaid metals, woods, and stones, all modeled on a single, simple poly mesh; buildings composed of multiple types of stone, glass, and metal, expressed as simple cubes; cloth with inlaid metallic threads; or, as in this demo, metal partially covered with peeling paint.

Using multiple BRDFs is common in the offline world, but rarely optimized; instead, two different shaders may be evaluated and their results blended using a mask texture or chained through if statements. For maximum real-time performance, MultiPaint instead integrates all of the key parts of the BRDFs as multiple painted textures so that only one pass through the shader is required to create the mixed appearance. This permit
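The blend performed by lerp(lightMetal, darkMetal, nvDecal.x) at the top of this excerpt can be written out in plain C to show how a single texture value selects between two materials per pixel. The rgb3 type and the lerp3 name are hypothetical; the mask argument plays the role of the value sampled from the decal texture.

```c
/* Hypothetical stand-in for a Cg float3 holding a color. */
typedef struct { float r, g, b; } rgb3;

/* Linear interpolation: returns a when mask is 0, b when mask is 1,
   and a proportional mix of the two in between. */
rgb3 lerp3(rgb3 a, rgb3 b, float mask)
{
    rgb3 out;
    out.r = (1.0f - mask) * a.r + mask * b.r;
    out.g = (1.0f - mask) * a.g + mask * b.g;
    out.b = (1.0f - mask) * a.b + mask * b.b;
    return out;
}
```

Because the mask comes from a texture, it can vary across the surface, which is what lets one shader pass produce several materials on the same mesh.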
126. clared:

    struct mystruct {
      /* ... */
    };
    mystruct s;  // Define "s" as a "mystruct".

Arrays are supported in Cg and are declared just as in C. Because Cg does not support pointers, arrays must always be defined using array syntax rather than pointer syntax:

    // Declare a function that accepts an array
    // of five skinning matrices.
    returnType foo(float4x4 mymatrix[5]) { /* ... */ };

Basic profiles place substantial restrictions on array declaration and usage. General-purpose arrays can only be used as uniform parameters to a vertex program. The intent is to allow an application to pass arrays of skinning matrices and arrays of light parameters to a vertex program.

The most important difference from C is that arrays are first-class types. That means array assignments actually copy the entire array, and arrays that are passed as parameters are passed by value (the entire array is copied before making any changes), rather than by reference.

Statements and Operators

Cg supports the following types of statements and operators:

- Control flow
- Function definitions and function overloads
- Arithmetic operators from C
- Multiplication function
- Vector constructor
- Boolean and comparison operators
- Swizzle operator
- Write mask operator
- Conditional operator

Control Flow

Cg uses the following C control constructs:

- Function calls and the return stat
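For readers coming from C, where an array argument decays to a pointer, the by-value behavior of Cg arrays may be easier to picture with a C analogue. The sketch below wraps an array in a struct, which C does copy on a call; the weight_array type and the function name are hypothetical and exist only for this illustration.

```c
/* Wrapping the array in a struct makes C copy it on each call,
   which mimics how Cg passes arrays. */
typedef struct { float weights[5]; } weight_array;

/* Scales its private copy and returns the sum. The caller's array
   is left untouched, as it would be in Cg. */
float sum_after_scaling(weight_array w, float scale)
{
    float sum = 0.0f;
    for (int i = 0; i < 5; ++i) {
        w.weights[i] *= scale;   /* modifies the copy only */
        sum += w.weights[i];
    }
    return sum;
}
```

Had the parameter been declared as float weights[5] in C, the scaling would have altered the caller's data; in Cg the equivalent declaration would not.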
127. composed of two interfaces:

- Minimal interface: This interface makes no Direct3D calls itself and should be used when you prefer to keep the Direct3D code in the application itself.

- Expanded interface: This interface makes the Direct3D calls necessary to provide enhanced program and parameter management and should be used when you prefer to let the Cg runtime manage the Direct3D shaders.

Direct3D Minimal Interface

The minimal interface simply supplies convenient functions to convert some information provided by the core runtime to information specific to Direct3D.

Vertex Declaration

In Direct3D, you have to supply a vertex declaration that establishes a mapping between the vertex shader input registers and the data provided by the application as data streams. In Direct3D 9, this vertex declaration is bound to the current state the same way the vertex shader is (see the Direct3D 9 documentation on IDirect3DDevice9::CreateVertexDeclaration() and IDirect3DDevice9::SetVertexDeclaration() for a detailed explanation). In Direct3D 8, the vertex declaration is required at the time you create the vertex shader (for more information, see the Direct3D 8 documentation on IDirect3DDevice8::CreateVertexShader()).

A data stream is basically an array of data structures. Each of those structures is of a particular type called the vertex format of the stream. Here is an example of a vertex d
128. coordinates as the two floating point values located at an offset equal to twice the size of a DWORD from the end of the normal data in stream 0. The tangents are provided in stream 1 as a second texture coordinate set that is found as the first three floating point values of the vertex format.

To get a vertex declaration from a Cg vertex program for the Direct3D 9 Cg runtime, use cgD3D9GetVertexDeclaration():

    CGbool cgD3D9GetVertexDeclaration(CGprogram program,
        D3DVERTEXELEMENT9 declaration[MAXD3DDECLLENGTH]);

MAXD3DDECLLENGTH is a Direct3D 9 constant that gives the maximum length of a Direct3D 9 declaration. If no declaration can be derived from the program, cgD3D9GetVertexDeclaration() fails and returns CG_FALSE.

To get a vertex declaration from a Cg vertex program for the Direct3D 8 Cg runtime, use cgD3D8GetVertexDeclaration():

    CGbool cgD3D8GetVertexDeclaration(CGprogram program,
        DWORD declaration[MAX_FVF_DECL_SIZE]);

MAX_FVF_DECL_SIZE is a Direct3D constant that gives the maximum length of a Direct3D declaration. If no declaration can be derived from the program, cgD3D8GetVertexDeclaration() fails and returns CG_FALSE.

The declaration returned by cgD3D9GetVertexDeclaration() or cgD3D8GetVertexDeclaration() is for a single stream, so that for the following program:

    void main(in float4 position : POSITION,
              in float4 color : COLOR0,
              in float4 texCoord : TEXCOORD0,
              out f
129. cs for Uniform Data

Table 13 summarizes the valid binding semantics for uniform parameters in the ps_2_0 and ps_2_x profiles.

Table 13 ps_2 Uniform Input Binding Semantics

    Binding Semantics Name              Corresponding Data
    register(s0)-register(s15),         Texture unit N, where N is in range [0, 15].
    TEXUNIT0-TEXUNIT15                  May only be used with uniform inputs with
                                        sampler* types.
    register(c0)-register(c31),         Constant register N, where N is in range
    C0-C31                              [0, 31]. May only be used with uniform inputs.

Binding Semantics for Varying Input/Output Data

Table 14 summarizes the valid binding semantics for varying input parameters in the ps_2_0 and ps_2_x profiles.

Table 14 ps_2 Varying Input Binding Semantics

    Binding Semantics Name              Corresponding Data (type)
    COLOR0                              Input color 0 (float4)
    COLOR1                              Input color 1 (float4)
    TEXCOORD0-TEXCOORD7                 Input texture coordinates (float4)

Table 15 summarizes the valid binding semantics for varying output parameters in the ps_2_0 and ps_2_x profiles.

Table 15 ps_2 Varying Output Binding Semantics

    Binding Semantics Name              Corresponding Data
    COLOR, COLOR0                       Output color (float4)
    DEPTH                               Output depth (float)

Options

The ps_2_x profile allows the following profile-specific options:

    NumTemps=<n> where 12 <= n <= 32 (default 32)
    NumInstructionSlots=<n> where 96 <= n
130. ctX 9 pixel shaders
  Runtime profiles: CG_PROFILE_PS_2_X, CG_PROFILE_PS_2_0
  Compiler options: -profile ps_2_x, -profile ps_2_0

- OpenGL ARB vertex programs
  Runtime profile: CG_PROFILE_ARBVP1
  Compiler option: -profile arbvp1

- OpenGL ARB fragment programs
  Runtime profile: CG_PROFILE_ARBFP1
  Compiler option: -profile arbfp1

- OpenGL NV30 vertex programs
  Runtime profile: CG_PROFILE_VP30
  Compiler option: -profile vp30

- OpenGL NV30 fragment programs
  Runtime profile: CG_PROFILE_FP30
  Compiler option: -profile fp30

- DirectX 8 vertex shaders
  Runtime profile: CG_PROFILE_VS_1_1
  Compiler option: -profile vs_1_1

- DirectX 8 pixel shaders
  Runtime profiles: CG_PROFILE_PS_1_3, CG_PROFILE_PS_1_2, CG_PROFILE_PS_1_1
  Compiler options: -profile ps_1_3, -profile ps_1_2, -profile ps_1_1

- OpenGL NV2X vertex programs
  Runtime profile: CG_PROFILE_VP20
  Compiler option: -profile vp20

- OpenGL NV2X fragment programs
  Runtime profile: CG_PROFILE_FP20
  Compiler option: -profile fp20

The DirectX 9 profiles (vs_2_x and ps_2_x), OpenGL ARB profiles (arbfp1 and arbvp1), and NV30 OpenGL profiles (fp30 and vp30) generally support longer, more complex programs and offer more features and functionality to the developer. These are referred to as advanced profiles.

The DirectX 8 profiles (vs_1_1 and ps_1_3) and NV2X OpenGL profiles (fp20 and vp20) have more restrictions on program length and available features, e
131. ction parameters are aliased by a function call. In Cg, the two parameters have separate storage in the function, whereas in C++ they would share storage.

To reinforce this distinction, Cg uses a different syntax than C++ to declare function parameters that are modified:

    function blah1(out float x);    // x is output-only
    function blah2(inout float x);  // x is input and output
    function blah3(in float x);     // x is input-only
    function blah4(float x);        // x is input-only (default, as in C)

Cg supports function overloading by the number of operands and by operand type. The choice of a function is made by matching one operand at a time, starting at the first operand. The formal language specification provides more details on the matching rules, but it is not normally necessary to study them because the overloading generally works in an intuitive manner. For example, the following code declares two versions of a function, one that takes two bool operands, and one that takes two float operands:

    bool same(float a, float b) { return (a == b); }
    bool same(bool a, bool b) { return (a == b); }

Arithmetic Operators from C

Cg includes all the standard C arithmetic operators (+, -, *, /) and allows the operators to be used on vectors as well as on scalars. The vector operations are always performed in elementwise fashion. For example:

    float3(a, b, c) * float3(A, B, C) equals float3(a*A, b*B, c*C)

These operators can also be used in a form that mixe
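The elementwise rule for vector operators can be spelled out in C, which has no built-in vector types. The sketch below shows what the Cg expression float3(a, b, c) * float3(A, B, C) computes; the triple type and the function name are hypothetical.

```c
/* Hypothetical stand-in for Cg's float3. */
typedef struct { float x, y, z; } triple;

/* Elementwise product: each component is multiplied by the matching
   component of the other operand. This is not a dot or cross product. */
triple mul_elementwise(triple u, triple v)
{
    triple out = { u.x * v.x, u.y * v.y, u.z * v.z };
    return out;
}
```

The same pattern applies to +, -, and /: the operator acts on each pair of components independently.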
132. d for statements are allowed only if the loops they define can be unrolled, because there is no branching in vs_1_1 shaders.

There are no subroutine calls either, so all functions are inlined. Comparison operators are allowed (>, <, >=, <=, ==, !=) and Boolean operators (||, &&, ?:) are allowed. However, the logic operators (&, |, ^, ~) are not allowed.

Using Arrays

Variable indexing of arrays is allowed as long as the array is a uniform constant. For compatibility reasons, arrays indexed with variable expressions need not be declared const, just uniform. However, writing to an array that is later indexed with a variable expression yields unpredictable results.

Array data is not packed because vertex program indexing does not permit it. Each element of the array takes a single 4-float program parameter register. For example, float arr[10], float2 arr[10], float3 arr[10], and float4 arr[10] all consume ten program parameter registers.

It is more efficient to access an array of vectors than an array of matrices. Accessing a matrix requires a floor calculation, followed by a multiply by a constant to compute the register index. Because vectors (and scalars) take one register, neither the floor nor the multiply is needed. It is faster to do matrix skinning using arrays of vectors with a premultiplied index than using arrays of matrices.

Constants

Literal constants can be used with this profile, but it is not possible to sto
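The cost difference between indexing an array of matrices and an array of vectors comes down to how the register index is computed. The following C sketch models that arithmetic on the CPU. All function names are hypothetical, and the assumption that a matrix occupies one register per row (so a float4x4 spans four registers) is made for the illustration rather than taken from the text above.

```c
/* Registers used by an array: one per element for scalars and
   vectors, registers_per_element for matrices (assumed one per row). */
int registers_for_array(int element_count, int registers_per_element)
{
    return element_count * registers_per_element;
}

/* Matrix array: the floating point index must be floored and then
   multiplied by the matrix size to find the first register. */
int matrix_register(int base, float index, int registers_per_matrix)
{
    int whole = (int)index;   /* floor, valid for non-negative indexes */
    return base + whole * registers_per_matrix;
}

/* Vector array with a premultiplied index: the application already
   supplies index * registers_per_matrix, so no floor or multiply. */
int vector_register(int base, int premultiplied_index)
{
    return base + premultiplied_index;
}
```

With a base register of 10 and four-row matrices, matrix index 2 and premultiplied index 8 both resolve to register 18, but the second form skips two instructions per lookup.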
133. d the program
      cgGLLoadProgram(fragmentProgram);

      // Grab some parameters.
      position = cgGetNamedParameter(vertexProgram, "position");
      color = cgGetNamedParameter(vertexProgram, "color");
      texCoord = cgGetNamedParameter(vertexProgram, "texCoord");
      modelViewMatrix = cgGetNamedParameter(vertexProgram,
                                            "ModelViewMatrix");
      baseTexture = cgGetNamedParameter(fragmentProgram,
                                        "BaseTexture");
      someColor = cgGetNamedParameter(fragmentProgram, "SomeColor");

      // Set parameters that don't change:
      // They can be set only once because of parameter shadowing.
      cgGLSetTextureParameter(baseTexture, texture);
      cgGLSetParameter4fv(someColor, constantColor);
    }

    // Called to render the scene
    void Display()
    {
      // Set the varying parameters
      cgGLEnableClientState(position);
      cgGLSetParameterPointer(position, 3, GL_FLOAT, 0,
                              vertexPositions);
      cgGLEnableClientState(color);
      cgGLSetParameterPointer(color, 1, GL_FLOAT, 0,
                              vertexColors);
      cgGLEnableClientState(texCoord);
      cgGLSetParameterPointer(texCoord, 2, GL_FLOAT, 0,
                              vertexTexCoords);

      // Set the uniform parameters that change every frame
      cgGLSetStateMatrixParameter(modelViewMatrix,
          CG_GL_MODELVIEW_PROJECTION_MATRIX,
          CG_GL_MATRIX_IDENTITY
134. d within the same scope.

- Vector constructors, such as the form float4(1,2,3,4), may be used anywhere in an expression.
- A struct definition automatically performs a corresponding typedef, as in C++.
- C++ style // comments are allowed in addition to C style /* ... */ comments.

Detailed Language Specification

Definitions

The following definitions are based on the ANSI C standard.

- Object: An object is a region of data storage in the execution environment, the contents of which can represent values. When referenced, an object may be interpreted as having a particular type.
- Declaration: A declaration specifies the interpretation and attributes of a set of identifiers.
- Definition: A declaration that also causes storage to be reserved for an object, or code that will be generated for a function named by an identifier, is a definition.

Profiles

Compilation of a Cg program (a top-level function) always occurs in the context of a compilation profile. The profile specifies whether certain optional language features are supported. These optional language features include certain control constructs and standard library functions. The compilation profile also defines the precision of the float, half, and fixed data types, and specifies whether the fixed and sampler* data types are fully or only partially supported. The choice of a compilation profile is made externally to
135. ded version of simple.cg calculates diffuse and specular lighting for each vertex. Figure 4 shows a screenshot of the shader.

Figure 4 The simple.cg Shader

Program Listing for simple.cg

The following is the program listing for simple.cg:

// Define inputs from application
struct appin {
  float4 Position : POSITION;
  float4 Normal : NORMAL;
};

// Define outputs from vertex shader
struct vertout {
  float4 HPosition : POSITION;
  float4 Color : COLOR;
};

vertout main(appin IN,
             uniform float4x4 ModelViewProj,
             uniform float4x4 ModelViewIT,
             uniform float4 LightVec)
{
  vertout OUT;

  // Transform vertex position into homogeneous clip space
  OUT.HPosition = mul(ModelViewProj, IN.Position);

  // Transform normal from model space to view space
  float3 normalVec = normalize(mul(ModelViewIT, IN.Normal).xyz);

  // Store normalized light vector
  float3 lightVec = normalize(LightVec.xyz);

  // Calculate half angle vector
  float3 eyeVec = float3(0.0, 0.0, 1.0);
  float3 halfVec = normalize(lightVec + eyeVec);

  // Calculate diffuse component
  float diffuse = dot(normalVec, lightVec);

  // Calculate specular component
  float specular = dot(normalVec, halfVec);

  // Use the lit function to compute lighting vector from
  // diffuse and specular values
  float4 lighting = lit(diffuse, specular, 32);

  // Blue
136. describes where to find the necessary vertex attributes in the vertex streams. (See "Expanded Interface Program Execution" on page 74 for the details on the arguments to cgD3D8LoadProgram() and cgD3D9LoadProgram().)

In OpenGL, the equivalent call is

  cgGLLoadProgram(program);

Modifying Program Parameters

The runtime gives you the option of modifying the values of your program parameters. The first step is to get a handle to the parameter:

  CGparameter myParameter = cgGetNamedParameter(program, "myParameter");

The string "myParameter" is the name of the parameter as it appears in the program source code.

The second step is to set the parameter value. The function used depends on the parameter type.

Here is an example in OpenGL:

  cgGLSetParameter4fv(myParameter, value);

Here is the same example in Direct3D:

  cgD3D9SetUniform(myParameter, value);

These function calls assign the four floating-point values contained in the array value to the parameter myParameter, which is assumed to be of type float4.

In both APIs, there are variants of these calls to set matrices, arrays, textures, and texture states.

Executing a Program

Before you can execute a program in OpenGL, you must enable its corresponding profile:

  cgGLEnableProfile(CG_PROFILE_ARBVP1);

In Direct3D, nothing explicitly needs to be done to enable a specific profile.

Next, you bind the program 
137. diffuse material
  float3 diffuseMaterial = float3(0.0, 0.0, 1.0);

  // White specular material
  float3 specularMaterial = float3(1.0, 1.0, 1.0);

  // Combine diffuse and specular contributions and
  // output final vertex color
  OUT.Color.rgb = lighting.y * diffuseMaterial +
                  lighting.z * specularMaterial;
  OUT.Color.a = 1.0;

  return OUT;
}

Definitions for Structures with Varying Data

The first thing to notice is the definitions of structures with binding semantics for varying data.

Let's take a look at the appin structure:

// define inputs from application
struct appin {
  float4 Position : POSITION;
  float4 Normal : NORMAL;
};

This structure contains only two members: Position and Normal. Because this data varies per vertex, the binding semantics POSITION and NORMAL tell the compiler that the position information is associated with the predefined attribute POSITION and that the normal information is associated with the predefined attribute NORMAL.

The other structure that is defined in simple.cg is vertout, which connects the vertex to the fragment:

// define outputs from vertex shader
struct vertout {
  float4 HPosition : POSITION;
  float4 Color : COLOR;
};

The vertout structure also contains only two members: HPosition, the vertex position in homogeneous coordinates, and Color, the vertex color. Again, binding semantics are used to specify register locations for the va
138. distance   leck colos   fleck color    milena wes Insul icing le w          DIFFUSE   flos  je cl    amp entuseente  um ol 1152 5   locus peintkesult   lero  2vulosieimE seua  robos   parco lio de  Drs             FRESNEL  log Bresmel   seua  Clo  ClmesCosenr  Rerlece Color   Fresnel   pow Fresnel  NewPaintSpec z          This helps make the clear coat less omnipresent         only the really  perceptually  bright areas reflect     the most    Fresnel   saturate vert fresn Fresnel         Show more of the specular reflection environment      when in fresnel zones      diffuse    1 fresnel    environment    fresnel   pemaxesulie   lero  paint Result  iWSclecie color  Fresnel            SPECULAR     O rtuse specular lecks  parres ult o o Results Colon          OUTPUT  return paintResult xyzz

Basic Profile Sample Shaders

This chapter provides a set of basic profile sample shaders written in Cg. Each shader comes with an accompanying snapshot, description, and source code.

Examples shown are:

- Anisotropic Lighting
- Bump Dot3x2 Diffuse and Specular
- Bump Reflection Mapping
- Fresnel
- Grass
- Refraction
- Shadow Mapping
- Shadow Volume Extrusion
- Sine Wave Demo
- Matrix Palette Skinning

Anisotropic Lighting

Description

The anisotropic lighting effect (Figure 13) shows the vertex program's half-angle vector calculation. It use
139. ds  rgb   bscale  1 0          tangentSpaceNormal   tangentSpaceNormal   bumpscale           Transform it into eye space             t loyeWE S  my   n 0    dot  In tangentToEyeMat0 xyz  tangentSpaceNormal     n 1    dot  In tangentToEyeMatl  tangentSpaceNormal     n 2    dot  In tangentToEyeMat2  tangentSpaceNormal      n   normalize  n          Compute the  loat       LOa  LOa    LOa  LOa  LOa    lighting equation     t acotl   mesi doe  a  Ly  0  llamo  0  o 1   t nadoda   mezi corta m  O J2  f Cle    to 1   t flag    float   ndotl  gt  0     ompute oil  sheen  subsurf scattering contributions   EA guts   t4 sheen    t4 subsurf           122    808 00504 0000 004  NVIDIA    Advanced Profile Sample Shaders    loa Kr  Kr2   loe IRE KEZ   ElogieSs dU  12   ELOSNES IN  825       Compute fresnel at sheen layer  ramp it up a bit   ra   mass aci Y i  Sta  R UP M   Kr SMOO M sie soln 0 0  Q5  kae p   Mc  alk 0    dies          Compute the refracted light ray and the refraction  Uc oeste nts   ge     esse  IL  ins SEE  IR   UU vp   KoA smooskisisc o 0 07  0 39  ES RR   Ke2   db  c Tac       For oil contribution  modulate the oiliness mask by a     specular term   Oil     0 39   guless   jw melon  ia 9       For sheen contribution  modulate Fresnel term by      sheen color times specular   Modulate by additional      diffuse term to soften it a bit    sheen   2 5 Kr sheenColor  ndotl  0 2   pow  ndoth  m              Compute single scattering approximation to subsurface      scatter
140. e(cgGetParameterResource(color)),
    cgGetParameterResourceIndex(color) },
  { 1, 4 * sizeof(float),
    D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT,
    cgD3D9ResourceToDeclUsage(cgGetParameterResource(texCoord)),
    cgGetParameterResourceIndex(texCoord) },
  D3DDECL_END()
};

DWORD declaration[] = {
  D3DVSD_STREAM(0),
  D3DVSD_REG(cgD3D8ResourceToInputRegister(
               cgGetParameterResource(position)), D3DVSDT_FLOAT3),
  D3DVSD_REG(cgD3D8ResourceToInputRegister(
               cgGetParameterResource(color)), D3DVSDT_D3DCOLOR),
  D3DVSD_STREAM(1),
  D3DVSD_SKIP(4),
  D3DVSD_REG(cgD3D8ResourceToInputRegister(
               cgGetParameterResource(texCoord)), D3DVSDT_FLOAT2),
  D3DVSD_END()
};

The size specified as the second argument of the D3DVSD_REG() macro call of a Direct3D 8 declaration does not need to match the size of the corresponding parameter for the vertex declaration to be valid. Those sizes are specified to describe how the data is laid out in the streams, not to perform any type checking with the shader code. The data referred to by a D3DVSD_REG() macro call is expanded to the four floating-point values of the corresponding hardware register, and the missing values are set to 0 for x, y, and z, and to 1 for w.

Minimal Interface Type Retrieval

Use cgD3D9TypeToSize() to retrieve the size of a CGtype enumerated type in terms of floating-point numbers:

  DWORD cgD3D9TypeToSize(CGtype type);
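The expansion rule stated in excerpt 140 (stream data shorter than four components is padded with 0 for x, y, and z, and with 1 for w) can be illustrated in plain C. This is only a sketch of the rule as described; expand_to_register is a hypothetical name, and the real expansion is performed by Direct3D and the hardware, not by application code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration: copy `count` floats from a vertex stream into
 * a 4-component register image, padding missing components with
 * (0, 0, 0, 1) as the excerpt describes. */
void expand_to_register(const float *src, size_t count, float out[4])
{
    static const float defaults[4] = { 0.0f, 0.0f, 0.0f, 1.0f };
    size_t i;

    assert(count <= 4);
    for (i = 0; i < 4; ++i) {
        out[i] = (i < count) ? src[i] : defaults[i];
    }
}
```

A two-component texture coordinate (u, v) would therefore reach the shader as (u, v, 0, 1), which is why the declared size does not have to match the parameter's type in the Cg source.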
141. e Extrusion       808 00504 0000 004 155  NVIDIA    Cg Language Toolkit    Vertex Shader Source Code for Shadow Volume Extrusion    struct appdata         ne    float4 Position   POSITION   float3 Normal   NORMAL   float4 DiffuseColor   COLORO   float2 TexCoord0   TEXCOORDO        struct vpconn      he    Plat 4 Hpos von PSI ON  float4 Color0   COLORO   float2 TexCoord0   TEXCOORDO        vpconn main  appdata IN     uniform float4x4 WorldViewProj    uniform float4 LightPos      in object space   uniform float4 Fatness    uniform float4 ShadowExtrudeDist    uniform float4 Factors       vpconn OUT        Create normalized vector from vertex to light  tlosur4  ligne to vere   momasllias  N Positron   Igino  7       N dot L to decide if point should be moved away     from the light to extrude the volum  iloeur melojl   doe  eligiir tO wert sya  UN NOrmal sayz  7          Inset the position along      the normal vector direction      This moves the shadow volume points      inside the model slightly to minimize      popping of shadowed areas as      each facet comes in and out of shadow       The Fatness value should be negative   AS pos    TN Normek   Fatness A  UNO Sisto oye S74    eye za   MSS JOOS   IN  POSTON wig       156    808 00504 0000 004  NVIDIA    Basic Profile Sample Shaders       scale the vector from light to vertex  Plc extrusion yes   ligne to vee   Sinverclonmdxceicvicls ul sic y          if ndotl    0 then the vertex faces   y away from the light  so move it   
142. e Qualifiers" on page 175.

- <default> is an expression that resolves to a constant at compile time.

Default values are only permitted for uniform parameters, and for in parameters to functions that are not top-level.

Function Calls

A function call returns an rvalue. Therefore, if a function returns an array, the array may be read but not written. For example, the following is allowed:

  y = myfunc(x)[2];

But this is not:

  myfunc(x)[2] = y;

For multiple function calls within an expression, the calls can occur in any order; it is undefined.

Types

Cg's types are as follows:

- The int type is preferably 32-bit two's complement. Profiles may optionally treat int as float.
- The float type is as close as possible to the IEEE single precision (32-bit) floating point. Profiles must support the float data type.
- The half type is lower-precision IEEE-like floating point. Profiles must support the half type, but may choose to implement it with the same precision as the float type.
- The fixed type is a signed type with a range of at least [-2,2) and with at least 10 bits of fractional precision. Overflow operations on the data type clamp rather than wrap. Fragment profiles must support the fixed type, but may implement it with the same precision as the half or float types. Vertex profiles are required to provide partial support (see "Partial Support of Types" on page 173) for the fixed type. Vertex profiles have the o
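The minimum guarantees given for the fixed type in excerpt 142 (range of at least [-2,2), at least 10 fractional bits, clamping on overflow) can be sketched in plain C. This models only the minimum the text requires; clamp_to_fixed is a hypothetical helper, and the round-to-nearest step is an assumption, since the excerpt does not say how a profile rounds.

```c
#include <assert.h>

#define FIXED_FRACTION_BITS 10  /* minimum stated in the excerpt */

/* Hypothetical model of a minimal `fixed` value: clamp to [-2, 2) and
 * quantize to steps of 1/1024. Out-of-range inputs saturate instead of
 * wrapping around. Rounding to nearest is an assumption. */
float clamp_to_fixed(float value)
{
    const float scale = (float)(1 << FIXED_FRACTION_BITS);  /* 1024 */
    const float lowest = -2.0f;
    const float highest = 2.0f - 1.0f / scale;  /* largest value below 2 */
    int steps;

    if (value < lowest) {
        value = lowest;
    } else if (value > highest) {
        value = highest;
    }
    /* Round to the nearest representable step. */
    steps = (int)(value * scale + (value >= 0.0f ? 0.5f : -0.5f));
    return (float)steps / scale;
}
```

With this model, a computation that overshoots to 3.0 yields the largest representable value just below 2 rather than wrapping to a negative number.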
143. e Steps to High Performance Cg
Appendix D: Cg Compiler Options

List of Figures

Figure 1 Cg's Model of the GPU
Figure 2 The Parts of the Cg Runtime API
Figure 3 The Cg_Simple Workspace
Figure 4 The simple.cg Shader
Figure 5 Example of Improved Skinning
Figure 6 Example of Improved Water
Figure 7 Example of Melting Paint
Figure 8 Example of MultiPaint
Figure 9 Example of Ray-Traced Refraction
Figure 10 Example of Skin
Figure 11 Example of Thin Film Effect
Figure 12 Example of Car Paint 9
Figure 13 Example of Anisotropic Lighting
Figure 14 Example of Bump Dot3x2 Diffuse and Specular
Figure 15 Example of Bump Reflection Mapping
Figure 16 Example of Fresnel
Figure 17 Example of Grass
Figure 18 Exampl
144. e desired scalar value, not just the x component.

Table 37 summarizes the valid binding semantics for uniform parameters in the ps_1_x profiles.

Table 37 ps_1_x Uniform Input Binding Semantics

  Binding Semantics Name         Corresponding Data
  register(s0)-register(s3),     Texture unit N, where N is in range [0..3].
  TEXUNIT0-TEXUNIT3              May be used only with uniform inputs with sampler* types.
  register(c0)-register(c7),     Constant register [0..7]
  C0-C7

Binding Semantics for Varying Input/Output Data

The varying input binding semantics in the ps_1_x profiles are the same as the varying output binding semantics of the vs_1_1 profile.

Varying input binding semantics in the ps_1_x profiles consist of COLOR0, COLOR1, TEXCOORD0, TEXCOORD1, TEXCOORD2 and TEXCOORD3. These map to output registers in DirectX vertex shaders.

Table 38 summarizes the valid binding semantics for varying input parameters in the ps_1_x profiles.

Table 38 ps_1_x Varying Input Binding Semantics

  Binding Semantics Name         Corresponding Data
  COLOR, COLOR0, COL, COL0       Input color value v0
  COLOR1, COL1                   Input color value v1
  TEXCOORD0-TEXCOORD3,           Input texture coordinates t0-t3
  TEX0-TEX3

Additionally, the ps_1_x profiles allow POSITION, FOG, PSIZE, TEXCOORD4, TEXCOORD5, TEXCOORD6, and TEXCOORD7 to be specified on varying inputs, provided these inputs are not r
145. e is reduced gradually at every level such that in the distance the flecks are pointing mostly up. The flecks' specular power and their contribution are reduced by distance, to give it a grainier appearance up close and a more uniform appearance from afar. Next, the view vector is reflected off a wavy normal map, which represents the object's natural undulations, to index into the environment map. The shininess of the clear coat itself is calculated by scaling the Fresnel term by the luminance (see note 1) of the environment map. Finally, the shader lerps between the diffuse paint color and the reflection based on the Fresnel term, and adds the specular highlights.

Figure 12 Example of Car Paint 9

1. The luminance transfer function selects only the perceptually bright areas of the environment map in order not to reflect the darker areas of the scene.

Vertex Shader Source Code for Car Paint 9

// This shader is based on the Time Machine temporal rust
// shader. Car paint data was measured by Cornell
// University from samples provided by Ford Motor Company.

struct alv      float4   float3   float2     t ll oiu   1e ll oiu     float3  he    OPosition  ONormal  uv  Tangent  Binormal  Normal    struct VS_OUTPUT      float4  float2  float3  float4  float3  float4  float3  float3  i   oe  float       be       VS OUTPUT main       TRANSFORMATIONS  uniform float4x4  uniform float4x4  uniform 
146. e of Refraction
Figure 19 Example of Shadow Mapping
Figure 20 Example of Shadow Volume Extrusion
Figure 21 Example of Sine Wave
Figure 22 Example of Matrix Palette Skinning

List of Tables

Table 1 Mathematical Functions
Table 2 Geometric Functions
Table 3 Texture Map Functions
Table 4 Derivative Functions
Table 5 Debugging Function
Table 6 Type Conversions
Table 7 Expanded Operators
Table 8 Vertex Output Binding Semantics
Table 9 Fragment Output Binding Semantics
Table 10 vs_2_* Uniform Input Binding Semantics
Table 11 vs_2_* Varying Input Binding Semantics
Table 12 vs_2_* Varying Output Binding Semantics
Table 13 ps_2_* Uniform Input Binding Semantics
Table 14 ps_2_* Varying Input Binding Semantics
Table 15 ps_2_* Varying Output Binding
147. e previously computed dot products. The returned vector holds the diffuse lighting contribution in the y coordinate, and the specular lighting contribution in the z coordinate.

Remember to take advantage of the Standard Library to help speed up your development cycle.

Modulating the Diffuse and Specular Lighting Contributions

Once the diffuse and specular lighting contributions lighting.y and lighting.z have been calculated, we need to modulate them with the object's material properties:

  // blue diffuse material
  float3 diffuseMaterial = float3(0.0, 0.0, 1.0);

  // white specular material
  float3 specularMaterial = float3(1.0, 1.0, 1.0);

  // combine diffuse and specular contributions and
  // output final vertex color
  OUT.Color.rgb = lighting.y * diffuseMaterial +
                  lighting.z * specularMaterial;
  OUT.Color.a = 1.0;

  return OUT;

We define the object's diffuse material color as blue and its specular material color as white. We modulate the lighting contributions with the material properties to get the final vertex color, and we assign it to the output structure's color field, OUT.Color. Finally, we set the alpha channel of the final color to 1.0, so that our object will be opaque, and return the computed position and color values stored in the OUT structure.

Further Experimentation

Use simple.cg as a framework to try more advanced experiments, perhaps by adding more parameters to the program or by performing more complex calculations in the vertex program. Have fun experimen
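The lighting arithmetic walked through in excerpt 147 can be traced on the CPU with a small plain-C sketch. It follows the usual definition of the Cg Standard Library lit function (x and w are 1, y is the diffuse term clamped at zero, z is the specular term, forced to zero when either dot product is negative). The names Float4, Float3, lit_sketch and shade_vertex are hypothetical, and the exponent is restricted to a non-negative integer here purely to keep the sketch free of math-library calls.

```c
#include <assert.h>

typedef struct { float x, y, z, w; } Float4;  /* hypothetical stand-in for float4 */
typedef struct { float r, g, b; } Float3;     /* hypothetical stand-in for float3 */

/* CPU sketch of lit(NdotL, NdotH, m): y holds the diffuse coefficient,
 * z holds the specular coefficient. */
Float4 lit_sketch(float n_dot_l, float n_dot_h, unsigned int exponent)
{
    Float4 result = { 1.0f, 0.0f, 0.0f, 1.0f };
    unsigned int i;

    if (n_dot_l > 0.0f) {
        result.y = n_dot_l;
    }
    if (n_dot_l >= 0.0f && n_dot_h >= 0.0f) {
        float power = 1.0f;
        for (i = 0; i < exponent; ++i) {
            power *= n_dot_h;  /* integer power by repeated multiplication */
        }
        result.z = power;
    }
    return result;
}

/* Same combination as in the listing:
 * color.rgb = lighting.y * diffuseMaterial + lighting.z * specularMaterial */
Float3 shade_vertex(Float4 lighting, Float3 diffuse_material, Float3 specular_material)
{
    Float3 color;
    color.r = lighting.y * diffuse_material.r + lighting.z * specular_material.r;
    color.g = lighting.y * diffuse_material.g + lighting.z * specular_material.g;
    color.b = lighting.y * diffuse_material.b + lighting.z * specular_material.b;
    return color;
}
```

Note that the sum is not clamped in this sketch, so a vertex facing both the light and the half-angle vector can produce a blue channel above 1.0; how that value is treated afterwards depends on the output register and is outside the scope of the excerpt.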
148. e standard library documentation for descriptions of these functions.

Table 35 Supported Standard Library Functions

  dot(floatN, floatN)
  lerp(floatN, floatN, floatN)
  lerp(floatN, floatN, float)
  tex1D(sampler1D, float)
  tex1D(sampler1D, float2)
  tex1Dproj(sampler1D, float2)
  tex1Dproj(sampler1D, float3)
  tex2D(sampler2D, float2)
  tex2D(sampler2D, float3)
  tex2Dproj(sampler2D, float3)
  tex2Dproj(sampler2D, float4)
  tex3D(sampler3D, float3)
  tex3Dproj(sampler3D, float4)
  texCUBE(samplerCUBE, float3)
  texCUBEproj(samplerCUBE, float4)

Note: The non-projective texture lookup functions are actually done as projective lookups on the underlying hardware. Because of this, the w component of the texture coordinates passed to these functions from the application or vertex program must contain the value 1.

Texture coordinate parameters for projective texture lookup functions must have swizzles that match the swizzle done by the generated texture addressing instruction. While this may seem burdensome, it is intended to allow ps_1_x profile programs to behave correctly under other pixel shader profiles.

Table 36 lists the swizzles required on the texture coordinate parameter to the projective texture lookup functions.

Table 36 Required Projective Texture Lookup Swizzles

  Te
149. e7   oia 19    float kE   ndott   dot   n t         Cost diy COS iO C   elz  Gloss ciy Cosi   O its    ss    CO  Siem eli COSL   ote     Cost chiy COSL    Ste    FSE SaS   so    Cosi chiy esse   Gta    Cosi chiy casi   Gta         19   jeg    598   ise e 9 5 s  eS     estube   rice sr  s    lleve    2  ie er cohlke crt  e te            return result     float4 main  fragin In   uniform sampler2D tex0   uniform sampler2D texl   uniform sampler2D tex2   uniform sampler2D tex3   uniform float3 eyeSpaceLightPosition   uniform float thickness   uniform float4 ambient     COLOR       808 00504 0000 004 121  NVIDIA    Cg Language Toolkit         LOat         LOa    ag    Loa  Loa  Loa  Loa  Loa    O  Loa     7         1    LOa    LOat             bscale   In tangentToEyeMat0 w   vara    O71 1   atio of indices of refraction  air skin   5 tm   S455    specular exponent  tA lsgimCcolos     i  ij i  1 by ff Xxgjwe colos  tA simeeCcolor     i  1  1  1 he    sees colos  v4 sikeimColeie   e 2D  esl  In  CSCO A  ES a  OS Der GeO    t3 albedo    0 8 03  054 ip  iliness mask  cl laos   0 3  SAD mea  mee o s   L eye spac VE VECTOR  t3 v   normalize   In eyeSpacePosition          Get eye space light and halfangle vectors     if     Loa    loa       t3 1   normalize  eyeSpaceLightPosition    In eyeSpacePosition     Eo la e doxwuedbbee  wd p       Get tangent space normal vector from normal map     157  f    Loa       loa    t3 tangentSpaceNormal  t3 bumpscale   bscale     tex2D tex0  In texcoor
150. eclaration for Direct3D 9:

const D3DVERTEXELEMENT9 declaration[] = {
  { 0, 0 * sizeof(float),
    D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
    D3DDECLUSAGE_POSITION, 0 },  // Position
  { 0, 3 * sizeof(float),
    D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
    D3DDECLUSAGE_NORMAL, 0 },    // Normal
  { 0, 8 * sizeof(float),
    D3DDECLTYPE_FLOAT2, D3DDECLMETHOD_DEFAULT,
    D3DDECLUSAGE_TEXCOORD, 0 },  // Base texture
  { 1, 0 * sizeof(float),
    D3DDECLTYPE_FLOAT3, D3DDECLMETHOD_DEFAULT,
    D3DDECLUSAGE_TEXCOORD, 1 },  // Tangent
  D3DDECL_END()
};

Here is an example of a vertex declaration for Direct3D 8:

const DWORD declaration[] = {
  D3DVSD_STREAM(0),
  D3DVSD_REG(D3DVSDE_POSITION, D3DVSDT_FLOAT3),   // Position
  D3DVSD_REG(D3DVSDE_NORMAL, D3DVSDT_FLOAT3),     // Normal
  D3DVSD_SKIP(2),  // Skip the diffuse and specular color
  D3DVSD_REG(D3DVSDE_TEXCOORD0, D3DVSDT_FLOAT2),  // Base texture
  D3DVSD_STREAM(1),  // Tangent basis stream
  D3DVSD_REG(D3DVSDE_TEXCOORD1, D3DVSDT_FLOAT3),  // Tangent
  D3DVSD_END()
};

Both declarations tell the Direct3D runtime to find (1) the positions of the vertices in stream 0 as the first three floating-point values of the vertex format, (2) the normals as the next three floating-point values following the three floating-point values in stream 0, and (3) the texture 
151. ed Water       808 00504 0000 004 101  NVIDIA    Cg Language Toolkit    Vertex Shader Source Code for Improved Water    struct app2vert       float4 Position 3 POSITION     H    struct vert2frag                  float4 HPosition   POSITION   float4 TexCoord0   TEXCOORDO   float4 TexCoordl   TEXCOORDI   float4 Color0 ESOO RO  float4 Colorl zu e TRES       y    void calcWave out float disp  out float2 normal   float dampening  float3 viewPosition   float waveTime  float height   float frequency  float2 waveDirection     float distancel   dot  viewPosition xy  waveDirection    distancel   frequency   distancel   waveTime     disp   height   sin distancel    dampening   normal    cos distancel    height   frequency     waveDirection xy      4 dampening      vert2frag main    app2vert IN   uniform float4x4 ModelViewProj   uniform float4x4 ModelView   uniform float4x4 ModelViewIT   uniform float4x4 TextureMat   uniform float Time              uniform float4 Wavel   uniform float4 WavelOrigin   uniform float4 Wave2   uniform float4 Wave20rigin     const uniform float4 WaveData 5      vert2frag OUT        102 808 00504 0000 004  NVIDIA    Advanced Profile Sample Shaders    float4 position   float4 IN Position x  0   NADOS ont  float4 normal   float4 0 1 0 0    float dampening   1   dot  position xyz  position xyz  1000   loci  at  Clio   float2 norm     ron O s   al    d             float waveTime   Time x   WaveData i  z   float frequency   WaveData i  z    float height   WaveData i
152. ed for any struct containing arrays.

Minimum Array Requirements

Profiles are required to provide partial support for certain kinds of arrays. This partial support is designed to support vectors and matrices in all profiles. For vertex profiles, it is additionally designed to support arrays of light state (indexed by light number) passed as uniform parameters, and arrays of skinning matrices passed as uniform parameters.

Profiles must support subscripting, copying, and swizzling of vectors and matrices. However, subscripting with run-time computed indices is not required to be supported.

Vertex profiles must support the following operations for any non-packed array that is a uniform parameter to the program, or is an element of a structure that is a uniform parameter to the program. This requirement also applies when the array is indirectly a uniform program parameter (that is, it and/or the structure containing it has been passed via a chain of in function parameters). The two operations that must be supported are:

- Rvalue subscripting by a run-time computed value or a compile-time value
- Passing the entire array as a parameter to a function, where the corresponding formal function parameter is declared as in

The following operations are explicitly not required to be supported:

- Lvalue subscripting
- Copying
- Other operators, including multiply, add, compare, and so on

Function
153. eferenced. This allows Cg programs to have the same structure specify the varying output of a vs_1_1 profile program and the varying input of a ps_1_x profile program.

Table 39 summarizes the valid binding semantics for varying output parameters in the ps_1_x profile.

Table 39 ps_1_x Varying Output Binding Semantics

  Binding Semantics Name         Corresponding Data
  COLOR, COLOR0, COL, COL0       Output color (float4)
  DEPTH, DEPR                    Output depth (float)

The output depth value is special in that it may only be assigned a value in the ps_1_3 profile, and must be of the form:

  float4 t = <texture addressing operation>;
  float z = dot(texCoord<n>, t.xyz);
  float w = dot(texCoord<n+1>, t.xyz);
  depth = z / w;

Auxiliary Texture Functions

Because the capabilities of the texture addressing instructions are limited in DirectX pixel shader 1.x, a set of auxiliary functions is provided in these profiles that express the functionality of the more complex texture addressing instructions. These functions are merely provided as a convenience for writing ps_1_x Cg programs. The same result can be achieved by writing the expanded form of each function directly. Using the expanded form has the additional advantage of being supported on other profiles.

Table 40 summarizes these functions.

Table 40 ps_1_x Auxiliary Texture Functions

  Texture Function               Description
154. ement

- if/else
- while
- for

These control constructs require that their conditional expressions be of type bool. Because Cg expressions like i <= 3 are of type bool, this change from C is normally not apparent.

The vs_2_x and vp30 profiles support branch instructions, so for and while loops are fully supported in these profiles. In other profiles, for and while loops may only be used if the compiler can fully unroll them (that is, if the compiler can determine the iteration count at compile time). Likewise, return can only appear as the last statement in a function in these profiles.

Function recursion (and co-recursion) is forbidden in Cg.

The switch, case, and default keywords are reserved, but they are not supported by any profiles in the current release of the Cg compiler.

Function Definitions and Function Overloading

To pass a modifiable function parameter in C, the programmer must explicitly use pointers. C++ provides a built-in pass-by-reference mechanism that avoids the need to explicitly use pointers, but this mechanism still implicitly assumes that the hardware supports pointers. Cg must use a different mechanism because the vertex and fragment hardware of the GPU does not support the use of pointers. Cg passes modifiable function parameters by value-result, instead of by reference. The difference between these two methods is subtle; it is only apparent when two fun
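The distinction between pass-by-reference and pass-by-value-result mentioned at the end of excerpt 154 can be made visible in plain C by passing the same variable for two modifiable parameters. This is a sketch of the general technique, not of the Cg compiler's implementation: both function names are hypothetical, C pointers are used only to emulate the copy-in and copy-out steps, and the order in which results are copied back is an assumption.

```c
#include <assert.h>

/* By reference: both parameters may name the same storage, so the second
 * update sees the effect of the first. */
void bump_by_reference(int *a, int *b)
{
    *a += 1;
    *b += 10;
}

/* Value-result emulation: copy the arguments in, work on private copies,
 * then copy the results out when the function returns. */
void bump_by_value_result(int *a, int *b)
{
    int local_a = *a;  /* copy in */
    int local_b = *b;

    local_a += 1;
    local_b += 10;

    *a = local_a;      /* copy out; this order is an assumption */
    *b = local_b;
}
```

Starting from zero and passing the same variable twice, the by-reference version ends at 11, while the value-result version ends at 10 because each parameter started from its own copy of the original value. When the two arguments are distinct variables, both versions give the same results.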
155. eter profileType is equal to CG_GL_VERTEX or CG_GL_FRAGMENT. Function cgGLGetLatestProfile() may be used in conjunction with cgCreateProgram() or cgCreateProgramFromFile() to ensure that the best available vertex and fragment profiles are used for compilation. This allows you to make your application future-ready, because the Cg programs are automatically compiled for the best profiles that are available at run time, even if these profiles did not exist at the time the application was written. Another function that allows optimal compilation is cgGLSetOptimalOptions(). It sets implicit compiler arguments that are appended to the argument list passed to cgCreateProgram() or cgCreateProgramFromFile():

  void cgGLSetOptimalOptions(CGprofile profile);

OpenGL Program Execution

All programs must be loaded before they can be bound. To load a program, use cgGLLoadProgram():

  void cgGLLoadProgram(CGprogram program);

Binding a program only works if its profile is enabled. This is done by calling cgGLEnableProfile() with the program profile:

  void cgGLEnableProfile(CGprofile profile);

The binding itself is done using cgGLBindProgram():

  void cgGLBindProgram(CGprogram program);

Only one vertex program and one fragment program can be bound at any given time, so binding a program implicitly unbinds any other program of that type. Profiles are disabled using cgGLDisableProfile():
156. ex program specification. The half data type is implemented as float. The fixed or sampler* data types are not supported, but the profile does provide the minimal partial support that is required for these data types by the core language specification; that is, it is legal to declare variables using these types as long as no operations are performed on the variables.    Compatibility with the vp20 Vertex Program Profile    Programs that work with the vp20 profile are compatible with the arbvp1 profile as long as they use the Cg run-time to manage all uniform parameters, including OpenGL state. That is, arbvp1 and vp20 profiles can be used interchangeably without changing the Cg source code or the application program, except for specifying a different profile. However, if any of the glProgramParameterxxNV() routines are used, the application program needs to be changed to use the corresponding ARB functions. Since there is no ARB function corresponding to glTrackMatrixNV(), an application using glTrackMatrixNV() and the arbvp1 profile needs to be modified. One solution is to change the Cg source code to refer to the matrix using the glstate structure so that the matrix is automatically tracked by the OpenGL driver as part of its GL ARB vertex support. Another solution is for the application to use the Cg run-time routine cgGLSetStateMatrixParameter() to load the appropriate matrix or matrices when necessary. Another potential incompatibility between 
157. figured programmable pipelines by using programming interfaces at the assembly language level. In theory, these low-level programming interfaces provided great flexibility. In practice, they were painful to use and presented a serious barrier to the effective use of hardware. Using a high-level programming language, rather than the low-level languages of the past, provides several advantages: • A high-level language speeds up the tweak-and-run cycle when a shader is developed. The ultimate test for a shader is "Does it look right?" To that end, the ability to quickly prototype and modify a shader is crucial to the rapid development of high-quality effects. • The compiler optimizes code automatically and performs low-level tasks, such as register allocation, that are tedious and prone to error. • Shading code written in a high-level language is much easier to read and understand. It also allows new shaders to be easily created by modifying previously written shaders. What better way to learn than from a shader written by the best artists and programmers? • Shaders written in a high-level language are portable to a wider range of hardware platforms than shaders written in assembly code. This chapter introduces Cg ("C for Graphics"), a new high-level language tailored for programming GPUs. Cg offers all the advantages just described, allowing programmers to finally combine the inherent power of the GPU with a language that makes GPU programming
158. files  use of 168  compiler options   command line 265    debug 266    Dmacro 265    entry 265    h 266    Ipathname 265      filename 265    longprogs 266    maxunrollcount 266    nocode 265    nofx 265    nostdlib 265    0 265    profile 265    profileopts 265    quiet 265    strict 265    v 266  compile time type category 174       268    808 00504 0000 004    NVIDIA    computation frequency for performance 262   concrete type category 174   conditional code in fragment programs and  performance 263   conditional operator 190   conditional operators 17   constants  typing of 174   construction operator  described 186   context  core Cg 35   control constructs used 13   core Cg context 35   core Cg runtime 34    D  data types  bool 11  fixed 11  float 10  half 11  int 11  sampler 11  supported 10  data types for performance 261  debugging function 28  declaration  Cg definition 168  definition  as used in Cg 168  derivative functions 27  Direct3D Cg runtime 57  cgD3D9EnableDebugTracing   85  cgD3D9GetLastError   87  cgD3D9TranslateHRESULT   87  CGerror 86  debugging mode 83  error callbacks 87  error testing 87  error types 85  expanded interface 69  cgD3D8LoadProgram   75  cgD3D8SetSamplerState   73  cgD3D9BindProgram   76  cgD3D9EnableParameterShadowing    74  cgD3D9GetDevice   70  cgD3D9GetLatestPixelProfile   76    cgD3D9GetLatestVertexProfile   76  cgD3D9GetOptimalOptions   77  cgD3D9IsParameterShadowingEnable  dO 74  cgD3D9IsProgramLoaded   76  cgD3D9LoadProgram   74 
159. float4 dir x inten  IN Color0 y   Chit  mc   9  g      do the Bezier linear interpolation steps    stuff here                   808 00504 0000 004    NVIDIA    147    Cg Language Toolkit    float t   IN Color0 w    LOBE ieee   leo  CETE ECEE ic  Pp  float4 temp2   lerp ctrl2  ctrl3  t    float4 result   lerp temp  temp2  t                add IN the height and wind displacement components  position   position   result   position w   1        transform for sending to the reg  combiners  OUT Hposition   mul ModelViewProj  position            calculate the texture coordinate      from the position passed IN  QU ASCO    loc    Cri POSi EQ Sx  wr 09  MEL 1  dl  10  m       find the normal      we need one more point to do a partial  ices    Jewel  qeux12  OO    canoa   leg  Geel2  eS  i05  7   float4 newResult   lerp temp  temp2  t 0 05               do a crossproduct with a vector that      is horizontal across the screen    float normal   cross  result   newResult  xyz   loas tir rO Ore  normal   normalize  normal         calculate diffuse lighting off the normal       that was just calculated   loses kewe RoS   Elocues  0  5  1D  p   float3 lightVec   normalize  lightPos   position    float diffuseInten   dot lightVec  normal      M Sart wo che tinel colos     The first term is a semi random term based  ll on the total height of this straw     The second term is the diffuse lighting component  OUT Color0   normalize ctrl3    diffuseInten    IN 5 IXoysl  tie d  ono Ze       retur
160. float4x4  uniform float3  uniform float3    HPosition  uv   light  halfangle    Rele cto ne    view  tangent  binormal  normal  fresn    VS OUTPUT O        Generat    a2v vert     POSTTION   NORMAL    EXCOORDO   PEXCOORD1   l EXCOORD2    EXCOORD3     e    E    Er               Vel  by      44       by      Pl  He     POSTELON    EXCOORDO   PEXCOORD1   l EXCOORD2    EXCOORD3    TE XCOORD4    EXCOORD5    EXCOORD6    EXCOORD7   COLORO     coord position in window  wavy fleckmap coords  light pos  tangent space   Blinn halfangle   Refl vector  per vertex   view  tangent space   view tangent matrix    E    ES         E    ET              E             odelView   odelViewIT   odelViewProj   LightVector   EyePosition         Obj  USOS    space  space             O HPosition      homogeneous POSITION  mul  ModelViewProj     vert OPosition         Generate BASIS matrix    float3x3 ModelTangent        normalize vert Tangent    normalize vert Binormal         128    808 00504 0000 004  NVIDIA       FRESNEL  float4 Fresn       float3x3 Vie       Generate VI    float3 viewN  t4 viewP    P w    loa  view    float3 viewV     Generate  iELOENES   glow  float3 objL    float3 objH             Generate  float3 tanL  float3 tanV  float3 tanH    Advanced Profile Sample Shaders                      normalize vert Normal           OERSET SCALE  POWER I UNUSED IS   el Ex HE Odes cula E On Ose due  wlangent   mul  ModelTangent           EW SPAC  normalize       i        OBJECT SPAC  normalize        
161. for a program, in order to avoid any unfortunate inconsistencies it is advisable to stick with the expanded interface for all shader-related operations that can be performed through its functions, such as shader setting, shader activation, and parameter setting (including setting texture stage states).    Setting the Direct3D Device    The expanded interface encapsulates more functionality than the minimal interface to ease program and parameter management. It does this by making the appropriate Direct3D calls at the appropriate times. Because some of these calls require the Direct3D device, it must be communicated to the Cg runtime: HRESULT cgD3D9SetDevice(IDirect3DDevice9* device); You can get the Direct3D device currently associated with the runtime using cgD3D9GetDevice(): IDirect3DDevice9* cgD3D9GetDevice(); When cgD3D9SetDevice() is called with zero as an input, all Direct3D resources used by the expanded interface are released. Since a Direct3D device is destroyed only when all references to it are removed, the application should call cgD3D9SetDevice() with zero as an input when it is done with a Direct3D device so that it gets destroyed when the application shuts down. Otherwise, Direct3D does not shut down properly and reports memory leaks to the debug console. Note that calling cgD3D9SetDevice() with zero as an input does not affect the Cg core runtime resources in any wa
162. fragmentProgram          Draw scene              Called before the device changes or is destroyed  void OnDestroyDevice             ff Clas tas i wince iem tetis ta xpanded interface to    release ies internal referente to the Directa device  tando tree ats Direct3D  resources   cgD3D9SetDevice  0                     Called before application shuts down    void OnShutdown               This frees any core runtime resource   cgDestroyContext  context          80    808 00504 0000 004  NVIDIA    Using the Cg Runtime Library    Expanded Interface DirectD3D 8 Application    The following C code links the previous vertex and fragment programs to the  Direct3D 8 application     finclude   cg cg h    finclude  lt cg cgD3D8 h gt                    IDirect3DDevice8  device     Initialized somewher ls  IDirect3DTexture8  texture     Initialized somewher ls  D3DXCOLOR constantColor     Initialized somewher ls     ClerciomtieexiE   Cuore esxE D  CGprogram vertexProgram  fragmentProgram   CGparameter baseTexture  someColor  modelViewMatrix           Called at application startup  void OnStartup               Create context  context   cgCreateContext                   Called whenever the Direct3D device needs to be created  void OnCreateDevice              Pass the Direct3D device to th xpanded interfac  cgD3D8Set Device  device                   Determine the best profiles to use  CGprofile vertexProfile   cgD3D8GetLatestVertexProfile     CGprofile pixelProfile   cgD3D8GetLatestPixelProfi
163. g Language Specification    Note that when the array is rvalue-subscripted, the result is an expression, and this expression is no longer considered to be a uniform program parameter. Therefore, if this expression is an array, its subsequent use must conform to the standard rules for array usage. These rules are not limited to arrays of numeric types, and thus imply support for arrays of struct, arrays of matrices, and arrays of vectors when the array is a uniform program parameter. Maximum array sizes may be limited by the number of available registers or other resource limits, and compilers are permitted to issue error messages in these cases. However, profiles must support sizes of at least float arr[8], float4 arr[8], and float4x4 arr[4][4]. Fragment profiles are not required to support any operations on arbitrarily sized arrays; only support for vectors and matrices is required.    Overloading    Multiple functions may be defined with the same name, as long as the definitions can be distinguished by unqualified parameter types and do not have an open-profile conflict (see "Overloading of Functions by Profile" on page 170). Function matching rules: 1. Add all visible functions with a matching name in the calling scope to the set of function candidates. 2. Eliminate functions whose profile conflicts with the current compilation profile. 3. Eliminate functions with the wrong number of formal parameters. If a candidate function has exces
164. g Output Binding Semantics (continued)    Binding Semantics Name: Corresponding Data    COLOR0, COL0: output primary color; COLOR1, COL1: output secondary color; BCOL0: output backface primary color; BCOL1: output backface secondary color; TEXCOORD0-TEXCOORD3, TEX0-TEX3: output texture coordinates. The profile also allows WPOS to be present as binding semantics on a member of a structure of a varying output data structure, provided the member with this binding semantics is not referenced. This allows Cg programs to have the same structure specify the varying output of a vp20 profile program and the varying input of an fp30 profile program.    OpenGL NV_texture_shader and NV_register_combiners Profile (fp20)    The OpenGL NV_texture_shader and NV_register_combiners profile is used to compile Cg source code to the nvparse text format for the NV_texture_shader and NV_register_combiners family of OpenGL extensions. • Profile name: fp20 • How to invoke: Use the compiler option -profile fp20. This document describes the capabilities and restrictions of Cg when using the fp20 profile.    Overview    Operations in the fp20 profile can be categorized as texture shader operations and arithmetic operations. Texture shader operations are operations which generate texture shader instructions; arithmetic operations are operations which generate register combiners inst
165. g semantic. For example, the following is legal, although not recommended: struct myfragoutput { float2 mycolor : COLOR; }; In such cases, the variable is implicitly copied (with a typecast) to the semantic upon program completion. If the variable's vector size is shorter than the semantic's vector size, the larger-numbered components of the semantic receive their default values, if applicable, and otherwise are undefined. In the case above, the R and G components of the output color are obtained from mycolor, while the B and A components of the color are undefined.    Appendix B  Language Profiles    This appendix describes the language capabilities that are available in each of the following profiles supported by the Cg compiler: • DirectX Vertex Shader 2.x Profiles (vs_2_*) • DirectX Pixel Shader 2.x Profiles (ps_2_*) • OpenGL ARB Vertex Program Profile (arbvp1) • OpenGL ARB Fragment Program Profile (arbfp1) • OpenGL NV_vertex_program 2.0 Profile (vp30) • OpenGL NV_fragment_program Profile (fp30) • DirectX Vertex Shader 1.1 Profile (vs_1_1) • DirectX Pixel Shader 1.x Profiles (ps_1_*) • OpenGL NV_vertex_program 1.0 Profile (vp20) • OpenGL NV_texture_shader and NV_register_combiners Profile (fp20). In each case, the capabilities are a subset of the full capabilities described by the Cg language specification in "Cg Language Specification" on page 165.
166. ghtEXT command; NORMAL: input normal through Normal command; COLOR0, DIFFUSE: input primary color through Color command; COLOR1, SPECULAR: input secondary color through SecondaryColorEXT command; FOGCOORD: input fog coordinate through FogCoordEXT command; TEXCOORD0-TEXCOORD7: input texture coordinates (texcoord0-texcoord7) through MultiTexCoord command; ATTR0-ATTR15: generic attribute 0-15 through VertexAttrib command; PSIZE, ATTR6: generic attribute 6. Table 21 summarizes the valid binding semantics for varying output parameters in the arbvp1 profile. These binding semantics map to ARB vertex program output registers. The two sets act as aliases to each other.    Table 21  arbvp1 Varying Output Binding Semantics    Binding Semantics Name: Corresponding Data    POSITION, HPOS: output position; PSIZE, PSIZ: output point size; FOG, FOGC: output fog coordinate; COLOR0, COL0: output primary color; COLOR1, COL1: output secondary color; BCOL0: output backface primary color; BCOL1: output backface secondary color; TEXCOORD0-TEXCOORD7, TEX0-TEX7: output texture coordinates. Note: The application must call glEnable(GL_COLOR_SUM_ARB) in order to enable COLOR1 output when using the arbvp1 profile. The profile also allows WPOS to be present as binding semantics on a member of a structure of a varying output data structure, provided the member with this binding semantics
167. gle xyz                     Tangent space VIEW vector  float3 V   normalize vert view xyz    ilo  w Cist   vert  VIEW  Wy          Tangent space WAVY NORMAL  float3 wavyN    float3 tex2D WavyMap  vert uv  2 1   wavyN   normalize  wavyN WavyScale            PAINT      A normal map map could be loaded here instead if      we wanted more detail  In this case we have a      uniform tangent space normal  0 0 1    iloghne im cl id   Ih we   slow im db in   iz   float3 paint color    float3 tex2D PaintMap   AA  m gl i  m el im               SPECULAR POWER   use a saturated diffuse term     to clamp the backlighting  n dh   saturate n d 1 4  pow n d h  NewPaintSpec y                REFLECTION ENVIRONMENT      Reflect view vector about wavy normal and bring      to view space   float3 R   reflect  V  wavyN     R   R x vert tangent   R y vert binormal    R z vert normal    float3 reflect color    float3 texCUBE  EnvironmentMap  R                        FLECKS     Load random 3 vector flecks from fleck map     Reduce tiling artifacts by sampling at      different frequencies                   float3 fleckN    float3 tex2D FleckMap  vert uv 37  2 1   fleckN     float3 tex2D FleckMap  vert uv 23  2 1  2    ELECKN  25  808 00504 0000 004 131    NVIDIA    Cg Language Toolkit    log lees a alla saturate  dot  fleckN  H      iloenES fleck colo   Mecicolor   joxoxw Griexele m cl la   lerp NewPaintSpec y  NewPaintSpec w  v dist        Control the ambient fleckiness and also      attenuate with 
168. glstate.matrix.modelview[0], glstate.matrix.projection, glstate.matrix.mvp, glstate.matrix.texture[0], glstate.matrix.palette[0], glstate.matrix.program[0], glstate.matrix.inverse.modelview[0], glstate.matrix.inverse.projection, glstate.matrix.inverse.mvp, glstate.matrix.inverse.texture[0], glstate.matrix.inverse.palette[0], glstate.matrix.inverse.program[0], glstate.matrix.transpose.modelview[0], glstate.matrix.transpose.projection, glstate.matrix.transpose.mvp, glstate.matrix.transpose.texture[0], glstate.matrix.transpose.palette[0], glstate.matrix.transpose.program[0], glstate.matrix.invtrans.modelview[0], glstate.matrix.invtrans.projection, glstate.matrix.invtrans.mvp, glstate.matrix.invtrans.texture[0], glstate.matrix.invtrans.palette[0], glstate.matrix.invtrans.program[0]. Table 17 lists the glstate fields of type float4 that can be accessed.    Table 17  float4 glstate Fields    glstate.material.ambient, glstate.material.diffuse, glstate.material.specular, glstate.material.emission, glstate.material.shininess, glstate.material.front.ambient, glstate.material.front.diffuse, glstate.material.front.specular, glstate.material.front.emission, glstate.material.front.shininess, glstate.material.back.ambient, glstate.material.back.diffuse, glstate.material.back.specular, glstate.material.back.emission, glstate.material.back.shininess, glstate.light[0].ambient, glstate.light[0].diffuse, glstate.light[0].specular, glstate.light[0].position, glstate.light[0].attenuation, glstate.light[0].spot.directio
169. hader specifc flags  like declaration and     usage    cgD3D8LoadProgram fragmentProgram  TRUE  0  0  0               Grab some parameters   modelViewMatrix   cgGetNamedParameter  vertexProgram    ModelViewMatrix     baseTexture   cgGetNamedParameter fragmentProgram    BaseTexture     someColor   cgGetNamedParameter  fragmentProgram    SomeColor                      Sanity check that parameters have th xpected siz                assert  cgD3D8TypeToSize  cgGetParameterType    modelViewMatrix      16    assert  cgD3D8TypeToSize  cgGetParameterType  someColor          may       Set parameters that don t change  They can be set     only once since parameter shadowing is enabled  cgD3D8SetTexture baseTexture  texture    cgD3D8SetUniform someColor   amp constantColor               82 808 00504 0000 004  NVIDIA    Using the Cg Runtime Library       Called to render the scen  void OnRender                     Load model view matrix   D3DXMATRIX modelViewMatrix   i if          Set the parameters that change every frame     This must be done before binding the programs  cgD3D8SetUniformMatrix  modelViewMatrix   amp modelViewMatrix          Bind the programs  This downloads any parameter values     that have been previously set   cgD3D8BindProgram vertexProgram     cgD3D8BindProgram fragmentProgram          Draw scene      d       Called before the device changes or is destroyed   void OnDestroyDevice         I Calling dais unction ells   ela xpanded interface to     release its intern
170. hat parameters have th xpected siz   assert  CgD3D8TypeToSize  cgGetParameterType    modelViewMatrix      16      oSize cgGetParameterType  someColor         5       n T                assert  cgD3D81     4      YP    I Calles to rencer idas Seca  void OnRender                  Get the Direct3D resource locations for parameters     This can be done earlier and saved  DWORD modelViewMatrixRegister    cgGetParameterResourceIndex  modelViewMatrix               68    808 00504 0000 004  NVIDIA    Using the Cg Runtime Library       DWORD baseTextureUnit    cgGetParameterResourceIndex  baseTexture     DWORD someColorRegister    cgGetParameterResourceIndex  someColor                  See the Direct3D state    device  gt SetVertexShaderConstant  modelViewMatrixRegister   matrix  4     device  gt SetPixelShaderConstant  someColorRegister    TO onisisdnmis oto SNP  device  gt SetTexture  baseTextureUnit  texture    device  gt SetVertexShader  vertexShader     device  gt SetPixelShader  pixelShader                         Draw scene     bik       Called before the device changes or is destroyed  void OnDestroyDevice      device  gt DeleteVertexShader  vertexShader     device   DeletePixelShader  pixelShader                   Called before application shuts down   void OnShutdown         This frees any core runtime resources      The minimal interface has no dynamic storage to free   cgDestroyContext  context               Direct3D Expanded Interface    If you use the expanded interface 
171. he application provides this value with each vertex. Cg provides a flexible mechanism for specifying these per-vertex inputs in the form of a set of predefined names. Each program input must be bound to a name from this set. In the following structure, the vertex program definition binds its parameters to the predefined names POSITION, NORMAL, TANGENT, and TEXCOORD3. The application must provide the vertex array data associated with these predefined names: struct myinputs { float3 myPosition : POSITION; float3 myNormal : NORMAL; float3 myTangent : TANGENT; float refractive_index : TEXCOORD3; }; outdata foo(myinputs indata) { /* ... */ /* Within the program, the parameters are referred to as "indata.myPosition", "indata.myNormal", and so on. */ /* ... */ } We refer to the predefined names as binding semantics. The following set of binding semantics is supported in all Cg vertex program profiles. Some Cg profiles support additional binding semantics: POSITION, BLENDWEIGHT, NORMAL, TANGENT, BINORMAL, PSIZE, BLENDINDICES, TEXCOORD0-TEXCOORD7. The binding semantic POSITION0 is equivalent to the binding semantic POSITION; likewise, the other binding semantics have similar equivalents. In the OpenGL Cg profiles, binding semantics implicitly specify the mapping of varying inputs to particular hardware registers. However, in DirectX-based Cg profiles there is no such 
172. ifier only refers to the outermost array. However, it is possible to declare a packed array of packed arrays by declaring the first level of array in a typedef using the packed keyword and then declaring a packed array of this type in a second statement. It is not possible to have a packed array of unpacked arrays. • For any supported numeric data type TYPE, implementations must support the following packed array types, which are called vector types. Type identifiers must be predefined for these types in the global scope: typedef packed TYPE TYPE1[1]; typedef packed TYPE TYPE2[2]; typedef packed TYPE TYPE3[3]; typedef packed TYPE TYPE4[4]; For example, implementations must predefine the type identifiers float1, float2, float3, float4, and so on for any other supported numeric type. • For any supported numeric data type TYPE, implementations must support the following packed array types, which are called matrix types. Implementations must also predefine type identifiers (in the global scope) to represent these types: packed TYPE1 TYPE1x1[1]; packed TYPE1 TYPE3x1[3]; packed TYPE2 TYPE1x2[1]; packed TYPE2 TYPE3x2[3]; packed TYPE3 TYPE1x3[1]; packed TYPE3 TYPE3x3[3]; packed TYPE4 TYPE1x4[1]; packed TYPE4 TYPE3x4[3]; packed TYPE1 TYPE2x1[2]; packed TYPE1 TYPE4x1[4]; packed TYPE2 TYPE2x2[2]; packed TYPE2 TYPE4x2[4]; packed TYPE3 TYPE2x3[2]; packed TYPE3 TYPE4x3[4]; packed TYPE4 TYPE2x4[2]; packed TYPE4 TYPE4x4[4]; For example, implementatio
173. iform Scalar, Vector, and Matrix Parameters    The function cgD3D9SetUniform() sets floating-point parameters like float3 and float4x3: HRESULT cgD3D9SetUniform(CGparameter parameter, const void* value); The amount of data required depends on the type of parameter, but is always specified as an array of one or more floating-point values. The type is void*, so a user-defined structure that is compatible can be passed in without type casting. Here is some code illustrating the use of cgD3D9SetUniform() for setting a vectorParam of type float3, matrixParam of type float2x3, and arrayParam of type float2x2[3]: D3DXVECTOR3 vectorData(1, 2, 3); float matrixData[2][3] = {{1, 2, 3}, {4, 5, 6}}; float arrayData[3][2][2] = {{{1, 2}, {3, 4}}, {{5, 6}, {7, 8}}, {{9, 10}, {11, 12}}}; cgD3D9SetUniform(vectorParam, &vectorData); cgD3D9SetUniform(matrixParam, matrixData); cgD3D9SetUniform(arrayParam, arrayData); As mentioned previously, cgD3D9TypeToSize() can be used to determine how many values are required for setting a parameter of a particular type. For convenience, there is also a function to set a parameter from a 4x4 matrix of type D3DMATRIX: HRESULT cgD3D9SetUniformMatrix(CGparameter parameter, const D3DMATRIX* matrix); The upper-left portion of the matrix is extracted to fit the size of the input parameter, so that you could set matrixParam this way as well: D3DXMATRIX matrix    i   iL   iL   0 E   IO   0  0  0  0    Qu Of Of         cgD3D9SetUnifor
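The "upper-left portion" rule for cgD3D9SetUniformMatrix() can be illustrated with a standalone C sketch. This is not the runtime's implementation: copy_upper_left is a hypothetical helper, and the sketch assumes a row-major 4x4 source stored as 16 floats and a row-major destination of rows * cols floats.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: copy the upper-left rows x cols block of a
 * row-major 4x4 matrix (16 floats) into dst (rows * cols floats). */
void copy_upper_left(const float *src16, size_t rows, size_t cols, float *dst)
{
    size_t r, c;

    assert(src16 && dst);
    assert(rows >= 1 && rows <= 4 && cols >= 1 && cols <= 4);

    for (r = 0; r < rows; ++r)
        for (c = 0; c < cols; ++c)
            dst[r * cols + c] = src16[r * 4 + c];
}
```

For a float2x3 parameter such as matrixParam, this keeps the first three entries of the first two rows and ignores the rest of the 4x4 matrix.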
174. iform parameters in the arbfp1 profile.    Table 22  arbfp1 Uniform Input Binding Semantics    Binding Semantics Name: Corresponding Data    register(s0)-register(s15), TEXUNIT0-TEXUNIT15: texunit image unit N, where N is in the range 0 to 15; may only be used with uniform inputs with sampler* types. register(c0)-register(c31), C0-C31: local parameter N, where N is in the range 0 to 31; may only be used with uniform inputs.    Binding Semantics for Varying Input/Output Data    Table 23 summarizes the valid binding semantics for varying input parameters in the arbfp1 profile.    Table 23  arbfp1 Varying Input Binding Semantics    Binding Semantics Name: Corresponding Data (type)    COLOR0: input color 0 (float4); COLOR1: input color 1 (float4); TEXCOORD0-TEXCOORD7: input texture coordinates (float4). Table 24 summarizes the valid binding semantics for varying output parameters in the arbfp1 profile.    Table 24  arbfp1 Varying Output Binding Semantics    Binding Semantics Name: Corresponding Data    COLOR, COLOR0: output color (float4); DEPTH: output depth (float).    Options    The ARB fragment program profile allows the following profile-specific options: NumTemps=<n>, where 16 <= n <= 32 (default 32); NumInstructionSlots=<n>, where 72 <= n <= 1024 (default 1024); NoDependentReadLimit=<b>, where b = 0 or 1 (default 1); N
175. iform variables in the Cg source code.    Bindings    Binding Semantics for Uniform Data    Table 19 summarizes the valid binding semantics for uniform parameters in the arbvp1 profile.    Table 19  arbvp1 Uniform Input Binding Semantics    Binding Semantics Name: Corresponding Data    register(c0)-register(c255), C0-C255: local parameter with index n, where n is in the range 0 to 255. The aliases c0-c255 (lowercase) are also accepted. If used with a variable that requires more than one constant register (for example, a matrix), the semantic specifies the first local parameter that is used.    Binding Semantics for Varying Input/Output Data    Table 20 summarizes the valid binding semantics for varying input parameters in the arbvp1 profile. The set of binding semantics for varying input data to arbvp1 consists of POSITION, BLENDWEIGHT, NORMAL, COLOR0, COLOR1, TESSFACTOR, PSIZE, BLENDINDICES, and TEXCOORD0-TEXCOORD7. One can also use TANGENT and BINORMAL instead of TEXCOORD6 and TEXCOORD7. Additionally, a set of binding semantics of ATTR0-ATTR15 can be used. The mapping of these semantics to the corresponding setting command is listed in the table.    Table 20  arbvp1 Varying Input Binding Semantics    Binding Semantics Name: Corresponding Data    POSITION: input vertex, through Vertex command; BLENDWEIGHT: input vertex weight through WeightARB, VertexWei
176. ime    "Can be determined at compile time" is defined as follows: The loop-iteration expressions can be evaluated at compile time by use of intra-procedural constant propagation and folding, where the variables through which constant values are propagated do not appear as lvalues within any kind of control statement (if, for, or while) or ?: construct. Profiles may choose to support more general constant propagation techniques, but such support is not required. • Profiles may optionally support fully general for and while loops.    New Vector Operators    These new operators are defined for vector types: • Vector construction operator <typeID>(...): This operator builds a vector from multiple scalars or shorter vectors: float4(scalar, scalar, scalar, scalar); float4(float3, scalar). • Matrix construction operator <typeID>(...): This operator builds a matrix from multiple rows. Each row may be specified either as multiple scalars or as any combination of scalars and vectors with the appropriate size: float3x3(1, 2, 3, 4, 5, 6, 7, 8, 9); float3x3(float3, float3, float3); float3x3(1, float2, float3, float3, 1, 1, 1). • Swizzle operator (.): a = b.xxyz; /* A swizzle operator example */ • At least one swizzle character must follow the operator. • There are two sets of swizzle characters and they may not be mixed: set one is xyzw = 0123, and set two is rgba = 0123.
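The swizzle rules quoted above (at least one character, two character sets that may not be mixed, xyzw and rgba both mapping to components 0123) can be checked with a small C sketch. apply_swizzle is a hypothetical helper, not a Cg or runtime function, and the upper limit of four output components is an assumption of this sketch rather than something the excerpt states.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Apply a swizzle pattern such as "xxyz" to a 4-component source.
 * Returns the number of components written to out, or -1 if the
 * pattern is empty, too long, contains an unknown character, or
 * mixes the xyzw and rgba sets. out must not alias src, and its
 * contents are unspecified when -1 is returned. */
int apply_swizzle(const float src[4], const char *pattern, float out[4])
{
    static const char *const sets[2] = { "xyzw", "rgba" };
    size_t n, i;
    int used_set = -1;

    assert(src && pattern && out);

    n = strlen(pattern);
    if (n < 1 || n > 4)          /* the 4-component cap is an assumption */
        return -1;

    for (i = 0; i < n; ++i) {
        int set, found = 0;
        for (set = 0; set < 2 && !found; ++set) {
            const char *hit = strchr(sets[set], pattern[i]);
            if (hit != NULL) {
                if (used_set != -1 && used_set != set)
                    return -1;   /* mixed xyzw and rgba */
                used_set = set;
                out[i] = src[hit - sets[set]];
                found = 1;
            }
        }
        if (!found)
            return -1;           /* not a swizzle character */
    }
    return (int)n;
}
```

With a source of (10, 20, 30, 40), the pattern "xxyz" yields (10, 10, 20, 30), matching the a = b.xxyz example, while a pattern such as "xg" is rejected because it mixes the two sets.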
177. ime type category includes types cfloat and cint. These types are used by the compiler for constant type conversions. • The concrete type category includes all types that are not included in the compile-time type category. • The scalar type category includes all types in the numeric category, the bool type, and all types in the compile-time category. In this specification, a reference to a <category> type (such as a reference to a numeric type) means one of the types included in the category (such as float, half, or fixed).    Constants    A constant may be explicitly typed or implicitly typed. Explicit typing of a constant is performed, as in C, by suffixing the constant with a single character indicating the type of the constant: • f for float • d for double • h for half • x for fixed. Any constant that is not explicitly typed is implicitly typed. If the constant includes a decimal point, it is implicitly typed as cfloat. If it does not include a decimal point, it is implicitly typed as cint. By default, constants are base 10. For compatibility with C, integer hexadecimal constants may be specified by prefixing the constant with 0x, and integer octal constants may be specified by prefixing the constant with 0. Compile-time constant folding is preferably performed at the same precision that would be used if the operation were performed at run time. Some c
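The constant-typing rules above can be written out as a short C sketch that classifies the text of a numeric literal. classify_literal and the literal_type enum are hypothetical names for illustration only. The sketch also assumes that a literal with a 0x prefix is always an untyped integer (so trailing hex digits such as f or d are not read as suffixes), and it ignores exponent notation, neither of which the excerpt addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum {
    LIT_FLOAT,   /* suffix f */
    LIT_DOUBLE,  /* suffix d */
    LIT_HALF,    /* suffix h */
    LIT_FIXED,   /* suffix x */
    LIT_CFLOAT,  /* no suffix, has a decimal point */
    LIT_CINT     /* no suffix, no decimal point */
} literal_type;

literal_type classify_literal(const char *text)
{
    size_t n;

    assert(text);
    n = strlen(text);

    /* Assumption: hexadecimal literals carry no type suffix. */
    if (n >= 2 && text[0] == '0' && (text[1] == 'x' || text[1] == 'X'))
        return LIT_CINT;

    if (n > 0) {
        switch (text[n - 1]) {
        case 'f': return LIT_FLOAT;
        case 'd': return LIT_DOUBLE;
        case 'h': return LIT_HALF;
        case 'x': return LIT_FIXED;
        default:  break;
        }
    }

    /* Implicit typing: a decimal point selects cfloat, otherwise cint. */
    return strchr(text, '.') != NULL ? LIT_CFLOAT : LIT_CINT;
}
```

Under these assumptions "1.5f" is a float, "2.0h" a half, "1.0" a cfloat, and "42", "017", and "0xff" are all cint.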
178. imilar functions exist to set the values of arrays of uniform matrix parameters:

    void cgGLSetMatrixParameterArrayfr(CGparameter parameter, long startIndex, long numberOfElements, const float* array);
    void cgGLSetMatrixParameterArrayfc(CGparameter parameter, long startIndex, long numberOfElements, const float* array);
    void cgGLSetMatrixParameterArraydr(CGparameter parameter, long startIndex, long numberOfElements, const double* array);
    void cgGLSetMatrixParameterArraydc(CGparameter parameter, long startIndex, long numberOfElements, const double* array);

and to query those values:

    void cgGLGetMatrixParameterArrayfr(CGparameter parameter, long startIndex, long numberOfElements, float* array);
    void cgGLGetMatrixParameterArrayfc(CGparameter parameter, long startIndex, long numberOfElements, float* array);
    void cgGLGetMatrixParameterArraydr(CGparameter parameter, long startIndex, long numberOfElements, double* array);
    void cgGLGetMatrixParameterArraydc(CGparameter parameter, long startIndex, long numberOfElements, double* array);

The c and r suffixes have the same meaning as they do for the cgGLSetMatrixParameter functions.

Setting Varying Parameters

The values of fragment program varying parameters are set as the result of the interpolation across the triangles performed by the GPU, so only the values of vertex program varying parameters are set by the applicatio
179. implied mapping     Binding semantics may be specified directly on program parameters rather than  on struct elements  Thus  the following vertex program definition is legal     outdata foo float3 myPosition EE C CHEN  float3 myNormal   NORMAL   float3 myTangent   TANGENT           loat  seeubree A SN SANA O ORD SA       E eng E      Within the program  the parameters are referred to by     their variable names   myPosition    myNormal         myTangent   and  refractive index     JC EET          6 808 00504 0000 004  NVIDIA    Introduction to the Cg Language    Varying Outputs to and from Vertex Programs    The outputs of a vertex program pass through the rasterizer and ate made  available to a fragment program as varying inputs  For a vertex program and  fragment program to interoperate  they must agree on the data being passed  between them     As it does with the data flow between the application and vertex program  Cg  uses binding semantics to specify the data flow between the vertex program and  fragment program     This example shows the use of binding semantics for vertex program output        Vertex program  SEAMS IG MAE 3          float4 pout   POSITION     Used for rasterization  float4 diffusecolor   COLORO   float4 uv0   TEXCOORDO   float4 uvl TEXCOORD1   he  Hay TOO aoo YY 4  myvf outstuff   jv Gros Rial    Savia Wyble ee S         And  this example shows how to use this same data as the input to a fragment  program        Fragment program  SEUA A  float4 diffu
180. ing  Here we compute 3 scattering terms      simultaneously and the results end up in the x y z      components of a float3  Using 3 terms approximates      distribution of multiply scattered light  For      details see  Matt Pharr s SIGGRAPH 2001 RenderMan      course notes  Layered Media for Surface Shaders     float3 temp   singleScatter  T2  T  n  g  albedo    thickness      Suo Z5 COM    meoicll E 5 NOEZ      temp x temp y temp z                     Add contributions from oil  sheen  and subsurface      scattering and modulate by light color and result      of a shadow map lookup    return lightColor tex2Dproj  tex3  In shadowcoords   r     oil   sheen   subsurf         808 00504 0000 004 123  NVIDIA    Cg Language Toolkit       Thin Film Effect    Description    This demo shows a thin film interference effect  Specular and diffuse lighting  are computed per vertex in a Cg program  along with a view depth parameter   which is computed using the view vector  surface normal  and the depth of the  thin film on the surface of the object  The view depth is then perturbed in an  ad hoc manner per fragment by the underlying decal texture  and is then used  to lookup into a 1D texture containing the precomputed destructive  interference for red   green   blue wavelengths given a particular view depth   This interference value is then used to modulate the specular lighting  component of the standard lighting equation        Figure 11 Example of Thin Film Effect    Vertex Shader S
181. ion to hold local program parameters (minimum limit of 24) and temporary results (minimum limit of 16).

If the compiler needs more temporaries or local parameters to compile a program than are available, it generates an error.

4. To understand the capabilities of OpenGL ARB fragment programs and the code produced by the compiler, refer to the ARB fragment program extension in the OpenGL Extensions documentation.

Language Constructs and Support

Bindings

Data Types

This profile implements data types as follows:

- float data type is implemented as IEEE 32-bit single precision.
- half, fixed, and double data types are treated as float.
- int data type is supported using floating point operations.
- sampler* types are supported to specify sampler objects used for texture fetches.

Statements and Operators

With the ARB fragment program profiles, while, do, and for statements are allowed only if the loops they define can be unrolled, because there is no dynamic branching in ARB fragment program 1.

Comparison operators (>, <, >=, <=, ==, !=) and Boolean operators (||, &&, ?:) are allowed. However, the logic operators (&, |, ^, ~) are not.

Using Arrays and Structures

Variable indexing of arrays is not allowed. Array and structure data is not packed.

Binding Semantics for Uniform Data

Table 22 summarizes the valid binding semantics for un
182. ions in the Direct3D Cg runtime library have a cgD3D prefix.

There are actually two Direct3D Cg runtime libraries: one for Direct3D 8 and one for Direct3D 9. Functions belonging to the Direct3D 8 Cg runtime have a cgD3D8 prefix, and functions belonging to the Direct3D 9 Cg runtime have a cgD3D9 prefix. Because most of the functions are identical between the two runtimes, we describe the Direct3D 9 Cg runtime with the understanding that the description applies to the Direct3D 8 Cg runtime as well, unless otherwise indicated.

The same prefix convention used for the function names is also used for the type names, macro names, and enumerant values.

Header Files

Here is how to include the core Cg runtime API into your C or C++ program:

    #include <Cg/cg.h>

Here is how to include the OpenGL Cg runtime API:

    #include <Cg/cgGL.h>

Here is how to include the Direct3D 9 Cg runtime API:

    #include <Cg/cgD3D9.h>

And here is how to include the Direct3D 8 Cg runtime API:

    #include <Cg/cgD3D8.h>

Creating a Context

A context is a container for multiple Cg programs. It holds the Cg programs, as well as their shared data.

Here's how to create a context:

    CGcontext context = cgCreateContext();

Compiling a Program

Compile a Cg program by adding it to a context with cgCreateProgram():

    CGprogram program = cgCreateProgram(context, CG_SOURCE, myVertexProgramString 
183. ions on size and dimensionality. Restrictions on the use of computed subscripts are also permitted. Arrays may be designated as packed. The operations allowed on packed arrays may be different from those allowed on unpacked arrays. Predefined packed types are provided for vectors and matrices. It is strongly recommended these predefined types be used.

- There is a built-in swizzle operator (.xyzw or .rgba) for vectors. This operator allows the components of a vector to be rearranged and also replicated. It also allows the creation of a vector from a scalar.
- For an lvalue, the swizzle operator allows components of a vector or matrix to be selectively written.
- There is a similar built-in swizzle operator for matrices:

    ._m<row><col>[_m<row><col>][...]

  This operator allows access to individual matrix components and allows the creation of a vector from elements of a matrix. For compatibility with DirectX 8 notation, there is a second form of matrix swizzle, which is described later.

- Numeric data types are different. Cg's primary numeric data types are float, half, and fixed. Fragment profiles are required to support all three data types, but may choose to implement half and fixed at float precision. Vertex profiles are required to support half and float, but may choose to implement half at float precision. Vertex profiles may omit supp
184. l constructs.

- Arrays are first-class types because Cg does not support pointers.
- Functions pass values by value-result, and thus use an out or inout modifier in the formal parameter list to return a parameter. By default, formal parameters are in, but it is acceptable to specify this explicitly. Parameters can also be specified as in out, which is semantically the same as inout.

Differences from ANSI C

Cg was developed based on the ANSI C language with the following major additions, deletions, and changes. (This is a summary; more detail is provided later in this document.)

- Language profiles (described in "Profiles" on page 168) may subset language capabilities in a variety of ways. In particular, language profiles may restrict the use of for and while loops. For example, some profiles may only support loops that can be fully unrolled at compile time.
- A binding semantic may be associated with a structure tag, a variable, or a structure element to denote that object's mapping to a specific hardware or API resource. See "Binding Semantics" on page 183.
- Reserved keywords goto, break, and continue are not supported.
- Reserved keywords switch, case, and default are not supported. Labels are not supported either.
- Pointers and pointer-related capabilities, such as the & and -> operators, are not supported.
- Arrays are supported, but with some limitat
185. lar namespace:

- typedef names (including an automatic typedef from a struct declaration)
- Variables
- Function names

Arrays and Subscripting

Arrays are declared as in C, except that they may optionally be declared to be packed, as described under "Types" on page 171. Arrays in Cg are first-class types, so array parameters to functions and programs must be declared using array syntax, rather than pointer syntax. Likewise, assignment of an array-typed object implies an array copy rather than a pointer copy.

Arrays with size [1] may be declared but are considered a different type from the corresponding non-array type.

Because the language does not currently support pointers, the storage order of arrays is only visible when an application passes parameters to a vertex or fragment program. Therefore, the compiler is currently free to allocate temporary variables as it sees fit.

The declaration and use of arrays of arrays is in the same style as in C. That is, if the 2D array A is declared as

    float A[4][4];

then the following statements are true:

- The array is indexed as A[row][column].
- The array can be built with a constructor using

    A = { { A[0][0], A[0][1], A[0][2], A[0][3] },
          { A[1][0], A[1][1], A[1][2], A[1][3] },
          { A[2][0], A[2][1], A[2][2], A[2][3] },
          { A[3][0], A[3][1], A[3][2], A[3][3] } };

- A[0] is equivalent to { A[0][0], A[0][1], A[0][2], A[0][3] }.

Support must be provid
186. le

    // Grab the optimal options for each profile
    const char* vertexOptions[] = { cgD3D8GetOptimalOptions(vertexProfile), 0 };
    const char* pixelOptions[] = { cgD3D8GetOptimalOptions(pixelProfile), 0 };

    // Create the vertex shader
    vertexProgram = cgCreateProgramFromFile(
        context, CG_SOURCE, "VertexProgram.cg",
        vertexProfile, "VertexProgram", vertexOptions);

    // If your program uses explicit binding semantics (like
    // this one), you can create a vertex declaration
    // using those semantics
    DWORD declaration[] = {
        D3DVSD_STREAM(0),
        D3DVSD_REG(D3DVSDE_POSITION, D3DVSDT_FLOAT3),
        D3DVSD_REG(D3DVSDE_DIFFUSE, D3DVSDT_D3DCOLOR),
        D3DVSD_REG(D3DVSDE_TEXCOORD0, D3DVSDT_FLOAT2),
        D3DVSD_END()
    };

    // Ensure the resulting declaration is compatible with the
    // shader. This is really just a sanity check.
    assert(cgD3D8ValidateVertexDeclaration(vertexProgram, declaration));

    // Load the program with the expanded interface.
    // Parameter shadowing is enabled (second parameter = TRUE).
    cgD3D8LoadProgram(vertexProgram, TRUE, 0, 0, declaration);

    // Create the pixel shader
    fragmentProgram = cgCreateProgramFromFile(
        context, CG_SOURCE, "FragmentProgram.cg",
        pixelProfile, "FragmentProgram", pixelOptions);

    // Load the program with the expanded interface.
    // Parameter shadowing is enabled (second parameter = TRUE).
    // Ignore vertex s
187. lf4 specResult   lighting z   specStr   specCol     half4 specC          half3 reflVect   reflect Vn  Nb         half4 refl1C  half fakeFr                olor   texCUBE  EnvMap           esnel   ReflData FRESN    i              material METALNESS         reflVect                     ReflData FRESN    E            MIN      MAX         112    NVIDIA    808 00504 0000 004    Advanced Profile Sample Shaders    pow  saturate  1 0h dot   Vn  IN N     ReflData FRESNEL EXPON    half4 paintShine   fakeFresnel   reflColor   half4 metalShine   surfCol   reflColor   half4 shineCol   ReflData REFL STRENGTH    lerp paintShine  metalShine   material METALNESS                                             half4 finalColor   specResult   diffResult   shineCol   finalColor w   1 0h     return finalColor        808 00504 0000 004 113  NVIDIA    Cg Language Toolkit       Ray Traced Refraction    Description    This shader presents a method for adding high quality details to small objects  using a single bounce  ray traced pass  In this example  the polygonal surface is  sampled and a refraction vector is calculated  This vector is then intersected  with a plane that is defined as being perpendicular to the object s x axis  The  intersection point is calculated and used as texture indices for a painted iris     The demo permits varying the index of refraction  the depth and density of the  lens  Note that the choice of geometry is arbitrary   this sample is a sphere  but  any polygonal model can be
188. loat4 hpos SEIS SIMON            it is equivalent to                                                                                                                                     const D3DVERTEXELEMENT9 declaration         CO  0 Siizeoit  elkoaie   7   D3DDECLTYPE FLOAT4  D3DDECL ETHOD DEFAULT   D3DDECLUSAGE POSITION  0       OL Sez COs  Hoare  ir   D3DDECLTYPE FLOAT4  D3DDECL ETHOD DEFAULT   D3DDECLUSAGE COLOR  ORE    CO  te   MESS cod tod   D3DDECLTYPE FLOAT4  D3DDECL ETHOD DEFAULT   D3DDECLUSAGE TEXCOORD  0      D3DD3CL END                     y    for the Direct3D 9 Cg runtime  and it is equivalent to                            const DWORD declaration        D3DVSD STREAM 0    D3DVSD REG D3DVSDE POSITION  D3DVSDT FLOAT4    D3DVSD REG D3DVSDE DIFFUSE  D3DVSDT_FLOAT4    D3DVSD REG D3DVSDE TEXCOORDO  D3DVSDT FLOATA4    D3DVSD END         e  for the Direct3D 8 Cg runtime        808 00504 0000 004 59  NVIDIA    Cg Language Toolkit    Usually though  you want to apply a vertex program to geometric data that  come in multiple streams or with specific vertex formats  In this case  the vertex  declaration is based on the vertex formats rather than the program  To see if it is  compatible with the program  use cgD3D9ValidateVertexDeclaration        CGbool cgD3D9ValidateVertexDeclaration  CGprogram program   const D3DVERTEXELEMENT9  declaration      for the Direct3D 9 Cg runtime or cgD3D8ValidateVertexDeclaration       CGbool cgD3D8ValidateVertexDeclaration  CGprogram program 
189. lookups require the associated texture unit to be configured by the application for depth-compare texturing; otherwise, no depth comparison is actually performed.

More Details

The purpose of this chapter has been to give you a brief overview of Cg, so that you can get started quickly and experiment to gain hands-on experience. If you would like some more detail about any of the language features described in this chapter, see "Cg Language Specification" on page 165.

Cg Standard Library Functions

Cg provides a set of built-in functions and predefined structures with binding semantics to simplify GPU programming. These functions are similar in spirit to the C standard library, providing a convenient set of common functions. In many cases, the functions map to a single native GPU instruction, meaning they are executed very quickly. Of those functions that map to multiple native GPU instructions, you may expect the most useful to become more efficient in the near future.

Although customized versions of specific functions can be written for performance or precision reasons, it is generally wiser to use the standard library functions when possible. The standard library functions will continue to be optimized for future GPUs, meaning that a shader written today will automatically be optimized for the latest architectures at compile time. Additionally, the standard library provides a convenient unified interfa
190. low between different programmable units. On a GPU, for example, packets of vertex data flow from the application to the vertex program.

Because packets are produced by one program (the application, in this case) and consumed by another (the vertex program), there must be some method for defining the interface between the two. The approach used in Cg is to associate a binding semantic with each element of the packet. This is a bind-by-name approach. For example, an output with the binding semantic FOO is fed to an input with the binding semantic FOO. Profiles may allow the user to define arbitrary identifiers in this "semantic namespace", or they may restrict the allowed identifiers to a predefined set. Often, these predefined names correspond to the names of hardware registers or API resources.

In some cases, predefined names may control non-programmable parts of the hardware. For example, vertex programs normally compute a position that is fed to the rasterizer, and this position is stored in an output with the binding semantic POSITION.

For any profile, there are two namespaces for predefined binding semantics: the namespace used for in variables and the namespace used for out variables. The primary implication of having two namespaces is that the binding semantic cannot be used to implicitly specify whether a variable is in or out.

Binding Semantics

A binding semantic may be associated with an input to a top-level function in one of
191. m Modifier

Non-static global variables and parameters passed to functions (such as main()) can be declared with an optional qualifier uniform. To specify a uniform variable, use this syntax:

    uniform <type> <variable>

For example:

    uniform float4 myVector;

or

    fragout foo(uniform float4 uv);

If the uniform qualifier is specified for a function that is not top level, it is meaningless and is ignored. The intent of this rule is to allow a function to serve either as a top-level function or as one that is not.

Note that uniform variables may be read and written just like non-uniform variables. The uniform qualifier simply provides information about how the initial value of the variable is to be specified and stored, through a mechanism external to the language.

Typically, the initial value of a uniform variable or parameter is stored in a different class of hardware register. Furthermore, the external mechanism for specifying the initial value of uniform variables or parameters may be different from that used for specifying the initial value of non-uniform variables or parameters. Parameters qualified as uniform are normally treated as persistent state, while non-uniform parameters are treated as streaming data, with a new value specified for each stream record (such as within a vertex array).

Function Declarations

Functions are declared essentially as in C. A function that does not return a value must be declared with a void return ty
192. m is returned.

- CG_PROGRAM_ENTRY: The main entry point of the Cg source program is returned.
- CG_PROGRAM_PROFILE: The profile string is returned.
- CG_COMPILED_PROGRAM: The resulting compiled program is returned.

Core Cg Parameter

Cg functions exist for retrieving and querying parameters.

Parameter Retrieval

Parameter retrieval can be either iterative or direct.

Iteration

A program has a sequence of parameters that can be iterated over by using cgGetFirstParameter() and cgGetNextParameter():

    CGparameter cgGetFirstParameter(CGprogram program, CGenum namespace);
    CGparameter cgGetNextParameter(CGparameter parameter);

A call to cgGetFirstParameter() returns the first parameter of the sequence. If the program is invalid or does not contain any parameter, the call returns zero. Given a parameter, cgGetNextParameter() returns the parameter immediately next in the sequence, or zero if there is none. The namespace argument of cgGetFirstParameter() specifies the name space of the parameters returned by this function and subsequent calls to cgGetNextParameter(). Every parameter belongs to a particular name space that defines its scope. For now, the scope of any parameter is limited to the program it belongs to, so that the only possible value for namespace is CG_PROGRAM.

Note: In the future, other name spaces, such as the context, may be defined, in which  case 
193. mMatrix(matrixParam, &matrix);

In the example above, every element of matrixParam is set to 1.

Setting Uniform Arrays of Scalar, Vector, and Matrix Parameters

To set an array parameter, use cgD3D9SetUniformArray():

    HRESULT cgD3D9SetUniformArray(CGparameter parameter, DWORD startIndex, DWORD numberOfElements, const void* array);

The parameters startIndex and numberOfElements specify which elements of the array parameter are set: the numberOfElements elements with indices ranging from startIndex to startIndex + numberOfElements - 1. It is assumed that array contains enough values to set all those elements. As with cgD3D9SetUniform(), cgD3D9TypeToSize() can be used to determine how many values are required, and the type is void*, so a compatible user-defined structure can be passed in without type casting.

There is a convenience function equivalent to cgD3D9SetUniformMatrix():

    HRESULT cgD3D9SetUniformMatrixArray(CGparameter parameter, DWORD startIndex, DWORD numberOfElements, const D3DMATRIX* matrices);

The parameters startIndex and numberOfElements have the same meanings as for cgD3D9SetUniformArray().

The upper-left portion of each matrix of the array matrices is extracted to fit the size of the element of the array parameter parameter. The array matrices is assumed to have numberOfElements elements.

Setting Sampler Parameters

You assign a Direc
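The startIndex / numberOfElements convention described above for cgD3D9SetUniformArray() can be checked on the application side before the call is made. The C sketch below is only an illustration: `ArrayUpdate` and `plan_array_update` are hypothetical names, not part of the Cg runtime, and `valuesPerElement` is assumed to be the per-element value count that the application obtained separately (for example from cgD3D9TypeToSize()).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one array update (not a Cg type). */
typedef struct {
    size_t first;   /* index of the first element that is set          */
    size_t last;    /* index of the last element that is set, inclusive */
    size_t values;  /* number of values the source buffer must contain  */
} ArrayUpdate;

/* Compute the element range and the required source size for an update
 * of numberOfElements elements starting at startIndex.
 * arrayLength is the number of elements in the array parameter.
 * Returns 1 on success, or 0 if the request is empty or runs past the
 * end of the array parameter. */
static int plan_array_update(size_t startIndex, size_t numberOfElements,
                             size_t arrayLength, size_t valuesPerElement,
                             ArrayUpdate *out)
{
    assert(out != NULL);
    if (numberOfElements == 0) return 0;
    if (startIndex >= arrayLength) return 0;
    if (numberOfElements > arrayLength - startIndex) return 0;

    out->first = startIndex;
    out->last = startIndex + numberOfElements - 1;
    out->values = numberOfElements * valuesPerElement;
    return 1;
}
```

For an eight-element array of 16-value matrices, updating three elements from index 2 touches indices 2 through 4 and needs 48 values in the source buffer.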
194. malOptions(). It returns a string representing the optimal set of compiler options for a given profile:

    char const* cgD3D9GetOptimalOptions(CGprofile profile);

This string is meant to be used as part of the argument parameter to cgCreateProgram(). It does not need to be destroyed by the application. However, its content could change if cgD3D9GetOptimalOptions() is called again for the same profile but for a different Direct3D device.

Expanded Interface Program Examples

In this section we provide programs that illustrate how and when to use functions from the expanded interface to make Cg programs work with Direct3D. For the sake of clarity, the examples do very little error checking, but a production application should check the return values of all Cg functions. The vertex and fragment programs that follow are referenced in "Expanded Interface Direct3D 9 Application" on page 78 and "Expanded Interface Direct3D 8 Application" on page 81.

Expanded Interface Vertex Program

The following Cg code is assumed to be in a file called VertexProgram.cg:

    void VertexProgram(
        in float4 position : POSITION,
        in float4 color : COLOR0,
        in float4 texCoord : TEXCOORD0,
        out float4 positionO : POSITION,
        out float4 colorO : COLOR0,
        out float4 texCoordO : TEXCOORD0,
        const uniform float4x4 ModelViewMatrix)
    {
        positionO = mul(position, ModelViewMatrix);
        colorO = color;
        texCoordO = texCoord;
    }

Expanded Interface Fragment Program

The following Cg code is
195. matrix, new, packed, pixelshader, public, return, sampler_state, sampler3D, short, static, struct, template, texture2D, textureRECT, true, typeid, union, vector, virtual, while, asm_fragment, break, char, compile, continue, delete, double, else, explicit, fixed, friend, half, inline, interface, mutable, operator, pass, private, register, row_major, sampler1D, samplerCUBE, signed, static_cast, switch, texture, texture3D, this, try, typename, unsigned, vertexfragment, void, auto, case, class, const, decl, discard, dword, emit, extern, float, get, if, inout, long, namespace, out, pixelfragment, protected, reinterpret_cast, sampler, sampler2D, shared, sizeof, string, technique, texture1D, textureCUBE, throw, typedef, uniform, using, vertexshader, volatile, __identifier (two underscores before identifier)

Cg Standard Library Functions

Cg provides a set of built-in functions and predefined structures with binding semantics to simplify GPU programming. These functions are discussed in "Cg Standard Library Functions" on page 19.

Vertex Program Profiles

A few features of the Cg language that are specific to vertex program profiles are required to be implemented in the same manner for all vertex program profiles.

Mandatory Computation of Position Output

Vertex program profiles may (and typically do) require that the
196. me provides all the functions necessary to manage Cg programs from within the application. It makes no assumption about which 3D API the application uses, so that any application could easily ignore the API-specific Cg runtime libraries and content itself with the core Cg runtime.

The core Cg runtime is built around three main concepts: context, program, and parameter, which are represented by the CGcontext, CGprogram, and CGparameter object types. Those concepts are hierarchically related to each other: a program has several parameters, a context contains several programs, and the application can define several contexts.

Note: In the future, it will also be possible to define parameters at the level of the context so that they are shared among all the programs of a context.

The next sections go over those three basic object types and the related functions. The three object types have some points in common:

- The use of CGbool, which is an integer type equal to either CG_TRUE or CG_FALSE
- The use of CGenum, which is an enumerated type used to specify various enumerated values that are not necessarily related
- The convention that functions that return a value of type CGcontext, CGprogram, CGparameter, or const char* indicate failure by returning zero

Core Cg Context

Cg provides functions for creating, destroying, and querying contexts.

Context Creation
197. meter  vertexProgram    ModelViewMatrix     baseTexture   cgGetNamedParameter  fragmentProgram    BaseTexture     someColor   cgGetNamedParameter  fragmentProgram    SomeColor                   Sanity check that parameters have th xpected siz  assert  cgD3D9TypeToSize cgGetParameterType   modelViewMatrix      16               808 00504 0000 004 65  NVIDIA    Cg Language Toolkit       assert  CgD3D9TypeToSize  cgGet ParameterType  someColor       07       Called to render the scen  void OnRender          Get the Direct3D resource locations for parameters     This can be done earlier and saved  DWORD modelViewMatrixRegister    cgGetParameterResourceIndex  modelViewMatrix    DWORD baseTextureUnit    cgGetParameterResourcelndex  baseTexture     DWORD someColorRegister    cgGetParameterResourceIndex someColor                              Set the Direct3D state   device  gt SetVertexShaderConstantF  modelViewMatrixRegister   matrix  4    device  gt SetPixelShaderConstantF  someColorRegister   ACOSTA  E   vice  gt SetVertexDeclaration vertexDeclaration     vice  gt SetTexture  baseTextureUnit  texture    vice  gt SetVertexShader  vertexShader     vice  gt SetPixelShader  pixelShader                            aaqaaQaa       Draw scene              Called before the device changes or is destroyed  void OnDestroyDevice      vertexShader  gt Release      pixelShader  gt Release     vertexDeclaration   Release                   Called before application shuts down   void OnShutdown     
198. n.

Setting a vertex varying parameter requires two steps.

The first step consists of passing a pointer to an array containing the values for each vertex. This is done using cgGLSetParameterPointer():

    void cgGLSetParameterPointer(CGparameter parameter, GLint size, GLenum type, GLsizei stride, GLvoid* array);

The variable size indicates the number of values per vertex that are stored in array. It is equal to 1, 2, 3, or 4. If fewer values are set than the parameter requires, the non-specified values default to 0 for x, y, and z, and 1 for w.

The enumerated type type specifies the data type of the values stored in array: GL_SHORT, GL_INT, GL_FLOAT, or GL_DOUBLE.

The parameter stride is the byte offset between any two consecutive vertices. Passing a value of zero for stride is equivalent to passing a byte offset equal to size multiplied by the size of type in bytes; in other words, it means that there is no gap between two consecutive vertex values. Note that the minimum size for array is implicitly defined by the biggest vertex index specified in the triangles drawn.

The second step consists of enabling the varying parameter for a specific drawing call:

    void cgGLEnableClientState(CGparameter parameter);

The equivalent disabling function is

    void cgGLDisableClientState(CGparameter parameter);

Another way to set vertex varying parameter is to use the cgGLSetParameter functions. When
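The stride rule described above for cgGLSetParameterPointer() (a stride of zero means tightly packed values) can be written out as a few lines of C arithmetic. This is a sketch under stated assumptions, not runtime code: the function names are hypothetical, `typeSize` stands for the size in bytes of the chosen GL data type, and the minimum-buffer formula is one reading of the remark that the array size is defined by the biggest vertex index drawn.

```c
#include <assert.h>
#include <stddef.h>

/* Byte distance between consecutive vertices. A stride of zero means
 * "tightly packed", that is, size values of typeSize bytes each. */
static size_t effective_stride(int size, size_t typeSize, size_t stride)
{
    assert(size >= 1 && size <= 4);
    assert(typeSize > 0);
    return stride != 0 ? stride : (size_t)size * typeSize;
}

/* Byte offset of the values for one vertex inside the array. */
static size_t vertex_offset(size_t vertexIndex, int size, size_t typeSize,
                            size_t stride)
{
    return vertexIndex * effective_stride(size, typeSize, stride);
}

/* Assumed minimum buffer size in bytes when maxVertexIndex is the
 * biggest vertex index referenced by the triangles drawn: the offset
 * of that vertex plus the bytes of its own values. */
static size_t min_array_bytes(size_t maxVertexIndex, int size,
                              size_t typeSize, size_t stride)
{
    return vertex_offset(maxVertexIndex, size, typeSize, stride)
           + (size_t)size * typeSize;
}
```

For three 4-byte values per vertex, a stride of zero behaves like a stride of 12 bytes, so vertex 2 starts at byte 24; with an explicit stride of 32 bytes the same vertex starts at byte 64.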
199. n OUT        148 808 00504 0000 004  NVIDIA    Basic Profile Sample Shaders       Refraction    Description    This effect performs custom texture coordinate generation to compute a  refracted vector per vertex that is then used to look up in a cube map  Fresnel is  also calculated to blend between reflection and refraction  Figure 18         Figure 18 Example of Refraction       808 00504 0000 004 149  NVIDIA    Cg Language Toolkit    Vertex Shader Source Code for Refraction    GWESPUIOIE ENPUES         F     floats Position SEP SSIBINIKONIS  float4 Normal   NORMAL     struct Outputs                     float4 hPosition we POSITION  float4 fresnelTerm   COLORO   float4 refractVec TELE XCOORD OF   float4 reflectVec ECO ORD       fresnel approximation  eo ias os cin  lees T  ilo   Ny    float3 fresnelValues        fixed power   fresnelValues x   fixed scale   fresnelValues y   fixed bias   fresnelValues z           cecuri Jotas sr jor  0    elo  r  1   ower    seco     outputs main inputs IN     uniform float4x4 ModelViewProj   uniform float4x4 ModelView   uniform float4x4 ModelViewIT   uniform float theta     outputs OUT   OUT hPosition   mul ModelViewProj  IN Position         convert the position and normal into      appropriate spaces   float3 eyeToVert   mul ModelView  IN Position  xyz   eyeToVert   normalize  eyeToVert     float3 normal   mul ModelViewIT  IN Normal  xyz   normal   normalize  normal         150    808 00504 0000 004  NVIDIA    Basic Profile Sample Shaders 
200. n glstate light 0  half                808 00504 0000 004    NVIDIA    205          Cg Language Toolkit                                                 Table 17 float4 glstate Fields  continued   glstate lightmodel ambient glstate lightmodel scenecolor  glstate lightmodel  front scenecolor  glstate lightmodel back scenecolor  glstate lightprod 0   ambient glstate lightprod 0  diffuse  glstate lightprod 0  specular glstate lightprod 0  front ambient  glstate lightprod 0  front diffuse  glstate lightprod 0  front specular  gistate lightprod 0  back ambient glstate lightprod 0  back diffuse  glstate lightprod 0  back specular  glstate texgen 0  eye s  glstate texgen 0   eye t glstate texgen 0   eye r  glstate texgen 0  eye q glstate texgen 0   object s  glstate texgen 0   object t glstate texgen 0   object r  glstate texgen 0   object q glstate fog color  glstate fog params glstate clip 0  plane   Table 18 lists the glstate fields of type float that can be accessed   Table 18 float glstate Fields          glstate point size       glstate     point attenuation       Position Invariance    Q The arbvp1 profile supports position invariance  as described in the core  language specification     Q The modelview projection matrix is not specified using a binding semantic  Of GL MVP        206    NVIDIA    808 00504 0000 004          Appendix B Language Profiles    Data Types    This profile implements data types as follows     Q float data type is implemented as defined in the ARB vert
201. n this Implementation  224 220040 m RR Rohan nx man 203  OpenGL ARB Vertex Program Profile  arbvp1           leen 204  OVGIVIGW  eeu ac dur xii ROI ORC A 204  Accessing OpenGL State    espaces qoe ree RP o a OR Pone PUES RON RE n 204  Posion Invatlamncez sx oe ees A RE RUE E CANS wor ace ee 206  Data TYPES   45i tek ahh a Ra RR ACER RYE RO ROCA RA AR a RC 207  Compatibility with the vp20 Vertex Program Profile                 o oooooo o   207  Loading Constants sereni gei wed cepa a te Re ete e n e xU e Od 208  BINGINGS 22 42  T 208  OpenGL ARB Fragment Program Profile  aztb  p1             llle 211  MEMO cow a ce ar sh che te bay dat dada Aa ita Grinch ek 211  Language Constructs and SUPPORT s eee ee se ara e eRe a aee m ee 212  BINDINGS s 3  sade RR Coe Re EROR TATE dub RUE ERR a Maia Ra 212  OPUS   snc eee enemies ned See eke hee oe Ge Pees 213  Limitations in the Implementation        looo n eee 213  OpenGL NV vertex program 2 0 Profile  vp30   s  ers lle lee n n 214  Position InVvaridiCe   oia b rper om de qe ipee Re ex RR Roe RU dd wore ded 214  Language CONStTUELS   sco ue ark rt RR ee DA Rao RR E RC ER RT eee EY 214  BINGINGS C e eM PE tar aa Reade DA ae ear koe da on 215  OpenGL NV_fragment_program Profile    p30             ooocooooorrrooroo   218  Language Constructs and Support    554 22  rrr hh n Rh e ee 218  A          r   rrm 219  Pack and Unpack FUNCHONS   omic mde pae ao me pcm RE er m cni ete woe 220  DirectX  Vertex shader 1 1 Profile  vs  11  iasi asa ie c
202. nagement

The Cg runtime also offers additional facilities to manage the input parameters of the Cg program. In particular, it makes data types such as arrays and matrices easier to deal with. These additional functions also encompass the necessary 3D API calls to minimize code length and reduce programmer errors.

Overview of the Cg Runtime

The Cg runtime API consists of three parts (Figure 2):

- A core set of functions and structures that encapsulates the entire functionality of the runtime
- A set of functions specific to OpenGL built on top of the core set
- A set of functions specific to Direct3D built on top of the core set

To make it easier for application writers, the OpenGL and Direct3D runtime libraries adopt the philosophy and data structure style of their respective API.

Figure 2 The Parts of the Cg Runtime API

The rest of the section provides instructions for using the Cg runtime in the framework of an application. Each step includes source code for OpenGL and Direct3D programming.

Functions that involve only pure Cg resource management belong to the core runtime and have a cg prefix. In these cases, the same code is used for OpenGL and Direct3D.

When functions from the OpenGL or Direct3D Cg runtimes are used, notice that the API name is indicated by the function name. Functions belonging to the OpenGL Cg runtime library have a cgGL prefix, and funct
203. name vs matches any vertex profile, while the name ps matches any fragment or pixel profile.

The names ps_1 and ps_2 match any DirectX 8 pixel shader 1.x profile or DirectX 9 pixel shader 2.x profile, respectively. Similarly, the names vs_1 and vs_2 match any DirectX vertex shader 1.x or 2.x, respectively. Additional valid wildcard profile names may be defined by individual profiles.

In general, the most specific version of a function is used. More details are provided in "Function Overloading" on page 181, but roughly speaking, the search order is the following:

1. Version of the function with the exact profile overload
2. Version of the function with the most specific wildcard profile overload (such as vs or ps_1)
3. Version of the function with no profile overload

This search process allows generic versions of a function to be defined that can be overridden as needed for particular hardware.

Syntax for Parameters in Function Definitions

Functions are declared in a manner similar to C, but the parameters in function definitions may include a binding semantic (see "Binding Semantics" on page 183) and a default value.

Each parameter in a function definition takes the following form:

    [uniform] <type> identifier [: <binding_semantic>] [= <default>]

where

- <type> may include the qualifiers in, out, inout, and const, as discussed in "Typ
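The three-step search order for profile overloads can be sketched as a small selection routine. This is an illustration in plain C, not the compiler's actual algorithm: the function pick_overload is hypothetical, and it assumes a wildcard name matches whenever it is a prefix of the target profile name (so ps_1 matches ps_1_1), with the longest prefix counting as the most specific. That simplification only covers DirectX-style names; it would not recognize that ps also matches a fragment profile with an unrelated name.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical sketch: choose among overloads of one function.
   overloads[i] is the profile name attached to overload i, or NULL for an
   overload with no profile. Returns the chosen index, or -1 if none fits. */
int pick_overload(const char *target, const char *const overloads[], int count)
{
    int best = -1;        /* best wildcard (prefix) match so far */
    size_t best_len = 0;  /* length of that wildcard name */
    int generic = -1;     /* first overload with no profile */

    assert(target != NULL);
    for (int i = 0; i < count; ++i) {
        const char *name = overloads[i];
        if (name == NULL) {
            if (generic < 0)
                generic = i;
            continue;
        }
        if (strcmp(name, target) == 0)
            return i;                          /* 1. exact profile */
        size_t len = strlen(name);
        if (strncmp(name, target, len) == 0 && len > best_len) {
            best = i;                          /* 2. more specific wildcard */
            best_len = len;
        }
    }
    return best >= 0 ? best : generic;         /* 3. no profile overload */
}
```

With overloads tagged (none), ps, ps_1, and ps_1_1, a target of ps_1_3 selects the ps_1 version, while a target of vs_1_1 falls back to the version with no profile.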
204. nates associated with sampler tex  and  prevlookup is the result of a previous texture operation   This function can be used to generate the texdp3tex instruction in the  ps 1 2andps 1 3 profiles     tex2D dp3x2 uniform sampler2D tex  float3 str   float4 intermediate coord  float4 prevlookup           Performs the following  float2 newst   float2  dot  intermediate coord xyz  prevlookup xyz    dot str  prevlookup xyz     return tex2D tex  newst    where  str are texture coordinates associated with sampler tex   prevlookup is the result of a previous texture operation  and  intermediate coord are texture coordinates associated with the previous  texture unit   This function can be used to generate the texm3x2pad texm3x2tex  instruction combination in all ps 1 x profiles                 808 00504 0000 004 235  NVIDIA    Cg Language Toolkit    Table 40 ps 1 x Auxiliary Texture Functions  continued        Texture Function       Description       tex3D dp3x3 sampler3D tex  float3 str   float4 intermediate coordl   float4 intermediate coord2  float4 prevlookup   texCUBE dp3x3 samplerCUBE tex  float3 str   float4 intermediate coordl   float4 intermediate coord2  float4 prevlookup        Performs the following  float3 newst     float3  dot  intermediate coordl xyz  prevlookup xyz    dot intermediate coord2 xyz  prevlookup xyz    dot str  prevlookup xyz     return tex3D CUBE  tex  newst    where  str are texture coordinates associated with sampler tex   prevlookup is the result of a p
205. nces a valid program:

    CGbool cgIsProgram(CGprogram program);

Compilation Result

You can query the result of the compilation resulting from the last call to cgCreateProgram() for a given context by using cgGetLastListing():

    const char* cgGetLastListing(CGcontext context);

If no call to cgCreateProgram() has been made for the context, cgGetLastListing() returns zero. Otherwise, it returns a string containing the output you would typically get from the command-line version of the compiler.

Program Attributes

To retrieve the context the program belongs to, use cgGetProgramContext():

    CGcontext cgGetProgramContext(CGprogram program);

Retrieving the profile the program has been compiled to is done with cgGetProgramProfile():

    CGprofile cgGetProgramProfile(CGprogram program);

The function pair cgGetProfile() and cgGetProfileString() allows you to find the correspondence between a profile enumerant and its corresponding string:

    CGprofile cgGetProfile(const char* profileString);
    const char* cgGetProfileString(CGprofile profile);

If the string passed to cgGetProfile() does not correspond to any profile, CG_PROFILE_UNKNOWN is returned.

The function cgGetProgramString() retrieves various strings related to the program depending on the value of the enumerant stringType:

    const char* cgGetProgramString(CGprogram program, CGenum stringType);

The variable stringType can have any of these values:

- CG_PROGRAM_SOURCE: The original Cg source progra
206. nctions  continued        Texture Function       Description       texCUBE reflect eye dp3x3 uniform samplerCUBE tex   float3 str  float4 intermediate coordl   float4 intermediate coord2   float4 prevlookup  uniform float3 eye        Performs the following  float3 N   float3 dot intermediate coordl xyz  prevlookup xyz    dot intermediate coord2 xyz  prevlookup xyz    dot coords xyz  prevlookup xyz     return texCUBE  tex  2   dot N  E    dot N  N    N   E    where  strq are texture coordinates associated with sampler tex   prevlookup is the result of a previous texture operation   intermediate coordl are texture coordinates associated with the n 2  texture unit   intermediate coord  are texture coordinates associated with the n 1  texture unit  and  eye is the eye ray vector   This function can be used to generate the texm3x3pad texm3x3pad   texm3x3spec instruction combination in all ps 1 x profiles        tex dp3x2 depth float3 str  float4 intermediate coord   float4 prevlookup        Performs the following  float z   dot  intermediate coord xyz  prevlookup xyz    float w   dot str  prevlookup xyz     return z   w   where  str are texture coordinates associated with the nth texture unit   intermediate coord are texture coordinates associated with the n 1  texture unit  and  prevlookup is the result of a previous texture operation   This function can be used with the DEPTH varying out semantic to generate the  texm3x2pad texm3x2depth instruction combination in ps 1 3         
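The arithmetic described for tex_dp3x2_depth (two dot products followed by a division) can be written out on the CPU for reference. This is a plain C sketch, not shader code; the Vec3 type and the names dot3 and dp3x2_depth are hypothetical. The assertion on w is an assumption of this sketch, since the text does not say how the hardware treats a zero divisor.

```c
#include <assert.h>

/* Hypothetical 3-component vector standing in for a float3 or .xyz value. */
typedef struct { float x, y, z; } Vec3;

float dot3(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Mirrors the description: z = dot(intermediate_coord.xyz, prevlookup.xyz),
   w = dot(str, prevlookup.xyz), result = z / w. */
float dp3x2_depth(Vec3 str, Vec3 intermediate_coord, Vec3 prevlookup)
{
    float z = dot3(intermediate_coord, prevlookup);
    float w = dot3(str, prevlookup);
    assert(w != 0.0f);   /* assumption: this sketch rejects a zero divisor */
    return z / w;
}
```

For str = (0, 0, 2), intermediate_coord = (1, 0, 0), and prevlookup = (3, 0, 4), z is 3 and w is 8, so the depth value is 0.375.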
207. ngle   WavesX   IN TexCoord0 x    WavesY   IN TexCoord0 y   angle   angle   Time        float3 sine  cosine   sincos  angle  sine  cosine      posicion abes  u  sunan   sim eimnglei    we  float4 position    position xz   IN TexCoord0 xy    position y   dot WavesH  sine     POSO O    OUT HPOS   mul WorldViewProj  position         normal is  t h WaveX cos  angle    YX     t h WaveY cos angle     float3 normal    normal x   dot  WavesH   WavesX  cosine      1        808 00504 0000 004  NVIDIA    159    Cg Language Toolkit    Tornali     gt  0187  normal z dot WavesH   WavesY  cosine         transform normal into eye space  normal   mul WorldViewIT  normal    normal   normalize  normal         Transform vertex to eye space and      compute the vector from the eye to the vertex      Because th ye is at 0  no subtraction is      necessary  Because the reflection of this vector    d looks into a cube map normalization is also   72 4 unnecessary    float3  eyeVector   mul WorldView  position    OUT TEXO xyz   reflect eyeVector  normal                        return OUT        160 808 00504 0000 004  NVIDIA    Basic Profile Sample Shaders       Matrix Palette Skinning    Description    This effect performs matrix palette skinning using two bones per vertex  All the  bones for the mesh ate set in the constant memory  and each vertex includes  two indices that indicate which bones influence this vertex  The final skinned  positions are computed using these bones  along with the weights 
208. niform float4x4 ModelViewI   uniform float4 ViewerPos   uniform float4 LightPos     vert2frag Out           2  Ou   i  Ou   n    jac    Vertex positions    In clip space   t HPosition   mul ModelViewProj  In Position    In object space   t OPosition   In Position xyz    In eye space   t EPosition   mul  ModelView  In Position   xyz        t Normal   normalize In Normal xyz     Copy the texture coordinates  t TexCoord0   In TexCoord0 xyz        Generate a white color   t Color0   LightPos    t LightPos   mul  ModelViewI  LightPos  xyz   t ViewerPos   mul  ModelViewI  float4 0 0 0 1   xyz5  CUE       106    808 00504 0000 004  NVIDIA    Advanced Profile Sample Shaders    Pixel Shader Source Code for Melting Paint    struct vert2frag               MO AA O SiO ADO SON  POSES POLOS Iron TEX O ORDZ    XN ae mE Os TEON TEXCOORD3   float3 Normal TEXCOORDI   float3 TexCoord0   TEXCOORDO   float4 Color0 ICO MORO  float3 LightPos TEXCOORD4   float3 ViewerPos TEXCOORD5                    he    void calcLighting out float diffuse  out float specular   f3ltorais genou maus tio rs caghos sn lOces ico   float3 eyePos  float specularExp        ElOcics beim   ligados     icici os7  float len   length light    Jibgjsue    liee   len    float3 eye   normalize eyePos   fragPos    float3 halfVec   normalize eyePos   light      iphone siecle   Ls    13   lem            loew Liclmestas   iE cla  late ae  orando    dot  halfVec  normal   specularExp     diffuse   lighting y   attenuation    specular 
209. ns must predefine the type identifiers float2x1, float3x3, float4x4, and so on. A typedef follows the usual matrix-naming convention of TYPE_rows_X_columns. If we declare float4x4 a, then a[3] is equivalent to a._m30_m31_m32_m33. Both expressions extract the row with index 3 (the fourth row) of the matrix.

- Implementations are required to support indexing of vectors and matrices with constant indices.
- A struct type is a collection of one or more members of possibly different types.

Partial Support of Types

This specification mandates partial support for some types. Partial support for a type requires the following:

- Definitions and declarations using the type are supported.
- Assignment and copy of objects of that type are supported (including implicit copies when passing function parameters).
- Top-level function parameters may be defined using that type.

If a type is partially supported, variables may be defined using that type but no useful operations can be performed on them. Partial support for types makes it easier to share data structures in code that is targeted at different profiles.

Type Categories

- The integral type category includes types cint and int.
- The floating type category includes types cfloat, float, half, and fixed. Note that floating really means floating or fixed/fractional.
- The numeric type category includes integral and floating types.

The compile t
210. ntensCoord = float2(dot(IN.texCoord1.xyz, normal.xyz),
                        dot(IN.texCoord2.xyz, normal.xyz));
    intensity = tex2D(intensityMap, intensCoord);
    color = tex2D(colorMap, IN.texCoord3.xy);
    Corona Matas

Appendix C: Nine Steps to High-Performance Cg

Writing Cg code that compiles to efficient programs requires techniques and approaches that are different from efficient programming in C, C++, or Java. While some of the basic lessons are the same (such as using efficient underlying algorithms), the hardware programming model of modern GPUs is substantially different from that of modern CPUs. This can lead to pitfalls, where you may be disappointed by your shader's performance, as well as to opportunities, where you can push the GPU to its limits through careful programming.

The Cg language shields you from the majority of the low-level details of GPU hardware, enabling you to think about your shaders at a higher level than the low-level GPU instruction sets. However, just as an understanding of modern computer architecture (such as cache and memory hierarchy issues) is important for writing fast C and C++ code, understanding a bit about the GPU can help you write better Cg code. This appendix focuses on techniques for maximizing performance from vertex and fragment programs written in Cg and running on the NVIDIA GeForce FX architecture, specifically the vp30, fp30, arbfp1, ps_2_0, ps_2_x, vs_2_0, and vs_2_x profiles
211. o accepted. If used with a variable that requires more than one constant register (for example, a matrix), the semantic specifies the first register that is used.

Binding Semantics for Varying Input/Output Data

Table 26 summarizes the valid binding semantics for varying input parameters in the vp30 profile.

One can also use TANGENT and BINORMAL instead of TEXCOORD6 and TEXCOORD7. These binding semantics map to NV_vertex_program2 input attribute parameters. The two sets act as aliases to each other.

Table 26 vp30 Varying Input Binding Semantics

    Binding Semantics Name               Corresponding Data
    POSITION, ATTR0                      Input vertex, Generic Attribute 0
    BLENDWEIGHT, ATTR1                   Input vertex weight, Generic Attribute 1
    NORMAL, ATTR2                        Input normal, Generic Attribute 2
    COLOR0, DIFFUSE, ATTR3               Input primary color, Generic Attribute 3
    COLOR1, SPECULAR, ATTR4              Input secondary color, Generic Attribute 4
    TESSFACTOR, FOGCOORD, ATTR5          Input fog coordinate, Generic Attribute 5
    PSIZE, ATTR6                         Input point size, Generic Attribute 6
    BLENDINDICES, ATTR7                  Generic Attribute 7
    TEXCOORD0-TEXCOORD7, ATTR8-ATTR15    Input texture coordinates (texcoord0-texcoord7), Generic Attributes 8-15
    TANGENT, ATTR14                      Generic Attribute 14
    BINORMAL, ATTR15                     Generic Attribute 15

Table 27 summarizes the valid binding semantics for var
212. ogram  TRUE  0  0  0               70    808 00504 0000 004  NVIDIA    Using the Cg Runtime Library    fp rece T      Bind sampler parameter   GCparameter parameter    parameter   cgGetParameterByName  program   MySampler     cgD3D9SetTexture  parameter  myDefaultPoolTexture            void OnLostDevice              First release all necessary resources          PrepareForReset          Next actually reset the D3D devic  Giewioce   meset  PP cao 9 DE      Finally recreate all those resource  OnReset         void PrepareForReset        JS See Soh     Releas xpanded interface referenc  cgD3D9SetTexture  mySampler  0       Release local reference     and any other references to the texture  myDefaultPoolTexture   Release     HO ta Yh          void OnReset            Recreate myDefaultPoolTexture in D3DPOOL DEFAULT  EO Bak OH     Since the texture was just recreated      it must be re bound to the parameter  GCparameter parameter   parameter   cgGetParameterByName  prog   MySampler      cgD3D9SetTexture  nySampler  myDefaultPoolTexture     PO bec SUL   j    See the Direct3D documentation for a full explanation of lost devices and how  to properly handle them           808 00504 0000 004 71  NVIDIA    Cg Language Toolkit    Setting Expanded Interface Parameters    This section discusses setting the various types of parameters of the expanded  interface  including uniform scalar  uniform vector  uniform matrix  uniform  arrays of the three previous types  and sampler     Setting Un
213. ogram 24       Discovered sampler parameter  BaseTexture     E D b EH    Discovered uniform parameter  SomeColor  of    type float4   cgD3D TRACE   Finished discovering parameters for pixel  program 24   cgD3D TRACE   Shadowing state for sampler parameter  BaseTexture       cgD3D TRACE   Shadowing sampler state D3DTSS MAGFILTER for  sampler parameter  BaseTexture   cgD3D TRACE   Shadowing sampler state D3DTSS MINFILTER for  sampler parameter  BaseTexture   cgD3D TRACE   Shadowing sampler state D3DTSS MIPFILTER for  sampler parameter  BaseTexture                                      cgD3D  TRACE Shadowing 16 values for uniform parameter            ModelViewProj  of type float4x4  cgD3D TRACE   Activating vertex shader for program 3  cgD3D TRACE   Setting shadowed parameters for program 3  CgD3D TRACE   Setting registers for uniform parameter   ModelViewProj  of type float4x4  cgD3D TRACE   Setting constant registers  0   3  for  parameter  ModelViewProj  of type float4x4  cgD3D  TRACE   Activating pixel shader for program 24  cgD3D  TRACE   Setting shadowed parameters for program 24  cgD3D TRACE   Setting texture for sampler parameter   BaseTexture                          cgD3D TRACE   Setting SamplerState 0  D3DTSS MAGFILTER for  sampler parameter  BaseTexture              84    808 00504 0000 004  NVIDIA    Using the Cg Runtime Library    cgD3D  TRACE   Setting SamplerState 0  D3DTSS MINFILTER for  sampler parameter  BaseTexture     cgD3D  TRACE   Setting SamplerState 0  D3
214. om startIndex to startIndex+numberOfElements-1. Passing a value of 0 for numberOfElements tells the functions to set all the values starting at index startIndex up to the last valid index of the array, namely cgGetArraySize(parameter, 0)-1. This is equivalent to setting numberOfElements to cgGetArraySize(parameter, 0)-startIndex. The parameter array is an array of scalar values. It must hold numberOfElements values for the cgGLSetParameterArray1 functions, 2*numberOfElements values for the cgGLSetParameterArray2 functions, and so on.

The corresponding parameter value retrieval functions are as follows:

    void cgGLGetParameterArray1f(CGparameter parameter, long startIndex, long numberOfElements, float* array);
    void cgGLGetParameterArray1d(CGparameter parameter, long startIndex, long numberOfElements, double* array);
    void cgGLGetParameterArray2f(CGparameter parameter, long startIndex, long numberOfElements, float* array);
    void cgGLGetParameterArray2d(CGparameter parameter, long startIndex, long numberOfElements, double* array);
    void cgGLGetParameterArray3f(CGparameter parameter, long startIndex, long numberOfElements, float* array);
    void cgGLGetParameterArray3d(CGparameter parameter, long startIndex, long numberOfElements, double* array);
    void cgGLGetParameterArray4f(CGparameter parameter, long startIndex, long numberOfElements, float* array);
    void cgGLGetParameterArray4d(CGparameter parameter, long startIndex, long numberOfElements, double* array);

S
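The sizing rule for the array argument, including the special meaning of numberOfElements equal to 0, can be checked with a short calculation. The sketch below is plain C and does not call the Cg runtime; effective_count and required_scalars are hypothetical helpers, and array_size stands for the value that cgGetArraySize(parameter, 0) would return.

```c
#include <assert.h>

/* Hypothetical helper: number of array elements actually touched.
   A count of 0 means "from start_index to the last valid index". */
long effective_count(long array_size, long start_index, long number_of_elements)
{
    assert(start_index >= 0 && start_index < array_size);
    if (number_of_elements == 0)
        return array_size - start_index;
    assert(start_index + number_of_elements <= array_size);
    return number_of_elements;
}

/* Hypothetical helper: scalar values the caller's buffer must hold.
   components is 1 for the Array1 functions, 2 for Array2, and so on. */
long required_scalars(int components, long array_size, long start_index,
                      long number_of_elements)
{
    assert(components >= 1 && components <= 4);
    return components *
           effective_count(array_size, start_index, number_of_elements);
}
```

For a 10-element array of four-component vectors, passing startIndex 4 and numberOfElements 0 touches 6 elements, so the buffer must hold 24 scalars.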
215. ompilation profiles may allow some precision flexibility for the hardware; in such cases the compiler should ideally perform the constant folding at the highest hardware precision allowed for that data type in that profile.

If constant folding cannot be performed at run-time precision, it may optionally be performed using the precision indicated below for each of the numeric data types:

- float: s23e8 ("fp32") IEEE single-precision floating point
- half: s10e5 ("fp16") floating point with IEEE semantics
- fixed: s1.10 fixed point, clamping to [-2, 2)
- double: s52e11 ("fp64") IEEE double-precision floating point
- int: signed 32-bit integer

Type Qualifiers

The type of an object may be qualified with one or more qualifiers. Qualifiers apply only to objects. Qualifiers are removed from the value of an object when used in an expression. The qualifiers are

- const: The value of a const-qualified object cannot be changed after its initial assignment. The definition of a const-qualified object that is not a parameter must contain an initializer. Named compile-time values are inherently qualified as const, but an explicit qualification is also allowed. The value of a static const cannot be changed after compilation, and thus its value may be used in constant folding during compilation. A uniform const, on the other hand, is only const for a given execution of the program; its value may be changed via the runtime between executions.
- in and out: F
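To give a feel for the fixed format, the sketch below emulates an s1.10 value in plain C: a sign, one integer bit, and ten fractional bits, so representable values are multiples of 1/1024 between -2 and 2 - 1/1024. The function quantize_fixed_s1_10 is hypothetical, and rounding to the nearest step is an assumption of this sketch; the text only states the format and the clamping range.

```c
#include <assert.h>

/* Hypothetical emulation of the s1.10 fixed-point format:
   clamp to [-2, 2) and snap to the nearest multiple of 1/1024. */
double quantize_fixed_s1_10(double x)
{
    const double scale = 1024.0;            /* 2^10 steps per unit */
    const double lo = -2.0;                 /* smallest representable value */
    const double hi = 2.0 - 1.0 / scale;    /* largest value below 2 */

    assert(x == x);                         /* reject NaN */
    if (x < lo) x = lo;
    if (x > hi) x = hi;

    /* Round half away from zero (an assumption, not stated by the text). */
    long q = (long)(x * scale + (x >= 0.0 ? 0.5 : -0.5));
    return (double)q / scale;
}
```

Values outside the range saturate: 3.0 becomes 1.9990234375 (2047/1024), and -5.0 becomes -2.0.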
216. ompiled program by the compiler, in which case the application can simply ignore it and not set its value. Calling cgIsParameterReferenced() allows you to check whether a parameter is actually used by the final compiled program:

    CGbool cgIsParameterReferenced(CGparameter parameter);

No error is generated if you set the value of a parameter that is not referenced.

Parameter Attributes

The program that the parameter corresponds to is found using cgGetParameterProgram():

    CGprogram cgGetParameterProgram(CGparameter parameter);

To determine whether the parameter is varying, uniform, or constant, cgGetParameterVariability() is used:

    CGenum cgGetParameterVariability(CGparameter parameter);

The call returns CG_VARYING if the parameter is a varying parameter, CG_UNIFORM if the parameter is a uniform parameter, or CG_CONSTANT if the parameter is a constant parameter. A constant parameter is a parameter whose value never changes for the life of a compiled program, so that changing its value requires recompiling the program. For some profiles, the compiler has to add some constant parameters that correspond to literal constant values in the code.

To obtain the parameter direction, use cgGetParameterDirection():

    CGenum cgGetParameterDirection(CGparameter parameter);

It returns CG_IN if the parameter is an input parameter, CG_OUT if the parameter is an output parameter, or CG_INOUT if the parameter is both an input and an output parameter.
217. one

- CGD3D9ERR_NULLVALUE: Returned when a value of zero is passed to a function that requires a non-zero value
- CGD3D9ERR_OUTOFRANGE: Returned when an array range specified to a function is out of range
- CGD3D9_INVALID_REG: Returned when a register number is requested for an invalid parameter type. This error is specific to the minimal interface functions and does not trigger an error callback.

Testing for Errors

When a Direct3D runtime function is called that returns an error of type HRESULT, the proper method of testing for success or failure is to use the Win32 macros FAILED() and SUCCEEDED(). Simply testing the error against zero or D3D_OK is not sufficient, because there could be more than one success value.

As an added convenience, and for uniformity with the core runtime, the Direct3D runtime also supplies cgD3D9GetLastError(), which is analogous to cgGetLastError() but returns the last Direct3D runtime error of type HRESULT for which the FAILED() macro returns TRUE:

    HRESULT cgD3D9GetLastError();

The last error is always cleared immediately after the call.

The function cgD3D9TranslateHRESULT() converts an error of type HRESULT into a string:

    const char* cgD3D9TranslateHRESULT(HRESULT hr);

This function should be called instead of DXGetErrorDescription9() because it also translates errors that the Cg Direct3D runtime generates.

Using Err
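The reason a comparison against zero is not enough can be shown without any Direct3D headers. The sketch below is plain C with stand-in definitions: HRESULT_T, HR_SUCCEEDED, HR_FAILED, and the three constants are hypothetical names that mirror the Win32 HRESULT, SUCCEEDED(), FAILED(), S_OK, S_FALSE, and E_FAIL definitions so the example compiles on any platform. In a real Direct3D application you would include the Windows headers and use the real macros.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-ins for the Win32 definitions (assumption: same semantics as
   HRESULT, SUCCEEDED() and FAILED(), where negative values are failures). */
typedef int32_t HRESULT_T;
#define HR_SUCCEEDED(hr) ((HRESULT_T)(hr) >= 0)
#define HR_FAILED(hr)    ((HRESULT_T)(hr) < 0)

#define HR_S_OK    ((HRESULT_T)0)               /* same value as S_OK */
#define HR_S_FALSE ((HRESULT_T)1)               /* same value as S_FALSE */
#define HR_E_FAIL  ((HRESULT_T)(-2147467259))   /* bit pattern 0x80004005 */

/* Insufficient test: treats every non-zero success code as a failure. */
int naive_is_success(HRESULT_T hr)
{
    return hr == HR_S_OK;
}

/* Test in the style the text recommends. */
int proper_is_success(HRESULT_T hr)
{
    return HR_SUCCEEDED(hr);
}
```

A success code such as S_FALSE (value 1) is rejected by the naive comparison but accepted by the macro-based test, while E_FAIL is rejected by both.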
218. onmentMaps  2     8  looi       float3 reflectColor   texCUBE  environmentMaps 0            reflectVec  rgb   float3 reflectColorDark   texCUBE  environmentMaps 1    reflectVecDark  rgb           closws colo     rertlecucolo r      xdos  sr   reflectColorDark   colorl    eubisa lose delo  LO        104 808 00504 0000 004  NVIDIA    Advanced Profile Sample Shaders       Melting Paint    Description    This shader uses an environment map with procedurally modified texture  lookups to create a melting effect on the surface texture  the NVIDIA logo in  this example   The reflection vector is shifted using a noise function  giving the  appearance of a bumpy surface  The surface texture s texture coordinates are  shifted in a time dependent manner  also based on a noise texture        Figure 7 Example of Melting Paint    Vertex Shader Source Code for Melting Paint       define inputs from application  struct app2vert     float4 Position LO SeenON  float4 Normal   NORMAL        808 00504 0000 004 105  NVIDIA    Cg Language Toolkit    fal  iz  y    oat4 Color0 COLORO   oat4 TexCoord0   TEXCOORDO        struct vert2frag                     float4 HPosition  SEOSJTUN ON  float3 OPosition LE X lt COOR DZ   float3 EPosition TEXCOORD3   float3 Normal TEXCOORD1    float3 TexCoord0 TES O OBI Or   float4 Color0 COLORO   float3 LightPos TEXCOORD4   float3 ViewerPos TEXCOORD5        y                vert2frag main  app2vert In     uniform float4x4 ModelViewProj   uniform float4x4 ModelView   u
219. onn      y    float4 Hposition   POSITION   float4 TexCoord0   TEXCOORDO   float4 TexCoordl   TEXCOORD1   itle Color 3  euo             vpconn main appdata IN     uniform float4x4 WorldViewProj   uniform float4x4 TexTransform   uniform float3x3 WorldIT   uniform float3 LightVec   vpconn OUT   float3 worldNormal   normalize mul WorldIT  IN Normal     float ldotn   max dot LightVec  worldNormal   0 0    Qu Color Q yz   loud   float4 tempPos   tempPos xyz   IN Position xyz     tempPos w   1 0     OUT TexCoordO0  OUT TexCoordl    mul  TexTransform  tempPos    mul  TexTransform  tempPos                  OUT Hposition   mul WorldViewProj  tempPos      CECU OU        808 00504 0000 004 153    NVIDIA    Cg Language Toolkit    Pixel Shader Source Code for Shadow Mapping    SIC    float4   float4   float4   float4  hg    simple     position BOS ION  TexCoord0   TEXCOORDO   TexCoordl   TEXCOORD1   Colon RECEN  OT ESTA             float4 main v2f simple IN   uniform sampler2D ShadowMap     uniform sampler2D SpotLight    COLOR     float4 shadow   tex2D ShadowMap  IN TexCoord0 xy    float4 spotlight   tex2D SpotLight  IN TexCoordl xy      float4    return       lighting   IN Color0     shadow   spotlight   lighting        154    808 00504 0000 004  NVIDIA    Basic Profile Sample Shaders       Shadow Volume Extrusion    Description    This effect uses vertex programs to generate shadow volumes by extruding  geometry along the light vector  Figure 20         Figure 20 Example of Shadow Volum
220. ons arrays indexed with variable expressions need not be declared const, just uniform. However, writing to an array that is later indexed with a variable expression yields unpredictable results.

Array data is not packed because vertex program indexing does not permit it. Each element of the array takes a single 4-float program parameter register. For example, float arr[10], float2 arr[10], float3 arr[10], and float4 arr[10] all consume 10 program parameter registers.

It is more efficient to access an array of vectors than an array of matrices. Accessing a matrix requires a floor calculation, followed by a multiply by a constant to compute the register index. Because vectors (and scalars) take one register, neither the floor nor the multiply is needed. It is faster to do matrix skinning using arrays of vectors with a premultiplied index than using arrays of matrices.

Bindings

Binding Semantics for Uniform Data

Table 10 summarizes the valid binding semantics for uniform parameters in the vs_2_0 and vs_2_x profiles.

Table 10 vs_2_* Uniform Input Binding Semantics

    Binding Semantics Name                   Corresponding Data
    register(c0)-register(c255), C0-C255     Constant register [0..95]. The aliases c0-c95 (lowercase) are also accepted. If used with a variable that requires more than one constant register (for example, a matrix), the semantic specifies the first register that is used.
221. ooooooooooooooos 14  Arithmetic Operators rom C    uoa epos a ete 14  Multiplication  PUNCHONS   vw  3 is di BER a rr qc br TR a n 15  Vector CONSTUCO sd actio o RU Heo EA AA gs 15  Boolean and Comparison Operators  2    s l e hh 15  SWizzle OpeFatof sacar see tbi e eoe em hg repe bh deas re Rotes 16  Write Mask Operator  i a aa hah a a RG aes ac Rara 16  Conditional Operator    x   ns exo ERE ax RR ME Rer eg AA dpt os 17  Texture Lookups in Advanced Fragment Profiles              ooooooooommoo   17  More Detalls  22 30  rm 18   Cg Standard Library Functions        cooocccccccc nnn rn 19  Mathematical EUNEONS        5 2a it ri a mot abad dle diete 19  GeOITigbric  FUNCHONG  ot ia ia ia a iS oie 24  Texture Map FUNCIONS  s x suede ri e Ed esas tons 25  Derivative  FUNCHONS  gt  lt a main cea AG gotta neck or ta e 27  Debugding FUNCI  N       22 e tre io a de PER Rx Roca ipte d eres 28  Predefined Fragment Program Output Structures             llle 28   808 00504 0000 004    NVIDIA    Cg Language Toolkit       Using the  Cg Runtime Library    osc ee ee eee RR Rn 29  Introducing the Cg Rutritiirie     cimas sarria ae PAG ip a AA 29  Benefits of the Cg R  fltllIne    aeaiee sek be bx e it e te del 29  Overview of the Ca Runtlie   aua ek Exch Rb xx Rus ERU eda x ERA 30  Gore Cy Until   dois ba ERA abate o E e ae Oe dede 34  Core Co Context  zen ich ipeo iced etur aa A ie re 35  Core Gg Program seirene s eima m nah ph HR RR ECEORORORO RUN A A ae RE 35  Gore Cg Palatmeltel sse ape o
222. opts profopts
    Specify a comma-separated list of profile-specific options. See the profile specification for valid options.
-entry fname
    Specify the main function name as fname.
-o fname
    Write the output to file fname.
-Dmacro[=value]
    Define a macro, with optional value.
-Ipathname
    Specify path to an include directory.
-l filename
    Write compiler messages to filename rather than to standard output.
-strict
    Enforce strict type checking.
-nofx
    Do not treat CgFX keywords as reserved words.
-quiet
    Suppress printing the header to stdout.
-nocode
    Compile, but do not generate any code.
-nostdlib
    Do not include the stdlib.h header file before compilation.
-longprogs
    Allow code generation that is longer than a profile's limit.
-debug
    Activate the debug() function.
-v
    Print the compiler's version to stdout.
-h
    Print a short help message.
-maxunrollcount N
    Set the maximum loop unroll count to N. Loops with greater than N iterations are not unrolled. Defaults to 256.
-posinv
    Generate a position-invariant vertex program if position invariance is supported by the current profile.

A
abs(), for performance 259
animation of geometry 146
anisotropic lighting
  sample shader 134
  vertex shader code example 135
ANSI C
  differences from Cg 166
  relation to Cg 165
arbfp1 profile 211
arbvp1 profile
223. or Callbacks

Here is an example of a possible error callback that sorts out debug trace errors from core runtime errors and from Direct3D runtime errors:

    void MyErrorCallback() {
      CGerror error = cgGetError();
      if (error == cgD3D9DebugTrace) {
        // This is a debug trace output.
        // A breakpoint could be set here to step from one
        // debug output to the other.
        return;
      }
      char buffer[1024];
      if (error == cgD3D9Failed)
        sprintf(buffer, "A Direct3D error occurred: %s\n",
                cgD3D9TranslateHRESULT(cgD3D9GetLastError()));
      else
        sprintf(buffer, "A Cg error occurred: %s\n",
                cgD3D9TranslateCGerror(error));
      OutputDebugString(buffer);
    }
    cgSetErrorCallback(MyErrorCallback);

A Brief Tutorial

This section walks you through the sample Cg Microsoft Visual Studio workspace we have provided, along with a simple Cg program that you can use for experimentation.

Loading the Workspace

When you load the Cg_Simple file, your workspace should look like the image in Figure 3.

[Figure 3: the workspace as loaded in Visual Studio]
224. ormal parameters may be qualified as in, out, or both (by using in out or inout). By default, formal parameters are in-qualified. An in-qualified parameter is equivalent to a call-by-value parameter. An out-qualified parameter is equivalent to a call-by-result parameter, and an inout-qualified parameter is equivalent to a value/result parameter. An out-qualified parameter cannot be const-qualified, nor may it have a default value.

Type Conversions

Some type conversions are allowed implicitly, while others require a cast. Some implicit conversions may cause a warning, which can be suppressed by using an explicit cast. Explicit casts are indicated using C-style syntax (casting variable to the float4 type can be achieved using (float4)variable).

- Scalar conversions
  Implicit conversion of any scalar numeric type to any other scalar numeric type is allowed. A warning may be issued if the conversion is implicit and a loss of precision is possible. Implicit conversion of any scalar object type to any compatible scalar object type is allowed. Conversions between incompatible scalar object types or between object and numeric types are not allowed, even with an explicit cast. A sampler is compatible with sampler1D, sampler2D, sampler3D, samplerCUBE, and samplerRECT. No other object types are compatible (sampler1D is not compatible with sampler2D, even though both are compatible with sampler).
225. ort for fixed operations, but must still support definition of fixed variables. Cg allows profiles to omit run-time support for int. Cg allows profiles to treat double as float.

- Many operators support per-element vector operations.
- The ?:, ||, &&, !, and comparison operators can be used with bool four-vectors to perform four conditional operations simultaneously. The side effects of all operands to the ?:, ||, and && operators are always executed.
- Non-static global variables and parameters to top-level functions (such as main()) may be designated as uniform. A uniform variable may be read and written within a program, just like any other variable. However, the uniform modifier indicates that the initial value of the variable or parameter is expected to be constant across a large number of invocations of the program.
- A new set of sampler* types represents handles to texture objects.
- Functions may have default values for their parameters, as in C++. These defaults are expressed using assignment syntax.
- Function overloading is supported.
- There is no enum or union.
- Bit-field declarations in structures are not allowed.
- Variables may be defined anywhere before they are used, rather than just at the beginning of a scope as in C. (That is, we adopt the C++ rules that govern where variable declarations are allowed.) Variables may not be redeclare
226. osAngle  1    0   xxxx     return OUT

Grass

Description

This effect shows procedural animation of geometry using a sine function, along with calculation of a normal for the procedurally deformed geometry (Figure 17).

[Figure 17: Example of Grass]

Vertex Shader Source Code for Grass

    struct app2vert {
      float4 Position  : POSITION;
      float4 Normal    : NORMAL;
      float4 TexCoord0 : TEXCOORD0;
      float4 Color0    : COLOR0;
    };

    struct vertout {
      float4 Hposition : POSITION;
      float4 Color0    : COLOR0;
      float4 TexCoord0 : TEXCOORD0;
    };

    vertout main(app2vert IN,
                 uniform float4x4 ModelViewProj,
                 uniform float4x4 ModelView,
                 uniform float4x4 ModelViewIT,
                 uniform float4 Constants)
    {
      vertout OUT;

      // we need to figure out what the position is
      float4 position = IN.Position;
      position.z = 0;
      POSO ny Up

      // add in the actual base location of
      // the straw (stored in Color0.xz)
      POSTON 5 AO Sato aP JUN  Colo sre
      POSTEN 7 O O Son EN CONO OR

      // figure out where the wind is coming from
      float4 origin = float4(20, 0, 20, 0);
      float4 dir = position - origin;

      // find the intensity of the wind
      float inten   sin Constants x    2 length  dir     JIN c POSILIE LOI  S77
      dir = normalize(dir);

      // we need to do some Bezier curv
      Eloet ermi float4 0 0 0 0
      loci ceriz   alo   N Color  7 2 9 OP ODE
      float4 ctr13
227. ource Code for Thin Film Effect

    // define inputs from application
    struct a2v {
      float4 Position : POSITION;
      float3 Normal   : NORMAL;
    };

    // define outputs from vertex shader
    struct v2f {
      float4 HPOS      : POSITION;
      float4 diffCol   : COLOR0;
      float4 specCol   : COLOR1;
      float2 filmDepth : TEXCOORD0;
    };

    v2f main(a2v IN,
             uniform float4x4 WorldViewProj,
             uniform float4x4 WorldViewIT,
             uniform float4x4 WorldView,
             uniform float4 LightVector,
             uniform float4 FilmDepth,
             uniform float4 EyeVector)
    {
      v2f OUT;

      // transform position to clip space
      OUT.HPOS = mul(WorldViewProj, IN.Position);
      float4 tempnorm = float4(IN.Normal, 0.0);

      // transform normal from model space to view space
      float3 normalVec = mul(WorldViewIT, tempnorm).xyz;
      normalVec = normalize(normalVec);

      // compute the eye->vertex vector
      float3 eyeVec = EyeVector.xyz;

      // compute the view depth for the thin film
      float viewdepth = (1.0 / dot(normalVec, eyeVec)) * FilmDepth.x;
      OUT.filmDepth = viewdepth.xx;

      // store normalized light vector
      float3 lightVec = normalize((float3)LightVector);

      // calculate half angle vector
      float3 halfAngleVec = normalize(lightVec + eyeVec);

      // calculate diffuse component
      float diffuse = dot(normalVec, lightVec);

      // calculate specular component
      float specular = dot(normalVec, halfAngleVec);
228. ovides because many of them compile directly to GPU assembly language instructions. Writing a dot product function of your own,

    float mydot(float3 a, float3 b) {
      return a.x*b.x + a.y*b.y + a.z*b.z;
    }

compiles to a handful of instructions, while the built-in dot() function compiles to a single specialized dot product instruction. There is no way to get to this instruction other than by using the Standard Library.

Two functions deserve particular attention. The abs() function usually has no cost in either vertex or fragment programs because the GPU can evaluate the function while executing other instructions. Similarly, the saturate() function usually has no cost in fragment programs. Do not hesitate to use these functions when appropriate.

4. Use Texture Maps to Encode Complex Functions

For profiles that support texture maps, filtered texture map lookups are extraordinarily efficient. If you have a complex function that takes more than a handful of arithmetic operations to evaluate, you might want to encode the function in a texture map. Say that you have written a function f(x, y) that is a bottleneck in your shader. Assume for now that it is always called with values of x and y between zero and one, and that the value that f(x, y) computes is always between zero and one. If the function is reasonably smooth and you don't need to compute it at extremely high precision
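The idea of encoding f(x, y) in a texture map can be sketched on the CPU as a precomputed table with bilinear filtering. This is a rough model under stated assumptions, not GPU code: LUT_SIZE, example_f, lut_build, and lut_sample are hypothetical names, the stand-in function x*y was picked only because it is easy to check, and real texture hardware places samples at texel centers rather than at the grid end points used here.

```c
#include <assert.h>

#define LUT_SIZE 64   /* hypothetical resolution, chosen for illustration */

/* Stand-in for the expensive function f(x, y); any smooth function with
   inputs and outputs in [0, 1] could be substituted. */
static float example_f(float x, float y)
{
    return x * y;
}

/* Precompute f on a regular grid, as an application would when filling
   a 2D texture. */
static void lut_build(float lut[LUT_SIZE][LUT_SIZE])
{
    for (int j = 0; j < LUT_SIZE; ++j) {
        for (int i = 0; i < LUT_SIZE; ++i) {
            float x = (float)i / (LUT_SIZE - 1);
            float y = (float)j / (LUT_SIZE - 1);
            lut[j][i] = example_f(x, y);
        }
    }
}

/* Bilinearly filtered lookup, a simplified model of a filtered texture fetch. */
static float lut_sample(float lut[LUT_SIZE][LUT_SIZE], float x, float y)
{
    assert(x >= 0.0f && x <= 1.0f && y >= 0.0f && y <= 1.0f);
    float fx = x * (LUT_SIZE - 1);
    float fy = y * (LUT_SIZE - 1);
    int i0 = (int)fx;                            /* equals floor: fx >= 0 */
    int j0 = (int)fy;
    int i1 = (i0 + 1 < LUT_SIZE) ? i0 + 1 : i0;  /* clamp at the edge */
    int j1 = (j0 + 1 < LUT_SIZE) ? j0 + 1 : j0;
    float tx = fx - (float)i0;
    float ty = fy - (float)j0;
    float row0 = lut[j0][i0] * (1.0f - tx) + lut[j0][i1] * tx;
    float row1 = lut[j1][i0] * (1.0f - tx) + lut[j1][i1] * tx;
    return row0 * (1.0f - ty) + row1 * ty;
}
```

In a shader the table would be a texture and lut_sample a single filtered lookup; the table size trades memory against how closely the filtered values follow f.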
229. parameter);

These functions iterate through all the simple parameters, structure fields, and array elements that are input to the program. Nothing is guaranteed regarding the order of the parameters in the sequence.

Direct Retrieval

Any parameter of a program can be retrieved directly by using its name with cgGetNamedParameter():

    CGparameter cgGetNamedParameter(CGprogram program,
                                    const char* name);

If the program has no parameter corresponding to name, cgGetNamedParameter() returns zero.

The Cg syntax is used to retrieve structure fields or array elements. Let's take the following code snippet as an example:

    struct FooStruct {
      float4 A;
      float4 B;
    };
    struct BarStruct {
      FooStruct Foo[2];
    };
    void main(BarStruct Bar[3]) {
      // ...
    }

The following are valid names for retrieving the corresponding parameter:

    "Bar"
    "Bar[1]"
    "Bar[1].Foo"
    "Bar[1].Foo[0]"
    "Bar[1].Foo[0].B"

Parameter Query

Parameter queries encompass validity, references, and attributes.

Parameter Validity

The function cgIsParameter() allows you to check whether a parameter handle references a valid parameter or not:

    CGbool cgIsParameter(CGparameter parameter);

A parameter handle becomes invalid when the program or the context of the program it corresponds to is destroyed.

Parameter References

A parameter that is referenced by the original Cg source code may be optimized out of the c
230. pe. A function that takes no parameters may be declared in one of two ways:

- As in C, using the void keyword: functionName(void)
- With no parameters at all: functionName()

Functions may be declared as static. If so, they may not be compiled as a program and are not visible from other compilation units.

Overloading of Functions by Profile

Cg supports overloading of functions by compilation profile. This capability allows a function to be implemented differently for different profiles. It is also useful because different profiles may support different subsets of the language capabilities, and because the most efficient implementation of a function may be different for different profiles.

The profile name must immediately precede the type name in the function declaration. For example, to define two different versions of the function myfunc() for the profileA and profileB profiles:

    profileA float myfunc(float x) {/*...*/};
    profileB float myfunc(float x) {/*...*/};

If a type is defined (using a typedef) that has the same name as a profile, the identifier is treated as a type name and is not available for profile overloading at any subsequent point in the file.

If a function definition does not include a profile, the function is referred to as an open-profile function. Open-profile functions apply to all profiles.

Several wildcard profile names are defined. The
231. ple illustrating this operation:

    CGprogram program1, program2;
    program1 = cgCreateProgramFromFile(context, CG_SOURCE,
                   "VertexProgram.cg", CG_PROFILE_VS_1_1, 0, 0);
    const DWORD* declaration1 = cgD3D8GetVertexDeclaration(program1);
    cgD3D8LoadProgram(program1, TRUE, 0, 0, declaration1);
    program2 = cgCopyProgram(program1);
    const DWORD declaration2[] = {
      // Custom declaration
    };
    if (cgD3D8ValidateVertexDeclaration(program2, declaration2))
      cgD3D8LoadProgram(program2, TRUE, 0, 0, declaration2);

Only the loading functions differ between Direct3D 9 and Direct3D 8; the unloading and binding functions are the same.

To release the Direct3D resources allocated by cgD3D9LoadProgram(), such as the Direct3D shader object and any shadowed parameter, use

    HRESULT cgD3D9UnloadProgram(CGprogram program);

Note that cgD3D9UnloadProgram() does not free any core runtime resources, such as program and any of its parameter handles. On the other hand, destroying a program with cgDestroyProgram() or cgDestroyContext() releases any Direct3D resources by indirectly calling cgD3D9UnloadProgram().

Function cgD3D9IsProgramLoaded() returns CG_TRUE if a program is loaded:

    CGbool cgD3D9IsProgramLoaded(CGprogram program);

All programs must be loaded before they can be bound. Binding a program is done by calling cgD3D9BindProgram():

    HRESULT cgD3D9BindProgram(CGprogram program);

This function basi
232. ply transposed float3x3 matrix m by a float3 v,

    mul(v, m)

is equivalent to and more efficient than

    mul(transpose(m), v)

9. Minimize Conditional Code in Fragment Programs

GPUs don't currently support branching in fragment programs; a program with a large amount of code that is conditionally executed (for example, in an if-else expression) tends to run at the same speed as if all of it were executed. Therefore, if you have a large amount of conditional code and it is possible to evaluate the condition on the CPU, it may be advantageous to have multiple versions of the shader source code and to bind the one with the appropriate code path at run time.

An example of this situation would be a fragment shader that supported a generic light source model for shading. Depending on how its parameters were set, it might implement a point light, a spotlight, or a light source that projected a texture map to determine the light distribution. Rather than having a series of if-else tests to determine which light model to use, having a separate version of the shader for each light type is generally more efficient.

Appendix D: Cg Compiler Options

This appendix describes the command-line options for the Cg compiler, cgc.exe:

-profile prof
    Compile for the prof profile.
-profile
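The equivalence stated in the matrix-transpose tip at the top of this excerpt, mul(v, m) versus mul(transpose(m), v), can be checked with a small C model. The types and function names below are hypothetical stand-ins for the Cg built-ins, written only to show why the two forms give the same result.

```c
#include <assert.h>

typedef struct { float m[3][3]; } float3x3;   /* m[row][column] */
typedef struct { float v[3]; } float3;

/* Matrix times column vector, modeling mul(M, v):
   r[i] = sum over j of M[i][j] * v[j] */
static float3 mul_mv(float3x3 M, float3 v)
{
    float3 r;
    for (int i = 0; i < 3; ++i)
        r.v[i] = M.m[i][0] * v.v[0] + M.m[i][1] * v.v[1] + M.m[i][2] * v.v[2];
    return r;
}

/* Row vector times matrix, modeling mul(v, M):
   r[j] = sum over i of v[i] * M[i][j] */
static float3 mul_vm(float3 v, float3x3 M)
{
    float3 r;
    for (int j = 0; j < 3; ++j)
        r.v[j] = v.v[0] * M.m[0][j] + v.v[1] * M.m[1][j] + v.v[2] * M.m[2][j];
    return r;
}

static float3x3 transpose3(float3x3 M)
{
    float3x3 t;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            t.m[i][j] = M.m[j][i];
    return t;
}

/* Asserts that both forms agree component by component. */
static void check_transpose_identity(float3x3 M, float3 v)
{
    float3 a = mul_vm(v, M);
    float3 b = mul_mv(transpose3(M), v);
    for (int k = 0; k < 3; ++k)
        assert(a.v[k] == b.v[k]);
}
```

The second form performs the same multiplies and adds plus an explicit transpose, which is the extra work the tip suggests avoiding.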
233. pproach used to specify binding semantics for inputs.

Aliasing of Semantics

Semantics must honor a copy-on-input and copy-on-output model. Thus, if the same input binding semantic is used for two different variables, those variables are initialized with the same value, but the variables are not aliased thereafter. Output aliasing is illegal, but implementations are not required to detect it. If the compiler does not issue an error on a program that aliases output binding semantics, the results are undefined.

Restrictions on Semantics Within a Structure

For a particular profile, it is illegal to mix input binding semantics and output binding semantics within a particular struct. That is, for a particular top-level function, a struct must be either input-only or output-only. Likewise, a struct must consist exclusively of uniform inputs or exclusively of non-uniform inputs. It is illegal to use binding semantics to mix the two within a single struct.

Additional Details for Binding Semantics

The following rules are somewhat redundant, but provide extra clarity:

- Semantics names are case-insensitive.
- Semantics attached to parameters to non-main functions are ignored.
- Input semantics may be aliased by multiple variables.
- Output semantics may not be aliased.

How Programs Receive and Return Data

A program is just a non-static function that has been design
234.
  predefined output structures 28
  varying output 8
fragment program profiles 193
  OpenGL ARB 211
  OpenGL NV_fragment_program 218
fragment program, defined 2
fresnel 144
  sample shader 144
  vertex shader code example 144
function
  calls 171
  multiplying 15
  open profile 170
function definitions, introduction 14
function overloading 181
  introduction 14
functions
  debugging 28
  declaring 169
  derivative 27
  geometric 24
  mathematical 19
  overloading by profile 170
  standard library 19
  texture map 25

G
geometric functions 24
GL_ARB_vertex 204
global variables 182
graphics hardware, evolution of xi
grass
  sample shader 146
  vertex shader code example 146

H
half data type 11
half type, specification 171

I
if statements 185
inputs
  uniform 5
  varying 5
int data type 11
int type, specification 171
integral type category 174

J
Java, relation to Cg 165

L
language profiles, concept of 3

M
mathematical functions 19
matrices, multiplying 15
matrices, support of 11
matrix palette skinning 161
  sample shader 161
  vertex shader code example 162
matrix transposes and performance 263
melting paint
  pixel shader code example 107
  sample shader 105
  vertex shader code example 105
min(), for performance 259
miscellaneous operators 190
modifiable function parameters, passing 14
multipaint
  pixel shader code example 111
  sample shader 109
  vertex shader code example 110

N
namespaces 179
numeric type c
235. ption

to provide full support for the fixed type or to implement the fixed type with the same precision as the half or float types.

- The bool type represents Boolean values. Objects of bool type are either true or false.
- The cint type is 32-bit two's complement. This type is meaningful only at compile time; it is not possible to declare objects of type cint.
- The cfloat type is IEEE single-precision (32-bit) floating point. This type is meaningful only at compile time; it is not possible to declare objects of type cfloat.
- The void type may not be used in any expression. It may only be used as the return type of functions that do not return a value.
- The sampler* types are handles to texture objects. Formal parameters of a program or function may be of type sampler*. No other definition of sampler* variables is permitted. A sampler* variable may only be used by passing it to another function as an in parameter. Assignment to sampler* variables is not permitted, and sampler* expressions are not permitted. The following sampler* types are always defined: sampler, sampler1D, sampler2D, sampler3D, samplerCUBE, and samplerRECT. The base sampler type may be used in any context in which a more specific sampler type is valid. However, a sampler variable must be used in a consistent way throughout the program. For example, it cannot be used in place of both a sampler1D and a sampler2D
236. r offset_rectangle_scale NV_texture_shader instructions.

tex1D_dp3(sampler1D tex, float3 str, float4 prevlookup)

    Performs the following:

        return tex1D(tex, dot(str, prevlookup.xyz));

    where str are texture coordinates associated with sampler tex, and prevlookup is the result of a previous texture operation.

    This function can be used to generate the dot_product_1d NV_texture_shader instruction.

Table 50 fp20 Auxiliary Texture Functions (continued)

tex2D_dp3x2(uniform sampler2D tex, float3 str, float4 intermediate_coord, float4 prevlookup)
texRECT_dp3x2(uniform samplerRECT tex, float3 str, float4 intermediate_coord, float4 prevlookup)

    Performs the following:

        float2 newst = float2(dot(intermediate_coord.xyz, prevlookup.xyz),
                              dot(str, prevlookup.xyz));
        return tex2D/RECT(tex, newst);

    where str are texture coordinates associated with sampler tex, prevlookup is the result of a previous texture operation, and intermediate_coord are texture coordinates associated with the previous texture unit.

    This function can be used to generate the dot_product_2d or dot_product_rectangle NV_texture_shader instruction combinations.

tex3D_dp3x3(sampler3D tex, float3 str, float4 intermediate_coord1, float4 intermediate_coord2, float4 prevlookup)
texCUBE_dp3x3(samplerCUBE tex, float3 str, float4 intermediate_coord1,
237. r, .y/.g, .z/.b, .w/.a, .xy/.rg, .xyz/.rgb, .xyzw/.rgba, .xxx/.rrr, .yyy/.ggg, .zzz/.bbb, .www/.aaa, .xxxx/.rrrr, .yyyy/.gggg, .zzzz/.bbbb, .wwww/.aaaa

- Matrix swizzles are not supported.
- Boolean operators other than <, <=, > and >= are not supported. Furthermore, <, <=, > and >= are only supported as the condition in the ?: operator.
- Bitwise integer operators are not supported.
- / is not supported unless the divisor is a non-zero constant or it is used to compute the depth output in ps_1_3.
- % is not supported.
- Ternary ?: is supported if the boolean test expression is a compile-time boolean constant, a uniform scalar boolean, or a scalar comparison to a constant value in the range [-0.5, 1.0] (for example, a > 0.5 ? b : c).
- do, for, and while loops are supported only when they can be completely unrolled.
- Arrays, vectors, and matrices may be indexed only by compile-time constant values or index variables in loops that can be completely unrolled.
- The discard statement is not supported. The similar but less general clip() function is supported.
- The use of an allocation-rule-identifier for an input or output struct is optional.

Standard Library Functions

Because the DirectX pixel shader 1_X profiles have limited capabilities, not all of the Cg standard library functions are supported. Table 35 presents the Cg standard library functions that are supported by these profiles. See th
238. r fragment program profiles. Profiles may define additional output binding semantics with specific behaviors, and these definitions are expected to be consistent across commonly used profiles.

Table 9 Fragment Output Binding Semantics

    Name     Meaning                  Type     Default Value
    COLOR    RGBA output color        float4   Undefined
    COLOR0   Same as COLOR
    DEPTH    Fragment depth value     float    Interpolated depth from rasterizer
             (in range [0,1])                  (in range [0,1])

If a program desires an output color alpha of 1.0, it should explicitly write a value of 1.0 to the w component of the COLOR output. The language does not define a default value for this output.

Note: If the target hardware uses a default value for this output, the compiler may choose to optimize away an explicit write specified by the user if it matches the default hardware value. Such defaults are not exposed in the language.

In contrast, the language does define a default value for the DEPTH output. This default value is the interpolated depth obtained from the rasterizer. Semantically, this default value is copied to the output at the beginning of the execution of the fragment program.

As discussed earlier, when a binding semantic is applied to an output, the type of the output variable is not required to match the type of the bindin
239. r needs more registers to compile a program than are available, it generates an error.

2. To understand the capabilities of DirectX PS 2.0 Pixel Shaders and the code produced by the compiler, refer to the Pixel Shader Reference in the DirectX 9 SDK documentation.

Language Constructs and Support

Data Types

This profile implements data types as follows:

- float data type is implemented as IEEE 32-bit single precision.
- half, fixed, and double data types are treated as float. half data types can be used to specify a partial-precision hint for pixel shader instructions.
- int data type is supported using floating-point operations.
- sampler* types are supported to specify sampler objects used for texture fetches.

Statements and Operators

With the ps_2_0 profiles, while, do, and for statements are allowed only if the loops they define can be unrolled, because there is no dynamic branching in PS 2.0 shaders. In the current Cg implementation, extended ps_2_x shaders have the same limitation.

Comparison operators (>, <, >=, <=, ==, !=) and Boolean operators (||, &&, ?:) are allowed. However, the logic operators (&, |, ^, ~) are not.

Using Arrays and Structures

Variable indexing of arrays is not allowed. Array and structure data is not packed.

Bindings

Binding Semanti
240. r produces, see the Vertex Shader Reference in the DirectX 9 SDK documentation.

Statements and Operators

If the vs_2_0 profile is used, then if, while, do, and for statements are allowed only if the loops they define can be unrolled, because there is no dynamic branching in unextended VS 2.0 shaders.

If the vs_2_x profile is used, then if, while, and do statements are fully supported as long as the DynamicFlowControlDepth option is not 0.

Comparison operators (>, <, >=, <=, ==, !=) and Boolean operators (||, &&, ?:) are allowed. However, the logic operators (&, |, ^, ~) are not.

Data Types

The profiles implement data types as follows:

- float data types are implemented as IEEE 32-bit single precision.
- half and double data types are treated as float.
- int data type is supported using floating-point operations, which adds extra instructions for proper truncation for divides, modulos, and casts from floating-point types.
- fixed or sampler* data types are not supported, but the profiles do provide the minimal partial support that is required for these data types by the core language specification; that is, it is legal to declare variables using these types, as long as no operations are performed on the variables.

Using Arrays

Variable indexing of arrays is allowed as long as the array is a uniform constant. For compatibility reas
241.
  Core Cg Error
  API-Specific Cg Runtimes
  Parameter Shadowing
  OpenGL Cg Runtime
  Direct3D Cg Runtime
A Brief Tutorial
  Loading the Workspace
  Understanding simple.cg
  Program Listing for simple.cg
  Definitions for Structures with Varying Data
  Passing Arguments
  Basic Transformations
  Prepare for Lighting
  Calculating the Vertex Color
  Further Experimentation
Advanced Profile Sample Shaders
  Improved Skinning
    Description
    Vertex Shader Source Code for Improved Skinning
  Improved Water
    Description
    Vertex Shader Source Code for Improved Water
    Pixel Shader Source Code for Improved Water
  Melting Paint
242. rder, and the c suffix is for functions that assume the matrix is laid out in column order.

The corresponding parameter value retrieval functions are

    void cgGLGetMatrixParameterfr(CGparameter parameter, float* matrix);
    void cgGLGetMatrixParameterfc(CGparameter parameter, float* matrix);
    void cgGLGetMatrixParameterdr(CGparameter parameter, double* matrix);
    void cgGLGetMatrixParameterdc(CGparameter parameter, double* matrix);

Use cgGLSetStateMatrixParameter() to set an OpenGL 4x4 state matrix:

    void cgGLSetStateMatrixParameter(CGparameter parameter,
                                     GLenum stateMatrixType,
                                     GLenum transform);

The variable stateMatrixType is an enumerated type specifying the state matrix to be used to set the parameter:

- CG_GL_MODELVIEW_MATRIX for the current model-view matrix
- CG_GL_PROJECTION_MATRIX for the current projection matrix
- CG_GL_TEXTURE_MATRIX for the current texture matrix
- CG_GL_MODELVIEW_PROJECTION_MATRIX for the concatenated model-view and projection matrices

The variable transform is an enumerated type specifying a transformation applied to the state matrix before it is used to set the parameter value:

- CG_GL_MATRIX_IDENTITY for applying no transformation at all
- CG_GL_MATRIX_TRANSPOSE for transposing the matrix
- CG_GL_MATRIX_INVERSE for inverting the matrix
- CG_GL_MATRIX_INVERSE_TRANSPOSE for inverting and transposing the matrix

Setting
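The difference between the row-order (r suffix) and column-order (c suffix) layouts mentioned above can be illustrated with plain arrays. The helpers below are hypothetical and do not call the Cg runtime; they only show how the same 4x4 matrix is addressed under each layout.

```c
#include <assert.h>

/* Element (row, col) of a 4x4 matrix stored in row order:
   the four values of a row are adjacent in memory. */
static float get_row_order(const float m[16], int row, int col)
{
    assert(row >= 0 && row < 4 && col >= 0 && col < 4);
    return m[row * 4 + col];
}

/* Element (row, col) of a 4x4 matrix stored in column order:
   the four values of a column are adjacent in memory. */
static float get_column_order(const float m[16], int row, int col)
{
    assert(row >= 0 && row < 4 && col >= 0 && col < 4);
    return m[col * 4 + row];
}

/* Re-lay a row-order matrix in column order. */
static void row_to_column_order(const float src[16], float dst[16])
{
    for (int r = 0; r < 4; ++r)
        for (int c = 0; c < 4; ++c)
            dst[c * 4 + r] = src[r * 4 + c];
}
```

An application whose matrices are already stored in column order would presumably pick the c-suffixed functions and skip a conversion like row_to_column_order.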
243. re them in the program itself. Instead the compiler will issue, as comments, a list of program parameter registers and the constants that need to be loaded into them. The Cg run-time system will handle loading the constants, as directed by the compiler.

Note: If the Cg run-time system is not used, it is the responsibility of the programmer to make sure that the constants are loaded properly.

Bindings

Binding Semantics for Uniform Data

Table 31 summarizes the valid binding semantics for uniform parameters in the vs_1_1 profile.

Table 31 vs_1_1 Uniform Input Binding Semantics

    Binding Semantics Name        Corresponding Data
    register(c0)-register(c95)    Constant register [0..95].
    C0-C95                        The aliases c0-c95 (lowercase) are also accepted.
                                  If used with a variable that requires more than
                                  one constant register (for example, a matrix),
                                  the semantic specifies the first register that
                                  is used.

Binding Semantics for Varying Input/Output Data

Table 32 summarizes the valid binding semantics for varying input parameters in the vs_1_1 profile. These map to the input registers in DirectX 8.1 vertex shaders.

Table 32 vs_1_1 Varying Input Binding Semantics

    Binding Semantics Name    Corresponding Data
    POSITION                  Vertex shader input register: v0
    BLENDWEIGHT               Vertex shader input register: v1
    BLENDINDICES              Vertex shader input register: v2
    NORMAL
244. re element) of type float4x4 with an input binding semantic that causes it to track the fixed-function modelview-projection matrix. (The name of this binding semantic is currently profile-specific; for OpenGL profiles, the semantic _GL_MVP is recommended.)

Appendix A Cg Language Specification

- If the first condition is met but not the second, the compiler is encouraged to issue a warning.
- Implementations may choose to recognize more general versions of the second condition (such as the variables being copy-propagated from the original inputs and outputs), but this additional generality is not required.

Binding Semantics for Outputs

As shown in Table 8, there are two output binding semantics for vertex program profiles.

Table 8 Vertex Output Binding Semantics

    Name        Meaning                                                 Type      Default Value
    POSITION    Homogeneous clip-space position; fed to rasterizer      float4    Undefined
    PSIZE       Point size                                              float     Undefined

Profiles may define additional output binding semantics with specific behaviors, and these definitions are expected to be consistent across commonly used profiles.

Fragment Program Profiles

A few features of the Cg language that are specific to fragment program profiles are required to be implemented in the same manner for all fragment program profiles.

Binding Semantics for Outputs

As shown in Table 9, there are three output binding semantics fo
245. rent kinds of inputs:

- Varying inputs are used for data that is specified with each element of the stream of input data. For example, the varying inputs to a vertex program are the per-vertex values that are specified in vertex arrays. For a fragment program, the varying inputs are the interpolants, such as texture coordinates.
- Uniform inputs are used for values that are specified separately from the main stream of input data, and don't change with each stream element. For example, a vertex program typically requires a transformation matrix as a uniform input. Often, uniform inputs are thought of as graphics state.

Varying Inputs to a Vertex Program

A vertex program typically consumes several different per-vertex (varying) inputs. For example, the program might require that the application specify the following varying inputs for each vertex, typically in a vertex array:

- Model space position
- Model space normal vector
- Texture coordinate

In a fixed-function graphics pipeline, the set of possible per-vertex inputs is small and predefined. This predefined set of inputs is exposed to the application through the graphics API. For example, OpenGL 1.4 provides the ability to specify a vertex array of normal vectors.

In a programmable graphics pipeline, there is no longer a small set of predefined inputs. It is perfectly reasonable for the developer to write a vertex program that uses a per-vertex refractive index value as long as t
246. ression is a compile-time boolean constant, a uniform scalar boolean or a scalar comparison to a constant value in the range [-0.5, 1.0] (for example, a > 0.5 ? b : c).

- do, for and while loops are supported only when they can be completely unrolled.
- Arrays, vectors, and matrices may be indexed only by compile-time constant values or index variables in loops that can be completely unrolled.
- The discard statement is not supported. The similar but less general clip() function is supported.
- The use of an allocation-rule-identifier for an input or output struct is optional.

Standard Library Functions

Because the fp20 profile has limited capabilities, not all of the Cg standard library functions are supported.

Table 45 presents the Cg standard library functions that are supported by this profile. See the standard library documentation for descriptions of these functions.

Table 45 Supported Standard Library Functions

    dot(floatN, floatN)
    lerp(floatN, floatN, floatN)
    lerp(floatN, floatN, float)
    tex1D(sampler1D, float)
    tex1D(sampler1D, float2)
    tex1Dproj(sampler1D, float2)
    tex1Dproj(sampler1D, float3)
    tex2D(sampler2D, float2)
    tex2D(sampler2D, float3)
    tex2Dproj(sampler2D, float3)
    tex2Dproj(sampler2D, float4)
    texRECT(samplerRECT, float2)

Cg Language Toolkit

Table 45 Supported Standard Library Functions (continued)
247. revious texture operation, intermediate_coord1 are texture coordinates associated with the n-2 texture unit, and intermediate_coord2 are texture coordinates associated with the n-1 texture unit. This function can be used to generate the texm3x3pad/texm3x3pad/texm3x3tex instruction combination in all ps_1_x profiles.

Table 40 ps_1_x Auxiliary Texture Functions (continued)

Texture Function:

    texCUBE_reflect_dp3x3(uniform samplerCUBE tex,
                          float4 strq,
                          float4 intermediate_coord1,
                          float4 intermediate_coord2,
                          float4 prevlookup)

Description:

Performs the following:

    float3 E = float3(intermediate_coord2.w, intermediate_coord1.w, strq.w);
    float3 N = float3(dot(intermediate_coord1.xyz, prevlookup.xyz),
                      dot(intermediate_coord2.xyz, prevlookup.xyz),
                      dot(strq.xyz, prevlookup.xyz));
    return texCUBE(tex, 2 * dot(N, E) / dot(N, N) * N - E);

where strq are texture coordinates associated with sampler tex, prevlookup is the result of a previous texture operation, intermediate_coord1 are texture coordinates associated with the n-2 texture unit, and intermediate_coord2 are texture coordinates associated with the n-1 texture unit. This function can be used to generate the texm3x3pad/texm3x3pad/texm3x3vspec instruction combination in all ps_1_x profiles.
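The lookup direction used by texCUBE_reflect_dp3x3 is the reflection of the eye vector E about the normal N, written so that N does not have to be unit length. The following is a minimal CPU-side sketch of that formula in C. The `vec3` type and the function names are hypothetical illustrations and belong to neither Cg nor any runtime.

```c
/* Illustrative CPU-side version of the reflection formula
   R = 2 * dot(N, E) / dot(N, N) * N - E.
   All names here are hypothetical. */
typedef struct { float x, y, z; } vec3;

static float dot3(vec3 a, vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

vec3 reflect_about_normal(vec3 n, vec3 e)
{
    /* Dividing by dot(N, N) compensates for a non-unit-length N. */
    float k = 2.0f * dot3(n, e) / dot3(n, n);
    vec3 r = { k * n.x - e.x, k * n.y - e.y, k * n.z - e.z };
    return r;
}
```

For example, with N = (0, 0, 2) and E = (1, 0, 1), the factor k is 1 and the reflected vector is (-1, 0, 1): the component along N is kept and the component perpendicular to N is negated.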
248. riables. In this case, the homogeneous position information resides in the hardware register corresponding to POSITION and the color information resides in the hardware register corresponding to COLOR.

Passing Arguments

Now let's take a look at the body of the program, section by section, starting with the declaration of main():

    vertout main(appin IN,
                 uniform float4x4 ModelViewProj,
                 uniform float4x4 ModelViewIT,
                 uniform float4 LightVec)

As required for a vertex program, main() takes an application-to-vertex structure as input and returns a vertex-to-fragment structure. In this case, we are using the two structure types we have already defined: appin and vertout. Notice that main() takes in three uniform parameters: two matrices and one vector. All three parameters are passed to simple.cg by the application, using the run-time library.

The first matrix, ModelViewProj, is the concatenation of the modelview and projection matrices. Together, these matrices transform points from model space to clip space. The second matrix, ModelViewIT, is the inverse transpose of the modelview matrix. The third parameter, LightVec, is a vector that specifies the location of the light source.

Basic Transformations

Now we start the body of the vertex program:

    vertout OUT;

    OUT.HPosition = mul(ModelViewProj, IN.Position);

A vertex program is responsible for calculating the homogeneous clip-space position of the vertex, given the vertex's model
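The transformation performed by mul(ModelViewProj, IN.Position) is an ordinary product of a 4x4 matrix and a 4-component column vector. As a rough CPU-side sketch in C, assuming the matrix is stored in row order, it could be written as follows. The function name `mul_mat4_vec4` is hypothetical and is not part of Cg.

```c
/* Hypothetical CPU-side equivalent of mul(M, v) for a 4x4 matrix stored
   in row order and a 4-component column vector. */
void mul_mat4_vec4(const float m[4][4], const float v[4], float out[4])
{
    for (int r = 0; r < 4; ++r) {
        out[r] = 0.0f;
        /* Each output component is the dot product of one matrix row
           with the input vector. */
        for (int c = 0; c < 4; ++c) {
            out[r] += m[r][c] * v[c];
        }
    }
}
```

With a matrix that translates by (1, 2, 3) and the input position (5, 6, 7, 1), the result is (6, 8, 10, 1). The w component of 1 is what lets the translation column take effect.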
249. rns a 4-vector as follows:

- The x component of the result vector contains the ambient coefficient, which is always 1.0.
- The y component contains the diffuse coefficient, which is zero if (n · l) < 0; otherwise (n · l).
- The z component contains the specular coefficient, which is zero if either (n · l) < 0 or (n · h) < 0; (n · h)^m otherwise.
- The w component is 1.0.

There is no vectorized version of this function.

    Function             Description
    log(x)               Natural logarithm ln(x); x must be greater than zero.
    log2(x)              Base 2 logarithm of x; x must be greater than zero.
    log10(x)             Base 10 logarithm of x; x must be greater than zero.
    max(a, b)            Maximum of a and b.

Table 1 Mathematical Functions (continued)

    Function             Description
    min(a, b)            Minimum of a and b.
    modf(x, out ip)      Splits x into integral and fractional parts, each with the
                         same sign as x. Stores the integral part in ip and returns
                         the fractional part.
    mul(M, N)            Matrix product of matrix M and matrix N.
                         (Matrix diagram not recoverable from the source.)
                         If M has size AxB, and N has size BxC, returns a matrix
                         of size AxC.
    mul(M, v)            Product of matrix M and column vector v.
                         (Matrix diagram not recoverable from the source.)
                         If M is an AxB matrix and v is a Bx1 vector, returns an
                         Ax1 vector.
    mul(v, M)
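The four components that lit() returns can be reproduced on the CPU from the rules listed above. This is only a sketch under stated assumptions: the `float4` struct and the name `lit_sketch` are hypothetical, and the exponent m is restricted to a non-negative integer so that no math library is needed, a restriction the real lit() does not have.

```c
typedef struct { float x, y, z, w; } float4;

/* CPU-side sketch of lit(NdotL, NdotH, m). Assumption: m is a
   non-negative integer, which keeps the sketch free of math-library
   calls; the Cg function itself takes a scalar exponent. */
float4 lit_sketch(float n_dot_l, float n_dot_h, unsigned m)
{
    float4 r = { 1.0f, 0.0f, 0.0f, 1.0f };  /* ambient and w are always 1.0 */

    /* Diffuse: zero when the light is behind the surface. */
    if (n_dot_l >= 0.0f) {
        r.y = n_dot_l;
    }

    /* Specular: zero if either dot product is negative,
       otherwise (n . h) raised to the power m. */
    if (n_dot_l >= 0.0f && n_dot_h >= 0.0f) {
        float power = 1.0f;
        for (unsigned i = 0; i < m; ++i) {
            power *= n_dot_h;
        }
        r.z = power;
    }
    return r;
}
```

For inputs 0.5, 0.5 and m = 2 this gives (1.0, 0.5, 0.25, 1.0). A negative first argument zeroes both the diffuse and the specular terms.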
250. rocessor. We refer to these programs as vertex programs and fragment programs, respectively. (Fragment programs are also known as pixel programs or pixel shaders, and we use these terms interchangeably in this document.) Cg code can be compiled into GPU assembly code, either on demand at run time or beforehand.

Introduction to the Cg Language

Cg makes it easy to combine a Cg fragment program with a handwritten vertex program, or even with the non-programmable OpenGL or DirectX vertex pipeline. Likewise, a Cg vertex program can be combined with a handwritten fragment program, or with the non-programmable OpenGL or DirectX fragment pipeline.

Cg Language Profiles

Because all CPUs support essentially the same set of basic capabilities, the C language supports this set on all CPUs. However, GPU programmability has not quite yet reached this same level of generality. For example, the current generation of programmable vertex processors supports a greater range of capabilities than do the programmable fragment processors. Cg addresses this issue by introducing the concept of language profiles. A Cg profile defines a subset of the full Cg language that is supported on a particular hardware platform or API. The current release of the Cg compiler supports the following profiles:

- DirectX 9 vertex shaders
  Runtime profiles: CG_PROFILE_VS_2_X, CG_PROFILE_VS_2_0
  Compiler options: -profile vs_2_x, -profile vs_2_0
- Dire
251. rs is set by some function of the Direct3D Cg runtime, it is immediately downloaded to the GPU constant memory (the memory containing the values of all the uniform parameters). When parameter shadowing is turned on, the value is shadowed instead and no Direct3D call is made at the time it is set; only when the program is bound are all of its parameters actually downloaded to the constant memory. This means that a parameter value set after binding the program is not used during the execution of the program until the next time the program is bound. Parameter shadowing applies to all parameter settings including texture state stage and texture mode.

Disabling parameter shadowing allows the runtime to consume less memory, but forces the application to do the work of making sure that the constant memory contains all the right values every time it activates a program.

OpenGL Cg Runtime

This section discusses setting parameters and program execution for the OpenGL Cg runtime.

Setting Parameters in OpenGL

In accordance with the OpenGL convention, many of the functions described below come in two versions: a version operating on float values, marked with an f, and a version operating on double values, marked with a d.

Setting Uniform Scalar and Uniform Vector Parameters

To set the values of scalar parameters or vector parameters, use the cgGLSetParameter functions:

void void 
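The difference between the two modes can be pictured with a toy model. The sketch below makes no Direct3D or Cg calls: `toy_program`, `toy_set_parameter` and `toy_bind` are hypothetical names, and a plain array stands in for GPU constant memory. It only illustrates when a value set by the application becomes visible to the program.

```c
#include <string.h>

#define NUM_CONSTANTS 4

/* Toy model of parameter shadowing. All names are hypothetical. */
typedef struct {
    float gpu_constants[NUM_CONSTANTS];  /* stands in for GPU constant memory */
    float shadow[NUM_CONSTANTS];         /* runtime-side copy of the values   */
    int   shadowing_enabled;
} toy_program;

void toy_set_parameter(toy_program *p, int index, float value)
{
    if (p->shadowing_enabled) {
        p->shadow[index] = value;         /* remembered, not downloaded yet */
    } else {
        p->gpu_constants[index] = value;  /* downloaded immediately */
    }
}

void toy_bind(toy_program *p)
{
    /* With shadowing on, binding is the moment all values are downloaded. */
    if (p->shadowing_enabled) {
        memcpy(p->gpu_constants, p->shadow, sizeof p->shadow);
    }
}
```

In this model a value set while shadowing is enabled leaves `gpu_constants` untouched until `toy_bind` runs, which mirrors the rule that a parameter set after binding is not used until the next bind. The extra `shadow` array is also why shadowing costs memory.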
252. ructions.

The underlying instruction set and machine architecture limit programmability in this profile compared to what is allowed by Cg constructs. Thus, this profile places additional restrictions on what can and cannot be done in a Cg program.

Restrictions

A Cg program in one of these profiles is limited to generating a maximum of four texture shader instructions and eight register combiner instructions. Since these numbers are quite small, users need to be very aware of this limitation while writing Cg code for these profiles.

The fp20 profile also restricts when a texture shader operation or arithmetic operation can occur in the program. A texture shader operation may not have any dependency on the output of an arithmetic operation unless

- the arithmetic operation is a valid input modifier for the texture shader operation, or
- the arithmetic operation is part of a complex texture shader operation, which are summarized in the section "Auxiliary Texture Functions" on page 251.

9. For more details about the underlying instruction sets, their capabilities, and their limitations, please refer to the NV_texture_shader and NV_register_combiners extensions in the OpenGL Extensions documentation.

Modifiers

There are certain simple arithmetic operations that can be applied to inputs of texture shader operations and to inputs and outputs of arithmetic operations
253. s, like
    // this one, you can create a vertex declaration
    // using those semantics.
    DWORD declaration[] = {
        D3DVSD_STREAM(0),
        D3DVSD_REG(D3DVSDE_POSITION, D3DVSDT_FLOAT3),
        D3DVSD_REG(D3DVSDE_DIFFUSE, D3DVSDT_D3DCOLOR),
        D3DVSD_REG(D3DVSDE_TEXCOORD0, D3DVSDT_FLOAT2),
        D3DVSD_END()
    };

    // Make sure the resulting declaration is compatible with
    // the shader. This is really just a sanity check.
    assert(cgD3D8ValidateVertexDeclaration(vertexProgram, declaration));

    // Create the shader handle using the declaration.
    device->CreateVertexShader(declaration,
        byteCode->GetBufferPointer(), &vertexShader, 0);

    // Create the pixel shader.
    // (The entry-point argument of this call is illegible in the source.)
    fragmentProgram = cgCreateProgramFromFile(context,
        CG_SOURCE, "FragmentProgram.cg",
        CG_PROFILE_PS_1_1, /* entry point name */, 0);
    {
        CComPtr<ID3DXBuffer> byteCode;
        const char* progSrc = cgGetProgramString(fragmentProgram,
            CG_COMPILED_PROGRAM);
        D3DXAssembleShader(progSrc, strlen(progSrc), 0, 0, 0,
            &byteCode, 0);
        device->CreatePixelShader(byteCode->GetBufferPointer(),
            &pixelShader);
    }

    // Grab some parameters.
    modelViewMatrix = cgGetNamedParameter(vertexProgram, "ModelViewMatrix");
    baseTexture = cgGetNamedParameter(fragmentProgram, "BaseTexture");
    someColor = cgGetNamedParameter(fragmentProgram, "SomeColor");

    // Sanity check t
254. s (that is, (lightVec + eyeVec)/2). We normalize halfVec, so we don't need to bother with the division by two, because it cancels out after normalization anyway. In this example, we assume that the eye is at (0, 0, 1), but an application would typically pass the eye position also as a uniform parameter, since it would be unchanged from vertex to vertex. We use Cg's inline vector construction capability to build a 3-component float vector that contains the eye position, and then we assign this value to eyeVec.

Calculating the Vertex Color

Now we have to calculate the vertex color to output.

Calculating the Diffuse and Specular Lighting Contributions

In this example, we're going to calculate just a simple combination of diffuse and specular lighting:

    // calculate diffuse component
    float diffuse = dot(normalVec, lightVec);

1. Because LightVec is uniform, it is more efficient to normalize it once in the application rather than on a per-vertex basis. It is done here for illustrative purposes.

A Brief Tutorial

    // calculate specular component
    float specular = dot(normalVec, halfVec);

    // Use the lit function to compute lighting vector from
    // diffuse and specular values
    float4 lighting = lit(diffuse, specular, 32);

Here we use the Cg Standard Library to perform dot products, using dot(). We also make use of the Standard Library's lit() function to calculate a Blinn-style lighting vector based on th
255. s HdotN and LdotN per vertex to look up into a 2D texture to achieve interesting lighting effects.

Figure 13 Example of Anisotropic Lighting

Basic Profile Sample Shaders

Vertex Shader Source Code for Anisotropic Lighting

    struct appdata {
        float3 Position : POSITION;
        float3 Normal : NORMAL;
    };

    struct vpconn {
        float4 Hposition : POSITION;
        float4 TexCoord0 : TEXCOORD0;
    };

    vpconn main(appdata IN,
                uniform float4x4 WorldViewProj,
                uniform float3x3 WorldIT,
                uniform float3x4 World,
                uniform float3 LightVec,
                uniform float3 EyePos)
    {
        vpconn OUT;
        float3 worldNormal = normalize(mul(WorldIT, IN.Normal));

        // build float4
        float4 tempPos;
        tempPos.xyz = IN.Position.xyz;
        tempPos.w = 1.0;

        // compute world space position
        float3 worldSpacePos = mul(World, tempPos);

        // vector from vertex to eye, normalized
        float3 vertToEye = normalize(EyePos - worldSpacePos);

        // h = normalize(l + e)
        float3 halfAngle = normalize(vertToEye + LightVec);

        OUT.TexCoord0.x = max(dot(LightVec, worldNormal), 0.0);
        OUT.TexCoord0.y = max(dot(halfAngle, worldNormal), 0.0);

        // transform into homogeneous-clip space
        OUT.Hposition = mul(WorldViewProj, tempPos);

        return OUT;
    }

Bump Dot3x2 Diffuse and Specular

Description

The bump dot3x2 diffuse and specular effect mixes bump mapping with diffuse and specular lighting based on the 
256. s a single-pass shader containing diffuse, specular, and environmental lighting effects in a compact, fast-executing package.

Figure 8 Example of MultiPaint

Vertex Shader Source Code for MultiPaint

    // define inputs from vertex buffer
    struct appin {
        float4 Position : POSITION;
        float4 UV       : TEXCOORD0;
        float4 Tangent  : TEXCOORD1;
        float4 Binormal : TEXCOORD2;
        float4 Normal   : TEXCOORD3;
    };

    // output -- same struct is the input to the fragment program
    struct MultiPaintV2F {
        float4 HPosition : POSITION;  // position (clip space)
        float4 TexCoords : TEXCOORD0; // base ST coordinates
        float3 OPosition : TEXCOORD1; // position (obj space)
        float3 Normal    : TEXCOORD2; // normal (eye space)
        float3 VPosition : TEXCOORD3; // view pos (obj space)
        float3 T         : TEXCOORD4; // tangent (obj space)
        float3 B         : TEXCOORD5; // binormal (obj space)
        float3 N         : TEXCOORD6; // normal (obj space)
        float4 LightVecO : TEXCOORD7; // light dir (obj space)
    };

    MultiPaintV2F main(appin IN,
                       uniform float4x4 ModelViewProj,
                       uniform float4x4 ModelViewIT,
                       uniform float4x4 ModelViewI,
                       uniform float4 TexRepeats,
                       uniform float4 LightVec) // eye space; vector width illegible in source, float4 assumed
    {
        MultiPaintV2F OUT;
        OUT.HPosition = mul(ModelViewProj, IN.Position);
        OUT.OPosition = IN.Position.xyz;
        OUT.TexCoords = IN.UV * TexRepeats;
        // transform normal to eye space
        OUT.Normal = normalize(mul(ModelViewIT,
257. s allow the programmer to decide which constant register a uniform variable will reside in by specifying the C<n>/register(c<n>) binding semantic. This is not allowed in the fp20 profile since the NV_register_combiners extension does not have a single bank of constant registers. While the NV_register_combiners extension does describe constant registers, these constant registers are per-combiner stage and specifying bindings to them in the program would overly constrain the compiler.

Binding Semantics for Varying Input/Output Data

The varying input binding semantics in the fp20 profile are the same as the varying output binding semantics of the vp20 profile.

Varying input binding semantics in the fp20 profile consist of COLOR0, COLOR1, TEXCOORD0, TEXCOORD1, TEXCOORD2 and TEXCOORD3. These map to output registers in vertex shaders.

Table 48 summarizes the valid binding semantics for varying input parameters in the fp20 profile.

Table 48 fp20 Varying Input Binding Semantics

    Binding Semantics Name            Corresponding Data
    COLOR, COLOR0, COL, COL0          Input color value v0
    COLOR1, COL1                      Input color value v1
    TEXCOORD0-TEXCOORD3, TEX0-TEX3    Input texture coordinates t0-t3
    FOGP, FOG                         Input fog color and factor

Additionally, the fp20 profile allows POSITION, PSIZE, TEXCOORD4, TEXCOORD5, TEXCOORD6, and TEXCOORD7 to be specified on varying inputs
258. s formal parameters, and each of the excess parameters has a default value, do not eliminate the function.

4. If the set is empty, fail. For each actual parameter expression in sequence, perform the following:

   a. If the type of the actual parameter matches the unqualified type of the corresponding formal parameter in any function in the set, remove all functions whose corresponding parameter does not match exactly.

   b. If there is a defined promotion for the type of the actual parameter to the unqualified type of the formal parameter of any function, remove all functions for which this is not true from the set.

   c. If there is a valid implicit cast that converts the type of the actual parameter to the unqualified type of the formal parameter of any function, remove all functions without this cast.

   d. Fail.

5. Choose a function based on profile:

   a. If there is at least one function with a profile that exactly matches the compilation profile, discard all functions that don't exactly match.

   b. Otherwise, if there is at least one function with a wildcard profile that matches the compilation profile, determine the "most specific" matching wildcard profile in the candidate set. Discard all functions except those with this most specific wildcard profile. How "specific" a given wildcard profile name is relative to a particular profile is determined by the profile specification.
259. s int, the other operand is converted to int.

7. Otherwise, both operands have type cint.

Note that conversions happen prior to performing the operation.

Assignment

Assignment of an expression to an object or compile-time typed value converts the expression to the type of the object or value. The resulting value is then assigned to the object or value.

The value of the assignment expressions (=, *=, and so on) is defined as in C: An assignment expression has the value of the left operand after the assignment but is not an lvalue. The type of an assignment expression is the type of the left operand unless the left operand has a qualified type, in which case it is the unqualified version of the type of the left operand. The side effect of updating the stored value of the left operand occurs between the previous and the next sequence point.

Smearing of Scalars to Vectors

If a binary operator is applied to a vector and a scalar, the scalar is automatically type-promoted to a same-sized vector by replicating the scalar into each component. The ternary ?: operator also supports smearing. The binary rule is applied to the second and third operands first, and then the binary rule is applied to this result and the first operand.

Namespaces

Just as in C, there are two namespaces. Each has multiple scopes, as in C.

- Tag namespace, which consists of struct tags
- Regu
260. s may be trademarks of the respective companies with which they are associated.

Updates

Any changes, additions, or corrections will be posted at the NVIDIA Cg Web site:

http://developer.nvidia.com/Cg

Refer to this site often to keep up on the latest changes and additions to the Cg language.

Copyright

Copyright NVIDIA Corporation 2002

NVIDIA Corporation
2701 San Tomas Expressway
Santa Clara, CA 95050
www.nvidia.com

Foreword
Preface
    Release Notes
    Online Updates
Introduction to the Cg Language
    The Cg Language
    Cg's Programming Model for GPUs
    Cg Language Profiles
    Declaring Programs in Cg
    Program Inputs and Outputs
    Working with Data
    Basic Data Types
    Type Conversions
    Structures
    (entry illegible in source)
    Statements and Operators
    (entry illegible in source)
    Function Definitions and Function Overloading
261. s one sign bit, a 23-bit mantissa, and an 8-bit exponent. This type is supported in all profiles, although the DirectX 8 pixel profiles implement it with reduced precision and range for some operations.

- half
  A 16-bit IEEE-like floating-point (s10e5) number.
- int
  A 32-bit integer. Profiles may omit support for this type or have the option to treat int as float.
- fixed
  A 12-bit fixed-point (s1.10) number. It is supported in all fragment profiles.
- bool
  Boolean data is produced by comparisons and is used in if and conditional operator (?:) constructs. This type is supported in all profiles.
- sampler*
  The handle to a texture object comes in six variants: sampler, sampler1D, sampler2D, sampler3D, samplerCUBE, and samplerRECT. These types are supported in all pixel and fragment profiles, with one exception: samplerRECT is not supported in the DirectX profiles.

Cg also includes built-in vector data types that are based on the basic data types. A sample of these built-in vector data types includes (but is not limited to) the following:

    float4 float3 float2 float1
    bool4  bool3  bool2  bool1

Additional support is provided for matrices of up to four by four elements. Here are some examples of matrix declarations:

    float1x1 matrix1; // One element matrix
    float2x3 matrix2; // Two by three matrix (six elements)
    float4x2 matrix3; // Four by two matrix
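To get a feel for the precision of the fixed type, the sketch below quantizes a float to an s1.10 value on the CPU. It rests on assumptions the text does not state: the 12 bits are read as a two's complement number with 10 fractional bits, giving a range of [-2, 2) in steps of 1/1024, and values are rounded to the nearest step and clamped at the ends. The function name `quantize_s1_10` is hypothetical, and actual hardware rounding may differ.

```c
/* Sketch of quantizing a float to an s1.10 fixed-point value.
   Assumptions (not stated in the manual): two's complement layout,
   range [-2, 2), step 1/1024, round to nearest, clamp out-of-range. */
float quantize_s1_10(float x)
{
    const float scale   = 1024.0f;  /* 2^10 fractional steps per unit */
    const int   min_raw = -2048;    /* represents -2.0                */
    const int   max_raw =  2047;    /* represents 2.0 - 1/1024        */
    float scaled = x * scale;
    int raw;

    if (scaled <= (float)min_raw) {
        raw = min_raw;
    } else if (scaled >= (float)max_raw) {
        raw = max_raw;
    } else {
        /* Round half away from zero, then truncate toward zero. */
        raw = (int)(scaled + (scaled >= 0.0f ? 0.5f : -0.5f));
    }
    return (float)raw / scale;
}
```

Under these assumptions 0.5 is represented exactly, 3.0 clamps to 2047/1024, and anything smaller in magnitude than half a step collapses to zero, which is why color-like values suit this type better than positions.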
262. s scalar and vector), the scalar is "smeared" to create a vector of the necessary size to perform an elementwise operation. Thus,

    a * float3(A, B, C) is equal to float3(a*A, a*B, a*C)

The built-in arithmetic operators do not currently support matrix operands. It is important to remember that matrices are not the same as vectors, even if their dimensions are the same.

Multiplication Functions

Cg's mul() functions are for multiplying matrices by vectors, and matrices by matrices:

    // Matrix by column-vector multiply
    matrix-column vector: mul(M, v);

    // Row-vector by matrix multiply
    row vector-matrix: mul(v, M);

    // Matrix by matrix multiply
    matrix-matrix: mul(M, N);

It is important to use the correct version of mul(). Otherwise, you are likely to get unexpected results. More detail on the mul() functions is provided in "Cg Standard Library Functions" on page 19.

Vector Constructor

Cg allows vectors (up to size 4) to be constructed using the following notation:

    y = x * float4(3.0, 2.0, 1.0, -1.0);

The vector constructor can appear anywhere in an expression.

Boolean and Comparison Operators

Cg includes three of the standard C boolean operators:

    &&    logical AND
    ||    logical OR
    !     logical negation

In C, these operators consume and produce values of type int, but in Cg they consume and produce values of type bool. This difference is
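Smearing can be spelled out step by step in C, where nothing happens implicitly. In the sketch below the scalar is first replicated into every component and the multiplication is then done elementwise, which is what the rule describes for a scalar times a three-component vector. The `float3` struct and the helper names are hypothetical stand-ins for the Cg built-ins.

```c
typedef struct { float x, y, z; } float3;

/* Replicate a scalar into every component ("smearing"). */
float3 smear(float a)
{
    float3 v = { a, a, a };
    return v;
}

/* Elementwise product of two same-sized vectors. */
float3 mul_elementwise(float3 p, float3 q)
{
    float3 r = { p.x * q.x, p.y * q.y, p.z * q.z };
    return r;
}

/* Equivalent of the Cg expression a * v for a scalar a and a float3 v. */
float3 scale(float a, float3 v)
{
    return mul_elementwise(smear(a), v);
}
```

Calling `scale(2.0f, v)` with v = (1, 2, 3) yields (2, 4, 6), matching the rule that a scalar times float3(A, B, C) equals float3(a*A, a*B, a*C).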
263. s the tex2D() function to perform a 2D texture lookup to determine the fragment's RGBA color:

    void applytex(uniform sampler2D mytexture,
                  float2 uv : TEXCOORD0,
                  out float4 outcolor : COLOR)
    {
        outcolor = tex2D(mytexture, uv);
    }

Cg provides a wide variety of texture lookup functions, a sample of which is given below. For a complete list see "Texture Map Functions" on page 25.

- Standard nonprojective texture lookup:

    tex2D(sampler2D tex, float2 s)
    texRECT(samplerRECT tex, float2 s)
    texCUBE(samplerCUBE tex, float3 s)

- Standard projective texture lookup:

    tex2Dproj(sampler2D tex, float3 sq)
    texRECTproj(samplerRECT tex, float3 sq)
    texCUBEproj(samplerCUBE tex, float4 sq)

- Nonprojective texture lookup with user-specified filter kernel size:

    tex2D(sampler2D tex, float2 s, float2 dsdx, float2 dsdy)
    texRECT(samplerRECT tex, float2 s, float2 dsdx, float2 dsdy)
    texCUBE(samplerCUBE tex, float3 s, float3 dsdx, float3 dsdy)

  The filter size is specified by providing the derivatives of the texture coordinates with respect to pixel coordinates x (dsdx) and y (dsdy). For more information see "Texture Map Functions" on page 25.

- Shadowmap lookup:

    tex2Dproj(sampler2D tex, float4 szq)
    texRECTproj(samplerRECT tex, float4 szq)

  In these functions, the z component of the texture coordinate holds a depth value to be compared against the shadowmap. Shadowmap 
264. secolor : COLOR0;
        float4 uv0 : TEXCOORD0;
        float4 uv1 : TEXCOORD1;
    };

    fragout bar(myvf indata) {
        float4 x = indata.uv0;
        // ...
    }

The following binding semantics are available in all Cg vertex profiles for output from vertex programs: POSITION, PSIZE, FOG, COLOR0-COLOR1, and TEXCOORD0-TEXCOORD7.

All vertex programs must declare and set a vector output that uses the POSITION binding semantic. This value is required for rasterization.

To ensure interoperability between vertex programs and fragment programs, both must use the same struct for their respective outputs and inputs. For example:

    struct myvert2frag {
        float4 pos : POSITION;
        float4 uv0 : TEXCOORD0;
        float4 uv1 : TEXCOORD1;
    };

    // Vertex program
    myvert2frag vertmain(...) {
        myvert2frag outdata;
        // ...
        return outdata;
    }

    // Fragment program
    void fragmain(myvert2frag indata) {
        float4 tcoord = indata.uv0;
        // ...
    }

Note that values associated with some vertex output semantics are intended for and are used by the rasterizer. These values cannot actually be used in the fragment program, even though they appear in the input struct. For example, the indata.pos value associated with the POSITION fragment semantic may not be read in the fragmain shader.

Varying Outputs from Fragment Programs

Binding semantics are always required on the outputs of fragment programs. Fragment programs are required to decl
265. sform
        // from tangent to cube space
        float4 TangentToCubeSpace1 : TEXCOORD2;

        // third row of the 3x3 transform
        // from tangent to cube space
        float4 TangentToCubeSpace2 : TEXCOORD3;
    };

    // (The return type, function name, and first parameter of this
    // declaration, and the declaration of OUT, are illegible in the source.)
        uniform float4x4 WorldViewProj,
        uniform float3x4 ObjToCubeSpace,
        uniform float3 EyePosition, // in cube space
        uniform float BumpScale)
    {
        // pass texture coordinates for
        // fetching the normal map
        OUT.TexCoord.xy = IN.TexCoord.xy;

        // compute 3x3 transform from tangent to object space
        float3x3 objToTangentSpace;

        // first rows are the tangent and binormal
        // scaled by the bump scale
        objToTangentSpace[0] = BumpScale * IN.T;
        objToTangentSpace[1] = BumpScale * IN.B;
        objToTangentSpace[2] = IN.N;

        // compute the 3x3 transform from
        // tangent space to cube space:
        // TangentToCubeSpace
        //   = object2cube * tangent2object
        //   = object2cube * transpose(objToTangentSpace)
        // (since the inverse of a rotation is its transpose)
        //
        // So a row of TangentToCubeSpace is the transform by
        // objToTangentSpace of the corresponding row of
        // ObjToCubeSpace
        OUT.TangentToCubeSpace0.xyz =
            mul(objToTangentSpace, ObjToCubeSpace[0].xyz);
        OUT.TangentToCubeSpace1.xyz =
            mul(objToTangentSpace, ObjToCubeSpace[1].xyz);
        OUT.TangentToCubeSpace2.xyz =
            mul(objToTangentSpace, ObjToCubeSpace[2].xyz);

        // compute the eye vector
266. so introduces a few new ideas. In particular, it includes features designed to represent data flow in stream-processing architectures such as GPUs. Profiles, which are specified at compile time, may subset certain features of the language, including the ability to implement loops and the precision at which certain computations are performed.

Silent Incompatibilities

Most of the changes from ANSI C are either omissions or additions, but there are a few potentially silent incompatibilities. These are changes within Cg that could cause a program that compiles without errors to behave in a manner different from C.

- The type promotion rules for constants are different when the constant is not explicitly typed using a type cast or type suffix. In general, a binary operation between a constant that is not explicitly typed and a variable is performed at the variable's precision, rather than at the constant's default precision.
- Declarations of struct perform an automatic typedef (as in C++) and thus could override a previously declared type.
- Arrays are first-class types that are distinct from pointers. As a result, array assignments semantically perform a copy operation for the entire array.

Similar Operations That Must be Expressed Differently

There are several changes that force the same operation to be expressed differently in Cg than in C.

- A Boolean type, bool, is introduced, with corresponding implications for operators and contro
267. source code to vertex programs for use by the NV_vertex_program OpenGL extension.

- Profile name: vp20
- How to invoke: Use the compiler option -profile vp20.

This section describes the capabilities and restrictions of Cg when using the vp20 profile.

Overview

The vp20 profile limits Cg to match the capabilities of the NV_vertex_program extension. NV_vertex_program has the same capabilities as DirectX 8 vertex shaders, so the limitations that this profile places on the Cg source code written by the programmer are the same as those of the DirectX VS 1.1 shader profile.

Aside from the syntax of the compiler output, the only difference between the vp20 Vertex Shader profile and the DirectX VS 1.1 profile is that the vp20 profile supports two additional outputs: BCOL0 (for back-facing primary color) and BCOL1 (for back-facing secondary color).

Position Invariance

- The vp20 profile supports position invariance, as described in the core language specification.
- The modelview-projection matrix must be specified using a binding semantic of GL_MVP.

7. To understand the NV_vertex_program and the code produced by the compiler using the vp20 profile, see the GL_NV_vertex_program extension documentation.
8. See "DirectX Vertex Shader 1.1 Profile (vs_1_1)" on page 223 for a full explanation of the data types, statements, and operators supported by this profile.

Data Types 
268. specially in fragment programs. These are referred to as basic profiles.

See "Language Profiles" on page 195 for detailed descriptions of these and related profiles.

Declaring Programs in Cg

CPU code generally consists of one program specified by main() in C. In contrast, a Cg program can have any name. A program is defined using the following syntax:

    <return-type> <program-name>(<parameters>) [: <semantic-name>]
    { /* ... */ }

Program Inputs and Outputs

The programmable processors in GPUs operate on streams of data. The vertex processor operates on a stream of vertices, and the fragment processor operates on a stream of fragments.

A programmer can think of the main program as being executed just once on a CPU. In contrast, a program is executed repeatedly on a GPU, once for each element of data in a stream. The vertex program is executed once for each vertex, and the fragment program is executed once for each fragment.

The Cg language adds several capabilities to C to support this stream-based programming model. For new Cg programmers, these capabilities often take some time to understand because they have no direct correspondence to C capabilities. However, the sample programs later in this document demonstrate that these capabilities are easy to use in Cg programs.

Two Kinds of Program Inputs

A Cg program can consume two diffe
269. st double *cgGetParameterValues(CGparameter parameter,
                                    CGenum valueType,
                                    int *numberOfValuesReturned);

It retrieves the default value if valueType is equal to CG_DEFAULT and the constant value if valueType is equal to CG_CONSTANT. The components of the value are returned in row-major order as a pointer to an array of double elements. After cgGetParameterValues() is called, the number of components available in the array is stored in the integer pointed to by numberOfValuesReturned.

Core Cg Error

The core Cg runtime reports an error by setting a global variable containing the error code. You query it, as well as the corresponding error string, as follows:

    CGerror error = cgGetError();
    const char *errorString = cgGetErrorString(error);

Each time an error occurs, the core Cg runtime also calls a callback function, optionally provided by the application, that usually calls cgGetError():

    void MyErrorCallback() {
        const char *errorString = cgGetErrorString(cgGetError());
    }
    cgSetErrorCallback(MyErrorCallback);

Here is the list of all the CGerror errors specific to the core Cg runtime:

- CG_NO_ERROR: Returned when no error has occurred.
- CG_COMPILER_ERROR: Returned when the compiler generated an error. A call to cgGetLastListing() should be made to get more details on the actual compiler error.
- CG_INVALID_PARAMETER_ERROR: Returned when the parameter used
270. supplied per  vertex  Tangent space bases are skinned in a similar fashion and then used to    transform the light vector into tangent space for per pixel bump mapping   Figure 22         Figure 22 Example of Matrix Palette Skinning       808 00504 0000 004 161  NVIDIA    Cg Language Toolkit    Vertex Shader Source Code for Matrix Palette Skinning    struct appdata    mode SAPOS EON REOS ENEON   float2 Weights   BLENDWEIGHTO   float2 Indices   BLENDINDICES   float3 Normal   NORMAL   float2 TexCoord0   TEXCOORDO    Sto ON S 2 ICO   float o TE METH OEI 27   rial is os mL COORDS                       E                         be    SURCO SORT  float4 Hposition   POSITION   float4 TexCoord0   TEXCOORDO   float4 TexCoordl   TEXCOORD1   float4 Color0   COLORO                   vpconn main appdata IN   uniform float4x4 WorldViewProj   uniform float3x4 Bones 26    uniform float3 LightVec     vpconn OUT     float4 tempPos   tempPos xyz   IN Position xyz   tempPos w   1 0        grab first bone matrix  float i   IN Indices x       transform position  float3 pos0   mul Bones i   tempPos        create 3x3 version of bone matrix  Lores  inp   m  m00 m01 m02   Bones i   m00 m01 m02   m  m10 mil m12 somes  aj   mi0 mi m12   mio 1620 mZ 1122   Iomes x    1120 um21 1225       ans omnes UM  Se       162    808 00504 0000 004  NVIDIA    float3 s0   mul  m  IN S    Ploate   mul  m  IN T    float3 sxt0   mul  m  IN SxT         next bone  i   IN Indices y        create 3x3 version of bone   m  m00
271. t  binormal  normal  Passed in from vertex program   loaro 18  INP  Float3 Nbump     Bump mapped normal    Float3 bump   tex2D bumpSampler  uv     Nbump x pump za SUI O EZ   IN gdp  Iowiwgow   PUMPA w MS  ue domo 7   F JBoss   m oo 74 55 NO  Nou UE TA O ZN PAZOS             However  here we have written a series of computations that add and multiply  single pairs of floating point values at a time  After a little algebra  we can  rewrite this as three multiplies of a   loat3 and a float and two float3  additions   which runs several times faster than the original     Now     oxbWgaos 9 JU se JOwdsg   JB xr logra 7 INE       2  Use Swizzles to Make the Most of Vectorization    The GPU can swizzle the values in vectors with no performance penalty  recall  that a swizzle can be used to rearrange the elements of a vector   Given a vector     float3 a   float3 0  1  2      swizzles construct new vectots     laos   loca  0r Or 0   ayaa c lores  b 257 2  E  Eloy    Elo ae  2  18    and so forth  By swizzling your data carefully  you can still take advantage of  vectorization  even when you don t want to use the same component of both       258    808 00504 0000 004  NVIDIA    Appendix C Nine Steps to High Performance Cg    vectors on both sides of your computation  For example  consider the  computation of the cross product  Given two three dimensional vectors  the  cross product returns a new vector that is perpendicular to the given vectors  It  is computed by      itlloiuES  cy 1
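The swizzle formulation of the cross product can be mirrored on the CPU. The following is a minimal C sketch, not code from the manual: the vec3 type and all helper names are hypothetical, introduced only for illustration. It assumes the standard identity cross(a, b) = a.yzx * b.zxy - a.zxy * b.yzx, which needs only two component-wise multiplies and one subtraction on swizzled operands.

```c
/* Hypothetical CPU-side model of a Cg float3; names are illustrative only. */
typedef struct { float x, y, z; } vec3;

/* Equivalent of the Cg swizzle v.yzx. */
vec3 swizzle_yzx(vec3 v) { vec3 r = { v.y, v.z, v.x }; return r; }

/* Equivalent of the Cg swizzle v.zxy. */
vec3 swizzle_zxy(vec3 v) { vec3 r = { v.z, v.x, v.y }; return r; }

/* Component-wise multiply, as the * operator does on Cg vectors. */
vec3 vec3_mul(vec3 a, vec3 b) { vec3 r = { a.x * b.x, a.y * b.y, a.z * b.z }; return r; }

/* Component-wise subtract. */
vec3 vec3_sub(vec3 a, vec3 b) { vec3 r = { a.x - b.x, a.y - b.y, a.z - b.z }; return r; }

/* Cross product written as a.yzx * b.zxy - a.zxy * b.yzx. */
vec3 cross_swizzled(vec3 a, vec3 b) {
    return vec3_sub(vec3_mul(swizzle_yzx(a), swizzle_zxy(b)),
                    vec3_mul(swizzle_zxy(a), swizzle_yzx(b)));
}
```

In Cg itself the swizzles cost nothing, so the whole expression maps onto vector multiplies and a subtraction; the C helpers above only model the data movement.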
272. t char  pixelOptions          cgD3D9GetOptimalOptions pixelProfile   0             Create the vertex shader   vertexProgram   cgCreateProgramFromFile   context  CG SOURCE   VertexProgram cg    vertexProfile   VertexProgram   vertexOptions       If your program uses explicit binding semantics  you     can create a vertex declaration using those semantics   const D3DVERTEXELEMENT9 declaration                                        78    808 00504 0000 004  NVIDIA    Using the Cg Runtime Library    LO   9    sico  elote  v   D3DDECLTYPE FLOAT3  D3DDECLMETHOD DEFAULT                                                                                D3DDECLUSAGE POSITION  0        Oy  Si S Ze o Eoo  D3DDECLTYPE D3DCOLOR  D3DDECLMETHOD DEFAULT   D3DDECLUSAGE COLOR  O   Oj  4  SAO  AO is  v   D3DDECLTYPE FLOAT2  D3DDECLMETHOD DEFAULT   D3DDECLUSAGE TEXCOORD  0                            D3DD3CL END         y           Ensure the resulting declaration is compatible with the     shader  This is really just a sanity check   assert cgD3D9ValidateVertexDeclaration vertexProgram   declaration      device  gt CreateVertexDeclaration     declaration   amp vertexDeclaration        Load the program with th xpanded interfac     Parameter shadowing is enabled  second parameter   TRUE    cgD3D9LoadProgram vertexProgram  TRUE  0                     Create the pixel shader    fragmentProgram   cgCreateProgramFromFile   context  CG SOURCE   FragmentProgram cg    pixelProfile   FragmentProgram   pixelOp
273. t of binding semantics, ATTR0-ATTR15, can also be used. The two sets act as aliases to each other.

Table 42. vp20 Varying Input Binding Semantics

Binding Semantics Name               Corresponding Data
POSITION, ATTR0                      Input Vertex, Generic Attribute 0
BLENDWEIGHT, ATTR1                   Input vertex weight, Generic Attribute 1
NORMAL, ATTR2                        Input normal, Generic Attribute 2
COLOR0, DIFFUSE, ATTR3               Input primary color, Generic Attribute 3
COLOR1, SPECULAR, ATTR4              Input secondary color, Generic Attribute 4
TESSFACTOR, FOGCOORD, ATTR5          Input fog coordinate, Generic Attribute 5
PSIZE, ATTR6                         Input point size, Generic Attribute 6
BLENDINDICES, ATTR7                  Generic Attribute 7
TEXCOORD0-TEXCOORD7, ATTR8-ATTR15    Input texture coordinates (texcoord0-texcoord7), Generic Attributes 8-15
TANGENT, ATTR14                      Generic Attribute 14
BINORMAL, ATTR15                     Generic Attribute 15

Table 43 summarizes the valid binding semantics for varying output parameters in the vp20 profile.

These binding semantics map to NV_vertex_program output registers. The two sets act as aliases to each other.

Table 43. vp20 Varying Output Binding Semantics

Binding Semantics Name               Corresponding Data
POSITION, HPOS                       Output position
PSIZE, PSIZ                          Output point size
FOG, FOGC                            Output fog coordinate

Table 43 vp20 Varyin
274. t3D texture to a sampler parameter using:

    HRESULT cgD3D9SetTexture(CGparameter parameter,
                             IDirect3DBaseTexture9 *texture);

To set the sampler state in the Direct3D 9 Cg runtime, use:

    HRESULT cgD3D9SetSamplerState(CGparameter parameter,
                                  D3DSAMPLERSTATETYPE type, DWORD value);

Parameter type is any of the D3DSAMPLERSTATETYPE enumerants and parameter value is a value appropriate for the corresponding type. Here is an example of how to use this function:

    cgD3D9SetSamplerState(parameter, D3DSAMP_MAGFILTER, D3DTEXF_LINEAR);

To set the texture stage state in the Direct3D 8 Cg runtime, use:

    HRESULT cgD3D8SetTextureStageState(CGparameter parameter,
                                       D3DTEXTURESTAGESTATETYPE type, DWORD value);

Parameter type must be one of the following values:

    D3DTSS_ADDRESSU       D3DTSS_ADDRESSV
    D3DTSS_ADDRESSW       D3DTSS_BORDERCOLOR
    D3DTSS_MAGFILTER      D3DTSS_MINFILTER
    D3DTSS_MIPFILTER      D3DTSS_MIPMAPLODBIAS
    D3DTSS_MAXMIPLEVEL    D3DTSS_MAXANISOTROPY

Parameter value is a value appropriate for the corresponding type. Here is an example of how to use this function:

    cgD3D8SetTextureStageState(parameter, D3DTSS_MAGFILTER, D3DTEXF_LINEAR);

The texture wrap mode is set using:

    HRESULT cgD3D9SetTextureWrapMode(CGparameter parameter, DWORD value);

The input value is either zero or a combination of D3DWRAP_U, D3DWRAP_V, and D3DWRAP_W. Here is an example of how to use this function:

    cg
275. t4x4 myMatrix;
    float myFloatScalar;
    float4 myFloatVec4;

    // Set myFloatScalar to myMatrix[3][2]
    myFloatScalar = myMatrix._m32;

    // Assign the main diagonal of myMatrix to myFloatVec4
    myFloatVec4 = myMatrix._m00_m11_m22_m33;

For compatibility with the D3DMatrix data type, Cg also allows one-based swizzles, using a form with the m omitted after the _ symbol:

    <matrixObject>._<row><col>[_<row><col>][...]

In this form, the indexes for <row> and <col> are one-based, rather than the C standard zero-based. So, the two forms are functionally equivalent:

    float4x4 myMatrix;
    float4 myVec;

    // These two statements are functionally equivalent:
    myVec = myMatrix._m00_m23_m11_m31;
    myVec = myMatrix._11_34_22_42;

Because of the confusion that can be caused by the one-based indexing, use of the latter notation is strongly discouraged.

The matrix swizzles may only be applied to matrices. When multiple components are extracted from a matrix using a swizzle, the result is an appropriately sized vector. When a swizzle is used to extract a single component from a matrix, the result is a scalar.

- The write-mask operator: (.)

It can only be applied to an lvalue that is a vector. It allows assignment to particular elements of a vector or matrix, leaving other elements unchanged. The only restriction is that a component cannot be repeated.

Arithmetic Precision and
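The relationship between the zero-based and one-based matrix swizzle forms can be checked with a small C sketch. This is illustrative only: Cg resolves swizzles at compile time, and the two helper functions below are hypothetical names that model nothing more than the index arithmetic, assuming a matrix stored as a row-major 4x4 array.

```c
/* Hypothetical helpers modeling Cg matrix swizzle indexing. */

/* Zero-based form: _m<row><col>. For example, _m32 selects m[3][2]. */
float swizzle_zero_based(float m[4][4], int row, int col) {
    return m[row][col];
}

/* One-based form: _<row><col>. For example, _43 selects m[3][2]. */
float swizzle_one_based(float m[4][4], int row, int col) {
    return m[row - 1][col - 1];
}
```

Under this model, _m23 and _34 name the same element, which is why the two four-component swizzles in the example above produce the same vector.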
276. ter parameter);

If the parameter does not have any associated resource, cgGetParameterResource() returns CG_UNDEFINED.

The two functions cgGetResource() and cgGetResourceString() allow you to determine the correspondence between a resource enumerant and its corresponding string:

    CGresource cgGetResource(const char *resourceString);
    const char *cgGetResourceString(CGresource resource);

If the string passed to cgGetResource() does not correspond to any resource, CG_UNDEFINED is returned.

Using cgGetParameterBaseResource() allows you to retrieve the base resource for a parameter in a Cg program:

    CGresource cgGetParameterBaseResource(CGparameter parameter);

The base resource is the first resource in a set of sequential resources. For example, if a given parameter has a resource equal to CG_TEXCOORD7, its base resource is CG_TEXCOORD0. Only parameters with resources whose name ends with a number have a base resource. All other parameters return CG_UNDEFINED when cgGetParameterBaseResource() is called.

Function cgGetParameterResourceIndex() retrieves the numerical portion of the resource:

    unsigned long cgGetParameterResourceIndex(CGparameter parameter);

For example, if the resource for a given parameter is CG_TEXCOORD7, cgGetParameterResourceIndex() returns 7.

The cgGetParameterValues() function retrieves the default or constant value of a uniform parameter:

    con
277. terminated array of null-terminated strings that are passed as arguments to the compiler. The pointer may itself be null.

The only difference between the two functions is how program is interpreted. For cgCreateProgramFromFile(), program is a string containing the name of a file containing source code; for cgCreateProgram(), program directly contains source code. If the enumerant programType is equal to CG_SOURCE, the source code is Cg source code; if it is equal to CG_OBJECT, the source code is precompiled object code and does not require any further compilation.

The CGprogram handle returned by cgCreateProgramFromFile() is valid if it is different from zero, which means that the program has been successfully created and compiled. The program is destroyed by passing its handle to cgDestroyProgram():

    void cgDestroyProgram(CGprogram program);

Note: In the future, it will be possible to modify a program that has been created by cgCreateProgram() or cgCreateProgramFromFile() through the runtime (by changing the variability or the semantics of some parameters, for example) so that it will need to be recompiled.

A call to cgIsProgramCompiled() determines whether a program needs to be recompiled:

    CGbool cgIsProgramCompiled(CGprogram program);

To recompile a program, use cgCompileProgram():

    void cgCompileProgram(CGprogram program);

A useful function in this context is cg
278. tersect with the iris plane  halz Leis   Intersect Dlane  IN Oo sirio   planeEquation    helie racet   teisik   Bel Dete  TINS DENSI ENA  fadeT   fadeT   fadeT   faceColor   DiffPupil xxx   ds noms X   0   1  halts  1ersPomte   INSOPOsteron  cese Set Vecioni  half3 irisST    irisScale irisPoint     hatis  sm   055m  0 51  y  faceColor   tex2D ColorMap        refVector           aciei s eds yz    159197       faceColor   lerp faceColor  LensColor  fadeT    hitColor   lerp missColor  faceColor   smoothstep 0 0h  GRADE  slice          hitColor   hitColor   SpecularLight     return half4 hitColor     1 43  7       118    808 00504 0000 004  NVIDIA    Advanced Profile Sample Shaders       Skin    Description    This effect demonstrates some techniques for rendering skin ranging from  simple Blinn Phong Bump Mapping to more complex Subsurface Scattering  lighting models  It also illustrates the use of    Rim    lighting and simple  translucency for capturing some of the more subtle properties of skin resulting  from complex  non local lighting interactions  Finally  it shows how the various  techniques can be combined to produce compelling  stylized skin        Figure 10 Example of Skin    Pixel Shader Source Code for Skin    SENGE Eram               float2 texcoords LEX COORD O  float4 shadowcoords   TEXCOORD1   808 00504 0000 004 119    NVIDIA    Cg Language Toolkit          float4 tangentToEyeMat0 TEXCOORD4   float3 tangentToEyeMatl TEXCOORD5   float3 tangentToEyeMat2 TEXCOORD6 
279. tex Shader Source Code for Anisotropic Lighting ... 135
Bump Dot3x2 Diffuse and Specular ... 136
    Description ... 136
    Vertex Shader Source Code for Bump Dot3x2 ... 137
    Pixel Shader Source Code for Bump Dot3x2 ... 138
Bump Reflection Mapping ... 140
    Description ... 140
    Vertex Shader Source Code for Bump Reflection Mapping ... 141
    Pixel Shader Source Code for Bump and Reflection Mapping ... 143
Fresnel ... 144
    Description ... 144
    Vertex Shader Source Code for Fresnel ... 144
Grass ... 146
    Description ... 146
    Vertex Shader Source Code for Grass ... 146
Refraction ... 149
    Description ... 149
    Vertex Shader Source Code for Refraction ... 150
    Pixel Shader Source Code for Refraction ... 151
Shadow Mapping ... 152
    Description ... 152
    Vertex Shader Source Code for Shadow Mapping ... 153
    Pixel Shader Source Code for Shadow Mapping ... 154
Shadow Volume Extrusion
280. tex2Dproj(sampler2D tex, float4 szq)                           2D projective depth compare
texRECT(samplerRECT tex, float2 s)                                  2D RECT nonprojective
texRECT(samplerRECT tex, float2 s, float2 dsdx, float2 dsdy)        2D RECT nonprojective with derivatives
texRECT(samplerRECT tex, float3 sz)                                 2D RECT nonprojective depth compare
texRECT(samplerRECT tex, float3 sz, float2 dsdx, float2 dsdy)       2D RECT nonprojective depth compare with derivatives
texRECTproj(samplerRECT tex, float3 sq)                             2D RECT projective
texRECTproj(samplerRECT tex, float3 szq)                            2D RECT projective depth compare
tex3D(sampler3D tex, float3 s)                                      3D nonprojective
tex3D(sampler3D tex, float3 s, float3 dsdx, float3 dsdy)            3D nonprojective with derivatives
tex3Dproj(sampler3D tex, float4 szq)                                3D projective depth compare
texCUBE(samplerCUBE tex, float3 s)                                  Cubemap nonprojective
texCUBE(samplerCUBE tex, float3 s, float3 dsdx, float3 dsdy)        Cubemap nonprojective with derivatives
texCUBEproj(samplerCUBE tex, float4 sq)                             Cubemap projective

In the table, the name of the second argument to each function indicates how its values are used when performing the texture lookup: s indicates a 1-, 2-, or 3-component texture coordinate; z indicates a depth comparison 
281. texm3x2tex DirectX 8 pixel shader  instruction  DOT PRODUCT TEXTURE 2D in OpenGL   This instruction  computes the dot product of the normal and the light vector  corresponding to  the diffuse light component  and the dot product of the normal and the half  angle vector  corresponding to the specular light component  This results into  two scalar values that are used as texture cootdinates to look up a 2D  illumination texture containing the diffuse color and the specular term in its  alpha component  Since the normal fetched from the normal map is in tangent  space  both the light vector and the half angle vector are transformed to this  space by the vertex shader  Figure 14         Figure 14 Example of Bump Dot3x2 Diffuse and Specular       136 808 00504 0000 004  NVIDIA    Basic Profile Sample Shaders    Vertex Shader Source Code for Bump Dot3x2    struct a2v      y     float4 Position   POSITION    in object space  float3 Normal   NORMAL    in object space  float2 TexCoord   TEXCOORDO                 float3 T   TEXCOORD1    in object space  float3 B   TEXCOORD2    in object space  float3 N TEXCOORD3    in object space    Seater WAsE di    he    float4 Position   POSITION    in projection space   float4 Normal   COLORO    in tangent space   float4 LightVectorUnsigned   COLOR1    in tangent space  float3 TexCoord0   TEXCOORDO    float3 TexCoordl   TEXCOORD1    float4 LightVector   TEXCOORD2    in tangent space  float4 HalfAngleVector   TEXCOORD3    in tangent space          
282. that varies linearly over the face of the triangle (for example, the distance from the fragment to a light source, to be used for attenuation), the value can be computed in the vertex shader at each vertex, passed to the fragment shader, and automatically interpolated by the GPU along the way.

- The result is nearly linear across a triangle.

When a value computed by a fragment shader varies slowly over triangles, it may be an acceptable approximation to compute its value at each vertex and use its linearly interpolated value in the fragment shader. For example, the usual Gouraud shading algorithm takes advantage of this situation to compute lighting per vertex, rather than per pixel.

In a similar manner, it may be advantageous to move any vertex shader computation that is solely dependent on the values of uniform parameters to the CPU and then to pass the result of the computation into the vertex shader with different uniform parameters. For example, if the vertex shader is passed a float3 vector giving the direction of a distant light source, the vector should be normalized on the CPU and passed to the vertex shader. This avoids the need to repeatedly and unnecessarily recompute normalize(lightvector) in the vertex shader.

8. Avoid Matrix Transposes Just for Multiplication

Computing the transpose of a matrix can often be avoided. If you would like to multi
283. the arbvp1 and vp20 profiles is the way that input varying semantics are handled. In the vp20 profile, semantic names such as POSITION and ATTR0 are aliases of each other the same way NV_vertex_program aliases Vertex and Attribute 0 (see Table 42, "vp20 Varying Input Binding Semantics," on page 242). In the arbvp1 profile, the semantic names are not aliased because ARB_vertex_program allows the conventional attributes (such as vertex position) to be separate from the generic attributes (such as Attribute 0). For this reason it is important to follow the conventions given in Table 20, "arbvp1 Varying Input Binding Semantics," on page 209 so that arbvp1 programs work for all implementations of ARB_vertex_program. The arbvp1 conventions are compatible with the vp20 and vp30 profiles.

Loading Constants

Applications that do not use the Cg run time are no longer required to load constant values into program parameter registers as indicated by the #const expressions in the Cg compiler output. The compiler produces output that causes the OpenGL driver to load them. However, uniform variables that have a default definition still require constant values to be loaded into the appropriate program parameter registers, as ARB vertex programs do not support this feature. Application programs either have to use the Cg run time, parse and handle the #default commands, or have to avoid initializing un
284. the packing and unpacking instructions defined by the NV_fragment_program OpenGL extension.

pack_2half()

    float pack_2half(float2 a);
    float pack_2half(half2 a);

Converts the components of a into a pair of 16-bit floating-point values. The two converted components are then packed into a single 32-bit result. This operation can be reversed using the unpack_2half() function.

C Pseudocode

    result = (((half)a.y) << 16) | (half)a.x;

unpack_2half()

    half2 unpack_2half(float a);

Unpacks a 32-bit value into two 16-bit floating-point values.

C Pseudocode

    result.x = (a >> 0) & 0xFFFF;
    result.y = (a >> 16) & 0xFFFF;

pack_2ushort()

    float pack_2ushort(float2 a);
    float pack_2ushort(half2 a);

Converts the components of a into a pair of 16-bit unsigned integers. The two converted components are then packed into a single 32-bit return value. This operation can be reversed using the unpack_2ushort() function.

C Pseudocode

    ushort.x = round(65535.0 * clamp(a.x, 0.0, 1.0));
    ushort.y = round(65535.0 * clamp(a.y, 0.0, 1.0));
    result = (ushort.y << 16) | ushort.x;

unpack_2ushort()

    float2 unpack_2ushort(float a);

Unpacks two 16-bit unsigned integer values from a and scales the results into individual floating-point values between 0.0 and 1.0.

C Pseudocode

    result.x = ((a >> 0) & 0xFFFF) / 65535.0;
    result.y = ((a >> 16) & 0xFFFF) / 65
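The pack_2ushort() and unpack_2ushort() pseudocode can be tried out on the CPU with a short C sketch. This is a model under stated assumptions rather than the Cg implementation: the function names and the clamp01 helper are hypothetical, and the packed value is kept as a uint32_t bit pattern, whereas the Cg functions carry the same 32 bits in a float.

```c
#include <stdint.h>

/* Hypothetical helper: clamp x to the range [0, 1]. */
float clamp01(float x) {
    return x < 0.0f ? 0.0f : (x > 1.0f ? 1.0f : x);
}

/* Models pack_2ushort: scale each component to 0..65535, round to the
 * nearest integer, and place a.x in the low 16 bits and a.y in the high
 * 16 bits. The result is the raw bit pattern, not a float. */
uint32_t pack_2ushort_bits(float ax, float ay) {
    uint32_t lo = (uint32_t)(65535.0f * clamp01(ax) + 0.5f);
    uint32_t hi = (uint32_t)(65535.0f * clamp01(ay) + 0.5f);
    return (hi << 16) | lo;
}

/* Models unpack_2ushort: extract each 16-bit field and scale it back
 * into the range [0, 1]. */
void unpack_2ushort_bits(uint32_t bits, float *x, float *y) {
    *x = (float)((bits >> 0) & 0xFFFFu) / 65535.0f;
    *y = (float)((bits >> 16) & 0xFFFFu) / 65535.0f;
}
```

Packing and then unpacking recovers each component to within the 16-bit quantization step of 1/65535; inputs outside [0, 1] are clamped before packing, so they do not round-trip.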
285. the same abstraction for GPUs. Cg changes the way programmers can program: focusing on the ideas, the concepts, and the effects they wish to create, not on the details of the hardware implementation. Cg also decouples programs from specific hardware because the language is functional, not hardware implementation specific. Also, since Cg can be compiled at run time on any platform, operating system, and for any graphics hardware, Cg programs are truly portable. Finally, and perhaps best of all, Cg programs are future-proof and can adapt to run well on future products. The compiler can optimize directly for a new target GPU that perhaps did not even exist when the original Cg program was written.

This book is intended as an introduction to Cg, as well as a practical handbook to get programmers started developing in Cg. It includes a language description, a reference for the standard and run-time libraries, and is full of helpful examples. The goal for this book is to be both an introduction and a tool for the new user, as well as a reference and resource for developers as they become more proficient.

Welcome to the world of Cg.

David Kirk
Chief Scientist, NVIDIA Corporation

Preface

The goal of this book is to introduce to you Cg, a new high-level language for graphics programming. To that end, we have organized this document into the following sections:

- Introduction to
286. ting        808 00504 0000 004 95  NVIDIA    Cg Language Toolkit       96 808 00504 0000 004  NVIDIA       Advanced Profile Sample Shaders    This chapter provides a set of advanced profile sample shaders written in Cg   Each shader comes with an accompanying snapshot  description  and source    code    Examples shown are  Improved Skinning  Improved Water  Melting Paint  MultiPaint  Ray Traced Refraction  Skin   Thin Film Effect   Car Paint 9       Oo ooo o oO O    808 00504 0000 004    NVIDIA    97       Cg Language Toolkit       Improved Skinning    Description    This shader takes in a set of all the transformation matrices that can affect a  particular bone  Each bone also sends in a list of matrices that affect it  There is  then a simple loop that for each vertex goes through each bone that affects that  vertex and transforms it  This allows just one Cg program to do the entire  skinning for vertices affected by any number of bones  instead of having one  program for one bone  another program for two bones  and so on        Figure 5 Example of Improved Skinning       98 808 00504 0000 004  NVIDIA    Advanced Profile Sample Shaders    Vertex Shader Source Code for Improved Skinning    GNE DIOE ENPUES                              float4 position EOS TEON  float4 weights   BLENDWEIGHT   float4 normal   NORMAL   float4 matrixIndices   TESSFACTOR   float4 numBones SPECULAR    y    struct cULDUES      float4 hPosition LDOPOSXTION  float4 color   COLORO     be    outputs main
287. tions            Load the program with th xpanded interface  Parameter     shadowing is enabled  second parameter   TRUE   Ignore     vertex shader specifc flags  such as declaration usage   cgD3D9LoadProgram fragmentProgram  TRUE  0                     Grab some parameters   modelViewMatrix   cgGetNamedParameter  vertexProgram    ModelViewMatrix     baseTexture   cgGetNamedParameter  fragmentProgram    BaseTexture     someColor   cgGetNamedParameter  fragmentProgram    SomeColor                      Sanity check that parameters have th xpected siz  assert  cgD3D9TypeToSize cgGetParameterType    modelViewMatrix      16    assert  cgD3D9TypeToSize  cgGetParameterType  someColor        4  7             808 00504 0000 004 79  NVIDIA    Cg Language Toolkit       Set parameters that don t change  They can be set     only once since parameter shadowing is enabled  cgD3D9SetTexture baseTexture  texture    cgD3D9SetUniform someColor   amp constantColor               Called to render the scen  void OnRender                 Load model view matrix   D3DXMATRIX modelViewMatrix   Hh       Set the parameters that change every frame     This must be done before binding the programs  cgD3D9SetUniformMatrix modelViewMatrix   amp modelViewMatrix            Set the vertex declaration  device  gt SetVertexDeclaration vertexDeclaration               Bind the programs  This downloads any parameter values     that have been previously set    cgD3D9BindProgram  vertexProgram      cgD3D9BindProgram  
288. to the COLOR output of the program, and execution of the program is terminated. If the compiler's DEBUG option is not specified, this function does nothing.

The debug function is intended to allow a program to be compiled twice: once with the DEBUG option and once without. By executing both programs, you can obtain one frame buffer containing the final output of the program and a second containing an intermediate value to be examined for debugging.

Predefined Fragment Program Output Structures

A number of helper structure types for use in fragment programs are predefined in the standard library. Variables of these types can be used to hold the outputs of a fragment program. Their use is strictly optional.

For the ps_1 and fp20 profiles, the fragout structure is defined as follows:

    struct fragout {
        float4 col : COLOR;
    };

The ps_2, arbfp1, and fp30 profiles have two fragment output types defined:

    struct fragout {
        half4 col   : COLOR;
        float depth : DEPTH;
    };

    struct fragout_float {
        float4 col  : COLOR;
        float depth : DEPTH;
    };

Using the Cg Runtime Library

This chapter describes the Cg Runtime Library. It assumes that you have some basic knowledge of the Cg language, as well as the OpenGL or Direct3D APIs, depending on which one you use in your applications.

The first section, "Introducing the Cg Runtime" on page 29, talks about the benefits of using the Cg Runtime Library
289. to the current state. This means that in subsequent drawing calls the program is executed for every vertex in the case of a vertex program and for every fragment in the case of a fragment program.

Here's how to bind a program in OpenGL:

    cgGLBindProgram(program);

Here's how to bind a program in Direct3D:

    cgD3D9BindProgram(program);

You can only bind one vertex and one fragment program at a time for a particular profile. Therefore, the same vertex program is executed until another vertex program is bound. Similarly, the same fragment program is executed as long as no other fragment program is bound.

In OpenGL, you disable profiles by the following call:

    cgGLDisableProfile(CG_PROFILE_ARBVP1);

Disabling a profile also disables the execution of the corresponding vertex or fragment program.

Releasing Resources

When your application is ready to close, it is good programming practice to free resources that you've acquired.

Because the Direct3D runtime keeps an internal reference to the Direct3D device, you must tell it to release this reference when you are done using the runtime. This is done with the following call:

    cgD3D9SetDevice(0);

To free resources allocated for a program, call this function:

    cgDestroyProgram(program);

To free resources allocated for a context, use this function:

    cgDestroyContext(context);

Note that destroying a context destroys all the programs it contains as well.

Core Cg Runtime

The core Cg runti
290. tor

Fresnel

Description

This effect computes a reflection vector to look up into an environment map for reflections, and modulates this by a Fresnel term. The result is reflections only at grazing angles (Figure 16).

Figure 16 Example of Fresnel

Vertex Shader Source Code for Fresnel

    struct app2vert
    {
        float4 Position  : POSITION;
        float4 Normal    : NORMAL;
        float4 TexCoord0 : TEXCOORD0;
    };

    struct vert2frag
    {
        float4 HPosition : POSITION;
        float4 Color0    : COLOR0;
        float4 TexCoord0 : TEXCOORD0;
    };

    vert2frag main(app2vert IN,
                   uniform float4x4 ModelViewProj,
                   uniform float4x4 ModelView,
                   uniform float4x4 ModelViewIT)
    {
        vert2frag OUT;

    #ifdef PROFILE_ARBVP1
        ModelViewProj = glstate.matrix.mvp;
        ModelView = glstate.matrix.modelview[0];
        ModelViewIT = glstate.matrix.invtrans.modelview[0];
    #endif

        OUT.HPosition = mul(ModelViewProj, IN.Position);

        float3 normal = normalize(mul(ModelViewIT, IN.Normal).xyz);
        float3 eyeToVert = normalize(mul(ModelView, IN.Position).xyz);

        // reflect the eye vector across the normal vector
        // for reflection
        OUT.TexCoord0 = float4(reflect(eyeToVert, normal), 1.0);

        float f0    1

        // compute the fresnel term
        float oneMCosAngle   1 dot eyeToVert  normal
        oneMCosAngle = pow(oneMCosAngle, 5);
        OUT.Color0 = lerp(oneMC
291. umTexInstructionSlots=<n> where n >= 24

Limitations in the Implementation

Currently, this profile implementation has the following limitations:

- The OpenGL ARB fragment program profile is still in a developmental beta stage, as the extension and its support are not widely available.
- OpenGL state access in ARB fragment programs is not yet implemented.

OpenGL NV_vertex_program 2.0 Profile (vp30)

The vp30 Vertex Program profile is used to compile Cg source code to vertex programs for use by the NV_vertex_program2 OpenGL extension.

- Profile name: vp30
- How to invoke: Use the compiler option -profile vp30.

The vp30 profile limits Cg to match the capabilities of the NV_vertex_program2 extension. This section describes the capabilities and restrictions of Cg when using the vp30 profile.

Position Invariance

The vp30 profile supports position invariance, as described in the core language specification.

- The modelview-projection matrix must be specified using a binding semantic of _GL_MVP. Unlike the vp20 and arbvp1 profiles, this profile causes the compiler to emit the instructions for transforming the position using the modelview-projection matrix.
- The assembly code position-invariant option is not used because the hardware guarantees that the position calculation is invariant compared to the fixed pipeline calculation.

Language Constructs

Data Types

This
292. ure coordinates associated with the nth texture unit, intermediate_coord are texture coordinates associated with the (n-1)th texture unit, and prevlookup is the result of a previous texture operation. This function can be used in conjunction with the DEPTH varying out semantic to generate the dot_product_depth_replace NV_texture_shader instruction combination.

Examples

The following examples illustrate how a developer can use Cg to achieve NV_texture_shader and NV_register_combiners functionality.

Example 1

    struct VertexOut {
        float4 color     : COLOR0;
        float4 texCoord0 : TEXCOORD0;
        float4 texCoord1 : TEXCOORD1;
    };

    float4 main(VertexOut IN,
                uniform sampler2D diffuseMap,
                uniform sampler2D normalMap) : COLOR
    {
        float4 diffuseTexColor = tex2D(diffuseMap, IN.texCoord0.xy);
        float4 normal = 2 * (tex2D(normalMap, IN.texCoord1.xy) - 0.5);
        float3 light_vector = 2 * (IN.color.rgb - 0.5);
        float4 dot_result = saturate(dot(light_vector, normal.xyz).xxxx);
        return dot_result * diffuseTexColor;
    }

Example 2

    struct VertexOut {
        float4 texCoord0 : TEXCOORD0;
        float4 texCoord1 : TEXCOORD1;
        float4 texCoord2 : TEXCOORD2;
        float4 texCoord3 : TEXCOORD3;
    };

    float4 main(VertexOut IN,
                uniform sampler2D normalMap,
                uniform sampler2D intensityMap,
                uniform sampler2D colorMap) : COLOR
    {
        float4 normal = 2 * (tex2D(normalMap, IN.texCoord0.xy) - 0.5);
        float2 i
293. ... 171
    Partial Support of Types 173
    Type Categories 174
    Constants 174
    Type Qualifiers 175
    Type Conversions 176
    Type Equivalency 178
    Type Promotion Rules 178
    Namespaces 179
    Arrays and Subscripting 179
    Function Overloading 181
    Global Variables 182
    Use of Uninitialized Variables 182
    Preprocessor 182
    Overview of Binding Semantics 183
    Binding Semantics 183
    Aliasing of Semantics 184
    Restrictions on Semantics Within a Structure 184
    Additional Details for Binding Semantics 184
    How Programs Receive and Return Data 185
    Statements 185
    Minimum Requirements for if, while, and for Statements 185
    New Vector Operators 186
    Arithmetic Precision and Range 188
    Operator Precedence
294. value for shadowmap lookups. q indicates a perspective value and is used to divide the texture coordinate (s) before the texture lookup is performed.

For convenience, the standard library also defines versions of the texture functions prefixed with h4, such as h4tex2D(), that return half4 values and prefixed with x4, such as x4tex2D(), that return fixed4 values.

When the texture functions that allow specifying a depth comparison value are used, the associated texture unit must be configured for depth-compare texturing. Otherwise, no depth comparison is actually performed.

Derivative Functions

Table 4 presents the derivative functions that are supported by the Cg Standard Library. Vertex profiles are not required to support these functions.

Table 4 Derivative Functions

    Function    Description
    ddx(a)      Approximate partial derivative of a with respect to screen-space x coordinate.
    ddy(a)      Approximate partial derivative of a with respect to screen-space y coordinate.

Debugging Function

Table 5 presents the debugging function that is supported by the Cg Standard Library. Vertex profiles are not required to support this function.

Table 5 Debugging Function

    Function                Description
    void debug(float4 x)    If the compiler's DEBUG option is specified, calling this function causes the value x to be copied
295. vector, where:

- The x and w components are always one.
- The y component is equal to the diffuse dot product or to zero if the product is less than zero.
- The z component is equal to the specular dot product raised to the given exponent or to zero if the diffuse dot product was less than zero.

All this is done substantially more efficiently than if the corresponding operations were written out in Cg code.

Take Advantage of the Different Levels of Computation Frequency

Always keep in mind the fact that fragment programs generally are executed many more times than vertex programs. Therefore, move computation from fragment programs into vertex programs whenever possible. Recall that varying outputs from vertex programs are automatically linearly interpolated before being passed to the fragment program.

There are three main cases where you can move computation from a fragment program into a vertex program:

- The result is constant over all fragments.

  If the vertex shader computes a value that is the same for all vertices, so that all fragments receive the same value after interpolation, any computation that the fragment shaders do that is based solely on such values can be moved to the vertex shader (as long as it doesn't require texture map lookups or other fragment-only operations).

- The result is linear across a triangle.

  If the fragment shader is computing a value
296. xample 116
    sample shader 114
    vertex shader code example 115
recursion, function 13
reflection vector 144
refraction
    pixel shader code example 151
    sample shader 149
    vertex shader code example 150
release notes xiv
Renderman, relation to Cg 165
reserved words 191
runtime, core Cg 34

S
sampler data type 11
sampler type, specification 172
saturate(), for performance 260
scalar type category 174
semantics
    aliasing 184
    restrictions 184
shader sample
    anisotropic lighting 134
    bump dot 3x2 diffuse and specular 136
    bump reflection mapping 140
    fresnel 144
    grass 146
    improved skinning 98
    improved water 101
    matrix palette skinning 161
    melting paint 105
    multipaint 109
    ray traced refraction 114
    refraction 149
    shadow mapping 152
    shadow volume extrusion 155
    sine wave demo 158
    skin 119
shader, simple cg example 90
shaders
    advanced profile samples 97
    basic profile samples 133
shading computations for performance 261
shadow mapping 152
    pixel shader code example 154
    sample shader 152
    vertex shader code example 153
shadow volume extrusion
    sample shader 155
    vertex shader code example 156
shadow volumes 155
silent incompatibilities with C 165
simple cg
    basic transformations 93
    passing arguments 93
Sine function 146, 158
sine wave demo
    sample shader 158
    vertex shader code example 159
sinh(x) 23
skin
    pixel shader code example 119
    sample shader 119
skinning, improved
    sample shader 98
    vertex shader code example 99
smearing  s
297. xture Lookup Function    Texture Coordinate Swizzle
    tex1Dproj                .xw/.ra
    tex2Dproj                .xyw/.rga
    texRECTproj              .xyw/.rga
    tex3Dproj                .xyzw/.rgba
    texCUBEproj              .xyzw/.rgba

Bindings

Manual Assignment of Bindings

The Cg compiler can determine bindings between texture units and uniform sampler parameters/texture coordinate inputs automatically. This automatic assignment is based on the context in which uniform sampler parameters and texture coordinate inputs are used together.

To specify bindings between texture units and uniform parameters/texture coordinates to match their application, all sampler uniform parameters and texture coordinate inputs that are used in the program must have matching binding semantics; that is, TEXUNIT<n> may only be used with TEXCOORD<n>. Partially specified binding semantics may not work in all cases. Fundamentally, this restriction is due to the close coupling between texture samplers and texture coordinates in DirectX pixel shaders 1.X.

Binding Semantics for Uniform Data

If a binding semantic for a uniform parameter is not specified, then the compiler will allocate one automatically. Scalar uniform parameters may be allocated to either the xyz or the w portion of a constant register depending on how they are used within the Cg program. When using the output of the compiler without the Cg runtime, you must set all values of a scalar uniform to th
298. xture unit, and intermediate_coord2 are texture coordinates associated with the (n-1)th texture unit. This function can be used to generate the dot_product_reflect_cube_map_eye_from_qs NV_texture_shader instruction combination.

Appendix B Language Profiles

Table 50 fp20 Auxiliary Texture Functions (continued)

Texture Function:

    texCUBE_reflect_eye_dp3x3(uniform samplerCUBE tex,
                              float3 str,
                              float4 intermediate_coord1,
                              float4 intermediate_coord2,
                              float4 prevlookup,
                              uniform float3 eye)

Description: Performs the following:

    float3 N = float3(dot(intermediate_coord1.xyz, prevlookup.xyz),
                      dot(intermediate_coord2.xyz, prevlookup.xyz),
                      dot(coords.xyz, prevlookup.xyz));
    return texCUBE(tex, 2 * dot(N, E) / dot(N, N) * N - E);

where strq are texture coordinates associated with sampler tex, prevlookup is the result of a previous texture operation, intermediate_coord1 are texture coordinates associated with the (n-2)th texture unit, intermediate_coord2 are texture coordinates associated with the (n-1)th texture unit, and eye is the eye-ray vector. This function can be used to generate the dot_product_reflect_cube_map_const_eye NV_texture_shader instruction combination.

Texture Function:

    tex_dp3x2_depth(float3 str,
                    float4 intermediate_coord,
                    float4 prevlookup)

Description: Performs the following:

    float z = dot(intermediate_coord.xyz, prevlookup.xyz);
    float w = dot(str, prevlookup.xyz);
    return z / w;

where str are text
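The arithmetic that the texCUBE_reflect_eye_dp3x3 entry above performs before its cube-map lookup can be checked on the CPU with a few lines of C. This is only a sketch: the `Vec3` type and the function names are hypothetical, the texture lookup itself is left out, and the code assumes N is not the zero vector (otherwise the division is undefined).

```c
/* Hypothetical vector type for this sketch. */
typedef struct { float x, y, z; } Vec3;

static float dot3(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Build N from three dot products of texture coordinates with the result
 * of a previous lookup, as in the first statement of the table entry. */
static Vec3 dp3x3_normal(Vec3 c1, Vec3 c2, Vec3 c3, Vec3 prev)
{
    Vec3 n = { dot3(c1, prev), dot3(c2, prev), dot3(c3, prev) };
    return n;
}

/* Reflection vector used for the cube-map lookup:
 * R = 2 * (N . E) / (N . N) * N - E.
 * Dividing by N . N means N does not have to be normalized first. */
static Vec3 reflect_eye(Vec3 n, Vec3 e)
{
    float s = 2.0f * dot3(n, e) / dot3(n, n);
    Vec3 r = { s * n.x - e.x, s * n.y - e.y, s * n.z - e.z };
    return r;
}
```

For example, with N = (0, 0, 2) and E = (1, 0, 1) the scale factor is 2 * 2 / 4 = 1, giving R = (-1, 0, 1): the eye vector mirrored about the z axis even though N was not unit length.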
299. y, all the related core runtime handles (of type CGprogram, CGparameter, and so on) remain valid.

If you call cgD3D9SetDevice() a second time with a different device, all programs managed by the old device are rebuilt using the new device.

Responding to Lost Direct3D Devices

The expanded interface may hold references to Direct3D resources that need to be recreated in response to a lost device. In particular, certain sampler parameters might need to be released before a Direct3D device can be reset from a lost state. The expanded interface is holding a reference to a texture that needs to be reset in response to a lost device if both of the following are true for a texture:

- It was created in the D3DPOOL_DEFAULT pool.
- It was bound to a sampler parameter (using cgD3D9SetTexture()) of a program for which parameter shadowing is enabled.

In this case, the parameter must be set to zero (using cgD3D9SetTexture()) to remove the expanded interface's reference to that texture so it can be destroyed and the Direct3D device can be reset from a lost state. Later, after resetting the Direct3D device and recreating the texture, it needs to be re-bound to the sampler parameter. For example:

    IDirect3DDevice9* device; // Initialized elsewhere
    IDirect3DTexture9* myDefaultPoolTexture;
    CGprogram program;

    void OneTimeLoadScene()
    {
        // Load the program with cgD3D9LoadProgram and
        // enable parameter shadowing
        /* ... */
        cgD3D9LoadProgram(pr
300. ying output parameters in the vp30 profile.

These binding semantics map to NV_vertex_program2 output registers. The two sets act as aliases to each other.

Table 27 vp30 Varying Output Binding Semantics

    Binding Semantics Name            Corresponding Data
    POSITION, HPOS                    Output position
    PSIZE, PSIZ                       Output point size
    FOG, FOGC                         Output fog coordinate
    COLOR0, COL0                      Output primary color
    COLOR1, COL1                      Output secondary color
    BCOL0                             Output backface primary color
    BCOL1                             Output backface secondary color
    TEXCOORD0-TEXCOORD7, TEX0-TEX7    Output texture coordinates
    CLP0-CLP5                         Output clip distances

The profile allows WPOS to be present as binding semantics on a member of a varying output data structure, provided the member with this binding semantics is not referenced. This allows Cg programs to have the same structure specify the varying output of a vp30 profile program and the varying input of an fp30 profile program.

OpenGL NV_fragment_program Profile (fp30)

The fp30 Fragment Program Profile is used to compile Cg source code to fragment programs for use by the NV_fragment_program OpenGL extension.

- Profile name: fp30
- How to invoke:
    