JeGX's Lab - Part 6

GPU Caps Viewer Validation Facility

January 8, 2008 JeGX 4 Comments

In version 1.4.0 of GPU Caps Viewer, I added a validation feature:

Small Theme Change

January 7, 2008 JeGX Comment

There, I have changed this blog’s theme yet again. What can I say: with the astronomical number of themes available, I could switch every day. That said, for those who are interested, this theme is available here: Paalam.

For the record, the header image comes from a small 3D demo that will be released soon.

News en Vrac, Tools de JeGX

GPU Overclocking Competition

December 22, 2007 JeGX Comment

The site www.hardware.info is running a GPU overclocking competition and uses the Fur Rendering Benchmark as its main utility. (That is my reading of it, since everything is written in Dutch, but I should not be far off. If anyone understands the language, thanks in advance for a bit of feedback on what it says.) The fur benchmark is definitely getting talked about lately. The funny thing is that everyone insists on using version 1.0.0 even though version 1.1.0 exists…

The competition page is here: GPU Overclocking Contest

News en Vrac

The Technology of a 3D Engine @ Beyond3D – Part 1

December 22, 2007 JeGX Comment

“This series of articles is meant for anyone willing to write, or learn about the process of writing, a modern, streaming, 3D engine, taking advantage of current programmable hardware.”

Read the article HERE

Our friends at Beyond3D have just launched a new series of articles, this time on the architecture of a modern 3D engine that knows how to exploit our ever more powerful graphics cards. After a few pages of generalities (pages 1, 2 and 3), the fourth and last page (what, already?) goes into more detail about the different 3D rendering APIs (Direct3D and OpenGL), and the author explains that his engine (the FlExtEngine) uses an abstraction layer for the 3D renderer. The same kind of solution is used in the oZone3D engine that powers Demoniak3D and GPU Caps Viewer.

So reading this article, and especially the ones that follow, will teach you a bit more about what goes on behind the scenes of Demoniak3D. I will try to post some feedback when the other articles come out.
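
To make the idea of a renderer abstraction layer a bit more concrete, here is a minimal C++ sketch. It is not code from FlExtEngine or oZone3D: the names `Renderer`, `OpenGLRenderer`, `Direct3DRenderer`, `Api` and `createRenderer` are hypothetical, and the two backends only count triangles instead of calling a real graphics API.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical interface: the rest of the engine only talks to this class.
class Renderer {
public:
    virtual ~Renderer() {}
    virtual std::string apiName() const = 0;
    virtual void drawTriangles(int count) = 0;
    int trianglesDrawn() const { return drawn_; }
protected:
    int drawn_ = 0;
};

// Hypothetical OpenGL backend (a real one would issue OpenGL calls here).
class OpenGLRenderer : public Renderer {
public:
    std::string apiName() const override { return "OpenGL"; }
    void drawTriangles(int count) override { assert(count >= 0); drawn_ += count; }
};

// Hypothetical Direct3D backend (a real one would issue Direct3D calls here).
class Direct3DRenderer : public Renderer {
public:
    std::string apiName() const override { return "Direct3D"; }
    void drawTriangles(int count) override { assert(count >= 0); drawn_ += count; }
};

enum class Api { OpenGL, Direct3D };

// The only place in the engine that knows which backend is in use.
std::unique_ptr<Renderer> createRenderer(Api api) {
    if (api == Api::Direct3D)
        return std::unique_ptr<Renderer>(new Direct3DRenderer());
    return std::unique_ptr<Renderer>(new OpenGLRenderer());
}
```

Engine code would call `createRenderer(Api::OpenGL)` once and then use only the `Renderer` interface, so swapping the rendering API does not touch the rest of the engine.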

OpenGL

The New Catalyst 7.11, “Bug-Inside” Flavor

December 10, 2007 JeGX Comment

ATI has just delivered the new Catalyst 7.11 for our lovely Radeons. But it looks like shipping buggy drivers, especially for new cards, is becoming a habit for the folks at ATI! Remember the Catalyst 7.9, which finally fixed a big shadow-map bug that was only visible on the Radeon 2k. Well, it is the same story with Cat7.11: they are buggy on the Radeon 3k in OpenGL, where it is impossible to use more than one dynamic light in GLSL shaders! That is quite a bug! For now I have only tested under WinXP, so maybe it is better under Vista.

Apart from this bug (there are surely others, but I have not run enough tests to know), Cat7.11 are the first drivers that support the Radeon HD 3870. The internal version number of Cat7.11 is 8.432.0.0.

Cat7.11 can be downloaded here:

WinXP 32-bit: [DOWNLOAD]
Vista 32-bit: [DOWNLOAD]

News en Vrac

Launch of the Infamous Blog.

December 10, 2007 JeGX 1 Comment

Here is the opening post of this new blog. I am stopping JeGX’s DevBlog for all sorts of reasons and starting this new blog, which is more general but still tech-oriented! What’s more, it will be in French only, because shouting out in rage or hurling insults is much easier in French than in English.

More seriously, this blog will talk about software (utilities, video games, …), hardware (mostly graphics, of course!), code (well, yes!) and anything else that seems interesting to me on the tech side.

Right, that’s done…

Tools de JeGX

Fur Rendering Benchmark Used in a Review

October 27, 2007 JeGX Comment

Funny, my furry benchmark has been used in a graphics card review along with 3DMark and Lost Planet. Pretty cool 😉

Tools de JeGX

Catalyst 7.9, Radeon 2900 and Surface Deformer

September 12, 2007 JeGX Comment

According to the oZone3D.Net forums, Catalyst 7.9 seems to unleash the ATI Radeon 2900 GPU. The Surface Deformer benchmark requires a lot of vertex processing horsepower. With Catalyst versions prior to 7.9, an ATI 2900 scored around 8000 o3Marks (which was already high). Now, with Catalyst 7.9, the 2900 scores 15000 o3Marks. Incredible!!! Why such a big jump in OpenGL performance?

My first thought is that ATI has managed to make proper use of the unified architecture of the R600 GPU. With a unified architecture, the workload is distributed over all shader processors regardless of the type of shader program (vertex or pixel). So if the vertex shader needs more processing power than the pixel shader, more shader processors are assigned to the vertex shader. My second thought: the unified architecture required new kernel code in Catalyst, and ATI has simply optimized the R600 code path. A driver for a modern GPU like the R600 is a very complex piece of code, and optimizing it is a huge task…

OpenGL

Catalyst 7.9 and Radeon 2K Shadow Mapping Bug

September 12, 2007 JeGX Comment

I found this bug while coding a new small soft shadows demo for GPU Caps Viewer. Soft shadows are built on shadow mapping, and my OpenGL shadow mapping code works perfectly on all Geforce 6/7/8 and Radeon 1k cards but not on Radeon 2K (2400/2600/2900). Why? Because the shadow mapping comparison function had a serious bug! In short, the comparison function is supposed to return a boolean value (0 if the fragment is in shadow, 1 otherwise), but before Catalyst 7.9 it returned, on Radeon 2K, the depth buffer value (as if the comparison function were disabled). This bug is now just a memory, since Catalyst 7.9 has fixed it.
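
As a rough illustration of the difference between the expected and the buggy behaviour, here is a CPU-side C++ sketch. It is an emulation written for illustration, not driver or demo code: both function names are hypothetical, and it assumes a less-or-equal test (the behaviour of GL_LEQUAL) between the fragment's light-space depth and the depth stored in the shadow map.

```cpp
#include <cassert>

// Expected behaviour with the depth comparison enabled:
// 1.0 when the fragment is lit, 0.0 when it is in shadow.
float shadowLookupCompared(float fragmentDepth, float storedDepth) {
    assert(storedDepth >= 0.0f && storedDepth <= 1.0f);
    return (fragmentDepth <= storedDepth) ? 1.0f : 0.0f;
}

// Behaviour described above for Radeon 2K before Catalyst 7.9:
// the raw depth value comes back, as if the comparison were disabled.
float shadowLookupUncompared(float /*fragmentDepth*/, float storedDepth) {
    return storedDepth;
}
```

A shader that multiplies its lighting by the lookup result gets a clean 0 or 1 in the first case and an arbitrary depth value in the second, which is why the shadows looked wrong.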

I guess we can thank Quake Wars, an OpenGL game that was released a few days ago. For this game (which is really nice), ATI has fixed all major OpenGL bugs.

Tools de JeGX

Fur Rendering Benchmark

August 26, 2007 JeGX Comment

I officially released the fur rendering benchmark 4 days ago. So let’s take a quick look at the first feedback available on forums around the web.

Homepage: www.ozone3d.net/benchmarks/fur/

1 – The fur rendering benchmark isn’t CPU dependent, which is a very good thing for a graphics card benchmark. No matter the CPU speed, the result for a given card stays about the same:
– oZone3D.Net forums
– extremeoverclocking.com forums
– extremeoverclocking.com forums
“yes the first propper gpu bench ive come across, i really like this…. it stops all the arguments about memory timings and cpu speeds. its a good equaliser as all our systems can provide that 10% cpu info the gpu needs…”

2 – 8800GTX vs 2900XT
ATI 2900XT seems to beat NVIDIA 8800GTX. In all forums, the 2900XT is ahead:
– oZone3D.Net forums
– overclockers.co.uk forums
– extremeoverclocking.com forums

3 – This benchmark seems to load the graphics card heavily, which makes it a cool GPU burner and stress/stability test utility.
– clubic.com forums : “On the other hand, I have never seen my graphics card get this hot: about 100° during the test :ouch:”
– oZone3D.Net forums: “This thing just succeeded to shut down twice the PSU, caused by overloading of the graphics board!!”

I did a quick test with my 8800GTX:
– gpu core temp at rest: 58°C
– gpu core temp at load: 83°C

Okay, that’s all for that small benchmark. :winkhappy:

OpenGL, Programming

GLSL: ATI vs NVIDIA

May 29, 2007 JeGX Comment

Today, two new differences between Radeon and Geforce GLSL support.

1 – float2 / vec2
vec2 is the GLSL type that holds a 2D vector. vec2 is supported by both NVIDIA and ATI. float2 is also a 2D vector type, but it belongs to Direct3D HLSL and Cg. GLSL compilation on Geforce is done via the NVIDIA Cg compiler; here is the GLSL version displayed by GPU Caps Viewer: 1.20 NVIDIA via Cg compiler. That explains why a GLSL source that contains a float2 compiles on NVIDIA hardware. The ATI GLSL compiler, however, is strict and doesn’t recognize the float2 type.

2 – the following line:

vec2 vec = texture2D( tex, gl_TexCoord[0].st );

is valid for the NVIDIA compiler but produces an error with the ATI compiler. Once again, the ATI GLSL compiler has done a good job. texture2D() returns a 4D vector, so it cannot be assigned to a vec2 directly. The right syntax is:

vec2 vec = texture2D( tex, gl_TexCoord[0].st ).xy;

Conclusion: always test your shaders on both ATI and NVIDIA platforms unless you target one platform only.

Hardware, Programming

R600 is VTF-capable

May 29, 2007 JeGX Comment

“All of the fetch and filtering capabilities are available to each thread type, making the samplers completely agnostic about what’s using them.”

This line from the Beyond3D article on the R600 means that vertex, geometry and pixel shaders can all access texture samplers. So Vertex Texture Fetching is now available on the Radeon 2k series :thumbup: What’s more, the R600 can handle very large textures, up to 8192×8192, just like the G80.

OpenGL, Tests et Reviews, Tools de JeGX

Dynamic branching and NVIDIA Forceware Drivers

May 22, 2007 JeGX Comment

Several weeks ago, I posted a thread on Beyond3D about my dynamic branching benchmark. I wondered why dynamic branching performance on Geforce 7 was worse than on Geforce 6 or 8. I believe I’ve got the answer: Forceware drivers.

Here are some new results where ratio = Branching_ON / Branching_OFF :

7600GS – Fw 84.21 – Branching OFF: 496 o3Marks – Branching ON: 773 o3Marks – Ratio = 1.5
7600GS – Fw 91.31 – Branching OFF: 509 o3Marks – Branching ON: 850 o3Marks – Ratio = 1.6
7600GS – Fw 91.36 – Branching OFF: 508 o3Marks – Branching ON: 850 o3Marks – Ratio = 1.6
7600GS – Fw 91.37 – Branching OFF: 509 o3Marks – Branching ON: 850 o3Marks – Ratio = 1.6

7600GS – Fw 91.45 – Branching OFF: 509 o3Marks – Branching ON: 472 o3Marks – Ratio = 0.9
7600GS – Fw 91.47 – Branching OFF: 509 o3Marks – Branching ON: 472 o3Marks – Ratio = 0.9
7600GS – Fw 93.71 – Branching OFF: 508 o3Marks – Branching ON: 474 o3Marks – Ratio = 0.9
7600GS – Fw 97.92 – Branching OFF: 505 o3Marks – Branching ON: 478 o3Marks – Ratio = 0.9
7600GS – Fw 100.95 – Branching OFF: 508 o3Marks – Branching ON: 480 o3Marks – Ratio = 0.9

My conclusion: dynamic branching in OpenGL works fine (that is, performance is better than without dynamic branching: ratio > 1) for Forceware <= 91.37. For drivers >= 91.45, the ratio drops below 1. Dynamic branching works as expected on gf6 and gf8, but not on gf7 since Forceware 91.45. So a driver bug is a plausible explanation (and an easily understandable one: in this news we learnt that a Forceware driver is made of around 20 million lines of code, a paradise for a small bug!!!). I’ve also run the test with the simple soft shadows demo provided with the NV SDK 9.5. The results are the same.

I’ve just run the bench with a 7950gx2 and the latest Forceware 160.02, and dynamic branching is still buggy…
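
For readers who want to redo the arithmetic, here is a small C++ sketch that recomputes the ratio from the 7600GS scores listed above and finds the first driver where it falls below 1. The struct and function names are made up for this example. The scores are copied from the table, and the code does not round the ratio the way the table does.

```cpp
#include <cassert>
#include <cstddef>

// One benchmark run: driver version plus both o3Marks scores.
struct BranchingResult {
    double forceware;
    int scoreOff;  // o3Marks with branching OFF
    int scoreOn;   // o3Marks with branching ON
};

// 7600GS results copied from the table above, in the same order.
static const BranchingResult kResults[] = {
    {84.21, 496, 773}, {91.31, 509, 850}, {91.36, 508, 850},
    {91.37, 509, 850}, {91.45, 509, 472}, {91.47, 509, 472},
    {93.71, 508, 474}, {97.92, 505, 478}, {100.95, 508, 480},
};
static const std::size_t kResultCount = sizeof(kResults) / sizeof(kResults[0]);

// ratio = Branching_ON / Branching_OFF
double branchingRatio(const BranchingResult& r) {
    assert(r.scoreOff > 0);
    return static_cast<double>(r.scoreOn) / r.scoreOff;
}

// First driver (in table order) whose ratio is below 1, or -1.0 if none.
double firstRegressedDriver() {
    for (std::size_t i = 0; i < kResultCount; ++i) {
        if (branchingRatio(kResults[i]) < 1.0)
            return kResults[i].forceware;
    }
    return -1.0;
}
```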

Hardware, Tests et Reviews

Quick Review – Asus Silent Square Pro

May 15, 2007 JeGX Comment

Here is the new CPU cooler I installed on my test machine:

I have to say I am quite happy with it. Easy to install, this Asus cooler is nearly inaudible. And it does its heat dissipation job wonderfully, since the fins stay cold at all times. Of course, I suspect my AMD X2 3800+ simply does not run hot enough, but compared with the stock cooler that shipped with the CPU, the difference is clear.

Good hardware :thumbup:

Programming

Embed Your Shader Source Code in Your C/C++ Apps

March 18, 2007 JeGX Comment

The NVIDIA developer blog shows a way to embed shader code in your
Windows exe: blogs.nvidia.com/developers/2007/03/inlining_shader.html.

But this example is not fully operational. I slightly modified the code to make it totally operational (I compiled it on vc++ 6.0):

1) Add a define to your resource.h file:
#define IDF_SHADERFILE 1000

2) Add an entry in your resource.rc file:
IDF_SHADERFILE RCDATA DISCARDABLE "myShader.glsl"

3) Use the resource in your code:
HMODULE hModule = GetModuleHandle(NULL);
HRSRC hResource = FindResource(hModule, MAKEINTRESOURCE(IDF_SHADERFILE), RT_RCDATA);
if(hResource)
{
  DWORD dwSize = SizeofResource(hModule, hResource);
  HGLOBAL hGlobal = LoadResource(hModule, hResource);
  if(hGlobal)
  {
    LPVOID pData = LockResource(hGlobal);
    if(pData)
    {
      // Cast pData to a const char * and you have your shader
      // (the resource memory is read-only).
      const char *shader_code = (const char *)pData;

      // Now do whatever you want with shader_code pointer.
      // Do not forget that shader_code is not a zero-terminated string!
      // Use dwSize to handle that.
    }
  }
}
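
Since the resource data is not zero-terminated, one way to handle it is to copy it into a std::string using the size. The helper below is only a sketch: `makeShaderString` is a hypothetical name, and it uses std::string for brevity rather than anything from the original NVIDIA example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Copies 'size' bytes of resource data into a std::string.
// The string keeps its own zero terminator, which the raw
// resource data lacks.
std::string makeShaderString(const void* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    if (size == 0)
        return std::string();
    return std::string(static_cast<const char*>(data), size);
}
```

With the snippet above this would be called as `makeShaderString(pData, dwSize)`, and `c_str()` on the result can then be passed to any API that expects a zero-terminated string. Alternatively, glShaderSource() accepts an explicit array of lengths, which avoids the copy.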

OpenGL

New NVIDIA OpenGL Extensions Headers

February 24, 2007 JeGX Comment

The new OpenGL header files contain new extension definitions. You can download them from… just a second, I start GPU Caps Viewer and… okay, I got it :thumbup: : from developer.nvidia.com/object/nvidia_opengl_specs.html.

But there are a couple of weird things:

1 – the glext.h version is 28 (#define GL_GLEXT_VERSION 28). The version I use to compile the oZone3D engine renderer is 29, and I have been using that header for more than a year…

2 – the glext.h header does not compile with vc6 (yes, I still use Visual Studio 6!) because of the GL_EXT_timer_query extension. Here is the original piece of code you can find in glext.h:

/*
* Original code - does not compile with vc6.
*/
#ifndef GL_EXT_timer_query
typedef signed long long GLint64EXT;
typedef unsigned long long GLuint64EXT;
#endif

and here is the code I updated for Visual C++ 6:

/*
* Modified code for oZone3D engine - compiles with vc6.
*/
#ifndef GL_EXT_timer_query
	#ifdef _WIN32
		typedef signed __int64 GLint64EXT;
		typedef unsigned __int64 GLuint64EXT;
	#else
		typedef signed long long GLint64EXT;
		typedef unsigned long long GLuint64EXT;
	#endif
#endif

I wonder if the original glext.h compiles with vc7 or vc8. If anyone has the answer, feel free to contact me…
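
A variant of the same workaround can key on the compiler rather than the platform and check the type width at compile time. This is only a sketch under assumptions: the type names `Int64Sketch` and `UInt64Sketch` are made up so they do not clash with glext.h, and the `_MSC_VER` test assumes that every Microsoft compiler accepts `__int64`. The negative-array-size trick stands in for static_assert, which vc6 does not have.

```cpp
// Assumption: all Microsoft compilers understand __int64, so the
// compiler macro _MSC_VER is used as the switch instead of _WIN32.
#if defined(_MSC_VER)
typedef signed __int64 Int64Sketch;
typedef unsigned __int64 UInt64Sketch;
#else
typedef signed long long Int64Sketch;
typedef unsigned long long UInt64Sketch;
#endif

// Old-style compile-time checks: the array size becomes -1, and the
// build fails, if either type is not exactly 8 bytes wide.
typedef char Int64SketchIs8Bytes[sizeof(Int64Sketch) == 8 ? 1 : -1];
typedef char UInt64SketchIs8Bytes[sizeof(UInt64Sketch) == 8 ? 1 : -1];
```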

Tools de JeGX

Quick Review – GPU Caps Viewer

February 21, 2007 JeGX Comment

GPU Caps Viewer is the new tool I have been working on these last few days. It’s the successor of HardwareInfos. GPU Caps Viewer is based on the v3.x branch of the oZone3D engine (while HardwareInfos is based on the oZone3D v2.x branch). In addition to the classic GPU/CPU information and capabilities, GPU Caps Viewer offers two cool features:

– an OpenGL Extensions database. You can see either the extensions supported by the current graphics card or all existing extensions, no matter which graphics board you have. You can quickly select an extension and jump directly to its webpage (SGI or NVIDIA extension specs). I must confess it’s very useful for me.

– a GPU-Burner… that was the hard part of coding GPU Caps Viewer. The GPU-Burner lets you open several 3D windows. Actually you can open as many 3D views as you want (1, 2, 4, 6, 10, 20, …). Each view renders a GLSL toon-shaded object with vsync disabled. You can set the size of each window individually (default size is 400×400). Each 3D view is rendered in its own thread… I’ll let you imagine how hard it is to debug a multithreaded gfx application :raspberry: And because I’m only human, there are always some bugs in my code. But there is a very cool tool that helped me tame the mad threads: ProcessExplorer :thumbup: You can download it here: www.majorgeeks.com/Process_Explorer_d4566.html.

Here is a screenshot of my desktop with 13 instances of the 3D view running at the same time. I will release GPU Caps Viewer very soon. So stay tuned! :winkhappy:

OpenGL

NVIDIA OpenGL Extension Specifications

February 21, 2007 JeGX Comment

Finally NVIDIA releases the specs of the new OpenGL extensions that come with the gf8800. Great news! :thumbup:

These specs are very important for us poor graphics developers, so that we can update our software with the latest cool features. Among these specs is GL_EXT_draw_instanced, which enables geometry instancing. Another extension is WGL_NV_gpu_affinity, which lets you send gfx calls to a particular GPU in a multi-GPU system. It should be cool to see how a 7950GX2 behaves. The GL_EXT_timer_query extension provides a nanosecond-resolution timer to determine the amount of time it takes to fully complete a set of OpenGL gfx calls. There are still so many cool extensions. As soon as I get an 8800 board, I’ll write a little tutorial to cover them.
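
Before using any of these extensions, a program first has to check that the driver actually exposes it. Here is a minimal C++ sketch of such a check, assuming the extension list is a space-separated string like the one glGetString(GL_EXTENSIONS) returns. `hasExtension` is a hypothetical helper, not part of OpenGL or of the oZone3D engine. It matches whole tokens only, so an extension whose name is merely a prefix of another one is not reported by mistake.

```cpp
#include <cassert>
#include <cstring>

// Returns true when 'name' appears as a complete, space-separated
// token in the extension list 'extensions'.
bool hasExtension(const char* extensions, const char* name) {
    assert(name != nullptr && *name != '\0');
    if (extensions == nullptr)
        return false;
    const std::size_t len = std::strlen(name);
    const char* p = extensions;
    while ((p = std::strstr(p, name)) != nullptr) {
        // The match must start and end on a token boundary.
        const bool startsToken = (p == extensions) || (p[-1] == ' ');
        const bool endsToken = (p[len] == ' ') || (p[len] == '\0');
        if (startsToken && endsToken)
            return true;
        p += len;  // skip this partial match and keep searching
    }
    return false;
}
```

A renderer could then guard its instancing path with something like `hasExtension(extList, "GL_EXT_draw_instanced")`, where `extList` is whatever extension string the application has queried.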

Programming

Ageia PhysX SDK for free

November 5, 2006 JeGX Comment

Ageia has announced new licensing terms, allowing its PhysX SDK to be used and its runtime components distributed in all commercial and non-commercial PC projects for free.

This is really good news for the community and for Hyperion! I filled in the registration form and now I hope to receive the download link quickly.

OpenGL

NVIDIA GLSL compiler

November 1, 2006 JeGX Comment

In the demo I received from satyr (see oZone3D.Net forums), there is a toon shader that uses GLSL uniforms. The pixel shader looked like this:

uniform float silhouetteThreshold;

void main()
{
  silhouetteThreshold = 0.32;     

  //... shader code
  //... shader code
  //... shader code
}

This pixel shader compiles fine on nVidia graphics cards but generates an error on ATI. The error is justified, since a uniform is a read-only variable. This is an example of the laxity of the nVidia GLSL compiler. That’s why I code my shaders on ATI: if the code is good for ATI, we can be fairly sure it will be good for nVidia too (of course there are always some exceptions…)