********************************************************************************
³ Previous notes: ³
³ The author assume no responsability for any effects/damage ³
³ this file can produce. ³
³ This file cannot be modified without the author's permission. ³
³ This file can be (and must be !) freely distributed ³
********************************************************************************
I heard not so long ago talking about method of shading called
Gouraud shading, mainly based on the interpolation of the
light intensity along the edge of the polygon and then along
each scanline of it.
The book in wich I read it was saying: "very approximative but
very fast, this method is often used in realtime applications,
because of its low cost in calculation time"(or sumthin like that..).
After, I've seen it in many demos, and didn't understand how
a so slow and uncomfortable algorithm could be used for animating
thousand of vectors at a reasonnable framerate on my computer.
It's only a few months ago I realized how fast and comfortable
This method was and how many doors in the domain of 3d animation
it opened to me.
It's the principle of interpolating values along a polygon using
fixed point math that is the key of many other algorithms,
such as:
 the degenerated Gouraud's: Zgouraud, distancebased Gouraud
 Phong shading
 Zbuffer
 texture mapping
 many other more advanced techniques like:
 environment/reflection mapping
 bump mapping
 Abuffer
 and much more I expect you'll find and write to me !..
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³1) The first step: Gouraud shading ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
I won't repeat here what many other docs already told,
but just make a brief recapitulation.
Here's your polygon:
I > * At each vertice constituing the poly,
I N1 you can compute a normal vector,
/\ (preferably precalculated and rotated
/ \ like the other vertices).
/ \ using this normal vector, you can
/ \> find the light intensity, which is
/ /> simply the cosine of the angle between
</ / N2 the normal vector and the light vector,
> \ / given by:
N4 \ / > >
\ / cos a = (N . L)/[L]*[N]
\/
I where N and L are respectively the
I normal and the light vector and
I > [N] and [L] are the length of these two
v N3 vectors.
It's obvious that these two lengths should value 1.
The intensity is then just the dot product of the two vectors.
* The next step is to know, or just approximate, the intensity
at each point of the poly. to do this, just INTERPOLATE !!
For example: we just compute the intensity at each point of the
poly nicely drawed above, you got for ex. the values a1 and a2
corresponding to the two vectors N1 and N2.
The value of the intensity at the point x,y of the first right
edge is given by:
a = a1 + (y  y1)*((a2  a1)/(y2  y1))
where a1 and a2 are the intensity of the points 1 and 2
y1 and y2 are the ycoord ON THE SCREEN of these two points.
You got now all the values of the intensity along each edge
of the poly. the second part of the algorithm concern
the interpolation of these values along each scanline
of the poly, thus obtaining the intensity at each pixel...
This is done using the same scheme:
a = a1 + (x  x1)*((a2  a1)/(x2  x1))
where a1 and a2 are the intensity of the points 1 and 2
x1 and x2 are the xcoord " " " "
/\
/ \
/ \
/ \
/ \
a1<__________>a2 < scanline
/ <a> \
/ \
/ \
/ \
\ /
\ /
\ /
\ /
\ /
\ /
\ /
\ /
\ /
\/
Of coz it shouldn't be done using one MUL and one DIV per
pixel ! The values (for ex. in the last formula) :
(a2a1) and (x2x1) are precalculated for each scanline,
and so is the value (a2a1)/(x2x1)...
Then you set the first 'a' at the value a1 and the only
operation you'll have to do for each pixel is just an addition
between two fixedpoint values. (a and (a2a1)/(x2x1))
A brief example for the implementation :
for each point of the scanline do:
{asm code} {pseudocode}
MOV AL,DH ; al = a shr 8
ADD DX,BP ; a = a+((a2a1)/(x2x1))
STOSB ; es:[di] = a ; di = di+1
'GoT It ??
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³2) The next step : ZGouraud and Zbuffer ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
So now you got the code (how that 'not yet !!' ??) for
interpolate one single value along a polygon.
you can interpolate this famous intensity, as in the
most classic Gourauds, but there is much more amusing:
just approximate the Z component of each point of the
polygon, using the same method.
If you plot the intensity corresponding to this value
you'll have some kind of 'depth shading' effect.
(also known as 'ZGouraud', which is quite comprehensible!)
Having the Z component of each point is very interesting
when you have two faces intersecting each other.
(f.e. when two objx are colliding !..)
Just set up a huge (320*200 is good) buffer in your mem, where
you put the Zvalue of every pixel on the screen.
when you try to write a pixel, first ask yourself:
'ain't there any point recovering the one I wanna write ?'
In computer language, this is called a 'test'!:
It just means : look if the Zvalue of the current point
is greater than the corresponding value in the Z_buffer.
If it is, don't write it ! if not, you can write it
on the screen (or in a screenequivalent buffer..) and
write its Z value in the Zbuffer !
This method is let's be reasonnable very slow, it means
one test per pixel, plus the refreshment of the Z_buffer
(all points should be set to (infiny) at every frame !),
and a writing operation two times slower than the Gouraud
shading one...
But it is also the most realistic and impressive method
for intersecting objects, Zclipping (which can be done
automatically using an unsigned test..),it allows
immediate depthshading, etc...
If you didn't understood everything, these littal schemas
are gonna help you:
Zbuff: screen:
ÚÄÄÄÄÄÄÄÄÄÄÄÄ¿ ÚÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ ³ ³ ³ ° poly #1
³ ³ ³ ³
³ 1111 ³ ³ °°°° ³ ± poly #2
³ 2222 ³ ³ °°°° ³
³334444 ³ ³°°±±±± ³ Û poly #3
³ 5555 ³ ³ ±±±± ³
³ 2366766 ³ ³ ÛÛ±±Û±± ³
³ 23457 ³ ³ ÛÛÛÛÛ ³
³ 23457 ³ ³ ÛÛÛÛÛ ³
³ 23457 ³ ³ ÛÛÛÛÛ ³
³ 23457 ³ ³ ÛÛÛÛÛ ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÙ ÀÄÄÄÄÄÄÄÄÄÄÄÄÙ
The poly #3 is intersecting the poly #2, which is itself
recovering the poly #1...
WARNING !! these drawings assume that the Zaxis is
pointing towards the observer.
(the greater is the nearer..)
And of coz, again a tiny example of the main loop
implementation: (the rasterizer)
(in 386 assembler, quite more easy than 8088..)
@@:
SHRD EBX,EDX,24 ; ebx = edx shr 8
ADD EDX,EBP ; compute the Z value
CMP BX,Z_BUFF[EDI*2] ; Z > z_buff[di] ??
JG PLOT_IT ; yes ..
INC EDI ; no: nothing
LOOP @B
JMP @F
PLOT_IT:
MOV SCREEN[EDI],AL ; store the color..
MOV Z_BUFF[EDI*2],BX ; store the Zvalue
INC EDI ; next one!...
LOOP @B
@@:
Again, don't forget to reverse the test if you are using another
orientation..
ÚÄÄÄÄÄÄÄÄÄÄÄÄ¿
³3) Textures ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÙ
So now you passed two months trying to understand this shit,
you got it, nicely shaded polys rotating around each other and
intersecting,and so on...
But what about mapping?
It just means put a bitmap onto a polygon, just as if
the bitmap was itself rotating in the space.
(I'm sure you've already seen that somewhere...)
How to do it?
we'll pass over the boring 'it'snotpossibletodofreedirection
texturemappinginrealtimesoI'lljustshowyouhowtomake
floors'
realtime free direction texturemapping IS possible and I'm
gonna show you how...
You know how to interpolate one value along a polygon:
free distorsion/linear mapping , which are the true names
of the method I use,only consists in interpolating
TWO values instead of one.
These two values are just the coordinates (UVs or IJs) of the point in
the texturemap corresponding to the point on the poly.
Well, just consider the poly we used in the Gouraud shading example:
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÂÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³ iB,jB ³ ³
³ /\ ³< on the screen, it looks like:³
³ / \ ³ ³
³ / \ ÃÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ´
³ / \ ³ ³
³ i1,j1 / \ i2,j2 ³ ³
³ <> < scanline³ in the texture, it looks like: ³
³ / <i,j> \ ÃÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ´
³ / \ ³ ³
³ / \ ³ i1,j1 ³
³ / \ ³ iA,jA <+> iB,jB ³
³ iA,jA \ / ³ ÚÄÄÄÄÄÄÄÄÄÄÄ\ÄÄÄÄÄÄ¿ ³
³ \ / ³ ³ \ ³ ³
³ \ / ³ ³ B I T \ ³ ³
³ \ / ³ ³ \ ³ ³
³ \ / ³ ³ \ ³ ³
³ \ / ³ ³  M A P \ ³ ³
³ \ / ³ ³ \+ i2,j2 ³
³ \ / ³ ³ ³ ³
³ \ / ³ ³ ³ ³
³ \/ ³ ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ ³
³ ³ ³
³ ³ ³
³ ³ ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÁÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
As you can see, there are two interpolations, and, for each point, you
got to find the offset corresponding to the two interpolated
coordinates.
The pseudocode looks like:
for each point on the scanline:
{ i=i+inc_i ;
j=j+inc_j ;
c=texture[i,j] ;
plot (c) ;
}
The main difficulty is to code it in assembler, and using the less
instructions as possible.
I think it is possible to do it in six instructions per pixel.
Has anybody got better ?
code sample:
(assuming a 256*X bitmap)
; EDX and ESI are the shifted coord.
; EBP and INC_J are the corresponding shifted increments.
; ECX is the nb. of points on the scanline
; EDI is the offset in the screen
@@:
MOV BX,SI ; bx < l2 shl 8
MOV BL,DH ; bx < l2 shl 8 + l1
MOV AL,TEXTURE[BX] ; get the pixel in the bitmap
STOSB ; aff. le pixel
ADD EDX,EBP ; incr. i
ADD ESI,INC_J ; incr. j
LOOP @B
(I'll tell you later how to get rid of the 'INC_J' var...
asm fans don't worry ! cf. tricks&tips: unrolled loops )
Well, OK, this is not true texturemapping (As I said, this
is only freedisto), but the illusion works, and the method
is fast.
(the game DESCENT implements true texturemapping, if
there is a guy somewhere who knows how: please tell me !!)
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³4) A word about Phong shading ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
Well, I ain't gonna give here THE method for Phongshade
your polys, I don't know wich is the fastest one, but
I am able to tell you some tricks I collected here and there..
(Usenet discussions,f.e..).
First you should know that many Phong that you've seen in
demos or other are fakes, I mean they are just modified
Gouraud shading. We can demonstrate that with a sufficient amount
of polygons and using the Phong illumination model
( the shadings in the palette are nonlinear ), the differences
between Gshading and Pshading are very small (but they do exist..).
The main principle is that instead of interpolating intensities, like
in Gouraud, just interpolate normal vectors along your scanlines..
It means three values to compute for every pixel (x,y and z coords).
Then, when you've got your normal vector for each point, you gotta
renormalise it,it means divide each coordinate by the current length
of your interpolated vector. It must be done to compute properly the dot
product between this vector and the light vector, then you've got your
intensity !..
A little drawing ?: x
/ light src
/
/
..... N ....
..... : ....
N1 ... : .... N2
................:...............
\ I /
\ I /
\ I /
\____________I___________/
< scanline
N1 and N2 are the startandend vectors,
N is the interpolated vector,
the part of the N vector in ':' is the result of the renormalisation..
As you can expect, this is impossible to compute in realtime..
(using this algorithm, I mean!..)
you've got to update 3 values (the coordinates of the
interpolated vector), find the length, divide
the coords by this length and then compute a dot product...
It takes something like 3 ADDs, 3 DIVs and 6 MULs per pixel !!
There are some speedup tech.. I'm gonna make here a brief overview.
First: don't compute every pixel: you can easily display three
pixels of the same color together, or better: compute the true
values every 4 pixels and interpolate between them...
(I already told ya: 'interpolation' is THE word..)
next: linear approximation is good, but quadratic is better !
I've read a very interesting article in IMPHOBIA mag Nø10,
written by .. mhh .. don't remember .. (greetingz!),
talking about QUADADDERS : a very efficient method for
interpolating between three values using a parabolic curve.
It only takes 2 ADD's per pixel !
you just have to compute three values on your scanline (with kewl
MUL's and DIV's !!), and then interpolate (again!) between 'em.
Just read this article, coz the development is too long for this txt...
another one: 'FAST PHONG' by Mister Bishop (??), using taylor
approximations... I didn't read it. :/
see ACM computer graphics 20(4) pp.103106 '1986
Some other methods were developped, using angular or biangular
interpolations.. (see f.e. 'faster Phong shading via angular interpo
lation', Kujik & Blake, computer graphics forum 8 (1989)).
Also think about spherical coordinates and lookup tables.. ;)
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³5) Tricks and tips ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
a) fixedpoint maths

For those who didn't understand how I was doing my incrementations,
here's a brief recap of the fixedpoint principle:
We assume the values you're computing are made of two parts:
the integer part and the decimal part, the one after the point.
these two parts are contained in one number and you can decide
which bits are significant for the int. or dec. part.
It's very easy to implement : You just have to shift your numbers
of n bits, meaning you multiply them by 2^n. When you want your
integer approx., just 'deshift' them by the right number of bits.
It's even more easier when you shift your values of 8 bits.
The 80X88 architecture allows you to split your regs in two
8bits parts. So if you've got your shifted value in AX,
the integer approximation will be contained in AH.
If you work with 16 bits values, for ex. in AX, your shifted
val should be in EAX, so that the extended part contains the
integer.(You can even work with 24 bits values, easy to implement
when smartly using the SHLD/SHRD 386 instr.)
b) single bytes suckz

All the mainloops examples I wrote only write one byte at each loop.
It's much more efficient when possible to write two bytes (also
called a word huhu!) at the same time.
For example, here's the new Gouraud main loop :
(5 inst. for two pixels!)
SHR ECX,1
@@:
MOV AL,DH
ADD EDX,EBP
MOV AH,DH
STOSW
ADD EDX,EBP
LOOP @B
But warning ! This only works when you are writing at even offsets.
Otherwise it won't speed up much, because when the CPU write a word on
a byte border, it's as slow as writing two bytes 'manually'.
So you'll have to check  before entering the loop  the parity
of your first offset, first write one byte if it's odd, and the same
when the loop is finished...
c) about the 'loop' instruction

A word about the 'loop' instruction: I used it for clarity in my codes,
but the fastest method for looping is the following:
instead of: use:
LOOP @B DEC ECX
JNZ @B
The result is the same but it works much more faster !
(Perhaps no more on P5, ..must be verified..)
d) unrolled loops

The last tip is fine but the real fastest method for looping
is ... not to loop !
Don't you guess ?
The trick is to 'unroll' your loop, I mean when you know the
maximum number of times you may loop (usually 320 for a scanline ,
let's say MAX) you write all the instruction one after one.
When you enter your loop just jump at the offset LABEL+(MAXECX)*N,
where N is the number of bytes of the instr. constituing your loop.
Of coz you gotta use the 'REPEAT MAX (...) ENDM' macroinst for
writing the code, which induces a space loss, but it's really
worth it.
For the Gouraud:
JMP ... ; do it yourself, big boy !
REPEAT 320
MOV AL,DH ; al = a shr 8
ADD DX,BP ; a = a+((a2a1)/(x2x1))
STOSB ; es:[di] = a ; di = di+1
ENDM
I said when talking about the texture mapping that it was possible
to get rid of the var : just replace it by ECX which is not used
anymore in the loop !
***** last minute *****
It seems that the unrolled loops are not so efficient
if not less on the new pipelined/"RISC" architecture of the Pentium..
I am not a hardware freak, but so many people told me that
I think I just have to believe it !
It's your matter to know wich config your prodz should work on..
And one way to know is TESTING.
e) Zbuffer optimizations

Yes, there are many !
DO you know what a depthsort is ?
Well, it's just an implementation of the famous painter's algorithm:
sort your polygons 'backtofront', so that the nearest polys are
recovering the furthest.
For Zbuffer, just reverse : sort your polys 'fronttoback',
so that with a bit of luck you just write the necessary pixels
on the screen, and not the one which are going to be recovered.
John De Goes  creditz !  told me some other trickz too:
f.e:
 you normally don't have to clear your Zbuffer every frame
 in some special cases, you don't have to test every pixel
on your scanline
 You should be careful too on the Pentium branchprediction
chip(see 'lastminute' above): if most of the pixel of the
poly will be written,execute the jump only if the current
point is not written..
 I heard talking about 'fuzzy' algorithms wich may be helpful
(just heard..)
(see 'Zbuffinf.txt', kewl doc)
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³6) conclusion words³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
I hope this doc will be helpful for beginners in 3d coding.
It's the kind of help I would have appreciated when I was
beginning one year ago.
It perhaps seems a bit confused and the level of each part
may be variable, but I didn't write it in one time, so
if some explanations are not clear, don't hesitate to ask me..
(Sorry for the bad english. I speak french, so be indulgent..)
I still have much to do in 3d coding:
 implement a TRUE texturemapping algo,
not a linear distorsion.
 I will probably releaze a txt about Phong shading,
including a (more or less) efficient implementation.
 BSP trees and/or Zbuffer for a 3d virtual world.
 The use of extended VESA/SVGA graphic modes such as
Hicolor or 8bitsHirez modes.
 Pentium optimizations trix.
any help is appreciated..!
ÚÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄ¿
³7) Credits and greetingz ³
ÀÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÄÙ
Doc written by :

KARMA [ Tfl/TdV = The Flamoots/The Dark Vision ]
(Jean Cardinal)
CONTACT ME !!
for anything: discussion,advices,remarks,flaming,collaboration...
Email:
jcardin@is1.ulb.ac.be
Thanx &/or greetingz to:

MORFLAME [TfL/TdV]
we gonna do good work together!
JOHN DE GOES [on Compuserve]
Cool docs !
ALL THE MEMBS OF MY CREW :
type one,sam,cybersurfer,
fred,bismarck,gopi,fly,zoltan,Rod...
IMPHOBIA
'nice' is the word
and everyone I forgot...
