Java-Gaming.org Hi !
Featured games (91)
games approved by the League of Dukes
Games in Showcase (799)
Games in Android Showcase (237)
games submitted by our members
Games in WIP (865)
games currently in development
News: Read the Java Gaming Resources, or peek at the official Java tutorials
 
    Home     Help   Search   Login   Register   
Pages: [1]
  ignore  |  Print  
  Fill Ellipse from a fragment shader  (Read 25071 times)
0 Members and 1 Guest are viewing this topic.
Offline divxdede
« Posted 2015-01-18 12:40:09 »

I always rewriting some old 2D stuffs using old OpenGL glBegin/glEnd for use modern openGL statements.

I'm writing a fillOval implementation using a fragment shader instead of pushing a lot of vertices for drawing my ellipse.

The idea is to paint a classic rectangle (2 triangles composition) and use a fragment shader that discard all fragment that is not inside the ellipse.

As is, i push on OpenGL in order to draw my ellipse :
  - 4 vertices
  - 6 indices 
  - 2 "vec2" uniforms
       - the center of the ellipse
       - the radius of the ellipse

my fragment shader (responsible to discard fragments) looks like this one:

1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
17  
18  
19  
20  
21  
22  
23  
24  
#version 330

uniform vec2 u_centerOval;
uniform vec2 u_radiusOval;

layout(origin_upper_left) in vec4 gl_FragCoord;

out vec4 outputColor;

void checkFragment() {
   /**
    * Ellipse equation =  (x/a)2 + (y/b)2 = 1
    *     a = horizontal radius
    *     b = vertical   radius
    */

   
    float e1 =  ( gl_FragCoord.x - u_centerOval.x ) / ( u_radiusOval.x );
    float e2 =  ( gl_FragCoord.y - u_centerOval.y ) / ( u_radiusOval.y );
   
    float d  = (e1 * e1) + (e2 * e2);
    if( d > 1 ) {
       outputColor = vec4(1.0 , 0.0 , 0.0 , 1.0 ); // discard;
    }    
}



So, it works not so bad. I have an "aliasing" issue, but i will see this later.

But now, if i paint with this statement:
1  
2  
  g2d.setColor( Color.YELLOW );
  g2d.fillOval( 400 ,  50 ,  200 ,  100 );


i obtain this

But if i apply an affine transformation like this:

1  
2  
3  
4  
 g2d.transform( AffineTransform.identity().rotate( 0.2f , 400 + 100 , 50 + 50 ) );
 g2d.setColor( Color.YELLOW );
 g2d.fillOval( 400 ,  50 ,  200 ,  100 );
g2d.setTransform( null );


i obtain this

That's quite normal since my fragment shader don't take in account my affine transform.
It's only my vertex shader that use it (and it rotate the rectangle as well).

i should inverse my transformation on gl_FragCoord in order to compute the ellipse equation ?
It's a pain, all matrices are not invertible. I can pass it as an uniform.

Have you a better idea ?
At this point, maybe this technical approach is completely crazy but at first  thinking i found it quite fun.

Best regards,
S├ębastien.
Offline matheus23

JGO Kernel


Medals: 138
Projects: 3


You think about my Avatar right now!


« Reply #1 - Posted 2015-01-18 12:48:48 »

I'm writing a fillOval implementation using a fragment shader instead of pushing a lot of vertices for drawing my ellipse.

I'm sorry I don't have the time to look deep into this, but what I have been told, when trying to "optimize" the circle drawing  by drawing a rectangle and determining which pixels to fill in the fragment shader I have been told that simply pushing lots of verticies is faster. You might need to benchmark this. I didn't test it, but that's what I have been told Smiley

See my:
    My development Blog:     | Or look at my RPG | Or simply my coding
http://matheusdev.tumblr.comRuins of Revenge  |      On Github
Offline lcass
« Reply #2 - Posted 2015-01-18 13:20:46 »

Generating it will be a lot faster if you use a lot of vertices. The issue you have here is you are performing thousands of expensive operations in your fragment stage when it could simply be just a color set. If you just have 1000 non overlapping vertices that have been generated outside of the shader will be much more efficient.
Games published by our own members! Check 'em out!
Legends of Yore - The Casual Retro Roguelike
Offline divxdede
« Reply #3 - Posted 2015-01-18 13:55:36 »

When using the "vertices" solution, the main problem is to determine how many vertices we should use (depending on the resolution and size of the ellipse).
Using the fragment technics, it's openGL that determine the "size" of the fragment and you haven't to worry about.

Another technic is to use by example the Bresenham's algorithm and you don't have to worry about determining how many vertices you need. but:
   -  when you push only points, it's quite ok, when you want lines, you need to set vertices in the correct order that is a pain with such algorithm.
   -  using the plain old trigo algo is enough but i don't like to compute cos/sin for every points.

i give a try to the bresenham's and will make some benchs
Offline Roquen

JGO Kernel


Medals: 518



« Reply #4 - Posted 2015-01-18 14:02:12 »

Sticking with brute force rendering...the orientation of the bounding rectangle gives you the rotation information.
Offline nsigma
« Reply #5 - Posted 2015-01-18 14:22:23 »

If you want to do this in a fragment shader, wouldn't you be better using texture coordinates rather than gl_FragCoord and your uniforms?

Praxis LIVE - hybrid visual IDE for (live) creative coding
Offline cylab

JGO Kernel


Medals: 195



« Reply #6 - Posted 2015-01-18 15:04:54 »

https://bitbucket.org/TwoLivesLeft/core/wiki/EllipseShader

Mathias - I Know What [you] Did Last Summer!
Offline divxdede
« Reply #7 - Posted 2015-01-18 16:16:56 »

I give it a try to the Bresenham algorithm

Some result here (i know there is no reason to be so enthousiast Tongue)

1  
2  
3  
      
g2d.setColor( Color.YELLOW );
g2d.fillOval( 400 ,  50 ,  200 ,  100 );







1  
2  
3  
4  
5  
      
g2d.setColor( Color.YELLOW );
g2d.transform( AffineTransform.identity().rotate( 0.2f , 400 + 100 , 50 + 50 ) );
g2d.fillOval( 400 ,  50 ,  200 ,  100 );
g2d.setTransform( null );






 
1  
2  
3  
4  
5  
      
g2d.setColor( Color.YELLOW );
g2d.transform( AffineTransform.identity().rotate( 0.2f , 400 + 100 , 50 + 50 ) );
g2d.drawOval( 400 ,  50 ,  200 ,  100 );
g2d.setTransform( null );






1  
2  
3  
4  
g2d.setPaint( texture ); // set a Texture instead of a color to fill it
g2d.transform( AffineTransform.identity().rotate( 0.2f , 400 + 100 , 50 + 50 ) );
g2d.fillOval( 400 ,  50 ,  200 ,  100 );
g2d.setTransform( null );




Here the implementation using Breshenham.
I just have a difficulty to pre check buffers size of my batcher. So ATM i must check a BufferOverFlowException that is not quite fun...

I don't do any benchmarks for now but i find this solution much functionnal than the fragment shader (but less fun ^^)

1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
17  
18  
19  
20  
21  
22  
23  
24  
25  
26  
27  
28  
29  
30  
31  
32  
33  
34  
35  
36  
37  
38  
39  
40  
41  
42  
43  
44  
45  
46  
47  
48  
49  
50  
51  
52  
53  
54  
55  
56  
57  
58  
59  
60  
61  
62  
63  
64  
65  
66  
67  
68  
69  
70  
71  
72  
73  
74  
75  
76  
77  
78  
79  
80  
81  
82  
83  
84  
85  
86  
87  
88  
89  
90  
91  
92  
93  
94  
95  
96  
97  
98  
99  
100  
101  
102  
103  
104  
105  
106  
107  
108  
109  
110  
111  
112  
113  
114  
115  
116  
117  
118  
119  
120  
121  
122  
123  
124  
125  
126  
127  
128  
129  
130  
131  
132  
133  
134  
135  
136  
137  
138  
139  
140  
141  
142  
143  
144  
145  
146  
147  
148  
149  
150  
151  
152  
153  
154  
155  
156  
157  
158  
159  
160  
161  
162  
163  
164  
165  
166  
167  
168  
169  
170  
171  
172  
173  
174  
175  
176  
177  
178  
179  
180  
/**
     *  https://sites.google.com/site/ruslancray/lab/projects/bresenhamscircleellipsedrawingalgorithm/bresenham-s-circle-ellipse-drawing-algorithm
     */

    private void paintOvalBresenham(float x0, float y0, float rx, float ry , boolean fill , Graphics2DAttributes g2dAttributes , boolean overflowed ) {
       /**
        * Basics primitives perform pre check on all buffers of this batcher before pushing datas like:
        *      // 1 primitive 4 vertices and 6 indices.
        *      checkBufferSize( 1 , 4 , 6 );  
        *    
        * This implementation is quite more complicated and i can't determine how much vertices and indices i will use.
        * So, i don't do any pre check but if i have a buffer overflow, i discard this primitive, flush() and restart this paint process
        *
        * I can assume at least to have 1 primitive enought
        */

       checkBufferSize( 1  , 0 , 0 );
       
       /**
        * Get the primitive
        */

       Primitive p = getPrimitive( ( fill ? GL11.GL_TRIANGLE_FAN : GL11.GL_LINE_LOOP ) , g2dAttributes );
       
      float a2  = rx * rx;
       float b2  = ry * ry;
       float fa2 = 4 * a2, fb2 = 4 * b2;
       
       /** Bresenham algorithm compute 2 octants and determine 6 others by symetric computations
        *    - Theses 2 first octants are done without the use of cos/sin operations but use (* +)
        *    - Theses 6 others octants are done only with translation ( + or - operations)
        *
        *  The difficulty is able to order vertices in the "indices buffer" sinces points are not computed sequentially...
        *  
        *  Which is the order of points computation:
        *    - Each octant is numbered from 1 to 8. We compute octants 1,2,3,4 (multiplexed) and after we compute octants 5,6,7,8 (multiplexed)
        *    - Some octants as computed in a clockwise and some others in counterclockwise
        *    - The drawing process draw octants in this order 1 , 5 , 7 , 3 , 4 , 8 , 6 , 2 in the counterclockwise.
        *    - Some octant are clockwise and some others in counterclockwise by nature of symetric computation used.
        *               - Octant 1 : counterclockwise
        *               - Octant 5 : clockwise
        *               - Octant 7 : counterclockwise
        *               - Octant 3 : clockwise
        *               - Octant 4 : counterclockwise
        *               - Octant 8 : clockwise
        *               - Octant 6 : counterclockwise
        *               - Octant 2 : clockwise
        *                
        *  
        *   - Octants topology:
        *  
        *                   ***************
        *              *****   \ 4 | 3  /  *****
        *            **    8     \ |  /    7    **
        *          **             \|/             **
        *          *---------------X---------------*
        *          **      6     / | \     5      **
        *            **         /  |  \         **
        *              *****  /  2 | 1  \  *****
        *                   ***************
        */

       int sizeOctants1234 = 0;
       int sizeOctants5678 = 0;
       
       try {
          /* first half */
          float x, y, sigma;  
          for (x = 0, y = ry, sigma = 2*b2+a2*(1-2*ry); b2*x <= a2*y; x++)
          {
             p.pushVertex( x0 + x , y0 + y ); // Octant 1
             p.pushVertex( x0 - x , y0 + y ); // Octant 2
             p.pushVertex( x0 + x , y0 - y ); // Octant 3
             p.pushVertex( x0 - x , y0 - y ); // Octant 4
             sizeOctants1234++;
             
              if (sigma >= 0)
              {
                  sigma += fa2 * (1 - y);
                  y--;
              }
              sigma += b2 * ((4 * x) + 6);
          }
   
          /* second half */
          for (x = rx, y = 0, sigma = 2*a2+b2*(1-2*rx); a2*y <= b2*x; y++)
          {
             p.pushVertex( x0 + x , y0 + y ); // Octant 5
             p.pushVertex( x0 - x , y0 + y ); // Octant 6
             p.pushVertex( x0 + x , y0 - y ); // Octant 7
             p.pushVertex( x0 - x , y0 - y ); // OCtant 8
             sizeOctants5678++;
   
             if (sigma >= 0)
              {
                  sigma += fb2 * (1 - x);
                  x--;
              }
              sigma += a2 * ((4 * y) + 6);
          }
       }
       catch( BufferOverflowException e ) {
          /** Ouch, we have not enough place for put all vertices, we flush this batcher without the primitive
           */

          if( overflowed ) throw new IllegalStateException("OpenGLSupport with too small buffer size for this method");
          primitiveCount--; // discard this primitive
          flush();
          paintOvalBresenham(x0, y0, rx, ry, fill, g2dAttributes , true);
          return;
       }
       
       /** Now we put all vertices, we know how many indices we must have
        *  We can do a pre check in order to avoid the BufferOverflowException...
        */

       if( ! checkBufferSize( 0 , 0 , ( 4 * sizeOctants1234 ) + ( 4 * sizeOctants5678 )  , false ) ) {
          if( overflowed ) throw new IllegalStateException("OpenGLSupport with too small buffer size for this method");
          primitiveCount--; // discard this primitive
          flush();
          paintOvalBresenham(x0, y0, rx, ry, fill, g2dAttributes , true);
          return;
       }
       
       /** Push indices
        */

       IntBuffer buffer = indicesBuffer.getBuffer();
       
       final int strideOctants1234 = 4 * sizeOctants1234;
       final int strideOctants5678 = 4 * sizeOctants5678;
       
       /**
        * Computing where start each octants
        */

       final int offsetOctant1 = 0;                                                      // forward
       final int offsetOctant4 = 3;                                                 // forward
       final int offsetOctant6 = strideOctants1234 + 1;                              // forward
       final int offsetOctant7 = strideOctants1234 + 2;                              // forward
       final int offsetOctant5 = strideOctants1234 + strideOctants5678 - 4;              // backward
       final int offsetOctant3 = strideOctants1234 - 2;                             // backward
       final int offsetOctant8 = strideOctants1234 + strideOctants5678 - 1;             // backward
       final int offsetOctant2 = strideOctants1234 - 3;                          // backward
       
       // octant 1
       int offset = offsetOctant1;
       int stride = 4;
       for(int i = 0 ; i < strideOctants1234 ; i += stride ) {
          buffer.put( offset + i );
       }
       // octant 5
       offset = offsetOctant5;
       for(int i = 0 ; i < strideOctants5678 ; i += stride ) {
          buffer.put( offset - i );
       }
       // octant 7
       offset = offsetOctant7;
       for(int i = 0 ; i < strideOctants5678 ; i += stride ) {
          buffer.put( offset + i );
       }
       // octant 3
       offset = offsetOctant3;
       for(int i = 0 ; i < strideOctants1234 ; i += stride ) {
          buffer.put( offset - i );
       }
       // octant 4
       offset = offsetOctant4;
       for(int i = 0 ; i < strideOctants1234 ; i += stride ) {
          buffer.put( offset + i );
       }
       // octant 8
       offset = offsetOctant8;
       for(int i = 0 ; i < strideOctants5678 ; i += stride ) {
          buffer.put( offset - i );
       }
       // octant 6
       offset = offsetOctant6;
       for(int i = 0 ; i < strideOctants5678 ; i += stride ) {
          buffer.put( offset + i );
       }
       // octant 2
       offset = offsetOctant2;
       for(int i = 0 ; i < strideOctants1234 ; i += stride ) {
          buffer.put( offset - i );
       }
       p.end();
   }


S├ębastien.
Offline theagentd
« Reply #8 - Posted 2015-01-18 16:22:49 »

The easiest way to do this in a fragment shader is to NOT rely on gl_FragCoord. If you rotate the coordinates, the result stays the same. A better idea is to assign "texture" coordinates to each corner and have the shader rely on those instead, since those coordinates will be and interpolated correctly after rotation.

Myomyomyo.
Offline pitbuller
« Reply #9 - Posted 2015-01-18 18:21:33 »

The easiest way to do this in a fragment shader is to NOT rely on gl_FragCoord. If you rotate the coordinates, the result stays the same. A better idea is to assign "texture" coordinates to each corner and have the shader rely on those instead, since those coordinates will be and interpolated correctly after rotation.

Another way would be to use rotated ellipse equation but that would be just nuts. http://www.maa.org/external_archive/joma/Volume8/Kalman/General.html
Games published by our own members! Check 'em out!
Legends of Yore - The Casual Retro Roguelike
Offline divxdede
« Reply #10 - Posted 2015-01-18 19:26:16 »

If you want to do this in a fragment shader, wouldn't you be better using texture coordinates rather than gl_FragCoord and your uniforms?


The easiest way to do this in a fragment shader is to NOT rely on gl_FragCoord. If you rotate the coordinates, the result stays the same. A better idea is to assign "texture" coordinates to each corner and have the shader rely on those instead, since those coordinates will be and interpolated correctly after rotation.

Thanks all, it's the right way with the fragment shader. I don't found it alone.... damn Tongue
It's not yet naturals for me to write shaders and understood all subtles with theses !!!

The most hard things with openGL is the debugging !! .
When it don't work as you wish, it's a reallly a painfull to find the cause.

Thanks all, for your advices, but il will stick with the Bresenham implementation.
Offline pitbuller
« Reply #11 - Posted 2015-01-18 19:42:59 »

With modern GPU's I doubt that nothing beats pixel shader variant in speed for dynamics ellipses. If you try to use a lot of triangles you end up with badly shaped thin/small triangles that are not efficient to rasterize. Baking shape to texture does not work well for small or really big shapes.(you also need special distance texture instead of pure shape. With alpha test like shader.). Also nothing can beat analytical anti-aliasing.

1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
void checkFragment() {
   /**
    * Ellipse equation =  (x/a)2 + (y/b)2 = 1
    *     a = horizontal radius
    *     b = vertical   radius
    */

   
    vec2 e =  ( v_texCoord - 0.5 ) * u_invRadius.xy; // can be moved to vert shader.
   
    float d  = dot(e, e);
    if( d > 1.0 ) {
       outputColor = vec4(1.0 , 0.0 , 0.0 , 1.0 ); // discard;
    }    
}


So optimized code is just one vec2 dot operation and logic for handling what to do with the result.
Offline Roquen

JGO Kernel


Medals: 518



« Reply #12 - Posted 2015-01-18 20:29:48 »

All the information is in the rectangle....didn't I say that already.  Wink
Offline divxdede
« Reply #13 - Posted 2015-01-21 21:10:47 »

i'm completing the fragment shader from advices.

The result is here :

The shader look's like

1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
17  
18  
19  
20  
21  
22  
23  
24  
25  
#version 330

varying vec2 v_boundingBoxPosition;

void checkFragment() {

   /**
    * Ellipse equation =  (x/a)2 + (y/b)2 = 1
    *     a = horizontal radius
    *     b = vertical   radius
    *
    *
    * Bounding box is a vec2 that correponds to this rectangle coordinate
    *        (-1,-1)            (+1,-1)
    *                  (0,0)
    *        (-1,+1)            (+1,+1)
    *
    *  The varying result in this fragment shader correspond to [ (x/a) , (y/b) ]
    *  We can perform a dot() for doing the pow(2)
    */

    float d  = dot( v_boundingBoxPosition , v_boundingBoxPosition );
    if( d > 1 ) {
       discard;
    }    
}


The fragment result looks much nicer.

But in term of performance, i bench this for 600 ellipses per frame (i use large buffer for prevent implicit flush while pushing datas):
  - Bresenham : 10 ms for pushing datas on the batcher. 6.8 ms for flush it
  - Shader      : 0.1 ms for pushing datas on the batcher. 3.5ms for flush it.

In both of them i take the worst case but in average it's most:
  - Bresenham : 10 ms for pushing datas on the batcher. 6 ms for flush it.
  - Shader      : 0.1 ms for pushing datas on the batcher. 2.8 ms for flush it.

I must check if i have not some performance issues in the push data method. but in all cases, the flush is also faster in the shader version.

So, ATM the shader way seems the way to go...


EDIT: The bresenham algorithm push 6 floats per vertices (color + 2d position) that mean 24 bytes per vertices
The ellipse drawn use 584 vertices that means 14 016 bytes for an ellipse (14 Ko)
When i draw 600 ellipses, it's come up to 8 409 600 bytes (8.02 Mo).
the FloatBuffer.put(...) take me the 50% of total time of the push sequence. It seems strange that it's longer to fill it in RAM than sending it to the GPU...

EDIT : I updated the fragment shader, i don't use anymore an uniform. the varying vec2 is ready to use and minimize computation inside the fragment shader like pitbuller said.
Offline pitbuller
« Reply #14 - Posted 2015-01-25 00:04:29 »

The shader version is beautiful at it simplicity.
Offline basil_

« JGO Bitwise Duke »


Medals: 418
Exp: 13 years



« Reply #15 - Posted 2015-01-25 00:22:49 »

with AA :

1  
2  
3  
4  
5  
6  
7  
8  
9  
10  
11  
12  
13  
14  
15  
16  
17  
18  
19  
20  
21  
22  
23  
24  
25  
26  
27  
28  
29  
30  
31  
32  
33  
34  
#version 120

varying vec2 v_boundingBoxPosition;

void main() {
   
    float d  = dot( v_boundingBoxPosition , v_boundingBoxPosition ) * 0.5;

    // [quote author=Roquen]
    // fwidth was a bad idea for a function...we should all pretend like it doesn't exist.
    // [/quote]

    // float edge = fwidth(d);

    vec2 pw = fwidth(v_boundingBoxPosition);
    float edge = max(pw.x, pw.y);
 
    // float alpha = step(0.5, d);
    float alpha = smoothstep(0.5 - edge, 0.5 + edge , d);

    // [quote author=ra4king]
    // You still want to do a discard so the depth buffer isn't written to.
    // [/quote]
   
    // [quote author=ra4king]
    // Using equality with floating points is a bad idea
    // [/quote]
   
    // if(alpha == 0.0) discard;
    if(alpha < 0.01) discard;

    gl_FragColor = vec4(1.0,1.0,0.0, alpha);

}
Offline ra4king

JGO Kernel


Medals: 508
Projects: 3
Exp: 5 years


I'm the King!


« Reply #16 - Posted 2015-01-25 18:39:06 »

You still want to do a discard so the depth buffer isn't written to.

Offline Roquen

JGO Kernel


Medals: 518



« Reply #17 - Posted 2015-01-25 19:42:38 »

fwidth was a bad idea for a function...we should all pretend like it doesn't exist.

discard on this type of AA isn't useful...has to be back to front anyway.
Offline basil_

« JGO Bitwise Duke »


Medals: 418
Exp: 13 years



« Reply #18 - Posted 2015-01-25 20:24:47 »

yea, it's not msaa. still better then nothing, even with zero alpha.
Offline ra4king

JGO Kernel


Medals: 508
Projects: 3
Exp: 5 years


I'm the King!


« Reply #19 - Posted 2015-01-26 07:24:53 »

Using equality with floating points is a bad idea Pointing

1  
if(alpha <= 0.01) discard; // much better

Offline Roquen

JGO Kernel


Medals: 518



« Reply #20 - Posted 2015-01-26 08:39:29 »

In this case it's perfect fine.  The only problem would be if smoothstep didn't fullfill its contract.  To cover that hypothetical situation the compare really should be <= 0 rather than some ad-hoc constant.

But again, the discard doesn't buy you any quality.  In either case (with or without) if you render anything behind...it will be wrong.  The advantage of the bounding rectangle wrong is that it will be immediately obvious what the problem is.  The "jaggy" depth buffer wrong could take awhile to figure out...esp if it pops up at a much later date.  (aliased looking thing behind with a strangely colored edge of the forward).

On fwidth:  The problem is that it's a misleading function name.  You expect it to be L2 when it's in fact Linf.  Want you really want is some approximation of pixel coverage, for cheapish I'd go for (L2)2: dFdx2+dFdy2
Offline basil_

« JGO Bitwise Duke »


Medals: 418
Exp: 13 years



« Reply #21 - Posted 2015-01-26 12:42:46 »

i dont know. fwidth^2 isn't going to work out of the box due to the projection.

after all fwidth works window space so it wont return correct values when the surface partially covered by something drawn before. for best result such AA should be handled the same way transparent surfaces are handled.
Offline divxdede
« Reply #22 - Posted 2015-01-26 13:28:02 »

Hi,

I'm not on the "AA" feature (that seems to be another complete topic  persecutioncomplex)
But the discard() is not mandatory if you play with glBlend function by example if you set an equivalent than AlphaComposite.SRC ?

Offline basil_

« JGO Bitwise Duke »


Medals: 418
Exp: 13 years



« Reply #23 - Posted 2015-01-26 13:51:21 »

yes, you're right. discard here is good cos' it avoids writing invisible pixels to the depth-buffer. visually it's the same.
Offline Roquen

JGO Kernel


Medals: 518



« Reply #24 - Posted 2015-01-26 15:01:39 »

I'm not talking about fwidth squared:  (|dx|+|dy|)2  And like I said...the invisible pixels don't matter...you render behind and it's broken.  If the argument becomes performance then you have Pi ab vs. 4ab.  So ~79% of the time you're inside (and the compare and not taking discard hurts you) and the rest you've already performed all the work you're going to do and are only skipping on some writes.

NOTE: I said LInf when I meant L1 above.
Offline basil_

« JGO Bitwise Duke »


Medals: 418
Exp: 13 years



« Reply #25 - Posted 2015-01-26 15:29:48 »

yes i know. dfdx^2+dfdy^2 has the same issues with projections tho'.

in the end if you want a antialiased shape rendered correctly with msaa buffer, early z, any draw-order and blending then you need polygons anyway.

shading it like that is the same as using a texture with alpha. gotta use alpha-to-coverage or sample-shading (which is defeating the idea of analytic AA).
Offline theagentd
« Reply #26 - Posted 2015-01-26 21:40:34 »

Don't use discard() directly. It's more flexible to simply output the alpha to the alpha channel and then do the rest using glEnable()s. Want fast but aliased result? Enable the alpha test. Want alpha-to-coverage? Just enable it. Want supersampling? Enable alpha test+sample shading. If you discard in the shader, you can't use it for alpha-to-coverage. Even discarding on alpha==0.0 is slower, as the if-statement is more expensive than the writing.

Myomyomyo.
Offline basil_

« JGO Bitwise Duke »


Medals: 418
Exp: 13 years



« Reply #27 - Posted 2015-01-26 21:47:20 »

aah, good to know.
Offline pitbuller
« Reply #28 - Posted 2015-01-26 23:34:38 »

Don't use discard() directly. It's more flexible to simply output the alpha to the alpha channel and then do the rest using glEnable()s. Want fast but aliased result? Enable the alpha test. Want alpha-to-coverage? Just enable it. Want supersampling? Enable alpha test+sample shading. If you discard in the shader, you can't use it for alpha-to-coverage. Even discarding on alpha==0.0 is slower, as the if-statement is more expensive than the writing.
That would imply that gpu's still have special alpha test unit? Do they? Dx11 does not have alpha testing stage anymore which I have taken as hint that gpu's might have get rid of that too.
Pages: [1]
  ignore  |  Print  
 
 

 
Riven (201 views)
2019-09-04 15:33:17

hadezbladez (4909 views)
2018-11-16 13:46:03

hadezbladez (1810 views)
2018-11-16 13:41:33

hadezbladez (5183 views)
2018-11-16 13:35:35

hadezbladez (1024 views)
2018-11-16 13:32:03

EgonOlsen (4387 views)
2018-06-10 19:43:48

EgonOlsen (5228 views)
2018-06-10 19:43:44

EgonOlsen (2972 views)
2018-06-10 19:43:20

DesertCoockie (3875 views)
2018-05-13 18:23:11

nelsongames (4306 views)
2018-04-24 18:15:36
Java Gaming Resources
by philfrei
2019-05-14 16:15:13

Deployment and Packaging
by philfrei
2019-05-08 15:15:36

Deployment and Packaging
by philfrei
2019-05-08 15:13:34

Deployment and Packaging
by philfrei
2019-02-17 20:25:53

Deployment and Packaging
by mudlee
2018-08-22 18:09:50

Java Gaming Resources
by gouessej
2018-08-22 08:19:41

Deployment and Packaging
by gouessej
2018-08-22 08:04:08

Deployment and Packaging
by gouessej
2018-08-22 08:03:45
java-gaming.org is not responsible for the content posted by its members, including references to external websites, and other references that may or may not have a relation with our primarily gaming and game production oriented community. inquiries and complaints can be sent via email to the info‑account of the company managing the website of java‑gaming.org
Powered by MySQL Powered by PHP Powered by SMF 1.1.18 | SMF © 2013, Simple Machines | Managed by Enhanced Four Valid XHTML 1.0! Valid CSS!