Archive for February, 2008

Two OpenGL Optimizations

Wednesday, February 20th, 2008

In the last week, I’ve been working on optimizing E15 for speed and efficiency. So far I’ve implemented two ways to increase performance: texture tiling and frustum culling. I’ll start with texture tiling.

Texture Tiling

We were using non-power-of-two (non-POT) textures in E15, since they are supported and they make everything easy: source images for our textures don’t necessarily come in power-of-two (POT) sizes. For most images this is fine, since they are small and manageable. With OpenGL 2.1, most things work with non-POT textures. Performance issues arise when you have large textures, in our case rendered web pages. When web pages get turned into a bitmap, they become huge. Blogs are especially large, easily reaching 15,000 pixels in height. Of course these textures are too large for OpenGL, so we decided to go back to POT and tile the images by subdividing them into multiple textures applied to multiple quads.

Going back to POT was a good move, since on the ATI X1900 it seems hardware mipmaps are only supported with POT textures (so originally we were using gluBuild2DMipmaps). Implementing this was relatively straightforward. Here’s what needs to get done:

  1. Obtain the image, then create a new, empty image with the next-largest POT dimensions.
  2. Draw the original image onto the new image.
  3. Create the textures by subdividing the new image into tiles of a predefined size.
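
The arithmetic behind these steps can be sketched in a few lines of Python. This is only an illustration: next_pot and tile_grid are hypothetical helper names, not part of E15. Note that the method further down pads the image to the next multiple of the tile size rather than to a full POT size, which gives POT tiles as long as the tile size itself is a power of two.

```python
def next_pot(n):
    """Smallest power of two that is >= n (n must be positive)."""
    pot = 1
    while pot < n:
        pot *= 2
    return pot


def tile_grid(image_w, image_h, tile_size):
    """Return (padded_w, padded_h, columns, rows) for a tiled image.

    The image is padded up to the next multiple of tile_size so that
    every tile is a full tile_size x tile_size texture.
    """
    columns = (image_w + tile_size - 1) // tile_size  # integer ceiling
    rows = (image_h + tile_size - 1) // tile_size
    return columns * tile_size, rows * tile_size, columns, rows


print(next_pot(300))                # 512
print(tile_grid(1000, 15000, 256))  # (1024, 15104, 4, 59)
```

So a 1000 by 15,000 pixel page with 256-pixel tiles ends up as 4 × 59 = 236 small textures instead of one oversized one.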

All images are supplied as a CGImageRef, so I implemented a new method that will go through and accomplish the above task. It is pretty simple. You pass a CGImageRef and tile size and it will return an array of OpenGL texture ids.

- (GLuint *)createTiledTexturesFromCGImage:(CGImageRef)cgImage
                 tileSize:(int)newTileSize
{
  GLuint *textureNames = NULL;  // stays NULL if cgImage is nil
  if(cgImage) {
    float image_w = CGImageGetWidth(cgImage);
    float image_h = CGImageGetHeight(cgImage);
    float remain_x = image_w/newTileSize;
    float remain_y = image_h/newTileSize;
    float spacing_w = (ceil(remain_x)-remain_x)*(float)newTileSize;
    float spacing_h = (ceil(remain_y)-remain_y)*(float)newTileSize;
    float width = image_w + spacing_w;
    float height = image_h + spacing_h;

    void* tData = calloc(width * 4, height);
    CGRect rect = CGRectMake(0, spacing_h, image_w, image_h);
    CGColorSpaceRef color_space = CGColorSpaceCreateDeviceRGB();
    CGContextRef myBitmapContext = CGBitmapContextCreate(
     tData, width, height, 8, width*4, color_space,
     kCGImageAlphaPremultipliedFirst);
    CGContextDrawImage(myBitmapContext, rect, cgImage);

    int perWidth = (int)ceil(width/newTileSize);
    int perHeight = (int)ceil(height/newTileSize);

    int numTextureNames = perWidth*perHeight;
    textureNames = malloc(sizeof(GLuint)*numTextureNames);

    textureType = GL_TEXTURE_2D;
    glEnable(textureType);
    glGenTextures(numTextureNames, textureNames);

    //backup default pixel store state
    glPushClientAttrib(GL_CLIENT_PIXEL_STORE_BIT);

    //setup bitmap attributes
    glPixelStorei(GL_UNPACK_ROW_LENGTH, width);
    glPixelStorei(GL_UNPACK_ALIGNMENT, 1);

    int onY;
    for(onY = 0; onY < perHeight; onY++) {
      int onX;
      for(onX = 0; onX < perWidth; onX++) {
        int onTexture = onY*perWidth + onX;

        //setup offsets
        int x = onX*newTileSize;
        int y = onY*newTileSize;

        //setup extents
        int dx = MINOF2(width-x, newTileSize);
        int dy = MINOF2(height-y, newTileSize);

        //skip to x,y for read from bitmap
        glPixelStorei(GL_UNPACK_SKIP_PIXELS, x);
        glPixelStorei(GL_UNPACK_SKIP_ROWS, y);

        glBindTexture(textureType, textureNames[onTexture]);

        glTexParameteri(textureType, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR);
        glTexParameteri(textureType, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
        glTexParameteri(textureType, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
        glTexParameteri(textureType, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
        glTexParameteri(textureType, GL_TEXTURE_BASE_LEVEL, 0);
        glTexParameteri(textureType, GL_TEXTURE_MAX_LEVEL, 4);
        glTexParameteri(textureType, GL_TEXTURE_MIN_LOD, 0);
        glTexParameteri(textureType, GL_TEXTURE_MAX_LOD, 4);

        glTexParameteri(textureType, GL_GENERATE_MIPMAP, GL_TRUE);
        glTexImage2D(textureType, 0, GL_RGBA, newTileSize,
             newTileSize, 0, GL_BGRA, GL_UNSIGNED_INT_8_8_8_8, NULL);
        glTexSubImage2D(textureType, 0, 0, 0, dx, dy, GL_BGRA,
                                                GL_UNSIGNED_INT_8_8_8_8, tData);
      }
    }

    //restore default pixel store state
    glPopClientAttrib();

    glDisable(textureType);

    // release
    CGColorSpaceRelease(color_space);
    CGContextRelease(myBitmapContext);
    free(tData);
  }
  return textureNames;
}

We use glTexImage2D with NULL data and then use glTexSubImage2D to insert an image of size dx, dy to account for the tiles at the edges. I’m not sure if that was necessary. Now all we need to do is iterate through the textures and create the necessary quads in our scene. Initially, I had made each quad the size of its texture (which is the tileSize), but then the quads at the edges are too big, and there were rendering quirks with overlapping quads. The solution is to make sure the quads together are the same size as the original image. So for edge textures, you do not create square quads; instead you use whatever size is necessary to show the original image. Here’s a code snippet from the scene:

unsigned i = 0;
float x, y;

for (y = 0; y > -dob.h; y -= textureSize) {
  float dy = MINOF2(dob.h + y, textureSize);
  glPushMatrix();
  glTranslatef(0, 2*y/mapScaler, 0);
  for (x = 0; x < dob.w; x += textureSize) {
    glPushMatrix();
    glTranslatef(2*x/mapScaler, 0, 0);
    if (dob.textureIds[i]) {
      if (renderMode == GL_SELECT) {
        glLoadName(j);
      }
      float dx = MINOF2(dob.w - x, textureSize);
      glBindTexture(GL_TEXTURE_2D, dob.textureIds[i]);
      glBegin(GL_QUADS);
        //Page textures are flipped. Compensate for that.
        glTexCoord2f(0.0f, 0.0f);
        glVertex3f(0.0f, 0.0f, 0.0f);
        glTexCoord2f(dx/textureSize, 0.0f);
        glVertex3f(2*dx/mapScaler, 0.0f, 0.0f);
        glTexCoord2f(dx/textureSize, dy/textureSize);
        glVertex3f(2*dx/mapScaler, -2*dy/mapScaler, 0.0f);
        glTexCoord2f(0.0f, dy/textureSize);
        glVertex3f(0.0f, -2*dy/mapScaler, 0.0f);
      glEnd();
    }
    glPopMatrix();  // pop outside the if so every glPushMatrix is matched
    i++;
  }
  glPopMatrix();
}

Now we can handle large sites since we’re just rendering 256×256 images.
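
To make the edge handling concrete, here is a small Python sketch of the same loop with the OpenGL calls removed. tile_quads is a hypothetical name used only for illustration, and the sketch ignores the mapScaler scaling and the vertical flip handled in the scene code. For each tile it computes the quad extent (dx, dy) in pixels and the largest texture coordinates to use, so that edge quads only cover the part of the tile that holds real image data.

```python
def tile_quads(image_w, image_h, tile_size):
    """Return one (x, y, dx, dy, s_max, t_max) tuple per tile, row by row.

    x, y    : pixel offset of the tile inside the original image
    dx, dy  : quad size in pixels (smaller than tile_size at the edges)
    s_max,
    t_max   : largest texture coordinates, i.e. the used tile fraction
    """
    quads = []
    for y in range(0, image_h, tile_size):
        dy = min(image_h - y, tile_size)
        for x in range(0, image_w, tile_size):
            dx = min(image_w - x, tile_size)
            quads.append((x, y, dx, dy, dx / tile_size, dy / tile_size))
    return quads


for quad in tile_quads(600, 300, 256):
    print(quad)
```

For a 600 by 300 image with 256-pixel tiles this gives six quads; the last one is only 88 by 44 pixels and samples the tile up to (0.34375, 0.171875).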

Frustum Culling

Implementing frustum culling was pretty straightforward. I just had problems since I was applying my matrix transformations in the wrong order. Remember, they don’t commute! This is a good article that you can follow to implement it. Once implemented, the performance boost was noticeable for examples using lots of large textures. Now we need to work on a texture manager that will do manual mipmapping to display different images at different camera positions.
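
For readers who have not seen the technique, here is a minimal Python sketch of one common way to do it; it is not the E15 code and not necessarily the approach in the article mentioned above. It assumes the usual plane-extraction trick: with a combined matrix M (projection times modelview, in that order) written so that clip = M * v, each frustum plane is row 3 plus or minus one of rows 0 to 2. The function names and the bounding-sphere test are choices made for this illustration.

```python
import math


def perspective(fovy_deg, aspect, near, far):
    """4x4 projection matrix as m[row][col], same values as gluPerspective.

    OpenGL stores matrices column-major in memory, so a matrix read back
    with glGetFloatv would need transposing before use here.
    """
    f = 1.0 / math.tan(math.radians(fovy_deg) / 2.0)
    return [
        [f / aspect, 0.0, 0.0, 0.0],
        [0.0, f, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def frustum_planes(m):
    """Six normalized planes (a, b, c, d) whose normals point inward."""
    planes = []
    for row in (0, 1, 2):          # x, y and z clip axes
        for sign in (1.0, -1.0):   # left/right, bottom/top, near/far
            p = [m[3][i] + sign * m[row][i] for i in range(4)]
            length = math.sqrt(p[0] ** 2 + p[1] ** 2 + p[2] ** 2)
            planes.append([v / length for v in p])
    return planes


def sphere_visible(planes, center, radius):
    """False only if the sphere lies entirely outside some plane."""
    x, y, z = center
    for a, b, c, d in planes:
        if a * x + b * y + c * z + d < -radius:
            return False
    return True


planes = frustum_planes(perspective(90.0, 1.0, 1.0, 10.0))
print(sphere_visible(planes, (0.0, 0.0, -5.0), 1.0))  # True, in front of the camera
print(sphere_visible(planes, (0.0, 0.0, 5.0), 1.0))   # False, behind the camera
```

Each tiled quad would get a bounding sphere (or box) and simply be skipped when the test returns False. Passing only the projection matrix, as in the demo, culls in eye space; multiplying in the modelview matrix first gives planes in object space, which is where the multiplication order matters.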

Building a Console for E15

Saturday, February 16th, 2008

console.png

This week I worked on implementing a console on E15 that can display contents of stdout while also accepting stdin. Until now, we all used the Xcode console, but of course if we want to distribute a binary, we are going to need a console for the users. The task wasn’t as trivial as I initially thought, and the result is a bit of a hack, but I thought I’d just write it up.

The console is just a simple NSTextView contained in an NSDrawer attached to the bottom of the OpenGLView. The first thing to do is to print the contents of error output into that view. That’s easy enough to do: we just need to append the contents to the NSTextStorage of the console. This seemed to work fine, but it would hang the application if we wrote to it repeatedly in a short time. Something like:

for i in range(1000):
  print i

would hang at some arbitrary point. For a while I was convinced it was a problem with appending strings to the text view, but even after trying to append strings every way possible, it still didn’t work. In the end, the problem was due to threading. The Python interpreter runs in a secondary thread, and I had thought that it was fine to modify the GUI from a secondary thread, since drawing to a view from a secondary thread was safe. Anyway, as soon as I used

[NSObject performSelectorOnMainThread:withObject:waitUntilDone:]

to write the contents, everything was good to go. It would be nice to know exactly which things must run on the main thread in Cocoa, but the Apple documentation is pretty bad about that. So the lesson learned: when something barfs unexpectedly, run it in the main thread!
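
The pattern itself is not specific to Cocoa. As a rough analogy in Python (this is not the Cocoa API, and ConsoleBuffer is a made-up class for illustration), the secondary thread never touches the view's storage directly; it only drops text into a thread-safe queue, and the main thread is the only one that appends to the storage.

```python
import queue
import threading


class ConsoleBuffer:
    """Stand-in for the text view: only the main thread touches `lines`."""

    def __init__(self):
        self.pending = queue.Queue()  # thread-safe hand-off point
        self.lines = []               # plays the role of the text storage

    def write_from_any_thread(self, text):
        # Safe to call from the interpreter thread: nothing is drawn here.
        self.pending.put(text)

    def drain_on_main_thread(self):
        # Called from the main loop; moves queued text into the storage.
        while True:
            try:
                self.lines.append(self.pending.get_nowait())
            except queue.Empty:
                break


def run_demo(count=1000):
    console = ConsoleBuffer()

    def interpreter_thread():
        for i in range(count):
            console.write_from_any_thread(str(i))

    worker = threading.Thread(target=interpreter_thread)
    worker.start()
    worker.join()
    console.drain_on_main_thread()
    return console.lines


print(len(run_demo()))  # 1000
```

performSelectorOnMainThread:withObject:waitUntilDone: plays the role of both the queue and the drain step here, since it schedules the call on the main thread's run loop.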

The other part of the console is accepting user input and piping it to stdin for the Python interpreter. This is important since sometimes you want to receive user input from input() or raw_input(). Initially I thought using NSTask and NSPipe would be best, but that would have required me to set up the Python interpreter as a subprocess. Kyle had worked out all the stuff dealing with Python and threading and saving state, and I didn’t want to mess with that. I knew that in Python you can set any file object as stdin, stderr or stdout. So this is a total hack, and I don’t really know if there is a nicer way to go about it, but the idea is to grab the contents of the last line on the console when the user presses return or enter, then write those contents to a tmp file.

On the Python end, we would have to redefine raw_input(). I instead have a new method console_input() which is defined like this:

import sys
import time

def console_input(message, magic_string='Yes'):
  print message
  # reset the file so a stale answer is not read back
  f = open("/tmp/STDIN.txt", "w")
  f.write("# STDIN #")
  f.close()
  saved_state = sys.stdin
  sys.stdin = open("/tmp/STDIN.txt")
  line = sys.stdin.readline().strip()
  prev_result = line
  first_time = True
  # poll the file until the console writes the magic string into it
  while line.upper() != magic_string.upper():
    if prev_result != line and first_time == False:
        print 'Please answer ' + magic_string
        prev_result = line
    first_time = False
    sys.stdin.close()
    time.sleep(0.1)  # don't spin the CPU while waiting
    sys.stdin = open("/tmp/STDIN.txt")
    line = sys.stdin.readline().strip()
  sys.stdin.close()
  sys.stdin = saved_state

So, the method prints out a message to the console, then also waits for the user to input the magic_string. If the user enters the magic string, the method will exit. Ghetto, I know…but it works and I can’t come up with a better solution.
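
One possible alternative, sketched here only as an idea and not something E15 does, is to skip the tmp file and make stdin a small file-like object backed by a queue. ConsoleStdin, feed and ask_until are hypothetical names, and the sketch uses Python 3 syntax. The GUI would call feed() when the user presses return, and readline() blocks the interpreter thread until a line arrives, so no polling is needed.

```python
import queue


class ConsoleStdin:
    """File-like object whose readline() blocks until the GUI feeds a line."""

    def __init__(self):
        self._lines = queue.Queue()

    def feed(self, text):
        # Would be called by the GUI when the user presses return.
        self._lines.put(text if text.endswith("\n") else text + "\n")

    def readline(self):
        # Blocks the calling (interpreter) thread, not the GUI thread.
        return self._lines.get()


def ask_until(stdin, magic_string="Yes"):
    """Read lines until one matches magic_string; return the attempt count."""
    attempts = 0
    while True:
        attempts += 1
        if stdin.readline().strip().upper() == magic_string.upper():
            return attempts


console_stdin = ConsoleStdin()
console_stdin.feed("no")
console_stdin.feed("yes")
print(ask_until(console_stdin))  # 2
```

Assigning such an object to sys.stdin should then let plain input() calls read from the console as well, since Python only needs a readline() method on whatever object is installed there.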

Web Design is Dead

Saturday, February 9th, 2008

skull.jpg

I know I haven’t updated with a new entry in a while. I’ve been working on this post for the last three months or so, but every time I try to write about something, I find it difficult to finish. But here it goes, another attempt at writing.

I said last year that I would work on a web application every month. I failed. Not because it was particularly difficult to implement, but because I thought it was a waste of time. I also don’t particularly enjoy making web applications or designing web pages. I’ve been doing it for a while, and the technical challenge isn’t so exciting; and it seems like everybody these days claims to be a web designer, and it’s true, being a web designer isn’t difficult (of course whether they are good is another question). With frameworks like Rails and relatively compliant web browsers, it’s becoming simple to deploy web applications. Of course it is a nice change from the days of writing endless lines of redundant PHP code, but at this stage, it’s really about the idea and not about technological challenges. Ideas are hard, therefore I failed.

As a designer I feel the limitations of web browsers growing every day. Of course limitations can be due to security restrictions, and I feel like spending time to circumvent restrictions is a waste of time. Also, we have powerful computers, yet none of the graphics capabilities on the browser takes advantage of powerful graphics cards. This is why I don’t really spend much time working on web applications anymore. In a research context, when I think about what the “next thing” is for the web, I think it’s about the web as an environment we interact with, without a web browser; and give web designers a whole new set of graphical and interaction possibilities.

I’m focused on working on E15, and it’s been great. Implementing a desktop application comes with more complexity, but rewards with more flexibility. For the first time in a while, I feel like I can finally build the things I think of, without discovering later that they are impossible. I’ve been knee deep in Cocoa, and I think I’ll probably focus more on writing about problems and solutions I’ve come across in future posts…if I ever decide to write on this neglected blog of mine…