Archive for the 'Programming' Category

Two OpenGL Optimizations

Wednesday, February 20th, 2008

In the last week, I’ve been working on optimizing E15 for speed and efficiency. So far I’ve implemented two ways to increase performance: texture tiling and frustum culling. I’ll start with texture tiling.

Texture Tiling

We were using non-power-of-two (NPOT) textures in E15, since they are supported and the source images for our textures don’t necessarily come in power-of-two (POT) dimensions. For most images this is fine, since they are small and manageable, and with OpenGL 2.1 most things work with NPOT textures. Performance issues arise with large textures, which in our case come from rendering web pages. When a web page gets turned into a bitmap, it becomes huge. Blogs are especially large and easily reach 15,000 pixels in height. Such images are too large for a single OpenGL texture (implementations cap texture dimensions at GL_MAX_TEXTURE_SIZE), so we decided to go back to POT and tile the images, subdividing each one into multiple textures applied to multiple quads.

Going back to POT was a good move, since on the ATI X1900 it seems hardware mipmaps are only supported with POT (so originally we were using gluBuild2DMipmaps). Implementing this was relatively straightforward. Here’s what needs to get done:

  1. Obtain the source image and round its dimensions up to a whole number of POT tiles.
  2. Create a new, larger image and draw the original image onto it.
  3. Create the textures by subdividing the new image into tiles of the predefined tile size.
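
The arithmetic behind these steps can be sketched in a few lines of Python. This is only an illustration, not the code used in E15: the names tile_layout and is_pot are made up here, and the sketch assumes (as the Objective-C method in this post does) that the image is padded up to a whole number of tiles rather than to a single POT size.

```python
import math

def is_pot(n):
    """True if n is a positive power of two."""
    return n > 0 and (n & (n - 1)) == 0

def tile_layout(image_w, image_h, tile_size):
    """Pad an image to a whole number of tiles and list the tile rectangles.

    Returns (padded_w, padded_h, tiles), where tiles holds one
    (x, y, width, height) rectangle per tile in row-major order.
    """
    if not is_pot(tile_size):
        raise ValueError("tile_size should be a power of two")
    cols = math.ceil(image_w / tile_size)
    rows = math.ceil(image_h / tile_size)
    tiles = [(col * tile_size, row * tile_size, tile_size, tile_size)
             for row in range(rows)
             for col in range(cols)]
    return cols * tile_size, rows * tile_size, tiles

# A hypothetical 1000 x 15000 pixel page split into 256 x 256 tiles.
padded_w, padded_h, tiles = tile_layout(1000, 15000, 256)
print(padded_w, padded_h, len(tiles))  # 1024 15104 236
```

Every tile texture is a full POT square, so only the padded bitmap has to be a multiple of the tile size.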

All images are supplied as a CGImageRef, so I implemented a new method that carries out the steps above. It is pretty simple: you pass a CGImageRef and a tile size, and it returns an array of OpenGL texture ids.

- (GLuint *)createTiledTexturesFromCGImage:(CGImageRef)cgImage
                 tileSize:(int)newTileSize
{
  GLuint *textureNames = NULL;  // stays NULL when cgImage is NULL
  if(cgImage) {
    float image_w = CGImageGetWidth(cgImage);
    float image_h = CGImageGetHeight(cgImage);
    float remain_x = image_w/newTileSize;
    float remain_y = image_h/newTileSize;
    float spacing_w = (ceil(remain_x)-remain_x)*(float)newTileSize;
    float spacing_h = (ceil(remain_y)-remain_y)*(float)newTileSize;
    float width = image_w + spacing_w;
    float height = image_h + spacing_h;

    void* tData = calloc(width * 4, height);
    CGRect rect = CGRectMake(0, spacing_h, image_w, image_h);
    CGColorSpaceRef color_space = CGColorSpaceCreateDeviceRGB();
    CGContextRef myBitmapContext = CGBitmapContextCreate(
     tData, width, height, 8, width*4, color_space,
     kCGImageAlphaPremultipliedFirst);
    CGContextDrawImage(myBitmapContext, rect, cgImage);

    int perWidth = (int)ceil(width/newTileSize);
    int perHeight = (int)ceil(height/newTileSize);

    int numTextureNames = perWidth*perHeight;
    textureNames = malloc(sizeof(GLuint)*numTextureNames);

    textureType = GL_TEXTURE_2D;
    glEnable(textureType);
    glGenTextures(numTextureNames, textureNames);

    //backup default pixel store state
    glPushClientAttrib(GL_CLIENT_PIXEL_STORE_BIT);

    //setup bitmap attributes
    glPixelStorei(GL_UNPACK_ROW_LENGTH, width);
    glPixelStorei(GL_UNPACK_ALIGNMENT, 1);

    int onY;
    for(onY = 0; onY < perHeight; onY++) {
      int onX;
      for(onX = 0; onX < perWidth; onX++) {
        int onTexture = onY*perWidth + onX;

        //setup offsets
        int x = onX*newTileSize;
        int y = onY*newTileSize;

        //setup extents
        int dx = MINOF2(width-x, newTileSize);
        int dy = MINOF2(height-y, newTileSize);

        //skip to x,y for read from bitmap
        glPixelStorei(GL_UNPACK_SKIP_PIXELS, x);
        glPixelStorei(GL_UNPACK_SKIP_ROWS, y);

        glBindTexture(textureType, textureNames[onTexture]);

        glTexParameteri(textureType, GL_TEXTURE_MIN_FILTER, GL_LINEAR_MIPMAP_LINEAR);
        glTexParameteri(textureType, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
        glTexParameteri(textureType, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
        glTexParameteri(textureType, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
        glTexParameteri(textureType, GL_TEXTURE_BASE_LEVEL, 0);
        glTexParameteri(textureType, GL_TEXTURE_MAX_LEVEL, 4);
        glTexParameteri(textureType, GL_TEXTURE_MIN_LOD, 0);
        glTexParameteri(textureType, GL_TEXTURE_MAX_LOD, 4);

        glTexParameteri(textureType, GL_GENERATE_MIPMAP, GL_TRUE);
        glTexImage2D(textureType, 0, GL_RGBA, newTileSize,
             newTileSize, 0, GL_BGRA, GL_UNSIGNED_INT_8_8_8_8, NULL);
        glTexSubImage2D(textureType, 0, 0, 0, dx, dy, GL_BGRA,
                                                GL_UNSIGNED_INT_8_8_8_8, tData);
      }
    }

    //restore default pixel store state
    glPopClientAttrib();

    glDisable(textureType);

    // release
    CGColorSpaceRelease(color_space);
    CGContextRelease(myBitmapContext);
    free(tData);
  }
  return textureNames;
}

We call glTexImage2D with NULL data to allocate each tile, then use glTexSubImage2D to upload a region of size dx by dy, which accounts for the tiles at the edges. I’m not sure that was necessary. Now all we need to do is iterate through the textures and create the necessary quads in our scene. Initially I made every quad the size of the texture (the tileSize), but the edge quads were then too big, which caused rendering quirks with overlapping quads. The solution is to size the quads so that together they match the original image. For edge textures you do not create square quads; instead you use whatever size is necessary to show the original image. Here’s a code snippet from the scene:

unsigned i = 0;
float x, y;

for (y = 0; y > -dob.h; y -= textureSize) {
  float dy = MINOF2(dob.h + y, textureSize);
  glPushMatrix();
  glTranslatef(0, 2*y/mapScaler, 0);
  for (x = 0; x < dob.w; x += textureSize) {
    glPushMatrix();
    glTranslatef(2*x/mapScaler, 0, 0);
    if (dob.textureIds[i]) {
      if (renderMode == GL_SELECT) {
        glLoadName(j);
      }
      float dx = MINOF2(dob.w - x, textureSize);
      glBindTexture(GL_TEXTURE_2D, dob.textureIds[i]);
      glBegin(GL_QUADS);
        //Page textures are flipped. Compensate for that.
        glTexCoord2f(0.0f, 0.0f);
        glVertex3f(0.0f, 0.0f, 0.0f);
        glTexCoord2f(dx/textureSize, 0.0f);
        glVertex3f(2*dx/mapScaler, 0.0f, 0.0f);
        glTexCoord2f(dx/textureSize, dy/textureSize);
        glVertex3f(2*dx/mapScaler, -2*dy/mapScaler, 0.0f);
        glTexCoord2f(0.0f, dy/textureSize);
        glVertex3f(0.0f, -2*dy/mapScaler, 0.0f);
      glEnd();
    }
    glPopMatrix();  // always pop to match the push above, even when the tile has no texture
    i++;
  }
  glPopMatrix();
}

Now we can handle large sites since we’re just rendering 256×256 images.
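
To make the edge handling concrete, here is a small Python sketch of the quad sizes and texture coordinates. It is not the E15 scene code: tile_quads is a hypothetical helper, and it assumes each tile's image data starts at the tile's origin, so an edge quad only samples the used fraction of its texture.

```python
def tile_quads(image_w, image_h, tile_size):
    """Return one (x, y, quad_w, quad_h, u_max, v_max) tuple per tile.

    quad_w and quad_h are clipped to the original image, and u_max / v_max
    are the matching texture coordinates inside the square tile texture.
    """
    quads = []
    y = 0
    while y < image_h:
        quad_h = min(image_h - y, tile_size)
        x = 0
        while x < image_w:
            quad_w = min(image_w - x, tile_size)
            quads.append((x, y, quad_w, quad_h,
                          quad_w / tile_size, quad_h / tile_size))
            x += tile_size
        y += tile_size
    return quads

# A hypothetical 600 x 300 image with 256 x 256 tiles: 3 columns by 2 rows.
for quad in tile_quads(600, 300, 256):
    print(quad)
```

Interior tiles come out as full squares with texture coordinates up to 1.0, while the last column and last row are narrower or shorter, which is what keeps neighbouring quads from overlapping.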

Frustum Culling

Implementing frustum culling was pretty straightforward. The only problems I had came from applying my matrix transformations in the wrong order. Remember, they don’t commute! This is a good article that you can follow to implement it. Once implemented, the performance boost was noticeable for examples using lots of large textures. Now we need to work on a texture manager that will do manual mipmapping to display different images at different camera positions.
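
For readers who want to see the idea in code, here is a minimal Python sketch of one common approach: extract the six clip planes from the combined projection and view matrix, then test bounding spheres against them. This is an assumption about the general technique, not the E15 implementation; all function names are made up, and matrices are written row-major with column vectors, so the combined matrix is projection times view.

```python
import math

def mat_mul(a, b):
    """Multiply two 4x4 matrices stored as row-major nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translate(tx, ty, tz):
    return [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]

def perspective(fovy_deg, aspect, near, far):
    """Same layout as the matrix built by gluPerspective."""
    f = 1.0 / math.tan(math.radians(fovy_deg) / 2.0)
    return [[f / aspect, 0, 0, 0],
            [0, f, 0, 0],
            [0, 0, (far + near) / (near - far), 2 * far * near / (near - far)],
            [0, 0, -1, 0]]

def frustum_planes(clip):
    """Six normalized planes (a, b, c, d); points inside give ax+by+cz+d >= 0."""
    planes = []
    for axis in range(3):            # x: left/right, y: bottom/top, z: near/far
        for sign in (1.0, -1.0):
            a, b, c, d = (clip[3][j] + sign * clip[axis][j] for j in range(4))
            length = math.sqrt(a * a + b * b + c * c)
            planes.append((a / length, b / length, c / length, d / length))
    return planes

def sphere_visible(planes, center, radius):
    """False only when the sphere lies entirely behind some plane."""
    x, y, z = center
    return all(a * x + b * y + c * z + d >= -radius for a, b, c, d in planes)

proj = perspective(60.0, 1.0, 1.0, 100.0)
view = translate(0, 0, -5)           # camera sits 5 units back from the origin
planes = frustum_planes(mat_mul(proj, view))

print(sphere_visible(planes, (0, 0, 0), 1.0))    # in front of the camera
print(sphere_visible(planes, (0, 0, 10), 1.0))   # behind the camera
print(mat_mul(proj, view) == mat_mul(view, proj))  # order matters
```

The last line is the order problem mentioned above in miniature: multiplying the same two matrices the other way round gives a different matrix, and therefore different planes.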

Building a Console for E15

Saturday, February 16th, 2008

This week I worked on implementing a console on E15 that can display contents of stdout while also accepting stdin. Until now, we all used the Xcode console, but of course if we want to distribute a binary, we are going to need a console for the users. The task wasn’t as trivial as I initially thought, and the result is a bit of a hack, but I thought I’d just write it up.

The console is just a simple NSTextView contained in an NSDrawer attached to the bottom of the OpenGLView. The first thing to do is to print the contents of error output into that view. That’s easy enough to do: we just need to append the contents to the NSTextStorage of the console. This seemed to work fine, but it would hang the application if we wrote to it repeatedly in a short time. Something like:

for i in range(1000):
  print i

would hang at some arbitrary point. For a while I was convinced the problem was in appending strings to the text view, but even after trying to append strings every way possible, it still didn’t work. In the end, the problem was due to threading. The Python interpreter runs in a secondary thread, and I had assumed it was fine to modify the GUI from a secondary thread, since drawing to a view from a secondary thread was safe. Anyway, as soon as I used

[NSObject performSelectorOnMainThread:withObject:waitUntilDone:]

to write the contents, everything was good to go. It would be nice to know exactly which things must run on the main thread in Cocoa, but the Apple documentation is pretty bad about that. So the lesson learned: when something barfs unexpectedly, run it on the main thread!
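
The same pattern can be shown outside Cocoa. The Python sketch below is only an analogy, not the E15 code: the ConsoleBuffer class and its method names are invented, and a plain list stands in for the text view's storage. Worker threads never touch the "view"; they hand text to a thread-safe queue, and only the main thread drains it.

```python
import queue
import threading

class ConsoleBuffer:
    """Collects output from any thread; only the main thread touches `lines`."""

    def __init__(self):
        self._pending = queue.Queue()   # thread-safe hand-off
        self.lines = []                 # stands in for the text view's storage

    def write_from_any_thread(self, text):
        self._pending.put(text)

    def drain_on_main_thread(self):
        """Move everything queued so far into the view, in arrival order."""
        while True:
            try:
                self.lines.append(self._pending.get_nowait())
            except queue.Empty:
                break

def noisy_script(console):
    # Plays the role of the interpreter thread printing in a tight loop.
    for i in range(1000):
        console.write_from_any_thread(str(i))

console = ConsoleBuffer()
worker = threading.Thread(target=noisy_script, args=(console,))
worker.start()
worker.join()
console.drain_on_main_thread()
print(len(console.lines))  # 1000
```

In a real GUI the drain step would run from the main run loop rather than after a join.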

The other part of the console is accepting user input and piping it to stdin for the Python interpreter. This is important since sometimes you want to receive user information from input() or raw_input(). Initially I thought using NSTask and NSPipe would be best, but that required me to set up the Python interpreter as a subprocess. Kyle had worked out all the stuff dealing with Python, threading and saving state, and I didn’t want to mess with that. I knew that in Python you can set any file object as stdin, stderr or stdout. So this is a total hack, and I don’t really know if there is a nicer way to go about it, but the idea is to grab the contents of the last line on the console when the user presses return or enter, then create a tmp file with the contents written inside it.

On the Python end, we would have to redefine raw_input(). I instead have a new method console_input() which is defined like this:

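import sys

def console_input(message, magic_string='Yes'):
  print message
  # Seed the tmp file with a placeholder; the console overwrites it with
  # the user's last line when return or enter is pressed.
  f = open("/tmp/STDIN.txt", "w")
  f.write("# STDIN #")
  f.close()
  saved_state = sys.stdin
  sys.stdin = open("/tmp/STDIN.txt")
  line = sys.stdin.readline().strip()
  prev_result = line
  first_time = True
  # Poll the file until it holds the magic string (case-insensitive).
  while line.upper() != magic_string.upper():
    if prev_result != line and first_time == False:
        print 'Please answer ' + magic_string
        prev_result = line
    first_time = False
    sys.stdin.close()
    sys.stdin = open("/tmp/STDIN.txt")
    line = sys.stdin.readline().strip()
  sys.stdin.close()
  sys.stdin = saved_state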

So the function prints a message to the console, then waits for the user to enter the magic_string. Once the user enters the magic string, the function returns. Hacky, I know… but it works and I can’t come up with a better solution.
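
One possible variation, sketched here in Python 3 rather than the Python 2 that E15 embeds, is to skip the tmp file and hand the interpreter a file-like object backed by a queue. This is a sketch under assumptions, not something tested inside E15: QueueStdin, feed and ask are hypothetical names, and it presumes the console can call feed() from its own thread when the user presses return.

```python
import queue

class QueueStdin:
    """File-like object whose readline() blocks until a line is supplied."""

    def __init__(self):
        self._lines = queue.Queue()

    def feed(self, text):
        """Called by the console with the user's last line."""
        self._lines.put(text + "\n")

    def readline(self):
        return self._lines.get()    # blocks instead of polling a file

def ask(stdin, message, magic_string="Yes"):
    """Prompt until the user answers with magic_string (case-insensitive)."""
    print(message)
    while True:
        line = stdin.readline().strip()
        if line.upper() == magic_string.upper():
            return line
        print("Please answer " + magic_string)

# Simulate a user typing "no" and then "yes" into the console.
stdin = QueueStdin()
stdin.feed("no")
stdin.feed("yes")
answer = ask(stdin, "Continue?")
print(answer)
```

Because readline() blocks on the queue, there is no busy loop and no file to reopen; an object like this could also be assigned to sys.stdin, since reading code generally only needs readline().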

E15

Tuesday, September 25th, 2007

E15 is our newest project. It’s a new 3D graphics programming environment where we enable users to control the content and presentation, and it reflects our vision of what the web will be in the future. We demoed this at Flash Forward 2007, and just recently launched its website. We’ll be posting new stuff up there.

Modster

Wednesday, August 8th, 2007

Finally finished the first application that uses MudSketch. Modster is a graphical exquisite corpse, each of which is drawn by three people. Participants draw a portion of the corpse in sequence: head, torso and leg. Each of the three participants must contribute and submit a drawing; otherwise the corpse will never be completed, making the creation process truly collaborative and participation-dependent.

Canvas Drawing Tool

Friday, July 13th, 2007

MudSketch

I have a few project ideas that involve drawing, and I needed to start by creating reusable drawing code. I finished it and just wanted to put it up. It’s called MudSketch, and all the interesting stuff is in the /javascripts/MudSketch directory. Now that the tedious part is done, I can start working on the actual projects! BTW this counts as my July webapp…

RunLog(ger) on Facebook

Saturday, June 2nd, 2007

RunLog Facebook

I wanted to see what the new Facebook platform was like, so I created RunLogger, which is RunLog for Facebook. It’s not attached to the RunLog site, but it will post a mini-feed to your profile, as well as give you running information about your friends.

GPC: Graphical Pen-based CAPTCHA

Saturday, June 2nd, 2007

GPC

I know I’m a little behind, but here is the new web application for the month of May. It’s the start of a system I implemented as a final project for a class I took this semester (6.870). This is a pen-based CAPTCHA system, where you trace over shapes to conduct the Turing test. I also have an associated paper that talks about it in more detail. I will fix it up a bit and upload it here when it is done.

MITPTyper

Saturday, April 14th, 2007

Here’s a simple tool called MITPTyper. Any phrase you type into the text box will be rendered in the style of Muriel Cooper’s MIT Press logo. I know this is a small project, but it’s going to count as my April project.

FakeID

Saturday, March 31st, 2007

I’m going to try to release a new web app every month this year. So far, for the months of January and February, there were RunLog and OpenLocker. The web app for the month of March is FakeID.

FakeID is an OpenID server that gives you control over your online identity. With your existing OpenID account, you can use FakeID to create unique online identities. Think of it as an OpenID proxy. Anyway, I just finished it, which means it may be a bit buggy, but so far everything seems to work.

OpenLocker

Sunday, February 4th, 2007

Finally, OpenLocker (formerly PLW Locker) is ready after some months in development. OpenLocker is a multi-functional web application that serves as an OpenID server and a personal homepage. As an OpenID server, OpenLocker differs from other OpenID authentication servers and attempts to preserve user privacy by using a locker metaphor. The OpenID specs specify only that an OpenID URI be unique to a specific user, and do not specify any authentication method. Usually, an OpenID server manages user accounts like any other web application: with a username (which corresponds to a unique URI) and an alphanumeric password. Because of this, OpenID used this way still suffers from the security and privacy issues that exist in other username/password systems. OpenLocker uses a geographic locker location (which is unique) as the username and an emulated combination lock as the password.

Beyond its use as an OpenID server, a user’s identity URL (served over HTTPS once logged in) provides them with draggable modules, which are individual RSS aggregators. Users can pick from a collection of feeds (like OpenStudio and OpenCode) or enter any valid RSS/Atom URL to display content. This provides a page that can act as a user’s identity page as well as their homepage, where they can begin their daily Internet browsing.