Dr. Woohoo, Generating Artwork, and some Python code to massage user-submitted content (specifically, images).

The cool thing about user-submitted content is that you can't always predict what you're going to get. Our speakers at the Singularity Web Conference, for example, submit and update their own bios and session descriptions on the site. Yesterday, I noticed that Dr. Woohoo had put up an image of one of his awesome generative artworks in his session description.

Of course, since I hadn't considered images in session descriptions, this had the side effect of breaking the layout of the sessions page.

(In case you're wondering, yes, this is the way I like to work. Instead of over-engineering things, I like to see how people actually use stuff and then evolve it to meet their needs.)

So tonight I wrote a bit of code to massage and tame how images in session descriptions are displayed, and I thought I'd share it with you in case it helps anyone else. (Another, more complicated way to go about things would have been to grab the images using urlfetch, store them in the datastore, and resize them via the image API -- but that would have been overkill for my needs.)

# Copyright (c) 2008 Aral Balkan, Singularity Web Conference
# http://www.singularity08.com
# Released under the open source MIT license.

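import re

from markdown import Markdown

# Maximum image width (in pixels) that fits the layout. This constant is not
# defined in the original snippet; 160 is assumed from the comment in
# massage_images below.
IMAGE_SAFE_WIDTH = 160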

image_tag_width_re = r'(?P<img><img.*?width=")(?P<width>\d*?)"'
image_tag_height_re = r'(?P<img><img.*?height=")(?P<height>\d*?)"'
image_tag_re = r'(<img)(.*?)>'
image_tag_src_re = r'<img.*?src="(.*?)"'
image_tag_alt_re = r'<img.*?alt="(.*?)"'

image_tag_width_rc = re.compile(image_tag_width_re)
image_tag_height_rc = re.compile(image_tag_height_re)
image_tag_rc = re.compile(image_tag_re)
image_tag_src_rc = re.compile(image_tag_src_re)
image_tag_alt_rc = re.compile(image_tag_alt_re)


def massage_images(html):
  """Helper: Alters dimensions of any images in the passed HTML to make them safe for the site's design."""
  image_tag_widths = image_tag_width_rc.findall(html)
  image_tag_heights = image_tag_height_rc.findall(html)
  image_tag_srcs = image_tag_src_rc.findall(html)
  image_tag_alts = image_tag_alt_rc.findall(html)

  for i in range(len(image_tag_widths)):
    # Reduce the width of any found images to 160px so as not to break the layout
    original_width = int(image_tag_widths[i][1])

    maintain_aspect_ratio = True
    try:
      original_height = int(image_tag_heights[i][1])
    except IndexError:
      # Mismatched width/height pairs on image tags. We won't be
      # able to maintain aspect ratio.
      maintain_aspect_ratio = False

    if maintain_aspect_ratio:

      aspect_ratio = float(original_width)/float(original_height)
      new_height = int(IMAGE_SAFE_WIDTH/aspect_ratio)

      # Substitute the new height
      html = image_tag_height_rc.sub(r'\g<img>'+str(new_height)+r'"', html)

    # Substitute the new width
    html = image_tag_width_rc.sub(r'\g<img>'+str(int(IMAGE_SAFE_WIDTH))+'"', html)

    # Add float:left and slight margin so that text flows around the image
    html = image_tag_rc.sub(r'\1 style="float:left; margin-right:.5em;" \2>', html)

    # Finally, add a link to the original image if people want to see it larger
    html = image_tag_rc.sub(r'<a href="'+ image_tag_srcs[i] + '" title="'+image_tag_alts[i]+r'">\1\2></a>', html)

  return html
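
If you want to see what those patterns actually capture, here is a small sketch. The sample markup is made up for illustration, and I'm assuming Python 3 for the print calls. Because the width pattern has two groups, findall returns a list of tuples, which is why the function reads the number from index 1 of each match. The src pattern has a single group, so findall returns plain strings.

```python
import re

# Same width and src patterns as in the snippet above.
image_tag_width_rc = re.compile(r'(?P<img><img.*?width=")(?P<width>\d*?)"')
image_tag_src_rc = re.compile(r'<img.*?src="(.*?)"')

# Hypothetical sample markup, made up for illustration.
sample = '<p>Hi</p><img src="art.png" alt="Art" width="640" height="480">'

# Two groups in the pattern: each match is a (tag prefix, width) tuple.
width_matches = image_tag_width_rc.findall(sample)

# One group in the pattern: each match is just the captured string.
src_matches = image_tag_src_rc.findall(sample)

print(width_matches[0][1])  # the width digits, as a string
print(src_matches)
```

One thing to keep in mind: the lazy `.*?` is free to run past the end of a tag, so an image without a width attribute followed by one that has it would be captured as a single match. The dot also doesn't match newlines by default, so a tag split across lines won't match at all.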

There are a couple of basic but helpful regular expressions in there, and you might find the snippet useful if you want to manipulate image tags generated from user-submitted content.
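
One caveat if you reuse it: every sub call inside the loop rewrites all of the image tags in the HTML, so a description with several images picks up the style attribute and the link once per loop pass, and an image without an alt attribute raises an IndexError. A minimal sketch of one way around that is to hand re.sub a function, so that each tag is rewritten exactly once and on its own. This is not the code running on the site. It assumes Python 3 and simple double-quoted attributes, the names massage_one_image and massage_images_per_tag and the sample markup are made up here, and IMAGE_SAFE_WIDTH = 160 is assumed from the 160px comment in the snippet.

```python
import re

# Assumption: 160 comes from the "160px" comment in the snippet above.
IMAGE_SAFE_WIDTH = 160

img_tag_rc = re.compile(r'<img\b[^>]*>')
width_rc = re.compile(r'\swidth="(\d+)"')
height_rc = re.compile(r'\sheight="(\d+)"')
src_rc = re.compile(r'\ssrc="([^"]*)"')
alt_rc = re.compile(r'\salt="([^"]*)"')


def massage_one_image(match):
    """Rewrite a single <img> tag; re.sub calls this once per match."""
    tag = match.group(0)
    width = width_rc.search(tag)
    if width is None or int(width.group(1)) == 0:
        # No usable width attribute: leave the tag alone.
        return tag

    height = height_rc.search(tag)
    if height is not None:
        # Scale the height by the same factor as the width.
        new_height = int(height.group(1)) * IMAGE_SAFE_WIDTH // int(width.group(1))
        tag = height_rc.sub(' height="%d"' % new_height, tag)
    tag = width_rc.sub(' width="%d"' % IMAGE_SAFE_WIDTH, tag)

    # Float the image so that text flows around it.
    tag = '<img style="float:left; margin-right:.5em;"' + tag[len('<img'):]

    # Link to the original image when the tag has a src attribute.
    src = src_rc.search(tag)
    if src is None:
        return tag
    alt = alt_rc.search(tag)
    title = alt.group(1) if alt else ''
    return '<a href="%s" title="%s">%s</a>' % (src.group(1), title, tag)


def massage_images_per_tag(html):
    """Hypothetical variant of massage_images that handles each tag on its own."""
    return img_tag_rc.sub(massage_one_image, html)


# Made-up sample: two images, the second one without a height attribute.
sample = ('<img src="a.png" alt="A" width="640" height="480">'
          '<img src="b.png" alt="B" width="320">')
result = massage_images_per_tag(sample)
print(result)
```

Regular expressions are still a blunt tool for HTML (an attribute value containing `>` would trip this up), so for anything beyond Markdown-generated tags a real HTML parser is probably the safer choice.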

Oh, and before I forget, Dr. Woohoo is going to be talking about Generating Artwork at the Singularity Web Conference. Check out his bio and session and the other sessions at the conference.

(You can find out more about Dr. Woohoo on his web site and take a look at his latest book, Color Visualizations: Exploring the Circle, vol 02.)

If you haven't booked your ticket for Singularity yet, hurry, as the $99 early bird discount ends at the end of this month.