Sunday, August 31, 2008

OSCON 2008 summary (belated)

I attended OSCON 2008 in Portland, Oregon in July.

It's a good opportunity to see what the Open Source community is up to, and which interesting
technologies are capturing attention. There were a lot of presentations, many of them concurrent; fortunately, many of them have also made it online.

I took a few pictures while I was there; you can find them at my flickr oscon set.
(Also, they had a photo contest from prior years - and one of my balloon photos from 2007 won first place!)

General themes I found most interesting, and which drew a lot of interest from others:
  • large data sets and their implementations (Hadoop, Bigdata)
  • scaling and performance (Facebook, Flickr, and others)
  • propagating data using Jabber's XMPP protocol (aka XMPP Pub Sub)
  • OAuth (open authentication protocol for services)
  • Microblogging (identi.ca)
  • dynamic languages (Groovy, JRuby, Ruby, Erlang)
  • web frameworks (Django, RoR, etc.)
  • virtualization (I attended only a couple of these, lots of interest though)
Of the sessions I attended, here are a few highlights.

****
Open source applications making inroads into mainstream usage:
Django, Alfresco, Zimbra.

****
The XMPP Pub Sub idea got a lot of interest. It was inspired by the massive crawling
that FriendFeed was performing on Flickr, looking for updated images. This idea enables a listener to subscribe to content changes, reusing the Jabber XMPP IM protocol (with an Atom payload).
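As a rough sketch of the idea: the subscriber receives an XMPP message carrying a pubsub event, and the changed content rides inside it as an Atom entry. The stanza below is hand-written for illustration (the node name, item id, and URLs are invented); a real client would receive it over an XMPP connection via an XMPP library rather than as a string.

```python
# Minimal sketch: pull the Atom entry out of an XMPP pubsub event message.
# The stanza text is a hand-made example, not from any real service.
import xml.etree.ElementTree as ET

PUBSUB_NS = "http://jabber.org/protocol/pubsub#event"
ATOM_NS = "http://www.w3.org/2005/Atom"

stanza = """
<message from='pubsub.example.org' to='subscriber@example.org'>
  <event xmlns='http://jabber.org/protocol/pubsub#event'>
    <items node='photo-updates'>
      <item id='1'>
        <entry xmlns='http://www.w3.org/2005/Atom'>
          <title>New photo uploaded</title>
          <link href='http://flickr.example/photo/1'/>
        </entry>
      </item>
    </items>
  </event>
</message>
"""

def atom_titles(message_xml):
    """Return the Atom entry titles carried in a pubsub event message."""
    root = ET.fromstring(message_xml)
    entries = root.findall(".//{%s}item/{%s}entry" % (PUBSUB_NS, ATOM_NS))
    return [e.findtext("{%s}title" % ATOM_NS) for e in entries]

print(atom_titles(stanza))  # ['New photo uploaded']
```

The point is that the listener never polls: it just parses payloads as they arrive, instead of repeatedly crawling for changes.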

****
The Groovy vs. JRuby session discussed when you might prefer one language over the other.
The summary was:
  • Groovy has a better fit with Java, good (and improving) performance, and is best for tight and/or heavy Java code integration. There is also a Groovy compiler that will generate .JAR files, useful for "stealth" integration.
  • JRuby's code integration with Java is good but still needs work. It's best for general scripting and light code integration, but you have both the JDK and Ruby libraries to work with.
Note also that Sun has dedicated resources for JRuby (and Jython).

****
Facebook developers mentioned they had >400 memcached hosts, using multi-retrieval code
that they've written (and shared).

There was also some discussion of TCP vs. UDP, and of high-latency problems caused by East/West coast server traffic.
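The multi-retrieval idea is roughly: map each key to its server, then issue one batched get per server instead of one round trip per key. Here's a toy sketch under that reading - `ToyServer` and the modulo-hash placement are invented stand-ins, not Facebook's actual code or a real memcached client (which would use consistent hashing and the memcached wire protocol):

```python
# Toy sketch of memcached-style multi-get batching; ToyServer and the
# modulo-hash placement are illustrative stand-ins only.

class ToyServer:
    """Stand-in for one memcached host."""
    def __init__(self):
        self.store = {}

    def get_multi(self, keys):
        # One "round trip" that returns every requested key it holds.
        return {k: self.store[k] for k in keys if k in self.store}

def server_for(key, servers):
    # Simplistic key -> server placement (real clients use consistent hashing).
    return servers[hash(key) % len(servers)]

def multi_get(keys, servers):
    """Group keys by server, then issue one batched get per server."""
    batches = {}
    for key in keys:
        batches.setdefault(server_for(key, servers), []).append(key)
    results = {}
    for server, batch in batches.items():
        results.update(server.get_multi(batch))
    return results

# Three keys cost at most one round trip per server, not one per key.
servers = [ToyServer() for _ in range(3)]
for k, v in {"a": 1, "b": 2, "c": 3}.items():
    server_for(k, servers).store[k] = v

print(multi_get(["a", "b", "c"], servers))
```

With hundreds of hosts, collapsing per-key round trips into per-server batches is what keeps request latency sane.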

****

Web Frameworks

There were several discussions centered on frameworks. One new term was SOFEA. The idea consists mainly of moving more functionality into the client, rather than relying upon a server-side framework to perform everything. (Basically this is what's happening with the RIA model that Dojo, Laszlo, and Flex enable.)

Some quick summary judgments on the different frameworks:
  • OpenLaszlo: pummeled by Adobe
  • JavaFX: no one using it (yet at least)
  • Rails: ActiveScaffold adds REST and Ajax, Google Trends shows peak in 2005.
  • Grails: scaffolding not very good yet, but has better performance than Rails
  • Flex: Flex + Rails an interesting platform, doesn't yet support all HTML well.

Performance notes:
LinkedIn has a Rails-based Facebook application that supports 1M requests/month - named "Bumper Sticker".

On a comparison scale, in terms of ms/iteration, we see that:
  • Java, C++ : very low, less than 1;
  • JRuby: 100
  • Groovy: 215
  • Python: 225
  • PHP: 600
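The exact benchmark behind those numbers wasn't given, but for a rough idea of how a ms/iteration figure gets produced, here's a minimal sketch using Python's timeit (the workload, summing 1000 ints, is an arbitrary stand-in, not the talk's benchmark):

```python
# Sketch: measuring ms/iteration for a small workload with timeit.
# The workload is an arbitrary stand-in for whatever the benchmark ran.
import timeit

def workload():
    return sum(range(1000))

iterations = 10_000
total_seconds = timeit.timeit(workload, number=iterations)
ms_per_iteration = total_seconds * 1000.0 / iterations
print("%.4f ms/iteration" % ms_per_iteration)
```

Equivalent timing harnesses exist for each language on the list, which is presumably how the cross-language comparison was assembled.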

In a session I missed, but saw notes from later, there have been some significant performance improvements in Ruby too. They've introduced a compiler that generates code for the LLVM machine, and it's much faster than the C-based "Matz" interpreter. The project is named Rubinius.

****

Mozilla developers discussed ways to implement static analysis of C++ programs.
They have developed a plug-in for GCC that lets them run JavaScript in the compiler, giving
them access to the program graphs directly. This is the "Dehydra" project.

They're already using this technique to refactor some of their existing code - and they're
actually converting it into JavaScript - more on this in a minute. Now that they have access to the program graph, they're looking for other things they can do to the code too: security analysis, bug analysis, standards enforcement, etc.

About the JavaScript conversion: they're using Trace Compilation in the interpreter that takes
care of performance bottlenecks. Doesn't fix one-time execution though. Still, they claim that
much of their code is easily translated into JS, and it's just as fast as the C++ and safer. The target language is JS 2.0, which allows for class/struct within traditional JS objects, making
them safe for C access.

****

The Open Microblogging discussions were interesting. 'identi.ca' is an open source service using the Twitter API that's federated (supports multiple servers). The project's code is 'laconi.ca'. It currently uses HTTP between servers, but they know this won't scale; they intend to use the XMPP Pub Sub idea. There's a lot of interest in this technology, and some good ideas.

****

Mark Shuttleworth from Canonical (Ubuntu's business org) gave an interesting talk about their development practices. It's a mix of lean and agile techniques. They really try to amplify learning, not specialization. "Decide late, deliver early" was mentioned. But later he talked about how knowledge and expertise were more important than colocation, so obviously some level of specialization occurs.

Some of the other practices mentioned:
  • cadence/cycle
  • track bugs, features, ideas
  • branch/merge: keeps the trunk pristine and maintains cadence; merging is important.
  • code review, but they stay away from voting (too divisive).
  • automated tests: unit, integration, utilization, full app, profile usage.
  • pre-commit testing, with the trunk locked to everyone but a robot that runs tests before commit.
****
There was a good discussion on Python & C++ integration using SWIG.
SWIG is now based on a full compiler that reads C/C++ declarations and generates C extensions that allow Python access. Python code can even extend and override C++ classes.
One might have to clean up the headers a little, but often it works just fine. While it's not the best generated code, it works and cuts out most of the work that would otherwise need to be done by hand.

As an example, they put wxPython on it: 6M LOC, 90-95% generated by SWIG, enabling one person to do most of the maintenance on it. (I experimented with SWIG 2-3 years ago and wasn't encouraged - looks like it's improved!)

****
A different discussion, from an engineer at Meebo, described how they implemented a hiring process, and the trials & tribulations they went through trying to "staff up". One unusual practice they instituted was a "simulation" where candidates did a 4-hour exercise reflecting "everyday" tasks. (Can't have candidates do real work, though.)

****
There was the big announcement that Microsoft was becoming a Platinum member of the Apache Software Foundation, as well as assurances that they are continuing to evaluate and license "open source" technology. PFIF/Samba agreement, PHP support in Win2008 (ADODB), Ruby libs were all mentioned.

****

Saturday, November 17, 2007

photographic technology mashup


I recently had a chance to experiment with making digital negatives and using them for contact printing with platinum/palladium emulsions.

Above is a quick scan of one of the images, to give you some idea of what one looks like - although it does not do justice to the final print.

This was something I've been wanting to experiment with since I first heard about using imagesetters for this purpose, but just never got to it.

 Recently, Ron Reeder taught a workshop on the subject at Photographic Center Northwest, so I couldn't resist any longer. Given my background in digital printing and photography in general, it was an easy way to build on both of these experiences and create something new. The workshop itself was two days long - the first day covering the theory and calibration process, the second day generating negatives and printing them.

The technology is elegantly simple: you take the image you want to print, apply transfer functions to map its density range into the range of the print emulsion, then generate a full-sized negative onto translucent film. (It's the determination of these transfer functions - the calibration process - that adds to the complexity. But being able to do this step on a computer makes the task far easier than it was for photographers who used this technique in the 19th century! I'm certainly not complaining!)
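A rough sketch of what a transfer function does: map each input tone through a curve so the image's range lands inside what the emulsion can reproduce. The control points below are invented for illustration - a real calibration would derive them from measured step-wedge prints, not guesses:

```python
# Sketch: apply a piecewise-linear transfer curve to 8-bit tone values.
# The control points are invented; real ones come from calibration prints.

def make_curve(points):
    """Return a function mapping 0-255 input through (input, output) pairs.

    points: sorted control points covering the 0..255 input range.
    """
    def curve(x):
        for (x0, y0), (x1, y1) in zip(points, points[1:]):
            if x0 <= x <= x1:
                # Linear interpolation between neighboring control points.
                t = (x - x0) / (x1 - x0)
                return round(y0 + t * (y1 - y0))
        raise ValueError("input out of range")
    return curve

# Compress full 0-255 tones into a hypothetical printable range 30-220,
# with a midtone adjustment at 128 -> 110.
curve = make_curve([(0, 30), (128, 110), (255, 220)])

print([curve(v) for v in (0, 64, 128, 255)])  # [30, 70, 110, 220]
```

In practice you'd apply such a curve to every pixel before writing out the negative, which is exactly the step the computer makes so much easier than darkroom masking.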

This negative is then used to make contact prints in the wet lab. For this workshop, we were using a hand-applied emulsion that's sensitive to UV light - approximately 5-minute exposures, then "developed" and "cleared". This process uses chemicals I'm not used to handling in the darkroom, but they are not particularly noxious or otherwise dangerous to handle. The result after washing is an archival print, with all the smooth gradations that only a contact print makes easy, perhaps slightly soft but not in a bad way.

This same technique can be applied to other historical and modern printing processes: cyanotypes and albumen prints are something I'd like to try, and even modern silver gelatin prints can be created, with some variations. I'll probably experiment with silver a bit first, because I'd have to build a UV light box in order to work with many of the other techniques.
I can work with silver now - I still have a functional wet lab setup (which admittedly does not get used much these days).

A good starting point for this technology, as well as some wonderful images created using it, is available at Ron Reeder's site.

Thursday, October 25, 2007

catching up on this week

this week started out bad - i got completely derailed from my integration work into investigating customer issues with a new release. I must have burned 2.5 days nailing this stupid thing down. the usual problems getting things set up and reproduced - and of course the debug build didn't show the problem. eventually i found it - a measly little off-by-one error where one of my predecessors compensated for the null byte of a C string twice. In just the right circumstances, the source string would exactly fill the allocated storage, and attempts to copy out that extra byte caused an access fault. first i had to decode the algorithm though - turns out it's some hashing variation from Sedgewick - it completely avoids compacting in return for quick response. I just wonder how quick it is now compared to the more standard implementations such as STL. And of course this is legacy code with no test cases.

iPhone Tech Talk on Tuesday was moderately interesting. Much of the technical material is now available online, at least in video format, but it was good to have some discussion around it. I thought the UI portion was particularly good - advice for general UI design, with a slant toward mobile devices.
Very customer-centric and conceptual - sometimes developers et al. have a hard time resisting drilling down into "features" a bit early.

Also, we had a good trails-planning session last week, and should have some awesome work planned for Grand Forest East this weekend. Come help! see http://www.bitrails.org/ for details.

Oh, also last week, SeaJUG hosted Daniel Shore talking about Agile Development implementation - quite a good discussion on both the technical and structural/social aspects of implementing any agile methodology.

Sunday, October 14, 2007

mid-October

things are kind of busy as usual, have the following projects going on recently:

on the technical side:

Learning about ASP.NET 2.0 Http pipeline, and how to insert modules into it. Pretty slick, and a lot easier than writing ISAPI filters in C. I've been working on an authentication layer for a legacy application using this stuff, and there's a minimum of hacks required.

Trying to learn enough DreamWeaver to maintain an existing school web site, and enough CSS to apply it to a revamped design.
So far so good - thanks in part to DW providing basic CSS templates, I can hit the ground running.
It's all pretty static content, all client-side, but there are monthly meeting updates that need to get posted, so it changes enough that I want to make it easier. Already I've gotten rid of the tables and all the little pixel-GIF hacks (and replaced them with CSS pixel hacks, I guess!)

Have to read another chapter of the Erlang book this week - Mnesia database is next.

on the local side:

a good walkthrough on Grand Forest East yesterday. Trying to get a feel for future maintenance work and possible reroutes.
Nothing firm yet - the current goal, though, is to "overlay" a new trail network on the old one, providing something easier to maintain and more suitable for all users. None of it is easy, but over the years both volunteers and staff have put a lot of time into maintaining the old trail network, which started out as logging roads of course. Now if we could just get some of the many equestrians who use those trails to help out...

attended the North Madison open house. sheesh, you'd think it would not be a big deal to widen a shoulder on the uphill side, and install a pedestrian path on the other. There seem to be a couple of arguments against it, including "preserving the rural ambience" and "keeping it safer by keeping it narrow". My thought is that there are enough walkers, joggers, and cyclists that keeping this arterial as a no-shoulder road is foolish, not safer. It's only preserving the "rural ambience" for people driving; those of us who walk or cycle along it just put up with it. Frankly, I think much of the energy is being driven by people who, over the years, have incorporated part of the existing right-of-way into an extension of their own property, and now don't want to give it back. Everyone seems afraid to say it, but given the folks who attended this meeting and were the most loudly opposed, it seems pretty obvious. In any case, it's time to realize that if we want to build a safe route for someone other than auto drivers, roads have to change to accommodate multiple users safely.

enough for now.

Monday, September 24, 2007

trails update

on Saturday, 9/22, we had our first trails work party under the new 'regime'. no last-minute phone calls to remind people. no signage put out several days before to drum up local support (and guide people). no phone calls to the Park District staff. Oh, and competition from local soccer leagues.

so guess what happened? Three people showed up. Parks staff forgot. Combined with myself and Ed (the other crew leader) that makes five of us, working with whatever tools we brought, which wasn't a lot!

end result: we got a good amount of work done - finishing up the sides & surface of the trail from the main junction down to the Fort Ward fence. Maybe it's better that so few showed up - we got to work in a couple of clusters and move quickly.
also we had a chance to complain about staff once more forgetting about our regularly-scheduled work party. Even after they signed up on the mailing list.

For those of you who need a reminder or want to know where the next one is located: it's always the 4th Sat of the month - sign up for email at http://www.bitrails.org.

Tuesday, September 11, 2007

when is Blowfish valid?

apart from the usual weird technical problems this week (GUID changes, encrypting ASP.NET credentials, Cocoa/WebKit/Java interactions on Panther), I've got to figure out what's wrong with a C-based implementation of Blowfish, how to make it right, AND how to make it match the results of the "official" Java implementation. It seems to work correctly with a "known vectors" test case, and round-tripping works, but the results don't match the Java version. Block cipher, sensitivity to the endian-ness of the machine, character representation - all probably contribute to obfuscating the problem. i need to start with a reference implementation, I think.
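To illustrate one of the suspects (this is not the actual Blowfish code, just a sketch of the endian-ness issue): a 64-bit cipher block is two 32-bit halves, and the same halves serialize to different byte sequences depending on byte order. A C version that dumps raw x86 memory won't match a Java version byte-for-byte even when both ciphers compute identical values:

```python
# Sketch: why endian-ness matters when comparing two implementations
# of a 64-bit block cipher. The half values are arbitrary examples.
import struct

left, right = 0x01234567, 0x89ABCDEF

big = struct.pack(">II", left, right)     # big-endian ("network order")
little = struct.pack("<II", left, right)  # little-endian (x86 memory layout)

print(big.hex())     # 0123456789abcdef
print(little.hex())  # 67452301efcdab89

# Same 32-bit halves, different bytes on the wire.
assert big != little
```

Character representation is the other half of the same trap: the bytes fed into the cipher differ if one side encodes the key or plaintext differently, which is another reason to diff against a reference implementation at the byte level.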

trails all over

two trails-related mtgs this week, the first of the Trails Advisory Committee (TAC) for Fall 2007, the second an open house sponsored by the Non-Motorized Transportation Committee (NMTC). Both mtgs lightly attended (we are having awfully nice weather after all.)

The NMTC showed maps of potential trail corridors, for addition to the City GIS so that they show up during future development projects, before a house gets built smack in the middle. A lot of trails have disappeared over the years, but there still is a loosely-connected mesh. The NMTC is working to identify what's there, and to help determine where best to add new trails to offer better connectivity throughout. (sounds like a network, doesn't it! next we'll be talking about the graph topology...)

As for the TAC mtg, we are at some sort of inflection point, I think. Apparently when the old District dissolved and transferred to the new District, they passed the entire reporting structure of the citizen advisory committees over to the staff, who
seem to have a decidedly different take on the role of volunteers advising them.

What happens next depends upon both staff and TAC members. Since the new staff have taken over, we've had several points of contention - and each time they've revolved around the construction of a trail that had not been reviewed. Often this was at the urging of someone "higher-up" and the resulting trail violated both the existing trail standards (such as they are) as well as the model of low-maintenance trails we've been attempting to develop.

The most recent occurrence of this is the weird little trail that was punched through at the old Lovgreen Road gravel pit, supposedly to improve non-motorized access. I don't think we're going to see many cyclists using it for commuting - it's way too steep, has a 90-degree turn halfway down the slope and a 180-degree turn at the bottom. I'll give them credit - they built one helluva culvert - we'll see whether it's enough to deal with the runoff on that slope.

Much of the TAC work has been to go in afterwards and fix things, and it's getting a little tiring. Sometimes the trails are more than a volunteer group can manage effectively using manual labor. it's one thing to fix up historical trails, half of which were old logging roads or jeep roads anyway, but to find new ones built w/o regard to topography or aesthetics is very frustrating.

As a TAC member, and as a citizen, I've asked informally for a trail-building protocol, and was flatly denied by staff during this meeting, although staff then did promise to try and work with us. Exactly which trail is so important to build "right now" that it needs to be built w/o design review and/or community input, I'm not quite sure. Okay, i'm being cynical.
Maybe this will work out with an "informal" protocol, but given the past incidents I have my doubts.
If not, I guess the next step is to petition the Board or the City Council to establish a formal design review process - and that probably really will slow things down!

In the meantime, next work project continues at Fort Ward Hill trail, and supposedly a design process is going to start in Grand Forest East this winter.