Well, it's interesting from the CRM114 standpoint... I've (obviously) looked into that extensively, and I picked it as a spam filter because it's a completely naive classifier. It has no preconceived notions of what you're trying to filter, so you can train it on anything. It's got a bunch of different modules/classifiers it can use too... one of them, the hyperspace classifier, almost sounds like something out of science fiction. It basically takes the two training datasets and models them as hyperspatial (n-dimensional) galaxies, then emits a single "photon" from each star (data point) and calculates the total illumination of the document being tested, based on its proximity to each star. I'm actually looking at using some of the algorithms for the Artica project, as a way to potentially compare artwork (based upon user/viewer opinions and interactions). The spam filter here uses the much more basic classifiers though, as they're more mature at the moment.
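To make the galaxy picture a bit more concrete, here's a rough Python sketch of the idea. This is not CRM114's actual code or feature set: the word-count features, the inverse-square falloff, the `epsilon` smoothing term, and all the function names are assumptions made purely for illustration. Each training document becomes a "star", and the class whose stars shine brightest on the test document wins.

```python
from collections import Counter


def featurize(text):
    """Turn text into a sparse word-count vector (a point in n-dimensional space)."""
    return Counter(text.lower().split())


def squared_distance(a, b):
    """Squared Euclidean distance between two sparse count vectors."""
    # Counter returns 0 for missing keys, so absent words count as zero.
    return sum((a[k] - b[k]) ** 2 for k in set(a) | set(b))


def illumination(doc, stars, epsilon=1.0):
    """Total light reaching doc from a galaxy of stars.

    Assumed falloff: each star contributes 1 / (distance^2 + epsilon).
    epsilon avoids division by zero when doc sits exactly on a star.
    """
    return sum(1.0 / (squared_distance(doc, star) + epsilon) for star in stars)


def classify(text, spam_texts, ham_texts):
    """Label text by whichever galaxy (spam or ham) illuminates it more."""
    doc = featurize(text)
    spam_light = illumination(doc, [featurize(t) for t in spam_texts])
    ham_light = illumination(doc, [featurize(t) for t in ham_texts])
    label = "spam" if spam_light > ham_light else "ham"
    return label, spam_light, ham_light


if __name__ == "__main__":
    spam = ["buy cheap pills now", "cheap pills online"]
    ham = ["meeting moved to monday", "lunch on monday"]
    print(classify("cheap pills now", spam, ham))
```

Nothing in the sketch knows what "spam" means; swap in any two piles of training text and it will sort new documents by whichever pile they sit closer to.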
As for the actual stupidfilter project... I don't think they're serious about it. I just think it's a good joke idea, like the old Swedish Chef translator.