July 2004

You are currently browsing the monthly archive for July 2004.

Off to LWESF

I’m off to Linux World Expo tomorrow where I’ll be talking about high-capacity Linux file server management with Corey Shields.

I’m looking forward to LWE this year as it will be a great chance to catch up with a lot of the folks that we host as well as possibly recruit some more hosting candidates for the OSL.

I’m also hoping to meet with some folks to discuss some issues with community development and growth. Bdale Garbee gave a talk at OSCON last week about Debian and how that community has flourished. However, I see that as good as the community is, it has grown to the point where it almost cannot be managed effectively. From this we have seen off-shoots of other distros such as Gentoo. Of course Red Hat has been seeing the same thing as well. What does it take to successfully grow a community? When the community gets sufficiently large, how do you maintain focus and, more importantly, keep people and the project on task?

If you’re there and would like to meet up then just drop me a note via email or flag me down on the showroom floor.

As always some excellent sessions at OSCON. I’m amazed at how well attended the conference is this year. I’m guessing (just looking at the sequence of numbers on nametags) that we’re talking about 1500 or so people at this thing. That’s pretty cool.

Had to go to the “Commercial OSS Business Panel” which was moderated by Tim O’Reilly and had Bob Lisbonn (VC), Matt Asay (Novell), Zach Urlocker (MySQL), Jason Matusow (Microsoft) and Brian Behlendorf (CollabNET) on it. This was almost exactly the same panel that was at the OSBC and the talk was just as lively as you would expect.

Jason Matusow played the Microsoft party line which of course fits with anything you’d hear from somebody on Wall Street: uniqueness leads to scarcity, scarcity leads to value and value is what makes companies interesting to VCs and allows them to make money. This makes sense for pretty much every traditional business model (including Microsoft’s) but doesn’t fit with open source for obvious reasons. Of course people need to get their licks in on Microsoft and Jason had some excellent points. Matt Asay chimed in with an interesting thought: software is not about owning the source code anymore … it’s about owning the source of the code. Looking at MySQL and JBoss as examples, these companies employ the vast majority of the developers that commit changes to these codebases. That’s a big deal and that is what generates value for their customers and shareholders.

To me, the real value in open source is the community (also considered the source of the code). The eyeballs familiar with the code and willing and able to make it better over time. That’s value. Look at companies like Google and eBay … their big value is in the customers they serve and their ability to help customers help themselves; that’s open source in a nutshell. I would be willing to bet that a VC would be more interested in a company with more customers than more intellectual property.

Companies specifically leveraging open source to make money (JBoss, MySQL, etc) are doing so by answering the age old question “what is the main thing?” A lot of their customers could easily staff up to support, write and manage over time the development of these applications. But why would they? For a reasonable price (and no lock-in) these companies can turn to MySQL and JBoss to do the work for them. This allows these companies to better manage the bottom line over time as well as increase value for their customers.

I also attended a talk by Jason McManus about eCommerce API solutions at eBay (and by extension PayPal). Jason is a funny guy and good with a crowd even if he is a bit self-effacing (we don’t hold it against you that you are a .NET developer … your code will work with Mono). I liked the talk and was left wondering if PayPal would ever enter into the market of providing its API to general folks like us so that we can get out from under the Visa/Mastercard requirements for e-commerce.

Finally I went to a talk by Barracuda Networks about building your own Linux anti-spam firewall. I basically learned a few things: a) I could have started Barracuda at the same time they did, b) the technologies they started using don’t scale (amavisd-new is fantastic but a pig in terms of resources and I have definitely found the practical limit of throughput with the latest MyDoom virus) and c) their solutions to the bottleneck issues are proprietary. It was a good talk and confirmed mostly what I already knew but it was good to go nonetheless.

I’m at OSCON this week and have just arrived for my 1st day (although the 3rd day of the conference). The keynotes this morning were from Tim O’Reilly and Robert Lefkowitz.

O’Reilly is really good at talking about the directions for technology. He always has some excellent examples and possibilities. My personal favorite during this talk was the need for the open source community to “napsterize calendar and contact information” lest this data become locked up into vendor-specific formats. Frickin’ genius IMHO. Also, he talked a bit about dashboard and how they are doing some neat things with their applications. For example, an email comes in and it sends a trigger to a personal search engine and then posts the information to your “dashboard”; topics of last 3 emails, information about the person who emailed you, etc. It was interesting to see Miguel de Icaza give a talk later that afternoon about how GTK and Gnome have had this functionality for over a year now.

Lefkowitz was excellent as well. The guy has more slides than you can shake a stick at and sometimes you’re not sure where he’s headed with something (see many excerpts of the “Princess Bride”) but he always brings it back together. His talk focused on the struggles he had with trying to open source at Merrill Lynch and how what things mean isn’t really what they mean, with many excellent examples.

Later in the afternoon I got a chance to see Miguel de Icaza’s talk on “Mono 1.0” which covered where Mono is at and where it is headed. I have to say I’ve all but ignored the entire Mono project until today. Miguel showed off some neat tools (Mono Developer tied together with Glade) and built a localization compatible web browser in about 10 minutes and 30 lines of code. He changed the localization to French and voilà … worked like a champ. Hebrew? No problem. It even put the buttons and text justified on the right hand side of the browser (to which one heckler in the audience chimed in “which is the back button then?”). There were no Hebrew speakers in the audience so it went unanswered.

One other tidbit he showed was the documentation tools within Mono. They have taken what PHP has done with their documentation and taken it a step further. Embedded in the Mono documentation tools is the ability to edit a wiki page. Changes you make can be sent off to the “master” server where, through a review process, they are added to the documentation. Now that is frickin’ genius and I have a feeling many other projects will be soon to follow (there is a reason that Wikipedia is so successful).

And no OSCON is complete without some Perl lightning talks. :-)

This year’s conference seems extremely well attended. More from the trenches tomorrow.

I spent most of yesterday trying to put the clamps down on the mail issues from the latest variant of Worm.MyDOOM.M. The targeted domain was mozilla.org (whom we relay mail for) and we were running into limitations of our mail relays to be able to stop the virus.

We use amavisd-new, Postfix, SpamAssassin and ClamAV to help stop spam. Our problem was that we relayed mail for mozilla.org. In relaying we have to accept all mail for mozilla.org, process it and then reject it based on content (for example if it’s a virus, which most of the mail was). What I needed was a list of valid recipients for @mozilla.org so I could reject mail based on unknown recipient. I got a drop from the Mozilla folks and slapped it into production. It was a total of 772 aliases that I put into a virtual user table and then Postfix was rejecting mail at a frenetic pace.
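If you want to try something similar, here is a rough Python sketch of turning a list of valid addresses into a lookup table that Postfix could use to reject unknown recipients up front. It is not the exact setup described above. It assumes a Postfix-style table with the address on the left and a placeholder value on the right, and the function name and sample addresses are made up for illustration.

```python
def build_recipient_table(addresses, domain="mozilla.org"):
    """Return sorted 'address OK' lines for a Postfix-style lookup table.

    Assumptions: one address per input entry; blank entries, '#' comments
    and addresses outside `domain` are skipped. The right-hand value ("OK")
    is only a placeholder.
    """
    seen = set()
    for raw in addresses:
        addr = raw.strip().lower()
        if not addr or addr.startswith("#"):
            continue
        if not addr.endswith("@" + domain):
            continue
        seen.add(addr)  # the set removes duplicate aliases
    return [f"{addr} OK" for addr in sorted(seen)]


if __name__ == "__main__":
    # Hypothetical sample input, not real Mozilla aliases.
    sample = ["Staff@mozilla.org", "webmaster@mozilla.org", "", "someone@example.com"]
    print("\n".join(build_recipient_table(sample)))
```

The output would typically be written to a file, compiled with `postmap` and referenced from the Postfix configuration; which parameter to hang it on depends on how your relay is set up.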

Once we started rejecting the mail I started to look at the number of connections we were seeing to our relays. We have 3 brand new Sun v60x’s dedicated to relaying mail and they were seeing about 50-60 connections a second from bogus hosts. I wrote a quick Perl hack that would find hosts that were hitting us with lots of unknown recipients as well as RBL hosts that were hitting us and shoved out the list to the relays and firewalled them off. We have a rolling 24 hour window of blocks going based on our mail logs. As of 9:30am PST we have a list of 1389 hosts that we are rejecting connections from (these are infected hosts sending virii at us).
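For the curious, the idea behind that hack can be sketched in a few lines. This is a Python approximation, not the original Perl script: the log-line pattern, the threshold of 20 rejects and the function names are all assumptions, and the iptables commands are only built as strings, not executed.

```python
import re
from collections import Counter
from datetime import datetime, timedelta

# Assumed shape of a Postfix reject line; adjust for your own logs.
REJECT_RE = re.compile(r"reject: RCPT from \S*\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def client_ip(line):
    """Return the client IP from a reject log line, or None if no match."""
    match = REJECT_RE.search(line)
    return match.group(1) if match else None


def hosts_to_block(events, now, window=timedelta(hours=24), threshold=20):
    """Return sorted IPs with at least `threshold` rejects inside the window.

    `events` is an iterable of (timestamp, ip) pairs already pulled
    from the mail logs.
    """
    cutoff = now - window
    counts = Counter(ip for ts, ip in events if cutoff <= ts <= now)
    return sorted(ip for ip, n in counts.items() if n >= threshold)


def iptables_rules(ips):
    """Build (but do not run) iptables commands dropping SMTP from each IP."""
    return [f"iptables -A INPUT -s {ip} -p tcp --dport 25 -j DROP" for ip in ips]
```

Re-running this on a schedule against the last 24 hours of logs gives the rolling window: hosts that stop misbehaving simply fall off the list the next time it is regenerated.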

In the last 48 hours we have caught several million copies of MyDoom.M in addition to several hundred thousand messages rejected due to unknown recipients. Our relays are now able to breathe again and mail is flowing relatively well to mozilla.org again.

MyDoom.M hits

This morning about 7am PST we had MyDOOM.M hit us pretty hard. The graphs below outline what happened this morning.

At one point we were rejecting close to 2000 messages a minute as virii. In total we blocked well over 1 million messages and countless others with firewalling.

Our mail relays also act as the primary anti-spam/anti-virus relays for mozilla.org, php.net and Freenode.

We can see from the graphs the obvious limitations in our configurations. We have some pretty good hardware but the amavisd-new processes spent far too much time chewing on bogus emails when it should have just rejected them outright. At one point I was actually /dev/null’ing postmaster@mozilla.org because of the deluge.

It’s almost 7pm PST now and the calm is settling back in. After running some numbers on “bad” clients we’re now blocking over 1300 machines from connecting to our relays (gotta love iptables). It’s not ideal, but it will get us through until things calm down.

I’ll be looking into ways to streamline or speed up the amavisd-new processes and possibly split out the virus checking before it gets to amavisd-new. It just couldn’t handle the load.

The Open Source Lab recently split out its rsync services from general user services such as ftp and http that were hosted on ftp.osuosl.org. Now rsync.osuosl.org is completely separate hardware that is updated by the projects that are hosted there. The server is a Dell 2650 with 1TB of attached RAID5 storage and we have given access to several projects (Gentoo, KDE and LFS) so that they can update their archives themselves and push out to the secondary mirrors (and subsequently end-users) even faster.

A case in point is Gentoo Linux that recently uploaded their latest release (2004.2) up to rsync.osuosl.org. See the bandwidth graph below:

You can see that in just 18 hours we were able to populate 23 separate mirrors across the globe with 7.5GB of data. Normally this process can take days and days as the mirrors used to have to compete with end-users to get their data. Also, many projects are starting to realize that pre-populating downstream mirrors with some rsync trickery will allow you to ‘flip-the-bits’ when the actual release day comes.

Our goal in providing rsync.osuosl.org is to help projects such as Gentoo Linux propagate out their wares quickly and easily to their end-users thus helping shorten the open source transaction time.

Orkut for IRC

In between 503 server errors on Slashdot I happened to notice an article about the language tempest brewing at Orkut.

For those of you that don’t know, Orkut is one of those community-based networking sites that allows you to connect to other people through a set of trusted friends. The concept is quite interesting to say the least but I have yet to see a reason that it should stand on its own. I think that Orkut itself has grown at the pace that it has mainly because of the Google affiliation and the fact that Miguel de Icaza must have been the first person to receive an invitation.

One place that I think this concept would work quite well in is IRC. I know here at the OSL we have been providing hosting resources for folks such as Gentoo and Freenode and we see the advantage of having many diverse groups working closely together. This cross-pollination has yielded some interesting new projects as well as helped others get involved in things they might not have otherwise.

I would love to see the Orkut model applied here. Take for example Freenode. If I could go to a website and log in with my IRC credentials and then be able to list “friends” and find folks with similar interests, I would drive traffic on the network. Not only that, but you’d see a growth in the cross-project collaboration that tends to happen when you get everybody in one place. I would love it when I get /msg’d to be able to see who the heck somebody is. Not only that; see who they know and are “friends” with. This model of trust would go a long way (especially considering the difficult nature of communicating via IRC).

I believe that the first network to build into their general services a model such as this would see an explosion of services. I’ll take it one step further; if you run an IRC network and need resources to do such a thing, the OSL will help provide hosting resources (machines and bandwidth) to make it happen.

Virus Outbreak

Yet-another-Windows-virus has hit us again. I think this time we stopped the damage pretty darn quickly.

You see, I am a statistics junkie. Graphs are my favorite. I love being able to whittle down gigs of log information into one easy-to-read purdy graph. See the following graphs as examples of what I’m talking about:

Graph of our spam/bounced/rejected emails.

The above graph is information on how many emails we rejected during the recent outbreak and how the infection has all but died out. We received our ClamAV virus signatures at 10:38 AM PST. Yes, you are reading that correctly; we got up in the 700 messages rejected per minute range for most of the day.

Our mail queues fill up

If ever there was a good reason to have external mail relays with an easy to use MTA (in our case Postfix) then the above image should clarify. Above we see that some of the folks that we relay mail for could not handle the influx of mail and started to stop accepting mail. Fortunately we have loads of queue space and were able to weather the storm.

I would chalk up the success to the fact that we have some great software (Postfix, ClamAV, amavisd-new and SpamAssassin; check out this fantastic howto by Tobias Rice if you want to give this a shot on your relays). This is easily the fifth time that these open source applications (and signatures maintained by volunteers) have saved our bacon. It took a full 3 more hours to get the definitions for the applications we pay for.

First off, hats off to the Mozilla Foundation and all of the folks that have made it a success over the last year (honestly it’s been longer than that as the key players had to work hard long before that to get Mozilla free of AOL and form the foundation). Congrats; it’s been a fantastic year.

And what an amazing year it has been. Many releases of the flagship Mozilla product; new initiatives in the form of Firefox and Thunderbird. Revamped website. Entirely new distribution mechanism (via volunteer software mirrors). It’s been an amazing transformation and one that I’ve been lucky enough to see a portion of (I help maintain the primary mirrors).

So what’s the secret? That of course is the million dollar question. Something has changed in the last year at Mozilla and it’s not just that they aren’t a part of AOL anymore or that they have new offices. It’s something more.

I got a chance to visit the new Mozilla Foundation digs in Mountain View in March and got to meet with some of the folks there to see what it is that makes the MoFo tick. Their offices look like just about any other development shop you’ve ever seen; a huge 1/12 scale replica of the London Bridge built from pop cans, their giant chess board and desks built quite economically from doors and 4×4’s.

In talking with the folks there it was clear that a few things have really made an impact on Mozilla:

Change of customer: While at AOL, the Mozilla team’s first priority was helping AOL with Netscape. The end-users came second. Thus was born one of the original reasons for the Mozilla release. Now free of AOL, the Mozilla developers have just one customer: the end-user. This shift in focus has had a huge impact on the quality and quantity of products released by the MoFo.

Visual identity: When you visit the website, download their products or buy a t-shirt, there is a definite visual identity associated with everything Mozilla. The creation of their visual identity team has had a huge impact on this. A lot of these folks are volunteers as well. To me, it’s almost like seeing the open source mentality reach across a new medium: marketing.

Rapid release cycles: Since its inception, the MoFo has put rapid release cycles high on its priority list. In the truest form of the “release early, release often” open source mantra, the Mozilla Foundation will see major releases of all of its products almost quarterly if not even more frequently.

Embracing the community: With the change in customer focus to the end-user, Mozilla and several of its sister organizations such as Mozilla Europe, MozillaZine and MozDev really do get it. Each of these groups is one more place for users to voice what they want from Mozilla. MozDev even gives them a chance to do development if they so choose. With the recent announcement of updates.mozilla.org, we can see that the end-users are now going to get a chance to contribute directly to the success of the MoFo.

I’m excited about the possibilities for Mozilla. I believe their 2nd birthday will be even better than the first and look forward to continuing to work with them.

I’ve been reading Slashdot for probably 5 years now. Although I don’t have an uber low UID (the low 6 figure range) and I’m just as tired of listening to CmdrTaco whine about everything, I still read it quite frequently. Recently I’ve found that the site is just getting plain stale. What have they done (other than under-the-hood mojo) in the last five years that’s particularly innovative?

In the not-so-distant past, Google and Slashdot were in the same place. Growing user base, lots of page views, $$$ from ad revenue, etc. But somewhere, Slashdot stalled. Where Google is now looking at a huge IPO leveraging their users and innovation after innovation, Slashdot has done almost nothing to grow the community and leverage their users. Of course Google and Slashdot are not in the same business, but back then, who knew Google would/could do anything more than just be a search engine?

I would love to see one or all of the following ideas implemented:

Caching: Squid is your friend. You could easily set it up as a service for your paying customers (or be really innovative and do it for free) where you crawl and then mirror the site in question that you are about to devastate. I don’t know how many times I’ve tried to view a site that has been slashdotted only to forget to go back and look later. A big, fat public Squid proxy cache would solve this problem (or even tie it back to freecache.org somehow).

Talkback: Why not have IRC or IM channels dedicated to each story as it comes on? Then folks can get in there in _real-time_ and discuss it. Then, just as with archives on the website, you could ferret away the chat logs and turn off the channel (essentially making it read-only).

Users: Slashdot’s greatest asset hands-down is its user-base. By any number of accounts there are half-a-million users that read Slashdot on a regular basis. That’s pretty impressive. So what are you doing for them? How are you making their experience better? How are you making it so they can communicate and share and innovate better? The fact is, the creaky slashcode codebase is getting on in years and I just don’t think it is up to snuff.

Maybe it’s time to gut the entire beast and start over?

About

This is the blog of Scott Kveton, digital identity promoter, open source contributor, avid gardener, passionate pizza maker, loving husband and proud father.

Also Known As

Once or twice in my life people have misspelled my name (I know, it’s a shocker) ... you may have seen my last name appear as any or all of the following:

Kverton • Kvelton • Keaton
Rueton • Kreton • Kventon
Kevton • Kevin • Smith (true story)
Kueton • Kvetan • Keveton