Announcement: New Perlwikipedia maintainer

December 8, 2007

Well, I finally bit the bullet today and stepped down as maintainer of Perlwikipedia, my MediaWiki bot framework. My successor is ST47, a fellow admin on enwiki who serves on the Bot Approvals Group and has more bots than I have fingers.

I can’t say that it hasn’t been a long time coming, but I think that ST47 will do a much better job as maintainer than I did. He’s enthusiastic about Wikipedia, is a great Perl hacker, and has written more bolt-on enhancements to Perlwikipedia than there are original lines of code.

In any case, I believe we’ll see a brand-spankin’-new Perlwikipedia release in the near future, one that’s more shiny and can do your dishes.

~alex

libgnomecanvas woes

December 1, 2007

Jhbuild was merrily cranking away at Evolution deps when libgnomeprintui spit out an error about “Too many open files.” So, I cranked up my per-user open files limit in /etc/security/limits.conf to 4096. I logged out and back in again, and the error was still there.
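One way to confirm that a new limit actually took effect after logging back in is to ask a running process what it sees. This is a minimal sketch using Python's standard resource module. The 4096 threshold comes from the paragraph above; the function names are made up for illustration.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def limit_is_at_least(wanted=4096):
    """True if the soft open-file limit is unlimited or at least `wanted`."""
    soft, _hard = nofile_limits()
    # RLIM_INFINITY means "no limit", which always satisfies the check.
    return soft == resource.RLIM_INFINITY or soft >= wanted


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft={soft} hard={hard} ok={limit_is_at_least()}")
```

If this reports the limit you asked for and the build still fails, the limit was probably never the real problem.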

Turns out, on older versions of libgnomecanvas and gail, like the ones that jhbuild uses by default, the two libraries have circular dependencies.

Solution? When jhbuild fails, go to the shell and switch to the libgnomecanvas directory. Then, execute

svn switch http://svn.gnome.org/svn/libgnomecanvas/tags/LIBGNOMECANVAS_2_20_1

Then do the “./autogen.sh && make && make install” business.

Now, in the gail directory, do

svn switch http://svn.gnome.org/svn/gail/tags/GAIL_1_19_6

Build it the same way. Now you can exit the shell and re-run the jhbuild configure step with the circular dependencies resolved.

~alex

Distro wars

December 1, 2007

I have a problem. I’m probably the most indecisive person that I have ever known. I have severe difficulties deciding what dressing to put on my salad. Guitar strings? I need an hour of research. So it’s fairly obvious that I’m horrible at choosing a Linux distribution.

The problem is, there are so many! Since I first saw Linux about 3-4 years ago, I’ve tried Fedora, Gentoo, Debian, Slackware, Sourcemage, SuSE, Lunar Linux, OpenBSD (it’s close enough), and Ubuntu. I’m sure I’ve left a few off that list.

I believe the first distro I ever used was Fedora. That was Fedora Core 4, on a slow-as-dirt i386 machine clocked at around 133 MHz. I then used most of the distributions in the above list on that machine. Guess how well Gentoo performed at package installation on that machine. Needless to say, I soon upgraded my laptop to a blazing fast 1.6 GHz machine. Compared to what I used before, it was raw power.

Of course, having a faster machine also meant that I would need a correspondingly good distribution to handle the awesomeness (seriously, 1.6 GHz was a huge step up from 133 MHz; it was a power trip). So then I cycled through every single distribution I knew. Finally, I settled on Ubuntu, because it was the only distribution that actually worked with my annoyingly-unsupported i810 on-board graphics chipset. That lasted for a good 6 months.

Then, for some odd reason, I decided to switch to Gentoo. I backed up my machine, wiped the drive, and… Gentoo didn’t work with the i810. Great, I thought. I’ll just go back to Ubuntu. And so I did, for another month or so. Then I tried going back to Gentoo. And it failed again. This continued for a good 4 months or so (I didn’t know at the time that there was an i810 driver in the kernel, which I neglected to enable).

Finally, Gentoo worked! It was a miracle! And so, I became one of the Linux elite who compiled everything from source and could manipulate the command line blindfolded while drunk. That lasted about another 6 months. Then, I realized that I was spending more time compiling and configuring than I was doing work. So, I switched to Fedora, which had gone through two cores already and landed at Fedora 7. That’s where I sit today.

However, I’m starting to think that Gentoo isn’t so bad after all. I mean, I’m a geek. I like to do geeky things. I like seeing my system exposed, like it was in Gentoo, not like with Fedora, where all I do is type “yum install evolution” and sit back and watch, blissfully unaware of what is going on. Fedora’s great, don’t get me wrong. If I had used Gentoo for my MythTV setup, it would have taken at least two weeks. But Gentoo seemed so… fun, I guess (As I write this, I notice that jhbuild is building evolution-data-server, which is the farthest it’s ever gotten. Hooray!).

The real question is, if I switch back to Gentoo, what happens? Will I finally resign from Wikipedia? Become a kernel hacker? Oh, that’s the other thing. On Gentoo, kernel modules were dirt simple to build. Just cd to /usr/src/linux/drivers/misc, write a test driver, insert the corresponding fragment into the Makefile, and type make. Done.

But anyway, I don’t know if I’m even sure that I want to start thinking about switching. I need to do more research; my intuition is notoriously unreliable for this sort of thing.

~alex

Jhbuild headaches

November 30, 2007

I’ve spent the last day or so wrestling with Jhbuild, Gnome’s build-from-SVN program. I thought “All I want is to build Evolution, it’s all I ask.” Nah, that would be too easy!

There’s a reason that they say trunk is unstable. I’ve had more build errors trying to get jhbuild to do a clean run on Evolution than I did trying to compile everything from source on Fedora Core 4.

I’ll post problems I encounter and their solutions, assuming I manage to get this thing built finally. It might be easier just to grab Fedora’s srpms. Meh, I’m a developer, let’s take the hard way.

~alex

Censorship = Awesome

November 28, 2007

I received an interesting email today from someone who is helping their friend use my closed proxy server. They said that their friend couldn’t access the server after I had given them login credentials. Naturally, I SSH-ed into the Toolserver, did a wget against my domain name, and it worked. So either China had discovered the proxy and blacklisted it, or there was some other problem which I couldn’t even begin to comprehend.

So, I checked my domain against a free site that determines whether a domain is accessible from China. The test came back saying that it was inaccessible from Beijing, but perfectly fine from Seattle. Guess where this is going.

Apparently, China discovered my proxy and blacklisted the domain name. Just to check, I tested the checker against a secondary Dyndns.org domain that I maintain for redundancy. That domain came back as accessible.

Ain’t censorship grand? I have a feeling that I’m going to need to disclose my domain names only via email from now on.

~alex

Linux rocks!

November 25, 2007

I own two laptops. One is connected to a widescreen TV via the VGA-out connection on the back, and the other is my personal machine that I do all of my work on. Both have wireless cards. I use wireless for the TV laptop because I’m too lazy to pull cable, and I use wireless for my personal laptop because, well, it’s better than being tethered to an Ethernet cable at my desk.

Both cards, however, have Broadcom chipsets. For those of you cringing at the sound of “broadcom,” I feel your pain. Up until recently, the bcm43xx-series of chipsets were impossible to use under Linux. Then the fabulous bcm43xx drivers were released, and now Fedora incorporates the b43 driver into their standard kernel (All of my machines run Fedora, too).

The end result? Linux just works. I know that many have seen the “Mac vs. PC” adverts on TV, but they just have nothing on Linux. I went out and bought a Linksys WPC54G notebook card today, so I wouldn’t have to pull the card out of my TV laptop whenever I felt like going wireless. I pulled the card out of the box, unwrapped the anti-static cover, and plugged it into my laptop. Ten seconds later, I received GNOME’s wonderful “You are now connected to wireless network..” message. Ah, the sweet sound of something working properly.

On Gentoo, though, I think it was more fun to use Linux. I was like an adventurer going into the jungle, not sure of what obstacles lay ahead. I had to compile my own kernels, build software from source, and (gasp) select my packages’ features. It was sweet control-freak bliss.

On Fedora, I think that my Linux experience is closer to the “Macs just work” theory. I run a few simple shell commands to extract the firmware from the Broadcom drivers, and then I can use any Broadcom-based wireless card. Simple.

Linux just works.

~alex

I want to be a developer.

November 24, 2007

It’s hard to get into open source. I mean, for me, anyway. I’m not sure what exactly I want to develop, or even how to start, but I know I want to help out the community.

I’ve been looking at kernel module programming as of late, but the trouble is that I don’t know what to write. I don’t have any devices I use that don’t work on Linux, and there’s not really a central place where people say “Hey, it’d be nice if this device worked.” Not to mention, I doubt that I have enough C / kernel experience to write a decent driver anyway.

Wikipedia is great, but it will only get me so far. I’ve realized that I’m not on Wikipedia anymore as an editor, which bothers me. AntiSpamBot used to be great, but it just seems to be annoying more and more people lately, and I’m seeing little to no net benefit from it. It seems to have lost the Useful Purpose effect, unlike the Tawkerbots, which people seem to adore more than their own children. Perlwikipedia is good as well, except that I’ve been told on a number of occasions that it’s written poorly. I realize that. You can’t expect me to write a flawless work of art that rivals OpenBSD the first time.

I think I’m eventually going to burn myself out of Wikipedia. I’ve been coding in C# lately to write DiffShovel, but I don’t think that DiffShovel is going to be of much benefit to the community, in a manner similar to most of the code I write. And it, too, is written poorly. Even I can tell that, although I don’t think I should be writing good code when I’ve only known C# for a month. At some point, I’m going to realize that I don’t add anything to the project any more, I’ll find others to maintain the bots and Perlwikipedia, and then I’ll vanish.

I’ve looked at contributing patches to apps that I use, like Banshee (my music organizer/player) and the Linux kernel, but I think they’re too big for me to wrap my new-developer-mind around. I need to start small and work my way up, but I can’t find something to contribute to!

~alex

(In other news, I’m at 38,841 words, and this post was written solely to stall for time. In about 30 seconds I’ll be forced to open vi and get to 40,000.)

(Yep, right about now.)

(In a minute.)

November 19

November 19, 2007

I’ve got 12 days left, and I have a total of 29,646 words done. Which, when you work it out, leaves 1697 words per day to write. Not too bad, considering that I’m going to surge through this week at impossible speeds 🙂
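For anyone checking that figure: it comes out exactly if you assume a 50,000-word target. The target isn't stated above, so treat it as a guess. Here is a small sketch of the arithmetic, with a made-up function name:

```python
import math


def words_per_day(written, days_left, target=50_000):
    """Words needed per day to reach the target, rounded up to a whole word.

    The 50,000 default is an assumption, not something the post states.
    """
    return math.ceil((target - written) / days_left)


print(words_per_day(29_646, 12))  # 1697
```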

~alex

Guitar

November 18, 2007

Ah, I love the sound of new guitar strings…

[Photo: my guitar]

Wikipedia’s Tor Problem

November 4, 2007

Today I noticed an essay on Tor. This essay, while very interesting, brought something slightly disturbing to my attention: That another admin had performed a bot-assisted blocking run on “suspected” Tor servers. This isn’t new; admins have been doing these sorts of runs under the radar for a while. Go back and look at the CharlotteWebb RFAR. Yet, this was a fairly large run, and I wasn’t comfortable assuming there wasn’t collateral damage.

So, I went ahead and wrote several Perl scripts to grab all pages linking to Template:Tor, the standard template used to notify editors that an IP is a blocked Tor server. This list went into a Postgres database. Then, another script checked every IP and decided if it was really blocked or not. Only about 12 of the 740 IPs with this template weren’t blocked, nothing major. I removed the template from the IPs’ talk pages and went on. Then, I used a Python script distributed with Tor to get a list of all exit nodes that can access the Wikipedia servers. This also went into Postgres. Now for the problematic part.

I ran an SQL query that took all blocked IPs marked as Tor nodes, then checked if they were actually Tor nodes. The list of supposed Tor nodes contained 740 IP addresses. Want to know how many were really Tor nodes?

87.

That’s right, there are currently 653 IP addresses that were, at one point, probably Tor nodes, but now they aren’t. 653 innocent IP addresses. Now, to put this in context, let’s examine how many REAL Tor nodes are blocked.
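That comparison boils down to a set difference: addresses blocked as Tor nodes, minus the live exit nodes. Below is a rough sketch in Python rather than SQL. It is not the actual query, and the function name and addresses are invented for illustration (the IPs come from the reserved documentation ranges).

```python
def stale_blocks(blocked_as_tor, live_exit_nodes):
    """Return IPs blocked as Tor nodes that are not in the live exit list."""
    return set(blocked_as_tor) - set(live_exit_nodes)


# Made-up sample data, not the real lists.
blocked = {"192.0.2.1", "192.0.2.2", "198.51.100.7"}
live = {"192.0.2.2", "203.0.113.9"}

stale = stale_blocks(blocked, live)
print(sorted(stale))
```

Only 192.0.2.2 is both blocked and live here, so the other two blocked addresses come back as stale.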

I used the block-checking script to check the list of actual, live, Tor nodes that could access Wikipedia. There were 1553 Tor exit nodes when I ran the query. Guess how many were blocked.

269.

To save you the math, that means we are NOT preventing 82.7% of Tor exit nodes from accessing Wikipedia. That’s a great statistic, considering that Wikipedia’s policy on Tor is to disable editing access for Torified users.
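For anyone who would rather not be saved the math, here is the calculation as a short sketch. The two counts are the ones quoted above; the function name is made up.

```python
def unblocked_share(total_exit_nodes, blocked):
    """Percentage of exit nodes that are not blocked, to one decimal place."""
    return round(100 * (total_exit_nodes - blocked) / total_exit_nodes, 1)


print(unblocked_share(1553, 269))  # 82.7
```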

Now, this isn’t a perfect study. I’m not taking into account rangeblocks (which I don’t believe show up on Special:Ipblocklist for an IP in their range), autoblocks, and other things I can’t scan for. All this means is that we have a relatively huge hole through which users can “abuse.” However, I highly doubt they will.

People need to stop taking WP:OP so seriously if they aren’t going to enforce it. I can’t begin to count how many open proxies and Tor nodes I’ve seen blocked that have since been closed or switched to a different IP. Meanwhile, the IP is still blocked, usually for periods of 5 years or more. If you block a proxy, you need to follow up on it! Administrators can’t just assume Tor nodes have static IPs; I, for one, operate a center Tor node (read: A node that can’t allow traffic out, except to other Tor nodes) on a dynamic IP address. We need to start taking more responsibility for our blocks and stop issuing fire-and-forget blocks that will, at some point when the IP changes, affect legitimate users.

I’m probably going to start testing the waters to see how the community would react to a TawkerbotTorA clone. Perhaps now that we’re seeing more adminbots, they’ll finally realize that adminbots are useful for some tasks. Based on what I’ve seen in my study, a bot would certainly be more effective and accurate than some administrators.

~alex