Planet Beagle

I Roll For Team Beagle

Hackers working on the Beagle Project

July 07, 2008

Robert Love

The Business of Elections

To date, the campaigns have spent a whopping $900 million and are clearly poised to surpass $1 billion before the cycle is over. The New York Times, again proving that their core competency is in producing remarkably informative graphics, has this rad little interactive visualization:

New York Times Election Graphic
$4.3m to Verizon for cell phones!

See also the related article, Cashing In on Obama and McCain.

by Robert Love (noreply@blogger.com) at July 07, 2008 02:33 PM

June 27, 2008

Joe Shaw

this is going to get worse before it gets better

Tell my wife I love her.

by Joe at June 27, 2008 05:58 PM

June 26, 2008

Joe Shaw

promotional consideration provided by

Like Havoc, Brette and I are members of a CSA in which we pick up our veggies once a week from a truck in a parking lot in Central Square. I just picked up this week’s share, and we have some great beets, carrots, mustard greens, and the best strawberries I’ve ever had.

In addition to the veggie CSA, we’re also members of a meat CSA, Chestnut Farms. The meat is local — raised in western Massachusetts and slaughtered nearby — and I pick it up once a month. It comes frozen, but I think it’s done so quickly after slaughter that the meat is still incredibly fresh when thawed… it never tastes like a freezer, it never has freezer burn, and it’s never tough.

They offer different cuts of grass-fed beef, pork, lamb, and free range chicken every month and often have eggs for sale at an additional cost. We’re not big lamb eaters so we don’t get any, but the rest of the food is some of the best stuff I’ve ever tasted. The pork in particular has a much richer flavor than anything you can get in the store (including Whole Foods)… those factory farms just can’t make tasty pork.

We get 10 lbs every month for $70. A quick survey of the freezer shows me that we have: ground beef, ground pork, country-style spare ribs, pork chops, pork tenderloin, pork breakfast sausage, chicken legs and thighs, chicken breast, smoked bacon, and beef round eye roast.

If you’re a meat lover and live in Massachusetts, I suggest signing up even though there’s now a waiting list. If you’re elsewhere, meat CSAs are becoming increasingly popular (Results 1 - 10 of about 643,000 for meat csa.) and well worth it.

by Joe at June 26, 2008 12:00 AM

June 24, 2008

Joe Shaw

i left the living room window open

The weather here this week has been awesome.

A Photo by Joe Shaw

It’s too bad Jacob moved away just as it was getting good. :(

A Photo by Joe Shaw

These kinds of storms remind me of my childhood. When I was six, I remember sitting on the front porch of my grandparents’ house with them and listening to a battery-powered radio. The power had gone out, and a tornado had touched down about halfway between where my parents and grandparents lived — about 30 miles apart. Good times.

by Joe at June 24, 2008 10:03 PM

June 22, 2008

Robert Love

Martian Skies

Collected by the Boston Globe, these photos of Martian skies are without peer. There is a romance to exploring the unexplored, about going somewhere new simply because that's what's next.

It reminds me of President Reagan's speech, quoting from the poem High Flight, later cribbed by The West Wing, on the night of the "Challenger" disaster. Scheduled to give his State of the Union address, he instead spoke from the West Wing:

For the families of the seven, we cannot bear, as you do, the full impact of this tragedy. But we feel the loss, and we're thinking about you so very much. Your loved ones were daring and brave, and they had that special grace, that special spirit that says, "Give me a challenge and I'll meet it with joy." They had a hunger to explore the universe and discover its truths. They wished to serve, and they did. They served all of us.

And I want to say something to the school children of America who were watching the live coverage of the shuttle's takeoff. I know it is hard to understand, but sometimes painful things like this happen. It's all part of the process of exploration and discovery. It's all part of taking a chance and expanding man's horizons. The future doesn't belong to the fainthearted; it belongs to the brave. The Challenger crew was pulling us into the future, and we'll continue to follow them.

The crew of the space shuttle Challenger honored us by the manner in which they lived their lives. We will never forget them, nor the last time we saw them, this morning, as they prepared for their journey and waved good-bye and "slipped the surly bonds of earth" to "touch the face of God."

Slipped the surly bonds of earth to touch the face of God.

Anyhow, beautiful pictures.

by Robert Love (noreply@blogger.com) at June 22, 2008 01:00 PM

June 17, 2008

Robert Love

Food Blog

I am keeping a food blog, Food Tastes Good. It is mostly recipes, such as,

Do check it out, if that sort of thing interests you.

by Robert Love (noreply@blogger.com) at June 17, 2008 11:21 AM

June 11, 2008

Joe Shaw

i would like to lick that sandwich

From the Tri-State Observer:

Our concern is not that we are using the remainder of our strategic grain reserves for humanitarian relief. AAM fully supports the action and all humanitarian food relief. Our concern is that the U.S. has nothing else in our emergency food pantry. There is no cheese, no butter, no dry milk powder, no grains or anything else left in reserve. The only thing left in the entire CCC inventory will be 2.7 million bushels of wheat, which is about enough wheat to make 1⁄2 of a loaf of bread for each of the 300 million people in America.

Wait. We had a strategic cheese reserve and nobody told me about this? Because my strategy is to eat as much cheese as I can when my wife is not around. I have a cheese-eating strategy.

On an unrelated note, congratulations to Bockover, Gabriel, and the rest of the Banshee team for their 1.0 release. These guys are amazing hackers, and Banshee has really matured into a fantastic piece of software. And having a new website up and packages for several distros available on release day? These guys have their shit together.

by Joe at June 11, 2008 03:45 PM

June 01, 2008

Robert Love

Growth, Inflation, Politics, and Mistakes

Mankiw on the corporate income tax:

The ultimate payers of the corporate tax are those individuals who have some stake in the company on which the tax is levied. If you own corporate equities, if you work for a corporation or if you buy goods and services from a corporation, you pay part of the corporate income tax. The corporate tax leads to lower returns on capital, lower wages or higher prices—and, most likely, a combination of all three.

Krugman on embedded versus non-embedded inflation:

Imagine that there are two entrepreneurs, Harry and Louise, both of whom change prices only at fairly long intervals—say, once a year. Other things equal, Harry wants his average price over the next year to be about the same as Louise's; Louise wants her average price to be about the same as Harry's. But their price setting takes place on different dates. (This is a metaphor for the real economy, in which people setting prices have to think about the prices of many competitors and suppliers that will prevail until they revise the price again.) In this situation, inflation can feed on itself: Harry raises his price above Louise's, because he expects her to raise her price in the future, and she does the same thing when it's her turn.

Love on the 2008 US Presidential election:

Clinton, citing Puerto Rican victory, soldiers on ... PR cannot vote in general ... Obama 48 delegates shy of lock

Biz Stone on why the above is so damn slow:

We currently use one database for writes with multiple [sources say two] slaves for read queries. As many know, replication of MySQL is no easy task, so we've brought in MySQL experts to help us with that immediately. We've also ordered new machines and failover infrastructure to handle emergencies.

by Robert Love (noreply@blogger.com) at June 01, 2008 10:38 PM

May 27, 2008

Robert Love

I can't believe this is Massachusetts

Crane Beach, Ipswich, MA

Crane Beach, Ipswich, MA
Crane Beach, Ipswich, MA

by Robert Love (noreply@blogger.com) at May 27, 2008 03:36 PM

May 24, 2008

Joe Shaw

“darn”

Road rules: 1 in 6 drivers would flunk (CNN money)

About one in six U.S. drivers wouldn’t be able to pass a written driving test if they took it today, according to a new study.

Drivers in the Northeast continued to have the lowest scores and the highest failure rates, with New York, New Jersey, Massachusetts, and the District of Columbia maintaining their three-year streak in the bottom five rankings.

SUVs plunge toward ‘endangered’ list (CNN)

Jorge Fernandez makes his way through the car lot littered with unwanted SUVs. “I’ve never seen it this bad,” the auto dealer says. With gas prices at an all-time high, one expert says SUVs and trucks as personal cars may soon be an “endangered species.”

by Joe at May 24, 2008 02:19 PM

May 21, 2008

Kevin Kubasik

The Reality of Semantic Desktops: Death To Tags, Labels and Folders

So, I recently saw some more updates on the Gnome Live wiki regarding the evolution of a ‘Semantic Desktop’. I have some bad news, people: it’s not going to happen. Now, before everyone spends 20 minutes explaining all the ways it could, let me clarify my point. It’s a largely unattainable goal which, if it were ever completed, would be a horrible user experience. I think somewhere between RDF, FoaF, and ObjectRank we lost sight of the original goal of a Semantic Desktop. We wanted to organize, present and store data in a fashion more congruent with the human mind. The general effort behind the Semantic Web and Desktop movements was to reduce the ‘multiplier effect’ of communication. (Take, for example, one e-mail sent to a mailing list: the file and data are now duplicated a hundred times over, and each receiver must file or classify the e-mail in relation to themselves.) On the scale at which communication takes place over the web, this effort is still crucial, but in the desktop world, where we operate on a billionth of the scale, that problem is not nearly as pervasive. No doubt the advances made in understanding and structuring the mass hysteria of the web will benefit desktop users, but I think forcing that structure onto the desktop is not only impossible, but counter-productive.



In my opinion the options are clearly laid out before us:

1) Move the desktop into the structured realm of a million and one tags/categories/color filters/labels/folders

- Or -

2) Get rid of it all. And just know what the user wants. (Ok, not really all of it, but instead of adding more hierarchies, we add more in-place understanding)



I know, it’s a bold statement, but somewhere between my tags, stars, labels, folders and emblems I realized that all these efforts we were making towards ease of use and understanding are just obfuscating things even further. These elaborate systems that require users to squeeze into sub-par standards like iCal exacerbate the problem even more, and ignore the efficiency of simple systems, like a pad of paper. (Yes, props to Tomboy.) The problem is, many times a blunt-simple interface requires significantly more work on the programmer’s side (to actually understand the data entered) than a more traditional tabs-and-forms approach. I think we are demanding too much from users: how many people actually keep their address book completely updated? Or tag all their photos, or keep every document in the right folder? Even those who are vigilant eventually fall behind, and that’s because users already know what the material they are filing is, but still have to spend time explaining to the computer which items are related and where they belong. Especially for users with large sets of desktop data (a few thousand docs, e-mails, photos, and songs) the time can add up. We should not be asking users to commit even more time to data integrity and organization with more tagging systems.



The way I see it, we can count on 2 skills from a Desktop user.

1) Searching (Thank you, Google! Most people are quite comfortable with search phrasing!) or, more accurately, knowing what they are looking for

2) Using their computer even when they aren’t looking for something (i.e. content generation, surfing the web, chat, etc.)



These are the common denominators that we should be reaching for. We shouldn’t be trying to make the user classify their relationship with each person in their address book; we should just always be there, identifying the relationship based upon their level of interaction, and on a higher level than traditional approaches have taken us. After working on the Beagle Project for some time, the incredible weight of maintaining the backends to communicate with each mail client, each RSS reader and each chat client almost seems to drown out the gain from having the data in a central and unified place. I mean, each time it was just someone talking to someone else, right? Why have we taken simple actions and tried to codify them, when the complexities of human behavior are so great any psychologist would tell you it’s a guessing game anyway? I think we should start with the disorganized mess that is someone’s workday at a computer and ask for nothing else. Reverse the system: take all of our analytical energies and structure, use them for ourselves, in the backend, and just have the users use computers.
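To make the idea of identifying a relationship from the level of interaction a bit more concrete, here is a minimal Python sketch. It is not Beagle code: the `Interaction` record, the event kinds, and the weights are all hypothetical, chosen only to show that a ranking can fall out of ordinary usage data without asking the user to classify anyone.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """One observed event, e.g. an e-mail, a chat message, a shared file."""
    person: str
    kind: str  # hypothetical labels: "email", "chat", "file"


# Hypothetical weights: sharing a file counts for more than a single e-mail.
WEIGHTS = {"email": 1.0, "chat": 2.0, "file": 3.0}


def rank_contacts(events):
    """Score each person by weighted interaction count, strongest first."""
    scores = Counter()
    for event in events:
        # Unknown event kinds fall back to a weight of 1.0.
        scores[event.person] += WEIGHTS.get(event.kind, 1.0)
    return [person for person, _ in scores.most_common()]


events = [
    Interaction("alice", "email"),
    Interaction("bob", "chat"),
    Interaction("bob", "file"),
    Interaction("alice", "email"),
    Interaction("carol", "email"),
]
ranking = rank_contacts(events)
print(ranking)
```

The user never tags anyone here; the ordering is derived entirely from events the desktop could already observe. A real system would presumably also decay old interactions over time, which this sketch leaves out.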



The best example of this is the phenomenon of tagging: basically, associating like objects via keyword phrases. The problem is that tags restrict themselves. Let’s say I have created a blog post about web browsers. While the tags ‘html, web, mozilla, ie’ may indeed be the most accurate 4 words from my point of view, they in no way approach the whole set of meanings and connotations carried by all their synonyms, let alone the entire post. In the realm of multimedia, tags are more useful, as images and videos are harder to extract contextual value from, but there is a better way….



Let’s be smart! Instead of trying to stem the tide of data to make it more manageable, we ride the wave! Data is very rarely stagnant on a machine: people send photos to friends, edit each other’s papers, and share music all the time. There is a wealth of information in the chat I have with a friend while he listens to the new song I sent him; we just need to grab it!



I have specifics and even a little bit of code for my next post, but until then, I want feedback: do people agree? I mean, yeah, there are a million and one ways for me to catalog and store my data, but when I’m actually looking for something the tags never seem to help much. While tags and folders do help with the clutter problem, I want to propose that we move completely beyond presenting the hierarchy to the user, and start determining how (from the most basic of usage data) we could better present and organize information. Is the ubiquitous search box the only UI system that fits? What about something like Dasher meets lowfat, powered by an incredible datastore, but for files?








by Kevin Kubasik at May 21, 2008 11:24 AM

May 20, 2008

Nat Friedman

We're Hiring

One of my most fun responsibilities at Novell is running the SUSE Incubation Team: a  small team of developers focused on innovation, prototyping, and exploratory hacking.  Our charter is to come up with disruptive ideas that take Novell's Linux business in exciting new directions.

The team is a diverse group, ranging from web developers who love working in Ruby on Rails to kernel hackers and virtualization experts, and it's a great privilege to work with them.  We have an upbeat culture that's tolerant of experimentation, we're obsessive about delivering innovative and amazing experiences to our users, and we hold each other to high standards.  Besides our exploratory development work, the team is also responsible for running the twice-a-year SUSE Hack Week.

As it happens, one of our projects — an innovative web application — is starting to look promising and so we're working on getting it ready for a limited public beta.  And we're looking for a few talented, energetic developers to help us get there.

The job descriptions are below.  Keep in mind that we're not looking for specialists: we're a small team, and we need people who are willing and happy to shift gears whenever necessary.

If any of these sound interesting to you, mail us your CV/resume.

We're open to hiring people in any location, but we have a slight preference for people who can work in Nürnberg, Germany, and a preference for people close to the UTC+1 timezone.  We offer competitive salaries and benefits in a fun, tight-knit team.

Quality Engineer

If you believe that quality is priority one and that great QA also means writing code, then this could be the job for you.  We are looking for a skilled programmer to help create and run a robust testing environment for an innovative new web service.

Your responsibilities will include building and maintaining a test harness and test environments; automating UI testing of our web application; monitoring and analyzing test results; helping to fill in unit and functional tests; and playing the role of bugmaster in our Bugzilla.

The ideal person will be a strong programmer who can tell a good bug report from a bad one, will consider themself a whiz at scripting (shell, perl - whatever works for you), and will enjoy understanding the ins and outs of a sophisticated system.

Deployment and Release Engineer

Interested in designing and operating a streamlined deployment architecture for a cluster of several hundred cores? We are looking for an engineer to architect and manage the build, release and deployment infrastructure for our new web service.

Your responsibilities will include creating and maintaining deployment scripts; creating deployable packages and images; system administration of production machines; building RPM packages and virtual images to simplify deployment; and setting up and maintaining a cluster monitoring infrastructure.

Linux packaging and system administration skills, and experience deploying web applications are a must; experience with Ruby on Rails is a plus; solid programming skills and a strong focus on delivering a great user experience are required.  Infrequent travel to our data center in Boston will also be required.

Developer

This position will be working today on the core of our web application, which is mostly written in Ruby on Rails, Perl and in C.  Ideal candidates will be creative self-starters with a strong focus on user experience and performance, and will have good communication skills and experience working in teams.

Because of the nature of our team, we can't allow ourselves to be defined by the tools we happen to be using at any given time.  Today you might be writing Ruby on Rails, but tomorrow you could find yourself knee-deep in C: whatever it takes to get the job done.  Above all, we're looking for smart programmers who don't mind learning a new codebase or a new language overnight, and who are willing to hit a few dead ends before arriving at the perfect solution.  We're also looking for people who are good writers and have good design skills.

If you're applying for this position, please send us some code that you've written that you're particularly proud of.

by nat at May 20, 2008 02:15 PM

May 19, 2008

Kevin Kubasik

Bazaar and its Rockage

So, I think most of the open source world has agreed that the DRCS model fits our working style better than the traditional model pushed by SVN and CVS etc. And in this DRCS world we have rallied around 3 main tools: Bzr, hg, and git. And in an even greater display of complacency we have given those 3 tools quick and general classifications that became obsolete almost a year ago. Bzr is user friendly but slow and technologically inferior, hg is the champion of the middle but with slow development and a lackluster community, git is wicked fast and ‘The Right Way’™ but a pain to use.

Really? Come on guys, those molds were cast almost a year and a half ago; isn’t it time we looked at things again? Git has an entirely new interface, hg has a slew of plugins/extensions, and bzr has a completely new repo format and network protocol, resulting in a massive speedup. Now, I’m not claiming to be some unbiased source, and comparing 3 incredibly robust tools is not my job, but the amount of support that Git receives from its very vocal supporters makes me feel a need to give props to my favorite DRCS system: Bazaar.

That’s right, Bazaar (or bzr) is awesome. Sure, git is awesome too, and so is mercurial, but I have found myself loving bzr. I’m not going to attack other DRCS tools, I just want to extol the awesomeness that is bzr.

1) Bzr is Python-Tastic! - As a python hacker, being able to utilize a robust API and plugin system is a cool plus, this also generates lots of powerful and complete plugins, which leads me to the next point.
2) Bzr has a ton of plugins! - Plugins like bzr-avahi (allows the discovery of branches on a local network, great for sprints/hackfests), bzr-svn (makes working with upstream repositories easy as pie!), quilt and gtk tools.
3) Bzr works on Windows - Yeah, I’m not a huge fan of accommodating Windows users, but it makes collaboration easier, I don’t have to make my roommate boot into Ubuntu to lend a hand with some CSS bugs.
4) Bzr is easy to share - The ability to push branches to some central repo is a big component of distributed development. While patches work in some cases, most of the time, having access to a branch makes the whole system work better. Both Git and Hg require a bit of work to set up a new repo and push a branch, bzr supports a ton of protocols and can create the target directory/repo with one command. Sharing is easy!
5) Bzr is fast - Maybe others are faster, maybe it could be a million times faster, I dunno. What I do know is the only thing I seem to wait on is my net connection… I realize that many people need more than that. So here you go. http://bazaar-vcs.org/Benchmarks
6) Bzr is small - In my development model (a shared repo with branches inside of it) bzr is compact and aware of disk space, without repositories it might be huge, I dunno.
7) Bzr is clear about what’s happening - I can follow what Bzr is trying to do with my code. A branch is a new directory, and I can always see my code. Not only is this comforting/reassuring, but I often utilize IDEs like Wing, Eclipse, or Monodevelop when working on code, and while they can handle other systems, directories for branches translate to every editor and work well.
8) Bzr is reliable - A massive suite of unit-tests and a commitment to their excellence offers some comfort that I won’t be left holding half of my code in one hand and an ugly binary blob in the other.
9) Most of all, it’s a feeling. It’s hard to explain, but I don’t notice bzr. It’s just there, and I just have my code. I rarely take notice of it, and don’t focus on it. I spend 99% of my time coding and every 30 min I enter a terminal for a few seconds to do all my DRCS stuff. Maybe it’s why people who use Bzr aren’t very vocal about it. It’s not a revolution in revision control, and I don’t do a million cool things in it. I just write code, and bzr is there, doing whatever it does.

by Kevin Kubasik at May 19, 2008 04:43 PM

May 09, 2008

Nat Friedman

Rack servers in Boston, make money

We're looking for a neat, meticulous person to help us rack and wire some servers next week, on Tuesday and Wednesday. The tasks are unboxing, carrying, mounting, screwing, wiring, and testing the servers.  Pay is $20/hour, duration is until we're done, location is in Waltham (we'll pay for your transport).

If you're interested, send mail to pzb@novell.com and mention any relevant experience or skills.

by nat at May 09, 2008 08:17 PM

May 08, 2008

Kevin Kubasik

Utah Python Users Group

If you’re in the greater Salt Lake area and love Python, swing by the meeting this evening! We’re doing a Python editor head-to-head, should be fun!

by Kevin Kubasik at May 08, 2008 11:44 PM

Debajyoti Bera

One way to get things fixed in Ubuntu

Become famous and then blog about it. It's easy (the blogging part). And it
works [1a, 1b].

Now if only someone famous blogged about some other literal showstoppers in
beagle and KNetworkManager, some more Kubuntu users would be happy [2].
Beagle is a second-class citizen in Ubuntopia, but I thought KNetworkManager is
important.

One thing I noted though, since Beagle was moved to Universe from Main, it is
getting better treatment. Periodic sync from Debian is actually happening
unlike when it was in main; the core Ubuntu developers rarely found time to
update the Beagle package. Kudos to Masters of the Universe!

[1a] https://bugs.launchpad.net/ubuntu/+source/galago-sharp/+bug/186049/comments/9
[1b] http://arstechnica.com/reviews/os/hardy-heron-review.ars/3
[2] https://bugs.launchpad.net/ubuntu/+source/beagle/+bug/207157

by noreply@blogger.com (dBera) at May 08, 2008 11:14 AM

May 07, 2008

Arun Raghavan

IITK Fascism Update

So we (some of us students) met and decided to do something about the sudden implementation of the Internet shutdown from 0000-0600. Some updates:

The intimation about doing this was sent at 2357 hours today (yesterday, to be precise) to all. The notification basically stated that because of “undesirable activities”, Internet will, with immediate effect, be disabled from 0000 to 0600 every day. And that’s it -- poof. The hostel network is disconnected from the rest of the Institute, thus making sure that nobody can access the Internet (or even the Institute’s own computing facilities). To compensate, the Computer Center (with a capacity of <200 computers) is to be kept open 24x7.

Of course, this was unacceptable, so a bunch of us decided that something needs to be done. There are 2 issues -- the decision, and how it was implemented. While the decision itself needs discussion (more about this later), the implementation is of immediate concern. People were not prepared, and work on several people's theses was affected. Plus, this has been done just a little after the end-semester exams, when most students are not on campus. This sort of fascism usually rears its head under precisely these circumstances. We decided that what needed to be addressed right now is the implementation -- the Internet has to be made available this night.

A couple of our student representatives spoke to the Dean of Student Affairs (the DoSA -- the official channel between the students and the administration). The DoSA basically said that they, the various Deans and the Director (and Deputy Director?), had made the decision and nothing would be done about it. More precisely, the Director, as the highest power in the Institute, has taken the decision and that's that. Further discussion may be taken up with him.

About 60-70 of us went to the Director's house at about 2:30 (the entire process was peaceful -- there was no shouting or slogans). We met with the security, who called the Head of the Computer Center (CC) and the DoSA to the place after some attempted dodging.

The CC Head turned up first and started asking what our problem was. He offered such resources as a vehicle to ferry us from the hostel to the CC, along with as many pen drives as we required to transfer data from our machines to the CC machines. The DoSA just said that we’ve given you 2 years to think about whether this should be implemented, and now we will be implementing it, so there.

Our student representatives (who did a pretty good job), after some dialogue, got the connection reinstated for tonight. They will be further taking up the issue later today.

The decision itself is extremely foolish, of course. Moreover, the dictatorial way in which this is being done is just as shocking. Let’s see how things pan out in time. Perhaps sense and sanity will prevail.

May 07, 2008 10:45 PM

May 06, 2008

Robert Love

A Beautiful Day

Boston over the Charles
Spring arrives in Boston (cf. winter)

In yesterday's Financial Times, Larry Summers on tax competition and cooperation:

First, the US should take the lead in promoting global cooperation in the international tax arena. There has been a race to the bottom in the taxation of corporate income. Closely related is the problem of tax havens that seek to lure wealthy citizens with promises that they can avoid paying taxes altogether on large parts of their fortunes. It might be inevitable that globalization leads to some increases in inequality; it is not necessary that it also compromise the possibility of progressive taxation.

Agreeing or disagreeing with Secretary Summers' point is as much a question of the role of government as it is one of international economics. I generally view tax competition as a healthy restraint on the tax burden and thus a bridle on the size of the state. Here, Larry is taking the view that without cooperation, you will have nanny states without nannies and thus nothing to transfer.

by Robert Love (noreply@blogger.com) at May 06, 2008 10:58 AM

May 04, 2008

Robert Love

You're gonna be so proud. Proud? Proud.

From the make-Edward-Tufte-proud department, another stellar graphic in today's Sunday Times, this one visualizing the basket of goods making up the CPI and both the relative size of those goods within the consumption bundle and the year-on-year change in that size:

All of Inflation's Little Parts by Amanda Cox

You always glean points from a good visualization that you don't from the tabular data. For example, consumers spend the same amount (about 1% of total) on cable service as on doctor visits. The portion of consumer spending allotted to "computers" has declined 12% year-on-year. Rising import prices, particularly oil (which, although denominated in dollars, experiences the same exchange rate pressure as other world market goods), and growing food costs account for the bulk of inflationary pressures. I am happy to note that if you rent your home, don't own a car, and spend most of your money on clothes and bacon, your purchasing power has actually increased year-over-year.

Note that, while a proxy, the change in spending on a category is not the same as inflation. For example, the share of spending on citrus dropped 9.5% year-over-year. That could be due to deflation, but the spending drop could also be caused by a decrease in demand—perhaps consumers are substituting oranges with apples, which grew 7.5% year-on-year. Alternatively, note that while the cost of most health-related expenses went up, so did the science advancing the field, ushering in new drugs and improved procedures. If you aren't comparing, say, apples to apples year-on-year, you are measuring more than monetary inflation. These are just two of a myriad of problems with computing inflation.
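The distinction between a change in spending share and a change in price can be sketched with a toy three-good basket in Python. All prices and quantities below are made up for illustration and have nothing to do with the actual CPI data; the point is only that a falling share is consistent with either story.

```python
def spending_share(prices, quantities, item):
    """Fraction of total spending that goes to one item."""
    total = sum(prices[k] * quantities[k] for k in prices)
    return prices[item] * quantities[item] / total


# Hypothetical baseline basket: every good costs 1.00 per unit.
base_prices = {"citrus": 1.00, "apples": 1.00, "other": 1.00}
base_qty = {"citrus": 10, "apples": 10, "other": 80}

# Scenario A: citrus gets 10% cheaper, quantities unchanged (deflation).
cheaper_prices = dict(base_prices, citrus=0.90)

# Scenario B: prices unchanged, consumers swap one orange for one apple.
substituted_qty = dict(base_qty, citrus=9, apples=11)

share_base = spending_share(base_prices, base_qty, "citrus")
share_a = spending_share(cheaper_prices, base_qty, "citrus")
share_b = spending_share(base_prices, substituted_qty, "citrus")

print(f"baseline share:     {share_base:.4f}")
print(f"price fell:         {share_a:.4f}")
print(f"demand substituted: {share_b:.4f}")
```

In both scenarios the citrus share drops by roughly the same amount, yet only scenario A involves any change in price, which is why the share on its own cannot be read as inflation.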

A page earlier, Alan Blinder argues for greater regulation of the financial industry. Unfortunately, Prof Blinder notes:

It will, for example, substantially reduce the profitability of investment houses and, therefore, reduce their scale. But that’s the price you pay for access to a publicly financed safety net.

No doubt increased regulation, particularly in the area of margin requirements curbing excess leverage, will lower short-term profits. But I don't see why the goal of any changes in regulation shouldn't include maintaining or even improving longer term profits. After all, you'd have to take substantial bites out of Goldman's earnings to equal the loss in a single implosion such as Bear's.

by Robert Love (noreply@blogger.com) at May 04, 2008 03:16 PM

May 03, 2008

Robert Love

Linux Journal Readers' Choice

Linux Journal unleashed their annual Readers' Choice Awards the other day, and I am proud to note that Linux System Programming—my recent work on system-level Linux hacking—received an honorable mention in the category of "Favorite Linux Book." Whether you live strictly at the lowest levels or only occasionally reach outside your cozy virtual machine; whether you code in C++ or Python; whether you are wolf or neophyte, the text is both an excellent guide to systems programming and a handy reference to Linux's sparsely-documented system call API.

Also, congratulations to GNOME for winning "Favorite Desktop Environment" and, natch, vi for winning "Favorite Text Editor."

Disclaimer: I am Contributing Editor at LJ, but I was wholly uninvolved in the Readers' Choice Awards. Hat tip to "loyal reader" for the link.

by Robert Love (noreply@blogger.com) at May 03, 2008 04:53 PM

May 01, 2008

Robert Love

Economists in Post on Gas Tax

Today's Post tackles yesterday's topic:

A growing chorus—including a top congressional Democrat—labeled Sen. Hillary Rodham Clinton's proposal for suspending the federal gasoline tax ineffective and shortsighted yesterday, even as she continued to paint Sen. Barack Obama as insensitive to drivers' woes for not endorsing the plan.

The inimitable Prof Mankiw chimes in:

Harvard professor N. Gregory Mankiw, who has written a best-selling textbook on economics, said what he teaches is different from what Clinton and McCain are saying about gas taxes. "What you learn in Economics 101 is that if producers can't produce much more, when you cut the tax on that good the tax is kept by the suppliers and is not passed on to consumers," he said.

Over the short-run—and particularly over the summer, with refineries already at maximum capacity—the quantity supplied is fixed. Cutting the tax will cause consumers to simply bid the price back up to its original value, allowing demand to meet the fixed supply.

Here is a policy proposal: Ditch the gas tax and replace it with a broader tax on all carbon. Offset the carbon tax with a revenue-neutral reduction in marginal tax rates. Also—for good measure—abolish all farm subsidies.

by Robert Love (noreply@blogger.com) at May 01, 2008 11:32 AM

Kevin Kubasik

Mono GSOC Projects: Linq to SQLite

So I noticed that one of the accepted proposals for the Mono project is to create a LINQ provider for SQLite. Major props to this (it’s something I totally want to see!) and I’m glad to see that LINQ in Mono is going to be its own beast; I love it when the FOSS community just takes a technology and runs with it! Anyways, I wanted to try and get in touch with the mentor/student of this project and share my experience (as the author of the current LINQ to SQLite component). But contact info seemed hard to come by, so I thought I would post what I had learned.

First, people really want this, and there are several half-complete implementations floating around, including mine (read only, no commit/update/delete support) and this one.

Second, support for just queries is quite easy. Support for complete CRUD is tedious but not too difficult (lots of examples already exist). Support for the generation/mapping/reflection of a database to real Linq objects is the tricky part (specifically the UI elements when unable to just piggyback the Visual Studio work).

Anyways, all the luck in the world to this GSOC project, I would really like to see a working implementation come from this!

by Kevin Kubasik at May 01, 2008 08:38 AM

April 30, 2008

Robert Love

A Gas Tax Holiday. Seriously?

Although neither are in office, Senators Clinton and McCain have both endorsed a gas tax holiday this summer, temporarily eliminating the 18¢ per gallon federal excise tax. To his credit, Senator Obama has denounced the holiday as not "an idea designed to get you through the summer" but one "designed to get through an election." It is also bad economics.

The price of fuel during the "holiday" will depend on gas's elasticities of supply and demand. As the short-run supply of gas is fairly constant—in the short-term, supply is fixed as factories (at least over the summer) already run at full capacity—the holiday price of gas will rise to meet the pre-holiday price.

Put another way, given the fixed supply, the price of gas will rise until the quantity demanded drops to meet the quantity supplied. Since the supply is invariant with respect to the tax, the price will not change.
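A worked version of the argument, as a sketch under textbook assumptions that are not spelled out in the post: supply is perfectly inelastic at a quantity $\bar{Q}$, demand $D$ is strictly decreasing, and the tax $t$ is a per-gallon excise. The symbols $p_c$ (price paid by drivers) and $p_s$ (price kept by suppliers) are introduced here for illustration only.

```latex
\begin{align*}
p_s &= p_c - t \\
D(p_c) &= \bar{Q} \quad\Longrightarrow\quad p_c = D^{-1}(\bar{Q}) \\
\frac{\partial p_c}{\partial t} &= 0, \qquad \frac{\partial p_s}{\partial t} = -1
\end{align*}
```

Under these assumptions, moving $t$ from 18¢ to zero leaves $p_c$ where it was and raises $p_s$ by the full 18¢.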

Gas taxes, in the short-term anyhow, do not modify behavior—they just transfer payments from the supplier to the state. Thus, Clinton's version of the holiday, which replaces the gas tax with an offsetting tax on gas producers, asininely accomplishes nothing, but at least her plan is funded.

Let's assume fuel prices do drop. Over the course of the summer, this will save the average driver the cost of about a tank of gas (Obama says half a tank, but my calculation comes out a little higher). Now, if the price drops, the quantity demanded will increase and thus consumption increases (this will bid the price up, as we are now assuming supply is not inelastic, by some amount less than the full 18¢). What happened to yesterday's policy of the day, global warming? And what happened to last year's headliner, crumbling infrastructure, which the gas tax funds?

The proposal is just pandering, but if we really care about stimulating the economy by putting money in consumers' hands, there are better methods than targeted tax credits, such as cutting marginal income tax rates.

by Robert Love (noreply@blogger.com) at April 30, 2008 07:14 PM

April 28, 2008

Debajyoti Bera

I knew searching would fail

There is a great report about a usability test on the web about a guy
giving some common computer tasks to his girlfriend on a fresh Hardy
Heron installation. I found it via Slashdot, but since fewer and fewer
people are slashdotting these days, here is a link [1].

The tasks are well chosen and the user is _not_ a first time
grandma-type-user and approaches each task in the obvious possible
way. "Obvious possible" - for people not used to a Linux distribution,
where "doing stuff my way" rules over "getting stuff done".

As I started reading it, I knew at some point the user was going to
"search for something". And I knew she was going to fail. Which she
did.

The problem with most (all?) of the linux desktop search applications
is that they are cut out for a particular task and are (hopefully)
pretty good at it. Indexing is the keyword there - how to index all
kinds of data in the best possible way and then allow users to search
the indexed data. And there are lots of sophistication there.
Unfortunately the common search tasks of a user are not quite that.

- Search for a file by name - most common
- Search for files of certain types
- Search for files in home directory containing some text - slightly
advanced usage
- Search among browsed websites etc.
- As a computer "user", it is not clear why I would search for
websites in the desktop search tool and not in the browser. Of course,
once I am told this can be done in the desktop search tool _too_, I
would be extremely glad and nod in appreciation.

It takes time to write a desktop indexing and searching system. I
didn't believe it when I first heard of it and my friends asked me what
is so difficult about it other than implementing inotify. For some
reason it is. So a lot of effort is invested behind that. But there
has been less effort in presenting a failsafe, minimum capability
search experience in that direction. What do I mean by failsafe and
minimum capability ?

- One obvious way to launch the search tool (there could be more, but
there should be one which may not be the best but works in the worst
case)
- The obvious tool should never fail on the basic searching - never,
never, never. By basic searching I mean searching for non-file content
information - name, size, type (what on earth is _mimetype_ to a
non-CS/IT person ? searching by extension is what I mean. broad
classification like music, picture helps).
- Repeat the above. Let me say it this way - if the user knows a file
exists, she should be able to find it by name. And matches by name
should come _first_. Same for search by types.
- Anything else is a bonus. When we have complete semantic desktops,
where a file is same as an email and same as an addressbook entry,
maybe then users would want to search for everything or specify what
exactly she wants. Not now.

So where does beagle fall behind (or some of the other tools, from
reading about them and looking at their screenshots)?
- Users want to use them to search for files. The tools return pretty
much everything.
- Give them an option on where to search. There is no need to
include an option for "application/rdf+xul" but list the common
options. A search service has to work for the minimum, a GUI has to
cater to the average. I would be sad if it didn't have ways to cater to
the advanced crowd too but I don't mind if that requires one extra
step.
- User wants to search everywhere (in the filesystem).
- That's definitely not what beagle was designed for. A beagle search
tool is not expected to do that. But when it is presented as the
_main_ search tool to the users, it will be used to search everywhere.
And it will fail.
- I don't quite know how to design a failsafe GUI search tool but a
good start would be to use the indexing service for home directory files
and brute-force 'find /usr/bin' for non-home directory partitions.
- Some users would never need to search by content.
- If searching content was cheap then it would not be a big deal if
searching by content is enabled. But content searching is expensive
from my experience. It would be better if users are allowed to opt-in
for content searching.
- Content searching is not supposed to be expensive. As far as
beagle is concerned, it is halfway on meeting this goal. It still
needs some fault-tolerance feature to detect problems before too much
damage has been done.
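The "failsafe, minimum capability" layer described above can be sketched as a plain name search that needs no index at all. This is only an illustration, not anything shipped with beagle: the function name find_by_name is made up here, and it assumes a find(1) that supports -iname (GNU and BSD find do).

```bash
#!/usr/bin/env bash
# find_by_name PATTERN [DIR...]
# Hypothetical helper: brute-force, case-insensitive match on file names only.
# No index and no content search, so it cannot miss a file that exists.
# Searches $HOME when no directory is given.
find_by_name() {
  local pattern=$1
  shift
  [ "$#" -gt 0 ] || set -- "$HOME"
  # Errors such as unreadable directories are hidden; matches are still printed.
  find "$@" -iname "*${pattern}*" 2>/dev/null
}
```

A GUI could show these name matches first and append whatever the index returns for content matches afterwards, for example find_by_name report ~/Documents /usr/share/doc.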

There are lots of other ways to make searching "just work". The user
does not even need to know there is any indexing in the background.
The sad part is a lot of what I suggested (or could have suggested) is
already possible with the current beagle infrastructure. What is
lacking is someone with a good GUI knowledge to work on improving the
search experience. I am defending the base by fixing the occasional
simple bugs but a real developer is needed. And needed urgently,
otherwise yet another distribution will be released with a lacklustre
search experience.

[1] http://contentconsumer.wordpress.com/2008/04/27/is-ubuntu-useable-enough-for-my-girlfriend/

by noreply@blogger.com (dBera) at April 28, 2008 02:12 PM

April 27, 2008

Nat Friedman

More Tweetable Scripts

A few more tweetable commandlines have emerged since I posted the last round-up.

From pupitetris, this little work of art:

a=1;for i in {1..34};do printf %$[40-${#a}]s"$(eval $(echo $a*$a|bc|sed 's/$/0/;s/\([0-9]\)/tput setab \1; echo -n \\ ;/g'))"\\n;a=1$a;done

This Linux-specific commandline from Justin:

s=.o0O0o.o0O0o.o0O0o.o0O0o.o0O0o.o0O0o.o0;n(){ for x in `seq $1 $2 $3`;do notify-send ${s:0:x}; done };while :;do n 1 2 39;n 39 -2 1;done

And I wrote these two:

clear;for x in {0..150}; do y=`echo "12+6*s($x/6)"|bc -l|cut -d. -f 1`;echo -en \\e[$y\;"$(($x/2))"HX; sleep .1;done

s=`seq 9|shuf`;while :;do for((i=0;i<15;i+=2));do echo $s;a=${s:i:1};b=${s:i+2:1};[ $a -gt $b ]&&s=${s:0:i}$b\ $a${s:i+3};sleep .2;done;done

That last one is a bubble-sort implementation in 140 characters. Unfortunately, 140 characters is one character too many for a twitter post. Can you figure out how to shave off a character or two? (You'll need a recent version of coreutils for shuf).

Thanks to some helpful hints in the comments (abock, knipknap, Mitch) we're down to 137 chars:

s=`shuf -i1-9`;while i=;do for((;i<15;i+=2));do echo $s;a=${s:i:1};b=${s:i+2:1};[ $a \> $b ]&&s=${s:0:i}$b\ $a${s:i+3};sleep .2;done;done
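For readers who want to follow the logic, here is an expanded sketch of the same idea with the animation removed. The function names bubble_pass and bubble_sort are invented for this illustration, and the loop stops once a pass changes nothing, whereas the one-liner repeats forever. It assumes single-digit values separated by single characters, as in the original.

```bash
#!/usr/bin/env bash
# bubble_pass STRING
# One left-to-right pass: compare each digit with its right-hand neighbour
# (two characters further on) and swap them if they are out of order.
bubble_pass() {
  local s=$1 i a b
  for ((i = 0; i + 2 < ${#s}; i += 2)); do
    a=${s:i:1}
    b=${s:i+2:1}
    if [ "$a" -gt "$b" ]; then
      # Rebuild the string with the pair swapped, as the one-liner does.
      s="${s:0:i}$b $a${s:i+3}"
    fi
  done
  printf '%s\n' "$s"
}

# bubble_sort STRING
# Repeat passes until one leaves the string unchanged.
bubble_sort() {
  local s=$1 prev=
  while [ "$s" != "$prev" ]; do
    prev=$s
    s=$(bubble_pass "$s")
  done
  printf '%s\n' "$s"
}
```

Calling bubble_sort "3 1 2" prints the digits in ascending order. The golfed version uses a string comparison (\>) instead of -gt, which gives the same result for single digits.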

I'll be posting more on twitter as people send them in.

by nat at April 27, 2008 01:23 PM

April 25, 2008

Kevin Kubasik

Speaking at UT Code Camp

So, if you live in the greater Salt Lake City area, there’s a pretty cool low key (and free!) conference coming up, the Utah Code Camp. I’ll be doing a little talk on getting data out of HTML with Python (utilizing lxml and twill). If you’re interested, you can register here.

by Kevin Kubasik at April 25, 2008 05:21 AM

April 20, 2008

Debajyoti Bera

Takes Two to Release

The GMail backend I blogged about before is now available for mass abuse in Beagle 0.3.6(.1). We also tried to maintain our love of cutting edge technology by upgrading the Firefox extension to work with Firefox 3.0.

I noticed several forum posts where users wanted to use beagle like locate/find-grep. The desire was two pronged - no intention to run a daemon continuously and return files from everywhere doing basic searches in name and path. That is not how beagle is supposed to be used but users are the boss in a community project. So I added blocate, a wrapper to beagle-static-query. Currently it only matches the -d DBpath parameter of locate but works like a charm. Sample uses
  $ blocate sondesh
  $ blocate -d manpages locate

The other thing I added was a locate backend. I absolutely do not recommend using this one. Yet if you insist ... when enabled and used with the FileSystem backend, it will return search results from the locate program. Yes, results from eVeRyWhErE, as you wished.

You can use both the GMail and the locate backends in beagle-search as well. But both the new backends are rather primitive, so I have taken enough precautions against n00bs accidentally using them. So in summary, 0.3.6 is not going to do you any good. Oops... did I just say that ?!

The title is based on the empirical count of the number of actual releases (including brown bag ones) needed for the last few releases.

by noreply@blogger.com (dBera) at April 20, 2008 10:04 PM

April 18, 2008

Robert Love

Linux System Programming, Nipponese

Linux System Programming cover, Japanese

Linux システム プログラミング!

Just received my copy of the Japanese edition of Linux System Programming, which you can likewise own for a mere ¥ 3,780. I am told my unique brand of humor translates particularly well into the Japanese language.

Everyone, regardless of vernacular, ought to buy a copy. Reading it is nice, but not required.

Also available is the mother tongue version of LSP. Its like reading the Talmud in Aramaic. Or Shakespeare in retarded English.

by Robert Love (noreply@blogger.com) at April 18, 2008 02:39 PM

April 17, 2008

Joe Shaw

nice, $20.50 more than i thought

I just got my first IRS “stimulus package” scam email:

From: Internal Revenue Service
Date: Thu, Apr 17, 2008 at 11:17 AM
Subject: Tax refund - Online form
Mailed-by: gmail.com

After the last annual calculations of your fiscal activity we have determined that
you are eligible to receive a tax refund of $620.50.
Please submit the tax refund request and allow us 3-6 days in order to
process it.

A refund can be delayed for a variety of reasons.
For example submitting invalid records or applying after the deadline.

To access the form for your tax refund, please click here [link to 62.219.243.194]

Note: For security reasons, we will record your ip-address, the date and time.
Deliberate wrong inputs are criminally pursued and indicated.

Regards,
Internal Revenue Service

Copyright 2008, Internal Revenue Service U.S.A. All rights reserved.

Pretty well done, I would say! Somebody should tell them though that works of the US Government are not copyrightable.

by Joe at April 17, 2008 06:01 PM

April 14, 2008

Robert Love

Linux System Programming, reviewed

En route back to Boston from LugRadio Live, caught a Slashdot review of Linux System Programming:

I have been looking for something that would take my K&R level of experience and bring it up to date with modern methods, hopefully letting me write more efficient and reliable programs. Linux System Programming is a volume that targets this need.

[Easy introductions of an advanced concept] are done in a nicely graded level for each topic. In "file access" to give an example, you are lead from simple read/write calls, through to what the C library can provide in buffering, to improved performance using mmap. The techniques continue with descriptions of I/O schedulers and how the kernel will order hardware disk access, scatter/gather, and ends up with how it is possible to order block reads/writes yourself bypassing any scheduler.

You are hardly aware of the progression, as the pacing is very well done. New concepts clearly fit into what you have seen so far—current sections signpost the practical use of what is being explained and at what cost, allowing clear consideration of the use of advanced features against any consequences.

I recommend this book to anyone who has a need to developing Linux applications.

The review rated the book an eight on a ten point scale—but decide for yourself. Justice is not served until every occupant of this planet owns one copy for every toe and finger on their body.

by Robert Love (noreply@blogger.com) at April 14, 2008 06:55 PM

April 13, 2008

Robert Love

Seasons of a Day

Noon, late afternoon, early evening, late evening shot, respectively, of the Bay Bridge with foreground farrago.

George Soros has a book hitting shelves on the credit crunch. Yes, this newest effort invokes reflexivity within the first few pages.

by Robert Love (noreply@blogger.com) at April 13, 2008 01:02 AM

April 11, 2008

Nat Friedman

Ten Tweetable Scripts

Yesterday morning I proposed a contest to create the best one-line program that would fit inside Twitter's 140-character buffer. To kick things off, I wrote this 105-character script which displays a small animation:

s="-<";while true;do echo -ne "$s\r";s=`sed 's/->$/-<-/;s/^</>/;s/-</<-/;s/>-/->/;'<<<$s`;sleep 0.1;done

Arturo (or Pupi as his friends call him) wrote a 135-character morse code decoder in shell:

m=etianmsurwdkgohvf?l?pjbxcyzq;p=0;while read -sn1 c;do [ -z "$c" ]&&p=0&&echo&&continue;let p+=c;echo -ne \\b${m:$p:1};let p+=p+2;done

Press '0' for dot, '1' for dash, and hit space (or enter) as a char separator. Wow!

I learned a few tricks from Arturo's script. First, he uses the ${} braces operator to take substrings, like so:

${var:offset:length}

This is incredibly useful! You can actually do shell arithmetic in the offset and length parameters, too. So for example,

${var:i+1:a-3}

is valid for shell variables $i and $a. And to find the length of a string, you can use:

${#str}

So str="foobar"; echo ${#str} will print "6". You can read more about the braces operator in the bash info page.
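A small sketch that puts the three forms side by side. The variable names first_three, middle and length are chosen here for illustration, and the values of i and a are arbitrary.

```bash
#!/usr/bin/env bash
str="foobar"
i=2
a=5

first_three=${str:0:3}     # offset 0, length 3
middle=${str:i+1:a-3}      # arithmetic in both fields: offset 3, length 2
length=${#str}             # number of characters in $str

echo "$first_three $middle $length"
```

Note that these substring forms are bash features (also found in ksh and zsh) and are not available in a plain POSIX sh.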

Another thing I learned from Arturo's script is the versatility of the 'read' builtin in bash. Pupi uses the -s argument, which causes read not to echo its input (useful for inputting passwords) and -n1 which tells it to only read one character. Also, Arturo uses [ test ] && operation, which is a handy short-hand for an if statement in shell (and other languages).
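Both tricks can be tried in isolation with the sketch below. The function names first_char and classify are made up for this example, and the -r flag and IFS= are additions not present in Arturo's script; they only keep backslashes and whitespace from being altered.

```bash
#!/usr/bin/env bash
# first_char: read exactly one character from standard input, without echo.
first_char() {
  local ch
  IFS= read -rsn1 ch
  printf '%s' "$ch"
}

# classify CHAR: the "[ test ] && operation" short-hand in place of if/then.
classify() {
  [ -z "$1" ] && echo "separator" && return
  echo "symbol: $1"
}
```

When standard input is a terminal, first_char returns as soon as a key is pressed; when it is a pipe or here-string, it simply consumes the first character.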

Pádraig Brady wrote this excellent screensaver:

tr -c "[:digit:]" " " < /dev/urandom | dd cbs=$COLUMNS conv=lcase,unblock | GREP_COLOR="1;32" grep --color "[^ ]"

Pádraig makes use of the square-brace character class operator in tr(1), together with -c to complement the set, so that everything except the numerals is replaced with a space. Bash also supports these character classes.
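A minimal sketch of the character class in both places. The helper names digits_only and is_digit are invented for this illustration.

```bash
#!/usr/bin/env bash
# digits_only: replace every character that is NOT a digit with a space,
# which is what "tr -c" with the [:digit:] class does in the screensaver.
digits_only() {
  tr -c '[:digit:]' ' '
}

# is_digit CHAR: the same class inside a bash pattern match.
# Succeeds only when the argument is exactly one digit.
is_digit() {
  [[ $1 == [[:digit:]] ]]
}
```

For example, printf 'a1b22c' | digits_only keeps the digits in place and blanks out the letters.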

Building on what I learned from Pupi, here is one I wrote that I call paint.sh:

c=12322123;x=20;y=20;while read -sn1 p;do k=${c:(p-1)*2:2};let x+=$((k/10-2));let y+=$((k%10-2));echo -en \\033[$y\;"$x"HX;done

Use the 1 2 3 and 4 keys to move the cursor around the screen. It's an etch-a-sketch for your terminal! You can see that I made use of the read -sn1 trick from pupi as well as the braces operator to substring. I also used ANSI escape codes to position the cursor.
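The lookup string in paint.sh packs two digits per key: the tens digit is the horizontal step plus 2 and the ones digit is the vertical step plus 2. The sketch below unpacks it; decode_move and goto are names invented here, not part of the original script.

```bash
#!/usr/bin/env bash
# Lookup table from paint.sh: two digits per key, in key order 1 2 3 4.
c=12322123

# decode_move KEY: print "dx dy" for key 1-4.
decode_move() {
  local p=$1 k
  k=${c:(p-1)*2:2}                          # two-digit entry for this key
  echo "$((k / 10 - 2)) $((k % 10 - 2))"    # tens -> dx, ones -> dy
}

# goto ROW COL: emit the ANSI cursor-position sequence ESC [ row ; col H.
goto() {
  printf '\033[%d;%dH' "$1" "$2"
}
```

Keys 1 and 2 therefore move left and right, and keys 3 and 4 move up and down (rows grow downward on a terminal).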

And this is one I call rockband.sh (Updated - works much better now!):

while read -sn1 p;do s="";for((i=0;i<$p;i++));do s=x$s;done; yes $s > /dev/audio&sleep 0.1;kill %%;done

Use the number keys to play different tones. When you're done, hit Control-c.

The way it works is that the ASCII value of each character you send to /dev/audio specifies the excursion of the speaker diaphragm (roughly). The 'yes' command prints whatever string you give it, followed by a newline character (ASCII 10, pretty low), over and over again. So the longer the string of 'x' characters you pass to 'yes', and which 'yes' prints between newlines, the slower the oscillation of the speaker diaphragm, and the lower the tone. Neat, huh? I learned this trick from my boyhood friend Edward Loper many years ago.
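A rough estimate of the pitch for each key follows from that description. The sketch assumes the device consumes 8000 one-byte samples per second, which is the traditional default for /dev/audio but is not stated in the post; tone_hz and SAMPLE_RATE are names made up here.

```bash
#!/usr/bin/env bash
# Assumption: /dev/audio plays 8000 one-byte samples per second.
SAMPLE_RATE=8000

# tone_hz N: approximate frequency when 'yes' repeats N 'x' characters.
# Each cycle is N bytes of 'x' plus one newline, so N+1 samples long.
tone_hz() {
  local n=$1
  echo $(( SAMPLE_RATE / (n + 1) ))
}
```

Under that assumption, higher number keys produce longer cycles and therefore lower tones, matching the behaviour described above.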

And here's the last one I wrote:

s=" #55755071317011117011117075557";for i in `seq 2 $((${#s}-1))`; do k=${s:i:1}; for b in 1 2 4; do echo -n "${s:(k&b)/b:1}"; done; echo; done

Miguel submitted this tiny function plotter:

for x in `seq -1 .05 1`; do y=`echo "s($x*8)*10+10" | bc -l`; for p in `seq 0 $y`; do echo -n " "; done; echo "*" ;done

And here's another plot:

for x in `seq -5 .5 5`; do y=`echo "$x*$x" | bc`; for p in `seq 0 $y`; do echo -n " "; done; echo "*" ;done

Those last three scripts make use of the venerable "seq" command to generate a series of numbers. Miguel uses fractional steps, but if you only need integers you can also use braces in shell, like this:

sum=0;for i in {1..100}; do let sum+=i; done; echo $sum

Ryan Paul of ArsTechnica fame wrote this Ruby script:

proc{|f|f[proc{|x|x+1},0]}[proc{|x,y|proc{|f,z|x[proc{|w|y[f,w]},z]}}[proc{|f,x|f[f[f[f[f[f[f[x]]]]]]]},proc{|f,x|f[f[f[f[f[f[x]]]]]]}]]

Ryan is using the "proc" primitive in Ruby, which allows you to create an anonymous function (like lambda in lisp), and which I didn't know about even though I've been coding Ruby off and on the last few months. He uses Church encoding to encode the numbers 7 and 6, and lambda calculus to multiply them, thus confirming that he is the most awesome IT journalist working today.

Finally, Jay Wren sent in this C program:

main(x,y){for(;x++;) for(y=2;x%y;)printf( ++y/x+"\0%d\n",x);}

of which he is not the original author (and which I suspect was an IOCCC entry), but which is a very compact way of generating all the prime numbers. The author uses the args to main to save space on variable declaration, and the leading null-terminator in the string is a really clever way to select whether or not to print the output without an if statement. Lots of cleverness in there (though the algorithm to find primes is just brute force).

There were too many good entries to declare a winner, and maybe a contest was the wrong idea anyway. But this was a lot of fun. If you want to send me a script on twitter, be sure to send a "@natfriedman" message after, so that I notice you.

by nat at April 11, 2008 02:09 PM

Robert Love

I will move the White House to San Francisco

A gentle reminder: LugRadio Live is this weekend at The Metreon!

Alcatraz

I present on Sunday, laying down a developer's overview of Android, our mobile platform.

by Robert Love (noreply@blogger.com) at April 11, 2008 10:38 AM

April 10, 2008

Nat Friedman

Take the Tweetable Script Challenge

Twitter limits posts ("tweets") to 140 characters. This constraint makes sending updates to your friends challenging, but it makes programming more interesting.  I just tweeted this 105 character shell script:

s="-<";while true;do echo -ne "$s\r";s=`sed 's/->$/-<-/;s/^</>/;s/-</<-/;s/>-/->/;'<<<$s`;sleep 0.1;done

(Pasting from the tweet link seems to work a lot better than pasting from my blog — not sure what wordpress is doing to that script) (Fixed - disabled smart quoting in wordpress).

Cute, huh? :-) But you can probably do better. Tweet your one-liner, and then send a @natfriedman message on Twitter so that I notice it. Best tweetable script posted today wins. All the basic shell languages are allowed, but your script has to be pastable into the shell, i.e. "perl -e" is ok.

by nat at April 10, 2008 01:37 PM

April 09, 2008

Robert Love

You can't put it up in the West Wing

Via Wired, a new map of US carbon footprint:

Map of the U.S. Carbon Footprint
Image courtesy of The Vulcan Project

The map is cute, and the raw data is invaluable, but the visualization is worthless. As it is, this map does not look any different than one depicting population density.

Needed is a visualization corrected for population, thus showing carbon footprint per capita. Even better would be a map corrected for GDP, thereby depicting carbon per unit of economic output.

Otherwise, what is the policy prescription derivable from this thing? Hey, Northeastern Corridor! If you ditch your population and your jobs, you can be as pollution-free as Montana!

by Robert Love (noreply@blogger.com) at April 09, 2008 05:51 PM

Doug Cutting

cutting


A few days ago Google announced its App Engine, which lets folks build applications that run in Google’s cloud. Amazon has for a while had a number of services to let folks run applications in Amazon’s cloud. But in both of these cases, one must use their proprietary APIs.

For example, Google provides a datastore API that applications must use to persist state, while Amazon similarly provides a simple DB API. Amazon’s services are generally lower-level and easier to adopt à la carte, while Google provides one-stop-shopping. Either way, one’s application code becomes dependent on a particular vendor. This is in contrast to most web applications today, where, with things like the LAMP stack, folks can build vendor-neutral applications from free (as in beer) parts and select from a competitive, commodity hosting market.

As we shift applications to the cloud, do we want our code to remain vendor-neutral? Or would we rather work in silos, where some folks build things to run in the Google cloud, some for the Amazon cloud, and others for the Microsoft cloud? Once an application becomes sufficiently complex, moving it from one cloud to another becomes difficult, placing folks at the mercy of their cloud provider.

I think most would prefer not to be locked-in, that cloud providers instead sold commodity services. But how can we ensure that?

If we develop standard, non-proprietary cloud APIs with open-source implementations, then cloud providers can deploy these and compete on price, availability, performance, etc., giving developers usable alternatives. But such APIs won’t be developed by the cloud providers. They have every incentive to develop proprietary APIs in order to lock folks into their services. Good open-source implementations will only come about if the community makes them a priority and builds them.

Hadoop is a big initial step in this direction. Its current focus is on batch computing, but several of its components are also key to cloud hosting. HDFS provides a scalable, distributed filesystem. It doesn’t yet meet the high-availability requirements of cloud hosting, but once folks who need that help to build it, it will. HBase provides a database comparable to Amazon’s Simple DB and Google’s Datastore API. It’s still young, but, if folks want, it could become a solid competitor to these.

Moral: if you want commodity cloud hosting, pitch in now.

by Doug Cutting at April 09, 2008 04:45 PM

April 08, 2008

Debajyoti Bera

You said Google owns your life

And I nodded my head and felt sympathy for you. I also used the I_doubt_its_maintained_anymore and just_barely_works xemail-net to write a live GMail backend for beagle. It does not index the emails as of now, but uses the IMAP search protocol to directly search the emails on the GMail IMAP server. And searching followed by retrieving the headers of the matched emails is really slow; the delay is clearly perceptible. It could be due to xemail but I could not find a better alternative for a .Net IMAP library.

It should not be hard to take the current backend and add the ability to download a batch of emails and index them locally. Google also publishes a nice set of GData .NET API for accessing Google documents, calendars and a lot of other services. A backend for them would at least make our beloved maintainer happy.

A basic GMail query-only backend was on my TODO list for more than a year. And now it's finally done. Proper GMail indexing and Google service is also on my TODO list. So hope to see it sometime... say by next year.

by noreply@blogger.com (dBera) at April 08, 2008 07:31 PM

Robert Love

Seriously?

McCain 2008 Graffiti
As seen in the Meatpacking District, New York, NY

by Robert Love (noreply@blogger.com) at April 08, 2008 12:23 PM

April 07, 2008

Arun Raghavan

Knock knock ...

Been a good week. I crossed 50 commits to Beagle. They’re all pretty modest contributions, but it’s been awesome fun.

In addition, pkgcore 0.4.4 has my patch to support HTTP proxies for rsync. This was a fun patch to write, small as it is. The code is beautiful, and Brian Harring (ferringb) and Patrick Lauer (bonsaikitten) walked me through a lot of it. Good stuff!

I’ve also been working on splitting the gnome-python* ebuilds to make the dependency trees for packages that use these bindings a lot saner. This has been longer and more painstaking than intended. It wouldn’t even have happened if Jim Ramsay (lack) hadn’t made an excellent start with the gnome-python-desktop split, since all subsequent work was based on that. Hope this is useful to someone, though. :-)

As I said, a good week.

April 07, 2008 08:50 PM

April 03, 2008

Robert Love

Hank!

Hawk Attacks Girl At Fenway Park (video):

A red-tailed hawk attacked a girl on a tour of Fenway Park Thursday, drawing blood and sending the girl to a hospital for treatment.

[Hank] swooped down on the girl, [his] talons cutting her head above her eyes.

The hawk flew off after the attack.

Great. Now he has a taste for human blood and will live to despoil another day.

Hank the Hawk

Although this debauch pales in comparison to my feces-drenched stoop, with today's massacre the jive turkey has gone too far. That bird must be stopped.

by Robert Love (noreply@blogger.com) at April 03, 2008 06:30 PM

April 01, 2008

Joe Shaw

worst day of the year

STILL NOT FUNNY.

by Joe at April 01, 2008 01:28 PM

Robert Love

The Worst Day on the Internet

Today would not be unbearable if the fake press releases and bogus news stories were, you know, funny.

Robert Love
Robert, now a happy Windows user and strong supporter of single-payer healthcare!

by Robert Love (noreply@blogger.com) at April 01, 2008 12:51 PM

March 31, 2008

Kevin Kubasik

Sound problems in Ubuntu Hardy

So if you’re like me, you’ve been suffering through some painful sound problems in Ubuntu Hardy. Apparently it’s a known kernel issue, so just sit tight. However, if you’re like me (or 90% of nerds) then you need some sort of music to code. A little digging revealed that I did not in fact have any of the alsa kernel modules installed for my current kernel. apt-get left me high and dry (also without an nvidia driver yet, but that’s an easy fix).

The simple remedy is to just build the alsa modules yourself, a pretty painless task. The problem is, if you want to have any hope of keeping your install halfway clean, then you need to get those files tracked by dpkg so we avoid conflicts when the modules are fixed. There’s a simple solution:

sudo apt-get install module-assistant
sudo m-a update
sudo m-a prepare
sudo m-a a-i alsa

This utilizes the handy module-assistant package to automatically build alsa for you. :) Reboot and enjoy!

by Kevin Kubasik at March 31, 2008 07:43 PM

Back From PyCon, Break

So I just returned from my massive onslaught of travel that started with PyCon, took me from one US coast to the other, a Caribbean island, and then back home to Washington D.C. I’m on Spring Break for the rest of the week, and hope to get some good blog posts in regarding the awesomeness that was PyCon 2008!

by Kevin Kubasik at March 31, 2008 07:10 AM

March 30, 2008

Nat Friedman

Guns, Guts and God

At dinner during a recent meeting of Democrats Abroad in Brussels, an articulate American investment banker from London recounted the story of a visit he'd made to a Republican gathering in the US where he learned that the unofficial motto of the Republican party is:

Guns, Guts and God make America great. Republican Party.

This pithy catchphrase impressed me. In 9 words it defines a coalition of voters (church-goers, rural gun-owners, and military families), emphasizes strong patriotism, and sketches a personality that's instantly familiar to many people.

"So," he went on to ask, "what's the analogous slogan for Democrats?"

There were a lot of intelligent people at the table, and murmured discussion followed. The closest anyone could get to a counterpart was the familiar "Strength through diversity." But it doesn't pack the rhetorical punch of the Republican slogan. In fact, it could be cynically interpreted as another way of saying "We couldn't agree on a motto."

I found myself wondering whether this was an inherent state of affairs. In a two-party system, if one party comprises a well-defined coalition, the other party could end up picking up the scraps — and be left with such a diverse group of members that it would have trouble expressing common cause, except "we're not them."

Or maybe a group defined by its tolerance, rationality, and empiricism simply can't deliver the kind of bumper-sticker policy positions that the Republican party can.

Certainly the division we see right now between the Obama and Clinton supporters hasn't happened in the Republican party, despite the fact that McCain is despised by many conservatives.

I was reminded of another quote I read recently:

A conservative is a liberal who got mugged and a liberal is a conservative who got arrested.

There's a symmetry in here which would seem to point the way to some kind of catchphrase.

Liberal political groups in other parts of the world manage to cohere well, and to express themselves compellingly.

Can you come up with a catchy slogan for the Democratic Party?

by nat at March 30, 2008 09:36 PM

March 28, 2008

Debajyoti Bera

Better late than never

Beagle 0.3.4 was released into the wild last week. We tried to fix the build problems (new Gnome-sharp, Mono 1.9, missing files from last release) but building Beagle with Ndesk-DBus 0.5.2 remained broken, as per tradition.

Other than that this version builds nicely with Mono 1.9 and contains #ifdef-ed code to use the Mono.Unix.UnixSignal API when built with Mono 1.9. That should ensure that beagled and index-helper will quit when asked to quit. Yes, this is the 21st century.

There are a lot of mappings in Beagle and many of them are hardcoded. We are gradually moving them to user-configurable files. The config files were moved out earlier; in 0.3.4 we have moved out the query mapping (mapping "ext:html" to the right internal property name). If you want to add a mapping for some property you repeatedly query, just add it to the local ~/.beagle/query-mapping.xml.

I added another handy tool to use Beagle as updatedb/locate. Use beagle-build-index or beagled to create indexes (like updatedb), and then use beagle-static-query to query the indexes (like locate); the long-running beagled daemon does not have to be running.

E.g. I created an index for system files in /usr/bin, /bin and other global directories (the --disable-filtering is to disable filtering file contents, since here I only care about the file names and such),

$ beagle-build-index --recursive --disable-filtering --target ~/.systembeagle /usr/bin/ /usr/local/bin/ /bin/ /etc /usr/local/etc/

Then I could query just like locate,

$ beagle-static-query --add-static-backend ~/.systembeagle 'net*' --backend none

(--backend none is to tell beagle to not search any other backends). I could have added ~/.systembeagle to beagle using beagle-config so that I don't have to add this path every time, or I could have even created an alias for this.

Why do this when locate/updatedb already does it? Because I can :). OK, I actually use this to search monodocs. I am not a big fan of this mouse, point, click business and I stick to the terminal with mod and monop2 at my disposal. These tools are great once you know the fully qualified name of the method or the class. When you don't, use this jack-of-all-trades, beagle.

Step-1: Enable the system wide monodoc index. It's one of the crawl-rules shipped with beagle but disabled by default.

Step-2: Let cron build it or run the cron job yourself. Building the monodoc index takes time though, definitely longer than any special indexer for monodoc files. But that's only a one-time cost.

Step-3: Use beagle-static-query. You can also use phrases and wildcards '*' and search only in methods or classes or properties (just look in the returned fields and use beagle property query).

by noreply@blogger.com (dBera) at March 28, 2008 11:24 PM

March 24, 2008

Robert Love

The Lives of Others

From the State Department's fact sheet for American athletes traveling to Beijing for the 2008 Olympics: Your hotel room is bugged.

All visitors should be aware that they have no reasonable expectation of privacy in public or private locations. All hotel rooms and offices are considered to be subject to on-site or remote technical monitoring at all times. Hotel rooms, residences and offices may be accessed at any time without the occupant’s consent or knowledge.

To test, when I visited last February, every night I ruminated aloud on how my coworker was a Taiwanese nationalist. And a proselytizing Christian. Fortunately she left unharmed.

by Robert Love (noreply@blogger.com) at March 24, 2008 05:00 PM

March 18, 2008

Robert Love

On the Rationality of Active Investing

Learned hand Luis Villa links to my post on achieving alpha and asks,

The interesting question, in my mind, is why so many people are so irrational. This is a gold mine for the behavioral economists, and a nice counter-argument for when someone tells you that better information creates more efficient markets.

He concludes:

I long for a Love-ian explanation of why this happens in theoretically rational markets with nearly perfect price information.

Luis's is a good question. I think there are three unrelated but collaborative factors.

First, active investing is not irrational if most folks believe mutual funds are the superior investment vehicle—remember, rational isn't epistemic. ETFs are a relatively-recent invention—they have been available in Europe since only 1999—and have not yet reached ubiquity. Indeed, many folks do not even know what ETFs are, let alone the returns they offer or low costs they charge. Many 401(k) plans, moreover, continue to offer few if any passive funds. Mutual fund fees, such as front-end loads, create incentives for brokers to push expensive funds over cheaper index funds. Thus, given the relative dearth of information about and access to ETFs, the choice of active funds over passive funds is often rational.

Second, not all of the costs of active investing are "wasted" on unachieved excess returns. Some of the costs go toward tax minimization strategies, for example, that a passive fund does not provide. This portion of the cost should not be included in the calculation.

The best for last: I believe the most accurate explanation is that investors, individually and as a group, are generally making the right choices. Recall I closed my previous post with this nugget:

One reason the market and thus passive investors can earn the returns they do is because of active investors' strategies that close arbitrage opportunities, set prices, tighten spreads, and otherwise make the market more efficient.

This suggests that the overall outcome ($100 billion spent chasing excess returns) is net beneficial, but does not comment on why any individual investor rationally puts himself or herself in the active camp.

To answer that, let's study investing at the margins. There is both a cost and a return to active investing. Active investors hope that the return is in excess of the market, net costs. This excess is called alpha. Assume there are no active investors, only passive. Then the market would be a cacophony of noise, and any active investing strategy would reap substantial reward. The cost would also be high, both because there would be few suppliers of such service and because the infrastructure for doing so (equity research, realtime operating systems, the city of Greenwich) would not exist, but the alpha would still be significant. Given these outsized returns, capital will flow out of passive and into active investing.

Now let's look at the other extreme and assume all market participants are engaged in active strategies. The outsized returns will be bid down, but the costs would also be much lower as the supply of funds and their managers meets demand and economies of scale kick in, lowering marginal cost. In this all-active world, the typical return would approximate the market's total return. Given the lack of alpha, capital will flow out of active strategies and into index funds.

In either market, put yourself at the margin. Everyone is passive? Achieving alpha is easy, as you just have to beat the noise. Everyone is active? Then just mimic their strategies by tracking the entire market and achieve similar returns without the cost. With each person switching from passive to active, or from active to passive, the marginal utility, along with the efficiency of the market, increases or decreases. Eventually, equities are efficiently priced as folks long the hot stocks and short the stinkers, spreads tighten, and arbitrage opportunities close. At this point, because the market is efficient, it no longer pays to expend excess cost on active investing. Thus the guy at the margin moves into an index fund, simply tracking the whole market and earning the market's return. If enough people choose passive over active strategies, perhaps the next guy will notice that spreads are too wide or some stock is underpriced. If so, he will choose an active strategy and achieve alpha.

And so it goes, until we reach the $100 billion equilibrium we are at today, where the marginal cost of active investing meets its marginal utility. The balance might be imperfect (too many folks free-riding with their passive investments or tilting at windmills in pursuit of alpha), but I bet it's pretty damn close. Disagree? Then just move to the other side of the fence: you will gain excess net returns and make the market more efficient.
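The marginal argument above can be played out as a toy simulation. This is only a sketch under invented assumptions: alpha and cost are both taken to fall linearly as the active share of the market grows, and every number in it (a 10% maximum alpha, a 1% cost floor, the flow rate) is hypothetical rather than drawn from Professor French's study or any real data.

```python
def equilibrium_active_share(alpha_max=0.10, cost_floor=0.01, cost_scarcity=0.02,
                             flow_rate=5.0, steps=200):
    """Iterate capital flows until the marginal investor is indifferent.

    All parameters are hypothetical. With nobody active, alpha is alpha_max and
    cost is cost_floor + cost_scarcity; with everybody active, alpha is zero and
    cost falls to cost_floor.
    """
    active = 0.0  # start in the all-passive world
    for _ in range(steps):
        passive = 1.0 - active
        alpha = alpha_max * passive                    # excess return shrinks as more people chase it
        cost = cost_floor + cost_scarcity * passive    # costs drop with economies of scale
        # Capital flows toward active investing while alpha exceeds cost, and away otherwise.
        active += flow_rate * (alpha - cost)
        active = min(1.0, max(0.0, active))
    return active


share = equilibrium_active_share()
print(f"active share at equilibrium: {share:.3f}")
```

With these made-up parameters the active share settles where alpha equals cost, which is the "marginal cost meets marginal utility" point. Different curves would move the balance point but not the logic.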

by Robert Love (noreply@blogger.com) at March 18, 2008 05:13 PM

March 17, 2008

Robert Love

On an Employer who Loves Dogs

My coworker, napping:

Golden Retriever

A particularly beautiful breed.

by Robert Love (noreply@blogger.com) at March 17, 2008 06:12 PM

March 13, 2008

Robert Love

Android at LugRadio

But we in it shall be remember'd: LugRadio Live USA 2008 is the 12th and 13th of April in hilly San Francisco.

LugRadio

I will be there, speaking on Android, our forthcoming complete, free, and open mobile platform. It will be a hit, a very palpable hit.

by Robert Love (noreply@blogger.com) at March 13, 2008 02:25 PM

March 12, 2008

Robert Love

The best laid schemes o' Mice an' Men

I am in Chicago, a little late reading the Sunday Times, but this article on beating the market caught my attention:

Investors collectively spend around $100 billion a year trying to beat the stock market. That’s the finding of a rigorous effort to measure the total costs of Americans’ efforts to surpass the returns they would have received by simply holding a stock index fund. The huge price tag helps explain why beating a buy-and-hold strategy is so difficult.

In his new study, Professor French tried to make his estimate of investment costs as comprehensive as possible. He took into account the fees and expenses of domestic equity mutual funds (both open- and closed-end, including exchange-traded funds), the investment management costs paid by institutions (both public and private), the fees paid to hedge funds, and the transactions costs paid by all traders (including commissions and bid-asked spreads). If a fund or institution was only partly allocated to the domestic equity market, he counted only that portion in computing its investment costs.

Professor French then deducted what domestic equity investors collectively would have paid if they instead had simply bought and held an index fund benchmarked to the overall stock market, like the Vanguard Total Stock Market Index fund, whose retail version currently has an annual expense ratio of 0.19 percent.

The difference between those amounts, Professor French says, is what investors as a group pay to try to beat the market.

The study's conclusion:

What are the investment implications of his findings? One is that a typical investor can increase his annual return by just shifting to an index fund and eliminating the expenses involved in trying to beat the market. Professor French emphasizes that this typical investor is an average of everyone aiming to outperform the market—including the supposedly best and brightest who run hedge funds.

The bottom line is this: The best course for the average investor is to buy and hold an index fund for the long term. Even if you think you have compelling reasons to believe a particular trade could beat the market, the odds are still probably against you.

I go further: The odds are, most definitively, against you. You might beat the market this year, but you won't the next. Repeat after me: You cannot beat the market. Put your money in an ETF—or a small basket of them—and leave it alone.

There are theories that decree this—equity prices reflect all known information and are an unbiased and collective valuation and thus you can only outperform the market through luck—but you do not have to subscribe to them, as mere arithmetic can make the case:

Active investing yields average returns. Proof. Let M be the entire market. By definition, M's return is the market's total return minus net costs. Let P be a subset of M such that x is in P if x is pursuing a passive strategy. That is, P is tracking the total market. Then P is also earning the market's total return, minus costs (which are very low). Now, let A be a subset of M such that x is in A if x is pursuing an active (that is, managed) strategy. Since M=P+A and both M and P are earning average returns, A is also earning average returns, minus costs (which are large). QED.
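To put numbers on the proof, here is a small sketch. The 8% market return, the 30% passive share, and the 1.5% active cost are hypothetical figures chosen for illustration; only the 0.19% expense ratio comes from the article quoted above.

```python
def active_gross_return(market_return, passive_share):
    """Solve M = w*P + (1 - w)*A for A, where passive funds earn P = M before costs."""
    passive_return = market_return          # P tracks the total market by construction
    active_share = 1.0 - passive_share
    return (market_return - passive_share * passive_return) / active_share


market = 0.08                               # hypothetical total market return
gross_active = active_gross_return(market, passive_share=0.30)  # hypothetical 30% passive share

net_passive = market - 0.0019               # index fund expense ratio quoted in the article
net_active = gross_active - 0.015           # hypothetical 1.5% cost of active management

print(f"active, before costs: {gross_active:.4f}")
print(f"passive, after costs: {net_passive:.4f}")
print(f"active, after costs:  {net_active:.4f}")
```

Whatever passive share you plug in, the active group's return before costs comes out equal to the market's, so the only thing separating the two groups on average is what they pay.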

To be sure, you could now argue that A is composed of two subsets, those who consistently and substantially outperform the market and those who consistently and substantially underperform the market. I think it's clear that, in a given period, some funds are in one subset and some are in the other, but over the long run no one fund is consistently in either and you get reversion toward the mean.

As an aside, one reason M and thus P can earn the returns they do is because of A's active strategies that close arbitrage opportunities, set prices, close spreads, and otherwise make the market more efficient. Is that worth $100 billion? Definitely.

by Robert Love (noreply@blogger.com) at March 12, 2008 11:47 AM

March 08, 2008

Debajyoti Bera

Beyond Search: arrhh...dee...efff

If the news reports and blogs are to be believed, this is the age of Semantic Something. First people wanted to search the web, then file contents, and then emails and other user data. Everybody was talking about desktop search; along came Beagle, Spotlight, Google Desktop Search, Kat, MetaTracker, Pinot, Strigi etc. While desktop search at its core is nothing but a crawler which reads different file formats and stores them in a searchable database, searching is the most trivial and, IMO, boring application built on Beagle's infrastructure.

These days the focus seems to have shifted to Semantic Desktop and Semantic Web. Most blog comments and mailing list posts about Semantic-Fu have a hint of it being vapourware. It's not totally their fault either; the ideas have been around for a long time and people have been working on them for many, many years, but there is no glittering gold in sight. Only recently have some interesting Semantic Web ideas started taking shape. Semantic Desktop is a slightly different game but it should not be far behind. After taking about 40 developer years, Beagle is just about ready to take desktop search beyond simple file content search. Historians might want to take note of the dashboard project and how beagle came into being as a necessary requirement for that truly beyond-desktop-search application.

The core idea behind Semantic Desktop, as far as I understand it, revolves around the buzzword jack-of-most-trades RDF. And for the impatient kind, here is a rude shock: RDF is not useful for human beings. Even further, it is not even meant for you, me and us; storing every conceivable piece of data in the RDF format is not going to make our lives any easier right away.

RDF, or Resource Description Framework, is a generic way to describe anything, or to be accurate, any description of anything. It is a fairly elaborate yet structured format; very easy for programs to analyze but extremely redundant to human eyes. Notwithstanding what the AI experts are claiming about the future of AI, the human mind can work without immediate deductive reasoning and in fact does so a lot of the time. It recognizes familiar words without reading the letters one at a time, it deduces a color by merely glancing at it, it conjures up strange connections; it's a wonder that will be hard to completely characterize by any set of rules. At least at the current stage, algorithms have to be told the facts and the relations between them before they can do any kind of processing with their data. These are the things that we just know when we see something, which is why storing the description of something in an RDF format is not going to gain me anything immediately. On the other hand, this is also the reason why applications should be fed data in an RDF format, to allow them unhindered access to the semantics of the data.

If that felt hand-wavy, try to think about the difference between the semantics of data and its syntax. An array could be used to represent a linked list, a queue, a stack, a tree or a heap: the latter are the different semantics of the representation, while the array is one of the many syntactic representations of any of those concepts. A bunch of pairs could be stored in a database table; the table is a syntactic representation of data which has the semantics of a bunch of (name, phone-number) pairs. It is hard to work with the semantics of an idea; in a sense it is something up in the air. On the other hand, storing some data in a suitable working form could fail to capture some concept about the data. Also, once data is stored in a particular form it is easy to miss the bigger picture, thus limiting the scope of what we could do with that data.
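A tiny sketch may make the syntax/semantics split concrete. Nothing here comes from Beagle; the function names are made up for illustration. The same Python list (the syntax) is read once as a stack and once as a queue (two different semantics).

```python
def pop_as_stack(items):
    """Read the list as a stack: remove and return the most recently added item."""
    return items.pop()


def pop_as_queue(items):
    """Read the list as a queue: remove and return the oldest item."""
    return items.pop(0)


# Identical syntactic representation, two different meanings.
stack = ["a", "b", "c"]
queue = ["a", "b", "c"]

print(pop_as_stack(stack), stack)
print(pop_as_queue(queue), queue)
```

The list itself says nothing about which reading is intended; that knowledge lives in the code that touches it, which is exactly the kind of information an explicit description of the data would carry.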

Having said all that, for the time being think of the RDF format as a bunch of objects and facts where each object is related to some number of facts. The semantics of "related" could differ based on the context, and RDF is powerful enough to describe even those semantics and a whole bunch of other facts about the facts. With beagle pulling data from the nooks and corners of a user's desktop and providing a service which allows applications to search this data, it is a shame if we cannot exploit the relationships in this data for a better mankind... err... dolphins... err... us.

Consider all the emails I have. Now I know that there are some emails that are part of discussion threads. Beagle does not. With the beauty of N3 (a close cousin of Semantic-Fu and RDF), I can write this extra information as a set of rules (the single '.' represents the end of one rule). I am using the emails' msgid to track emails in a thread. I could not help but notice the similarity of these rules to prolog or other logic programming languages.

/* an email with subject 'foobar' is in its own thread */
{ ?email :title 'foobar' . ?email :msgid ?msg . } => { ?msg :inthread ?msg } .
/* if any email refers to some email in thread, then this email is also in other email's thread */
{ ?ref :inthread ?parent . ?email1 :reference ?ref . ?email1 :msgid ?msg .} => {?msg :inthread ?parent} .

Using the RDFAdapter of the beagle-rdf branch, I can use this to get all the emails in the thread with foobar in its subject. Note that I am able to write my set of rules only when I see this data as actual emails and not as a bunch of lucene documents with fields. The latter carry no meaning. Further note that I can also use the BeagleClient API to perform field-specific queries to obtain the same results. The difference is that using BeagleClient will require me to think about the relationships from scratch and then figure out the right sequence of queries. Instead I could store all the relationships among the emails in the email-index in an RDF format (and also related information not stored in the index, e.g. saying that a list of email addresses are all mine and should be treated as one person). Then, whenever I want to extract some information, I can write the question (again in an RDF format) and let the RDF-Magic figure out how to execute this question against that data given this set of inference rules. Isn't it cool?
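To show what the two rules above ask an inference engine to do, here is a rough Python sketch of the same forward chaining. It does not use Beagle, the RDFAdapter or SemWeb; the dictionary keys and the sample emails are invented for illustration, and a real RDF store would work on triples rather than dicts.

```python
def infer_threads(emails, subject):
    """Return the set of (msgid, thread_msgid) facts implied by the two rules.

    emails: list of dicts with hypothetical keys 'msgid', 'title', 'references'.
    """
    # Rule 1: an email with the given subject is in its own thread.
    facts = {(e["msgid"], e["msgid"]) for e in emails if e["title"] == subject}
    # Rule 2: an email that references a thread member is in that member's thread.
    # Keep applying the rule until it produces nothing new.
    while True:
        new_facts = {
            (e["msgid"], parent)
            for e in emails
            for (member, parent) in facts
            if member in e["references"]
        } - facts
        if not new_facts:
            return facts
        facts |= new_facts


emails = [
    {"msgid": "m1", "title": "foobar", "references": []},
    {"msgid": "m2", "title": "Re: foobar", "references": ["m1"]},
    {"msgid": "m3", "title": "Re: Re: foobar", "references": ["m2"]},
    {"msgid": "m4", "title": "lunch?", "references": []},
]

for msgid, thread in sorted(infer_threads(emails, "foobar")):
    print(msgid, "is in thread", thread)
```

The point of the rule-based version is that I never have to write this loop myself: the rules state the relationships, and the engine works out the order in which to apply them.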

If I missed it earlier, these kinds of data-mining operations are not for my everyday use (here "my" refers to usual computer users) and are not for everybody. Still, they can sometimes come in handy. Imagine the possibilities if you can write the relationships between a file in an mp3 playlist (playlist filter), its download link and how you arrived at that page (webhistory indexing), the email you sent with that file as an attachment in a zip file (email attachment and archive filter), its ratings and usage statistics in Amarok (amarok querydriver) and of course the actual file on the harddisk (user home directory indexing).

Warning: The RDF Adapter in beagle uses the sophisticated SemWeb library which allows anyone to perform graph operations (selecting subgraphs, walking on graphs, pruning nodes and edges etc.) on the RDF graph of the data. Unlike most RDF stores for desktop data, beagle is not optimized for RDF operations and could take quite a bit of time and heat up the CPU. It took me about 4 seconds to find all threads with the word beagle among 500 emails (my actual email index has about 20K emails! I refuse to imagine what will happen if I run it on the full index). If you are interested, check out the rdf branch and take a look at the test SemWebClient.cs.

by noreply@blogger.com (dBera) at March 08, 2008 09:36 PM

March 05, 2008

Kevin Kubasik

Can Someone Get Us A Real Django IDE?

So the more I work with Django the more I long for a solid development environment to work in. I u