Two Bits
Torvalds also mentions that he has ported "bash" and "gcc," software created and distributed by the Free Software Foundation and tools essential for interacting with the computer and compiling new versions of the kernel. Torvalds's decision to use these utilities, rather than write his own, reflects both the boundaries of his project (an operating-system kernel) and his satisfaction with the availability and reusability of software licensed under the GPL.
So the system is based on Minix, just as Minix had been based on UNIX: piggybacked or bootstrapped, rather than rewritten in an entirely different fashion, that is, rather than becoming a different kind of operating system. And yet there are clearly concerns about the need to create something that is not Minix, rather than simply extending or "debugging" Minix. This concern is key to understanding what happened to Linux in 1991.
Tanenbaum's Minix, since its inception in 1984, was always intended to allow students to see and change the source code of Minix in order to learn how an operating system worked, but it was not Free Software. It was copyrighted and owned by Prentice Hall, which distributed the textbooks. Tanenbaum made the case (similar to Gosling's case for Unipress) that Prentice Hall was distributing the system far more widely than if it were available only on the Internet: "A point which I don't think everyone appreciates is that making something available by FTP is not necessarily the way to provide the widest distribution. The Internet is still a highly elite group. Most computer users are NOT on it. . . . MINIX is also widely used in Eastern Europe, Japan, Israel, South America, etc. Most of these people would never have gotten it if there hadn't been a company selling it."9 By all accounts, Prentice Hall was not restrictive in its sublicensing of the operating system, if people wanted to create an "enhanced" version of Minix. Similarly, Tanenbaum's frequent presence on comp.os.minix testified to his commitment to sharing his knowledge about the system with anyone who wanted it, not just paying customers. Nonetheless, Torvalds's pointed use of the word free and his decision not to reuse any of the code are a clear indication of his desire to build a system completely unencumbered by restrictions, based perhaps on a kind of intuitive folkloric sense of the dangers associated with cases like that of EMACS.10
The most significant aspect of Torvalds's initial message, however, is his request: "I'd like to know what features most people would want. Any suggestions are welcome, but I won't promise I'll implement them." Torvalds's announcement and the subsequent interest it generated clearly reveal the issues of coordination and organization that would come to be a feature of Linux. The reason Torvalds had so many eager contributors to Linux, from the very start, was that he enthusiastically took them off Tanenbaum's hands.
Design and Adaptability.
Tanenbaum's role in the story of Linux is usually that of the straw man: a crotchety old computer-science professor who opposes the revolutionary young Torvalds. Tanenbaum did have a certain revolutionary reputation himself, since Minix was used in classrooms around the world and could be installed on IBM PCs (something no other commercial UNIX vendors had achieved), but he was also a natural target for people like Torvalds: the tenured professor espousing the textbook version of an operating system. So, despite the fact that a very large number of people were using or knew of Minix as a UNIX operating system (estimates of comp.os.minix subscribers were at 40,000), Tanenbaum was emphatically not interested in collaboration or collaborative debugging, especially if debugging also meant creating extensions and adding features that would make the system bigger and harder to use as a stripped-down tool for teaching. For Tanenbaum, this point was central: "I've been repeatedly offered virtual memory, paging, symbolic links, window systems, and all manner of features. I have usually declined because I am still trying to keep the system simple enough for students to understand. You can put all this stuff in your version, but I won't put it in mine. I think it is this point which irks the people who say 'MINIX is not free,' not the $60."11 So while Tanenbaum was in sympathy with the Free Software Foundation's goals (insofar as he clearly wanted people to be able to use, update, enhance, and learn from software), he was not in sympathy with the idea of having 40,000 strangers make his software "better." Or, to put it differently, the goals of Minix remained those of a researcher and a textbook author: to be useful in classrooms and cheap enough to be widely available and usable on the largest number of cheap computers.
By contrast, Torvalds's "fun" project had no goals. Being a cocky twenty-one-year-old student with little better to do (no textbooks to write, no students, grants, research projects, or committee meetings), Torvalds was keen to accept all the ready-made help he could find to make his project better. And with 40,000 Minix users, he had a more or less instant set of contributors. Stallman's audience for EMACS in the early 1980s, by contrast, was limited to about a hundred distinct computers, which may have translated into thousands, but certainly not tens of thousands of users. Tanenbaum's work in creating a generation of students who not only understood the internals of an operating system but, more specifically, understood the internals of the UNIX operating system created a huge pool of competent and eager UNIX hackers. It was the work of porting UNIX not only to various machines but to a generation of minds as well that set the stage for this event, and this is an essential, though often overlooked, component of the success of Linux.
Many accounts of the Linux story focus on the fight between Torvalds and Tanenbaum, a fight carried out on comp.os.minix with the subject line "Linux is obsolete."12 Tanenbaum argued that Torvalds was reinventing the wheel, writing an operating system that, as far as the state of the art was concerned, was now obsolete. Torvalds, by contrast, asserted that it was better to make something quick and dirty that worked, invite contributions, and worry about making it state of the art later. Far from illustrating some kind of outmoded conservatism on Tanenbaum's part, the debate highlights the distinction between forms of coordination and the meanings of collaboration. For Tanenbaum, the goals of Minix were either pedagogical or academic: to teach operating-system essentials or to explore new possibilities in operating-system design. By this model, Linux could do neither; it couldn't be used in the classroom because it would quickly become too complex and feature-laden to teach, and it wasn't pushing the boundaries of research because it was an out-of-date operating system. Torvalds, by contrast, had no goals. What drove his progress was a commitment to fun and to a largely inarticulate notion of what interested him and others, defined at the outset almost entirely against Minix and other free operating systems, like FreeBSD. In this sense, it could only emerge out of the context (which set the constraints on its design) of UNIX, open systems, Minix, GNU, and BSD.
Both Tanenbaum and Torvalds operated under a model of coordination in which one person was ultimately responsible for the entire project: Tanenbaum oversaw Minix and ensured that it remained true to its goals of serving a pedagogical audience; Torvalds would oversee Linux, but he would incorporate as many different features as users wanted or could contribute. Very quickly, with a pool of 40,000 potential contributors, Torvalds would be in the same position Tanenbaum was in, that is, forced to make decisions about the goals of Linux and about which enhancements would go into it and which would not. What makes the story of Linux so interesting to observers is that it appears that Torvalds made no decision: he accepted almost everything.
Tanenbaum's goals and plans for Minix were clear and autocratically formed. Control, hierarchy, and restriction are after all appropriate in the classroom. But Torvalds wanted to do more. He wanted to go on learning and to try out alternatives, and with Minix as the only widely available way to do so, his decision to part ways starts to make sense; clearly he was not alone in his desire to explore and extend what he had learned. Nonetheless, Torvalds faced the problem of coordinating a new project and making similar decisions about its direction. On this point, Linux has been the subject of much reflection by both insiders and outsiders. Despite images of Linux as either an anarchic bazaar or an autocratic dictatorship, the reality is more subtle: it includes a hierarchy of contributors, maintainers, and "trusted lieutenants" and a sophisticated, informal, and intuitive sense of "good taste" gained through reading and incorporating the work of co-developers.
While it was possible for Torvalds to remain in charge as an individual for the first few years of Linux (1991-95, roughly), he eventually began to delegate some of that control to people who would make decisions about different subcomponents of the kernel. It was thus possible to incorporate more of the "patches" (pieces of code) contributed by volunteers, by distributing some of the work of evaluating them to people other than Torvalds. This informal hierarchy slowly developed into a formal one, as Steven Weber points out: "The final de facto 'grant' of authority came when Torvalds began publicly to reroute relevant submissions to the lieutenants. In 1996 the decision structure became more formal with an explicit differentiation between 'credited developers' and 'maintainers.' . . . If this sounds very much like a hierarchical decision structure, that is because it is one, albeit one in which participation is strictly voluntary."13
Almost all of the decisions made by Torvalds and lieutenants were of a single kind: whether or not to incorporate a piece of code submitted by a volunteer. Each such decision was technically complex: insert the code, recompile the kernel, test to see if it works or if it produces any bugs, decide whether it is worth keeping, issue a new version with a log of the changes that were made. Although the various official leaders were given the authority to make such changes, coordination was still technically informal. Since they were all working on the same complex technical object, one person (Torvalds) ultimately needed to verify a final version, containing all the subparts, in order to make sure that it worked without breaking.
Such decisions had very little to do with any kind of design goals or plans, only with whether the submitted patch "worked," a term that reflects at once technical, aesthetic, legal, and design criteria that are not explicitly recorded anywhere in the project; hence the privileging of adaptability over planning. At no point were the patches assigned or solicited, although Torvalds is justly famous for encouraging people to work on particular problems, provided they wanted to. As a result, the system morphed in subtle, unexpected ways, diverging from its original, supposedly backwards "monolithic" design and into a novel configuration that reflected the interests of the volunteers and the implicit criteria of the leaders.
By 1995-96, Torvalds and lieutenants faced considerable challenges with regard to hierarchy and decision-making, as the project had grown in size and complexity. The first widely remembered response to the ongoing crisis of benevolent dictatorship in Linux was the creation of "loadable kernel modules," conceived as a way to release some of the constant pressure to decide which patches would be incorporated into the kernel. The decision to modularize Linux was simultaneously technical and social: the software-code base would be rewritten to allow for external loadable modules to be inserted "on the fly," rather than all being compiled into one large binary chunk; at the same time, it meant that the responsibility to ensure that the modules worked devolved from Torvalds to the creator of the module. The decision repudiated Torvalds's early opposition to Tanenbaum in the "monolithic vs. microkernel" debate by inviting contributors to separate core from peripheral functions of an operating system (though the Linux kernel remains monolithic compared to classic microkernels). It also allowed for a significant proliferation of new ideas and related projects. It both contracted and distributed the hierarchy; now Linus was in charge of a tighter project, but more people could work with him according to structured technical and social rules of responsibility.
Creating loadable modules changed the look of Linux, but not because of any planning or design decisions set out in advance. The choice is an example of the privileged adaptability of Linux, resolving the tension between the curiosity and virtuosity of individual contributors to the project and the need for hierarchical control in order to manage complexity. The commitment to adaptability dissolves the distinction between the technical means of coordination and the social means of management. It is about producing a meaningful whole by which both people and code can be coordinated, an achievement vigorously defended by kernel hackers.
The adaptable organization and structure of Linux is often described in evolutionary terms, as something without teleological purpose, but responding to an environment. Indeed, Torvalds himself has a weakness for this kind of explanation.
Let's just be honest, and admit that it [Linux] wasn't designed.
Sure, there's design too - the design of UNIX made a scaffolding for the system, and more importantly it made it easier for people to communicate because people had a mental model for what the system was like, which means that it's much easier to discuss changes.
But that's like saying that you know that you're going to build a car with four wheels and headlights - it's true, but the real bitch is in the details.
And I know better than most that what I envisioned 10 years ago has nothing in common with what Linux is today. There was certainly no premeditated design there.14
Adaptability does not answer the questions of intelligent design. Why, for example, does a car have four wheels and two headlights? Often these discussions are polarized: either technical objects are designed, or they are the result of random mutations. What this opposition overlooks is the fact that design and the coordination of collaboration go hand in hand; one reveals the limits and possibilities of the other. Linux represents a particular example of such a problematic (one that has become the paradigmatic case of Free Software), but there have been many others, including UNIX, for which the engineers created a system that reflected the distributed collaboration of users around the world even as the lawyers tried to make it conform to legal rules about licensing and practical concerns about bookkeeping and support.
Because it privileges adaptability over planning, Linux is a recursive public: operating systems and social systems. It privileges openness to new directions, at every level. It privileges the right to propose changes by actually creating them and trying to convince others to use and incorporate them. It privileges the right to fork the software into new and different kinds of systems. Given what it privileges, Linux ends up evolving differently than do systems whose life and design are constrained by corporate organization, or by strict engineering design principles, or by legal or marketing definitions of products; in short, by clear goals. What makes this distinction between the goal-oriented design principle and the principle of adaptability important is its relationship to politics. Goals and planning are the subject of negotiation and consensus, or of autocratic decision-making; adaptability is the province of critique. It should be remembered that Linux is by no means an attempt to create something radically new; it is a rewrite of a UNIX operating system, as Torvalds points out, but one that through adaptation can end up becoming something new.
Patch and Vote.
The Apache Web server and the Apache Group (now called the Apache Software Foundation) provide a second illuminating example of the how and why of coordination in Free Software of the 1990s. As with the case of Linux, the development of the Apache project illustrates how adaptability is privileged over planning and, in particular, how this privileging is intended to resolve the tensions between individual curiosity and virtuosity and collective control and decision-making. It is also the story of the progressive evolution of coordination, the simultaneously technical and social mechanisms of coordinating people and code, patches and votes.
The Apache project emerged out of a group of users of the original httpd (Hypertext Transfer Protocol Daemon) Web server created by Rob McCool at NCSA, based on the work of Tim Berners-Lee's World Wide Web project at CERN. Berners-Lee had written a specification for the World Wide Web that included the mark-up language HTML, the transmission protocol http, and a set of libraries that implemented the code known as libwww, which he had dedicated to the public domain.15 The NCSA, at the University of Illinois, Urbana-Champaign, picked up both www projects, subsequently creating both the first widely used browser, Mosaic, directed by Marc Andreessen, and httpd. Httpd was public domain up until version 1.3. Development slowed when McCool was lured to Netscape, along with the team that created Mosaic. By early 1994, when the World Wide Web had started to spread, many individuals and groups ran Web servers that used httpd; some of them had created extensions and fixed bugs. They ranged from university researchers to corporations like Wired Ventures, which launched the online version of its magazine (HotWired.com) in 1994. Most users communicated primarily through Usenet, on the comp.infosystems.www.* newsgroups, sharing experiences, instructions, and updates in the same manner as other software projects stretching back to the beginning of the Usenet and Arpanet newsgroups.
When NCSA failed to respond to most of the fixes and extensions being proposed, a group of several of the most active users of httpd began to communicate via a mailing list called new-httpd in 1995. The list was maintained by Brian Behlendorf, the webmaster for HotWired, on a server he maintained called hyperreal; its participants were those who had debugged httpd, created extensions, or added functionality. The list was the primary means of association and communication for a diverse group of people from various locations around the world. During the next year, participants hashed out issues related to coordination, to the identity of and the processes involved in patching the "new" httpd (version 1.3).16
Patching a piece of software is a peculiar activity, akin to debugging, but more like a form of ex post facto design. Patching covers the spectrum of changes that can be made: from fixing security holes and bugs that prevent the software from compiling to feature and performance enhancements. A great number of the patches that initially drew this group together grew out of needs that each individual member had in making a Web server function. These patches were not due to any design or planning decisions by NCSA, McCool, or the assembled group, but most were useful enough that everyone gained from using them, because they fixed problems that everyone would or could encounter. As a result, the need for a coordinated new-httpd release was key to the group's work. This new version of NCSA httpd had no name initially, but apache was a persistent candidate; the somewhat apocryphal origin of the name is that it was "a patchy webserver."17 At the outset, in February and March 1995, the pace of work of the various members of new-httpd differed a great deal, but was in general extremely rapid.
Even before there was an official release of a new httpd, process issues started to confront the group, as Roy Fielding later explained: "Apache began with a conscious attempt to solve the process issues first, before development even started, because it was clear from the very beginning that a geographically distributed set of volunteers, without any traditional organizational ties, would require a unique development process in order to make decisions."18 The need for process arose more or less organically, as the group developed mechanisms for managing the various patches: assigning them IDs, testing them, and incorporating them "by hand" into the main source-code base. As this happened, members of the list would occasionally find themselves lost, confused by the process or the efficiency of other members, as in this message from Andrew Wilson concerning Cliff Skolnick's management of the list of bugs:
Cliff, can you concentrate on getting an uptodate copy of the bug/improvement list please. I've already lost track of just what the heck is meant to be going on. Also what's the status of this pre-pre-pre release Apache stuff. It's either a pre or it isn't surely? AND is the pre-pre-etc thing the same as the thing Cliff is meant to be working on?
Just what the fsck is going on anyway? Ay, ay ay! Andrew Wilson.19
To which Rob Harthill replied, "It is getting messy. I still think we should all implement one patch at a time together. At the rate (and hours) some are working we can probably manage a couple of patches a day. . . . If this is acceptable to the rest of the group, I think we should order the patches, and start a systematic processes of discussion, implementations and testing."20 Some members found the pace of work exciting, while others appealed for slowing or stopping in order to take stock. Cliff Skolnick created a system for managing the patches and proposed that list-members vote in order to determine which patches be included.21 Rob Harthill voted first.
Here are my votes for the current patch list shown at http://www.hyperreal.com/httpd/patchgen/list.cgi
I'll use a vote of
-1 have a problem with it
0 haven't tested it yet (failed to understand it or whatever)
+1 tried it, liked it, have no problem with it.
[Here Harthill provides a list of votes on each patch.]
If this voting scheme makes sense, lets use it to filter out the stuff we're happy with. A "-1" vote should veto any patch. There seems to be about 6 or 7 of us actively commenting on patches, so I'd suggest that once a patch gets a vote of +4 (with no vetos), we can add it to an alpha.
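The rule Harthill proposes can be sketched in a few lines of Python. This is only an illustration of the scheme as stated in his message (votes of -1, 0, or +1; any -1 is a veto; a total of +4 with no vetoes admits a patch to an alpha). The class and method names are hypothetical, and the choice to let a later vote from the same person replace an earlier one is an assumption, not something the message specifies.

```python
from dataclasses import dataclass, field
from typing import Dict

VETO, UNTESTED, APPROVE = -1, 0, 1
ACCEPT_THRESHOLD = 4  # "+4 (with no vetos)" in Harthill's message


@dataclass
class Patch:
    """A proposed patch and the votes cast on it (hypothetical structure)."""
    patch_id: str
    votes: Dict[str, int] = field(default_factory=dict)

    def cast(self, voter: str, vote: int) -> None:
        if vote not in (VETO, UNTESTED, APPROVE):
            raise ValueError("vote must be -1, 0, or +1")
        # Assumption: a voter's later vote replaces the earlier one.
        self.votes[voter] = vote

    def vetoed(self) -> bool:
        # A single -1 blocks the patch regardless of the total.
        return VETO in self.votes.values()

    def score(self) -> int:
        return sum(self.votes.values())

    def accepted(self) -> bool:
        return not self.vetoed() and self.score() >= ACCEPT_THRESHOLD
```

Under this reading, four +1 votes admit a patch, and a single later -1 blocks it again no matter how many people approved. That asymmetry is what makes the vote a relatively harsh filter.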
Harthill's votes immediately instigated discussion about various patches, further voting, and discussion about the process (i.e., how many votes or vetoes were needed), all mixed together in a flurry of e-mail messages. The voting process was far from perfect, but it did allow some consensus on what "apache" would be, that is, which patches would be incorporated into an "official" (though not very public) release: Apache 0.2 on 18 March.23 Without a voting system, the group of contributors could have gone on applying patches individually, each in his own context, fixing the problems that ailed each user, but ignoring those that were irrelevant or unnecessary in that context. With a voting process, however, a convergence on a tested and approved new-httpd could emerge. As the process was refined, members sought a volunteer to take votes, to open and close the voting once a week, and to build a new version of Apache when the voting was done. (Andrew Wilson was the first volunteer, to which Cliff Skolnick replied, "I guess the first vote is voting Andrew as the vote taker :-).")24
The patch-and-vote process that emerged in the early stages of Apache was not entirely novel; many contributors noted that the FreeBSD project used a similar process, some suggested the need for a "patch coordinator," and others worried that "using patches gets very ugly, very quickly."25
The significance of the patch-and-vote system was that it clearly represented the tension between the virtuosity of individual developers and a group process aimed at creating and maintaining a common piece of software. It was a way of balancing the ability of each separate individual's expertise against a common desire to ship and promote a stable, bug-free, public-domain Web server. As Roy Fielding and others would describe it in hindsight, this tension was part of Apache's advantage.
Although the Apache Group makes decisions as a whole, all of the actual work of the project is done by individuals. The group does not write code, design solutions, document products, or provide support to our customers; individual people do that. The group provides an environment for collaboration and an excellent trial-by-fire for ideas and code, but the creative energy needed to solve a particular problem, redesign a piece of the system, or fix a given bug is almost always contributed by individual volunteers working on their own, for their own purposes, and not at the behest of the group. Competitors mistakenly assume Apache will be unable to take on new or unusual tasks because of the perception that we act as a group rather than follow a single leader. What they fail to see is that, by remaining open to new contributors, the group has an unlimited supply of innovative ideas, and it is the individuals who chose to pursue their own ideas who are the real driving force for innovation.26
Although openness is widely touted as the key to the innovations of Apache, the claim is somewhat disingenuous: patches are just that, patches. Any large-scale changes to the code could not be accomplished by applying patches, especially if each patch must be subjected to a relatively harsh vote to be included. The only way to make sweeping changes (especially changes that require iteration and testing to get right) is to engage in separate "branches" of a project or to differentiate between internal and external releases; in short, to fork the project temporarily in hopes that it would soon rejoin its stable parent. Apache encountered this problem very early on with the "Shambhala" rewrite of httpd by Robert Thau.
Shambhala was never quite official: Thau called it his "noodling" server, or a "garage" project. It started as his attempt to rewrite httpd as a server which could handle and process multiple requests at the same time. As an experiment, it was entirely his own project, which he occasionally referred to on the new-httpd list: "Still hacking Shambhala, and laying low until it works well enough to talk about."27 By mid-June of 1995, he had a working version that he announced, quite modestly, to the list as "a garage project to explore some possible new directions I thought *might* be useful for the group to pursue."28 Another list member, Randy Terbush, tried it out and gave it rave reviews, and by the end of June there were two users extolling its virtues. But since it hadn't ever really been officially identified as a fork, or an alternate development pathway, this led Rob Harthill to ask: "So what's the situation regarding Shambhala and Apache, are those of you who have switched to it giving up on Apache and this project? If so, do you need a separate list to discuss Shambhala?"29 Harthill had assumed that the NCSA code-base was "tried and tested" and that Shambhala represented a split, a fork: "The question is, should we all go in one direction, continue as things stand or Shambahla [sic] goes off on its own?"30 His query drew out the miscommunication in detail: that Thau had planned it as a "drop-in" replacement for the NCSA httpd, and that his intentions were to make it the core of the Apache server, if he could get it to work. Harthill, who had spent no small amount of time working hard at patching the existing server code, was not pleased, and made the core issues explicit.
Maybe it was rst's [Robert Thau's] choice of phrases, such as "garage project" and it having a different name, maybe I didn't read his mailings thoroughly enough, maybe they weren't explicit enough, whatever. . . . It's a shame that nobody using Shambhala (who must have realized what was going on) didn't raise these issues weeks ago. I can only presume that rst was too modest to push Shambhala, or at least discussion of it, onto us more vigourously. I remember saying words to the effect of "this is what I plan to do, stop me if you think this isn't a good idea." Why the hell didn't anyone say something? . . . [D]id others get the same impression about rst's work as I did? Come on people, if you want to be part of this group, collaborate!31
Harthill's injunction to collaborate seems surprising in the context of a mailing list and project created to facilitate collaboration, but the injunction is specific: collaborate by making plans and sharing goals. Implicit in his words is the tension between a project with clear plans and goals, an overarching design to which everyone contributes, as opposed to a group platform without clear goals that provides individuals with a setting to try out alternatives. Implicit in his words is the spectrum between debugging an existing piece of software with a stable identity and rewriting the fundamental aspects of it to make it something new.
The meaning of collaboration bifurcates here: on the one hand, the privileging of the autonomous work of individuals which is submitted to a group peer review and then incorporated; on the other, the privileging of a set of shared goals to which the actions and labor of individuals is subordinated.32 Indeed, the very design of Shambhala reflects the former approach of privileging individual work: like UNIX and EMACS before it, Shambhala was designed as a modular system, one that could "make some of that process [the patch-and-vote process] obsolete, by allowing stuff which is not universally applicable (e.g., database back-ends), controversial, or just half-baked, to be shipped anyway as optional modules."33 Such a design separates the core platform from the individual experiments that are conducted on it, rather than creating a design that is modular in the hierarchical sense of each contributor working on an assigned section of a project. Undoubtedly, the core platform requires coordination, but extensions and modifications can happen without needing to transform the whole project.34 Shambhala represents a certain triumph of the "shut up and show me the code" aesthetic: Thau's "modesty" is instead a recognition that he should be quiet until it "works well enough to talk about," whereas Harthill's response is frustration that no one has talked about what Thau was planning to do before it was even attempted. The consequence was that Harthill's work seemed to be in vain, replaced by the work of a more virtuosic hacker's demonstration of a superior direction.
In the case of Apache one can see how coordination in Free Software is not just an afterthought or a necessary feature of distributed work, but is in fact at the core of software production itself, governing the norms and forms of life that determine what will count as good software, how it will progress with respect to a context and background, and how people will be expected to interact around the topic of design decisions. The privileging of adaptability brings with it a choice in the mode of collaboration: it resolves the tension between the agonistic competitive creation of software, such as Robert Thau's creation of Shambhala, and the need for collective coordination of complexity, such as Harthill's plea for collaboration to reduce duplicated or unnecessary work.
Check Out and Commit.
The technical and social forms that Linux and Apache take are enabled by the tools they build and use, from bug-tracking tools and mailing lists to the Web servers and kernels themselves. One such tool plays a very special role in the emergence of these organizations: Source Code Management systems (SCMs). SCMs are tools for coordinating people and code; they allow multiple people in dispersed locales to work simultaneously on the same object, the same source code, without the need for a central coordinating overseer and without the risk of stepping on each other's toes. The history of SCMs-especially in the case of Linux-also illustrates the recursive-depth problem: namely, is Free Software still free if it is created with non-free tools?
SCM tools, like the Concurrent Versioning System (cvs) and Subversion, have become extremely common tools for Free Software programmers; indeed, it is rare to find a project, even a project conducted by only one individual, which does not make use of these tools. Their basic function is to allow two or more programmers to work on the same files at the same time and to provide feedback on where their edits conflict. When the number of programmers grows large, an SCM can become a tool for managing complexity. It keeps track of who has "checked out" files; it enables users to lock files if they want to ensure that no one else makes changes at the same time; it can keep track of and display the conflicting changes made by two users to the same file; it can be used to create "internal" forks or "branches" that may be incompatible with each other, but still allows programmers to try out new things and, if all goes well, merge the branches into the trunk later on. In sophisticated forms it can be used to "animate" successive changes to a piece of code, in order to visualize its evolution.
Beyond mere coordination functions, SCMs are also used as a form of distribution; generally SCMs allow anyone to check out the code, but restrict those who can check in or "commit" the code. The result is that users can get instant access to the most up-to-date version of a piece of software, and programmers can differentiate between stable releases, which have few bugs, and "unstable" or experimental versions that are under construction and will need the help of users willing to test and debug the latest versions. SCM tools automate certain aspects of coordination, not only reducing the labor involved but opening up new possibilities for coordination.
The genealogy of SCMs can be seen in the example of Ken Thompson's creation of a diff tape, which he used to distribute changes that had been contributed to UNIX. Where Thompson saw UNIX as a spectrum of changes and the legal department at Bell Labs saw a series of versions, SCM tools combine these two approaches by minutely managing the revisions, assigning each change (each diff) a new version number, and storing the history of all of those changes so that software changes might be precisely undone in order to discover which changes cause problems. Written by Douglas McIlroy, "diff" is itself a piece of software, one of the famed small UNIX tools that do one thing well. The program diff compares two files, line by line, and prints out the differences between them in a structured format (showing a series of lines with codes that indicate changes, additions, or removals). Given two versions of a text, one could run diff to find the differences and make the appropriate changes to synchronize them, a task that is otherwise tedious and, given the exactitude of source code, prone to human error. A useful side-effect of diff (when combined with an editor like ed or EMACS) is that when someone makes a set of changes to a file and runs diff on both the original and the changed file, the output (i.e., the changes only) can be used to reconstruct the original file from the changed file. Diff thus allows for a clever, space-saving way to save all the changes ever made to a file: rather than retaining full copies of every new version, one saves only the changes. Ergo, version control. diff-and programs like it-became the basis for managing the complexity of large numbers of programmers working on the same text at the same time.
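The idea of keeping a record of differences and recovering either version from it can be sketched in a few lines of Python. This is only an illustration using the standard difflib module, not McIlroy's diff or the format of Thompson's tapes; the helper names (make_delta, changes_only, rebuild) and the sample file contents are invented for the example. Note one simplification: the ndiff format used here also keeps the unchanged lines, so that either version can be rebuilt from the delta alone, whereas the classic diff output described above records only the changes and is applied to an existing copy of the file.

```python
import difflib


def make_delta(old_lines, new_lines):
    """Return a line-by-line delta between two versions (ndiff format)."""
    return list(difflib.ndiff(old_lines, new_lines))


def changes_only(delta):
    """Keep only the delta lines that mark additions ('+ ') or removals ('- ')."""
    return [line for line in delta if line.startswith(("+ ", "- "))]


def rebuild(delta, which):
    """Rebuild version 1 (the original) or version 2 (the changed file)."""
    return list(difflib.restore(delta, which))


# Hypothetical file contents, one list entry per line.
original = ["int main() {", "  return 0;", "}"]
changed = ["int main() {", '  puts("hi");', "  return 0;", "}"]

delta = make_delta(original, changed)
recovered_original = rebuild(delta, 1)
recovered_changed = rebuild(delta, 2)
```

Storing a chain of such deltas, rather than a full copy of every version, is the space-saving step that the paragraph above identifies as the root of version control.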
One of the first attempts to formalize version control was Walter Tichy's Revision Control System (RCS), from 1985.35 RCS kept track of the changes to different files using diff and allowed programmers to see all of the changes that had been made to that file. RCS, however, could not really tell the difference between the work of one programmer and another. All changes were equal, in that sense, and any questions that might arise about why a change was made could remain unanswered.
In order to add sophistication to RCS, Dick Grune, at the Vrije Universiteit, Amsterdam, began writing scripts that used RCS as a multi-user, Internet-accessible version-control system, a system that eventually became the Concurrent Versioning System. cvs allowed multiple users to check out a copy, make changes, and then commit those changes, and it would check for and either prevent or flag conflicting changes. Ultimately, cvs became most useful when programmers could use it remotely to check out source code from anywhere on the Internet. It allowed people to work at different speeds, different times, and in different places, without needing a central person in charge of checking and comparing the changes. cvs created a form of decentralized version control for very-large-scale collaboration; developers could work offline on software, and always on the most updated version, yet still be working on the same object.
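What it means to "flag conflicting changes" can be made concrete with a small sketch. It assumes two users who checked out the same base version and treats their edits as conflicting when both replaced or deleted the same line of that base. This is a deliberate simplification for illustration, not the algorithm cvs actually uses; the function names and sample data are hypothetical.

```python
import difflib


def changed_base_lines(base, edited):
    """Return the indexes of base lines that an edit replaced or deleted."""
    touched = set()
    matcher = difflib.SequenceMatcher(a=base, b=edited, autojunk=False)
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        # Pure insertions touch no existing base line and are ignored here.
        if tag in ("replace", "delete"):
            touched.update(range(i1, i2))
    return touched


def conflicting_lines(base, mine, theirs):
    """Return sorted base line indexes that both edits modified."""
    return sorted(changed_base_lines(base, mine) & changed_base_lines(base, theirs))


# Hypothetical checkout: three users start from the same four-line file.
base = ["a", "b", "c", "d"]
alice = ["a", "B", "c", "d"]   # changes line index 1
bob = ["a", "b", "c", "D"]     # changes line index 3
carol = ["a", "x", "c", "d"]   # also changes line index 1

no_conflict = conflicting_lines(base, alice, bob)
conflict = conflicting_lines(base, alice, carol)
```

Alice's and Bob's edits touch different lines and could be combined automatically; Alice's and Carol's both touch the same line, which is the situation an SCM hands back to a human to resolve.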
Both the Apache project and the Linux kernel project use SCMs. In the case of Apache the original patch-and-vote system quickly began to strain the patience, time, and energy of participants as the number of contributors and patches began to grow. From the very beginning of the project, the contributor Paul Richards had urged the group to make use of cvs. He had extensive experience with the system in the FreeBSD project and was convinced that it provided a superior alternative to the patch-and-vote system. Few other contributors had much experience with it, however, so it wasn't until over a year after Richards began his admonitions that cvs was eventually adopted. However, cvs is not a simple replacement for a patch-and-vote system; it necessitates a different kind of organization. Richards recognized the trade-off. The patch-and-vote system created a very high level of quality assurance and peer review of the patches that people submitted, while the cvs system allowed individuals to make more changes that might not meet the same level of quality assurance. The cvs system allowed branches-stable, testing, experimental-with different levels of quality assurance, while the patch-and-vote system was inherently directed at one final and stable version. As the case of Shambhala exhibited, under the patch-and-vote system experimental versions would remain unofficial garage projects, rather than serve as official branches with people responsible for committing changes.
While SCMs are in general good for managing conflicting changes, they can do so only up to a point. To allow anyone to commit a change, however, could result in a chaotic mess, just as difficult to disentangle as it would be without an SCM. In practice, therefore, most projects designate a handful of people as having the right to "commit" changes. The Apache project retained its voting scheme, for instance, but it became a way of voting for "committers" instead of for patches themselves. Trusted committers-those with the mysterious "good taste," or technical intuition-became the core members of the group.
The Linux kernel has also struggled with various issues surrounding SCMs and the management of responsibility they imply. The story of the so-called VGER tree and the creation of a new SCM called Bitkeeper is exemplary in this respect.36 By 1997, Linux developers had begun to use cvs to manage changes to the source code, though not without resistance. Torvalds was still in charge of the changes to the official stable tree, but as other "lieutenants" came on board, the complexity of the changes to the kernel grew. One such lieutenant was Dave Miller, who maintained a "mirror" of the stable Linux kernel tree, the VGER tree, on a server at Rutgers. In September 1998 a fight broke out among Linux kernel developers over two related issues: one, the fact that Torvalds was failing to incorporate (patch) contributions that had been forwarded to him by various people, including his lieutenants; and two, as a result, the VGER cvs repository was no longer in synch with the stable tree maintained by Torvalds. Two different versions of Linux threatened to emerge.
A great deal of yelling ensued, as nicely captured in Moody's Rebel Code, culminating in the famous phrase, uttered by Larry McVoy: "Linus does not scale." The meaning of this phrase is that the ability of Linux to grow into an ever larger project with increasing complexity, one which can handle myriad uses and functions (to "scale" up), is constrained by the fact that there is only one Linus Torvalds. By all accounts, Linus was and is excellent at what he does-but there is only one Linus. The danger of this situation is the danger of a fork. A fork would mean one or more new versions would proliferate under new leadership, a situation much like the spread of UNIX. Both the licenses and the SCMs are designed to facilitate this, but only as a last resort. Forking also implies dilution and confusion-competing versions of the same thing and potentially unmanageable incompatibilities.
The fork never happened, but only because Linus went on vacation, returning renewed and ready to continue and to be more responsive. But the crisis had been real, and it drove developers into considering new modes of coordination. Larry McVoy offered to create a new form of SCM, one that would allow a much more flexible response to the problem that the VGER tree represented. However, his proposed solution, called Bitkeeper, would create far more controversy than the one that precipitated it.
McVoy was well-known in geek circles before Linux. In the late stages of the open-systems era, as an employee of Sun, he had penned an important document called "The Sourceware Operating System Proposal." It was an internal Sun Microsystems document that argued for the company to make its version of UNIX freely available. It was a last-ditch effort to save the dream of open systems. It was also the first such proposition within a company to "go open source," much like the documents that would urge Netscape to Open Source its software in 1998. Despite this early commitment, McVoy chose not to create Bitkeeper as a Free Software project, but to make it quasi-proprietary, a decision that raised a very central question in ideological terms: can one, or should one, create Free Software using non-free tools?
On one side of this controversy, naturally, was Richard Stallman and those sharing his vision of Free Software. On the other were pragmatists like Torvalds claiming no goals and no commitment to "ideology"-only a commitment to "fun." The tension laid bare the way in which recursive publics negotiate and modulate the core components of Free Software from within. Torvalds made a very strong and vocal statement concerning this issue, responding to Stallman's criticisms about the use of non-free software to create Free Software: "Quite frankly, I don't _want_ people using Linux for ideological reasons. I think ideology sucks. This world would be a much better place if people had less ideology, and a whole lot more 'I do this because it's FUN and because others might find it useful, not because I got religion.'"37 Torvalds emphasizes pragmatism in terms of coordination: the right tool for the job is the right tool for the job. In terms of licenses, however, such pragmatism does not play, and Torvalds has always been strongly committed to the GPL, refusing to let non-GPL software into the kernel. This strategic pragmatism is in fact a recognition of where experimental changes might be proposed, and where practices are settled. The GPL was a stable document, sharing source code widely was a stable practice, but coordinating a project using SCMs was, during this period, still in flux, and thus Bitkeeper was a tool well worth using so long as it remained suitable to Linux development. Torvalds was experimenting with the meaning of coordination: could a non-free tool be used to create Free Software?
McVoy, on the other hand, was on thin ice. He was experimenting with the meaning of Free Software licenses. He created three separate licenses for Bitkeeper in an attempt to play both sides: a commercial license for paying customers, a license for people who sell Bitkeeper, and a license for "free users." The free-user license allowed Linux developers to use the software for free-though it required them to use the latest version-and prohibited them from working on a competing project at the same time. McVoy's attempt to have his cake and eat it, too, created enormous tension in the developer community, a tension that built from 2002, when Torvalds began using Bitkeeper in earnest, to 2005, when he announced he would stop.
The tension came from two sources: the first was debates among developers addressing the moral question of using non-free software to create Free Software. The moral question, as ever, was also a technical one, as the second source of tension, the license restrictions, would reveal.
The developer Andrew Tridgell, well known for his work on a project called Samba and his reverse engineering of a Microsoft networking protocol, began a project to reverse engineer Bitkeeper by looking at the metadata it produced in the course of being used for the Linux project. By doing so, he crossed a line set up by McVoy's experimental licensing arrangement: the "free as long as you don't copy me" license. Lawyers advised Tridgell to stay silent on the topic while Torvalds publicly berated him for "willful destruction" and a moral lapse of character in trying to reverse engineer Bitkeeper. Bruce Perens defended Tridgell and censured Torvalds for his seemingly contradictory ethics.38 McVoy never sued Tridgell, and Bitkeeper has limped along as a commercial project, because, much like the EMACS controversy of 1985, the Bitkeeper controversy of 2005 ended with Torvalds simply deciding to create his own SCM, called git.
The story of the VGER tree and Bitkeeper illustrates common tensions within recursive publics, specifically, the depth of the meaning of free. On the one hand, there is Linux itself, an exemplary Free Software project made freely available; on the other hand, however, there is the ability to contribute to this process, a process that is potentially constrained by the use of Bitkeeper. So long as the function of Bitkeeper is completely circumscribed-that is, completely planned-there can be no problem. However, the moment one user sees a way to change or improve the process, and not just the kernel itself, then the restrictions and constraints of Bitkeeper can come into play. While it is not clear that Bitkeeper actually prevented anything, it is clear that developers recognized it as a potential drag on a generalized commitment to adaptability. Or to put it in terms of recursive publics, only one layer is properly open, that of the kernel itself; the layer beneath it, the process of its construction, is not free in the same sense. It is ironic that Torvalds-otherwise the spokesperson for antiplanning and adaptability-willingly adopted this form of constraint, but not at all surprising that it was collectively rejected.
The Bitkeeper controversy can be understood as a kind of experiment, a modulation on the one hand of the kinds of acceptable licenses (by McVoy) and on the other of acceptable forms of coordination (Torvalds's decision to use Bitkeeper). The experiment was a failure, but a productive one, as it identified one kind of non-free software that is not safe to use in Free Software development: the SCM that coordinates the people and the code they contribute. In terms of recursive publics the experiment identified the proper depth of recursion. Although it might be possible to create Free Software using some kinds of non-free tools, SCMs are not among them; both the software created and the software used to create it need to be free.39
The Bitkeeper controversy illustrates again that adaptability is not about radical invention, but about critique and response. Whereas controlled design and hierarchical planning represent the domain of governance-control through goal-setting and orientation of a collective or a project-adaptability privileges politics, properly speaking, the ability to critique existing design and to propose alternatives without restriction. The tension between goal-setting and adaptability is also part of the dominant ideology of intellectual property. According to this ideology, IP laws promote invention of new products and ideas, but restrict the re-use or transformation of existing ones; defining where novelty begins is a core test of the law. McVoy made this tension explicit in his justifications for Bitkeeper: "Richard [Stallman] might want to consider the fact that developing new software is extremely expensive. He's very proud of the collection of free software, but that's a collection of re-implementations, but no profoundly new ideas or products. . . . What if the free software model simply can't support the costs of developing new ideas?"40 Novelty, both in the case of Linux and in intellectual property law more generally, is directly related to the interplay of social and technical coordination: goal direction vs. adaptability. The ideal of adaptability promoted by Torvalds suggests a radical alternative to the dominant ideology of creation embedded in contemporary intellectual-property systems. If Linux is "new," it is new through adaptation and the coordination of large numbers of creative contributors who challenge the "design" of an operating system from the bottom up, not from the top down. By contrast, McVoy represents a moral imagination of design in which it is impossible to achieve novelty without extremely expensive investment in top-down, goal-directed, unpolitical design-and it is this activity that the intellectual-property system is designed to reward. Both are engaged, however, in an experiment; both are engaged in "figuring out" what the limits of Free Software are.
Coordination Is Design.
Many popular accounts of Free Software skip quickly over the details of its mechanism to suggest that it is somehow inevitable or obvious that Free Software should work-a self-organizing, emergent system that manages complexity through distributed contributions by hundreds of thousands of people. In The Success of Open Source Steven Weber points out that when people refer to Open Source as a self-organizing system, they usually mean something more like "I don't understand how it works."41 Eric Raymond, for instance, suggests that Free Software is essentially the emergent, self-organizing result of "collaborative debugging": "Given enough eyeballs, all bugs are shallow."42 The phrase implies that the core success of Free Software is the distributed, isolated, labor of debugging, and that design and planning happen elsewhere (when a developer "scratches an itch" or responds to a personal need). On the surface, such a distinction seems quite obvious: designing is designing, and debugging is removing bugs from software, and presto!-Free Software. At the extreme end, it is an understanding by which only individual geniuses are capable of planning and design, and if the initial conditions are properly set, then collective wisdom will fill in the details.
However, the actual practice and meaning of collective or collaborative debugging is incredibly elastic. Sometimes debugging means fixing an error; sometimes it means making the software do something different or new. (A common joke, often made at Microsoft's expense, captures some of this elasticity: whenever something doesn't seem to work right, one says, "That's a feature, not a bug.") Some programmers see a design decision as a stupid mistake and take action to correct it, whereas others simply learn to use the software as designed. Debugging can mean something as simple as reading someone else's code and helping them understand why it does not work; it can mean finding bugs in someone else's software; it can mean reliably reproducing bugs; it can mean pinpointing the cause of the bug in the source code; it can mean changing the source to eliminate the bug; or it can, at the limit, mean changing or even re-creating the software to make it do something different or better.43 For academics, debugging can be a way to build a career: "Find bug. Write paper. Fix bug. Write paper. Repeat."44 For commercial software vendors, by contrast, debugging is part of a battery of tests intended to streamline a product.
Coordination in Free Software is about adaptability over planning. It is a way of resolving the tension between individual virtuosity in creation and the social benefit in shared labor. If all software were created, maintained, and distributed only by individuals, coordination would be superfluous, and software would indeed be part of the domain of poetry. But even the paradigmatic cases of virtuosic creation-EMACS by Richard Stallman, UNIX by Ken Thompson and Dennis Ritchie-clearly represent the need for creative forms of coordination and the fundamental practice of reusing, reworking, rewriting, and imitation. UNIX was not created de novo, but was an attempt to streamline and rewrite Multics, itself a system that evolved out of Project MAC and the early mists of time-sharing and computer hacking.45 EMACS was a reworking of the TECO editor. Both examples are useful for understanding the evolution of modes of coordination and the spectrum of design and debugging.
UNIX was initially ported and shared through mixed academic and commercial means, through the active participation of computer scientists who both received updates and contributed fixes back to Thompson and Ritchie. No formal system existed to manage this process. When Thompson speaks of his understanding of UNIX as a "spectrum" and not as a series of releases (V1, V2, etc.), the implication is that work on UNIX was continuous, both within Bell Labs and among its widespread users. Thompson's use of the diff tape encapsulates the core problem of coordination: how to collect and redistribute the changes made to the system by its users.
Similarly, Bill Joy's distribution of BSD and James Gosling's distribution of GOSMACS were both ad hoc, noncorporate experiments in "releasing early and often." These distribution schemes had a purpose (beyond satisfying demand for the software). The frequent distribution of patches, fixes, and extensions eased the pain of debugging software and satisfied users' demands for new features and extensions (by allowing them to do both themselves). Had Thompson and Ritchie followed the conventional corporate model of software production, they would have been held responsible for thoroughly debugging and testing the software they distributed, and AT&T or Bell Labs would have been responsible for coming up with all innovations and extensions as well, based on marketing and product research. Such an approach would have sacrificed adaptability in favor of planning. But Thompson's and Ritchie's model was different: both the extension and the debugging of software became shared responsibilities of the users and the developers. Stallman's creation of EMACS followed a similar pattern; since EMACS was by design extensible and intended to satisfy myriad unforeseen needs, the responsibility rested on the users to address those needs, and sharing their extensions and fixes had obvious social benefit.
The ability to see development of software as a spectrum implies more than just continuous work on a product; it means seeing the product itself as something fluid, built out of previous ideas and products and transforming, differentiating into new ones. Debugging, from this perspective, is not separate from design. Both are part of a spectrum of changes and improvements whose goals and direction are governed by the users and developers themselves, and the patterns of coordination they adopt. It is in the space between debugging and design that Free Software finds its niche.
Conclusion: Experiments and Modulations.
Coordination is a key component of Free Software, and is frequently identified as the central component. Free Software is the result of a complicated story of experimentation and construction, and the forms that coordination takes in Free Software are specific outcomes of this longer story. Apache and Linux are both experiments-not scientific experiments per se but collective social experiments in which there are complex technologies and legal tools, systems of coordination and governance, and moral and technical orders already present.
Free Software is an experimental system, a practice that changes with the results of new experiments. The privileging of adaptability makes it a peculiar kind of experiment, however, one not directed by goals, plans, or hierarchical control, but more like what John Dewey suggested throughout his work: the experimental praxis of science extended to the social organization of governance in the service of improving the conditions of freedom. What gives this experimentation significance is the centrality of Free Software-and specifically of Linux and Apache-to the experimental expansion of the Internet. As an infrastructure or a milieu, the Internet is changing the conditions of social organization, changing the relationship of knowledge to power, and changing the orientation of collective life toward governance. Free Software is, arguably, the best example of an attempt to make this transformation public, to ensure that it uses the advantages of adaptability as critique to counter the power of planning as control. Free Software, as a recursive public, proceeds by proposing and providing alternatives. It is a bit like Kant's version of enlightenment: insofar as geeks speak (or hack) as scholars, in a public realm, they have a right to propose criticisms and changes of any sort; as soon as they relinquish that commitment, they become private employees or servants of the sovereign, bound by conscience and power to carry out the duties of their given office. The constitution of a public realm is not a universal activity, however, but a historically specific one: Free Software confronts the specific contemporary technical and legal infrastructure by which it is possible to propose criticisms and offer alternatives.
What results is a recursive public filled not only with individuals who govern their own actions but also with code and concepts and licenses and forms of coordination that turn these actions into viable, concrete technical forms of life useful to inhabitants of the present.
The question cannot be answered by argument. Experimental method means experiment, and the question can be answered only by trying, by organized effort. The reasons for making the trial are not abstract or recondite. They are found in the confusion, uncertainty and conflict that mark the modern world. . . . The task is to go on, and not backward, until the method of intelligence and experimental control is the rule in social relations and social direction. -John Dewey, Liberalism and Social Action
8. "If We Succeed, We Will Disappear"
In early 2002, after years of reading and learning about Open Source and Free Software, I finally had a chance to have dinner with famed libertarian, gun-toting, Open Source-founding impresario Eric Raymond, author of The Cathedral and the Bazaar and other amateur anthropological musings on the subject of Free Software. He had come to Houston, to Rice University, to give a talk at the behest of the Computer and Information Technology Institute (CITI). Visions of a mortal confrontation between two anthropologists-manqué filled my head. I imagined explaining point by point why his references to self-organization and evolutionary psychology were misguided, and how the long tradition of economic anthropology contradicted basically everything he had to say about gift-exchange. Alas, two things conspired against this epic, if bathetic, showdown.
First, there was the fact that (as so often happens in meetings among geeks) there was only one woman present at dinner; she was young, perhaps unmarried, but not a student-an interested female hacker. Raymond seated himself beside this woman, turned toward her, and with a few one-minute-long exceptions proceeded to lavish her with all of his available attention. The second reason was that I was seated next to Richard Baraniuk and Brent Hendricks. All at once, Raymond looked like the past of Free Software, arguing the same arguments, using the same rhetoric of his online publications, while Baraniuk and Hendricks looked like its future, posing questions about the transformation-the modulation-of Free Software into something surprising and new.