Open Source Software Development
29 November 2005
This
chapter explains
Open
source is a development approach in which the source code of the software is
entirely free to access. The term “open source” is used to refer to both the
product and the development approach. Any individual is able to view the code,
modify it or duplicate it. Access to the source code facilitates the
distributed and cooperative approach to software development that is
fundamental to an open source style of development.
Some
examples of larger open source products are: Mozilla web browser; Apache web
server; GNU/Linux and GNU/HURD operating systems; MySQL database software; Perl
programming language; MyOffice and OpenOffice office suites. This range of
products illustrates that open source software development can produce a
diverse range of products.
Open
source development is a recognition of the hacker’s attitude to building
software. The term hacker has sometimes been associated with negative aspects
of computing. However, hackers are now recognised as a community of highly
skilled programmers, who relish the act of writing code and participate for
enjoyment or to enhance their programming reputation. It is fundamental to the
hacker ethic that information and knowledge should be freely shared without
restriction because this stimulates collaborative thinking, leading to superior
ideas.
The
same principle is applied in open source development. Rather than the code
being confined to a small core of developers (or even just one person), as in
proprietary methods, a wider audience facilitates a greater influx of ideas and
a greater degree of innovation. It is believed that because the source code is examined
by a larger audience than proprietary software, any imperfections stand a
greater chance of being identified and consequently rectified. The sharing of
code therefore leads to more reliable code.
However,
openness and the concept of code sharing do not mean that open source products
are necessarily free of charge. There are other important issues of principle, which we now
discuss.
Self Test Question
What is
the primary goal of open source development?
Answer
Reliable
software
End
Whilst
there is a shared belief in collaboration and openness within the development
community, a schism does exist in terms of the motivation and underlying
philosophy of open source. The main split is between the Free Software Foundation and the
Open Source Initiative.
The Free Software Foundation (FSF) was founded in 1985.
The
philosophy of the FSF is that
individual freedom should never be compromised and that all individual action
should also benefit the wider community. Therefore, whilst individual
programmers are encouraged and admired, they are also expected to feed their
findings and their skills back into the community of programmers to which they
ultimately belong. This is done through the sharing of code and the
distribution of good programming practice.
The
principle of freedom is reflected in the use of the term copyleft in
order to distinguish it from the usual term copyright. In order to distribute its
products, the free software community has devised its own license agreement,
the General Public License (GPL). The GPL (which runs to about 5
pages of text) provides users with certain fundamental freedoms:
There are
two other important ingredients:
Thus
the GPL actually makes it illegal for anyone to make GPL code proprietary or
“closed”. It also disallows building any GPL-covered software into proprietary software.
Essentially, this means that GPL software always remains GPL software and
therefore always remains free.
The FSF is absolutely resolute in not
allowing any proprietary software to be incorporated into their software. All
their products are covered under the GPL and they are largely unaffiliated with
commercial software development companies.
Free
software organizations often offer their program code for free, most commonly as
a download from their website. However, remembering that this is “free as in free
speech, not as in free beer”, some organizations sell free software as a
complete package, shrink-wrapped, sometimes including user manuals and
additional support services.
The Open Source Movement, which later
became the Open Source Initiative (OSI), is spearheaded by Eric S. Raymond.
Their emphasis is on the benefits of open source as a development approach,
rather than any moral benefits that can accrue. It is a purely pragmatic
approach. They stress that open source development can produce higher quality
software than other approaches.
The OSI are more willing to collaborate with larger software companies, sometimes including developers of proprietary products. They wish to appeal to the business sector because this enables greater distribution of their product. However, unlike the FSF, their approach is motivated primarily by the quality, rather than by the freeness, of the software. Forming contracts with larger companies is one way of exposing OSI products to a larger potential market. However, it also means that the product must compete with other commercial packaged products.
Some
open source development projects have devised their own open source licenses,
which differ in varying degrees from the GPL. However, the majority of the founding
open source projects still use the GPL.
Self Test Question
Can you
write and sell software with a GPL license?
Answer
yes
End
Self Test Question
Can you
obtain software that has a GPL license and then sell it?
Answer
yes
End
Despite
the schism within open source in terms of ethics and philosophy, the
development practices principally remain the same between the two movements.
Open source development tends
to use the following techniques:
Managing an open source project is potentially complex, and we look at this topic in the next section.
Throughout development, the internet facilitates communication between developers and also the distribution of source code, via the Web, file transfer sites and email.
There is usually no formal mechanism for gathering initial user requirements for an open source development. The process often consists of a software requirement that is instigated by a sole developer, with requests for collaboration, targeting the hacker community. The head developer specifies most requirements. Additional user requirements are either implemented by individual developers themselves via personal modification of the source code, or through a communal process known as “code forking”. Code forking occurs when the developer base has alternative requirements or conflicting ideas on how to implement a requirement. The code is seen to “fork” because it is split and each copy of the code is developed in parallel. After this split occurs, the code is irreconcilable and therefore two different products exist, both growing from the same base code. Each fork competes for developer attention, so that the most popular or the most reliable version survives. In theory, the “fittest” code should survive.
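The forking process described above can be pictured with a small model. The following is only a sketch: the `Project` class, the `fork` method and the example change names are all hypothetical, and a real project would use a version control system rather than a list of strings. It shows that both forks keep the same base history but diverge afterwards.

```python
from dataclasses import dataclass, field


@dataclass
class Project:
    """A toy model of a code base: a name plus an ordered list of changes."""
    name: str
    history: list[str] = field(default_factory=list)

    def commit(self, change: str) -> None:
        # Record one change on this copy of the code only.
        self.history.append(change)

    def fork(self, new_name: str) -> "Project":
        # The fork starts from a copy of the history so far; later
        # commits on either side do not affect the other.
        return Project(new_name, list(self.history))


def common_base(a: Project, b: Project) -> list[str]:
    """Return the leading changes that both projects share."""
    shared = []
    for x, y in zip(a.history, b.history):
        if x != y:
            break
        shared.append(x)
    return shared


# Hypothetical example: one project splits over conflicting requirements.
original = Project("editor")
original.commit("initial release")
original.commit("add search")

variant = original.fork("editor-lite")
original.commit("add plugin system")
variant.commit("remove toolbar")

print(common_base(original, variant))
```

After the split, each copy accumulates its own changes, which is why the two products can no longer be treated as one.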
The design of software is communicated via web-based tools. Sometimes UML diagrams or other notations using hyperlinks to depict the overall structure of the program are deployed. However, generally, there is a lack of design documentation within open source products. This reflects the hacker ethic that software is simply code.
The code writing on an open source project is sustained
through voluntary contributions. Developers are motivated by the enjoyment of programming,
the belief in the sharing of software or their own requirement for the software
product. Code is commonly implemented
via re-use and many open source projects begin immediately by re-writing the
code of existing products, with enhancements and alterations made where
necessary. When there is no original from which to copy, a core developer base
begins writing the code before offering it to the wider community for critique.
Once
contributions have been implemented, beta versions of open source products are
released. Releases are made frequently, so that the effectiveness of
contributions can be tested immediately. Feedback on the latest version is
received and contributions again incorporated into the code in a continuous
cycle, which continues until the community is satisfied with the eventual
outcome. Contributions then slow down or cease.
Development communities and product websites act as sources of support for users of open source software. The websites contain installation tutorials and user forum groups providing technical support. The development community mostly provides these voluntarily.
As an
alternative means of support, commercially supported versions of open source
software such as GNU/Linux are available to buy. This software is an exact
replica of the source code, but is provided with supporting manuals and
services. These services do not exist for all products and therefore many
smaller open source products are only used by technically adept users.
In summary, the following table
lists most of the essential tasks of software development, alongside how they
are carried out in open source development.
| Task | How it is carried out in open source development |
|---|---|
| Requirements elicitation | An individual has an idea for a program and puts it on a mailing list or a newsgroup. Potential users suggest features. A discussion takes place until a consensus arrives. There is no market research, no interviewing of potential users. |
| Architectural design | There is no explicit process for this activity. Either the design is implicit (and obvious) or it evolves over time. (For example, there has been a long and vigorous argument about the best structure for the Linux kernel.) This is perhaps the least-defined part of development - and, perhaps, the most vulnerable. |
| Detailed design | This stage simply does not exist, except as a by-product of the next stage. |
| Coding | This is the central part of open source development. This is what developers enjoy doing. Source code is regarded as the most important, or only, product. |
| Integration | A version is built and placed on an internet site for users to download. |
| Verification | This is what the collaborators do. Not only can they run the program (and reveal bugs) but they can study the source code (and reveal bugs). There is no test plan or strategy. Different users will investigate the program in different ways. For example, some will be interested in robustness, others in security. This diversity has the potential to provide thorough testing. |
| Bug fixes | Again, this is what the collaborators do. |
| Support | Via newsgroups or commercial organizations |
Self Test Question
What is
the main technique of open source development?
Answer
Code
sharing
End
Self Test Question
What is
the main tool of open source development?
Answer
The
internet
End
The hacker ethic is essentially
anti-managerial. So how are the following activities carried out in open source
development?
These
activities are particularly challenging because of the large numbers of people
involved. There is a need for a responsive decision-making structure so that
decisions can be made quickly when bugs are reported and new features are
suggested.
An
explicit project manager or management group is generally in place on open
source projects. They decide on the usefulness and appropriateness of contributions
that are made by the wider developer community. They also usually add a patch
to the code and therefore act as chief implementer on the project. Various
organizational styles are used but a common characteristic is a hierarchy of
control.
As an example of individual
rule, the Linux kernel development
(see below) is based on a hierarchy, with Linus Torvalds at the top. Proposals
are examined for appropriateness, selectively filtered and sent up the
hierarchy until they reach Torvalds. If he accepts the proposal he integrates
it into the code. He tightly controls the kernel. He has said: "I couldn't
manage lots of developers. I would not have been able to keep control."
An example of group hierarchical management is the
Apache project. This development is led by a group, the Project Management
Committee (PMC). Membership of this group is by invitation only and must be
approved by a majority vote of the group, with no veto from any member. A
potential member must demonstrate high technical competence. Outside the PMC are
the large numbers of contributors. A contributor submits a change, which is
vetted by a PMC member. The PMC then votes on whether to accept the change.
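The membership rule described above (a majority vote of the group, with no veto from any member) can be written down as a small check. This is a minimal sketch, not the Apache project's actual procedure: the `Vote` type, the function name and the member names are hypothetical, and it assumes that "majority" means more than half of the whole group rather than of the votes cast.

```python
from enum import Enum


class Vote(Enum):
    YES = "yes"
    NO = "no"
    VETO = "veto"


def membership_approved(votes: dict[str, Vote], group_size: int) -> bool:
    """Apply the rule: a majority of the group in favour and no veto."""
    # A single veto blocks the proposal regardless of the count.
    if any(v is Vote.VETO for v in votes.values()):
        return False
    yes_count = sum(1 for v in votes.values() if v is Vote.YES)
    # Assumption: "majority" means more than half of the whole group,
    # so members who do not vote count against the proposal.
    return yes_count * 2 > group_size


# Hypothetical three-member group voting on a new member.
ballot = {"ann": Vote.YES, "bob": Vote.YES, "cat": Vote.NO}
first_result = membership_approved(ballot, group_size=3)

ballot["cat"] = Vote.VETO
second_result = membership_approved(ballot, group_size=3)

print(first_result, second_result)
```

The first ballot passes because two of three members are in favour; the second fails because one member exercises a veto, even though the count of votes in favour is unchanged.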
As these examples indicate,
most open source projects use an explicit project manager or a management group
at the head of a hierarchy. The hierarchy means that a project is prevented
from being overwhelmed by contributions. It also lessens the risk of sabotage. While
the hierarchy within open source projects does not provide an “undo”
facility,
it does attempt to ensure that contributions are
rigorously scrutinised as they pass through the structure. There is clearly an
important element of trust.
GNU/Linux is an open source operating system, loosely based upon Unix. It contains over 10 million lines of code and has been developed by over 3000 major contributors of code from 90 countries. It is perhaps the most famous open source project. It has achieved a reputation for high reliability, and is widely used in servers across the world. It is distributed under the GPL.
Linus Torvalds, who still oversees the development today,
instigated the project in 1991. Torvalds
originally began the project because none of the operating systems then available
served his own requirements. They were either unreliable, too expensive or
devoid of the functionality he required. He knew that the main free software
operating system, GNU ("GNU's Not Unix"), was some way from
completion. He could not wait. What was missing from GNU was the central
component, the kernel. So he began to write a kernel. Torvalds was also motivated
by the enjoyment of writing code and claims that he wrote it “just for fun!” The
kernel was named Linux, after Linus. The Free Software Foundation strongly maintains
that the majority of the complete operating system was written by the GNU
people and that it should more appropriately be called GNU/Linux.
Torvalds targeted
developer forums and websites, posting an early release of the kernel and
requesting feedback and contributions. Increased contributions and
collaborations between GNU/Linux and GNU groups meant that distribution of beta
versions was frequent and continuous.
Now you might
imagine that Linux was developed in an egalitarian fashion, in line with the
hacker ethic. But this is not so. Torvalds is in charge, surrounded by a small
group of trusted lieutenants (called credited developers). These people are selected
by Torvalds and by the group itself. The architecture of Linux is deliberately
modular, so as to minimise communication between developers and to make it
easier to carry out development of different modules in parallel. Each credited
developer has responsibility for an individual module.
The development
of Linux centres around an electronic mailing list. This contains:
· bug reports from users and testers
· suggested bug fixes (patches)
· code for new features
· announcements, such as the announcement of a new release
The scale of this
list is huge - in one five-year period, approximately 13,000 contributors
posted around 175,000 messages to the list. The credited developers watch the
mailing list, looking for entries relevant to their module. They assess whether
a contribution is useful and, where appropriate, submit patches to Torvalds.
He then decides what happens.
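The flow from mailing list to credited developer to Torvalds can be sketched as a two-stage filter. This is an illustrative model only: the `Message` fields, the function names, the module names and the `accept` callback are all hypothetical stand-ins for human judgement, not a description of any real tooling.

```python
from dataclasses import dataclass


@dataclass
class Message:
    sender: str
    module: str   # which module the message concerns
    kind: str     # "bug report", "patch", "feature" or "announcement"
    useful: bool  # stand-in for the credited developer's assessment


def triage(messages, maintainers):
    """Return (maintainer, message) pairs forwarded up the hierarchy."""
    forwarded = []
    for msg in messages:
        maintainer = maintainers.get(msg.module)
        if maintainer is None:
            continue  # no credited developer watches this module
        # Only code contributions judged useful are passed upwards.
        if msg.kind in ("patch", "feature") and msg.useful:
            forwarded.append((maintainer, msg))
    return forwarded


def final_decision(forwarded, accept):
    """The person at the top decides; `accept` models that decision."""
    return [msg for _, msg in forwarded if accept(msg)]


# Hypothetical modules, developers and messages.
maintainers = {"net": "dev_a", "fs": "dev_b"}
mailing_list = [
    Message("u1", "net", "patch", True),
    Message("u2", "fs", "patch", False),
    Message("u3", "net", "bug report", True),
    Message("u4", "sound", "patch", True),
]

forwarded = triage(mailing_list, maintainers)
merged = final_decision(forwarded, accept=lambda m: m.module == "net")
print([m.sender for m in merged])
```

Each stage discards most of what it sees, which is how a small group at the centre can cope with a very large volume of messages.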
So the people on
the project are organized as follows:
· Torvalds is at the centre
· he is surrounded by a small group of credited developers
· they are surrounded by thousands of contributors
Clearly this is a
very centralized, and hierarchical, structure.
After years of continuous development, GNU/Linux is now a renowned open source operating system, competing on the world market with other commercial and proprietary products. What began as a personal project is now widely used and technically reputable. The GNU/Linux code is still available in its original non-supported format. However, a number of commercial organizations also exist to provide appropriate support for various user markets. GNU/Linux remains in continuous development, undergoing corrections and enhancements.
Open
source development’s most attractive asset is the enormous enthusiasm and
passion that resonates throughout the developer community and their building of
software. Developers have an unrelenting belief in what they do; voice their
pride in their hacker roots; and find nothing more fulfilling than the art of
programming.
Forking
ensures that developer requirements are established and implemented in a
democratic process. This means that the requirements of the majority of the
development community are satisfied. Similarly, any specific personal
modification can be made by individuals, providing that they have the technical
ability to implement them.
However,
it is worth noting that this process largely ignores non-developer user
requirements. The general user does not have the power to register their vote
via code implementation; neither can they personally modify their own code.
The re-use of code is an important development approach. However, in the case of open source projects that attempt to re-write entire systems and applications, a re-use approach can only be facilitated by source code that is not covered by a proprietary license. Liability issues may hinder entire projects because developers may not have legal access to any code that they would like to re-write. However, the overall expertise of the hacker community usually means that volunteers are willing to take on the alternative and more difficult task of writing entire systems from scratch.
Releasing
frequent versions of the software brings benefits of continuous feedback.
Whilst the beta code may not contain all the functionality that is required, it
means that the developer base can immediately evaluate the code and get a feel
for the software. Crucially, the potentially vast audience of testers can
immediately begin to track and fix bugs, so that changes can be made
incrementally, continuously and at a relatively fast pace.
Inappropriate
patches, once incorporated into code, can irreparably damage a project. Having
an explicit manager on open source projects means that all contributions are
monitored and approved. This ensures that the freedom to contribute is upheld,
but lessens the risk of any sabotage attempts.
Open source program code tends to be highly reliable because bugs are found and fixed by a large audience with highly proficient programming abilities. Proprietary software corporations are being forced to acknowledge open source development as a valid approach and are beginning to experiment with its techniques. The large audience that can track and fix bugs is seen as an efficient way of “cleaning up” software that is proving to be unreliable. Consequently, some companies have now opened up previously closed code. This suggests that the open source development approach can influence other mainstream techniques.
Contributors to open source projects have a passion for programming, so that writing code is seen as more of a hobby than a chore or a job. They gain enormous satisfaction in seeing their patches integrated into a program. However, because open source projects generally rely upon voluntary contributions, there is always the risk that the community will cease to contribute to the project. This would result in the stagnation of a development project and an unfinished product.
Similarly, the lack of documentation potentially limits maintenance to the original developer base and makes it harder for someone else to take on the project. If the initial developer base tires of a project, it is not easy for another developer to take it on without documentation as a means of communicating the design of the program.
The usefulness of informal support mechanisms is questionable, particularly for the non-technical user. Web site tutorials are often aimed at a technically adept audience. In addition, since support services are voluntary, there is no guarantee that someone will be available when required and users may have to wait until someone responds to their enquiry.
Open
source development is a collaborative approach relying upon voluntary
contributions of program code. It has its roots in a hacker ethic that promotes
individual skill, but also upholds the importance of community.
The
approach produces extremely reliable software because open source code means
bugs are exposed to a vast audience. Thus more bugs are likely to be found and
fixed. The regular release of the software also means that program code is
continually tested before the final product version is released.
Non-commercial
open source organizations are often weak in supporting the general user (for
example, producing and supporting a word processor). However, the commercial
sector, acknowledging the superiority of the open source code, is addressing
this problem, providing support services for open source products and adopting
open source development techniques.
1. Can you think of any situations or products for which open source development might be most appropriate?
2. Can you think of examples of situations in which open source development of products might be unwise?
3. Assess whether open source would be suitable for each of the developments given in appendix A.
4. Compare and contrast the approaches of the Free Software Foundation and the Open Source Initiative.
5. Is open source development just hacking?
Hackers:
Heroes of the Computer Revolution
Levy, S. (2002), Anchor Books
This provides a rare insight into the history of
hacking, from its origins at MIT in the 1950s to the rise of open source
software.
--------------------------------------------------------------------------------------------------------------
Open Sources: Voices from the Open Source
Revolution (1st Edition)
DiBona,
C., Ockman, S., & Stone, M. (1999),
A
comprehensive collection of essays covering topics from licensing issues to the
engineering of major open source products such as Mozilla and Perl.
-----------------------------------------------------------------------------------------------------------------
Rebel Code: Inside Linux and the Open
Source Revolution
Moody,
G. (2001), Perseus Publishing.
This is
a very accessible book which depicts the development of the GNU/Linux Operating
System, including interviews with major contributors in the open source field.
------------------------------------------------------------------------------------------------------
The Cathedral and the Bazaar: Musings on
Linux and Open Source by an Accidental Revolutionary (Revised Edition)
This is
a response to
-------------------------------------------------------------------------------------------------------------
Free as in Freedom:
Primarily
focusing on the life and moral crusade of Stallman, this text also describes
the development of GNU project and other projects of the Free Software
Foundation.
--------------------------------------------------------------------------------------------------------------
SourceForge is a
web site that coordinates open source development projects. If you want to
contribute to projects, this is the place.
The URL is http://www.sourceforge.net
---------------------------------------------------------------------------------------------
The Success of
Open Source
A social scientist's view of open source development,
showing how it challenges conventional wisdom.