Greetings. One of the most contentious issues on the Internet and its World Wide Web is the rising furor surrounding the filtering and rating of site content. It has all the elements of a classic "B" movie: politics, religion, sex, and even some dandy sci-fi aspects such as runaway technology. But filtering and content matters far transcend the importance of an afternoon's idle entertainment, and strike to the very heart of some crucial concerns of both individuals and society.
The Internet has created the potential for information distribution and access without respect to organizational size, jurisdictions, or geographic boundaries. These abilities are unparalleled in the human experience. Even such fundamental developments as the printing press seem to pale in scope when compared with the vast quantity and reach of information the Internet can provide.
The Internet and Web are just tools of course, and as such do not possess intrinsic ethical or moral sensibilities. The available materials cover the entire range from the vile to the sublime. But assigning any particular page of information, photos, or other Internet data to a specific point along that continuum is a highly individualistic experience, with reasonable and honorable people disagreeing over virtually every category.
It is into this unprecedented environment that the world's populace has found itself suddenly thrust, and the urge to attempt the implementation of "simple" solutions to a very complex set of circumstances is proving to be overwhelming. As usual, however, we're finding that the simple approaches are often fraught with problems of their own.
The core issue revolves around the desire and abilities of individuals, organizations, and governments to rate, filter, or otherwise control the Internet content that may be viewed by any given individual. In some cases, their specific concerns may be fundamentally laudable; in other cases, highly suspect. Countries with a history of censoring political speech, for example, have been quick to attempt the implementation of proxy servers and other controls to try to stem the flow of such communications.
But this trend is not limited to governments with a history of draconian information controls. It has also appeared in such enlightened democracies as Australia, where government-mandated rating and blocking requirements, aimed primarily at "offensive" entertainment material, have been implemented. Similar government edicts are on the rise within the European Union and other areas of the world.
In the United States, these movements are also present. Private and public organizations, including municipalities in their offices, schools, and libraries, are making increasing use of content filtering software. Sometimes these filters are directed at children's use of computers, but often adults as well are required to abide by the programs' restrictions. The U.S. Congress has twice attempted to mandate the use of such filters by public institutions, linking such usage to federal funding. These mandates have so far been rejected by the federal court system, though the legal wrangling continues.
Even if such filtering programs accurately performed their stated purposes, the information control, freedom of speech, and related issues would be formidable at best. But making matters even worse is the flawed nature of these filtering methodologies, and in many cases the secretiveness with which they implement their content filtering decisions.
Filtering can be applied to nearly any type of Internet content, from e-mail to Web pages. It can be implemented via automated systems, typically using keyword searching to try to find "offending" materials. This tends to be the most laughable filtering technique, since its false positive rate is immensely high. Web pages mentioning the term "Superbowl XXX" have been blocked as pornography by such systems. Even the recent PFIR Statement on Spam was rejected by some sites running filters that declared the PFIR message to be spam--possibly because terms such as "multi-level marketing" were included within the discussion of spam problems. We don't really know what triggered the rejections--you're usually not told specifically what content in a message or Web page was deemed unacceptable by the programs.
While controlling spam is certainly a positive goal, it's obvious that you cannot accurately determine the context of words via such crude techniques. Systems that are keyword-based without human review are unsuitable for use in any Internet content filtering application.
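To make the false positive problem concrete, here is a minimal sketch of the kind of context-blind keyword matching described above. It is an illustration only, not the method of any actual filtering product: the term list, the category names, and the function name naive_filter are all assumptions invented for this example, since the real lists and matching rules are not disclosed.

```python
# Hypothetical term list. Real products keep theirs secret, so these
# entries are assumptions chosen to mirror the examples in the text.
BLOCKED_TERMS = {
    "xxx": "pornography",
    "multi-level marketing": "spam",
}


def naive_filter(text):
    """Return sorted (term, category) pairs for every blocked term in text.

    Matching is a case-insensitive substring search with no notion of
    context, which is exactly why it misfires.
    """
    lowered = text.lower()
    return sorted(
        (term, category)
        for term, category in BLOCKED_TERMS.items()
        if term in lowered
    )


if __name__ == "__main__":
    samples = [
        "Tickets for Superbowl XXX go on sale Monday",
        "Why multi-level marketing pitches are a spam problem",
        "Library opening hours",
    ]
    for sample in samples:
        hits = naive_filter(sample)
        verdict = "BLOCKED" if hits else "allowed"
        print(f"{verdict}: {sample!r} {hits}")
```

The first two samples are flagged even though one is about a football game and the other is a criticism of spam. The matcher sees only the characters, never the meaning, and a site operator would typically not be told which term caused the block.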
Unfortunately, content filtering systems based on ostensibly human-created lists or human review seem to be equally inaccurate and obnoxious. Most commercially available Web filtering programs contain "secret" lists of sites to be blocked--the manufacturers often consider their block lists to be proprietary and copyrighted. Operational experiences have suggested that many of these lists are highly inaccurate, often blocking sites unrelated to the announced blocking criteria. Health information sites have been blocked as if they were pornography, for example.
In many other cases, blocks are so far off-base that it's difficult to imagine how they could have occurred unless automated systems were actually responsible for the listings. At one point, the well-respected PRIVACY Forum was blocked by a popular filtering program, which had placed the Forum Web pages within a "criminal skills" category. It turned out that the mere mention of encryption issues within some PRIVACY Forum articles had triggered this categorization! When contacted, the firm that created the filter acknowledged the obvious inappropriateness of the block, and removed the PRIVACY Forum from its block lists. The company never offered a reasonable explanation of how its human reviewers could have made such an error.
This brings up another critical point. Sites that are blocked normally have no way to even know of their blocked status unless somebody attempting to access the site informs them about it. Companies selling blocking software don't normally even attempt to inform sites when they've been added to a block list, nor are systematic procedures for appealing such categorizations universally available. Sites have no reliable way to know which of the many available filtering programs are blocking them, possibly completely inaccurately, at any given time. Even after specific blocking errors are corrected, such mistakes could recur without warning.
These factors, along with the secretiveness with which the filtering companies tend to treat their blocking lists, create an untenable situation. Especially when such filters are being used by public entities such as libraries and schools, they create the Orwellian atmosphere of secret censorship committees, completely devoid of any genuine accountability. What do the block lists really contain? Porn sites? Religious sites? Political speech sites? We can't know if the lists are unavailable. This is a horror in any modern public policy context. At a bare minimum, public institutions should be prohibited from using any filtering software which does not make its complete block lists available for public inspection!
Most manufacturers of filtering software are very serious about keeping their lists hidden. In a very recent case, individuals who decrypted the block list from one such package are being sued by the company involved, which is also reportedly trying to learn the identities of the persons who accessed those decrypted materials from related Web sites. While the detailed legal issues relating to the actual decryption in this case may be somewhat problematic, the intolerable fact that the block lists are kept hidden seems to have at least partly driven this situation.
Outside of the rating procedures used by the commercial filtering software packages, there are also a variety of efforts aimed at inducing all Web sites to "self-rate" via various criteria, often with the suggestion of penalties or sanctions in cases of perceived inaccurate ratings. In some countries, as in the Australian case, these ratings are being mandated by the government. In other cases they are presented as ostensibly voluntary. But it's clear that there'd really be nothing voluntary about them, since unrated sites would presumably be treated as "objectionable" by many Web browser configurations that would implement the rating systems. And again, we find ourselves faced with the problem of how ratings would be evaluated for "accuracy"--given the wide range of opinions and world views present in any society. To whom do we cede the power to make such determinations in the international environment of the Internet?
It is particularly alarming to observe the extent to which the proponents of mandatory filtering seem anxious to control Internet content that is not similarly controlled in other situations. A frequently cited example is information about explosives. There is certainly such information available on the Internet which could be used to harm both persons and property. But much of this same sort of information is available in bookstores, libraries, or by mail order. How do we draw the line on what would be forbidden? Radical literature? Industrial training materials? Chemistry textbooks? Are we really so anxious to dramatically alter our notions of free speech across the board, not just relating to the Internet?
Free speech is by no means absolute, but blaming the Internet or Web for our perceived problems is merely finding a convenient scapegoat, not a genuine solution. Before we tamper dramatically with such fundamental concepts, we'd better be very careful about what we wish for, and consider how the granting of some wishes could potentially damage society and our most cherished precepts.
In any case, personal responsibility, both in terms of our own behaviors and when it comes to supervising the activities of children, must not be replaced by automated systems. Taking responsibility is our job as human beings--it is certainly not an appropriate role for our machines!
It should be interesting to see how many automated content filters the vocabulary of this very document will trigger...
firstname.lastname@example.org or email@example.com
Co-Founder, PFIR - People For Internet Responsibility - http://www.pfir.org
Moderator, PRIVACY Forum - http://www.vortex.com
Member, ACM Committee on Computers and Public Policy