Children's Internet Protection Act (CIPA) Ruling
We do not credit Finnell's estimates of the rates of underblocking in the Westerville and Greenville public libraries for several reasons. First, Finnell's estimates likely understate the actual rate of underblocking because patrons, who knew that filtering programs were operating in the Greenville and Westerville Libraries, may have refrained from attempting to access sites with sexually explicit materials, or other content that they knew would probably meet a filtering program's blocked categories. Second, and most importantly, we think that the formula that Finnell used to calculate the rate of underblocking in these two libraries is not as meaningful as the formula that information scientists typically use to calculate a rate of recall, which we describe above in Subsection II.E.3. As Dr.
Nunberg explained, the standard method that information scientists use to calculate a rate of recall is to sort a set of items into two groups, those that fall into a particular category (e.g., those that should have been blocked by a filter) and those that do not. The rate of recall is then calculated by dividing the number of items that the system correctly identified as belonging to the category by the total number of items in the category.
In the example above, we discussed a database that contained 1000 photographs. Assume that 200 of these photographs were pictures of dogs. If, for example, a classification system designed to identify pictures of dogs identified 80 of the dog pictures and failed to identify 120, it would have performed with a recall rate of 40%. This would be analogous to a filter that underblocked at a rate of 60%. To calculate the recall rate of the filters in the Westerville and Greenville public libraries in accordance with the standard method described above, Finnell should have taken a sample of sites from the libraries' Internet use logs (including both sites that were blocked and sites that were not), and divided the number of sites in the sample that the filter incorrectly failed to block by the total number of sites in the sample that should have been blocked. What Finnell did instead was to take a sample of sites that were not blocked, and divide the number of sites in this sample that should have been blocked by the total number of sites in the sample. This made the denominator that Finnell used much larger than it would have been had he used the standard method for calculating recall, consequently making the underblocking rate that he calculated much lower than it would have been under the standard method.
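The contrast between the standard recall formula and Finnell's formula can be sketched numerically. Only the 200-photograph and 80-identified figures below come from the example above; the 30, 100, and 1,000 figures are invented solely to illustrate how Finnell's larger denominator deflates the computed rate.

```python
# Hypothetical illustration of the two underblocking formulas the court
# contrasts. All numbers are for demonstration only; none besides the
# dog-photo example are drawn from Finnell's actual data.

def standard_underblocking(missed, should_be_blocked_total):
    """Standard information-science method: sites the filter missed,
    divided by ALL sites in the sample that should have been blocked
    (equivalently, 1 minus the recall rate)."""
    return missed / should_be_blocked_total

def finnell_underblocking(missed, unblocked_sample_size):
    """Finnell's method: sites the filter missed, divided by the size
    of a sample drawn only from sites the filter did NOT block."""
    return missed / unblocked_sample_size

# The court's dog-photo example: 200 dog pictures, 80 correctly identified.
recall = 80 / 200
print(recall, 1 - recall)  # 0.4 0.6 -- 40% recall, 60% underblocking analogue

# Why the denominators diverge: suppose 30 objectionable sites slipped
# through. Measured against the 100 sites in a log sample that should
# have been blocked, the standard rate is 30%; measured against a
# 1,000-site sample of unblocked pages, Finnell's formula yields 3%.
print(standard_underblocking(30, 100))   # 0.3
print(finnell_underblocking(30, 1000))   # 0.03
```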
Moreover, despite the relatively low rates of underblocking that Finnell's study found, librarians from several of the libraries proffered by defendants that use blocking products, including Greenville, Tacoma, and Westerville, testified that there are instances of underblocking in their libraries. No quantitative evidence was presented comparing the effectiveness of filters and other alternative methods used by libraries to prevent patrons from accessing visual depictions that are obscene, child pornography, or in the case of minors, harmful to minors.
Biek undertook a similar study of the overblocking rates that result from the Tacoma Library's use of the Cyber Patrol software. He began with the 3,733 individual blocks that occurred in the Tacoma Library in October 2000 and drew from this data set a random sample of 786 URLs. He calculated two rates of overblocking, one with respect to the Tacoma Library's policy on Internet use, which provides that the pictorial content of the site may not include "graphic materials depicting full nudity and sexual acts which are portrayed obviously and exclusively for sensational or pornographic purposes," and the other with respect to Cyber Patrol's own category definitions. He estimated that Cyber Patrol overblocked 4% of all Web pages in October 2000 with respect to the definitions of the Tacoma Library's Internet Policy and 2% of all pages with respect to Cyber Patrol's own category definitions.
It is difficult to determine how reliable Biek's conclusions are, because he did not keep records of the raw data that he used in his study; nor did he archive images of the Web pages as they looked when he made the determination whether they were properly classified by the Cyber Patrol program. Without this information, it is impossible to verify his conclusions (or to undermine them). And Biek's study certainly understates Cyber Patrol's overblocking rate for some of the same reasons that Finnell's study likely understates the true rates of overblocking in the libraries that he studied.
We also note that Finnell's study, which analyzed a set of Internet logs from the Tacoma Library during which the same filtering program was operating with the same set of blocking categories enabled, found a significantly higher rate of overblocking than the Biek study did. Biek found a rate of overblocking of approximately 2%, while the Finnell study estimated a 6.34% rate of overblocking. At all events, the category definitions employed by CIPA, at least with respect to adult use (visual depictions that are obscene or child pornography), are narrower than the materials prohibited by the Tacoma Library policy, and therefore Biek's study understates the rate of overblocking with respect to CIPA's definitions for adults.
In sum, we think that Finnell's study, while we do not credit its estimates of underblocking, is useful because it states lower bounds with respect to the rates of overblocking that occurred when the Cyber Patrol, Websense, and N2H2 filters were operating in public libraries. While these rates are substantial, between nearly 6% and 15%, we think, for the reasons stated above, that they greatly understate the actual rates of overblocking that occur, and therefore cannot be considered as anything more than minimum estimates of the rates of overblocking that occur in all filtering programs.
5. Methods of Obtaining Examples of Erroneously Blocked Web Sites
The plaintiffs assembled a list of several thousand Web sites that they contend were, at the time of the study, likely to have been erroneously blocked by one or more of four major commercial filtering programs: SurfControl Cyber Patrol 6.0.1.47, N2H2 Internet Filtering 2.0, Secure Computing SmartFilter 3.0.0.01, and Websense Enterprise 4.3.0. They compiled this list using a two-step process. First, Benjamin Edelman, an expert witness who testified before us, compiled a list of more than 500,000 URLs and devised a program to feed them through all four filtering programs in order to compile a list of URLs that might have been erroneously blocked by one or more of the programs.
Second, Edelman forwarded subsets of the list that he compiled to librarians and professors of library science whom the plaintiffs had hired to review the blocked sites for suitability in the public library context.
Edelman assembled the list of URLs by compiling Web pages that were blocked by the following categories in the four programs: Cyber Patrol: Adult/Sexually Explicit; N2H2: Adults Only, Nudity, Pornography, and Sex, with "exceptions" engaged in the categories of Education, For Kids, History, Medical, Moderated, and Text/Spoken Only; SmartFilter: Sex, Nudity, Mature, and Extreme; Websense: Adult Content, Nudity, and Sex.
Edelman then assembled a database of Web sites for possible testing. He derived this list by automatically compiling URLs from the Yahoo index of Web sites, taking them from categories from the Yahoo index that differed significantly from the classifications that he had enabled in each of the blocking programs (taking, for example, Web sites from Yahoo's "Government" category). He then expanded this list by entering URLs taken from the Yahoo index into the Google search engine's "related" search function, which provides the user with a list of similar sites. Edelman also included and excluded specific Web sites at the request of the plaintiffs' counsel.
Taking the list of more than 500,000 URLs that he had compiled, Edelman used an automated system that he had developed to test whether particular URLs were blocked by each of the four filtering programs. This testing took place between February and October 2001. He recorded the specific dates on which particular sites were blocked by particular programs, and, using commercial archiving software, archived the contents of the home page of the blocked Web sites (and in some instances the pages linked to from the home page) as it existed when it was blocked. Through this process, Edelman, whose testimony we credit, compiled a list of 6,777 URLs that were blocked by one or more of the four programs.
Because these sites were chosen from categories from the Yahoo directory that were unrelated to the filtering categories that were enabled during the test (i.e., "Government" vs. "Nudity"), he reasoned that they were likely erroneously blocked. As explained in the margin, Edelman repeated his testing and discovered that Cyber Patrol had unblocked most of the pages on the list of 6,777 after he had published the list on his Web site. His records indicate that an employee of SurfControl (the company that produces Cyber Patrol software) accessed his site and presumably checked out the URLs on the list, thus confirming Edelman's judgment that the majority of URLs on the list were erroneously blocked.
Edelman forwarded the list of blocked sites to Dr. Joseph Janes, an Assistant Professor in the Information School of the University of Washington who also testified at trial as an expert witness. Janes reviewed the sites that Edelman compiled to determine whether they are consistent with library collection development, i.e., whether they are sites to which a reference librarian would, consistent with professional standards, direct a patron as a source of information.
Edelman forwarded Janes a list of 6,775 Web sites, almost the entire list of blocked sites that he collected, from which Janes took a random sample of 859 using the SPSS statistical software package. Janes indicated that he chose a sample size of 859 because it would yield a 95% confidence interval of plus or minus 2.5%. Janes recruited a group of 16 reviewers, most of whom were current or former students at the University of Washington's Information School, to help him identify which sites were appropriate for library use. We describe the process that he used in the margin. Due to the inability of a member of Janes's review team to complete the reviewing process, Janes had to cut 157 Web sites out of the sample, but because the Web sites were randomly assigned to reviewers, it is unlikely that these sites differed significantly from the rest of the sample. That left the sample size at 699, which widened the 95% confidence interval to plus or minus 2.8%.
Of the total 699 sites reviewed, Janes's team concluded that 165 of them, or 23.6% of the sample, were not of any value in the library context (i.e., no librarian would, consistent with professional standards, refer a patron to these sites as a source of information). They were unable to find 60 of the Web sites, or 8.6% of the sample. Therefore, they concluded that the remaining 474 Web sites, or 67.8% of the sample, were examples of overblocking with respect to materials that are appropriate sources of information in public libraries.
Applying a 95% confidence interval of plus or minus 2.8%, the study concluded that we can be 95% confident that the actual percentage of sites in the list of 6,775 sites that are appropriate for use in public libraries is somewhere between 65.0% and 70.6%. In other words, we can be 95% certain that the actual number of sites out of the 6,775 that Edelman forwarded to Janes that are appropriate for use in public libraries (under Janes's standard) is somewhere between 4,403 and 4,783.
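The sample percentages and the projected range can be re-derived from the counts reported above. The following sketch is merely an arithmetic check, not part of the record; all inputs (699, 165, 60, 474, 6,775, and the 2.8% margin) are taken from the study as described.

```python
# Recompute the Janes study's derived figures from its raw sample counts.
reviewed = 699
no_value, not_found, appropriate = 165, 60, 474

print(round(100 * no_value / reviewed, 1))     # 23.6 (% of no value)
print(round(100 * not_found / reviewed, 1))    # 8.6  (% not found)
print(round(100 * appropriate / reviewed, 1))  # 67.8 (% overblocked)

# Applying the 67.8% plus-or-minus 2.8% interval to the full list of
# 6,775 sites yields the court's projected range of appropriate sites.
total = 6775
low, high = 0.678 - 0.028, 0.678 + 0.028       # 65.0% and 70.6%
print(int(total * low), int(total * high))     # 4403 4783
```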
The government raised some valid criticisms of Janes's methodology, attacking in particular the fact that, while sites that received two "yes" votes in the first round of voting were determined to be of sufficient interest in a library context to be removed from further analysis, sites receiving one or two "no"
votes were sent to the next round. The government also correctly points out that results of Janes's study can be generalized only to the population of 6,775 sites that Edelman forwarded to Janes.
Even taking these criticisms into account, and discounting Janes's numbers appropriately, we credit Janes's study as confirming that Edelman's set of 6,775 Web sites contains at least a few thousand URLs that were erroneously blocked by one or more of the four filtering programs that he used, whether judged against CIPA's definitions, the filters' own category criteria, or against the standard that the Janes study used. Edelman tested only 500,000 unique URLs out of the 4000 times that many, or two billion, that are estimated to exist in the indexable Web.
Even assuming that Edelman chose the URLs that were most likely to be erroneously blocked by commercial filtering programs, we conclude that many times the number of pages that Edelman identified are erroneously blocked by one or more of the filtering programs that he tested.
Edelman's and Janes's studies provide numerous specific examples of Web pages that were erroneously blocked by one or more filtering programs. The Web pages that were erroneously blocked by one or more of the filtering programs do not fall into any neat patterns; they range widely in subject matter, and it is difficult to tell why they may have been overblocked. The list that Edelman compiled, for example, contains Web pages relating to religion, politics and government, health, careers, education, travel, sports, and many other topics. In the next section, we provide examples from each of these categories.
6. Examples of Erroneously Blocked Web Sites
Several of the erroneously blocked Web sites had content relating to churches, religious orders, religious charities, and religious fellowship organizations. These included the following Web sites: the Knights of Columbus Council 4828, a Catholic men's group associated with St. Patrick's Church in Fallon, Nevada, http://msnhomepages.talkcity.com/SpiritSt/kofc4828, which was blocked by Cyber Patrol in the "Adult/Sexually Explicit"
category; the Agape Church of Searcy, Arkansas, http://www.agapechurch.com, which was blocked by Websense as "Adult Content"; the home page of the Lesbian and Gay Havurah of the Long Beach, California Jewish Community Center, http://www.compupix.com/gay/havurah.htm, which was blocked by N2H2 as "Adults Only, Pornography," by Smartfilter as "Sex," and by Websense as "Sex"; Orphanage Emmanuel, a Christian orphanage in Honduras that houses 225 children, http://home8.inet.tele.dk/rfb_viva, which was blocked by Cyber Patrol in the "Adult/Sexually Explicit" category; Vision Art Online, which sells wooden wall hangings for the home that contain prayers, passages from the Bible, and images of the Star of David, http://www.visionartonline.com, which was blocked in Websense's "Sex" category; and the home page of Tenzin Palmo, a Buddhist nun, which contained a description of her project to build a Buddhist nunnery and international retreat center for women, http://www.tenzinpalmo.com, which was categorized as "Nudity" by N2H2.
Several blocked sites also contained information about governmental entities or specific political candidates, or contained political commentary. These included: the Web site for Kelley Ross, a Libertarian candidate for the California State Assembly, http://www.friesian.com/ross/ca40, which N2H2 blocked as "Nudity"; the Web site for Bob Coughlin, a town selectman in Dedham, Massachusetts, http://www.bobcoughlin.org, which was blocked under N2H2's "Nudity" category; a list of Web sites containing information about government and politics in Adams County, Pennsylvania, http://www.geocities.com/adamscopa, which was blocked by Websense as "Sex"; the Web site for Wisconsin Right to Life, http://www.wrtl.org, which N2H2 blocked as "Nudity"; a Web site that promotes federalism in Uganda, http://federo.com, which N2H2 blocked as "Adults Only, Pornography"; "Fight the Death Penalty in the USA," a Danish Web site dedicated to criticizing the American system of capital punishment, http://www.fdp.dk, which N2H2 blocked as "Pornography"; and "Dumb Laws," a humor Web site that makes fun of outmoded laws, http://www.dumblaws.com, which N2H2 blocked under its "Sex" category.
Erroneously blocked Web sites relating to health issues included the following: a guide to allergies, http://www.x-sitez.com/allergy, which was categorized as "Adults Only, Pornography" by N2H2; a health question and answer site sponsored by Columbia University, http://www.goaskalice.com.columbia.edu, which was blocked as "Sex" by N2H2, and as "Mature" by Smartfilter; the Western Amputee Support Alliance Home Page, http://www.usinter.net/wasa, which was blocked by N2H2 as "Pornography"; the Web site of the Willis-Knighton Cancer Center, a Shreveport, Louisiana cancer treatment facility, http://cancerftr.wkmc.com, which was blocked by Websense under the "Sex" category; and a site dealing with halitosis, http://www.dreamcastle.com/tungs, which was blocked by N2H2 as "Adults, Pornography," by Smartfilter as "Sex," by Cyber Patrol as "Adult/Sexually Explicit," and by Websense as "Adult Content."
The filtering programs also erroneously blocked several Web sites having to do with education and careers. The filtering programs blocked two sites that provide information on home schooling. "HomEduStation the Internet Source for Home Education," http://www.perigee.net/~mcmullen/homedustation/, was categorized by Cyber Patrol as "Adult/Sexually Explicit."
Smartfilter blocked "Apricot: A Web site made by and for home schoolers," http://apricotpie.com, as "Sex." The programs also miscategorized several career-related sites. "Social Work Search," http://www.socialworksearch.com/, is a directory for social workers that Cyber Patrol placed in its "Adult/Sexually Explicit" category. The "Gay and Lesbian Chamber of Southern Nevada," http://www.lambdalv.com, "a forum for the business community to develop relationships within the Las Vegas lesbian, gay, transsexual, and bisexual community," was blocked by N2H2 as "Adults Only, Pornography." A site for aspiring dentists, http://www.vvm.com/~bond/home.htm, was blocked by Cyber Patrol in its "Adult/Sexually Explicit" category.
The filtering programs erroneously blocked many travel Web sites, including: the Web site for the Allen Farmhouse Bed & Breakfast of Alleghany County, North Carolina, http://planet-nc.com/Beth/index.html, which Websense blocked as "Adult Content"; Odysseus Gay Travel, a travel company serving gay men, http://www.odyusa.com, which N2H2 categorized as "Adults Only, Pornography"; Southern Alberta Fly Fishing Outfitters, http://albertaflyfish.com, which N2H2 blocked as "Pornography"; and "Nature and Culture Conscious Travel," a tour operator in Namibia, http://www.trans-namibia-tours.com, which was categorized as "Pornography" by N2H2.
The filtering programs also miscategorized a large number of sports Web sites. These included: a site devoted to Willie O'Ree, the first African-American player in the National Hockey League, http://www.missioncreep.com/mw/oree.html, which Websense blocked under its "Nudity" category; the home page of the Sydney University Australian Football Club, http://www.tek.com.au/suafc, which N2H2 blocked as "Adults Only, Pornography," Smartfilter blocked as "Sex," Cyber Patrol blocked as "Adult/Sexually Explicit," and Websense blocked as "Sex"; and a fan's page devoted to the Toronto Maple Leafs hockey team, http://www.torontomapleleafs.atmypage.com, which N2H2 blocked under the "Pornography" category.
7. Conclusion: The Effectiveness of Filtering Programs

Public libraries have adopted a variety of means of dealing with problems created by the provision of Internet access. The large amount of sexually explicit speech that is freely available on the Internet has, to varying degrees, led to patron complaints about such matters as unsought exposure to offensive material, incidents of staff and patron harassment by individuals viewing sexually explicit content on the Internet, and the use of library computers to access illegal material, such as child pornography.
In some libraries, youthful library patrons have persistently attempted to use the Internet to access hardcore pornography.
Those public libraries that have responded to these problems by using software filters have found such filters to provide a relatively effective means of preventing patrons from accessing sexually explicit material on the Internet. Nonetheless, out of the entire universe of speech on the Internet falling within the filtering products' category definitions, the filters will incorrectly fail to block a substantial amount of speech. Thus, software filters have not completely eliminated the problems that public libraries have sought to address by using the filters, as evidenced by frequent instances of underblocking. Nor is there any quantitative evidence of the relative effectiveness of filters and the alternatives to filters that are also intended to prevent patrons from accessing illegal content on the Internet.
Even more importantly (for this case), although software filters provide a relatively cheap and effective, albeit imperfect, means for public libraries to prevent patrons from accessing speech that falls within the filters' category definitions, we find that commercially available filtering programs erroneously block a huge amount of speech that is protected by the First Amendment. Any currently available filtering product that is reasonably effective in preventing users from accessing content within the filter's category definitions will necessarily block countless thousands of Web pages, the content of which does not match the filtering company's category definitions, much less the legal definitions of obscenity, child pornography, or harmful to minors. Even Finnell, an expert witness for the defendants, found that between 6% and 15% of the blocked Web sites in the public libraries that he analyzed did not contain content that meets even the filtering products' own definitions of sexually explicit content, let alone CIPA's definitions.
This phenomenon occurs for a number of reasons explicated in the more detailed findings of fact supra. These include limitations on filtering companies' ability to: (1) harvest Web pages for review; (2) review and categorize the Web pages that they have harvested; and (3) engage in regular re-review of the Web pages that they have previously reviewed. The primary limitations on filtering companies' ability to harvest Web pages for review are that a substantial majority of pages on the Web are not indexable using the spidering technology that Web search engines use, and that together, search engines have indexed only around half of the Web pages that are theoretically indexable.
The fast rate of growth in the number of Web pages also limits filtering companies' ability to harvest pages for review. These shortcomings necessarily result in significant underblocking.
Several limitations on filtering companies' ability to review and categorize the Web pages that they have harvested also contribute to over- and underblocking. First, automated review processes, even those based on "artificial intelligence," are unable with any consistency to distinguish accurately material that falls within a category definition from material that does not. Moreover, human review of URLs is hampered by filtering companies' limited staff sizes, and by human error or misjudgment. In order to deal with the vast size of the Web and its rapid rates of growth and change, filtering companies engage in several practices that are necessary to reduce underblocking, but inevitably result in overblocking. These include: (1) blocking whole Web sites even when only a small minority of their pages contain material that would fit under one of the filtering company's categories (e.g., blocking the Salon.com site because it contains a sex column); (2) blocking by IP address (because a single IP address may contain many different Web sites and many thousands of pages of heterogeneous content); and (3) blocking loophole sites such as translator sites and cache sites, which archive Web pages that have been removed from the Web by their original publisher.
Finally, filtering companies' failure to engage in regular re-review of Web pages that they have already categorized (or that they have determined do not fall into any category) results in a substantial amount of over- and underblocking. For example, Web publishers change the contents of Web pages frequently. The problem also arises when a Web site goes out of existence and its domain name or IP address is reassigned to a new Web site publisher. In that case, a filtering company's previous categorization of the IP address or domain name would likely be incorrect, potentially resulting in the over- or underblocking of many thousands of pages.
The inaccuracies that result from these limitations of filtering technology are quite substantial. At least tens of thousands of pages of the indexable Web are overblocked by each of the filtering programs evaluated by experts in this case, even when considered against the filtering companies' own category definitions. Many erroneously blocked pages contain content that is completely innocuous for both adults and minors, and that no rational person could conclude matches the filtering companies'
category definitions, such as "pornography" or "sex."
The number of overblocked sites is of course much higher with respect to the definitions of obscenity and child pornography that CIPA employs for adults, since the filtering products' category definitions, such as "sex" and "nudity,"
encompass vast amounts of Web pages that are neither child pornography nor obscene. Thus, the number of pages of constitutionally protected speech blocked by filtering products far exceeds the many thousands of pages that are overblocked by reference to the filtering products' category definitions.
No presently conceivable technology can make the judgments necessary to determine whether a visual depiction fits the legal definitions of obscenity, child pornography, or harmful to minors. Given the state of the art in filtering and image recognition technology, and the rapidly changing and expanding nature of the Web, we find that filtering products' shortcomings will not be solved through a technical solution in the foreseeable future. In sum, filtering products are currently unable to block only visual depictions that are obscene, child pornography, or harmful to minors (or, only content matching a filtering product's category definitions) while simultaneously allowing access to all protected speech (or, all content not matching the blocking product's category definitions). Any software filter that is reasonably effective in blocking access to Web pages that fall within its category definitions will necessarily erroneously block a substantial number of Web pages that do not fall within its category definitions.
2. Analytic Framework for the Opinion: The Centrality of Dole and the Role of the Facial Challenge
Both the plaintiffs and the government agree that, because this case involves a challenge to the constitutionality of the conditions that Congress has set on state actors' receipt of federal funds, the Supreme Court's decision in South Dakota v.
Dole, 483 U.S. 203 (1987), supplies the proper threshold analytic framework. The constitutional source of Congress's spending power is Article I, Sec. 8, cl. 1, which provides that "Congress shall have Power . . . to pay the Debts and provide for the common Defence and general Welfare of the United States." In Dole, the Court upheld the constitutionality of a federal statute requiring the withholding of federal highway funds from any state with a drinking age below 21. Id. at 211-12. In sustaining the provision's constitutionality, Dole articulated four general constitutional limitations on Congress's exercise of the spending power.
First, "the exercise of the spending power must be in pursuit of 'the general welfare.'" Id. at 207. Second, any conditions that Congress sets on states' receipt of federal funds must be sufficiently clear to enable recipients "to exercise their choice knowingly, cognizant of the consequences of their participation." Id. (internal quotation marks and citation omitted). Third, the conditions on the receipt of federal funds must bear some relation to the purpose of the funding program.
Id. And finally, "other constitutional provisions may provide an independent bar to the conditional grant of federal funds." Id.
at 208. In particular, the spending power "may not be used to induce the States to engage in activities that would themselves be unconstitutional. Thus, for example, a grant of federal funds conditioned on invidiously discriminatory state action or the infliction of cruel and unusual punishment would be an illegitimate exercise of the Congress' broad spending power."
Id. at 210.
Plaintiffs do not contend that CIPA runs afoul of the first three limitations. However, they do allege that CIPA is unconstitutional under the fourth prong of Dole because it will induce public libraries to violate the First Amendment.
Plaintiffs therefore submit that the First Amendment "provide[s]