NTIA censorware comment

[ Archived at http://sethf.com/freespeech/censorware/essays/ntia.php ]

Reply to NTIA request for censorware comments

Comment submitted in reply to the Request for Comment on the Effectiveness of Internet Protection Measures and Safety Policies
by Seth Finkelstein (sethf[at-sign]sethf.com) August 27 2002

Dear NTIA:

I would like to reply to the following questions raised in the "Evaluation of Available Technology Protection Measures":

Let me state my background. To be immodest, I have probably extracted more censorware blacklists than anyone in the world - CyberSitter, I-Gear, X-Stop, SurfWatch, CyberPatrol, Websense, and more. I was responsible for the very first exposé of censorware. I co-founded the Censorware Project (though I am no longer a member) and provided essentially all of the decrypted blacklists used in its reports. For all my work, in March 2001, I was honored as follows:

ELECTRONIC FRONTIER FOUNDATION (EFF) PIONEER AWARDS HONOR INTERNET LUMINARIES
Ennis, Finkelstein, and Perrin Presented Awards at EFF's Tenth Annual Pioneer Awards Ceremony
...
Seth Finkelstein - Anti-censorship activist and programmer Seth Finkelstein spent hundreds of unpaid and uncredited hours over several years to decrypt and expose to public scrutiny the secret contents of the most popular censorware blacklists. Seth has been active in raising the level of public awareness about the dangers that Internet content blocking software and rating/labeling schemes pose to freedom of communication. His work has armed many with information of great assistance in the fight against government mandated use of these systems.

Censorware Operation

4. Please explain how the technology protection products block or filter prohibited content (such as yes lists, (appropriate content); no lists, (prohibited content), human review, technology review based on phrase or image, or other method.) Explain whether these methods successfully block or filter prohibited online content and whether one method is more effective than another.

I refer the reader to the White Paper: Blacklisting Bytes
http://www.eff.org/Censorship/Censorware/20010306_eff_nrc_paper1.html
Co-authors: Seth Finkelstein, Consulting Programmer; Lee Tien, Senior Staff Attorney, EFF.

This was submitted to the National Research Council (NRC) as part of their censorware study, and is also available at the URL http://www7.nationalacademies.org/itas/whitepaper_1.html .

Censorware Blacklists

5. Are there obstacles to or difficulties in obtaining lists of blocked or filtered sites or the specific criteria used by technology companies to deny or permit access to certain web sites? Explain.

To give a short answer: Yes.

To give a long answer:

Virtually every censorware company considers its blacklist to be proprietary information, protected both by technical (encryption) and legal means. Calling it "obstacles or difficulties" borders on humorous understatement. In detail:

Encryption and Legal Matters

Encryption as a barrier

The encryption used by different censorware companies varies in its strength. However, it is far beyond the ability of the vast majority of the population to decode even the weakest encryption. It requires some technical expertise. And greater strength of encryption requires greater skill to decrypt. Relatively few people have the ability to do this sort of technical work.

But this work carries extensive legal risk. The blacklists are huge. They can run to literally tens or hundreds of thousands of items. I'm only one person; I could never hope to examine more than a fraction. No single person could examine it all in a reasonable time.

Legal risks to overcoming encryption as a barrier

But if I were to publish the entire blacklists, for other people to evaluate, or alternatively, tools to allow them to extract the blacklists themselves, I'd be risking a lawsuit under various legal causes. I could be sued under various theories such as "Copyright Infringement", "Breach of Licensing Agreement", "Theft of Trade Secrets", etc. And publishing decryption tools is especially problematic because of the Digital Millennium Copyright Act (DMCA).

This risk of a lawsuit is not mere conjecture. In March 2000, CyberPatrol sued two programmers (Matthew Skala and Eddy Jansson), for publishing tools to allow decryption of CyberPatrol's censorware blacklist. In fact, that case, Microsystems v. Scandinavia Online, et al., took place in court just downtown from me. Note I had done similar work years before, but was chilled from publishing it all because of the legal implications.

By the way, there is a certain myth regarding the DMCA and censorware blacklists. The myth traces back to some inaccurate news coverage of the exemption granted by the Library of Congress regarding DMCA prohibitions. That exemption was mistakenly reported in some instances as a general permission for censorware blacklist decryption. No. In its rulemaking process, the Library of Congress determined investigating censorware blacklists to be one of two specific exemptions granted regarding one part of the DMCA ("circumvention"). My work played a role in the granting of this exemption.

As with all things law and legal, this is a fiendishly complicated subject. Does the exemption only technically apply to the actual process of investigating a censorware blacklist (i.e., "circumvention" itself, provision "1201(a)(1)")? There's another part of the DMCA ("1201(a)(2)") which deals with prohibitions against "manufacture, import, offer to the public, provide, or otherwise traffic in any technology, product, service, device, component, or part thereof, that -" roughly, are for circumvention. Yet a third section ("1201(b)(1)") contains a virtually identical prohibition concerning a technological measure that effectively protects a right of a copyright owner. Whether the anticensorware circumvention exemption is extremely narrow, or implies some broader protection for making tools to aid in investigating censorware, is the subject of possible future civil-liberties litigation.

But the exemption does not apply to any other law which affects investigating censorware blacklists. For example, one censorware company, N2H2, has stated:

"The U.S. Copyright Office issued a final rule interpreting the provisions of the Digital Millennium Copyright Act, or DMCA, that prohibits the circumvention of technological copyright protection mechanisms. The final rule took effect on October 28, 2000, and created an exemption to the DMCA anti-circumvention provisions for compilations consisting of lists of Web sites blocked by filtering software applications. The consequence of the final rule is that lists of Web sites blocked by filtering software do not receive extra protection under the DMCA, and technological measures used to prevent access to such lists may be circumvented without violating the new "anti-circumvention" provisions Copyright Act. N2H2 considers its lists of blocked Web sites to be proprietary, valuable information. However, N2H2 does not believe that the final rule will affect the value of its lists of blocked Web sites. N2H2 regards its lists as trade secrets and protects their confidentiality primarily through physical security controls and contractual non-disclosure provisions. The final rule simply exempts lists of blocked Web sites from the new copyright law protections available under the DMCA, so that the level of copyright law protection available for such lists is the same as it was before the DMCA was enacted in 1998."

N2H2 can also be extraordinarily legally aggressive. During the CIPA trial, it sought to prevent even expert-witness testimony from being public on "trade secret" grounds - "They say that certain things we talk about them having blocked will show the nature of their software, ..."

There's current litigation (Edelman vs. N2H2) against N2H2, seeking declaratory judgment concerning various censorware decryption legal issues. There's a great quote by Ben Edelman: "I don't want to go to jail. I want to go to law school."

I don't want to go to jail either.

So, the end result is a dilemma, where despite my expertise, legal risk prevents me from publishing either full censorware blacklists, or tools for others to extract the censorware blacklists.

The Implications of Legal and Technical Prohibitions

The effects of these obstacles and barriers are profound. They create a situation vastly unbalanced in favor of the promotion of censorware. There is no cost for a censorware-maker to engage in unrealistic claims of human review, or misleading promises of accuracy. It is profitable for them to do so. But the hard and unrewarding work of rebutting such PR claims, of showing how censorware in fact functions, is constantly running a gauntlet of such prohibitions. Critical examination of censorware, by some of those both most skeptical, and with the necessary expertise to do extensive analysis, is stifled.

For example, consider the work I did in exposing an undocumented category of N2H2's censorware, a LOOPHOLE. This was
http://sethf.com/anticensorware/bess/loophole.php

BESS's Secret LOOPHOLE: (censorware vs. privacy & anonymity) - a secret category of BESS (N2H2), and more about why censorware must blacklist privacy, anonymity, and translators

and the various ideas I'd been pioneering here finally made it into the CIPA court decision, as a reason for striking down that law.

"Another technique that filtering companies use in order to deal with a structural feature of the Internet is blocking the root level URLs of so-called "loophole" Web sites. These are Web sites that provide access to a particular Web page, but display in the user's browser a URL that is different from the URL with which the particular page is usually associated. Because of this feature, they provide a "loophole" that can be used to get around filtering software, i.e., they display a URL that is different from the one that appears on the filtering company's control list. "Loophole" Web sites include caches of Web pages that have been removed from their original location, "anonymizer" sites, and translation sites."

While NTIA may not be concerned with Constitutionality, this type of architectural analysis, concerning how censorware actually functions, provides key information in policymaking. If people are hindered in finding out what censorware really blocks and specifically how it functions, if the claims of marketing-types and product-peddlers are practically privileged from being seriously technically challenged, then vital information ends up suppressed.
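
The "loophole" mechanism described in the court's passage above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the host names, the `is_blocked` helper, and the idea of a control list keyed by host name are hypothetical, and this is not the matching logic of any actual censorware product. It only shows why a list keyed on the displayed URL misses a blacklisted page fetched through a translator or anonymizer, and why the vendor's remedy is to block the loophole site at its root.

```python
from urllib.parse import urlsplit

# Hypothetical control list keyed by host name (all names are made up).
BLACKLIST = {"blocked.example"}


def is_blocked(url, blacklist):
    """Return True if the URL's host, or any parent domain of it, is listed."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # Check the full host and every parent suffix, e.g. a.b.c, b.c, c.
    return any(".".join(labels[i:]) in blacklist for i in range(len(labels)))


# The page requested directly: the displayed host is on the list.
direct = "http://blocked.example/page.html"

# The same page requested through a hypothetical translation site:
# the displayed host is the translator's, not the blacklisted one.
via_loophole = (
    "http://translator.example/translate?u=http://blocked.example/page.html"
)

print(is_blocked(direct, BLACKLIST))        # True
print(is_blocked(via_loophole, BLACKLIST))  # False: the loophole works

# The remedy described in the decision: block the loophole site's root,
# which blocks everything reachable through it, whatever the content.
root_blocked = BLACKLIST | {"translator.example"}
print(is_blocked(via_loophole, root_blocked))                  # True
print(is_blocked("http://translator.example/", root_blocked))  # True
```

The last line is the policy-relevant point: once the root is listed, the translation site itself is blocked for every use, not only for reaching blacklisted pages.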

The View From The Trenches

One of my favorite passages about the CyberPatrol lawsuit is the following: (emphasis added)

"Of course I was disappointed by this state of affairs. When we published the essay I didn't expect a lawsuit, but I had also thought, "Well, if there is a lawsuit it won't be a problem, because there are organizations that take care of things like that." I fondly imagined that in case of legal silliness, someone would just step in and say "We'll take it from here." What I found out was that those organizations, through no fault of their own, were able to give me a lot of sympathy and not enough of anything else, particularly money, to bring my personal risk of tragic consequences down to an acceptable level, despite, incredibly, the fact that what I had done was legal. Ultimately, I couldn't rely on anybody to deal with my problems but myself.
Some people learn that lesson a bit less impressively than I had to."

One thing I've found is that discussion of the chilling effects of facing a lawsuit simply sails over too many people's heads. It is not within their experience. So the material can come across as whining or ranting or petty bickering. The typical reader has experience with hearing complaints; they do that every day. But they have no such experience of living with severe potential legal liability; those situations are very rare.

It's a bit like the linguistic effect where a diagnosis such as "sickle-cell anemia" is heard by some people as "sick-as-hell anemia". The words "sickle" and "cell" aren't in their vocabularies, or at least are very rare, but words such as "sick" and "hell" are common everyday terms. Here, more metaphorically, "facing large fine or jail" tends to be taken as "making large whine or wail". Because the experience to understand it, just isn't within the average person's day.

Censorware decryption is difficult, legally risky work. There's a famous quote:

"I must say that, as a litigant, I should dread a lawsuit beyond almost anything short of sickness and death."
-- Judge Learned Hand, from "The Deficiencies of Trials to Reach the Heart of the Matter", in 3 "Lectures On Legal Topics" 89, 105 (1926), quoted in Fred R. Shapiro, "The Oxford Dictionary Of American Legal Quotations" 304 (1993).

I find it immensely stressful to be at risk for such volunteer work. I don't get paid for it. It's not my job. I did it just because I thought it would make a difference. For many years, I did this work virtually anonymously. Even after winning the EFF Pioneer Award, I was still very circumspect about details. But there are all sorts of pitfalls. For example, Michael Sims (former Censorware Project webmaster and now a journalist working for the website Slashdot), went, in the public statement of Censorware Project, (n.b.: I didn't write this!) "flipping out" on Censorware Project, destroyed the original Censorware Project web site, then used the hijacked domain "to confuse people and divert traffic away". As part of this "flipping out", he later broke the legal confidence that Censorware Project attorney James Tyre had placed in him, and publicized to the world, including all censorware companies, an internal Censorware Project message containing every detail of every decryption I'd done at the time. Every name, every date, every program. And this was done right at the start of the CIPA trial, when the information would be most legally harmful to me, and most helpful to the censorware companies if they desired to sue me (it's a great message praising me, and I have it on my site now, but I sure didn't want it being effectively told to censorware companies at approximately the same time N2H2 was taking legal actions against even court expert-witness testimony). In combination with N2H2's legal aggressiveness, the skyrocketing legal risk derailed all the anticensorware reports I had planned to release to coincide with the CIPA trial.

Now, when I try to explain how destructive Michael Sims was here, this is the point where people tend to roll their eyes and mutter about personal/petty-bickering/flame-war, etc. But when a former co-worker makes available confidential legal information which can be used by companies which might sue you, well, the fact that they've done it for "personal" reasons doesn't minimize the damage. It's one of those "obstacles to or difficulties", arising from the legal risk of the work, that people formerly trusted might breach ethics for revenge. It's not an obvious implication, but it is very real, and dramatically shows the environment which hinders censorware investigation.

I don't like playing Russian roulette with legal risk. I fear eventually there will be a lawsuit with my name on it. This is indeed an obstacle and difficulty.

Censorware Blacklist Updating

6. Do technology companies readily add or delete specific web sites from their blocked lists upon request? Please explain your answer.

I have a saying: "Alacrity varies with publicity". But sometimes not even then. The best evidence I can offer is direct from the CIPA decision, discussing an expert-witness report:

16. In October 2001, Edelman published the results of his initial testing on his Web site. In February and March 2002 he repeated his testing of the 6,777 URLs originally found to be blocked by at least one of the blocking products, in order to determine whether and to what extent the blocking product vendors had corrected the mistakes that he publicized. Of those URLs blocked by N2H2 in the October 2001 testing, 55.10% remained blocked when tested by Edelman in March 2002. Of those URLs blocked by Websense in the October 2001 testing, 76.28% remained blocked when tested by Edelman in February 2002. Of those URLs blocked by SurfControl's Cyber Patrol product, only 7.16% remained blocked, i.e., Cyber Patrol had unblocked almost 93% of the Web pages originally blocked. Because the results posted to his Web site were accessed by an employee of SurfControl (as evidenced by Edelman's records of who was accessing his Web site), we infer that Cyber Patrol had determined that 93% of all 6,777 pages, or 6,302 Web pages, were originally wrongly blocked by the product.
[Referencing (the above is a footnote, this is the main passage)]
Because these sites were chosen from categories from the Yahoo directory that were unrelated to the filtering categories that were enabled during the test (i.e., "Government" vs. "Nudity"), he reasoned that they were likely erroneously blocked. As explained in the margin, Edelman repeated his testing and discovered that Cyber Patrol had unblocked most of the pages on the list of 6,777 after he had published the list on his Web site. His records indicate that an employee of SurfControl (the company that produces Cyber Patrol software) accessed his site and presumably checked out the URLs on the list, thus confirming Edelman's judgment that the majority of URLs on the list were erroneously blocked.(16)

Even with such a high-profile expert-witness report available, where a court case was underway, some censorware companies seemed not to care. How much more important could this evidence be to them? And yet, they didn't bother to fix their lists. What more can one say?
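
The retest figures in the passage quoted above can be turned around to show how much each vendor actually corrected. The sketch below is only arithmetic on the percentages given in the decision; the variable names are mine, and note one assumption the numbers force: each percentage is relative to that product's own October 2001 blocks, whose counts the passage does not give, so only rates (not page counts) can be derived per vendor. The one count that can be reproduced is the court's own inference of 93% of all 6,777 pages.

```python
from decimal import Decimal

# Total URLs in Edelman's published list, per the quoted passage.
TOTAL_URLS = 6777

# Percent of each product's October 2001 blocks still blocked on retest,
# per the quoted passage.
still_blocked_pct = {
    "N2H2": Decimal("55.10"),
    "Websense": Decimal("76.28"),
    "Cyber Patrol": Decimal("7.16"),
}

# Share of each product's blocks that had been removed by the retest.
corrected_pct = {
    name: Decimal("100") - pct for name, pct in still_blocked_pct.items()
}

for name, pct in corrected_pct.items():
    print(f"{name}: {pct}% unblocked on retest")

# The court's inference: 93% of all 6,777 pages, truncated to whole pages.
court_estimate = int(Decimal("0.93") * TOTAL_URLS)
print(court_estimate)  # 6302
```

On these figures, after publication of the list, Cyber Patrol removed about 93% of its blocks, N2H2 under half, and Websense under a quarter.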

Conclusion

In sum, secret blacklists made by unaccountable private parties, who can and will take chilling legal action against people who investigate them, should not play any part in any public mandate.