Infringement and Enforcement: Three Examples

A user comes across the URL http://www.scofflaw.example.com/some/path, which points to a web site containing infringing material. The user decides to check it out.

Approach 1: Host Name Blocking

Behind the scenes, web browsing software translates the host name ("www.scofflaw.example.com") into an IP address (e.g., 192.1.1.34), a process known as host name resolution. The IAP could configure the local Domain Name System (DNS) not to translate the names of infringing sites. When the customer software attempts the resolution, the IAP DNS server would respond "no such host", "server error", "administratively prohibited", or some variant.
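The blocking behavior described above can be sketched in a few lines. This is a minimal illustration, not a real DNS server: the address table, the blocklist, and the function name resolve are all assumptions made for the example.

```python
from typing import Dict, Optional, Set

# Hypothetical data: a tiny address table standing in for the IAP's DNS data.
ADDRESS_TABLE: Dict[str, str] = {
    "www.scofflaw.example.com": "192.1.1.34",
    "www.example.org": "192.0.2.10",
}

# Hypothetical list of host names the IAP has been asked not to translate.
BLOCKED_NAMES: Set[str] = {"www.scofflaw.example.com"}


def resolve(host_name: str) -> Optional[str]:
    """Return an IP address, or None to stand for a "no such host" reply."""
    name = host_name.lower().rstrip(".")
    if name in BLOCKED_NAMES:
        return None  # the IAP server declines to translate a listed name
    return ADDRESS_TABLE.get(name)
```

In this sketch, a lookup of "www.scofflaw.example.com" yields no address, while other names resolve as usual.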

This approach is easily defeated in a number of ways. Although an IAP typically instructs the customer to configure the software to point to one or more IAP-controlled DNS servers, there is nothing preventing the customer from specifying other DNS servers. Such servers could be anywhere in the world, perhaps even under the control of the infringing site. Since the algorithms under which DNS operates generally assume that every visible DNS server can find the answer to any given DNS query, detouring to a different DNS server is no problem.

An IAP could prevent the user's DNS queries from leaving the IAP-controlled facilities, in effect forcing the customer to use the IAP-controlled DNS servers. This approach is undesirable for at least two reasons. First, it places providers that impose the restriction at a competitive disadvantage, because not all IAPs will do so. Second, such a move is hazardous to the IAP's own DNS operation, adding a significant new level of complexity.

And even if an IAP forces all DNS queries to be done via IAP-controlled DNS servers and then blocks resolution of infringing site names, the user still reaches the site by directly specifying the IP address rather than host name in the URL:

http://192.1.1.34/some/path

as opposed to

http://www.scofflaw.example.com/some/path.

In this manner, the user easily sidesteps controls placed on the translation of host names.
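The reason this works can be shown with a short sketch: a URL that already contains an IP address never triggers host name resolution, so a control placed on resolution is never consulted. The function name needs_dns_lookup is an assumption made for this example.

```python
import ipaddress
from urllib.parse import urlsplit


def needs_dns_lookup(url: str) -> bool:
    """Return True if the host part of the URL is a name that must be resolved."""
    host = urlsplit(url).hostname or ""
    try:
        ipaddress.ip_address(host)  # succeeds only for a literal IP address
    except ValueError:
        return True  # not an address, so the software must ask DNS
    return False  # already an address; DNS is bypassed entirely
```

Under this sketch, the host name form of the URL requires a lookup and the numeric form does not.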

Approach 2: Packet Filtering

Individual filtering of packets based on their content would be impossible. Because data is broken into non-contiguous blocks, it is not possible to determine the content of any single packet while it is in transit. Only once the packets have been reassembled on the end user's computer can their content be understood. As a result, packets must be filtered according to their origin, not their content.

The IAP may attempt to block access to infringing sites through adjustments to IP routers. Packets move on the Internet according to IP addresses. IP routers direct this traffic. The filtering rules for these devices could be programmed to selectively reject packets based on various criteria like source or destination IP address. There are, however, significant problems with this approach.

A given server can have a large number of IP addresses assigned to it, all of which are easy to change. The IAP would have to constantly consult DNS for changes in the IP address list for an infringing site, and then revise and re-install the filtering rules on the routers to account for any changes.
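The maintenance burden can be illustrated with a simplified model of a destination-address filter. This is a sketch under stated assumptions, not router configuration: the class DestinationFilter and the address 192.1.1.99 are hypothetical.

```python
import ipaddress
from typing import Iterable


class DestinationFilter:
    """Simplified model of router rules that reject listed destination addresses."""

    def __init__(self) -> None:
        self.rules = []

    def install(self, addresses: Iterable[str]) -> None:
        # Re-installing replaces the whole rule list, as a rule update would.
        self.rules = [ipaddress.ip_network(a) for a in addresses]

    def permits(self, destination: str) -> bool:
        address = ipaddress.ip_address(destination)
        return not any(address in rule for rule in self.rules)


router = DestinationFilter()
router.install(["192.1.1.34"])
blocked_before_move = not router.permits("192.1.1.34")

# Hypothetical: the site is renumbered to 192.1.1.99. The stale rule no
# longer matches, so the site is reachable until the rules are revised.
reachable_after_move = router.permits("192.1.1.99")
```

The filter is only as current as its last installation; every address change at the site reopens access until the IAP notices and updates the rules.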

A far larger problem, however, is the sheer number of filtering rules which would have to be in place. Imagine hundreds or thousands of infringing sites being listed for blocking at any given time, and imagine several possible IP addresses for many of them. Even if the attempt were made, router performance would be degraded significantly, becoming much slower at routing packets through the network. Unable to keep up, routers would begin to discard other packets (which have nothing to do with infringing sites). Even a relatively small number of lost packets can begin to cause serious problems in a network. As packets are lost, the originating hosts resend them, consuming more bandwidth and router processing.
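To make the scale concrete, the following sketch multiplies out the rule count. It assumes a router that compares each packet against its rules one after another, which is a simplification, and every figure in it is hypothetical rather than a measurement.

```python
def rule_count(sites: int, addresses_per_site: int) -> int:
    """One filtering rule per address of each listed site."""
    return sites * addresses_per_site


def worst_case_comparisons_per_second(rules: int, packets_per_second: int) -> int:
    """Assumes a linear scan: a permitted packet is checked against every rule."""
    return rules * packets_per_second


# Hypothetical inputs chosen only to illustrate the arithmetic.
rules = rule_count(sites=1000, addresses_per_site=3)
load = worst_case_comparisons_per_second(rules, packets_per_second=10_000)
```

Note that under this assumption the cost falls hardest on ordinary traffic, since packets that match no rule are the ones compared against the entire list.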

The added filtering requirements not only degrade performance, they diminish security. An IAP uses router filters to protect its customers and its own infrastructure from accidental or intentional intrusions. The added complexity of the filtering rules (and the large number of new rules they represent) creates a threat to the IAP's security plans. Given that the rules would have to change quite frequently, the chance of accidentally misconfiguring a router increases enormously.

Approach 3: Blocking HTTP Proxy Servers

Even if an IAP accepted the administrative, security, and performance costs of packet filtering for some number of infringing sites, access control would be ineffective. The HTTP protocol and conventions are extremely flexible and robust in overcoming obstacles, even beyond the original design intent of HTTP.

If a customer wanted to access web pages of a site, "www.scofflaw.example.com", and found that the IAP was filtering packets to that site, the user could simply find an HTTP proxy server outside the control of the IAP and "bounce" requests through it. A proxy server performs web browsing requests for a third party. Normally, a web browser sends its requests directly to the desired site. All web browsers can also be configured to use a proxy server. When so configured, each user request is bundled up and sent to the named proxy server. The proxy server unbundles the request, performs the requested access, and returns the results to the user. The process is analogous to a hotel concierge obtaining theater tickets for a hotel guest. Proxy servers are often used to facilitate IAP or organization security arrangements or to improve performance.
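The difference between a direct request and a proxied one shows up in the first line of the HTTP request. In the sketch below, the function names are assumptions made for illustration; the request-line forms follow the HTTP/1.0 convention that a request sent to a proxy carries the full URL.

```python
from urllib.parse import urlsplit


def direct_request_line(url: str) -> str:
    """Request line sent straight to the origin server: path only."""
    path = urlsplit(url).path or "/"
    return f"GET {path} HTTP/1.0"


def proxy_request_line(url: str) -> str:
    """Request line sent to a proxy: the full URL, so the proxy knows where to go."""
    return f"GET {url} HTTP/1.0"
```

The proxy reads the full URL, opens its own connection to the named site, and relays the response back to the user.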

Suppose the proxy server is a host named "proxy.sbwu.edu". As far as router packet filters were concerned, all IP packets would be traveling between the IAP customer and "proxy.sbwu.edu". Since the HTTP proxy host is outside the network facilities directly controlled by the IAP, the fact that packets traveled between "proxy.sbwu.edu" and "scofflaw.example.com" would be completely invisible to the IAP and the IAP's routers.
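What the IAP's routers can and cannot see in this arrangement is sketched below. The customer and proxy addresses are hypothetical placeholders, and the filter is a simplified stand-in for real router rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    source: str
    destination: str


BLOCKED_DESTINATIONS = {"192.1.1.34"}  # address of the infringing site
CUSTOMER = "198.51.100.7"              # hypothetical customer address
PROXY = "203.0.113.9"                  # hypothetical address of proxy.sbwu.edu


def iap_router_permits(packet: Packet) -> bool:
    """The router sees only the addresses in the IP header."""
    return packet.destination not in BLOCKED_DESTINATIONS


direct = Packet(source=CUSTOMER, destination="192.1.1.34")
via_proxy = Packet(source=CUSTOMER, destination=PROXY)
```

The direct packet is rejected, but the proxied one passes: from the router's point of view, the customer is simply talking to the proxy host.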

If a proxy server allows itself to be used to access infringing sites, the proxy server could also be blocked. The list of "renegade" proxy servers could be added to the list of infringing sites, and whatever techniques are used to block direct access to infringing sites could also be used to block access to the listed proxy servers.

Thousands of HTTP proxy servers exist today, both in the U.S. and beyond. Many of those proxy servers are unrestricted with respect to who uses them. New servers come and go every day. The sheer number and turnover of these servers magnify the problems described earlier.

Blocking proxy servers also fails the test of reasonableness. Many proxy servers are operated on the very same machines that sites use to host normal web servers. Blocking access to the HTTP proxy server would effectively require blocking access to the entire site. HTTP proxy service would likely be only a minor part of the site's traffic, and within proxy service, relays to infringing sites would be an even smaller percentage.

Would it be possible to look for HTTP proxy requests in packet data? If the IAP could examine proxy requests, then the ultimate destination for the request would be easily seen, and the request could be blocked. Because HTTP is an application-level protocol, all of the data for the request and the response are carried in the payload data in the IP packets. Equipment and software used to move packets in the Internet today examine only the TCP/IP header bytes and leave the payload data unexamined and unaltered (just as a delivery service uses address information without understanding the package contents).

Packet data does not arrive as a single block. Packets can arrive in a different order than they were sent, some packets may be lost, and only a certain number of packets will be sent before acknowledgments are received for earlier packets. All of this makes the job of the computer analyzing the payload data extremely difficult. This computer must gather packets, reassemble payload data in the proper sequence, and then do any application-specific scan. The algorithmic details and performance costs make this solution infeasible, both in terms of economics and efficiency.
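The gather, reassemble, and scan steps can be sketched as follows. This is a deliberately simplified model: segments are keyed by byte offset rather than handled as real TCP sequence numbers, and the function names reassemble and proxied_host are assumptions made for the example.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def reassemble(segments: Dict[int, bytes]) -> bytes:
    """Join payload segments keyed by byte offset, stopping at the first gap."""
    data = bytearray()
    while True:
        chunk = segments.get(len(data))
        if not chunk:
            break  # the next segment is missing or has not arrived yet
        data += chunk
    return bytes(data)


def proxied_host(payload: bytes) -> Optional[str]:
    """Return the host named in a proxy-style request line, if one is present."""
    line = payload.split(b"\r\n", 1)[0].decode("ascii", errors="replace")
    parts = line.split(" ")
    if len(parts) == 3 and parts[1].lower().startswith("http://"):
        return urlsplit(parts[1]).hostname
    return None


# Segments recorded in arrival order, which differs from sending order.
arrived = {
    4: b"http://www.scofflaw.example.com/some/path HTTP/1.0\r\n",
    0: b"GET ",
}
destination = proxied_host(reassemble(arrived))
```

Neither segment reveals the destination on its own; the scan succeeds only after the monitoring computer has buffered and ordered the payload, and it must do so for every connection it watches.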

Even if the situation were otherwise, the approach could be easily thwarted by masking the request to make it look like a protocol other than HTTP. Suppose, for instance, a user is still trying to reach "www.scofflaw.example.com", this time using a proxy server. The user has reason to believe, however, that the proxy server is blocked too. The persistent user masks the HTTP proxy request with the Network News Transfer Protocol (NNTP), used for reading netnews on the Internet. Like HTTP, NNTP is a text-based application protocol. A cooperating user and proxy server would pretend to be performing an NNTP conversation. At some mutually understood point in the conversation, the user's software would send the proxy request. The proxy server would interpret and act on the request, as usual, and return the results to the user.

The NNTP example illustrates two new requirements for IAPs attempting to do the blocking. First, they must constantly be on the lookout for new points at which the conversation turns into a proxy request. This would probably vary from site to site. It would also vary for a given site as more and more IAPs became aware of a given technique and the site switched to new techniques. For example, at one site the technique might be to switch right after seeing a command containing the string "XYZZY". At another site, it might be to switch after 200 bytes had been transmitted.

Second, the IAP must not merely examine the beginning of a data exchange to decide if it is an HTTP proxy request but must monitor the entire conversation. Since more packets must be examined and more patterns must be sought, this increases the cost and performance burden for the IAP.
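Both requirements can be seen in a small sketch of the two switching conventions mentioned above. Everything here is hypothetical: the function names, the sample conversation, and the conventions themselves, which a cooperating user and proxy server could change at will.

```python
from typing import Optional


def looks_like_http_at_start(conversation: bytes) -> bool:
    """A check that examines only the beginning of the data exchange."""
    return conversation.startswith((b"GET ", b"POST ", b"HEAD "))


def request_after_marker(conversation: bytes, marker: bytes = b"XYZZY") -> Optional[bytes]:
    """Convention 1: the proxy request follows the line containing the marker."""
    index = conversation.find(marker)
    if index < 0:
        return None
    line_end = conversation.find(b"\r\n", index)
    if line_end < 0:
        return None
    return conversation[line_end + 2:]


def request_after_offset(conversation: bytes, offset: int = 200) -> Optional[bytes]:
    """Convention 2: the proxy request begins after a fixed number of bytes."""
    return conversation[offset:] if len(conversation) > offset else None


# Illustrative conversation that opens like NNTP and then switches.
masked = (
    b"MODE READER\r\n"
    b"XYZZY\r\n"
    b"GET http://www.scofflaw.example.com/some/path HTTP/1.0\r\n"
)
hidden_request = request_after_marker(masked)
```

A check on the opening bytes sees nothing resembling HTTP. Finding the request requires knowing which convention is in use and reading far enough into the conversation to reach the switch point.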

While the countermeasures described above may appear to be the province of only a small group of highly sophisticated users, this is really not the case. The know-how necessary to evade access controls is widely available; ironically, the Internet itself makes much of this information available. And while those who seek to evade controls have a single target, the IAP operates in a one-to-many environment, with thousands of users devising tens of thousands of new approaches to beat the system. In this scenario, the point of diminishing returns is quickly reached. No approach is foolproof. Rather, as this discussion suggests, an ounce of prevention is worth a pound of cure: once content is released unprotected to the Internet, it is too late for prevention. In attempting to retrofit a solution, the very steps taken to protect digital content make that content less convenient, more expensive and, in the end, less attractive. Even that which is technically feasible becomes economically unreasonable.

Conclusion

Clearly, the solution to Internet copyright protection lies beyond the internal architecture of the Internet. As this section suggests, effective copyright protection cannot be relegated to the point where content is on the transmission path of the Internet. Instead, the only effective points of control are the point of transmission and the point of reception. This reality places the responsibility of "wrapping" or otherwise placing digital "identifiers" onto or around valuable intellectual property with the copyright holder and content provider. Protecting content in this manner will discourage infringement "initiators" and provide IAPs with the digital tools that will make it technologically feasible and economically reasonable to deal with unauthorized reproductions.
