Development of Functions to Prevent Secondary Use of Digitized Materials for Digital Libraries

Junji Nakata, Taminori Tomita, Hiroshi Kinukawa
Systems Development Laboratory, Hitachi, Ltd.
1099 Ohzenji, Asao, Kawasaki, 215 Japan
Phone: +81-44-966-9111, Fax: +81-44-966-1796
E-Mail: j-nakata@sdl.hitachi.co.jp,
t-tomita@sdl.hitachi.co.jp,
kinu@sdl.hitachi.co.jp

Tsutomu Mizuno
Software Development Center, Hitachi, Ltd.
549-6 Shinano-cho, Totsuka, Yokohama 244, Japan
Phone: +81-45-824-2311, Fax: +81-45-826-8347
E-Mail: mizun_ts@soft.hitachi.co.jp

Abstract

A major issue yet to be resolved in the operation of digital libraries is the problem of copyright violation. One reason for the problem not having been solved is that the definition of "solving the problem" remains unclear. This paper hypothesizes that the problem can be solved by the establishment of a lending and return system for electronic books.
One prerequisite for such a system is a way to prevent the redistribution of book contents and to set expiration dates. A function to prevent secondary use of material read on a Web browser has already been developed in a digital library system that uses the World Wide Web (WWW). This function prevents secondary use of the material without reducing the ease of access offered by Web browsers, the most commonly used tools.
However, as in other systems, the user controls both access to the material and the means by which it is read, so the material is not fully protected against attacks by ill-intentioned users.
It is likely that this technology will develop in two directions. The first is the development of technology to protect material from ill-intentioned users. The second is the development of technology to protect material and users from oversights that could be made by well-intentioned users.

Keywords:

Secondary Use, Digital Library, World Wide Web, Web Browser

1. Introduction

In recent years, as use of the Internet has become more widespread, the establishment of digital libraries has come closer to reality in technical terms. However, one of the basic issues relating to digital libraries, that of copyright, is as yet unsolved [1], [2]. There are still no clear criteria for deciding when problems related to copyright have been solved. Therefore, when deliberating technical solutions to the problem of copyright in digital libraries, we must first establish technology development goals.

2. Goals for Technology Development

It is generally believed that there is a copyright problem in digital libraries when the following two conditions are satisfied.
  1. The copyright of the material in question is still valid.
  2. The electronic material has been distributed through a network.

If the copyright for the material has expired, the most important issue for a digital library is who pays for the material to be converted into electronic form. If the material is not to be distributed via a network, the potential for large-scale distribution from the digital library is greatly reduced and the importance of the copyright issue is lessened.

Normally, to distribute material through a network where the copyright for that material belongs to someone else, the permission of the copyright holder is required. In other words, the operators of digital libraries must obtain permission from the copyright holder for the material to be placed on a network. Now, for what types of digital library systems are copyright holders likely to give their permission? Here, we will consider the conditions relating to network distribution with which copyright holders are likely to agree. Then we will discuss the requirements of users, the private information service industry, and the library operators.

2.1 Factors Required by Copyright Holders

Normally, a copyright holder does not collect a fee every time a particular work is loaned from a library. However, the copyright holder does obtain a fee that corresponds to the number of copies of the work purchased by libraries. No more than this number of copies can be lent at the same time. Therefore, permission granted to libraries to lend material does not affect turnover in the marketplace even when that material is extremely popular.

If we adopt the same analogy for digital libraries, we could use the following method. First, we set a limit on the number of simultaneous accesses to material stored in the digital library. The digital library would then pay the copyright holder a fee that corresponds to this limit. In this connection, the Xerox Corporation has already made a proposal in which various rights would be attached to the material [3].
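The limit on simultaneous accesses can be illustrated with a minimal sketch. It assumes that each title has a fixed number of licensed copies and that a loan occupies one copy until it is returned. The class name LendingPool and its methods are hypothetical and are not part of the trial system described in this paper.

```python
from dataclasses import dataclass, field


@dataclass
class LendingPool:
    """One title whose simultaneous loans are capped by the licensed copies."""
    title: str
    licensed_copies: int                         # copies the library has paid for
    borrowers: set = field(default_factory=set)  # users currently holding a copy

    def checkout(self, user_id: str) -> bool:
        """Lend a copy if one is free; return False when all copies are out."""
        if user_id in self.borrowers:
            return True                          # user already holds a copy
        if len(self.borrowers) >= self.licensed_copies:
            return False                         # user must wait for a return
        self.borrowers.add(user_id)
        return True

    def give_back(self, user_id: str) -> None:
        """Return a copy, freeing it for the next borrower."""
        self.borrowers.discard(user_id)


pool = LendingPool("Example Title", licensed_copies=2)
results = [pool.checkout(u) for u in ("alice", "bob", "carol")]
pool.give_back("alice")
retry = pool.checkout("carol")
print(results, retry)  # [True, True, False] True
```

With two licensed copies, the third request is refused until one of the first two borrowers returns the material, which mirrors the behavior of a conventional library.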

Now, when limits are set as described above, secondary use of the material must be made difficult. That is, we must ensure that material lent from a digital library is always used in its original form and that no part of the material can be extracted to make other material. It follows that this limitation also prevents electronic referencing of material. However, this is unavoidable since the boundary between appropriate referencing and secondary use without permission cannot be determined solely by electronic means.

On the other hand, in the future there will be much material that exists only on networks. If a digital library could function as an impartial institution that can prove authorship of original material, this would provide an important impetus to copyright holders to entrust their material to digital libraries.

2.2 Factors Required by Users

Libraries are public facilities. In libraries, anyone can access the information they want. This feature of libraries is one that should be inherited by digital libraries. But this can only happen if digital libraries are established with public funding. In other words, a digital library service should be provided at no cost, or at a very low cost. If the cost of borrowing material is the same as that required to purchase the material in the marketplace, then the institution cannot be called a library.

Likewise, if investment in new equipment is required to enable a user to use a digital library, then this raises problems for public libraries. Use of digital libraries should be made possible merely by the addition of simple software based on the most popular infrastructure possible.

2.3 Factors Required by the Private Information Service Industry

We must note the following: If public funds are used to create digital libraries, this must not force private industry out of the arena. If digital libraries appear to be overpowering the existing information service industry in terms of acquiring information, then the entire information service industry will work to stop the construction of such institutions.

So what restrictions should apply to digital libraries? By their nature, libraries excel in the collection and classification of material. The problem lies in the unrestricted release of all this collected material to users. In particular, if digital libraries widely distribute specialized material that has been a major source of income for the information service industry, digital libraries will severely restrict business in the information service industry.

Therefore, the material distributed from digital libraries must be of a type that does not compete with the material flowing from private industry. Or, if it does compete, then its volume must be sufficiently smaller than that distributed from private industry.

2.4 Factors Required by Library Operators

If all of the above can be achieved, digital libraries should then be able to operate appropriately in terms of the copyright issue. Furthermore, it will become important for digital library operators to manage copyright information for the material they hold. If users are able to obtain this copyright information from a digital library when they want to re-use material, then there will be a rapid increase in the significance of constructing digital libraries.

3. Technical Issues

All of the above factors can be summarized as control of the number of copies of material that can be accessed simultaneously. The following issues are of equal importance.
  1. Material distributed through the network must not be able to be redistributed in a reusable form.
  2. Expiration dates for material must be set.

If current book lending and return systems can be successfully digitized by solving these problems, people wanting to borrow popular material will need to wait until the material is returned. When this occurs, those with sufficient disposable income will purchase the material at electronic book stores; and others will wait until the material is returned and then borrow it. Each of these issues will now be discussed.

3.1 Methods for Inhibiting Redistribution of Material

Normally, when material is distributed over a network, the material is stored as a file on a terminal. Here, even if the material is safely distributed to users using enciphering technology, copies of it can always be made because the material is stored as a file on a terminal. Even if this material is enciphered, valid users can always decode the cipher and will always have the cipher key. If they didn't, they would not be able to use the material.

If the decoding method is supplied in the form of software, the decoding method will also be able to be copied. In other words, the enciphered material and the decoding method can be copied, and this can be passed on together with the cipher key to another person. This means that the material can be redistributed in a usable format. (See Figure 3.1.)

Figure 3.1 Ease of Redistribution of Enciphered Material

Of course, employing special hardware to decode material could be considered. However, if every digital library user required special hardware, the expense for users or library operators would multiply. This is hardly the best solution. Rather, it is more desirable to enable use of a digital library by adding software to the user terminal, if possible.

Note here that software alone will not completely prevent the redistribution of material. Users that redistribute material must be made to suffer a penalty.

To design such penalties, we need to consider what these users stand to gain by redistributing material.

3.1.1 Profit

Many operators make a profit by illegally diverting material belonging to other people. It is hard to conceive of a perfect solution for dealing with such ill-intentioned users. As a basis for punishing them, it would be useful if people dealing in pirated versions could be identified by means of the pirated version itself. For a pirated version to be made, the material must always first be obtained once through a legitimate source. Including in the material itself information that identifies the person who was granted the right to use it might therefore enable indiscriminate dealers to be identified when a pirated version is found. (See Figure 3.2.)

Figure 3.2 Potential for Detecting Dealer Name from Pirated Versions

However, this cannot be considered a perfect method either, because people could obtain material under a false name. Some would also find ways of removing the identifying information from the material.
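The idea of identifying the source of a pirated copy from the copy itself can be sketched as follows. This is a minimal illustration, not the method of the trial system: it assumes the material is HTML, that the library appends a comment naming the borrower, and that the comment is signed with an HMAC under a key held only by the library. The key, the tag format, and the function names are all hypothetical.

```python
import hashlib
import hmac
import re

# Hypothetical secret held only by the digital library.
LIBRARY_KEY = b"demo-key-known-only-to-the-library"

# Hypothetical tag format: <!-- lent-to:BORROWER:SIGNATURE -->
TAG_RE = re.compile(r"<!-- lent-to:([\w-]+):([0-9a-f]{64}) -->")


def _sign(borrower_id: str, body: str) -> str:
    """HMAC-SHA256 over the borrower ID and the material body."""
    message = borrower_id.encode() + b"\0" + body.encode()
    return hmac.new(LIBRARY_KEY, message, hashlib.sha256).hexdigest()


def tag_material(html: str, borrower_id: str) -> str:
    """Append a signed comment naming the borrower to the lent copy."""
    return f"{html}<!-- lent-to:{borrower_id}:{_sign(borrower_id, html)} -->"


def trace_borrower(found_copy: str):
    """Return the borrower named in a found copy, or None if absent or forged."""
    match = TAG_RE.search(found_copy)
    if match is None:
        return None
    borrower_id, signature = match.groups()
    body = found_copy[:match.start()]
    if hmac.compare_digest(signature, _sign(borrower_id, body)):
        return borrower_id
    return None


page = "<html><body>Chapter 1</body></html>"
lent_copy = tag_material(page, "user-0042")
print(trace_borrower(lent_copy))  # user-0042
print(trace_borrower(page))       # None
```

The signature only prevents a tag from being altered to name someone else. As noted above, it does not stop a user from stripping the tag entirely, so a visible comment of this kind is a deterrent rather than a safeguard.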

3.1.2 Spreading the Word about Useful Information

The more useful a user finds a piece of information, the more they seem to want to tell a third party about it. Perhaps they believe that the third party may benefit from being informed of such useful information. Perhaps, if they belong to the same organization, the user may see benefits for the whole organization. However, some users just do it to say, "did you know that ...!"

Some users will not be aware that this action of passing on information may breach the rights of others. Some users may even mistakenly believe that they are advertising the material. There are many people who believe that none of the information on the Internet is subject to copyright. This is probably because the material exists where it can be easily copied.

We cannot ignore the damage copyright holders incur through such actions. In particular, material on a network can be redistributed instantaneously to a third party. If one individual redistributes the material to a few others, before one knows it, thousands of copies have been made. Although the results are the same as when material is redistributed without permission for profit by ill-intentioned people, the big difference is that these people are not ill-intentioned.

Although they are not ill-intentioned, their lack of care or understanding about copyright means that their actions may sometimes breach copyright. Immediate exposure when such a breach is discovered may be relatively meaningless for the ill-intentioned users described above, but it may serve as a severe punishment for other users. It is therefore most important to find a technology that will warn against and prevent breaches of copyright by users who are not ill-intentioned. Warning users that the material itself can reveal who redistributed it is probably a deterrent.

3.2 Setting Expiration Dates

If expiration dates were to be included within enciphered material, and if decoding methods were provided with specifications that operated according to this information, it would be at least possible to set expiration dates. However, this would also mean that expired material would accumulate on a terminal and put pressure on terminal space.
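A decoding method that honors an expiration date carried with the material might look like the following sketch. It assumes the package holds a header with the borrower and expiration date, protected by an HMAC so that the date cannot simply be edited. The actual encipherment of the body is omitted, and the key, field names, and function names are hypothetical.

```python
import hashlib
import hmac
import json
from datetime import date

KEY = b"demo-shared-key"  # hypothetical; stands in for the real cipher key


def _mac(header: dict, body: str) -> str:
    """HMAC-SHA256 over the header and body, so neither can be edited."""
    payload = json.dumps(header, sort_keys=True) + "\n" + body
    return hmac.new(KEY, payload.encode(), hashlib.sha256).hexdigest()


def make_package(body: str, borrower: str, expires: date) -> dict:
    """Library side: bundle the material with its loan conditions."""
    header = {"borrower": borrower, "expires": expires.isoformat()}
    return {"header": header, "body": body, "mac": _mac(header, body)}


def open_package(package: dict, today: date) -> str:
    """Terminal side: release the body only if intact and not expired."""
    expected = _mac(package["header"], package["body"])
    if not hmac.compare_digest(expected, package["mac"]):
        raise ValueError("package header or body has been altered")
    if today > date.fromisoformat(package["header"]["expires"]):
        raise PermissionError("loan period has expired")
    return package["body"]


pkg = make_package("<p>Chapter 1</p>", "user-0042", date(1997, 3, 31))
print(open_package(pkg, today=date(1997, 3, 15)))  # <p>Chapter 1</p>
```

Calling open_package with a date after the expiration date raises PermissionError. The check relies on the terminal's own clock and on the decoder being unmodified, both of which the user controls, so it shares the limits of software-only protection discussed in Section 3.1.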

It would be more appropriate if material that had passed the expiration date were automatically deleted by the decoding method, or deleted after user confirmation. However, some people would be unwilling to delete the material and would keep it. Incentives are required to discourage such behavior.

An operating scheme must be devised for digital libraries in which users benefit from the removal of enciphered material. The easiest method would be to have users report to the digital library when the decoding system has deleted the enciphered material. When users do not send in such a report, a warning including a deletion request would be issued the next time they borrow material, and users would be warned that their borrowing rights would cease if their present behavior continued.
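The reporting scheme just described can be sketched as a small piece of library-side bookkeeping. The threshold of two warnings, the status strings, and all names are assumptions made for illustration. The paper does not specify how many warnings precede the loss of borrowing rights.

```python
from dataclasses import dataclass, field
from datetime import date

MAX_WARNINGS = 2  # hypothetical number of warnings before rights cease


@dataclass
class Borrower:
    user_id: str
    unreported: dict = field(default_factory=dict)  # title -> expiration date
    warnings: int = 0
    suspended: bool = False


def report_deletion(borrower: Borrower, title: str) -> None:
    """Record that the decoding system deleted the expired material."""
    borrower.unreported.pop(title, None)


def request_loan(borrower: Borrower, title: str, expires: date, today: date) -> str:
    """Lend material, warning about expired loans with no deletion report."""
    if borrower.suspended:
        return "refused: borrowing rights suspended"
    overdue = sorted(t for t, exp in borrower.unreported.items() if exp < today)
    if overdue:
        borrower.warnings += 1
        if borrower.warnings > MAX_WARNINGS:
            borrower.suspended = True
            return "refused: borrowing rights suspended"
        status = "lent with warning: please delete " + ", ".join(overdue)
    else:
        status = "lent"
    borrower.unreported[title] = expires
    return status


user = Borrower("user-0042")
first = request_loan(user, "A", date(1997, 1, 10), today=date(1997, 1, 1))
second = request_loan(user, "B", date(1997, 2, 10), today=date(1997, 2, 1))
report_deletion(user, "A")
third = request_loan(user, "C", date(1997, 2, 20), today=date(1997, 2, 5))
print(first, "|", second, "|", third)
```

Here the second loan is granted with a warning because title A has expired without a deletion report. Once the report arrives, the third loan proceeds normally.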

4. Establishing Material Distribution Models

Let us now summarize the above. Information that specifies the user, the expiration date, and information on the copyright holder are included in the material, which is distributed in enciphered form to users. When it is distributed, a distribution record would be made at the digital library. A decoding system would have previously been distributed to the user terminal. The enciphered material would be decoded using the cipher key obtained by some means or other. Here, if the expiration date for the material has passed, the user would be asked to confirm deletion and the material would be deleted. This fact would then be reported to the digital library.

Enciphered material would be able to be read using an appropriate reading system. Here, what is important is to prevent the enciphered material from being redistributed without the information that has been added to it. It is also important that, whenever enciphered material is being read, the information that specifies the user can be consulted at any time. (See Figure 4.1.)

Figure 4.1 Material Distribution Model

Let us consider here the protection of decoded material. Decoded material can be accessed both through the reading system and through other file access systems. The reading system must of course always access the material in order to display it. Other file access systems, however, do not have to be operating while material is being read. That is, use of those file access systems must be prevented, along with secondary use through the reading system itself.

5. Creating a Working System

Next, we attempted to build the model described in the previous section. We used a WWW server and a database for the digital library and a Web browser as the reading system. The material was in HTML format and was enciphered before being distributed to user terminals. A helper application was provided at the user terminal for decoding the cipher. The browser started this helper when it received material from the server whose MIME type identified it as enciphered.

The decoding helper decodes the cipher and stores the material in the specified directory. Then the material is displayed using the browser. At the same time, the information that specifies the user, the expiration date, and the copyright information that was included in the material is displayed on the screen or is made available for reference at any time. When the material is displayed on a browser, the decoding helper also starts up an application that monitors the browser and the operation of any other applications. If this monitoring application notices any copy or edit operations being used in the browser, it restricts these or displays a warning. Where necessary, it also inhibits the operation of other applications. (See Figure 5.1.)

Figure 5.1 Setup in Trial System
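The decision made by the monitoring application can be sketched as a simple policy function. This is an illustration only: the operation names, the default-deny rule, and the wording of the warning are assumptions, and a real monitor would have to obtain these events from the window system or the browser, which is not shown here.

```python
from dataclasses import dataclass

# Hypothetical names for operations that do not extract material.
ALLOWED = {"scroll", "follow_link", "back", "forward"}


@dataclass(frozen=True)
class LoanInfo:
    borrower: str
    expires: str
    copyright_holder: str


def handle_operation(operation: str, loan: LoanInfo):
    """Return (allowed, message) for an operation seen while material is shown.

    Anything not explicitly allowed (copy, cut, paste, save, and so on) is
    restricted, and the warning repeats the loan information.
    """
    if operation in ALLOWED:
        return True, ""
    warning = (
        f"'{operation}' is restricted: this material is lent to "
        f"{loan.borrower} until {loan.expires}; "
        f"copyright {loan.copyright_holder}."
    )
    return False, warning


loan = LoanInfo("user-0042", "1997-03-31", "Example Press")
allowed, message = handle_operation("copy", loan)
print(allowed, message)
```

Denying every operation that is not on the allowed list is a conservative choice. It avoids having to enumerate every way a browser can export content, at the cost of occasionally warning about harmless actions.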

Figure 5.2 shows a screen example from the system actually trialed. When an attempt is made to use the Web browser for cut and paste operations, a window appears that displays a warning together with information relating to the material being displayed.

Figure 5.2 Screen Example

6. Evaluation

Because users acquire and read material using the Web browser that they are already familiar with, no new user interface has to be forced on them. In addition, because special hardware is not required, a material lending and return system can be constructed at a relatively low cost.

In this paper, we have assumed that a digital library would be constructed from a WWW server and a Web browser on a PC platform. While the functions of the popular Web browsers are useful, there is also the danger that incorrect use of these functions may result in breaches of the copyright of the material, a right that belongs to a third party. However, when the popular nature of these browsers, their expressive power, their functions, and the speed of their performance are considered, they must still be considered the most appropriate platform for distributing material over a network.

Future technology development must follow one of the two following directions. First, the technology may, as has been done in the past, attempt to protect valuable material from illegal distribution by ill-intentioned people. The more valuable the material, the more money can be spent in turning it into a product. For example, the development of technology for special hardware may progress to protect valuable and expensive material.

On the other hand, technology may be developed to protect users from the possibility of being sued because they have inadvertently breached the rights of another. Although it is easily forgotten, it can be expected that in the future more and more people with little knowledge of computers or the law will use the network to acquire information. Under the current WWW specifications, such users can all too readily breach the rights of another person, and the risk is too great. Work is needed to minimize this risk for users by creating software that is easier to use.

7. Conclusion

To solve the copyright problem in digital libraries, we propose a lending and return system similar to that used in current libraries. The technical issues we have raised are the prevention of redistribution of material and the setting of expiration dates. We have conducted trials relating in particular to the prevention of redistribution of material. Assuming a material distribution system based on the WWW, we have developed an application to prevent secondary distribution of material using Web browsers and we have developed a method for distributing material.

The system can be introduced with no ill feeling because we do not need to change the user interface and therefore users can continue to use popular Web browsers, and because we can display warnings only when attempts are made to enable secondary use of the material.

In the future, technology to protect copyright will be divided into two types. One type of technology will be developed to exclude ill-intentioned users who knowingly and intentionally enable secondary use of material. The other type of technology will be developed to protect users who enable illegal secondary use of material, either unwittingly or through lack of care, from being sued after the fact by the holders of rights or from harsh criticism from a third party who has no rights (a trend of late).

Technology for copying material has progressed to such an extent and copies can be so easily made that cases of people unwittingly breaching the rights of others by copying material are on the increase. Not only should greater use of the media be made to educate the public about intellectual property, but increasingly a better understanding of intellectual property rights should be facilitated by the product itself.

Acknowledgment

This paper includes the work under consideration by the "Next-Generation Digital Library System Research and Development Project" being promoted by the Ministry of International Trade and Industry (MITI), Information-technology Promotion Agency, Japan (IPA) and Japan Information Processing Development Center (JIPDEC).

References

[1] K. Nawa: "Copyright for Digital Library," Journal of Information Processing Library of Japan, Vol. 37, No. 9, pp. 857-860 (1996)

[2] P. Samuelson: "Legally Speaking: Copyright and Digital Libraries," Communications of the ACM, April 1995, Vol. 38, No. 4 (1995)

[3] Marti A. Hearst: "Research in Support of Digital Libraries at Xerox PARC," D-Lib Magazine, May 1996, http://www.dlib.org/dlib/may96/05hearst.html (1996)