But we are also becoming familiar with some of the limits of these activities. There is often too much information available, as captured in the popular image of trying to drink out of a fire hose. We encounter too much useless information on the way to those nuggets that truly transform our work. Potentially useful information can be difficult to assimilate and contextualize. Or we may simply fail to find the information that we need.
Information achieves significant value only when it contributes to the achievement of important human purposes. We often contrast "information" with "data," with the latter missing the step of interpretation that gives meaning to "raw" material. But we need even more than information. We need "knowledge," which we take to be information of significant value for human purposes. Knowledge is information that has been sifted, selected, organized, verified, and vetted. Knowledge is information that has been assimilated into our activities and social structures in ways that advance our accomplishment of significant personal and social goals.
Knowledge technologies have long played a major role in shaping how we work together. Communication technologies like the telegraph and the telephone had profound impacts on knowledge work, making it possible to organize work in radically different ways. JoAnne Yates [7] described how a range of now familiar office technologies like carbon paper, the mimeograph machine, the typewriter, and the vertical file had profound impacts on the organization of group work. Indeed, from the mid-nineteenth century to today, emerging knowledge technologies have produced new forms of working together at a dizzying pace, and there is no slow-down in sight.
As information professionals we are of course interested in how knowledge networks are affected by the changes in information technologies. Those of us attending this symposium are brought together by a common interest in one of the most exciting forms of knowledge technology, digital libraries. However, digital libraries are just one of the kinds of knowledge technologies that are emerging, and it is important to understand the bigger arena of new knowledge technologies in order to understand how knowledge work will be advanced by even such a specific technology as digital libraries.
In our vision of knowledge technologies, articulated in an earlier paper [4], we propose that there are three interlinked kinds of capabilities that are needed:
Person-to-person technologies: these are those communication and computing technologies that link people with each other. Much that goes under the general title of groupware fits here.
Access to digital knowledge repositories: these are the wide array of stores of information and knowledge that we can access in order to accomplish our work. This refers to digital libraries and all manner of data bases and on-line information stores.
Remote access to the physical world: this refers to the capability to access and even interact with remote parts of the physical world. This might include such remote objects as scientific or engineering instruments, cameras, or various specialized manipulanda.
All of these can be used to relax constraints on space and time in access to and the use of knowledge in the service of human goals. We are in the midst of a period of enormous organizational innovation and experimentation. Concepts like distributed groups, virtual organizations, knowledge networks, and electronic communities are bandied about in both the popular and scholarly press. While there is no lack of hype about what these new organizational forms might offer, there is also widespread discussion about the difficulties created by the new technologies. The pace of technology creation is much faster than the growth of understanding about the effects of these technologies, but researchers and the funding agencies that support them are increasing their investment in studies of technology in actual practice.
A simple example will illustrate the contrast between knowledge in declarative and procedural forms. Consider someone who is learning to solve problems in physics [2]. The beginning student learns lots of facts and principles about physics, and then tries to solve problems by looking for relevant knowledge in memory. Since the knowledge about physics was learned in declarative form, the student must try to generate descriptions of the problem that can be matched in memory with this knowledge. Once relevant knowledge is found, it must be combined with very general, domain-independent problem solving heuristics to actually solve the problem. Problem solving is halting, with frequent blind alleys, and it takes a long time. The skilled problem solver behaves very differently. Through extensive work with physics problems, the expert acquires procedural knowledge that embodies the physics knowledge in skills. This procedural knowledge is triggered by examination of the problem, and it immediately activates domain-specific problem solving routines that are fine-tuned to problems of different types. Problem solving is fluid, and proceeds quickly and directly.
The proceduralization of knowledge is a long, slow process. Learning a simple cognitive skill takes many days and weeks, while learning a complex skill like playing chess takes upwards of a decade. This is one of the primary reasons why expertise in complex domains is so scarce - it takes a very long time to achieve.
The duality of knowledge - declarative and procedural - represents a major challenge for the evolution of knowledge networks through the use of modern communication and computing technology. Modern information systems are superb at the manipulation and dissemination of declarative knowledge. Such systems require that the information they process be in an explicit form. We have developed computational methods of great power that we can use with these explicit representations, and our computational tools have extended our reach considerably.
Of course, information artifacts play a role in the rendering of procedural knowledge. Our actions occur in interaction with the social and material world, and part of learning a cognitive skill is learning what actions are appropriate in various situations. These elements of the world are the triggers and the shapers of the expression of procedural knowledge. Similarly, the social and material worlds provide representational support for cognitive activity, as so vividly shown in Ed Hutchins' descriptions of skilled behavior [5,6]. Again, we know that the highly efficient human perceptual systems are critical components of skilled behavior [1]. Thus, while information systems traffic in the explicit representations of declarative knowledge, the way in which they appear to users through their "interfaces" is critical to the orchestration of procedural knowledge.
Procedural knowledge is much more difficult for the information analyst to elicit and make visible than declarative knowledge. For the latter one can usually just ask an informant or observe the overt communication between informants. But for procedural knowledge one has to observe the subtle interactions of an individual with other individuals or with information artifacts. Thus, procedural knowledge is often ignored in designing information systems, and many failures or surprises in deploying such systems result from an incomplete understanding of the implicit details of the work situation.
There are many aspects of working together that are affected by knowledge technology, and in looking at knowledge work in action these often constitute some of the important dimensions of analysis.
The Participants. Historically, working together usually implied that one was geographically close to one's work partners. In classic studies, Tom Allen and Bob Kraut showed that the likelihood of working together was strongly dependent upon distance, and that if two people's workspaces were more than 30 meters apart they were no more likely to collaborate than if they were hundreds or even thousands of miles apart. This 30-meter principle is one of the most basic characteristics of working together. Yet it has struck many that one effect of the new knowledge technologies might be to reduce the effect of distance on participation in working together, or at a minimum alter the functional form of the dependency.
Space and time. A common way to classify work situations is to build a two-by-two table, with same/different time on one axis and same/different place on the other. The different combinations of space and time allow for different ways of working together. The character of the work that can be done in these situations is often quite different, and thus a typical project team will often work in three or even four of these quadrants, depending on what they are doing. In general, complex, tightly-coupled knowledge work has required being in the same place at the same time. For example, some kinds of informationally intensive activities like brainstorming or decision making are usually done face-to-face. More loosely-coupled work can be done asynchronously or at a distance. Similarly, access to special kinds of facilities (e.g., specialized instruments, specialized information resources, project rooms) has usually introduced significant constraints on when and where work can take place. It is hardly surprising that knowledge technologies are conjectured to have a big impact here.
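The two-by-two classification can be written down as a small data structure. The sketch below is only an illustration: the names Quadrant, classify, and quadrants_used are our own, and the sample activities and their placement are assumptions chosen for the example rather than findings reported here.

```python
from enum import Enum


class Quadrant(Enum):
    """The four cells of the same/different time by same/different place table."""
    SAME_TIME_SAME_PLACE = "same time, same place"
    SAME_TIME_DIFFERENT_PLACE = "same time, different place"
    DIFFERENT_TIME_SAME_PLACE = "different time, same place"
    DIFFERENT_TIME_DIFFERENT_PLACE = "different time, different place"


def classify(same_time: bool, same_place: bool) -> Quadrant:
    """Map the two yes/no answers onto one cell of the table."""
    if same_time:
        return (Quadrant.SAME_TIME_SAME_PLACE if same_place
                else Quadrant.SAME_TIME_DIFFERENT_PLACE)
    return (Quadrant.DIFFERENT_TIME_SAME_PLACE if same_place
            else Quadrant.DIFFERENT_TIME_DIFFERENT_PLACE)


def quadrants_used(activities):
    """Return the set of cells that a list of (name, same_time, same_place) covers."""
    return {classify(same_time, same_place) for _, same_time, same_place in activities}


# Illustrative activities for one hypothetical project team (assumed, not measured).
team_activities = [
    ("face-to-face brainstorming", True, True),
    ("video conference", True, False),
    ("notes left in a shared project room", False, True),
    ("second brainstorming session", True, True),
]

for name, same_time, same_place in team_activities:
    print(f"{name}: {classify(same_time, same_place).value}")

print(len(quadrants_used(team_activities)), "of 4 quadrants in use")
```

Representing the table this way makes it easy to ask, for any observed team, how many of the four quadrants its work actually occupies.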
Social arrangements. Organizing is a key element of working together. Organizing includes setting up patterns of authority and communication, establishing rules and procedures, and forming subgroupings with defined missions. Since communication is such a key element of organizing it is hardly surprising that changes in technologies of communication have big impacts on how it is done.
Tools and infrastructure. Knowledge work requires not just people but also tools and resources. The new digital technologies have introduced a wide array of new tools to support knowledge work, and the ability to share such tools over networks offers new possibilities for the coordination of working together.
Costs. All of these elements of working together have a cost structure. Getting together for focused synchronous work has long been possible, but the transportation costs for those whose normal base of operation is geographically distributed can be very high. Video conferencing can relax such travel constraints, but it too has non-trivial capital and operational costs. High-speed networks to support easy interactions at a distance are also expensive. There is generally a correlation between expense and quality, so if the highest quality of interaction is not required there are often inexpensive alternatives. For instance, low frame-rate video is available for free over the Internet (assuming one has an Internet connection).
All of these elements of work are in the process of being transformed. Patterns of participation and communication are altered when work becomes possible using distance independent means. The constraints of time and place are relaxed. New organizational forms emerge, and new kinds of tools and infrastructure to support work emerge. The cost structure of working together in various ways is altered, with resulting long-term changes in how work is done.
Digital Library Technology for Project-Based Science Education. One particularly important component of knowledge work is access to information resources. The promise of digital information is that the collections and services that characterize traditional libraries can be made available any place at any time. This provides more flexibility for linking users with useful material, and it enlarges the scope of access - any user anywhere can in principle access a journal or book that is in digital form. But there are numerous obstacles to achieving this vision. Some are purely technical, whereas others involve situating the digital library in a situation of practice.
The University of Michigan Digital Library (UMDL) project is focusing on both aspects of this. On the technical side, the project is focusing on how to manage the inevitable heterogeneity that will characterize users and collections. Users have diverse purposes and characteristics in seeking information. The high school student looking for material for a collaborative science project is quite different from the senior scientist checking out the latest results from the labs of distant colleagues. Similarly, collections will inevitably be quite diverse, both in content and in form. UMDL uses distributed agent technology from artificial intelligence and market mechanisms from economics to form an infrastructure to support these two kinds of heterogeneity. Collections of information are represented by software agents that contain information about the contents and characteristics of the collection. Similarly, users are represented by agents that embody descriptions of their characteristics and purposes. Mediation agents act as brokers, attempting to match users with appropriate collections. The interactions of these various agents are governed by market mechanisms that embody various aspects of cost and benefit in searches.
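The division of labor among collection agents, user agents, and mediation agents can be illustrated with a toy sketch. It is not the UMDL implementation: every class, field, and function name below is hypothetical, and the scoring rule (topic overlap plus a reading-level bonus, minus a price) is an assumed stand-in for the market mechanisms described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectionAgent:
    """Hypothetical agent describing the contents and characteristics of a collection."""
    name: str
    topics: frozenset
    reading_level: str  # assumed labels: "school" or "research"
    price: float        # illustrative cost of one query


@dataclass(frozen=True)
class UserAgent:
    """Hypothetical agent describing a user's characteristics and purposes."""
    name: str
    interests: frozenset
    reading_level: str
    budget: float


def net_benefit(user, collection):
    """Toy cost/benefit score: topic overlap plus a level bonus, minus price."""
    overlap = len(user.interests & collection.topics)
    level_bonus = 1.0 if user.reading_level == collection.reading_level else 0.0
    return overlap + level_bonus - collection.price


def mediate(user, collections):
    """Broker step: pick the affordable, relevant collection with the best score."""
    candidates = [
        c for c in collections
        if c.price <= user.budget and user.interests & c.topics
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda c: net_benefit(user, c))


collections = [
    CollectionAgent("journal-archive", frozenset({"ionosphere", "aurora"}), "research", 0.5),
    CollectionAgent("classroom-atlas", frozenset({"aurora", "weather"}), "school", 0.1),
]
student = UserAgent("student", frozenset({"aurora", "weather"}), "school", 0.25)
scientist = UserAgent("scientist", frozenset({"ionosphere", "aurora"}), "research", 1.0)

student_match = mediate(student, collections)
scientist_match = mediate(scientist, collections)
print(student.name, "->", student_match.name)
print(scientist.name, "->", scientist_match.name)
```

In this toy setting the broker sends the two kinds of users to different collections, which is the sense in which agent descriptions on both sides let one infrastructure serve heterogeneous users and heterogeneous collections.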
To situate the UMDL ideas in a real community of practice, a testbed has been built in the area of earth and space science using the UMDL technology. The initial deployment of the testbed is in middle and high schools, where UMDL is being used to support project-based science education. Here students are investigating science topics through concrete research projects and are using a version of the UMDL system called Artemis to access digital resources as needed while they do their work. Artemis provides a suite of tools to support their work, based on observations of how project-based science education is done in classrooms. This tailoring of the interface to a specific set of practices is the essence of linking the explicit representations of the digital library with the details of practice. Careful empirical studies will provide feedback that can be used to revise the details of the interface.
When information is in digital form the very definition of what constitutes information can be enlarged. For instance, in our collaboratory projects described below users would like access to digital versions of their primary journals to support their collaboratory workshops, in which they would carry out distributed discussions of theory and data. But they would also like on-line access to past science campaigns, so they can replay interesting events and reflect on the interpretations and reactions of those present at the event. Thus, activity itself can be captured and stored for later re-use, a boon not only to the practitioners of science but to students or the public who would like to better understand what science is about. Digital libraries can become a venue for an enlarged repertoire of useful types of information, with new genres standing alongside old ones.
Of course, the emergence of digital libraries is challenging our idea of what a library is. Historically a library has been a place where one goes for access to collections and services. A digital library need not be linked to a place, although even in the digital world the place metaphor has a lot of conceptual power. This is clearly an era where traditional libraries will be re-evaluating their role and how they can integrate themselves with the digital world. The result will be important for the next generation of knowledge workers.
A Collaboratory that is Transforming the Practice of a Science. Despite the popular stereotype of the scientist as a lonely genius, modern science is a highly collaborative activity. The laboratory is a social institution that emerged in the nineteenth century as a way of facilitating such work. Now, a new organizational form is emerging, the collaboratory. A collaboratory is "a center without walls, in which the nation's researchers can perform their research without regard to geographical location - interacting with colleagues, accessing instrumentation, sharing data and computational resources, [and] accessing information in digital libraries." Many projects are under way to build and evaluate collaboratories in various domains.
One specific example is the Upper Atmospheric Research Collaboratory (UARC). This is a multidisciplinary project whose goal is to design, build, deploy, and evaluate collaboratory technology for a distributed community of upper atmospheric researchers. The project began in 1992, and it has evolved through multiple generations of the technology. The design has been informed by a user-centered design strategy, in which careful empirical observations of the work of the scientists, both without and with the technology, have played a critical role in designing the technology.
The field of upper atmospheric physics focuses on the study of the ionosphere, looking at the interactions of the earth's magnetic field, the solar wind, and the physical and chemical properties of the upper atmosphere. Observations are made from satellites, rockets, and ground-based facilities, most of which are at high latitudes. Complex models that run on supercomputers, integrate many data sources, and attempt to predict the emergence of new phenomena are becoming increasingly important in the field.
Initially, the UARC project focused on a community of users of the Sondrestrom Upper Atmospheric Research facility at Kangerlussuaq, Greenland. This site has a number of instruments whose aim is to record phenomena in the upper atmosphere. At the time UARC began there were two predominant modes of using this facility. One was to develop a fixed observational routine in advance and send these specifications to the site crew at Sondrestrom. They would carry out these observations and send the data back to the investigator. The alternative was to conduct interactive, often collaborative studies by going to the site and making decisions on the fly about observational modes depending upon what was actually happening in the upper atmosphere. The UARC project focused on the second of these.
We developed technology that allowed the scientists to observe data displays of Sondrestrom instruments in real time over the Internet. We also gave them a collaborative chat facility that allowed them to exchange text messages and images with each other, along with several other simple collaboration tools. We evolved these tools over several years, adding additional instruments and capabilities, refining the interface, and improving the performance. Over a series of three winter campaign seasons (93-94, 94-95, 95-96) the UARC technology was used in a number of scientific campaigns.
By the 95-96 campaign season several things had become apparent that had big implications for UARC. First, the centralized architecture of the early system was not scaling well as we added more users and instruments. This limited scalability was exacerbated by the tremendous growth in usage of the Internet that followed the emergence of the World Wide Web. Second, the emergence of the World Wide Web and the associated technologies such as multi-platform browsers and the Java programming language provided a new model of how the UARC technology might be delivered. Third, the scientists themselves increasingly discussed more ambitious science that they could carry out with envisioned extensions of the UARC concept. In particular, they talked of being able to see data from many ground-based sites all over the world, along with data from key satellites. This would give them a more global picture of what was happening, and would allow them to compare model predictions with global data in real time.
This led to a major redesign and rebuild of the UARC technology. First, to solve the performance problems, a new, more distributed architecture was developed that gave more satisfactory performance over realistic Internet conditions. Second, the new UARC technology was built as Java applets accessible from a Web browser. Third, data viewers that would allow for many data sources from around the world and for comparisons between data and theory were developed. This new UARC technology was first used in the fall of 1996, and has subsequently been refined and used in additional campaigns.
The evolution of the UARC technology has produced a revolution in the practice of the science. In the early days of UARC we essentially reproduced over the Internet what the scientists did when they went to Greenland. By being able to do this familiar science over the Internet they had considerably more flexibility in the scheduling of scientific campaigns and in who could participate. But they also began to have new ideas about things they could not do previously. For instance, following the second season of using the UARC technology a group of scientists asked if they could replay one of the campaigns, watching the phenomena flow by again for additional reflection and bringing in some additional participants who had not been present originally. During this same season another two groups of scientists conducted an interleaved campaign. One group desired electrically quiet conditions in the atmosphere, while the other needed electrical disturbances. So using UARC one group or the other could "have the floor" depending on conditions. We also saw examples of innovative educational uses of UARC. Some senior scientists used UARC in their classrooms to teach about upper atmospheric phenomena. Others have used UARC as the basis for creative science projects by undergraduates.
The most recent rounds of science conducted with the new Web-based versions of UARC represent a true revolution in their science. The cutting edge issues in upper atmospheric science have to do with the emergence of models of global electrical activity and the evaluation of these models against data. This has mostly taken place through extensive analysis of historical data. In the UARC campaigns of 1996-97 the scientists had available live data from many sources across the northern hemisphere, and while observing current conditions they were able to display the outputs from models and make preliminary comparisons of data and theory. As one of the scientists noted, this allowed them "to close the data-theory loop." Furthermore, they were able to use the theory to guide their observations.
The scientists are of course eager to extend the UARC technology in yet further directions. One in particular that they have stressed is the ability to conduct workshops over the Internet. The community already has a tradition of conducting face-to-face workshops, in which data from a number of different time periods are debated and discussed in the context of emerging theories. These workshops often result in a set of scientific papers published in a special issue of a journal or in a book. They would like to do this over the Internet, because it would allow for more flexible participation and therefore might accelerate the process of getting significant results into the scientific literature. We hope to conduct several of these kinds of collaboratory workshops in the coming year.
UARC is a prime example of the productive strategy of working closely with a user community and evolving technology with their close cooperation that in the end profoundly changes the way they work. It also shows how the emerging collaboratory concept can productively affect scientific practice. The collaboratory idea is of course much broader than what we have done with UARC, and in other studies we are looking at communities with very different kinds of information needs in order to better understand the full range of capabilities required to support productive work over distance and time.
[2] Chi, M.T.H., Feltovich, P.J., & Glaser, R. Categorization and Representation of Physics Problems by Experts and Novices. Cognitive Science, Vol. 5, pp. 121-152, 1981.
[3] Cohen, M.D., & Bacdayan, P. Organizational Routines are Stored as Procedural Memory: Evidence from a Laboratory Study. Organization Science, Vol. 5, pp. 554-568, 1994.
[4] Finholt, T.A., & Olson, G.M. From Laboratories to Collaboratories: A New Organizational Form for Scientific Collaboration. Psychological Science, Vol. 8, pp. 28-36, 1997.
[5] Hutchins, E. Cognition in the Wild. Cambridge, MA: MIT Press, 1995.
[6] Hutchins, E. How a Cockpit Remembers Its Speeds. Cognitive Science, Vol. 19, pp. 265-288, 1995.
[7] Yates, J. Control Through Communication. Baltimore: Johns Hopkins University Press, 1989.