Lev Manovich

New Media: a User’s Guide

 

 

How Media Became New

 

On August 19, 1839, the Palace of the Institute in Paris was filled with curious Parisians who had come to hear the formal description of the new reproduction process invented by Louis Daguerre. Daguerre, already well known for his Diorama, called the new process the daguerreotype. According to a contemporary, "a few days later, opticians' shops were crowded with amateurs panting for daguerreotype apparatus, and everywhere cameras were trained on buildings. Everyone wanted to record the view from his window, and he was lucky who at first trial got a silhouette of roof tops against the sky." The media frenzy had begun. Within five months more than thirty different descriptions of the technique were published around the world: Barcelona, Edinburgh, Halle, Naples, Philadelphia, Saint Petersburg, Stockholm. At first, daguerreotypes of architecture and landscapes dominated the public's imagination; two years later, after various technical improvements to the process, portrait galleries opened everywhere, and everybody rushed in to have their picture taken by the new media machine.

In 1833 Charles Babbage began designing a device he called the Analytical Engine. The Engine contained most of the key features of the modern digital computer. Punch cards were used to enter both data and instructions. This information was stored in the Engine's memory. A processing unit, which Babbage referred to as a "mill," performed operations on the data and wrote the results to memory; final results were to be printed out on a printer. The Engine was designed to be capable of doing any mathematical operation; not only would it follow the program fed into it by cards, but it would also decide which instructions to execute next based upon intermediate results. However, in contrast to the daguerreotype, not even a single copy of the Engine was completed. So while the invention of this modern media tool for the reproduction of reality affected society right away, the impact of the computer was yet to come.

Interestingly, Babbage borrowed the idea of using punch cards to store information from an earlier programmed machine. Around 1800, J.M. Jacquard invented a loom which was automatically controlled by punched paper cards. The loom was used to weave intricate figurative images, including Jacquard's portrait. This specialized graphics computer, so to speak, inspired Babbage in his work on the Analytical Engine, a general computer for numerical calculations. As Ada Augusta, Babbage's supporter and the first computer programmer, put it, "the Analytical Engine weaves algebraical patterns just as the Jacquard loom weaves flowers and leaves." Thus, a programmed machine was already synthesizing images even before it was put to work processing numbers. The connection between the Jacquard loom and the Analytical Engine is not something historians of computers make much of, since for them image synthesis and manipulation represent just one application of the modern digital computer among thousands of others; but for a historian of new media it is full of significance.

We should not be surprised that both trajectories — the development of modern media, and the development of computers — begin around the same time. Both media machines and computing machines were absolutely necessary for the functioning of modern mass societies. The ability to disseminate the same texts, images and sounds to millions of citizens, thus assuring that they would hold the same ideological beliefs, was as essential as the ability to keep track of their birth records, employment records, medical records, and police records. Photography, film, the offset printing press, radio and television made the former possible, while computers made the latter possible. Mass media and data processing are the complementary technologies of a mass society.

For a long time the two trajectories developed in parallel without ever crossing paths. Throughout the nineteenth and the early twentieth century, numerous mechanical and electrical tabulators and calculators were developed; they gradually became faster and their use more widespread. In parallel, we witness the rise of modern media which allowed the storage of images, image sequences, sounds and text in different material forms: a photographic plate, film stock, a gramophone record, etc.

Let us continue tracing this joint history. In the 1890s modern media took another step forward as still photographs were put in motion. In January of 1893, the first movie studio — Edison's "Black Maria" — started producing twenty-second shorts which were shown in special Kinetoscope parlors. Two years later the Lumière brothers showed their new Cinématographe camera/projector hybrid, first to a scientific audience and, later, in December of 1895, to the paying public. Within a year, audiences in Johannesburg, Bombay, Rio de Janeiro, Melbourne, Mexico City, and Osaka were subjected to the new media machine, and they found it irresistible. Gradually the scenes grew longer, the staging of reality before the camera and the subsequent editing of its samples became more intricate, and the copies multiplied. They would be sent to Chicago and Calcutta, to London and St. Petersburg, to Tokyo and Berlin, and to thousands and thousands of smaller places. Film images would soothe movie audiences, who were only too eager to escape the reality outside, a reality which could no longer be adequately handled by their own sampling and data processing systems (i.e., their brains). Periodic trips into the dark relaxation chambers of movie theatres became a routine survival technique for the subjects of modern society.

The 1890s was the crucial decade, not only for the development of media, but also for computing. If individuals' brains were overwhelmed by the amounts of information they had to process, the same was true of corporations and of governments. In 1887, the U.S. Census Office was still interpreting the figures from the 1880 census. For the next census, in 1890, the Census Office adopted electric tabulating machines designed by Herman Hollerith. The data collected for every person was punched into cards; 46,804 enumerators completed forms for a total population of 62,979,766. The Hollerith tabulator opened the door for the adoption of calculating machines by business; during the next decade electric tabulators became standard equipment in insurance companies, public utilities, railroads and accounting departments. In 1911, Hollerith's Tabulating Machine Company was merged with three other companies to form the Computing-Tabulating-Recording Company; in 1914 Thomas J. Watson was chosen as its head. Ten years later its business had tripled, and Watson renamed the company the International Business Machines Corporation, or IBM.

We are now in the new century. The year is 1936. That year the British mathematician Alan Turing wrote a seminal paper entitled "On Computable Numbers." In it he provided a theoretical description of a general-purpose computer later named after its inventor: the Universal Turing Machine. Even though it was capable of only four operations, the machine could perform any calculation that could be done by a human and could also imitate any other computing machine. The machine operated by reading and writing numbers on an endless tape. At every step the tape would be advanced to retrieve the next command, to read the data, or to write the result. Its diagram looks suspiciously like a film projector. Is this a coincidence?

If we believe the word cinematograph, which means "writing movement," the essence of cinema is recording and storing visible data in a material form. A film camera records data on film; a film projector reads it off. This cinematic apparatus is similar to a computer in one key respect: a computer's program and data also have to be stored in some medium. This is why the Universal Turing Machine looks like a film projector. It is a kind of film camera and film projector at once: reading instructions and data stored on an endless tape and writing them to other locations on this tape. In fact, the development of a suitable storage medium and a method for coding data represent important parts of the pre-histories of both cinema and the computer. As we know, the inventors of cinema eventually settled on using discrete images recorded on a strip of celluloid; the inventors of the computer — which needed much greater speed of access as well as the ability to quickly read and write data — came to store it electronically in binary code.

In the same year, 1936, the two trajectories came even closer together. Starting that year, and continuing into the Second World War, the German engineer Konrad Zuse built a computer in the living room of his parents' apartment in Berlin. Zuse's computer was the first working digital computer. One of his innovations was program control by punched tape. The tape Zuse used was actually discarded 35 mm movie film.

One surviving piece of this film shows binary code punched over the original frames of an interior shot. A typical movie scene, two people in a room involved in some action, becomes a support for a set of computer commands. Whatever meaning and emotion was contained in this movie scene has been wiped out by its new function as a data carrier. The pretense of modern media to create simulations of sensible reality is similarly cancelled; media is reduced to its original condition as an information carrier, nothing else, nothing more. In a technological remake of the Oedipal complex, a son murders his father. The iconic code of cinema is discarded in favor of the more efficient binary one. Cinema becomes a slave to the computer.

But this is not yet the end of the story. Our story has a new twist — a happy one. Zuse's film, with its strange superimposition of binary code over iconic code, anticipates the convergence which gets underway half a century later. The two separate historical trajectories finally meet. Media and computer — Daguerre's daguerreotype and Babbage's Analytical Engine, the Lumière Cinématographe and Hollerith's tabulator — merge into one. All existing media are translated into numerical data accessible to the computer. The result: graphics, moving images, sounds, shapes, spaces and text become computable, i.e. simply another set of computer data. In short, media becomes new media.

This meeting changes the identity both of media and of the computer itself. No longer just a calculator, a control mechanism or a communication device, the computer becomes a media processor. Before, the computer could read a row of numbers, outputting a statistical result or a gun trajectory. Now it can read pixel values, blurring the image, adjusting its contrast, or checking whether it contains an outline of an object. Building upon these lower-level operations, it can also perform more ambitious ones: searching image databases for images similar in composition or content to an input image; detecting shot changes in a movie; or synthesizing the movie shot itself, complete with setting and actors. In a historical loop, the computer has returned to its origins. No longer just an Analytical Engine, suitable only for crunching numbers, the computer has become Jacquard's loom — a media synthesizer and manipulator.

 

 

Principles of New Media

 

The identity of media has changed even more dramatically. In the following I try to summarize some of the key differences between old and new media. In compiling this list of differences I have tried to arrange them in a logical order: principles 3 and 4 depend on principles 1 and 2. This is not dissimilar to axiomatic logic, where certain axioms are taken as starting points and further theorems are proved on their basis.

1. Discrete representation on different scales.

This principle can be called the "fractal structure of new media." Just as a fractal has the same structure on different scales, a new media object has the same discrete structure throughout. Media elements, be they images, sounds, or shapes, are represented as collections of discrete samples (pixels, polygons, voxels, characters). These elements are assembled into larger-scale objects but continue to maintain their separate identity. Finally, the objects themselves can be combined into even larger objects -- again, without losing their independence. For example, a multimedia "movie" authored in the popular Macromedia Director software may consist of hundreds of images, QuickTime movies, buttons and text elements which are all stored separately and are loaded at run time. These "movies" can be assembled into a larger "movie," and so on.

We can also call this the "modularity principle," by analogy with structured computer programming. Structured programming involves writing small, self-sufficient modules (called, in different computer languages, routines, functions or procedures) which are assembled into larger programs. Many new media objects are in fact computer programs which follow a structured programming style. For example, an interactive multimedia application is typically programmed in Macromedia Director's Lingo language. However, even in the case of new media objects which are not computer programs, an analogy with structured programming can still be made, because their parts can be accessed, modified or substituted without affecting the overall structure.
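The point can be illustrated with a short sketch. The following Python fragment (hypothetical class and element names, not Director's actual Lingo objects) shows a composite "movie" whose elements are stored separately and can be substituted one at a time without touching the rest of the structure:

# A minimal sketch of the modularity principle (hypothetical names): a
# composite "movie" keeps references to independently stored elements,
# so any one of them can be swapped out without affecting the whole.

class MediaElement:
    def __init__(self, name, kind, source_file):
        self.name = name                # e.g. "intro_logo"
        self.kind = kind                # "image", "sound", "text", ...
        self.source_file = source_file  # stored separately, loaded at run time

class Movie:
    def __init__(self):
        self.elements = {}              # elements keep their separate identity

    def add(self, element):
        self.elements[element.name] = element

    def replace(self, name, new_element):
        # Substituting one module leaves every other module untouched.
        self.elements[name] = new_element

movie = Movie()
movie.add(MediaElement("intro_logo", "image", "logo.png"))
movie.add(MediaElement("soundtrack", "sound", "theme.aiff"))

# Later: swap the soundtrack without rebuilding the rest of the movie.
movie.replace("soundtrack", MediaElement("soundtrack", "sound", "remix.aiff"))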

2. Numerical representation. Consequences:

2.1. Media can be described formally (mathematically). For instance, an image or a shape can be described using a mathematical function.

2.2. Media becomes subject to algorithmic manipulation. For instance, by applying appropriate algorithms, we can automatically remove "noise" from a photograph, alter its contrast, locate the edges of shapes, and so on.
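As a minimal illustration of this principle, the following Python sketch treats a hypothetical grayscale image as a plain list of brightness values and alters its contrast with ordinary arithmetic; the image data here is invented for the example:

import math  # not strictly needed; the operation is simple arithmetic

def adjust_contrast(pixels, factor):
    """Scale each pixel's distance from mid-gray (128) by `factor`."""
    adjusted = []
    for p in pixels:
        new_value = 128 + (p - 128) * factor
        adjusted.append(max(0, min(255, int(new_value))))  # clamp to 0-255
    return adjusted

image = [120, 130, 200, 50, 128, 90]      # a tiny grayscale strip
print(adjust_contrast(image, 1.5))        # higher contrast
print(adjust_contrast(image, 0.5))        # lower contrast

Once an image is numbers, "altering its contrast" is nothing more than this kind of calculation applied to every pixel.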

3. Automation.

Discrete representation of information (1) and its numerical coding (2) allow us to automate many operations involved in media creation, manipulation and access. Thus human intentionality can be removed from the creative process, at least in part.

The following are some examples of what can be called "low-level" automation of media creation, in which the computer modifies (i.e., formats) or creates from scratch a media object using templates or simple algorithms. These techniques are robust enough that they are included in most commercial software: image editing, 3-D graphics, word processing, graphic layout. Image editing programs such as Photoshop can automatically correct scanned images, improving contrast range and removing noise. They also come with filters which can automatically modify an image, from creating simple variations of color to changing the whole image as though it were painted by Van Gogh, Seurat or another brand-name artist. Other computer programs can automatically generate 3-D objects such as trees, landscapes and human figures, as well as detailed ready-to-use animations of complex natural phenomena such as fire and waterfalls. In Hollywood films, flocks of birds, ant colonies and even crowds of people are automatically created by AL (artificial life) programs. Word processing, page layout, presentation and Web creation software comes with "agents" which offer to create the layout of a document automatically. Writing software helps the user create literary narratives using highly formalized genre conventions. Finally, in what may be the most familiar experience of automated media generation for most computer users, many Web sites automatically generate Web pages on the fly when the user reaches the site. They assemble the information from databases and format it using templates and scripts.
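A minimal sketch can suggest how such "low-level" automation works. The following Python fragment generates a branching tree as a set of 2-D line segments using a simple recursive rule; it illustrates the general idea of procedural generation only, not the algorithm of any particular commercial package:

# A toy example of automated media creation: a recursive rule "grows" a
# tree as line segments, with no human drawing involved.

import math

def grow_tree(x, y, angle, length, depth, segments):
    """Recursively add branch segments to `segments`."""
    if depth == 0 or length < 1:
        return
    x2 = x + length * math.cos(angle)
    y2 = y + length * math.sin(angle)
    segments.append(((x, y), (x2, y2)))
    # Two child branches, shorter and tilted away from the parent.
    grow_tree(x2, y2, angle + 0.4, length * 0.7, depth - 1, segments)
    grow_tree(x2, y2, angle - 0.4, length * 0.7, depth - 1, segments)

segments = []
grow_tree(0, 0, math.pi / 2, 100, depth=8, segments=segments)
print(f"{len(segments)} branch segments generated automatically")

A few dozen lines of this kind, elaborated with randomness and 3-D geometry, are enough to populate a scene with trees that no one has modeled by hand.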

Researchers are also working on what can be called "high-level" automation of media creation, which requires a computer to understand, to a certain degree, the meanings embedded in the objects being generated, i.e. their semantics. This research can be seen as part of the larger initiative of artificial intelligence (AI). As is well known, the AI project has achieved only very limited success since its beginnings in the 1950s. Correspondingly, work on media generation which requires an understanding of semantics is also still at the research stage and is rarely included in commercial software. Beginning already in the 1970s, computers were often used to generate poetry and fiction. In the 1990s, the users of Internet chat rooms became familiar with bots -- computer programs which simulate human conversation. Meanwhile, researchers at New York University demonstrated systems which allow the user to interact with a "virtual theatre" composed of a few "virtual actors" which adjust their behavior in real time. Researchers at the MIT Media Lab demonstrated a "smart camera" which can automatically follow the action and frame shots given a script. Another Media Lab project was ALIVE, a virtual environment in which the user interacted with animated characters. Finally, the Media Lab also showed a number of versions of a new kind of human-computer interface in which the computer presents itself to the user as an animated talking character. The character, generated by the computer in real time, communicates with the user in natural language; it also tries to guess the user's emotional state and to adjust the style of interaction accordingly.

The area of new media where the average computer user encountered AI in the 1990s was not, however, the human-computer interface but computer games. Almost every commercial game includes a component called an AI engine: the part of the game's computer code which controls its characters, such as the car drivers in a racing simulation, the enemy forces in a strategy game such as Command and Conquer, or the single enemies which keep attacking the user in first-person shooters such as Quake. AI engines use a variety of approaches to simulate intelligence, from rule-based systems to neural networks. The characters they create are not really all that intelligent. Like AI expert systems, these computer-driven characters have expertise in some well-defined area, such as attacking the user. And because computer games are highly codified and rule-based, and because they severely limit the possible behaviors of the user, these characters function very effectively. To that extent, every computer game can be thought of as another version of the competition between a human chess player and a computer opponent. For instance, in a martial arts fighting game, I can't ask questions of my opponent, nor do I expect him to start a conversation with me. All I can do is "attack" him by pressing a few buttons; and within this severely limited communication bandwidth the computer can "fight" me back very effectively. In short, computer characters can display intelligence and skills only because they put severe limits on our possible interactions with them. So, to use another example, I once played against both human and computer-controlled characters in a VR simulation of a non-existent sport. All my opponents appeared as simple blobs covering a few pixels of my VR display; at this resolution, it made absolutely no difference who was human and who was not. Computers can pretend to be intelligent only by tricking us into using a very small part of who we are when we communicate with them.
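A toy example makes the point concrete. The following Python sketch is a hypothetical rule-based "AI engine" for a fighting-game opponent (invented rules, not the code of any actual game); the character seems intelligent only because the player's possible moves are so limited:

# A minimal rule-based opponent: a handful of fixed rules is enough,
# because the player can only move and press an attack button.

def choose_move(opponent_distance, my_health, opponent_is_attacking):
    """Pick the computer character's next move from a small fixed repertoire."""
    if opponent_is_attacking and my_health < 30:
        return "block"
    if opponent_distance > 5:
        return "approach"
    if opponent_is_attacking:
        return "counter"
    return "attack"

# One step of the game loop: within this narrow bandwidth the character
# "fights back" effectively.
print(choose_move(opponent_distance=2, my_health=80, opponent_is_attacking=True))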

Along with "low-level" and "high-level" automation of media creation, another area of media use which is being subjected to increasing automation is media access. The switch to computers as a means to store and access enormous amounts of media material, exemplified by the "media assets" distributed across numerous Web sites on the Internet, creates the need for more efficient ways to classify and search media objects. Word processors and other text management software have long provided the ability to search for specific strings of text and to automatically index documents. In the 1990s software designers started to provide media users with similar abilities. Virage introduced its VIR Image Engine, which allows the user to search for visually similar images among millions of images, as well as a set of video search tools to allow indexing and searching of video files. By the end of the 1990s, the key Web search engines already included options to search the Internet by specific media type, such as images, video and audio.
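The general idea behind such content-based search can be sketched very simply. The following Python fragment (an illustration of the principle only, not Virage's actual engine) reduces each image to a brightness histogram and returns the stored image whose histogram is closest to that of a query image; the image data and file names are invented:

# Content-based image search in miniature: compare images by the
# distribution of their pixel brightness values.

def histogram(pixels, bins=8):
    """Count how many pixels (0-255) fall into each brightness band."""
    counts = [0] * bins
    for p in pixels:
        counts[min(p * bins // 256, bins - 1)] += 1
    total = len(pixels)
    return [c / total for c in counts]   # normalize so any image sizes compare

def distance(h1, h2):
    return sum(abs(a - b) for a, b in zip(h1, h2))

def most_similar(query_pixels, database):
    query_h = histogram(query_pixels)
    return min(database, key=lambda name: distance(query_h, histogram(database[name])))

database = {
    "sunset.jpg": [200, 180, 220, 250, 190, 210],   # mostly bright pixels
    "night.jpg":  [10, 20, 5, 40, 15, 30],          # mostly dark pixels
}
print(most_similar([230, 240, 210, 205, 195, 220], database))   # -> "sunset.jpg"

Real systems use far richer features (color, texture, composition), but the logic is the same: media reduced to numbers can be indexed and retrieved by calculation.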

The Internet also crystallized the basic condition of the new information society: an over-abundance of information of all kinds. One response was the popular idea of "agent" software. Some agents are supposed to act as filters which deliver small amounts of information given the user's criteria. Others allow users to tap into the expertise of other users, following their selections and choices. For example, the MIT Software Agents Group developed such agents as BUZZwatch, which "distills and tracks trends, themes, and topics within collections of texts across time" such as Internet discussions and Web pages; Letizia, "a user interface agent that assists a user browsing the World Wide Web by… scouting ahead from the user's current position to find Web pages of possible interest"; and Footprints, which "uses information left by other people to help you find your way around."

At the end of the twentieth century, the problem was no longer how to create a new media object such as an image; the new problem was how to find an object which already exists somewhere. That is, if you want a particular image, chances are it already exists somewhere, but it may be easier to create one from scratch than to find the one already stored. Historically, we first developed technologies which automated media construction: the photo camera, the film camera, the tape recorder, the video recorder, etc. These technologies allowed us, over the course of about one hundred and fifty years, to accumulate an unprecedented amount of media material: photo archives, film libraries, audio archives. This then led to the next stage in media evolution: the need for technologies to store, organize and efficiently access these media. The computer provided a basis for these new technologies: digital archives of media; hyperlinking, hierarchical file systems and other ways of indexing digital material; and software for content-based search and retrieval. Thus the automation of media access is the next logical stage of a process which was already put into motion when the first photograph was taken.

4. Variability: a new media object (such as a Web site) is not something fixed once and for all but can exist in different (potentially infinite) versions. This is another consequence of discrete representation of information (1) and its numerical coding (2).

Old media involved a human creator who manually assembled textual, visual or audio elements (or their combination) into a particular sequence. This sequence was stored in some material, its order determined once and for all. Numerous copies could be run off from the master, and, in perfect correspondence with the logic of an industrial society, they were all identical. New media, in contrast, is characterized by variability.

Stored digitally, rather than in some permanent material, media elements maintain their separate identity and can be assembled into numerous sequences under program control. At the same time, because the elements themselves are broken into discrete samples (for instance, an image is represented as an array of pixels), they can be also created and customized on the fly.

The logic of new media thus corresponds to the post-industrial logic of "production on demand" and "just in time" delivery, which themselves were made possible by the use of digital computers and computer networks in all stages of manufacturing and distribution. Here the "culture industry" is actually ahead of the rest of industry. The idea that a customer determines the exact features of her car at the showroom, the data is transmitted to the factory, and hours later the new car is delivered remains a dream; but in the case of computer media, it is a reality. Since the same machine is used as both showroom and factory, and since the media exists not as a material object but as data which can be sent through the wires at the speed of light, the response is immediate.

Here are some particular cases of the variability principle:

4.1. Media elements are stored in a media database; a variety of end-user objects, which vary in resolution, form and content, can be generated from this database, either beforehand or on demand.

4.2. It becomes possible to separate the levels of "content" (data) and interface. A number of different interfaces can be created to the same data. A new media object can be defined as one or more interfaces to a multimedia database.

4.3. Information about the user can be used by a computer program to automatically customize the media composition as well as to create the elements themselves. Examples: Web sites use information about the user's hardware, browser type or network address to automatically customize the site which the user will see; interactive computer installations use information about the user's body movements to generate sounds and shapes, or to control the behavior of artificial creatures. (See the sketch after this list.)

4.4. A particular case of 4.3 is branching-type interactivity. (It is also sometimes called menu-based interactivity.) The program presents the user with choices and lets her pick. In this case the information used by the program is the output of the user's cognitive process (rather than her network address or body position).

4.5. Hypermedia: the multimedia elements making up a document are connected through hyperlinks. Thus the elements and the structure are separate rather than hard-wired as in traditional media. By following the links the user retrieves a particular version of the document. (The World Wide Web is a particular implementation of hypermedia in which the elements are distributed throughout the network.)
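As a minimal sketch of cases 4.2 and 4.3, the following Python fragment (with invented data and device names) generates different versions of the same "site" on the fly from a single database, depending on information about the user's device:

# The variability principle in miniature: one database of "content",
# several interfaces to it, chosen automatically per user.

articles = [
    {"title": "How Media Became New", "body": "On August 19, 1839..."},
    {"title": "Principles of New Media", "body": "The identity of media..."},
]

def render(articles, device):
    """Generate a version of the 'site' on the fly for a particular visitor."""
    if device == "text-only":
        return "\n".join(a["title"] for a in articles)   # headlines only
    pages = []
    for a in articles:
        pages.append(f"<h1>{a['title']}</h1>\n<p>{a['body']}</p>")
    return "\n".join(pages)

# Two visitors, two different versions generated from the same database.
print(render(articles, device="text-only"))
print(render(articles, device="graphical-browser"))

The "content" is stored once; what each user receives is assembled under program control at the moment of access.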

Of these four principles, the principle of variability may be the most interesting. On the one hand, such popular new media forms as branching-type interactivity and hypermedia can be seen as particular instances of the variability principle. On the other hand, this principle demonstrates how changes in media technologies are closely tied to changes in social organization. Just as the logic of old media corresponded to the logic of industrial mass society, the logic of new media fits the logic of the post-industrial society of personal variability. In industrial mass society everybody was supposed to enjoy the same goods -- and to hold the same beliefs. This was also the logic of media technology. A media object was assembled in a media factory (such as a Hollywood studio). Millions of identical copies were produced from a master and distributed to all the citizens. Broadcasting, film distribution and print technologies all followed this logic.

In a post-industrial society, every citizen can construct her own custom lifestyle and "select" her ideology from a large (but not infinite) number of choices. Rather than pushing the same objects and information to a mass audience, marketing tries to target each individual separately. The logic of new media technology reflects this new condition perfectly. Every visitor to a Web site automatically gets her own custom version of the site, created on the fly from a database. Every hypertext reader gets her own version of the text. Every viewer of an interactive installation gets her own version of the work. And so on. In this way new media technology acts as the most perfect realization of the utopia of an ideal society composed of unique individuals. New media objects assure users that their choices — and therefore their underlying thoughts and desires — are unique, rather than pre-programmed and shared with others. As though trying to compensate for their earlier role in making us all the same, the descendants of Jacquard's loom, the Hollerith tabulator and Zuse's cinema-computer are now working to convince us that we are all different.