THE AGE OF INTELLIGENT MACHINES | Artificial Intelligence and Musical Composition
September 24, 2001
Author: Charles Ames
Consider the young music student who wants to learn about chord progressions. Traditionally, he would be required to try out some progressions of blocked chords on the piano. But if keyboard skills were lacking, the student would have to put off learning about progressions for several years until these skills had been acquired. With an appropriately designed expert music-tutoring program, such a student might type a harmonic plan into a computer and have the program quickly derive a full-blown realization with rhythmic chords, a bass line, even a drum part.
Consider the musical theorist who desires a truly empirical means of testing assumptions about how music is made. Whereas in the past, musical theories have as often as not been accepted on the basis of intrinsic logical elegance, AI techniques now make it possible to implement a theory as a program whose products may be evaluated empirically against authentic musical examples.
Consider also the composer of soundtracks for recorded dramas, documentaries, and advertisements. It is already common to rely heavily on standard background patterns when producing large volumes of precisely timed and thematically consistent music in the face of production deadlines. It is very likely that tomorrow’s professional composers will be relying even more heavily on software modules for accompaniments and perhaps even for leads.
For the art-music composer, artificial intelligence provides a mechanical extension to human intellect. For example, a composer might wish to set up criteria for comparing solutions to a compositional problem (e.g., the problem of choosing pitches for a melody) and then let the computer evaluate every possible solution in order to choose the best one.
Perhaps the most radical potential for automated composition resides in the capability of a single program to generate many distinct pieces of music. One might think of such a program as a new kind of “record” that plays a different piece every time. The possibilities range from the mundane, such as computer-composed music for fast-food restaurants and supermarkets, to the sublime. We have now seen the appearance of a new type of composer: the metacomposer, who shapes compositional principles as fluently as traditional composers have shaped compositional material. For such artists the originality of the music will clearly depend on the originality and subtlety of the programming.
The situations described above are not at all speculative. All of them can become realities within five years if developers are willing to expend the necessary energy to make them happen. Indeed, successes already achieved indicate that creativity can be modeled with much greater precision than conventional wisdom once suggested. A number of programs of my own are already producing credible emulations of human-composed music in popular styles, and a program by Kemal Ebcioglu for harmonizing chorales in the style of J. S. Bach has managed on occasion to duplicate Bach’s solutions exactly.1 Although the styles in all of the current programs have been rather narrowly defined, the insights they provide lead toward ever-more-general representations of compositional procedure.
A Survey of Decision-Making Tactics and Compositional Strategies
From their beginnings in the mid-1950s up until a few years ago, programs for automatic musical composition have been developed by a handful of musicians working in relative isolation from the mainstream AI movement and, for that matter, from other computer-music fields such as digital sound synthesis.2 This isolation has been harmful to the extent that most early practitioners were ignorant of recursion, linked data structures, and basic search methods. Yet it has also led to some unique approaches that might have been passed over by a more orthodox methodology.
Statistics
Up to the late 1960s the most pervasive decision-making tactics were statistical. Statistics provided an economical way of representing trends of musical behavior from short-term dependencies (e.g., how often G7 chords are followed by C major chords, by A minor chords, etc.) to long-term distributions of material (e.g., the relative usage of scale degrees). A statistical composing program from these early years selected options randomly; the options assigned the greatest statistical weight by the programmer had the greatest probability of selection.3
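As a rough illustration of this kind of weighted random selection, the following Python sketch chooses each chord according to programmer-assigned weights that depend on the preceding chord. The chord names, the weights, and the function names are hypothetical, chosen only for the example; they are not taken from any of the early programs.

```python
import random

# Hypothetical weights: how strongly each chord is favored after the current one.
# A larger weight means a greater probability of selection.
TRANSITIONS = {
    "C":  {"F": 3, "G7": 4, "Am": 2},
    "F":  {"C": 3, "G7": 4},
    "G7": {"C": 6, "Am": 2},
    "Am": {"F": 3, "G7": 2},
}

def next_chord(current, rng):
    """Pick a successor chord at random, in proportion to its weight."""
    options = TRANSITIONS[current]
    chords = list(options)
    weights = [options[chord] for chord in chords]
    return rng.choices(chords, weights=weights, k=1)[0]

def progression(start, length, rng=None):
    """Build a progression left to right, each choice conditioned on the last."""
    rng = rng or random.Random()
    chords = [start]
    while len(chords) < length:
        chords.append(next_chord(chords[-1], rng))
    return chords

if __name__ == "__main__":
    print(progression("C", 8))
```

Because each choice depends only on the chord just before it, the sketch captures short-term dependencies but has no way to reject a poor result, which is the limitation the later search-based programs address.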
The basic compositional strategy for these programs was “left-to-right,” with some “top-down” influences. Wholly left-to-right programs selected musical details directly as they were to appear in the music, conditioning later decisions upon earlier choices. Many programs had top-down influences to the extent that they divided the music into sections and then chose the details of each section in a left-to-right manner. Each new section was distinguished by unique parameters that affected the statistical makeup of these details. One of the composer-programmers of this period even interposed subsections between sections and details.
Courtesy of Kemal Ebcioglu
Kemal Ebcioglu’s 1984 program for harmonizing chorales guided its left-to-right note selection with top-down planning of harmonic goals and subgoals. For a few chorale tunes Ebcioglu’s program managed to duplicate Bach’s own solutions exactly. However, the solution shown here is more typical.
From Charles Ames, “Crystals: Recursive Structures in Automated Composition,” Computer Music Journal 6, no. 3. Courtesy of Charles Ames
Top-down productions begin with a very general archetype of a musical form and recursively elaborate upon this archetype until a complete description of the musical details has been obtained.
Rote Processing
When computers went online during the 1970s, a number of musicians took advantage of the new technology to develop programs that would process musical information either in real time or within the context of an interactive dialog. The desire for rapid interaction led to an emphasis on procedures simple enough to respond at the touch of a button, and especially to an emphasis on rote operations on themes. Among the most familiar of these operations are those that come from traditional canons and fugues. They include transposition (playing a theme with every note shifted equally up or down in pitch), inversion (playing a theme upside down), retrograde (playing a theme backward), and diminution (decreasing the note lengths in a theme so that it goes by twice as quickly). The user of one of these programs would typically build a composition using a “bottom-up” strategy: he could either enter an originally composed theme or have one randomly generated; he might then derive variations on this theme using one or more operations; and he could cut up, paste together, and edit this thematic material in various ways until a whole composition had been produced.
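The four thematic operations named above are simple enough to sketch in a few lines of Python. The representation is an assumption made for the example: a theme is a list of (pitch, duration) pairs, with pitch as a MIDI note number and duration in beats. Mirroring the inversion around the theme's first note is likewise only one possible choice of axis.

```python
def transpose(theme, interval):
    """Shift every note by the same number of semitones."""
    return [(pitch + interval, dur) for pitch, dur in theme]

def invert(theme):
    """Turn the theme upside down, mirroring around its first pitch (assumed axis)."""
    axis = theme[0][0]
    return [(2 * axis - pitch, dur) for pitch, dur in theme]

def retrograde(theme):
    """Play the theme backward."""
    return list(reversed(theme))

def diminish(theme, factor=2):
    """Shorten every note so the theme goes by `factor` times as quickly."""
    return [(pitch, dur / factor) for pitch, dur in theme]

if __name__ == "__main__":
    theme = [(60, 1.0), (62, 0.5), (64, 0.5), (67, 2.0)]  # C, D, E, G
    # Bottom-up assembly: paste the theme together with two derived variations.
    piece = theme + transpose(retrograde(theme), 7) + diminish(invert(theme))
    print(piece)
```

Each operation is a fixed transformation with no judgment involved, which is why such procedures could respond immediately in an interactive setting.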
Searches
Intelligent composing programs such as Ebcioglu’s and my own are distinguished from statistical and rote programs by their ability to discriminate among solutions to musical problems. The secret behind this ability lies in constrained-search techniques drawn directly from AI.4 As an illustration of how a constrained search works, consider the problem of composing a melody when the rhythm, range, harmonic context, and style are given. Solving this kind of problem means choosing a pitch for each note in a way that conforms to the given style. Yet although one might have a very good idea how a melody in this style should sound, that’s not sufficient understanding to develop a program. One must be able to describe the style in terms meaningful to a computer.
To do this, the programmer needs to make some general observations concerning how pitches behave in the style. He might observe, for example, that the melodies never move in parallel octaves or fifths with the bass, that they always employ chord roots at cadence points, that nonchord tones always resolve by step to chord tones, and that dissonant chord tones (e.g., chord sevenths) always resolve downward by step in the next chord. The programmer might observe as well that scalewise motion is preferable to leaps, that leading tones tend to lead upward, and that the different scale degrees are usually in balance. Observations characterized by such words as “never” and “always” can be implemented directly as constraints. Constraints keep a search within the musical ballpark. Though many are drawn from the rules of musical pedagogy, they should in no sense be taken as standards of “good music.” If a melody violates a constraint, we cannot say it is wrong, only that it is out of style. The basic mechanism for composing a melody subject to constraints is as follows: For each note, the search steps through the available pitches until it finds one conforming to all the constraints. Whenever it finds an acceptable pitch, it moves forward to an uncomposed note; whenever it exhausts all the pitches available to a given note, it backtracks to the most recent composed note that caused a conflict, revises this note’s pitch, and begins working its way forward again.
From Charles Ames, “Protocol: Motivation, Design, and Production of a Composition for Solo Piano,” Interface: Journal of New Music Research 11, no. 4: 226
In one stage of producing Charles Ames’s Protocol for solo piano, the composer decided that groupings of chords (themselves computer-composed) would be most appropriate to the musical spirit of the piece when constituent pairs of chords shared few degrees of the chromatic scale in common. The program scored the left group less highly than the right one because chords 7 and 13 on the left share five out of seven degrees.
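The search mechanism described in the text (step through the pitches, move forward on success, retreat when the options run out) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a reconstruction of any actual composing program: the three constraints, the C major pitch set, and all names are hypothetical. The sketch also simplifies the backtracking step by retreating one note at a time rather than jumping straight to the note that caused the conflict.

```python
SCALE = [60, 62, 64, 65, 67, 69, 71, 72]  # C major, one octave (MIDI numbers)

# Each hypothetical constraint inspects the partial melody and the planned length.
def no_large_leaps(melody, total):
    """Consecutive notes never differ by more than a major third."""
    return len(melody) < 2 or abs(melody[-1] - melody[-2]) <= 4

def no_repeated_pitch(melody, total):
    """The same pitch is never sounded twice in a row."""
    return len(melody) < 2 or melody[-1] != melody[-2]

def ends_on_root(melody, total):
    """The final note is always the chord root (assumed here to be C)."""
    return len(melody) < total or melody[-1] % 12 == 0

RULES = [no_large_leaps, no_repeated_pitch, ends_on_root]

def compose(num_notes, pitches, constraints):
    """Return a melody satisfying every constraint, or None if none exists."""
    melody = []

    def extend():
        if len(melody) == num_notes:
            return True                      # every note has been composed
        for pitch in pitches:                # step through the available pitches
            melody.append(pitch)
            if all(rule(melody, num_notes) for rule in constraints) and extend():
                return True                  # acceptable: keep it and move on
            melody.pop()                     # conflict or dead end: try the next
        return False                         # options exhausted: backtrack

    return melody if extend() else None

if __name__ == "__main__":
    print(compose(5, SCALE, RULES))
```

When the pitches for a note are exhausted, `extend` returns `False` to its caller, which then revises the preceding note. That return is the backtracking step, and it is what lets the search recover from early choices that make a later constraint impossible to meet.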
Observations characterized by words such as “tend,” “preferable,” and “usually” generally cannot be implemented as hard-and-fast constraints. The only alternative is to bias the search toward solutions with more preferable attributes. Should the programmer wish to encourage scalewise motion, for example, the search might be designed to try neighboring pitches before others. If such motion is more critical at the end of a phrase than at the beginning, it may be more effective to start at the end of the phrase and work backward; this will prevent early phrase-notes from forming a context that precludes scalewise motion later on. The official AI jargon for this kind of procedural biasing is “heuristic programming.”
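One way to picture this procedural biasing is to leave the constraints alone and change only the order in which the search tries its options. The Python sketch below, with hypothetical names and a deliberately trivial constraint, tries the pitches nearest the previous note first, so that stepwise solutions are found before leaping ones. It is an assumption-laden illustration of the idea, not a description of any particular program.

```python
SCALE = [60, 62, 64, 65, 67, 69, 71, 72]  # C major, one octave (MIDI numbers)

def order_by_proximity(previous, pitches):
    """Preference: try pitches closest to the previous note before others."""
    if previous is None:
        return list(pitches)
    return sorted(pitches, key=lambda pitch: abs(pitch - previous))

def no_repeats(melody):
    """Hard constraint (hypothetical): never repeat a pitch immediately."""
    return len(melody) < 2 or melody[-1] != melody[-2]

def compose(num_notes, pitches, acceptable, prefer):
    """Backtracking search whose option order is set by the `prefer` heuristic."""
    melody = []

    def extend():
        if len(melody) == num_notes:
            return True
        previous = melody[-1] if melody else None
        for pitch in prefer(previous, pitches):   # most preferable options first
            melody.append(pitch)
            if acceptable(melody) and extend():
                return True
            melody.pop()
        return False

    return melody if extend() else None

if __name__ == "__main__":
    print(compose(6, SCALE, no_repeats, order_by_proximity))
```

The preference never forbids a leap; it only postpones leaps until the nearer pitches have failed, which is the difference between a bias and a constraint.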
The abilities to seek the more preferable solutions, to apply constraints, and to backtrack in the face of an impasse are the basic advantages that constrained searches have over statistical and rote procedures. The method applies just as well to musical problems radically different from melody writing, such as top-down generation of musical details from forms. Searches are a complement to, not a substitute for, statistical and rote procedures. If the programmer desires to maintain a sense of unpredictability within given stylistic bounds, then randomness can be incorporated into the mechanism by which the search assigns priorities to options. If long-term statistical distributions are of concern, then a search can be designed to favor options that best conform to these distributions. Finally, if rote thematicism is desired, then themes generated by such operations as the ones described above can themselves be treated as options by searches evaluating many different thematic variations to choose the best one for the context.
One trade-off with searches is that their ability to backtrack renders them compositional rather than improvisational. A composer in a pinch can throw things out, but an improvisor cannot turn back the clock on music that has already been played. As a result, it is impractical to implement search-driven compositional procedures in real time. Another trade-off is that one must be willing to interact with a search-driven composing program on its own terms. Interactive feedback from a search consists at present of three phases: setting up constraints and preferences, leaving the search to do its thing (searches are most effective when left to run with a minimum of interference from the human user), and accepting or rejecting the results. If the results are unsatisfactory, one might make adjustments in the constraints and preferences before running the search again.
The entry of AI into musical creativity is by no means as radical as it might seem to laymen in view of the active and long-standing tradition of composer involvement with technical theory about music. This tradition reaches from Pythagoras and Aristoxenus of antiquity, through numerous medieval writers, through such Renaissance theorists as Gioseffe Zarlino and Thomas Morley, through the Baroque composer Jean-Philippe Rameau, through more recent composers as diverse as Arnold Schoenberg, Henry Cowell, Paul Hindemith, Harry Partch, and Joseph Schillinger to contemporaries such as Pierre Boulez and Iannis Xenakis. Each has built stylistic models from constraints, preferences, and procedural descriptions of the act of making a composition, not as a way of codifying what went on in the past, but as an intellectual aid to his craft. However, the ability of AI programs to generate actual music from such models brings them out of the speculative realm and makes them accessible to all: to composers seeking an augmented working environment in which the content, form, and style of entire compositions can be adjusted at the touch of a button; to theorists wishing to determine where the strengths and weaknesses of new models lie; and to music students seeking expert assistance in the realization of their musical projects.
1. Ebcioglu’s results have great implications for pedagogy, since Bach’s chorales are the paradigm for traditional studies in musical harmony.
2. Programs for automatic music composition began in 1956 with Hiller and Isaacson’s Illiac Suite, created slightly later than the first chess-playing and theorem-proving programs. An exhaustive survey of composing programs, replete with names, musical examples, and bibliography, is available in Charles Ames, “Automated Composition in Retrospect: 1956-1986,” Leonardo: Journal of the International Society for Science, Technology and the Arts, 1987.
3. These statistical composing programs anticipated by some 20 years the current AI fashion for “fuzzy logic.”