“Boilerplate” consists of standardized terms whose meaning is intended to be consistent from one transaction to the next, and these provisions are ubiquitous in contracts and related transactional documents. In their recent Duke Law Journal article Stephen Choi, Mitu Gulati, and Robert Scott have highlighted the potentially corrosive effect of the legal drafting process on boilerplate provisions. They show how incremental edits to boilerplate pari passu clauses for sovereign debt agreements have led to textual “black holes,” which potentially undercut the standardization purpose, wording, and substantive meaning of these boilerplate provisions. In this Article we offer preliminary evidence of a similar textual “black hole” phenomenon taking place in the mergers and acquisitions context.
We show that the mergers and acquisitions context epitomizes the problem of unreflective copying of precedent provisions combined with ad hoc edits to individual clauses, which erode the textual integrity and meaning of boilerplate provisions. Each agreement is based on a prior deal precedent, and drafters frequently incorporate sections of the prior deal without sufficiently scrutinizing whether the precedent document contains idiosyncratic novelties that are inapplicable to the new deal. At the same time, high levels of “editorial churning” take place in the process of transforming each precedent into the current acquisition agreement. The result is a problem of “drafting drift.” Boilerplate provisions live on from deal to deal, yet gradually shed their textual integrity and potentially lose their clear meaning as ad hoc edits are copied from deal to deal and new ad hoc edits are added at each stage.
We show how it is possible to identify the paragraphs of acquisition agreements which serve as boilerplate and to document both the degree and type of textual “drift” of these provisions over multiple generations. We construct “family trees” for boilerplate provisions by tracing the ancestors of each provision backwards in a linear way to each prior precedent. Then we reverse the process to show how ancestor provisions have progeny extending out in multiple directions which become increasingly dissimilar to their original ancestor and to each other over a few generations of acquisition agreements.
Our study shows that incremental changes in boilerplate from one generation to the next foster rapid “speciation” of the terms. Small additions and deletions from boilerplate text lead to significant cumulative effects over multiple generations. We demonstrate that this textual “drift” takes place not only within individual boilerplate lineages, but even more broadly among boilerplate provisions that have a common ancestor precedent yet evolve separately along different lineages of precedents. Like the Big Bang, the heterogeneity of boilerplate text appears to increase in all directions, which supports an “expanding universe” theory for boilerplate that undermines the textual integrity and the meaning of boilerplate terms. While we will expand on the quantitative and qualitative analysis of the evolution of boilerplate in a future work, the preliminary evidence presented in this paper reinforces the case for the textual “black hole” theory.
The use of boilerplate terms heightens legal certainty, drafting efficiency, and the universality of provisions by providing uniform language whose meaning has stood the test of time. The challenge is that legal drafters frequently compose transactional documents that are neither completely negotiated nor completely standardized. Lawyers routinely recycle boilerplate provisions from earlier precedents. But instead of adhering to boilerplate language, lawyers often appear to engage in idiosyncratic edits that gradually transform the text of ostensibly standardized language.
In their recent Duke Law Journal article, Stephen Choi, Mitu Gulati, and Robert Scott used the sovereign debt agreement context to show how over time textual “black holes” have developed as edits aggregate from deal to deal and warp the textual integrity and meaning of boilerplate provisions. In this piece, we provide preliminary evidence that the “black hole” drafting pathology identified by Choi, Gulati, and Scott extends beyond the sovereign debt context and also characterizes the drafting process in mergers and acquisitions (M&A) transactions.
In our recent article, The Inefficient Evolution of Merger Agreements, we show how M&A agreements combine elements of standardization with high levels of “editorial churning,” ad hoc edits that appear to be cosmetic rather than substantive. This combination fosters high levels of “speciation” among merger agreements, which causes agreements as a whole (at the “macro” level) to bear little similarity to their precedent progenitors even over a few generations. In a subsequent article we applied those insights about macro-level agreement drift to suggest pathways to greater efficiency in M&A drafting. That work only looked at agreements as a whole, potentially overlooking evolution in individual paragraphs and sections of boilerplate text (at the “micro” level).
In this Article we shift our focus to the micro-level of boilerplate clauses of M&A agreements to examine how editorial churning affects the drafting process. We show how cosmetic edits rapidly accumulate over time and distort the form of boilerplate provisions in M&A agreements, which we would expect to change rarely, if ever, from one deal to the next. Our preliminary findings in the M&A context support the textual “black hole” thesis. Lawyers’ “rote usage” of boilerplate without examination of the terms, coupled with “encrustation,” the retention of idiosyncratic textual variations, have undercut the meaning of boilerplate provisions and created the preconditions necessary for textual “black holes.”
What distinguishes our approach from that of Choi, Gulati, and Scott is the scale and method of analysis that we use to analyze this drafting pathology in the distinctive M&A context. M&A is different from most other contractual settings because of the extent of its artisanal drafting and lack of standardization. This fact makes analyzing the evolution of M&A boilerplate important, since these provisions serve as some of the few sources of standardization. We leverage our data set of over 12,000 public company merger agreements from 1994 to 2014 to create a comprehensive picture of the evolution of boilerplate provisions over time. We use a computer program to identify and analyze the word-for-word differences between boilerplate provisions. This approach allows us to measure the degree of textual similarity or dissimilarity based on the number of insertions and deletions (i.e., edits) in boilerplate provisions across agreements.
We show how it is possible to identify the paragraphs of acquisition agreements which serve as boilerplate and demonstrate both the degree and type of textual “drift” of these provisions over multiple generations. We construct “family trees” for boilerplate provisions by tracing the descendants of each “ancestor” provision. We show that common ancestors have progeny extending out in multiple directions which become increasingly dissimilar to each other over a few generations of acquisition agreements. This textual “drift” takes place within individual boilerplate lineages. The textual “Big Bang” effect is even more pronounced for boilerplate provisions that have a common ancestor precedent, but evolve separately along different lineages of precedents. We also show spatially that the pattern of boilerplate “speciation” underscores the high impact of editorial churning in undercutting standardization of boilerplate.
Our preliminary findings suggest that the macro-problem of acquisition agreement “speciation” takes place throughout the micro-level of boilerplate provisions. Lawyers appear to recycle boilerplate without giving adequate thought to the meaning of this language or the impact of editorial changes. The result is “drafting drift.” Boilerplate provisions live on from deal to deal, yet gradually shed their textual integrity and potentially lose their clear meaning as ad hoc edits are copied from deal to deal and new ad hoc edits are added at each stage. The random variations that the “encrustation” and “abrasion” of boilerplate text introduce in the drafting process appear to be even more severe in the merger agreement context than in other contractual settings, leading to rapid drift away from the original boilerplate.
Part I lays out our data and methodology and delineates the distinctive challenges of identifying boilerplate in non-standardized documents. Part II provides empirical evidence substantiating both the high degree of textual drift within lineages of boilerplate and the even more extensive drift between the divergent branches of boilerplate with a common precedent ancestor. Part III discusses some of the implications and shortcomings of this study that we will address in greater detail in a future work.
This Article builds on our larger project of systematically examining the evolution of public company merger agreements and exposing the high degree of editorial churning. In our earlier article, we documented the extensive “drift” in merger agreements over time as precedents are used, edited, and reused in deals. The Securities and Exchange Commission (SEC) mandates disclosure of public company acquisition agreements, which provide a window into the end product of lawyering that the public often is unable to see in other areas of transactional law. The challenge, however, is that no one outside of the drafting deliberations can witness the process that leads to the formation of acquisition agreements. Ironically, even the lawyers involved in any given transaction may not necessarily appreciate the full implications of the drafting give and take on the substance of the legal text. The myriad lawyers involved in drafting, the scale of the edits, and the compressed time period of drafting mean that no one involved in the transaction may be positioned to scrutinize the full extent and potential impact of textual changes.
We seek to reverse engineer the drafting process to identify potential inefficiencies and textual distortions by analyzing the evolution of public company acquisition agreement provisions. In our previous study we leveraged SEC-mandated disclosures to compile a dataset of public company mergers from 1994 to 2014, which covers over 12,000 agreements. In this Article we use this dataset to examine micro-level evolution of deal terms. Acquisition agreements are so complex, and the legal stakes so high, that nearly every public company merger agreement is based on an earlier acquisition agreement that serves as its precedent. We use computer textual analysis tools to show how it is possible to identify the precedent which serves as the template for the drafting of each deal. We leverage computer technology to lift the veil on the drafting process by showing how agreements are created and how both documents as a whole and individual provisions change in incremental ways over time.
By identifying the precedent for each deal, we are able to pinpoint the exact edits from one deal to the next. We can analyze the overall extent of edits to highlight potential churning as well as identify individual changes within particular provisions. Additionally, we can assess the degree to which a particular section of the agreement remains consistent from deal to deal (e.g. a truly boilerplate provision) or is the focal point of drafting activity.
This approach allows us to show empirically that a high level of “editorial churning” takes place as merger agreements appear riddled with edits that are cosmetic and unnecessary. The drafting of every acquisition agreement necessarily entails deal-specific edits and reflects a fusion of the vision for the agreement from both parties as they seek to frame or reframe terms to their advantage. Additionally, innovations are taking place in acquisition agreements in a more episodic fashion in response to exogenous events. But our analysis found that over half of the text of merger agreements is routinely rewritten from one deal to the next, suggesting that there is a high level of inefficiency in the precedent selection and drafting process that cannot be explained away in terms of substantive changes in acquisition agreements.
Our initial study demonstrated that public merger agreement terms are not based on a common “form” agreement, but rather are the product of a highly path-dependent “evolution” over many generations. This point is true even within large law firms where drafts are based on prior agreements rather than standardized form language. The absence of even firm-specific forms has led to haphazard and inconsistent lawyering, as lawyers add significant amounts of deal-specific edits to each deal and inadvertently retain deal-specific information from prior deals.
Our initial paper reflected a macro-view of editorial churning in assessing the extent of word changes from precedent to the final deal in each of the 12,000 agreements. However, we did not engage in fine-grained analysis of particular provisions to test our hypothesis of drafting inefficiency. For this reason our original paper did not directly support the “black hole” theory. Although we found extensive editing between precedent and final draft, it is possible that legal drafters were simply integrating paragraphs or whole sections of text as they engaged in innovative lawyering.
In this paper we address this limitation of our prior study by examining the extent of changes in individual boilerplate provisions from deal to deal. Our preliminary findings are that similar editorial churning and drift are apparent when provisions are examined on a clause-by-clause basis as well as when agreements are examined as a whole. Our data shows that boilerplate provisions, like virtually every other part of acquisition agreements, are drifting over time due to incremental changes in each agreement which have cumulative effects over multiple generations. Haphazard editing takes place throughout virtually every part of acquisition agreements; it afflicts even ostensibly standardized boilerplate language and potentially erodes its text and meaning.
Our study compiled a data set of 12,407 merger agreements filed with the SEC between 1994 and 2014 and performed a word-for-word comparison of each of these documents. The computer script visited each URL contained in the Archive Indices of the SEC EDGAR Database and collected the full text of each acquisition agreement. We excluded any document whose title did not contain “merger” or “reorganization” to ensure that we were not including any non-acquisition agreements. We also excluded duplicative agreements, intra-firm reorganizations, reincorporations in other states, and private company acquisitions. We also eliminated older plain-text agreements for which paragraph demarcations are unreliable, which focuses the relevant subset of our database on agreements filed after 2001.
The key to our analysis is the use of a computer program to engage in a word-for-word comparison of each agreement to every other agreement in the data set. The underlying premise is that a document retains substantial word-for-word similarity to its precedent document even after a high degree of edits. While every drafting process entails a degree of deal-specific changes, we can still identify the textual “DNA” linking a document to its precedent. This similarity is not present among documents that were not copied directly or indirectly from one another, even when the documents deal with identical subject matter.
The same logic applies for our comparison of boilerplate provisions from precedent to subsequent agreement across numerous generations. The computer program calculated the “edit distance” (also known as the Levenshtein distance) between each pair of agreements. Edit distance is a method for measuring the extent of textual similarity or dissimilarity based on the number of insertions and deletions (i.e., edits) that differentiate two documents. The concept is analogous to the traditional “blacklining” or “redlining” process of comparing two documents with one another, which is routinely used in transactional law drafting. This approach is also similar to those used to detect plagiarism in writing, which can detect common ancestry of texts even after significant editing.
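To make the measure concrete, the following is a minimal sketch of a word-level edit distance and a similarity score derived from it. It is an illustration rather than the program used in the study: the whitespace tokenization, the inclusion of substitutions as a single edit (the standard Levenshtein formulation), and the normalization by the longer text are all assumptions, and the function names are hypothetical.

```python
def tokenize(text):
    # Assumption: lower-cased whitespace tokens stand in for "words".
    return text.lower().split()


def edit_distance(a, b):
    """Levenshtein distance between two token lists.

    Counts the minimum number of word insertions, deletions, and
    substitutions needed to turn `a` into `b`.
    """
    prev = list(range(len(b) + 1))
    for i, word_a in enumerate(a, start=1):
        curr = [i]
        for j, word_b in enumerate(b, start=1):
            cost = 0 if word_a == word_b else 1
            curr.append(min(prev[j] + 1,          # delete word_a
                            curr[j - 1] + 1,      # insert word_b
                            prev[j - 1] + cost))  # keep or substitute
        prev = curr
    return prev[-1]


def similarity(x, y):
    """Share of the longer text left unedited, between 0.0 and 1.0."""
    a, b = tokenize(x), tokenize(y)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest
```

Under these assumptions, changing “governed by the laws of Delaware” to “governed by the laws of New York” costs two edits (one substitution and one insertion) against a seven-word text, for a similarity of roughly 71 percent.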
The difference in our approach is that we are seeking to assess quantitatively the degree of difference between each agreement in our dataset in order to determine which agreement is most likely to form the precedent for the drafting of a subsequent agreement. The computer program allowed us to engage in this comparative analysis for each agreement in our database. As a result, we were able to identify the likely precedent document for each merger agreement in the database by determining which document had the smallest length-normalized pairwise edit distance (among those with earlier dates than the given document). This finding provided us with a window for seeing the starting point and the end result for the drafting of each acquisition agreement, so that we could establish quantitatively the degree of edits in each transaction. We then compared the individual paragraphs in the descendant agreement to its ancestor agreement to determine the source for each paragraph in the descendant.
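The precedent-selection rule described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the study's code: the `Agreement` record, its field names, and the normalization by the longer document are hypothetical, and a brute-force comparison of this kind would need substantial optimization to run across thousands of full-length agreements.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class Agreement:
    # Hypothetical record; field names are illustrative only.
    name: str
    filed: date
    text: str


def word_edit_distance(a, b):
    # Standard Levenshtein distance computed over word tokens.
    prev = list(range(len(b) + 1))
    for i, wa in enumerate(a, start=1):
        curr = [i]
        for j, wb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (wa != wb)))
        prev = curr
    return prev[-1]


def normalized_distance(x: str, y: str) -> float:
    # Assumption: normalize by the length of the longer document.
    a, b = x.split(), y.split()
    longest = max(len(a), len(b))
    return word_edit_distance(a, b) / longest if longest else 0.0


def likely_precedent(target: Agreement,
                     corpus: List[Agreement]) -> Optional[Agreement]:
    """Return the earlier agreement closest to `target`, if any exists."""
    earlier = [d for d in corpus if d.filed < target.filed]
    if not earlier:
        return None
    return min(earlier,
               key=lambda d: normalized_distance(d.text, target.text))
```

Restricting candidates to earlier-dated agreements mirrors the constraint stated in the text; a document with no earlier candidate is treated here as having no identifiable precedent.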
The starting point for every M&A deal entails the selection of a precedent agreement, which serves as the textual base from which the deal document is drafted. M&A agreements typically reflect a process of back-and-forth negotiations between the acquirer and target (and their counsel). Typically, the lawyers for the acquirer select the precedent to use as the base for the agreement, customize the draft to fit the needs of the current deal, and forward the agreement to the target. Counsel for the target will then propose changes to the acquirer’s draft and initiate negotiations which focus on changes to particular provisions rather than the “form” of the agreement. This process goes back-and-forth several times before the draft is finalized.
In theory the reliance on a precedent agreement suggests that lawyers and clients value the legal certainty that comes from building on precedents and boilerplate provisions whose language has stood the test of time (and of courts). Additionally, one might expect that acquisition agreements would have significant textual similarity since they generally follow similar broad outlines of categories of provisions. But contrary to these plausible hypotheses we found that there was little evidence of standardization among merger agreements. Not only was there significant divergence in the text from each agreement to its precedent, but there was also remarkable diversity in the merger templates that law firms used. Table I, excerpted from our prior article, highlights the small degree of commonalities among merger agreements based on word-for-word comparisons.
Table I. Similarity Distribution of the Data
The most striking finding is that only 4.3 percent of agreements were more than 25 percent the same despite the fact that they had nearly identical substantive provisions and subject matter. These findings suggest that the world of acquisition agreements is strikingly diverse even though these agreements deal with similar categories of information, and even though each agreement is based on a precedent. The mean and median degree of similarity of documents were less than 20 percent, which suggests that there is only a small core of standardization that cuts across the agreements.
Another step in our empirical analysis entailed examining the degree of divergence between each precedent and the resulting agreement. The previous paragraph discusses the similarity of randomly chosen pairs, but here we examine the similarity between agreements that are related to each other. While we showed above the extensive degree of diversity among acquisition agreements, the most telling evidence of inefficiency is the high level of editorial churning in the drafting process even in documents copied from one another. Figure I, drawn from our prior article, shows the percentage of the textual similarity between documents and their precedents, assessed at the whole document level.
Figure I. Similarity of Documents to Precedents
Figure I highlights the high degree of editorial churning that takes place during the drafting process. While there are significant outliers on both ends of similarity and dissimilarity, the largest number of acquisition agreements have approximately 50 percent similarity to their nearest precedent. Some acquisition agreements have 80 percent or greater similarity with their precedents. But most of these documents are repeat-player acquisitions involving the same acquirer which means they have limited applicability to the broader pattern of precedent selection.
It would be challenging to assess the precise degree of inefficiency because deal-specific edits are an essential part of every deal, and the degree of edits will necessarily vary based on the transaction. But we can estimate the amount of time that lawyers are investing in the drafting process to put the potential degree of editorial churning in context. From 1994 to 2014 the median number of words in an acquisition agreement increased from about 21,000 words to approximately 39,400 words. The rate of increase was just over 900 words a year as there was a remarkable “accretion effect” that led to a near doubling in the length of the average acquisition agreement. Table II highlights the dramatic increase in the length of merger agreements over time.
Of course some of this additional word count may be justified by legal responses to exogenous events or other substantive developments in the architecture of acquisition agreements. But while Professor Coates has pointed to some degree of substantive changes over this period, in reality there appears to be little substance to justify this dramatic expansion in the length of merger agreements.
The evidence of a consistently high level of editing suggests that lawyers are ineffectively engaging in precedent selection and document design throughout the drafting process. Some deals may require more edits because of the unique nature of the deal, but “revolutionary” deals are few and far between, and the degree of editorial churning that routinely occurs in the deal process suggests that there is inefficiency in the precedent selection and document design.
Having provided evidence of underinvestment in precedent selection and high levels of editorial churning, we turn to the more granular question of whether the textual changes reflect the wholesale insertion of new provisions and paragraphs or whether editorial churning is pervasive throughout the document. This question is key to testing the black hole theory for boilerplate provisions, as we would expect to see the highest degree of standardization in acquisition agreements at the paragraph or provision level for boilerplate text if it were being substantively embraced.
We compared the individual paragraphs in each agreement to its precedent agreement to determine the source for each paragraph in the agreement. Our challenge was defining what constitutes boilerplate because of the high degree of edits throughout the agreements. Some types of clauses are readily classified as “boilerplate” based on their subject matter (e.g., Governing Law, Entire Agreement, Waiver of Jury Trial). Many other clauses, however, do not fall neatly into the boilerplate or non-boilerplate category, raising the question of how to define boilerplate text. This question is a threshold issue for empirical analysis of boilerplate in any context, as text is reused in many ways, even in fully negotiated paragraphs (e.g., jargon phraseology). But only certain types of reused text reach a sufficiently high degree of standardization to qualify as boilerplate.
To answer this question, we turn to the data itself. The following figure presents the distribution of the percentage similarity of paragraphs to their nearest ancestor paragraph in the precedent agreement, denoted by the solid line. For comparison purposes, the dotted line shows the percentage similarity between paragraphs and the nearest paragraph from a merger agreement chosen at random (i.e., not the precedent document).
Figure II: Distribution of Similarities Between Paragraphs
The Figure makes it clear that there is a bimodal distribution of similarity to precedent paragraphs, with a fairly strong bifurcation between boilerplate provisions and fully negotiated provisions. The right hump in the solid line is the relatively common boilerplate provisions that are 70–100 percent similar to their precedents. The left hump in the solid line is the negotiated provisions that are not much more similar than clauses from random agreements. Interestingly, in the middle there is a range between 40–70 percent similar where the moderately negotiated provisions in the precedent agreements are only slightly more similar to each other than the agreements chosen at random.
The solid line in the Figure makes it clear that merger agreements contain a large number of boilerplate paragraphs, a large number of fully negotiated paragraphs, and relatively few paragraphs in between these extremes. This point validates the idea that boilerplate paragraphs differ qualitatively, not just quantitatively, from negotiated (generally deal-specific) provisions. Relying on the Figure, we set the threshold for boilerplate copying at 70 percent similarity and up, which captures most of the paragraphs in the boilerplate category. Thus, our analysis focuses on the degree of continuity or evolution from one precedent to the next between paragraphs that are 70 percent or more similar to one another.
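The paragraph-matching and threshold step might be sketched as follows. This is an illustration under stated assumptions rather than the study's implementation: Python's standard-library `difflib.SequenceMatcher` ratio is swapped in as a convenient stand-in for the length-normalized edit distance described earlier (the two measures are related but not identical), and the function names are hypothetical. Only the 70 percent cutoff comes from the text.

```python
from difflib import SequenceMatcher

BOILERPLATE_THRESHOLD = 0.70  # the cutoff chosen from Figure II


def paragraph_similarity(x: str, y: str) -> float:
    # Stand-in measure: ratio of matching words, from 0.0 to 1.0.
    # autojunk is disabled so common words in long paragraphs still count.
    return SequenceMatcher(None, x.split(), y.split(),
                           autojunk=False).ratio()


def nearest_ancestor(paragraph: str, precedent_paragraphs: list):
    """Return (index, similarity) of the closest precedent paragraph."""
    scores = [paragraph_similarity(paragraph, p)
              for p in precedent_paragraphs]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


def is_boilerplate(paragraph: str, precedent_paragraphs: list) -> bool:
    # A paragraph counts as copied boilerplate when its nearest
    # ancestor is at least 70 percent similar.
    _, score = nearest_ancestor(paragraph, precedent_paragraphs)
    return score >= BOILERPLATE_THRESHOLD
```

A governing-law clause that swaps only the named state would clear the threshold under this sketch, while a freshly drafted closing covenant that shares only a few common words with any precedent paragraph would not.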
Our interest is in the paragraphs that are copied over multiple generations to determine the degree and type of drift from the original ancestor over time. To examine this drift, we construct a “family tree” for the set of paragraphs. We begin by taking the set of paragraphs that have no descendants, which often (but not always) come from later agreements near the end of our data coverage. We then trace the copying history of each of those paragraphs back in time, finding its ancestor, the ancestor of the ancestor, and so on. Because each paragraph has only one immediate ancestor, these lineages do not branch as they are traced back in time.
We then reverse the direction, constructing a family tree for each ancestor by tracing the descendants of each ancestor over time using the reverse-lineages just constructed. Accordingly, we follow the evolution of each lineage of paragraphs from the original ancestor to all of its direct and indirect descendants. This creates a tree-like structure for each ancestor. Some ancestors have many branches (and branches of branches), while others have a single lineage through time.
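The two-step construction described above, tracing each lineage backward and then inverting it into a tree, can be sketched with a simple parent map. The data layout here is an assumption made for illustration: `parent_of` is a hypothetical dictionary mapping each paragraph identifier to the identifier of its immediate ancestor, or to `None` for an original ancestor.

```python
from collections import defaultdict


def lineage(leaf, parent_of):
    """Trace one paragraph back to its original ancestor.

    Because each paragraph has at most one immediate ancestor,
    the result is a single unbranched chain (leaf first).
    """
    chain = [leaf]
    while parent_of.get(chain[-1]) is not None:
        chain.append(parent_of[chain[-1]])
    return chain


def leaves(parent_of):
    # Paragraphs with no descendants: never named as anyone's parent.
    parents = {p for p in parent_of.values() if p is not None}
    return [node for node in parent_of if node not in parents]


def family_trees(parent_of):
    """Invert the parent map into (original ancestors, children map)."""
    children = defaultdict(list)
    for child, parent in parent_of.items():
        if parent is not None:
            children[parent].append(child)
    roots = [node for node, parent in parent_of.items() if parent is None]
    return roots, dict(children)
```

The backward chains never branch, while the inverted map can branch freely, which is what produces ancestors with many lineages alongside ancestors with a single line of descent.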
We then drew a sample of 28,717 ancestor paragraphs from the total set of 202,422 ancestor paragraphs to analyze, and we included all descendants of those ancestor paragraphs. The following table presents some descriptive statistics about the sample boilerplate paragraphs.
Table III. Descriptive Statistics
The boilerplate paragraphs tend to be relatively short, with a median length of 68 words, and tend to have relatively few edits from one generation to the next, with a median number of words edited of just 4 (about 6 percent of the median paragraph length). This finding is expected since these are boilerplate provisions. The lineages also tend to be very short, with the median lineage only two generations long (meaning that the median paragraph was copied only once). This latter result occurs because most merger agreements themselves are copied only once (if at all), with the descendant never being copied again. This means that there are a lot of “dead ends” in the evolutionary process. In some cases we find that paragraphs are copied over many generations of agreements. In other cases, paragraphs are copied once and then become “extinct.” Having established reasonable parameters for what constitutes boilerplate provisions in the M&A context and a framework for identifying family trees for these provisions, we turn to our analysis of the degree of drift in boilerplate provisions over time.
In this Part we examine the evolution of boilerplate clauses from three different perspectives. First, we examine the extent of drafting drift over generations between the original boilerplate ancestor provision and direct and indirect linear descendants. Second, we examine the astounding variety of descendant clauses produced by a single boilerplate ancestor which evolve separately along different lineages of precedents. Third, we examine the geometry of the relationships among the clauses to illustrate spatially the high degree of divergence in boilerplate provisions over time.
The theory developed in the “black holes” literature suggests the possibility that slippage in the drafting process will have cumulative effects that will distort the boilerplate text. If drafters are unable to identify all non-standard edits embedded in a precedent document (or simply fail to invest time in checking for consistency), some of the edits from previous transactions will be retained in addition to the edits added for the present transaction. As a result, each generation of a paragraph will tend to differ more from the original ancestor provision than the last generation. Thus, if slippage is occurring we should observe paragraphs drifting further from their original ancestors as the number of generations between the drafts increases.
To examine whether paragraphs drift over time, we examined all lineages with at least six generations and compared the text of each descendant paragraph at each generation to the original ancestor. The results of this comparison are presented in Figure III below, with point estimates and 95 percent confidence intervals denoted by the points and bars, respectively.
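A rough sketch of this generation-by-generation comparison appears below. Several details are assumptions: each lineage is represented as a list of paragraph texts ordered from original ancestor to latest descendant, the confidence interval uses a normal approximation (the source does not specify how its intervals were computed), and the default distance function is a standard-library stand-in rather than the study's normalized edit distance.

```python
import math
from difflib import SequenceMatcher
from statistics import mean, stdev


def word_distance(x: str, y: str) -> float:
    # Stand-in distance: 1 minus the share of matching words.
    ratio = SequenceMatcher(None, x.split(), y.split(),
                            autojunk=False).ratio()
    return 1.0 - ratio


def drift_by_generation(lineages, distance=word_distance,
                        min_generations=6):
    """Mean distance from the original ancestor at each generation.

    Returns {generation: (mean, half_width_of_95_percent_interval)}.
    Lineages shorter than `min_generations` paragraphs are skipped.
    """
    by_generation = {}
    for texts in lineages:
        if len(texts) < min_generations:
            continue
        ancestor = texts[0]
        for g, text in enumerate(texts[1:], start=1):
            by_generation.setdefault(g, []).append(distance(ancestor, text))

    result = {}
    for g, values in sorted(by_generation.items()):
        if len(values) > 1:
            half_width = 1.96 * stdev(values) / math.sqrt(len(values))
        else:
            half_width = float("nan")  # no interval from one observation
        result[g] = (mean(values), half_width)
    return result
```

If slippage accumulates as hypothesized, the means returned by such a function should rise with the generation number.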
Figure III: Distance From Ancestor by Generation
The amount of average overall drift from one generation to the next is remarkable considering that because of our focus on boilerplate provisions most pairs of paragraphs in our analysis only have very slight edits (or even none in any given generation). Small changes have cumulative effects over multiple generations, however, eventually producing a descendant that is quite different from its ancestor in terms of the text. Over a long enough time horizon the substance of these provisions may be transformed which may undermine the purpose of having standardized text.
This subpart examines the degree of heterogeneity that develops among the descendants of a particular ancestor over time. In other words, to what extent does a single ancestor produce a variety of descendant clauses? Although this question is closely tied to the drift of each lineage over time examined in subpart A, the two issues are distinct. It is entirely possible that different lineages from the same ancestor could drift rapidly over time as in subpart A, yet not diverge from one another. This result would occur, for example, where the various lineages were responding in tandem to external shocks, such as changes in the economic or regulatory environment which, if true, would be consistent with Coates’ thesis of change being driven primarily by innovation. If, instead, the lineages from the same ancestor diverge rapidly, the explanation might more plausibly be attributed to editorial churning rather than rational adaptation.
Because our aim is to examine multiple lineages from the same ancestor, we exclude clauses that had only one or no descendants. For each “family tree” descended from an ancestor, we compute the diversity among the ancestor’s descendants according to the number of generations needed to connect them. For example, a sibling pair of paragraphs descended from a common parent would involve two generations (one up from one sibling to the parent and one down to the other sibling). For a grandchild to its “uncle” paragraph the distance would be three (two generations up to the grandparent and one down to the uncle paragraph). For each such generational “distance” we then compute the average normalized edit distance among the paragraphs at that distance to assess the overall heterogeneity by number of generations removed.
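The generational “distance” between two paragraphs in the same family tree can be computed by walking each paragraph up to their nearest common ancestor, as in the following sketch. The representation is assumed for illustration: `parent_of` is a hypothetical map from each paragraph identifier to its immediate ancestor (or `None`), and `text_distance` is any caller-supplied function returning the normalized textual distance between two paragraphs.

```python
from itertools import combinations


def path_to_root(node, parent_of):
    # The node followed by each successive ancestor.
    path = [node]
    while parent_of.get(path[-1]) is not None:
        path.append(parent_of[path[-1]])
    return path


def generational_distance(x, y, parent_of):
    """Generations up from x to the nearest common ancestor plus
    generations down to y; None if the two are unrelated."""
    steps_up = {n: i for i, n in enumerate(path_to_root(x, parent_of))}
    for steps_down, n in enumerate(path_to_root(y, parent_of)):
        if n in steps_up:
            return steps_up[n] + steps_down
    return None


def heterogeneity_by_separation(nodes, parent_of, text_distance):
    """Average textual distance among pairs, grouped by the number
    of generations separating them."""
    buckets = {}
    for x, y in combinations(nodes, 2):
        g = generational_distance(x, y, parent_of)
        if g is not None:
            buckets.setdefault(g, []).append(text_distance(x, y))
    return {g: sum(v) / len(v) for g, v in sorted(buckets.items())}
```

Under this sketch two siblings are separated by two generations and a grandchild and its “uncle” by three, matching the examples in the text.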
The following Figure sets forth the mean heterogeneity of descendants of the same ancestor by number of generations separating texts.
Figure IV: Distance Among Descendants By Generations
The average distance among descendants from the same ancestor increases with generational separation just as the distance from an ancestor increases over the generations. Comparing this Figure to Figure III above shows that the process of “drift” is not only away from ancestors, but also away from other lineages descended from the same ancestor. Indeed, the rate of divergence is considerably faster among the descendants than from the ancestor. The results show that all paragraphs are moving away from one another in an “expanding universe” of clauses akin to the “Big Bang” theory on the scale of boilerplate. This finding is consistent with widespread editorial churning that is haphazard, rather than driven by responses to exogenous legal events or attempts at innovation.
The previous sections show that virtually all boilerplate paragraphs are moving away from one another in terms of their textual similarity. In this subpart, we attempt to characterize the evolution of merger agreement clauses in terms of the geometrical shapes of the groupings of related clauses in a high-dimensional space. Although it is unusual to think of groups of contract clauses in terms of their shapes, the concept of difference or distance lends itself to such a graphical interpretation.
Consider the graphs in the following figure derived from simulated data.
Figure V. A Spherically Distributed Cluster and an Ellipsoidal Cluster
The upper graph has a well-defined center (or standard form) with a point cloud around it. The lower graph is elongated. These two graphs have approximately the same average distance to the nearest point, but they differ in the structure of the relationships.
The top plot is similar to what one would expect from documents based on a standard form. Although parties often can negotiate terms in standard provisions, parties will reverse deal-specific edits in subsequent uses of the form, meaning that documents based on forms will have a (hyper-)spherical distribution. This fact does not necessarily mean the documents are only lightly edited. As in Figure V, documents can be close to the center or far from it; the key is that there is a center to which documents tend to revert because of the use of a form or standardized language. By contrast, documents not based on forms can change slowly and incrementally, but end up very far from where they began. The dramatic difference between the two document clusters turns on the extent to which the past editing history of the document shapes the form of its descendants. The same logic applies to analysis of particular provisions of boilerplate language that serve as loci of standardization within acquisition agreements.
With this background, we now examine the data from the merger agreements. Although there are many methods that could examine whether the underlying data have a spherical or non-spherical structure, we use the eigenvalues derived from a multidimensional scaling of the distance matrices for sets of related paragraphs (those derived from a common ancestor). If the “clouds” of points representing clauses are roughly spherical, then the eigenvalues should be close to one another. If, on the other hand, the clouds have a linear structure to them, at least one of the eigenvalues will tend to be significantly larger than the other ones.
Indeed, we find that very few of the “family trees” of boilerplate agreements have the spherical structure we would expect from documents based on a standard form. The first eigenvalue accounts for a median of .65 of the variation of all the eigenvalues (obtained by dividing the first eigenvalue by the sum of the eigenvalues). This suggests that a small number of eigenvectors (or even one) can account for most of the variation, indicating that our data have a structure that deviates markedly from a spherical shape.
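The eigenvalue test can be illustrated with a short Python sketch of classical multidimensional scaling. It is a simplified stand-in for illustration, not the R routine used in the study. The function names are hypothetical, the leading eigenvalue is obtained by power iteration, and the sum of the eigenvalues is taken as the trace of the double-centered matrix. With edit distances, which need not be exactly Euclidean, some eigenvalues can be negative, and how such cases were handled in the study is not specified here.

```python
import math


def double_center(dist):
    """Return B = -1/2 * J D^2 J, the Gram matrix used by classical MDS."""
    n = len(dist)
    sq = [[d * d for d in row] for row in dist]
    row_mean = [sum(r) / n for r in sq]
    col_mean = [sum(sq[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row_mean) / n
    return [[-0.5 * (sq[i][j] - row_mean[i] - col_mean[j] + total)
             for j in range(n)] for i in range(n)]


def top_eigenvalue(mat, iters=500):
    """Power iteration; assumes the leading eigenvalue is positive and dominant."""
    n = len(mat)
    # Non-uniform start vector: B maps the constant vector to zero.
    vec = [1.0 + 0.01 * i for i in range(n)]
    for _ in range(iters):
        nxt = [sum(mat[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            return 0.0
        vec = [x / norm for x in nxt]
    mv = [sum(mat[i][j] * vec[j] for j in range(n)) for i in range(n)]
    return sum(v * m for v, m in zip(vec, mv))  # Rayleigh quotient


def first_eigenvalue_share(dist):
    """Leading eigenvalue divided by the sum of all eigenvalues (the trace of B)."""
    gram = double_center(dist)
    trace = sum(gram[i][i] for i in range(len(gram)))
    return top_eigenvalue(gram) / trace if trace else 0.0
```

Under these assumptions, points strung out along a single line yield a share near one, while points spread evenly in two dimensions (such as the corners of a square) yield a share near one-half. A median share of .65 therefore points toward elongated rather than spherical families of clauses.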
One important implication of these findings is that there is no “center” or “standard” to most boilerplate paragraphs. One might expect that the ancestor paragraphs of a set of descendants would be the “center” of the descendants, and indeed that would be the case if the ancestor were used as a “form.” But each paragraph’s form is transient without fixed referents to which it reverts in subsequent generations. Each successive generation tends to “wander” farther from the original, leading to elongated lineages rather than offspring clustered around a central point. The ends of these elongated “point clouds” bear little resemblance to each other, meaning that new forms of clauses are constantly arising in a process similar to speciation.
Our preliminary results confirm that the clause-by-clause evolution of merger agreements mirrors the overall evolution of agreements. The changes that are introduced at each generation of a document’s evolution tend to be preserved in subsequent generations, causing the text to drift significantly over time. This finding has a number of implications for the drafting process as well as the emerging literature on black holes and grey holes in contract law, which are explored in this Part.
The analysis in this paper provides support for the Choi/Scott/Gulati thesis that the rote use and encrustation processes may lead to black holes in contracts. Boilerplate provisions in acquisition agreements are recycled from deal to deal, but idiosyncratic changes aggregate quickly from generation to generation and potentially alter the substance of these provisions. We also identified equally significant evidence of “abrasion” as deletions shaped the evolution of boilerplate terms, even though over time these provisions, like acquisition agreements as a whole, tended to increase in length year by year. The data suggest a strong role for slippage in the drafting process that may lead toward unconsidered and ultimately unintended variations in documents.
The high and increasing degree of editorial churning in boilerplate text appears to reflect potential structural shortcomings of the transactional drafting process. In theory lawyers should know to respect boilerplate provisions unless there is a deal-specific reason to deviate from the text. But in practice both our earlier study of the macro picture of M&A editing and this analysis of boilerplate text highlight how virtually every aspect of the agreement is potentially subject to the editing process. The extent of rapid speciation of boilerplate provisions suggests that the substantive benefits of standardized terms may be at risk. Lawyers’ penchant for editing may transform not only the text, but also potentially the meaning of boilerplate provisions. This problem is magnified by the sheer scale of the process as a multitude of lawyers rapidly edit a complex acquisition agreement and make a myriad of changes at each stage of the back-and-forth of negotiations. The problem appears to occur as lawyers process drafts without having anyone ever check back to the initial precedent (or precedent boilerplate text) to see if the edits are necessary or may potentially transform the meaning of the boilerplate. The absence of sufficient effort to check for deviations from standard text may help to explain how idiosyncratic and cosmetic edits arise and aggregate from one generation of an agreement to the next.
However, the primary empirical conclusion of this analysis, that edits in one generation are often passed down to subsequent generations, could also have other interpretations. For example, it is possible that the edits improve the document and are retained as part of a process of evolution toward better agreements. At this preliminary stage, our analysis cannot definitively resolve the question of whether the cumulative edits over many generations have effects on M&A boilerplate that are positive, negative, or neutral. That would require a more fine-grained, qualitative analysis of individual boilerplate terms to attempt to assess the legal implications of textual changes over time, which we will pursue in a future work. However, the fact that the descendants of a common ancestor boilerplate term diverge from one another is strong evidence of random drift rather than conscious improvement. The potentially random drift driven by inadvertently copied deal-specific edits and consequent speciation may lead to black holes or grey holes as language becomes unmoored from accepted formulations with established interpretations. We will need to conduct further qualitative research to analyze in a selective fashion the degree to which the textual evolution of particular provisions has transformed the substantive meaning of boilerplate.
The drift and speciation characteristics of the evolutionary process may lead to black holes where boilerplate loses its meaning. But this erosion of meaning does not necessarily occur in the majority of the cases, at least over a small number of generations. However, the larger the number of generations of drift from the original boilerplate, the more likely that the meaning of boilerplate will evolve over time in tandem with the increasing level of textual changes. Since lawyers typically choose precedents that are approximately one year old, it would be possible to extend the number of generations in future studies and engage in fine-grained qualitative analysis of the meaning of particular provisions.
One clear consequence of rapid speciation is an erosion in the value of network effects as ostensibly boilerplate language becomes increasingly less standardized over time. Our study stipulated that boilerplate consisted of text that is at least 70 percent similar to a paragraph in its immediate precedent document, which is a high degree of similarity given the nonstandardized nature of acquisition agreements. But our empirical analysis shows that the degree of drift effectively undercuts the emergence of truly standardized boilerplate language in M&A agreements, at least in the sense of a standardized form that we see in other areas of contracts. This fact imposes significant costs on market participants.
The first type of cost of nonstandardization is the easiest to see: the unnecessary effort expended in the drafting process. Lawyers introduce random edits, another set of lawyers using the precedent attempts to compensate for those edits, and so forth, generation after generation. This phenomenon is the “editorial churning” that we identified as occurring on an entire document basis in our previous article. In this paper, we show that the same churning is occurring on a clause-by-clause level, which serves as evidence of inefficiency.
The more important costs of the lack of standardization, however, come through impairment of the network effects that arise through standardization of boilerplate in other contractual contexts. As ancestors change through encrustation, abrasion, and rote repetition of encrusted texts, the value of the network effects declines. As ancestors split into multiple descendant species based on divergent lineages of precedents and provisions, the network effect value declines further. In this paper we show that both trends occur in the M&A boilerplate context. The text both drifts from its original source and splits into multiple lineages, each of which drifts away from the ancestor and away from the others.
Our work provides preliminary evidence for the rote use and encrustation phenomena in the context of merger agreements. The results have a number of limitations, however, as detailed in this section.
2. Missing or Misidentified Precedents. The starting point for our analysis is the identification of the likely precedent for each public company acquisition agreement in our database. We only look for precedent clauses within the documents determined to constitute the precedent documents. It is possible the precedent documents are not the actual precedent documents because the actual precedents are not available in the dataset. It is also possible that the precedent clauses are not found because the clause was copied from an agreement other than the precedent for the whole document (i.e., a clause was swapped from a different precedent).
While we would tend to discount the significance of either of these possibilities in many cases, we do recognize that much of the “innovation” in acquisition agreements occurs from copying the innovations of first movers in other acquisition agreements. For this reason, if an exogenous legal shock arises, it is quite possible that lawyers will take advantage of SEC-mandated transparency and the absence of intellectual property protection to copy and paste relevant provisions from an agreement that is not the precedent for the current deal. This issue is more significant for our study of boilerplate than our broader study of the evolution of acquisition agreements because of our ability to identify the likely precedent for each agreement with a high degree of probability based on the degree of similarity.
But we should not overstate the risk that the opportunistic copying of innovations from unrelated agreements is skewing our boilerplate analysis. In the case of swapped-in language from another precedent, we would expect to find large edit distances from the precedent underpinning the current deal. As shown in Figure II, a typical merger agreement clause does not find close matches in another random merger agreement, even for boilerplate provisions. Therefore, we would expect that we would typically not even identify the swapped-in clause as boilerplate for the purposes of our study. For this reason the boilerplate provisions that are the focus of our study are much more likely to have continuity from one precedent to the next. While numerous edits take place throughout acquisition agreements and boilerplate provisions, the empirical evidence suggests that piecemeal editing rather than transplantation of terms from other precedents is the norm.
Our study shows that the high levels of “editorial churning” that take place in the process of transforming each precedent into the current acquisition agreement affect agreements on a clause-by-clause basis, not just an entire document basis. Boilerplate provisions live on from deal to deal, yet gradually shed their textual integrity and potentially lose their clear meaning as they evolve over generations of lineages.
We show that incremental changes in boilerplate from one generation to the next lead to rapid “speciation” of the terms. We demonstrate that this textual “drift” takes place not only within boilerplate that falls within a given chain of precedent, but also even more broadly among boilerplate provisions that have a common ancestor precedent but evolve separately along different lineages of precedents. Our findings reinforce the black hole concern that Choi, Gulati, and Scott raised: that rote usage, combined with encrustation and abrasion of terms, may distort the degree of standardization and meaning of boilerplate over even a short number of generations. We plan on building on this study for future research that is larger in scope and duration and also integrates qualitative assessments of the evolution of particular boilerplate provisions over time.
We are including an example of the evolution of a boilerplate provision to provide a concrete illustration of the extent of changes that routinely occur over a small number of generations. The following boilerplate representation attests to the management’s compliance with internal control requirements under the Sarbanes-Oxley Act. Anyone who is familiar with this requirement would recognize that this standardized language could easily be carried over verbatim into subsequent agreements (with the minor exception of the date in section iv and any adjustment for the numbering of this representation in subsequent agreements). We highlight the evolution of this boilerplate provision over four generations in a four-year period to highlight the degree of (largely) cosmetic changes in the drafting process.
(f) the acquired corporations have implemented and maintain a system of internal control over financial reporting as defined in Rules 13a-15(f) and 15d-15(f) under the Exchange Act sufficient to provide reasonable assurance regarding the reliability of financial reporting and the preparation of financial statements for external purposes in accordance with GAAP including without limitation that
(a) there have not been any changes in the acquired corporations internal control over financial reporting that have materially affected or are reasonably likely to materially affect the acquired corporations internal control over financial reporting and
(b) all significant deficiencies and material weaknesses as such terms are defined by the public accounting oversight board have been disclosed to the company’s outside auditors and the audit committee of the company board.
In the second generation of this boilerplate provision, the language was virtually identical (with the understandable exception of a numbering and a date change) until section iv(b). The revised iv(b) and new iv(c) were as follows (with deletions crossed out and additions underlined):
2.4(f) The Acquired Corporations have implemented and maintain a system of internal control over financial reporting (as defined in Rules 13a-15(f) and 15d-15(f) under the Exchange Act) sufficient to provide reasonable assurance regarding the reliability of financial reporting and the preparation of financial statements for external purposes in accordance with GAAP, including, without limitation, that
The third generation of this boilerplate provision changed the date and added a non-substantive reference to the company’s disclosure system, but then retained the second generation’s additions to section iv(b) and added a significant amount of additional text:
2.4(f)(g) The Acquired Corporations have implemented and maintain a system of internal control over financial reporting (as defined in Rules 13a-15(f) and 15d-15(f) under the Exchange Act) sufficient to provide reasonable assurance regarding the reliability of financial reporting and the preparation of financial statements for external purposes in accordance with GAAP, including, without limitation, that
In contrast, by the fourth generation of this boilerplate lineage, the edits were so far-reaching that the drafters effectively rewrote approximately one-half of the first generation’s text, which is as follows:
2.4(f) (g)The Acquired Corporations have Company has implemented and maintains a system of internal control over financial reporting (as defined in Rules 13a-15(f) and 15d-15(f) under the Exchange Act) sufficient to provide reasonable assurance regarding the reliability of financial reporting and the preparation of financial statements for external purposes in accordance with GAAP, including without limitation and that
Copyright © 2019 Robert Anderson & Jeffrey Manns.
† Professor of Law, Pepperdine University School of Law.
†† Professor of Law, The George Washington University Law School.
We would like to thank Emiliano Catan, Stephen Choi, John Coates, Elisabeth de Fontenay, Ronald Gilson, Cathy Hwang, Mitu Gulati, Barak Richman, and Robert Scott, as well as participants in the Columbia Law & Economics Workshop and the annual meeting of the American Bar Association’s Mergers & Acquisitions Committee for their constructive comments.
 See Marcel Kahan & Michael Klausner, Standardization & Innovation in Corporate Contracting, 83 Va. L. Rev. 713, 719–20 (1997) (discussing the potential “learning benefits” of commonly used terms); Michael Klausner, Corporations, Corporate Law, and Networks of Contracts, 81 Va. L. Rev. 757, 783–84 (1995) (discussing the network benefits from familiarity with boilerplate terms).
 See Stephen J. Choi, Mitu G. Gulati & Robert E. Scott, The Black Hole Problem in Commercial Boilerplate, 67 Duke L.J. 1, 2–4 (2017) (discussing the black hole problem in the context of the pari passu clause, a boilerplate provision in sovereign debt contracts); see also Christopher J. French, The Illusion of Insurance Contracts, 89 Temple L. Rev. 535 (2017) (discussing the difficulties of determining the intent of drafters of standard form language in insurance contracts).
 See Choi, Gulati & Scott, supra note 2, at 4 (discussing the need for broader empirical research on the extent of rote usage and encrustation in boilerplate provisions).
 Robert Anderson & Jeffrey Manns, The Inefficient Evolution of Merger Agreements, 85 Geo. Wash. L. Rev. 57 (2017).
 Robert Anderson & Jeffrey Manns, Engineering Greater Efficiency in Mergers and Acquisitions, 72 Bus. Law. 657 (2017).
 See Anderson & Manns, supra note 4.
 See SEC, Form 8-K, Item 1.01, at 4, https://www.sec.gov/about/forms/form8-k.pdf [https://perma.cc/WE5G-A7CY] (requiring companies to disclose material definitive agreements outside of the ordinary course of business including merger agreements).
 Other notable empirical works also examine changes in contractual provisions in other transactional contexts. See, e.g., Mitu Gulati & Robert E. Scott, The 3½ Minute Transaction: Boilerplate and the Limits of Contract Design 3–10 (2013) (using empirical data to show that once a boilerplate provision is in place it often becomes part of a transactional checklist regardless of its actual value-added); Stephen J. Choi & Mitu Gulati, Innovation in Boilerplate Contracts: An Empirical Examination of Sovereign Bonds, 53 Emory L.J. 930, 932–34 (2004) (conducting empirical analysis of sovereign bond offerings to show that boilerplate provisions changed in response to significant shifts in the interpretation of key provisions, but only after an industry-wide delay which reflected the reluctance of lawyers to change boilerplate provisions); Jonathan C. Lipson, Price, Path and Pride: Third Party Closing Opinion Practice Among U.S. Lawyers (A Preliminary Investigation), 3 Berkeley Bus. L.J. 59, 113–14 (2005) (using qualitative interviews to assess the logic behind lawyers’ drafting of third-party closing opinions).
 See supra note 6.
 See Scott J. Burnham, Drafting and Analyzing Contracts: A Guide to the Practical Application of the Principles of Contract Law 5–6 (3d ed. 2003) (discussing how attorneys “rarely start to draft on a blank slate. . . . [and generally] start with an existing contract or form”).
 See Tina L. Stark, Drafting Contracts: How and Why Lawyers Do What They Do 335–36 (2007) (discussing the benefits of heightened efficiency and legal certainty from precedent-based legal drafting).
 Two other notable empirical studies provide similar prisms for understanding the M&A drafting process, yet reach different conclusions. Professor Coates documents the growth in length of merger agreements over the past twenty years, which he attributes to changes in legal risks and deal and financing markets, as well as the increase in “linguistic complexity” of these documents. See generally John C. Coates, Why Have M&A Contracts Grown? Evidence from Twenty Years of Deals, (European Corp. Gov. Inst., Working Paper No. 333/2016, 2016), at 16–28, available at http://dx.doi.org/10.2139/ssrn.2862019. Professor Jennejohn argues that the complexity of M&A exposes acquisition agreements to multiple sources of path dependency which undercuts efforts at standardization. See generally Matthew Jennejohn, Asymmetric Standardization in M&A Agreements (Mar. 25, 2017) (unpublished manuscript) (on file with authors). Numerous other studies have examined the development of particular acquisition agreement provisions. See generally Afra Afsharipour, Transforming the Allocation of Deal Risk Through Reverse Termination Fees, 63 Vand. L. Rev. 1161 (2010) (discussing attempts at reallocating deal risks through reverse termination fees that compensate target companies should the buyer walk away, and assessing the impact such attempts have on acquisition agreement drafting); Albert Choi & George Triantis, Strategic Vagueness in Contract Design: The Case of Corporate Acquisitions, 119 Yale L.J. 848 (2010) (arguing that before closing the deal, the intentional vagueness of material adverse change (“MAC”) clauses creates more efficient incentives for the seller, rather than more precise and less costly proxies); Yair Y. Galil, MAC Clauses in a Materially Adversely Changed Economy, 2002 Colum. Bus. L. Rev. 846 (discussing how unclear judicial interpretations of the contours of MAC clauses and material adverse effect (“MAE”) clauses cast a shadow over merger deals); Ronald J. Gilson & Alan Schwartz, Understanding MACs: Moral Hazard in Acquisitions, 21 J.L. Econ. & Org. 330 (2005) (using economic modeling to analyze the role that MAC and MAE clauses play in the structure of the standard acquisition agreement and the incentive effects for acquirers and targets); Sean J. Griffith, Deal Protection Provisions in the Last Period of Play, 71 Fordham L. Rev. 1899 (2003) (discussing the significance of Delaware’s judicially created limitations on deal protection provisions meant to resolve the conflicting incentives of the acquirer’s and target’s management when facing last minute third-party bids); Claire A. Hill, Bargaining in the Shadow of the Lawsuit: A Social Norms Theory of Incomplete Contracts, 34 Del. J. Corp. L. 191 (2009) (arguing that the legal terms in acquisition agreements are intentionally ambiguous to deter litigation and incentivize negotiators to close the deal); Alan Schwartz & Robert E. Scott, Contract Interpretation Redux, 119 Yale L.J. 926 (2010) (arguing for interpretative default rules in construing MAC clauses).
 See Anderson & Manns, supra note 4, at 61–62; see also infra Part III.
 See Avery Katz, The Strategic Structure of Offer and Acceptance: Game Theory and the Law of Contract Formation, 89 Mich. L. Rev. 215, 277 (1990) (discussing the tradeoffs between standardization and customization in contractual drafting).
 See Anderson & Manns, supra note 4, at 75–77.
 See id. at 82–83.
 Cf. Coates, supra note 11, at 16–28 (arguing that the doubling in the length of merger agreements over the past twenty years reflects responses to changes in legal risks and deal and financing markets, as well as the increase in the linguistic complexity of these documents).
 See Archive Indices of the SEC EDGAR Database, SEC (last modified Apr. 28, 2014), http://www.sec.gov/cgi-bin/edgar_archive_indices [https://perma.cc/USL4-V94J].
 Exhibit 2 is the exhibit where merger agreements are filed, along with any other “plan of acquisition, reorganization, arrangement, liquidation or succession.” See 17 C.F.R. § 229.601(b)(2) (1995). Such agreements can also be filed under Exhibit 10, but primarily when they relate to other companies, such as subsidiaries.
 This approach eliminates agreement types that may overlap such as “Contribution Agreement,” “Stock Purchase Agreement,” “Asset Purchase Agreement,” “Transaction Agreement,” “Share Exchange Agreement,” “Arrangement Agreement,” and the like. Although these agreements certainly contain overlapping language, this study focused on documents that were clearly public company acquisition agreements. Very short documents that are less than 15,000 characters were also eliminated because these agreements likely did not address the complex issues raised in larger public company acquisitions. Mutual holding company conversions were also excluded.
 Near duplicates were defined as those documents filed within 100 days of each other and having 97 percent or more similarity to one another. Most of these were the identical document, but some were amended and restated versions of the same document. Many of the documents contained extraneous text such as attachments to the main merger agreement. To remove this text, this study disregarded text following the first occurrence (if any) of “In witness whereof,” which typically signals the end of a merger agreement.
 The paragraph demarcations are unreliable because paragraphs are separated with carriage returns but so are page breaks, making it ambiguous in many cases whether particular text is separated by a new paragraph or a new page. The HTML documents have tags indicating new paragraphs and therefore do not suffer from this problem.
 See Dan Gusfield, Algorithms on Strings, Trees, and Sequences: Computer Science and Computational Biology 215–16 (1997) (discussing the Levenshtein distance).
 See id.
 See, e.g., Zhan Su et al., Plagiarism Detection Using the Levenshtein Distance and Smith-Waterman Algorithm, 2008 Innovative Computing Info. & Control 569, 569–72.
 See Robert A. Feldman & Raymond T. Nimmer, Drafting Effective Contracts: A Practitioner’s Guide 1-20 (2d ed. 2005) (discussing basic strategies in drafting contracts); James C. Freund, Anatomy of a Merger: Strategies and Techniques for Negotiating Corporate Acquisitions 26–27 (1975) (discussing how the power to make the first draft gives the drafter leverage over other parties).
 See Thomas E. Tyner, Mechanics of Document Drafting, in Drafting Business Contracts: Principles, Techniques & Forms 1-1, 1-16 (2015) (discussing the limitations lawyers face in suggesting revisions to a draft); Freund, supra note 25, at 28 (“Typically, the seller should live with the purchaser’s form of agreement, without being precluded in any way from negotiating any and all substantive matters.”).
 See Anderson & Manns, supra note 4, at 70 & Tbl. I.
 See id. at 75 & Fig. 2.
 See id. at 75–76; see also Robert Anderson & Jeffrey Manns, Engineering Greater Efficiency in Mergers and Acquisitions, 72 Bus. Law. 657, 678–79 & Tbl. IV.
 See generally Coates, supra note 10, at 16–28 (attributing the growth in the length of merger agreements over the past twenty years to “reactive growth,” such as new case law, statutes, and finance risks, “innovative growth” such as new ways of achieving client goals, as well as the increase in “linguistic complexity” of these documents).
 This data was generated by drawing ten paragraphs at random from each descendant and computing the normalized edit distance to the closest paragraphs in the immediate ancestor for each such paragraph.
 As Figure II highlights, we could have alternatively set the threshold for boilerplate text at eighty percent similarity, which would have entailed a very similar data set while still accounting for the extent of drafting changes from deal to deal.
 We used the cmdscale( ) function in R, which performs classic multidimensional scaling.
 See E. Allan Farnsworth, Contracts 296–97 (3d ed. 1999) (explaining that “in routine transactions the typical agreement consists of a standard printed form that has been prepared by one party and assented to by the other with little or no opportunity for negotiation”).
 See Susan L. Brody et al., Legal Drafting 3–5 (1994) (discussing “the myth that drafting is merely a fill-in-the-blank activity” and explaining the context-specific nature of legal drafting).
 See Anderson & Manns, supra note 4, at 74–75 & Figure 1.
 Each of the four generations of the internal controls requirement boilerplate comes from an SEC Edgar filing: