Whether a binding international legal tax regime exists is a persistent debate defining the academic field of international taxation. There is no formal ‘world tax organisation’, nor a comprehensive multilateral agreement controlling the taxation of cross-border transactions. Yet multiple commentators have suggested that a customary law of taxation exists, or at least that most jurisdictions adhere to some ‘soft legal standards’.

Most international transactions are taxed in accordance with the laws of the jurisdictions involved and their roughly 3,000 bilateral tax treaties. While they only bind the jurisdictions that signed them, these treaties are believed to be so similar that some see them as an expression of a binding customary transnational legal framework for the taxation of cross-border activities. Most treaties follow non-binding models that are published occasionally by the Organisation for Economic Co-operation and Development (OECD) or the United Nations (UN). The OECD is seen as so influential in this context that some have referred to it as a de facto world tax organisation.

To date, there has been no empirical assessment of such claims. In our paper, ‘The Making of International Tax Law: Empirical Evidence from Natural Language Processing’, we offer the first attempt at such an analysis.


We build a dataset of 4,052 bilateral income tax treaties, of which about 3,000 are currently in force. We also collect 16 model tax treaties published by the UN, OECD and the US. We use standard methods of natural language processing to perform pair-wise text-based comparisons of all treaties in effect in any given year.

In the database used, the treaties were already segmented by clause. We split each clause into sentences. The sentences are used to produce phrases up to four words in length. The text features are filtered (by parts of speech, eg, noun/adjective) to focus on technical key phrases from international tax law such as ‘income from immovable property,’ ‘income from government securities,’ ‘preparatory or auxiliary character,’ ‘has an habitual abode,’ etc. These phrases provide more legal information than single words or unfiltered phrases. In particular, this method captures the highly context-dependent meanings of individual words, such as ‘income’ (eg, ‘income from immovable property’ versus ‘income from government securities’).

Rare words and phrases are excluded from the final vocabulary, while party countries and non-party countries are tagged with special tokens. The outcome of this process is that each treaty is represented as a vector of frequencies of 45,259 phrases.
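A minimal sketch of this vectorisation step, using only the Python standard library. It is an illustration under stated assumptions, not the pipeline used in the paper: the function names, the `min_docs` threshold, and the `STOP_EDGES` word list are hypothetical, and a crude rule (phrases may not begin or end with a function word) stands in for the part-of-speech filter described above. Tagging of party and non-party countries is omitted.

```python
import re
from collections import Counter

# Hypothetical stand-in for the part-of-speech filter: a phrase may not
# start or end with one of these function words.
STOP_EDGES = {"the", "a", "an", "of", "from", "or", "and",
              "in", "to", "by", "is", "be", "shall"}


def phrases(text, max_len=4):
    """Return all phrases of 1 to max_len words found within each sentence."""
    found = []
    for sentence in re.split(r"[.;]\s*", text.lower()):
        words = re.findall(r"[a-z]+", sentence)
        for n in range(1, max_len + 1):
            for i in range(len(words) - n + 1):
                gram = words[i:i + n]
                if gram[0] in STOP_EDGES or gram[-1] in STOP_EDGES:
                    continue
                found.append(" ".join(gram))
    return found


def build_vectors(docs, min_docs=2):
    """Represent each document as a list of phrase frequencies.

    Phrases that occur in fewer than min_docs documents are treated as
    rare and dropped from the vocabulary.
    """
    counts = [Counter(phrases(d)) for d in docs]
    # Iterating a Counter yields each distinct phrase once per document,
    # so this is a document frequency, not a total count.
    doc_freq = Counter(p for c in counts for p in c)
    vocab = sorted(p for p, k in doc_freq.items() if k >= min_docs)
    vectors = [[c[p] for p in vocab] for c in counts]
    return vocab, vectors
```

Each row of `vectors` has one entry per vocabulary phrase, so a phrase such as ‘income from immovable property’ is counted as a unit rather than as four unrelated words.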

Treaties are then compared using cosine similarity, a standard measure in the literature on information extraction and document classification. Cosine similarity is computed from the angle between the vectors, such that documents which contain similar phrase counts ‘point’ in the same direction and result in a higher value. The output is a pair-wise similarity measure for N × (N – 1) = 19,092,530 treaty pairs. This dataset of cosine similarities is used in the empirical analysis.
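The measure itself is simple to state. The sketch below computes cosine similarity for two frequency vectors and then for every ordered pair of documents, which yields N × (N – 1) values. The function names are illustrative, and treating a pair with an all-zero vector as having similarity 0 is an assumption made here to avoid division by zero.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # assumption: an empty document is similar to nothing
    return dot / (norm_u * norm_v)


def pairwise(vectors):
    """Similarity for every ordered pair (i, j) with i != j."""
    n = len(vectors)
    return {
        (i, j): cosine_similarity(vectors[i], vectors[j])
        for i in range(n)
        for j in range(n)
        if i != j
    }
```

Because the measure depends only on direction, a treaty that repeats every phrase twice as often as another still scores close to 1 against it; what matters is the relative mix of phrases, not document length.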


The first question is whether tax treaties have become more similar in their language over time. Figure 1 below presents the mean similarity by year in the language of all tax treaties in effect, for the years 1964 through 2014 (in earlier years, there were too few treaties to form averages). A value of ‘1’ would denote complete identity between all treaties compared, while a value of ‘0’ would represent no similarity in language. Values at around 0.6 are generally considered to represent a high degree of similarity. Error spikes give the 25th and 75th quantiles for similarity. We identify clear trends of convergence of legal language in bilateral tax treaties during this period.

Figure 1
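The yearly series plotted in Figure 1 can be sketched as follows. This is an illustration only: the data layout (a dictionary of treaty identifiers mapped to hypothetical start and end years, and a dictionary of similarities keyed by unordered pair) and the function names are assumptions, not the structure of the actual dataset.

```python
from itertools import combinations
from statistics import mean, quantiles


def in_effect(span, year):
    """span is (start_year, end_year), with end_year None if still in force."""
    start, end = span
    return start <= year and (end is None or year <= end)


def yearly_summary(treaties, similarity, years):
    """Return {year: (mean, 25th percentile, 75th percentile)}.

    treaties:   {treaty_id: (start_year, end_year or None)}
    similarity: {frozenset({id_a, id_b}): cosine similarity}
    """
    summary = {}
    for year in years:
        active = sorted(t for t, span in treaties.items()
                        if in_effect(span, year))
        values = [similarity[frozenset(pair)]
                  for pair in combinations(active, 2)]
        if len(values) < 2:
            continue  # too few pairs to form quartiles
        q1, _, q3 = quantiles(values, n=4)
        summary[year] = (mean(values), q1, q3)
    return summary
```

Years with too few treaty pairs are skipped, which mirrors the reason the plotted series starts only once enough treaties are in effect.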

To explore the institutional source of such consensus, we compare bilateral treaties to the model treaties. Figure 2 below shows the trends in similarity of newly concluded treaties to the three models, OECD, UN and US. This graph gives the average similarity of all treaties in a given year, to all versions of all the models ever published. The story is the same when picking particular models, or only the ones that are currently in effect. On average, recent active treaties are most similar to OECD models, and least similar to US treaty models.

Figure 2

Next, we look at how the introduction of new models affects bilateral treaties signed in the two years before and after the introduction of the new model. For each of these treaties, we compute the cosine similarity to the new model, divided by the cosine similarity to the previous model. This ratio measures the influence of the new model relative to the old model for each treaty. A discrete increase in the level or trend after a new model is published indicates that the model is having an impact: new treaties are following it.
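A minimal sketch of this ratio and of its monthly averages around a model's publication date. The record layout (months relative to publication, similarity to the new model, similarity to the previous model), the function names, and the decision to skip treaties with zero similarity to the previous model are assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def relative_similarity(sim_to_new, sim_to_old):
    """Similarity to the new model divided by similarity to the old one."""
    if sim_to_old == 0:
        return None  # ratio undefined; such treaties are skipped below
    return sim_to_new / sim_to_old


def monthly_means(records, window=24):
    """Average ratio per month relative to the new model's publication.

    records: iterable of (months_from_publication, sim_to_new, sim_to_old),
    where negative months are treaties concluded before publication.
    """
    by_month = defaultdict(list)
    for month, sim_new, sim_old in records:
        if abs(month) > window:
            continue  # outside the two-year window on either side
        ratio = relative_similarity(sim_new, sim_old)
        if ratio is not None:
            by_month[month].append(ratio)
    return {m: mean(r) for m, r in sorted(by_month.items())}
```

A ratio above 1 means a treaty resembles the new model more than the one it replaced; a jump in the monthly averages at month 0 is the kind of discrete break described above.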

Figures 3a to 3c plot these relative similarity trends for the OECD, UN, and US models, respectively. The dots show the average relative similarity of treaties concluded in each of the 24 months before and after the introduction of a new model. New OECD models in Figure 3a create a discrete upward break in the trend, meaning that the new model is having an impact and new treaties are following it. This is in clear contrast to the UN (Figure 3b) and US (Figure 3c) models, where there is no change in the level or trend. In addition, for both the OECD and US models, we see positive trends in the pre-period, meaning that the models themselves draw on pre-existing trends in treaty design.

Figure 3a: New OECD models

Figure 3b: New UN models

Figure 3c: New US models

The trends we identify are robust to controls for whether treaties are concluded between two high-income countries, two low-income countries, or a high-income and a low-income country.

Conclusion: The OECD as key institutional standard-setter

Overall, our findings support the view that a trend towards international legal consensus exists, and that the OECD is the key institutional source of the consensus building process. The OECD seems to play an effective role as a standard-setter in tax treaty matters.

Our empirical investigation has important implications for the international tax regime debate. Specifically, if countries are free to adopt whatever tax rules they wish, a high level of variance in tax treaty language is to be expected. This is because in tax treaty negotiations, each country pair has unique circumstances and countries will try to adopt the position that best serves their national interest. For example, one country may be a net capital exporter in relation to one treaty partner, but a capital importer in relation to another. A country may hold a strong negotiating position vis-à-vis one treaty partner (for example, due to economic size), but a weak stance against another. Different pairs of countries may present varying levels of kinship or animosity, whether diplomatic or cultural. In a period of increasing economic, political, cultural, and legal complexity, one would expect greater diversity in tax law language and decreasing similarity across treaties. Yet we find the opposite. The convergence in legal language may suggest that countries are guided by transnational legal considerations.

This does not mean that a customary international law of taxation exists, because we cannot conclude that countries act under a sense of legal obligation. However, the empirical findings paint the OECD as the institutional standard-setter in international tax matters. Even though its tax policy recommendations are not binding, countries seem to defer to OECD policy preferences.

Elliott Ash is Assistant Professor of Law, Economics, and Data Science at ETH Zurich, Switzerland.

Omri Marian is Professor of Law and Academic Director, Graduate Tax Program at University of California, Irvine School of Law.