One of the first research articles that attempted to assess the impact of information on financial markets was the paper “The analysis of world events and stock prices” by Victor Niederhoffer, published in 1971 in the Journal of Business. That article concludes with the author saying: "I hope that this study will stimulate other quantitative research on the effect of information on markets". Indeed, since then, the field has taken off.

Early research in the field used raw measures of information, such as simple counts of news articles. In recent years, computational linguistics, natural language processing, machine learning and econometrics have developed rapidly, and data from media articles, online information exchange websites and social networks have become much more widely available. These developments have enabled researchers to move beyond simply counting articles and arbitrarily classifying information as “good” or “bad”. Recent papers quantify the content of news algorithmically (that is, using software) and estimate the effect of media columns and online discussions on stock prices. Recent research has also studied the relationship between information and institutional trading, as well as the effect of media coverage during mergers and acquisitions (M&As), initial public offerings (IPOs), and the recent European financial crisis.

Moving forward, one of the primary goals of this research field will be to study, in a deeper and more sophisticated way, the interaction among four entities that provide information to, and receive information from, one another: analysts, corporations, institutional investors, and the financial media. Each of these entities is a provider of information: analysts, through their recommendations; corporations, through their corporate filings; institutional investors, through their trading behavior and their filings, through public interventions such as interviews and press conferences (in the case of activist investors), and through periodic letters to investors; and the media, through the articles they publish.

There is a series of open questions that the current literature has not sufficiently answered. How does the information flowing between these entities affect each of them? Do subgroups exist within these entities that are able to outperform their peers (star analysts, star fund managers)? Can this outperformance be attributed, at least partially, to better or faster access to information signals, in terms of receiving, processing, interpreting and acting on them? How fast do various investor classes analyze information and trade on it? How are they able to distinguish between significant and insignificant information signals? Analyzing investors as a single cross section may not be sufficient, as there is significant heterogeneity between investor classes (retail investors, mutual funds, hedge funds, algorithmic traders, proprietary traders, market makers, etc.). Does investing heavily in IT and top-notch technology play a role in the performance of these entities? Does recruiting people from top universities, or people with top grades, publications or doctoral degrees, make a difference? What is the role of networks in financial markets? Does having studied at the same university, in the same period or the same major, affect investment decisions (friends might influence one another through their social interactions)?

One possible way to move forward is to make use of recently developed techniques from computer science and machine learning, such as topic modeling algorithms (for example, Latent Dirichlet Allocation). Another area of interest might be to study whether hybrid models can improve on the econometric performance of the current financial word lists. Such models could be developed by combining readily available sentiment analysis platforms (such as the Python NLTK platform) with financial sentiment word lists.
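As a rough illustration of what such a hybrid model might look like, the sketch below blends a general-purpose sentiment score with a finance-specific one. Everything in it is an assumption made for illustration: the four tiny word lists are hypothetical placeholders (not an actual published financial word list), the general-purpose lists merely stand in for the output of an off-the-shelf analyzer such as those available through NLTK, and the function names and the `finance_weight` parameter are invented here.

```python
import re

# Hypothetical toy word lists, for illustration only. A real study would
# substitute an established financial word list and a general-purpose
# sentiment analyzer for these placeholders.
GENERAL_POSITIVE = {"good", "great", "strong", "improve"}
GENERAL_NEGATIVE = {"bad", "poor", "weak", "decline"}
FINANCE_POSITIVE = {"profit", "upgrade", "outperform", "dividend"}
FINANCE_NEGATIVE = {"loss", "default", "downgrade", "liability", "bankruptcy"}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def lexicon_score(tokens, positive, negative):
    """Return (positive count - negative count) / total tokens, in [-1, 1]."""
    if not tokens:
        return 0.0
    pos = sum(token in positive for token in tokens)
    neg = sum(token in negative for token in tokens)
    return (pos - neg) / len(tokens)


def hybrid_score(text, finance_weight=0.5):
    """Weighted average of the finance-specific and general-purpose scores.

    finance_weight is an assumed tuning parameter between 0 and 1; in an
    empirical setting it would be estimated rather than fixed by hand.
    """
    tokens = tokenize(text)
    general = lexicon_score(tokens, GENERAL_POSITIVE, GENERAL_NEGATIVE)
    finance = lexicon_score(tokens, FINANCE_POSITIVE, FINANCE_NEGATIVE)
    return finance_weight * finance + (1 - finance_weight) * general


if __name__ == "__main__":
    headline = "Weak results, loss and default risk"
    print(hybrid_score(headline))
```

A word such as “liability” shows why the blend may matter: it counts as negative only in the finance-specific list here, so its contribution to the final score depends entirely on `finance_weight`. Whether a blend of this kind actually explains returns better than either component alone is precisely the empirical question raised above.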

These and many other questions are open for researchers to tackle in the coming years, in what promises to be a fascinating field of financial research. As Hal Varian wrote in an Op-Ed in the New York Times on September 23, 2004: “Perhaps computational linguistics and textual data mining will become the new hot technologies in financial economics”.


Andreas Chouliaras is a PhD Candidate in Finance at the Luxembourg School of Finance. This blog post is a summary of his paper “The Effect of Information on Financial Markets: A Survey”. His working papers can be accessed here.