Introduction - adelekap/capstone_algo_trading GitHub Wiki


The speed, frequency, and volume of data are ever increasing. We are in the Information Age, or the Fourth Paradigm[12], and are experiencing a digital data deluge. The rise of real-time financial analytics (applying software and technology in combination with algorithmic methods to gather, process, and analyze data in order to gain insights and make decisions) is likely a result of this abundance of data. In this capstone project, I combine a number of different analytical tools to make inferences and predictions about the buying and selling of stocks on the New York Stock Exchange.

The Efficient Market Hypothesis

The Efficient Market Hypothesis[10] is a controversial investment theory that states that it is impossible to "beat the market". This theory, first proposed by Eugene Fama, holds that all important and relevant financial information is already built into the price of a stock, so you cannot profit from the overpricing or underpricing of a stock. This suggests that neither technical analysis (using past stock price movements to predict future stock prices) nor fundamental analysis (using financial information such as company earnings to select undervalued stocks) could make you a profit.[16] Supporters of the Efficient Market Hypothesis maintain that the only way to earn higher returns is to purchase higher-risk investments or to invest solely in index funds, making a profit on the overall upward trend of the market.

So what does this mean for active traders? Why would they attempt to make predictions about share prices if it is simply a fool's errand? Critics of the theory point to evidence that contradicts the Efficient Market Hypothesis[17]. First, stocks take time to respond to new information, usually because information takes time to disseminate. Traders can take advantage of being among the first to know some news about a security. Second, stock prices are heavily affected by human error and emotional decision making. Overreactions and market crashes are two market phenomena caused by emotional decision making. Traders can take advantage of overreactions on the grounds of mean reversion, the tendency of prices to drift back toward their longer-run average. Finally, there exist investors, such as Warren Buffett, who have proven that they can profit from market anomalies. Kevin Davey, a professional systems-developer trader, has also demonstrated that it is possible to create a successful, dynamic, and completely automated trading system.[5]
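As a rough illustration of the mean-reversion argument mentioned above, the sketch below flags a price that has moved unusually far from its recent average. It is not taken from this project: the function name `zscore_signal`, the 20-day window, and the threshold of two standard deviations are all assumptions chosen for the example.

```python
from statistics import mean, stdev


def zscore_signal(prices, window=20, threshold=2.0):
    """Return "buy", "sell", or "hold" for the most recent price.

    Hypothetical rule: compare the latest price with the mean of the
    `window` prices that precede it, measured in standard deviations.
    """
    if len(prices) < window + 1:
        return "hold"  # not enough history to judge

    history = prices[-window - 1:-1]  # the window before the latest price
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        return "hold"  # flat history, no meaningful z-score

    z = (prices[-1] - mu) / sigma
    if z <= -threshold:
        return "buy"   # price looks oversold relative to its recent mean
    if z >= threshold:
        return "sell"  # price looks overbought relative to its recent mean
    return "hold"
```

A rule like this only profits if prices do in fact revert, which is exactly the point on which supporters and critics of the Efficient Market Hypothesis disagree.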

The History of Quantitative Analysis

What is a Quant?

There are three key decisions to be made in trading securities: which positions should I take, how many shares of each should I own, and what is my exit strategy? If all of these questions are answered systematically (through theory-driven and data-driven statistical analysis), that trading entity is considered to be a "quant"; if any question is answered by a human, then it is not considered to be a "quant strategy".[34] Quant strategies are often considered to be complex and secretive, an enigmatic black box. However, they are simply systematic implementations of the kinds of things that human traders and investors have always done. In algorithmic trading, humans design and build the software that automates the trading, but once most systems go "live", human judgment is limited in the day-to-day management of the portfolio.[22] This can be beneficial, but it can also be dangerous. By providing instructions to an automated system, you eliminate human error and the emotional component from the trading process. A computer program can distance itself from risky transactions and, rather than participating in market overreactions, capitalize on them. However, a mechanical style of trading (a strategy that is purely computational, with no discretion on the human trader’s end) can fail. A trading system is the product of a human designer, and even the smallest errors in the software can lead to problems that accumulate slowly over time and then suddenly explode on a busy trading day. This possibility, and the high-stakes circumstances that algorithmic traders find themselves in, necessitate rigorous testing of algorithms and trading strategies. Some active traders adopt a mixed trading system, a style of trading that includes some aspects of algorithmic trading but incorporates some human judgment.
These hybrid systems can allow traders to accept or reject the signals that the system recommends, or let the trader decide the size of the monetary investment in a suggested position. Both purely mechanical and mixed systems have found success in practice. In this project, I employ a strictly rule-based strategy that generates signals based on the outputs of predictive models informed by historical data.
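To make "rule-based signals from model outputs" concrete, here is a minimal sketch of what such a rule could look like. It is an illustration only, not the strategy used in this project: the function name `generate_signal`, the 2% entry threshold, and the 1% exit threshold are hypothetical, and the predicted price is assumed to come from some model fitted to historical data.

```python
def generate_signal(current_price, predicted_price, holding,
                    entry_threshold=0.02, exit_threshold=-0.01):
    """Map a model's price prediction to "buy", "sell", or "hold".

    Hypothetical rule: buy when the predicted return is high enough and
    no position is held; sell when a held position is predicted to fall.
    """
    if current_price <= 0:
        raise ValueError("current_price must be positive")

    # Fractional return implied by the model's prediction.
    expected_return = (predicted_price - current_price) / current_price

    if not holding and expected_return >= entry_threshold:
        return "buy"
    if holding and expected_return <= exit_threshold:
        return "sell"
    return "hold"
```

Because every branch is fixed in advance, the same inputs always produce the same decision; no human judgment enters once the thresholds are chosen, which is what separates a mechanical system from a mixed one.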

Technology and Quantitative Finance

The field of quantitative finance began in the mid-1900s. The first "quants" include Harry Markowitz[18] and Robert Merton[19], who laid the foundation for quantitative trading with their statistical financial theories. Quant strategies began to take off with the 1971 debut of the first electronic stock market (NASDAQ) and the 1976 introduction of the designated order turnaround (DOT) electronic system for routing orders. This computerization of the market, along with the general increase in computing power, made it possible for researchers to gather and process larger volumes of data to identify patterns. At the turn of the 21st century, high frequency trading (HFT) became a common strategy that many traders used to place very fast orders.[8] These transactions required powerful computers housed next door to the stock exchange. HFT strategies have investment horizons of less than one day and generally try to end the day with no positions whatsoever. This style of trading uses the same algorithms that classic systematic traders have developed, just with a larger number of orders and at faster speeds. This capstone project trades only on a daily basis, and therefore has much longer investment time horizons than HFT. Another way that technology has impacted the world of quantitative finance is through online trading platforms (e.g. Robinhood[26], TD Ameritrade[33]) and algorithm development platforms (e.g. Quantopian[25]). Quantopian is a crowd-sourced quantitative investment firm that provides data, capital, education, and a research environment in which to develop new investment algorithms. The community is made up of over 150,000 members who create and share different algorithms for investing in the stock market.

Risks of Quant Strategies

Investors like risk; it is an inevitable component of investing in the stock market. What investors do not like is uncertainty: not knowing how large the risk is. There is risk in all trading, but algorithmic trading can be more susceptible to certain types of risk. The most basic and common form of risk in quant strategies is model risk. Models attempt to approximate the real world, and if a researcher does a poor job of modeling particular phenomena in the market, the strategy will likely not be profitable. A model may fail to represent the underlying process accurately because of poor optimization or parameterization, incorrect assumptions or priors, or even the choice of an inappropriate algorithm or learning style. Good quant strategies usually work most of the time, but fail when uncommon extreme events occur. One of the key strategies algorithmic traders use is fitting models to historical stock data to help predict the future. If markets behave in a particular way for a long time, a quantitative system will be very good at predicting similar behavior, because that is all it has seen up to that point. If there is a regime change, the system will suffer because the relationships among the features have changed, at least temporarily.[22] In addition, nonmarket information can be extremely critical to predicting how prices will move, and this is where humans excel and current algorithms fall short. Information that is entirely external to the market is not usually included in algorithmic financial systems, and an event such as a merger or a terrorist attack can cause quants to make inaccurate predictions. Many quants see only the numbers, and while the predictions that these systems make are grounded in sound statistics and probability, stock prices are influenced by a myriad of factors, and a model or algorithm is not guaranteed to find success.
Many financial analysts blame quants and algorithmic traders for the 2008 financial crisis and recession.[28] It is important to remember that the stock market is a complex ecosystem that will probably never be fully modeled. Still, the discipline, computing power, and scientific rigor of quant strategies can lead to interesting inferences about the market, and can allow for making some money along the way.

Motivations and Goals

Modeling stock market trades and realizing complex algorithms to place financial transactions is a well-studied problem in the fields of economics, machine learning, and social science. The large and comprehensive financial datasets available make this challenge well suited to machine learning techniques. There is a large volume of "algorithmic" literature on this subject, and in my capstone I wished to create a system that used the methods I found in this literature to be the most successful for this problem, while also incorporating the strategies and predictions that everyday human traders make. The aims of this capstone project were twofold. The first aim was to design and implement a trading framework that would gather historical stock price data, clean and store the data, suggest stocks to invest in, make predictions about the future of those stocks’ prices, and use a trading strategy to make a profit on hypothetical investments. The second, and arguably more valuable, aim was to incorporate various techniques and concepts learned from my courses into a single goal-directed project. These concepts include information research methods, database design, machine learning, artificial intelligence, and neural networks. This project was intended to demonstrate how techniques from all of these areas in the field of Information could be implemented in a system to gain a profit on stock market investments. I combine the concepts I learned in my courses, those I read in the literature, and my own intuition about how I would invest my money into a single trading system designed to help both those who are stock-market-savvy and those who are not. In this report, I outline the components of my trading system, PrOFIT (The Predictive Operative For Intelligent Trading).
I also go through the rationale for the choices I made in creating this system, describe its average performance in various market situations, and discuss what I learned from completing this project.