Hunting for Short Squeeze
Richard Kim & Shumiao Ouyang

Video Introduction

What is Short Squeeze

In October 2008, Volkswagen AG, the German automaker, briefly became the most valuable company in the world as its share price jumped from €209.63 on October 24, 2008 to €939.53 on October 28, 2008, a more than fourfold increase within 2 trading days. This rare phenomenon was attributed to a "Short Squeeze."


A short sale is an investment/trading method in which an investor borrows shares of a stock from a broker and sells them in the market, with the intention of buying them back later and returning them to the lender. Typically, investors short sell a stock if they have a pessimistic outlook on the underlying company. For example, an investor may believe that the company's earnings (or sales, operating cash flow, etc.) will decline substantially. Other reasons may include, for a pharmaceutical company, rejection of a drug by the FDA or, for a defense company, the loss of a government contract.

The short sale strategy is to sell the stock when its price is high and buy it back when its price is low. It is thus the opposite of the common and widely known investment strategy, "buy low, and sell high," and can be described as "sell high, and buy low." It is often used by hedge funds and sophisticated individual investors.

When a sufficiently large number of investors short sell a stock, the stock has the potential to experience a Short Squeeze if these short sellers attempt to cover (buy back) their short positions within a relatively short period of time. In such an instance, the stock price may jump rapidly, often by more than 5% and by as much as 30% to 50% in a day. As the Volkswagen example above shows, the price may even more than quadruple (a rise of roughly 350%) over a 2-day period.
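To make the arithmetic concrete, the sketch below computes the percentage move between the two Volkswagen prices quoted above, along with the resulting loss for a hypothetical short seller. The function names and the 100-share position are illustrative assumptions, and borrowing fees and margin requirements are ignored.

```python
def percent_change(old_price, new_price):
    """Percentage change from old_price to new_price."""
    return (new_price - old_price) / old_price * 100.0


def short_position_pnl(entry_price, cover_price, shares):
    """Profit (positive) or loss (negative) on a short position.

    A short seller sells at entry_price and later buys back at cover_price,
    so the position gains when the price falls and loses when it rises.
    """
    return (entry_price - cover_price) * shares


# Volkswagen share prices quoted in the text (EUR).
price_oct_24 = 209.63
price_oct_28 = 939.53

move = percent_change(price_oct_24, price_oct_28)

# Hypothetical position: 100 shares shorted on October 24, covered on October 28.
pnl = short_position_pnl(price_oct_24, price_oct_28, shares=100)

print(f"Price move: {move:.1f}%")
print(f"P&L on 100 shares short: EUR {pnl:,.2f}")
```

Because a short seller's loss grows with every further price increase, covering becomes more urgent as the price rises, and that buying pressure is what feeds the squeeze.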

For additional information about Short Squeeze please take a look at the link below:

For additional information about Volkswagen share price in October 2008:


Richard Kim, CFA, worked as an equity research analyst and investment banking analyst for approximately 6 years. During his tenure as an analyst, he personally saw numerous short squeezes in the market; moreover, as part of his duties, he had to seek out and predict these short squeeze opportunities. Richard analyzed trends in historical financial statements, other analysts' estimates, short interest data, etc., and applied human intuition, experience, and know-how to predict short squeeze trading opportunities.

While considered a highly risky strategy, a successful short sale trade can be highly rewarding, sometimes generating a return of over 20% in a matter of a few weeks. Thus, many analysts and traders invest a significant amount of time and energy in building the skills to successfully predict short squeezes.

So we asked the following question:

Can we teach a machine to predict Short Squeeze opportunities in the market?

The Data

With over 20,000 stocks traded in North American financial markets alone, in order to narrow down our focus, we targeted S&P 500 companies. S&P 500 companies represent some of the largest publicly traded companies in the US, and they are listed on various stock exchanges including NYSE and NASDAQ. We limited our data to 10 years.

S & P 500

Through Harvard Business School's Baker Library, we found Wharton Research Data Services (WRDS), which hosts a large array of financial data on US and international companies.

Wharton Research Data Services

There are countless variables that impact stock price movements. We decided to focus our research on the following variables:

  • Price & Volume

  • Short Interest

  • Historical Financial Statements

  • Equity Analysts' EPS Estimates

After eliminating companies that were missing any of the information above, we had a total of

377 Companies

over 9 million rows and

over 4 gigabytes of data
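The filtering step above amounts to keeping only the companies that appear in all four data sources. A minimal sketch of that idea follows; the tickers and the source names are hypothetical placeholders, not our actual WRDS tables.

```python
# Hypothetical coverage: which tickers appear in each of the four data sources.
coverage = {
    "price_volume": {"AAA", "BBB", "CCC", "DDD"},
    "short_interest": {"AAA", "BBB", "DDD"},
    "financial_statements": {"AAA", "BBB", "CCC", "DDD"},
    "eps_estimates": {"AAA", "DDD"},
}


def companies_with_complete_data(coverage_by_source):
    """Return the sorted tickers that are present in every data source."""
    ticker_sets = list(coverage_by_source.values())
    if not ticker_sets:
        return []
    # A company survives only if no source is missing it.
    return sorted(set.intersection(*ticker_sets))


complete = companies_with_complete_data(coverage)
print(complete)
```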

Financial Statements Analysis

We gathered key financial statement items (i.e., Revenue, Operating Income, Net Income, EPS, Assets, Cash & Equivalents, and Equity) for each company for every quarter, going back up to 10 years.

Financial Statement Trends

Is there a relationship between Short Squeeze Incidents and Earnings Announcement Dates?

We felt that we could narrow down our observations to Short Squeeze incidents that happen on or close to the day of a company's earnings announcement. Intuitively, when a company releases its quarterly financial statements, they sometimes contain information that surprises the market positively (or negatively). For example, revenue might have increased at a higher rate than historical trends would suggest, or Earnings per Share (EPS) might have been significantly higher than analysts' estimates.

Scatter Matrix

We could see that the Short Squeeze incidents were relatively normally distributed around the earnings announcement date, with the clear exception of a window of approximately 4 days on either side of day zero. Our intuition was likely correct in that these short squeeze opportunities happen as a result of new positive information released by the companies.
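The distribution described above relies on measuring how far each incident falls from the nearest earnings announcement. One way this offset could be computed is sketched below. The dates are made up, and counting in calendar days rather than trading days is an assumption made for simplicity.

```python
from collections import Counter
from datetime import date


def offset_from_nearest_announcement(event_date, announcement_dates):
    """Signed calendar-day offset from the nearest announcement.

    Negative means the event came before the announcement, positive means
    after, and 0 means it fell on the announcement day itself.
    """
    offsets = [(event_date - announced).days for announced in announcement_dates]
    return min(offsets, key=abs)


# Hypothetical announcement dates and squeeze incidents for one company.
announcements = [date(2014, 1, 28), date(2014, 4, 29), date(2014, 7, 29)]
squeeze_events = [
    date(2014, 1, 28),
    date(2014, 1, 30),
    date(2014, 4, 25),
    date(2014, 6, 10),
]

offsets = [offset_from_nearest_announcement(d, announcements) for d in squeeze_events]

# Counting incidents per offset gives the histogram around day zero.
histogram = Counter(offsets)
print(offsets)
```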

What factors in the Earnings (Financial Statements) Announcements impact Short Squeeze Incidents?

Scatter Matrix

Unfortunately, there were no visible differences between the financial results of announcements that resulted in a Short Squeeze incident and those that did not.

EPS Estimates & P/E Ratio

For each company, we collected the EPS estimates made by sell-side analysts for each period. For any given date, we calculated the 1-month moving average of the EPS estimates across all analysts. Using the average EPS estimate, we also calculated the P/E ratio, a popular valuation metric, for that date.
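The calculation above can be sketched as follows. The 30-calendar-day window, the function names, and the sample figures are illustrative assumptions. The sketch also assumes the EPS estimates are annual figures, so that price divided by EPS is a meaningful P/E ratio.

```python
from datetime import date, timedelta


def trailing_mean_estimate(estimates, as_of, window_days=30):
    """Average of the estimates issued in the window_days ending on as_of.

    estimates is a list of (issue_date, eps_estimate) pairs.
    Returns None when no estimate falls inside the window.
    """
    start = as_of - timedelta(days=window_days)
    in_window = [eps for issued, eps in estimates if start < issued <= as_of]
    if not in_window:
        return None
    return sum(in_window) / len(in_window)


def price_to_earnings(price, eps):
    """P/E ratio, or None when EPS is missing or not positive."""
    if eps is None or eps <= 0:
        return None
    return price / eps


# Hypothetical analyst estimates for one company and one fiscal period.
estimates = [
    (date(2014, 3, 1), 4.00),   # older than the window used below
    (date(2014, 4, 10), 4.40),
    (date(2014, 4, 20), 4.60),
]

as_of = date(2014, 4, 25)
avg_eps = trailing_mean_estimate(estimates, as_of)
pe_ratio = price_to_earnings(90.0, avg_eps)
print(avg_eps, pe_ratio)
```

Returning None for a missing or non-positive EPS is a design choice that keeps undefined P/E values out of later averages.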

Analyst EPS Estimates & P/E Ratio

How do EPS estimates, prices, and valuation change as the financial results announcement date approaches?

EPS Estimates & P/E Ratio Analysis

Interestingly, we found an overall increase in share prices before the announcement date, as well as an increase in the average EPS estimate. Even though we had not determined which variables are related to Short Squeeze events, we forged ahead and applied this data to a machine learning algorithm.

Hidden Markov Model

For this project, we took the opportunity to explore a machine learning technique that was not taught in the course. We considered many models, including Random Forest; however, due to the sequential nature of our data, we felt that a Hidden Markov Model (HMM) was the most appropriate model for this project.

For those who are not familiar with HMM, please take a look at the link below:

We used seqlearn, a Python library for supervised sequence learning (including HMMs) that is modeled after scikit-learn.

We tested two different sequence models available in the seqlearn module:

Multinomial HMM vs Structured Perceptron

While the scores of both Multinomial HMM and Structured Perceptron appear quite high, this is misleading. As was demonstrated in problem set 5, because the overwhelming majority of days have no Short Squeeze event, we would also have scored quite high had we simply predicted all '0's.

In fact, we found that Multinomial HMM never made a single prediction of '1', yet it attained a higher average score than Structured Perceptron. Structured Perceptron, on the other hand, did make predictions of '1' and still managed to score above approximately 90 percent. We therefore believe Structured Perceptron yields superior results.
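The effect of class imbalance on the score can be illustrated with a small, self-contained sketch. The labels and predictions below are invented for illustration and are not our model outputs. Recall on the '1' class is included as one possible complementary measure, which is an addition not used in our scoring above.

```python
def accuracy(y_true, y_pred):
    """Fraction of days on which the prediction matches the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall_for_positive(y_true, y_pred):
    """Share of actual '1' days predicted as '1'; None if there are none."""
    predictions_on_positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    if not predictions_on_positives:
        return None
    return sum(p == 1 for p in predictions_on_positives) / len(predictions_on_positives)


# Hypothetical labels: 100 trading days, 3 of which are squeeze days.
y_true = [0] * 100
for day in (20, 55, 90):
    y_true[day] = 1

# A "model" that never predicts a squeeze.
all_zero = [0] * 100

# A hypothetical model that catches 2 of the 3 squeezes but raises 5 false alarms.
some_ones = [0] * 100
for day in (20, 55, 10, 30, 40, 60, 70):
    some_ones[day] = 1

print("all zeros:", accuracy(y_true, all_zero), recall_for_positive(y_true, all_zero))
print("some ones:", accuracy(y_true, some_ones), recall_for_positive(y_true, some_ones))
```

In this toy example the all-zero predictor has the higher accuracy while detecting no squeeze days at all, which mirrors the pattern we observed between the two models.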

We then decided to test one of Structured Perceptron's parameters, the maximum number of iterations, which is the number of training passes (epochs) over the data, with each sequence used once per pass.

Parameter Test of Structured Perceptron Model

The boxplot shows that there may be a modest improvement in prediction score as Max Iteration increases; however, the score levels off once the parameter reaches around 10.

Final Thoughts

While we can't claim to have found a method to perfectly predict Short Squeeze events in the market (if we had, we would be in a position to make a lot of money!), we believe the Structured Perceptron is an interesting machine learning method for sequential data such as stock price movements.

We believe there is a lot of room for further exploration with the Structured Perceptron method.

iPython Notebook & Database

About Us

Richard Kim

Richard Kim, CFA

Richard worked as an investment banking analyst and an equity research analyst, focusing on Japanese financial markets, for approximately 6 years prior to coming back to school. He is currently pursuing an ALM in Information Technology - Mathematics & Computation through the Harvard Extension School. Richard's research interests include machine learning and social computing as they relate to financial markets and natural languages.

Shumiao Ouyang

Shumiao Ouyang is a graduate student at Peking University - Guanghua School of Management, pursuing a Master's in Finance. He worked as CFO of a start-up, Toyhouse (Tsinghua's Education Innovation Lab), and has participated in several academic research projects in economics and finance. He received the Excellent Student Award of Peking University in 2014. Shumiao is interested in pursuing a PhD in Economics.