Imagine that you have 1000 USD on the evening of 31st of December, 2017. In addition to that you happen to know daily prices of all US stocks in 2018. How much money could you make by taking into account the future and investing perfectly?

## Prelude

Obviously, this does not make much practical sense, since we don’t know the future. However, I was curious how much “knowing the future” outperforms passive investing. In other words, this could in some sense be an upper bound for investment returns.

## Assumptions

The world is complex, so let’s make a bunch of assumptions to simplify the problem a bit:

• On a given day, only one of the following actions is allowed:
  • Buy one position with all your cash at the lowest price available during that day.
  • Sell everything at the highest price available that day.
  • Do nothing.
• Transactions are free.
• One can only buy securities that are available on IEX (I had to get the data somewhere).
• All transactions succeed no matter what (e.g. volumes are ignored).
• One can buy fractional shares.

It may look like holding at most one position at a time is limiting. However, there is always an optimal trading sequence of this form. Imagine that you bought more than one stock. Then you sold them and got some profit:

• imagine you got the same profit from both. Then you could have bought only one of them originally and the overall sequence profit would have stayed exactly the same. In other words, if the first sequence (with more than one stock at a time) is optimal, then the resulting sequence with at most one stock at a time is also optimal.
• otherwise, one of them brings more profit. Thus, if you buy only this better stock (instead of two stocks), you will only improve overall sequence profit.

That’s a simplistic proof.

## Results

Let’s start with the results. This way if you are not interested in all the details, you don’t have to waste your time going through them.

So, you get 2235976324368894910070981708437303314132384811588503138465677325225209962386423808 USD. Yeap, for people confused, this is 2.2e+81 USD. <here goes a default boring comparison with number of atoms in the universe>. This is equivalent to 63.93% return on investment, daily. A lot, right? Penny stocks happened to be very volatile, you know. And in order to achieve this, you just needed to do these 107 transactions. Easy-peasy.

Obviously, now this feels boring and less exciting than I expected it to be. Moreover, I feel like buying stocks for 2.2e+81 USD can be tricky these days. What if we forbid penny stocks? In particular, let’s not buy anything cheaper than 5 USD per share. Boom, 1537713047211124526310105417510931136512 USD (i.e. 1.5e+39, daily ROI 25.64%, transactions).

Argh, this is still just some unrealistically large number. And I don’t recognize literally any of these stocks. Ok, let’s get really conservative here and restrict ourselves to S&P 500 only. Buying penny stocks can be tricky to execute, trading volumes can be low, volatility may be too volatile. So S&P 500 it is. And… 30 853 548 412.13 USD (i.e. 3.1e+10, i.e. almost 31 billion, ROI 3.1e+09%, daily 4.84%). So what did you have to do? Just these dozen of dozens transactions.

Ok, so this is now much more interesting (at least for me). The number is somewhat realistic and the first transaction involves AMD, which I know. I declare this a success!
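The daily percentages quoted above appear consistent with spreading the total growth evenly, as compound interest, over 365 calendar days. That day count is inferred from the numbers and is not stated anywhere in the post, so treat the following Python snippet as a sanity check under that assumption (the function name `daily_roi` is made up for illustration):

```python
def daily_roi(final_usd, initial_usd=1000.0, days=365):
    """Constant daily return that turns initial_usd into final_usd after `days` days.

    days=365 is an assumption inferred from the quoted percentages.
    """
    return (final_usd / initial_usd) ** (1.0 / days) - 1.0


results = [
    ("all stocks", 2.235976324368895e81),
    ("nothing cheaper than 5 USD", 1.5377130472111245e39),
    ("S&P 500 only", 30853548412.13),
]

for label, final_usd in results:
    print(f"{label}: {daily_roi(final_usd) * 100:.2f}% per day")
```

With trading days only (roughly 250 per year) the per-day figures would come out noticeably higher, which is why the 365-day reading seems to be the one that matches.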

If you are curious, my code is available in this GitHub repository.

## Further research

I am curious to look more into “S&P 500 only” transactions, e.g. how long each company was in the portfolio, how large their relative impact on the final result was and so on. However, this post already took 7 hours and I haven’t even started writing the methodology yet. Moreover, I decided that I will have breakfast only after finishing this post and it is 12:03 already and I am starving now. And I have super tasty curry in the fridge, which I cooked yesterday. Obviously, it was not my original plan to eat curry for breakfast, but it is also not strictly a breakfast anymore. So I hope you understand me (and curry) and I may write another post about this some day.

## Methodology

I am pretty sure you all would love to know how I got the numbers above.

### Dataset

Obviously, I needed to get daily prices from somewhere, so I took them from iextrading.com (API docs). These folks don’t even require registration (i.e. you don’t need an API key), which is great!

For S&P 500 I just used some random datahub.io csv dataset, which was built from Wikipedia.

### Optimization

Once we have the numbers, how do we find the optimal sequence of transactions?

The key observation here is that if we have X shares of a given stock Y on day Z, then we don’t care at all how we arrived at this state. E.g. we could have bought them yesterday or 1 week ago, but this has zero effect on the final outcome (i.e. amount of money we have in the end).

This allows us to apply so-called dynamic programming. Basically, for each day of 2018 and each available stock we want to calculate the maximum amount of that stock we could have had. I.e. best_quantity[2018-03-14][AAPL] is the maximum number of AAPL shares we could have had on Pi day in 2018 (i.e. by doing optimal transactions before).

According to the problem definition, we know that best_quantity[2017-12-31][$] is 1000 and best_quantity[2017-12-31][<any stock>] is 0 (we have 1000 USD and no stocks at the beginning). Then we can notice that we can calculate best_quantity[<date> + 1] if we know best_quantity[<date>]. If we are in some state on some date, there are only 3 actions we can perform:

• hold what we have
• if we have cash, buy something
• if we have something, sell and get cash.

Thus, we can just go over all states, try to perform each action and see whether this improves the result for the new state. For example, buying stock X means:

best_quantity[<date> + 1][<stock X>] = max(best_quantity[<date> + 1][<stock X>], best_quantity[<date>][$] / <price of X on <date + 1>>)

In other words, imagine that you have best_quantity[<date>][$] USD on the day <date> (evening) and you want to buy <stock X>. You can do this by waiting until tomorrow, i.e. <date + 1> and just buying that stock according to the current price. Why do we have max? Because there may be other better ways to get more of this stock (e.g. by buying it earlier at a cheaper price and just holding). By doing this for each date and each symbol, we get our final answer in best_quantity[2018-12-31][$].
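The recurrence can be sketched in Python roughly as follows. This is a minimal illustration and not the code from the repository: the input shape (a chronological list of `(date, {symbol: (low, high)})` pairs) and all names are assumptions. Buying uses the day's low and selling the day's high, as in the assumptions listed earlier. The sketch also remembers which action produced each improvement, so that the sequence of trades can be read back at the end.

```python
CASH = "$"


def optimal_trades(days, initial_cash=1000.0):
    """days: chronological list of (date, {symbol: (low, high)}) pairs (hypothetical format).

    Returns (final_cash, trades); trades is a list of (date, action, symbol, price).
    """
    best = {CASH: initial_cash}  # best_quantity as of the previous evening
    history = []                 # per day: which action improved which state

    for date, prices in days:
        nxt = dict(best)         # "do nothing" carries every state over unchanged
        moves = {}
        cash = best[CASH]
        for symbol, (low, high) in prices.items():
            # Buy with all of yesterday's cash at today's lowest price.
            bought = cash / low
            if bought > nxt.get(symbol, 0.0):
                nxt[symbol] = bought
                moves[symbol] = ("buy", symbol, low)
            # Sell everything held yesterday at today's highest price.
            sold = best.get(symbol, 0.0) * high
            if sold > nxt[CASH]:
                nxt[CASH] = sold
                moves[CASH] = ("sell", symbol, high)
        history.append((date, moves))
        best = nxt

    # Walk backwards from the final cash state to recover the trades.
    trades = []
    asset = CASH
    for date, moves in reversed(history):
        if asset in moves:
            action, symbol, price = moves[asset]
            trades.append((date, action, symbol, price))
            # Before a sell we held the stock; before a buy we held cash.
            asset = symbol if action == "sell" else CASH
    trades.reverse()
    return best[CASH], trades


toy_days = [
    ("day 1", {"A": (10.0, 12.0)}),
    ("day 2", {"A": (11.0, 20.0)}),
    ("day 3", {"B": (5.0, 6.0)}),
]
final_cash, trades = optimal_trades(toy_days)
print(final_cash, trades)
```

Both the buy and the sell read from the previous evening's `best`, never from `nxt`, which is what enforces the "one action per day" rule. The work per day is proportional to the number of symbols with a price that day.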

How to get the actual optimal sequence of transactions? Each time some best_quantity is improved, we can just save which action has led to this improvement (i.e. from which state we got there). Using this information, we can easily see how we got to best_quantity[2018-12-31][$] and go to the previous state. And then just repeat this process of going backwards, until we reach our initial state of having 1000 USD. Once we know 2 adjacent states, reverse engineering which action it was is trivial.

### Issues

Obviously, this wasn’t as smooth as I expected (and I have to explain spending at least 7 hours on this).

### Noisy data

Mostly the issue was with weird prices in the data. Basically, my first attempts returned something like 10^124 USD, which was somewhat larger than I expected. After inspecting the proposed sequence of stocks, I discovered some very lucrative transactions, e.g.:

2018-05-30 buy ZAZZT 7.87048936311242e+47 @ 0.9
2018-05-31 sell ZAZZT 8.185301067147553e+52 @ 103999.9  ROI = 115554.44444444444

and in the data it was

['2018-05-29', None, None, None, 0.9],
['2018-05-30', None, None, None, 0.9],
['2018-05-31', 87359.95, 103999.9, 103999.9, 87359.95],
['2018-06-01', None, None, None, 87359.95],
['2018-06-04', None, None, None, 87359.95],

which looked very fishy, so I just dropped all price points where at least one value was missing (i.e. 35098 price points out of 1962200, i.e. about 1.8% - yeah, I was worried about dropping too much, so I checked the percentage).

This brought the result down to 10^115 USD. This time the transactions were less awkward, but still odd. Example culprit:

2018-02-12 buy ZWZZT 2.5002156249822376e+16 @ 12.5
2018-02-14 sell ZWZZT 1.600121998608632e+21 @ 63999.36  ROI = 5118.9488
['2018-02-09', None, None, None, 18.1],
['2018-02-12', 12.5, 12.5, 12.5, 12.5],
['2018-02-13', None, None, None, 12.5],
['2018-02-14', 5, 63999.36, 63999.36, 5],
['2018-02-15', None, None, None, 5],

Looks like an outlier. Thus, let’s kick out price points with all 4 values being the same (65112 out of 1927102) and with a max/min ratio larger than 10 (20 points out of the remaining 1861990).
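Taken together, the clean-up rules so far (drop points with a missing value, drop points whose four values are identical, drop points whose max/min ratio exceeds 10) could be expressed as one predicate. The Python below is an illustrative sketch rather than the repository code; the name `keep_price_point` and the extra guard against non-positive prices are assumptions, and it deliberately does not depend on which of the four columns is which.

```python
def keep_price_point(values, max_ratio=10.0):
    """values: the four daily price values of one price point, in any order."""
    # Rule 1: at least one value is missing.
    if any(v is None for v in values):
        return False
    # Extra guard (an assumption, not from the post): avoid dividing by zero below.
    if min(values) <= 0:
        return False
    # Rule 2: all four values are the same.
    if len(set(values)) == 1:
        return False
    # Rule 3: the spread within a single day is implausibly large.
    if max(values) / min(values) > max_ratio:
        return False
    return True


# Rows taken from the listings in this post (date column stripped).
print(keep_price_point([None, None, None, 0.9]))
print(keep_price_point([12.5, 12.5, 12.5, 12.5]))
print(keep_price_point([5, 63999.36, 63999.36, 5]))
print(keep_price_point([87359.95, 103999.9, 103999.9, 87359.95]))
```

Note that the last row passes all three rules even though it belongs to a test symbol with absurd prices, which hints at why these rules alone may not be enough.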

This gets us down to 10^84 USD. The only suspiciously large ROI now was

2018-10-04 buy ZXZZT 5.209604829132704e+61 @ 10.86
2018-10-10 sell ZXZZT 6.338984640830725e+66 @ 121678.8  ROI = 11203.309392265193

Fighting this one was not obvious, so I started looking at trading volumes as well:

['2018-10-03', 100, None, None, None, 10],
['2018-10-04', 1200, 10.86, 11, 11, 10.86],
['2018-10-05', 100, None, None, None, 10.86],
['2018-10-08', 835, 0.0072, 0.0072, 0.0072, 0.0072],
['2018-10-09', 800, 168998.31, 168998.31, 168998.31, 168998.31],
['2018-10-10', 400, 121678.79, 121678.8, 121678.79, 121678.8],
['2018-10-11', 100, None, None, None, 121678.8],
['2018-10-12', 300, 115198.85, 127998.72, 127998.72, 115198.85],
['2018-10-15', 400, 99999.9, 100000, 100000, 99999.9],

For simplicity I just removed all price points with less than 1000 trading volume (51610 out of remaining 1861970). This gave 2.235976324368895e+81, which you have already seen above.

### Penny stocks

After morally accepting that number, I noticed that the majority of the transactions are just penny stocks anyway, which is not that interesting (they are volatile like crazy). So I just stopped “buying” anything cheaper than $5 per share. I know that not all of these are penny stocks, but this affects only 226071 price points (and I am hungry, remember the curry), so the limit is $5.

This brought us down to 1.5377130472111245e+39 USD, which still didn’t feel completely satisfying. So that’s how I started limiting stocks to S&P 500.
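Both restrictions, the price floor and the S&P 500 universe, only need to constrain the "buy" action of the optimization. A hedged Python sketch of such a check is below. The function name, its parameters, and the idea of comparing the buy price itself against the floor are all assumptions for illustration. Whether the S&P 500 run also kept the $5 floor is not stated here, so both limits are optional.

```python
def may_buy(symbol, buy_price, min_price=0.0, allowed_symbols=None):
    """Return True if the optimizer is allowed to open a position in `symbol`."""
    # Price floor, e.g. min_price=5.0 for the "no penny stocks" run.
    if buy_price < min_price:
        return False
    # Optional universe restriction, e.g. the set of S&P 500 tickers.
    if allowed_symbols is not None and symbol not in allowed_symbols:
        return False
    return True


print(may_buy("ZXZZT", 10.86, min_price=5.0))
print(may_buy("ZAZZT", 0.9, min_price=5.0))
print(may_buy("AMD", 11.0, allowed_symbols={"AMD"}))
```

Selling is left unrestricted in this sketch, so a position bought above the floor can still be sold at any price.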

And now I am going to finally eat my curry (“Hooray!” says my stomach).