A new year calls for a new topic (Python) on The Babylonians. Python for Finance would be a brand new 2020 series that I decide to embark on after my previous reflection. The reason is because I see the future of finance, or any industries for that matter, advancing towards this direction. Hence, I have decided to pick up some new skills to stay relevant and hopefully also employable.
To start off, I am a complete newbie in this. My background is in accountancy and I know VERY little about the world of programming. Except for a 2-month summer school in Istanbul learning C programming, which has become a language for academic purposes, I had zero experience in python.
Most of the learning was self-taught through online resources. So everything you see would be rather simple, basic and probably full of mistakes. Calling out to all experts out there, do share your thoughts and comments about anything under the hood of python. Whether it is an error or a better way to code, I would greatly appreciate your wisdom and guidance in this.
In this Python for Finance series, I want to consolidate my learning and apply it in the real-world context. At the same time, it is also to reinforce my learning progress as the best way to learn is to teach. I wouldn’t dare to make the bold statement as far as to teach, but it is more like ‘share’.
We will see how far this thing goes after 3 months, 6 months, 2 years but for now, let’s see what I have learnt in 2020. Or more accurately, what I have learnt in the first 2 weeks of January 2020. This is the first post of a thousand miles. Let’s get started!
Getting Started in Python
I downloaded Anaconda which includes the python software, main libraries and Jupyter notebook. This Jupyter notebook is where you write all your python codes and run the outputs. It is intuitive and simple to use. Here is an example of how it looks. If you are interested, you can follow the code and it should return the same output.
Pulling Stock Data from Yahoo Finance
Alright, here is where the fun begins. After you open up the Jupyter, the first step is to import all the modules and libraries in python. Some of the more common ones are pandas, numpy and matplotlib. Most importantly, we need the DataReader from pandas. This helps us to read stock data from API sources and put it into a data frame format. The code can be seen on the first input.
The second step is to type in the ticker symbol of the banks and put them inside a variable which we will call
tickers. The ticker symbol for UOB, OCBC and DBS are “U11”, “O39” and “D05” respectively.
The objective is to grab the stock’s adjusted closing price from the year 2005 until today and store it inside a data frame which we will call
mydata. The adjusted closing price is used so that we can account for stock splits and dividends. The data source we will be using to pull these data is Yahoo Finance.
Overview of Stock Data
Let’s see what happens if we type
mydata and run it.
It automatically filters out the first 5 rows and the last 5 rows of data. You can verify the results by comparing the last traded price of the 3 banks with the last row’s adjusted closing price. It should give you the exact same match.
Next, we can also obtain a quick summary by typing
You can see that there are 3776 data points from the year 2005 till 10th January 2020. We can also see the mean price, min, max and etc. But this is over a time period of 15 years.
Visualising the Returns in a Chart
What if we want to compare the returns between the 3 banks? Which bank is the most profitable to invest? One simple way could be to just plot the prices of the 3 banks from 2005 to 2020. This is how it would look.
On first glance, it looks like OCBC is the laggard since the price is the lowest. However, that is an inaccurate way of looking at it since all of them started out trading at different prices.
In 2005, UOB was trading at an adjusted closing price of $8.17, OCBC at $3.73 and DBS at $9.02. That’s why the axis at 0 shows different starting points. It is not a fair comparison. We are just looking at the absolute figures but not the relative returns.
To adjust for that, we have to first normalise the data. One simple trick would be to take the daily adjusted closing price divided by the initial price on row 1. This ensures that the first data point would always be 1.
Let’s see how the edited version looks like.
The formula is essentially saying take every daily closing price divided by the price on 2005-1-1. Then plot it all out. The price in row 1 is
mydata.iloc. This ensures that we are comparing apple to apple as all of them started out at the same point.
You can observe the difference when x-axis = 0. The earlier version started out at different prices while the adjusted version was normalised to 100.
Now the tides have changed. UOB became the highest followed by OCBC and lastly DBS. Note that we are comparing the price % change since the year 2005.
Here is another way of interpreting the chart. UOB’s adjusted closing price since 2005 till date has gone up by 325%, OCBC has gone up by 290% and DBS has gone up by about 285%.
Measuring Annual Log Returns in %
What if we want to calculate the annual returns of DBS, OCBC and UOB? I have decided to use log returns instead of simple returns as it would be a better measure. This article provides a good explanation for it, but pretty complex and mathematical.
To calculate the annual log returns, we can calculate the daily average log return and multiply it by the number of trading days which is 252. This daily log return is simply the price today divided by the price yesterday.
How do we do that? If you look at the above formula, I used
mydata divided by
.shift(1) is a function that shifts down the entire data set 1 row down.
mydata / mydata.shift(1) simply means taking the current Price divided by the previous day’s price. That is how python returns the daily % price change from 2005 to 2020.
To find the log return, simply use
np.log() where np is the numpy function which we have imported earlier in the first step. After which, all the daily log-returns of the 3 banks are stored under a newly created variable called
Next, we have to find the average daily log return of the 3 banks. This can be done by using the function
mydata_log.mean() Finally, the annualised log-returns are calculated by multiplying the daily mean with 252 trading days.
After rounding it off to 3 decimals, it can be seen that UOB has the highest annual log-returns followed by OCBC and lastly, DBS. This corresponds to the earlier chart which shows that UOB has the highest % price gain since the year 2005.
Measuring Annual Log Volatility in %
Next, we can also measure the volatility of the stock. This can be done by comparing the annualised standard deviation. Standard deviation is meant by how far do prices deviate away from the mean. Hence, the higher the SD, the more volatile a stock is.
Since we already have all the daily price % change stored inside the variable
mydata_log, we can find the standard deviation by simply typing
Similarly to the above calculations for annualised return, we also need to multiply the daily standard deviation by 252 trading days. However, since standard deviation is a square root of the variance, we should also square root 252 and that’s why you see (252 ** 0.5) in the input.
If we run the code, you can see that DBS has the highest volatility while OCBC is the least volatile. This volatility is meant by the daily fluctuation of % changes in prices. As discussed earlier, the higher the number, the more volatile it is.
Visualising the Volatility in a Chart
We know that DBS has the highest standard deviation and OCBC has the lowest. But humans are all visual beings. So let’s put this into a chart and see how it looks like.
I used a 200-day rolling average because the daily chart is too noisy. The
.rolling() function works like a moving average line. It smooths out the data and calculates something over n periods. It could be average, sum but in this case, we are using standard deviation. So it’s a 200-day standard deviation line.
You can see that OCBC is indeed less volatile as compared to DBS and UOB. When it rises, it didn’t rise as much. When it falls, it didn’t fall as much. The swings are less extreme.
Summary of Stock Data
Let’s put together the insights we have gathered into a nice data frame table. This is more like a personal newbie python code practice for me. Still trying to grasp the concepts and be more fluid in it.
I created a
summary dictionary and created the 3 columns keys as follows: ‘Banks’, ‘Annual % Returns’ and ‘Annual % Volatility’. Finally, use pandas
.DataFrame to put it into a table format and set ‘Banks’ to be the first column. Let’s see what happen when we run it.
Bingo! A clean summary of the annual log return and volatility over the past 15 years.
Alright, that is pretty much some of the things I have learnt so far. It’s not much, just some basic returns and standard deviations calculations, which can be done with excel. But using python is much more elegant, powerful and efficient. That is of course if you know how to use it. But if you are like me who is struggling to learn the language, then we are on the same boat.