End-to-end automated trading bot
- Published on
- Duration
- 5 months (ongoing)
- Role
- Data Scientist, Engineer, Analyst & Web Developer
This project uses an algorithmic approach to trading designed to simplify the process of investing in the stock market. Through the use of a trading bot, the user can automate the process of buying and selling stocks based on a set of rules.
How it works
- The user registers for an account with Alpaca, a commission-free trading API.
- The user submits their API key, secret key and number of days to trade, which are stored in a secure database.
- Every working day at 9:30 a.m. Eastern Time (the US market open), the bot runs and checks for buy and sell signals.
- Trades are then placed based on the signals.
- The number of days remaining is then updated in the database.
- Accounts with 0 days remaining are deleted from the database.
Link to the web application: Tradebotix
Link for PDF guide: Tradebotix Guide
User Research
I built the survey with Typeform, which made it easy to create a visually appealing form that is quick to fill out. For respondents who opted in, a Zapier integration automatically sent them a PDF guide to the web application on my behalf.
Lastly, the responses were appended automatically to a spreadsheet, which I then used for analysis.
The survey was sent to about 30 people, and these were the questions:
- Have you made investments in the following instruments / products before?
- What are some of the reasons you might have for not entering positions in the stock market?
- What is your trading experience?
- You have used the following trading strategies before deciding what to buy:
- Have you used the following platforms before?
- How familiar are you with trading bots?
- What are your views on trading bots?
To keep the study anonymous, no personally identifiable information was collected.
Preview the questions here: Survey
User Research Findings
Based on the small sample size, I was able to gather the following insights:
- The majority of respondents have invested in insurance products, which are generally considered safe, fuss-free and low-risk.
- More than half responded that the stock market carries too much risk and that they do not have the time to monitor it.
As the stock market ranked third among the most popular investment instruments, I believe there is a market for a trading bot that simplifies investing in the stock market and addresses the following concerns:
- Too much time and effort
- Steep learning curve
60% of respondents also opted in to beta test the web application, and I will use their feedback to improve the product.
Setting Up the Web Application
The task can be divided into 3 parts:
- Developing the backend bot
- Developing the frontend web application
- Setting up the database
Backend Bot
Candidate algorithms were put through extensive testing to find one robust enough to handle different market conditions. The algorithms tested include:
Classical technical analysis indicators:
- Crossover with Simple Moving Average (SMA), Exponential Moving Average (EMA), Relative Strength Index (RSI).
- Momentum with Rate-of-Change (ROC), Relative Strength Index (RSI) and Stochastic Oscillator.
- Mean Reversion with Relative Strength Index (RSI) and Bollinger Bands.
Machine learning algorithms:
- Linear Regression with L1 and L2 regularization
- Random Forest with multiple trees
- Gradient Boosting models (XGBoost, LightGBM, CatBoost)
- Support Vector Machine (SVM) with Linear, Polynomial and Radial Basis Function (RBF) kernels
- Time-Series Models (ARIMA, ETS, Prophet)
Deep learning algorithms (with multiple layers):
- Long Short-Term Memory (LSTM)
- Convolutional Neural Network (CNN)
- Recurrent Neural Network (RNN)
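To make the classical indicators listed above concrete, here is a minimal pandas sketch of how they can be computed from a series of closing prices. The function names, window lengths and the rolling-mean form of RSI (rather than Wilder's smoothing) are assumptions for illustration, not necessarily what the bot uses.

```python
import pandas as pd


def sma(close: pd.Series, window: int) -> pd.Series:
    """Simple moving average over `window` periods."""
    return close.rolling(window).mean()


def rate_of_change(close: pd.Series, window: int) -> pd.Series:
    """Percentage change versus the price `window` periods ago."""
    return close.pct_change(periods=window) * 100


def rsi(close: pd.Series, window: int = 14) -> pd.Series:
    """RSI from plain rolling means of gains and losses (not Wilder's smoothing)."""
    delta = close.diff()
    avg_gain = delta.clip(lower=0).rolling(window).mean()
    avg_loss = delta.clip(upper=0).abs().rolling(window).mean()
    rs = avg_gain / avg_loss  # becomes inf when there were no losses in the window
    return 100 - 100 / (1 + rs)


def bollinger_bands(close: pd.Series, window: int = 20, num_std: float = 2.0) -> pd.DataFrame:
    """Rolling mean with bands `num_std` rolling standard deviations above and below."""
    mid = close.rolling(window).mean()
    std = close.rolling(window).std()
    return pd.DataFrame({"lower": mid - num_std * std, "mid": mid, "upper": mid + num_std * std})


# Synthetic, steadily rising prices (illustrative data only)
close = pd.Series(range(1, 41), dtype=float)
bands = bollinger_bands(close)
```

A mean-reversion rule would typically look for closes outside the bands or extreme RSI readings, while a momentum rule would look at the sign and size of the rate of change.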
Testing Methodology
As there are countless combinations of algorithms that can be used, a systematic approach was used to test the algorithms.
Step 1: Number of Tickers: 20 chosen at random from the S&P 500.
Step 2: Timeframe: 1 week, 1 month, 3 months, 6 months, 1 year, 5 years.
Step 3: Algorithm: 3 for each category (technical analysis, machine learning, deep learning) with 3 variants each.
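The three steps above amount to a test grid. The sketch below shows one way such a grid could be enumerated. The placeholder ticker universe, the random seed and the specific algorithm names picked per category are assumptions for illustration only.

```python
import random
from itertools import product

# Placeholder universe standing in for the S&P 500 constituents (hypothetical names)
universe = [f"TICKER_{i:03d}" for i in range(500)]

random.seed(42)  # fixed seed so the random draw is reproducible
tickers = random.sample(universe, 20)                 # Step 1: 20 random tickers
timeframes = ["1w", "1mo", "3mo", "6mo", "1y", "5y"]  # Step 2: six lookback windows

# Step 3: three algorithms per category, three variants each (names are illustrative)
algorithms = {
    "technical analysis": ["crossover", "momentum", "mean reversion"],
    "machine learning": ["linear regression", "random forest", "gradient boosting"],
    "deep learning": ["LSTM", "CNN", "RNN"],
}
variants = [1, 2, 3]
configs = [
    (category, algorithm, variant)
    for category, names in algorithms.items()
    for algorithm in names
    for variant in variants
]

# Every (ticker, timeframe, configuration) combination is one backtest run
runs = list(product(tickers, timeframes, configs))
print(len(configs), "configurations,", len(runs), "backtest runs")
```

Under these assumptions the grid has 27 configurations and 20 x 6 x 27 = 3,240 backtest runs.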
Backtesting
Using an Exponential Moving Average (EMA) crossover strategy as an example, a class is initialized with the following parameters:
- Ticker Symbol
- Short EMA
- Long EMA
- Start and End Date
- The first part of the code downloads the data from Yahoo Finance and stores it in a Pandas DataFrame.
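import datetime
import matplotlib.pyplot as plt
import numpy as np
import pyfolio as pf
import yfinance as yf

class CrossoverEMA():
    def __init__(self, symbol, EMA_S, EMA_L, start, end=None):
        self.symbol = symbol
        self.EMA_S = EMA_S
        self.EMA_L = EMA_L
        self.start = start
        # resolve the default end date at call time, not once when the class is defined
        self.end = end if end is not None else datetime.datetime.now().date()
        self.get_data()

    def get_data(self):
        # extract data
        data = yf.download(self.symbol, start=self.start, end=self.end).loc[:, "Close"].to_frame()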
- The next part of the code calculates the exponential moving averages (EMA) for the short and long periods.
        # create technical indicators (Exponential Moving Average)
        data["EMA_S"] = data["Close"].ewm(span=self.EMA_S).mean()
        data["EMA_L"] = data["Close"].ewm(span=self.EMA_L).mean()
        # create signals
        data["EMA_S_greater_than_EMA_L"] = data["EMA_S"] > data["EMA_L"]
        data["EMA_S_less_than_EMA_L"] = data["EMA_S"] < data["EMA_L"]
        # Crossover: True only on the first day a condition becomes True
        data["EMA_S_greater_than_EMA_L_CO"] = np.where(
            data["EMA_S_greater_than_EMA_L"] == False, False,
            data["EMA_S_greater_than_EMA_L"].ne(data["EMA_S_greater_than_EMA_L"].shift())
        )
        data["EMA_S_less_than_EMA_L_CO"] = np.where(
            data["EMA_S_less_than_EMA_L"] == False, False,
            data["EMA_S_less_than_EMA_L"].ne(data["EMA_S_less_than_EMA_L"].shift())
        )
- Next, the buy and sell signals are generated based on the crossover strategy.
        # Buy and Sell signals
        buysignals = data[data["EMA_S_greater_than_EMA_L_CO"] == True]
        sellsignals = data[data["EMA_S_less_than_EMA_L_CO"] == True]
        # position: 1 (long) while the short EMA is above the long EMA, else 0 (flat)
        data["position"] = np.where(data["EMA_S"] > data["EMA_L"], 1, 0)
        # calculate buy-and-hold daily log returns
        data["buy_and_hold"] = np.log(data["Close"] / data["Close"].shift(1))
        # calculate strategy returns: yesterday's position earns today's return
        data["strategy"] = data["position"].shift(1) * data["buy_and_hold"]
        # drop NA
        data.dropna(inplace=True)
        # set date as index
        data.reset_index(inplace=True)
        data.set_index('Date', inplace=True)
        # assign to self
        self.data = data
        self.buysignals = buysignals
        self.sellsignals = sellsignals
        return data, buysignals, sellsignals
- Finally, the returns are calculated and plotted on a chart.
    def performance_summary(self):
        # total log return of each approach
        print("Sum of returns:")
        print(self.data[["buy_and_hold", "strategy"]].sum(), "\n")
        print("#" * 50)
        # calculate what $1 would be worth (exp of cumulative log returns)
        print("What $1 would be worth:")
        print(self.data[["buy_and_hold", "strategy"]].cumsum().apply(np.exp), "\n")
        print("#" * 50)
        # calculate performance metrics
        print("Performance metrics:")
        pf.show_perf_stats(self.data["strategy"])
        print("#" * 50)
        # Plot graph
        self.data[["buy_and_hold", "strategy"]].cumsum().apply(np.exp).plot(figsize=(10, 8))
        plt.legend(loc="upper left")
        plt.title(f"{self.symbol} Crossover with EMA {self.EMA_S} and EMA {self.EMA_L}")
        plt.ylabel("Value of $1 invested (USD)")
        plt.xlabel("Date")
        # plot buy and sell signals
        self.data[["Close"]].plot(figsize=(10, 8), color="gray", zorder=1)
        plt.title(f"Buy and Sell signals of {self.symbol}")
        plt.ylabel("Price (USD)")
        plt.xlabel("Date")
        plt.scatter(self.buysignals.index, self.buysignals["Close"], marker="^", color="green", label="Buy")
        plt.scatter(self.sellsignals.index, self.sellsignals["Close"], marker="v", color="red", label="Sell")
        plt.legend()
        plt.show()
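The same crossover logic can be tried without a network connection. The sketch below is a compact variant of the class above, run on a synthetic random-walk price series instead of Yahoo Finance data. The seed, the date range and the 10/30 EMA spans are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic closing prices (assumption): a random walk on business days
rng = np.random.default_rng(seed=0)
dates = pd.bdate_range("2022-01-03", periods=250)
close = pd.Series(
    100 * np.exp(np.cumsum(rng.normal(0, 0.01, len(dates)))),
    index=dates,
    name="Close",
)

data = close.to_frame()
data["EMA_S"] = data["Close"].ewm(span=10).mean()
data["EMA_L"] = data["Close"].ewm(span=30).mean()

# 1 while the short EMA is above the long EMA, otherwise 0 (flat)
data["position"] = (data["EMA_S"] > data["EMA_L"]).astype(int)

# A crossover is a change in position: +1 marks a buy, -1 marks a sell
change = data["position"].diff()
buy_dates = data.index[change == 1]
sell_dates = data.index[change == -1]

# Log returns add up over time, so exp(cumulative sum) is the value of $1 invested
data["buy_and_hold"] = np.log(data["Close"] / data["Close"].shift(1))
data["strategy"] = data["position"].shift(1) * data["buy_and_hold"]
growth = data[["buy_and_hold", "strategy"]].cumsum().apply(np.exp)

print(f"{len(buy_dates)} buy signals, {len(sell_dates)} sell signals")
print(growth.iloc[-1])
```

Because the position is shifted by one day before it is multiplied by the return, a signal observed at today's close only earns returns from the next day onward, which avoids look-ahead bias in the backtest.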
Frontend
The frontend was built using Streamlit, with the following structure:
In the application, the user can do the following:
- Submit their API/Secret keys and number of days to trade, which are then securely stored in a database.
- Find out more about the project and the tickers available for trading (the S&P 500 constituents). Users can also read each company's business summary and historical performance.
- View their portfolio's overall performance, as well as the performance of each ticker.
- Lastly, there is also a contact form for users to get in touch with me, using formsubmit.co.
SQL Database
The database was built using MySQL and hosted on Kamatera, a cloud hosting provider.
The database is used to store the following information:
api_key | secret_key | days_to_trade |
---|---|---|
api_key_1 | secret_key_1 | 2 |
api_key_2 | secret_key_2 | Cancel |
api_key_3 | secret_key_3 | Indefinite |
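
As a rough illustration of how this table behaves from day to day, the sketch below applies the update rules listed under Run Bot (drop "Cancel", count down numeric values, flag 0 as "Cancel"). It uses an in-memory SQLite database as a stand-in for the hosted MySQL instance. The table name `users`, the text column type and the sample rows are assumptions.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for the hosted MySQL database
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE users ("
    "api_key TEXT PRIMARY KEY, secret_key TEXT NOT NULL, days_to_trade TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        ("api_key_1", "secret_key_1", "2"),
        ("api_key_2", "secret_key_2", "Cancel"),
        ("api_key_3", "secret_key_3", "Indefinite"),
        ("api_key_4", "secret_key_4", "1"),
    ],
)


def daily_update(connection: sqlite3.Connection) -> None:
    """Apply the three daily rules in order, then commit."""
    # 1. Remove users flagged for cancellation on a previous run
    connection.execute("DELETE FROM users WHERE days_to_trade = 'Cancel'")
    # 2. Count down every numeric value; 'Indefinite' is left untouched
    connection.execute(
        "UPDATE users "
        "SET days_to_trade = CAST(CAST(days_to_trade AS INTEGER) - 1 AS TEXT) "
        "WHERE days_to_trade NOT IN ('Cancel', 'Indefinite')"
    )
    # 3. Flag users who have just run out of days
    connection.execute("UPDATE users SET days_to_trade = 'Cancel' WHERE days_to_trade = '0'")
    connection.commit()


daily_update(conn)
rows = dict(conn.execute("SELECT api_key, days_to_trade FROM users").fetchall())
print(rows)
```

Storing `days_to_trade` as text keeps the numeric countdown and the two special values in one column. A nullable integer plus a separate status column would be a stricter alternative.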
Scheduling of the Bot
The bot is scheduled to run every working day at 9:30 a.m. Eastern Time using PythonAnywhere. To save on server hosting costs, the server's operating hours are limited to 1000 - 2200 GMT+8, and 3 scripts were introduced:
- Start Server (1000 hrs GMT+8)
- Run Bot (0930 hrs Eastern Time, which is 2130 hrs GMT+8 during US daylight saving time)
- Stop Server (2200 hrs GMT+8)
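Because US Eastern Time shifts between standard and daylight saving time while GMT+8 does not, the market open lands at a different GMT+8 clock time depending on the season. The sketch below checks the conversion with Python's standard `zoneinfo` module. `Asia/Singapore` is an assumption standing in for any fixed GMT+8 zone, and the two dates are arbitrary examples.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

NEW_YORK = ZoneInfo("America/New_York")
SERVER_TZ = ZoneInfo("Asia/Singapore")  # assumption: any fixed GMT+8 zone works here


def market_open_in_server_time(year: int, month: int, day: int) -> datetime:
    """Return the 9:30 a.m. New York market open expressed in GMT+8."""
    open_ny = datetime(year, month, day, 9, 30, tzinfo=NEW_YORK)
    return open_ny.astimezone(SERVER_TZ)


summer = market_open_in_server_time(2023, 7, 3)  # US daylight saving time in effect
winter = market_open_in_server_time(2023, 1, 3)  # US standard time in effect

print(summer.strftime("%H:%M"))  # 21:30
print(winter.strftime("%H:%M"))  # 22:30
```

In winter the open falls at 22:30 GMT+8, so the schedule may need to shift by an hour for part of the year to stay inside the server's operating window.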
Start Server
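from libcloud.compute.types import Provider
from libcloud.compute.providers import get_driver

# authenticate with the Kamatera API (replace the placeholder credentials)
cls = get_driver(Provider.KAMATERA)
driver = cls('api_key', 'secret_key')
# find the server by name, then power it on
nodes = [n for n in driver.list_nodes() if n.name == 'server-name']
driver.start_node(nodes[0])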
Run Bot
- Import necessary libraries
- Get the NYSE trading calendar
- Check if today is a trading day
  - If yes:
    - Connect to the database
    - Get the API/Secret keys and number of days to trade
    - Drop users that have "Cancel" as their number of days to trade
    - Decrement each user's days to trade by 1, except for users with "Indefinite"
    - Set a user's days to trade to "Cancel" if the number of days to trade is 0
    - Update database
    - Run the bot
      - Get the list of S&P 500 tickers and their closing prices
      - Calculate EMA
      - Create crossover signals
      - Create buy and sell signals
      - For each user in the database:
        - Connect to Alpaca API
        - If the account is blocked or does not have funds:
          - Do nothing
        - Else for each ticker in the buy list:
          - Check if the ticker is in the user's portfolio
            - If yes:
              - Do nothing
            - Else:
              - Place a buy order
        - For each ticker in the sell list:
          - Check if the ticker is in the user's portfolio
            - If yes:
              - Place a sell order
            - Else:
              - Do nothing
        - If user's API key is invalid (e.g., user regenerated a new key after submitting the form):
          - Delete user from database
  - If no:
    - Do nothing
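The per-user decision logic in the outline above can be sketched as two small pure functions, kept separate from any broker or database calls. The function names, the holiday set and the sample tickers are hypothetical. In the real bot the holdings would come from the Alpaca API and the holidays from the NYSE trading calendar.

```python
from datetime import date


def is_trading_day(day: date, holidays: set[date]) -> bool:
    """A trading day is a weekday that is not a market holiday."""
    return day.weekday() < 5 and day not in holidays


def decide_orders(buy_list: list[str], sell_list: list[str], holdings: set[str]):
    """Buy only what is not yet held; sell only what is currently held."""
    to_buy = [ticker for ticker in buy_list if ticker not in holdings]
    to_sell = [ticker for ticker in sell_list if ticker in holdings]
    return to_buy, to_sell


# Illustrative inputs (hypothetical signals and portfolio)
holidays = {date(2023, 7, 4)}
today = date(2023, 7, 5)

if is_trading_day(today, holidays):
    to_buy, to_sell = decide_orders(
        buy_list=["AAPL", "MSFT"],
        sell_list=["TSLA", "NVDA"],
        holdings={"AAPL", "TSLA"},
    )
    print("buy:", to_buy, "sell:", to_sell)
```

Keeping these decisions free of side effects makes them easy to unit test before any real order is placed.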
Stop Server
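from libcloud.compute.types import Provider
from libcloud.compute.providers import get_driver

# authenticate with the Kamatera API (replace the placeholder credentials)
cls = get_driver(Provider.KAMATERA)
driver = cls('api_key', 'secret_key')
# find the server by name, then power it off
nodes = [n for n in driver.list_nodes() if n.name == 'server-name']
driver.stop_node(nodes[0])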
Learning Points
This project was a great learning experience for me as I was able to learn more about the following:
- How to use the Alpaca API to place trades
- How to use Streamlit to build a web application that interacts with the backend
- Technical analysis indicators such as the Relative Strength Index (RSI), Exponential Moving Average (EMA), Simple Moving Average (SMA), Rate-of-Change (ROC), Bollinger Bands and Stochastic Oscillator
- Database management on the cloud using MySQL
I also gained proficiency in skills I had already learnt, such as:
- Python, specifically Data Analysis, Modeling & Code Optimization
- SQL
It was a complete end-to-end project that gave me the opportunity to develop interdisciplinary skills across multiple domains and to practise tackling complex problems from different perspectives.
For improvement, I would like to explore the following:
- More complex trading strategies
- More complex machine learning and deep learning algorithms
- Collect more user feedback and improve the web application
Thanks for reading!