Building Stockprice Database 2 (cronjob)

In the previous post, I’ve talked about scraping the Korea Exchange website to create a database of historical stockprices. In this post, I will discuss setting up cron_job so that the stockprice database is updated daily with the newest stock price data.

In setting up the webscraping code for daily execution, a few things change.

  1. The iteration across tradedates are no longer necessary
  2. Printing to kernel is not necessary, since no one is going to observe it

Instead of (1), the code needs to automatically fill in today’s date as tradedate. Instead of (2), the code needs to keep a log of activities so that I can check whether the code has been running without issues. One solution to logging is to create a log file, which gets updated everytime the code runs. However, I am bad at routinely checking at the logs, so I prefer a more passive solution.

In this post, I will have the code prompt me through cellphone messages whether it has been sucessfully executed. There are a number of messenger services, which offer APIs. For example, my team, Alternative Investment Valuation Team, at KIS Pricing uses mattermost, which offers REST API that can be accessed through python requests package. However, mattermost requires a server setup, which can be cumbersome. Since this stockprice database is for personal use, I will use Telegram messenger.

In order to use telegram as notification channel, I first need to install Telegram app and create a telegram account. Since this part is self-explanatory, I will elaborate more on this process. I use iPhone Xs, and screen captures are based on iOS. The instructions for android phones shouldn’t be too different.

Now, a bot needs to be created. A bot is an api controlled account, and can be created through interaction with botfather.

  1. Start conversation with @botfather
  2. Press start
  3. Submit /newbot
  4. Create name and username for the bot. The bot username has to end with bot.
  5. Save the API key.

After the bot has been created, a conversation with bot needs to be opened. Similar to searching for @botfather, the username of the bot can be searched for to find the bot.

Once a conversation is started, a conversation id is created, which can be accessed via the following link: https://api.telegram.org/bot(bot API key)/getUpdates. Note that the actual API key should be used in place of (bot API key).

The link should return a json response with chat information.

{"ok":true,"result":[{"update_id":552201747, "message":{"message_id":1,"from":{"id":869418718,"is_bot":false,"first_name":"Yeon Sik","last_name":"Cho","language_code":"en"},"chat":{"id":869418718,"first_name":"Yeon Sik","last_name":"Cho","type":"private"},"date":1581923361,"text":"/start","entities":[{"offset":0,"length":6,"type":"bot_command"}]}}]}

From the json data, I can find that the conversation id between e_cho_bot and me is 869418718. With the API Key and conversation id, I can move on. Python has telegram package, which can be installed through pip install python-telegram-bot. The package has functions to control bots in Telegram. Below is a code snippet to submit a simple message to my conversation with e_cho_bot.

import telegram

my_token = '(bot API key)' #actual api key should be used here
chat_id = 869418718

bot = telegram.Bot(token = my_token)
bot.sendMessage(chat_id = chat_id, text = "testing")

The code for daily stock price update is as follows.

import requests
import json
import time
import datetime
import MySQLdb as sql
import telegram

my_token = '(bot API key)' #actual api key should be used here
chat_id = 869418718

bot = telegram.Bot(token = my_token)
bot.sendMessage(chat_id = chat_id, text = "Stock Price Collection Started")

try:
    headers = {
        'Connection': 'keep-alive',
        'Accept': 'application/json, text/javascript, */*; q=0.01',
        'X-Requested-With': 'XMLHttpRequest',
        'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/80.0.3987.100 Safari/537.36',
        'Content-Type': 'application/x-www-form-urlencoded; charset=UTF-8',
        'Origin': 'http://marketdata.krx.co.kr',
        'Referer': 'http://marketdata.krx.co.kr/mdi',
        'Accept-Language': 'ko-KR,ko;q=0.9,en-US;q=0.8,en;q=0.7',
    }

    data = {
      'market_gubun': 'KSQ',
      'sect_tp_cd': 'ALL',
      'schdate': '20200214',
      'pagePath': '/contents/MKD/13/1302/13020101/MKD13020101.jsp',
      'code': 'OTP code here',
      'curPage':5
    }
    #Connect to MySQL database
    con = sql.connect(host="eric-cho.com", port=3308, database="echo_blog", user="user id", password="user pw", charset="utf8")
    cursor = con.cursor() 

    trade_date = datetime.datetime.today() #Set today's date as tradedate
    data['schdate'] = trade_date.strftime("%Y%m%d")
    for exchange in ['STK','KSQ']: #iterate through two exchanges
        temp_dt = []
        data['market_gubun'] = exchange   
        for i in range(1, 30):
            data['curPage'] = i
            response = requests.post('http://marketdata.krx.co.kr/contents/MKD/99/MKD99000001.jspx', headers=headers, data=data)
            dt = json.loads(response.content)['시가총액 상하위']
            temp_dt+=dt
            if len(dt)==0: #when an exchange is done, send summary to via telegram
                n_items = len(set([td['isu_cd'][:5] for td in temp_dt]))
                bot.sendMessage(chat_id = chat_id, text = f"{exchange} page {i-1} is last. {n_items} items")
                break
            for d in dt:
                cursor.execute("insert into stockprices values (%s,%s,%s,%s,%s,%s,%s,%s)",[d['isu_cd'],trade_date, d['opnprc'].replace(",",""),d['hgprc'].replace(",",""), d['lwprc'].replace(",",""), d['isu_cur_pr'].replace(",",""),d['lst_stk_vl'].replace(",",""), exchange])
            con.commit()
            time.sleep(2)
    con.close()
except Exception as e: #Submit error message if there has been an error
    con.close()
    bot.sendMessage(chat_id = chat_id, text = str(e))

bot.sendMessage(chat_id = chat_id, text = "Stock Price Collection Ended")

Now that the python script is complete, the last job is to setup scheduling service to run the script daily. My server set up is linux Centos 7. While a lot of server tasks can be done in Windows as well, Windows update tend to interfere rather frequently, and I’ve found linux to be more reliable as a server. One can use task scheduler to set up scheduling in windows.

I’ve placed my python script in /opt/stockprice_update.py. Running crontab -e will open an editor where I can schedule tasks. I want to updater to run Monday - Friday at 4PM (Stock market closes at 3:30, but KRX website posts 20 minute delayed prices). The command for this is 0 16 * * 1-5 python3.6 /opt/stockprice_update.py. The command for running the script can be different depending on where one placed the script and how they labelled python3.

crontab -l can show that the scheduler has been setup.

Tags: python  cron_job  linux 

Discussion and feedback