The COVID-19 outbreak and effects on major stock market indices across the globe: A machine learning approach

Background/Objectives: Coronavirus disease 2019 (COVID-19) was first seen in Wuhan (capital of Hubei, China) and has since spread throughout the world, leading the World Health Organisation (WHO) to declare the 2019-20 coronavirus outbreak a pandemic. Its ongoing spread has brought the world’s economy to a standstill, with total lockdowns in some countries and the closure of businesses and firms globally. As of 25th Mar 2020, 11:00 am GMT, a total of 491,280 cases had been reported globally. The evolution of COVID-19 and its economic impact on regional financial markets are highly uncertain, which makes it difficult for legislators to formulate a suitable macroeconomic policy response. Methods: To ascertain the possible global economic impact of COVID-19, this study examines the effect of COVID-19 on major stock indices across the globe. Using a random sampling technique, we selected thirty (30) world stock market indices in different countries affected by COVID-19 from 31st Dec 2019 to 25th Mar 2020. Unlike most previous studies, which focus on regional stock markets, we examine daily reported COVID-19 cases and stock market fluctuations across thirty stock market indices that cover the stock prices of several countries around the globe. We also estimate the monetary loss within the period, project the future effect of the pandemic on the stock market and outline some portfolio allocation strategies to help investors hedge against investment risk. Findings: The experimental results show that even a controlled outbreak of COVID-19 can significantly influence the world’s economy in both the short term and the long term. The COVID-19-associated economic loss reflected in stock price movements suggests that the cost can escalate severely and quickly into global economic stress. Hence, we conclude by outlining some measures that might help investors hedge against such risk using portfolio allocation.


Introduction
Table 1. Summary of related studies
Ref. | Period | Region / stock market index | Objective
(38) | 10th Jan to 16th Mar 2020 | China | Investigates the effect of COVID-19 on the Chinese stock market.
(39) | 10th Mar to 30th Apr 2020 | Emerging markets in Asia and Europe | Examines the impact of the novel virus on developing markets.
(40) | 11th Apr 2019 to 9th Apr 2020 | MSCI World Index, S&P 500, FTSE 100, FTSE MIB, IBEX and CSI 300 indices | Examines the impact of COVID-19 on the market value of cryptocurrencies.
(34) | Jan [...] | [...] | Provides commentary on how stock prices reacted in real time to different stages in COVID-19's evolution.
(41) | March 2020 | S&P 1500 | Investigates the economic effect of COVID-19 on the US stock market.
(42) | 31st Dec 2019 to 20th Mar 2020 | China and G7 countries | Examines the economic constraints in China and the G7 countries during the COVID-19 period.
(37) | January to April 2020 | 75 countries | Demonstrates the reaction of investors to different COVID-19 data announcements.
(43) | 1st Jan 2020 to 30th Mar 2020 | 23 countries | Evaluates the significance of health-news trends during COVID-19 for the predictability of stock returns.
[...] | [...] | [...] | Investigates the impact of COVID-19 on oil-stock prices.
(44) | 2nd Jan 1900 to 24th Mar 2020 | US stock markets | Examines the effect of news related to infectious disease (Spanish Flu, COVID-19) on the stock market.
(35) | 22nd Jan 2020 to 17th Apr 2020 | 64 countries | Scrutinises the stock markets' response to the COVID-19 pandemic.
(45) | 3rd Feb 2020 to 17th Apr 2020 | DJIA, FTSE 100, DAX, CAC 40, IGBM and MIB | Investigates the stock market's reaction to COVID-19 news in the six most affected countries.
(46) | 1st Jan 2020 to 31st Mar 2020 | [...] | Studies the influence of COVID-19 on the structure and degree of risk-return dependence in the US.
(47) | Up to 27th Mar 2020 | US, Italy, Mainland China, Spain, Germany, France, United Kingdom, Switzerland, South Korea, Netherlands, Japan and Singapore | Maps the overall patterns of country-specific risks and total risks in the world financial markets.
(48) | 12th Feb 2020 to 2nd Apr 2020 | [...] | [...]

Moreover, as pointed out in (32), the novel COVID-19 might be as transmissible economically as it is medically. Hence, this study attempts to fill the gap in the literature by examining the extent to which the COVID-19 outbreak is affecting stock markets worldwide. We therefore hypothesise that, following the anxiety, fear and panic of the initial stage of the COVID-19 pandemic, the stock market will move downwards in line with the "Model-of-Herd-Behaviour" before recovering to its regular price level in line with the "return-to-the-central" theory (21,49), as investors gain better information and regularise their view of the actual effect of the COVID-19 outbreak.
The remaining sections of this study are organised as follows: Section 2 presents the methods and techniques used in this study. In Section 3, we discuss the results of the study and their implications for the global economy. Section 4 presents the conclusions arising from the study.

Methodology
This section presents the methods and tools adopted for the study. Figure 1 shows the dataflow diagram of the study, which is organised into three (3) phases. Phase 1 covers the download and integration of the study datasets. In phase 2, we perform data pre-processing, data visualisation and statistical analysis. Finally, in phase 3, we apply a machine learning algorithm to predict future stock behaviour. The actions taken under each phase are explained in detail in the subsequent sections.

Study Data
Publicly available data on confirmed COVID-19 cases around the globe as of 25th Mar 2020, 9:49 pm GMT were downloaded from https://www.worldometers.info/coronavirus/coronavirus-cases. The dataset contains details of reported COVID-19 cases from all affected countries, with features such as new-cases, new-deaths, total-cases and total-deaths. The stock market indices dataset was downloaded from Yahoo Finance (https://finance.yahoo.com/world-indices). A total of 30 out of the 35 listed world stock market indices were downloaded (from March 2019 to 25th Mar 2020), capturing the open-price (OP), closing-price (CP) and volume (V). Details of the indices and individual company stocks used in this study are shown in Table A.1. We represent the stock data (S_D) on a date (d) as a vector of its quantitative features (OP, CP and V). The COVID-19 data on date (d) was likewise represented as a vector (v) of the features of the COVID-19 dataset. The two vectors were joined, with (d) as the reference key, to obtain the combined dataset (DS).
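The date-keyed integration described above can be sketched as follows. This is a minimal illustration only: the function name `combine_by_date`, the field names and all values are hypothetical, and an inner join (keeping only dates present in both sources) is an assumption rather than the authors' documented procedure.

```python
from datetime import date


def combine_by_date(stock_rows, covid_rows):
    """Inner-join two lists of per-day records on their 'date' field."""
    covid_by_date = {row["date"]: row for row in covid_rows}
    combined = []
    for stock in stock_rows:
        covid = covid_by_date.get(stock["date"])
        if covid is None:
            continue  # no COVID-19 record for this trading day
        merged = dict(stock)
        # Copy every COVID-19 feature except the shared key.
        merged.update({k: v for k, v in covid.items() if k != "date"})
        combined.append(merged)
    return combined


# Made-up illustrative values, not figures from the study.
stock_rows = [
    {"date": date(2020, 3, 23), "open": 100.0, "close": 98.0, "volume": 1200},
    {"date": date(2020, 3, 24), "open": 98.5, "close": 101.0, "volume": 1500},
]
covid_rows = [
    {"date": date(2020, 3, 24), "new_cases": 10, "new_deaths": 1,
     "total_cases": 50, "total_deaths": 3},
    {"date": date(2020, 3, 25), "new_cases": 12, "new_deaths": 2,
     "total_cases": 62, "total_deaths": 5},
]
combined = combine_by_date(stock_rows, covid_rows)
```

Only 24th Mar appears in both inputs here, so `combined` holds a single record carrying both the price fields and the case counts.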

Data Pre-processing and Visualization
The combined dataset (DS) was pre-processed in two steps: (i) data cleaning and (ii) data transformation. At the data-cleaning stage, we replaced missing values within a column (set of features) with the mean value of that column and removed outliers where needed. A copy of the clean dataset was saved for data visualisation and statistical analysis. To help the machine learning model perform well and to reduce computational time, the clean dataset was scaled to the range [0,1] using the min-max function defined in Eq. (1).
$$k'(x) = \frac{k(x) - k(x)_{\min}}{k(x)_{\max} - k(x)_{\min}} \tag{1}$$
where $k'(x)$ is the normalised value, $k(x)$ is the value to be normalised, and $k(x)_{\min}$ and $k(x)_{\max}$ are the minimum and maximum values of the dataset.
The ANOVA statistical technique was adopted to explore the total effect between the cross-basis transformed daily reported cases of COVID-19 and the opening price, closing price and trading volume of the 30 major stock market indices across the globe.
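The two pre-processing steps, mean imputation followed by min-max scaling (each value minus the column minimum, divided by the column range), can be sketched as below. The helper names and sample values are hypothetical, and mapping a constant column to zero is an assumption made here to avoid division by zero.

```python
def fill_missing_with_mean(column):
    """Replace None entries with the mean of the observed values."""
    observed = [v for v in column if v is not None]
    mean = sum(observed) / len(observed)
    return [mean if v is None else v for v in column]


def min_max_scale(column):
    """Scale values to [0, 1]: (value - min) / (max - min)."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # Assumption: a constant column carries no information, map it to 0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


# Made-up opening prices with one missing entry.
opening_prices = [100.0, None, 80.0, 120.0]
cleaned = fill_missing_with_mean(opening_prices)
scaled = min_max_scale(cleaned)
```

With these inputs the missing entry is filled with the mean of the three observed prices, and the smallest and largest prices map to 0 and 1 respectively.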

Machine learning model
Machine learning (ML) has extensive business applications across several fields, including fraud detection, inventory planning, recommendation engines, image recognition, voice assistants such as Amazon's Alexa, supply chains and financial market analysis. ML focuses on prediction and can make data analysis more effective by examining large amounts of data simultaneously. Several ML algorithms are applicable to this study; however, we adopted the decision tree (DT) algorithm because it is simple yet highly efficient, with fast training and testing times that result in low computational cost (2). The "DT is a flow-chart-like tree structure that uses a branching technique to clarify every single likely result of a decision" (2). Algorithm 1 explains the operations of the DT algorithm (2,50). The settings of our DT model were max_depth = 4 and criterion = entropy. The DT model was implemented in Python using the Scikit-learn library, which provides a ready-made DT implementation.
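Algorithm 1
1. Calculate the entropy of the collection S, as expressed in Eq. (2).
$$E(S) = -\sum_{i=1}^{m} p_i \log_2 p_i \tag{2}$$
where $E(S)$ = the entropy of a collection of DS, $m$ = the number of classes in the system and $p_i$ = the proportion of instances that belong to class $i$.
2. Calculate the information gain for an attribute K, in a collection S, as expressed in Eq. (3).
$$G(S, K) = E(S) - \sum_{u \in \mathrm{values}(K)} \frac{|S_u|}{|S|}\, E(S_u) \tag{3}$$
where $E(S)$ represents the entropy of the entire collection and $S_u$ = the set of instances that have value $u$ for attribute K.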
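The entropy and information-gain steps of Algorithm 1 can be illustrated with a small pure-Python sketch. It uses the standard definitions (entropy as the negative sum of class proportions times their base-2 logarithms, and gain as the parent entropy minus the size-weighted entropy of each attribute-value subset). The labels ("up"/"down") and attributes below are hypothetical and do not come from the study's data.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (base 2) of a collection of class labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(attribute_values, labels):
    """Entropy of the whole collection minus the weighted entropy of
    the subsets obtained by splitting on each attribute value."""
    n = len(labels)
    subsets = {}
    for value, label in zip(attribute_values, labels):
        subsets.setdefault(value, []).append(label)
    remainder = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - remainder


# Hypothetical toy data: direction of an index on four days.
labels = ["down", "down", "up", "up"]
cases_level = ["high", "high", "low", "low"]  # separates the classes fully
weekday = ["mon", "tue", "mon", "tue"]        # carries no information

gain_cases = information_gain(cases_level, labels)
gain_weekday = information_gain(weekday, labels)
```

A tree built on this criterion would split on `cases_level` first, since it yields the larger gain of the two attributes.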

Model Evaluation
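We adopted two (2) closeness evaluation metrics, namely root-mean-square error (RMSE) and mean absolute error (MAE), which ascertain the degree of closeness between the predicted value and the actual value (1).
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
where $\hat{y}_i$ = the predicted value, $y_i$ = the actual value and $n$ is the total number of testing data.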
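As a small worked example of the two metrics, the sketch below computes RMSE and MAE from their standard definitions on made-up scaled values; the function names and numbers are illustrative only.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root-mean-square error: square root of the mean squared difference."""
    n = len(actual)
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error: mean of the absolute differences."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


# Made-up scaled opening prices and predictions.
actual = [0.2, 0.4, 0.6, 0.8]
predicted = [0.2, 0.4, 0.6, 0.4]

rmse_value = rmse(actual, predicted)
mae_value = mae(actual, predicted)
```

Here a single miss of 0.4 over four points gives an MAE of about 0.1 and an RMSE of about 0.2. On any one set of predictions MAE can never exceed RMSE, which is a useful sanity check when reporting both.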

Results and discussions
This section presents the results of the study and discusses them.

Data Visualisation
We used Python to plot the datasets and gain a better understanding of them. [...] thus, cases being treated. From Figure 2, it can be said that among closed COVID-19 cases the recovery rate is approximately 84%, compared with a death rate of 16%. Furthermore, of the cases considered active, 95% were in a mild state, while 5% were in a critical condition that might lead to death. It is important to note that these statistics reflect our data download of 25th Mar 2020, 9:49 pm GMT; hence, these values may change at any time. Table A.1 shows the full names of the market indices used and how we abbreviated them for simplicity. Figs. A.1-A.3 show plots of opening price against closing price for the major world stock indices used in this study. The aim was to see the variation in the major stock market indices before and during the COVID-19 pandemic. It was observed that approximately 97% out of the 30 [...]

Stock Dataset
Price change within the COVID-19 period

Table 2 shows a summary of the fluctuation in market indices before COVID-19 and during its peak period around the world. We [...] (Table 2) shows that the economic impact of COVID-19 is already evident in the countries most affected by the COVID-19 outbreak. The decline in stock prices due to COVID-19 is larger than that reported by (54) for the years between 1918 and 1920 or 1921 following a similar flu virus outbreak, owing to its rate of infection and mortality compared with SARS (2002-2003) and Middle East respiratory syndrome (MERS; 2012-ongoing) (53). The dramatic decline in stock prices from 31st Dec to 25th Mar can be linked to behavioural economic studies, which point out that negative sentiment driven by nervousness and pessimism affects investment decisions and may in turn affect asset pricing (21,23,33). Thus, assuming COVID-19 outbreaks generate nervousness and pessimism, COVID-19 would also influence investment decisions and, later, asset prices. Figures A.4

Modelling of the dataset
Having observed the fluctuation in stock indices caused by the COVID-19 pandemic, we attempt to identify the relationship between each feature (new-cases, new-deaths, total-cases, total-deaths) of the COVID-19 dataset and the opening price of the market indices. We also predict the next-day opening price from the identified features using a regression DT. Figure 3 shows the feature ranking of the COVID-19 dataset. Features with importance values closer to one were taken as highly significant. Alongside the fall in market indices over the period under study (see Table 2), we observed that the total number of COVID-19 cases and the total number of deaths reported were the dominant factors contributing to the fall in the opening price of most market indices. However, it was further observed that the MOEX Russia Index (IMOEX.ME) responded more to total deaths than to total cases. We further observed that the market index that recorded significant measure to have an increase or stability on open price.
From the feature importance ranking, we selected the two topmost features (total cases and total deaths) as independent features (x) to predict the next-day opening price of a market index. The reported total cases and deaths of COVID-19 were significantly associated with the opening price of most indices (ANOVA F-test average p-value $5.97409 \times 10^{-9}$), as well as with the closing price (ANOVA F-test p-value $6.37232 \times 10^{-8}$). Figure 4 shows the error measures of the DT regression model. The average RMSE (0.174) and MAE (0.353) indicate that the model fits the dataset well. Moreover, since these metrics measure the degree of closeness between the actual and predicted values, these results suggest that stock market fluctuation can be predicted from outbreaks of infectious diseases such as COVID-19. This result is consistent with the study by (51) on the effects of COVID-19 on the world economy and with other studies of infectious diseases (such as H7N9 and H5N1) and market depreciation (21,22).
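One way to set up the next-day prediction task is to pair each day's total cases and total deaths with the following day's opening price. The sketch below shows that pairing only; the function name `next_day_pairs`, the field names and the values are hypothetical, and the exact alignment the authors used is not stated in the text.

```python
def next_day_pairs(rows):
    """Pair day t's (total_cases, total_deaths) with day t+1's opening price.

    `rows` must be ordered by date; the last day has no next-day target
    and is therefore dropped.
    """
    features, targets = [], []
    for today, tomorrow in zip(rows, rows[1:]):
        features.append((today["total_cases"], today["total_deaths"]))
        targets.append(tomorrow["open"])
    return features, targets


# Made-up, date-ordered records for a single index.
rows = [
    {"total_cases": 10, "total_deaths": 1, "open": 100.0},
    {"total_cases": 25, "total_deaths": 2, "open": 97.0},
    {"total_cases": 60, "total_deaths": 5, "open": 91.0},
]
X, y = next_day_pairs(rows)
```

Pairs in this shape could then be passed to a tree-based regressor, for example scikit-learn's `DecisionTreeRegressor(max_depth=4)`, after the scaling step described earlier.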

Conclusion and recommendation
Literature has shown that the stock market is affected by several factors, but previous studies focused on analysing market volatility against public sentiment, historical stock prices, financial news and search engine queries. The global outbreak of COVID-19 has brought the world's economy to a standstill, with total lockdowns in some countries and the closure of businesses and industries, and is anticipated to affect the economic strength of countries. As a result, several studies (33-47) have examined the impact of the novel virus on the stock market. However, unlike most of these works (33,38-47), which focused on regional datasets, this study presents a comprehensive examination of daily reported COVID-19 cases and stock market fluctuations across thirty stock market indices (see Table 2), covering the stock prices of several countries around the globe, using the decision tree algorithm. We also estimate the monetary loss within the period, project the future effect of the pandemic on the stock market and outline some portfolio allocation strategies to help investors hedge against investment risk.
The outcome of this study shows that the stock market is highly associated with COVID-19 infections and mortality. The reported numbers of COVID-19 deaths and infections were also significantly associated with the opening price of the CBOE Volatility Index (VIX). Moreover, the movement of the predicted effect on this index was contrary to all other market indices: it was proportional to COVID-19 cases, while all others were inversely proportional. Though we do not have a concrete reason why the index was not affected negatively like all the others, one plausible explanation is the emergency measures taken by local governments to deal with COVID-19, together with the encouragement local governments gave their citizens.
The average RMSE (0.174) and MAE (0.353) obtained across all examined indices show that the stock market is predictable to some extent based on the outbreak of the virus. We conclude that the COVID-19 pandemic contributes to economic loss, which is reflected in stock price movements.
We expect that, for the affected indices to bounce back, policy and regulatory frameworks will be needed both in the short term and in the coming years. In the short term, treasuries and central banks are expected to ensure that disrupted economies keep functioning while the COVID-19 outbreak continues. Furthermore, governments are expected to play a central part in the face of financial and real stress.
Finally, we recommend the following measures to help investors hedge against the risk of falling prices in this epidemic period using portfolio allocation.
1. As shown in Table 2, some indices have recorded a positive change in opening price; therefore, buying stocks in defensive businesses is a way for investors to make a profit despite falling stock prices. Non-cyclical or defensive stocks are securities that mostly do better than the overall market during bad times, such as the COVID-19 pandemic. They usually provide stable earnings and dividends, irrespective of the state of the overall market. Firms that produce household non-durables such as toiletries, shampoo, toothpaste and shaving cream are good examples of defensive firms, because the public will still use these items in times of epidemic.
2. Adopting the "Play Dead" strategy, that is, remaining calm and not making any rapid moves. By playing dead, we advise investors to put a more substantial portion of their portfolio in money market instruments, such as certificates of deposit, government Treasury bills and fixed-income instruments, which are less volatile than the equity and commodity markets in the face of the pandemic.
3. Long-term investors (meaning a time horizon of 10+ years) are encouraged to take advantage of dollar-cost averaging (DCA) by buying shares at regular intervals regardless of price, since share prices are typically low when the market is down. Over the long run, the cost will "average down", leaving the investor with a better overall entry price for their shares.
4. Investors can apply the theory of diversification by spreading their portfolio between bonds, stocks, cash and alternative assets. The allocation percentages hinge on factors such as time horizon, risk tolerance and goals. By doing this, the investor avoids the potentially harmful effects of placing all their eggs in one basket.
5. Another way to hedge against the risk of falling prices in this epidemic period is to go short. Investors can practise short-selling by borrowing shares in a firm and selling them, hoping to purchase them back at a lower price.
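The "averaging down" effect of dollar-cost averaging mentioned in the recommendations can be shown with a short arithmetic sketch. The prices and contribution amount are made up, fractional shares are assumed to be purchasable, and fees are ignored.

```python
def dca_average_cost(prices, amount_per_period):
    """Average cost per share when a fixed amount is invested at each price."""
    shares_bought = sum(amount_per_period / price for price in prices)
    total_invested = amount_per_period * len(prices)
    return total_invested / shares_bought


# Hypothetical share prices over three periods, including a market dip.
prices = [100.0, 50.0, 100.0]
average_cost = dca_average_cost(prices, amount_per_period=100.0)
simple_mean_price = sum(prices) / len(prices)
```

Investing 100 per period buys 1, 2 and 1 shares, so 300 invested yields 4 shares at an average cost of 75 per share, below the simple mean price of about 83.3, because the fixed amount buys more shares when the price is low.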

Declaration of interests
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. Contributors: All authors contributed equally to the development of this paper and have approved the final article. Funding: Authors did not receive any funding for this study. https://www.indjst.org/