- Introduction
- Related Works
- Methods
  - 3.1 Data
  - 3.2 Data Preprocessing
  - 3.3 Model Training
  - 3.4 Optimization Search Model
  - 3.5 Performance Evaluation
Stock trading is a popular investment choice for individuals seeking to grow their wealth. However, investing in stock markets is inherently risky, requiring careful decision-making to achieve desirable profits. To navigate this uncertainty, investors often rely on Technical Analysis, a method that analyzes historical price trends and trading activity to identify patterns and predict future market movements [3].
This research focuses on leveraging Technical Analysis to extract meaningful features for training machine learning models. The study evaluates the performance of three machine learning models—Logistic Regression, Support Vector Machines (SVM), and K-Nearest Neighbors (KNN)—against an optimization-based model using Differential Evolution (DE). The goal is to help investors understand the strengths and limitations of these algorithms in making efficient trading decisions.
Financial academics have long studied ways to improve technical trading systems. Hundreds of technical indicators have been developed using statistical and mathematical methods, categorized into types such as:
- Trend Indicators: Analyze trends (e.g., Exponential Moving Average (EMA), Directional Movement Index (DMI)).
- Momentum Indicators: Measure the rate of price changes (e.g., Relative Strength Index (RSI), On-Balance-Volume (OBV)) [8].
These indicators are used to suggest trends, reversal points, and trading signals [9]. Early attempts relied on simple rule-based approaches, but the dynamic nature of stock markets limited their effectiveness.
To address these limitations, optimization algorithms such as Genetic Algorithms, Differential Evolution (DE), and Particle Swarm Optimization (PSO) have been applied to build trading recommendation systems [10]-[13]. These algorithms excel at handling nonlinearity, uncertainty, and large-scale problems.
In contrast, machine learning algorithms specialize in pattern detection. Unlike traditional methods, machine learning thrives on flexibility, rapidly analyzing complex data without requiring clearly defined patterns. It has been widely used with Open-High-Low-Close (OHLC) price data to build stock price prediction models [6][15][16]. Advanced techniques such as Long Short-Term Memory (LSTM), Artificial Neural Networks (ANN), and Convolutional Neural Networks (CNN) have also been employed to predict stock values using features like price and trading volume [17][18].
A critical step in training supervised machine learning models is data labeling. Since there is no definitive answer to when to buy or sell stocks, labeling data is challenging. Recent approaches, such as the N-Period Min-Max (NPMM) method, address this by labeling data at specific time points to reduce sensitivity to small price changes [2].
This study uses data from nine securities in the SET100 index of the Stock Exchange of Thailand (SET), categorized into three groups based on their market trends:
- Uptrend: SIRI, BDMS, ORI
- Downtrend: AAV, BANPU, SPALI
- Sideways: CENTEL, AOT, BCH
This selection allows investment strategies to be evaluated across different market conditions. Daily Open, High, Low, Close, and Volume data for these securities from January 1, 2019, to October 31, 2023, were imported using the yfinance library. Technical indicators were calculated using the TA-Lib library in Python, with manual calculations for indicators that the library does not provide.
Ten trading strategies were developed by combining technical indicators and setting conditions for generating buy/sell signals. These strategies are widely used by investors to make informed decisions:
- Volume Profile (VP):
  - Identifies key price levels with high trading volume.
  - Generates buy signals when the current price is below significant volume peaks and sell signals when it is above them [21].
- Stochastic Oscillator (STO):
  - A momentum indicator comparing the closing price to the price range over a set period.
  - Signals overbought (sell) and oversold (buy) conditions [22].
- Bollinger Bands (BB):
  - Plots bands two standard deviations above and below a moving average.
  - Identifies overbought and oversold conditions [23].
- Commodity Channel Index (CCI):
  - A momentum oscillator identifying overbought/oversold conditions.
  - Helps traders decide entry/exit points [24].
- RSI and MACD:
  - Combines MACD and RSI to assess trends and momentum.
  - Buy signal: MACD crosses above the MACD signal line, and RSI < 35.
  - Sell signal: MACD crosses below the MACD signal line, and RSI > 65 [25].
- OBV and MACD:
  - Combines leading (OBV) and lagging (MACD) indicators.
  - Buy signal: OBV slope > 30 degrees, MACD line above the signal line.
  - Sell signal: MACD line below the signal line, OBV slope < 30 degrees [5].
- ADX and DMI:
  - Measures trend strength using ADX and direction using +DI and -DI.
  - Detects trend changes through crossovers [5].
- Crossover of SMA50 and SMA100:
  - Buy signal: Shorter-term SMA crosses above longer-term SMA.
  - Sell signal: Shorter-term SMA crosses below longer-term SMA [5].
- Aroon:
  - Detects trend shifts using Aroon Up and Aroon Down lines.
  - Indicates strong trends when new highs/lows occur regularly [5].
- Renko Charts:
  - Filters market noise using bricks representing fixed price shifts.
  - New bricks indicate trend continuation or reversal [26].
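To make the step from indicator to signal concrete, the sketch below implements the SMA crossover rule from the list above in plain Python. The function names `sma` and `crossover_signals` are illustrative assumptions, as is the 1/0/-1 encoding, which is borrowed from the signal convention used in the decision equation. The study itself computed its indicators with TA-Lib, so this is not the authors' code.

```python
def sma(values, window):
    """Simple moving average; None until a full window is available."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out


def crossover_signals(prices, short_window=50, long_window=100):
    """Return 1 (buy), -1 (sell), or 0 (hold) for each day."""
    short = sma(prices, short_window)
    long_ = sma(prices, long_window)
    signals = [0] * len(prices)
    for t in range(1, len(prices)):
        if None in (short[t - 1], long_[t - 1], short[t], long_[t]):
            continue  # not enough history for both averages yet
        prev_diff = short[t - 1] - long_[t - 1]
        curr_diff = short[t] - long_[t]
        if prev_diff <= 0 < curr_diff:
            signals[t] = 1   # short SMA crossed above long SMA
        elif prev_diff >= 0 > curr_diff:
            signals[t] = -1  # short SMA crossed below long SMA
    return signals
```

Calling `crossover_signals(closes)` with the default windows corresponds to the SMA50/SMA100 strategy; smaller windows are convenient for checking the logic on short series.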
The N-Period Min-Max (NPMM) method was used to label data, addressing the shortcomings of traditional up-down labeling. NPMM analyzes stock price trends over a defined period (N = 14 to 21 days) and identifies minimum (buy) and maximum (sell) points within that period.
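The following is a minimal sketch of one plausible reading of NPMM labeling, not the reference implementation. It assumes consecutive, non-overlapping windows of `n` closing prices, labels the lowest close in each window as a buy (1) and the highest as a sell (-1), and leaves every other day unlabeled (0). Whether the original method uses overlapping windows, or how it treats ties and incomplete trailing windows, is not stated here, so those choices are assumptions.

```python
def npmm_labels(closes, n=14):
    """Label the min (1 = buy) and max (-1 = sell) close in each n-day window."""
    labels = [0] * len(closes)
    # Step through complete, non-overlapping windows; a trailing partial
    # window is left unlabeled (an assumption of this sketch).
    for start in range(0, len(closes) - n + 1, n):
        window = closes[start:start + n]
        lo = start + window.index(min(window))  # first occurrence of the minimum
        hi = start + window.index(max(window))  # first occurrence of the maximum
        if lo != hi:  # a completely flat window has no distinct extremes
            labels[lo] = 1
            labels[hi] = -1
    return labels
```

Because only two points per window receive a label, small day-to-day price changes do not flip the label, which is the reduced sensitivity that motivates the method.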
Three machine learning models were trained using the trading signals as input features:
- Logistic Regression:
  - A statistical method for binary classification tasks.
  - Models the relationship between features and class probabilities.
- K-Nearest Neighbors (KNN):
  - A supervised learning algorithm for classification and regression.
  - Predicts labels based on the k closest data points.
- Support Vector Machines (SVM):
  - A powerful supervised learning algorithm for classification and regression.
  - Finds the optimal hyperplane to separate data points into classes.
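As an illustration of how a classifier consumes the trading signals, here is a from-scratch KNN sketch in plain Python. It assumes each training row is one day's vector of strategy signals (1, 0, or -1) and each label is that day's buy/sell label. The function name `knn_predict`, the squared Euclidean distance, and `k=3` are illustrative choices; the study most likely used a library implementation with its own settings.

```python
from collections import Counter


def knn_predict(train_X, train_y, x, k=3):
    """Predict the label of x by majority vote among its k nearest rows."""
    # Squared Euclidean distance preserves the neighbor ordering.
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]
```

For example, a day on which most strategies emit a buy signal lies closest to past days with similar signal vectors, so it inherits their majority label.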
The Differential Evolution (DE) algorithm was used to optimize trading strategies. DE follows these steps:
- Parameterization: Identify key parameters (e.g., entry/exit signals, risk management).
- Population Initialization: Create an initial population of potential solutions.
- Mutation and Crossover: Introduce diversity and combine promising solutions.
- Selection: Evaluate solutions based on a fitness function (e.g., profitability, risk).
- Iteration and Convergence: Repeat until a stopping criterion is met.
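The steps above can be sketched as a compact DE loop. This version uses the common DE/rand/1/bin scheme and maximizes a user-supplied fitness function. The variant, the control parameters (`f`, `cr`, `pop_size`, `generations`), the fixed-generation stopping rule, and the clipping to bounds are all assumptions of this sketch, since the configuration actually used in the study is not given here.

```python
import random


def differential_evolution(fitness, bounds, pop_size=20, f=0.8, cr=0.9,
                           generations=100, seed=0):
    """Maximize `fitness` over box `bounds` with DE/rand/1/bin (pop_size >= 4)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Population initialization: uniform random points inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [fitness(ind) for ind in pop]
    for _ in range(generations):  # stopping criterion: fixed generation count
        for i in range(pop_size):
            # Mutation: pick three distinct individuals other than i.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            forced = rng.randrange(dim)  # at least one gene comes from the mutant
            trial = []
            for d in range(dim):
                if rng.random() < cr or d == forced:
                    # Crossover takes this gene from the mutant vector.
                    v = pop[a][d] + f * (pop[b][d] - pop[c][d])
                    lo, hi = bounds[d]
                    v = min(max(v, lo), hi)  # clip back into the bounds
                else:
                    v = pop[i][d]  # keep the parent's gene
                trial.append(v)
            # Selection: the trial replaces the parent if it is at least as fit.
            s = fitness(trial)
            if s >= scores[i]:
                pop[i], scores[i] = trial, s
    best = max(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]
```

In a trading setting, each individual could hold the signal weights and the decision threshold, and `fitness` could return the simulated profit of the resulting strategy.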
The trading decision is calculated as the weighted average of the trading signals:
\[ \text{Decision}_d = \frac{w_1 s_1 + w_2 s_2 + \dots + w_n s_n}{\sum_{i=1}^{n} w_i} \]
Where:
- \( w_i \): Weight of the \( i \)-th trading signal.
- \( s_i \): The \( i \)-th trading signal (1 = buy, 0 = hold, -1 = sell).
Once the decision value \( \text{Decision}_d \) is calculated, it is compared with a threshold \( t_d \) to determine the action:
- Buy if \( \text{Decision}_d > t_d \).
- Sell if \( \text{Decision}_d < -t_d \).
- Hold if \( \text{Decision}_d \) is within the range \( (-t_d, t_d) \).
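The decision rule translates directly into a few lines of Python. In this sketch the function name `trading_decision` and the string return values are illustrative. A value exactly equal to the threshold or its negative is treated as a hold, which is an assumption because the rule above does not define the boundary case.

```python
def trading_decision(signals, weights, threshold):
    """Return (action, score) from the weighted average of the signals."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    # Weighted average of the signals (each 1 = buy, 0 = hold, -1 = sell).
    score = sum(w * s for w, s in zip(weights, signals)) / total
    if score > threshold:
        return "buy", score
    if score < -threshold:
        return "sell", score
    return "hold", score  # includes the boundary case (assumption)
```

For instance, with weights 2, 1, 1 and signals 1, 1, -1, the score is (2 + 1 - 1) / 4 = 0.5, which yields a buy for any threshold below 0.5.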
A trading simulation was conducted for each stock, including both long and short positions. The average returns for each strategy were compared with the returns from a buy-and-hold (B&H) strategy. A 0.2% commission fee was applied to all transactions to account for realistic trading costs.
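A simplified version of such a simulation is sketched below. Its assumptions are not taken from the study: an action on day t is executed at that day's close, the full equity is always invested, a short position earns the negative of the daily return, and the 0.2% commission is charged per unit of position change, so flipping from long to short counts as two transactions. Any open position is closed on the last day.

```python
def simulate(prices, actions, commission=0.002):
    """Total return of following `actions` (1 = buy, -1 = sell, 0 = hold)."""
    equity = 1.0
    position = 0  # +1 long, -1 short, 0 flat
    for t in range(len(prices) - 1):
        target = actions[t] if actions[t] != 0 else position
        # Charge the commission once per unit of position change.
        equity *= (1 - commission) ** abs(target - position)
        position = target
        daily_return = prices[t + 1] / prices[t] - 1
        equity *= 1 + position * daily_return
    # Close any open position at the end of the series.
    equity *= (1 - commission) ** abs(position)
    return equity - 1


def buy_and_hold(prices, commission=0.002):
    """Return of buying on the first day and selling on the last."""
    return (prices[-1] / prices[0]) * (1 - commission) ** 2 - 1
```

Comparing `simulate(prices, actions)` with `buy_and_hold(prices)` for each stock gives the kind of strategy-versus-benchmark comparison described above.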