Study on Forecasting the Price of Oyster Mushrooms: Application of SARIMA, LSTM, and Neural Prophet
Abstract
Oyster mushrooms have the highest average production among the agricultural mushrooms grown in Korea. Because these mushrooms can be cultivated year-round, price indicators considerably affect farmers’ production decisions. Herein, forecasting models were developed to predict the price of oyster mushrooms using the Autoregressive Integrated Moving Average (ARIMA) model and deep learning models such as Long Short-Term Memory (LSTM) and neural prophet. Representative performance indicators were used to evaluate the predictive power of these models. The neural prophet model exhibited the highest accuracy. Its predictive power is superior because it reflects the characteristics of oyster mushrooms, whose prices, as an item produced throughout the year, fluctuate more markedly with major events such as holidays than with seasonal factors.
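Abstract (translated from Korean)
Among the agricultural mushrooms cultivated in Korea, oyster mushrooms have the highest average production. Because oyster mushrooms can be produced year-round, price indicators have an important influence on farmers’ production decisions. In this study, oyster mushroom price-forecasting models were developed using time series models such as ARIMA (Autoregressive Integrated Moving Average) and deep learning models such as LSTM (Long Short-Term Memory) and Neural Prophet. Representative performance indicators were used to evaluate the predictive power of the models, and the analysis showed that the Neural Prophet model had the highest accuracy. The Neural Prophet model showed superior predictive power because it reflects the characteristics of oyster mushrooms, whose prices, as an item produced year-round, fluctuate more with major events such as holidays than with seasonal factors.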
Keywords:
oyster mushroom, price forecast, SARIMA, LSTM, neural prophet
Ⅰ. Introduction
Oyster mushrooms account for more than 30% of the total agricultural mushroom production in Korea, the highest average production among cultivated agricultural mushrooms[1]. Oyster mushroom cultivation requires a relatively small capital investment compared with other mushrooms. They can be cultivated on inexpensive substrates such as rice straw and cotton waste[2], which makes their production advantageous. Oyster mushrooms can be produced in facilities where temperature and humidity are controlled, which enables year-round production[3][4]. Compared with other products, their market equilibrium therefore tends to depend more heavily on farmers’ decisions, such as the production time and intended cultivation area.
Typically, decisions regarding agricultural production are made based on price information. This also holds true for oyster mushrooms, categorized as agricultural mushrooms, whose price information substantially influences farmers’ decision making. In addition to oyster mushrooms, king oyster and enoki mushrooms account for a considerable proportion of agricultural mushroom production; however, most of the king oyster and enoki mushrooms available in the market are produced in large-scale cultivation factory facilities. Button mushrooms can be considered representative of farm-grown mushrooms, but they are produced on a small scale; hence, their relative position in the mushroom industry can be considered lower than that of oyster mushrooms. Because price information plays a major role in oyster mushroom cultivation, the prediction of oyster mushroom prices can serve as a key indicator in farmers’ decision making. Research on this topic is therefore meaningful for the stability of the mushroom market.
To forecast the price of agricultural products, analysis is usually performed using a structural model built by considering the factors that influence the supply, demand, and prices of the products. In long-term forecasting models in particular, a common approach is to derive the equilibrium price from total supply and total demand, which requires detailed data containing a wide range of market-relevant information[5]. In short-term forecasting, depending on the characteristics of the data, time series models are sometimes used instead of a structural model. Time series models are relatively free in terms of model structuring, such as setting variables and building equations, and are advantageous for identifying price trends and understanding patterns over time[6]. Numerous studies have focused on forecasting the prices of agricultural products, encompassing grains, livestock, and horticultural crops[7]-[10]. Moreover, forecast accuracy has improved with enhancements in data quality, including improvements in outlook information and traceability systems. However, in the case of oyster mushrooms, little research has been conducted on the market economy encompassing supply and demand because basic data and forecasting models are lacking. Hence, studies on the price prediction of oyster mushrooms are scarce.
This study aimed to construct alternative prediction models using the price data of oyster mushrooms and to determine the optimal forecasting model. To this end, in addition to the traditional time series models commonly used for price analysis of agricultural products, machine learning models, which have been increasingly used in recent predictive analytics research, were utilized, and the predictive performance of these models was compared. The price forecasts derived from this study can aid in developing a judgment index for decision making by oyster mushroom producers and serve as a foundation for mushroom research.
Ⅱ. Forecasting Models
The Autoregressive Integrated Moving Average (ARIMA) model is a traditional analysis method for price forecasting and has been widely used to analyze time series data[11][12]. The ARIMA model has been supplemented and extended in accordance with research needs and data types. A representative example is SARIMA (seasonal ARIMA), which was developed to analyze seasonal time series data[13]-[15]. Recently, with advancements in technology and growing interest in machine learning, there has been a surge in the number of studies using machine learning models to predict the prices of agricultural products[16]-[20], and models such as Long Short-Term Memory (LSTM) and neural prophet are typically used to forecast time series data[21][22]. In this study, three models were applied to forecast the price of oyster mushrooms: SARIMA, LSTM, and neural prophet.
2.1 SARIMA
SARIMA is an extension of the ARIMA model that accounts for seasonal components. When seasonality is identified in nonstationary time series data, seasonal AR(P), seasonal MA(Q), and seasonal differencing terms, written together as ARIMA(P,D,Q)s, are added to the existing ARIMA framework[15]. The SARIMA model therefore uses not only the recent historical data leading up to the point of prediction but also the periodic characteristics and seasonality of the data, which allows it to draw on additional data from previous periods[6].
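The SARIMA(p,d,q)(P,D,Q)s model can be expressed as follows:
$$\phi_p(B)\,\Phi_P(B^s)\,(1-B)^d\,(1-B^s)^D\,y_t = \theta_q(B)\,\Theta_Q(B^s)\,\varepsilon_t \tag{1}$$
where $B$ is the backward shift operator ($B y_t = y_{t-1}$) and $B^s$ is the backward shift operator with the seasonal period $s$, $y_t$ denotes the value of the variable at time $t$, $\varepsilon_t$ is white noise, $\phi_p(B)$ and $\theta_q(B)$ are the non-seasonal AR and MA polynomials, $\Phi_P(B^s)$ indicates the seasonal AR polynomial, and $\Theta_Q(B^s)$ represents the seasonal MA polynomial; $p$ is the order of the AR term, $q$ indicates the order of the MA term, $d$ represents the order of differencing, $P$ is the order of the seasonal AR term, $Q$ denotes the order of the seasonal MA term, and $D$ represents the order of seasonal differencing.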
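The differencing operators in the SARIMA form can be illustrated with a short sketch. The function names and the toy series below are hypothetical and are not part of the study; the code only shows how regular differencing (order d) and seasonal differencing (order D, period s) act on a list of observations.

```python
def difference(series, lag=1):
    """Apply (1 - B^lag): subtract the value observed `lag` steps earlier."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def sarima_difference(series, d=0, D=0, s=1):
    """Apply (1 - B)^d (1 - B^s)^D to a list of observations."""
    out = list(series)
    for _ in range(D):          # seasonal differencing, D times
        out = difference(out, lag=s)
    for _ in range(d):          # regular differencing, d times
        out = difference(out, lag=1)
    return out


# Toy series (hypothetical): linear trend plus a pattern repeating every 3 steps.
toy = [10 + 2 * t + [0, 5, -5][t % 3] for t in range(9)]
print(sarima_difference(toy, d=1, D=0, s=3))  # trend removed, pattern remains
print(sarima_difference(toy, d=0, D=1, s=3))  # repeating pattern removed
```

Each differencing pass shortens the series by its lag, which is why differenced models need a few extra observations at the start of the sample.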
2.2 LSTM
LSTM is a special type of Recurrent Neural Network (RNN) introduced by Hochreiter & Schmidhuber (1997). Conventional RNNs suffer from instability such as vanishing or exploding gradients when learning long sequences, which makes long-term dependencies difficult to learn. LSTM addresses these issues by keeping a structure similar to that of RNNs while building each memory cell from four interacting layers. These four layers comprise the input gate, forget gate, output gate, and cell state; the gate mechanisms let only the necessary information through when determining the cell values.
Fig. 1 represents the conceptual structure of each LSTM gate. The cell state, denoted as $c_t$, plays a crucial role in carrying past information from previous to current states and in updating it with new information, thereby helping to resolve long-term dependency issues. The forget gate, represented by Equation (2), determines whether to discard or retain past cell state information and is calculated using a sigmoid function.
$$f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f) \tag{2}$$
Equation (3) represents the input gate, which determines how new information $x_t$ should be incorporated using both a sigmoid function and a hyperbolic tangent (tanh) function.
$$i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i), \quad \tilde{c}_t = \tanh(W_c \cdot [h_{t-1}, x_t] + b_c), \quad c_t = f_t \otimes c_{t-1} \oplus i_t \otimes \tilde{c}_t \tag{3}$$
The output gate, as illustrated in Equation (4), first adjusts the value using a sigmoid function and then applies the hyperbolic tangent function to the cell state to output the value.
$$o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o), \quad h_t = o_t \otimes \tanh(c_t) \tag{4}$$
Equation (5) defines the sigmoid and hyperbolic tangent functions. In Equations (2)-(4), $W$ denotes the weight matrices applied to the previous hidden state $h_{t-1}$ and the input $x_t$, and $b$ represents the bias vectors for each layer. ⊗ denotes element-wise multiplication and ⊕ signifies element-wise addition.
$$\sigma(x) = \frac{1}{1 + e^{-x}}, \quad \tanh(x) = \frac{e^{x} - e^{-x}}{e^{x} + e^{-x}} \tag{5}$$
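The gate mechanics described above can be traced with a minimal sketch of one LSTM step. It assumes the common textbook formulation with a single unit and scalar input and state; the weights are hypothetical values chosen only for illustration and are not the ones trained in this study.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x_t, h_prev, c_prev, params):
    """One step of a single-unit LSTM cell with scalar input and state.

    `params` maps each layer name ("f", "i", "c", "o") to a tuple
    (w_h, w_x, b): weight on h_prev, weight on x_t, and bias.
    """
    def pre(name):
        w_h, w_x, b = params[name]
        return w_h * h_prev + w_x * x_t + b

    f_t = sigmoid(pre("f"))              # forget gate
    i_t = sigmoid(pre("i"))              # input gate
    c_tilde = math.tanh(pre("c"))        # candidate cell value
    c_t = f_t * c_prev + i_t * c_tilde   # cell state update
    o_t = sigmoid(pre("o"))              # output gate
    h_t = o_t * math.tanh(c_t)           # hidden state passed to the next step
    return h_t, c_t


# Hypothetical weights: gates fixed at 0.5, candidate driven by the input only.
params = {"f": (0.0, 0.0, 0.0), "i": (0.0, 0.0, 0.0),
          "c": (0.0, 1.0, 0.0), "o": (0.0, 0.0, 0.0)}
h, c = 0.0, 0.0
for x in [0.2, 0.5, 0.9]:                # e.g. three normalized prices
    h, c = lstm_step(x, h, c, params)
    print(round(h, 4), round(c, 4))
```

Because the forget gate multiplies the previous cell state rather than repeatedly squashing it, information can persist across many steps, which is the property the text credits for handling long-term dependencies.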
2.3 Neural prophet
The neural prophet model extends the Prophet model released by Facebook by combining it with neural networks to improve performance[23]. The neural prophet model can be represented by six components, as follows:
$$\hat{y}_t = T(t) + S(t) + E(t) + F(t) + A(t) + L(t) \tag{6}$$
where T(t) represents trend, S(t) denotes seasonal effects, E(t) stands for event and holiday effects, F(t) indicates regression effects for future-known exogenous variables, A(t) denotes auto-regression effects, and L(t) signifies the effects of lagged regressors.
$T(t)$ indicates the long-term direction of the data and can be expressed as a piecewise linear function as follows:
$$T(t) = \delta_i\, t + \rho_i, \quad c_i \le t < c_{i+1} \tag{7}$$
where $\delta$ represents the growth rate, $\rho$ denotes the offset, and $c$ signifies a change point. $S(t)$ captures the patterns that reflect seasonal components and can be expressed as follows:
$$S_p(t) = \sum_{j=1}^{k}\left(a_j \cos\frac{2\pi j t}{p} + b_j \sin\frac{2\pi j t}{p}\right) \tag{8}$$
where $a_j$ and $b_j$ are Fourier coefficients, $p$ is the periodicity, and $k$ is the number of Fourier terms, indexed by $j$. $E(t)$ signifies the event effects considered in the forecasting model and can be expressed as follows:
$$E(t) = \sum_{e} z_e\, e(t) \tag{9}$$
where $z_e$ is the coefficient of the model and $e(t)$ is the event as a binary variable. $F(t)$ indicates the effects of regressors that can affect the forecast target and can be expressed as follows:
$$F(t) = \sum_{f} d_f\, f(t) \tag{10}$$
where $d_f$ represents the coefficient of the model and $f$ denotes a future regressor. $A(t)$ denotes the auto-regression effects at time $t$ based on past observations, which depend on the regression of $y_t$ on $y_{t-i}$ and can be expressed as follows:
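$$y_t = c + \sum_{i=1}^{p} \theta_i\, y_{t-i} + e_t \tag{11}$$
where $c$ is the intercept, $\theta_i$ are the auto-regression coefficients, and $e_t$ denotes the white noise term. $L(t)$ indicates the regression effects at time $t$ of lagged observations of exogenous variables and can be expressed as follows:
$$L(t) = \sum_{x} L_x(x_{t-1}, \dots, x_{t-p}) \tag{12}$$
where $x_{t-1}, \dots, x_{t-p}$ denote the lagged observations of the exogenous variable $x$.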
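To make the additive structure concrete, the sketch below evaluates two of the components, a Fourier-series seasonal term and a binary event term, and adds them to a trend value. All function names, coefficients, and the trend level are hypothetical illustrations; they are not the fitted values of this study, and the code does not use the NeuralProphet package itself.

```python
import math


def seasonal_effect(t, a, b, period):
    """Fourier-series seasonality with len(a) terms and the given period."""
    return sum(
        a_j * math.cos(2 * math.pi * j * t / period)
        + b_j * math.sin(2 * math.pi * j * t / period)
        for j, (a_j, b_j) in enumerate(zip(a, b), start=1)
    )


def event_effect(indicators, coefficients):
    """Sum of coefficient * indicator over events; indicators are 0 or 1."""
    return sum(coefficients[name] * flag for name, flag in indicators.items())


# Hypothetical weekly pattern (period 7, two Fourier terms) and one event.
a, b = [100.0, 20.0], [50.0, -10.0]
coefficients = {"holiday": 5000.0}
trend_level = 6000.0  # hypothetical trend value in KRW

for t, is_holiday in [(0, 0), (3, 0), (5, 1)]:
    y_hat = (trend_level
             + seasonal_effect(t, a, b, period=7)
             + event_effect({"holiday": is_holiday}, coefficients))
    print(t, round(y_hat, 1))
```

The event term only contributes on days whose indicator is 1, so a large holiday coefficient can capture a price spike without distorting the seasonal pattern on ordinary days.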
Ⅲ. Establishment of the model
3.1 Price data of oyster mushrooms
To estimate the mushroom price-forecasting models, data on wholesale prices of oyster mushrooms from the Garak Market provided by the Seoul Agro-Fisheries & Food Corporation were used[24]. The dataset was constructed based on the premium price of a 2-kg box of oyster mushrooms, and the producer price index was used to convert nominal prices into real prices. Daily data spanning from January 2021 to December 2022 were utilized for the analysis.
As the wholesale market remains closed on Sundays and public holidays, oyster mushroom wholesale prices are not available on those days, leading to missing price data. To address this, linear interpolation was applied to impute the missing values.
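A minimal sketch of the linear interpolation step is given below. The function name and the price values are hypothetical; gaps are represented as `None`, and how the study treated gaps at the very start or end of the sample is not stated, so they are simply left unfilled here.

```python
def interpolate_linear(values):
    """Fill None gaps by linear interpolation between the nearest observed
    neighbours. Leading or trailing gaps have only one neighbour and are
    left as None.
    """
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


# Hypothetical prices (KRW); None marks a day on which the market was closed.
prices = [5000.0, 5300.0, None, 5900.0, None, None, 6800.0]
print(interpolate_linear(prices))
```

Interpolated values lie on a straight line between the surrounding trading days, so a closed holiday is bridged smoothly rather than treated as a price of zero.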
The time series plot of oyster mushroom prices shows that there are trends and seasonality over time. In particular, prices increase sharply during holiday periods, which occur periodically every year. When analyzing price predictions, it is necessary to consider the holiday effect. Excluding holiday periods, the prices for the mushroom range between 3,000 and 8,000 Korean Won(KRW). However, during holiday periods, prices can exceed 10,000 KRW and even surpass 20,000 KRW. Therefore, when applying the model, an analysis of oyster mushroom prices considering the seasonality and holiday effects was conducted.
3.2 SARIMA
Prior to setting up the model, data exploration was conducted to evaluate the trend, seasonality, and stationarity of the time series data, as depicted in Fig. 3. Based on this exploration and the auto.arima function, the SARIMA model that best explains the price of oyster mushrooms was selected by comparing AIC and BIC values. SARIMA(0,1,1)(0,0,2)$_7$ was identified as the optimal model, as presented in Table 1.
The estimation results, including coefficients, are provided in Table 1. The results indicate coefficients of 0.384 for MA(1), 0.059 for seasonal MA(1), and 0.155 for seasonal MA(2). To verify the independence of the estimated model residuals, the Ljung–Box test was performed, revealing no autocorrelation present in the residuals.
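The multiplicative structure of the selected model can be unpacked by multiplying its non-seasonal and seasonal MA polynomials. The sketch below uses the coefficients reported above and assumes the sign convention in which MA terms enter with a plus sign (as in R's arima output); that convention is an assumption, since the text does not state it.

```python
def multiply_polynomials(p, q):
    """Multiply two polynomials in B given as {lag: coefficient} dicts."""
    product = {}
    for lag_p, coef_p in p.items():
        for lag_q, coef_q in q.items():
            lag = lag_p + lag_q
            product[lag] = product.get(lag, 0.0) + coef_p * coef_q
    return dict(sorted(product.items()))


# Reported coefficients; the "+" sign convention is an assumption.
ma = {0: 1.0, 1: 0.384}                      # 1 + 0.384 B
seasonal_ma = {0: 1.0, 7: 0.059, 14: 0.155}  # 1 + 0.059 B^7 + 0.155 B^14
combined = multiply_polynomials(ma, seasonal_ma)
for lag, coef in combined.items():
    print(f"lag {lag:2d}: {coef:.6f}")
```

Under this reading, the daily price change depends on the current shock and on shocks from 1, 7, 8, 14, and 15 days earlier; the lag-8 and lag-15 weights are the products of the MA(1) coefficient with the two seasonal MA coefficients.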
3.3 LSTM
The LSTM model was configured by using Python 3.9.18, Keras 2.13.1, and TensorFlow 2.13.0. The oyster mushroom price dataset was split as follows: 60% for training, 20% for validation, and 20% for testing. To ensure accurate learning, the training data were transformed into values between 0 and 1 using min-max normalization as in Equation (13).
$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}} \tag{13}$$
where $x_{\min}$ and $x_{\max}$ are the minimum and maximum values of the data being scaled.
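The split and scaling steps can be sketched as follows. The function names are hypothetical, and two choices are assumptions not stated in the text: the split is taken in chronological order, and the minimum and maximum are computed on the training portion only and then reused for the validation and test portions.

```python
def chronological_split(series, train_pct=60, val_pct=20):
    """Split a series in time order into train, validation, and test parts."""
    n = len(series)
    train_end = n * train_pct // 100
    val_end = n * (train_pct + val_pct) // 100
    return series[:train_end], series[train_end:val_end], series[val_end:]


def min_max_scale(values, lo, hi):
    """Scale values with (x - lo) / (hi - lo); requires hi > lo."""
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical daily prices in KRW.
prices = [4000.0, 4500.0, 5000.0, 6000.0, 8000.0, 9000.0,
          7000.0, 6500.0, 10000.0, 5500.0]
train, val, test = chronological_split(prices)
lo, hi = min(train), max(train)          # statistics from training data only
train_scaled = min_max_scale(train, lo, hi)
test_scaled = min_max_scale(test, lo, hi)
print(train_scaled)
print(test_scaled)
```

With training-only statistics, later prices above the training maximum map to values greater than 1, which is expected and avoids leaking information from the test period into training.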
The hyperparameters used in the model are presented in Table 2. The number of epochs was determined by assessing the correlation between the number of epochs and the loss function results. The experimental results showed that beyond 400 epochs, there were minimal changes in the loss function results, leading to the decision to limit the epochs. Hyperparameters such as a learning rate of 0.001, batch size of 32, and Mean Squared Error(MSE) as the loss function were utilized. To mitigate the risk of overfitting during model training, the dropout technique was applied with the value set to 0.1.
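Before a recurrent model can be trained with a given batch size, the price series is usually rearranged into input windows and next-step targets. The sketch below illustrates that step; the window length of 3 is hypothetical because the look-back length is not stated in the text, and only the batch size of 32 comes from the study.

```python
import math


def make_windows(series, window):
    """Return (inputs, targets): each input holds `window` consecutive values
    and the target is the value that immediately follows them."""
    n_samples = len(series) - window
    inputs = [series[i:i + window] for i in range(n_samples)]
    targets = [series[i + window] for i in range(n_samples)]
    return inputs, targets


def batches_per_epoch(n_samples, batch_size=32):
    """Number of gradient updates in one pass over the training samples."""
    return math.ceil(n_samples / batch_size)


# Hypothetical normalized prices.
series = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
inputs, targets = make_windows(series, window=3)
print(inputs)
print(targets)
print(batches_per_epoch(len(inputs)))
```

Each epoch then processes every window once, so the total number of weight updates is the number of batches per epoch multiplied by the number of epochs.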
3.4 Neural prophet
The neural prophet model was implemented with the Python package NeuralProphet (version 0.5.0). The composition of the price data used in the analysis was the same as that used in the LSTM analysis, i.e., 60% training data, 20% validation data, and 20% test data. When the neural prophet model was set up, the options of auto_regression, yearly seasonality, and weekly seasonality were applied.
Furthermore, event and holiday effects were applied and analyzed to capture the characteristics of oyster mushrooms, whose prices change considerably during holidays. The hyperparameters used in the model are presented in Table 3.
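Event and holiday effects enter a forecasting model as binary indicator series. The sketch below shows one way such an indicator could be built around a holiday date; the function name and the window of three days before and one day after are hypothetical choices, and the example date (Chuseok 2021) should be verified against an official calendar.

```python
from datetime import date, timedelta


def event_indicator(days, event_dates, before=0, after=0):
    """Return a 0/1 list marking days within [event - before, event + after]."""
    active = set()
    for event in event_dates:
        for offset in range(-before, after + 1):
            active.add(event + timedelta(days=offset))
    return [1 if day in active else 0 for day in days]


start = date(2021, 9, 15)
days = [start + timedelta(days=i) for i in range(10)]   # 15 to 24 Sep 2021
chuseok = [date(2021, 9, 21)]                           # illustrative event date
flags = event_indicator(days, chuseok, before=3, after=1)
print(flags)
```

Widening the window before the holiday lets the model attribute the pre-holiday price run-up to the event rather than to trend or seasonality.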
Ⅳ. Results
4.1 Measure of prediction accuracy
In this study, model accuracy was evaluated to examine how well the predicted values derived from the oyster mushroom price-forecasting models match the actual prices[25]. Four measures were used to assess model performance: the Root Mean Squared Error (RMSE), Mean Absolute Percentage Error (MAPE), Root Mean Square Percentage Error (RMSPE), and Mean Absolute Error (MAE). RMSE is the square root of the average of the squared differences between the actual values and predicted values, which can be expressed as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2} \tag{14}$$
where $n$ denotes the total number of observations and $i$ indexes each individual observation; $y_i$ and $\hat{y}_i$ are the actual and predicted values, respectively. MAE represents the average of the absolute differences between the actual values and predicted values, which can be calculated as follows:
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right| \tag{15}$$
RMSE and MAE are the most widely used measures in forecasting[26]. As errors decrease, the RMSE and MAE values also decrease. The RMSPE and MAPE respectively convert RMSE and MAE into percentage units, so that the errors remain comparable regardless of changes in the measurement unit. RMSPE can be expressed as follows:
$$\mathrm{RMSPE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(\frac{y_i - \hat{y}_i}{y_i}\right)^2} \times 100 \tag{16}$$
MAPE indicates how large the error, which is the difference between the actual value and predicted value, is relative to the actual value. MAPE can be represented as follows:
$$\mathrm{MAPE} = \frac{100}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \tag{17}$$
RMSPE and MAPE, like RMSE and MAE, are widely used measures in many forecasting models, and they are less sensitive to outliers than RMSE and MAE[26]. As the errors decrease, the RMSPE and MAPE values also decrease, as with RMSE and MAE.
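The four measures can be computed with a few lines of code. The sketch below assumes the common convention in which percentage errors are taken relative to the actual value and reported on a 0 to 100 scale; the function names and the sample values are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Square root of the mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual, predicted):
    """Mean of the absolute errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmspe(actual, predicted):
    """Root mean squared percentage error (actual values must be nonzero)."""
    return 100 * math.sqrt(
        sum(((a - p) / a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual, predicted):
    """Mean absolute percentage error (actual values must be nonzero)."""
    return 100 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical actual and predicted prices.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
for metric in (rmse, mae, rmspe, mape):
    print(metric.__name__, round(metric(actual, predicted), 3))
```

RMSE and MAE are in the unit of the price (KRW), while RMSPE and MAPE are unit-free percentages, which is why the text reports both kinds.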
4.2 Comparison of models
The predicted results for each model and the RMSE, MAE, RMSPE, and MAPE results of the test data were compared to determine which model performed the best. Comparative analysis results of the performance of SARIMA, LSTM, and neural prophet models are presented in Fig. 6 and Table 4. Fig. 6 illustrates the evaluation results of RMSE, MAE, RMSPE, and MAPE for each model.
As indicated in Table 4, the RMSE values of SARIMA, LSTM, and neural prophet were 2,099.89, 1,263.00, and 874.86, respectively. The neural prophet model had the lowest RMSE, and the other evaluation measures were likewise lowest for the neural prophet model. The neural prophet model thus outperformed the other models across all performance indicators, making it the most suitable model for predicting oyster mushroom prices. Further, the performance of LSTM was superior to that of SARIMA.
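The size of the gap between the models can be expressed as a relative reduction in RMSE. The short sketch below derives it from the three RMSE values reported above; the helper name is hypothetical, and the percentages are simple arithmetic on those values rather than additional results from the study.

```python
# RMSE values as reported in the text.
rmse_by_model = {"SARIMA": 2099.89, "LSTM": 1263.00, "Neural Prophet": 874.86}


def relative_reduction(baseline, candidate):
    """Percentage by which `candidate` is lower than `baseline`."""
    return 100 * (baseline - candidate) / baseline


best = min(rmse_by_model, key=rmse_by_model.get)
for name, value in rmse_by_model.items():
    if name != best:
        drop = relative_reduction(value, rmse_by_model[best])
        print(f"{best} vs {name}: {drop:.1f}% lower RMSE")
```

On these figures, the RMSE of the neural prophet model is roughly 58% lower than that of SARIMA and roughly 31% lower than that of LSTM.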
Ⅴ. Conclusion
Oyster mushrooms can be cultivated under environmental control and produced year-round, and the decisions of mushroom-growing farms regarding production and shipment are major factors in determining the price. However, because basic data and market research on oyster mushrooms are lacking, indicators to support decision making under price fluctuations are insufficient. In this study, price-forecasting models were analyzed using the daily price data available for oyster mushrooms.
The SARIMA model, which considers seasonality, and deep learning LSTM and neural prophet models were used to analyze time series price data for predicting the oyster mushroom price. In particular, the accuracy of the neural prophet model was higher than that of the other two models.
This result stems from the fact that the neural prophet model fits well to the characteristics of oyster mushrooms, which are produced throughout the year and exhibit greater price fluctuations due to events such as holidays rather than seasonal factors.
The SARIMA model exhibited lower accuracy than the deep learning models. This is attributed to the relatively irregular seasonal patterns observed in mushroom production characteristics, in contrast to other agricultural products with regular seasonality. The SARIMA model works satisfactorily when the seasonal patterns remain stable and are consistent over time, but it is vulnerable to abnormal changes[27]. Therefore, the neural prophet model, which is comparatively robust to changes, is better suited for forecasting mushroom prices[28].
In this study, forecasting models were developed solely using the price data of oyster mushrooms. To further enhance the accuracy of the models, it is essential to consider additional factors affecting price formation, such as production and trading volumes. In addition, prediction accuracy can be improved by considering relations with other items such as button and enoki mushrooms, which may be substitutes.
Acknowledgments
This work was supported by a 2-Year Research Grant of Pusan National University.
References
[1] MAFRA, "Main statistics on agriculture, forestry, livestock and food", Ministry of Agriculture, Food and Rural Affairs, Korea, 2022.
[2] S. S. Cho, "Mushroom cultivation technology and management", Osung, Seoul, Korea, Apr. 2000.
[3] C. You, "History of mushroom industry in Korea", Journal of Mushrooms, Vol. 1, No. 1, pp. 1-8, Oct. 2003.
[4] R. Gogoi, Y. Rathaiah, and T. R. Borah, "Mushroom cultivation technology", Scientific Publishers, Feb. 2019.
[5] T. H. Kim, et al., "Development and Operation of Stochastic Agricultural Policy Analysis Models 2022", Korea Rural Economic Institute, Korea, Dec. 2022.
[6] R. J. Hyndman and G. Athanasopoulos, "Forecasting: principles and practice", OTexts, May 2018.
[7] J. H. Kim, C. U. Kim, and H. Y. Rho, "Methods for advancing the wholesale price prediction model for green onions", Journal of Agriculture & Life Sciences, Vol. 57, No. 4, pp. 143-150, Aug. 2023. [https://doi.org/10.14397/jals.2023.57.4.143]
[8] D. H. Kim, S. H. Kim, and C. J. Yu, "A study on pork price prediction using LSTM", Korean Journal of Agricultural Management and Policy, Vol. 48, No. 4, pp. 593-612, Dec. 2021.
[9] S. Oh, N. Im, S. Lee, and M. S. Kim, "Long-term price prediction and trend analysis of garlic using Prophet Model", Journal of The Korean Data Analysis Society, Vol. 22, No. 6, pp. 2325-2336, Dec. 2020. [https://doi.org/10.37727/jkdas.2020.22.6.2325]
[10] D. Yoo, "Developing vegetable price forecasting model with climate factors", The Korean Journal of Agricultural Economics, Vol. 57, No. 1, pp. 1-24, Mar. 2016.
[11] E. Lee and J. Seo, "A study on the time-series variation of Korean rice price", Journal of Modern Social Science, Vol. 11, pp. 157-170, 2020.
[12] B. S. Kim, "A Comparison on Forecasting Performance of the Application Models for Forecasting of Vegetable Prices", The Korean Journal of Agricultural Economics, Vol. 46, No. 4, pp. 89-113, Dec. 2005.
[13] E. H. Etuk, "The fitting of a SARIMA model to monthly Naira-Euro Exchange Rates", Mathematical Theory and Modeling, Vol. 3, No. 1, pp. 2224-5804, Jan. 2013.
[14] M. Valipour, "Long-term runoff study using SARIMA and ARIMA models in the United States", Meteorological Applications, Vol. 22, pp. 592-598, Feb. 2015. [https://doi.org/10.1002/met.1491]
[15] J. G. Lee, "R program recipes for time series data analysis", Slow & steady, Seoul, Korea, May 2017.
[16] Y. M. Oh, D. O. Choi, and C. Yu, "A study on the development of rice price prediction model using Artificial Neural Network and its implications for international trade", The e-Business Studies, Vol. 24, No. 3, pp. 93-102, Jun. 2023.
[17] K. Bae and C. Kim, "An agricultural estimate price model of artificial neural network by optimizing hidden layer", Journal of Korean Institute of Information Technology, Vol. 14, No. 12, pp. 161-169, Dec. 2016. [https://doi.org/10.14801/jkiit.2016.14.12.161]
[18] S. Shin, M. Lee, and S. K. Song, "A prediction model for agricultural products price with LSTM Network", The Journal of the Korea Contents Association, Vol. 18, No. 11, pp. 416-429, Nov. 2018. [https://doi.org/10.5392/JKCA.2018.18.11.416]
[19] S. Yun, C. Lee, and S. Yang, "Development of price forecast models for international grains using Artificial Neural Networks", The Korean Journal of Agricultural Economics, Vol. 57, No. 2, pp. 83-101, Jun. 2016.
[20] S. M. Hong, "A study on price stabilization of domestic onion; A focused on the causality analysis of production area price and price prediction using deep learning", Master's Thesis, Kangwon National University, Aug. 2022.
[21] I. Gridin, "Time Series Forecasting using Deep Learning", BPB Publications, Oct. 2021.
[22] G. Rafferty, "Forecasting Time Series Data with Prophet", Packt Publishing, Birmingham, Mar. 2023.
[23] O. Triebe, H. Hewamalage, P. Pilyugina, N. Laptev, C. Bergmeir, and R. Rajagopal, "Neuralprophet: Explainable forecasting at scale", Nov. 2021. [https://doi.org/10.48550/arXiv.2111.15397]
[24] Seoul Agro-Fisheries & Food Corporation, http://www.garak.co.kr, [accessed: Nov. 14, 2023]
[25] H. Lee, S. Ji, and T. Suh, "Monthly Hanwoo supply and forecasting models", Korean Journal of Agricultural Science, Vol. 48, No. 4, pp. 797-806, Oct. 2021. [https://doi.org/10.7744/kjoas.20210067]
[26] M. V. Shcherbakov, et al., "A survey of forecast error measures", World Applied Sciences Journal, Vol. 24, No. 24, pp. 171-176, Sep. 2013.
[27] S. Wang, J. Feng, and G. Liu, "Application of seasonal time series model in the precipitation forecast", Mathematical and Computer Modelling, Vol. 58, pp. 677-683, Aug. 2013. [https://doi.org/10.1016/j.mcm.2011.10.034]
[28] C. Satrio, W. Darmawan, B. Nadia, and N. Hanafiah, "Time series analysis and forecasting of coronavirus disease in Indonesia using ARIMA model and PROPHET", Procedia Computer Science, Vol. 179, pp. 524-532, Feb. 2021. [https://doi.org/10.1016/j.procs.2021.01.036]
2016 : MA in Statistics, Chonnam National University
2019 ~ Present : Researcher, Korea Rural Economic Institute
Research interests : Machine Learning, Deep Learning, and Agricultural Price Forecasting
2019 : PhD in Agricultural and Resource Economics, University of Connecticut
2020 ~ 2023 : Research Fellow, Korea Rural Economic Institute
2023 ~ Present : Professor, Department of Food and Resource Economics, Pusan National University
Research interests : Distribution, and Price Forecasting