e-ISSN: 2582-5208 International Research Journal of Modernization in Engineering Technology and Science Volume:02/Issue:09/September-2020
Impact Factor- 5.354
www.irjmets.com
STOCK MARKET PREDICTION USING VARIOUS MACHINE LEARNING AND DEEP LEARNING TECHNIQUES
Shailesh Kolap*1, Ruturaj Patil*2, Ajay Patil*3, Omkar Kalyani*4, Vishal Sinhasan*5
*1,2,3,4,5 Student, Department of Computer Science and Engineering, Sharad Institute of Technology, Yadrav, Maharashtra, India.
ABSTRACT
The aim of this project is to explore ways of predicting future stock returns from past returns and numerical news indicators, so that a portfolio of multiple stocks can be built to diversify risk. We do this by applying supervised learning methods to predict prices, working with historical data on the stock of a publicly listed company. We use a mixture of machine learning algorithms to predict the company's stock value, beginning with straightforward baseline algorithms and moving on to advanced techniques such as Auto ARIMA and LSTM.
Keywords: Auto ARIMA, LSTM.
I. INTRODUCTION
Predicting how the stock market will perform is one of the most difficult forecasting problems. Many factors are involved: physical as well as psychological factors, rational and irrational behavior, and so on. Together these factors make share prices volatile and very difficult to predict with a high degree of accuracy. Machine learning can be a game changer in this domain. Using features such as recent announcements about an organization and its quarterly revenue results, machine learning methods have the potential to uncover patterns and insights that were not visible before, and these can be used to make more accurate predictions. The stock market is highly volatile and there are many complex financial indicators. Advances in technology, however, provide an opportunity to earn steadier returns from the stock market and can help experts identify the most informative indicators for making better predictions. Predicting market value is very important for maximizing the profit of a stock purchase while keeping risk low.
The next section describes the methodology and explains each step in detail. After that, we present graphical representations of our results and discuss them. Finally, we describe the scope of the project and discuss how the work could be extended to obtain better results.
II. METHODOLOGY
This section gives a detailed analysis of each process involved in the project. Each subsection corresponds to one phase of the project.
A. Data Preprocessing
The pre-processing stage involves:
Data discretization: a part of data reduction that is of particular importance for numerical data.
Data transformation: normalization.
Data cleaning: filling in missing values.
Data integration: integration of data files.
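The four steps listed above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the pipeline used in the project: forward-filling as the cleaning rule, min-max scaling as the normalization, equal-width bins as the discretization, and the function names themselves are all hypothetical choices that the text does not specify.

```python
def fill_missing(values):
    """Data cleaning: forward-fill missing entries (None).

    Assumption: gaps take the last known value; a leading gap takes
    the first known value.
    """
    last = next(v for v in values if v is not None)
    filled = []
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def min_max_normalize(values):
    """Data transformation: rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def discretize(normalized, n_bins):
    """Data discretization: map values in [0, 1] to equal-width bin indices."""
    return [min(int(v * n_bins), n_bins - 1) for v in normalized]


def integrate(*tables):
    """Data integration: join per-date records from several sources.

    Each table is a dict keyed by date; only dates present in every
    table are kept.
    """
    common = set(tables[0]).intersection(*tables[1:])
    return {date: tuple(t[date] for t in tables) for date in sorted(common)}


closes = [None, 100.0, None, 110.0, 120.0]
filled = fill_missing(closes)        # [100.0, 100.0, 100.0, 110.0, 120.0]
scaled = min_max_normalize(filled)   # [0.0, 0.0, 0.0, 0.5, 1.0]
bins = discretize(scaled, 4)         # [0, 0, 0, 2, 3]
```

Forward-filling is used here because it never looks at future prices, which matters for time-ordered data; other imputation rules would fit the same structure.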
After the data set was transformed into a clean data set, it was split into training and testing sets for evaluation. Here, the training values are taken to be the more recent values. The test data makes up 5-10 percent of the complete data set.
B. Feature Selection and Feature Generation
We created new features from the base features that describe the data better, such as a 50-day moving average and the previous-day difference. To prune the less useful features, the feature selection step keeps the k features with the highest scores, where each score comes from a linear model that tests the individual effect of each of the many regressors, one at a time. We used the SelectKBest algorithm with f_regression as the scoring function. Additionally, we added a daily Twitter sentiment score as a feature for each company, based on users' tweets about that company and tweets on that company's page.
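The feature generation and selection steps can be illustrated with a dependency-free sketch. It is not the project's code: in practice scikit-learn's SelectKBest with f_regression, which the text names, would be the natural choice. The hand-written `f_score` below recomputes the same kind of univariate F-statistic (from the squared correlation between one feature and the target) so that the ranking logic is visible. All function names are hypothetical.

```python
import math


def moving_average(prices, window):
    """Trailing moving average; positions without a full window are None."""
    averages = []
    for i in range(len(prices)):
        if i + 1 < window:
            averages.append(None)
        else:
            averages.append(sum(prices[i + 1 - window:i + 1]) / window)
    return averages


def previous_day_difference(prices):
    """Change from the previous day's price; the first day has no predecessor."""
    return [None] + [prices[i] - prices[i - 1] for i in range(1, len(prices))]


def f_score(feature, target):
    """Univariate F-statistic for a simple linear regression of target on feature."""
    n = len(feature)
    mean_x, mean_y = sum(feature) / n, sum(target) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(feature, target))
    sxx = sum((x - mean_x) ** 2 for x in feature)
    syy = sum((y - mean_y) ** 2 for y in target)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no linear signal
    r2 = sxy * sxy / (sxx * syy)
    if r2 >= 1.0:
        return math.inf
    return r2 / (1.0 - r2) * (n - 2)


def select_k_best(features, target, k):
    """Return the names of the k features with the highest F-score."""
    ranked = sorted(features, key=lambda name: f_score(features[name], target),
                    reverse=True)
    return ranked[:k]


# Toy data; a window of 3 stands in for the 50-day window used in the text.
prices = [10.0, 11.0, 12.0, 11.0, 13.0, 14.0]
ma3 = moving_average(prices, 3)
diff = previous_day_difference(prices)
```

Rows where a generated feature is still None (the first `window - 1` days) would need to be dropped before scoring, since the F-statistic assumes complete columns.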
III. ANALYSIS
To evaluate the effectiveness of the system, we use the Root Mean Square Error (RMSE) and the R-squared (R^2) score.
A. Root Mean Square Error (RMSE)
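RMSE is the square root of the mean of the squared errors. It is very commonly used and serves as a good general-purpose error metric for numerical predictions. Compared with the similar Mean Absolute Error, RMSE amplifies and severely penalizes large errors. RMSE Value Calculation:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
where $n$ is the number of observations, $y_i$ is the actual value and $\hat{y}_i$ is the predicted value.
Fig. 1: RMSE Value Calculation
B. R-Squared Value (R^2 value)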
The value of R^2 typically lies between 0 and 1; the higher the value, the better the regression model, since more of the variability is explained by the model. R^2 indicates the proportion of the variability of the response variable that is explained by the independent variables. R-squared is a statistical measure of how close the data are to the fitted regression line. It is also known as the coefficient of determination, or the coefficient of multiple determination for multiple regression.
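Both metrics from this section can be computed directly from their standard definitions. The sketch below is illustrative only; the function names and the toy prices are assumptions, and a Mean Absolute Error helper is included solely to show how RMSE weighs large errors more heavily.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared residual."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mae(actual, predicted):
    """Mean absolute error, for comparison with RMSE."""
    n = len(actual)
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / n


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the actual values are constant")
    return 1.0 - ss_res / ss_tot


actual = [100.0, 102.0, 104.0, 106.0]
predicted = [101.0, 101.0, 105.0, 108.0]

print(rmse(actual, predicted))       # about 1.32
print(mae(actual, predicted))        # 1.25
print(r_squared(actual, predicted))  # about 0.65
```

The single two-point miss on the last day pulls RMSE above MAE, which is the penalizing effect described above. Note that on held-out data a poorly fitting model can yield a negative R^2 under this definition, since its residuals may exceed those of simply predicting the mean.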
IV. GRAPHICAL REPRESENTATION
Fig. 2: Comparison Graphs of RMSE Value
Fig. 3: Comparison Graphs of R-squared Value
V. CONCLUSION
Based on the results obtained, the Gradient Boosting Regressor is found to perform best. It is followed by the Bagging Regressor, Random Forest Regressor, AdaBoost Regressor and K-Neighbors Regressor. The Bagging Regressor performs well because bagging (bootstrap aggregating) relies on the fact that combining many independent base learners significantly reduces the error. Therefore, we want to produce as many independent base learners as possible. Each base learner is built by sampling the original data set with replacement. From the results, it also appears that additional hidden layers improve the model's score. Random Forest is an extension of bagging in which the main difference is the incorporation of randomized feature selection.
Time series forecasting is a very interesting field to work in, as we found in the course of this project. There is a perception in the community that it is a complex field, and while there is a grain of truth in that, it is not so difficult once you get the hang of the basic techniques.
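The bagging idea described in this section, where each base learner is trained on a sample drawn with replacement and the predictions are averaged, can be sketched as follows. This is a minimal illustration, not the Bagging Regressor evaluated in the project: the base learner here is a one-variable least-squares line chosen for brevity, and the function names, seed and toy data are hypothetical.

```python
import random


def fit_line(xs, ys):
    """Least-squares line y = a + b*x; a constant model if all xs coincide."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        return mean_y, 0.0
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return mean_y - b * mean_x, b


def bagging_fit(xs, ys, n_learners, seed=0):
    """Train n_learners base learners, each on its own bootstrap sample."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_learners):
        # Bootstrap: draw n indices with replacement from the original data.
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Aggregate by averaging the base learners' predictions."""
    return sum(a + b * x for a, b in models) / len(models)


days = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
prices = [10.2, 11.1, 11.9, 13.2, 13.8, 15.1]
models = bagging_fit(days, prices, n_learners=25, seed=42)
forecast = bagging_predict(models, 7.0)
```

Averaging reduces variance most when the base learners disagree with one another, which is why Random Forest adds random feature selection on top of the bootstrap sampling.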
ACKNOWLEDGMENT
We would like to thank Dr. A. V. Turukmane for mentoring our project, introducing us to new state-of-the-art technologies and helping us at every stage of this project. The success and final outcome of this project required a lot of guidance and help from many people, and we are very privileged to have received it through to the completion of our project. All that we have done is thanks to such supervision and assistance, and we would not forget to thank them.