IDL - International Digital Library Of Technology & Research Volume 1, Issue 6, June 2017
Available at: www.dbpublications.org
International e-Journal For Technology And Research-2017
HQR Framework Optimization for Predicting Patient Treatment Time in Big Data
PRATEEKSHA S KULKARNI
Co-Guide: Shanthi M B
Computer Science and Engineering, CMRIT Bengaluru
Email: 1kulkarniprateeksha51@gmail.com
Contact Number: +91-8553926003
Abstract: Today, most hospitals are overcrowded, and patients face long queues for different treatment tasks. Hospital management finds it difficult to handle these patients and to provide an optimal treatment time for each patient waiting in a queue. Unnecessary and annoying waits for long periods waste substantial human resources and time and increase the frustration endured by patients. It would be convenient and preferable if patients could receive the most efficient treatment plan and see the predicted waiting time updated in real time. Because of the large-scale, realistic data set and the requirement for real-time response, the Patient Treatment Time Prediction (PTTP) algorithm and the Hospital Queuing-Recommendation (HQR) system must be efficient and respond with low latency. Extensive experimentation and simulation results demonstrate the effectiveness and applicability of the proposed model in recommending an effective and convenient treatment plan that minimizes patients' waiting times in hospitals.
Keywords: Apache Spark, Hospital queuing recommendation, Big data, Cloud computing, Patient treatment time prediction, Classification and regression tree.
1. INTRODUCTION
Today, most hospitals are overcrowded with long queues of patients, and patient queues are managed ineffectively. Managing patient queues and predicting their waiting times is a complicated and difficult job. Each patient who comes for a checkup may need to go through several operations during treatment, such as a checkup and various tests, for example a blood test, X-rays, a CT scan, an MR scan, or payment. We refer to each of these as a treatment task, that is, a task to be performed for an individual patient. A patient is usually required to undergo examinations, inspections, or tests according to his or her condition. Some of these tasks are independent, whereas others depend on one another and must wait for the completion of the tasks they depend on. Most people who go for a checkup must wait in queues for unpredictable but long periods before their turn comes to complete each checkup and treatment task.
The main focus of this work is to help patients complete their treatment tasks in a predictable and optimal time and to help hospitals schedule each treatment task queue so as to avoid overcrowded and ineffective queues. We use training data from different hospitals to develop a patient treatment time model that estimates the average time required for each treatment task. To this end, we retrieve patient data gathered from different hospitals and consider a few important parameters: the start time and end time of each treatment task, the patient's age, and any other detailed treatment data for each task that is required to calculate the optimal time. Given the real-time requirements, the large volume of data, and the complexity of the system, we implement the treatment time model algorithm and the hospital queuing system in a big data environment. The algorithm is based on a treatment time model and the Random Forest (RF) method: for each task performed during a patient's visit, the waiting time is analyzed and the average time required for that task is predicted. The hospital queuing recommendation then defines a convenient treatment plan for each patient and task. Patients can check their treatment plan and the predicted waiting time in real time using a mobile application. Extensive experimental results show that the time prediction algorithm and the Random Forest implementation are effective and efficient.
Fig 2.1: Architecture of the HQR system
2. EXPERIMENTAL DETAILS
2.1. Problem Statement
Most of the data in hospitals are unstructured, massive, and high dimensional. Every day, hospitals produce a huge amount of business data containing a great deal of information about individual patients, such as medication data, the doctor's name, and other detailed information. The time consumed by the treatment tasks in each department does not necessarily lie in the same range; it varies with the content of the task, the circumstances, the time period, and the condition of the patient. For example, a CT scan generally takes longer for an elderly patient than for a young one. Hospital queuing recommendation and management also impose strict time requirements, so the execution speed of the HQR and PTTP models is critical. The realistic patient data collected from various hospitals are analyzed carefully and rigorously on the basis of important parameters such as treatment start time, end time, patient age, and the detailed treatment content of each task. We identify and calculate the waiting times of different patients from the operations performed during their treatment. We use the RF algorithm to train a model of patient treatment time consumption on both patient and time characteristics and thereby build the PTTP model. The overall logical structure of the project is divided into processing modules, and its conceptual data flow is shown in the architecture diagram in Figure 2.1.
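As a minimal sketch of how the time consumption of a treatment record can be derived from the parameters named above, the following Python fragment computes the duration in minutes, the day of the week, and the hour of the day from the start and end times. The record layout, the field names, and the sample values are assumptions made for illustration and are not taken from the hospital data. Records whose end time precedes the start time are skipped as inconsistent.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"  # assumed timestamp format

# Hypothetical treatment records; all values are invented for illustration.
RECORDS = [
    {"patient_id": "p1", "age": 72, "task": "CT scan",
     "start": "2017-06-05 09:10", "end": "2017-06-05 09:40"},
    {"patient_id": "p2", "age": 25, "task": "CT scan",
     "start": "2017-06-05 09:45", "end": "2017-06-05 10:00"},
    # End time before start time: an inconsistent record.
    {"patient_id": "p3", "age": 40, "task": "Blood test",
     "start": "2017-06-05 10:30", "end": "2017-06-05 10:20"},
]


def derive_features(record):
    """Return the record extended with derived time features,
    or None when the end time precedes the start time."""
    start = datetime.strptime(record["start"], FMT)
    end = datetime.strptime(record["end"], FMT)
    minutes = (end - start).total_seconds() / 60
    if minutes < 0:
        return None
    return {
        **record,
        "minutes": minutes,          # time consumption of the task
        "weekday": start.weekday(),  # 0 = Monday ... 6 = Sunday
        "hour": start.hour,          # coarse time range of the treatment
    }


# Keep only the records with a consistent time consumption.
features = [f for f in map(derive_features, RECORDS) if f is not None]
```

In this invented sample the elderly patient's CT scan takes 30 minutes and the young patient's takes 15, which mirrors the example in the problem statement.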
2.2. Data Pre-processing
In the pre-processing phase, hospital treatment data from different treatment tasks are gathered. A substantial number of patients visit each hospital every day, and we collect data from different hospitals to analyze the treatment time required for each task. Let S be the set of patients in a hospital, where a registered patient and his or her information are represented by si. Assume that there are N patients in S:
S = {s1, s2, ..., sN},
where each patient si has fixed attributes such as name, ID, gender, age, and address. Some of these attributes are used in our analysis, whereas others are not.
Table 1: Example of treatment records
Each patient can visit multiple treatment tasks depending on his or her health condition. Let X|si be the set of treatment tasks for patient si during a specific visit:
X|si = {x1, x2, ..., xK},
where each task record xi consists of multiple fields Y, e.g., task name, task location, department, start time, end time, doctor, and attending staff:
Y|xi = {y1, y2, ..., yM},
where yj is a feature variable of the record of treatment task xi. The collected records, as shown in Table 1, are used for calculating the average treatment time.
2.3. Workflow of the data pre-processing
The data pre-processing workflow consists of the following steps:
a: Collect data from different treatment tasks. According to statistics, a medium-sized hospital sees between 8,000 and 12,000 patients per day, and the number of treatment records ranges from 120,000 to 200,000. These data are gathered from different treatment tasks and include all the information related to each task.
b: Choose the same dimensions of the data. The hospital treatment data generated by different treatment tasks have different fields, contents, and formats, and therefore different dimensions. To train the time consumption model for each task, we choose the same features from all of these data: the patient information (patient ID, gender, age, etc.), the treatment task information (task name, department name, doctor name, etc.), and the time information (start time and end time). Other features, such as patient name and address, are ignored because they are of little use to the PTTP algorithm.
c: Calculate new feature variables of the data. Before the chosen data are used to train the PTTP model, several new features are calculated, such as the time consumption of each treatment record, the day of the week of the treatment, and the time range of the treatment.
The workflow of the patient wait and treatment model is illustrated in Figure 2.2, which shows the task flow between different patients.
Fig 2.2: Flow diagram of the patient wait and treatment model
Consider three patients (Patient1, Patient2, and Patient3) and a set of treatment tasks required for each patient. Some tasks depend on a previous one, e.g., surgery or bandaging cannot be done before the X-ray. Tasks {A, B, D} are required for Patient1, where task D must wait for the completion of B. Tasks {E, B, C, A} are required for Patient2, and tasks {D, E, C} are required for Patient3. Moreover, a different number of patients is waiting in the queue of each task, for example, 7 patients in the queue of task A and 5 patients in the queue of task B. In this paper, a Patient Treatment Time Prediction (PTTP) model is trained on hospitals' historical data. The treatment time of each patient in the current waiting queue is estimated by the trained PTTP model, and the waiting time of each treatment task is predicted as the sum of these estimates over all patients in its queue. In this way the total waiting time of each task at the current time can be predicted, for example {TA = 35 min; TB = 30 min; TC = 70 min; TD = 24 min; TE = 87 min}. Then, according to each patient's requested treatment tasks, a Hospital Queuing-Recommendation (HQR) system recommends an efficient and convenient treatment plan with the least waiting time: the tasks of each patient are sorted in ascending order of waiting time, except for dependent tasks, which must follow the tasks they depend on.
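The recommendation step in this example can be sketched in a few lines of Python. This is only one possible reading of the ordering rule, not the HQR implementation itself: tasks are picked greedily by ascending predicted waiting time, and a dependent task becomes eligible only once its prerequisite has been placed in the plan. The function names, the per-patient times in the queue of task B, and the assumption that waiting times stay fixed while the plan is built are all illustrative.

```python
# Predicted waiting time of each task queue in minutes (values from the example).
WAIT = {"A": 35, "B": 30, "C": 70, "D": 24, "E": 87}


def queue_waiting_time(predicted_minutes):
    """Waiting time of one task queue: the sum of the predicted
    treatment times of the patients currently in that queue."""
    return sum(predicted_minutes)


def recommend(tasks, wait, depends_on=None):
    """Order tasks by ascending waiting time while keeping every
    dependent task after the tasks it depends on."""
    depends_on = depends_on or {}
    remaining = set(tasks)
    plan = []
    while remaining:
        # A task is ready when none of its prerequisites is still pending.
        ready = [t for t in remaining
                 if not (set(depends_on.get(t, ())) & remaining)]
        if not ready:
            raise ValueError("circular dependency among tasks")
        nxt = min(ready, key=lambda t: (wait[t], t))
        plan.append(nxt)
        remaining.remove(nxt)
    return plan


# Hypothetical predicted times for the 5 patients queued at task B.
wait_b = queue_waiting_time([6, 5, 7, 4, 8])

plan1 = recommend(["A", "B", "D"], WAIT, {"D": ["B"]})  # D must follow B
plan2 = recommend(["E", "B", "C", "A"], WAIT)
plan3 = recommend(["D", "E", "C"], WAIT)
```

Under these assumptions Patient1 is sent to B, then D, then A: task D has the shortest queue but cannot start before B is done. Patient2 receives the order B, A, C, E and Patient3 the order D, C, E.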
2.4 PTTP based on the improved random forest model
In the pre-processing phase, the hospital treatment data from different treatment tasks are gathered, since a substantial number of patients visit each hospital every day. After the new feature variables of the treatment data have been calculated, erroneous data need to be removed. Treatment records with missing values for critical features, such as patient gender, patient age, and task name, are removed as incomplete data. Treatment records with a negative time consumption are removed as inconsistent data; this happens when the recorded end time of a treatment operation is earlier than its start time, which can occur when the start time is recorded by a human and the end time by a machine. Both types of records are considered noisy data. Figure 2.3 represents the PTTP model based on the CART tree, which takes the training data from the dataset as input and computes the splits of the tasks based on age group and task, as described in Algorithm 1. Finally, it computes the average time of each task for a patient.
Algorithm 1: Process of the Random Forest based PTTP algorithm
Input: STrain: the training dataset; k: the number of CART trees in the RF model.
Output: PTTPRF: the PTTP model based on the RF algorithm.
for i = 1 to k do
    create training subset Strain_i ← sampling(STrain);
    create OOB subset SOOB_i ← (STrain - Strain_i);
    create an empty CART tree h_i;
    for each independent variable a_i in Strain_i do
        calculate candidate split points for a_i;
        for each candidate split point v_p do
            calculate the best split point (a_i, v_p) ← arg min (∑ Left + ∑ Right)
        end for
        append node Node(a_i, v_p) to h_i;
        split data for left branch RL(a_i, v_p) ← [x | a_i ≤ v_p]
        split data for right branch RR(a_i, v_p) ← [x | a_i > v_p]
        for each data R in {RL(a_i, v_p), RR(a_i, v_p)} do
            calculate ɸ(v_p(L|R) | a_i) ← max ɸ(v_p, a_i)
            if ɸ(v_p(L|R) | a_i) ≥ ɸ(v_p | a_i) then
                append subnode Node(a_i, v_p(L|R)) to Node(a_i, v_p) as a multi-branch;
                split data into two forks RL and RR
            else
                collect cleaned data for leaf node Dleaf
                calculate mean value of leaf node c ← (1/k) ∑ Dleaf
            end if
        end for
    end for
end for
3. RESULT AND DISCUSSION
The following snapshots and graphs show the outputs obtained after step-by-step execution of the proposed application when a new patient uses the service to check availability and book an appointment. The result is displayed on the patient's output screen together with the optimal time calculated by the above procedures. Figure 3.1 shows the time details, including the start time and end time of each task and the doctor's name. In the doctor's login, the doctor can view the list of patients who have requested an appointment with that doctor.
Fig 3.1: The test result of the above model displaying the time for each patient for each task.
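The times reported in these results come from the model trained by Algorithm 1. As a rough illustration of that procedure, the Python sketch below grows plain binary regression trees on bootstrap samples and averages their predictions. It is a simplified stand-in under stated assumptions, not the PTTP implementation: the field names (age, hour, minutes) and the synthetic data are hypothetical, the split criterion is taken to be the summed squared error of the left and right parts, the multi-branch step and noise removal of Algorithm 1 are omitted, the OOB subset is not used, and the trees are combined by a simple unweighted mean.

```python
import random
from statistics import mean

TARGET = "minutes"  # hypothetical name of the time-consumption field


def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, feature):
    """Best (cost, feature, threshold) for one feature, or None if it is constant."""
    best = None
    for v in sorted({r[feature] for r in rows})[:-1]:
        left = [r[TARGET] for r in rows if r[feature] <= v]
        right = [r[TARGET] for r in rows if r[feature] > v]
        cost = sse(left) + sse(right)
        if best is None or cost < best[0]:
            best = (cost, feature, v)
    return best


def grow(rows, features, min_leaf=2):
    """Grow a regression tree; a leaf stores the mean time of its rows."""
    splits = [best_split(rows, f) for f in features]
    candidates = [s for s in splits if s is not None]
    if not candidates:
        return mean(r[TARGET] for r in rows)
    _, feature, v = min(candidates)
    left = [r for r in rows if r[feature] <= v]
    right = [r for r in rows if r[feature] > v]
    if len(left) < min_leaf or len(right) < min_leaf:
        return mean(r[TARGET] for r in rows)
    return (feature, v,
            grow(left, features, min_leaf), grow(right, features, min_leaf))


def predict_tree(tree, row):
    """Follow the splits down to a leaf and return its mean time."""
    while isinstance(tree, tuple):
        feature, v, left, right = tree
        tree = left if row[feature] <= v else right
    return tree


def train_forest(rows, features, k, seed=0):
    """Train k trees, each on a bootstrap sample of the training rows."""
    rng = random.Random(seed)
    return [grow([rng.choice(rows) for _ in rows], features) for _ in range(k)]


def predict(forest, row):
    """Average the predictions of all trees."""
    return mean(predict_tree(t, row) for t in forest)


# Synthetic data: patients aged 60 or more take 30 minutes, others 15.
ROWS = [{"age": a, "hour": 9 + a % 3, TARGET: 30.0 if a >= 60 else 15.0}
        for a in range(20, 80)]
model = train_forest(ROWS, ["age", "hour"], k=5)
```

Calling predict(model, {"age": 70, "hour": 9}) returns the estimated time for one patient; summing such estimates over a queue gives the waiting time of the corresponding task.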
Fig 3.2: The appointment list in the doctor login
The doctor can log in to the application and check the list of patients who have requested a visit, as shown in Figure 3.2.
Fig 3.3: Graph of the average time vs. patient age
Figure 3.3 shows graphs of the average time against the age of the patient, from which we can analyze the minimum average time required for each task requested by the patient when booking the appointment.
CONCLUSIONS
A hospital queuing treatment plan using the PTTP algorithm in a big data environment has been presented in this project. A random forest technique is used by the patient treatment time prediction algorithm to produce the optimal result. The proposed system produces the optimal time for different tasks together with a more efficient and convenient plan for the patients.