
Imperial Journal of Interdisciplinary Research (IJIR) Vol-3, Issue-2, 2017 ISSN: 2454-1362, http://www.onlinejournal.in

Survey on a Secure and Dynamic Multi-Keyword Ranked Search Scheme over Encrypted Cloud Data

Dhanasree K. S., Anju K., Vincy Rajan
Department of Computer Science, Rajiv Gandhi Institute of Technology, Kottayam

Abstract: Cloud computing means storing and accessing data and programs over the internet instead of on a local computer's hard drive. Due to the increasing popularity of cloud computing, more and more data owners are motivated to outsource their data to cloud servers for greater convenience and reduced cost in data management. However, sensitive data should be encrypted before outsourcing to meet privacy requirements, which rules out conventional data utilization such as keyword-based document retrieval over plaintext. The scheme under study is a secure multi-keyword ranked search scheme over encrypted cloud data that simultaneously supports dynamic update operations such as deletion and insertion of documents. This survey covers the techniques and approaches used for secure and dynamic multi-keyword ranked search over encrypted cloud data.

1. Introduction

Cloud computing has been considered a new model of enterprise IT infrastructure, which can organize huge resources of computing, storage and applications, and enable users to enjoy ubiquitous, convenient and on-demand network access to a shared pool of configurable computing resources with great efficiency and minimal economic overhead [3]. Attracted by these appealing features, both individuals and enterprises are motivated to outsource their data to the cloud, instead of purchasing software and hardware to manage the data themselves. Despite the various advantages of cloud services, outsourcing sensitive information such as emails, personal health records, company finance data and government documents to remote servers raises privacy concerns. The cloud service providers (CSPs) that keep the data for users may access users' sensitive information without authorization. A general approach to protecting data confidentiality is to encrypt the data before outsourcing [4]. This paper presents a survey of secure and dynamic multi-keyword ranked search schemes over encrypted cloud data. Here we discuss five different methods used for this system.

The first method, similarity search over encrypted data [1], proposes an efficient scheme for similarity search over encrypted data. To do so, it uses locality-sensitive hashing, a state-of-the-art algorithm for fast near-neighbour search in high-dimensional spaces. To ensure the confidentiality of the sensitive data, a rigorous security definition is provided and the security of the proposed scheme is proved under that definition. The second method is practical and secure multi-keyword search on encrypted cloud data [2]. A multi-keyword search incorporates a conjunction of several keywords in a single query. As the search constraints increase, only the most relevant items are returned to the user, which reduces the computation burden on users. This work therefore proposes a novel, secure and efficient multi-keyword search method that returns the matching data items in ranked order.

The third method [5] proposes schemes to deal with secure ranked multi-keyword search in a multi-owner model. Unlike prior works, these schemes enable authorized data users to achieve secure, convenient and efficient search over multiple data owners' data. To enable the cloud server to perform secure search among multiple owners' data encrypted with different secret keys, a novel secure search protocol is constructed. To rank the search results and preserve the privacy of relevance scores between keywords and files, a novel Additive Order and Privacy Preserving Function family is proposed. The next method [6] focuses on addressing data privacy issues using searchable symmetric encryption (SSE). For the first time, the privacy issue is formulated from the aspect of similarity relevance. Server-side ranking based on order-preserving encryption (OPE) inevitably leaks data privacy. To eliminate the leakage, a two-round searchable encryption (TRSE) scheme is proposed that supports top-k multi-keyword retrieval. TRSE employs a vector space model and homomorphic encryption. The vector space model helps to provide sufficient search accuracy, and the homomorphic encryption enables users to take part in the ranking while the majority of the computing work is done on the



server side by operations only on ciphertext. As a result, information leakage can be eliminated and data security is ensured. The last method is Privacy Preserving Multi-Keyword Ranked Search (MRSE) [7]. This method defines and solves the challenging problem of privacy-preserving multi-keyword ranked search over encrypted data in cloud computing. A set of strict privacy requirements is established for such a secure cloud data utilization system. Among various multi-keyword semantics, the efficient similarity measure of "coordinate matching", i.e., as many matches as possible, is chosen to capture the relevance of data documents to the search query, and "inner product similarity" is used to quantitatively evaluate this measure. A basic idea for MRSE is first proposed based on secure inner product computation, and two significantly improved MRSE schemes are then given to achieve various stringent privacy requirements in two different threat models.

2. Multi-Keyword Ranked Search Scheme Techniques

Here we discuss five methods used for multi-keyword ranked search schemes: Similarity Search over Encrypted Data, Practical and Secure Multi-Keyword Search on Encrypted Cloud Data, Secure Ranked Multi-Keyword Search for Multiple Data Owners, Secure Multi-keyword Top-k Retrieval Search, and Privacy Preserving Multi-Keyword Ranked Search (MRSE).

2.1 Similarity Search over Encrypted Data

A similarity search problem consists of a collection of data items that are characterized by some features, a query that specifies a value for a particular feature, and a similarity metric to measure the relevance between the query and the data items. The goal is to retrieve the items whose similarity to the specified query is greater than a predetermined threshold under the chosen metric. Although searchable encryption methods based on exact matching are not suitable for this goal, there are some sophisticated cryptographic techniques that enable similarity search over encrypted data [8], [9].
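The problem statement above can be illustrated on plaintext data. The following is a minimal sketch, assuming Jaccard similarity over keyword sets as the metric; the metric, the function names and the sample data are illustrative assumptions and are not taken from [1], and no encryption is involved.

```python
def jaccard(a, b):
    """Jaccard similarity between two feature sets (assumed metric)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def similarity_search(items, query, threshold):
    """Return (name, score) pairs whose similarity to the query
    exceeds the threshold, most similar first."""
    hits = []
    for name, features in items.items():
        score = jaccard(features, query)
        if score > threshold:
            hits.append((name, score))
    return sorted(hits, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical collection: each item is characterized by a set of keywords.
items = {
    "doc1": {"cloud", "search", "encrypted"},
    "doc2": {"cloud", "storage"},
    "doc3": {"weather", "report"},
}
results = similarity_search(items, {"cloud", "search"}, threshold=0.3)
print(results)
```

Here doc1 and doc2 exceed the threshold and doc3 does not. A secure scheme has to produce the same kind of answer without showing the server the plaintext features or the query.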

Fig 1. Basic Secure Search Protocol

An overview of similarity search is presented in Fig. 1. Locality-sensitive hashing (LSH) reduces the dimensionality of high-dimensional data. LSH hashes input items so that similar items map to the same “buckets” with high probability (the number of buckets being much smaller than the universe of possible input items). LSH differs from conventional and cryptographic hash functions because it aims to maximize, rather than minimize, the probability of a “collision” for similar items. Locality-sensitive hashing has much in common with data clustering and nearest neighbor search.
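The bucket behaviour described above can be sketched with one well-known LSH family, random hyperplane hashing for cosine similarity. This family, the parameter values and the helper names are assumptions chosen for illustration; the scheme in [1] may use a different family, and the sketch works on plaintext vectors only.

```python
import random


def make_hyperplanes(num_planes, dim, seed=0):
    """Draw random hyperplane normals; each one contributes one bucket bit."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]


def lsh_bucket(vector, planes):
    """Bucket id = the side of each hyperplane on which the vector falls."""
    bits = []
    for plane in planes:
        dot = sum(p * v for p, v in zip(plane, vector))
        bits.append("1" if dot >= 0 else "0")
    return "".join(bits)


planes = make_hyperplanes(num_planes=8, dim=4)
v = [1.0, 2.0, 0.5, 3.0]
scaled = [2.0 * x for x in v]      # same direction as v
opposite = [-x for x in v]         # opposite direction

bucket_v = lsh_bucket(v, planes)
bucket_scaled = lsh_bucket(scaled, planes)
bucket_opposite = lsh_bucket(opposite, planes)
print(bucket_v, bucket_scaled, bucket_opposite)
```

Vectors pointing in the same direction always share a bucket, and vectors pointing in opposite directions land in different ones. For vectors that are merely close, a collision is probable rather than guaranteed, which is why practical schemes usually combine several hash tables.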

2.2 Practical and Secure Multi-Keyword Search on Encrypted Cloud Data:

The method in [2] is an efficient privacy-preserving search method over encrypted cloud data that utilizes minhash functions. One of its main advantages is the capability of multi-keyword search in a single query. The method is proved to satisfy an adaptive semantic security definition.

Minhashing: Each document is represented by a small set called a signature. The important property of signatures is that it should be possible to compare two signatures and estimate a distance between the underlying sets without any other information. Although the exact similarity cannot be deduced from the signatures, they still provide a good approximation. Moreover, the accuracy of the estimated similarity increases as larger signatures are used. The signatures are composed of several elements, each of which is constructed using minhash functions [11].

This method assumes that the data owner does not have sufficient resources or is unwilling to store the whole database. The owner outsources the data to an untrusted, semi-honest server but maintains the ability to search without revealing anything except the access and search patterns. The data owner encrypts the sensitive documents to be outsourced and generates a secure searchable index using the features of these documents. In an offline stage, both the searchable index and the encrypted documents are outsourced to a semi-honest cloud. Using the searchable index, authorized users can search on the cloud and receive the encrypted documents that match their queries. During this process, the cloud server should not learn anything other than what the data owner allows to leak. Finally, the user decrypts the retrieved documents using the decryption key.

2.4 Secure Ranked Multi-Keyword Search for Multiple Data Owners:

The work in [5] proposes schemes to deal with secure ranked multi-keyword search in a multi-owner


model. To enable cloud servers to perform secure search without knowing the actual data of both keywords and trapdoors, a novel secure search protocol is constructed.

Fig 2. Architecture of secure keyword search in multi-owner multi-user cloud environment.

In the multi-owner and multi-user cloud computing model, three entities are involved, as illustrated in Fig. 2: data owners, the cloud server and data users. Data owners have a collection of files F. To enable efficient search over these files, which will be encrypted, data owners first build a secure searchable index I on the keyword set W extracted from F. Then data owners encrypt their files F to obtain the corresponding encrypted files C. Finally, both the encrypted files and the indexes are outsourced to the cloud server. When an authorized user wants to search t keywords over the encrypted files stored on the cloud server, the user first computes the trapdoor T and sends it to the cloud. Upon receiving the trapdoor T, the cloud server searches the encrypted index I of each data owner and returns the corresponding set of encrypted files. To improve file retrieval accuracy and save communication cost, the data user sends the cloud server a parameter k, and the cloud server returns the top-k relevant files. Once the data user receives the top-k encrypted files from the cloud server, the user decrypts them.

2.5 Secure Multi-keyword Top-k Retrieval Search:

Yu et al. [6] proposed this search using two-round searchable encryption (TRSE). In the first round, the user submits a multi-keyword request REQ, drawn from the keyword set W, as an encrypted query to protect data and keyword privacy: the user creates the trapdoor Tw from (REQ, PK) and sends it to the cloud server. The cloud server then calculates the scores of the files from the encrypted index and returns the encrypted result vector N to the user. In the second round, the user decrypts N with the secret key, calculates the file ranking, and then requests the files with the top-k scores. Ranking is thus done on the client side, while scoring is done on the server side. Scoring is a natural way to weight relevance. Based on the relevance score,



files can then be ranked in either ascending or descending order. Several models have been proposed in the IR community to score and rank files. Among these, the scheme adopts the most widely used one, tf-idf weighting, which involves two attributes: term frequency and inverse document frequency.

Limitation: Although contraction and confining are used to reduce the ciphertext size, the key size is still too large. The communication overhead will be very high if the encrypted trapdoor is too large. The scheme does not support efficient updates of the searchable index.
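The tf-idf scoring and top-k selection described for TRSE can be sketched on plaintext data as follows. This is a minimal sketch under stated assumptions: the weighting tf × log(N/df) is one common tf-idf variant and may differ from the exact formula in [6]; the function names and sample files are hypothetical; and the homomorphic encryption that lets the server score over ciphertext is omitted entirely.

```python
import math
from collections import Counter


def tf_idf_scores(documents, query_keywords):
    """Score each file for the query (the server-side step in TRSE,
    shown here on plaintext). Assumed weighting: tf * log(N / df)."""
    n_docs = len(documents)
    df = Counter()                      # number of files containing each term
    for words in documents.values():
        for term in set(words):
            df[term] += 1
    scores = {}
    for name, words in documents.items():
        tf = Counter(words)             # term frequency within this file
        score = 0.0
        for term in query_keywords:
            if df[term]:
                score += tf[term] * math.log(n_docs / df[term])
        scores[name] = score
    return scores


def top_k(scores, k):
    """Rank files by descending score (the client-side step in TRSE)."""
    ranked = sorted(scores.items(), key=lambda pair: (-pair[1], pair[0]))
    return [name for name, _ in ranked[:k]]


# Hypothetical files, already tokenized into keywords.
documents = {
    "f1": ["cloud", "search", "cloud", "privacy"],
    "f2": ["cloud", "storage"],
    "f3": ["search", "index", "search", "search", "search"],
    "f4": ["weather", "report"],
}
scores = tf_idf_scores(documents, ["cloud", "search"])
best = top_k(scores, k=2)
print(best)
```

Splitting the work into a scoring function and a ranking function mirrors the two rounds: in TRSE the first runs on the server over encrypted values, and the second runs on the user's side after decryption.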

2.6 Privacy Preserving Multi-Keyword Ranked Search (MRSE):

Considering the large number of data users and documents in the cloud, it is necessary to allow multiple keywords in the search request and return documents in the order of their relevance to these keywords. This system defines and solves the challenging problem of privacy-preserving multi-keyword ranked search over encrypted data in cloud computing (MRSE). A set of strict privacy requirements for such a secure cloud data utilization system is established. As a hybrid of conjunctive search and disjunctive search, "coordinate matching" [12] is an intermediate similarity measure which uses the number of query keywords appearing in the document to quantify the relevance of that document to the query. When users know the exact subset of the data set to be retrieved, Boolean queries perform well with the precise search requirement specified by the user. In cloud computing, however, this is not the practical case, given the huge amount of outsourced data. Therefore, it is more flexible for users to specify a list of keywords indicating their interest and retrieve the most relevant documents in ranked order.
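Coordinate matching as described above counts how many query keywords appear in a document. One way to compute that count, in line with the inner product similarity this survey attributes to MRSE, is to encode documents and queries as binary vectors over a keyword dictionary and take their inner product. The sketch below works on plaintext vectors only; the dictionary, the document names and the helper functions are illustrative assumptions, and the secure inner product computation of [7] is not shown.

```python
def to_binary_vector(keywords, dictionary):
    """1 at position i if the i-th dictionary keyword is present, else 0."""
    present = set(keywords)
    return [1 if word in present else 0 for word in dictionary]


def coordinate_matching(doc_vector, query_vector):
    """Inner product of two binary vectors = number of shared keywords."""
    return sum(d * q for d, q in zip(doc_vector, query_vector))


def rank_documents(doc_keywords, query_keywords, dictionary, k):
    """Return the top-k (name, score) pairs, highest score first."""
    query_vector = to_binary_vector(query_keywords, dictionary)
    scored = []
    for name, keywords in doc_keywords.items():
        doc_vector = to_binary_vector(keywords, dictionary)
        scored.append((name, coordinate_matching(doc_vector, query_vector)))
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


# Hypothetical keyword dictionary and document collection.
dictionary = ["cloud", "encryption", "index", "privacy", "search"]
doc_keywords = {
    "d1": ["cloud", "privacy", "search"],
    "d2": ["cloud", "index"],
    "d3": ["encryption", "index"],
}
ranking = rank_documents(doc_keywords, ["cloud", "search", "privacy"], dictionary, k=2)
print(ranking)
```

Document d1 matches all three query keywords and ranks first, d2 matches one, and d3 matches none, which shows how the measure sits between conjunctive search (all keywords required) and disjunctive search (any keyword suffices).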

Fig. 3 Architecture of the search over encrypted cloud data.

Among various multi-keyword semantics, the efficient similarity measure of "coordinate matching" is chosen, i.e., as many matches as


possible, to capture the relevance of data documents to the search query. "Inner product similarity" is used to quantitatively evaluate this similarity measure. Consider a cloud data hosting service involving three different entities, as illustrated in Fig. 3: the data owner, the data user, and the cloud server. The data owner has a collection of data documents F to be outsourced to the cloud server in the encrypted form C. To enable searching over C for effective data utilization, the data owner, before outsourcing, first builds an encrypted searchable index I from F, and then outsources both the index I and the encrypted document collection C to the cloud server. To search the document collection for t given keywords, an authorized user acquires a corresponding trapdoor T through search control mechanisms. Upon receiving T from a data user, the cloud server is responsible for searching the index I and returning the corresponding set of encrypted documents. To improve document retrieval accuracy, the search result should be ranked by the cloud server according to some ranking criteria (e.g., coordinate matching). Moreover, to reduce the communication cost, the data user may send an optional number k along with the trapdoor T so that the cloud server sends back only the top-k documents that are most relevant to the search query. Finally, the access control mechanism [10] is employed to manage the decryption capabilities given to users, and the data collection can be updated by inserting new documents, updating existing documents, and deleting existing documents.

3. Conclusion

In this paper we have discussed several methods used for secure and dynamic multi-keyword ranked search over encrypted cloud data. Some of the important issues that a search technique must handle to provide both data utilization and security are keyword privacy, data privacy, fine-grained search, scalability, efficiency, index privacy, query privacy, result ranking, index confidentiality, query confidentiality, query unlinkability, semantic security and trapdoor unlinkability. The limitations of the searching techniques mentioned in this paper are discussed as well. From the above survey, we can say that a secure, efficient and dynamic search scheme can be developed that supports not only accurate multi-keyword ranked search but also dynamic deletion and insertion of documents. Different types of features and classification algorithms can be combined in order to overcome their individual drawbacks, benefit from each other's merits, and finally enhance performance.

4. Acknowledgements

This work was supported in part by our seminar guide in this college.

5. References

[1] M. Kuzu, M. S. Islam, and M. Kantarcioglu, "Efficient similarity search over encrypted data," in Proc. IEEE 28th Int. Conf. Data Eng., 2012, pp. 1156-1167.
[2] C. Orencik, M. Kantarcioglu, and E. Savas, "A practical and secure multi-keyword search method over encrypted cloud data," in Proc. IEEE 6th Int. Conf. Cloud Comput., 2013, pp. 390-397.
[3] K. Ren, C. Wang, and Q. Wang, "Security challenges for the public cloud," IEEE Internet Comput., vol. 16, no. 1, pp. 69-73, Jan.-Feb. 2012.
[4] S. Kamara and K. Lauter, "Cryptographic cloud storage," in Proc. Financ. Cryptography Data Secur., 2010, pp. 136-149.
[5] W. Zhang, S. Xiao, Y. Lin, T. Zhou, and S. Zhou, "Secure ranked multi-keyword search for multiple data owners in cloud computing," in Proc. 44th Annu. IEEE/IFIP Int. Conf. Dependable Syst. Networks (DSN), 2014, pp. 276-286.
[6] J. Yu, P. Lu, Y. Zhu, G. Xue, and M. Li, "Toward secure multikeyword top-k retrieval over encrypted cloud data," IEEE Trans. Dependable and Secure Computing, July/Aug. 2013.
[7] N. Cao, C. Wang, M. Li, K. Ren, and W. Lou, "Privacy-preserving multi-keyword ranked search over encrypted cloud data," in Proc. IEEE INFOCOM, Apr. 2011, pp. 829-837.
[8] J. Feigenbaum, Y. Ishai, K. Nissim, M. Strauss, and R. Wright, "Secure multiparty computation of approximations," ACM Transactions on Algorithms, vol. 2, pp. 435-472, 2006.
[9] M. Atallah, F. Kerschbaum, and W. Du, "Secure and private sequence comparisons," in Proc. WPES '03, 2003, pp. 39-44.
[10] S. Yu, C. Wang, K. Ren, and W. Lou, "Achieving secure, scalable, and fine-grained data access control in cloud computing," in Proc. IEEE INFOCOM, 2010.
[11] A. Rajaraman and J. D. Ullman, Mining of Massive Datasets. Cambridge University Press, 2011.
[12] I. H. Witten, A. Moffat, and T. C. Bell, Managing Gigabytes: Compressing and Indexing Documents and Images. Morgan Kaufmann Publishing, May 1999.
