
W. Jai Singh et al. / (IJAEST) INTERNATIONAL JOURNAL OF ADVANCED ENGINEERING SCIENCES AND TECHNOLOGIES Vol No. 3, Issue No. 2, 161 - 164

Smart Query Answering System Based on Peer System and Data Mining Techniques

1W. Jai Singh, 2K. Sobhana

1. Assistant Professor, Department of MCA, Park College of Engineering and Technology, Coimbatore, Tamil Nadu, India. 2. PG Scholar, Department of MCA, Park College of Engineering and Technology, Coimbatore, Tamil Nadu, India.

Abstract - Knowledge discovery facilitates querying database knowledge and intelligent query answering in database systems. Querying a database is a common task in traditional database systems, and producing answers effectively depends largely on the user’s knowledge of the query language and the database schema. This paper proposes a smart query answering technique based on peer systems to improve the effectiveness and convenience of querying databases. It is a version control system that automates request/response work. In the proposed approach, a query can be answered directly by simple retrieval, or intelligently by analyzing the intent of the query and providing generalized, peer-system or associated information using stored or discovered knowledge. The experimental results show that the system responds to the client’s requests in a fast and intelligent way.

Keywords - Version control; Intelligent query; Query answering; Peer systems; Database sharing

1 INTRODUCTION

A database query finds answers from data stored in a database that meet certain conditions or constraints of a retrieval statement. The project history is stored in a single central database, and all the systems share a copy of all the files that the developers are working on. Therefore, the network between the developers must be up to perform operations such as check-ins or updates, but need not be up to edit or manipulate the current versions of the files. Developers can perform all the operations that are available locally. In cases where several developers or teams want to maintain their own version of the files because of geography and/or policy, the project can import a version from another module within the project and then merge the changes from that module with the latest files, if that is what is desired. Unreserved checkouts allow more than one developer to work on the same files at a time. The project provides a flexible modules database that offers a symbolic mapping of names to components of a larger software distribution. It applies names to collections of directories and files, so a single command can manipulate the entire collection.

Evaluating queries efficiently and intelligently requires an important step of query rewriting and modification. Query rewriting is a basic step in query processing that aims at transforming a given query into a more efficient one that uses less time and fewer resources to execute. A rewritten query normally produces the same answer set as the original query.

ISSN: 2230-7818

The notion of peer systems provides a convenient and flexible tool for representing similarity and can be used to describe both quantitative and qualitative information. Smart query answering mainly deals with the problem of finding information relevant to a query by using inexactly matched items, based on a semantic distance or similarity. Nittaya Kerdprasop et al. [1] proposed a multi-agent system whose agents work cooperatively and intelligently to analyze the user’s request and revise the query with virtual mining and materialized views. Sergio Martín et al. [2] developed an intelligent manager able to answer students’ questions automatically, using the knowledge stored in Learning Management Systems (LMS). The work of Han et al. [3] is among the early research in intelligent query answering that incorporates data mining techniques to rewrite users’ queries. Their query relaxation approach employed the notion of generalization to build a concept hierarchy. Lin et al. [4] proposed an intelligent query answering technique that integrates neighborhood information and data mining rules discovered from the database. The neighborhood information is incorporated into the query rewriting process to help the system rewrite the original query, and the data mining rules are used to help answer the queries more intelligently, effectively and efficiently. Jiawei Han et al. [5] investigated the application of concept hierarchies, discovered rules and knowledge discovery tools to intelligent query answering in database systems. Yongjian Fu et al. [6] presented a Multi-Layered Database Model for intelligent query answering in mobile environments, applying a series of generalized operators to this model. Query modification [7] interprets query rewriting in a more relaxed way, as a query refining process that produces answers that might be a superset of the expected answers. The advantage of query relaxation is the increased possibility of obtaining desired answers when users have limited knowledge about the problem domain and the database schema. Aragao and Fernandes [8] proposed a unified foundation for query answering and knowledge discovery; the combined system is called CIDS (Combined Inference Database System). Answering queries using views has long been studied in detail by Halevy et al. [9]. Materialized views can provide useful information in query processing, especially in the context of web searching applications. Data mining techniques have been exploited to analyze data and discover generalized rules, which carry summary, statistical distribution and characteristic/classification information about the data in the databases and help to provide intelligent answers to queries instead of direct retrieval of data.

2 A DATABASE MODEL FOR KNOWLEDGE-RICH DATABASES

A Knowledge-Rich Database (KRDB) consists of six components, i.e., KRDB = (Schema, EDB, IDB, H, GDB, KDT). These are defined as follows.

1. Schema, a Knowledge-Rich Database Schema, describes the general structure and organization of the KRDB, including i) physical and virtual entities, attributes and relationships, and ii) the organization of rules, integrity constraints and concept hierarchies, based on a deductive entity-relationship data model.
2. EDB, an Extensional Database, consists of a set of extensional data relations.
3. IDB, an Intensional Database, consists of a set of deduction rules and integrity constraints (ICs).
4. H, a set of concept Hierarchies, specifies taxonomies of concepts on top of primitive data in the extensional and intensional databases.
5. GDB, a Generalized Database, consists of a set of generalized rules which summarize the regularities of the data at a high level.
6. KDT, a set of Knowledge Discovery Tools, performs knowledge discovery efficiently in databases when necessary.

Data in knowledge-rich databases are classified into primitive data and high-level data. A primitive-level query is a query whose constants involve only primitive data, whereas a high-level query is a query whose constants involve high-level data. A rule which is explicitly defined by a user or an expert is a deduction rule, whereas a rule which is generalized from a database state is a generalized rule. Both deduction rules and generalized rules can be primitive-level or high-level rules. However, since a generalized rule summarizes data from a database state, it reflects a general fact in the current database state but does not enforce a constraint on the possible database states. This contrasts with a deduction rule or an integrity constraint, which states a rule (or a constraint) that a potential database state must follow.

3 FOUR CATEGORIES OF QUERY ANSWERING

In a knowledge-rich database system, there may exist two kinds of queries: data queries and knowledge queries. A data query finds concrete data stored in a database and corresponds to a basic retrieval statement in a database system, whereas a knowledge query finds rules and other kinds of knowledge in the database and corresponds to querying database knowledge, including deduction rules, integrity constraints, generalized rules and other regularities. Direct query answering is a direct, simple retrieval of data or knowledge from the Knowledge-Rich Database, whereas smart query answering consists of analyzing the intent of the query and providing generalized, neighborhood, or associated information relevant to the query. Based on these classifications, query answering mechanisms can be categorized into the following four combinations:


1) Data query and direct answering: direct answering of data queries.
2) Data query and intelligent answering: intelligent answering of data queries.
3) Knowledge query and direct answering: direct answering of knowledge queries.
4) Knowledge query and intelligent answering: intelligent answering of knowledge queries.

A knowledge query can often be viewed as a follow-up to a data query when further explanation, reasoning or summarization is needed besides the answers to the data query. The proposed smart query answering system based on peer systems is very efficient in query answering by views in an IT firm. The framework of the project is depicted in Fig.1.

[Figure blocks: Project Manager, Team Leader, Project Interface, Token Passing, Query Answering data, Task Allocation]

Fig.1 Logical structure of query answering in IT firms
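The six KRDB components of Section 2 and the four categories listed above can be sketched in a few lines of Python. This is only an illustrative sketch, not the system's implementation: the class and function names (`KRDB`, `QueryKind`, `AnswerMode`, `category`, `describe`) are hypothetical, and the component fields are left as empty placeholder containers.

```python
from dataclasses import dataclass, field
from enum import Enum


class QueryKind(Enum):
    DATA = "data query"            # asks for concrete stored data
    KNOWLEDGE = "knowledge query"  # asks for rules or other knowledge


class AnswerMode(Enum):
    DIRECT = "direct answering"            # simple retrieval
    INTELLIGENT = "intelligent answering"  # analyzes the intent of the query


@dataclass
class KRDB:
    """Hypothetical container mirroring KRDB = (Schema, EDB, IDB, H, GDB, KDT)."""
    schema: dict = field(default_factory=dict)  # entities, attributes, relationships
    edb: dict = field(default_factory=dict)     # extensional data relations
    idb: list = field(default_factory=list)     # deduction rules, integrity constraints
    h: dict = field(default_factory=dict)       # concept hierarchies
    gdb: list = field(default_factory=list)     # generalized rules
    kdt: list = field(default_factory=list)     # knowledge discovery tools


def category(kind: QueryKind, mode: AnswerMode) -> int:
    """Return the category number (1 to 4) in the order of the list above."""
    # Enum members iterate in definition order, so the index is 0 or 1.
    return 2 * list(QueryKind).index(kind) + list(AnswerMode).index(mode) + 1


def describe(kind: QueryKind, mode: AnswerMode) -> str:
    """Format a category the way the list above names it."""
    return f"{category(kind, mode)}) {kind.value} and {mode.value}"
```

For instance, `describe(QueryKind.KNOWLEDGE, AnswerMode.DIRECT)` yields the label of the third category.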


Configuration management is the discipline of keeping evolving software products under control, and thus contributes to satisfying quality and delay constraints. After a configuration has been established, such as that of an IT or computer system, changes to the configuration and to the interrelationships among system components can be evaluated and approved.

3.1 Direct Answering of Data Queries

Direct answering of data queries corresponds to direct data retrieval in knowledge-rich databases. A primitive-level data query can be processed directly using relational and deductive query processing techniques. A high-level data query can be processed in two steps.

First, a query rewriting process rewrites the query into one or a set of equivalent primitive-level data queries by substituting each high-level concept in the query with a set or a range of its subordinate primitive-level concepts, found by consulting the concept hierarchies in the KRDB. Second, each rewritten query is fed into a relational or deductive query processor for processing. Answers should be returned at the primitive level. Presentation of answers at a non-primitive level, when desired, is considered a task of intelligent query answering.

3.2 Smart Answering of Data Queries

Smart answering of data queries refers to the mechanisms that answer data queries cooperatively and intelligently. There are many ways for a data query to be answered intelligently, including generalization and summarization of answers, explanation of answers or returning intensional answers, query rewriting using associated or neighborhood information, and comparison of answers with those of similar queries.

1) Query rewriting using associated or neighborhood information.
2) Generalization and summarization of the answers to a query.
3) Comparison of the answer set with those under similar situations.

4. IMPLEMENTATION AND EXPERIMENTAL RESULTS

This paper implemented a request/response system that answers the client’s queries in a fast and effective way. The relevant task-related data are collected by executing such queries; they are generalized by removing nongeneralizable attributes, and a final generalized relation is derived by further application of attribute-oriented induction. The problems in the IT industry regarding configuration details can thereby be solved. This involves the management of features and assurances through control of changes made to hardware, software, firmware and documentation throughout the development and operational lifecycle of the system, on a module basis. It is the discipline of keeping evolving software products under control, and thus contributes to satisfying quality and delay constraints. The quality of smart query answering depends directly on its processing time and accuracy.

Fig.2 The example user interface for project manager entry

Association rules:

IF pjct_type = algorithmic THEN Tl_id = pl001
IF no_of_mduls > 50 THEN team_size >= 9
IF duration <= 2mnt THEN team_size >= 9

Given query

Q1: SELECT * FROM pjct_master WHERE pjct_type = 'algorithmic' AND Tl_id = 'pl001' AND duration <= '2mnt' AND team_size >= '9'

Transformed query

Q11: SELECT * FROM pjct_master WHERE pjct_type = 'algorithmic' AND team_size >= '9'

Q2: SELECT * FROM pjct_master INNER JOIN pjct_regist ON pjct_master.pjct_id = pjct_regist.pjct_id


Fig.3 Examples of rules and joins used in query answering

TABLE 1 Processing time (milliseconds) of given queries and transformed queries

Query   Q1     Q11    Q2
Time    1026   1387   627
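The general idea behind using a discovered rule to shorten a query can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: it applies only the first association rule given above (IF pjct_type = algorithmic THEN Tl_id = pl001), the sample rows are invented so that this rule holds on them, the `duration` column is stored as an integer number of months, and the helper names `drop_implied`, `run` and `demo` are hypothetical. A condition is dropped only when it is the consequent of a rule whose antecedent also appears in the query, so the answer set is unchanged as long as the rule holds on the data.

```python
import sqlite3

# Invented sample data: every 'algorithmic' row has Tl_id 'pl001',
# so the first association rule holds on this table.
ROWS = [
    ("p1", "algorithmic", "pl001", 2, 9),
    ("p2", "algorithmic", "pl001", 1, 12),
    ("p3", "algorithmic", "pl001", 6, 4),
    ("p4", "web", "pl002", 2, 10),
]

# Rule as (antecedent, consequent): IF pjct_type = algorithmic THEN Tl_id = pl001
RULES = [(("pjct_type", "=", "algorithmic"), ("Tl_id", "=", "pl001"))]

# Conditions of Q1; duration is an integer month count here (assumption).
Q1_CONDITIONS = [
    ("pjct_type", "=", "algorithmic"),
    ("Tl_id", "=", "pl001"),
    ("duration", "<=", 2),
    ("team_size", ">=", 9),
]


def drop_implied(conditions, rules):
    """Remove conditions implied by a rule whose antecedent is in the query."""
    implied = {cons for ante, cons in rules if ante in conditions}
    return [c for c in conditions if c not in implied]


def run(conn, conditions):
    """Execute the conjunctive query and return the matching project ids."""
    # Column names and operators come from the constants above; values are bound.
    where = " AND ".join(f"{col} {op} ?" for col, op, _ in conditions)
    params = [val for _, _, val in conditions]
    sql = f"SELECT pjct_id FROM pjct_master WHERE {where} ORDER BY pjct_id"
    return [row[0] for row in conn.execute(sql, params)]


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pjct_master (pjct_id TEXT, pjct_type TEXT, "
        "Tl_id TEXT, duration INTEGER, team_size INTEGER)"
    )
    conn.executemany("INSERT INTO pjct_master VALUES (?, ?, ?, ?, ?)", ROWS)
    reduced = drop_implied(Q1_CONDITIONS, RULES)
    return reduced, run(conn, Q1_CONDITIONS), run(conn, reduced)
```

Calling `demo()` returns the reduced condition list together with the answers of the original and the reduced query, which can then be compared. The cost of mining the rules themselves is not modeled in this sketch.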


More than fifteen queries were tested on the sample database. The processing times of the given queries were observed, and some results are shown in Table 1. The results show that there is no gain from transforming the given query (Q1 to Q11) on the first occasion, because mining for association rules takes considerable time. However, it would be better if the constraints were met effectively within a reduced time span. Queries with inner and outer joins can be implemented in a better manner, as shown by Q2. An example user form for project manager entries is shown in Fig. 2.

5. CONCLUSION AND FUTURE WORK

This paper describes a smart query answering system based on peer systems and data mining techniques, used to provide an integrated, flexible and efficient platform with the advantages of improved relationships with teams, availability, better data security and easy navigation of the system. User preferences can be easily changed without affecting the databases. Different users have different interests, needs and intentions. The better the system can detect these, the better the response and service it can provide to the user. The embedded revision control feature points to an extension of this project towards automatic scheduling of tasks without raising tokens to the developers at the completion of their work.

REFERENCES

[1] Nittaya Kerdprasop, Natthapon Pannurat and Kittisak Kerdprasop, Intelligent Query Answering with Virtual Mining and Materialized Views. World Academy of Science, Engineering and Technology 48, 2008.

[2] Sergio Martín, Elio Sancristobal, Rosario Gil, Gabriel Díaz, Manuel Castro, Juan Peire, Development of an Intelligent Answering Machine based on LMS Knowledge. International Conference on Engineering Education – ICEE 2007.

[3] Han, J. et al., Intelligent query answering by knowledge discovery techniques. IEEE Transactions on Knowledge and Data Engineering, Vol.8, No.3, pp. 373-390, 1996.

[4] Lin, T. et al., Intelligent query answering based on neighborhood systems and data mining techniques. Proceedings of the International Database Engineering and Applications Symposium, pp. 91-96, 2004.

[5] Jiawei Han, Yue Huang, Nick Cercone and Yongjian Fu, Intelligent Query Answering by Knowledge Discovery Techniques. IEEE Transactions on Knowledge and Data Engineering, Vol.8, No.3, pp. 373-390, June 1996.

[6] Sanjay Kumar Madria, Yongjian Fu and Sourav Bhowmick, Multi-Layered Databases for Intelligent Query Answering in Mobile Environments. MDM 2003, LNCS 2574, pp. 381-385, 2003.

[7] T. Imielinski, Intelligent query answering in rule based systems. J. Logic Programming, 4:229-257, 1987.

[8] Aragao, M. and Fernandes, A., Logic-based integration of query answering and knowledge discovery. Proceedings of 6th International Conference on Flexible Query Answering Systems, pp. 68-83, 2004.

[9] Halevy, A., Answering queries using views: a survey. The VLDB Journal, Vol.10, No.4, pp. 270-294, 2001.
